[AArch64] Support zero-extended move to FP register
The popcount expansion uses SIMD instructions acting on 64-bit values.
As a result, a popcount of a 32-bit integer requires a zero-extension
before the value is moved into an FP register. This patch adds support
for zero-extended int->FP moves to avoid the redundant uxtw: a 32-bit
fmov into an FP register already clears the upper bits of that register.
Similarly, add support for 32-bit zero-extending load->FP register
and 32-bit zero-extending FP->FP and FP->int moves.
Add a missing 'fp' arch attribute to the related 8/16-bit pattern and
fix an incorrect type attribute.
To complete zero-extended load support, add a new alternative to
load_pair_zero_extendsidi2_aarch64 to support LDP into FP registers too.
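The load cases described above can be illustrated with a minimal sketch.
The function names below are hypothetical and are not taken from the
testsuite files in this patch. Whether a given compiler actually emits a
direct load or an LDP into FP registers for them depends on its version
and options, and has not been verified here.

```c
/* Hypothetical examples, not part of the patch.  */

/* Popcount of a 32-bit value loaded from memory: a candidate for a
   zero-extending load straight into an FP register.  */
int
popcount_loaded (const unsigned int *p)
{
  return __builtin_popcount (*p);
}

/* Popcount of two adjacent 32-bit values: a candidate for a
   zero-extending load pair into FP registers.  */
int
popcount_pair (const unsigned int *p)
{
  return __builtin_popcount (p[0]) + __builtin_popcount (p[1]);
}
```

Compiling such functions with -O2 -S for an AArch64 target and inspecting
the assembly is one way to check which load form is chosen.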
int f (int a)
{
  return __builtin_popcount (a);
}
Before:
	uxtw	x0, w0
	fmov	d0, x0
	cnt	v0.8b, v0.8b
	addv	b0, v0.8b
	fmov	w0, s0
	ret
After:
	fmov	s0, w0
	cnt	v0.8b, v0.8b
	addv	b0, v0.8b
	fmov	w0, s0
	ret
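Why the upper 32 bits must be zero can be sketched in portable C. This is
an illustration only, with hypothetical helper names. It models the
64-bit count with __builtin_popcountll rather than the actual SIMD
sequence.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit popcount computed through a 64-bit count.
   The cast is the zero-extension that uxtw performs explicitly and that
   a 32-bit fmov into an FP register provides implicitly.  */
int
popcount32_via_64 (uint32_t a)
{
  uint64_t wide = (uint64_t) a;
  return __builtin_popcountll (wide);
}

/* Hypothetical counter-example: sign-extending instead would count the
   copied sign bits as well and give the wrong answer for negative
   inputs.  */
int
popcount32_sign_extended (int32_t a)
{
  int64_t wide = (int64_t) a;
  return __builtin_popcountll ((uint64_t) wide);
}
```

For any 32-bit input the first helper agrees with __builtin_popcount,
while the second overcounts by 32 whenever the sign bit is set.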
Passes regress & bootstrap on AArch64.
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2_aarch64): Add alternatives
to zero-extend between int and floating-point registers.
(load_pair_zero_extendsidi2_aarch64): Add alternative for zero-extended
ldp into floating-point registers. Add type and arch attributes.
(zero_extend<SHORT:mode><GPI:mode>2_aarch64): Add arch attribute.
Use f_loads for type attribute.
testsuite/
* gcc.target/aarch64/popcnt.c: Test zero-extended popcount.
* gcc.target/aarch64/vec_zeroextend.c: Test zero-extended vectors.
From-SVN: r265079