aarch64: Improve vcombine codegen [PR89057]
author Richard Sandiford <richard.sandiford@arm.com>
Mon, 4 Jan 2021 11:59:07 +0000 (11:59 +0000)
committer Richard Sandiford <richard.sandiford@arm.com>
Mon, 4 Jan 2021 11:59:07 +0000 (11:59 +0000)
commit b41e6dd50f329b0291457e939d4c0dacd81c82c1
tree 65dc2ac48b43e224d17ae38019837a277ca57bc6
parent ba15b0fa0df773a90374f6b06775534ecd9f7b43

This patch fixes a codegen regression in the handling of things like:

  __temp.val[0]      \
    = vcombine_##funcsuffix (__b.val[0],      \
     vcreate_##funcsuffix (__AARCH64_UINT64_C (0))); \

in the 64-bit vst[234] functions.  The zero was forced into a
register at expand time, and we relied on the combine pass to fuse
the zero and the vcombine back together into a single combinez
pattern.  The problem is that the zero could be hoisted before the
combine pass gets a chance to run.
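
As a rough illustration of the source shape involved (not the new test added by this patch), the sketch below combines a 64-bit value with a zero upper half.  The function name widen_with_zero and the non-AArch64 fallback are assumptions made for this example only; the fallback merely models the result so the snippet builds on other hosts.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write X into out[0] and zero into out[1].
   On AArch64 this uses the same vcombine-with-zero shape as the
   macro quoted above; ideally it needs no separate zero register,
   because the high half is a known constant.  */
static inline void
widen_with_zero (uint64_t x, uint64_t out[2])
{
#if defined(__aarch64__)
  /* Low half is X, high half is an explicit zero vector.  */
  uint64x2_t v = vcombine_u64 (vcreate_u64 (x),
                               vcreate_u64 (UINT64_C (0)));
  vst1q_u64 (out, v);
#else
  /* Portable model of the same result, for illustration only.  */
  out[0] = x;
  out[1] = 0;
#endif
}
```

Compiling this for AArch64 with optimization enabled and inspecting the assembly is one way to see whether a zero is materialized in a separate register before being combined.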

gcc/
	PR target/89057
	* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
	aarch64_simd_reg_or_zero for operand 2.  Use the combinez patterns
	to handle zero operands.

gcc/testsuite/
	PR target/89057
	* gcc.target/aarch64/pr89057.c: New test.
gcc/config/aarch64/aarch64-simd.md
gcc/testsuite/gcc.target/aarch64/pr89057.c [new file with mode: 0644]