aarch64: Split vec_selects of bottom elements into simple move
In certain intrinsics use cases GCC leaves SETs of a bottom-element
vec_select lying around:
    (vec_select:DI (reg:V2DI 34 v2 [orig:128 __o ] [128])
        (parallel [
                (const_int 0 [0])
            ])))
This can be treated as a simple move in aarch64 when done between SIMD
registers for all normal widths.
These go through the aarch64_get_lane pattern.
This patch adds a splitter there to simplify these extracts to a move
that can, perhaps, be optimised away.
Another benefit is that, if the destination is memory, we can use a
simpler STR instruction rather than ST1-lane.
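
For illustration only, here is a minimal sketch of source code that reads
the bottom element of a two-lane vector, once into a scalar and once into
memory. It uses GCC's generic vector extension rather than the arm_neon.h
intrinsics so that it builds on any target; the names v2di, low_lane,
store_low_lane and check_low_lane are hypothetical and are not part of
this patch or its testsuite. Whether a given compiler version and option
set actually routes such code through aarch64_get_lane is an assumption,
not something this sketch demonstrates.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit lanes in one 128-bit vector (GCC/Clang vector extension).
   Hypothetical type name, chosen to mirror the V2DI mode in the RTL.  */
typedef int64_t v2di __attribute__ ((vector_size (16)));

/* Read lane 0, the bottom element of the vector.  */
int64_t
low_lane (v2di v)
{
  return v[0];
}

/* Store lane 0 to memory.  */
void
store_low_lane (int64_t *dst, v2di v)
{
  *dst = v[0];
}

/* Small self-check of the two helpers above.  */
void
check_low_lane (void)
{
  v2di v = { 42, 7 };
  int64_t out = 0;

  assert (low_lane (v) == 42);
  store_low_lane (&out, v);
  assert (out == 42);
}
```

Compiling functions like these for aarch64 at -O2 and inspecting the
assembly is one way to see which instruction is chosen for the lane-0
extract and for the store.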
gcc/
* config/aarch64/aarch64-simd.md (aarch64_get_lane<mode>):
Convert to define_insn_and_split. Split into simple move when moving
bottom element.
gcc/testsuite/
* gcc.target/aarch64/vdup_lane_2.c: Scan for fmov rather than
dup.