aarch64: Reimplement most vpadal intrinsics using builtins
This patch reimplements most of the vpadal intrinsics to use RTL
builtins in the normal way.
The ones that aren't converted are the int32x2_t -> int64x1_t ones as
the RTL pattern doesn't currently handle
these modes. We don't have a V1DI mode so it would need to return a
DImode value or a V2DI one with the first lane
being the result. It's not hard to do, but it would require a bit more
refactoring so we can do it separately later.
This patch hopefully improves the status quo.
The new Vwhalf mode attribute is created because the existing Vwtype
attribute maps V8QI wrongly (for this pattern) to "8h" as the
suffix rather than "4h" as needed.
gcc/
* config/aarch64/iterators.md (Vwhalf): New iterator.
* config/aarch64/aarch64-simd.md (aarch64_<sur>adalp<mode>_3):
Rename to...
(aarch64_<sur>adalp<mode>): ... This. Make more
builtin-friendly.
(<sur>sadv16qi): Adjust callsite of the above.
* config/aarch64/aarch64-simd-builtins.def (sadalp, uadalp): New
builtins.
* config/aarch64/arm_neon.h (vpadal_s8): Reimplement using
builtins.
(vpadal_s16): Likewise.
(vpadal_u8): Likewise.
(vpadal_u16): Likewise.
(vpadalq_s8): Likewise.
(vpadalq_s16): Likewise.
(vpadalq_s32): Likewise.
(vpadalq_u8): Likewise.
(vpadalq_u16): Likewise.
(vpadalq_u32): Likewise.