git.libre-soc.org Git - gcc.git/commit

author	Richard Sandiford <richard.sandiford@arm.com>
	Fri, 2 Oct 2020 10:53:05 +0000 (11:53 +0100)
committer	Richard Sandiford <richard.sandiford@arm.com>
	Fri, 2 Oct 2020 10:53:05 +0000 (11:53 +0100)
commit	bb78e5876aa6a6b4a8158cdc0f6c8511eb2be75f
tree	516b84f00dd233c993a7bf0416d7b402ab2f651f	tree
parent	01c288035aa960631dd0ffd9131ed0a824a95f30	commit \| diff

arm: Make more use of the new mode macros

As Christophe pointed out, my r11-3522 patch didn't in fact fix
all of the armv8_2-fp16-arith-2.c failures introduced by allowing
FP16 vectorisation without -funsafe-math-optimizations. I must have
only tested the final patch on my usual arm-linux-gnueabihf bootstrap,
which it turns out treats the test as unsupported.

The focus of the original patch was to use mode macros for
patterns that are shared between Advanced SIMD, iwMMXt and MVE.
This patch uses the mode macros for general neon.md patterns too.

gcc/
* config/arm/neon.md (*sub<VDQ:mode>3_neon): Use the new mode macros
for the insn condition.
(sub<VH:mode>3, *mul<VDQW:mode>3_neon): Likewise.
(mul<VDQW:mode>3add<VDQW:mode>_neon): Likewise.
(mul<VH:mode>3add<VH:mode>_neon): Likewise.
(mul<VDQW:mode>3neg<VDQW:mode>add<VDQW:mode>_neon): Likewise.
(fma<VCVTF:mode>4, fma<VH:mode>4, *fmsub<VCVTF:mode>4): Likewise.
(quad_halves_<code>v4sf, reduc_plus_scal_<VD:mode>): Likewise.
(reduc_plus_scal_<VQ:mode>, reduc_smin_scal_<VD:mode>): Likewise.
(reduc_smin_scal_<VQ:mode>, reduc_smax_scal_<VD:mode>): Likewise.
(reduc_smax_scal_<VQ:mode>, mul<VH:mode>3): Likewise.
(neon_vabd<VF:mode>_2, neon_vabd<VF:mode>_3): Likewise.
(fma<VH:mode>4_intrinsic): Delete.
(neon_vadd<VCVTF:mode>): Use the new mode macros to decide which
form of instruction to generate.
(neon_vmla<VDQW:mode>, neon_vmls<VDQW:mode>): Likewise.
(neon_vsub<VCVTF:mode>): Likewise.
(neon_vfma<VH:mode>): Generate the main fma<mode>4 form instead
of using fma<mode>4_intrinsic.

gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-arith-2.c (float16_t): Use _Float16_t
rather than __fp16.
(float16x4_t, float16x4_t): Likewise.
(fp16_abs): Use __builtin_fabsf16.

gcc/config/arm/neon.md		diff \| blob \| history
gcc/testsuite/gcc.target/arm/armv8_2-fp16-arith-2.c		diff \| blob \| history