Add vect_perm3_* target selectors
SLP load permutation fails if any individual permutation requires more
than two vector inputs. For 128-bit vectors, it's possible to permute
3 contiguous loads of 32-bit and 8-bit elements, but not 16-bit elements
or 64-bit elements. The results are reversed for 256-bit vectors,
and so on for wider vectors.
This patch adds a routine that tests whether a permute will require
three vectors for a given vector count and element size, then adds
vect_perm3_* target selectors for the cases that we currently use.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_perm_short, vect_perm_byte): Document
previously undocumented selectors.
(vect_perm3_byte, vect_perm3_short, vect_perm3_int): Document.
gcc/testsuite/
* lib/target-supports.exp (vect_perm_supported): New proc.
(check_effective_target_vect_perm3_int): Likewise.
(check_effective_target_vect_perm3_short): Likewise.
(check_effective_target_vect_perm3_byte): Likewise.
* gcc.dg/vect/slp-perm-1.c: Expect SLP load permutation to
succeed if vect_perm3_int.
* gcc.dg/vect/slp-perm-5.c: Likewise.
* gcc.dg/vect/slp-perm-6.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-perm-8.c: Likewise vect_perm3_byte.
* gcc.dg/vect/slp-perm-9.c: Likewise vect_perm3_short.
Use vect_perm_short instead of vect_perm. Add a scan-tree-dump-not
test for vect_perm3_short targets.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254592