For variable-length SVE, we can only use SLP for N scalars of type
T if the number of Ts in a vector is a multiple of N. For ints
this means that N must be 4 or 2, so this patch XFAILs two tests
for N==8.
The exact limit seems inherently target-specific -- variable-length
vectors with a 256-bit granule would work fine -- so I used aarch64_sve
selectors on the XFAILs.
gcc/testsuite/
* gcc.dg/vect/slp-reduc-4.c: XFAIL test for SLP vectorization
for variable-length SVE.
* gcc.dg/vect/slp-reduc-7.c: Likewise.
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_min_max } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_min_max || vect_variable_length } } } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" } } */
+/* For variable-length SVE, the number of scalar statements in the
+ reduction exceeds the number of elements in a 128-bit granule. */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_add } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_add || vect_variable_length } } } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" } } */
+/* For variable-length SVE, the number of scalar statements in the
+ reduction exceeds the number of elements in a 128-bit granule. */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_add || { aarch64_sve && vect_variable_length } } } } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */