They say third time is the charm.. It looks like the testcase
disables the cost model and so AArch64 we end up being able to
do the permute but on x86 we can't. However when analyzing the
testcase I didn't disable the cost model hence the difference.
So I now guard the testcase on vect_load_lanes as there's not a
"can do any permute" test directive and load lanes is what I will
be fixing up next year so this should catch it.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-11b.c: Guard statements.
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided4 && vect_int_mult } } } } */
/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_strided4 && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_load_lanes } } } } */