While working on PR 86871, I noticed we were being overly restrictive
when handling variable-length vectors. For:
  for (i : ...)
    {
      res = ...;
      for (j : ...)
        res op= ...;
      a[i] = res;
    }
we don't need a reduction operation (although we do for double
reductions like:
  res = ...;
  for (i : ...)
    for (j : ...)
      res op= ...;
  a[i] = res;
which must still be rejected).
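
As a concrete illustration of the two shapes above, here is a minimal sketch in plain C. The function names inner_reduc and double_reduc, the size parameters n and m, and the choice of "+=" as the op are assumptions made for this example only; they do not come from the patch or the new test.

```c
#include <stddef.h>

/* Inner-loop reduction: res is reinitialised for every i.  If the
   outer loop is vectorized, each lane accumulates the value for its
   own a[i] across the j loop, so no cross-lane reduction operation
   is needed at the end.  */
void
inner_reduc (int *restrict a, const int *restrict b, size_t n, size_t m)
{
  for (size_t i = 0; i < n; ++i)
    {
      int res = 0;
      for (size_t j = 0; j < m; ++j)
        res += b[i + j];
      a[i] = res;
    }
}

/* Double reduction: a single res is carried across both loops, so
   the per-lane partial results must be combined into one scalar
   after the loop nest, which requires a reduction operation.  */
int
double_reduc (const int *b, size_t n, size_t m)
{
  int res = 0;
  for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < m; ++j)
      res += b[i + j];
  return res;
}
```

The first function has the shape that the patch now allows for variable-length vectors; the second has the shape that is still rejected.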

2018-08-08 Richard Sandiford <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop.c (vectorizable_reduction): Allow inner-loop
	reductions for variable-length vectors.

gcc/testsuite/
	* gcc.target/aarch64/sve/reduc_8.c: New test.

From-SVN: r263451

+2018-08-09 Richard Sandiford <richard.sandiford@arm.com>
+
+ * tree-vect-loop.c (vectorizable_reduction): Allow inner-loop
+ reductions for variable-length vectors.
+
2018-08-09 David Malcolm <dmalcolm@redhat.com>
PR other/84889
+2018-08-09 Richard Sandiford <richard.sandiford@arm.com>
+
+ * gcc.target/aarch64/sve/reduc_8.c: New test.
+
2018-08-09 David Malcolm <dmalcolm@redhat.com>
PR other/84889
--- /dev/null
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+int
+reduc (int *restrict a, int *restrict b, int *restrict c)
+{
+  for (int i = 0; i < 100; ++i)
+    {
+      int res = 0;
+      for (int j = 0; j < 100; ++j)
+        if (b[i + j] != 0)
+          res = c[i + j];
+      a[i] = res;
+    }
+}
+
+/* { dg-final { scan-assembler-times {\tcmpne\tp[0-9]+\.s, } 1 } } */
+/* We ought to use the CMPNE result for the SEL too. */
+/* { dg-final { scan-assembler-not {\tcmpeq\tp[0-9]+\.s, } { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, } 1 } } */
}
if (reduction_type != EXTRACT_LAST_REDUCTION
+ && (!nested_cycle || double_reduc)
&& reduc_fn == IFN_LAST
&& !nunits_out.is_constant ())
{