The vectorizer, for large permuted grouped loads, generates
inefficient intermediate code (cleaned up only later) that runs
into complexity issues in SCEV analysis and elsewhere. For the
non-single-element interleaving case we already put a hard limit
in place; this applies the same limit to the missing case.
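
As a rough illustration only, the guard after this change can be modelled
by the standalone sketch below. The function name and the macro are
hypothetical and are not GCC identifiers; the real check lives in
vect_analyze_group_access_1 and works on GCC's own data structures. The
sketch assumes the group size is the absolute byte step divided by the
element size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical name for the cap used in the patch; not a GCC macro.  */
#define MAX_SINGLE_ELEMENT_GROUP 4096

/* Simplified model of the guard: a read whose byte step is a multiple
   of the element size is accepted as a single-element interleaving
   group only if the implied group size lies in (0, 4096].  */
static bool
single_element_group_ok (bool is_read, long dr_step, long type_size)
{
  assert (type_size > 0);
  if (dr_step % type_size != 0)
    return false;
  /* Assumption: group size is the step measured in elements.  */
  long groupsize = labs (dr_step) / type_size;
  return is_read
	 && groupsize > 0
	 && groupsize <= MAX_SINGLE_ELEMENT_GROUP;
}
```

For the testcase in this patch, and assuming a 4-byte int, stepping d in
a[d][5] advances by 1000000 elements (a 4000000-byte step), so the
implied group size is far above the cap and
single_element_group_ok (true, 4000000, 4) is false.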
2021-01-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/91403
* tree-vect-data-refs.c (vect_analyze_group_access_1): Cap
single-element interleaving group size at 4096 elements.
* gcc.dg/vect/pr91403.c: New testcase.
--- /dev/null
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+extern int a[][1000000];
+int b;
+void c()
+{
+ for (int d = 2; d <= 9; d++)
+ for (int e = 32; e <= 41; e++)
+ b += a[d][5];
+}
size. */
if (DR_IS_READ (dr)
&& (dr_step % type_size) == 0
- && groupsize > 0)
+ && groupsize > 0
+ /* This could be UINT_MAX but as we are generating code in a very
+ inefficient way we have to cap earlier.
+ See PR91403 for example. */
+ && groupsize <= 4096)
{
DR_GROUP_FIRST_ELEMENT (stmt_info) = stmt_info;
DR_GROUP_SIZE (stmt_info) = groupsize;