The following testcase is miscompiled, because there is a buffer overflow
in push_partial_def in the little-endian case when working 64-byte vectors.
The code computes the number of bytes we need in the BUFFER: NEEDED_LEN,
which is rounded up number of bits we need. Then the code
native_encode_expr each (partially overlapping) pd into THIS_BUFFER.
If pd.offset < 0, i.e. the pd.rhs store starts at some bits before the
window we are interested in, we pass -pd.offset to native_encode_expr and
shrink the size already earlier:
HOST_WIDE_INT size = pd.size;
if (pd.offset < 0)
size -= ROUND_DOWN (-pd.offset, BITS_PER_UNIT);
On this testcase, the problem is with a store with pd.offset > 0,
in particular pd.offset 256, pd.size 512, i.e. a 64-byte store which doesn't
fit into entirely into BUFFER.
We have just:
size = MIN (size, (HOST_WIDE_INT) needed_len * BITS_PER_UNIT);
in this case for little-endian, which isn't sufficient, because needed_len
is 64, the entire BUFFER (except of the last extra byte used for shifting).
native_encode_expr fills the whole THIS_BUFFER (again, except the last extra
byte), and the code then performs memcpy (BUFFER + 32, THIS_BUFFER, 64);
which overflows BUFFER and as THIS_BUFFER is usually laid out after it,
overflows it into THIS_BUFFER.
The following patch fixes it by for pd.offset > 0 making sure size is
reduced too. For big-endian the code does things differently and already
handles this right.
2020-03-25 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/94300
* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): If pd.offset
is positive, make sure that off + size isn't larger than needed_len.
* gcc.target/i386/avx512f-pr94300.c: New test.
+2020-03-25 Jakub Jelinek <jakub@redhat.com>
+
+ PR tree-optimization/94300
+ * tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): If pd.offset
+ is positive, make sure that off + size isn't larger than needed_len.
+
2020-03-25 Richard Biener <rguenther@suse.de>
Jakub Jelinek <jakub@redhat.com>
2020-03-25 Jakub Jelinek <jakub@redhat.com>
+ PR tree-optimization/94300
+ * gcc.target/i386/avx512f-pr94300.c: New test.
+
PR debug/94283
* gcc.dg/pr94283.c: New test.
--- /dev/null
+/* PR tree-optimization/94300 */
+/* { dg-do run { target { avx512f } } } */
+/* { dg-options "-O1 -mavx512f -mprefer-vector-width=512 -mtune=skylake-avx512" } */
+
+#include "avx512f-check.h"
+
+typedef double V __attribute__((vector_size (64)));
+
+static void
+avx512f_test (void)
+{
+ double mem[16];
+ const V a = { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0 };
+ const V b = { 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0 };
+ V c;
+ __builtin_memcpy (mem, &a, 64);
+ __builtin_memcpy (mem + 8, &b, 64);
+ __builtin_memcpy (&c, mem + 4, 64);
+ if (c[5] != 9.0)
+ __builtin_abort ();
+}
shift_bytes_in_array_left (this_buffer, len + 1, amnt);
unsigned int off = pd.offset / BITS_PER_UNIT;
gcc_assert (off < needed_len);
+ size = MIN (size,
+ (HOST_WIDE_INT) (needed_len - off) * BITS_PER_UNIT);
p = buffer + off;
if (amnt + size < BITS_PER_UNIT)
{