i965: Allocate scratch space for the maximum number of compute threads.
author Kenneth Graunke <kenneth@whitecape.org>
Tue, 7 Jun 2016 04:37:34 +0000 (21:37 -0700)
committer Kenneth Graunke <kenneth@whitecape.org>
Sun, 12 Jun 2016 07:38:50 +0000 (00:38 -0700)
We were allocating enough space for the number of threads per subslice,
when we should have been allocating space for the number of threads in
the entire GPU.

Even though we currently run with a reduced thread count (due to a bug),
we might still overflow the scratch buffer because the address
calculation is based on the FFTID, which can depend on exactly which
subslices, EUs, and threads are executing.  We need to allocate enough
for every possible thread that could run.
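
The sizing argument above can be sketched roughly as follows. This is an
illustration only, not code from the driver: the helper names
(scratch_size_for_gpu, slot_fits) are hypothetical, and the model in which
a thread's scratch slot starts at FFTID * per-thread scratch size is an
assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: bytes of scratch needed so that
 * every compute thread on the GPU gets its own per_thread_scratch slot. */
static uint64_t
scratch_size_for_gpu(uint32_t per_thread_scratch,
                     uint32_t threads_per_subslice,
                     uint32_t subslice_total)
{
   /* Treat an unknown (zero) subslice count as one, as the patch does
    * with MAX2(subslice_total, 1). */
   const uint32_t subslices = subslice_total > 0 ? subslice_total : 1;

   /* Widen before multiplying so large configurations cannot overflow. */
   return (uint64_t)per_thread_scratch * threads_per_subslice * subslices;
}

/* Assumed addressing model: the slot for a thread begins at
 * fftid * per_thread_scratch.  Returns nonzero if that slot lies
 * entirely inside a buffer of bo_size bytes. */
static int
slot_fits(uint64_t bo_size, uint32_t per_thread_scratch, uint32_t fftid)
{
   return ((uint64_t)fftid + 1) * per_thread_scratch <= bo_size;
}
```

With made-up numbers (1024 bytes per thread, 56 threads per subslice,
3 subslices), a buffer sized for a single subslice cannot hold the slot
for FFTID 100, while a buffer sized for the whole GPU can.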

Fixes rendering corruption in Synmark's Gl43CSDof on Gen8+.
Earlier platforms need additional bug fixes.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
src/mesa/drivers/dri/i965/brw_cs.c

index 2a255847c654aa6eb1e602f6c59935a588af9cd4..c8598d6189103966ff5633c38c09a993f0079146 100644
@@ -149,8 +149,10 @@ brw_codegen_cs_prog(struct brw_context *brw,
    }
 
    if (prog_data.base.total_scratch) {
+      const unsigned subslices = MAX2(brw->intelScreen->subslice_total, 1);
       brw_get_scratch_bo(brw, &brw->cs.base.scratch_bo,
-                         prog_data.base.total_scratch * brw->max_cs_threads);
+                         prog_data.base.total_scratch *
+                         brw->max_cs_threads * subslices);
    }
 
    if (unlikely(INTEL_DEBUG & DEBUG_CS))