From 8ecdbb613609e58094af435a45c357aebce5ff66 Mon Sep 17 00:00:00 2001
From: Kenneth Graunke
Date: Wed, 8 Nov 2017 10:56:00 -0800
Subject: [PATCH] i965: Pretend there are 4 subslices for compute shader threads on Gen9+.

Similar to what we did for pixel shader threads - see gen_device_info.c.

We don't want to bump the actual Maximum Number of Threads though, so
we adjust it here. For pixel shaders, we don't use max_wm_threads, so
we could just bump it globally.

Supposedly fixes Piglit tests:
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t
arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t

Reviewed-by: Jordan Justen
---
 src/mesa/drivers/dri/i965/brw_program.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index 7607bc38840..5ecfb9f5b11 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -357,7 +357,19 @@ brw_alloc_stage_scratch(struct brw_context *brw,
          thread_count = devinfo->max_wm_threads;
          break;
       case MESA_SHADER_COMPUTE: {
-         const unsigned subslices = MAX2(brw->screen->subslice_total, 1);
+         unsigned subslices = MAX2(brw->screen->subslice_total, 1);
+
+         /* The documentation for 3DSTATE_PS "Scratch Space Base Pointer" says:
+          *
+          * "Scratch Space per slice is computed based on 4 sub-slices.  SW must
+          *  allocate scratch space enough so that each slice has 4 slices
+          *  allowed."
+          *
+          * According to the other driver team, this applies to compute shaders
+          * as well.  This is not currently documented at all.
+          */
+         if (devinfo->gen >= 9)
+            subslices = 4;
 
          /* WaCSScratchSize:hsw
           *
-- 
2.30.2