From: Kenneth Graunke <kenneth@whitecape.org>
Date: Fri, 23 Aug 2019 00:32:25 +0000 (-0700)
Subject: intel: Increase Gen11 compute shader scratch IDs to 64.
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=b9e93db20896a436c716107dd0d12057b3aa9f72;p=mesa.git

intel: Increase Gen11 compute shader scratch IDs to 64.

From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase the scratch ID count used for
scratch address calculations, because the FFTID is computed as if each
EU had 8 threads.  The quote above seems to indicate that we should
increase the actual thread count programmed in MEDIA_VFE_STATE as well,
but we think the intention is to only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9ac ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
---
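Note: the effect of the change on the size of the compute scratch
allocation can be sketched as below. This is a minimal illustration and
not the driver code: struct sketch_devinfo and both sketch_* functions
are hypothetical names, and the size formula (per-thread scratch times
subslices times scratch IDs per subslice) is an assumption about how the
surrounding allocators combine these values.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two device fields the patch reads. */
struct sketch_devinfo {
   int gen;
   unsigned max_cs_threads;
};

/* Scratch IDs per subslice: Gen11+ uses 8 * 8 because the FFTID is
 * calculated as if there are 8 threads per EU.  Older parts fall back
 * to max_cs_threads here; the per-platform workarounds (for example
 * the Haswell one) are deliberately left out of this sketch.
 */
static unsigned
sketch_scratch_ids_per_subslice(const struct sketch_devinfo *devinfo)
{
   if (devinfo->gen >= 11)
      return 8 * 8;

   return devinfo->max_cs_threads;
}

/* Assumed size formula: one per-thread scratch slot for every scratch
 * ID on every subslice.
 */
static uint64_t
sketch_cs_scratch_size(const struct sketch_devinfo *devinfo,
                       unsigned subslices, uint32_t per_thread_scratch)
{
   return (uint64_t)per_thread_scratch * subslices *
          sketch_scratch_ids_per_subslice(devinfo);
}
```

With an illustrative max_cs_threads of 56 (8 EUs * 7 threads), 8
subslices and 1024 bytes of scratch per thread, the sketch yields
1024 * 8 * 64 bytes on Gen11 instead of 1024 * 8 * 56, so the buffer
covers every scratch ID the hardware can generate.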

diff --git a/src/gallium/drivers/iris/iris_program.c b/src/gallium/drivers/iris/iris_program.c
index 1b80af56328..ae701ec984d 100644
--- a/src/gallium/drivers/iris/iris_program.c
+++ b/src/gallium/drivers/iris/iris_program.c
@@ -1930,6 +1930,21 @@ iris_get_scratch_space(struct iris_context *ice,
 
    if (!*bop) {
       unsigned scratch_ids_per_subslice = devinfo->max_cs_threads;
+
+      if (devinfo->gen >= 11) {
+         /* The MEDIA_VFE_STATE docs say:
+          *
+          *    "Starting with this configuration, the Maximum Number of
+          *     Threads must be set to (#EU * 8) for GPGPU dispatches.
+          *
+          *     Although there are only 7 threads per EU in the configuration,
+          *     the FFTID is calculated as if there are 8 threads per EU,
+          *     which in turn requires a larger amount of Scratch Space to be
+          *     allocated by the driver."
+          */
+         scratch_ids_per_subslice = 8 * 8;
+      }
+
       uint32_t max_threads[] = {
          [MESA_SHADER_VERTEX]    = devinfo->max_vs_threads,
          [MESA_SHADER_TESS_CTRL] = devinfo->max_tcs_threads,
diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c
index 62a527ed235..1f5495d56b2 100644
--- a/src/intel/vulkan/anv_allocator.c
+++ b/src/intel/vulkan/anv_allocator.c
@@ -1520,7 +1520,19 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool,
    const unsigned subslices = MAX2(physical_device->subslice_total, 1);
 
    unsigned scratch_ids_per_subslice;
-   if (devinfo->is_haswell) {
+   if (devinfo->gen >= 11) {
+      /* The MEDIA_VFE_STATE docs say:
+       *
+       *    "Starting with this configuration, the Maximum Number of
+       *     Threads must be set to (#EU * 8) for GPGPU dispatches.
+       *
+       *     Although there are only 7 threads per EU in the configuration,
+       *     the FFTID is calculated as if there are 8 threads per EU,
+       *     which in turn requires a larger amount of Scratch Space to be
+       *     allocated by the driver."
+       */
+      scratch_ids_per_subslice = 8 * 8;
+   } else if (devinfo->is_haswell) {
       /* WaCSScratchSize:hsw
        *
        * Haswell's scratch space address calculation appears to be sparse
diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c
index 8d1d576b87d..16762fc661b 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -464,7 +464,19 @@ brw_alloc_stage_scratch(struct brw_context *brw,
          subslices = 4 * brw->screen->devinfo.num_slices;
 
       unsigned scratch_ids_per_subslice;
-      if (devinfo->is_haswell) {
+      if (devinfo->gen >= 11) {
+         /* The MEDIA_VFE_STATE docs say:
+          *
+          *    "Starting with this configuration, the Maximum Number of
+          *     Threads must be set to (#EU * 8) for GPGPU dispatches.
+          *
+          *     Although there are only 7 threads per EU in the configuration,
+          *     the FFTID is calculated as if there are 8 threads per EU,
+          *     which in turn requires a larger amount of Scratch Space to be
+          *     allocated by the driver."
+          */
+         scratch_ids_per_subslice = 8 * 8;
+      } else if (devinfo->is_haswell) {
          /* WaCSScratchSize:hsw
           *
           * Haswell's scratch space address calculation appears to be sparse