radeonsi: prevent a gfx10_ngg_calculate_subgroup_info failure for TES+NGG GS
authorMarek Olšák <marek.olsak@amd.com>
Tue, 30 Jun 2020 20:57:23 +0000 (16:57 -0400)
committerMarge Bot <eric+marge@anholt.net>
Thu, 16 Jul 2020 04:04:52 +0000 (04:04 +0000)
arb_tessellation_shader-tes-gs-max-output -small -scan 1 50 -auto -fbo
doesn't pass, but at least all shaders are compiled successfully.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5700>

src/gallium/drivers/radeonsi/si_state_shaders.c

index d1a1aa725cdfff6a511ecae9978a5595a97cdb0c..01ed224b282eb874c42599f5a078a29ea417cec0 100644 (file)
@@ -2651,9 +2651,15 @@ static void *si_create_shader_selector(struct pipe_context *ctx,
       sel->gs_input_verts_per_prim =
          u_vertices_per_prim(sel->info.properties[TGSI_PROPERTY_GS_INPUT_PRIM]);
 
-      /* EN_MAX_VERT_OUT_PER_GS_INSTANCE does not work with tesselation. */
+      /* EN_MAX_VERT_OUT_PER_GS_INSTANCE does not work with tesselation so
+       * we can't split workgroups. Disable ngg if any of the following conditions is true:
+       * - num_invocations * gs_max_out_vertices > 256
+       * - LDS usage is too high
+       */
       sel->tess_turns_off_ngg = sscreen->info.chip_class >= GFX10 &&
-                                sel->gs_num_invocations * sel->gs_max_out_vertices > 256;
+                                (sel->gs_num_invocations * sel->gs_max_out_vertices > 256 ||
+                                 sel->gs_num_invocations * sel->gs_max_out_vertices *
+                                 (sel->info.num_outputs * 4 + 1) > 6500 /* max dw per GS primitive */);
       break;
 
    case PIPE_SHADER_TESS_CTRL: