Shader-db results on Kaby Lake:
total instructions in shared programs: 14929212 -> 14880028 (-0.33%)
instructions in affected programs: 72428 -> 23244 (-67.91%)
helped: 6
HURT: 2
helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624
helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08%
HURT stats (abs) min: 1178 max: 1178 x̄: 1178.00 x̃: 1178
HURT stats (rel) min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97%
95% mean confidence interval for instructions value: -11947.03 -348.97
95% mean confidence interval for instructions %-change: -125.72% 202.37%
Inconclusive result (%-change mean confidence interval includes 0).
total cycles in shared programs: 368585300 -> 342557344 (-7.06%)
cycles in affected programs: 28144921 -> 2116965 (-92.48%)
helped: 6
HURT: 2
helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682
helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28%
HURT stats (abs) min: 47778 max: 47798 x̄: 47788.00 x̃: 47788
HURT stats (rel) min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59%
95% mean confidence interval for cycles value: -5900438.73 -606550.27
95% mean confidence interval for cycles %-change: -140.79% 146.16%
Inconclusive result (%-change mean confidence interval includes 0).
total spills in shared programs: 9243 -> 8901 (-3.70%)
spills in affected programs: 2718 -> 2376 (-12.58%)
helped: 4
HURT: 4
total fills in shared programs: 21831 -> 10141 (-53.55%)
fills in affected programs: 11804 -> 114 (-99.03%)
helped: 6
HURT: 2
total sends in shared programs: 815912 -> 815912 (0.00%)
sends in affected programs: 0 -> 0
helped: 0
HURT: 0
LOST: 1
GAINED: 3
The helped shaders are all compute shaders in Aztec Ruins. There is
also a compute shader in synmark2 OglCSDof that's helped, but it doesn't
show up in the shader-db results above because it went from SIMD8 to
SIMD16. That shader improves enough to yield a 15-20% performance boost
to the benchmark as a whole on my KBL laptop. The hurt shaders are a
couple of shaders in Kerbal Space Program and a couple in Aztec Ruins.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
OPT(nir_lower_clip_cull_distance_arrays);
+ if (devinfo->gen >= 7 && is_scalar) {
+ /* TODO: Yes, we could in theory do this on gen6 and earlier. However,
+ * that would require plumbing through support for these indirect
+ * scratch read/write messages with message registers and that's just a
+ * pain. Also, the primary benefit of this is for compute shaders which
+ * won't run on gen6 and earlier anyway.
+ *
+ * The threshold of 128B was chosen semi-arbitrarily. The idea is that
+ * 128B per channel on a SIMD8 program is 32 registers or 25% of the
+ * register file. Any array that large is likely to cause pressure
+ * issues. Also, this value is sufficiently high that the benchmarks
+ * known to suffer from large temporary array issues are helped but
+ * nothing else in shader-db is hurt except for maybe that one kerbal
+ * space program shader.
+ */
+ OPT(nir_lower_vars_to_scratch, nir_var_function_temp, 128,
+ glsl_get_natural_size_align_bytes);
+ }
+
nir_variable_mode indirect_mask =
brw_nir_no_indirect_mask(compiler, nir->info.stage);
OPT(nir_lower_indirect_derefs, indirect_mask);