As of v4.11, the kernel doesn't set the subslice hashing mode on
Apollolake, leaving it at the default of 8x8.  (It initializes the mode
to 16x4 on most other platforms.)
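
For reference, a rough sketch of what programming this mode involves
(the helper name emit_subslice_hashing is illustrative only, not part
of the patch; the register and field defines are the ones added below).
GT_MODE is a masked register: the upper 16 bits are a write-enable
mask, which is what REG_MASK() / GEN9_SUBSLICE_HASHING_MASK_BITS
supplies.

   /* Illustrative only: select a subslice hashing mode via a masked
    * write to GT_MODE (0x7008) from the batch buffer.
    */
   static void
   emit_subslice_hashing(struct brw_context *brw, uint32_t mode)
   {
      BEGIN_BATCH(3);
      OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
      OUT_BATCH(GEN7_GT_MODE);
      /* e.g. for 16x16: ((3 << 8) << 16) | (3 << 8) == 0x03000300 */
      OUT_BATCH(GEN9_SUBSLICE_HASHING_MASK_BITS | mode);
      ADVANCE_BATCH();
   }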

Performance data for GPUTest Triangle on Apollolake at 1024x640:

X-tiled RT:
-----------
8x8 -> 16x4:    2.4325% +/- 0.383683% (n=107)
8x8 -> 8x4:    -3.75105% +/- 0.592491% (n=40)
8x8 -> 16x16:   6.17238% +/- 0.67157% (n=30)

Y-tiled RT:
-----------
8x8 -> 16x4:    1.30307% +/- 0.297292% (n=205)
8x8 -> 8x4:    -0.769282% +/- 0.729557% (n=35)
8x8 -> 16x16:   3.00254% +/- 0.715503% (n=40)

8x MSAA RT (INTEL_FORCE_MSAA=8):
--------------------------------
8x8 -> 16x4:    1.38889% +/- 0.93729% (n=7)
8x8 -> 8x4:    -2.10643% +/- 1.15153% (n=3)
8x8 -> 16x16:   3.87183% +/- 1.08851% (n=5)

Based on this, we choose 16x16 for Apollolake.

Skylake GT2 with X-tiled buffers appears to be a toss-up between 16x4
and 16x16, and with Y-tiled buffers it doesn't seem to matter much, so
we leave Skylake alone for now.

The hashing mode doesn't seem to have a measurable impact on more
complex benchmarks.

Acked-by: Matt Turner <mattst88@gmail.com>
# define GEN8_HIZ_PMA_MASK_BITS \
   REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE)
+#define GEN7_GT_MODE                        0x7008
+# define GEN9_SUBSLICE_HASHING_8x8          (0 << 8)
+# define GEN9_SUBSLICE_HASHING_16x4         (1 << 8)
+# define GEN9_SUBSLICE_HASHING_8x4          (2 << 8)
+# define GEN9_SUBSLICE_HASHING_16x16        (3 << 8)
+# define GEN9_SUBSLICE_HASHING_MASK_BITS    REG_MASK(3 << 8)
+
/* Predicate registers */
#define MI_PREDICATE_SRC0 0x2400
#define MI_PREDICATE_SRC1 0x2408
                GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE |
                GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC);
      ADVANCE_BATCH();
+
+      if (brw->is_broxton) {
+         /* Subslice hashing: 16x16 gave the best GPUTest Triangle
+          * results (see the commit message); the kernel otherwise
+          * leaves Apollolake at the 8x8 default.
+          */
+         BEGIN_BATCH(3);
+         OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
+         OUT_BATCH(GEN7_GT_MODE);
+         OUT_BATCH(GEN9_SUBSLICE_HASHING_MASK_BITS |
+                   GEN9_SUBSLICE_HASHING_16x16);
+         ADVANCE_BATCH();
+      }
   }
   if (brw->gen >= 8) {