freedreno/a3xx: also set FSSUPERTHREADENABLE
author Rob Clark <robdclark@gmail.com>
Fri, 23 Nov 2018 16:30:34 +0000 (11:30 -0500)
committer Rob Clark <robdclark@gmail.com>
Tue, 27 Nov 2018 20:44:03 +0000 (15:44 -0500)
We already set the equivalent bit in SP_FS_CTRL_REG0, but not
FSSUPERTHREADENABLE in HLSQ_CONTROL_0_REG.  Somehow the hw doesn't hang
with this mismatched config, but it does run slower.  Having neither bit
set or both bits set is faster than the mismatch, and both set is the
fastest of the three configurations.  Worth a bit over 10% gain in glmark2.

Spotted-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@gmail.com>
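
The mismatch described above can be illustrated with a small standalone
sketch.  This is not driver code: the macro names, bit positions, and
helper functions below are hypothetical placeholders (the commit message
does not name the SP_FS_CTRL_REG0 bit, and the real values come from the
generated a3xx register headers).  The idea is just to derive both bits
from a single flag so they cannot drift apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, for illustration only.  They are NOT taken
 * from the a3xx register headers. */
#define HYP_HLSQ_FS_SUPERTHREAD_ENABLE  (1u << 8)
#define HYP_SP_FS_SUPERTHREAD_MODE      (1u << 21)

/* True when the HLSQ-side and SP-side superthread bits agree, i.e. both
 * are set or both are clear. */
static bool
superthread_config_matches(uint32_t hlsq_control_0, uint32_t sp_fs_ctrl_reg0)
{
	bool hlsq = (hlsq_control_0 & HYP_HLSQ_FS_SUPERTHREAD_ENABLE) != 0;
	bool sp = (sp_fs_ctrl_reg0 & HYP_SP_FS_SUPERTHREAD_MODE) != 0;

	return hlsq == sp;
}

/* Build both register values from one flag, so the two bits are always
 * programmed consistently. */
static void
build_fs_superthread_regs(bool enable, uint32_t *hlsq_control_0,
		uint32_t *sp_fs_ctrl_reg0)
{
	*hlsq_control_0 = enable ? HYP_HLSQ_FS_SUPERTHREAD_ENABLE : 0;
	*sp_fs_ctrl_reg0 = enable ? HYP_SP_FS_SUPERTHREAD_MODE : 0;

	assert(superthread_config_matches(*hlsq_control_0, *sp_fs_ctrl_reg0));
}
```

Before this change the driver was in the state where only the SP-side bit
was set, which the check above would report as a mismatch.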
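diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_program.c b/src/gallium/drivers/freedreno/a3xx/fd3_program.c
index edc3a4dd6c457010a07b94f70fdb6544a13f99c2..9582584785716558553d2d1f3341a1ca0d1c0d06 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_program.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_program.c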
@@ -226,6 +226,7 @@ fd3_program_emit(struct fd_ringbuffer *ring, struct fd3_emit *emit,
 
        OUT_PKT0(ring, REG_A3XX_HLSQ_CONTROL_0_REG, 6);
        OUT_RING(ring, A3XX_HLSQ_CONTROL_0_REG_FSTHREADSIZE(FOUR_QUADS) |
+                       A3XX_HLSQ_CONTROL_0_REG_FSSUPERTHREADENABLE |
                        A3XX_HLSQ_CONTROL_0_REG_CONSTMODE(constmode) |
                        /* NOTE:  I guess SHADERRESTART and CONSTFULLUPDATE maybe
                         * flush some caches? I think we only need to set those