i965: Implement the PMA stall fix.
author Kenneth Graunke <kenneth@whitecape.org>
Wed, 22 Oct 2014 15:58:58 +0000 (08:58 -0700)
committer Kenneth Graunke <kenneth@whitecape.org>
Tue, 4 Nov 2014 19:38:01 +0000 (11:38 -0800)
commit 7423cc891b4d6fcc63bfeb79cc1d711ce81122bd
tree d303842e3932b852d6b4351f685545fa5b928ca9
parent 8ccf54ab098032da4652b314761c04f7724a7277

Certain non-promoted depth cases typically incur stalls.  In very
specific cases, we can enable a workaround which improves performance.

Improves performance in GLBenchmark 2.7 TRex by 1.17762% +/- 0.448765%
(n=75) at 1280x720 on Broadwell GT3.

Haswell has this feature as well, but we can't currently write registers
from userspace batches (and we'd incur additional software batch
scanning overhead as well), so we haven't enabled it.  Broadwell allows
us to write CACHE_MODE_1.  Backporters beware: the formula and flushing
incantation differ between Haswell and Broadwell.
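
As a rough illustration of the register-write pattern described above, here
is a minimal standalone sketch.  It is not the code from this patch.  It
assumes the register follows a masked-write convention (upper 16 bits select
which of the lower 16 bits get updated) and that the driver remembers the
last value it wrote so redundant writes, and the flushes that would precede
them, can be skipped.  Every name and constant below (sketch_context,
SKETCH_*, the bit positions, the command header) is a hypothetical
placeholder, and the required flush is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; not taken from driver headers. */
#define SKETCH_LOAD_REGISTER_IMM 0x11000001u /* hypothetical command header */
#define SKETCH_CACHE_MODE_1      0x7004u     /* hypothetical register offset */
#define SKETCH_PMA_FIX_ENABLE    (1u << 0)   /* hypothetical bit positions */
#define SKETCH_EARLY_Z_DISABLE   (1u << 1)
#define SKETCH_PMA_BITS (SKETCH_PMA_FIX_ENABLE | SKETCH_EARLY_Z_DISABLE)

#define SKETCH_BATCH_MAX 64

/* Stand-in for the driver context: a tiny batch plus the cached bits. */
struct sketch_context {
   uint32_t batch[SKETCH_BATCH_MAX];
   size_t used;
   uint32_t pma_stall_bits; /* last value written to the register */
};

/* Pack a masked write: the upper half enables updates to the lower half. */
static uint32_t
sketch_masked_value(uint32_t mask, uint32_t bits)
{
   return (mask << 16) | (bits & mask);
}

/* Returns 1 if commands were emitted, 0 if the write was skipped. */
static int
sketch_write_pma_stall_bits(struct sketch_context *ctx, uint32_t bits)
{
   /* Unchanged value: avoid the stall and the register write entirely. */
   if (ctx->pma_stall_bits == bits)
      return 0;

   ctx->pma_stall_bits = bits;

   /* A real driver would emit its required flush here, before the write. */

   assert(ctx->used + 3 <= SKETCH_BATCH_MAX);
   ctx->batch[ctx->used++] = SKETCH_LOAD_REGISTER_IMM;
   ctx->batch[ctx->used++] = SKETCH_CACHE_MODE_1;
   ctx->batch[ctx->used++] = sketch_masked_value(SKETCH_PMA_BITS, bits);
   return 1;
}
```

Keeping the cached value in the context itself, rather than in per-upload
state, means a single comparison decides whether anything is emitted at all.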

v2: Move pma_stall_bits from brw->state to brw itself (requested by
    Kristian Høgsberg).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/brw_context.h
src/mesa/drivers/dri/i965/brw_state.h
src/mesa/drivers/dri/i965/brw_state_upload.c
src/mesa/drivers/dri/i965/gen8_depth_state.c