radeonsi: use optimal packet order when doing a pipeline sync
author Marek Olšák <marek.olsak@amd.com>
Fri, 4 Aug 2017 15:38:57 +0000 (17:38 +0200)
committer Marek Olšák <marek.olsak@amd.com>
Mon, 7 Aug 2017 19:12:24 +0000 (21:12 +0200)
commit 0fe0320dc074023489e2852771edc487c0142927
tree dc3bf60bdb7ddd0986fb063608ce461a80d86699
parent 895de1d03d5bd0f7a4e46ad5128cdb3a0add7ec1

Process most new SET packets in parallel with previous draw calls, then
flush caches and wait, start the draw, and do L2 prefetches last.

This decreases the [CP busy / SPI busy] ratio (verified with GRBM perf
counters). In other words, the time window when shaders are idle (between
the wait and the draw) is much shorter now.
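
The ordering described above can be illustrated with a minimal sketch. It
is not the code in si_state_draw.c: the step names, the cmd_log structure,
and both functions are hypothetical and only record the order in which a
driver might emit each group of packets. It also treats the SET packets as
a single step, although the message says only "most" of them move ahead of
the wait.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical step names for illustration; not the driver's identifiers. */
enum step {
    STEP_EMIT_SET_PACKETS, /* state changes that can overlap previous draws */
    STEP_FLUSH_AND_WAIT,   /* cache flush plus wait for previous draws */
    STEP_EMIT_DRAW,        /* the draw packet itself */
    STEP_L2_PREFETCH       /* L2 prefetches, issued after the draw starts */
};

#define MAX_STEPS 8

/* Records the order in which steps were "emitted". */
struct cmd_log {
    enum step steps[MAX_STEPS];
    size_t count;
};

static void record(struct cmd_log *log, enum step s)
{
    assert(log->count < MAX_STEPS);
    log->steps[log->count++] = s;
}

/*
 * Order for a draw that needs a pipeline sync, as described in the
 * commit message: SET packets first, so the CP processes them while
 * earlier draws are still running, then the flush and wait, then the
 * draw, and the L2 prefetches last.
 */
static void draw_with_pipeline_sync(struct cmd_log *log)
{
    record(log, STEP_EMIT_SET_PACKETS);
    record(log, STEP_FLUSH_AND_WAIT);
    record(log, STEP_EMIT_DRAW);
    record(log, STEP_L2_PREFETCH);
}
```

In this sketch, the only work left between STEP_FLUSH_AND_WAIT and
STEP_EMIT_DRAW is the draw packet itself, which corresponds to the shorter
shader-idle window mentioned above.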

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
src/gallium/drivers/radeonsi/si_state_draw.c