radeonsi: optimize TCS epilog when invocation 0 writes tess factors
author Marek Olšák <marek.olsak@amd.com>
Tue, 5 Sep 2017 11:40:59 +0000 (13:40 +0200)
committer Marek Olšák <marek.olsak@amd.com>
Mon, 11 Sep 2017 17:02:02 +0000 (19:02 +0200)
commit 6eade342eb223313242c1c2a7615b6bd75036087
tree 50238402786df6ac9c443a4870c6938db23a69de
parent 386d165d8d09317fe073d00da38f8851a9c33ee6

This removes the barrier and the LDS stores and loads for tess factors
when possible. The removal of the barrier seems more important
to me, though.

In one shader, it removes 17 * 4 bytes from the shader binary.
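
The decision described above can be sketched roughly as follows. This is only an illustrative model, not the code from this commit: the struct `epilog_plan`, the function `plan_tcs_epilog`, and the flag `invoc0_writes_all_tessfactors` are hypothetical names. It assumes the epilog needs the barrier and the LDS round trip only when it cannot rely on invocation 0 having written the tess factors itself.

```c
#include <stdbool.h>

/* Hypothetical description of what the TCS epilog has to emit.
 * None of these names come from the radeonsi sources. */
struct epilog_plan {
    bool emit_barrier;       /* wait for all invocations before reading */
    bool use_lds_roundtrip;  /* store tess factors to LDS, then load them */
};

/* Assumption: if invocation 0 writes the tess factors, the epilog can use
 * the values that invocation already holds, so it needs neither the
 * barrier nor the LDS stores and loads. Otherwise it keeps both. */
static struct epilog_plan plan_tcs_epilog(bool invoc0_writes_all_tessfactors)
{
    struct epilog_plan plan;

    plan.emit_barrier = !invoc0_writes_all_tessfactors;
    plan.use_lds_roundtrip = !invoc0_writes_all_tessfactors;
    return plan;
}
```

Calling `plan_tcs_epilog(true)` in this model yields a plan with no barrier and no LDS traffic, which corresponds to the optimized case; `plan_tcs_epilog(false)` keeps both.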

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
src/gallium/auxiliary/tgsi/tgsi_scan.c
src/gallium/drivers/radeonsi/si_shader.c
src/gallium/drivers/radeonsi/si_shader.h
src/gallium/drivers/radeonsi/si_shader_internal.h
src/gallium/drivers/radeonsi/si_state_shaders.c