radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERIC
author Marek Olšák <marek.olsak@amd.com>
Sun, 28 May 2017 22:40:39 +0000 (00:40 +0200)
committer Marek Olšák <marek.olsak@amd.com>
Wed, 7 Jun 2017 18:14:15 +0000 (20:14 +0200)
commit 2b8b9a56efc24cc0f27469bf1532c288cdca2076
tree 338aae3dc064c1ff78befc7c8d26ff3d21bf780c
parent 2c4ec3f93fcab3fddcbe132200b210e7def1facc

LDS usage for LS+HS in Unigine Heaven is shown below. The masks are
the "outputs_written" bitmasks of LS and HS (HS is labeled "tcs").
Note that 32K is the maximum LDS size.

Before:
  heaven_x64: ls=1f1 tcs=1f1, lds=32K
  heaven_x64: ls=31 tcs=31, lds=24K
  heaven_x64: ls=71 tcs=71, lds=28K

After:
  heaven_x64: ls=3f tcs=3f, lds=24K
  heaven_x64: ls=7 tcs=7, lds=13K
  heaven_x64: ls=f tcs=f, lds=17K
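
The relation between the two sets of masks can be sketched in C. This is
only an illustration, not the driver code: it assumes the old numbering
was POSITION=0, PSIZE=1, CLIPDIST=2..3, GENERIC from 4, and the new one
POSITION=0, GENERIC from 1 (a layout inferred from the masks above), and
that the per-vertex LDS footprint grows with the index of the highest set
bit. The helper names remap_mask and slots_spanned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of IO slots a mask spans: index of the highest set bit plus 1.
 * Assumption: the LDS stride is sized by this span, not by the popcount. */
static unsigned slots_spanned(uint64_t mask)
{
    unsigned n = 0;

    while (mask) {
        n++;
        mask >>= 1;
    }
    return n;
}

/* Translate a mask from the assumed old layout to the assumed new one.
 * Only valid when PSIZE and CLIPDIST (old bits 1..3) are not written,
 * which holds for all three Heaven masks listed above. */
static uint64_t remap_mask(uint64_t old_mask)
{
    assert((old_mask & 0xe) == 0);

    uint64_t position = old_mask & 1;   /* POSITION stays at bit 0 */
    uint64_t generics = old_mask >> 4;  /* GENERIC[i] was at bit 4 + i */

    return position | (generics << 1);  /* GENERIC[i] is now at bit 1 + i */
}
```

Under these assumptions, 0x1f1 maps to 0x3f (9 slots spanned down to 6),
0x31 maps to 0x7 (6 down to 3), and 0x71 maps to 0xf (7 down to 4), which
matches the direction of the LDS numbers above.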

All other apps have a similar decrease in LDS usage, because
the "outputs_written" masks are similar. Also, most apps don't write
POSITION in these shader stages, so there is room for improvement.
(Tight per-component input/output packing might help even more.)

It's unknown whether this improves performance.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
src/gallium/drivers/radeonsi/si_shader.c
src/gallium/drivers/radeonsi/si_state_shaders.c