radeonsi: implement TC-compatible HTILE
author Marek Olšák <marek.olsak@amd.com>
Tue, 11 Oct 2016 21:19:46 +0000 (23:19 +0200)
committer Marek Olšák <marek.olsak@amd.com>
Thu, 13 Oct 2016 17:00:51 +0000 (19:00 +0200)
commit d4d9ec55c589156df4edc227a86b4a8c41048d58
tree 646cdd6806f7a311c7e8a1403d5e715a79386af7
parent a077185ea9d685967844b68aa09da6bd8aa430da
radeonsi: implement TC-compatible HTILE

so that decompress blits aren't needed and depth texturing needs less
memory bandwidth.

Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible
HTILE only supports Z32_FLOAT. This doubles the memory footprint for Z16.
The format promotion is not visible to state trackers.
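
The promotion rule and its memory cost can be illustrated with a minimal
sketch. This is not the driver code from this commit: the enum, the function
names and the `tc_compatible_htile` flag are all hypothetical, and the sketch
assumes that Z24 is stored in a 32-bit word, so only Z16 grows when promoted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical depth formats, for illustration only (not Mesa's enum). */
enum depth_format {
    DEPTH_Z16,
    DEPTH_Z24,       /* assumption: 24-bit depth stored in a 32-bit word */
    DEPTH_Z32_FLOAT,
};

/* Bytes per depth sample as stored in memory (assumed sizes). */
static size_t depth_bytes_per_sample(enum depth_format fmt)
{
    switch (fmt) {
    case DEPTH_Z16:
        return 2;
    case DEPTH_Z24:
    case DEPTH_Z32_FLOAT:
        return 4;
    }
    return 0;
}

/* Pick the format that is actually allocated. When TC-compatible HTILE
 * is requested, every depth format becomes Z32_FLOAT; otherwise the
 * requested format is kept unchanged. */
static enum depth_format promote_for_tc_htile(enum depth_format requested,
                                              bool tc_compatible_htile)
{
    if (!tc_compatible_htile)
        return requested;
    return DEPTH_Z32_FLOAT;
}

/* Size of the depth plane alone, ignoring alignment, tiling and HTILE. */
static size_t depth_plane_bytes(enum depth_format fmt,
                                size_t width, size_t height)
{
    return width * height * depth_bytes_per_sample(fmt);
}
```

Under these assumptions, a 1024x1024 Z16 plane takes 2 MiB as requested and
4 MiB after promotion, while a Z24 plane keeps the same size. The caller would
still report the requested format to the state tracker, which is what keeps
the promotion invisible.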

This is part of TC-compatible renderbuffer compression, which has 3 parts:
DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now.

I don't see a measurable increase in performance though.

(I tested Talos Principle and DiRT: Showdown. The latter improved by
 0.5%, which is almost noise. It originally used layered Z16,
 so at least we know that Z16 promoted to Z32F isn't slower now.)

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
src/gallium/drivers/radeon/r600_pipe_common.h
src/gallium/drivers/radeon/r600_texture.c
src/gallium/drivers/radeon/radeon_winsys.h
src/gallium/drivers/radeonsi/si_blit.c
src/gallium/drivers/radeonsi/si_descriptors.c
src/gallium/drivers/radeonsi/si_shader.c
src/gallium/drivers/radeonsi/si_state.c
src/gallium/drivers/radeonsi/si_state_draw.c
src/gallium/winsys/amdgpu/drm/amdgpu_surface.c