aco: Use 24-bit multiplication in TCS I/O
author Timur Kristóf <timur.kristof@gmail.com>
Thu, 23 Apr 2020 13:39:56 +0000 (15:39 +0200)
committer Marge Bot <eric+marge@anholt.net>
Fri, 24 Apr 2020 17:58:57 +0000 (17:58 +0000)
commit eafc1e7365ec52d7cb979396ff977d6301cb4b7f
tree fbb8475e18f9221f75f1ccdb0bf06d7972f3cd15
parent 64332a0937f731fe7b090bee7d3e9f813e341e5b

TCS inputs and outputs must always fit in LDS, which implies
that their addresses always fit in 24 bits.

On AMD GPUs, 24-bit multiplication is much faster than 32-bit
multiplication, so use it for the TCS I/O address calculations
instead.
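
The reasoning above can be illustrated with a small standalone sketch. It
is not code from this patch: the function names are hypothetical, and
mul_u24 only models the usual semantics of an unsigned 24-bit multiply
(such as v_mul_u32_u24), where the low 24 bits of each operand are
multiplied and the low 32 bits of the product are kept. Under that
assumption, the two multiplies agree whenever both operands fit in 24 bits,
and can differ otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of an unsigned 24-bit multiply: only the low 24 bits of
// each operand take part, and the low 32 bits of the product are returned.
constexpr uint32_t mul_u24(uint32_t a, uint32_t b)
{
   const uint64_t a24 = a & 0xffffffu;
   const uint64_t b24 = b & 0xffffffu;
   return static_cast<uint32_t>(a24 * b24);
}

// Reference: low 32 bits of a full 32-bit multiply.
constexpr uint32_t mul_lo_u32(uint32_t a, uint32_t b)
{
   return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// Hypothetical LDS offset computation: index * stride, with both values
// small enough to fit in 24 bits because the data must fit in LDS.
constexpr uint32_t lds_offset(uint32_t index, uint32_t stride)
{
   return mul_u24(index, stride);
}

// Operands that fit in 24 bits: both multiplies give the same result.
static_assert(mul_u24(63u, 512u) == mul_lo_u32(63u, 512u),
              "24-bit multiply matches for small operands");

// An operand that needs bit 24: the 24-bit multiply drops it.
static_assert(mul_u24(0x1000000u, 2u) != mul_lo_u32(0x1000000u, 2u),
              "24-bit multiply is not valid for larger operands");

// Runtime check over a range of small index and stride values.
inline void check_small_operands()
{
   for (uint32_t index = 0; index < 64; ++index)
      for (uint32_t stride = 0; stride <= 4096; stride += 16)
         assert(lds_offset(index, stride) == mul_lo_u32(index, stride));
}
```

The second static_assert shows why the substitution is only safe when the
operands are known to fit in 24 bits, which is the property the LDS size
limit provides here.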

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
src/amd/compiler/aco_instruction_selection.cpp