aco: Implement subgroup shuffle on GFX6-7.
author Timur Kristóf <timur.kristof@gmail.com>
Tue, 26 May 2020 23:28:03 +0000 (01:28 +0200)
committer Marge Bot <eric+marge@anholt.net>
Tue, 2 Jun 2020 21:12:12 +0000 (21:12 +0000)
commit 045c9ffa7d7f496ba347aa7acbfc0edea37a0fc1
tree 92acc21efd60643756c3af608d70027cd19ef3f8
parent 14a5021aff661a26d76f330fec55d400d35443a8

GFX6 and GFX7 don't have the ds_bpermute (or permute) instruction,
but we would like to support subgroup shuffle on these old GPUs.

So we introduce a new pseudo instruction which will be lowered
to an "unrolled loop" that emulates bpermute on GFX6 and GFX7
using readlane instructions, while also respecting the exec mask
thanks to v_cmpx.
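
For illustration only, here is a rough CPU-side model of what such an
"unrolled loop" computes. It is a sketch, not the code added by this
commit: the names (VGPR, kWaveSize, emulate_bpermute) are hypothetical,
a 64-lane wave is assumed, and the index is assumed to be a plain lane
number rather than a byte offset.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of one vector register: one 32-bit value per lane.
constexpr unsigned kWaveSize = 64; // assumption: wave64
using VGPR = std::array<uint32_t, kWaveSize>;

// Computes dst[lane] = data[index[lane]] for every lane enabled in exec.
// Lanes that are disabled in exec keep their previous dst value.
VGPR emulate_bpermute(const VGPR &data, const VGPR &index,
                      uint64_t exec, VGPR dst)
{
   const uint64_t saved_exec = exec;

   // One iteration per source lane; the real lowering would emit these
   // steps back to back instead of looping at run time.
   for (unsigned n = 0; n < kWaveSize; ++n) {
      // v_cmpx-like step: narrow exec to the originally active lanes
      // whose index selects source lane n.
      uint64_t mask = 0;
      for (unsigned lane = 0; lane < kWaveSize; ++lane) {
         const bool active = (saved_exec >> lane) & 1u;
         if (active && index[lane] == n)
            mask |= uint64_t{1} << lane;
      }
      exec = mask;

      // readlane-like step: copy lane n of data into a scalar value.
      const uint32_t scalar = data[n];

      // Masked move: only lanes left in exec receive the scalar.
      for (unsigned lane = 0; lane < kWaveSize; ++lane) {
         if ((exec >> lane) & 1u)
            dst[lane] = scalar;
      }
   }

   exec = saved_exec; // the original exec mask is restored at the end
   return dst;
}
```

In this model, a lane that was inactive on entry is never written,
which is the sense in which the exec mask is respected.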

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5223>
src/amd/compiler/aco_instruction_selection.cpp
src/amd/compiler/aco_lower_to_hw_instr.cpp