radv/llvm: fix subgroup shuffle for chips without bpermute

bpermute only exists on GFX8+, and on GFX10 it only works with Wave32.
On every other configuration, use readlane instead. readlane needs a
uniform lane index, so wrap it in a waterfall loop to work around the
LLVM backend.
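
For reference, the waterfall idea can be sketched as a CPU-side
simulation. This is only an illustration under stated assumptions, not
the LLVM IR that radv emits: shuffle_waterfall and WAVE_SIZE are
hypothetical names, the wave is modelled as plain arrays, and out-of-range
indices are wrapped to keep the sketch in bounds.

```c
#include <stdint.h>

#define WAVE_SIZE 64 /* hypothetical: one Wave64 subgroup */

/*
 * Simulated subgroup shuffle: dst[lane] = src[index[lane]].
 * Each iteration picks the index of the first pending lane (the
 * readfirstlane step), performs one read with that uniform index (the
 * readlane step), and retires every pending lane that asked for the
 * same index. Returns the number of loop iterations.
 */
static unsigned shuffle_waterfall(const uint32_t src[WAVE_SIZE],
                                  const uint32_t index[WAVE_SIZE],
                                  uint32_t dst[WAVE_SIZE])
{
    uint64_t pending = ~UINT64_C(0); /* assume all lanes start active */
    unsigned iterations = 0;

    while (pending) {
        /* Find the lowest lane that has not been served yet. */
        unsigned first = 0;
        while (!((pending >> first) & 1))
            first++;

        /* This value is the same for every lane in this iteration. */
        uint32_t uniform_index = index[first];
        /* Wrapping is an assumption of this sketch, to stay in bounds. */
        uint32_t value = src[uniform_index % WAVE_SIZE];

        /* Retire all pending lanes that requested the same index. */
        for (unsigned lane = 0; lane < WAVE_SIZE; lane++) {
            if (((pending >> lane) & 1) && index[lane] == uniform_index) {
                dst[lane] = value;
                pending &= ~(UINT64_C(1) << lane);
            }
        }
        iterations++;
    }
    return iterations;
}
```

In this model the loop runs once per distinct index value, so a fully
uniform index costs one iteration and a fully divergent one costs
WAVE_SIZE iterations.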

This fixes DOOM Eternal, which requires subgroup shuffle.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>