ac/llvm: add better code for fsign
There are 2 improvements:
- better code for 16, 32, and 64 bits
- vector support for 16 and 32 bits
Totals:
SGPRS:
2639738 ->
2625882 (-0.52 %)
VGPRS:
1534120 ->
1533916 (-0.01 %)
Spilled SGPRs: 3541 -> 3557 (0.45 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 292 -> 292 (0.00 %) dwords per thread
Code Size:
55640332 ->
55384892 (-0.46 %) bytes
Max Waves: 964785 -> 964857 (0.01 %)
Totals from affected shaders:
SGPRS: 377352 -> 363496 (-3.67 %)
VGPRS: 209800 -> 209596 (-0.10 %)
Spilled SGPRs: 1979 -> 1995 (0.81 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 256 -> 256 (0.00 %)
Scratch size: 256 -> 256 (0.00 %) dwords per thread
Code Size:
12549300 ->
12293860 (-2.04 %) bytes
Max Waves: 105762 -> 105834 (0.07 %)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284>