radeonsi: emit 1/sqrt for RSQ
We don't need the clamped version and we don't have to use any intrinsic.
Stats on Tonga:
15382 shaders in 9128 tests
Totals:
SGPRS:
1230560 ->
1230560 (0.00 %)
VGPRS: 469577 -> 462504 (-1.51 %)
Code Size:
22089908 ->
21730052 (-1.63 %) bytes
LDS: 598 -> 598 (0.00 %) blocks
Scratch: 283648 -> 281600 (-0.72 %) bytes per wave
Max Waves: 125664 -> 126969 (1.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 547280 -> 547280 (0.00 %)
VGPRS: 269132 -> 262059 (-2.63 %)
Code Size:
15709604 ->
15349748 (-2.29 %) bytes
LDS: 198 -> 198 (0.00 %) blocks
Scratch: 74752 -> 72704 (-2.74 %) bytes per wave
Max Waves: 47840 -> 49145 (2.73 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>