anv: Emit pushed UBO bounds checking code in the back-end compiler
This commit fixes performance regressions introduced by
e03f9652801ad7
in which we started bounds checking our push constants. This added a
LOT of shader code to shaders which use the robustBufferAccess feature
and led to substantial spilling. The checking we just added to the FS
back-end is far more efficient for two reasons:
1. It can be done at a whole register granularity rather than per-
scalar and so we emit one SIMD8 SEL per 32B GRF rather than one
SIMD16 SEL (executed as two SELs) for each component loaded.
2. Because we do it with NoMask instructions, we can do it on whole
pushed GRFs without splatting them out to SIMD8 or SIME16 values.
This means that robust buffer access no longer explodes our register
pressure for no good reason.
As a tiny side-benefit, we're now using can use AND instead of SEL which
means no need for the flag and better scheduling.
Vulkan pipeline database results on ICL:
Instructions in all programs:
293586059 ->
238009118 (-18.9%)
SENDs in all programs:
13568515 ->
13568515 (+0.0%)
Loops in all programs: 149720 -> 149720 (+0.0%)
Cycles in all programs:
88499234498 ->
84348917496 (-4.7%)
Spills in all programs:
1229018 -> 184339 (-85.0%)
Fills in all programs:
1348397 -> 246061 (-81.8%)
This also improves the performance of a few apps:
- Shadow of the Tomb Raider: +4%
- Witcher 3: +3.5%
- UE4 Shooter demo: +2%
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4447>