For bindless SSBO access, we have to do 64-bit address calculations. On
ICL and above, we don't have 64-bit integer support, so we have to lower
the address calculations to 32-bit arithmetic. If we lower before
running the optimization loop, none of the address chain calculations
get folded first, and they aren't really foldable once they have been
split into 32-bit operations. Running brw_nir_optimize before
nir_lower_int64, and again afterwards if the lowering made progress,
cuts the size of the generated code in the compute shader in
dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 by around 30%.
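
As a rough illustration of why the ordering matters, here is a minimal
standalone C sketch. It is not the NIR lowering itself: u32pair,
add64_lowered, addr_folded, addr_lowered_unfolded and the offsets 16 and
32 are all hypothetical names and values made up for this example. It
only shows the general shape of a 64-bit add once it has been split into
32-bit halves with a carry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration: a 64-bit value held as two
 * 32-bit halves, as it would be after 64-bit lowering. */
typedef struct { uint32_t lo, hi; } u32pair;

static u32pair split(uint64_t v)
{
   u32pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
   return p;
}

static uint64_t join(u32pair p)
{
   return ((uint64_t)p.hi << 32) | p.lo;
}

/* A 64-bit add written as two 32-bit adds plus a carry. */
static u32pair add64_lowered(u32pair a, u32pair b)
{
   u32pair r;
   r.lo = a.lo + b.lo;
   uint32_t carry = r.lo < a.lo;   /* 1 if the low add wrapped */
   r.hi = a.hi + b.hi + carry;
   return r;
}

/* Optimized while still 64-bit: (base + 16) + 32 folds to base + 48,
 * so only one add is left to lower. */
static uint64_t addr_folded(uint64_t base)
{
   return base + 48;
}

/* Lowered first: each add is its own add/compare/add sequence, and the
 * carry of the first one feeds the second, which hides the fact that
 * the two constants could simply have been summed. */
static uint64_t addr_lowered_unfolded(uint64_t base)
{
   u32pair t = add64_lowered(split(base), split(16));
   t = add64_lowered(t, split(32));
   return join(t);
}

/* Both orderings compute the same address; only the amount of work
 * differs. */
static void check_same_address(uint64_t base)
{
   assert(addr_folded(base) == addr_lowered_unfolded(base));
}
```

Both functions return the same address for any base, including ones
where the low half wraps. The difference is only in how many 32-bit
operations remain, which is the kind of saving this change is after.
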
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
UNUSED bool progress; /* Written by OPT */
OPT(brw_nir_lower_mem_access_bit_sizes);
- OPT(nir_lower_int64, nir->options->lower_int64_options);
do {
progress = false;
brw_nir_optimize(nir, compiler, is_scalar, false);
+ if (OPT(nir_lower_int64, nir->options->lower_int64_options))
+ brw_nir_optimize(nir, compiler, is_scalar, false);
+
if (devinfo->gen >= 6) {
/* Try and fuse multiply-adds */
OPT(brw_nir_opt_peephole_ffma);