Before, we would only do scheduling after register allocation if we
spilled, despite the fact that the pre-RA scheduler was only supposed to
be for register pressure and set the latencies of every instruction to
1. This meant that unless we spilled, which we rarely do, then we never
considered instruction latencies at all, and we usually never bothered
to try and hide texture fetch latency. Although a later commit removes
the setting the latency to 1 part, we still want to always run the
post-RA scheduler since it's able to take the false dependencies that
the register allocator creates into account, and it can be more
aggressive than the pre-RA scheduler since it doesn't have to worry
about register pressure at all.
Test master post-ra-sched diff %diff
bench_OglPSBump2 396.730 402.386 5.656 +1.400%
bench_OglPSBump8 244.370 247.591 3.221 +1.300%
bench_OglPSPhong 241.117 242.002 0.885 +0.300%
bench_OglPSPom 59.555 59.725 0.170 +0.200%
bench_OglShMapPcf 86.149 102.346 16.197 +18.800%
bench_OglVSTangent 388.849 395.489 6.640 +1.700%
bench_trex 65.471 65.862 0.390 +0.500%
bench_trexoff 69.562 70.150 0.588 +0.800%
bench_heaven 25.179 25.254 0.074 +0.200%
Reviewed-by: Jason Ekstrand <jasoan.ekstrand@intel.com>
if (failed)
return;
- if (!allocated_without_spills)
- schedule_instructions(SCHEDULE_POST);
+ schedule_instructions(SCHEDULE_POST);
if (last_scratch > 0)
prog_data->total_scratch = brw_get_scratch_size(last_scratch);