broadcom/vc4: Use the RA callback to improve register selection's choices.
author Eric Anholt <eric@anholt.net>
Thu, 4 May 2017 21:44:38 +0000 (14:44 -0700)
committer Eric Anholt <eric@anholt.net>
Tue, 25 Jul 2017 21:55:10 +0000 (14:55 -0700)
commit 53492917e2153e9f5eb503792c2793a8e4cba391
tree 6a76eb0df685965c0670827da142bee1abb0f595
parent 7a34a0e8903249c41fae06fea22be105caf290df

We simply pick r4 if available (anything else would force a MOV), then
round-robin through accumulators (avoids physical regfile RAW delay
slots), then round-robin through the physical regfile.
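
The selection order above can be sketched as a small standalone function.
This is only an illustration, not the code in the patch: the register
index layout (r0..r4 first, then 64 physical registers), the
`select_state` struct, and the plain `bool` availability array are all
assumptions made for the example, standing in for whatever candidate set
the RA callback actually receives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register numbering, for illustration only. */
enum {
        ACC_INDEX = 0,                     /* accumulators r0..r4 */
        ACC_COUNT = 5,
        R4_INDEX  = ACC_INDEX + 4,
        AB_INDEX  = ACC_INDEX + ACC_COUNT, /* physical regfile entries */
        AB_COUNT  = 64,
        REG_COUNT = AB_INDEX + AB_COUNT,
};

/* Hypothetical per-allocation state remembering where each
 * round-robin scan should start next time.
 */
struct select_state {
        unsigned next_acc;
        unsigned next_ab;
};

/* Pick one register out of the available candidates. */
static unsigned
select_reg(struct select_state *s, const bool avail[REG_COUNT])
{
        /* Prefer r4: anything else would force a MOV. */
        if (avail[R4_INDEX])
                return R4_INDEX;

        /* Then round-robin through the accumulators, which have no
         * physical regfile RAW delay slot.
         */
        for (unsigned i = 0; i < ACC_COUNT; i++) {
                unsigned off = (s->next_acc + i) % ACC_COUNT;

                if (avail[ACC_INDEX + off]) {
                        s->next_acc = off + 1;
                        return ACC_INDEX + off;
                }
        }

        /* Finally round-robin through the physical regfile. */
        for (unsigned i = 0; i < AB_COUNT; i++) {
                unsigned off = (s->next_ab + i) % AB_COUNT;

                if (avail[AB_INDEX + off]) {
                        s->next_ab = off + 1;
                        return AB_INDEX + off;
                }
        }

        /* The caller is assumed to offer at least one candidate. */
        assert(!"no register available");
        return 0;
}
```

Starting each scan where the previous one stopped is what spreads
consecutive values across different registers, rather than always
reusing the lowest-numbered free one.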

The effect on instruction count is pretty impressive:

total instructions in shared programs: 76563 -> 74526 (-2.66%)
instructions in affected programs:     66463 -> 64426 (-3.06%)

and we could probably do better with a little heuristic of "if we're going
to choose a physical reg, and other operands of instructions using this as
a src have the same physical regfile, then use the other regfile".
src/gallium/drivers/vc4/vc4_register_allocate.c