This feels like the right tradeoff for threads vs uniforms, particularly
given that we often have very short thread segments right now:
total instructions in shared programs: 6411504 -> 6413571 (0.03%)
total threads in shared programs: 153946 -> 154214 (0.17%)
total uniforms in shared programs: 2387665 -> 2393604 (0.25%)
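As a rough illustration of the condition this change introduces, the spill decision can be read as a small predicate. This is only a sketch: `should_spill_now` and its parameters are hypothetical names that do not exist in the driver, and `is_mov_uniform` stands in for the result of `vir_is_mov_uniform(c, map[node].temp)`. It assumes, as the comment in the patch suggests, that spilling a temp that is just a move of a uniform does not need the TMU.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the driver: mirrors the condition the
 * patch adds before calling v3d_spill_reg().
 *
 * node           - spill candidate, or -1 if none was found
 * is_mov_uniform - stand-in for vir_is_mov_uniform(c, map[node].temp)
 * thread_index   - 0 once the thread count cannot be reduced further
 */
static bool
should_spill_now(int node, bool is_mov_uniform, int thread_index)
{
        /* Nothing to spill. */
        if (node == -1)
                return false;

        /* Uniform moves may be spilled at any thread count; anything else
         * waits until threading has been dropped as far as it can go.
         */
        return is_mov_uniform || thread_index == 0;
}
```

With this reading, a uniform-move candidate is spilled even while more threads are still available, which would account for the small rise in uniforms alongside the gain in threads reported above.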
bool ok = ra_allocate(g);
if (!ok) {
- /* Try to spill, if we can't reduce threading first. */
- if (thread_index == 0) {
- int node = v3d_choose_spill_node(c, g, temp_to_node);
+ int node = v3d_choose_spill_node(c, g, temp_to_node);
- if (node != -1) {
- v3d_spill_reg(c, map[node].temp);
+ /* Don't emit spills using the TMU until we've dropped thread
+ * count first.
+ */
+ if (node != -1 &&
+ (vir_is_mov_uniform(c, map[node].temp) ||
+ thread_index == 0)) {
+ v3d_spill_reg(c, map[node].temp);
- /* Ask the outer loop to call back in. */
- *spilled = true;
- }
+ /* Ask the outer loop to call back in. */
+ *spilled = true;