v3d: Enable the late algebraic optimizations to get real subs.
This worked better than my original v3d-local pass for just subs, and is a
huge win over not producing subs.
total instructions in shared programs:
6408469 ->
6167932 (-3.75%)
total threads in shared programs: 153784 -> 154104 (0.21%)
total uniforms in shared programs:
2157078 ->
1905823 (-11.65%)
total max-temps in shared programs: 904546 -> 895796 (-0.97%)
total spills in shared programs: 4959 -> 4993 (0.69%)
total fills in shared programs: 6558 -> 6670 (1.71%)
total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%)
total inst-and-stalls in shared programs:
6434314 ->
6193107 (-3.75%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>