Instructions with 3 source operands have no write mask, so we may replace their
destinations with PV/PS in the next group even if their dst.write is 0.
Note: This is a candidate for the 7.11 branch.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
return r;
for (i = 0; i < max_slots; ++i) {
- if(prev[i] && prev[i]->dst.write && !prev[i]->dst.rel) {
+ if (prev[i] && (prev[i]->dst.write || prev[i]->is_op3) && !prev[i]->dst.rel) {
gpr[i] = prev[i]->dst.sel;
/* cube writes more than PV.X */
if (!is_alu_cube_inst(bc, prev[i]) && is_alu_reduction_inst(bc, prev[i]))