We want the size of a float per component, not the size of a whole vec4.
NIR instructions on i965:
total instructions in shared programs: 1261937 -> 1261929 (-0.00%)
instructions in affected programs: 114 -> 106 (-7.02%)
Looking at one of these examples (tesseract), it's from vec4 load_consts
for an MRT solid fill, which do get CSEed now that we don't memcmp off the
end of the const value and into the SSA def. For the 1-component loads
that are common in i965, we were only memcmping off into the rest of the
usually zero-filled const_value.
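
As a rough illustration of the size mix-up, here is a minimal standalone
sketch. It does not use the real NIR types: struct fake_load_const,
compare_size_old(), compare_size_fixed() and loads_equal() are hypothetical
names made up for this example, and the layout only loosely mirrors a const
value followed by other data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPONENTS 4

/* Hypothetical stand-in for a load_const: a four-float value followed by
 * unrelated data. This is NOT the real NIR layout. */
struct fake_load_const {
    float f[MAX_COMPONENTS];
    unsigned num_components;
    unsigned unrelated;   /* plays the role of whatever follows the value */
};

/* Old size: component count times the size of the whole four-float array. */
static size_t compare_size_old(const struct fake_load_const *load)
{
    return load->num_components * sizeof load->f;
}

/* Fixed size: component count times the size of a single float. */
static size_t compare_size_fixed(const struct fake_load_const *load)
{
    return load->num_components * sizeof(*load->f);
}

/* Compare only the components that are actually in use. */
static bool loads_equal(const struct fake_load_const *a,
                        const struct fake_load_const *b)
{
    assert(a->num_components <= MAX_COMPONENTS);
    if (a->num_components != b->num_components)
        return false;
    return memcmp(a->f, b->f, compare_size_fixed(a)) == 0;
}
```

With four components, the old expression asks for four times the size of
the array, so the comparison runs well past the value. With one component it
still covers the whole array, which is why the 1-component case only read
into the unused part of the value.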
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
return false;
return memcmp(load1->value.f, load2->value.f,
- load1->def.num_components * sizeof load2->value.f) == 0;
+ load1->def.num_components * sizeof(*load2->value.f)) == 0;
}
case nir_instr_type_phi: {
nir_phi_instr *phi1 = nir_instr_as_phi(instr1);