nir/algebraic: Don't mark expression with duplicate sources as commutative
There is no reason to mark the fmul in the expression
('fmul', ('fadd', a, b), ('fadd', a, b))
as commutative. If a source of an instruction doesn't match one of the
('fadd', a, b) patterns, it won't match the other either.
This change is enough to make this pattern work:
('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)),
('fadd', 1.0, ('fneg', a))),
('fmul', ('flrp', a, 1.0, a), b))
This pattern has 5 commutative expressions (versus a limit of 4), but
the first fmul does not need to be commutative.
No shader-db change on any Intel platform. No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).
There are more subpatterns that could be marked as non-commutative, but
detecting these is more challenging. For example, this fadd:
('fadd', ('fmul', a, b), ('fmul', a, c))
The first fadd:
('fmul', ('fadd', a, b), ('fadd', a, b))
And this fadd:
('flt', ('fadd', a, b), 0.0)
This last case may be easier to detect. If all sources are variables
and they are the only instances of those variables, then the pattern can
be marked as non-commutative. It's probably not worth the effort now,
but if we end up with some patterns that bump up on the limit again, it
may be worth revisiting.
v2: Update the comment about the explicit "len(self.sources)" check to
be more clear about why it is necessary. Requested by Connor. Many
Python fixes style / idom fixes suggested by Dylan. Add missing (!!!)
opcode check in Expression::__eq__ method. This bug is the reason the
expected number of commutative expressions in the bitfield_reverse
pattern changed from 61 to 45 in the first version of this patch.
v3: Use all() in Expression::__eq__ method. Suggested by Connor.
Revert away from using __eq__ overloads. The "equality" implementation
of Constant and Variable needed for commutativity pruning is weaker than
the one needed for propagating and validating bit sizes. Using actual
equality caused the pruning to fail for my ('fmul', ('fadd', 1, a),
('fadd', 1, a)) case. I changed the name to "equivalent" rather than
the previous "same_as" to further differentiate it from __eq__.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>