nv50/ir: optimize imul/imad to xmads
This hits the shader-db numbers a good bit, though a few xmads is way
faster than an imul or imad and the cost is mitigated by the next commit,
which optimizes many multiplications by immediates into shorter and less
register heavy instructions than the xmads.
total instructions in shared programs :
5768871 ->
5820882 (0.90%)
total gprs used in shared programs : 669919 -> 670595 (0.10%)
total shared used in shared programs : 548832 -> 548832 (0.00%)
total local used in shared programs : 21068 -> 21164 (0.46%)
local shared gpr inst bytes
helped 0 0 38 0 0
hurt 1 0 365 3076 3076
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>