freedreno: fix temp register usage
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases. For example, if the dst
register is the same as one of the src registers.
For now, just simplify it and always allocate a new register to use as
an intermediate. In some cases this will result in more registers used
than required. I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.
Signed-off-by: Rob Clark <robclark@freedesktop.org>