FPU: Relax timing around multiplier output
At present there is a state transition in the handling of the fmadd
instructions where the next state depends on the sign bit of the
multiplier result. This creates a critical path which doesn't make
timing on the A7-100. To fix this, we make the state transition
independent of the sign of the multiplier result, which improves
timing, but means we take one more cycle to do a fmadd-family
instruction in some cases.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>