execute1: Reduce width of the result mux to help timing

This reduces the number of distinct sources that are assigned to
the result variable.
- The computations for the popcnt, prty, cmpb and exts instruction
families are moved into the logical unit.
- The result of mfspr from the slow SPRs is computed in 'spr_val'
before being assigned to 'result'.
- Writes to LR as a result of a blr or bclr instruction are done
through the exc_write path to writeback.
This eases timing considerably.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>