Remove single-issue constraint for most loads and stores
This removes the constraint that loads and stores are single-issue,
at the expense of a stall of at least 2 cycles for every load and
store.
To do this, we plumb the existing stall signal that was generated
in dcache to core, where it gets ORed with the stall signal from
execute1. Execute1 generates a stall signal for the first two
cycles of each load and store, and dcache generates the stall
signal in the 3rd and subsequent cycles if it needs to.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>