decode2: Rework to make the stall_out signal come from a register
At present the busy/stall signal going to decode1 depends on whether
control thinks it can issue the current instruction, and that depends
on completion and bypass signals coming from execute1 and writeback.
To improve the timing of stall_out, this rearranges decode2 so that
stall_out is asserted when we have a valid instruction that couldn't
be issued in the previous cycle. This means that decode1 could give
us a new instruction when we haven't issued the previous instruction.
This in turn means that we can only use d_in in the first cycle of
processing an instruction. After the first cycle, we get register
addresses etc. from dc2 rather than d_in.
Then, to avoid the need to read register operands from register_file
in each cycle until the instruction issues, we bring the bypass path
for data being written to the register file into decode2 explicitly
rather than having it in register_file.
A new process called decode2_addrs does the process of calling
decode_input_reg_* and decode_output_reg and sets up the register file
addresses. This was split out (and decode_input_reg_* reworked) to
try to reduce the number of passes through the decode2_1 process that
need to be done in simulation.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>