+The plan is, therefore, to effectively add *multiple* Q-Tables
+(or multiple entries), recording the "history" of which *prior*
+Function Units had any given register as its destination.
Now we have exactly the information needed to "roll back", should an
exception occur. Like many augmentations and enhancements to the 6600
Scoreboard system, it's kind-of obvious in retrospect. However the *real*
Why is that important? It's because it's not enough to know that the
down-stream (dependent) instructions have all initiated (read the
FU's dest latch and taken it as a forwarded src operand). If **even one**
-of those instructions throws an exception, the "nameless" FU is hosed.
+of those instructions throws an exception, the "nameless" FU from which that
+value came is hosed, as it has nowhere to put its result.
So, firstly: the "nameless" FU absolutely has to wait until its dependencies
are clear of exceptions, and then **and only** then may it safely drop (throw
away) the data (without writing it to the Register File); and secondly,
instructions does indeed throw an exception. This is where the "History"
Q-Table Entries come into play.
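One way to picture the "History" Q-Table Entries is as a small per-register stack of Function Unit ids, oldest first. The following is a minimal Python sketch under that assumption, not the actual hardware design: the class name `HistoryQTable`, its method names, and the string ids such as `"FU0"` are all hypothetical, introduced only for illustration.

```python
from collections import defaultdict


class HistoryQTable:
    """Hypothetical model: per-register history of destination reservations."""

    def __init__(self):
        # reg -> list of FU ids, oldest first. The last entry is the FU that
        # currently holds the reservation; earlier entries are "nameless" FUs.
        self.history = defaultdict(list)

    def reserve(self, reg, fu):
        """A newly issued instruction on `fu` claims `reg` as its destination."""
        self.history[reg].append(fu)

    def owner(self, reg):
        """Return the FU currently holding the reservation, or None."""
        entries = self.history[reg]
        return entries[-1] if entries else None

    def nameless(self, reg):
        """Return the prior FUs whose reservation on `reg` was overtaken."""
        return self.history[reg][:-1]

    def release(self, reg, fu):
        """Remove `fu` from the history of `reg` (completed or cancelled)."""
        if fu in self.history[reg]:
            self.history[reg].remove(fu)
```

In this model, if `"FU0"` and then `"FU1"` both reserve `"r3"`, `"FU0"` becomes nameless. Releasing `"FU1"` (for example because it was cancelled) makes `"FU0"` the owner again, which is the information a roll-back needs.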
+So there are a few potential ways to go about this:
+* Using the Historical Q-Table Entries, in chronological and Dependency
+ Order, store all "Nameless" Registers (using the "history" to determine
+ where), even if they are going to get overwritten in the next cycle.
+* After the Exception has triggered the "Go\_die" wire and all
+ dependent instructions have been removed (including their Destination
+ Register Reservations), use the "history" information to work out
+ which (formerly nameless) Function Unit(s) now actually hold the
+ Destination Reservation for each "vacated" Register.
+* Any remaining "nameless" Registers, if their results are available,
+ are likewise either stored or trigger their shadow (dependent)
+ instructions to die (even if it's the original exception).
+* Once the dust settles, carry on.
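The second step in the list above (working out who owns each vacated register once the cancelled instructions are gone) can be sketched in a few lines of Python. This is only an illustrative software model under assumed data structures: `history` maps each register to a list of Function Unit ids, oldest first, and `killed` is the set of Function Units cancelled via "Go\_die". The function name `roll_back` and both parameter names are hypothetical. It does not model storing results or the remaining steps.

```python
def roll_back(history, killed):
    """Remove cancelled FUs and report the new owner of each vacated register.

    history: dict mapping register -> list of FU ids, oldest first
             (the last entry holds the current destination reservation).
    killed:  set of FU ids whose instructions were cancelled.

    Returns a dict mapping each vacated register to the formerly nameless
    FU that now holds its reservation, or None if no FU remains.
    """
    restored = {}
    for reg, fus in history.items():
        if not fus:
            continue
        old_owner = fus[-1]
        survivors = [fu for fu in fus if fu not in killed]
        fus[:] = survivors  # update the history in place
        if old_owner in killed:
            # The register was vacated: the most recent surviving prior
            # FU (if any) takes the destination reservation back.
            restored[reg] = survivors[-1] if survivors else None
    return restored
```

For example, with `history = {"r1": ["FU0", "FU2"], "r2": ["FU1"]}` and `killed = {"FU2"}`, register `r1` is vacated and handed back to `FU0`, while `r2` is left untouched.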
+Realistically, this is going to need to be investigated with simulations.
+It's quite complicated; however, the payoff is a significant reduction in
+the workload on the register file. It basically means the difference between
+12 GFLOPs and 6 GFLOPs when doing 32-bit FMACs, at 800MHz (quad-core),
+and still being able to keep to a "standard" 2R1W register file.
+So it's a big deal!
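As a rough sanity check on those figures, here is one possible reading of the arithmetic, counting each FMAC as two floating-point operations (a common convention, assumed here). The per-core FMAC rates of one and two per cycle are assumptions chosen to match the quoted numbers, not figures stated in the text, and the helper name `peak_gflops` is hypothetical.

```python
def peak_gflops(clock_hz, cores, fmacs_per_cycle, flops_per_fmac=2):
    """Peak throughput in GFLOPs for a given sustained FMAC issue rate.

    flops_per_fmac=2 assumes a fused multiply-accumulate counts as one
    multiply plus one add.
    """
    return clock_hz * cores * fmacs_per_cycle * flops_per_fmac / 1e9


# Assumed rates: 1 FMAC per cycle per core vs 2 FMACs per cycle per core.
single_issue = peak_gflops(800e6, cores=4, fmacs_per_cycle=1)
dual_issue = peak_gflops(800e6, cores=4, fmacs_per_cycle=2)
```

Under these assumptions the two cases come out at roughly 6.4 and 12.8 GFLOPs, which lines up with the "6 GFLOPs" and "12 GFLOPs" quoted above.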