move whitepapers
author Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Thu, 5 May 2022 13:21:54 +0000 (14:21 +0100)
committer Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Thu, 5 May 2022 13:21:54 +0000 (14:21 +0100)
openpower/SimpleV_rationale.mdwn [new file with mode: 0644]
openpower/microcontroller_power_isa_for_ai.mdwn [new file with mode: 0644]
openpower/openpower/effect-of-more-decode-stages-on-reg-renaming.mdwn [new file with mode: 0644]
openpower/openpower/sv/effect-of-more-decode-stages-on-reg-renaming.mdwn [deleted file]
openpower/openpower/whitepapers/SimpleV_rationale.mdwn [deleted file]
openpower/openpower/whitepapers/microcontroller_power_isa_for_ai.mdwn [deleted file]

diff --git a/openpower/SimpleV_rationale.mdwn b/openpower/SimpleV_rationale.mdwn
new file mode 100644 (file)
index 0000000..fabbb12
--- /dev/null
@@ -0,0 +1,3 @@
+[[!tag whitepapers]]
+
+# Why in the 2020s would you invent a new Vector ISA
diff --git a/openpower/microcontroller_power_isa_for_ai.mdwn b/openpower/microcontroller_power_isa_for_ai.mdwn
new file mode 100644 (file)
index 0000000..76635dd
--- /dev/null
@@ -0,0 +1,114 @@
+[[!tag whitepapers]]
+
+# Increasing average area efficiency and reducing resource utilisation for the Power ISA
+
+originally posted at: <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-February/004505.html>
+
+in between attempting to compile microwatt and Libre-SOC for an 85k LUT4 FPGA, which took 4 hours (and then did not run), i decided to see what level of resource reduction could be achieved in Libre-SOC's HDL by going to 32 bit ALUs and register files.
+
+the difference was an astounding 1.4 to 1.
+
+* the MUL pipeline dropped an astonishing 75% which, given that multiply is O(N^2), is in retrospect not surprising
+* SHIFT dropped to 50%
+* ALU (add) dropped over 50%
+* Logical dropped over 60%
+* BRAM usage dropped by over 75%
+
+i then took a look at the I-Cache, D-Cache and MMU, and i am not seeing any practical barriers to setting them to 32 bit either, other than needing to define a new RADIX32 data format, which looks to be as simple as reducing the PTE and PDE lengths.
+
+why consider this at all? surely "32 bit is dead, Jim, dead, Jim, dead, Jim, dead" [https://m.youtube.com/watch?v=FCARADb9asE]
+
+the answers are multiple:
+
+* Anton had a hard time getting Microwatt into the Sky130 MPW1, which is limited to 10 mm^2.
+   (Libre-SOC's 180 nm test ASIC was 30 mm^2 and
+     that was with no MMU or L1 D/I-Cache)
+* Compared to RISC-V, which can easily fit into only 3000 LUT4s of an FPGA,
+   a Power ISA implementation is completely missing out on opportunities
+    to be taken up by Hardware enthusiasts because it requires a bare minimum of 40K LUT4s.
+   (Without L1/MMU, Libre-SOC 64-bit is still 20k LUT4s)
+* The high resource utilisation is making life difficult for Libre-BMC
+   and the fact that it is not the slightest bit justified to be running a 64 bit OS, just for a bootloader,
+   leaves me puzzled as to the justification of what is inherently a self-inflicted handicap
+
+The high resource utilisation is (including for Libre-BMC) pressurising everyone, hindering adoption, and slowing down the iteration cycle on development.  The 4 hour turnaround was not a throwaway comment, it was deeply significant: designs using 95% of a 45K LUT4 ECP5 can usually complete on nextpnr-ecp5 in around 12-15 minutes. 4 hours is insane and a waste of time.
+
+it is also worth reiterating that larger designs give FPGA tools a much harder job, dramatically reducing the maximum achievable clock rate.
+
+
+based on the above analysis, a 32 bit implementation of a MMU-capable Power ISA core could easily fit into a lower cost Digilent Arty A7-35t, a 45K LUT4 VERSA_ECP5, and with a little corner-cutting (no MMU/L1) even potentially fit into the low-cost 25K orangecrab with plenty of room.
+
+this would make it affordable and accessible to e.g. students in India, as well as increase general adoption.
+
+not only that but it would cleanly fit into sky130's 10 mm^2 budget (with reduced I/D-Caches), retain an MMU, and have room for some peripherals (kinda important, that)
+
+this in turn allows for a faster iterative cycle on ASIC development through access every couple of *months* to an MPW Shuttle run.
+
+
+the next step requires a little explanation and context.  SVP64 has been designed as a "Sub-Program-Counter for-loop in hardware" (similar to x86 "REP"). it is not a new idea: Peter Hsu, designer of the MIPS R8000, came up with the exact same concept behind SVP64, in 1994.
+
+the register file is treated as a byte-addressable SRAM (with byte-level masks this is not difficult to envisage) and the ALUs end up being conceptually similar to MMX, which can do 8x8 4x16 2x32 or 1x64 bit operations, except that SVP64 introduces predicate masks which of course
+map directly and simply onto the write-select lines of the underlying
+SRAM of the register file.
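the mapping from predicate bits to SRAM write-select lines can be sketched in a few lines of Python. this is only a conceptual model, not Libre-SOC's HDL: the function names, the flat `bytearray` standing in for the register file, and the "one predicate bit per element" layout are all assumptions made for illustration.

```python
def byte_write_enables(predicate: int, elwidth_bytes: int, n_elements: int) -> int:
    """Expand one predicate bit per element into one write-enable bit per byte.

    Hypothetical helper: element i occupies bytes
    [i * elwidth_bytes, (i + 1) * elwidth_bytes) of the destination.
    """
    lane = (1 << elwidth_bytes) - 1  # all bytes of a single element
    enables = 0
    for i in range(n_elements):
        if (predicate >> i) & 1:
            enables |= lane << (i * elwidth_bytes)
    return enables


def masked_write(regfile: bytearray, offset: int, data: bytes, enables: int) -> None:
    """Write only the bytes whose write-enable bit is set (assumed SRAM model)."""
    for i, value in enumerate(data):
        if (enables >> i) & 1:
            regfile[offset + i] = value


# four 16-bit elements packed in 8 bytes, predicate enables elements 0 and 2
regfile = bytearray(8)
enables = byte_write_enables(0b0101, elwidth_bytes=2, n_elements=4)
masked_write(regfile, 0, bytes(range(1, 9)), enables)
```

the same `byte_write_enables` helper covers the 8x8, 4x16, 2x32 and 1x64 cases by changing `elwidth_bytes` to 1, 2, 4 or 8.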
+
+however as an intermediary step on the path to converting Libre-SOC's HDL to cope with 8/16/32/64 we actually have to define and implement *scalar* operations at 8, 16 and 32 bit in addition to those already present in the 64-bit Power ISA.  this is underway with a Draft RFC proposal to define the Power ISA in terms of "XLEN", where XLEN=64 very deliberately, thoroughly and intentionally matches precisely, and by definition, with exactly that which is currently in Power ISA 3.0/3.1
+
+let that sink in a moment because the implications are startling:
+
+      we are in effect defining not only a 32 bit Draft
+      variant of the Power ISA, we (Libre-SOC) are also
+      defining a 16 bit *and an 8 bit* variant of Power
+      [and anticipate someone in the future to
+      define a 128-bit variant to match RISC-V RV128].
+
+bear in mind that SVP64 *has* to have Scalar Operations first, because by design and by definition *only Scalar operations may be Vectorised*.  SVP64 *DOES NOT* add *ANY* Vector Instructions. SVP64 is a generic loop around *Scalar* operations and it is up to the Architecture to take advantage of that, at the back-end.
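as a rough illustration of "a generic loop around Scalar operations", here is a minimal Python sketch. it is not the SVP64 specification: `sv_loop`, its parameters, and the rule that every operand steps by one register per element are simplifying assumptions (the real scheme has scalar/vector marking, element-width overrides and more).

```python
def add16(a: int, b: int) -> int:
    """A scalar 16-bit add, standing in for any existing scalar operation."""
    return (a + b) & 0xFFFF


def sv_loop(scalar_op, regs, rt, ra, rb, vl, predicate=None):
    """Hypothetical sub-PC for-loop: reissue one scalar op over vl elements.

    regs is a plain list modelling the register file; rt, ra and rb are
    starting register numbers. Masked-out elements are left untouched.
    """
    for i in range(vl):
        if predicate is not None and not (predicate >> i) & 1:
            continue
        regs[rt + i] = scalar_op(regs[ra + i], regs[rb + i])


regs = [0] * 16
regs[4:8] = [1, 2, 3, 4]
regs[8:12] = [10, 20, 30, 40]
sv_loop(add16, regs, rt=0, ra=4, rb=8, vl=4)

masked = [0] * 16
masked[4:8] = [1, 2, 3, 4]
masked[8:12] = [10, 20, 30, 40]
sv_loop(add16, masked, rt=0, ra=4, rb=8, vl=4, predicate=0b1010)
```

note that `add16` knows nothing about vectors: the loop is entirely outside the scalar operation, which is the point being made above.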
+
+without SVP64 Sub-Looping it would on the face of it seem absolutely mental and a total waste of time and resources to define an 8 or 16 bit General-Purpose ISA in the year 2022 until you recall that:
+
+* students cannot possibly fit a Power ISA 64 bit implementation into a USD $10 ICE40 FPGA, but they might achieve a 16 bit one, and potentially do so in a few short weeks
+
+* the primary focus of AI is FP16, BF16, and even FP8 in some cases, in massively parallel banks of cores numbering in the thousands, often with SIMD ALUs.
+
+* a typical GPU has over 30% by area dedicated to parallel computational
+resources (SIMD ALUs) where a General-purpose RISC Core is typically
+dwarfed by literally two orders of magnitude by routing, register files,
+caches and peripherals.
+
+the inherent downside of such massively parallel task-centric cores is that they are absolutely useless at anything other than that specialist task, and are additionally a pig to program, lacking a useful ISA and compiler or, worse, having one but under proprietary licenses.
+
+the delicate balance of massively parallel supercomputing architecture is not to overcook the performance of a single core above all else (hint: Intel), but to focus instead on *average* efficiency per *total* area or power.
+
+    what if there was a way to leverage the Power ISA
+    to have high-end AI performance yet be able to
+    allow programmers to use standard compiler tools
+    to run general-purpose programs on all of those
+    massively-parallel cores?
+
+anyone who has tried either CUDA, 3D Shader programs, deep or wide SIMD Programming, or tried to get their heads twisted round GPU SIMT threads would celebrate and welcome the opportunity.
+
+(in particular, anyone who remembers how hard programming the Cell Processor turned out to be will be having that familiar "lightbulb moment" right about now)
+
+more than that: what if those 8 and 16 bit cores had a Supercomputing-class Vectorisation option in the ISA, and there were implementations out there with back-end ALUs that could perform 64 or 128 operations, each 8 or 16 bit wide, per clock cycle?
+
+Quantity several thousand per processor, all of them capable of adapting to run massive AI number crunching or (at lower IPC than "normal" processors) general-purpose compute?
+
+To achieve this requires some insights:
+
+1. access (addressing memory) beyond 8-bit, 16-bit, or 32-bit, can easily be achieved by allowing LD/STs to leverage *multiple* 8/16/32-bit registers to create 32 or 64 bit addresses.
+
+   SVP64 *already* has the concept of allowing consecutive 8/16/32/64 bit registers to be considered a "Vector" so typecasting to create 32 or 64 bit addresses fits easily
+
+2. If the Power ISA did not already have Carry-In/Out and Condition Registers, this entire idea would have much less merit.
+
+the idea of using multiple instructions to construct bigger integer values is nothing new, but doing so is far easier and more efficient if the ISA has Carry Flags.  that particularly hits home if the basic arithmetic width is only 8 or 16 bit!
+
+3. SVP64 already has the concept of extending the GPRs and FPRs to 128 entries.  however if those are say 16 bit registers, the actual size of the regfile is back down to exactly the same total number of bytes as Power ISA 3.0 (128 x 16 bits = 32 x 64 bits = 256 bytes)
+
+  * only 32 16-bit registers would be alarmingly resource pressured, particularly given that 4 of them would be needed to construct a 64 bit LD/ST address
+  * 128 16-bit registers on the other hand are equivalent to 32 64-bit regs and Computer Science shows we are comfortable with that quantity.
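the three insights above can be checked with a small Python sketch: a carry-chained add across 16-bit registers, a 64-bit address built from four of them, and the register file size arithmetic. the limb ordering (least significant register first) and all function names are assumptions for illustration, not anything defined by the Power ISA or SVP64.

```python
LIMB_BITS = 16
LIMB_MASK = (1 << LIMB_BITS) - 1


def add_with_carry(a: int, b: int, carry_in: int):
    """One 16-bit add that consumes and produces a carry flag."""
    total = a + b + carry_in
    return total & LIMB_MASK, total >> LIMB_BITS


def wide_add(a_limbs, b_limbs):
    """Chain the carry through consecutive 16-bit registers (LSB first)."""
    carry = 0
    out = []
    for a, b in zip(a_limbs, b_limbs):
        limb, carry = add_with_carry(a, b, carry)
        out.append(limb)
    return out, carry


def split_limbs(value: int, n: int):
    """Spread a wide value across n 16-bit registers (LSB first)."""
    return [(value >> (i * LIMB_BITS)) & LIMB_MASK for i in range(n)]


def join_limbs(limbs) -> int:
    """Treat consecutive 16-bit registers as one wide value, e.g. an address."""
    value = 0
    for i, limb in enumerate(limbs):
        value |= (limb & LIMB_MASK) << (i * LIMB_BITS)
    return value


# a 64-bit base address held in four 16-bit registers, plus an offset of 8
base = split_limbs(0x0000_0001_FFFF_FFF8, 4)
offset = split_limbs(8, 4)
address_limbs, carry_out = wide_add(base, offset)
address = join_limbs(address_limbs)

# register file size in bytes: 128 x 16-bit versus 32 x 64-bit
small_regfile_bytes = 128 * 16 // 8
power_regfile_bytes = 32 * 64 // 8
```

without a carry flag, each step of `wide_add` would need extra compare instructions to recover the carry, which is why the cost grows so quickly at 8 or 16 bit widths.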
+
+given the ease with which both 32 and 64 bit addresses may be constructed, and 32 and 64 bit integer arithmetic (and beyond) may be created using multiple instructions *and* how much more efficiently that can be done by leveraging SVP64, what at first sounded like an absolutely insane-to-the-point-of-laughable idea would instead be not only workable but would combine General-Purpose Compute and AI workloads into a single hybrid ISA.
+
+as you are no doubt aware this has been the focus of so many unsuccessful ventures for so many decades, it would be nice to have one that worked. but, by definition, being "General" Purpose Compute (that happens to also be Supercomputing AI capable) it starts at the ISA and grows from there.
+
+bottom line, i would very much like to see the Power ISA take on Esperanto, but without having to define a custom proprietary extension to the ISA that nobody but they have access to.
diff --git a/openpower/openpower/effect-of-more-decode-stages-on-reg-renaming.mdwn b/openpower/openpower/effect-of-more-decode-stages-on-reg-renaming.mdwn
new file mode 100644 (file)
index 0000000..740e181
--- /dev/null
@@ -0,0 +1,255 @@
+[[!toc]]
+
+# effect of more decode stages on reg renaming
+
+there's basically no effect except that execution starts a few cycles later. no additional execution resources are needed: notice that the exact same number of renamed hardware registers is used.
+
+# 5 decode stages, 4 wide
+
+| Cycle                      | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7                            | 8                            | 9                             | 10                             | 11                              | 12                              | 13                              | 14                             | 15                 | 16                      | 17                             | 18                             | 19               | 20                      | 21                    | 22             | 23     | 24  |
+|----------------------------|-----|-------|----------|----------|----------|----------|----------|------------------------------|------------------------------|-------------------------------|--------------------------------|---------------------------------|---------------------------------|---------------------------------|--------------------------------|--------------------|-------------------------|--------------------------------|--------------------------------|------------------|-------------------------|-----------------------|----------------|--------|-----|
+| 0x100: mtctr r4            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1              | Write Outputs: h2             | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0              | Write Outputs: h3, h4         | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: addi h5 \<- h3, 100 | Wait: h3                     | Read Inputs: h3               | Write Outputs: h5              | Retire                          |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: std h5, 0(h4)       | Wait: h5, h4                 | Wait: h5                      | Read Inputs: h5, h4            | Write Outputs:                  | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                     | Renamed: bdnz h6 \<- h2, .L2 | Read Inputs: h2               | Write Outputs: h6              | Wait: Retire                    | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: ldu h7, 8(h4 -> h8)  | Wait: all execution pipes busy | Read Inputs: h4                 | Write Outputs: h7, h8           | Retire                          |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: addi h9 \<- h7, 100  | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9               | Retire                         |                    |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: std h9, 0(h8)        | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8             | Write Outputs:                 | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: bdnz h10 \<- h6, .L2 | Read Inputs: h6                | Write Outputs: h10              | Wait: Retire                    | Wait: Retire                    | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12         | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11                | Write Outputs: h13             | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13                       | Read Inputs: h13, h12          | Write Outputs:     | Retire                  |                                |                                |                  |                         |                       |                |        |     |
+| 0x110: bdnz .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: bdnz h14 \<- h10, .L2 | Read Inputs: h10                | Write Outputs: h14              | Wait: Retire                    | Wait: Retire                   | Wait: Retire       | Retire                  |                                |                                |                  |                         |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Wait: all execution pipes busy  | Wait: all execution pipes busy | Read Inputs: h12   | Write Outputs: h15, h16 | Retire                         |                                |                  |                         |                       |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15                       | Wait: h15                      | Wait: h15          | Read Inputs: h15        | Write Outputs: h17             | Retire                         |                  |                         |                       |                |        |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16                  | Wait: h17, h16                 | Wait: h17, h16     | Wait: h17               | Read Inputs: h17, h16          | Write Outputs:                 | Retire           |                         |                       |                |        |     |
+| 0x110: bdnz .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: bdnz h18 \<- h14, .L2  | Read Inputs: h14                | Write Outputs: h18              | Wait: Retire                   | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Retire           |                         |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16                      | Wait: h16          | Read Inputs: h16        | Write Outputs: h19, h20        | Wait: Retire                   | Retire           |                         |                       |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19                      | Wait: h19          | Wait: h19               | Read Inputs: h19               | Write Outputs: h21             | Retire           |                         |                       |                |        |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20                 | Wait: h21, h20     | Wait: h21, h20          | Wait: h21                      | Read Inputs: h21, h20          | Write Outputs:   | Retire                  |                       |                |        |     |
+| 0x110: bdnz .L2            |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: bdnz h22 \<- h18, .L2  | Read Inputs: h18                | Write Outputs: h22             | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Retire                  |                       |                |        |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20                      | Wait: h20          | Wait: h20               | Wait: all execution pipes busy | Wait: all execution pipes busy | Read Inputs: h20 | Write Outputs: h23, h24 | Retire                |                |        |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: addi h25 \<- h23, 100  | Wait: h23                      | Wait: h23          | Wait: h23               | Wait: h23                      | Wait: h23                      | Wait: h23        | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: std h25, 0(h24)        | Wait: h25, h24                 | Wait: h25, h24     | Wait: h25, h24          | Wait: h25, h24                 | Wait: h25, h24                 | Wait: h25, h24   | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |
+| 0x110: bdnz .L2            |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: bdnz h26 \<- h22, .L2  | Read Inputs: h22               | Write Outputs: h26 | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Wait: Retire            | Wait: Retire          | Wait: Retire   | Retire |     |
+
+# 1 decode stage, 4 wide
+
+| Cycle                      | 0   | 1     | 2        | 3                            | 4                            | 5                             | 6                              | 7                               | 8                               | 9                               | 10                             | 11                 | 12                      | 13                             | 14                             | 15               | 16                      | 17                    | 18             | 19     | 20  | 21  | 22  | 23  | 24  |
+|----------------------------|-----|-------|----------|------------------------------|------------------------------|-------------------------------|--------------------------------|---------------------------------|---------------------------------|---------------------------------|--------------------------------|--------------------|-------------------------|--------------------------------|--------------------------------|------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|-----|
+| 0x100: mtctr r4            |     | Fetch | Decode 0 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1              | Write Outputs: h2             | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 0 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0              | Write Outputs: h3, h4         | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 0 | Renamed: addi h5 \<- h3, 100 | Wait: h3                     | Read Inputs: h3               | Write Outputs: h5              | Retire                          |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 0 | Renamed: std h5, 0(h4)       | Wait: h5, h4                 | Wait: h5                      | Read Inputs: h5, h4            | Write Outputs:                  | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 0                     | Renamed: bdnz h6 \<- h2, .L2 | Read Inputs: h2               | Write Outputs: h6              | Wait: Retire                    | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 0                     | Renamed: ldu h7, 8(h4 -> h8)  | Wait: all execution pipes busy | Read Inputs: h4                 | Write Outputs: h7, h8           | Retire                          |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 0                     | Renamed: addi h9 \<- h7, 100  | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9               | Retire                         |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 0                     | Renamed: std h9, 0(h8)        | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8             | Write Outputs:                 | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 0                     | Renamed: bdnz h10 \<- h6, .L2 | Read Inputs: h6                | Write Outputs: h10              | Wait: Retire                    | Wait: Retire                    | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12         | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11                | Write Outputs: h13             | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13                       | Read Inputs: h13, h12          | Write Outputs:     | Retire                  |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: bdnz h14 \<- h10, .L2 | Read Inputs: h10                | Write Outputs: h14              | Wait: Retire                    | Wait: Retire                   | Wait: Retire       | Retire                  |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Wait: all execution pipes busy  | Wait: all execution pipes busy | Read Inputs: h12   | Write Outputs: h15, h16 | Retire                         |                                |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15                       | Wait: h15                      | Wait: h15          | Read Inputs: h15        | Write Outputs: h17             | Retire                         |                  |                         |                       |                |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16                  | Wait: h17, h16                 | Wait: h17, h16     | Wait: h17               | Read Inputs: h17, h16          | Write Outputs:                 | Retire           |                         |                       |                |        |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: bdnz h18 \<- h14, .L2  | Read Inputs: h14                | Write Outputs: h18              | Wait: Retire                   | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Retire           |                         |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16                      | Wait: h16          | Read Inputs: h16        | Write Outputs: h19, h20        | Wait: Retire                   | Retire           |                         |                       |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19                      | Wait: h19          | Wait: h19               | Read Inputs: h19               | Write Outputs: h21             | Retire           |                         |                       |                |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20                 | Wait: h21, h20     | Wait: h21, h20          | Wait: h21                      | Read Inputs: h21, h20          | Write Outputs:   | Retire                  |                       |                |        |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: bdnz h22 \<- h18, .L2  | Read Inputs: h18                | Write Outputs: h22             | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Retire                  |                       |                |        |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20                      | Wait: h20          | Wait: h20               | Wait: all execution pipes busy | Wait: all execution pipes busy | Read Inputs: h20 | Write Outputs: h23, h24 | Retire                |                |        |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: addi h25 \<- h23, 100  | Wait: h23                      | Wait: h23          | Wait: h23               | Wait: h23                      | Wait: h23                      | Wait: h23        | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: std h25, 0(h24)        | Wait: h25, h24                 | Wait: h25, h24     | Wait: h25, h24          | Wait: h25, h24                 | Wait: h25, h24                 | Wait: h25, h24   | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: bdnz h26 \<- h22, .L2  | Read Inputs: h22               | Write Outputs: h26 | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Wait: Retire            | Wait: Retire          | Wait: Retire   | Retire |     |     |     |     |     |
+
+# 8 decode stages, 8 wide
+
+| Cycle                      | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7        | 8        | 9        | 10                           | 11                             | 12                              | 13                              | 14                      | 15                      | 16                      | 17                      | 18                    | 19             | 20     | 21  | 22  | 23  | 24  |
+|----------------------------|-----|-------|----------|----------|----------|----------|----------|----------|----------|----------|------------------------------|--------------------------------|---------------------------------|---------------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|
+| 0x100: mtctr r4            |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1                | Write Outputs: h2               | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0                | Write Outputs: h3, h4           | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: addi h5 \<- h3, 100 | Wait: h3                       | Read Inputs: h3                 | Write Outputs: h5               | Retire                  |                         |                         |                         |                       |                |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: std h5, 0(h4)       | Wait: h5, h4                   | Wait: h5                        | Read Inputs: h5, h4             | Write Outputs:          | Retire                  |                         |                         |                       |                |        |     |     |     |     |
+| 0x110: bdnz .L2            |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: bdnz h6 \<- h2, .L2 | Wait: h2                       | Read Inputs: h2                 | Write Outputs: h6               | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: ldu h7, 8(h4 -> h8) | Wait: h4                       | Read Inputs: h4                 | Write Outputs: h7, h8           | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: addi h9 \<- h7, 100 | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9       | Retire                  |                         |                         |                       |                |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: std h9, 0(h8)       | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8     | Write Outputs:          | Retire                  |                         |                       |                |        |     |     |     |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: bdnz h10 \<- h6, .L2  | Wait: h6                        | Read Inputs: h6                 | Write Outputs: h10      | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12 | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11        | Write Outputs: h13      | Retire                  |                         |                       |                |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13               | Read Inputs: h13, h12   | Write Outputs:          | Retire                  |                       |                |        |     |     |     |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: bdnz h14 \<- h10, .L2 | Wait: h10                       | Wait: h10                       | Read Inputs: h10        | Write Outputs: h14      | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Read Inputs: h12        | Write Outputs: h15, h16 | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15               | Read Inputs: h15        | Write Outputs: h17      | Retire                  |                       |                |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16          | Wait: h17               | Read Inputs: h17, h16   | Write Outputs:          | Retire                |                |        |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: bdnz h18 \<- h14, .L2  | Wait: h14                       | Wait: h14               | Read Inputs: h14        | Write Outputs: h18      | Wait: Retire            | Retire                |                |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16               | Read Inputs: h16        | Write Outputs: h19, h20 | Wait: Retire            | Retire                |                |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19               | Wait: h19               | Read Inputs: h19        | Write Outputs: h21      | Retire                |                |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20          | Wait: h21, h20          | Wait: h21               | Read Inputs: h21, h20   | Write Outputs:        | Retire         |        |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: bdnz h22 \<- h18, .L2  | Wait: h18                       | Wait: h18               | Wait: h18               | Read Inputs: h18        | Write Outputs: h22      | Wait: Retire          | Retire         |        |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20               | Wait: h20               | Read Inputs: h20        | Write Outputs: h23, h24 | Wait: Retire          | Retire         |        |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: addi h25 \<- h23, 100  | Wait: h23               | Wait: h23               | Wait: h23               | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: std h25, 0(h24)        | Wait: h25, h24          | Wait: h25, h24          | Wait: h25, h24          | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: bdnz h26 \<- h22, .L2  | Wait: h22               | Wait: h22               | Wait: h22               | Read Inputs: h22        | Write Outputs: h26    | Wait: Retire   | Retire |     |     |     |     |
+
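The `Renamed:` column in the tables above follows a simple pattern: each destination register is given the next free `h` number, and each source is looked up in the current map before the destinations are updated. A minimal sketch of that pattern, assuming the initial mapping r3 -> h0 and r4 -> h1 implied by the first rows. The `rename` function and the tuple encoding of instructions are hypothetical illustrations, not taken from any Libre-SOC HDL:

```python
from itertools import count

def rename(program, initial_map):
    """Rename architectural registers to fresh 'h' registers.

    program: list of (mnemonic, dests, srcs) tuples (hypothetical encoding).
    initial_map: architectural register -> 'h' register at entry.
    Returns a list of (mnemonic, renamed_dests, renamed_srcs).
    """
    reg_map = dict(initial_map)
    # next free h number: one past the highest number already in use
    fresh = count(max(int(h[1:]) for h in reg_map.values()) + 1)
    out = []
    for mnemonic, dests, srcs in program:
        # sources are read through the map before destinations are updated
        renamed_srcs = [reg_map[s] for s in srcs]
        renamed_dests = []
        for d in dests:
            reg_map[d] = f"h{next(fresh)}"
            renamed_dests.append(reg_map[d])
        out.append((mnemonic, renamed_dests, renamed_srcs))
    return out

# loop body from the tables; ldu writes both r9 (loaded value)
# and r3 (updated address), bdnz reads and writes CTR
LOOP = [
    ("ldu",  ["r9", "r3"], ["r3"]),
    ("addi", ["r9"],       ["r9"]),
    ("std",  [],           ["r9", "r3"]),
    ("bdnz", ["ctr"],      ["ctr"]),
]
PROGRAM = [("mtctr", ["ctr"], ["r4"])] + LOOP * 2

renamed = rename(PROGRAM, {"r3": "h0", "r4": "h1"})
for mnemonic, dests, srcs in renamed:
    print(f"{mnemonic:6} writes {dests} reads {srcs}")
```

Because every write gets a fresh name, the only remaining dependencies between loop iterations are the true ones (the updated address and CTR), which is what the `Wait:` cells in the tables track.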
+# 1 decode stage, 8 wide
+
+| Cycle                      | 0   | 1     | 2        | 3                            | 4                              | 5                               | 6                               | 7                       | 8                       | 9                       | 10                      | 11                    | 12             | 13     | 14  | 15  | 16  | 17  | 18  | 19  | 20  | 21  | 22  | 23  | 24  |
+|----------------------------|-----|-------|----------|------------------------------|--------------------------------|---------------------------------|---------------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
+| 0x100: mtctr r4            |     | Fetch | Decode 1 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1                | Write Outputs: h2               | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0                | Write Outputs: h3, h4           | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Renamed: addi h5 \<- h3, 100 | Wait: h3                       | Read Inputs: h3                 | Write Outputs: h5               | Retire                  |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Renamed: std h5, 0(h4)       | Wait: h5, h4                   | Wait: h5                        | Read Inputs: h5, h4             | Write Outputs:          | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     | Fetch | Decode 1 | Renamed: bdnz h6 \<- h2, .L2 | Wait: h2                       | Read Inputs: h2                 | Write Outputs: h6               | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Renamed: ldu h7, 8(h4 -> h8) | Wait: h4                       | Read Inputs: h4                 | Write Outputs: h7, h8           | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Renamed: addi h9 \<- h7, 100 | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9       | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Renamed: std h9, 0(h8)       | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8     | Write Outputs:          | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1                     | Renamed: bdnz h10 \<- h6, .L2  | Wait: h6                        | Read Inputs: h6                 | Write Outputs: h10      | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       | Fetch    | Decode 1                     | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12 | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       | Fetch    | Decode 1                     | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11        | Write Outputs: h13      | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       | Fetch    | Decode 1                     | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13               | Read Inputs: h13, h12   | Write Outputs:          | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1                     | Renamed: bdnz h14 \<- h10, .L2 | Wait: h10                       | Wait: h10                       | Read Inputs: h10        | Write Outputs: h14      | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Read Inputs: h12        | Write Outputs: h15, h16 | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 1                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15               | Read Inputs: h15        | Write Outputs: h17      | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16          | Wait: h17               | Read Inputs: h17, h16   | Write Outputs:          | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 1                       | Renamed: bdnz h18 \<- h14, .L2  | Wait: h14                       | Wait: h14               | Read Inputs: h14        | Write Outputs: h18      | Wait: Retire            | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16               | Read Inputs: h16        | Write Outputs: h19, h20 | Wait: Retire            | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 1                       | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19               | Wait: h19               | Read Inputs: h19        | Write Outputs: h21      | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20          | Wait: h21, h20          | Wait: h21               | Read Inputs: h21, h20   | Write Outputs:        | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 1                       | Renamed: bdnz h22 \<- h18, .L2  | Wait: h18                       | Wait: h18               | Wait: h18               | Read Inputs: h18        | Write Outputs: h22      | Wait: Retire          | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x104: ldu r9, 8(r3)       |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20               | Wait: h20               | Read Inputs: h20        | Write Outputs: h23, h24 | Wait: Retire          | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x108: addi r9 \<- r9, 100 |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: addi h25 \<- h23, 100  | Wait: h23               | Wait: h23               | Wait: h23               | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
+| 0x10c: std r9, 0(r3)       |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: std h25, 0(h24)        | Wait: h25, h24          | Wait: h25, h24          | Wait: h25, h24          | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |     |     |     |     |     |     |     |
+| 0x110: bdnz .L2            |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: bdnz h26 \<- h22, .L2  | Wait: h22               | Wait: h22               | Wait: h22               | Read Inputs: h22        | Write Outputs: h26    | Wait: Retire   | Retire |     |     |     |     |     |     |     |     |     |     |     |
+
+# simple loop, 1 decode stage, 8 wide
+
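The schedule in the table below can be reproduced with a few timing rules. The following is a minimal sketch, not the author's tool. Its rules are assumptions inferred from the table itself: 8 instructions fetched per cycle starting at cycle 1, one decode stage followed by a rename cycle, a consumer reading its inputs in the cycle its producer writes, single-cycle execution, and in-order retire with no width limit. The names `Timing`, `schedule` and `countdown_loop` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    fetch: int
    rename: int
    read: int
    write: int
    retire: int


def schedule(producers, width=8, decode_stages=1, first_fetch=1):
    """Compute per-instruction stage cycles.

    producers[i] lists the indices of earlier instructions whose
    renamed results instruction i reads.
    """
    out = []
    for i, deps in enumerate(producers):
        fetch = first_fetch + i // width      # `width` instructions per fetch cycle
        rename = fetch + decode_stages + 1    # decode stage(s), then rename
        # assumed rule: inputs are read no earlier than the cycle after
        # rename, and no earlier than the cycle each producer writes
        read = max([rename + 1] + [out[p].read + 1 for p in deps])
        write = read + 1                      # assumed single-cycle execution
        retire = write + 1
        if out:
            # assumed in-order retire: never before the previous instruction
            retire = max(retire, out[-1].retire)
        out.append(Timing(fetch, rename, read, write, retire))
    return out


def countdown_loop(iterations):
    """Dependencies for the addi / cmpdi / bne loop body."""
    producers = []
    for k in range(iterations):
        addi = 3 * k
        producers.append([addi - 3] if k else [])  # addi reads the previous addi
        producers.append([addi])                   # cmpdi reads this addi
        producers.append([addi + 1])               # bne reads this cmpdi
    return producers


timings = schedule(countdown_loop(16))
for k in (0, 1, 15):
    print(k, timings[3 * k].read)
```

Under these assumptions the `addi` of iteration k reads its inputs at cycle 4 + k, so the loop-carried dependency through r3 limits throughput to one iteration per cycle regardless of the 8-wide fetch.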
+| Cycle                     | 0   | 1     | 2        | 3                           | 4                            | 5                             | 6                             | 7                             | 8                             | 9                          | 10                           | 11                          | 12                          | 13                          | 14                 | 15                 | 16                 | 17                 | 18                 | 19                 | 20                 | 21                | 22                | 23                | 24                | 25                | 26             | 27     | 28  | 29  | 30  | 31  | 32  | 33  | 34  |
+|---------------------------|-----|-------|----------|-----------------------------|------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|----------------------------|------------------------------|-----------------------------|-----------------------------|-----------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|-------------------|-------------------|-------------------|-------------------|-------------------|----------------|--------|-----|-----|-----|-----|-----|-----|-----|
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h1 \<- h0, -1 | Read Inputs: h0              | Write Outputs: h1             | Retire                        |                               |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h2 \<- h1, 0 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h2             | Retire                        |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     | Fetch | Decode 0 | Renamed: bne h2, .L2        | Wait: h2                     | Wait: h2                      | Read Inputs: h2               | Write Outputs:                | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h3 \<- h1, -1 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h3             | Wait: Retire                  | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h4 \<- h3, 0 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h4             | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     | Fetch | Decode 0 | Renamed: bne h4, .L2        | Wait: h4                     | Wait: h4                      | Wait: h4                      | Read Inputs: h4               | Write Outputs:                | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h5 \<- h3, -1 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h5             | Wait: Retire                  | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h6 \<- h5, 0 | Wait: h5                     | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h6             | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h6, .L2         | Wait: h6                      | Wait: h6                      | Wait: h6                      | Read Inputs: h6               | Write Outputs:             | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h7 \<- h5, -1  | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h7             | Wait: Retire               | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0                    | Renamed: cmpdi h8 \<- h7, 0  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h8          | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h8, .L2         | Wait: h8                      | Wait: h8                      | Wait: h8                      | Wait: h8                      | Read Inputs: h8            | Write Outputs:               | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h9 \<- h7, -1  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h9          | Wait: Retire                 | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0                    | Renamed: cmpdi h10 \<- h9, 0 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h10           | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h10, .L2        | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                  | Read Inputs: h10             | Write Outputs:              | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h11 \<- h9, -1 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h11           | Wait: Retire                | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h12 \<- h11, 0 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h12          | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h12, .L2         | Wait: h12                     | Wait: h12                     | Wait: h12                     | Wait: h12                  | Wait: h12                    | Read Inputs: h12            | Write Outputs:              | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch                       | Decode 0                     | Renamed: addi h13 \<- h11, -1 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h13          | Wait: Retire                | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h14 \<- h13, 0 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h14          | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h14, .L2         | Wait: h14                     | Wait: h14                     | Wait: h14                     | Wait: h14                  | Wait: h14                    | Wait: h14                   | Read Inputs: h14            | Write Outputs:              | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch                       | Decode 0                     | Renamed: addi h15 \<- h13, -1 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h15          | Wait: Retire                | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h16 \<- h15, 0 | Wait: h15                     | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h16          | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h16, .L2         | Wait: h16                     | Wait: h16                     | Wait: h16                     | Wait: h16                  | Wait: h16                    | Wait: h16                   | Wait: h16                   | Read Inputs: h16            | Write Outputs:     | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h17 \<- h15, -1 | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h17          | Wait: Retire       | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h18 \<- h17, 0 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h18 | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: bne h18, .L2         | Wait: h18                     | Wait: h18                     | Wait: h18                  | Wait: h18                    | Wait: h18                   | Wait: h18                   | Wait: h18                   | Read Inputs: h18   | Write Outputs:     | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h19 \<- h17, -1 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h19 | Wait: Retire       | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h20 \<- h19, 0 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h20 | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: bne h20, .L2         | Wait: h20                     | Wait: h20                     | Wait: h20                  | Wait: h20                    | Wait: h20                   | Wait: h20                   | Wait: h20                   | Wait: h20          | Read Inputs: h20   | Write Outputs:     | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h21 \<- h19, -1 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h21 | Wait: Retire       | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h22 \<- h21, 0 | Wait: h21                     | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h22 | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h22, .L2         | Wait: h22                     | Wait: h22                  | Wait: h22                    | Wait: h22                   | Wait: h22                   | Wait: h22                   | Wait: h22          | Wait: h22          | Read Inputs: h22   | Write Outputs:     | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h23 \<- h21, -1 | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h23 | Wait: Retire       | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: cmpdi h24 \<- h23, 0 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h24 | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h24, .L2         | Wait: h24                     | Wait: h24                  | Wait: h24                    | Wait: h24                   | Wait: h24                   | Wait: h24                   | Wait: h24          | Wait: h24          | Wait: h24          | Read Inputs: h24   | Write Outputs:     | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h25 \<- h23, -1 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h25 | Wait: Retire       | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: cmpdi h26 \<- h25, 0 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h26 | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h26, .L2         | Wait: h26                     | Wait: h26                  | Wait: h26                    | Wait: h26                   | Wait: h26                   | Wait: h26                   | Wait: h26          | Wait: h26          | Wait: h26          | Wait: h26          | Read Inputs: h26   | Write Outputs:     | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h27 \<- h25, -1 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h27 | Wait: Retire       | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h28 \<- h27, 0 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h28 | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h28, .L2         | Wait: h28                  | Wait: h28                    | Wait: h28                   | Wait: h28                   | Wait: h28                   | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Read Inputs: h28   | Write Outputs:     | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: addi h29 \<- h27, -1 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h29 | Wait: Retire       | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h30 \<- h29, 0 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h30 | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h30, .L2         | Wait: h30                  | Wait: h30                    | Wait: h30                   | Wait: h30                   | Wait: h30                   | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Read Inputs: h30   | Write Outputs:    | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: addi h31 \<- h29, -1 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h31 | Wait: Retire      | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h0 \<- h31, 0  | Wait: h31                  | Wait: h31                    | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h0 | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h0, .L2          | Wait: h0                   | Wait: h0                     | Wait: h0                    | Wait: h0                    | Wait: h0                    | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Read Inputs: h0   | Write Outputs:    | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: addi h2 \<- h31, -1 | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h2 | Wait: Retire      | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: cmpdi h1 \<- h2, 0  | Wait: h2                    | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h1 | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: bne h1, .L2         | Wait: h1                    | Wait: h1                    | Wait: h1                    | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1          | Read Inputs: h1   | Write Outputs:    | Retire            |                   |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: addi h4 \<- h2, -1 | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h4 | Wait: Retire      | Retire            |                   |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: cmpdi h3 \<- h4, 0 | Wait: h4                    | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h3 | Retire            |                   |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: bne h3, .L2        | Wait: h3                    | Wait: h3                    | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3          | Wait: h3          | Read Inputs: h3   | Write Outputs:    | Retire            |                |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: addi h6 \<- h4, -1 | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h6 | Wait: Retire      | Retire            |                |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: cmpdi h5 \<- h6, 0 | Wait: h6                    | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h5 | Retire            |                |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: bne h5, .L2        | Wait: h5                    | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5          | Wait: h5          | Wait: h5          | Read Inputs: h5   | Write Outputs:    | Retire         |        |     |     |     |     |     |     |     |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: addi h8 \<- h6, -1 | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h8 | Wait: Retire      | Retire         |        |     |     |     |     |     |     |     |
+| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: cmpdi h7 \<- h8, 0 | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8          | Wait: h8          | Wait: h8          | Read Inputs: h8   | Write Outputs: h7 | Retire         |        |     |     |     |     |     |     |     |
+| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: bne h7, .L2        | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7          | Wait: h7          | Wait: h7          | Wait: h7          | Read Inputs: h7   | Write Outputs: | Retire |     |     |     |     |     |     |     |
+
+# simple loop, 8 decode stages, 8 wide
+
+| Cycle                     | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7        | 8        | 9        | 10                          | 11                           | 12                            | 13                            | 14                            | 15                            | 16                         | 17                           | 18                          | 19                          | 20                          | 21                 | 22                 | 23                 | 24                 | 25                 | 26                 | 27                 | 28                | 29                | 30                | 31                | 32                | 33             | 34     |
+|---------------------------|-----|-------|----------|----------|----------|----------|----------|----------|----------|----------|-----------------------------|------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|----------------------------|------------------------------|-----------------------------|-----------------------------|-----------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|-------------------|-------------------|-------------------|-------------------|-------------------|----------------|--------|
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h1 \<- h0, -1 | Read Inputs: h0              | Write Outputs: h1             | Retire                        |                               |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h2 \<- h1, 0 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h2             | Retire                        |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: bne h2, .L2        | Wait: h2                     | Wait: h2                      | Read Inputs: h2               | Write Outputs:                | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h3 \<- h1, -1 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h3             | Wait: Retire                  | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h4 \<- h3, 0 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h4             | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: bne h4, .L2        | Wait: h4                     | Wait: h4                      | Wait: h4                      | Read Inputs: h4               | Write Outputs:                | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h5 \<- h3, -1 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h5             | Wait: Retire                  | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h6 \<- h5, 0 | Wait: h5                     | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h6             | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h6, .L2         | Wait: h6                      | Wait: h6                      | Wait: h6                      | Read Inputs: h6               | Write Outputs:             | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h7 \<- h5, -1  | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h7             | Wait: Retire               | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: cmpdi h8 \<- h7, 0  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h8          | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h8, .L2         | Wait: h8                      | Wait: h8                      | Wait: h8                      | Wait: h8                      | Read Inputs: h8            | Write Outputs:               | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h9 \<- h7, -1  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h9          | Wait: Retire                 | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: cmpdi h10 \<- h9, 0 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h10           | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h10, .L2        | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                  | Read Inputs: h10             | Write Outputs:              | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h11 \<- h9, -1 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h11           | Wait: Retire                | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h12 \<- h11, 0 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h12          | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h12, .L2         | Wait: h12                     | Wait: h12                     | Wait: h12                     | Wait: h12                  | Wait: h12                    | Read Inputs: h12            | Write Outputs:              | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: addi h13 \<- h11, -1 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h13          | Wait: Retire                | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h14 \<- h13, 0 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h14          | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h14, .L2         | Wait: h14                     | Wait: h14                     | Wait: h14                     | Wait: h14                  | Wait: h14                    | Wait: h14                   | Read Inputs: h14            | Write Outputs:              | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: addi h15 \<- h13, -1 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h15          | Wait: Retire                | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h16 \<- h15, 0 | Wait: h15                     | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h16          | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h16, .L2         | Wait: h16                     | Wait: h16                     | Wait: h16                     | Wait: h16                  | Wait: h16                    | Wait: h16                   | Wait: h16                   | Read Inputs: h16            | Write Outputs:     | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h17 \<- h15, -1 | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h17          | Wait: Retire       | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h18 \<- h17, 0 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h18 | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: bne h18, .L2         | Wait: h18                     | Wait: h18                     | Wait: h18                  | Wait: h18                    | Wait: h18                   | Wait: h18                   | Wait: h18                   | Read Inputs: h18   | Write Outputs:     | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h19 \<- h17, -1 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h19 | Wait: Retire       | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h20 \<- h19, 0 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h20 | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: bne h20, .L2         | Wait: h20                     | Wait: h20                     | Wait: h20                  | Wait: h20                    | Wait: h20                   | Wait: h20                   | Wait: h20                   | Wait: h20          | Read Inputs: h20   | Write Outputs:     | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h21 \<- h19, -1 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h21 | Wait: Retire       | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h22 \<- h21, 0 | Wait: h21                     | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h22 | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h22, .L2         | Wait: h22                     | Wait: h22                  | Wait: h22                    | Wait: h22                   | Wait: h22                   | Wait: h22                   | Wait: h22          | Wait: h22          | Read Inputs: h22   | Write Outputs:     | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h23 \<- h21, -1 | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h23 | Wait: Retire       | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: cmpdi h24 \<- h23, 0 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h24 | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h24, .L2         | Wait: h24                     | Wait: h24                  | Wait: h24                    | Wait: h24                   | Wait: h24                   | Wait: h24                   | Wait: h24          | Wait: h24          | Wait: h24          | Read Inputs: h24   | Write Outputs:     | Retire             |                    |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h25 \<- h23, -1 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h25 | Wait: Retire       | Retire             |                    |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: cmpdi h26 \<- h25, 0 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h26 | Retire             |                    |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h26, .L2         | Wait: h26                     | Wait: h26                  | Wait: h26                    | Wait: h26                   | Wait: h26                   | Wait: h26                   | Wait: h26          | Wait: h26          | Wait: h26          | Wait: h26          | Read Inputs: h26   | Write Outputs:     | Retire             |                   |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h27 \<- h25, -1 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h27 | Wait: Retire       | Retire             |                   |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h28 \<- h27, 0 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h28 | Retire             |                   |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h28, .L2         | Wait: h28                  | Wait: h28                    | Wait: h28                   | Wait: h28                   | Wait: h28                   | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Read Inputs: h28   | Write Outputs:     | Retire            |                   |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: addi h29 \<- h27, -1 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h29 | Wait: Retire       | Retire            |                   |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h30 \<- h29, 0 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h30 | Retire            |                   |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h30, .L2         | Wait: h30                  | Wait: h30                    | Wait: h30                   | Wait: h30                   | Wait: h30                   | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Read Inputs: h30   | Write Outputs:    | Retire            |                   |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: addi h31 \<- h29, -1 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h31 | Wait: Retire      | Retire            |                   |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h0 \<- h31, 0  | Wait: h31                  | Wait: h31                    | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h0 | Retire            |                   |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h0, .L2          | Wait: h0                   | Wait: h0                     | Wait: h0                    | Wait: h0                    | Wait: h0                    | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Read Inputs: h0   | Write Outputs:    | Retire            |                   |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: addi h2 \<- h31, -1 | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h2 | Wait: Retire      | Retire            |                   |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: cmpdi h1 \<- h2, 0  | Wait: h2                    | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h1 | Retire            |                   |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: bne h1, .L2         | Wait: h1                    | Wait: h1                    | Wait: h1                    | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1          | Read Inputs: h1   | Write Outputs:    | Retire            |                   |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: addi h4 \<- h2, -1 | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h4 | Wait: Retire      | Retire            |                   |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: cmpdi h3 \<- h4, 0 | Wait: h4                    | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h3 | Retire            |                   |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: bne h3, .L2        | Wait: h3                    | Wait: h3                    | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3          | Wait: h3          | Read Inputs: h3   | Write Outputs:    | Retire            |                |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: addi h6 \<- h4, -1 | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h6 | Wait: Retire      | Retire            |                |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: cmpdi h5 \<- h6, 0 | Wait: h6                    | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h5 | Retire            |                |        |
+| 0x108: bne .L2            |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: bne h5, .L2        | Wait: h5                    | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5          | Wait: h5          | Wait: h5          | Read Inputs: h5   | Write Outputs:    | Retire         |        |
+| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: addi h8 \<- h6, -1 | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h8 | Wait: Retire      | Retire         |        |
+| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: cmpdi h7 \<- h8, 0 | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8          | Wait: h8          | Wait: h8          | Read Inputs: h8   | Write Outputs: h7 | Retire         |        |
+| 0x108: bne .L2            |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: bne h7, .L2        | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7          | Wait: h7          | Wait: h7          | Wait: h7          | Read Inputs: h7   | Write Outputs: | Retire |
diff --git a/openpower/openpower/sv/effect-of-more-decode-stages-on-reg-renaming.mdwn b/openpower/openpower/sv/effect-of-more-decode-stages-on-reg-renaming.mdwn
deleted file mode 100644 (file)
index 740e181..0000000
+++ /dev/null
@@ -1,255 +0,0 @@
-[[!toc]]
-
-# effect of more decode stages on reg renaming
-
-there's basically no effect except that execution starts a few cycles later. no additional execution resources are needed: notice that the exact same number of renamed hardware registers is used.
-
-# 5 decode stages, 4 wide
-
-| Cycle                      | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7                            | 8                            | 9                             | 10                             | 11                              | 12                              | 13                              | 14                             | 15                 | 16                      | 17                             | 18                             | 19               | 20                      | 21                    | 22             | 23     | 24  |
-|----------------------------|-----|-------|----------|----------|----------|----------|----------|------------------------------|------------------------------|-------------------------------|--------------------------------|---------------------------------|---------------------------------|---------------------------------|--------------------------------|--------------------|-------------------------|--------------------------------|--------------------------------|------------------|-------------------------|-----------------------|----------------|--------|-----|
-| 0x100: mtctr r4            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1              | Write Outputs: h2             | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0              | Write Outputs: h3, h4         | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: addi h5 \<- h3, 100 | Wait: h3                     | Read Inputs: h3               | Write Outputs: h5              | Retire                          |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Renamed: std h5, 0(h4)       | Wait: h5, h4                 | Wait: h5                      | Read Inputs: h5, h4            | Write Outputs:                  | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                     | Renamed: bdnz h6 \<- h2, .L2 | Read Inputs: h2               | Write Outputs: h6              | Wait: Retire                    | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: ldu h7, 8(h4 -> h8)  | Wait: all execution pipes busy | Read Inputs: h4                 | Write Outputs: h7, h8           | Retire                          |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: addi h9 \<- h7, 100  | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9               | Retire                         |                    |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: std h9, 0(h8)        | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8             | Write Outputs:                 | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                     | Decode 4                     | Renamed: bdnz h10 \<- h6, .L2 | Read Inputs: h6                | Write Outputs: h10              | Wait: Retire                    | Wait: Retire                    | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12         | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11                | Write Outputs: h13             | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13                       | Read Inputs: h13, h12          | Write Outputs:     | Retire                  |                                |                                |                  |                         |                       |                |        |     |
-| 0x110: bdnz .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                     | Decode 3                     | Decode 4                      | Renamed: bdnz h14 \<- h10, .L2 | Read Inputs: h10                | Write Outputs: h14              | Wait: Retire                    | Wait: Retire                   | Wait: Retire       | Retire                  |                                |                                |                  |                         |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Wait: all execution pipes busy  | Wait: all execution pipes busy | Read Inputs: h12   | Write Outputs: h15, h16 | Retire                         |                                |                  |                         |                       |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15                       | Wait: h15                      | Wait: h15          | Read Inputs: h15        | Write Outputs: h17             | Retire                         |                  |                         |                       |                |        |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16                  | Wait: h17, h16                 | Wait: h17, h16     | Wait: h17               | Read Inputs: h17, h16          | Write Outputs:                 | Retire           |                         |                       |                |        |     |
-| 0x110: bdnz .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1                     | Decode 2                     | Decode 3                      | Decode 4                       | Renamed: bdnz h18 \<- h14, .L2  | Read Inputs: h14                | Write Outputs: h18              | Wait: Retire                   | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Retire           |                         |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16                      | Wait: h16          | Read Inputs: h16        | Write Outputs: h19, h20        | Wait: Retire                   | Retire           |                         |                       |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19                      | Wait: h19          | Wait: h19               | Read Inputs: h19               | Write Outputs: h21             | Retire           |                         |                       |                |        |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20                 | Wait: h21, h20     | Wait: h21, h20          | Wait: h21                      | Read Inputs: h21, h20          | Write Outputs:   | Retire                  |                       |                |        |     |
-| 0x110: bdnz .L2            |     |       |          |          |          |          | Fetch    | Decode 0                     | Decode 1                     | Decode 2                      | Decode 3                       | Decode 4                        | Renamed: bdnz h22 \<- h18, .L2  | Read Inputs: h18                | Write Outputs: h22             | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Retire                  |                       |                |        |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20                      | Wait: h20          | Wait: h20               | Wait: all execution pipes busy | Wait: all execution pipes busy | Read Inputs: h20 | Write Outputs: h23, h24 | Retire                |                |        |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: addi h25 \<- h23, 100  | Wait: h23                      | Wait: h23          | Wait: h23               | Wait: h23                      | Wait: h23                      | Wait: h23        | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: std h25, 0(h24)        | Wait: h25, h24                 | Wait: h25, h24     | Wait: h25, h24          | Wait: h25, h24                 | Wait: h25, h24                 | Wait: h25, h24   | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |
-| 0x110: bdnz .L2            |     |       |          |          |          |          |          | Fetch                        | Decode 0                     | Decode 1                      | Decode 2                       | Decode 3                        | Decode 4                        | Renamed: bdnz h26 \<- h22, .L2  | Read Inputs: h22               | Write Outputs: h26 | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Wait: Retire            | Wait: Retire          | Wait: Retire   | Retire |     |
-
-# 1 decode stage, 4 wide
-
-| Cycle                      | 0   | 1     | 2        | 3                            | 4                            | 5                             | 6                              | 7                               | 8                               | 9                               | 10                             | 11                 | 12                      | 13                             | 14                             | 15               | 16                      | 17                    | 18             | 19     | 20  | 21  | 22  | 23  | 24  |
-|----------------------------|-----|-------|----------|------------------------------|------------------------------|-------------------------------|--------------------------------|---------------------------------|---------------------------------|---------------------------------|--------------------------------|--------------------|-------------------------|--------------------------------|--------------------------------|------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|-----|
-| 0x100: mtctr r4            |     | Fetch | Decode 0 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1              | Write Outputs: h2             | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 0 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0              | Write Outputs: h3, h4         | Retire                         |                                 |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 0 | Renamed: addi h5 \<- h3, 100 | Wait: h3                     | Read Inputs: h3               | Write Outputs: h5              | Retire                          |                                 |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 0 | Renamed: std h5, 0(h4)       | Wait: h5, h4                 | Wait: h5                      | Read Inputs: h5, h4            | Write Outputs:                  | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 0                     | Renamed: bdnz h6 \<- h2, .L2 | Read Inputs: h2               | Write Outputs: h6              | Wait: Retire                    | Retire                          |                                 |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 0                     | Renamed: ldu h7, 8(h4 -> h8)  | Wait: all execution pipes busy | Read Inputs: h4                 | Write Outputs: h7, h8           | Retire                          |                                |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 0                     | Renamed: addi h9 \<- h7, 100  | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9               | Retire                         |                    |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 0                     | Renamed: std h9, 0(h8)        | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8             | Write Outputs:                 | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 0                     | Renamed: bdnz h10 \<- h6, .L2 | Read Inputs: h6                | Write Outputs: h10              | Wait: Retire                    | Wait: Retire                    | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12         | Wait: Retire                   | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11                | Write Outputs: h13             | Retire             |                         |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13                       | Read Inputs: h13, h12          | Write Outputs:     | Retire                  |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |                              | Fetch                        | Decode 0                      | Renamed: bdnz h14 \<- h10, .L2 | Read Inputs: h10                | Write Outputs: h14              | Wait: Retire                    | Wait: Retire                   | Wait: Retire       | Retire                  |                                |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Wait: all execution pipes busy  | Wait: all execution pipes busy | Read Inputs: h12   | Write Outputs: h15, h16 | Retire                         |                                |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15                       | Wait: h15                      | Wait: h15          | Read Inputs: h15        | Write Outputs: h17             | Retire                         |                  |                         |                       |                |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16                  | Wait: h17, h16                 | Wait: h17, h16     | Wait: h17               | Read Inputs: h17, h16          | Write Outputs:                 | Retire           |                         |                       |                |        |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |                              |                              | Fetch                         | Decode 0                       | Renamed: bdnz h18 \<- h14, .L2  | Read Inputs: h14                | Write Outputs: h18              | Wait: Retire                   | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Retire           |                         |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16                      | Wait: h16          | Read Inputs: h16        | Write Outputs: h19, h20        | Wait: Retire                   | Retire           |                         |                       |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19                      | Wait: h19          | Wait: h19               | Read Inputs: h19               | Write Outputs: h21             | Retire           |                         |                       |                |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20                 | Wait: h21, h20     | Wait: h21, h20          | Wait: h21                      | Read Inputs: h21, h20          | Write Outputs:   | Retire                  |                       |                |        |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |                              |                              |                               | Fetch                          | Decode 0                        | Renamed: bdnz h22 \<- h18, .L2  | Read Inputs: h18                | Write Outputs: h22             | Wait: Retire       | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Retire                  |                       |                |        |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20                      | Wait: h20          | Wait: h20               | Wait: all execution pipes busy | Wait: all execution pipes busy | Read Inputs: h20 | Write Outputs: h23, h24 | Retire                |                |        |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: addi h25 \<- h23, 100  | Wait: h23                      | Wait: h23          | Wait: h23               | Wait: h23                      | Wait: h23                      | Wait: h23        | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: std h25, 0(h24)        | Wait: h25, h24                 | Wait: h25, h24     | Wait: h25, h24          | Wait: h25, h24                 | Wait: h25, h24                 | Wait: h25, h24   | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |                              |                              |                               |                                | Fetch                           | Decode 0                        | Renamed: bdnz h26 \<- h22, .L2  | Read Inputs: h22               | Write Outputs: h26 | Wait: Retire            | Wait: Retire                   | Wait: Retire                   | Wait: Retire     | Wait: Retire            | Wait: Retire          | Wait: Retire   | Retire |     |     |     |     |     |
-
-# 8 decode stages, 8 wide
-
-| Cycle                      | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7        | 8        | 9        | 10                           | 11                             | 12                              | 13                              | 14                      | 15                      | 16                      | 17                      | 18                    | 19             | 20     | 21  | 22  | 23  | 24  |
-|----------------------------|-----|-------|----------|----------|----------|----------|----------|----------|----------|----------|------------------------------|--------------------------------|---------------------------------|---------------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|
-| 0x100: mtctr r4            |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1                | Write Outputs: h2               | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0                | Write Outputs: h3, h4           | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: addi h5 \<- h3, 100 | Wait: h3                       | Read Inputs: h3                 | Write Outputs: h5               | Retire                  |                         |                         |                         |                       |                |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: std h5, 0(h4)       | Wait: h5, h4                   | Wait: h5                        | Read Inputs: h5, h4             | Write Outputs:          | Retire                  |                         |                         |                       |                |        |     |     |     |     |
-| 0x110: bdnz .L2            |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: bdnz h6 \<- h2, .L2 | Wait: h2                       | Read Inputs: h2                 | Write Outputs: h6               | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: ldu h7, 8(h4 -> h8) | Wait: h4                       | Read Inputs: h4                 | Write Outputs: h7, h8           | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: addi h9 \<- h7, 100 | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9       | Retire                  |                         |                         |                       |                |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8 | Renamed: std h9, 0(h8)       | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8     | Write Outputs:          | Retire                  |                         |                       |                |        |     |     |     |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: bdnz h10 \<- h6, .L2  | Wait: h6                        | Read Inputs: h6                 | Write Outputs: h10      | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12 | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11        | Write Outputs: h13      | Retire                  |                         |                       |                |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13               | Read Inputs: h13, h12   | Write Outputs:          | Retire                  |                       |                |        |     |     |     |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Decode 8                     | Renamed: bdnz h14 \<- h10, .L2 | Wait: h10                       | Wait: h10                       | Read Inputs: h10        | Write Outputs: h14      | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Read Inputs: h12        | Write Outputs: h15, h16 | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15               | Read Inputs: h15        | Write Outputs: h17      | Retire                  |                       |                |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16          | Wait: h17               | Read Inputs: h17, h16   | Write Outputs:          | Retire                |                |        |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: bdnz h18 \<- h14, .L2  | Wait: h14                       | Wait: h14               | Read Inputs: h14        | Write Outputs: h18      | Wait: Retire            | Retire                |                |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16               | Read Inputs: h16        | Write Outputs: h19, h20 | Wait: Retire            | Retire                |                |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19               | Wait: h19               | Read Inputs: h19        | Write Outputs: h21      | Retire                |                |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20          | Wait: h21, h20          | Wait: h21               | Read Inputs: h21, h20   | Write Outputs:        | Retire         |        |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                     | Decode 8                       | Renamed: bdnz h22 \<- h18, .L2  | Wait: h18                       | Wait: h18               | Wait: h18               | Read Inputs: h18        | Write Outputs: h22      | Wait: Retire          | Retire         |        |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20               | Wait: h20               | Read Inputs: h20        | Write Outputs: h23, h24 | Wait: Retire          | Retire         |        |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: addi h25 \<- h23, 100  | Wait: h23               | Wait: h23               | Wait: h23               | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: std h25, 0(h24)        | Wait: h25, h24          | Wait: h25, h24          | Wait: h25, h24          | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |          | Fetch    | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                     | Decode 7                       | Decode 8                        | Renamed: bdnz h26 \<- h22, .L2  | Wait: h22               | Wait: h22               | Wait: h22               | Read Inputs: h22        | Write Outputs: h26    | Wait: Retire   | Retire |     |     |     |     |
-
-# 1 decode stage, 8 wide
-
-| Cycle                      | 0   | 1     | 2        | 3                            | 4                              | 5                               | 6                               | 7                       | 8                       | 9                       | 10                      | 11                    | 12             | 13     | 14  | 15  | 16  | 17  | 18  | 19  | 20  | 21  | 22  | 23  | 24  |
-|----------------------------|-----|-------|----------|------------------------------|--------------------------------|---------------------------------|---------------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-----------------------|----------------|--------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
-| 0x100: mtctr r4            |     | Fetch | Decode 1 | Renamed: mtctr h2 \<- h1     | Read Inputs: h1                | Write Outputs: h2               | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Renamed: ldu h3, 8(h0 -> h4) | Read Inputs: h0                | Write Outputs: h3, h4           | Retire                          |                         |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Renamed: addi h5 \<- h3, 100 | Wait: h3                       | Read Inputs: h3                 | Write Outputs: h5               | Retire                  |                         |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Renamed: std h5, 0(h4)       | Wait: h5, h4                   | Wait: h5                        | Read Inputs: h5, h4             | Write Outputs:          | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     | Fetch | Decode 1 | Renamed: bdnz h6 \<- h2, .L2 | Wait: h2                       | Read Inputs: h2                 | Write Outputs: h6               | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     | Fetch | Decode 1 | Renamed: ldu h7, 8(h4 -> h8) | Wait: h4                       | Read Inputs: h4                 | Write Outputs: h7, h8           | Wait: Retire            | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     | Fetch | Decode 1 | Renamed: addi h9 \<- h7, 100 | Wait: h7                       | Wait: h7                        | Read Inputs: h7                 | Write Outputs: h9       | Retire                  |                         |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     | Fetch | Decode 1 | Renamed: std h9, 0(h8)       | Wait: h9, h8                   | Wait: h9, h8                    | Wait: h9                        | Read Inputs: h9, h8     | Write Outputs:          | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1                     | Renamed: bdnz h10 \<- h6, .L2  | Wait: h6                        | Read Inputs: h6                 | Write Outputs: h10      | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       | Fetch    | Decode 1                     | Renamed: ldu h11, 8(h8 -> h12) | Wait: h8                        | Read Inputs: h8                 | Write Outputs: h11, h12 | Wait: Retire            | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       | Fetch    | Decode 1                     | Renamed: addi h13 \<- h11, 100 | Wait: h11                       | Wait: h11                       | Read Inputs: h11        | Write Outputs: h13      | Retire                  |                         |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       | Fetch    | Decode 1                     | Renamed: std h13, 0(h12)       | Wait: h13, h12                  | Wait: h13, h12                  | Wait: h13               | Read Inputs: h13, h12   | Write Outputs:          | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       | Fetch    | Decode 1                     | Renamed: bdnz h14 \<- h10, .L2 | Wait: h10                       | Wait: h10                       | Read Inputs: h10        | Write Outputs: h14      | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: ldu h15, 8(h12 -> h16) | Wait: h12                       | Read Inputs: h12        | Write Outputs: h15, h16 | Wait: Retire            | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 1                       | Renamed: addi h17 \<- h15, 100  | Wait: h15                       | Wait: h15               | Read Inputs: h15        | Write Outputs: h17      | Retire                  |                       |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: std h17, 0(h16)        | Wait: h17, h16                  | Wait: h17, h16          | Wait: h17               | Read Inputs: h17, h16   | Write Outputs:          | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 1                       | Renamed: bdnz h18 \<- h14, .L2  | Wait: h14                       | Wait: h14               | Read Inputs: h14        | Write Outputs: h18      | Wait: Retire            | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: ldu h19, 8(h16 -> h20) | Wait: h16                       | Wait: h16               | Read Inputs: h16        | Write Outputs: h19, h20 | Wait: Retire            | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          | Fetch                        | Decode 1                       | Renamed: addi h21 \<- h19, 100  | Wait: h19                       | Wait: h19               | Wait: h19               | Read Inputs: h19        | Write Outputs: h21      | Retire                |                |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          | Fetch                        | Decode 1                       | Renamed: std h21, 0(h20)        | Wait: h21, h20                  | Wait: h21, h20          | Wait: h21, h20          | Wait: h21               | Read Inputs: h21, h20   | Write Outputs:        | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          | Fetch                        | Decode 1                       | Renamed: bdnz h22 \<- h18, .L2  | Wait: h18                       | Wait: h18               | Wait: h18               | Read Inputs: h18        | Write Outputs: h22      | Wait: Retire          | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x104: ldu r9, 8(r3)       |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: ldu h23, 8(h20 -> h24) | Wait: h20               | Wait: h20               | Read Inputs: h20        | Write Outputs: h23, h24 | Wait: Retire          | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x108: addi r9 \<- r9, 100 |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: addi h25 \<- h23, 100  | Wait: h23               | Wait: h23               | Wait: h23               | Read Inputs: h23        | Write Outputs: h25    | Retire         |        |     |     |     |     |     |     |     |     |     |     |     |
-| 0x10c: std r9, 0(r3)       |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: std h25, 0(h24)        | Wait: h25, h24          | Wait: h25, h24          | Wait: h25, h24          | Wait: h25               | Read Inputs: h25, h24 | Write Outputs: | Retire |     |     |     |     |     |     |     |     |     |     |     |
-| 0x110: bdnz .L2            |     |       |          |                              | Fetch                          | Decode 1                        | Renamed: bdnz h26 \<- h22, .L2  | Wait: h22               | Wait: h22               | Wait: h22               | Read Inputs: h22        | Write Outputs: h26    | Wait: Retire   | Retire |     |     |     |     |     |     |     |     |     |     |     |
-
-# simple loop, 1 decode stage, 8 wide
-
-| Cycle                     | 0   | 1     | 2        | 3                           | 4                            | 5                             | 6                             | 7                             | 8                             | 9                          | 10                           | 11                          | 12                          | 13                          | 14                 | 15                 | 16                 | 17                 | 18                 | 19                 | 20                 | 21                | 22                | 23                | 24                | 25                | 26             | 27     | 28  | 29  | 30  | 31  | 32  | 33  | 34  |
-|---------------------------|-----|-------|----------|-----------------------------|------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|----------------------------|------------------------------|-----------------------------|-----------------------------|-----------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|-------------------|-------------------|-------------------|-------------------|-------------------|----------------|--------|-----|-----|-----|-----|-----|-----|-----|
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h1 \<- h0, -1 | Read Inputs: h0              | Write Outputs: h1             | Retire                        |                               |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h2 \<- h1, 0 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h2             | Retire                        |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     | Fetch | Decode 0 | Renamed: bne h2, .L2        | Wait: h2                     | Wait: h2                      | Read Inputs: h2               | Write Outputs:                | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h3 \<- h1, -1 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h3             | Wait: Retire                  | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h4 \<- h3, 0 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h4             | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     | Fetch | Decode 0 | Renamed: bne h4, .L2        | Wait: h4                     | Wait: h4                      | Wait: h4                      | Read Inputs: h4               | Write Outputs:                | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Renamed: addi h5 \<- h3, -1 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h5             | Wait: Retire                  | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Renamed: cmpdi h6 \<- h5, 0 | Wait: h5                     | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h6             | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h6, .L2         | Wait: h6                      | Wait: h6                      | Wait: h6                      | Read Inputs: h6               | Write Outputs:             | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h7 \<- h5, -1  | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h7             | Wait: Retire               | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0                    | Renamed: cmpdi h8 \<- h7, 0  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h8          | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h8, .L2         | Wait: h8                      | Wait: h8                      | Wait: h8                      | Wait: h8                      | Read Inputs: h8            | Write Outputs:               | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h9 \<- h7, -1  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h9          | Wait: Retire                 | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0                    | Renamed: cmpdi h10 \<- h9, 0 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h10           | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0                    | Renamed: bne h10, .L2        | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                  | Read Inputs: h10             | Write Outputs:              | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0                    | Renamed: addi h11 \<- h9, -1 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h11           | Wait: Retire                | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h12 \<- h11, 0 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h12          | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h12, .L2         | Wait: h12                     | Wait: h12                     | Wait: h12                     | Wait: h12                  | Wait: h12                    | Read Inputs: h12            | Write Outputs:              | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch                       | Decode 0                     | Renamed: addi h13 \<- h11, -1 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h13          | Wait: Retire                | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h14 \<- h13, 0 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h14          | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h14, .L2         | Wait: h14                     | Wait: h14                     | Wait: h14                     | Wait: h14                  | Wait: h14                    | Wait: h14                   | Read Inputs: h14            | Write Outputs:              | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch                       | Decode 0                     | Renamed: addi h15 \<- h13, -1 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h15          | Wait: Retire                | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch                       | Decode 0                     | Renamed: cmpdi h16 \<- h15, 0 | Wait: h15                     | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h16          | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          | Fetch                       | Decode 0                     | Renamed: bne h16, .L2         | Wait: h16                     | Wait: h16                     | Wait: h16                     | Wait: h16                  | Wait: h16                    | Wait: h16                   | Wait: h16                   | Read Inputs: h16            | Write Outputs:     | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h17 \<- h15, -1 | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h17          | Wait: Retire       | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h18 \<- h17, 0 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h18 | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: bne h18, .L2         | Wait: h18                     | Wait: h18                     | Wait: h18                  | Wait: h18                    | Wait: h18                   | Wait: h18                   | Wait: h18                   | Read Inputs: h18   | Write Outputs:     | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h19 \<- h17, -1 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h19 | Wait: Retire       | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h20 \<- h19, 0 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h20 | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: bne h20, .L2         | Wait: h20                     | Wait: h20                     | Wait: h20                  | Wait: h20                    | Wait: h20                   | Wait: h20                   | Wait: h20                   | Wait: h20          | Read Inputs: h20   | Write Outputs:     | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: addi h21 \<- h19, -1 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h21 | Wait: Retire       | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             | Fetch                        | Decode 0                      | Renamed: cmpdi h22 \<- h21, 0 | Wait: h21                     | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h22 | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h22, .L2         | Wait: h22                     | Wait: h22                  | Wait: h22                    | Wait: h22                   | Wait: h22                   | Wait: h22                   | Wait: h22          | Wait: h22          | Read Inputs: h22   | Write Outputs:     | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h23 \<- h21, -1 | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h23 | Wait: Retire       | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: cmpdi h24 \<- h23, 0 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h24 | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h24, .L2         | Wait: h24                     | Wait: h24                  | Wait: h24                    | Wait: h24                   | Wait: h24                   | Wait: h24                   | Wait: h24          | Wait: h24          | Wait: h24          | Read Inputs: h24   | Write Outputs:     | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h25 \<- h23, -1 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h25 | Wait: Retire       | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: cmpdi h26 \<- h25, 0 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h26 | Retire             |                    |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: bne h26, .L2         | Wait: h26                     | Wait: h26                  | Wait: h26                    | Wait: h26                   | Wait: h26                   | Wait: h26                   | Wait: h26          | Wait: h26          | Wait: h26          | Wait: h26          | Read Inputs: h26   | Write Outputs:     | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              | Fetch                         | Decode 0                      | Renamed: addi h27 \<- h25, -1 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h27 | Wait: Retire       | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h28 \<- h27, 0 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h28 | Retire             |                   |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h28, .L2         | Wait: h28                  | Wait: h28                    | Wait: h28                   | Wait: h28                   | Wait: h28                   | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Read Inputs: h28   | Write Outputs:     | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: addi h29 \<- h27, -1 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h29 | Wait: Retire       | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h30 \<- h29, 0 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h30 | Retire            |                   |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h30, .L2         | Wait: h30                  | Wait: h30                    | Wait: h30                   | Wait: h30                   | Wait: h30                   | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Read Inputs: h30   | Write Outputs:    | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: addi h31 \<- h29, -1 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h31 | Wait: Retire      | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: cmpdi h0 \<- h31, 0  | Wait: h31                  | Wait: h31                    | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h0 | Retire            |                   |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               | Fetch                         | Decode 0                      | Renamed: bne h0, .L2          | Wait: h0                   | Wait: h0                     | Wait: h0                    | Wait: h0                    | Wait: h0                    | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Read Inputs: h0   | Write Outputs:    | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: addi h2 \<- h31, -1 | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h2 | Wait: Retire      | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: cmpdi h1 \<- h2, 0  | Wait: h2                    | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h1 | Retire            |                   |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Renamed: bne h1, .L2         | Wait: h1                    | Wait: h1                    | Wait: h1                    | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1          | Read Inputs: h1   | Write Outputs:    | Retire            |                   |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: addi h4 \<- h2, -1 | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h4 | Wait: Retire      | Retire            |                   |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: cmpdi h3 \<- h4, 0 | Wait: h4                    | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h3 | Retire            |                   |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: bne h3, .L2        | Wait: h3                    | Wait: h3                    | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3          | Wait: h3          | Read Inputs: h3   | Write Outputs:    | Retire            |                |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: addi h6 \<- h4, -1 | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h6 | Wait: Retire      | Retire            |                |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               | Fetch                         | Decode 0                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: cmpdi h5 \<- h6, 0 | Wait: h6                    | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h5 | Retire            |                |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: bne h5, .L2        | Wait: h5                    | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5          | Wait: h5          | Wait: h5          | Read Inputs: h5   | Write Outputs:    | Retire         |        |     |     |     |     |     |     |     |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: addi h8 \<- h6, -1 | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h8 | Wait: Retire      | Retire         |        |     |     |     |     |     |     |     |
-| 0x104: cmpdi r3, 0        |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: cmpdi h7 \<- h8, 0 | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8          | Wait: h8          | Wait: h8          | Read Inputs: h8   | Write Outputs: h7 | Retire         |        |     |     |     |     |     |     |     |
-| 0x108: bne .L2            |     |       |          |                             |                              |                               |                               |                               | Fetch                         | Decode 0                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: bne h7, .L2        | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7          | Wait: h7          | Wait: h7          | Wait: h7          | Read Inputs: h7   | Write Outputs: | Retire |     |     |     |     |     |     |     |
-
-# simple loop, 8 decode stages, 8 wide
-
-| Cycle                     | 0   | 1     | 2        | 3        | 4        | 5        | 6        | 7        | 8        | 9        | 10                          | 11                           | 12                            | 13                            | 14                            | 15                            | 16                         | 17                           | 18                          | 19                          | 20                          | 21                 | 22                 | 23                 | 24                 | 25                 | 26                 | 27                 | 28                | 29                | 30                | 31                | 32                | 33             | 34     |
-|---------------------------|-----|-------|----------|----------|----------|----------|----------|----------|----------|----------|-----------------------------|------------------------------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|----------------------------|------------------------------|-----------------------------|-----------------------------|-----------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|-------------------|-------------------|-------------------|-------------------|-------------------|----------------|--------|
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h1 \<- h0, -1 | Read Inputs: h0              | Write Outputs: h1             | Retire                        |                               |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h2 \<- h1, 0 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h2             | Retire                        |                               |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: bne h2, .L2        | Wait: h2                     | Wait: h2                      | Read Inputs: h2               | Write Outputs:                | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h3 \<- h1, -1 | Wait: h1                     | Read Inputs: h1               | Write Outputs: h3             | Wait: Retire                  | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h4 \<- h3, 0 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h4             | Retire                        |                            |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: bne h4, .L2        | Wait: h4                     | Wait: h4                      | Wait: h4                      | Read Inputs: h4               | Write Outputs:                | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: addi h5 \<- h3, -1 | Wait: h3                     | Wait: h3                      | Read Inputs: h3               | Write Outputs: h5             | Wait: Retire                  | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     | Fetch | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7 | Renamed: cmpdi h6 \<- h5, 0 | Wait: h5                     | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h6             | Retire                     |                              |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h6, .L2         | Wait: h6                      | Wait: h6                      | Wait: h6                      | Read Inputs: h6               | Write Outputs:             | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h7 \<- h5, -1  | Wait: h5                      | Wait: h5                      | Read Inputs: h5               | Write Outputs: h7             | Wait: Retire               | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: cmpdi h8 \<- h7, 0  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h8          | Retire                       |                             |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h8, .L2         | Wait: h8                      | Wait: h8                      | Wait: h8                      | Wait: h8                      | Read Inputs: h8            | Write Outputs:               | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h9 \<- h7, -1  | Wait: h7                      | Wait: h7                      | Wait: h7                      | Read Inputs: h7               | Write Outputs: h9          | Wait: Retire                 | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: cmpdi h10 \<- h9, 0 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h10           | Retire                      |                             |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: bne h10, .L2        | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                     | Wait: h10                  | Read Inputs: h10             | Write Outputs:              | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6 | Decode 7                    | Renamed: addi h11 \<- h9, -1 | Wait: h9                      | Wait: h9                      | Wait: h9                      | Wait: h9                      | Read Inputs: h9            | Write Outputs: h11           | Wait: Retire                | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h12 \<- h11, 0 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h12          | Retire                      |                             |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h12, .L2         | Wait: h12                     | Wait: h12                     | Wait: h12                     | Wait: h12                  | Wait: h12                    | Read Inputs: h12            | Write Outputs:              | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: addi h13 \<- h11, -1 | Wait: h11                     | Wait: h11                     | Wait: h11                     | Wait: h11                  | Read Inputs: h11             | Write Outputs: h13          | Wait: Retire                | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h14 \<- h13, 0 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h14          | Retire                      |                    |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h14, .L2         | Wait: h14                     | Wait: h14                     | Wait: h14                     | Wait: h14                  | Wait: h14                    | Wait: h14                   | Read Inputs: h14            | Write Outputs:              | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: addi h15 \<- h13, -1 | Wait: h13                     | Wait: h13                     | Wait: h13                     | Wait: h13                  | Wait: h13                    | Read Inputs: h13            | Write Outputs: h15          | Wait: Retire                | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: cmpdi h16 \<- h15, 0 | Wait: h15                     | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h16          | Retire             |                    |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5 | Decode 6                    | Decode 7                     | Renamed: bne h16, .L2         | Wait: h16                     | Wait: h16                     | Wait: h16                     | Wait: h16                  | Wait: h16                    | Wait: h16                   | Wait: h16                   | Read Inputs: h16            | Write Outputs:     | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h17 \<- h15, -1 | Wait: h15                     | Wait: h15                     | Wait: h15                  | Wait: h15                    | Wait: h15                   | Read Inputs: h15            | Write Outputs: h17          | Wait: Retire       | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h18 \<- h17, 0 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h18 | Retire             |                    |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: bne h18, .L2         | Wait: h18                     | Wait: h18                     | Wait: h18                  | Wait: h18                    | Wait: h18                   | Wait: h18                   | Wait: h18                   | Read Inputs: h18   | Write Outputs:     | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h19 \<- h17, -1 | Wait: h17                     | Wait: h17                     | Wait: h17                  | Wait: h17                    | Wait: h17                   | Wait: h17                   | Read Inputs: h17            | Write Outputs: h19 | Wait: Retire       | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h20 \<- h19, 0 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h20 | Retire             |                    |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: bne h20, .L2         | Wait: h20                     | Wait: h20                     | Wait: h20                  | Wait: h20                    | Wait: h20                   | Wait: h20                   | Wait: h20                   | Wait: h20          | Read Inputs: h20   | Write Outputs:     | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: addi h21 \<- h19, -1 | Wait: h19                     | Wait: h19                     | Wait: h19                  | Wait: h19                    | Wait: h19                   | Wait: h19                   | Wait: h19                   | Read Inputs: h19   | Write Outputs: h21 | Wait: Retire       | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4 | Decode 5                    | Decode 6                     | Decode 7                      | Renamed: cmpdi h22 \<- h21, 0 | Wait: h21                     | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h22 | Retire             |                    |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h22, .L2         | Wait: h22                     | Wait: h22                  | Wait: h22                    | Wait: h22                   | Wait: h22                   | Wait: h22                   | Wait: h22          | Wait: h22          | Read Inputs: h22   | Write Outputs:     | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h23 \<- h21, -1 | Wait: h21                     | Wait: h21                  | Wait: h21                    | Wait: h21                   | Wait: h21                   | Wait: h21                   | Wait: h21          | Read Inputs: h21   | Write Outputs: h23 | Wait: Retire       | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: cmpdi h24 \<- h23, 0 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h24 | Retire             |                    |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h24, .L2         | Wait: h24                     | Wait: h24                  | Wait: h24                    | Wait: h24                   | Wait: h24                   | Wait: h24                   | Wait: h24          | Wait: h24          | Wait: h24          | Read Inputs: h24   | Write Outputs:     | Retire             |                    |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h25 \<- h23, -1 | Wait: h23                     | Wait: h23                  | Wait: h23                    | Wait: h23                   | Wait: h23                   | Wait: h23                   | Wait: h23          | Wait: h23          | Read Inputs: h23   | Write Outputs: h25 | Wait: Retire       | Retire             |                    |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: cmpdi h26 \<- h25, 0 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h26 | Retire             |                    |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: bne h26, .L2         | Wait: h26                     | Wait: h26                  | Wait: h26                    | Wait: h26                   | Wait: h26                   | Wait: h26                   | Wait: h26          | Wait: h26          | Wait: h26          | Wait: h26          | Read Inputs: h26   | Write Outputs:     | Retire             |                   |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3 | Decode 4                    | Decode 5                     | Decode 6                      | Decode 7                      | Renamed: addi h27 \<- h25, -1 | Wait: h25                     | Wait: h25                  | Wait: h25                    | Wait: h25                   | Wait: h25                   | Wait: h25                   | Wait: h25          | Wait: h25          | Wait: h25          | Read Inputs: h25   | Write Outputs: h27 | Wait: Retire       | Retire             |                   |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h28 \<- h27, 0 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h28 | Retire             |                   |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h28, .L2         | Wait: h28                  | Wait: h28                    | Wait: h28                   | Wait: h28                   | Wait: h28                   | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Wait: h28          | Read Inputs: h28   | Write Outputs:     | Retire            |                   |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: addi h29 \<- h27, -1 | Wait: h27                  | Wait: h27                    | Wait: h27                   | Wait: h27                   | Wait: h27                   | Wait: h27          | Wait: h27          | Wait: h27          | Wait: h27          | Read Inputs: h27   | Write Outputs: h29 | Wait: Retire       | Retire            |                   |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h30 \<- h29, 0 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h30 | Retire            |                   |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h30, .L2         | Wait: h30                  | Wait: h30                    | Wait: h30                   | Wait: h30                   | Wait: h30                   | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Wait: h30          | Read Inputs: h30   | Write Outputs:    | Retire            |                   |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: addi h31 \<- h29, -1 | Wait: h29                  | Wait: h29                    | Wait: h29                   | Wait: h29                   | Wait: h29                   | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Wait: h29          | Read Inputs: h29   | Write Outputs: h31 | Wait: Retire      | Retire            |                   |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: cmpdi h0 \<- h31, 0  | Wait: h31                  | Wait: h31                    | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h0 | Retire            |                   |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2 | Decode 3                    | Decode 4                     | Decode 5                      | Decode 6                      | Decode 7                      | Renamed: bne h0, .L2          | Wait: h0                   | Wait: h0                     | Wait: h0                    | Wait: h0                    | Wait: h0                    | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Wait: h0           | Read Inputs: h0   | Write Outputs:    | Retire            |                   |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: addi h2 \<- h31, -1 | Wait: h31                   | Wait: h31                   | Wait: h31                   | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Wait: h31          | Read Inputs: h31   | Write Outputs: h2 | Wait: Retire      | Retire            |                   |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: cmpdi h1 \<- h2, 0  | Wait: h2                    | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h1 | Retire            |                   |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Renamed: bne h1, .L2         | Wait: h1                    | Wait: h1                    | Wait: h1                    | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1           | Wait: h1          | Read Inputs: h1   | Write Outputs:    | Retire            |                   |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: addi h4 \<- h2, -1 | Wait: h2                    | Wait: h2                    | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Wait: h2           | Read Inputs: h2   | Write Outputs: h4 | Wait: Retire      | Retire            |                   |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: cmpdi h3 \<- h4, 0 | Wait: h4                    | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h3 | Retire            |                   |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Renamed: bne h3, .L2        | Wait: h3                    | Wait: h3                    | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3           | Wait: h3          | Wait: h3          | Read Inputs: h3   | Write Outputs:    | Retire            |                |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: addi h6 \<- h4, -1 | Wait: h4                    | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4           | Wait: h4          | Read Inputs: h4   | Write Outputs: h6 | Wait: Retire      | Retire            |                |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          | Fetch    | Decode 0 | Decode 1 | Decode 2                    | Decode 3                     | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                      | Wait: not enough free regs | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: cmpdi h5 \<- h6, 0 | Wait: h6                    | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h5 | Retire            |                |        |
-| 0x108: bne .L2            |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Renamed: bne h5, .L2        | Wait: h5                    | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5           | Wait: h5          | Wait: h5          | Wait: h5          | Read Inputs: h5   | Write Outputs:    | Retire         |        |
-| 0x100: addi r3 \<- r3, -1 |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: addi h8 \<- h6, -1 | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6           | Wait: h6          | Wait: h6          | Read Inputs: h6   | Write Outputs: h8 | Wait: Retire      | Retire         |        |
-| 0x104: cmpdi r3, 0        |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: cmpdi h7 \<- h8, 0 | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8           | Wait: h8          | Wait: h8          | Wait: h8          | Read Inputs: h8   | Write Outputs: h7 | Retire         |        |
-| 0x108: bne .L2            |     |       |          |          |          |          |          |          | Fetch    | Decode 0 | Decode 1                    | Decode 2                     | Decode 3                      | Decode 4                      | Decode 5                      | Decode 6                      | Decode 7                   | Wait: not enough free regs   | Wait: not enough free regs  | Wait: not enough free regs  | Renamed: bne h7, .L2        | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7           | Wait: h7          | Wait: h7          | Wait: h7          | Wait: h7          | Read Inputs: h7   | Write Outputs: | Retire |
diff --git a/openpower/openpower/whitepapers/SimpleV_rationale.mdwn b/openpower/openpower/whitepapers/SimpleV_rationale.mdwn
deleted file mode 100644 (file)
index fabbb12..0000000
+++ /dev/null
@@ -1,3 +0,0 @@
-[[!tag whitepapers]]
-
-# Why in the 2020s would you invent a new Vector ISA
diff --git a/openpower/openpower/whitepapers/microcontroller_power_isa_for_ai.mdwn b/openpower/openpower/whitepapers/microcontroller_power_isa_for_ai.mdwn
deleted file mode 100644 (file)
index 76635dd..0000000
+++ /dev/null
@@ -1,114 +0,0 @@
-[[!tag whitepapers]]
-
-# Increasing average area efficiency and reducing resource utilisation for the Power ISA
-
-originally posted at: <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-February/004505.html>
-
-in between attempting to compile microwatt and Libre-SOC for an 85k LUT4 FPGA which took 4 hours (and then did not run), i decided to see, in Libre-SOC's HDL, what level of resource reduction could be achieved by going to 32 bit ALUs and register files.
-
-the difference was an astounding 1.4 to 1.
-
-* the MUL pipeline dropped an astonishing 75% which given that multiply is O(N^2) is, retrospectively, not surprising
-* SHIFT dropped to 50%
-* ALU (add) dropped over 50%
-* Logical dropped over 60%
-* BRAM usage dropped by over 75%
-
-i then took a look at the I-Cache, D-Cache and MMU, and i am not seeing any practical barriers to setting them to 32 bit either, other than needing to define a new RADIX32 data format, which looks to be as simple as reducing the PTE and PDE lengths.
-
-why consider this at all? surely "32 bit is dead, Jim, dead, Jim, dead, Jim, dead" [https://m.youtube.com/watch?v=FCARADb9asE]
-
-the answers are multiple:
-
-* Anton had a hard time getting Microwatt into the Sky130 MPW1, which is limited to 10 mm^2.
-   (Libre-SOC's 180 nm test ASIC was 30 mm^2 and
-     that was with no MMU or L1 D/I-Cache)
-* Compared to RISC-V, which can easily fit into only 3000 LUT4s of an FPGA,
-   a Power ISA implementation is completely missing out on opportunities
-    to be taken up by Hardware enthusiasts because it requires a bare minimum of 40K LUT4s.
-   (Without L1/MMU, Libre-SOC 64-bit is still 20k LUT4s)
-* The high resource utilisation is making life difficult for Libre-BMC
-   and the fact that it is not the slightest bit justified to be running a 64 bit OS, just for a bootloader,
-   leaves me puzzled as to the justification of what is inherently a self-inflicted handicap
-
-The high resource utilisation is (including for Libre-BMC) pressurising everyone, hindering adoption, and slowing down the iteration cycle on development.  The 4 hour turnround was not a throwaway comment, it was deeply significant: designs using 95% of a 45K LUT4 ECP5 can usually complete on nextpnr-ecp5 in around 12-15 minutes. 4 hours is insane and wasting time.
-
-it is also worth reiterating that larger designs give FPGA tools a much harder job, dramatically reducing the maximum achievable clock rate.
-
-
-based on the above analysis, a 32 bit implementation of a MMU-capable Power ISA core could easily fit into a lower cost Digilent Arty A7-35t, a 45K LUT4 VERSA_ECP5, and with a little corner-cutting (no MMU/L1) even potentially fit into the low-cost 25K orangecrab with plenty of room.
-
-this would make it affordable and accessible to e.g. students in India as well as increase general adoption.
-
-not only that but it would cleanly fit into sky130's 10 mm^2 budget (with reduced I/D-Caches), retain an MMU, and have room for some peripherals (kinda important, that)
-
-this in turn allows for a faster iterative cycle on ASIC development through access every couple of *months* to an MPW Shuttle run.
-
-
-the next step requires a little explanation and context.  SVP64 has been designed as a "Sub-Program-Counter for-loop in hardware" (similar to x86 "REP"). it is not a new idea: Peter Hsu, designer of the MIPS R8000, came up with the exact same concept behind SVP64, in 1994.
-
-the register file is treated as a byte-addressable SRAM (with byte-level masks this is not difficult to envisage) and the ALUs end up being conceptually similar to MMX, which can do 8x8, 4x16, 2x32 or 1x64 bit operations, except that SVP64 introduces predicate masks which of course
-map directly and simply onto the write-select lines of the underlying
-SRAM of the register file.
-
-however as an intermediary step on the path to converting Libre-SOC's HDL to cope with 8/16/32/64 we actually have to define and implement *scalar* operations at 8, 16 and 32 bit in addition to those already present in the 64-bit Power ISA.  this is underway with a Draft RFC proposal to define the Power ISA in terms of "XLEN", where XLEN=64 very deliberately, thoroughly and intentionally matches precisely, and by definition, with exactly that which is currently in Power ISA 3.0/3.1
-
-let that sink in a moment because the implications are startling:
-
-      we are in effect defining not only a 32 bit Draft
-      variant of the Power ISA, we (Libre-SOC) are also
-      defining a 16 bit *and an 8 bit* variant of Power
-      [and anticipate someone in the future to
-      define a 128-bit variant to match RISC-V RV128].
-
-bear in mind that SVP64 *has* to have Scalar Operations first, because by design and by definition *only Scalar operations may be Vectorised*.  SVP64 *DOES NOT* add *ANY* Vector Instructions. SVP64 is a generic loop around *Scalar* operations and it is up to the Architecture to take advantage of that, at the back-end.
-
-without SVP64 Sub-Looping it would on the face of it seem absolutely mental and a total waste of time and resources to define an 8 or 16 bit General-Purpose ISA in the year 2022 until you recall that:
-
-* students cannot possibly fit a Power ISA 64 bit implementation into a USD $10 ICE40 FPGA, but they might achieve a 16 bit one, and potentially do so in a few short weeks
-
-* the primary focus of AI is FP16, BF16, and even FP8 in some cases, in massively parallel banks of cores numbering in the thousands, often with SIMD ALUs.
-
-* a typical GPU has over 30% by area dedicated to parallel computational
-resources (SIMD ALUs) where a General-purpose RISC Core is typically
-dwarfed by literally two orders of magnitude by routing, register files,
-caches and peripherals.
-
-the inherent downside of such massively parallel task-centric cores is that they are absolutely useless at anything other than that specialist task, and are additionally a pig to program, lacking a useful ISA and compiler or, worse, having one but under proprietary licenses.
-
-the delicate balance of massively parallel supercomputing architecture is not to overcook the performance of a single core above all else (hint: Intel), but to focus instead on *average* efficiency per *total* area or power.
-
-    what if there was a way to leverage the Power ISA
-    to have high-end AI performance yet be able to
-    allow programmers to use standard compiler tools
-    to run general-purpose programs on all of those
-    massively-parallel cores?
-
-anyone who has tried either CUDA, 3D Shader programs, deep or wide SIMD Programming, or tried to get their heads twisted round GPU SIMT threads would celebrate and welcome the opportunity.
-
-(in particular, anyone who remembers how hard programming the Cell Processor turned out to be will be having that familiar "lightbulb moment" right about now)
-
-more than that: what if those 8 and 16 bit cores had a Supercomputing-class Vectorisation option in the ISA, and there were implementations out there with back-end ALUs that could perform 64 or 128 operations, each 8 or 16 bit, per clock cycle?
-
-Quantity several thousand per processor, all of them capable of adapting to run massive AI number crunching or (at lower IPC than "normal" processors) general-purpose compute?
-
-To achieve this requires some insights:
-
-1. access (addressing memory) beyond 8-bit, 16-bit, or 32-bit, can easily be achieved by allowing LD/STs to leverage *multiple* 8/16/32-bit registers to create 32 or 64 bit addresses.
-
-   SVP64 *already* has the concept of allowing consecutive 8/16/32/64 bit registers to be considered a "Vector" so typecasting to create 32 or 64 bit addresses fits easily
-
-2. If the Power ISA did not already have Carry-In/Out and Condition Registers, this entire idea would have much less merit.
-
-the idea of using multiple instructions to construct bigger integer values is nothing new, but doing so is far easier and more efficient if the ISA has Carry Flags.  that particularly hits home if the basic arithmetic width is only 8 or 16 bit!
-
-3. SVP64 already has the concept of extending the GPRs and FPRs to 128 entries.  however if those are say 16 bit registers, the actual size of the regfile is back down to exactly the same total size in bytes as Power ISA 3.0 (128 x 2 bytes = 32 x 8 bytes = 256 bytes)
-
-  * only 32 16-bit registers would be alarmingly resource pressured, particularly given that 4 of them would be needed to construct a 64 bit LD/ST address
-  * 128 16-bit registers on the other hand are equivalent to 32 64-bit regs and Computer Science shows we are comfortable with that quantity.
-
-given the ease with which both 32 and 64 bit addresses may be constructed, and 32 and 64 bit integer arithmetic (and beyond) may be created using multiple instructions *and* how much more efficiently that can be done by leveraging SVP64, what at first sounded like an absolutely insane-to-the-point-of-laughable idea instead would be not only workable but combine General-Purpose Compute and AI workloads into a single hybrid ISA.
-
-as you are no doubt aware this has been the focus of so many unsuccessful ventures for so many decades, it would be nice to have one that worked. but, by definition, being "General" Purpose Compute (that happens to also be Supercomputing AI capable) it starts at the ISA and grows from there.
-
-bottom line, i would very much like to see the Power ISA take on Esperanto, but without having to define a custom proprietary extension to the ISA that nobody but they have access to.
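
the three insights listed in the whitepaper above (wide addresses built from several narrow registers, carry flags making multi-word arithmetic cheap, and 128 16-bit registers matching the byte size of 32 64-bit ones) can be sketched in a few lines of Python. this is only an illustrative model under stated assumptions: the 16-bit limb width is picked as an example, and the helper names (`split_limbs`, `join_limbs`, `add_with_carry`, `regfile_bytes`) are hypothetical and are not taken from the Power ISA, SVP64, or the Libre-SOC HDL.

```python
# Hypothetical model only: names and the 16-bit width are assumptions for
# illustration, not part of the Power ISA or the SVP64 specification.
LIMB_BITS = 16
LIMB_MASK = (1 << LIMB_BITS) - 1


def split_limbs(value, count):
    """Split an integer into `count` 16-bit limbs, least significant first."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def join_limbs(limbs):
    """Recombine 16-bit limbs (least significant first) into one integer,
    e.g. to form a 64-bit LD/ST address from four 16-bit registers."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_with_carry(a_limbs, b_limbs, carry_in=0):
    """Add two equal-length limb vectors, passing the carry flag from each
    16-bit add into the next, as a chain of add-with-carry operations would."""
    result = []
    carry = carry_in
    for a, b in zip(a_limbs, b_limbs):
        total = a + b + carry
        result.append(total & LIMB_MASK)   # low 16 bits are the limb result
        carry = total >> LIMB_BITS         # bit 16 is the carry-out (0 or 1)
    return result, carry


def regfile_bytes(num_regs, reg_bits):
    """Total register file size in bytes."""
    return num_regs * reg_bits // 8


if __name__ == "__main__":
    a = split_limbs(0x0000_FFFF_FFFF_FFFF, 4)
    b = split_limbs(1, 4)
    total, carry_out = add_with_carry(a, b)
    print(hex(join_limbs(total)), carry_out)
    print(regfile_bytes(128, 16), regfile_bytes(32, 64))
```

the loop in `add_with_carry` is the part that a carry flag makes cheap: each step is a single narrow add that consumes and produces one carry bit, which is also the shape of operation that a hardware for-loop over consecutive registers could, in principle, iterate over.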