(no commit message)

diff --git a/3d_gpu/architecture/compared_to_register_renaming.mdwn b/3d_gpu/architecture/compared_to_register_renaming.mdwn
index 15430a7f99e35819b2221874607a4f806d99a47c..71f7b170f53a91d22dbf15b3e420af0570da6b4e 100644
--- a/3d_gpu/architecture/compared_to_register_renaming.mdwn
+++ b/3d_gpu/architecture/compared_to_register_renaming.mdwn
@@ -37,7 +37,6 @@ void f(uint64_t *r3, uint64_t r4) {
      } while(--ctr != 0);
  }
  ```
-
  [See on Compiler Explorer](https://gcc.godbolt.org/z/hzf7d7)
  
  It produces the following Power instructions (edited for style):
@@ -57,7 +56,8 @@ f:
  
  Renamed hardware registers are named `h0`, `h1`, `h2`, ...
  
-The syntax `ldu h7, 8(h5 -> h8)` will be used to mean that the address read comes from `h5` and the address write goes to `h8`
+The syntax `ldu h7, 8(h5 -> h8)` will be used to mean that the address
+read comes from `h5` and the address write goes to `h8`
  
  The register rename table starts out as following:
  
@@ -93,14 +93,24 @@ The register rename table starts out as following:
  
  ## 6600-derived
  
-Notice how the WaR Waits on `r9` cause 2 instructions to finish per cycle (5 micro-ops per 2 cycles) instead of the 4 per cycle for the Register Renaming version, this means the processor's resources will eventually be full, limiting total throughput to 2 instructions/clock.
+Notice how the WaR Waits on `r9` cause 2 instructions to finish per cycle
+(5 micro-ops per 2 cycles) instead of the 4 per cycle for the Register
+Renaming version, this means the processor's resources will eventually
+be full, limiting total throughput to 2 instructions/clock.
  
  For the following table:
-- Assumes that `ldu` instructions are split into two micro-ops in the decode stage. The address computation is denoted "#5.a" and the memory read is denoted "#5.m".
-- Assumes that a mechanism for forwarding from a FU's result latch to a waiting operation is in place, without having to wait until the result can be written to the register file.
-- "Av `r3`" denotes that the value to be written to `r3` is computed and is available for forwarding but can't yet be written to the register file.
-- "SW: #4" denotes that the instruction is waiting on the shadow produced by instruction #4.
-- "Rf #5:`r5`" denotes that the instruction reads the result latch for instruction #5's new value for `r5` through the forwarding mechanism.
+- Assumes that `ldu` instructions are split into two micro-ops in the
+  decode stage. The address computation is denoted "#5.a" and the memory
+  read is denoted "#5.m".
+- Assumes that a mechanism for forwarding from a FU's result latch to a
+  waiting operation is in place, without having to wait until the result
+  can be written to the register file.
+- "Av `r3`" denotes that the value to be written to `r3` is computed and
+  is available for forwarding but can't yet be written to the register file.
+- "SW: #4" denotes that the instruction is waiting on the shadow produced
+  by instruction #4.
+- "Rf #5:`r5`" denotes that the instruction reads the result latch for
+  instruction #5's new value for `r5` through the forwarding mechanism.
  
  | ISA-level instruction | Num   | 0     | 1      | 2             | 3            | 4                | 5                | 6                | 7                 | 8                | 9                         | 10                         | 11                         | 12             | 13             | 14             | 15             | 16          | 17     |
  |-----------------------|-------|-------|--------|---------------|--------------|------------------|------------------|------------------|-------------------|------------------|---------------------------|----------------------------|----------------------------|----------------|----------------|----------------|----------------|-------------|--------|