From: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date: Sun, 3 May 2020 21:29:40 +0000 (+0100)
Subject: whitespace
X-Git-Tag: convert-csv-opcode-to-binary~2762
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=d626da07e7b25918c387a6e3e4a42627fce02503;p=libreriscv.git

whitespace
---

diff --git a/3d_gpu/architecture/6600scoreboard.mdwn b/3d_gpu/architecture/6600scoreboard.mdwn
index 74d09af69..2c5846a78 100644
--- a/3d_gpu/architecture/6600scoreboard.mdwn
+++ b/3d_gpu/architecture/6600scoreboard.mdwn
@@ -8,18 +8,59 @@ btw one thing that's not obvious - at all - about scoreboards is: there's nothin
 
 the reason i feel that the weirdness exists is for a few reasons:
 
-* firstly, the Matrices create a Directed Acyclic Graph, using single-bit SR-Latches.  for a software engineer, being able to express a DAG using a matrix is itself.. .weird :)
-* secondly: those Matrices preserve time *order* (instruction dependent order actually), they are not themselves dependent *on* time itself.  this is especially weird if one is used to an in-order system, which is very much critically dependent on "time" and on strict observance of how long results are going to take to get through a pipeline.  we could do the entire design based around low-gate-count FSMs and it would still be absolutely fine.
-* thirdly, it's the *absence* of blocks that allows a unit to proceed.  unlike an in-order system, there's nothing saying "you go now, you go now": it's the opposite.  the unit is told instead, "here's the resources you need to WAIT for: go when those resources are available".
-* fourth (clarifying 3): it's reads that block writes, and writes that block reads.  although obvious when thought through from first principles, it can get particularly confusing that it is the *absence* of read hazards that allow writes to proceed, and the *absence* of write hazards that allow reads to proceed.
-* fifth: the ComputationUnits still need to "manage" the input and output of those resources to actual pipelines (or FSMs).
- - (a) the CUs are *not* permitted to blithely say, if there is an expected output that also needs managing "ok i got the inputs, now throw them at the pipeline, i'm done".  they *must* wait for that result.  of course if there is no result to wait for, they're permitted to indicate "done" without waiting (this actually happens in the case of STORE).
- - (b) there's an apparent disconnect between "fetching of registers" and "Computational Unit progress".  surely, one feels, there should be something that, again, "orders the CU to proceed in a set, orderly progressive fashion?".  instead, because the progress is from the *absence* of hazards, the CU's FSMs likewise make forward progress from the "acknowledgement" of each blockage being dropped.
-* sixth: one of the incredible but puzzling things is that register renaming is *automatically* built-in to the design.  the Function Unit's input and output latches are effectively "nameless" registers.
- - (a) the more Function Units you have, the more nameless registers exist.  the more nameless registers exist, the further ahead that in-flight execution can progress, speculatively.
- - (b) whilst the Function Units are devoid of register "name" information, the FU-Regs Dependency Matrix is *not* devoid of that information, having latched the read/write register numbers in an unary form, as a "row", one bit in each row representing which register(s) the instruction originally contained.
- - (c) by virtue of the direct Operand Port connectivity between the FU and its corresponding FU-Regs DM "row", the Function Unit requesting for example Operand1 results in the FU-Regs DM *row* triggering a register file read-enable line, *NOT* the Function Unit itself.
-* seventh: the PriorityPickers manage resource contention between the FUs and the row-information from the FU-Regs Matrix.  the port bandwidth by nature has to be limited (we cannot have 200 read/write ports on the regfile).  therefore the connection between the FU and the FU-Regs "row" in which the actual reg numbers is stored (in unary) is even *less* direct than it is in an in-order system.
+* firstly, the Matrices create a Directed Acyclic Graph, using single-bit
+  SR-Latches.  for a software engineer, being able to express a DAG using
+  a matrix is itself.. .weird :)
+* secondly: those Matrices preserve time *order* (instruction
+  dependent order actually), they are not themselves dependent *on* time
+  itself.  this is especially weird if one is used to an in-order system,
+  which is very much critically dependent on "time" and on strict observance
+  of how long results are going to take to get through a pipeline.  we
+  could do the entire design based around low-gate-count FSMs and it would
+  still be absolutely fine.
+* thirdly, it's the *absence* of blocks that allows a unit to
+  proceed.  unlike an in-order system, there's nothing saying "you go now,
+  you go now": it's the opposite.  the unit is told instead, "here's the
+  resources you need to WAIT for: go when those resources are available".
+* fourth (clarifying 3): it's reads that block writes, and writes
+  that block reads.  although obvious when thought through from first
+  principles, it can get particularly confusing that it is the *absence*
+  of read hazards that allow writes to proceed, and the *absence* of write
+  hazards that allow reads to proceed.
+* fifth: the ComputationUnits still need to "manage" the input and output
+  of those resources to actual pipelines (or FSMs).
+ - (a) the CUs are *not* permitted to blithely say, if there is an
+  expected output that also needs managing "ok i got the inputs, now throw
+  them at the pipeline, i'm done".  they *must* wait for that result.  of
+  course if there is no result to wait for, they're permitted to indicate
+"done" without waiting (this actually happens in the case of STORE).
+ - (b) there's an apparent disconnect between "fetching of registers"
+  and "Computational Unit progress".  surely, one feels, there should
+  be something that, again, "orders the CU to proceed in a set, orderly
+  progressive fashion?".  instead, because the progress is from the
+*absence* of hazards, the CU's FSMs likewise make forward progress from
+the "acknowledgement" of each blockage being dropped.
+* sixth: one of the incredible but puzzling things is that register
+  renaming is *automatically* built-in to the design.  the Function Unit's
+  input and output latches are effectively "nameless" registers.
+ - (a) the more Function Units you have, the more nameless registers
+   exist.  the more nameless registers exist, the further ahead that
+ in-flight execution can progress, speculatively.
+ - (b) whilst the Function Units are devoid of register "name"
+   information, the FU-Regs Dependency Matrix is *not* devoid of that
+   information, having latched the read/write register numbers in an unary
+   form, as a "row", one bit in each row representing which register(s)
+   the instruction originally contained.
+ - (c) by virtue of the direct Operand Port connectivity between the FU
+   and its corresponding FU-Regs DM "row", the Function Unit requesting for
+   example Operand1 results in the FU-Regs DM *row* triggering a register
+ file read-enable line, *NOT* the Function Unit itself.
+* seventh: the PriorityPickers manage resource contention between the FUs
+  and the row-information from the FU-Regs Matrix.  the port bandwidth
+  by nature has to be limited (we cannot have 200 read/write ports on
+  the regfile).  therefore the connection between the FU and the FU-Regs
+  "row" in which the actual reg numbers is stored (in unary) is even *less*
+  direct than it is in an in-order system.
 
 ultimately then, there is: