determine performance (and eventually to write the HDL).
# The Model
-## Brief [src](https://bugs.libre-soc.org/show_bug.cgi?id=1039)
+## Brief
+
+* [Bug description](https://bugs.libre-soc.org/show_bug.cgi?id=1039)
+
The model for the Single-Issue In-Order core needs to be added to the in-house
Python simulator (`ISACaller`, called by `pypowersim`), which will allow basic
*performance estimates*.
Eventually, Cavatools code will be studied so that its power consumption
estimation can be extracted and re-implemented in Python.
-## Task given [src](https://bugs.libre-soc.org/show_bug.cgi?id=1039#c1) [src](https://libre-soc.org/irclog/%23libre-soc.2023-05-02.log.html#t2023-05-02T10:51:45)
+## Task given
+
+* [Bug comment #1](https://bugs.libre-soc.org/show_bug.cgi?id=1039#c1)
+* [IRC log](https://libre-soc.org/irclog/%23libre-soc.2023-05-02.log.html#t2023-05-02T10:51:45)
+
An offline instruction ordering analyser needs to be written that models a
(simple, initially V3.0-only) **in-order core** and gives an estimate of
instructions per clock (IPC).
- Instruction with its operands (as assembler listing)
- plus an optional memory-address and whether it is read or written.
-The input will come from as trace output from the ISACaller simulator,
+The input will come as trace output from the ISACaller simulator,
[see bug comments #7-#16](https://bugs.libre-soc.org/show_bug.cgi?id=1039#c7)
Some classes are needed which "model" pipeline stages: fetch, decode, issue,
The output diagram will look like this:
-| clk # | fetch | decode | issue | execute |
-| 1 | addi 3, 4, 5 | | | | |
-| 2 | cmpi 1, 0, 3, 4 | addi 3, 4, 5 | | |
-| 3 | STALL | cmpi 1, 0, 3, 4 | addi 3, 4, 5 | |
-| 4 | STALL | cmpi 1, 0, 3, 4 | | addi 3, 4, 5 |
-| 5 | ld 1, 2(3) | | cmpi 1, 0, 3, 4 | |
-| 6 | | ld 1, 2(3) | | cmpi 1, 0, 3, 4 |
-| 7 | | | ld 1, 2(3) | |
-| 8 | | | | ld 1, 2(3) |
+| clk # | fetch | decode | issue | execute |
+|-------|--------------|--------------|--------------|--------------|
+| 1 | addi 3,4,5 | | | |
+| 2 | cmpi 1,0,3,4 | addi 3,4,5 | | |
+| 3 | STALL | cmpi 1,0,3,4 | addi 3,4,5 | |
+| 4 | STALL | cmpi 1,0,3,4 | | addi 3,4,5 |
+| 5 | ld 1,2(3) | | cmpi 1,0,3,4 | |
+| 6 | | ld 1,2(3) | | cmpi 1,0,3,4 |
+| 7 | | | ld 1,2(3) | |
+| 8 | | | | ld 1,2(3) |
Explanation:
- 1: Fetched `addi`.
- 2: Decoded `addi`, fetched `cmpi`.
- 3: Issued `addi`, decoded `cmpi`, must stall decode phase, stop fetching.
- 4: Executed `addi`, everything else stalled.
- 5: Issued `cmpi`, fetched `ld`.
- 6: Executed `cmpi`, decoded `ld`.
- 7: Issued `ld`.
- 8: Executed `ld`.
+ 1: Fetched addi.
+ 2: Decoded addi, fetched cmpi.
+ 3: Issued addi, decoded cmpi, must stall decode phase, stop fetching.
+ 4: Executed addi, everything else stalled.
+ 5: Issued cmpi, fetched ld.
+ 6: Executed cmpi, decoded ld.
+ 7: Issued ld.
+ 8: Executed ld.
For this initial model, it is assumed that all instructions take one cycle to
execute (not the case for mul/div etc., but this will be dealt with later).
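
The timing in the table above can be reproduced with a few lines of Python.
The following is only a minimal sketch under stated assumptions, not the
planned implementation: the names `Instr`, `schedule`, `render` and `ipc` are
hypothetical, the register read/write sets are given by hand rather than
parsed from the ISACaller trace, and the stall rule (an instruction held in
decode blocks fetch until the clock in which it issues) is inferred from the
table rather than taken from a specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One trace entry. `reads`/`writes` are hand-written register names
    (assumption: a real tool would derive them from the trace)."""
    text: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def schedule(trace):
    """Return one (fetch, decode, issue, execute) clock tuple per instruction.

    `decode` is the first clock spent in decode; a stalled instruction
    stays in decode until the clock before `issue`.
    """
    times = []
    ready = {}  # register -> first clock in which a reader may issue
    for ins in trace:
        if not times:
            fetch = 1
        else:
            prev_fetch, prev_decode, prev_issue, _ = times[-1]
            # Assumed rule: a predecessor held in decode blocks fetch
            # until the clock in which that predecessor issues.
            stalled = prev_issue > prev_decode + 1
            fetch = prev_issue if stalled else prev_fetch + 1
        decode = fetch + 1
        issue = decode + 1
        for reg in ins.reads:
            # Wait until every producer has finished its execute clock.
            issue = max(issue, ready.get(reg, 0))
        execute = issue + 1  # every instruction executes in one clock
        for reg in ins.writes:
            ready[reg] = execute + 1
        times.append((fetch, decode, issue, execute))
    return times


def render(trace, times):
    """Return rows (clk, fetch, decode, issue, execute) as in the table."""
    last_fetch = max(t[0] for t in times)
    rows = []
    for clk in range(1, times[-1][3] + 1):
        fetch = decode = issue = execute = ""
        for ins, (f, d, i, e) in zip(trace, times):
            if clk == f:
                fetch = ins.text
            if d <= clk < i:
                decode = ins.text
            if clk == i:
                issue = ins.text
            if clk == e:
                execute = ins.text
        if not fetch and clk < last_fetch:
            fetch = "STALL"  # instructions remain but fetch is blocked
        rows.append((clk, fetch, decode, issue, execute))
    return rows


def ipc(times):
    """Instructions per clock over the whole trace."""
    return len(times) / times[-1][3]


trace = [
    Instr("addi 3,4,5", reads=frozenset({"r4"}), writes=frozenset({"r3"})),
    Instr("cmpi 1,0,3,4", reads=frozenset({"r3"}), writes=frozenset({"cr1"})),
    Instr("ld 1,2(3)", reads=frozenset({"r3"}), writes=frozenset({"r1"})),
]
times = schedule(trace)
for row in render(trace, times):
    print("| " + " | ".join(str(col) for col in row) + " |")
print("IPC:", ipc(times))
```

For the three-instruction example, `schedule` places `addi` at clocks
(1, 2, 3, 4), `cmpi` at (2, 3, 5, 6) and `ld` at (5, 6, 7, 8), so the estimated
IPC is 3/8 = 0.375. Computing the clocks per instruction, instead of stepping
stage objects clock by clock, keeps the sketch short; the per-stage classes
mentioned earlier would be the natural place to add multi-cycle execution
later.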