/* state-machine for Simple-V "hardware-macro-loop-unroller".
- yes, it's an absolute non-obvious mess, thankfully a reasonbly short one.
- the mess is down to being a macro-top-heavy re-entrant state machine
- that copes with all of the different formats of instructions, including
- some quirks (branches, twin predication, compressed overloads are TODO).
-
- the re-entrancy is required in case of exceptions. the SV "STATE" CSR
- includes the two element offsets (two because twin-predication for
- c.mv and fcvt and friends needs two predicates, therefore two element
- offsets, therefore can implement a vector-version of bitwise gather/scatter)
-
- there is a lot going on here: id_regs.py (see riscv.mk.in) is responsible
- for generating the regs_{insn}.h file, such that this template can just
- #include the *standard scalar implementation, unmodified*. that's the
- whole point of SV. exceptions to changes of behaviour are branches
- (hence the alternative #define for set_pc, below).
-
- also it is important to appreciate that id_regs.py generates #defines
- for which predicate and offset that each instruction will use for
- each register. again, unfortunately, c.lwsp / c.fldsp here is an
- exception to that rule (TODO). the "if xlen == 32" test has to
- be explicitly replicated, and the predication type/target explicitly
- changed depending on that test, where in other cases it can be
- determined by id_regs.py.
-
- in case you're wondering: yes, really, id_regs.py actually parses
- the actual riscv/insns/impl.h implementations (all of them), looking
- for uses of "RVC_RS1" and "WRITE_RD" and so on, as an indicator
- of the register usage for that specific opcode. whilst it was
- hypothetically possible to use this repo, kindly written by michael clark:
- https://github.com/michaeljclark/riscv-meta
- it was instead decided that instead of having an extra dependency,
- and then having to write parsers for those files, dealing with the
- various quirks would be better done by just... parsing the actual spike
- scalar instrucion implementation(s). they contain the same information...
- just in an easier-to-identify format.
-
- the actual redirection (lookups of the register indices from the opcode
- through the CSR tables) is done in sv_insn_t, through overloads on
- rd, rs1-3, rvc_rs1, and for c.lwsp and friends, rvc_lwsp_imm.
-
- the state machine is made more complex due to it coping with scalar-scalar,
- scalar-vector, vector-scalar *and* vector-vector, dynamically, at runtime.
- it also (for now) is disabled in hypervisor mode. this is why the vlen is
- set initially to 0, then later is set to a minimum of 1.
-
- the state machine *also* copes with cases where registers are marked
- *specifically* as "redirected but still scalar", and the twin-predication
- version can also skip forward so that a scalar can be matched up with
- a single bit-predicated vector (scalar source, single-bit-predicated
- dest *or* the other way round).
-
- it's *really* comprehensive in other words, for just 200 or so lines.
+ see insn_template_sv.txt for design details
*/
#define xstr(s) str(s)
--- /dev/null
+// See LICENSE for license details.
+
+/* state-machine for Simple-V "hardware-macro-loop-unroller".
+
+yes, it's an absolute non-obvious mess, thankfully a reasonably short one.
+the mess is down to being a macro-top-heavy re-entrant state machine
+that copes with all of the different formats of instructions, including
+some quirks (branches, twin predication, compressed overloads are TODO).
+
+the re-entrancy is required in case of exceptions. the SV "STATE" CSR
+includes the two element offsets (two because twin-predication for
+c.mv and fcvt and friends needs two predicates, therefore two element
+offsets, which in turn allows a vector version of bitwise gather/scatter).
+
+there is a lot going on here: id_regs.py (see riscv.mk.in) is responsible
+for generating the regs_{insn}.h file, such that this template can just
+#include the *standard scalar implementation, unmodified*. that's the
+whole point of SV. the exceptions, where behaviour does change, are branches
+(hence the alternative #define for set_pc, below).
+
+also it is important to appreciate that id_regs.py generates #defines
+for which predicate and offset each instruction will use for
+each register. again, unfortunately, c.lwsp / c.fldsp here is an
+exception to that rule (TODO). the "if xlen == 32" test has to
+be explicitly replicated, and the predication type/target explicitly
+changed depending on that test, where in other cases it can be
+determined by id_regs.py.
+
+in case you're wondering: yes, really, id_regs.py actually parses
+the actual riscv/insns/impl.h implementations (all of them), looking
+for uses of "RVC_RS1" and "WRITE_RD" and so on, as an indicator
+of the register usage for that specific opcode. whilst it was
+hypothetically possible to use this repo, kindly written by michael clark:
+ https://github.com/michaeljclark/riscv-meta
+it was decided that, instead of having an extra dependency,
+and then having to write parsers for those files, dealing with the
+various quirks would be better done by just... parsing the actual spike
+scalar instruction implementation(s). they contain the same information...
+just in an easier-to-identify format.
+
+the actual redirection (lookups of the register indices from the opcode
+through the CSR tables) is done in sv_insn_t, through overloads on
+rd, rs1-3, rvc_rs1, and for c.lwsp and friends, rvc_lwsp_imm.
+
+the state machine is made more complex due to it coping with scalar-scalar,
+scalar-vector, vector-scalar *and* vector-vector, dynamically, at runtime.
+it also (for now) is disabled in hypervisor mode. this is why the vlen is
+set initially to 0, then later is set to a minimum of 1.
+
+the state machine *also* copes with cases where registers are marked
+*specifically* as "redirected but still scalar", and the twin-predication
+version can also skip forward so that a scalar can be matched up with
+a single bit-predicated vector (scalar source, single-bit-predicated
+dest *or* the other way round).
+
+in other words, it's *really* comprehensive for just 200 or so lines.