Simple-V is a type of Vectorization best described as a "Prefix Loop
Subsystem" similar to the five-decade-old Zilog Z80 `LDIR`[^bib_ldir] instruction and
to the 8086 `REP`[^bib_rep] Prefix instruction. More advanced features are similar
-to the Z80 `CPIR`[^bib_cpir] instruction. If naively viewed one-dimensionally as an
-actual Vector ISA it introduces over 1.5 million 64-bit True-Scalable
-Vector instructions on the SFFS Subset and closer to 10 million 64-bit
-True-Scalable Vector instructions if introduced on VSX. SVP64, the
-instruction format used by Simple-V, is therefore best viewed as an
-orthogonal RISC-paradigm "Prefixing" subsystem instead.
+to the Z80 `CPIR`[^bib_cpir] instruction.
[^bib_ldir]: [Zilog Z80 LDIR](http://z80-heaven.wikidot.com/instructions-set:ldir)
[^bib_cpir]: [Zilog Z80 CPIR](http://z80-heaven.wikidot.com/instructions-set:cpir)
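
As a rough illustration of the "Prefix Loop" idea above, the following Python sketch repeats one unchanged scalar operation over consecutive register numbers. It is a conceptual model only, not the specification's pseudocode: the names `run_prefixed`, `add64`, and the flat 32-entry `regs` list are hypothetical, and real SVP64 has many more modes than this simple element-stepping shows.

```python
def add64(a, b):
    """Hypothetical stand-in for a 64-bit scalar add instruction."""
    return (a + b) & 0xFFFFFFFFFFFFFFFF


def run_prefixed(scalar_op, regs, rt, ra, rb, vl):
    """Apply the same scalar operation to vl consecutive elements.

    The scalar operation itself is not altered by being "prefixed":
    only the register numbers advance on each step of the loop.
    """
    for step in range(vl):
        regs[rt + step] = scalar_op(regs[ra + step], regs[rb + step])
    return regs


# Hypothetical register file where each register holds its own index.
regs = list(range(32))
run_prefixed(add64, regs, rt=8, ra=16, rb=24, vl=4)
```

The point of the sketch is that `add64` is passed in untouched: the loop wraps the scalar operation rather than redefining it, which is the sense in which the Prefix is orthogonal to the instruction it precedes.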
the following instruction (also a Defined Word-instruction), but does **not** change the actual Decoding
of that following instruction just because it is Prefixed. Unlike EXT100-163,
where the Suffix is considered an entirely new Opcode Space,
-SVP64-Prefixed instructions **MUST NEVER** be treated or regarded
+SVP64-Prefixed instructions must never be treated or regarded
as a different Opcode Space.
[^dwi]: Defined Word-instruction: Power ISA v3.1 Section 1.6
-*Architectural note: Treating the SVP64 Prefix as an "Independent" 64-bit Encoding Space and attempting
-to allocate non-Orthogonal Opcodes within it will result
-in catastrophic unviability of Simple-V. The Orthogonality of the Scalar vs Prefixed-Scalar
-spaces has to be considered inviolate, to the extent that even RESERVED spaces must be
-kept identical. The complexity at the Decode Phase by violating the RISC paradigm inherent
-in Simple-V will be unimplementable*
-
Two apparent exceptions to the above hard rule exist: SV
Branch-Conditional operations and LD/ST-update "Post-Increment"
Mode. Post-Increment was considered sufficiently high priority
Therefore it has to be prohibited to accept RFCs
which fundamentally violate the following hard requirement: **under no circumstances**
must the use of SVP64 24-bit Suffixes **also** imply a different Opcode space
-from **any** non-prefixed Word, even RESERVED or Illegal Words.*
+from **any** non-prefixed Word. Even RESERVED or Illegal Words must be
+Orthogonal.*
Subset implementations in hardware are permitted, as long as certain
rules are followed, allowing for full soft-emulation including future
Different classes of operations require different formats. The earlier
sections cover the common formats and the five separate modes have their own
section later:
-CR operations (crops), Arithmetic/Logical (termed "normal"), Load/Store
-Immediate, Load/Store Indexed, and Branch-Conditional.
+* CR operations (crops),
+* Arithmetic/Logical (termed "normal"),
+* Load/Store Immediate,
+* Load/Store Indexed,
+* Branch-Conditional.
## Definition of Reserved in this spec.
because in Simple-V Execution of Elements is synonymous with Execution of
instructions.
+[^ieo]: Strict Instruction Execution Order is defined in Public v3.1 Book I Section 2.2
+
## Precise Interrupt Guarantees
-Strict Instruction Execution Order[^ieo] is defined as giving the appearance, as far
+Strict Instruction Execution Order is defined as giving the appearance, as far
as programs are concerned, that instructions were executed
strictly in the sequence that they occurred. A "Precise"
out-of-order
Micro-architecture goes to considerable lengths to ensure that
this is the case.
-[^ieo]: Strict Instruction Execution Order is defined in Public v3.1 Book I Section 2.2
-
Many Vector ISAs allow interrupts to occur in the middle of
processing of large Vector operations, only under the condition
that partial results are cleanly discarded, and continuation on return
Simple-V operates on an entirely different paradigm from traditional
Vector ISAs: as a "Sub-Execution Context", where "Elements" are synonymous
-with Scalar instructions. With this in mind it is critical for
-implementations to observe Strict **Element**-Level Execution Order[^svp64_eeo]
+with Scalar instructions. With this in mind
+implementations must observe Strict **Element**-Level Execution Order[[#svp64_eeo]]
at all times.
-*Any* element is Interruptible and Architectural State may
-be fully preserved and restored regardless of that same State
+*Any* element is Interruptible, and Architectural State may
+be fully preserved and restored regardless of that same State.
*Engineering note: implementations are permitted to have higher latency to
perform context-switching (particularly if REMAP
Interrupts still only save `MSR` and `PC` in `SRR0` and `SRR1`
but the full SVP64 Architectural State may be saved and
restored through manual copying of `SVSTATE` (and the four
-REMAP SPRs if in use at the time)
-
-*Programmer's note: Trap Handlers (and function call stack save/restore)
-may avoid the
-use of SVP64 Prefixed instructions to perform the necessary
-save/restore of Simple-V Architectural State.
-This capability also allows nested function calls to be made from
-inside Vertical-First Vector loops, which is very rare for Vector ISAs.
-
-Strict Program Order is also preserved by the Parallel Reduction
-REMAP Schedule, but only at the cost of requiring the destination
-Vector to be used (Deterministically) to store partial progress of the
-Parallel Reduction.
+REMAP SPRs if in use at the time, which may be determined by
+`SVSTATE[32:46]` being non-zero).
+
+*Programmer's note: Trap Handlers (and any stack-based context save/restore)
+must avoid the use of SVP64 Prefixed instructions to perform the necessary
+save/restore of Simple-V Architectural State (SPR SVSTATE),
+just as use of FPRs and VSRs is presently avoided.
+However, once `SVSTATE` has been saved and set to a known-good value,
+SVP64 Prefixed instructions may be used to save/restore GPRs, SPRs, FPRs
+and other state.*
+
+*Programmer's note: SVSHAPE0-3 alter Element Execution Order, but only
+if activated in `SVSTATE`. It is therefore technically possible in a Trap
+Handler to save `SVSTATE` (`mfspr t0, SVSTATE`), then clear bits 32-46.
+At this point it becomes safe to use SVP64 to save sequential batches
+of SPRs (`setvli MAXVL=VL=4; sv.mfspr *t0, *SVSHAPE0`).*
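
The bit arithmetic behind "clear bits 32-46" can be sketched in Python as follows. This is an illustrative model under stated assumptions: `SVSTATE` is taken to be a 64-bit value using the Power ISA MSB0 bit-numbering convention (bit 0 is the most significant), and the helper names `msb0_field_mask`, `remap_active`, and `clear_remap` are hypothetical.

```python
SVSTATE_WIDTH = 64  # assumption: SVSTATE is a 64-bit SPR


def msb0_field_mask(start, end, width=SVSTATE_WIDTH):
    """Mask covering MSB0-numbered bits start..end inclusive."""
    nbits = end - start + 1
    shift = width - 1 - end  # distance of the field's last bit from the LSB
    return ((1 << nbits) - 1) << shift


# Bits 32-46 (MSB0), the field the text tests for being non-zero.
REMAP_ENABLE_MASK = msb0_field_mask(32, 46)


def remap_active(svstate):
    """True if any of SVSTATE bits 32-46 is set."""
    return (svstate & REMAP_ENABLE_MASK) != 0


def clear_remap(svstate):
    """Return svstate with bits 32-46 cleared, all other bits unchanged."""
    all_ones = (1 << SVSTATE_WIDTH) - 1
    return svstate & ~REMAP_ENABLE_MASK & all_ones
```

Under these assumptions the mask works out to `0xFFFE0000`, so a handler modelled this way would keep the saved copy of `SVSTATE` intact and write back only the cleared value.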
The only major caveat for REMAP is that
after an explicit change to
the re-mapping Indices were. Obvious examples include Interrupts occurring
in the middle of a non-RADIX2 Matrix Multiply Schedule (5x3 by 3x3
for example), which
-will force implementations to perform divide and modulo
+will force some implementations to perform divide and modulo
calculations.
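
To see why non-power-of-two dimensions call for divide and modulo, consider recovering loop indices from a flat step count. The sketch below assumes a plain row-major triple loop (outer `i`, then `j`, then inner `k`), which may not match the actual REMAP Matrix Schedule ordering; `flat_step` and `matrix_indices` are hypothetical names used only for illustration.

```python
def flat_step(i, j, k, cols, inner):
    """Flat step number for indices (i, j, k) in the assumed loop order."""
    return (i * cols + j) * inner + k


def matrix_indices(step, cols, inner):
    """Recover (i, j, k) from a flat step number.

    With dimensions of 3 there is no shift-and-mask shortcut, so a
    true divide and modulo is needed at each level.
    """
    rest, k = divmod(step, inner)
    i, j = divmod(rest, cols)
    return i, j, k


# 5x3 by 3x3 example from the text: 5 rows, inner dimension 3, 3 columns.
ROWS, INNER, COLS = 5, 3, 3

# Indices at which a loop interrupted at flat step 22 would resume.
resumed = matrix_indices(22, COLS, INNER)
```

If every dimension were a power of two, both `divmod` calls could be replaced by a shift and a mask, which is why only non-RADIX2 schedules incur this cost.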
An additional caveat involves Condition Register Fields