-[[!tag standards]]
-
# Simple-V Compliancy Levels
The purpose of the Compliancy Levels is to provide a documented
all SPRs including all reserved SPRs, all SVP64-related Context
instructions (REMAP), as well as for the entire SVP64 Prefix space.
+*Even if the Power ISA Scalar Specification states that a given
+Scalar
+instruction need not or must not raise an illegal instruction on UNDEFINED
+behaviour, unimplemented parts of SVP64 *MUST* raise an illegal
+instruction trap when (and only when)
+that same Scalar instruction is Prefixed*. It is absolutely critical
+to note that when not Prefixed, under no circumstances shall the Scalar
+instruction deviate from the Scalar Power ISA Specification.
+
Summary of Compliancy Levels, each Level includes all lower levels:
+* **Zero-Level**: Simple-V is not implemented (at all) in hardware. This
+ Level is required to be listed because all capabilities of Simple-V
+ must be Soft-emulatable.
* **Ultra-embedded**: `setvl` instruction and context-switching of SVSTATE
- to/from SVSRR1. Register Files as Standard Power ISA.
+ to/from SVSRR1. Register Files as Standard Power ISA. `scalar identity`
+ implemented.
* **Embedded**: `svstep` instruction,
and support for Hardware for-looping
in both Horizontal-First and Vertical-First Mode as well as Predication
(Single and Twin) for the GPRs r3, r10 and r30. CR-Field-based
- Predicates, if used, may still raise illegal instruction trap.
-* **DSP/AV**: 128 registers,
+ Predicates do not need to be added.
+* **Embedded DSP/AV**: 128 registers,
element-width
overrides, and Saturation and Mapreduce/Iteration Modes.
+* **High-end DSP/AV**: Same as Embedded DSP/AV except also
+ including Indexed and Offset REMAP capability.
* **3D/Advanced/Supercomputing**: all SV Branch instructions;
crweird and vector-assist instructions (`set-before-first` etc);
+ Swizzle Move instructions;
Matrix, DCT/FFT and Indexing
REMAP capability; Fail-First and Predicate-Result Modes.
permitted to declare meeting the 3D/Advanced Level unless implementing
*all* REMAP Capabilities.
-# Ultra-Embedded Level
+**Power ISA Compliancy Levels**
+
+The SV Compliancy Levels have nothing to do with the Power ISA Compliancy
+Levels (SFS, SFFS, Linux, AIX). They are separate and independent. It
+is perfectly fine to implement Ultra-Embedded on AIX, and perfectly fine to implement 3D/Advanced on SFS. **Compliance with SV Levels does not convey or remove the obligation of Compliance with SFS/SFFS/Linux/AIX Levels and vice-versa**.
+
+## Zero-Level
+
+This level exists to stress a critical point: any Simple-V feature
+that is attempted on hardware with no support at all for Simple-V
+is **required** to raise an Illegal Instruction Exception.
+**This includes existing Power ISA Implementations:** IBM POWER being
+the most notable.
+
+With parts of the Power ISA being "silent executed" (hints for example),
+it is absolutely critical to have all capabilities of Simple-V sit
+within full Illegal Instruction space of existing and future Hardware.
+
+## Ultra-Embedded Level
This level exists as an entry-level into SVP64, most suited to resource
-constrained soft cores, or Hardware implementations where cost is a
+constrained soft cores, or Hardware implementations where unit cost is a much
higher priority than execution speed.
This level sets the bare minimum requirements, where everything with the
-exception of the `setvl` instruction may be software-emulated through
-JIT Translation or Illegal Instruction traps. SVSTATE joins MSR and PC
+exception of `scalar identity` and
+the `setvl` instruction may be software-emulated through
+JIT Translation or Illegal Instruction traps. SVSTATE, as effectively
+a Sub-Program-Counter, joins MSR and PC (CIA, NIA)
as direct peers and must be switched on any context-switch (Trap or
Exception)
* Any SV instructions not implemented
* any unimplemented SV Context SPRs read or written
* all unimplemented uses of the SVP64 Prefix
+* non-scalar-identity SVP64 instructions
Implementors are free and clear to implement any other features of
SVP64 however only by meeting all of the mandatory requirements above
will Compliance with the Ultra-Embedded Level be achieved.
-# Embedded Level
+Note that `scalar identity` is defined as being when the execution of
+an SVP64 Prefixed instruction is identical in every respect to
+Scalar non-prefixed, i.e. as if the Prefix had not been present.
+Additionally all SV SPRs must be zero and the 24-bit `RM` field must be zero.
+
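The `scalar identity` condition above can be sketched as a small predicate. This is only an illustrative model, not part of the specification: the function name `is_scalar_identity`, the `rm_field` parameter and the `sv_sprs` mapping are hypothetical, and the sketch takes the wording "all SV SPRs must be zero and the 24-bit `RM` field must be zero" literally.

```python
from typing import Mapping


def is_scalar_identity(rm_field: int, sv_sprs: Mapping[str, int]) -> bool:
    """Return True when a Prefixed instruction would behave as plain Scalar.

    rm_field: the 24-bit RM field of the SVP64 Prefix (hypothetical parameter).
    sv_sprs:  current values of the SV SPRs, keyed by name (hypothetical model).
    """
    if not 0 <= rm_field < (1 << 24):
        raise ValueError("RM is a 24-bit field")
    # Literal reading of the text: RM must be zero and every SV SPR must be zero.
    return rm_field == 0 and all(value == 0 for value in sv_sprs.values())
```

Under this reading, an Ultra-Embedded implementation would execute a Prefixed instruction directly only when the predicate holds, and raise an Illegal Instruction trap otherwise.
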
+## Embedded Level
This level is more suitable for Hardware implementations where performance and power saving begins to matter. A second instruction, `svstep`, used
by Vertical-First Mode, is required, as is hardware-level looping in
Another important aspect is that when Rc=1 is set, CR Field Vector co-results
are produced. Should these exceed CR7 (CR8-CR127) and the number of CR Fields
has not been increased to 128 then an Illegal Instruction Trap must be
-raised. In practical terms, to avoid this scenario, MAXVL should not
-exceed 8 for Arithmetic or Logical operations, when Rc=1.
+raised. In practical terms, to avoid this occurrence in Embedded software,
+MAXVL should not
+exceed 8 for Arithmetic or Logical operations with Rc=1.
Zeroing on source and destination for Predicates
must also be supported (sz, dz) however
Overrides is also optional.
One of the important side-benefits of this SV Compliancy Level is that it
-brings Hardware-level support for Predication to the entire Scalar Power
+brings Hardware-level support for Scalar Predication (VL=MAXVL=1)
+to the entire Scalar Power
ISA, completely without
-modifying the Scalar Power ISA. The cost is that instructions are Prefixed
+modifying the Scalar Power ISA. The cost in software is that Predicated
+instructions are Prefixed
to 64-bit.
-# DSP / Audio / Video Level
+## DSP / Audio / Video Level
This level is best suited to high-performance power-efficient but
specialist Compute workloads. 128 GPRs, FPRs and CR Fields are all
required, as is element-width overrides to allow data processing
-down to the 8-bit level.
+down to the 8-bit level. SUBVL support (Sub-Vector vec2/3/4) is also
+required, as is Pack/Unpack EXTRA format (helps with Pixel and
+Audio Stream Structured data).
-All SVP64 Modes are required to be implemented in hardware: Saturation
+All SVP64 Modes must be implemented in hardware: Saturation
in particular is a necessity for Audio DSP work. Reduction as well to
assist with Audio/Video.
-It is not mandatory for this Level to have DCT/FFT REMAP Capability but
+It is not mandatory for this Level to have DCT/FFT REMAP Capability in
+hardware but
due to the high prevalence of DCT and FFT in Audio, Video and DSP
-workloads it is strongly recommended.
+workloads it is strongly recommended. Matrix (Dimensional) REMAP
+and Swizzle may also be useful to help with 24-bit (3 byte) Structured Audio Streams and are also recommended but not mandatory.
+
+## High-end DSP
+
+In this Compliancy Level the benefit of the Offset and Index REMAP
+subsystem becomes worth its hardware cost. In lower-performing DSP
+and A/V workloads it does not.
+
+## 3D / Advanced / Supercomputing
+
+This Compliancy Level is for highest performance and energy efficiency.
+All aspects of SVP64 must be entirely implemented, in full, in Hardware.
+How that is achieved is entirely at the discretion of the implementor:
+there are no hard requirements of any kind on the level of performance,
+just as there are none in the Vulkan(TM) Specification.
+
+Throughout the SV
+Specification however there are hints to Micro-Architects: byte-level
+write-enable lines on Register Files are strongly recommended, for
+example, in order to avoid unnecessary Read-Modify-Write cycles and
+additional Register Hazard Dependencies on fine-grained (8/16/32-bit)
+operations. Just as with SRAMs, multiple write-enable lines may be
+raised to update higher-width elements.
+
+## Examples
+
+Assuming that hardware implements scalar operations only,
+and implements predication but not elwidth overrides:
+
+ setvli r0, 4 # sets VL equal to 4
+ sv.addi r5, r0, 1 # raises an 0x700 trap
+ setvli r0, 1 # sets VL equal to 1
+ sv.addi r5, r0, 1 # gets executed by hardware
+ sv.addi/ew=8 r5, r0, 1 # raises an 0x700 trap
+ sv.ori/sm=EQ r5, r0, 1 # executed by hardware
+
+The first `sv.addi` raises an illegal instruction trap because
+VL has been set to 4, and this is not supported. Likewise,
+elwidth overrides, if requested, always raise illegal instruction
+traps.
+
+Such an implementation would qualify for the "Ultra-Embedded" SV Level.
+It would not qualify for the "Embedded" level because when VL=4 an
+Illegal Exception is raised, and the Embedded Level requires full
+VL Loop support in hardware.
+
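The decision made by the example hardware can be modelled in a few lines. This is a minimal sketch under stated assumptions, not an implementation of any real decoder: `PrefixedOp`, `runs_in_hardware` and `TRAP_VECTOR` are hypothetical names, and treating every VL other than 1 as unsupported is a simplification chosen for this sketch.

```python
from dataclasses import dataclass

TRAP_VECTOR = 0x700  # trap value quoted in the example listing


@dataclass(frozen=True)
class PrefixedOp:
    """Hypothetical summary of one SVP64-Prefixed instruction."""
    vl: int                          # current Vector Length when it is issued
    elwidth_override: bool = False   # e.g. /ew=8
    predicated: bool = False         # e.g. /sm=EQ


def runs_in_hardware(op: PrefixedOp) -> bool:
    """True if the example hardware executes the op, False if it traps."""
    if op.vl != 1:
        return False   # scalar operations only: no VL loop in hardware (assumption)
    if op.elwidth_override:
        return False   # elwidth overrides are not implemented
    return True        # predication is implemented, so it never causes a trap here


for op in (PrefixedOp(vl=4), PrefixedOp(vl=1),
           PrefixedOp(vl=1, elwidth_override=True),
           PrefixedOp(vl=1, predicated=True)):
    outcome = "executed" if runs_in_hardware(op) else f"trap {TRAP_VECTOR:#x}"
    print(op, outcome)
```

The four cases in the loop correspond to the four `sv.` instructions in the listing, in order.
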
+[[!tag standards]]
+
+-------
+
+\newpage{}
+
+