\item Does not require sacrificing 32-bit Major Opcodes.
\item Does not require adding duplicates of instructions
(popcnt, popcntw, popcntd, vpopcntb, vpopcnth, vpopcntw, vpopcntd)
+\item Fully abstracted: does not create Micro-architectural dependencies
+ (no fixed ``Lane'' size).
\item Specifically designed to be easily implemented
on top of an existing Micro-architecture (especially
Superscalar Out-of-Order Multi-issue) without
dramatically reduced instruction count, with power consumption expected
to fall greatly. Normally found only in high-end \acs{VLIW} \acs{DSP}
(TI MSP, Qualcomm Hexagon)
-\item Fail-First Load/Store allows strncpy to be implemented in around 14
+\item Fail-First Load/Store allows Vectorised high performance
+ strncpy to be implemented in around 14
instructions (hand-optimised \acs{VSX} assembler is 240).
\item Inner loop of MP3 implemented in under 100 instructions
(gcc produces 450 for the same function on POWER9).
registers of 64-bit length into smaller 8-, 16-, 32-bit pieces.
\cite{SIMD_HARM}\cite{SIMD_HPC}
These partitions can then be operated on simultaneously, with the initial values
-and results being stored as entire 64-bit registers. The SIMD instruction opcode
- includes the data width and the operation to perform.
+and results being stored as entire 64-bit registers (\acs{SWAR}).
+The SIMD instruction opcode
+includes the data width and the operation to perform.
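\par
The partitioned operation described above can be emulated in software.
The following is a minimal sketch only, assuming a packed wrap-around
addition; the name \texttt{swar\_add} and its \texttt{lane\_bits} parameter
are hypothetical and are not taken from any ISA. It shows why the data
width has to be known: carries must not cross from one partition into
the next.

```python
MASK64 = (1 << 64) - 1


def swar_add(a, b, lane_bits):
    """Add two 64-bit values partition by partition (hypothetical helper).

    Each partition wraps around independently; no carry crosses into
    the neighbouring partition.
    """
    assert lane_bits in (8, 16, 32)
    lanes = 64 // lane_bits

    # Mask with only the most significant bit of every partition set.
    high = 0
    for i in range(lanes):
        high |= 1 << (i * lane_bits + lane_bits - 1)
    low = MASK64 & ~high

    # Add with the top bit of each partition cleared, so a carry can
    # reach that top bit but cannot escape the partition.
    partial = (a & low) + (b & low)

    # Fold the original top bits back in: top = a_top ^ b_top ^ carry_in.
    return (partial ^ ((a ^ b) & high)) & MASK64


if __name__ == "__main__":
    a = 0x0001_FFFF_0003_0004
    b = 0x0001_0001_0001_0001
    print(hex(swar_add(a, b, 16)))   # partitioned 16-bit add
    print(hex((a + b) & MASK64))     # plain 64-bit add, for comparison
```

With 16-bit partitions the second partition (\texttt{0xFFFF + 0x0001})
wraps to zero without disturbing its neighbour, whereas the plain 64-bit
addition lets that carry propagate into the partition above.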
\par
\begin{figure}[hb]
Multi-issue decoding
\end{itemize}
-\subsection{Vector Architectures}
+\subsection{Scalable Vector Architectures}
An older alternative exists to utilise data parallelism: vector
architectures. Vector CPUs collect operands from the main memory, and
store them in large, sequential vector registers.\par
Simple-V's ``Vector'' Registers are specifically designed to fit on top of
the Scalar (GPR, FPR) register files, which are extended from the default
-of 32, to 128 entries in the Libre-SOC implementation. This is a primary
+of 32, to 128 entries in the high-end Compliancy Levels. This is a primary
reason why Simple-V can be added on top of an existing Scalar ISA, and
\textit{in particular} why there is no need to add Vector Registers or
Vector instructions.
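\par
To illustrate what fitting Vectors on top of the Scalar register file
means, the following is a deliberately simplified sketch. It assumes a
128-entry register file and a Vector Length \texttt{vl}; the function name
\texttt{sv\_add} is hypothetical, and predication, element-width overrides
and every other real feature are omitted. It models only the idea of one
scalar operation being repeated over consecutive scalar registers.

```python
MASK64 = (1 << 64) - 1
NUM_REGS = 128  # assumed size of the extended scalar register file


def sv_add(gpr, rt, ra, rb, vl):
    """Apply a scalar 64-bit add to vl consecutive registers.

    Hypothetical model: gpr is a plain list standing in for the scalar
    register file; rt, ra and rb are starting register numbers.
    """
    assert len(gpr) == NUM_REGS
    assert vl >= 1
    assert max(rt, ra, rb) + vl <= NUM_REGS

    for i in range(vl):
        # The same scalar add, stepped through the register file.
        gpr[rt + i] = (gpr[ra + i] + gpr[rb + i]) & MASK64


if __name__ == "__main__":
    gpr = [0] * NUM_REGS
    gpr[32:36] = [1, 2, 3, 4]
    gpr[64:68] = [10, 20, 30, 40]
    sv_add(gpr, rt=96, ra=32, rb=64, vl=4)
    print(gpr[96:100])
```

With \texttt{vl} set to 1 the sketch degenerates to an ordinary scalar add,
which is why no separate Vector register file or Vector opcodes are
needed in this model.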