\begin{itemize}
\item EVERY register operation is inherently parallelised\\
(scalar ops are just vectors of length 1)\vspace{4pt}
+ \item Tightly coupled with the core (instruction issue)\\
+ could be disabled through MISA switch\vspace{4pt}
\item An extra pipeline phase is pretty much essential\\
for fast low-latency implementations\vspace{4pt}
- \item Assuming an instruction FIFO, N ops could be taken off\\
- of a parallel op per cycle (avoids filling entire FIFO;\\
- also is less work per cycle: lower complexity / latency)\vspace{4pt}
\item With zeroing off, skipping non-predicated elements is hard:\\
it is however only an optimisation (and could be omitted).\vspace{4pt}
\item Setting up the Register/Predication tables (interpreting the\\