+\frame{\frametitle{Why are overlaps allowed in Regfiles?}
+
+ \begin{itemize}
+ \item Same register(s) can have multiple ``interpretations''
+ \item xBitManip plus SIMD plus xBitManip = Hi/Lo bitops
+ \item (32-bit GREV plus 4x8-bit SIMD plus 32-bit GREV:\\
+ GREV @ VL=N,wid=32; SIMD @ VL=Nx4,wid=8)
+ \item RGB 565 (video): BEXTW plus 4x8-bit SIMD plus BDEPW\\
+ (BEXT/BDEP @ VL=N,wid=32; SIMD @ VL=Nx4,wid=8)
+ \item Same register(s) can be offset (no need for VSLIDE)\vspace{6pt}
+ \end{itemize}
+ Note:\vspace{10pt}
+ \begin{itemize}
+ \item xBitManip reduces O($N^{6}$) SIMD down to O($N^{3}$)
+ \item Hi-Performance: Macro-op fusion (more pipeline stages?)
+ \end{itemize}
+}
+
+
+\frame{\frametitle{Why no Zeroing (place zeros in non-predicated elements)?}
+
+ \begin{itemize}
+ \item Zeroing is an implementation optimisation favouring OoO\vspace{8pt}
+ \item Simple implementations may skip non-predicated operations\vspace{8pt}
+ \item Zeroing forces Simple implementations to explicitly destroy data\vspace{8pt}
+ \item Complex implementations may use reg-renames to save power\\
+ Zeroing on predication chains makes optimisation harder
+ \end{itemize}
+ Considerations:\vspace{10pt}
+ \begin{itemize}
+ \item Complex not really impacted, Simple impacted a LOT
+ \item Overlapping ``Vectors'' may issue overlapping ops
+ \item Please don't use Vectors for ``security'' (use Sec-Ext)
+ \end{itemize}
+}
+% With overlapping "vectors" (bearing in mind that "vectors" are
+% just a remap onto the standard register file): if the top bits of
+% the predicate are zero, and a second vector happens to use some of
+% the same registers that are predicated out of the first, then the
+% second vector op may be issued *at the same time*, provided there
+% are parallel ALUs available to do so.
+
+
+\frame{\frametitle{Predication key-value CSR store}