%
\begin{abstract}
- Under normal circumstances Search and AI algorithm implementers are left with the
- unenviable task of optimising code for hardware that they had no input into its
- design, and if by chance the original designers of the hardware or crucially the
- ISA happened to have tested a particular algorithm and thought hard about it, software
+ Under normal circumstances, Search and AI algorithm implementers are
+ left with the unenviable task of optimising code for hardware whose
+ design they had no input into. If by chance the original
+ designers of the hardware, or crucially of the ISA, happened to have
+ tested a particular algorithm and thought hard about it, software
writers might end up with optimal performance and power consumption.
- If however they step outside of that box there is nothing that they can do other
- than to search for alternative hardware on which to optimally implement a Search
- algorithm, or to tolerate the sub-par performance and power usage.
- Whilst SVP64 will ultimately likely suffer this same fate at some point, the
- opportunity exists during this early phase its lifecycle to look closely at
- Search and AI algorithms to see if there is anything that can be done. Early
- exploration showed that a paralleliseable Vector strncpy can be implemented in
- as little as ten SVP64 Assembler instructions.
+ If, however, they step outside of that box, there is nothing that they
+ can do other than search for alternative hardware on which to
+ implement a Search algorithm optimally, or tolerate the sub-par
+ performance and power usage. Whilst SVP64 will likely
+ suffer this same fate at some point, the opportunity exists during
+ this early phase of its lifecycle to look closely at Search and AI
+ algorithms to see if there is anything that can be done. Early
+ exploration showed that a parallelisable Vector strncpy can be
+ implemented in as few as ten SVP64 Assembler instructions.
\end{abstract}
\section{Introduction to SVP64}
-The basic principle of SVP64 is to turn Vectorisation into a type of
-Scalar Loop Construct. This is what SIMD and normal Vector ISAs look like:
+The basic principle of SVP64 is to turn Vectorisation into a type
+of Scalar Loop Construct. This is what SIMD and normal Vector ISAs
+look like:
\begin{verbatim}
for i in range(SIMDlength):