From: Luke Kenneth Casson Leighton
Date: Fri, 17 Jun 2022 17:36:42 +0000 (+0100)
Subject: some minor edits to the primer summary, clarifying
X-Git-Tag: opf_rfc_ls005_v1~1733
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=dc44efac63252d537ab1f105a0290732006cb9be;p=libreriscv.git

some minor edits to the primer summary, clarifying
---

diff --git a/svp64-primer/summary.tex b/svp64-primer/summary.tex
index 3be09e3d7..c503247d1 100644
--- a/svp64-primer/summary.tex
+++ b/svp64-primer/summary.tex
@@ -37,14 +37,10 @@ An older alternative exists to utilise data parallelism - vector
 architectures. Vector CPUs collect operands from the main memory, and
 store them in large, sequential vector registers.\par
 
-Pipelined execution units then perform parallel computations on these
-vector registers. The result vector is then broken up into individual
-results which are sent back into the main memory.\par
-
 A simple vector processor might operate on one element at a time,
-however as the operations are independent by definition \textbf{(where
-is this from?)}, a processor could be made to compute all of the vector's
-elements simultaneously.\par
+however as the element operations are usually independent,
+a processor could be made to compute all of the vector's
+elements simultaneously, taking advantage of multiple pipelines.\par
 
 Typically, today's vector processors can execute two, four, or eight
 64-bit elements per clock cycle\cite{SIMD_HARM}. Such processors can also
@@ -59,7 +55,7 @@ between number of elements, data width and register vector length.
 \label{fig:vl_reg_n}
 \end{figure}
 
-RISCV Vector extension supports a VL of up to $2^{16}$ or $65536$ bits,
+RISC-V Vector extension supports a VL of up to $2^{16}$ or $65536$ bits,
 which can fit 1024 64-bit words \cite{riscv-v-spec}.
 \subsection{Comparison Between SIMD and Vector}
@@ -71,6 +67,7 @@ test test
 \end{verbatim}
 
 \subsection{Shortfalls of SIMD}
+Five digit Opcode proliferation (10,000 instructions) is overwhelming.
 The following are just some of the reasons why SIMD is unsustainable as
 the number of instructions increase:
 \begin{itemize}
@@ -83,10 +80,12 @@ the number of instructions increase:
 
 \subsection{Simple Vectorisation}
 \ac{SV} is a Scalable Vector ISA designed for hybrid workloads (CPU, GPU,
-VPU, 3D?). Includes features normally found only on Cray Supercomputers
-(Cray-1, NEC SX-Aurora) and GPUs. Keeps a strictly simple RISC leveraging
-a scalar ISA by using "Prefixing" No dedicated vector opcodes exist in SV!
+VPU, 3D?). Includes features normally found only on Cray-style Supercomputers
+(Cray-1, NEC SX-Aurora) and GPUs. Keeps to a strict uniform RISC paradigm,
+leveraging a scalar ISA by using "Prefixing".
+\textbf{No dedicated vector opcodes exist in SV, at all}.
 
+\vspace{10pt}
 Main design principles
 \begin{itemize}
     \item Introduce by implementing on top of existing Power ISA
@@ -126,23 +125,21 @@ Advantages include:
       ISAs. No more separate vector instructions.
 \end{itemize}
 
-\subsubsection{Deviations from Power ISA}
-\label{subsubsec:add_to_pow_isa}
-\textit{(TODO: EXPAND)}
-dropping XER.SO for example
-
 \subsubsection{Prefix 64 - SVP64}
-SVP64, is a specification designed to rival existing SIMD implementations by:
+SVP64, is a specification designed to solve the problems caused by
+SIMD implementations by:
 \begin{itemize}
     \item Simplifying the hardware design
     \item Reducing maintenance overhead
+    \item Reducing code size and power consumption
     \item Easier for compilers, coders, documentation
     \item Time to support platform is a fraction of conventional SIMD
       (Less money on R\&D, faster to deliver)
 \end{itemize}
 
-- Intel SIMD is designed to be more capable and has more features, and thus has a greater complexity (?)
+- Intel SIMD has been incrementally added to for decades, requires backwards
+  interoperability, and thus has a greater complexity (?)
 - What are we going to gain?