From d998d3169c55961effc4e75770d5f892207cd754 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Thu, 18 May 2023 23:50:27 +0000
Subject: [PATCH] clarification, spelling

---
 conferences/opensearch2023/opensearch2023.tex | 42 +++++++++----------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/conferences/opensearch2023/opensearch2023.tex b/conferences/opensearch2023/opensearch2023.tex
index 75daaa614..e53aa60d2 100644
--- a/conferences/opensearch2023/opensearch2023.tex
+++ b/conferences/opensearch2023/opensearch2023.tex
@@ -123,17 +123,7 @@ limitations - in just two lines of code\footnote[1]{with the proviso that
 the Programmer must be mindful of both the starting point
 and what they set MAXVL to. Hardware will helpfully remind
 them of any Register File overruns
-by happily throwing an Illegal Instructionp}.
-
-On top of these very basic but
-already-profound\footnote[2]{with hardware and ISA Architectural
-requirements that deal with the increased Dependency
-Hazard Management, too detailed to list in full in
-this document, the most important being that the total number of
-registers be a fixed \textbf{and mandatory} Standards-defined quantity}
-beginnings, Predication and Conditional-Exit
-can be added. Predication is found in every GPU ISA, and Conditional-Exit
-is a 50-year invention dating back to Zilog Z80 CPIR and LDIR.
+by happily throwing an Illegal Instruction}.
 
 \begin{verbatim}
   for i in range(VL):
@@ -145,22 +135,30 @@ is a 50-year invention dating back to Zilog Z80 CPIR and LDIR.
       break
 \end{verbatim}
 
+On top of these very basic but
+already-profound\footnote[2]{caveats: with hardware and ISA Architectural
+ requirements that deal with the increased Dependency
+ Hazard Management, too detailed to list in full in
+ this document, the most important being that the total number of
+ registers be a fixed \textbf{and mandatory} Standards-defined quantity}
+beginnings, Predication and Conditional-Exit
+can be added. Predication is found in every GPU ISA, and Conditional-Exit
+is a 50-year invention dating back to Zilog Z80 CPIR and LDIR.
+
 Additionally the concept may be introduced from ARM SVE and
 RISC-V RVV "Fault-First" on Load and Store, where if an
 Exception would occur then the Hardware informs the programmer
 that the Vector operation is truncated:
 
 \begin{verbatim}
-  for i in range(VL):
-    if predicate.bit[i] clear:
-      continue
-    EffectiveAddress = GPR(RA+i) + Immediate
+  for i in range(VL):
+    if predicate.bit[i] clear:
+      continue
+    EffectiveAddress = GPR(RA+i) + Immediate
     if Exception@(EffectiveAddress):
-      if i == 0: RAISE Exception
-      else:
-        VL = i
-        break
-    GPR(RT+i) = Mem@(EffectiveAddress)
+      if i == 0: RAISE Exception
+      else: VL = i; break # truncate
+    GPR(RT+i) = Mem@(EffectiveAddress)
 \end{verbatim}
 
 The important facet of both these "Conditional Truncation" constructs
@@ -184,8 +182,8 @@ presents some unique challenges for an ISA and hardware,
 the primary being that in a SIMD (parallel) context, strncpy operates
 in bytes where SIMD operates in power-of-two multiples only.
 PackedSIMD is the worst offender: PredicatedSIMD is marginally
-better\footnote[3]{caveat: if extended properly, as was
-done successfully, with huge beneficial effect, in ARM SVE}.
+better\footnote[3]{caveat: if designed properly, as was
+done successfully in ARM SVE}.
 If SIMD Load and Store has to start on an Aligned Memory location,
 which is a common limitation, things get even worse. The operations
 that were supposed to speed
-- 
2.30.2