From 4753815656c9094b49d90949fcc7687849494f6b Mon Sep 17 00:00:00 2001
From: lkcl
Date: Tue, 4 Oct 2022 06:23:06 +0100
Subject: [PATCH]

---
 openpower/sv/svp64/discussion.mdwn | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/openpower/sv/svp64/discussion.mdwn b/openpower/sv/svp64/discussion.mdwn
index a5ebcfe86..9da9f5383 100644
--- a/openpower/sv/svp64/discussion.mdwn
+++ b/openpower/sv/svp64/discussion.mdwn
@@ -220,6 +220,13 @@ four aspects:
 3. is the loss of the dynamic meaning "VL=0" nop effect important?
 4. why would "sv.op all-scalar" be inside a loop in the first place?
 
+Summary so far:
+
+* failfirst needs to be an illegal exception if all-scalar
+* non-zeroing predication on all-scalar with VL>1 requires
+  all relevant bits to be set, this changes to the **first**
+  bit for auto-VL=1
+
 ## answers to 2, RM Modes
 
 **Normal Mode:**
@@ -297,7 +304,8 @@ predication-based offsets (and REMAP)
 (not to be confused with VSPLAT mode).
 
 Answer:
-    No, scalar-mode requires RA.isvec=0 RT.isvec=0, but VSPLAT is RA.isvec=0 RT.isvec=1.
+    No, scalar-mode requires RA.isvec=0 RT.isvec=0, but
+    VSPLAT is RA.isvec=0 RT.isvec=1.
 
 VL>1 at the moment, with a scalar source and scalar dest, will not
 undergo any changes to the EA compared to if VL=1.
@@ -315,16 +323,24 @@ interfered with, except that, again, RT may be set as a vector destination.
     EA = ireg[RA] + ireg[RB]*j # register-strided
 ```
 
-Vector destination is again "VLSPLAT" mode, but if a Scalar
+Vector destination is again "VSPLAT" mode, but if a Scalar
 destination was set with VL>1, then just as with LD-immediate
 it is the entire predicate mask which must be zero to stop the
 scalar element from being loaded, and the same effect may be
 achieved with VL=1 by ORing all predicate mask bits down to a
 single bit as a new predicate.
 
+**CR ops**
+
+TODO
+
+**Branch-Conditional**
+
+TODO
+
 ## answers to 4, loops/uses
 
-### REMAP
+**REMAP**
 
 A REMAP would redirect operations from the first nonmasked predicated
 element to the first **REMAPped** element, and combined
@@ -341,7 +357,7 @@ answer: use at least one vector source. this solves the predication issue.
 question: does this impact LD/ST which has special overrides and
 mode-selection based on RA.isvec?
 
-### predication
+**predication**
 
 with nonzeroing the application of a predicate mask to an all-scalar
 operation effectively tests **ALL** relevant bits 0..VL-1 as nonzero in the
@@ -350,7 +366,7 @@ decision-making, whereas VL=1 will only test the first.
 a need for merging (ORing) all bits into a single alternative
 predicate mask (single-bit) is the sort of thing we can probably live with.
 
-### fast traditional packed SIMD
+## fast traditional packed SIMD
 
 A major motivation for changing SVP64 with all isvec=0 to temporarily
 override VL to 1 is to allow supporting traditional SIMD that has
-- 
2.30.2