From fa163b71f736c37a8e749530496032ef2fc0fe83 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 19 Aug 2022 06:43:13 +0100
Subject: [PATCH]

---
 openpower/sv/ldst/discussion.mdwn | 58 +++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)
 create mode 100644 openpower/sv/ldst/discussion.mdwn

diff --git a/openpower/sv/ldst/discussion.mdwn b/openpower/sv/ldst/discussion.mdwn
new file mode 100644
index 000000000..a6a113e3d
--- /dev/null
+++ b/openpower/sv/ldst/discussion.mdwn
@@ -0,0 +1,58 @@
+# notes from lxo
+
+this section covers assembly notation for the immediate and indexed LD/ST.
+the summary is that in immediate mode for LD it is not clear that if the
+destination register is Vectorised `RT.v` but the source `imm(RA)` is scalar
+the memory being read is *still a vector load*, known as "unit or element strides".
+
+This anomaly is made clear with the following notation:
+
+    sv.ld RT.v, imm(RA).v
+
+The following notation, although technically correct due to being implicitly identical to the above, is prohibited and is a syntax error:
+
+    sv.ld RT.v, imm(RA)
+
+Notes taken from IRC conversation
+
+    sv.ld r#.v, ofst(r#).v -> the whole vector is at ofst+r#
+    sv.ld r#.v, ofst(r#.v) -> r# is a vector of addresses
+    similarly sv.ldx r#.v, r#, r#.v -> whole vector at r#+r#
+    whereas sv.ldx r#.v, r#.v, r# -> vector of addresses
+    point being, you take an operand with the "m" constraint (or other memory-operand constraints), append .v to it and you're done addressing the in-memory vector
+    as in asm ("sv.ld1 %0.v, %1.v" : "=r"(vec_in_reg) : "m"(vec_in_mem));
+    (and ld%U1 got mangled into underline; %U expands to x if the address is a sum of registers
+
+permutations of vector selection, to identify above asm-syntax:
+
+    imm(RA)  RT.v RA.v nonstrided
+        sv.ld r#.v, ofst(r#2.v) -> r#2 is a vector of addresses
+          mem@     0+r#2   offs+(r#2+1)  offs+(r#2+2)
+          destreg  r#      r#+1          r#+2
+    imm(RA)  RT.s RA.v nonstrided
+        sv.ld r#, ofst(r#2.v) -> r#2 is a vector of addresses
+          (dest r# is scalar) -> VSELECT mode
+    imm(RA)  RT.v RA.s fixed stride: unit or element
+        sv.ld r#.v, ofst(r#2).v -> whole vector is at ofst+r#2
+          mem@r#2  +0   +1   +2
+          destreg  r#   r#+1 r#+2
+        sv.ld/els r#.v, ofst(r#2).v -> vector at ofst*elidx+r#2
+          mem@r#2  +0 ...  +offs ...  +offs*2
+          destreg  r#      r#+1       r#+2
+    imm(RA)  RT.s RA.s not vectorised
+        sv.ld r#, ofst(r#2)
+
+indexed mode:
+
+    RA,RB    RT.v RA.v RB.v
+        sv.ldx r#.v, r#2, r#3.v -> whole vector at r#2+r#3
+    RA,RB    RT.v RA.s RB.v
+        sv.ldx r#.v, r#2.v, r#3.v -> whole vector at r#2+r#3
+    RA,RB    RT.v RA.v RB.s
+        sv.ldx r#.v, r#2.v, r#3 -> vector of addresses
+    RA,RB    RT.v RA.s RB.s
+        sv.ldx r#.v, r#2, r#3 -> VSPLAT mode
+    RA,RB    RT.s RA.v RB.v
+    RA,RB    RT.s RA.s RB.v
+    RA,RB    RT.s RA.v RB.s
+    RA,RB    RT.s RA.s RB.s not vectorised
-- 
2.30.2
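
The three immediate-mode cases with a vector destination in the notes above differ only in how each element's effective address is formed. The following is a rough Python sketch of that arithmetic, not taken from the patch or from any simulator. The function names, the `regs` mapping from register number to register contents, and the 8-byte element width for `ld` are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumption: ld moves 64-bit doublewords and no element-width override is in use.
ELWIDTH_BYTES = 8


def unit_stride(regs: Dict[int, int], ra: int, ofst: int, vl: int,
                elwidth: int = ELWIDTH_BYTES) -> List[int]:
    """sv.ld RT.v, ofst(RA).v : the whole vector sits contiguously at ofst+RA."""
    base = regs[ra] + ofst
    return [base + i * elwidth for i in range(vl)]


def element_stride(regs: Dict[int, int], ra: int, ofst: int, vl: int) -> List[int]:
    """sv.ld/els RT.v, ofst(RA).v : element i is at ofst*i + RA."""
    return [regs[ra] + ofst * i for i in range(vl)]


def vector_of_addresses(regs: Dict[int, int], ra: int, ofst: int, vl: int) -> List[int]:
    """sv.ld RT.v, ofst(RA.v) : registers RA, RA+1, ... each hold one address."""
    return [regs[ra + i] + ofst for i in range(vl)]


if __name__ == "__main__":
    # Hypothetical register contents, chosen only to make the three cases differ.
    regs = {2: 0x1000, 3: 0x2000, 4: 0x3000}
    for fn in (unit_stride, element_stride, vector_of_addresses):
        print(fn.__name__, [hex(a) for a in fn(regs, ra=2, ofst=16, vl=3)])
```

In this sketch the two fixed-stride forms read only one register (RA scalar), while the nonstrided form reads `vl` consecutive registers. That is the distinction the `.v` placement, inside or outside the parentheses, is meant to make visible.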