From: lkcl
Date: Sun, 26 Jun 2022 14:26:24 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1515
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=8524cda6a12782d8819868c5929a166eea110fdd;p=libreriscv.git

---

diff --git a/openpower/sv/svp64/appendix.mdwn b/openpower/sv/svp64/appendix.mdwn
index fc15904a7..365d8a8f5 100644
--- a/openpower/sv/svp64/appendix.mdwn
+++ b/openpower/sv/svp64/appendix.mdwn
@@ -604,7 +604,7 @@ will **not** be overwritten and will **not** be zero'd.
 Note that when SVM is clear and SUBVL!=1 the sub-elements are
 *independent*, i.e. they are mapreduced per *sub-element* as a result.
 
-illustration with a vec2, assuming RA==RT, e.g `sv.add/mr/vec2 r4, r4, r16`
+illustration with a vec2, assuming RA==RT, e.g `sv.add/mr/vec2 r4, r4, r16.v`
 
     for i in range(0, VL):
         # RA==RT in the instruction. does not have to be
@@ -622,28 +622,33 @@ like a traditional Vector Processor Reduction instruction.
 Example for a vec2:
 
     for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
 
 Example for a vec3:
 
     for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
-        iregs[RT+i] = op(iregs[RT+i] , iregs[RA+i].z)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
+        iregs[RT+i] = op(iregs[RT+i] , iregs[RB+i].z)
 
 Example for a vec4:
 
     for i in range(VL):
-        iregs[RT+i] = op(iregs[RA+i].x, iregs[RA+i].y)
-        iregs[RT+i] = op(iregs[RT+i] , iregs[RA+i].z)
-        iregs[RT+i] = op(iregs[RT+i] , iregs[RA+i].w)
+        iregs[RT+i] = op(iregs[RA+i].x, iregs[RB+i].y)
+        iregs[RT+i] = op(iregs[RT+i] , iregs[RB+i].z)
+        iregs[RT+i] = op(iregs[RT+i] , iregs[RB+i].w)
 
 In this mode, when Rc=1 the Vector of CRs is as normal: each result
 element creates a corresponding CR element (for the final, reduced, result).
 
-Note that the destination (RT) is automatically used as an "Accumulator"
-register, and consequently the Sub-Vector Loop is interruptible.
-If RT is a Scalar then as usual the main VL Loop terminates at the
-first predicated element (or the first element if unpredicated).
+Note:
+
+1. that the destination (RT) is inherently used as an "Accumulator"
+   register, and consequently the Sub-Vector Loop is interruptible.
+   If RT is a Scalar then as usual the main VL Loop terminates at the
+   first predicated element (or the first element if unpredicated).
+2. that the Sub-Vector designation applies to RA and RB *but not RT*.
+3. that the number of operations executed is one less than the Sub-vector
+   length
 
 # Fail-on-first
 
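
The vec2/vec3/vec4 pseudocode added in the second hunk can be generalised into one runnable sketch. This is only an illustration of the loop as the `+` lines write it, under several assumptions that are not in the patch: registers are modelled as plain Python lists of tuples (one tuple of sub-elements per element), RT is treated as a Vector, and predication, Rc=1 and interruption are ignored. The function name `subvector_horizontal_reduce` and its parameters are hypothetical.

```python
import operator


def subvector_horizontal_reduce(op, ra, rb):
    """Sketch of the Sub-Vector horizontal reduction shown in the diff.

    ra, rb: lists of equal length VL; each entry is a tuple of SUBVL
    sub-elements (x, y, z, w order). This list-of-tuples layout is an
    assumption made for illustration, not the real register file.
    Returns (rt, op_count): the scalar-per-element results and the
    total number of op() calls performed.
    """
    rt = []
    op_count = 0
    for a, b in zip(ra, rb):
        # First operation pairs RA's .x with RB's .y, as in the "+" lines.
        acc = op(a[0], b[1])
        op_count += 1
        # Remaining sub-elements (.z, .w) of RB are folded into the
        # accumulator, which plays the role of RT[i].
        for sub in b[2:]:
            acc = op(acc, sub)
            op_count += 1
        rt.append(acc)
    return rt, op_count


# vec3 example with VL=2 and op=add.
ra = [(1, 2, 3), (4, 5, 6)]
rb = [(10, 20, 30), (40, 50, 60)]
rt, ops = subvector_horizontal_reduce(operator.add, ra, rb)
print(rt, ops)
```

With SUBVL=3 each element costs two operations, which matches note 3 of the patch (one less than the Sub-vector length), and each result is a single value rather than a sub-vector, which matches note 2.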