From 01fcbdd7e42101f0ccc9277e4a7e4cf21f02f9c0 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 7 May 2023 13:44:21 +0100
Subject: [PATCH]

---
 openpower/sv/svstep.mdwn | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/svstep.mdwn b/openpower/sv/svstep.mdwn
index 133a5daed..7356957bc 100644
--- a/openpower/sv/svstep.mdwn
+++ b/openpower/sv/svstep.mdwn
@@ -138,14 +138,43 @@ found in Vector ISAs.
 Incrementing and iteration through subvector state ssubstep and dsubstep
 is possible with `sv.svstep/vecN` where as expected N may be 2/3/4. However it is
 necessary to use the exact same Sub-Vector qualifier on any Prefixed
-instructions, within any given Vertical-First loop. Also valid
+instructions, within any given Vertical-First loop: `vec2/3/4` is **not**
+automatically applied to all instructions; it must be explicitly applied on
+a per-instruction basis. Also valid
 is not specifying a Sub-vector qualifier at all, but it is
 critically important to note that operations will be
 repeated. For example if `sv.svstep/vec2`
-is used on `sv.addi` then each Vector element operation is
+is not used on `sv.addi` then each Vector element operation is
 repeated twice. The reason is that whilst svstep will be
 iterating through both the SUBVL and VL loops, the addi instruction
-only uses `srcstep` and `dststep`.
+only uses `srcstep` and `dststep` (not ssubstep or dsubstep). Illustrated below:
+
+```
+    def offset():
+        for step in range(VL):
+            for substep in range(SUBVL=2):
+                yield step, substep
+    for i, j in offset():
+        vec2_offs = i * SUBVL + j # calculate vec2 offset
+        addi RT+i, RA+i, 1 # but sv.addi is not vec2!
+        muli/vec2 RT+vec2_offs, RA+vec2_offs, 2 # this is
+```
+
+Actual assembler would be:
+
+```
+    loop:
+    setvl VF=1, CTRmode
+    sv.addi *RT, *RA, 1 # no vec2
+    sv.muli/vec2 *RT, *RA, 2 # vec2
+    sv.svstep/vec2 # must match the muli
+    sv.bc CTRmode, loop # subtracts VL from CTR
+```
+
+This illustrates the correct but seemingly-anomalous behaviour: `sv.svstep/vec2`
+is being requested to update `SVSTATE` to follow a vec2 loop construct. The anomalous
+`sv.addi` is not prohibited as it may in fact be desirable to execute operations twice,
+or to re-load data that was overwritten, and many other possibilities.
 
 -------------
 
-- 
2.30.2