From 7c383a571f8fdd8c555b676814ee6aceada642d9 Mon Sep 17 00:00:00 2001 From: Luke Kenneth Casson Leighton Date: Thu, 13 Apr 2023 18:53:54 +0100 Subject: [PATCH] move svstep back to mdwn file, out of ls008.mdwn --- openpower/sv/rfc/ls008.mdwn | 136 +----------------------------------- openpower/sv/svstep.mdwn | 114 ++++++++++++++++++++++-------- 2 files changed, 85 insertions(+), 165 deletions(-) diff --git a/openpower/sv/rfc/ls008.mdwn b/openpower/sv/rfc/ls008.mdwn index 79e2a83fe..2e5050b03 100644 --- a/openpower/sv/rfc/ls008.mdwn +++ b/openpower/sv/rfc/ls008.mdwn @@ -110,141 +110,7 @@ is zero or non-zero. \newpage{} -# svstep: Vertical-First Stepping and status reporting - -SVL-Form - -* svstep RT,SVi,vf (Rc=0) -* svstep. RT,SVi,vf (Rc=1) - -| 0-5|6-10|11.15|16..22| 23-25 | 26-30 |31| Form | -|----|----|-----|------|----------|-------|--|--------- | -|PO | RT | / | SVi | / / vf | XO |Rc| SVL-Form | - -Pseudo-code: - -``` - if SVi[3:4] = 0b11 then - # store pack and unpack in SVSTATE - SVSTATE[53] <- SVi[5] - SVSTATE[54] <- SVi[6] - RT <- [0]*62 || SVSTATE[53:54] - else - # Vertical-First explicit stepping. - step <- SVSTATE_NEXT(SVi, vf) - RT <- [0]*57 || step -``` - -Special Registers Altered: - - CR0 (if Rc=1) - -**Description** - -svstep may be used -to enquire about the REMAP Schedule and it may be used to alter Vectorisation -State. When `vf=1` then stepping occurs. -When `vf=0` the enquiry is performed without altering internal -state. If `SVi=0, Rc=0, vf=0` the instruction is a `nop`. - -The following Modes exist: - -* `SVi=0`: appropriately step srcstep, dststep, subsrcstep and subdststep to the next - element, taking pack and unpack into consideration. -* When `SVi` is 1-4 the REMAP Schedule for a given SVSHAPE may be -returned in `RT`. SVi=1 selects SVSHAPE0 current state, -through to SVi=4 selects SVSHAPE3. -* When `SVi` is 5, `SVSTATE.srcstep` is returned. -* When `SVi` is 6, `SVSTATE.dststep` is returned. 
-* When `SVi` is 0b1100 pack/unpack in SVSTATE is cleared -* When `SVi` is 0b1101 pack in SVSTATE is set, unpack is cleared -* When `SVi` is 0b1110 unpack in SVSTATE is set, pack is cleared -* When `SVi` is 0b1111 pack/unpack in SVSTATE are set - -As this is a Single-Predicated (1P) instruction, predication may be applied -to skip (or zero) elements. - -* Vertical-First Mode will return the requested index - (and move to the next state if `vf=1`) -* Horizontal-First Mode can be used to return all indices, - i.e. walks through all possible states. - -**Vectorisation of svstep itself** - -As a 32-bit instruction, `svstep` may be itself be Vector-Prefixed, as -`sv.svstep`. This will work perfectly well in Horizontal-First -as it will in Vertical-First Mode. - -Example: to obtain the full set of possible computed element -indices use `sv.svstep RT.v,SVI,1` which will store all computed element -indices, starting from RT. If Rc=1 then a co-result Vector of CR Fields -will also be returned, comprising the "loop end-points" of each of the inner -loops when either Matrix Mode or DCT/FFT is set. In other words, -for example, when the `xdim` inner loop reaches the end and on the next -iteration it will begin again at zero, the CR Field `EQ` will be set. -With a maximum of three loops within both Matrix and DCT/FFT Modes, -the CR Field's EQ bit will be set at the end of the first inner loop, -the LE bit for the second, the GT bit for the outermost loop and the -SO bit set on the very last element, when all loops reach their maximum -extent. - -*Programmer's note (1): VL in some situations, particularly larger Matrices, -may exceed 64, -meaning that `sv.svshape` returning a considerable number of values. Under -such circumstances `sv.svshape/ew=8` is recommended.* - -*Programmer's note (2): having conveniently obtained a pre-computed -Schedule with `sv.svstep`, -it may then be used as the input to Indexed REMAP Mode -to achieve the exact same Schedule. 
It is evident however that -before use some of the Indices may be arbitrarily altered as desired. -`sv.svstep` helps the programmer avoid having to manually recreate -Indices for certain -types of common Loop patterns, and in its simplest form, without REMAP -(SVi=5 or SVi=6), -is equivalent to the `iota` instruction found in other Vector ISAs* - -**Vertical First Mode** - -Vertical First is effectively like an implicit single bit predicate -applied to every SVP64 instruction. **ONLY** one element in each -SVP64 Vector instruction is executed; srcstep and dststep do **not** -increment, and the Program Counter progresses **immediately** to -the next instruction just as it would for any standard scalar v3.0B -instruction. - -A mode of srcstep (SVi=0) is called which can move srcstep and -dststep on to the next element, still respecting predicate -masks. - -In other words, where normal SVP64 Vectorisation acts "horizontally" -by looping first through 0 to VL-1 and only then moving the PC -to the next instruction, Vertical-First moves the PC onwards -(vertically) through multiple instructions **with the same -srcstep and dststep**, then an explict instruction used to -advance srcstep/dststep. An outer loop is expected to be -used (branch instruction) which completes a series of -Vector operations. - -Testing any end condition of any loop of any REMAP state allows branches to be -used to create loops. - -Programmer's note: when Predicate Non-Zeroing is used this indicates to -the underlying hardware that any masked-out element must be skipped. -*This includes in Vertical-First Mode*, and programmers should be keenly -aware that srcstep or dststep or both *may* jump by more than one as -a result, because the actual request under these circumstances was to execute -on the first available next *non-masked-out* element. - -*Programmers should be aware that VL, srcstep and dststep are global in nature. 
-Nested looping with different schedules is perfectly possible, as is -calling of functions, however SVSTATE (and any associated SVSTATE) should -obviously be stored on the stack in order to achieve this benefit* - -------------- - -\newpage{} - +[[!inline pages="openpower/sv/svstep" raw=yes ]] [[!inline pages="openpower/sv/setvl" raw=yes ]] # SVSTATE SPR diff --git a/openpower/sv/svstep.mdwn b/openpower/sv/svstep.mdwn index 0a8c5a626..e537b7545 100644 --- a/openpower/sv/svstep.mdwn +++ b/openpower/sv/svstep.mdwn @@ -1,42 +1,46 @@ -# svstep +# svstep: Vertical-First Stepping and status reporting -Links +SVL-Form -* pseudocode in [[isa/simplev]] page +* svstep RT,SVi,vf (Rc=0) +* svstep. RT,SVi,vf (Rc=1) -`svstep` performs explicit stepping of the Vector for-loop, -and it can also be used to enquire about the current state -of the REMAP indices and SVSTATE. +| 0-5|6-10|11.15|16..22| 23-25 | 26-30 |31| Form | +|----|----|-----|------|----------|-------|--|--------- | +|PO | RT | / | SVi | / / vf | XO |Rc| SVL-Form | -# Format +Pseudo-code: -*(Allocation of opcode TBD pending OPF ISA WG approval)*, -using EXT22 temporarily and fitting into the -[[sv/bitmanip]] space +``` + if SVi[3:4] = 0b11 then + # store pack and unpack in SVSTATE + SVSTATE[53] <- SVi[5] + SVSTATE[54] <- SVi[6] + RT <- [0]*62 || SVSTATE[53:54] + else + # Vertical-First explicit stepping. + step <- SVSTATE_NEXT(SVi, vf) + RT <- [0]*57 || step +``` -Form: SVL-Form (see [[isatables/fields.text]]) +Special Registers Altered: -| 0.5|6.10|11.15|16..22| 23...25 | 26.30 |31| name | -| -- | -- | --- | ---- |----------- | ----- |--| ------- | -|OPCD| RT | / | SVi | / / vf | 11011 |Rc| svstep | + CR0 (if Rc=1) -Instruction format: +**Description** - svstep RT,SVi,vf (Rc=0) - svstep. RT,SVi,vf (Rc=1) +svstep may be used to enquire about the REMAP Schedule and it may be +used to alter Vectorisation State. When `vf=1` then stepping occurs. +When `vf=0` the enquiry is performed without altering internal state. 
+If `SVi=0, Rc=0, vf=0` the instruction is a `nop`. -# Description - -svstep may be used -to enquire about the REMAP Schedule. When `vf=1` then stepping occurs. When `vf=0` the enquiry is performed - without altering internal -state. If `SVi=0, Rc=0, vf=0` this instruction is a `nop`. -The following modes are identical to those in [[sv/setvl]], returning -identical results: +The following Modes exist: +* `SVi=0`: appropriately step srcstep, dststep, subsrcstep and subdststep + to the next element, taking pack and unpack into consideration. * When `SVi` is 1-4 the REMAP Schedule for a given SVSHAPE may be -returned in `RT`. SVi=1 selects SVSHAPE0 current state, -through to SVi=4 selects SVSHAPE3. + returned in `RT`. SVi=1 selects SVSHAPE0 current state, + through to SVi=4 selects SVSHAPE3. * When `SVi` is 5, `SVSTATE.srcstep` is returned. * When `SVi` is 6, `SVSTATE.dststep` is returned. * When `SVi` is 0b1100 pack/unpack in SVSTATE is cleared @@ -45,15 +49,21 @@ through to SVi=4 selects SVSHAPE3. * When `SVi` is 0b1111 pack/unpack in SVSTATE are set As this is a Single-Predicated (1P) instruction, predication may be applied -to skip (or zero) elements. +to skip (or zero) elements. * Vertical-First Mode will return the requested index (and move to the next state if `vf=1`) * Horizontal-First Mode can be used to return all indices, i.e. walks through all possible states. -To obtain the full set of possible computed element -indices use `svstep RT.v,SVI,1` which will store all computed element +**Vectorisation of svstep itself** + +As a 32-bit instruction, `svstep` may itself be Vector-Prefixed, as +`sv.svstep`. This will work just as well in Horizontal-First +as it will in Vertical-First Mode. + +Example: to obtain the full set of possible computed element +indices use `sv.svstep RT.v,SVi,1` which will store all computed element indices, starting from RT.
If Rc=1 then a co-result Vector of CR Fields will also be returned, comprising the "loop end-points" of each of the inner loops when either Matrix Mode or DCT/FFT is set. In other words, @@ -80,3 +90,47 @@ Indices for certain types of common Loop patterns, and in its simplest form, without REMAP (SVi=5 or SVi=6), is equivalent to the `iota` instruction found in other Vector ISAs* + +**Vertical First Mode** + +Vertical First is effectively like an implicit single bit predicate +applied to every SVP64 instruction. **ONLY** one element in each +SVP64 Vector instruction is executed; srcstep and dststep do **not** +increment, and the Program Counter progresses **immediately** to +the next instruction just as it would for any standard scalar v3.0B +instruction. + +A mode of svstep (SVi=0) is called which can move srcstep and +dststep on to the next element, still respecting predicate +masks. + +In other words, where normal SVP64 Vectorisation acts "horizontally" +by looping first through 0 to VL-1 and only then moving the PC +to the next instruction, Vertical-First moves the PC onwards +(vertically) through multiple instructions **with the same +srcstep and dststep**, then an explicit instruction is used to +advance srcstep/dststep. An outer loop is expected to be +used (branch instruction) which completes a series of +Vector operations. + +Testing any end condition of any loop of any REMAP state allows branches to be +used to create loops. + +Programmer's note: when Predicate Non-Zeroing is used this indicates to +the underlying hardware that any masked-out element must be skipped. +*This includes Vertical-First Mode*, and programmers should be keenly +aware that srcstep or dststep or both *may* jump by more than one as +a result, because the actual request under these circumstances was to execute +on the first available next *non-masked-out* element. + +*Programmers should be aware that VL, srcstep and dststep are global in nature.
+Nested looping with different schedules is perfectly possible, as is +calling of functions, however SVSTATE (and any associated SVSHAPEs) should +obviously be stored on the stack in order to achieve this benefit* + +[[!tag standards]] + +------------- + +\newpage{} + -- 2.30.2
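
For readers checking the moved pseudo-code, a minimal Python model of it may help. It is a sketch under stated assumptions, not the reference implementation: `SVi` is taken as a 7-bit field (bits 16..22 of the encoding table), bit indices follow the MSB0 numbering used by the pseudo-code, and `svstate_next` is a hypothetical stand-in for `SVSTATE_NEXT`, whose behaviour this patch does not define. The helper names `get_bit` and `set_bit` are likewise illustrative only.

```python
MASK64 = (1 << 64) - 1


def get_bit(value, idx, width):
    """Return bit `idx` of a `width`-bit value, MSB0 numbering (bit 0 is the MSB)."""
    return (value >> (width - 1 - idx)) & 1


def set_bit(value, idx, width, bit):
    """Return `value` with MSB0 bit `idx` of a `width`-bit field set to `bit`."""
    shift = width - 1 - idx
    return (value & ~(1 << shift)) | ((bit & 1) << shift)


def svstep(svstate, svi, vf, svstate_next):
    """Model of the svstep pseudo-code; returns (new_svstate, rt).

    `svstate_next(svi, vf)` is a hypothetical callback standing in for
    SVSTATE_NEXT; any stepping side effects are left to that callback.
    """
    if get_bit(svi, 3, 7) == 1 and get_bit(svi, 4, 7) == 1:
        # SVi[3:4] = 0b11: store SVi[5] and SVi[6] in SVSTATE[53] and SVSTATE[54]
        svstate = set_bit(svstate, 53, 64, get_bit(svi, 5, 7))
        svstate = set_bit(svstate, 54, 64, get_bit(svi, 6, 7))
        # RT <- [0]*62 || SVSTATE[53:54]
        rt = (get_bit(svstate, 53, 64) << 1) | get_bit(svstate, 54, 64)
    else:
        # Vertical-First explicit stepping (or enquiry when vf=0)
        step = svstate_next(svi, vf)
        # RT <- [0]*57 || step, so only the low 7 bits are kept
        rt = step & 0x7F
    return svstate & MASK64, rt
```

Under these assumptions, `svstep(0, 0b1101, 0, stub)` sets only `SVSTATE[54]`, which matches the "pack set, unpack cleared" bullet in the Description, while `SVi=0b1100` clears both bits.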