From: Luke Kenneth Casson Leighton
Date: Thu, 13 Apr 2023 17:46:53 +0000 (+0100)
Subject: whitespace
X-Git-Tag: opf_rfc_ls010_v1~2
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=db348905eb1749dc5a2ecf8dfd39c2480469232e;p=libreriscv.git

whitespace
---

diff --git a/openpower/sv/setvl.mdwn b/openpower/sv/setvl.mdwn
index f7132b6b5..d271cf96a 100644
--- a/openpower/sv/setvl.mdwn
+++ b/openpower/sv/setvl.mdwn
@@ -71,30 +71,31 @@ Special Registers Altered:
 * `vs` - bit 24 - allows for setting of VL
 * `vf` - bit 25 - sets "Vertical First Mode".
 
-Note that in immediate setting mode VL and MVL start from **one**
-but that this is compensated for in the assembly notation.
-i.e. that an immediate value of 1 in assembler notation
-actually places the value 0b0000000 in the `SVi` field bits:
-on execution the `setvl` instruction adds one to the decoded
-`SVi` field bits, resulting in
-VL/MVL being set to 1. This allows VL to be set to values
-ranging from 1 to 128 with only 7 bits instead of 8.
-Setting VL/MVL
-to 0 would result in all Vector operations becoming `nop`. If this is
-truly desired (nop behaviour) then setting VL and MVL to zero is to be
-done via the [[SVSTATE SPR|sv/sprs]].
+Note that in immediate setting mode VL and MVL start from **one** but that
+this is compensated for in the assembly notation. i.e. that an immediate
+value of 1 in assembler notation actually places the value 0b0000000 in
+the `SVi` field bits: on execution the `setvl` instruction adds one to
+the decoded `SVi` field bits, resulting in VL/MVL being set to 1. This
+allows VL to be set to values ranging from 1 to 128 with only 7 bits
+instead of 8. Setting VL/MVL to 0 would result in all Vector operations
+becoming `nop`. If this is truly desired (nop behaviour) then setting
+VL and MVL to zero is to be done via the [[SVSTATE SPR|sv/sprs]].
 
 Note that setmvli is a pseudo-op, based on RA/RT=0, and setvli likewise
 
+```
 setvli VL=8 : setvl r0, r0, VL=8, vf=0, vs=1, ms=0
 setvli. VL=8 : setvl. r0, r0, VL=8, vf=0, vs=1, ms=0
 setmvli MVL=8 : setvl r0, r0, MVL=8, vf=0, vs=0, ms=1
 setmvli. MVL=8 : setvl. r0, r0, MVL=8, vf=0, vs=0, ms=1
+```
 
 Additional pseudo-op for obtaining VL without modifying it (or any state):
 
+```
 getvl r5 : setvl r5, r0, vf=0, vs=0, ms=0
 getvl. r5 : setvl. r5, r0, vf=0, vs=0, ms=0
+```
 
 Note that whilst it is possible to set both MVL and VL from the same
 immediate, it is not possible to set them to different immediates in
@@ -119,25 +120,25 @@ immediate `SVi+1` is sacrificed in favour of setting from CTR.
 
 ## Unusual Rc=1 behaviour
 
-Normally, the return result from an instruction is in `RT`. With
-it being possible for `RT=0` to mean that `CTR` mode is to be read,
-some different semantics are needed.
+Normally, the return result from an instruction is in `RT`. With it
+being possible for `RT=0` to mean that `CTR` mode is to be read, some
+different semantics are needed.
 
 CR Field 0, when `Rc=1`, may be set even if `RT=0`. The reason is that
 overflow may occur: `VL`, if set either from an immediate or from `CTR`,
 may not exceed `MAXVL`, and if it is, `CR0.SO` must be set.
 
-In reality it is **`VL`** being set. Therefore, rather
-than `CR0` testing `RT` when `Rc=1`, CR0.EQ is set if `VL=0`, CR0.GE
-is set if `VL` is non-zero.
+In reality it is **`VL`** being set. Therefore, rather than `CR0`
+testing `RT` when `Rc=1`, CR0.EQ is set if `VL=0`, CR0.GE is set if `VL`
+is non-zero.
 
 **SUBVL**
 
 Sub-vector elements are not be considered "Vertical". The vec2/3/4
 is to be considered as if the "single element". Caveats exist for
-[[sv/mv.swizzle]] and [[sv/mv.vec]] when Pack/Unpack is enabled,
-due to the order in which VL and SUBVL loops are applied being
-swapped (outer-inner becomes inner-outer)
+[[sv/mv.swizzle]] and [[sv/mv.vec]] when Pack/Unpack is enabled, due
+to the order in which VL and SUBVL loops are applied being swapped
+(outer-inner becomes inner-outer)
 
 ## Examples
 
@@ -156,6 +157,7 @@ loop:
 
 ### Loop using Rc=1
 
+```
 my_fn:
 li r3, 1000
 b test
@@ -167,23 +169,27 @@ loop:
 bne cr0, loop
 end:
 blr
+```
 
 ### Load/Store-Multi (selective)
 
-Up to 64 FPRs will be loaded, here. `r3` is set one per bit
-for each FP register required to be loaded. The block of memory
-from which the registers are loaded is contiguous (no gaps):
-any FP register which has a corresponding zero bit in `r3`
-is *unaltered*. In essence this is a selective LD-multi with
-"Scatter" capability.
+Up to 64 FPRs will be loaded, here. `r3` is set one per bit for each
+FP register required to be loaded. The block of memory from which the
+registers are loaded is contiguous (no gaps): any FP register which has
+a corresponding zero bit in `r3` is *unaltered*. In essence this is a
+selective LD-multi with "Scatter" capability.
 
+```
 setvli r0, MVL=64, VL=64
 sv.fld/dm=r3 *r0, 0(r30) # selective load 64 FP registers
+```
 
 Up to 64 FPRs will be saved, here. Again, `r3`
 
+```
 setvli r0, MVL=64, VL=64
 sv.stfd/sm=r3 *fp0, 0(r30) # selective store 64 FP registers
+```
 
 [[!tag standards]]
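
Reviewer note on the reflowed `SVi` paragraph in the first hunk: the off-by-one between the assembler immediate and the 7-bit `SVi` field can be illustrated with a short Python sketch. This is only an illustration of the arithmetic the paragraph describes, not code from the Libre-SOC simulator or assembler. The names `SVI_BITS`, `encode_svi` and `decode_svi` are hypothetical and were chosen for this note.

```python
# Illustrative sketch only: these names are hypothetical, not Libre-SOC APIs.
SVI_BITS = 7  # the text states the SVi field is 7 bits wide


def encode_svi(immediate: int) -> int:
    """Map an assembler immediate (1..128) to the value stored in SVi."""
    if not 1 <= immediate <= (1 << SVI_BITS):
        raise ValueError("immediate must be in the range 1..128")
    # The assembler notation compensates for the hardware adding one.
    return immediate - 1


def decode_svi(field: int) -> int:
    """Map the stored SVi field (0..127) back to the VL/MVL it selects."""
    if not 0 <= field < (1 << SVI_BITS):
        raise ValueError("SVi field must fit in 7 bits")
    # On execution, setvl adds one to the decoded SVi field bits.
    return field + 1
```

Under these assumptions an immediate of 1 is stored as `0b0000000` and an immediate of 128 as `0b1111111`, which is why zero cannot be expressed through the immediate and has to be set via the SVSTATE SPR instead.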