From: lkcl
Date: Sat, 20 May 2023 00:15:27 +0000 (+0100)
Subject: (no commit message)
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=1d86c19a13dbe32c685a5aa881b288b6a8c1ce21;p=libreriscv.git

---

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index 57f5e258c..836c98573 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -75,7 +75,6 @@ In addition, reduce mode makes no sense.
 Realistically we need an alternative table definition for [[sv/svp64]] `RM.MODE`.
 The following modes make sense:
 
-* saturation
 * simple (no augmentation)
 * Fault-first (where Vector Indexed is banned)
 * Data-dependent Fail-First (extremely useful for Linked-List pointer-chasing)
@@ -100,7 +99,6 @@ Fields used in tables below:
 * **zz**: both sz and dz are set equal to this flag.
 * **inv CR bit** just as in branches (BO) these bits allow testing of a CR bit
   and whether it is set (inv=0) or unset (inv=1)
-* **N** sets signed/unsigned saturation.
 * **RC1** as if Rc=1, stores CRs *but not the result*
 * **SEA** - Signed Effective Address, if enabled performs sign-extension on
   registers that have been reduced due to elwidth overrides
@@ -125,9 +123,7 @@ The table for [[sv/svp64]] for `immed(RA)` which is `RM.MODE`
 
 | 0 | 1 | 2 | 3 4 | description |
 |---|---| --- |---------|--------------------------- |
-| 0 | 0 | 0 | zz els | simple mode |
-| 0 | 0 | 1 | PI LF | post-increment and Fault-First |
-| 1 | 0 | N | zz els | sat mode: N=0/1 u/s |
+|els| 0 | PI | zz LF | post-increment and Fault-First |
 |VLi| 1 | inv | CR-bit | ffirst CR sel |
 
 The `els` bit is only relevant when `RA.isvec` is clear: this indicates
@@ -207,9 +203,9 @@ A summary of the effect of Vectorisation of src or dest:
 Signed Effective Address computation is only relevant for Vector Indexed Mode,
 when elwidth overrides are applied.
 The source override applies to RB, and before adding to RA in order to calculate the Effective Address,
-if SEA is set RB is sign-extended from elwidth bits to the full 64 bits.
-For other Modes (ffirst, saturate), all EA computation with elwidth
-overrides is unsigned. RA is *not* altered (not truncated)
+if SEA is set then RB is sign-extended from elwidth bits to the full 64 bits.
+For other Modes (ffirst), all EA computation with elwidth
+overrides is unsigned. RA is *never* altered (not truncated)
 by element-width overrides.
 
 Note that cache-inhibited LD/ST when VSPLAT is activated will perform
@@ -587,12 +583,8 @@ which are expressly in this order:
   but (on Indexed Load) allow srcwidth overrides on RB
 * Load at the operation width (lb/lh/lw/ld) as usual
 * byte-reversal as usual
-* Non-saturated mode:
-  - zero-extension or truncation from operation width to dest elwidth
-  - place result in destination at dest elwidth
-* Saturated mode:
-  - Sign-extension or truncation from operation width to dest width
-  - signed/unsigned saturation down to dest elwidth
+* zero-extension or truncation from operation width to dest elwidth
+* place result in destination at dest elwidth
 
 In order to respect Power v3.0B Scalar behaviour the memory side
 is treated effectively as completely separate and distinct from SV
@@ -628,8 +620,8 @@ capability).
 Observe in particular that RA, as the base address in both Immediate
 and Indexed LD/ST, does not have element-width overriding applied to it.
 
-Note that predication, predication-zeroing, and other modes except
-saturation have all been removed, for clarity and simplicity:
+Note that predication, predication-zeroing, and other modes
+have all been removed, for clarity and simplicity:
 
 ```
     # LD not VLD!
@@ -649,13 +641,8 @@ saturation have all been removed, for clarity and simplicity:
         # read the underlying memory
         memread <= MEM(srcbase + imm_offs, op_width)
 
-        # check saturation.
-        if svpctx.saturation_mode:
-            # ... saturation adjustment...
-            memread = clamp(memread, op_width, svctx.dest_elwidth)
-        else:
-            # truncate/extend to over-ridden dest width.
-            memread = adjust_wid(memread, op_width, svctx.dest_elwidth)
+        # truncate/extend to over-ridden dest width.
+        memread = adjust_wid(memread, op_width, svctx.dest_elwidth)
 
         # takes care of inserting memory-read (now correctly byteswapped)
         # into regfile underlying LE-defined order, into the right place
@@ -672,7 +659,7 @@ Note above that the source elwidth is *not used at all* in LD-immediate.
 
 For LD/Indexed, the key is that in the calculation of the Effective Address,
 RA has no elwidth override but RB does. Pseudocode below is simplified
-for clarity: predication and all modes except saturation are removed:
+for clarity: predication and all modes are removed:
 
 ```
     # LD not VLD! ld*rx if brev else ld*
@@ -699,12 +686,8 @@ for clarity: predication and all modes except saturation are removed:
         if (bytereverse):
             memread = byteswap(memread, op_width)
 
-        if svpctx.saturation_mode:
-            # ... saturation adjustment...
-            memread = clamp(memread, op_width, svctx.dest_elwidth)
-        else:
-            # truncate/extend to over-ridden dest width.
-            memread = adjust_wid(memread, op_width, svctx.dest_elwidth)
+        # truncate/extend to over-ridden dest width.
+        memread = adjust_wid(memread, op_width, svctx.dest_elwidth)
 
         # takes care of inserting memory-read (now correctly byteswapped)
         # into regfile underlying LE-defined order, into the right place
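
Reviewer note: with saturation removed, the only post-load adjustment left is zero-extension or truncation, plus the optional SEA sign-extension of RB in Vector Indexed mode. The Python below is a minimal sketch of that behaviour, not the specification's pseudocode. It assumes all widths are given in bits; `adjust_wid` takes its name from the patch but its body here is a guess at the intent, and `sign_extend` and `indexed_ea` are hypothetical helper names introduced only for illustration.

```python
MASK64 = (1 << 64) - 1


def adjust_wid(value, op_width, dest_elwidth):
    """Zero-extend or truncate a loaded value (assumed semantics, widths in bits)."""
    value &= (1 << op_width) - 1            # keep only the bits actually loaded
    return value & ((1 << dest_elwidth) - 1)  # truncate if dest is narrower


def sign_extend(value, from_bits, to_bits=64):
    """Sign-extend the low from_bits of value out to to_bits (hypothetical helper)."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):      # top bit set: value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def indexed_ea(ra, rb, src_elwidth, sea):
    """Effective Address for LD/Indexed: RA is never narrowed, RB is (hypothetical helper)."""
    rb_el = rb & ((1 << src_elwidth) - 1)   # source elwidth override applies to RB only
    if sea:
        rb_el = sign_extend(rb_el, src_elwidth)
    return (ra + rb_el) & MASK64


# An 8-bit RB element of 0xFF is +255 when unsigned, -1 when SEA is set.
print(hex(indexed_ea(0x1000, 0xFF, 8, sea=False)))
print(hex(indexed_ea(0x1000, 0xFF, 8, sea=True)))
```

The two calls at the end show why SEA matters for negative offsets held in narrowed RB elements; without it the same bit pattern is treated as a large positive offset.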