From c0c1293638743468109f1cea0999b9eea52adda0 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Wed, 1 Sep 2021 19:34:04 +0100
Subject: [PATCH]

---
 openpower/sv/ldst.mdwn | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index a34025f54..54355b67d 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -139,6 +139,7 @@ an alternative table meaning for [[sv/svp64]] mode. The following modes make se
 * predicate-result (mostly for cache-inhibited LD/ST)
 * normal
 * fail-first, where a vector source on RA or RB is banned
+* Signed Effective Address computation (Vector Indexed only)
 
 Also, given that FFT, DCT and other related algorithms
 are of such high importance in so many areas of Computer
@@ -199,7 +200,7 @@ The modes for `RA+RB` indexed version are slightly different:
 | 0-1 |  2  |  3   4  |  description              |
 | --- | --- |---------|-------------------------- |
 | 00  | 0   | dz  sz  | normal mode               |
-| 00  | 1   | rsvd    | reserved                  |
+| 00  | 1   | dz SEA  | Signed Effective Address  |
 | 01  | inv | CR-bit  | Rc=1: ffirst CR sel       |
 | 01  | inv | dz  RC1 | Rc=0: ffirst z/nonz       |
 | 10  | N   | dz  sz  | sat mode: N=0/1 u/s       |
@@ -217,6 +218,13 @@ A summary of the effect of Vectorisation of src or dest:
     RA,RB  RT.v RA/RB.s VSPLAT possible
     RA,RB  RT.s RA/RB.s not vectorised
 
+Signed Effective Address computation is only relevant for
+Vector Indexed Mode, when elwidth overrides are applied.
+The source elwidth override applies to RB.  If SEA is set,
+each RB element is sign-extended from elwidth bits to the
+full 64 bits before being added to RA to calculate the
+Effective Address.
+
 Note that cache-inhibited LD/ST (`ldcix`) when VSPLAT is activated will perform **multiple** LD/ST operations, sequentially.  `ldcix` even with scalar src will read the same memory location *multiple times*, storing the result in successive Vector destination registers.
 This because the cache-inhibit instructions are used to read and write memory-mapped peripherals.
 If a genuine cache-inhibited LD-VSPLAT is required then a *scalar* cache-inhibited LD should be performed, followed by a VSPLAT-augmented mv.
-- 
2.30.2
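
Below is an illustrative Python sketch (not part of the patch) of the Effective
Address calculation described by the new SEA paragraph above.  The helper names
`sign_extend` and `indexed_ea` are hypothetical, and zero-extension of the RB
element when SEA=0 is an assumption drawn from the surrounding text.

    # Sketch of per-element EA computation for Vector Indexed mode with SEA.
    # elwidth is the RB source element width in bits (8, 16, 32 or 64).

    def sign_extend(value, bits):
        # interpret the low `bits` bits of `value` as a signed integer
        sign_bit = 1 << (bits - 1)
        return (value & (sign_bit - 1)) - (value & sign_bit)

    def indexed_ea(ra, rb_element, elwidth, sea):
        """Compute EA = RA + RB-element for one Vector Indexed element.

        If SEA=1 the RB element is sign-extended from elwidth bits to 64
        bits before the add; if SEA=0 it is zero-extended (assumed default).
        """
        mask = (1 << elwidth) - 1
        element = rb_element & mask          # raw elwidth-bit element from RB
        if sea:
            offset = sign_extend(element, elwidth)
        else:
            offset = element                 # zero-extended
        return (ra + offset) & 0xFFFFFFFFFFFFFFFF  # wrap to 64 bits

    # example: a 16-bit RB element of 0xFFFE acts as -2 when SEA=1
    assert indexed_ea(0x1000, 0xFFFE, 16, sea=True)  == 0x0FFE
    assert indexed_ea(0x1000, 0xFFFE, 16, sea=False) == 0x10FFE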