From d78ed8963da8b0a5d1bb9c4b6067e57af608dbbf Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 12 Aug 2022 17:56:25 +0100
Subject: [PATCH]

---
 openpower/sv/ldst.mdwn | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index 9f5e9ebb2..b1a8a1b81 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -117,7 +117,7 @@ multiple times with the same memory read at the same location.
 The benefit of Cache-inhibited LD-splats is that it allows
 for memory-mapped peripherals to have multiple
 data values read in quick succession and stored in sequentially
-numbered registers.
+numbered registers (but, see Note below).
 
 For non-cache-inhibited ST from a vector source onto a scalar
 destination: with the Vector
@@ -132,7 +132,8 @@ destination. Just like Cache-inhibited LDs, multiple values may be
 written out in quick succession to a memory-mapped peripheral
 from sequentially-numbered registers.
 
-Note that there are no immediate versions of cache-inhibited LD/ST.
+Note that there are no immediate versions of cache-inhibited LD/ST
+(no *Scalar* cache-inhibited immediate instructions to Vectorise)
 
 **LD/ST Indexed**
 
@@ -174,6 +175,12 @@ Note that cache-inhibited LD/ST (`ldcix`) when VSPLAT is activated will perform
 If a genuine cache-inhibited LD-VSPLAT is required then a *scalar*
 cache-inhibited LD should be performed, followed by a VSPLAT-augmented mv.
 
+Note also that cache-inhibited VSPLAT with Predicate-result is possible.
+This allows for example to issue a massive batch of memory-mapped
+peripheral reads, stopping at the first NULL-terminated character and
+truncating VL to that point. No branch is needed to issue that large burst
+of LDs.
+
 # Vectorisation of Scalar Power ISA v3.0B
 
 OpenPOWER Load/Store operations may be seen from [[isa/fixedload]] and
-- 
2.30.2
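
Reviewer's illustration, not part of the patch: the last hunk describes a burst of cache-inhibited reads that stops at the first zero value and truncates VL. The Python below is only a behavioural sketch of that idea, not the SVP64 pseudocode. The names `ci_load_burst` and `make_fifo` are hypothetical, the peripheral is modelled as a callable that returns a new value on each read, and dropping the terminating zero from the result is an assumption of this sketch rather than something the patch states.

```python
from typing import Callable, List, Tuple


def ci_load_burst(read_reg: Callable[[], int], vl: int) -> Tuple[List[int], int]:
    """Model up to `vl` back-to-back reads of one cache-inhibited address.

    Returns (values kept, truncated VL). The terminating zero is not kept;
    that choice is an assumption of this sketch.
    """
    regs: List[int] = []
    for _ in range(vl):
        value = read_reg()   # every element performs its own read
        if value == 0:       # result test fails: stop and truncate here
            break
        regs.append(value)   # lands in the next sequentially numbered register
    return regs, len(regs)


def make_fifo(data: bytes) -> Callable[[], int]:
    """Hypothetical peripheral: each read pops the next byte, then returns 0."""
    it = iter(data)
    return lambda: next(it, 0)


regs, new_vl = ci_load_burst(make_fifo(b"hi\x00zz"), vl=8)
print(regs, new_vl)  # -> [104, 105] 2
```

The `for` loop stands in for the hardware element loop of a single Vectorised instruction; the point made in the patch is that software does not need its own branch per read.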