From d1b3ca23e48af40a77cde2e77401649750634f88 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 28 Aug 2022 18:49:10 +0100
Subject: [PATCH]

---
 openpower/sv/av_opcodes.mdwn | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/openpower/sv/av_opcodes.mdwn b/openpower/sv/av_opcodes.mdwn
index 617bf7f83..34c528d0c 100644
--- a/openpower/sv/av_opcodes.mdwn
+++ b/openpower/sv/av_opcodes.mdwn
@@ -49,7 +49,7 @@ The fundamental principle for these instructions is:
 Thus for example, where OpenPOWER VSX has vpkswss, this would be achieved
 in SV with simply:
 
-* addition of a scalar ext/clamp instruction
+* applying saturation to maxu (sv.maxu/satu)
 * 1st op, swizzle-selection vec2 "select X only" from source to dest:
   dest.X = extclamp(src.X)
 * 2nd op, swizzle-select vec2 "select Y only" from source to dest
@@ -57,6 +57,14 @@ Thus for example, where OpenPOWER VSX has vpkswss, this would be achieved in SV
 Macro-op fusion may be used to detect that these two interleave cleanly,
 overlapping the vec2.X with vec2.Y to produce a single vec2.XY operation.
 
+Alternatively Twin-Predication may be applied, with every even bit set in
+the source mask and every odd bit set in the destination mask:
+
+    r3=0b10101010
+    r10=0b01010101
+    r0=0x00007fff # or other limit
+    sv.maxu/satu/sm=r3/dm=r10/ew=32 *r20,*r20,r0
+
 ## Scalar element operations
 
 * clamping / saturation for signed and unsigned. best done similar to FP rounding modes, i.e. with an SPR.
-- 
2.30.2