From 5de91c7a15e0fa6a933b6abf9663686b1532abe4 Mon Sep 17 00:00:00 2001 From: lkcl Date: Sat, 19 Dec 2020 17:55:47 +0000 Subject: [PATCH] --- openpower/sv/svp_rewrite/svp64/discussion.mdwn | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/openpower/sv/svp_rewrite/svp64/discussion.mdwn b/openpower/sv/svp_rewrite/svp64/discussion.mdwn index bbb90105d..567671624 100644 --- a/openpower/sv/svp_rewrite/svp64/discussion.mdwn +++ b/openpower/sv/svp_rewrite/svp64/discussion.mdwn @@ -103,7 +103,7 @@ Some examples on different operation widths: 0 1 2 3 4 description ------------------ - 0 0 M sz dz reduce mode (M=1) + 0 0 M sz CRM reduce mode (M=1). 0 1 inv CR-bit Rc=1: ffirst CR sel 0 1 inv sz dz Rc=0: ffirst z/nonz 1 0 N sz dz sat mode: N=0/1 u/s @@ -116,6 +116,7 @@ Mode types: * **ffirst** or data-dependent fail-on-first: see separate section. * **sat mode** or saturation: clamps the result to a min/max rather than overflows / wraps. allows signed and unsigned clamping. * **reduce mode**. when M=1 a mapreduce is performed. the result is a scalar. a vector however is required, as it may be used to store intermediary computations. the result is in the first element with a nonzero predicate bit. + note that reduce mode only applies to 2 src operations. * **pred-result** will test the result (CR testing selects a bit of CR and inverts it, just like branch testing) and if the test fails it is as if the predicate bit was zero. When Rc=1 the CR element (CR0) however is still stored in the CR regfile. This scheme does not apply to crops (crand, cror). # Notes about rounding, clamp and saturate @@ -124,6 +125,16 @@ One of the issues with vector ops is that in integer DSP ops for example in Audi If there are spare bits it would be very good to look at using some of them to specify the mode, because otherwise a SPR has to be used which will need to be set and unset. This can get costly. +# Notes about reduce mode + +1. limited to single predicated dual src operations (add RT, RA, RB) and to triple source operations where one of the inputs is set to a scalar (e.g. isel) +2. limited to operations that make sense. divide is excluded, as us subtract. multiply, add, logical bitwise OR, CR operations. +3. the destination is a vector but the result is stored, ultimately, in the first nonzero predicated element. all other nonzero predicated elements are undefined. +4. implementations may use any ordering and any algorithm to reduce down to a single result. However it must be equivalent to a straight application of mapreduce. The destination vector (except masked out elements) may be used for storing any intermediate results. these may be left in the vector (undefined). +5. CRM applies when Rc=1. When CRM is zero, the CR associated with the result is regarded as a "some results met standard CR result criteria". When CRM is one, this changes to "all results met standard CR criteria". + +TODO: Rc=1 on Logical Operations? is this possible? + # Fail-on-first Data-dependent fail-on-first has two distinct variants: one for LD/ST, the other for arithmetic operations (actually, CR-driven). Note in each case the assumption is that vector elements are required appear to be executed in sequential Program Order, element 0 being the first. -- 2.30.2