From 789a1678472a46486ff0cf662fba4403d19907eb Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 20 Dec 2020 20:35:56 +0000
Subject: [PATCH]

---
 openpower/sv/svp_rewrite/svp64.mdwn | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/openpower/sv/svp_rewrite/svp64.mdwn b/openpower/sv/svp_rewrite/svp64.mdwn
index 225e03b97..d00bb8305 100644
--- a/openpower/sv/svp_rewrite/svp64.mdwn
+++ b/openpower/sv/svp_rewrite/svp64.mdwn
@@ -214,7 +214,8 @@ The Mode table is laid out as follows:
 | 0-1 | 2 | 3 4 | description |
 | --- | --- |---------|-------------------------- |
 | 00 | 0 | sz dz | normal mode |
-| 00 | 1 | sz CRM | reduce mode (mapreduce) |
+| 00 | 1 | sz CRM | reduce mode (mapreduce), SUBVL=1 |
+| 00 | 1 | SVM CRM | reduce mode (mapreduce), SUBVL>1 |
 | 01 | inv | CR-bit | Rc=1: ffirst CR sel |
 | 01 | inv | sz dz | Rc=0: ffirst z/nonz |
 | 10 | N | sz dz | sat mode: N=0/1 u/s |
@@ -226,6 +227,7 @@ Fields:
 * **sz / dz** if predication is enabled will put zeros into the dest (or as src in the case of twin pred) when the predicate bit is zero. otherwise the element is ignored or skipped, depending on context.
 * **inv CR bit** just as in branches (BO) these bits allow testing of a CR bit and whether it is set (inv=0) or unset (inv=1)
 * **CRM** affects the CR on reduce mode when Rc=1
+* **SVM** sets "subvector" reduce mode
 * **N** sets signed/unsigned saturation.
 
 ## Rounding, clamp and saturate
@@ -269,7 +271,7 @@ Pseudocode for the case where RA==RB:
 
 TODO: case where RA!=RB which involves first a vector of 2-operand results followed by a mapreduce on the intermediates.
 
-Note that when SUBVL!=1, the sub-elements are *independent*, i.e. they are mapreduced per *sub-element* as a result. illustration with a vec2:
+Note that when SUBVL!=1 the sub-elements are *independent*, i.e. they are mapreduced per *sub-element* as a result. illustration with a vec2:
 
     result.x = op(iregs[RA].x, iregs[RA+1].x)
     result.y = op(iregs[RA].y, iregs[RA+1].y)
@@ -277,6 +279,16 @@ Note that when SUBVL!=1, the sub-elements are *independent*, i.e. they are mapre
         result.x = op(result.x, iregs[RA+i].x)
         result.y = op(result.y, iregs[RA+i].y)
 
+When SVM is set and SUBVL!=1, another variant is enabled, which switches to `RM-2P-2S1D` such that different elwidths may be applied to src and dest.
+
+    for i in range(VL):
+        result = op(iregs[RA+i].x, iregs[RA+i].x)
+        result = op(result, iregs[RA+i].z)
+        result = op(result, iregs[RA+i].z)
+        iregs[RT+i] = result
+
+
+
 ## Fail-on-first
 
 Data-dependent fail-on-first has two distinct variants: one for LD/ST, the other for arithmetic operations (actually, CR-driven). Note in each case the assumption is that vector elements are required appear to be executed in sequential Program Order, element 0 being the first.
-- 
2.30.2
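
The two reduce variants touched by this patch can be contrasted with a small Python model. This is only a sketch under stated assumptions, not the specification's definition: registers are modelled as a list of tuples (one tuple of SUBVL sub-elements per vector element), and the function names `mapreduce_per_subelement` and `subvector_reduce` are hypothetical. In particular, the SVM pseudocode in the patch names only `.x` and `.z`, so folding *all* sub-elements of each element in order is an assumption made here for illustration.

```python
from functools import reduce
import operator


def mapreduce_per_subelement(op, vec):
    """Model of reduce mode with SUBVL != 1 and SVM clear.

    Each sub-element lane (x, y, ...) is reduced independently across
    the VL elements, giving one result per lane.
    """
    result = list(vec[0])
    for elem in vec[1:]:
        for lane in range(len(result)):
            result[lane] = op(result[lane], elem[lane])
    return tuple(result)


def subvector_reduce(op, vec):
    """Assumed model of "subvector" reduce mode (SVM set, SUBVL != 1).

    Each element's own sub-elements are folded together, giving one
    scalar per vector element (VL results in total).
    """
    return [reduce(op, elem) for elem in vec]


# Three vec2 elements: (x, y) pairs.
vec = [(1, 10), (2, 20), (3, 30)]
per_lane = mapreduce_per_subelement(operator.add, vec)
per_elem = subvector_reduce(operator.add, vec)
```

With `operator.add`, the first call sums the `x` lane and the `y` lane separately, while the second sums `x + y` within each element, which is the distinction the new SVM row in the Mode table is meant to select.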