From 177716f3b3744cd7c62149499705745d6db8c4e4 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 21 Apr 2022 22:36:41 +0100
Subject: [PATCH]

---
 openpower/sv/biginteger.mdwn | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/openpower/sv/biginteger.mdwn b/openpower/sv/biginteger.mdwn
index 168db13a0..e6c6ed996 100644
--- a/openpower/sv/biginteger.mdwn
+++ b/openpower/sv/biginteger.mdwn
@@ -24,8 +24,13 @@ Dynamic SIMD ALUs for maximum performance and effectiveness.
 
 Covered in [[biginteger/analysis]] the summary is that standard `adde`
 is sufficient for SVP64 Vectorisation of big-integer addition (and subfe
-for subtraction) but that big-integer multiply and divide require two
-extra 3-in 2-out instructions, similar to Intel's `mulx`, to be efficient.
+for subtraction) but that big-integer multiply and divide require an
+extra 3-in 2-out instruction, similar to Intel's `mulx`, to be efficient.
+The same instruction (`madded`) is used for both because 'madded''s primary
+purpose is to perform a fused 64-bit scalar multiply with a large vector,
+where that result is Big-Added for Big-Multiply, but Big-Subtracted for
+Big-Divide.
+
 Macro-op Fusion and back-end massively-wide SIMD ALUs may be deployed in a
 fashion that is hidden from the user, behind a consistent, stable ISA API.
 
@@ -33,7 +38,7 @@ fashion that is hidden from the user, behind a consistent, stable ISA API.
 
 **DRAFT**
 
-Both `madded` and `msubed` are VA-Form:
+`madded` is VA-Form:
 
 |0.....5|6..10|11..15|16..20|21..25|26..31|
 |-------|-----|------|------|------|------|
@@ -46,7 +51,7 @@ in `110110`.
 A corresponding `madded` is proposed for `110010`
 | 110000 | 110001 | 110010 | 110011 | 110100 | 110101 | 110110 | 110111 |
 | ------ | ------- | ------ | ------ | ------ | ------ | ------ | ------ |
-| maddhd | maddhdu | madded | maddld | rsvd | rsvd | msubed | rsvd |
+| maddhd | maddhdu | madded | maddld | rsvd | rsvd | rsvd | rsvd |
 
 For SVP64 EXTRA register extension, the `RM-1P-3S-1D` format is used
 with the additional bit set for determining RS.
@@ -69,19 +74,6 @@ When `EXTRA2_MODE` is set to one, the implicit RS register is identical
 to RC extended to SVP64 numbering, including whether RC is set Scalar or
 Vector.
 
-## msubed
-
-The pseudocode for `msubed RT, RA, RB, RC`` is:
-
-    prod[0:127] = (RA) * (RB)
-    sub[0:127] = EXTZ(RC) - prod
-    RT <- sub[64:127]
-    RS <- sub[0:63] # RS is either RC or RT+VL
-
-Note that RC is not sign-extended to 64-bit. In a Vector Loop
-it contains the top half of the previous multiply-with-subtract,
-and the current product must be subtracted from it.
-
 ## madded
 
 The pseudocode for `madded RT, RA, RB, RC` is:
-- 
2.30.2