From 8ec39aa19ae008e04de6f9abccf2627a42c63b2f Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 24 Apr 2022 22:00:16 +0100
Subject: [PATCH]

---
 openpower/sv/biginteger/analysis.mdwn | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/openpower/sv/biginteger/analysis.mdwn b/openpower/sv/biginteger/analysis.mdwn
index c77a7a859..1f422337c 100644
--- a/openpower/sv/biginteger/analysis.mdwn
+++ b/openpower/sv/biginteger/analysis.mdwn
@@ -369,12 +369,24 @@ In this way a Scalar Integer divide can be performed in the same
 time-order as Newton-Raphson, using two hardware multipliers
 and a subtract.
 
+There is however another reason for having a 128/64 division
+instruction, and it's effectively the reverse of `madded`.
+Look closely at Algorithm D when the divisor is only a scalar
+(`v[0]`):
+
 ```
         k = 0; // the case of a
         for (j = m - 1; j >= 0; j--)
         { // single-digit
-            uint64_t dig2 = (k * b + u[j]);
+            uint64_t dig2 = ((k << 32) | u[j]);
             q[j] = dig2 / v[0]; // divisor here.
-            k = dig2 % v[0]; // modulo bak into next loop
+            k = dig2 % v[0]; // modulo back into next loop
         }
 ```
+
+Here, just as with `madded` which can put the hi-half of the 128 bit product
+back in as a form of 64-bit carry, a scalar divisor of a vector dividend
+puts the modulo back in as the hi-half of a 128/64-bit divide.
+By a nice coincidence this is exactly the same 128/64-bit operation
+needed for the `qhat` estimate if it may produce both the quotient and
+the remainder.
-- 
2.30.2
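
Review note: the loop in the patch uses 32-bit digits inside a 64-bit `dig2`. As a rough sketch of the same single-digit case scaled up to 64-bit digits, where the remainder `k` becomes the hi-half of a 128/64-bit divide, something like the following could serve as a reference model. The helper name `divmod_128_by_64`, the function name `bigdiv_scalar`, and the use of `unsigned __int128` (a GCC/Clang extension) are assumptions made for illustration only; they are not taken from the patch or from any instruction definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128/64-bit divide: divides the 128-bit
 * value (hi:lo) by d, returns the 64-bit quotient and stores the
 * remainder in *rem. Requires hi < d so the quotient fits in 64 bits.
 * Assumes a compiler that provides unsigned __int128. */
static uint64_t divmod_128_by_64(uint64_t hi, uint64_t lo, uint64_t d,
                                 uint64_t *rem)
{
    assert(d != 0 && hi < d);
    unsigned __int128 n = ((unsigned __int128)hi << 64) | lo;
    *rem = (uint64_t)(n % d);
    return (uint64_t)(n / d);
}

/* Divide the m-digit number u (64-bit digits, least-significant first)
 * by the single digit v. Writes m quotient digits to q and returns the
 * final remainder. */
static uint64_t bigdiv_scalar(uint64_t *q, const uint64_t *u, size_t m,
                              uint64_t v)
{
    uint64_t k = 0;                    /* running remainder, always < v */
    for (size_t j = m; j-- > 0; ) {    /* most-significant digit first */
        uint64_t rem;
        q[j] = divmod_128_by_64(k, u[j], v, &rem);
        k = rem;                       /* modulo back into next loop */
    }
    return k;
}
```

Because `k` is always a remainder of a division by `v`, the precondition `hi < d` holds on every iteration, so the quotient digit never overflows 64 bits. For example, dividing the two-digit value 2^64 (`u = {0, 1}`) by 3 gives the quotient digits `{0x5555555555555555, 0}` and a remainder of 1.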