openpower/sv/bitmanip/appendix.mdwn

   1 # big integer division
   2
   3 links
   4
   5 * <https://skanthak.homepage.t-online.de/division.html>
   6 * <https://news.ycombinator.com/item?id=26562819>
   7 * <https://gmplib.org/~tege/division-paper.pdf>
   8 * <https://github.com/Richard-Mace/huge-integer-class/blob/master/HugeInt.cpp> nice-looking well-commented c++ implementation
   9 * <https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-April/004739.html>
  10
  11 the most efficient division algorithm is probably Knuth's Algorithm D (with modifications from the exercises section of his book) which is O(n^2) and uses 2N-by-N-bit div/rem
  12
  13 an oversimplified version of the knuth algorithm d with 32-bit words is:
  14 (TODO find original)
  15
  16 ```
  17 void div(uint32_t *n, uint32_t *d, uint32_t* q, int n_bytes, int d_bytes) {
  18     // assumes d[0] != 0, also n, d, and q have their most-significant-word in index 0
  19     int q_bytes = n_bytes - d_bytes;
  20     for(int i = 0; i < q_bytes / sizeof(n[0]); i++) {
  21         // calculate guess for quotient word
  22         q[i] = (((uint64_t)n[i] << 32) + n[i + 1]) / d[0];
  23         // n -= q[i] * d
  24         uint32_t carry = 0, carry2 = 0;
  25         for(int j = d_bytes / sizeof(d[0]) - 1; j >= 0; j--) {
  26             uint64_t v = (uint64_t)q[i] * d[j] + carry;
  27             carry = v >> 32;
  28             v = (uint32_t)v;
  29             v = n[i + j] - v + carry2;
  30             carry2 = v >> 32; // either ~0 or 0
  31             n[i + j] = v;
  32         }
  33         // fixup if carry2 != 0
  34     }
  35     // now remainder is in n and quotient is in q
  36 }
  37 ```
  38
  39 ```
  40 On Sat, Apr 16, 2022, 22:06 Jacob Lifshay <programmerjake@gmail.com> wrote:
  41 and a mrsubcarry (the one actually needed by bigint division):
  42 # for big_c - big_a * word_b
  43 result <- RC + ~(RA * RB) + CARRY # this expression is wrong, needs further thought
  44 CARRY <- HIGH_HALF(result)
  45 RT <- LOW_HALF(result)
  46
  47 turns out, after some checking with 4-bit words, afaict the correct algorithm for mrsubcarry is:
  48 # for big_c - big_a * word_b
  49 result <- RC + ~(RA * RB) + CARRY
  50 result_high <- HIGH_HALF(result)
  51 if CARRY <= 1 then # unsigned comparison
  52     result_high <- result_high + 1
  53 end
  54 CARRY <- result_high
  55 RT <- LOW_HALF(result)
  56
  57 afaict, that'll make the following algorithm work:
  58
  59 so the inner loop in the bigint division algorithm would end up being (assuming n, d, and q all fit in registers):
  60 li r3, 1 # carry in for subtraction
  61 mtspr CARRY, r3 # init carry spr
  62 setvl loop_count
  63 sv.mrsubcarry rn.v, rd.v, rq.s, rn.v
  64 ```