in only 3 instructions, one of which is setting a scalar integer to
zero.
+## 2nd experiment
+
+```
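+ uint32_t carry = 0;  // carry out of the multiply
+ int32_t carry2 = 0;  // borrow from the subtract; signed so that ~0 sign-extends to -1
+ uint64_t products[d_bytes / sizeof(d[0])];
+ // first pass: independent 32x32 -> 64-bit multiplies, no carry chain
+ for(int j = d_bytes / sizeof(d[0]) - 1; j >= 0; j--) {
+     products[j] = (uint64_t)q[i] * d[j];
+ }
+ // second pass: propagate the multiply carry and the subtract borrow
+ for(int j = d_bytes / sizeof(d[0]) - 1; j >= 0; j--) {
+     uint64_t v = products[j] + carry;
+     carry = v >> 32;
+     v = (uint32_t)v;
+     v = n[i + j] - v + carry2;
+     carry2 = v >> 32; // either ~0 or 0
+     n[i + j] = v;
+ }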
+```
+
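For anyone who wants to try the two-pass idea outside the pseudocode, a minimal self-contained sketch follows. It is not taken from the patch: the function name `mul_sub_two_pass` and the `MAX_WORDS` limit are hypothetical, the most significant word is assumed to sit at index 0 (inferred from the downward loop above), and the borrow is tracked as 0 or 1 rather than as ~0 or 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORDS 64 /* hypothetical upper bound on the operand length */

/* Computes n -= q * d over len 32-bit words.
 * Assumption: index 0 is the MOST significant word.
 * Returns the amount still to be subtracted from the next more
 * significant word (final multiply carry plus final borrow). */
static uint64_t mul_sub_two_pass(uint32_t *n, const uint32_t *d,
                                 size_t len, uint32_t q)
{
    uint64_t products[MAX_WORDS];
    uint32_t mul_carry = 0; /* high half of the previous product sum */
    uint32_t borrow = 0;    /* 0 or 1 */

    assert(len <= MAX_WORDS);

    /* Pass 1: independent 32x32 -> 64-bit multiplies, no carry chain. */
    for (size_t j = len; j-- > 0;) {
        products[j] = (uint64_t)q * d[j];
    }

    /* Pass 2: walk from least to most significant word, propagating
     * the multiply carry and the subtract borrow. */
    for (size_t j = len; j-- > 0;) {
        uint64_t v = products[j] + mul_carry; /* cannot overflow 64 bits */
        mul_carry = (uint32_t)(v >> 32);
        uint32_t lo = (uint32_t)v;
        uint64_t diff = (uint64_t)n[j] - lo - borrow; /* wraps if negative */
        n[j] = (uint32_t)diff;
        borrow = (uint32_t)((diff >> 32) & 1); /* 1 if the subtract wrapped */
    }
    return (uint64_t)mul_carry + borrow;
}
```

With `n = {1, 0}` (the value 2^32), `d = {0, 3}` and `q = 2`, the call leaves `n = {0, 0xFFFFFFFA}` and returns 0. Only the second pass has a loop-carried dependency; the first pass is the part that could be done as a vector multiply.
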
## EXT004 Opcode map
See Power ISA v3.1, Book III, Appendix D, Table 13 (sheet 7 of 8), p1357.