## ternlogv
-also, another possible variant involving swizzle and vec4:
+Another possible variant uses swizzle-like selection and masking.
+It requires only two 64-bit registers (RA, RT) and at most
+16 LUT3s.
| 0.5|6.10|11.15| 16.23 |24.27 | 28.30 |31|
| -- | -- | --- | ----- | ---- | ----- |--|
| NN | RT | RA | idx0-3| mask | sz 01 |0 |
- SZ = sz * 8
- raoff = idx0 * SZ
- rboff = idx0 * SZ
- rcoff = idx0 * SZ
- imoff = idx0 * SZ
+ SZ = (1+sz) * 8 # 8 or 16
+ raoff = MIN(XLEN, idx0 * SZ)
+ rboff = MIN(XLEN, idx1 * SZ)
+ rcoff = MIN(XLEN, idx2 * SZ)
+ imoff = MIN(XLEN, idx3 * SZ)
imm = RA[imoff:imoff+SZ]
- for i in range(SZ):
+ for i in range(MIN(XLEN, SZ)):
ra = RA[raoff+i]
rb = RA[rboff+i]
rc = RA[rcoff+i]
res = lut3(imm, ra, rb, rc)
- for j in range(3):
+ for j in range(MIN(XLEN//8, 4)):
if mask[j]: RT[i+j*SZ] = res
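
The pseudocode above can be exercised with a small Python model. This is a minimal sketch and not a reference implementation. It assumes LSB0 bit numbering (bit 0 is the least significant), a `lut3` that indexes an 8-bit truth table as `a<<2 | b<<1 | c`, and that only the low 8 bits of the `imm` field are used even when SZ is 16. None of these choices is stated in the text, and the helper names `bit` and `ternlogv` are purely illustrative.

```python
XLEN = 64  # register width assumed by this sketch


def bit(value, pos):
    """Return bit `pos` of `value` (LSB0 numbering, an assumption)."""
    return (value >> pos) & 1


def lut3(imm, a, b, c):
    """Look up one bit of an 8-bit truth table (index order a,b,c is assumed)."""
    return (imm >> ((a << 2) | (b << 1) | c)) & 1


def ternlogv(RA, RT, idx, mask, sz):
    """Model of the ternlogv pseudocode; idx is (idx0, idx1, idx2, idx3)."""
    SZ = (1 + sz) * 8  # 8 or 16
    raoff, rboff, rcoff, imoff = (min(XLEN, k * SZ) for k in idx)
    imm = (RA >> imoff) & 0xFF  # assumption: only 8 truth-table bits are used
    for i in range(min(XLEN, SZ)):
        res = lut3(imm, bit(RA, raoff + i), bit(RA, rboff + i), bit(RA, rcoff + i))
        for j in range(min(XLEN // 8, 4)):
            if bit(mask, j):
                pos = i + j * SZ
                # Overwrite a single bit of RT, leaving unmasked lanes untouched.
                RT = (RT & ~(1 << pos)) | (res << pos)
    return RT & ((1 << XLEN) - 1)


# RA packs a=0xF0, b=0xCC, c=0xAA in bytes 0..2 and the truth table in byte 3.
# 0x96 is the three-input XOR table; mask 0b0101 writes lanes 0 and 2 of RT.
result = ternlogv(0x96AACCF0, 0, (0, 1, 2, 3), 0b0101, 0)
print(hex(result))  # 0x960096
```

XOR and AND are symmetric in their inputs, so the example result does not depend on the assumed `lut3` index order. An asymmetric truth table would.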
## ternlogcr