# Fortran MAXLOC SVP64 demo
+<https://bugs.libre-soc.org/show_bug.cgi?id=676>
+
MAXLOC is a notoriously difficult function for SIMD to cope with.
-SVP64 however has similar capabilities to Z80 CPIR and LDIR
+Typical approaches are to perform leaf-node (depth-first) parallel
+operations, merging the results mapreduce-style to determine the final
+index.
-<https://bugs.libre-soc.org/show_bug.cgi?id=676>
+SVP64, however, has capabilities similar to the Z80 CPIR and LDIR
+instructions, so hardware may transparently implement back-end parallel
+operations whilst the developer writes a simple sequential
+algorithm.
+
+A clear reference implementation of MAXLOC is as follows:
```
-int m2(int * const restrict a, int n)
-{
- int m, nm;
- int i;
-
- m = INT_MIN;
- nm = -1;
- for (i=0; i<n; i++)
- {
- if (a[i] > m)
- {
- m = a[i];
- nm = i;
- }
- }
+int maxloc(int * const restrict a, int n) {
+ int m = INT_MIN, nm = 0;
+ for (int i=0; i<n; i++) {
+ if (a[i] > m) {
+ m = a[i];
+ nm = i;
+ }
+ }
return nm;
}
```
<https://github.com/jvdd/argminmax/blob/main/src/simd/simd_u64.rs>
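
For contrast with the sequential reference, the leaf-node approach described in the introduction can be sketched in plain C. This is only an illustration under stated assumptions: `maxloc_range` and `maxloc_chunked` are hypothetical names, the chunk size stands in for a SIMD width, and the per-chunk step (which real SIMD code would run lane-parallel) is written here as an ordinary loop.

```c
/* Hypothetical helper: index of the first maximum in a[lo..hi), hi > lo. */
static int maxloc_range(const int *a, int lo, int hi) {
    int best = lo;
    for (int i = lo + 1; i < hi; i++) {
        if (a[i] > a[best]) {
            best = i;
        }
    }
    return best;
}

/* Leaf step per chunk, then merge of the per-chunk candidates.
 * Returns 0 for an empty array, matching the reference above. */
int maxloc_chunked(const int *a, int n, int chunk) {
    if (n <= 0 || chunk <= 0) {
        return 0;
    }
    int nm = 0;
    int lo = 0;
    while (lo < n) {
        /* Clamp the last chunk; written to avoid overflow of lo + chunk. */
        int hi = (n - lo < chunk) ? n : lo + chunk;
        int cand = maxloc_range(a, lo, hi);   /* leaf-node result */
        if (a[cand] > a[nm]) {                /* strict > keeps the earliest index */
            nm = cand;
        }
        lo = hi;
    }
    return nm;
}
```

The merge step is the extra bookkeeping that a purely sequential formulation avoids: every leaf must carry both a value and an index, and ties have to be resolved consistently so that the earliest index wins.
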
+# Implementation in SVP64 Assembler
+
+The core algorithm (the inner, in-register part) is below: 11 instructions.
+Loading the data and offsetting the "core" below is relatively
+straightforward: an estimated 6 further instructions plus one
+more branch (the outer loop).
+
+```
+# while (i<n)
+setvl 2,0,4,0,1,1 # set MVL=4, VL=MIN(MVL,CTR)
+# while (i<n and a[i]<=m) : i += 1
+sv.cmp/ff=gt/m=ge *0,0,*10,4 # truncates VL to min
+sv.creqv *16,*16,*16 # set mask on already-tested
+setvl 2,0,4,0,1,1 # set MVL=4, VL=MIN(MVL,CTR)
+mtcrf 128, 0 # clear CR0 (in case VL=0?)
+# while (i<n and a[i]>m):
+sv.minmax./ff=le/m=ge 4,*10,4,1 # uses r4 as accumulator
+crternlogi 0,1,2,127 # test greater/equal or VL=0
+sv.crand *19,*16,0 # clear if CR0.eq=0
+# nm = i (count masked bits. could use crweirds here TODO)
+sv.svstep/mr/m=so 1, 0, 6, 1 # svstep: get vector dststep
+sv.creqv *16,*16,*16 # set mask on already-tested
+bc 12,0, -0x40 # CR0 lt bit clear, branch back
+```
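
The `# while` comments in the listing suggest the sequential loop has been restructured into two alternating phases: skip elements that do not beat the current maximum, then consume a run of elements that each beat it. A minimal C sketch of that control flow follows. It is a reading of the comments only, `maxloc_two_phase` is a hypothetical name, and it makes no claim about what each SVP64 instruction does.

```c
#include <limits.h>

/* Two-phase restatement of the reference maxloc loop (illustrative only). */
int maxloc_two_phase(const int *a, int n) {
    int m = INT_MIN;  /* current maximum */
    int nm = 0;       /* index of current maximum */
    int i = 0;
    while (i < n) {
        /* Phase 1: skip elements that are not greater than the maximum. */
        while (i < n && a[i] <= m) {
            i++;
        }
        /* Phase 2: consume a run of elements that each beat the maximum. */
        while (i < n && a[i] > m) {
            m = a[i];
            nm = i;
            i++;
        }
    }
    return nm;
}
```

For any input this returns the same index as the single-loop reference, since each element is still compared against the running maximum exactly once and in order; only the grouping of the comparisons changes.
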
+
[[!tag svp64_cookbook ]]