From a08d3ef7d088dbad318c95a84544602fbbdbe350 Mon Sep 17 00:00:00 2001
From: Luke Kenneth Casson Leighton
Date: Tue, 6 Feb 2024 14:15:56 +0000
Subject: [PATCH] bug 676: notes on maxloc algorithm, add python version for clarity

---
 openpower/sv/cookbook/fortran_maxloc.mdwn | 41 +++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/openpower/sv/cookbook/fortran_maxloc.mdwn b/openpower/sv/cookbook/fortran_maxloc.mdwn
index 367d0d12a..e9bb553d2 100644
--- a/openpower/sv/cookbook/fortran_maxloc.mdwn
+++ b/openpower/sv/cookbook/fortran_maxloc.mdwn
@@ -89,6 +89,26 @@ search seems to be a common technique.
+**Python implementation**
+
+A variant on the c reference implementation allows for skipping
+of updating m (the largest value found so far), followed by
+always updating m whilst a batch of numbers is found that are
+(in their order of sequence) always continuously greater than
+all previous numbers. The algorithm therefore flips back and
+forth between "skip whilst smaller" and "update whilst bigger",
+only updating the index during "bigger" sequences. This is key
+later when doing SVP64 assembler.
+
+```
+def m2(a): # array a
+    m, nm, i, n = 0, 0, 0, len(a)
+    while i<n:
+        while i<n and a[i]<=m: i += 1                   # skip whilst smaller
+        while i<n and a[i]> m: m, nm, i = a[i], i, i+1  # update whilst bigger
+    return nm;
+```
+
 # Implementation in SVP64 Assembler
 
 The core algorithm (inner part, in-register) is below: 11 instructions.
@@ -114,6 +134,27 @@ sv.creqv *16,*16,*16 # set mask on already-tested
 bc 12,0, -0x40 # CR0 lt bit clear, branch back
 ```
+`sv.cmp` can be used in the first while loop because m (r4, the current
+largest-found value so far) does not change.
+However `sv.minmax.` has to be used in the key while loop
+because r4 is sequentially replaced each time, and mapreduce (`/mr`)
+is used to request this rolling-chain (accumulator) mode.
+
+Also note that `i` (the `"while i