From: lkcl <lkcl@web>
Date: Fri, 5 Jan 2024 16:13:47 +0000 (+0000)
Subject: (no commit message)
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=90857dced783436245e25e58d7a3b5d93a418b1a;p=libreriscv.git

---

diff --git a/openpower/sv/cookbook/daxpy_example.mdwn b/openpower/sv/cookbook/daxpy_example.mdwn
index fc026cf79..aba10f784 100644
--- a/openpower/sv/cookbook/daxpy_example.mdwn
+++ b/openpower/sv/cookbook/daxpy_example.mdwn
@@ -23,6 +23,24 @@ Summary
 
 # SVP64 Power ISA version
 
+In SVP64 Power ISA assembler the algorithm is deceptively simple and
+straightforward, even though it is easy for hardware to parallelise.
+There are, however, some key additions over Standard Scalar (SFFS Subset)
+Power ISA 3.0 that need explaining.
+
+```
+# r5: n count; r6: x ptr; r7: y ptr; fp1: a
+1  mtctr 5                # move n to CTR
+2  .L2
+3    setvl MAXVL=32,VL=CTR  # actually VL=MIN(MAXVL,CTR)
+4    sv.lfdup   *32,8(6)    # load x into fp32-63, incr x
+5    sv.lfd/els *64,8(7)    # load y into fp64-95, NO INC
+6    sv.fmadd *64,*32,1,*64 # (*y) = (*x) * a + (*y)
+7    sv.stfdup  *64,8(7)    # store at y, post-incr y
+8    sv.bc/ctr .L2          # decr CTR by VL, jump !zero
+9    blr                    # return
+```
+
 The first instruction is simple: the plan is to use CTR for looping.
 Therefore, copy n (r5) into CTR. Next however, at the start of
 the loop (L2) is not so obvious: MAXVL is being set to 32
@@ -88,19 +106,6 @@ since its inception: we propose in SVP64 to add "Decrement CTR by VL".
 The end result is an exceptionally compact daxpy that is easy to read
 and understand.
 
-```
-# r5: n count; r6: x ptr; r7: y ptr; fp1: a
-1  mtctr 5                # move n to CTR
-2  .L2
-3    setvl MAXVL=32,VL=CTR  # actually VL=MIN(MAXVL,CTR)
-4    sv.lfdup   *32,8(6)    # load x into fp32-63, incr x
-5    sv.lfd/els *64,8(7)    # load y into fp64-95, NO INC
-6    sv.fmadd *64,*64,1,*32 # (*y) = (*y) * (*x) + a
-7    sv.stfdup  *64,8(7)    # store at y, post-incr y
-8    sv.bc/ctr .L2          # decr CTR by VL, jump !zero
-9    blr                    # return
-```
-
 # RVV version