From: lkcl <lkcl@web>
Date: Wed, 6 Dec 2023 15:13:25 +0000 (+0000)
Subject: (no commit message)
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=b4057eada8e7a0cd34f9cc8bf4fa14ea73b82cc5;p=libreriscv.git

---

diff --git a/openpower/sv/cookbook/pospopcnt.mdwn b/openpower/sv/cookbook/pospopcnt.mdwn
index bfedde880..516db6d2c 100644
--- a/openpower/sv/cookbook/pospopcnt.mdwn
+++ b/openpower/sv/cookbook/pospopcnt.mdwn
@@ -69,11 +69,6 @@ bit-position, of an array of input values. Refer to Fig.2
      alt="pospopcnt" width="60%" />
 
 
-
-
-<br />
-
-
 # Visual representation of the pospopcount algorithm
 In order to perform positional popcount we need to go 
 through series of steps shown below in figures 3, 4, 5 & 6.
@@ -107,12 +102,17 @@ Fig.6 depicts how each of the intermediate results are
 accumulated. It is worth noting that each intermediate result 
 is independent of the other intermediate results and also
 parallel reduction can be applied to all of them
-individually. This gives two opportunities for parallelism.
+individually. This gives *two* opportunities for
+hardware parallelism rather than one.
 
 <img src="/openpower/sv/cookbook/ParallelAccumulate.drawio.svg"
      alt="pospopcnt" width="100%" />
 
-
+In short, this algorithm is straightforward to implement thanks to two
+crucial instructions, `gbbd` and `popcntd`.  Below is a walkthrough of the
+assembler, kept deliberately simple: it exploits only one of the two
+opportunities for parallelism and leaves out the Parallel Reduction
+mentioned above.
 
 # Walkthrough of the assembler