(no commit message)

author lkcl <lkcl@web>

Mon, 7 Oct 2019 14:47:35 +0000 (15:47 +0100)

committer IkiWiki <ikiwiki.info>

Mon, 7 Oct 2019 14:47:35 +0000 (15:47 +0100)
diff --git a/simple_v_extension/remap.mdwn b/simple_v_extension/remap.mdwn

index a9558db057c020616d4fdc3ac16bc90d483d4c16..f9dafa572ed06ec4d00506f20cab940873d1f2bf 100644
--- a/simple_v_extension/remap.mdwn
+++ b/simple_v_extension/remap.mdwn
@@ -37,14 +37,16 @@ There are three "shape" CSRs, SHAPE0, SHAPE1, SHAPE2, 32-bits in each,
  which have the same format.  When each SHAPE CSR is set entirely to zeros,
  remapping is disabled: the register's elements are a linear (1D) vector.
  
-| 29..24 | 23..21  | 20..18  | 17..12  | 11..6   | 5..0    |
-| ------ | ------- | ------- | ------- | -------- | ------- |
-| modulo | invxyz | permute | zdimsz  | ydimsz  | xdimsz  |
+| 31..30   | 29..24 | 23..21 | 20..18  | 17..12 | 11..6  | 5..0   |
+| -------- | ------ | ------ | ------- | ------ | ------ | ------ |
+| applydim | modulo | invxyz | permute | zdimsz | ydimsz | xdimsz |
  
-modulo will cause the output to wrap and remain within the range 0 to modulo. The value zero disables modulus application.
+applydim sets to zero the index of every dimension numbered below it (x=0, y=1, z=2). applydim=0 applies all three dimensions. applydim=1 applies only y and z. applydim=2 applies only z. applydim=3 is reserved.
  
  invxyz will invert the start index of each of x, y or z. If invxyz[0] is zero then x-dimensional counting begins from 0 and increments, otherwise it begins from xdimsz-1 and iterates down to zero. Likewise for y and z.
  
+modulo will cause the output to wrap and remain within the range 0 to modulo. The value zero disables modulus application. Note that modulo arithmetic is applied after all other remapping calculations.
+
  xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates
  that the array dimensionality for that dimension is 1.  A value of xdimsz=2
  would indicate that in the first dimension there are 3 elements in the
@@ -79,12 +81,14 @@ shows this more clearly, and may be executed as a python program:
      idxs = [0,0,0] # starting indices
      order = [1,0,2] # experiment with different permutations, here
      modulo = 64     # experiment with different modulus, here
+    applydim=0
      invxyz = [0,0,0] 
  
      for idx in range(xdim * ydim * zdim):
          ix = [0] * 3
          for i in range(3):
-            ix[i] = idxs[i]
+            if i >= applydim:
+                ix[i] = idxs[i]
              if invxyz[i]:
                  ix[i] = lims[i] - ix[i]
          new_idx = ix[0] + ix[1] * xdim + ix[2] * xdim * ydim
@@ -198,4 +202,4 @@ At the same time, VL will, because there is no SHAPE on f8, increment straight s
  
  The only other instruction required is to ensure that f4-f7 are initialised (usually to zero).
  
-It should be clear that a 4x4 by 4x4 Matrix Multiply, being effectively the same technique applied to four independent vectors, can be done by setting VL=64, using an extra dimension on the SHAPE CSRs and applying a rotating SHAPE CSR to f8 in order to get it to apply four times to compute the four columns worth of vectors.
+It should be clear that a 4x4 by 4x4 Matrix Multiply, being effectively the same technique applied to four independent vectors, can be done by setting VL=64, using an extra dimension on the SHAPE0 and SHAPE1 CSRs, and applying a rotating 1D SHAPE CSR of xdim=16 to f8 in order to get it to apply four times to compute the four columns worth of vectors.
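
The field layout and remapping loop touched by this diff can be tried out with the following minimal sketch. It is not taken from remap.mdwn: the function names `decode_shape` and `remap_indices` are hypothetical, and several details are assumptions rather than confirmed behaviour. In particular, it assumes invxyz[0] maps to bit 21, that inversion counts down from dim-1 to 0 and is skipped for dimensions zeroed by applydim, that `order` lists the dimensions from fastest to slowest incrementing, and that a non-zero modulo wraps results into 0..modulo-1.

```python
def decode_shape(csr):
    """Split a 32-bit SHAPE CSR value into fields, per the table in the diff."""
    return {
        # The three size fields are stored offset by 1 (0 means a size of 1).
        "xdim": (csr & 0x3F) + 1,                  # bits 5..0
        "ydim": ((csr >> 6) & 0x3F) + 1,           # bits 11..6
        "zdim": ((csr >> 12) & 0x3F) + 1,          # bits 17..12
        "permute": (csr >> 18) & 0x7,              # bits 20..18 (left undecoded)
        # Assumption: invxyz[0] (x) is the lowest bit of the 23..21 field.
        "invxyz": [(csr >> (21 + i)) & 1 for i in range(3)],
        "modulo": (csr >> 24) & 0x3F,              # bits 29..24
        "applydim": (csr >> 30) & 0x3,             # bits 31..30
    }


def remap_indices(xdim, ydim, zdim, order, invxyz, applydim=0, modulo=0):
    """Return the remapped element offsets for one pass over the shape."""
    lims = [xdim, ydim, zdim]
    idxs = [0, 0, 0]  # running x, y, z counters
    result = []
    for _ in range(xdim * ydim * zdim):
        ix = [0, 0, 0]
        for i in range(3):
            if i >= applydim:  # dimensions below applydim stay at zero
                ix[i] = idxs[i]
                if invxyz[i]:
                    # Assumption: count down from lims[i]-1 to 0.
                    ix[i] = lims[i] - 1 - ix[i]
        new_idx = ix[0] + ix[1] * xdim + ix[2] * xdim * ydim
        if modulo:  # zero disables wrapping; applied after everything else
            new_idx %= modulo
        result.append(new_idx)
        # Advance the counters like an odometer, fastest dimension first.
        for d in order:
            idxs[d] += 1
            if idxs[d] != lims[d]:
                break
            idxs[d] = 0
    return result
```

For example, `remap_indices(3, 4, 1, [1, 0, 2], [0, 0, 0])` walks a 3x4 shape with y incrementing fastest, which corresponds to reading a row-major 2D array in transposed order.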