--- /dev/null
+SHAPE is 32 bits wide. When SHAPE is set entirely to zeros, remapping is
+disabled: the register's elements are a linear (1D) vector.
+
+| 31..30   | 29..24 | 23..21 | 20..18  | 17..12 | 11..6  | 5..0   |
+| -------- | ------ | ------ | ------- | ------ | ------ | ------ |
+| applydim | modulo | invxyz | permute | zdimsz | ydimsz | xdimsz |
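+
+As an illustration of the layout above, the following is a minimal sketch
+of splitting a 32-bit SHAPE value into its fields. It assumes that bit 0
+is the least significant bit of the value; the names `Shape` and
+`decode_shape` are hypothetical and are not part of the specification.
+
+```python
+from typing import NamedTuple
+
+
+class Shape(NamedTuple):
+    """Raw SHAPE fields (hypothetical container, not from the spec)."""
+    applydim: int
+    modulo: int
+    invxyz: int
+    permute: int
+    zdimsz: int
+    ydimsz: int
+    xdimsz: int
+
+
+def decode_shape(value: int) -> Shape:
+    """Split a 32-bit SHAPE value into fields, assuming bit 0 is the LSB."""
+    if not 0 <= value < (1 << 32):
+        raise ValueError("SHAPE must fit in 32 bits")
+    return Shape(
+        applydim=(value >> 30) & 0x3,   # bits 31..30
+        modulo=(value >> 24) & 0x3F,    # bits 29..24
+        invxyz=(value >> 21) & 0x7,     # bits 23..21
+        permute=(value >> 18) & 0x7,    # bits 20..18
+        zdimsz=(value >> 12) & 0x3F,    # bits 17..12
+        ydimsz=(value >> 6) & 0x3F,     # bits 11..6
+        xdimsz=value & 0x3F,            # bits 5..0
+    )
+```
+
+The dimension fields are returned as stored, without adding the offset
+of 1.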
+
+applydim sets to zero the dimensions numbered lower than its value (x is 0, y is 1, z is 2). applydim=0 applies all three. applydim=1 applies y and z. applydim=2 applies only z. applydim=3 is reserved.
+
+invxyz inverts the counting direction of each of x, y and z independently. If invxyz[0] is zero then x-dimensional counting begins from 0 and increments; otherwise it begins from the highest x index (the number of x elements minus one) and counts down to zero. Likewise for y and z.
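+
+The effect on the x dimension can be sketched as follows. This is only
+an illustration: it assumes that invxyz[0] is the least significant bit
+of the field and that `xdimsz` is the raw field value (the number of
+elements minus one). The function name `x_indices` is hypothetical.
+
+```python
+from typing import List
+
+
+def x_indices(xdimsz: int, invxyz: int) -> List[int]:
+    """Return the order in which x indices are visited."""
+    size = xdimsz + 1                # the field is stored offset by 1
+    forward = list(range(size))      # 0, 1, ..., size-1
+    # Assumption: invxyz[0] is the least significant bit and controls x.
+    return forward[::-1] if invxyz & 0b001 else forward
+```
+
+With `xdimsz=2`, `x_indices(2, 0)` gives `[0, 1, 2]` and
+`x_indices(2, 1)` gives `[2, 1, 0]`.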
+
+modulo causes the output to wrap so that it remains within the range 0 to modulo. A value of zero disables the modulus. Note that modulo arithmetic is applied after all other remapping calculations.
+
+xdimsz, ydimsz and zdimsz are each stored offset by 1, so that a value
+of 0 indicates that the size of that dimension is 1. For example, a
+value of xdimsz=2 indicates that there are 3 elements in the first
+dimension of the array. The format of the array is therefore as follows:
+
+    array[xdim+1][ydim+1][zdim+1]
+
+However, whilst that illustrates the dimensionality, it does not take
+the "permute" setting into account. "permute" may take any one of six
+values (0-5; values 6 and 7 are reserved and not legal). The table
+below shows how the permutation dimensionality order works:
+
+| permute | order | array format |
+| ------- | ----- | ------------------------ |
+| 000 | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
+| 001 | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
+| 010 | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
+| 011 | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
+| 100 | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
+| 101 | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |
+
+In other words, the "permute" option changes the order in which
+nested for-loops over the array would be done.
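+
+The permute table can be expressed as a simple lookup. The sketch below
+returns the dimension sizes in the order given by the "array format"
+column; the names `PERMUTE_ORDER` and `permuted_sizes` are hypothetical,
+and the arguments are assumed to be the raw (offset-by-1) field values.
+It makes no claim about which dimension a hardware implementation
+iterates over fastest.
+
+```python
+from typing import Tuple
+
+# permute value -> dimension order (0 = x, 1 = y, 2 = z), as in the table
+PERMUTE_ORDER = {
+    0b000: (0, 1, 2),
+    0b001: (0, 2, 1),
+    0b010: (1, 0, 2),
+    0b011: (1, 2, 0),
+    0b100: (2, 0, 1),
+    0b101: (2, 1, 0),
+}
+
+
+def permuted_sizes(permute: int, xdimsz: int, ydimsz: int,
+                   zdimsz: int) -> Tuple[int, ...]:
+    """Return the three dimension sizes in permuted order."""
+    if permute not in PERMUTE_ORDER:
+        raise ValueError("permute values 6 and 7 are reserved")
+    sizes = (xdimsz + 1, ydimsz + 1, zdimsz + 1)  # undo the offset of 1
+    return tuple(sizes[dim] for dim in PERMUTE_ORDER[permute])
+```
+
+For example, with xdimsz=1, ydimsz=2 and zdimsz=3, a permute value of
+011 (order 1,2,0) yields sizes of 3, 4 and 2.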
+