SHAPE is 32 bits wide. When SHAPE is set entirely to zeros, remapping is
disabled: the register's elements are a linear (1D) vector.

| 31..30 | 29..28  | 27..24 | 23..21 | 20..18   | 17..12 | 11..6  | 5..0   |
| ------ | ------- | ------ | ------ | -------- | ------ | ------ | ------ |
| 0b00   | skip    | offset | invxyz | permute  | zdimsz | ydimsz | xdimsz |
| 0b01   | submode | offset | invxyz | submode2 | rsvd   | rsvd   | xdimsz |

The mode field (bits 31..30) selects between different behaviours
(straight matrix multiply, FFT, DCT).

* **mode=0b00** sets straight Matrix Mode
* **mode=0b01** sets "FFT/DCT" mode and activates submodes
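
The field layout in the table can be unpacked with a few shifts and masks. The following is a minimal sketch, not the reference implementation: the function name `decode_shape` and the dictionary keys are made up for illustration, and it assumes the bit ranges in the table header are numbered with bit 0 as the least significant bit.

```python
def decode_shape(shape):
    """Split a 32-bit SHAPE value into the fields listed in the table.

    Assumption: bit 0 is the least significant bit of the value.
    """
    if not 0 <= shape < (1 << 32):
        raise ValueError("SHAPE must fit in 32 bits")

    def bits(hi, lo):
        # extract the inclusive bit range hi..lo
        return (shape >> lo) & ((1 << (hi - lo + 1)) - 1)

    mode = bits(31, 30)
    if mode not in (0b00, 0b01):
        # only the two rows shown in the table are handled here
        raise ValueError("mode not covered by this sketch")

    # fields that sit in the same place in both rows of the table
    fields = {
        "mode": mode,
        "offset": bits(27, 24),
        "invxyz": bits(23, 21),
        "xdimsz": bits(5, 0),
    }
    if mode == 0b00:
        # Matrix Mode row
        fields["skip"] = bits(29, 28)
        fields["permute"] = bits(20, 18)
        fields["zdimsz"] = bits(17, 12)
        fields["ydimsz"] = bits(11, 6)
    else:
        # FFT/DCT row: bits 17..6 are reserved and not returned
        fields["submode"] = bits(29, 28)
        fields["submode2"] = bits(20, 18)
    return fields
```

Returning a different set of keys per mode mirrors the two table rows; a caller that only cares about Matrix Mode could check `fields["mode"] == 0` first.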

When submode2 is 0, for FFT submode the following schedules may be selected:

* **submode=0b00** selects the ``j`` offset of the innermost for-loop
  of Cooley-Tukey
* **submode=0b10** selects the ``j+halfsize`` offset of the innermost for-loop
  of Cooley-Tukey
* **submode=0b11** selects the ``k`` of exptable (which coefficient)
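
To make the three FFT schedules concrete, the sketch below walks the loop nest of a textbook iterative radix-2 Cooley-Tukey FFT and records the `j`, `j+halfsize` and `k` values for every inner-loop iteration. It is an illustration only: the function name `fft_schedules` is hypothetical, the loop structure is the commonly published one rather than something stated on this page, and neither bit-reversal nor SHAPE decoding is modelled.

```python
def fft_schedules(n):
    """Return the (j, j+halfsize, k) index streams for an n-point FFT.

    Assumption: n is a power of two and the loops follow the usual
    iterative radix-2 Cooley-Tukey butterfly structure.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    j_sched, jh_sched, k_sched = [], [], []
    size = 2
    while size <= n:                          # outermost loop: butterfly size
        halfsize = size // 2
        tablestep = n // size                 # stride through the exptable
        for i in range(0, n, size):           # middle loop: one block at a time
            k = 0
            for j in range(i, i + halfsize):  # innermost loop
                j_sched.append(j)              # compare submode=0b00
                jh_sched.append(j + halfsize)  # compare submode=0b10
                k_sched.append(k)              # compare submode=0b11
                k += tablestep
        size *= 2
    return j_sched, jh_sched, k_sched
```

For `n = 8` the first four entries are `j = 0 2 4 6`, `j+halfsize = 1 3 5 7` and `k = 0 0 0 0`, that is, the four size-2 butterflies of the first pass.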

When submode2 is 1 or 2, for DCT inner butterfly submode the following
schedules may be selected. When submode2 is 1, additional bit-reversing
is also performed.

* **submode=0b00** selects the ``j`` offset of the innermost for-loop,
  in-place
* **submode=0b01** selects the ``j+halfsize`` offset of the innermost for-loop,
  in reverse-order, in-place
* **submode=0b10** selects the ``ci`` count of the innermost for-loop,
  useful for calculating the cosine coefficient
* **submode=0b11** selects the ``size`` offset of the outermost for-loop,
  useful for the cosine coefficient ``cos((ci + 0.5) * pi / size)``

When submode2 is 3 or 4, for DCT outer butterfly submode the following
schedules may be selected. When submode2 is 3, additional bit-reversing
is also performed.

* **submode=0b00** selects the ``j`` offset of the innermost for-loop
* **submode=0b01** selects the ``j+1`` offset of the innermost for-loop

In Matrix Mode, skip allows a dimension to be left out of the
resultant output index. This allows sequences to be repeated, as
in ```0 0 0 1 1 1 2 2 2 ...``` or, in the case of skip=0b11, this
results in modulo ```0 1 2 0 1 2 ...```

* **skip=0b00** indicates no dimensions to be skipped
* **skip=0b01** sets "skip 1st dimension"
* **skip=0b10** sets "skip 2nd dimension"
* **skip=0b11** sets "skip 3rd dimension"

invxyz inverts the start index of each of x, y or z. If invxyz[0] is
zero then x-dimensional counting begins from 0 and increments, otherwise
it begins from the highest x index (the x dimension size minus 1) and
iterates down to zero. Likewise for y and z.

offset has the effect of offsetting the result by ```offset``` elements:

    for i in 0..VL-1:
        GPR(RT + remap(i) + SVSHAPE.offset) = ....

This appears redundant, because the register RT could simply be changed
by a compiler, until element width overrides are introduced. Also
bear in mind that, unlike a register number fixed by a static compiler,
SVSHAPE.offset may be set dynamically at runtime.
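
One way to see why offset stops being redundant: once several elements share a register, an element offset need not be a whole number of registers. The sketch below is an illustration resting on assumptions that are not stated on this page: it supposes 64-bit registers with elements packed contiguously from register `rt` upwards, and the helper name `element_location` is hypothetical.

```python
def element_location(rt, element_index, elwidth_bits, reg_bits=64):
    """Map an element index to (register number, lane within register).

    Assumption: elements are packed contiguously, reg_bits // elwidth_bits
    per register, starting at register rt.
    """
    if elwidth_bits <= 0 or reg_bits % elwidth_bits:
        raise ValueError("elwidth_bits must be positive and divide reg_bits")
    per_reg = reg_bits // elwidth_bits
    return rt + element_index // per_reg, element_index % per_reg
```

Under these assumptions, an offset of 3 with 64-bit elements is the same as starting at register `rt + 3`, but with 16-bit elements it lands on lane 3 of register `rt`, which no change of register number can express.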

xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates
that the array dimensionality for that dimension is 1. Any dimension
not intended to be used must have its value set to 0 (dimensionality
of 1). A value of xdimsz=2 would indicate that in the first dimension
there are 3 elements in the array. For example, to create a 2D array
X,Y of dimensionality X=3 and Y=2, set xdimsz=2, ydimsz=1 and zdimsz=0.
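
The off-by-one encoding is easy to get wrong by hand, so a small helper can do the conversion. This is a sketch only: the name `dimsz_fields` is hypothetical, and the upper limit of 64 per dimension is inferred from the 6-bit field widths in the table above.

```python
def dimsz_fields(x=1, y=1, z=1):
    """Convert array dimensions to (xdimsz, ydimsz, zdimsz) field values.

    Unused dimensions default to a size of 1, which encodes as 0.
    """
    for name, n in (("x", x), ("y", y), ("z", z)):
        if not 1 <= n <= 64:
            raise ValueError(f"{name} dimension must be in 1..64, got {n}")
    return x - 1, y - 1, z - 1
```

Calling `dimsz_fields(3, 2)` gives `(2, 1, 0)`, matching the X=3, Y=2 example in the text.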

The format of the array is therefore as follows:

    array[xdimsz+1][ydimsz+1][zdimsz+1]

However, whilst illustrative of the dimensionality, that does not take the
"permute" setting into account. "permute" may be any one of six values
(0-5, with values of 6 and 7 being reserved, and not legal). The table
below shows how the permutation dimensionality order works:

| permute | order | array format                   |
| ------- | ----- | ------------------------------ |
| 000     | 0,1,2 | (xdimsz+1)(ydimsz+1)(zdimsz+1) |
| 001     | 0,2,1 | (xdimsz+1)(zdimsz+1)(ydimsz+1) |
| 010     | 1,0,2 | (ydimsz+1)(xdimsz+1)(zdimsz+1) |
| 011     | 1,2,0 | (ydimsz+1)(zdimsz+1)(xdimsz+1) |
| 100     | 2,0,1 | (zdimsz+1)(xdimsz+1)(ydimsz+1) |
| 101     | 2,1,0 | (zdimsz+1)(ydimsz+1)(xdimsz+1) |

In other words, the "permute" option changes the order in which
nested for-loops over the array would be done. See the executable
Python reference code for further details.
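
As a rough illustration of how the Matrix Mode fields might combine, the generator below produces one pass of remapped element indices. It is a sketch under stated assumptions and is not the reference code: the name `matrix_indices` is hypothetical; the loops always run with x innermost and "permute" chooses which dimension is weighted first when building the index; "skip Nth dimension" is read as the Nth dimension in permuted order; and `invxyz` bit 0 is taken to be its least significant bit.

```python
# permute value -> dimension order (0 = x, 1 = y, 2 = z), as in the table
PERMUTE_ORDER = {
    0b000: (0, 1, 2),
    0b001: (0, 2, 1),
    0b010: (1, 0, 2),
    0b011: (1, 2, 0),
    0b100: (2, 0, 1),
    0b101: (2, 1, 0),
}


def matrix_indices(xdimsz, ydimsz=0, zdimsz=0,
                   permute=0, skip=0, invxyz=0, offset=0):
    """Yield one pass of remapped indices (a sketch, see assumptions above)."""
    if permute not in PERMUTE_ORDER:
        raise ValueError("permute values 6 and 7 are reserved")
    # the fields are stored offset by 1, so add 1 to get real sizes
    lims = (xdimsz + 1, ydimsz + 1, zdimsz + 1)
    # per-dimension counting sequences, reversed where invxyz asks for it
    seqs = []
    for dim, lim in enumerate(lims):
        seq = list(range(lim))
        if (invxyz >> dim) & 1:
            seq.reverse()
        seqs.append(seq)
    order = PERMUTE_ORDER[permute]
    for z in seqs[2]:
        for y in seqs[1]:
            for x in seqs[0]:
                idx = (x, y, z)
                result = 0
                mult = 1
                for position, dim in enumerate(order):
                    if skip == position + 1:
                        continue          # this dimension is left out
                    result += idx[dim] * mult
                    mult *= lims[dim]     # weight for the next dimension
                yield result + offset
```

With these assumptions, `list(matrix_indices(2, 2, skip=0b01))` gives the repeated sequence `0 0 0 1 1 1 2 2 2`, and `list(matrix_indices(2, 0, 1, skip=0b11))` gives the modulo sequence `0 1 2 0 1 2`, in line with the description of skip above.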