Shape is 32 bits. When SHAPE is set entirely to zeros, remapping is disabled: the register's elements are a linear (1D) vector.

| 31..30 | 29..28  | 27..24 | 23..21 | 20..18   | 17..12 | 11..6  | 5..0   |
| ------ | ------- | ------ | ------ | -------- | ------ | ------ | ------ |
| 0b00   | skip    | offset | invxyz | permute  | zdimsz | ydimsz | xdimsz |
| 0b01   | submode | offset | invxyz | submode2 | rsvd   | rsvd   | xdimsz |

mode (bits 31..30) sets different behaviours (straight matrix multiply, FFT, DCT).

* **mode=0b00** sets straight Matrix Mode
* **mode=0b01** sets "FFT/DCT" mode and activates submodes

When submode2 is 0, for FFT submode the following schedules may be selected:

* **submode=0b00** selects the ``j`` offset of the innermost for-loop of Cooley-Tukey
* **submode=0b10** selects the ``j+halfsize`` offset of the innermost for-loop of Cooley-Tukey
* **submode=0b11** selects the ``k`` of exptable (which coefficient)

When submode2 is 1 or 2, for DCT inner butterfly submode the following schedules may be selected. When submode2 is 1, additional bit-reversing is also performed.

* **submode=0b00** selects the ``j`` offset of the innermost for-loop, in-place
* **submode=0b01** selects the ``j+halfsize`` offset of the innermost for-loop, in reverse-order, in-place
* **submode=0b10** selects the ``ci`` count of the innermost for-loop, useful for calculating the cosine coefficient
* **submode=0b11** selects the ``size`` offset of the outermost for-loop, useful for the cosine coefficient ``cos((ci + 0.5) * pi / size)``

When submode2 is 3 or 4, for DCT outer butterfly submode the following schedules may be selected. When submode2 is 3, additional bit-reversing is also performed.

* **submode=0b00** selects the ``j`` offset of the innermost for-loop
* **submode=0b01** selects the ``j+1`` offset of the innermost for-loop

In Matrix Mode, skip allows dimensions to be skipped from being included in the resultant output index.
This allows sequences to be repeated: ```0 0 0 1 1 1 2 2 2 ...``` or, in the case of skip=0b11, this results in modulo ```0 1 2 0 1 2 ...```

* **skip=0b00** indicates no dimensions to be skipped
* **skip=0b01** sets "skip 1st dimension"
* **skip=0b10** sets "skip 2nd dimension"
* **skip=0b11** sets "skip 3rd dimension"

invxyz will invert the start index of each of x, y or z. If invxyz[0] is zero then x-dimensional counting begins from 0 and increments, otherwise it begins from xdimsz-1 and iterates down to zero. Likewise for y and z.

offset will have the effect of offsetting the result by ```offset``` elements:

    for i in 0..VL-1:
        GPR(RT + remap(i) + SVSHAPE.offset) = ....

This appears redundant, because the register RT could simply be changed by a compiler, until element width overrides are introduced. Also bear in mind that, unlike a value fixed by a static compiler, SVSHAPE.offset may be set dynamically at runtime.

xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates that the array dimensionality for that dimension is 1. Any dimension not intended to be used must have its value set to 0 (dimensionality of 1). A value of xdimsz=2 would indicate that in the first dimension there are 3 elements in the array. For example, to create a 2D array X,Y of dimensionality X=3 and Y=2, set xdimsz=2, ydimsz=1 and zdimsz=0.

The format of the array is therefore as follows:

    array[xdimsz+1][ydimsz+1][zdimsz+1]

However, whilst illustrative of the dimensionality, that does not take the "permute" setting into account. "permute" may be any one of six values (0-5, with values of 6 and 7 being reserved, and not legal).
The table below shows how the permutation dimensionality order works:

| permute | order | array format             |
| ------- | ----- | ------------------------ |
| 000     | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
| 001     | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
| 010     | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
| 011     | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
| 100     | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
| 101     | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |

In other words, the "permute" option changes the order in which nested for-loops over the array would be done. See executable python reference code for further details.
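
As a rough illustration of the Matrix Mode field layout, the offset-by-1 dimension sizes and the permute table, the following sketch splits a 32-bit SVSHAPE value into its fields. It is not the reference code mentioned above. The names `bits`, `decode_matrix_svshape` and `PERMUTE_ORDER` are hypothetical, and the sketch assumes that the bit ranges in the field table count from the least-significant bit (bit 0 is the LSB). invxyz is returned as a raw 3-bit value because the text does not state which bit corresponds to invxyz[0].

```python
# Illustrative sketch only, not the reference implementation.
# Assumption: bit 0 in the field table is the least-significant bit.

# permute value -> dimension order, copied from the permute table
PERMUTE_ORDER = {
    0b000: (0, 1, 2),
    0b001: (0, 2, 1),
    0b010: (1, 0, 2),
    0b011: (1, 2, 0),
    0b100: (2, 0, 1),
    0b101: (2, 1, 0),
}


def bits(value, hi, lo):
    """Extract bits hi..lo (inclusive) of value, with bit 0 as the LSB."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_matrix_svshape(shape):
    """Hypothetical helper: split a Matrix Mode SVSHAPE into its fields."""
    if bits(shape, 31, 30) != 0b00:
        raise ValueError("not Matrix Mode (mode != 0b00)")
    permute = bits(shape, 20, 18)
    if permute not in PERMUTE_ORDER:
        raise ValueError("permute values 6 and 7 are reserved")
    # stored sizes are offset by 1: a field value of 0 means a size of 1
    dims = (
        bits(shape, 5, 0) + 1,    # x, from xdimsz
        bits(shape, 11, 6) + 1,   # y, from ydimsz
        bits(shape, 17, 12) + 1,  # z, from zdimsz
    )
    order = PERMUTE_ORDER[permute]
    return {
        "skip": bits(shape, 29, 28),
        "offset": bits(shape, 27, 24),
        "invxyz": bits(shape, 23, 21),  # raw 3-bit value, not split per axis
        "permute": permute,
        "dims": dims,
        "order": order,
        # "array format" column of the permute table, as concrete sizes
        "array_format": tuple(dims[d] for d in order),
    }


# The X=3, Y=2 example from the text (xdimsz=2, ydimsz=1, zdimsz=0),
# combined here with permute=0b010 (order 1,0,2) purely for illustration.
example = (0b010 << 18) | (0 << 12) | (1 << 6) | 2
fields = decode_matrix_svshape(example)
print(fields["dims"], fields["order"], fields["array_format"])
```

With these inputs the decoded sizes are X=3, Y=2, Z=1, and the permuted array format follows the `(ydim+1)(xdim+1)(zdim+1)` row of the table. An all-zeros SHAPE decodes to three dimensions of size 1, consistent with remapping being disabled.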