From: Luke Kenneth Casson Leighton
Date: Tue, 25 Jun 2019 13:21:33 +0000 (+0100)
Subject: move remap to separate page
X-Git-Tag: convert-csv-opcode-to-binary~4431
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=afaf61e6412084601dbe06f3519c01561c748abe;p=libreriscv.git

move remap to separate page
---

diff --git a/simple_v_extension/remap.mdwn b/simple_v_extension/remap.mdwn
new file mode 100644
index 000000000..0cab37962
--- /dev/null
+++ b/simple_v_extension/remap.mdwn
@@ -0,0 +1,161 @@
+# NOTE
+
+This section is under revision (and is optional)
+
+# REMAP CSR
+
+(Note: both the REMAP and SHAPE sections are best read after the
+ rest of the document has been read)
+
+There is one 32-bit CSR which may be used to indicate which registers,
+if used in any operation, must be "reshaped" (re-mapped) from a linear
+form to a 2D or 3D transposed form, or "offset" to permit arbitrary
+access to elements within a register.
+
+The 32-bit REMAP CSR may reshape up to 3 registers:
+
+| 29..28 | 27..26 | 25..24 | 23 | 22..16  | 15 | 14..8   | 7  | 6..0    |
+| ------ | ------ | ------ | -- | ------- | -- | ------- | -- | ------- |
+| shape2 | shape1 | shape0 | 0  | regidx2 | 0  | regidx1 | 0  | regidx0 |
+
+regidx0-2 refer not to the Register CSR CAM entry but to the underlying
+*real* register (see regidx, the value) and consequently is 7-bits wide.
+When set to zero (referring to x0), clearly reshaping x0 is pointless,
+so is used to indicate "disabled".
+shape0-2 refers to one of three SHAPE CSRs. A value of 0x3 is reserved.
+Bits 7, 15, 23, 30 and 31 are also reserved, and must be set to zero.
+
+It is anticipated that these specialist CSRs not be very often used.
+Unlike the CSR Register and Predication tables, the REMAP CSRs use
+the full 7-bit regidx so that they can be set once and left alone,
+whilst the CSR Register entries pointing to them are disabled, instead.
+
+# SHAPE 1D/2D/3D vector-matrix remapping CSRs
+
+(Note: both the REMAP and SHAPE sections are best read after the
+ rest of the document has been read)
+
+There are three "shape" CSRs, SHAPE0, SHAPE1, SHAPE2, 32-bits in each,
+which have the same format. When each SHAPE CSR is set entirely to zeros,
+remapping is disabled: the register's elements are a linear (1D) vector.
+
+| 26..24  | 23      | 22..16  | 15      | 14..8   | 7       | 6..0    |
+| ------- | --      | ------- | --      | ------- | --      | ------- |
+| permute | offs[2] | zdimsz  | offs[1] | ydimsz  | offs[0] | xdimsz  |
+
+offs is a 3-bit field, spread out across bits 7, 15 and 23, which
+is added to the element index during the loop calculation.
+
+xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates
+that the array dimensionality for that dimension is 1. A value of xdimsz=2
+would indicate that in the first dimension there are 3 elements in the
+array. The format of the array is therefore as follows:
+
+    array[xdim+1][ydim+1][zdim+1]
+
+However whilst illustrative of the dimensionality, that does not take the
+"permute" setting into account. "permute" may be any one of six values
+(0-5, with values of 6 and 7 being reserved, and not legal). The table
+below shows how the permutation dimensionality order works:
+
+| permute | order | array format             |
+| ------- | ----- | ------------------------ |
+| 000     | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
+| 001     | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
+| 010     | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
+| 011     | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
+| 100     | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
+| 101     | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |
+
+In other words, the "permute" option changes the order in which
+nested for-loops over the array would be done. The algorithm below
+shows this more clearly, and may be executed as a python program:
+
+    # mapidx = REMAP.shape2
+    xdim = 3 # SHAPE[mapidx].xdim_sz+1
+    ydim = 4 # SHAPE[mapidx].ydim_sz+1
+    zdim = 5 # SHAPE[mapidx].zdim_sz+1
+
+    lims = [xdim, ydim, zdim]
+    idxs = [0,0,0] # starting indices
+    order = [1,0,2] # experiment with different permutations, here
+    offs = 0 # experiment with different offsets, here
+
+    for idx in range(xdim * ydim * zdim):
+        new_idx = offs + idxs[0] + idxs[1] * xdim + idxs[2] * xdim * ydim
+        print new_idx,
+        for i in range(3):
+            idxs[order[i]] = idxs[order[i]] + 1
+            if (idxs[order[i]] != lims[order[i]]):
+                break
+            print
+            idxs[order[i]] = 0
+
+Here, it is assumed that this algorithm be run within all pseudo-code
+throughout this document where a (parallelism) for-loop would normally
+run from 0 to VL-1 to refer to contiguous register
+elements; instead, where REMAP indicates to do so, the element index
+is run through the above algorithm to work out the **actual** element
+index, instead. Given that there are three possible SHAPE entries, up to
+three separate registers in any given operation may be simultaneously
+remapped:
+
+    function op_add(rd, rs1, rs2) # add not VADD!
+      ...
+      ...
+      for (i = 0; i < VL; i++)
+        xSTATE.srcoffs = i # save context
+        if (predval & 1<
-
-(Note: both the REMAP and SHAPE sections are best read after the
- rest of the document has been read)
-
-There is one 32-bit CSR which may be used to indicate which registers,
-if used in any operation, must be "reshaped" (re-mapped) from a linear
-form to a 2D or 3D transposed form, or "offset" to permit arbitrary
-access to elements within a register.
-
-The 32-bit REMAP CSR may reshape up to 3 registers:
-
-| 29..28 | 27..26 | 25..24 | 23 | 22..16  | 15 | 14..8   | 7  | 6..0    |
-| ------ | ------ | ------ | -- | ------- | -- | ------- | -- | ------- |
-| shape2 | shape1 | shape0 | 0  | regidx2 | 0  | regidx1 | 0  | regidx0 |
-
-regidx0-2 refer not to the Register CSR CAM entry but to the underlying
-*real* register (see regidx, the value) and consequently is 7-bits wide.
-When set to zero (referring to x0), clearly reshaping x0 is pointless,
-so is used to indicate "disabled".
-shape0-2 refers to one of three SHAPE CSRs. A value of 0x3 is reserved.
-Bits 7, 15, 23, 30 and 31 are also reserved, and must be set to zero.
-
-It is anticipated that these specialist CSRs not be very often used.
-Unlike the CSR Register and Predication tables, the REMAP CSRs use
-the full 7-bit regidx so that they can be set once and left alone,
-whilst the CSR Register entries pointing to them are disabled, instead.
-
-## SHAPE 1D/2D/3D vector-matrix remapping CSRs
-
-(Note: both the REMAP and SHAPE sections are best read after the
- rest of the document has been read)
-
-There are three "shape" CSRs, SHAPE0, SHAPE1, SHAPE2, 32-bits in each,
-which have the same format. When each SHAPE CSR is set entirely to zeros,
-remapping is disabled: the register's elements are a linear (1D) vector.
-
-| 26..24  | 23      | 22..16  | 15      | 14..8   | 7       | 6..0    |
-| ------- | --      | ------- | --      | ------- | --      | ------- |
-| permute | offs[2] | zdimsz  | offs[1] | ydimsz  | offs[0] | xdimsz  |
-
-offs is a 3-bit field, spread out across bits 7, 15 and 23, which
-is added to the element index during the loop calculation.
-
-xdimsz, ydimsz and zdimsz are offset by 1, such that a value of 0 indicates
-that the array dimensionality for that dimension is 1. A value of xdimsz=2
-would indicate that in the first dimension there are 3 elements in the
-array. The format of the array is therefore as follows:
-
-    array[xdim+1][ydim+1][zdim+1]
-
-However whilst illustrative of the dimensionality, that does not take the
-"permute" setting into account. "permute" may be any one of six values
-(0-5, with values of 6 and 7 being reserved, and not legal). The table
-below shows how the permutation dimensionality order works:
-
-| permute | order | array format             |
-| ------- | ----- | ------------------------ |
-| 000     | 0,1,2 | (xdim+1)(ydim+1)(zdim+1) |
-| 001     | 0,2,1 | (xdim+1)(zdim+1)(ydim+1) |
-| 010     | 1,0,2 | (ydim+1)(xdim+1)(zdim+1) |
-| 011     | 1,2,0 | (ydim+1)(zdim+1)(xdim+1) |
-| 100     | 2,0,1 | (zdim+1)(xdim+1)(ydim+1) |
-| 101     | 2,1,0 | (zdim+1)(ydim+1)(xdim+1) |
-
-In other words, the "permute" option changes the order in which
-nested for-loops over the array would be done. The algorithm below
-shows this more clearly, and may be executed as a python program:
-
-    # mapidx = REMAP.shape2
-    xdim = 3 # SHAPE[mapidx].xdim_sz+1
-    ydim = 4 # SHAPE[mapidx].ydim_sz+1
-    zdim = 5 # SHAPE[mapidx].zdim_sz+1
-
-    lims = [xdim, ydim, zdim]
-    idxs = [0,0,0] # starting indices
-    order = [1,0,2] # experiment with different permutations, here
-    offs = 0 # experiment with different offsets, here
-
-    for idx in range(xdim * ydim * zdim):
-        new_idx = offs + idxs[0] + idxs[1] * xdim + idxs[2] * xdim * ydim
-        print new_idx,
-        for i in range(3):
-            idxs[order[i]] = idxs[order[i]] + 1
-            if (idxs[order[i]] != lims[order[i]]):
-                break
-            print
-            idxs[order[i]] = 0
-
-Here, it is assumed that this algorithm be run within all pseudo-code
-throughout this document where a (parallelism) for-loop would normally
-run from 0 to VL-1 to refer to contiguous register
-elements; instead, where REMAP indicates to do so, the element index
-is run through the above algorithm to work out the **actual** element
-index, instead. Given that there are three possible SHAPE entries, up to
-three separate registers in any given operation may be simultaneously
-remapped:
-
-    function op_add(rd, rs1, rs2) # add not VADD!
-      ...
-      ...
-      for (i = 0; i < VL; i++)
-        xSTATE.srcoffs = i # save context
-        if (predval & 1<
+
+See optional [[remap]] section.
 # Instruction Execution Order
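
The remap loop carried in this patch uses Python 2 `print` statements, so it will not run unmodified under Python 3. A minimal Python 3 sketch of the same index calculation follows, for readers who want to experiment with `order` and `offs`. The function name `remap_indices` and its parameter names are hypothetical and do not appear in the patch; the arithmetic is intended to match the patch's loop, with `xdim`, `ydim` and `zdim` already including the +1 adjustment.

```python
def remap_indices(xdim, ydim, zdim, order=(0, 1, 2), offs=0):
    """Yield remapped element indices for an xdim * ydim * zdim shape.

    Hypothetical helper (not part of the patch): same arithmetic as the
    patch's loop, but yielding values instead of printing them.
    """
    lims = [xdim, ydim, zdim]
    idxs = [0, 0, 0]  # current position in the x, y and z dimensions
    for _ in range(xdim * ydim * zdim):
        # Linearise the current (x, y, z) position, then apply the offset.
        yield offs + idxs[0] + idxs[1] * xdim + idxs[2] * xdim * ydim
        # Advance like nested for-loops: order[0] is the innermost loop.
        for dim in order:
            idxs[dim] += 1
            if idxs[dim] != lims[dim]:
                break
            idxs[dim] = 0  # wrap, then carry into the next dimension


if __name__ == "__main__":
    # 3x4x5 shape with y incremented fastest (order 1,0,2), as in the patch.
    print(list(remap_indices(3, 4, 5, order=(1, 0, 2)))[:8])
    # → [0, 3, 6, 9, 1, 4, 7, 10]
```

With the identity order `(0, 1, 2)` and `offs=0` the generator yields 0, 1, 2, ... in sequence, which corresponds to the unremapped linear (1D) case.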