From ccb6506da99c4900060a3173d22fda438a16584f Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 20 Jun 2019 20:56:52 +0100
Subject: [PATCH]

---
 simple_v_extension/specification.mdwn | 38 ++++++++++++++++-----------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/simple_v_extension/specification.mdwn b/simple_v_extension/specification.mdwn
index 8dce303ca..b4f5144ad 100644
--- a/simple_v_extension/specification.mdwn
+++ b/simple_v_extension/specification.mdwn
@@ -2241,22 +2241,29 @@ of the RISC-V ISA, is as follows:
 
 | 15 | 14:12 | 11:10 | 9:8 | 7 | 6:0 |
 | - | ----- | ----- | ----- | --- | ------- |
-| rmode | 16xil | pplen | rplen | pmode| 1111111 |
+| vlset | 16xil | pplen | rplen | mode | 1111111 |
 
 VL/MAXVL/SubVL Block:
 
 | 31-30 | 29:28 | 27:22 | 21:17 | 16 |
 | - | ----- | ------ | ------ | - |
-| rsvd | SubVL | MAXVL | VLEN | vlt |
+| 0 | SubVL | VLdest | VLEN | vlt |
+| 1 | SubVL | VLdest | VLEN ||
 
-If vlt is 0, VLEN is a 5 bit immediate value. If vlt is 1, it specifies the scalar register from which VL is set by this VLIW instruction group. Any changes to that register by any VLIW Group instruction *automatically* result in an immediate change to VL. Thus, the register effectively *becomes* VL, for the full duration of the group's execution.
+If vlt is 0, VLEN is a 5 bit immediate value. If vlt is 1, it specifies the scalar register from which VL is set by this VLIW instruction group. VL, whether set from the register or the immediate, is then modified (truncated) to be min(VL, MAXVL), and the result stored in the scalar register specified in VLdest. If VLdest is zero, no store in the regfile occurs.
+
+This option will typically be used to start vectorised loops, where the VLIW instruction effectively embeds an optional "SETSUBVL, SETVL" sequence (in compact form).
+
+When bit 15 is set to 1, MAXVL and VL are both set to the immediate, VLEN, which is 6 bits in length, and the same value stored in scalar register VLdest (if that register is nonzero).
+
+This option will typically not be used so much for loops as it will be for one-off instructions such as saving the entire register file to the stack with a single one-off Vectorised LD/ST.
 
 Reminder of the variable-length format from Section 1.5 of the RISC-V ISA:
 
-| base+4 | base+2 | base | number of bits |
-| ------ | ------------------- | ---------------- | -------------------------- |
-| ..xxxx | xxxxxxxxxxxxxxxx | xnnnxxxxx1111111 | (80+16\*nnn)-bit, nnn!=111 |
-| {ops}{Pred}{Reg} | VL Block | SV Prefix | |
+| base+4 ... base+2 | base | number of bits |
+| -------------------------- | ---------------- | -------------------------- |
+| ..xxxx xxxxxxxxxxxxxxxx | xnnnxxxxx1111111 | (80+16\*nnn)-bit, nnn!=111 |
+| {ops}{Pred}{Reg}{VL Block} | SV Prefix | |
 
 CSRs needed:
 
@@ -2268,20 +2275,21 @@ CSRs needed:
 Notes:
 
 * Bit 7 specifies if the prefix block format is the full 16 bit format (1) or the compact less expressive format (0). In the 8 bit format, pplen is multiplied by 2.
-* NOTE: 8 bit format predicate numbering is implicit and begins from x9. Thus it is critical to put blocks in the correct order as required.
-* Bit 15 specifies if the register block format is 16 bit (1) or 8 bit (0). In the 8 bit format, rplen is multiplied by 2. If only an odd number of entries are needed the last may be set to 0x00, indicating "unused".
-* Bits 8 and 9 define how many RegCam entries (0 to 3 if bit 15 is 1, otherwise 0 to 6) follow the VL Block.
-* Bits 10 and 11 define how many PredCam entries (0 to 3 if bit 7 is 1, otherwise 0 to 6) follow after
-  the (optional) RegCam entries
+* 8 bit format predicate numbering is implicit and begins from x9. Thus it is critical to put blocks in the correct order as required.
+* Bit 7 also specifies if the register block format is 16 bit (1) or 8 bit (0). In the 8 bit format, rplen is multiplied by 2. If only an odd number of entries are needed the last may be set to 0x00, indicating "unused".
+* Bit 15 specifies if the VL Block is present. If set to 1, the VL Block immediately follows the VLIW instruction Prefix
+* Bits 8 and 9 define how many RegCam entries (0 to 3 if bit 15 is 1, otherwise 0 to 6) follow the (optional) VL Block.
+* Bits 10 and 11 define how many PredCam entries (0 to 3 if bit 7 is 1, otherwise 0 to 6) follow the (optional) RegCam entries
 * Bits 14 to 12 (IL) define the actual length of the instruction: total
   number of bits is 80 + 16 times IL. Standard RV32, RVC and also
   SVPrefix (P48-\*-Type) instructions fit into this space, after the
-  (optional) RegCam / PredCam entries
+  (optional) VL / RegCam / PredCam entries
 * Anything - any registers - within the VLIW-prefixed format *MUST* have the
   RegCam and PredCam entries applied to it.
-* At the end of the VLIW Group, the RegCam and PredCam CSRs *no longer apply*.
+* At the end of the VLIW Group, the RegCam and PredCam entries *no longer apply*.
+* Although an inefficient use of resources, it is fine to set the MAXVL, VL and SUBVL CSRs with standard CSRRW instructions, within a VLIW block.
 
-This would greatly reduce the amount of space utilised by Vectorised
+All this would greatly reduce the amount of space utilised by Vectorised
 instructions, given that 64-bit CSRRW requires 3, even 4 32-bit opcodes: the
 CSR itself, a LI, and the setting up of the value into the RS register
 of the CSR, which, again, requires a LI / LUI to get the 32 bit
-- 
2.30.2
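
Reviewer note: the prefix layout this patch describes can be checked with a small decoder. The following is a minimal sketch in Python, not part of the specification. The names `SVPrefix`, `decode_sv_prefix` and `clamp_vl` are hypothetical. It assumes that bit 7 selects the 16 bit versus 8 bit entry format for both the PredCam and RegCam blocks (the RegCam bullet still mentions bit 15, so this is one reading of the text), and it reads "truncated" as a clamp of VL to at most MAXVL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVPrefix:
    """Fields of the 16-bit VLIW prefix (hypothetical container, for illustration)."""
    vlset: bool  # bit 15: VL Block follows the prefix
    il: int      # bits 14:12: instruction-length field
    pplen: int   # bits 11:10: PredCam entry count field
    rplen: int   # bits 9:8: RegCam entry count field
    mode: int    # bit 7: 1 = 16 bit entry format, 0 = 8 bit entry format

    @property
    def total_bits(self) -> int:
        # Total length of the group: 80 + 16 * IL bits.
        return 80 + 16 * self.il

    @property
    def pred_entries(self) -> int:
        # In the 8 bit format the count field is multiplied by 2.
        return self.pplen if self.mode else 2 * self.pplen

    @property
    def reg_entries(self) -> int:
        # Assumption: bit 7 governs the register block format as well.
        return self.rplen if self.mode else 2 * self.rplen


def decode_sv_prefix(halfword: int) -> SVPrefix:
    """Split a 16-bit prefix halfword into its fields."""
    if not 0 <= halfword <= 0xFFFF:
        raise ValueError("prefix must fit in 16 bits")
    if halfword & 0x7F != 0x7F:
        raise ValueError("bits 6:0 must be 1111111")
    il = (halfword >> 12) & 0x7
    if il == 0b111:
        raise ValueError("nnn == 111 is excluded from the (80+16*nnn)-bit format")
    return SVPrefix(
        vlset=bool((halfword >> 15) & 0x1),
        il=il,
        pplen=(halfword >> 10) & 0x3,
        rplen=(halfword >> 8) & 0x3,
        mode=(halfword >> 7) & 0x1,
    )


def clamp_vl(requested: int, maxvl: int) -> int:
    # Assumption: "truncated" means VL never exceeds MAXVL.
    return min(requested, maxvl)
```

For example, a prefix with vlset=1, IL=2, pplen=1, rplen=3 and bit 7 clear describes a 112-bit group with a VL Block, 6 RegCam entries and 2 PredCam entries in the compact format.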