get to choose precisely where to focus and target the benefits of their
implementation efforts, without "extra baggage".
+# CSRs <a name="csrs"></a>
+
+There are a number of CSRs needed, which are used at the instruction
+decode phase to re-interpret standard RV opcodes (a practice that has
+precedent in the setting of MISA to enable / disable extensions).
+
+* Integer Register N is Vector of length M: r(N) -> r(N..N+M-1)
+* Integer Register N is of implicit bitwidth M (M=default,8,16,32,64)
+* Floating-point Register N is Vector of length M: r(N) -> r(N..N+M-1)
+* Floating-point Register N is of implicit bitwidth M (M=default,8,16,32,64)
+* Integer Register N is a Predication Register (note: a key-value store)
+* Vector Length CSR (VSETVL, VGETVL)
+
+Notes:
+
+* for the purposes of LOAD / STORE, Integer Registers which are
+ marked as a Vector will result in a Vector LOAD / STORE.
+* Vector Lengths are *not* the same as vsetl but are an integral part
+ of vsetl.
+* Actual vector length is *multiplied* by how many blocks of length
+ "bitwidth" may fit into an XLEN-sized register.
+* Predication is a key-value store due to the implicit referencing,
+ as opposed to having the predicate register explicitly in the instruction.
+
+## Predication CSR
+
+The Predication CSR is a key-value store indicating whether, if a given
+destination register (integer or floating-point) is referred to in an
+instruction, it is to be predicated. The first entry is whether predication
+is enabled. The second entry is whether the register index refers to a
+floating-point or an integer register. The third entry is the index
+of that register which is to be predicated (if referred to). The fourth entry
+is the integer register that is treated as a bitfield, indexable by the
+vector element index.
+
+| RegNo | 6 | 5 | (4..0) | (4..0) |
+| ----- | - | - | ------- | ------- |
+| r0 | pren0 | i/f | regidx | predidx |
+| r1 | pren1 | i/f | regidx | predidx |
+| .. | pren.. | i/f | regidx | predidx |
+| r15 | pren15 | i/f | regidx | predidx |
+
+The Predication CSR Table is a key-value store, so implementation-wise
+it will be faster to turn the table around (maintain topologically
+equivalent state):
+
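+    # internal lookup tables, indexed by destination register number
+    int_pred_enabled = [0] * 32
+    int_pred_reg = [0] * 32
+    fp_pred_enabled = [0] * 32
+    fp_pred_reg = [0] * 32
+    for i in range(16): # one iteration per Predication CSR table row
+        if CSRpred[i].pren:
+            idx = CSRpred[i].regidx
+            predidx = CSRpred[i].predidx
+            if CSRpred[i].type == 0: # integer
+                int_pred_enabled[idx] = 1
+                int_pred_reg[idx] = predidx
+            else: # floating-point
+                fp_pred_enabled[idx] = 1
+                fp_pred_reg[idx] = predidx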
+
+So when an operation is to be predicated, it is the internal state that
+is used. In Section 6.4.2 of Hwacha's Manual (EECS-2015-262) the following
+pseudo-code for operations is given, where p is the explicit (direct)
+reference to the predication register to be used:
+
+    for (int i=0; i<vl; ++i)
+        if ([!]preg[p][i])
+            (d ? vreg[rd][i] : sreg[rd]) =
+                iop(s1 ? vreg[rs1][i] : sreg[rs1],
+                    s2 ? vreg[rs2][i] : sreg[rs2]); // for insts with 2 inputs
+
+This instead becomes an *indirect* reference using the *internal* state
+table generated from the Predication CSR key-value store:
+
+    if type(iop) == INT:
+        pred_enabled = int_pred_enabled
+        preg = int_pred_reg[rd] # integer register holding the predicate bits
+    else:
+        pred_enabled = fp_pred_enabled
+        preg = fp_pred_reg[rd]
+
+    for (int i=0; i<vl; ++i)
+        if (pred_enabled[rd] && [!]preg[i]) // preg[i]: bit i of that register
+            (d ? vreg[rd][i] : sreg[rd]) =
+                iop(s1 ? vreg[rs1][i] : sreg[rs1],
+                    s2 ? vreg[rs2][i] : sreg[rs2]); // for insts with 2 inputs
+
+## MAXVECTORDEPTH
+
+MAXVECTORDEPTH is the same concept as MVL in RVV. However in Simple-V,
+given that its primary (base, unextended) purpose is for 3D, Video and
+other purposes (not requiring supercomputing capability), it makes sense
+to limit MAXVECTORDEPTH to the regfile bitwidth (32 for RV32, 64 for RV64
+and so on).
+
+The reason for setting this limit is so that predication registers, when
+marked as such, may fit into a single register as opposed to fanning out
+over several registers. This keeps the implementation a little simpler.
+Note that RVV on top of Simple-V may choose to override this decision.
+
+## Vector-length CSRs
+
+Vector lengths are interpreted as meaning "any instruction referring to
+r(N) generates implicit identical instructions referring to registers
+r(N..N+M-1) where M is the Vector Length". Vector Lengths may be set to
+use up to 16 registers in the register file.
+
+One separate CSR table is needed for each of the integer and floating-point
+register files:
+
+| RegNo | (3..0) |
+| ----- | ------ |
+| r0 | vlen0 |
+| r1 | vlen1 |
+| .. | vlen.. |
+| r31 | vlen31 |
+
+An array of 32 4-bit CSRs is needed (4 bits per register) to indicate
+whether a register, if referred to in any standard instruction, is
+implicitly to be treated as a vector. A vector length of 1 indicates
+that it is to be treated as a scalar. Vector lengths of 0 are reserved.
+
+Internally, implementations may choose to use a vector length greater than 1
+to set a bit-field per register, to be used in the instruction decode phase.
+In this way any standard (current or future) operation involving
+register operands may detect if the operation is to be vector-vector,
+vector-scalar or scalar-scalar (standard) simply through a single
+bit test.
+
+Note that when using the "vsetl rs1, rs2" instruction (caveat: when the
+bitwidth is specifically not set) it becomes:
+
+ CSRvlength = MIN(MIN(CSRvectorlen[rs1], MAXVECTORDEPTH), rs2)
+
+This is in contrast to RVV:
+
+ CSRvlength = MIN(MIN(rs1, MAXVECTORDEPTH), rs2)
+
+## Element (SIMD) bitwidth CSRs
+
+Element bitwidths may be specified with a per-register CSR, and indicate
+how a register (integer or floating-point) is to be subdivided.
+
+| RegNo | (2..0) |
+| ----- | ------ |
+| r0 | vew0 |
+| r1 | vew1 |
+| .. | vew.. |
+| r31 | vew31 |
+
+vew may be one of the following (giving a table "bytestable", used below):
+
+| vew | bitwidth |
+| --- | -------- |
+| 000 | default |
+| 001 | 8 |
+| 010 | 16 |
+| 011 | 32 |
+| 100 | 64 |
+| 101 | 128 |
+| 110 | rsvd |
+| 111 | rsvd |
+
+Extending this table (with extra bits) is covered in the section
+"Implementing RVV on top of Simple-V".
+
+Note that when using the "vsetl rs1, rs2" instruction, taking bitwidth
+into account, it becomes:
+
+    vew = CSRbitwidth[rs1]
+    if vew == 0:
+        bytesperreg = (XLEN/8) # or FLEN as appropriate
+    else:
+        bytesperreg = bytestable[vew] # 1 2 4 8 16
+    simdmult = (XLEN/8) / bytesperreg # or FLEN as appropriate
+    vlen = CSRvectorlen[rs1] * simdmult
+    CSRvlength = MIN(MIN(vlen, MAXVECTORDEPTH), rs2)
+
+The reason for multiplying the vector length by the number of SIMD elements
+(in each individual register) is so that each SIMD element may optionally be
+predicated.
+
+An example of how to subdivide the register file when bitwidth != default
+is given in the section "Bitwidth Virtual Register Reordering".
+
# Instructions
By being a topological remap of RVV concepts, the following RVV instructions
in advance, accordingly: other strategies are explored in the Appendix
Section "Virtual Memory Page Faults".
-# CSRs <a name="csrs"></a>
-
-There are a number of CSRs needed, which are used at the instruction
-decode phase to re-interpret standard RV opcodes (a practice that has
-precedent in the setting of MISA to enable / disable extensions).
-
-* Integer Register N is Vector of length M: r(N) -> r(N..N+M-1)
-* Integer Register N is of implicit bitwidth M (M=default,8,16,32,64)
-* Floating-point Register N is Vector of length M: r(N) -> r(N..N+M-1)
-* Floating-point Register N is of implicit bitwidth M (M=default,8,16,32,64)
-* Integer Register N is a Predication Register (note: a key-value store)
-* Vector Length CSR (VSETVL, VGETVL)
-
-Notes:
-
-* for the purposes of LOAD / STORE, Integer Registers which are
- marked as a Vector will result in a Vector LOAD / STORE.
-* Vector Lengths are *not* the same as vsetl but are an integral part
- of vsetl.
-* Actual vector length is *multipled* by how many blocks of length
- "bitwidth" may fit into an XLEN-sized register file.
-* Predication is a key-value store due to the implicit referencing,
- as opposed to having the predicate register explicitly in the instruction.
-
-## Predication CSR
-
-The Predication CSR is a key-value store indicating whether, if a given
-destination register (integer or floating-point) is referred to in an
-instruction, it is to be predicated. The first entry is whether predication
-is enabled. The second entry is whether the register index refers to a
-floating-point or an integer register. The third entry is the index
-of that register which is to be predicated (if referred to). The fourth entry
-is the integer register that is treated as a bitfield, indexable by the
-vector element index.
-
-| RegNo | 6 | 5 | (4..0) | (4..0) |
-| ----- | - | - | ------- | ------- |
-| r0 | pren0 | i/f | regidx | predidx |
-| r1 | pren1 | i/f | regidx | predidx |
-| .. | pren.. | i/f | regidx | predidx |
-| r15 | pren15 | i/f | regidx | predidx |
-
-The Predication CSR Table is a key-value store, so implementation-wise
-it will be faster to turn the table around (maintain topologically
-equivalent state):
-
-    fp_pred_enabled[32];
-    int_pred_enabled[32];
-    for (i = 0; i < 16; i++)
-        if CSRpred[i].pren:
-            idx = CSRpred[i].regidx
-            predidx = CSRpred[i].predidx
-            if CSRpred[i].type == 0: # integer
-                int_pred_enabled[idx] = 1
-                int_pred_reg[idx] = predidx
-            else:
-                fp_pred_enabled[idx] = 1
-                fp_pred_reg[idx] = predidx
-
-So when an operation is to be predicated, it is the internal state that
-is used. In Section 6.4.2 of Hwacha's Manual (EECS-2015-262) the following
-pseudo-code for operations is given, where p is the explicit (direct)
-reference to the predication register to be used:
-
-    for (int i=0; i<vl; ++i)
-        if ([!]preg[p][i])
-            (d ? vreg[rd][i] : sreg[rd]) =
-                iop(s1 ? vreg[rs1][i] : sreg[rs1],
-                    s2 ? vreg[rs2][i] : sreg[rs2]); // for insts with 2 inputs
-
-This instead becomes an *indirect* reference using the *internal* state
-table generated from the Predication CSR key-value store:
-
-    if type(iop) == INT:
-        pred_enabled = int_pred_enabled
-        preg = int_pred_reg[rd]
-    else:
-        pred_enabled = fp_pred_enabled
-        preg = fp_pred_reg[rd]
-
-    for (int i=0; i<vl; ++i)
-        if (preg_enabled[rd] && [!]preg[i])
-            (d ? vreg[rd][i] : sreg[rd]) =
-                iop(s1 ? vreg[rs1][i] : sreg[rs1],
-                    s2 ? vreg[rs2][i] : sreg[rs2]); // for insts with 2 inputs
-
-## MAXVECTORDEPTH
-
-MAXVECTORDEPTH is the same concept as MVL in RVV. However in Simple-V,
-given that its primary (base, unextended) purpose is for 3D, Video and
-other purposes (not requiring supercomputing capability), it makes sense
-to limit MAXVECTORDEPTH to the regfile bitwidth (32 for RV32, 64 for RV64
-and so on).
-
-The reason for setting this limit is so that predication registers, when
-marked as such, may fit into a single register as opposed to fanning out
-over several registers. This keeps the implementation a little simpler.
-Note that RVV on top of Simple-V may choose to over-ride this decision.
-
-## Vector-length CSRs
-
-Vector lengths are interpreted as meaning "any instruction referring to
-r(N) generates implicit identical instructions referring to registers
-r(N+M-1) where M is the Vector Length". Vector Lengths may be set to
-use up to 16 registers in the register file.
-
-One separate CSR table is needed for each of the integer and floating-point
-register files:
-
-| RegNo | (3..0) |
-| ----- | ------ |
-| r0 | vlen0 |
-| r1 | vlen1 |
-| .. | vlen.. |
-| r31 | vlen31 |
-
-An array of 32 4-bit CSRs is needed (4 bits per register) to indicate
-whether a register was, if referred to in any standard instructions,
-implicitly to be treated as a vector. A vector length of 1 indicates
-that it is to be treated as a scalar. Vector lengths of 0 are reserved.
-
-Internally, implementations may choose to use the non-zero vector length
-to set a bit-field per register, to be used in the instruction decode phase.
-In this way any standard (current or future) operation involving
-register operands may detect if the operation is to be vector-vector,
-vector-scalar or scalar-scalar (standard) simply through a single
-bit test.
-
-Note that when using the "vsetl rs1, rs2" instruction (caveat: when the
-bitwidth is specifically not set) it becomes:
-
- CSRvlength = MIN(MIN(CSRvectorlen[rs1], MAXVECTORDEPTH), rs2)
-
-This is in contrast to RVV:
-
- CSRvlength = MIN(MIN(rs1, MAXVECTORDEPTH), rs2)
-
-## Element (SIMD) bitwidth CSRs
-
-Element bitwidths may be specified with a per-register CSR, and indicate
-how a register (integer or floating-point) is to be subdivided.
-
-| RegNo | (2..0) |
-| ----- | ------ |
-| r0 | vew0 |
-| r1 | vew1 |
-| .. | vew.. |
-| r31 | vew31 |
-
-vew may be one of the following (giving a table "bytestable", used below):
-
-| vew | bitwidth |
-| --- | -------- |
-| 000 | default |
-| 001 | 8 |
-| 010 | 16 |
-| 011 | 32 |
-| 100 | 64 |
-| 101 | 128 |
-| 110 | rsvd |
-| 111 | rsvd |
-
-Extending this table (with extra bits) is covered in the section
-"Implementing RVV on top of Simple-V".
-
-Note that when using the "vsetl rs1, rs2" instruction, taking bitwidth
-into account, it becomes:
-
-    vew = CSRbitwidth[rs1]
-    if (vew == 0)
-        bytesperreg = (XLEN/8) # or FLEN as appropriate
-    else:
-        bytesperreg = bytestable[vew] # 1 2 4 8 16
-    simdmult = (XLEN/8) / bytesperreg # or FLEN as appropriate
-    vlen = CSRvectorlen[rs1] * simdmult
-    CSRvlength = MIN(MIN(vlen, MAXVECTORDEPTH), rs2)
-
-The reason for multiplying the vector length by the number of SIMD elements
-(in each individual register) is so that each SIMD element may optionally be
-predicated.
-
-An example of how to subdivide the register file when bitwidth != default
-is given in the section "Bitwidth Virtual Register Reordering".
-
# Exceptions
> What does an ADD of two different-sized vectors do in simple-V?