From: lkcl
Date: Thu, 24 Dec 2020 08:59:17 +0000 (+0000)
Subject: (no commit message)
X-Git-Tag: convert-csv-opcode-to-binary~971
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=72f634b86ee18d54d2ce57787a45adcf9cdc7345;p=libreriscv.git

---

diff --git a/openpower/sv/overview.mdwn b/openpower/sv/overview.mdwn
index 4c45706e8..622585574 100644
--- a/openpower/sv/overview.mdwn
+++ b/openpower/sv/overview.mdwn
@@ -56,9 +56,47 @@ In fairness to both VSX and RVV, there are things that are not provided by Simpl
 These are not insurmountable limitations, that, over time, may well be added in future revisions of SV.
+# Adding Scalar / Vector
+The first augmentation to the simple loop is to give every source and destination the option of being either scalar or vector. As an FSM this is where our "simple" loop gets its first complexity.
+    function op_add(rd, rs1, rs2) # add not VADD!
+      int id=0, irs1=0, irs2=0;
+      for i = 0 to VL-1:
+        ireg[rd+id] <= ireg[rs1+irs1] + ireg[rs2+irs2];
+        if (!rd.isvec) break;
+        if (rd.isvec)  { id += 1; }
+        if (rs1.isvec) { irs1 += 1; }
+        if (rs2.isvec) { irs2 += 1; }
+        if (id == VL or irs1 == VL or irs2 == VL)
+          break
+With some walkthroughs it is clear that the loop exits immediately after the first scalar destination result is written, and that when the destination is a Vector the loop proceeds to fill up the register file, sequentially, starting at `rd` and ending at `rd+VL-1`. The two source registers will, independently, either remain pointing at `rs1` or `rs2` respectively, or, if marked as Vectors, will march incrementally in lockstep as the destination also progresses through elements.
+In this way all eight permutations of Scalar and Vector behaviour are covered, although without predication the scalar-destination ones are reduced in usefulness. It does however clearly illustrate the principle.
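The following sketch is not part of the patch. It is a minimal Python model of the `op_add` pseudocode above, offered as one way to run the walkthroughs by hand. The `Reg` descriptor, the plain-list register file, and passing `VL` as an argument are assumptions made for illustration; only the loop body follows the pseudocode.

```python
from dataclasses import dataclass


@dataclass
class Reg:
    """Hypothetical operand descriptor: register number plus a vector flag."""
    num: int
    isvec: bool = False


def op_add(ireg, rd, rs1, rs2, VL):
    """Scalar/vector add loop; writes results into ireg in place."""
    i_d = i_rs1 = i_rs2 = 0
    for _ in range(VL):
        ireg[rd.num + i_d] = ireg[rs1.num + i_rs1] + ireg[rs2.num + i_rs2]
        if not rd.isvec:
            break  # scalar destination: stop after the first result
        i_d += 1
        if rs1.isvec:
            i_rs1 += 1
        if rs2.isvec:
            i_rs2 += 1
        if VL in (i_d, i_rs1, i_rs2):
            break  # mirrors the explicit end-of-vector check in the pseudocode
    return ireg


# Vector destination, vector rs1, scalar rs2, VL=4:
regs = list(range(16))          # register n initially holds the value n
op_add(regs, Reg(8, True), Reg(0, True), Reg(4, False), VL=4)
print(regs[8:12])               # r8..r11 = r0..r3, each added to r4
```

Changing the destination to `Reg(8, False)` makes the loop exit after writing a single element, which is the scalar-destination case described above.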
+Note in particular: there is no separate Scalar add instruction, separate Vector instruction and separate Scalar-Vector instruction: it's all the same instruction, just with a loop. Scalar happens to set that loop size to one.
+
+# Adding single predication
+
+The next step is to add a single predicate mask. This is where it gets interesting. Predicate masks are a bitvector, each bit specifying, in order, whether the element operation is to be skipped ("masked out") or allowed. If there is no predicate, it is set to all 1s.
+
+    function op_add(rd, rs1, rs2) # add not VADD!
+      int id=0, irs1=0, irs2=0;
+      predval = get_pred_val(FALSE, rd);
+      for i = 0 to VL-1:
+        if (predval & 1<