From 7fadf46b09a2c98c00658eb31e2455dce792e0ec Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 24 Dec 2020 13:46:49 +0000
Subject: [PATCH]

---
 openpower/sv/overview.mdwn | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/openpower/sv/overview.mdwn b/openpower/sv/overview.mdwn
index 975acd05e..041ac6a06 100644
--- a/openpower/sv/overview.mdwn
+++ b/openpower/sv/overview.mdwn
@@ -145,7 +145,6 @@ The solution comes in terms of rethinking the definition of a Register File. Rh
 
 Then, our simple loop, instead of accessing the array of 64 bits with a computed index, would access the appropriate element of the appropriate type. Thus we have a series of overlapping conceptual arrays that each start at what is traditionally thought of as "a register". It then helps if we have a couple of routines:
 
-
     get_polymorphed_reg(reg, bitwidth, offset):
         reg_t res = 0;
         if bitwidth == 8:
@@ -169,3 +168,16 @@ Then, our simple loop, instead of accessing the array of 64 bits with a computed
             int_regfile[reg].i[offset] = val
         elif bitwidth == default: # 64
             int_regfile[reg].l[offset] = val
+
+These basically provide a convenient parameterised way to access the register file, at an arbitrary vector element offset and an arbitrary element width. Our first simple loop thus becomes:
+
+    for i = 0 to VL-1:
+        src1 = get_polymorphed_reg(rs1, srcwid, i)
+        src2 = get_polymorphed_reg(rs2, srcwid, i)
+        result = src1 + src2 # actual add here
+        set_polymorphed_reg(rd, destwid, i, result)
+
+Note that things such as zero/sign-extension have been left out: also note that it turns out to be important to perform the operation at the maximum bitwidth - `max(srcwid, destwid)` - such that any truncation, rounding errors or other artefacts may all be ironed out. This turns out to be important when applying Saturation for Audio DSP workloads.
+
+Other than that, element width overrides, which can be applied to *either* source or destination or both, are pretty straightforward, conceptually. The details, for hardware engineers, involve byte-level write-enable lines, which is exactly what is used on SRAMs anyway. Compiler writers have to alter Register Allocation Tables to byte-level granularity.
+
-- 
2.30.2
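
For readers who want to experiment with the loop added by this patch, the following is a minimal runnable sketch in Python, not the project's reference implementation. It assumes a few things the patch does not state: the register file is modelled as a flat little-endian `bytearray` of 64-bit registers, elements are zero-extended on read (the patch explicitly leaves zero/sign-extension out), results are truncated to `destwid` on write, and the helper `vector_add` and the extra `regfile` parameter are hypothetical names introduced only for illustration.

```python
REG_BYTES = 8  # assumption: each scalar register is 64 bits wide


def make_regfile(num_regs=32):
    """Model the register file as one flat, byte-addressable array."""
    return bytearray(num_regs * REG_BYTES)


def get_polymorphed_reg(regfile, reg, bitwidth, offset):
    """Read element `offset` of width `bitwidth`, counting from register `reg`.

    Elements past the end of `reg` spill into the following registers,
    which is what makes the conceptual arrays overlap.
    """
    nbytes = bitwidth // 8
    start = reg * REG_BYTES + offset * nbytes
    # Assumption: little-endian layout, zero-extended result.
    return int.from_bytes(regfile[start:start + nbytes], "little")


def set_polymorphed_reg(regfile, reg, bitwidth, offset, val):
    """Write `val`, truncated to `bitwidth`, into element `offset`."""
    nbytes = bitwidth // 8
    start = reg * REG_BYTES + offset * nbytes
    val &= (1 << bitwidth) - 1  # truncate to the destination width
    regfile[start:start + nbytes] = val.to_bytes(nbytes, "little")


def vector_add(regfile, rd, rs1, rs2, vl, srcwid, destwid):
    """The loop from the patch, with the add done at max(srcwid, destwid)."""
    opwid = max(srcwid, destwid)
    mask = (1 << opwid) - 1
    for i in range(vl):
        src1 = get_polymorphed_reg(regfile, rs1, srcwid, i)
        src2 = get_polymorphed_reg(regfile, rs2, srcwid, i)
        result = (src1 + src2) & mask  # actual add, at the wider width
        set_polymorphed_reg(regfile, rd, destwid, i, result)
```

With 8-bit sources and a 16-bit destination, the sum 250 + 10 is stored as 260, whereas with an 8-bit destination the same sum is truncated to 4 on write. Saturation, which the patch mentions, would clamp the value before the write and is not modelled here.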