From 0e1642f882a9c76f4059b760e5cecd489bcc472f Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 18 Sep 2022 11:38:23 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls001.mdwn | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/rfc/ls001.mdwn b/openpower/sv/rfc/ls001.mdwn
index 52d2bb825..b8c860b25 100644
--- a/openpower/sv/rfc/ls001.mdwn
+++ b/openpower/sv/rfc/ls001.mdwn
@@ -412,6 +412,29 @@ vec2/3/4 Structure Packing *and* REMAP, the combinations far exceed
 anything seen in any other Vector ISA in history, yet are really
 nothing more than concepts abstracted out in pure RISC form.[^ldstcisc]
 
+# CR Field RM Modes
+
+CR Field operations (`crand` etc.) are somewhat underappreciated in the
+Power ISA. The CR Fields, however, are perfect for providing up to four
+separate Vectors of Predicate Masks: `EQ LT GT SO`. Special attention
+was therefore given, first, to making transfers between GPR and CR Fields
+much more powerful with the
+[crweird](https://libre-soc.org/openpower/sv/cr_int_predication/)
+operations, and secondly to adding powerful binary and ternary CR Field
+operations to the bitmanip extension.[^crops]
+
+On top of these additional instructions, RM Modes may still be applied, particularly mapreduce and Data-Dependent Fail-first. Being able
+to auto-truncate subsequent Vector Processing at the point at which
+a CR Field test fails is especially useful.
+
+Because element-width overrides are meaningless for CR Fields, the
+decision to use those bits in RM for other purposes was an easy one.
+This does provide more uniformity and flexibility to CR Field
+operations, but there are fewer options: neither Saturation nor
+Predicate-Result makes sense.
+
+[^crops]: the alternative to powerful transfer instructions between GPR and CR Fields was to add the full duplicated suite of BMI and TBM operations present in GPR (popcnt, cntlz, set-before-first) as CR Field Operations, all of which was deemed inappropriate.
+
 # SVP64Single 24-bits
 
 The `SVP64-Single` 24-bit encoding focusses primarily on ensuring that
@@ -1204,6 +1227,6 @@ operations.
 [^autovec]: Compiler auto-vectorisation for best exploitation of SIMD and Vector ISAs on Scalar programming languages (c, c++) is an Indusstry-wide known-hard decades-long problem. Cross-reference the number of hand-optimised assembler algorithms.
 [^hphint]: intended for use when the compiler has determined the extent of Memory or register aliases in loops: `a[i] += a[i+4]` would necessitate a Vertical-First hphint of 4
 [^svshape]: although SVSHAPE0-3 should, realistically, be regarded as high a priority as SVSTATE, and given corresponding SVSRR and SVLR equivalents, it was felt that having to context-switch **five** SPRs on Interrupts and function calls was too much.
-[^whoops]: two efforts were made to mix non-uniform encodings into Simple-V space: one deliberate to see how it would go, and one accidental. They both went extremely badly, the deliberate one costing two months to add then remove.
-[^mul]: Setting this "multiplier"to 1 remarkably leaves pre-existing Scalar behaviour completely intact as a degenerate case.
-[ldstcisc]: At least the CISC "auto-increment" modes are not present, from the CDC 6600 and Motorola 68000! although these would be fun to introduce they do unfortunately make for 3-in 3-out register profiles, all 64-bit, which explains why the 6600 and 68000 had separate special dedicated address regfiles.
+[^whoops]: two efforts were made to mix non-uniform encodings into Simple-V space: one deliberate to see how it would go, and one accidental. They both went extremely badly, the deliberate one costing over two months to add then remove.
+[^mul]: Setting this "multiplier" to 1 clearly leaves pre-existing Scalar behaviour completely intact as a degenerate case.
+[^ldstcisc]: At least the CISC "auto-increment" modes are not present, from the CDC 6600 and Motorola 68000! Although these would be fun to introduce, they do unfortunately make for 3-in 3-out register profiles, all 64-bit, which explains why the 6600 and 68000 had separate special dedicated address regfiles.
-- 
2.30.2