X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=simple_v_extension%2Fappendix.mdwn;h=c29044cfea6b9772be22c43d9b8dc3d968f819ee;hb=HEAD;hp=779fe63d877adb714f27ed5a9f619bb9d29cfdb8;hpb=9ee5999959ea05a06f2c91ac6cc5e9fb877a1891;p=libreriscv.git

diff --git a/simple_v_extension/appendix.mdwn b/simple_v_extension/appendix.mdwn
index 779fe63d8..c29044cfe 100644
--- a/simple_v_extension/appendix.mdwn
+++ b/simple_v_extension/appendix.mdwn
@@ -1,4 +1,8 @@
-# Simple-V (Parallelism Extension Proposal) Appendix
+[[!oldstandards]]
+
+# Simple-V (Parallelism Extension Proposal) Appendix (OBSOLETE)
+
+**OBSOLETE**
 
 * Copyright (C) 2017, 2018, 2019 Luke Kenneth Casson Leighton
 * Status: DRAFTv0.6
@@ -194,34 +198,52 @@ comprehensive in its effect on instructions.
 Branch operations are augmented slightly to be a little more like FP
 Compares (FEQ, FNE etc.), by permitting the cumulation (and storage)
 of multiple comparisons into a register (taken indirectly from the predicate
-table). As such, "ffirst" - fail-on-first - condition mode can be enabled.
+table) and enhancing them to branch "consensually" depending on *multiple*
+tests. "ffirst" - fail-on-first - condition mode can also be enabled,
+to terminate the comparisons early.
 See ffirst mode in the Predication Table section.
 
-There are two registers for the comparison operation, therefore there is
-the opportunity to associate two predicate registers. The first is a
-"normal" predicate register, which acts just as it does on any other
-single-predicated operation: masks out elements where a bit is zero,
-applies an inversion to the predicate mask, and enables zeroing / non-zeroing
-mode.
-
-The second is utilised to indicate where the results of each comparison
-are to be stored, as a bitmask. Additionally, the behaviour of the branch
-- when it occurs - may also be modified depending on whether the predicate
-"invert" bit is set.
-
-* If the "invert" bit is zero, then the branch will occur if and only
-  all tests pass
-* If the "invert" bit is set, the branch will occur if and only if all
-  tests *fail*.
-
-This inversion capability, with some careful boolean logic manipulation,
-covers AND, OR, NAND and NOR branching based on multiple element comparisons.
-Note that unlike normal computer programming early-termination of chains
-of AND or OR conditional tests, the chain does *not* terminate early except
-if fail-on-first is set, and even then ffirst ends on the first data-dependent
-zero. When ffirst mode is not set, *all* conditional element tests must be
-performed (and the result optionally stored in the result mask), with a
-"post-analysis" phase carried out which checks whether to branch.
+There are two registers for the comparison operation, therefore there
+is the opportunity to associate two predicate registers (note: not in
+the same way as twin-predication). The first is a "normal" predicate
+register, which acts just as it does on any other single-predicated
+operation: masks out elements where a bit is zero, applies an inversion
+to the predicate mask, and enables zeroing / non-zeroing mode.
+
+The second (not to be confused with a twin-predication 2nd register)
+is utilised to indicate where the results of each comparison are to
+be stored, as a bitmask. Additionally, the behaviour of the branch -
+when it occurs - may also be modified depending on whether the 2nd predicate's
+"invert" and "zeroing" bits are set. These four combinations result
+in "consensual branches", cbranch.ifnone (NOR), cbranch.ifany (OR),
+cbranch.ifall (AND), cbranch.ifnotall (NAND).
+
+| invert | zeroing | description                 | operation | cbranch |
+| ------ | ------- | --------------------------- | --------- | ------- |
+| 0      | 0       | branch if all pass          | AND       | ifall   |
+| 1      | 0       | branch if one fails         | NAND      | ifnall  |
+| 0      | 1       | branch if one passes        | OR        | ifany   |
+| 1      | 1       | branch if all fail          | NOR       | ifnone  |
+
+This inversion capability covers AND, OR, NAND and NOR branching
+based on multiple element comparisons. Without the full set of four,
+it is necessary to have two-sequence branch operations: one conditional, one
+unconditional.
+
+Note that unlike normal computer programming, early-termination of chains
+of AND or OR conditional tests, the chain does *not* terminate early
+except if fail-on-first is set, and even then ffirst ends on the first
+data-dependent zero. When ffirst mode is not set, *all* conditional
+element tests must be performed (and the result optionally stored in
+the result mask), with a "post-analysis" phase carried out which checks
+whether to branch.
+
+Note also that whilst it may seem excessive to have all four (because
+conditional comparisons may be inverted by swapping src1 and src2),
+data-dependent fail-on-first is *not* invertible and *only* terminates
+on first zero-condition encountered. Additionally it may be inconvenient
+to have to swap the predicate registers associated with src1 and src2,
+because this involves a new VBLOCK Context.
 
 ### Standard Branch
 
@@ -290,9 +312,9 @@ complex), this becomes:
 
     ffirst_mode, zeroing = get_pred_flags(rs1)
     if exists(rd):
-        pred_inversion = get_pred_invert(rs2)
+        pred_inversion, pred_zeroing = get_pred_flags(rs2)
     else
-        pred_inversion = False
+        pred_inversion, pred_zeroing = False, False
     if not exists(rd) or zeroing:
         result = (1<
 This section contains examples of vectorised LOAD operations, showing how the
 two stage process works (three if zero/sign-extension is included).
@@ -1426,7 +1460,7 @@ circumstances it is perfectly fine to simply have the lanes "inactive" for
 predicated elements, even though it results in less than 100% ALU
 utilisation.
 
-## Twin-predication (based on source and destination register)
+## Twin-predication (based on source and destination register)
 
 Twin-predication is not that much different, except that that the source is
 independently zero-predicated from the destination.
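
Review note on the consensual-branch table added at `+198`: the four invert/zeroing combinations can be read as a single "post-analysis" step over the per-element comparison results. The following is a minimal sketch of that reading, not the pseudocode from `appendix.mdwn`. The names `CBRANCH_MODES` and `consensual_branch` are hypothetical, the inputs are assumed to be the results of elements already enabled by the first predicate, and fail-on-first and the result mask are not modelled. The table abbreviates the NAND case as `ifnall`, where the prose spells it `cbranch.ifnotall`.

```python
from typing import Sequence

# Hypothetical lookup mirroring the table's "cbranch" column,
# keyed by the 2nd predicate's (invert, zeroing) bits.
CBRANCH_MODES = {
    (0, 0): "ifall",   # AND:  branch if all pass
    (1, 0): "ifnall",  # NAND: branch if one fails
    (0, 1): "ifany",   # OR:   branch if one passes
    (1, 1): "ifnone",  # NOR:  branch if all fail
}


def consensual_branch(results: Sequence[bool], invert: int, zeroing: int) -> bool:
    """Decide whether to branch, given every element's test result.

    Assumption: `results` holds only the elements that were not masked
    out, and all of them have been evaluated (no fail-on-first).
    """
    # zeroing selects the reduction: OR when set, AND when clear.
    combined = any(results) if zeroing else all(results)
    # invert negates the reduction, giving NOR / NAND.
    return (not combined) if invert else combined
```

For example, `consensual_branch([True, False, True], invert=1, zeroing=0)` takes the branch, because at least one test failed (the NAND row).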