From 49de7a0118e62d67d78cbbd9a58cd38ba27de6b0 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sun, 21 Aug 2022 17:50:36 +0100
Subject: [PATCH]

---
 openpower/sv/normal.mdwn | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/openpower/sv/normal.mdwn b/openpower/sv/normal.mdwn
index 2e136f8ad..9382fd0a2 100644
--- a/openpower/sv/normal.mdwn
+++ b/openpower/sv/normal.mdwn
@@ -135,19 +135,21 @@ prefix-sum in certain circumstances. Details are in the [[svp64/appendix]]
 
 Data-dependent fail-on-first has two distinct variants: one for LD/ST,
 the other for arithmetic operations (actually, CR-driven). Note in each
-case the assumption is that vector elements are required appear to be
-executed in sequential Program Order, element 0 being the first.
-
-* Data-driven (CR-driven) fail-on-first activates when Rc=1 or other
-  CR-creating operation produces a result (including cmp). Similar to
-  branch, an analysis of the CR is performed and if the test fails, the
-  vector operation terminates and discards all element operations at and
-  above the current one, and VL is truncated to either
-  the *previous* element or the current one, depending on whether
-  VLi (VL "inclusive") is set.
+case the assumption is that vector elements are required to appear to be
+executed in sequential Program Order. When REMAP is not active,
+element 0 would be the first.
+
+Data-driven (CR-driven) fail-on-first activates when Rc=1 or other
+CR-creating operation produces a result (including cmp). Similar to
+branch, an analysis of the CR is performed and if the test fails, the
+vector operation terminates and discards all element operations at and
+above the current one, and VL is truncated to either
+the *previous* element or the current one, depending on whether
+VLi (VL "inclusive") is set.
 
 Thus the new VL comprises a contiguous vector of results,
-all of which pass the testing criteria (equal to zero, less than zero).
+all of which pass the testing criteria (equal to zero, less than zero etc
+as defined by the CR-bit test).
 
 The CR-based data-driven fail-on-first is "new" and not found in ARM
 SVE or RVV. At the same time it is "old" because it is almost
@@ -160,13 +162,14 @@ into a type of `cmp`. The CR is stored (and the CR.eq bit tested
 against the `inv` field). If the CR.eq bit is equal to `inv` then the
 Vector is truncated and the loop ends.
 
-Note that when RC1=1 the result elements are never stored, only the CRs.
+Note that when RC1=1 the result elements are never stored, only the CR
+Fields.
 
 VLi is only available as an option when `Rc=0` (or for instructions
 which do not have Rc). When set, the current element is always
 also included in the count (the new length that VL will be set to).
 This may be useful in combination with "inv" to truncate the Vector
-to `exclude` elements that fail a test, or, in the case of implementations
+to *exclude* elements that fail a test, or, in the case of implementations
 of strncpy, to include the terminating zero.
 
 In CR-based data-driven fail-on-first there is only the option to select
@@ -192,12 +195,13 @@ The second crucial aspect, compared to LDST Ffirst:
 
 * LD/ST Failfirst may (beyond the initial first element conditions)
   truncate VL for any architecturally
-  suitable reason.
+  suitable reason. Beyond the first element LD/ST Failfirst is
+  arbitrarily speculative and 100% non-deterministic.
 * CR-based data-dependent first on the other hand MUST NOT truncate VL
   arbitrarily to a length decided by the hardware: VL MUST only be
   truncated based explicitly on whether a test fails.
-This because it is a precise test on which algorithms
-will rely.
+This is because it is a precise Deterministic test on which algorithms
+can and will rely.
 
 **Floating-point Exceptions**
 
@@ -219,7 +223,7 @@ the exceptions entirely**
 
 Operations that actually produce or alter CR Field as a result
 have their own SVP64 Mode, described
-in [[sv/cr_ops]]
+in [[sv/cr_ops]].
 
 # pred-result mode
 
-- 
2.30.2
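
Reviewer note (not part of the patch): the VL truncation rule that the revised text describes for CR-based data-dependent fail-first can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions. The names `ffirst_new_vl`, `results` and `passes` are hypothetical and do not come from the specification or any simulator. `passes` stands in for the CR-bit test (with any `inv` inversion already applied). REMAP is assumed inactive, so element 0 is first. Only the new VL is modelled, not the storing of results or CR Fields.

```python
from typing import Callable, Sequence


def ffirst_new_vl(results: Sequence[int],
                  passes: Callable[[int], bool],
                  vl: int,
                  vli: bool = False) -> int:
    """Return the truncated VL after CR-based data-dependent fail-first.

    Hypothetical model: `passes` represents the CR-bit test applied to
    each element result; it is an assumption, not a specified interface.
    """
    # Elements are examined in sequential Program Order (no REMAP assumed).
    for i in range(vl):
        if not passes(results[i]):
            # First failing element: VL excludes it, unless VLi is set,
            # in which case the failing element is counted as well.
            return i + 1 if vli else i
    # No element failed the test: VL is left unchanged.
    return vl


# strncpy-style illustration: stop at the first zero byte.
data = [104, 105, 0, 7]
without_vli = ffirst_new_vl(data, lambda x: x != 0, vl=4)
with_vli = ffirst_new_vl(data, lambda x: x != 0, vl=4, vli=True)
```

With VLi clear, the new VL covers only the elements before the terminating zero. With VLi set, the zero itself is included, which matches the strncpy use mentioned in the text. The truncation depends only on the test outcome, never on an arbitrary hardware choice, which is the determinism the patch emphasises.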