From 4e1efb3f567faf947a765ab47ea51c0ab7ee74ce Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 6 Oct 2022 20:19:12 +0100
Subject: [PATCH]

---
 openpower/sv/normal.mdwn | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/openpower/sv/normal.mdwn b/openpower/sv/normal.mdwn
index 620427855..09e63f81f 100644
--- a/openpower/sv/normal.mdwn
+++ b/openpower/sv/normal.mdwn
@@ -141,15 +141,22 @@ element 0 would be the first.
 Data-driven (CR-driven) fail-on-first activates when Rc=1 or other
 CR-creating operation produces a result (including cmp). Similar to
 branch, an analysis of the CR is performed and if the test fails, the
-vector operation terminates and discards all element operations at and
-above the current one, and VL is truncated to either
+vector operation terminates and discards all element operations **at and
+above the current one**, and VL is truncated to either
 the *previous* element or the current one, depending on whether
-VLi (VL "inclusive") is set.
+VLi (VL "inclusive") is clear or set, respectively.
 
 Thus the new VL comprises a contiguous vector of results,
 all of which pass the testing criteria (equal to zero, less than zero etc
 as defined by the CR-bit test).
 
+*Note: when VLi is clear, the behaviour at first seems counter-intuitive.
+A result is calculated but if the test fails it is prohibited from being
+actually written. This becomes intuitive again when it is remembered
+that the length that VL is set to is the number of *written* elements,
+and only when VLI is set will the current element be included in that
+count.*
+
 The CR-based data-driven fail-on-first is "new" and not found in ARM
 SVE or RVV. At the same time it is "old" because it is almost
 identical to a generalised form of Z80's `CPIR` instruction.
@@ -201,10 +208,10 @@ The second crucial aspect, compared to LDST Ffirst:
   suitable reason. Beyond the first element LD/ST Failfirst is
   arbitrarily speculative and 100% non-deterministic.
 * CR-based data-dependent first on the other hand MUST NOT truncate VL
-arbitrarily to a length decided by the hardware: VL MUST only be
-truncated based explicitly on whether a test fails.
-This because it is a precise Deterministic test on which algorithms
-can and will will rely.
+  arbitrarily to a length decided by the hardware: VL MUST only be
+  truncated based explicitly on whether a test fails.
+  This because it is a precise Deterministic test on which algorithms
+  can and will will rely.
 
 **Floating-point Exceptions**
 
-- 
2.30.2
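
Reviewer sketch (not part of the patch): the VL truncation rule described in the first hunk can be modelled in a few lines of Python. This is a minimal illustration, not the specification's pseudocode. The function name `cr_fail_first` and its parameters are hypothetical, the CR-bit test is reduced to a plain predicate on the result, and it assumes (following the added note) that with VLi set the failing element is written and counted, while with VLi clear it is calculated but not written.

```python
from typing import Callable, List, Tuple


def cr_fail_first(
    src: List[int],
    op: Callable[[int], int],
    test: Callable[[int], bool],
    vl: int,
    vli: bool,
) -> Tuple[List[int], int]:
    """Return (written results, new VL) for a data-dependent fail-first loop.

    Hypothetical model: `test` stands in for the CR-bit test and `vli`
    for the VLi ("inclusive") bit.
    """
    written: List[int] = []
    for i in range(vl):
        result = op(src[i])          # the result is always calculated
        if test(result):
            written.append(result)   # test passed: element is written
            continue
        if vli:
            written.append(result)   # VLi set: failing element is included
        # VL becomes the number of written elements; everything above
        # the failing element is discarded.
        return written, len(written)
    return written, vl               # no test failed: VL is unchanged


data = [5, 3, 0, 7]
identity = lambda x: x
nonzero = lambda r: r != 0

print(cr_fail_first(data, identity, nonzero, vl=4, vli=False))  # ([5, 3], 2)
print(cr_fail_first(data, identity, nonzero, vl=4, vli=True))   # ([5, 3, 0], 3)
```

The truncation point depends only on the data and the test, never on a hardware choice, which is the deterministic property the second hunk insists on.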