From 234f292db3289d0ca02203aca046fce4f3f05f4a Mon Sep 17 00:00:00 2001
From: lkcl
Date: Sat, 12 Dec 2020 23:57:39 +0000
Subject: [PATCH]

---
 openpower/sv/svp_rewrite/svp64/discussion.mdwn | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/openpower/sv/svp_rewrite/svp64/discussion.mdwn b/openpower/sv/svp_rewrite/svp64/discussion.mdwn
index a8d39beb6..5006b91f9 100644
--- a/openpower/sv/svp_rewrite/svp64/discussion.mdwn
+++ b/openpower/sv/svp_rewrite/svp64/discussion.mdwn
@@ -105,7 +105,7 @@ If there are spare bits it would be very good to look at using some of them to s
 Data-dependent fail-on-first has two distinct variants: one for LD/ST, the other for arithmetic operations (actually, CR-driven). Note in each case the assumption is that vector elements are required appear to be executed in sequential Program Order, element 0 being the first.
 
 * LD/ST ffirst treats the first LD/ST in a vector (element 0) as an ordinary one. Exceptions occur "as normal". However for elements 1 and above, if an exception would occur, then VL is **truncated** to the previous element.
-* Data-driven (CR-driven) fail-on-first activates when Rc=1 or other CR-creating operation produces a result (including cmp). Similar to branch, an analysis of the CR is performed and if the test succeeds, the vector operation terminates all element operations at and above the current one, and VL is truncated to the *previous* element.
+* Data-driven (CR-driven) fail-on-first activates when Rc=1 or other CR-creating operation produces a result (including cmp). Similar to branch, an analysis of the CR is performed and if the test fails, the vector operation terminates and discards all element operations at and above the current one, and VL is truncated to the *previous* element.
 
 Thus the new VL comprises a vector of results that pass certain criteria (equal to zero, less than zero). The CR-based data-driven fail-on-first is new and not found in ARM SVE or RVV.
 It is extremely useful for reducing instruction count, however requires speculative execution involving modifications of VL to get high performance implementations.
-- 
2.30.2
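
Illustrative note on the changed wording: the two fail-on-first variants described in the hunk can be sketched as a small sequential model. This is a rough Python sketch under stated assumptions, not the specification's pseudocode. The names `cr_ffirst`, `ldst_ffirst`, `Fault`, `op`, `test` and `load` are hypothetical, and "VL is truncated to the previous element" is read here as "the new VL equals the index of the failing element", so that only elements below it remain valid.

```python
class Fault(Exception):
    """Hypothetical stand-in for a memory exception raised by a load."""


def cr_ffirst(op, test, src, vl):
    """Model CR-driven fail-on-first over elements 0..vl-1, in Program Order.

    `op` computes an element result; `test` stands in for the CR analysis.
    Returns (kept_results, new_vl).
    """
    results = []
    for i in range(vl):
        value = op(src[i])
        if not test(value):
            # Test fails: discard this element and all above it,
            # and truncate VL so only elements 0..i-1 remain.
            return results, i
        results.append(value)
    return results, vl


def ldst_ffirst(load, addrs, vl):
    """Model LD/ST fail-on-first: element 0 faults as normal, later ones truncate VL."""
    results = []
    for i in range(vl):
        try:
            value = load(addrs[i])
        except Fault:
            if i == 0:
                raise  # element 0 behaves as an ordinary LD/ST
            return results, i  # truncate VL instead of raising
        results.append(value)
    return results, vl


if __name__ == "__main__":
    # Subtract 1 from each element and keep results while they are non-zero.
    print(cr_ffirst(lambda x: x - 1, lambda v: v != 0, [5, 3, 1, 7], 4))

    memory = {0: 10, 8: 20}  # hypothetical mapped addresses

    def load(addr):
        if addr not in memory:
            raise Fault(addr)
        return memory[addr]

    print(ldst_ffirst(load, [0, 8, 16, 24], 4))
```

The model executes elements strictly one at a time; a high-performance implementation would instead run elements speculatively and roll back VL, as the surrounding text notes.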