From a479c3b8b0337f7ecdd3358b6892af88d325254f Mon Sep 17 00:00:00 2001
From: Andrey Miroshnikov
Date: Wed, 5 Apr 2023 11:52:57 +0000
Subject: [PATCH] normal.mdwn: Fix lowercase start of sentence and remove extra
 spaces.

---
 openpower/sv/normal.mdwn | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/openpower/sv/normal.mdwn b/openpower/sv/normal.mdwn
index d3aa75ead..cb260ce21 100644
--- a/openpower/sv/normal.mdwn
+++ b/openpower/sv/normal.mdwn
@@ -6,7 +6,7 @@
 * [[svp64]]
 
 Normal SVP64 Mode covers Arithmetic and Logical operations
-to provide suitable additional behaviour.  The Mode
+to provide suitable additional behaviour. The Mode
 field is bits 19-23 of the [[svp64]] RM Field.
 
 Table of contents:
@@ -16,24 +16,24 @@ Table of contents:
 ## Mode
 
 Mode is an augmentation of SV behaviour, providing additional
-functionality.  Some of these alterations are element-based (saturation),
+functionality. Some of these alterations are element-based (saturation),
 others involve post-analysis (predicate result) and others are
 Vector-based (mapreduce, fail-on-first).
 
 [[sv/ldst]], [[sv/cr_ops]] and [[sv/branches]] are covered separately:
 the following Modes apply to Arithmetic and Logical SVP64 operations:
 
-* **simple** mode is straight vectorisation. no augmentations: the
+* **simple** mode is straight vectorisation. No augmentations: the
   vector comprises an array of independently created results.
 * **ffirst** or data-dependent fail-on-first: see separate section.
-  the vector may be truncated depending on certain criteria.
+  The vector may be truncated depending on certain criteria.
   *VL is altered as a result*.
 * **sat mode** or saturation: clamps each element result to a min/max
-  rather than overflows / wraps. allows signed and unsigned clamping
+  rather than overflows / wraps. Allows signed and unsigned clamping
   for both INT and FP.
-* **reduce mode**. if used correctly, a mapreduce (or a prefix sum)
-  is performed. see [[svp64/appendix]].
-  note that there are comprehensive caveats when using this mode.
+* **reduce mode**. If used correctly, a mapreduce (or a prefix sum)
+  is performed. See [[svp64/appendix]].
+  Note that there are comprehensive caveats when using this mode.
 * **pred-result** will test the result (CR testing selects a bit of CR
   and inverts it, just like branch conditional testing) and if the
   test fails it is as if the *destination* predicate bit was zero even
@@ -66,7 +66,7 @@ Fields:
 
 * **sz / dz** if predication is enabled will put zeros into the dest
   (or as src in the case of twin pred) when the predicate bit is zero.
-  otherwise the element is ignored or skipped, depending on context.
+  Otherwise the element is ignored or skipped, depending on context.
 * **zz**: both sz and dz are set equal to this flag
 * **inv CR bit** just as in branches (BO) these bits allow testing of
   a CR bit and whether it is set (inv=0) or unset (inv=1)
@@ -123,7 +123,7 @@ dest elwidth.
 given element hit saturation may be done using a mapreduced CR op (cror),
 or by using the new crrweird instruction with Rc=1, which will transfer
 the required CR bits to a scalar integer and update CR0, which will allow
-testing the scalar integer for nonzero. see [[sv/cr_int_predication]].
+testing the scalar integer for nonzero. See [[sv/cr_int_predication]].
 Alternatively, a Data-Dependent Fail-First may be used to truncate the
 Vector Length to non-saturated elements, greatly increasing the productivity
 of parallelised inner hot-loops.*
@@ -141,7 +141,7 @@ Reduce Mode should not be confused with Parallel Reduction [[sv/remap]].
 As explained in the [[sv/appendix]] Reduce Mode switches off the check
 which would normally stop looping if the result register is scalar.
 Thus, the result scalar register, if also used as a source scalar,
-may be used to perform sequential accumulation.  This *deliberately*
+may be used to perform sequential accumulation. This *deliberately*
 sets up a chain of Register Hazard Dependencies, whereas Parallel Reduce
 [[sv/remap]] deliberately issues a Tree-Schedule of operations that may
 be parallelised.
@@ -149,7 +149,7 @@ be parallelised.
 ## Data-dependent Fail-on-first
 
 Data-dependent fail-on-first is CR-field-driven and is completely separate
-and distinct from LD/ST Fail-First (also known as Fault-First).  Note in
+and distinct from LD/ST Fail-First (also known as Fault-First). Note in
 each case the assumption is that vector elements are required to appear
 to be executed in sequential Program Order. When REMAP is not active,
 element 0 would be the first.
@@ -220,7 +220,7 @@ the other hand are expected, unavoidably, to be low-performance*.
 
 Two extremely important aspects of ffirst are:
 
-* LDST ffirst may never set VL equal to zero.  This because on the first
+* LDST ffirst may never set VL equal to zero. This because on the first
   element an exception must be raised "as normal".
 * CR-based data-dependent ffirst on the other hand **can** set VL equal
   to zero. This is the only means in the entirety of SV that VL may be set
@@ -237,7 +237,7 @@ The second crucial aspect, compared to LDST Ffirst:
   non-deterministic.
 * CR-based data-dependent first on the other hand MUST NOT truncate VL
   arbitrarily to a length decided by the hardware: VL MUST only be
-  truncated based explicitly on whether a test fails.  This because it is
+  truncated based explicitly on whether a test fails. This because it is
   a precise Deterministic test on which algorithms can and will will rely.
 
 **Floating-point Exceptions**
@@ -245,7 +245,7 @@ The second crucial aspect, compared to LDST Ffirst:
 When Floating-point exceptions are enabled VL must be truncated at
 the point where the Exception appears not to have occurred. If `VLi`
 is set then VL must include the faulting element, and thus the faulting
-element will always raise its exception.  If however `VLi` is clear then
+element will always raise its exception. If however `VLi` is clear then
 VL **excludes** the faulting element and thus the exception will **never**
 be raised.
 
-- 
2.30.2
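
The `VLi` truncation rule quoted in the final hunk can be illustrated with a small model. This is a minimal Python sketch, not code from the Libre-SOC simulator. The helper name `ffirst_truncate`, its parameters, and the use of a plain Python predicate in place of a CR-bit test are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def ffirst_truncate(
    results: Sequence[int],
    test: Callable[[int], bool],
    vl: int,
    vli: bool = False,
) -> int:
    """Return the new VL after a data-dependent fail-first pass.

    Hypothetical model: elements are examined in sequential program
    order starting at element 0 (no REMAP). `test` stands in for the
    CR-bit test applied to each element result.
    """
    for i in range(vl):
        if not test(results[i]):
            # VLi set: VL includes the failing element.
            # VLi clear: VL excludes the failing element.
            return i + 1 if vli else i
    # No element failed the test, so VL is left unchanged.
    return vl


def nonzero(x: int) -> bool:
    """Example test: passes while the element result is non-zero."""
    return x != 0
```

In this model, a failure on element 0 with `vli=False` returns 0, which corresponds to the statement that CR-based data-dependent ffirst can set VL equal to zero. The truncation depends only on the test outcome, never on a length chosen by the hardware.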