X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=openpower%2Fsv%2Fldst.mdwn;h=ca67a4f004ba1461da6fd59f5613fa0a9168961e;hb=0a873ae35b9633a01c0090a30d919cc8c979cdfc;hp=e633e87b01def7900447bbb6cfddb9c28cd3db15;hpb=5e47a443522fcc7633e6ff57717b34c2559c2f29;p=libreriscv.git

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index e633e87b0..ca67a4f00 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -268,6 +268,12 @@ cache-inhibited LD should be performed, followed by a VSPLAT-augmented mv.
 
 ## LD/ST ffirst
 
+LD/ST ffirst treats the first LD/ST in a vector (element 0) as an
+ordinary one. Exceptions occur "as normal". However for elements 1
+and above, if an exception would occur, then VL is **truncated** to the
+previous element: the exception is **not** then raised because the
+LD/ST was effectively speculative.
+
 ffirst LD/ST to multiple pages via a Vectorised Index base is considered a security risk due to the abuse of probing multiple pages in rapid succession and getting feedback on which pages would fail. Therefore Vector Indexed LD/ST is prohibited entirely, and the Mode bit instead used for element-strided LD/ST. See
 
     for(i = 0; i < VL; i++)
@@ -282,12 +288,23 @@ speculative probing (and also adversely affect performance), but will at least
 not require applications to be rewritten.
 
 Low-performance simpler hardware implementations may
-choose to also set VL=1 as the bare minimum compliant implementation of
+choose (always) to also set VL=1 as the bare minimum compliant implementation of
 LD/ST Fail-First. It is however critically important to remember that
 the first element LD/ST **MUST** be treated as an ordinary LD/ST, i.e.
 **MUST** raise exceptions exactly like an ordinary LD/ST.
 
-For ffirst LD/STs, VL may be truncated arbitrarily to a nonzero value for any implementation-specific reason. For example: it is perfectly reasonable for implementations to alter VL when ffirst LD or ST operations are initiated on a nonaligned boundary, such that within a loop the subsequent iteration of that loop begins subsequent ffirst LD/ST operations on an aligned boundary. Likewise, to reduce workloads or balance resources.
+For ffirst LD/STs, VL may be truncated arbitrarily to a nonzero value for any implementation-specific reason. For example: it is perfectly reasonable for implementations to alter VL when ffirst LD or ST operations are initiated on a nonaligned boundary, such that within a loop the subsequent iteration of that loop begins subsequent ffirst LD/ST operations on an aligned boundary
+such as the beginning of a cache line, or beginning of a Virtual Memory
+page. Likewise, to reduce workloads or balance resources.
+
+Vertical-First Mode is slightly strange in that only one element
+at a time is ever executed anyway. Given that programmers may
+legitimately choose to alter srcstep and dststep in non-sequential
+order as part of explicit loops, it is neither possible nor
+safe to make speculative assumptions about future LD/STs.
+Therefore, Fail-First LD/ST in Vertical-First is `UNDEFINED`.
+This is very different from Arithmetic (Data-dependent) FFirst
+where Vertical-First Mode is deterministic, not speculative.
 
 # LOAD/STORE Elwidths
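
As a reading aid for the LD/ST ffirst text added by this patch, the following is a minimal Python sketch of the truncation rule for an element-strided load. It is not taken from the specification or from any Libre-SOC simulator. The names `MemFault` and `ffirst_load`, and the use of a dict in which a missing address models a faulting access, are assumptions made purely for illustration. It also assumes that "truncated to the previous element" means the new VL equals the index of the faulting element, so that only the elements before it count as completed.

```python
class MemFault(Exception):
    """Hypothetical stand-in for a page fault or other LD/ST exception."""


def ffirst_load(mem, base, stride, vl):
    """Illustrative model of a Fail-First element-strided load.

    mem    -- dict mapping address -> value; a missing address models a fault
    base   -- address of element 0
    stride -- distance in bytes between consecutive elements
    vl     -- requested Vector Length
    Returns (loaded_values, new_vl).
    """
    loaded = []
    for i in range(vl):
        addr = base + i * stride
        if addr not in mem:
            if i == 0:
                # Element 0 is an ordinary LD: the exception is raised.
                raise MemFault(hex(addr))
            # Elements 1 and above: truncate VL and do NOT raise,
            # because the access was effectively speculative.
            return loaded, i
        loaded.append(mem[addr])
    return loaded, vl


# Three mapped 8-byte elements; the fourth access (address 24) would fault.
mem = {0: 10, 8: 11, 16: 12}
values, new_vl = ffirst_load(mem, base=0, stride=8, vl=5)
```

Here `new_vl` comes back as 3 and no exception is raised, whereas starting the same load at `base=24` raises `MemFault`, because element 0 must behave exactly like an ordinary LD. An implementation that always returns after element 0 (VL=1) would also satisfy this model, matching the "bare minimum" case mentioned in the patch.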