# Condition Register SVP64 Operations

Condition Register Fields are only 4 bits wide: this presents some
interesting conceptual challenges for SVP64, particularly with respect to element
width (which is clearly meaningless). Likewise, arithmetic saturation
(an important part of Arithmetic SVP64)
has no meaning. Additionally, extra modes are required that only make
sense for Vectorised CR Operations. Consequently an alternative Mode Format is required.

This alternative mapping **only** applies to instructions that **only**
reference a CR Field or CR bit as the sole exclusive result. This section
**does not** apply to instructions which primarily produce arithmetic
results that also, as an aside, produce a corresponding
CR Field (such as when Rc=1).
Instructions that involve Rc=1 are definitively arithmetic in nature,
where the corresponding Condition Register Field can be considered to
be a "co-result". Thus, if the arithmetic result is Vectorised, so
is the CR Field "co-result", which puts both firmly out of scope for
this section.

Other modes are still applicable and include:

* **Data-dependent fail-first**.
  Useful to truncate VL based on
  analysis of a Condition Register result bit.
* **Scalar and parallel reduction**.
  Reduction is useful
  for turning a Vector of Condition Register Fields into one
  single Condition Register Field.
* **Predicate-result**.
  Equivalent
  to Python's `filter`, in that only elements which pass a test
  will end up actually being modified. This is in effect the same
  as ANDing the Condition Test with the destination predicate
  mask (hence the name, "predicate-result").

SVP64 RM `MODE` (includes `ELWIDTH` bits) for CR-based operations:

| 4   | 5   | 19-20 | 21  | 22 23   | description                                |
| --- | --- | ----- | --- | ------- | ------------------------------------------ |
| /   | /   | 00    | 0   | dz sz   | normal mode                                |
| /   | /   | 00    | 1   | 0 RG    | scalar reduce mode (mapreduce), SUBVL=1    |
| /   | /   | 00    | 1   | 1 CRM   | parallel reduce mode (mapreduce), SUBVL=1  |
| /   | /   | 00    | 1   | SVM RG  | subvector reduce mode, SUBVL>1             |
| dz  | VLi | 01    | inv | CR-bit  | Ffirst 3-bit mode                          |
| sz  | VLi | 01    | inv | dz Rc1  | Ffirst 5-bit mode                          |
| /   | /   | 10    | /   | / /     | RESERVED                                   |
| sz  | SNZ | 11    | inv | CR-bit  | 3-bit pred-result CR sel                   |
| /   | SNZ | 11    | inv | dz sz   | 5-bit pred-result z/nonz                   |

Fields:

TODO
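The top-level split in the table above can be sketched in Python. This is only an illustration, not a reference decoder: the function name `classify_cr_mode` is hypothetical, and the caller is assumed to have already extracted bits 19-20 and bit 21 from the RM field as plain integers. The sketch does not distinguish the 3-bit and 5-bit variants (that depends on the operand width of the instruction), nor the individual reduce modes (those depend on bits 22-23 and SUBVL).

```python
def classify_cr_mode(bits_19_20, bit_21):
    """Name the mode family selected by RM bits 19-20 and bit 21.

    Hypothetical helper: both arguments are assumed to be plain
    integers that the caller has already extracted from the RM field.
    """
    if bits_19_20 not in range(4) or bit_21 not in (0, 1):
        raise ValueError("bits_19_20 must be 0..3 and bit_21 must be 0 or 1")
    if bits_19_20 == 0b00:
        # bit 21 separates normal mode from the reduce modes
        return "reduce" if bit_21 else "normal"
    if bits_19_20 == 0b01:
        return "ffirst"
    if bits_19_20 == 0b10:
        return "reserved"
    # only 0b11 remains; here bit 21 is the `inv` bit, not a mode selector
    return "pred-result"
```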

# Data-dependent fail-first on CR operations

Data-dependent SVP64 Vectorised Operations involving the creation or
modification of a CR require an extra two bits, which are not available
in the compact space of the `MODE` Field. Because element
width overrides are meaningless for CR Fields, the
`ELWIDTH` field can be repurposed to provide these extra bits.

Condition Register based operations such as `mfcr` and `crand` can thus
be made more flexible. The rules in this section
also apply to any future CR-based instructions.

There are two primary types of CR operations:

* Those which have a 3-bit operand field (referring to a CR Field)
* Those which have a 5-bit operand (referring to a bit within the
  whole 32-bit CR)

Comparing the two, the
difference is that the 5-bit variant provides
additional information about which CR Field bit (LT, GT, EQ, SO) is to
be operated on by the instruction.

Thus, logically, we may set the following rules:

* When a 5-bit CR Result field is used in an instruction, the
  `inv, VLi and Rc1` variant of Data-Dependent Fail-First
  must be used, i.e. the bit of the CR field to be tested is
  the one that has just been modified by the operation.
* When a 3-bit CR Result field is used the `inv CR-bit` variant
  must be used in order to select which CR Field bit shall
  be tested (LT, GT, EQ or SO).

The 3-bit CR variant needs the additional CR-bit
field because the 3-bit CR Field operand
of the base Power ISA v3.0B operation does not contain
the two CR Field Selector bits. These two
bits (to select LT, GT, EQ or SO) must therefore be provided in another
way.

Examples of each type:

* crand, cror, crnor. These are all 5-bit (BA, BB, BT). The bit
  to be tested against `inv` is the one selected by `BT`.
* mcrf. This has only 3-bit operands (BF, BFA). In order to select the
  bit to be tested, the alternative FFirst encoding must be used.

This limits sv.mcrf in that it may not use the `VLi` (VL inclusive)
Mode. This is unfortunate but unavoidable due to encoding pressure
on SVP64.
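The bit-selection rules in this section can be illustrated with a small Python sketch. It rests on several assumptions that this page does not state: the helper names are hypothetical; a 4-bit CR Field is modelled as an integer in MSB0 order (LT, GT, EQ, SO), with a scalar 5-bit operand split as field number and bit number following base Power ISA numbering; the test polarity (fail when the tested bit differs from `inv`) is borrowed from the pred-result pseudocode later on this page; and `VLi` is taken to mean that the failing element is included in the truncated VL. The `vli` flag is accepted for both forms here; whether a given instruction can actually encode it is not modelled.

```python
CR_BITS = ("LT", "GT", "EQ", "SO")  # assumed MSB0 order within a CR Field


def split_cr_bit_operand(bt):
    """Split a scalar 5-bit CR bit operand into (CR Field number, bit name)."""
    if not 0 <= bt < 32:
        raise ValueError("a 5-bit operand must be in the range 0..31")
    # upper 3 bits pick the CR Field, lower 2 bits pick the bit within it
    return bt >> 2, CR_BITS[bt & 0b11]


def ffirst_truncate(results, inv, vli, cr_bit=None):
    """Return the VL that results from data-dependent fail-first.

    results: per-element results that have already been computed.
             Single bits in 5-bit mode (cr_bit=None), or 4-bit CR Fields
             in 3-bit mode (cr_bit = 0..3, MSB0, selecting LT/GT/EQ/SO).
    """
    for i, result in enumerate(results):
        if cr_bit is None:
            # 5-bit mode: the bit just produced is the bit tested
            to_test = result
        else:
            # 3-bit mode: CR-bit selects which bit of the field is tested
            to_test = (result >> (3 - cr_bit)) & 1
        if to_test != inv:  # assumed polarity, see lead-in
            # truncate at the failing element; VLi includes that element
            return i + 1 if vli else i
    return len(results)  # no element failed: VL is unchanged
```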

# Predicate-result Condition Register operations

These are again slightly different compared to SVP64 arithmetic
pred-result (described in [[svp64/appendix]]). The reason is that,
again, for arithmetic operations the production of a CR Field when
Rc=1 is a *co-result* accompanying the main arithmetic result, whereas
for CR-based operations the CR Field (referred to by a 3-bit
v3.0B base operand from e.g. `mfcr`) or CR bit (referred to by a 5-bit
operand from e.g. `crnor`)
*is* itself the explicit and sole result of the operation.

Therefore, logically, Predicate-result needs to be adapted to
test the actual result of the CR-based instruction, rather than
test the co-resultant CR when Rc=1.

    for i in range(VL):
        # predication test: masked-out elements are skipped
        # entirely when zeroing is disabled (sz=0)
        if sz == 0 and predicate_masked_out(i):
            continue
        if predicate_masked_out(i):
            # masked out with sz=1: SNZ stands in for the result
            if 5bit mode:
                # only one bit of CR to update
                result = SNZ
            else:
                # four copies of SNZ
                result = SNZ || SNZ || SNZ || SNZ
        else:
            # result is to go into CR. may be a 4-bit CR Field
            # (3-bit mode) or just a single bit (5-bit mode)
            result = op(...)
        if 5bit mode:
            # if this CR op has 5-bit CR result operands
            # the single bit result is what must be tested
            to_test = result
        else:
            # if however this is a 3-bit CR *field* result
            # then the bit to be tested must be selected
            to_test = result[CRbit]
        # now test CR, similar to branch
        if to_test != inv:
            continue # test failed: cancel store
        # result optionally stored
        update_CR(result)
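For experimentation, the pseudocode above can be turned into a runnable Python sketch under a few stated assumptions. The function name `pred_result` and its parameters are hypothetical. The results of `op(...)` are passed in as a precomputed list. The predicate is a list of booleans where `True` means the element is enabled. A CR Field is modelled as a 4-bit integer in MSB0 order, so `SNZ || SNZ || SNZ || SNZ` becomes `0b1111` or `0b0000`. Passing `cr_bit=None` selects 5-bit mode; a value of 0 to 3 selects 3-bit mode and picks the bit to test.

```python
def pred_result(op_results, pred, sz, snz, inv, cr_bit=None):
    """Return {element index: value written} for a pred-result CR op.

    Hypothetical model of the pseudocode: elements that are absent
    from the returned dict were left unmodified.
    """
    five_bit = cr_bit is None
    written = {}
    for i, (op_result, enabled) in enumerate(zip(op_results, pred)):
        if not enabled:
            if not sz:
                continue  # masked out, no zeroing: element untouched
            # masked out with zeroing: SNZ stands in for the result
            result = snz if five_bit else (0b1111 if snz else 0b0000)
        else:
            result = op_result
        # 5-bit mode tests the single result bit; 3-bit mode selects
        # one bit of the 4-bit field (MSB0: LT=0, GT=1, EQ=2, SO=3)
        to_test = result if five_bit else (result >> (3 - cr_bit)) & 1
        if to_test != inv:
            continue  # test failed: cancel store
        written[i] = result
    return written
```

With `op_results=[1, 0, 1, 0]`, `pred=[True, True, False, False]`, `snz=1` and `inv=1`, only element 0 is written when `sz=0`, whereas with `sz=1` the two masked-out elements also receive `SNZ` because `SNZ` itself passes the test.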