# Links

* [[sv/int_fp_mv]]

# Questions (09 oct 2022)

**
1. What is "BF16"? It seems not to be mentioned in the architecture spec.
The architecture spec (VSX chapter) defines two 16-bit binary FP formats.
Judging by the way the RFC uses "BF16", I think it means what the VSX
chapter calls "bfloat16", which has the exponent in the same bits as
single format. This should be clarified, and the corresponding format
will need to be defined in Section 4.3.1 (Data Format).
**

BF16 seems to be an equally commonly used term for bfloat16, yes.

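to illustrate the point about exponent placement, here is a minimal python sketch. it assumes BF16 is simply the upper 16 bits of an IEEE 754 binary32 pattern, which is what "the exponent in the same bits as single format" implies. the function names are made up for this example and are not part of the RFC.

```python
import struct


def bf16_bits_to_float(bits16: int) -> float:
    """Widen a 16-bit bfloat16 pattern to a float.

    Assumption: bfloat16 is the upper half of binary32, so widening is a
    16-bit left shift with the low half zero-filled.
    """
    if not 0 <= bits16 <= 0xFFFF:
        raise ValueError("expected a 16-bit pattern")
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]


def float_to_bf16_bits_truncate(value: float) -> int:
    """Narrow a float to bfloat16 by truncation (no rounding), via binary32."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 >> 16


# sign, exponent and the top 7 mantissa bits line up with binary32
print(bf16_bits_to_float(0x3F80))
print(hex(float_to_bf16_bits_truncate(-2.0)))
```

truncation is used only to keep the sketch short; nothing here says which rounding a real conversion should apply.
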
**
2. For fishmv, what happens if the value supplied in the FPR is not
representable in single format?
**

exactly the same thing as if `fld` were used to load an "unrepresentable"
value: nothing. if `fld` raised flags or exceptions then so would (should)
`fmvis`. these are immediates, statically-compiled. if the developer
wants "invalid" data, statically-compiled into a binary, it is reasonable
to assume they have good reasons for doing so.

**
3. The first clause of the verbal description of fishmv seems to assume
that the contents of the specified register were produced by fmvis.
Is there any other use of fishmv? If yes, the verbal description should
be generalized. If no, the wording should be explicit about this use.
**

given that the bits are spread out in `DOUBLE()` format it seems unlikely.
if the bits were placed contiguously (sequentially) then it would indeed
be a different matter: temporary storage for constants to be transferred
directly (unmodified) to GPRs for example. but `DOUBLE()` formatting
makes that not possible unfortunately.

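a rough python model of the `fmvis` / `fishmv` pairing, to show why the FPR contents are not a contiguous copy of the two immediates. it assumes `fmvis` places its 16-bit immediate in the upper half of a binary32 value, `fishmv` replaces the lower half, and the FPR holds the result widened to double format. the helper names (`single_to_fpr`, `fpr_to_single_bits`, `fpr_bits`) are hypothetical, this is not the RFC's RTL, and only values exactly representable in binary32 are handled.

```python
import struct


def single_to_fpr(bits32: int) -> float:
    """Stand-in for DOUBLE(): widen a binary32 bit pattern to a double."""
    return struct.unpack(">f", struct.pack(">I", bits32 & 0xFFFFFFFF))[0]


def fpr_to_single_bits(fpr: float) -> int:
    """Stand-in for SINGLE(): binary32 pattern of an FPR value.

    Only meaningful when the value is exactly representable in binary32.
    """
    return struct.unpack(">I", struct.pack(">f", fpr))[0]


def fpr_bits(fpr: float) -> int:
    """Raw 64-bit pattern as it would sit in the FPR."""
    return struct.unpack(">Q", struct.pack(">d", fpr))[0]


def fmvis(imm16: int) -> float:
    """Assumed behaviour: the immediate becomes the upper half of a binary32."""
    return single_to_fpr((imm16 & 0xFFFF) << 16)


def fishmv(fpr: float, imm16: int) -> float:
    """Assumed behaviour: replace the lower half of the binary32 value."""
    upper = fpr_to_single_bits(fpr) & 0xFFFF0000
    return single_to_fpr(upper | (imm16 & 0xFFFF))


# build binary32 pi (pattern 0x40490FDB) from two 16-bit immediates
frt = fishmv(fmvis(0x4049), 0x0FDB)
print(hex(fpr_to_single_bits(frt)))  # binary32 view of the result
print(hex(fpr_bits(frt)))            # 64-bit view held in the FPR
```

the 64-bit view differs from the 32-bit one because the exponent is re-biased and the mantissa is left-aligned in the 52-bit field, so the two immediates cannot be read back out of the FPR by a plain move.
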
**
4. The instruction names and mnemonics should be more consistent with the
architecture spec. In particular, the architecture spec tends to use
"Move" for instructions that transfer data between registers. Here are
two approaches.
**

```
a. Model the instructions on li (Load Immediate), an extended mnemonic for
   addi.
       fmvis  --> Floating Load Immediate Single (flis)
       fishmv --> Floating Load Immediate Single Lower (flisl)
   Under this approach the new instructions would belong in their own
   3-level section, after Section 4.6.4 (Floating-Point Load and Store
   Double Pair Instructions).

b. Model the instructions on lxvkq (and the existing FP Load instructions)
       fmvis  --> Load Floating-Point Single Immediate (lfsi)
       fishmv --> Load Floating-Point Single Immediate Lower (lfsil)
   Under this approach the new instructions would belong in Section 4.6.2
   (Floating-Point Load Instructions), with the Load Floating-Point
   Single instructions.

I prefer (a), because I think it's confusing to treat these instructions,
which don't access storage, like instructions that do access storage.
```

the fact that they bypass D-Cache and correspondingly raise no flags or
exceptions is the connection to `ld`. despite that, i like (a) as well,
although for purely non-technical reasons (more "memorable") i do love
the pair of mnemonics `flis` and `fishmv` :)

we picked "s" on the end of `fmvis` (`flis`) because it is "shifted"
(like `oris`)

Other:

**
1. The RFC should be based on the current version of the architecture,
which is V. 3.1B. I believe this has no effect on the substance of the
RFC. But it affects the identities of the instruction-list appendices,
which in V. 3.1B are E, F, G, and H.
**

acknowledged. will edit. done.

**
2. Additional affected sections are 1.6.1.6 (additional line for DX-form),
1.6.2 (additional use for d0,d1,d2), and Appendix D (Opcode Maps).
**

ditto. TODO.

**
3. Does the last line of the Summary apply to both instructions or just to
fishmv? I can see why you would want a prefixed version of fmvis, which
would supply the entire 32-bit FP single format value and avoid the need
for fishmv. Why would you want a prefixed version of fishmv?
**

the analysis counting instructions and D-Cache Loads actually shows
that whilst the initial idea for `pfmvis` would be to fill in the
remaining mantissa and high exponent bits to complete a full FP64,
the cost of doing so is:

* 1x32 flis
* 1x32 fishmv
* 1x64 pfishmv

which is QTY 8 16-bit halfwords (16 bytes), actually *more* than just `fld`,
which is only QTY 6 halfwords (12 bytes, counting the instruction and the
64-bit value it loads). the only technical reason therefore is
to avoid D-Cache entirely, just like the 5-instruction sequence
that writes a 64-bit GPR only from immediates
(li, oris, rldicl, li, oris), although that is justifiable
as a critical means of bootstrapping (constructing 64-bit addresses)

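the tally can be checked with a few lines of python. the 1xN entries are read as instruction widths in bits, and the `fld` figure is assumed to be one 32-bit instruction plus the 64-bit value it loads (that split is an inference, not something stated above). on that reading the quantities 8 and 6 are counts of 16-bit halfwords.

```python
# instruction widths in bits, as listed above
immediate_sequence = {"flis": 32, "fishmv": 32, "pfishmv": 64}

# assumption: fld costs one 32-bit instruction plus the 64-bit datum it loads
fld_sequence = {"fld instruction": 32, "FP64 data": 64}


def total_bytes(parts: dict) -> int:
    """Total size of a sequence in bytes."""
    return sum(parts.values()) // 8


def total_halfwords(parts: dict) -> int:
    """Total size of a sequence in 16-bit halfwords."""
    return sum(parts.values()) // 16


for name, parts in (("immediates", immediate_sequence), ("fld", fld_sequence)):
    print(name, total_bytes(parts), "bytes,", total_halfwords(parts), "halfwords")
```
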
**
4. The Motivation says "Even clearing an FPR to zero presently requires Load".
What about fsub FRT,FRA,FRA?
**

didn't know about it! although technically that reads registers
(unless micro-code-redirected to an internal zeroing operation)

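one caveat on `fsub FRT,FRA,FRA` as a zeroing idiom, sketched with python floats as a stand-in for FPR arithmetic (assumptions: python floats are IEEE 754 doubles, and no Power ISA status flags or exceptions are modelled). x - x is only zero when x is finite.

```python
import math


def fsub_self(fra: float) -> float:
    """Stand-in for fsub FRT,FRA,FRA using IEEE 754 double arithmetic."""
    return fra - fra


# finite inputs give zero; infinity and NaN do not
for value in (1.5, -0.0, math.inf, math.nan):
    print(value, "->", fsub_self(value))
```

so a register that happens to hold infinity or NaN from earlier code would not be cleared this way, which an immediate load avoids.
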
**
5. "FRS" for both instructions should be changed to "FRT". ("FRS" normally
specifies a source register; see Section 1.6.2. I understand that for
fishmv the specified register is both source and target. But "TX,T"
provides precedent for using the "target form" of register specification
for such cases.)
6. The RTL for fmvis should use left arrow for assignment.
**

RTL error corrected. ack on FRT.

**
7. The architecture spec (VSX chapter) uses "BFP32" and "BFP64", and the
lower-case versions thereof, for the 32-bit and 64-bit binary FP formats.
The RFC's "FP32" and "FP64" (and lower case of same) should be made
consistent with this usage.
**

acknowledged. TODO.

**
8. More generally, the style of the verbal description for both instructions
should be made more consistent with the style used in the architecture
spec.
**

yes Paul kindly gave advice on that.

**
9. In the first clause of the verbal description of fishmv I think "inserted
into FRS" should be "inserted into the low-order half of the
single-format value corresponding to the contents of FRT".
A similar change should be made in the second sentence of the next
paragraph.
**

ack. TODO.

**
10. The paragraph before the Programming Note in the fishmv description
says "This is strategically similar to how li combined with oris is used
to construct 32-bit Integers". li combined with oris works only if bit 16
of the desired 32-bit integer is 0. (A better way to construct a 32-bit
integer is to use pli (extended mnemonic for paddi).)
**

it is extremely unlikely that we (Libre-SOC) will implement any of the v3.1
64-bit prefixing (vectorising it would result in unacceptable 96-bit
instructions, so what is the point). that said, the extended immediate
range for LD addressing is extremely useful.

bottom line: we have given almost no thought to using any v3.1 Scalar
Prefixed instructions at all, so don't even know 99% of what they do.
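
the `li` + `oris` limitation raised in item 10 can be reproduced with a small python model. it assumes 64-bit registers, `li` as `addi RT,0,SI` with a sign-extended 16-bit immediate, and `oris` ORing its immediate shifted left by 16. "bit 16" is taken to be in MSB0 numbering, i.e. the top bit of the low halfword. the function names simply mirror the mnemonics and `li_oris` is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1


def li(si16: int) -> int:
    """li RT,SI: sign-extend a 16-bit immediate into a 64-bit register."""
    si16 &= 0xFFFF
    if si16 & 0x8000:
        si16 -= 0x10000  # negative immediate: upper bits become all ones
    return si16 & MASK64


def oris(rs: int, ui16: int) -> int:
    """oris RA,RS,UI: OR the immediate, shifted left 16, into the register."""
    return (rs | ((ui16 & 0xFFFF) << 16)) & MASK64


def li_oris(value32: int) -> int:
    """Try to build a 32-bit constant with li (low half) then oris (high half)."""
    return oris(li(value32 & 0xFFFF), (value32 >> 16) & 0xFFFF)


# works when the top bit of the low halfword is 0
print(hex(li_oris(0x12345678)))
# fails when it is 1: sign extension sets bits that oris cannot clear
print(hex(li_oris(0x12348000)))
```

the second case is the one the reviewer describes: the ones produced by sign extension survive the OR, so the upper halfword never takes the intended value.
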