1 # RFC ls002 Floating-Point Load-Immediate
2
3 **URLs**:
4
5 * <https://libre-soc.org/openpower/sv/int_fp_mv/#fmvis>
6 * <https://libre-soc.org/openpower/sv/rfc/ls002/>
7 * <https://bugs.libre-soc.org/show_bug.cgi?id=944>
8 * <https://git.openpower.foundation/isa/PowerISA/issues/87>
9
10 **Severity**: Major
11
12 **Status**: New
13
14 **Date**: 03 Oct 2022
15
16 **Target**: v3.2
17
18 **Source**: v3.0B
19
20 **Books and Section affected**:
21
22 ```
23 Book I Scalar Floating-Point 4.6.2.1
24 Appendix D Power ISA sorted by opcode
25 Appendix E Power ISA sorted by version
26 Appendix F Power ISA sorted by mnemonic
27 ```
28
29 **Summary**
30
31 ```
32 Instructions added
33 fmvis - Floating-Point Move Immediate, Single
34 fishmv - Floating-Point Immediate, Second-half Move
35 (Potentially 64-bit prefixed of the same)
36 ```
37
38 **Submitter**: Luke Leighton (Libre-SOC)
39
40 **Requester**: Libre-SOC
41
42 **Impact on processor**:
43
44 ```
45 Addition of two new FPR-based instructions
46 (potentially 3 if EXT001 Prefixed variants added)
47 ```
48
49 **Impact on software**:
50
51 ```
52 Requires support for new instructions in assembler, debuggers,
53 and related tools.
54 ```
55
56 **Keywords**:
57
58 ```
59 FPR, Floating-point, Load-immediate, BF16, FP32
60 ```
61
62 **Motivation**
63
64 Similar to `lxvkq`, but extended to a full BF16 with one
65 32-bit instruction and a full FP32 with two 32-bit instructions.
66 These instructions always save a Data Load and the associated L1
67 and TLB lookups. Even clearing an FPR to zero presently requires a Load.
68
69 **Notes and Observations**:
70
71 1. There is no need for an Rc=1 variant because this is an immediate
72 loading instruction (an FPR equivalent to `li`)
73 2. There is no need for Special Registers (FP Flags) because this
74 is an immediate loading instruction. No FPR Load Operations
75 alter `FPSCR`, neither does `lxvkq`, and on that basis neither
76 should these instructions.
77 3. EXT001 variants, which would save similar Data-Load and Data-TLB
78 lookups, are mentioned for completeness but are not included as part
79 of this RFC. Another Stakeholder with a vested interest in 64-bit
80 Prefixed instructions may wish to consider submitting them.
81 4. `fishmv` as an FRS-only Read-Modify-Write (instead of an unnecessary
82 FRS,FRA pair) saves five potential bits, making
83 the difference between fitting in a 5-bit XO (VA/DX-Form) and requiring
84 an entire Primary Opcode.
85
86 **Changes**
87
88 Add the following entries to the Appendices, and add the instructions
89 below to Book I as a new Section 4.6.2.1.
90
91 ----------------
92
93 # Appendices
94
95 Appendix D Power ISA sorted by opcode
96 Appendix E Power ISA sorted by version
97 Appendix F Power ISA sorted by mnemonic
98
99 | Form | Book | Page | Version | mnemonic | Description |
100 |------|------|------|---------|----------|-------------|
101 | DX | I | # | 3.0B | fmvis | Floating-Point Move Immediate, Single |
102 | DX | I | # | 3.0B | fishmv | Floating-Point Immediate, Second-half Move |
103
104 \newpage{}
105
106 # Floating-Point Move Immediate
107
108 `fmvis FRS, D`
109
110 | 0-5 | 6-10 | 11-15 | 16-25 | 26-30 | 31 | Form |
111 |--------|------|-------|-------|-------|-----|---------|
112 | Major | FRS | d1 | d0 | XO | d2 | DX-Form |
113
114 Pseudocode:
115
116 ```
117 bf16 <- d0 || d1 || d2 # create BF16 immediate
118 fp32 <- bf16 || [0]*16 # convert BF16 to FP32
119 FRS <- DOUBLE(fp32) # convert FP32 to FP64
120 ```
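
The way `D` is scattered across the DX-Form fields can be sketched in Python. This is a minimal, non-normative sketch: the field widths and bit positions are read off the DX-Form table above, and `D` is reassembled as `d0 || d1 || d2` as in the pseudocode. The function names are illustrative only, and the `major` and `xo` parameters of `encode_dx` are hypothetical placeholders, because this RFC does not assign opcode values.

```python
def split_dx_immediate(d):
    """Split a 16-bit immediate D into the DX-Form fields (d0, d1, d2)."""
    if not 0 <= d <= 0xFFFF:
        raise ValueError("D must fit in 16 bits")
    d0 = (d >> 6) & 0x3FF  # upper 10 bits of D, instruction bits 16-25
    d1 = (d >> 1) & 0x1F   # next 5 bits of D, instruction bits 11-15
    d2 = d & 0x1           # lowest bit of D, instruction bit 31
    return d0, d1, d2


def join_dx_immediate(d0, d1, d2):
    """Rebuild D as d0 || d1 || d2, as in the pseudocode."""
    return (d0 << 6) | (d1 << 1) | d2


def encode_dx(major, frs, d, xo):
    """Pack a 32-bit DX-Form word (bit 0 is the most significant bit).

    `major` and `xo` are placeholders: the RFC does not assign them.
    """
    d0, d1, d2 = split_dx_immediate(d)
    return ((major & 0x3F) << 26 | (frs & 0x1F) << 21 | d1 << 16
            | d0 << 6 | (xo & 0x1F) << 1 | d2)
```

A decoder would apply `join_dx_immediate` to the three extracted fields before interpreting the result as a BF16.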
121
122 Special registers altered:
123
124 None
125
126 Reinterprets `D << 16` as a 32-bit float, which is then converted to a
127 64-bit float and written to `FRS`. This is equivalent to reinterpreting
128 `D` as a `BF16` and converting to 64-bit float.
129
130 Examples:
131
132 ```
133 fmvis f4, 0 # writes +0.0 to f4 (clears an FPR)
134 fmvis f4, 0x8000 # writes -0.0 to f4
135 fmvis f4, 0x3F80 # writes +1.0 to f4
136 fmvis f4, 0xBFC0 # writes -1.5 to f4
137 fmvis f4, 0x7FC0 # writes +qNaN to f4
138 fmvis f4, 0x7F80 # writes +Infinity to f4
139 fmvis f4, 0xFF80 # writes -Infinity to f4
140 fmvis f4, 0x3FFF # writes +1.9921875 to f4
141 ```
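
The examples above can be reproduced with a small software model. The following is a sketch, not a reference implementation: it works on raw bit patterns so that NaN handling does not depend on the host platform, and it assumes that `DOUBLE()` is an exact widening conversion that also normalises FP32 denormals. The helper names are illustrative only.

```python
import struct


def fp32_bits_to_fp64_bits(w):
    """Widen raw FP32 bits to raw FP64 bits (assumed model of DOUBLE())."""
    sign = (w >> 31) & 1
    exp = (w >> 23) & 0xFF
    frac = w & 0x7FFFFF
    if exp == 0xFF:        # infinity or NaN
        exp64 = 0x7FF
    elif exp != 0:         # normal number: rebias from 127 to 1023
        exp64 = exp + 896
    elif frac == 0:        # signed zero
        exp64 = 0
    else:                  # FP32 denormal: normalise for FP64
        exp64 = 897
        while not frac & 0x800000:
            frac <<= 1
            exp64 -= 1
        frac &= 0x7FFFFF   # drop the now-implicit leading one
    return (sign << 63) | (exp64 << 52) | (frac << 29)


def fmvis(d):
    """Model of fmvis: return the 64-bit pattern written to FRS."""
    bf16 = d & 0xFFFF
    fp32 = bf16 << 16      # BF16 is the upper half of an FP32
    return fp32_bits_to_fp64_bits(fp32)


def as_float(bits64):
    """Reinterpret a raw 64-bit pattern as a Python float."""
    return struct.unpack(">d", struct.pack(">Q", bits64))[0]


for d in (0x0000, 0x8000, 0x3F80, 0xBFC0, 0x7FC0, 0x7F80, 0xFF80, 0x3FFF):
    print(f"fmvis f4, {d:#06x}  ->  {fmvis(d):#018x}  {as_float(fmvis(d))!r}")
```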
142
143 # Floating-Point Immediate Second-Half Move
144
145 `fishmv FRS, D`
146
147 DX-Form:
148
149 | 0-5 | 6-10 | 11-15 | 16-25 | 26-30 | 31 | Form |
150 |--------|------|-------|-------|-------|-----|---------|
151 | Major | FRS | d1 | d0 | XO | d2 | DX-Form |
152
153 Pseudocode:
154
155 ```
156 n <- (FRS) # read FRS
157 fp32 <- SINGLE(n) # convert to FP32
158 fp32[16:31] <- d0 || d1 || d2 # replace LSB half
159 FRS <- DOUBLE(fp32) # convert back to FP64
160 ```
161
162 Special registers altered:
163
164 None
165
166 An additional 16 bits of immediate are
167 inserted into `FRS` to extend its precision to
168 a full FP32, which is then stored in the usual FP64 Format within the FPR.
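
The read-modify-write in the pseudocode can be modelled on raw bit patterns. This is a sketch under stated assumptions: `SINGLE()` is taken to behave like a bit-slicing narrowing conversion that is exact whenever `FRS` holds a value representable in FP32 (as it does after `fmvis`), and `DOUBLE()` is taken to be the exact widening conversion. If the final specification defines these conversions differently, results would differ for values outside the FP32 range. All function names are illustrative.

```python
def fp32_bits_to_fp64_bits(w):
    """Widen raw FP32 bits to raw FP64 bits (assumed model of DOUBLE())."""
    sign = (w >> 31) & 1
    exp = (w >> 23) & 0xFF
    frac = w & 0x7FFFFF
    if exp == 0xFF:        # infinity or NaN
        exp64 = 0x7FF
    elif exp != 0:         # normal number: rebias from 127 to 1023
        exp64 = exp + 896
    elif frac == 0:        # signed zero
        exp64 = 0
    else:                  # FP32 denormal: normalise for FP64
        exp64 = 897
        while not frac & 0x800000:
            frac <<= 1
            exp64 -= 1
        frac &= 0x7FFFFF
    return (sign << 63) | (exp64 << 52) | (frac << 29)


def fp64_bits_to_fp32_bits(x):
    """Narrow raw FP64 bits to raw FP32 bits (assumed model of SINGLE()).

    Exact when the value is representable in FP32; otherwise low-order
    fraction bits are simply dropped (an assumption of this sketch).
    """
    sign = (x >> 63) & 1
    exp = (x >> 52) & 0x7FF
    frac = x & ((1 << 52) - 1)
    if exp > 896 or (exp == 0 and frac == 0):
        # zero, normal, infinity, NaN: keep the top two bits, skip three
        # exponent bits, keep the following 30 bits
        return ((x >> 32) & 0xC0000000) | ((x >> 29) & 0x3FFFFFFF)
    # value below the FP32 normal range: produce an FP32 denormal
    mant = (1 << 52) | frac
    return (sign << 31) | (mant >> (926 - exp))


def fishmv(frs_bits, d):
    """Model of fishmv: replace the low 16 bits of the FP32 view of FRS."""
    fp32 = fp64_bits_to_fp32_bits(frs_bits)
    fp32 = (fp32 & 0xFFFF0000) | (d & 0xFFFF)
    return fp32_bits_to_fp64_bits(fp32)


# FRS initially holds +1.0, for example after `fmvis f4, 0x3F80`
print(f"{fishmv(0x3FF0000000000000, 0x8000):#018x}")
```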
169
170 **This instruction performs a Read-Modify-Write.** *FRS is read,
171 the additional 16-bit immediate is inserted, and the result is
172 written back to FRS.
173 This is similar in strategy to how `li` combined with `oris` is
174 used to construct 32-bit Integers.
175 `fishmv` may be macro-op-fused with `fmvis`.*
176
177 Programmer's note:
178 If a prior `fmvis` instruction has been used to
179 set the upper 16 bits of an FP32 value, `fishmv` may be used
180 to set the lower 16 bits.
181
182 Example:
183
184 ```
185 # these two combined instructions write 0x3f808000
186 # into f4 as an FP32 to be converted to an FP64.
187 # actual contents in f4 after conversion: 0x3ff0_1000_0000_0000
188 # first the upper bits, happens to be +1.0
189 fmvis f4, 0x3F80 # writes +1.0 to f4
190 # now write the lower 16 bits of an FP32
191 fishmv f4, 0x8000 # writes +1.00390625 to f4
192 ```
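
For tool authors, the two immediates for an arbitrary constant can be derived as sketched below. This is an illustration only: `fp32_immediates` and `load_fp32_sequence` are hypothetical helper names, not part of any existing assembler, and the sketch relies on Python's `struct` module to round the value to FP32 (it raises `OverflowError` for finite values too large for FP32).

```python
import struct


def fp32_immediates(value):
    """Return the (upper, lower) 16-bit halves of `value` rounded to FP32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16, bits & 0xFFFF


def load_fp32_sequence(reg, value):
    """Return assembly lines that load `value` (as FP32) into `reg`."""
    upper, lower = fp32_immediates(value)
    lines = [f"fmvis {reg}, {upper:#06x}"]
    if lower:  # fishmv is unnecessary when the low half is zero
        lines.append(f"fishmv {reg}, {lower:#06x}")
    return lines


for line in load_fp32_sequence("f4", 1.00390625):
    print(line)
```

Constants whose FP32 encoding has a zero lower half, such as +1.0 or -1.5, need only the single `fmvis`.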
193 [[!tag opf_rfc]]
194
195 -------------
196