# Note-form on ISAMUX (aka "ISANS")

A fixed number of additional (hidden) bits, conceptually a "namespace", that go directly and non-optionally
into the instruction decode phase, extending (in each implementation) the
opcode length to 16+N, 32+N, 48+N, where N is a hard fixed quantity on
a per-implementor basis.

Where the opcode is normally loaded from the location at the PC, the extra
bits, set via a CSR, are mandatorily appended to every instruction: this is why they are described as "hidden" opcode bits, and as a "namespace".

The parallels with C++ "using namespace" are direct and clear.

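As a rough illustration of the "hidden bits" idea, the sketch below models the value seen by the decoder as the fetched opcode with N namespace bits appended. The function name `decode_key`, the choice of N = 8 and the placement of the namespace bits above the opcode are assumptions made for this example only; the text above requires only that the bits are appended to every instruction.

```python
def decode_key(opcode: int, opcode_bits: int, isans: int, ns_bits: int) -> int:
    """Return the (opcode_bits + ns_bits)-wide value presented to the decoder."""
    if opcode_bits not in (16, 32, 48):
        raise ValueError("opcode length must be 16, 32 or 48 bits")
    if not 0 <= opcode < (1 << opcode_bits):
        raise ValueError("opcode does not fit in opcode_bits")
    if not 0 <= isans < (1 << ns_bits):
        raise ValueError("namespace value does not fit in ns_bits")
    # Assumption: the hidden bits sit above the opcode bits.
    return (isans << opcode_bits) | opcode


# The same 16-bit opcode under two namespaces yields two distinct decoder
# keys, so it can carry two distinct meanings.
key_ns0 = decode_key(0x4501, 16, isans=0b0, ns_bits=8)
key_ns1 = decode_key(0x4501, 16, isans=0b1, ns_bits=8)
```

With the namespace set to zero the key is identical to the plain opcode, which is how an unmodified decode path would behave in this model.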
# Hypothetical Format

Note that this is a hypothetical format, still TBD, where particular attention
needs to be paid to the fact that there is an "immediate" version of CSRRW
(CSRRWI, with 5 bits of immediate) that could save a lot of space in binaries.

<code>
<pre>
   3                   2                   1
|1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
|---------------------------------------------------------------|
|                   reserved                    | foreignarch |1|
|    custom     |    reserved     |   official    |B| rvcpage |0|
</pre>
</code>

RV Mode

* When bit 0 is 0, "RV" mode is selected.
* In RV mode, bits 1 thru 5 provide up to 16 possible alternative meanings (namespaces) for 16-bit opcodes: "pages", if you will. The top bit indicates custom meanings; when it is 0, the page is for official usage.
* Bit 6 ("B") is LE/BE.
* Bits 7 thru 14 are for official usage.
* Bits 15 thru 23 are reserved.
* Bits 24 thru 31 are for custom usage.

16 bit page examples:

* 0b0000 STANDARD (2019) RVC
* 0b0001 RVCv2
* 0b0010 RV16
* 0b0011 RVCv3
* ...
* 0b1000 custom 16 bit opcode meanings 1
* 0b1001 custom 16 bit opcode meanings 2
* .....

Foreign Arch Mode

* When bit 0 is 1, "Foreign arch" mode is selected.
* Bits 1 thru 7 are a table of foreign archs.
* When the MSB is 1, this is for custom use.
* When the MSB is 0, bits 1 thru 6 are reserved for 64 possible official foreign archs.

Foreign archs could be (examples):

* 0b0000000 x86_32
* 0b0000001 x86_64
* 0b0000010 MIPS32
* 0b0000011 MIPS64
* ....
* 0b0010000 Java Bytecode
* 0b0010001 N.E.Other Bytecode
* ....
* 0b1000000 custom foreign arch 1
* 0b1000001 custom foreign arch 2
* ....

Note that "official" foreign archs have a binary value where the MSB is zero,
and custom foreign archs have a binary value where the MSB is 1.

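The field layout described above can be sketched as a small decoder. This is only an illustration of the hypothetical format: the function name `decode_isans` and the dictionary keys are invented for the example, the position of the "official" field (bits 7 thru 14) is inferred from the diagram, and the polarity of the "B" bit is deliberately left uninterpreted because the text does not define it.

```python
def decode_isans(value: int) -> dict:
    """Split a 32-bit ISANS value into the hypothetical fields described above."""
    if not 0 <= value < (1 << 32):
        raise ValueError("ISANS is modelled here as a 32-bit value")
    if value & 1:                          # bit 0 set: Foreign Arch mode
        arch = (value >> 1) & 0x7F         # bits 1..7: foreign arch table index
        return {
            "mode": "foreign",
            "arch": arch,
            "custom": bool(arch & 0x40),   # MSB of the 7-bit index
            "reserved": value >> 8,        # bits 8..31
        }
    page = (value >> 1) & 0x1F             # bits 1..5: 16-bit opcode page
    return {
        "mode": "rv",
        "rvcpage": page,
        "custom_page": bool(page & 0x10),  # top bit of the page field
        "b_bit": (value >> 6) & 1,         # bit 6 ("B", LE/BE)
        "official": (value >> 7) & 0xFF,   # bits 7..14 (inferred position)
        "reserved": (value >> 15) & 0x1FF, # bits 15..23
        "custom": (value >> 24) & 0xFF,    # bits 24..31
    }


x86_64 = decode_isans((0b0000001 << 1) | 1)        # official foreign arch
custom_arch = decode_isans((0b1000000 << 1) | 1)   # custom foreign arch 1
rvc_v2 = decode_isans(0b00001 << 1)                # RV mode, page 1
```

Because bit 0 selects which interpretation applies, a value can never describe a foreign arch and an RVC page at the same time.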
# Namespaces are permitted to swap to new state <a name="stateswap"></a>

In each privilege level, on a change of ISANS (whether through manual setting of ISANS or through trap entry or exit changing the ISANS CSR), an implementation is permitted to completely and arbitrarily switch not only the instruction set but also to a new bank of CSRs (or a subset of the same), and even to a new PC.

This is to occur immediately and atomically at the point at which the change in ISANS occurs.

The most obvious application of this is for Foreign Archs, which may have their own completely separate PC. Thus, foreign assembly code and RISCV assembly code need not be mixed in the same binary.

Further use-cases may be envisaged; however, great care needs to be taken not to cause massive complications for JIT emulation, as the RV ISANS is unary encoded (2^31 permutations).

In addition, the state information of *all* namespaces has to be saved and restored on a context-switch (unless the SP is also switched as part of the state!), which is quite severely burdensome and gets exceptionally complex.

Switching CSR, PC (and potentially SP) and other state on a NS change in the RISCV unary NS therefore needs to be done wisely and responsibly, i.e. minimised!

To be discussed. Context <https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/x-uFZDXiOxY/27QDW5KvBQAJ>

# Privileged Modes / Traps <a name="privtraps"></a>

An additional WLRL CSR per priv-level named "LAST-ISANS" is required, and
another called "TRAP-ISANS".
These mirror the ISANS CSR, and, on a trap, if the current ISANS in
that privilege level is not equal to TRAP-ISANS, its value is atomically
transferred into LAST-ISANS by the hardware, and ISANS in that trap
is set to TRAP-ISANS. Hardware is *only then* permitted to modify the PC to
begin execution of the trap.

On exit from the trap, hardware must check to see if LAST-ISANS is equal
to TRAP-ISANS. If it is not, LAST-ISANS is copied into the ISANS CSR,
LAST-ISANS is set to TRAP-ISANS, and *only then* is the hardware permitted
to modify the PC to begin execution where the trap left off.

Note 1: in the case of Supervisor Mode (context switches in particular),
saving and changing of LAST-ISANS (to and from the stack) must be done
atomically and under the protection of the SIE bit. Failure to do so
could result in corruption of LAST-ISANS when multiple traps occur in
the same privilege level.

Note 2: question - should the trap due to illegal (unsupported) values
written into LAST-ISANS occur when the *software* writes to LAST-ISANS,
or when the *trap* (on exit) writes into LAST-ISANS? This latter seems
fraught: a trap, on exit, causing another trap??

Per-privilege-level pseudocode (there exist UISANS, UTRAPISANS, ULASTISANS,
MISANS, MTRAPISANS, MLASTISANS and so on):

<code>
<pre>
// one set of these CSRs exists per privilege level
unsigned ISANS, LAST_ISANS, TRAP_ISANS;

void trap_entry(void)
{
    LAST_ISANS = ISANS;      // record the old NS
    ISANS = TRAP_ISANS;      // traps are executed in "trap" NS
}

void trap_exit(void)
{
    ISANS = LAST_ISANS;      // return to the interrupted NS
    LAST_ISANS = TRAP_ISANS;
}
</pre>
</code>

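The prose description of trap entry and exit includes equality checks that the short pseudocode omits. A minimal model of the prose version, for one privilege level, might look like the following. The class name `PrivLevelNS` and the assumption that LAST-ISANS resets to the TRAP-ISANS value (meaning "nothing saved") are not from the proposal; they are choices made to keep the sketch self-contained.

```python
class PrivLevelNS:
    """ISANS, TRAP-ISANS and LAST-ISANS for a single privilege level."""

    def __init__(self, isans: int, trap_isans: int):
        self.isans = isans
        self.trap_isans = trap_isans
        # Assumption: LAST-ISANS starts out equal to TRAP-ISANS.
        self.last_isans = trap_isans

    def trap_entry(self) -> None:
        if self.isans != self.trap_isans:
            self.last_isans = self.isans    # record the interrupted NS
            self.isans = self.trap_isans    # handler runs in the trap NS
        # Only after this point may hardware redirect the PC to the handler.

    def trap_exit(self) -> None:
        if self.last_isans != self.trap_isans:
            self.isans = self.last_isans        # restore the interrupted NS
            self.last_isans = self.trap_isans   # mark "nothing saved"
        # Only after this point may hardware resume at the interrupted PC.


# A hart running in a foreign namespace (bit 0 set) takes a trap whose
# handler is written for namespace 0, then returns.
ns = PrivLevelNS(isans=0b11, trap_isans=0)
ns.trap_entry()
ns_in_handler = ns.isans
ns.trap_exit()
```

In this model the interrupted code never has to switch back by itself: the restore on exit is done entirely by the entry and exit steps.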
# Why not have TRAP-ISANS as a vector table, matching mtvec? <a name="trap-isans-vec"></a>

Use case to be determined. Rather than being a single global per-priv-level value,
TRAP-ISANS would be a table of length exactly equal to the mtvec/utvec/stvec table,
with corresponding entries that specify the assembly-code namespace in which
the trap handler routine is written.

Open question: see <https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/IAhyOqEZoWA/BM0G3J2zBgAJ>

<code>
<pre>
// one set of these CSRs exists per privilege level
unsigned ISANS, LAST_ISANS;
extern unsigned TRAP_ISANS_VEC[]; // same length as the xtvec table

void trap_entry(unsigned x_cause)
{
    LAST_ISANS = ISANS;               // record the old NS
    ISANS = TRAP_ISANS_VEC[x_cause];  // traps are executed in "trap" NS
}

void trap_exit(unsigned x_cause)
{
    ISANS = LAST_ISANS;
    LAST_ISANS = TRAP_ISANS_VEC[x_cause];
}
</pre>
</code>

# Is this like MISA? <a name="misa"></a>

No.

* MISA's space is entirely taken up (and running out).
* There is no allocation (provision) for custom extensions.
* MISA switches on and off entire extensions: ISAMUX/NS may be used to switch multiple opcodes (present and future), to alternate meanings.
* MISA is WARL and is inaccessible from everything but M-Mode (not even readable).

MISA is therefore wholly unsuited to U-Mode usage; ISANS is specifically permitted to be called by userspace to switch (with no stalling) between namespaces, repeatedly and in quick succession.

# What happens if this scheme is not adopted? Why is it better than leaving things well alone? <a name="lassezfaire"></a>

At the first sign of an emergency non-backwards compatible and unavoidable
change to the *frozen* RISCV *official* Standards, the entire RISCV
community is fragmented and divided into two:

* Those vendors that are hardware compatible with the legacy standard.
* Those that are compatible with the new standard.

*These two communities would be mutually incompatible*. If
a second emergency occurs, RISCV becomes even less tenable.

Hardware that wished to be "compatible" with either flavour would require
JIT or offline static binary recompilation. No vendor would willingly
accept this as a condition of the standards divergence in the first place,
locking up decision making to the detriment of RISCV as a whole.

By providing a "safety valve" in the form of a hidden namespace, at least
newer hardware has the option to implement both (or more) variations,
*and still apply for Certification*.

However, to also allow "legacy" hardware to at least be JIT soft
compatible, some very strict rules *must* be adhered to, rules that appear at
first sight not to make any sense.

It's complicated, in other words!

# Surely it's okay to just tell people to use 48-bit encodings? <a name="use48bit"></a>

Short answer: it doesn't help resolve conflicts, and costs hardware and
redesigns to do so. Softcores in cost-sensitive embedded applications may
not even be able to fit the required 48 bit instruction decode engine
into a (small, ICE40) FPGA. 48-bit instruction decoding is much more complex
than straight 32-bit decoding, requiring a queue.

Second answer: conflicts can still occur in the (unregulated, custom) 48-bit
space, which *could* be resolved by ISAMUX/ISANS as applied to the *48* bit
space in exactly the same way. The same applies to the 64-bit space.

# Why not leave this to individual custom vendors to solve on a case by case basis? <a name="case-by-case"></a>

The suggestion was raised that a custom extension vendor could create
their own CSR that selects between conflicting namespaces that resolve
the meaning of the exact same opcode. This would be done by any and all
vendors, as they see fit, with little to no collaboration or coordination
towards standardisation in any form.

The problems with this approach are numerous when seen in the
worldwide context that the UNIX Platform, in particular, has to face
(where the embedded platform does not).

First: the cost of a lack of coordination, in the proliferation of arbitrary
solutions, has to be borne primarily by gcc, binutils, LLVM and other toolchains.

Secondly: CSR space is precious. With each vendor likely needing only one
or two bits to express the namespace collision avoidance, if they make
even a token effort to use worldwide unique CSRs (an effort that would
benefit compiler writers), the CSR register space is quickly exhausted.

Thirdly: JIT Emulation of such an unregulated space becomes just as
much hell as it is for compiler writers. In addition, if two vendors
use conflicting CSR addresses, the only sane way to tell the emulator
what to do is to give the emulator a runtime commandline argument.

Fourthly: with each vendor coming up with their own way of handling
conflicts, not only are the chances of mistakes higher, it is against the
very principles of collaboration and cooperation that save vendors money
on development and ongoing maintenance. Each custom vendor will have
to maintain their own separate hard fork of the toolchain and software,
which is well known to result in security vulnerabilities.

By coordinating and managing the allocation of namespace bits (unary
or binary) the above issues are solved. CSR space is no longer wasted,
compiler and JIT software writers have an easier time, clashes are
avoided, and RISCV is stabilised and has a trustable long term future.

# Why ISAMUX / ISANS has to be WLRL and mandatory trap on illegal writes <a name="wlrlmandatorytrap"></a>

The namespaces, set by bits in the CSR, are functionally directly
equivalent to C++ namespaces, even down to the use of braces.

WARL, by allowing implementors to choose the value, prevents and prohibits
the critical and necessary raising of an exception that would begin the
JIT process in the case of ongoing standards evolution.

Without this opportunity, an implementation has no reliable guaranteed way
of knowing when to drop into full JIT mode, which is the only guaranteed way
to distinguish any given conflicting opcode. It is as if the C++ standard
were to give implementations a similar option to completely ignore the
"using namespace" prefix!

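The difference between the two write behaviours can be made concrete with a toy model. Everything named here (`IllegalCSRWrite`, `write_warl`, `write_wlrl_trapping`, the `SUPPORTED` set) is hypothetical, and the WARL function shows just one legal WARL behaviour (keeping the current value); a real WARL field may substitute any legal value.

```python
class IllegalCSRWrite(Exception):
    """Stands in for the trap raised on an unsupported CSR write."""


def write_warl(supported: set, current: int, value: int) -> int:
    """WARL: the write always 'succeeds'; unsupported values are silently dropped."""
    return value if value in supported else current


def write_wlrl_trapping(supported: set, current: int, value: int) -> int:
    """WLRL with a mandatory trap: unsupported values raise an exception."""
    if value not in supported:
        raise IllegalCSRWrite(hex(value))
    return value


SUPPORTED = {0b0}   # hypothetical core that only implements the default NS

# WARL: software asked for namespace 0b10 and has no immediate way to tell
# that the request was ignored.
warl_result = write_warl(SUPPORTED, current=0b0, value=0b10)

# WLRL + trap: the same request is detected at the point of the write, which
# is the hook a JIT emulator needs.
try:
    write_wlrl_trapping(SUPPORTED, current=0b0, value=0b10)
    trapped = False
except IllegalCSRWrite:
    trapped = True
```

Under the WARL variant, the instructions following the write would be decoded in the old namespace with no exception raised, which is the failure mode described above.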
--

Ok so I trust it's now clear why WLRL (thanks Allen) is needed.

When Dan raised the WARL concern initially, a situation was masked by
the conflict that, if it had gone unnoticed, would jeopardise ISAMUX/ISANS
entirely. Actually, two separate errors. So thank you for raising the
question.

The situation arises when foreign archs are to be given their own NS
bit. MIPS is allocated bit 8, x86 bit 9, whilst LE/BE is given bit 0,
RVCv2 bit 1 and so on. All of this potential rather than actual, clearly.

Imagine then that software tries to write and set not just bit 8 and
bit 9, it also tries to set bit 0 and 1 as well.

This *IS* on the face of it a legitimate reason to make ISAMUX/ISANS WARL.

However it masks a fundamental flaw that has to be addressed, which
brings us back much closer to the original design of 18 months ago,
and it's highlighted thus:

x86 and simultaneous RVCv2 modes are total nonsense in the first place!

The solution instead is to have a NS bit (bit0) that SPECIFICALLY
determines if the arch is RV or not. If 0, the rest of the ISAMUX/ISANS
is very specifically RV *only*, and if 1, the ISAMUX/ISANS is a *binary*
table of foreign architectures and foreign architectures only.

Exactly how many bits are used for the foreign arch table is to
be determined. 7 bits, one of which is reserved for custom usage,
leaving a whopping 64 possible "official" foreign instruction sets to
be hardware-supported/JIT-emulated, seems to be sufficiently gratuitous,
to me.

One of those could even be Java Bytecode!

Now, it could *hypothetically* be argued that the permutation of setting
LE/BE and MIPS for example is desirable. A simple analysis shows this
not to be the case: once in the MIPS foreign NS, it is the MIPS hardware
implementation that should have its own way of setting and managing its
LE/BE mode, because to do otherwise drastically interferes with MIPS
binary compatibility.

Thus, it is officially Not Our Problem: only flipping into one foreign
arch at a time makes sense, thus this has to be reflected in the
ISAMUX/ISANS CSR itself, completely side-stepping the (apparent) need
to make the NS CSR WARL (which would not work anyway, as previously
mentioned).

So, thank you, again, Dan, for raising this. It would have completely
jeopardised ISAMUX/NS if not spotted.

The second issue is: how does any hardware system, whether it supports
ISANS or not, cope when, in a transitive fashion, it has to support *more*
future namespaces through JIT emulation, if this is not planned properly
in advance?

Let us take the simple case first: a current 2019 RISCV fully compliant
RV64GC UNIX capable system (with mandatory traps on all unsupported CSRs).

Fast forward 20 years, there are now 5 ISAMUX/NS unary bits, and 3
foreign arch binary table entries.

Such a system is perfectly capable of software JIT emulating ALL of these
options because the write to the (illegal, for that system) ISAMUX/NS
CSR generates the trap that is needed for that system to begin JIT mode.

(This again emphasises exactly why the trap is mandatory).

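The "simple case" can be sketched as a hart whose set of hardware-implemented namespaces is empty, so that every ISANS write traps and software takes over. The class `Hart`, its attribute names and the `"hardware"` / `"jit"` return strings are all invented for the illustration; a real trap handler would of course do far more than set a flag.

```python
class Hart:
    """Toy model of a core with a fixed set of hardware-supported namespaces."""

    def __init__(self, hw_namespaces=()):
        # Empty for a 2019-era core that has no ISANS CSR at all.
        self.hw_namespaces = frozenset(hw_namespaces)
        self.isans = 0
        self.jit_active = False

    def csr_write_isans(self, value: int) -> str:
        if value in self.hw_namespaces:
            self.isans = value
            self.jit_active = False
            return "hardware"
        # Mandatory trap on an unsupported write: software takes over.
        return self._trap_handler(value)

    def _trap_handler(self, value: int) -> str:
        self.isans = value       # namespace now tracked by the emulator
        self.jit_active = True   # subsequent code is JIT-emulated
        return "jit"


legacy = Hart()                          # 2019 system: nothing in hardware
newer = Hart(hw_namespaces={0, 0b10})    # hypothetical core with one extra NS

legacy_result = legacy.csr_write_isans(0b10)
newer_result = newer.csr_write_isans(0b10)
future_result = newer.csr_write_isans(0b100)  # NS added after the core shipped
```

The last line is the transitive case: the newer core handles what it knows in hardware and still falls back to emulation for namespaces defined after it was built, but only because the unsupported write is guaranteed to trap.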
Now let us take the case of a hypothetical system from say 2021 that
implements RVCv2 at the hardware level.

Fast forward 20 years: if the CSR were made WARL, that system would be
absolutely screwed. The implementor would be under the false impression
that ignoring setting of "illegal" bits was acceptable, making the
transition to JIT mode flat-out impossible to detect.

When this is considered transitively, considering all future additions to
the NS, and all permutations, it can be logically deduced that there is
a need to reserve a *full* set of bits in the ISAMUX/NS CSR *in advance*.

i.e. that *right now*, in the year 2019, the entire ISAMUX/NS CSR cannot
be added to piecemeal: the full 32 (or 64) bits *have* to be reserved,
and reserved bits set to zero.

Furthermore, if any software attempts to write to those reserved bits,
it *must* be treated just as if those bits were distinct and nonexistent
CSRs, and a trap raised.

It makes more sense to consider each NS as having its own completely
separate CSR. If that CSR does not exist, it should be obvious
that, as an unsupported CSR, a trap should be raised (and JIT emulation
activated).

However given that only the one bit is needed (in RV NS Mode, not
Foreign NS Mode), it would be terribly wasteful of the CSRs to do this,
despite it being technically correct and making it much easier to understand
why trap raising is so essential (mandatory).

This again should emphasise how to mentally get one's head round this
mind-bendingly complex problem space: think of each NS bit as its own
totally separate CSR that every implementor is free and clear to implement
(or leave to JIT Emulation) as they see fit.

Only then does the mandatory need to trap on write really start to hit
home, as does the need to preallocate a full set of reserved zero values
in the RV ISAMUX/NS.

Lastly, I *think* it's ok to only reserve say 32 bits, and, in 50 years
time if that genuinely is not enough, start the process all over again
with a new CSR. ISAMUX2/NS2.

Subdivision of the RV NS (support for RVCv3/4/5/RV16 without wasting
precious CSR bits) is best left for discussion another time; the above is
a heck of a lot to absorb already.

# Alternative RVC 16 Bit Opcode meanings

Ok, this is an appropriate place to raise an idea for how to cover RVC and
future variants, including RV16.

Just as with foreign archs, and as you quite rightly highlight above, it
makes absolutely no sense to try to select RVCv1, v2, v3 and so on
all simultaneously. A unary bit vector for RVC modes, changing the 16
BIT opcode space meaning, is wasteful and again has us believe that WARL
is the "solution".

The correct thing to do is, again, just like with foreign archs, to
treat RVCs as a *binary* namespace selector. Bits 1 thru 3 would give
8 possible completely new alternative meanings, just like how the Z80
and the 286 and 386 used to do bank switching.

All zeros is clearly reserved for the present RVC. 0b001 for RVCv2. 0b010
for RV16 (look it up) and there should definitely be room reserved here
for custom reencodings of the 16 bit opcode space.

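The "wasteful" claim is easy to check by counting. The sketch below assumes, as the argument above does, that RVC variants are mutually exclusive, so a unary (one bit per variant) field is only meaningful when at most one bit is set. The function names are made up for the example.

```python
def writable_values(n_bits: int) -> int:
    """Number of distinct values software could write to an n-bit field."""
    return 1 << n_bits


def meaningful_unary_values(n_bits: int) -> int:
    """Unary encoding: all zeros (present RVC) or exactly one variant bit set."""
    return sum(1 for v in range(1 << n_bits) if bin(v).count("1") <= 1)


def meaningful_binary_values(n_bits: int) -> int:
    """Binary encoding: every value selects exactly one page."""
    return 1 << n_bits


unary_3 = meaningful_unary_values(3)    # values 0b000, 0b001, 0b010, 0b100
binary_3 = meaningful_binary_values(3)  # values 0b000 .. 0b111
nonsense_3 = writable_values(3) - unary_3
```

For the same three bits, the unary scheme leaves half of the writable values describing contradictory combinations, while the binary selector gives every value a single, unambiguous meaning.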
# Why WARL will not work and why WLRL is required

WARL requires a follow-up read of the CSR to ascertain what heuristic
the hardware *might* have applied, and if that procedure is followed in
this proposal, performance even on hardware would be severely compromised.

In addition, when switching to foreign architectures, the switch has to
be done atomically and be guaranteed to occur.

In the case of JIT emulation, the WARL "detection" code will be in an
assembly language that is alien to hardware.

Support for both assembly languages immediately after the CSR write
is clearly impossible; this leaves no other option but to have the CSR
be WLRL (on all platforms) and for traps to be mandatory (on the UNIX
Platform).

# Is it strictly necessary for foreign archs to switch back? <a name="foreignswitch"></a>

No, because LAST-ISANS handles the setting and unsetting of the ISANS CSR
in a completely transparent fashion as far as the foreign arch is concerned.

Thus, in e.g. Hypervisor Mode, the foreign guest arch has no knowledge
or need to know that the hypervisor is flipping back to RV at the time of
a trap.

Note however that this is **not** the same as the foreign arch executing
*foreign* traps! Foreign architecture trap and interrupt handling mechanisms
are **out of scope** of this document and MUST be handled by the foreign
architecture implementation in a completely transparent fashion that in
no way interacts or interferes with this proposal.
