# Letter regarding ISAMUX / NS

Hardware-level dynamic ISA Muxing (also known as ISA Namespaces and ISA
escape-sequencing) is commonly used in instruction sets, but typically in
an arbitrary and ad-hoc fashion, with each instance added on demand.
Examples include:

* Setting an SPR to switch the meaning of certain opcodes between
  Little-Endian and Big-Endian behaviour (present in POWER and SPARC)
* Setting an SPR to provide "backwards-compatibility" for features from
  older versions of an ISA (such as when changing to newly ratified
  versions of the IEEE754 standard)

(We term these "ISA Muxing" because, ultimately, they add extra bits to
(or change existing bits in) the instruction decode phase, which
involves "MUXes" to switch them on and off.)

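As a minimal sketch of what "muxing" means here, the Python model below treats a single mode bit as an extra input to the decoder, so that the same load is interpreted in two different ways. The name `LE_MODE`, its bit position and the function `decode_load_word` are assumptions made purely for illustration; they are not taken from POWER or SPARC.

```python
# Hypothetical model: the mode-bit position and all names are illustrative only.
LE_MODE = 0b1  # assumed position of an endianness bit in a mode SPR


def decode_load_word(mode_spr: int, raw: bytes) -> int:
    """Interpret the same 4 bytes differently depending on the mode bit.

    The mode bit acts as the select line of a MUX between two behaviours
    of one and the same "load word" opcode.
    """
    byteorder = "little" if mode_spr & LE_MODE else "big"
    return int.from_bytes(raw, byteorder)


word = bytes([0x12, 0x34, 0x56, 0x78])
print(hex(decode_load_word(0, word)))        # prints 0x12345678
print(hex(decode_load_word(LE_MODE, word)))  # prints 0x78563412
```
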
The Libre-SOC team, developing a hybrid CPU-VPU-GPU, needs to add
significantly and strategically to the POWER ISA to support, for example,
Khronos Vulkan IEEE754 Conformance, whilst *at the same time being able
to run full POWER9-compliant instructions*.

There is absolutely no way that we are going to duplicate the
entire FP opcode set as a custom extension to POWER, just to add a
literally-identical suite of FP opcodes that are compliant with the
Khronos Conformance Suites: this would be a significant and irresponsible
use of opcode space.

In addition, as this processor is likely to be used for parallel
compute purposes in high-efficiency environments, we also need to add
FP16 support. Again: there is no way that we are going to add *triple*
duplicated opcodes to POWER, given that the opcodes needed are absolutely
identical to those that already exist, apart from the FP bitwidth (32
/ 64).

There are several other strategically critical uses to which we would
like to put such a scheme (related to power consumption and to reducing
throughput bottlenecks in heavy-computation GPU and VPU workloads).

In addition, the scheme has several key advantages over other ISA
"extending" ideas (such as extending the general POWER ISA opcode space
to 64 bit). Unlike 64 bit opcodes, its judicious and careful use
does not require large increases in I-Cache size because all opcodes,
ultimately, remain 32-bit. The scheme also allows future *official*
POWER extensions to the ISA - managed by the OpenPOWER Foundation -
to be strategically managed in a controlled, long-term way that does
not damage the reputation and stability of OpenPOWER.

Therefore we advocate being able to set "ISAMUX/NS" mode-switching bits
that, like the *existing* LE/BE mode-switching bits, change the behaviour
of *existing* opcodes to an alternative "meaning", followed by another
mode-switch that returns them to their original meaning. (Note: to reduce
binary code-size, alternative schemes include setting a countdown which,
when it expires, automatically disables the requested mode-switch.)

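The mode-switch and countdown behaviour described above can be sketched as a small Python model. Everything in it is an assumption for illustration: the mode values `MODE_DEFAULT` and `MODE_FP16`, the `IsaMux` class, the string opcode names, and the choice that the countdown counts instructions decoded *after* the switch. No actual encoding is proposed in this letter.

```python
from dataclasses import dataclass

# Hypothetical mode values; no real encoding is implied.
MODE_DEFAULT = 0  # opcodes keep their standard meaning
MODE_FP16 = 1     # FP opcodes are reinterpreted as FP16 operations


@dataclass
class IsaMux:
    mode: int = MODE_DEFAULT
    countdown: int = 0  # instructions left before automatic revert (0 = none)

    def switch(self, mode: int, countdown: int = 0) -> None:
        """Request a mode-switch; countdown=0 means 'until switched back'."""
        self.mode = mode
        self.countdown = countdown

    def decode(self, opcode: str) -> str:
        """Return the meaning of an existing opcode under the current mode."""
        if self.mode == MODE_FP16 and opcode.startswith("f"):
            meaning = opcode + ".fp16"
        else:
            meaning = opcode
        # Assumed policy: each decoded instruction consumes one countdown
        # tick, and the mode reverts when the countdown reaches zero.
        if self.countdown > 0:
            self.countdown -= 1
            if self.countdown == 0:
                self.mode = MODE_DEFAULT
        return meaning


mux = IsaMux()
mux.switch(MODE_FP16, countdown=2)
trace = [mux.decode(op) for op in ["fadd", "fmul", "fadd"]]
print(trace)  # prints ['fadd.fp16', 'fmul.fp16', 'fadd']
```

With the countdown, the third `fadd` returns to its original meaning without a second mode-switch instruction, which is the code-size saving mentioned in the note.
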
Note also that to ensure that kernels and hypervisors are not impacted
by userspace ISAMUX/NS mode-switching, it is critical that Supervisor
and Hypervisor modes have their own completely separate ISAMUX/NS SPRs
(imagine a userspace application setting the LE/BE bit on a global basis,
or setting a global IEEE754 FP Standards compatibility flag).

Further, it is critical that Supervisor / Hypervisor modes have access
to and control over userspace ISAMUX/NS SPRs (without themselves being
affected by the setting *of* userspace ISAMUX/NS SPRs), in order to be
able to correctly context-switch userspace applications to their correct
(former) running state.

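The two requirements above (separate SPRs per privilege level, and privileged control over the userspace SPR for context-switching) can be sketched as follows. The `Core` class, the level names, the `RANK` ordering and the `context_switch` helper are all hypothetical; in particular, the rule that a level may write the SPRs of its own and of lower levels is our assumption for this model, not something defined by POWER.

```python
from dataclasses import dataclass, field

USER, SUPERVISOR, HYPERVISOR = "user", "supervisor", "hypervisor"
RANK = {USER: 0, SUPERVISOR: 1, HYPERVISOR: 2}  # assumed privilege ordering


@dataclass
class Core:
    level: str = USER
    # One independent ISAMUX/NS value per privilege level.
    isamux: dict = field(
        default_factory=lambda: {USER: 0, SUPERVISOR: 0, HYPERVISOR: 0}
    )

    def active_mode(self) -> int:
        """Only the SPR of the *current* privilege level affects decoding."""
        return self.isamux[self.level]

    def write_isamux(self, target: str, value: int) -> None:
        """Write the SPR of `target`; allowed for own and lower levels only."""
        if RANK[self.level] < RANK[target]:
            raise PermissionError(f"{self.level} cannot write {target} SPR")
        self.isamux[target] = value


def context_switch(core: Core, saved: dict, old_task: str, new_task: str) -> None:
    """Privileged save/restore of the userspace ISAMUX/NS SPR."""
    if core.level == USER:
        raise PermissionError("context switch requires a privileged level")
    saved[old_task] = core.isamux[USER]
    core.write_isamux(USER, saved.get(new_task, 0))


core = Core()
core.write_isamux(USER, 0b1)  # task A changes its own mode in userspace
core.level = SUPERVISOR       # trap into the kernel
print(core.active_mode())     # prints 0: the kernel is unaffected
saved = {}
context_switch(core, saved, "A", "B")
print(core.isamux[USER])      # prints 0: task B starts in the default mode
context_switch(core, saved, "B", "A")
print(core.isamux[USER])      # prints 1: task A's former state is restored
```
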
Given the number of mode-switch bits that we anticipate using, we advocate
that such a scheme be formalised, and that the OpenPOWER Foundation be
the "atomic arbiter" (similar to IANA and JEDEC) in the formal allocation
of mode-switch bits to OpenPOWER implementors.

We envisage that some of these bits will be unary, some will be binary,
some will be allocated for exclusive use by the OpenPOWER Foundation,
some allocated to OpenPOWER Members (by the OpenPOWER Foundation),
and some reserved for "custom and experimentation usage".

(For this latter category - custom and experimentation usage - it is to
be explicitly documented that upstream compiler and toolchain support
will never, under any circumstances, be accepted by the OpenPOWER
Foundation, and that this be enforced through the EULA and through
Trademark law.)

However, as we are quite new to POWER 3.0B (1300+ page PDF), we do
appreciate that such a formal scheme may already be present in POWER9
3.0B, and that we have simply overlooked it.