(Draft Status)

# Letter regarding ISAMUX / NS

This is a quick overview of the changes that we are proposing to the PowerPC
instruction set.

## Overview

The PowerPC Instruction Set Architecture (ISA) is an abstract model of a
computer. This is what programmers use when they write programs for the machine,
even if indirectly via a compiler for a high-level language. We must be
conservative in how we add to the ISA, so as to:

* not break existing programs

* be mindful as to how others may wish to add to the ISA in the future

This document describes our strategy.


## ISA modes and escape sequences

New chips usually need to be able to run older (legacy) software that is
incompatible with the latest and greatest ISA. For example, a 64-bit chip must
be able to run older 16-bit and 32-bit software.

To enable backwards compatibility, the CPU is set into a 'legacy' mode. This
is done with an ISA Mode switch, also known as ISA Muxing or ISA Namespaces.

The operating system is able to quickly change between 'modern' ISA mode and
various legacy modes.

Another technique is an ISA escape-sequence. This is a type of mode that is
only operational for a short time, unlike a 32-bit or 64-bit mode, which would
apply for the entire run of a program.


## What are we adding to the ISA

When high-quality graphical displays were developed, the CPUs of the time were
shown to be unable to drive them fast enough. The solution was the graphics
card: a specialised computer that is good at rendering pixels, often by doing
the same thing in different parts of the screen at the same time (in
parallel). These specialised computers are called Graphics Processing Units
(GPUs).

Some GPUs perform thousands of operations in parallel. This has led to GPUs
being used to solve non-graphical problems where high parallelism is useful.

**break**

# Letter regarding ISAMUX / NS

Hardware-level dynamic ISA Muxing (also known as ISA Namespaces and ISA
escape-sequencing) is commonly used in instruction sets, in an arbitrary
and ad-hoc fashion, often added on an on-demand basis. Examples include:

* Setting an SPR to switch the meaning of certain opcodes for Little-Endian /
  Big-Endian behaviour (present in POWER and SPARC)
* Setting an SPR to provide "backwards-compatibility" for features from
  older versions of an ISA (such as changing to new ratified versions of
  the IEEE754 standard)

(We term these "ISA Muxing" because, ultimately, they add extra bits to
(or change existing bits in) the actual instruction decoder phase,
which involves "MUXes" to switch them on and off.)

The Libre-SOC team, developing a hybrid CPU-VPU-GPU, needs to add
significantly and strategically to the POWER ISA to support, for example,
Khronos Vulkan IEEE754 Conformance, whilst *at the same time being able
to run full POWER9 compliant instructions*.

There is absolutely no way that we are going to duplicate the
entire FP opcode set as a custom extension to POWER, just to add a
literally-identical suite of FP opcodes that are compliant with the
Khronos Conformance Suites: this would be a significant and irresponsible
use of opcode space.

In addition, as this processor is likely to be used for parallel
compute purposes in high-efficiency environments, we also need to add
FP16 support. Again: there is no way that we are going to add *triple*
duplicated opcodes to POWER, given that the opcodes needed are absolutely
identical to those that already exist, apart from the FP bitwidth (32
/ 64).

There are several other strategically critical uses to which we would
like to put such a scheme (related to power consumption and reducing
throughput bottlenecks needed for heavy-computation workloads in GPU
and VPU scenarios).

In addition, the scheme has several other key advantages over other ISA
"extending" ideas (such as extending the general ISA POWER space to
64 bit) in that, unlike 64 bit opcodes, its judicious and careful use
does not require large increases in I-Cache size because all opcodes,
ultimately, remain 32-bit. The scheme also allows future *official*
POWER extensions to the ISA - managed by the OpenPOWER Foundation -
to be strategically managed in a controlled, long-term way that does not
damage the reputation and stability of OpenPOWER.

Therefore we advocate being able to set "ISAMUX/NS" mode-switching bits
that, like the *existing* LE/BE mode-switching bits, change the behaviour
of *existing* opcodes to an alternative "meaning", followed by another
mode-switch that returns them to their original meaning. Note: to reduce
binary code-size, alternative schemes include setting a countdown which,
when it expires, automatically disables the requested mode-switch.

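As a purely illustrative sketch of the mode-switch and countdown idea (not a proposed encoding), the toy model below re-decodes one existing opcode under a mode bit and lets that mode expire by itself. Every name in it (`IsaMux`, `FP16_MODE`, `decode`, the `fadd.fp16` / `fadd.fp64` labels) is a hypothetical stand-in, and counting the countdown in executed instructions is an assumption.

```python
from dataclasses import dataclass

# Hypothetical mode values; the letter does not define any encoding.
FP_DEFAULT = 0  # existing behaviour (modelled here as FP64)
FP16_MODE = 1   # same opcode reinterpreted at a narrower FP width


@dataclass
class IsaMux:
    """Toy model of one ISAMUX/NS register with an optional countdown."""

    mode: int = FP_DEFAULT
    countdown: int = 0  # 0 means "no automatic expiry"

    def switch(self, mode: int, countdown: int = 0) -> None:
        """Request a mode, optionally limited to `countdown` instructions."""
        self.mode = mode
        self.countdown = countdown

    def tick(self) -> None:
        """Call once per executed instruction; drop the mode when it expires."""
        if self.countdown > 0:
            self.countdown -= 1
            if self.countdown == 0:
                self.mode = FP_DEFAULT


def decode(opcode: str, mux: IsaMux) -> str:
    """Return the meaning of an opcode under the current mode bits."""
    if opcode == "fadd":
        return "fadd.fp16" if mux.mode == FP16_MODE else "fadd.fp64"
    return opcode  # opcodes the mode does not touch keep their meaning


def run(program: list, mux: IsaMux) -> list:
    """Decode each opcode in order, ticking the countdown after each one."""
    trace = []
    for opcode in program:
        trace.append(decode(opcode, mux))
        mux.tick()
    return trace
```

With `mux.switch(FP16_MODE, countdown=2)`, running three `fadd` opcodes decodes the first two as `fadd.fp16` and the third as `fadd.fp64`, without a second explicit mode-switch in the instruction stream.
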
Note also that to ensure that kernels and hypervisors are not impacted
by userspace ISAMUX/NS mode-switching, it is critical that Supervisor
and Hypervisor modes have their own completely separate ISAMUX/NS SPRs
(imagine a userspace application setting the LE/BE bit on a global basis,
or setting a global IEEE754 FP Standards compatibility flag).

Further, it is critical that Supervisor / Hypervisor modes have access to
and control over userspace ISAMUX/NS SPRs (without themselves being
affected by the setting *of* userspace ISAMUX/NS SPRs), in order to be able
to correctly context-switch userspace applications to their correct
(former) running state.

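The separation described above can be sketched as follows. This is a toy model under stated assumptions, not a register specification: the `Core` class, the one-SPR-per-privilege-level layout, the rule that a level may write only its own SPR or that of a less privileged level, and the `context_switch` helper are all hypothetical.

```python
from dataclasses import dataclass, field

USER, SUPERVISOR, HYPERVISOR = "user", "supervisor", "hypervisor"
RANK = [USER, SUPERVISOR, HYPERVISOR]  # least to most privileged


@dataclass
class Core:
    """Toy core with one hypothetical ISAMUX/NS SPR per privilege level."""

    privilege: str = USER
    isamux: dict = field(default_factory=lambda: {level: 0 for level in RANK})

    def active_mode(self) -> int:
        # Decoding only ever consults the SPR of the current privilege level,
        # so userspace settings cannot change how the kernel's code decodes.
        return self.isamux[self.privilege]

    def write_isamux(self, target: str, value: int) -> None:
        # Assumed rule: write your own SPR or that of a lower privilege level.
        if RANK.index(target) > RANK.index(self.privilege):
            raise PermissionError(f"{self.privilege} cannot write {target} SPR")
        self.isamux[target] = value


def context_switch(core: Core, saved: dict, old_task: str, new_task: str) -> None:
    """Supervisor-side save and restore of the userspace SPR."""
    if core.privilege != SUPERVISOR:
        raise PermissionError("context switch requires supervisor mode")
    saved[old_task] = core.isamux[USER]              # remember outgoing state
    core.write_isamux(USER, saved.get(new_task, 0))  # restore incoming state
```

In this model a userspace task that sets its own mode bits leaves `active_mode()` unchanged once the core enters supervisor mode, and the supervisor can still read and rewrite the userspace SPR to restore each task's former state.
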
Given the number of mode-switch bits that we anticipate using, we advocate
that such a scheme be formalised, and that the OpenPOWER Foundation be
the "atomic arbiter", similar to IANA and JEDEC, in the formal allocation
of mode-switch bits to OpenPOWER implementors.

We envisage that some of these bits will be unary, some will be binary,
some will be allocated for exclusive use by the OpenPOWER Foundation,
some allocated to OpenPOWER Members (by the OpenPOWER Foundation),
and some reserved for "custom and experimentation usage".

(For this latter category - custom experimentation - it is to be explicitly
documented that upstream compiler and toolchain support will never, under
any circumstances, be accepted by the OpenPOWER Foundation, and this is to
be enforced through the EULA and through Trademark law.)


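To make the allocation idea concrete, here is a minimal sketch of a bit-ownership table. The 32-bit register width, the three bit ranges, and the names `Owner`, `ALLOCATION`, `owner_of` and `upstreamable` are invented for illustration; the letter proposes no actual layout.

```python
from enum import Enum


class Owner(Enum):
    FOUNDATION = "OpenPOWER Foundation"
    MEMBER = "OpenPOWER Member"
    CUSTOM = "custom / experimental"


# Hypothetical split of a 32-bit mode-switch register; widths are assumptions.
ALLOCATION = [
    (range(0, 16), Owner.FOUNDATION),
    (range(16, 28), Owner.MEMBER),
    (range(28, 32), Owner.CUSTOM),
]


def owner_of(bit: int) -> Owner:
    """Return which party a given mode-switch bit is allocated to."""
    for bits, owner in ALLOCATION:
        if bit in bits:
            return owner
    raise ValueError(f"bit {bit} is outside the 32-bit register")


def upstreamable(mask: int) -> bool:
    """True if none of the bits set in `mask` lie in the custom range."""
    return all(
        owner_of(bit) is not Owner.CUSTOM
        for bit in range(32)
        if (mask >> bit) & 1
    )
```

A toolchain could use a check like `upstreamable` to refuse upstream support for any feature that depends on custom-range bits, which is the policy the paragraph above asks to have documented.
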
However, as we are quite new to POWER 3.0B (1300+ page PDF), we do
appreciate that such a formal scheme may already be present in POWER9
3.0B, which we have simply overlooked.