(Draft Status)

# Letter regarding ISAMUX / NS

## Why has Libre-SOC chosen PowerPC?

For a hybrid CPU-VPU-GPU, intended for mass-volume adoption in tablets,
netbooks, chromebooks and industrial embedded (SBC) systems, our choice
was between Nyuzi, MIAOW, RISC-V, PowerPC, MIPS and OpenRISC.

Of all the options, the PowerPC architecture is the most complete and by far
the most mature. It also has deeper adoption by Linux distributions.

Following IBM's release of the Power Architecture instruction set to the
Linux Foundation in August 2019, the barrier to using it is no higher than
that of using RISC-V. We are encouraged that the OpenPOWER Foundation is
supportive of what we are doing and is helping, e.g. by putting us in touch
with people who can assist us.

## Summary

* We propose the standardisation of the way that the PowerPC Instruction
  Set Architecture (PPC ISA) is extended, enabling many different flavours
  within a well supported family to co-exist, long-term, without conflict,
  right across the board.
* This is about more than just our project. Our proposals will facilitate
  the use of PPC in novel or niche applications without breaking the PPC
  ISA into incompatible islands.
* PPC will gain a competitive market advantage by removing the need
  for separate VPU or GPU functions in RTL or ASICs, thus enabling lower
  cost systems. Libre-SOC's project is to extend the PPC to integrate
  the GPU and VPU functionality directly as part of the PPC ISA (example:
  Broadcom VideoCore IV being based around extensions to an ARC core).
* Libre-SOC's extensions will be easily adopted, as the standard GNU/Linux
  distributions will very deliberately run unmodified on our ISA,
  including full compatibility with illegal instruction trap requirements.

## One CPU multiple ISAs

This is a quick overview of how we would like to add the changes
that we are proposing to the PowerPC instruction set (ISA). It is based on
an Open Standardisation of the way that "mode switches", of the kind
already found in the POWER instruction set, are added. Existing examples are:

* FPSCR's "NI" bit, setting non-IEEE754 FP mode
* MSR's "LE" bit (and associated HILE bit), setting little-endian mode
* MSR's "SF" bit, setting either 32-bit or 64-bit mode
* PCR's "compatibility" bits 60-62, setting V2.05, V2.06 or V2.07 mode

[It is well-noted that unless the relevant "mode switch" bit is set, the
alternative (additional) instructions (and functionality) it enables are
completely inaccessible, and will result in "illegal instruction" traps
being thrown. This is recognised as being critically important.]
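
The gating described above can be illustrated with a toy decoder. This is only a minimal sketch, not a model of any real POWER implementation: the mode bit `MODE_GPU_EXT`, the `REQUIRED_MODE` table and the `fatan2` mnemonic are all names invented here for illustration.

```python
class IllegalInstruction(Exception):
    """Stands in for the hardware "illegal instruction" trap."""


# Hypothetical mode bit, invented for this sketch (not a real MSR/PCR bit).
MODE_GPU_EXT = 1 << 0

# Mnemonic -> mode bits that must all be set before it may decode.
# Both the table and the "fatan2" mnemonic are illustrative assumptions.
REQUIRED_MODE = {
    "add": 0,                # base ISA: available in every mode
    "fatan2": MODE_GPU_EXT,  # extension: hidden unless the mode bit is set
}


def decode(mnemonic, mode):
    """Return the mnemonic if it is legal under `mode`, otherwise trap."""
    required = REQUIRED_MODE.get(mnemonic)
    if required is None or (mode & required) != required:
        raise IllegalInstruction(mnemonic)
    return mnemonic
```

With the mode bit clear, `decode("fatan2", 0)` raises `IllegalInstruction`, exactly as an unknown opcode would, while `decode("add", 0)` succeeds. A program that never sets the bit therefore cannot observe the extension at all.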

These bits effectively create multiple, incompatible, run-time switchable ISAs
within one CPU. They are selectable for the needs of the individual
program (or OS) being run.

Each of these bits is set by an instruction and, once set, radically
changes the behaviour and characteristics of subsequent instructions.

With these (and other) long-established precedents already in POWER,
there is conceptually nothing new about what we
propose: we simply seek that the process by which such "switching" is
added be formalised and standardised, so that we (and others, including
IBM itself) have a clear, well-defined path to extend the POWER ISA for
use in markets that it presently cannot enter: one that is atomic,
non-intrusive and does not disrupt the standard.

We advocate that some of the "mode-setting" (escape-sequencing) bits be
binary encoded and some unary encoded, and that some space be marked for
"official" use, some "experimental", some "custom" and some "reserved".
We propose that the available space in a suitably-chosen SPR be formalised,
and recommend that the OpenPOWER Foundation be given the IANA-like role of
atomically allocating mode bits.
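
To make the distinction between binary-encoded and unary-encoded mode bits concrete, here is a small sketch of an IANA-like allocator. Everything in it is an assumption made for illustration: the 16-bit width, the split into a 4-bit binary field and twelve unary flags, and the names `ModeRegistry` and `Category` are hypothetical and are not part of any proposal text or of the POWER ISA.

```python
from enum import Enum


class Category(Enum):
    OFFICIAL = "official"
    EXPERIMENTAL = "experimental"
    CUSTOM = "custom"
    RESERVED = "reserved"


# Hypothetical 16-bit layout, chosen only for illustration:
#   bits 0-3  : one binary-encoded field (16 mutually exclusive modes)
#   bits 4-15 : unary-encoded flags (each bit is an independent switch)
BINARY_WIDTH = 4
UNARY_FIRST, UNARY_LAST = 4, 15


class ModeRegistry:
    """Toy stand-in for an IANA-like allocator of mode encodings."""

    def __init__(self):
        self.binary = {}  # field value -> (name, category)
        self.unary = {}   # bit index   -> (name, category)

    def allocate_binary(self, value, name, category):
        if not 0 <= value < (1 << BINARY_WIDTH):
            raise ValueError("value does not fit the binary field")
        if value in self.binary:
            raise ValueError("binary value already allocated")
        self.binary[value] = (name, category)

    def allocate_unary(self, bit, name, category):
        if not UNARY_FIRST <= bit <= UNARY_LAST:
            raise ValueError("bit is outside the unary range")
        if bit in self.unary:
            raise ValueError("unary bit already allocated")
        self.unary[bit] = (name, category)

    def encode(self, binary_value, unary_bits=()):
        """Build a register value from allocated encodings only."""
        if binary_value not in self.binary:
            raise ValueError("unallocated binary value")
        word = binary_value
        for bit in unary_bits:
            if bit not in self.unary:
                raise ValueError("unallocated unary bit")
            word |= 1 << bit
        return word
```

The design point the sketch tries to show: a binary field packs many mutually exclusive modes into few bits, whereas unary bits cost one bit each but can be combined freely. In both cases a single registry refuses double allocation, which is the "atomic" property asked of the allocating body.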

Instructions that we need to add, which are a normal part of GPUs,
include ATAN2, LOG, NORMALISE, YUV2RGB, Khronos Compliance FP mode
(different from both IEEE754 and "NI" mode), and many more. Many of
these may turn out to be useful in a wider context: they nevertheless need
to be fully isolated behind "mode-setting".

Some mode-setting instructions are privileged, i.e. the mode can only be
set by the kernel (e.g. 32 or 64 bit mode). Most of the escape sequences
that we propose will have to be usable without expensive
system call overhead, because some of the instructions needed will be
in extremely tight inner loops.
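
The split between kernel-only and user-settable mode bits can be sketched as follows. This is a hedged illustration only: the masks `KERNEL_ONLY_BITS` and `USER_BITS`, their bit positions, and the `write_mode` helper are invented for this example and do not correspond to any real register layout.

```python
class PrivilegeTrap(Exception):
    """Stands in for a privileged-instruction trap."""


# Hypothetical split of a mode register, invented for this sketch.
KERNEL_ONLY_BITS = 0b0001  # e.g. a word-size style switch
USER_BITS = 0b0110         # e.g. per-program extension switches


def write_mode(current, new, in_kernel):
    """Return the new mode value, trapping on an unprivileged change."""
    changed = current ^ new
    if changed & ~(KERNEL_ONLY_BITS | USER_BITS):
        raise ValueError("attempt to change an undefined bit")
    if (changed & KERNEL_ONLY_BITS) and not in_kernel:
        raise PrivilegeTrap("kernel-only mode bit")
    return new
```

A user program calling `write_mode(0b0000, 0b0010, in_kernel=False)` succeeds without entering the kernel, which is the property needed for switches placed around tight inner loops; the same call targeting bit 0 raises `PrivilegeTrap`.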

# About Libre-SOC Commercial Project

The Libre-SOC Commercial Product is a hybrid CPU-GPU-VPU intended for
mass-volume production. There is no separate GPU, because the CPU
*is* the GPU. There is no separate VPU, because the CPU *is* the VPU.
There is not even a separate pipeline: the CPU pipelines *are* the
GPU and VPU pipelines.

Closest equivalents include the ARC core (which has VPU extensions and
3D extensions in the form of Broadcom's VideoCore IV) and the ICubeCorp
IC3128. Both are considered "hybrid" CPU-GPU-VPU processors.

"Normal" Commercial GPUs are entirely separate processors. The development
cost and complexity of the Software Drivers alone is immense.
We reject that approach (and as a small team we do not have the resources
anyway).

Because the project is Libre (not proprietary, secretive and never
to be published), it is no good having the extensions as "custom":
"custom" is specifically for cases where the augmented
toolchain is never, under any circumstances, published by
the proprietary company (and would never be accepted upstream anyway).
For commercial reasons, Libre-SOC is the total opposite of this
proprietary, secretive approach.

Therefore, to meet our business objectives:

* As shown by Nyuzi and Larrabee, although ideally suited to high
  performance compute tasks, a "traditional" general-purpose full
  IEEE754-compliant Vector ISA (such as that in POWER9) is not an adequate
  basis for a commercially competitive GPU. Nyuzi's conclusion is that
  using such general-purpose Vector ISAs reaches only 25% of the
  performance of current commercial-grade GPUs (or requires a 4-fold
  increase in power consumption to achieve parity).
* We are not going the "traditional" (separate custom GPU) route because
  it is not practical for a new team to design hardware and spend 8+
  man-years on massively complex inter-processor driver development as well.
* We cannot meet our objectives with a "custom extension" because the
  financial burden on our team of maintaining a total hard fork of not just
  toolchains, but also entire GNU/Linux Distros, is highly undesirable
  and completely impractical (and Red Hat would strongly object anyway).
* We cannot "go ahead anyway" because to do so would be highly irresponsible
  and cause massive disruption to the POWER community.

With all impractical options eliminated, the only remaining responsible
option is to extend the POWER ISA in an atomically-managed (IANA-style)
formal fashion, whilst (critically and absolutely essentially) always
providing a PCR compatibility mode that is fully POWER compliant, including
all illegal instruction traps.