(Draft Status)

# Letter regarding ISAMUX / NS

## Summary

* We propose standardising the way that the PowerPC Instruction Set Architecture (PPC ISA) is extended, enabling many different flavours within a well-supported family to co-exist long-term, without conflict.
* This will facilitate the use of PPC in novel or niche applications without breaking the PPC ISA into incompatible islands. This is about more than just our project.
* Libre-SOC's project is to extend PPC to integrate GPU and VPU functionality directly into the PPC ISA (a comparable example is Broadcom's VideoCore IV, which is based around extensions to an ARC core). By not needing separate VPU or GPU RTL or ASICs, this will enable lower-cost systems to be built, giving PPC a competitive market advantage.
* Libre-SOC's extensions will be easy to adopt, as standard GNU/Linux distributions will very deliberately run unmodified on our ISA, including full compatibility with illegal instruction requirements.

## Why has Libre-SOC chosen PowerPC?

For a hybrid CPU-VPU-GPU, intended for mass-volume adoption in tablets, netbooks, chromebooks and industrial embedded (SBC) systems, our choice was between Nyuzi, MIAOW, RISC-V, PowerPC, MIPS and OpenRISC.

Of all the options, the PowerPC architecture is the most complete and by far the most mature. It is also more widely adopted by Linux distributions. Following IBM's release of the Power Architecture instruction set to the Linux Foundation in August 2019, the barrier to using it is no higher than that of using RISC-V.

We are encouraged that the OpenPOWER Foundation is supportive of what we are doing and is helping, e.g. by putting us in touch with people who can help us.

## One CPU, multiple ISAs

This is a quick overview of the way that we would like to add the changes that we are proposing to the PowerPC instruction set (ISA).
It is based on an open standardisation of the way that existing "mode switches", already found in the POWER instruction set, are added:

* FPSCR's "NI" bit, setting non-IEEE754 FP mode
* MSR's "LE" bit (and associated HILE bit), setting little-endian mode
* MSR's "SF" bit, setting either 32-bit or 64-bit mode
* PCR's "compatibility" bits 60-62, selecting V2.05, V2.06 or V2.07 mode

It is well known that unless each "mode switch" bit is set, the alternative instructions (and functionality) are completely inaccessible, and attempting to use them results in "illegal instruction" traps being thrown. This is recognised as being critically important.

These bits effectively create multiple, incompatible, run-time switchable ISAs within one CPU. They are selectable for the needs of the individual program being run.

Each of these is set by a single instruction which, once executed, radically changes the behaviour and characteristics of subsequent instructions.

With these (and other) long-established precedents already in POWER, there is conceptually nothing new about what we propose: we simply seek that the process by which such "switching" is added be formalised and standardised, so that we (and others) have a clear, atomic, non-intrusive path that does not disrupt the standard, to extend the POWER ISA for use in markets that it presently cannot enter.

We advocate that some of the "mode-setting" (escape-sequencing) bits be binary encoded, some unary encoded, some "official", some "experimental" and some "reserved". We propose that the available space in a suitably chosen SPR be formalised, and recommend that the OpenPOWER Foundation be given an IANA-like role in atomically allocating mode bits.

Instructions that we need to add, which are a normal part of GPUs, include ATAN2, LOG, NORMALISE, YUV2RGB, Khronos Compliance FP mode (different from both IEEE754 and "NI" mode), and many more. Many of these may turn out to be useful in a wider context: they do, however, need to be fully isolated behind "mode-setting".
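
As a rough illustration of the gating behaviour described above, the following Python sketch models a mode-setting SPR with two unary-encoded bits and one binary-encoded field. Everything in it is an assumption made for illustration: the bit positions, the names `MODE_GPU_MATH`, `MODE_VIDEO` and `FP_MODE_*`, the mnemonic table and the `decode` function are hypothetical and are not taken from the Power ISA or from any agreed allocation.

```python
class IllegalInstruction(Exception):
    """Stands in for the hardware "illegal instruction" trap."""


# Hypothetical unary-encoded mode bits: one bit per extension.
MODE_GPU_MATH = 1 << 0   # would enable e.g. ATAN2, LOG, NORMALISE
MODE_VIDEO = 1 << 1      # would enable e.g. YUV2RGB

# Hypothetical binary-encoded field: a 2-bit FP mode selector.
FP_MODE_SHIFT = 4
FP_MODE_MASK = 0b11 << FP_MODE_SHIFT
FP_MODE_IEEE754 = 0
FP_MODE_NI = 1
FP_MODE_KHRONOS = 2

# Hypothetical table: mnemonic -> mode bit required (0 means base ISA).
REQUIRED_MODE = {
    "add": 0,
    "atan2": MODE_GPU_MATH,
    "log": MODE_GPU_MATH,
    "yuv2rgb": MODE_VIDEO,
}


def fp_mode(mode_spr: int) -> int:
    """Extract the binary-encoded FP mode field from the SPR value."""
    return (mode_spr & FP_MODE_MASK) >> FP_MODE_SHIFT


def decode(mnemonic: str, mode_spr: int) -> str:
    """Accept an instruction only if its mode bit is set in the SPR."""
    required = REQUIRED_MODE.get(mnemonic)
    if required is None:
        raise IllegalInstruction(mnemonic)   # unknown in every mode
    if required and not (mode_spr & required):
        raise IllegalInstruction(mnemonic)   # extension not switched on
    return mnemonic
```

With the SPR at zero only the base instruction decodes, so an unmodified program sees an illegal instruction trap for `atan2` exactly as it would on a CPU without the extension. Setting `MODE_GPU_MATH` makes `atan2` and `log` available while leaving `yuv2rgb` inaccessible.
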

Some mode settings are privileged, i.e. they can only be changed by the kernel (e.g. 32-bit or 64-bit mode). The escape sequences that we propose will be (and have to be) usable without the overhead of an expensive system call, because some of the instructions needed will be in extremely tight inner loops.

# Summary of Libre-SOC Commercial Project

The Libre-SOC Commercial Product is a hybrid CPU-GPU-VPU intended for mass-volume production. There is no separate GPU, because the CPU *is* the GPU. There is no separate VPU, because the CPU *is* the VPU. There is not even a separate pipeline: the CPU pipelines *are* the GPU and VPU pipelines.

Closest equivalents include the ARC core (which has VPU extensions and 3D extensions in the form of Broadcom's VideoCore IV) and the ICubeCorp IC3128. Both are considered "hybrid" CPU-GPU-VPU processors.

"Normal" commercial GPUs are entirely separate processors. The development cost and complexity of the software drivers alone is immense. We reject that approach (and as a small team we do not have the resources anyway).

Because the project is Libre (not proprietary, secretive and never to be published), it is no good having the extensions as "custom": "custom" is specifically for cases where the augmented toolchain is never, under any circumstances, published by the proprietary company (and would never be accepted upstream anyway).

For commercial reasons, Libre-SOC is the total opposite of this proprietary, secretive approach.

Therefore, to meet our business objectives:

* As shown by Nyuzi and Larrabee, although ideally suited to high-performance compute tasks, a "traditional" general-purpose, fully IEEE754-compliant Vector ISA (such as that in POWER9) is not an adequate basis for a commercially competitive GPU.
  Nyuzi's conclusion is that using such general-purpose Vector ISAs reaches only 25% of the performance of current commercial-grade GPUs (or requires a 4-fold increase in power consumption to achieve parity).
* We are not going the "traditional" (separate custom GPU) route, because it is not practical for a new team to design hardware and also spend 8+ man-years on massively complex inter-processor driver development.
* We cannot meet our objectives with a "custom extension", because the financial burden on our team of maintaining a total hard fork of not just toolchains but also entire GNU/Linux distros is highly undesirable and completely impractical (and Red Hat would strongly object anyway).
* We cannot "go ahead anyway", because to do so would be highly irresponsible and cause massive disruption to the POWER community.

With all impractical options eliminated, the only remaining responsible option is to extend the POWER ISA in an atomically managed (IANA-style) formal fashion, whilst (critically and absolutely essentially) always providing a PCR compatibility mode that is fully POWER compliant.
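
To make the idea of atomically managed, IANA-style allocation more concrete, here is a minimal Python sketch of a registry that hands out each mode bit at most once. It is purely illustrative: the class `ModeBitRegistry`, its methods, the 64-bit width and the status labels are assumptions of ours, not an existing OpenPOWER Foundation process or tool.

```python
class ModeBitRegistry:
    """Hypothetical registry: every mode bit has at most one owner."""

    STATUSES = ("official", "experimental")

    def __init__(self, width=64, reserved=()):
        self.width = width
        # Reserved bits are marked up front and can never be allocated.
        self._entries = {bit: ("reserved", None) for bit in reserved}

    def allocate(self, owner, status="official"):
        """Give the lowest free bit to `owner`; never reuse a bit."""
        if status not in self.STATUSES:
            raise ValueError("unknown status: " + status)
        for bit in range(self.width):
            if bit not in self._entries:
                self._entries[bit] = (status, owner)
                return bit
        raise RuntimeError("no free mode bits left")

    def lookup(self, bit):
        """Return (status, owner) for a bit, or None if unallocated."""
        return self._entries.get(bit)
```

For example, with `ModeBitRegistry(width=4, reserved=(0,))`, two successive `allocate` calls return bits 1 and 2, and bit 0 stays reserved. The point of the sketch is only that two extensions can never end up sharing a bit, which is what keeps independently developed extensions from conflicting.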