[[!tag whitepapers]]

# Why in the 2020s would you invent a new Vector ISA

Inventing a new Scalar ISA from scratch is a task of over a decade,
including simulators and compilers: OpenRISC 1200 took 12 years to mature.
Stable, general-purpose auto-vectorisation compiler support for a Vector
or SIMD ISA has never been achieved in the history of computing, not even
with the combined resources of
ARM, Intel, AMD, MIPS, Sun Microsystems, SGI, Cray, and many more.
Rather, GPUs have ultra-specialist compilers that are designed
from the ground up to support Vector/SIMD parallelism,
with associated standards managed by the
Khronos Group, and it takes a multi-man-century development commitment from
multiple billion-dollar-revenue companies to sustain them.

This raises the question: why on earth would anyone consider this
task?

Hints as to the answer emerge from an article,
"[SIMD considered harmful](https://www.sigarch.org/simd-instructions-considered-harmful/)",
which illustrates a catastrophic rabbit-hole that Industry Giants
ARM, Intel and AMD have been going down
since the 90s (over 3 decades): SIMD is an Order(N^6) opcode
proliferation nightmare whose mantra, "make it easy for hardware engineers,
let software sort out the mess", is literally overwhelming programmers.
Worse than that, specialists who charge clients for Optimisation Services
are finding that AVX-512, to take an example, is anything but optimal:
overall performance of AVX-512 actually *decreases* even as power
consumption goes up.

Cray-style Vectors solved the opcode proliferation nightmare over thirty
years ago. Only the NEC SX Aurora, however, truly kept the Cray Vector flame
alive, until RISC-V RVV, now SVP64, and recently MRISC32 joined it.
ARM's SVE/SVE2 is critically flawed (it lacks the Cray `setvl` instruction
that makes for a truly ubiquitous Vector ISA) in ways that will become apparent
over time as adoption increases. In the meantime programmers are, in
direct violation of ARM's advice on how to use SVE2, trying desperately
to use it as if it were Packed SIMD NEON. The advice not to write SVE2
assembler that is hardcoded to fixed widths is being disregarded, in
favour of writing *multiple identical implementations* of a function,
each for a different hardware width, and compelling software to choose
one at runtime after probing the hardware.

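As a rough illustration of what `setvl` buys, the following is a minimal
Python sketch (a model only, not real SVE2, RVV or SVP64 code) of a
Cray-style strip-mined loop. The names `setvl` and `vector_add`, and the
`maxvl` parameter standing in for the hardware Vector length, are
hypothetical and chosen purely for this example. The point it tries to
show: one loop body gives the same answer whatever the hardware width,
so there is no need for per-width copies of a function or for runtime
probing.

```python
# Sketch only: models Cray-style strip-mining in plain Python.
# `setvl` and `maxvl` are illustrative assumptions, not any real ISA's definition.

def setvl(remaining: int, maxvl: int) -> int:
    """Return how many elements to process this iteration (hypothetical model)."""
    return min(remaining, maxvl)


def vector_add(a, b, maxvl):
    """Add two equal-length lists, at most maxvl elements per iteration."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    i = 0
    n = len(a)
    while i < n:
        # The hardware clamps the request to its own width; the tail
        # iteration simply gets a shorter vl, with no special-case code.
        vl = setvl(n - i, maxvl)
        # Stands in for one Vector instruction operating on vl elements.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out


a = list(range(10))
b = [10 * x for x in a]
# Same source code, three different modelled hardware widths.
results = {maxvl: vector_add(a, b, maxvl) for maxvl in (4, 8, 16)}
```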
Even RISC-V, for all that we can be grateful to the RISC-V Founders for
reviving Cray Vectors, has severe performance and implementation
limitations that are only really apparent to exceptionally experienced
assembly-level developers with broad and deep knowledge of multiple ISAs:
one of the best and clearest explanations is a
[ycombinator post](https://news.ycombinator.com/item?id=24459041)
by adrian_b.

Adrian logically and concisely points out that the fundamental
design assumptions and
simplifications that went into the RISC-V ISA have an
irrevocably damaging effect
on its viability for high performance use. That is not to say that
it is not ideal for low-performance embedded scenarios: in
private, custom, secretive commercial usage it is perfect. Ubiquitous
and common everyday usage in scenarios currently occupied by ARM, Intel,
AMD and IBM? Not so much. Thus, even though RISC-V has Cray-style Vectors,
the whole ISA is, unfortunately, fundamentally flawed as far as
power-efficient high performance is concerned.

Slowly, at this point, a realisation should be sinking in that
there are not actually that many truly viable Vector ISAs out there: the
ones that are evolving in the general direction of Vectorisation are,
in various completely different ways, flawed.

**Successfully identifying a limitation marks the beginning of an opportunity**

We are nowhere near done, however, because a Vector ISA is a superset of
a Scalar ISA, and even a Scalar ISA takes over a decade to develop
compiler support, and even longer to get the software ecosystem up and
running.

Which ISAs, therefore, have or have had, at one point in time, a decent Software
Ecosystem? Debian supports most of these, including s390:

* SPARC, created by Sun Microsystems and all but abandoned by Oracle.
  Gaisler Research maintains the LEON Open Source Cores but, with Oracle's
  reputation, nobody wants to go near SPARC.
* MIPS, created by MIPS Computer Systems, later owned by SGI, and only
  really commonly used in Network switches.
  Exceptions: Ingenic with embedded CPUs,
  and China ICT with the Loongson supercomputers.
* x86, the most well-known ISA and also one of the most heavily
  protected by litigation.
* ARM, well known in embedded and smartphone scenarios, very slowly
  making its way into data centres.
* OpenRISC, an entirely Open ISA suitable for embedded systems.
* s390, a Mainframe ISA very similar to Power.
* Power ISA, a Supercomputing-class ISA, as demonstrated by
  two of the top three top500.org supercomputers using
  160,000 IBM POWER9 Cores.
* ARC, a competitor at the time to ARM, best known for use in
  Broadcom VideoCore IV.
* RISC-V, with a software ecosystem heavily in development
  and with rapid adoption
  in an uncontrolled fashion, is set on an unstoppable
  and inevitable trainwreck path to replicate the
  opcode conflict nightmare that plagued the Power ISA
  two decades ago.
* Tensilica, Andes STAR and Western Digital, as examples of successful
  commercial proprietary ISAs: Tensilica in Baseband Modems,
  Andes in Audio DSPs, WD in HDDs and SSDs. These are all
  multi-billion-unit mass volume markets that almost nobody
  knows anything about. Included for completeness.

In order of least controlled to most controlled, the viable
candidates for further advancement are:

* OpenRISC 1200, not controlled or restricted by anyone. No patent
  protection.
* RISC-V, touted as "Open" but actually strictly controlled under
  Trademark License: too new to have adequate patent pool protection,
  as evidenced by multiple adopters having been hit by patent lawsuits.
* MIPS, SPARC, ARC, and others, which simply have no viable ecosystem.
* Power ISA: protected by IBM's extensive patent portfolio for Members
  of the OpenPOWER Foundation, covered by Trademarks, permitting
  and encouraging contributions, and having software support for over
  20 years.
* ARM, which does not permit Open Licensing. It survived in the early 90s
  only by doing a deal with Samsung for a Royalty-free License in exchange
  for GBP 3 million and legal protection under Samsung Research Division.
  Several large Corporations (Apple most notably) have licensed the ISA
  but not ARM designs: the barrier to entry is high and the ISA itself
  is protected from interference as a result.
* x86, famous for a Court Ruling in 2004 where a Judge "banged heads
  together" and ordered AMD and Intel to stop wasting his time,
  make peace, and cross-license each other's patents. Anyone wishing
  to use the x86 ISA need only look at Transmeta, SiS, the Vortex x86,
  and VIA EDEN processors, and see how they fared.
* s390, IBM's mainframe ISA. Its lawsuits are nowhere near as well-known
  as those over x86,
  but the 800lb Gorilla Syndrome seems not to have deterred one
  particularly disingenuous group from performing illegal
  Reverse-Engineering.

*Of all of these, the one with the most going for it is the Power ISA.*