\documentclass[slidestop]{beamer}
\usepackage{beamerthemesplit}
\title{Commercial Libre-RISCV SoC}
\author{Luke Kenneth Casson Leighton}
\huge{Designing a Commercial Libre RISC-V SoC}\\
\Large{Ethical Strategic Leveraging of the benefits}\\
\Large{of Libre and Open SW/HW}\\
\Large{for pure unadulterated Commercial gain}\\
\Large{Chennai 9th RISC-V Workshop}\\
\frame{\frametitle{Credits and Acknowledgements}
\item The Designers of RISC-V
\item The RISC-V Foundation
\item The Shakti Group, and IIT Madras RISE Group
\item Prof. G S Madhusudan
\item Members of the RISC-V Open Groups (SW/HW/ISA)
\item Libre and Open Software and Hardware Communities
\item Richard Herveille (RoaLogic), Edmund Humenberger, Clifford Wolf
      (Symbiotic EDA), Rudi (Asics.ws), Enjoy-Digital.fr,
      Alex Forencich, LowRISC Team
\item Anonymous Sponsor
\frame{\frametitle{Why, How, What?}
\item Why? Because these days it's just not necessary to
      make [un]ethical compromises in order to make a profitable,
      desirable mass-volume product\\
      {\it (There are enough companies doing that: where has it got us?)}
\item How? By leveraging the long-established strategic cost and
      maintenance benefits of libre-licensed software (and
      {\it making sure that the people who provide it are
      financially rewarded}. Also by empowering diverse team
\item What? A 2.5GHz RISC-V 64-bit SoC that has
      a 3D Embedded GPU, 1080p Video decode, and interfaces
      to make it attractive for use in tablets, netbooks, industrial
      embedded and more. 22nm or less, under 400 pins, under USD \$4.\\
      {\it All sounds obvious... but is it practical and achievable?}
\frame{\frametitle{Definitions}
\item {\bf Business}: the provision of a service and being
      commensurately financially rewarded for doing so
\item {\bf Sponging}: the provision of a service and being
      taken advantage of for doing so {\it (cf: Professor Yunus)}
\item {\bf An ethical act}: an act that increases truth,
      love, awareness or creativity for one or more people
      (including yourself), {\it without} reducing those
      same four qualities {\it for anyone}
\item {\bf The Four Freedoms}: the rights and guarantees
      associated with and embedded within GNU Licenses {\it (cf: FSF)}
{\it Is it possible to ethically do business and respect the
Four Freedoms? That's where it gets interesting, as there are
even cases where the Four Freedoms are unethical. Note: Google's
former motto "don't be evil" is clearly (unintentionally) unethical}
\frame{\frametitle{Does what we want already exist? Surely this is nonsense!}
\includegraphics[height=2.4in]{nolibresocs.jpg}\\
{\bf Analysis of SoCs over the past 7+ years (answer: no)}
\frame{\frametitle{Breakdown of non-existence of fully-Libre SoCs}
\item {\bf iMX6}: Libre bootable, Vivante 3D GPU (libre etnaviv)
      but proprietary VPU (and a power-hungry Cortex A9)
\item {\bf Allwinner SoCs}: mostly Libre bootable,
      VPU reverse engineered; GPU: MALI or PowerVR (i.e. proprietary)
\item {\bf Rockchip SoCs}: good but using MALI or PowerVR.
\item {\bf TI OMAP}: good but using PowerVR, and expensive.
\item {\bf Samsung}: good but using MALI.
\item {\bf Ingenic jz4775}: GREAT! performance
\item {\bf Broadcom SoCs}: Cartelled. and boots from the GPU
{\it Basically there does not exist one single commercial SoC that
provides full source code for all functions (CPU, GPU, VPU)
with modern performance. Which is kinda bizarre if you think about it}
\frame{\frametitle{What would a good (Libre) boring, mundane SoC have?}
\item Cover a lot of different scenarios (embedded, tablets, industrial,
      netbooks, crypto-currency mining).
\item Decent performance with high efficiency. RISC-V: 40\%
      more efficient than ARM / Intel. Shakti a good
      candidate: 2.5GHz and 120mW per core @ 22nm.
\item 1080p video: y'all gotta watch cute kittens on youtube, right?
\item 3D GPU: y'all gotta play Angri Burds, right? (or Minecraft)
\item No spying back-door co-processors (to steal crypto-wallets)
\item No Spectres, no Meltdowns.
{\it Basically quite boring and mundane. No Monster Performance,
no AI stuff, no special sauce. Just a plain-old SoC,
40\% more power efficient than ARM/Intel,
and not spying on end-users, that's all}
\frame{\frametitle{How on earth does an ethical Libre SoC make money???}
\item Simple answer: Mask Rights.
\item Without Mask Rights: by having a desirable
      product, and packaging it for a customer (i.e. by being a middle-man,
      a service is still being provided for which payment etc. etc.)
\item Without a desirable product or customer(s): err... you don't.\\
      (cf: definition of Business)
\item By not having high NREs (leveraging back-to-back deals,
      and helping others fulfil their needs and goals)
{\it Detachment from the goal also helps. If someone else makes this
product then GREAT! I can go do something else}\\
{\bf Main point: please do not automatically assume Ethical and Libre is
non-commercial. It's not nice, and it's not helping}
\frame{\frametitle{Things wot are "off-limits"}
\item Customer entrapment (through proprietary software).\\
      Strong business case for not entrapping customers:\\
      \url{https://tinyurl.com/most-productive-meeting-ever}
\item Funding, endorsing, supporting or empowering unethical
      Companies, Organisations, Cartels and Individuals.\\
      (cf: definition of an ethical act).
\item Being totally inflexible / unrealistic. Goals have
      to be met: it's no good being an idiot about that. e.g. if
      a Libre 3D GPU really can't be made, use Vivante GC800
\item Spying back-door co-processors a no-no. Sovereignty
      is critical. Russia has Baikal. China has Loongson.
{\it Still no real show-stoppers to making money (or product):
it's just slightly harder, that's all. Ultimately it's about
\frame{\frametitle{Interfaces, Block Diagram, of the Libre-RISCV SoC}
\includegraphics[height=2.1in]{../shakti_libre_riscv.jpg}\\
{\bf Separate Power Domains for GPIO banks, Variable voltages
required, low-power sleep states etc. Quite involved}
\frame{\frametitle{Hardware / Development Complexity Comparison}
\item {\bf Server}: relatively easy. PCIe, RapidIO, XAUI, SATA, GbE, 10GE,
      DDR3/4 (or HMC) etc. etc. No multiplexing: all interfaces dedicated
      and high-speed differential pairs.
\item {\bf Desktop}: really just a variant of Server.
      Graphics is a PCIe Card (except if integrated). Peripherals
      often done in dedicated external ICs ("Southbridge" concept)
\item {\bf Embedded}: also pretty easy. Really needs a pinmux. Low clock
      rate, low power mode. e.g. SiFive Freedom U310.
\item {\bf Mobile}: HARD. Performance/Watt matters $\Rightarrow$ variable core
      voltage domains {\it per core}. Number of pins matters (affects
      yield and package cost). Cost
      matters. Pinmux critical.
{\it Bottom line: Mobile-class processors are challenging!}
\frame{\frametitle{Proprietary vs Libre-licensed Interface HDL}
\item DDR3/4: challenging! \$1m for single-use, single instance.\\
      Symbiotic EDA: \$600k for PHY; CERN developed a Controller\\
      \url{http://libre-riscv.org/shakti/m_class/DDR/}
\item HyperRAM (JEDEC xSPI): lower risk than DDR3/4\\
      \url{http://libre-riscv.org/shakti/m_class/HyperRAM/}
\item RGMII: several available (saves \$50k)\\
      \url{http://libre-riscv.org/shakti/m_class/RGMII/}
\item UART, SPI, I2C, PWM, SD/MMC: all libre (except eMMC).
\item Shakti Group has FlexBus, QuadSPI, SRAM, many more.
\item RGB/TTL: R. Herveille (SSD2828, SN75LVDS83b, TFP410a)
{\it Basically there's no compelling reason to spend vast sums
on proprietary HDL. Sorry Cadence / Mentor / Synopsys / whoever}
\frame{\frametitle{Challenging Stuff [1] - Memory Interfaces}
\item DDR3/4 PHYs are analog and very high speed.
      Impedance training. Extreme timing tolerances on parallel buses.\\
      No surprise proprietary cost is USD \$1m and above.
\item Symbiotic EDA will do (Libre) PHY layout for USD \$300k,
      time to completion for chosen geometry: 8-12 months.
{\it Silicon-proven but still risky. What are the alternatives?}
\item FlexBus/SDRAM (low clock, lots of pins, single-data-rate).
\item HyperRAM (aka JEDEC xSPI) 8-bit SPI 166MHz or DDR-300.\\
      300MB/sec for only 13 wires, not bad! (We'll take several)\\
      \url{http://libre-riscv.org/shakti/m_class/HyperRAM/}
\item HMC: insanely fast, very low power. OpenHMC (LGPL)\\
      \url{https://opencores.org/project/openhmc}
\frame{\frametitle{Challenging Stuff [2] - Video Decode Engine}
\item Richard Herveille's Video Core Blocks\\
      https://opencores.org/project/video\_systems
\item Symbiotic EDA MP4 decoder in FPGA
\item H.264 seems to have been done...\\
      https://github.com/adsc-hls/synthesizable\_h264
\item Really needs SIMD (or better, not-SIMD)\\
      \url{http://libre-riscv.org/simple_v_extension/}
\item Definitely needs xBitManip (parallelised by Simple-V)\\
      \url{https://github.com/cliffordwolf/xbitmanip}
{\it SIMD is insane. $O(N^6)$ opcode proliferation. See\\
https://www.sigarch.org/simd-instructions-considered-harmful/ \\
(1): P-Ext designed for Audio. (2): Investigate RI5CY's SIMD
\frame{\frametitle{Challenging Stuff [3] - Power Management}
\item Been done before (many times), but not as a Libre Design.
\item Sanjay Charagulla: GlobalFoundries 22nm mobile process
      can reach as low as 0.4V
\item GPIO Banks need per-bank VREF (1.8V? to 3.3V)\\
      IO pads need built-in
      level-shifting to convert to CPU VCORE
\item Each core needs independent variable-voltage capability
      and independent shut-down (PMIC supplies external voltage)
\item DDR RAM still needs refreshing (even in sleep mode)
\item Extra RV32 (PicoRV32?) always-on core for wake-up / RTC
\item PLLs are Analog. fun fun fun in the sun sun sun...
{\it Really need help. PLLs, Analog stuff: specific
domain expertise. Fall-back example:}
\url{https://www.dolphin-integration.com}?
\frame{\frametitle{Challenging Stuff [4] - Libre 3D GPU. Sigh.}
\item Actual requirements quite modest: 30MP/s, 100MT/s, 5GFLOPS
      but power/area is crucial (2mm$^2$ @ 40nm, 1W)
\item Nyuzi, MIAOW, GPLGPU (Number Nine), OGP.
\item Nyuzi based on Larrabee. Jeff Bush really helpful.
\item MIAOW is an OpenCL engine. GPLGPU is fixed-function
\item Nyuzi lessons: Software-only rendering not enough.
      Getting through L1 cache takes most power. Fixed functions
      such as parallel FP-Quad to ARGB Pixel, and Z-Buffer
\item Fallback is GC800 (\$250k) {\it contact me if you can do better!}
{\it Jacob Bachmeyer's Cache-control proposal turns L1 Cache into
scratchpad RAM. RVV is just too heavy (sorry!), Simple-V much
more light-weight and flexible ($O(1)$ ISA proliferation)
\frame{\frametitle{Challenging Stuff [5] - Public Custom Extensions}
\item GPUs are usually done with incompatible ISAs and effectively
      doing OpenGL over IPC / RPC (Remote Procedure Calls)
\item Much simpler: GPGPU "one ISA" approach. Custom-extend the
      core ISA to handle 3D, use Gallium3D-LLVM.
\item Now add Video Extensions, SIMD etc. and
      {\bf we are well beyond the only 2 available 32-bit custom opcodes}
\item Due to the Libre nature of this project, the custom opcode
      space will be "dominated" by
      high-profile public hard-forks of gcc, binutils, llvm etc.
      Which isn't going to go down well.
\item ISA "Conflict Resolution" is therefore absolutely critical\\
      \url{http://libre-riscv.org/isa_conflict_resolution/}
{\it Remember Altivec. Learn from Intel.
\underline{This is everyone's problem.}
\frame{\frametitle{Interesting Missing Stuff [1] - Pinmux}
\item Pinmux: multiplexer of functions onto pins\\
      {\it DRAM Cell != DDR3/4, Mux Cell != Muxer}
\item Strategically extremely important to Commercial SoC success\\
      STMicro, Rockchip, Freescale, Samsung, TI, {\bf EVERYONE}
\item Bizarrely, a libre-licensed multi-way Pinmux doesn't exist.\\
      {\it not on anyone's radar. at all.}
      SiFive IOF not enough.
\item Verification (scenario analysis) and auto-generation of
      TRM, header files, device-tree files, pretty much everything
      makes sense (to any "lazy" Software Engineer...)
\item Corporations with legacy pinmux unlikely to be interested.
\item \url{http://git.libre-riscv.org/?p=pinmux.git} \\
      \url{http://hands.com/~lkcl/pinmux\_chennai\_2018.pdf}
\frame{\frametitle{Interesting Missing Stuff [2] - AC97/I2S, USB2 PHY}
\item Rudi (Asics.ws) donating time to create a Multi-Protocol
      Audio Controller: AC97, PCM, PDM, I2S\\
      \url{http://libre-riscv.org/shakti/m_class/AC97/}
\item USB2 is... convoluted. UTMI-ULPI-USB2 PHY\\
      USB2-PHY not confirmed (Rudi has one)\\
      Also Rudi has DDR (8-pin) variant of ULPI\\
      \url{http://libre-riscv.org/shakti/m_class/ULPI/}
\item USB3 not necessarily a good idea to put into Libre-RISCV\\
      Daisho USB3 Pipe exists, TUSB1310a PHY is 175 pin FBGA!
\item Libre SD/MMC typically at "Open" Level, 20MB/sec approx.
      Full spec and eMMC needed (Rudi again).
{\it Trying to keep interfaces all-digital (USB3 isn't,
HP/Mic definitely isn't). Use
external (Analog) PHYs and/or Multi-chip Module
\frame{\frametitle{Which Processor Cores to use?}
\item Shakti RV64 at the top of the list, not just for technical
      reasons, but for the Shakti Group's goals and vision.
\item Libre 3D GPGPU (SMP RV64 plus accelerated custom ISA)
      would make things interesting\\
      (3D app pinned to a non-uniform but SMP architecture)
\item Video Processing again is reasonable to be a different
      RV32/64 Core (SMP or otherwise), possibly not even RV
      at all (MIPS, OR1200)
\item RV32 (PicoRV32?) always-on definitely needed (sleep mode)
{\it Ultimately, decisions are flexible, heavily weighted
towards "what does good and doesn't do bad" as
\frame{\frametitle{Summary}
\item Making a commercially-desirable SoC is neither academically
      nor standard-investor sexy! No AI. Boring. zzzz
\item Luckily there is an anonymous sponsor who needs an SoC that
      doesn't exist (who knows the commercial benefits of Libre)
\item Shakti Group know the benefits (cost, sovereignty) of a Libre
      Mobile-Class SoC as well (No spying on Indian citizens!)
\item A Libre GPU, even a modest performer (100MT/s etc.)
      is the biggest technical risk/unknown, besides DDR3/4.\\
      (fall-back is GC800. Do please help with a Libre GPU!)
\item DDR3/4 and eMMC are the main high-risk interfaces\\
      (there are fall-back strategies in place)
\item Ultimately the strategy is all about cost reduction
      with Libre/Ethical prioritised over "convenience"
{\Huge The end\vspace{20pt}\\
Thank you\vspace{20pt}\\
Questions?\vspace{20pt}
\item Contact: lkcl@lkcl.net
\item \url{http://libre-riscv.org/shakti/m_class/}