1 # Resources and Specifications
3 This page aims to collect all the resources and specifications we need
4 in one place for quick access. We will try our best to keep links here
5 up-to-date. Feel free to add more links here.
11 This section is primarily a series of useful links found online
13 * [FSiC2019](https://wiki.f-si.org/index.php/FSiC2019)
14 * Fundamentals to learn to get started [[3d_gpu/tutorial]]
16 ## Is Open Source Hardware Profitable?
17 [RaptorCS on FOSS Hardware Interview](https://www.youtube.com/watch?v=o5Ihqg72T3c&feature=youtu.be)
21 * [3.0 PDF](https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0)
22 * [2.07 PDF](https://openpowerfoundation.org/?resource_lib=ibm-power-isa-version-2-07-b)
24 ## Overview of the user ISA:
26 [Raymond Chen's PowerPC series](https://devblogs.microsoft.com/oldnewthing/20180806-00/?p=99425)
28 ## OpenPOWER OpenFSI Spec (2016)
30 * [OpenPOWER OpenFSI Spec](http://openpowerfoundation.org/wp-content/uploads/resources/OpenFSI-spec-100/OpenFSI-spec-20161212.pdf)
32 * [OpenPOWER OpenFSI Compliance Spec](http://openpowerfoundation.org/wp-content/uploads/resources/openpower-fsi-thts-1.0/openpower-fsi-thts-20180130.pdf)
36 * <https://www.reddit.com/r/OpenPOWER/>
37 * <http://lists.mailinglist.openpowerfoundation.org/pipermail/openpower-hdl-cores/>
38 * <http://lists.mailinglist.openpowerfoundation.org/pipermail/openpower-community-dev/>
43 * [Useful JTAG implementation reference: Design Of IEEE 1149.1 TAP Controller IP Core by Shelja, Nandakumar and Muruganantham, DOI:10.5121/csit.2016.60910](https://web.archive.org/web/20201021174944/https://airccj.org/CSCP/vol6/csit65610.pdf)
47 "The objective of this work is to design and implement a TAP controller IP core compatible with IEEE 1149.1-2013 revision of the standard. The test logic architecture also includes the Test Mode Persistence controller and its associated logic. This work is expected to serve as a ready to use module that can be directly inserted in to a new digital IC designs with little modifications."
49 # RISC-V Instruction Set Architecture
51 **PLEASE UPDATE** - we are no longer implementing full RISCV, only user-space
The Libre RISC-V Project is building a hybrid CPU/GPU SoC. As the name
of the project implies, we will be following the RISC-V ISA because it
is open-source and because of the huge software and hardware
ecosystem building around it. There are other open-source ISAs but none
of them has the same momentum and energy behind it as RISC-V.
To take full advantage of the RISC-V ecosystem, it is important to be
compliant with the RISC-V standards. Doing so will allow us to reuse
most software as-is and avoid major forks.
* [Official compiled PDFs of RISC-V ISA Manual](https://github.com/riscv/riscv-isa-manual/releases/latest)
66 * [Working draft of the proposed RISC-V Bitmanipulation extension](https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-draft.pdf)
67 * [RISC-V "V" Vector Extension](https://riscv.github.io/documents/riscv-v-spec/)
68 * [RISC-V Supervisor Binary Interface Specification](https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.md)
Note: We are not using the RISC-V V Extension directly, and were never
going to. However, many wiki pages refer to the V extension, so it is
included here as a reference for comparative/informative purposes with
regard to Simple-V.
74 <https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vsetvlivsetvl-instructions>
77 - [Qemu emulation](https://github.com/qemu/qemu/commit/d5fee0bbe68d5e61e2d2beb5ff6de0b9c1cfd182)
81 ## D-Cache Possible Optimizations papers and links
82 - [ACDC: Small, Predictable and High-Performance Data Cache](https://dl.acm.org/doi/10.1145/2677093)
84 # BW Enhancing Shared L1 Cache Design research done in cooperation with AMD
85 - [Youtube video PACT 2020 - Analyzing and Leveraging Shared L1 Caches in GPUs](https://m.youtube.com/watch?v=CGIhOnt7F6s)
86 - [Url to PDF of paper on author's website (clicking will download the pdf)](https://adwaitjog.github.io/docs/pdf/sharedl1-pact20.pdf)
89 # RTL Arithmetic SQRT, FPU etc.
91 ## Wallace vs Dadda Multipliers
93 * [Paper comparing efficiency of Wallace and Dadda Multipliers in RTL implementations (clicking will download the pdf from archive.org)](https://web.archive.org/web/20180717013227/http://ieeemilestones.ethw.org/images/d/db/A_comparison_of_Dadda_and_Wallace_multiplier_delays.pdf)
96 * [Fast Floating Point Square Root](https://pdfs.semanticscholar.org/5060/4e9aff0e37089c4ab9a376c3f35761ffe28b.pdf)
97 * [Reciprocal Square Root Algorithm](http://www.acsel-lab.com/arithmetic/arith15/papers/ARITH15_Takagi.pdf)
99 ## CORDIC and related algorithms
101 * <https://bugs.libre-soc.org/show_bug.cgi?id=127> research into CORDIC
102 * <https://bugs.libre-soc.org/show_bug.cgi?id=208>
103 * [BKM (log(x) and e^x)](https://en.wikipedia.org/wiki/BKM_algorithm)
104 * [CORDIC](http://www.andraka.com/files/crdcsrvy.pdf)
105 - Does not have an easy way of computing tan(x)
106 * [zipcpu CORDIC](https://zipcpu.com/dsp/2017/08/30/cordic.html)
107 * [Low latency and Low error floating point TCORDIC](https://ieeexplore.ieee.org/document/7784797) (email Michael or Cole if you don't have IEEE access)
108 * <http://www.myhdl.org/docs/examples/sinecomp/> MyHDL version of CORDIC
109 * <https://dspguru.com/dsp/faqs/cordic/>
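
As a rough illustration of the note above about tan(x), here is a minimal rotation-mode CORDIC sketch in Python. It is not taken from any of the linked references: the function name `cordic_sin_cos`, the floating-point arithmetic (real hardware would use fixed-point shifts and a lookup table), and the iteration count are all assumptions made for the example. The algorithm yields sin and cos together, so tan(x) still needs a separate division.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC sketch for |theta| <= pi/2; returns (sin, cos).

    Hypothetical helper for illustration only. Floats are used for clarity;
    a hardware version would replace the multiplications by 2**-i with shifts.
    """
    # Elementary rotation angles atan(2^-i), normally stored in a small ROM.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # the accumulated gain is compensated by pre-scaling x.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate towards a zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(0.5)
tan_estimate = s / c  # tan(x) is not produced directly: it costs a divide
```

The final division is the point of the sketch: the shift-and-add iterations produce sin and cos, and the tangent has to be derived from them afterwards.
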
111 ## IEEE Standard for Floating-Point Arithmetic (IEEE 754)
113 Almost all modern computers follow the IEEE Floating-Point Standard. Of
114 course, we will follow it as well for interoperability.
116 * IEEE 754-2019: <https://standards.ieee.org/standard/754-2019.html>
Note: Even though this is such an important standard used by everyone,
it is unfortunately not freely available and requires a payment to
access. However, each of the Libre RISC-V members already has access
123 * [Lecture notes - Floating Point Appreciation](http://pages.cs.wisc.edu/~markhill/cs354/Fall2008/notes/flpt.apprec.html)
125 Among other things, has a nice explanation on arithmetic, rounding modes and the sticky bit.
127 * [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)
129 Nice resource on rounding errors (ulps and epsilon) and the "table maker's dilemma".
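
To make the terms "ulp" and "epsilon" from the resource above concrete, here is a small Python sketch. It assumes Python 3.9 or later (for `math.ulp`) and a platform whose `float` is IEEE 754 binary64; the helper name `error_in_ulps` is made up for this example.

```python
import math
import sys

# Machine epsilon: the gap between 1.0 and the next representable float.
eps = sys.float_info.epsilon
assert eps == 2.0 ** -52          # holds for IEEE 754 binary64
assert math.ulp(1.0) == eps       # the ulp of 1.0 is exactly epsilon


def error_in_ulps(computed, reference):
    """Distance between two floats, measured in ulps of the reference.

    Hypothetical helper for illustration; 'reference' is itself a float,
    so this measures distance between representable values only.
    """
    return abs(computed - reference) / math.ulp(reference)


# The classic example: 0.1 + 0.2 lands one ulp away from the float 0.3.
print(error_in_ulps(0.1 + 0.2, 0.3))  # 1.0 on IEEE 754 binary64
```

Measuring errors in ulps rather than as absolute differences keeps the comparison meaningful across the whole exponent range, which is how accuracy requirements for floating-point operations are usually stated.
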
131 ## Past FPU Mistakes to learn from
* [Intel Underestimates Error Bounds by 1.3 quintillion](https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/)
(Random ASCII, tech blog of Bruce Dawson)
135 * [Intel overstates FPU accuracy 06/01/2013](http://notabs.org/fpuaccuracy)
136 * How not to design an ISA
137 <https://player.vimeo.com/video/450406346>
138 Meester Forsyth <http://eelpi.gotdns.org/>
The Khronos Group creates open standards for authoring and acceleration
of graphics, media, and computation. Our hybrid CPU/GPU is required to
comply with these standards *as well* as with IEEE754 in order to be
commercially competitive in both areas. Vulkan and OpenCL are the most
important of these standards. SPIR-V is also important for the
148 Thus the [[zfpacc_proposal]] has been created which permits runtime dynamic
149 switching between different accuracy levels, in userspace applications.
151 [**SPIR-V Main Page Link**](https://www.khronos.org/registry/spir-v/)
153 * [SPIR-V 1.5 Specification Revision 1](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html)
154 * [SPIR-V OpenCL Extended Instruction Set](https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.ExtendedInstructionSet.100.html)
155 * [SPIR-V GLSL Extended Instruction Set](https://www.khronos.org/registry/spir-v/specs/unified1/GLSL.std.450.html)
157 [**Vulkan Main Page Link**](https://www.khronos.org/registry/vulkan/)
159 * [Vulkan 1.1.122](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/index.html)
161 [**OpenCL Main Page**](https://www.khronos.org/registry/OpenCL/)
163 * [OpenCL 2.2 API Specification](https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_API.html)
164 * [OpenCL 2.2 Extension Specification](https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_Ext.html)
165 * [OpenCL 2.2 SPIR-V Environment Specification](https://www.khronos.org/registry/OpenCL/specs/2.2/html/OpenCL_Env.html)
* The Khronos Group released the proposed OpenCL 3.0 spec for comments in April 2020
169 * [Announcement video](https://youtu.be/h0_syTg6TtY)
170 * [Announcement video slides (PDF)](https://www.khronos.org/assets/uploads/apis/OpenCL-3.0-Launch-Apr20.pdf)
172 Note: We are implementing hardware accelerated Vulkan and
173 OpenCL while relying on other software projects to translate APIs to
174 Vulkan. E.g. Zink allows for OpenGL-to-Vulkan in software.
176 # Graphics and Compute API Stack
178 I found this informative post that mentions Kazan and a whole bunch of
179 other stuff. It looks like *many* APIs can be emulated on top of Vulkan,
180 although performance is not evaluated.
182 <https://synappsis.wordpress.com/2017/06/03/opengl-over-vulkan-dev/>
184 * Pixilica is heading up an initiative to create a RISC-V graphical ISA
186 * [Pixilica 3D Graphical ISA Slides](https://b5792ddd-543e-4dd4-9b97-fe259caf375d.filesusr.com/ugd/841f2a_c8685ced353b4c3ea20dbb993c4d4d18.pdf)
188 # 3D Graphics Texture compression software and hardware
* [Proprietary Rad Game Tools Oodle Texture Software Compression](https://web.archive.org/web/20200913122043/http://www.radgametools.com/oodle.htm)
* [Blog post by one of the engineers who developed the proprietary Rad Game Tools Oodle Texture Software Compression and the Oodle Kraken decompression software and hardware decoder used in the PS5 SSD](https://archive.vn/oz0pG)
194 # Various POWER Communities
195 - [An effort to make a 100% Libre POWER Laptop](https://www.powerpc-notebook.org/en/)
The T2080 is an NXP QorIQ SoC built on e6500 Power Architecture cores.
197 - [Power Progress Community](https://www.powerprogress.org/campaigns/donations-to-all-the-power-progress-community-projects/)
198 Supporting/Raising awareness of various POWER related open projects on the FOSS
200 - [OpenPOWER](https://openpowerfoundation.org)
Promotes and ensures compliance with the Power ISA amongst members.
202 - [OpenCapi](https://opencapi.org)
203 High performance interconnect for POWER machines. One of the big advantages
204 of the POWER architecture. Notably more performant than PCIE Gen4, and is
205 designed to be layered on top of the physical PCIE link.
206 - [OpenPOWER “Virtual Coffee” Calls](https://openpowerfoundation.org/openpower-virtual-coffee-calls/)
Truly open bi-weekly teleconference lines for anybody interested in helping
to advance or adopt the POWER architecture.
212 ## Free Silicon Conference
214 The conference brought together experts and enthusiasts who want to build
215 a complete Free and Open Source CAD ecosystem for designing analog and
216 digital integrated circuits. The conference covered the full spectrum of
217 the design process, from system architecture, to layout and verification.
219 * <https://wiki.f-si.org/index.php/FSiC2019#Foundries.2C_PDKs_and_cell_libraries>
221 * LIP6's Coriolis - a set of backend design tools:
222 <https://www-soc.lip6.fr/equipe-cian/logiciels/coriolis/>
224 Note: The rest of LIP6's website is in French, but there is a UK flag
225 in the corner that gives the English version.
227 * KLayout - Layout viewer and editor: <https://www.klayout.de/>
229 # The OpenROAD Project
231 OpenROAD seeks to develop and foster an autonomous, 24-hour, open-source
232 layout generation flow (RTL-to-GDS).
234 * <https://theopenroadproject.org/>
236 # Other RISC-V GPU attempts
238 * <https://fossi-foundation.org/2019/09/03/gsoc-64b-pointers-in-rv32>
240 * <http://bjump.org/manycore/>
242 * <https://resharma.github.io/RISCV32-GPU/>
244 TODO: Get in touch and discuss collaboration
246 # Tests, Benchmarks, Conformance, Compliance, Verification, etc.
250 RISC-V Foundation is in the process of creating an official conformance
251 test. It's still in development as far as I can tell.
253 * //TODO LINK TO RISC-V CONFORMANCE TEST
255 ## IEEE 754 Testing/Emulation
257 IEEE 754 has no official tests for floating-point but there are
258 well-known third party tools to check such as John Hauser's TestFloat.
260 There is also his SoftFloat library, which is a software emulation
261 library for IEEE 754.
263 * <http://www.jhauser.us/arithmetic/>
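
The general idea behind this kind of testing, comparing the result of an operation against an exactly computed reference, can be sketched in a few lines of Python. This is not how TestFloat itself is implemented; it is only an illustration. It assumes the host `float` is IEEE 754 binary64 in the default round-to-nearest-even mode and relies on CPython's correctly rounded integer true division when a `Fraction` is converted to `float`. The function name `add_is_correctly_rounded` is made up for the example.

```python
import random
from fractions import Fraction


def add_is_correctly_rounded(a, b):
    """Check one floating-point addition against an exact rational reference.

    Hypothetical helper: Fraction(a) + Fraction(b) is the exact sum, and
    float(...) rounds it once to the nearest binary64 value (ties to even).
    Only finite inputs whose sum does not overflow are supported.
    """
    exact = Fraction(a) + Fraction(b)
    return a + b == float(exact)


# Exercise the check on a batch of pseudo-random operands.
random.seed(0)
for _ in range(1000):
    a = random.uniform(-1e6, 1e6)
    b = random.uniform(-1e6, 1e6)
    assert add_is_correctly_rounded(a, b)
```

The same pattern extends to other operations and rounding modes, which is where a dedicated reference such as SoftFloat becomes necessary.
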
265 Jacob is also working on an IEEE 754 software emulation library written
266 in Rust which also has Python bindings:
268 * Source: <https://salsa.debian.org/Kazan-team/simple-soft-float>
269 * Crate: <https://crates.io/crates/simple-soft-float>
270 * Autogenerated Docs: <https://docs.rs/simple-soft-float/>
272 A cool paper I came across in my research is "IeeeCC754++ : An Advanced
273 Set of Tools to Check IEEE 754-2008 Conformity" by Dr. Matthias Hüsken.
275 * Direct link to PDF:
276 <http://elpub.bib.uni-wuppertal.de/servlets/DerivateServlet/Derivate-7505/dc1735.pdf>
280 OpenCL Conformance Tests
282 * <https://github.com/KhronosGroup/OpenCL-CTS>
284 Vulkan Conformance Tests
286 * <https://github.com/KhronosGroup/VK-GL-CTS>
288 MAJOR NOTE: We are **not** allowed to say we are compliant with any of
289 the Khronos standards until we actually make an official submission,
290 do the paperwork, and pay the relevant fees.
292 ## Formal Verification
Formal verification of Libre RISC-V ensures that it is bug-free with
regard to what we specify. Of course, it is important to do the formal
verification as a final step in the development process before we produce
thousands or millions of chips.
299 * Possible way to speed up our solvers for our formal proofs <https://web.archive.org/web/20201029205507/https://github.com/eth-sri/fastsmt>
301 * Algorithms (papers) submitted for 2018 International SAT Competition <https://web.archive.org/web/20201029205239/https://helda.helsinki.fi/bitstream/handle/10138/237063/sc2018_proceedings.pdf> <https://web.archive.org/web/20201029205637/http://www.satcompetition.org/>
303 Some learning resources I found in the community:
305 * ZipCPU: <http://zipcpu.com/> ZipCPU provides a comprehensive
306 tutorial for beginners and many exercises/quizzes/slides:
307 <http://zipcpu.com/tutorial/>
* Tom Verbeure's blog, which covers Western Digital's SweRV CPU among
other topics (I recommend looking at all the posts): <https://tomverbeure.github.io/>
310 * <https://tomverbeure.github.io/risc-v/2018/11/19/A-Bug-Free-RISC-V-Core-without-Simulation.html>
311 * <https://tomverbeure.github.io/rtl/2019/01/04/Under-the-Hood-of-Formal-Verification.html>
315 * <https://www.ohwr.org/project/wishbone-gen>
319 ## Adding new instructions:
321 * <https://archive.fosdem.org/2015/schedule/event/llvm_internal_asm/>
325 * <https://danluu.com/branch-prediction/>
329 * [Migen - a Python RTL](https://jeffrey.co.in/blog/2014/01/d-flip-flop-using-migen/)
* [LiteX](https://github.com/timvideos/litex-buildenv/wiki/LiteX-for-Hardware-Engineers)
An SoC builder written in the Python Migen DSL. Allows you to generate functional
RTL for an SoC configured with cache, a RISC-V core, Ethernet, DRAM support,
and parameterizable CSRs.
* [Migen Tutorial](http://blog.lambdaconcept.com/doku.php?id=migen:tutorial)
* There is a great guy, Robert Baruch, who has a good
[tutorial](https://github.com/RobertBaruch/nmigen-tutorial) on nMigen.
He also built an FPGA-proven Motorola 6800 CPU clone with nMigen and put
338 [the code](https://github.com/RobertBaruch/n6800) and
339 [instructional videos](https://www.youtube.com/playlist?list=PLEeZWGE3PwbbjxV7_XnPSR7ouLR2zjktw)
341 * [Minerva](https://github.com/lambdaconcept/minerva)
A 32-bit RISC-V soft processor written in the Python nMigen DSL
343 * Minerva example using nmigen-soc
344 <https://github.com/jfng/minerva-examples/blob/master/hello/core.py>
345 * [Using our Python Unit Tests(old)](http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-March/000705.html)
346 * <https://chisel.eecs.berkeley.edu/api/latest/chisel3/util/DecoupledIO.html>
347 * <http://www.clifford.at/papers/2016/yosys-synth-formal/slides.pdf>
351 * <https://debugger.medium.com/why-is-apples-m1-chip-so-fast-3262b158cba2> N1
352 * <https://codeberg.org/tok/librecell> Libre Cell Library
353 * <https://wiki.f-si.org/index.php/FSiC2019>
354 * <https://fusesoc.net>
355 * <https://www.lowrisc.org/open-silicon/>
356 * <http://fpgacpu.ca/fpga/Pipeline_Skid_Buffer.html> pipeline skid buffer
357 * <https://pyvcd.readthedocs.io/en/latest/vcd.gtkw.html> GTKwave
358 * <http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_Resets.pdf>
359 Synchronous Resets? Asynchronous Resets? I am so confused! How will I
360 ever know which to use? by Clifford E. Cummings
361 * <http://www.sunburst-design.com/papers/CummingsSNUG2008Boston_CDC.pdf>
362 Clock Domain Crossing (CDC) Design & Verification Techniques Using
363 SystemVerilog, by Clifford E. Cummings
364 In particular, see section 5.8.2: Multi-bit CDC signal passing using
365 1-deep / 2-register FIFO synchronizer.
366 * <http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-143.pdf>
367 Understanding Latency Hiding on GPUs, by Vasily Volkov
368 * Efabless "Openlane" <https://github.com/efabless/openlane>
369 * Co-simulation plugin for verilator, transferring to ECP5
370 <https://github.com/vmware/cascade>
371 * Multi-read/write ported memories
372 <https://tomverbeure.github.io/2019/08/03/Multiport-Memories.html>
373 * Data-dependent fail-on-first aka "Fault-tolerant speculative vectorisation"
374 <https://arxiv.org/pdf/1803.06185.pdf>
375 * OpenPOWER Foundation Membership
376 <https://openpowerfoundation.org/membership/how-to-join/membership-kit-9-27-16-4/>
377 * Clock switching (and formal verification)
378 <https://zipcpu.com/formal/2018/05/31/clkswitch.html>
379 * Circuit of Compunit <http://home.macintosh.garden/~mepy2/libre-soc/comp_unit_req_rel.html>
380 * Circuitverse 16-bit <https://circuitverse.org/users/17603/projects/54486>
381 * Nice example model of a Tomasulo-based architecture, with multi-issue, in-order issue, out-of-order execution, in-order commit, with reservation stations and reorder buffers, and hazard avoidance.
382 <https://www.brown.edu/Departments/Engineering/Courses/En164/Tomasulo_10.pdf>
383 # Real/Physical Projects
385 * [Samuel's KC5 code](http://chiselapp.com/user/kc5tja/repository/kestrel-3/dir?ci=6c559135a301f321&name=cores/cpu)
386 * <https://chips4makers.io/blog/>
387 * <https://hackaday.io/project/7817-zynqberry>
388 * <https://github.com/efabless/raven-picorv32>
389 * <https://efabless.com>
390 * <https://efabless.com/design_catalog/default>
391 * <https://wiki.f-si.org/index.php/The_Raven_chip:_First-time_silicon_success_with_qflow_and_efabless>
392 * <https://mshahrad.github.io/openpiton-asplos16.html>
394 # ASIC tape-out pricing
396 * <https://europractice-ic.com/wp-content/uploads/2020/05/General-MPW-EUROPRACTICE-200505-v8.pdf>
400 * <https://toyota-ai.ventures/>
401 * [NLNet Applications](http://bugs.libre-riscv.org/buglist.cgi?columnlist=assigned_to%2Cbug_status%2Cresolution%2Cshort_desc%2Ccf_budget&f1=cf_nlnet_milestone&o1=equals&query_format=advanced&resolution=---&v1=NLnet.2019.02)
403 # Good Programming/Design Practices
405 * [Liskov Substitution Principle](https://en.wikipedia.org/wiki/Liskov_substitution_principle)
406 * [Principle of Least Astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment)
407 * <https://peertube.f-si.org/videos/watch/379ef007-40b7-4a51-ba1a-0db4f48e8b16>
408 * [Rust-Lang Philosophy and Consensus](http://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/)
410 * <https://youtu.be/o5Ihqg72T3c>
411 * <http://flopoco.gforge.inria.fr/>
412 * Fundamentals of Modern VLSI Devices
413 <https://groups.google.com/a/groups.riscv.org/d/msg/hw-dev/b4pPvlzBzu0/7hDfxArEAgAJ>
417 * <https://www.crnhq.org/cr-kit/>
421 * <https://github.com/Isotel/mixedsim>
422 * <http://www.vlsiacademy.org/open-source-cad-tools.html>
423 * <http://ngspice.sourceforge.net/adms.html>
424 * <https://en.wikipedia.org/wiki/Verilog-AMS#Open_Source_Implementations>
426 # Libre-SOC Standards
428 This list auto-generated from a page tag "standards":
430 [[!inline pages="tagged(standards)" actions="no" archive="yes" quick="yes"]]
434 * [[resources/server-setup/web-server]]
435 * [[resources/server-setup/git-mirroring]]
436 * [[resources/server-setup/nagios-monitoring]]
440 * <https://www.fed4fire.eu/testbeds/>
442 # Really Useful Stuff
444 * <https://github.com/im-tomu/fomu-workshop/blob/master/docs/requirements.txt>
445 * <https://github.com/im-tomu/fomu-workshop/blob/master/docs/conf.py#L39-L47>
* <https://store.digilentinc.com/pmod-sf3-32-mb-serial-nor-flash/>
* <https://store.digilentinc.com/arty-a7-artix-7-fpga-development-board-for-makers-and-hobbyists/>
* <https://store.digilentinc.com/pmod-vga-video-graphics-array/>
* <https://store.digilentinc.com/pmod-microsd-microsd-card-slot/>
* <https://store.digilentinc.com/pmod-rtcc-real-time-clock-calendar/>
* <https://store.digilentinc.com/pmod-i2s2-stereo-audio-input-and-output/>
456 # CircuitJS experiments
458 * [[resources/high-speed-serdes-in-circuitjs]]
460 # ASIC Timing and Design flow resources
462 * <https://www.linkedin.com/pulse/asic-design-flow-introduction-timing-constraints-mahmoud-abdellatif/>
463 * <https://www.icdesigntips.com/2020/10/setup-and-hold-time-explained.html>
464 * <https://www.vlsiguide.com/2018/07/clock-tree-synthesis-cts.html>
465 * <https://en.wikipedia.org/wiki/Frequency_divider>
467 # Geometric Haskell Library
469 * <https://github.com/julialongtin/hslice/blob/master/Graphics/Slicer/Math/GeometricAlgebra.hs>
470 * <https://github.com/julialongtin/hslice/blob/master/Graphics/Slicer/Math/PGA.hs>
471 * <https://arxiv.org/pdf/1501.06511.pdf>
472 * <https://bivector.net/index.html>