(no commit message)
[libreriscv.git] / nlnet_2019_formal.mdwn
1 # NL.net proposal 2019-10-046
2
3 ## Project name
4
5 The Libre RISC-V SoC, Formal Correctness Proofs
6
7 ## Website / wiki
8
9 <https://libre-riscv.org/nlnet_2019_formal>
10
11 Please be short and to the point in your answers; focus primarily on
12 the what and how, not so much on the why. Add longer descriptions as
13 attachments (see below). If English isn't your first language, don't
14 worry - our reviewers don't care about spelling errors, only about
15 great ideas. We apologise for the inconvenience of having to submit in
16 English. On the up side, you can be as technical as you need to be (but
17 you don't have to). Do stay concrete. Use plain text in your reply only,
18 if you need any HTML to make your point please include this as attachment.
19
20 ## Abstract: Can you explain the whole project and its expected outcome(s).
21
22 The Libre RISCV SoC is being developed to provide a privacy-respecting
23 modern processor, developed transparently and as libre to the bedrock
24 as possible.
25
26 The entire hardware design is libre licensed so that independent third
27 party audits may be carried out, in order for endusers to have a degree of
28 confidence that their privacy is not being violated through the addition
29 of spying backdoor coprocessors or simply through hardware implementation
30 oversights (SPECTRE, Meltdown, Intel Pentium FPU Divide Bug and so on)
31
32 However: not only is that an extremely daunting task, but "quis custodiet
33 custodiens?" Who guards the guards?
34
35 A solution to this problem is to provide formal mathematical correctness
36 proofs at every level of the hardware design. With mathematically
37 inviolate proofs, even an enduser can run the tests for themselves.
38
39 # Have you been involved with projects or organisations relevant to this project before? And if so, can you tell us a bit about your contributions?
40
41 Luke Leighton is an ethical technology specialist who has a consistent
42 24-year track record of developing code in a real-time transparent
43 (fully libre) fashion, and in managing Software Libre teams. He is the
44 lead developer on the Libre RISC-V SoC.
45
46 Dan Leighton is a software entrepreneur and experienced tech leader. He
47 has worked on delivering innovative open source software projects for
48 over 25 years in education, publishing and internet infrastructure. He
49 specialises in bridging the gap between engineering and commercial teams.
50
51 # Requested Amount
52
53 EUR 50,000.
54
55 # Explain what the requested budget will be used for?
56
57 Working with mathematically-minded Software Engineers, every module in
58 the Libre RISC-V SoC will have a "formal proof unit test written". This
59 is an unusual design choice: most hardware designs will have monte carlo
60 and corner case unit tests etc. written which, unfortunately, are both
61 complex (and a distraction) and often incomplete.
62
63 Examples include the IEEE754 Floating Point Unit, where in the 1990s
64 Intel managed to introduce an actual hardware division bug. We seek to
65 formally *prove* that the output from the FP Divide unit outputs the
66 correct answer, for all possible inputs.
67
68 There are other areas which can benefit from correctness proofs,
69 at the low level: pipelines, FIFOs, the basic building blocks of a
70 processor. nmigen, interestingly, already has a formal correctness
71 proof for its FIFO library due to the complexity of testing FIFOs.
72 On this stable foundation the higher level capabilities will then also
73 get their own proofs.
74
75 Finally a high level formal proof will be run, which already exists in
76 the form of "official" RISC-V Conformance Tests (if it does not depend
77 on proprietary software), as well as the unofficial formal correctness
78 test suite from SymbioticEDA.
79
80 Throughout this process, bugs will be found, including in code
81 already written. These will require fixing, where previously, with
82 non-mathematical unit tests, it was believed that the work was completed.
83
84 # Does the project have other funding sources, both past and present?
85
86 The overall project has sponsorship from Purism as well as a prior grant
87 from NLNet. However that is for specifically covering the development
88 of the RTL (the hardware source code).
89
90 The formal correctness testing requires specialist expertise involving
91 formal logic mathematical training, which is a different skillset from
92 hardware design. Our initial proposal does not cover this scope.
93
94 Also not covered in the initial funding is the bugfixing that will be
95 required should the more rigorous formal proofs discover any issues.
96
97 # Compare your own project with existing or historical efforts.
98
99 There do exist high level formal RISC-V Correctness proofs in various
100 forms. One of these is the SymbioticEDA formal RISC-V proof which can
101 for example test the Register File, and test that the integer operation
102 is correct and so on, working its way through all operations one by one.
103 This however is at a high level.
104
105 The Kestrel 53000 Series of embedded controllers have some formal unit
106 tests written in verilog, at the lowest level. We are following their
107 development and porting to nmigen closely, and consulting with their
108 part time developer.
109
110 A massive comprehensive suite of formal correctness proofs for a
111 processor of the scope and size of the Libre RISC-V SoC is just not
112 normally done. The only reason we are considering it is because of the
113 dramatic simplification of unit tests that the approach brings, and the
114 mathematically inviolate guarantees it brings for endusers and developers.
115
116 ## What are significant technical challenges you expect to solve during the project, if any?
117
118 This project is critically dependent on having software engineers that
119 have the mathematical acumen for formal logic correctness proofs. This
120 is quite a rare combination.
121
122 In addition, we are using an unusual choice of HDL: a python based tool
123 called nmigen. Fortunately its backend is yosys, which has well known
124 industry recognised links to formal proof libraries, through symbiyosys.
125
126 We will need to train engineers to adapt to the unique mathematics,
127 or train mathematicians to the unique software quirks.
128
129 Luckily we have several examples already, in the form of the work carried
130 out by the developer of the Kestrel 53000 series of CPUs.
131
132 The other main challenge is that as the size of the code being tested
133 goes up, the CPU resources required go up exponentially. At the low level
134 it is fine: tests can take several hours on a standard high end desktop.
135 However as things progress to larger levels we may actually need access
136 to Beowulf Clusters or other supercomputing resources.
137
138 ## Describe the ecosystem of the project, and how you will engage with relevant actors and promote the outcomes?
139
140 As mentioned in the 2018 submission, the Libre RISC-V
141 SoC has a full set of resources for Libre Project Management and development:
142 mailing list, bugtracker, git repository and wiki - all listed here:
143 <https://libre-riscv.org/>
144
145 In addition, we have a Crowdsupply page
146 <https://www.crowdsupply.com/libre-risc-v/m-class> which provides a public
147 gateway, and heise.de, reddit, phoronix, slashdot and other locations have
148 all picked up the story. The list is updated and maintained here:
149 <https://libre-riscv.org/3d_gpu/>
150
151 # Extra info to be submitted
152
153 * <http://libre-riscv.org/3d_gpu/>
154 * <https://nlnet.nl/project/Libre-RISCV/>
155 * <https://symbiyosys.readthedocs.io>
156
157 # Management Summary
158
159 The Libre RISC-V SoC Project, https://nlnet.nl/project/Libre-RISCV/, is funded by NLNet to reach ASIC-proven status. As of Dec 2019 It has been in development for a year, and writing comprehensive unit tests has been both a critical part of that process and a major part of the time taken. Formal Mathematical Proofs turn out to be critical for several reasons: firstly, they are simpler to read and much more comprehensive (100% coverage), saving hugely on development and maintenance; secondly, they're mathematically inviolate. From a security and trust perspective, both aspects are extremely important. Firstly: security mistakes are often accidental due to complexity: a reduction in complexity helps avoid mistakes. Secondly: independent auditing of the processor is a matter of running the formal proofs. This proposal therefore not only saves on development time, it helps us meet the goal of developing a privacy-respecting processor in a way that is *independently* verifiable.