Migen (Milkymist Generator)
a Python toolbox for building complex digital hardware
======================================================

Background
==========
Even though the Milkymist system-on-chip [1] is technically successful,
it suffers from several limitations stemming from its implementation in
manually written Verilog HDL:

(1) The "event-driven" paradigm of today's dominant hardware description
languages (Verilog and VHDL, collectively referred to as "V*HDL" in the
rest of this document) is often too general. Today's FPGA architectures
are optimized for the implementation of fully synchronous circuits. This
means that the bulk of the code for an efficient FPGA design falls into
three categories:
  (a) Combinatorial statements
  (b) Synchronous statements
  (c) Initialization of registers at reset
V*HDL does not follow this organization. This means that a lot of
repetitive manual coding is needed, which brings sources of human errors,
petty issues, and confusion for beginners:
  - wire vs. reg in Verilog
  - forgetting to initialize a register at reset
  - deciding whether a combinatorial statement must go into a
    process/always block or not
  - simulation mismatches with combinatorial processes/always blocks
  - and more...
A little-known fact about FPGAs is that many of them have the ability to
initialize their registers from the bitstream contents. This can be done
in a portable and standard way using an "initial" block in Verilog, and
by assigning a value at the signal declaration in VHDL. This renders an
explicit reset signal unnecessary in practice in some cases, which opens
the way for further design optimization. However, this form of
initialization is not synthesizable at all for ASIC targets, and it is
not easy to switch between the two forms of reset using V*HDL.

(2) V*HDL support for composite types is very limited. Signals having a
record type in VHDL are unidirectional, which makes them clumsy to use
e.g. in bus interfaces. There is no record type support in Verilog, which
means that a lot of copy-and-paste has to be done when forwarding grouped
signals.

(3) V*HDL support for procedurally generated logic is extremely limited.
The most advanced forms of procedural generation of synthesizable logic
that V*HDL offers are CPP-style directives in Verilog, combinatorial
functions, and generate statements. Nothing really fancy, and it shows.
To give a few examples:
  - Building highly flexible bus interconnect is not possible. Even
    arbitrating any given number of bus masters for commonplace protocols
    such as Wishbone cannot be done with the tools that V*HDL puts at our
    disposal. This requires manual recoding of parts of the arbiter to add
    or remove a master, which is tedious and often causes frustrating
    errors. Each occurrence of the latter can easily cause one or two hours
    of lost productivity when combined with the long compilation times of
    moderately complex system-on-chip designs.
  - Building a memory infrastructure (including bus interconnect, bridges
    and caches) that can automatically adapt itself at compile-time to any
    word size of the SDRAM is clumsy and tedious.
  - Building register banks for control, status and interrupt management
    of cores can also largely benefit from automation.
  - Many hardware acceleration problems can fit into the dataflow
    programming model. Manual dataflow implementation in V*HDL has, again,
    a lot of redundancy and potential for human errors. See the Milkymist
    texture mapping unit [3][4] for an example of this. The amount of
    detail to deal with manually also makes the design space exploration
    difficult, and therefore hinders the design of efficient architectures.
  - Pre-computation of values, such as filter coefficients for DSP or
    even simply trigonometric tables, must often be done using external
    tools whose results are copy-and-pasted (in the best cases,
    automatically) into the V*HDL source.

Enter Migen, a Python toolbox for building complex digital hardware. We
could have designed a brand new programming language, but that would have
been reinventing the wheel instead of being able to benefit from Python's
rich features and immense library. The price to pay is a slightly
cluttered syntax at times when writing descriptions in FHDL, but we
believe this is totally acceptable, particularly when compared to VHDL ;-)

Migen is made up of several related components, which are briefly
described below.

Migen FHDL
==========
The Fragmented Hardware Description Language (FHDL) is the lowest layer
of Migen. It consists of a formal system to describe signals, and
combinatorial and synchronous statements operating on them. The formal
system itself is low level and close to the synthesizable subset of
Verilog, and we then rely on Python algorithms to build complex
structures by combining FHDL elements and encapsulating them in
"fragments".
The FHDL module also contains a back-end to produce synthesizable
Verilog, and some basic analysis functions. It would be possible to
develop a VHDL back-end as well, though more difficult than for Verilog -
we are "cheating" a bit now as Verilog provides most of the FHDL
semantics.

FHDL differs from MyHDL [2] in fundamental ways. MyHDL follows the
event-driven paradigm of traditional HDLs (see Background, #1) while FHDL
separates the code into combinatorial statements, synchronous statements,
and reset values. In MyHDL, the logic is described directly in the Python
AST. The converter to Verilog or VHDL then examines the Python AST and
recognizes a subset of Python that it translates into V*HDL statements.
This seriously impedes the capability of MyHDL to generate logic
procedurally. With FHDL, you manipulate a custom AST from Python, and you
can more easily design algorithms that operate on it.

FHDL is made of several elements, which are briefly explained below.

BV
--
The bit vector (BV) object defines if a constant or signal is signed or
unsigned, and how many bits it has. This is useful e.g. to:
  - determine when to perform sign extension (FHDL uses the same rules as
    Verilog).
  - determine the size of registers.
  - determine how many bits should be used by each value in
    concatenations.

Constant
--------
This object should be self-explanatory. All constant objects contain a BV
object and a value. If no BV object is specified, one will be made up
using the following rules:
  - If the value is positive, the BV is unsigned and has the minimum
    number of bits needed to represent the constant's value in the
    canonical base-2 system.
  - If the value is negative, the BV is signed, and has the minimum
    number of bits needed to represent the constant's value in the
    canonical two's complement, base-2 system.
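
The two rules above can be sketched in plain Python, without Migen. The
function name infer_bv and its (width, signed) return value are made up
for this illustration, and giving zero a single bit is an assumption,
since the rules only mention positive and negative values.

```python
def infer_bv(value):
    """Return (width, signed) for an integer constant, per the two rules above."""
    if value >= 0:
        # Unsigned: minimum number of bits in plain base 2.
        # Assumption: zero is given one bit.
        return max(value.bit_length(), 1), False
    # Signed: minimum number of bits in two's complement.
    # ~value equals -value - 1; its bit length plus one sign bit is enough.
    return (~value).bit_length() + 1, True


print(infer_bv(5))     # (3, False)
print(infer_bv(-1))    # (1, True)
print(infer_bv(-128))  # (8, True)
```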

Signal
------
The signal object represents a value that is expected to change in the
circuit. It does exactly what Verilog's "wire" and "reg" and VHDL's
"signal" and "variable" do.

The main point of the signal object is that it is identified by its
Python ID (as returned by the id() function), and nothing else. It is the
responsibility of the V*HDL back-end to establish an injective mapping
between Python IDs and the V*HDL namespace. It should perform name
mangling to ensure this. The consequence of this is that signal objects
can safely become members of arbitrary Python classes, or be passed as
parameters to functions or methods that generate logic involving them.

The properties of a signal object are:
  - a bit vector description.
  - a name, used as a hint for the V*HDL back-end name mangler.
  - a boolean "variable". If true, the signal will behave like a VHDL
    variable, or a Verilog reg that uses blocking assignment. This
    parameter only has an effect when the signal's value is modified in a
    synchronous statement.
  - the signal's reset value. It must be an integer, and defaults to 0.
    When the signal's value is modified with a synchronous statement, the
    reset value is the initialization value of the associated register.
    When the signal is assigned to in a conditional combinatorial
    statement (If or Case), the reset value is the value that the signal
    has when no condition that causes the signal to be driven holds. This
    enforces the absence of latches in designs. If the signal is
    permanently driven using a combinatorial statement, the reset value
    has no effect.

The sole purpose of the name property is to make the generated V*HDL code
easier to understand and debug. From a purely functional point of view,
it is perfectly OK to have several signals with the same name property.
The back-end will generate a unique name for each object. If no name
property is specified, Migen will analyze the code that created the
signal object, and try to extract the variable or member name from there.
It then uses the module name that created the signal, an underscore, and
the variable name. For example, if we are in module "foo", the following
statements will create one or several signal(s) named "foo_bar":
    bar = Signal()
    self.bar = Signal()
    self.baz.bar = Signal()
    bar = [Signal() for x in range(42)]
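
How a back-end could turn non-unique name hints into unique identifiers
can be sketched as follows. This is only an illustration of the idea of
an injective mapping keyed on id(); the Namespace class, its get_name()
method and the numeric suffix scheme are hypothetical and are not taken
from the Migen sources.

```python
class Namespace:
    """Hypothetical name mangler: maps object identities to unique names."""

    def __init__(self):
        self._names = {}      # id(object) -> assigned name
        self._used = set()    # names already handed out
        self._keepalive = []  # keep objects alive so that ids are not reused

    def get_name(self, obj, hint):
        key = id(obj)
        if key not in self._names:
            # Try the bare hint first, then hint_1, hint_2, ... until free.
            name, n = hint, 0
            while name in self._used:
                n += 1
                name = "{}_{}".format(hint, n)
            self._used.add(name)
            self._names[key] = name
            self._keepalive.append(obj)
        return self._names[key]


ns = Namespace()
first, second = object(), object()    # stand-ins for two signal objects
print(ns.get_name(first, "foo_bar"))   # foo_bar
print(ns.get_name(second, "foo_bar"))  # foo_bar_1
print(ns.get_name(first, "foo_bar"))   # foo_bar
```

Two distinct objects never share a name, and asking again for the same
object always returns the name it was first given.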

Operators
---------
Operators are represented by the _Operator object, which generally should
not be used directly. Instead, most FHDL objects overload the usual
Python logic and arithmetic operators, which allows a much lighter syntax
to be used. For example, the expression:
    a * b + c
is equivalent to:
    _Operator('+', [_Operator('*', [a, b]), c])
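
The overloading technique itself can be shown with a few lines of plain
Python. The classes Value, Sig and Op below are hypothetical stand-ins
written for this sketch; they are not the FHDL classes, and only "+" and
"*" are handled.

```python
class Value:
    """Hypothetical base class: arithmetic on values builds a tree."""

    def __add__(self, other):
        return Op("+", [self, other])

    def __mul__(self, other):
        return Op("*", [self, other])


class Sig(Value):
    """Leaf of the tree, identified here by a name for printing only."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class Op(Value):
    """Inner node: an operator symbol and its list of operands."""

    def __init__(self, op, operands):
        self.op = op
        self.operands = operands

    def __repr__(self):
        return "Op({!r}, {!r})".format(self.op, self.operands)


a, b, c = Sig("a"), Sig("b"), Sig("c")
expr = a * b + c
print(expr)  # Op('+', [Op('*', [a, b]), c])
```

Python's own operator precedence decides the shape of the tree, so the
multiplication ends up nested inside the addition without extra work.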

Slices
------
Likewise, slices are represented by the _Slice object, which should
generally be avoided in favor of the Python slice operation [x:y].
Implicit indices using the forms [x], [x:] and [:y] are supported.
Beware! Slices work like Python slices, not like VHDL or Verilog slices.
The first bound is the index of the LSB and is inclusive. The second
bound is the index of the MSB and is exclusive. In V*HDL, bounds are
MSB:LSB and both are inclusive.
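
The difference between the two conventions can be made concrete with a
small sketch on plain integers. The helper names slice_bits and
verilog_range are invented for this example.

```python
def slice_bits(value, start, stop):
    """Extract bits start..stop-1 of a non-negative integer (Python-style bounds)."""
    width = stop - start
    return (value >> start) & ((1 << width) - 1)


def verilog_range(start, stop):
    """Render a Python-style [start:stop] slice as a Verilog [MSB:LSB] range."""
    return "[{}:{}]".format(stop - 1, start)


print(verilog_range(0, 8))                # [7:0]
print(verilog_range(2, 6))                # [5:2]
print(bin(slice_bits(0b10110100, 2, 6)))  # 0b1101
```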

Concatenations
--------------
Concatenations are done using the Cat object. To make the syntax lighter,
its constructor takes a variable number of arguments, which are the
signals to be concatenated together (you can use the Python '*' operator
to pass a list instead).
To be consistent with slices, the first signal is connected to the bits
with the lowest indices in the result. This is the opposite of the way
the '{}' construct works in Verilog.
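
The bit ordering can be checked with a plain-integer model. The function
cat below is a hypothetical helper working on (value, width) pairs; it
only mimics the ordering rule and is not the Cat object.

```python
def cat(*parts):
    """Concatenate (value, width) pairs; the first pair lands in the lowest bits."""
    result, offset = 0, 0
    for value, width in parts:
        result |= (value & ((1 << width) - 1)) << offset
        offset += width
    return result, offset


# With a = 4'hA and b = 4'h5, Cat(a, b) corresponds to {b, a} in Verilog.
value, width = cat((0xA, 4), (0x5, 4))
print(hex(value), width)  # 0x5a 8
```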

Replications
------------
The Replicate object represents the equivalent of {count{expression}} in
Verilog.

Assignments
-----------
Assignments are represented with the _Assign object. Since using it
directly would result in a cluttered syntax, the preferred technique for
assignments is to use the be() method provided by objects that can have a
value assigned to them. These are signals, and their combinations with
the slice and concatenation operators.
As an example, the statement:
    a[0].be(b)
is equivalent to:
    _Assign(_Slice(a, 0, 1), b)

If statement
------------
The If object takes a first parameter which must be an expression
(combination of the Constant, Signal, _Operator, _Slice, etc. objects)
representing the condition, then a variable number of parameters
representing the statements (_Assign, If, Case, etc. objects) to be
executed when the condition is true.

The If object defines an Else() method, which when called defines the
statements to be executed when the condition is not true. Those
statements are passed as parameters to the variadic method.

For convenience, there is also an Elif() method.

Example:
    If(tx_count16 == 0,
        tx_bitcount.be(tx_bitcount + 1),
        If(tx_bitcount == 8,
            self.tx.be(1)
        ).Elif(tx_bitcount == 9,
            self.tx.be(1),
            tx_busy.be(0)
        ).Else(
            self.tx.be(tx_reg[0]),
            tx_reg.be(Cat(tx_reg[1:], 0))
        )
    )

Case statement
--------------
The Case object constructor takes as first parameter the expression to be
tested, then a variable number of lists describing the various cases.

Each list contains an expression (typically a constant) describing the
value to be matched, followed by the statements to be executed when there
is a match. The head of the list can be an instance of the Default
object.

Instances
---------
Instance objects represent the parametrized instantiation of a V*HDL
module, and the connection of its ports to FHDL signals. They are useful
in a number of cases:
  - reusing legacy or third-party V*HDL code.
  - using special FPGA features (DCM, ICAP, ...).
  - implementing logic that cannot be expressed with FHDL (asynchronous
    circuits, ...).
  - breaking down a Migen system into multiple sub-systems, possibly
    using different clock domains.

The properties of the instance object are:
  - the type of the instance (i.e. name of the instantiated module).
  - a list of output ports of the instantiated module. Each element of
    the list is a pair containing a string, which is the name of the
    module's port, and either an existing signal (to which the port will
    be connected) or a BV (which will cause the creation of a new
    signal).
  - a list of input ports (likewise).
  - a list of (name, value) pairs for the parameters ("generics" in VHDL)
    of the module.
  - the name of the clock port of the module (if any). If this is
    specified, the port will be connected to the system clock.
  - the name of the reset port of the module (likewise).
  - the name of the instance (can be mangled like signal names).

Memories
--------
Memories (on-chip SRAM) are not supported, but will be soon, using a
mechanism similar to instances. (TODO)

Fragments
---------
A "fragment" is a unit of logic, which is composed of:
  - a list of combinatorial statements.
  - a list of synchronous statements.
  - a list of instances.
  - a list of memories.
  - a set of pads, which are signals intended to be connected to
    off-chip devices.

Fragments can reference arbitrary signals, including signals that are
referenced in other fragments. Fragments can be combined using the "+"
operator, which returns a new fragment containing the concatenation of
each pair of lists.
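
A minimal stand-in for this container, with statements represented as
plain strings, could look like the sketch below. The class is written for
this example only; its constructor arguments and attribute names are
assumptions, as is merging the two pad sets with a set union.

```python
class Fragment:
    """Hypothetical stand-in holding the five collections listed above."""

    def __init__(self, comb=None, sync=None, instances=None,
                 memories=None, pads=None):
        self.comb = list(comb or [])
        self.sync = list(sync or [])
        self.instances = list(instances or [])
        self.memories = list(memories or [])
        self.pads = set(pads or [])

    def __add__(self, other):
        # Lists are concatenated pairwise; the pad sets are merged.
        return Fragment(
            self.comb + other.comb,
            self.sync + other.sync,
            self.instances + other.instances,
            self.memories + other.memories,
            self.pads | other.pads)


uart = Fragment(comb=["tx.be(tx_reg[0])"], pads={"tx"})
timer = Fragment(sync=["count.be(count + 1)"])
system = uart + timer
print(system.comb, system.sync, sorted(system.pads))
```

Since "+" returns a new object and leaves its operands untouched, the
same fragment can be reused in several combinations.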

Fragments can be passed to the back-end for conversion to Verilog.

By convention, classes that generate logic implement a method called
"get_fragment". When called, this method builds a new fragment
implementing the desired functionality of the class, and returns it. This
convention allows fragments to be built automatically by combining the
fragments from all relevant objects in the local scope, by using the
autofragment module.

Migen Core Logic
================
Migen Core Logic is a convenience library of common logic circuits
implemented using FHDL:
  - a multi-cycle integer divider.
  - a round-robin arbiter, useful to build bus arbiters.
  - a multiplexer bank (multimux), useful to multiplex composite
    (grouped) signals.
  - a condition-triggered static scheduler of FHDL synchronous statements
    (timeline).

Migen Bus
=========
Migen Bus contains classes providing a common structure for master and
slave interfaces of the following buses:
  - Wishbone [5], the general purpose bus recommended by Opencores.
  - CSR-NG, a low-bandwidth, resource-sensitive bus designed for
    accessing the configuration and status registers of cores from
    software.
  - FastMemoryLink-NG, a split-transaction bus optimized for use with a
    high-performance, out-of-order SDRAM controller. (TODO)

It also provides interconnect components for these buses, such as
arbiters and address decoders. The strength of Migen's procedurally
generated logic can be illustrated by the following example:
    wbcon = wishbone.InterconnectShared(
        [cpu.ibus, cpu.dbus, ethernet.dma, audio.dma],
        [(0, norflash.bus), (1, wishbone2fml.wishbone),
         (3, wishbone2csr.wishbone)])
In this example, the interconnect component generates a 4-way round-robin
arbiter, multiplexes the master bus signals into a shared bus, determines
that the address decoding must occur on 2 bits, and connects all slave
interfaces to the shared bus, inserting the address decoder logic in the
bus cycle qualification signals and multiplexing the data return path. It
can recognize the signals in each core's bus interface thanks to the
common structure mandated by Migen Bus. All this happens automatically,
using only that much user code. The resulting interconnect logic can be
retrieved using wbcon.get_fragment(), and combined with the fragments
from the rest of the system.
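
Two of the decisions described above can be modelled in software: how
many address bits the decoder needs, and which master a round-robin
arbiter grants next. Both functions below are behavioral sketches with
invented names. The exact policy of the Migen arbiter (for instance,
whether the current master keeps the bus while it is still requesting) is
not described here, so the model is an assumption.

```python
def decoder_bits(slave_addresses):
    """Number of address bits needed to tell the given slave numbers apart."""
    return max(max(slave_addresses).bit_length(), 1)


def round_robin(requests, last_grant):
    """Grant the next requesting master after last_grant, wrapping around.

    requests is a list of booleans, one per master. Returns the index of
    the granted master, or last_grant if nobody is requesting.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_grant + offset) % n
        if requests[candidate]:
            return candidate
    return last_grant


print(decoder_bits([0, 1, 3]))                               # 2
print(round_robin([True, False, True, True], last_grant=0))  # 2
```

The slave numbers 0, 1 and 3 from the example need two address bits,
which matches the figure given in the text.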

Migen Bank
==========
Migen Bank is a system comparable to wishbone-gen [6], which automates
the creation of configuration and status register banks and
(TODO) interrupt/event managers implemented in cores.

Bank takes a description made up of a list of registers and generates
logic implementing it with a slave interface compatible with Migen Bus.

A register can be "raw", which means that the core has direct access to
it. It also means that the register width must be less than or equal to
the bus word width. In that case, the register object provides the
following signals:
  - dev_r, which contains the data written from the bus interface.
  - dev_re, which is the strobe signal for dev_r. It is active for one
    cycle, after or during a write from the bus. dev_r is only valid when
    dev_re is high.
  - dev_w, which must provide at all times the value to be read from the
    bus.

Registers that are not raw are managed by Bank and contain fields. If the
sum of the widths of all fields attached to a register exceeds the bus
word width, the register will automatically be sliced into words of the
maximum size and implemented at consecutive bus addresses, MSB first.
Field objects have two parameters, access_bus and access_dev, determining
respectively the access policies for the bus and core sides. They can
take the values READ_ONLY, WRITE_ONLY and READ_WRITE.
If the device can read, the field object provides the dev_r signal, which
contains at all times the current value of the field (kept by the logic
generated by Bank).
If the device can write, the field object provides the following signals:
  - dev_w, which provides the value to be written into the field.
  - dev_we, which strobes the value into the field.
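
The slicing of a wide register into bus words can be illustrated on a
plain integer. The function split_msb_first is a made-up helper. The text
only says "MSB first", so placing the incomplete word (when the register
width is not a multiple of the bus width) at the first address is an
assumption of this sketch.

```python
def split_msb_first(value, total_width, bus_width):
    """Split a register value into bus words, most significant word first.

    Assumption: when total_width is not a multiple of bus_width, the
    partial word is the most significant one (the first address).
    """
    nwords = -(-total_width // bus_width)  # ceiling division
    words = []
    for i in reversed(range(nwords)):
        words.append((value >> (i * bus_width)) & ((1 << bus_width) - 1))
    return words


# A 20-bit register on an 8-bit bus occupies three consecutive addresses.
print([hex(w) for w in split_msb_first(0xABCDE, 20, 8)])
# ['0xa', '0xbc', '0xde']
```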

Migen Flow (TODO)
=================
Many hardware acceleration problems can be expressed in the dataflow
paradigm, that is, using a directed graph representing the flow of data
between actors.

Actors in Migen are written directly in FHDL. This maximizes the
flexibility: for example, an actor can implement a DMA master to read
data from system memory. It is conceivable that a CAL [7] to FHDL
compiler be implemented at some point, to support higher level
descriptions of some actors and reuse of third-party RVC-CAL
applications. [8] [9] [10]

Actors communicate by exchanging tokens, whose flow is typically
controlled using handshake signals (strobe/ack).

Each actor has a "scheduling model". It can be:
  - N-sequential: the actor fires when tokens are available at all its
    inputs, and it produces one output token after N cycles. It cannot
    accept new input tokens until it has produced its output. A
    multicycle integer divider would use this model.
  - N-pipelined: similar to the sequential model, but the actor can
    always accept new input tokens. It produces an output token N cycles
    of latency after accepting input tokens. A pipelined multiplier would
    use this model.
  - Dynamic: the general case, when no simple hypothesis can be made on
    the token flow behaviour of the actor. An actor accessing system
    memory on a shared bus would use this model.

Migen Flow automatically generates handshake logic for the first two
scheduling models. In the third case, the FHDL descriptions for the logic
driving the handshake signals must be provided by the actor.
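
The timing difference between the first two models can be seen in a small
cycle-count model. It is a sketch written for this document, not part of
Migen Flow: the function name is invented, and it assumes that a
sequential actor can accept a new token in the same cycle in which it
produces its output.

```python
def output_cycles(arrivals, n, pipelined):
    """Return the cycles at which output tokens appear.

    arrivals: sorted cycle numbers at which input tokens become available.
    n: latency in cycles.
    pipelined: True for N-pipelined, False for N-sequential.
    """
    outputs = []
    free_at = 0  # first cycle at which the actor can accept a token
    for t in arrivals:
        start = max(t, free_at)
        outputs.append(start + n)
        # A pipelined actor can accept again on the next cycle; a
        # sequential one is busy until it has produced its output.
        free_at = start + 1 if pipelined else start + n
    return outputs


arrivals = [0, 1, 2]
print(output_cycles(arrivals, 3, pipelined=False))  # [3, 6, 9]
print(output_cycles(arrivals, 3, pipelined=True))   # [3, 4, 5]
```

With back-to-back input tokens, the sequential actor delivers one result
every N cycles while the pipelined one delivers one result per cycle
after the initial latency.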

If sequential or pipelined actors are connected together, Migen Flow will
attempt to find a static schedule, remove the handshake signals, optimize
away the control logic in each actor and replace it with a centralized
FSM implementing the static schedule.

An actor can be a composition of other actors.

Actor graphs are managed using the NetworkX [11] library.


References:
[ 1] http://milkymist.org
[ 2] http://www.myhdl.org
[ 3] http://milkymist.org/thesis/thesis.pdf
[ 4] http://www.xilinx.com/publications/archives/xcell/Xcell77.pdf p30-35
[ 5] http://cdn.opencores.org/downloads/wbspec_b4.pdf
[ 6] http://www.ohwr.org/projects/wishbone-gen
[ 7] http://opendf.svn.sourceforge.net/viewvc/opendf/trunk/doc/
     GentleIntro/GentleIntro.pdf
[ 8] http://orcc.sourceforge.net/
[ 9] http://orc-apps.sourceforge.net/
[10] http://opendf.sourceforge.net/
[11] http://networkx.lanl.gov/

Practical information
=====================
Code repository:
  https://github.com/milkymist/migen
Experimental version of the Milkymist SoC based on Migen:
  https://github.com/milkymist/milkymist-ng

Migen is designed for Python 3.2.

Send questions, comments and patches to devel [AT] lists.milkymist.org
We are also on IRC: #milkymist on the Freenode network.

Migen is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation, version 3 of the License. This program is
distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.
Unless otherwise noted, Migen's source code is copyright (C) 2011
Sebastien Bourdeauducq. Authors retain ownership of their contributions.