6ee0bb96a2130cb704f4152bba041d78075871f2
1 """ Pipeline and BufferedHandshake implementation, conforming to the same API.
2 For multi-input and multi-output variants, see multipipe.
4 Associated development bugs:
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
6 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
11 a strategically very important function that is identical in function
12 to nmigen's Signal.eq function, except it may take objects, or a list
13 of objects, or a tuple of objects, and where objects may also be
19 stage requires compliance with a strict API that may be
20 implemented in several means, including as a static class.
21 the methods of a stage instance must be as follows:
23 * ispec() - Input data format specification
24 returns an object or a list or tuple of objects, or
25 a Record, each object having an "eq" function which
26 takes responsibility for copying by assignment all
28 * ospec() - Output data format specification
29 requirements as for ospec
30 * process(m, i) - Processes an ispec-formatted object
31 returns a combinatorial block of a result that
32 may be assigned to the output, by way of the "eq"
34 * setup(m, i) - Optional function for setting up submodules
35 may be used for more complex stages, to link
36 the input (i) to submodules. must take responsibility
37 for adding those submodules to the module (m).
38 the submodules must be combinatorial blocks and
39 must have their inputs and output linked combinatorially.
41 Both StageCls (for use with non-static classes) and Stage (for use
42 by static classes) are abstract classes from which, for convenience
43 and as a courtesy to other developers, anything conforming to the
44 Stage API may *choose* to derive.
49 A useful combinatorial wrapper around stages that chains them together
50 and then presents a Stage-API-conformant interface. By presenting
51 the same API as the stages it wraps, it can clearly be used recursively.
56 A convenience class that takes an input shape, output shape, a
57 "processing" function and an optional "setup" function. Honestly
58 though, there's not much more effort to just... create a class
59 that returns a couple of Records (see ExampleAddRecordStage in
65 A convenience class that takes a single function as a parameter,
66 that is chain-called to create the exact same input and output spec.
67 It has a process() function that simply returns its input.
69 Instances of this class are completely redundant if handed to
70 StageChain, however when passed to UnbufferedPipeline they
71 can be used to introduce a single clock delay.
76 The base class for pipelines. Contains previous and next ready/valid/data.
77 Also has an extremely useful "connect" function that can be used to
78 connect a chain of pipelines and present the exact same prev/next
84 A simple stalling clock-synchronised pipeline that has no buffering
85 (unlike BufferedHandshake). Data flows on *every* clock cycle when
86 the conditions are right (this is nominally when the input is valid
87 and the output is ready).
89 A stall anywhere along the line will result in a stall back-propagating
90 down the entire chain. The BufferedHandshake by contrast will buffer
91 incoming data, allowing previous stages one clock cycle's grace before
94 An advantage of the UnbufferedPipeline over the Buffered one is
95 that the amount of logic needed (number of gates) is greatly
96 reduced (no second set of buffers basically)
98 The disadvantage of the UnbufferedPipeline is that the valid/ready
99 logic, if chained together, is *combinatorial*, resulting in
100 progressively larger gate delay.
102 PassThroughHandshake:
105 A Control class that introduces a single clock delay, passing its
106 data through unaltered. Unlike RegisterPipeline (which relies
107 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
113 A convenience class that, because UnbufferedPipeline introduces a single
114 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
115 stage that, duh, delays its (unmodified) input by one clock cycle.
120 nmigen implementation of buffered pipeline stage, based on zipcpu:
121 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
123 this module requires quite a bit of thought to understand how it works
124 (and why it is needed in the first place). reading the above is
125 *strongly* recommended.
127 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
128 the STB / ACK signals to raise and lower (on separate clocks) before
129 data may proceeed (thus only allowing one piece of data to proceed
130 on *ALTERNATE* cycles), the signalling here is a true pipeline
131 where data will flow on *every* clock when the conditions are right.
133 input acceptance conditions are when:
134 * incoming previous-stage strobe (p.valid_i) is HIGH
135 * outgoing previous-stage ready (p.ready_o) is LOW
137 output transmission conditions are when:
138 * outgoing next-stage strobe (n.valid_o) is HIGH
139 * outgoing next-stage ready (n.ready_i) is LOW
141 the tricky bit is when the input has valid data and the output is not
142 ready to accept it. if it wasn't for the clock synchronisation, it
143 would be possible to tell the input "hey don't send that data, we're
144 not ready". unfortunately, it's not possible to "change the past":
145 the previous stage *has no choice* but to pass on its data.
147 therefore, the incoming data *must* be accepted - and stored: that
148 is the responsibility / contract that this stage *must* accept.
149 on the same clock, it's possible to tell the input that it must
150 not send any more data. this is the "stall" condition.
152 we now effectively have *two* possible pieces of data to "choose" from:
153 the buffered data, and the incoming data. the decision as to which
154 to process and output is based on whether we are in "stall" or not.
155 i.e. when the next stage is no longer ready, the output comes from
156 the buffer if a stall had previously occurred, otherwise it comes
157 direct from processing the input.
159 this allows us to respect a synchronous "travelling STB" with what
160 dan calls a "buffered handshake".
162 it's quite a complex state machine!
167 Synchronised pipeline, Based on:
168 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
171 from nmigen
import Signal
, Cat
, Const
, Mux
, Module
, Value
, Elaboratable
172 from nmigen
.cli
import verilog
, rtlil
173 from nmigen
.lib
.fifo
import SyncFIFO
, SyncFIFOBuffered
174 from nmigen
.hdl
.ast
import ArrayProxy
175 from nmigen
.hdl
.rec
import Record
, Layout
177 from abc
import ABCMeta
, abstractmethod
178 from collections
.abc
import Sequence
, Iterable
179 from collections
import OrderedDict
180 from queue
import Queue
186 self
.fields
= OrderedDict()
188 def __setattr__(self
, k
, v
):
190 if (k
.startswith('_') or k
in ["fields", "name", "src_loc"] or
191 k
in dir(Object
) or "fields" not in self
.__dict
__):
192 return object.__setattr
__(self
, k
, v
)
195 def __getattr__(self
, k
):
196 if k
in self
.__dict
__:
197 return object.__getattr
__(self
, k
)
199 return self
.fields
[k
]
200 except KeyError as e
:
201 raise AttributeError(e
)
204 for x
in self
.fields
.values():
205 if isinstance(x
, Iterable
):
212 for (k
, o
) in self
.fields
.items():
216 if isinstance(rres
, Sequence
):
227 class RecordObject(Record
):
228 def __init__(self
, layout
=None, name
=None):
229 Record
.__init
__(self
, layout
=layout
or [], name
=None)
231 def __setattr__(self
, k
, v
):
233 if (k
.startswith('_') or k
in ["fields", "name", "src_loc"] or
234 k
in dir(Record
) or "fields" not in self
.__dict
__):
235 return object.__setattr
__(self
, k
, v
)
237 #print ("RecordObject setattr", k, v)
238 if isinstance(v
, Record
):
239 newlayout
= {k
: (k
, v
.layout
)}
240 elif isinstance(v
, Value
):
241 newlayout
= {k
: (k
, v
.shape())}
243 newlayout
= {k
: (k
, shape(v
))}
244 self
.layout
.fields
.update(newlayout
)
247 for x
in self
.fields
.values():
248 if isinstance(x
, Iterable
):
257 def _spec(fn
, name
=None):
260 varnames
= dict(inspect
.getmembers(fn
.__code
__))['co_varnames']
261 if 'name' in varnames
:
266 class PrevControl(Elaboratable
):
267 """ contains signals that come *from* the previous stage (both in and out)
268 * valid_i: previous stage indicating all incoming data is valid.
269 may be a multi-bit signal, where all bits are required
270 to be asserted to indicate "valid".
271 * ready_o: output to next stage indicating readiness to accept data
272 * data_i : an input - added by the user of this class
275 def __init__(self
, i_width
=1, stage_ctl
=False):
276 self
.stage_ctl
= stage_ctl
277 self
.valid_i
= Signal(i_width
, name
="p_valid_i") # prev >>in self
278 self
._ready
_o
= Signal(name
="p_ready_o") # prev <<out self
279 self
.data_i
= None # XXX MUST BE ADDED BY USER
281 self
.s_ready_o
= Signal(name
="p_s_o_rdy") # prev <<out self
282 self
.trigger
= Signal(reset_less
=True)
286 """ public-facing API: indicates (externally) that stage is ready
289 return self
.s_ready_o
# set dynamically by stage
290 return self
._ready
_o
# return this when not under dynamic control
292 def _connect_in(self
, prev
, direct
=False, fn
=None):
293 """ internal helper function to connect stage to an input source.
294 do not use to connect stage-to-stage!
296 valid_i
= prev
.valid_i
if direct
else prev
.valid_i_test
297 data_i
= fn(prev
.data_i
) if fn
is not None else prev
.data_i
298 return [self
.valid_i
.eq(valid_i
),
299 prev
.ready_o
.eq(self
.ready_o
),
300 eq(self
.data_i
, data_i
),
304 def valid_i_test(self
):
305 vlen
= len(self
.valid_i
)
307 # multi-bit case: valid only when valid_i is all 1s
308 all1s
= Const(-1, (len(self
.valid_i
), False))
309 valid_i
= (self
.valid_i
== all1s
)
311 # single-bit valid_i case
312 valid_i
= self
.valid_i
314 # when stage indicates not ready, incoming data
315 # must "appear" to be not ready too
317 valid_i
= valid_i
& self
.s_ready_o
321 def elaborate(self
, platform
):
323 m
.d
.comb
+= self
.trigger
.eq(self
.valid_i_test
& self
.ready_o
)
327 return [self
.data_i
.eq(i
.data_i
),
328 self
.ready_o
.eq(i
.ready_o
),
329 self
.valid_i
.eq(i
.valid_i
)]
334 if hasattr(self
.data_i
, "ports"):
335 yield from self
.data_i
.ports()
336 elif isinstance(self
.data_i
, Sequence
):
337 yield from self
.data_i
345 class NextControl(Elaboratable
):
346 """ contains the signals that go *to* the next stage (both in and out)
347 * valid_o: output indicating to next stage that data is valid
348 * ready_i: input from next stage indicating that it can accept data
349 * data_o : an output - added by the user of this class
351 def __init__(self
, stage_ctl
=False):
352 self
.stage_ctl
= stage_ctl
353 self
.valid_o
= Signal(name
="n_valid_o") # self out>> next
354 self
.ready_i
= Signal(name
="n_ready_i") # self <<in next
355 self
.data_o
= None # XXX MUST BE ADDED BY USER
357 self
.d_valid
= Signal(reset
=1) # INTERNAL (data valid)
358 self
.trigger
= Signal(reset_less
=True)
361 def ready_i_test(self
):
363 return self
.ready_i
& self
.d_valid
366 def connect_to_next(self
, nxt
):
367 """ helper function to connect to the next stage data/valid/ready.
368 data/valid is passed *TO* nxt, and ready comes *IN* from nxt.
369 use this when connecting stage-to-stage
371 return [nxt
.valid_i
.eq(self
.valid_o
),
372 self
.ready_i
.eq(nxt
.ready_o
),
373 eq(nxt
.data_i
, self
.data_o
),
376 def _connect_out(self
, nxt
, direct
=False, fn
=None):
377 """ internal helper function to connect stage to an output source.
378 do not use to connect stage-to-stage!
380 ready_i
= nxt
.ready_i
if direct
else nxt
.ready_i_test
381 data_o
= fn(nxt
.data_o
) if fn
is not None else nxt
.data_o
382 return [nxt
.valid_o
.eq(self
.valid_o
),
383 self
.ready_i
.eq(ready_i
),
384 eq(data_o
, self
.data_o
),
387 def elaborate(self
, platform
):
389 m
.d
.comb
+= self
.trigger
.eq(self
.ready_i_test
& self
.valid_o
)
395 if hasattr(self
.data_o
, "ports"):
396 yield from self
.data_o
.ports()
397 elif isinstance(self
.data_o
, Sequence
):
398 yield from self
.data_o
407 """ a helper class for iterating twin-argument compound data structures.
409 Record is a special (unusual, recursive) case, where the input may be
410 specified as a dictionary (which may contain further dictionaries,
411 recursively), where the field names of the dictionary must match
412 the Record's field spec. Alternatively, an object with the same
413 member names as the Record may be assigned: it does not have to
416 ArrayProxy is also special-cased, it's a bit messy: whilst ArrayProxy
417 has an eq function, the object being assigned to it (e.g. a python
418 object) might not. despite the *input* having an eq function,
419 that doesn't help us, because it's the *ArrayProxy* that's being
420 assigned to. so.... we cheat. use the ports() function of the
421 python object, enumerate them, find out the list of Signals that way,
424 def iterator2(self
, o
, i
):
425 if isinstance(o
, dict):
426 yield from self
.dict_iter2(o
, i
)
428 if not isinstance(o
, Sequence
):
430 for (ao
, ai
) in zip(o
, i
):
431 #print ("visit", fn, ao, ai)
432 if isinstance(ao
, Record
):
433 yield from self
.record_iter2(ao
, ai
)
434 elif isinstance(ao
, ArrayProxy
) and not isinstance(ai
, Value
):
435 yield from self
.arrayproxy_iter2(ao
, ai
)
439 def dict_iter2(self
, o
, i
):
440 for (k
, v
) in o
.items():
441 print ("d-iter", v
, i
[k
])
445 def _not_quite_working_with_all_unit_tests_record_iter2(self
, ao
, ai
):
446 print ("record_iter2", ao
, ai
, type(ao
), type(ai
))
447 if isinstance(ai
, Value
):
448 if isinstance(ao
, Sequence
):
450 for o
, i
in zip(ao
, ai
):
453 for idx
, (field_name
, field_shape
, _
) in enumerate(ao
.layout
):
454 if isinstance(field_shape
, Layout
):
458 if hasattr(val
, field_name
): # check for attribute
459 val
= getattr(val
, field_name
)
461 val
= val
[field_name
] # dictionary-style specification
462 yield from self
.iterator2(ao
.fields
[field_name
], val
)
464 def record_iter2(self
, ao
, ai
):
465 for idx
, (field_name
, field_shape
, _
) in enumerate(ao
.layout
):
466 if isinstance(field_shape
, Layout
):
470 if hasattr(val
, field_name
): # check for attribute
471 val
= getattr(val
, field_name
)
473 val
= val
[field_name
] # dictionary-style specification
474 yield from self
.iterator2(ao
.fields
[field_name
], val
)
476 def arrayproxy_iter2(self
, ao
, ai
):
478 op
= getattr(ao
, p
.name
)
479 print ("arrayproxy - p", p
, p
.name
)
480 yield from self
.iterator2(op
, p
)
484 """ a helper class for iterating single-argument compound data structures.
487 def iterate(self
, i
):
488 """ iterate a compound structure recursively using yield
490 if not isinstance(i
, Sequence
):
493 #print ("iterate", ai)
494 if isinstance(ai
, Record
):
495 #print ("record", list(ai.layout))
496 yield from self
.record_iter(ai
)
497 elif isinstance(ai
, ArrayProxy
) and not isinstance(ai
, Value
):
498 yield from self
.array_iter(ai
)
502 def record_iter(self
, ai
):
503 for idx
, (field_name
, field_shape
, _
) in enumerate(ai
.layout
):
504 if isinstance(field_shape
, Layout
):
508 if hasattr(val
, field_name
): # check for attribute
509 val
= getattr(val
, field_name
)
511 val
= val
[field_name
] # dictionary-style specification
512 #print ("recidx", idx, field_name, field_shape, val)
513 yield from self
.iterate(val
)
515 def array_iter(self
, ai
):
517 yield from self
.iterate(p
)
521 """ makes signals equal: a helper routine which identifies if it is being
522 passed a list (or tuple) of objects, or signals, or Records, and calls
523 the objects' eq function.
526 for (ao
, ai
) in Visitor2().iterator2(o
, i
):
528 if not isinstance(rres
, Sequence
):
538 #print ("shape?", part)
545 """ flattens a compound structure recursively using Cat
547 from nmigen
.tools
import flatten
548 #res = list(flatten(i)) # works (as of nmigen commit f22106e5) HOWEVER...
549 res
= list(Visitor().iterate(i
)) # needed because input may be a sequence
553 class StageCls(metaclass
=ABCMeta
):
554 """ Class-based "Stage" API. requires instantiation (after derivation)
556 see "Stage API" above.. Note: python does *not* require derivation
557 from this class. All that is required is that the pipelines *have*
558 the functions listed in this class. Derivation from this class
559 is therefore merely a "courtesy" to maintainers.
562 def ispec(self
): pass # REQUIRED
564 def ospec(self
): pass # REQUIRED
566 #def setup(self, m, i): pass # OPTIONAL
568 def process(self
, i
): pass # REQUIRED
571 class Stage(metaclass
=ABCMeta
):
572 """ Static "Stage" API. does not require instantiation (after derivation)
574 see "Stage API" above. Note: python does *not* require derivation
575 from this class. All that is required is that the pipelines *have*
576 the functions listed in this class. Derivation from this class
577 is therefore merely a "courtesy" to maintainers.
589 #def setup(m, i): pass
596 class RecordBasedStage(Stage
):
597 """ convenience class which provides a Records-based layout.
598 honestly it's a lot easier just to create a direct Records-based
599 class (see ExampleAddRecordStage)
601 def __init__(self
, in_shape
, out_shape
, processfn
, setupfn
=None):
602 self
.in_shape
= in_shape
603 self
.out_shape
= out_shape
604 self
.__process
= processfn
605 self
.__setup
= setupfn
606 def ispec(self
): return Record(self
.in_shape
)
607 def ospec(self
): return Record(self
.out_shape
)
608 def process(seif
, i
): return self
.__process
(i
)
609 def setup(seif
, m
, i
): return self
.__setup
(m
, i
)
612 class StageChain(StageCls
):
613 """ pass in a list of stages, and they will automatically be
614 chained together via their input and output specs into a
617 the end result basically conforms to the exact same Stage API.
619 * input to this class will be the input of the first stage
620 * output of first stage goes into input of second
621 * output of second goes into input into third (etc. etc.)
622 * the output of this class will be the output of the last stage
624 def __init__(self
, chain
, specallocate
=False):
626 self
.specallocate
= specallocate
629 return _spec(self
.chain
[0].ispec
, "chainin")
632 return _spec(self
.chain
[-1].ospec
, "chainout")
634 def _specallocate_setup(self
, m
, i
):
635 for (idx
, c
) in enumerate(self
.chain
):
636 if hasattr(c
, "setup"):
637 c
.setup(m
, i
) # stage may have some module stuff
638 ofn
= self
.chain
[idx
].ospec
# last assignment survives
639 o
= _spec(ofn
, 'chainin%d' % idx
)
640 m
.d
.comb
+= eq(o
, c
.process(i
)) # process input into "o"
641 if idx
== len(self
.chain
)-1:
643 ifn
= self
.chain
[idx
+1].ispec
# new input on next loop
644 i
= _spec(ifn
, 'chainin%d' % (idx
+1))
645 m
.d
.comb
+= eq(i
, o
) # assign to next input
646 return o
# last loop is the output
648 def _noallocate_setup(self
, m
, i
):
649 for (idx
, c
) in enumerate(self
.chain
):
650 if hasattr(c
, "setup"):
651 c
.setup(m
, i
) # stage may have some module stuff
652 i
= o
= c
.process(i
) # store input into "o"
653 return o
# last loop is the output
655 def setup(self
, m
, i
):
656 if self
.specallocate
:
657 self
.o
= self
._specallocate
_setup
(m
, i
)
659 self
.o
= self
._noallocate
_setup
(m
, i
)
661 def process(self
, i
):
662 return self
.o
# conform to Stage API: return last-loop output
665 class ControlBase(Elaboratable
):
666 """ Common functions for Pipeline API
668 def __init__(self
, stage
=None, in_multi
=None, stage_ctl
=False):
669 """ Base class containing ready/valid/data to previous and next stages
671 * p: contains ready/valid to the previous stage
672 * n: contains ready/valid to the next stage
674 Except when calling Controlbase.connect(), user must also:
675 * add data_i member to PrevControl (p) and
676 * add data_o member to NextControl (n)
680 # set up input and output IO ACK (prev/next ready/valid)
681 self
.p
= PrevControl(in_multi
, stage_ctl
)
682 self
.n
= NextControl(stage_ctl
)
684 # set up the input and output data
685 if stage
is not None:
686 self
.p
.data_i
= _spec(stage
.ispec
, "data_i") # input type
687 self
.n
.data_o
= _spec(stage
.ospec
, "data_o") # output type
689 def connect_to_next(self
, nxt
):
690 """ helper function to connect to the next stage data/valid/ready.
692 return self
.n
.connect_to_next(nxt
.p
)
694 def _connect_in(self
, prev
):
695 """ internal helper function to connect stage to an input source.
696 do not use to connect stage-to-stage!
698 return self
.p
._connect
_in
(prev
.p
)
700 def _connect_out(self
, nxt
):
701 """ internal helper function to connect stage to an output source.
702 do not use to connect stage-to-stage!
704 return self
.n
._connect
_out
(nxt
.n
)
706 def connect(self
, pipechain
):
707 """ connects a chain (list) of Pipeline instances together and
708 links them to this ControlBase instance:
710 in <----> self <---> out
713 [pipe1, pipe2, pipe3, pipe4]
716 out---in out--in out---in
718 Also takes care of allocating data_i/data_o, by looking up
719 the data spec for each end of the pipechain. i.e It is NOT
720 necessary to allocate self.p.data_i or self.n.data_o manually:
721 this is handled AUTOMATICALLY, here.
723 Basically this function is the direct equivalent of StageChain,
724 except that unlike StageChain, the Pipeline logic is followed.
726 Just as StageChain presents an object that conforms to the
727 Stage API from a list of objects that also conform to the
728 Stage API, an object that calls this Pipeline connect function
729 has the exact same pipeline API as the list of pipline objects
732 Thus it becomes possible to build up larger chains recursively.
733 More complex chains (multi-input, multi-output) will have to be
736 eqs
= [] # collated list of assignment statements
738 # connect inter-chain
739 for i
in range(len(pipechain
)-1):
741 pipe2
= pipechain
[i
+1]
742 eqs
+= pipe1
.connect_to_next(pipe2
)
744 # connect front of chain to ourselves
746 self
.p
.data_i
= _spec(front
.stage
.ispec
, "chainin")
747 eqs
+= front
._connect
_in
(self
)
749 # connect end of chain to ourselves
751 self
.n
.data_o
= _spec(end
.stage
.ospec
, "chainout")
752 eqs
+= end
._connect
_out
(self
)
756 def _postprocess(self
, i
): # XXX DISABLED
757 return i
# RETURNS INPUT
758 if hasattr(self
.stage
, "postprocess"):
759 return self
.stage
.postprocess(i
)
762 def set_input(self
, i
):
763 """ helper function to set the input data
765 return eq(self
.p
.data_i
, i
)
774 def elaborate(self
, platform
):
775 """ handles case where stage has dynamic ready/valid functions
778 m
.submodules
.p
= self
.p
779 m
.submodules
.n
= self
.n
781 if self
.stage
is not None and hasattr(self
.stage
, "setup"):
782 self
.stage
.setup(m
, self
.p
.data_i
)
784 if not self
.p
.stage_ctl
:
787 # intercept the previous (outgoing) "ready", combine with stage ready
788 m
.d
.comb
+= self
.p
.s_ready_o
.eq(self
.p
._ready
_o
& self
.stage
.d_ready
)
790 # intercept the next (incoming) "ready" and combine it with data valid
791 sdv
= self
.stage
.d_valid(self
.n
.ready_i
)
792 m
.d
.comb
+= self
.n
.d_valid
.eq(self
.n
.ready_i
& sdv
)
797 class BufferedHandshake(ControlBase
):
798 """ buffered pipeline stage. data and strobe signals travel in sync.
799 if ever the input is ready and the output is not, processed data
800 is shunted in a temporary register.
802 Argument: stage. see Stage API above
804 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
805 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
806 stage-1 p.data_i >>in stage n.data_o out>> stage+1
812 input data p.data_i is read (only), is processed and goes into an
813 intermediate result store [process()]. this is updated combinatorially.
815 in a non-stall condition, the intermediate result will go into the
816 output (update_output). however if ever there is a stall, it goes
817 into r_data instead [update_buffer()].
819 when the non-stall condition is released, r_data is the first
820 to be transferred to the output [flush_buffer()], and the stall
823 on the next cycle (as long as stall is not raised again) the
824 input may begin to be processed and transferred directly to output.
827 def elaborate(self
, platform
):
828 self
.m
= ControlBase
.elaborate(self
, platform
)
830 result
= _spec(self
.stage
.ospec
, "r_tmp")
831 r_data
= _spec(self
.stage
.ospec
, "r_data")
833 # establish some combinatorial temporaries
834 o_n_validn
= Signal(reset_less
=True)
835 n_ready_i
= Signal(reset_less
=True, name
="n_i_rdy_data")
836 nir_por
= Signal(reset_less
=True)
837 nir_por_n
= Signal(reset_less
=True)
838 p_valid_i
= Signal(reset_less
=True)
839 nir_novn
= Signal(reset_less
=True)
840 nirn_novn
= Signal(reset_less
=True)
841 por_pivn
= Signal(reset_less
=True)
842 npnn
= Signal(reset_less
=True)
843 self
.m
.d
.comb
+= [p_valid_i
.eq(self
.p
.valid_i_test
),
844 o_n_validn
.eq(~self
.n
.valid_o
),
845 n_ready_i
.eq(self
.n
.ready_i_test
),
846 nir_por
.eq(n_ready_i
& self
.p
._ready
_o
),
847 nir_por_n
.eq(n_ready_i
& ~self
.p
._ready
_o
),
848 nir_novn
.eq(n_ready_i | o_n_validn
),
849 nirn_novn
.eq(~n_ready_i
& o_n_validn
),
850 npnn
.eq(nir_por | nirn_novn
),
851 por_pivn
.eq(self
.p
._ready
_o
& ~p_valid_i
)
854 # store result of processing in combinatorial temporary
855 self
.m
.d
.comb
+= eq(result
, self
.stage
.process(self
.p
.data_i
))
857 # if not in stall condition, update the temporary register
858 with self
.m
.If(self
.p
.ready_o
): # not stalled
859 self
.m
.d
.sync
+= eq(r_data
, result
) # update buffer
861 # data pass-through conditions
862 with self
.m
.If(npnn
):
863 data_o
= self
._postprocess
(result
)
864 self
.m
.d
.sync
+= [self
.n
.valid_o
.eq(p_valid_i
), # valid if p_valid
865 eq(self
.n
.data_o
, data_o
), # update output
867 # buffer flush conditions (NOTE: can override data passthru conditions)
868 with self
.m
.If(nir_por_n
): # not stalled
869 # Flush the [already processed] buffer to the output port.
870 data_o
= self
._postprocess
(r_data
)
871 self
.m
.d
.sync
+= [self
.n
.valid_o
.eq(1), # reg empty
872 eq(self
.n
.data_o
, data_o
), # flush buffer
874 # output ready conditions
875 self
.m
.d
.sync
+= self
.p
._ready
_o
.eq(nir_novn | por_pivn
)
880 class SimpleHandshake(ControlBase
):
881 """ simple handshake control. data and strobe signals travel in sync.
882 implements the protocol used by Wishbone and AXI4.
884 Argument: stage. see Stage API above
886 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
887 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
888 stage-1 p.data_i >>in stage n.data_o out>> stage+1
893 Inputs Temporary Output Data
894 ------- ---------- ----- ----
895 P P N N PiV& ~NiR& N P
902 0 0 1 0 0 0 0 1 process(data_i)
903 0 0 1 1 0 0 0 1 process(data_i)
907 0 1 1 0 0 0 0 1 process(data_i)
908 0 1 1 1 0 0 0 1 process(data_i)
912 1 0 1 0 0 0 0 1 process(data_i)
913 1 0 1 1 0 0 0 1 process(data_i)
915 1 1 0 0 1 0 1 0 process(data_i)
916 1 1 0 1 1 1 1 0 process(data_i)
917 1 1 1 0 1 0 1 1 process(data_i)
918 1 1 1 1 1 0 1 1 process(data_i)
922 def elaborate(self
, platform
):
923 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
926 result
= _spec(self
.stage
.ospec
, "r_tmp")
928 # establish some combinatorial temporaries
929 n_ready_i
= Signal(reset_less
=True, name
="n_i_rdy_data")
930 p_valid_i_p_ready_o
= Signal(reset_less
=True)
931 p_valid_i
= Signal(reset_less
=True)
932 m
.d
.comb
+= [p_valid_i
.eq(self
.p
.valid_i_test
),
933 n_ready_i
.eq(self
.n
.ready_i_test
),
934 p_valid_i_p_ready_o
.eq(p_valid_i
& self
.p
.ready_o
),
937 # store result of processing in combinatorial temporary
938 m
.d
.comb
+= eq(result
, self
.stage
.process(self
.p
.data_i
))
940 # previous valid and ready
941 with m
.If(p_valid_i_p_ready_o
):
942 data_o
= self
._postprocess
(result
)
943 m
.d
.sync
+= [r_busy
.eq(1), # output valid
944 eq(self
.n
.data_o
, data_o
), # update output
946 # previous invalid or not ready, however next is accepting
947 with m
.Elif(n_ready_i
):
948 data_o
= self
._postprocess
(result
)
949 m
.d
.sync
+= [eq(self
.n
.data_o
, data_o
)]
950 # TODO: could still send data here (if there was any)
951 #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
952 m
.d
.sync
+= r_busy
.eq(0) # ...so set output invalid
954 m
.d
.comb
+= self
.n
.valid_o
.eq(r_busy
)
955 # if next is ready, so is previous
956 m
.d
.comb
+= self
.p
._ready
_o
.eq(n_ready_i
)
961 class UnbufferedPipeline(ControlBase
):
962 """ A simple pipeline stage with single-clock synchronisation
963 and two-way valid/ready synchronised signalling.
965 Note that a stall in one stage will result in the entire pipeline
968 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
969 travel synchronously with the data: the valid/ready signalling
970 combines in a *combinatorial* fashion. Therefore, a long pipeline
971 chain will lengthen propagation delays.
973 Argument: stage. see Stage API, above
975 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
976 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
977 stage-1 p.data_i >>in stage n.data_o out>> stage+1
985 p.data_i : StageInput, shaped according to ispec
987 p.data_o : StageOutput, shaped according to ospec
989 r_data : input_shape according to ispec
990 A temporary (buffered) copy of a prior (valid) input.
991 This is HELD if the output is not ready. It is updated
993 result: output_shape according to ospec
994 The output of the combinatorial logic. it is updated
995 COMBINATORIALLY (no clock dependence).
999 Inputs Temp Output Data
1000 ------- - ----- ----
1021 1 1 0 0 0 1 1 process(data_i)
1022 1 1 0 1 1 1 0 process(data_i)
1023 1 1 1 0 0 1 1 process(data_i)
1024 1 1 1 1 0 1 1 process(data_i)
1027 Note: PoR is *NOT* involved in the above decision-making.
1030 def elaborate(self
, platform
):
1031 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
1033 data_valid
= Signal() # is data valid or not
1034 r_data
= _spec(self
.stage
.ospec
, "r_tmp") # output type
1037 p_valid_i
= Signal(reset_less
=True)
1038 pv
= Signal(reset_less
=True)
1039 buf_full
= Signal(reset_less
=True)
1040 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
1041 m
.d
.comb
+= pv
.eq(self
.p
.valid_i
& self
.p
.ready_o
)
1042 m
.d
.comb
+= buf_full
.eq(~self
.n
.ready_i_test
& data_valid
)
1044 m
.d
.comb
+= self
.n
.valid_o
.eq(data_valid
)
1045 m
.d
.comb
+= self
.p
._ready
_o
.eq(~data_valid | self
.n
.ready_i_test
)
1046 m
.d
.sync
+= data_valid
.eq(p_valid_i | buf_full
)
1049 m
.d
.sync
+= eq(r_data
, self
.stage
.process(self
.p
.data_i
))
1050 data_o
= self
._postprocess
(r_data
)
1051 m
.d
.comb
+= eq(self
.n
.data_o
, data_o
)
1055 class UnbufferedPipeline2(ControlBase
):
1056 """ A simple pipeline stage with single-clock synchronisation
1057 and two-way valid/ready synchronised signalling.
1059 Note that a stall in one stage will result in the entire pipeline
1062 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
1063 travel synchronously with the data: the valid/ready signalling
1064 combines in a *combinatorial* fashion. Therefore, a long pipeline
1065 chain will lengthen propagation delays.
1067 Argument: stage. see Stage API, above
1069 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
1070 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
1071 stage-1 p.data_i >>in stage n.data_o out>> stage+1
1073 +- process-> buf <-+
1076 p.data_i : StageInput, shaped according to ispec
1078 p.data_o : StageOutput, shaped according to ospec
1080 buf : output_shape according to ospec
1081 A temporary (buffered) copy of a valid output
1082 This is HELD if the output is not ready. It is updated
1085 Inputs Temp Output Data
1087 P P N N ~NiR& N P (buf_full)
1092 0 0 0 0 0 0 1 process(data_i)
1093 0 0 0 1 1 1 0 reg (odata, unchanged)
1094 0 0 1 0 0 0 1 process(data_i)
1095 0 0 1 1 0 0 1 process(data_i)
1097 0 1 0 0 0 0 1 process(data_i)
1098 0 1 0 1 1 1 0 reg (odata, unchanged)
1099 0 1 1 0 0 0 1 process(data_i)
1100 0 1 1 1 0 0 1 process(data_i)
1102 1 0 0 0 0 1 1 process(data_i)
1103 1 0 0 1 1 1 0 reg (odata, unchanged)
1104 1 0 1 0 0 1 1 process(data_i)
1105 1 0 1 1 0 1 1 process(data_i)
1107 1 1 0 0 0 1 1 process(data_i)
1108 1 1 0 1 1 1 0 reg (odata, unchanged)
1109 1 1 1 0 0 1 1 process(data_i)
1110 1 1 1 1 0 1 1 process(data_i)
1113 Note: PoR is *NOT* involved in the above decision-making.
1116 def elaborate(self
, platform
):
1117 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
1119 buf_full
= Signal() # is data valid or not
1120 buf
= _spec(self
.stage
.ospec
, "r_tmp") # output type
1123 p_valid_i
= Signal(reset_less
=True)
1124 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
1126 m
.d
.comb
+= self
.n
.valid_o
.eq(buf_full | p_valid_i
)
1127 m
.d
.comb
+= self
.p
._ready
_o
.eq(~buf_full
)
1128 m
.d
.sync
+= buf_full
.eq(~self
.n
.ready_i_test
& self
.n
.valid_o
)
1130 data_o
= Mux(buf_full
, buf
, self
.stage
.process(self
.p
.data_i
))
1131 data_o
= self
._postprocess
(data_o
)
1132 m
.d
.comb
+= eq(self
.n
.data_o
, data_o
)
1133 m
.d
.sync
+= eq(buf
, self
.n
.data_o
)
1138 class PassThroughStage(StageCls
):
1139 """ a pass-through stage which has its input data spec equal to its output,
1140 and "passes through" its data from input to output.
1142 def __init__(self
, iospecfn
):
1143 self
.iospecfn
= iospecfn
1144 def ispec(self
): return self
.iospecfn()
1145 def ospec(self
): return self
.iospecfn()
1146 def process(self
, i
): return i
1149 class PassThroughHandshake(ControlBase
):
1150 """ A control block that delays by one clock cycle.
1152 Inputs Temporary Output Data
1153 ------- ------------------ ----- ----
1154 P P N N PiV& PiV| NiR| pvr N P (pvr)
1155 i o i o PoR ~PoR ~NoV o o
1159 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
1160 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
1161 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
1162 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
1164 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
1165 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
1166 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
1167 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
1169 1 0 0 0 0 1 1 1 1 1 process(in)
1170 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
1171 1 0 1 0 0 1 1 1 1 1 process(in)
1172 1 0 1 1 0 1 1 1 1 1 process(in)
1174 1 1 0 0 1 1 1 1 1 1 process(in)
1175 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
1176 1 1 1 0 1 1 1 1 1 1 process(in)
1177 1 1 1 1 1 1 1 1 1 1 process(in)
1182 def elaborate(self
, platform
):
1183 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
1185 r_data
= _spec(self
.stage
.ospec
, "r_tmp") # output type
1188 p_valid_i
= Signal(reset_less
=True)
1189 pvr
= Signal(reset_less
=True)
1190 m
.d
.comb
+= p_valid_i
.eq(self
.p
.valid_i_test
)
1191 m
.d
.comb
+= pvr
.eq(p_valid_i
& self
.p
.ready_o
)
1193 m
.d
.comb
+= self
.p
.ready_o
.eq(~self
.n
.valid_o | self
.n
.ready_i_test
)
1194 m
.d
.sync
+= self
.n
.valid_o
.eq(p_valid_i | ~self
.p
.ready_o
)
1196 odata
= Mux(pvr
, self
.stage
.process(self
.p
.data_i
), r_data
)
1197 m
.d
.sync
+= eq(r_data
, odata
)
1198 r_data
= self
._postprocess
(r_data
)
1199 m
.d
.comb
+= eq(self
.n
.data_o
, r_data
)
1204 class RegisterPipeline(UnbufferedPipeline
):
1205 """ A pipeline stage that delays by one clock cycle, creating a
1206 sync'd latch out of data_o and valid_o as an indirect byproduct
1207 of using PassThroughStage
1209 def __init__(self
, iospecfn
):
1210 UnbufferedPipeline
.__init
__(self
, PassThroughStage(iospecfn
))
1213 class FIFOControl(ControlBase
):
1214 """ FIFO Control. Uses SyncFIFO to store data, coincidentally
1215 happens to have same valid/ready signalling as Stage API.
1217 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
1220 def __init__(self
, depth
, stage
, in_multi
=None, stage_ctl
=False,
1221 fwft
=True, buffered
=False, pipe
=False):
1224 * depth: number of entries in the FIFO
1225 * stage: data processing block
1226 * fwft : first word fall-thru mode (non-fwft introduces delay)
1227 * buffered: use buffered FIFO (introduces extra cycle delay)
1229 NOTE 1: FPGAs may have trouble with the defaults for SyncFIFO
1230 (fwft=True, buffered=False)
1232 NOTE 2: data_i *must* have a shape function. it can therefore
1233 be a Signal, or a Record, or a RecordObject.
1235 data is processed (and located) as follows:
1237 self.p self.stage temp fn temp fn temp fp self.n
1238 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
1240 yes, really: cat produces a Cat() which can be assigned to.
1241 this is how the FIFO gets de-catted without needing a de-cat
1245 assert not (fwft
and buffered
), "buffered cannot do fwft"
1249 self
.buffered
= buffered
1252 ControlBase
.__init
__(self
, stage
, in_multi
, stage_ctl
)
1254 def elaborate(self
, platform
):
1255 self
.m
= m
= ControlBase
.elaborate(self
, platform
)
1257 # make a FIFO with a signal of equal width to the data_o.
1258 (fwidth
, _
) = shape(self
.n
.data_o
)
1260 fifo
= SyncFIFOBuffered(fwidth
, self
.fdepth
)
1262 fifo
= Queue(fwidth
, self
.fdepth
, fwft
=self
.fwft
, pipe
=self
.pipe
)
1263 m
.submodules
.fifo
= fifo
1265 # store result of processing in combinatorial temporary
1266 result
= _spec(self
.stage
.ospec
, "r_temp")
1267 m
.d
.comb
+= eq(result
, self
.stage
.process(self
.p
.data_i
))
1269 # connect previous rdy/valid/data - do cat on data_i
1270 # NOTE: cannot do the PrevControl-looking trick because
1271 # of need to process the data. shaaaame....
1272 m
.d
.comb
+= [fifo
.we
.eq(self
.p
.valid_i_test
),
1273 self
.p
.ready_o
.eq(fifo
.writable
),
1274 eq(fifo
.din
, cat(result
)),
1277 # connect next rdy/valid/data - do cat on data_o
1278 connections
= [self
.n
.valid_o
.eq(fifo
.readable
),
1279 fifo
.re
.eq(self
.n
.ready_i_test
),
1281 if self
.fwft
or self
.buffered
:
1282 m
.d
.comb
+= connections
1284 m
.d
.sync
+= connections
# unbuffered fwft mode needs sync
1285 data_o
= cat(self
.n
.data_o
).eq(fifo
.dout
)
1286 data_o
= self
._postprocess
(data_o
)
1293 class UnbufferedPipeline(FIFOControl
):
1294 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
1295 FIFOControl
.__init
__(self
, 1, stage
, in_multi
, stage_ctl
,
1296 fwft
=True, pipe
=False)
1298 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
1299 class PassThroughHandshake(FIFOControl
):
1300 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
1301 FIFOControl
.__init
__(self
, 1, stage
, in_multi
, stage_ctl
,
1302 fwft
=True, pipe
=True)
1304 # this is *probably* BufferedHandshake, although test #997 now succeeds.
1305 class BufferedHandshake(FIFOControl
):
1306 def __init__(self
, stage
, in_multi
=None, stage_ctl
=False):
1307 FIFOControl
.__init
__(self
, 2, stage
, in_multi
, stage_ctl
,
1308 fwft
=True, pipe
=False)
1312 # this is *probably* SimpleHandshake (note: memory cell size=0)
1313 class SimpleHandshake(FIFOControl):
1314 def __init__(self, stage, in_multi=None, stage_ctl=False):
1315 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
1316 fwft=True, pipe=False)