""" Pipeline API.  For multi-input and multi-output variants, see multipipe.

This work is funded through NLnet under Grant 2019-02-012

Associated development bugs:
* http://bugs.libre-riscv.org/show_bug.cgi?id=148
* http://bugs.libre-riscv.org/show_bug.cgi?id=64
* http://bugs.libre-riscv.org/show_bug.cgi?id=57

Important: see Stage API (stageapi.py) and IO Control API
(iocontrol.py) in combination with below.  This module
"combines" the Stage API with the IO Control API to create
the Pipeline API.

The one critically important key difference between StageAPI and
PipelineAPI is:

* StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
* PipelineAPI: synchronous registers / latches get added here
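
As a rough illustration of that difference, here is a plain-Python
behavioural sketch (it is not nmigen and not part of this module).
The names stage_process and RegisteredStage are hypothetical: the first
models a combinatorial Stage whose output depends only on its current
input, the second models what a pipeline class adds, namely a register
that makes the result visible one clock later.

```python
def stage_process(x):
    # combinatorial: the result depends only on the current input
    return x + 1


class RegisteredStage:
    # hypothetical model of a stage wrapped with one output register

    def __init__(self, process, reset=0):
        self.process = process
        self.q = reset          # register contents, visible on the output

    def tick(self, d):
        # model one clock edge: return the value that was on the output
        # before the edge, then latch process(d) into the register
        out = self.q
        self.q = self.process(d)
        return out


reg = RegisteredStage(stage_process)
outputs = [reg.tick(d) for d in (10, 20, 30)]
```

Each processed value appears on the output one tick after its input was
presented, which is the single clock delay the pipeline classes introduce.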

RecordBasedStage:

A convenience class that takes an input shape, output shape, a
"processing" function and an optional "setup" function.  Honestly
though, there's not much more effort to just... create a class
that returns a couple of Records (see ExampleAddRecordStage).

PassThroughStage:

A convenience class that takes a single function as a parameter,
that is chain-called to create the exact same input and output spec.
It has a process() function that simply returns its input.

Instances of this class are completely redundant if handed to
StageChain, however when passed to UnbufferedPipeline they
can be used to introduce a single clock delay.

ControlBase:

The base class for pipelines.  Contains previous and next ready/valid/data.
Also has an extremely useful "connect" function that can be used to
connect a chain of pipelines and present the exact same prev/next
ready/valid/data interface.

Note: pipelines basically do not become pipelines as such until
handed to a derivative of ControlBase.  ControlBase itself is *not*
strictly considered a pipeline class.  Wishbone and AXI4 (master or
slave) could be derived from ControlBase, for example.
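
The chaining idea behind "connect" can be sketched in plain Python.
This is only an illustration under stated assumptions: connect_chain is
a hypothetical helper, and strings stand in for pipeline objects.  It
lists which neighbours get wired together (earlier "next" to later
"prev") and which two ends are exposed by the enclosing object.

```python
def connect_chain(pipechain):
    # hypothetical helper: returns the (earlier, later) neighbour pairs,
    # plus the front and end of the chain that the container exposes
    assert len(pipechain) > 0, "pipechain must be non-zero length"
    links = list(zip(pipechain, pipechain[1:]))
    return links, pipechain[0], pipechain[-1]


links, front, end = connect_chain(["pipe1", "pipe2", "pipe3", "pipe4"])
single_links, single_front, single_end = connect_chain(["only"])
```

A chain of one element has no internal links: its front and end are the
same object, so the container simply presents that object's interface.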

UnbufferedPipeline:

A simple stalling clock-synchronised pipeline that has no buffering
(unlike BufferedHandshake).  Data flows on *every* clock cycle when
the conditions are right (this is nominally when the input is valid
and the output is ready).

A stall anywhere along the line will result in a stall back-propagating
down the entire chain.  The BufferedHandshake by contrast will buffer
incoming data, allowing previous stages one clock cycle's grace before
they too have to stall.

An advantage of the UnbufferedPipeline over the Buffered one is
that the amount of logic needed (number of gates) is greatly
reduced (no second set of buffers, basically).

The disadvantage of the UnbufferedPipeline is that the valid/ready
logic, if chained together, is *combinatorial*, resulting in
progressively larger gate delay.
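
The stalling behaviour can be tried out with a plain-Python cycle model.
This is a behavioural sketch only, not nmigen: data_valid and r_data
mirror the signal names used by UnbufferedPipeline in this file, while
UnbufferedModel and run_pipeline are hypothetical names, and the
producer is assumed to hold its data until it is accepted.

```python
class UnbufferedModel:
    # hypothetical cycle-level model: one data register plus a valid flag

    def __init__(self, process):
        self.process = process
        self.data_valid = False   # registered: does r_data hold valid data?
        self.r_data = None        # registered copy of the processed input

    def o_ready(self, n_i_ready):
        # combinatorial: ready if empty, or if the next stage is taking data
        return (not self.data_valid) or n_i_ready

    def tick(self, i_valid, i_data, n_i_ready):
        # one clock edge; all inputs are sampled before any state changes
        o_ready = self.o_ready(n_i_ready)
        buf_full = self.data_valid and not n_i_ready
        if i_valid and o_ready:
            self.r_data = self.process(i_data)
        self.data_valid = i_valid or buf_full


def run_pipeline(model, items, ready_pattern):
    # drive the model with a producer that holds its data until accepted
    pending, received = list(items), []
    for n_i_ready in ready_pattern:
        i_valid = bool(pending)
        i_data = pending[0] if pending else None
        o_valid, o_data = model.data_valid, model.r_data
        o_ready = model.o_ready(n_i_ready)
        model.tick(i_valid, i_data, n_i_ready)
        if o_valid and n_i_ready:      # transfer to the next stage
            received.append(o_data)
        if i_valid and o_ready:        # transfer from the previous stage
            pending.pop(0)
    return received


received = run_pipeline(UnbufferedModel(lambda x: x * 10), [1, 2, 3],
                        [True, False, True, True, True, True])
```

Note that o_ready is computed from n_i_ready in the same cycle: that is
the combinatorial ready path which grows longer as stages are chained.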

PassThroughHandshake:

A Control class that introduces a single clock delay, passing its
data through unaltered.  Unlike RegisterPipeline (which relies
on UnbufferedPipeline and PassThroughStage) it handles ready/valid
itself.

RegisterPipeline:

A convenience class: because UnbufferedPipeline introduces a single
clock delay, giving it a PassThroughStage as its stage results in a
Pipeline stage that, duh, delays its (unmodified) input by one clock cycle.

BufferedHandshake:

nmigen implementation of buffered pipeline stage, based on zipcpu:
https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

this module requires quite a bit of thought to understand how it works
(and why it is needed in the first place).  reading the above is
*strongly* recommended.

unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
the STB / ACK signals to raise and lower (on separate clocks) before
data may proceed (thus only allowing one piece of data to proceed
on *ALTERNATE* cycles), the signalling here is a true pipeline
where data will flow on *every* clock when the conditions are right.

input acceptance conditions are when:
* incoming previous-stage strobe (p.i_valid) is HIGH
* outgoing previous-stage ready  (p.o_ready) is LOW

output transmission conditions are when:
* outgoing next-stage strobe (n.o_valid) is HIGH
* outgoing next-stage ready  (n.i_ready) is LOW

the tricky bit is when the input has valid data and the output is not
ready to accept it.  if it wasn't for the clock synchronisation, it
would be possible to tell the input "hey don't send that data, we're
not ready".  unfortunately, it's not possible to "change the past":
the previous stage *has no choice* but to pass on its data.

therefore, the incoming data *must* be accepted - and stored: that
is the responsibility / contract that this stage *must* accept.
on the same clock, it's possible to tell the input that it must
not send any more data.  this is the "stall" condition.

we now effectively have *two* possible pieces of data to "choose" from:
the buffered data, and the incoming data.  the decision as to which
to process and output is based on whether we are in "stall" or not.
i.e. when the next stage is no longer ready, the output comes from
the buffer if a stall had previously occurred, otherwise it comes
direct from processing the input.

this allows us to respect a synchronous "travelling STB" with what
dan calls a "buffered handshake".

it's quite a complex state machine!
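
To make the "two pieces of data to choose from" idea concrete, here is
a generic plain-Python model of a buffered ("skid") handshake.  It is a
sketch under stated assumptions and is NOT a signal-for-signal copy of
the BufferedHandshake logic in this file: SkidBufferModel, run_pipeline
and all attribute names are hypothetical.  The key property it shows is
that "ready" towards the previous stage depends only on stored state, so
a word that is already in flight when the stall begins is stored rather
than lost.

```python
class SkidBufferModel:
    # hypothetical cycle-level model: an output register plus one spare
    # ("skid") register that catches the word already in flight on a stall

    def __init__(self, process):
        self.process = process
        self.out_valid, self.out_data = False, None   # output register
        self.buf_valid, self.buf_data = False, None   # skid register

    @property
    def o_ready(self):
        # registered ready: depends only on stored state, not on n_i_ready
        return not self.buf_valid

    def tick(self, i_valid, i_data, n_i_ready):
        accept = i_valid and self.o_ready
        stalled = self.out_valid and not n_i_ready
        if stalled:
            if accept:                      # must store it: cannot refuse
                self.buf_valid, self.buf_data = True, self.process(i_data)
        elif self.buf_valid:                # flush the buffer first
            self.out_valid, self.out_data = True, self.buf_data
            self.buf_valid = False
        elif accept:                        # straight through
            self.out_valid, self.out_data = True, self.process(i_data)
        else:
            self.out_valid = False


def run_pipeline(model, items, ready_pattern):
    # producer holds its data until accepted; consumer follows the pattern
    pending, received = list(items), []
    for n_i_ready in ready_pattern:
        i_valid = bool(pending)
        i_data = pending[0] if pending else None
        o_valid, o_data = model.out_valid, model.out_data
        o_ready = model.o_ready
        model.tick(i_valid, i_data, n_i_ready)
        if o_valid and n_i_ready:
            received.append(o_data)
        if i_valid and o_ready:
            pending.pop(0)
    return received


stalls = [True, False, False, True, True, True, True, True]
received = run_pipeline(SkidBufferModel(lambda x: x + 100),
                        [1, 2, 3, 4], stalls)
```

With the two-cycle stall in this pattern, the second word is caught by
the skid register, flushed first when the stall ends, and the remaining
words then pass straight through in order.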

Synchronised pipeline, based on:
https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Mux, Module, Elaboratable, Const
from nmigen.cli import verilog, rtlil
from nmigen.hdl.rec import Record

from nmutil.queue import Queue

from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
from nmutil import nmoperator


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """

    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn

    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class PassThroughStage(StageCls):
    """ a pass-through stage with its input data spec identical to its output,
        and "passes through" its data from input to output (does nothing).

        use this basically to explicitly make any data spec Stage-compliant.
        (many APIs would potentially use a static "wrap" method in e.g.
         StageCls to achieve a similar effect)
    """

    def __init__(self, iospecfn): self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i  # simply returns its input


class ControlBase(StageHelper, Elaboratable):
    """ Common functions for Pipeline API.  Note: a "pipeline stage" only
        exists (conceptually) when a ControlBase derivative is handed
        a Stage (combinatorial block)

        NOTE: ControlBase derives from StageHelper, making it accidentally
        compliant with the Stage API.  Using those functions directly
        *BYPASSES* a ControlBase instance ready/valid signalling, which
        clearly should not be done without a really, really good reason.
    """

    def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add i_data member to PrevControl (p) and
            * add o_data member to NextControl (n)
            Calling ControlBase._new_data is a good way to do that.
        """
        print("ControlBase", self, stage, in_multi, stage_ctl)
        StageHelper.__init__(self, stage)

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
        self.n = NextControl(stage_ctl, maskwid=maskwid)

        # set up the input and output data
        if stage is not None:
            self._new_data("data")

    def _new_data(self, name):
        """ allocates new i_data and o_data
        """
        self.p.i_data, self.n.o_data = self.new_specs(name)

    @property
    def data_r(self):
        return self.process(self.p.i_data)

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out

                    [pipe1, pipe2, pipe3, pipe4]

                     out---in out--in out---in

            Also takes care of allocating i_data/o_data, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.i_data or self.n.o_data manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects
            it is called with.

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) will have to be
            built by other means.

            Argument:

            * :pipechain: - a sequence of ControlBase-derived classes
                            (must be one or more in length)

            Returns:

            * a list of eq assignments that will need to be added in
              an elaborate() to m.d.comb
        """
        assert len(pipechain) > 0, "pipechain must be non-zero length"
        assert self.stage is None, "do not use connect with a stage"
        eqs = [] # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]                # earlier
            pipe2 = pipechain[i+1]              # later (by 1)
            eqs += pipe1.connect_to_next(pipe2) # earlier n to later p

        # connect front and back of chain to ourselves
        front = pipechain[0]                # first in chain
        end = pipechain[-1]                 # last in chain
        self.set_specs(front, end) # sets up ispec/ospec functions
        self._new_data("chain") # NOTE: REPLACES existing data
        eqs += front._connect_in(self)      # front p to our p
        eqs += end._connect_out(self)       # end n   to our n

        return eqs

    def set_input(self, i):
        """ helper function to set the input data (used in unit tests)
        """
        return nmoperator.eq(self.p.i_data, i)

    def __iter__(self):
        yield from self.p # yields ready/valid/data (data also gets yielded)
        yield from self.n # ditto

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        self.setup(m, self.p.i_data)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.i_ready)
        m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted into a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        input data p.i_data is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the non-stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall
        condition is cleared.

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_i_valid = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
                          o_n_validn.eq(~self.n.o_valid),
                          n_i_ready.eq(self.n.i_ready_test),
                          nir_por.eq(n_i_ready & self.p._o_ready),
                          nir_por_n.eq(n_i_ready & ~self.p._o_ready),
                          nir_novn.eq(n_i_ready | o_n_validn),
                          nirn_novn.eq(~n_i_ready & o_n_validn),
                          npnn.eq(nir_por | nirn_novn),
                          por_pivn.eq(self.p._o_ready & ~p_i_valid)
                         ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += nmoperator.eq(result, self.data_r)

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.o_ready): # not stalled
            self.m.d.sync += nmoperator.eq(r_data, result) # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            self.m.d.sync += [self.n.o_valid.eq(p_i_valid), # valid if p_valid
                              nmoperator.eq(self.n.o_data, o_data),
                             ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n): # not stalled
            # Flush the [already processed] buffer to the output port.
            # XXX TBD, does nothing right now
            o_data = self._postprocess(r_data)
            self.m.d.sync += [self.n.o_valid.eq(1),  # reg empty
                              nmoperator.eq(self.n.o_data, o_data), # flush
                             ]
        # output ready conditions
        self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)

        return self.m


class MaskNoDelayCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline (that does not respect "ready")

        Based on (identical behaviour to) SimpleHandshake.
        TODO: decide whether to merge *into* SimpleHandshake.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_tmp")
        m.d.comb += nmoperator.eq(result, self.data_r)

        # establish if the data should be passed on, taking cancellation
        # (stop_i) into account.
        # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
        # is NOT "normal" for the Stage API.
        p_i_valid = Signal(reset_less=True)
        #print ("self.p.i_data", self.p.i_data)
        maskedout = Signal(len(self.p.mask_i), reset_less=True)
        m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
        m.d.comb += p_i_valid.eq(maskedout.bool())

        # if idmask nonzero, mask gets passed on (and register set).
        # register is left as-is if idmask is zero, but out-mask is set to zero
        # note however: only the *uncancelled* mask bits get passed on
        m.d.sync += self.n.o_valid.eq(p_i_valid)
        m.d.sync += self.n.mask_o.eq(Mux(p_i_valid, maskedout, 0))
        with m.If(p_i_valid):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += nmoperator.eq(self.n.o_data, o_data) # update output

        # input always "ready"
        #m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
        m.d.comb += self.p._o_ready.eq(Const(1))

        # always pass on stop (as combinatorial: single signal)
        m.d.comb += self.n.stop_o.eq(self.p.stop_i)

        return self.m


class MaskCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline

        Arguments:

        * stage.  see Stage API above
        * maskwid - sets up cancellation capability (mask and stop).
        * dynamic - allows switching from sync to combinatorial (passthrough)
                    USE WITH CARE.  will need the entire pipe to be quiescent
                    before switching, otherwise data WILL be destroyed.

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
                       dynamic=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
        self.dynamic = dynamic
        if dynamic:
            self.latchmode = Signal()
        else:
            self.latchmode = Const(1)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        mask_r = Signal(len(self.p.mask_i), reset_less=True)
        data_r = _spec(self.stage.ospec, "data_r")
        m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))

        with m.If(self.latchmode):
            r_busy = Signal()
            r_latch = _spec(self.stage.ospec, "r_latch")

            # establish if the data should be passed on, taking cancellation
            # (stop_i) into account.
            p_i_valid = Signal(reset_less=True)
            #print ("self.p.i_data", self.p.i_data)
            maskedout = Signal(len(self.p.mask_i), reset_less=True)
            m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)

            # establish some combinatorial temporaries
            n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
            p_i_valid_p_o_ready = Signal(reset_less=True)
            m.d.comb += [p_i_valid.eq(self.p.i_valid_test & maskedout.bool()),
                         n_i_ready.eq(self.n.i_ready_test),
                         p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
                        ]

            # if idmask nonzero, mask gets passed on (and register set).
            # register is left as-is if idmask is zero, but out-mask is set to
            # zero
            # note however: only the *uncancelled* mask bits get passed on
            m.d.sync += mask_r.eq(Mux(p_i_valid, maskedout, 0))
            m.d.comb += self.n.mask_o.eq(mask_r)

            # always pass on stop (as combinatorial: single signal)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)

            stor = Signal(reset_less=True)
            m.d.comb += stor.eq(p_i_valid_p_o_ready | n_i_ready)
            with m.If(stor):
                # store result of processing in the latch register
                m.d.sync += nmoperator.eq(r_latch, data_r)

            # previous valid and ready
            with m.If(p_i_valid_p_o_ready):
                m.d.sync += r_busy.eq(1)      # output valid
            # previous invalid or not ready, however next is accepting
            with m.Elif(n_i_ready):
                m.d.sync += r_busy.eq(0) # ...so set output invalid

            # output set combinatorially from latch
            m.d.comb += nmoperator.eq(self.n.o_data, r_latch)

            m.d.comb += self.n.o_valid.eq(r_busy)
            # if next is ready, so is previous
            m.d.comb += self.p._o_ready.eq(n_i_ready)

        with m.Else():
            # pass everything straight through.  p connected to n: data,
            # valid, mask, everything.  this is "effectively" just a
            # StageChain: MaskCancellable is doing "nothing" except
            # combinatorially passing everything through
            # (except now it's *dynamically selectable* whether to do that)
            m.d.comb += self.n.o_valid.eq(self.p.i_valid_test)
            m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)
            m.d.comb += self.n.mask_o.eq(self.p.mask_i)
            m.d.comb += nmoperator.eq(self.n.o_data, data_r)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P

        0 0 1 0   0    0     0 1    process(i_data)
        0 0 1 1   0    0     0 1    process(i_data)

        0 1 1 0   0    0     0 1    process(i_data)
        0 1 1 1   0    0     0 1    process(i_data)

        1 0 1 0   0    0     0 1    process(i_data)
        1 0 1 1   0    0     0 1    process(i_data)

        1 1 0 0   1    0     1 0    process(i_data)
        1 1 0 1   1    1     1 0    process(i_data)
        1 1 1 0   1    0     1 1    process(i_data)
        1 1 1 1   1    0     1 1    process(i_data)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
        p_i_valid_p_o_ready = Signal(reset_less=True)
        p_i_valid = Signal(reset_less=True)
        m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
                     n_i_ready.eq(self.n.i_ready_test),
                     p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
                    ]

        # store result of processing in combinatorial temporary
        m.d.comb += nmoperator.eq(result, self.data_r)

        # previous valid and ready
        with m.If(p_i_valid_p_o_ready):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += [r_busy.eq(1),      # output valid
                         nmoperator.eq(self.n.o_data, o_data), # update output
                        ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_i_ready):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += [nmoperator.eq(self.n.o_data, o_data)]
            # TODO: could still send data here (if there was any)
            # m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0) # ...so set output invalid

        m.d.comb += self.n.o_valid.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._o_ready.eq(n_i_ready)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        Attributes:

        p.i_data : StageInput, shaped according to ispec

        n.o_data : StageOutput, shaped according to ospec

        r_data : input_shape according to ispec
            A temporary (buffered) copy of a prior (valid) input.
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Inputs  Temp  Output  Data

        1 1 0 0   0    1 1    process(i_data)
        1 1 0 1   1    1 0    process(i_data)
        1 1 1 0   0    1 1    process(i_data)
        1 1 1 1   0    1 1    process(i_data)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal() # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_i_valid = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
        m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)

        m.d.comb += self.n.o_valid.eq(data_valid)
        m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
        m.d.sync += data_valid.eq(p_i_valid | buf_full)

        # only capture new data when the input is valid and accepted
        with m.If(pv):
            m.d.sync += nmoperator.eq(r_data, self.data_r)
        o_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, o_data)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline
        stalling.

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        Attributes:

        p.i_data : StageInput, shaped according to ispec

        n.o_data : StageOutput, shaped according to ospec

        buf : output_shape according to ospec
            A temporary (buffered) copy of a valid output
            This is HELD if the output is not ready.  It is updated
            SYNCHRONOUSLY.

        Inputs   Temp   Output  Data

        P P N N  ~NiR&   N P    (buf_full)

        0 0 0 0    0     0 1    process(i_data)
        0 0 0 1    1     1 0    reg (odata, unchanged)
        0 0 1 0    0     0 1    process(i_data)
        0 0 1 1    0     0 1    process(i_data)

        0 1 0 0    0     0 1    process(i_data)
        0 1 0 1    1     1 0    reg (odata, unchanged)
        0 1 1 0    0     0 1    process(i_data)
        0 1 1 1    0     0 1    process(i_data)

        1 0 0 0    0     1 1    process(i_data)
        1 0 0 1    1     1 0    reg (odata, unchanged)
        1 0 1 0    0     1 1    process(i_data)
        1 0 1 1    0     1 1    process(i_data)

        1 1 0 0    0     1 1    process(i_data)
        1 1 0 1    1     1 0    reg (odata, unchanged)
        1 1 1 0    0     1 1    process(i_data)
        1 1 1 1    0     1 1    process(i_data)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal() # is data valid or not
        buf = _spec(self.stage.ospec, "r_tmp") # output type

        p_i_valid = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)

        m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
        m.d.comb += self.p._o_ready.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)

        o_data = Mux(buf_full, buf, self.data_r)
        o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, o_data)
        m.d.sync += nmoperator.eq(buf, self.n.o_data)

        return self.m


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ -----  ----
        P P N N  PiV& PiV| NiR| pvr   N P  (pvr)
        i o i o  PoR  ~PoR ~NoV       o o

        0 0 0 0  0    1    1     0    1 1   odata (unchanged)
        0 0 0 1  0    1    0     0    1 0   odata (unchanged)
        0 0 1 0  0    1    1     0    1 1   odata (unchanged)
        0 0 1 1  0    1    1     0    1 1   odata (unchanged)

        0 1 0 0  0    0    1     0    0 1   odata (unchanged)
        0 1 0 1  0    0    0     0    0 0   odata (unchanged)
        0 1 1 0  0    0    1     0    0 1   odata (unchanged)
        0 1 1 1  0    0    1     0    0 1   odata (unchanged)

        1 0 0 0  0    1    1     1    1 1   process(in)
        1 0 0 1  0    1    0     0    1 0   odata (unchanged)
        1 0 1 0  0    1    1     1    1 1   process(in)
        1 0 1 1  0    1    1     1    1 1   process(in)

        1 1 0 0  1    1    1     1    1 1   process(in)
        1 1 0 1  1    1    0     0    1 0   odata (unchanged)
        1 1 1 0  1    1    1     1    1 1   process(in)
        1 1 1 1  1    1    1     1    1 1   process(in)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        p_i_valid = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)

        m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
        m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, r_data)

        return m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of o_data and o_valid as an indirect byproduct
        of using PassThroughStage
    """

    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
    """

    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                       fwft=True, pipe=False):
        """ FIFO Control

            * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-thru mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle.
            This requires that the Stage be a Mealy FSM:
            https://en.wikipedia.org/wiki/Mealy_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Moore FSM:
            https://en.wikipedia.org/wiki/Moore_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp       fn  temp  fp   self.n
            i_data->process()->result->cat->din.FIFO.dout->cat(o_data)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
            function
        """
        # elaborate() reads these three attributes back
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the o_data.
        (fwidth, _) = nmoperator.shape(self.n.o_data)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(i_data):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(i_data))
            return nmoperator.cat(result)

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.i_valid, fp._o_ready, fp.i_data = fifo.w_en, fifo.w_rdy, fifo.w_data
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.o_valid, fn.i_ready, fn.o_data = fifo.r_rdy, fifo.r_en, fifo.r_data
        connections = fn._connect_out(self.n, fn=nmoperator.cat)
        valid_eq, ready_eq, o_data = connections

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            # combinatorial on next ready/valid
            m.d.comb += [valid_eq, ready_eq]
        else:
            m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
        o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
        m.d.comb += o_data

        return m


class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)


# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=True)


# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)


"""
# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                                   fwft=True, pipe=False)
"""
1017 fwft=True, pipe=False)