# SPDX-License-Identifier: LGPL-3-or-later
""" Pipeline API.  For multi-input and multi-output variants, see multipipe.

    This work is funded through NLnet under Grant 2019-02-012

    Associated development bugs:
    * http://bugs.libre-riscv.org/show_bug.cgi?id=148
    * http://bugs.libre-riscv.org/show_bug.cgi?id=64
    * http://bugs.libre-riscv.org/show_bug.cgi?id=57

    Important: see Stage API (stageapi.py) and IO Control API
    (iocontrol.py) in combination with below.  This module
    "combines" the Stage API with the IO Control API to create

    The one critically important key difference between StageAPI and

    * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
    * PipelineAPI: synchronous registers / latches get added here
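
    The split listed above can be illustrated without nmigen. This is a
    minimal plain-Python sketch, not code from this module: add_one and
    RegisteredPipe are hypothetical names, and the "register" is modelled
    as an ordinary attribute that is updated once per call to clock().

```python
def add_one(i):
    # Stage-style logic: purely combinatorial, no stored state
    return i + 1


class RegisteredPipe:
    # Pipeline-style wrapper (hypothetical name): the same logic,
    # followed by one clocked register
    def __init__(self, process, reset=0):
        self.process = process
        self.reg = reset  # models a synchronous register

    def clock(self, i):
        out = self.reg              # value latched on the previous edge
        self.reg = self.process(i)  # latch the new result on this edge
        return out


pipe = RegisteredPipe(add_one)
outputs = [pipe.clock(i) for i in (10, 20, 30)]
print(outputs)  # [0, 11, 21]: each result appears one clock late
```

    The stage function gives its answer immediately; the wrapped version
    gives the same answers delayed by one clock, which is the only thing
    the register adds in this model.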

    A convenience class that takes an input shape, output shape, a
    "processing" function and an optional "setup" function.  Honestly
    though, there's not much more effort to just... create a class
    that returns a couple of Records (see ExampleAddRecordStage in

    A convenience class that takes a single function as a parameter,
    that is chain-called to create the exact same input and output spec.
    It has a process() function that simply returns its input.

    Instances of this class are completely redundant if handed to
    StageChain, however when passed to UnbufferedPipeline they
    can be used to introduce a single clock delay.
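
    Why a pass-through is redundant inside a purely combinatorial chain
    can be shown with a small plain-Python sketch. PassThroughSketch,
    Doubler and chain are hypothetical stand-ins written for this
    illustration only; they imitate the ispec/ospec/process shape
    described above and do not use nmigen.

```python
class PassThroughSketch:
    # hypothetical stand-in for PassThroughStage (plain Python, no nmigen)
    def __init__(self, iospecfn):
        self.iospecfn = iospecfn

    def ispec(self):
        return self.iospecfn()

    def ospec(self):
        return self.iospecfn()

    def process(self, i):
        return i  # data passes through unaltered


class Doubler:
    # hypothetical example stage doing some real work
    def process(self, i):
        return i * 2


def chain(stages, value):
    # StageChain-like behaviour: apply each process() in sequence
    for stage in stages:
        value = stage.process(value)
    return value


passthru = PassThroughSketch(lambda: {'data': 0})
with_passthru = chain([passthru, Doubler(), passthru], 21)
without_passthru = chain([Doubler()], 21)
```

    Both chains compute the same value, so in a combinatorial chain the
    pass-through contributes nothing. It only becomes useful once a
    pipeline class places a register after it.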

    The base class for pipelines.  Contains previous and next ready/valid/data.
    Also has an extremely useful "connect" function that can be used to
    connect a chain of pipelines and present the exact same prev/next

    Note: pipelines basically do not become pipelines as such until
    handed to a derivative of ControlBase.  ControlBase itself is *not*
    strictly considered a pipeline class.  Wishbone and AXI4 (master or
    slave) could be derived from ControlBase, for example.

    A simple stalling clock-synchronised pipeline that has no buffering
    (unlike BufferedHandshake).  Data flows on *every* clock cycle when
    the conditions are right (this is nominally when the input is valid
    and the output is ready).

    A stall anywhere along the line will result in a stall back-propagating
    down the entire chain.  The BufferedHandshake by contrast will buffer
    incoming data, allowing previous stages one clock cycle's grace before

    An advantage of the UnbufferedPipeline over the Buffered one is
    that the amount of logic needed (number of gates) is greatly
    reduced (no second set of buffers basically).

    The disadvantage of the UnbufferedPipeline is that the valid/ready
    logic, if chained together, is *combinatorial*, resulting in
    progressively larger gate delay.
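
    The back-propagation of a stall can be sketched in plain Python.
    This is a simplified model, not the module's logic verbatim: it
    assumes each stage computes its ready as "I hold no valid data, OR
    the next stage is ready", and chain_o_ready is a hypothetical helper
    name. The loop runs from the last stage back to the first, which
    mirrors how the ready signal has to ripple through every stage
    within one clock cycle.

```python
def chain_o_ready(data_valid, sink_ready):
    # data_valid[k] is True when stage k currently holds valid data.
    # Returns each stage's p.o_ready, computed from the last stage back
    # to the first, all within the same clock cycle.
    ready = sink_ready
    o_ready = [False] * len(data_valid)
    for k in reversed(range(len(data_valid))):
        ready = (not data_valid[k]) or ready
        o_ready[k] = ready
    return o_ready


full_chain = chain_o_ready([True, True, True, True], sink_ready=False)
with_bubble = chain_o_ready([True, True, False, True], sink_ready=False)
```

    With every stage full, a not-ready sink stalls the whole chain. An
    empty stage (a "bubble") absorbs the stall for the stages before it.
    In both cases the first stage's ready depends on a chain of logic
    whose length grows with the number of stages.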

    A Control class that introduces a single clock delay, passing its
    data through unaltered.  Unlike RegisterPipeline (which relies
    on UnbufferedPipeline and PassThroughStage) it handles ready/valid

    A convenience class: because UnbufferedPipeline introduces a single
    clock delay, giving it a PassThroughStage as its stage results in a
    Pipeline stage that, duh, delays its (unmodified) input by one clock
    cycle.

    nmigen implementation of buffered pipeline stage, based on zipcpu:
    https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

    this module requires quite a bit of thought to understand how it works
    (and why it is needed in the first place).  reading the above is
    *strongly* recommended.

    unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
    the STB / ACK signals to raise and lower (on separate clocks) before
    data may proceed (thus only allowing one piece of data to proceed
    on *ALTERNATE* cycles), the signalling here is a true pipeline
    where data will flow on *every* clock when the conditions are right.

    input acceptance conditions are when:
        * incoming previous-stage strobe (p.i_valid) is HIGH
        * outgoing previous-stage ready  (p.o_ready) is HIGH

    output transmission conditions are when:
        * outgoing next-stage strobe (n.o_valid) is HIGH
        * incoming next-stage ready  (n.i_ready) is HIGH
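
    As a minimal plain-Python sketch of these two conditions, assuming
    the usual ready/valid convention in which a transfer takes place on
    a clock where valid and ready are both high (this matches the
    "previous valid and ready" term p_i_valid & p.o_ready used in the
    code further down). The function names are hypothetical.

```python
def input_accepted(p_i_valid, p_o_ready):
    # data moves from the previous stage into this one
    return bool(p_i_valid and p_o_ready)


def output_transmitted(n_o_valid, n_i_ready):
    # data moves from this stage to the next one
    return bool(n_o_valid and n_i_ready)


def count_transfers(trace):
    # trace: one (valid, ready) pair per clock cycle
    return sum(1 for valid, ready in trace if valid and ready)


trace = [(1, 1), (1, 0), (1, 1), (0, 1), (1, 1)]
transfers = count_transfers(trace)
```

    In the example trace, data moves on three of the five clocks: the
    second clock is a stall (valid but not ready) and the fourth is idle
    (ready but nothing valid).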

    the tricky bit is when the input has valid data and the output is not
    ready to accept it.  if it wasn't for the clock synchronisation, it
    would be possible to tell the input "hey don't send that data, we're
    not ready".  unfortunately, it's not possible to "change the past":
    the previous stage *has no choice* but to pass on its data.

    therefore, the incoming data *must* be accepted - and stored: that
    is the responsibility / contract that this stage *must* accept.
    on the same clock, it's possible to tell the input that it must
    not send any more data.  this is the "stall" condition.

    we now effectively have *two* possible pieces of data to "choose" from:
    the buffered data, and the incoming data.  the decision as to which
    to process and output is based on whether we are in "stall" or not.
    i.e. when the next stage is no longer ready, the output comes from
    the buffer if a stall had previously occurred, otherwise it comes
    direct from processing the input.
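
    The choice between buffered and incoming data can be sketched as a
    cycle-by-cycle simulation in plain Python. This is a simplified
    textbook-style "skid buffer" model written for illustration, under
    the assumption of one output register plus one extra buffer; it is
    not a transcription of the signal equations used by the class in
    this module, and simulate and its variable names are hypothetical.

```python
def simulate(items, ready_pattern, process=lambda x: x):
    # items: data offered by the previous stage, in order
    # ready_pattern: n.i_ready from the next stage, one entry per clock
    pending = list(items)
    out_valid, out_data = False, None    # output register
    skid_valid, skid_data = False, None  # the extra (stall) buffer
    received, skid_uses = [], 0

    for n_i_ready in ready_pattern:
        p_i_valid = bool(pending)
        o_ready = not skid_valid         # registered: depends only on state
        accept_in = p_i_valid and o_ready
        send_out = out_valid and n_i_ready

        if send_out:
            received.append(out_data)

        if send_out or not out_valid:
            # output register is free: refill from buffer first, else input
            if skid_valid:
                out_valid, out_data = True, skid_data
                skid_valid = False
            elif accept_in:
                out_valid, out_data = True, process(pending[0])
            else:
                out_valid = False
        elif accept_in:
            # stalled, but the input was already told "ready": store it
            skid_valid, skid_data = True, process(pending[0])
            skid_uses += 1

        if accept_in:
            pending.pop(0)

    return received, skid_uses


received, skid_uses = simulate([1, 2, 3, 4], [1, 1, 0, 0, 1, 1, 1, 1, 1])
```

    In this run the next stage drops ready for two clocks. The item that
    arrives on the first stalled clock goes into the extra buffer, the
    input is held off on the following clock, and once the stall clears
    the buffered item is sent before any new input. Nothing is lost or
    reordered.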

    this allows us to respect a synchronous "travelling STB" with what
    dan calls a "buffered handshake".

    it's quite a complex state machine!

    Synchronised pipeline, based on:
    https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
"""

from nmigen import Signal, Mux, Module, Elaboratable, Const
from nmigen.cli import verilog, rtlil
from nmigen.hdl.rec import Record

from nmutil.queue import Queue

from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
from nmutil import nmoperator


class RecordBasedStage(Stage):
    """ convenience class which provides a Records-based layout.
        honestly it's a lot easier just to create a direct Records-based
        class (see ExampleAddRecordStage)
    """

    def __init__(self, in_shape, out_shape, processfn, setupfn=None):
        self.in_shape = in_shape
        self.out_shape = out_shape
        self.__process = processfn
        self.__setup = setupfn

    def ispec(self): return Record(self.in_shape)
    def ospec(self): return Record(self.out_shape)
    def process(self, i): return self.__process(i)
    def setup(self, m, i): return self.__setup(m, i)


class PassThroughStage(StageCls):
    """ a pass-through stage with its input data spec identical to its output,
        and "passes through" its data from input to output (does nothing).

        use this basically to explicitly make any data spec Stage-compliant.
        (many APIs would potentially use a static "wrap" method in e.g.
         StageCls to achieve a similar effect)
    """

    def __init__(self, iospecfn): self.iospecfn = iospecfn
    def ispec(self): return self.iospecfn()
    def ospec(self): return self.iospecfn()
    def process(self, i): return i


class ControlBase(StageHelper, Elaboratable):
    """ Common functions for Pipeline API.  Note: a "pipeline stage" only
        exists (conceptually) when a ControlBase derivative is handed
        a Stage (combinatorial block)

        NOTE: ControlBase derives from StageHelper, making it accidentally
        compliant with the Stage API.  Using those functions directly
        *BYPASSES* a ControlBase instance's ready/valid signalling, which
        clearly should not be done without a really, really good reason.
    """

    def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
        """ Base class containing ready/valid/data to previous and next stages

            * p: contains ready/valid to the previous stage
            * n: contains ready/valid to the next stage

            Except when calling ControlBase.connect(), user must also:
            * add i_data member to PrevControl (p) and
            * add o_data member to NextControl (n)
            Calling ControlBase._new_data is a good way to do that.
        """
        print("ControlBase", self, stage, in_multi, stage_ctl)
        StageHelper.__init__(self, stage)

        # set up input and output IO ACK (prev/next ready/valid)
        self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
        self.n = NextControl(stage_ctl, maskwid=maskwid)

        # set up the input and output data
        if stage is not None:
            self._new_data("data")

    def _new_data(self, name):
        """ allocates new i_data and o_data
        """
        self.p.i_data, self.n.o_data = self.new_specs(name)

    @property
    def data_r(self):
        return self.process(self.p.i_data)

    def connect_to_next(self, nxt):
        """ helper function to connect to the next stage data/valid/ready.
        """
        return self.n.connect_to_next(nxt.p)

    def _connect_in(self, prev):
        """ internal helper function to connect stage to an input source.
            do not use to connect stage-to-stage!
        """
        return self.p._connect_in(prev.p)

    def _connect_out(self, nxt):
        """ internal helper function to connect stage to an output source.
            do not use to connect stage-to-stage!
        """
        return self.n._connect_out(nxt.n)

    def connect(self, pipechain):
        """ connects a chain (list) of Pipeline instances together and
            links them to this ControlBase instance:

                      in <----> self <---> out
                     [pipe1, pipe2, pipe3, pipe4]
                      out---in out--in out---in

            Also takes care of allocating i_data/o_data, by looking up
            the data spec for each end of the pipechain.  i.e. it is NOT
            necessary to allocate self.p.i_data or self.n.o_data manually:
            this is handled AUTOMATICALLY, here.

            Basically this function is the direct equivalent of StageChain,
            except that unlike StageChain, the Pipeline logic is followed.

            Just as StageChain presents an object that conforms to the
            Stage API from a list of objects that also conform to the
            Stage API, an object that calls this Pipeline connect function
            has the exact same pipeline API as the list of pipeline objects

            Thus it becomes possible to build up larger chains recursively.
            More complex chains (multi-input, multi-output) will have to be

            * :pipechain: - a sequence of ControlBase-derived classes
                            (must be one or more in length)

            * a list of eq assignments that will need to be added in
              an elaborate() to m.d.comb
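
            The wiring order can be sketched in plain Python, with names
            standing in for pipeline objects. connect_plan is a
            hypothetical helper written only for this illustration: it
            lists which port gets joined to which, in the same order as
            the code below (inter-chain links first, then the front of
            the chain to our p, then the end of the chain to our n). It
            produces no nmigen assignments.

```python
def connect_plan(owner, pipechain):
    # returns (port, port) name pairs in the order they are connected
    assert len(pipechain) > 0, 'pipechain must be non-zero length'
    plan = []
    for earlier, later in zip(pipechain, pipechain[1:]):
        plan.append((earlier + '.n', later + '.p'))   # earlier n to later p
    plan.append((owner + '.p', pipechain[0] + '.p'))   # our p to front p
    plan.append((pipechain[-1] + '.n', owner + '.n'))  # end n to our n
    return plan


plan = connect_plan('self', ['pipe1', 'pipe2', 'pipe3'])
```

            With a single-entry chain there are no inter-chain links, so
            only the two end connections remain.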
        """
        assert len(pipechain) > 0, "pipechain must be non-zero length"
        assert self.stage is None, "do not use connect with a stage"
        eqs = []  # collated list of assignment statements

        # connect inter-chain
        for i in range(len(pipechain)-1):
            pipe1 = pipechain[i]                 # earlier
            pipe2 = pipechain[i+1]               # later (by 1)
            eqs += pipe1.connect_to_next(pipe2)  # earlier n to later p

        # connect front and back of chain to ourselves
        front = pipechain[0]                # first in chain
        end = pipechain[-1]                 # last in chain
        self.set_specs(front, end)  # sets up ispec/ospec functions
        self._new_data("chain")  # NOTE: REPLACES existing data
        eqs += front._connect_in(self)      # front p to our p
        eqs += end._connect_out(self)       # end n to our n

        return eqs

    def set_input(self, i):
        """ helper function to set the input data (used in unit tests)
        """
        return nmoperator.eq(self.p.i_data, i)

    def __iter__(self):
        yield from self.p  # yields ready/valid/data (data also gets yielded)
        yield from self.n  # ditto

    def elaborate(self, platform):
        """ handles case where stage has dynamic ready/valid functions
        """
        m = Module()
        m.submodules.p = self.p
        m.submodules.n = self.n

        self.setup(m, self.p.i_data)

        if not self.p.stage_ctl:
            return m

        # intercept the previous (outgoing) "ready", combine with stage ready
        m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)

        # intercept the next (incoming) "ready" and combine it with data valid
        sdv = self.stage.d_valid(self.n.i_ready)
        m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)

        return m


class BufferedHandshake(ControlBase):
    """ buffered pipeline stage.  data and strobe signals travel in sync.
        if ever the input is ready and the output is not, processed data
        is shunted in a temporary register.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        input data p.i_data is read (only), is processed and goes into an
        intermediate result store [process()].  this is updated combinatorially.

        in a non-stall condition, the intermediate result will go into the
        output (update_output).  however if ever there is a stall, it goes
        into r_data instead [update_buffer()].

        when the non-stall condition is released, r_data is the first
        to be transferred to the output [flush_buffer()], and the stall

        on the next cycle (as long as stall is not raised again) the
        input may begin to be processed and transferred directly to output.
    """

    def elaborate(self, platform):
        self.m = ControlBase.elaborate(self, platform)

        result = _spec(self.stage.ospec, "r_tmp")
        r_data = _spec(self.stage.ospec, "r_data")

        # establish some combinatorial temporaries
        o_n_validn = Signal(reset_less=True)
        n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
        nir_por = Signal(reset_less=True)
        nir_por_n = Signal(reset_less=True)
        p_i_valid = Signal(reset_less=True)
        nir_novn = Signal(reset_less=True)
        nirn_novn = Signal(reset_less=True)
        por_pivn = Signal(reset_less=True)
        npnn = Signal(reset_less=True)
        self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
                          o_n_validn.eq(~self.n.o_valid),
                          n_i_ready.eq(self.n.i_ready_test),
                          nir_por.eq(n_i_ready & self.p._o_ready),
                          nir_por_n.eq(n_i_ready & ~self.p._o_ready),
                          nir_novn.eq(n_i_ready | o_n_validn),
                          nirn_novn.eq(~n_i_ready & o_n_validn),
                          npnn.eq(nir_por | nirn_novn),
                          por_pivn.eq(self.p._o_ready & ~p_i_valid)
                          ]

        # store result of processing in combinatorial temporary
        self.m.d.comb += nmoperator.eq(result, self.data_r)

        # if not in stall condition, update the temporary register
        with self.m.If(self.p.o_ready):  # not stalled
            self.m.d.sync += nmoperator.eq(r_data, result)  # update buffer

        # data pass-through conditions
        with self.m.If(npnn):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            self.m.d.sync += [self.n.o_valid.eq(p_i_valid),  # valid if p_valid
                              nmoperator.eq(self.n.o_data, o_data),
                              ]
        # buffer flush conditions (NOTE: can override data passthru conditions)
        with self.m.If(nir_por_n):  # not stalled
            # Flush the [already processed] buffer to the output port.
            # XXX TBD, does nothing right now
            o_data = self._postprocess(r_data)
            self.m.d.sync += [self.n.o_valid.eq(1),  # reg empty
                              nmoperator.eq(self.n.o_data, o_data),  # flush
                              ]
        # output ready conditions
        self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)

        return self.m


class MaskNoDelayCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline (that does not respect "ready")

        Based on (identical behaviour to) SimpleHandshake.
        TODO: decide whether to merge *into* SimpleHandshake.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # store result of processing in combinatorial temporary
        result = _spec(self.stage.ospec, "r_tmp")
        m.d.comb += nmoperator.eq(result, self.data_r)

        # establish if the data should be passed on.  cancellation is
        # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
        # is NOT "normal" for the Stage API.
        p_i_valid = Signal(reset_less=True)
        #print ("self.p.i_data", self.p.i_data)
        maskedout = Signal(len(self.p.mask_i), reset_less=True)
        m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
        m.d.comb += p_i_valid.eq(maskedout.bool())

        # if idmask nonzero, mask gets passed on (and register set).
        # register is left as-is if idmask is zero, but out-mask is set to zero
        # note however: only the *uncancelled* mask bits get passed on
        m.d.sync += self.n.o_valid.eq(p_i_valid)
        m.d.sync += self.n.mask_o.eq(Mux(p_i_valid, maskedout, 0))
        with m.If(p_i_valid):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += nmoperator.eq(self.n.o_data, o_data)  # update output

        # input always "ready"
        #m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
        m.d.comb += self.p._o_ready.eq(Const(1))

        # always pass on stop (as combinatorial: single signal)
        m.d.comb += self.n.stop_o.eq(self.p.stop_i)

        return self.m


class MaskCancellable(ControlBase):
    """ Mask-activated Cancellable pipeline

        * stage.  see Stage API above
        * maskwid - sets up cancellation capability (mask and stop).
        * dynamic - allows switching from sync to combinatorial (passthrough)
                    USE WITH CARE.  will need the entire pipe to be quiescent
                    before switching, otherwise data WILL be destroyed.

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1
    """

    def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
                 dynamic=False):
        ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
        self.dynamic = dynamic
        if dynamic:
            self.latchmode = Signal()
        else:
            self.latchmode = Const(1)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        mask_r = Signal(len(self.p.mask_i), reset_less=True)
        data_r = _spec(self.stage.ospec, "data_r")
        m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))

        with m.If(self.latchmode):
            r_busy = Signal()
            r_latch = _spec(self.stage.ospec, "r_latch")

            # establish if the data should be passed on.  cancellation is
            p_i_valid = Signal(reset_less=True)
            #print ("self.p.i_data", self.p.i_data)
            maskedout = Signal(len(self.p.mask_i), reset_less=True)
            m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)

            # establish some combinatorial temporaries
            n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
            p_i_valid_p_o_ready = Signal(reset_less=True)
            m.d.comb += [p_i_valid.eq(self.p.i_valid_test & maskedout.bool()),
                         n_i_ready.eq(self.n.i_ready_test),
                         p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
                         ]

            # if idmask nonzero, mask gets passed on (and register set).
            # register is left as-is if idmask is zero, but out-mask is set to
            # zero
            # note however: only the *uncancelled* mask bits get passed on
            m.d.sync += mask_r.eq(Mux(p_i_valid, maskedout, 0))
            m.d.comb += self.n.mask_o.eq(mask_r)

            # always pass on stop (as combinatorial: single signal)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)

            stor = Signal(reset_less=True)
            m.d.comb += stor.eq(p_i_valid_p_o_ready | n_i_ready)
            with m.If(stor):
                # store result of processing in combinatorial temporary
                m.d.sync += nmoperator.eq(r_latch, data_r)

            # previous valid and ready
            with m.If(p_i_valid_p_o_ready):
                m.d.sync += r_busy.eq(1)  # output valid
            # previous invalid or not ready, however next is accepting
            with m.Elif(n_i_ready):
                m.d.sync += r_busy.eq(0)  # ...so set output invalid

            # output set combinatorially from latch
            m.d.comb += nmoperator.eq(self.n.o_data, r_latch)

            m.d.comb += self.n.o_valid.eq(r_busy)
            # if next is ready, so is previous
            m.d.comb += self.p._o_ready.eq(n_i_ready)

        with m.Else():
            # pass everything straight through.  p connected to n: data,
            # valid, mask, everything.  this is "effectively" just a
            # StageChain: MaskCancellable is doing "nothing" except
            # combinatorially passing everything through
            # (except now it's *dynamically selectable* whether to do that)
            m.d.comb += self.n.o_valid.eq(self.p.i_valid_test)
            m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
            m.d.comb += self.n.stop_o.eq(self.p.stop_i)
            m.d.comb += self.n.mask_o.eq(self.p.mask_i)
            m.d.comb += nmoperator.eq(self.n.o_data, data_r)

        return self.m


class SimpleHandshake(ControlBase):
    """ simple handshake control.  data and strobe signals travel in sync.
        implements the protocol used by Wishbone and AXI4.

        Argument: stage.  see Stage API above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        Inputs   Temporary  Output Data
        -------  ---------- -----  ----
        P P N N  PiV& ~NiR&  N P

        0 0 1 0   0    0     0 1    process(i_data)
        0 0 1 1   0    0     0 1    process(i_data)

        0 1 1 0   0    0     0 1    process(i_data)
        0 1 1 1   0    0     0 1    process(i_data)

        1 0 1 0   0    0     0 1    process(i_data)
        1 0 1 1   0    0     0 1    process(i_data)

        1 1 0 0   1    0     1 0    process(i_data)
        1 1 0 1   1    1     1 0    process(i_data)
        1 1 1 0   1    0     1 1    process(i_data)
        1 1 1 1   1    0     1 1    process(i_data)
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_busy = Signal()
        result = _spec(self.stage.ospec, "r_tmp")

        # establish some combinatorial temporaries
        n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
        p_i_valid_p_o_ready = Signal(reset_less=True)
        p_i_valid = Signal(reset_less=True)
        m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
                     n_i_ready.eq(self.n.i_ready_test),
                     p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
                     ]

        # store result of processing in combinatorial temporary
        m.d.comb += nmoperator.eq(result, self.data_r)

        # previous valid and ready
        with m.If(p_i_valid_p_o_ready):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += [r_busy.eq(1),  # output valid
                         nmoperator.eq(self.n.o_data, o_data),  # update output
                         ]
        # previous invalid or not ready, however next is accepting
        with m.Elif(n_i_ready):
            # XXX TBD, does nothing right now
            o_data = self._postprocess(result)
            m.d.sync += [nmoperator.eq(self.n.o_data, o_data)]
            # TODO: could still send data here (if there was any)
            # m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
            m.d.sync += r_busy.eq(0)  # ...so set output invalid

        m.d.comb += self.n.o_valid.eq(r_busy)
        # if next is ready, so is previous
        m.d.comb += self.p._o_ready.eq(n_i_ready)

        return self.m


class UnbufferedPipeline(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        p.i_data : StageInput, shaped according to ispec
        n.o_data : StageOutput, shaped according to ospec
        r_data : input_shape according to ispec
            A temporary (buffered) copy of a prior (valid) input.
            This is HELD if the output is not ready.  It is updated
        result: output_shape according to ospec
            The output of the combinatorial logic.  it is updated
            COMBINATORIALLY (no clock dependence).

        Inputs  Temp  Output  Data
        1 1 0 0   0    1 1    process(i_data)
        1 1 0 1   1    1 0    process(i_data)
        1 1 1 0   0    1 1    process(i_data)
        1 1 1 1   0    1 1    process(i_data)

        Note: PoR is *NOT* involved in the above decision-making.
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        data_valid = Signal()  # is data valid or not
        r_data = _spec(self.stage.ospec, "r_tmp")  # output type

        p_i_valid = Signal(reset_less=True)
        pv = Signal(reset_less=True)
        buf_full = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
        m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)

        m.d.comb += self.n.o_valid.eq(data_valid)
        m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
        m.d.sync += data_valid.eq(p_i_valid | buf_full)

        with m.If(pv):
            m.d.sync += nmoperator.eq(r_data, self.data_r)
        o_data = self._postprocess(r_data)  # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, o_data)

        return self.m


class UnbufferedPipeline2(ControlBase):
    """ A simple pipeline stage with single-clock synchronisation
        and two-way valid/ready synchronised signalling.

        Note that a stall in one stage will result in the entire pipeline

        Also that unlike BufferedHandshake, the valid/ready signalling does NOT
        travel synchronously with the data: the valid/ready signalling
        combines in a *combinatorial* fashion.  Therefore, a long pipeline
        chain will lengthen propagation delays.

        Argument: stage.  see Stage API, above

        stage-1   p.i_valid >>in   stage   n.o_valid out>>   stage+1
        stage-1   p.o_ready <<out  stage   n.i_ready <<in    stage+1
        stage-1   p.i_data  >>in   stage   n.o_data  out>>   stage+1

        p.i_data : StageInput, shaped according to ispec
        n.o_data : StageOutput, shaped according to ospec
        buf : output_shape according to ospec
            A temporary (buffered) copy of a valid output
            This is HELD if the output is not ready.  It is updated

        Inputs  Temp  Output  Data
        P P N N ~NiR&  N P    (buf_full)

        0 0 0 0   0    0 1    process(i_data)
        0 0 0 1   1    1 0    reg (odata, unchanged)
        0 0 1 0   0    0 1    process(i_data)
        0 0 1 1   0    0 1    process(i_data)

        0 1 0 0   0    0 1    process(i_data)
        0 1 0 1   1    1 0    reg (odata, unchanged)
        0 1 1 0   0    0 1    process(i_data)
        0 1 1 1   0    0 1    process(i_data)

        1 0 0 0   0    1 1    process(i_data)
        1 0 0 1   1    1 0    reg (odata, unchanged)
        1 0 1 0   0    1 1    process(i_data)
        1 0 1 1   0    1 1    process(i_data)

        1 1 0 0   0    1 1    process(i_data)
        1 1 0 1   1    1 0    reg (odata, unchanged)
        1 1 1 0   0    1 1    process(i_data)
        1 1 1 1   0    1 1    process(i_data)

        Note: PoR is *NOT* involved in the above decision-making.
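
        The rows above can be regenerated with a few lines of plain
        Python. This is a sketch that assumes the column meanings are,
        in order: PiV, PoR, NiR, NoV as inputs, then the temporary
        (~NiR & NoV), then the two outputs (valid to the next stage,
        ready to the previous stage). The function name row is
        hypothetical.

```python
from itertools import product


def row(p_i_valid, p_o_ready, n_i_ready, n_o_valid):
    # p_o_ready is accepted but deliberately unused (see the Note above)
    buf_full = int((not n_i_ready) and n_o_valid)  # temporary: ~NiR & NoV
    o_valid = int(buf_full or p_i_valid)           # output to next stage
    o_ready = int(not buf_full)                    # output to previous stage
    data = 'reg (odata, unchanged)' if buf_full else 'process(i_data)'
    return buf_full, o_valid, o_ready, data


table = {bits: row(*bits) for bits in product((0, 1), repeat=4)}
for bits in sorted(table):
    print(*bits, *table[bits])
```

        Flipping the PoR input never changes a row, which is the point
        made by the note above.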
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        buf_full = Signal()  # is data valid or not
        buf = _spec(self.stage.ospec, "r_tmp")  # output type

        p_i_valid = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)

        m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
        m.d.comb += self.p._o_ready.eq(~buf_full)
        m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)

        o_data = Mux(buf_full, buf, self.data_r)
        o_data = self._postprocess(o_data)  # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, o_data)
        m.d.sync += nmoperator.eq(buf, self.n.o_data)

        return self.m


class PassThroughHandshake(ControlBase):
    """ A control block that delays by one clock cycle.

        Inputs   Temporary          Output Data
        -------  ------------------ -----  ----
        P P N N  PiV& PiV| NiR| pvr  N P   (pvr)
        i o i o  PoR  ~PoR ~NoV      o o

        0 0 0 0   0    1    1    0   1 1   odata (unchanged)
        0 0 0 1   0    1    0    0   1 0   odata (unchanged)
        0 0 1 0   0    1    1    0   1 1   odata (unchanged)
        0 0 1 1   0    1    1    0   1 1   odata (unchanged)

        0 1 0 0   0    0    1    0   0 1   odata (unchanged)
        0 1 0 1   0    0    0    0   0 0   odata (unchanged)
        0 1 1 0   0    0    1    0   0 1   odata (unchanged)
        0 1 1 1   0    0    1    0   0 1   odata (unchanged)

        1 0 0 0   0    1    1    1   1 1   process(in)
        1 0 0 1   0    1    0    0   1 0   odata (unchanged)
        1 0 1 0   0    1    1    1   1 1   process(in)
        1 0 1 1   0    1    1    1   1 1   process(in)

        1 1 0 0   1    1    1    1   1 1   process(in)
        1 1 0 1   1    1    0    0   1 0   odata (unchanged)
        1 1 1 0   1    1    1    1   1 1   process(in)
        1 1 1 1   1    1    1    1   1 1   process(in)
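
        The table can be reproduced in plain Python as a cross-check.
        This sketch assumes the columns are, in order: the four inputs
        (PiV, PoR, NiR, NoV), the temporaries PiV&PoR, PiV|~PoR,
        NiR|~NoV and pvr, then the two outputs. It also assumes, from
        the rows shown, that pvr is PiV ANDed with the newly computed
        ready (NiR|~NoV). The function name row is hypothetical.

```python
from itertools import product


def row(p_i_valid, p_o_ready, n_i_ready, n_o_valid):
    piv_and_por = p_i_valid & p_o_ready
    piv_or_not_por = p_i_valid | (1 - p_o_ready)
    nir_or_not_nov = n_i_ready | (1 - n_o_valid)
    pvr = p_i_valid & nir_or_not_nov  # valid AND the newly computed ready
    data = 'process(in)' if pvr else 'odata (unchanged)'
    # the last two numeric columns are the outputs: n.o_valid, p.o_ready
    return (piv_and_por, piv_or_not_por, nir_or_not_nov, pvr,
            piv_or_not_por, nir_or_not_nov, data)


table = {bits: row(*bits) for bits in product((0, 1), repeat=4)}
for bits in sorted(table):
    print(*bits, *table[bits])
```

        New data is only latched on rows where pvr is 1; on all other
        rows the output register keeps its previous contents.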
    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp")  # output type

        p_i_valid = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)

        m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
        m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data)  # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, r_data)

        return m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of o_data and o_valid as an indirect byproduct
        of using PassThroughStage
    """

    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
    """

    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                 fwft=True, pipe=False):
        """ * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-thru mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle.
            This requires that the Stage be a Moore FSM:
            https://en.wikipedia.org/wiki/Moore_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Mealy FSM:
            https://en.wikipedia.org/wiki/Mealy_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fp   self.n
            i_data->process()->result->cat->din.FIFO.dout->cat(o_data)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
        """
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the o_data.
        (fwidth, _) = nmoperator.shape(self.n.o_data)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(i_data):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(i_data))
            return nmoperator.cat(result)

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.i_valid, fp._o_ready, fp.i_data = fifo.w_en, fifo.w_rdy, fifo.w_data
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.o_valid, fn.i_ready, fn.o_data = fifo.r_rdy, fifo.r_en, fifo.r_data
        connections = fn._connect_out(self.n, fn=nmoperator.cat)
        valid_eq, ready_eq, o_data = connections

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            # combinatorial on next ready/valid
            m.d.comb += [valid_eq, ready_eq]
        else:
            m.d.sync += [valid_eq, ready_eq]  # non-fwft mode needs sync
        o_data = self._postprocess(o_data)  # XXX TBD, does nothing right now
        m.d.comb += o_data

        return m


class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)


# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=True)


# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)


# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)