3fae364960e19e8553bbdef34a9048653ab00b00
[nmutil.git] / src / nmutil / singlepipe.py
1 """ Pipeline API. For multi-input and multi-output variants, see multipipe.
2
3 This work is funded through NLnet under Grant 2019-02-012
4
5 License: LGPLv3+
6
7
8 Associated development bugs:
9 * http://bugs.libre-riscv.org/show_bug.cgi?id=148
10 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
11 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
12
13 Important: see Stage API (stageapi.py) and IO Control API
14 (iocontrol.py) in combination with below. This module
15 "combines" the Stage API with the IO Control API to create
16 the Pipeline API.
17
18 The one critically important key difference between StageAPI and
19 PipelineAPI:
20
21 * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
22 * PipelineAPI: synchronous registers / latches get added here
23
24 RecordBasedStage:
25 ----------------
26
27 A convenience class that takes an input shape, output shape, a
28 "processing" function and an optional "setup" function. Honestly
29 though, there's not much more effort to just... create a class
30 that returns a couple of Records (see ExampleAddRecordStage in
31 examples).
32
33 PassThroughStage:
34 ----------------
35
36 A convenience class that takes a single function as a parameter,
37 that is chain-called to create the exact same input and output spec.
38 It has a process() function that simply returns its input.
39
40 Instances of this class are completely redundant if handed to
41 StageChain, however when passed to UnbufferedPipeline they
42 can be used to introduce a single clock delay.
43
44 ControlBase:
45 -----------
46
47 The base class for pipelines. Contains previous and next ready/valid/data.
48 Also has an extremely useful "connect" function that can be used to
49 connect a chain of pipelines and present the exact same prev/next
50 ready/valid/data API.
51
52 Note: pipelines basically do not become pipelines as such until
53 handed to a derivative of ControlBase. ControlBase itself is *not*
54 strictly considered a pipeline class. Wishbone and AXI4 (master or
55 slave) could be derived from ControlBase, for example.
56 UnbufferedPipeline:
57 ------------------
58
59 A simple stalling clock-synchronised pipeline that has no buffering
60 (unlike BufferedHandshake). Data flows on *every* clock cycle when
61 the conditions are right (this is nominally when the input is valid
62 and the output is ready).
63
64 A stall anywhere along the line will result in a stall back-propagating
65 down the entire chain. The BufferedHandshake by contrast will buffer
66 incoming data, allowing previous stages one clock cycle's grace before
67 also having to stall.
68
69 An advantage of the UnbufferedPipeline over the Buffered one is
70 that the amount of logic needed (number of gates) is greatly
71 reduced (no second set of buffers basically)
72
73 The disadvantage of the UnbufferedPipeline is that the valid/ready
74 logic, if chained together, is *combinatorial*, resulting in
75 progressively larger gate delay.
76
77 PassThroughHandshake:
78 ------------------
79
80 A Control class that introduces a single clock delay, passing its
81 data through unaltered. Unlike RegisterPipeline (which relies
82 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
83 itself.
84
85 RegisterPipeline:
86 ----------------
87
88 A convenience class that, because UnbufferedPipeline introduces a single
89 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
90 stage that, duh, delays its (unmodified) input by one clock cycle.
91
92 BufferedHandshake:
93 ----------------
94
95 nmigen implementation of buffered pipeline stage, based on zipcpu:
96 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
97
98 this module requires quite a bit of thought to understand how it works
99 (and why it is needed in the first place). reading the above is
100 *strongly* recommended.
101
102 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
103 the STB / ACK signals to raise and lower (on separate clocks) before
104 data may proceeed (thus only allowing one piece of data to proceed
105 on *ALTERNATE* cycles), the signalling here is a true pipeline
106 where data will flow on *every* clock when the conditions are right.
107
108 input acceptance conditions are when:
109 * incoming previous-stage strobe (p.i_valid) is HIGH
110 * outgoing previous-stage ready (p.o_ready) is LOW
111
112 output transmission conditions are when:
113 * outgoing next-stage strobe (n.o_valid) is HIGH
114 * outgoing next-stage ready (n.i_ready) is LOW
115
116 the tricky bit is when the input has valid data and the output is not
117 ready to accept it. if it wasn't for the clock synchronisation, it
118 would be possible to tell the input "hey don't send that data, we're
119 not ready". unfortunately, it's not possible to "change the past":
120 the previous stage *has no choice* but to pass on its data.
121
122 therefore, the incoming data *must* be accepted - and stored: that
123 is the responsibility / contract that this stage *must* accept.
124 on the same clock, it's possible to tell the input that it must
125 not send any more data. this is the "stall" condition.
126
127 we now effectively have *two* possible pieces of data to "choose" from:
128 the buffered data, and the incoming data. the decision as to which
129 to process and output is based on whether we are in "stall" or not.
130 i.e. when the next stage is no longer ready, the output comes from
131 the buffer if a stall had previously occurred, otherwise it comes
132 direct from processing the input.
133
134 this allows us to respect a synchronous "travelling STB" with what
135 dan calls a "buffered handshake".
136
137 it's quite a complex state machine!
138
139 SimpleHandshake
140 ---------------
141
142 Synchronised pipeline, Based on:
143 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
144 """
145
146 from nmigen import Signal, Mux, Module, Elaboratable, Const
147 from nmigen.cli import verilog, rtlil
148 from nmigen.hdl.rec import Record
149
150 from nmutil.queue import Queue
151 import inspect
152
153 from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
154 from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
155 from nmutil import nmoperator
156
157
158 class RecordBasedStage(Stage):
159 """ convenience class which provides a Records-based layout.
160 honestly it's a lot easier just to create a direct Records-based
161 class (see ExampleAddRecordStage)
162 """
163
164 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
165 self.in_shape = in_shape
166 self.out_shape = out_shape
167 self.__process = processfn
168 self.__setup = setupfn
169
170 def ispec(self): return Record(self.in_shape)
171 def ospec(self): return Record(self.out_shape)
172 def process(seif, i): return self.__process(i)
173 def setup(seif, m, i): return self.__setup(m, i)
174
175
176 class PassThroughStage(StageCls):
177 """ a pass-through stage with its input data spec identical to its output,
178 and "passes through" its data from input to output (does nothing).
179
180 use this basically to explicitly make any data spec Stage-compliant.
181 (many APIs would potentially use a static "wrap" method in e.g.
182 StageCls to achieve a similar effect)
183 """
184
185 def __init__(self, iospecfn): self.iospecfn = iospecfn
186 def ispec(self): return self.iospecfn()
187 def ospec(self): return self.iospecfn()
188
189
190 class ControlBase(StageHelper, Elaboratable):
191 """ Common functions for Pipeline API. Note: a "pipeline stage" only
192 exists (conceptually) when a ControlBase derivative is handed
193 a Stage (combinatorial block)
194
195 NOTE: ControlBase derives from StageHelper, making it accidentally
196 compliant with the Stage API. Using those functions directly
197 *BYPASSES* a ControlBase instance ready/valid signalling, which
198 clearly should not be done without a really, really good reason.
199 """
200
201 def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
202 """ Base class containing ready/valid/data to previous and next stages
203
204 * p: contains ready/valid to the previous stage
205 * n: contains ready/valid to the next stage
206
207 Except when calling Controlbase.connect(), user must also:
208 * add i_data member to PrevControl (p) and
209 * add o_data member to NextControl (n)
210 Calling ControlBase._new_data is a good way to do that.
211 """
212 print("ControlBase", self, stage, in_multi, stage_ctl)
213 StageHelper.__init__(self, stage)
214
215 # set up input and output IO ACK (prev/next ready/valid)
216 self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
217 self.n = NextControl(stage_ctl, maskwid=maskwid)
218
219 # set up the input and output data
220 if stage is not None:
221 self._new_data("data")
222
223 def _new_data(self, name):
224 """ allocates new i_data and o_data
225 """
226 self.p.i_data, self.n.o_data = self.new_specs(name)
227
228 @property
229 def data_r(self):
230 return self.process(self.p.i_data)
231
232 def connect_to_next(self, nxt):
233 """ helper function to connect to the next stage data/valid/ready.
234 """
235 return self.n.connect_to_next(nxt.p)
236
237 def _connect_in(self, prev):
238 """ internal helper function to connect stage to an input source.
239 do not use to connect stage-to-stage!
240 """
241 return self.p._connect_in(prev.p)
242
243 def _connect_out(self, nxt):
244 """ internal helper function to connect stage to an output source.
245 do not use to connect stage-to-stage!
246 """
247 return self.n._connect_out(nxt.n)
248
249 def connect(self, pipechain):
250 """ connects a chain (list) of Pipeline instances together and
251 links them to this ControlBase instance:
252
253 in <----> self <---> out
254 | ^
255 v |
256 [pipe1, pipe2, pipe3, pipe4]
257 | ^ | ^ | ^
258 v | v | v |
259 out---in out--in out---in
260
261 Also takes care of allocating i_data/o_data, by looking up
262 the data spec for each end of the pipechain. i.e It is NOT
263 necessary to allocate self.p.i_data or self.n.o_data manually:
264 this is handled AUTOMATICALLY, here.
265
266 Basically this function is the direct equivalent of StageChain,
267 except that unlike StageChain, the Pipeline logic is followed.
268
269 Just as StageChain presents an object that conforms to the
270 Stage API from a list of objects that also conform to the
271 Stage API, an object that calls this Pipeline connect function
272 has the exact same pipeline API as the list of pipline objects
273 it is called with.
274
275 Thus it becomes possible to build up larger chains recursively.
276 More complex chains (multi-input, multi-output) will have to be
277 done manually.
278
279 Argument:
280
281 * :pipechain: - a sequence of ControlBase-derived classes
282 (must be one or more in length)
283
284 Returns:
285
286 * a list of eq assignments that will need to be added in
287 an elaborate() to m.d.comb
288 """
289 assert len(pipechain) > 0, "pipechain must be non-zero length"
290 assert self.stage is None, "do not use connect with a stage"
291 eqs = [] # collated list of assignment statements
292
293 # connect inter-chain
294 for i in range(len(pipechain)-1):
295 pipe1 = pipechain[i] # earlier
296 pipe2 = pipechain[i+1] # later (by 1)
297 eqs += pipe1.connect_to_next(pipe2) # earlier n to later p
298
299 # connect front and back of chain to ourselves
300 front = pipechain[0] # first in chain
301 end = pipechain[-1] # last in chain
302 self.set_specs(front, end) # sets up ispec/ospec functions
303 self._new_data("chain") # NOTE: REPLACES existing data
304 eqs += front._connect_in(self) # front p to our p
305 eqs += end._connect_out(self) # end n to our n
306
307 return eqs
308
309 def set_input(self, i):
310 """ helper function to set the input data (used in unit tests)
311 """
312 return nmoperator.eq(self.p.i_data, i)
313
314 def __iter__(self):
315 yield from self.p # yields ready/valid/data (data also gets yielded)
316 yield from self.n # ditto
317
318 def ports(self):
319 return list(self)
320
321 def elaborate(self, platform):
322 """ handles case where stage has dynamic ready/valid functions
323 """
324 m = Module()
325 m.submodules.p = self.p
326 m.submodules.n = self.n
327
328 self.setup(m, self.p.i_data)
329
330 if not self.p.stage_ctl:
331 return m
332
333 # intercept the previous (outgoing) "ready", combine with stage ready
334 m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)
335
336 # intercept the next (incoming) "ready" and combine it with data valid
337 sdv = self.stage.d_valid(self.n.i_ready)
338 m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)
339
340 return m
341
342
343 class BufferedHandshake(ControlBase):
344 """ buffered pipeline stage. data and strobe signals travel in sync.
345 if ever the input is ready and the output is not, processed data
346 is shunted in a temporary register.
347
348 Argument: stage. see Stage API above
349
350 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
351 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
352 stage-1 p.i_data >>in stage n.o_data out>> stage+1
353 | |
354 process --->----^
355 | |
356 +-- r_data ->-+
357
358 input data p.i_data is read (only), is processed and goes into an
359 intermediate result store [process()]. this is updated combinatorially.
360
361 in a non-stall condition, the intermediate result will go into the
362 output (update_output). however if ever there is a stall, it goes
363 into r_data instead [update_buffer()].
364
365 when the non-stall condition is released, r_data is the first
366 to be transferred to the output [flush_buffer()], and the stall
367 condition cleared.
368
369 on the next cycle (as long as stall is not raised again) the
370 input may begin to be processed and transferred directly to output.
371 """
372
373 def elaborate(self, platform):
374 self.m = ControlBase.elaborate(self, platform)
375
376 result = _spec(self.stage.ospec, "r_tmp")
377 r_data = _spec(self.stage.ospec, "r_data")
378
379 # establish some combinatorial temporaries
380 o_n_validn = Signal(reset_less=True)
381 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
382 nir_por = Signal(reset_less=True)
383 nir_por_n = Signal(reset_less=True)
384 p_i_valid = Signal(reset_less=True)
385 nir_novn = Signal(reset_less=True)
386 nirn_novn = Signal(reset_less=True)
387 por_pivn = Signal(reset_less=True)
388 npnn = Signal(reset_less=True)
389 self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
390 o_n_validn.eq(~self.n.o_valid),
391 n_i_ready.eq(self.n.i_ready_test),
392 nir_por.eq(n_i_ready & self.p._o_ready),
393 nir_por_n.eq(n_i_ready & ~self.p._o_ready),
394 nir_novn.eq(n_i_ready | o_n_validn),
395 nirn_novn.eq(~n_i_ready & o_n_validn),
396 npnn.eq(nir_por | nirn_novn),
397 por_pivn.eq(self.p._o_ready & ~p_i_valid)
398 ]
399
400 # store result of processing in combinatorial temporary
401 self.m.d.comb += nmoperator.eq(result, self.data_r)
402
403 # if not in stall condition, update the temporary register
404 with self.m.If(self.p.o_ready): # not stalled
405 self.m.d.sync += nmoperator.eq(r_data, result) # update buffer
406
407 # data pass-through conditions
408 with self.m.If(npnn):
409 # XXX TBD, does nothing right now
410 o_data = self._postprocess(result)
411 self.m.d.sync += [self.n.o_valid.eq(p_i_valid), # valid if p_valid
412 # update out
413 nmoperator.eq(self.n.o_data, o_data),
414 ]
415 # buffer flush conditions (NOTE: can override data passthru conditions)
416 with self.m.If(nir_por_n): # not stalled
417 # Flush the [already processed] buffer to the output port.
418 # XXX TBD, does nothing right now
419 o_data = self._postprocess(r_data)
420 self.m.d.sync += [self.n.o_valid.eq(1), # reg empty
421 nmoperator.eq(self.n.o_data, o_data), # flush
422 ]
423 # output ready conditions
424 self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)
425
426 return self.m
427
428
429 class MaskNoDelayCancellable(ControlBase):
430 """ Mask-activated Cancellable pipeline (that does not respect "ready")
431
432 Based on (identical behaviour to) SimpleHandshake.
433 TODO: decide whether to merge *into* SimpleHandshake.
434
435 Argument: stage. see Stage API above
436
437 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
438 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
439 stage-1 p.i_data >>in stage n.o_data out>> stage+1
440 | |
441 +--process->--^
442 """
443
444 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
445 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
446
447 def elaborate(self, platform):
448 self.m = m = ControlBase.elaborate(self, platform)
449
450 # store result of processing in combinatorial temporary
451 result = _spec(self.stage.ospec, "r_tmp")
452 m.d.comb += nmoperator.eq(result, self.data_r)
453
454 # establish if the data should be passed on. cancellation is
455 # a global signal.
456 # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
457 # is NOT "normal" for the Stage API.
458 p_i_valid = Signal(reset_less=True)
459 #print ("self.p.i_data", self.p.i_data)
460 maskedout = Signal(len(self.p.mask_i), reset_less=True)
461 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
462 m.d.comb += p_i_valid.eq(maskedout.bool())
463
464 # if idmask nonzero, mask gets passed on (and register set).
465 # register is left as-is if idmask is zero, but out-mask is set to zero
466 # note however: only the *uncancelled* mask bits get passed on
467 m.d.sync += self.n.o_valid.eq(p_i_valid)
468 m.d.sync += self.n.mask_o.eq(Mux(p_i_valid, maskedout, 0))
469 with m.If(p_i_valid):
470 # XXX TBD, does nothing right now
471 o_data = self._postprocess(result)
472 m.d.sync += nmoperator.eq(self.n.o_data, o_data) # update output
473
474 # output valid if
475 # input always "ready"
476 #m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
477 m.d.comb += self.p._o_ready.eq(Const(1))
478
479 # always pass on stop (as combinatorial: single signal)
480 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
481
482 return self.m
483
484
485 class MaskCancellable(ControlBase):
486 """ Mask-activated Cancellable pipeline
487
488 Arguments:
489
490 * stage. see Stage API above
491 * maskwid - sets up cancellation capability (mask and stop).
492 * in_multi
493 * stage_ctl
494 * dynamic - allows switching from sync to combinatorial (passthrough)
495 USE WITH CARE. will need the entire pipe to be quiescent
496 before switching, otherwise data WILL be destroyed.
497
498 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
499 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
500 stage-1 p.i_data >>in stage n.o_data out>> stage+1
501 | |
502 +--process->--^
503 """
504
505 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
506 dynamic=False):
507 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
508 self.dynamic = dynamic
509 if dynamic:
510 self.latchmode = Signal()
511 else:
512 self.latchmode = Const(1)
513
514 def elaborate(self, platform):
515 self.m = m = ControlBase.elaborate(self, platform)
516
517 mask_r = Signal(len(self.p.mask_i), reset_less=True)
518 data_r = _spec(self.stage.ospec, "data_r")
519 m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))
520
521 with m.If(self.latchmode):
522 r_busy = Signal()
523 r_latch = _spec(self.stage.ospec, "r_latch")
524
525 # establish if the data should be passed on. cancellation is
526 # a global signal.
527 p_i_valid = Signal(reset_less=True)
528 #print ("self.p.i_data", self.p.i_data)
529 maskedout = Signal(len(self.p.mask_i), reset_less=True)
530 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
531
532 # establish some combinatorial temporaries
533 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
534 p_i_valid_p_o_ready = Signal(reset_less=True)
535 m.d.comb += [p_i_valid.eq(self.p.i_valid_test & maskedout.bool()),
536 n_i_ready.eq(self.n.i_ready_test),
537 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
538 ]
539
540 # if idmask nonzero, mask gets passed on (and register set).
541 # register is left as-is if idmask is zero, but out-mask is set to
542 # zero
543 # note however: only the *uncancelled* mask bits get passed on
544 m.d.sync += mask_r.eq(Mux(p_i_valid, maskedout, 0))
545 m.d.comb += self.n.mask_o.eq(mask_r)
546
547 # always pass on stop (as combinatorial: single signal)
548 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
549
550 stor = Signal(reset_less=True)
551 m.d.comb += stor.eq(p_i_valid_p_o_ready | n_i_ready)
552 with m.If(stor):
553 # store result of processing in combinatorial temporary
554 m.d.sync += nmoperator.eq(r_latch, data_r)
555
556 # previous valid and ready
557 with m.If(p_i_valid_p_o_ready):
558 m.d.sync += r_busy.eq(1) # output valid
559 # previous invalid or not ready, however next is accepting
560 with m.Elif(n_i_ready):
561 m.d.sync += r_busy.eq(0) # ...so set output invalid
562
563 # output set combinatorially from latch
564 m.d.comb += nmoperator.eq(self.n.o_data, r_latch)
565
566 m.d.comb += self.n.o_valid.eq(r_busy)
567 # if next is ready, so is previous
568 m.d.comb += self.p._o_ready.eq(n_i_ready)
569
570 with m.Else():
571 # pass everything straight through. p connected to n: data,
572 # valid, mask, everything. this is "effectively" just a
573 # StageChain: MaskCancellable is doing "nothing" except
574 # combinatorially passing everything through
575 # (except now it's *dynamically selectable* whether to do that)
576 m.d.comb += self.n.o_valid.eq(self.p.i_valid_test)
577 m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
578 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
579 m.d.comb += self.n.mask_o.eq(self.p.mask_i)
580 m.d.comb += nmoperator.eq(self.n.o_data, data_r)
581
582 return self.m
583
584
585 class SimpleHandshake(ControlBase):
586 """ simple handshake control. data and strobe signals travel in sync.
587 implements the protocol used by Wishbone and AXI4.
588
589 Argument: stage. see Stage API above
590
591 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
592 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
593 stage-1 p.i_data >>in stage n.o_data out>> stage+1
594 | |
595 +--process->--^
596 Truth Table
597
598 Inputs Temporary Output Data
599 ------- ---------- ----- ----
600 P P N N PiV& ~NiR& N P
601 i o i o PoR NoV o o
602 V R R V V R
603
604 ------- - - - -
605 0 0 0 0 0 0 >0 0 reg
606 0 0 0 1 0 1 >1 0 reg
607 0 0 1 0 0 0 0 1 process(i_data)
608 0 0 1 1 0 0 0 1 process(i_data)
609 ------- - - - -
610 0 1 0 0 0 0 >0 0 reg
611 0 1 0 1 0 1 >1 0 reg
612 0 1 1 0 0 0 0 1 process(i_data)
613 0 1 1 1 0 0 0 1 process(i_data)
614 ------- - - - -
615 1 0 0 0 0 0 >0 0 reg
616 1 0 0 1 0 1 >1 0 reg
617 1 0 1 0 0 0 0 1 process(i_data)
618 1 0 1 1 0 0 0 1 process(i_data)
619 ------- - - - -
620 1 1 0 0 1 0 1 0 process(i_data)
621 1 1 0 1 1 1 1 0 process(i_data)
622 1 1 1 0 1 0 1 1 process(i_data)
623 1 1 1 1 1 0 1 1 process(i_data)
624 ------- - - - -
625 """
626
627 def elaborate(self, platform):
628 self.m = m = ControlBase.elaborate(self, platform)
629
630 r_busy = Signal()
631 result = _spec(self.stage.ospec, "r_tmp")
632
633 # establish some combinatorial temporaries
634 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
635 p_i_valid_p_o_ready = Signal(reset_less=True)
636 p_i_valid = Signal(reset_less=True)
637 m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
638 n_i_ready.eq(self.n.i_ready_test),
639 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
640 ]
641
642 # store result of processing in combinatorial temporary
643 m.d.comb += nmoperator.eq(result, self.data_r)
644
645 # previous valid and ready
646 with m.If(p_i_valid_p_o_ready):
647 # XXX TBD, does nothing right now
648 o_data = self._postprocess(result)
649 m.d.sync += [r_busy.eq(1), # output valid
650 nmoperator.eq(self.n.o_data, o_data), # update output
651 ]
652 # previous invalid or not ready, however next is accepting
653 with m.Elif(n_i_ready):
654 # XXX TBD, does nothing right now
655 o_data = self._postprocess(result)
656 m.d.sync += [nmoperator.eq(self.n.o_data, o_data)]
657 # TODO: could still send data here (if there was any)
658 # m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
659 m.d.sync += r_busy.eq(0) # ...so set output invalid
660
661 m.d.comb += self.n.o_valid.eq(r_busy)
662 # if next is ready, so is previous
663 m.d.comb += self.p._o_ready.eq(n_i_ready)
664
665 return self.m
666
667
668 class UnbufferedPipeline(ControlBase):
669 """ A simple pipeline stage with single-clock synchronisation
670 and two-way valid/ready synchronised signalling.
671
672 Note that a stall in one stage will result in the entire pipeline
673 chain stalling.
674
675 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
676 travel synchronously with the data: the valid/ready signalling
677 combines in a *combinatorial* fashion. Therefore, a long pipeline
678 chain will lengthen propagation delays.
679
680 Argument: stage. see Stage API, above
681
682 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
683 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
684 stage-1 p.i_data >>in stage n.o_data out>> stage+1
685 | |
686 r_data result
687 | |
688 +--process ->-+
689
690 Attributes:
691 -----------
692 p.i_data : StageInput, shaped according to ispec
693 The pipeline input
694 p.o_data : StageOutput, shaped according to ospec
695 The pipeline output
696 r_data : input_shape according to ispec
697 A temporary (buffered) copy of a prior (valid) input.
698 This is HELD if the output is not ready. It is updated
699 SYNCHRONOUSLY.
700 result: output_shape according to ospec
701 The output of the combinatorial logic. it is updated
702 COMBINATORIALLY (no clock dependence).
703
704 Truth Table
705
706 Inputs Temp Output Data
707 ------- - ----- ----
708 P P N N ~NiR& N P
709 i o i o NoV o o
710 V R R V V R
711
712 ------- - - -
713 0 0 0 0 0 0 1 reg
714 0 0 0 1 1 1 0 reg
715 0 0 1 0 0 0 1 reg
716 0 0 1 1 0 0 1 reg
717 ------- - - -
718 0 1 0 0 0 0 1 reg
719 0 1 0 1 1 1 0 reg
720 0 1 1 0 0 0 1 reg
721 0 1 1 1 0 0 1 reg
722 ------- - - -
723 1 0 0 0 0 1 1 reg
724 1 0 0 1 1 1 0 reg
725 1 0 1 0 0 1 1 reg
726 1 0 1 1 0 1 1 reg
727 ------- - - -
728 1 1 0 0 0 1 1 process(i_data)
729 1 1 0 1 1 1 0 process(i_data)
730 1 1 1 0 0 1 1 process(i_data)
731 1 1 1 1 0 1 1 process(i_data)
732 ------- - - -
733
734 Note: PoR is *NOT* involved in the above decision-making.
735 """
736
737 def elaborate(self, platform):
738 self.m = m = ControlBase.elaborate(self, platform)
739
740 data_valid = Signal() # is data valid or not
741 r_data = _spec(self.stage.ospec, "r_tmp") # output type
742
743 # some temporaries
744 p_i_valid = Signal(reset_less=True)
745 pv = Signal(reset_less=True)
746 buf_full = Signal(reset_less=True)
747 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
748 m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
749 m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)
750
751 m.d.comb += self.n.o_valid.eq(data_valid)
752 m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
753 m.d.sync += data_valid.eq(p_i_valid | buf_full)
754
755 with m.If(pv):
756 m.d.sync += nmoperator.eq(r_data, self.data_r)
757 o_data = self._postprocess(r_data) # XXX TBD, does nothing right now
758 m.d.comb += nmoperator.eq(self.n.o_data, o_data)
759
760 return self.m
761
762
763 class UnbufferedPipeline2(ControlBase):
764 """ A simple pipeline stage with single-clock synchronisation
765 and two-way valid/ready synchronised signalling.
766
767 Note that a stall in one stage will result in the entire pipeline
768 chain stalling.
769
770 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
771 travel synchronously with the data: the valid/ready signalling
772 combines in a *combinatorial* fashion. Therefore, a long pipeline
773 chain will lengthen propagation delays.
774
775 Argument: stage. see Stage API, above
776
777 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
778 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
779 stage-1 p.i_data >>in stage n.o_data out>> stage+1
780 | | |
781 +- process-> buf <-+
782 Attributes:
783 -----------
784 p.i_data : StageInput, shaped according to ispec
785 The pipeline input
786 p.o_data : StageOutput, shaped according to ospec
787 The pipeline output
788 buf : output_shape according to ospec
789 A temporary (buffered) copy of a valid output
790 This is HELD if the output is not ready. It is updated
791 SYNCHRONOUSLY.
792
793 Inputs Temp Output Data
794 ------- - -----
795 P P N N ~NiR& N P (buf_full)
796 i o i o NoV o o
797 V R R V V R
798
799 ------- - - -
800 0 0 0 0 0 0 1 process(i_data)
801 0 0 0 1 1 1 0 reg (odata, unchanged)
802 0 0 1 0 0 0 1 process(i_data)
803 0 0 1 1 0 0 1 process(i_data)
804 ------- - - -
805 0 1 0 0 0 0 1 process(i_data)
806 0 1 0 1 1 1 0 reg (odata, unchanged)
807 0 1 1 0 0 0 1 process(i_data)
808 0 1 1 1 0 0 1 process(i_data)
809 ------- - - -
810 1 0 0 0 0 1 1 process(i_data)
811 1 0 0 1 1 1 0 reg (odata, unchanged)
812 1 0 1 0 0 1 1 process(i_data)
813 1 0 1 1 0 1 1 process(i_data)
814 ------- - - -
815 1 1 0 0 0 1 1 process(i_data)
816 1 1 0 1 1 1 0 reg (odata, unchanged)
817 1 1 1 0 0 1 1 process(i_data)
818 1 1 1 1 0 1 1 process(i_data)
819 ------- - - -
820
821 Note: PoR is *NOT* involved in the above decision-making.
822 """
823
824 def elaborate(self, platform):
825 self.m = m = ControlBase.elaborate(self, platform)
826
827 buf_full = Signal() # is data valid or not
828 buf = _spec(self.stage.ospec, "r_tmp") # output type
829
830 # some temporaries
831 p_i_valid = Signal(reset_less=True)
832 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
833
834 m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
835 m.d.comb += self.p._o_ready.eq(~buf_full)
836 m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)
837
838 o_data = Mux(buf_full, buf, self.data_r)
839 o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
840 m.d.comb += nmoperator.eq(self.n.o_data, o_data)
841 m.d.sync += nmoperator.eq(buf, self.n.o_data)
842
843 return self.m
844
845
846 class PassThroughHandshake(ControlBase):
847 """ A control block that delays by one clock cycle.
848
849 Inputs Temporary Output Data
850 ------- ------------------ ----- ----
851 P P N N PiV& PiV| NiR| pvr N P (pvr)
852 i o i o PoR ~PoR ~NoV o o
853 V R R V V R
854
855 ------- - - - - - -
856 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
857 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
858 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
859 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
860 ------- - - - - - -
861 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
862 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
863 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
864 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
865 ------- - - - - - -
866 1 0 0 0 0 1 1 1 1 1 process(in)
867 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
868 1 0 1 0 0 1 1 1 1 1 process(in)
869 1 0 1 1 0 1 1 1 1 1 process(in)
870 ------- - - - - - -
871 1 1 0 0 1 1 1 1 1 1 process(in)
872 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
873 1 1 1 0 1 1 1 1 1 1 process(in)
874 1 1 1 1 1 1 1 1 1 1 process(in)
875 ------- - - - - - -
876
877 """
878
879 def elaborate(self, platform):
880 self.m = m = ControlBase.elaborate(self, platform)
881
882 r_data = _spec(self.stage.ospec, "r_tmp") # output type
883
884 # temporaries
885 p_i_valid = Signal(reset_less=True)
886 pvr = Signal(reset_less=True)
887 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
888 m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)
889
890 m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
891 m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)
892
893 odata = Mux(pvr, self.data_r, r_data)
894 m.d.sync += nmoperator.eq(r_data, odata)
895 r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
896 m.d.comb += nmoperator.eq(self.n.o_data, r_data)
897
898 return m
899
900
901 class RegisterPipeline(UnbufferedPipeline):
902 """ A pipeline stage that delays by one clock cycle, creating a
903 sync'd latch out of o_data and o_valid as an indirect byproduct
904 of using PassThroughStage
905 """
906
907 def __init__(self, iospecfn):
908 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
909
910
911 class FIFOControl(ControlBase):
912 """ FIFO Control. Uses Queue to store data, coincidentally
913 happens to have same valid/ready signalling as Stage API.
914
915 i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
916 """
917
918 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
919 fwft=True, pipe=False):
920 """ FIFO Control
921
922 * :depth: number of entries in the FIFO
923 * :stage: data processing block
924 * :fwft: first word fall-thru mode (non-fwft introduces delay)
925 * :pipe: specifies pipe mode.
926
927 when fwft = True it indicates that transfers may occur
928 combinatorially through stage processing in the same clock cycle.
929 This requires that the Stage be a Moore FSM:
930 https://en.wikipedia.org/wiki/Moore_machine
931
932 when fwft = False it indicates that all output signals are
933 produced only from internal registers or memory, i.e. that the
934 Stage is a Mealy FSM:
935 https://en.wikipedia.org/wiki/Mealy_machine
936
937 data is processed (and located) as follows:
938
939 self.p self.stage temp fn temp fn temp fp self.n
940 i_data->process()->result->cat->din.FIFO.dout->cat(o_data)
941
942 yes, really: cat produces a Cat() which can be assigned to.
943 this is how the FIFO gets de-catted without needing a de-cat
944 function
945 """
946 self.fwft = fwft
947 self.pipe = pipe
948 self.fdepth = depth
949 ControlBase.__init__(self, stage, in_multi, stage_ctl)
950
951 def elaborate(self, platform):
952 self.m = m = ControlBase.elaborate(self, platform)
953
954 # make a FIFO with a signal of equal width to the o_data.
955 (fwidth, _) = nmoperator.shape(self.n.o_data)
956 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
957 m.submodules.fifo = fifo
958
959 def processfn(i_data):
960 # store result of processing in combinatorial temporary
961 result = _spec(self.stage.ospec, "r_temp")
962 m.d.comb += nmoperator.eq(result, self.process(i_data))
963 return nmoperator.cat(result)
964
965 # prev: make the FIFO (Queue object) "look" like a PrevControl...
966 m.submodules.fp = fp = PrevControl()
967 fp.i_valid, fp._o_ready, fp.i_data = fifo.w_en, fifo.w_rdy, fifo.w_data
968 m.d.comb += fp._connect_in(self.p, fn=processfn)
969
970 # next: make the FIFO (Queue object) "look" like a NextControl...
971 m.submodules.fn = fn = NextControl()
972 fn.o_valid, fn.i_ready, fn.o_data = fifo.r_rdy, fifo.r_en, fifo.r_data
973 connections = fn._connect_out(self.n, fn=nmoperator.cat)
974 valid_eq, ready_eq, o_data = connections
975
976 # ok ok so we can't just do the ready/valid eqs straight:
977 # first 2 from connections are the ready/valid, 3rd is data.
978 if self.fwft:
979 # combinatorial on next ready/valid
980 m.d.comb += [valid_eq, ready_eq]
981 else:
982 m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
983 o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
984 m.d.comb += o_data
985
986 return m
987
988
989 # aka "RegStage".
990 class UnbufferedPipeline(FIFOControl):
991 def __init__(self, stage, in_multi=None, stage_ctl=False):
992 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
993 fwft=True, pipe=False)
994
995 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
996
997
998 class PassThroughHandshake(FIFOControl):
999 def __init__(self, stage, in_multi=None, stage_ctl=False):
1000 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
1001 fwft=True, pipe=True)
1002
1003 # this is *probably* BufferedHandshake, although test #997 now succeeds.
1004
1005
1006 class BufferedHandshake(FIFOControl):
1007 def __init__(self, stage, in_multi=None, stage_ctl=False):
1008 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
1009 fwft=True, pipe=False)
1010
1011
1012 """
1013 # this is *probably* SimpleHandshake (note: memory cell size=0)
1014 class SimpleHandshake(FIFOControl):
1015 def __init__(self, stage, in_multi=None, stage_ctl=False):
1016 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
1017 fwft=True, pipe=False)
1018 """