[nmutil.git] / src / nmutil / singlepipe.py
1 """ Pipeline API. For multi-input and multi-output variants, see multipipe.
2
3 This work is funded through NLnet under Grant 2019-02-012
4
5 License: LGPLv3+
6
7
8 Associated development bugs:
9 * http://bugs.libre-riscv.org/show_bug.cgi?id=148
10 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
11 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
12
13 Important: see Stage API (stageapi.py) and IO Control API
14 (iocontrol.py) in combination with below. This module
15 "combines" the Stage API with the IO Control API to create
16 the Pipeline API.
17
18 The one critically important key difference between StageAPI and
19 PipelineAPI:
20
21 * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
22 * PipelineAPI: synchronous registers / latches get added here
23
24 RecordBasedStage:
25 ----------------
26
27 A convenience class that takes an input shape, output shape, a
28 "processing" function and an optional "setup" function. Honestly
29 though, there's not much more effort to just... create a class
30 that returns a couple of Records (see ExampleAddRecordStage in
31 examples).
32
33 PassThroughStage:
34 ----------------
35
36 A convenience class that takes a single function as a parameter,
37 that is chain-called to create the exact same input and output spec.
38 It has a process() function that simply returns its input.
39
40 Instances of this class are completely redundant if handed to
41 StageChain, however when passed to UnbufferedPipeline they
42 can be used to introduce a single clock delay.
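
 As a rough, nmigen-free illustration of the shape of a pass-through
 stage, the sketch below uses plain Python dicts where the real class
 uses Signals or Records. SketchPassThrough and make_spec are
 hypothetical names and are not part of nmutil.

```python
# Plain-Python analogue of a pass-through stage (hypothetical names;
# the real class operates on nmigen Signals/Records, not dicts).

class SketchPassThrough:
    def __init__(self, iospecfn):
        # iospecfn is called every time a fresh spec object is needed
        self.iospecfn = iospecfn

    def ispec(self):
        return self.iospecfn()

    def ospec(self):
        # same function as ispec: input and output specs are identical
        return self.iospecfn()

    def process(self, i):
        # a pass-through stage returns its input unaltered
        return i


def make_spec():
    # stand-in for a data spec: a fresh dict on every call
    return {"a": 0, "b": 0}


stage = SketchPassThrough(make_spec)
spec_in = stage.ispec()
spec_out = stage.ospec()
data = {"a": 3, "b": 4}
result = stage.process(data)
```

 The two specs compare equal but are distinct objects, which is the
 point of passing a function rather than a ready-made object.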
43
44 ControlBase:
45 -----------
46
47 The base class for pipelines. Contains previous and next ready/valid/data.
48 Also has an extremely useful "connect" function that can be used to
49 connect a chain of pipelines and present the exact same prev/next
50 ready/valid/data API.
51
52 Note: pipelines basically do not become pipelines as such until
53 handed to a derivative of ControlBase. ControlBase itself is *not*
54 strictly considered a pipeline class. Wishbone and AXI4 (master or
55 slave) could be derived from ControlBase, for example.
56 UnbufferedPipeline:
57 ------------------
58
59 A simple stalling clock-synchronised pipeline that has no buffering
60 (unlike BufferedHandshake). Data flows on *every* clock cycle when
61 the conditions are right (this is nominally when the input is valid
62 and the output is ready).
63
64 A stall anywhere along the line will result in a stall back-propagating
65 down the entire chain. The BufferedHandshake by contrast will buffer
66 incoming data, allowing previous stages one clock cycle's grace before
67 also having to stall.
68
69 An advantage of the UnbufferedPipeline over the Buffered one is
70 that the amount of logic needed (number of gates) is greatly
71 reduced (no second set of buffers basically)
72
73 The disadvantage of the UnbufferedPipeline is that the valid/ready
74 logic, if chained together, is *combinatorial*, resulting in
75 progressively larger gate delay.
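
 The stall behaviour can be sketched with a small cycle-level model in
 plain Python. It loosely follows the equations used further down in
 UnbufferedPipeline.elaborate (ready = ~data_valid | next_ready), but
 it is only an illustration: step() and its arguments are hypothetical
 names, and bools and ints stand in for nmigen Signals.

```python
# Cycle-level sketch of a chain of unbuffered stages (hypothetical model).
# valids[i] / datas[i] are the registers of stage i; stage 0 is nearest
# the source, the last stage is nearest the sink.

def step(valids, datas, src_valid, src_data, sink_ready):
    n = len(valids)
    # "ready" ripples combinatorially from the sink back to the source
    ready = [False] * n
    nxt = sink_ready
    for i in reversed(range(n)):
        ready[i] = (not valids[i]) or nxt
        nxt = ready[i]

    new_valids, new_datas = list(valids), list(datas)
    for i in range(n):
        in_valid = src_valid if i == 0 else valids[i - 1]
        in_data = src_data if i == 0 else datas[i - 1]
        next_ready = sink_ready if i == n - 1 else ready[i + 1]
        buf_full = valids[i] and not next_ready   # held: nowhere to go
        new_valids[i] = in_valid or buf_full
        if in_valid and ready[i]:                 # transfer on this clock
            new_datas[i] = in_data
    # ready[0] is what the source sees during this clock
    return new_valids, new_datas, ready[0]


full = [True, True, True]
held = step(full, [3, 2, 1], True, 4, sink_ready=False)
moved = step(full, [3, 2, 1], True, 4, sink_ready=True)
```

 With every stage full and the sink not ready, the source sees "not
 ready" in the same clock and nothing moves; once the sink is ready,
 every item advances by one stage.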
76
77 PassThroughHandshake:
78 ------------------
79
80 A Control class that introduces a single clock delay, passing its
81 data through unaltered. Unlike RegisterPipeline (which relies
82 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
83 itself.
84
85 RegisterPipeline:
86 ----------------
87
88 A convenience class that, because UnbufferedPipeline introduces a single
89 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
90 stage that, duh, delays its (unmodified) input by one clock cycle.
91
92 BufferedHandshake:
93 ----------------
94
95 nmigen implementation of buffered pipeline stage, based on zipcpu:
96 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
97
98 this module requires quite a bit of thought to understand how it works
99 (and why it is needed in the first place). reading the above is
100 *strongly* recommended.
101
102 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
103 the STB / ACK signals to raise and lower (on separate clocks) before
104 data may proceed (thus only allowing one piece of data to proceed
105 on *ALTERNATE* cycles), the signalling here is a true pipeline
106 where data will flow on *every* clock when the conditions are right.
107
108 input acceptance conditions are when:
109 * incoming previous-stage strobe (p.i_valid) is HIGH
110 * outgoing previous-stage ready (p.o_ready) is HIGH
111
112 output transmission conditions are when:
113 * outgoing next-stage strobe (n.o_valid) is HIGH
114 * incoming next-stage ready (n.i_ready) is HIGH
115
116 the tricky bit is when the input has valid data and the output is not
117 ready to accept it. if it wasn't for the clock synchronisation, it
118 would be possible to tell the input "hey don't send that data, we're
119 not ready". unfortunately, it's not possible to "change the past":
120 the previous stage *has no choice* but to pass on its data.
121
122 therefore, the incoming data *must* be accepted - and stored: that
123 is the responsibility / contract that this stage *must* accept.
124 on the same clock, it's possible to tell the input that it must
125 not send any more data. this is the "stall" condition.
126
127 we now effectively have *two* possible pieces of data to "choose" from:
128 the buffered data, and the incoming data. the decision as to which
129 to process and output is based on whether we are in "stall" or not.
130 i.e. when the next stage is no longer ready, the output comes from
131 the buffer if a stall had previously occurred, otherwise it comes
132 direct from processing the input.
133
134 this allows us to respect a synchronous "travelling STB" with what
135 dan calls a "buffered handshake".
136
137 it's quite a complex state machine!
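
 The idea of "accept and store, then stall" can be sketched as a
 generic skid buffer in plain Python. This is an assumption-laden
 illustration of the technique described above, not a cycle-exact
 transcription of BufferedHandshake: SkidBuffer, clock() and the
 attribute names are hypothetical.

```python
# Generic skid-buffer sketch (hypothetical model, not cycle-exact).

class SkidBuffer:
    def __init__(self, process=lambda x: x):
        self.process = process
        self.out_valid = False   # output register holds valid data
        self.out_data = None
        self.buf_valid = False   # spare register in use, i.e. stalled
        self.buf_data = None

    @property
    def o_ready(self):
        # the previous stage may send only while the spare is empty
        return not self.buf_valid

    def clock(self, in_valid, in_data, next_ready):
        # one clock edge; returns True if the input was accepted
        accepted = in_valid and self.o_ready
        out_free = next_ready or not self.out_valid
        if out_free:
            if self.buf_valid:
                # leaving the stall: flush the buffered item first
                self.out_data, self.out_valid = self.buf_data, True
                self.buf_valid = False
            elif accepted:
                self.out_data, self.out_valid = self.process(in_data), True
            else:
                self.out_valid = False
        elif accepted:
            # output is blocked but the data is already in flight:
            # store it and signal "stall" from the next clock onwards
            self.buf_data, self.buf_valid = self.process(in_data), True
        return accepted


s = SkidBuffer(lambda x: x + 1)
trace = [s.clock(True, 1, True),    # accepted, goes straight to output
         s.clock(True, 2, False),   # accepted, goes into the spare
         s.clock(True, 3, False),   # refused: stalled
         s.clock(True, 3, True)]    # refused: spare is being flushed
```

 No item is lost or duplicated: the item accepted while the output was
 blocked reappears at the output before any newer input is taken.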
138
139 SimpleHandshake
140 ---------------
141
142 Synchronised pipeline, Based on:
143 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
144 """
145
146 from nmigen import Signal, Mux, Module, Elaboratable, Const
147 from nmigen.cli import verilog, rtlil
148 from nmigen.hdl.rec import Record
149
150 from nmutil.queue import Queue
151 import inspect
152
153 from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
154 from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
155 from nmutil import nmoperator
156
157
158 class RecordBasedStage(Stage):
159 """ convenience class which provides a Records-based layout.
160 honestly it's a lot easier just to create a direct Records-based
161 class (see ExampleAddRecordStage)
162 """
163 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
164 self.in_shape = in_shape
165 self.out_shape = out_shape
166 self.__process = processfn
167 self.__setup = setupfn
168 def ispec(self): return Record(self.in_shape)
169 def ospec(self): return Record(self.out_shape)
170 def process(self, i): return self.__process(i)
171 def setup(self, m, i): return self.__setup(m, i)
172
173
174 class PassThroughStage(StageCls):
175 """ a pass-through stage with its input data spec identical to its output,
176 and "passes through" its data from input to output (does nothing).
177
178 use this basically to explicitly make any data spec Stage-compliant.
179 (many APIs would potentially use a static "wrap" method in e.g.
180 StageCls to achieve a similar effect)
181 """
182 def __init__(self, iospecfn): self.iospecfn = iospecfn
183 def ispec(self): return self.iospecfn()
184 def ospec(self): return self.iospecfn()
185
186
187 class ControlBase(StageHelper, Elaboratable):
188 """ Common functions for Pipeline API. Note: a "pipeline stage" only
189 exists (conceptually) when a ControlBase derivative is handed
190 a Stage (combinatorial block)
191
192 NOTE: ControlBase derives from StageHelper, making it accidentally
193 compliant with the Stage API. Using those functions directly
194 *BYPASSES* a ControlBase instance ready/valid signalling, which
195 clearly should not be done without a really, really good reason.
196 """
197 def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
198 """ Base class containing ready/valid/data to previous and next stages
199
200 * p: contains ready/valid to the previous stage
201 * n: contains ready/valid to the next stage
202
203 Except when calling ControlBase.connect(), user must also:
204 * add i_data member to PrevControl (p) and
205 * add o_data member to NextControl (n)
206 Calling ControlBase._new_data is a good way to do that.
207 """
208 print ("ControlBase", self, stage, in_multi, stage_ctl)
209 StageHelper.__init__(self, stage)
210
211 # set up input and output IO ACK (prev/next ready/valid)
212 self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
213 self.n = NextControl(stage_ctl, maskwid=maskwid)
214
215 # set up the input and output data
216 if stage is not None:
217 self._new_data("data")
218
219 def _new_data(self, name):
220 """ allocates new i_data and o_data
221 """
222 self.p.i_data, self.n.o_data = self.new_specs(name)
223
224 @property
225 def data_r(self):
226 return self.process(self.p.i_data)
227
228 def connect_to_next(self, nxt):
229 """ helper function to connect to the next stage data/valid/ready.
230 """
231 return self.n.connect_to_next(nxt.p)
232
233 def _connect_in(self, prev):
234 """ internal helper function to connect stage to an input source.
235 do not use to connect stage-to-stage!
236 """
237 return self.p._connect_in(prev.p)
238
239 def _connect_out(self, nxt):
240 """ internal helper function to connect stage to an output source.
241 do not use to connect stage-to-stage!
242 """
243 return self.n._connect_out(nxt.n)
244
245 def connect(self, pipechain):
246 """ connects a chain (list) of Pipeline instances together and
247 links them to this ControlBase instance:
248
249 in <----> self <---> out
250 | ^
251 v |
252 [pipe1, pipe2, pipe3, pipe4]
253 | ^ | ^ | ^
254 v | v | v |
255 out---in out--in out---in
256
257 Also takes care of allocating i_data/o_data, by looking up
258 the data spec for each end of the pipechain. i.e. it is NOT
259 necessary to allocate self.p.i_data or self.n.o_data manually:
260 this is handled AUTOMATICALLY, here.
261
262 Basically this function is the direct equivalent of StageChain,
263 except that unlike StageChain, the Pipeline logic is followed.
264
265 Just as StageChain presents an object that conforms to the
266 Stage API from a list of objects that also conform to the
267 Stage API, an object that calls this Pipeline connect function
268 has the exact same pipeline API as the list of pipeline objects
269 it is called with.
270
271 Thus it becomes possible to build up larger chains recursively.
272 More complex chains (multi-input, multi-output) will have to be
273 done manually.
274
275 Argument:
276
277 * :pipechain: - a sequence of ControlBase-derived classes
278 (must be one or more in length)
279
280 Returns:
281
282 * a list of eq assignments that will need to be added in
283 an elaborate() to m.d.comb
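
 The order in which links are made can be sketched in plain Python.
 link_chain() is a hypothetical helper that only records which port is
 joined to which, using strings in place of real pipeline objects.

```python
# Sketch of the linking order used when connecting a chain
# (hypothetical helper; strings stand in for pipeline ports).

def link_chain(names):
    assert len(names) > 0, "chain must be non-zero length"
    links = []
    # neighbours first: earlier stage's n to later stage's p
    for i in range(len(names) - 1):
        links.append((names[i] + ".n", names[i + 1] + ".p"))
    # then the two ends of the chain to the enclosing object
    links.append(("self.p", names[0] + ".p"))
    links.append((names[-1] + ".n", "self.n"))
    return links


links = link_chain(["pipe1", "pipe2", "pipe3"])
```

 A chain of N stages therefore yields N-1 inter-stage links plus two
 links to the enclosing object.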
284 """
285 assert len(pipechain) > 0, "pipechain must be non-zero length"
286 assert self.stage is None, "do not use connect with a stage"
287 eqs = [] # collated list of assignment statements
288
289 # connect inter-chain
290 for i in range(len(pipechain)-1):
291 pipe1 = pipechain[i] # earlier
292 pipe2 = pipechain[i+1] # later (by 1)
293 eqs += pipe1.connect_to_next(pipe2) # earlier n to later p
294
295 # connect front and back of chain to ourselves
296 front = pipechain[0] # first in chain
297 end = pipechain[-1] # last in chain
298 self.set_specs(front, end) # sets up ispec/ospec functions
299 self._new_data("chain") # NOTE: REPLACES existing data
300 eqs += front._connect_in(self) # front p to our p
301 eqs += end._connect_out(self) # end n to our n
302
303 return eqs
304
305 def set_input(self, i):
306 """ helper function to set the input data (used in unit tests)
307 """
308 return nmoperator.eq(self.p.i_data, i)
309
310 def __iter__(self):
311 yield from self.p # yields ready/valid/data (data also gets yielded)
312 yield from self.n # ditto
313
314 def ports(self):
315 return list(self)
316
317 def elaborate(self, platform):
318 """ handles case where stage has dynamic ready/valid functions
319 """
320 m = Module()
321 m.submodules.p = self.p
322 m.submodules.n = self.n
323
324 self.setup(m, self.p.i_data)
325
326 if not self.p.stage_ctl:
327 return m
328
329 # intercept the previous (outgoing) "ready", combine with stage ready
330 m.d.comb += self.p.s_o_ready.eq(self.p._o_ready & self.stage.d_ready)
331
332 # intercept the next (incoming) "ready" and combine it with data valid
333 sdv = self.stage.d_valid(self.n.i_ready)
334 m.d.comb += self.n.d_valid.eq(self.n.i_ready & sdv)
335
336 return m
337
338
339 class BufferedHandshake(ControlBase):
340 """ buffered pipeline stage. data and strobe signals travel in sync.
341 if ever the input is ready and the output is not, processed data
342 is shunted into a temporary register.
343
344 Argument: stage. see Stage API above
345
346 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
347 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
348 stage-1 p.i_data >>in stage n.o_data out>> stage+1
349 | |
350 process --->----^
351 | |
352 +-- r_data ->-+
353
354 input data p.i_data is read (only), is processed and goes into an
355 intermediate result store [process()]. this is updated combinatorially.
356
357 in a non-stall condition, the intermediate result will go into the
358 output (update_output). however if ever there is a stall, it goes
359 into r_data instead [update_buffer()].
360
361 when the non-stall condition is released, r_data is the first
362 to be transferred to the output [flush_buffer()], and the stall
363 condition cleared.
364
365 on the next cycle (as long as stall is not raised again) the
366 input may begin to be processed and transferred directly to output.
367 """
368
369 def elaborate(self, platform):
370 self.m = ControlBase.elaborate(self, platform)
371
372 result = _spec(self.stage.ospec, "r_tmp")
373 r_data = _spec(self.stage.ospec, "r_data")
374
375 # establish some combinatorial temporaries
376 o_n_validn = Signal(reset_less=True)
377 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
378 nir_por = Signal(reset_less=True)
379 nir_por_n = Signal(reset_less=True)
380 p_i_valid = Signal(reset_less=True)
381 nir_novn = Signal(reset_less=True)
382 nirn_novn = Signal(reset_less=True)
383 por_pivn = Signal(reset_less=True)
384 npnn = Signal(reset_less=True)
385 self.m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
386 o_n_validn.eq(~self.n.o_valid),
387 n_i_ready.eq(self.n.i_ready_test),
388 nir_por.eq(n_i_ready & self.p._o_ready),
389 nir_por_n.eq(n_i_ready & ~self.p._o_ready),
390 nir_novn.eq(n_i_ready | o_n_validn),
391 nirn_novn.eq(~n_i_ready & o_n_validn),
392 npnn.eq(nir_por | nirn_novn),
393 por_pivn.eq(self.p._o_ready & ~p_i_valid)
394 ]
395
396 # store result of processing in combinatorial temporary
397 self.m.d.comb += nmoperator.eq(result, self.data_r)
398
399 # if not in stall condition, update the temporary register
400 with self.m.If(self.p.o_ready): # not stalled
401 self.m.d.sync += nmoperator.eq(r_data, result) # update buffer
402
403 # data pass-through conditions
404 with self.m.If(npnn):
405 o_data = self._postprocess(result) # XXX TBD, does nothing right now
406 self.m.d.sync += [self.n.o_valid.eq(p_i_valid), # valid if p_valid
407 nmoperator.eq(self.n.o_data, o_data), # update out
408 ]
409 # buffer flush conditions (NOTE: can override data passthru conditions)
410 with self.m.If(nir_por_n): # not stalled
411 # Flush the [already processed] buffer to the output port.
412 o_data = self._postprocess(r_data) # XXX TBD, does nothing right now
413 self.m.d.sync += [self.n.o_valid.eq(1), # reg empty
414 nmoperator.eq(self.n.o_data, o_data), # flush
415 ]
416 # output ready conditions
417 self.m.d.sync += self.p._o_ready.eq(nir_novn | por_pivn)
418
419 return self.m
420
421
422 class MaskNoDelayCancellable(ControlBase):
423 """ Mask-activated Cancellable pipeline (that does not respect "ready")
424
425 Based on (identical behaviour to) SimpleHandshake.
426 TODO: decide whether to merge *into* SimpleHandshake.
427
428 Argument: stage. see Stage API above
429
430 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
431 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
432 stage-1 p.i_data >>in stage n.o_data out>> stage+1
433 | |
434 +--process->--^
435 """
436 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
437 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
438
439 def elaborate(self, platform):
440 self.m = m = ControlBase.elaborate(self, platform)
441
442 # store result of processing in combinatorial temporary
443 result = _spec(self.stage.ospec, "r_tmp")
444 m.d.comb += nmoperator.eq(result, self.data_r)
445
446 # establish if the data should be passed on. cancellation is
447 # a global signal.
448 # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
449 # is NOT "normal" for the Stage API.
450 p_i_valid = Signal(reset_less=True)
451 #print ("self.p.i_data", self.p.i_data)
452 maskedout = Signal(len(self.p.mask_i), reset_less=True)
453 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
454 m.d.comb += p_i_valid.eq(maskedout.bool())
455
456 # if idmask nonzero, mask gets passed on (and register set).
457 # register is left as-is if idmask is zero, but out-mask is set to zero
458 # note however: only the *uncancelled* mask bits get passed on
459 m.d.sync += self.n.o_valid.eq(p_i_valid)
460 m.d.sync += self.n.mask_o.eq(Mux(p_i_valid, maskedout, 0))
461 with m.If(p_i_valid):
462 o_data = self._postprocess(result) # XXX TBD, does nothing right now
463 m.d.sync += nmoperator.eq(self.n.o_data, o_data) # update output
464
465 # output valid if
466 # input always "ready"
467 #m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
468 m.d.comb += self.p._o_ready.eq(Const(1))
469
470 # always pass on stop (as combinatorial: single signal)
471 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
472
473 return self.m
474
475
476 class MaskCancellable(ControlBase):
477 """ Mask-activated Cancellable pipeline
478
479 Arguments:
480
481 * stage. see Stage API above
482 * maskwid - sets up cancellation capability (mask and stop).
483 * in_multi
484 * stage_ctl
485 * dynamic - allows switching from sync to combinatorial (passthrough)
486 USE WITH CARE. will need the entire pipe to be quiescent
487 before switching, otherwise data WILL be destroyed.
488
489 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
490 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
491 stage-1 p.i_data >>in stage n.o_data out>> stage+1
492 | |
493 +--process->--^
494 """
495 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
496 dynamic=False):
497 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
498 self.dynamic = dynamic
499 if dynamic:
500 self.latchmode = Signal()
501 else:
502 self.latchmode = Const(1)
503
504 def elaborate(self, platform):
505 self.m = m = ControlBase.elaborate(self, platform)
506
507 mask_r = Signal(len(self.p.mask_i), reset_less=True)
508 data_r = _spec(self.stage.ospec, "data_r")
509 m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))
510
511 with m.If(self.latchmode):
512 r_busy = Signal()
513 r_latch = _spec(self.stage.ospec, "r_latch")
514
515 # establish if the data should be passed on. cancellation is
516 # a global signal.
517 p_i_valid = Signal(reset_less=True)
518 #print ("self.p.i_data", self.p.i_data)
519 maskedout = Signal(len(self.p.mask_i), reset_less=True)
520 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
521
522 # establish some combinatorial temporaries
523 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
524 p_i_valid_p_o_ready = Signal(reset_less=True)
525 m.d.comb += [p_i_valid.eq(self.p.i_valid_test & maskedout.bool()),
526 n_i_ready.eq(self.n.i_ready_test),
527 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
528 ]
529
530 # if idmask nonzero, mask gets passed on (and register set).
531 # register is left as-is if idmask is zero, but out-mask is set to
532 # zero
533 # note however: only the *uncancelled* mask bits get passed on
534 m.d.sync += mask_r.eq(Mux(p_i_valid, maskedout, 0))
535 m.d.comb += self.n.mask_o.eq(mask_r)
536
537 # always pass on stop (as combinatorial: single signal)
538 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
539
540 stor = Signal(reset_less=True)
541 m.d.comb += stor.eq(p_i_valid_p_o_ready | n_i_ready)
542 with m.If(stor):
543 # store result of processing in combinatorial temporary
544 m.d.sync += nmoperator.eq(r_latch, data_r)
545
546 # previous valid and ready
547 with m.If(p_i_valid_p_o_ready):
548 m.d.sync += r_busy.eq(1) # output valid
549 # previous invalid or not ready, however next is accepting
550 with m.Elif(n_i_ready):
551 m.d.sync += r_busy.eq(0) # ...so set output invalid
552
553 # output set combinatorially from latch
554 m.d.comb += nmoperator.eq(self.n.o_data, r_latch)
555
556 m.d.comb += self.n.o_valid.eq(r_busy)
557 # if next is ready, so is previous
558 m.d.comb += self.p._o_ready.eq(n_i_ready)
559
560 with m.Else():
561 # pass everything straight through. p connected to n: data,
562 # valid, mask, everything. this is "effectively" just a
563 # StageChain: MaskCancellable is doing "nothing" except
564 # combinatorially passing everything through
565 # (except now it's *dynamically selectable* whether to do that)
566 m.d.comb += self.n.o_valid.eq(self.p.i_valid_test)
567 m.d.comb += self.p._o_ready.eq(self.n.i_ready_test)
568 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
569 m.d.comb += self.n.mask_o.eq(self.p.mask_i)
570 m.d.comb += nmoperator.eq(self.n.o_data, data_r)
571
572 return self.m
573
574
575 class SimpleHandshake(ControlBase):
576 """ simple handshake control. data and strobe signals travel in sync.
577 implements the protocol used by Wishbone and AXI4.
578
579 Argument: stage. see Stage API above
580
581 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
582 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
583 stage-1 p.i_data >>in stage n.o_data out>> stage+1
584 | |
585 +--process->--^
586 Truth Table
587
588 Inputs    Temporary     Output Data
589 -------   -----------   ------ ----
590 P P N N   PiV&  ~NiR&    N P
591 i o i o   PoR   NoV      o o
592 V R R V                  V R
593
594 -------   -     -        - -
595 0 0 0 0   0     0       >0 0   reg
596 0 0 0 1   0     1       >1 0   reg
597 0 0 1 0   0     0        0 1   process(i_data)
598 0 0 1 1   0     0        0 1   process(i_data)
599 -------   -     -        - -
600 0 1 0 0   0     0       >0 0   reg
601 0 1 0 1   0     1       >1 0   reg
602 0 1 1 0   0     0        0 1   process(i_data)
603 0 1 1 1   0     0        0 1   process(i_data)
604 -------   -     -        - -
605 1 0 0 0   0     0       >0 0   reg
606 1 0 0 1   0     1       >1 0   reg
607 1 0 1 0   0     0        0 1   process(i_data)
608 1 0 1 1   0     0        0 1   process(i_data)
609 -------   -     -        - -
610 1 1 0 0   1     0        1 0   process(i_data)
611 1 1 0 1   1     1        1 0   process(i_data)
612 1 1 1 0   1     0        1 1   process(i_data)
613 1 1 1 1   1     0        1 1   process(i_data)
614 -------   -     -        - -
615 """
616
617 def elaborate(self, platform):
618 self.m = m = ControlBase.elaborate(self, platform)
619
620 r_busy = Signal()
621 result = _spec(self.stage.ospec, "r_tmp")
622
623 # establish some combinatorial temporaries
624 n_i_ready = Signal(reset_less=True, name="n_i_rdy_data")
625 p_i_valid_p_o_ready = Signal(reset_less=True)
626 p_i_valid = Signal(reset_less=True)
627 m.d.comb += [p_i_valid.eq(self.p.i_valid_test),
628 n_i_ready.eq(self.n.i_ready_test),
629 p_i_valid_p_o_ready.eq(p_i_valid & self.p.o_ready),
630 ]
631
632 # store result of processing in combinatorial temporary
633 m.d.comb += nmoperator.eq(result, self.data_r)
634
635 # previous valid and ready
636 with m.If(p_i_valid_p_o_ready):
637 o_data = self._postprocess(result) # XXX TBD, does nothing right now
638 m.d.sync += [r_busy.eq(1), # output valid
639 nmoperator.eq(self.n.o_data, o_data), # update output
640 ]
641 # previous invalid or not ready, however next is accepting
642 with m.Elif(n_i_ready):
643 o_data = self._postprocess(result) # XXX TBD, does nothing right now
644 m.d.sync += [nmoperator.eq(self.n.o_data, o_data)]
645 # TODO: could still send data here (if there was any)
646 #m.d.sync += self.n.o_valid.eq(0) # ...so set output invalid
647 m.d.sync += r_busy.eq(0) # ...so set output invalid
648
649 m.d.comb += self.n.o_valid.eq(r_busy)
650 # if next is ready, so is previous
651 m.d.comb += self.p._o_ready.eq(n_i_ready)
652
653 return self.m
654
655
656 class UnbufferedPipeline(ControlBase):
657 """ A simple pipeline stage with single-clock synchronisation
658 and two-way valid/ready synchronised signalling.
659
660 Note that a stall in one stage will result in the entire pipeline
661 chain stalling.
662
663 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
664 travel synchronously with the data: the valid/ready signalling
665 combines in a *combinatorial* fashion. Therefore, a long pipeline
666 chain will lengthen propagation delays.
667
668 Argument: stage. see Stage API, above
669
670 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
671 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
672 stage-1 p.i_data >>in stage n.o_data out>> stage+1
673 | |
674 r_data result
675 | |
676 +--process ->-+
677
678 Attributes:
679 -----------
680 p.i_data : StageInput, shaped according to ispec
681 The pipeline input
682 n.o_data : StageOutput, shaped according to ospec
683 The pipeline output
684 r_data : input_shape according to ispec
685 A temporary (buffered) copy of a prior (valid) input.
686 This is HELD if the output is not ready. It is updated
687 SYNCHRONOUSLY.
688 result: output_shape according to ospec
689 The output of the combinatorial logic. it is updated
690 COMBINATORIALLY (no clock dependence).
691
692 Truth Table
693
694 Inputs    Temp    Output  Data
695 -------   -----   ------  ----
696 P P N N   ~NiR&   N P
697 i o i o   NoV     o o
698 V R R V           V R
699
700 -------   -       - -
701 0 0 0 0   0       0 1     reg
702 0 0 0 1   1       1 0     reg
703 0 0 1 0   0       0 1     reg
704 0 0 1 1   0       0 1     reg
705 -------   -       - -
706 0 1 0 0   0       0 1     reg
707 0 1 0 1   1       1 0     reg
708 0 1 1 0   0       0 1     reg
709 0 1 1 1   0       0 1     reg
710 -------   -       - -
711 1 0 0 0   0       1 1     reg
712 1 0 0 1   1       1 0     reg
713 1 0 1 0   0       1 1     reg
714 1 0 1 1   0       1 1     reg
715 -------   -       - -
716 1 1 0 0   0       1 1     process(i_data)
717 1 1 0 1   1       1 0     process(i_data)
718 1 1 1 0   0       1 1     process(i_data)
719 1 1 1 1   0       1 1     process(i_data)
720 -------   -       - -
721
722 Note: PoR is *NOT* involved in the above decision-making.
723 """
724
725 def elaborate(self, platform):
726 self.m = m = ControlBase.elaborate(self, platform)
727
728 data_valid = Signal() # is data valid or not
729 r_data = _spec(self.stage.ospec, "r_tmp") # output type
730
731 # some temporaries
732 p_i_valid = Signal(reset_less=True)
733 pv = Signal(reset_less=True)
734 buf_full = Signal(reset_less=True)
735 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
736 m.d.comb += pv.eq(self.p.i_valid & self.p.o_ready)
737 m.d.comb += buf_full.eq(~self.n.i_ready_test & data_valid)
738
739 m.d.comb += self.n.o_valid.eq(data_valid)
740 m.d.comb += self.p._o_ready.eq(~data_valid | self.n.i_ready_test)
741 m.d.sync += data_valid.eq(p_i_valid | buf_full)
742
743 with m.If(pv):
744 m.d.sync += nmoperator.eq(r_data, self.data_r)
745 o_data = self._postprocess(r_data) # XXX TBD, does nothing right now
746 m.d.comb += nmoperator.eq(self.n.o_data, o_data)
747
748 return self.m
749
750
751 class UnbufferedPipeline2(ControlBase):
752 """ A simple pipeline stage with single-clock synchronisation
753 and two-way valid/ready synchronised signalling.
754
755 Note that a stall in one stage will result in the entire pipeline
756 chain stalling.
757
758 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
759 travel synchronously with the data: the valid/ready signalling
760 combines in a *combinatorial* fashion. Therefore, a long pipeline
761 chain will lengthen propagation delays.
762
763 Argument: stage. see Stage API, above
764
765 stage-1 p.i_valid >>in stage n.o_valid out>> stage+1
766 stage-1 p.o_ready <<out stage n.i_ready <<in stage+1
767 stage-1 p.i_data >>in stage n.o_data out>> stage+1
768 | | |
769 +- process-> buf <-+
770 Attributes:
771 -----------
772 p.i_data : StageInput, shaped according to ispec
773 The pipeline input
774 n.o_data : StageOutput, shaped according to ospec
775 The pipeline output
776 buf : output_shape according to ospec
777 A temporary (buffered) copy of a valid output
778 This is HELD if the output is not ready. It is updated
779 SYNCHRONOUSLY.
780
781 Inputs    Temp    Output  Data
782 -------   -----   ------
783 P P N N   ~NiR&   N P     (buf_full)
784 i o i o   NoV     o o
785 V R R V           V R
786
787 -------   -       - -
788 0 0 0 0   0       0 1     process(i_data)
789 0 0 0 1   1       1 0     reg (odata, unchanged)
790 0 0 1 0   0       0 1     process(i_data)
791 0 0 1 1   0       0 1     process(i_data)
792 -------   -       - -
793 0 1 0 0   0       0 1     process(i_data)
794 0 1 0 1   1       1 0     reg (odata, unchanged)
795 0 1 1 0   0       0 1     process(i_data)
796 0 1 1 1   0       0 1     process(i_data)
797 -------   -       - -
798 1 0 0 0   0       1 1     process(i_data)
799 1 0 0 1   1       1 0     reg (odata, unchanged)
800 1 0 1 0   0       1 1     process(i_data)
801 1 0 1 1   0       1 1     process(i_data)
802 -------   -       - -
803 1 1 0 0   0       1 1     process(i_data)
804 1 1 0 1   1       1 0     reg (odata, unchanged)
805 1 1 1 0   0       1 1     process(i_data)
806 1 1 1 1   0       1 1     process(i_data)
807 -------   -       - -
808
809 Note: PoR is *NOT* involved in the above decision-making.
810 """
811
812 def elaborate(self, platform):
813 self.m = m = ControlBase.elaborate(self, platform)
814
815 buf_full = Signal() # is data valid or not
816 buf = _spec(self.stage.ospec, "r_tmp") # output type
817
818 # some temporaries
819 p_i_valid = Signal(reset_less=True)
820 m.d.comb += p_i_valid.eq(self.p.i_valid_test)
821
822 m.d.comb += self.n.o_valid.eq(buf_full | p_i_valid)
823 m.d.comb += self.p._o_ready.eq(~buf_full)
824 m.d.sync += buf_full.eq(~self.n.i_ready_test & self.n.o_valid)
825
826 o_data = Mux(buf_full, buf, self.data_r)
827 o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
828 m.d.comb += nmoperator.eq(self.n.o_data, o_data)
829 m.d.sync += nmoperator.eq(buf, self.n.o_data)
830
831 return self.m
832
833
834 class PassThroughHandshake(ControlBase):
835 """ A control block that delays by one clock cycle.
836
837 Inputs    Temporary               Output  Data
838 -------   ---------------------   ------  ----
839 P P N N   PiV&  PiV|  NiR|  pvr   N P     (pvr)
840 i o i o   PoR   ~PoR  ~NoV        o o
841 V R R V                           V R
842
843 -------   -     -     -     -     - -
844 0 0 0 0   0     1     1     0     1 1     odata (unchanged)
845 0 0 0 1   0     1     0     0     1 0     odata (unchanged)
846 0 0 1 0   0     1     1     0     1 1     odata (unchanged)
847 0 0 1 1   0     1     1     0     1 1     odata (unchanged)
848 -------   -     -     -     -     - -
849 0 1 0 0   0     0     1     0     0 1     odata (unchanged)
850 0 1 0 1   0     0     0     0     0 0     odata (unchanged)
851 0 1 1 0   0     0     1     0     0 1     odata (unchanged)
852 0 1 1 1   0     0     1     0     0 1     odata (unchanged)
853 -------   -     -     -     -     - -
854 1 0 0 0   0     1     1     1     1 1     process(in)
855 1 0 0 1   0     1     0     0     1 0     odata (unchanged)
856 1 0 1 0   0     1     1     1     1 1     process(in)
857 1 0 1 1   0     1     1     1     1 1     process(in)
858 -------   -     -     -     -     - -
859 1 1 0 0   1     1     1     1     1 1     process(in)
860 1 1 0 1   1     1     0     0     1 0     odata (unchanged)
861 1 1 1 0   1     1     1     1     1 1     process(in)
862 1 1 1 1   1     1     1     1     1 1     process(in)
863 -------   -     -     -     -     - -

    """

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        r_data = _spec(self.stage.ospec, "r_tmp") # output type

        # temporaries
        p_i_valid = Signal(reset_less=True)
        pvr = Signal(reset_less=True)
        m.d.comb += p_i_valid.eq(self.p.i_valid_test)
        m.d.comb += pvr.eq(p_i_valid & self.p.o_ready)

        m.d.comb += self.p.o_ready.eq(~self.n.o_valid | self.n.i_ready_test)
        m.d.sync += self.n.o_valid.eq(p_i_valid | ~self.p.o_ready)

        odata = Mux(pvr, self.data_r, r_data)
        m.d.sync += nmoperator.eq(r_data, odata)
        r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
        m.d.comb += nmoperator.eq(self.n.o_data, r_data)

        return m


class RegisterPipeline(UnbufferedPipeline):
    """ A pipeline stage that delays by one clock cycle, creating a
        sync'd latch out of o_data and o_valid as an indirect byproduct
        of using PassThroughStage
    """
    def __init__(self, iospecfn):
        UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))


class FIFOControl(ControlBase):
    """ FIFO Control.  Uses Queue to store data, coincidentally
        happens to have same valid/ready signalling as Stage API.

        i_data -> fifo.din -> FIFO -> fifo.dout -> o_data
    """
    def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
                 fwft=True, pipe=False):
        """ FIFO Control

            * :depth: number of entries in the FIFO
            * :stage: data processing block
            * :fwft:  first word fall-through mode (non-fwft introduces delay)
            * :pipe:  specifies pipe mode.

            when fwft = True it indicates that transfers may occur
            combinatorially through stage processing in the same clock cycle.
            This requires that the Stage be a Mealy FSM (outputs depend on
            the current inputs as well as on state):
            https://en.wikipedia.org/wiki/Mealy_machine

            when fwft = False it indicates that all output signals are
            produced only from internal registers or memory, i.e. that the
            Stage is a Moore FSM (outputs depend on state only):
            https://en.wikipedia.org/wiki/Moore_machine

            data is processed (and located) as follows:

            self.p  self.stage temp    fn temp  fn  temp  fp   self.n
            i_data->process()->result->cat->din.FIFO.dout->cat(o_data)

            yes, really: cat produces a Cat() which can be assigned to.
            this is how the FIFO gets de-catted without needing a de-cat
            function
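
            The cat / de-cat idea can be illustrated without nmigen by
            packing fields into a single integer and splitting it again.
            This is only a plain-Python analogy: cat_fields() and
            decat_fields() are hypothetical helpers, not part of nmutil.
            In the real code, a Cat() of the output signals is assigned
            from the FIFO output, which splits the word in the same way.

```python
# Hypothetical plain-Python analogue of concatenating fields into one
# FIFO word and recovering them afterwards.

def cat_fields(values, widths):
    # the first field lands in the least-significant bits, as with Cat()
    word, shift = 0, 0
    for value, width in zip(values, widths):
        word |= (value & ((1 << width) - 1)) << shift
        shift += width
    return word


def decat_fields(word, widths):
    # peel the fields off again, least-significant field first
    fields = []
    for width in widths:
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return fields


widths = [4, 8, 1]
word = cat_fields([0x9, 0xAB, 1], widths)
fields = decat_fields(word, widths)
```

            The total of the widths (13 bits in this sketch) plays the
            role of the FIFO data width.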
        """
        self.fwft = fwft
        self.pipe = pipe
        self.fdepth = depth
        ControlBase.__init__(self, stage, in_multi, stage_ctl)

    def elaborate(self, platform):
        self.m = m = ControlBase.elaborate(self, platform)

        # make a FIFO with a signal of equal width to the o_data.
        (fwidth, _) = nmoperator.shape(self.n.o_data)
        fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
        m.submodules.fifo = fifo

        def processfn(i_data):
            # store result of processing in combinatorial temporary
            result = _spec(self.stage.ospec, "r_temp")
            m.d.comb += nmoperator.eq(result, self.process(i_data))
            return nmoperator.cat(result)

        ## prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.i_valid, fp._o_ready, fp.i_data = fifo.w_en, fifo.w_rdy, fifo.w_data
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.o_valid, fn.i_ready, fn.o_data = fifo.r_rdy, fifo.r_en, fifo.r_data
        connections = fn._connect_out(self.n, fn=nmoperator.cat)
        valid_eq, ready_eq, o_data = connections

        # ok ok so we can't just do the ready/valid eqs straight:
        # first 2 from connections are the ready/valid, 3rd is data.
        if self.fwft:
            m.d.comb += [valid_eq, ready_eq] # combinatorial on next ready/valid
        else:
            m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
        o_data = self._postprocess(o_data) # XXX TBD, does nothing right now
        m.d.comb += o_data

        return m


# aka "RegStage".
class UnbufferedPipeline(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)

# aka "BreakReadyStage" XXX had to set fwft=True to get it to work
class PassThroughHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
                             fwft=True, pipe=True)

# this is *probably* BufferedHandshake, although test #997 now succeeds.
class BufferedHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)


"""
# this is *probably* SimpleHandshake (note: memory cell size=0)
class SimpleHandshake(FIFOControl):
    def __init__(self, stage, in_multi=None, stage_ctl=False):
        FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
                             fwft=True, pipe=False)
"""