speed up ==, hash, <, >, <=, and >= for plain_data
[nmutil.git] / src / nmutil / singlepipe.py
1 """ Pipeline API. For multi-input and multi-output variants, see multipipe.
2
3 Associated development bugs:
4 * http://bugs.libre-riscv.org/show_bug.cgi?id=148
5 * http://bugs.libre-riscv.org/show_bug.cgi?id=64
6 * http://bugs.libre-riscv.org/show_bug.cgi?id=57
7
8 Important: see Stage API (stageapi.py) and IO Control API
9 (iocontrol.py) in combination with below. This module
10 "combines" the Stage API with the IO Control API to create
11 the Pipeline API.
12
13 The one critically important key difference between StageAPI and
14 PipelineAPI:
15
16 * StageAPI: combinatorial (NO REGISTERS / LATCHES PERMITTED)
17 * PipelineAPI: synchronous registers / latches get added here
18
19 RecordBasedStage:
20 ----------------
21
22 A convenience class that takes an input shape, output shape, a
23 "processing" function and an optional "setup" function. Honestly
24 though, there's not much more effort to just... create a class
25 that returns a couple of Records (see ExampleAddRecordStage in
26 examples).
27
28 PassThroughStage:
29 ----------------
30
31 A convenience class that takes a single function as a parameter,
32 that is chain-called to create the exact same input and output spec.
33 It has a process() function that simply returns its input.
34
35 Instances of this class are completely redundant if handed to
36 StageChain, however when passed to UnbufferedPipeline they
37 can be used to introduce a single clock delay.
38
39 ControlBase:
40 -----------
41
42 The base class for pipelines. Contains previous and next ready/valid/data.
43 Also has an extremely useful "connect" function that can be used to
44 connect a chain of pipelines and present the exact same prev/next
45 ready/valid/data API.
46
47 Note: pipelines basically do not become pipelines as such until
48 handed to a derivative of ControlBase. ControlBase itself is *not*
49 strictly considered a pipeline class. Wishbone and AXI4 (master or
50 slave) could be derived from ControlBase, for example.
51 UnbufferedPipeline:
52 ------------------
53
54 A simple stalling clock-synchronised pipeline that has no buffering
55 (unlike BufferedHandshake). Data flows on *every* clock cycle when
56 the conditions are right (this is nominally when the input is valid
57 and the output is ready).
58
59 A stall anywhere along the line will result in a stall back-propagating
60 down the entire chain. The BufferedHandshake by contrast will buffer
61 incoming data, allowing previous stages one clock cycle's grace before
62 also having to stall.
63
64 An advantage of the UnbufferedPipeline over the Buffered one is
65 that the amount of logic needed (number of gates) is greatly
66 reduced (no second set of buffers basically)
67
68 The disadvantage of the UnbufferedPipeline is that the valid/ready
69 logic, if chained together, is *combinatorial*, resulting in
70 progressively larger gate delay.
71
72 PassThroughHandshake:
73 ------------------
74
75 A Control class that introduces a single clock delay, passing its
76 data through unaltered. Unlike RegisterPipeline (which relies
77 on UnbufferedPipeline and PassThroughStage) it handles ready/valid
78 itself.
79
80 RegisterPipeline:
81 ----------------
82
83 A convenience class that, because UnbufferedPipeline introduces a single
84 clock delay, when its stage is a PassThroughStage, it results in a Pipeline
85 stage that, duh, delays its (unmodified) input by one clock cycle.
86
87 BufferedHandshake:
88 ----------------
89
90 nmigen implementation of buffered pipeline stage, based on zipcpu:
91 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html
92
93 this module requires quite a bit of thought to understand how it works
94 (and why it is needed in the first place). reading the above is
95 *strongly* recommended.
96
97 unlike john dawson's IEEE754 FPU STB/ACK signalling, which requires
98 the STB / ACK signals to raise and lower (on separate clocks) before
99 data may proceeed (thus only allowing one piece of data to proceed
100 on *ALTERNATE* cycles), the signalling here is a true pipeline
101 where data will flow on *every* clock when the conditions are right.
102
103 input acceptance conditions are when:
104 * incoming previous-stage strobe (p.valid_i) is HIGH
105 * outgoing previous-stage ready (p.ready_o) is LOW
106
107 output transmission conditions are when:
108 * outgoing next-stage strobe (n.valid_o) is HIGH
109 * outgoing next-stage ready (n.ready_i) is LOW
110
111 the tricky bit is when the input has valid data and the output is not
112 ready to accept it. if it wasn't for the clock synchronisation, it
113 would be possible to tell the input "hey don't send that data, we're
114 not ready". unfortunately, it's not possible to "change the past":
115 the previous stage *has no choice* but to pass on its data.
116
117 therefore, the incoming data *must* be accepted - and stored: that
118 is the responsibility / contract that this stage *must* accept.
119 on the same clock, it's possible to tell the input that it must
120 not send any more data. this is the "stall" condition.
121
122 we now effectively have *two* possible pieces of data to "choose" from:
123 the buffered data, and the incoming data. the decision as to which
124 to process and output is based on whether we are in "stall" or not.
125 i.e. when the next stage is no longer ready, the output comes from
126 the buffer if a stall had previously occurred, otherwise it comes
127 direct from processing the input.
128
129 this allows us to respect a synchronous "travelling STB" with what
130 dan calls a "buffered handshake".
131
132 it's quite a complex state machine!
133
134 SimpleHandshake
135 ---------------
136
137 Synchronised pipeline, Based on:
138 https://github.com/ZipCPU/dbgbus/blob/master/hexbus/rtl/hbdeword.v
139 """
140
141 from nmigen import Signal, Mux, Module, Elaboratable, Const
142 from nmigen.cli import verilog, rtlil
143 from nmigen.hdl.rec import Record
144
145 from nmutil.queue import Queue
146 import inspect
147
148 from nmutil.iocontrol import (PrevControl, NextControl, Object, RecordObject)
149 from nmutil.stageapi import (_spec, StageCls, Stage, StageChain, StageHelper)
150 from nmutil import nmoperator
151
152
153 class RecordBasedStage(Stage):
154 """ convenience class which provides a Records-based layout.
155 honestly it's a lot easier just to create a direct Records-based
156 class (see ExampleAddRecordStage)
157 """
158 def __init__(self, in_shape, out_shape, processfn, setupfn=None):
159 self.in_shape = in_shape
160 self.out_shape = out_shape
161 self.__process = processfn
162 self.__setup = setupfn
163 def ispec(self): return Record(self.in_shape)
164 def ospec(self): return Record(self.out_shape)
165 def process(seif, i): return self.__process(i)
166 def setup(seif, m, i): return self.__setup(m, i)
167
168
169 class PassThroughStage(StageCls):
170 """ a pass-through stage with its input data spec identical to its output,
171 and "passes through" its data from input to output (does nothing).
172
173 use this basically to explicitly make any data spec Stage-compliant.
174 (many APIs would potentially use a static "wrap" method in e.g.
175 StageCls to achieve a similar effect)
176 """
177 def __init__(self, iospecfn): self.iospecfn = iospecfn
178 def ispec(self): return self.iospecfn()
179 def ospec(self): return self.iospecfn()
180
181
182 class ControlBase(StageHelper, Elaboratable):
183 """ Common functions for Pipeline API. Note: a "pipeline stage" only
184 exists (conceptually) when a ControlBase derivative is handed
185 a Stage (combinatorial block)
186
187 NOTE: ControlBase derives from StageHelper, making it accidentally
188 compliant with the Stage API. Using those functions directly
189 *BYPASSES* a ControlBase instance ready/valid signalling, which
190 clearly should not be done without a really, really good reason.
191 """
192 def __init__(self, stage=None, in_multi=None, stage_ctl=False, maskwid=0):
193 """ Base class containing ready/valid/data to previous and next stages
194
195 * p: contains ready/valid to the previous stage
196 * n: contains ready/valid to the next stage
197
198 Except when calling Controlbase.connect(), user must also:
199 * add data_i member to PrevControl (p) and
200 * add data_o member to NextControl (n)
201 Calling ControlBase._new_data is a good way to do that.
202 """
203 print ("ControlBase", self, stage, in_multi, stage_ctl)
204 StageHelper.__init__(self, stage)
205
206 # set up input and output IO ACK (prev/next ready/valid)
207 self.p = PrevControl(in_multi, stage_ctl, maskwid=maskwid)
208 self.n = NextControl(stage_ctl, maskwid=maskwid)
209
210 # set up the input and output data
211 if stage is not None:
212 self._new_data("data")
213
214 def _new_data(self, name):
215 """ allocates new data_i and data_o
216 """
217 self.p.data_i, self.n.data_o = self.new_specs(name)
218
219 @property
220 def data_r(self):
221 return self.process(self.p.data_i)
222
223 def connect_to_next(self, nxt):
224 """ helper function to connect to the next stage data/valid/ready.
225 """
226 return self.n.connect_to_next(nxt.p)
227
228 def _connect_in(self, prev):
229 """ internal helper function to connect stage to an input source.
230 do not use to connect stage-to-stage!
231 """
232 return self.p._connect_in(prev.p)
233
234 def _connect_out(self, nxt):
235 """ internal helper function to connect stage to an output source.
236 do not use to connect stage-to-stage!
237 """
238 return self.n._connect_out(nxt.n)
239
240 def connect(self, pipechain):
241 """ connects a chain (list) of Pipeline instances together and
242 links them to this ControlBase instance:
243
244 in <----> self <---> out
245 | ^
246 v |
247 [pipe1, pipe2, pipe3, pipe4]
248 | ^ | ^ | ^
249 v | v | v |
250 out---in out--in out---in
251
252 Also takes care of allocating data_i/data_o, by looking up
253 the data spec for each end of the pipechain. i.e It is NOT
254 necessary to allocate self.p.data_i or self.n.data_o manually:
255 this is handled AUTOMATICALLY, here.
256
257 Basically this function is the direct equivalent of StageChain,
258 except that unlike StageChain, the Pipeline logic is followed.
259
260 Just as StageChain presents an object that conforms to the
261 Stage API from a list of objects that also conform to the
262 Stage API, an object that calls this Pipeline connect function
263 has the exact same pipeline API as the list of pipline objects
264 it is called with.
265
266 Thus it becomes possible to build up larger chains recursively.
267 More complex chains (multi-input, multi-output) will have to be
268 done manually.
269
270 Argument:
271
272 * :pipechain: - a sequence of ControlBase-derived classes
273 (must be one or more in length)
274
275 Returns:
276
277 * a list of eq assignments that will need to be added in
278 an elaborate() to m.d.comb
279 """
280 assert len(pipechain) > 0, "pipechain must be non-zero length"
281 assert self.stage is None, "do not use connect with a stage"
282 eqs = [] # collated list of assignment statements
283
284 # connect inter-chain
285 for i in range(len(pipechain)-1):
286 pipe1 = pipechain[i] # earlier
287 pipe2 = pipechain[i+1] # later (by 1)
288 eqs += pipe1.connect_to_next(pipe2) # earlier n to later p
289
290 # connect front and back of chain to ourselves
291 front = pipechain[0] # first in chain
292 end = pipechain[-1] # last in chain
293 self.set_specs(front, end) # sets up ispec/ospec functions
294 self._new_data("chain") # NOTE: REPLACES existing data
295 eqs += front._connect_in(self) # front p to our p
296 eqs += end._connect_out(self) # end n to our n
297
298 return eqs
299
300 def set_input(self, i):
301 """ helper function to set the input data (used in unit tests)
302 """
303 return nmoperator.eq(self.p.data_i, i)
304
305 def __iter__(self):
306 yield from self.p # yields ready/valid/data (data also gets yielded)
307 yield from self.n # ditto
308
309 def ports(self):
310 return list(self)
311
312 def elaborate(self, platform):
313 """ handles case where stage has dynamic ready/valid functions
314 """
315 m = Module()
316 m.submodules.p = self.p
317 m.submodules.n = self.n
318
319 self.setup(m, self.p.data_i)
320
321 if not self.p.stage_ctl:
322 return m
323
324 # intercept the previous (outgoing) "ready", combine with stage ready
325 m.d.comb += self.p.s_ready_o.eq(self.p._ready_o & self.stage.d_ready)
326
327 # intercept the next (incoming) "ready" and combine it with data valid
328 sdv = self.stage.d_valid(self.n.ready_i)
329 m.d.comb += self.n.d_valid.eq(self.n.ready_i & sdv)
330
331 return m
332
333
334 class BufferedHandshake(ControlBase):
335 """ buffered pipeline stage. data and strobe signals travel in sync.
336 if ever the input is ready and the output is not, processed data
337 is shunted in a temporary register.
338
339 Argument: stage. see Stage API above
340
341 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
342 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
343 stage-1 p.data_i >>in stage n.data_o out>> stage+1
344 | |
345 process --->----^
346 | |
347 +-- r_data ->-+
348
349 input data p.data_i is read (only), is processed and goes into an
350 intermediate result store [process()]. this is updated combinatorially.
351
352 in a non-stall condition, the intermediate result will go into the
353 output (update_output). however if ever there is a stall, it goes
354 into r_data instead [update_buffer()].
355
356 when the non-stall condition is released, r_data is the first
357 to be transferred to the output [flush_buffer()], and the stall
358 condition cleared.
359
360 on the next cycle (as long as stall is not raised again) the
361 input may begin to be processed and transferred directly to output.
362 """
363
364 def elaborate(self, platform):
365 self.m = ControlBase.elaborate(self, platform)
366
367 result = _spec(self.stage.ospec, "r_tmp")
368 r_data = _spec(self.stage.ospec, "r_data")
369
370 # establish some combinatorial temporaries
371 o_n_validn = Signal(reset_less=True)
372 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
373 nir_por = Signal(reset_less=True)
374 nir_por_n = Signal(reset_less=True)
375 p_valid_i = Signal(reset_less=True)
376 nir_novn = Signal(reset_less=True)
377 nirn_novn = Signal(reset_less=True)
378 por_pivn = Signal(reset_less=True)
379 npnn = Signal(reset_less=True)
380 self.m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
381 o_n_validn.eq(~self.n.valid_o),
382 n_ready_i.eq(self.n.ready_i_test),
383 nir_por.eq(n_ready_i & self.p._ready_o),
384 nir_por_n.eq(n_ready_i & ~self.p._ready_o),
385 nir_novn.eq(n_ready_i | o_n_validn),
386 nirn_novn.eq(~n_ready_i & o_n_validn),
387 npnn.eq(nir_por | nirn_novn),
388 por_pivn.eq(self.p._ready_o & ~p_valid_i)
389 ]
390
391 # store result of processing in combinatorial temporary
392 self.m.d.comb += nmoperator.eq(result, self.data_r)
393
394 # if not in stall condition, update the temporary register
395 with self.m.If(self.p.ready_o): # not stalled
396 self.m.d.sync += nmoperator.eq(r_data, result) # update buffer
397
398 # data pass-through conditions
399 with self.m.If(npnn):
400 data_o = self._postprocess(result) # XXX TBD, does nothing right now
401 self.m.d.sync += [self.n.valid_o.eq(p_valid_i), # valid if p_valid
402 nmoperator.eq(self.n.data_o, data_o), # update out
403 ]
404 # buffer flush conditions (NOTE: can override data passthru conditions)
405 with self.m.If(nir_por_n): # not stalled
406 # Flush the [already processed] buffer to the output port.
407 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
408 self.m.d.sync += [self.n.valid_o.eq(1), # reg empty
409 nmoperator.eq(self.n.data_o, data_o), # flush
410 ]
411 # output ready conditions
412 self.m.d.sync += self.p._ready_o.eq(nir_novn | por_pivn)
413
414 return self.m
415
416
417 class MaskNoDelayCancellable(ControlBase):
418 """ Mask-activated Cancellable pipeline (that does not respect "ready")
419
420 Based on (identical behaviour to) SimpleHandshake.
421 TODO: decide whether to merge *into* SimpleHandshake.
422
423 Argument: stage. see Stage API above
424
425 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
426 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
427 stage-1 p.data_i >>in stage n.data_o out>> stage+1
428 | |
429 +--process->--^
430 """
431 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False):
432 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
433
434 def elaborate(self, platform):
435 self.m = m = ControlBase.elaborate(self, platform)
436
437 # store result of processing in combinatorial temporary
438 result = _spec(self.stage.ospec, "r_tmp")
439 m.d.comb += nmoperator.eq(result, self.data_r)
440
441 # establish if the data should be passed on. cancellation is
442 # a global signal.
443 # XXX EXCEPTIONAL CIRCUMSTANCES: inspection of the data payload
444 # is NOT "normal" for the Stage API.
445 p_valid_i = Signal(reset_less=True)
446 #print ("self.p.data_i", self.p.data_i)
447 maskedout = Signal(len(self.p.mask_i), reset_less=True)
448 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
449 m.d.comb += p_valid_i.eq(maskedout.bool())
450
451 # if idmask nonzero, mask gets passed on (and register set).
452 # register is left as-is if idmask is zero, but out-mask is set to zero
453 # note however: only the *uncancelled* mask bits get passed on
454 m.d.sync += self.n.valid_o.eq(p_valid_i)
455 m.d.sync += self.n.mask_o.eq(Mux(p_valid_i, maskedout, 0))
456 with m.If(p_valid_i):
457 data_o = self._postprocess(result) # XXX TBD, does nothing right now
458 m.d.sync += nmoperator.eq(self.n.data_o, data_o) # update output
459
460 # output valid if
461 # input always "ready"
462 #m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
463 m.d.comb += self.p._ready_o.eq(Const(1))
464
465 # always pass on stop (as combinatorial: single signal)
466 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
467
468 return self.m
469
470
471 class MaskCancellable(ControlBase):
472 """ Mask-activated Cancellable pipeline
473
474 Arguments:
475
476 * stage. see Stage API above
477 * maskwid - sets up cancellation capability (mask and stop).
478 * in_multi
479 * stage_ctl
480 * dynamic - allows switching from sync to combinatorial (passthrough)
481 USE WITH CARE. will need the entire pipe to be quiescent
482 before switching, otherwise data WILL be destroyed.
483
484 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
485 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
486 stage-1 p.data_i >>in stage n.data_o out>> stage+1
487 | |
488 +--process->--^
489 """
490 def __init__(self, stage, maskwid, in_multi=None, stage_ctl=False,
491 dynamic=False):
492 ControlBase.__init__(self, stage, in_multi, stage_ctl, maskwid)
493 self.dynamic = dynamic
494 if dynamic:
495 self.latchmode = Signal()
496 else:
497 self.latchmode = Const(1)
498
499 def elaborate(self, platform):
500 self.m = m = ControlBase.elaborate(self, platform)
501
502 mask_r = Signal(len(self.p.mask_i), reset_less=True)
503 data_r = _spec(self.stage.ospec, "data_r")
504 m.d.comb += nmoperator.eq(data_r, self._postprocess(self.data_r))
505
506 with m.If(self.latchmode):
507 r_busy = Signal()
508 r_latch = _spec(self.stage.ospec, "r_latch")
509
510 # establish if the data should be passed on. cancellation is
511 # a global signal.
512 p_valid_i = Signal(reset_less=True)
513 #print ("self.p.data_i", self.p.data_i)
514 maskedout = Signal(len(self.p.mask_i), reset_less=True)
515 m.d.comb += maskedout.eq(self.p.mask_i & ~self.p.stop_i)
516
517 # establish some combinatorial temporaries
518 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
519 p_valid_i_p_ready_o = Signal(reset_less=True)
520 m.d.comb += [p_valid_i.eq(self.p.valid_i_test & maskedout.bool()),
521 n_ready_i.eq(self.n.ready_i_test),
522 p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
523 ]
524
525 # if idmask nonzero, mask gets passed on (and register set).
526 # register is left as-is if idmask is zero, but out-mask is set to
527 # zero
528 # note however: only the *uncancelled* mask bits get passed on
529 m.d.sync += mask_r.eq(Mux(p_valid_i, maskedout, 0))
530 m.d.comb += self.n.mask_o.eq(mask_r)
531
532 # always pass on stop (as combinatorial: single signal)
533 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
534
535 stor = Signal(reset_less=True)
536 m.d.comb += stor.eq(p_valid_i_p_ready_o | n_ready_i)
537 with m.If(stor):
538 # store result of processing in combinatorial temporary
539 m.d.sync += nmoperator.eq(r_latch, data_r)
540
541 # previous valid and ready
542 with m.If(p_valid_i_p_ready_o):
543 m.d.sync += r_busy.eq(1) # output valid
544 # previous invalid or not ready, however next is accepting
545 with m.Elif(n_ready_i):
546 m.d.sync += r_busy.eq(0) # ...so set output invalid
547
548 # output set combinatorially from latch
549 m.d.comb += nmoperator.eq(self.n.data_o, r_latch)
550
551 m.d.comb += self.n.valid_o.eq(r_busy)
552 # if next is ready, so is previous
553 m.d.comb += self.p._ready_o.eq(n_ready_i)
554
555 with m.Else():
556 # pass everything straight through. p connected to n: data,
557 # valid, mask, everything. this is "effectively" just a
558 # StageChain: MaskCancellable is doing "nothing" except
559 # combinatorially passing everything through
560 # (except now it's *dynamically selectable* whether to do that)
561 m.d.comb += self.n.valid_o.eq(self.p.valid_i_test)
562 m.d.comb += self.p._ready_o.eq(self.n.ready_i_test)
563 m.d.comb += self.n.stop_o.eq(self.p.stop_i)
564 m.d.comb += self.n.mask_o.eq(self.p.mask_i)
565 m.d.comb += nmoperator.eq(self.n.data_o, data_r)
566
567 return self.m
568
569
570 class SimpleHandshake(ControlBase):
571 """ simple handshake control. data and strobe signals travel in sync.
572 implements the protocol used by Wishbone and AXI4.
573
574 Argument: stage. see Stage API above
575
576 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
577 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
578 stage-1 p.data_i >>in stage n.data_o out>> stage+1
579 | |
580 +--process->--^
581 Truth Table
582
583 Inputs Temporary Output Data
584 ------- ---------- ----- ----
585 P P N N PiV& ~NiR& N P
586 i o i o PoR NoV o o
587 V R R V V R
588
589 ------- - - - -
590 0 0 0 0 0 0 >0 0 reg
591 0 0 0 1 0 1 >1 0 reg
592 0 0 1 0 0 0 0 1 process(data_i)
593 0 0 1 1 0 0 0 1 process(data_i)
594 ------- - - - -
595 0 1 0 0 0 0 >0 0 reg
596 0 1 0 1 0 1 >1 0 reg
597 0 1 1 0 0 0 0 1 process(data_i)
598 0 1 1 1 0 0 0 1 process(data_i)
599 ------- - - - -
600 1 0 0 0 0 0 >0 0 reg
601 1 0 0 1 0 1 >1 0 reg
602 1 0 1 0 0 0 0 1 process(data_i)
603 1 0 1 1 0 0 0 1 process(data_i)
604 ------- - - - -
605 1 1 0 0 1 0 1 0 process(data_i)
606 1 1 0 1 1 1 1 0 process(data_i)
607 1 1 1 0 1 0 1 1 process(data_i)
608 1 1 1 1 1 0 1 1 process(data_i)
609 ------- - - - -
610 """
611
612 def elaborate(self, platform):
613 self.m = m = ControlBase.elaborate(self, platform)
614
615 r_busy = Signal()
616 result = _spec(self.stage.ospec, "r_tmp")
617
618 # establish some combinatorial temporaries
619 n_ready_i = Signal(reset_less=True, name="n_i_rdy_data")
620 p_valid_i_p_ready_o = Signal(reset_less=True)
621 p_valid_i = Signal(reset_less=True)
622 m.d.comb += [p_valid_i.eq(self.p.valid_i_test),
623 n_ready_i.eq(self.n.ready_i_test),
624 p_valid_i_p_ready_o.eq(p_valid_i & self.p.ready_o),
625 ]
626
627 # store result of processing in combinatorial temporary
628 m.d.comb += nmoperator.eq(result, self.data_r)
629
630 # previous valid and ready
631 with m.If(p_valid_i_p_ready_o):
632 data_o = self._postprocess(result) # XXX TBD, does nothing right now
633 m.d.sync += [r_busy.eq(1), # output valid
634 nmoperator.eq(self.n.data_o, data_o), # update output
635 ]
636 # previous invalid or not ready, however next is accepting
637 with m.Elif(n_ready_i):
638 data_o = self._postprocess(result) # XXX TBD, does nothing right now
639 m.d.sync += [nmoperator.eq(self.n.data_o, data_o)]
640 # TODO: could still send data here (if there was any)
641 #m.d.sync += self.n.valid_o.eq(0) # ...so set output invalid
642 m.d.sync += r_busy.eq(0) # ...so set output invalid
643
644 m.d.comb += self.n.valid_o.eq(r_busy)
645 # if next is ready, so is previous
646 m.d.comb += self.p._ready_o.eq(n_ready_i)
647
648 return self.m
649
650
651 class UnbufferedPipeline(ControlBase):
652 """ A simple pipeline stage with single-clock synchronisation
653 and two-way valid/ready synchronised signalling.
654
655 Note that a stall in one stage will result in the entire pipeline
656 chain stalling.
657
658 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
659 travel synchronously with the data: the valid/ready signalling
660 combines in a *combinatorial* fashion. Therefore, a long pipeline
661 chain will lengthen propagation delays.
662
663 Argument: stage. see Stage API, above
664
665 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
666 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
667 stage-1 p.data_i >>in stage n.data_o out>> stage+1
668 | |
669 r_data result
670 | |
671 +--process ->-+
672
673 Attributes:
674 -----------
675 p.data_i : StageInput, shaped according to ispec
676 The pipeline input
677 p.data_o : StageOutput, shaped according to ospec
678 The pipeline output
679 r_data : input_shape according to ispec
680 A temporary (buffered) copy of a prior (valid) input.
681 This is HELD if the output is not ready. It is updated
682 SYNCHRONOUSLY.
683 result: output_shape according to ospec
684 The output of the combinatorial logic. it is updated
685 COMBINATORIALLY (no clock dependence).
686
687 Truth Table
688
689 Inputs Temp Output Data
690 ------- - ----- ----
691 P P N N ~NiR& N P
692 i o i o NoV o o
693 V R R V V R
694
695 ------- - - -
696 0 0 0 0 0 0 1 reg
697 0 0 0 1 1 1 0 reg
698 0 0 1 0 0 0 1 reg
699 0 0 1 1 0 0 1 reg
700 ------- - - -
701 0 1 0 0 0 0 1 reg
702 0 1 0 1 1 1 0 reg
703 0 1 1 0 0 0 1 reg
704 0 1 1 1 0 0 1 reg
705 ------- - - -
706 1 0 0 0 0 1 1 reg
707 1 0 0 1 1 1 0 reg
708 1 0 1 0 0 1 1 reg
709 1 0 1 1 0 1 1 reg
710 ------- - - -
711 1 1 0 0 0 1 1 process(data_i)
712 1 1 0 1 1 1 0 process(data_i)
713 1 1 1 0 0 1 1 process(data_i)
714 1 1 1 1 0 1 1 process(data_i)
715 ------- - - -
716
717 Note: PoR is *NOT* involved in the above decision-making.
718 """
719
720 def elaborate(self, platform):
721 self.m = m = ControlBase.elaborate(self, platform)
722
723 data_valid = Signal() # is data valid or not
724 r_data = _spec(self.stage.ospec, "r_tmp") # output type
725
726 # some temporaries
727 p_valid_i = Signal(reset_less=True)
728 pv = Signal(reset_less=True)
729 buf_full = Signal(reset_less=True)
730 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
731 m.d.comb += pv.eq(self.p.valid_i & self.p.ready_o)
732 m.d.comb += buf_full.eq(~self.n.ready_i_test & data_valid)
733
734 m.d.comb += self.n.valid_o.eq(data_valid)
735 m.d.comb += self.p._ready_o.eq(~data_valid | self.n.ready_i_test)
736 m.d.sync += data_valid.eq(p_valid_i | buf_full)
737
738 with m.If(pv):
739 m.d.sync += nmoperator.eq(r_data, self.data_r)
740 data_o = self._postprocess(r_data) # XXX TBD, does nothing right now
741 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
742
743 return self.m
744
745
746 class UnbufferedPipeline2(ControlBase):
747 """ A simple pipeline stage with single-clock synchronisation
748 and two-way valid/ready synchronised signalling.
749
750 Note that a stall in one stage will result in the entire pipeline
751 chain stalling.
752
753 Also that unlike BufferedHandshake, the valid/ready signalling does NOT
754 travel synchronously with the data: the valid/ready signalling
755 combines in a *combinatorial* fashion. Therefore, a long pipeline
756 chain will lengthen propagation delays.
757
758 Argument: stage. see Stage API, above
759
760 stage-1 p.valid_i >>in stage n.valid_o out>> stage+1
761 stage-1 p.ready_o <<out stage n.ready_i <<in stage+1
762 stage-1 p.data_i >>in stage n.data_o out>> stage+1
763 | | |
764 +- process-> buf <-+
765 Attributes:
766 -----------
767 p.data_i : StageInput, shaped according to ispec
768 The pipeline input
769 p.data_o : StageOutput, shaped according to ospec
770 The pipeline output
771 buf : output_shape according to ospec
772 A temporary (buffered) copy of a valid output
773 This is HELD if the output is not ready. It is updated
774 SYNCHRONOUSLY.
775
776 Inputs Temp Output Data
777 ------- - -----
778 P P N N ~NiR& N P (buf_full)
779 i o i o NoV o o
780 V R R V V R
781
782 ------- - - -
783 0 0 0 0 0 0 1 process(data_i)
784 0 0 0 1 1 1 0 reg (odata, unchanged)
785 0 0 1 0 0 0 1 process(data_i)
786 0 0 1 1 0 0 1 process(data_i)
787 ------- - - -
788 0 1 0 0 0 0 1 process(data_i)
789 0 1 0 1 1 1 0 reg (odata, unchanged)
790 0 1 1 0 0 0 1 process(data_i)
791 0 1 1 1 0 0 1 process(data_i)
792 ------- - - -
793 1 0 0 0 0 1 1 process(data_i)
794 1 0 0 1 1 1 0 reg (odata, unchanged)
795 1 0 1 0 0 1 1 process(data_i)
796 1 0 1 1 0 1 1 process(data_i)
797 ------- - - -
798 1 1 0 0 0 1 1 process(data_i)
799 1 1 0 1 1 1 0 reg (odata, unchanged)
800 1 1 1 0 0 1 1 process(data_i)
801 1 1 1 1 0 1 1 process(data_i)
802 ------- - - -
803
804 Note: PoR is *NOT* involved in the above decision-making.
805 """
806
807 def elaborate(self, platform):
808 self.m = m = ControlBase.elaborate(self, platform)
809
810 buf_full = Signal() # is data valid or not
811 buf = _spec(self.stage.ospec, "r_tmp") # output type
812
813 # some temporaries
814 p_valid_i = Signal(reset_less=True)
815 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
816
817 m.d.comb += self.n.valid_o.eq(buf_full | p_valid_i)
818 m.d.comb += self.p._ready_o.eq(~buf_full)
819 m.d.sync += buf_full.eq(~self.n.ready_i_test & self.n.valid_o)
820
821 data_o = Mux(buf_full, buf, self.data_r)
822 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
823 m.d.comb += nmoperator.eq(self.n.data_o, data_o)
824 m.d.sync += nmoperator.eq(buf, self.n.data_o)
825
826 return self.m
827
828
829 class PassThroughHandshake(ControlBase):
830 """ A control block that delays by one clock cycle.
831
832 Inputs Temporary Output Data
833 ------- ------------------ ----- ----
834 P P N N PiV& PiV| NiR| pvr N P (pvr)
835 i o i o PoR ~PoR ~NoV o o
836 V R R V V R
837
838 ------- - - - - - -
839 0 0 0 0 0 1 1 0 1 1 odata (unchanged)
840 0 0 0 1 0 1 0 0 1 0 odata (unchanged)
841 0 0 1 0 0 1 1 0 1 1 odata (unchanged)
842 0 0 1 1 0 1 1 0 1 1 odata (unchanged)
843 ------- - - - - - -
844 0 1 0 0 0 0 1 0 0 1 odata (unchanged)
845 0 1 0 1 0 0 0 0 0 0 odata (unchanged)
846 0 1 1 0 0 0 1 0 0 1 odata (unchanged)
847 0 1 1 1 0 0 1 0 0 1 odata (unchanged)
848 ------- - - - - - -
849 1 0 0 0 0 1 1 1 1 1 process(in)
850 1 0 0 1 0 1 0 0 1 0 odata (unchanged)
851 1 0 1 0 0 1 1 1 1 1 process(in)
852 1 0 1 1 0 1 1 1 1 1 process(in)
853 ------- - - - - - -
854 1 1 0 0 1 1 1 1 1 1 process(in)
855 1 1 0 1 1 1 0 0 1 0 odata (unchanged)
856 1 1 1 0 1 1 1 1 1 1 process(in)
857 1 1 1 1 1 1 1 1 1 1 process(in)
858 ------- - - - - - -
859
860 """
861
862 def elaborate(self, platform):
863 self.m = m = ControlBase.elaborate(self, platform)
864
865 r_data = _spec(self.stage.ospec, "r_tmp") # output type
866
867 # temporaries
868 p_valid_i = Signal(reset_less=True)
869 pvr = Signal(reset_less=True)
870 m.d.comb += p_valid_i.eq(self.p.valid_i_test)
871 m.d.comb += pvr.eq(p_valid_i & self.p.ready_o)
872
873 m.d.comb += self.p.ready_o.eq(~self.n.valid_o | self.n.ready_i_test)
874 m.d.sync += self.n.valid_o.eq(p_valid_i | ~self.p.ready_o)
875
876 odata = Mux(pvr, self.data_r, r_data)
877 m.d.sync += nmoperator.eq(r_data, odata)
878 r_data = self._postprocess(r_data) # XXX TBD, does nothing right now
879 m.d.comb += nmoperator.eq(self.n.data_o, r_data)
880
881 return m
882
883
884 class RegisterPipeline(UnbufferedPipeline):
885 """ A pipeline stage that delays by one clock cycle, creating a
886 sync'd latch out of data_o and valid_o as an indirect byproduct
887 of using PassThroughStage
888 """
889 def __init__(self, iospecfn):
890 UnbufferedPipeline.__init__(self, PassThroughStage(iospecfn))
891
892
893 class FIFOControl(ControlBase):
894 """ FIFO Control. Uses Queue to store data, coincidentally
895 happens to have same valid/ready signalling as Stage API.
896
897 data_i -> fifo.din -> FIFO -> fifo.dout -> data_o
898 """
899 def __init__(self, depth, stage, in_multi=None, stage_ctl=False,
900 fwft=True, pipe=False):
901 """ FIFO Control
902
903 * :depth: number of entries in the FIFO
904 * :stage: data processing block
905 * :fwft: first word fall-thru mode (non-fwft introduces delay)
906 * :pipe: specifies pipe mode.
907
908 when fwft = True it indicates that transfers may occur
909 combinatorially through stage processing in the same clock cycle.
910 This requires that the Stage be a Moore FSM:
911 https://en.wikipedia.org/wiki/Moore_machine
912
913 when fwft = False it indicates that all output signals are
914 produced only from internal registers or memory, i.e. that the
915 Stage is a Mealy FSM:
916 https://en.wikipedia.org/wiki/Mealy_machine
917
918 data is processed (and located) as follows:
919
920 self.p self.stage temp fn temp fn temp fp self.n
921 data_i->process()->result->cat->din.FIFO.dout->cat(data_o)
922
923 yes, really: cat produces a Cat() which can be assigned to.
924 this is how the FIFO gets de-catted without needing a de-cat
925 function
926 """
927 self.fwft = fwft
928 self.pipe = pipe
929 self.fdepth = depth
930 ControlBase.__init__(self, stage, in_multi, stage_ctl)
931
932 def elaborate(self, platform):
933 self.m = m = ControlBase.elaborate(self, platform)
934
935 # make a FIFO with a signal of equal width to the data_o.
936 (fwidth, _) = nmoperator.shape(self.n.data_o)
937 fifo = Queue(fwidth, self.fdepth, fwft=self.fwft, pipe=self.pipe)
938 m.submodules.fifo = fifo
939
940 def processfn(data_i):
941 # store result of processing in combinatorial temporary
942 result = _spec(self.stage.ospec, "r_temp")
943 m.d.comb += nmoperator.eq(result, self.process(data_i))
944 return nmoperator.cat(result)
945
946 ## prev: make the FIFO (Queue object) "look" like a PrevControl...
947 m.submodules.fp = fp = PrevControl()
948 fp.valid_i, fp._ready_o, fp.data_i = fifo.w_en, fifo.w_rdy, fifo.w_data
949 m.d.comb += fp._connect_in(self.p, fn=processfn)
950
951 # next: make the FIFO (Queue object) "look" like a NextControl...
952 m.submodules.fn = fn = NextControl()
953 fn.valid_o, fn.ready_i, fn.data_o = fifo.r_rdy, fifo.r_en, fifo.r_data
954 connections = fn._connect_out(self.n, fn=nmoperator.cat)
955 valid_eq, ready_eq, data_o = connections
956
957 # ok ok so we can't just do the ready/valid eqs straight:
958 # first 2 from connections are the ready/valid, 3rd is data.
959 if self.fwft:
960 m.d.comb += [valid_eq, ready_eq] # combinatorial on next ready/valid
961 else:
962 m.d.sync += [valid_eq, ready_eq] # non-fwft mode needs sync
963 data_o = self._postprocess(data_o) # XXX TBD, does nothing right now
964 m.d.comb += data_o
965
966 return m
967
968
969 # aka "RegStage".
970 class UnbufferedPipeline(FIFOControl):
971 def __init__(self, stage, in_multi=None, stage_ctl=False):
972 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
973 fwft=True, pipe=False)
974
975 # aka "BreakReadyStage" XXX had to set fwft=True to get it to work
976 class PassThroughHandshake(FIFOControl):
977 def __init__(self, stage, in_multi=None, stage_ctl=False):
978 FIFOControl.__init__(self, 1, stage, in_multi, stage_ctl,
979 fwft=True, pipe=True)
980
981 # this is *probably* BufferedHandshake, although test #997 now succeeds.
982 class BufferedHandshake(FIFOControl):
983 def __init__(self, stage, in_multi=None, stage_ctl=False):
984 FIFOControl.__init__(self, 2, stage, in_multi, stage_ctl,
985 fwft=True, pipe=False)
986
987
988 """
989 # this is *probably* SimpleHandshake (note: memory cell size=0)
990 class SimpleHandshake(FIFOControl):
991 def __init__(self, stage, in_multi=None, stage_ctl=False):
992 FIFOControl.__init__(self, 0, stage, in_multi, stage_ctl,
993 fwft=True, pipe=False)
994 """