1 # Dynamic Partitioned Slice (`SimdSlice`)
To match the semantics of nmigen's `Slice` class, each element of a `SimdSlice` result must have
exactly the same `Shape` as the result of slicing the corresponding element of the input `SimdSignal`.
10 a = a_s.sig # shorthand to make table smaller
12 b = b_s.sig # shorthand to make table smaller
17 (TODO 1: shrink to only 4 partitions. TODO 2: convert to markdown)
20 <tr class="text-right">
21 <th scope="row" class="text-left">Bit #</th>
22 <td>63⁠…⁠56</td>
23 <td>55⁠…⁠48</td>
24 <td>47⁠…⁠40</td>
25 <td>39⁠…⁠32</td>
26 <td>31⁠…⁠24</td>
27 <td>23⁠…⁠16</td>
28 <td>15⁠…⁠8</td>
29 <td>7⁠…⁠0</td>
31 <tr class="text-right">
32 <th scope="row" class="text-left">ElWid: 8-bit</th>
33 <td><code>a[56:64]</code></td>
34 <td><code>a[48:56]</code></td>
35 <td><code>a[40:48]</code></td>
36 <td><code>a[32:40]</code></td>
37 <td><code>a[24:32]</code></td>
38 <td><code>a[16:24]</code></td>
39 <td><code>a[8:16]</code></td>
40 <td><code>a[0:8]</code></td>
42 <tr class="text-right">
43 <th scope="row" class="text-left">ElWid: 16-bit</th>
44 <td colspan="2"><code>a[48:64]</code></td>
45 <td colspan="2"><code>a[32:48]</code></td>
46 <td colspan="2"><code>a[16:32]</code></td>
47 <td colspan="2"><code>a[0:16]</code></td>
49 <tr class="text-right">
50 <th scope="row" class="text-left">ElWid: 32-bit</th>
51 <td colspan="4"><code>a[32:64]</code></td>
52 <td colspan="4"><code>a[0:32]</code></td>
54 <tr class="text-right">
55 <th scope="row" class="text-left">ElWid: 64-bit</th>
56 <td colspan="8"><code>a[0:64]</code></td>
So slicing bits `3:6` of a 32-bit element of `a` must produce a 3-bit element, because we have to match nmigen. That might seem like no problem. However, slicing bits `3:6` of a 16-bit element of the same 64-bit `SimdSignal` must *also* produce a 3-bit element. To get a `SimdSignal` where *all* elements are 3 bits wide, as `SimdSlice`'s output requires, we have to introduce padding:
64 (TODO 1: shrink to only 4 partitions. TODO 2: convert to markdown)
67 <tr class="text-right">
68 <th scope="row" class="text-left">Bit #</th>
69 <td>23⁠…⁠21</td>
70 <td>20⁠…⁠18</td>
71 <td>17⁠…⁠15</td>
72 <td>14⁠…⁠12</td>
73 <td>11⁠…⁠9</td>
74 <td>8⁠…⁠6</td>
75 <td>5⁠…⁠3</td>
76 <td>2⁠…⁠0</td>
78 <tr class="text-right">
79 <th scope="row" class="text-left">ElWid: 8-bit</th>
80 <td><code>b[21:24]</code></td>
81 <td><code>b[18:21]</code></td>
82 <td><code>b[15:18]</code></td>
83 <td><code>b[12:15]</code></td>
84 <td><code>b[9:12]</code></td>
85 <td><code>b[6:9]</code></td>
86 <td><code>b[3:6]</code></td>
87 <td><code>b[0:3]</code></td>
89 <tr class="text-right">
90 <th scope="row" class="text-left">ElWid: 16-bit</th>
91 <td class="text-center"><i>Padding</i></td>
92 <td><code>b[18:21]</code></td>
93 <td class="text-center"><i>Padding</i></td>
94 <td><code>b[12:15]</code></td>
95 <td class="text-center"><i>Padding</i></td>
96 <td><code>b[6:9]</code></td>
97 <td class="text-center"><i>Padding</i></td>
98 <td><code>b[0:3]</code></td>
100 <tr class="text-right">
101 <th scope="row" class="text-left">ElWid: 32-bit</th>
102 <td colspan="3" class="text-center"><i>Padding</i></td>
103 <td><code>b[12:15]</code></td>
104 <td colspan="3" class="text-center"><i>Padding</i></td>
105 <td><code>b[0:3]</code></td>
107 <tr class="text-right">
108 <th scope="row" class="text-left">ElWid: 64-bit</th>
109 <td colspan="7" class="text-center"><i>Padding</i></td>
110 <td><code>b[0:3]</code></td>
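The padding rule shown in the second table can be sketched in plain Python. This is only an illustration under stated assumptions: `padded_slice_layout` is a hypothetical helper, not part of `SimdSlice` or nmigen, and it assumes the sliced data sits in the low bits of each lane, as the table shows.

```python
def padded_slice_layout(part_count, elem_count, slice_width):
    """Return the (lo, hi) output bit ranges that carry sliced data.

    Hypothetical helper for illustration only. Assumes each element's
    slice is placed in the low bits of its lane; the rest is padding.
    """
    parts_per_elem = part_count // elem_count
    lane_width = parts_per_elem * slice_width
    return [(i * lane_width, i * lane_width + slice_width)
            for i in range(elem_count)]


SRC_WIDTH = 64               # overall width of `a`
SLICE_WIDTH = 6 - 3          # slicing bits 3:6 of every element
PART_COUNT = SRC_WIDTH // 8  # 8 partitions at the smallest element width

for elwid in (8, 16, 32, 64):
    ranges = padded_slice_layout(PART_COUNT, SRC_WIDTH // elwid, SLICE_WIDTH)
    cells = ", ".join(f"b[{lo}:{hi}]" for lo, hi in ranges)
    print(f"ElWid {elwid:2}-bit: {cells}")
```

Any bit of the 24-bit output that is not listed for a given element width is padding.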
130 # Partitioned SIMD Design implications
Slice is the first sub-module in the Partitioned SimdSignal suite that
requires (and propagates) fixed element widths. Up to this point, every other
sub-module has had a fixed *overall* width, with the element widths adapting
to completely fill the underlying Signal.
(**This includes [[dynamic_simd/eq]] and the other comparators, and the
[[dynamic_simd/logicops]], which very deliberately propagate the LSB boolean
value in each partition throughout the entire partition on a per-element
basis in order to make Mux and Switch function correctly**)
142 Given that this new width context is then passed through to other SimdSignals,
143 the entire SimdSignal suite has to adapt to this change in requirements.
144 It is however not as big an adaptation as it first seems, because ultimately
145 SimdSignals use PartitionPoints (and a PartType) to decide what to do.
To illustrate that SimdSignal makes its low-level decisions using
PartitionPoints, consider an add of `b` and a new SimdSignal `c` with
an overall width of 8 bits (and fixed element widths of 2):
150 (TODO: add an example of how this would then do e.g. an add (to another
151 SimdSignal of only 8 bits in length or so - all element widths being
152 2 in all partitions, but having the exact same PartitionPoints)
154 Questions raised by the add example:
156 * after performing a Slice, which creates an entirely new
157 (padded) set of PartitionPoints, where does c's PartitionPoints
* how should a SimdSignal that does not contain the same
padding be add()ed to a Slice()d SimdSignal that *does*
contain padding, having a completely different set of PartitionPoints?
162 * what happens when a fixed element width Slice()d source `b` is
163 add()ed to a fixed *overall* width SimdSignal of width 8 that
164 permits variable-length (max available space) elements?
166 Illustrating the case of adding a SimdSignal with padding to one that
169 (TODO: add a second example of how this would then do e.g. an add (to another
170 SimdSignal of only 8 bits in length or so, but having a **different**
171 style of PartitionPoints, with no padding this time)
173 take signal a, of 16 bits, each bit being numbered in hexadecimal:
176 AfAeAdAc AbAaA9A8 A7A6A5A4 A3A2A1A0
and take a slice a[0:3] to create 3-bit values, where padding is
specified by "x", at each elwid:
182 0b00 x x x x x x x x x x x x x A2A1A0
183 0b01 x x x x x AaA9A8 x x x x x A2A1A0
184 0b10 x AeAdAc x AaA9A8 x A6A5A4 x A2A1A0
186 The presence of "x" unused portions actually requires some additional
190 0b00 x x x x x x x x x x x x x A2A1A0
191 0b01 x x x x x AaA9A8 x x x x x A2A1A0
192 0b10 x AeAdAc x AaA9A8 x A6A5A4 x A2A1A0
Now let us take a signal, b, made up of 2-bit elements,
and attempt to add it to the slice:
198 0b00 x x x x x x B1B0
199 0b01 x x B5B4 x x B1B0
200 0b10 B7B6 B5B4 B3B2 B1B0
202 This is not immediately possible (at least not
203 obviously so) and consequently b needs expanding
204 to the same padding and PartitionPoints:
0b00 x x x x x x x x x x x x x 0 B1B0
0b01 x x x x x 0 B5B4 x x x x x 0 B1B0
0b10 x 0 B7B6 x 0 B5B4 x 0 B3B2 x 0 B1B0
Note here that zero-extension also had to occur to
bring b up to the same element width in each partition.
At that point, with the "x" padding ignored, a straight
PartitionedAdd may be deployed, because both the overall
width and the positions of the PartitionPoints are exactly
the same.
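The expand-then-add step can be modelled on plain integers. The sketch below is an illustration under stated assumptions, not the PartitionedAdd implementation: `get_elems`, `pack_elems`, `widen` and `lanewise_add` are hypothetical names, padding bits are written as zero, and each sum is assumed to wrap to the 3-bit element width.

```python
def get_elems(value, elem_count, lane_width, elem_width):
    """Extract the low `elem_width` bits of each lane (padding is dropped)."""
    mask = (1 << elem_width) - 1
    return [(value >> (i * lane_width)) & mask for i in range(elem_count)]


def pack_elems(elems, lane_width, elem_width):
    """Place each element in the low bits of its lane; padding is left as 0."""
    mask = (1 << elem_width) - 1
    out = 0
    for i, elem in enumerate(elems):
        out |= (elem & mask) << (i * lane_width)
    return out


def widen(value, elem_count, src_lane, src_elem, dst_lane, dst_elem):
    """Zero-extend every element and move it to the destination layout."""
    elems = get_elems(value, elem_count, src_lane, src_elem)
    return pack_elems(elems, dst_lane, dst_elem)


def lanewise_add(x, y, elem_count, lane_width, elem_width):
    """Add matching elements; assumed here to wrap at `elem_width` bits."""
    xs = get_elems(x, elem_count, lane_width, elem_width)
    ys = get_elems(y, elem_count, lane_width, elem_width)
    sums = [p + q for p, q in zip(xs, ys)]
    return pack_elems(sums, lane_width, elem_width)


# elwid 0b10: four elements; a's slice uses 4-bit lanes, b uses 2-bit lanes.
a_sliced = 0x7531            # 3-bit elements 1, 3, 5, 7 (top bit of each lane is padding)
b_value = 0b11_10_01_00      # 2-bit elements 0, 1, 2, 3

b_wide = widen(b_value, 4, src_lane=2, src_elem=2, dst_lane=4, dst_elem=3)
print(hex(b_wide))                                   # 0x3210
print(hex(lanewise_add(a_sliced, b_wide, 4, 4, 3)))  # 0x2741
```

Once both operands share the same lane width, the add itself never needs to know that one of them started out with a different set of PartitionPoints.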
Another example: Cat() on the same two signals. Here at least we
know that the end result has elements of 5 bits each, because
all "a" slices are 3 bits and all "b" elements are 2 bits:
0b00 x x x x x x x x x x x x x x x x x x x B1B0A2A1A0
0b01 x x x x x x x B5B4AaA9A8 x x x x x x x B1B0A2A1A0
0b10 x B7B6AeAdAc x B5B4AaA9A8 x B3B2A6A5A4 x B1B0A2A1A0
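A minimal integer model of this Cat() is sketched below. It is an illustration only: `cat_lanes` is a hypothetical helper, it assumes (as in nmigen's `Cat`) that the first operand lands in the low bits of each result element, and it assumes the output lane width is simply the sum of the two input lane widths, which matches the 24-bit layout above.

```python
def cat_lanes(a_val, b_val, elem_count, a_lane, a_elem, b_lane, b_elem):
    """Concatenate matching elements of two lane-packed integers.

    Hypothetical helper: each output element is (b_elem : a_elem),
    placed in the low bits of a lane of width a_lane + b_lane.
    """
    out_lane = a_lane + b_lane
    a_mask = (1 << a_elem) - 1
    b_mask = (1 << b_elem) - 1
    out = 0
    for i in range(elem_count):
        a_e = (a_val >> (i * a_lane)) & a_mask  # padding in a's lane is dropped
        b_e = (b_val >> (i * b_lane)) & b_mask
        out |= ((b_e << a_elem) | a_e) << (i * out_lane)
    return out


# elwid 0b10: four elements, a's slice in 4-bit lanes, b in 2-bit lanes.
a_sliced = 0x7531            # 3-bit elements 1, 3, 5, 7
b_value = 0b11_10_01_00      # 2-bit elements 0, 1, 2, 3

result = cat_lanes(a_sliced, b_value, 4, a_lane=4, a_elem=3, b_lane=2, b_elem=2)
for i in range(4):
    print(f"element {i}: {(result >> (i * 6)) & 0b11111:05b}")
```

Each printed element is 5 bits wide and sits in a 6-bit lane, leaving one bit of padding per lane at this elwid.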
228 Illustrating the case where a Sliced (fixed element width) SimdSignal
229 is added to one which has variable-length elements that take up the
230 entirety of the partition (overall fixed width):
232 (TODO: third example)