# Dynamic Partitioned Slice (`SimdSlice`)

To match the semantics of nmigen's `Slice` class, each element of a
`SimdSlice` result must have exactly the same `Shape` as the result of
slicing the corresponding element of the input `SimdSignal`.

## Example code:

```python
a_s = SimdSignal(...)
a = a_s.sig # shorthand to make table smaller
b_s = a_s[3:6]
b = b_s.sig # shorthand to make table smaller
```
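
The per-element requirement can be modelled without nmigen. The sketch below is plain Python, not the `SimdSlice` implementation: `element_ranges` and `sliced_bits` are hypothetical helper names, and the 64-bit width is an assumption taken from the tables on this page. For each element width it lists which bits of `a` are selected when `3:6` is sliced out of every element.

```python
def element_ranges(total_width, elwid):
    """Return the (start, stop) bit range of every element, LSB element first."""
    return [(lo, lo + elwid) for lo in range(0, total_width, elwid)]


def sliced_bits(total_width, elwid, start, stop):
    """Bit ranges of `a` selected by slicing [start:stop] out of each element."""
    assert 0 <= start <= stop <= elwid, "slice must fit inside one element"
    return [(lo + start, lo + stop) for lo, _ in element_ranges(total_width, elwid)]


for elwid in (8, 16, 32, 64):
    picked = sliced_bits(64, elwid, 3, 6)
    widths = {hi - lo for lo, hi in picked}
    # every selected range is 3 bits wide, whatever the element width
    print(f"elwid={elwid:2}: {len(picked)} elements, widths={widths}, first={picked[0]}")
```

Only the number of elements changes with the element width; the width of each sliced element does not.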

## `a`'s Elements:

(TODO 1: shrink to only 4 partitions. TODO 2: convert to markdown)

<table>
    <tr class="text-right">
        <th scope="row" class="text-left">Bit #</th>
        <td>63&#8288;&hellip;&#8288;56</td>
        <td>55&#8288;&hellip;&#8288;48</td>
        <td>47&#8288;&hellip;&#8288;40</td>
        <td>39&#8288;&hellip;&#8288;32</td>
        <td>31&#8288;&hellip;&#8288;24</td>
        <td>23&#8288;&hellip;&#8288;16</td>
        <td>15&#8288;&hellip;&#8288;8</td>
        <td>7&#8288;&hellip;&#8288;0</td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 8-bit</th>
        <td><code>a[56:64]</code></td>
        <td><code>a[48:56]</code></td>
        <td><code>a[40:48]</code></td>
        <td><code>a[32:40]</code></td>
        <td><code>a[24:32]</code></td>
        <td><code>a[16:24]</code></td>
        <td><code>a[8:16]</code></td>
        <td><code>a[0:8]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 16-bit</th>
        <td colspan="2"><code>a[48:64]</code></td>
        <td colspan="2"><code>a[32:48]</code></td>
        <td colspan="2"><code>a[16:32]</code></td>
        <td colspan="2"><code>a[0:16]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 32-bit</th>
        <td colspan="4"><code>a[32:64]</code></td>
        <td colspan="4"><code>a[0:32]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 64-bit</th>
        <td colspan="8"><code>a[0:64]</code></td>
    </tr>
</table>

Slicing bits `3:6` of a 32-bit element of `a` must produce a 3-bit element,
because we have to match nmigen. That might seem like no problem. However,
slicing bits `3:6` of a 16-bit element of the same 64-bit `SimdSignal` must
*also* produce a 3-bit element. To get a `SimdSignal` in which *all* elements
are 3 bits wide, as `SimdSlice`'s output requires, we have to introduce
padding:

## `b`'s Elements:

(TODO 1: shrink to only 4 partitions. TODO 2: convert to markdown)

<table>
    <tr class="text-right">
        <th scope="row" class="text-left">Bit #</th>
        <td>23&#8288;&hellip;&#8288;21</td>
        <td>20&#8288;&hellip;&#8288;18</td>
        <td>17&#8288;&hellip;&#8288;15</td>
        <td>14&#8288;&hellip;&#8288;12</td>
        <td>11&#8288;&hellip;&#8288;9</td>
        <td>8&#8288;&hellip;&#8288;6</td>
        <td>5&#8288;&hellip;&#8288;3</td>
        <td>2&#8288;&hellip;&#8288;0</td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 8-bit</th>
        <td><code>b[21:24]</code></td>
        <td><code>b[18:21]</code></td>
        <td><code>b[15:18]</code></td>
        <td><code>b[12:15]</code></td>
        <td><code>b[9:12]</code></td>
        <td><code>b[6:9]</code></td>
        <td><code>b[3:6]</code></td>
        <td><code>b[0:3]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 16-bit</th>
        <td class="text-center"><i>Padding</i></td>
        <td><code>b[18:21]</code></td>
        <td class="text-center"><i>Padding</i></td>
        <td><code>b[12:15]</code></td>
        <td class="text-center"><i>Padding</i></td>
        <td><code>b[6:9]</code></td>
        <td class="text-center"><i>Padding</i></td>
        <td><code>b[0:3]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 32-bit</th>
        <td colspan="3" class="text-center"><i>Padding</i></td>
        <td><code>b[12:15]</code></td>
        <td colspan="3" class="text-center"><i>Padding</i></td>
        <td><code>b[0:3]</code></td>
    </tr>
    <tr class="text-right">
        <th scope="row" class="text-left">ElWid: 64-bit</th>
        <td colspan="7" class="text-center"><i>Padding</i></td>
        <td><code>b[0:3]</code></td>
    </tr>
</table>
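
The padding pattern in the table can be reproduced from one assumption that the table itself suggests: the output keeps one 3-bit lane per 8-bit partition of `a`, and an element wider than 8 bits places its result in the lowest lane of its group, leaving the other lanes as padding. The sketch below is plain Python under that assumption; `slice_layout` is a hypothetical name and not part of `SimdSignal`.

```python
def slice_layout(total_width, min_elwid, elwid, start, stop):
    """Lane-by-lane layout of the sliced output, least-significant lane first.

    Each entry is the (lo, hi) bit range of `b` holding an element, or None
    where the lane is padding.
    """
    lane_width = stop - start              # width of every output element
    lanes = total_width // min_elwid       # one output lane per smallest partition
    lanes_per_element = elwid // min_elwid
    layout = []
    for lane in range(lanes):
        if lane % lanes_per_element == 0:  # lowest lane of the group holds the result
            lo = lane * lane_width
            layout.append((lo, lo + lane_width))
        else:
            layout.append(None)            # remaining lanes are padding
    return layout


for elwid in (8, 16, 32, 64):
    # print most-significant lane first, matching the table's column order
    cells = ["Padding" if r is None else f"b[{r[0]}:{r[1]}]"
             for r in reversed(slice_layout(64, 8, elwid, 3, 6))]
    print(f"ElWid {elwid:2}-bit:", " | ".join(cells))
```

In this model the output is always 24 bits wide (8 lanes of 3 bits); only the number of lanes that carry data changes with the element width.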

<style>
/* duplicated from bootstrap so text editors can see it
   -- ignored by ikiwiki */
.text-left {
    text-align: left !important
}

.text-right {
    text-align: right !important
}

.text-center {
    text-align: center !important
}
</style>

# Partitioned SIMD Design implications

Slice is the first sub-module in the Partitioned SimdSignal suite that
requires (and propagates) fixed element widths. Until this point, all other
sub-modules have had a fixed *overall* width, with the element widths
adapting to completely fill the underlying Signal.

(**This includes [[dynamic_simd/eq]] and other comparators and the
[[dynamic_simd/logicops]], which very deliberately propagate the LSB boolean
value in each partition throughout the entire partition on a per-element
basis in order to make Mux and Switch function correctly**)
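
The difference between the two styles can be made concrete with a small model. This is a plain-Python sketch with hypothetical function names, not code from the SimdSignal suite. It assumes that a fixed-width element sits in the low bits of the space its partition group would otherwise fill, as in the `b` table above, and it uses an 8-bit signal with four 2-bit partitions.

```python
def fill_layout(total_width, n_elements):
    """Fixed *overall* width: elements grow to fill the whole signal."""
    elwid = total_width // n_elements
    return [(i * elwid, (i + 1) * elwid) for i in range(n_elements)]


def fixed_layout(total_width, n_elements, element_width):
    """Fixed *element* width: each element keeps `element_width` bits and the
    rest of its share of the signal is padding (assumed to sit above it)."""
    share = total_width // n_elements
    assert element_width <= share, "element must fit in its share of the signal"
    return [(i * share, i * share + element_width) for i in range(n_elements)]


for n in (4, 2, 1):
    print(f"{n} element(s): fill={fill_layout(8, n)} fixed={fixed_layout(8, n, 2)}")
```

In this model the two layouts agree only at the smallest element width. At every wider setting the fixed-element-width signal carries padding that the fill-style signal does not.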

Given that this new width context is then passed through to other SimdSignals,
the entire SimdSignal suite has to adapt to this change in requirements.
It is however not as big an adaptation as it first seems, because ultimately
SimdSignals use PartitionPoints (and a PartType) to decide what to do.
To illustrate that SimdSignal uses PartitionPoints to make its decisions
at the low level, consider an add of `b` and a new SimdSignal `c` with
an overall width of 8 bits (and fixed element widths of size 2):

(TODO: add an example of how this would then do e.g. an add to another
SimdSignal of only 8 bits in length or so, with all element widths being
2 in all partitions, but having the exact same PartitionPoints)

Questions raised by the add example:

* after performing a Slice, which creates an entirely new
  (padded) set of PartitionPoints, where do `c`'s PartitionPoints
  come from?
* how should a SimdSignal that does not contain the same
  padding be add()ed to a Slice()d SimdSignal that *does*
  contain padding, having a completely different set of PartitionPoints?
* what happens when a fixed element width Slice()d source `b` is
  add()ed to a fixed *overall* width SimdSignal of width 8 that
  permits variable-length (max available space) elements?

Illustrating the case of adding a SimdSignal with padding to one that
has none:

(TODO: add a second example of how this would then do e.g. an add to another
SimdSignal of only 8 bits in length or so, but having a **different**
style of PartitionPoints, with no padding this time)

Take signal `a`, of 16 bits, each bit being numbered in hexadecimal:

            |        |        |
    AfAeAdAc AbAaA9A8 A7A6A5A4 A3A2A1A0

and take a slice `a[0:3]` to create 3-bit values, where padding is
specified by "x", at each elwid:

    elwid         |        |        |
    0b00  AfAeAdAc AbAaA9A8 A7A6A5A4 A3A2A1A0

Illustrating the case where a Sliced (fixed element width) SimdSignal
is added to one which has variable-length elements that take up the
entirety of the partition (overall fixed width):

(TODO: third example)