Output pattern matcher items as log_debug()
[yosys.git] / passes / pmgen / README.md
1 Pattern Matcher Generator
2 =========================
3
4 The program `pmgen.py` reads a `.pmg` (Pattern Matcher Generator) file and
5 writes a header-only C++ library that implements that pattern matcher.
6
7 The "patterns" in this context are subgraphs in a Yosys RTLIL netlist.
8
9 The algorithm used in the generated pattern matcher is a simple recursive
10 search with backtracking. It is left to the author of the `.pmg` file to
11 determine an efficient cell order for the search that allows for maximum
12 use of indices and early backtracking.
13
14
15 API of Generated Matcher
16 ========================
17
18 When `pmgen.py` reads a `foobar.pmg` file, it writes `foobar_pm.h` containing
19 a class `foobar_pm`. That class is instantiated with an RTLIL module and a
20 list of cells from that module:
21
22 foobar_pm pm(module, module->selected_cells());
23
24 The caller must make sure that none of the cells in the 2nd argument are
25 deleted for as long as the patter matcher instance is used.
26
27 At any time it is possible to disable cells, preventing them from showing
28 up in any future matches:
29
30 pm.blacklist(some_cell);
31
32 The `.run_<pattern_name>(callback_function)` method searches for all matches
33 for the pattern`<pattern_name>` and calls the callback function for each found
34 match:
35
36 pm.run_foobar([&](){
37 log("found matching 'foo' cell: %s\n", log_id(pm.st.foo));
38 log(" with 'bar' cell: %s\n", log_id(pm.st.bar));
39 });
40
41 The `.pmg` file declares matcher state variables that are accessible via the
42 `.st_<pattern_name>.<state_name>` members. (The `.st_<pattern_name>` member is
43 of type `foobar_pm::state_<pattern_name>_t`.)
44
45 Similarly the `.pmg` file declares user data variables that become members of
46 `.ud_<pattern_name>`, a struct of type `foobar_pm::udata_<pattern_name>_t`.
47
48 There are three versions of the `run_<pattern_name>()` method: Without callback,
49 callback without arguments, and callback with reference to `pm`. All versions
50 of the `run_<pattern_name>()` method return the number of found matches.
51
52
53 The .pmg File Format
54 ====================
55
56 The `.pmg` file format is a simple line-based file format. For the most part
57 lines consist of whitespace-separated tokens.
58
59 Lines in `.pmg` files starting with `//` are comments.
60
61 Declaring a pattern
62 -------------------
63
64 A `.pmg` file contains one or more patterns. Each pattern starts with a line
65 with the `pattern` keyword followed by the name of the pattern.
66
67 Declaring state variables
68 -------------------------
69
70 One or more state variables can be declared using the `state` statement,
71 followed by a C++ type in angle brackets, followed by a whitespace separated
72 list of variable names. For example:
73
74 state <bool> flag1 flag2 happy big
75 state <SigSpec> sigA sigB sigY
76
77 State variables are automatically managed by the generated backtracking algorithm
78 and saved and restored as needed.
79
80 They are automatically initialized to the default constructed value of their type
81 when `.run_<pattern_name>(callback_function)` is called.
82
83 Declaring udata variables
84 -------------------------
85
86 Udata (user-data) variables can be used for example to configure the matcher or
87 the callback function used to perform actions on found matches.
88
89 There is no automatic management of udata variables. For this reason it is
90 recommended that the user-supplied matcher code treats them as read-only
91 variables.
92
93 They are declared like state variables, just using the `udata` statement:
94
95 udata <int> min_data_width max_data_width
96 udata <IdString> data_port_name
97
98 They are automatically initialized to the default constructed value of their type
99 when the pattern matcher object is constructed.
100
101 Embedded C++ code
102 -----------------
103
104 Many statements in a `.pmg` file contain C++ code. However, there are some
105 slight additions to regular C++/Yosys/RTLIL code that make it a bit easier to
106 write matchers:
107
108 - Identifiers starting with a dollar sign or backslash are automatically
109 converted to special IdString variables that are initialized when the
110 matcher object is constructed.
111
112 - The `port(<cell>, <portname>)` function is a handy alias for
113 `sigmap(<cell>->getPort(<portname>))`.
114
115 - Similarly `param(<cell>, <paramname>)` looks up a parameter on a cell.
116
117 - The function `nusers(<sigspec>)` returns the number of different cells
118 connected to any of the given signal bits, plus one if any of the signal
119 bits is also a primary input or primary output.
120
121 - In `code..endcode` blocks there exist `accept`, `reject`, `branch`,
122 `finish`, and `subpattern` statements.
123
124 - In `index` statements there is a special `===` operator for the index
125 lookup.
126
127 Matching cells
128 --------------
129
130 Cells are matched using `match..endmatch` blocks. For example:
131
132 match mul
133 if ff
134 select mul->type == $mul
135 select nusers(port(mul, \Y) == 2
136 index <SigSpec> port(mul, \Y) === port(ff, \D)
137 filter some_weird_function(mul) < other_weird_function(ff)
138 optional
139 endmatch
140
141 A `match` block starts with `match <statevar>` and implicitly generates
142 a state variable `<statevar>` of type `RTLIL::Cell*`.
143
144 All statements in the match block are optional. (An empty match block
145 would simply match each and every cell in the module.)
146
147 The `if <expression>` statement makes the match block conditional. If
148 `<expression>` evaluates to `false` then the match block will be ignored
149 and the corresponding state variable is set to `nullptr`. In our example
150 we only try to match the `mul` cell if the `ff` state variable points
151 to a cell. (Presumably `ff` is provided by a prior `match` block.)
152
153 The `select` lines are evaluated once for each cell when the matcher is
154 initialized. A `match` block will only consider cells for which all `select`
155 expressions evaluated to `true`. Note that the state variable corresponding to
156 the match (in the example `mul`) is the only state variable that may be used
157 in `select` lines.
158
159 Index lines are using the `index <type> expr1 === expr2` syntax. `expr1` is
160 evaluated during matcher initialization and the same restrictions apply as for
161 `select` expressions. `expr2` is evaluated when the match is calulated. It is a
162 function of any state variables assigned to by previous blocks. Both expression
163 are converted to the given type and compared for equality. Only cells for which
164 all `index` statements in the block pass are considered by the match.
165
166 Note that `select` and `index` are fast operations. Thus `select` and `index`
167 should be used whenever possible to create efficient matchers.
168
169 Finally, `filter <expression>` narrows down the remaining list of cells. For
170 performance reasons `filter` statements should only be used for things that
171 can't be done using `select` and `index`.
172
173 The `optional` statement marks optional matches. That is, the matcher will also
174 explore the case where `mul` is set to `nullptr`. Without the `optional`
175 statement a match may only be assigned nullptr when one of the `if` expressions
176 evaluates to `false`.
177
178 The `semioptional` statement marks matches that must match if at least one
179 matching cell exists, but if no matching cell exists it is set to `nullptr`.
180
181 Slices and choices
182 ------------------
183
184 Cell matches can contain "slices" and "choices". Slices can be used to
185 create matches for different sections of a cell. For example:
186
187 state <int> pmux_slice
188
189 match pmux
190 select pmux->type == $pmux
191 slice idx GetSize(port(pmux, \S))
192 index <SigBit> port(pmux, \S)[idx] === port(eq, \Y)
193 set pmux_slice idx
194 endmatch
195
196 The first argument to `slice` is the local variable name used to identify the
197 slice. The second argument is the number of slices that should be created for
198 this cell. The `set` statement can be used to copy that index into a state
199 variable so that later matches and/or code blocks can refer to it.
200
201 A similar mechanism is "choices", where a list of options is given as
202 second argument, and the matcher will iterate over those options:
203
204 state <SigSpec> foo bar
205 state <IdString> eq_ab eq_ba
206
207 match eq
208 select eq->type == $eq
209 choice <IdString> AB {\A, \B}
210 define <IdString> BA (AB == \A ? \B : \A)
211 index <SigSpec> port(eq, AB) === foo
212 index <SigSpec> port(eq, BA) === bar
213 set eq_ab AB
214 set eq_ba BA
215 generate
216
217 Notice how `define` can be used to define additional local variables similar
218 to the loop variables defined by `slice` and `choice`.
219
220 Additional code
221 ---------------
222
223 Interleaved with `match..endmatch` blocks there may be `code..endcode` blocks.
224 Such a block starts with the keyword `code` followed by a list of state variables
225 that the block may modify. For example:
226
227 code addAB sigS
228 if (addA) {
229 addAB = addA;
230 sigS = port(addA, \B);
231 }
232 if (addB) {
233 addAB = addB;
234 sigS = port(addB, \A);
235 }
236 endcode
237
238 The special keyword `reject` can be used to reject the current state and
239 backtrack. For example:
240
241 code
242 if (ffA && ffB) {
243 if (port(ffA, \CLK) != port(ffB, \CLK))
244 reject;
245 if (param(ffA, \CLK_POLARITY) != param(ffB, \CLK_POLARITY))
246 reject;
247 }
248 endcode
249
250 Similarly, the special keyword `accept` can be used to accept the current
251 state. (`accept` will not backtrack. This means it continues with the current
252 branch and may accept a larger match later.)
253
254 The special keyword `branch` can be used to explore different cases. Note that
255 each code block has an implicit `branch` at the end. So most use-cases of the
256 `branch` keyword need to end the block with `reject` to avoid the implicit
257 branch at the end. For example:
258
259 state <int> mode
260
261 code mode
262 for (mode = 0; mode < 8; mode++)
263 branch;
264 reject;
265 endcode
266
267 But in some cases it is more natural to utilize the implicit branch statement:
268
269 state <IdString> portAB
270
271 code portAB
272 portAB = \A;
273 branch;
274 portAB = \B;
275 endcode
276
277 There is an implicit `code..endcode` block at the end of each (sub)pattern
278 that just rejects.
279
280 A `code..finally..endcode` block executes the code after `finally` during
281 back-tracking. This is useful for maintaining user data state or printing
282 debug messages. For example:
283
284 udata <vector<Cell*>> stack
285
286 code
287 stack.push_back(addAB);
288 ...
289 finally
290 stack.pop_back();
291 endcode
292
293 `accept` and `finish` statements can be used inside the `finally` section,
294 but not `reject`, `branch`, or `subpattern`.
295
296 Declaring a subpattern
297 ----------------------
298
299 A subpattern starts with a line containing the `subpattern` keyword followed
300 by the name of the subpattern. Subpatterns can be called from a `code` block
301 using a `subpattern(<subpattern_name>);` C statement.
302
303 Arguments may be passed to subpattern via state variables. The `subpattern`
304 line must be followed by a `arg <arg1> <arg2> ...` line that lists the
305 state variables used to pass arguments.
306
307 state <IdString> foobar_type
308 state <bool> foobar_state
309
310 code foobar_type foobar_state
311 foobar_state = false;
312 foobar_type = $add;
313 subpattern(foo);
314 foobar_type = $sub;
315 subpattern(bar);
316 endcode
317
318 subpattern foo
319 arg foobar_type foobar_state
320
321 match addsub
322 index <IdString> addsub->type === foobar_type
323 ...
324 endmatch
325
326 code
327 if (foobar_state) {
328 subpattern(tail);
329 } else {
330 foobar_state = true;
331 subpattern(bar);
332 }
333 endcode
334
335 subpattern bar
336 arg foobar_type foobar_state
337
338 match addsub
339 index <IdString> addsub->type === foobar_type
340 ...
341 endmatch
342
343 code
344 if (foobar_state) {
345 subpattern(tail);
346 } else {
347 foobar_state = true;
348 subpattern(foo);
349 }
350 endcode
351
352 subpattern tail
353 ...
354
355 Subpatterns can be called recursively.
356
357 If a `subpattern` statement is preceded by a `fallthrough` statement, this is
358 equivalent to calling the subpattern at the end of the preceding block.
359
360 Generate Blocks
361 ---------------
362
363 Match blocks may contain an optional `generate` section that is used for automatic
364 test-case generation. For example:
365
366 match mul
367 ...
368 generate 10 0
369 SigSpec Y = port(ff, \D);
370 SigSpec A = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
371 SigSpec B = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
372 module->addMul(NEW_ID, A, B, Y, rng(2));
373 endmatch
374
375 The expression `rng(n)` returns a non-negative integer less than `n`.
376
377 The first argument to `generate` is the chance of this generate block being
378 executed when the match block did not match anything, in percent.
379
380 The second argument to `generate` is the chance of this generate block being
381 executed when the match block did match something, in percent.
382
383 The special statement `finish` can be used within generate blocks to terminate
384 the current pattern matcher run.