--- /dev/null
+Pattern Matcher Generator
+=========================
+
+The program `pmgen.py` reads a `.pmg` (Pattern Matcher Generator) file and
+writes a header-only C++ library that implements that pattern matcher.
+
+The "patterns" in this context are subgraphs in a Yosys RTLIL netlist.
+
+The algorithm used in the generated pattern matcher is a simple recursive
+search with backtracking. It is left to the author of the `.pmg` file to
+determine an efficient cell order for the search that allows for maximum
+use of indices and early backtracking.
+
+
+API of Generated Matcher
+========================
+
+When `pmgen.py` reads a `foobar.pmg` file, it writes `foobar_pm.h` containing
+a class `foobar_pm`. That class is instanciated with an RTLIL module and a
+list of cells from that module:
+
+ foobar_pm pm(module, module->selected_cells());
+
+The caller must make sure that none of the cells in the 2nd argument are
+deleted for as long as the patter matcher instance is used.
+
+At any time it is possible to disable cells, preventing them from showing
+up in any future matches:
+
+ pm.blacklist(some_cell);
+
+The `.run(callback_function)` method searches for all matches and calls the
+callback function for each found match:
+
+ pm.run([&](){
+ log("found matching 'foo' cell: %s\n", log_id(pm.st.foo));
+ log(" with 'bar' cell: %s\n", log_id(pm.st.bar));
+ });
+
+The `.pmg` file declares matcher state variables that are accessible via the
+`.st.<state_name>` members. (The `.st` member is of type `foobar_pm::state_t`.)
+
+Similarly the `.pmg` file declares user data variables that become members of
+`.ud`, a struct of type `foobar_pm::udata_t`.
+
+
+The .pmg File Format
+====================
+
+The `.pmg` file format is a simple line-based file format. For the most part
+lines consist of whitespace-separated tokens.
+
+Lines in `.pmg` files starting with `//` are comments.
+
+Declaring state variables
+-------------------------
+
+One or more state variables can be declared using the `state` statement,
+followed by a C++ type in angle brackets, followed by a whitespace separated
+list of variable names. For example:
+
+ state <bool> flag1 flag2 happy big
+ state <SigSpec> sigA sigB sigY
+
+State variables are automatically managed by the generated backtracking algorithm
+and saved and restored as needed.
+
+They are atomatically initialzed to the default constructed value of their type
+when `.run(callback_function)` is called.
+
+Declaring udata variables
+-------------------------
+
+Udata (user-data) variables can be used for example to configure the matcher or
+the callback function used to perform actions on found matches.
+
+There is no automatic management of udata variables. For this reason it is
+recommended that the user-supplied matcher code treats them as read-only
+variables.
+
+They are declared like state variables, just using the `udata` statement:
+
+ udata <int> min_data_width max_data_width
+ udata <IdString> data_port_name
+
+They are atomatically initialzed to the default constructed value of their type
+when ther pattern matcher object is constructed.
+
+Embedded C++ code
+-----------------
+
+Many statements in a `.pmg` file contain C++ code. However, there are some
+slight additions to regular C++/Yosys/RTLIL code that make it a bit easier to
+write matchers:
+
+- Identifiers starting with a dollar sign or backslash are automatically
+ converted to special IdString variables that are initialized when the
+ matcher object is constructed.
+
+- The `port(<cell>, <portname>)` function is a handy alias for
+ `sigmap(<cell>->getPort(<portname>))`.
+
+- Similarly `param(<cell>, <paramname>)` looks up a parameter on a cell.
+
+- The function `nusers(<sigspec>)` returns the number of different cells
+ connected to any of the given signal bits, plus one if any of the signal
+ bits is also a primary input or primary output.
+
+- In `code..endcode` blocks there exist `accept`, `reject`, and `branch`
+ statements.
+
+- In `index` statements there is a special `===` operator for the index
+ lookup.
+
+Matching cells
+--------------
+
+Cells are matched using `match..endmatch` blocks. For example:
+
+ match mul
+ if ff
+ select mul->type == $mul
+ select nusers(port(mul, \Y) == 2
+ index <SigSpec> port(mul, \Y) === port(ff, \D)
+ filter some_weird_function(mul) < other_weird_function(ff)
+ optional
+ endmatch
+
+A `match` block starts with `match <statevar>` and implicitly generates
+a state variable `<statevar>` of type `RTLIL::Cell*`.
+
+All statements in the match block are optional. (An empty match block
+would simply match each and every cell in the module.)
+
+The `if <expression>` statement makes the match block conditional. If
+`<expression>` evaluates to `false` then the match block will be ignored
+and the corresponding state variable is set to `nullptr`. In our example
+we only try to match the `mul` cell if the `ff` state variable points
+to a cell. (Presumably `ff` is provided by a prior `match` block.)
+
+The `select` lines are evaluated once for each cell when the matcher is
+initialized. A `match` block will only consider cells for which all `select`
+expressions evaluated to `true`. Note that the state variable corresponding to
+the match (in the example `mul`) is the only state variable that may be used
+`select` lines.
+
+Index lines are using the `index <type> expr1 === expr2` syntax. `expr1` is
+evaluated during matcher initialization and the same restrictions apply as for
+`select` expressions. `expr2` is evaluated when the match is calulated. It is a
+function of any state variables assigned to by previous blocks. Both expression
+are converted to the given type and compared for equality. Only cells for which
+all `index` statements in the block pass are considered by the match.
+
+Note that `select` and `index` are fast operations. Thus `select` and `index`
+should be used whenever possible to create efficient matchers.
+
+Finally, `filter <expression>` narrows down the remaining list of cells. For
+performance reasons `filter` statements should only be used for things that
+can't be done using `select` and `index`.
+
+The `optional` statement marks optional matches. I.e. the matcher will also
+explore the case where `mul` is set to `nullptr`. Without the `optional`
+statement a match may only be assigned nullptr when one of the `if` expressions
+evaluates to `false`.
+
+Additional code
+---------------
+
+Interleaved with `match..endmatch` blocks there may be `code..endcode` blocks.
+Such a block starts with the keyword `code` followed by a list of state variables
+that the block may modify. For example:
+
+ code addAB sigS
+ if (addA) {
+ addAB = addA;
+ sigS = port(addA, \B);
+ }
+ if (addB) {
+ addAB = addB;
+ sigS = port(addB, \A);
+ }
+ endcode
+
+The special keyword `reject` can be used to reject the current state and
+backtrack. For example:
+
+ code
+ if (ffA && ffB) {
+ if (port(ffA, \CLK) != port(ffB, \CLK))
+ reject;
+ if (param(ffA, \CLK_POLARITY) != param(ffB, \CLK_POLARITY))
+ reject;
+ }
+ endcode
+
+Similarly, the special keyword `accept` can be used to accept the current
+state. (`accept` will not backtrack. This means it continues with the current
+branch and may accept a larger match later.)
+
+The special keyword `branch` can be used to explore different cases. Note that
+each code block has an implicit `branch` at the end. So most use-cases of the
+`branch` keyword need to end the block with `reject` to avoid the implicit
+branch at the end. For example:
+
+ state <int> mode
+
+ code mode
+ for (mode = 0; mode < 8; mode++)
+ branch;
+ reject;
+ endcode
+
+But in some cases it is more natural to utilize the implicit branch statement:
+
+ state <IdString> portAB
+
+ code portAB
+ portAB = \A;
+ branch;
+ portAB = \B;
+ endcode
+
+There is an implicit `code..endcode` block at the end of each `.pgm` file
+that just accepts everything that gets all the way there.
prefix = sys.argv[1]
state_types = dict()
+udata_types = dict()
blocks = list()
ids = dict()
state_types[s] = type_str
continue
+ if cmd == "udata":
+ m = re.match(r"^udata\s+<(.*?)>\s+(([A-Za-z_][A-Za-z_0-9]*\s+)*[A-Za-z_][A-Za-z_0-9]*)\s*$", line)
+ assert m
+ type_str = m.group(1)
+ udatas_str = m.group(2)
+ for s in re.split(r"\s+", udatas_str):
+ assert s not in udata_types
+ udata_types[s] = type_str
+ continue
+
if cmd == "match":
block = dict()
block["type"] = "match"
print("", file=f)
print(" struct state_t {", file=f)
-
for s, t in sorted(state_types.items()):
print(" {} {};".format(t, s), file=f)
print(" } st;", file=f)
print("", file=f)
+ print(" struct udata_t {", file=f)
+ for s, t in sorted(udata_types.items()):
+ print(" {} {};".format(t, s), file=f)
+ print(" } ud;", file=f)
+ print("", file=f)
+
for v, n in sorted(ids.items()):
if n[0] == "\\":
print(" IdString {}{{\"\\{}\"}};".format(v, n), file=f)
print(" {}_pm(Module *module, const vector<Cell*> &cells) :".format(prefix), file=f)
print(" module(module), sigmap(module) {", file=f)
- print(" for (auto cell : cells) {", file=f)
+ for s, t in sorted(udata_types.items()):
+ if t.endswith("*"):
+ print(" ud.{} = nullptr;".format(s), file=f)
+ else:
+ print(" ud.{} = {}();".format(s, t), file=f)
+ print(" for (auto cell : module->cells()) {", file=f)
print(" for (auto &conn : cell->connections())", file=f)
print(" add_siguser(conn.second, cell);", file=f)
print(" }", file=f)
print(" }", file=f)
print("", file=f)
- print(" void match(std::function<void()> on_accept_f) {{".format(prefix), file=f)
+ print(" void run(std::function<void()> on_accept_f) {{".format(prefix), file=f)
print(" on_accept = on_accept_f;", file=f)
print(" rollback = 0;", file=f)
for s, t in sorted(state_types.items()):
print(" }", file=f)
print("", file=f)
- print("#define reject do { check_blacklist(); goto rollback_label; } while(0)", file=f)
- print("#define accept do { on_accept(); check_blacklist(); if (rollback) goto rollback_label; } while(0)", file=f)
- print("", file=f)
-
for index in range(len(blocks)):
block = blocks[index]
if block["type"] == "code":
print("", file=f)
print(" do {", file=f)
+ print("#define reject do { check_blacklist(); goto rollback_label; } while(0)", file=f)
+ print("#define accept do { on_accept(); check_blacklist(); if (rollback) goto rollback_label; } while(0)", file=f)
+ print("#define branch do {{ block_{}(); }} while(0)".format(index+1), file=f)
+
for line in block["code"]:
print(" " + line, file=f)
print("", file=f)
print(" block_{}();".format(index+1), file=f)
- print("rollback_label: YS_ATTRIBUTE(unused);", file=f)
+ print("#undef reject", file=f)
+ print("#undef accept", file=f)
+ print("#undef branch", file=f)
print(" } while (0);", file=f)
+ print("", file=f)
+ print("rollback_label:", file=f)
+ print(" YS_ATTRIBUTE(unused);", file=f)
if len(restore_st) or len(nonconst_st):
print("", file=f)
print(" }", file=f)
print("", file=f)
- print("#undef reject", file=f)
- print("#undef accept", file=f)
- print("", file=f)
-
print(" void block_{}() {{".format(len(blocks)), file=f)
print(" on_accept();", file=f)
print(" check_blacklist();", file=f)