Bump version
[yosys.git] / libs / subcircuit / README
1
2 **************************************************************************
3 * *
4 * The SubCircuit C++11 library *
5 * *
6 * An implementation of a modified Ullmann Subgraph Isomorphism Algorithm *
7 * for coarse grain logic networks. by Claire Xenia Wolf *
8 * *
9 **************************************************************************
10
11 ============
12 Introduction
13 ============
14
15 This is a library that implements a modified Ullmann Subgraph Isomorphism
16 Algorithm with additional features aimed at working with coarse grain logic
17 networks. It also contains a simple frequent subcircuit mining algorithm.
18
19 A simple command line tool that exposes the features of the library is also
20 included.
21
22
23 C++11 Warning
24 -------------
25
26 This project is written in C++11. Use appropriate compiler switches to compile
27 it. Tested with clang version 3.0 and option -std=c++11. Also tested with gcc
28 version 4.6.3 and option -std=c++0x.
29
30
31 ========
32 Features
33 ========
34
35 The input is two graphs (needle and haystack) that represent coarse grain
36 logic networks. The algorithm identifies all subgraphs of haystack that are
37 isomorphic to needle.
38
39 The following additional features over the regular Ullmann Subgraph Isomorphism
40 Algorithm are provided by the library.
41
42 * The graphs are attributed hypergraphs capable of representing netlists:
43
44 - Nodes represent the logic cells:
45 - Nodes have types and only match compatible types
46 - Nodes have ports with variable bit-width
47
48 - Hyperedges represent the signals:
49 - Each hyperedge connects one to many bits on ports on nodes
50
51 - Callback functions for advanced attributes and compatibility rules:
52 Any set of node-node compatibility rules and edge-edge
53 compatibility rules can be implemented by providing
54 the necessary callback functions.
55
56 * The algorithm is very efficient when all or many bits of one port are
57 connected to bits of the same other port. This is usually the case
58 in coarse grain logic networks. But the algorithm does not add any
59 restrictions in this area; it is just optimized for this scenario.
60
61 * The algorithm can be configured to allow larger ports in needle cells to
62 match smaller ports in haystack cells in certain situations. This way it
63 is possible to e.g. have a 32-bit adder cell in the needle match a
64 16-bit adder cell in the haystack.
65
66 * The algorithm can be configured to perform port-swapping on certain
67 ports on certain cell types to match commutative operations properly.
68
69 This is, however, not implemented very efficiently when a larger number
70 of permutations is possible on a cell type. Therefore it is recommended
71 to only use swap groups with only a few members and a few such groups
72 on one cell type type.
73
74 Also note, that the algorithm can not resolve complex dependencies
75 between the port swappings of different cells. Therefore it is
76 recommended to only use port swapping on input pins of commutative
77 operations, where such complex dependencies can not emerge.
78
79 * The algorithm can be configured to distinguish between internal signals
80 of the needle and externally visible signals. The needle will only
81 match a subgraph of the haystack if that subgraph does not expose the
82 internal signal to nodes in the haystack outside the matching subgraph.
83
84 * The algorithm can recognize a subcircuit even if some or all of its
85 inputs and/or outputs are shorted together.
86
87 * Explicit fast support for constant signals without extra nodes for
88 constant drivers.
89
90 * Support for finding only non-overlapping matches.
91
92 * A simple miner for frequent subcircuts that operates on the same circuit
93 description format.
94
95 * The public API of the library is using std::string identifiers for
96 nodes, node types and ports. Internally the costly part of the
97 algorithm is only using integer values, thus speeding up the
98 algorithm without exposing complex internal encodings to the caller.
99
100
101 =================
102 API Documentation
103 =================
104
105 This section gives a brief overview of the API. For a working example, have a
106 look at the demo.cc example program in this directory.
107
108
109 Setting up graphs
110 -----------------
111
112 Instantiate the SubCircuit::Graph class and use the methods of this class to
113 set up the circuit.
114
115 SubCircuit::Graph myGraph;
116
117 For each node in the circuit call the createNode() method. Specify the
118 identifier for the node and also the type of function implemented by the node.
119 Then call createPort() for each port of this node.
120
121 E.g. the following code adds a node "myAdder" of type "add" with three 32 bit
122 wide ports "A", "B" and "Y". Note that SubCircuit does not care which port is
123 an input and which port is an output. The last (and optional) argument to
124 createPort() specifies the minimum number of bits required for this port in the
125 haystack (this field is only used in the needle graph). So in this example the
126 node would e.g. also match any adder with a bit width smaller 32.
127
128 myGraph.createNode("myAdder", "add");
129 myGraph.createPort("myAdder", "A", 32, 1);
130 myGraph.createPort("myAdder", "B", 32, 1);
131 myGraph.createPort("myAdder", "Y", 32, 1);
132
133 The createConnection() method can be used to connect the nodes. It internally
134 creates a hypergraph. So the following code does not only connect cell1.Y with
135 cell2.A and cell3.A but also implicitly cell2.A with cell3.A.
136
137 myGraph.createConnection("cell1", "Y", "cell2", "A");
138 myGraph.createConnection("cell1", "Y", "cell3", "A");
139
140 Redundent calls to createConnection() are ignored. As long as the method is
141 called after the relevant nodes and ports are created, the order in which the
142 createConnection() calls are performed is irrelevant.
143
144 The createConnection() method can also be used to connect single bit signals.
145 In this case the start bit for both ports must be provided as well as an
146 optional width (which defaults to 1). E.g. the following calls can be used to
147 connect the 32 bit port cell4.Y to the 32 bit port cell5.A with a one bit left
148 rotate shift,
149
150 myGraph.createConnection("cell4", "Y", 0, "cell5", "A", 1, 31);
151 myGraph.createConnection("cell4", "Y", 31, "cell5", "A", 0);
152
153 The method createConstant() can be used to add a constant driver to a signal.
154 The signal value is encoded as one char by bit, allowing for multi-valued
155 logic matching. The following command sets the lowest bit of cell6.A to a
156 logic 1:
157
158 myGraph.createConnection("cell6", "A", 0, '1');
159
160 It is also possible to set an entire port to a integer value, using the
161 encodings '0' and '1' for the binary digits:
162
163 myGraph.createConnection("cell6", "A", 42);
164
165 The method markExtern() can be used to mark a signal as externally visible. In
166 a needle graph this means, this signal may match a signal in the haystack that
167 is used outside the matching subgraph. In a haystack graph this means, this
168 signal is used outside the haystack graph. I.e. an internal signal of the
169 needle won't match an external signal of the haystack regardless where the
170 signal is used in the haystack.
171
172 In some application one may disable this extern/intern checks. This can easily
173 be achieved by marking all signals in the needle as extern. This can be done
174 using the Graph::markAllExtern() method.
175
176
177 Setting up and running solvers
178 ------------------------------
179
180 To actually run the subgraph isomorphism algorithm, an instance of
181 SubCircuit::Solver must be created.
182
183 SubCircuit::Solver mySolver;
184
185 The addGraph() method can be used to register graphs with the solver:
186
187 mySolver.addGraph("graph1", myGraph);
188 mySolver.addGraph("graph2", myOtherGraph);
189
190 Usually nodes in the needle and the haystack must have the same type identifier
191 to match each other. Additionally pairs of compatible needle and haystack node
192 pairs can be registered using the addCompatibleTypes() method:
193
194 mySolver.addCompatibleTypes("alu", "add");
195 mySolver.addCompatibleTypes("alu", "sub");
196 mySolver.addCompatibleTypes("alu", "and");
197 mySolver.addCompatibleTypes("alu", "or");
198 mySolver.addCompatibleTypes("alu", "xor");
199
200 Note that nodes in needle and haystack must also use the same naming convention
201 for their ports in order to be considered compatible by the algorithm.
202
203 Similarly the method addCompatibleConstants() can be used the specify which
204 constant values in the needle should match which constant value in the haystack.
205 Equal values always do match.
206
207 mySolver.addCompatibleConstants('x', '0');
208 mySolver.addCompatibleConstants('x', '1');
209
210 Some cells implement commutative operations that don't care if their input
211 operands are swapped. For this cell types it is possible to register groups
212 of swappable ports. Let's consider a cell "macc23" that implements the
213 function Y = (A * B) + (C * D * E):
214
215 mySolver.addSwappablePorts("macc23", "A", "B");
216 mySolver.addSwappablePorts("macc23", "C", "D", "E");
217
218 Sometimes the rules for port swapping are a more complicated and the swapping
219 of one port is related to the swapping of another port. Let's consider a cell
220 "macc22" that implements the function Y = (A * B) + (C * D):
221
222 mySolver.addSwappablePorts("macc22", "A", "B");
223 mySolver.addSwappablePorts("macc22", "C", "D");
224
225 std::map<std::string, std::string> portMapping;
226 portMapping["A"] = "C";
227 portMapping["B"] = "D";
228 portMapping["C"] = "A";
229 portMapping["D"] = "B";
230 mySolver.addSwappablePortsPermutation("macc22", portMapping);
231
232 I.e. the method mySolver.addSwappablePortsPermutation() can be used to register
233 additional permutations for a node type of which one or none is applied on top
234 of the permutations yielded by the permutations generated by the swap groups.
235
236 Note that two solutions that differ only in the applied port swapping are not
237 reported as separate solutions. Instead only one of them is selected (in most
238 cases the one with less port swapping as it is usually identified first).
239
240 Once everything has been set up, the solve() method can be used to actually
241 search for isomorphic subgraphs. The first argument to solve() is an
242 std::vector<SubCircuit::Solver::Result> objects to which all found solutions
243 are appended. The second argument is the identifier under which the needle
244 graph has been registered and the third argument is the identifier under which
245 the haystack graph has been registered:
246
247 std::vector<SubCircuit::Solver::Result> results;
248 mySolver.solve(results, "graph1", "graph2");
249
250 The SubCircuit::Solver::Result object is a simple data structure that contains
251 the mappings between needle and haystack nodes, port mappings after the port
252 swapping and some additional metadata. See "subcircuit.h" and "demo.cc" for
253 details.
254
255 The solve() method has a third optional boolean argument. If it is set to
256 false, solve will not return any solutions that contain haystack nodes that
257 have been part of a previously found solution. This way it is e.g. easy
258 to implement a greedy macro cell matching algorithm:
259
260 std::vector<SubCircuit::Solver::Result> results;
261 mySolver.solve(results, "macroCell1", "circuit", false);
262 mySolver.solve(results, "macroCell2", "circuit", false);
263 mySolver.solve(results, "macroCell3", "circuit", false);
264
265 After this code has been executed, the results vector contains all
266 non-overlapping matches of the three macrocells. The method
267 clearOverlapHistory() can be used to reset the internal state used
268 for this feature. The default value for the third argument to solve()
269 is true (allow overlapping). The optional boolean fourth argument to the
270 Graph::createNode() method can be used to mark a node as shareable even
271 in non-overlapping solver mode.
272
273 The solve() method also has a fourth optional integer argument. If it is set to
274 a positive integer, this integer specifies the maximum number of solutions to
275 be appended to the results vector, i.e. to terminate the algorithm early when
276 the set number of matches is found. When this fourth argument is negative or
277 omitted all matches are found and appended.
278
279 An alternative version of the solve() method supports an additional argument
280 after they haystack graph identifier that specifies initial mappings for
281 the algorithm. In the following example only the haystack nodes cell_1 and
282 cell_2 are considered as mappings for the needle node cell_A:
283
284 std::map<std::string, std::set<std::string>> initialMappings;
285 initialMappings["cell_A"].insert("cell_1");
286 initialMappings["cell_A"].insert("cell_2");
287
288 std::vector<SubCircuit::Solver::Result> results;
289 mySolver.solve(results, "graph1", "graph2", initialMappings);
290
291 The clearConfig() method can be used to clear all data registered using
292 addCompatibleTypes(), addCompatibleConstants(), addSwappablePorts() and
293 addSwappablePortsPermutation() but retaining the graphs and the overlap state.
294
295
296 Using user callback function
297 ----------------------------
298
299 For more complex tasks it is possible to derive a class from SubCircuit::Solver
300 that overloads one or more of the following virtual methods. The userData
301 arguments to the following methods are void pointers that can be passed as
302 third argument to Graph::createNode() and are simly passed thru to the user
303 callback functions together with the node id whenever a node is referenced.
304
305 bool userCompareNodes(needleGraphId, needleNodeId, needleUserData, haystackGraphId, haystackNodeId, haystackUserData):
306
307 Perform additional checks on a pair of nodes (one from the needle, one
308 from the haystack) to determine if the nodes are compatible. The default
309 implementation always returns true.
310
311
312 bool userCompareEdge(needleGraphId, needleFromNodeId, needleFromUserData, needleToNodeId, needleToUserData,
313 haystackGraphId, haystackFromNodeId, haystackFromUserData, haystackToNodeId, haystackToUserData):
314
315 Perform additional checks on a pair of a pair of adjacent nodes (one
316 adjacent pair from the needle and one adjacent pair from the haystack)
317 to determine whether this edge from the needle is compatible with
318 that edge from the haystack. The default implementation always
319 returns true.
320
321 bool userCheckSolution(result):
322
323 Perform additional checks on a solution before appending it to the
324 results vector. When this function returns false, the solution is
325 ignored. The default implementation always returns true.
326
327
328 Mining for frequent SubCircuits
329 -------------------------------
330
331 The solver also contains a miner for frequent subcircuits. The following code
332 fragment will find all frequent subcircuits with at least minNodes nodes and
333 at most maxNodes nodes that occurs at least minMatches times:
334
335 std::vector<SubCircuit::Solver::MineResult> results;
336 mySolver.mine(results, minNodes, maxNodes, minMatches);
337
338 The mine() method has an optional fifth parameter that limits the number of
339 matches counted in one graph. This can be useful when mining for circuits that
340 are found in at least a number of graphs. E.g. the following call would find
341 all subcircuits with 5 nodes that are found in at least 7 of the registered
342 graphs:
343
344 mySolver.mine(results, 5, 5, 7, 1);
345
346 Note that this miner is not very efficient and therefore its use is not
347 recommended for large circuits. Also note that the miner is working under the
348 assumption that subgraph isomorphism is bidirectional. This is not the case in
349 circuits with gates with shorted pins. This can result in undetected frequent
350 subcircuits in some corner cases.
351
352
353 Debugging
354 ---------
355
356 For debugging purposes the SubCircuit::Solver class implements a setVerbose()
357 method. When called once, all further calls to the solve() method cause the
358 algorithm to dump out a lot of debug information to stdout.
359
360 In conjunction with setVerbose() one can also overload the userAnnotateEdge()
361 method in order to add additional information about the edges to the debug
362 output.
363
364
365 ===================
366 Shell Documentation
367 ===================
368
369 This package also contains a small command-line tool called "scshell" that can
370 be used for experimentation with the algorithm. This program reads a series of
371 commands from stdin and reports its findings to stdout on exit.
372
373 $ ./scshell < test_macc22.txt
374
375 ...
376
377 Match #3: (macc22 in macc4x2)
378 add_1 -> add_2 A:B B:A Y:Y
379 mul_1 -> mul_4 A:A B:B Y:Y
380 mul_2 -> mul_3 A:A B:B Y:Y
381
382 The following commands can be used in scshell to specify graphs:
383
384 graph <graph_name>
385 ...
386 endgraph
387
388 Used to specify a graph with the given name. Only the commands
389 "node", "connect" and "extern" may be used within the graph ...
390 endgraph block.
391
392 node <node_name> [<port_name> [<bits> [<min_bits>]]]+
393
394 Used to create a node and ports. This command is a direct frontend
395 to the Graph::createNode() and Graph::createPort() methods.
396
397 connect <from_node> <from_port> <to_node> <to_port>
398 connect <from_node> <from_port> <from_bit> <to_node> <to_port> <to_bit>
399 connect <from_node> <from_port> <from_bit> <to_node> <to_port> <to_bit> <width>
400
401 Used to connect the nodes in the graph via Graph::createConnection().
402
403 constant <node> <port> [<bit>] <value>
404
405 Call Graph::createConstant().
406
407 extern <node> [<port> [<bit>]]+
408
409 Mark signals as extern via Graph::markExtern().
410
411 allextern
412
413 Mark all signals as extern via Graph::markAllExtern().
414
415 The following commands can be used in scshell outside a graph ... endgraph block:
416
417 compatible <needle_type> <haystack_type>
418
419 Call Solver::addCompatibleTypes().
420
421 constcompat <needle_value> <haystack_value>
422
423 Call Solver::addCompatibleConstants().
424
425 swapgroup <needle_type> <port>+
426
427 Call Solver::addSwappablePorts().
428
429 swapperm <needle_type> <ports>+ : <ports>+
430
431 Call Solver::addSwappablePortsPermutation(). Both port lists must
432 have the same length and the second one must be a permutation of the
433 first one.
434
435 initmap <needle_node> <haystack_node>+
436
437 Add an entry to the initial mappings for the next solve command.
438 This mappings are automatically reset after the solve command.
439
440 solve <needle_graph> <haystack_graph> [<allow_overlap> [<max_solutions>]]
441
442 Call Solver::solve(). The <allow_overlap> must be "1" or "true"
443 for true and "0" or "false" for false.
444
445 mine <min_nodes> <max_nodes> <min_matches> [<limit_matches_per_graph>]
446
447 Call Solver::mine().
448
449 expect <number>
450
451 Print all results so far since the last call to expect. Expect
452 <number> results and exit with error code 1 if a different number
453 of results have been found.
454
455 clearoverlap
456
457 Call Solver::clearOverlapHistory().
458
459 clearconfig
460
461 Call Solver::clearConfig().
462
463 verbose
464
465 Call Solver::setVerbose().
466