Carl Worth [Mon, 17 May 2010 19:45:16 +0000 (12:45 -0700)]
Expect 1 shift/reduce conflict.
The most recent fix to the parser introduced a shift/reduce
conflict. We document this conflict here, and tell bison that it need
not report it (since I verified that it's being resolved in the
direction desired).
For the record, I did write additional lexer code to eliminate this
conflict, but it was quite fragile, (would not accept a newline
between a function-like macro name and the left parenthesis, for
example).
Carl Worth [Mon, 17 May 2010 17:34:29 +0000 (10:34 -0700)]
Fix bug (and add test) for a function-like-macro appearing as a non-macro.
That is, when a function-like macro appears in the content without
parentheses it should be accepted and passed on through, (previously
the parser was regarding this as a syntax error).
Carl Worth [Mon, 17 May 2010 17:15:23 +0000 (10:15 -0700)]
Add test and fix bug leading to infinite recursion.
The test case here is simply "#define foo foo" and "#define bar foo"
and then attempting to expand "bar".
Previously, our termination condition for the recursion was overly
simple---just looking for the single identifier that began the
expansion. We now fix this to maintain a stack of identifiers and
terminate when any one of them occurs in the replacement list.
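A minimal sketch of that termination rule, assuming single-token
replacement lists and a fixed, hypothetical macro table (the names
macro_t, lookup, is_active and expand_token are illustrative and are
not taken from the glcpp source):

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 32

/* Hypothetical macro table entry: a name and a NULL-terminated
 * replacement list of tokens. */
typedef struct {
    const char *name;
    const char *replacements[4];
} macro_t;

static const macro_t macros[] = {
    { "foo", { "foo", NULL } },   /* #define foo foo */
    { "bar", { "foo", NULL } },   /* #define bar foo */
};

static const macro_t *lookup(const char *name)
{
    for (size_t i = 0; i < sizeof macros / sizeof macros[0]; i++)
        if (strcmp(macros[i].name, name) == 0)
            return &macros[i];
    return NULL;
}

/* Is `name` one of the macros currently being expanded? */
static int is_active(const char **stack, int depth, const char *name)
{
    for (int i = 0; i < depth; i++)
        if (strcmp(stack[i], name) == 0)
            return 1;
    return 0;
}

static void expand(const char *token, const char **stack, int depth,
                   char *out, size_t size)
{
    const macro_t *m = lookup(token);

    /* Emit the token unchanged if it is not a macro, or if it is
     * already on the stack of identifiers being expanded. */
    if (m == NULL || depth == MAX_DEPTH || is_active(stack, depth, token)) {
        if (out[0] != '\0')
            strncat(out, " ", size - strlen(out) - 1);
        strncat(out, token, size - strlen(out) - 1);
        return;
    }

    stack[depth] = token;
    for (int i = 0; m->replacements[i] != NULL; i++)
        expand(m->replacements[i], stack, depth + 1, out, size);
}

/* Expand `token` into `out`; `size` must be greater than zero. */
void expand_token(const char *token, char *out, size_t size)
{
    const char *stack[MAX_DEPTH];

    out[0] = '\0';
    expand(token, stack, 0, out, size);
}
```

With this table, expanding "bar" yields "foo": the inner "foo" is
already on the stack when it reappears, so the recursion stops. A check
against only the starting identifier ("bar") would never fire here.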
Carl Worth [Sat, 15 May 2010 00:29:24 +0000 (17:29 -0700)]
Fix two whitespace bugs in the lexer.
The first bug was not allowing whitespace between '#' and the
directive name.
The second bug was swallowing a terminating newline along with any
trailing whitespace on a line.
With these two fixes, and the previous commit to stop emitting SPACE
tokens, the recently added extra-whitespace test now passes.
Carl Worth [Sat, 15 May 2010 00:08:45 +0000 (17:08 -0700)]
Don't return SPACE tokens unless strictly needed.
This reverts the unconditional return of SPACE tokens from the lexer
from commit 48b94da0994b44e41324a2419117dcd81facce8b.
That commit seemed useful because it kept the lexer simpler, but the
presence of SPACE tokens is causing lots of extra complication for the
parser itself (productions that are redundant apart from whitespace
differences, several productions that are buggy in the case of extra
whitespace, etc.).
Of course, we'd prefer to never have any whitespace token, but that's
not possible with the need to distinguish between "#define foo()" and
"#define foo ()". So we'll accept a little bit of pain in the lexer,
(enough state to support this special-case token), in exchange for
keeping most of the parser blissfully ignorant of whether tokens are
separated by whitespace or not.
This change does mean that our output now differs from that of "gcc -E",
but only in whitespace. So we test with "diff -w" now to ignore those
differences.
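The one place where whitespace stays significant can be illustrated
with a small stand-alone check. This is only a sketch of the rule, not
the lexer state machine itself; define_is_function_like is a
hypothetical helper that receives the text following "#define":

```c
#include <ctype.h>

/* Returns 1 if the definition is function-like, that is, if '(' follows
 * the macro name with no intervening whitespace; returns 0 otherwise. */
int define_is_function_like(const char *text)
{
    /* Skip the whitespace between "#define" and the macro name. */
    while (*text == ' ' || *text == '\t')
        text++;

    /* Skip the macro name itself. */
    while (isalnum((unsigned char) *text) || *text == '_')
        text++;

    return *text == '(';
}
```

Under this rule " foo()" is function-like, while " foo ()" is an
object-like macro whose replacement list begins with "(".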
Carl Worth [Fri, 14 May 2010 23:58:00 +0000 (16:58 -0700)]
Add test with extra whitespace in macro definitions and invocations.
This whitespace is not dealt with in an elegant way yet so this test
does not pass currently.
Carl Worth [Fri, 14 May 2010 23:53:52 +0000 (16:53 -0700)]
Provide implementation for macro arguments containing parentheses.
We were correctly parsing this already, but simply not returning any
value (for no good reason). Fortunately the fix is quite simple.
This makes the test added in the previous commit now pass.
Carl Worth [Fri, 14 May 2010 23:51:54 +0000 (16:51 -0700)]
Add test invoking a macro with an argument containing (non-macro) parentheses.
The macro invocation is defined to consume all text between a set of
matched parentheses. We previously tested for inner parentheses from a
nested function-like macro invocation. Here we test for inner
parentheses occurring on their own, (not part of another macro
invocation).
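As an illustration of "all text between a set of matched parentheses",
here is a sketch that works on a plain character string rather than on
parser tokens. matching_paren is a hypothetical helper, not a function
from glcpp:

```c
/* Returns the index of the ')' that matches the '(' at s[0], or -1 if
 * s does not start with '(' or the parentheses are unbalanced. */
int matching_paren(const char *s)
{
    int depth = 0;

    if (s[0] != '(')
        return -1;

    for (int i = 0; s[i] != '\0'; i++) {
        if (s[i] == '(')
            depth++;
        else if (s[i] == ')' && --depth == 0)
            return i;
    }

    return -1;
}
```

For the text "(1, (2 + 3)) rest" the inner pair is skipped and the
function returns 11, the index of the closing parenthesis of the
invocation.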
Carl Worth [Fri, 14 May 2010 19:05:37 +0000 (12:05 -0700)]
Fix expansion of composited macros.
This is a case such as "foo(bar(x))". The recently added test for this
now passes.
Carl Worth [Fri, 14 May 2010 17:01:44 +0000 (10:01 -0700)]
Add test for composed invocation of function-like macros.
This is a case like "foo(bar(x))" where both foo and bar are defined
function-like macros. This is not yet parsed correctly so this test
fails.
Carl Worth [Fri, 14 May 2010 18:33:00 +0000 (11:33 -0700)]
Eliminate a shift/reduce conflict.
By simply allowing for the argument_list production to be empty rather
than the lower-level argument production to be empty.
Carl Worth [Fri, 14 May 2010 17:44:19 +0000 (10:44 -0700)]
Support macro invocations with multiple tokens for a single argument.
We provide for this by changing the value of the argument-list
production from a list of strings (string_list_t) to a new
data-structure that holds a list of lists of strings
(argument_list_t).
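One possible shape for such a structure is sketched below. Only the
type names string_list_t and argument_list_t come from the message
above; the field layout, the node types and the helper functions are
assumptions, and plain calloc is used here for brevity (nothing is
freed in this sketch):

```c
#include <stdlib.h>

typedef struct string_node {
    const char *str;
    struct string_node *next;
} string_node_t;

typedef struct {
    string_node_t *head;
    string_node_t *tail;
} string_list_t;

/* Each argument is itself a list of tokens. */
typedef struct argument_node {
    string_list_t *argument;
    struct argument_node *next;
} argument_node_t;

typedef struct {
    argument_node_t *head;
    argument_node_t *tail;
} argument_list_t;

/* Zero-initialized allocation that aborts on failure. */
static void *xcalloc(size_t size)
{
    void *p = calloc(1, size);
    if (p == NULL)
        abort();
    return p;
}

string_list_t *string_list_create(void)
{
    return xcalloc(sizeof(string_list_t));
}

void string_list_append(string_list_t *list, const char *str)
{
    string_node_t *node = xcalloc(sizeof *node);
    node->str = str;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
}

int string_list_length(const string_list_t *list)
{
    int n = 0;
    for (const string_node_t *node = list->head; node; node = node->next)
        n++;
    return n;
}

argument_list_t *argument_list_create(void)
{
    return xcalloc(sizeof(argument_list_t));
}

void argument_list_append(argument_list_t *list, string_list_t *argument)
{
    argument_node_t *node = xcalloc(sizeof *node);
    node->argument = argument;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
}

int argument_list_length(const argument_list_t *list)
{
    int n = 0;
    for (const argument_node_t *node = list->head; node; node = node->next)
        n++;
    return n;
}
```

An invocation such as "foo(a + b, c)" would then be represented as two
arguments, the first holding three tokens and the second holding one.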
Carl Worth [Fri, 14 May 2010 17:00:59 +0000 (10:00 -0700)]
Add test for function-like macro invocations with multiple-token arguments.
These are not yet parsed correctly, so these tests fail.
Carl Worth [Fri, 14 May 2010 17:31:43 +0000 (10:31 -0700)]
Make macro-expansion productions create string-list values rather than printing
Then we print the final string list up at the top-level content
production along with all other printing.
Additionally, having macro-expansion productions that create values
will make it easier to solve problems like composed function-like
macro invocations in the future.
Carl Worth [Fri, 14 May 2010 17:17:38 +0000 (10:17 -0700)]
Move most printing to the action in the content production.
Previously, printing was occurring all over the place. Here we
document that it should all be happening at the top-level content
production, and we move the printing of directive newlines.
The printing of expanded macros is still happening in lower-level
productions, but we plan to fix that soon.
Carl Worth [Fri, 14 May 2010 17:12:21 +0000 (10:12 -0700)]
Remove _list suffix from several identifiers.
Instead of "parameter_list" and "replacement_list" just use
"parameters" and "replacements". This is consistent with the existing
"arguments" and keeps the line length down in the face of the
now-longer "string_list_t" rather than "list_t".
Carl Worth [Fri, 14 May 2010 17:05:11 +0000 (10:05 -0700)]
Rename list_t and node_t to string_list_t and string_node_t.
We'll soon be adding other types of lists, so it will be helpful to
have a qualified name here.
Carl Worth [Thu, 13 May 2010 19:58:49 +0000 (12:58 -0700)]
Fix case of a macro formal parameter matching a defined macro.
Simply need to allow for a macro name to appear in the parameter list.
This makes the recently-added test pass.
Carl Worth [Thu, 13 May 2010 19:57:34 +0000 (12:57 -0700)]
Add test where a macro formal parameter is the same as an existing macro.
This is a well-defined condition, but something that currently trips up
the implementation. Should be easy to fix.
Carl Worth [Thu, 13 May 2010 19:56:42 +0000 (12:56 -0700)]
Implement substitution of macro arguments.
Making the two recently-added tests for this functionality now pass.
Carl Worth [Thu, 13 May 2010 19:54:17 +0000 (12:54 -0700)]
Add tests exercising substitution of arguments in function-like macros.
This capability is the only thing that makes function-like macros
interesting. This isn't supported yet so these tests fail for now.
Carl Worth [Thu, 13 May 2010 17:46:29 +0000 (10:46 -0700)]
Make the lexer return SPACE tokens unconditionally.
It seems strange to always be returning SPACE tokens, but since we
were already needing to return a SPACE token in some cases, this
actually simplifies our lexer.
This also allows us to fix two whitespace-handling differences
compared to "gcc -E" so that now the recent modification to the test
suite passes once again.
Carl Worth [Thu, 13 May 2010 17:45:32 +0000 (10:45 -0700)]
Makefile: Make "make test" depend on the main program.
Otherwise, running "make test" can run an old version of the code,
(even when new changes are sitting in the source waiting to be
compiled).
Carl Worth [Thu, 13 May 2010 17:41:53 +0000 (10:41 -0700)]
Add some whitespace variations to test 15.
This shows two minor failures in our current parsing (resulting in
whitespace-only changes, so not that significant):
1. We are inserting extra whitespace between tokens not originally
separated by whitespace in the replacement list of a macro
definition.
2. We are swallowing whitespace separating tokens in the general
content.
Carl Worth [Thu, 13 May 2010 17:29:07 +0000 (10:29 -0700)]
Fix parsing of object-like macro with a definition that begins with '('.
Previously our parser was incorrectly treating this case as a
function-like macro. We fix this by conditionally passing a SPACE
token from the lexer, (but only immediately after the identifier
immediately after #define).
Carl Worth [Thu, 13 May 2010 17:26:58 +0000 (10:26 -0700)]
Add test for an object-like macro with a definition beginning with '('
Our current parser sees "#define foo (" as an identifier token
followed by a '(' token and parses this as a function-like macro.
That would be correct for "#define foo(" but the preprocessor
specification treats this whitespace as significant here so this test
currently fails.
Carl Worth [Fri, 14 May 2010 15:47:32 +0000 (08:47 -0700)]
Eliminate a reduce/reduce conflict in the function-like macro production.
Previously, an empty argument could be parsed as either an "argument_list"
directly or first as an "argument" and then an "argument_list".
We fix this by removing the possibility of an empty "argument_list"
directly.
Carl Worth [Thu, 13 May 2010 16:36:23 +0000 (09:36 -0700)]
Add support for the structure of function-like macros.
We accept the structure of arguments in both macro definition and
macro invocation, but we don't yet expand those arguments. This is
just enough code to pass the recently-added tests, but does not yet
provide any sort of useful function-like macro.
Carl Worth [Thu, 13 May 2010 16:34:21 +0000 (09:34 -0700)]
Add tests for the structure of function-like macros.
These test only the most basic aspect of parsing of function-like
macros. Specifically, none of the definitions of these function-like
macros use the arguments of the function.
No function-like macros are implemented yet, so all of these fail for
now.
Carl Worth [Thu, 13 May 2010 14:38:29 +0000 (07:38 -0700)]
Make the lexer distinguish between identifiers and defined macros.
This is just a minor style improvement for now. But the same
mechanism, (having the lexer peek into the table of defined macros),
will be essential when we add function-like macros in addition to the
current object-like macros.
Carl Worth [Wed, 12 May 2010 20:21:20 +0000 (13:21 -0700)]
Remove some redundancy in the top-level production.
Previously we had two copies of all top-level actions, (once in a list
context and once in a non-list context). Much simpler to instead have
a single list-context production with no action and then only have the
actions in their own non-list contexts.
Carl Worth [Wed, 12 May 2010 20:19:23 +0000 (13:19 -0700)]
Simplify lexer significantly (remove all stateful lexing).
We are able to remove all state by simply passing NEWLINE through
as a token unconditionally (as opposed to only passing newline when
on a directive line as we did previously).
Carl Worth [Wed, 12 May 2010 20:14:08 +0000 (13:14 -0700)]
Add test case to define, undef, and then again define a macro.
Happily, this is another test case that works just fine without any
additional code.
Carl Worth [Wed, 12 May 2010 20:11:50 +0000 (13:11 -0700)]
Add support for the #undef directive.
This isn't ideal for two reasons:
1. There's a bunch of stateful redundancy in the lexer that should be
cleaned up.
2. The hash table does not provide a mechanism to delete an entry, so
we waste memory to add a new NULL entry in front of the existing
entry with the same key.
But this does at least work, (it passes the recently added undef test
case).
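The shadowing trick from point 2 can be sketched with a single linked
chain standing in for one hash bucket. This is not the API of the
hash_table implementation imported from glsl2; entry_t, table_insert,
table_find and table_undef are hypothetical names:

```c
#include <stdlib.h>
#include <string.h>

typedef struct entry {
    const char *key;
    const char *value;     /* NULL marks the macro as undefined */
    struct entry *next;
} entry_t;

/* Prepend a new entry to the chain and return the new head. Older
 * entries with the same key stay in place but are never reached. */
entry_t *table_insert(entry_t *head, const char *key, const char *value)
{
    entry_t *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->key = key;
    e->value = value;
    e->next = head;
    return e;
}

/* Return the value of the first entry matching `key`, or NULL if the
 * key is absent or has been shadowed by an "undefined" entry. */
const char *table_find(const entry_t *head, const char *key)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->key, key) == 0)
            return head->value;
    return NULL;
}

/* With no delete operation available, "#undef" adds a NULL entry in
 * front of the existing one. */
entry_t *table_undef(entry_t *head, const char *key)
{
    return table_insert(head, key, NULL);
}
```

Each define/undef cycle grows the chain by one entry, which is the
memory cost the message above refers to.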
Carl Worth [Wed, 12 May 2010 19:51:31 +0000 (12:51 -0700)]
Add test for #undef.
Which hasn't been implemented yet, so this test fails.
Carl Worth [Wed, 12 May 2010 19:49:07 +0000 (12:49 -0700)]
Add test for an empty definition.
Happily this one passes without needing any additional code.
Carl Worth [Wed, 12 May 2010 19:45:33 +0000 (12:45 -0700)]
Convert lexer to talloc and add xtalloc wrappers.
The lexer was previously using strdup (expecting the parser to free),
but is now more consistent, easier to use, and slightly more efficient
by using talloc along with the parser.
Also, we add xtalloc and xtalloc_strdup wrappers around talloc and
talloc_strdup to put all of the out-of-memory-checking code in one
place.
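The wrapper idea itself is independent of talloc. As a rough analogue
(deliberately using plain malloc rather than the talloc API, so the
names xmalloc and xstrdup below are stand-ins and not the xtalloc
wrappers from this commit):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate `size` bytes, or report the failure and exit. Callers
 * never need to check the result for NULL. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    /* malloc(0) may legitimately return NULL, so only treat NULL as
     * an error for non-zero sizes. */
    if (p == NULL && size != 0) {
        fprintf(stderr, "Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    return p;
}

/* Duplicate a string using the checked allocator. */
char *xstrdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = xmalloc(n);

    memcpy(copy, s, n);
    return copy;
}
```

Every call site can then use the returned pointer directly, with the
out-of-memory check living in exactly one function.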
Carl Worth [Wed, 12 May 2010 19:17:10 +0000 (12:17 -0700)]
Fix defines involving both literals and other defined macros.
We now store a list of tokens in our hash-table rather than a single
string. This lets us replace each macro in the value as necessary.
This code adds a link dependency on talloc which does exactly what we
want in terms of memory management for a parser.
The 3 tests added in the previous commit now pass.
Carl Worth [Tue, 11 May 2010 19:39:29 +0000 (12:39 -0700)]
Add tests defining a macro to be a literal and another macro.
These 3 new tests are modeled after 3 existing tests but made slightly
more complex since now instead of defining a new macro to be an
existing macro, we define it to be replaced with two tokens, (one a
literal, and one an existing macro).
These tests all fail currently because the replacement lookup is
currently happening on the basis of the entire replacement string
rather than on a list of tokens.
Carl Worth [Tue, 11 May 2010 19:35:06 +0000 (12:35 -0700)]
Add a couple more tests for chained #define directives.
One with the chained defines in the opposite order, and one with the
potential to trigger an infinite-loop bug through mutual
recursion. Both of these tests pass already.
Carl Worth [Tue, 11 May 2010 19:30:09 +0000 (12:30 -0700)]
Fix to handle chained #define directives.
The fix is as simple as adding a loop to continue to look up values
in the hash table until one of the following termination conditions:
1. The token we look up has no definition
2. We get back the original symbol we started with
This second termination condition prevents infinite iteration.
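A sketch of that loop, assuming each macro maps to a single string (as
in the implementation at this point in the history) and using a fixed,
hypothetical table in place of the real hash table:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *value;
} define_t;

static const define_t defines[] = {
    { "foo",  "1"    },   /* #define foo 1     */
    { "bar",  "foo"  },   /* #define bar foo   */
    { "ping", "pong" },   /* #define ping pong */
    { "pong", "ping" },   /* #define pong ping */
};

static const char *find(const char *name)
{
    for (size_t i = 0; i < sizeof defines / sizeof defines[0]; i++)
        if (strcmp(defines[i].name, name) == 0)
            return defines[i].value;
    return NULL;
}

/* Follow the chain of definitions starting at `symbol`. */
const char *resolve(const char *symbol)
{
    const char *current = symbol;

    for (;;) {
        const char *value = find(current);

        if (value == NULL)                 /* 1. no definition */
            return current;
        if (strcmp(value, symbol) == 0)    /* 2. back at the start */
            return symbol;

        current = value;
    }
}
```

Here resolve("bar") follows bar, foo, 1 and returns "1", while
resolve("ping") stops as soon as "ping" reappears. The rule only
catches cycles that pass through the starting symbol; the later commit
"Add test and fix bug leading to infinite recursion" describes the case
it misses.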
Carl Worth [Tue, 11 May 2010 19:29:22 +0000 (12:29 -0700)]
Add test for chained #define directives.
Where one macro is defined in terms of another macro. The current
implementation does not yet deal with this correctly.
Carl Worth [Tue, 11 May 2010 19:04:42 +0000 (12:04 -0700)]
Add README file describing glcpp.
Mostly this is a place for me to write down the URLs of the GLSL and
C99 specifications that I need to write this code.
Carl Worth [Mon, 10 May 2010 23:21:10 +0000 (16:21 -0700)]
Add a very simple test for the pre-processor.
Validate desired test cases by ensuring the output of glcpp matches
the output of the gcc preprocessor, (ignoring any lines of the gcc
output beginning with '#').
Only one test case so far with a trivial #define.
Carl Worth [Mon, 10 May 2010 23:16:06 +0000 (16:16 -0700)]
Implement #define
By using the recently-imported hash_table implementation.
Carl Worth [Mon, 10 May 2010 23:14:59 +0000 (16:14 -0700)]
Makefile: Enable debugging of parser.
This compiles the debugging code for the parser. It's not active
unless the yydebug variable is set to a non-zero value.
Carl Worth [Mon, 10 May 2010 20:36:26 +0000 (13:36 -0700)]
Add hash table implementation from glsl2 project.
The preprocessor here is intended to become part of the glsl2 codebase
eventually anyway.
Carl Worth [Mon, 10 May 2010 20:32:42 +0000 (13:32 -0700)]
Add .gitignore file.
To ignore generated source files (and glcpp binary).
Carl Worth [Mon, 10 May 2010 20:17:25 +0000 (13:17 -0700)]
Add some compiler warnings and corresponding fixes.
Most of the current problems were (mostly) harmless things like
missing declarations, but there was at least one real error, (reversed
argument order for yyerror).
Carl Worth [Mon, 10 May 2010 18:52:29 +0000 (11:52 -0700)]
Make the lexer reentrant (to avoid "still reachable" memory).
This allows the final program to be 100% "valgrind clean", (freeing
all memory that it allocates). This will make it much easier to ensure
that any allocations that parser actions perform are also cleaned up.
Carl Worth [Mon, 10 May 2010 18:44:09 +0000 (11:44 -0700)]
Add the tiniest shell of a flex/bison-based parser.
It doesn't really *do* anything yet---merely parsing a stream of
whitespace-separated tokens, (and not interpreting them at all).