[antlr-interest] ANTLR Questions
Gavin Lambert
antlr at mirality.co.nz
Wed May 28 01:54:23 PDT 2008
At 11:02 28/05/2008, ANTLR Mailing List wrote:
>Using this grammar:
>http://www.antlr.org/pipermail/antlr-interest/attachments/20080526/595e3dfb/attachment-0001.obj
>
>I seem to get ambiguity errors, or so I think. The error messages
>are very ambiguous themselves (Yes, I know, wait until ANTLR 3 is
>built on ANTLR 3), but I cannot pinpoint the results of them..
A very quick glance over the grammar suggests these might be
problems:
1. The use of ~IdentifierPart means you're actually consuming the
following non-IdentifierPart character, which may not be what you
want. You should probably use a syntactic predicate instead.
2. Actually, you probably shouldn't do it at all, since
'IdentifierPart' is not a character set, it is a sequence (it
contains IdentifierStart, which contains EscapeSequence, which can
represent a sequence of characters); it's illegal to use ~ on a
sequence.
3. Your various integer tokens are ambiguous; remember, the lexer
doesn't have any context, and can't look ahead past a + or * loop
without an explicit syntactic predicate (or backtracking, which
doesn't work in the lexer). You'll need to merge all of these
into one rule that switches the token type depending on predicates.
4. RegExpLiteral, SingleLineComment, MultiLineComment, and
DocComment are all ambiguous (RegExpLiteral can match all of
them).
5. MultiLineCommentInside is just plain illegal, as previously
mentioned. To negate a sequence you have to explicitly spell
out the possibilities; i.e. instead of this:
    ~'*/'
you need to do this:
    (~'*' | '*' ~'/')
Another option is to use ANTLR's automatic non-greedy matching and
change MultiLineComment to:
    '/*' .* '*/'
(You can't extract a fragment out of that though, it won't work.)
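To see the difference between the two forms in point 5 without running
ANTLR, here is a rough analogy using Python regular expressions. It is
only a sketch: a regex engine is not ANTLR's lexer, and the names
SPELLED_OUT, NON_GREEDY and first_comment are made up for the example.
It does show one thing to watch for. Because the spelled-out form
consumes each '*' together with the character after it, the regex
version does not recognise a comment that ends in '**/', while the
non-greedy version does. Whether a given ANTLR grammar behaves the
same way is worth checking against real input.

```python
import re

# Regex analogue of  '/*' (~'*' | '*' ~'/')* '*/'
SPELLED_OUT = re.compile(r"/\*(?:[^*]|\*[^/])*\*/")

# Regex analogue of  '/*' .* '*/'  with non-greedy matching.
# DOTALL lets '.' cross newlines, which a multi-line comment needs.
NON_GREEDY = re.compile(r"/\*.*?\*/", re.DOTALL)


def first_comment(pattern, text):
    """Return the comment matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


plain = "/* one */ x /* two */"
tricky = "/* a **/"

print(first_comment(SPELLED_OUT, plain))   # /* one */
print(first_comment(NON_GREEDY, plain))    # /* one */
print(first_comment(SPELLED_OUT, tricky))  # None
print(first_comment(NON_GREEDY, tricky))   # /* a **/
```

Both forms stop at the first '*/' on ordinary input, so neither one
swallows the text between two separate comments.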
You also need to watch out a bit for over-use of fragments. Since
fragments are still treated as rules (they get their own method)
they unfortunately don't always give the same behaviour as when
they're inlined. This is especially true when used with ~.
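Going back to points 1 and 3 in the list above, both come down to the
same idea: decide by peeking, consume only the characters that belong
to the token, and let one rule pick the token type at the end. A
hand-written sketch of that idea in Python follows. It is not what
ANTLR generates, and the function scan_number and the token names INT,
FLOAT and HEX are hypothetical, since the actual integer rules live in
the attached grammar and are not reproduced here.

```python
DIGITS = "0123456789"
HEX_DIGITS = DIGITS + "abcdefABCDEF"


def scan_number(text, pos=0):
    """Scan one numeric literal starting at text[pos].

    Returns (token_type, lexeme, next_pos). The character at next_pos
    has only been looked at, never consumed.
    """
    start = pos
    # Hex form: decided by peeking at two characters before consuming.
    if text[pos:pos + 2].lower() == "0x":
        pos += 2
        digits_start = pos
        while pos < len(text) and text[pos] in HEX_DIGITS:
            pos += 1
        if pos == digits_start:
            raise ValueError("hex literal needs at least one digit")
        return "HEX", text[start:pos], pos

    if pos >= len(text) or text[pos] not in DIGITS:
        raise ValueError("not a number")
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1

    # One rule, several token types: start as INT and switch to FLOAT
    # only if a '.' followed by a digit is seen while peeking ahead.
    token_type = "INT"
    if pos + 1 < len(text) and text[pos] == "." and text[pos + 1] in DIGITS:
        pos += 1
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        token_type = "FLOAT"
    return token_type, text[start:pos], pos


print(scan_number("12+3"))   # ('INT', '12', 2)
print(scan_number("3.14)"))  # ('FLOAT', '3.14', 4)
print(scan_number("0x1F;"))  # ('HEX', '0x1F', 4)
```

Note that in "12+3" the '+' is left in place for the next token, which
is the behaviour you lose when the rule ends in a negated set that
consumes the following character.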
> * How would you create a code generator using a tree grammar?
You make the parser output an AST, then create a tree grammar to
recognise that AST and either output the desired code directly or
use StringTemplate to do it for you.
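As a rough picture of the "output the desired code directly" route,
here is a small Python sketch of a tree walker. It only mirrors the
shape of the approach: in ANTLR the AST would be built by the parser
and the walker would be generated from the tree grammar. The tuple
layout of the AST, the function generate, and the stack-machine
instruction names (push, add, mul) are all assumptions made for the
example.

```python
def generate(node, out):
    """Walk an expression AST and append stack-machine code to out.

    A node is either ("INT", value) or (op, left, right) with op
    being "+" or "*".
    """
    kind = node[0]
    if kind == "INT":
        out.append(f"push {node[1]}")
    elif kind in ("+", "*"):
        # Emit code for both operands first, then the operator.
        generate(node[1], out)
        generate(node[2], out)
        out.append("add" if kind == "+" else "mul")
    else:
        raise ValueError(f"unknown node type: {kind!r}")
    return out


# AST for 1 + 2 * 3, as a parser with the usual precedence would build it.
ast = ("+", ("INT", 1), ("*", ("INT", 2), ("INT", 3)))
print(generate(ast, []))
# ['push 1', 'push 2', 'push 3', 'mul', 'add']
```

Each branch of the walker corresponds to one alternative of a tree
grammar rule, with the append calls standing in for the actions or
templates attached to that alternative.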
> * What would be an efficient system for entering and exiting
>contexts?
You mean like scopes? ANTLR provides stackable scopes, which are
useful for contextual information.
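The general idea of a stack of scopes can be sketched in a few lines
of Python. This is an illustration of the concept only, not ANTLR's
scope mechanism or its syntax; the class ScopeStack and its method
names are hypothetical.

```python
class ScopeStack:
    """A stack of name tables; the last frame is the innermost scope."""

    def __init__(self):
        self._frames = [{}]  # start with a single global scope

    def enter(self):
        """Push a fresh scope, e.g. on entering a block or function."""
        self._frames.append({})

    def exit(self):
        """Pop the innermost scope and return its contents."""
        if len(self._frames) == 1:
            raise RuntimeError("cannot exit the global scope")
        return self._frames.pop()

    def define(self, name, value):
        """Record a name in the innermost scope."""
        self._frames[-1][name] = value

    def lookup(self, name):
        """Search from the innermost scope outwards."""
        for frame in reversed(self._frames):
            if name in frame:
                return frame[name]
        raise KeyError(name)


scopes = ScopeStack()
scopes.define("x", "global x")
scopes.enter()
scopes.define("x", "local x")
print(scopes.lookup("x"))  # local x
scopes.exit()
print(scopes.lookup("x"))  # global x
```

Entering a context pushes a frame and leaving it pops the frame, so
whatever was recorded for the inner context disappears automatically
on exit.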