[antlr-interest] Parsing Best Practices---Where to check for predefined names
Gavin Lambert
antlr at mirality.co.nz
Tue May 5 04:52:59 PDT 2009
At 13:12 5/05/2009, Matthew M. Burke wrote:
>1) I could go with something that matches '<' ID expr* '>' and then
>in the parser action, I can test ID.text and act as appropriate
>
>or
>
>2) I could do something like
>
>lhs
> : '<' 'array' expr '>' -> ^(ARRAY_REF expr)
> | '<' 'socket' expr '>' -> ^(SOCKET_REF expr)
> | ...
> ;
>
>Is either alternative especially better than the other?
In general, option #2 is more efficient, but bear in mind that
each quoted literal introduces a new top-level lexer token, and thus
"array" will always be lexed as its own token (with an obscure
generated name), never as an ID or any other token. So if "array"
is not always a keyword in your language, you'll need a bit more
intelligence in your identifier matching (e.g. id : ID | 'array' |
'socket';), or go with option #1 instead.
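
As a rough sketch of option #2 with the extra identifier rule, assuming ANTLR v3 syntax: the grammar name, the trivial expr rule, and the ID and WS definitions below are placeholders of mine, not taken from the original grammar.

```antlr
grammar Refs;                        // hypothetical grammar name

options { output = AST; }

tokens { ARRAY_REF; SOCKET_REF; }    // imaginary nodes used by the rewrites

lhs
    : '<' 'array' expr '>'  -> ^(ARRAY_REF expr)
    | '<' 'socket' expr '>' -> ^(SOCKET_REF expr)
    ;

// Placeholder; the real expr rule is whatever the language needs.
expr : id ;

// The lexer never returns ID for "array" or "socket" once they appear as
// quoted literals, so every place that wants a plain name uses id, not ID.
id  : ID | 'array' | 'socket' ;

ID  : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;
WS  : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Note that a name matched through the 'array' alternative still carries the literal's generated token type rather than ID, so any later tree walking has to allow for that.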
(If you want to avoid the obscurely-named tokens, then you should
avoid using quoted constants in parser rules and just create the
corresponding lexer rules yourself.)
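
A fragment along these lines shows the idea, again assuming ANTLR v3; the token names ARRAY, SOCKET, LANGLE and RANGLE are just ones I picked for illustration, and ARRAY_REF, SOCKET_REF and expr are assumed to be defined as in the original grammar.

```antlr
lhs
    : LANGLE ARRAY expr RANGLE  -> ^(ARRAY_REF expr)
    | LANGLE SOCKET expr RANGLE -> ^(SOCKET_REF expr)
    ;

// Same trick as before, but with readable token names.
id  : ID | ARRAY | SOCKET ;

// Keyword rules are conventionally listed before ID, since both can
// match the same text.
ARRAY  : 'array' ;
SOCKET : 'socket' ;
LANGLE : '<' ;
RANGLE : '>' ;

ID  : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;
```

The parser behaves the same as with the quoted literals; the gain is only that the token names are ones you chose, both in the generated code and in any tree grammar that refers to them.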