[antlr-interest] Re: on parsers look and feel

Thomas Brandon tom at psy.unsw.edu.au
Wed Nov 26 17:02:53 PST 2003


Is that going to work? I think you'll need a lexer rule that 
matches those particular characters. Literal testing is only applied 
to the text a rule has matched, not to every character on the input 
stream. A rule that matched exactly the characters used as literals 
would be one more thing you'd have to maintain. So you probably want 
a rule that matches all characters that aren't matched by other 
rules, probably just everything non-alphanumeric. That is less 
maintenance, though you'd have to have your own literal matching 
that threw an exception if there was no match, or you'd have weird 
invalid tokens coming through. Then there's the issue of making that 
error informative. And when something breaks in your parser around 
those tokens, it could be weirdness related to the literals handling.
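
To make the catch-all idea concrete, here is a rough, untested 
sketch in ANTLR 2 syntax. SketchLexer, SketchParser and PUNCT are 
made-up names, and I'm assuming the parser grammar uses the quoted 
literals directly and the lexer imports its vocabulary. The action 
does the literal test by hand with testLiteralsTable and throws when 
the character is not a known literal:

```
class SketchLexer extends Lexer;
options {
    importVocab = SketchParser;  // hypothetical parser using "=", ";", ...
    testLiterals = false;        // only test literals where we ask for it
}

ID  :   ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
    ;

WS  :   (' ' | '\t' | '\r' | '\n')+  { $setType(Token.SKIP); }
    ;

// Catch-all: one ASCII punctuation character (everything printable
// that is not a letter, a digit, '_' or whitespace).
PUNCT
    :   ( '!'..'/' | ':'..'@' | '['..'^' | '`' | '{'..'~' )
        {
            // Look the matched text up in the imported literals table.
            int t = testLiteralsTable(_ttype);
            if (t == _ttype) {
                // Not a literal the parser knows about: fail here rather
                // than pass a meaningless PUNCT token to the parser.
                throw new RecognitionException(
                    "unexpected character: " + $getText);
            }
            $setType(t);
        }
    ;
```

Note this only handles single-character literals. Anything like 
"==" or "<=" would need its own handling, which is part of the 
maintenance worry.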

As to whether it's better, even without the implementation problems, 
I don't know. True, you don't have to remember the token name, but 
only in the parser. So you've got this weird stuff going on in the 
parser where these tokens don't really have labels, and you can't do 
this in your tree parsers... I guess it's partly a matter of taste, 
but I'd be worried about how maintainable any code like that is, for 
you and even more so for others.
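
For comparison, a sketch of the route you mention yourself: give 
the literal a label in the tokens section so the tree parser has a 
name to use. Again untested, and ASSIGN, SketchParser and 
SketchWalker are names I made up (the two grammars could just as 
well live in separate files):

```
class SketchParser extends Parser;
options {
    buildAST = true;
}
tokens {
    ASSIGN = "=";   // label the literal; parser rules can still write "="
}

assign
    :   ID "="^ ID ";"!
    ;

class SketchWalker extends TreeParser;
options {
    importVocab = SketchParser;
}

// "=" was made the root with ^ and ";" was dropped with !,
// so the tree to match is #( ASSIGN ID ID ).
assign
    :   #( ASSIGN ID ID )
    ;
```

Of course that brings back a token name you have to remember, at 
least in the walker.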

Tom.
--- In antlr-interest at yahoogroups.com, Cristian Amitroaie 
<cristian at a...> wrote:
> Hello guys,
> 
> Case:
>    o sometimes I kind of forget what name I gave to the "=" token from the 
> Lexer (EQ/EQUAL/EQUALS/ASSIGN) when I want to add a new rule to a parser.
>    o sometimes I get bored writing LCURLEY instead of "{" or '{'
>    o sometimes it's hard for me to follow rules full of SEMI, LCURL(E)?Y, 
> LBRACK, LPARENS and so on
> 
> For example, I would like to see my parser rules look like:
> 
> assign:
>         ID "="^ ID ";"!
>     ;
> 
> I browsed through the documentation/big examples, yet I couldn't find any 
> similar approach as a guideline or something.
> 
> Yet, it doesn't seem impossible (see the attached files).
> 
> Although the parser's token table won't have a token type attached to "=" (I 
> assume LITERAL_= is not a valid id in almost any language), it reserves a 
> number for it. Now importing the parser's vocabulary in the lexer, and leaving 
> testLiterals true (default value) it seems that the lexer's token table keeps 
> the number from the parser for "=" and adds to it a token type 
> (EQUAL/EQ/ASSIGN, oops I don't remember).
> 
> Are there any disadvantages/risks related to this approach?
> 
> Of course, in the parser, if somebody likes to build new AST nodes using "=", 
> they may attach a token type to it in the tokens section and use it...
> 
> Either I am a maniac, or the parser grammar looks much clearer to me...
> 
> And the walkers import the lexer's vocabulary (see the attached files).
> 
> Or it's just a matter of taste?
> Cristian
> 
> 
> 
> 
> ana = mihai;
> mihai = maria;
> ana = maria;


 
