[antlr-interest] Parsing whole-line comments?

John B. Brodie jbb at acm.org
Sun Jun 6 11:16:59 PDT 2010


Greetings!

On Sun, 2010-06-06 at 12:19 -0400, Christian Convey wrote:
> > Alternatively, you can apply semantic predicate to lexer rules like this:
> > ------------------------
> >
> > C:  { $pos == 0 }?=> 'C' ;
> >
> > ------------------------
> >
> > It should only match "C" at the beginning of the line, but I found (in
> > my noob experiences) semantic predicate can be pretty tricky due to
> > "hoisting out" business and how it affects prediction DFA construction -
> > I'm sure more experienced hands can tell you better.
> 
> Thanks.  But I'm actually pretty against intermixing lexical,
> grammatical, and semantic rules.  At that point (at least in my
> particular project) I've given up most of the clarity that I was
> hoping to gain by using ANTLR as opposed to a hand-written recursive
> descent parser.
> 
> I think at this point I'm just going to hand-write the parser for my
> DSL.  Thanks very much for the help.
> 

You might look at the Python lexer in the examples. It seems to me that
the Python lexer has a problem similar to yours: identifying white space
at the beginning of a line. Your case seems a little simpler, because
you seem to care about only the first letter at the beginning of the
line.

It may also help to realize that the first character of every line,
except the very first line, is preceded by a new-line character.

so:

tokens { C; E; }

......

NEWLINE : ( '\r' | '\n' )+  // for the last line....
   ( 'C' { $type = C; }
   | 'E' { $type = E; }
//..... other first-char possibilities go here
   )?  // optional, so a new-line not followed by a listed char still matches
   ;

CALL : 'CALL' ;
ID : ('a'..'z'|'A'..'Z')+ ; // or whatever

And of course create a wrapper around the input stream that supplies a
new-line as the very first character and then the actual input text as
the rest of the stream (in effect, prepend a new-line to the input).
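
A minimal sketch of such a wrapper in plain Java, assuming the input
arrives as a java.io.InputStream. The class name LeadingNewline and its
two methods are made up for illustration; only the standard library is
used, and the result would still have to be handed to whatever stream
class your lexer reads from.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ANTLR: puts one '\n' in front of the
// real input so that every line, including the first, follows a new-line.
public class LeadingNewline {

    // Returns a stream that yields '\n' first, then all bytes of 'in'.
    public static InputStream prepend(InputStream in) {
        InputStream newline =
            new ByteArrayInputStream("\n".getBytes(StandardCharsets.US_ASCII));
        return new SequenceInputStream(newline, in);
    }

    // Reads a stream to the end and decodes it as UTF-8 (for checking only).
    public static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

One side effect to keep in mind: the extra new-line shifts every
reported line number by one, so error messages would need adjusting.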

just a thought.....
   -jbb



