[antlr-interest] Order of token matching
Jim Idle
jimi at temporal-wave.com
Wed Sep 3 09:21:00 PDT 2008
On Wed, 2008-09-03 at 18:14 +0200, Jenny Balfer wrote:
> Thanks for that, but unfortunately this does not solve the problem. I
> declared MLCOM etc. as fragments, but COMMENT and IMPL cannot be fragments
> if they are to be skipped.
Have you shown all of your grammar here?
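If the grammar is otherwise as posted, one direction that may help is to let
IMPL consume comments itself, so that a brace inside a comment is never
counted. The following is only an untested sketch for ANTLR 3 with the Java
target: the grammar name BraceSkip is made up, the rewritten COMMENT and IMPL
alternatives are assumptions rather than anything from your file, and the
rules for the rest of your input are left out.

```antlr
lexer grammar BraceSkip;   // hypothetical name, for illustration only

// Helper rules: fragments never become tokens on their own.
fragment MLCOM : '/*' ;
fragment SLCOM : '//' ;
fragment RCOM  : '*/' ;

COMMENT
    : SLCOM ~('\r' | '\n')*                      {skip();}  // stops before the newline
    | MLCOM (options {greedy=false;} : .)* RCOM  {skip();}
    ;

IMPL
    : '{'
      ( IMPL                            // nested block
      | COMMENT                         // braces inside comments are consumed here
      | ~('{' | '}' | '/')              // any ordinary character
      | '/' ~('/' | '*' | '{' | '}')    // a lone slash that does not start a comment
      )*
      '}'                                        {skip();}
    ;

NL : ('\r' | '\n') {skip();} ;
WS : (' ' | '\t')  {skip();} ;
```

Once the lexer is inside IMPL it does not return to the top-level rules until
the closing brace, so the order in which the rules are listed cannot make
COMMENT win there; IMPL has to match comments explicitly. The sketch does not
handle braces inside string literals, or a '/' that sits directly before a
brace.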
Jim
>
> On Wed, 03 Sep 2008 09:05:48 -0700, Jim Idle <jimi at temporal-wave.com>
> wrote:
> > On Wed, 2008-09-03 at 18:00 +0200, Jenny Balfer wrote:
> >
> >> Hello guys,
> >>
> >> I think I understand too little about how my lexer works. I thought
> >> the rules that are specified first are matched first, but in my grammar
> >> this is not the case.
> >> What I am trying to do is first skip all comments in my source files,
> >> and then skip everything between curly braces:
> >>
> >
> >
> > Make sure that any token that you don't want returned to the parser is a
> > fragment:
> >
> > fragment
> > MLCOM : '/*' ;
> >
> > etc. Then you should have more luck. Your comment lead-ins are matching
> > the MLCOM and SLCOM rules and then likely throwing recognition errors
> > for the rest of the input up until the '{'.
> >
> > Jim
> >
> >
> >> MLCOM : '/*'
> >> ;
> >> SLCOM : '//'
> >> ;
> >> RCOM : '*/'
> >> ;
> >> NL : '\r' {skip();}
> >> | '\n' {skip();}
> >> ;
> >> WS : ' ' {$channel=HIDDEN;}
> >> | '\t' {skip();}
> >> ;
> >>
> >> COMMENT : SLCOM (options{greedy=false;}: .)* NL {skip();}
> >> | MLCOM (options{greedy=false;}: .)* RCOM {skip();}
> >> ;
> >> IMPL : '{' (IMPL|'}')* '}' {skip();}
> >> ;
> >>
> >> Rule IMPL matches everything between curly braces and keeps nested
> >> braces balanced along the way (by recursively calling itself).
> >> Now the problem appears if there are braces in comments:
> >>
> >> someFunction = function(a,b) {
> >> // this is one brace too much: {
> >> }
> >>
> >> My lexer now sees the opening brace in the comment and searches for the
> >> closing one until the end of file, which results in:
> >> mismatched character '<EOF>' expecting '}'
> >>
> >> What I want my lexer to do is first sort out all comments, and second
> >> sort out everything between curly braces. Are there any predicates that
> >> could achieve this?
> >>
> >> Thanks!