[antlr-interest] Matching tokens only at certain places

Emond Papegaaij e.papegaaij at student.utwente.nl
Mon Jun 19 05:49:35 PDT 2006


On Monday 19 June 2006 14:03, you wrote:
> On 6/19/06, Emond Papegaaij <e.papegaaij at student.utwente.nl> wrote:
> > I'm trying to parse partially language independent input. The input is in
> > fact similar to that of actions in ANTLR itself. The action tokens in
> > ANTLR all have braces around them. My tokens don't. Here is some example
> > input:
> >
> > iface public String getString() ;
> > \___/ \_______________________/\_/
> >
> > As can be seen in this example, the token can only follow an 'iface' token
> > and will not contain a ';'. This second condition is straightforward to
> > implement. The first however is more difficult. In the parser I know
> > exactly when this token should occur, but not in the lexer. Is there any
> > way to force the lexer to produce a certain token at that point and
> > nowhere else? BTW, I'm using ANTLR v3.
> >
> Could you include a demo lexer/parser, a cut-down version of the lexer and
> parser that you're working on?
>
> It seems to me that you could make 'iface', 'getstring' and 'semicolon'
> token types in the lexer.
> (IFACE, GETSTRING, SEMI, probably)
> and a production of the form
> line: IFACE GETSTRING SEMI
> in the parser.
>

Below is a very cut-down version of the lexer and parser, which (obviously) does not work, and a sample of the input it should be able to parse. The problem is that I don't know anything about the contents of METHOD_SIG_ACTION, except that it will not contain a semicolon. Creating a token that matches everything except a semicolon does not work, because ANTLR will then create that token for all input.

I need a way to specify that the METHOD_SIG_ACTION token can only follow the 'iface' token. As 'iface' is always followed by METHOD_SIG_ACTION, it is possible to specify this in the lexer (i.e. set some boolean to true after emitting an 'iface' token).


grammar TPL;

specification
        : IDENTIFIER '{' body '}'
        ;

body
        : 'iface' METHOD_SIG_ACTION ';'
        ;

IDENTIFIER: ('a'..'z'|'A'..'Z'|'_'|'$') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')* ;

WS  : ( ' '
      | '\t'
      | '\f'
        // handle newlines
      | ( '\r'    // Macintosh
        | '\n'    // Unix (the right way)
        )
      )+
      { channel=99; /*token = JavaParser.IGNORE_TOKEN;*/ }
    ;

METHOD_SIG_ACTION: (~';')+;



Input:
Printable {
  iface public String getString();
}

IndentedConstruct {
  iface protected String writeIndentation(int indentation);
}

Program {
  iface String visitStmt (int indentation);
}
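
To make the boolean-flag idea concrete, here is a minimal sketch of it as a small hand-written lexer in Python rather than as an ANTLR grammar. It is not ANTLR code and is not tested against ANTLR v3. The token names mirror the grammar above, while the PUNCT token type, the tokenize function and the expect_action flag are hypothetical names chosen for illustration only.

```python
import re

# Token type names; IFACE/IDENTIFIER/METHOD_SIG_ACTION mirror the grammar,
# PUNCT is a hypothetical catch-all for '{', '}' and ';'.
IFACE = "IFACE"
IDENTIFIER = "IDENTIFIER"
METHOD_SIG_ACTION = "METHOD_SIG_ACTION"
PUNCT = "PUNCT"

IDENT_RE = re.compile(r"[A-Za-z_$][A-Za-z_0-9$]*")
ACTION_RE = re.compile(r"[^;]+")  # everything up to (not including) ';'


def tokenize(text):
    """Return (type, text) pairs.

    METHOD_SIG_ACTION is only attempted directly after an 'iface' token,
    so the "anything but ';'" pattern cannot swallow the rest of the input.
    """
    tokens = []
    pos = 0
    expect_action = False  # set after 'iface', cleared once the action is read
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace is skipped, like the WS rule
            continue
        if expect_action:
            match = ACTION_RE.match(text, pos)
            if match is None:
                raise SyntaxError("expected a method signature at offset %d" % pos)
            tokens.append((METHOD_SIG_ACTION, match.group().strip()))
            pos = match.end()
            expect_action = False
            continue
        match = IDENT_RE.match(text, pos)
        if match:
            word = match.group()
            if word == "iface":
                tokens.append((IFACE, word))
                expect_action = True
            else:
                tokens.append((IDENTIFIER, word))
            pos = match.end()
            continue
        if text[pos] in "{};":
            tokens.append((PUNCT, text[pos]))
            pos += 1
            continue
        raise SyntaxError("unexpected character %r at offset %d" % (text[pos], pos))
    return tokens
```

Calling tokenize on the first block of the sample input yields IDENTIFIER, '{', IFACE, one METHOD_SIG_ACTION holding "public String getString()", ';' and '}'. In an ANTLR grammar the same flag would presumably have to guard the METHOD_SIG_ACTION rule, perhaps through a semantic predicate, but that part is an assumption and not something shown here.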