[antlr-interest] Matching tokens only at certain places
Emond Papegaaij
e.papegaaij at student.utwente.nl
Mon Jun 19 05:49:35 PDT 2006
On Monday 19 June 2006 14:03, you wrote:
> On 6/19/06, Emond Papegaaij <e.papegaaij at student.utwente.nl> wrote:
> > I'm trying to parse partially language independent input. The input is in
> > fact similar to that of actions in ANTLR itself. The action tokens in
> > ANTLR all have braces around them. My tokens don't. Here is some example
> > input:
> >
> > iface public String getString() ;
> > \___/ \_______________________/\_/
> >
> > As can be seen in this example, the token can only follow an 'iface' token
> > and will not contain a ';'. This second condition is straightforward to
> > implement. The first however is more difficult. In the parser I know
> > exactly when this token should occur, but not in the lexer. Is there any
> > way to force the lexer to produce a certain token at that point and
> > nowhere else? BTW, I'm using ANTLR v3.
> >
> Could you include a demo lexer/parser, a cut-down version of the lexer and
> parser that you're working on?
>
> It seems to me that you could make 'iface', 'getstring' and 'semicolon'
> token types in the lexer.
> (IFACE, GETSTRING, SEMI, probably)
> and a production of the form
> line: IFACE GETSTRING SEMI
> in the parser.
>
Below is a very cut-down version of the lexer and parser that is (obviously) not working, followed by a sample of the input it should be able to parse.

The problem is that I don't know anything about the contents of METHOD_SIG_ACTION, except that it will not contain a semicolon. Creating a token that matches everything except a semicolon does not work, because the lexer has no notion of context and will produce that token for nearly all of the input. I need a way to specify that the METHOD_SIG_ACTION token can only follow the 'iface' token. As 'iface' is always followed by METHOD_SIG_ACTION, it should be possible to express this in the lexer (i.e. set some boolean to true after emitting an 'iface' token).

grammar TPL;

specification
    : IDENTIFIER '{' body '}'
    ;

body
    : 'iface' METHOD_SIG_ACTION ';'
    ;

IDENTIFIER
    : ('a'..'z'|'A'..'Z'|'_'|'$') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
    ;

WS  : ( ' '
      | '\t'
      | '\f'
      // handle newlines
      | ( '\r' // Macintosh
        | '\n' // Unix (the right way)
        )
      )+
      { channel=99; /*token = JavaParser.IGNORE_TOKEN;*/ }
    ;

METHOD_SIG_ACTION
    : (~';')+
    ;

Input:
Printable {
iface public String getString();
}
IndentedConstruct {
iface protected String writeIndentation(int indentation);
}
Program {
iface String visitStmt (int indentation);
}
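
The boolean-flag idea mentioned above (let the lexer remember that it has just emitted 'iface') can be illustrated outside ANTLR with a small hand-written tokenizer. This is only a sketch in Python, not ANTLR v3 code. The function name tokenize, the token name IFACE and the (type, text) tuples are made up for illustration. Only IDENTIFIER, METHOD_SIG_ACTION and the character classes are taken from the grammar above.

```python
import re

# Character classes copied from the grammar above.
IDENT = re.compile(r"[A-Za-z_$][A-Za-z_0-9$]*")
ACTION = re.compile(r"[^;]+")          # METHOD_SIG_ACTION: anything but ';'
WS = re.compile(r"[ \t\f\r\n]+")


def tokenize(text):
    """Return a list of (type, text) tuples (hypothetical representation)."""
    tokens = []
    pos = 0
    after_iface = False  # the "boolean" set after emitting an 'iface' token
    while pos < len(text):
        ws = WS.match(text, pos)
        if ws:
            pos = ws.end()  # whitespace is skipped, as on the hidden channel
            continue
        if after_iface:
            # Only in this state may METHOD_SIG_ACTION be produced.
            m = ACTION.match(text, pos)
            if m is None:
                raise SyntaxError("expected a method signature after 'iface'")
            tokens.append(("METHOD_SIG_ACTION", m.group().strip()))
            pos = m.end()
            after_iface = False
            continue
        m = IDENT.match(text, pos)
        if m:
            word = m.group()
            if word == "iface":
                tokens.append(("IFACE", word))
                after_iface = True
            else:
                tokens.append(("IDENTIFIER", word))
            pos = m.end()
            continue
        if text[pos] in "{};":
            tokens.append((text[pos], text[pos]))
            pos += 1
            continue
        raise SyntaxError("unexpected character %r at offset %d" % (text[pos], pos))
    return tokens
```

Calling tokenize on the first block of the sample input yields an IDENTIFIER for "Printable", then '{', IFACE, one METHOD_SIG_ACTION holding "public String getString()", then ';' and '}'. Outside the 'iface' context the same words are lexed as ordinary identifiers, which is the behaviour I would like to get from the generated lexer.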