[antlr-interest] parsing ugly grammars
Tomasz Jastrzebski
tdjastrzebski at yahoo.com
Sat Apr 4 09:21:27 PDT 2009
Hello all,
While writing a parser for an ugly grammar, I came across a problem I do not know how to approach. Here is a sample grammar illustrating the problem:
grammar test;
program : (statement)*;
statement
: RawData
| Identifier ';'
;
RawData: 'data;' ((options {greedy=false;} : .)* ';;')? ;
Identifier : ('a'..'z')+;
WhiteSpace : (' ' | '\t' | '\r\n' | '\r')+ { $channel=HIDDEN; };
A RawData token either contains data terminated by ‘;;’ or is empty (just ‘data;’). Two sample valid inputs:
data; some raw data here;; identifier;
data; identifier;
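To make the intent concrete, here is a rough sketch of the tokenization expected for these two inputs. It is written in Python rather than ANTLR, purely as an illustration of the desired behaviour and not as a fix for the grammar. The function name tokenize and the token labels are hypothetical, and two simplifications are assumed: a bare '\n' is also treated as whitespace, and the search for ‘;;’ runs over the whole remaining input.

```python
import re

WHITESPACE = re.compile(r"[ \t\r\n]+")   # assumption: bare '\n' is skipped too
IDENTIFIER = re.compile(r"[a-z]+")

def tokenize(text):
    """Return (kind, lexeme) pairs for the intended tokenization (illustrative only)."""
    tokens = []
    pos = 0
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            # Whitespace is hidden from the parser, so just skip it.
            pos = ws.end()
            continue
        if text.startswith("data;", pos):
            # Look ahead for the ';;' terminator before committing to raw data.
            end = text.find(";;", pos + len("data;"))
            if end != -1:
                stop = end + 2            # include the ';;' terminator
            else:
                stop = pos + len("data;")  # empty RawData: just 'data;'
            tokens.append(("RawData", text[pos:stop]))
            pos = stop
            continue
        ident = IDENTIFIER.match(text, pos)
        if ident:
            tokens.append(("Identifier", ident.group()))
            pos = ident.end()
            continue
        if text[pos] == ";":
            tokens.append((";", ";"))
            pos += 1
            continue
        raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
    return tokens

if __name__ == "__main__":
    for sample in ("data; some raw data here;; identifier;", "data; identifier;"):
        print(tokenize(sample))
```

Note that with this unbounded lookahead, any later ‘;;’ in the input would cause everything after ‘data;’ up to that point to be taken as raw data, so the sketch only describes the two samples above.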
The parser does not correctly recognize the second input (mismatched character '<EOF>' expecting ';').
The parser does not realize that the text following the ‘data’ keyword is not terminated by ‘;;’, so it is not "raw data" and should be interpreted as an Identifier.
I am clueless. Could anyone help?
Thomas