[antlr-interest] Lexer issue with Python target and predicates
Laurie Harper
laurie at holoweb.net
Thu May 17 15:31:21 PDT 2007
I have a grammar for which ANTLR generates syntactically invalid Python
code. Both semantic and syntactic predicates seem to trigger the problem.
I've reduced the grammar to a minimal subset that demonstrates it:
synpred.g:
--------8<--------8<--------8<--------
grammar synpred;
options {
language=Python;
}
// matches input of the form aaa.a.a or aaa|aaa
SLASH : '\\';
DOLLAR : '$';
HASH : '#';
LCURLY : '}';
startRule : LiteralExpression+;
LiteralExpression
: { literalText=True; }
(LiteralComponent)* (DOLLAR|HASH)?
{ literalText=False; }
;
fragment
LiteralComponent
: {literalText}? => ( options { greedy=true; } : (
(SLASH) => SLASH (DOLLAR | HASH)
| (DOLLAR | HASH) => (DOLLAR | HASH) ~(LCURLY)
| ~(DOLLAR | HASH)
))+
;
--------8<--------8<--------8<--------
Generate lexer/parser:
--------8<--------8<--------8<--------
$ java org.antlr.Tool synpred.g
ANTLR Parser Generator Version 3.0b7 (April 12, 2007) 1989-2007
warning(11): internal warning: ignoring unsupported option: seperator
warning(11): internal warning: ignoring unsupported option: seperator
--------8<--------8<--------8<--------
(I don't know if those warnings are relevant; I always get them, even on
grammars which produce working parsers...)
The resulting lexer generated from this grammar contains Python
'statements' like this:
elraise NotImplementedError("eotDFAEdge")
I'm not sure why this happens or how to fix it. Manually replacing
'elraise' with 'else: raise' makes the lexer syntactically correct Python
but, with the full grammar, the generated lexer is over 28 MB of Python (!)
and can't be imported :-(
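
For anyone who wants to script the manual workaround rather than edit the file by hand, here is a minimal sketch. It only automates the 'elraise' to 'else: raise' replacement described above and then checks that the result compiles; it does not address the underlying code generation bug or the size of the output. The helper names (patch_elraise, is_valid_python) and the sample snippet are made up for illustration, and the surrounding if/else shape of the snippet is an assumption, not copied from real generated code.

```python
import re


def patch_elraise(source: str) -> str:
    """Replace the malformed 'elraise' token with 'else: raise'."""
    # \b restricts the match to the standalone token, so identifiers that
    # merely contain the letters 'elraise' are left untouched.
    return re.sub(r"\belraise\b", "else: raise", source)


def is_valid_python(source: str) -> bool:
    """Return True if the text compiles as a Python module."""
    try:
        compile(source, "<generated lexer>", "exec")
    except SyntaxError:
        return False
    return True


# Hypothetical stand-in for a fragment of the generated lexer. Only the
# 'elraise NotImplementedError("eotDFAEdge")' line is taken from the report;
# the rest is assumed for the sake of a compilable example.
broken = (
    "def edge(c):\n"
    "    if c == 1:\n"
    "        return 2\n"
    '    elraise NotImplementedError("eotDFAEdge")\n'
)

patched = patch_elraise(broken)
print(is_valid_python(broken))
print(is_valid_python(patched))
```

To apply this to a real file, read the generated lexer into a string, pass it through patch_elraise, and write it back. The output file name depends on your grammar (presumably synpredLexer.py here, but that is an assumption).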
Any help or suggestions?
L.