[antlr-interest] Common left prefix for Antlr tokens...
Stuart Dootson
stuart.dootson at gmail.com
Mon Jan 16 07:08:48 PST 2012
Hello
One of my colleagues has been using Antlr 3 to create a lexer/parser
for the L5K language (used to program Allen-Bradley PLCs). This has
generally gone well, but we have now run into a small problem.
The problem is with the array literal start token ('[') and an
'extended property' indicator ('[[[___'). More specifically, nested
arrays with no whitespace between the outer and inner array start, for
example "[[1], 2]", are interpreted by Antlr as an extended property
introduction, causing a "mismatched character" exception.
I have come up with a workaround: overriding the lexer's 'emit' and
'nextToken' methods so that the strings "[[" and "[[[" can be
converted into multiple "[" tokens by calling 'emit' in actions.
However, I was wondering whether this use case can be handled without
the extra code, through one or more options on the grammar or rules?
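
For illustration, the idea behind the workaround can be sketched in
plain Java roughly as follows. This is a simplified model, not the
actual lexer code and not the ANTLR runtime API: the class
BracketLexerSketch, its methods, and the token names COMMA and EOF are
assumptions made for this example. It only shows how 'emit' can queue
several tokens for one match and how 'nextToken' can drain that queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java model of the multi-token workaround (hypothetical names,
// NOT the ANTLR runtime API).
class BracketLexerSketch {
    private static final String EXTENDED_PROP = "[[[___";

    private final String input;
    private int pos = 0;
    // Tokens produced by a match but not yet handed to the caller.
    private final Deque<String> pending = new ArrayDeque<>();

    BracketLexerSketch(String input) {
        this.input = input;
    }

    // Models the overridden 'emit': queue the token instead of keeping one.
    private void emit(String tokenType) {
        pending.addLast(tokenType);
    }

    // Models the overridden 'nextToken': drain the queue before matching more.
    String nextToken() {
        while (pending.isEmpty()) {
            if (pos >= input.length()) {
                return "EOF";
            }
            matchOne();
        }
        return pending.removeFirst();
    }

    // Matches one lexeme at 'pos'; may emit zero, one, or several tokens.
    private void matchOne() {
        char c = input.charAt(pos);
        if (input.startsWith(EXTENDED_PROP, pos)) {
            pos += EXTENDED_PROP.length();
            emit("EXTENDED_PROP");
        } else if (c == '[') {
            // A run of '[' that is not the extended-property marker:
            // emit one LSQ per bracket.
            while (pos < input.length() && input.charAt(pos) == '['
                    && !input.startsWith(EXTENDED_PROP, pos)) {
                pos++;
                emit("LSQ");
            }
        } else if (c == ']') {
            pos++;
            emit("RSQ");
        } else if (c == ',') {
            pos++;
            emit("COMMA");
        } else if (c >= '0' && c <= '9') {
            while (pos < input.length()
                    && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
                pos++;
            }
            emit("INT");
        } else if (c == ' ' || c == '\n' || c == '\r') {
            pos++; // whitespace is skipped, nothing is emitted
        } else {
            throw new IllegalArgumentException(
                    "unexpected character '" + c + "' at " + pos);
        }
    }

    // Convenience helper: all token types up to (not including) EOF.
    static List<String> tokenize(String text) {
        BracketLexerSketch lexer = new BracketLexerSketch(text);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); !t.equals("EOF"); t = lexer.nextToken()) {
            out.add(t);
        }
        return out;
    }
}
```

With this model, BracketLexerSketch.tokenize("[[1], 2]") yields two
LSQ tokens at the start rather than an attempt at EXTENDED_PROP, while
"[[[___" on its own is still returned as a single EXTENDED_PROP token.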
A minimal Antlr grammar is appended...
Stuart Dootson
grammar arrays;

stat
    : array
    | EXTENDED_PROP
    ;

array
    : LSQ value (',' value)* RSQ
    ;

value
    : INT
    | array
    ;

INT : ('0' .. '9')+
    ;

EXTENDED_PROP
    : '[[[___'
    ;

LSQ : '['
    ;

RSQ : ']'
    ;

WS  : (' '|'\n'|'\r')+ {$channel=HIDDEN;}
    ;