[antlr-interest] Re: Recommendation for Lexer
Micheal J
open.zone at virgin.net
Thu Feb 9 12:54:20 PST 2006
> A typical rule might look like this:
>
> <STATEX> "foo" (("." {Digits}) |
> ({Digits} ("." [0-9]*)?)) [eE] [+-]? {Digits}
> { setState(STATEY); return token(FOO); }
>
> Now my problem is how to access certain parts of the match
> without re-parsing the string in the Java code part (e.g.
> ideally I'd like no indexOf(), substring() stuff but rather
> something like $1 or \1 to get the capturing groups).
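There is no general solution, but you can use lexer states to extract a
contiguous substring (much like we do to remove the '@' from verbatim strings
and identifiers): match the part you want to discard in one state, switch
states, and let the next rule match only the text you want to keep. Not that
you want any more states...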
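
To make the idea concrete, here is a rough hand-written sketch in plain Java.
It is not ANTLR or generated-lexer code; the class, state, and method names
are made up for illustration. It only shows the principle: the '@' prefix is
consumed by one state transition and dropped, so the token text collected in
the next state never contains it and no indexOf()/substring() is needed
afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class StateExtractSketch {
    // Hypothetical states, named only for this sketch.
    private enum State { DEFAULT, IN_IDENT }

    /** Returns identifier texts, with any leading '@' already stripped. */
    public static List<String> scanIdentifiers(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        State state = State.DEFAULT;
        for (int i = 0; i <= input.length(); i++) {
            // A '\0' sentinel one past the end flushes the last token.
            char c = i < input.length() ? input.charAt(i) : '\0';
            switch (state) {
                case DEFAULT:
                    if (c == '@') {
                        // Consume the prefix here and discard it.
                        state = State.IN_IDENT;
                    } else if (Character.isLetter(c) || c == '_') {
                        text.append(c);
                        state = State.IN_IDENT;
                    }
                    // Anything else is skipped in this sketch.
                    break;
                case IN_IDENT:
                    if (Character.isLetterOrDigit(c) || c == '_') {
                        text.append(c);
                    } else {
                        // End of identifier: emit only the wanted text.
                        if (text.length() > 0) {
                            tokens.add(text.toString());
                        }
                        text.setLength(0);
                        state = State.DEFAULT;
                        i--; // look at this character again in DEFAULT
                    }
                    break;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scanIdentifiers("@class foo @int"));
    }
}
```

In a generated lexer the same split would be two rules in two states, which
is exactly why it costs you an extra state per extracted piece.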
Micheal