[antlr-interest] Matching Last Line in ANTLR?
consiliens at gmail.com
Tue Aug 18 14:05:11 PDT 2009
On 09-08-18 02:22 PM, Gavin Lambert wrote:
> At 08:08 19/08/2009, consiliens at gmail.com wrote:
> >I want to use your solution, however it throws errors about "The
> >following alternatives can never be matched: 1" for MC_QUESTION
> >and MC_INCORRECT. Shouldn't the below work?
> >
> >MC_QUESTION : INT ('.'|')') .* ENDOFLINE;
> >MC_INCORRECT : LETTER '.' .* ENDOFLINE;
> >MC_CORRECT : '*' MC_INCORRECT;
> >
> >fragment ENDOFLINE : NEWLINE | { input.LA(1) == EOF }?;
>
> No. You can't use a .* wildcard loop without (a) always having at least
> one termination character and (b) specifying it inline rather than in a
> subrule.
>
> If you remove the .* (or make it more specific, eg. WS*) then it should
> work.
>
>
For testing I removed the .* and, while ANTLR no longer reports errors, the
lexer still doesn't match "b." as an MC_INCORRECT token unless a newline
follows it. In this grammar the .* is meant to match the text between the
line identifier and the end of the line, so the input could be:
1. Is ANTLR useful?
*a. True
b. False
The existing regular-expression-based parser handles many of these cases
elegantly, but I want to use another tool for language recognition. I'm
hoping this ANTLR grammar can at least reach feature parity.
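For reference, the line classification described above can be sketched with
regular expressions. This is only a minimal Python illustration of the target
behavior, not the existing parser itself: the pattern table and the
classify_lines helper are hypothetical names, and the patterns are assumed
from the token rules in this thread. The point it shows is that the last line
is recognized even when no newline follows it.

```python
import re

# Hypothetical patterns, assumed from the token rules in this thread:
# a line identifier followed by any text up to the end of the line.
TOKEN_PATTERNS = [
    ("MC_QUESTION", re.compile(r"[0-9]+[.)].*")),
    ("MC_CORRECT", re.compile(r"\*[A-Za-z]\..*")),
    ("MC_INCORRECT", re.compile(r"[A-Za-z]\..*")),
]


def classify_lines(text):
    """Return (token_name, line) pairs for every line that matches a pattern.

    str.splitlines() splits on "\n" and "\r\n" and also yields a final line
    that has no trailing newline, so end of input acts as a line end.
    Lines that match no pattern (for example blank lines) are skipped.
    """
    tokens = []
    for line in text.splitlines():
        for name, pattern in TOKEN_PATTERNS:
            if pattern.fullmatch(line):
                tokens.append((name, line))
                break
    return tokens


if __name__ == "__main__":
    # Note: no newline after the last line.
    sample = "1. Is ANTLR useful?\n*a. True\nb. False"
    for name, line in classify_lines(sample):
        print(name, repr(line))
```

Splitting into lines first sidesteps the end-of-input question entirely, which
is the behavior the ENDOFLINE rule below is trying to reproduce in the lexer.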
Sample Input:
1.
*a.
b.
Grammar:
MC_QUESTION : INT ('.'|')') ENDOFLINE;
MC_INCORRECT : LETTER '.' ENDOFLINE;
MC_CORRECT : '*' MC_INCORRECT;
fragment ENDOFLINE : NEWLINE | { input.LA(1) == EOF }?;
fragment NEWLINE : '\r'? '\n';
fragment LETTER : ('a'..'z'|'A'..'Z');
fragment INT : '0'..'9'+;