[antlr-interest] How to write this lexer rule?
chain one
chainone at gmail.com
Tue Jan 13 03:29:42 PST 2009
Hi Gavin Lambert,
Thanks for your reply.
I tried the lexer rule you gave me, but the following error comes out:
Alternative 155: after matching input such as
'F''U''N''C''T''I''O''N''F''U''N''C''T''I''O''N''F''U''N''C''T''I''O''N''E''N''D''_''F''U''N''C''T''I''O''N'{'0'..'9',
'A'..'Z', '_',
'a'..'z'}'F''U''N''C''T''I''O''N''E''N''D''_''F''U''N''C''T''I''O''N'{'0'..'9',
'A'..'Z', '_', 'a'..'z'}'F''U''N''C''T''I''O'{'\u0000'..'/', ':'..'@', 'N',
'['..'^', '`', '{'..'\uFFFF'} decision cannot predict what comes next due to
recursion overflow to FUNCTION_DECL from FUNCTION_DECL
On Tue, Jan 13, 2009 at 7:11 PM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 22:10 13/01/2009, chain one wrote:
>
>> I want to recognize a function definition and skip it before passing
>> tokens to the parser.
>> The function definition starts with "FUNCTION" and ends with "END_FUNCTION".
>>
> [...]
>
>> FUNCTION_DECL
>> : 'FUNCTION'
>> {
>> $channel=HIDDEN;
>> }
>> ( options {greedy=false;} : . )* FUNCTION_DECL
>> ( options {greedy=false;} : . )* 'END_FUNCTION' SEMI
>> ;
>>
>
> You might need to be more explicit about it:
>
> FUNCTION_DECL
> : 'FUNCTION' { $channel = HIDDEN; }
> (FUNCTION_DECL | ~'E' | 'E' ~'N' | 'EN' ~'D' | 'END' ~'_' |
> 'END_' ~'F' | 'END_F' ~'U' | 'END_FU' ~'N' | 'END_FUN' ~'C' |
> 'END_FUNC' ~'T' | 'END_FUNCT' ~'I' | 'END_FUNCTI' ~'O' |
> 'END_FUNCTIO' ~'N' | 'END_FUNCTION' ~SEMI)*
> 'END_FUNCTION' SEMI
> ;
>
> (This assumes that whitespace isn't permitted between END_FUNCTION and the
> semicolon.)
>
> Also, if you're wanting to skip over large chunks of your input, then you
> might want to investigate filtering lexers.
>
>> This also did not work :(
>>
>> fragment
>> FUNCTION:
>> 'FUNCTION'
>> ;
>>
> [...]
>
>> FUNCTION_DECL
>> :FUNCTION
>> {
>> SKIP();
>> }
>> ( ~(FUNCTION|END_FUNCTION)
>> |
>> FUNCTION_DECL
>> )* END_FUNCTION SEMI
>> ;
>>
>
> The reason why that doesn't work is that ~ can only take the inverse of
> sets, and sets in a lexer rule are alternatives of individual characters.
> FUNCTION and END_FUNCTION are not sets, they're sequences, so it's illegal
> to use ~ on them.
>
>
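For comparison, the nesting that the recursive lexer rule tries to express can also be handled with a plain depth counter outside the grammar. The following is only a rough sketch in Python, not ANTLR code and not something proposed in this thread: the helper name strip_function_blocks is hypothetical, and it assumes (as the rule above does) that END_FUNCTION is followed directly by the semicolon, that FUNCTION appears as a whole word, and that the keywords never occur inside strings or comments.

```python
import re

# Hypothetical pre-pass, not part of ANTLR: remove (possibly nested)
# FUNCTION ... END_FUNCTION; blocks from the input text.
# Assumption: no whitespace between END_FUNCTION and ';'.
# \b keeps the FUNCTION alternative from matching inside END_FUNCTION.
_KEYWORD = re.compile(r"\bEND_FUNCTION;|\bFUNCTION\b")

def strip_function_blocks(text):
    kept = []    # pieces of text that lie outside every block
    depth = 0    # current nesting level of FUNCTION blocks
    pos = 0      # end of the previously matched keyword
    for m in _KEYWORD.finditer(text):
        if depth == 0:
            # Text before this keyword is outside any block, so keep it.
            kept.append(text[pos:m.start()])
        if m.group() == "FUNCTION":
            depth += 1
        elif depth > 0:
            depth -= 1
        else:
            # Stray END_FUNCTION; with no open block: leave it in place.
            kept.append(m.group())
        pos = m.end()
    if depth == 0:
        kept.append(text[pos:])
    # If depth > 0 here, the input ended inside an unterminated block
    # and the remainder is dropped.
    return "".join(kept)
```

Counting depth explicitly avoids asking the lexer's prediction machinery to look ahead through arbitrarily deep recursion, which is what the "recursion overflow" message above is complaining about.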