[antlr-interest] ANTLR 3 Lexical States
Bertalan Fodor (LilyPondTool)
lilypondtool at organum.hu
Fri Jan 25 08:07:47 PST 2008
Yes, that's a good idea, but it doesn't solve the problem that the
state change must be done in the parser. So in the switch(state)
statement the value of state is always NORMAL, because the whole input
is lexed first, before the parser gets to run any of its actions.
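
To make the ordering problem concrete, here is a minimal plain-Java sketch.
It is not ANTLR code: the class EagerLexingDemo and all of its members are
hypothetical stand-ins, and the "lexer" simply splits on whitespace. It
only assumes what is described above, namely that all tokens are produced
before the parser runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lexer/parser pipeline; none of these names are ANTLR API.
class EagerLexingDemo {
    enum State { NORMAL, SPECIAL }
    enum Type { KEYWORD, STRING, SPECIAL_STRING }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    // Shared lexical state that the parser is supposed to flip.
    static State state = State.NORMAL;

    // The lexer action: pick the token type from the current state.
    static Token lexWord(String word) {
        if (word.equals("\\special")) return new Token(Type.KEYWORD, word);
        Type type = (state == State.SPECIAL) ? Type.SPECIAL_STRING : Type.STRING;
        return new Token(type, word);
    }

    // Phase 1: the whole input is tokenized before any parsing happens.
    static List<Token> lexAll(String input) {
        List<Token> tokens = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) tokens.add(lexWord(word));
        return tokens;
    }

    // Phase 2: the "parser" sets the state on \special, but the tokens already exist.
    static List<Token> parse(List<Token> tokens) {
        for (Token t : tokens) {
            if (t.type == Type.KEYWORD) state = State.SPECIAL;
        }
        return tokens;
    }

    static List<Token> run(String input) {
        state = State.NORMAL;
        return parse(lexAll(input));
    }

    public static void main(String[] args) {
        // The word after \special is still typed STRING, not SPECIAL_STRING.
        System.out.println(run("\\special abc"));
    }
}
```

In this sketch the state does become SPECIAL, but only after every token
has already been typed, which is the situation described above.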
Now I'm thinking of the following possibilities:
- Harald Müller's lexing parser - as far as I can see, it currently doesn't
work with overlapping lexer rules, e.g. if in the example below STRING is
'a'..'z' and SPECIAL_STRING is '<'|'a'
- David Holroyd's lazy token stream - with which I see the problem that
it lazily loads the tokens from the source, but not from the source, so
I may not be able to change the token type according to lexical state
- handling all lexer-state-pushing situations as recursively embedded
island-grammars - the problem is that these islands actually can be
infinitely embedded in each other.
- going back to Antlr 2
- writing the lexer with JFlex
I really don't want the last 2 possibilities, so I'm very curious whether
there are any good approaches for my grammar.
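
For comparison, this is roughly the behaviour I would need from an
on-demand token source, again as a plain-Java sketch rather than ANTLR
code. PullLexer and the other names are hypothetical, and this is not
David Holroyd's implementation. It assumes the parser pulls exactly one
token at a time with no lookahead; with lookahead, tokens may already
have been lexed before the state-changing action runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pull-based pipeline; none of these names are ANTLR API.
class OnDemandLexingDemo {
    enum State { NORMAL, SPECIAL }
    enum Type { KEYWORD, STRING, SPECIAL_STRING }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    // Produces one token per call to next(), consulting the state at that moment.
    static final class PullLexer {
        private final String[] words;
        private int pos = 0;
        State state = State.NORMAL;

        PullLexer(String input) { words = input.trim().split("\\s+"); }

        boolean hasNext() { return pos < words.length; }

        Token next() {
            String word = words[pos++];
            if (word.equals("\\special")) return new Token(Type.KEYWORD, word);
            Type type = (state == State.SPECIAL) ? Type.SPECIAL_STRING : Type.STRING;
            return new Token(type, word);
        }
    }

    // The "parser" pulls tokens one by one and flips the lexer state on \special.
    static List<Token> parse(String input) {
        PullLexer lexer = new PullLexer(input);
        List<Token> seen = new ArrayList<>();
        while (lexer.hasNext()) {
            Token t = lexer.next();
            seen.add(t);
            if (t.type == Type.KEYWORD) lexer.state = State.SPECIAL;
        }
        return seen;
    }

    public static void main(String[] args) {
        // The word after \special is lexed after the state change.
        System.out.println(parse("abc \\special def"));
    }
}
```

Here the word before \special is typed STRING and the word after it
SPECIAL_STRING, because each token is only lexed when the parser asks
for it.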
Bert
Jim Idle wrote:
> It will be a lot more readable, and generate better code if you do this
> instead:
>
> 1) Create fragment rules for your tokens that have the same pattern. In
> fact you can just use the tokens {} section to create the token types,
> but then ANTLR will give you warnings that there is no token called XYZ
> when you try to use this type in the lexer. As I hate warnings, I use
> fragment tokens. You won't use them for matching (usually) so they don't
> actually have to match the pattern they represent, but it is probably
> good to document this if they don't!!
>
> 2) Use one pattern match for all the tokens that clash, then change the
> type according to the context:
>
>
> fragment STRING : LETTERS ;
> fragment SPECIAL_STRING : LETTERS ;
>
> STRINGS:
> LETTERS
>
> {
> switch (state)
> {
> case States.NORMAL:
>
> $type = STRING;
> break;
>
> case States.SPECIAL:
> $type = SPECIAL_STRING;
> ...
> }
> }
> ;
>
>
> And so on.
>
> Jim
>
>
>> -----Original Message-----
>> From: Bertalan Fodor (LilyPondTool) [mailto:lilypondtool at organum.hu]
>> Sent: Friday, January 25, 2008 2:21 AM
>> To: Antlr Interest
>> Subject: [antlr-interest] ANTLR 3 Lexical States
>>
>> My Antlr grammar I'm migrating to Antlr 3 heavily uses lexical states,
>> that is, the Lexer has lots of semantic predicates to distinguish
>> between alternatives like this
>> STRING: {inState(States.NORMAL)}? LETTER+
>> SPECIAL_STRING: {inState(States.SPECIAL)}? LETTER+
>>
>> The states are set during the parse process, like this
>> special_handling: "\special" { setState(States.SPECIAL); }
>> SPECIAL_STRING;
>>
>> It worked perfectly well in Antlr 2. However, now I'm a bit afraid that
>> the Antlr 3 style lexing will make this not work.
>>
>> What do you think?
>>
>> Thank you,
>>
>> Bertalan Fodor
>>
>> --
>> LilyPondTool is the editor for LilyPond files.
>> See http://lilypondtool.organum.hu
--
LilyPondTool is the editor for LilyPond files.
See http://lilypondtool.organum.hu