[antlr-interest] Recognizing Indentation as blocks
Daniels, Troy (US SSA)
troy.daniels at baesystems.com
Wed Mar 26 13:21:02 PDT 2008
Not having the book, I can't look at the grammar. But I'd guess you'd want something like:
CHANGE_INDENTATION
    : EOL ws+=WHITE_SPACE*
      {
        // Compare this line's leading whitespace with the previous line's.
        if (sizeOf(ws) > previousWhiteSpace)
            emit(INDENT);
        else if (sizeOf(ws) < previousWhiteSpace)
            emit(DEDENT);
        previousWhiteSpace = sizeOf(ws);
      }
    ;
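Basically, when you find the end-of-line character, you look at the whitespace that follows it and emit the appropriate token if the amount has changed. Since WHITE_SPACE has a * after it, the rule matches even when there is no whitespace at all, which covers a line like "g()" in your example. Since the rule starts with EOL, you don't need to worry about false triggers in the middle of a line, which a bare WHITE_SPACE* rule would produce. Note that this emits at most one DEDENT per line; closing several blocks at once would need a stack of the previous indentation widths rather than a single number.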
I'm not familiar with the API for emitting tokens, so the details of the above code are almost certainly wrong, but the general concept should be right.
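To make the idea concrete outside of ANTLR, here is a minimal sketch in plain Python of the same bookkeeping, extended with a stack so that one line can close several blocks. The function name indentation_tokens and the LINE token type are made up for illustration and are not part of any ANTLR API. It assumes indentation uses spaces only and does not diagnose inconsistent dedents.

```python
from typing import Iterator, List, Tuple

Token = Tuple[str, str]  # (token type, text)


def indentation_tokens(source: str) -> Iterator[Token]:
    """Yield LINE tokens plus imaginary INDENT/DEDENT tokens.

    Hypothetical helper for illustration only, not ANTLR API.
    Assumes space-only indentation; inconsistent dedents are not reported.
    """
    levels: List[int] = [0]  # stack of currently open indentation widths
    for line in source.splitlines():
        text = line.lstrip(" ")
        if not text:
            continue  # blank lines do not change the block structure
        # The width may be 0, just as WHITE_SPACE* may match nothing.
        width = len(line) - len(text)
        if width > levels[-1]:
            levels.append(width)
            yield ("INDENT", "")
        while width < levels[-1]:
            # One DEDENT per block that this line closes.
            levels.pop()
            yield ("DEDENT", "")
        yield ("LINE", text)
    # Close any blocks still open at end of input.
    while len(levels) > 1:
        levels.pop()
        yield ("DEDENT", "")


if __name__ == "__main__":
    sample = 'if foo:\n    print "foo is true"\n    f()\ng()\n'
    for token in indentation_tokens(sample):
        print(token)
```

Run on the example from the question, the unindented "g()" line has a width of 0, which is smaller than the open level of 4, so a DEDENT is produced before it even though no whitespace was matched.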
Troy
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Sven Busse
> Sent: Wednesday, March 26, 2008 3:57 PM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Recognizing Indentation as blocks
>
> Um, does anybody have an idea?
>
> thanks
> Sven
>
> ----------
>
> From: Sven Busse [mailto:mail at ghost23.de]
> Sent: Monday, March 24, 2008 11:56
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Recognizing Indentation as blocks
>
> Hi,
>
> I am currently reading Terence's book and am at the
> chapter "Emitting more than one token per Lexer rule". He
> gives an example from Python:
>
> if foo:
>     print "foo is true"
>     f()
> g()
>
> He then discusses an example INDENT lexer rule, which I am
> trying to understand.
>
> His INDENT rule aims to match spaces and tabs if they
> start at the beginning of the line. If the indentation is
> larger than in the previous line, an imaginary INDENT token is
> emitted. If it is smaller than in the previous line, one or
> more DEDENT tokens are emitted.
>
> Now my question is, would this actually work with an example
> like the little Python script? The line with "g()"
> has no whitespace at all, so I would assume there
> would be no match and thus the logic for emitting DEDENT would
> not even be invoked.
>
> Is this correct or am I missing something? I am referring to
> the book "The Definitive ANTLR Reference", page 95.
>
> Thank you
> Sven