[antlr-interest] Whitespace: More than meets the eye?
Sam Barnett-Cormack
s.barnett-cormack at lancaster.ac.uk
Wed Aug 5 22:59:39 PDT 2009
Graham Wideman wrote:
> Ah-hah -- OK, time for slap on the forehead. (Mine! It must be the >100
> degree weather here.)
>
> Thanks for your answers. Yes, of COURSE it works as you say. Somehow,
> after not really worrying about how the lexer works, my brain got stuck
> thinking that lexer and parser work more analogously than they actually do.
>
> Whereas at any juncture the parser only tries certain rules as predicted
> by the grammar and the current state, the lexer effectively "tries all
> its rules" every time it's starting to discern the next token.
>
> So in that process, if the next characters match a rule that discards
> the characters, (a la whitespace), then that pattern functions as an
> optional separator.
>
> And I also see that in order to do anything with whitespace at the
> parser level, either whitespace has to not be discarded (in which case
> many parser rules will have to deal with it) or custom code will need to
> be included in the relevant rules to look at the hidden channel etc.
Don't forget that what counts as 'whitespace' is arbitrary - you could
consider spaces to be whitespace, but not, say, tabs or newlines. I
believe there are languages where this is the case - spaces are never
significant, but some other types of whitespace are.
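
As a rough sketch of that idea (in Python rather than an ANTLR grammar,
and with token names I have made up purely for illustration), a lexer
can discard runs of spaces while still emitting tabs and newlines as
ordinary tokens. Note that Python's regex alternation takes the first
alternative that matches, which is not exactly how ANTLR chooses between
lexer rules, so treat this only as a model of the "discard some
whitespace, keep the rest" behaviour:

```python
import re

# Hypothetical token set, for illustration only: spaces are thrown away,
# while tabs and newlines are kept as real tokens.
TOKEN_SPEC = [
    ("SPACE", r" +"),
    ("TAB", r"\t"),
    ("NEWLINE", r"\r?\n"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("NUMBER", r"[0-9]+"),
    ("OP", r"[=+*/()-]"),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Return (type, text) pairs, silently dropping SPACE matches."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character %r at %d" % (text[pos], pos))
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("a = 1\n\tb"))
```

Here the parser would never see the spaces around "=", but it would
still have to deal with the NEWLINE and TAB tokens.
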
I'm curious as to why you want to sometimes consider whitespace, though.
Is this a self-designed language, or a specification you're working from
that makes whitespace 'sometimes' significant?
Your example was a function call or declaration. If the only cases are
ones where there *must* be a space and ones where there *mustn't* be a
space, you can always get help from the lexer here: have tokens that
include the lparen.
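
To make that concrete, here is a minimal sketch, again in Python as a
stand-in for a real ANTLR lexer. The token names (ID_LPAREN and so on)
are hypothetical, and I am assuming that a name immediately followed by
"(" should be treated differently from a name, a space, and then "(".
The combined token is listed first so that it wins whenever the paren is
glued to the name:

```python
import re

# Hypothetical tokens: ID_LPAREN is a name with "(" attached (no space).
# It is listed before ID so the glued form is tried first.
TOKEN_SPEC = [
    ("SPACE", r" +"),
    ("ID_LPAREN", r"[A-Za-z_][A-Za-z_0-9]*\("),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize_call(text):
    """Return (type, text) pairs; spaces are discarded after matching."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character %r at %d" % (text[pos], pos))
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize_call("f(x)"))   # name glued to the paren
print(tokenize_call("f (x)"))  # name, space, then paren
```

The space itself is still thrown away, but the parser can tell the two
forms apart from the token types alone, so no parser rule has to look at
whitespace or the hidden channel.
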
--
Sam Barnett-Cormack