[antlr-interest] related question: multiple grammars..
Eitan Suez
eitan.suez at gmail.com
Thu Mar 29 13:36:14 PDT 2007
hi all,
i apologize: i was not careful and sent a couple of replies
to individuals when i meant to send them to this thread
on the mailing list.
below are my follow-up thoughts on this subject.
/ eitan
------
Gavin Lambert wrote:
> As far as I am aware, whitespace isn't really significant anywhere
> in CSS -- it merely divides tokens but is not otherwise important,
> which is exactly the same as most other languages. Hence, just
> ignore it all the time (either by skipping it entirely or
> assigning it to an alternate channel).
you're correct.
here is an illustration of my problem and why i think
whitespace significance plays a role:
length returns [Length value]
    : INT Unit
      {
        $value = new Length(Integer.parseInt($INT.text), $Unit.text);
      }
    ;
the rule matches "4 px" but not "4px", and "4px" is what i want
it to match. i believe that if whitespace were significant (not
skipped or hidden) then the rule would match "4px".
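for comparison, here is a minimal sketch in plain java (not antlr-generated code) of a tokenizer that skips whitespace. the class name SkippingTokenizer, the token pattern and the list of units are all assumptions made for illustration. under those assumptions "4px" and "4 px" produce the same INT, Unit sequence, which suggests that skipping whitespace is not by itself what prevents the match; some other lexer rule claiming "4px" or "px" would be one possible cause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy tokenizer for illustration only; it is not ANTLR output.
final class SkippingTokenizer {
    // Group 1 = INT, group 2 = Unit, group 3 = whitespace.
    // The set of units listed here is an assumption.
    private static final Pattern TOKEN =
        Pattern.compile("(\\d+)|(px|em|pt|%)|([ \\t\\f]+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Try to match exactly one token starting at the current position.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at " + pos);
            }
            if (m.group(1) != null) {
                tokens.add("INT:" + m.group(1));
            } else if (m.group(2) != null) {
                tokens.add("Unit:" + m.group(2));
            }
            // Group 3 (whitespace) is skipped: no token is emitted for it.
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("4px"));   // prints [INT:4, Unit:px]
        System.out.println(tokenize("4 px"));  // prints [INT:4, Unit:px]
    }
}
```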
-----
i've been thinking about this and there's something fundamental
that i don't understand.
changing the rule to a Lexer rule: "Length", which must return
a single token, resolves the problem. but lexer rules cannot
return custom objects. i'm at a point in this rule where i've
parsed, say, that '4px' value and i have the parts, and i want
to now construct a corresponding object from the parts, not
just match the result. if i just match the result, i end up with
'4px' as the value of my Token.text and then i have to parse
it again later to extract the parts (say in java using regex),
which seems to defeat the purpose of writing a parser.
anyone want to clue me in on what's wrong with my above
line of thinking?
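to make the "parse it again later" step concrete, here is a minimal sketch of splitting the token text with a java regex. the Length class below is a hypothetical stand-in with the same (int, String) constructor used in the rule above, and LengthSplitter and its pattern are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Length class used in the grammar action.
final class Length {
    final int value;
    final String unit;

    Length(int value, String unit) {
        this.value = value;
        this.unit = unit;
    }
}

// Hypothetical helper that re-parses the text of a single Length token.
final class LengthSplitter {
    // Digits followed by a unit made of letters or '%'; the pattern is an assumption.
    private static final Pattern LENGTH = Pattern.compile("(\\d+)([a-zA-Z%]+)");

    static Length split(String tokenText) {
        Matcher m = LENGTH.matcher(tokenText);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a length: " + tokenText);
        }
        return new Length(Integer.parseInt(m.group(1)), m.group(2));
    }

    public static void main(String[] args) {
        Length len = split("4px");
        System.out.println(len.value + " " + len.unit);  // prints "4 px"
    }
}
```

this works, but it repeats in java the same digits-then-unit structure the lexer rule already recognized, which is the duplication described above.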
-----
Robert Hill wrote:
> Although I'm not sure you need to go to all that trouble
> just to turn whitespace consumption on/off - I
> did it with an earlier version like this
>
> (in lexer)<snip>
> { public bool IgnoreWhitespace = true; }
> WS : ('\u000c' | ' ' | '\t' ) { if (IgnoreWhitespace) $setType(Token.SKIP); } ;
> <snip>
i suppose that brings up the next question: how to define a global in
antlr v3.
i can define the boolean in @members, but then it's visible only in the
parser and not in the lexer (and vice versa). i tried using a global scope
but ran into the same problem: the scope's stack is visible only in the
parser and not the lexer.
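one general way around this kind of visibility problem is to keep the flag in a small object that both sides hold a reference to. the sketch below is plain java with toy classes (SharedState and ToyLexer are hypothetical names, not antlr api); in an antlr v3 grammar the equivalent would presumably be a field declared on each side, with both pointed at the same object before parsing starts, but that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for state that both lexer and parser need to see.
final class SharedState {
    boolean ignoreWhitespace = true;
}

// Toy lexer for illustration only; it splits input on whitespace and
// consults the shared flag to decide whether whitespace becomes a token.
final class ToyLexer {
    private final SharedState state;

    ToyLexer(SharedState state) {
        this.state = state;
    }

    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : input.toCharArray()) {
            boolean isWhitespace = c == ' ' || c == '\t' || c == '\f';
            if (isWhitespace) {
                // End the current word, if any.
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                // Emit the whitespace only when it is not being ignored.
                if (!state.ignoreWhitespace) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }
}

final class SharedStateDemo {
    public static void main(String[] args) {
        SharedState state = new SharedState();
        ToyLexer lexer = new ToyLexer(state);
        System.out.println(lexer.tokenize("4 px").size());  // prints 2
        // Whoever else holds the same SharedState (for example a parser) can flip the flag.
        state.ignoreWhitespace = false;
        System.out.println(lexer.tokenize("4 px").size());  // prints 3
    }
}
```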
thanks, / eitan