[antlr-interest] related question: multiple grammars..

Eitan Suez eitan.suez at gmail.com
Thu Mar 29 13:36:14 PDT 2007


hi all,
  i apologize: i was careless and sent a couple of replies
  to individuals when i meant to send them to this thread
  on the mailing list.

  below are my followup thoughts on this subject..
/ eitan
------
Gavin Lambert wrote:
>  As far as I am aware, whitespace isn't really significant anywhere
>  in CSS -- it merely divides tokens but is not otherwise important,
>  which is exactly the same as most other languages.  Hence, just
>  ignore it all the time (either by skipping it entirely or
>  assigning it to an alternate channel).

  you're correct.
here is an illustration of my problem and why i think
whitespace significance plays a role:

length returns [Length value]
 : INT Unit
 {
   $value = new Length(Integer.parseInt($INT.text), $Unit.text);
 }
 ;

the rule matches "4 px" but not "4px", and "4px" is what i want
to match.  i believe that if whitespace were significant (not
skipped or hidden) then the rule would match.

-----
i've been thinking about this and there's something fundamental
that i don't understand.

changing the rule to a Lexer rule:  "Length", which must return
a single token, resolves the problem.  but lexer rules cannot
return custom objects.  i'm at a point in this rule where i've
parsed, say, that '4px' value and i have the parts, and i want
to now construct a corresponding object from the parts, not
just match the result.  if i just match the result, i end up with
'4px' as the value of my Token.text and then i have to parse
it again later to extract the parts (say in java using regex),
which seems to defeat the purpose of writing a parser.
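
to make that duplication concrete, here is a minimal sketch of the
second parse in java.  the Length class below is a hypothetical
stand-in for the one my grammar action constructs, and the pattern
assumes an integer followed directly by a letters-only unit; neither
is taken from a real css grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Length class built in the grammar action.
final class Length {
    // Digits followed immediately by a letters-only unit, e.g. "4px".
    private static final Pattern LENGTH = Pattern.compile("(\\d+)([A-Za-z]+)");

    final int magnitude;
    final String unit;

    Length(int magnitude, String unit) {
        this.magnitude = magnitude;
        this.unit = unit;
    }

    // Re-parse the text of a single lexer token such as "4px".
    static Length fromTokenText(String text) {
        Matcher m = LENGTH.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a length: " + text);
        }
        return new Length(Integer.parseInt(m.group(1)), m.group(2));
    }
}
```

Length.fromTokenText("4px") gives magnitude 4 and unit "px", while
"4 px" is rejected.  the regex repeats structure the lexer rule has
already recognized, which is the part that bothers me.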

anyone want to clue me in on what's wrong with my above
line of thinking?

-----

Robert Hill wrote:
>  Although I'm not sure you need to go to all that trouble
>  just to turn whitespace consumption on/off. I did it
>  with an earlier version like this:
>
>  (in lexer)<snip>
>  { public bool IgnoreWhitespace = true; }
>  WS : ('\u000c' | ' ' | '\t' ) { if (IgnoreWhitespace) $setType(Token.SKIP); } ;
>  <snip>

i suppose that brings up this next question:  how to define a global
in antlr v3?  i can define the boolean in @members but then it's
visible only in the parser and not the lexer (and vice versa).  i tried
using a global scope but ran into the same problem:  the scope's stack
is visible only in the parser and not the lexer.
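
one workaround i can think of, sketched here and not tested against
generated code:  in a combined grammar @members goes into the
generated parser and @lexer::members into the generated lexer, so
neither sees the other's fields, but both can refer to a separate
java class.  LexerFlags and its field are hypothetical names.

```java
// Hypothetical holder for state that both the generated lexer and the
// generated parser can reach, since neither sees the other's members.
final class LexerFlags {
    private LexerFlags() {}

    // Static, so it is shared by every lexer/parser instance in the JVM;
    // that is an assumption that only suits single-threaded use.
    static boolean ignoreWhitespace = true;
}
```

actions in either the lexer or the parser could then read or set
LexerFlags.ignoreWhitespace.  one caveat:  the token stream may
tokenize ahead of the parser, so a flag flipped in a parser action
may not affect the tokens one expects.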



thanks, / eitan