[antlr-interest] Lexing colons again (consume and getColumn)

Anthony Youngman Anthony.Youngman at ECA-International.com
Fri May 14 05:45:48 PDT 2004


It probably is familiar ... although I don't think I've literally copied
anything.

As for "is truncated to the original length dropping the trailing
colon", why then does my tracking code in the parser (this code is in
the lexer) still have the colon as part of the text? The parser displays
the label text as "123:" or "456", which matches exactly what I've got
in my source. (Actually, I might decide to go the other way, and
forcibly add a colon, but that's going to be fun too :-)

And reading the UniVerse manual on labels it says "A statement label can
be put either in front of a BASIC statement or on its own line. The
label must be first on the line - that is it cannot start with a space."

I read that as meaning that a label MUST start in column 1 of a line ...
(although I admit it is ambiguous).
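
For illustration, here is a minimal stand-alone sketch of that column-1
reading of the manual. It is plain Java, not ANTLR code; the class and
method names are made up, and the rule that a purely numeric label needs
no colon while a named label does is an assumption taken from the lexer
fragments quoted further down, not from the UniVerse manual.

```java
// Hypothetical helper: decides whether a source line starts with a label.
final class ColumnOneLabelDemo {

    // Returns the label text without its colon, or null if the line has no label.
    static String labelOf(String line) {
        // A label must start in column 1, so an empty line or a leading
        // space (or any other non-alphanumeric character) rules it out.
        if (line.isEmpty() || !Character.isLetterOrDigit(line.charAt(0))) {
            return null;
        }
        int end = 0;
        while (end < line.length() && Character.isLetterOrDigit(line.charAt(end))) {
            end++;
        }
        String word = line.substring(0, end);
        boolean colon = end < line.length() && line.charAt(end) == ':';
        boolean numeric = word.chars().allMatch(Character::isDigit);
        // Assumption: numeric labels may omit the colon, named labels may not.
        return (numeric || colon) ? word : null;
    }

    public static void main(String[] args) {
        System.out.println(labelOf("123: PRINT X"));
        System.out.println(labelOf(" 123: PRINT X"));
    }
}
```

Under this reading an indented "123:" is not a label at all, which is
exactly the point on which the two interpretations in this thread differ.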

If other MVBasics allow labels in the middle of a line, I don't care.
None of the ones I know permit it ... And I'm inclined to follow the
Pr1me engineering philosophy - if it don't make sense, leave it out. Like
I intend to do with implicit formats! If someone else wants it bad
enough, they can add it themselves :-)

Cheers,
Wol

-----Original Message-----
From: Robert Colquhoun [mailto:rjc at trump.net.au] 
Sent: 14 May 2004 13:05
To: antlr-interest at yahoogroups.com
Subject: Re: [antlr-interest] Lexing colons again (consume and getColumn)

Hello,

At 09:23 PM 14/05/2004, Anthony Youngman wrote:
>Two little bits of lexer ...
>
>NUMBER_LITERAL
>        : {getColumn() == 1}? INT
>                { if (LA(1) == ':') consume(); _ttype = LABEL; }
>        | ( INT (DECIMAL (INT)?)? | DECIMAL INT )
>        ;
>
>and
>
>IDENT
>        : ( ALPHA ( ALPHA|NUMERIC|'.'|'$'|'%'|'_')* )
>                {
>                        if (state == STATEMENT) {
>                                if (LA(1) == ':' && getColumn() == 1) {
>                                        int len=text.length();
>                                        consume();
>                                        text.setLength(len);
>                                        _ttype = LABEL;
>                                        state = STATEMENT;

Above looks vaguely familiar in a strange kind of way, except for the 
getColumn() calls.

An original copy of the full working grammar is available here:
 
http://cvs.sourceforge.net/viewcvs.py/maverick/maverick/src/org/maverickdbms/tools/BASIC.g?view=markup


>In both cases, having tested for the colon, I want to throw it away, as
>it is sometimes optional (even after IDENT!) so best ignored. "consume"
>seems to add it to the token currently being processed. What do I call
>instead?

Look again! The token text is stored in the "text" variable, which after
the consume() statement is truncated to the original length dropping the
trailing colon.
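
The idiom can be modelled outside ANTLR. The sketch below is plain Java
with made-up names (LabelTextDemo, la1, matchLabel); it only assumes what
the quoted action relies on, namely that consume() appends the current
character to the token text and that the text can be cut back to a saved
length afterwards. One thing that may be worth checking: the
NUMBER_LITERAL action quoted above calls consume() without the
text.setLength() step, which would be consistent with numeric labels
still showing up as "123:".

```java
// Stand-alone model of the consume()/setLength() idiom; not ANTLR code.
final class LabelTextDemo {
    private final String input;                              // characters to lex
    private int pos = 0;                                     // next unread index
    private final StringBuilder text = new StringBuilder();  // token text so far

    LabelTextDemo(String input) {
        this.input = input;
    }

    // Look at the next character without consuming it ('\0' at end of input).
    char la1() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    // Modelled on the lexer's consume(): append the character, then advance.
    void consume() {
        text.append(la1());
        pos++;
    }

    // Match digits, then swallow an optional trailing ':' without keeping it.
    String matchLabel() {
        while (Character.isDigit(la1())) {
            consume();
        }
        if (la1() == ':') {
            int len = text.length();  // length before the colon is appended
            consume();                // the colon lands in the text here
            text.setLength(len);      // cut it off again; input stays advanced
        }
        return text.toString();
    }

    int position() {
        return pos;
    }

    public static void main(String[] args) {
        LabelTextDemo demo = new LabelTextDemo("123: PRINT X");
        System.out.println(demo.matchLabel() + " / next index " + demo.position());
    }
}
```

The colon is removed from the input (the position moves past it) but
never survives in the token text, so no separate "skip" call is needed.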

>And for IDENT, I want to get the token's starting column. I thought
>"getColumn" was wrong (it feels wrong and doesn't seem to work), so is
>there a $getColumn, and is that what I'm looking for?

I would not touch getColumn() for this: in the target language, labels
can be indented by white space or appear after ';' in the middle of
lines. The 'state' variable is there to determine when you can and
cannot match a label.
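
As a rough illustration of the state-based approach, here is a small
stand-alone Java sketch. It is not the code from BASIC.g: the names
(LabelStateDemo, State, findLabels) are hypothetical, there are only two
states, and it assumes ';' separates statements and that ':' elsewhere
is an ordinary operator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a lexer 'state' flag that gates label matching.
final class LabelStateDemo {
    enum State { STATEMENT, EXPRESSION }

    // Returns the labels found in one source line, in order of appearance.
    static List<String> findLabels(String line) {
        List<String> labels = new ArrayList<>();
        State state = State.STATEMENT;  // a line begins at a statement boundary
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == ';') {
                state = State.STATEMENT;  // a new statement may start here
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;                      // white space never changes the state
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < line.length() && Character.isLetterOrDigit(line.charAt(i))) {
                    i++;
                }
                boolean colon = i < line.length() && line.charAt(i) == ':';
                if (state == State.STATEMENT && colon) {
                    labels.add(line.substring(start, i));
                    i++;                  // swallow the colon; a statement may follow
                } else {
                    state = State.EXPRESSION;
                }
            } else {
                state = State.EXPRESSION; // operators and other punctuation
                i++;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(findLabels("  LOOP: X = A:B; 20: GOTO LOOP"));
    }
}
```

Because the decision depends on the state rather than on the column,
the indented LOOP: and the 20: after ';' are both accepted, while the
colon in A:B is left alone.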

>Note to Ter - reading the lexer section of the manual, getColumn is only
>mentioned in passing, and while there's a table of various functions,
>the fact that getColumn is missing means it's obviously incomplete. Is
>there other stuff missing?

Try running 'javap -classpath antlr.jar antlr.Token' to see what's
available in the token class. If you have the full source, look in
antlr\Token.java for more clues.

  - Robert



 



 