[antlr-interest] Lexing colons again (consume and getColumn)

Robert Colquhoun rjc at trump.net.au
Fri May 14 05:04:53 PDT 2004


Hello,

At 09:23 PM 14/05/2004, Anthony Youngman wrote:
>Two little bits of lexer ...
>
>NUMBER_LITERAL  : {getColumn() == 1}? INT {if (LA(1) == ':') consume(); _ttype = LABEL;}
>                | ( INT (DECIMAL (INT)?)? | DECIMAL INT ) ;
>
>and
>
>IDENT
>    : ( ALPHA ( ALPHA|NUMERIC|'.'|'$'|'%'|'_')* )
>        {
>            if (state == STATEMENT) {
>                if (LA(1) == ':' && getColumn() == 1) {
>                    int len=text.length();
>                    consume();
>                    text.setLength(len);
>                    _ttype = LABEL;
>                    state = STATEMENT;

The above looks vaguely familiar, in a strange kind of way, except for the
getColumn() calls.

An original copy of the full working grammar is available here:
         http://cvs.sourceforge.net/viewcvs.py/maverick/maverick/src/org/maverickdbms/tools/BASIC.g?view=markup


>In both cases, having tested for the colon, I want to throw it away, as
>it is sometimes optional (even after IDENT!) so best ignored. "consume"
>seems to add it to the token currently being processed. What do I call
>instead?

Look again! The token text is stored in the "text" variable. The action
saves its length before calling consume() and then truncates it back to
that length, which drops the trailing colon that consume() appended.
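
To make the trick concrete, here is a minimal stand-alone Java sketch of
what that action does. It is not ANTLR code: the class TextTrimDemo, its
"text" buffer, LA1(), consume() and position() are hypothetical stand-ins
that only imitate a lexer appending each consumed character to the current
token text.

```java
// Hypothetical stand-in for a lexer; this is NOT the ANTLR API.
class TextTrimDemo {
    private final String input;
    private int pos = 0;
    private final StringBuffer text = new StringBuffer();

    TextTrimDemo(String input) { this.input = input; }

    // Peek at the next character without consuming it ('\0' at end of input).
    char LA1() { return pos < input.length() ? input.charAt(pos) : '\0'; }

    // Consuming a character also appends it to the current token text.
    void consume() { text.append(input.charAt(pos)); pos++; }

    // Number of characters consumed from the input so far.
    int position() { return pos; }

    // Match letters/digits; if a ':' follows, consume it but keep it out of the text.
    String matchIdent() {
        text.setLength(0);
        while (Character.isLetterOrDigit(LA1())) consume();
        if (LA1() == ':') {
            int len = text.length(); // remember the length before the colon
            consume();               // input moves past ':', text now ends in ':'
            text.setLength(len);     // truncate the text back, dropping the ':'
        }
        return text.toString();
    }

    public static void main(String[] args) {
        TextTrimDemo lexer = new TextTrimDemo("LOOP: X = 1");
        System.out.println(lexer.matchIdent() + " " + lexer.position());
    }
}
```

With the input "LOOP: X = 1", matchIdent() returns "LOOP" and the position
ends up at 5, i.e. the colon has been eaten from the input but is not part
of the token text.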

>And for IDENT, I want to get the token's starting column. I thought
>"getColumn" was wrong (it feels wrong and doesn't seem to work), so is
>there a $getColumn, and is that what I'm looking for?

I would not touch getColumn() for this. In the target language, labels can
be indented by white space or appear after a ';' in the middle of a line.
The 'state' variable is there to determine when you can and cannot match a
label.
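
As a rough illustration of why a state flag works where a column test does
not, here is a small stand-alone Java sketch. It is an assumption-laden toy,
not the grammar's logic: only the names "state" and STATEMENT come from the
action above, while LabelStateDemo, OTHER and findLabels() are made up, and
identifiers are simplified to letters and digits.

```java
import java.util.ArrayList;
import java.util.List;

// Toy scanner (hypothetical): finds labels using a state flag, not the column.
class LabelStateDemo {
    static final int STATEMENT = 0; // a new statement may start here
    static final int OTHER = 1;     // hypothetical: we are inside a statement

    static List<String> findLabels(String line) {
        List<String> labels = new ArrayList<String>();
        int state = STATEMENT;      // a line begins at a statement boundary
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == ' ' || c == '\t') {
                i++;                // white space leaves the state unchanged
            } else if (c == ';') {
                state = STATEMENT;  // ';' starts a new statement mid-line
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < line.length()
                        && Character.isLetterOrDigit(line.charAt(i))) {
                    i++;
                }
                boolean colon = i < line.length() && line.charAt(i) == ':';
                if (state == STATEMENT && colon) {
                    labels.add(line.substring(start, i));
                    i++;            // skip the colon; state stays STATEMENT
                } else {
                    state = OTHER;  // an ordinary identifier
                }
            } else {
                state = OTHER;      // anything else is part of a statement
                i++;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(findLabels("   START: X = 1; NEXT: Y = X:Z"));
    }
}
```

For the line "   START: X = 1; NEXT: Y = X:Z" this finds START and NEXT but
not X, even though neither label starts in column 1.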

>Note to Ter - reading the lexer section of the manual, getColumn is only
>mentioned in passing, and while there's a table of various functions,
>the fact that getColumn is missing means it's obviously incomplete. Is
>there other stuff missing?

Try running 'javap -classpath antlr.jar antlr.Token' to see what's
available in the Token class.  If you have the full source, look in
antlr\Token.java for more clues.
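
If you would rather do the same from code, something like the following
reflection sketch should list the public method names of any class on the
classpath. The class ListMethods and its publicMethodNames() helper are made
up for illustration; to inspect antlr.Token you would need antlr.jar on the
classpath and pass that name as the argument.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper: prints the public method names of a named class.
class ListMethods {
    static List<String> publicMethodNames(String className) {
        Class<?> c;
        try {
            c = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not on classpath: " + className);
        }
        // A TreeSet sorts the names and removes duplicates from overloads.
        TreeSet<String> names = new TreeSet<String>();
        for (Method m : c.getMethods()) {
            names.add(m.getName());
        }
        return new ArrayList<String>(names);
    }

    public static void main(String[] args) {
        // Default to a JDK class so the sketch runs without antlr.jar.
        String name = args.length > 0 ? args[0] : "java.lang.StringBuffer";
        for (String n : publicMethodNames(name)) {
            System.out.println(n);
        }
    }
}
```

Run it as 'java -classpath antlr.jar:. ListMethods antlr.Token' (use ';'
instead of ':' as the path separator on Windows).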

  - Robert



 