[antlr-interest] Multiple lexer tokens per rule

Jim Idle jimi at temporal-wave.com
Thu Jun 3 14:18:14 PDT 2010


Add the tokens to an array or collection as they are emitted, then have nextToken() remove them from the collection one at a time. It is slower to do this, so it isn't the default behavior.
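
A minimal sketch of that idea follows. It is plain Java and deliberately does not use the ANTLR runtime: the class name MultiTokenSketch, the string-based tokens, and the helpers matchOne() and lex() are all hypothetical, chosen only for illustration. In a real ANTLR 3 grammar the same queue would presumably live in @lexer::members, with emit(Token) and nextToken() overriding the methods inherited from Lexer. The token names and the trailing-punctuation split follow the examples in the quoted message below.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Stand-alone illustration of "queue in emit(), drain in nextToken()". Not ANTLR code. */
final class MultiTokenSketch {
    private final String input;
    private int pos = 0;
    // Tokens produced by the last rule match that the caller has not seen yet.
    private final Deque<String> pending = new ArrayDeque<>();

    MultiTokenSketch(String input) {
        this.input = input;
    }

    // Stands in for an overridden emit(): append to the queue instead of
    // overwriting a single "current token" slot.
    private void emit(String type, String text) {
        pending.add(type + "(" + text + ")");
    }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    private static boolean isPunc(char c) { return c == '-' || c == ',' || c == '.'; }

    // One rule match; it may emit zero, one, or several tokens.
    private void matchOne() {
        char c = input.charAt(pos);
        if (isDigit(c)) {
            int start = pos;
            while (pos < input.length()
                    && (isDigit(input.charAt(pos)) || isPunc(input.charAt(pos)))) {
                pos++;
            }
            // Trailing punctuation is not part of the number; split it off.
            int end = pos;
            while (isPunc(input.charAt(end - 1))) {
                end--;
            }
            String number = input.substring(start, end);
            emit(number.matches("[0-9]+") ? "DIGITS" : "NUMERIC", number);
            for (int i = end; i < pos; i++) {
                emit("PUNC", String.valueOf(input.charAt(i)));
            }
        } else if (isPunc(c)) {
            emit("PUNC", String.valueOf(c));
            pos++;
        } else {
            pos++; // this sketch simply skips whitespace and anything else
        }
    }

    // Stands in for an overridden nextToken(): only match more input when
    // the queue is empty, then hand out queued tokens one at a time.
    String nextToken() {
        while (pending.isEmpty() && pos < input.length()) {
            matchOne();
        }
        return pending.isEmpty() ? "EOF" : pending.poll();
    }

    // Convenience helper: collect every token up to (not including) EOF.
    static List<String> lex(String text) {
        MultiTokenSketch lexer = new MultiTokenSketch(text);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); !t.equals("EOF"); t = lexer.nextToken()) {
            out.add(t);
        }
        return out;
    }
}
```

Calling MultiTokenSketch.lex("23,450,") yields a NUMERIC token for "23,450" followed by a PUNC token for the trailing comma. The extra queue operations on every token are the overhead mentioned above.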

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ken Williams
> Sent: Thursday, June 03, 2010 1:42 PM
> To: ANTLR list
> Subject: [antlr-interest] Multiple lexer tokens per rule
> 
> Both the DAR book and the Javadoc
> (http://www.antlr.org/api/ActionScript/org/antlr/runtime/Lexer.html#emitToken())
> mention that if you want to emit multiple tokens for a single lexer
> rule, you need to override emit() or emitToken().  Does anyone have any
> examples of doing that?
> 
> I assume nextToken() would also need to be overridden.
> 
> 
> In case I have an XY Problem
> (http://www.perlmonks.org/index.pl?node_id=542341), my use case is to
> parse as in the following examples:
> 
> 23      -> DIGITS
> 23,     -> DIGITS PUNC
> 23,450  -> NUMERIC
> 23,450, -> NUMERIC PUNC
> 
> To do that, I'm using a lexer rule that consumes all the numeric &
> permitted in-numeric punctuation, then I fix it up afterwards:
> 
> -----------------------
> token    : ...
>     | DIGITS
>     | NUMERIC -> {fixNum($text)}
>     | PUNC ;
> 
> PUNC   : '-' | ',' | '.' ;
> fragment DIGIT    : '0'..'9' ;
> NUMERIC    :    DIGIT (DIGIT | PUNC)*
>         {if ($text.matches("^[0-9]+$")) {$type=DIGITS;}} ;
> -----------------------
> 
> My fixNum() method is trying to fix things up at the parser level, but
> I really want to do it in the lexer.
> 
> An alternate solution might be to "push back" any trailing punctuation
> onto the input stream.  Not sure if that's possible?
> 
> 
> --
> Ken Williams
> Sr. Research Scientist
> Thomson Reuters
> Phone: 651-848-7712
> ken.williams at thomsonreuters.com
> 
> 
> 




