[antlr-interest] Lexer - length/position as token delimiter?

Mark Lentczner markl at glyphic.com
Thu Apr 29 13:11:17 PDT 2004


As is often the case, the problem is with your grammar, not with the 
ability to lex or parse it.

> :23B:CRED
> :32A:000612USD5443,99
> :33B:USD5443,99

Does the grammar know from the tag what the format of the tag body 
should be?  Or can any tag have any tag_body format?  If the latter is 
the case, then the grammar is almost certainly inherently ambiguous and 
you won't be able to get far.  (Unless the tag_body formats are far 
more restricted than I'm guessing from your example.)

Here's an example:

:33X:12040678,99

Unless the grammar says something about tag "33X", there is no way to 
know whether this should be parsed as:
     1) a date, "120406" and an amount "78,99"
or  2) an amount "12040678,99"
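
To make the ambiguity concrete, here is a rough sketch in Python (not 
ANTLR). The two patterns are my assumptions, modelled on the sample 
lines: a date is exactly six digits, and an amount is digits with an 
optional ",digits" part. The function name readings is made up for 
this illustration.

```python
import re

# Assumed body formats, guessed from the sample lines in this thread.
DATE = r"[0-9]{6}"
VALUE = r"[0-9]+(?:,[0-9]+)?"


def readings(body):
    """Return every way `body` fits one of the two candidate formats."""
    found = []
    # Reading 1: a six-digit date followed by an amount.
    m = re.fullmatch(f"({DATE})({VALUE})", body)
    if m:
        found.append(("date+amount", m.group(1), m.group(2)))
    # Reading 2: the whole body is a single amount.
    if re.fullmatch(VALUE, body):
        found.append(("amount", body))
    return found


print(readings("12040678,99"))
```

Both readings succeed for "12040678,99", so nothing in the body itself 
can pick one. Only the tag can.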

Assuming there is a way to know from the tag what to expect from the 
tag_body, then I'd approach this by putting most of the work in the 
parser, not the lexer.

In the lexer I'd have:

class ScriptLexer extends Lexer;
     options { testLiterals = false; }

TAG options{testLiterals=true;}: ':' DIGIT DIGIT LETTER ':';
DIGIT: '0'..'9';
COMMA: ',';
LETTER: 'A'..'Z';
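
If it helps to see what token stream these rules produce, the 
following Python sketch imitates them with one regular expression. It 
is only an approximation of what the generated ANTLR lexer would do; 
the names TOKEN_RE and tokenize are invented for the illustration, and 
I assume one line of input with no whitespace.

```python
import re

# One alternative per lexer rule. TAG is listed first so that ":33B:"
# comes out as a single token rather than as separate characters.
TOKEN_RE = re.compile(
    r"(?P<TAG>:[0-9][0-9][A-Z]:)"
    r"|(?P<DIGIT>[0-9])"
    r"|(?P<COMMA>,)"
    r"|(?P<LETTER>[A-Z])"
)


def tokenize(text):
    """Split `text` into (token_type, token_text) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for token in tokenize(":33B:USD5443,99"):
    print(token)
```

Note that the lexer makes no attempt to decide whether "5443" is part 
of a date or a value. It hands single digits and letters to the parser 
and leaves the grouping to the rules there.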

In the parser I'd define rules for each tag_body format:

transaction: (LETTER)+;
date: DIGIT DIGIT DIGIT DIGIT DIGIT DIGIT;
currency: LETTER LETTER LETTER;
value: (DIGIT)+ (COMMA (DIGIT)+)?;
amount: currency value;
dated_amount: date amount;

Then I'd write the rest of the parser like:

message : headers (line)+ trailer;
line : (
       ":23B:" transaction
     | ":32A:" dated_amount
     | ":33B:" amount
     );

Notice the trick of allowing the literal test in the TAG rule, and then 
using all the tag names as literals in the parser.
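
The same idea, with the tag selecting the body format, can be sketched 
outside ANTLR as a lookup table. This Python version is an 
illustration only: BODY_FORMATS and parse_line are hypothetical names, 
the regular expressions are my transcription of the parser rules above, 
and I assume every tag is exactly five characters long.

```python
import re

# Body formats transcribed from the parser rules above.
DATE = r"(?P<date>[0-9]{6})"
AMOUNT = r"(?P<currency>[A-Z]{3})(?P<value>[0-9]+(?:,[0-9]+)?)"

# The tag literal decides which body format applies.
BODY_FORMATS = {
    ":23B:": re.compile(r"(?P<transaction>[A-Z]+)"),
    ":32A:": re.compile(DATE + AMOUNT),
    ":33B:": re.compile(AMOUNT),
}


def parse_line(line):
    """Split a line into its tag and the named fields of its body."""
    tag, body = line[:5], line[5:]
    fmt = BODY_FORMATS.get(tag)
    if fmt is None:
        raise ValueError(f"unknown tag {tag!r}")
    m = fmt.fullmatch(body)
    if m is None:
        raise ValueError(f"body {body!r} does not fit the format for {tag}")
    return tag, m.groupdict()


for sample in (":23B:CRED", ":32A:000612USD5443,99", ":33B:USD5443,99"):
    print(parse_line(sample))
```

An unknown tag such as ":33X:" is rejected outright here, which is the 
point: without a rule for the tag there is no basis for choosing a 
body format.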

	- Mark

Mark Lentczner
markl at wheatfarm.org
http://www.wheatfarm.org/



 