[antlr-interest] Lexer - length/position as token delimiter?

Thu Apr 29 12:54:09 PDT 2004

On Apr 29, 2004, at 12:34 PM, angrymongoose wrote:

> Hello all,
>
> I need to be able to parse messages of the form:
>
> message : headers tag+ trailer;
>
> tag : tag_id tag_body;
>
> << beginning of file omitted >>
> :23B:CRED
> :32A:000612USD5443,99
> :33B:USD5443,99
> << end of file omitted>>
>
> The problem I am having is that the tag body uses position as the
> element delimiter rather than a clearly defined character. For example,
> looking at the 32A line:
>
> ":32A:" is the tag id and "000612USD5443,99" is the tag_body.
>
> The tag body in turn breaks down into a date "000612" (6), a currency
> code "USD" (3), and an amount (1-15).
>
> Is it possible to parse the tag body with ANTLR using one
> lexer/parser, or am I stuck writing tag parsers by hand? I guess an
> alternative is to use ANTLR to write parsers for each tag and have a
> master parser invoke the subparsers.

Hi Norman,

Is the tag body fixed size for each "field"?  If so, it's pretty easy.
Just match 6 digits for the date, then look for the 3-letter currency
code, etc.
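
As a rough illustration of matching by position, here is a small sketch
in plain Python rather than ANTLR grammar syntax. The names TAG_32A and
parse_32a are made up for this example, and the amount being 1 to 15
characters of digits plus a decimal comma is an assumption based on the
single sample line in the question.

```python
import re

# Assumed layout of a 32A tag body, taken from the sample ":32A:000612USD5443,99":
#   date     - exactly 6 digits
#   currency - exactly 3 uppercase letters
#   amount   - 1 to 15 characters, digits or a decimal comma (assumption)
TAG_32A = re.compile(
    r"(?P<date>\d{6})(?P<currency>[A-Z]{3})(?P<amount>[\d,]{1,15})"
)


def parse_32a(body):
    """Split a 32A tag body into (date, currency, amount) by field width."""
    match = TAG_32A.fullmatch(body)
    if match is None:
        raise ValueError("not a valid 32A body: %r" % (body,))
    return match.group("date"), match.group("currency"), match.group("amount")


if __name__ == "__main__":
    print(parse_32a("000612USD5443,99"))
```

The same idea should carry over to a lexer or parser rule: the first two
fields have fixed widths, so only the trailing amount needs a length
range.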

Ter
--
Professor Comp. Sci., University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com
Cofounder, http://www.knowspam.net enjoy email again!
Cofounder, http://www.peerscope.com pure link sharing
