[antlr-interest] Tokens in fixed columns (e.g. COBOL)

Greg Lindholm glindholm at yahoo.com
Mon May 27 15:43:30 PDT 2002


--- Terence Parr <parrt at jguru.com> wrote:
> I have added a FAQ entry (using your name and answer if that's ok
> Greg):
> 
> http://www.jguru.com/faq/view.jsp?EID=893706
> 
> Ter
Glad to help.

On a related note, a little while back I was working on a lexer for COBOL
and was having problems with its fixed-column tokens (Linenum,
Continuation/Comment Indicator, Modcode).  Most of COBOL is free format
except for tokens that occur at three fixed margins (columns 1, 7, and 73).

In case anyone is interested or encounters a similar problem, here is
the solution I came up with: I pass the input file through a filter
that inserts "margin markers" into the input stream, then use the
following MARGIN_TOKEN rule to figure out these special tokens.
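
For illustration, the filter step might look something like the Java
sketch below. This is not the original filter, only a minimal guess at
its shape. The class and method names (MarginMarkerFilter, markLine,
markText) are hypothetical, and it assumes the input contains no tabs,
uses '\n' line endings, and that a marker is only inserted when the line
actually has a character at that margin column.

```java
public class MarginMarkerFilter {
    // Marker character expected by the MARGIN_TOKEN rule.
    static final char MARKER = '\001';

    // 1-based columns at which fixed-column tokens start.
    static final int[] MARGINS = {1, 7, 73};

    // Returns the line with a marker inserted before the characters
    // at columns 1, 7 and 73 (assumption: short lines get fewer markers).
    static String markLine(String line) {
        StringBuilder out = new StringBuilder(line.length() + MARGINS.length);
        for (int i = 0; i < line.length(); i++) {
            int column = i + 1; // 1-based column of the original character
            for (int margin : MARGINS) {
                if (column == margin) {
                    out.append(MARKER);
                }
            }
            out.append(line.charAt(i));
        }
        return out.toString();
    }

    // Applies markLine to every '\n'-separated line of the text,
    // keeping the line breaks where they were.
    static String markText(String text) {
        String[] lines = text.split("\n", -1);
        StringBuilder out = new StringBuilder(text.length() + 3 * lines.length);
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(markLine(lines[i]));
        }
        return out.toString();
    }
}
```

The marked text would then be handed to the lexer in place of the raw
source, for example through a StringReader.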

/*
Margin tokens are tokens that are in fixed columns.
A "margin marker" ('\001') is artificially inserted into the character
stream at each margin column (1, 7, 73) so it can be used as a prefix to
a fixed column token.

The margin marker should be suffixed with a '!' so it is excluded from
the token text. The column position must also be adjusted to not count
the marker.

Use getColumn() in semantic predicates to determine which column we are
in and thus which token type to create.
*/
MARGIN_TOKEN:
    // T_LINENUM starts in column 1 and goes to the next margin or EOL
        { 1==getColumn()}?
        '\001'! (~('\001'|'\n'))*
        { $setType(T_LINENUM); setColumn(getColumn()-1); }

    // If the column 7 indicator is blank then skip.
    |   { 7==getColumn() && LA(2)==' '}?
        '\001'! ' '
        {   $setType(Token.SKIP); 
            setColumn(getColumn()-1); 
        }

    // T_CONTINUATION is a '-' in column 7
    |   { 7==getColumn() && LA(2)=='-'}?
        '\001'! '-'
        { $setType(T_CONTINUATION); setColumn(getColumn()-1); }

    // If the column 7 indicator area is neither blank nor a continuation
    // then it's a comment that goes to the next margin or EOL.
    |   { 7==getColumn() && LA(2)!=' ' && LA(2)!='-' }?
        '\001'!
            ( '*' { $setType(T_COMMENT); }
            | '/' { $setType(T_COMMENT); }
            | 'd' { $setType(T_DEBUG_COMMENT); }
            | '$' { $setType(T_SPECIAL_COMMENT); }
            )
            (~('\001'|'\n'))*
        { setColumn(getColumn()-1); }

    // T_MODCODE starts at column 73 and goes to EOL
    |   {73==getColumn()}?
        '\001'! (~'\n')*
        { $setType(T_MODCODE); setColumn(getColumn()-1); }
    ;

Greg

