[antlr-interest] Can ANTLR build a COBOL lexer?

Sat Apr 13 17:28:17 PDT 2002

Some things are hard for a lexer to do, and column-positional
fields such as the mod-code after column 72 are among them.  Write
a filter that goes in front of the lexer, strips the first and last
parts of each line (columns 1-6 and 73-80), and saves them, perhaps
in an array indexed by line number.
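
A rough sketch of such a filter in Java follows.  The class and
field names (CobolColumnStripper, lineNumbers, modCodes) are made up
for illustration and are not part of ANTLR.  It assumes the column
layout described in the quoted message and leaves column 7 in the
text handed on to the lexer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-lexer filter for fixed-format COBOL lines (not an ANTLR class). */
class CobolColumnStripper {
    /** Columns 1-6 of each line seen so far; index 0 holds source line 1. */
    final List<String> lineNumbers = new ArrayList<>();
    /** Columns 73-80 of each line seen so far; empty when the line is too short. */
    final List<String> modCodes = new ArrayList<>();

    /** Records the stripped fields and returns columns 7-72 for the lexer. */
    String strip(String line) {
        int n = line.length();
        lineNumbers.add(line.substring(0, Math.min(6, n)));
        modCodes.add(n > 72 ? line.substring(72, Math.min(80, n)) : "");
        return n > 6 ? line.substring(6, Math.min(72, n)) : "";
    }
}
```

Call strip() on each line as it is read and feed the results to the
lexer through a Reader.  The saved fields can be looked up later by
line number if they need to be reattached as hidden tokens.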

On Sat, 13 Apr 2002, glindholm wrote:

> I'm working on a COBOL parser and trying to decide if I can use 
> ANTLR to build the lexer or if I should just roll my own. 
> 
> I'm going to use this for language translation so I want to preserve 
> all the COBOL "fluff" tokens like line-numbers and mod-codes as 
> hidden tokens.
> 
> The problem is that COBOL has column positional tokens. 
> Everything in columns 1-6 is considered the line-number.
> The character in column 7 is the comment or continuation character.
> Columns 73-80 are the mod-code.
> Everything in columns 8 to 72 is free format (mostly).
> 
> 
> So my first attempt (which of course failed) at getting the line 
> number was:
> 
> LINENUM: {1==getColumn()}? . . . . . .;
> 
> This has a nondeterminism with every other token rule because of 
> the '.' matches everything. The semantic predicate {1==getColumn()}? 
> doesn't seem to help because it doesn't get checked until we're 
> already in the rule where it throws a SemanticException() if it 
> fails.
> 
> Question 1) Is the SemanticException supposed to be caught in 
> nextToken() and the next rule tried?  I.e. We went into the wrong 
> rule let's try the next one?
> 
> Question 2) Is this what Hoisting is all about? If Hoisting was 
> supported would the {1==getColumn()}? be checked before going into 
> the rule?
> 
> Question 3) Can this be made to work? Is there any facility in ANTLR 
> that I can use for this or do I write my own lexer?
> 
> 
> If I write my own lexer I know I need to implement TokenStream.  No 
> problem.
> 
> Question 4)
> What is the strategy for keeping the Token Vocabularies synchronized 
> between the ANTLR parser and my non-Antlr lexer?
> 
> Should I write the parser first so I can use xTokenTypes in my 
> lexer? Or is there some reason I need to hand code a xTokenTypes.txt 
> file?
> 
> Any other tips or suggestions?
> 
> Thanks
> 
> Greg Lindholm
> 
> 
>  
> 
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 
> 
> 
> 
