[antlr-interest] Can ANTLR build a COBOL lexer?
glindholm
glindholm at yahoo.com
Sat Apr 13 15:00:12 PDT 2002
I'm working on a COBOL parser and trying to decide if I can use
ANTLR to build the lexer or if I should just roll my own.
I'm going to use this for language translation so I want to preserve
all the COBOL "fluff" tokens like line-numbers and mod-codes as
hidden tokens.
The problem is that COBOL has column positional tokens.
Everything in columns 1-6 is considered the line-number.
The character in column 7 is the comment or continuation character.
Columns 73-80 are the mod-code.
Everything in columns 8 to 72 is free format (mostly).
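To make the column layout concrete, here is a minimal Java sketch of slicing one source line into those four areas. The class name CobolLine and its methods are hypothetical, made up for illustration. The '*', '/' and '-' indicator values are the usual fixed-format COBOL conventions rather than something stated above, and the sketch assumes tabs have already been expanded.

```java
// Hypothetical helper: splits one fixed-format COBOL source line into its column areas.
final class CobolLine {
    final String sequence;  // columns 1-6: line number
    final char indicator;   // column 7: comment or continuation character
    final String text;      // columns 8-72: free-format program text
    final String modCode;   // columns 73-80: mod-code

    private CobolLine(String sequence, char indicator, String text, String modCode) {
        this.sequence = sequence;
        this.indicator = indicator;
        this.text = text;
        this.modCode = modCode;
    }

    static CobolLine split(String raw) {
        // Pad short lines to 80 columns so every area can be sliced safely.
        StringBuilder sb = new StringBuilder(raw);
        while (sb.length() < 80) {
            sb.append(' ');
        }
        String line = sb.toString();
        // Java indexes from 0, so column N is index N-1.
        // Anything past column 80 is ignored.
        return new CobolLine(
                line.substring(0, 6),
                line.charAt(6),
                line.substring(7, 72),
                line.substring(72, 80));
    }

    // Assumed convention: '*' and '/' in column 7 mark a comment line.
    boolean isComment() {
        return indicator == '*' || indicator == '/';
    }

    // Assumed convention: '-' in column 7 marks a continuation line.
    boolean isContinuation() {
        return indicator == '-';
    }
}
```

For example, CobolLine.split("000100 IDENTIFICATION DIVISION.") puts "000100" in sequence, a blank in indicator, and the statement (padded with blanks) in text.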
So my first attempt (which of course failed) at getting the line
number was:
LINENUM: {1==getColumn()}? . . . . . .;
This has a nondeterminism with every other token rule because
'.' matches everything. The semantic predicate {1==getColumn()}?
doesn't seem to help because it isn't checked until we're
already inside the rule, where it throws a SemanticException if it
fails.
Question 1) Is the SemanticException supposed to be caught in
nextToken() and the next rule tried? I.e., we went into the wrong
rule, so let's try the next one?
Question 2) Is this what hoisting is all about? If hoisting were
supported, would the {1==getColumn()}? be checked before going into
the rule?
Question 3) Can this be made to work? Is there any facility in ANTLR
that I can use for this or do I write my own lexer?
If I write my own lexer I know I need to implement TokenStream. No
problem.
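As a rough illustration of the hand-written route, here is a minimal, self-contained Java sketch that turns one line into positional tokens and flags the "fluff" areas as hidden. Everything in it is an assumption for illustration: Tok is a stand-in for a real token class, ColumnLexer and the token type numbers are hypothetical, and a real version would implement ANTLR's TokenStream, take its token types from the generated vocabulary, and tokenize columns 8-72 further instead of returning them as one CODE token.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in token class; a real lexer would return ANTLR Token objects instead.
final class Tok {
    final int type;
    final String text;
    final int line;
    final int column;      // 1-based starting column
    final boolean hidden;  // true for tokens the parser should not see

    Tok(int type, String text, int line, int column, boolean hidden) {
        this.type = type;
        this.text = text;
        this.line = line;
        this.column = column;
        this.hidden = hidden;
    }
}

// Hypothetical front end of a hand-written COBOL lexer.
final class ColumnLexer {
    // Hypothetical token type values; with ANTLR these would come from
    // the generated token vocabulary rather than being hard-coded.
    static final int LINENUM = 4;
    static final int INDICATOR = 5;
    static final int CODE = 6;
    static final int MODCODE = 7;

    // Splits one source line into tokens by column position.
    static List<Tok> tokenize(String raw, int lineNo) {
        List<Tok> out = new ArrayList<>();
        add(out, LINENUM, slice(raw, 0, 6), lineNo, 1, true);
        add(out, INDICATOR, slice(raw, 6, 7), lineNo, 7, true);
        add(out, CODE, slice(raw, 7, 72), lineNo, 8, false);
        add(out, MODCODE, slice(raw, 72, 80), lineNo, 73, true);
        return out;
    }

    // Substring that tolerates lines shorter than the requested range.
    private static String slice(String s, int from, int to) {
        if (from >= s.length()) {
            return "";
        }
        return s.substring(from, Math.min(to, s.length()));
    }

    // Emits a token only when the area is actually present on the line.
    private static void add(List<Tok> out, int type, String text,
                            int line, int column, boolean hidden) {
        if (!text.isEmpty()) {
            out.add(new Tok(type, text, line, column, hidden));
        }
    }
}
```

Because the column tests happen before any token rule is chosen, this avoids the predicate-ordering problem described above; the cost is that the free-format area still needs its own tokenizer.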
Question 4)
What is the strategy for keeping the token vocabularies synchronized
between the ANTLR parser and my non-ANTLR lexer?
Should I write the parser first so I can use xTokenTypes in my
lexer? Or is there some reason I need to hand-code an xTokenTypes.txt
file?
Any other tips or suggestions?
Thanks
Greg Lindholm