[antlr-interest] Fighting with Mismatched Character Exceptions

Matt Rapczynski rapczynskimatthew at fhda.edu
Fri Apr 13 18:55:21 PDT 2012


Howdy,

I am working on a research project to develop a PL/SQL documentation
tool, in the hopes of picking up where some previous developers left
off years ago. My goal is to create something javadoc-like, especially
since PL/SQL supports multi-line /* */ comments. The existing ANTLR
grammar files out there just don't work, and it doesn't seem like
taking a crack at it myself would be all that hard. I picked up The
Definitive ANTLR Reference earlier today, and so far I have
successfully created an AST that models an input source code file as a
hierarchy of parent package, procedure, function, and parameter nodes.

I'm stuck right now trying to get the lexer to appreciate the difference 
between the string literal "in" and the string literal "is". Given this 
code sample...

procedure P_RenderPage(p_param1 varchar2, p_param2 in out number, 
p_param3 cursor%rowtype)

...the lexer is insisting, through MismatchedCharacter exceptions, that
the "in" (following p_param2) really should be "is", because "is"
matches a different token definition. That token sends the "is" keyword
to the hidden channel altogether, since it is not a language keyword
relevant to generating docs. Here are the details:

Error Message:

line 2:52 mismatched character 'n' expecting 's'

The rules that should work just fine:

parameter
     :    ORACLE_IDENTIFIER ' ' parameter_modifier* oracle_type
          -> ^(PARAM ORACLE_IDENTIFIER oracle_type parameter_modifier*);

parameter_modifier
     :    ('in' | 'out' | 'nocopy') ' '*;

The token that is sticking its nose where it doesn't belong:

STATEMENT_TERMINATORS
     :    ' '* (';' | 'is' | 'as' | 'end') ' '* { $channel = HIDDEN; };

I've read some elaborate commentary about how the lexer selects a rule 
to work with, but "in" and "is" are in fact different. How can it
possibly be messing that up? Is there anything I can do? I really should 
be capturing those parameter modifiers because they can signify some 
pretty important behavioral changes that belong in a good documentation 
tool.
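
For what it's worth, here is a rough, untested sketch of one
restructuring that might sidestep the problem. The idea is that no
token rule begins with optional spaces: whitespace goes to the hidden
channel through its own rule, and each modifier keyword gets a named
lexer rule instead of a quoted literal followed by ' '*. The grammar
name, the IN/OUT/NOCOPY and WS rule names, and the bodies of
oracle_type and ORACLE_IDENTIFIER are stand-ins I made up for this
sketch, not my real rules.

grammar ParamSketch;    // hypothetical name

options { output = AST; }

tokens { PARAM; }       // imaginary node used by the rewrite

parameter
     :    ORACLE_IDENTIFIER parameter_modifier* oracle_type
          -> ^(PARAM ORACLE_IDENTIFIER oracle_type parameter_modifier*);

parameter_modifier
     :    IN | OUT | NOCOPY;

// Stand-in only; the real oracle_type rule is not shown above.
oracle_type
     :    ORACLE_IDENTIFIER ('%' ORACLE_IDENTIFIER)?;

// Keywords are listed before ORACLE_IDENTIFIER so that they take
// precedence when both rules match the same text.
IN   :    'in';
OUT  :    'out';
NOCOPY
     :    'nocopy';

// Same hidden tokens as before, minus the leading and trailing ' '*.
STATEMENT_TERMINATORS
     :    (';' | 'is' | 'as' | 'end') { $channel = HIDDEN; };

// Stand-in identifier rule.
ORACLE_IDENTIFIER
     :    ('a'..'z' | 'A'..'Z')
          ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' | '$' | '#')*;

// Whitespace never reaches the parser, so no rule mentions ' '.
WS   :    (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; };

With this layout the parser rules no longer reference ' ' at all, so
after a space the lexer would be choosing between 'in' and 'is' on
their own characters rather than after a shared ' '* prefix. I can't
say for certain that this is the right approach.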

Thanks in advance for any help!

-- 
Matt Rapczynski
ETS/IS, Database Analyst
Foothill-De Anza College District


