[antlr-interest] ANTLR3 and C# 2.0 Lexer troubles
Igor Trofimov
iamhere2 at gmail.com
Sun Oct 8 10:43:13 PDT 2006
Hi, All!
I'm a newbie at syntax analysis, lexers, parsers, etc.
Nevertheless, I am trying to build a C# lexer/parser with ANTLR. ;)
To start, I transcribed the _lexer_ grammar from the C# language
specification into ANTLR3 grammar syntax, including the Unicode class
definitions. I removed some left recursion and simplified some definitions,
using java.g as a template.
ANTLRWorks itself reports no invalid rules (without running the Check Grammar
command).
Now I have some troubles and questions about this LEXER grammar. I hope you
can help me with them.
1. Some rules in the specification are given in informal form:
<single-character> ::= Any character except ' (U+0027), \ (U+005C), and
<new-line-character>
I tried to define such rules using the '~' syntax:
SINGLE_CHARACTER : ~ (NOT_SINGLE_CHARACTER);
NOT_SINGLE_CHARACTER : '\u0027' | '\u005c' | NEW_LINE_CHARACTER;
But it doesn't work properly :(
Fortunately, the expanded version seems to work:
SINGLE_CHARACTER : ~( '\u000D' | '\u000A' | '\u0085' | '\u2028' |
'\u2029');
Why doesn't the first variant work? Is it invalid grammar syntax or an ANTLR
bug?
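For illustration only, the set that the informal rule describes can be written out as a small Python sketch (Python rather than ANTLR or C#; the names NEW_LINE_CHARACTERS, NOT_SINGLE_CHARACTER and is_single_character are hypothetical). It says nothing about how ANTLR handles '~' applied to a rule reference; it only spells out the intended complement. Note that the full excluded set has seven code points, while the expanded rule quoted above lists only the five new-line characters.

```python
# Sketch of the <single-character> rule from the C# spec:
# any character except ' (U+0027), \ (U+005C), and a new-line character.
# All names here are illustrative assumptions, not ANTLR or C# APIs.

NEW_LINE_CHARACTERS = frozenset("\u000D\u000A\u0085\u2028\u2029")

# The excluded set is the union of the two literals and the new-line set,
# i.e. the reference to NEW_LINE_CHARACTER flattened into single characters.
NOT_SINGLE_CHARACTER = frozenset("\u0027\u005C") | NEW_LINE_CHARACTERS


def is_single_character(ch: str) -> bool:
    """Return True if ch is exactly one character outside the excluded set."""
    return len(ch) == 1 and ch not in NOT_SINGLE_CHARACTER
```

For example, is_single_character("a") is true, while the apostrophe, the backslash and "\n" are all rejected.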
2. Some rules in the specification require additional logic to be defined,
e.g.:
DECIMAL_DIGIT_CHARACTER
: UNICODE_CATEGORY_DECIMALDIGITNUMBER // A Unicode character of the class Nd
| UNICODE_CHARACTER_ESCAPE_SEQUENCE // representing a character of the class Nd -- ??? How to check this ???
;
CONDITIONAL_SYMBOL
: IDENTIFIER_OR_KEYWORD // Any identifier-or-keyword except true or false ??? How to check this ???
;
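As a hedged illustration of the extra logic these two rules need, here is a minimal Python sketch (not ANTLR or C# code; decode_unicode_escape, is_decimal_digit_character and is_conditional_symbol are hypothetical names). It assumes the escape has the C# forms \uXXXX or \UXXXXXXXX and that the text passed to is_conditional_symbol has already matched identifier-or-keyword. How to hook such a check into an ANTLR3 lexer rule is left open.

```python
import re
import unicodedata

# C# unicode-escape-sequence: \u + 4 hex digits, or \U + 8 hex digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_unicode_escape(text: str) -> str:
    """Turn one escape such as \\u0661 into the character it denotes."""
    match = _ESCAPE.fullmatch(text)
    if match is None:
        raise ValueError("not a unicode escape sequence: %r" % text)
    return chr(int(match.group(1) or match.group(2), 16))


def is_decimal_digit_character(text: str) -> bool:
    """True if text is a class-Nd character, written literally or escaped."""
    ch = decode_unicode_escape(text) if text.startswith("\\") else text
    return len(ch) == 1 and unicodedata.category(ch) == "Nd"


def is_conditional_symbol(identifier: str) -> bool:
    """True for any identifier-or-keyword except true and false."""
    return identifier not in ("true", "false")
```

With this sketch, is_decimal_digit_character accepts both "7" and the escaped form of U+0661 (an Arabic-Indic digit), rejects the escaped form of U+0041 ("A"), and is_conditional_symbol accepts "DEBUG" but rejects "true".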
3. The C# target seems unfinished? It is missing some obvious "override"
keywords and some DFA definitions :(
4. And last, but most important: my grammar doesn't work at all :(
No errors are reported in the grammar, only in ANTLR itself.
The ANTLR tool prints this message:
=====================================================
ANTLR Parser Generator Early Access Version 3.0b4 (??, 2006) 1989-2006
internal error: org.antlr.tool.Grammar.getCharValueFromGrammarCharLiteral(Grammar.java:1519): invalid char literal: ''
internal error: org.antlr.tool.Grammar.getCharValueFromGrammarCharLiteral(Grammar.java:1519): invalid char literal: ''
internal error: CSharp.g : java.lang.NullPointerException
org.antlr.analysis.NFAToDFAConverter.convertToAcceptState(NFAToDFAConverter.java:989)
org.antlr.analysis.NFAToDFAConverter.addDFAStateToWorkList(NFAToDFAConverter.java:953)
org.antlr.analysis.NFAToDFAConverter.findNewDFAStatesAndAddDFATransitions(NFAToDFAConverter.java:291)
org.antlr.analysis.NFAToDFAConverter.convert(NFAToDFAConverter.java:101)
org.antlr.analysis.DFA.<init>(DFA.java:214)
org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:763)
org.antlr.tool.Grammar.createLookaheadDFAs(Grammar.java:711)
org.antlr.codegen.Target.performGrammarAnalysis(Target.java:111)
org.antlr.codegen.CodeGenerator.genRecognizer(CodeGenerator.java:284)
org.antlr.Tool.processGrammar(Tool.java:320)
org.antlr.Tool.process(Tool.java:251)
org.antlr.Tool.main(Tool.java:70)
=====================================================
I have attached my grammar to this post. Maybe I have made some terrible
fundamental errors in the grammar?
Please point me in the right direction for further progress...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: grammar.zip
Type: application/zip
Size: 12305 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20061008/a9dca816/attachment-0001.zip