[antlr-interest] ANTLRWorks getting confused with import

Laurent Caillette laurent.caillette at gmail.com
Tue Aug 5 00:51:13 PDT 2008


Hi all,

There is a strange behavior happening with ANTLRWorks-1.2b5 on MacOsX
10.4 (Intel) and import feature. I've got a big grammar to split in
many parts. The two grammars I've trouble with should be imported in
some other grammar files, so I can't use combined grammars here
because combined grammars (lexer + parser if I'm exact) can't import
combined grammars. At least that's what TestCompositeGrammars.java
says in ANTLR sources.

You can take a look at the big grammar file if you're curious but it's
not the point.
http://github.com/caillette/novelang/tree/master%2Fsrc%2Fantlr%2FNovelang.g?raw=true

Here is how I get the error. It seems that ANTLRWorks gets confused
between the two files.

I'm opening Symbol.g in ANTLRWorks and I launch "Generate code". There
is an error dialog showing me I'm wrong to use lexer rules in a parser
grammar and the error appears at line 24 of Token.g. That's weird
because Token.g is a lexer grammar. Looking at the console dump for
both files it seems the errors don't correspond.

I hope I've shipped everything to reproduce this behavior.

Regards,

c.





Console dump for Symbol.g:
<<<
[09:31:02] java.lang.NullPointerException
	at org.antlr.tool.GrammarSanity.traceStatesLookingForLeftRecursion(GrammarSanity.java:91)
	at org.antlr.tool.GrammarSanity.checkAllRulesForLeftRecursion(GrammarSanity.java:66)
	at org.antlr.tool.Grammar.checkAllRulesForLeftRecursion(Grammar.java:1852)
	at org.antlr.works.grammar.antlr.ANTLRGrammarEngineImpl.analyze(Unknown Source)
	at org.antlr.works.grammar.engine.GrammarEngineImpl.analyze(Unknown Source)
	at org.antlr.works.grammar.CheckGrammar.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:637)
>>> (end of console dump for Symbol.g)


Console dump for Token.g :
<<<

[09:31:03] error(103): Token.g:24:0: parser rule anySymbol not allowed in lexer
[09:31:03] error(103): Token.g:29:0: parser rule
anySymbolExceptGreaterthansign not allowed in lexer
[09:31:03] error(103): Token.g:34:0: parser rule
anySymbolExceptGraveAccent not allowed in lexer
[09:31:03] error(103): Token.g:39:0: parser rule
anySymbolExceptGreaterthansignAndGraveAccent not allowed in lexer
[09:31:03] error(103): Token.g:80:0: parser rule digit not allowed in lexer
[09:31:03] error(103): Token.g:84:0: parser rule letters not allowed in lexer
[09:31:03] error(103): Token.g:86:0: parser rule hexLetter not allowed in lexer
[09:31:03] error(103): Token.g:88:0: parser rule nonHexLetter not
allowed in lexer
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:20:0: lexer rule
SOFTBREAK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:21:0: lexer rule
WHITESPACE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:26:0: lexer rule
AMPERSAND not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:27:0: lexer rule
APOSTROPHE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:28:0: lexer rule
ASTERISK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:29:0: lexer rule
CIRCUMFLEX_ACCENT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:30:0: lexer rule
COLON not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:31:0: lexer rule
COMMA not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:32:0: lexer rule
COMMERCIAL_AT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:33:0: lexer rule
DEGREE_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:34:0: lexer rule
DOLLAR_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:35:0: lexer rule
DOUBLE_QUOTE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:36:0: lexer rule
ELLIPSIS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:37:0: lexer rule
EQUALS_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:38:0: lexer rule
EXCLAMATION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:39:0: lexer rule
FULL_STOP not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:40:0: lexer rule
GRAVE_ACCENT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:41:0: lexer rule
GREATER_THAN_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:42:0: lexer rule
HYPHEN_MINUS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:43:0: lexer rule
LEFT_CURLY_BRACKET not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:44:0: lexer rule
LEFT_PARENTHESIS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:45:0: lexer rule
LEFT_SQUARE_BRACKET not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:46:0: lexer rule
LEFT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:47:0: lexer rule
LESS_THAN_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:48:0: lexer rule
LOW_LINE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:49:0: lexer rule
NUMBER_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:50:0: lexer rule
PLUS_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:51:0: lexer rule
PERCENT_SIGN not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:52:0: lexer rule
QUESTION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:53:0: lexer rule
REVERSE_SOLIDUS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:54:0: lexer rule
RIGHT_CURLY_BRACKET not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:55:0: lexer rule
RIGHT_PARENTHESIS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:56:0: lexer rule
RIGHT_SQUARE_BRACKET not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:57:0: lexer rule
RIGHT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:58:0: lexer rule
SEMICOLON not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:59:0: lexer rule
SINGLE_LEFT_POINTING_ANGLE_QUOTATION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:60:0: lexer rule
SINGLE_RIGHT_POINTING_ANGLE_QUOTATION_MARK not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:61:0: lexer rule
SOLIDUS not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:62:0: lexer rule
TILDE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:63:0: lexer rule
VERTICAL_LINE not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:65:0: lexer rule
DIGIT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:67:0: lexer rule
HEX_LETTER not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:72:0: lexer rule
NON_HEX_LETTER not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:135:0: lexer rule
BLOCK_COMMENT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:141:0: lexer rule
LINE_COMMENT not allowed in parser
[09:31:03] error(102):
/Users/Shared/Scratch/Novelang/src/antlr/Symbol.g:1:0: lexer rule
Tokens not allowed in parser

>>> (end of console dump for Token.g)


<<<
parser grammar Symbol ;

import Token ;

anySymbol
  : anySymbolExceptGreaterthansign
  | GREATER_THAN_SIGN
  ;

anySymbolExceptGreaterthansign
  : anySymbolExceptGreaterthansignAndGraveAccent
  | GRAVE_ACCENT
  ;

anySymbolExceptGraveAccent
  : anySymbolExceptGreaterthansignAndGraveAccent
  | GREATER_THAN_SIGN
  ;

anySymbolExceptGreaterthansignAndGraveAccent
  :     digit
      | hexLetter
      | nonHexLetter
      | AMPERSAND
      | APOSTROPHE
      | ASTERISK
      | CIRCUMFLEX_ACCENT
      | COLON
      | COMMA
      | COMMERCIAL_AT
      | DOLLAR_SIGN
      | DOUBLE_QUOTE
      | ELLIPSIS
      | EQUALS_SIGN
      | EXCLAMATION_MARK
      | FULL_STOP
//      | GRAVE_ACCENT
//      | GREATER_THAN_SIGN
      | HYPHEN_MINUS
      | LEFT_CURLY_BRACKET
      | LEFT_PARENTHESIS
      | LEFT_SQUARE_BRACKET
      | LESS_THAN_SIGN
      | LOW_LINE
      | NUMBER_SIGN
      | PLUS_SIGN
      | PERCENT_SIGN
      | QUESTION_MARK
      | RIGHT_CURLY_BRACKET
      | RIGHT_PARENTHESIS
      | RIGHT_SQUARE_BRACKET
      | SEMICOLON
      | SOLIDUS
      | TILDE
      | VERTICAL_LINE
  ;


digit : DIGIT ;

letters : ( hexLetter | nonHexLetter )+ ;

hexLetter : HEX_LETTER ;

nonHexLetter : NON_HEX_LETTER ;

>>> (end of Symbol.g)




<<<
lexer grammar Token ;

SOFTBREAK : ( '\r' '\n' ? ) | '\n' ;
WHITESPACE : ( ' ' | '\t' )+ ;

// All namings respect Unicode standard.
// http://www.fileformat.info/info/unicode

AMPERSAND : '&' ;
APOSTROPHE : '\'' ;
ASTERISK : '*' ;
CIRCUMFLEX_ACCENT : '^' ;
COLON : ':' ;
COMMA : ',' ;
COMMERCIAL_AT : '@' ;
DEGREE_SIGN : '°' ;
DOLLAR_SIGN : '$' ;
DOUBLE_QUOTE : '\"' ;
ELLIPSIS : '...' ;
EQUALS_SIGN : '=' ;
EXCLAMATION_MARK : '!' ;
FULL_STOP : '.' ;
GRAVE_ACCENT : '`' ;
GREATER_THAN_SIGN : '>' ;
HYPHEN_MINUS : '-' ;
LEFT_CURLY_BRACKET : '{' ;
LEFT_PARENTHESIS : '(' ;
LEFT_SQUARE_BRACKET : '[' ;
LEFT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK : '\u00ab' ;
LESS_THAN_SIGN : '<' ;
LOW_LINE : '_' ;
NUMBER_SIGN : '#' ;
PLUS_SIGN : '+' ;
PERCENT_SIGN : '%' ;
QUESTION_MARK : '?' ;
REVERSE_SOLIDUS : '\\' ;
RIGHT_CURLY_BRACKET : '}' ;
RIGHT_PARENTHESIS : ')' ;
RIGHT_SQUARE_BRACKET : ']' ;
RIGHT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK : '\u00bb' ;
SEMICOLON : ';' ;
SINGLE_LEFT_POINTING_ANGLE_QUOTATION_MARK : '\u2039' ;
SINGLE_RIGHT_POINTING_ANGLE_QUOTATION_MARK : '\u203a' ;
SOLIDUS : '/' ;
TILDE : '~' ;
VERTICAL_LINE : '|' ;

DIGIT : '0'..'9' ;

HEX_LETTER
  : 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
  | 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
  ;

NON_HEX_LETTER
  : 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p'
  | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
  | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P'
  | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'

  | '\u00e0' // LATIN SMALL LETTER A WITH GRAVE
  | '\u00c0' // LATIN CAPITAL LETTER A WITH GRAVE

  | '\u00e2' // LATIN SMALL LETTER A WITH CIRCUMFLEX (&acirc;)
  | '\u00c2' // LATIN CAPITAL LETTER A WITH CIRCUMFLEX (&Acirc;)

  | '\u00e4' // LATIN SMALL LETTER A WITH DIAERESIS (&auml;)
  | '\u00c4' // LATIN CAPITAL LETTER A WITH DIAERESIS (&Auml;)

  | '\u00e6' // LATIN SMALL LETTER AE
  | '\u00c6' // LATIN CAPITAL LETTER AE

  | '\u00e7' // LATIN SMALL LETTER C WITH CEDILLA (&ccedil;)
  | '\u00c7' // LATIN CAPITAL LETTER C WITH CEDILLA (&Ccedil;)

  | '\u00e8' // LATIN SMALL LETTER E WITH GRAVE (&egrave;)
  | '\u00c8' // LATIN CAPITAL LETTER E WITH GRAVE (&Egrave;)

  | '\u00e9' // LATIN SMALL LETTER E WITH ACUTE
  | '\u00c9' // LATIN CAPITAL LETTER E WITH ACUTE

  | '\u00ea' // LATIN SMALL LETTER E WITH CIRCUMFLEX (&ecirc;)
  | '\u00ca' // LATIN CAPITAL LETTER E WITH CIRCUMFLEX (&Ecirc;)

  | '\u00eb' // LATIN SMALL LETTER E WITH DIAERESIS (&euml;)
  | '\u00cb' // LATIN CAPITAL LETTER E WITH DIAERESIS (&Euml;)

  | '\u00ee' // LATIN SMALL LETTER I WITH CIRCUMFLEX (&icirc;)
  | '\u00ce' // LATIN SMALL LETTER I WITH CIRCUMFLEX (&icirc;)

  | '\u00ef' // LATIN SMALL LETTER I WITH DIAERESIS (&iuml;)
  | '\u00cf' // LATIN CAPITAL LETTER I WITH DIAERESIS (&Iuml;)

  | '\u00f4' // LATIN SMALL LETTER O WITH CIRCUMFLEX (&ocirc;)
  | '\u00d4' // LATIN CAPITAL LETTER O WITH CIRCUMFLEX (&Ocirc;)

  | '\u00f6' // LATIN SMALL LETTER O WITH DIAERESIS (&ouml;)
  | '\u00d6' // LATIN CAPITAL LETTER O WITH DIAERESIS (&Ouml;)

  | '\u00f9' // LATIN SMALL LETTER U WITH GRAVE (&ugrave;)
  | '\u00d9' // LATIN CAPITAL LETTER U WITH GRAVE (&Ugrave;)

  | '\u00fb' // LATIN SMALL LETTER U WITH CIRCUMFLEX (&ucirc;)
  | '\u00db' // LATIN CAPITAL LETTER U WITH CIRCUMFLEX (&Ucirc;)

  | '\u00fc' // LATIN SMALL LETTER U WITH DIAERESIS (&uuml;)
  | '\u00dc' // LATIN CAPITAL LETTER U WITH DIAERESIS (&Uuml;)

  | '\u0153' // LATIN SMALL LIGATURE OE
  | '\u0152' // LATIN CAPITAL LIGATURE OE

  ;



// From Java 5 grammar http://www.antlr.org/grammar/1152141644268/Java.g

/** We can't use '/*' because it gets confused with wildcards in file names.
 */
BLOCK_COMMENT
  : '{{' ( options { greedy = false ; } : . )* '}}' { $channel = HIDDEN ; }
  ;

/** As we don't support '/*' we avoid confusion by not supporting
 * usually-associated '//', also used for italics.
 */
LINE_COMMENT
  : '%%' ~('\n'|'\r')* '\r'? '\n' { $channel=HIDDEN ; }
  ;
>>> (end of Symbol.g)


More information about the antlr-interest mailing list