[antlr-interest] Bug Report - Composite Grammars only allow 1 level of lexer import

Sun Jan 25 13:34:13 PST 2009

Lexer imports in composite grammars work only one level deep. My suggested
fix is at the bottom of this message. To reproduce the error, here is a
slight modification of the final example at
http://www.antlr.org/wiki/display/ANTLR3/Composite+Grammars that imports
lexers two levels deep (C imports L, which in turn imports LSub).

------- Start grammar files ------
lexer grammar LSub ;

SPACE : ' ' ;
----------------------------------
lexer grammar L ;

import LSub;

LETTER : 'a'..'z' ;
// SPACE : ' ' ;
NUMBER : '0'..'9' ;

----------------------------------
parser grammar P1 ;

letter : LETTER ;
spaces : SPACE+ ;
----------------------------------
parser grammar P2 ;
import P1 ;

letters : letter+ ;
----------------------------------
grammar C ;
import L, P2 ;

stuff : ( letters spaces )+ ;

LETTER : 'a'..'z' ;
------ End grammar files ---------

---------- ANTLR SysOut ----------
ANTLR Parser Generator  Version 3.1.1
---- End SysOut (notice there is no error) ----

--------- gUnit testsuite ---------
gunit C;

LETTER
: "a"   OK
  "A"   FAIL 

letter
: "a"   OK
  "B"   FAIL

spaces
: "  "  OK

letters
: "abc"     OK
  "aBc"     FAIL

stuff
: "a ab c   "   OK
  " A ab C  "   FAIL
  "A ab C   "   FAIL
------- End gUnit testsuite -------

---------- gUnit output ----------
-----------------------------------------------------------------------
executing testsuite for grammar:C with 10 tests
-----------------------------------------------------------------------
2 failures found:
test5 (spaces, line12) -
expected: OK
actual: FAIL

test8 (stuff, line19) -
expected: OK
actual: FAIL

Tests run: 10, Failures: 2
----- CTest gUnit completed -----
-------- End gUnit output --------

----------- CTest.java -----------
import java.io.IOException;

import org.antlr.runtime.*;

public class CTest {

      public static void 
  parseText(String fileToParse) throws IOException, RecognitionException {
    final CParser parser = new CParser(lexFileToTokenStream(fileToParse));
    parser.stuff();
  }

      private static CommonTokenStream 
  lexFileToTokenStream(String fileToParse) throws IOException {
    final ANTLRStringStream input 
      = (fileToParse==null) ? new ANTLRInputStream(System.in) 
                            : new ANTLRFileStream(fileToParse);
    final CLexer lexer = new CLexer(input);
    return new CommonTokenStream(lexer);
  }

      public static void 
  main(String[] args) throws Exception {
    if (args.length == 0) {
      parseText(null);
    } else {
      for (int i=0; i<args.length; i++) {
        parseText(args[i]);
      }
    }
  }
}
--------- End CTest.java ---------

------------ Test File (C01.testme)-----------
a b ccc 
---------- End Test File ---------

-------- Testing File C01.testme  --------
C:\Projects\SW_Development\ANTLRv3\CompositeLexBug\work>java -cp
C:\Java\ANTLR\antlrworks-1.2.2.jar;. CTest ..\*.testme
Exception in thread "main" java.lang.NullPointerException
        at CLexer.mTokens(CLexer.java:121)
        at org.antlr.runtime.Lexer.nextToken(Unknown Source)
        at org.antlr.runtime.CommonTokenStream.fillBuffer(Unknown Source)
        at org.antlr.runtime.CommonTokenStream.LT(Unknown Source)
        at org.antlr.runtime.CommonTokenStream.LA(Unknown Source)
        at CParser.stuff(CParser.java:51)
        at CTest.parseText(CTest.java:10)
        at CTest.main(CTest.java:28)
------ End Test File 1 Output ----

** THE SOLUTION **

The problem seems to be that, although CLexer.java does have
        gL = new C_L(input, state, this);
at line 26 in the CLexer constructor, it needs to follow it with
        gLSub = gL.gLSub;
(a similar thing is already done in the parsers).

When I add that line in the generated CLexer, everything works fine.
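
To illustrate why the missing assignment would surface as a
NullPointerException, here is a minimal sketch of the delegate wiring. The
class names (DelegateWiringSketch, LSubLexer, LLexer, RootLexer) and the
methods matchSpace and tryMatch are hypothetical stand-ins, not the code
ANTLR actually generates. The sketch assumes that the root lexer reaches the
SPACE rule through its own gLSub field, which is what the stack trace and
the fix above suggest.

```java
// Simplified stand-ins for the generated lexers; NOT real ANTLR output.
public class DelegateWiringSketch {

    /** Stand-in for the innermost imported lexer (LSub). */
    static class LSubLexer {
        String mSPACE() { return "SPACE"; }
    }

    /** Stand-in for the lexer L, which imports LSub and owns that delegate. */
    static class LLexer {
        final LSubLexer gLSub = new LSubLexer();
    }

    /** Stand-in for the root lexer (CLexer), which imports L directly. */
    static class RootLexer {
        final LLexer gL;
        LSubLexer gLSub; // indirect delegate; stays null unless wired

        RootLexer(boolean wireIndirectDelegate) {
            gL = new LLexer();
            if (wireIndirectDelegate) {
                gLSub = gL.gLSub; // the assignment proposed in this report
            }
        }

        /** Assumed shape of the dispatch: the root calls the indirect delegate. */
        String matchSpace() {
            return gLSub.mSPACE(); // throws NullPointerException if gLSub is null
        }
    }

    /** Returns the matched rule name, or the exception name on failure. */
    static String tryMatch(boolean wireIndirectDelegate) {
        try {
            return new RootLexer(wireIndirectDelegate).matchSpace();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println("unwired: " + tryMatch(false));
        System.out.println("wired:   " + tryMatch(true));
    }
}
```

Running main prints "unwired: NullPointerException" followed by
"wired:   SPACE". In this sketch the only difference between the two cases
is the single assignment in the constructor.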

The following is a temporary workaround for grammars that import lexers more
than one level deep. Its limitation is that only the rules carrying the @init
fix can be tested directly; any other rule, when invoked on its own, still
runs against the unpatched lexer.

------------- C.g with temporary fix --------------
grammar C ;
import L, P2 ;

stuff 
  @init {
    // kludge for problem in generating CLexer.java  
    CLexer clex = (CLexer)input.getTokenSource();
    clex.gLSub = clex.gL.gLSub; 
  }
: ( letters spaces )+ ;

LETTER : 'a'..'z' ;
----------------------------------------------------

Regards,
George
