[antlr-interest] How does INTEGER+ '.' INTEGER+ match "2."?

Ken Klose kenklose at gmail.com
Sun Aug 8 15:42:55 PDT 2010


I'm an ANTLR noob constructing a grammar to parse a data file that is a mix
of structured data and unstructured text.  At various points in this data
file there are entire lines of free-form text that I need to associate with
the previously matched data record, and I am having difficulty getting those
lines to parse.

I've distilled my grammar and input down to the smallest subset that
reproduces the problem ( line 1:20 required (...)+ loop did not match
anything at character '\r' ).  I don't understand why the lexer is matching
"2." as PRICE instead of INTEGER followed by SYMBOL.  Any help is greatly
appreciated.

=== Grammar ===
grammar Herman;

options {
  language = Java;
  output = AST;
}

detail: ( descline)+;

descline: (INTEGER | LETTER | SYMBOL | ' ' )+ LBR;

fragment DIGIT: '0'..'9';
LETTER : ('a'..'z' | 'A'..'Z');
fragment WSCHAR : (' ' | '\t' | '\n' | '\r' | '\f');
SYMBOL : ~( DIGIT | LETTER | WSCHAR );
INTEGER: DIGIT+;
/* If I remove the PRICE token then it parses fine, but I need this token
   for other parts of the data. */
PRICE: INTEGER '.' INTEGER;
LBR: ('\n' | '\r' | '\r\n');
WS: WSCHAR+ {$channel = HIDDEN;};

=== Test Bed ===

CharStream charStream = new ANTLRStringStream("I like the number 2.\r\n");
HermanLexer lexer = new HermanLexer(charStream);
TokenStream tokenStream = new CommonTokenStream(lexer);
HermanParser parser = new HermanParser(tokenStream );
parser.detail();
System.out.println("Done.");

=== Output ===
enter LETTER I line=1:0
exit LETTER   line=1:1
enter T__12   line=1:1
exit T__12 l line=1:2
enter LETTER l line=1:2
exit LETTER i line=1:3
enter LETTER i line=1:3
exit LETTER k line=1:4
enter LETTER k line=1:4
exit LETTER e line=1:5
enter LETTER e line=1:5
exit LETTER   line=1:6
enter T__12   line=1:6
exit T__12 t line=1:7
enter LETTER t line=1:7
exit LETTER h line=1:8
enter LETTER h line=1:8
exit LETTER e line=1:9
enter LETTER e line=1:9
exit LETTER   line=1:10
enter T__12   line=1:10
exit T__12 n line=1:11
enter LETTER n line=1:11
exit LETTER u line=1:12
enter LETTER u line=1:12
exit LETTER m line=1:13
enter LETTER m line=1:13
exit LETTER b line=1:14
enter LETTER b line=1:14
exit LETTER e line=1:15
enter LETTER e line=1:15
exit LETTER r line=1:16
enter LETTER r line=1:16
exit LETTER   line=1:17
enter T__12   line=1:17
exit T__12 2 line=1:18
enter PRICE 2 line=1:18
enter INTEGER 2 line=1:18
enter DIGIT 2 line=1:18
exit DIGIT . line=1:19
exit INTEGER . line=1:19
enter INTEGER \r line=1:20
exit INTEGER \r line=1:20
exit PRICE \r line=1:20
enter LBR \r line=1:20
exit LBR ? line=2:0
line 1:20 required (...)+ loop did not match anything at character '\r'
java.net.BindException: Address already in use: JVM_Bind
java.net.BindException: Address already in use: JVM_Bind
at java.net.PlainSocketImpl.socketBind(Native Method)
at java.net.PlainSocketImpl.bind(Unknown Source)
at java.net.ServerSocket.bind(Unknown Source)
at java.net.ServerSocket.<init>(Unknown Source)
at java.net.ServerSocket.<init>(Unknown Source)
at org.antlr.runtime.debug.DebugEventSocketProxy.handshake(DebugEventSocketProxy.java:75)
at com.kenklose.ibdscraper.IBD100ListParser.<init>(IBD100ListParser.java:52)
at com.kenklose.ibdscraper.IBD100ListParser.<init>(IBD100ListParser.java:43)
at com.kenklose.ibdscraper.Test.main(Test.java:23)
Exception in thread "main" java.lang.NullPointerException
at org.antlr.runtime.debug.DebugEventSocketProxy.transmit(DebugEventSocketProxy.java:116)
at org.antlr.runtime.debug.DebugEventSocketProxy.LT(DebugEventSocketProxy.java:161)
at org.antlr.runtime.debug.DebugTokenStream.LT(DebugTokenStream.java:82)
at org.antlr.runtime.Parser.traceIn(Parser.java:92)
at com.kenklose.ibdscraper.IBD100ListParser.detail(IBD100ListParser.java:96)
at com.kenklose.ibdscraper.Test.main(Test.java:24)
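
=== Sketch of the suspected behaviour ===

One possible reading of the trace: the lexer enters PRICE as soon as it sees
a run of digits followed by '.', and when the second INTEGER finds no digit
it reports an error instead of falling back to INTEGER.  The Java below is a
hand-written sketch of that behaviour for a single numeric token.  It is not
ANTLR's generated code, and the class and method names (CommitSketch,
committed, withLookahead) are made up for illustration.  It assumes the input
starts with a digit.

```java
public class CommitSketch {

    // Returns the index just past the run of digits that starts at pos.
    static int skipDigits(String s, int pos) {
        while (pos < s.length() && Character.isDigit(s.charAt(pos))) {
            pos++;
        }
        return pos;
    }

    // Mimics what the trace appears to show: digits followed by '.' commit
    // to PRICE, and a missing fractional digit is an error, not a fallback.
    static String committed(String s) {
        int end = skipDigits(s, 0);
        if (end < s.length() && s.charAt(end) == '.') {
            int fracEnd = skipDigits(s, end + 1);
            if (fracEnd == end + 1) {
                return "ERROR: PRICE needs a digit after '.'";
            }
            return "PRICE(" + s.substring(0, fracEnd) + ")";
        }
        return "INTEGER(" + s.substring(0, end) + ")";
    }

    // Looks one character further: PRICE is chosen only when a digit
    // actually follows the '.', otherwise the digits alone are an INTEGER.
    static String withLookahead(String s) {
        int end = skipDigits(s, 0);
        boolean dotThenDigit = end + 1 < s.length()
                && s.charAt(end) == '.'
                && Character.isDigit(s.charAt(end + 1));
        if (dotThenDigit) {
            return "PRICE(" + s.substring(0, skipDigits(s, end + 1)) + ")";
        }
        return "INTEGER(" + s.substring(0, end) + ")";
    }

    public static void main(String[] args) {
        for (String input : new String[] {"2.\r\n", "2.50", "42"}) {
            System.out.println(committed(input) + " | " + withLookahead(input));
        }
    }
}
```

For "2.\r\n" the first method reports an error while the second returns
INTEGER(2), leaving the '.' to be matched as a separate token.  In an ANTLR 3
grammar the equivalent extra check would presumably be written as a syntactic
predicate on the lexer rule; that is an assumption and has not been tried
against this grammar.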

