[antlr-interest] problems getting a simple grammar to accept its input
Kevin J. Cummings
cummings at kjchome.homeip.net
Thu Mar 24 08:31:47 PDT 2011
On 03/24/2011 11:08 AM, Florian Franzmann wrote:
> Hi,
>
> I'm having problems getting a (so far) very simple grammar to accept its input:
>
> -------------------------------------
>
> grammar Simulink;
>
> IDENTIFIER : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
> ;
>
> INT : '0'..'9'+
> ;
>
> FLOAT
> : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
> | '.' ('0'..'9')+ EXPONENT?
> | ('0'..'9')+ EXPONENT
> ;
>
> COMMENT
> : '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
> ;
>
> WS : ( ' '
> | '\t'
> | '\r'
> | '\n'
> ) {$channel=HIDDEN;}
> ;
>
> STRING
> : '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
> ;
>
> fragment
> EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
>
> fragment
> HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
>
> fragment
> ESC_SEQ
> : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
> | UNICODE_ESC
> | OCTAL_ESC
> ;
>
> fragment
> OCTAL_ESC
> : '\\' ('0'..'3') ('0'..'7') ('0'..'7')
> | '\\' ('0'..'7') ('0'..'7')
> | '\\' ('0'..'7')
> ;
>
> fragment
> UNICODE_ESC
> : '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
> ;
>
> fragment
> BLOCK_BEGIN
> : '{'
> ;
>
> fragment
> BLOCK_END
> : '}'
> ;
>
> file : block+
> ;
>
> block : IDENTIFIER BLOCK_BEGIN BLOCK_END
> ;
Because you declared BLOCK_BEGIN and BLOCK_END as fragments, the lexer
never emits them as tokens. Remove the "fragment" keyword from those two
lexer rules.
> -------------------------------------
>
> This is the input:
>
> -------------------------------------
>
> # bla
> Model {
> }
>
> -------------------------------------
>
> And here is what happens when I try to feed it to the grammar:
>
> -------------------------------------
> $ make smalltests
> antlr3 -verbose -trace -report Simulink.g
> ANTLR Parser Generator Version 3.3 Nov 30, 2010 12:50:56
> Simulink.g
> Simulink.file:65:8 decision 1: k=1
> javac -classpath antlr/antlr-3.3-complete.jar:. SimulinkLexer.java
> javac -classpath antlr/antlr-3.3-complete.jar:. SimulinkParser.java
> javac -classpath antlr/antlr-3.3-complete.jar:. Test.java
> cat testdata/empty.mdl | java -classpath antlr/antlr-3.3-complete.jar:. Test
> enter COMMENT # line=1:0
> exit COMMENT M line=2:0
> enter IDENTIFIER M line=2:0
> exit IDENTIFIER line=2:5
> enter file [@1,6:10='Model',<4>,2:0]
> enter block [@1,6:10='Model',<4>,2:0]
> enter WS line=2:5
> exit WS { line=2:6
> line 2:6 no viable alternative at character '{'
> enter WS
> line=2:7
> exit WS } line=3:0
> line 3:0 no viable alternative at character '}'
> enter WS
> line=3:1
> exit WS
> line=4:0
> enter WS
> line=4:0
> exit WS line=5:0
> line 5:0 mismatched input '<EOF>' expecting BLOCK_BEGIN
> exit block [@6,17:17='<EOF>',<-1>,5:0]
> exit file [@6,17:17='<EOF>',<-1>,5:0]
> -------------------------------------
>
> As I understand it the parser consumes 'Model' as IDENTIFIER and goes into
> state block. It ignores a WS, then finds a '{'. This should be recognized as
> BLOCK_BEGIN, which is the next token expected in block---any idea what I'm
> doing wrong?
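Fragment rules are only helpers that other lexer rules can reference;
they never produce tokens of their own. With both brace rules marked as
fragments, no token-producing rule matches '{' or '}', which is why the
lexer reports "no viable alternative at character '{'". Since BLOCK_BEGIN
and BLOCK_END are intended to be real tokens (you use them in your
parser's "block" rule), you should remove the "fragment" from those two
rules.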
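A minimal sketch of that change, assuming the rest of the grammar stays
exactly as posted (I have not run this through ANTLR 3.3 myself). Only
the two brace rules differ from the original; the "block" rule is
repeated for context:

```
// No "fragment" keyword here, so the lexer emits these as real tokens.
BLOCK_BEGIN
    : '{'
    ;

BLOCK_END
    : '}'
    ;

// Unchanged parser rule; it can now see BLOCK_BEGIN and BLOCK_END.
block : IDENTIFIER BLOCK_BEGIN BLOCK_END
    ;
```

EXPONENT, HEX_DIGIT, ESC_SEQ, OCTAL_ESC and UNICODE_ESC should stay
fragments, because they are only ever used inside other lexer rules.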
> best regards
> Florian Franzmann
>
--
Kevin J. Cummings
kjchome at verizon.net
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)