[antlr-interest] (newbie) very basic grammar for simple text and integer

Johannes Luber jaluber at gmx.de
Sun Aug 5 16:34:00 PDT 2007


ali azimi wrote:
> Hi,
>  
> Thank you so much for replying to me. I have tried the instructions,
> however the problem still remains. I am simply not able to make the
> following grammar parse a simple text or integer. When I debug it
> and enter some simple text or an integer, the grammar
> emits the "MismatchedTokenException" error message.
>  
> Could you please help me?
>  
> The grammar is:
> 
> text  :Text;
> integer :Integer;
>  
> Text      :(AlphaNumeric|Special|Space|Apostrophe)* ;
> Integer       :Decimaldigit+ ;
>  
> fragment Apostrophe:'\'';
> fragment Space           :(' ')*;
> fragment Word            : '.'* AlphaNumeric ( AlphaNumeric | '.' )* ;
> fragment CHARACTERSTRING : '\'' ( options{greedy=false;}:
> (~('\''|'\r'|'\n')| '\'' '\''))* '\'';
> fragment Special        
> :'+'|'-'|'!'|'/'|'>'|'8'|'('|')'|'"'|','|';'|'<'|'='|':'|'?'|'&'|'%'|'.'|'_';  
> fragment AlphaNumeric    :Uppercase|National|Lowercase|Decimaldigit;
> fragment Decimaldigit    :'0'..'9' ;
> fragment National        :'#'|'@'|'"'|'$'|'['|']'|'{'|'}'|'^'|'~' ;
> fragment Lowercase       :'a'..'z' ;
> fragment Uppercase       :'A'..'Z' ;
>  
> NEWLINE:'\r' ? '\n' {skip();};
> WS : (' ' |'\t' |'\n' |'\r' )+ {skip();} ;
>  
> I am very grateful.
>  
> Best regards,
>  
> Al
>

I've removed from Special all characters that AlphaNumeric already
covers ('"' from National and '8' from Decimaldigit). Furthermore, I
replaced every * that allowed a lexer rule to match the empty string,
so each token now needs at least one character. I'd also advise
removing ',' from Special if you plan to separate the tokens with
commas. As it stands, the WS rule is useless for spaces, because they
end up inside Text tokens. You may want to exclude the space as an
allowed character for the first position in Text. Word is unused, but
I changed it into a more correct version anyway, if I gathered your
intent correctly.

Best regards,
Johannes Luber


input_data  : (Text|Integer)*;

Text      :(AlphaNumeric|Special|Space|Apostrophe)+ ;
Integer       :Decimaldigit+ ;

fragment Apostrophe:'\'';
fragment Space           :' ';
fragment Word            : ( AlphaNumeric | '.' )+ ;
fragment CHARACTERSTRING : '\'' ( options{greedy=false;}:
(~('\''|'\r'|'\n')| '\'' '\''))* '\'';
fragment Special
:'+'|'-'|'!'|'/'|'>'|'('|')'|','|';'|'<'|'='|':'|'?'|'&'|'%'|'.'|'_';
fragment AlphaNumeric    :Uppercase|National|Lowercase|Decimaldigit;
fragment Decimaldigit    :'0'..'9' ;
fragment National        :'#'|'@'|'\"'|'$'|'['|']'|'{'|'}'|'^'|'~' ;
fragment Lowercase       :'a'..'z' ;
fragment Uppercase       :'A'..'Z' ;

NEWLINE:'\r' ? '\n' {skip();};
WS : (' ' |'\t' |'\n' |'\r' )+ {skip();} ;
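
The two suggestions in the reply above (no leading space in Text, no
',' in Special) could be sketched as follows. This is an untested
variant and only an assumption about the intended token shapes. The
fragment name TextStart is made up for illustration, and all rules not
shown stay as in the grammar above.

```
// Hypothetical variant, not verified against ANTLR.
// Text may no longer start with a space, so leading spaces can reach WS.
Text      : TextStart (TextStart | Space)* ;

// Helper fragment (invented name): every Text character except the space.
fragment TextStart : AlphaNumeric | Special | Apostrophe ;

// Special without ',' so that a comma can act as a token separator.
fragment Special
:'+'|'-'|'!'|'/'|'>'|'('|')'|';'|'<'|'='|':'|'?'|'&'|'%'|'.'|'_';
```

With this shape, spaces inside and at the end of a Text token are still
consumed by Text. Once ',' is removed from Special, a separate token
rule for ',' (and a parser rule that uses it) would still be needed.
That part is left out here because the post does not say how the
separator should be handled. Since Decimaldigit is part of
AlphaNumeric, an all-digit input matches both Text and Integer, so it
may be worth checking which of the two rules ANTLR actually picks.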

