[antlr-interest] Another mismatched token 0!=0

Fri Mar 14 07:15:45 PDT 2008

G R wrote:
> Hi,
> I'm trying to parse a file divided into 3 sections; each section contains
> lines of the form field '=' value.
> Here is an example of what I am trying to parse:
> 
> [config]   
> id = 420
> revision= 1
> severity.tag=@_TAG_SEVERITY
> severity.high        = [1-2]
> severity.medium        = [3]
> severity.low        = [4-6]
> severity.info        = (?:[7-9]|10|11|12|13)
> [classification]       
> classification.severity = true
> auth = ((?:109|113)\d*)
> bridge = ((?:101|102|103|104|709)\d*)
> [idmef]
> additional_data(1).type = string
> additional_data(1).meaning = blablabla
> additional_data(1).data = $3
> additional_data(0).type = string
> additional_data(0).meaning = blublublu
> additional_data(0).data = $2
> 
> I have the following grammar:
> 
> grammar GlobalConfig;
> options {
>     language=Java;
> }
> tokens {
>     CONFIG_START = '[config]';
>     CLASS_START = '[classification]';
>     IDMEF_START = '[idmef]';
>     ID = 'id';
>     REV = 'revision';
>     SEV_TAG = 'severity.tag';
>     SEV_HI = 'severity.high';
>     SEV_MED = 'severity.medium';
>     SEV_LO = 'severity.low';
>     SEV_IN = 'severity.info';
>     CLAS_SEV = 'classification.severity';
>     TRUE = 'true';
>     FALSE = 'false';
> }
> @members{
> ...}
> configFile
>     :    CONFIG_START configPart CLASS_START classificationPart IDMEF_START idmefPart EOF;
>    
> configPart
> @init{
> this.classes = new ArrayList<String>();
> this.classesValues = new TreeMap<String, String>();
> this.idmefPaths = new TreeMap<String, String>();
> }
>     :    id rev severity;
>     id    :    ID '=' DIGITS NEWLINE
>         {this.id = $DIGITS.text;};
>     rev    :    REV '=' DIGITS NEWLINE
>         {this.rev = $DIGITS.text;};
>     severity
>         :    severityTag severityHigh severityMed severityLow severityInfo;
>         severityTag
>             : SEV_TAG  '=' TAG NEWLINE
>             {this.sev_tag=$TAG.text;};
>         severityHigh
>             : SEV_HI '=' VALUE NEWLINE
>             {this.sev_high=$VALUE.text;};
>         severityMed
>             : SEV_MED '=' VALUE NEWLINE
>             {this.sev_medium=$VALUE.text;};
>         severityLow
>             : SEV_LO '=' VALUE NEWLINE
>             {this.sev_low=$VALUE.text;};
>         severityInfo
>             : SEV_IN '=' VALUE NEWLINE
>             {this.sev_info=$VALUE.text;};
>    
> classificationPart
>     :    CLAS_SEV '=' (on | off);
>     on
>     @init{this.classificationSeverity=true;}
>         : TRUE NEWLINE classes;
>         classes
>             : (LITERAL NEWLINE)+
>             {this.classes.add($LITERAL.text);};
>     off
>     @init{this.classificationSeverity=false;}
>         : FALSE NEWLINE classesValues;
>         classesValues
>             : (LITERAL '=' VALUE NEWLINE)+
>             {this.classesValues.put($LITERAL.text, $VALUE.text);};
> 
>    
> idmefPart
>     :    (IDMEFPATH '=' VALUE)+
>     {this.idmefPaths.put($IDMEFPATH.text, $VALUE.text);};
> 
> 
> IDMEFPATH
>     :    LETTER (LETTER | SCORIES);
> VALUE
>     :    (SCORIES | DIGITS | LETTER)+;
> DIGITS
>     :    DIGIT+;
> 
> TAG
>     :    '@_' LITERAL;
> LITERAL
>     :    LETTER (LETTER | '-' | '_')*;
>    
> fragment SCORIES
>     :    '-' | '_' | ':' | '.' | '?' | '!' | '|' | '@' | '#' | '$' | '^' | '~' | '(' | ')' | '[' | ']' | '\\' | '/' | '*';
> fragment LETTER
>     :    ('a'..'z' | 'A'..'Z');
> fragment DIGIT
>     :    '0'..'9';
>    
> NEWLINE
>     :    '\r'? '\n';
> WS
>     : (' '|'\t'|'\n'|'\r')+ {skip();};
> 
> Each time I try to parse the config file shown above the grammar,
> I get an error in the parser rule "id":
> BR.recoverFromMismatchedToken
> line 2:5 mismatched input '420' expecting DIGITS
> and in my parse tree I get:
> ID = MismatchedTokenException 0!=0
> 
> I can't find a way to solve this, and I don't understand what my
> error is, although I'm nearly sure it is a very stupid one.
> Can anyone help?
> 
> Thanks.
> G.R

I believe you should turn at least those lexer rules that are referenced
from several other rules into fragment rules of their own. Otherwise
several token rules try to recognize the same input. For example, '420'
matches both DIGITS and VALUE.
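
To make the overlap concrete, here is a minimal sketch in plain Java. It is a toy model, not ANTLR's actual prediction algorithm: it assumes that when two token rules match the same text, the rule listed first wins. The class TokenOverlapDemo, the helpers firstMatch, valueFirst and digitsFirst, and the regular expressions are hypothetical; the regexes only approximate the VALUE and DIGITS rules from the grammar quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class TokenOverlapDemo {
    // Approximation of VALUE: (SCORIES | DIGITS | LETTER)+
    static final Pattern VALUE =
        Pattern.compile("[-_:.?!|@#$^~()\\[\\]\\\\/*0-9a-zA-Z]+");
    // Approximation of DIGITS: DIGIT+
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Toy model: return the name of the first rule that matches the whole text.
    static String firstMatch(Map<String, Pattern> rules, String text) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(text).matches()) {
                return rule.getKey();
            }
        }
        return "NO_MATCH";
    }

    // Same order as in the quoted grammar: VALUE is declared before DIGITS.
    static Map<String, Pattern> valueFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("VALUE", VALUE);
        rules.put("DIGITS", DIGITS);
        return rules;
    }

    // Hypothetical variant with DIGITS declared first.
    static Map<String, Pattern> digitsFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("DIGITS", DIGITS);
        rules.put("VALUE", VALUE);
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(valueFirst(), "420"));    // VALUE
        System.out.println(firstMatch(digitsFirst(), "420"));   // DIGITS
        System.out.println(firstMatch(digitsFirst(), "[1-2]")); // VALUE
    }
}
```

In this model '420' is labelled VALUE as long as VALUE comes first, which is consistent with the parser's complaint that it was expecting DIGITS. Whether reordering alone is enough in the real grammar is not something this sketch can show; it only illustrates why overlapping token rules are a problem.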

Johannes