[antlr-interest] Spaces issues

Loring Craymer lgcraymer at yahoo.com
Tue Mar 29 18:53:06 PDT 2011


The likely cause of your problems is the extensive use of fragment rules.  ANTLR 
3 does not use follow sets in lexers, and invoking a fragment rule usually 
disables LL* processing.  Inline your fragment rules and your current problems 
should disappear, although others may still lurk.

--Loring
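
A minimal sketch of what inlining could look like for one token, assuming the
NAME rule and the Letter, Digit, Domain, and Name fragments from the quoted
grammar below. It is an illustration of the advice above, not a tested fix;
whether it removes the reported error on its own has not been verified.

```
// Before (from the quoted grammar): NAME is built from a chain of fragments.
//   fragment Digit  : '0'..'9';
//   fragment Letter : 'a'..'z'|'A'..'Z';
//   fragment Domain : Letter ('-'? (Letter|Digit))*;
//   fragment Name   : Domain ('.' Domain)*;
//   NAME            : Name;

// After: the same language, with the character sets written directly
// in the token rule so that no fragment rule is invoked.
NAME
    : ('a'..'z'|'A'..'Z') ('-'? ('a'..'z'|'A'..'Z'|'0'..'9'))*
      ( '.' ('a'..'z'|'A'..'Z') ('-'? ('a'..'z'|'A'..'Z'|'0'..'9'))* )*
    ;
```

The same substitution would apply to the other token rules that reference
fragments, such as VARIABLE, INNERCONTENT, RRANGE, LRANGE, and LVRANGE.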


----- Original Message ----
> From: Fabien Hermenier <hermenierfabien at gmail.com>
> To: antlr-interest at antlr.org
> Sent: Tue, March 29, 2011 12:51:47 PM
> Subject: Re: [antlr-interest] Spaces issues
> 
> Here is my entire grammar.
> There are a lot of commented-out rules, and "litteralRange" does not have its
> complete definition because simpler patterns do not work yet.
> Currently, litteralRange should accept inputs such as "[2..3]" or
> "[ 2 .. 0xFF]".
> 
> Thanks for your  help!
> 
> ---
> grammar ANTLRVJob5;
> 
> options {
>       language = Java;
>      output = AST;
> }
> fragment Digit  :'0'..'9';
> fragment Letter    :'a'..'z'|'A'..'Z';
> fragment  Name    : Domain ('.' Domain)*;
> fragment Domain: Letter  ('-'?(Letter|Digit))*;
> fragment VarPrefix: '$';
> fragment EnumSep:  ',';
> fragment InnerContent:    (Letter
>               |Digit
>               |'_'
>              |'-'
>               |'.'(Letter|Digit));
> fragment RRange: ']'  (InnerContent*(Letter|Digit))?;
> fragment LRange: (Letter  (Digit|Letter|'-'|'_'|'.')*)? '[';
> 
> // Number literal section
> fragment  HEX_LITERAL : ;
> fragment OCTAL_LITERAL :;
> fragment  DECIMAL_LITERAL:;
> NUMBER: '0'(
>      ('x'|'X') { $type =  HEX_LITERAL;}
>      (Digit|'a'..'f'|'A'..'F')+
>       |
>      ('0'..'7')+ {$type = OCTAL_LITERAL;}
>       |
>      )
>      |
>      '1'..'9' Digit*  {$type = DECIMAL_LITERAL;}
>      ;
> 
> NAME: Name;
> ENUMSEP:  EnumSep;
> EQUALS    :    '=';
> ENDL    :     ';';
> PLUS    :    '+';
> MINUS     :    '-';
> TIMES    :    '*';
> VARIABLE:     VarPrefix(Letter|'_')(Letter|Digit|'_')*;
> 
> COMMENT
>       :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>       |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
>       ;
> 
> WS    :    ('\n'|'\r'|'\t'|' ')  {$channel=HIDDEN;};
> 
> 
> INNERCONTENT:  InnerContent+;
> RRANGE:RRange;
> LRANGE: LRange;
> LVRANGE: VarPrefix  LRange;
> CONSTRAINTIDENTIFIER:  Letter(Letter|Digit|'_')*'(';
> 
> litteral:     NAME|NUMBER;
> operator:    PLUS|TIMES;
> 
> //litteralRange:     LRANGE INTEGER '..' INTEGER RRANGE;
> litteralRange:    '['  NUMBER '..' NUMBER ']';
> 
> litteralEnum:    LRANGE INNERCONTENT  /*(ENUMSEP INNERCONTENT)+']'  RRANGE*/;
> 
> variableEnum: LVRANGE  INNERCONTENT (ENUMSEP INNERCONTENT)+  RRANGE;
> variableRange: LVRANGE  NUMBER '..' NUMBER RRANGE;
> 
> explodedSet:('{}'| '{'expression (ENUMSEP  expression)*'}');
> 
> atom    :    '(' expression  ')'
>          |litteral
> //         |VARIABLE
>          |litteralRange
> //         |litteralEnum
> //         |variableRange
> //        |variableEnum
> //         |explodedSet
> ;
> 
> 
> expression: atom/* (operator  expression)?*/;
> 
> var_decl:    VARIABLE EQUALS expression  ';';
> 
> /*forEachStatement:
>      'foreach' VARIABLE 'in'  expression '{'
>      instruction*
>       '}';
> 
> constraintCallStatement: CONSTRAINTIDENTIFIER expression (',' 
> expression)* ')' ';';
> */
> instruction:    var_decl;
>           //|forEachStatement
> //         |constraintCallStatement;
> 
> vjob_decl:     instruction*;
> ---
> 
> On 29/03/11 12:47, Jim Idle wrote:
> > Looks  like you might be looking for a token that you have not defined, but
> >  post your grammar as it stands now and we can work it out.
> >
> >  Jim
> >
> >> -----Original Message-----
> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >> bounces at antlr.org] On Behalf Of Fabien  Hermenier
> >> Sent: Tuesday, March 29, 2011 11:25 AM
> >> To: antlr-interest at antlr.org
> >>  Subject: Re: [antlr-interest] Spaces issues
> >>
> >> Yes, and in this situation it seems to ignore the first number and
> >> the range delimiter.
> >> Here is a sample of the event list with the input "[2..3]" and the
> >> starting rule "litteralRange":
> >>
> >> Consume  [[/<32>,1:0, at 0]
> >> Create node 2(0)
> >> Add child 2 to  1
> >> Location (64,20)
> >> LT 1 (3)
> >> LT 1  (3)
> >> LT 2 (])
> >> LT 1 (3)
> >> LT 1 (3)
> >>  LT 1 (3)
> >> RecognitionException: MismatchedTokenException(0!=0)
> >> Begin resync
> >> LT 1 (3)
> >> Consume [3/<15>,1:4, at 1]
> >> LT 1 (])
> >> Consume []/<35>,1:5 at 2]
> >> LT 1 (;)
> >> ...
> >>  ...
> >>
> >> On 29/03/11 12:16, Jim Idle wrote:
> >>> Did you use the debugger instead of the  interpreter?
> >>>
> >>>  Jim
> >>>
> >>>> -----Original  Message-----
> >>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >>>> bounces at antlr.org] On Behalf Of Fabien  Hermenier
> >>>> Sent: Tuesday, March 29, 2011 10:37  AM
> >>>> To: antlr-interest at antlr.org
> >>>>  Subject: Re: [antlr-interest] Spaces  issues
> >>>>
> >>>> On 29/03/11 07:36, John B. Brodie wrote:
> >>>>>  Greetings!
> >>>>>
> >>>>> On Tue, 2011-03-29 at  00:47 -0600, Fabien Hermenier wrote:
> >>>>>>  Hi
> >>>>>>
> >>>>>> I am starting to use ANTLR3 with AntlrWorks 3.4.1 on OS X and I have
> >>>>>> some issues with spaces. I've attached a sample ANTLR file describing
> >>>>>> my grammar (see 1st grammar).
> >>>>>>
> >>>>>> I'm trying to test 'litteralRange'. Using the interpreter, I write
> >>>>>> "[2 ..3]" or "[2 .. 3]" as input and it works fine. However, if I
> >>>>>> give the string "[2..3]" it does not work. I have followed the
> >>>>>> tutorial and declared the lexer rule WS with the channel hidden to
> >>>>>> ignore spaces, but I still have strange issues with this.
> >>>>>>
> >>>>>> Another strange fact is that if I write a reduced grammar that just
> >>>>>> isolates the rule I want to test, it works fine (see 2nd grammar).
> >>>>>>
> >>>>>> Does anyone have a solution or a hint?
> >>>>>>
> >>>>> ....good stuff  snipped....
> >>>>>
> >>>>> see Jim Idle's WIKI  entry:
> >>>>>
> >>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
> >>>>>
> >>>>> Hope this  helps...
> >>>>>        -jbb
> >>>>>
> >>>>>
> >>>> Thanks, I still have a question. I understand how difficult it is to
> >>>> capture '..' while also having to handle float numbers such as ".3".
> >>>> But in my case, I only have to deal with integer values, so currently
> >>>> I don't see why I need to help the lexer.
> >>>> I have reduced the number of fragments following the principle of the
> >>>> link you sent me (to catch numbers in base 10, 16, or 8 in a single
> >>>> rule), but it hasn't solved my problem yet.
> >>>>
> >>>>
> >>>>
> >>>>  List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >>>>  Unsubscribe:
> >>>> http://www.antlr.org/mailman/options/antlr-interest/your-
> >>>>  email-address

