[antlr-interest] Spaces issues
Loring Craymer
lgcraymer at yahoo.com
Tue Mar 29 22:46:28 PDT 2011
Now you are getting confused by COMBINED grammars. ANTLR generates a distinct
lexer and parser from a combined grammar: the lexer (the capitalized rules in
the "combined grammar") tokenizes the input, while the parser operates only on
the tokens the lexer produces. Regroup your rules to separate the parser rules
from the lexer rules (this only makes them more readable and has no functional
impact), and then consider whether the lexer rules do what you want (clearly,
they don't in their current incarnation). Lexer rules should be considered as
alternatives for tokens; in fact, ANTLR generates a master lexer rule which
basically takes the form
Tokens : A | B | ... | Z ;
where A, B, and so forth are the lexer rule names. Every lexer rule competes for
the next token, whether or not the parser rule you are testing refers to it.
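
To illustrate how a lexer rule that "litteralRange" never mentions can still
break "[2..3]", here is a toy Python model of a lexer that picks a rule from
limited lookahead and then commits to it. It is a sketch, not ANTLR's generated
code: the one-character lookahead, the recovery step (drop one character and
retry), the DOTDOT token name, and all function names are assumptions made for
illustration, and only the characters needed for the range inputs are handled.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)
# Characters that can begin one InnerContent element (assumed from the grammar).
INNER_START = LETTERS | DIGITS | set("_-.")


class LexError(Exception):
    def __init__(self, offset):
        super().__init__(f"no viable character at offset {offset}")
        self.offset = offset


def match_digits(text, i):
    """Return the end offset of the run of digits starting at i."""
    while i < len(text) and text[i] in DIGITS:
        i += 1
    return i


def match_innercontent(text, i):
    """INNERCONTENT : InnerContent+ driven by one character of lookahead."""
    while i < len(text) and text[i] in INNER_START:
        if text[i] == ".":
            # Alternative '.' (Letter|Digit): once the '.' is taken the rule
            # is committed; this model never backtracks.
            if i + 1 >= len(text) or text[i + 1] not in (LETTERS | DIGITS):
                raise LexError(i + 1)
            i += 2
        else:
            i += 1
    return i


def tokenize(text, with_innercontent):
    """Return (tokens, error_offsets). The parser is never consulted."""
    tokens, errors, i = [], [], 0
    while i < len(text):
        c = text[i]
        if c == " ":
            i += 1  # WS goes to the hidden channel
        elif c in "[]":
            tokens.append((c, c))
            i += 1
        elif text.startswith("..", i):
            tokens.append(("DOTDOT", ".."))
            i += 2
        elif c in DIGITS:
            end = match_digits(text, i)
            follower = text[end:end + 1]
            if with_innercontent and follower in INNER_START:
                # Only INNERCONTENT can continue past the digits, so it wins.
                try:
                    end = match_innercontent(text, i)
                    tokens.append(("INNERCONTENT", text[i:end]))
                    i = end
                except LexError as err:
                    errors.append(err.offset)
                    i = err.offset + 1  # assumed recovery: skip one character
            else:
                tokens.append(("NUMBER", text[i:end]))
                i = end
        else:
            errors.append(i)  # character outside this toy model
            i += 1
    return tokens, errors
```

In this model, tokenize("[2..3]", True) loses the first number and the range
delimiter and delivers only '[', NUMBER 3 and ']', which is at least consistent
with the event list posted earlier in the thread; tokenize("[2 ..3]", True)
yields the full token sequence because the space ends the first token before
the '.' is seen.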
--Loring
----- Original Message ----
> From: Fabien Hermenier <hermenierfabien at gmail.com>
> To: antlr-interest at antlr.org
> Sent: Tue, March 29, 2011 7:59:54 PM
> Subject: Re: [antlr-interest] Spaces issues
>
> Hi
>
> I have reduced the number of fragments to zero for test purposes, but it
> does not solve the problem.
> So I have reduced the grammar to a minimum, just enough to parse the
> input I gave you.
> It now appears that the lexer rule "INNERCONTENT" is causing the issue.
> This is strange to me, as it is not used in the rule "litteralRange".
>
> Does anyone know how this is possible?
>
> Thanks for your help
> Fabien.
>
> On 29/03/11 19:53, Loring Craymer wrote:
> > The likely cause of your problems is the extensive use of fragment rules.
> > ANTLR 3 does not use follow sets in lexers and invocation of fragment rules
> > usually disables LL* processing. Inline your fragment rules, and your
> > current problems should disappear, although others may still lurk.
> >
> > --Loring
> >
> >
> > ----- Original Message ----
> >> From: Fabien Hermenier<hermenierfabien at gmail.com>
> >> To: antlr-interest at antlr.org
> >> Sent: Tue, March 29, 2011 12:51:47 PM
> >> Subject: Re: [antlr-interest] Spaces issues
> >>
> >> Here is my entire grammar
> >> There are a lot of commented-out rules, and "litteralRange" does not have
> >> its complete definition because simpler patterns do not work yet.
> >> Currently, litteralRange should accept inputs such as "[2..3]" or "[ 2
> >> .. 0xFF]".
> >>
> >> Thanks for your help!
> >>
> >> ---
> >> grammar ANTLRVJob5;
> >>
> >> options {
> >> language = Java;
> >> output = AST;
> >> }
> >> fragment Digit :'0'..'9';
> >> fragment Letter :'a'..'z'|'A'..'Z';
> >> fragment Name : Domain ('.' Domain)*;
> >> fragment Domain: Letter ('-'?(Letter|Digit))*;
> >> fragment VarPrefix: '$';
> >> fragment EnumSep: ',';
> >> fragment InnerContent: (Letter
> >> |Digit
> >> |'_'
> >> |'-'
> >> |'.'(Letter|Digit));
> >> fragment RRange: ']' (InnerContent*(Letter|Digit))?;
> >> fragment LRange: (Letter (Digit|Letter|'-'|'_'|'.')*)? '[';
> >>
> >> //Number litteral section
> >> fragment HEX_LITERAL : ;
> >> fragment OCTAL_LITERAL :;
> >> fragment DECIMAL_LITERAL:;
> >> NUMBER: '0'(
> >> ('x'|'X') { $type = HEX_LITERAL;}
> >> (Digit|'a'..'f'|'A'..'F')+
> >> |
> >> ('0'..'7')+ {$type = OCTAL_LITERAL;}
> >> |
> >> )
> >> |
> >> '1'..'9' Digit* {$type = DECIMAL_LITERAL;}
> >> ;
> >>
> >> NAME: Name;
> >> ENUMSEP: EnumSep;
> >> EQUALS : '=';
> >> ENDL : ';';
> >> PLUS : '+';
> >> MINUS : '-';
> >> TIMES : '*';
> >> VARIABLE: VarPrefix(Letter|'_')(Letter|Digit|'_')*;
> >>
> >> COMMENT
> >> : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
> >> | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
> >> ;
> >>
> >> WS : ('\n'|'\r'|'\t'|' ') {$channel=HIDDEN;};
> >>
> >>
> >> INNERCONTENT: InnerContent+;
> >> RRANGE:RRange;
> >> LRANGE: LRange;
> >> LVRANGE: VarPrefix LRange;
> >> CONSTRAINTIDENTIFIER: Letter(Letter|Digit|'_')*'(';
> >>
> >> litteral: NAME|NUMBER;
> >> operator: PLUS|TIMES;
> >>
> >> //litteralRange: LRANGE INTEGER '..' INTEGER RRANGE;
> >> litteralRange: '[' NUMBER '..' NUMBER ']';
> >>
> >> litteralEnum: LRANGE INNERCONTENT /*(ENUMSEP INNERCONTENT)+']' RRANGE*/;
> >>
> >> variableEnum: LVRANGE INNERCONTENT (ENUMSEP INNERCONTENT)+ RRANGE;
> >> variableRange: LVRANGE NUMBER '..' NUMBER RRANGE;
> >>
> >> explodedSet:('{}'| '{'expression (ENUMSEP expression)*'}');
> >>
> >> atom : '(' expression ')'
> >> |litteral
> >> // |VARIABLE
> >> |litteralRange
> >> // |litteralEnum
> >> // |variableRange
> >> // |variableEnum
> >> // |explodedSet
> >> ;
> >>
> >>
> >> expression: atom/* (operator expression)?*/;
> >>
> >> var_decl: VARIABLE EQUALS expression ';';
> >>
> >> /*forEachStatement:
> >> 'foreach' VARIABLE 'in' expression '{'
> >> instruction*
> >> '}';
> >>
> >> constraintCallStatement: CONSTRAINTIDENTIFIER expression (','
> >> expression)* ')' ';';
> >> */
> >> instruction: var_decl;
> >> //|forEachStatement
> >> // |constraintCallStatement;
> >>
> >> vjob_decl: instruction*;
> >> ---
> >>
> >> On 29/03/11 12:47, Jim Idle wrote:
> >>> Looks like you might be looking for a token that you have not defined,
> >>> but post your grammar as it stands now and we can work it out.
> >>>
> >>> Jim
> >>>
> >>>> -----Original Message-----
> >>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >>>> bounces at antlr.org] On Behalf Of Fabien Hermenier
> >>>> Sent: Tuesday, March 29, 2011 11:25 AM
> >>>> To: antlr-interest at antlr.org
> >>>> Subject: Re: [antlr-interest] Spaces issues
> >>>>
> >>>> Yes, and in this situation it seems to ignore the first number and
> >>>> the range delimiter:
> >>>> Here is a sample of the event list with the input "[2..3]" and the
> >>>> starting rule "litteralRange"
> >>>>
> >>>> Consume [[/<32>,1:0, at 0]
> >>>> Create node 2(0)
> >>>> Add child 2 to 1
> >>>> Location (64,20)
> >>>> LT 1 (3)
> >>>> LT 1 (3)
> >>>> LT 2 (])
> >>>> LT 1 (3)
> >>>> LT 1 (3)
> >>>> LT 1 (3)
> >>>> RecognitionException: MismatchedTokenException(0!=0)
> >>>> Begin resync
> >>>> LT 1 (3)
> >>>> Consume [3/<15>,1:4, at 1]
> >>>> LT 1 (])
> >>>> Consume []/<35>,1:5 at 2]
> >>>> LT 1 (;)
> >>>> ...
> >>>> ...
> >>>>
> >>>> On 29/03/11 12:16, Jim Idle wrote:
> >>>>> Did you use the debugger instead of the interpreter?
> >>>>>
> >>>>> Jim
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >>>>>> bounces at antlr.org] On Behalf Of Fabien Hermenier
> >>>>>> Sent: Tuesday, March 29, 2011 10:37 AM
> >>>>>> To: antlr-interest at antlr.org
> >>>>>> Subject: Re: [antlr-interest] Spaces issues
> >>>>>>
> >>>>>> On 29/03/11 07:36, John B. Brodie wrote:
> >>>>>>> Greetings!
> >>>>>>>
> >>>>>>> On Tue, 2011-03-29 at 00:47 -0600, Fabien Hermenier wrote:
> >>>>>>>> Hi
> >>>>>>>>
> >>>>>>>> I am starting to use ANTLR3 with AntlrWorks 3.4.1 on OS X and I have
> >>>>>>>> some issues with spaces. I've attached a sample ANTLR file describing
> >>>>>>>> my grammar (see 1st grammar).
> >>>>>>>>
> >>>>>>>> I'm trying to test 'litteralRange'. Using the interpreter, I enter
> >>>>>>>> "[2 ..3]" or "[2 .. 3]" as input and it works fine. However, if I
> >>>>>>>> give the string "[2..3]" it does not work. I followed the tutorial
> >>>>>>>> and declared the lexer rule WS with the hidden channel to ignore
> >>>>>>>> spaces, but I still have strange issues with this.
> >>>>>>>>
> >>>>>>>> Another strange fact is that if I write a reduced grammar that just
> >>>>>>>> isolates the rule I want to test, it works fine (see 2nd grammar).
> >>>>>>>>
> >>>>>>>> Does anyone have a solution or a hint ?
> >>>>>>>>
> >>>>>>> ....good stuff snipped....
> >>>>>>>
> >>>>>>> see Jim Idle's WIKI entry:
> >>>>>>>
> >>>>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
> >>>>>>>
> >>>>>>> (the above URL is supposed to be all on one line, without white space)
> >>>>>>>
> >>>>>>> Hope this helps...
> >>>>>>> -jbb
> >>>>>>>
> >>>>>>>
> >>>>>> Thanks, I still have a question. I understand how difficult it is to
> >>>>>> capture '..' while having to deal with float numbers such as ".3".
> >>>>>> But in my case, I only have to deal with integer values, so currently
> >>>>>> I don't see why I need to help the lexer.
> >>>>>> I have reduced the number of fragments following the principle of the
> >>>>>> link you sent me (to catch numbers in base 10, 16 or 8 in a single
> >>>>>> rule), but it hasn't solved my problem yet.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >>>>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address