[antlr-interest] [Fwd: Ignoring tokens in AnTLR+Python]

Daniel Hernandez Bahr dbahr at estudiantes.uci.cu
Fri Mar 5 06:37:34 PST 2010


I should add that I DO have a whitespace rule defined as follows:

protected WS:   ( ' '
                |    '\t'
                |    '\f'
                // handle newlines
                |   (    ("\r\n") => "\r\n"  // Evil DOS
                    |    '\r'    // Macintosh
                    |    '\n'    // Unix (the right way)
                    )
                    { $newline; }
                )
                { _ttype = SKIP; }
                ;

Is there anything wrong with the rule? I have no clue why I am 
getting the "UNEXPECTED CHAR" error.

Does anyone?
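
For illustration only, the behaviour expected from a skipping whitespace rule can be sketched in plain Python 3. This is a hand-written regex tokenizer, not ANTLR-generated code; the `tokenize` helper and the WORD and STRINGLIT patterns are assumptions made up for the example, since the real lexer rules are not shown in this thread. With a whitespace rule reachable at the top level, newlines are matched and dropped; without one, the first newline (0xA) is reported as an unexpected character.

```python
import re

# Hypothetical token patterns, loosely modelled on the grammar in this thread.
STRINGLIT = ("STRINGLIT", r'"[^"\n]*"')
WORD = ("WORD", r"[A-Za-z_][A-Za-z0-9_]*")
EQUAL = ("EQUAL", r"=")
WS = ("WS", r"[ \t\f\r\n]+")


def tokenize(text, rules):
    """Return (type, text) pairs, discarding anything matched by the WS rule."""
    master = re.compile("|".join("(?P<%s>%s)" % rule for rule in rules))
    tokens = []
    pos = 0
    while pos < len(text):
        match = master.match(text, pos)
        if match is None:
            # No top-level rule can start at this character.
            raise ValueError("UNEXPECTED CHAR: 0x%X" % ord(text[pos]))
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


SAMPLE = 'SWIG_LDFLAGS="$LDFLAGS"\nINSTALL="$abs_srcdir/$INSTALL"\n'

# Whitespace rule available at the top level: newlines are skipped silently.
with_ws = tokenize(SAMPLE, [STRINGLIT, WORD, EQUAL, WS])

# No whitespace rule available at the top level: the first newline fails.
try:
    tokenize(SAMPLE, [STRINGLIT, WORD, EQUAL])
    error = None
except ValueError as exc:
    error = str(exc)
```

Whether the ANTLR lexer fails for the same reason is only a guess; the sketch just shows that this kind of message appears when no top-level rule can consume the newline.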

Best regards,

D.H. Bahr
Daniel Hernandez Bahr wrote:
> Hi all.
>
> Sorry for posting the whole "parsing files" thing, it was rather stupid 
> and I figured it out later.
>
> The thing is, I have defined the rules for the assignments and macros, 
> and made a sample input file containing only such instructions. I'm 
> posting the first lines so you have a clearer idea:
>
> SWIG_LDFLAGS="$LDFLAGS"
> INSTALL="$abs_srcdir/$INSTALL"
> APR_VER_REGEXES=["0\.9\.[7-9] 0\.9\.1[0-9] 1\."]
> APU_VER_REGEXES=["0\.9\.[7-9] 0\.9\.1[0-9] 1\."]
>
> As can be seen, there are only assignments in the first four lines, and 
> the assignment rule looks like this:
>
> sentences   :   sentence (sentences)?;
>
> sentence    :   assignment | macro;
>
> assignment  :   w:WORD EQUAL^ v:value
>             {
>                 w = w.getText()
>                 print w, "::",
>                 if not w[0].isalpha() or "." in w or "-" in w:
>                     raise Exception("%s is not a valid identifier" % w)
>             }
>             ;
>
> value       :   WORD | s:STRINGLIT
>             {
>                 print s.getText()
>             }
>             ;
>
> Yet when I run the lexer/parser script I get this:
>
> "$LDFLAGS"
> SWIG_LDFLAGS :: UNEXPECTED CHAR: 0xA
>
> Does anyone know what I am doing wrong here?
>
> Best regards,
>
> D.H. Bahr
> Daniel Hernandez Bahr wrote:
>   
>> I am back.
>>
>> I've just realized that the ignoring should be done in the parser (not in 
>> the lexer), so I made some adjustments and tried the construction again:
>>
>> sentence: assignment | macro | other;
>> other: ~(assignment | macro);
>>
>> and now I get an error saying that the subrule cannot be inverted. Only subrules 
>> of the form:
>>     (T1|T2|T3...) or
>>     ('c1'|'c2'|'c3'...)
>> may be inverted (ranges are also allowed).
>>
>> So I am back to the same problem:
>>
>> How do I ignore the other sentences I don't need?
>>
>> Best regards,
>>
>> D.H. Bahr.
>>
>> -------- Original Message --------
>> Subject: 	[antlr-interest] Ignoring tokens in AnTLR+Python
>> Date: 	Thu, 04 Mar 2010 10:14:23 -0500
>> From: 	Daniel Hernandez Bahr <dbahr at estudiantes.uci.cu>
>> To: 	antlr-interest at antlr.org <antlr-interest at antlr.org>
>> References: 
>> <4a051d931003031537ib220a57jf896cd43fbb5d319 at mail.gmail.com> 
>> <eae205eee3744a458a11b871a47d2bfe at temporal-wave.com> 
>> <9362e74e1003040511s48ff2e25h828466dc5639aea1 at mail.gmail.com>
>>
>>
>>
>> Hello everyone!
>>
>> I am fairly new to ANTLR. I am working on an interpreter for 
>> configuration files ('configure.ac' files, I should say), but I don't 
>> need to scan every single token in the files, only variable assignments 
>> and the occasional macro, so my question is:
>>
>> How can I ignore every other sentence in the files?
>>
>> At first I intended to do something like
>>
>> SENTENCE: ASSIGNMENT | MACRO | OTHER;
>> OTHER: ~(ASSIGNMENT | MACRO);
>>
>> but I get an error that ~TOKEN is not allowed in the lexer. Is there a way to achieve 
>> this without having to define the entire grammar of 'configure.ac' files?
>>
>> Best regards,
>>
>> D.H. Bahr
>>
>> PS: As noted in the subject, I am using Python and not Java or C.
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


