[antlr-interest] [Fwd: Ignoring tokens in AnTLR+Python]

Daniel Hernandez Bahr dbahr at estudiantes.uci.cu
Fri Mar 5 06:22:24 PST 2010


Hi all.

Sorry for posting the whole "parsing files" thing; it was rather stupid, 
and I figured it out later.

The thing is, I have defined the rules for the assignments and macros, 
and made a sample input file containing only such instructions. I'm 
posting the first lines so you have a clearer idea:

SWIG_LDFLAGS="$LDFLAGS"
INSTALL="$abs_srcdir/$INSTALL"
APR_VER_REGEXES=["0\.9\.[7-9] 0\.9\.1[0-9] 1\."]
APU_VER_REGEXES=["0\.9\.[7-9] 0\.9\.1[0-9] 1\."]

As can be seen, there are only assignments in the first four lines. 
The assignment rule looks like this:

sentences   :   sentence (sentences)?;

sentence    :   assignment | macro;

assignment  :   w:WORD EQUAL^ v:value
            {
                w = w.getText()
                e = Exception ("%s is not a valid identifier" %(w))
                print w, "::",
                if (not w[0].isalpha()):
                    raise e
                else:
                    try:
                        index = w.index(".")
                        index = w.index("-")
                        raise e
                    except ValueError:
                        index = -1
            }
            ;

value       :   WORD | s:STRINGLIT
            {
                print s.getText()
            }
            ;
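A side note on the identifier check in the assignment action: as written, the try block raises only when the name contains both "." and "-", because the first failing index() call jumps straight to the except branch. Presumably the intent was to reject either character. Below is a minimal sketch of that check in plain Python, outside the grammar; check_identifier is a hypothetical helper name, and the rule it enforces (leading letter, no "." or "-") is only inferred from the action above.

```python
# Hypothetical helper, not part of the grammar above.
# Assumed rule: an identifier starts with a letter and contains
# neither "." nor "-".
def check_identifier(word):
    """Return word unchanged if it looks valid, otherwise raise ValueError."""
    if not word or not word[0].isalpha() or "." in word or "-" in word:
        raise ValueError("%s is not a valid identifier" % word)
    return word


check_identifier("SWIG_LDFLAGS")  # accepted, no exception raised
```

Inside the grammar action this would replace the nested try/except, which makes the either-or intent explicit.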

Yet when I run the lexer/parser script I get this:

"$LDFLAGS"
SWIG_LDFLAGS :: UNEXPECTED CHAR: 0xA

Does anyone know what I am doing wrong here?
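
For reference, 0xA is the line feed character. The lexer rules are not shown here, so it is only a guess that no lexer rule consumes line breaks. Independently of the grammar, one way to ignore everything except assignments is a line-based pre-filter in plain Python. The sketch below assumes each assignment sits on a single line of the form NAME=value, which 'configure.ac' does not guarantee in general; ASSIGNMENT and extract_assignments are hypothetical names, not part of ANTLR.

```python
import re

# Assumption: an assignment occupies one line of the form NAME=value, where
# NAME starts with a letter and contains only letters, digits and "_".
ASSIGNMENT = re.compile(r'^\s*([A-Za-z][A-Za-z0-9_]*)=(.*)$')


def extract_assignments(text):
    """Return (name, value) pairs for assignment lines; skip all other lines."""
    pairs = []
    for line in text.splitlines():  # splitlines() drops the line feeds (0xA)
        match = ASSIGNMENT.match(line)
        if match:
            pairs.append((match.group(1), match.group(2).strip()))
    return pairs


sample = 'SWIG_LDFLAGS="$LDFLAGS"\nAC_PROG_CC\nINSTALL="$abs_srcdir/$INSTALL"\n'
for name, value in extract_assignments(sample):
    print(name, "::", value)
```

The filtered lines could then be handed to the parser one at a time, so the grammar never sees the sentences it does not need to understand.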

Best regards,

D.H. Bahr
Daniel Hernandez Bahr wrote:
> I am back.
>
> I've just realized that the ignoring should be done in the parser (not in 
> the lexer), so I made some adjustments and tried the construction again:
>
> sentence: assignment | macro | other;
> other: ~(assignment | macro);
>
> and now I get an error saying that the subrule cannot be inverted. Only 
> subrules of the form:
>     (T1|T2|T3...) or
>     ('c1'|'c2'|'c3'...)
> may be inverted (ranges are also allowed).
>
> So I am back to the same problem:
>
> How do I ignore the other sentences I don't need?
>
> Best regards,
>
> D.H. Bahr.
>
> -------- Original Message --------
> Subject: 	[antlr-interest] Ignoring tokens in AnTLR+Python
> Date: 	Thu, 04 Mar 2010 10:14:23 -0500
> From: 	Daniel Hernandez Bahr <dbahr at estudiantes.uci.cu>
> To: 	antlr-interest at antlr.org <antlr-interest at antlr.org>
> References: 
> <4a051d931003031537ib220a57jf896cd43fbb5d319 at mail.gmail.com> 
> <eae205eee3744a458a11b871a47d2bfe at temporal-wave.com> 
> <9362e74e1003040511s48ff2e25h828466dc5639aea1 at mail.gmail.com>
>
>
>
> Hello everyone!
>
> I am fairly new to ANTLR. I am working on an interpreter for 
> configuration files ('configure.ac' files, I should say), but I don't 
> need to scan every single token in the files, only variable assignments 
> and the occasional macro. So my question is:
>
> How can I ignore every other sentence in the files?
>
> At first I intended to do something like
>
> SENTENCE: ASSIGNMENT | MACRO | OTHER;
> OTHER: ~(ASSIGNMENT | MACRO)
>
> but I get an error saying that ~TOKEN is not allowed in the lexer. Is there 
> a way to achieve this without having to define the entire grammar of 
> 'configure.ac' files?
>
> Best regards,
>
> D.H. Bahr
>
> PS: As noted in the subject, I am using Python, not Java or C.
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


