[antlr-interest] Re: Grammar Problem
Bogdan Mitu
bogdan_mt at yahoo.com
Tue Jun 4 08:03:38 PDT 2002
--- johnclarke72 <johnclarke at hotmail.com> wrote:
> I put this in the tag parser because I want to go on to write the
> rules that will allow it to process HTML attributes (which may or may
> not exist). It seems that putting the description of what a whole
> tag looks like in the parser is the best approach.
>
> The main lexer does switch to the tag lexer when it sees <.
The main lexer switches to the tag lexer only *after* it has consumed the
'<'. The tag lexer therefore never sees that character, so the token
INITSTARTTAG is never generated. Modify the startHTMLTag rule to start after '<':
startHTMLTag : tagName:TAGDATA
{System.out.println("STARTTAG : "+tagName.getText());}
FINISHSTARTTAG;
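
To make the hand-off concrete, here is a minimal plain-Java sketch with no ANTLR dependency. The names (LexerSwitchDemo, tokenize, the TEXT and OTHER labels) are made up for illustration, and handing control back to the main lexer on '>' is my assumption. It only imitates the behaviour described above; it is not how ANTLR implements the switch.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerSwitchDemo {
    // Hypothetical stand-in for two lexers reading the same character stream.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean inTag = false; // false: main lexer active, true: tag lexer active
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int start = i;
            if (!inTag && c == '<') {
                i++;           // the main lexer consumes '<' ...
                inTag = true;  // ... and only then switches to the tag lexer
            } else if (!inTag) {
                // Main lexer: everything up to the next '<' is plain text.
                while (i < input.length() && input.charAt(i) != '<') i++;
                tokens.add("TEXT(" + input.substring(start, i) + ")");
            } else if (c == '>') {
                i++;
                tokens.add("FINISHSTARTTAG");
                inTag = false; // assumption: '>' hands control back
            } else if (Character.isWhitespace(c)) {
                i++;           // the tag lexer skips white space
            } else {
                // Tag lexer: a run of name characters becomes TAGDATA.
                while (i < input.length()
                        && " \t\r\n<>/!=".indexOf(input.charAt(i)) < 0) i++;
                if (i == start) {
                    i++;       // a lone '<', '/', '!' or '=' inside a tag
                    tokens.add("OTHER(" + c + ")");
                } else {
                    tokens.add("TAGDATA(" + input.substring(start, i) + ")");
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<test>"));
    }
}
```

For the input 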
On the other hand, I'm not sure you really need a separate parser for tags
(although you probably need the embedded lexer). The parser doesn't have to
know about the lexer switches.
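
As a rough illustration of that last point, the sketch below has one "parser" loop that only ever asks a selector for the next token, while the lexers push and pop themselves behind it. It is loosely modelled on the selector.pop() calls in your grammar, but every class here (SelectorDemo, Selector, MainLexer, TagLexer, Input) is a hypothetical stand-in written in plain Java, not ANTLR's own classes, and popping on '>' is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SelectorDemo {
    /** Hands out one token per call; null means end of input. */
    interface TokenSource { String nextToken(); }

    /** Shared character input, so both lexers advance the same position. */
    static class Input {
        final String text;
        int pos = 0;
        Input(String text) { this.text = text; }
        boolean atEnd() { return pos >= text.length(); }
        char peek() { return text.charAt(pos); }
    }

    /** Hypothetical selector: forwards nextToken() to the lexer on top of the stack. */
    static class Selector implements TokenSource {
        private final Deque<TokenSource> stack = new ArrayDeque<>();
        void push(TokenSource s) { stack.push(s); }
        void pop() { stack.pop(); }
        public String nextToken() { return stack.peek().nextToken(); }
    }

    static class TagLexer implements TokenSource {
        final Input in;
        final Selector selector;
        TagLexer(Input in, Selector selector) { this.in = in; this.selector = selector; }
        public String nextToken() {
            while (!in.atEnd() && Character.isWhitespace(in.peek())) in.pos++;
            if (in.atEnd()) return null;
            if (in.peek() == '>') {
                in.pos++;
                selector.pop(); // assumption: '>' returns control to the main lexer
                return "FINISHSTARTTAG";
            }
            int start = in.pos;
            while (!in.atEnd() && in.peek() != '>'
                    && !Character.isWhitespace(in.peek())) in.pos++;
            return "TAGDATA(" + in.text.substring(start, in.pos) + ")";
        }
    }

    static class MainLexer implements TokenSource {
        final Input in;
        final Selector selector;
        final TokenSource tagLexer;
        MainLexer(Input in, Selector selector, TokenSource tagLexer) {
            this.in = in; this.selector = selector; this.tagLexer = tagLexer;
        }
        public String nextToken() {
            if (in.atEnd()) return null;
            if (in.peek() == '<') {
                in.pos++;                    // '<' is consumed here
                selector.push(tagLexer);     // switch lexers
                return selector.nextToken(); // the tag lexer produces the next token
            }
            int start = in.pos;
            while (!in.atEnd() && in.peek() != '<') in.pos++;
            return "TEXT(" + in.text.substring(start, in.pos) + ")";
        }
    }

    /** The "parser": it reads from the selector and never sees a switch. */
    static List<String> parse(String text) {
        Input in = new Input(text);
        Selector selector = new Selector();
        selector.push(new MainLexer(in, selector, new TagLexer(in, selector)));
        List<String> seen = new ArrayList<>();
        for (String t = selector.nextToken(); t != null; t = selector.nextToken()) {
            seen.add(t);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(parse("a<b>c"));
    }
}
```

The parse loop receives one flat token sequence, so a single grammar could describe both the text and the tag tokens.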
Bogdan
> How can I get this to work correctly ?
>
> Thanks
>
> John
>
> --- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> > Hi,
> >
> > Why does rule startHTMLTag start with INITSTARTTAG, while the others
> > do not? It seems that you use an embedded lexer and parser for HTML
> > tags. You probably have a rule in the main lexer that recognizes '<'
> > and switches the lexer. The Tag Parser is connected to the second
> > lexer, and will never receive the INITSTARTTAG token it is expecting
> > in the rule startHTMLTag.
> >
> > Try:
> > startHTMLTag : /* INITSTARTTAG removed */ tagName:TAGDATA
> > {System.out.println("STARTTAG : "+tagName.getText());}
> > FINISHSTARTTAG;
> >
> > Bogdan
> >
> >
> > --- johnclarke72 <johnclarke at h...> wrote:
> > > Hi,
> > > I am currently having problems with an HTML grammar that I am
> > > writing. The grammar has been added to the end of this e-mail.
> > >
> > > When I enter HTML comments (<!-- This is a Comment -->) and end
> > > tags (</endTag>) it handles them correctly.
> > >
> > > However, if I enter <test> or anything similar to this I get
> > > a "line 1: unexpected token: test" error message.
> > >
> > > How can I get this to work ?
> > >
> > > I would be grateful for all advice offered regarding this.
> > >
> > > John
> > >
> > > HTMLTagLexer.g
> > > ==============
> > >
> > > // Import the required Classes
> > > header
> > > {
> > > import java.util.*;
> > > import antlr.*;
> > > }
> > >
> > > // Define the class
> > > class HTMLTagLexer extends Lexer;
> > >
> > > // Set the options for the Lexer
> > > options
> > > {
> > > k=3;                              // Set the look ahead to 3 Characters
> > > caseSensitive = false;            // Set Case Sensitivity to false
> > > charVocabulary = '\1' .. '\377';  // Set the Lexer Character Vocabulary
> > > testLiterals = false;             // Don't test against the Literals table
> > > exportVocab = HTMLTagLexer;       // The Grammar to export
> > > }
> > >
> > > // The routines that will enable us to switch between lexer states
> > > {
> > > // The Current Selector
> > > TokenStreamSelector selector;
> > >
> > > // The method that will enable us to switch between lexer states
> > > public void setSelector(TokenStreamSelector tokenStreamSelector)
> > > {
> > > selector = tokenStreamSelector;
> > > }
> > > }
> > >
> > > // Define the Tokens required for the Grammar
> > >
> > > // Various HTML Marker Tags
> > > INITSTARTTAG : "<";
> > > FINISHSTARTTAG : ">";
> > > EQUALS : "=";
> > >
> > > // HTML Comments
> > > HTMLCOMMENT : "!--"! (options {greedy=false;} : .)* "-->"!
> > > {selector.pop();}
> > > ;
> > >
> > > // Main HTML Tags Section. This defines the Tag names,
> > > // attributes and attribute values
> > >
> > > // TAGDATA is used to define the Tag Name and names of
> > > // attributes used within the tag
> > > TAGDATA : (~(' ' | '\r' | '\n' | '\t' | '<' | '>' | '/' | '!' | '='))+;
> > >
> > > // TAGVALUE is used to define the values for attributes
> > > // used within the tags
> > >
> > >
> > > // Definition of an End Tag
> > > ENDTAG : '/'! ( 'a'..'z' )+ ">"! {selector.pop();};
> > >
> > > // Ignore all White Space
> > > WS : ( ' '
> > > | '\t'
> > > | '\r' '\n' { newline(); }
> > > | '\n' { newline(); }
> > > )
> > > {$setType(Token.SKIP);} //ignore this token
> > > ;
> > >
> > > HTMLTagParser.g
> > > ===============
> > >
> > > // Define the class
> > > class HTMLTagParser extends Parser;
> > >
> > > // Set the options for the Parser
> > > options
> > > {
> > > importVocab = HTMLTagLexer; // The Grammar to import
> > > }
> > >
> > >
> > > // The Parser Rules
> > > processHTML:
> > > htmlComment:HTMLCOMMENT
> > > {System.out.println("COMMENT : "+htmlComment.getText().trim());}
> > > | startHTMLTag
> > > | endTag:ENDTAG {System.out.println("ENDTAG : "+endTag.getText());};
> > >
> > > startHTMLTag : INITSTARTTAG tagName:TAGDATA
> > > {System.out.println("STARTTAG : "+tagName.getText());}
> > > FINISHSTARTTAG;
> > >
> > > Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/
> > >
> > >
> > >
> >
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Yahoo! - Official partner of 2002 FIFA World Cup
> > http://fifaworldcup.yahoo.com