[antlr-interest] Re: Grammar Problem

johnclarke72 johnclarke at hotmail.com
Tue Jun 4 13:53:41 PDT 2002


Thanks for your help.

John
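
For anyone reading this in the archive: the explanation quoted below (the main lexer eats '<' before the switch, so the tag lexer never produces INITSTARTTAG) can be illustrated with a small hand-written stand-in. This is only a rough sketch and does not use ANTLR at all. The class name, the TEXT and OTHER token names, and the choice to switch back to the main mode on '>' are assumptions made for illustration; the grammar in this thread only pops the selector for comments and end tags.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in for the main-lexer / tag-lexer pair. NOT ANTLR code.
class TwoModeLexerSketch {

    // Characters that end a TAGDATA run, mirroring the TAGDATA rule in the thread.
    private static final String STOP_CHARS = " \r\n\t<>/!=";

    // Returns the token names a parser reading from the selector would receive.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean inTag = false; // false = "main lexer", true = "tag lexer"
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (!inTag) {
                if (c == '<') {
                    // The main lexer consumes '<' itself and only then switches,
                    // so no token for '<' ever comes from the tag lexer.
                    inTag = true;
                    i++;
                } else {
                    int start = i;
                    while (i < input.length() && input.charAt(i) != '<') i++;
                    tokens.add("TEXT(" + input.substring(start, i) + ")"); // hypothetical token
                }
            } else if (c == '>') {
                tokens.add("FINISHSTARTTAG");
                inTag = false; // assumption: return to the main mode after '>'
                i++;
            } else if (Character.isWhitespace(c)) {
                i++; // whitespace is skipped, like the WS rule
            } else {
                int start = i;
                while (i < input.length() && STOP_CHARS.indexOf(input.charAt(i)) < 0) i++;
                if (i > start) {
                    tokens.add("TAGDATA(" + input.substring(start, i) + ")");
                } else {
                    tokens.add("OTHER(" + c + ")"); // hypothetical catch-all, keeps the loop moving
                    i++;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [TAGDATA(test), FINISHSTARTTAG]
        System.out.println(tokenize("<test>"));
    }
}
```

The token stream for 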

--- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> --- johnclarke72 <johnclarke at h...> wrote:
> > I put this in the tag parser because I want to go on to write the
> > rules that will allow it to process HTML attributes (which may or may
> > not exist).  It seems that putting the description of what a whole
> > tag looks like in the parser is the best approach.
> > 
> > The main lexer does switch to the tag lexer when it sees <.  
> 
> The switch to the tag lexer will be done *after* the consumption of '<' by
> the main lexer. The tag lexer will never see it, so the token INITSTARTTAG
> will never be generated. Modify the startHTMLTag rule to start after '<':
> 
> startHTMLTag : tagName:TAGDATA
>                {System.out.println("STARTTAG : "+tagName.getText());}
>                FINISHSTARTTAG;
>  
> On the other hand, I'm not sure you really need a separate parser for tags
> (although you probably need the embedded lexer). The parser doesn't have to
> know about the lexer switches.
> 
> Bogdan
> 
> > How can I get this to work correctly ?
> > 
> > Thanks
> > 
> > John
> > 
> > --- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> > > Hi,
> > > 
> > > Why does the rule startHTMLTag start with INITSTARTTAG, while the others
> > > do not?
> > > It seems that you use an embedded lexer and parser for HTML tags. You
> > > probably have in the main lexer a rule that recognizes '<' and switches
> > > the lexer. The Tag Parser is connected to the second lexer, and will never
> > > receive the INITSTARTTAG token it is expecting in the rule startHTMLTag.
> > > 
> > > Try:
> > > startHTMLTag : /* INITSTARTTAG removed */ tagName:TAGDATA
> > >                 {System.out.println("STARTTAG : "+tagName.getText());}
> > >                 FINISHSTARTTAG;
> > >  
> > > Bogdan
> > > 
> > > 
> > > --- johnclarke72 <johnclarke at h...> wrote:
> > > > Hi,
> > > >    I am currently having problems with an HTML Grammar that I am
> > > > writing.  The Grammar has been added to the end of this e-mail.
> > > > 
> > > > When I enter HTML comments (<!-- This is a Comment -->) and End Tags
> > > > (</endTag>) it handles them correctly.
> > > > 
> > > > However, if I enter <test> or anything similar to this I get
> > > > a "line 1: unexpected token: test" error message.
> > > > 
> > > > How can I get this to work ?
> > > > 
> > > > I would be grateful for all advice offered regarding this.
> > > > 
> > > > John
> > > > 
> > > > HTMLTagLexer.g
> > > > ==============
> > > > 
> > > > // Import the required Classes
> > > > header
> > > > {
> > > >    import java.util.*;
> > > >    import antlr.*;
> > > > }
> > > > 
> > > > // Define the class
> > > > class HTMLTagLexer extends Lexer;
> > > > 
> > > > // Set the options for the Lexer
> > > > options
> > > > {
> > > >   k=3;                             // Set the look ahead to 3 Characters
> > > >   caseSensitive = false;           // Set Case Sensitivity to false
> > > >   charVocabulary = '\1' .. '\377'; // Set the Lexer Character Vocabulary
> > > >   testLiterals = false;            // Don't test against the Literals table
> > > >   exportVocab = HTMLTagLexer;      // The Grammar to export
> > > > }
> > > > 
> > > > // The routines that will enable us to switch between lexer states
> > > > {
> > > >    // The Current Selector
> > > >    TokenStreamSelector selector;
> > > > 
> > > >    // The method that will enable us to switch between lexer states
> > > >    public void setSelector(TokenStreamSelector tokenStreamSelector)
> > > >    {
> > > >      selector = tokenStreamSelector;
> > > >    }
> > > > }
> > > > 
> > > > // Define the Tokens required for the Grammar
> > > > 
> > > > // Various HTML Marker Tags
> > > > INITSTARTTAG   : "<";
> > > > FINISHSTARTTAG : ">";
> > > > EQUALS         : "=";
> > > > 
> > > > // HTML Comments
> > > > HTMLCOMMENT : "!--"! (options {greedy=false;} : .)* "-->"!
> > > >               {selector.pop();}
> > > >               ;
> > > > 
> > > > // Main HTML Tags Section.  This defines the Tag names,
> > > > // attributes and attribute values
> > > > 
> > > > // TAGDATA is used to define the Tag Name and names of
> > > > // attributes used within the tag
> > > > TAGDATA : (~(' ' | '\r' | '\n' | '\t' | '<' | '>' | '/' | '!' | '='))+;
> > > > 
> > > > // TAGVALUE is used to define the values for attributes
> > > > // used within the tags
> > > > 
> > > > 
> > > > // Definition of an End Tag
> > > > ENDTAG   : '/'! ( 'a'..'z' )+ ">"! {selector.pop();};
> > > > 
> > > > // Ignore all White Space
> > > > WS : ( ' '
> > > >      | '\t'
> > > >      | '\r' '\n' { newline(); }
> > > >      | '\n' { newline(); }
> > > >      )
> > > >      {$setType(Token.SKIP);} //ignore this token
> > > > ;
> > > > 
> > > > HTMLTagParser.g
> > > > ===============
> > > > 
> > > > // Define the class
> > > > class HTMLTagParser extends Parser;
> > > > 
> > > > // Set the options for the Parser
> > > > options
> > > > {
> > > >   importVocab = HTMLTagLexer;     // The Grammar to import
> > > > }
> > > > 
> > > > 
> > > > // The Parser Rules
> > > > processHTML:
> > > >    htmlComment:HTMLCOMMENT {System.out.println("COMMENT : "+htmlComment.getText().trim());}
> > > >    | startHTMLTag
> > > >    | endTag:ENDTAG {System.out.println("ENDTAG : "+endTag.getText());};
> > > > 
> > > > startHTMLTag : INITSTARTTAG tagName:TAGDATA
> > > >                {System.out.println("STARTTAG : "+tagName.getText());}
> > > >                FINISHSTARTTAG;
> > > > 
> > > > 
> > > > 
> > > > 
> > > >  
> > > > 
> > > > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> >  
> > 
> > 
> > 
> > 
> 
> 
> 


 



