[antlr-interest] Re: Grammar Problem

Tue Jun 4 06:57:32 PDT 2002

I put INITSTARTTAG in the tag parser because I want to go on to write the 
rules that will allow it to process HTML attributes (which may or may 
not be present).  It seems that putting the description of what a whole 
tag looks like in the parser is the best approach.
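
As a rough illustration of what such attribute rules could look like, here
is a hand-written Java sketch of a rule shaped like
"TAGDATA ( TAGDATA ( EQUALS TAGDATA )? )* FINISHSTARTTAG". It does not use
ANTLR. The class name, the {type, text} token shape and the choice to reuse
TAGDATA for attribute values are all assumptions made for the example (the
grammar leaves TAGVALUE undefined), and values containing whitespace are not
handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a start tag with zero or more optional attributes. */
public class StartTagSketch {

    /** Each token is modelled as {type, text}; this is not ANTLR's Token class. */
    public static Map<String, String> parseAttributes(List<String[]> tokens) {
        Map<String, String> attributes = new LinkedHashMap<>();
        int pos = 0;

        // Tag name. No INITSTARTTAG is expected in front of it.
        expect(tokens, pos++, "TAGDATA");

        // ( attribute )* : zero or more attributes, so a bare <test> is accepted.
        while (pos < tokens.size() && tokens.get(pos)[0].equals("TAGDATA")) {
            String name = tokens.get(pos++)[1];
            String value = ""; // attribute given without a value
            // ( EQUALS TAGDATA )? : the value part is optional.
            if (pos < tokens.size() && tokens.get(pos)[0].equals("EQUALS")) {
                pos++;
                expect(tokens, pos, "TAGDATA");
                value = tokens.get(pos++)[1];
            }
            attributes.put(name, value);
        }

        expect(tokens, pos, "FINISHSTARTTAG");
        return attributes;
    }

    private static void expect(List<String[]> tokens, int pos, String type) {
        if (pos >= tokens.size() || !tokens.get(pos)[0].equals(type)) {
            throw new IllegalStateException("expected " + type + " at token " + pos);
        }
    }

    public static void main(String[] args) {
        List<String[]> tokens = List.of(
            new String[] {"TAGDATA", "a"},
            new String[] {"TAGDATA", "href"},
            new String[] {"EQUALS", "="},
            new String[] {"TAGDATA", "x"},
            new String[] {"FINISHSTARTTAG", ">"});
        System.out.println(parseAttributes(tokens));
    }
}
```

One token of lookahead is enough at each decision point: TAGDATA starts
another attribute, EQUALS starts a value, and FINISHSTARTTAG ends the tag.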

The main lexer does switch to the tag lexer when it sees <.  
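
To make the consequence of that switch concrete, here is a toy model in
plain Java (no ANTLR) of a main lexer that consumes the "<" itself and then
hands over to a tag lexer. It assumes the "<" is matched by the main lexer
before the switch, which is how the setup is described in this thread; the
class and method names are made up for the sketch, and end tags and comments
are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of a main lexer that switches to a tag lexer on '<'. */
public class TagLexerSketch {

    /** Returns the token types a tag parser would receive for the first start tag. */
    public static List<String> tagParserTokens(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;

        // Main lexer: skip ordinary text until '<'.
        while (i < input.length() && input.charAt(i) != '<') {
            i++;
        }
        if (i == input.length()) {
            return tokens; // no tag found
        }
        i++; // '<' is consumed here, so the tag lexer never sees it

        // Tag lexer: emits TAGDATA / EQUALS / FINISHSTARTTAG and skips whitespace.
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '>') {
                tokens.add("FINISHSTARTTAG");
                break; // control would return to the main lexer here
            } else if (c == '=') {
                tokens.add("EQUALS");
                i++;
            } else if (Character.isWhitespace(c) || c == '<' || c == '/' || c == '!') {
                i++; // whitespace is skipped; end tags and comments are out of scope
            } else {
                int start = i;
                while (i < input.length() && "<>/!= \t\r\n".indexOf(input.charAt(i)) < 0) {
                    i++;
                }
                tokens.add("TAGDATA(" + input.substring(start, i) + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tagParserTokens("<test>"));
    }
}
```

In this model the first token the tag parser sees for 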

How can I get this to work correctly?

Thanks

John

--- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> Hi,
> 
> Why does rule startHTMLTag start with INITSTARTTAG, while the others do not?
> It seems that you use an embedded lexer and parser for HTML tags. You probably
> have in the main lexer a rule that recognizes '<' and switches the lexer. The
> Tag Parser is connected to the second lexer, and will never receive the
> INITSTARTTAG token it is expecting in the rule startHTMLTag.
> 
> Try:
> startHTMLTag : /* INITSTARTTAG removed */ tagName:TAGDATA
>                 {System.out.println("STARTTAG : "+tagName.getText());}
>                 FINISHSTARTTAG;
>  
> Bogdan
> 
> 
> --- johnclarke72 <johnclarke at h...> wrote:
> > Hi,
> >    I am currently having problems with an HTML Grammar that I am 
> > writing.  The Grammar has been added to the end of this e-mail.
> > 
> > When I enter HTML comments (<!-- This is a Comment -->) and End Tags
> > (</endTag>) it handles them correctly.
> > 
> > However, if I enter <test> or anything similar to this I get 
> > a "line 1: unexpected token: test" error message.
> > 
> > How can I get this to work?
> > 
> > I would be grateful for all advice offered regarding this.
> > 
> > John
> > 
> > HTMLTagLexer.g
> > ==============
> > 
> > // Import the required Classes
> > header
> > {
> >    import java.util.*;
> >    import antlr.*;
> > }
> > 
> > // Define the class
> > class HTMLTagLexer extends Lexer;
> > 
> > // Set the options for the Lexer
> > options
> > {
> >   k=3;                             // Set the look ahead to 3 Characters
> >   caseSensitive = false;           // Set Case Sensitivity to false
> >   charVocabulary = '\1' .. '\377'; // Set the Lexer Character Vocabulary
> >   testLiterals = false;            // Don't test against the Literals table
> >   exportVocab = HTMLTagLexer;      // The Grammar to export
> > }
> > 
> > // The routines that will enable us to switch between lexer states
> > {
> >    // The Current Selector
> >    TokenStreamSelector selector;
> > 
> >    // The method that will enable us to switch between lexer states
> >    public void setSelector(TokenStreamSelector tokenStreamSelector)
> >    {
> >      selector = tokenStreamSelector;
> >    }
> > }
> > 
> > // Define the Tokens required for the Grammar
> > 
> > // Various HTML Marker Tags
> > INITSTARTTAG   : "<";
> > FINISHSTARTTAG : ">";
> > EQUALS         : "=";
> > 
> > // HTML Comments
> > HTMLCOMMENT : "!--"! (options {greedy=false;} : .)* "-->"!
> >               {selector.pop();}
> >               ;
> > 
> > // Main HTML Tags Section.  This defines the Tag names,
> > // attributes and attribute values
> > 
> > // TAGDATA is used to define the Tag Name and names of
> > // attributes used within the tag
> > TAGDATA : (~(' ' | '\r' | '\n' | '\t' | '<' | '>' | '/' | '!' | '='))+;
> > 
> > // TAGVALUE is used to define the values for attributes
> > // used within the tags
> > 
> > 
> > // Definition of an End Tag
> > ENDTAG   : '/'! ( 'a'..'z' )+ ">"! {selector.pop();};
> > 
> > // Ignore all White Space
> > WS : ( ' '
> >      | '\t'
> >      | '\r' '\n' { newline(); }
> >      | '\n' { newline(); }
> >      )
> >      {$setType(Token.SKIP);} //ignore this token
> > ;
> > 
> > HTMLTagParser.g
> > ===============
> > 
> > // Define the class
> > class HTMLTagParser extends Parser;
> > 
> > // Set the options for the Parser
> > options
> > {
> >   importVocab = HTMLTagLexer;     // The Grammar to import
> > }
> > 
> > 
> > // The Parser Rules
> > processHTML:
> >    htmlComment:HTMLCOMMENT {System.out.println("COMMENT : "+htmlComment.getText().trim());}
> >    | startHTMLTag
> >    | endTag:ENDTAG {System.out.println("ENDTAG : "+endTag.getText());};
> > 
> > startHTMLTag : INITSTARTTAG tagName:TAGDATA
> >                {System.out.println("STARTTAG : "+tagName.getText());}
> >                FINISHSTARTTAG;
> > 
> > 
> > 
> > 
> >  
> > 
> > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
> > 
> > 
> > 
