[antlr-interest] Re: Still having problems with the lexer code

johnclarke72 johnclarke at hotmail.com
Thu May 9 15:56:18 PDT 2002


Terence,
        Thanks for your reply.  I am extremely new to ANTLR and, to be
honest, although I think I understand the example, I can't seem to
develop a working version of my own based on it.

I hope I am not asking too much, but how could I get this to work?
Would it also be possible to explain why the amended version would
work?

Thanks

John

--- In antlr-interest at y..., Terence Parr <parrt at j...> wrote:
> Hi John...jumping in late, but it seems that if you are staying in the
> "lexer" world to do your parsing within, you should just call the rule
> that parses that grammar.  I designed the selector stream stuff for
> having an outside agent, like the parser, switch selectors.  Does the
> javadoc example help at all?
> 
> I'd suggest merging the lexers and having the grammar for the stuff
> inside the comment in that one lexer.  Oh, also note that once <!-- is
> matched by HTMLCOMMENT in the first lexer, it cannot appear in the
> other lexer (already consumed).  That is probably the source of your
> problem.
> 
> Ter
> 
> On Thursday, May 9, 2002, at 02:45  PM, johnclarke72 wrote:
> 
> > A number of people have offered me advice regarding this problem but
> > so far I have not been able to solve it.
> >
> > When I compile and run the application I then enter <!-- test --> and
> > expect to see :
> > HTML Comment : <!-- test --> on the screen. But all I see is :
> >
> > line 1: unexpected token: <!-
> > exception: antlr.TokenStreamRecognitionException: unexpected char: -
> >
> > I cannot see what is causing the problem. It is probably something
> > very simple that I have missed out. I would be grateful for any
> > advice offered.
> >
> > Best Wishes
> >
> > John
> >
> > The Grammar for the Text Lexer
> > ==============================
> >
> > // Import the Required Classes
> > header
> > {
> > import java.util.*;
> > import antlr.*;
> > }
> >
> > // The Class
> > class TextLexer extends Lexer;
> >
> > // Set the Options for the Lexer
> > options
> > {
> > k=3; // Set the Look Ahead to 3 Characters
> > charVocabulary = '\1' .. '\377'; // Set the Lexer Character Vocabulary
> > testLiterals = false; // Don't test against the Literals table
> > }
> >
> > // The routine that will allow us to switch between Selectors
> > {
> > // The current Selector
> > TokenStreamSelector selector;
> >
> > // The method that will enable us to switch between Selectors
> > public void setSelector(TokenStreamSelector tokenStreamSelector)
> > {
> > selector = tokenStreamSelector;
> > }
> >
> > }
> >
> > HTMLCOMMENT : "<!-" {selector.select("HTMLTagLexer");};
> >
> > // TEXT
> > WORD : ( ~ (' '|'\r'|'\n'|'\t'|'<') ) +;
> >
> > // Ignore all White Space
> > WS : ( ' '
> > | '\t'
> > | '\r' '\n' { newline(); }
> > | '\n' { newline(); }
> > )
> > {$setType(Token.SKIP);} //ignore this token
> > ;
> >
> > The Grammar for the Tag Lexer
> > =============================
> > // Import the Required Classes
> > header
> > {
> > import java.util.*;
> > import antlr.*;
> > }
> >
> > // The Class
> > class HTMLTagLexer extends Lexer;
> >
> > // Set the Options for the Lexer
> > options
> > {
> > k=3; // Set the Look Ahead to 3 Characters
> > charVocabulary = '\1' .. '\377'; // Set the Lexer Character Vocabulary
> > testLiterals = false; // Don't test against the Literals table
> > importVocab = Tagged; // The Vocabulary to import
> > exportVocab = HTMLTags; // Export the Vocabulary to HTMLTags
> > }
> >
> > // The routine that will allow us to switch between Selectors
> > {
> > // The current Selector
> > TokenStreamSelector selector;
> >
> > // The method that will enable us to switch between Selectors
> > public void setSelector(TokenStreamSelector tokenStreamSelector)
> > {
> > selector = tokenStreamSelector;
> > }
> >
> > }
> >
> > // HTML Comment Definition
> > HTMLCOMMENT : "<!--" (options { greedy=false; }: .) * "-->";
> >
> > // Ignore all White Space
> > WS : ( ' '
> > | '\t'
> > | '\r' '\n' { newline(); }
> > | '\n' { newline(); }
> > )
> > {$setType(Token.SKIP);} //ignore this token
> > ;
> >
> > The Grammar for the Parser
> > ==========================
> >
> > // Import the Required Classes
> > header
> > {
> > import java.util.*;
> > import antlr.*;
> > }
> >
> > // The Class
> > class HTMLParser extends Parser;
> >
> > // Set the Options for the Parser
> > options
> > {
> > importVocab = Tagged; // The Vocabulary to import
> > }
> >
> > // Define the starting point for processing the HTML
> > processData :
> > (
> > text:WORD {System.out.println("TEXT " + text.getText());}
> > | comment:HTMLComment {System.out.println("HTML Comment " +
> > comment.getText());}
> > )+;
> >
> > The Java Application
> > ====================
> >
> > import java.io.*;
> > import antlr.*;
> >
> > // The HTMLParserApp Class
> > class HTMLParserApp
> > {
> >
> > // The Main function
> > public static void main(String[] args)
> > {
> > try
> > {
> > // Create the required Lexers
> > HTMLTagLexer htmlTagLexer = new HTMLTagLexer(new DataInputStream(System.in));
> > TextLexer textLexer = new TextLexer(htmlTagLexer.getInputState());
> >
> > // Create the TokenStreamSelector and add the required Lexers to it
> > TokenStreamSelector tokenStreamSelector = new TokenStreamSelector();
> > tokenStreamSelector.addInputStream(htmlTagLexer,"HTMLTagLexer");
> > tokenStreamSelector.addInputStream(textLexer,"TextLexer");
> >
> > // Select the starting Lexer
> > tokenStreamSelector.select("TextLexer");
> >
> > // Add the TokenStreamSelector to the Required Lexers
> > htmlTagLexer.setSelector(tokenStreamSelector);
> > textLexer.setSelector(tokenStreamSelector);
> >
> > // Create the HTML Parser
> > HTMLParser htmlParser = new HTMLParser(tokenStreamSelector);
> >
> > // Process the HTML
> > htmlParser.processData();
> >
> > } catch(Exception e)
> > {
> > System.err.println("exception: "+e);
> > }
> > }
> > }
> >
> >
> >
> >
> >
> > Your use of Yahoo! Groups is subject to 
> > http://docs.yahoo.com/info/terms/
> >
> >
> --
> Co-founder, http://www.jguru.com
> Creator, ANTLR Parser Generator: http://www.antlr.org
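
To illustrate Ter's point that characters matched by the first lexer are
already consumed by the time the second lexer runs, here is a minimal
plain-Java sketch. It does not use ANTLR at all: SharedInput,
textLexerHandsOff and matchWholeComment are hypothetical names that only
mimic the two HTMLCOMMENT rules reading from one shared input state, so
treat it as an illustration under those assumptions rather than as what
the generated lexers actually do internally.

```java
public class SharedInputDemo {

    /** Hypothetical stand-in for the character state both lexers share. */
    static final class SharedInput {
        final String text;
        int pos = 0; // index of the next unread character

        SharedInput(String text) { this.text = text; }

        boolean startsWith(String s) { return text.startsWith(s, pos); }

        String rest() { return text.substring(pos); }
    }

    /** Mimics TextLexer's HTMLCOMMENT rule: consume "<!-" and hand off. */
    static boolean textLexerHandsOff(SharedInput in) {
        if (in.startsWith("<!-")) {
            in.pos += 3; // these characters are consumed and will not be seen again
            return true;
        }
        return false;
    }

    /** Mimics HTMLTagLexer's HTMLCOMMENT rule: needs "<!--" ... "-->" in full. */
    static String matchWholeComment(SharedInput in) {
        if (!in.startsWith("<!--")) {
            return null; // the opening delimiter is not at the current position
        }
        int end = in.text.indexOf("-->", in.pos + 4);
        if (end < 0) {
            return null; // unterminated comment
        }
        String comment = in.text.substring(in.pos, end + 3);
        in.pos = end + 3;
        return comment;
    }

    public static void main(String[] args) {
        // Two-stage setup: the first rule eats "<!-" before the second one runs.
        SharedInput split = new SharedInput("<!-- test -->");
        textLexerHandsOff(split);
        System.out.println("second lexer sees: " + split.rest());
        System.out.println("split match:  " + matchWholeComment(split));

        // Merged setup: a single rule sees the comment from its first character.
        SharedInput merged = new SharedInput("<!-- test -->");
        System.out.println("merged match: " + matchWholeComment(merged));
    }
}
```

In the two-stage case the second rule starts at "- test -->", which
appears consistent with the "unexpected char: -" message in the original
post; in the merged case the single rule sees the whole "<!-- test -->"
and matches it.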


 
