[antlr-interest] Matching Substring In Lexer

John B. Brodie jbb at acm.org
Mon Apr 26 14:02:13 PDT 2010


Greetings!

On Mon, 2010-04-26 at 16:40 -0400, Kunal Sawlani wrote:
> Hi All,
> I have been trying to solve a problem which I have been having in the lexer,
> but with no luck. My example goes as follows.
> I have a simple grammar with two tokens.
> I want to treat the string "$ TEXT" as a token TEXTINPUT and everything
> else as a token ANYTHING, which matches anything.
> The scanning process works fine when you supply it the string "$ TEXT", the
> correct token is returned. And if any other character is supplied, the token
> ANYTHING is returned.
> However, for the string "$1", the scanner complains that it was looking for
> ' ' and reports no viable alternative for '1'. What I want it to return is
> two tokens: ANYTHING for the "$", and another ANYTHING for the "1". I was
> reading about syntactic predicates to solve this issue, but I am not quite
> getting it right. If anyone could point me in the right direction, it would
> be great. Also, I would like to know if there are any other approaches to
> solving this issue. I learned about syntactic predicates from the following
> article:
> http://www.jguru.com/faq/view.jsp?EID=459059
> 
> Any help would be greatly appreciated!
> Thanks
> 

see attached....



-------------- next part --------------
grammar Trial;

tokens { TEXTINPUT; }

@members {
   private static final String [] x = new String[]{
      "xyz",
      "$ TEXT",
      "$xyz",
      "$ Txyz",
      "xyz$ TEXT$xyz$ Txyz"
   };

   public static void main(String [] args) {
      for( int i = 0; i < x.length; ++i ) {
         try {
            System.out.println("about to lex:`"+x[i]+"`");
            // the generated lexer class is named after the grammar (Trial)
            TrialLexer lexer =
               new TrialLexer(new ANTLRStringStream(x[i]));

            int j = 1;
            while( true ) {
               Token token = lexer.nextToken();
               if( token.getType() == TrialLexer.EOF ) break;
               System.out.format("\%d: type = \%s, text = `\%s`\%n",
                                 j,
                                 tokenNames[token.getType()],
                                 token.getText());
               j++;
            }
         } catch(Exception e) {
            e.printStackTrace();
         }
      }
   }
}

run_it : .+ EOF;

ANYTHING :
      '$' ( (' TEXT')=>' TEXT' { $type=TEXTINPUT; } )?
   | .
   ;
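
For readers without ANTLR at hand, the effect of the syntactic predicate in
the ANYTHING rule above can be approximated by a small hand-written scanner.
This is only a sketch of the idea, not the code ANTLR generates: the class
name DollarTextSketch, the lex helper, and the "TYPE:text" string format are
made up for illustration. At each position it looks ahead for the full
"$ TEXT" spelling and commits to TEXTINPUT only if all of it is there;
otherwise it falls back to a single-character ANYTHING token.

```java
import java.util.ArrayList;
import java.util.List;

public class DollarTextSketch {
    // The only multi-character token: '$' followed by ' TEXT'.
    private static final String KEYWORD = "$ TEXT";

    // Returns tokens as "TYPE:text" strings (hypothetical format, for the demo only).
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            // Lookahead check, playing the role of the (' TEXT')=> predicate:
            // commit to TEXTINPUT only when the whole keyword is present.
            if (input.startsWith(KEYWORD, pos)) {
                tokens.add("TEXTINPUT:" + KEYWORD);
                pos += KEYWORD.length();
            } else {
                // Fallback: any single character, including a lone '$'.
                tokens.add("ANYTHING:" + input.charAt(pos));
                pos++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"$ TEXT", "$1", "$ Txyz"}) {
            System.out.println("`" + s + "` -> " + lex(s));
        }
    }
}
```

With this sketch, "$1" yields two ANYTHING tokens ("$" and "1") rather than
an error, which is the behaviour asked for in the original question.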


