[antlr-interest] White space needed in the parsing.

jack zhang jackgzhang2 at yahoo.com
Thu Sep 18 11:47:42 PDT 2008


I have a question about the ANTLR lexer. I would like to write a
simple logic query parser. For example,

 (1) The input: Hello AND world OR antlr
    The AST: (OR (AND Hello world) antlr)

 This part works fine now. However, I would also like to match a phrase
 (including white space) as a single token. Here is an example:

 (2) The input: Hello AND how are you OR world
    The current AST: (AND Hello how) are (OR you world)
    I want to achieve: (OR (AND Hello how are you) world)


 I would like to match "how are you" as one WORD token and "Hello" as
 another.  If I use the following rule:

 WORD: .+;

 it matches everything, including the "AND", as a single WORD.
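
 A rough sketch of the tokenization being asked for, written in Python
 rather than ANTLR. The helper name tokenize and the keyword set are
 assumptions made for illustration only; the idea is that AND, OR, NOT
 and parentheses delimit phrases, and every run of other words between
 them is merged into one WORD token.

```python
import re

# Assumed keyword set, taken from the operators used in the examples above.
KEYWORDS = {"AND", "OR", "NOT"}

def tokenize(text):
    """Split text into (type, value) pairs, merging runs of plain words."""
    tokens = []
    phrase = []  # words collected since the last keyword or parenthesis

    def flush():
        # Emit the pending words, if any, as a single WORD token.
        if phrase:
            tokens.append(("WORD", " ".join(phrase)))
            phrase.clear()

    # Parentheses are their own pieces; anything else is split on white space.
    for piece in re.findall(r"[()]|[^\s()]+", text):
        if piece in KEYWORDS:
            flush()
            tokens.append((piece, piece))
        elif piece == "(":
            flush()
            tokens.append(("LEFT_PAREN", piece))
        elif piece == ")":
            flush()
            tokens.append(("RIGHT_PAREN", piece))
        else:
            phrase.append(piece)
    flush()
    return tokens

print(tokenize("Hello AND how are you OR world"))
```

 With this input, "how are you" comes out as one WORD token, while AND
 and OR stay separate tokens.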

 Has anyone run into this situation before?

 The lexer and parser grammar file is attached below.

 Thx.

 -Jack.

==== 
grammar Query;

//=== Parser Option ===//
options {
  output = AST;
  k=*;
}



//=== Lexer ===//
OR: 'OR';
AND: 'AND';
NOT: 'NOT';
WORD  : ('a'..'z' | 'A'..'Z' | '.' | ',' | '0'..'9')+ | '"'.+'"';
LEFT_PAREN: '(';
RIGHT_PAREN: ')';
WHITESPACE: (' ' | '\t' | '\r' | '\n') { $channel = HIDDEN; } ;


//=== Parser ===//
expr: orexpression*;
 
orexpression
    :   andexpression (OR orexpression)*
    ;

andexpression
    : notexpression (AND andexpression)*
    ;

notexpression
    : (NOT)? atom
    ;

atom
    : WORD
    | LEFT_PAREN! expr RIGHT_PAREN!
    ;

===
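
An alternative to changing the lexer is to leave WORD as it is and let
the atom rule consume a run of consecutive WORD tokens. The following is
a minimal hand-written Python sketch of that idea, not ANTLR output. It
mirrors the rule names of the grammar above, but the class and function
names (Parser, parse, show) are hypothetical, and it assumes AND binds
tighter than OR, as in the desired AST.

```python
import re

# Tokens that end a run of plain words.
KEYWORDS = {"AND", "OR", "NOT", "(", ")"}

class Parser:
    def __init__(self, text):
        # White space is dropped here, much like the HIDDEN channel in the grammar.
        self.tokens = re.findall(r"[()]|[^\s()]+", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def or_expr(self):
        node = self.and_expr()
        while self.peek() == "OR":
            self.take()
            node = ("OR", node, self.and_expr())
        return node

    def and_expr(self):
        node = self.not_expr()
        while self.peek() == "AND":
            self.take()
            node = ("AND", node, self.not_expr())
        return node

    def not_expr(self):
        if self.peek() == "NOT":
            self.take()
            return ("NOT", self.atom())
        return self.atom()

    def atom(self):
        if self.peek() == "(":
            self.take()
            node = self.or_expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        # Merge consecutive plain words into one phrase.
        words = []
        while self.peek() is not None and self.peek() not in KEYWORDS:
            words.append(self.take())
        if not words:
            raise SyntaxError("expected a word")
        return " ".join(words)

def parse(text):
    parser = Parser(text)
    tree = parser.or_expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token: " + parser.peek())
    return tree

def show(node):
    """Render a tree in the LISP-like form used in the examples."""
    if isinstance(node, str):
        return node
    return "(" + " ".join([node[0]] + [show(child) for child in node[1:]]) + ")"

print(show(parse("Hello AND how are you OR world")))
# prints: (OR (AND Hello how are you) world)
```

In ANTLR terms the same idea would probably be an atom rule that accepts
WORD+ instead of a single WORD, though that variant has not been tested
against this grammar.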





      