[antlr-interest] White space needed in the parsing.

Jim Idle jimi at temporal-wave.com
Thu Sep 18 12:05:58 PDT 2008


Use WORD+ in the parser rules, not the lexer. If you want the
whitespace, reconstruct the phrase from the input string: take the start
index of the first WORD token and the stop index of the last WORD token
(getStartIndex() and getStopIndex() on CommonToken in the Java runtime)
and extract that span from the input string.
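
As a rough illustration of that idea, here is a minimal Python sketch
that does not use the ANTLR runtime at all. Tok, tokenize and phrases
are hypothetical names invented for this example; the start/stop fields
only imitate the character indices an ANTLR token carries. Parentheses
and quoted strings are left out to keep it short.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a lexer token: a type plus the inclusive
# character span it covers in the original input.
Tok = namedtuple("Tok", "type start stop")

KEYWORDS = {"AND", "OR", "NOT"}


def tokenize(text):
    """Split on whitespace, recording each token's character span."""
    toks = []
    for m in re.finditer(r"\S+", text):
        word = m.group()
        kind = word if word in KEYWORDS else "WORD"
        toks.append(Tok(kind, m.start(), m.end() - 1))
    return toks


def phrases(text, toks):
    """Collapse each run of adjacent WORD tokens into one phrase.

    The phrase is cut out of the original input, from the start of the
    first WORD to the stop of the last WORD, so the whitespace between
    the words is preserved exactly as typed.
    """
    out = []
    i = 0
    while i < len(toks):
        if toks[i].type != "WORD":
            out.append(toks[i].type)
            i += 1
            continue
        first = toks[i]
        while i + 1 < len(toks) and toks[i + 1].type == "WORD":
            i += 1
        last = toks[i]
        out.append(text[first.start:last.stop + 1])
        i += 1
    return out


query = "Hello AND how are you OR world"
print(phrases(query, tokenize(query)))
```

The lexer still produces one token per word, so AND, OR and NOT stay
distinct; the grouping happens afterwards, which is the role WORD+
plays in the parser rule.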

Jim

On Thu, 2008-09-18 at 11:47 -0700, jack zhang wrote:

> I have a question regarding the ANTLR lexer. Basically, I would like
> to write a simple logic query parser. For example,
> 
>  (1) The input: Hello AND world OR antlr
>     The AST: (OR (AND Hello world) antlr)
> 
>  This part works fine now. But I would like to match a string
>  (including white space). Here is the example:
> 
>  (2) The input: Hello AND how are you OR world
>     The current AST: (AND Hello how) are (OR you world)
>     I want to achieve: (OR (AND Hello how are you) world)
> 
> 
>  I would like to match "how are you" as one WORD and "Hello" as
>  another token. If I use the following rule:
> 
>  WORD: .+;
> 
> 
>  this will match everything, including the "AND", as a WORD.
> 
>  Have you run into this situation before?
> 
>  Attached is the lexer and parser grammar file.
> 
>  Thx.
> 
>  -Jack.
> 
> ==== 
> grammar Query;
> 
> //=== Parser Option ===//
> options {
>   output = AST;
>   k=*;
> }
> 
> 
> 
> //=== Lexer ===//
> OR: 'OR';
> AND: 'AND';
> NOT: 'NOT';
> WORD  : ('a'..'z' | 'A'..'Z' | '.' | ',' | '0'..'9')+ | '"'.+'"';
> LEFT_PAREN: '(';
> RIGHT_PAREN: ')';
> WHITESPACE: (' ' | '\t' | '\r' | '\n') { $channel = HIDDEN; } ;
> 
> 
> //=== Parser ===//
> expr: orexpression*;
>  
> orexpression
>     :   andexpression (OR orexpression)*
>     ;
> 
> andexpression
>     : notexpression (AND andexpression)*
>     ;
> 
> notexpression
>     : (NOT)? atom
>     ;
> 
> atom
>     : WORD
>     | LEFT_PAREN! expr RIGHT_PAREN!
>     ;
> 
> ===