[antlr-interest] "Context Sensitive" Tokens

Gavin Lambert antlr at mirality.co.nz
Wed Dec 17 23:44:40 PST 2008


At 14:20 18/12/2008, Mihai Danila wrote:
>The problem with this grammar is that TODAY and NOW become their 
>own tokens and can't be used as string literals or as field 
>names. These work: field=TODAY, field=NOW, but these don't: 
>TODAY=string (TODAY is a valid field name) and field=TODAY (TODAY 
>is a valid string).
>
>The nasty solution is to extend the field and string rules to 
>match these tokens:
>
>query:    field '=' value;
>field:    (DIGIT | ALPHA | '_')+ | TODAY | NOW;
>value:    string | date;
>date:     isoDate | TODAY | NOW;
>string:   (DIGIT | ALPHA)+ | TODAY | NOW;
>isoDate:  DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT;
>DIGIT:    '0'..'9';
>ALPHA:    'a'..'z' | 'A'..'Z';
>TODAY:    'TODAY';
>NOW:      'NOW';
>
>But these are nasty and I'd rather not use them. Fragments didn't 
>seem to work for me. What is the standard solution to this 
>problem, if any?

My standard solution is to do exactly that, although normally I 
would first consolidate DIGIT and ALPHA into multi-character 
tokens (one for a run of digits, one for alphanumeric text) 
instead of assembling them one character at a time in parser 
rules.  If, in the context of a "field", you can match a DIGIT, 
an ALPHA or a TODAY, then that's what the rule should express.  
If you are constructing an AST, you can also convert a matched 
TODAY to a different token type (e.g. the type you use for 
ordinary ALPHA text) at that point.
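
As a rough illustration of the above, here is a minimal sketch 
assuming ANTLR 3 syntax with output=AST.  The grammar name Query 
and the token names WORD and ISO_DATE are made up for this 
example and are not from the original grammar.  It also makes two 
simplifying assumptions: underscores are allowed in string values 
as well as field names (one WORD token serves both), and a bare 
TODAY or NOW in value position is always read as a date.

```antlr
grammar Query;

options { output = AST; }

query : field '=' value EOF ;

// In "field" context the keywords are acceptable names, so the
// rule says so explicitly.  The rewrites retag them as WORD
// (hypothetical name) so the tree only ever holds one type here.
field : WORD
      | TODAY -> WORD[$TODAY]
      | NOW   -> WORD[$NOW]
      ;

// Assumption: TODAY/NOW as a value always means the date keyword.
value : date | WORD ;

date  : ISO_DATE | TODAY | NOW ;

// Keywords are listed before WORD so an exact match lexes as the
// keyword; longer text such as TODAYS still becomes a WORD.
TODAY    : 'TODAY' ;
NOW      : 'NOW' ;
ISO_DATE : DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT ;
WORD     : (DIGIT | ALPHA | '_')+ ;

// Fragments never become tokens on their own; they are only
// building blocks for the lexer rules above.
fragment DIGIT : '0'..'9' ;
fragment ALPHA : 'a'..'z' | 'A'..'Z' ;
```

With multi-character tokens the keyword problem shrinks to the 
one rule where a keyword may double as a name, and anything that 
walks the tree sees a plain WORD in the field position either 
way.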


