[antlr-interest] Determining context in lexer?

Monty Zukowski monty at codetransform.com
Wed Nov 10 08:45:54 PST 2004


On Nov 10, 2004, at 6:01 AM, Don Caton wrote:

>
> Hi all:
>
> In the language I'm parsing, the square brackets are used for two
> completely different things.  The first usage is as an array access
> operator, i.e.:
>
>   x := id[expr]
>   x := id[expr, expr]
>   x := id[expr][expr]
>   x := f()[expr]
>
> and so on.  No problem.  BUT, square brackets can also delimit literal
> strings, i.e.:
>
>   x := [Hello World]   // equivalent to x := "Hello World"
>   x := f( [Hello], "World" )
>
> ... etc.  The problem is that the lexer tokenizes text such as:
>
>   [x + 1]
>
> into five individual tokens which will eventually match a parser rule
> such as:
>
>    arraySubscr: LBRKT expr ( COMMA expr )* RBRKT;
>
> but if the brackets delimit a string, I want the text to be parsed into
> a single STRING_LITERAL token, which would eventually match a rule such
> as:
>
>    literalValue:  STRING_LITERAL | INT_LITERAL | FLOAT_LITERAL | ... etc. ;
>
> Problem is, the lexer does not have context information to decide how
> to tokenize a "[" ... "]" sequence of characters.  I don't think the
> use of "[" is ambiguous and if I knew what the prior token was then I
> could probably use a semantic predicate in the lexer rule for "[".
> Syntactic and semantic predicates can look ahead, but I need to look
> backwards and I didn't find anything in the docs that addresses this
> kind of problem.
>
> -- 
> Don
>

You can override makeToken() to store the token type of the last token 
returned.  (You may not want to update it when the token is whitespace; 
that will depend on your application.)

Keep that in an instance variable and then check it in your semantic 
predicate.
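
A minimal sketch of that idea, written as a standalone Python tokenizer
rather than as ANTLR-generated code.  It only illustrates the technique:
remember the type of the last non-whitespace token, and consult it when a
"[" is seen.  The names tokenize, TOKEN_SPEC and SUBSCRIPT_PRECEDERS are
hypothetical, and the set of token types after which "[" is treated as a
subscript is an assumption based on the examples in the question.  In an
ANTLR lexer the same state would live in an instance variable updated from
the overridden makeToken() and tested in a semantic predicate.

```python
import re

# Assumption: "[" is an array subscript only right after one of these
# token types (id[...], f()[...], id[...][...]); otherwise it opens a string.
SUBSCRIPT_PRECEDERS = {"ID", "RPAREN", "RBRKT"}

# Hypothetical token set, just large enough for the examples in the question.
TOKEN_SPEC = [
    ("WS", r"\s+"),
    ("ASSIGN", r":="),
    ("ID", r"[A-Za-z_]\w*"),
    ("INT_LITERAL", r"\d+"),
    ("STRING_LITERAL", r'"[^"]*"'),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("LBRKT", r"\["),
    ("RBRKT", r"\]"),
    ("COMMA", r","),
    ("PLUS", r"\+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (type, text) pairs, using the previous token type to classify '['."""
    tokens = []
    last_type = None  # type of the last non-whitespace token emitted
    pos = 0
    while pos < len(text):
        # The "look backwards" check: this plays the role of the predicate.
        if text[pos] == "[" and last_type not in SUBSCRIPT_PRECEDERS:
            end = text.find("]", pos)
            if end == -1:
                raise ValueError(f"unterminated bracket string at offset {pos}")
            tokens.append(("STRING_LITERAL", text[pos + 1:end]))
            last_type = "STRING_LITERAL"
            pos = end + 1
            continue

        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup
        if kind != "WS":  # whitespace does not update last_type
            value = match.group()
            if kind == "STRING_LITERAL":
                value = value[1:-1]  # drop the surrounding double quotes
            tokens.append((kind, value))
            last_type = kind
        pos = match.end()
    return tokens
```

With this sketch, tokenize("x := [Hello World]") yields a single
STRING_LITERAL after the ":=", while tokenize("x := id[x + 1]") yields
LBRKT, ID, PLUS, INT_LITERAL, RBRKT after the "id".  Whether a "[" that
follows some other token type (a string literal, say) should be a subscript
is language-specific and is not settled here.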

Monty



 