[antlr-interest] Determining context in lexer?

Alexey Demakov demakov at ispras.ru
Wed Nov 10 06:22:26 PST 2004


From: "Don Caton" <dcaton at shorelinesoftware.com>
> In the language I'm parsing, the square brackets are used for two completely
> different things.  The first usage is as an array access operator, i.e.:
> 
>   x := id[expr]
>   x := id[expr, expr]
>   x := id[expr][expr]
>   x := f()[expr]
> 
> and so on.  No problem.  BUT, square brackets can also delimit literal
> strings, i.e.:
> 
>   x := [Hello World]   // equivalent to x := "Hello World"
>   x := f( [Hello], "World" )
> 
> ... etc.  The problem is that the lexer tokenizes text such as:
> 
>   [x + 1]
> 
> into five individual tokens which will eventually match a parser rule such
> as:
> 
>    arraySubscr: LBRKT expr ( COMMA expr )* RBRKT;
> 
> but if the brackets delimit a string, I want the text to be parsed into a
> single STRING_LITERAL token, which would eventually match a rule such as:
> 
>    literalValue:  STRING_LITERAL | INT_LITERAL | FLOAT_LITERAL | ... etc. ;
> 
> Problem is, the lexer does not have context information to decide how to
> tokenize a "[" ... "]" sequence of characters.  I don't think the use of "["
> is ambiguous and if I knew what the prior token was then I could probably
> use a semantic predicate in the lexer rule for "[".  Syntactic and semantic
> predicates can look ahead, but I need to look backwards and I didn't find
> anything in the docs that addresses this kind of problem.  

You can insert a filter between the lexer and the parser.
The filter records the type of the last token that passes through it.
The lexer's semantic predicates can then consult that value to decide
how to tokenize "[".
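
To make the idea concrete, here is a minimal sketch written as a small
hand-written Java lexer rather than as ANTLR grammar code. Everything in it
is an assumption made for illustration: the class name ContextLexer, the
token names that do not appear in the question, and above all the rule that
"[" is a subscript only directly after an identifier, ")" or "]". In an
ANTLR setup the "last" field would live in the filter, and the test in
subscriptCanFollow() would be the semantic predicate on the "[" rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone lexer for illustration; none of this is ANTLR API.
class ContextLexer {
    enum Type { ID, INT, ASSIGN, PLUS, COMMA, LPAREN, RPAREN, LBRKT, RBRKT, STRING_LITERAL }

    // Assumption: '[' starts a subscript only right after a token that can end
    // an expression (identifier, ')' or ']'); anywhere else it opens a string.
    static boolean subscriptCanFollow(Type last) {
        return last == Type.ID || last == Type.RPAREN || last == Type.RBRKT;
    }

    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        Type last = null;   // type of the previous token; null at start of input
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            Type type;
            String text = null;
            if (c == '[' && !subscriptCanFollow(last)) {
                // Bracket-delimited string: consume up to the closing ']'.
                int close = src.indexOf(']', i);
                if (close < 0) throw new IllegalArgumentException("unterminated [ at " + i);
                type = Type.STRING_LITERAL;
                text = src.substring(i + 1, close);
                i = close + 1;
            } else if (Character.isLetter(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                type = Type.ID;
            } else if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                type = Type.INT;
            } else if (src.startsWith(":=", i)) {
                i += 2;
                type = Type.ASSIGN;
            } else {
                switch (c) {
                    case '+': type = Type.PLUS; break;
                    case ',': type = Type.COMMA; break;
                    case '(': type = Type.LPAREN; break;
                    case ')': type = Type.RPAREN; break;
                    case '[': type = Type.LBRKT; break;
                    case ']': type = Type.RBRKT; break;
                    default:
                        throw new IllegalArgumentException("unexpected '" + c + "' at " + i);
                }
                i++;
            }
            if (text == null) text = src.substring(start, i);
            out.add(type + "(" + text + ")");
            last = type;   // remember the context for the next '['
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x := id[x + 1]"));
        System.out.println(tokenize("x := [x + 1]"));
    }
}
```

With this rule, "id[x + 1]" yields separate LBRKT, ID, PLUS, INT and RBRKT
tokens, while "x := [x + 1]" yields a single STRING_LITERAL, because the
token before "[" is ASSIGN. Whether ID, ")" and "]" are the right set of
"subscript can follow" tokens depends on the actual language.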

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com





 