[antlr-interest] Newbe lexer question

Sun Nov 12 14:12:07 PST 2006

Hi!

On 12. Nov 2006, at 22:50, HC wrote:

> 1. Can ANTLR handle the following scenario easily (and what would a
> pseudo lexer rule look like):
>
> Let's say my lexer needs to be able to tokenize the following strings:
>
> a) "car"
> b) "bus"
> c) "bus car"
>
> Is it fairly easy to write an ANTLR rule that is smart enough to
> recognize the string "bus car" as token C and not as token B followed
> by token A?

Yes, this is no problem, although it looks a bit ugly at times ;)
If you do not have a whole lot of those cases, it shouldn't be a concern.
For example, you could solve this with syntactic predicates.
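To make the idea concrete, here is a rough sketch in plain Java rather
than ANTLR grammar syntax. It only illustrates the "try the longer
alternative first" behaviour that a predicate gives you; the class name
BusCarLexer and the tokenize method are made up for this example and are
not part of ANTLR. The token names A, B and C follow your question.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hand-written tokenizer; not ANTLR-generated code. */
final class BusCarLexer {

    /** Returns token names for the input, preferring "bus car" over "bus". */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (input.charAt(pos) == ' ') {
                pos++; // skip blanks between tokens
                continue;
            }
            // Look ahead for the longer alternative before committing to
            // the shorter one. This is the role a syntactic predicate
            // would play in the lexer rule.
            if (input.startsWith("bus car", pos)) {
                tokens.add("C");
                pos += "bus car".length();
            } else if (input.startsWith("bus", pos)) {
                tokens.add("B");
                pos += "bus".length();
            } else if (input.startsWith("car", pos)) {
                tokens.add("A");
                pos += "car".length();
            } else {
                throw new IllegalArgumentException(
                        "unexpected input at offset " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("bus car"));         // [C]
        System.out.println(tokenize("bus bus car car")); // [B, C, A]
    }
}
```

With only a handful of such multi-word tokens the extra branches stay
manageable; with many of them the rule gets ugly quickly, which is the
caveat above.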

> 2. Can I pass strings to be tokenized into the lexer from outside the
> grammar file?
>
> Let's say I have a lexer rule called WORD which may be any character
> sequence that I specify externally. I am looking to maintain an
> external dictionary of words which are valid WORD tokens and which can
> be expanded and reduced by a user without modifying the grammar file.

Do I understand you correctly that you want a dynamic lexer rule, or do
you want something like this: let the lexer recognize the strings in
question as WORDs, and then use a custom action to look up the token's
text in your external dictionary. If the lookup fails, you could signal
a recognition error to ANTLR and try error recovery (as ANTLR does
internally), or simply fail.
If you want the latter, this is easy. Dynamically changing lexer rules
is not, because that would affect lookahead.
Another way could be to employ syntactic predicates, but it really
depends on what you are trying to achieve.
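
The lookup variant could be sketched roughly like this, again in plain
Java and not as ANTLR code. Splitting on whitespace stands in for the
generic WORD rule, and the dictionary check stands in for the custom
action. All names here (DictionaryWordFilter, addWord, removeWord,
acceptWords) are invented for the illustration, and throwing an
exception corresponds to the "simply fail" option.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a WORD rule plus a dictionary-lookup action. */
final class DictionaryWordFilter {
    private final Set<String> dictionary;

    DictionaryWordFilter(Set<String> initialWords) {
        // Copy so later changes go through addWord/removeWord only.
        this.dictionary = new HashSet<>(initialWords);
    }

    /** The dictionary can grow at run time; no grammar change needed. */
    void addWord(String word) {
        dictionary.add(word);
    }

    /** The dictionary can shrink at run time as well. */
    void removeWord(String word) {
        dictionary.remove(word);
    }

    /** Accepts whitespace-separated words, rejecting any not in the dictionary. */
    List<String> acceptWords(String input) {
        List<String> words = new ArrayList<>();
        for (String text : input.trim().split("\\s+")) {
            if (text.isEmpty()) {
                continue; // empty input yields a single empty string
            }
            // This check is what the custom action would do with the
            // token's text after the WORD rule has matched.
            if (!dictionary.contains(text)) {
                throw new IllegalArgumentException("unknown WORD: " + text);
            }
            words.add(text);
        }
        return words;
    }
}
```

The point of this arrangement is that the set of valid words lives
entirely outside the grammar, so the lexer's lookahead decisions never
change when a user edits the dictionary.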

> Am I using Antlr in the right context here?

If you are doing language recognition, yes ;)

HTH,

-k
-- 
Kay Röpke
http://classdump.org/