[antlr-interest] Can a lexer rule have user defined attributes?

Gavin Lambert antlr at mirality.co.nz
Fri Jul 25 02:49:45 PDT 2008


At 21:24 25/07/2008, Jan van Mansum wrote:
>ID returns [int rv]
>     :    'a'..'z'+ {$rv = 3;}
>     ;
[...]
>- Why does the grammar check out as correct?

Because ANTLR's error checking can be a little crude at 
times.  The grammar check mostly just validates syntax; what 
you've written is a semantic error, and it's not so good at 
catching those.

>- Why can lexer rules not have user defined attributes?

Because there's nowhere to put them.  All lexer rules (top level 
ones, at least) are required to return a Token, and (except in the 
C target) the standard token doesn't have any "spare" 
storage.  You can, however, subclass either Token or CommonToken 
and add your own fields.  Then you can store whatever additional 
data you want.

However, it's not quite as simple as that; to get extra data 
in there you'll need to override the emit method and pass the 
data in that way.  (Johannes posted an example of this for C# 
just today, in fact.  I'm sure the Java version is similar.)
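
To make the shape of that concrete, here is a rough, self-contained 
Java sketch of the pattern.  Note that BasicToken and BasicLexer are 
stand-ins invented for this example, not ANTLR's real runtime classes; 
in a real grammar you would subclass CommonToken and override the 
generated lexer's emit method instead, and the details may differ 
between ANTLR versions.  ValueToken, ValueLexer and the rv field are 
likewise just illustrative names.

```java
// Stand-in for the runtime's standard token: no spare field for user data.
class BasicToken {
    final int type;
    final String text;

    BasicToken(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Token subclass carrying the extra value the lexer rule wanted to "return".
class ValueToken extends BasicToken {
    final int rv;

    ValueToken(int type, String text, int rv) {
        super(type, text);
        this.rv = rv;
    }
}

// Stand-in for a lexer base class: every rule finishes by calling emit().
class BasicLexer {
    static final int ID = 1;
    protected final String input;
    protected int pos = 0;
    protected int tokenStart = 0;
    protected int type;

    BasicLexer(String input) {
        this.input = input;
    }

    // Default behaviour: build the standard token from the matched text.
    protected BasicToken emit() {
        return new BasicToken(type, input.substring(tokenStart, pos));
    }

    // Rough equivalent of  ID : 'a'..'z'+ ;  (returns null if nothing matches).
    BasicToken nextToken() {
        tokenStart = pos;
        while (pos < input.length()
                && input.charAt(pos) >= 'a' && input.charAt(pos) <= 'z') {
            pos++;
        }
        if (pos == tokenStart) {
            return null;
        }
        type = ID;
        return emit();
    }
}

// Lexer that overrides emit() so the extra data ends up in the token.
class ValueLexer extends BasicLexer {
    // In a real grammar a rule action would set this; fixed here for brevity.
    protected int rv = 3;

    ValueLexer(String input) {
        super(input);
    }

    @Override
    protected BasicToken emit() {
        return new ValueToken(type, input.substring(tokenStart, pos), rv);
    }
}
```

The parser side then casts the token it receives to ValueToken to 
read rv back out.  Keeping the extra value in a lexer field and 
copying it into the token inside emit is one way to "pass it in"; 
other arrangements are possible.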


