[antlr-interest] How to ignore TOKEN in a String

John B. Brodie jbb at acm.org
Mon Mar 21 11:36:01 PDT 2011


Greetings!

On Mon, 2011-03-21 at 10:00 -0700, Hiten R wrote:
> Hi All,
> 
> The ANTLR grammar acts funny when it encounters a TOKEN in a String. How can
> I make ANTLR understand that the text found in the String is not a TOKEN?
> 
> Help will be appreciated.
> 
> Thanks
> Hiten
> 
> Example
> text_content.txt
> funny boys are Tom Hardy Donald
> serious guys are not funny either
> 
> grammar
> options {
>     language=Java;
>     k=1;
> }
> 
> start
>   : 'funny' call_funny_parse
>   | 'serious' call_serious_parse
>   ;
> 
> call_funny_parse
> @init {
>   // this should contain Tom Hardy Donald
>   ArrayList<String> person = new ArrayList<String>();
> }
>   : jackT=TOKEN    //boys
>     macT=TOKEN   //are
>     (nextPersonT=TOKEN { person.add($nextPersonT.text); })* // Tom Hardy Donald
>   ;
> 
> call_serious_parse
> @init {
>   String line = "";
> }
>   : (stringT=TOKEN { line = line + $stringT.text; })*
>     // This is where it fails and says 'funny' is expecting something else

You did not post a complete example of your problem, so I can only
speculate...

'funny' is a reserved word in your language and can therefore NEVER be a
TOKEN. When your call_serious_parse rule encounters the word `funny` in
the input, the lexer hands it a token whose type is that of the 'funny'
keyword, not TOKEN, so the TOKEN loop in that rule cannot match it.

For each quoted string in your parser rules, ANTLR implicitly generates
a lexer rule corresponding to that string. Until you become more
familiar with the workings of ANTLR, it is best to avoid quoted strings
in your parser rules and instead define the corresponding lexer rules
explicitly yourself. That way you will (hopefully) be better able to
see any undesired overlap among your tokens.
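
As a rough sketch of that advice, the two keywords could be pulled out
of the parser rules as shown here. The grammar name Funny and the
definitions of TOKEN and WS are my assumptions, since the posted example
did not include them; only the two parser rules come from the post.

```antlr
grammar Funny;   // hypothetical name; the original post omitted it

options {
    language = Java;
}

start
    : FUNNY call_funny_parse
    | SERIOUS call_serious_parse
    ;

call_funny_parse
@init {
    ArrayList<String> person = new ArrayList<String>();
}
    : jackT=TOKEN    // boys
      macT=TOKEN     // are
      (nextPersonT=TOKEN { person.add($nextPersonT.text); })*
    ;

call_serious_parse
@init {
    String line = "";
}
    : (stringT=TOKEN { line = line + $stringT.text; })*
    ;

// Keywords are listed before TOKEN: when two lexer rules match the
// same text, ANTLR 3 resolves the conflict in favor of the earlier rule.
FUNNY   : 'funny' ;
SERIOUS : 'serious' ;

// Assumed definition: any run of ASCII letters.
TOKEN   : ('a'..'z' | 'A'..'Z')+ ;

// Assumed: whitespace is sent to the hidden channel.
WS      : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

With the lexer rules written out, it is easier to see that the input
word funny matches both FUNNY and TOKEN, and that the parser will be
handed a FUNNY token for it.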


Search the mail archives (see antlr.markmail.org) and/or the ANTLR wiki
for "Keywords as Identifiers" and similar phrases to find ways around
your issue with 'funny'.
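
One common shape of that workaround is a parser rule that accepts the
keywords wherever an ordinary word is allowed. This is only a sketch: it
assumes explicit lexer rules named FUNNY and SERIOUS for the two
keywords and TOKEN for ordinary words, and the rule name word is my
invention, not something from the posted grammar.

```antlr
call_serious_parse
@init {
    String line = "";
}
    : (w=word { line = line + $w.text; })*
    ;

// Hypothetical helper rule: an ordinary TOKEN or either keyword.
word
    : TOKEN
    | FUNNY
    | SERIOUS
    ;
```

This only stays unambiguous as long as nothing that begins with FUNNY or
SERIOUS can follow call_serious_parse in the grammar.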

Hope this helps...
   -jbb



