[antlr-interest] 'filter' option in ANTLR 3.0

Ryan Hollom ryan.hollom at us.lawson.com
Thu Sep 21 07:54:08 PDT 2006


Thanks for the advice, Jim.  If I must, I could break apart all of the 
multi-word keywords into single word keywords -- I'm not sure if I would 
run in to other issues or not.  However, I'm also using the generated 
parser for content assist in an editor, and would very much like to 
present multi word keywords in their full form, rather than breaking them 
apart into individual options.  Perhaps a small concession to get the 
parser working, but a desire nonetheless!

Please refer to my previous post for a trimmed down example of my 
language, and the issues that I'm still seeing.

Thanks again,
Ryan




"Jim Idle" <jimi at intersystems.com> 
Sent by: antlr-interest-bounces at antlr.org
09/20/2006 06:01 PM

To
Ryan Hollom/Lawson at Lawson, <antlr-interest at antlr.org>
cc

Subject
Re: [antlr-interest] 'filter' option in ANTLR 3.0






Ryan, 
 
I suggest that you will have more traction specifying these words as 
individual keywords and not using spaces within the keywords. I presume 
that you want filter mode for some other reason though? You would not 
normally require it for ?normal? parsing. Then again maybe this is not a 
parser as such?
 
So: Try this, (with suitable mods of course)

grammar test;
 
test
      : ( classDef | fieldDef | inlineDef)+
      ;
 
classDef: ID IS A CLASSDEFINITION   ;
fieldDef: ID IS (A | AN) ID ;
inlineDef: ID IS (ALPHA | NUMERIC) ;
 
A     : 'a' ;
ALPHA : 'Alpha' ;
AN    : 'an';
CLASSDEFINITION : 'ClassDefinition' ;
IS    : 'is' ;
NUMERIC : 'Numeric';
 
WS    : (' ' | '\t' | 'n' | '\r') { channel=99;} ;
 
ID : IDCHARS IDCHARS*;
 
 
fragment
IDCHARS: ('a'..'z' | 'A'.. 'Z') ;
 
 
This works perfectly in ANTLRWorks 1.0b3 with the input:


Elephant is a ClassDefinition
Socrates is an Elephant
AllElephants is a Socrates
 
 
Ter ? this reminds me that though we have debated the case insensitive 
lexing thing, I no longer remember what the conclusion was [meaning los in 
meta levels ;-) ]? Just overriding things in the end was it?
 
Jim
 
 

From: antlr-interest-bounces at antlr.org 
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ryan Hollom
Sent: Wednesday, September 20, 2006 2:17 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] 'filter' option in ANTLR 3.0
 

Greetings- 
I have a grammar with several multi-word keywords, and I'm having trouble 
properly tokenizing the input.  For example, I have the rules 

classDef : ID 'is a ClassDefinition'; 
fieldDef: ID ('is a' | 'is an') ID 
inlineDef : ID 'is' ('Alpha' | 'Numeric') 

So the 'is'-prefixed keywords are 'is a ClassDefinition', 'is a Class', 
'is a', 'is an', and 'is'.  With these rules, the lexer chokes on input 
like: 

MyClass is a ClassDefinition 
        MyNumericField is Numeric 

with a no viable alt line 2:20; char='N' 

It would seem to me that the lexer should try to match the longest 
multi-word keyword it can, and, in this case, should create the tokens 
<MyClass>, <'is a ClassDefinition'>, <MyNumericField>, <'is'>, and 
<'Numeric'>.  I have tried to use the filter option to properly tokenize, 
but this forces me to list all of my keywords in the order in which they 
should be recognized (correct?), which seems like it would be a big issue 
when importing a different vocab/super grammar. 

Am I missing an obvious solution here?  I've tried many different 
permutations and can't seem to get it just right. 

Thanks in advance, 
Ryan 

PS - Why is it that when the filter option is set to true, semantic 
actions are handled differently?  For the rule 

fieldDef: ID { printId(); } 'is a' ID; 

generates to 
if (backtracking == 1) { printId(); } 
with filter=true vs 
if (backtracking == 0) { printId(); } 
when filter=false. 

I am using antlr3.0 b4.  Thanks again! 
--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.1.405 / Virus Database: 268.12.5/450 - Release Date: 9/18/2006

--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.405 / Virus Database: 268.12.5/450 - Release Date: 9/18/2006
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20060921/680baaa4/attachment.html 


More information about the antlr-interest mailing list