[antlr-interest] delimited param lists

Chris Black chris at lotuscat.com
Tue Jul 12 14:16:39 PDT 2005


Goran wrote:

> Hi,

Hello! I took the liberty of supplying a subject to make the thread 
easier to track for others.

> first I must say that I'm a beginner in ANTLR (I have written a few 
> minor projects so far, but nothing complicated), so please forgive me 
> for this question, but I just don't get it.
> I have an old assembler written manually, and now I want to port it to 
> ANTLR, so, among other things, I have a situation like this:
>
> Possible source combinations:
>     1.    aaa                 ; just mnemonic
>     2.    aaa param           ; mnemonic and one param
>     3.    aaa param, param    ; mnemonic and two params
>
> when I write down the following
>
>     statement
>         : mnemonic (paramBlock)?
>         ;
>
>     paramBlock
>         : expression (COMMA expression)?
>         ;
>
> the parser does recognize options 2 and 3, but when I write
>
>     statement
>         : mnemonic (paramBlock)*
>         ;
>
>     paramBlock
>         : expression (COMMA expression)*
>         ;
>
> the parser passes, but on the second statement (for example, if I have 
> 1 followed by 2) it does not recognise aaa as a mnemonic but as an 
> identifier. So I'm confused about (xxx)?: does this mean 0 or 1 
> (because it is not functioning that way), or am I missing something 
> (which is probably the case :-) )
>
> I'm using ANTLR 2.5.7 and K=2

I'm guessing you are actually using 2.7.5 and the 2.5.7 was a typo...

Anyway, you are correct about ? and *: (x)? means 0 or 1 (or 
"optional"), (x)* means 0 or more, and (x)+ means 1 or more.
As for why your parser isn't working, I'm not quite sure. My initial 
suspicion is that the parser does not know when to end the loops that 
can match any number of paramBlocks or (COMMA expression) groups. The 
parser needs a way of knowing when to stop matching a subrule that has 
* or +. This can be a bit tricky and sometimes requires predicates, but 
usually it is something as easy as matching an end of line or end of 
statement token:

statement: mnemonic (paramBlock)* NEWLINE;
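
To make that concrete, here is a minimal sketch. It assumes two things 
I am only guessing at: that a plain identifier is a valid expression in 
your grammar, and that your lexer can hand the parser a NEWLINE token 
instead of discarding line ends as whitespace. Without a terminator, 
once the "aaa" on line 1 is matched, the (paramBlock)* loop is free to 
take the "aaa" that starts line 2 as one more expression, which would 
fit the symptom you describe. With NEWLINE after the loop, the loop has 
to stop at the line end. The NEWLINE rule below is illustrative, not 
taken from your grammar:

    // parser (sketch): every statement ends at the line end
    statement
        : mnemonic (paramBlock)* NEWLINE
        ;

    // lexer (sketch): emit line ends as tokens; the whitespace
    // rule must then skip only spaces and tabs
    NEWLINE
        : ('\r')? '\n' { newline(); }
        ;

Note that the last line of the source file then needs a line end too, 
or the rule has to accept end of file as an alternative terminator.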

Another thing I just noticed is that you probably don't need a * on both 
the paramBlock match in statement AND the (COMMA expression) part inside 
the paramBlock rule. As written, you are saying a statement is a 
mnemonic followed by ANY NUMBER of paramBlocks, AND a paramBlock itself 
can contain ANY NUMBER of expressions. One loop should be enough.
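
A sketch of what I mean, keeping the rule names from your grammar and 
the NEWLINE terminator from above (still an assumption about your 
lexer): make the parameter list optional as a whole and keep the loop 
only inside paramBlock.

    statement
        : mnemonic (paramBlock)? NEWLINE
        ;

    paramBlock
        : expression (COMMA expression)*
        ;

That should cover all three of your cases: "aaa" skips the optional 
paramBlock, "aaa param" matches one expression with zero trips through 
the loop, and "aaa param, param" goes through the loop once. If you 
really want at most two params, (COMMA expression)? would do in place 
of the *.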

Anyway, just a few tips, hope they help.
Chris

