[antlr-interest] ANTLR enforces LL(1) beyond about 300 tokens

A Z asicaddress at gmail.com
Sun Aug 22 09:49:44 PDT 2010


Hello,

  I am trying to develop a SystemVerilog grammar using ANTLR 3.2. I
successfully built a Verilog2005 grammar and verified it against about 800
tests. I used the same approach for SystemVerilog, but when I compile the
grammar I get many warnings suggesting that ANTLR is only using LL(1).

SystemVerilog has about twice as many keywords and 50% more operators than
Verilog2005, so I took the working Verilog2005 grammar and reduced it to
just the tokens and a single rule:


grammar Verilog2005;

tokens
{
K_ACCEPT_ON                = 'accept_on';
K_ALIAS                    = 'alias';
K_ALWAYS                   = 'always';
.
.
.
EQUALSTWOQMARK             = '==?';
BANGEQUALSQMARK            = '!=?';
MINUSGT                    = '->';
}

fragment Alpha     : ('a'..'z' | 'A'..'Z');
fragment IdentChar : ('0'..'9' | 'a'..'z' | 'A'..'Z' | '$' | '_');
SIMPLE_IDENT  : (Alpha | '_') IdentChar*;

unary_op  :
    PLUS
  | MINUS
  | BANG
  | TILDE
  | AMPERSAND
  | TILDEAMP
  | VBAR


I then added the SystemVerilog tokens a few at a time until the grammar
started failing. At around 300 tokens I started getting warnings like this:

warning(209): temp.g:341:1: Multiple token rules can match input such as
"'a'": K_ACCEPT_ON, K_ALIAS, K_ALWAYS, K_ALWAYS_COMB, K_ALWAYS_FF,
K_ALWAYS_LATCH, K_AND, K_ASSERT, K_ASSIGN, K_ASSUME, K_AUTOMATIC,
SIMPLE_IDENT

As a result, token(s)
K_ALIAS,K_ALWAYS,K_ALWAYS_COMB,K_ALWAYS_FF,K_ALWAYS_LATCH,K_AND,K_ASSERT,K_ASSIGN,K_ASSUME,K_AUTOMATIC,SIMPLE_IDENT
were disabled for that input


I am not sure how to resolve this. Removing the final identifier rule
(SIMPLE_IDENT) also allows a clean compile, but the ANTLR book indicates
that ANTLR should try to match tokens in the order listed. Thanks.
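
To make the expectation concrete, here is a minimal Python sketch of the
behavior I expected. It is not ANTLR code and makes no claim about how the
ANTLR lexer is actually implemented. The rule table and the next_token
helper are hypothetical names for illustration only. The assumption is that
the longest match wins and that a tie goes to the rule listed first, so a
keyword listed before SIMPLE_IDENT beats it on identical text.

```python
import re

# Hypothetical rule table in listing order: keywords first, identifier last.
# Only a few of the keywords from the grammar are included here.
TOKEN_RULES = [
    ("K_ALIAS", re.compile(r"alias")),
    ("K_ALWAYS", re.compile(r"always")),
    ("K_ALWAYS_COMB", re.compile(r"always_comb")),
    # Mirrors SIMPLE_IDENT : (Alpha | '_') IdentChar*
    ("SIMPLE_IDENT", re.compile(r"[A-Za-z_][0-9A-Za-z$_]*")),
]


def next_token(text, pos=0):
    """Return (rule_name, lexeme) for the longest match at pos, or None.

    A later rule replaces the current best only if its match is strictly
    longer, so the rule listed first wins when two rules match the same text.
    """
    best = None
    for name, pattern in TOKEN_RULES:
        m = pattern.match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best
```

Under these assumptions, next_token("always") selects K_ALWAYS rather than
SIMPLE_IDENT, while next_token("alias2") falls through to SIMPLE_IDENT
because the identifier match is longer than the keyword match.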

