[antlr-interest] "Ignore until" in the Parser

Bryan Ewbank ewbank at gmail.com
Tue Jun 14 05:04:38 PDT 2005


How do you know when to stop parsing the "first few lines" of a
module?  If you can define the first token to be ignored, then you can
write a token filter that works like sed:
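   nextToken()
   {
      t = input.nextToken(); // i.e., from the wrapped input stream
      if (t == __token_from_which_to_scan__)
      {
           // discard the module body; stop at EOF so the loop cannot spin forever
           while (t != ENDMODULE && t != EOF) { t = input.nextToken(); }
      }
      return t;
   }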
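
To make the filter idea concrete, here is a minimal self-contained sketch in plain Java. It does not use the ANTLR runtime. The class name, the integer token types, and the BODY_START trigger token are all assumptions made for illustration; a real filter would wrap the lexer's token stream and compare token types instead.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: token types are hypothetical ints, not ANTLR-generated constants.
class SkipBodyFilter {
    static final int EOF = -1, MODULE = 1, ID = 2, BODY_START = 3, OTHER = 4, ENDMODULE = 5;

    private final Iterator<Integer> input; // stands in for the wrapped lexer

    SkipBodyFilter(List<Integer> tokens) {
        this.input = tokens.iterator();
    }

    // Read one token from the underlying stream, or EOF when it is exhausted.
    private int read() {
        return input.hasNext() ? input.next() : EOF;
    }

    // Pass tokens through unchanged; on BODY_START, discard everything up to
    // ENDMODULE (or EOF) and return that token instead.
    int nextToken() {
        int t = read();
        if (t == BODY_START) {
            while (t != ENDMODULE && t != EOF) {
                t = read();
            }
        }
        return t;
    }

    public static void main(String[] args) {
        SkipBodyFilter f = new SkipBodyFilter(
            Arrays.asList(MODULE, ID, BODY_START, OTHER, OTHER, ENDMODULE, MODULE));
        for (int t = f.nextToken(); t != EOF; t = f.nextToken()) {
            System.out.println(t); // prints 1, 2, 5, 1 on separate lines
        }
    }
}
```

The parser then sees only the header tokens followed directly by ENDMODULE, so the grammar never has to describe the module body.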

Another option, if you prefer to keep it in the parser, is to write a
parser rule that is "match all tokens until ENDMODULE" - something
like this:
   skipModule : (options{greedy=false;} . )* ENDMODULE ;

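Or perhaps this (LA(1) is the lookahead token type; LT(1) would be the
Token object itself and cannot be compared with ENDMODULE):
   skipModule : ( {LA(1) != ENDMODULE}? . )* ENDMODULE ;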

I leave error recovery as an exercise (what if you hit EOF before ENDMODULE?).
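
One possible shape for that exercise, sketched in plain Java rather than grammar syntax: scan forward to ENDMODULE and report an error if the input runs out first. The class name, the list-of-ints token representation, and the choice of exception are assumptions for illustration only, not ANTLR's own recovery mechanism.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: tokens are hypothetical int types held in a list.
class SkipModule {
    static final int ENDMODULE = 5; // hypothetical token type

    // Returns the index just past ENDMODULE, scanning from pos.
    // Throws if the token list ends before ENDMODULE is seen.
    static int skipModule(List<Integer> tokens, int pos) {
        int start = pos;
        while (pos < tokens.size() && tokens.get(pos) != ENDMODULE) {
            pos++;
        }
        if (pos >= tokens.size()) {
            throw new IllegalStateException(
                "hit EOF before ENDMODULE; skip started at token " + start);
        }
        return pos + 1;
    }

    public static void main(String[] args) {
        List<Integer> ok = Arrays.asList(4, 4, ENDMODULE, 1);
        System.out.println(skipModule(ok, 0)); // prints 3

        try {
            skipModule(Arrays.asList(4, 4, 4), 0);
        } catch (IllegalStateException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

Whether to throw, log and continue, or treat EOF as an implicit ENDMODULE depends on how forgiving the tool needs to be with truncated files.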

On 6/14/05, Greg Bedwell <gregbedwell at gmail.com> wrote:
> Hi,
> 
> I've scoured the documentation for an answer to this but I find it
> quite hard to work out where a potential answer may lie.
> 
> I'm using ANTLR to strip some information from the headers of verilog
> modules (inputs, outputs, associated widths, parameters, etc) which
> ALMOST works fine, except for one tiny little problem.
> 
> I'm only interested in parsing the first few lines of each module.  I
> don't care about the contents of the module body at all.  Currently it
> works perfectly if I manually delete the contents of the body from the
> verilog module, but that is not a feasible solution once this program
> leaves my hands :).
> 
> I need a way to define a rule in the parser that says: skip everything
> until the word "endmodule" appears.  Here's an example of the code I
> am parsing.  There will be potentially hundreds of modules in each
> file I parse:
> 
> module myModule (out, in1, sel);
> 
> parameter size1 = 8;
> parameter size2 = 4;
> 
> output [size1-1:0] out;
> input [size1*size2-1:0] in1;
> input [size2-1:0] sel;
> 
> // Everything from here I want to ignore
> 
> reg [size1-1:0] out;
> integer i,j,temp;
> 
> always @ (in1 or sel)
> begin
>     // Lots of verilog code.
> end
> 
> // Ignore until here
> 
> endmodule
> 
> 
> I want to stay away from having to use an entire verilog grammar if
> possible because it seems a massive overkill and will complicate my
> solution which at the moment is relatively few lines long :)
> 
> Thanks
>

