[antlr-interest] How to retrieve free-form text between delimiters?

Thomas Brandon tbrandonau at gmail.com
Mon Jul 23 01:39:31 PDT 2007


On 7/23/07, Andrew Lentvorski <bsder at allcaps.org> wrote:
> I'm trying to create a parser for a VCD (Verilog change dump) file.  I'm
> trying to pull the free-form text from between delimiters.  My VCD file
> looks like this:
>
>
> $date
>          Fri Jan 26 11:28:51 2007
> $end
> $comment
>          Could be anything! 12345 #12345
> $end
> $comment Could be anything 2! 12345 #12345 $end
>
>
> How do I retrieve the text from between the $date ... $end or $comment
> ... $end pairs?
>
> I tried this grammar:
>
>
> grammar vcdfile;
>
> vcd     :       declaration_command+ EOF
>         ;
>
> declaration_command
>         : date_dcmd | comment_dcmd
>         ;
>
> date_dcmd:      '$date' ( options {greedy=false;} : . )* '$end'
> {System.out.println("D:"+$date_dcmd.text);} ;
> comment_dcmd:   '$comment' ( options {greedy=false;} : . )* '$end'
> {System.out.println("C:"+$comment_dcmd.text);};
>
> INT     :       '0'..'9'+;
> WS      :       (' '|'\t'|'\n'|'\r')+ {skip();} ;
>
>
> However, I wind up with a bunch of errors like this:
> line 2:8 no viable alternative at character 'F'
> line 2:9 no viable alternative at character 'r'
> line 2:10 no viable alternative at character 'i'
> <... lots more deleted ...>
>
> And output like this:
> D:$date261128512007$end
> C:$comment1234512345$end
> C:$comment21234512345$end
>
> Any suggestions as to what I need to do?  I thought that a . was
> supposed to match *anything*, but clearly my definition of anything and
> ANTLR's definition of anything don't correspond.
>
> Any advice for solving this?
A . does match anything, but in a parser "anything" means any token,
not any character. Your lexer only matches digits and whitespace, so
every other character is an error. You need to either move your date
and comment rules into the lexer, or make the lexer return tokens for
any input that can occur in dates and comments. If you add a catch-all
lexer rule after your other lexer rules, like:
ANY: .;
then your example should work. However, depending on how you want to
process the input, moving the rules into the lexer may be the better
option.
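
As a rough sketch of the lexer-based option, assuming ANTLR 3 syntax as in
your grammar: the token names DATE and COMMENT are made up for this
illustration, and I have dropped INT because nothing else in the sample
needs it. This is untested, so treat it as a starting point rather than a
finished grammar.

```
grammar vcdfile;

vcd     :       declaration_command+ EOF
        ;

declaration_command
        :       DATE    {System.out.println("D:"+$DATE.text);}
        |       COMMENT {System.out.println("C:"+$COMMENT.text);}
        ;

// In a lexer rule, '.' matches any single character, so the non-greedy
// loop consumes everything up to the first '$end'.
DATE    :       '$date' ( options {greedy=false;} : . )* '$end' ;
COMMENT :       '$comment' ( options {greedy=false;} : . )* '$end' ;

WS      :       (' '|'\t'|'\n'|'\r')+ {skip();} ;
```

Here each token's text includes the delimiters and the original whitespace,
since the whole block is matched as one token. With the ANY approach the
WS rule still skips whitespace, so the rule text would probably come out
with the spaces missing, as in your current output. Sending WS to the
hidden channel with {$channel=HIDDEN;} instead of skip() is one way that
might preserve it.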

Tom.
>
> Thanks,
> -a
>
>

