[antlr-interest] parsing just a subset of a grammar

Mon Nov 19 11:23:22 PST 2012

Hi,

I'm new to ANTLR and I seek for a good advice.

Here is my story. I'm parsing Cisco IOS config files. They are quite
loosely defined but actually I don't need to have whole the config
file parsed. I'm interested in just a subset of the config file (acl
rule below) and I don't really care about all other parts of the file
right now. Having said it, in the future I'll need to add other parts
as well (e.g. rule for interfaces definition) but again, I don't need
to have all of the config file parsed. I don't want to implement
complete Cisco IOS grammar since seams it would become a very hard
task indeed.

To ignore all not interesting parts of the config file I defined the
grammar this way:

/*
 * Parser Rules
 */

config: (acl | any)* EOF;
any: (ID|INT)* EOL;
acl: 'ip' 'access-list' 'extended'? ID EOL (remark | rule)+ EOF;
remark: (index)? 'remark' (~EOL)* EOL;
rule: (index)? verb protocol source source_port destination
destination_port flag? log? EOL;

// Not so interesting parser rules here...

/*
 * Lexer Rules
 */

fragment
CHAR: 'a'..'z' | 'A'..'Z' | '_' | '-' | '.' | '+' | '/' | ':' | '%';
fragment
NUMBER: '0'..'9';
INT: NUMBER+;
ID: CHAR (CHAR | NUMBER)*;
EOL: ('\r' | '\n')+;
WS: (' ' | '\t') { $channel=HIDDEN; };
COMMENT: '!' (~('\r' | '\n'))* EOL { $channel=HIDDEN; };
ILLEGAL: .;

It turns out ANTLR doesn't behave the way I expected =) What I wanted
is for ANTLR to parse the following line "no ip bootp server" via
'any' rule but ANTLR finds 'ip' token in the line and treats the line
as not correct 'acl' rule. Specifying syntactic predicate "config:
(('ip' 'access-list') => acl | any)* EOF" only makes things worse
judging by ANTLRWorks output - parser stops almost immediately with an
unrecoverable error.

My question is - is there a way to achieve the kind of filtering I'm
talking about (parse only 'acl', ignore anything else) via ANTLR
grammar? What should I use? Syntactic predicate? Several-pass parsing?
Custom lexer (how do I even start implementing such beast?)? Parse out
all interesting sections from a file via regex before supplying them
to ANTLR grammar that is only ACL-oriented (at least I know how to
implement this last option)?

-- Alexander