[antlr-interest] Island grammar for reading shell commands

Bill Lear rael at zopyra.com
Mon Nov 29 19:00:46 PST 2010


I've followed the sensible advice to create an island grammar to deal
with parsing an unstructured shell command alongside more structured
input.  Unfortunately, when I follow the examples in the ANTLR
examples zip file, I get very close, but then either a
NullPointerException bites, or I can only parse one line of "island"
input: I can't figure out how to return control to the "sea"
parser and have it read more input.

I've tried to boil it down to its essentials.  If someone could have
a look, I'd appreciate it (I've been working on it for hours and can't
quite get it to work).

Below my sig are the details.

Many thanks in advance for any help you can offer.


Bill

Here is some sample input:

shell ls /var/log
cleanup
cleanlog
cleanup -timeout 20
cleanlog -timeout 20
cleanup -timeout 20 -notify wlear@paypal.com
cleanlog -timeout 20 -notify wlear@paypal.com
cleanup -timeout 20 -notify "wlear@paypal.com foo@bar.com"
cleanlog -timeout 20 -notify "wlear@paypal.com foo@bar.com"
shell find /var/log/qmail -type f -name '@*' | xargs rm -f
shell -timeout 20 ls /tmp
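
To make the intended split concrete, here is a minimal plain-Java sketch
(no ANTLR involved) of how I would expect each input line to be routed:
anything after the keyword "shell" is handed over verbatim as the "island",
and every other line stays in the "sea".  The class name IslandSketch and
the methods shellCommand and classify are hypothetical, made up for this
illustration only; they are not part of the grammars in this post.

```java
import java.util.ArrayList;
import java.util.List;

public class IslandSketch {
    /**
     * Hypothetical helper: returns the raw shell command if the line
     * starts with the keyword "shell", or null for any other line.
     */
    static String shellCommand(String line) {
        String t = line.trim();
        if (t.equals("shell")) {
            return "";                      // keyword with no command
        }
        if (t.startsWith("shell") && Character.isWhitespace(t.charAt(5))) {
            return t.substring(5).trim();   // rest of line, untouched
        }
        return null;                        // not a shell line
    }

    /** Hypothetical helper: labels each line as "island" or "sea". */
    static List<String> classify(List<String> lines) {
        List<String> out = new ArrayList<String>();
        for (String line : lines) {
            String cmd = shellCommand(line);
            if (cmd != null) {
                out.add("island:[" + cmd + "]");
            } else {
                out.add("sea:[" + line.trim() + "]");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<String>();
        sample.add("shell ls /var/log");
        sample.add("cleanup -timeout 20");
        sample.add("shell find /var/log/qmail -type f -name '@*' | xargs rm -f");
        for (String s : classify(sample)) {
            System.out.println(s);
        }
    }
}
```

The point is only that the island is one line long: after each shell line,
control should come back to the sea and the next line should be read
normally, which is the part I can't get the two ANTLR parsers to do.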

Here is my driver program:

import org.antlr.runtime.*;

public class Command {
    public static void main(String[] args) throws Exception {
        new CommandParser(
            new CommonTokenStream(
                new CommandLexer(
                    new ANTLRInputStream(System.in)))).commands();
    }
}


Here is the "island" grammar (Shell.g):

grammar Shell;

@parser::members {
    private String command;
    public String getCommand() {
        return command;
    }
}

shell: REST_OF_LINE {
        command = $REST_OF_LINE.text.trim();
    }
    ;

REST_OF_LINE: (options {greedy=false;} : . )* '\r'? '\n' {
        // If this is uncommented, I get a NullPointerException.
        //emit(Token.EOF_TOKEN);
    }
    ;

Here is the "sea" grammar (Command.g):

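grammar Command;

@header {
    // Needed by the List/ArrayList used in the command scope below.
    import java.util.ArrayList;
    import java.util.List;
}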

@lexer::members {
    public static final int SHELL_CHANNEL = 1;
}

commands : command+ ;

command
scope {
    int timeout;
    List<String> notifyList;
}
@init {
    $command::timeout = -1;
    $command::notifyList = new ArrayList<String>();
}
    : cleanup
    | cleanlog
    | SHELL
    | NEWLINE
    ;

cleanup
    : CLEANUP command_options? {
        System.out.println("cleanup::timeout=" + $command::timeout
                           + " email=" + $command::notifyList);
    }
    ;

cleanlog
    : CLEANLOG command_options? {
        System.out.println("cleanlog::timeout=" + $command::timeout
                           + " email=" + $command::notifyList);
    }
    ;

SHELL
    : 'shell' {
        System.out.println("Got shell.  Going native.");

        ShellLexer l = new ShellLexer(input);

        CommonTokenStream tokens = new CommonTokenStream(l);

        ShellParser parser = new ShellParser(tokens);
        parser.shell();

        String command = parser.getCommand();

        System.out.println("Got command from ShellParser:[" + command + "]");
        $channel = SHELL_CHANNEL;
    }
    ;

command_options
    : timeoutOption
    | notifyOption
    | timeoutOption notifyOption
    | notifyOption timeoutOption
    ;

timeoutOption
    : TIMEOUT INT { $command::timeout = Integer.parseInt($INT.text); }
    ;

notifyOption
    : NOTIFY EMAIL {
        $command::notifyList.add($EMAIL.text);
    }
    | NOTIFY QUOTED_STRING {
        String[] l = $QUOTED_STRING.text.split("\\s+");

        for (int i = 0; i < l.length; i++) {
            $command::notifyList.add(l[i]);
        }
    }
    ;

CLEANUP: 'cleanup' ;
CLEANLOG: 'cleanlog' ;
TIMEOUT: '-timeout' ;
NOTIFY: '-notify' ;
INT: '0'..'9'+ ;

QUOTED_STRING:
    '"' ( ESCAPE_SEQUENCE | ~('\\'|'"') )* '"' {
        setText(getText().substring(1, getText().length() - 1));
    }
    | '\'' ( ESCAPE_SEQUENCE | ~('\\'|'\'') )* '\'' {
        setText(getText().substring(1, getText().length() - 1));
    }
    ;

WS: (' ' | '\t')+ { skip(); } ;

NEWLINE: '\r'? '\n' ;

EMAIL: ~('\n' | '\r' | ' ' | '"')+ ;

COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' { skip(); }
    | '/*' ( options {greedy=false;} : . )* '*/' { skip(); }
    ;

fragment
ESCAPE_SEQUENCE : '\\' ('\"'|'\''|'\\') ;

