[antlr-interest] Reading all text to end-of-line in a rule

Bill Lear rael at zopyra.com
Fri Nov 26 08:18:50 PST 2010


I have searched in vain for a solution to this, though as this is my
first attempt at writing an Antlr grammar, perhaps I just don't know
enough about Antlr to form the right search.

I am trying to write a grammar to support simple one-line command
constructs of the following form:

   clean [-timeout <N>] [-notify (<email> | "<email1> <email2>...")]
   shell [-timeout <N>] shell_command_text

Where "shell_command_text" above is simply the remaining text on the
line.

Examples:

    clean -timeout 20
    clean -notify "foobar@biz.com joe@bla.com" -timeout 2
    shell ls -l /tmp
    shell /x/home/boo/jre/bin/java -jar /tmp/j.jar DoIt -timeout 22 fizbuz
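
To pin down the intended result for each example, here is a minimal hand-written sketch in Java of how these lines are meant to be split. It is not ANTLR code and not a proposed solution. The class name CommandLineSketch, the parse method and the map keys are hypothetical names chosen for illustration. It ignores escape sequences inside quotes and does not check that the timeout is numeric.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits one command line by hand to show the
// result the grammar is supposed to produce for 'clean' and 'shell'.
class CommandLineSketch {

    static Map<String, String> parse(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        String[] head = line.trim().split("\\s+", 2);
        out.put("command", head[0]);
        String rest = head.length > 1 ? head[1].trim() : "";

        if (head[0].equals("shell")) {
            // Only a leading -timeout <N> is an option; the remainder of
            // the line is kept verbatim as the shell command text.
            String[] parts = rest.split("\\s+", 3);
            if (parts[0].equals("-timeout") && parts.length >= 2) {
                out.put("timeout", parts[1]);
                rest = parts.length == 3 ? parts[2] : "";
            }
            out.put("text", rest);
            return out;
        }

        // 'clean': options may appear in any order.
        while (!rest.isEmpty()) {
            String[] opt = rest.split("\\s+", 2);
            String value = opt.length > 1 ? opt[1].trim() : "";
            if (value.isEmpty()) {
                throw new IllegalArgumentException("missing value for " + opt[0]);
            }
            String arg;
            if (value.startsWith("\"")) {
                // Quoted value: take everything up to the closing quote.
                int close = value.indexOf('"', 1);
                if (close < 0) {
                    throw new IllegalArgumentException("unterminated quote");
                }
                arg = value.substring(1, close);
                rest = value.substring(close + 1).trim();
            } else {
                // Bare value: a single whitespace-delimited word.
                String[] v = value.split("\\s+", 2);
                arg = v[0];
                rest = v.length > 1 ? v[1] : "";
            }
            if (opt[0].equals("-timeout")) {
                out.put("timeout", arg);
            } else if (opt[0].equals("-notify")) {
                out.put("notify", arg);
            } else {
                throw new IllegalArgumentException("unknown option " + opt[0]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("clean -timeout 20"));
        System.out.println(parse(
            "shell /x/home/boo/jre/bin/java -jar /tmp/j.jar DoIt -timeout 22 fizbuz"));
    }
}
```

Note that in the last example the "-timeout 22" belongs to the shell command text, since it does not directly follow the word "shell".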

In my current grammar, I have, among other things, essentially
the following:

    clean: 'clean' commandOptions? ;

    shell: 'shell' timeoutOption? COMMAND_TEXT ;

    commandOptions: timeoutOption | notifyOption |
        timeoutOption notifyOption | notifyOption timeoutOption ;

    timeoutOption: '-timeout' INT ;

    notifyOption: '-notify' EMAIL | '-notify' QUOTED_STRING ;

    COMMAND_TEXT: ~('\n' | '\r')+ {
        setText(getText().trim());
    } ;

    QUOTED_STRING:
        '"' ( EscapeSequence | ~('\\'|'"') )* '"' {
            setText(getText().substring(1, getText().length() - 1));
        } | '\'' ( EscapeSequence | ~('\\'|'\'') )* '\'' {
            setText(getText().substring(1, getText().length() - 1));
        } ;

    fragment
    EscapeSequence : '\\' ('\"'|'\''|'\\') ;
    INT: '0'..'9'+ ;
    ID: 'a'..'z'+ ;
    EMAIL: ~('\n' | '\r' | ' ' | '"')+ ;
    NEWLINE: '\r'? '\n' ;
    WS: (' ' | '\t')+ { skip(); } ;

Which of course does not work, as the COMMAND_TEXT rule basically
obliterates the others:

error(208): Command.g:133:1: The following token definitions can never be matched because prior tokens match the same input: INT,ID,EMAIL,WS
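
As a rough illustration of the shadowing that error describes, the sketch below approximates the lexer rules with java.util.regex patterns and checks that a sample input for each later rule is also matched in full by COMMAND_TEXT. The regexes are hand translations of the rules above and the class name ShadowingSketch is hypothetical; this is not how ANTLR performs its analysis, only a way to see the overlap.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical demo: regex approximations of the lexer rules, used to
// show that COMMAND_TEXT accepts whatever the later rules accept.
class ShadowingSketch {

    // COMMAND_TEXT: ~('\n' | '\r')+
    static final Pattern COMMAND_TEXT = Pattern.compile("[^\\n\\r]+");

    // Rules defined after COMMAND_TEXT in the grammar.
    static final Map<String, Pattern> LATER = new LinkedHashMap<>();
    static {
        LATER.put("INT", Pattern.compile("[0-9]+"));
        LATER.put("ID", Pattern.compile("[a-z]+"));
        LATER.put("EMAIL", Pattern.compile("[^\\n\\r \"]+"));
        LATER.put("WS", Pattern.compile("[ \\t]+"));
    }

    // True when the sample is a valid token for the named rule and
    // COMMAND_TEXT matches the same characters in full.
    static boolean shadowed(String ruleName, String sample) {
        return LATER.get(ruleName).matcher(sample).matches()
            && COMMAND_TEXT.matcher(sample).matches();
    }

    public static void main(String[] args) {
        Map<String, String> samples = new LinkedHashMap<>();
        samples.put("INT", "20");
        samples.put("ID", "ls");
        samples.put("EMAIL", "joe@bla.com");
        samples.put("WS", " \t");
        for (Map.Entry<String, String> e : samples.entrySet()) {
            System.out.println(e.getKey() + " shadowed: "
                + shadowed(e.getKey(), e.getValue()));
        }
    }
}
```

Because the lexer tokenizes without knowing which parser rule is active, COMMAND_TEXT competes with these rules on every line, not only after 'shell'.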

So, I'm at a loss as to how to support the need to read until the end of
line, or end of file, for the 'shell' rule.

I'm assuming a syntactic predicate or some other trickery is in order,
but I'm simply not able to figure it out after much head scratching.

Would anyone here be able to help with this?  Any other helpful
criticism of the above would also be welcome (a better EMAIL rule?
a better way to handle '-' type options?).

Thank you.


Bill

