[antlr-interest] Reading all text to end-of-line in a rule

Martijn Reuvers martijn.reuvers at gmail.com
Fri Nov 26 10:23:24 PST 2010


Hello Bill,

The grammar below does the trick for your shell command (I stripped it
down somewhat to get it working quickly). See the SHELL_COMMAND token
and note the options block with greedy=false: the wildcard loop
consumes characters only until it reaches the end-of-line sequence
('\r'? '\n'). The token needs a fixed prefix in front of the loop, like
SHELL in this example; otherwise it would match any line at all, which
is not what you want.

Martijn



grammar Test;

start
  :    email
  | SHELL_COMMAND
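      // Drop the 5-character 'shell' prefix, then trim the separating
      // whitespace and the trailing line terminator kept in the token text.
      { System.out.println("cmd=" + $SHELL_COMMAND.text.substring(5).trim()); }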
  ;


email
  :    EMAIL TIMEOUT INT
  ;

EMAIL
    :    'email'
    ;

TIMEOUT
    : 'timeout'
    ;

SHELL
  :    'shell'
  ;

INT
  : '0'..'9'+
  ;

SHELL_COMMAND
    : SHELL (options {greedy=false;} : . )* '\r'? '\n'
    ;

WS
  : (' ' | '\t')+ { skip(); }
  ;
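
For anyone who wants to see the non-greedy idea outside of ANTLR, here
is a rough sketch in plain Java. It is not generated code and not part
of the grammar: the class name ShellLine and the method commandText are
made up for illustration, and a reluctant regex quantifier (.*?) stands
in for the (options {greedy=false;} : . )* loop. Like the token rule,
it assumes the line starts with 'shell' and ends with a newline.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it mimics the SHELL_COMMAND
// token with a regular expression instead of an ANTLR lexer.
final class ShellLine {
    // 'shell', then as few characters as possible (reluctant .*?),
    // then the line terminator. DOTALL lets '.' match any character,
    // similar to the ANTLR wildcard, so only the reluctant quantifier
    // stops the match at the first end of line.
    private static final Pattern SHELL_COMMAND =
        Pattern.compile("shell(.*?)\\r?\\n", Pattern.DOTALL);

    // Returns the trimmed command text if the input starts with a
    // complete shell line, or an empty Optional otherwise.
    static Optional<String> commandText(String input) {
        Matcher m = SHELL_COMMAND.matcher(input);
        if (!m.lookingAt()) {
            return Optional.empty();
        }
        return Optional.of(m.group(1).trim());
    }

    public static void main(String[] args) {
        // prints "cmd=ls -l /tmp"
        System.out.println("cmd=" + commandText("shell ls -l /tmp\n").orElse("<no match>"));
    }
}
```

With a greedy quantifier (.*) and two lines of input, the match would
run on to the last newline, which is the same reason the lexer rule
needs greedy=false.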


On Fri, Nov 26, 2010 at 5:18 PM, Bill Lear <rael at zopyra.com> wrote:
>
> I have searched in vain for a solution to this, though as this is my
> first attempt at writing an Antlr grammar, perhaps I just don't know
> enough about Antlr to form the right search.
>
> I am trying to write a grammar to support simple one-line command
> constructs of the following form:
>
>   clean [-timeout <N>] [-notify (<email> | "<email1> <email2>...")]
>   shell [-timeout <N>] shell_command_text
>
> Where "shell_command_text" above is simply the remaining text on the
> line.
>
> Examples:
>
>    clean -timeout 20
>    clean -notify "foobar@biz.com joe@bla.com" -timeout 2
>    shell ls -l /tmp
>    shell /x/home/boo/jre/bin/java -jar /tmp/j.jar DoIt -timeout 22 fizbuz
>
> In my current grammar, I have, among other things, essentially
> the following:
>
>    clean: 'clean' commandOptions? ;
>
>    shell: 'shell' timeoutOption COMMAND_TEXT ;
>
>    commandOptions: timeoutOption | notifyOption |
>        timeoutOption notifyOption | notifyOption timeoutOption ;
>
>    timeoutOption: '-timeout' INT ;
>
>    notifyOption: '-notify' EMAIL | '-notify' QUOTED_STRING ;
>
>    COMMAND_TEXT: ~('\n' | '\r')+ {
>        setText(getText().trim());
>    } ;
>
>    QUOTED_STRING:
>        '"' ( EscapeSequence | ~('\\'|'"') )* '"' {
>            setText(getText().substring(1, getText().length() - 1));
>        } | '\'' ( EscapeSequence | ~('\\'|'\'') )* '\'' {
>            setText(getText().substring(1, getText().length() - 1));
>        } ;
>
>    fragment
>    EscapeSequence : '\\' ('\"'|'\''|'\\') ;
>    INT: '0'..'9'+ ;
>    ID: 'a'..'z'+ ;
>    EMAIL: ~('\n' | '\r' | ' ' | '"')+ ;
>    NEWLINE: '\r'? '\n' ;
>    WS: (' ' | '\t')+ { skip(); } ;
>
> Which of course does not work, as the COMMAND_TEXT rule basically
> obliterates the others:
>
> error(208): Command.g:133:1: The following token definitions can never be matched because prior tokens match the same input: INT,ID,EMAIL,WS
>
> So, I'm at a loss as to how to support the need to read until the end of
> line, or end of file, for the 'shell' rule.
>
> I'm assuming a syntactic predicate or some other trickery is in order,
> but I'm simply not able to figure it out after much head scratching.
>
> Would anyone here be able to help with this?  Any other helpful
> criticism of the above would also be welcome (a better EMAIL rule?
> a better way to handle '-' type options?).
>
> Thank you.
>
>
> Bill
>

