[antlr-interest] Island grammar for reading shell commands

Mike Matera mike.matera at xilinx.com
Tue Nov 30 21:32:09 PST 2010


Hi Bill,

I think you're doing something weird here:

ShellLexer l = new ShellLexer(input);

You are reusing the CharStream that your parser is using.  I don't know
if this is really supposed to work or not.  Try this instead:

ShellLexer l = new ShellLexer(new ANTLRStringStream($SHELL.getText()));

This copies the characters into a new buffer, so the sub-lexer cannot
disturb the stream your top-level parser is reading from.  I use this
exact pattern to parse strings in my language.

One other thing you might consider is not parsing the shell command at
all.  If all you intend to do is run the command you can simply split
the command on whitespace then use java.lang.ProcessBuilder.  Here's
what I mean:

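// The token text starts with 'shell' and ends with the newline; strip both
String text = $SHELL.getText().substring("shell".length()).trim();
String[] cmd = text.split("\\s+");
ProcessBuilder bld = new ProcessBuilder(cmd);
Process p = bld.start();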
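
For what it's worth, here is a slightly fuller sketch of the split-and-run idea in plain Java.  It assumes the token text still carries the leading 'shell' keyword and the trailing newline, so both are stripped before splitting.  The names ShellArgs, toArgv and run are made up for illustration and are not part of ANTLR or of your grammar.  Keep in mind that a plain whitespace split does not honor quoting, so an argument such as "my file" would be broken in two.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ANTLR or of the original grammar.
class ShellArgs {
    // Assumes tokenText looks like "shell <command> <args...>\n".
    static List<String> toArgv(String tokenText) {
        String rest = tokenText.trim();
        if (rest.startsWith("shell")) {
            rest = rest.substring("shell".length()).trim();
        }
        if (rest.isEmpty()) {
            throw new IllegalArgumentException("no command after 'shell'");
        }
        // Plain whitespace split: quoted arguments are NOT kept together.
        return Arrays.asList(rest.split("\\s+"));
    }

    // Starts the command, waits for it to finish, and returns its exit code.
    static int run(String tokenText) throws IOException, InterruptedException {
        ProcessBuilder bld = new ProcessBuilder(toArgv(tokenText));
        bld.inheritIO(); // child shares this process's stdin/stdout/stderr
        return bld.start().waitFor();
    }
}
```

From a parser action this could be called as ShellArgs.run($SHELL.getText()), provided the enclosing rule is allowed to throw the checked exceptions.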

In order to make this trick work you want to stuff all of the shell
characters into a single token.  Here's how you could do that:

SHELL: 'shell' ~('\n' | '\r')+ ('\r')? '\n' ;

This token definition is like the ones you would use to implement a
mechanism similar to a '#define'.  The token blindly snarfs everything
after 'shell' until the end of line.  This is good because the
non-syntax elements won't bother your parser and because it preserves
whitespace which you will use later to tokenize your shell command. 
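
Once the whole line is in one token, the '-timeout N' option from your example can be picked off by hand before the command is run.  The following is only a sketch under assumptions: the line has the shape "shell [-timeout N] command args...", the option (if present) comes first, and the names ShellLine and parse are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for one parsed 'shell' line; names are illustrative.
class ShellLine {
    final int timeoutSeconds;   // -1 when no -timeout option was given
    final List<String> argv;    // the command and its arguments

    ShellLine(int timeoutSeconds, List<String> argv) {
        this.timeoutSeconds = timeoutSeconds;
        this.argv = argv;
    }

    // Assumes tokenText looks like "shell [-timeout N] <command> <args...>\n".
    static ShellLine parse(String tokenText) {
        List<String> words =
            new ArrayList<>(Arrays.asList(tokenText.trim().split("\\s+")));
        words.remove(0); // drop the leading 'shell' keyword
        int timeout = -1;
        if (words.size() >= 2 && words.get(0).equals("-timeout")) {
            timeout = Integer.parseInt(words.get(1)); // throws on a non-number
            words.subList(0, 2).clear();              // remove option and value
        }
        return new ShellLine(timeout, words);
    }
}
```

The resulting argv can be handed to ProcessBuilder, and a non-negative timeout could be enforced with Process.waitFor(long, TimeUnit), which is available from Java 8 on.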

Hope this helps!

Cheers
./m



On 11/30/2010 06:05 PM, Bill Lear wrote:
> On Tuesday, November 30, 2010 at 15:49:39 (-0800) Jim Idle writes:
>> REST_OF_LINE allows an empty token which will immediately match nothing and
>> continue to do so forever. You want +  not *. I think you might be doing
>> this wrong to be honest. I would probably not use ANTLR for this.
> Maybe, but Antlr is so cool ...
>
> I did try replacing * with + and I got the same error.  Blech.  I really
> didn't want to write a parser by hand for all of this.  I've got
> significantly more to do than what I've got here, the rest of which
> should be easy to handle for antlr.
>
> So, there is really no viable way to have Antlr read the rest of the
> line of input?  I would be happy to just write the Java code for that
> one line:
>
> shell -timeout 30 find /var/log -name ....
>
> If I could just get hold of the input stream, read to end of line,
> I could hand-parse the '-timeout N' part, etc., and then let the
> lexer continue reading on the next line.  Is there no way to "cut out"
> a part of the input like this to process separately?
>
> Actually, I think I have a sick idea: Since this is line-based stuff,
> and small files, I can read this all in to memory.  I can hand-parse
> the shell command lines, and replace them with an empty line (to
> preserve line numbers in case of error) in the input, noting which
> lines I modified.  Then, I can give the massaged input to Antlr.  The
> parser is going to produce one instance of a Command class for each
> line, and return a list of them, in order.  I can just put the shell
> Command instances back in the list where they belong and be on my
> merry way.
>
> As I said though, it would be really cool if I could just do this all
> in Antlr.
>
>
> Bill
