[antlr-interest] parsing the matlab language

Jim Idle jimi at temporal-wave.com
Thu Apr 14 07:55:02 PDT 2011


Matlab parsing is more difficult than this because it is such an abysmally
specified language, made worse by the recent attempts to add objects to
it. For instance, one part of the syntax boils down to "include this file
if it is a file, otherwise it is this syntax". Because Matlab is
interpreted, it is easy to hack something together to do that at run time,
but a parser needs the whole structure of the source code available to
make that decision, or it will get it wrong. I did not complete the whole
syntax, but it is hairy. You may be better off with a custom lexer.

Jim



> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Ron Burk
> Sent: Thursday, April 14, 2011 7:45 AM
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] parsing the matlab language
>
> Couldn't find a rigorous language spec for Matlab, but it kinda looks
> like the lexer can handle the distinction by remembering the previous
> token type returned (e.g., if the previous token type was X, or X., or
> ), then the single quote is an operator instead of a string -- the
> exact rule depends on the exact language spec).
>
> Obviously, you don't want the parser trying to piece together a quoted
> string out of whatever (possibly illegal!) tokens appeared inside it,
> so this is a job for the lexer.
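
A minimal sketch of the previous-token idea described above, written in
Python rather than as an ANTLR lexer. Everything in it is an assumption
made for illustration: the token names, the OPERAND_END set, and the
tokenize function are hypothetical and not taken from any Matlab spec.
It treats a single quote as a transpose operator only when the previous
token could end an operand, and otherwise reads a whole quoted string.

```python
import re

# Token types after which a single quote is read as transpose.
# This set is an illustrative assumption, not a rule from a Matlab spec.
OPERAND_END = {"ID", "NUM", "RPAREN", "RBRACKET", "RBRACE", "TRANSPOSE"}

TOKEN_SPEC = [
    ("WS", r"[ \t]+"),
    ("NUM", r"\d+(?:\.\d+)?"),
    ("ID", r"[A-Za-z_]\w*"),
    ("TRANSPOSE", r"\.'"),            # the X.' form
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("OP", r"\.[*/^]|[-+*/^=,;:]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

# A quoted string; a doubled quote inside it is an escaped quote.
STRING = re.compile(r"'(?:[^'\n]|'')*'")


def tokenize(src):
    """Return (type, text) pairs, using the previous token type to
    decide whether a single quote is an operator or starts a string."""
    tokens = []
    prev = None  # type of the last non-whitespace token
    pos = 0
    while pos < len(src):
        if src[pos] == "'":
            if prev in OPERAND_END:
                # Previous token can end an operand: quote is transpose.
                tokens.append(("TRANSPOSE", "'"))
                prev = "TRANSPOSE"
                pos += 1
                continue
            # Otherwise the lexer consumes the entire string itself.
            m = STRING.match(src, pos)
            if m is None:
                raise SyntaxError(f"unterminated string at offset {pos}")
            tokens.append(("STRING", m.group()))
            prev = "STRING"
            pos = m.end()
            continue
        m = MASTER.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected {src[pos]!r} at offset {pos}")
        pos = m.end()
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
            prev = m.lastgroup
    return tokens
```

For example, tokenize("a'") yields an ID followed by a TRANSPOSE, while
tokenize("x = 'it''s'") yields a single STRING token after the "=". The
sketch skips whitespace without recording it, so it cannot tell apart
cases where spacing matters (for instance inside square brackets); a
real lexer would need the exact rule from the language definition.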

