[antlr-interest] parsing the matlab language

Graham Wideman gwlist at grahamwideman.com
Thu Apr 14 17:58:47 PDT 2011


Gary:

Which version(s) of Matlab is the code you are hoping to digest written for?  I'm also curious whether you have any comments on how much the versions you want to process differ from one another.

Jim:

Sounds like you've looked at Matlab from a syntax point-of-view in some depth.  I've wallowed in Matlab (R2010) for many months as a user, and have noted its share of thing-on-a-thing aspects, such as two different object models, one or both of which apparently underwent a substantial revision as of R2008.

But I'm curious about your comment on the "maybe include this file" syntax. What version of Matlab does this pertain to?  What syntax is it that has this odd behavior, and what's an example of a syntax issue that depends on it downstream?

I'd like to know because it sounds like a part of Matlab I've not encountered, and it is probably a hazard I want to avoid.  Alternatively, perhaps you are describing a feature set that pertains to a version of Matlab from some time back, and today's Matlab might be somewhat less unruly, which might aid Gary's cause?

-- Graham

At 4/14/2011 07:55 AM, Jim Idle wrote:
>Matlab parsing is more difficult than this because it is such an abysmally
>specified language, made worse by the recent attempts to add objects to
>it. For instance, one part of the syntax boils down to "include this file
>if it is a file otherwise it is this syntax". Because Matlab is
>interpreted, it is easy to hack something together to do that, but when it
>is parsed, it means you will need the whole structure of source code
>available or you will get it incorrect. I did not complete the whole
>syntax but it is hairy. You may be better off with a custom lexer.
>
>Jim
>
>
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Ron Burk
>> Sent: Thursday, April 14, 2011 7:45 AM
>> Cc: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] parsing the matlab language
>>
>> Couldn't find a rigorous language spec for matlab, but it kinda looks
>> like the lexer can handle the distinction by remembering the previous
>> token type returned. (e.g., if the previous token type was X, or X., or
>> ), then the single quote is an operator instead of a string -- the
>> exact rule depends on the exact language spec).
>>
>> Obviously, you don't want the parser trying to piece together a quoted
>> string out of whatever (possibly illegal!) tokens appeared inside it,
>> so this is a job for the lexer.
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

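For what it's worth, here is a minimal sketch of the "remember the previous token type" idea Ron describes in the quoted message, written in plain Python rather than as an ANTLR lexer. Everything in it is an assumption for illustration: the token names, the set TRANSPOSE_AFTER, and the rule that whitespace before a quote forces a string are my guesses, not taken from any Matlab spec. The real rule (for example, how whitespace is treated outside of matrix brackets) would need to be checked against Matlab itself.

```python
import re

# Token kinds after which a single quote is taken to be the transpose
# operator. This set is a guess for illustration, not from a Matlab spec.
TRANSPOSE_AFTER = {"IDENT", "NUMBER", "RPAREN", "RBRACKET", "RBRACE",
                   "TRANSPOSE", "DOT_TRANSPOSE"}

# Everything except a bare single quote is matched here.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<IDENT>[A-Za-z]\w*)
  | (?P<DOT_TRANSPOSE>\.')
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<LBRACKET>\[)
  | (?P<RBRACKET>\])
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<OP>[-+*/=,;:.])
  | (?P<WS>[ \t]+)
""", re.VERBOSE)

# A quoted string on one line; a doubled quote inside it is an escaped quote.
STRING_RE = re.compile(r"'(?:[^'\n]|'')*'")


def tokenize(source):
    """Return (kind, text) pairs, deciding what each quote means in the lexer."""
    tokens = []
    prev_kind = None  # kind of the token just before the current position
    pos = 0
    while pos < len(source):
        if source[pos] == "'":
            if prev_kind in TRANSPOSE_AFTER:
                # Quote directly after an operand: treat it as transpose.
                kind, text = "TRANSPOSE", "'"
            else:
                # Otherwise the quote opens a string; consume it whole so the
                # parser never sees the characters inside it as tokens.
                m = STRING_RE.match(source, pos)
                if m is None:
                    raise SyntaxError(f"unterminated string at offset {pos}")
                kind, text = "STRING", m.group()
        else:
            m = TOKEN_RE.match(source, pos)
            if m is None:
                raise SyntaxError(f"unexpected character at offset {pos}")
            kind, text = m.lastgroup, m.group()
        pos += len(text)
        # Whitespace is remembered as the previous kind (assumption: a quote
        # after whitespace starts a string) but is not emitted as a token.
        prev_kind = kind
        if kind != "WS":
            tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for line in ("y = x';", "s = 'it''s';", "[a 'b']"):
        print(line, "->", tokenize(line))
```

The point of doing this in the lexer is the one Ron makes: the string is consumed as a single token, so the parser never has to reassemble it from whatever happened to be inside the quotes. In ANTLR the same decision would presumably live in a lexer predicate or a custom nextToken override, but I have not tried that.
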