[antlr-interest] Re: Local lookahead depth
Oliver Zeigermann
oliver at zeigermann.de
Sun Nov 9 21:40:21 PST 2003
lgcraymer wrote:
>>> Also, as to actions in lookahead code: this is something that Ter
>>> supported in PCCTS under the name "guarded predicates" or some
>>> such. I don't know that it saw much use, and I suspect that usage
>>> indicates a too early incorporation of semantic information into the
>>> translator--tree transformation helps avoid that.
>>
>> 1.) You might really increase the set of parseable languages using
>> this technique
>
> Possible, but I don't really see how. You would have to have
> extreme interdependency between semantics and syntax at the very
> least.
I had an example that was much more of theoretical interest (I do not
recall it right now, but could look it up). On the more practical side,
some grammars might look nicer if you had such a feature.
Admittedly, this is not a really practical example, but consider the
following grammar:
{
int cnt = 0;
}
LANGUAGE
: ( SHORTWORD ) => SHORTWORD { System.out.println("SHORT"); }
| LONGWORD { System.out.println("LONG"); }
;
protected SHORTWORD : { cnt = 0; } ( {cnt < 1000}? '*' { cnt++; } )+ {cnt == 1000}? '#' ;
protected LONGWORD : { cnt = 0; } ( {cnt < 10000}? '*' { cnt++; } )+ {cnt == 10000}? '#' ;
It describes a language with two words:
1.) SHORTWORD: exactly 1000 '*' followed by a single '#'
2.) LONGWORD: exactly 10000 '*' followed by a single '#'
While there are certainly other grammars that describe this language,
this one seems the most natural. It does not work, however, because the
semantic predicates (like {cnt < 1000}?) rely on semantic actions ({
cnt++; }, { cnt = 0; }), and actions are not executed while the
syntactic predicate ( SHORTWORD ) => is being evaluated, so the counter
is never updated during lookahead.
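The effect the grammar is after can be shown with a hand-written sketch
(plain Java, not ANTLR output; the class and method names are made up).
Here the counter action does run during the speculative SHORTWORD
attempt, which is exactly what makes the predicate meaningful:

```java
// Hand-written sketch of the two-word language: a run of '*' ended by '#'.
// Unlike generated lookahead code, the { cnt++; } action runs during the
// speculative attempt, so the {cnt < limit}? predicate sees a live value.
public class StarLexer {
    // Matches exactly 'limit' stars followed by a single '#'.
    static boolean matchWord(String input, int limit) {
        int cnt = 0;                           // the { cnt = 0; } action
        int pos = 0;
        while (pos < input.length() && input.charAt(pos) == '*') {
            if (!(cnt < limit)) return false;  // the {cnt < limit}? predicate
            cnt++;                             // the { cnt++; } action
            pos++;
        }
        return cnt == limit                    // exactly 'limit' stars...
            && pos < input.length()
            && input.charAt(pos) == '#'        // ...then a single '#'
            && pos + 1 == input.length();
    }

    static String classify(String input) {
        // The ( SHORTWORD ) => guard: try SHORTWORD speculatively first,
        // fall back to LONGWORD if the attempt fails.
        if (matchWord(input, 1000))  return "SHORT";
        if (matchWord(input, 10000)) return "LONG";
        return "NO MATCH";
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.append('*');
        sb.append('#');
        System.out.println(classify(sb.toString())); // prints SHORT
    }
}
```

The speculative attempt and the real match are the same method here, so
there is no separate action-free lookahead pass to go stale.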
>> 2.) Sometimes using tree transformation is too expensive
>
> Sometimes it is overkill (unnecessary development), but too
> expensive? I doubt it, especially for languages where lexing and
> parsing are complex. [BTW, my experience is that unsubstantiated
> performance arguments are usually bogus and made in an attempt to
> subjectively win an argument that cannot be won on the basis of
> objective evidence.]
I have the same experience. But consider extremely large amounts of
input to be parsed. In that case it is prohibitive to generate an AST
because of the memory it takes. As a very practical example, I have the
parsing of the AMM (Aircraft Maintenance Manual), which is available in
SGML (very hard to parse, really). I parsed it for a few years using
ANTLR; its size is normally around 100MB. A few years ago my machine
had 128MB of RAM! You see what I mean?
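The point can be illustrated with a minimal streaming sketch (plain
Java; the tag name and helper are hypothetical, not the actual AMM
processing): the input is consumed one line at a time and results are
emitted immediately, so memory use stays flat whether the document is
100KB or 100MB, while building an AST would cost memory proportional to
the whole input:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Streaming sketch: act on elements as they are recognized instead of
// building an AST for the whole document first.
public class StreamingCount {
    // Counts occurrences of a (hypothetical) SGML-style open tag without
    // ever holding more than one line of the input in memory.
    static long countTag(Reader in, String tag) {
        String open = "<" + tag + ">";
        long hits = 0;
        try (BufferedReader r = new BufferedReader(in)) {
            String line;
            while ((line = r.readLine()) != null) { // one line at a time
                int from = 0, at;
                while ((at = line.indexOf(open, from)) >= 0) {
                    hits++;                         // act immediately
                    from = at + open.length();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;                                // nothing accumulated
    }

    public static void main(String[] args) {
        String doc = "<task>a</task>\n<task>b</task>\n";
        System.out.println(countTag(new StringReader(doc), "task")); // prints 2
    }
}
```

The same shape works with ANTLR by putting the output actions directly
in the parser rules instead of requesting tree construction.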
Oliver