[antlr-interest] Parsing OpenEdge (4GL database language) without preprocessor phase

Pieter van Ginkel pvginkel at gmail.com
Sat Feb 5 01:25:34 PST 2011


I've found the examples on how to parse includes from the lexer. These look
very promising, but I would like some guidance on a specific issue.

My source language allows for the following construct:

File myinc.i:

message "{&a}".

File userid.i:

userid()

And the calling file:

{myinc.i a="Welcome :~" + ~{userid.i~} + ~""}

The preprocessor expands this to:

message "Welcome :" + userid() + "".

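What happens here is that the arguments passed to the include file are
substituted into its contents, and an argument can itself contain an
(escaped) include reference that is expanded after substitution. From what
I've seen I would really like to solve this in the lexer, but I can't see
how I could do this.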

Suggestions are welcome.
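
To make the expansion rules above concrete, here is a minimal Python sketch
of the substitution as I understand it from the example. It is not an ANTLR
lexer and not the real OpenEdge preprocessor: the FILES table, the function
names, the calc.i file name and the exact escaping rules (a tilde makes the
next character literal) are my own assumptions for illustration.

```python
import re

# Hypothetical in-memory "file system"; a real tool would read from disk.
FILES = {
    "myinc.i": 'message "{&a}".',
    "userid.i": "userid()",
    "calc.i": "{&a} * 14",  # assumed name for the earlier {&a} * 14 example
}

# name=value pairs; a value is either a quoted string (which may contain
# tilde escapes) or a run of non-space characters.
ARG_RE = re.compile(r'(\w+)=("(?:~.|[^"~])*"|\S+)')


def unescape(value):
    # Assumption: a tilde makes the following character literal.
    return re.sub(r"~(.)", r"\1", value)


def find_close(text, start):
    # Index of the unescaped "}" that matches the "{" at text[start].
    depth = 0
    i = start
    while i < len(text):
        if text[i] == "~":
            i += 2  # skip the escaped character
            continue
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unterminated include reference")


def expand_reference(body, args):
    if body.startswith("&"):
        # {&name}: substitute the argument, then rescan the result so that
        # include references uncovered by unescaping are expanded too.
        return expand(args.get(body[1:], ""), args)
    name, _, rest = body.partition(" ")
    new_args = {}
    for key, value in ARG_RE.findall(rest):
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes
        new_args[key] = unescape(value)
    return expand(FILES[name], new_args)


def expand(text, args=None):
    args = args or {}
    out = []
    i = 0
    while i < len(text):
        if text[i] == "~" and i + 1 < len(text):
            out.append(text[i:i + 2])  # keep escaped pairs untouched
            i += 2
        elif text[i] == "{":
            end = find_close(text, i)
            out.append(expand_reference(text[i + 1:end], args))
            i = end + 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(expand('{myinc.i a="Welcome :~" + ~{userid.i~} + ~""}'))
# prints: message "Welcome :" + userid() + "".
```

The point of the sketch is the rescan in expand_reference: the argument
value only becomes an include reference after it has been unescaped and
substituted, which is the part I can't see how to map onto the lexer.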

On Mon, Jan 31, 2011 at 2:53 AM, Douglas Godfrey
<douglasgodfrey at gmail.com> wrote:

> Yes. The text of the token {&a} can be replaced in the lexer, so the
> parser would see the value 7.
>
>
> On Sun, Jan 30, 2011 at 9:24 AM, Pieter van Ginkel <pvginkel at gmail.com> wrote:
>
>> This sounds terrific. Will ANTLR treat the contents of {myinc.i} as part
>> of the original file?
>>
>> One more question. The includes are parameterized, e.g.
>>
>> {myinc.i a=7}
>>
>> and the contents of myinc.i:
>>
>> {&a} * 14
>>
>> Is this also possible?
>>
>> On Fri, Jan 28, 2011 at 11:58 PM, Douglas Godfrey <
>> douglasgodfrey at gmail.com> wrote:
>>
>>> ANTLR can implement includes inline in the lexer with a stacked input
>>> stream. When the lexer encounters {myinc.i} it would open a new stream,
>>> switch to it for the tokens "7", "*" and "14", and switch back to the
>>> original stream when it reached EOF in the myinc.i file.
>>>
>>> The 3 tokens from myinc.i would have the file name, line and column from
>>> the include file. The text {myinc.i} would be consumed by the lexer
>>> without any generated token.
>>>
>>> On Fri, Jan 28, 2011 at 7:56 AM, Pieter van Ginkel <pvginkel at gmail.com> wrote:
>>>
>>>> I need to write a parser for OpenEdge (a 4GL database language), but I
>>>> need to preserve facts of the source files that would otherwise be
>>>> lost through the preprocessor.
>>>>
>>>> E.g., the following contrived example:
>>>>
>>>> assign customer.name = {myinc.i}.
>>>>
>>>> And an include myinc.i with the contents:
>>>>
>>>> 7 * 14
>>>>
>>>> I need to have an AST that contains the fact that customer.name was
>>>> assigned with {myinc.i} and not 7 * 14. The includes are normally
>>>> processed using a preprocessor, so theoretically it's possible that
>>>> the includes are accessed anywhere within a file (not in a neat
>>>> location like after the assign in the above example). However, the
>>>> code base is quite clean and this shouldn't pose much of a problem.
>>>>
>>>> The reason I need this is that I am writing an application for source
>>>> analysis for which I need to know every detail of the source file.
>>>>
>>>> Can this be done with ANTLR? Any tips?
>>>>
>>>
>>>
>>
>
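
For reference, the stacked input stream described in the earliest reply
quoted above can be sketched in plain Python as follows. This is only an
illustration of the idea, not ANTLR code: the Token type, the FILES table,
the main.p file name and the token regex are assumptions made up for this
example. Columns are counted from 0.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    file: str
    line: int
    col: int


# Hypothetical in-memory sources; main.p is an assumed name.
FILES = {
    "main.p": "assign customer.name = {myinc.i}.",
    "myinc.i": "7 * 14",
}

# Either an include reference, or a (very rough) token: a dotted
# identifier, a number, or any single non-space character.
TOKEN_RE = re.compile(
    r"\{(?P<inc>[^{}\s]+)\}|(?P<tok>[A-Za-z_]\w*(?:\.\w+)*|\d+|\S)"
)


def lex(start_file):
    # Each stack entry is (file name, text, current offset).
    stack = [(start_file, FILES[start_file], 0)]
    while stack:
        name, text, pos = stack.pop()
        m = TOKEN_RE.search(text, pos)
        if m is None:
            continue  # EOF on this stream: resume the one underneath
        stack.append((name, text, m.end()))
        if m.group("inc"):
            # Consume the reference without emitting a token and switch
            # to the included file.
            inc = m.group("inc")
            stack.append((inc, FILES[inc], 0))
        else:
            line = text.count("\n", 0, m.start()) + 1
            col = m.start() - (text.rfind("\n", 0, m.start()) + 1)
            yield Token(m.group("tok"), name, line, col)


for tok in lex("main.p"):
    print(tok.file, tok.line, tok.col, tok.text)
```

The tokens "7", "*" and "14" come out tagged with myinc.i and their own
line and column, while the tokens before and after carry main.p, so the
origin of each token stays available to later stages.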

