[antlr-interest] Q: how to incorporate a preprocessor in the flow?

Martin d'Anjou point14 at magma.ca
Mon Apr 4 18:59:40 PDT 2011


Hi,

Thanks to both of you for sharing your approaches. Right now I am
pondering how to alter the sequence of tokens before they hit the
parser. Intuitively, I want three processing units (lexer,
preprocessor, parser) connected through I/O pipes of tokens
(e.g. token FIFOs), but this is not how ANTLR was architected (it's how
I would have done it in hardware, though!).

Martin
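
To make the lexer -> preprocessor -> parser idea concrete, here is a minimal sketch of such a pipeline as a chain of pull-based token sources. It is self-contained and does not use the ANTLR runtime: Tok, TokSource, SimpleLexer, Preprocessor and PipelineDemo are hypothetical names invented for this illustration, the lexer is a toy that splits on whitespace, and only argument-less macros are handled, with no error handling for malformed `define lines.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token: a type tag ("DEFINE", "MACRO", "WORD", "NL", "EOF") plus text.
final class Tok {
    final String type;
    final String text;
    Tok(String type, String text) { this.type = type; this.text = text; }
}

// One stage of the pipeline; each call pulls the next token, like reading a FIFO.
interface TokSource {
    Tok next(); // keeps returning an EOF token once the input is exhausted
}

// Toy lexer: whitespace-separated words, one NL token per input line.
final class SimpleLexer implements TokSource {
    private final Deque<Tok> queue = new ArrayDeque<>();

    SimpleLexer(String input) {
        for (String line : input.split("\n")) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) continue;
                if (word.equals("`define")) queue.add(new Tok("DEFINE", word));
                else if (word.startsWith("`")) queue.add(new Tok("MACRO", word.substring(1)));
                else queue.add(new Tok("WORD", word));
            }
            queue.add(new Tok("NL", "\n"));
        }
    }

    @Override public Tok next() {
        Tok t = queue.poll();
        return t != null ? t : new Tok("EOF", "");
    }
}

// Sits between lexer and parser: records `define bodies, expands macro uses.
// No recursion guard: a self-referencing macro would loop forever.
final class Preprocessor implements TokSource {
    private final TokSource in;
    private final Map<String, List<Tok>> macros = new HashMap<>();
    private final Deque<Tok> pending = new ArrayDeque<>(); // expansion awaiting rescan

    Preprocessor(TokSource in) { this.in = in; }

    private Tok pull() {
        Tok t = pending.poll();
        return t != null ? t : in.next();
    }

    @Override public Tok next() {
        while (true) {
            Tok t = pull();
            switch (t.type) {
                case "DEFINE": { // `define NAME body... up to the end of the line
                    Tok name = pull();
                    List<Tok> body = new ArrayList<>();
                    for (Tok b = pull(); !b.type.equals("NL") && !b.type.equals("EOF"); b = pull()) {
                        body.add(b);
                    }
                    macros.put(name.text, body);
                    break;
                }
                case "MACRO": { // push the body back so nested macros are expanded too
                    List<Tok> body = macros.get(t.text);
                    if (body == null) throw new IllegalStateException("undefined macro: " + t.text);
                    for (int i = body.size() - 1; i >= 0; i--) pending.addFirst(body.get(i));
                    break;
                }
                case "NL":
                    break; // line ends only matter for `define
                default:
                    return t; // WORD or EOF goes on to the parser
            }
        }
    }
}

// Stand-in for the parser: drains the final stage and collects token text.
final class PipelineDemo {
    static List<String> drain(TokSource src) {
        List<String> out = new ArrayList<>();
        for (Tok t = src.next(); !t.type.equals("EOF"); t = src.next()) out.add(t.text);
        return out;
    }

    public static void main(String[] args) {
        String input = "`define x 3\n`define y `x\nassign a = `y ;\n";
        System.out.println(drain(new Preprocessor(new SimpleLexer(input))));
    }
}
```

Because every stage implements the same interface, the consumer only ever sees the last stage; in a real ANTLR setup the middle stage would presumably be a custom token source placed between the generated lexer and the token stream.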


On 11-04-04 09:25 AM, Sam Harwell wrote:
> I used a hand-crafted implementation of TokenSource between the lexer and
> parser. In the preprocessor, whenever I manipulated a token I used a new
> token class derived from CommonToken (call it SubstitutedToken) which
> contained a linked list leading from the effective position in the stream
> (stored in CommonToken) all the way back to the original location (file and
> position) of the token definition. When a CommonToken substitution occurs,
> the linked list has one node containing the original source position where
> defined. Whenever a SubstitutedToken substitution occurs, a new node for the
> token's previous effective position is added to the linked list and that new
> head pointer is stored in the new token.
>
> `define x 3
> `define y `x
> `y
>
> In this case, token `y is eventually replaced with a SubstitutedToken which
> appears at (line 3, column 1, length 1, text "3") containing the following
> linked list:
>
> Line 3, column 1, length 2 (list head, the location where `y was substituted
> with `x)
> Line 2, column 11, length 2 (the location where `x was substituted with '3')
> Line 1, column 11, length 1 (the actual source location where the token '3'
> is defined)
>
> This list allows true relative ordering of all tokens in the processed
> source: when two tokens appear to be at the same location in the
> preprocessed stream, you simply compare the positions of the first node in
> the position list.
>
> Sam
>
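
The position list described in Sam's message can be sketched roughly as follows. This is only an illustration under stated assumptions, not Sam's implementation: it does not use the ANTLR runtime, PlainToken is a hypothetical stand-in for CommonToken, and SourcePos, substitute and describeHistory are invented names. The sketch records the substitution site itself as the list head, which is one way to arrive at the three-node list in the `define x / `define y example.

```java
import java.util.ArrayList;
import java.util.List;

// One node of the position history. All names here are hypothetical.
final class SourcePos {
    final int line, column, length;
    final SourcePos previous; // next node toward the original definition, or null

    SourcePos(int line, int column, int length, SourcePos previous) {
        this.line = line; this.column = column; this.length = length; this.previous = previous;
    }
}

// Stand-in for ANTLR's CommonToken: text plus its effective position.
class PlainToken {
    final String text;
    final int line, column;

    PlainToken(String text, int line, int column) {
        this.text = text; this.line = line; this.column = column;
    }
}

// A token produced by macro substitution; carries the chain of earlier positions.
final class SubstitutedToken extends PlainToken {
    final SourcePos history; // list head: the most recent substitution site

    SubstitutedToken(String text, int line, int column, SourcePos history) {
        super(text, line, column);
        this.history = history;
    }

    // Substitute `token` for a macro reference at (useLine, useColumn) spanning useLength chars.
    static SubstitutedToken substitute(PlainToken token, int useLine, int useColumn, int useLength) {
        SourcePos earlier = (token instanceof SubstitutedToken)
                ? ((SubstitutedToken) token).history
                : new SourcePos(token.line, token.column, token.text.length(), null);
        SourcePos head = new SourcePos(useLine, useColumn, useLength, earlier);
        return new SubstitutedToken(token.text, useLine, useColumn, head);
    }

    // Render the chain from the head back to the original definition.
    List<String> describeHistory() {
        List<String> out = new ArrayList<>();
        for (SourcePos p = history; p != null; p = p.previous) {
            out.add("line " + p.line + ", column " + p.column + ", length " + p.length);
        }
        return out;
    }
}

final class HistoryDemo {
    public static void main(String[] args) {
        PlainToken three = new PlainToken("3", 1, 11);                        // `define x 3
        SubstitutedToken viaX = SubstitutedToken.substitute(three, 2, 11, 2); // `define y `x
        SubstitutedToken viaY = SubstitutedToken.substitute(viaX, 3, 1, 2);   // `y
        for (String node : viaY.describeHistory()) System.out.println(node);
    }
}
```

Each substitution allocates one new head node and shares the rest of the chain with the token it replaced, so tokens expanded from the same macro do not copy their common history.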
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of A Z
> Sent: Monday, April 04, 2011 12:13 AM
> To: Martin d'Anjou
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Q: how to incorporate a preprocessor in the
> flow?
>
> Hi Martin,
>
>    I just completed an SV preprocessor which can parse UVM 1.0 successfully.
> After two revisions I settled on a completely separate preprocessor (lexer and
> parser). As you saw, you need to tokenize the macro_text in order to easily
> support macros with arguments and detect the three escaped tokens `", `\`"
> and ``. I'm not sure how well a lexer-only approach could handle cases where
> a macro substitution can merge text with a previously lexed token. The
> separate approach still has flaws, such as making good error reporting
> difficult. Of course I could be missing an obvious easy solution.
>
>
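
A small sketch of why a tokenized macro_text makes argument substitution easier: once the body is a list of tokens, a formal parameter is replaced only where a whole token matches its name. ArgMacro, expand and ArgMacroDemo are hypothetical names made up for this example; tokens are plain strings for brevity, and the escaped tokens `", `\`" and `` are not handled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical macro record: formal parameter names plus an already-tokenized body.
final class ArgMacro {
    final List<String> params;
    final List<String> body;

    ArgMacro(List<String> params, List<String> body) {
        this.params = params;
        this.body = body;
    }

    // Replace every body token that is exactly a formal name with the actual's tokens.
    List<String> expand(List<List<String>> actuals) {
        if (actuals.size() != params.size()) {
            throw new IllegalArgumentException("expected " + params.size() + " arguments");
        }
        List<String> out = new ArrayList<>();
        for (String tok : body) {
            int i = params.indexOf(tok);
            if (i >= 0) out.addAll(actuals.get(i));
            else out.add(tok);
        }
        return out;
    }
}

final class ArgMacroDemo {
    public static void main(String[] args) {
        // `define INC(a) data + a
        ArgMacro inc = new ArgMacro(Arrays.asList("a"), Arrays.asList("data", "+", "a"));
        List<List<String>> actuals = new ArrayList<>();
        actuals.add(Arrays.asList("x", "*", "2")); // `INC(x * 2)
        System.out.println(String.join(" ", inc.expand(actuals)));
        // Character-level replacement of "a" would also rewrite the a's inside "data".
        System.out.println("data + a".replace("a", "x"));
    }
}
```

The second println shows the failure mode of working on raw characters: the identifier data is corrupted, whereas the token-level expansion leaves it alone.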
>
> On Sun, Apr 3, 2011 at 9:51 PM, Martin d'Anjou <point14 at magma.ca> wrote:
>
>> Hello,
>>
>> I am trying to find a way to incorporate a preprocessor in the ANTLR
>> flow. I thought of doing this before the lexer, but I need to tokenize
>> the incoming char stream for macro substitution to be easy. I thought
>> of doing it between the lexer and the parser, replacing the
>> preprocessor tokens with their expansion before feeding the token
>> stream to the parser, so I guess I would end up using something like
>> the TokenRewriteStream? Can someone steer me in the right direction,
>> please? Or should I be using lexer rule actions? In that case, is there
>> any example of how to access the replacement token list of an
>> identifier? Too many questions, sorry.
>>
>> The language I am hoping to tokenize is SystemVerilog and has C-like
>> preprocessor macros (`include, `ifdef, `define NAME(params,...), token
>> concatenation, etc.).
>>
>> Regards,
>> Martin
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>
>
>


