[antlr-interest] ANTLR 2.7.7 - prepend to InputBuffer

Alex Marin alex.marin at amiq.ro
Wed Aug 18 10:39:18 PDT 2010


Hi Jim,

You are right, this would be the straightforward way to handle the
problem in its entirety. However, we would rather not rewrite the
preprocessing part of the lexer, as it works fine except for the cases
I've mentioned. Right now we hope that implementing an InputBuffer that
allows its input to be modified during lexing will have less impact
on what we already have.
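
To make this concrete, here is a minimal sketch of the kind of input
source we have in mind. It is plain java.io and deliberately does not
extend ANTLR's InputBuffer or CharQueue; the class name PrependableReader
and its prepend() method are made up for illustration. It assumes the
lexer pulls its characters from a Reader one at a time. Characters the
lexer has already buffered as lookahead would still be delivered before
the prepended text, so a real integration would have to account for that.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical Reader whose unread input can be extended at the front. */
class PrependableReader extends Reader {
    // The head of the deque is the source currently being read.
    private final Deque<Reader> sources = new ArrayDeque<>();

    PrependableReader(Reader main) {
        sources.push(main);
    }

    /** Text passed here is returned before anything that is still unread. */
    void prepend(String text) {
        sources.push(new StringReader(text));
    }

    @Override
    public int read() throws IOException {
        while (!sources.isEmpty()) {
            int c = sources.peek().read();
            if (c != -1) {
                return c;
            }
            sources.pop().close(); // exhausted: fall back to the source below
        }
        return -1;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        // Hand out one character per call so that nothing is read past a
        // point where a later prepend() should take effect.
        if (len == 0) {
            return 0;
        }
        int c = read();
        if (c == -1) {
            return -1;
        }
        buf[off] = (char) c;
        return 1;
    }

    @Override
    public void close() throws IOException {
        while (!sources.isEmpty()) {
            sources.pop().close();
        }
    }
}
```

With ,2) still unread, calling prepend("`a(1") makes the reader deliver
`a(1,2) as one continuous stream.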

As for the language users - in my opinion they will always _use_ the 
language in _all_ possible ways, no matter how unreadable or twisted 
this would render the code. If anyone could ever change this, I think 
it's the language designers, not the parser implementers :)

Thanks for your suggestion anyway!
Alex

On 08/18/2010 06:29 PM, Jim Idle wrote:
> The better solution is to create a separate pre-processor stage. You run the
> pre-processor and it feeds its output into the 'real' lexer. This is a lot
> easier to maintain, and by the time you have messed around trying to create
> secondary lexers, include stacks and so on for other solutions, this ends up
> being faster. Also, having a much simpler lexer that just handles #if,
> #define and macro expansion makes things much easier (other than macros in
> macros).
>
> Now, if your environment is known, then you could just implement this in m4,
> which is available just about everywhere.
>
> http://en.wikipedia.org/wiki/Preprocessor
> http://en.wikipedia.org/wiki/M4_(computer_language)
> http://www.gnu.org/software/m4/
>
> You could also just run the C pre-processor and define your language as
> using the C macro engine.
>
> And further, if you can do without macros altogether, you can stop the
> language users creating unreadable, undebuggable code because they thought
> macros were cool and clever and used them nested to 23 levels ;-)
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Alex Marin
>> Sent: Wednesday, August 18, 2010 8:00 AM
>> To: antlr-interest at antlr.org
>> Cc: etools at amiq.ro
>> Subject: [antlr-interest] ANTLR 2.7.7 - prepend to InputBuffer
>>
>> Hello everyone,
>>
>> Is it possible in ANTLR 2.7.7 to add characters on the fly to a lexer's
>> InputBuffer? Do you have any suggestion/guideline/experience on how to
>> do this? (without re-implementing the InputBuffer/CharQueue classes)
>>
>> I will explain the reason behind the need to do this. The language we are
>> lexing allows pre-processing, by using C-like defines (or macros).
>>
>> // Define foo macro
>> `define foo this is the replacement of foo
>>
>> // Use foo macro
>> `foo // this expands to 'this is the replacement of foo'
>>
>> // Define moo macro with a parameter
>> `define moo(x) x+2
>>
>> // Use moo macro
>> `moo(1) // this expands to '1+2'
>>
>> At the moment, when our lexer encounters a macro, it starts another lexer
>> which lexes the replacement. E.g:
>>
>> ...
>> `define foo this is the replacement of foo
>> ...
>> `foo // Here another lexer is started and provided the input stream
>>      // 'this is the replacement of foo'
>> ...
>>
>> This approach works very well for the time being, except for the situations
>> when the replacement text is not a fully "lexable" piece of code, like in
>> the example below:
>>
>> ...
>> `define a(x,y) x+y // Define macro a with parameters x and y
>> `define b `a(1  // Define macro b without parameters
>> ...
>> `b,2) // Use macro b; this should expand to `a(1,2) and therefore to 1+2
>> ...
>>
>> Now, the problem is that the lexer we start for `b will lex '`a(1', which
>> is not lexically correct, and will fail with an error.
>>
>> The best solution would be to be able to insert the replacement of `b into
>> the main lexer's buffer and continue lexing normally. Of course, no new
>> lexer would be needed then. So, after matching `b, the buffer would look
>> like `a(1,2)
>>
>> Thanks in advance,
>> Alex
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
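
For the `b,2) case from the original message above, the effect we are
after can be sketched independently of ANTLR. The code below is an
illustration under several assumptions, not our lexer: RescanDemo, Macro,
expand() and the helper names are hypothetical, macros are supplied in a
map instead of being parsed from `define lines, comments and string
literals are ignored, and there is no guard against self-referential
macros. The only point is that the replacement text is pushed back in
front of the unread input and rescanned together with it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: expand `name macros by prepending replacements to the input. */
final class RescanDemo {
    /** Hypothetical macro record: parameter names (may be empty) and body text. */
    static final class Macro {
        final List<String> params;
        final String body;
        Macro(List<String> params, String body) {
            this.params = params;
            this.body = body;
        }
    }

    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    static String expand(String input, Map<String, Macro> macros) {
        Deque<Character> pending = new ArrayDeque<>();
        for (char c : input.toCharArray()) {
            pending.addLast(c);
        }
        StringBuilder out = new StringBuilder();
        while (!pending.isEmpty()) {
            char c = pending.removeFirst();
            if (c != '`') {
                out.append(c);
                continue;
            }
            StringBuilder name = new StringBuilder();
            while (!pending.isEmpty() && isWordChar(pending.peekFirst())) {
                name.append(pending.removeFirst());
            }
            Macro m = macros.get(name.toString());
            if (m == null) { // unknown macro: pass it through unchanged
                out.append('`').append(name);
                continue;
            }
            String text = m.params.isEmpty() ? m.body : substitute(m, readArgs(pending));
            // Prepend the replacement so it is rescanned together with what follows.
            for (int i = text.length() - 1; i >= 0; i--) {
                pending.addFirst(text.charAt(i));
            }
        }
        return out.toString();
    }

    /** Reads "(arg, arg, ...)" from the pending input, splitting at top-level commas. */
    static List<String> readArgs(Deque<Character> in) {
        if (in.isEmpty() || in.removeFirst() != '(') {
            throw new IllegalStateException("expected '(' after macro name");
        }
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0;
        while (!in.isEmpty()) {
            char c = in.removeFirst();
            if (depth == 0 && (c == ')' || c == ',')) {
                args.add(cur.toString().trim());
                cur.setLength(0);
                if (c == ')') {
                    return args;
                }
                continue;
            }
            if (c == '(') depth++;
            if (c == ')') depth--;
            cur.append(c);
        }
        throw new IllegalStateException("unterminated macro argument list");
    }

    /** Replaces whole-word occurrences of parameter names in the body with the arguments. */
    static String substitute(Macro m, List<String> args) {
        if (args.size() != m.params.size()) {
            throw new IllegalStateException("wrong number of macro arguments");
        }
        Matcher words = Pattern.compile("\\w+").matcher(m.body);
        StringBuffer out = new StringBuffer();
        while (words.find()) {
            int idx = m.params.indexOf(words.group());
            String repl = idx >= 0 ? args.get(idx) : words.group();
            words.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        words.appendTail(out);
        return out.toString();
    }
}
```

With a defined as (x,y) -> x+y and b defined as `a(1, expanding `b,2)
first replaces `b by `a(1, which joins the unread ,2) to form `a(1,2);
that is then replaced by 1+2.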

