[antlr-interest] ANTLR 2.7.7 - prepend to InputBuffer

Jim Idle jimi at temporal-wave.com
Wed Aug 18 11:14:32 PDT 2010


Then yes, I suggest that you implement a custom input stream that allows you
to push input into a FIFO, which is always read before the supplied input
stream. Of course you will need a stack of these because of nested macros
and so on. You would need to override LA and related methods, but it should
not be a big deal.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Alex Marin
> Sent: Wednesday, August 18, 2010 10:39 AM
> To: antlr-interest at antlr.org
> Cc: etools at amiq.ro
> Subject: Re: [antlr-interest] ANTLR 2.7.7 - prepend to InputBuffer
> 
> Hi Jim,
> 
> You are right, this would be the straightforward way to handle the problem
> in its entirety. However, we wouldn't want to rewrite the preprocessing
> part of the lexer, as it works fine except for the cases I've mentioned.
> Right now we hope that implementing an InputBuffer whose input can be
> modified during lexing would have less impact on what we already have.
> 
> As for the language users - in my opinion they will always _use_ the
> language in _all_ possible ways, no matter how unreadable or twisted this
> would render the code. If anyone could ever change this, I think it's the
> language designers, not the parser implementers :)
> 
> Thanks for your suggestion anyway!
> Alex
> 
> On 08/18/2010 06:29 PM, Jim Idle wrote:
> > The better solution is to create a separate pre-processor stage. You
> > run the pre-processor and it feeds its output into the 'real' lexer.
> > This is a lot easier to maintain, and by the time you have messed around
> > trying to create secondary lexers, include stacks and so on for other
> > solutions, this ends up being faster. Also, the ability to
> > have a much simpler lexer that is just handling #if, #define and macro
> > expansion makes things much easier (other than macros in macros).
> >
> > Now, if your environment is known, then you could just implement this
> > in m4, which is available just about everywhere.
> >
> > http://en.wikipedia.org/wiki/Preprocessor
> > http://en.wikipedia.org/wiki/M4_(computer_language)
> > http://www.gnu.org/software/m4/
> >
> > You could also just run the C pre-processor and define your language
> > as using the C macro engine.
> >
> > And further, if you can do without macros altogether, you can stop the
> > language users creating unreadable, undebuggable code because they
> > thought macros were cool and clever and used them nested to 23 levels
> > ;-)
> >
> > Jim
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >> bounces at antlr.org] On Behalf Of Alex Marin
> >> Sent: Wednesday, August 18, 2010 8:00 AM
> >> To: antlr-interest at antlr.org
> >> Cc: etools at amiq.ro
> >> Subject: [antlr-interest] ANTLR 2.7.7 - prepend to InputBuffer
> >>
> >> Hello everyone,
> >>
> >> Is it possible in ANTLR 2.7.7 to add characters on the fly to a
> >> lexer's InputBuffer? Do you have any suggestion/guideline/experience
> >> on how to do this? (without re-implementing the InputBuffer/CharQueue
> >> classes)
> >>
> >> I will explain the reason behind the need to do this. The language we
> >> are lexing allows pre-processing, by using C-like defines (or macros).
> >>
> >> // Define foo macro
> >> `define foo this is the replacement of foo
> >>
> >> // Use foo macro
> >> `foo // this expands to 'this is the replacement of foo'
> >>
> >> // Define moo macro with a parameter
> >> `define moo(x) x+2
> >>
> >> // Use moo macro
> >> `moo(1) // this expands to '1+2'
> >>
> >> At the moment, when our lexer encounters a macro, it starts another
> >> lexer which lexes the replacement. E.g.:
> >>
> >> ...
> >> `define foo this is the replacement of foo
> >> ...
> >> `foo // Here another lexer is started and provided the input stream
> >>      // 'this is the replacement of foo'
> >> ...
> >>
> >> This approach works very well for the time being, except for the
> >> situations when the replacement text is not a fully "lexable" piece
> >> of code, like in the example below:
> >>
> >> ...
> >> `define a(x,y) x+y // Define macro a with parameters x and y
> >> `define b `a(1     // Define macro b without parameters
> >> ...
> >> `b,2) // Use macro b; this should expand to `a(1,2) and therefore to 1+2
> >> ...
> >
> >> Now, the problem is that the lexer we start for `b will lex '`a(1',
> >> which is not lexically correct, and will fail with an error.
> >>
> >> The best solution would be to be able to insert in the main lexer's
> >> buffer the replacement of `b and continue lexing normally. Of course,
> >> no new lexer would be needed then. So, after matching `b, the buffer
> >> would look like
> >> `a(1,2)
> >>
> >> Thanks in advance,
> >> Alex
> >>
> >>
> >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> Unsubscribe:
> >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >>
> >
> >
> >
> 


