[antlr-interest] greedy vs nongreedy lexer rules

Cliff Hudson cliff.s.hudson at gmail.com
Sun Apr 18 16:18:48 PDT 2010


Hmm, it seems to me that there should be a way to record the set of
recursive actions and appropriate pointers into the lexed string such that
you can replicate the logical state of the DFA when you execute the actions
once an alternative is definitely selected.  Is there more to the state of
the system at any given possible action point than a pointer to the start of
the current substring, its length, and maybe a pointer to the already
matched token stream?  I am possibly out of my depth here on understanding
how the lexing system really works.  Or I have not adequately explained my
idea :)
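
To make the idea a bit more concrete, here is a rough sketch of what I have
in mind, written in Java rather than the C#-style code further down. It is
not how ANTLR works today. State, matchFoo and fooAlt1 are hypothetical names
I made up for illustration, and the matcher is hand-written for the single
rule 'a'* 'bcd'. The point is only that matching records (start, length)
snapshots at each action point, and the actions run later, once the
alternative is known to have matched.

```java
import java.util.ArrayList;
import java.util.List;

class DeferredActions {
    /** Hypothetical snapshot of the lexer's logical state at one action point. */
    static final class State {
        final String input; // the full input being lexed
        final int start;    // where the current token begins
        final int length;   // how many characters were matched so far

        State(String input, int start, int length) {
            this.input = input;
            this.start = start;
            this.length = length;
        }

        /** Equivalent of $text as seen at this action point. */
        String text() {
            return input.substring(start, start + length);
        }
    }

    /**
     * Hand-written matcher for: 'a'* 'bcd'.
     * It runs no user actions; it only records a snapshot at each action point.
     * Returns the snapshots on success, or null if the input does not match.
     */
    static List<State> matchFoo(String input, int pos) {
        List<State> states = new ArrayList<>();
        int p = pos;
        while (p < input.length() && input.charAt(p) == 'a') {
            p++;
        }
        states.add(new State(input, pos, p - pos)); // action point after 'a'*
        if (!input.startsWith("bcd", p)) {
            return null; // alternative rejected, so no action ever runs
        }
        p += 3;
        states.add(new State(input, pos, p - pos)); // action point after 'bcd'
        return states;
    }

    /** All actions of the alternative fused into one method, replayed from the snapshots. */
    static String fooAlt1(List<State> states) {
        int n = 4;                              // first action
        n += states.get(0).text().length();     // second action, sees text after 'a'*
        return n + ": " + states.get(1).text(); // third action, sees the whole token
    }

    public static void main(String[] args) {
        List<State> states = matchFoo("aaabcd", 0);
        if (states != null) {
            System.out.println(fooAlt1(states));
        }
    }
}
```

Since the actions only run after the match is settled, a rejected
alternative never touches user state. The snapshots only cover the lexer's
own position, though, not anything the user's actions would have changed
along the way.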

On Sun, Apr 18, 2010 at 4:08 PM, Terence Parr <parrt at cs.usfca.edu> wrote:

> Hi Cliff, thanks for the input.  I think it might be hard to record
> complete "state of the lexer" for each input position efficiently.  Users
> could, for example, update a large global data structure as they lexed.
>
> Hmm...yeah, I was trying this idea earlier, but we sort of need to
> formalize arguments to parser rules to handle predicates that get
> generated outside of the defining function (when I need to generate a
> cyclic DFA).  This happens for Java as it has no goto.  Might as well do
> locals too.
>
> T
>
> On Apr 18, 2010, at 4:04 PM, Cliff Hudson wrote:
>
> > With respect to local variables and actions in ambiguous sets of rules,
> > it seems to me that the entire rule alternative is the scope for all
> > actions which appear in it, so having an action which declares a
> > variable and then another action later in the alternative which
> > executes some code is really all one method.  What would need to be
> > dealt with is that the language target generator would need to be able
> > to take the state pulled from the DFA and insert that information into
> > the alternative's action sequence so that each action had access to the
> > logical state at the time it executes.
> >
> > For instance, in the rule:
> >
> > FOO: { int n=4; } 'a'* { n += $text.Length; } 'bcd'
> >      { Console.WriteLine("{0}: {1}", n, $text); } ;
> >
> > the alternative's action function would look like:
> >
> > void foo_alt1(State[] states)
> > {
> >     int n=4;
> >     // states[0]: snapshot taken after 'a'* was matched
> >     n += states[0].Text.Length;
> >     // states[1]: snapshot taken after 'bcd' was matched
> >     Console.WriteLine("{0}: {1}", n, states[1].Text);
> > }
> >
> > The State[] is an output from the DFA.  Ambiguity then doesn't have any
> > effect on your ability to execute actions, but language targets would
> > need to be rewritten.
>
>

