[antlr-interest] Re: suggested ANTLR projects?

Thomas Brandon tom at psy.unsw.edu.au
Sun Aug 17 17:45:41 PDT 2003


I hadn't considered multiplexing, but unfortunately I don't think it
helps. With multiplexing, you need a token (well, a rule) to
tell it to kick over to the action lexer. But the problem is
distinguishing that token (rule) in the first place.
Plus, even if that did work you'd have the problem of hooking 
multiplexing into the Netbeans incremental lexer framework, and it 
would be harder (if anything) than hooking in a token stream filter 
(which is how I implemented the subtoken stuff). You'd need to store 
some sort of virtual PUSH_LEXER and POP_LEXER tokens and have the 
incremental lexer handle them when it relexes.
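
As an illustration of the subtoken idea only, here is a minimal sketch in Python rather than ANTLR/Java. The names Token, lex_options_start and flatten are hypothetical and do not come from the ANTLR or Netbeans APIs. The assumption is that one lexer rule matches "options", optional whitespace and "{" as a single composite token (so an incremental lexer never restarts in the middle of it), and that a filter later unpacks the recorded subtokens for the parser.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str
    text: str
    subtokens: tuple = ()  # parts recorded inside a composite token


# One rule matches the whole start of an options block.
OPTIONS_START = re.compile(r"(options)(\s*)(\{)")


def lex_options_start(source: str) -> Token:
    """Lex 'options {' as ONE token that remembers its parts."""
    m = OPTIONS_START.match(source)
    if m is None:
        raise ValueError("not the start of an options block")
    parts = [Token("LITERAL_options", m.group(1))]
    if m.group(2):
        parts.append(Token("WS", m.group(2)))
    parts.append(Token("LCURLY", m.group(3)))
    return Token("OPTIONS_BLOCK", m.group(0), tuple(parts))


def flatten(tokens):
    """Token stream filter: replace composite tokens by their subtokens."""
    for tok in tokens:
        if tok.subtokens:
            yield from tok.subtokens
        else:
            yield tok
```

The relexing machinery would only ever see OPTIONS_BLOCK as a unit, while anything downstream of flatten sees LITERAL_options and LCURLY separately.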
Multiplexing is basically what Antlr does now. OK, it packs all the
action stuff into a single token, but then it uses the action lexer
to handle that, so it's not as clean an implementation, but it's the
same basic idea.
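
A rough sketch of that "pack it into one token, then hand it to a second lexer" shape, again in Python and purely illustrative. The function names match_action and lex_action_body are made up for this example, and the brace matching deliberately ignores curlies inside comments and string literals, which a real grammar lexer has to handle.

```python
import re


def match_action(source: str, start: int) -> str:
    """First pass: swallow a brace-delimited action as one opaque token."""
    if source[start] != "{":
        raise ValueError("expected '{'")
    depth = 0
    for i in range(start, len(source)):
        ch = source[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return source[start:i + 1]
    raise ValueError("unterminated action")


# Whitespace runs, word runs, or any single other character.
ACTION_PIECE = re.compile(r"\s+|\w+|.", re.DOTALL)


def lex_action_body(action: str) -> list:
    """Second pass: a separate, much cruder lexer run over the action text."""
    body = action[1:-1]  # strip the outer curlies
    return [p for p in ACTION_PIECE.findall(body) if not p.isspace()]
```

The first lexer never looks inside the action; only the second one does, which is the same division of labour as multiplexing, just sequenced rather than interleaved.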

Tom.
--- In antlr-interest at yahoogroups.com, "bogdan_mt" <bogdan_mt at y...> 
wrote:
> > As Marco says one way to solve it is to use state variables but this
> > doesn't work in incremental lexing (at least in the netbeans
> > implementation), you need some notion of non-restartable tokens so
> > the state is properly updated, e.g. when you change "options"
> > to "optios" it needs to relex the following tokens (left to right),
> > to pick up what is now an action, when you delete the curly it
> > needs ...
> 
> This will work, but you are reinventing the wheel. ANTLR has a 
> better solution for this: lexer multiplexing. In fact, the problem 
> was that the option specification is an embedded language, with a 
> different grammar. The "right" solution is to write two lexers that 
> call one another when appropriate. Read the documentation and the 
> examples from ANTLR distribution for more details. Ter was probably 
> too busy and used a quick hack.
> 
> BTW, porting the Netbeans approach in ANTLR might not be a good
> idea. They wanted something very general, that works with any parser
> generator, and had to reimplement many features that ANTLR already
> had.
> 
> Best regards,
> Bogdan
> 
> 
> --- In antlr-interest at yahoogroups.com, "tbrandonau" <tom at p...> 
wrote:
> > Ter was right, there was a good reason. Basically options section,
> > tokens section and actions are horribly ambiguous, partly due to the
> > opacity of actions. The rules are:
> > OPTIONS: "options" (WS|COMMENT)* LCURLY; // Same for tokens
> > ACTION: LCURLY (.*) RCURLY; // With extra stuff to handle RCURLY in
> > comment\string literal etc.
> > So, if you see a LCURLY it's really hard to know what to do. Is it an
> > action where you want to swallow everything pretty indiscriminately
> > or the start of a tokens\options block where you can actually parse
> > what's inside?
> > The solution used in Antlr is to match "options" (WS|COMMENT)*
> > LCURLY in RULEDEF (lowercase starting identifiers).
> > 
> > As Marco says one way to solve it is to use state variables but this
> > doesn't work in incremental lexing (at least in the netbeans
> > implementation), you need some notion of non-restartable tokens so
> > the state is properly updated, e.g. when you change "options"
> > to "optios" it needs to relex the following tokens (left to right),
> > to pick up what is now an action, when you delete the curly it needs
> > to re-lex "options" as a ruleDef not an OPTIONS_BLOCK (left to right)
> > etc. So, what you really need to do is recognise it as a single block
> > and record 'subtokens' for the various parts. That way the re-lexing
> > stuff treats it as one token but you can pull the various parts out.
> > Hence you want a way to return multiple tokens from a single rule. Or
> > you can make a custom token class to store subtokens, but then you
> > have a problem hooking into the incremental lexer. After lexing you
> > need to unpack the subtokens for subsequent stuff and then repack
> > them back up for the incremental lexer, meaning you need to hook into
> > the lexer. I managed to hack the Netbeans lexer to support
> > non-restartable tokens and that kinda worked. There was some problem
> > in there (incremental and batch lexing was slightly different in a
> > few cases) but seemed to get the right stuff.
> > 
> > Ideally you might try and leave it to the parser, but the opacity of
> > actions makes that not possible, there can be stuff in an action that
> > is not lexable (unless you made a new Antlr lexer for every action
> > language).
> > 
> > Tom.
> > --- In antlr-interest at yahoogroups.com, Marco Ladermann 
> > <ladermann at h...> wrote:
> > > On Wednesday, 13 August 2003 04:57, Brian Smith wrote:
> > > > tbrandonau wrote:
> > > > > Ensemble section). In fact the Netbeans support could be improved
> > > > > upon, incremental lexing gains from having a way to in effect return
> > > > > multiple tokens at a time, to tell the incremental lexer not to try
> > > > > and resume in the middle of a token (e.g. in Antlr you want to
> > > > > return "options {" as two tokens: LITERAL_options and LCURLY but you
> > > > > want to lex it in a single rule) so either non-restartable tokens or
> > > >
> > > > Please explain why "options {" is better lexed as a single rule? I
> > > > noticed this kind of thing in ANTLR's antlr.g grammar and I simply could
> > > > not understand why the grammar was written like that. I feel I must be
> > > > overlooking something.
> > > 
> > > I'm just playing around with what Tom suggests - an ANTLR-Netbeans
> > > module - and my first step was to transform the antlr.g into a tree
> > > grammar. The matching of "options {" ("tokens {") as one token was
> > > indeed a problem. The rationale behind this, I think, is that there
> > > is a need to distinguish action code from the options/tokens
> > > name-value pairs. My solution was to introduce a state variable and
> > > semantic predicates to make the decision. This also allows
> > > recognizing the comments between "options" and "{", which are simply
> > > ignored in the original antlr.g.
> > > 
> > > Marco


 
