[antlr-interest] Re: Newbie needing parser help

lgcraymer lgc at mail1.jpl.nasa.gov
Mon Apr 26 13:26:26 PDT 2004


--- In antlr-interest at yahoogroups.com, "craigbarker1" <craigbarker1 at y...> wrote:
> Is there an easy way to make the parser think that it's been sent a 
> quoted string by inserting the " token into the token stream if its 

You can rewrite the token text in an action at the end of the rule to add quotes at the front and back of the string.

> not the next one? I suppose this also causes the problem of how to 
> position the closing ". Effectively nothing between the commas is 
> significant but if I try something along the lines of (~(COMMA|NL))* 

That happens when either COMMA or NL is not "protected". Use the character literals instead of the rule names to get rid of the nondeterminism warnings.
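
As a rough sketch of how the pieces fit together, here is the same idea written as a small hand-rolled Java scanner rather than ANTLR-generated code. It only illustrates the control flow: a state flag set when a text-taking command is seen, matching on the raw ',' and '\n' characters, and wrapping the matched text in quotes. The class name, the WORD/COMMATEXT/COMMA/NL labels and the one-entry keyword set are all made up for the example; the keyword set stands in for the hash table lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical hand-written scanner, not ANTLR output.
final class CommaTextSketch {
    // Assumed: commands whose parameters are free text (stand-in for the hash table).
    private static final Set<String> TEXT_COMMANDS = Set.of("PAGES");

    // Returns tokens as "TYPE:text" strings (or a bare type for COMMA and NL).
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean commaText = false; // lexer state flag
        int i = 0;
        int n = input.length();
        while (i < n) {
            char c = input.charAt(i);
            if (c == ',') {
                i++; // consume the comma
                int start = i;
                if (commaText) {
                    // Match on characters: everything up to the next ',' or newline.
                    while (i < n && input.charAt(i) != ',' && input.charAt(i) != '\n') {
                        i++;
                    }
                }
                if (i > start) {
                    // Drop the comma and add quotes at the front and back of the text.
                    tokens.add("COMMATEXT:\"" + input.substring(start, i) + "\"");
                } else {
                    tokens.add("COMMA"); // plain comma alternative
                }
            } else if (c == '\n') {
                commaText = false; // the free-text state ends with the line
                tokens.add("NL");
                i++;
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < n && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                String word = input.substring(start, i);
                if (TEXT_COMMANDS.contains(word)) {
                    commaText = true; // following parameters are raw text
                }
                tokens.add("WORD:" + word);
            } else {
                i++; // skip whitespace and anything else in this sketch
            }
        }
        return tokens;
    }
}
```

Calling CommaTextSketch.tokenize on the PAGES line from the original post gives one WORD token followed by three COMMATEXT tokens whose text is already quoted, so the parser rule could match "PAGES" (COMMATEXT)+ instead of looking for ID tokens. Resetting the flag at the newline is my own assumption about where the free-text region should end.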

--Loring

> I get lots of non-determinism.
> 
> Thanks for your help.
> 
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > Ugly problem.  What might make sense for this one is to make state
> > changes in the lexer and recognize strings in your COMMA rule.
> > 
> > That is:
> > 
> > ID :
> >    <character tokens>
> >    { hash table lookup; set commaText var if appropriate }
> >    ;
> > 
> > 
> > COMMA :
> >     { commaText }? ','! (~(',' | '\n'))+
> >         { _ttype = COMMATEXT; }
> >     |   ','
> >     ;
> > 
> > You can probably also do something with a token filter.
> > 
> > --Loring
> > 
> > --- In antlr-interest at yahoogroups.com, "craigbarker1"
> > <craigbarker1 at y...> wrote:
> > > Hi All,
> > > 
> > > I'm relatively new to all this language recognition stuff and have a
> > > question that I could really use a hand with. It's probably not that
> > > hard, it's more likely that I'm just missing something obvious.
> > > 
> > > The issue is that I'm trying to parse a language that allows
> > > unquoted strings to be passed as parameters to functions. There are
> > > no rules on what can go inside these unquoted strings - they can be
> > > the names of literals, functions or any random sequence of
> > > characters.
> > > 
> > > I've tried recognising a set of ID tokens (defined as per the Java
> > > grammar specification) but this is no good as I've got
> > > testLiterals=true; so anything that is a literal comes through from
> > > the lexer as a specific token type and therefore doesn't match
> > > against ID.
> > > 
> > > Here is an example of the type of thing i'm trying to match:
> > > 
> > > PAGES,Sale detail,Status changes,Sale costs
> > > 
> > > The issue lies with the fact that each of the parameters is REALLY
> > > a string, but in this bizarre language they don't have to be double
> > > quoted. The issue is further compounded by the fact that the word
> > > Status is really a function name and hence has a specific token type.
> > > 
> > > Here is a snippet of the grammar I've done so far to deal with
> > > this:
> > > 
> > > designerCommand
> > > //Commands to the designer
> > > 	:	"SIZE" COMMA NUM_INT COMMA NUM_INT
> > > 	|	"PAGES" COMMA textParameter (COMMA textParameter)*
> > > 	;
> > > 
> > > textParameter
> > > 	:	(ID)*
> > > 	| 	STRING_LITERAL
> > > 	;
> > > 
> > > Please let me know if you can provide any advice at all or even
> > > point me to a relevant article somewhere.
> > > 
> > > Many thanks in advance,
> > > 
> > > Craig



 