[antlr-interest] Re: Match until "<@"

Tue May 14 18:27:12 PDT 2002

Hi,

I agree our grammar is context sensitive. After going through the 
examples provided, my impression is that a parser generator works 
well when the boundaries between tokens are clearly defined (for 
example, a block of code that begins with '{' and ends with '}'). 
Otherwise the grammar will have to use predicates, right? In my case 
these are special tokens that we use in HTML templates, somewhat 
like JSP tags.

The only tags that I can have in my HTML templates are below:

Token 1: (not exactly a token, but for convenience...)
--------------------------
<@loop identifierOrKeyword 1234>
some text here....
</loop>
--------------------------

Token 2:
--------------------------
<@somesimpletexthere>
--------------------------

Token 3:
--------------------------
everything that's not part of token 1 and token 2
--------------------------

Token 1 and Token 2 can occur in any order within a given HTML 
template. Note that token 3 is nothing but the remaining text i.e. 
HTML code.
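
To make the three cases concrete, here is a minimal sketch in Python 
(not ANTLR) of the split described above: everything between 
recognized tags is emitted as one text token. The token names, and 
the assumption that tag names are word characters and the loop 
argument is a decimal number, are illustrative guesses and not part 
of the real template format.

```python
import re

# Hypothetical token names; the post only calls them Token 1, 2 and 3.
LOOP_OPEN, LOOP_CLOSE, SIMPLE_TAG, TEXT = "LOOP_OPEN", "LOOP_CLOSE", "SIMPLE_TAG", "TEXT"

# Assumed tag shapes: names are \w+ and the loop argument is digits.
TAG = re.compile(
    r"<@loop\s+(?P<name>\w+)\s+(?P<count>\d+)>"  # <@loop identifierOrKeyword 1234>
    r"|(?P<close></loop>)"                       # </loop>
    r"|<@(?P<simple>\w+)>"                       # <@somesimpletexthere>
)

def tokenize(template):
    """Split a template into tag tokens and the text between them."""
    tokens = []
    pos = 0
    for m in TAG.finditer(template):
        # Anything between the previous tag and this one is Token 3.
        if m.start() > pos:
            tokens.append((TEXT, template[pos:m.start()]))
        if m.group("name") is not None:
            tokens.append((LOOP_OPEN, (m.group("name"), int(m.group("count")))))
        elif m.group("close") is not None:
            tokens.append((LOOP_CLOSE, None))
        else:
            tokens.append((SIMPLE_TAG, m.group("simple")))
        pos = m.end()
    # Trailing text after the last tag is also Token 3.
    if pos < len(template):
        tokens.append((TEXT, template[pos:]))
    return tokens
```

Here the "<@" and "</loop>" markers act as the trigger for leaving 
plain-text mode, so Token 3 never needs its own pattern. Matching 
each "<@loop ...>" with its "</loop>" would be left to a later 
parsing step.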

Not sure if that makes it easy....

And I do not have the luxury of changing this grammar, as it is 
already being used in several apps. I wish my grammar were well 
defined like a language....

I made some progress on Token 3 after reading your mail, but a lot 
more still needs to be done.

Thanks,
Praveen.

--- In antlr-interest at y..., Terence Parr <parrt at j...> wrote:
> 
> On Sunday, May 12, 2002, at 04:26  PM, praveen_c wrote:
> 
> > Terence,
> >
> > Thanks for the reply.
> >
> > In your previous mail you mentioned that this kind of problem is
> > hard for a lexer. Suppose I have tokens T1, T2 and T3 that my
> > lexer can recognize. I want all the REMAINING text that is not part
> > of T1, T2 and T3 to be another token T4 (assume that T1, T2 and T3
> > can occur in any order ). Is there an easy way to specify this in
> > the grammar? I think I can come up with a long semantic predicate
> > that is the negation of T1, T2 and T3. But like you said that doesn't
> > sound very clean.
> 
> Hmm...seems like maybe your grammar, (T1|T2|T3), would have an action
> after that switched the lexer to see the T4 lexer...Hmm...is it a
> sequence of T1..T3?  If so that is harder...you will need a trigger in
> your lexer or parser somehow to know when to switch.  Typically the
> language is the problem when you can't figure out what the "trigger"
> is.  I.e., your language is highly context sensitive or something.
> 
> > So here is my question: is this a limitation of Parser Generators,
> > or is this specific to ANTLR? Am I better off using another Parser
> > Generator like JavaCC?
> 
> As I say, you might take a look at the language spec.  Is this something
> that you folks wrote?  Can you change it?
> 
> > Before somebody flames at my post, please note that I'm just learning
> > about Parser Generators and I do not have a CS degree.
> 
> No flame wars here. :)  We are a cuddly sort of mailing list :) ;)
> 
> Ter
