[antlr-interest] Starting my first parser...

David Holroyd dave at badgers-in-foil.co.uk
Fri Dec 22 02:30:23 PST 2006


On Fri, Dec 22, 2006 at 05:48:15PM +1100, Mark Mandel wrote:
> To start with, I'm just wanting to parse a most basic simple HQL like
> string that looks like:
> 
> select emails.Email.emailid from emails.Email where
> emails.Email.emailName = :emailName
[...]
> The way I'm seeing it (and I think this sounds right), the rule for
> emails.Email.emailid is something like:
> 
> identifier : a..z ( a..z | 0..9 | '_' | '.' )*
> 
> Which basically means, anything that is alpha numeric, has a . or a _ in it.

Consider making '.' a separate token from IDENTIFIER, so that you can
easily allow whitespace between a '.' and an identifier-part.
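
To illustrate the idea outside of ANTLR, here is a small hand-written
Java sketch. It is not generated lexer code, and the class name, method
name and token labels are all made up for the example. It emits DOT as
its own token and drops whitespace between tokens:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of ANTLR or the HQL grammar.
final class DotTokenizer {

    /** Splits the input into IDENT, DOT and OTHER tokens, discarding whitespace. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                              // whitespace never reaches the parser
            } else if (c == '.') {
                tokens.add("DOT");                // '.' is a token in its own right
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                // identifier-part: letters, digits and '_' only, no '.'
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i))
                            || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add("IDENT(" + input.substring(start, i) + ")");
            } else {
                tokens.add("OTHER(" + c + ")");   // anything else, e.g. ':' or '='
                i++;
            }
        }
        return tokens;
    }
}
```

With this split, "emails.Email" and "emails . Email" produce the same
token sequence, and the dotted path becomes a parser rule along the
lines of IDENT ( DOT IDENT )* rather than a single lexer rule.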

> Where I get confused is, a Lexer definition like:
> 
> select: 'select'
> 
> Could possibly be a identifier when parsing... as the string 'select'
> also fits the definition of identifier, so how can I tell ANTLR to
> check for that, short of setting k=7 or something similar? or is that
> the only way?
> 
> I've looked at similar SQL grammars, and the HQL grammar, and they
> seem quite happy with a k=2, but I don't get how they have managed to
> do that.  I feel like I'm missing a penny dropping moment.

The HQL grammar uses an action to call a weakKeywords() method, defined
in a Parser subclass; here, I think:

http://anonsvn.jboss.org/repos/hibernate/trunk/Hibernate3/src/org/hibernate/hql/ast/HqlParser.java

This appears to do the magic of transmuting keywords into identifiers
where allowed.  Also, the HQL lexer has

  options { testLiterals=true; }

in the definition of its IDENT token,

  http://anonsvn.jboss.org/repos/hibernate/trunk/Hibernate3/grammar/hql.g
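
Roughly, testLiterals means the lexer matches the text as an IDENT
first and only then looks it up in a table of literals, changing the
token type if it finds a match, so no extra lookahead is needed to
tell 'select' from an ordinary identifier. The Java below is a
hand-written sketch of that idea plus a much-simplified "weak keyword"
step. It is not the Hibernate code, and the class, method and constant
names are invented for the example:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real HqlParser or ANTLR-generated code.
final class KeywordLookup {
    static final int IDENT = 1;
    static final int SELECT = 2;
    static final int FROM = 3;
    static final int WHERE = 4;

    // Stand-in for the literals table the lexer consults.
    private static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("select", SELECT);
        LITERALS.put("from", FROM);
        LITERALS.put("where", WHERE);
    }

    /** Lexer side: the text already matched as an identifier; retype it if it is a keyword. */
    static int classify(String text) {
        Integer type = LITERALS.get(text);
        return type != null ? type : IDENT;
    }

    /**
     * Parser side: where the grammar can only accept an identifier,
     * treat a keyword token as a plain IDENT (assumed simplification
     * of what a weak-keyword hook might do).
     */
    static int weaken(int type, boolean identifierExpected) {
        return (identifierExpected && type != IDENT) ? IDENT : type;
    }
}
```

Here classify("select") gives SELECT while classify("selection") stays
an IDENT, and weaken(...) turns a keyword back into an IDENT only when
the caller says an identifier is expected at that point.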


> I've also just downloaded and started looking at ANTLRWorks 1.08b,
> which is amazing, but haven't really started delving into it yet, but
> I will start playing with it shortly.

NB. ANTLRWorks is for ANTLR v3, while the above examples are for v2.


ta,
dave

-- 
http://david.holroyd.me.uk/

