[antlr-interest] Starting my first parser...

Mark Mandel mark.mandel at gmail.com
Thu Dec 21 22:48:15 PST 2006


Hey all,

Total language parser / compiler newbie here, so please go easy.

I've done as much reading as I can, and even tried a bunch of stuff,
but I can't seem to wrap my head around one very simple thing.

I'm writing an HQL-like syntax (I say 'like', but in practice it will
probably be almost identical) for my ColdFusion ORM.

To start with, I just want to parse a very basic HQL-like string that
looks like:

select emails.Email.emailid from emails.Email where
emails.Email.emailName = :emailName

No subselects, nothing fancy, just a simple select looking like that.

Now, where I first get stuck is at the lexer stage, and with
lookahead.

The way I'm seeing it (and I think this sounds right), the rule for
emails.Email.emailid is something like:

identifier : 'a'..'z' ( 'a'..'z' | '0'..'9' | '_' | '.' )* ;

Which basically means: anything that starts with a letter, followed by
any number of letters, digits, '.' or '_'.
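
To make the rule concrete, here is a rough Python sketch of the same
pattern as a regular expression. It is only an illustration, not ANTLR
syntax: the names IDENTIFIER and is_identifier are made up, and the
case-insensitive flag is an assumption (the rule only lists a..z, but the
sample query has uppercase letters in it).

```python
import re

# Hypothetical translation of the identifier rule into a regular expression.
# re.IGNORECASE is an assumption: the rule only lists a..z, but the sample
# query contains uppercase letters such as the 'E' in 'Email'.
IDENTIFIER = re.compile(r"[a-z][a-z0-9_.]*", re.IGNORECASE)


def is_identifier(text):
    """Return True when the whole string fits the identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


print(is_identifier("emails.Email.emailid"))  # True
print(is_identifier("select"))                # True
print(is_identifier(":emailName"))            # False
```

Note that 'select' passes the check too, which is exactly the overlap
described next.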

Where I get confused is, a Lexer definition like:

select : 'select' ;

could also be an identifier when parsing, since the string 'select'
also fits the definition of identifier. How can I tell ANTLR to
check for that, short of setting k=7 or something similar? Or is that
the only way?
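
For reference, one common way hand-written lexers get around this without
extra lookahead is to match the identifier pattern first and then look the
matched text up in a keyword table. The Python sketch below shows that
general technique only; it is not a claim about what ANTLR generates. The
token names, the PARAM and EQUALS patterns, and the keyword set are
assumptions that cover just the sample query.

```python
import re

# Hypothetical keyword table; it covers only the sample query.
KEYWORDS = {"select", "from", "where"}

# One alternative per token kind. Keywords have no pattern of their own:
# they are matched as identifiers and reclassified afterwards.
TOKEN = re.compile(r"""
    (?P<IDENTIFIER>[A-Za-z][A-Za-z0-9_.]*)
  | (?P<PARAM>:[A-Za-z][A-Za-z0-9_]*)
  | (?P<EQUALS>=)
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind == "WS":
            continue
        # The keyword check happens after the whole word has been read,
        # so 'select' becomes SELECT while 'selection' stays an identifier.
        if kind == "IDENTIFIER" and value.lower() in KEYWORDS:
            kind = value.upper()
        tokens.append((kind, value))
    return tokens


query = ("select emails.Email.emailid from emails.Email where "
         "emails.Email.emailName = :emailName")
for token in tokenize(query):
    print(token)
```

Because the table lookup runs on the complete word, the lexer never has to
decide between keyword and identifier character by character.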

I've looked at similar SQL grammars, and the HQL grammar, and they
seem quite happy with k=2, but I don't get how they have managed to
do that.  I feel like I'm missing a penny-dropping moment.

I've also just downloaded and started looking at ANTLRWorks 1.08b,
which is amazing. I haven't really delved into it yet, but I will
start playing with it shortly.

Any and all help will be appreciated, even if it's just 'go read over here'.

Thanks,

Mark

-- 
E: mark.mandel at gmail.com
W: www.compoundtheory.com

