[antlr-interest] XPath - identifiers are keywords

bob mcwhirter bob at werken.com
Wed Jul 3 07:06:56 PDT 2002


On Wed, 3 Jul 2002 mzukowski at yci.com wrote:

> This is actually a difficult problem.  See
> http://www.jguru.com/faq/view.jsp?EID=140.  If there aren't too many of the
> keywords that can be identifiers then you can handle it in the parser by
> getting rid of your COMMENT rule and testing for it in your parser.  Instead
> of testing for NCNAME, you would have a rule ncname, e.g.

XPath is particularly nefarious, and not just with regard to keywords.

	foo()

That's a node-type test if 'foo' is one of the node-type names
('comment', 'text', 'processing-instruction' or 'node'); otherwise
it's a function call (possibly to a user-defined function).
You also have to be able to deal with:

	foo:bar()

I think I ended up with a parser rule that had five predicates to decide
exactly what was going on: whether 'foo' was a node-type name,
a namespace prefix, or possibly a user function.
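
For illustration, here is roughly the decision a hand-written lexer has to
make once it sees a name followed by '('. This is only a sketch, not the
grammar or the SAXPath code discussed here: the name classify_call and the
ASCII-only name pattern are assumptions made for the example, and real
NCNames allow a much wider character set.

```python
import re

# The four node-type names defined by XPath 1.0.
NODE_TYPES = {"comment", "text", "processing-instruction", "node"}

# Assumption: a simplified, ASCII-only approximation of an XML NCName.
_NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# Optional "prefix:" followed by a local name and an opening parenthesis.
_CALL = re.compile(rf"\s*(?:({_NCNAME}):)?({_NCNAME})\s*\(")


def classify_call(text):
    """Classify the 'name(' at the start of text (hypothetical helper).

    Returns (kind, prefix, local_name), where kind is "node-type" or
    "function", or None if text does not start with 'name('.
    """
    m = _CALL.match(text)
    if m is None:
        return None
    prefix, local = m.group(1), m.group(2)
    # Only an unprefixed name can be a node-type test.
    if prefix is None and local in NODE_TYPES:
        return ("node-type", None, local)
    # Anything else followed by '(' is a function call, possibly
    # namespace-qualified and possibly user-defined.
    return ("function", prefix, local)
```

With this sketch, 'comment()' is reported as a node-type test, 'foo()' as
an unprefixed function, and 'foo:bar()' as a function 'bar' with the
namespace prefix 'foo'. The point is that the same characters lex
differently depending on what follows them, which is what makes a
keyword-based lexer awkward here.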

I hate to say it, but I found it easier to write an XPath lexer/parser
by hand than to convince an antlr grammar to do the right thing.

	<plug type="shameless">

	If you're just trying to parse xpaths, might I recommend 
	taking a look at SAXPath (http://saxpath.org/), which provides
	the xpath parse-events in a style like SAX does for XML,
	if you're using Java.  Though, I think SAXPath has been
	ported to Python, and maybe C++?

	</plug>

-bob


 
