[antlr-interest] syntactic predicates vs. backtrack=true
Jim Idle
jimi at temporal-wave.com
Tue Feb 5 08:53:21 PST 2008
In my opinion you should never use backtracking unless performance is
not a big deal for you. It is good for prototyping and as you say it can
make the specification of your grammar easier/neater.
If your grammar is ending up with lots of nested syntactic predicates
(that cause multiple executions of them) and you dont want to put in
the effort to solve the issues (remember that there are some places
where you must use a predicate, such as distinguishing generics from
logical expressions in C#), then using backtracking and memoizing might
well be faster than all the predicates.
Backtracking basically trys each alt that might be predicted in turn
until it finds one that works. The memoizing option allows the paresr to
realize that it has already tried some parts of the path and skip them
so it helps a lot performance wise, but you are still doing all that
looking ahead and abandoning paths. With a good knowledge of what is
happening though, you can favor the alts that are most likely to come up
and help a lot.
I would say that if you are producing a tool that does not run
continuously or on many many input streams, and/or the parse is
generally a small part of the overall time, then you may find
backtracking is just fine and infinitely preferable to lots of
predicates.
There are some languages that are not necessarily good targets for an
ANTLR grammar and there isn't a good way to parse them without
backtracking or lots of predicates. Wiki markup languages are like that
for instance because they are a bit haphazard and tend to have lots of
context. If you are not trying to parse such a language, then it is
usually possible to create a syntactic parser that has very few
ambiguities. Experience and effort is required. Experience lets you spot
the things that are truly an ambiguity in the language spec vs something
that you could specify in a better way.
Remember that your parser is only there to filter out valid syntax from
invalid syntax, not to try and allow different syntactic paths for
things that are essentially the same syntax. Amalgamate all common token
sequences, even if this allows through code that isnt really valid.
Then filter such things out semantically in the tree parser (you should
always end up with a syntactically sound tree). Many times language
specs are written in an informative manner and distinguish things that
are really semantics, such as methodName and interfaceName, which are
syntactically the same thing and perhaps create long starting sequences
before you can infer syntactically that you have a method declaration
not an interface declaration and so on.
I hope that helps, but in summary, try not to use backtracking and
memoizing, but dont throw out the options completely as there are
reasonable situations where they are perfectly fine.
Jim
> -----Original Message-----
> From: Mark Volkmann [mailto:r.mark.volkmann at gmail.com]
> Sent: Tuesday, February 05, 2008 4:41 AM
> To: antlr-interest
> Subject: [antlr-interest] syntactic predicates vs. backtrack=true
>
> Can someone describe a case where it's better to use syntactic
> predicates on rule alternatives than just specifying backtrack=true as
> a rule option? It certainly introduces less clutter in the grammar to
> use backtrack and of course there is less to type and less opportunity
> for error. Is it significantly more efficient to use syntactic
> predicates?
>
> --
> R. Mark Volkmann
> Object Computing, Inc.
More information about the antlr-interest
mailing list