[antlr-interest] Example code-generation target that outputs a state machine

Mike Samuel mikesamuel at gmail.com
Thu Oct 15 19:58:26 PDT 2009


I'd like to derive a state machine that recognizes a combined lexical
grammar for JS/HTML/CSS (hacked so that JS has a regular lexical
grammar), along with a mapping from each state to the production it is
part of. I need to keep the number of states small.
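
To make the request concrete, here is a minimal hand-written sketch of
the kind of output I have in mind. It is not ANTLR output, and the toy
HTML-like grammar, the state names, and the production names are all
hypothetical, chosen only for illustration.

```python
# Hypothetical states for a toy HTML-like lexical grammar (hand-written,
# not generated by ANTLR).
TEXT, TAG, ATTR_VALUE = range(3)

# State -> name of the production that state belongs to (illustrative names).
PRODUCTION = {
    TEXT: "html_text",
    TAG: "html_tag",
    ATTR_VALUE: "html_attr_value",
}

# Transition table: (state, character) -> next state.
# Any character not listed leaves the machine in its current state.
TRANSITIONS = {
    (TEXT, "<"): TAG,
    (TAG, ">"): TEXT,
    (TAG, '"'): ATTR_VALUE,
    (ATTR_VALUE, '"'): TAG,
}


def run(text, state=TEXT):
    """Feed text through the machine and return the final state."""
    for ch in text:
        state = TRANSITIONS.get((state, ch), state)
    return state


def production_at_end(text):
    """Return the production the machine is in after consuming text."""
    return PRODUCTION[run(text)]


if __name__ == "__main__":
    print(production_at_end('<a href="'))
```

Knowing which production the machine is in at the point where a value
would be interpolated is what would let the right escaping be chosen
for that context.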

I saw something tantalizing about "dfaState() String Template" at
http://www.antlr.org/wiki/display/ANTLR3/How+to+build+an+ANTLR+code+generation+target
but am still unsure how to proceed.

Is this possible with ANTLR, and if so, does anyone know of existing
code I could adapt?

For background, my end goal is to bolt string interpolation onto
JavaScript in a way that doesn't introduce XSS problems, as
described at
http://google-caja.googlecode.com/svn/changes/mikesamuel/string-interpolation-29-Jan-2008/trunk/src/js/com/google/caja/interp/index.html
I'd also like to explore using this approach to address many of the
injection problems in PHP by subtly changing the semantics of its
string interpolation.

cheers,
mike

