[antlr-interest] JavaScript grammar
Benjamin Shropshire
shro8822 at vandals.uidaho.edu
Sat Mar 29 21:15:35 PDT 2008
Chris Lambrou wrote:
> Hi all,
>
> I couldn't get the ECMAScript grammar by Greg Clemenson on the Grammar
> List page to work. It's supposed to run in v3.0 without any issues, but I
> ran into a whole host of problems. Since I'm fairly new to ANTLR, I
> thought I'd work my way through Terence's book and have a stab at
> writing a JavaScript grammar from scratch as a learning exercise.
> Well, I've reached a point where the script may be useful to others,
> so I've attached it - it compiles cleanly, without any warnings. I
> could also do with some advice, though.
>
> 1. Unlike other whitespace characters, line separators (represented
> by my LT token type) are important in JavaScript, as you're
> allowed to use them to terminate statements instead of the usual
> terminating semicolon character. As a result, I cannot 'hide'
> line separators like other whitespace characters, and my grammar
> is peppered with LT!* sequences. Is there a way to place the LT
> tokens on the hidden channel, and then optionally reveal them
> only in the few rules that require it?
> 2. The grammar doesn't include any ^ or ! modifiers to impose any
> kind of useful structure to the generated AST. I can see how I
> ought to do this in the simple cases (e.g. 'return'^
> expression), but I'm not sure how far I ought to go with this
> before relying on a subsequent tree grammar to finish the job.
>
> I haven't performed much in the way of formal testing, except that it
> seems to work with everything I've thrown at it using the ANTLRWorks
> debugger. I guess I ought to look into writing some gunit tests...
>
> Regards,
>
> Chris
It is most likely not kosher, but if you can look at an LT in a sequence
of tokens and tell whether it is a virtual semicolon (knowing nothing
but the adjoining tokens), then some sort of preprocessor (I'm thinking:
lex, filter the tokens into a new token stream, parse) might be able to
do the conversion for you. You might call the filter a TokenSedStream or
something like that. I did something similar (but on the raw text) to
deal with indentation sensitivity in my only attempt with ANTLR. As I
said, not kosher, but if all else fails "You gotta go with what works."
(Law #37)
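
To make the lex / filter / parse idea concrete, here is a rough sketch in
plain Python. It does not use the ANTLR runtime. The Token type, the
token kind names, the two lookup sets and filter_line_terminators are
all made up for illustration, and the rule for deciding when an LT is a
virtual semicolon is a deliberately crude assumption, not the real
ECMAScript semicolon insertion rule.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Token(NamedTuple):
    kind: str  # hypothetical kinds: "ID", "NUM", "OP", "LT", "SEMI", ...
    text: str


# Assumed, simplified classification of the adjoining tokens.
# A real JavaScript grammar would need a much richer rule here.
CAN_END_STATEMENT = {"ID", "NUM", "RPAREN"}
CAN_START_STATEMENT = {"ID", "NUM", "KEYWORD"}


def filter_line_terminators(tokens: Iterable[Token]) -> Iterator[Token]:
    """Rewrite each LT as a SEMI token or drop it, using only its neighbours."""
    toks = list(tokens)
    prev: Optional[Token] = None  # last token actually emitted
    for i, tok in enumerate(toks):
        if tok.kind != "LT":
            yield tok
            prev = tok
            continue
        # Look ahead to the next token that is not a line terminator.
        nxt = next((t for t in toks[i + 1:] if t.kind != "LT"), None)
        ends = prev is not None and prev.kind in CAN_END_STATEMENT
        starts = nxt is None or nxt.kind in CAN_START_STATEMENT
        if ends and starts:
            # This LT acts as a virtual semicolon.
            prev = Token("SEMI", ";")
            yield prev
        # Otherwise the LT is plain whitespace and is dropped.


# Token stream for:  a = 1 <LT> b = a <LT> + 2 <LT>
source = [
    Token("ID", "a"), Token("OP", "="), Token("NUM", "1"), Token("LT", "\n"),
    Token("ID", "b"), Token("OP", "="), Token("ID", "a"), Token("LT", "\n"),
    Token("OP", "+"), Token("NUM", "2"), Token("LT", "\n"),
]
filtered = list(filter_line_terminators(source))
print(" ".join(t.text for t in filtered))
```

With a filter like this sitting between the lexer and the parser, the
parser rules would only ever see explicit SEMI tokens, so the LT!*
sequences could go away. Whether a purely local rule is good enough for
all of JavaScript is exactly the open question.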