[antlr-interest] Want to write a fairly simple syntax converter...

Fri Apr 11 07:54:22 PDT 2008

I am new to language parsing, and I am struggling to work out
'how much I need to understand' (to use a Rumsfeld-ism).

If I just want to write a fairly simple converter that keeps whitespace
fairly intact, how 'dirty' do I have to get my hands, both in terms of
language parsing and in terms of code?

To clarify, I want to convert a proprietary format into equivalent XML.
Something like:
    "x [ a b c ]"  ->  "<rule name=x> <one-of> a b c </one-of> </rule>"
(Obviously, the real format gets a little more complicated than that.)
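
To make the target concrete, here is a minimal sketch of that one mapping in Python. It uses a plain regular expression rather than an ANTLR grammar, it handles only the single form "name [ items ]", and the names RULE and convert_rule are made up for illustration. It also normalizes whitespace, which is exactly the part I would like to avoid.

```python
import re

# Hypothetical pattern: one rule name followed by one bracketed list.
RULE = re.compile(r"^\s*(\w+)\s*\[\s*(.*?)\s*\]\s*$")


def convert_rule(text):
    """Convert 'name [ a b c ]' into the XML shape shown above."""
    match = RULE.match(text)
    if match is None:
        raise ValueError("input is not of the form: name [ items ]")
    name, body = match.groups()
    # Collapse runs of whitespace between items to single spaces.
    items = " ".join(body.split())
    return f"<rule name={name}> <one-of> {items} </one-of> </rule>"


print(convert_rule("x [ a b c ]"))
```

Anything nested or more varied than this single form is presumably where a real grammar becomes necessary.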

My 2 biggest questions:
1) Do I need to worry about 'building trees', accessing the AST or
anything like that? Or are the 'snippets' of code you can put in the
grammar rules going to get me by?
2) Is maintaining whitespace easily doable? It seems to get gobbled up
with little opportunity to keep it intact. It seems I could maybe
tokenize it explicitly as meaningful input, and then simply
reconstitute it in the output. Or is that just crazy talk that will
complicate my grammars too much (with 'WS?' sprinkled everywhere)?
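
To illustrate what I mean by tokenizing whitespace explicitly, here is a rough sketch in Python (not ANTLR). The token names WS, LBRACK, RBRACK and WORD and the function names are my own inventions, and it assumes the toy "name [ items ]" format only. Whitespace becomes an ordinary token that is copied to the output verbatim, while the other tokens are translated.

```python
import re

# Every character of the input belongs to exactly one token kind,
# so joining the token texts reproduces the input unchanged.
TOKEN = re.compile(
    r"(?P<WS>\s+)|(?P<LBRACK>\[)|(?P<RBRACK>\])|(?P<WORD>[^\s\[\]]+)"
)


def tokenize(text):
    """Return (kind, text) pairs, keeping whitespace as real tokens."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


def convert_preserving_ws(text):
    """Translate tokens to XML while passing whitespace through as-is."""
    out = []
    depth = 0  # bracket nesting; a word at depth 0 is taken as a rule name
    for kind, value in tokenize(text):
        if kind == "WS":
            out.append(value)
        elif kind == "LBRACK":
            depth += 1
            out.append("<one-of>")
        elif kind == "RBRACK":
            depth -= 1
            out.append("</one-of>")
            if depth == 0:
                out.append("</rule>")
        elif depth == 0:
            out.append(f"<rule name={value}>")
        else:
            out.append(value)
    return "".join(out)


print(convert_preserving_ws("x  [ a\n\tb c ]"))
```

In this hand-written loop the whitespace handling sits in one branch, so there is no 'WS?' to sprinkle around. Whether a grammar can get the same effect without that clutter is what I am asking.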

... Just trying to get a good idea of what I am in for...

Thanks for any replies!
