[antlr-interest] Want to write a fairly simple syntax converter...

Peter Nann peter.nann at vecommerce.com.au
Mon Apr 14 15:52:17 PDT 2008


 
Thanks for the pointers. I think I have enough info to bury my head in
the manual for a while.

> > P.S. I am loving all the comments about "XML not being meant for human
> > consumption"...
> Not sure if everyone here agrees, but at least Ter does.

I think I have come across at least 2 such comments.
Even that makes me feel better and more sane. ;-)
As sane as Ter maybe, anyway...  (Hello if you are reading Ter)
I very much like the cut of Ter's jib.
The single-mindedness behind ANTLR is impressive and inspiring.

ANTLRWorks is impressive too.

Cheers.


-----Original Message-----
From: Johannes Luber [mailto:jaluber at gmx.de] 
Sent: Monday, 14 April 2008 8:26 PM
To: Peter Nann
Subject: Re: [antlr-interest] Want to write a fairly simple syntax
converter...

Peter Nann schrieb:
> Thanks Johannes:
>  You are right, my XML example was bad, sorry, I was working from
> memory... (And I ain't no XML guru. Hate the stuff.) The format that we
> are targeting (which is already strictly defined by a third party)
> does use  <item>a</item>.
> 
> 
> I want to maintain output white-space as I still want the output to be
> human readable (as much as XML can be).
>  - At least so that I can detect bugs in my parsing/translation, but
> also because the output XML will be used by another (proprietary)
> compiler, and if I get error messages out of that I want to be able to
> identify the error location easily back in the original.
>  - Maintaining 1-to-1 line mappings between input and output will help
> a lot.

As the best bet is to correlate lines rather than the characters within a
particular line, it seems easier to add the line information in an
extra XML element (even if it is only a comment <!-- -->). That way you can
simplify the parser and are more flexible in how you output the XML.
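
A minimal sketch of that idea in Python, for illustration only. It assumes a
hypothetical input with one rule per line in the "x [ a b c ]" shape from the
original question, and the <rule>/<one-of>/<item> element names are taken
from the examples in this thread, not from the real third-party format. The
function name convert and the regular expression are made up for this sketch.
Each output line carries its source line number in a trailing comment, and
blank input lines stay blank so the line counts also match.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical input shape: one rule per line, e.g. "x [ a b c ]".
RULE = re.compile(r"^\s*(\w+)\s*\[\s*(.*?)\s*\]\s*$")


def convert(source: str) -> str:
    """Translate each input line to one XML line tagged with its line number."""
    out = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            # Keep blank lines so input and output line numbers stay aligned.
            out.append("")
            continue
        match = RULE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        name, body = match.groups()
        # One <item> element per whitespace-separated token.
        items = "".join(f"<item>{escape(tok)}</item>" for tok in body.split())
        out.append(
            f"<rule name={quoteattr(name)}><one-of>{items}</one-of></rule>"
            f" <!-- line {lineno} -->"
        )
    return "\n".join(out)


if __name__ == "__main__":
    print(convert("x [ a b c ]\n\ny [ d ]"))
```

With the comment carrying the line number, an error reported by the
downstream compiler can be traced back to the source even if the XML is
later re-indented or split across lines.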

> P.S. I am loving all the comments about "XML not being meant for human
> consumption".
> I thought I must be crazy being the only person to think this amongst
> my surrounding Java programmers... Hallelujah to have found a group
> that agrees...  ;-)

Not sure if everyone here agrees, but at least Ter does.

Johannes

P.S.: Please make sure that your answer goes to the list by using the
"Reply to All" option.
> 
> -----Original Message-----
> From: Johannes Luber [mailto:jaluber at gmx.de]
> Sent: Saturday, 12 April 2008 4:20 AM
> To: Peter Nann
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Want to write a fairly simple syntax 
> converter...
> 
> Peter Nann schrieb:
>> I am new to all this language parsing, and I am struggling to 
>> understand 'how much I need to understand' (to use a Rumsfeld'ism)
>>
>> If I just want to write a fairly simple converter, and keep whitespace
>> fairly intact, how 'dirty' do I have to get my hands, language parsing
>> and code wise?
>>
>> To clarify, I want to convert a proprietary format into equivalent XML.
>> Something like "x [ a b c ]"   ->  "<rule name=x> <one-of> a b c 
>> </one-of> </rule>"
>> (But obviously, it gets a little more complicated than that)
> 
> Please don't use this kind of XML. It is possible to create a schema 
> which says that an element includes a whitespace separated list, but 
> everyone agrees that it is simpler to work with:
> 
> "<rule name=x> <one-of> <elem>a</elem> <elem>b</elem> <elem>c</elem> 
> </one-of> </rule>"
> 
> More verbose yes, but XML wasn't designed with terseness in mind.
> 
>> My 2 biggest questions:
>> 1) Do I need to worry about 'building trees', accessing the AST or 
>> anything like that? Or are the 'snippets' of code you can put in the 
>> grammar rules going to get me by?
> 
> If you want to put out the parse tree in the same shape as it is and 
> with no extra computation, then I doubt that a tree grammar is 
> necessary for you.
> 
>> 2) Is maintaining whitespace easily do-able? It seems to get gobbled 
>> up with little opportunity to keep it intact. It seems I could maybe 
>> tokenize it explicitly as meaningful input, and then be able to simply
>> re-constitute it in the output, or is that just crazy talk and will 
>> complicate my grammars too much (with 'WS?' sprinkled everywhere...)
> 
> What do you need to retain the whitespace for? XML ignores big parts of
> it anyway and with a new file format you don't have to follow the 
> conventions laid down by your predecessors. Better to ignore the 
> original whitespace here.
> 
> Johannes
> 


