[antlr-interest] Parsing RTF to Braille

Daniel Warner dwarner at uni-paderborn.de
Mon Jun 25 11:46:48 PDT 2007


Hello,

I'm studying computer science and mathematics in Paderborn, and I'm currently working on a university project whose goal is to transform RTF documents into a text-based representation of braille called HBS.

The output format HBS is already specified, although there is no grammar for it (the format is defined only by an existing application, which I have to reverse-engineer). Much of the information in an RTF document is of course irrelevant for blind readers and will therefore have to be dropped. HBS encodes a lot of structural information but far less layout information, so I will also face questions such as: What do I do with footnotes? How should I represent text that is colored red (possibly used inconsistently)? How can I map layout to structure appropriately? These are just a few examples.

As I want to implement the RTF-to-HBS translator in Java, I looked for parser generators that target this language. ANTLR v3 seems to me the most promising candidate, and I am grateful to Prof. Parr for publishing his tool under the BSD license.

I have already bought his book "The Definitive ANTLR Reference" (and the PDF) and have a question about the "big picture" for my project:

The RTF specification 1.9 from Microsoft is huge. What approach would you suggest for parsing RTF with ANTLR and producing the text-based braille representation (HBS) mentioned above?

1) Use actions in the grammar rules?
2) Create an AST from the RTF input and a tree grammar for the AST that outputs HBS?
3) Use templates?
4) Other suggestions?
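
To make option 2 more concrete, here is a rough sketch of the two-pass shape I have in mind. It is hand-written plain Java, not ANTLR output, and it only illustrates the idea: pass 1 builds a small tree of groups, control words and text, and pass 2 walks that tree and decides what to keep. The class name RtfSketch, the list of skipped destinations and the "[footnote: ...]" marker are placeholders of mine; the sketch emits plain text, and real HBS codes would replace the output in pass 2. Control-word parameters and most of the RTF specification are ignored.

```java
import java.util.*;

class RtfSketch {
    enum Kind { GROUP, CONTROL, TEXT }

    static final class Node {
        final Kind kind;
        final String value; // control word name or text; empty for groups
        final List<Node> children = new ArrayList<>();
        Node(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    // Destinations without readable text (illustrative, far from complete).
    static final Set<String> SKIPPED = new HashSet<>(
            Arrays.asList("*", "fonttbl", "colortbl", "stylesheet", "info"));

    private final String src;
    private int pos = 0;

    RtfSketch(String src) { this.src = src; }

    // Pass 1: turn one "{ ... }" group into a tree node.
    Node parseGroup() {
        if (pos >= src.length() || src.charAt(pos) != '{')
            throw new IllegalArgumentException("expected '{' at offset " + pos);
        pos++; // consume '{'
        Node group = new Node(Kind.GROUP, "");
        StringBuilder text = new StringBuilder();
        while (pos < src.length() && src.charAt(pos) != '}') {
            char c = src.charAt(pos);
            if (c == '{') {
                flush(text, group);
                group.children.add(parseGroup());
            } else if (c == '\\') {
                pos++; // consume the backslash
                readControl(text, group);
            } else {
                if (c != '\r' && c != '\n') text.append(c); // raw line breaks are not content
                pos++;
            }
        }
        if (pos >= src.length()) throw new IllegalArgumentException("unbalanced group");
        pos++; // consume '}'
        flush(text, group);
        return group;
    }

    // Reads what follows a backslash: an escaped character, a control symbol or a control word.
    private void readControl(StringBuilder text, Node group) {
        if (pos >= src.length()) throw new IllegalArgumentException("dangling backslash");
        char c = src.charAt(pos);
        if (c == '\\' || c == '{' || c == '}') { text.append(c); pos++; return; }
        flush(text, group);
        if (!isLetter(c)) { // control symbol such as \* or \~
            pos++;
            group.children.add(new Node(Kind.CONTROL, String.valueOf(c)));
            return;
        }
        int start = pos;
        while (pos < src.length() && isLetter(src.charAt(pos))) pos++;
        String name = src.substring(start, pos);
        // An optional numeric parameter is consumed and discarded in this sketch.
        if (pos + 1 < src.length() && src.charAt(pos) == '-' && isDigit(src.charAt(pos + 1))) pos++;
        while (pos < src.length() && isDigit(src.charAt(pos))) pos++;
        if (pos < src.length() && src.charAt(pos) == ' ') pos++; // one space is a delimiter
        group.children.add(new Node(Kind.CONTROL, name));
    }

    private static boolean isLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    private static void flush(StringBuilder text, Node group) {
        if (text.length() == 0) return;
        group.children.add(new Node(Kind.TEXT, text.toString()));
        text.setLength(0);
    }

    // Pass 2: walk the tree; this is where the RTF-to-HBS decisions would live.
    static void emit(Node node, StringBuilder out) {
        if (node.kind == Kind.TEXT) { out.append(node.value); return; }
        if (node.kind == Kind.CONTROL) {
            if (node.value.equals("par")) out.append('\n'); // paragraph break
            return; // all other formatting (bold, color, ...) is dropped here
        }
        Node first = node.children.isEmpty() ? null : node.children.get(0);
        String dest = (first != null && first.kind == Kind.CONTROL) ? first.value : "";
        if (SKIPPED.contains(dest)) return;
        boolean note = dest.equals("footnote");
        if (note) out.append(" [footnote: "); // placeholder marker, not HBS
        for (Node child : node.children) emit(child, out);
        if (note) out.append("]");
    }

    static String toPlainText(String rtf) {
        StringBuilder out = new StringBuilder();
        emit(new RtfSketch(rtf).parseGroup(), out);
        return out.toString();
    }
}
```

For the input {\rtf1\ansi{\fonttbl{\f0 Arial;}}Hello \b world\b0 !\par Bye} the call RtfSketch.toPlainText(...) returns "Hello world!" and "Bye" separated by a line break. Keeping the tree separate from the output step would let me change the footnote or color policy in pass 2 without touching the parsing, which is what attracts me to option 2.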

Thanks a lot in advance for any hints that help me get started with my work,

Daniel Warner
