[antlr-interest] Parsing RTF to Braille

Daniel Warner dwarner at uni-paderborn.de
Tue Jun 26 06:32:43 PDT 2007


Hi Gerald,

You're right about the differing and incomplete interpretations of the RTF spec. But it seems to me that implementing a PDF->HBS converter just moves the problem to finding a good parser for RTF->PDF?

Regards,
Daniel.

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Gerald B. Rosenberg
Sent: Monday, June 25, 2007 9:33 PM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Parsing RTF to Braille


At 11:46 AM 6/25/2007, Daniel Warner wrote:

The RTF specification 1.9 from Microsoft is huge. What approach would you suggest in parsing RTF with ANTLR to the mentioned text-based representation of braille (HBS)?

1) Use actions in the grammar rules?
2) Create an AST from the RTF input and a tree grammar for the AST that outputs HBS?
3) Use templates?
4) Other suggestions?

I would strongly suggest implementing a PDF to HBS converter if only to avoid the many different/incomplete interpretations of the RTF spec.  The PDF spec is substantially smaller and far more uniformly implemented.  Conversion from RTF, and many other document formats, to PDF can be automated with little difficulty.  

In both RTF and PDF, top-down, left-to-right ordering of text is common, but not required.  Therefore, you can encounter "out of order" text even on simple pages.  So, actions that emit output directly from the grammar rules are not likely to be useful.  If there is a need to handle footnotes, tables, and columns, an AST is the only way to go.  You will likely need multiple tree walkers to distinguish different text blocks and reorganize the AST content into a reasonably consistent form.  Output then should be fairly linear, so templates should not be necessary.
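
A minimal sketch of the multi-pass idea, in plain Python rather than ANTLR tree grammars. Everything here is an assumption made for illustration: the TextRun/Block/Page classes, the footnote_y threshold, and the bracketed block labels are hypothetical, and the output is plain text lines, not actual HBS. One pass separates text blocks, a second restores reading order, and only then is output emitted linearly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextRun:
    """A positioned piece of text, as it might arrive from a parser (hypothetical)."""
    x: float  # distance from the left edge of the page
    y: float  # distance from the top edge of the page
    text: str


@dataclass
class Block:
    """A group of runs that belong together, e.g. body text or footnotes."""
    kind: str
    runs: List[TextRun] = field(default_factory=list)


@dataclass
class Page:
    """Root of the toy tree: raw runs in file order, plus blocks built by the passes."""
    runs: List[TextRun]
    blocks: List[Block] = field(default_factory=list)


def classify_pass(page: Page, footnote_y: float = 700.0) -> None:
    """Pass 1: split runs into body and footnote blocks.

    The y threshold is an assumed heuristic, not something from either spec.
    """
    body, notes = Block("body"), Block("footnote")
    for run in page.runs:
        (notes if run.y >= footnote_y else body).runs.append(run)
    page.blocks = [b for b in (body, notes) if b.runs]


def reorder_pass(page: Page) -> None:
    """Pass 2: put each block's runs into top-down, left-to-right order."""
    for block in page.blocks:
        block.runs.sort(key=lambda r: (r.y, r.x))


def emit(page: Page) -> List[str]:
    """Final step: linear output, joining runs that share a baseline into one line."""
    lines: List[str] = []
    for block in page.blocks:
        lines.append(f"[{block.kind}]")
        current_y = None
        for run in block.runs:
            if run.y != current_y:
                lines.append(run.text)
                current_y = run.y
            else:
                lines[-1] += " " + run.text
    return lines


# Runs deliberately listed "out of order", as they may appear in the file.
page = Page(runs=[
    TextRun(300, 100, "world"),
    TextRun(72, 720, "1 A footnote"),
    TextRun(72, 100, "Hello"),
    TextRun(72, 120, "Second line"),
])
classify_pass(page)
reorder_pass(page)
lines = emit(page)
print("\n".join(lines))
```

Because emit() only runs after both passes, it never has to look ahead or backtrack, which is why direct output from grammar actions would not work here but a simple linear writer does.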

HTHs,
Gerald 
----
Gerald B. Rosenberg, Esq.
NewTechLaw
260 Sheridan Ave., Suite 208
Palo Alto, CA  94306-2009

650.325.2100  (office)  /  650.703.1724  (cell)
650.325.2107  (facsimile)

www.newtechlaw.com




