[antlr-interest] Parsing RTF to Braille
Gerald B. Rosenberg
gbr at newtechlaw.com
Mon Jun 25 12:33:10 PDT 2007
At 11:46 AM 6/25/2007, Daniel Warner wrote:
>The RTF specification 1.9 from Microsoft is huge. What approach
>would you suggest in parsing RTF with ANTLR to the mentioned
>text-based representation of braille (HBS)?
>
>1) Use actions in the grammar rules?
>2) Create an AST from the RTF input and a tree grammar for the AST
>that outputs HBS?
>3) Use templates?
>4) Other suggestions?
I would strongly suggest implementing a PDF to HBS converter if only
to avoid the many different/incomplete interpretations of the RTF
spec. The PDF spec is substantially smaller and far more uniformly
implemented. Conversion from RTF, and many other document formats,
to PDF can be automated with little difficulty.
In both RTF and PDF, text usually appears in top-to-bottom,
left-to-right order, but neither format requires it. You can
therefore encounter "out of order" text even on simple pages, so
grammar actions that emit output directly are unlikely to be useful.
If you need to handle footnotes, tables, and columns, an AST is the
only way to go. You will likely need multiple tree walkers to
distinguish the different text blocks and reorganize the AST content
into a reasonably consistent form. After that, output should be
fairly linear, so templates should not be necessary.
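
To make the multi-pass idea concrete, here is a minimal sketch in plain
Python rather than ANTLR tree grammars. Everything in it is an
assumption made for illustration: the TextBlock and DocumentNode
types, the "body"/"footnote" kinds, and the top/left coordinates
(measured from the top-left corner of the page) are hypothetical
stand-ins for whatever a real parser would put in the AST. One pass
separates the block kinds, a second restores reading order, and the
final output is a simple linear walk.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TextBlock:
    """Hypothetical AST leaf: a run of text plus where it sits on the page."""
    kind: str    # assumed categories: "body" or "footnote"
    page: int
    top: float   # assumed: distance from the top edge of the page
    left: float  # assumed: distance from the left edge of the page
    text: str


@dataclass
class DocumentNode:
    """Hypothetical AST root; blocks are stored in the order they were parsed."""
    blocks: List[TextBlock] = field(default_factory=list)


def split_by_kind(doc: DocumentNode) -> Tuple[List[TextBlock], List[TextBlock]]:
    """First pass: separate body text from footnotes."""
    body = [b for b in doc.blocks if b.kind == "body"]
    notes = [b for b in doc.blocks if b.kind == "footnote"]
    return body, notes


def reading_order(blocks: List[TextBlock]) -> List[TextBlock]:
    """Second pass: sort by page, then top to bottom, then left to right."""
    return sorted(blocks, key=lambda b: (b.page, b.top, b.left))


def linearize(doc: DocumentNode) -> List[str]:
    """Emit body text in reading order, followed by the footnotes."""
    body, notes = split_by_kind(doc)
    return [b.text for b in reading_order(body) + reading_order(notes)]


# Blocks arrive "out of order", as they can in a real file.
doc = DocumentNode([
    TextBlock("body", 1, 200.0, 72.0, "second paragraph"),
    TextBlock("footnote", 1, 700.0, 72.0, "a footnote"),
    TextBlock("body", 1, 100.0, 72.0, "first paragraph"),
])
for line in linearize(doc):
    print(line)
```

Once the blocks are in a consistent order, the last step is just a
loop over strings, which is why templates add little here. Columns
and tables would need their own passes and a smarter sort key than
the simple (page, top, left) tuple assumed above.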
HTH,
Gerald
----
Gerald B. Rosenberg, Esq.
NewTechLaw
260 Sheridan Ave., Suite 208
Palo Alto, CA 94306-2009
650.325.2100 (office) / 650.703.1724 (cell)
650.325.2107 (facsimile)
www.newtechlaw.com