Rob Weir: Captain Kirk, Bring Your Universal Translator
Rob Weir continues his examples of areas where the Microsoft / Ecma International OOXML "standard" is designed to be too obscure for any competitor to implement it. Says Rob, "The Ecma Office Open XML (OOXML) specification seems to presuppose the existence of a Universal Translator of sorts." He then goes on to quote a part of the specification.
An alternative format import part allows content specified in an alternate format (HTML, MHTML, RTF, earlier versions of WordprocessingML, or plain text) to be embedded directly in a WordprocessingML document in order to allow that content to be migrated to the WordprocessingML format.
He notes that there are many different versions of each of these formats, but the standard does not specify which versions. An OOXML-compliant application needs to read all of the above-mentioned formats, without any knowledge of which versions to accept. Conspicuously missing from the list are standard formats like XHTML, DocBook, TeX, or ODF.
Andy Updegrove has begun listing some of the places where OOXML conflicts with existing standards. This is the time for standards groups to point out potential problems, in the hope that they will be corrected in the final spec or, if the spec is not salvageable, that it will be rejected as unacceptable.
Earlier today I saw a post, referenced from an AbiWord blog, in which someone asked whether he should use ODF as a document interchange format. I note that Ryan recommends RTF instead of ODF for document interchange, and Dom (the leader of the AbiWord project) says that RTF is just as good for that purpose as ODF. Having worked in environments where multiple word processors were used and RTF was the supposed interchange format, I see problems there. Maybe the applications simply supported different versions of RTF, but documents showed significant and unexpected differences from one application to the next. I like the words Rob Weir used to describe RTF:
RTF – Rich Text Format is a proprietary document format occasionally updated by Microsoft. As one wag quipped, "RTF is defined as whatever Microsoft Word exports when it exports to RTF".
Dom, Ryan, I have to hand it to you. If AbiWord were my project, it would probably be sitting with all my other partially-completed projects. You have really succeeded in producing a small (light on the resource requirements), multiplatform word processor. You are very much respected, including by me. But in this case, I still disagree with you. ODF was designed to be used across applications and platforms, plus it has the advantages of being XML-based and having an open specification that is not controlled by any one vendor.
Today, I may send a draft of a document to a co-worker. Tomorrow, he may want to transform it into XHTML + CSS to put up on our intranet. The next day, someone may want it in PDF format. The day after that, it may be fed (with the proper transformations) into our to-do list / time management system. An XML-based format is designed to be easy to manipulate with software tools (such as Apache Cocoon), while still remaining human-readable. RTF, as you know, is a maze of backslash-quoted codes that is sure to deter most humans from trying to read the file's contents directly.
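To make the XML point concrete, here is a rough sketch of the "transform it into XHTML" step using nothing but Python's standard library. It assumes a tiny hand-written fragment in the shape of an ODF content.xml (a real ODF file is a zip package with content.xml inside it), and the function name odf_text_to_xhtml and the sample text are my own inventions for illustration. It handles only headings and plain paragraphs, so treat it as a demonstration of the idea rather than a real converter.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespaces used by ODF for the document body and its text elements.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hypothetical sample: a minimal fragment shaped like an ODF content.xml.
SAMPLE_CONTENT = f"""<office:document-content
    xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:h text:outline-level="1">Status report</text:h>
      <text:p>The draft is ready for review.</text:p>
      <text:p>Costs stayed &lt; budget.</text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def odf_text_to_xhtml(content_xml: str) -> str:
    """Turn top-level ODF headings and paragraphs into XHTML markup."""
    root = ET.fromstring(content_xml)
    body = root.find(f"{{{OFFICE_NS}}}body/{{{OFFICE_NS}}}text")
    if body is None:
        return ""
    lines = []
    for node in body:
        # Collect all text inside the element, then re-escape it for output.
        text = escape("".join(node.itertext()))
        if node.tag == f"{{{TEXT_NS}}}h":
            level = node.get(f"{{{TEXT_NS}}}outline-level", "1")
            lines.append(f"<h{level}>{text}</h{level}>")
        elif node.tag == f"{{{TEXT_NS}}}p":
            lines.append(f"<p>{text}</p>")
        # Anything else (lists, tables, frames) is ignored in this sketch.
    return "\n".join(lines)


if __name__ == "__main__":
    print(odf_text_to_xhtml(SAMPLE_CONTENT))
```

The point is not these particular lines of code but that any off-the-shelf XML parser gets you this far. Doing the same with RTF means writing or finding an RTF-specific parser first.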
I do agree that for most of the purposes for which people exchange word processing documents, they would be better off exchanging either plain text or PDF. Even so, that isn't what people do, at least not yet.