OOXML and the Mac: More Bad News from Microsoft

Once upon a time, 5 months ago, I wrote a blog post with the title Is Office Open XML A One-Way Standard? Ask Microsoft. In it, I took Microsoft to task because OOXML was so difficult to implement, based on information in a blog posting by Rick Schaut where he explained why it was taking so long for the Mac support for OOXML to arrive. That post generated an incredible amount of traffic and links to my blog, much more than I ever had before or since. Rick Schaut even tried to rebut my claims in his original blog posting, and David Berlind of ZDNet covered the whole kerfuffle, so of course I had to respond too. It was quite an experience for someone who had just started his blog.

But it seems like this is the time to revisit the situation, because yesterday Microsoft announced the availability of a beta version of a tool called the Microsoft Open Office XML File Format Converter for Mac. Catchy name, huh? What this standalone 25MB download does is convert Word 2007 .docx and .docm files into RTF, which can then be imported by Word 2004 for the Mac. There is no support yet for Excel or PowerPoint, which Microsoft had previously promised for Spring 2007. Funny that it doesn’t export directly to .doc, though, since I thought the whole point of OOXML was its 100% fidelity with existing Microsoft Office document formats. Now Microsoft says conversion to RTF is “good enough” for those pesky Mac users:

Why Rich Text Format? RTF (once called the “interchange format”) is simply a highly convenient intermediate format for the beta converter to use; I’ll let one of our Word experts (like Rick) expand on the technical reasons why this is the case if there’s interest.

Seems to me that if RTF is good enough as an interchange format, ODF should also qualify since it is at least as functional as RTF! Can’t wait to hear what new rationalization Rick and company come up with for this one…

But that’s not all the bad news. Hidden in the announcement of the tool is the news that the release of Office 2008 for the Mac later this year will not include native OOXML capability – that will arrive sometime later (they say the tools for Mac Office 2004 will arrive 6-8 weeks later, but don’t specify a date for the Mac Office 2008 version). I’m not the only one who has figured out the blatant spin here: see CNET, Wired, and TUAW.

And that, of course, brings us back to Rick Schaut’s original rationales for the OOXML support in Mac Office. Wonder if he still thinks they made the right call? Sure feels to me like my “serious” estimate of 40 man years to fully implement OOXML support wasn’t far off the mark…

~ by Andrew Shebanow on 16May07.

15 Responses to “OOXML and the Mac: More Bad News from Microsoft”

  1. There is one statement you make:

    Funny that it doesn’t export directly to .doc, though, since I thought the whole point of OOXML was its 100% fidelity with existing Microsoft Office document formats.

    Well, from what I know, OOXML does have complete fidelity with existing Microsoft Office document formats.

    I see the move to convert to RTF more as a way of protecting the old binary formats. If you have a translator for the old formats, then you are exposing those formats (which they don’t want to do).

    I’m not saying that it’s the right move by MS, but rather, because they choose to not convert to the old formats doesn’t mean the new one doesn’t have 100% fidelity with the old one.

  2. Nicholas, I think you missed my point. The issue is: the reason they said they needed OOXML and couldn’t use ODF was because 100% fidelity was crucial to their customers. If conversions back and forth weren’t transparent those customers wouldn’t accept the new format. So now here we are, and on the Mac they’re saying that roundtrip fidelity isn’t that important: using RTF as an intermediary is just fine. Maybe it is, maybe it isn’t, but they can’t have it both ways when it’s “convenient”.

  3. And by the way, Nicholas, your website is completely broken on Mac Safari and Mac Firefox. You need to break out of that IE-only world…

  4. “OOXML does have complete fidelity with existing Microsoft Office document formats.”

    Correct me if I am wrong, but the “VBA macros” library is also gone in Mac Office 2008.

    “Seems to me that if RTF is good enough as an interchange format, ODF should also qualify since it is at least as functional as RTF!”

    That’s exactly right. It’s a shame that Microsoft would rather have everybody purchase expensive licenses of the latest Office when they get WordPad (Windows) for free.

    As for the degradation to RTF itself, it’s one day of work for someone in the dev team. The decision tree is trivial, and the Word file format lends itself to this. Converting to old .doc requires much more work because of all that is stored in bits. And probably no round-trip either (all the new theming + gradient/bevel fill capabilities for instance).

  5. It’s all evidence of the same basic dishonesty. Even the name open office XML is designed to be confusing. I seem to remember this same company threatening to sue a guy called Mike Rowe because his name sounded too similar to M$’s. I’m sure I’m not the only one who gets fed up with the arrogance and hypocrisy of it all. It’s fundamentally obvious that M$ is interested in M$, not its customers, not society at large, not interoperability and not fair competition. Fair enough, probably true of many companies but at least let’s be honest about it.

  6. Andy,

    Wherever did you get the idea that using RTF for Word implies a loss in fidelity? Because it’s “convenient”? Is there some strange dictionary you’re using where “convenient” and “fidelity” are antonyms?

    While you’re at it, can you provide an actual source for your “good enough” quote? I can’t find it anywhere in the MacMojo post.

    And, no, ODF is, most definitely, not as “functional” as RTF, for a number of reasons, not the least of which is the fact that RTF is capable of expressing data that ODF is not.

    But, you knew that already, didn’t you? If not, then why are you blogging about file formats without having sufficiently familiarized yourself with even the rudimentary facts?

  7. Rick,

    I was sorely tempted to delete your post, since it definitely goes over my line between arguing the facts and personal attacks. However, since I was poking a bit of fun at you in the post I’m going to let it slide.

    As for the substance of your post, such as it is, well, let’s take things one at a time.

    First, I never said or implied that fidelity and convenience were antonyms. My assertion that RTF was not a full fidelity format is well documented. Some versions of Microsoft Word warn you that “some data may be lost” (or some such, don’t remember the exact warning) when saving to RTF. In Wikipedia’s entry for MS Word, RTF is described (my emphasis added) as:

    Rich Text Format (RTF) was an early effort to create a format for interchanging formatted text between applications. RTF remains an optional format for Word that retains most formatting and all content of the original document

    “most formatting” != “full fidelity”. Enough said.

    Second, for the “good enough” thing, it wasn’t a literal quote. Quotation marks can sometimes be used in such a way. Sorry you were confused. The reason why I wrote the good enough comment was because of the justification used: using a crappy format like RTF for interchange because it was “convenient” for Microsoft programmers.

    Third, on ODF not being able to express data that RTF can support, well all I can say is that *you* should do a little more research before spouting off. Like OOXML, ODF has a very well defined extension mechanism and a zip-based packaging format (the latter of which Microsoft copied pretty blatantly). As such, if there was data in RTF (or .doc for that matter) which couldn’t be expressed correctly in ODF, it would have been quite simple for Microsoft to define extensions to ODF in their own namespaces which could be used to represent anything else that might have been needed, and to do so in a way that would have allowed other ODF clients to work with those documents. In fact, much of the criticism of OOXML is that Microsoft decided to reinvent the wheel instead of building on an internationally recognised standard.

    As for your last point, well, I’d like to respond in kind but I don’t want to sink to that level. Have a nice day.

  8. One more update: rereading your comments and my original post, I can see how the way I linked the “good enough” text to the original Microsoft announcement could lead some to believe it was a direct quote, although I think that the fact that the text was immediately followed by an actual quote mitigates that. Again, sorry if you or anyone else got confused by that.

  9. ComputerWorld, “Inside story: How Microsoft & Massachusetts played hardball over open standards,” pg. 2.

    But [former Secretary of Finance & Administration Eric] Kriss insisted that the ODF policy wasn’t intended to be anti-Microsoft. He said technical people at Microsoft told him it would be “trivial” to add support for ODF to the new Office 2007. The resistance to doing so came from the vendor’s business side, according to Kriss.

    http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=273815&pageNumber=2

  10. Rick Schaut said, “And, no, ODF is, most definitely, not as “functional” as RTF, for a number of reasons, not the least of which is the fact that RTF is capable of expressing data that ODF is not.”

    That is highly misleading. Yes, the RTF used in Word’s native file support API for mapping purposes supports a very rich feature set that goes way beyond the RTF shared among supporting applications. It was difficult to reverse engineer. But that is not the version under discussion. What people are talking about is the version used for sharing data between applications.

    First, that RTF is a feature-crippled format for exchange of simple documents between word processors. ODF includes not only word processing support but also spreadsheet, charts, and presentations formats.

    Second, the ODF word processing format (ODT) has a far richer feature set than RTF. In fact, the only development team working on plug-ins for Microsoft Office that adds native ODF file support to Office — the OpenDocument Foundation’s team — has learned that nearly all of the mapping difficulties are in the other direction, except for citations, i.e., Word’s feature set is leaner than that supported by the ODF word processing format.

    The plain truth is that Word’s page layout engine is ancient and supports a far leaner feature set than the more recent StarOffice/OpenOffice.org page layout engine, whose feature set provided the foundation for ODF. In fact, Word’s page layout engine is so ancient that Microsoft’s knowledge base confirms that bugs that were there in Word 6 are still there. We’re talking about a 16-bit page layout engine here, covered so deeply by spaghetti code that it’s becoming increasingly difficult for Microsoft to implement new features using internal processes. So deeply buried in spaghetti code that even data loss bugs have gone unrepaired for more than a decade. E.g., check the bug reports for tables that span a page break.

    Third, we are discussing an eXtensible schema. There is no barrier to extending the ODF specification to support features Microsoft wants to add or to support citations. In fact, with the RDF support nearly completed for ODF 1.2, citations support will be a snap.

    Fourth, even if ODF were a less featureful word processing format, that does not in any way justify Microsoft reinventing the wheel for the functionality ODF and MOOX share. As Tim Bray, one of the co-chairs of the technical committee that developed XML 1.0, said a couple of years ago in regard to ODF and the Office 2003 XML schemas:

    “The ideal outcome would be a common shared office-XML dialect for the basics—and it should be ODF (or a subset), since that’s been designed and debugged—then another extended vocabulary to support Microsoft features, whether they’re cool new whizzy features or mouldy old legacy features (XML Namespaces are designed to support exactly this kind of thing). That way, if you stayed with the basic stuff you’d never need to worry about software lock-in; the difference between portable and proprietary would be crystal-clear. And, for the basic stuff that everybody uses, there’d be only one set of tags.

    “This outcome is technically feasible. Who could possibly be against it?” http://www.tbray.org/ongoing/When/200x/2005/11/27/Office-XML

    About the same time, Microsoft’s Alan Yates addressed the same issue:

    “I would say, in the future, some time, you know, at some point, there will be convergence. Convergence does happen over a period of time. Or there will be incorporation, there will be subsetting, supersetting. You know, the wireless standard, the A version merged into the B version, merged into the G version over a period of time to give better performance and functionality over a period of time.

    “So, good news, I think, on that front is that this problem will be solved in time. It is not an easy, sort of snap-your-fingers sort of problem.”

  11. Andy,

    First, at no point did I attack your character. There’s a difference between an ad hominem attack and taking people to task for talking out of the side of their mouths. Indeed, the tone and tenor of my remarks didn’t differ, in any substantive way, from the tone and tenor of your original post, the entire substance of which is to suggest that we would content ourselves with less than full fidelity for our users with respect to our file format converters. How does that not impugn the character of everyone on the Mac Word development team?

    The internet is littered with the facile opinions of the woefully uninformed, and, frankly, I think you’re too smart to just be another voice in that cacophony.

    Yet, on this subject, there is little doubt that you’ve been woefully uninformed, and your specific citation of Wikipedia on the subject only serves to confirm that hypothesis. How did your research miss the fact that RTF is a Microsoft-published standard? How is it not patently obvious that we can, at any time, augment that standard for the needs of our customers in the context of file format converters?

    There is plenty of documentation on the internet about the purpose, capabilities and history of RTF. You really have no excuse for not having done your homework.

    As for ODF extensibility, there’s a difference between “unspecified” and “underspecified” behavior. Are you not aware of that difference, and how that difference applies specifically to important data in Word documents? I’d give you some internet search terms that would be useful, but I think you’re smart enough to figure them out.

    Marbux,

    While you’ve come much closer to the truth than Andy has, you still miss the mark. In particular the notion that one needs to reverse engineer the RTF format is quite fascinating given the fact that a new RTF spec has been made available for download with every release of Word for Windows since RTF was first made public. Go to msdn.microsoft.com and search for “RTF specification.” It’s not all that hard to find.

    However, I’m even more fascinated by your statement that the RTF used in Word’s converter APIs is “not the version under discussion,” given the fact that what prompted this discussion in the first place was the release of a Word file format converter! I’m sorry. When did the subject change?

    Your “facts” about Word’s layout engine are also incorrect, but I think those issues are outside the scope of this discussion. If you want a detailed run-down, feel free to contact me via the link on my blog’s home page.

    Lastly, your discussion about “reinventing the wheel”, while extending the argument to a religious discussion about Microsoft’s intent with the ECMA file formats, also goes beyond the scope I wish to address. While that might be an interesting discussion once legitimate facts are brought to light, it has little to do with any inference that can be reasonably drawn based on Geoff Price’s comments with respect to our use of RTF in the Mac Word converter.

  12. Rick,

    Your arguments over what is/is not a personal attack are confusing. First you say that you didn’t attack my character, you are simply taking me to task for talking out the side of my mouth. Then you say your tone didn’t differ from mine. Then you say that I attacked the character of the entire Mac Word development team. This is a very peculiar sort of logic, I must say.

    I referenced Wikipedia quite intentionally, knowing full well the issues around doing so. I did so because the original macmojo announcement also linked to Wikipedia, and I thought the juxtaposition would be amusing. That said, the issue here remains: my understanding is that I cannot today take any arbitrary .docx file (which might include complicated content such as custom XML schemas, etc.), convert it to RTF with your new converter, open it with Word 2004 Mac and save as .doc, then open that .doc with Word 2007 and have it retain all the features of the original .docx. Am I wrong? If so, I’ll happily withdraw my objections on this issue, but if not, then all the stuff about RTF being something Microsoft could augment “at any time” is merely a diversionary tactic. But I’m pretty sure I’m not wrong: the converter’s readme was chock full of known issues with the conversion process, so I don’t believe it could pass my test.

    Even if I am wrong on the RTF thing, though, it still doesn’t change the larger issue w.r.t. full OOXML support in Mac Office 2008. Last December, you gave estimates of how long it would take to add that support for OOXML to Mac Word and said that only supporting OOXML would have been a smaller and easier task. Now those original estimates are out the window and as a result your customers (like me) aren’t going to get full OOXML support when Mac Office 2008 ships. Does that make you reconsider your decisions from way back when you started the port? And why shouldn’t this schedule miss be considered at least a partial indictment of OOXML as a difficult to implement format? That is the real issue at hand here…

  13. Andy,

    If you find my logic about personal attacks confusing, consider how Adobe developers would react if I posted something that implied they were willing to sacrifice user data for the sake of convenience. If you still find the logic confusing, then I’m afraid there’s little chance for any well-reasoned discussion between the two of us.

    Second, your statement WRT “any” arbitrary Word 2007 file is irrelevant as far as file formats are concerned. If your file includes data for features not implemented in 2004, your data will certainly be lost, but that has nothing to do with any intermediate file formats and everything to do with the fact that Mac Word 2004 is not the same as Win Word 2007. We choose to implement a different set of features for Mac Office, because our customers actually have a different set of needs.

    Your comment on the beta read me file barely warrants mention. Do you always take the “known issues” in the read me file for a beta to reflect the final version of a piece of software? If you do, I can only say that you are the first person I’ve met who does.

    As for your last paragraph, I’m afraid you have me completely confused. You claim that Mac Office 2008 won’t offer “full OOXML support” when it ships, but I have no idea what you use as the basis for that claim or how that relates to the difficulty of developing a conforming implementation of the ECMA standards. As far as I can see, the only reason to conflate the distinction between 100% feature parity and conforming to the ECMA standard is because doing so suits the rhetorical end you wish to achieve.

    When you use such rhetorical techniques, you might well construct an internally coherent argument, but you utterly fail to construct a compelling argument. When you equate 100% feature parity with conforming to the ECMA standard, your argument is reduced to saying that the ECMA standard shouldn’t be an ISO standard because, well, Office just has too darn many features. Is that really the argument you want to make, Andy?

    Lastly, as for any “schedule miss,” I suggest you hunt down John Welch’s comment, on http://www.bynkii.com, regarding the media coverage of our announcement. My favorite line is, “Now, face it, C|Net is a dingaling collective when it comes to Mac news, so blaming them for being wrong is like blaming a dog for farting. It stinks, but what can you do?”

  14. Sorry I haven’t posted a reply. My work schedule has been insane.

    The bynkii link was interesting, thanks. I hadn’t realized that the schedule changes had actually occurred back in December, and apparently the folks at Cnet and elsewhere were under the same impression. So I retract what I said about this being new news.

    But it still sucks for customers, of which I am one. I’m dealing with the pain of docx and xlsx and pptx files getting emailed to me, frequently. Converting .docx to .rtf to get to .doc is a pain. The story is even worse for xlsx and pptx. The fact that you won’t have this fixed for your next major release is lame. You can try and talk around this all you want but you know I’m right.

    And the delay still raises valid questions about the usefulness and implementability of OOXML.

    Here’s another take on the situation from Joe Wilcox, who is a highly credible source in my book.

  15. […] 2007 doesn’t implement it contrary to what the popular IT press reports, and it’s unlikely to be implemented in Office 2008 (at least on Macs). So even Microsoft doesn’t think […]
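
A footnote on the packaging and extension argument from comment 7, for anyone who wants to see what it looks like in practice. The Python sketch below is a toy, not a conforming ODF writer: it builds a stripped-down zip package in memory (a real .odt also needs META-INF/manifest.xml and more), tags one paragraph with an attribute in a made-up vendor namespace, and then reads the text back with a consumer that knows nothing about that namespace. The `urn:example:vendor-extension:1.0` namespace, the `legacy-hint` attribute, and the helper names `build_package` and `read_paragraphs` are all hypothetical, invented for illustration; the `office` and `text` namespace URIs are the real ODF ones.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Real ODF namespace URIs.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
# Hypothetical vendor namespace, made up for this sketch.
EXT_NS = "urn:example:vendor-extension:1.0"

CONTENT_XML = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}"
                         xmlns:text="{TEXT_NS}"
                         xmlns:ext="{EXT_NS}">
  <office:body>
    <office:text>
      <text:p ext:legacy-hint="keep-with-next">Hello from ODF</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def build_package() -> bytes:
    """Build a toy ODF-style zip package in memory (not a complete .odt)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ODF packages store the mimetype entry first and uncompressed.
        zf.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        zf.writestr(
            "content.xml",
            CONTENT_XML.encode("utf-8"),
            compress_type=zipfile.ZIP_DEFLATED,
        )
    return buf.getvalue()


def read_paragraphs(package: bytes):
    """Return (text, foreign_attributes) for each text:p in content.xml.

    The reader only understands the ODF text namespace; attributes from
    any other namespace are collected separately instead of being lost.
    """
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    known_prefix = f"{{{TEXT_NS}}}"
    result = []
    for p in root.iter(f"{known_prefix}p"):
        text = "".join(p.itertext())
        foreign = {
            name: value
            for name, value in p.attrib.items()
            if not name.startswith(known_prefix)
        }
        result.append((text, foreign))
    return result


if __name__ == "__main__":
    pkg = build_package()
    print(zipfile.ZipFile(io.BytesIO(pkg)).namelist())
    for text, foreign in read_paragraphs(pkg):
        print(text, foreign)
```

The point of the sketch is narrow: a reader that only knows the standard vocabulary still gets the paragraph text, while the vendor-specific data rides along in its own namespace. Whether any particular ODF application preserves such foreign data on save is a separate question that this toy does not answer.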

