Microsoft Word 2007 Research

Every time I write a post about Microsoft Word, I feel obliged to explain why I am still using Microsoft Word:

  • The concept of inserting custom XML elements directly into prose is not even recognized in OpenOffice.org.

  • The visual comfort of viewing type written by hand for hours at a time is an issue that OpenOffice.org (and others) struggles with… and I am not able to go along with that struggle. Microsoft has ClearType and Bill Hill.

  • The .NET programmability of Word 2007 allows me to treat prose like data. My work on CleanXHTML 1.2 is an attempt to realize this advantage. No one on this planet (including Microsoft) seems to popularize the importance of fluidly moving between loose prose and tightly defined data. This motion should have the same sexiness that dynamic programming languages seem to have with self-described geeks. On the Mac, there is a word processor called Scrivener that treats blocks of prose like DOM nodes. This is the sexiest example of what I am talking about that I can find at the moment.

Okay: my verdict for the famous Ribbon, featured in the “Fluent” user interface:

  • There is a Word MVP selling a Ribbon Customizer for 30 bucks. So for those wanting to make jokes out of all conspiracy theories, here is a new funny one: Microsoft, a multi-billion-dollar enterprise, conspired to release a new user interface that is not customizable so that one of their MVPs can make some money, 30 bucks a pop.

  • At MIX08 some dude from Microsoft spoke for about an hour about the “innovative” history of the Ribbon but failed (according to my attention span) to mention the obvious: you can’t customize the Ribbon! To neither apologize for this nor promise a fix in a future release is outrageous, like Steve Jobs outrageous.

Okay: my new understanding of the SDK for OpenXML:

  • No more shall the SDK for OpenXML be confused with the Word 2007 Content Control Toolkit.

  • The strong word “Part” used throughout the OpenXML API, starting with the abstraction OpenXmlPart, represents a location in the ZIP OpenXmlPackage. In the same manner that an XML document contains nodes (or a database store contains tables), an OpenXML Package contains parts. The OpenXmlPart, as of today, does not define anything (useful to me) inside of an XML file.

  • The SDK for OpenXML APIs do not define anything of significance inside of an XML file. This is the most important piece of information I need about this SDK. It follows that the OpenXmlPart.GetStream() and OpenXmlPart.FeedData() methods are the most important members of the entire SDK for OpenXML. So, for example, silly me would be looking for a class called “DocumentPart” (or “GlossaryDocumentEntry”) in the OpenXML API which would correspond to the <w:docPart> element. As of today these types do not exist.

I would like to be very much mistaken about this, but it seems that to actually do something useful (to me) with this API, I would have to call WordProcessingDocument.MainDocumentPart.GlossaryDocumentPart.GetStream() and modify raw XML to, say, programmatically load AutoText (Quick Part) entries. This sucks. I can do it, but it sucks. (And no, MVPs, you can’t just run a macro for Quick Parts.)
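To make the package-and-parts idea concrete, here is a minimal sketch in Python that does not use the SDK for OpenXML at all. It only illustrates the two claims above: a package is a ZIP archive whose entries are parts, and anything inside a part is raw XML you have to walk yourself. The part path word/glossary/document.xml is an assumption based on where Word conventionally stores the glossary document, the sample package is made-up data (not a valid .docx), and the helper names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in <w:docPart>).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumed conventional location of the glossary part inside the package.
GLOSSARY_PART = "word/glossary/document.xml"


def build_sample_package():
    """Build a tiny stand-in package in memory (sample data, not a valid .docx)."""
    glossary_xml = (
        f'<w:glossaryDocument xmlns:w="{W_NS}"><w:docParts>'
        '<w:docPart><w:docPartPr><w:name w:val="Signature"/></w:docPartPr>'
        '<w:docPartBody><w:p><w:r><w:t>rasx()</w:t></w:r></w:p></w:docPartBody></w:docPart>'
        '<w:docPart><w:docPartPr><w:name w:val="Disclaimer"/></w:docPartPr>'
        '<w:docPartBody><w:p><w:r><w:t>Sample text</w:t></w:r></w:p></w:docPartBody></w:docPart>'
        '</w:docParts></w:glossaryDocument>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(GLOSSARY_PART, glossary_xml)
        package.writestr("word/document.xml", f'<w:document xmlns:w="{W_NS}"/>')
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the part names: these are simply the entries of the ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(package.namelist())


def doc_part_names(package_bytes):
    """Read the glossary part as raw XML and collect each <w:docPart> name."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read(GLOSSARY_PART))
    names = []
    # <w:name w:val="..."/> sits under <w:docPartPr> in each <w:docPart>.
    for name_element in root.iter(f"{{{W_NS}}}name"):
        names.append(name_element.get(f"{{{W_NS}}}val"))
    return names


if __name__ == "__main__":
    sample = build_sample_package()
    print(list_parts(sample))
    print(doc_part_names(sample))
```

Nothing here knows what a Quick Part is. The code finds <w:docPart> names only because I spelled out the element and namespace by hand, which is exactly the kind of raw XML work the missing typed classes would have spared me.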

Speaking of conspiracy theories, I get a distinct feeling that Microsoft, with its billions of dollars, will never provide a clean API wrapper around the raw XML, even though this must be irresistible to some young Redmond new hire just itching to play with LINQ. The sick profit motive here would be the idea that making high-fidelity content difficult to get out of Office documents keeps customers locked into Office products. (Yes, Brian Jones, I am still saying shit like this even though the word “open” is in OpenXML.) Is some MVP out there going to charge US$499 for LINQ to OpenXML?

I thought it would be cute to do a Google search with the keywords WordProcessingDocument and GetStream. Barely 40 documents in the entire wired universe showed up, all of them irrelevant. It’s lonely at the top.

rasx()