Sunday, January 20, 2008

Office Open XML - A Quick Touch on the Office 2007 File Formats

By now, many of you may have started to become a little familiar with the new Microsoft Office formats that became a part of Office 2007. These are easily distinguished by their file extensions, which are now four letters long and end in an "x" if the file contains no macros or an "m" if it does. For example, up through Office 2003, Word documents would end in .doc. However, if you create a new non-macro document in Word 2007, it will default to an extension of .docx. Additionally, a new Word 2007 document containing macros will default to an extension of .docm.

Here are some of the updated formats in Office 2007:

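| Microsoft Office Product | Older Office Format | New Office 2007 Non-Macro Format | New Office 2007 Macro-Enabled Format | Other New Office 2007 Formats  |
|--------------------------|---------------------|----------------------------------|--------------------------------------|--------------------------------|
| Excel                    | *.xls               | *.xlsx                           | *.xlsm                               | *.xlsb, *.xltx, *.xltm, *.xlam |
| PowerPoint               | *.ppt               | *.pptx                           | *.pptm                               | *.potx, *.potm, *.ppsx, *.ppsm |
| Word                     | *.doc               | *.docx                           | *.docm                               | *.dotx, *.dotm                 |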

That's great, Jim… but, why should I care anything about these new formats?!

Here's why…

The new formats are XML-based. Unlike the earlier binary formats, they are openly documented and can be used by other companies and developers, which helps move things toward a more universal standard. Microsoft's standard is called Office Open XML (or OOXML), and Microsoft is not the only player in the game trying to get its document format adopted as the universal standard. Sun Microsystems is backing the OpenDocument Format (ODF) as well, and right now, it's anyone's game.

The new Microsoft XML formats offer a number of benefits over the prior formats Office used. Files can be up to 75% smaller than if they had been saved in the older formats. They are also less prone to corruption than the older files. Additionally, you can tell whether a new file has macros in it simply by its extension (e.g. *.docm versus *.docx), which lets you see real quick whether a file may contain unwanted or unknown code. More of the benefits can be found by looking at:
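To make the extension rule concrete, here is a minimal Python sketch that sorts a file name into "macro-enabled" or "macro-free" using the extensions from the list of formats earlier in this post. The function name macro_status and the two sets are my own illustration, not anything that ships with Office. Note that it only looks at the file name and never opens the file, and that *.xlsb is left out on purpose because it doesn't follow the x/m naming pattern.

```python
import os

# Extensions taken from the list of Office 2007 formats in this post.
MACRO_ENABLED = {".docm", ".dotm", ".xlsm", ".xltm", ".xlam",
                 ".pptm", ".potm", ".ppsm"}
MACRO_FREE = {".docx", ".dotx", ".xlsx", ".xltx",
              ".pptx", ".potx", ".ppsx"}


def macro_status(filename):
    """Classify a file by its extension only (hypothetical helper).

    Returns "macro-enabled", "macro-free", or "unknown" for anything
    that is not in the two sets above (older formats, *.xlsb, etc.).
    """
    # splitext keeps the leading dot; lower() makes the check case-insensitive.
    extension = os.path.splitext(filename)[1].lower()
    if extension in MACRO_ENABLED:
        return "macro-enabled"
    if extension in MACRO_FREE:
        return "macro-free"
    return "unknown"
```

For example, macro_status("My Document.docx") gives "macro-free", while macro_status("Budget.xlsm") gives "macro-enabled".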

Microsoft MSDN: Introducing the Office (2007) Open XML File Formats (http://msdn2.microsoft.com/en-us/library/aa338205.aspx)

A neat little trick is to know that the new formats are saved in a ZIP format. So if you rename a file like "My Document.docx" to "My Document.zip," you can actually browse it and see how the new file format works from within your ZIP program. It's pretty cool to look at and learn from, but try this on a copy rather than on a critical file, and don't modify any of the content inside it, or you can screw up the file.
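If you would rather not rename anything, the same peek inside can be sketched in a few lines of Python using the standard zipfile module. This is just an illustration under the assumption that the file really is one of the new ZIP-based formats; the function name list_package_parts is made up for this example. It opens the file read-only, so nothing inside it gets changed.

```python
import zipfile


def list_package_parts(path):
    """Return the names of the items stored inside an Office 2007 file.

    Hypothetical helper: the new formats are ordinary ZIP archives, so
    zipfile can read them directly without renaming the file to .zip.
    """
    # Mode "r" opens the archive read-only, so the document is not modified.
    with zipfile.ZipFile(path, "r") as package:
        return package.namelist()
```

Calling list_package_parts("My Document.docx") returns a list of the XML files and folders packed inside the document, which you can print one per line to see how the file is laid out.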

So, it sounds pretty cool, Jim – but are there any issues?

Well, if someone with Office 2007 saves a Word document in the new .docx format, for example, and sends it to someone with Office 2003, that person won't be able to open the .docx file unless they have installed the Microsoft Office Compatibility Pack. So, if you are running an earlier version of Microsoft Office, you should download the Microsoft Office Compatibility Pack, which will allow you to open files created in the new format. More information, as well as a link to download and install the compatibility pack, can be found by looking at:

Microsoft Article ID 924074: How to use earlier versions of Excel, PowerPoint, and Word to open and save files from 2007 Office programs (http://support.microsoft.com/kb/924074)

Hopefully, I've been able to help you understand the Office 2007 file formats as well as some of the benefits and "gotchas." The new open formats could lead to some dramatic changes in the way documents are handled by office suites all around, and I hope that you now understand the first giant step in that direction!

-- Jim White
MCSE, CCSP, CCEA, Server+, A+, and more!
www.booksbyjim.com
