Metadata is getting a lot of press lately, especially among companies that are wrestling with the new electronic discovery standards issued by the US Supreme Court. But what is it really?

Technically, metadata is data about other data. If the customer’s address is data, the number of entries in your address book is metadata. If the body of a Word document is data, the date you last opened the file is metadata. If the values in an Excel spreadsheet are data, the formulas in each cell are metadata.

From a legal point of view, metadata is everything about the document that is not immediately visible when the document is printed. It includes all the MS Office "properties" such as file size, author and character count. It also includes hidden content, such as the old versions that stay buried in the document when you leave the Track Changes option on. And it includes formulas in spreadsheets and formatting commands like the print area.
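To make the idea concrete, here is a minimal Python sketch that pulls a few of these hidden items out of a Word file. It assumes the newer .docx format, which is a zip archive that stores the document properties in docProps/core.xml and the body, including tracked deletions, in word/document.xml. The function name docx_hidden_info is made up for this illustration, and older binary .doc files would need a different approach.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used inside .docx files (Office Open XML).
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def docx_hidden_info(source):
    """Return the author, last editor, and tracked-deletion text of a .docx.

    `source` is a path or a binary file-like object. Hypothetical helper:
    it assumes both XML parts are present in the archive.
    """
    with zipfile.ZipFile(source) as archive:
        core = ET.fromstring(archive.read("docProps/core.xml"))
        body = ET.fromstring(archive.read("word/document.xml"))

    # Each tracked deletion is a <w:del> element; its text sits in <w:delText>.
    deleted = [
        "".join(t.text or "" for t in d.iterfind(".//w:delText", NS))
        for d in body.iterfind(".//w:del", NS)
    ]
    return {
        "author": core.findtext("dc:creator", default="", namespaces=NS),
        "last_modified_by": core.findtext(
            "cp:lastModifiedBy", default="", namespaces=NS
        ),
        "deleted_text": deleted,
    }
```

Calling docx_hidden_info("report.docx") on a file saved with Track Changes on would list the text that was deleted but never actually removed, which is exactly what a printed copy hides.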

For most everyday purposes, the metadata about a document is just background. We take it for granted and almost always ignore it. But if your metadata reveals facts that you wanted to keep private, it can be embarrassing and expensive. In one case, a major pharmaceutical company deleted some study data from a report, and got caught when the New England Journal of Medicine used Track Changes to recover the deleted material. In another case, a confidential White House policy paper about Iraq was exposed when a quick command revealed the report’s author. In yet another case, officials covered up classified information with black bars, not realizing that readers could easily recover the text by copying it from under the black bars and pasting it elsewhere.

In a legal dispute, metadata becomes even more important. It is used to show “who knew it and when they knew it”, which provides the context around the document in question. Metadata can either clear you or convict you. Because of its importance, metadata must be preserved unaltered when you are collecting documents that will be used in court. This is hard because routine Windows operations, even just opening a file, can change its metadata. Make sure that you have the tools you need to keep metadata intact before a lawsuit begins.
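One small piece of that toolkit can be sketched in Python: record a file's size, timestamps and a fingerprint before anyone works with it, so later changes can be detected. This is only an illustration, not a substitute for proper forensic collection tools. The function name snapshot is made up for this example, and on some systems even a read-only pass can update the "last accessed" time, which is why the timestamps are captured first.

```python
import hashlib
import os
from datetime import datetime, timezone


def snapshot(path):
    """Record a file's size, timestamps, and SHA-256 digest (hypothetical helper)."""
    # Read the file system metadata first, before the file itself is opened.
    info = os.stat(path)

    digest = hashlib.sha256()
    with open(path, "rb") as handle:  # read-only, so the content is not changed
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)

    def iso(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "path": os.path.abspath(path),
        "size_bytes": info.st_size,
        "modified_utc": iso(info.st_mtime),
        "accessed_utc": iso(info.st_atime),
        "sha256": digest.hexdigest(),
    }
```

If a later snapshot of the same file shows a different digest or modified time, something changed the file after it was collected.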

And, of course, be very careful before you post a document publicly. Make sure you clean out the metadata that you don’t want public.
