Forensic analysis of Microsoft Word documents

Microsoft Word documents are stored in a proprietary binary file format that records additional information, known as metadata, beyond the text of the document. Some of the information contained in the documents that you create and distribute may be embarrassing or private in nature, and it has shown up in several news stories, much to the sources' embarrassment. A forensic analysis of these documents can recover this metadata. There are also several easy-to-use tools that discover and clean metadata from Microsoft Word documents.

As several news stories highlight, sharing Word documents with others may reveal more than you bargained for, such as:

  • Your name
  • Your initials
  • Your company name
  • Your computer name
  • The name of the server where you saved the file
  • File properties and summary information
  • Names of previous authors
  • Document revisions
  • Template information
  • Hidden or deleted text
  • Editing comments

Knowing that this information may be in a document is fine, but seeing is believing. Let’s create a test Microsoft Word document.

Using EnCase, a commercial forensic analysis program available from Guidance Software, it is possible to see just how messy Word documents are. Notice below that editing the document created three files. One, ~$tadata document.doc, is the deleted backup file that is created while a file is being edited. It stores the previous version, which, as highlighted below, was empty.
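For readers without EnCase, a rough way to spot the same kind of companion file is to look for names that start with ~$, as in the example above. The following is only a minimal sketch: the function names word_side_files and scan_directory are made up for illustration, the ~$ prefix is taken from the file name shown above, and the scan only sees files that still exist in the directory. Unlike a forensic tool, it cannot recover entries that have already been deleted.

```python
from pathlib import Path


def word_side_files(names, prefixes=("~$",)):
    """Return, sorted, the names that start with one of the given prefixes.

    The default prefix "~$" matches the companion file seen in the article
    ("~$tadata document.doc"); other prefixes are the caller's assumption.
    """
    return sorted(n for n in names if n.startswith(tuple(prefixes)))


def scan_directory(directory, prefixes=("~$",)):
    """List companion files that are still present in a directory."""
    names = [p.name for p in Path(directory).iterdir() if p.is_file()]
    return word_side_files(names, prefixes)


if __name__ == "__main__":
    # Hypothetical usage: scan the current working directory.
    for name in scan_directory("."):
        print(name)
```

Keeping the name filter separate from the directory scan makes it easy to run the same check against a file listing exported from another tool.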

Now let’s edit the document and add some text to discover.

Forensic analysis of the end of the file clearly shows that I was the user who edited the document, that the template I used was Normal.dot, and that I was using Microsoft Word 10.
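The same kind of inspection can be approximated without a forensic suite by pulling printable strings out of the last few kilobytes of the file. The sketch below is not how EnCase works internally; it is a simple strings-style scan, and it assumes that the values of interest appear as plain ASCII or UTF-16LE text. The function names, the 4096-byte window, and the four-character minimum are arbitrary choices made for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE strings found in data, in file order."""
    # Runs of printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters, each followed by a zero byte (UTF-16LE).
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [(m.start(), m.group().decode("ascii")) for m in ascii_pat.finditer(data)]
    found += [(m.start(), m.group().decode("utf-16-le")) for m in utf16_pat.finditer(data)]
    return [text for _, text in sorted(found)]


def tail_strings(path: str, tail_bytes: int = 4096, min_len: int = 4) -> list[str]:
    """Extract strings from the last tail_bytes bytes of the file at path."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)                       # jump to the end to learn the size
        size = fh.tell()
        fh.seek(max(0, size - tail_bytes))  # back up, but not past the start
        return extract_strings(fh.read(), min_len)
```

Calling tail_strings on a test document and reading through the result should surface values such as the user name, the template name, and the application name if they are stored as plain text near the end of the file. Expect a fair amount of noise alongside them.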

While a commercial forensic product like EnCase can show the raw metadata, it is much easier to use one of several commercially available products that can show and even remove metadata from Word documents. One such product is Metadata Assistant from Payne Consulting Group.

Using this program, anyone can easily discover and clean the metadata from Word and other Microsoft Office documents. Simply start the program, select the document to analyze, and click analyze.

The program will display all the hidden metadata in the document.

If this is a document you are sending to others, a single click on clean saves a metadata-free version of the document.

1 comment:

Listen to Me said...

Hello Michael,

I found your article on forensic analysis of Microsoft documents to be very useful. A friend of mine recommended that I check your website, as I am currently investigating potential malpractice by a company. In summary, the Word document which was sent to us could not be accepted as it was submitted a day after the due date (the due date was 22/10/2009 and they sent it to us on 23/10/2009).

However, the letter itself was dated 19/10/2009 to fit the deadline. This could amount to fraud, and using Microsoft Word we could see that the document was last modified on 23/10/2009. We suspect that the document was also created on the 23rd, not the 19th as stated on the letter.

I wondered if you could direct me to any free software that could trace when such a Word document was created?

I would be grateful if you could help - we are pretty desperate to be honest! Look forward to receiving your reply at robert.smyth99@yahoo.com.

Cheers & thanks.

Rob.