While great strides have been made in securing company information in recent years, hidden document information is often overlooked. Every time a Word document is created or amended, invisible data recording the author, document changes, editing time and other document properties is added to it.
Microsoft Word’s collaboration features, such as comments and Track Changes, result in a significant amount of metadata being included in documents. Originally conceived to shed light on data, document metadata categorizes information to make it easier to track and find. When used properly, metadata is helpful. But when used carelessly, it makes it easy for other people to find out details about the document and other privileged information that could harm corporations.
Metadata is added automatically whenever a document is created. Some of this stored information may be confidential (e.g., previous versions, or changes that were accepted or rejected) and can expose corporations to hidden risks when the document is emailed to people outside the company. The problem is not that metadata is added to a document, but that it is often difficult to remove once it has been added. And because this type of information travels with the document every time it is emailed to others, sensitive or confidential information may be transmitted unknowingly.
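To make the point concrete, here is a minimal sketch of how the author, revision count and editing time can be read back out of a document. It assumes the newer .docx (Office Open XML) format, which is a zip package, rather than the binary .doc files produced by Word XP and 2003. The function name read_properties and the labels it returns are illustrative choices, not part of any Microsoft tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml inside a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
# Namespace of docProps/app.xml, which holds the total editing time.
EP = "{http://schemas.openxmlformats.org/officeDocument/2006/extended-properties}"

# Illustrative labels mapped to the elements that carry them.
CORE_FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_properties(docx):
    """Return the document properties found in a .docx (path or file object)."""
    found = {}
    with zipfile.ZipFile(docx) as package:
        names = set(package.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            for label, tag in CORE_FIELDS.items():
                element = core.find(tag, NS)
                if element is not None and element.text:
                    found[label] = element.text
        if "docProps/app.xml" in names:
            app = ET.fromstring(package.read("docProps/app.xml"))
            total = app.find(EP + "TotalTime")  # editing time in minutes
            if total is not None and total.text:
                found["editing_minutes"] = total.text
    return found
```

Calling read_properties("report.docx") on a hypothetical file returns a dictionary of whatever properties are present. Anyone who receives the document can run the same few lines.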
Within Microsoft Word, the ability to view Comments and Suggested Changes from other people can be useful when collaborating with several parties. In most cases, these collaboration features significantly enhance the user experience. However, changes that have not been accepted remain in the document even when they are not visible on screen. They can easily be displayed by turning on the “Show markup” view. This can result in embarrassing situations where external parties are able to see information that was not intended for their eyes.
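The following sketch shows how little effort it takes to recover tracked changes and comments without opening Word at all. It again assumes a .docx (Office Open XML) file rather than the older binary .doc format, and the function name find_hidden_markup is a hypothetical helper chosen for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in ElementTree's "{uri}tag" notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def find_hidden_markup(docx):
    """Count tracked changes and comments in a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as package:
        names = set(package.namelist())
        body = ET.fromstring(package.read("word/document.xml"))
        comments = None
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))

    # w:ins and w:del wrap tracked insertions and deletions;
    # deleted wording is kept in w:delText elements.
    return {
        "insertions": sum(1 for _ in body.iter(W + "ins")),
        "deletions": sum(1 for _ in body.iter(W + "del")),
        "deleted_text": [t.text for t in body.iter(W + "delText") if t.text],
        "comments": 0 if comments is None
        else sum(1 for _ in comments.iter(W + "comment")),
    }
```

A result with non-zero counts means the file still carries revision history or reviewer comments, and the deleted_text list shows wording that was supposedly removed.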
With the release of Microsoft Office XP and 2003, Microsoft added some metadata removal features. However, these still fall well short of a complete, automated solution and rely heavily on intervention by technically minded end users. For example, under the Security tab in the Options menu, Microsoft provides the ability to remove personal information from a file upon save and to warn users before printing, saving, or sending a file that contains tracked changes or comments. Additionally, users can turn off “fast saves” under the Save tab in the Options menu to ensure that deleted data is really deleted. The problem is that users must go to several different places within the application to remove different types of metadata. There is no central place to manage the different settings, and most users of Microsoft Word are not aware that these protection options are available to them.
This problem is further compounded when documents are attached from within Microsoft Outlook and sent to outside parties. Used as the de facto way to electronically exchange documents in an enterprise environment, Microsoft Outlook does not offer warnings about metadata in attached documents or zipped files. Thus, the potential to accidentally send Microsoft Word documents containing harmful metadata (and thus expose a corporation to the inadvertent disclosure of sensitive information) is amplified tremendously with every document that is sent back and forth in a collaboration.
Metadata is also saved with Microsoft Excel files, including spreadsheets used for financial reporting. A quick review of the Fortune 1000 Web sites shows that 33% of these Web sites contain Microsoft Excel documents publicly posted either directly on the company’s corporate Web site or linked to a third party site for SEC filings. When Excel documents containing potentially harmful metadata are posted accidentally, that metadata can easily be viewed by anyone who downloads them.
Because metadata is often invisible, document users can unwittingly send confidential information to people outside their organization. In fact, there have been several widely publicized, high profile cases in which document metadata proved to be the culprit.
Document metadata can get corporations in big trouble: it can put the organization at financial risk, leave it at a competitive disadvantage, and place it in an embarrassing situation with costly consequences.
Learn more about the different types and dangers of document metadata.
By: Architectures and Applications Division of the Systems and Network Attack Center (SNAC), NSA
There are a number of pitfalls for the person attempting to sanitize a Word document for release. This paper describes the issue, and gives a step-by-step description of how to do it with confidence that inappropriate material will not be released.
By: Dan Pennington, Director of Practice, PRO
Are you unwittingly sending confidential information to clients or opposing counsel? If you have e-mailed a Microsoft Word or Corel WordPerfect document to either, the answer to this question is likely “Yes”. Download “The Dangers of Document Metadata” to understand how you can take steps to eliminate metadata from your documents.
Scalable Exploitation of and Responses to Information Leakage Through Hidden Data in Published Documents
By: Simon Byers
In considering the leakage of information through hidden text mechanisms in commonly used information interchange formats we demonstrate how to automate and scale the search for hidden data in Word documents. The combination of this scaling with typical behavior patterns of Word users and the default settings of the Word program leads to an uncomfortable state of affairs for Word users concerned about information security. We discuss some countermeasures employable by users and note more general consequences of these effects.