Data Destruction in the Cloud: It’s Complicated

Paul Gillin

As organizations move more data to the cloud, some are asking a new question about the data lifecycle management process: How do you reliably perform data destruction on storage devices you don’t own?

This question is particularly pressing for organizations in regulated industries, many of which have rules requiring verified destruction of old or unneeded data. New pan-European privacy regulation that establishes a “right to be forgotten” has raised the stakes even more.

The Challenges of Data Destruction in the Cloud

The procedures for expunging data from on-premises equipment are mature and well-documented, ranging from magnetic erasure to physical destruction of media. But customers don’t own the media that holds their cloud data, at least not by default. What’s more, that data may move around a lot. Cloud providers frequently copy, replicate and back up data for the sake of protection or to maximize their own operational efficiency.

It’s tricky to keep track of where data resides in the cloud and how many copies exist. Even cloud providers who pledge to put deleted data through a virtual shredder may have copies in backups or logs. This “zombie” data may live for years without detection until a breach or unintentional disclosure reveals its existence. There have been several reported cases of files that users thought they had deleted years ago reappearing in their accounts, according to ZDNet.

To make things more complex, some cloud storage providers offer automatic replication as a convenience, storing data both in the cloud and on whatever personal devices the user specifies. That means a user with a desktop PC, laptop and smartphone may end up with three downloaded copies of the data in addition to whatever is in the cloud. Tracking data in such a scenario becomes next to impossible.

Each cloud provider has its own rules for data destruction, but finding them can be a challenge. Even then, no vendor is going to accept legal liability for its customers’ data.

Toward a Solution

There is no simple solution to the data destruction conundrum. The easiest fix is to specify that the customer owns the storage media and is responsible for its ultimate disposition. However, this fix is also the most expensive and inconvenient solution.

There are contractual remedies in the form of service-level agreements (SLAs) that specify where data must be kept and/or limit the cloud provider’s latitude to copy or move it. SLAs may also define such factors as how data is deleted, how deletion is verified and who is notified. Such agreements take time and legal resources to negotiate, however, and verifying compliance can be difficult.

Some experts say the simplest and least expensive option is to encrypt all data stored in the cloud. In that scenario, data is never actually deleted, but destroying the encryption key renders it useless. While this solution may sound easy, the devil is in the details. For example, if only selected records need to be destroyed, they must first be decrypted and re-encrypted with a different key that can then be destroyed on its own, a task that will likely be left up to the customer.
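To make the key-destruction idea concrete, here is a minimal Python sketch. It rests on several assumptions that are not in the article: the customer holds the keys in a hypothetical in-memory KeyStore, every record gets its own key, and a toy stream cipher built from HMAC-SHA256 stands in for real encryption so the example runs with only the standard library. A production system would use a vetted authenticated cipher and a managed key service instead.

```python
import hashlib
import hmac
import secrets


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher (HMAC-SHA256 in counter mode). Illustration only:
    # it provides no integrity protection and is not a vetted design.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                           hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyStore:
    """Hypothetical customer-held key store with one key per record."""

    def __init__(self):
        self._keys = {}

    def create_key(self, record_id: str) -> bytes:
        key = secrets.token_bytes(32)
        self._keys[record_id] = key
        return key

    def get_key(self, record_id: str) -> bytes:
        # Raises KeyError once the key has been destroyed.
        return self._keys[record_id]

    def destroy_key(self, record_id: str) -> None:
        # "Shredding": without the key, the stored ciphertext is unreadable.
        self._keys.pop(record_id, None)


def encrypt_record(store: KeyStore, record_id: str, plaintext: bytes) -> bytes:
    key = store.create_key(record_id)
    nonce = secrets.token_bytes(16)
    # The nonce is stored alongside the ciphertext; only the key is secret.
    return nonce + _xor_stream(key, nonce, plaintext)


def decrypt_record(store: KeyStore, record_id: str, blob: bytes) -> bytes:
    key = store.get_key(record_id)
    nonce, ciphertext = blob[:16], blob[16:]
    return _xor_stream(key, nonce, ciphertext)
```

In this sketch, calling store.destroy_key("a") leaves every copy of record "a" in the cloud unreadable, while records encrypted under other keys remain usable. Giving each record its own key from the start is one way to avoid the later decrypt-and-re-encrypt step, at the cost of having many more keys to manage.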

Regardless of what data destruction technique you use, the process starts with good governance. Data that is critical or marked for a scheduled disposal should be tagged as such using a taxonomy that both the customer and the cloud provider agree upon. Access to that data should be limited to a defined list of people. Employees should not be permitted to make copies of sensitive data. If an agreement provides for verified destruction, multiple witnesses should be designated to confirm that the process is completed.
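The governance steps above can be sketched in a few lines of Python. Everything named here is hypothetical and chosen only for illustration: the DataTag fields, the taxonomy label, and the default of two witnesses are assumptions, not part of any provider's API or any standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class DataTag:
    record_id: str
    classification: str            # label from the agreed taxonomy (assumed values)
    dispose_on: Optional[date]     # scheduled disposal date, or None if unscheduled
    allowed_users: frozenset = frozenset()


def can_access(tag: DataTag, user: str) -> bool:
    # Access is limited to the defined list of people on the tag.
    return user in tag.allowed_users


def due_for_disposal(tags: Iterable[DataTag], today: date) -> List[DataTag]:
    # Records whose scheduled disposal date has arrived or passed.
    return [t for t in tags if t.dispose_on is not None and t.dispose_on <= today]


def destruction_verified(designated: Iterable[str],
                         confirmations: Iterable[str],
                         required: int = 2) -> bool:
    # Only confirmations from designated witnesses count, and each counts once.
    return len(set(designated) & set(confirmations)) >= required
```

A customer and provider could run a check like due_for_disposal on a schedule, then record the destruction as complete only once destruction_verified returns True for the designated witnesses.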

Organizations that want an extra degree of protection should simply keep their most sensitive data onsite. The cloud storage market may be growing rapidly, but it still has some maturing to do.

