Apple stores your voice data for two years

The iPhone and iPad maker holds on to the data from Siri and Dictation for two years, so long as it abides by its own privacy policy — which, as you might expect, is fairly vague.
Written by Zack Whittaker, Contributor

Apple disclosed today that it stores the data created when people use Siri and Dictation, two voice-driven services found on its mobile devices, for two years.

The disclosure comes after one civil liberties group warned that Apple isn't doing enough to inform its customers of their privacy rights.

Siri, Dictation: Same thing, different platform
Image: Josh Lowensohn/CNET

The Cupertino, Calif.-based company's privacy policies on its Siri and Dictation services do not disclose exactly how the technology works or for how long the company stores customer data. Both services were first made available to the public as standard features in October 2011.

Some companies, such as IBM, banned use of the services in their workplaces because they could not guarantee the security of their data.

This morning, Wired's Robert McMillan lifted the lid on how long Apple stores the data: up to two years, according to Apple spokeswoman Trudy Muller.

Siri (found on iOS devices) and Dictation (found on both iOS and OS X devices) take voice-input data and send it, over the air, to Apple servers. A random number is generated to anonymize the user, and the data excludes a user's email address, phone number and Apple ID. 

After six months, the data is "disassociated" from that random number. Apple then uses it to "generally improve [Siri/Dictation] and other Apple products and services." The company says the data may include "related diagnostic data, such as hardware and operating system specifications and performance statistics."

These disassociated files are held for up to a further 18 months, which brings the total retention period to as much as two years.

Speaking to Wired, Apple said that turning off Siri immediately deletes the random number identifier and "any associated data."

"Dictation is turned on as part of Siri," according to Apple support documentation. Both Siri and Dictation fall under the same service-level agreement and thus the same privacy policy.

However, if Siri and Dictation remain enabled, any voice-input data and corresponding personal information will be retained for up to two years from the time it was first entered into your compatible iOS or OS X device.
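
The retention schedule described above can be sketched as a small function. This is only an illustration of the arithmetic as reported (six months tied to a random identifier, then up to 18 more months disassociated). The function name, the stage labels and the month-based granularity are assumptions made for the example, not anything Apple has published.

```python
# Hypothetical sketch of the reported retention timeline; not Apple's code.
ASSOCIATED_MONTHS = 6       # data stays linked to the random identifier
DISASSOCIATED_MONTHS = 18   # further period after the link is removed
MAX_RETENTION_MONTHS = ASSOCIATED_MONTHS + DISASSOCIATED_MONTHS  # 24 months


def retention_stage(months_since_recording: float, siri_enabled: bool = True) -> str:
    """Return an illustrative label for where a recording sits in the schedule.

    The reported periods are upper bounds ("up to"), so this shows the
    longest case rather than a guaranteed one.
    """
    if months_since_recording < 0:
        raise ValueError("months_since_recording must be non-negative")
    if months_since_recording >= MAX_RETENTION_MONTHS:
        return "deleted"
    if months_since_recording >= ASSOCIATED_MONTHS:
        # Already disassociated, so it may be kept even if Siri is turned off.
        return "disassociated"
    # Within six months: turning Siri off deletes the identifier and its data.
    return "linked to random identifier" if siri_enabled else "deleted"


if __name__ == "__main__":
    for months in (3, 12, 24):
        print(months, retention_stage(months))
```

The sketch makes one point from the policy visible: once a recording has passed the six-month mark, switching Siri off no longer changes what happens to it.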

Google takes a similar approach. The search giant offers similar voice services, and the company anonymizes data after two years to use it to improve its speech recognition service. Google says it has "no way" of telling who spoke a particular query.

But the devil is in the details. 

Apple, its partners and its peers may not be able to tell who spoke a particular query after a set amount of time (Google says it can never tell; Apple can for up to six months), but the contents of that voice data still remain on their servers.

Such content could range from an innocuous Siri request such as "What was the score of that game last night?" to a sensitive, legally regulated dictation that reveals the precise time a company plans to file for an initial public offering. It is the latter scenario that is of greatest concern.

For this reason, IBM last year banned Siri on its corporate network. Chief information officer Jeanette Horan said the computing giant is "extraordinarily conservative" about computer security, and suggested that "spoken queries might be stored somewhere." 

Nicole Ozer, the technology and civil liberties policy director at the ACLU of Northern California, told ZDNet in a phone conversation that Apple "may be storing confidential business information on its servers."

"Apple can be collecting personal information about who you are, who you know, where you go and what you do," she added. 

Siri's privacy policy: Clear as mud

Apple's Siri and Dictation privacy policy [PDF] explains that customer voice data is sent to the company for server-side conversion, and says that personal data may be recorded:

When you use Siri or Dictation, the things you say will be recorded and sent to Apple in order to convert what you say into text and to process your requests. 

Your device will also send Apple other information, such as your first name and nickname; the names, nicknames, and relationship with you (e.g., "my dad") of your address book contacts; and song names in your collection (collectively, your "User Data").

The term "User Data" is key: this is the content stored on your phone that can be physically read, including text-based details, contacts, notes, and calendar entries.

Though it is publicly available, it is not easy to find this privacy policy. It can only be found on devices that support the services, and the online version does not mention anything about data deletion, storage duration or any other elements about which the ACLU is concerned.

"A broader issue is right now there's no link to Siri's privacy policy from Apple's website, and it's really important for Apple to make its policies clear for those who not only own a compatible device but those who are purchasing a device with Siri preinstalled," Ozer said.

The only way to access Siri's privacy policy is directly on the iPhone 4S, iPhone 5, and certain iPad models.

Here's a snippet from what it says:

If you turn off [Siri/Dictation], Apple will delete your User Data, as well as your recent voice input data. Older voice input data that has been disassociated from you may be retained for a period of time to generally improve [Siri/Dictation] and other Apple products and services. This voice input data may include audio files and transcripts of what you said and related diagnostic data, such as hardware and operating system specifications and performance statistics.

(Because the two policies are otherwise identical, [Siri/Dictation] are interchangeable.)

Information collected by Apple will be "treated in accordance with Apple's Privacy Policy," the company says.

That privacy policy states:

Apple makes it easy for you to keep your personal information accurate, complete, and up to date. We will retain your personal information for the period necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.

There is also the following nugget to consider, which comes at a time when California is mulling a "right to know" data law that would go above and beyond what EU citizens have. It has yet to be implemented, but other states may follow suit.


For other personal information, we make good faith efforts to provide you with access so you can request that we correct the data if it is inaccurate or delete the data if Apple is not required to retain it by law or for legitimate business purposes. We may decline to process requests [...] for which access is not otherwise required by local law.

According to Ozer: "In the U.S. there isn't a comprehensive privacy law that controls what companies can do with the data that they collect. Apple is only required to do what it says in its privacy policy."

"Its privacy policy requires that if somebody uses Siri then they agree that their voice data and user data will be collected — even after they turn it off, their older voice input data can be used to 'improve' Apple and Siri services."

Can Apple hold onto this data for as long as it wants?

In most major jurisdictions, including the U.S. and the EU, Apple can. It just chooses not to.

While the U.S. doesn't have data retention laws, the EU does — albeit controversially. But EU law doesn't apply in this instance. 

According to a European Commission spokesperson, the EU Data Retention Directive does not apply to Apple. Vague as it may be, Apple, Google and their peers are not classified by EU authorities as "telecommunication service providers or operators."

ISPs and telecom firms must hold onto communications data for a period of six months to two years under the EU Data Retention Directive. This allows governments from around the world to request access to that data, including IP addresses, time and date of emails, phone calls, and text messages, so long as a court requests it.

The EU forces companies to hold onto your data for a set amount of time, but it also dictates that the data must be destroyed after a certain amount of time or when it is no longer needed. This policy has been controversial, and there are ongoing disputes over its legality and over whether it has even been fully implemented in EU member state law.

Recently, EU regulators warned Google that it must clarify how long it stores user data for under its new merged privacy policy. The search giant was told to modify its privacy policy after regulators found that it may not be in compliance with EU law. Further, Microsoft, Yahoo, and Google were told by the European Commission's privacy group, the Article 29 Working Party, to limit how long they store identifiable information.

Apple may also be under this requirement. To date, the company has not been targeted because it has not been the subject of any complaints.

Bottom line

A lack of clarity and spotty availability of otherwise public information make it difficult for businesses and consumers to make informed purchasing decisions. There is little upside to this inconsistency, though for the enterprise the truth may be a bitter pill to swallow.
