This is great news. I’m going to have to reevaluate my use of DEVONthink again.
Full breakdown of what’s encrypted: iCloud data security overview - Apple Support
Really excited to hear this! I saw someone in the Verge comments with an idea that I like: it would be nice if Apple had an indicator for whoever you were messaging in iMessage if they were not using Advanced Data Protection.
Glad I didn’t end up buying it over Thanksgiving
Not available in my region, including Private Relay
Eventually, “Advanced Data Protection” should be available everywhere in 2023:
Users on Apple’s beta program in the US will be able to enable Advanced Data Protection beginning Wednesday, Apple says. It will be available broadly to US users by the end of the year and will begin rolling out globally — including in China, according to The Wall Street Journal — in early 2023.
Source: first post in this topic, The Verge
It is a remarkable decision Apple has made: the only way to really protect customers’ data is not to have the keys. Now it will be the customers’ decision whether they want to enable this protection or not (the downside is that the data is gone if you lose your “keys”).
It will be interesting to see how this plays out in the long run because it basically eliminates any way for authorities to get access to this data. My personal opinion is that this is the right choice. There are downsides, but I do not see any good alternative if we really want encryption. Still, it will be interesting to see how legislators all over the world (not only the usual suspects) will respond (I am referring to the issue that many legislators want encryption while wanting to make sure that for instance child abuse does not happen unnoticed - and for sure, there are also the usual suspects that have other intentions regarding data).
This system solves a sticky privacy issue, but opens others.
(The checksum of a file, also called a hash, can be thought of as the file’s digital DNA, or fingerprint. These are generated by various algorithms, and the chance of two different files having the same checksum is exceedingly small.)
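For anyone who wants to see the fingerprint idea concretely, here is a minimal Python sketch using the standard library’s hashlib. The two sample strings are made up for illustration; the point is only that identical bytes give identical checksums, while a one-character change gives a completely different one.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Two made-up inputs that differ only in the final character.
original = b"All human beings are born free and equal in dignity and rights."
edited = b"All human beings are born free and equal in dignity and rights!"

same_bytes_match = checksum(original) == checksum(original)
edited_bytes_match = checksum(original) == checksum(edited)

print(checksum(original))
print(checksum(edited))
print(same_bytes_match)    # identical input, identical checksum
print(edited_bytes_match)  # one byte changed, different checksum
```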
In August 2021, Apple announced its Expanded Protections for Children system that would have allowed scanning users’ devices and matching checksums of photos against a database of known CSAM material, notifying Apple of any matches. This raised privacy concerns, as it introduced a backdoor into people’s devices that could allow bad actors, nation states, etc. to scan, or mandate scans, of these devices.
This announced end-to-end encryption program solves this problem. Every file (document, photo, etc.) stored in iCloud will have a checksum of the raw data of the file. This means that Apple et al. can determine if people are in possession of CSAM without scanning their devices.
It also raises concerns. Let’s say protestors in Nation (fill in your favorite) are sharing a document to coordinate protests against Nation. By determining the checksum of the file, Apple (perhaps under duress from Nation) can determine everyone who is in possession of that file.
Another scenario would be discovery of a document exposing atrocities of Nation. Using the checksum of that document, along with date/time stamps, the flow of that document from person to person could be traced by Nation.
It will be interesting to see how this plays out, and if the EFF and other privacy groups push back against it.
My thoughts exactly. Encryption was in the news a few years ago after the United Kingdom passed their Investigatory Powers Act, and Australia their Assistance and Access Act. Apple planned to offer end to end encryption a while back but didn’t after the FBI complained. I wonder what has changed?
Thanks for this analysis!
The Verge, covering Joanna Stern’s recent interview with Craig Federighi:
And the exact relevant interview question and answer:
My takeaway is that even checksum scanning won’t be something they’re doing here. Is that wrong?
In my interpretation based on the documentation, I believe that is wrong.
From the article @aardy shared above:
Note that this metadata is stored with standard encryption, meaning Apple holds a key to decrypt it, even if the data itself is encrypted with Advanced Data Protection.
Encryption of certain metadata and usage information
Some metadata and usage information stored in iCloud remains under standard data protection, even when Advanced Data Protection is enabled.
So before the file (document, photo, etc.) is encrypted, a checksum will be calculated and that, along with the filename and other information listed will be stored on iCloud. While Apple et al. won’t be able to read the contents of the file, if they have another file with the same contents, and therefore the same checksum, then for all intents and purposes, they can say that you have the same file.
Just to elaborate a little more, let’s say I have a PDF of the Universal Declaration of Human Rights udhr.pdf, and store it in my Documents, which is saved to iCloud. Among other things (just to simplify a bit), the following will be uploaded:
Filename: udhr.pdf
File size: 98029
sha256: 366eebb50e4cf32e7db54cc1eaedaa001438c35114274a5e2de2e959ad6888cd
(sha256 is a “checksum” algorithm)
Now, Nation tells Apple they will no longer allow their products to be sold, and will shut down any factories supplying them, unless Apple provides them a list of all users that have files with this sha256:
366eebb50e4cf32e7db54cc1eaedaa001438c35114274a5e2de2e959ad6888cd. Apple policy (per Federighi) is to protect people, so in their estimation in this case, that includes turning over the list of users with that file.
In checking all files in iCloud, we find that Alice has a file in her iCloud:
Filename: could_expose_me_as_a_dissident.pdf
File size: 98029
sha256: 366eebb50e4cf32e7db54cc1eaedaa001438c35114274a5e2de2e959ad6888cd
The checksum matches the one Nation is interested in.
Is that the file Nation is interested in? Well, the probability of it not being the file is 4.3 × 10^-43, while the odds of us being annihilated by an asteroid in the next second are 10^-15. So, yeah, that’s the file, and certainly good enough evidence for Nation to deal with Alice.
Edit: continuing the scenario:
Monique also has this file, but she used PDF Expert to add some text, and resaved it.
And thus the metadata has changed:
Filename: totally_not_udhr.pdf
File size: 104939
sha256: 9daac0554e2c51e9e3b82a5012a5991ee8a1238a29b4d3c2f9901e184b408c67
So our clever Monique doesn’t appear in the list.
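To make the Alice and Monique scenario concrete, here is a rough Python sketch of the lookup. Everything in it is an assumption for illustration: the FileMetadata record, the owners_of helper, and the stand-in bytes are hypothetical, and nothing here reflects how iCloud actually stores or queries metadata. It only shows that an exact-match search on a plaintext hash finds byte-for-byte copies regardless of filename and misses edited copies.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical per-file record kept alongside the encrypted content."""
    owner: str
    filename: str
    size: int
    sha256: str


def describe(owner: str, filename: str, content: bytes) -> FileMetadata:
    """Build the metadata record from the unencrypted file content."""
    return FileMetadata(owner, filename, len(content),
                        hashlib.sha256(content).hexdigest())


def owners_of(index: list[FileMetadata], target_sha256: str) -> list[str]:
    """Return every owner holding a file whose hash equals the target."""
    return sorted({m.owner for m in index if m.sha256 == target_sha256})


# Stand-in bytes; the real PDF content is not reproduced here.
udhr = b"stand-in bytes for udhr.pdf"

index = [
    describe("alice", "could_expose_me_as_a_dissident.pdf", udhr),
    describe("monique", "totally_not_udhr.pdf", udhr + b" plus an annotation"),
    describe("bob", "holiday.jpg", b"unrelated photo bytes"),
]

target = hashlib.sha256(udhr).hexdigest()
matches = owners_of(index, target)
print(matches)  # ['alice']
```

Renaming the file did nothing for Alice because the filename is not part of the hash in this sketch, while Monique’s edit changed the bytes and therefore the hash.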
Finally, as @WayneG stated above, this encryption goes against several agencies’ wishes. Since the iCloud code is not open source or independently validated, we don’t know if there are additional backdoors or bugs (accidental or intentional) that can be exploited to circumvent the encryption.
I’m not sure that I follow:
Why would E2EE imply cryptographic hashes have to be stored with Apple?
What prevents Apple from storing/generating hashes without E2EE?
Hashes are useless as a means of identifying files that are not exact, byte for byte replicas of one another whereas there are statistical methods that can determine if unencrypted files are similar to one another.
So while I agree that there are lots of privacy issues that come with letting a company like Apple have access to your data, I don’t see how any of them are made worse or introduced by the use of E2EE. I’m genuinely and non-sarcastically asking: What am I missing (pre-coffee)?
This problem can be solved (without loss of utility) by including information in the checksum that only Alice knows. (Whether or not Apple does that though, I don’t know.)
Also, it looks (based on what you posted) as if the file name is included in the checksum, so simply renaming a file makes it undetectable by that simple comparison method.
According to their document, they are storing hashes to allow deduplication. The issues I detailed above come along with storing those metadata.
If I understand your question correctly, nothing. This is similar to what the now-canceled program was going to do: create a hash on-device, compare it to known CSAM, then alert Apple if there was a match.
Correct, which allows Apple to determine if you have a particular file (as detailed above).
True, but I don’t think that is at play in this scenario.
It’s not the E2EE, but the metadata that are the issue.
True, salting is a way to do this. Whether they do is, as far as I know, not known. I frankly would doubt it, as that negates their ability to deduplicate files across users (e.g. if 1,000,000 people have the UDHR file, they only have to store it once). And it would also negate their ability to determine who has what files, which seems to be the quiet part no one is saying out loud.
The filename is one item in the metadata that is uploaded, along with file size, hash, etc. I doubt the hash would include the filename or size (or the other metadata), as, again, that would defeat deduplication, detecting if people have files, etc.
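As a toy illustration of why content-only hashes and cross-user deduplication go together, here is a short Python sketch. The DedupStore class and its methods are hypothetical, it keeps plaintext in memory purely for simplicity, and whether Apple deduplicates across users at all has not been confirmed by anyone in this thread.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one blob per distinct content hash."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}                  # hash -> content, stored once
        self.references: dict[tuple[str, str], str] = {}   # (user, filename) -> hash

    def upload(self, user: str, filename: str, content: bytes) -> str:
        """Record the upload; only store the bytes if the hash is new."""
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)
        self.references[(user, filename)] = digest
        return digest

    def users_with(self, digest: str) -> list[str]:
        """The same index that enables dedup also reveals who holds a file."""
        return sorted({user for (user, _), d in self.references.items()
                       if d == digest})


store = DedupStore()
shared = b"stand-in bytes for a widely shared file"
digest = store.upload("alice", "udhr.pdf", shared)
store.upload("bob", "human_rights.pdf", shared)
store.upload("carol", "notes.txt", b"something else entirely")

print(len(store.blobs))          # 2 blobs stored for 3 uploads
print(store.users_with(digest))  # ['alice', 'bob']
```

If the hash also covered the filename or a per-user secret, the two uploads of the shared file would get different digests and nothing would be deduplicated.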
Hm. Just had another thought, these metadata allow Apple to determine if people have music files, videos, etc. even if they are encrypted.
Similar to salting, but not the same. Salt is random and published along with the hash, allowing anyone to check a file they have against a salted hash, but not to do so against precomputed hashes. Using a shared secret prevents anyone other than the owner from feasibly being able to determine the identity of the file by matching it against files with known hashes.
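Here is a minimal Python sketch of the difference between the three approaches, assuming SHA-256 for the plain and salted hashes and HMAC-SHA-256 for the keyed one. The function names and the per-user keys are hypothetical, and this is not a claim about what Apple computes.

```python
import hashlib
import hmac
import secrets


def plain_hash(content: bytes) -> str:
    """Hash of the content alone: the same for everyone with the file."""
    return hashlib.sha256(content).hexdigest()


def salted_hash(content: bytes, salt: bytes) -> str:
    """Salted hash: the salt is public, so anyone can still test a candidate file."""
    return hashlib.sha256(salt + content).hexdigest()


def keyed_hash(content: bytes, secret_key: bytes) -> str:
    """Keyed hash (HMAC): only someone holding the key can test a candidate file."""
    return hmac.new(secret_key, content, hashlib.sha256).hexdigest()


document = b"stand-in bytes for a shared document"
alice_key = secrets.token_bytes(32)  # known only to Alice in this sketch
bob_key = secrets.token_bytes(32)    # known only to Bob in this sketch
public_salt = secrets.token_bytes(16)

plain_equal = plain_hash(document) == plain_hash(document)
salted_checkable = salted_hash(document, public_salt) == salted_hash(document, public_salt)
keyed_equal_across_users = keyed_hash(document, alice_key) == keyed_hash(document, bob_key)
alice_can_verify = hmac.compare_digest(keyed_hash(document, alice_key),
                                       keyed_hash(document, alice_key))

print(plain_equal)               # anyone can match the file against a known hash
print(salted_checkable)          # anyone who sees the salt can do the same
print(keyed_equal_across_users)  # Alice's and Bob's keyed hashes do not line up
print(alice_can_verify)          # Alice can still check her own file's integrity
```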
I would be shocked if they are able to deduplicate files across users once E2EE has been applied. In fact, I would argue that their ability to know whether two people have the same file would violate their claim of E2EE.
That’s where the hash comes in. It’s created from the unencrypted file, as I demonstrated at great length above.
This is the same way people verify file downloads. Download a file, compare the hash against the one that’s on the website. Same hash, same file.
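For reference, a small Python sketch of that download check. The helper names are my own, and the temporary file stands in for a real download; the published value would normally be copied from the website.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the one published by the site."""
    return hmac.compare_digest(sha256_of_file(path),
                               published_hex.strip().lower())


# Stand-in for a downloaded file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example download")
    path = tmp.name

published = hashlib.sha256(b"example download").hexdigest()
ok = matches_published(path, published)
tampered = matches_published(path, "0" * 64)
os.remove(path)

print(ok)        # hash matches the published value
print(tampered)  # hash does not match a different value
```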
So, the system isn’t perfect, that’s too bad. But it is better than the previous “standard” data protection in iCloud.
That’s true, but I just read through the “iCloud data security overview” and a lot of the E2E that Apple is adding is for things like Bookmarks, Shortcuts, and Memoji etc. Most of the items listed are already protected by E2E or, IMO, insignificant.
If I can turn on E2E per category I’ll add it to Messages in iCloud, iCloud Backup, and iCloud Drive. I don’t use Notes, it can’t be used for mail, contacts, or calendars and I don’t want my photos encrypted.
I encrypt things that I would rather lose than have them made public. If this new feature is just On or Off, I think I’ll pass.
Right, but as I explained at no great length and not quite as far above, a hash can be computed that both serves as assurance of file integrity and also preserves privacy. So, if Apple wants to claim the benefits offered by true E2EE, then they will have to compute their file hashes in a way that doesn’t let them know if two files are the same based purely on their checksums, and thereby give up this method of deduplication. It’s not clear to me how they’re actually computing those hashes though, unless they document it exactly.
But, in the end there is absolutely nothing about this new system that introduces something that isn’t already available to Apple now, so the new system does represent a strict increase in file content confidentiality.