The Unofficial Guide to Apple’s Private Cloud Compute - Jonathan Mortensen, CONFSEC

00:00:00.000 |
We're going to talk about Apple's private cloud compute. This is an unofficial guide. I don't 00:00:19.360 |
work at Apple. We'll talk about it in a sec. So this is my background: my PhD is in data science and 00:00:27.280 |
biomedical informatics. I've sold two companies, one in AI and data, one in cyber security and 00:00:32.000 |
infrastructure. I'm at South Park Commons. I'm building a company called Confident Security, 00:00:35.760 |
which we'll get to at the end. But again, disclaimer, I'm not an Apple employee. I'm 00:00:40.400 |
not speaking on their behalf. Everything I've gleaned is from public sources. And hopefully, 00:00:44.720 |
what we'll learn today is some tools that we can use ourselves. There's really six key components 00:00:49.440 |
and some approaches to ensure privacy. And privacy and security are very related, not perfectly 00:00:54.720 |
overlapping, but related. So before I get there, I know that we're in the security track, 00:00:59.680 |
but I want to motivate why you might care about privacy. Not everyone believes that you should 00:01:04.160 |
have privacy. So let me just give some examples. This year, DeepSeek leaked a million sensitive 00:01:12.560 |
records of chat logs. And you might say to yourself, like, well, that's DeepSeek. Everyone knew that was 00:01:17.600 |
going to happen. But before I show you the next piece, I want to poll the audience. How many here 00:01:23.920 |
care about privacy? All right. How many of us use ChatGPT? All right. How many consider ChatGPT to be 00:01:33.600 |
private? Okay, good. How many use the ChatGPT private mode? And then how many use the API but with zero data 00:01:43.360 |
retention, kind of like the default state? Okay, great. Well, as of yesterday, OpenAI has to retain 00:01:50.000 |
everything anyway, whether or not you flagged it as private. And I don't want to get into the 00:01:55.920 |
details of why they have to do that. But the point is that they have the capability of retaining 00:02:01.840 |
your private chats, even the ones you flag in the UI as private, among other things. And obviously, being forced to 00:02:08.320 |
do that is not great. But this is why we should all care about privacy. So Apple doesn't want to have 00:02:16.000 |
these headlines because one of their major value props is privacy. So let's talk a little bit 00:02:20.800 |
about the problems that Apple solved and then how we might use it. So fundamentally, AI requires more 00:02:26.960 |
compute than a phone. But obviously, they want to bundle AI into their phones. Privacy is a major selling 00:02:32.480 |
point. Anytime you give up private data to something remote, you're inherently reducing your privacy 00:02:38.160 |
right? Anytime I give you my data, it's not as private as it was the second before. So the question 00:02:43.680 |
that Apple is trying to answer in their PCC system, which is now available on all of our iPhones and 00:02:48.560 |
used for inference, is how do you get remote compute while remaining private? And the simple way to do 00:02:53.280 |
that would be to buy an H100 for every iPhone, pair them up in the cloud, and you get your own H100, boom. But 00:02:58.400 |
obviously, an H100 is even more expensive than the phone. So that's not going to work. So they actually need some approach, 00:03:04.480 |
which is how do you get remote compute while remaining private and cheap? Otherwise, it doesn't work. So 00:03:10.800 |
I'm going to kind of frame the problem this way. You've got an iPhone, you've got an untrusted remote 00:03:15.680 |
server, you can't see inside of it. It's a black box. Once you give them your data, you have no idea what 00:03:20.240 |
happens inside. And what Apple does is try to make it not a black box, so that the iPhone has some 00:03:25.440 |
control of what happens to the data inside Apple's remote servers. And then hopefully this trusted 00:03:32.640 |
remote service is also hard to hack. So for the remainder of the talk, and we're not going to be 00:03:38.080 |
able to get into all of it in 16 minutes, but we're going to talk about Apple's PCC requirements that they 00:03:43.840 |
set up. And I'll review a conceptual architecture about how they meet those requirements. Then we'll go 00:03:48.880 |
into two specific components of the six because I don't have time to go through all six and you'll 00:03:53.600 |
be bored by that point. And then talk about some pros and cons of Apple's approach and how we might 00:03:58.720 |
use some of those components ourselves. So there are five key requirements to Apple's private cloud 00:04:06.080 |
compute that they're trying to meet when they design the system. The first one is stateless computation. 00:04:10.560 |
This is essentially the guarantee that when Apple receives your data, it's only used to satisfy the 00:04:15.040 |
request and cannot be used for anything else. It's impossible to use it any other way. You can't log it, 00:04:19.040 |
anything like that. The second thing is enforceable guarantees. The notion that everything's 00:04:26.400 |
enforced with code, not by some sort of policy. Not "I shouldn't SSH to the instance, but I could SSH to the 00:04:32.000 |
instance." No, there's no SSH on the instance. You can't SSH to it. You don't want to save things? Well, 00:04:37.680 |
don't have a disk, right? So these are what they call enforceable guarantees, not just policies. The third 00:04:43.120 |
principle or requirement is non-targetability. That means that if you wanted to hack my data on PCC, 00:04:49.600 |
you'd have to target everyone and sift through all of it rather than having some easy way to find just me. 00:04:55.040 |
No privileged runtime access. I just briefly touched on it earlier, but essentially there's no way to 00:05:03.040 |
bypass these restrictions in production. And then the final one and the most important one is verifiable 00:05:09.120 |
transparency. Verifiable transparency essentially says we can prove that all of the above items are true. 00:05:15.600 |
Great. So let's talk about, again, this is a slightly bigger representation of the black box. 00:05:23.840 |
In a classic, you know, kind of remote system, you have some sort of auth service, and then we have an 00:05:29.680 |
AI engine. And in this AI engine, you have some SRE who can access it and some disk that it can write to. 00:05:35.200 |
But again, the iPhone doesn't know what it's sending its data to, and this is fundamental. So let's 00:05:40.800 |
see how we can change this to get to some of these at a conceptual level. So the first thing that Apple does 00:05:48.160 |
is it adds an anonymizer. And this anonymizer is the first part of two parts of non-targetability. But 00:05:57.760 |
ideally, right, Apple can't tell who the data is coming from, so it would be harder for an attacker 00:06:02.320 |
to come and, like, fish out my particular set of data. If you're astute and looking at this, you'll notice we still 00:06:10.720 |
auth behind the anonymizer: the iPhone provides some sort of auth credentials, and those auth credentials 00:06:14.880 |
are obviously tied to the user. So the second thing that Apple does is separates auth. And conceptually, 00:06:20.880 |
you think of this as if you're going to the arcade and you want to go spend your money on arcade 00:06:24.960 |
machines, you first put your money into the coin machine, you get some coins out. These coins are 00:06:29.920 |
anonymous. Now you can go to the machine and no one knows what machines you spend your money on. That's 00:06:34.720 |
essentially what happens here. It's called blind signatures. We're not going to have time to get into 00:06:38.640 |
it today, but that's what happens. So now the iPhone is making an anonymous request, going through an 00:06:44.640 |
anonymizer that's taking everyone's traffic. It's kind of like Tor. It's like laundering everyone's data, 00:06:49.120 |
so that if someone were to access the system internally, they wouldn't know who it's coming 00:06:55.360 |
from. So that gets us non-targetability. The next thing that Apple does is it changes the set of 00:07:00.960 |
requests that are happening. The first thing it does is, before it sends its data, it says, "What are you 00:07:05.040 |
running?" And if the AI engine replies, "Well, I'm running this and only this," the iPhone might say, "Okay, I trust that." And if that remains 00:07:14.400 |
true, then you can run this AI on the data that I'm submitting. This is how they achieve verifiable 00:07:22.000 |
transparency. There's a little more subtlety to that, which we are going to get into, but it's 00:07:26.160 |
essentially the iPhone says, "I trust the code that you're running. You can only decrypt my data if you're 00:07:29.840 |
still running that code." So the iPhone can verify what they're doing. The next thing, no privileged runtime 00:07:35.120 |
access. That was easy. Just get rid of SSHD, so there's no way of accessing those machines. Enforceable guarantees? 00:07:42.000 |
Get rid of the disk. We talked about that. And then stateless computation, again, 00:07:46.800 |
with no disk, no access. There's nothing to do with the data other than respond to the iPhone. 00:07:50.960 |
And since the iPhone verified the code that was running on this machine, it knows it's not being 00:07:56.880 |
logged anywhere before it gives them the data. Okay. So they achieve those five guarantees that I talked 00:08:04.560 |
about here using six technical components. And again, we're going to go into two of them. 00:08:09.760 |
But I'll describe them all very briefly. Oblivious HTTP is a spec developed by Cloudflare and Apple and 00:08:17.440 |
others that allows you to essentially make anonymous requests by using a third party to launder your 00:08:25.040 |
requests through. So all of the requests that go to Apple's private cloud compute first go through Cloudflare. 00:08:30.480 |
So when Apple receives it, it only knows that it came from Cloudflare, not from an individual user's IP address. 00:08:35.440 |
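To make that concrete, here's a minimal sketch of the two-hop split in Python using the `cryptography` package. It is not the real OHTTP wire format (RFC 9458 uses HPKE and a specific binary encapsulation); it only shows the conceptual shape, with made-up function names: the relay learns who you are but not what you asked, and the gateway learns what you asked but not who you are.

```python
# Toy two-hop "oblivious" request. Real OHTTP uses HPKE; this is only the idea.
import os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

RAW = (serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def derive_key(shared_secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"ohttp-demo").derive(shared_secret)

# Gateway (in front of the AI engine) publishes a long-term public key.
gateway_priv = X25519PrivateKey.generate()
gateway_pub_raw = gateway_priv.public_key().public_bytes(*RAW)

# --- Client: encapsulate the request so only the gateway can read it ---
def client_encapsulate(request: bytes) -> bytes:
    eph = X25519PrivateKey.generate()                      # fresh per request
    shared = eph.exchange(X25519PublicKey.from_public_bytes(gateway_pub_raw))
    key, nonce = derive_key(shared), os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, request, None)
    return eph.public_key().public_bytes(*RAW) + nonce + ciphertext

# --- Relay (e.g. Cloudflare): sees the client's IP, forwards an opaque blob
def relay_forward(blob: bytes) -> bytes:
    return blob                                            # cannot read it

# --- Gateway: decrypts the request, but only ever sees the relay's address
def gateway_open(blob: bytes) -> bytes:
    eph_pub, nonce, ciphertext = blob[:32], blob[32:44], blob[44:]
    shared = gateway_priv.exchange(X25519PublicKey.from_public_bytes(eph_pub))
    return AESGCM(derive_key(shared)).decrypt(nonce, ciphertext, None)

print(gateway_open(relay_forward(client_encapsulate(b"summarize my email"))))
```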
The second thing that they use is blind signatures. Blind signatures are the arcade analogy that I gave 00:08:41.120 |
you. But it's essentially a way to auth separately and then verify that you're bearing a valid credential, 00:08:48.000 |
but you can't link it to your identity. And again, we don't have time to go into that. But if you want to look 00:08:52.080 |
it up, it's a formal spec as well. There's lots of packages and open source libraries that let you use it. 00:08:56.880 |
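Here's a toy, textbook-RSA version of the coin-machine idea, just to show the blinding and unblinding math. It is not the padded RSA blind signature scheme (RFC 9474) real deployments use, and the token issuance protocol around it is more involved; only the key generation comes from the `cryptography` package, everything else is plain arithmetic.

```python
# Toy blind signature: the "coin machine" signs a token without ever seeing
# which token it signed, so spending the token later can't be linked back to
# the auth step. Textbook RSA, no padding -- illustration only.
import hashlib, math, secrets
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
n = key.public_key().public_numbers().n
e = key.public_key().public_numbers().e
d = key.private_numbers().d

# --- Client: blind a hashed token before sending it to the signer ------
token = secrets.token_bytes(32)
m = int.from_bytes(hashlib.sha256(token).digest(), "big") % n
while True:
    r = secrets.randbelow(n - 2) + 2
    if math.gcd(r, n) == 1:
        break
blinded = (pow(r, e, n) * m) % n      # the signer cannot recover m from this

# --- Signer (auth service): checks your account, then signs blindly ----
blind_sig = pow(blinded, d, n)

# --- Client: unblind; the result is an ordinary signature over m -------
sig = (blind_sig * pow(r, -1, n)) % n

# --- Later, the service verifies the token without knowing who you are -
assert pow(sig, e, n) == m
print("anonymous token verifies")
```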
The third component is the secure enclave. The equivalent we have in our world, if we're not 00:09:04.800 |
programming on Apple, is the TPM, if you've heard of that. But it's essentially a separate piece 00:09:09.680 |
of hardware where the private keys are kept, with a guarantee that those keys can never be removed 00:09:16.240 |
from the hardware. That's really important because you don't want the keys that underpin 00:09:23.440 |
all of this to be given away. All of the interactions here are done with keys that prove who the parties are. 00:09:28.320 |
If you could move the key out and have some third party hold it, then it wouldn't be trusted. You could 00:09:32.400 |
essentially fake everyone out that you are an official AI engine, but actually you're somewhere 00:09:36.960 |
else. So the secure enclave helps with that. 00:09:40.720 |
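As a purely illustrative stand-in, a secure enclave or TPM behaves like an object that will sign things for you but will never hand over the private key. The hypothetical `ToyEnclave` below fakes that boundary in software; a real enclave enforces it in hardware, and the vendor certifies the key so others can trust where it lives.

```python
# Illustrative stand-in for a Secure Enclave / TPM: the private key is created
# inside the object and never exported; callers can only get the public key
# or ask for a signature.
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)

RAW = (serialization.Encoding.Raw, serialization.PublicFormat.Raw)

class ToyEnclave:
    def __init__(self):
        self._key = Ed25519PrivateKey.generate()   # never leaves this object

    def public_key_bytes(self) -> bytes:
        return self._key.public_key().public_bytes(*RAW)

    def sign(self, message: bytes) -> bytes:
        return self._key.sign(message)

enclave = ToyEnclave()
claim = b"I am a genuine PCC-style node"
signature = enclave.sign(claim)

# Anyone holding the public key can check the claim; nobody ever saw the
# private key, so the signature could only have come from that "hardware".
Ed25519PublicKey.from_public_bytes(enclave.public_key_bytes()).verify(signature, claim)
print("signature verifies; the signing key was never exposed")
```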
Again, we won't be getting into those; we're going to get into these two. The last one is secure boot and a hardened operating system. This is like a standard technique, 00:09:46.960 |
but it's essentially that they run a very limited version of iOS that makes it very difficult to hack or 00:09:52.720 |
modify. Everything has to be signed, just like if you've built an iOS app you have to do signatures, 00:09:58.240 |
but theirs is even more locked down. Okay, so the ones we're going to talk about are remote attestation. 00:10:04.880 |
That was this flow I talked about here. Great. Then the other one is the transparency log. The 00:10:12.160 |
transparency log is a record of all of the software that Apple is deploying on their private nodes so 00:10:17.760 |
that you can go and verify what's on the record is actually what's being sent to you during the 00:10:22.240 |
attestation. Okay, so let's talk about remote attestation very briefly. I'm going to talk about it 00:10:29.360 |
abstractly not with iPhones. So you have some client and the client says, "What are you running?" 00:10:34.400 |
And the server replies with two things: a set of signed claims and then a public key. And the signed 00:10:40.960 |
claims essentially say, "I'm on genuine hardware. I'm running a genuine GPU. I am running this set of 00:10:48.880 |
software. I use this bootloader. I use this version of Linux." And then the client gets to look at those 00:10:55.120 |
claims and decide whether it trusts that version. It might be like, "Oh, I only trust this version of 00:11:01.520 |
the Linux kernel and above." Or, "I only trust that it's been signed by Apple." And if so, it can use this 00:11:09.520 |
public key that comes across to encrypt data that is later sent to the server. And this is really important: 00:11:18.560 |
this public key and these claims are tied together. So during later interactions with the server, 00:11:23.840 |
the client will encrypt using the public key and the signed claims. And the server will only be able 00:11:28.400 |
to decrypt if it is still matching those signed claims. There's a whole bunch of cryptography that 00:11:33.920 |
makes this possible, and a bunch of certificate chains and trust roots and vendors. But that's 00:11:39.280 |
the fundamental idea. And this is what lets you change that black box into something that's a little 00:11:44.080 |
more translucent, right? Not just throwing it over the wall. You can kind of see what's going on inside. 00:11:50.240 |
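Here's a small sketch of that shape, with made-up claim fields and a stand-in Ed25519 "vendor" key; Apple's real attestation uses hardware roots of trust and certificate chains rather than a single key. The point it illustrates is the binding: the client verifies the signed claims, applies its own policy, and only then encrypts to the key that was attested alongside those claims.

```python
# Toy remote attestation: signed claims + a public key bound to those claims.
import hashlib, json, os
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey, X25519PublicKey)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

RAW = (serialization.Encoding.Raw, serialization.PublicFormat.Raw)

def derive_key(shared: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"attested-session").derive(shared)

# Vendor signing key (stand-in for a hardware-rooted certificate chain).
vendor_key = Ed25519PrivateKey.generate()
vendor_pub = vendor_key.public_key()

# --- Server: build the attestation bundle ------------------------------
server_enc_key = X25519PrivateKey.generate()
os_measurement = hashlib.sha256(b"toy hardened OS image").hexdigest()
claims = json.dumps({
    "hardware": "genuine-node",
    "os_image_sha256": os_measurement,
    "enc_pub": server_enc_key.public_key().public_bytes(*RAW).hex(),
}, sort_keys=True).encode()
attestation = {"claims": claims, "sig": vendor_key.sign(claims)}

# --- Client: verify the claims, apply policy, encrypt to the attested key
trusted_images = {os_measurement}        # e.g. hashes the client has reviewed

def client_send(payload: bytes) -> bytes:
    vendor_pub.verify(attestation["sig"], attestation["claims"])  # raises if forged
    c = json.loads(attestation["claims"])
    assert c["os_image_sha256"] in trusted_images, "untrusted software"
    eph = X25519PrivateKey.generate()
    shared = eph.exchange(X25519PublicKey.from_public_bytes(bytes.fromhex(c["enc_pub"])))
    key, nonce = derive_key(shared), os.urandom(12)
    return eph.public_key().public_bytes(*RAW) + nonce + AESGCM(key).encrypt(nonce, payload, None)

# --- Server: only the key bound to those exact claims can decrypt ------
blob = client_send(b"my private prompt")
eph_pub, nonce, ciphertext = blob[:32], blob[32:44], blob[44:]
key = derive_key(server_enc_key.exchange(X25519PublicKey.from_public_bytes(eph_pub)))
print(AESGCM(key).decrypt(nonce, ciphertext, None))
```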
Okay. The second thing is the transparency log. Transparency log is actually very simple conceptually. 00:11:56.240 |
It's just a database with records for each software release or each component in a software release signed by a 00:12:03.040 |
particular person. So, for example, in this record, Bob added this binary or piece of compiled source code. 00:12:13.200 |
And this is the hash of that binary on November 1st, 2024. And then that's it. It's just a declaration that 00:12:19.760 |
this binary was signed by Bob. Why does that matter? Why would you care about this? Well, first of all, 00:12:26.160 |
reviewers can go through and offline look at these binaries that are made publicly available 00:12:33.840 |
and verify their behavior. And so then when you get a remote attestation and the remote attestation 00:12:39.600 |
says this hash of this binary is there, you can be like, oh yeah, I've already checked this binary. 00:12:43.440 |
I believe that it's doing the right thing. The second point, which I just alluded to, 00:12:48.880 |
is that you can check the remote attestations match what's in the log. And then finally, if you see an 00:12:53.840 |
attestation that's not on the log, you know the whole system's been compromised. Because if it's not on the log, 00:12:59.760 |
definitely someone is like doing some sort of shenanigans, right? They might have like hijacked 00:13:03.840 |
your connection, whatever. And it's just because like a limited set of people can write to this log 00:13:09.600 |
and there's no way to modify the log, right? It's append only. It uses a Merkle tree so that you can't 00:13:14.400 |
change the contents. Great. So that is the transparency log. 00:13:23.680 |
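Here's a minimal sketch of such a log: each record is "who signed which binary hash and when", and a Merkle root commits to the entire history, so entries can't be quietly altered after the fact. Real logs like Sigsum, Sigstore, or Apple's also publish signed tree heads plus inclusion and consistency proofs; this only shows the core idea, using the standard library.

```python
# Minimal append-only transparency log with a Merkle root over its entries.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class TransparencyLog:
    def __init__(self):
        self._leaves: list[bytes] = []          # append-only

    def append(self, signer: str, binary_hash: str, date: str) -> None:
        self._leaves.append(h(f"{signer}|{binary_hash}|{date}".encode()))

    def root(self) -> bytes:
        """Merkle root over all leaves (pairwise hashing up the tree)."""
        level = self._leaves or [h(b"")]
        while len(level) > 1:
            if len(level) % 2:                  # duplicate the last node if odd
                level = level + [level[-1]]
            level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        return level[0]

    def contains(self, signer: str, binary_hash: str, date: str) -> bool:
        return h(f"{signer}|{binary_hash}|{date}".encode()) in self._leaves

log = TransparencyLog()
binary_hash = hashlib.sha256(b"compiled inference binary").hexdigest()
log.append("Bob", binary_hash, "2024-11-01")
root_before = log.root()

# A client that saved the root can detect any rewriting of history:
assert log.contains("Bob", binary_hash, "2024-11-01")
log.append("Alice", hashlib.sha256(b"next release").hexdigest(), "2024-12-01")
assert log.root() != root_before                # the root only moves forward
```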
So let me tell you how this all comes together. Remote attestation is this flow: again, the iPhone first, through the anonymizer, 00:13:31.760 |
requests a remote attestation package and then says, well, if I believe that remote attestation package, 00:13:37.360 |
I trust the contents that are running on the server. I can then send my data, and I phrase this as "try to 00:13:43.440 |
decrypt the data on the AI engine." Again, if the attestation changed, the AI engine would not be able to decrypt the 00:13:49.920 |
data, right? So that's the most important part, right? It says, I'm running this thing. Trust me. 00:13:54.000 |
And the client says, I trust you. Okay, great. Encrypt it. And the server can only decrypt it as long as it's still 00:13:58.160 |
running the exact thing I said I trusted. And the second item we talked about is the transparency log, 00:14:04.720 |
which is check if the attested claims match the transparency log. And this transparency log we 00:14:09.200 |
talked about. So on here, Apple is writing a lot, a lot of software onto the log and then essentially 00:14:14.720 |
saying, trust what's on the log, you can verify it offline. And then when the attestation claims come 00:14:20.400 |
in, just double check that they do indeed match. Okay, and then I don't have time to get into all of 00:14:28.000 |
these. But here are some of the other items that we talked about: the oblivious 00:14:36.240 |
HTTP is the anonymizer, blind signatures are the way they do the auth. And then of course, 00:14:42.160 |
over here, we have the secure enclave, I kind of put that outside of the AI engine, they're separate 00:14:48.880 |
pieces of hardware. And then the hardening is just this little lock, but you know, we don't have time 00:14:53.280 |
to get into it. And that's, at a very conceptual level, how Apple's PCC works; you could essentially do a PhD on 00:15:01.440 |
each of these components. So what are the gaps? What are the downsides? Well, first, you have 00:15:09.760 |
to put all of your trust in Apple still, right? On the bright side, like Apple runs their whole supply chain, 00:15:15.360 |
they verify the nodes when they get them at their data center, they actually re-sign them with what's 00:15:21.200 |
called data center identity keys or something like that, DCIKs. But there's no guarantee that Apple 00:15:28.160 |
doesn't share the certs with anyone, or insecurely generate them, or set the private key to one 00:15:33.360 |
everywhere. Now, I think they are making their best effort, but you still have to trust them. You've 00:15:39.600 |
shifted the trust now into like Apple's behavior rather than the hardware. But anyhow, 00:15:45.120 |
it's only available on Apple devices for consumer use in official apps. Maybe at some point, 00:15:50.800 |
they'll make PCC available to everyone else, but not yet. So what trade-offs does Apple's PCC make? 00:15:57.760 |
they're limited by latencies to Apple data centers. So they do have local models first that they try and 00:16:05.040 |
use, but if those local models aren't adequate, they'll send the request to data centers. As we start to do 00:16:11.760 |
things like real-time voice, this adds a lot more latency 00:16:17.600 |
to the system. The compute costs are higher. They're doing a lot more encryption. 00:16:22.480 |
I mean, you're not seeing it here, but there's like six layers of encryption before it even gets to 00:16:26.400 |
that node, and that's what actually makes all of this happen. So you're spending a little bit more compute 00:16:30.080 |
there. Like I told you, no custom models, no fine tuning. The client libraries are very complicated. 00:16:36.000 |
The client having to orchestrate all of these requests, this transparency log, this auth, 00:16:42.480 |
that's way more complicated than a simple HTTP request, which kind of sucks. And what if 00:16:48.480 |
your iPhone goes down after it's authenticated, and then it loses all the authentication keys? Like, 00:16:54.400 |
you've essentially lost all of your state, right? So it's a lot more stateful. 00:16:58.080 |
Operationally complex. You can't SSH into the machine and there's no logging. So that's difficult. 00:17:06.080 |
Not everyone would sign up for that. You can't do any usage tracking. If you could do usage tracking, 00:17:11.840 |
then you'd be identified, right? And so Apple can't like parcel out, you know, you get 2000 tokens. 00:17:18.160 |
They do do some fraud and abuse tracking at a very coarse level. But if you wanted to use a similar 00:17:23.840 |
architecture and pass on your costs to the customer, you wouldn't 00:17:28.080 |
be able to know which customer was doing what, right? And then not open to third party developers. 00:17:33.760 |
Okay. What can we learn from this? I gave you the list of six that Apple uses. And here's what's 00:17:40.480 |
available in our world. If you're not developing on Apple Silicon and Apple hardware, you still have 00:17:45.440 |
oblivious HTTP and blind signatures. There are libraries to do that. So we don't have secure enclaves, 00:17:50.320 |
but we have TPMs. Almost all Intel and AMD hardware now has TPMs. And then in the cloud environment, 00:17:56.720 |
they have virtual TPMs that provide the same behavior as a TPM. And again, that's where you put a bunch of 00:18:02.560 |
your private keys that are tied to that public key that I talked about. These are available for us. Secure boot 00:18:08.560 |
and hardened operating systems are available too. Remote attestation is kind of available. It's kind of tied to the TPM. 00:18:16.800 |
There aren't great standards yet, but there is a little bit of work there. Transparency logs? There 00:18:22.560 |
are two open ones. One's called Sigsum. The other one's called Sigstore. Maybe you've heard of them; if not, they're worth looking up. 00:18:29.440 |
And then confidential VMs are just becoming available on cloud providers with GPUs. So confidential computing 00:18:36.800 |
has been around for a while, but now you also have to have confidential GPUs. And only H100s 00:18:42.880 |
and H200s support confidentiality. What that means is that their memory is encrypted. So if you were to 00:18:47.520 |
physically go up to the H100 and like try to look at its RAM, you wouldn't be able to see what's going on 00:18:52.800 |
there or figure out what's going on there. And then finally, what we have that Apple doesn't have is we 00:18:57.680 |
have open source and we have reproducible builds. We have the ability to link the source code to the 00:19:01.440 |
binaries. And so we can have security researchers look at the source code as well as, you know, black-box 00:19:08.080 |
test the binaries and develop confidence in what the server might be running. 00:19:13.920 |
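The check itself is simple in principle: build the binary from source yourself, hash it, and compare against the digest published in the transparency log (and later attested by the server). A toy sketch, where the file path and helper names are illustrative placeholders:

```python
# Toy reproducible-build check: hash your own build and compare it to the
# digest published in a transparency log. Paths and values are hypothetical.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_reproducible(local_build: Path, published_digest: str) -> bool:
    """True if our own build of the source matches what the server claims to run."""
    return sha256_of(local_build) == published_digest

# Example usage (hypothetical artifact and log entry):
# ok = verify_reproducible(Path("out/inference-server"), digest_from_log)
```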
All right, what's next? Okay, so Apple has set the standard for private AI and the market is definitely following. 00:19:21.280 |
PCC was announced in June of 2024, but it wasn't actually released until October of 2024. Azure OpenAI, 00:19:26.960 |
or sorry, Azure AI, not Azure OpenAI, added private inferencing starting in September. 00:19:31.280 |
They're still in private preview. And then about a month ago, Meta of all companies, I guess I'm 00:19:36.160 |
being recorded, Meta of all these great companies, also added private processing, which if you read their blog 00:19:43.200 |
post, it's like they copy and pasted this. Maybe they used Llama to rewrite it into their language, 00:19:48.320 |
but it's essentially identical, which is great for all of us thinking about privacy. And sure, 00:19:52.960 |
I'm sure WhatsApp also doesn't want press releases like the ones I showed earlier. 00:19:56.800 |
So I'll just close by saying we're building the same thing. But for everyone else, if you're not on 00:20:03.040 |
Apple or you're not on WhatsApp, we have it. It's called Confident Security. And if you'd like to 00:20:09.600 |
talk more, let me know. By the way, this is an anti-AI shirt, which means that if you take pictures 00:20:14.640 |
of me, it will confuse all the facial recognition stuff. We have others. If you have some cool 00:20:19.920 |
questions and want to talk afterward, if I deem them worthy, I will give you an anti-AI shirt. We also 00:20:24.960 |
have some other privacy-based swag in the back. So come hit me up. Thanks, everyone.