
The Unofficial Guide to Apple’s Private Cloud Compute - Jonathan Mortensen, CONFSEC



00:00:00.000 | We're going to talk about Apple's private cloud compute. This is an unofficial guide. I don't
00:00:19.360 | work at Apple. We'll talk about it in a sec. So this is my background, my PhD in data science,
00:00:27.280 | biomedical informatics. I've sold two companies, one in AI and data, one in cyber security and
00:00:32.000 | infrastructure. I'm at South Park Commons. I'm building a company called Confident Security,
00:00:35.760 | which we'll get to at the end. But again, disclaimer, I'm not an Apple employee. I'm
00:00:40.400 | not speaking on their behalf. Everything I've gleaned is from public sources. And hopefully,
00:00:44.720 | what we'll learn today is some tools that we can use ourselves. There's really six key components
00:00:49.440 | and some approaches to ensure privacy. And privacy and security are very related, not perfectly
00:00:54.720 | overlapping, but related. So before I get there, I know that we're in the security track,
00:00:59.680 | but I want to motivate why you might care about privacy. Not everyone believes that you should
00:01:04.160 | have privacy. So let me just give some examples. This year, DeepSeek leaked a million sensitive
00:01:12.560 | records of chat logs. And you might say to yourself, like, well, that's DeepSeek. Everyone knew that was
00:01:17.600 | going to happen. But before I show you the next piece, I want to poll the audience. How many here
00:01:23.920 | care about privacy? All right. How many of us use ChatGPT? All right. How many consider ChatGPT to be
00:01:33.600 | private? Okay, good. How many use the ChatGPT private mode? And then how many use the API but with zero data
00:01:43.360 | retention, kind of like the default state? Okay, great. Well, as of yesterday, OpenAI has to retain
00:01:50.000 | everything anyway, whether or not you flagged it as private. And I don't want to get into the
00:01:55.920 | commentary of why they have to do that. But the point is that they have the capability of retaining
00:02:01.840 | your private chats, even the ones you flag in the UI as private, and other things. And obviously, being forced to
00:02:08.320 | do that is not great. But this is why we should all care about privacy. So Apple doesn't want to have
00:02:16.000 | these headlines, because one of their major value props is privacy. So let's talk a little bit
00:02:20.800 | about the problems that Apple solved and then how we might use it. So fundamentally, AI requires more
00:02:26.960 | compute than a phone. But obviously, they want to bundle AI into their phones. Privacy is a major selling
00:02:32.480 | point. Anytime you give up private data to something remote, you're inherently reducing your privacy
00:02:38.160 | right? Anytime I give you my data, it's not as private as it was the second before. So the question
00:02:43.680 | that Apple is trying to answer in their PCC system, which is now available on all of our iPhones and
00:02:48.560 | used for inference, is how do you get remote compute while remaining private? And the simple way to do
00:02:53.280 | that would be to buy an H100 for every iPhone, pair them up in the cloud, and you get your own H100, boom. But
00:02:58.400 | obviously, an H100 is even more expensive than the phone. So that's not going to work. So they actually need some approach,
00:03:04.480 | which is how do you get remote compute while remaining private and cheap? Otherwise, it doesn't work. So
00:03:10.800 | I'm going to kind of frame the problem this way. You've got an iPhone, you've got an untrusted remote
00:03:15.680 | server, you can't see inside of it. It's a black box. Once you give them your data, you have no idea what
00:03:20.240 | happens inside. And what Apple does is tries to make it not a black box so that the iPhone has some
00:03:25.440 | control of what happens to the data inside Apple's remote servers. And then hopefully this trusted
00:03:32.640 | remote service is also hard to hack. So for the remainder of the talk, and we're not going to be
00:03:38.080 | able to get into all of it in 16 minutes, but we're going to talk about Apple's PCC requirements that they
00:03:43.840 | set up. And I'll review a conceptual architecture about how they meet those requirements. Then we'll go
00:03:48.880 | into two specific components of the six because I don't have time to go through all six and you'll
00:03:53.600 | be bored by that point. And then talk about some pros and cons of Apple's approach and how we might
00:03:58.720 | use some of those components ourselves. So there are five key requirements to Apple's private cloud
00:04:06.080 | compute that they're trying to meet when they design the system. The first one is stateless computation.
00:04:10.560 | This is essentially the guarantee that when Apple receives your data, it's only used to satisfy the
00:04:15.040 | request and cannot be used for anything else. You can't log it,
00:04:19.040 | anything like that. The second thing is enforceable guarantees. The notion that everything's
00:04:26.400 | enforced with code, not by some sort of policy. Not "I shouldn't SSH to the instance, but I could SSH to the
00:04:32.000 | instance." No, there's no SSH on the instance. You can't SSH to it. You don't want to save things? Well,
00:04:37.680 | don't have a disk, right? So these are what they call enforceable guarantees, not just policies. The third
00:04:43.120 | principle or requirement is non-targetability. That means that if you wanted to hack my data on PCC,
00:04:49.600 | you'd have to target everyone and sift through all of it rather than having some easy way to find just me.
00:04:55.040 | No privileged runtime access. I just briefly touched on it earlier, but essentially there's no way to
00:05:03.040 | bypass these restrictions in production. And then the final one and the most important one is verifiable
00:05:09.120 | transparency. Verifiable transparency essentially says we can prove that all of the above items are true.
00:05:15.600 | Great. So let's talk about, again, this is a little bigger representation of the black box.
00:05:23.840 | In a classic, you know, kind of remote system, you have some sort of auth service, and then we have an
00:05:29.680 | AI engine. And in this AI engine, you have some SRE who can access it and some disk that you can write to.
00:05:35.200 | But again, the iPhone doesn't know what it's sending its data to, and this is fundamental. So let's
00:05:40.800 | see how we can change this to get to some of these at a conceptual level. So the first thing that Apple does
00:05:48.160 | is it adds an anonymizer. And this anonymizer is the first part of two parts of non-targetability. But
00:05:57.760 | ideally, right, Apple can't tell who the data is coming from, so it would be harder for an attacker
00:06:02.320 | to come and, like, fish out my particular set of data. But if you're astute and looking at this, auth still
00:06:10.720 | happens behind the anonymizer, and so the iPhone provides some sort of auth credentials, and those auth credentials
00:06:14.880 | are obviously tied to the user. So the second thing that Apple does is separates auth. And conceptually,
00:06:20.880 | you think of this as if you're going to the arcade and you want to go spend your money on arcade
00:06:24.960 | machines, you first put your money into the coin machine, you get some coins out. These coins are
00:06:29.920 | anonymous. Now you can go to the machine and no one knows what machines you spend your money on. That's
00:06:34.720 | essentially what happens here. It's called blind signatures. We're not going to have time to get into
00:06:38.640 | it today, but that's what happens. So now the iPhone is making an anonymous request, going through an
00:06:44.640 | anonymizer that's taking everyone's requests. It's kind of like Tor. It's like laundering everyone's data,
00:06:49.120 | so that if someone were to access the system internally, they wouldn't know who it's coming
00:06:55.360 | from. So that gets us non-targetability. The second thing that Apple does is it changes the set of
00:07:00.960 | requests that are happening. The first thing it does is, before it sends its data, it says, "What are you
00:07:05.040 | running?" And if the AI engine replies, "Well, I'm running this and only this," the iPhone might say, "Okay, I trust that." And if that remains
00:07:14.400 | true, then you can run this AI on the data that I'm submitting. This is how they achieve verifiable
00:07:22.000 | transparency. There's a little more subtlety to that, which we are going to get into, but it's
00:07:26.160 | essentially the iPhone says, "I trust the code that you're running. You can only decrypt my data if you're
00:07:29.840 | still running that code." So the iPhone can verify what they're doing. The next thing, no privileged runtime
00:07:35.120 | access. That was easy. Just get rid of SSHD. There's no way of accessing those machines. Enforceable guarantees.
00:07:42.000 | Get rid of the disk. We talked about that. And then stateless computation, again,
00:07:46.800 | with no disk, no access. There's nothing to do with the data other than respond to the iPhone.
00:07:50.960 | And since the iPhone verified the code that was running on this machine, it knows it's not being
00:07:56.880 | logged anywhere before it gives them the data. Okay. So they achieve those five guarantees that I talked
00:08:04.560 | about here using six technical components. And again, we're going to go into two of them.
00:08:09.760 | But I'll describe them all very briefly. Oblivious HTTP is a spec developed by Cloudflare and Apple and
00:08:17.440 | others that allows you to essentially make anonymous requests by laundering your
00:08:25.040 | requests through a third party. So all of the requests that go to Apple's Private Cloud Compute first go through Cloudflare.
00:08:30.480 | So when Apple receives it, it only knows that it came from Cloudflare, not from an individual user's IP address.
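As a rough illustration of that split, here is a toy Python simulation: the relay sees who is asking but only an opaque blob, while the gateway decrypts the blob but never sees the client. This is only the concept; real Oblivious HTTP (RFC 9458) uses HPKE key encapsulation rather than a pre-shared symmetric key, and the helper names here are hypothetical.

```python
# Conceptual sketch of the OHTTP split (hypothetical helpers, not the RFC 9458 wire format).
import os
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

gateway_key = ChaCha20Poly1305.generate_key()   # in real OHTTP this would be an HPKE key the client fetches

def client_encapsulate(request: bytes) -> bytes:
    # Client encrypts the request so the relay can't read it.
    nonce = os.urandom(12)
    return nonce + ChaCha20Poly1305(gateway_key).encrypt(nonce, request, None)

def relay_forward(client_ip: str, blob: bytes) -> bytes:
    # Relay (e.g. Cloudflare) knows the client IP but only sees an opaque blob.
    print(f"relay: forwarding {len(blob)} opaque bytes, dropping source {client_ip}")
    return blob

def gateway_handle(blob: bytes) -> bytes:
    # Gateway (Apple) can decrypt, but the only "source" it sees is the relay.
    nonce, ciphertext = blob[:12], blob[12:]
    return ChaCha20Poly1305(gateway_key).decrypt(nonce, ciphertext, None)

print(gateway_handle(relay_forward("203.0.113.7", client_encapsulate(b"summarize my email"))))
```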
00:08:35.440 | The second thing that they use is blind signatures. Blind signatures is that arcade analogy that I gave
00:08:41.120 | you. But it essentially is a way to auth separately and then prove that you're bearing a valid credential,
00:08:48.000 | but you can't link it to your identity. And again, we don't have time to go into that. But if you want to look
00:08:52.080 | it up, it's a formal spec as well. There's lots of packages and open source libraries that let you use that.
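To make the arcade analogy concrete, here is a toy textbook-RSA blind signature in Python. It uses tiny numbers and no padding or hashing, so it is purely illustrative; the formal spec referred to above is RSA Blind Signatures (RFC 9474).

```python
# Toy blind-signature sketch: the issuer signs a blinded token after checking your
# credentials, but can never link the unblinded token back to your account.
import secrets
from math import gcd

# Hypothetical tiny issuer key (in reality: 2048+ bit RSA held by the auth service).
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))     # issuer's private exponent

token = 99                            # the "arcade coin" the client wants signed

# Client: blind the token with a random factor r before sending it to the issuer.
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (token * pow(r, e, n)) % n

# Issuer: signs the blinded value. It never sees `token`, so it can't link it to the user.
blinded_sig = pow(blinded, d, n)

# Client: unblind to recover a valid signature on the original token.
sig = (blinded_sig * pow(r, -1, n)) % n

# Anyone (e.g. the PCC node) can verify the token without learning who it was issued to.
assert pow(sig, e, n) == token
print("anonymous token verified")
```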
00:08:56.880 | Third component is the secure enclave. The closest equivalent we have in our world, if we're not
00:09:04.800 | programming on Apple hardware, is the TPM, if you've heard of that. But they're essentially a separate piece
00:09:09.680 | of hardware where the private keys are kept. And that makes a guarantee that those keys can never be removed
00:09:16.240 | from the hardware. That's really important because you don't want those keys to be given away.
00:09:23.440 | All of the interactions here are done with keys that prove who the node is.
00:09:28.320 | If the keys could be moved and held by some third party, then they couldn't be trusted. You could
00:09:32.400 | essentially fake everyone out that you are an official AI engine, but actually you're somewhere
00:09:36.960 | else. So the secure enclave helps with that. Again, we won't be getting into those. We're going to get into
00:09:40.720 | these two. The last one is secure boot and hardened operating system. This is like a standard technique,
00:09:46.960 | but it's essentially they run a very limited version of iOS that makes it very difficult to hack or
00:09:52.720 | modify. Everything has to be signed, just like if you've done an iOS app you have to do signatures,
00:09:58.240 | but theirs is like even crazier. Okay, so the ones we're going to talk about are remote attestation.
00:10:04.880 | That was this flow I talked about here. Great. Then the other one is the transparency log. The
00:10:12.160 | transparency log is a record of all of the software that Apple is deploying on their private nodes so
00:10:17.760 | that you can go and verify what's on the record is actually what's being sent to you during the
00:10:22.240 | attestation. Okay, so let's talk about remote attestation very briefly. I'm going to talk about it
00:10:29.360 | abstractly not with iPhones. So you have some client and the client says, "What are you running?"
00:10:34.400 | And the server replies with two things: a set of signed claims and then a public key. And the signed
00:10:40.960 | claims essentially say, "I'm on genuine hardware. I'm running a genuine GPU. I am running this set of
00:10:48.880 | software. I use this bootloader. I use this version of Linux." And then the client gets to look at those
00:10:55.120 | claims and decide whether it trusts that version. It might be like, "Oh, I only trust this version of
00:11:01.520 | the Linux kernel and above." Or, "I only trust that it's been signed by Apple." And if so, it can use this
00:11:09.520 | public key that comes across to encrypt data that is later sent to the server. And this is really important,
00:11:18.560 | which is this public key and these claims are tied together. So during later interactions with the server,
00:11:23.840 | the client will encrypt using the public key and the signed claims. And the server will only be able
00:11:28.400 | to decrypt if it is still matching those signed claims. There's a whole bunch of cryptography that
00:11:33.920 | makes this possible and a bunch of certificate chains and a bunch of like trusts and vendors. But that's
00:11:39.280 | the fundamental idea. And this is what lets you change that black box into something that's a little
00:11:44.080 | more translucent, right? Not just throwing it over the wall. You can kind of see what's going on inside.
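Here is a minimal sketch of that attest-then-encrypt idea in Python, using the `cryptography` package. The field names and flow are hypothetical, not Apple's actual wire format or certificate chains; the point is that the client verifies signed claims against a trust anchor it already holds, and then derives an encryption key that is bound to those exact claims.

```python
# Sketch of remote attestation: verify signed claims, then encrypt to the attested key.
import json, os, hashlib
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey, X25519PublicKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

RAW = serialization.Encoding.Raw, serialization.PublicFormat.Raw

# --- Server side: build an attestation bundle --------------------------------
vendor_key = Ed25519PrivateKey.generate()     # stands in for the hardware vendor's root of trust
node_key = X25519PrivateKey.generate()        # per-node key held in the secure enclave / TPM
claims = {
    "bootloader": "sha256:aaaa...",           # measurements of what the node actually booted
    "os_image": "sha256:bbbb...",
    "node_pubkey": node_key.public_key().public_bytes(*RAW).hex(),
}
claims_bytes = json.dumps(claims, sort_keys=True).encode()
attestation = {"claims": claims_bytes, "sig": vendor_key.sign(claims_bytes)}

# --- Client side: verify the claims, then encrypt to the attested key --------
vendor_pub = vendor_key.public_key()          # the client ships with this trust anchor
vendor_pub.verify(attestation["sig"], attestation["claims"])   # raises if tampered with
verified = json.loads(attestation["claims"])
assert verified["os_image"] == "sha256:bbbb...", "only trust this exact release"

eph = X25519PrivateKey.generate()
shared = eph.exchange(X25519PublicKey.from_public_bytes(bytes.fromhex(verified["node_pubkey"])))
# Bind the key to the exact claims: a node running different software can't derive it.
key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
           info=hashlib.sha256(attestation["claims"]).digest()).derive(shared)
nonce = os.urandom(12)
ciphertext = ChaCha20Poly1305(key).encrypt(nonce, b"user prompt goes here", attestation["claims"])
# (In the full flow the client also sends eph's public key, so only a node still holding
#  node_key inside its enclave, under the attested software, can derive the same key.)
```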
00:11:50.240 | Okay. The second thing is the transparency log. Transparency log is actually very simple conceptually.
00:11:56.240 | It's just a database with records for each software release or each component in a software release signed by a
00:12:03.040 | particular person. So, for example, in this record, Bob added this binary or piece of compiled source code.
00:12:13.200 | And this is the hash of that binary on November 1st, 24. And then that's it. It's just a declaration that
00:12:19.760 | this binary was signed by Bob. Why does that matter? Why would you care about this? Well, first of all,
00:12:26.160 | reviewers can go through and offline look at these binaries that are made publicly available
00:12:33.840 | and verify their behavior. And so then when you get a remote attestation and the remote attestation
00:12:39.600 | says this hash of this binary is there, you can be like, oh yeah, I've already checked this binary.
00:12:43.440 | I believe that it's doing the right thing. So that's the second point,
00:12:48.880 | which is you can check that the remote attestations match what's in the log. And then finally, if you see an
00:12:53.840 | attestation that's not on the log, you know the whole system's been compromised. Because if it's not on the log,
00:12:59.760 | definitely someone is like doing some sort of shenanigans, right? They might have like hijacked
00:13:03.840 | your connection, whatever. And it's just because like a limited set of people can write to this log
00:13:09.600 | and there's no way to modify the log, right? It's append only. It uses like a Merkle tree so that you can't
00:13:14.400 | change the contents. Great. So that is the transparency log.
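As a sketch of the idea, here is a toy hash-chained, append-only log in Python. A real transparency log (Sigsum, Sigstore, or Apple's) uses a full Merkle tree with inclusion and consistency proofs; this only illustrates "records can be appended but not rewritten, and attested software must appear in the log."

```python
# Toy append-only transparency log (hash chain rather than a real Merkle tree).
import hashlib, json

class TransparencyLog:
    def __init__(self):
        self.entries = []            # list of (record, head_hash_after_append)
        self.head = b"\x00" * 32     # hash of the empty log

    def append(self, signer: str, binary_hash: str, date: str) -> bytes:
        record = json.dumps({"signer": signer, "binary": binary_hash, "date": date},
                            sort_keys=True).encode()
        # The new head commits to the old head, so nothing already logged can be rewritten.
        self.head = hashlib.sha256(self.head + record).digest()
        self.entries.append((record, self.head))
        return self.head

    def contains(self, binary_hash: str) -> bool:
        return any(json.loads(r)["binary"] == binary_hash for r, _ in self.entries)

log = TransparencyLog()
log.append("Bob", "sha256:deadbeef...", "2024-11-01")

# Client-side check during attestation: a claim that isn't in the public log means trouble.
attested_os_image = "sha256:deadbeef..."
assert log.contains(attested_os_image), "attested software is not in the public log!"
```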
00:13:23.680 | So let me tell you how this all comes together, right? So remote attestation is this flow. Again, the iPhone first goes through the anonymizer,
00:13:31.760 | requests a remote attestation package and then says, well, if I believe that remote attestation package,
00:13:37.360 | I trust the contents that are running on the server. I can then send my data, and I phrase this as "try to
00:13:43.440 | decrypt the data on the AI engine." Again, if the attestation changed, the AI engine would not be able to decrypt the
00:13:49.920 | data, right? So that's the most important part, right? It says, I'm running this thing. Trust me.
00:13:54.000 | And it says, I trust you. Okay, great. Encrypt it. And I can only decrypt it as long as it's still
00:13:58.160 | running the exact thing I said I trusted. And the second item we talked about is the transparency log,
00:14:04.720 | which is check if the attested claims match the transparency log. And this transparency log we
00:14:09.200 | talked about. So on here, Apple is writing a lot, a lot of software onto the log and then essentially
00:14:14.720 | saying, trust what's on the log, you can verify it offline. And then when the attestation claims come
00:14:20.400 | in, just double check that they do indeed match. Okay, and then I don't have time to get into all of
00:14:28.000 | these. But here are some of the other items that we talked about: the oblivious
00:14:36.240 | HTTP is the anonymizer, blind signatures are the way they do the auth. And then of course,
00:14:42.160 | over here, we have the secure enclave, I kind of put that outside of the AI engine, they're separate
00:14:48.880 | pieces of hardware. And then the hardening is just this little lock, but you know, we don't have time
00:14:53.280 | to get into it. And that's, at a very conceptual level, how Apple's PCC works; you could essentially do a PhD on
00:15:01.440 | each of these. So what are the gaps? What are the downsides? Well, first, you have
00:15:09.760 | to put all of your trust in Apple still, right? On the bright side, like Apple runs their whole supply chain,
00:15:15.360 | they verify the nodes when they get them at their data center, they actually re-sign them with what's
00:15:21.200 | called data center identity keys or something like that, DCIKs. But there's no guarantee that Apple
00:15:28.160 | doesn't share the certs with anyone, or insecurely generate them, or set the private key to one
00:15:33.360 | everywhere. Now, I think they are trying to do their best effort, but you still have to trust. You've
00:15:39.600 | shifted the trust now into like Apple's behavior rather than the hardware. But anyhow,
00:15:45.120 | it's only available on Apple devices for consumer use in official apps. Maybe at some point,
00:15:50.800 | they'll make PCC available to everyone else, but not yet. So what trade-offs does Apple PCC make?
00:15:57.760 | They're limited by latencies to Apple data centers. So they do have local models first that they try and
00:16:05.040 | use, but if those local models aren't adequate, they'll send them to data centers. As we start to do
00:16:11.760 | like real-time voice and other things, this adds a lot more latency
00:16:17.600 | to the system. The compute costs are higher. They're doing a lot more encryption. There's like,
00:16:22.480 | I didn't, I mean, you're not seeing it, but there's like six layers of encryption before it even gets to
00:16:26.400 | the node that actually does the work. So you're spending a little bit more compute
00:16:30.080 | there. Like I told you, no custom models, no fine tuning. The client libraries are very complicated.
00:16:36.000 | The client having to orchestrate all of these requests, this transparency log, this auth,
00:16:42.480 | that's way more complicated than a simple HTTP request, which kind of sucks. And what if
00:16:48.480 | your iPhone goes down after it's authenticated, and then it loses all the authentication keys? Like,
00:16:54.400 | you've essentially like lost all of your state, right? So it's a lot more stateful.
00:16:58.080 | Operationally complex. You can't SSH into the machine and there's no logging. So that's difficult.
00:17:06.080 | Not everyone would sign up for that. You can't do any usage tracking. If you could do usage tracking,
00:17:11.840 | then you'd be identified, right? And so Apple can't like parcel out, you know, you get 2000 tokens.
00:17:18.160 | They do do some fraud and abuse tracking at a very coarse level. But if you wanted to use
00:17:23.840 | a similar architecture and pass your costs on to the customer, you wouldn't
00:17:28.080 | be able to know which customer was doing what, right? And then not open to third party developers.
00:17:33.760 | Okay. What can I learn from this? I gave you the list of six that Apple uses. And here's what's
00:17:40.480 | available in our world. If you're not developing on Apple Silicon and Apple hardware, you still have
00:17:45.440 | oblivious HTTP and blind signatures. There are libraries to do that. So we don't have secure enclaves,
00:17:50.320 | but we have TPMs. Almost all Intel and AMD hardware now has TPMs. And then in the cloud environment,
00:17:56.720 | they have virtual TPMs that provide the same behavior as a TPM. And again, that's where you put a bunch of
00:18:02.560 | your private keys that are tied to that public key that I talked about. These are available for us. Secure boot
00:18:08.560 | and hardened operating system. Remote attestation is kind of available. It's kind of tied to the TPM.
00:18:16.800 | There aren't great standards yet, but there is a little bit of work there. Transparency log. There
00:18:22.560 | are two open ones. One's called Sigsum. The other one's called Sigstore, if you've heard of them.
00:18:29.440 | And then confidential VMs are just becoming available on cloud providers with GPUs. So confidential computing
00:18:36.800 | has been around for a while, but now you also have to have confidential GPUs. And only H100s
00:18:42.880 | and H200s support confidentiality. What that means is that their memory is encrypted. So if you were to
00:18:47.520 | physically go up to the H100 and like try to look at its RAM, you wouldn't be able to see what's going on
00:18:52.800 | there or figure out what's going on there. And then finally, what we have that Apple doesn't have is we
00:18:57.680 | have open source and we have reproducible builds. We have the ability to link the source code to the
00:19:01.440 | binaries. And so we can have security researchers look at the source code as well as, you know, black box
00:19:08.080 | test the binaries and develop confidence in what the server might be running.
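A sketch of what that check looks like in practice, assuming you have reproducibly rebuilt the binary from audited source and have the hash published in a transparency log entry (file name and logged hash below are hypothetical placeholders):

```python
# Reproducible-build check: hash the locally rebuilt binary and compare it against the
# hash the transparency log / remote attestation claims the server is running.
import hashlib

def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()

logged_hash = "sha256:deadbeef..."            # what the log / attestation claims
local_hash = sha256_file("build/ai-engine")   # what our own reproducible build produced

if local_hash == logged_hash:
    print("the server is running exactly the code we audited")
else:
    print("mismatch: the deployed binary is not the one we built from source")
```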
00:19:13.920 | All right, what's next? Okay, so Apple has set the standard for private AI and the market is definitely following.
00:19:21.280 | That was in June of 2024. It wasn't actually released until October of 2024. Azure Open AI,
00:19:26.960 | or sorry, Azure AI, not Azure OpenAI, is doing private inferencing starting in September.
00:19:31.280 | They're still in private preview. And then about a month ago, Meta of all companies, I guess I'm
00:19:36.160 | recorded, Meta of all these great companies, also added private processing, which if you read their blog
00:19:43.200 | post, it's like they copy and pasted this. Maybe they used Llama to rewrite it into their language,
00:19:48.320 | but it's essentially identical, which is great for all of us thinking about privacy. And sure,
00:19:52.960 | I'm sure WhatsApp also doesn't want press releases like the ones I showed earlier.
00:19:56.800 | So I'll just close by saying we're building the same thing. But for everyone else, if you're not on
00:20:03.040 | Apple or you're not in WhatsApp, we have it. It's called Confident Security. And if you'd like to
00:20:09.600 | talk more, let me know. By the way, this is an anti-AI shirt, which means that if you take pictures
00:20:14.640 | of me, it will confuse all the facial recognition stuff. We have others. If you have some cool
00:20:19.920 | questions and want to talk afterward, if I deem it worthy, I will give you an anti-AI shirt. We also
00:20:24.960 | have some other privacy-based swag in the back. So come hit me up. Thanks, everyone.