We're going to talk about Apple's Private Cloud Compute. This is an unofficial guide — I don't work at Apple; we'll get to that in a sec. So this is my background: my PhD is in data science and biomedical informatics, and I've sold two companies, one in AI and data, one in cybersecurity and infrastructure.
I'm at South Park Commons, building a company called Confident Security, which we'll get to at the end. But again, disclaimer: I'm not an Apple employee and I'm not speaking on their behalf. Everything I've gleaned is from public sources. And hopefully what we'll learn today is some tools that we can use ourselves.
There are really six key components and some approaches to ensure privacy. And privacy and security are very related — not perfectly overlapping, but related. Before I get there, I know that we're in the security track, but I want to motivate why you might care about privacy. Not everyone believes that you should have privacy.
So let me just give some examples. This year, DeepSeek leaked a million sensitive records of chat logs. And you might say to yourself, well, that's DeepSeek — everyone knew that was going to happen. But before I show you the next piece, I want to poll the audience. How many here care about privacy?
All right. How many of us use ChatGPT? All right. How many consider ChatGPT to be private? Okay, good. How many use the ChatGPT private mode? And how many use the API with zero data retention, rather than the default? Okay, great. Well, as of yesterday, OpenAI has to retain everything anyway, whether or not you flagged it as private.
And I don't want to get into the reasons why they have to do that. But the point is that they have the capability of retaining your private chats — even the ones you flag as private in the UI — and other things. And obviously, being forced to do that is not great.
But this is why we should all care about privacy. Apple doesn't want to have these headlines, because one of their major value props is privacy. So let's talk a little bit about the problems that Apple solved and then how we might use the same ideas. Fundamentally, AI requires more compute than a phone has.
But obviously, they want to bundle AI into their phones, and privacy is a major selling point. Anytime you give up private data to something remote, you're inherently reducing your privacy — anytime I give you my data, it's not as private as it was the second before. So the question that Apple is trying to answer with their PCC system, which is now available on all of our iPhones and used for inference, is: how do you get remote compute while remaining private?
And the simple way to do that would be to buy an H100 for every iPhone, pair them up in the cloud, and everyone gets their own H100 — boom. But obviously, that's far too expensive, so it's not going to work. So they actually need an approach that answers: how do you get remote compute while remaining private and cheap?
Otherwise, it doesn't work. So I'm going to frame the problem this way: you've got an iPhone, and you've got an untrusted remote server that you can't see inside of. It's a black box. Once you give it your data, you have no idea what happens inside. What Apple does is try to make it not a black box, so that the iPhone has some control over what happens to the data inside Apple's remote servers.
And then, hopefully, this trusted remote service is also hard to hack. For the remainder of the talk — and we're not going to be able to get into all of it in 16 minutes — we're going to talk about the requirements Apple set up for PCC, and I'll review a conceptual architecture for how they meet those requirements.
Then we'll go into two of the six specific components, because I don't have time to go through all six and you'd be bored by that point, and then talk about some pros and cons of Apple's approach and how we might use some of those components ourselves. So there are five key requirements that Apple's Private Cloud Compute is designed to meet.
The first one is stateless computation. This is essentially the guarantee that when Apple receives your data, it's only used to satisfy the request and can't be used for anything else — you can't log it, anything like that. The second is enforceable guarantees: the notion that everything is enforced with code, not by some sort of policy. Not "I shouldn't SSH to the instance, but I can SSH to the instance."
No, there's no SSH on the instance. You can't SSH to it. You don't want to save things? Well, don't have a disk, right? So these are what they call enforceable guarantees, not just policies. The third principle or requirement is non-targetability. That means that if you wanted to hack my data on PCC, you'd have to target everyone and sift through all of it rather than having some easy way to find just me.
No privileged runtime access. I just briefly touched on it earlier, but essentially there's no way to bypass these restrictions in production. And then the final one and the most important one is verifiable transparency. Verifiable transparency essentially says we can prove that all of the above items are true. Great.
So let's talk about — again, this is a slightly bigger representation of the black box. In a classic remote system, you have some sort of auth service and then an AI engine, and in this AI engine you have some SRE who can access it and some disk that you can write to.
But again, the iPhone doesn't know what it's sending its data to, and that's the fundamental problem. So let's see how we can change this, at a conceptual level, to get to some of those requirements. The first thing that Apple does is add an anonymizer, and this anonymizer is the first of two parts of non-targetability.
Ideally, Apple can't tell who the data is coming from, so it would be harder for an attacker to come in and fish out my particular set of data. But if you look closely, auth still sits behind the anonymizer: the iPhone provides some sort of auth credentials, and those credentials are obviously tied to the user.
So the second thing that Apple does is separate auth. Conceptually, think of it like going to the arcade: you want to spend your money on the arcade machines, so you first put your money into the coin machine and get some coins out. Those coins are anonymous.
Now you can go to the machines, and no one knows which machines you spend your money on. That's essentially what happens here — it's called blind signatures. We're not going to have time to get into the details right now, but that's what happens. So now the iPhone is making an anonymous request, going through an anonymizer that's mixing in everyone else's traffic.
It's kind of like Tor. It's like laundering everyone's data, so that if someone were to access the system internally, they wouldn't know who it's coming from. So that gets us non-targetability. The second thing that Apple does is it changes the set of requests that are happening. The first thing it does is, before it sends its data, it says, "What are you running?" And if the AI engine replies, "Well, I'm running this and only this," the iPhone might say, "Okay, I trust that." And if that remains true, then you can run this AI on the data that I'm submitting.
This is how they achieve verifiable transparency. There's a little more subtlety to that, which we are going to get into, but it's essentially the iPhone says, "I trust the code that you're running. You can only decrypt my data if you're still running that code." So the iPhone can verify what they're doing.
The next thing, no privileged runtime access — that one was easy: just get rid of sshd, so there's no way of accessing those machines. Enforceable guarantees: get rid of the disk, which we talked about. And then stateless computation — again, with no disk and no access, there's nothing to do with the data other than respond to the iPhone.
And since the iPhone verified the code that was running on this machine, it knows it's not being logged anywhere before it gives them the data. Okay. So they achieve those five guarantees that I talked about here using six technical components. And again, we're going to go into two of them.
But I'll describe them all very briefly. Oblivious HTTP is a spec developed by Cloudflare, Apple, and others that lets you make essentially anonymous requests by laundering them through a third party. Every request to Apple's Private Cloud Compute first goes through Cloudflare, so when Apple receives it, it only knows that it came from Cloudflare, not from an individual user's IP address.
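Just to make that concrete, here's a toy sketch in Python — my own illustration, not the actual OHTTP wire format (that's RFC 9458, built on HPKE, RFC 9180). The relay and gateway are plain functions, the request string and IP are made up, and it uses the `cryptography` package.

```python
# Toy sketch of the Oblivious HTTP idea: the relay sees who you are but not
# what you asked; the gateway sees what you asked but not who you are.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey, X25519PublicKey
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_key(shared_secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"toy-ohttp").derive(shared_secret)

# The gateway (Apple's front end) publishes a long-term public key.
gateway_priv = X25519PrivateKey.generate()
gateway_pub = gateway_priv.public_key()

def client_encapsulate(request: bytes, gateway_pub: X25519PublicKey):
    """Client (the iPhone): encrypt the request so only the gateway can read it."""
    eph_priv = X25519PrivateKey.generate()            # fresh key per request
    key = derive_key(eph_priv.exchange(gateway_pub))
    nonce = os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, request, None)
    return eph_priv.public_key(), nonce, ciphertext   # opaque blob to the relay

def relay_forward(blob, client_ip: str):
    """Relay (e.g. Cloudflare): knows client_ip, cannot read the blob, strips the IP."""
    return blob

def gateway_decapsulate(eph_pub, nonce, ciphertext, gateway_priv):
    """Gateway: can read the request, but only sees that it came from the relay."""
    key = derive_key(gateway_priv.exchange(eph_pub))
    return AESGCM(key).decrypt(nonce, ciphertext, None)

blob = client_encapsulate(b"POST /inference {prompt: ...}", gateway_pub)
forwarded = relay_forward(blob, client_ip="203.0.113.7")
print(gateway_decapsulate(*forwarded, gateway_priv))
```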
The second thing they use is blind signatures. Blind signatures are that arcade analogy I gave you: essentially, a way to authenticate separately and then later prove you're carrying a valid credential, without it being linkable to your identity.
And again, we don't have time to go into the details, but if you want to look it up, it's a formal spec as well, and there are lots of packages and open-source libraries that let you use it — here's a rough sketch of the math, just to make it concrete.
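This is a toy version only — textbook RSA blinding with no padding, just to show the unlinkability property. The real spec is RSA Blind Signatures (RFC 9474), and Apple's deployment layers more on top; the token text here is made up.

```python
# Toy RSA blind signature, the arcade-token idea: the issuer signs a blinded
# token without seeing it, and can later verify the unblinded token without
# being able to link it back to the user who obtained it.
import hashlib, secrets
from cryptography.hazmat.primitives.asymmetric import rsa

issuer = rsa.generate_private_key(public_exponent=65537, key_size=2048)
n = issuer.public_key().public_numbers().n
e = issuer.public_key().public_numbers().e
d = issuer.private_numbers().d

def h(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big")

# --- User: blind the token before showing it to the issuer ---
token = b"one anonymous PCC request"
r = secrets.randbelow(n - 2) + 2             # blinding factor (coprime to n w.h.p.)
blinded = (h(token) * pow(r, e, n)) % n       # issuer never sees h(token)

# --- Issuer: authenticates the *user* (e.g. their account), then signs the
# blinded value; it never learns which token it just signed. ---
blind_sig = pow(blinded, d, n)

# --- User: unblind; the result is a valid signature on the original token ---
sig = (blind_sig * pow(r, -1, n)) % n

# --- Any server: the token bears a real signature, but there's no way to
# tell which signing session (i.e. which user) produced it. ---
assert pow(sig, e, n) == h(token) % n
print("token accepted, unlinkable to the user who obtained it")
```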
The third component is the Secure Enclave. The equivalent in our world, if we're not programming on Apple hardware, is the TPM, if you've heard of those. It's essentially a separate piece of hardware where the private keys are kept, with a guarantee that those keys can never be removed from the hardware. That's really important, because all of the interactions here are done with keys that prove who the parties are, and you don't want those keys to be given away.
If you could move a key out and have some third party hold it, then it wouldn't be trustworthy — you could essentially fake everyone out into thinking you're an official AI engine when you're actually somewhere else. So the Secure Enclave helps with that. Again, we won't be getting into those; we're going to get into these two.
The last one is secure boot and a hardened operating system. This is a fairly standard technique: they run a very limited version of iOS that's very difficult to hack or modify, and everything has to be signed — just like when you build an iOS app, except their requirements are even stricter.
Okay, so the ones we're going to talk about are remote attestation. That was this flow I talked about here. Great. Then the other one is the transparency log. The transparency log is a record of all of the software that Apple is deploying on their private nodes so that you can go and verify what's on the record is actually what's being sent to you during the attestation.
Okay, so let's talk about remote attestation very briefly. I'm going to talk about it abstractly not with iPhones. So you have some client and the client says, "What are you running?" And the server replies with two things: a set of signed claims and then a public key. And the signed claims essentially say, "I'm on genuine hardware.
I'm running a genuine GPU. I am running this set of software. I use this bootloader. I use this version of Linux." And then the client gets to look at those claims and decide whether it trusts that version. It might be like, "Oh, I only trust this version of the Linux kernel and above." Or, "I only trust that it's been signed by Apple." And if so, it can use this public key that comes across to encrypt data that is later sent to the server.
And this is really important, which is this public key and these claims are tied together. So during later interactions with the server, the client will encrypt using the public key and the signed claims. And the server will only be able to decrypt if it is still matching those signed claims.
There's a whole bunch of cryptography that makes this possible — certificate chains and trust in hardware vendors — but that's the fundamental idea. And this is what lets you turn that black box into something a little more translucent: you can see some of what's going on inside, rather than just throwing your data over the wall. Here's a rough sketch of the flow.
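This is my own abstraction, not Apple's wire format. The claim fields, hashes, and the TRUSTED_MEASUREMENTS policy are all made up for illustration; in the real system the device key chains up to Apple's hardware certificates, and the client then encrypts its data to the bound key.

```python
# Sketch of the attestation handshake: the node signs its claims *together
# with* a fresh public key, and the client only encrypts to that key if it
# trusts the claims.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

# --- Server (the PCC node). In reality this identity key lives in the Secure
# Enclave and chains to a vendor certificate; here it's just generated. ---
device_identity = Ed25519PrivateKey.generate()
session_priv = X25519PrivateKey.generate()         # key the client will encrypt to
session_pub = session_priv.public_key().public_bytes(
    serialization.Encoding.Raw, serialization.PublicFormat.Raw)

claims = {
    "hardware": "genuine",
    "os_image_sha256": "ab12...",                   # hash of the booted OS image
    "model_sha256": "cd34...",                      # hash of the AI model bundle
}
payload = json.dumps(claims, sort_keys=True).encode() + session_pub
attestation = {"claims": claims, "session_pub": session_pub,
               "sig": device_identity.sign(payload)}

# --- Client (the iPhone). It decides which measurements it trusts, e.g. by
# checking them against the transparency log. ---
TRUSTED_MEASUREMENTS = {"ab12...", "cd34..."}       # assumed, for illustration

def verify_attestation(att, device_pub):
    msg = json.dumps(att["claims"], sort_keys=True).encode() + att["session_pub"]
    try:
        device_pub.verify(att["sig"], msg)           # claims and key are bound
    except InvalidSignature:
        return None
    c = att["claims"]
    if c["hardware"] != "genuine":
        return None
    if not {c["os_image_sha256"], c["model_sha256"]} <= TRUSTED_MEASUREMENTS:
        return None
    return att["session_pub"]                        # safe to encrypt to this key

bound_key = verify_attestation(attestation, device_identity.public_key())
print("encrypt request to node" if bound_key else "refuse to send data")
```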
Okay. The second thing is the transparency log. The transparency log is actually very simple conceptually: it's just a database with a record for each software release — or each component of a software release — signed by a particular person. So, for example, in this record, Bob added this binary, this piece of compiled source code.
And this is the hash of that binary, on November 1st, 2024. That's it — it's just a declaration that this binary was signed by Bob. Why does that matter? Why would you care about this? Well, first of all, reviewers can go through these binaries, which are made publicly available, and verify their behavior offline.
So then, when you get a remote attestation and it says the hash of this binary is there, you can say: oh yeah, I've already checked this binary, and I believe it's doing the right thing. That's the second point — you can check that the remote attestations match what's in the log.
And then finally, if you see an attestation that's not on the log, you know the whole system's been compromised. Because if it's not on the log, someone is definitely doing some sort of shenanigans — they might have hijacked your connection, whatever. That works because only a limited set of people can write to this log, and there's no way to modify it.
It's append-only, and it uses a Merkle tree so that you can't change the contents. Great. So that is the transparency log — here's a minimal sketch of the idea.
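This is a toy append-only list with a Merkle root; the entries are made up. Real logs — Apple's, Sigstore, Sigsum — add signed checkpoints, inclusion proofs, and consistency proofs so you can tell an honest append from a rewrite of history.

```python
# Minimal transparency-log sketch: append-only entries, a Merkle root that
# auditors pin, and a check that an attested hash actually appears in the log.
import hashlib

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

log = [
    b"bob|os_image|sha256:ab12...|2024-11-01",
    b"alice|model_bundle|sha256:cd34...|2024-11-03",
    b"bob|boot_loader|sha256:ef56...|2024-11-07",
]

def merkle_root(leaves):
    level = [H(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                     # duplicate last node on odd levels
            level.append(level[-1])
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Auditors fetch the log, review the published binaries offline, pin the root.
pinned_root = merkle_root(log)

# Later, an attestation claims the node is running "sha256:cd34...".
# The client checks that this claim appears in the log it (or an auditor) pinned.
attested = b"sha256:cd34..."
in_log = any(attested in entry for entry in log)
print("claim is on the log" if in_log else "not on the log: assume compromise")

# Tampering with history changes the root, so a rewrite is detectable.
# (Real logs use consistency proofs to distinguish appends from rewrites.)
tampered = [log[0], b"mallory|os_image|sha256:evil|2024-11-03"] + log[2:]
assert merkle_root(tampered) != pinned_root
```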
So let me tell you how this all comes together. Remote attestation is this flow: the iPhone first goes through the anonymizer, requests a remote attestation package, and then says, well, if I believe that attestation package, I trust the code that's running on the server. I can then send my data — and I phrase this as "try to decrypt the data on the AI engine," because if the attestation changed, the AI engine would not be able to decrypt the data. That's the most important part. The server says: I'm running this thing, trust me.
And the iPhone says: I trust you, okay, great, here's my encrypted data — and you can only decrypt it as long as you're still running the exact thing I said I trusted. The second item we talked about is the transparency log, which is: check whether the attested claims match the transparency log.
And this is the transparency log we talked about. Apple writes a lot of software onto the log and then essentially says: trust what's on the log, you can verify it offline. Then, when the attestation claims come in, just double-check that they do indeed match.
Okay, and I don't have time to get into all of these, but here are some of the other items we talked about: Oblivious HTTP is the anonymizer, and blind signatures are the way they do the auth. And then of course, over here, we have the Secure Enclave — I put that outside the AI engine because it's a separate piece of hardware.
And the hardening is just this little lock, but we don't have time to get into it. That's how Apple's PCC works at a very conceptual level — you could essentially do a PhD on each of these pieces. So what are the gaps? What are the downsides?
Well, first, you still have to put all of your trust in Apple. On the bright side, Apple runs their whole supply chain, they verify the nodes when they arrive at the data center, and they actually re-sign them with what are called data center identity keys, DCIKs.
But there's no guarantee that Apple doesn't share the certs with anyone, or generate them insecurely, or set the private key to one everywhere. Now, I think they're making a best effort, but you still have to trust them — you've shifted the trust into Apple's behavior rather than the hardware.
But anyhow, it's only available on Apple devices, for consumer use, in official apps. Maybe at some point they'll make PCC available to everyone else, but not yet. So what trade-offs does Apple's PCC make? They're limited by latency to Apple data centers. They do try local models first, but if those local models aren't adequate, they'll send the request to the data centers.
As we start to do real-time voice and other things, this adds a lot more latency to the system. The compute costs are higher, too: there's a lot more encryption going on — you're not seeing it here, but there are something like six layers of encryption before the request even gets to the node that actually does the work.
So you're spending a bit more compute there. Like I told you, no custom models, no fine-tuning. The client libraries are very complicated: the client has to orchestrate all of these requests, the transparency log, the auth — that's way more complicated than a simple HTTP request, which kind of sucks.
And what if your iPhone goes down after it's authenticated and loses all of its authentication keys? You've essentially lost all of your state, so the client is a lot more stateful. It's also operationally complex: you can't SSH into the machines and there's no logging, which makes things difficult — not everyone would sign up for that.
You can't do any usage tracking — if you could, you'd be identifiable. So Apple can't parcel out, say, 2,000 tokens per user. They do some fraud and abuse tracking at a very coarse level, but if you wanted to use a similar architecture and pass your costs on to the customer, you wouldn't be able to know which customer was doing what.
And then it's not open to third-party developers. Okay, so what can we learn from this? I gave you the list of six components that Apple uses, and here's what's available in our world. If you're not developing on Apple Silicon and Apple hardware, you still have Oblivious HTTP and blind signatures — there are libraries for both.
We don't have Secure Enclaves, but we have TPMs. Almost all Intel and AMD hardware now has a TPM, and in cloud environments there are virtual TPMs that provide the same behavior. Again, that's where you keep the private keys that are tied to the public key I talked about — here's roughly what a TPM is doing during measured boot.
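Conceptually, measured boot is just repeated hashing into a register. This is a toy sketch with made-up component names; on real hardware you'd drive it with tpm2-tools or a TSS library, and the resulting quote is signed by a key that never leaves the chip.

```python
# Toy sketch of TPM PCR extension: each boot component gets hashed into a
# Platform Configuration Register, and a verifier recomputes the expected
# value from the software stack it trusts.
import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    # PCR_new = SHA-256(PCR_old || SHA-256(component))
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

boot_chain = (b"bootloader v1.2", b"kernel 6.9.1", b"model_runner cd34...")

pcr0 = bytes(32)                              # PCRs start at zero on boot
for component in boot_chain:
    pcr0 = extend(pcr0, component)

# The verifier recomputes the expected value from the components it trusts
# and compares it with the (signed) quote from the TPM.
expected = bytes(32)
for component in boot_chain:
    expected = extend(expected, component)

assert pcr0 == expected
print("PCR value matches the software stack the verifier expected")
```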
Secure boot and hardened operating systems are available to us too. Remote attestation is sort of available — it's tied to the TPM; there aren't great standards yet, but there's some work there. For the transparency log, there are two open ones: one's called Sigsum, the other is Sigstore.
Maybe you've heard of them, maybe not. And then confidential VMs with GPUs are just becoming available on cloud providers. Confidential computing has been around for a while, but now you also need confidential GPUs, and only the H100 and H200 support confidential computing. What that means is that their memory is encrypted.
So if you were to physically go up to the H100 and try to read its RAM, you wouldn't be able to see or figure out what's going on in there. And then finally, what we have that Apple doesn't have is open source and reproducible builds.
We have the ability to link the source code to the binaries, so we can have security researchers look at the source code as well as black-box test the binaries, and develop confidence in what the server might be running — the check itself looks something like the sketch below.
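The idea is simple: build the open-source release yourself, hash the output, and compare it against the hash published in the transparency log or presented in an attestation. The path and hash here are hypothetical.

```python
# Reproducible-build check: does my own build of the audited source produce
# the same binary that the log says the server is running?
import hashlib

def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

published_hash = "cd34..."                        # from the transparency log entry
local_hash = sha256_file("dist/model_runner")     # your own reproducible build

if local_hash == published_hash:
    print("binary matches the source we audited")
else:
    print("mismatch: the published binary is not what the source produces")
```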
All right, what's next? Apple has set the standard for private AI, and the market is definitely following. PCC was announced in June of 2024 but wasn't actually released until October of 2024. Azure AI — not Azure OpenAI — started doing private inferencing in September; they're still in private preview. And then about a month ago, Meta, of all companies — I guess I'm being recorded — Meta, of all these great companies, also added Private Processing, and if you read their blog post, it's like they copy-and-pasted Apple's.
Maybe they used Llama to rewrite it in their own language, but it's essentially identical — which is great for all of us thinking about privacy, and I'm sure WhatsApp also doesn't want headlines like the ones I showed earlier. So I'll just close by saying we're building the same thing.
But for everyone else — if you're not on Apple or you're not on WhatsApp — we have it. It's called Confident Security, and if you'd like to talk more, let me know. By the way, this is an anti-AI shirt, which means that if you take pictures of me, it will confuse all the facial recognition stuff.
We have more of them. If you have a cool question and want to talk afterward, and I deem it worthy, I'll give you an anti-AI shirt. We also have some other privacy-themed swag in the back, so come hit me up. Thanks, everyone.