A Platform for Secure Analytics and Machine Learning

Introduction to MC²

Born out of research in the UC Berkeley RISE Lab, MC² is a platform for running secure analytics and machine learning on encrypted data. With MC², organizations can safely upload their confidential data to the cloud in encrypted form and securely compute analytics and machine learning without exposing the unencrypted data to the cloud provider. MC² also enables secure collaboration among multiple organizations, where the data owners can use the platform to jointly analyze their collective data without revealing their individual data to each other.

The MC² stack provides a single client interface, as well as the following compute services:

Opaque SQL: Encrypted data analytics on Spark SQL using hardware enclaves
Secure XGBoost: Collaborative XGBoost training and inference on encrypted data using hardware enclaves
Federated XGBoost: Collaborative XGBoost in the federated setting

Figure: Overview of the MC² stack (_images/mc2_stack.jpg)

What are secure enclaves?

In order to provide strong privacy guarantees for user data, MC² leverages secure enclaves, which are a recent advance in computer processor technology that enables the creation of a secure region of memory on an otherwise untrusted machine. Any data or software placed within the enclave is encrypted and isolated from the rest of the system. No other process on the same processor – not even privileged software such as the OS or the hypervisor – can access the encrypted enclave memory.

Since the operating system is untrusted, enclaves provide a feature called remote attestation, which enables clients to cryptographically verify that an enclave in the cloud is running a specific version of the code. This allows remote clients to have confidence that the expected code will be executed on their data instead of a malicious piece of code.
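The attestation check described above can be illustrated with a minimal sketch. It is not the MC² or Open Enclave API: the names `HARDWARE_KEY`, `measure`, `enclave_report`, and `client_verify` are hypothetical, and a shared HMAC key stands in for the processor vendor's asymmetric signing key so that the example needs only the Python standard library. The idea it shows is that the client accepts an enclave only if a signed measurement (a hash of the loaded code) matches the code the client expects.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Hypothetical stand-in for the hardware-protected attestation key. Real
# enclaves sign reports with an asymmetric key rooted in the processor vendor;
# a shared HMAC key is an assumption made to stay within the standard library.
HARDWARE_KEY = secrets.token_bytes(32)


def measure(code: bytes) -> bytes:
    """Return a measurement (SHA-256 digest) of the code loaded into the enclave."""
    return hashlib.sha256(code).digest()


def enclave_report(code: bytes, nonce: bytes) -> Tuple[bytes, bytes]:
    """Enclave side: measure the loaded code and sign (measurement, nonce)."""
    measurement = measure(code)
    signature = hmac.new(HARDWARE_KEY, measurement + nonce, hashlib.sha256).digest()
    return measurement, signature


def client_verify(expected_code: bytes, nonce: bytes,
                  measurement: bytes, signature: bytes) -> bool:
    """Client side: check the signature, then compare against the expected code."""
    expected_sig = hmac.new(HARDWARE_KEY, measurement + nonce, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected_sig):
        return False  # report was not produced by genuine hardware, or is stale
    return hmac.compare_digest(measurement, measure(expected_code))


# The client sends a fresh nonce so that an old report cannot be replayed.
nonce = secrets.token_bytes(16)
measurement, signature = enclave_report(b"trusted service v1", nonce)
accepted = client_verify(b"trusted service v1", nonce, measurement, signature)
```

If the cloud host loaded different code, the measurement would differ and `client_verify` would return `False`, so the client would never send its keys.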

Examples of secure enclave technology include Intel SGX, ARM TrustZone, and AMD Memory Encryption. All major cloud providers support VMs with enclaves (see Microsoft Azure Confidential Computing, GCP Confidential Computing, and AWS Nitro enclaves).

MC² platform’s workflow

The MC² platform builds upon the Open Enclave SDK, an open source SDK that provides a single unified abstraction across different enclave technologies. The use of Open Enclave enables our library to be compatible with many different enclave backends, including Intel SGX.

The diagrams below show a sample workflow of a user running secure data processing with MC² in the cloud. Green indicates a trusted component, while red indicates an untrusted component that could be compromised by an adversary.

Before any query can be executed, the user must perform remote attestation to load the MC² compute service into enclaves and transfer their private keys to the service. This can be done by initializing with the MC² client.
As part of the compute step, the user first uses the MC² client software to encrypt and upload their data to untrusted cloud storage. Next, the user issues compute tasks to an untrusted orchestrator. The orchestrator forwards the requests to the enclave compute service, which reads the relevant encrypted data from the untrusted storage, decrypts it inside the enclave environment using the user’s private key, and runs the user-specified compute tasks.
Finally, the result is returned to the user in encrypted form, which the user can decrypt locally.
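The data flow in these steps can be simulated with a toy example. This is a sketch under stated assumptions, not the MC² client: the `Enclave` class, the `orchestrator` function, and the `encrypt`/`decrypt` helpers are hypothetical, a single symmetric key stands in for the user's private keys, and the cipher is a simple SHA-256 keystream with an HMAC tag chosen only so the example runs with the standard library. It should not be used to protect real data.

```python
import hashlib
import hmac
import json
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a keystream from SHA-256 in counter mode (toy cipher, illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce + integrity tag + XOR-masked body."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Check the integrity tag, then unmask the body."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed integrity check")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))


class Enclave:
    """Trusted component: the only place where data is decrypted."""

    def __init__(self, storage: dict):
        self._storage = storage  # untrusted storage holds ciphertext only
        self._key = b""

    def provision_key(self, key: bytes) -> None:
        # In the workflow above, this happens after remote attestation succeeds.
        self._key = key

    def run(self, name: str, task: str) -> bytes:
        rows = json.loads(decrypt(self._key, self._storage[name]))
        if task != "sum":
            raise ValueError("unsupported task")
        return encrypt(self._key, json.dumps(sum(rows)).encode())


def orchestrator(enclave: Enclave, name: str, task: str) -> bytes:
    """Untrusted component: forwards the request and sees only ciphertext."""
    return enclave.run(name, task)


# Step 1: set up the enclave and hand it the user's key.
storage = {}
key = secrets.token_bytes(32)
enclave = Enclave(storage)
enclave.provision_key(key)

# Step 2: encrypt locally, upload ciphertext, and issue a compute task.
storage["salaries"] = encrypt(key, json.dumps([10, 20, 30]).encode())
encrypted_result = orchestrator(enclave, "salaries", "sum")

# Step 3: decrypt the result locally.
result = json.loads(decrypt(key, encrypted_result))
```

In this sketch, the storage dictionary and the orchestrator only ever handle encrypted bytes; plaintext exists solely on the user's side and inside `Enclave.run`.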

Getting Involved

In addition to building out the MC² platform, we’re continuing to grow the MC² community. Here are some ways to get involved:

Join our community Slack
Star and follow us on GitHub
Email us at mc2-dev@googlegroups.com
