Privacy Preserving Machine Learning - Resources & Materials

A compiled list of resources and materials for PPML.

Posted by : khoaguin on Jan 4, 2022

About

This is a compiled list of resources and materials for PPML.
The list is also maintained in a GitHub repo. Please leave a star if you find it useful.

Survey Papers

[Arxiv’20] Privacy in Deep Learning: A Survey
[Arxiv’20] SoK: Training Machine Learning Models over Multiple Sources with Privacy Preservation
[IEEEAccess’20] Privacy-Preserving Deep Learning on Machine Learning as a Service: A Comprehensive Survey
[PETS’21] SoK: Privacy-Preserving Computation Techniques for Deep Learning
[PETS’21] SoK: Efficient Privacy Preserving Clustering

Cryptographic-based Approaches

Training Phase

Homomorphic Encryption (HE) & Functional Encryption (FE)

[BMC medical genomics’18] Privacy-preserving logistic regression training
[Arxiv’19] CryptoNN: Training Neural Networks over Encrypted Data (Functional Encryption). Code (Python)
[PETS’18] CryptoDL: Privacy-preserving Machine Learning as a Service (Unofficial Code) (C++/Python)
[IEEE/CVF’19] Towards Deep Neural Network Training on Encrypted Data
[Arxiv’20] Neural Network Training With Homomorphic Encryption
[Arxiv’20] PrivFT: Private and Fast Text Classification with Homomorphic Encryption

HE-based Hybrid Techniques

[NeurIPS’20] Glyph: Fast and Accurately Training Deep Neural Networks on Encrypted Data (Switch between HE schemes: TFHE and BGV)
[NDSS’21] POSEIDON: Privacy-Preserving Federated Neural Network Learning (Federated Learning + HE) (Code is confidential)

Secure Multi-Party Computation (SMPC)

[S&P’17] SecureML: A System for Scalable Privacy-Preserving Machine Learning. Code (C++)
[CCS’19] QUOTIENT: two-party secure neural network training and prediction
[Arxiv’19] CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning
[PETS’20] Falcon: Honest-Majority Maliciously Secure Framework for Private Deep Learning. Code (C++)
[ICPP’20] ParSecureML: An Efficient Parallel Secure Machine Learning Framework on GPUs. Code (C++)
[PoPETs’20] FLASH: Fast and Robust Framework for Privacy-preserving Machine Learning.
[NDSS’20] BLAZE: Blazing Fast Privacy-Preserving Machine Learning
[USENIX’21] Cerebro: A Platform for Multi-Party Cryptographic Collaborative Learning. Code (Python)
[USENIX’21] Fantastic Four: Honest-Majority Four-Party Secure Computation With Malicious Security. Code (Python)
[S&P’21] CryptGPU: Fast Privacy-Preserving Machine Learning on the GPU. Code (Python)
[USENIX’21] SWIFT: Super-fast and Robust Privacy-Preserving Machine Learning
[Arxiv’21] Adam in Private: Secure and Fast Training of Deep Neural Networks with Adaptive Moment Estimation
[Arxiv’21] Secure Quantized Training for Deep Learning. Code (Python)
[USENIX’21] ABY2.0: Improved Mixed-Protocol Secure Two-Party Computation. Code (C++)

SMPC-based Hybrid Techniques

[PETS’18] SecureNN: 3-Party Secure Computation for Neural Network Training. Code (C++)
[CCS’18] ABY3: A Mixed Protocol Framework for Machine Learning. Code (C++)
[NDSS’20] Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning
[Arxiv’21] Tetrad: Actively Secure 4PC for Secure Training and Inference
[ICLR’21] MPCLeague: Robust 4-party Computation for Privacy-preserving Machine Learning
[PETS’22] AriaNN: Low-Interaction Privacy-Preserving Deep Learning via Function Secret Sharing. Code (Python)

Hybrid Techniques

[IACR Cryptol’17] Private Collaborative Neural Network Learning (SMPC + DP)

Inference Phase

Non-cryptographic-based Approaches

Federated Learning

[AISTATS’17] Communication-Efficient Learning of Deep Networks from Decentralized Data (FedAvg)
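To give a feel for what FedAvg does on the server side, here is a minimal sketch of the aggregation step only: client model weights are averaged, weighted by how many local examples each client trained on. The function name `fed_avg` and the flat-list weight representation are assumptions made for illustration; the full algorithm in the paper also covers client sampling and local SGD, which are not shown here.

```python
def fed_avg(client_weights, client_sizes):
    """Weighted average of client weight vectors.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training examples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so its weights count three times as much in the average.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # [2.5, 3.5]
```

Only the weights leave each client, never the raw data, which is the property that makes this a privacy-motivated training scheme.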

Courses

Private AI Series (OpenMined)
MIT 6.875: Foundations of Cryptography (MIT, Fall 2021) - Lecture Notes
Privacy in Statistics and Machine Learning (Boston University, Spring 2021)
Algorithms for Private Data Analysis (University of Waterloo, Fall 2020)
Privacy Preserving Machine Learning (University of Lille, 2021)

Frameworks

PySyft (Python): decouples private data from model training (using FL, DP, HE, SMPC…)

HE

TenSEAL (Python): A library for doing homomorphic encryption operations on tensors
concrete (Rust): zama.ai’s implementation of a variant of the TFHE scheme, whose security is based on the Learning With Errors (LWE) and Ring Learning With Errors (RLWE) problems
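To illustrate what "homomorphic" means for an LWE-based scheme, here is a toy symmetric-key LWE sketch in plain Python. It is not the API of TenSEAL or concrete, and it is not the TFHE scheme itself; all names and parameters (`Q`, `T`, `N`, `keygen`, `encrypt`, `add`, `decrypt`) are assumptions chosen for readability, and the parameters are far too small to be secure. It only shows that adding two ciphertexts yields an encryption of the sum of the messages, as long as the accumulated noise stays small.

```python
import random

Q = 2**16        # ciphertext modulus
T = 16           # plaintext modulus: messages live in Z_16
DELTA = Q // T   # scaling factor that lifts the message above the noise
N = 32           # secret dimension (toy value, not secure)


def keygen(rng):
    """Sample a binary secret vector."""
    return [rng.randrange(2) for _ in range(N)]


def inner(a, s):
    """Inner product of a and s modulo Q."""
    return sum(ai * si for ai, si in zip(a, s)) % Q


def encrypt(secret, message, rng):
    """Return (a, b) with b = <a, s> + DELTA * message + small noise (mod Q)."""
    a = [rng.randrange(Q) for _ in range(N)]
    noise = rng.randint(-4, 4)
    b = (inner(a, secret) + DELTA * (message % T) + noise) % Q
    return a, b


def add(ct1, ct2):
    """Homomorphic addition: add the ciphertexts component-wise."""
    (a1, b1), (a2, b2) = ct1, ct2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q


def decrypt(secret, ct):
    """Remove the mask <a, s>, then round away the noise."""
    a, b = ct
    phase = (b - inner(a, secret)) % Q
    return ((phase + DELTA // 2) // DELTA) % T


rng = random.Random(0)
secret = keygen(rng)
ct_sum = add(encrypt(secret, 5, rng), encrypt(secret, 9, rng))
print(decrypt(secret, ct_sum))  # 14
```

Each addition also adds the noise terms, which is why real schemes need noise management (for example, bootstrapping in TFHE) to support long computations.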

SMPC

CrypTen (Python): a framework for Privacy Preserving Machine Learning built on PyTorch
MP-SPDZ (C++): software to benchmark various secure multi-party computation (MPC) protocols such as SPDZ, SPDZ2k, MASCOT, Overdrive, BMR garbled circuits, Yao’s garbled circuits, and computation based on three-party replicated secret sharing as well as Shamir’s secret sharing (with an honest majority)
MOTION (C++): a Framework for Mixed-Protocol Multi-Party Computation
ABY (C++): combines arithmetic sharing, boolean sharing, and Yao’s garbled circuits, and provides protocols to convert between these three representations for 2 parties.
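Most of the SMPC frameworks above build on some form of secret sharing. As a rough illustration, here is a minimal sketch of additive secret sharing over a ring in plain Python. It does not use the API of any framework listed here; the function names (`share`, `reconstruct`, `add_shared`) and the choice of modulus are assumptions for illustration, and the sketch covers only semi-honest addition, not multiplication or malicious security.

```python
import secrets

MODULUS = 2**64  # ring size; an assumption, real frameworks choose their own


def share(value, n_parties=3):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares to recover the secret."""
    return sum(shares) % MODULUS


def add_shared(x_shares, y_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(x_shares, y_shares)]


x_shares = share(20)
y_shares = share(22)
print(reconstruct(add_shared(x_shares, y_shares)))  # 42
```

Any subset of fewer than all the shares is uniformly random and reveals nothing about the secret. Multiplication of shared values is where the protocols differ and where most of the communication cost comes from.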

Other Resources

About Khoa Nguyen

I am a deep learning engineer who is passionate about building privacy-preserving medical AI applications.

Email : dkn.work@protonmail.com

Website : https://khoaduynguyen.com
