SOTERIA: In Search of Efficient Neural Networks for Private Inference Anshul Aggarwal, Trevor E. Carlson, Reza Shokri, Shruti Tople National University of Singapore (NUS), Microsoft Research {anshul, trevor, reza}@comp.nus.edu.sg,
[email protected] Abstract paper, we focus on private computation of inference over deep neural networks, which is the setting of machine learning-as- ML-as-a-service is gaining popularity where a cloud server a-service. Consider a server that provides a machine learning hosts a trained model and offers prediction (inference) ser- service (e.g., classification), and a client who needs to use vice to users. In this setting, our objective is to protect the the service for an inference on her data record. The server is confidentiality of both the users’ input queries as well as the not willing to share the proprietary machine learning model, model parameters at the server, with modest computation and underpinning her service, with any client. The clients are also communication overhead. Prior solutions primarily propose unwilling to share their sensitive private data with the server. fine-tuning cryptographic methods to make them efficient We consider an honest-but-curious threat model. In addition, for known fixed model architectures. The drawback with this we assume the two parties do not trust, nor include, any third line of approach is that the model itself is never designed to entity in the protocol. In this setting, our first objective is to operate with existing efficient cryptographic computations. design a secure protocol that protects the confidentiality of We observe that the network architecture, internal functions, client data as well as the prediction results against the server and parameters of a model, which are all chosen during train- who runs the computation.