Young Scholar TechTalk – Secure and High-performance AI Serving: Protecting AI Secrets, Accelerating AI Insights
September 19, 2023 (Tuesday) 4:30-5:30pm
Driven by the remarkable success of artificial intelligence (AI) and edge computing, well-trained private AI models are increasingly deployed on third-party edge devices for mission-critical applications. Safeguarding these private models on untrusted devices, while also speeding up model serving (i.e., inference) with accelerators such as GPUs, has become increasingly urgent.
We introduce SOTER, a new AI serving system that, for the first time, achieves both high security and high performance. Harnessing the associativity property of AI operators, SOTER takes a new approach: it transforms computationally expensive AI operators into parameter-morphed equivalents that execute securely on untrusted but fast GPUs, and then losslessly restores the inference results within trusted execution environments (TEEs) on CPUs. Experimental results on six prevalent AI models in the three most popular categories show that, even with stronger model protection, SOTER achieves performance comparable to the baselines while retaining the same high accuracy as insecure AI model inference.
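The morph-and-restore idea can be illustrated with a single linear operator. The sketch below is only a toy under stated assumptions, not SOTER's actual implementation: it assumes the morphing is a multiplication of the weights by a secret scalar `mu`, so that (mu·W)·x = mu·(W·x) and the true result is recovered by dividing by `mu`. The function names (`morph`, `restore`, `secure_linear`) and the range of `mu` are hypothetical, and the pure-Python `matvec` merely stands in for a GPU kernel.

```python
import random


def matvec(weights, x):
    """Matrix-vector product; stands in for a linear operator run on the untrusted GPU."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def morph(weights, mu):
    """Assumed TEE-side step: scale every parameter by the secret scalar mu."""
    return [[mu * w for w in row] for row in weights]


def restore(morphed_output, mu):
    """Assumed TEE-side step: undo the scaling, since (mu * W) x = mu * (W x)."""
    return [y / mu for y in morphed_output]


def secure_linear(weights, x, rng):
    """Run one linear operator with morphed parameters and restore the result."""
    mu = rng.uniform(1.0, 10.0)      # secret scalar; the range is a hypothetical choice
    morphed = morph(weights, mu)     # only morphed weights leave the trusted side
    gpu_out = matvec(morphed, x)     # the expensive part runs on the untrusted device
    return restore(gpu_out, mu)      # true output recovered on the trusted side


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [0.5, -1.0]
    print(matvec(W, x))                              # plain result
    print(secure_linear(W, x, random.Random(0)))     # restored result
```

In floating point the restored output matches the plain one only up to rounding error; the sketch also ignores non-linear operators and how the secret scalar would be managed in a real system.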