IntelliCode

Journey to optimize large scale transformer model inference with ONNX Runtime

June 30, 2021 7 min read

By Xiaoyu Liu, Applied Scientist II, Data & AI, Developer Division (DevDiv)
Eric Lin, Senior Researcher SDE, Turing Team
Emma Ning, Principal Program Manager, AI Frameworks

“With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer generative transformer model for code, a.k.a., GPT-C, to empower IntelliCode with the whole line of code completion suggestions in Visual Studio and Visual Studio Code.”

Large-scale transformer models, such as GPT-2 and GPT-3, are among the most…
