AI + Machine Learning - Microsoft Open Source Blog

Live demos of machine learning models with ONNX and Hugging Face Spaces

June 6, 2022 5 min read

By Jacky Chen, Software Engineer, AI Frameworks

Choosing which machine learning model to use, sharing a model with a colleague, and quickly trying out a model are all reasons why you may find yourself wanting to quickly run inference on a model. You can configure your environment and download Jupyter notebooks, but it would be nicer if there was a way to...

Scaling-up PyTorch inference: Serving billions of daily NLP inferences with ONNX Runtime

April 19, 2022 8 min read

By Faith Xu, Principal Program Manager, Machine Learning Platform

Scale, performance, and efficient deployment of state-of-the-art Deep Learning models are ubiquitous challenges as applied machine learning grows across the industry. We’re happy to see that the ONNX Runtime Machine Learning model inferencing solution we’ve built and use in high-volume Microsoft products and services also resonates with our open source community, enabling new capabilities that...

Supporting efficient large model training on AMD Instinct™ GPUs with DeepSpeed

March 21, 2022 6 min read

By Olatunji Ruwase, Principal RSDE
Jeff Rasley, Senior Research SDE

This post was co-authored by Jithun Nair and Aswin Mathews, members of technical staff at AMD. In recent years, large-scale deep learning models have demonstrated impressive capabilities, excelling at tasks across natural language processing, computer vision, and speech domains. Companies now use these models to power novel AI-driven user experiences across a whole spectrum of...

Add AI to mobile applications with Xamarin and ONNX Runtime

December 14, 2021 2 min read

By Scott McKay, Principal Software Engineer
Guoyu Wang, Senior Software Engineer

ONNX Runtime now supports building mobile applications in C# with Xamarin. Support for Android and iOS is included in the ONNX Runtime release 1.10 NuGet package. This enables C# developers to build AI applications for Android and iOS to execute ONNX models on mobile devices with ONNX Runtime. ONNX Runtime is the open source project...

ONNX Runtime Web—running your machine learning model in browser

September 2, 2021 5 min read

By Emma Ning, Principal Program Manager, AI Frameworks
Yulong Wang, Senior Software Engineer, AI Frameworks
Du Li, Senior Software Engineer, AI Frameworks

We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime to enable JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will be replacing the soon to be deprecated onnx.js, with improvements such as a more consistent developer...

Accelerate PyTorch training with torch-ort

July 13, 2021 3 min read

By Natalie Kershaw, Senior Program Manager, AI Frameworks, Microsoft

With a simple change to your PyTorch training script, you can now speed up training large language models with torch_ort.ORTModule, running on the target hardware of your choice. Training deep learning models requires ever-increasing compute and memory resources. Today we release torch_ort.ORTModule, to accelerate distributed training of PyTorch models, reducing the time and resources needed...

ONNX Runtime release 1.8.1 previews support for accelerated training on AMD GPUs with the AMD ROCm™ Open Software Platform

July 13, 2021 4 min read

By Weixing Zhang, Principal Software Engineer, AI Frameworks at Microsoft
Suffian Khan, Software Engineer, AI Frameworks at Microsoft

This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce a preview version of ONNX Runtime in...

Simple steps to create scalable processes to deploy ML models as microservices

July 9, 2021 6 min read

By Elena Neroslavskaya

This post was co-authored by Alejandro Saucedo, Director of Machine Learning Engineering at Seldon Technologies. About the co-author: Alejandro leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products with over five million installations. Alejandro is also the Chief Scientist at the Institute for Ethical AI...

Journey to optimize large scale transformer model inference with ONNX Runtime

June 30, 2021 7 min read

By Xiaoyu Liu, Applied Scientist II, Data&AI, Developer Division (DevDiv)
Eric Lin, Senior Researcher SDE, Turing Team
Emma Ning, Principal Program Manager, AI Frameworks

“With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer generative transformer model for code, a.k.a., GPT-C, to empower IntelliCode with the whole line of code completion suggestions in Visual Studio and Visual Studio Code.” Large-scale transformer models, such as GPT-2 and GPT-3, are among the most...

Create privacy-preserving synthetic data for machine learning with SmartNoise

February 18, 2021 5 min read

By Andreas Kopp, Digital Advisor for AI solutions

Watch our webinar on Open Data Science Conference. Read the white paper on SmartNoise Differential Privacy machine learning case studies. The COVID-19 pandemic demonstrates the tremendous importance of sufficient and relevant data for research, causal analysis, government action, and medical progress. However, for understandable data protection considerations, individuals and decision-makers are often very reluctant to share personal or sensitive data...

Accelerate and simplify Scikit-learn model inference with ONNX Runtime

December 17, 2020 5 min read

By Xavier Dupre, Data Scientist at Microsoft
Olivier Grisel, Software engineer at Inria and core contributor to scikit-learn

Scikit-learn is one of the most useful libraries for general machine learning in Python. To minimize the cost of deployment and avoid discrepancies, deploying scikit-learn models to production usually leverages Docker containers and pickle, the object serialization module of the Python standard library. Docker is a good way to create consistent environments and pickle saves...

ONNX Runtime scenario highlight: Vespa.ai integration

December 14, 2020 1 min read

By Faith Xu, Principal Program Manager, Machine Learning Platform

Since its open source debut two years ago, ONNX Runtime has seen strong growth with performance improvements, expanded platform and device compatibility, hardware accelerator support, an extension to training acceleration, and more. We are excited by its broad usage in production, powering more than a hundred models across Microsoft products and services and bringing concrete...
