Microsoft Open Source Blog

Posts

ONNX Runtime Web—running your machine learning model in browser 

We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime to enable JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will be replacing the soon-to-be-deprecated onnx.js, with improvements such as a more consistent developer...

Accelerate PyTorch training with torch-ort 

With a simple change to your PyTorch training script, you can now speed up training large language models with torch_ort.ORTModule, running on the target hardware of your choice. Training deep learning models requires ever-increasing compute and memory resources. Today we release torch_ort.ORTModule, to accelerate distributed training of PyTorch models, reducing the time and resources needed...

ONNX Runtime release 1.8.1 previews support for accelerated training on AMD GPUs with the AMD ROCm™ Open Software Platform 

This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce a preview version of ONNX Runtime in...

Simple steps to create scalable processes to deploy ML models as microservices 

This post was co-authored by Alejandro Saucedo, Director of Machine Learning Engineering at Seldon Technologies. About the co-author: Alejandro leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products with over five million installations. Alejandro is also the Chief Scientist at the Institute for Ethical AI...

Journey to optimize large scale transformer model inference with ONNX Runtime 

“With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer generative transformer model for code, a.k.a., GPT-C, to empower IntelliCode with the whole line of code completion suggestions in Visual Studio and Visual Studio Code.” Large-scale transformer models, such as GPT-2 and GPT-3, are among the most...

Create privacy-preserving synthetic data for machine learning with SmartNoise 

Watch our webinar on Open Data Science Conference. Read the white paper on SmartNoise Differential Privacy machine learning case studies. The COVID-19 pandemic demonstrates the tremendous importance of sufficient and relevant data for research, causal analysis, government action, and medical progress. However, for understandable data protection considerations, individuals and decision-makers are often very reluctant to share personal or sensitive data....

Accelerate and simplify Scikit-learn model inference with ONNX Runtime 

Scikit-learn is one of the most useful libraries for general machine learning in Python. To minimize the cost of deployment and avoid discrepancies, deploying scikit-learn models to production usually leverages Docker containers and pickle, the object serialization module of the Python standard library. Docker is a good way to create consistent environments and pickle saves...

ONNX Runtime scenario highlight: Vespa.ai integration 

Since its open source debut two years ago, ONNX Runtime has seen strong growth with performance improvements, expanded platform and device compatibility, hardware accelerator support, an extension to training acceleration, and more. We are excited by its broad usage in production, powering more than a hundred models across Microsoft products and services and bringing concrete...

Adding RoBERTa NLP to the ONNX model zoo for natural language predictions 

In summer 2019, I worked as a high school intern for the ONNX AI team at Microsoft and loved working on various projects with the team, including the BERT text classification model. However, due to COVID-19, the Microsoft Internship Program for high school students was canceled in the summer of 2020. This led two other...

How to deploy Elastic Cloud on Microsoft Azure 

From startups to the global 2000, Elastic powers search solutions for thousands of companies worldwide to find documents, monitor infrastructure, protect against security threats, and more. With Elastic Cloud managed services on Azure, you have the power of Elastic Enterprise Search, Elastic Observability, and Elastic Security. You can quickly and easily deploy as a managed...

Introducing ONNX Runtime mobile – a reduced size, high performance package for edge devices 

ONNX Runtime is an open source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce ONNX Runtime release v1.5 as part of our AI at Scale initiative. This release includes ONNX Runtime mobile, a new feature targeting smartphones and other...

Accelerate traditional machine learning models on GPU with ONNX Runtime 

With the growing trend towards deep learning techniques in AI, there are many investments in accelerating neural network models using GPUs and other specialized hardware. However, many models used in production are still based on traditional machine learning libraries or sometimes a combination of traditional machine learning (ML) and DNNs. We’ve previously shared the performance...