Posts

ONNX Runtime release 1.8.1 previews support for accelerated training on AMD GPUs with the AMD ROCm™ Open Software Platform 

4 min read

This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce a preview version of ONNX Runtime in…Read more

Simple steps to create scalable processes to deploy ML models as microservices 

7 min read

This post was co-authored by Alejandro Saucedo, Director of Machine Learning Engineering at Seldon Technologies. About the co-author: Alejandro leads teams of machine learning engineers focused on the scalability and extensibility of machine learning deployment and monitoring products with over five million installations. Alejandro is also the Chief Scientist at the Institute for Ethical AI…Read more

Journey to optimize large scale transformer model inference with ONNX Runtime 

7 min read

“With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer generative transformer model for code, a.k.a., GPT-C, to empower IntelliCode with the whole line of code completion suggestions in Visual Studio and Visual Studio Code.” Large-scale transformer models, such as GPT-2 and GPT-3, are among the most…Read more

How to migrate and modernize Linux workloads and open source databases to Azure 

3 min read

With extensive support for all major Linux distributions, including Red Hat, SUSE, Ubuntu, CentOS, and Debian, and managed platform-as-a-service (PaaS) offerings for open source databases like Azure Database for MySQL, Azure Database for PostgreSQL, and Azure Database for MariaDB—it’s no surprise that Linux is the fastest growing platform on Azure. Furthermore, Azure Migrate makes the discovery,…Read more

Empowering you to achieve more with open source on Azure 

1 min read

At Microsoft, we are taking cloud architecture to the next level, and our open cloud reduces the friction for developers to get applications up and running. We give developers the autonomy and control to flexibly choose their infrastructure, with options to build, migrate, and deploy across multiple environments on-premises, in the cloud,…Read more

ONNX Runtime 1.8: mobile, web, and accelerated training 

2 min read

The v1.8 release of ONNX Runtime includes many exciting new features. This release launches ONNX Runtime machine learning model inferencing acceleration for the Android and iOS mobile ecosystems (previously in preview) and introduces ONNX Runtime Web. The release also debuts official packages for accelerating model training workloads in PyTorch. ONNX Runtime is a cross-platform runtime…Read more

Delivering reliable production experiences with PyTorch Enterprise on Microsoft Azure 

3 min read

At Microsoft, we use PyTorch to power products such as Bing and Azure Cognitive Services and we actively contribute to several PyTorch open-source projects, including PyTorch Profiler, ONNX Runtime, DeepSpeed, and more. Today, we’re announcing a new initiative in collaboration with Facebook—the PyTorch Enterprise Support Program. This new program enables service providers to develop and…Read more

Making eBPF work on Windows 

3 min read

eBPF is a well-known and revolutionary technology—providing programmability, extensibility, and agility. eBPF has been applied to use cases such as denial-of-service protection and observability. Over time, a significant ecosystem of tools, products, and experience has been built up around eBPF. Although support for eBPF was first implemented in the Linux kernel, there has been increasing…Read more

Optimizing BERT model for Intel CPU Cores using ONNX Runtime default execution provider

5 min read

This blog was co-authored with Manash Goswami, Principal Program Manager, Machine Learning Platform. ONNX Runtime, powered by Intel® Deep Learning Boost: Vector Neural Network Instructions (Intel® DL Boost: VNNI), greatly improves the performance of machine learning model execution for developers. In the past, machine learning models mostly relied on 32-bit floating…Read more
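The shift away from 32-bit floating point that this post describes rests on linear quantization: mapping float values to 8-bit integers via a scale and zero point, which is the numeric form that VNNI instructions accelerate. The sketch below illustrates the arithmetic only; the function names are illustrative and are not the ONNX Runtime quantization API.

```python
# Minimal sketch of asymmetric linear quantization (fp32 -> uint8),
# the representation that int8 inference paths such as VNNI operate on.
# Names here are illustrative, not the ONNX Runtime API.

def quantize(values, num_bits=8):
    """Map a list of floats onto the integer range [0, 2**num_bits - 1]."""
    qmax = (1 << num_bits) - 1           # 255 for 8-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0      # guard against a constant tensor
    zero_point = round(-lo / scale)      # integer that represents 0.0
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.5, -0.2, 0.0, 0.7, 1.5]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
# The round-trip error is bounded by half a quantization step.
assert max_err <= scale / 2 + 1e-9
```

The trade-off shown here is the one the post alludes to: each weight shrinks from 4 bytes to 1, at the cost of a bounded rounding error per value.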