Faster inference for PyTorch models with OpenVINO Integration with Torch-ORT 

4 min read

Many developers opt to use popular AI Frameworks like PyTorch, which simplifies the process of analyzing predictions, training models, leveraging data, and refining future results. Read more

Optimizing and deploying transformer INT8 inference with ONNX Runtime-TensorRT on NVIDIA GPUs 

5 min read

Mohit Ayani, Solutions Architect, NVIDIA Shang Zhang, Senior AI Developer Technology Engineer, NVIDIA Jay Rodge, Product Marketing Manager-AI, NVIDIA Transformer-based models have revolutionized the natural language processing (NLP) domain. Ever since its inception, transformer architecture has been integrated into models like Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) for performing tasks Read more

Supporting efficient large model training on AMD Instinct™ GPUs with DeepSpeed 

6 min read

This post was co-authored by Jithun Nair and Aswin Mathews, members of technical staff at AMD. In recent years, large-scale deep learning models have demonstrated impressive capabilities, excelling at tasks across natural language processing, computer vision, and speech domains. Companies now use these models to power novel AI-driven user experiences across a whole spectrum of Read more

Announcing Azure Active Directory (Azure AD) workload identity for Kubernetes 

1 min read

Today, we are excited to announce an open-source project called Azure AD workload identity for Kubernetes. It leverages the public preview capability of Azure AD workload identity federation. With this project, developers can use native Kubernetes concepts of service accounts and federation to access Azure AD protected resources, such as Azure and Microsoft Graph, without needing Read more

Add AI to mobile applications with Xamarin and ONNX Runtime 

2 min read

ONNX Runtime now supports building mobile applications in C# with Xamarin. Support for Android and iOS is included in the ONNX Runtime release 1.10 NuGet package. This enables C# developers to build AI applications for Android and iOS to execute ONNX models on mobile devices with ONNX Runtime. ONNX Runtime is the open source project Read more

Ratify container supply chain in Kubernetes 

4 min read

Securing the software supply chain and verifying that chain is hard for any software, and containers running in Kubernetes are no exception. Operational best practices like image signing, scanning, provenance verification, and ensuring these operations have been properly completed with signed software bill of materials (SBoMs) are all required, and tons of tools are appearing Read more

Progress on making eBPF work on Windows 

3 min read

eBPF is a well-known, but revolutionary, technology for providing programmability, extensibility, and agility. eBPF has been applied to use cases such as denial-of-service protection and observability. In May 2021, we announced the effort to make eBPF work on Windows, and were encouraged by the huge amount of interest. Six months have passed since then, and Read more

ONNX Runtime Web—running your machine learning model in browser 

5 min read

We are introducing ONNX Runtime Web (ORT Web), a new feature in ONNX Runtime to enable JavaScript developers to run and deploy machine learning models in browsers. It also helps enable new classes of on-device computation. ORT Web will be replacing the soon to be deprecated onnx.js, with improvements such as a more consistent developer Read more

Introducing Distributed Data Parallel support on PyTorch Windows 

6 min read

Model training has been and will be in the foreseeable future one of the most frustrating things machine learning developers face. It takes quite a long time and people can’t really do anything about it. If you have the luxury (especially at this moment of time) of having multiple GPUs, you are likely to find Read more

ONNX Runtime release 1.8.1 previews support for accelerated training on AMD GPUs with the AMD ROCm™ Open Software Platform 

4 min read

This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce a preview version of ONNX Runtime in Read more