ONNX Runtime

ONNX Runtime Web unleashes generative AI in the browser using WebGPU

February 29, 2024 4 min read

By Emma Ning, Principal Program Manager, AI Frameworks
Yulong Wang, Senior Software Engineer, AI Frameworks
Satya Jandhyala, Principal Software Engineer, AI Frameworks

We are thrilled to announce the official launch of ONNX Runtime Web featuring WebGPU, now available in the ONNX Runtime 1.17 release.

On-Device Training: Training a model in browser

February 6, 2024 6 min read

By Caroline Zhu, Software Engineer, AI Frameworks

Continuing the ONNX Runtime On-Device Training blog series, we are introducing ONNX Runtime Training for Web, a new feature in ONNX Runtime (ORT) that enables training models in the browser.

Boosting performance in ONNX Runtime with Intel® AMX for 4th Gen Intel® Xeon® Processors

September 7, 2023 4 min read

By Chen Fu, Principal Software Engineer, Microsoft
Kiefer Kuah, Software Engineer, Intel

ONNX Runtime, Intel®, and Microsoft developed the 8-bit integer matrix multiplication kernel in ONNX Runtime using Intel® AMX instructions, delivering performance four times faster than 3rd Gen Intel® Xeon® processors using Intel® DL Boost.

Connect fluid dynamics, machine learning, and virtual reality with ONNX Runtime

July 25, 2023 5 min read

By Cassie Breviu, Senior Technical Program Manager, ONNX Runtime, AI Frameworks—Microsoft

By thinking outside the box, we can envision creating a virtual multiverse. Within this innovative space, one can propose, evaluate, and decide on multiple hypotheses. Real-world examples of this approach include planning new product configurations, operating a plant, designing heating or cooling systems, or responding to catastrophes.

On-Device Training with ONNX Runtime: A deep dive

July 5, 2023 6 min read

By Ashwini Khade, Software Engineer, AI Frameworks
Kshama Pawar, Senior Program Manager

Building on the foundation we established earlier, this post presents the underlying details of training models directly on user devices using ORT. Equipped with these technical details, we encourage you to try out On-Device Training with ONNX Runtime for your custom scenario.

Olive: A user-friendly toolchain for hardware-aware model optimization

June 26, 2023 4 min read

By Emma Ning, Principal Program Manager, AI Frameworks
Devang Patel, Principal Architect, AI Frameworks
Guoliang Hua, Principal Software Engineer Manager, Microsoft

Introducing Olive, an easy-to-use toolchain for optimizing models with hardware awareness. With Olive, you don’t need to be an expert to explore diverse hardware optimization toolchains.

Automate optimization techniques for transformer models

June 26, 2023 3 min read

By Emma Ning, Principal Program Manager, AI Frameworks
Feng Tian, AI Architect—Intel
Yuwen Zhou, AI Engineer—Intel
Haihao Shen, Leading AI Architect—Intel
Saurabh Tangri, Principal AI Engineer—Intel

Intel has collaborated with Microsoft to integrate Intel® Neural Compressor into Olive, enabling developers to easily take advantage of model compression techniques in their deployment platform, including Intel processors and accelerators.

On-Device Training: Efficient training on the edge with ONNX Runtime

May 31, 2023 4 min read

By Kshama Pawar, Senior Program Manager
Ashwini Khade, Software Engineer, AI Frameworks
Baiju Meswani, Software Engineer, AI Frameworks

ONNX Runtime is a high-performance, cross-platform inference and training engine that can run a variety of machine learning models. ORT gives AI developers an easy-to-use way to run models on multiple hardware and software platforms.

High-performance deep learning in Oracle Cloud with ONNX Runtime

March 15, 2023 4 min read

By Faith Xu, Principal Program Manager, Machine Learning Platform

In this blog post, we’ll share the challenges our team faced and how ONNX Runtime solves them as the backbone of our high-performance inferencing.

Performant on-device inferencing with ONNX Runtime

February 8, 2023 6 min read

By Faith Xu, Principal Program Manager, Machine Learning Platform
Brian Lambert, Machine Learning Engineer, Pieces.app

The team at Pieces shares the problems and solutions evaluated for their on-device model serving stack and how ONNX Runtime enables their success.

Improve BERT inference speed by combining the power of Optimum, OpenVINO™, ONNX Runtime, and Azure

January 25, 2023 5 min read

By Cassie Breviu, Senior Technical Program Manager, ONNX Runtime, AI Frameworks—Microsoft
Akhila Vidiyala, Cloud Software Development Engineer, OpenVINO™ AI Frameworks Architectures—Intel
Devang Aggarwal, Product Manager, OpenVINO™ AI Framework Integrations—Intel
Sachin Rastogi, Product Manager, OpenVINO™ AI Workflows—Intel

Make large models smaller and faster with the OpenVINO™ Execution Provider, NNCF, and ONNX Runtime, using Azure Machine Learning.

Hugging Face Transformers now enabled in Apache OpenNLP by ONNX Runtime

September 20, 2022 1 min read

By Faith Xu, Principal Program Manager, Machine Learning Platform

We’re excited to share the recent integration of ONNX Runtime in Apache OpenNLP! Apache OpenNLP is a Java machine learning library for natural language processing (NLP) tasks.
