ONNX Runtime Web unleashes generative AI in the browser using WebGPU
We are thrilled to announce the official launch of ONNX Runtime Web featuring WebGPU, now available in the ONNX Runtime 1.17 release. Read more
Continuing the ONNX Runtime On-Device Training blog series, we are introducing ONNX Runtime Training for Web, a new feature in ONNX Runtime (ORT) that enables training models in the browser. Read more
The ONNX Runtime team, Intel®, and Microsoft developed an 8-bit integer matrix multiplication kernel for ONNX Runtime using Intel® AMX instructions, resulting in performance four times faster than 3rd Gen Intel® Xeon® using Intel® DL Boost. Read more
By thinking outside the box, we can envision creating a virtual multiverse. Within this innovative space, one can propose, evaluate, and decide on multiple hypotheses. Real-world examples of this approach include planning new product configurations, operating a plant, designing heating or cooling systems, or responding to catastrophes. Read more
Building upon the foundation we established earlier, this blog will present comprehensive information about the underlying details of training models directly on user devices using ORT. Equipped with these technical details, we encourage you to try out On-Device Training with ONNX Runtime for your custom scenario. Read more
Introducing Olive, an easy-to-use toolchain for optimizing models with hardware awareness. With Olive, you don’t need to be an expert to explore diverse hardware optimization toolchains. Read more
Intel has collaborated with Microsoft to integrate Intel® Neural Compressor into Olive, enabling developers to easily take advantage of model compression techniques in their deployment platform, including Intel processors and accelerators. Read more
ONNX Runtime is a high-performance, cross-platform inference and training engine that can run a variety of machine learning models. ORT gives AI developers an easy-to-use way to run models on multiple hardware and software platforms. Read more
In this blog post, we’ll share the challenges our team faced and how ONNX Runtime solves them as the backbone of our high-performance inferencing. Read more
The team at Pieces shares the problems and solutions evaluated for their on-device model serving stack and how ONNX Runtime enables their success. Read more
Make large models smaller and faster with the OpenVINO Execution Provider, NNCF, and ONNX Runtime, leveraging Azure Machine Learning. Read more
We’re excited to share the recent integration of ONNX Runtime in Apache OpenNLP! Apache OpenNLP is a Java machine learning library for natural language processing (NLP) tasks. Read more