Microsoft Open Source success story—Babylon 

2 min read

An ongoing series of stories about Microsoft people and projects making their world better through open source. Even if you haven’t heard of Babylon.js, there is no doubt that it has already made your day more cheerful by powering Microsoft Teams’ Reactions (those cute floating emojis), or made your presentation faster and smoother as the engine that powers rendering…

Accelerate and simplify Scikit-learn model inference with ONNX Runtime 

5 min read

Scikit-learn is one of the most useful libraries for general machine learning in Python. To minimize the cost of deployment and avoid discrepancies, deploying scikit-learn models to production usually leverages Docker containers and pickle, the object serialization module of the Python standard library. Docker is a good way to create consistent environments and pickle saves…

Introducing the Cluster API Provider for Azure (CAPZ) for Kubernetes cluster management 

5 min read

Managing Kubernetes clusters is hard. Managing Kubernetes clusters at scale across a variety of infrastructures is, well, even harder. The Kubernetes community project Cluster API (CAPI) enables users to manage fleets of clusters across multiple infrastructure providers. The Cluster API Provider for Azure (CAPZ) is the solution for users who need to manage Kubernetes clusters on Azure…

ONNX Runtime scenario highlight: Vespa.ai integration 

1 min read

Since its open source debut two years ago, ONNX Runtime has seen strong growth with performance improvements, expanded platform and device compatibility, hardware accelerator support, an extension to training acceleration, and more. We are excited by its broad usage in production, powering more than a hundred models across Microsoft products and services and bringing concrete…

Adding RoBERTa NLP to the ONNX model zoo for natural language predictions 

3 min read

In summer 2019, I worked as a high school intern for the ONNX AI team at Microsoft and loved working on various projects with the team, including the BERT text classification model. However, due to COVID-19, the Microsoft Internship Program for high school students was canceled in the summer of 2020. This led two other…

Introducing ONNX Runtime mobile – a reduced size, high performance package for edge devices 

2 min read

ONNX Runtime is an open source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce ONNX Runtime release v1.5 as part of our AI at Scale initiative. This release includes ONNX Runtime mobile, a new feature targeting smartphones and other…

Accelerate traditional machine learning models on GPU with ONNX Runtime 

4 min read

With the growing trend towards deep learning techniques in AI, there are many investments in accelerating neural network models using GPUs and other specialized hardware. However, many models used in production are still based on traditional machine learning libraries or sometimes a combination of traditional machine learning (ML) and DNNs. We’ve previously shared the performance…

Announcing Dapr integration in Azure API Management Service 

4 min read

Dapr integration in the Azure API Management (APIM) service is now available. This new capability enables operations teams to directly expose Dapr microservices as APIs and make those APIs discoverable and easily consumable by developers with proper controls across multiple Dapr deployments, whether in the cloud, on-premises, or on the edge. Since its initial release last…

Open-sourcing TensorFlow with DirectML 

3 min read

Following the release of our Developer Preview in June, today we’re announcing an exciting next step as we make the source code of TensorFlow-DirectML, an extension of TensorFlow on Windows, available to the public as an open-source project on GitHub. TensorFlow-DirectML broadens the reach of TensorFlow beyond its traditional Graphics Processing Unit (GPU) support, by…

GPT-2 fine-tuning with ONNX Runtime – a 34% speedup in training time 

4 min read

Model training is an important step when developing and deploying large-scale Artificial Intelligence (AI) models. Training typically utilizes a large amount of compute resources to tune the model based on the input dataset. Transformer models, with millions and billions of parameters, are especially compute-intensive, and training costs increase with model size and fine-tuning steps…

VS Code Docker extension can now run containers in Azure Container Instances 

3 min read

Today we are releasing version 1.4 of our Visual Studio Code Docker extension, which makes it easy to build, manage, and deploy containerized applications from Visual Studio Code (VS Code). In this release, you can now view and troubleshoot containers deployed in Azure Container Instances (ACI) from within VS Code. If you are using, or…
