Managing Kubernetes clusters is hard.

Managing Kubernetes clusters at scale across a variety of infrastructures is—well—even harder.

The Kubernetes community project Cluster API (CAPI) enables users to manage fleets of clusters across multiple infrastructure providers. The Cluster API Provider for Azure (CAPZ) is the solution for users who need to manage Kubernetes clusters on Azure IaaS. In the past, we have recommended AKS Engine for this common scenario. While we will continue to provide regular, stable releases for AKS Engine, the Azure team is excited to share that CAPZ is now ready for users and will be our primary tool for enabling customers to operate self-managed Kubernetes clusters on Azure IaaS.

Do you manage your own Kubernetes clusters?

Kubernetes is the dominant cross-platform tool for managing containerized applications. Azure Kubernetes Service (AKS) is the managed service that makes it easy for users to run Kubernetes on Azure. AKS is mature, scalable, secure, and backed by Azure’s excellent support. But some users need to run clusters themselves and can’t take advantage of AKS. Some need functionality that is not yet available in AKS, or that might never be available because it requires user access to the control plane.

Some are running a service of their own on Azure that leverages Kubernetes and needs complete control, and others might need to run their own clusters for compliance or regulatory reasons (for example, financial services companies that can’t delegate management to another organization). Still other users are developing new integrations with Kubernetes, or new Kubernetes features, and need to be able to tweak, control, and test anything and everything. We call these clusters that users run themselves “self-managed” clusters.

If you need to run self-managed clusters on Azure, whatever your reason, you’ve come to the right place.

Cluster API powers self-managed clusters on Azure

The Kubernetes community has long recognized the need for tooling to provide standardized lifecycle management of clusters independent of the infrastructure on which they run. In response, SIG Cluster Lifecycle created the Cluster API sub-project:

Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. – The Cluster API Book
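
To make "declarative" concrete, here is a minimal sketch in Python of the general idea: the user states the desired cluster, and a controller compares it with what is actually running and works out what to change. The ClusterSpec and ClusterStatus types and the plan_actions function are hypothetical names invented for this illustration. They are not part of Cluster API, which implements this pattern with Kubernetes custom resources and controllers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterSpec:
    """Desired state declared by the user (hypothetical, simplified)."""
    name: str
    kubernetes_version: str
    worker_replicas: int


@dataclass
class ClusterStatus:
    """Observed state of the running cluster (hypothetical, simplified)."""
    kubernetes_version: str
    worker_replicas: int


def plan_actions(spec: ClusterSpec, status: ClusterStatus) -> List[str]:
    """Compare desired and observed state; list the actions needed to converge."""
    actions = []
    if status.kubernetes_version != spec.kubernetes_version:
        actions.append(
            f"upgrade {spec.name} from {status.kubernetes_version} "
            f"to {spec.kubernetes_version}"
        )
    delta = spec.worker_replicas - status.worker_replicas
    if delta > 0:
        actions.append(f"add {delta} worker(s) to {spec.name}")
    elif delta < 0:
        actions.append(f"remove {-delta} worker(s) from {spec.name}")
    return actions


if __name__ == "__main__":
    desired = ClusterSpec("demo", "v1.19.4", worker_replicas=3)
    observed = ClusterStatus("v1.18.8", worker_replicas=1)
    for action in plan_actions(desired, observed):
        print(action)
```

The user never issues "upgrade" or "scale" commands in this model. Changing the declared spec is enough, and the same comparison runs again whenever the desired or observed state changes.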

Cluster API provides our team with a natural place to innovate in open source for users and, at the same time, expand community participation in solving Azure user problems. Thus, it made sense for us to spend the past 18 months investing in the Cluster API Provider for Azure (CAPZ) to make it a fully functional project ready to realize the vision of Cluster API for every user.

Cluster API Diagram

The most recent CAPZ release, v0.4.10, includes new capabilities such as GPU support, private clusters, and Azure API call tracing. Some of you may be reluctant to adopt a tool whose API is labeled alpha (v1alpha3 to be exact). You should take comfort in the knowledge that CAPI enables forward and backward compatibility of API versions so that when the project moves to v1alpha4, and then v1beta1, you’ll be able to upgrade, and then use the API to output your objects with the new API version.
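
As a rough illustration of where the API version appears, the sketch below builds a bare-bones Cluster object as a Python dict and prints it as JSON. The group and version strings follow the v1alpha3 naming mentioned above. The cluster_manifest and api_version_of helpers are hypothetical names for this example, and the object is deliberately incomplete: a usable cluster needs more fields (networking, a control plane reference, and so on), so treat the templates in the CAPZ documentation as authoritative.

```python
import json

# Assumption: the version segment used by the v1alpha3 releases discussed above.
CAPI_VERSION = "v1alpha3"


def cluster_manifest(name: str, namespace: str = "default") -> dict:
    """Build a minimal, illustrative Cluster object that references an AzureCluster."""
    return {
        "apiVersion": f"cluster.x-k8s.io/{CAPI_VERSION}",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Points at the Azure-specific object managed by CAPZ.
            "infrastructureRef": {
                "apiVersion": f"infrastructure.cluster.x-k8s.io/{CAPI_VERSION}",
                "kind": "AzureCluster",
                "name": name,
            },
        },
    }


def api_version_of(manifest: dict) -> str:
    """Return only the version part of a manifest's apiVersion, e.g. 'v1alpha3'."""
    return manifest["apiVersion"].rsplit("/", 1)[-1]


if __name__ == "__main__":
    print(json.dumps(cluster_manifest("demo-cluster"), indent=2))
```

Every object carries its API version in the apiVersion field, which is what lets the project serve the same object under a newer version after an upgrade.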

Our team is thrilled with the CAPZ work because more of you will be able to effectively manage your cluster’s entire lifecycle on Azure. It has also been fulfilling to drive innovations in the Cluster API community, like CAPI MachinePool, which enables users to take advantage of each infrastructure provider’s native VM scaling group capability. CAPI brings Kubernetes-native cluster management, and CAPZ enables this naturally on Azure infrastructure. Together in the community, we can deliver better capabilities for users more quickly.

Users are already taking advantage of CAPI and CAPZ on Azure. The Azure provider community consists of amazing people from Azure, VMware, Red Hat, Weaveworks, and more. Community members are realizing the power of the Cluster API by using CAPZ for use cases that span from building new platforms and products, like Tanzu Kubernetes Grid, to testing new hardware on multiple infrastructures.

Users are also discovering new use cases for CAPI. One recent example uses CAPI and Helm to operate managed clusters. And our team is using CAPZ to validate new Kubernetes versions and features on Azure. Soon our upstream tests will move from using AKS Engine to CAPZ.

But what about AKS Engine?

Our team, Azure Container Compute Upstream, has the following mission:

  • Enable Azure to efficiently consume innovations from the Kubernetes ecosystem
  • Contribute innovations from Azure to the Kubernetes ecosystem

We maintain AKS Engine as an open source tool for Azure customers, but the narrow focus on Azure-specific APIs is inconsistent with our mission in the Kubernetes ecosystem.

AKS Engine works by creating ARM templates from a cluster model. ARM templates are a great Azure-specific solution for cluster creation, but this design falls short of empowering ongoing operational needs such as scaling, in-place upgrading, and extension management. And it isn’t useful for users who are focused on multi-cloud scenarios like managing fleets of Kubernetes clusters across cloud infrastructures that do not support ARM.

AKS Engine users will continue to receive excellent community support. As more maintainers have joined the AKS Engine community, the Upstream team has shifted focus to CAPZ for new Kubernetes features. The community is committed to integrating and validating new versions of Kubernetes into AKS Engine. AKS Engine will remain the tool for creating Kubernetes clusters on Azure Stack Hub. We encourage other AKS Engine users to evaluate moving to CAPZ, as it already provides stronger support for managing the cluster lifecycle than AKS Engine does, and new investments from the Upstream team will be focused there. If you are committed to using AKS Engine longer term and would like to become a project maintainer, please reach out to us!

Cluster API CAPZ: Getting started, getting help, and getting involved

To get started building Kubernetes clusters on Azure with CAPZ, try the amazing CAPZ documentation. When you have issues, please look at the CAPZ issues and create new ones if needed. If you want to get more involved in developing CAPZ, our team is active during office hours and invites your participation. Many also find the #cluster-api-azure Slack channel to be a great source of advice, help, and collaboration.

In our next blog post, we’ll discuss in more detail how you can customize your CAPZ deployment to tune startup time for your application by baking your chosen operating system and patch level, your application binaries and configurations, or both into the virtual machine images. We plan to follow that with a discussion of how to apply GitOps principles by synchronizing a git repo with your management cluster. Reach out to us in the Kubernetes Slack (@craiglpeters and @jackfrancis) or on Twitter (@peterscraig and @jackfrancis_esq) with any other topic you’d like to see us dig into.