How to deploy SQL Server 2019 Big Data Clusters

SQL Server 2019 Big Data Clusters is a scale-out data virtualization platform built on top of the Kubernetes container platform. Building on Kubernetes makes deployments predictable, fast, and elastically scalable, regardless of where the cluster runs. In this blog post, we’ll explain how to deploy SQL Server 2019 Big Data Clusters to Kubernetes.

First, the tools

Deploying Big Data Clusters to Kubernetes requires a specific set of client tools. Before you get started, please install the following:

  • azdata: Deploys and manages Big Data Clusters.
  • kubectl: Manages and monitors the underlying Kubernetes cluster.
  • Azure Data Studio: Graphical interface for using Big Data Clusters.
  • SQL Server 2019 extension: Azure Data Studio extension that enables the Big Data Clusters features.
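
Before moving on, it can help to confirm that the command-line tools from the list above are actually on your PATH. The following is a minimal sketch rather than an official check: the helper name find_missing_tools is hypothetical, and it only looks for the two command-line tools (azdata and kubectl), since Azure Data Studio and its extension are graphical.

```python
import shutil

# Command-line tools named in the list above. Azure Data Studio and its
# extension are graphical applications, so they are not checked here.
REQUIRED_TOOLS = ("azdata", "kubectl")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing client tools: " + ", ".join(missing))
    else:
        print("All command-line client tools were found on PATH.")
```

If anything is reported missing, install it before continuing with the steps below.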

Choose your Kubernetes

Big Data Clusters is deployed as a series of interrelated containers that are managed in Kubernetes. You have several options for hosting Kubernetes, depending on your use case, including:

  • Azure Kubernetes Service (AKS): AKS lets you deploy a managed Kubernetes cluster in Azure, which you can create from the Azure portal. All you manage and maintain are the agent nodes, and you don’t even have to provision your own hardware.
  • Multiple Linux machines: Kubernetes can also be deployed to multiple Linux machines, physical or virtual. This is a great option if you want to leverage existing infrastructure. You can create the Kubernetes cluster with the kubeadm tool and use a bash script to automate the deployment. Visit our documentation to learn how.
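
Whichever hosting option you choose, it is worth confirming that every Kubernetes node reports Ready before you deploy anything onto it. The sketch below is one way to do that, assuming kubectl is already configured for your cluster. It reads the JSON printed by kubectl get nodes -o json and checks each node's Ready condition. The function names node_readiness and fetch_nodes_json are hypothetical.

```python
import json
import subprocess


def node_readiness(nodes_json):
    """Map each node name to True if its Ready condition has status "True".

    `nodes_json` is the text printed by `kubectl get nodes -o json`.
    """
    readiness = {}
    for node in json.loads(nodes_json).get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        readiness[name] = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in conditions
        )
    return readiness


def fetch_nodes_json():
    """Ask kubectl for the node list. Needs a configured cluster context."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling node_readiness(fetch_nodes_json()) against a live cluster returns a dictionary keyed by node name. Any node mapped to False is not ready to host Big Data Clusters containers yet.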

Deploy SQL Server 2019 Big Data Clusters

After configuring Kubernetes, your next step is to deploy Big Data Clusters with the azdata bdc create command. There are several different ways to do this as well:

Deployment scripts

Deployment scripts can make deployment easier and faster by deploying both Kubernetes and Big Data Clusters in a single step. They also often provide default values for Big Data Clusters settings. However, you aren’t locked into the values defined by the script. Deployment scripts can also be customized, so you can create your own version that configures the Big Data Clusters deployment to your liking.

Two deployment scripts are currently available. The Python script deploys Big Data Clusters on Azure Kubernetes Service, and the Bash script deploys Big Data Clusters to a single-node kubeadm cluster.
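
If you would rather write your own wrapper, the core of the deployment is the azdata bdc create command mentioned earlier. The sketch below shows how a custom script might assemble and run that command. It is an illustration under assumptions, not the contents of either published script. The --accept-eula and --config-profile flags and the AZDATA_USERNAME and AZDATA_PASSWORD environment variables are recalled from the azdata documentation, so verify them with azdata bdc create --help for your installed version. The function names are hypothetical.

```python
import os
import subprocess


def build_create_command(config_profile=None):
    """Build the argument list for `azdata bdc create`.

    `config_profile` is assumed to be a built-in profile name such as
    "aks-dev-test" or the path to a custom configuration directory.
    """
    command = ["azdata", "bdc", "create", "--accept-eula", "yes"]
    if config_profile:
        command += ["--config-profile", config_profile]
    return command


def deploy(config_profile, username, password):
    """Run the deployment, passing credentials through the environment.

    The environment variable names are an assumption based on the azdata
    documentation; the parent process environment is left untouched.
    """
    env = dict(os.environ, AZDATA_USERNAME=username, AZDATA_PASSWORD=password)
    subprocess.run(build_create_command(config_profile), env=env, check=True)
```

Keeping the command construction separate from the call that runs it makes it easy to print or log the exact command before anything is deployed.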

Deployment notebooks

There’s one more option for deploying Big Data Clusters: running an Azure Data Studio notebook. A guided deployment experience in Azure Data Studio is also planned.

Because SQL Server Big Data Clusters are deployed on Kubernetes, getting up and running is fairly painless. As you can see, you have several options at each step of the way, and your use case makes the right path clear. To learn more about what you can do with Microsoft SQL Server 2019, check out the free Packt guide Introducing Microsoft SQL Server 2019. If you’re ready to jump to a fully managed cloud solution, check out the Essential Guide to Data in the Cloud.