Microsoft SQL Server Blog

SQL Server Big Data Clusters is a new capability brought to market as part of the SQL Server 2019 release. Big Data Clusters extends SQL Server’s analytical capabilities beyond in-database processing of transactional and analytical workloads by uniting the SQL engine with Apache Spark and Apache Hadoop to create a single, secure, and unified data platform.

Big Data Clusters runs exclusively on Linux containers orchestrated by Kubernetes, and can be deployed on multiple cloud providers or on-premises.

Today, we’re announcing the release of the latest cumulative update (CU), CU10, for SQL Server Big Data Clusters, which includes the following important capabilities:

  • Upgraded base images from Ubuntu 16.04 to Ubuntu 20.04.
  • High availability support for Hadoop KMS components.
  • Additional configurability of SQL Server networking and process affinity settings at the resource scope.
  • Resource management for Spark-related containers through cluster-scoped settings.

Major improvements in this update are highlighted below, along with resources for you to learn more and get started.

Upgraded base image versions

SQL Server 2019 CU9 included a software refresh for most of the open source components deployed with Big Data Clusters. Building on this momentum, and in line with our commitment to keep Big Data Clusters components on supported versions, we are now upgrading the base operating system (OS) for all container images from Ubuntu 16.04 to Ubuntu 20.04.

For existing Big Data Clusters deployments, no other action is necessary apart from the regular in-place upgrade to the new CU. The new CU10 images that include the upgraded base OS version will be used when upgrading Big Data Clusters. As a best practice, we recommend upgrading to CU10 to take advantage of new capabilities and improvements and to ensure containers are covered by the Ubuntu support lifecycle.

High Availability support for Hadoop KMS components

Consistent with our commitment to continuous improvement of the Encryption at Rest feature set, CU10 adds high availability capabilities for Hadoop's key management service (KMS) components. After the upgrade, every namenode pod will host a KMS instance, instead of just one namenode pod. The benefits are twofold: higher availability and better performance of encryption operations on encryption zones.
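
To illustrate why a KMS on every namenode pod improves availability, here is a minimal Python sketch. It is a toy model, not how Big Data Clusters or Hadoop implement KMS: the pod names, the `NameNodePod` class, and the helper functions are all hypothetical and exist only for illustration. It assumes two namenode pods and compares the loss of a single pod before and after the upgrade.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class NameNodePod:
    name: str             # hypothetical pod name, for illustration only
    has_kms: bool         # whether a KMS instance runs in this pod
    healthy: bool = True  # whether the pod is currently up


def kms_available(pods: List[NameNodePod]) -> bool:
    """True if at least one healthy namenode pod hosts a KMS instance."""
    return any(p.has_kms and p.healthy for p in pods)


def fail(pods: List[NameNodePod], name: str) -> List[NameNodePod]:
    """Return a copy of the pod list with the named pod marked unhealthy."""
    return [replace(p, healthy=False) if p.name == name else p for p in pods]


# Before CU10 (assumed layout): only one namenode pod hosts the KMS.
before = [NameNodePod("namenode-a", has_kms=True),
          NameNodePod("namenode-b", has_kms=False)]

# After CU10 (assumed layout): every namenode pod hosts a KMS.
after = [NameNodePod("namenode-a", has_kms=True),
         NameNodePod("namenode-b", has_kms=True)]

print(kms_available(fail(before, "namenode-a")))
print(kms_available(fail(after, "namenode-a")))
```

In this model, losing the only KMS-hosting pod makes key operations unavailable in the pre-upgrade layout, while the post-upgrade layout still has a healthy KMS to serve requests.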

Ready to learn more?

Check out the SQL Server Big Data Clusters CU10 release notes to learn more about all the improvements available with the latest update. For a technical deep-dive on Big Data Clusters, take a look at the documentation page and visit our GitHub repository.

Follow the instructions on our documentation page to get started and deploy Big Data Clusters.