SQL Server Blog

Microsoft SQL Server 2019 Big Data Cluster enables intelligence over all your data and helps remove data silos by combining structured and unstructured data across the entire data estate. Big Data Clusters integrates Microsoft SQL Server with the best of the open-source big data solutions. It is deployed as a scalable cluster of SQL Server, Apache Spark, and HDFS containers running on Kubernetes. Microsoft SQL Server 2019 Big Data Cluster is the ideal big data solution for AI, machine learning (ML), MapReduce (M/R), streaming, BI, T-SQL, and Spark.

Big Data Clusters Reference Architecture

In October 2019, Microsoft and Intel ran performance and scalability tests on Microsoft SQL Server 2019 Big Data Cluster. The tests used workloads derived from the TPC-DS schema, with data sets of 1TB, 10TB, 30TB, and 100TB of raw structured and semi-structured data.

TPC-DS is the world’s first industry-standard benchmark designed to measure the performance of a decision support system, including queries and data maintenance. It comprises 99 queries that scan large volumes of data and answer real-world business questions; in this testing, the queries run through Spark SQL. The benchmark challenges a cluster configuration to extract maximum efficiency from CPU, memory, and I/O, along with the operating system and the big data solution.

We used 2nd Gen Intel® Xeon® Scalable processors for the performance testing. Across infrastructures, the Intel® Xeon® Scalable platform is designed for data center modernization to drive operational efficiencies that lead to improved total cost of ownership (TCO) and higher productivity for users.

Results

The Big Data Cluster benchmarks, derived from TPC-DS, demonstrate the scalability and performance of the Microsoft SQL Server 2019 Big Data Cluster reference architecture.

1TB - 10TB - 100TB Data Set - Elapsed Query Runtimes

Our testing shows that elapsed query runtimes scale linearly with data set size from 1TB to 100TB, and that the various system resources are used effectively. Microsoft SQL Server 2019 Big Data Cluster leverages the high performance of Intel® Xeon® processors and Intel® SSDs to deliver great performance for complex queries. In addition, the benchmark results demonstrate the elasticity and performance of the entire platform.
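To make "scales linearly" concrete, here is a minimal Python sketch of one way to check it: compare how much the elapsed runtime grows against how much the data set grows, relative to the smallest data set. The function name `scaling_factors` and the runtime values are hypothetical placeholders chosen for illustration. They are not figures from the benchmark, so substitute your own measurements.

```python
from typing import Dict


def scaling_factors(runtimes_s: Dict[float, float]) -> Dict[float, float]:
    """Compare runtime growth with data growth, relative to the smallest data set.

    Keys are data set sizes in TB, values are elapsed runtimes in seconds.
    A factor of 1.0 means runtime grew exactly in proportion to data size;
    above 1.0 is worse than linear, below 1.0 is better than linear.
    """
    base_size = min(runtimes_s)
    base_time = runtimes_s[base_size]
    return {
        size: (time / base_time) / (size / base_size)
        for size, time in sorted(runtimes_s.items())
    }


# Hypothetical placeholder runtimes (seconds), NOT results from the benchmark.
example_runtimes = {1: 100.0, 10: 1000.0, 100: 10500.0}

factors = scaling_factors(example_runtimes)
for size_tb, factor in factors.items():
    print(f"{size_tb} TB: scaling factor {factor:.2f}")
```

A factor that stays close to 1.0 across all data set sizes is what linear scaling looks like in practice. How far from 1.0 still counts as linear is a judgment call that depends on the workload.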

The combination of Microsoft SQL Server 2019 Big Data Cluster and Intel’s Xeon Scalable platform can address many of your Big Data challenges. You can store and analyze data from multiple sources at scale, in various data formats, with scale-out compute for data processing and machine learning, together with the industry-leading experience of SQL Server.

Here is a link to download the technical white paper that captures detailed steps, configuration, and analysis of the benchmark study for Microsoft SQL Server 2019 Big Data Cluster on Intel’s Xeon Scalable platform.

Microsoft SQL Server 2019 Big Data Cluster performance benchmark technical whitepaper.

Learn more