About two years ago, we heard increasing demand from the .NET community for an easier way to build big data applications with .NET, without needing to learn Scala or Python. In response, the Azure Data and .NET teams collaborated to start the .NET for Apache® Spark™ open source project.

Today, we are happy to announce the release of version 1.0 of .NET for Apache Spark, an open source package that brings high-performance APIs for using Apache Spark from C# and F#. It is the same technology that powers the .NET experience for Apache Spark inside Azure Synapse Analytics.

[Figure: project timeline]

At a high level, .NET for Apache Spark offers users the ability to:

  • Write Spark applications in C# or F# targeting .NET Core 2.1+.
  • Reuse existing C#/F# libraries and use the power of Apache Spark to parallelize them.
  • Leverage the high-quality Visual Studio or Visual Studio Code IDEs for building Spark apps.

The new .NET for Apache Spark v1.0 brings additional capabilities to an already rich library.

We are extremely grateful to everyone who contributed towards reaching this important milestone. For more about the history of the project, key contributors, and its use in production pipelines, read the full announcement.

To learn more about .NET for Apache Spark, check out our presentations at Databricks’ Spark+AI Summit 2019, Microsoft Build 2019, and SQLBits 2020, and the demo at Ignite 2020.

Learn more about .NET for Apache Spark:

  • Check out the .NET for Apache Spark code on GitHub.
  • Ready to try this out? Check out getting started.
  • Feel like contributing? Start with the outstanding issues.

Don’t miss our presentation describing this major release at the upcoming .NET Conf 2020: “The Missing Piece: Diving into the World of Big Data with .NET for Apache Spark”!

Questions or feedback? Let us know in the comments below.