Apache Spark Connector for SQL Server and Azure SQL is now open source

Accelerating big data analytics with the Spark connector for SQL Server

We're happy to announce that we have open-sourced the Apache Spark Connector for SQL Server and Azure SQL on GitHub. Born out of Microsoft's SQL Server Big Data Clusters investments, the Apache Spark Connector for SQL Server and Azure SQL is a high-performance connector that enables you to use transactional data in analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.

Why use the Apache Spark Connector for SQL Server and Azure SQL

The Apache Spark Connector for SQL Server and Azure SQL is based on the Spark DataSourceV1 API and the SQL Server Bulk API, and it uses the same interface as the built-in JDBC Spark-SQL connector. This allows you to easily integrate the connector and migrate your existing Spark jobs by simply updating the format parameter!
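
As a rough illustration of that migration, the PySpark sketch below wraps a DataFrame write in a small helper where only the format string changes. The helper name `write_table` and all connection values are hypothetical. The format name `com.microsoft.sqlserver.jdbc.spark` is the one given in the connector's GitHub README and is worth checking against the release you install.

```python
# "jdbc" is Spark's built-in JDBC data source.
BUILTIN_JDBC_FORMAT = "jdbc"
# Assumption: format name as listed in the connector's README.
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def write_table(df, url, table, user, password, fmt=CONNECTOR_FORMAT, mode="append"):
    """Write a Spark DataFrame to a SQL table.

    Only `fmt` differs between the built-in JDBC source and the connector;
    the remaining options use the same names in both cases.
    """
    (
        df.write.format(fmt)
        .mode(mode)
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .save()
    )
```

Under these assumptions, a job that previously called `write_table(df, url, table, user, password, fmt=BUILTIN_JDBC_FORMAT)` would switch connectors by passing `CONNECTOR_FORMAT` instead, with the connector's JAR available on the Spark classpath.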

Notable features and benefits of the connector:

  • Support for all Spark bindings (Scala, Python, R).
  • Basic and Active Directory (AD) keytab support.
  • Reordered DataFrame write support.
  • Reliable connector support for single instance.

Depending on your scenario, the Apache Spark Connector for SQL Server and Azure SQL is up to 15x faster than the default connector. The connector takes advantage of Spark's distributed architecture to move data in parallel, efficiently using all resources.
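
To make the parallelism concrete: Spark writes each DataFrame partition in its own task, so the partition count bounds how many inserts can run at once. The PySpark sketch below shows one way to tune that. The helper names `bulk_options` and `parallel_write` and all default values are hypothetical. The `batchsize` option comes from Spark's JDBC options and `tableLock` from the connector's README, so both should be verified for your version. This is not the configuration behind the 15x figure.

```python
# Assumption: format name as listed in the connector's README.
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def bulk_options(url, table, user, password, batch_size=100_000, table_lock=True):
    """Build the option map for a bulk write; the defaults are illustrative only."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "batchsize": str(batch_size),                    # rows sent per batch
        "tableLock": "true" if table_lock else "false",  # assumed connector option
    }


def parallel_write(df, options, num_partitions=8, mode="append"):
    """Repartition so that up to `num_partitions` tasks write concurrently."""
    (
        df.repartition(num_partitions)
        .write.format(CONNECTOR_FORMAT)
        .mode(mode)
        .options(**options)
        .save()
    )
```

A reasonable starting point is a partition count close to the number of executor cores, adjusted according to how much concurrent load the target database can absorb.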

Visit the GitHub page for the connector to download the project and get started!

Get involved

The release of the Apache Spark Connector for SQL Server and Azure SQL makes the interaction between SQL Server and Spark even more seamless. We are continuously evolving and improving the connector, and we look forward to your feedback and contributions!

Want to contribute or have feedback or questions? Check out the project on GitHub and follow us on Twitter at @SQLServer.

The post Apache Spark Connector for SQL Server and Azure SQL is now open source appeared first on SQL Server Blog.

 

This article was originally published by Microsoft's Secure Blog. You can find the original article here.