Apache Drill anniversary marked by accelerating user adoption
MapR Technologies, Inc., provider of the industry’s only converged data platform, announced today the one-year anniversary of Apache Drill on the MapR Platform. Since its general availability last year, Drill has helped hundreds of organizations worldwide rapidly analyze trillions of records and millions of files, and has fundamentally changed the way businesses actively engage with their data. Drill’s self-describing schema support eases ad-hoc queries, helping analysts and business users leverage big data. The momentum driving Drill adoption is highlighted by customer success, superior product features, and strong community involvement.
Strong Community and Rapid Innovation:
With 60+ Drill contributors from a variety of companies and fast-growing adoption, support for the community has grown significantly since GA last year. New community contributions include data source plug-ins (such as JDBC, MongoDB, and Kudu), geospatial functions, and Avro file format support. Additionally, the efficient columnar representation that the Drill execution engine uses to achieve high performance in analytic workloads has evolved into a new open source community project, Apache Arrow. Arrow provides a standardized columnar in-memory data representation that various tools and frameworks can leverage to improve the performance and interoperability of data analytics.
“The Apache Drill Community has made excellent progress in the last year,” said Aman Sinha, Apache Drill PMC member and principal software engineer, MapR Technologies. “Drill users continue to experience high performance with analytic SQL queries on big data along with schema flexibility. I look forward to further innovation and collaboration with this community as we establish Drill as the breakthrough SQL-on-Hadoop technology.”
Customer Success at Petabyte Scale:
Drill has helped customers deploy secure and interactive analytics at petabyte scale. As the first and only schema-free SQL engine, Drill provides an unmatched level of agility and simplicity for harnessing value from structured and semi-structured datasets, and enables customers to leverage investments in existing SQL and BI/analytics tools. Drill is also highly differentiated by a granular and decentralized model that makes big data securely available to a variety of users in an organization. This includes the growing use of Drill for interactive SQL alongside Apache Spark to power batch, stream processing, and advanced analytics use cases.
Superior Product Features and Performance:
With agile and fast-paced release cycles, Drill continues to add innovative capabilities. The project has delivered six releases in just the last twelve months, with significant enhancements to query performance, scale, data source support, advanced SQL, end-to-end security, and BI tool integration. New product features include:
- ANSI SQL enhancements – Analytic window functions, DROP TABLE syntax, automatic data partitioning, SQL UNION support, complex data enhancements and more
- End-to-end security from BI tools to Hadoop – Including Hive impersonation support, web UI security and client impersonation
- Significant query speed-ups at scale – Query planning & optimization improvements, ability to cache metadata, robust partition pruning, improved memory management for better scale and more
- Improved Tableau & BI tool experience with metadata query speed-ups and security
- Seamless compatibility and performance querying Hive tables – Support for multiple Hive versions, ability to use native Parquet readers, partition pruning, INT96 data type support and more
- Flexible JSON data type handling
- New and improved JDBC driver
These game-changing features have evolved the use cases for Drill in the last twelve months. Initial deployments focused on use cases that only Drill could enable for big data exploration and discovery. Drill has now expanded to become a single interactive query engine for a broad range of BI use cases, including reporting and ad-hoc queries. Examples can be found here.
“Drill is SQL on everything. It is exciting to see how Drill adoption evolved in the past year. With its simplicity, self-service and SQL capabilities, Drill is opening new use cases for customers driving the next generation of data analytics,” said Neeraja Rentachintala, senior director, product management, MapR Technologies. “When combined with the MapR Converged Data Platform, Drill is uniquely positioned to be the uniform SQL access layer across files, tables, and streaming data, instantly enabling operational analytics with flexibility.”