On Thursday, September 13, 2012, over 100 people – both in person and remotely from around the world – attended the first meeting. A couple of days before that, there were 350 participants at a Drill presentation at Google headquarters. Clearly there is interest in, and a need for, what Drill promises, and momentum among the largest players in the big data community is strong.
So, why all the buzz and interest around yet another avenue in the big data world?
The answer is simple: latency matters. There is a need for speed. Hadoop is wildly popular and successful, but it is still based on batch processing. Yet there is a need for ad hoc, real-time query and analysis of large data sets, and that is where Drill comes in.
The Drill project was inspired by Google’s Dremel and the paper that Google published in 2010 about how it is using Dremel.
The kickoff meeting served to introduce the Drill project and to encourage participation. As an indication of how quickly a strong community is coalescing around the Drill project, the creators of OpenDremel – the other open source project based on Dremel – flew in from Israel to join the meeting and to announce that OpenDremel was merging its efforts with the Drill project. Founder Camuel Gilyadov presented the work done to date.
Jason Frantz then presented an outline of the architecture and discussed the specific tasks and goals that the Drill project is looking to deliver to users.
Julian Hyde, who has been involved with the Mondrian project, presented his views on data independence (the logical/physical separation of data) and why it is so important within the context of big data management.
It was a productive three hours, and the entire session is available for viewing at this replay link.
Anyone interested in learning more about the Apache Drill project, or in contributing to it, is encouraged to join the mailing list, which can be found on the project page at http://incubator.apache.org/projects/drill.html