About this course
You will write SQL queries on a variety of data types, including structured data in a Hive table, semi-structured data in HBase or MapR-DB, and complex data file types, such as Parquet and JSON. You will also learn the different services involved at each step, and how Drill optimizes a query for distributed SQL execution.
Prerequisites for Success in the Course
Review the following prerequisites carefully and decide if you are ready to succeed in this course. The instructor will move forward with lab exercises, assuming that you have mastered the skills listed below.
- Basic Linux knowledge, including familiarity with basic commands such as mv, cp, cd, ls, ssh, and scp
- Access to, and the ability to use, a laptop with a terminal program installed (such as Terminal on the Mac, or PuTTY and WinSCP on Windows)
- Beginner to intermediate fluency with SQL
Right for you?
- For data analysts and data scientists who use SQL to perform data analysis.
- This is a data analysis course; you must have SQL experience to do the exercises.
This course prepares you for the MapR Certified Hadoop Data Analyst (MCHDA) certification exam. This exam is coming soon.
Included in this 2-day course are:
- Slide Guide PDF
- Lab Guide PDF
- Lab Code
- Write familiar SQL queries on structured data
- Get familiar with the ease of using SQL with Drill
- Write familiar SQL queries on a range of data types
- Perform SQL queries on semi-structured data
- Use SQL to query complex and nested JSON data
- Use Drill Explorer to explore semi-structured data
- Discover the schema of unknown data
- Preview unknown data
- List the different types of data Drill can explore and query
- Simple and complex data types
- Structured and semi-structured data
- Describe how Drill interacts with data and discovers its schema
- Explore data to create queries using multiple data sources
- Join simple, complex, structured, and semi-structured data in the same query
- Use views to visualize data in BI tools
- Create a view to save common queries for easy reuse
- Load a view into your BI tools to visualize queries without writing code
- Components of Drill
- Learn the components of a Drillbit
- Customize extensible Drill core modules
- Drill execution
- Follow a query through execution in Drill
- Learn how a query is broken into fragments and distributed on a cluster
- Describe the stages involved in query planning
- Optimization and flexibility of Drill
- Learn the cost-based and rule-based techniques used by the Drill optimizer
- Discover the flexibility and extensibility of Drill
- Familiar SQL queries on structured Hive data
- Familiar SQL queries on complex data
- Query Parquet data
- Query JSON data
- A single query that joins Hive, HBase and JSON
- Explore multiple data sources with the Drill Explorer
- Drill Explorer Interface
- Data sources
- Discover data schema
- Preview data
- Save a view
MapR Sandbox with Apache Drill
Advice from the front.
Apache Drill Website