DA 440 - Apache Hive Essentials


About this course

DA 440 is an introductory-level course designed for data analysts and developers. You will learn how Apache Hive fits into the Hadoop ecosystem, how to create and load tables in Hive, and how to query data using the Hive Query Language (HiveQL).

Are you ready?

  • Required:
    • Familiarity with a command-line interface, such as a Unix shell
    • Familiarity with relational databases (RDBMS) and SQL
    • Access to, and the ability to use, a laptop with an internet connection and a terminal program installed (such as Terminal on macOS or PuTTY on Windows)
  • Recommended:

Right for you?

  • For data analysts and developers interested in the data pipeline
  • For data scientists and business analysts who are familiar with SQL and want to use data stored in HDFS
  • This is a programming course; you must have some programming experience to do the exercises

What's next?

Certification

This course prepares you for the MapR Certified Data Analyst (MCDA) certification exam.


Syllabus

Lesson 1:
Hive in the Hadoop Ecosystem
  • Use cases of Hive
  • Steps in the data pipeline
Lesson 2:
Create and Load Data
  • Create databases, internal tables, external tables, and partitioned tables
  • Learn about data types and casting in Hive
  • Load data into tables and databases
Lesson 3:
Query and Manipulate Data
  • Query, sort, and filter data
  • Manipulate data with user-defined functions