ESS 1000 – Big Data Essentials


About this course

The Essentials series of courses is intended for anyone interested in getting started with big data, Apache Hadoop, or MapR.

Prerequisites

None. Ideal for business managers, students, developers, administrators, analysts or anyone interested in learning the fundamentals of transitioning from traditional data models to big data models.

What’s Included

  • Slide guide
  • Glossary
  • This is a non-lab course.

What’s next?

The ESS 1000 course provides the prerequisite knowledge for all courses in the Administrator (ADM), Data Analyst (DA), and Developer (DEV) learning paths.


ESS 100 – Introduction to Big Data

  • Lesson 1 – Introduction to Big Data
    • Define big data
    • Summarize the history of big data computing
    • Define key terms in big data computing
  • Lesson 2 – The Big Data Pipeline
    • Organize the steps in the data pipeline
    • Explain the role of administrators
    • Explain the role of developers
    • Explain the role of data analysts

ESS 101 – Apache Hadoop Essentials

  • Lesson 3 – Core Elements of Apache Hadoop
    • Compare and contrast local and distributed file systems
    • Explain data management in the Hadoop file system
    • Summarize the MapReduce algorithm
  • Lesson 4 – The Apache Hadoop Ecosystem
    • Define the following ecosystem components:
      • Administration: ZooKeeper, YARN
      • Ingestion: Flume, Oozie, Sqoop
      • Processing: Spark, HBase, Pig
      • Analysis: Hive, Drill, Mahout
  • Lesson 5 – Solving Big Data Problems with Apache Hadoop
    • Summarize the following use cases:
      • Data Warehouse Optimization
      • Recommendation Engine
      • Large-Scale Log Analysis

ESS 102 – MapR Converged Data Platform Essentials

  • Lesson 6 – MapR-FS
    • Review key components of HDFS
    • Describe key components of MapR-FS
    • Compare and contrast MapR-FS and HDFS
  • Lesson 7 – MapR-DB
    • Compare and contrast databases
    • Describe common MapR-DB use cases
    • Describe components and features of MapR-DB
  • Lesson 8 – MapR Streams
    • Compare and contrast real-time and batch processing
    • Describe key components of MapR Streams


Related Resources

MapR Sandbox with Apache Drill


On-demand Training
DA 410 - Apache Drill Essentials

DA 440 - Apache Hive Essentials

DA 450 - Apache Pig Essentials

DEV 320 - HBase Data Model and Architecture

DEV 360 - Apache Spark Essentials