ESS 1000 – Big Data Essentials


About this course

The Essentials series of courses is intended for anyone interested in getting started with big data, Apache Hadoop, or MapR.

Prerequisites

None. Ideal for business managers, students, developers, administrators, analysts or anyone interested in learning the fundamentals of transitioning from traditional data models to big data models.

What’s Included

  • Slide guide
  • Glossary
  • This is a non-lab course.

What’s next?

The ESS 1000 course provides the prerequisite knowledge for all courses in the Administrator (ADM), Data Analyst (DA), and Developer (DEV) learning paths.


ESS 100 – Introduction to Big Data

  • Lesson 1 – Introduction to Big Data
    • Define big data
    • Summarize the history of big data computing
    • Define key terms in big data computing
  • Lesson 2 – The Big Data Pipeline
    • Organize the steps in the data pipeline
    • Explain the role of administrators
    • Explain the role of developers
    • Explain the role of data analysts

ESS 101 – Apache Hadoop Essentials

  • Lesson 3 – Core Elements of Apache Hadoop
    • Compare and contrast local and distributed file systems
    • Explain data management in the Hadoop file system
    • Summarize the MapReduce algorithm
  • Lesson 4 – The Apache Hadoop Ecosystem
    • Define the following ecosystem components:
      • Administration: ZooKeeper, YARN
      • Ingestion: Flume, Oozie, Sqoop
      • Processing: Spark, HBase, Pig
      • Analysis: Hive, Drill, Mahout
  • Lesson 5 – Solving Big Data Problems with Apache Hadoop
    • Summarize the following use cases:
      • Data Warehouse Optimization
      • Recommendation Engine
      • Large-Scale Log Analysis

ESS 102 – MapR Converged Data Platform Essentials

  • Lesson 6 – MapR-FS
    • Review key components of HDFS
    • Describe key components of MapR-FS
    • Compare and contrast MapR-FS and HDFS
  • Lesson 7 – MapR-DB
    • Compare and contrast databases
    • Describe common MapR-DB use cases
    • Describe components and features of MapR-DB
  • Lesson 8 – MapR Streams
    • Compare and contrast real-time and batch processing
    • Describe key components of MapR Streams