MapReduce Hands-On Training

Duration

3 days

Delivery

  • Instructor-led Training
  • Students must provide their own workstation

Target Audience

Java developers

Course Overview

This intensive three-day, hands-on course is designed to make Java developers productive with Hadoop as quickly as possible. Once a Hadoop cluster is up and running, you need to move existing data management applications over to it quickly and use its new capabilities to explore new uses of that data. If you are joining an established Big Data team, this course gives you the foundation you need to contribute right away. It covers the skills Java developers need to develop and debug MapReduce programs and to optimize their performance. It also introduces Apache ecosystem projects such as Hive, Pig, and HBase.

Course Outline

  • The process of taking a MapR Hadoop project from conception to completion
  • Best practices for building solutions effectively with MapR Hadoop
  • How to write a MapReduce program using the Hadoop API
  • How to use HDFS to load and process data effectively
  • How MapReduce and HDFS work
  • What issues to consider when developing MapReduce jobs
  • How to implement common algorithms in Hadoop
  • How to effectively debug, monitor, and optimize Hadoop solutions
  • How to leverage other projects such as MapR Hive, MapR Pig, and HBase
  • Advanced MapR Hadoop API topics required for real-world data analysis

Course Syllabus

Day 1

  • Introduction
    • Traditional Systems
    • Why Big Data
    • Why Hadoop
    • Hadoop Basic Concepts/Fundamentals
  • Hadoop in the Enterprise
    • Where Hadoop Fits in the Enterprise
    • Review Use Cases
  • Architecture
    • Hadoop Architecture
    • Building Blocks
    • HDFS
    • MapReduce
  • MapR Hadoop CLI
    • Walkthrough
    • Exercise
  • MapReduce Programming
    • Fundamentals
    • Anatomy of a MapReduce Job Run
    • Job Scheduling
    • Sample Code Walkthrough
    • Hadoop API Walkthrough
    • Exercise

Day 2

  • MapReduce Formats
    • Input Formats, Exercise
    • Output Formats, Exercise
  • MapReduce Features
    • Counters, Exercise
    • Map Side Join, Exercise
    • Reduce Side Join, Exercise
    • Sorting, Exercise
  • MapReduce Algorithms
    • Walkthrough of 2-3 Algorithms
    • Use Case A (Long Exercise)

Day 3

  • MapReduce Testing
  • MapReduce Performance Tuning
  • MapR Hive- and Pig-Based MapReduce
  • Recap and Q&A