Apache Pig

Many users who are new to Hadoop find that the MapReduce framework has a steep learning curve. Apache Pig helps these users by offering a simpler alternative for transforming and analyzing large data sets. Users write scripts in a high-level data-flow language called Pig Latin, which Pig translates into MapReduce jobs that run on a Hadoop cluster. Pig Latin provides built-in operators for a wide variety of data analysis and transformation tasks, including join, sort, group, filter, and aggregate.
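
As a rough illustration of the filter, group, aggregate, and sort operators mentioned above, here is a minimal Pig Latin sketch. The input path, output path, field names, and the 30-second threshold are all hypothetical and chosen only for the example; a real script would use your own schema and locations.

```pig
-- Load a tab-delimited file of page visits (hypothetical path and schema).
visits = LOAD 'input/visits.tsv' USING PigStorage('\t')
         AS (user:chararray, url:chararray, duration:int);

-- Filter: keep only visits longer than 30 seconds (arbitrary threshold).
long_visits = FILTER visits BY duration > 30;

-- Group: collect the remaining visits by user.
by_user = GROUP long_visits BY user;

-- Aggregate: count visits and total their duration for each user.
totals = FOREACH by_user GENERATE
             group AS user,
             COUNT(long_visits) AS n_visits,
             SUM(long_visits.duration) AS total_duration;

-- Sort: order users by total duration, largest first.
ranked = ORDER totals BY total_duration DESC;

-- Write the result to a (hypothetical) output directory.
STORE ranked INTO 'output/visit_totals';
```

Each statement names an intermediate relation, and Pig plans the MapReduce jobs needed to produce the final STORE. The script itself contains no mapper or reducer code.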

Pig is ideal for many use cases, including:

  • Data Transformation: Convert large data sets from one format to another.
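  • Data Aggregation: Combine and summarize large data sets, including data sets that are formatted in different ways.
  • Data Analysis: Filter, group, and summarize large data sets to answer analytical questions without writing MapReduce code by hand.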