Welcome to Developer Central. Here are code samples and best practices to help you get started with Hadoop and manage your clusters efficiently. You can also read about Lambda Architecture and browse through our Tech Resources.


This sample Scala program loads HBase/M7 tables into an RDD and shows how in-memory processing in Spark can improve the performance of real-time applications served by HBase/M7.

Tags: Spark HBase M7 Beginner

One advantage of Apache Drill is that it can process raw text files in their native format without first creating schemas or metadata to define what the data looks like. Here is a quick example of how Drill can query a simple CSV file.

Tags: Drill SQL Query Editor Hive Beginner
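
Drill exposes each record of a header-less delimited text file as an array named `columns`, so fields are addressed by position rather than through a predefined schema. The Python sketch below is only an analogy for that idea, not Drill itself. The sample data and the `select_columns` helper are made up for illustration.

```python
import csv
import io

# Made-up sample data standing in for a simple CSV file with no header row.
SAMPLE_CSV = "1,alice,NY\n2,bob,CA\n3,carol,NY\n"


def select_columns(text, indexes, where=None):
    """Return the chosen positional fields of each row, optionally filtered by a predicate."""
    rows = csv.reader(io.StringIO(text))
    return [
        [columns[i] for i in indexes]
        for columns in rows
        if where is None or where(columns)
    ]


if __name__ == "__main__":
    # Roughly: select fields 1 and 2 of every row whose field 2 is "NY".
    for row in select_columns(SAMPLE_CSV, [1, 2], where=lambda c: c[2] == "NY"):
        print(row)
```

No schema is declared anywhere: every field arrives as a string and is picked out by its index, which is the property the Drill example relies on.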

This example shows Pig and Hive in action. The data set is publicly available, so you can work through the tutorial on your own.

Tags: Query Editor Pig ODBC NFS Hive HQL Beginner

The following program illustrates a table load tool, a utility for batching puts into an HBase/M7 table. The program creates a simple HBase table with a single column within a column family and inserts 100,000 rows in batches.

Tags: HBase M7
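
The batching idea behind the table load tool can be sketched without an HBase client. The Python sketch below is a minimal illustration, not the program itself. The row keys, the column name `cf:col`, the batch size of 1,000, and the `write_batch` callback are all hypothetical stand-ins for whatever the real tool uses.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

# A "put" here is just (row_key, column, value); a real client would use its own type.
Put = Tuple[str, str, str]


def generate_puts(n_rows: int) -> Iterator[Put]:
    """Yield one put per row for a single column in a single column family."""
    for i in range(n_rows):
        # Hypothetical key and column naming, chosen only for illustration.
        yield (f"row-{i:06d}", "cf:col", f"value-{i}")


def batched(puts: Iterable[Put], batch_size: int) -> Iterator[List[Put]]:
    """Group puts into lists of at most batch_size items."""
    it = iter(puts)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_table(n_rows: int, batch_size: int,
               write_batch: Callable[[List[Put]], None]) -> int:
    """Send all puts through write_batch, one call per batch; return the call count."""
    calls = 0
    for batch in batched(generate_puts(n_rows), batch_size):
        write_batch(batch)  # stand-in for a client call that submits a list of puts
        calls += 1
    return calls


if __name__ == "__main__":
    written: List[Put] = []
    calls = load_table(100_000, 1_000, written.extend)
    print(calls, len(written))
```

Submitting one request per batch rather than one per row spreads the per-request overhead across many rows, which is the reason for batching in the first place.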

Clush is a utility that lets you audit your environment when deploying Hadoop. This topic introduces it.

Tags: Cluster Ops

Here are some quick pointers for tuning your JVM settings on Hadoop to avoid heap space errors.

Tags: Cluster Ops Heap Memory Management
