Apache Hadoop is an open source framework for running applications on large clusters of commodity hardware. Managed by the Apache Software Foundation, Hadoop implements a parallel processing technique named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed on any node in the cluster. In addition, the Hadoop framework provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. [1]
Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers. Hadoop was created by Doug Cutting, who named it after his son's toy elephant. It was originally developed to support distribution for the Nutch search engine project. [2]
Yahoo! has been the largest contributor [3] to the project, and uses Hadoop extensively across its businesses.
Hadoop is now widely used across industries including finance, entertainment, government, health care, media, technology, telecom, and other industries with big data analytic requirements.
MapR provides a complete distribution for Apache Hadoop, but MapR is not affiliated with the Apache Software Foundation.