Revolution Analytics: Leverage R in Apache Hadoop

At A Glance
Revolution Analytics, Inc. is the leading provider of software and services for advanced analytics based on open-source R, the world’s most popular programming language for computational statistics and data science.

The Challenge for Hadoop on Main Street
Unique, advanced analytics big data platform at petabyte scale... and beyond.

Over the next three years, IT industry analyst IDC expects spending on Apache Hadoop for data management to grow at an average annual rate of 60%. While Hadoop is an excellent platform for managing large and complex data sets, its native tooling for advanced analytics is still evolving, and without a reliable architectural foundation, its readiness for mission-critical enterprise operations will be limited. Conventional server-based analytic software packages typically connect to Hadoop and extract data over a network, but they do not run inside Hadoop. This may be suitable for users working with small packets or small samples of data. However, this architecture is impractical when source data approaches terabyte scale, and sampling is not acceptable for some analytic applications.

There is a pressing need for an analytics platform that runs inside Hadoop, handles data scale in the petabytes, offers comprehensive support for advanced analytics, and is accessible to a broad user base without extensive re-training. With this in mind, MapR and Revolution have teamed up to pave the way forward for enterprise-scale big data analytics.

Any Analytics, Any Data Type, Simplified in Apache Hadoop

Pursue Your Big Data Aspirations at Enterprise Scale
With two million R users in the workforce today, most analytics-savvy organizations have people who know R—and are likely to prefer it. However, since open source R runs in memory and lacks a native distributed computing framework, its use is usually limited to small data sets.

Revolution R Enterprise (RRE) extends an enhanced and supported version of R with a distributed computing framework, scalable algorithms, big data connectivity tools, an integrated development environment and enterprise deployment tools. RRE does not simply connect to Hadoop like many other analytic tools—it runs inside the MapR Distribution including Apache Hadoop, providing users with direct access to data stored in MapR, guaranteeing enterprise-grade reliability, scale and speed. Revolution interfaces to popular BI platforms such as Jaspersoft, QlikView or Tableau and productivity tools like Microsoft Excel and Alteryx, quickly transforming results into visual insights that can be shared across the enterprise.

Because data remains in place in the MapR environment, the total cycle time to build and deploy predictive models is dramatically reduced, and potential security issues from data movement or replication are eliminated. The RRE write once/deploy anywhere architecture offers flexibility and portability while helping organizations avoid vendor and platform “lock-in.”

Why RRE on MapR is Unique
MapR provides several key advantages to make analytics professionals more productive. MapR snapshots and volumes allow users to build and test models on the same cluster as production data without impacting operations. This also allows for easy versioning of models and back-testing against historical data sets. In addition, unlike other distributions for Hadoop, only MapR provides a fully read-write data platform, which allows existing applications, custom libraries, modeling languages, and scripts and tools (e.g., grep, Git) to work out of the box. Moreover, data movement is quick and easy with MapR Direct Access NFS™, without requiring a separate cluster for data ingest.

Product Snapshot
The Revolution R Enterprise™ (RRE) platform delivers a faster, more scalable, enterprise-supported distribution of R in a cost-effective enterprise-class big data analytics platform. RRE extends open source R to enable enterprise users to analyze data without moving it, build models using large data sets, and deploy models into production without recoding.

About Revolution Analytics
Revolution Analytics, with its Revolution R Enterprise (RRE) software, is the innovative leader in big data big analytics. RRE is powered by the R language and the company was named a “Visionary” in the 2014 Gartner Magic Quadrant for Advanced Analytics Platforms. RRE is used by enterprises with massive data, performance and multi-platform requirements that need to drive down the cost of big data.

About MapR
MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease-of-use and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified distribution for Hadoop.

Benefits

Solution Highlights

  • Comprehensive R-Based Big Data Big Analytics Platform. Supports 100% of R and CRAN, and adds a variety of big data statistics, predictive modeling and machine learning capabilities running on MapR.
  • Broad, Fast, Scalable Analytics. Explore, model, predict and scale large data sets with a high-performance parallel architecture.
  • Enterprise-Grade Operations. An enterprise data hub founded on the MapR® Distribution is easy, dependable and very fast.
  • Expands Big Data Analytics Talent Pool. Today’s R users can now build big data analytics using R while avoiding Java, SQL and parallel programming, streamlining projects and cutting costs.

Revolution R Enterprise Benefits

  • Write Once, Deploy Anywhere. Build and deploy models in R without having to re-code or learn MapReduce.
  • High-Performance Analytics at Any Scale. Runs natively inside Hadoop to allow real-time modeling and scoring in a true commercial production environment.
  • Ongoing Commitment to R. Extends research tools into commercial applications for industrial use, while fostering the development community.

MapR Benefits

  • Top-Ranked Hadoop Distribution. One unified big data platform for Hadoop, NoSQL, database and streaming applications.
  • Proven Production Readiness. Benefit from both open source community innovation and MapR architectural enhancements.
  • Consistent High Performance. Eliminate downtime and performance bottlenecks, while ensuring business continuity.

Get started with MapR and Revolution today!
Get the MapR Sandbox for Hadoop, a fully functional Hadoop cluster running in a virtual machine. Visit www.mapr.com/sandbox

Learn more about Revolution Analytics. Call us at 1-650-646-9545 or e-mail info@revolutionanalytics.com. Visit our website at revolutionanalytics.com
