How Big is Big Data?

There’s been a lot in the news lately about the NSA and Verizon call detail records. How much data are they talking about?

Each phone call has a call detail record. A call detail record contains information about a call, not the call itself. In other words, these records contain information about the originating number, the terminating number, the length of the call, etc. At first glance, it would seem to be a huge endeavor to analyze all the calls in the U.S., and it would even seem to require a huge datacenter (or multiple datacenters) to store all the data.

But in reality, the amount of data is relatively small on the spectrum of Big Data projects. There are roughly 300 million people in the U.S., and approximately 250 million of them are adults and teens. If we assume that each of them makes 10 phone calls per day, on average, we have about 2.5 billion phone calls per day. The size of a typical call detail record is 200 bytes, and in some cases multiple records are generated for a single call. Think of these call records as metadata. If we assume 10 call records per call, this expands to 2KB of data for every phone call. Given these assumptions, the total comes to 2.5 billion calls × 2KB, or 5 terabytes per day.
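The estimate above can be reproduced as a short back-of-the-envelope sketch. The constant and function names are illustrative only, and the calculation assumes decimal units (1KB = 1,000 bytes, 1TB = 10^12 bytes), which is what the round figures in the text imply.

```python
# Back-of-the-envelope estimate of daily U.S. call detail record volume.
# All inputs are the assumptions stated in the post, not measured values.
CALLERS = 250_000_000          # adults and teens in the U.S.
CALLS_PER_PERSON_PER_DAY = 10  # assumed average
BYTES_PER_RECORD = 200         # typical call detail record size
RECORDS_PER_CALL = 10          # assumed, to allow for multiple records per call


def daily_cdr_bytes(callers=CALLERS,
                    calls_per_person=CALLS_PER_PERSON_PER_DAY,
                    bytes_per_record=BYTES_PER_RECORD,
                    records_per_call=RECORDS_PER_CALL):
    """Return the estimated bytes of call detail records generated per day."""
    calls_per_day = callers * calls_per_person            # 2.5 billion calls
    bytes_per_call = bytes_per_record * records_per_call  # 2,000 bytes (2KB)
    return calls_per_day * bytes_per_call


if __name__ == "__main__":
    total = daily_cdr_bytes()
    print(f"{total / 1e12:.1f} TB per day")
```

Changing any one assumption, such as the number of records per call, scales the result linearly, so the conclusion is not sensitive to small errors in the inputs.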

At MapR, we have customers who are analyzing many times this amount of data on a daily basis. How big would the cluster need to be? Well, we have customers with 32TB of data on a single node. If an organization wanted to analyze 30 days of U.S. call detail records, it would be approximately 150TB of data (30 days × 5TB per day), which fits on just 5 nodes of a MapR Hadoop cluster at 32TB per node. The total call record volume for the U.S. wouldn’t come close to creating a busy signal on a MapR cluster.
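The node count follows from a simple division, sketched here with a hypothetical helper. It counts raw data only, as the post does; replication and free-space headroom are not modeled and would increase the number.

```python
import math

# Figures taken from the post; the function name is illustrative.
DAILY_TB = 5         # estimated call detail record volume per day
RETENTION_DAYS = 30  # analysis window
TB_PER_NODE = 32     # data held on a single node, as cited in the post


def nodes_needed(daily_tb, days, tb_per_node):
    """Return (total TB stored, nodes required), rounding nodes up."""
    total_tb = daily_tb * days
    return total_tb, math.ceil(total_tb / tb_per_node)


if __name__ == "__main__":
    total_tb, nodes = nodes_needed(DAILY_TB, RETENTION_DAYS, TB_PER_NODE)
    print(f"{total_tb} TB over {RETENTION_DAYS} days needs {nodes} nodes")
```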

Use This Graphic for FREE on Your Site!

You may use the infographic above on your website. However, the license we grant to you requires that you properly and correctly attribute the work to us with a link back to our website, using the following embed code.

Embed Code

<div style="width: 280px">
<a href="http://www.mapr.com/images/blog/big-data-infographic-lg.jpg">
<img src="http://www.mapr.com/images/blog/thumb-big-data-infographic.jpg" alt="Big Data &amp; Apache Hadoop capabilities" /></a><br/> To view the original post, see the original <a href="http://www.mapr.com/blog/how-big-is-big-data">Big Data infographic</a>.</div>


