We live in a world where the combination of Moore’s Law and Metcalfe’s Law heralds a data revolution. The billions of smartphone and broadband users today already generate massive quantities of data. Cheap sensors in Internet of Things (IoT) devices and new 5G networks optimized for machine-to-machine (M2M) communication mean that this first wave of users will soon be joined by tens of billions of machines.
The IoT is here and 5G is coming
The IoT is already happening—with 13 billion devices in use today. And 5G is just around the corner, with the first networks to be commercially available around 2020. 5G has been designed for capacity but also specifically with M2M communications in mind—scale, low power, energy efficiency, and support for a massive number of devices.
Volume and velocity of big data
The volume and velocity of data today are almost unlimited, while the constraints on collecting and moving it around the world are disappearing. For the most part, this is happening in the enterprise space. The question for enterprises, and for their CIOs specifically, is whether all of this data can be gathered, often in real time, stored, and then analyzed.
These kinds of questions highlight the IT challenges that enterprises will face from this data explosion, with perhaps the most important being the question of ingestion.
The question of ingestion
Data has to be ingested in real time through a distributed messaging layer that decouples data capture from the storage, processing, and analysis of the data. Apache Kafka and MapR Streams have been built to capture any type of data and move it between server and software components in a publish-subscribe model.
As the raw data is streamed into the processing and storage layers, it can be used for many different purposes, from being stored for later analytics to being aggregated and filtered for real-time analytics.
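The decoupling described above can be sketched in a few lines. This is a minimal in-memory illustration in plain Python, not the Kafka or MapR Streams API: the `Topic` class, the consumer names, and the sensor readings are all hypothetical. It only shows the idea that a producer appends to a log while each consumer reads at its own pace for its own purpose.

```python
from collections import defaultdict


class Topic:
    """Hypothetical in-memory stand-in for a topic: an append-only log."""

    def __init__(self):
        self.log = []                    # messages in arrival order
        self.offsets = defaultdict(int)  # next unread position per consumer

    def publish(self, message):
        # The producer only appends; it knows nothing about consumers.
        self.log.append(message)

    def poll(self, consumer):
        # Each consumer reads from its own offset, so consumers do not
        # interfere with each other or with the producer.
        start = self.offsets[consumer]
        batch = self.log[start:]
        self.offsets[consumer] = len(self.log)
        return batch


topic = Topic()
for reading in [
    {"device": "sensor-1", "temp_c": 21.5},
    {"device": "sensor-2", "temp_c": 48.0},
    {"device": "sensor-1", "temp_c": 22.1},
]:
    topic.publish(reading)

# Consumer 1: keep every raw reading for later analytics.
archive = topic.poll("storage")

# Consumer 2: filter the same stream for real-time alerting.
alerts = [m for m in topic.poll("alerting") if m["temp_c"] > 40.0]

print(len(archive), alerts)
```

Because the two consumers track separate offsets, adding a third use of the data later would not require any change to the producer.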
Performing rich analytics
Apache Spark is today one of the most widely used technologies for aggregating and filtering streamed data in this way. It can process, aggregate, and transform data in real time in a distributed and scalable way.
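To make the kind of aggregation meant here concrete, the sketch below averages sensor readings per device over fixed one-minute windows. It is plain single-process Python and deliberately does not use the Spark API; the function name, the window size, and the sample events are assumptions for illustration. An engine such as Spark would run the same grouping logic partitioned across a cluster.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window size


def aggregate_by_window(events, window_seconds=WINDOW_SECONDS):
    """Average each device's readings per fixed (tumbling) time window.

    `events` is an iterable of (timestamp_seconds, device, value) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (device, window_start) -> [total, count]
    for ts, device, value in events:
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % window_seconds)
        bucket = sums[(device, window_start)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical events: (timestamp in seconds, device id, temperature).
events = [
    (0, "sensor-1", 20.0),
    (30, "sensor-1", 22.0),
    (45, "sensor-2", 50.0),
    (75, "sensor-1", 30.0),
]

averages = aggregate_by_window(events)
for (device, window_start), avg in sorted(averages.items()):
    print(device, window_start, avg)
```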
Once the data is in the system and has been processed, it becomes possible to perform rich analytics. SQL is still the most widely used language for analytics, mostly because it is well known and powerful, but also because it is compatible with the analytics and reporting tools used by the vast majority of business users and application developers.
In the context of the IoT, which generates a lot of data in different formats, the SQL engine must be able to deal with all of these formats and also make good use of the power of the servers deployed in a cloud infrastructure. This is why cloud is key to the success of the IoT and 5G. It gives enterprises the ability to manage these massive data flows and extract value from them. Without the cloud and big data, the benefit of 5G is only its capacity, and its true potential will never be realized.
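As a small illustration of the point about one SQL layer over several formats, the sketch below loads readings that arrive as JSON and as CSV into a single table and queries them together. It uses Python's built-in `sqlite3` module as a stand-in for a distributed SQL engine, which is an assumption made only to keep the example self-contained; the payloads, table name, and column names are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical payloads: the same kind of reading arriving in two formats.
json_payload = (
    '[{"device": "sensor-1", "temp_c": 21.5},'
    ' {"device": "sensor-2", "temp_c": 48.0}]'
)
csv_payload = "device,temp_c\nsensor-1,22.5\nsensor-3,19.0\n"

# Normalize both formats into (device, temp_c) rows.
rows = [(r["device"], float(r["temp_c"])) for r in json.loads(json_payload)]
rows += [
    (r["device"], float(r["temp_c"]))
    for r in csv.DictReader(io.StringIO(csv_payload))
]

# Load the rows into an in-memory database and query them with plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, temp_c REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

result = conn.execute(
    "SELECT device, AVG(temp_c) FROM readings GROUP BY device ORDER BY device"
).fetchall()
print(result)
```

Here the formats are normalized before loading; an engine designed for this workload would typically hide that step from the person writing the query.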
MapR is the acknowledged leader for developing and running innovative data applications. The MapR Converged Data Platform brings together Hadoop and Spark with global event streaming; real-time, top-ranked NoSQL database capabilities; and enterprise storage. Ericsson and MapR have formed a partnership to drive innovation for digital industrialization. As part of this, Ericsson is adopting the MapR Converged Data Platform for Ericsson’s cloud portfolio.
Editor's Note: This blog post was originally published here.