Operational Analytics and Droning About Big Data

When I worked at NASA about a dozen years ago, the agency issued a request for concept papers describing potential new technologies to support future manned missions to the Moon (and beyond).  By that time in my career, I was beginning to see everything through the “data lens”. To illustrate my growing fascination with data, one of my NASA colleagues gave me a data-related gift when I left NASA in 2003 to join George Mason University – a children’s toy hammer, with a handwritten label that said “Kirk’s data mining hammer.” The label alluded to the old saying “to a child with a hammer, all the world is a nail.” My colleague was amused at how I could find an application of data mining (machine learning and data science) to just about every problem that I encountered.  One might say, then and especially now, that I could drone on about data for hours. Data mining was my hammer, and the world of data was my nail.  Well, we now know that last part was right – the world of big data is now becoming everyone’s “nail”!  We also know that analytics and data science are becoming a universally required “hammer” in every domain.

Getting back to the NASA story – from the perspective of my data-centric universe, I envisioned a predictive analytics concept for NASA’s future Moon (and Mars) missions.  Specifically, it was a supply chain analytics concept. I imagined that there would be a time when the distant orb would have multiple bases with multiple teams of humans and robots doing scientific, engineering, mining, and other activities all across that distant landscape. To support these distributed activities for extended periods (months or years), an efficient, steady supply chain would be needed – preferably (for cost reasons) with a single delivery node servicing all receiver nodes.  As I developed my concept, I noted that one couldn’t simply order a needed item and expect quick delivery from Earth. So, I conceived a warehouse in orbit around the Moon (or Mars) stocked with most of the needed supplies.  The warehouse would be replenished autonomously by mining the database logs of available supplies, their past usage rates, their predicted future usage needs, and so forth. The predictive analytics system would trigger timely shipments from Earth, for just-in-time delivery to the distant celestial surface.  In addition, the “space-to-surface” supply replenishment deliveries from Moon orbit to a specific lunar base could also be automatically invoked using those same database logs of usage statistics and supply chain analytics workflows.  Hence, the exploration of the distant worlds would be supported by essentially the same just-in-time global replenishment system that Walmart uses!
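To make the replenishment idea a little more concrete, here is a minimal sketch of a reorder trigger driven by usage logs. It is not the concept paper's design: the names (SupplyItem, forecast_daily_usage, needs_reorder, days_until_stockout), the moving-average forecast, and the safety margin are all assumptions chosen for illustration. A real system would use a richer forecasting model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List
import math


@dataclass
class SupplyItem:
    """Hypothetical record built from the warehouse's database logs."""
    name: str
    on_hand: float            # units currently in the warehouse
    daily_usage: List[float]  # logged usage, one entry per day, oldest first


def forecast_daily_usage(item: SupplyItem, window: int = 7) -> float:
    """Predict future daily usage as the mean of the most recent `window` days.

    A simple moving average is an assumption made for this sketch.
    """
    recent = item.daily_usage[-window:]
    return mean(recent) if recent else 0.0


def needs_reorder(item: SupplyItem, lead_time_days: float,
                  safety_days: float = 0.0, window: int = 7) -> bool:
    """Trigger a shipment when stock would not outlast delivery plus a margin."""
    rate = forecast_daily_usage(item, window)
    reorder_point = rate * (lead_time_days + safety_days)
    return item.on_hand <= reorder_point


def days_until_stockout(item: SupplyItem, window: int = 7) -> float:
    """Estimate how many days the current stock lasts at the forecast rate."""
    rate = forecast_daily_usage(item, window)
    return math.inf if rate == 0 else item.on_hand / rate
```

For example, an item with 40 units on hand and a recent average usage of 5 units per day would not trigger a shipment with a 3-day lead time and a 2-day margin, but it would with a 10-day lead time. Long lead times are exactly why an Earth-to-orbit supply line needs prediction rather than reaction.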

That was then – and this is now: “How Drones Will Change the Way You Eat.”  Into this new big data world of supply chain analytics, enter the drone!  Amazon is proposing to deliver food shipments to your home by way of drone: “air-to-surface” supply replenishment deliveries!  It is just one more step to imagine that this can be done through autonomous just-in-time replenishment, sending items to your doorstep as determined through predictive supply chain analytics. Not only that, remember that we are talking about Amazon here – the king of recommendation systems using big data analytics! Amazon can use its customer data to make offers and recommend other products to you prior to the delivery of your food shipment, such as: “special discount on your new HDTV if you order it in time for your next food shipment.” The potential for increased sales and revenues for Amazon would be astronomical, and that could give Amazon a “just-in-time delivery” of another sort!

In the above examples, drones are using big data to assist in the performance of a function. But drones are also being used to generate big data!  We can imagine a multitude of national security and military drones that are collecting surveillance data of distant lands and adversaries. Drones are also being used to monitor animal cruelty on farms – see “Drone on the Farm: An Aerial Exposé.”  The big data generation rate from drones is likely to mimic data’s exponential growth rate in every other sector of the world.  In fact, the U.S. military has had to beef up its storage capacity for the avalanche of streaming video data coming from its surveillance cameras, which undoubtedly includes drones (UAVs).

The video feed data storage and processing bottleneck from drones represents just one more example of massive data growth creating massive technology challenges.  These challenges fall under the umbrella of operational analytics, which includes mining machine logs in operational settings (e.g., fleets of automobiles or aircraft), social media mining, streaming analytics on sensor data of all types, and mining the impending data flood from the Internet of Things.  The challenge of real-time big data analytics is just beginning to find real solutions, including Apache Spark in the Hadoop ecosystem, used for large-scale in-memory data processing that supports near-real-time streaming workloads and not only batch jobs.

MapR is beginning to investigate the data challenges, communication standards, analytics requirements, and technology responses that are being triggered by the Internet of Things and other operational analytics environments. The common characteristics shared among these use cases include: high input rate, streaming (time-series) data, many small files, and the need for fast micro-adjustments in the operational environment (including supply chain replenishment, just-in-time delivery, event response, and more).  Consequently, we will look into more aspects of the Internet of Things in coming articles.  In this context, we are beginning to wonder if the Internet of Things will lead to the phrase “ginormous data” as a just-in-time replacement for the phrase “big data” in the lexicon of analytics. 
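As a rough illustration of a “fast micro-adjustment” on streaming time-series data, the sketch below keeps a small sliding window of recent sensor readings and flags any new reading that strays too far from the window's average. The class name RollingMonitor, the window size, and the fixed tolerance are assumptions made for this example. It is a toy, not a description of any MapR or Spark API.

```python
from collections import deque


class RollingMonitor:
    """Hypothetical sliding-window check for one stream of sensor readings."""

    def __init__(self, window_size: int, tolerance: float):
        # deque with maxlen discards the oldest reading automatically
        self.window = deque(maxlen=window_size)
        self.tolerance = tolerance

    def update(self, value: float) -> bool:
        """Add a reading; return True if it deviates from the rolling mean.

        No alert is raised until the window has filled, so the baseline
        is always computed from `window_size` earlier readings.
        """
        alert = False
        if len(self.window) == self.window.maxlen:
            baseline = sum(self.window) / len(self.window)
            alert = abs(value - baseline) > self.tolerance
        self.window.append(value)
        return alert


if __name__ == "__main__":
    monitor = RollingMonitor(window_size=3, tolerance=5.0)
    for reading in [10, 11, 9, 10, 30, 12]:
        if monitor.update(reading):
            print(f"reading {reading} needs attention")
```

Because each update touches only a handful of recent values, this kind of check can keep pace with a high input rate, which is the property the use cases above have in common.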

So, the next time someone drones on about big data, ask them about drones: “how are your groceries being delivered?”

