Dr. Kirk Borne is a Principal Data Scientist at Booz Allen Hamilton. Previously he was a Professor of Astrophysics and Computational Science in the George Mason University School of Physics, Astronomy, and Computational Sciences. He was at Mason from 2003 to 2015, where he taught and advised students in the graduate and undergraduate Computational Science, Informatics, and Data Science programs. Before Mason, he spent nearly 20 years in positions supporting NASA projects, including an assignment as NASA's Data Archive Project Scientist for the Hubble Space Telescope, and as Project Manager in NASA's Space Science Data Operations Office. He has extensive experience in big data and data science, including expertise in scientific data mining and data systems. He has published over 200 articles (research papers, conference papers, and book chapters), and given over 200 invited talks at conferences and universities worldwide. In these roles, he focuses on achieving big discoveries from big data through data science, and he promotes the use of information and data-centric experiences with big data in the STEM education pipeline at all levels. He believes in data literacy for all! Learn more about him at http://kirkborne.net/. You can follow him on Google+ and on Twitter at @KirkDBorne, where he has been identified as one of the social network’s top big data influencers.
Much has been written about the power of big data collections to enable the 360 view of our customers, our business, our employees, and our processes. When our numerous disparate heterogeneous data collections are aggregated and joined in the data lake, with appropriate data tagging and data discovery tools in place (such as Apache Drill), then we can reach for that ideal: the 360 view of our domain!
One of the most significant characteristics of the evolving digital age is the convergence of technologies. That includes information management (structured and unstructured databases: e.g., NoSQL), data collection (big data), data storage (cloud and distributed data: e.g., Hadoop), data applications (analytics), knowledge discovery (data science), algorithms (machine learning), transparency (open data), computation (distributed data processing: e.g., MapReduce and Spark), sensors (Internet of Things: IoT), and API services (microservices, containerization).
In the beginning was data. How do we know this? Because many (if not all) creation stories from all cultures were essentially developed as an explanation of the world as observed by humans.
Dimensionality reduction is a critical component of any solution dealing with massive data collections. Being able to sift through a mountain of data efficiently in order to find the key descriptive, predictive and explanatory features of the collection is a fundamental required capability for coping with the Big Data avalanche.
The Internet of Things (IoT), with its ubiquitous sensors and streams of big data for big insights, has an estimated market valuation of 17 trillion U.S. dollars. Apparently, the "sensoring" of the world is a seriously big deal, generating insights into people, processes, and products on a scale that is almost incomprehensible. Certainly 17 trillion dollars is almost incomprehensible.
Someone once said “if you can’t measure something, you can’t understand it.” Another version of this belief says: “If you can’t measure it, it doesn’t exist.” This is a false way of thinking – a fallacy – in fact it is sometimes called the McNamara fallacy.
Big data flows from all channels in the modern technological world: social, mobile, networks, sales, machines, sensors, markets, etc. In fact, big data flows so abundantly that we choose water-themed metaphors to describe it: data lake, data flood, data tsunami, oceans of data, streaming data, and even the CD sea of data.
Big Data is a big deal in all enterprises everywhere. It is also a big concept. In other words, big data is not only about rapidly increasing data volumes. There is also a rapid increase in data-driven business culture, applications, and use cases. As these continue to spiral upward, there is an accompanying increased demand for easy-to-use analytics tools.