Slicing and Dicing Big Data Science in the Academic Universe

Reflecting on the numerous applications and use cases of Big Data in the world, we see that there is essentially no domain of human endeavor that is untouched by the data analytics revolution.  Consequently, no business function can be isolated from data-driven processes and decisions – including development, operations, technology, finance, marketing, sales, customer service, consumer sentiment, human resources, and leadership functions.  The flood of information from numerous channels delivers valuable insight, foresight, and oversight to all aspects of modern business. 

The same can be said of academic institutions, particularly those that are paying attention to the market and to the big data revolution, including such landmark events as the release of the 2011 McKinsey Global Institute report “Big Data: The Next Frontier for Innovation, Competition, and Productivity” and the announcement in 2012 of the US President’s “National Big Data Research and Development Initiative.”  Change is definitely occurring all across the academic spectrum in response to this new frontier of learning, research, and discovery. Nevertheless, despite the ubiquity of Big Data and Data Science, when I discuss these topics with my colleagues at the university, particularly those who are new to the subject, I tend to refer repeatedly to the same few shining examples on our campus at George Mason University:  the business analytics components of the MBA program in the School of Management, the machine learning and Hadoop components of the massive data analytics program in the Engineering School, and the data science components of the computational science and informatics program in the College of Science (where my affiliation lies).

To give due attention to the many places in the modern innovative university where Big Data and Data Science are touching faculty and students, I started my own list of the departments, institutes, and programs at just my own university where big data science is cutting across and into both new and traditional higher education programs.  The list grew, and grew, and grew!  The more that I reflected on conversations that I have had, on meetings that I have attended, and on emails that I have exchanged over the past decade, the more I began to appreciate the breadth (not just the depth) of Big Data Science penetration in academia.

Below is my current list (which undoubtedly will grow as soon as this article is published), in which I attempt to slice and dice the big data science academic pie.  I offer the list here in the hope that it may encourage and inspire business and government leaders to welcome this change more whole-heartedly (which might mean changing your employment requirements for new hires) and to welcome the innovative digital workers that colleges and universities will be sending you in the coming years.  “Born Digital” may be slightly cliché, and it perhaps even over-interprets the sociological data, but maybe not.  In either case, it cannot be ignored.

Academic programs exploring big data science at George Mason University include:

  • Air Transportation Systems (causal and prescriptive analytics)
  • Astrostatistics and Astroinformatics (big data science for astronomy)
  • Automobile Transportation Safety (data mining; multiple-database integration)
  • Biostatistics and Bioinformatics (big data science for genomics, proteomics, …)
  • Campus Computing (big iron for computing on big data)
  • Climate Science (analytics on massive data and simulations)
  • Communications (data-driven journalism)
  • Computational Sciences (data-oriented discovery and inquiry = 4th Paradigm of Science)
  • Computer Science (analytics; machine learning; data-intensive computing)
  • Cybersecurity (predictive analytics)
  • Education (data-driven decision-making for school officials)
  • Environmental Science (computational sustainability; data analytics)
  • Geography, Geoinformation, and Geospatial Intelligence (GIS; location-based analytics)
  • Health Informatics (machine learning; predictive analytics)
  • Language Arts (computational linguistics; text analytics; natural language processing)
  • Library Services (data management and curation; metadata creation)
  • Materials Science (materials informatics; cheminformatics)
  • Mathematics (algorithms; optimization theory)
  • Nursing Informatics (medical information; EHRs; EMRs)
  • Operations Research (decision science; prescriptive analytics)
  • Psychology (big data analytics in organizational science)
  • Public Health (statistics; prescriptive analytics)
  • Public Policy (predictive and risk analytics)
  • School of Management (business analytics)
  • Simulation Science (massive data streams from supercomputer simulations)
  • Social Complexity (social network analysis and modeling)
  • STEM Education (citizen science = crowdsourcing big data science problems)
  • Statistics (duhhh!)  

I am sure that I left out somebody’s group, and I apologize in advance for my oversight. I wish to end with a slightly deeper slice and dice into the program in which I teach and do research: CSI (Computational Science and Informatics).  The Data Science concentration in the CSI graduate program invites and engages MS and PhD students who are studying astronomy and space sciences, space plasma physics, computational statistics, computational learning, materials science, medicine, automobile transportation safety, finance, citizen science, big image processing, text analytics, visual analytics, and any research domain in which the language and methods of data science are applicable… which is probably anything!  Data science is truly transdisciplinary – it transcends discipline and domain boundaries (in academia and in business).

Therefore, as we think about how big data science might be impacting this or that corner of our lives and our businesses, let us recognize the fact that today’s data are born digital, that we need the right tools and talents to benefit from the corresponding analytics (“learning from data”) revolution, and that there is no escaping big data.  Resistance is futile. You and your processes will be assimilated.  
