Federal, Regional and Local Government Agencies Use Big Data and Hadoop to Accomplish Critical Missions and Objectives
Government agencies collect vast amounts of data every single day. To make the most of this fast-growing volume of data, the Obama Administration created the “Big Data Research and Development Initiative,” which included an investment of over $250 million in big data across federal agencies. Big data technologies are now playing a key role in public sector fields such as intelligence, defense, cybersecurity and scientific research.
Federal, regional and local agencies and departments can benefit from state-of-the-art big data technologies in order to store, preserve, and analyze massive amounts of data. Technologies currently in use are either too expensive or cannot scale as desired. By leveraging big data technologies such as the MapR Converged Data Platform, these agencies can strengthen the country’s national security, transform teaching and learning, accelerate the pace of innovation and discovery in the science and engineering fields, and much more.
MapR provides government agencies with a cost-effective and scalable architecture that drives real-time and situational analysis and supports information flow across multiple agencies and departments.
MapR is currently deployed across the Intelligence Community, Department of Defense, and civilian agencies.
Government Use Cases
The federal government launched a cybersecurity research and development plan that relies on the ability to analyze large data sets in order to improve the security of U.S. computer networks. One such initiative involves the Department of Homeland Security, which is deploying an intrusion detection system of sensors that are capable of analyzing internet traffic entering Federal systems, as well as identifying malware and unauthorized access attempts. The MapR Converged Data Platform allows for building models that can detect and identify these unauthorized attempts and separate abnormal activities from regular activities.
The National Geospatial-Intelligence Agency is creating a “Map of the World” that can gather and analyze data from a wide variety of sources such as satellite and social media data. The map contains a variety of data from classified, unclassified, and top secret networks, and is the main source of geo-intelligence that the NGA shares with other intelligence agencies such as the NSA, CIA, and the Defense Intelligence Agency. The MapR Converged Data Platform can be used as an Enterprise Data Hub to store and analyze various types of structured and unstructured data.
Crime Prediction and Prevention
According to a UNODC (United Nations Office on Drugs and Crime) report, criminals laundered close to $1.6 trillion in 2009, or 2.7% of global GDP. The Financial Crimes Enforcement Network (FinCEN), a bureau of the U.S. Treasury Department, uses an analytics tool to collect and analyze large numbers of bank transactions in order to combat domestic and international money laundering, terrorist financing, and other financial crimes. In addition, local agencies such as police departments can leverage advanced, real-time analytics to provide actionable intelligence that can be used to understand criminal behavior, identify crime and incident patterns, and uncover location-based threats. The MapR Converged Data Platform provides capabilities such as machine learning and anomaly detection that allow for the identification of patterns that can help predict and reduce crime.
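As a rough illustration of what anomaly detection over transactions can look like, the sketch below flags unusually large amounts using a modified z-score based on the median. It is a minimal sketch on made-up data, not the method used by FinCEN or by any MapR component; the function name `flag_anomalies`, the sample amounts, and the 3.5 threshold are all assumptions chosen for illustration.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical transaction amounts; one is far outside the usual range.
transactions = [120.0, 95.5, 130.0, 110.0, 101.0, 9800.0, 125.0]
suspicious = flag_anomalies(transactions)
print(suspicious)
```

In practice a model of this kind would run over far larger data sets and richer features (counterparties, locations, timing), but the principle of separating abnormal from regular activity is the same.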
The Department of Defense (DOD) is investing $250 million every year across military departments in a series of programs that will focus on harnessing and utilizing massive data in order to advance agency missions. Sensing, perception and decision support will be combined in order to develop autonomous systems that can make decisions on their own. The MapR Converged Data Platform can increase the number of activities and events that an analyst can observe in real time and improve their decision-making abilities.
The driving force behind the National Security Agency’s data processing capabilities is Accumulo, an open source project created by the NSA that gives users the ability to store data in very large tables for fast access with fine-grained security. By bringing data sets together, the NSA can use Accumulo to investigate certain details while blocking access to information that could reveal personal data, allowing administrators to remain in compliance with privacy requirements. Accumulo runs seamlessly on the MapR Converged Data Platform, so Accumulo users inherit the strong dependability of the MapR platform that HBase™ users have enjoyed for a long time.
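The idea behind fine-grained security can be sketched in a few lines: each stored value carries a visibility expression, and a reader sees the value only if their authorizations satisfy it. The example below is a simplified Python illustration of that concept, not Accumulo's actual API or implementation; the function `is_visible`, the labels, and the sample cells are hypothetical. It also evaluates mixed `&` and `|` operators left to right, whereas real visibility expressions are expected to use parentheses when mixing them.

```python
def is_visible(expression, authorizations):
    """Check a visibility expression such as "a&(b|c)" against a set of labels.

    Simplified sketch: "&" means both sides are required, "|" means either
    side is enough, and an empty expression is treated as visible to everyone.
    """
    tokens = expression.replace(" ", "")
    pos = 0

    def parse_term():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1              # consume "("
            value = parse_expr()
            pos += 1              # consume ")"
            return value
        start = pos
        while pos < len(tokens) and tokens[pos] not in "&|()":
            pos += 1
        return tokens[start:pos] in authorizations

    def parse_expr():
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in "&|":
            op = tokens[pos]
            pos += 1
            right = parse_term()
            value = (value and right) if op == "&" else (value or right)
        return value

    return parse_expr() if tokens else True


# Hypothetical cells: (field name, visibility expression).
cells = [
    ("name", "analyst"),
    ("ssn", "analyst&privacy_officer"),
    ("region", "analyst|auditor"),
]

analyst_view = [field for field, label in cells if is_visible(label, {"analyst"})]
print(analyst_view)
```

An analyst without the `privacy_officer` label can work with the combined data set while the field that could reveal personal data stays hidden.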
Pharmaceutical Drug Evaluation
According to a McKinsey & Co. report, big data technologies could reduce research and development costs for pharmaceutical makers by $40 billion to $70 billion. Both the FDA and NIH use big data technologies to access large amounts of data to evaluate drugs and treatments, and to decide whether warning labels are needed. In addition, researchers can use the MapR Converged Data Platform to analyze a much larger patient population, decide which treatments are most effective, and identify patterns in drug side effects.
In keeping with its focus on basic research, the National Science Foundation has initiated a long-term plan to: 1) implement new methods for deriving knowledge from data, 2) develop new approaches to education, and 3) create a new infrastructure to “manage, curate, and serve data to communities”. The MapR Converged Data Platform can be used to solve specific problems that include:
- Using big data in research for uncovering protein structures and biological pathways.
- Giving geoscientists access to big data technologies in order to analyze and share information about our planet.
- Training undergraduate students to use graphical/visualization techniques for analyzing complex data.
The National Oceanic and Atmospheric Administration, or NOAA, collects data every minute of every day from land, sea, and space-based sensors. When you hear your local forecast about an incoming tornado or hurricane, that weather report is using data that comes directly from NOAA. On a daily basis, NOAA uses big data approaches to collect, analyze and extract value from over 20 terabytes of data. Technologies such as the MapR Converged Data Platform are ideally suited to managing such large and varied volumes of data.
The MapR Converged Data Platform can be used by tax organizations to analyze both unstructured and structured data from a variety of sources in order to identify suspicious behavior and multiple identities, which could lead to an increase in tax fraud identification. By leveraging the power of Hadoop, tax organizations can proactively detect and prevent tax fraud.
Health and Human Services Fraud Detection/Decision Support
The MapR Converged Data Platform enables health and human services agencies to perform data mining and predictive analytics in order to detect fraud. For example, a social services department can analyze data in order to uncover anomalies within a state childcare program, which can help investigators prioritize their caseload.
By consolidating data from many different local agencies, local governments can coordinate communications during city-wide emergency situations. In addition, responders can use technologies such as the MapR Converged Data Platform to analyze words, pictures, and hashtags on Twitter in order to decide where supplies such as food and water are needed the most.
Local government agencies need to have the ability to analyze traffic flow data on different roads or in different parts of the city. The MapR Converged Data Platform helps in aggregating real-time traffic data gathered from road sensors, GPS devices and video cameras and provides traffic managers with the ability to identify potential problems in a public bus network. These potential traffic problems in dense urban areas can be prevented by adjusting public transportation routes in real time.
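To make the aggregation step concrete, the sketch below averages speed readings per road segment and lists the segments that fall below a congestion threshold. It is a minimal, in-memory illustration on invented data; the function `slow_segments`, the segment names, and the 20 km/h threshold are assumptions, and a real deployment would read a continuous stream from sensors, GPS devices and cameras rather than a small list.

```python
from collections import defaultdict


def slow_segments(readings, min_speed_kmh=20.0):
    """Return road segments whose average observed speed is below the threshold.

    `readings` is an iterable of (segment_name, speed_kmh) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [speed sum, count]
    for segment, speed in readings:
        totals[segment][0] += speed
        totals[segment][1] += 1
    averages = {seg: total / count for seg, (total, count) in totals.items()}
    return sorted(seg for seg, avg in averages.items() if avg < min_speed_kmh)


# Hypothetical readings gathered from road sensors and GPS devices.
readings = [
    ("Main St", 12.0),
    ("Main St", 18.0),
    ("Elm Ave", 45.0),
    ("Elm Ave", 50.0),
    ("Harbor Rd", 19.0),
]

congested = slow_segments(readings)
print(congested)
```

A traffic manager could use a result like this to decide which bus routes to adjust while the congestion is still developing.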
The US government is actively deploying the MapR Converged Data Platform to process and analyze fast-growing data far more cost-effectively. Enterprise-grade features such as mirroring and snapshots allow Federal agencies to comply with Continuity of Operations (COOP) requirements. Direct Access NFS™ enables agencies to leverage existing applications.