George Demarest presents the story of how organizations progress through four phases of digital transformation: Phase 1: Experimentation, Phase 2: Implementation, Phase 3: Expansion, and Phase 4: Optimization.
Data governance is essential to delivering maximum value from your big data environment. Without knowing what data you have, what it means, who uses it, what it is for, and how good it is, you can never create the insights and information needed to run a modern data-driven enterprise. Rather than being an afterthought, data governance needs to be front and center in the organizational effort to harness the power of the enterprise's data.
Apache Spark is becoming the de facto standard as a processing and compute engine for big data workloads. The benefits of both in-memory processing and support for multiple programming languages have made it much easier to develop big data applications with Spark.
To learn how Hadoop is evolving to enable enterprise-grade deployments that serve a broadening list of use cases and user profiles, read "Hadoop for the Enterprise," the new Best Practices Report by TDWI’s Philip Russom.
Without design principles, swimming in circles in a big data lake can make your arms tired. Fortunately, the data lake concept has evolved so that best practices have emerged. This Checklist Report discusses what your enterprise should consider before diving into a data lake project, whether it's your first, second, or even third major data lake project. Presumably, adherence to these principles will become second nature to the data lake team, and they will even improve upon them at some point.
Data warehouse modernization takes many forms. Many users are diversifying their software portfolios, while others are even decommissioning current DW platforms in order to replace them with modern ones optimized for today’s requirements in big data, analytics, real time, and cost control. No matter what modernization strategy is in play, all require significant adjustments to the logical and systems architectures of the extended data warehouse environment.
Disaster recovery (DR) is the science of returning a system to operating status after a site-wide disaster. DR enables business continuity after significant data center failures that high availability features cannot cover.
MapR Converged Community Edition (MapR CE) is a free edition of the MapR Converged Data Platform with community forum support for unlimited production use. This free version includes Apache Hadoop, Apache Spark™, MapR-DB (NoSQL database), MapR Streams (event streaming), and MapR-FS (POSIX file system). MapR CE enables distributed processing of large data sets across a cluster of servers. MapR delivers a proven platform that supports a broad set of mission-critical and real-time production uses.
MapR Technologies and Waterline Data deliver a joint solution that provides data governance capabilities on big data.
There are two types of companies in the big data space: 1) those born in big data, which deliver a competitive edge through their software or their processes for enabling data, and 2) those with a mandate to simultaneously cut IT and storage costs and create a platform for innovation.