This post is the second in a series where we will go over examples of how MapR data scientist Joe Blue assisted MapR customers, in this case a regional bank, to identify new data sources and apply machine learning algorithms in order to better understand their customers. In this second part, we will cover a bank customer profitability 360° example, presenting the before, during and after.
In this blog post, I’ll help you get started using Apache Spark’s spark.ml Logistic Regression for predicting cancer malignancy. Spark’s spark.ml library goal is to provide a set of APIs on top of DataFrames that help users create and tune machine learning workflows or pipelines.
This blog post is the second in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer (Read Part 1). In his book, Davi outlines the rationale and key challenges of securing big data systems and applications, and he’s included some terrific anecdotes to make the entire book a quick and insightful read.
In this Whiteboard Walkthrough, MapR Chief Application Architect, Ted Dunning, explains how special capabilities such as mirroring, bi-directional stream and table replication and control of data locality make MapR particularly effective in cloud computing, whether you use cloud-to-cloud clusters or a hybrid of cloud and on-premise. Ted also explains how cloud bursting is a useful strategy for elastic work loads.
In this week’s Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, describes advantages of MapR Converged Data Platform and how they work in the cloud. With files, tables and streams engineered into the same technology, MapR has particular advantages for multi-tenancy in the cloud including common pathnames and common security.
Business owners and executives today know the power of social media, mobile technology, cloud computing, and analytics. If you pay attention, however, you will notice that truly mature and successful digital businesses do not jump at every new technological tool or platform.
Considering big data techniques, Hadoop-based approaches were among the first to be widely recognized and widely used, but Hadoop is just a part of modern big data solutions. Evolving technologies offer a wide range of capabilities that include distributed file storage, NoSQL databases, data stream transport and stream processing, search, SQL-on-big-data, machine learning, and more.
In this week’s Whiteboard Walkthrough, Rachel Silver, Ecosystem Product Manager at MapR, talks about MapR Ecosystem Packs or MEPs that give you a convenient way to upgrade open source ecosystem components without having to upgrade the core MapR platform. The open source components in MEPs have been tested to be functionally interoperable within the MEP so that you can spend more time processing/analyzing data and less time troubleshooting your stack.
According to Gartner, by 2020, a quarter of a billion connected cars will form a major element of the Internet of Things. Connected vehicles are projected to generate 25GB of data per hour, which can be analyzed to provide real-time monitoring and apps, and will lead to new concepts of mobility and vehicle usage.
Earlier this year, I published a series of posts on the deployment of Apache Drill to Azure. While the steps covered in those posts work, I’d like to speed up the process significantly. With the MapR Converged Data Platform available in the Azure Marketplace, I can have a Drill-enabled MapR cluster up and running much faster and with much less effort.
- 1 of 93
Blog Sign Up
Sign up and get the top posts from each week delivered to your inbox every Friday!