Among big data techniques, Hadoop-based approaches were some of the first to be widely recognized and used, but Hadoop is just one part of modern big data solutions. Evolving technologies offer a wide range of capabilities, including distributed file storage, NoSQL databases, data stream transport and stream processing, search, SQL-on-big-data, machine learning, and more.
In this blog post, I’ll help you get started using Apache Spark’s spark.ml Logistic Regression for predicting cancer malignancy. The goal of Spark’s spark.ml library is to provide a set of APIs on top of DataFrames that help users create and tune machine learning workflows or pipelines.
In this week’s Whiteboard Walkthrough, Rachel Silver, Ecosystem Product Manager at MapR, talks about MapR Ecosystem Packs (MEPs), which give you a convenient way to upgrade open source ecosystem components without having to upgrade the core MapR platform. The open source components in each MEP have been tested to be functionally interoperable within that MEP, so you can spend more time processing and analyzing data and less time troubleshooting your stack.
According to Gartner, by 2020, a quarter of a billion connected cars will form a major element of the Internet of Things. Connected vehicles are projected to generate 25 GB of data per hour, which can be analyzed to provide real-time monitoring and apps, and which will lead to new concepts of mobility and vehicle usage.
Earlier this year, I published a series of posts on the deployment of Apache Drill to Azure. While the steps covered in those posts work, I’d like to speed up the process significantly. With the MapR Converged Data Platform available in the Azure Marketplace, I can have a Drill-enabled MapR cluster up and running much faster and with much less effort.
The last decade has ushered in a perfect storm of disruption for the financial services sector – arguably the most data-intensive sector of the global economy. As a result, companies in this sector are caught in a vice.
There is no denying it – we live in The Age of the Customer. Consumers all over the world are now digitally empowered, and they have the means to decide which businesses will succeed and grow, and which ones will fail. As a result, most savvy businesses now understand that they must be customer-obsessed to succeed.
In this week's Whiteboard Walkthrough video, Sameer Nori, Senior Product Marketing Manager at MapR Technologies, compares a traditional data warehouse or MPP database with a modern data lake. Sameer explains the advantages in data agility and data exploration that a data lake offers, and how it can complement an existing data warehouse deployment.
The field of data science is one of the youngest and most exciting fields in the technology sector. In no other industry or field can you combine statistics, data analysis, research, and marketing to do jobs that help businesses complete their digital transformation and reach full digital maturity.
According to IDC, the big data market, including services such as analytics, is expected to reach nearly $50 billion by 2019. Luckily for Quantium, data is in its DNA. Australia’s largest analytics business is happily riding the coattails of the booming global big data industry.