How to Do It: New Approaches to Big Data Machine Learning and Analytics

It’s not just how you store big data but what you can do with it, and that was apparent as Java developers took part in the Devoxx conferences in London and Paris last week. Participants had a lot to say about the international presenters, among them MapR Chief Application Architect and Apache Mahout committer Ted Dunning.

Ted gave two presentations during Devoxx France 2013, the first to the Paris DataGeeks group on new approaches to machine learning with Apache Mahout and the second at the Devoxx main conference on real-time learning with Storm and the MapR Distribution for Hadoop at scale.

The Mahout talk focused on a new way to do collaborative filtering with cross-recommendations and on advances in big data discovery through clustering. The highlight of the latter topic was a new, lightning-fast clustering algorithm that will be part of the upcoming 0.8 release of Mahout. Visit here for more information and to view the slides.

The second talk, Real-time Learning, approached this big data topic at both the detailed mathematical level and the practical architectural level, including ways to use Storm with the MapR Distribution for Hadoop. Ted dove into some serious math, some serious magic and a lot of useful content on how best to use these models in many settings, from financial portfolio optimization to direct mail to website design. As with the Paris DataGeeks event, the enthusiasm and participation of the Devoxx audience showed how strongly these developers are interested in new ways to analyze big data. Visit here for more information and to view the slides.

If you are in Paris, be sure to see Ted at Big Data Paris. He will present two sessions: "Getting a grasp of Hadoop’s world – Open Source and branded solutions" and "Expect More From Hadoop" on Wednesday, April 3rd.
