Teaching Old Engines New Tricks: New Approaches to Machine Learning

It was really exciting to be in Berlin this week for the open source conference Berlin Buzzwords 2013. The audiences were energized, and one of the hottest topics, drawing a lot of interest and enthusiasm, was search.

This enthusiasm was especially apparent at the June 4 presentation “Multi-modal Recommendation Algorithms” by Ted Dunning, MapR’s Chief Applications Architect. Surprisingly, a major part of this recommendation/machine-learning talk involved search, in particular, the use of Apache Solr/Lucene with Apache Mahout on the MapR distribution for Apache Hadoop.

The main thrust of the talk had to do with the advantage gained by using multiple behaviors as the source of input data for building a recommendation engine. Normally in a recommendation system, you observe behaviors similar to the one you want to drive through the use of your recommender, and then you use those behaviors as your input data to build and train your model. In contrast, Ted’s multi-modal approach has two new twists:

  1. Use multiple types of behavior as input to a Mahout-based recommendation model.
  2. Use the behavioral indicators output from the Mahout step as input for Solr-based search. The search engine here is abused to provide recommendations instead of search results.
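
The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not call Mahout or Solr, the function and field names (`build_indicators`, `build_index`, `recommend`, `purchase_indicators`, `view_indicators`) and the sample logs are hypothetical, and a simple cooccurrence count with a threshold stands in for whatever analysis the Mahout step actually performs. The ranking by field overlap is likewise a rough stand-in for a search engine query.

```python
from collections import defaultdict


def group_by_user(log):
    """Turn (user, item) pairs into {user: set of items}."""
    by_user = defaultdict(set)
    for user, item in log:
        by_user[user].add(item)
    return by_user


def build_indicators(target_log, behavior_log, min_count=2, skip_self=False):
    """Step 1 stand-in: for each target item, keep the behavior items that
    co-occur with it for at least min_count users (assumed threshold)."""
    targets = group_by_user(target_log)
    behaviors = group_by_user(behavior_log)
    counts = defaultdict(lambda: defaultdict(int))
    for user, target_items in targets.items():
        for t in target_items:
            for b in behaviors.get(user, ()):
                if skip_self and b == t:
                    continue  # an item is not an indicator for itself
                counts[t][b] += 1
    return {t: {b for b, c in seen.items() if c >= min_count}
            for t, seen in counts.items()}


def build_index(purchases, views):
    """One 'document' per item, with one indicator field per behavior type."""
    by_purchase = build_indicators(purchases, purchases, skip_self=True)
    by_view = build_indicators(purchases, views)
    items = {item for _, item in purchases}
    return {item: {"purchase_indicators": by_purchase.get(item, set()),
                   "view_indicators": by_view.get(item, set())}
            for item in items}


def recommend(index, history, top_n=5):
    """Step 2 stand-in: use the user's recent history as the 'query' and
    rank items by how many indicator terms match across all fields."""
    owned = history.get("purchase_indicators", set())
    scores = {}
    for item, fields in index.items():
        if item in owned:
            continue  # do not recommend what the user already bought
        score = sum(len(fields[f] & history.get(f, set())) for f in fields)
        if score > 0:
            scores[item] = score
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical logs: two behavior types observed for the same users.
purchases = [("u1", "A"), ("u1", "B"), ("u2", "A"), ("u2", "B"),
             ("u3", "A"), ("u3", "C"), ("u4", "B")]
views = [("u1", "X"), ("u2", "X"), ("u3", "Y"), ("u4", "X")]

index = build_index(purchases, views)
history = {"purchase_indicators": {"A"}, "view_indicators": {"X"}}
print(recommend(index, history))
```

In this sketch, a user who bought `A` and viewed `X` matches item `B` on both fields, so both kinds of behavior contribute to the same ranking. In a real deployment the indicator fields would presumably live on documents in the search index, and the history would be sent as an ordinary query.
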

The combination of search with recommendation surprised even some very experienced Buzzwords audience members, including Anne Verling, who engaged in a tweet exchange about it just after the talk.

Ted’s multi-modal recommendation talk fit well with the buzz (pun intended) around the recent announcement of the inclusion of LucidWorks as part of the MapR distribution. The combination of LucidWorks Solr + Apache Mahout (which is also included with the MapR distribution) makes it easy to put the multi-modal recommendation technique into practice. These techniques are also similar to some of the log processing done in the LucidWorks Big Data product, which is also included with MapR.

The combination of Solr + MapR meets challenges many businesses face, and I predict that the excitement seen at Buzzwords foretells a wave of new technologies built on the combination of MapR, Apache Mahout, and LucidWorks Solr for Big Data.

Click here to view Ted's slides.
