Many years ago (don’t ask me how I know this!) the hamburger chain Burger King began branding itself with this slogan: “Have it your way!” It was pure marketing genius! The idea that you could order something, in this case a hamburger, at a fast food restaurant that would be tailor-made to your specific personal tastes was revolutionary, and it set them apart from their competitors. Something similar happened when Amazon.com, one of the first major online stores of the Internet era, began suggesting books (and other products) to its customers that were an amazingly good match to each individual’s personal tastes. Of course, Amazon accommodated its customers with this value-added service by invoking a scientific procedure (data science applied to customer data), not by asking customers directly (as Burger King did). Thus did Amazon deploy one of the first enterprise-scale personalized recommendation services. Again, this was marketing genius.

It is important to note that Amazon’s was not the first such service, nor did Amazon invent the first algorithm. The first modern incarnation of this creative invention (using the “wisdom of crowds” – i.e., collaborative filtering) can be traced back to the GroupLens research project at the University of Minnesota in 1992, which was implemented in the MovieLens recommender system for film viewers starting in 1997. The 1994 paper that summarized the GroupLens work at a CSCW (Computer Supported Cooperative Work) Conference has been cited nearly 4000 times, making it the most cited CSCW conference paper of all time! In July 2013, the GroupLens authors (Paul Resnick and John Riedl) re-enacted their original presentation, just 6 weeks prior to Riedl’s death.
The Netflix Prize
There have been many followers down the personalization trail blazed by Burger King, GroupLens, and Amazon, most notably Netflix, which in 2006 offered a one million dollar prize to anyone who could improve its recommendation engine’s accuracy by at least 10%. The contest drew over 44,000 entries from over 41,000 teams, representing approximately 51,000 contestants. A winner was declared soon after July 26, 2009, when the “Bellkor’s Pragmatic Chaos” team submitted an algorithm that delivered a 10.06% improvement. Another team matched their score on the test dataset, but the winning team scored best on the “hidden dataset” that Netflix used to score contestants’ entries. This latter detail provides a classic instructional example of how to avoid overfitting in a predictive analytics model, which is built against a training dataset: you find the “best solution” (the one that works best on data in general) through error measurement and verification of the algorithm against previously unseen data. “Bellkor’s Pragmatic Chaos” won the Netflix Prize on the basis of having the best performance on the hidden dataset, and their winning algorithm, presented in detail in a 90-page scientific research paper, may well have led the field in another (unofficial) category too: sheer length. The algorithm is an emphatic example of the type of algorithm that frequently wins Kaggle.com crowdsourced data science competitions: ensembles. Ensemble methods combine the predictions from large numbers of different algorithms into a single, more accurate prediction. The proof is in the prizes: ensemble learning is one of the most accurate machine learning methodologies for big data analytics problems.
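To make the ensemble idea concrete, here is a minimal sketch in Python: several weak base predictors of a user’s movie rating are blended by a weighted average. The toy ratings, the three base predictors, and the weights are all illustrative assumptions for this sketch, not the actual Netflix Prize solution (which blended hundreds of far more sophisticated models).

```python
# Toy ratings: (user, movie) -> rating on a 1-5 scale (illustrative data only).
ratings = {
    ("alice", "matrix"): 5, ("alice", "titanic"): 2,
    ("bob", "matrix"): 4, ("bob", "alien"): 5,
    ("carol", "titanic"): 4, ("carol", "alien"): 2,
}

def global_mean(user, item):
    # Base predictor 1: the mean of every observed rating (ignores user and item).
    return sum(ratings.values()) / len(ratings)

def user_mean(user, item):
    # Base predictor 2: the mean rating given by this user.
    vals = [r for (u, _), r in ratings.items() if u == user]
    return sum(vals) / len(vals) if vals else global_mean(user, item)

def item_mean(user, item):
    # Base predictor 3: the mean rating received by this item.
    vals = [r for (_, i), r in ratings.items() if i == item]
    return sum(vals) / len(vals) if vals else global_mean(user, item)

def ensemble(user, item, predictors, weights):
    # Weighted blend of the base predictors' outputs: the ensemble prediction.
    return sum(w * p(user, item) for p, w in zip(predictors, weights)) / sum(weights)

# Predict Alice's rating for a movie she has not yet rated.
pred = ensemble("alice", "alien", (global_mean, user_mean, item_mean), (1, 2, 2))
```

In real competition entries, the blending weights themselves are learned, typically by fitting them to a held-out validation set rather than hand-picking them as above.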
The Wayne Gretzky Principle
In an earlier post, we described principles of Leadership Analytics, one of which was based on the famous quote by hockey legend Wayne Gretzky: “Skate to where the puck is going to be, not to where it has been.” We emphasized in that article the relevance of this quote to successful Predictive Analytics and Recommender Systems: it is old school business reporting to tell upper management what the customer has purchased, but it is new school (“rocket science”) business analytics to predict what the customer will purchase! Predictive modeling is the new flavor of marketing genius, and the secret sauce that gives it flavor is the set of data science filtering algorithms that drive the recommender engine (which generates the recommendations). So, what is the fuel for this engine? Of course, it is the data … the customer big data!
Practical Machine Learning – Innovations in Recommendation
Recommendation systems are now seen in nearly all online stores, where products are recommended to customers on the basis of a variety of filtering algorithms: Collaborative Filtering, or CF (how similar are this customer’s tastes and interests to those of other customers who also viewed or bought this item?); and Content-based Filtering, or CBF (how similar, or semantically connected, are other products in the store’s inventory to the product that the customer is now viewing?). A new O’Reilly book, “Practical Machine Learning: Innovations in Recommendation” (by Ted Dunning [Chief Application Architect at MapR] and Ellen Friedman), explores the mechanics and implementation of recommendation engines. We previously discussed design patterns of recommender engines, including the co-occurrence matrix (of customers’ product purchases), which Dunning & Friedman describe in detail. An emerging new type of filtering algorithm for recommender engines is context-based filtering (CxBF), which addresses this question: how does this customer’s interaction with our online site change as a function of context (e.g., time of day, the browser that they are using [mobile or desktop], their geo-location, their IP address, or the weather)? Consequently, successful CF depends upon a good model of the customer’s preferences, CBF depends upon a model of the store’s content or products (e.g., a topic model), and CxBF depends upon a model of the customer’s context (e.g., situational analytics).

A remarkable example of CxBF occurred in 2004, when Walmart analyzed their customers’ purchases in Florida prior to the arrival of a major hurricane. They discovered one particular product that sold at seven times its normal daily rate. The surprising product was strawberry Pop-Tarts. The context of the purchases (i.e., the imminent arrival of a hurricane) appeared to be the strongest explanatory variable.
Walmart tested this data science hypothesis by shipping large numbers of pallets of strawberry Pop-Tarts to Florida prior to the arrival of the next hurricane(s), and customers indeed bought the stores’ entire Pop-Tart inventory. That was successful data science, successful predictive analytics, and successful knowledge discovery from data, proving once again that knowledge is power, especially in big data marketing!
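The co-occurrence matrix design pattern mentioned above can be sketched in a few lines of Python: count how often each pair of products appears together in the same customer’s basket, then recommend the items that most often co-occur with the one being viewed. The shopping baskets below are made-up illustrative data, and a production engine of the kind Dunning & Friedman describe would also downweight overly popular items (so that, say, milk does not get recommended with everything), which this toy version skips.

```python
from collections import defaultdict
from itertools import combinations

# Toy purchase histories: one set of products per customer visit (illustrative data).
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs", "beer"},
]

# Build the item-item co-occurrence matrix: for each ordered pair of products,
# count the baskets in which both appear.
cooc = defaultdict(int)
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

def recommend(item, top_n=2):
    # Recommend the products most frequently bought alongside `item`.
    scores = {b: n for (a, b), n in cooc.items() if a == item}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `recommend("milk")` on this data returns the two products most often found in the same basket as milk. In a real store the matrix would be huge and sparse, which is why it is typically built with a distributed framework rather than an in-memory dictionary.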
Data Everywhere, Personalization Everywhere
My new definition of Big Data that I am now promoting is this: “Everything, quantified and tracked!” In this brave new big data world in which everything is being measured and recorded, service providers have an opportunity to personalize nearly every customer engagement and experience. Consequently, we see the emergence of ubiquitous personalization (e.g., in personalized medicine and personalized learning). Your data stream delivers a wealth and variety of features, attributes, and characteristics that can be fed into a CF, CBF, or CxBF recommendation engine model, to let you “have it your way” more than ever!