Dataconomy

Recommender Systems – Past, Present and Future

by Kirk Borne
June 23, 2014
in Articles

Recommender systems are among the most fun and profitable applications of data science in the big data world. Training data (the historical search, browse, purchase, and feedback patterns of your customers) can be converted into golden opportunities for ROI (i.e., Return On Innovation and Investment). The predictive analytics tools of data science yield a bonanza of mechanisms to engage your customers and enrich their experience. What better loyalty program can there be than one that offers customers what they want before they ask (and sometimes even before they think of it for themselves)? Yes, we know of some cases that have gone bad (such as the secretly pregnant teen and the targeted coupons that Target sent to her father), and we recognize that there is a fine line between being intimate with your customers and being intimidating, but people usually do like to receive offers for great products that they love.

A new O’Reilly book by Ted Dunning and Ellen Friedman, “Practical Machine Learning – Innovations in Recommendation,” looks at the nuts and bolts of recommender engines: the mechanics and the implementation, the theory and the practice. The authors describe the design of a simple recommender using Apache Mahout, based upon co-occurrence analysis of customers’ product purchases.

The book is available as a free download from the MapR website, which you can find here.
These are the chapters in the book:
1. Practical Machine Learning
2. Careful Simplification
3. What I Do, Not What I Say
4. Co-Occurrence and Recommendation
5. Deploy the Recommender
6. Example: Music Recommender
7. Making it Better
8. Lessons Learned

In two separate articles, I examine recommender systems — their underlying principles, data science, history, and design patterns. Here are links to the articles, plus a short excerpt from each one:

(a) Design Patterns for Recommendation Systems – Everyone Wants a Pony
Excerpt:
We identify four different design patterns that are useful in recommender engines for predicting customer behavior in the customer experience environment (e.g., online store, browser, smartphone app, or whatever): co-occurrence matrices, vector space models, Markov models, and “everyone gets a pony (the most popular item).”
The co-occurrence matrix, described in Dunning and Friedman’s book, is the cross-matrix of all possible product pairs A and B that were co-purchased by prior customers. Analysis of non-zero elements in this matrix identifies which co-occurrences are anomalous, that is, are more frequent than you’d expect by independent occurrence of items. These anomalous co-occurrences become indicators for potential offers of product B for customers who buy product A. This approach is based upon the association rule mining algorithm (a limited form of an approach called market basket analysis).
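To make the co-occurrence idea concrete, here is a minimal sketch in Python. It is not the method from Dunning and Friedman's book or from Apache Mahout: as an assumed stand-in for a proper anomaly test, it uses a simple lift ratio (observed co-purchases divided by the number expected if the two items were bought independently). The function name `cooccurrence_lift` and the tiny basket data are hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_lift(baskets):
    """Return {(a, b): lift} for every item pair bought together at least once.

    lift = observed co-purchases / co-purchases expected under independence.
    A lift above 1 means the pair co-occurs more often than chance predicts.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate; sort so pairs have a stable order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        expected = item_counts[a] * item_counts[b] / n
        lifts[(a, b)] = together / expected
    return lifts


# Hypothetical purchase histories, one list per customer basket.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["milk", "cereal"],
    ["milk", "cereal", "bread"],
    ["jam", "butter"],
]

lifts = cooccurrence_lift(baskets)
# Keep only the pairs that co-occur more often than independence would suggest.
indicators = {pair: value for pair, value in lifts.items() if value > 1.0}
print(indicators)
```

In this toy data, butter and jam (and cereal and milk) come out as indicators, while bread and butter do not, because bread is so common that buying it alongside butter is no more frequent than chance. A production system would need a test that also accounts for how little evidence a small count provides, which a plain lift ratio does not.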

Vector space models are useful for both customer modeling and product modeling. This begins with building a feature vector, consisting of either a set of features that describe a customer (e.g., products of interest, features of interest, manufacturers of interest, purchase frequency, price range, etc.) or a set of features that describe a product (e.g., content, author/creator, theme/genre, etc.). Cosine similarity calculations are then made against these feature vectors to identify similar customers (X,Y) and similar products (A,B). In the first case, products are offered to customer X based upon the purchase history of similar customer Y. In the second case, the customer is offered product A based upon its similarity to product B that the customer has previously purchased or has recently looked at (but not purchased).
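The customer-to-customer case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-element feature vectors (counts of purchases in three made-up categories), the customer names, the book titles, and the helper `nearest_customer` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def nearest_customer(target, features):
    """Return the customer (other than target) with the most similar vector."""
    others = (name for name in features if name != target)
    return max(others, key=lambda name: cosine_similarity(features[target], features[name]))


# Hypothetical feature vectors: purchases of [sci-fi, cookbooks, travel guides].
features = {
    "X": [5, 0, 1],
    "Y": [4, 0, 2],
    "Z": [0, 6, 1],
}
# Hypothetical purchase histories.
purchases = {
    "X": {"Dune"},
    "Y": {"Dune", "Neuromancer"},
    "Z": {"Salt Fat Acid Heat"},
}

neighbor = nearest_customer("X", features)
# Offer X whatever the most similar customer bought that X has not.
offers = purchases[neighbor] - purchases["X"]
print(neighbor, offers)
```

The same `cosine_similarity` function applies unchanged to the product-to-product case: build one vector per product from its content features and compare products instead of customers.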
… …

(b) Personalization – It’s Not Just for Hamburgers Anymore
Excerpt:
Many years ago (don’t ask me how I know this!) the hamburger chain Burger King began branding themselves with this slogan: “Have it your way!” It was pure marketing genius! … … In 2006, Netflix offered a one million dollar prize for anyone who could improve upon their recommendation engine’s algorithm by at least 10%. There were over 44,000 entries in this contest, from over 41,000 teams, representing approximately 51,000 contestants. A winner was declared soon after July 26, 2009 when the “BellKor’s Pragmatic Chaos” team submitted an algorithm that delivered 10.06% improvement. Another team matched their score on the test dataset, but the winning team scored best on the “hidden dataset” that Netflix used to score contestants’ entries. This latter detail provides a classic instructional example of how to avoid overfitting in a predictive analytics model, which is built against a training dataset – you find the “best solution” (which works best on a general set of data) through error measurement and verification of the algorithm against previously unseen data. “BellKor’s Pragmatic Chaos” algorithm was the winner of the Netflix Prize on the basis of having the best performance on the hidden dataset, but they may have won in another category (though this cannot be verified) – their algorithm’s total length must have been among the leaders in that characteristic also. Their winning algorithm was presented in detail in a 90-page scientific research paper. The algorithm is an emphatic example of the type of algorithm that seems to be a frequent winner in Kaggle.com crowdsourced data science competitions – ensembles. Ensemble algorithms are in fact an amalgamation of multiple algorithms – they combine the predictions from l. … …e numbers of different algorithms. The proof is in the prizes – ensemble learning is one of the most accurate [and prize-winning] machine learning methodologies for big data analytics problems…
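The point about scoring against unseen data can be illustrated with a toy sketch in Python. Nothing here reflects the actual Netflix Prize data or the winning algorithm: the ratings, the two simple predictors (`make_memorizer`, `make_item_mean`), and the averaging `blend` are hypothetical stand-ins chosen to show the mechanics of held-out evaluation and of combining predictions.

```python
import math
from collections import defaultdict


def rmse(predict, ratings):
    """Root-mean-square error of predict(user, item) over (user, item, rating) triples."""
    return math.sqrt(sum((predict(u, i) - r) ** 2 for u, i, r in ratings) / len(ratings))


def make_memorizer(train, default=3.0):
    """Overfit model: recalls training ratings exactly, guesses `default` otherwise."""
    seen = {(u, i): r for u, i, r in train}
    return lambda u, i: seen.get((u, i), default)


def make_item_mean(train):
    """Simple model: predicts each item's mean training rating."""
    by_item = defaultdict(list)
    for _, i, r in train:
        by_item[i].append(r)
    global_mean = sum(r for _, _, r in train) / len(train)
    means = {i: sum(rs) / len(rs) for i, rs in by_item.items()}
    return lambda u, i: means.get(i, global_mean)


def blend(*models):
    """Minimal ensemble: the unweighted average of the models' predictions."""
    return lambda u, i: sum(m(u, i) for m in models) / len(models)


# Hypothetical ratings; the held-out triples are never shown to the models.
train = [
    ("ann", "A", 5), ("bob", "A", 4), ("cat", "A", 5),
    ("ann", "B", 2), ("bob", "B", 1), ("dan", "B", 2),
    ("cat", "C", 3), ("dan", "C", 4),
]
holdout = [("dan", "A", 5), ("cat", "B", 1), ("ann", "C", 4), ("bob", "C", 3)]

memorizer = make_memorizer(train)
item_mean = make_item_mean(train)
ensemble = blend(memorizer, item_mean)

for name, model in [("memorizer", memorizer), ("item mean", item_mean), ("blend", ensemble)]:
    print(f"{name:10s} train RMSE={rmse(model, train):.3f}  held-out RMSE={rmse(model, holdout):.3f}")
```

The memorizer is perfect on the training data and the worst of the three on the held-out data, which is exactly the overfitting that scoring on a hidden dataset is meant to expose. Note that the naive blend does not beat the item-mean model in this toy case: averaging is not guaranteed to help, and whether an ensemble earns its place is itself something to judge on unseen data.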

I encourage you to take a look at the new book and at the full-length versions of the above articles.


Kirk is a data scientist, top big data influencer and professor of astrophysics and computational science at George Mason University. He spent nearly 20 years supporting NASA projects, including NASA’s Hubble Space Telescope as Data Archive Project Scientist, NASA’s Astronomy Data Center, and NASA’s Space Science Data Operations Office. He has extensive experience in large scientific databases and information systems, including expertise in scientific data mining. He is currently working on the design and development of the proposed Large Synoptic Survey Telescope (LSST), for which he is contributing in the areas of science data management, informatics and statistical science research, galaxies research, and education and public outreach.


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
