In 2012, Americans watched more legally delivered video content via the Internet than on physical formats such as DVDs. This shift in format also allows online content providers to gather large amounts of data on viewer habits.

Recommendation engine

Netflix is the de facto iTunes for online video content. It collects data on every search and every rating entered into the system. Netflix user logins allow further data enrichment with verified personal information (sex, age, location), as well as preferences (viewing history, bookmarks, Facebook likes). Third-party data providers, such as Nielsen, add an extra layer of data on top.

In the same way Amazon crunches all that data to make book recommendations, Netflix makes movie recommendations. The better the recommendation engine, the more content gets watched. Having details on when and where content is viewed also means Netflix knows when and where reminders are best sent.
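To make the idea concrete, here is a minimal sketch of one simple way to turn viewing histories into recommendations: count which titles tend to be watched by the same viewers, then suggest the titles that most often accompany what a viewer has already seen. This is an illustration only, not Netflix's or Amazon's actual method; the function names, viewer names and titles are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of titles appears in the same viewer's history."""
    co = defaultdict(Counter)
    for titles in histories.values():
        for a, b in combinations(sorted(set(titles)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(user, histories, co, k=3):
    """Suggest up to k unseen titles that co-occur most with the user's history."""
    seen = set(histories[user])
    scores = Counter()
    for title in seen:
        for other, count in co[title].items():
            if other not in seen:
                scores[other] += count
    return [title for title, _ in scores.most_common(k)]


# Made-up viewing histories, for illustration only.
histories = {
    "ann": ["drama_a", "thriller_b"],
    "bob": ["drama_a", "thriller_b", "drama_c"],
    "cat": ["thriller_b", "drama_c"],
    "dan": ["drama_a"],
}

co = build_cooccurrence(histories)
suggestions = recommend("dan", histories, co)
print(suggestions)
```

A real engine would also weigh ratings, searches and demographic data of the kind described above; the sketch uses viewing history alone to keep the idea visible.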

Content creation

Having detailed knowledge of its subscribers allowed Netflix to take the next step of creating content. The data collected by Netflix indicated there was strong interest in a remake of the BBC miniseries ‘House of Cards’. These viewers also enjoyed movies starring Kevin Spacey or directed by David Fincher. Now add a $100 million commitment, and you get two seasons.

Next steps

Online delivery of video allows Netflix to log every event: every time the viewer presses play, pause or repeat. With hundreds of millions of events, Netflix could create a playbook on how to create TV shows: when to show an explosion, when to show a sex scene, or even when to place a product.
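As a rough sketch of how such an event log could be mined, the snippet below groups pause events into one-minute buckets of playback time to find the moments where viewers most often stop. The event fields (`viewer`, `action`, `position_s`), the function name and the sample data are assumptions made for illustration; the actual logging format is not public in this article.

```python
from collections import Counter


def pause_hotspots(events, bucket_seconds=60):
    """Return (bucket_index, pause_count) pairs, most-paused bucket first."""
    counts = Counter()
    for event in events:
        if event["action"] == "pause":
            counts[event["position_s"] // bucket_seconds] += 1
    return counts.most_common()


# Made-up playback events, for illustration only.
events = [
    {"viewer": 1, "action": "play", "position_s": 0},
    {"viewer": 1, "action": "pause", "position_s": 125},
    {"viewer": 2, "action": "pause", "position_s": 130},
    {"viewer": 3, "action": "repeat", "position_s": 600},
    {"viewer": 3, "action": "pause", "position_s": 610},
]

hotspots = pause_hotspots(events)
print(hotspots)
```

The same grouping could be applied to repeat events to see which scenes viewers rewatch.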

Netflix is even taking screenshots to capture characteristics such as color and scenery, and to gauge how well the audience responds to such imagery. These details could effectively govern the creative direction, down to what is shown when. However, with so many details determined by algorithms, can it really still be described as a ‘creative direction’?


Read more here.

This week, we feature the best articles on the limitations of Big Data. In particular, Tim Harford makes an excellent case for why correlation does not equal causation.

Image credit: David Erickson



