Dataconomy

Using Wikipedia Data to Predict Box Office Success

by Taha Yasseri
October 9, 2014

My colleagues and I have devised a mathematical model that can predict which films will become blockbusters and which will flop at the box office, up to a month before the movie is released.

Our model is based on an analysis of the activity on Wikipedia pages about American films released in 2009 and 2010. We examined 312 movies, taking into account the number of page views for each movie’s article, the number of human editors contributing to the article, the number of edits made and the diversity of online users. From these measures we could produce good estimates of a movie’s prospective popularity at the box office. The model’s predictions showed a high degree of correlation with the actual figures published in the Internet Movie Database (IMDb).
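As a rough illustration of how four such activity measures could feed a revenue estimate, here is a minimal sketch. It assumes a plain linear least-squares fit, which this article does not specify, and every name in it (`fit_least_squares`, `predict`, `example_features`) and every number is invented for illustration rather than taken from the study.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(features: Sequence[Sequence[float]],
                      targets: Sequence[float]) -> List[float]:
    """Return [intercept, coef_1, ..., coef_k] minimising squared error."""
    rows = [[1.0] + [float(x) for x in f] for f in features]
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], feature_row: Sequence[float]) -> float:
    """Apply the fitted intercept and coefficients to one film's measures."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], feature_row))


# Invented rows: (page views in thousands, editors, edits, user-diversity score).
example_features = [
    (120, 10, 40, 3), (300, 25, 90, 5), (80, 4, 15, 2), (500, 30, 200, 9),
    (220, 18, 60, 4), (410, 12, 150, 7), (150, 22, 35, 6),
]
# Invented first-weekend revenues in millions of dollars, one per row above.
example_revenue = [16.0, 41.5, 10.5, 70.0, 29.0, 51.0, 30.5]

coefs = fit_least_squares(example_features, example_revenue)
print(round(predict(coefs, (200, 15, 70, 5)), 1))
```

With real data, the feature rows would be measured some fixed time before release (the article mentions 30 days), and the fit would be checked on films left out of the training set.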

Figure: Actual first-weekend box office revenue in the United States against its predicted value based on Wikipedia data 30 days before the release. The green line, indicating the perfect prediction, is drawn for comparison. Each dot represents a movie from the sample and the size of the dot indicates the amount of the error in the prediction. Predictions for more successful movies are more accurate.

Our mathematical algorithm has allowed us to predict box office revenues with an overall accuracy of around 77 per cent. This level of accuracy is higher than that of the best existing predictive models applied by marketing firms (which we estimate to be around 57 per cent). For six of the 312 films, we predicted box office takings with 99 per cent accuracy, meaning the predicted value was within one per cent of the real value. Some 23 movies were predicted with 90 per cent accuracy and 70 movies with an accuracy of 70 per cent and above.
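To make the per-film figures concrete, the sketch below reads "accuracy" as one minus the relative error, which matches the statement that a 99 per cent prediction lies within one per cent of the real value. That reading is an assumption; the overall 77 per cent figure may be a different statistic and is not reproduced here. The function names and the sample numbers are hypothetical.

```python
from typing import Iterable, Tuple


def relative_accuracy(predicted: float, actual: float) -> float:
    """One minus the relative error, floored at zero (assumed definition)."""
    if actual <= 0:
        raise ValueError("actual revenue must be positive")
    return max(0.0, 1.0 - abs(predicted - actual) / actual)


def count_at_least(pairs: Iterable[Tuple[float, float]], threshold: float) -> int:
    """Count films whose prediction reaches the given accuracy threshold."""
    return sum(1 for p, a in pairs if relative_accuracy(p, a) >= threshold)


# Invented (predicted, actual) revenues in millions of dollars.
sample_pairs = [(99.5, 100.0), (92.0, 100.0), (60.0, 100.0),
                (150.0, 100.0), (300.0, 100.0)]

for threshold in (0.99, 0.90, 0.70):
    print(threshold, count_at_least(sample_pairs, threshold))
```

Flooring at zero keeps a prediction that overshoots by more than 100 per cent from producing a negative accuracy.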

The more successful the film, the more accurately we were able to predict box office takings. This is possibly due to the increased amount of online data generated by films that turn out to be successful. The model correctly forecast the commercial success of Iron Man 2, Alice in Wonderland, Toy Story 3 and Inception, but failed to accurately forecast the financial return on the less successful movies Never Let Me Go and Animal Kingdom.

These results can be of great value to marketing firms. More importantly for us, we were able to demonstrate how socially generated online data can be used to predict a lot about future human behaviour.

We have demonstrated for the first time that Wikipedia edit statistics provide us with another tool to predict social events. We studied the problem of predicting the financial success of movies and concluded that, in some aspects, forecasting based on Wikipedia outperforms forecasting based on tweets, as Wikipedia activity has a longer timescale, which enables earlier predictions.

The efficiency of the predictions might be improved by applying more sophisticated statistical methods, such as including the controversy measure of an article.


Taha Yasseri is a Big Data Research Officer at the Oxford Internet Institute. Prior to joining the Oxford Internet Institute, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread.

This research has been published in PLoS ONE: Mestyán, M., Yasseri, T., and Kertész, J. (2013) Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data. PLoS ONE 8 (8) e71226.


(Image Credit: Brett Sayer)

Tags: academia, Big Data, Box Office, oxford, oxford internet institute, predictive analytics, Wikipedia


