Dataconomy

3 reasons why data science can fail

by Dr. Isaac Roseboom
June 17, 2016
in Data Science

The rise of data science over the last decade has been driven by easy access to deep data and significant reductions in the cost of processing it. These days anyone with a credit card can set up a cloud-based data warehouse and tracking system within minutes, but achieving a return on this investment is not so straightforward. This is not to say that effective use of data science can’t be very profitable, just that a return is not guaranteed.

There are three key reasons why data science projects fail:

Table of Contents

  • 1. Solving the wrong problem
  • 2. Mismatch of problem, technology and personnel
  • 3. Data integrity

1. Solving the wrong problem

Most data science applications are about optimization, i.e. taking a product and making it better, faster and easier to use with the help of data. Ideally you could take the product as a whole and optimize for revenue, but this is not always possible: it requires taking into account all the elements that can influence revenue, and their relationships with each other, which multiplies the number of combinations that would need to be tested. This is why optimization problems are usually smaller in scale, e.g. increasing consumption via better recommendations or increasing conversions with re-targeting. However, this narrower view can lead to substantial resources being dedicated to a problem that has little impact on overall revenue.
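To make the scale of whole-product optimization concrete, here is a minimal sketch of how the number of test variants grows when every element is tested jointly rather than one at a time. The four elements with three settings each are purely illustrative assumptions, not figures from any real product.

```python
from math import factorial, prod


def joint_test_cells(levels_per_element):
    """Variants needed to test every combination of element settings together."""
    return prod(levels_per_element)


def isolated_test_cells(levels_per_element):
    """Variants needed when each element is tested on its own."""
    return sum(levels_per_element)


def orderings(n_elements):
    """Number of distinct orderings (permutations) of n elements."""
    return factorial(n_elements)


# Hypothetical product: four elements (e.g. price point, onboarding flow,
# recommendation logic, messaging), each with three candidate settings.
levels = [3, 3, 3, 3]

print("tested in isolation:", isolated_test_cells(levels))
print("tested jointly:     ", joint_test_cells(levels))
print("orderings of 4 elements:", orderings(len(levels)))
```

Each additional element multiplies the joint count by its number of settings, which is why teams usually fall back to optimizing one small piece at a time.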

Taking an example from games, product managers obsess over getting non-spenders to convert. There is a good reason for this: most F2P games only get around 2% of players to ever spend. However, the clearest route to improving this number, offering a large ‘first payment’ discount, is not a good way to improve revenue, as it naturally deflates the value of in-game content and may put off repeat spending. There are analogs to this in retail CRM systems, which have established a ‘race to the bottom’ on pricing to the point where consumers will now only spend if they think they are getting a substantial discount.


2. Mismatch of problem, technology and personnel

While the pool of qualified data scientists and engineers has swollen over recent years, the diverse and ever-growing range of technology solutions means that any single data scientist or team may only have experience with a small range of vendors. This is a problem, as each technology is very much designed to work with a particular class of data. For instance, Hadoop is a great solution for batch processing, but is not well suited to data that is drip-fed in real time. Similarly, NoSQL databases are great for cases when the data structure needs to be flexible, but will not perform as well as a relational database on large, static, structured data. Additionally, while a normalized data schema may be appropriate for traditional row-oriented databases such as PostgreSQL and for low-dimensional problems, flat, wide structures will give a performance boost in modern column-oriented databases like Amazon Redshift.
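The difference between a normalized schema and a flat, wide structure can be sketched with Python's built-in sqlite3 module. The table and column names below are hypothetical, and SQLite is itself a row-oriented store, so this only illustrates the shape of the two layouts and says nothing about performance.

```python
import sqlite3


def build_example():
    """Create a tiny normalized schema and a flattened copy of it in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Normalized layout: player attributes live in their own table.
        CREATE TABLE players (player_id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE purchases (
            purchase_id INTEGER PRIMARY KEY,
            player_id   INTEGER REFERENCES players(player_id),
            item        TEXT,
            price       REAL
        );
        INSERT INTO players VALUES (1, 'DE'), (2, 'UK');
        INSERT INTO purchases VALUES
            (10, 1, 'gems', 4.99), (11, 1, 'skin', 1.99), (12, 2, 'gems', 4.99);

        -- Flat, wide layout: the join is paid once, at load time.
        CREATE TABLE purchases_flat AS
        SELECT pu.purchase_id, pu.player_id, pl.country, pu.item, pu.price
        FROM purchases AS pu
        JOIN players AS pl ON pl.player_id = pu.player_id;
        """
    )
    return conn


def purchases_per_country(conn):
    """Aggregate over the flat table; no join is needed at query time."""
    query = (
        "SELECT country, COUNT(*) FROM purchases_flat "
        "GROUP BY country ORDER BY country"
    )
    return conn.execute(query).fetchall()


conn = build_example()
print(purchases_per_country(conn))
```

In a column-oriented store, a query like the one in `purchases_per_country` only needs to read the columns it names, which is where the flat layout tends to pay off.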

Mismatches of technology and/or personnel to the core business problems are common across the games industry and typical for small companies that simply cannot invest in a large, diverse data team. In these cases the impact on data productivity can be devastating, with latency and data complexity issues causing all but the simplest business use cases to be abandoned.

3. Data integrity

Ultimately, the best-laid plans of data scientists can be undone by the simplest of errors. Erroneous data feeds are one of the most common issues in data science projects. These are usually caused by a lack of communication with product developers and/or a lack of understanding of how the product operates. In recent times the proliferation of third-party cloud services, and the need to combine data from them, has vastly increased the opportunity for data bugs to spawn and propagate.

With time and diligence, problematic data feeds can be corrected, but this process introduces costly delays that can reduce confidence in data usage across a business.
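One way to catch erroneous feeds early is to validate each record as it arrives and set aside anything suspicious. The sketch below is a minimal illustration only: the event fields (`user_id`, `event`, `timestamp`, `amount`) and the specific checks are assumptions, and a real feed would need rules agreed with the product developers.

```python
from datetime import datetime, timezone

# Hypothetical schema: fields every event is assumed to carry.
REQUIRED_FIELDS = ("user_id", "event", "timestamp")


def validate_event(event, now=None):
    """Return a list of problems found in one event record (empty if clean).

    Timestamps are assumed to be timezone-aware datetime objects.
    """
    now = now or datetime.now(timezone.utc)
    problems = []
    for field in REQUIRED_FIELDS:
        if event.get(field) in (None, ""):
            problems.append(f"missing {field}")
    ts = event.get("timestamp")
    if isinstance(ts, datetime) and ts > now:
        problems.append("timestamp in the future")
    amount = event.get("amount")
    if amount is not None and amount < 0:
        problems.append("negative amount")
    return problems


def split_feed(events, now=None):
    """Separate clean events from rejected ones, keeping the reasons."""
    clean, rejected = [], []
    for event in events:
        problems = validate_event(event, now)
        if problems:
            rejected.append((event, problems))
        else:
            clean.append(event)
    return clean, rejected
```

Keeping the rejection reasons alongside each record gives analysts something concrete to take back to the developers who own the feed.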

More generally, it is the mismatch between simplified business goals and the way they are defined as analysis projects that causes data science projects to fail or run into problems. Often, commercial management doesn’t fully understand the process required to conduct an analysis project, and this is compounded by their invariable need to obtain broad answers quickly. Analysts need to stand strong and become better communicators and negotiators, as it is they who ultimately have to take responsibility for the scope of the projects they agree to lead. Research topics need to be broken down into their constituent parts, with gap analysis undertaken on the data, tools and human resources available, so that realistic expectations and timescales are set.

Tags: Big Data, data science, Strategy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
