Dataconomy

Get the facts straight: The 10 Most Common Statistical Blunders

by Sunil Kappal
January 27, 2017
in Data Science, Data Science 101, Understanding Big Data

Competent analysis is not only about understanding statistics, but about implementing the correct statistical approach or method. In this brief article I will showcase some common statistical blunders that we generally make and how to avoid them.

To make this information simple and consumable I have divided these errors into two parts:

  • Data Visualization Errors
  • Statistical Blunders Galore

Table of Contents

  • Data Visualization Errors
    • Pie Charts
    • Bar Graphs
    • Time Charts
    • Histograms
  • Statistical Blunders Galore
    • Biased Data
    • No Margin of Error
    • Non-Random Sample
    • Correlation is not Causation
    • Botched Numbers

Data Visualization Errors

Data visualization is a nightmare-inducing area for both the presenter and the audience. Incorrect data presentation can skew the inference and leave the interpretation at the mercy of the audience.

Pie Charts

Pie charts are often considered the best graph for showing how categorical values break down into parts of a whole. However, they can be seriously deceptive or misleading. Below are some quick points to remember when looking at a pie chart:


  • Percentages should add up to 100%
  • 3D fits better in VR consoles than in pie charts
  • Thou shalt not have ‘Other’: Beware of a slice labeled ‘Other’. If it is larger than the rest of the slices, you have a problem, because it makes the pie chart vague
  • Show the total count behind the reported categories so the reader can tell how big the pie is
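
The first and third checks above can be automated before a chart is drawn. The sketch below is only an illustration: the function name `check_pie`, the 0.5-point rounding tolerance, and the sample shares are assumptions made for this example, and "larger than the rest" is read here as "larger than every named slice".

```python
def check_pie(slices, tolerance=0.5):
    """Return a list of warnings for a pie chart given as {label: percent}."""
    warnings = []

    # Check 1: the slices should add up to 100% (within a rounding tolerance).
    total = sum(slices.values())
    if abs(total - 100.0) > tolerance:
        warnings.append(f"percentages sum to {total:g}, not 100")

    # Check 2: an 'Other' slice that beats every named slice makes the chart vague.
    other = slices.get("Other", 0.0)
    named = [value for label, value in slices.items() if label != "Other"]
    if named and other > max(named):
        warnings.append("'Other' is larger than every named slice")

    return warnings


# Hypothetical shares, made up for illustration only.
example = {"A": 30.0, "B": 25.0, "Other": 50.0}
for warning in check_pie(example):
    print(warning)
```

With the made-up shares above, both warnings fire: the slices total 105 and ‘Other’ dominates.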

Bar Graphs

Bar graphs are a great way to show categorical data as a count or percentage for each group. Points to consider when examining a bar graph:

  • Thou shalt have the right scale: a scale made very small can make the differences look bigger or more severe than they are
  • Consider the units represented by the height of each bar and what the result means in terms of those units
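
One common way a scale exaggerates a difference is a vertical axis that does not start at zero. The sketch below compares the ratio of two values with the ratio of their drawn bar heights; the function name `apparent_ratio` and the numbers are assumptions chosen for illustration.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the vertical axis starts at axis_start instead of zero."""
    if min(a, b) <= axis_start:
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: 55 is 10% larger than 50.
honest = apparent_ratio(50, 55)                    # axis starts at 0
truncated = apparent_ratio(50, 55, axis_start=45)  # axis starts at 45

print(f"axis from 0:  second bar is {honest:.2f}x the first")
print(f"axis from 45: second bar is {truncated:.2f}x the first")
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as tall.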

Time Charts

A time chart is used to show how measurable quantities change over time.

  • Thou shalt have the right scale and axis: It is good practice to check the scale on the vertical axis (usually the quantity) as well as the horizontal axis (the timeline), as results can be made to look far more dramatic by changing the scales
  • Don’t try to answer the “Why is it happening?” question using a time chart, as it only shows “What is happening”
  • Ensure that your time charts show empty spaces for the periods when no data was recorded
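
The last point can be handled before plotting by expanding the series so that every period appears, with an explicit missing value where nothing was recorded. This is a minimal sketch assuming daily readings stored in a dictionary; the helper name `with_gaps` and the sample readings are hypothetical.

```python
from datetime import date, timedelta


def with_gaps(readings, start, end):
    """Return (day, value) pairs for every day from start to end inclusive.
    Days with no recorded reading get None instead of being skipped."""
    series = []
    day = start
    while day <= end:
        series.append((day, readings.get(day)))
        day += timedelta(days=1)
    return series


# Hypothetical readings: nothing was recorded on January 3 and 4.
readings = {
    date(2017, 1, 1): 10,
    date(2017, 1, 2): 12,
    date(2017, 1, 5): 11,
}
series = with_gaps(readings, date(2017, 1, 1), date(2017, 1, 5))
for day, value in series:
    print(day.isoformat(), "no data" if value is None else value)
```

Passing the `None` entries through to the chart as gaps, rather than connecting January 2 directly to January 5, keeps the reader from assuming the quantity moved smoothly in between.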

Histograms

  • It is good practice to check the scale used for the vertical axis (frequency, relative or otherwise), especially when the results appear to be played down through the use of an inappropriate scale
  • Ensure that no intervals are omitted on the x or y axis to make the data look smaller
  • Ensure that a histogram is the right choice, as people tend to confuse histograms with bar graphs

Statistical Blunders Galore

This is probably a ‘no-nonsense zone’ where one would not want to make false assumptions or erroneous selections. Statistical errors can be costly if they are not checked carefully.

Biased Data

Bias in statistics can be described as systematically over- or underestimating the true value. Below are some of the most common sources of such errors.

  • Measurement instruments that are systematically off. For example, a scale that adds 5 pounds every time you weigh yourself.
  • Survey participants influenced by the questioning techniques
  • A sample of individuals that doesn’t represent the population of interest. For example, examining exercise habits by only visiting people in gyms will introduce a bias.
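
The scale example shows why bias differs from random noise: taking more measurements averages the noise away, but the systematic offset remains. A small sketch, with the true weight and the noise values invented for illustration:

```python
def mean_error(readings, true_value):
    """Average reading minus the true value: an estimate of the bias."""
    return sum(readings) / len(readings) - true_value


# Hypothetical setup: true weight 150 lb, small random noise that averages to zero.
true_weight = 150.0
noise = [-0.4, 0.1, 0.3, -0.2, 0.2]

fair_scale = [true_weight + e for e in noise]
biased_scale = [true_weight + 5.0 + e for e in noise]  # scale adds 5 lb every time

print(f"fair scale mean error:   {mean_error(fair_scale, true_weight):+.2f} lb")
print(f"biased scale mean error: {mean_error(biased_scale, true_weight):+.2f} lb")
```

The fair scale's errors cancel out, while the biased scale stays about 5 pounds off no matter how many readings are averaged.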

No Margin of Error

The margin of error is a great way to understand the potential sampling error: it indicates how close the result from a sample study is likely to be to the number expected from the entire population. It is a good idea to always look for this statistic, so that the audience is not left to wonder about the accuracy of the study.
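
For a survey percentage, a commonly used approximation of the margin of error is z × sqrt(p(1 − p) / n). The sketch below assumes a simple random sample, a sample proportion, the normal approximation, and a 95% confidence level (z = 1.96); none of these specifics come from the article, and the function name is illustrative.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion p_hat
    from a simple random sample of size n (z=1.96 for about 95% confidence)."""
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Hypothetical poll: 50% answered "yes" out of 1,000 respondents.
moe = margin_of_error(0.5, 1000)
print(f"50% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions the hypothetical poll comes out at roughly plus or minus 3.1 percentage points. A reported percentage without such a figure leaves the reader unable to judge how far it might be from the population value.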

Non-Random Sample

Non-random samples are biased, and their data cannot be used to represent any population beyond themselves. It is pivotal to ensure that any study is based on a random sample; if it isn’t, well, you are about to get into big trouble.

Correlation is not Causation

Besides the above statement, correlation is one statistic that has been misused more often than it has been used correctly. Below are a few reasons that make me believe so.

Correlation applies only to two numerical variables, such as weight and height, call duration and hold time, or test scores for a subject and time spent studying that subject. So, if you hear someone say, “It appears that the study pattern is correlated with gender,” you know that’s statistically incorrect. Study pattern and gender might have some level of association, but they cannot be correlated in the statistical sense.

Correlation measures the strength and direction of a linear relationship. If the correlation is weak, one can say that there is little or no linear relationship, but that doesn’t mean that no other type of relationship exists.
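
To illustrate that last point, the sketch below computes the Pearson correlation coefficient by hand for two made-up data sets: one perfectly linear, and one where y depends entirely on x through a curve. The data and the function name `pearson_r` are chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]  # straight line
curved = [x * x for x in xs]      # y is fully determined by x, but not linearly

print(f"linear: r = {pearson_r(xs, linear):.2f}")
print(f"curved: r = {pearson_r(xs, curved):.2f}")
```

The curved data set has a correlation of zero even though y is completely determined by x, which is why a weak correlation alone cannot rule out a relationship.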

Botched Numbers

One should not believe everything that comes with statistics attached. Errors appear all the time (either by design or by mistake), so check the points below to ensure that there are no botched numbers.

  • Make sure everything adds up to what it is reported to
  • “A stitch in time saves nine”: Do not hesitate to double-check the numbers and the basic calculations
  • Look at the response rate of a survey: the number of people who responded divided by the number of people surveyed
  • Question the type of statistic used to ensure it is the best fit
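
Two of these checks are simple arithmetic and can be sketched as follows. The function names and the survey numbers are hypothetical, made up for this example.

```python
def response_rate(responded, surveyed):
    """Number of people who responded divided by the number surveyed."""
    if surveyed <= 0 or not 0 <= responded <= surveyed:
        raise ValueError("need 0 <= responded <= surveyed and surveyed > 0")
    return responded / surveyed


def adds_up(parts, reported_total, tolerance=0):
    """True if the parts sum to the reported total, within a tolerance."""
    return abs(sum(parts) - reported_total) <= tolerance


# Hypothetical survey: 1,000 people contacted, 250 answered,
# and the answers are reported in three categories.
rate = response_rate(250, 1000)
consistent = adds_up([120, 80, 50], 250)

print(f"response rate: {rate:.0%}")
print("category counts match the total:", consistent)
```

A low response rate or category counts that do not match the reported total are both signals to look more closely before trusting the headline figure.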

As a consumer of information, it is your job to identify shortcomings in the data and analysis presented to you and avoid that “oops” moment. Statistics are nothing but simple calculations, yet they can be misused, whether out of ignorance or on purpose, to make a story more interesting. So, to be a certified skeptic, wear your statistics glasses.

 


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
