7 habits of highly effective data analysis

by Ray Li
May 30, 2016
in Data Science 101, Understanding Big Data

Highly effective data analysis isn’t learned overnight, but it can be learned faster. Here are 7 habits I wish someone had told me about for effectively incorporating, communicating, and investing in data analysis on an engineering team.

Table of Contents

  • 1. Value simplicity of analysis over fancy algorithms
  • 2. Value more data sources over more data
  • 3. Value familiar tools over the latest shiny tools
  • 4. Value insights and investments over indicators
  • 5. Value CUSS over trust
  • 6. Value direction over definitive
  • 7. Value how software works over how you *think* software works
  • Bonus: Architectural knowledge is your smartcut

1. Value simplicity of analysis over fancy algorithms

If you can’t explain your analysis to a 5-year-old, you’ll have a tough time selling it to others. The focus of product data analysis is not the analysis itself. Don’t get me wrong, you need the analysis, but it’s the story you tell and the recommendations you make from that data that really matter.

Using complex analysis that confuses your audience will produce the exact opposite of what you want. You want to be able to drive engineering behaviors and investments with your analysis. If your analysis is too opaque and engineers aren’t quickly wrapping their heads around the story you’re telling, your analysis has lost its value.

The ultimate measure of the impact of your data analysis is how engineering behaviors and investments change. Make it easy for others to change.


2. Value more data sources over more data

Looking at more data across a broader time frame can give you more confidence in your analysis. However, a single pipeline of telemetry or logs is limited by the features being captured. Generally, a single pipeline tells only part of the product story.

Same Analysis + Same Pipeline = Same Story

What you need is another source of data. Maybe all SQL operations are logged somewhere, or maybe you have the facility to pull a sample of logs from your users. More data sources also allow you to confirm whether your story is consistent. More data will not give you more insight. More data sources will.
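
For instance, here is a minimal sketch of that cross-check; the file names, column names, and the choice of pandas are illustrative assumptions rather than anything from a real pipeline:

```python
import pandas as pd

# Two independent pipelines for the same product (file and column names are hypothetical).
telemetry = pd.read_csv("telemetry_sample.csv")      # user_id, event_name, timestamp
server_logs = pd.read_csv("server_log_sample.csv")   # user_id, endpoint, timestamp

# Compute the same headline number, daily active users, from each source.
telemetry_dau = (telemetry.assign(day=pd.to_datetime(telemetry["timestamp"]).dt.date)
                 .groupby("day")["user_id"].nunique())
log_dau = (server_logs.assign(day=pd.to_datetime(server_logs["timestamp"]).dt.date)
           .groupby("day")["user_id"].nunique())

# If the two sources disagree wildly, the story either one tells is suspect.
comparison = pd.DataFrame({"telemetry_dau": telemetry_dau, "log_dau": log_dau})
comparison["ratio"] = comparison["telemetry_dau"] / comparison["log_dau"]
print(comparison)
```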

3. Value familiar tools over the latest shiny tools

Shiny, new tools are fun to play with and sometimes useful, but remember the ultimate measure of the impact of your data analysis?

You want to make it easy for others to change, and change is not easy. Here are 3 things from Your Brain at Work to keep in mind to give yourself the best shot at facilitating change:

  • Make it safe for your fellow engineers to change. Tell a story that everyone can quickly wrap their heads around, and tell it with familiar tools. Stay away from the latest, coolest technologies for visualization unless they are really necessary for your story.
  • Drill into the core message of your analysis.
  • Repeat the core message, then repeat it again. 🙂

Unless you’re recommending the adoption of a new tool, the focus should not be on the tools; it should be on the core message of your story.

4. Value insights and investments over indicators

Indicators are your key performance indicators (KPIs). They’ll likely come in the form of graphs, plots, or tables. Your analysis doesn’t stop there. Indicators are only the first “I” of the 3 I’s of data-driven engineering. Tell an insightful story around your data, and then recommend investments. You’re the agent of change, and your analysis must be infused with your insights and your recommendations for investments.

5. Value CUSS over trust

Data never comes clean. That’s why I frequently feel like a janitor. As a data janitor, I rarely trust all the data is there and in the right format. I always apply Kern’s CUSS acronym from Introduction to Probability and Statistics Using R to understand the data’s Center, Unusual features, Spread and Shape.

  • Center – Where is the general tendency of the data?
  • Unusual features – Are there missing data points? Outliers? Clustering?
  • Spread – What is the variability of the data?
  • Shape – If you plot the data, what is the shape of the data?

Knowing how the data is generated, together with its CUSS profile, lets you draw better-reasoned insights and investment recommendations.
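
As a rough sketch of what a CUSS pass can look like in practice (the book works in R; the pandas code, file name, and column name below are assumptions for illustration):

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("telemetry_sample.csv")          # hypothetical extract
duration = df["operation_duration_ms"].dropna()   # hypothetical numeric column

# Center: where does the data tend to sit?
print("mean:", duration.mean(), "median:", duration.median())

# Unusual features: missing values, outliers, clustering.
print("missing:", df["operation_duration_ms"].isna().sum())
print("beyond 3 standard deviations:",
      ((duration - duration.mean()).abs() > 3 * duration.std()).sum())

# Spread: how variable is the data?
print("std:", duration.std(),
      "IQR:", duration.quantile(0.75) - duration.quantile(0.25))

# Shape: plot it and look.
duration.plot.hist(bins=50, title="operation_duration_ms")
plt.show()
```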

6. Value direction over definitive

The cost of data collection is often the major hurdle standing in the way of a definitive answer to a business or engineering question. You can almost always get a partial answer that’s better than what you have now.

The author of How To Measure Anything recommends asking this question:

“Is there any measurement method at all that can reduce uncertainty enough to justify the cost of the measurement?”

Even if you don’t have the instrumentation in place to definitively answer whether a specific component is the problem, you can find a cheap way to reduce the uncertainty by eliminating a few components. Maybe you can stitch together a few different sources of data and give some very rough tallies to get things going in the right direction.

Getting yourself or your team moving in the right direction is more important than getting that super accurate, definitive answer.

7. Value how software works over how you *think* software works

The beauty of product data analysis is seeing the footprints of actual users using your software product. Sometimes you’ll get a really nice set of footprints. More likely than not, you’ll get partial impressions, making your investigation all the more difficult. Regardless, telemetry and log footprints are a reflection of reality.

Architectural knowledge is a great asset. However, the telemetry and logs represent hard evidence of what’s actually going on rather than what we believe is going on. As a product data scientist, you have a unique view of the software. You see the software as it actually is.

This is powerful, because not only do you have evidence of how the software actually works, you can also scale that insight to a broad set of users. You can make claims like “77% of our users go down this code path, which contradicts the design.” Believe in the footprints left behind by your users, but always double-check. One of my favorite quotes from The Elements of Statistical Learning is “In God we trust, all others bring data.”
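
A claim like that can come straight out of the event stream. A minimal sketch, assuming a flat event log and a hypothetical event name marking the code path in question:

```python
import pandas as pd

events = pd.read_csv("telemetry_sample.csv")   # hypothetical: user_id, event_name

# Share of users whose footprints include a code path the design says should be rare.
total_users = events["user_id"].nunique()
users_on_path = events.loc[events["event_name"] == "legacy_sync_fallback", "user_id"].nunique()

print(f"{users_on_path / total_users:.0%} of users hit the legacy fallback path")
```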

Bonus: Architectural knowledge is your smartcut

I’m contradicting myself here, but knowing how the different components in your product work together is invaluable for product data analysis.

Completely relying on your telemetry and logs to tell you about how the software works is possible, but it’s the reliable yet slow way. Take the smartcut and learn about how the code executes. Step through it with a debugger, and form a model in your mind about how the components flow and fit together.

Keep SFDIPOT in mind: structure, function, data, interfaces, platform, operations and timing.

In Smartcuts, the author makes the claim that you can learn and train yourself faster by building on platforms. Platforms are things like tools and frameworks built by others. Use a debugger tool or an architecture document to quickly get your platform in place. Then your analysis of telemetry and logs takes on completely new meaning, since you deliberately trained yourself to spot patterns of code execution.

This content originally appeared on http://rayli.net/blog/

Tags: Data, Data analysis

