Dataconomy

The White Rabbit, Cheshire Cat and Your Analytics Wonderland: How Data Curation Reminds Me of a Familiar Childhood Journey

by Stephanie McReynolds
November 3, 2016
in BI & Analytics, Contributors, Data Science

Analysis is easy. It’s rational and straightforward. Understand the relatively linear set of steps for populating an analytic model and you’re set for success. Finding insights is the hard part of the data process. It requires the non-linear skill of interpretation. Interpretation is inexact and requires judgement. It is generally hard to predict in advance and occasionally extremely frustrating.

Table of Contents

  • Going down the rabbit hole
  • Knowing your data like the Cheshire Cat
  • Enter Data Curation
  • Getting Data Governance right

Going down the rabbit hole

Successfully finding insights requires having a white rabbit to guide you down the rabbit hole of discovery. You must leave the familiar ground of key performance indicators that you trust and drill downward into the world of raw data. Faced with a new world of wonders, Alice could only interpret them through her own limited frame of reference. In some organizations, that frame of reference is provided by expert data stewards. Data stewards, where they exist, are asked to manually consolidate and document this knowledge as best practices or rules for exploration. But divorced from your process of exploration, these rules are often ignored. And even when the rules are followed, they may be only one perspective among many possibilities and not enough to deliver true insight. Like Alice, you may find yourself left to your own devices on the path to insight, unable to predict when you’ll hit your Wonderland. But when you do, it will be magical, created out of your own curiosity and the wonder of data.

This is one of the most beautiful things about analytics, but it is also the most dangerous. For curiosity has a strong link to imagination. And the enticement of imagination, which is also its downfall, is that it can set us on paths divorced from reality. There are rules that matter in discovering insights: governance policies, privacy rules, and details about the accuracy of your data. All of these count in reaching a correct conclusion. Insights are only as useful as they are accurate.

Knowing your data like the Cheshire Cat

Today, data governance rules are packed away in process diagrams, documents, wikis and technical metadata repositories that are divorced from your analytic process and hard to incorporate manually into your workflow. What inevitably results is a well-known scenario played out across boardrooms day after day. Data can lead to inaccurate conclusions, and data visualizations based on bad data can be especially problematic. A beautiful, full-color visualization is revealed, only to be picked apart by a barrage of questions from the business leaders: Where did you get this data? Who validated its accuracy? Why doesn’t what you’re showing me align with what I know about our business from my years of experience? Does your interpretation follow best practices? What are our best practices for visualizing this type of information?


In reality, your exploration of Wonderland requires a guide, but no one has ever walked the exact same path through your data as you have. It’s more likely you need a guiding spirit, someone who will appear when most necessary to keep you sane, but who will also allow you to experience the essential secret of Wonderland, maybe through your own Mad Tea Party. You need a Cheshire Cat.

In every analytics organization, there are experienced analysts ready to take on the role of the Cheshire Cat in your analytics Wonderland. They have explored the inner corners of every table, file and cube in your data repository. They’ve struggled with standardizing the semantics of different parts of the business. You might not be able to see them, but they have been there before and have privileged knowledge of the inner workings of Wonderland.

Enter Data Curation

Like the ephemeral Cheshire Cat, this knowledge needs to appear when and where you need it. Data Curation is how you can let the knowledge of these analysts materialize and dematerialize within the flow of your analysis. By revealing the breadcrumbs left explicitly or implicitly by analysts who have come before you, systems that support the curation of data establish a simple, frictionless way of sharing data knowledge throughout an organization.

Data Curation includes techniques first introduced by social curation applications, such as:

  • Lists
  • Popularity rankings
  • Relevance feeds
  • Annotations
  • Comments
  • Up-votes/Down-votes
  • Articles

Some of these require analysts to enter information manually before it can be shared. Others, like relevance feeds and popularity rankings, leverage machine learning and AI technologies to observe, parse, and automate the communication of data usage patterns to end users.
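To make the idea of an automatically derived popularity ranking concrete, here is a minimal sketch in Python. It is not how any particular data catalog works: the query log, the analyst names, the table names and the `popularity_ranking` function are all hypothetical, and a simple regular expression stands in for the SQL parsing and machine learning a real product would use. It simply counts how many distinct analysts have queried each table.

```python
import re

# Hypothetical query log of (analyst, SQL text) pairs. All names are illustrative.
QUERY_LOG = [
    ("ana", "SELECT * FROM sales.orders WHERE region = 'EU'"),
    ("ben", "SELECT o.id FROM sales.orders o JOIN sales.customers c ON o.cust_id = c.id"),
    ("ana", "SELECT count(*) FROM marketing.leads"),
    ("cho", "SELECT * FROM sales.orders"),
]

# Naive assumption: a table name is the identifier right after FROM or JOIN.
# A production system would use a proper SQL parser instead.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def popularity_ranking(log):
    """Rank tables by the number of distinct analysts who queried them."""
    users_by_table = {}
    for user, sql in log:
        for table in TABLE_PATTERN.findall(sql):
            users_by_table.setdefault(table.lower(), set()).add(user)
    ranking = [(table, len(users)) for table, users in users_by_table.items()]
    # Most widely used first; ties are broken alphabetically for a stable order.
    return sorted(ranking, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for table, analysts in popularity_ranking(QUERY_LOG):
        print(f"{table}: queried by {analysts} analyst(s)")
```

Counting distinct analysts rather than raw query volume is a design choice in this sketch: it keeps one heavy user, or one scheduled job, from making a table look more widely trusted than it is.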

Increasingly, data catalogs and their integrated data recommendation engines automate a good deal of the work of curating data. They provide just enough of an automated head start on sharing best practices to make a real difference in the distribution of data knowledge throughout an organization. And what these automated techniques start is a virtuous circle of engagement by both expert data users and casual users, who feel empowered to add to the knowledge base of information about your data.

Getting Data Governance right

To be truly effective, that data knowledge needs to be consolidated into a single source of reference rather than scattered across multiple, sometimes conflicting sources of truth, a virtual jabberwocky that can be unintelligible to the uninitiated. Traditionally, creating a data catalog that could serve as this source required heavy-handed documentation performed by expert technical writers or IT data modelers, not business users. Because their authors were skilled at documentation, these catalogs were clear sources of guidance, but they lacked the wisdom of applied actions and the speed and insights that can come from the free-form exploration of data.

Today’s data catalogs take a page from the world of social media to empower data curation with a lighter touch. Analysts who are data experts in action can share, annotate, and collaborate around data, seamlessly, in the flow of exploration. Queries, common syntax and agreed definitions are captured through social interaction. Think Pinterest for your data.
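As a rough illustration of what capturing knowledge through social interaction could look like as a data structure, the Python sketch below models a catalog entry that collects annotations and up-votes or down-votes. The class names `CatalogEntry` and `Annotation`, their fields, and the sample notes are assumptions made for this example and do not describe any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A note an analyst leaves on a data asset (hypothetical model)."""
    author: str
    text: str
    votes: dict = field(default_factory=dict)  # voter name -> +1 or -1

    def vote(self, voter, up=True):
        # One vote per person; voting again overwrites the earlier vote.
        self.votes[voter] = 1 if up else -1

    @property
    def score(self):
        return sum(self.votes.values())


@dataclass
class CatalogEntry:
    """Everything the catalog knows about one table (hypothetical model)."""
    table: str
    annotations: list = field(default_factory=list)

    def annotate(self, author, text):
        note = Annotation(author, text)
        self.annotations.append(note)
        return note

    def top_annotations(self, n=3):
        # Surface the most up-voted guidance first.
        return sorted(self.annotations, key=lambda a: a.score, reverse=True)[:n]


if __name__ == "__main__":
    entry = CatalogEntry("sales.orders")
    test_rows = entry.annotate("ana", "Rows with status = 'TEST' are not real orders.")
    revenue = entry.annotate("ben", "The revenue column is gross, before refunds.")
    revenue.vote("cho")
    revenue.vote("ana")
    test_rows.vote("cho", up=False)
    for note in entry.top_annotations():
        print(f"[{note.score:+d}] {note.author}: {note.text}")
```

The point of the sketch is that the guidance lives next to the table itself and is ranked by the people who use it, so a newcomer querying `sales.orders` would see the most trusted notes first.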

This approach to governing your data allows for a best-of-both-worlds scenario: IT gets the assurance and control it needs to ensure accuracy and compliance, while business users remain as unencumbered and free as Alice to explore new worlds of data analysis. Most importantly, this not only allows business users to play with data, it also provides the organization as a whole with the insights needed to drive business growth.

Some of the world’s most advanced data-driven organizations are embracing Data Curation as a technique to share data knowledge seamlessly across a new population of self-service data users, and take the data literacy of their organizations to a new level.

Whether through data visualization, analysis or data prep, a new world of self-service with governed access unfolds, and the data tea party is no longer a mad endeavor. The sanity of those who have come before is shared with every new Alice, and yet the Wonderland of your data is preserved.

 


Image: Jannes Pockele

Tags: Alation, data analytics, Data Curation, Stephanie McReynolds
