Dataconomy

The Building Blocks of a Data-Driven Enterprise

by Ashish Thusoo
May 15, 2015
in Data Science

These days, every business is exploring ways to use data and new technologies to gain a competitive edge. While there’s no questioning the value to be found in big data analytics, organizations have had a low success rate to date when it comes to rolling out data initiatives. A recent Capgemini study, for instance, found that only 13 percent of big data initiatives in organizations surveyed have achieved full-scale production, and only 27 percent of the executives surveyed described their big data initiatives as “successful.”

In light of this, what should organizations be thinking about as they roll out their own big data initiatives?

Table of Contents

  • Make the data accessible
  • Make the data more collaborative 
  • Plan to scale 
  • Use the right technology for the job
  • Putting it all together

Make the data accessible

It may sound obvious, but data isn’t useful if it isn’t put in the hands of teams that can make use of it. The more people who have access to data throughout an organization, the more value it can deliver.

Yet organizations still struggle with this for several reasons: the tools are too complex; each team needs different tools; the administration that big data deployments require puts too much strain on IT; and getting teams trained and up and running takes time. A common complaint from companies declaring failure in their big data projects is a “lack of talent.” To a great extent, this gap exists because it is difficult to find data scientists and analysts who can also code in the various languages needed to navigate the Hadoop ecosystem. Hadoop was built by developers for developers, but emerging cloud-based tools increasingly abstract away this technical complexity, so data teams can focus on data problems rather than being pulled into programming or infrastructure issues.


In an ideal world, organizations would have a completely self-service platform with a range of the latest tools and technologies that can be used as needed for the task at hand. Such a platform would lower the barriers to insight while making teams more productive with their data analysis.

Make the data more collaborative 

In addition to getting data into the hands of more teams, fostering collaboration around it drives greater value in three ways. First, it promotes better insights as more brains ponder the problem at hand and dig into the results from different perspectives. Second, it reduces duplicate effort, so that different people on the same or even different teams aren’t running the same queries in silos. This saves staff hours and can also save money and compute resources. Third, it can create more efficient and transparent relationships among teams, clients or partners when you’re able to share results and collaborate on analysis with them.
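To make the second point concrete, here is a minimal sketch of a shared query-result cache, so that a query already run by one analyst is not executed again by another. Everything in it is an assumption for illustration: the `SharedQueryCache` class, the `run_query` callable standing in for whatever engine a team uses, and the naive whitespace-only normalization, which would need more care in practice (it does not parse SQL and ignores data freshness).

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Row = Tuple  # hypothetical row type: one tuple per result row


class SharedQueryCache:
    """Illustrative cache shared across teams; not a real product API."""

    def __init__(self, run_query: Callable[[str], List[Row]]) -> None:
        # run_query is a stand-in for the team's actual query engine.
        self._run_query = run_query
        self._results: Dict[str, List[Row]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(sql: str) -> str:
        # Naive normalization (assumption): collapse whitespace and drop a
        # trailing semicolon. Case is preserved so string literals stay intact.
        normalized = " ".join(sql.split()).rstrip(";").strip()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def execute(self, sql: str) -> List[Row]:
        key = self._key(sql)
        if key in self._results:
            self.hits += 1  # someone already ran this query
        else:
            self.misses += 1
            self._results[key] = self._run_query(sql)
        return self._results[key]
```

In this sketch, two analysts submitting the same query with different spacing would trigger only one execution; the `hits` counter gives a rough measure of the duplicate work avoided.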

Plan to scale 

No matter how small your organization or datasets are now, they will grow—rapidly.

Your infrastructure needs to scale too, and you need to plan and invest ahead of your needs. That means adding hardware, software and staff resources to keep pace. For on-premises big data deployments, expanding the capacity and capabilities of data infrastructure can take anywhere from weeks to months, so you always need to be thinking and planning several months out.
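The planning horizon can be estimated with simple arithmetic. The sketch below assumes data volume grows at a steady compound monthly rate, which is a simplification, and compares the time remaining until storage is full with the lead time needed to add capacity. The function names and all figures are hypothetical.

```python
import math


def months_until_full(used_tb: float, capacity_tb: float,
                      monthly_growth: float) -> float:
    """Months until used storage reaches capacity, assuming compound growth.

    monthly_growth is a fraction, e.g. 0.10 for 10 percent per month.
    """
    if used_tb <= 0 or capacity_tb <= 0:
        raise ValueError("used_tb and capacity_tb must be positive")
    if used_tb >= capacity_tb:
        return 0.0  # already full
    if monthly_growth <= 0:
        return math.inf  # no growth, capacity is never reached
    # Solve used * (1 + g)^m = capacity for m.
    return math.log(capacity_tb / used_tb) / math.log(1 + monthly_growth)


def should_order_now(used_tb: float, capacity_tb: float,
                     monthly_growth: float, lead_time_months: float) -> bool:
    """True if capacity would run out before new capacity could arrive."""
    remaining = months_until_full(used_tb, capacity_tb, monthly_growth)
    return remaining <= lead_time_months


if __name__ == "__main__":
    # Hypothetical cluster: 100 TB used of 200 TB, growing 10 percent a month.
    print(months_until_full(100, 200, 0.10))
    print(should_order_now(100, 200, 0.10, lead_time_months=3))
```

The point of the comparison is that the decision to expand depends on the procurement lead time as much as on current utilization: the longer it takes to add capacity, the earlier the order has to go in.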

Use the right technology for the job

It goes without saying that organizations should use the right tools and technologies for the job. For organizations deploying on-premises big data infrastructure, this is much easier said than done.

Organizations have a wide range of tools and technologies to choose from—Hadoop, Hive, Spark, Presto, Pig, etc.—and new technologies are popping up all the time. Each technology requires specialized expertise to deploy and integrate with existing systems, and carries its own costs and admin requirements.

Yet the more tools available, the more productive your teams and the more effective their analyses will be. The challenge is weighing the breadth of the toolset against the cost, support and admin requirements of installing and maintaining it.

Putting it all together

Big data initiatives are all about gaining new insights, as quickly as possible, from all the data a company has access to, in order to propel the business forward. To do this successfully, organizations need a big data infrastructure that is flexible, accessible and scalable.

When you look at all the requirements listed throughout this article, it becomes clear that those qualities map nicely to the characteristics of the cloud. On-premises deployments take months or even years to implement and tune. Scaling becomes a challenge because you have to plan well ahead of your needs and devote resources to adding capacity. Adding new technologies can be time-consuming and often requires special expertise (adding to costs).

Big data analytics in the cloud eliminates the risks of on-premises deployments, while providing several advantages that speed the time to value of data. For organizations of all sizes looking to become truly data-driven businesses, moving their big data initiatives into the cloud is the fastest and most effective way to achieve this.

With companies like Amazon, Google and Microsoft not only devoting significant resources to growing their cloud offerings but also aggressively rolling out big data services on top of their clouds, it’s clear that the future of big data is in the cloud.


About Ashish Thusoo, CEO and co-founder of Qubole: Before co-founding Qubole, Ashish ran Facebook’s Data Infrastructure team. Under his leadership, the team built one of the largest data processing and analytics platforms in the world. This platform not only achieved the bold aim of making data accessible to analysts, engineers and data scientists, but also drove the “big data” revolution. In the process of scaling Facebook’s big data infrastructure, he helped drive the creation of a host of tools, technologies and templates that are used industry-wide today.


Image credit: Brett Jordan

Tags: Applications, Data Science Applications, enterprise, Qubole


Comments 5

  1. Robert Klein says:
    8 years ago

    Great overview, Ashish. It’s going to be a totally different world when we’re finished with this preparatory translation phase and get everything visible and connected. Would love you or your team’s input on this API. It correlates themes in unstructured streams: dev.keywordmeme.com. It’ll help to build tools for a lot of what’s mentioned here. Thanks!

  2. Andrew Paek says:
    8 years ago

    I’m pretty new to data analysis and I was wondering what the “costs and admin requirements” of data analysis tools would be, besides paying for the software, of course, if any of the above cost money.

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
