Dataconomy

Cloudera on Why Hadoop Projects Fail

by Eileen McNulty
July 23, 2014
in Conversations

Mark Lewis is the Senior Director of Marketing at Cloudera. Cloudera offers a unified platform for big data: its Enterprise Data Hub, built on Apache Hadoop. The Enterprise Data Hub allows companies to execute storage, access, management, analysis, security and search all within one framework. We recently caught up with Mark at Big Data & Analytics Day to discuss the Enterprise Data Hub, the evolving discipline of data science and why Hadoop projects fail.

Tell us about the relationship between the Enterprise Data Hub and the data warehouse.

A lot of people think they need to build a big data system as though it’s one entire entity. But it’s not; it’s a number of entities, and there’s a journey in how you get there. Many ask, “If I adopt an enterprise data hub, does that mean I don’t need my data warehouse anymore?” Absolutely not. A great analogy actually is the digital camera: it is the best at its job, it takes great pictures, you’ve got great focus, and the quality of the lenses is absolutely superb. But how many people actually take a photo with a digital camera? Quite a few. But how many people in comparison take a picture with a smartphone? A lot more.

So examine for a moment, why is that? Well, a smartphone has a lot of apps built into it. Apps that enable you to edit the data and share the data as well. That’s what an enterprise data hub is. It’s bringing the compute power to where the data resides. A data warehouse is a bit like the SLR: it’s really good at what it does, but you need to do something with the data before you can actually operate on it, which is where the ETL process comes in. A digital camera doesn’t work on its own; you need to take the data from the camera and move it to your computer and then do something with it.

In a recent LinkedIn poll of “Why Hadoop Projects Fail?”, a consistent pain point identified by respondents was “No financially compelling use case”. Do you agree?

Yes. I think what this highlights is that big data is a very recent and evolving discipline. In my talk today I asked, “How many people here are right at the beginning and have actually no idea about big data?” About 15% of the room raised their hands. When I asked how many people in the room are real experts in this field, 0% put their hand up. When I asked, “Who has an element of knowledge but is a bit confused?”, nearly all the room put their hands up. And I’m seeing that from Germany to the UK to Ireland to the US to the Middle East.

What we’ve got here is a new, evolving discipline and I think that people don’t understand it. So if people are looking to get into this, great! But be prepared to actually put the effort in to understand and build your discipline, and gain an overview of all of the different aspects. There are data scientists, there are IT architects, there are people who understand data warehousing; each of those is a discipline in its own right, and all of that coming together under the enterprise data hub is a relatively new concept.

It’s easy to understand why the data scientists just feel like everything is landing on their lap. You have the word “big data” and everyone thinks you know everything about everything. It’s a bit like—another analogy I like to use—being a mediocre violinist and you’ve played with an orchestra once or twice but you’re not that great. To your non-musical friends, you’re a genius! You know everything about the orchestra. You could conduct it, you can compose, you can write great symphonies! But you know that in the grand scheme of things you’ve got a lot to learn and I think if people are honest in big data, there are disciplines where people have a lot to learn.

There are experts out there. But the people skills as well as the technology are still evolving. It’s going to be a journey before people fully appreciate and can really use it to its full extent. That’s why companies like Cloudera are helping with training.

Another question we’re often asked is “Why are Cloudera and MongoDB (for instance) marketing themselves as partners? Aren’t they doing exactly the same thing?” Would you care to clear up the confusion about the specific uses of Hadoop and NoSQL databases, and how they work together?

Very simply put, the Hadoop structure is like an operating system. It’s not actually that exciting to look at or do much with. It is a way of just scaling, computing, and storing that data. What you then have are the applications and the modules that then allow you to put data in and extract data out.

The partner companies each have specific uses; some do ETL, some do analytics. These components come together, and each of them is specialist in their own right. But it is one system. That’s why big data means different things to different people. Someone in an analytics company will say, “We’ve been doing big data for years!” and they’re not wrong. Someone in an ETL company will say, “We’ve been doing analytics data for years,” and they’re not wrong. And somebody who’s just simply been running spreadsheets and doing a bit of analysis on a few numbers says they’ve been doing big data and they’re not completely wrong. But to scale, and to have the different volumes of data from different places is the challenge we’re facing today. That’s where the partner ecosystem comes in.

On the subject of partnerships, would you like to tell us more about the partnership between Intel and Cloudera?

It’s something both Cloudera and Intel are really excited about. The whole thing is building together an engagement that’s ultimately going to help the customer. So if you take the way that the systems are operating, 95% of the data sent today is based on Intel. We’re working with Intel on helping to improve Hadoop’s performance, and the way that it operates at an engineering level.

You can imagine that that’s going to help a lot of people out there get a lot of performance improvement across a lot of systems. And of course being an open source company—as Cloudera is—with the committers that we have, we’re going to contribute that back to the community as well. So that will benefit a lot of people who are involved in Hadoop, not just Cloudera. So it’s a very exciting partnership.



(Featured Image Credit: Cloudera)


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
