Hadoop is designed to make storing and processing large amounts of data affordable. It does this in two ways:

  • Through hardware, by allowing use of normal processing chips, and
  • Through software, with the open-source Apache licence

Open-source software essentially means Hadoop is free, or more specifically no licence fee needs to be paid. On the other hand, there are number of weaknesses that makes it less ideal for commercial use:

 

1. Integration with existing systems

Hadoop is not optimised for ease for use. Installing and integrating with existing databases might prove to be difficult, especially since there is no software support provided.

2. Administration and ease of use

Hadoop requires knowledge of MapReduce, while most data practitioners use SQL. This means significant training may be required to administer Hadoop clusters.

3. Security

Hadoop lacks the level of security functionality needed for safe enterprise deployment, especially if it concerns sensitive data.

4. Single Point of Failure (SPOF)

The original version of Hadoop only has a single node responsible for understanding where all data is located, which makes the cluster useless should that node fail. This issue has been resolved in Hadoop 2.0.

 

Pure-play Hadoop companies such as Cloudera, MapR and Hortonworks seek to make Hadoop more enterprise-ready, by developing platform that plugs these gaps, but may entail licencing fees..

In conjunction with Big Data Week, we compiled a list of the best Big Data articles here.

(Image credit: Timothy Appnel)

Previous post

Simpson's Paradox in Mobile App Monetization

Next post

DRM and China's Knockoff Economy