Data Science
While the field of data science is not tied directly to Big Data, advances in one tend to produce advances in the other. Big Data increases our ability to harvest and process data, while data science allows us to dig into it for insights.

Amazon Kinesis vs. Apache Kafka For Big Data Analysis
Data processing today is done in the form of pipelines, which include steps such as aggregation, sanitization, and filtering, and which finally generate insights by applying statistical models. Amazon Kinesis is a platform for building pipelines for streaming data at the scale of terabytes per hour. Parts of the Kinesis platform are

How to use ElasticSearch for Natural Language Processing and Text Mining — Part 2
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to brush up on the basics before getting started. This time we’ll focus on one very important type of query for Text Mining.

Boost Your Data Wrangling with R
The R language is often perceived as a language for statisticians and data scientists. Quite a long time ago, this was mostly true. However, over the years the flexibility R provides via packages has made R into a more general-purpose language. R was open sourced in 1995, and since

Keep it real — say no to algorithm porn!
For people in the know, machine learning is old hat. Even so, it’s set to become the data buzzword of the year — for a rather mundane reason. When things get complex, people expect technology to ‘automagically’ solve the problem. Whether it’s automated financial product consultation or shopping in the supermarket of

A survival guide for the coming AI revolution
If the popular media are to be believed, artificial intelligence (AI) is coming to steal your job and threaten life as we know it. If we do not prepare now, we may face a future where AI runs free and dominates humans in society. The AI revolution is indeed underway. To

Confused by data visualization? Here’s how to cope in a world of many features
The late data visionary Hans Rosling mesmerised the world with his work, contributing to a more informed society. Rosling used global health data to paint a stunning picture of how our world is a better place now than it was in the past, bringing hope through data. Now more than

Data Mining for Social Intelligence – Opinion data as a monetizable resource
The digital age is characterised increasingly by the collective. The information generated by tapping into the minds of many is driving decisions in both the public and private sector; research is becoming social. On the back of this, a new science has emerged – known as opinion mining – which

Three Mistakes that Set Data Scientists up for Failure
The rise of the data scientist continues and social media is filled with success stories – but what about those who fail? There are no cover articles about the failures of the many data scientists who don’t live up to the hype and don’t meet the needs of their stakeholders.

Big Data for Humans: The Importance of Data Visualization
Everyone has heard the old adage "garbage in, garbage out". It is a simple way of saying that machine learning is only as good as the data, algorithms, and human experience that go into it. But even the best results can be thought of as garbage if no one

How to do time series prediction using RNNs, TensorFlow and Cloud ML Engine
The Estimators API in tf.contrib.learn (see tutorial here) is a very convenient way to get started using TensorFlow. From my perspective, the really cool thing about the Estimators API is that it makes it very easy to create distributed TensorFlow models. Many of the TensorFlow samples that you