Posts In Category

Resources

Machine Learning, Resources

Undoubtedly, 2017 has been yet another hype year for machine learning (ML) and artificial intelligence (AI). As ML and AI become increasingly ubiquitous in many industries, so does the proof that advanced analytics significantly improve day-to-day operations and drive more revenue for businesses. Yes, it’s true – enterprises worldwide have

Read More
Nonlinear Regression
Data Science, Data Science 101, Resources

Linear regression is a basic tool. It works on the assumption that there exists a linear relationship between the independent and dependent variables, also known as the explanatory variables and the output. However, not all problems have such a linear relationship. In fact, many of the problems we see today are

Read More
Big Data Analysis
Big Data, Data Science, Resources

Introduction

A couple of months ago, a client of mine asked me the following question: “What is the faster data structure object in Python for Big Data analysis today?” I get questions like this one all the time. Some of them are not easy to solve at all and it

Read More
Data Science, Data Science 101, Resources

In recent years, evidence has been mounting that points to a crisis in the reproducibility of scientific research. Reviews of papers in the fields of psychology and cancer biology found that only 40% and 10% of the results, respectively, could be reproduced. Nature published the results of a survey of

Read More
Data Science, Resources

This is the continuation of the Frequency Distribution Analysis using Python Data Stack – Part 1 article. Here we’ll be analyzing real production business surveys for your review.

Application Configuration File

The configuration (config) file config.py is shown in Code Listing 3. This config file includes the general settings for

Read More
Data Science, Resources

During my years as a Consultant Data Scientist, I have received many requests from my clients to provide frequency distribution reports for their specific business data needs. These reports have been very useful in helping company management make sound business decisions quickly. In this paper I would like to

Read More
Data Science, Resources

Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to brush up on the basics before getting started. This time we’ll focus on one very important type of query for Text Mining.

Read More
Data Science, Data Science 101, Resources

The R language is often perceived as a language for statisticians and data scientists. Quite a long time ago, this was mostly true. However, over the years the flexibility R provides via packages has made R into a more general-purpose language. R was open-sourced in 1995, and since

Read More
Time Series
Data Science, Machine Learning, Resources

The Estimators API in tf.contrib.learn (see tutorial here) is a very convenient way to get started using TensorFlow. From my perspective, the really cool thing about the Estimators API is that it makes it very easy to create distributed TensorFlow models. Many of the TensorFlow samples that you

Read More
Cybersecurity
Case Studies, Cybersecurity, Resources

In today’s world of never-ending cybersecurity threats, companies must be more diligent than ever in protecting their data from external and internal threats. The question of dealing with cybersecurity incidents is not a matter of if but WHEN. How a company responds to a breach is vital – the

Read More