Posts In Category

Data Science 101

Data Science, Data Science 101, Resources

In recent years, evidence has been mounting that points to a crisis in the reproducibility of scientific research. Reviews of papers in the fields of psychology and cancer biology found that only 40% and 10% of the results, respectively, could be reproduced. Nature published the results of a survey of

Data Science, Data Science 101, Machine Learning

This article is part of a media partnership with PyData Berlin, a group helping support open-source data science libraries and tools. To learn more about this topic, please consider attending our fourth annual PyData Berlin conference on June 30-July 2, 2017. Miroslav Batchkarov and other experts will be giving talks

Data Science, Data Science 101, Resources

The R language is often perceived as a language for statisticians and data scientists. Quite a long time ago, this was mostly true. Over the years, however, the flexibility R provides via packages has turned it into a more general-purpose language. R was open-sourced in 1995, and since

Data Science, Data Science 101, Understanding Big Data

The late data visionary Hans Rosling mesmerised the world with his work, contributing to a more informed society. Rosling used global health data to paint a stunning picture of how our world is a better place now than it was in the past, bringing hope through data. Now more than

Data Science, Data Science 101

The rise of the data scientist continues, and social media is filled with success stories – but what about those who fail? There are no cover articles about the many data scientists who don’t live up to the hype and don’t meet the needs of their stakeholders.

Big Data, Data Science 101, Understanding Big Data

When analyzing big data, we often need to check properties of the dataset such as the number of items, the number of unique items, and how frequently each item occurs. Hash tables or hash sets are usually employed for this purpose. But when the dataset becomes so enormous that

Data Science, Data Science 101, Resources

Categorical data is data that takes its values from a predefined set. Recording a person’s age as “Child”, “Adult” or “Senior” instead of as a number is one example of treating age as categorical. However, before using categorical data, one must know about various

Big Data, Data Science, Data Science 101, Featured, Understanding Big Data

Data Science, Data Analytics, Data Everywhere. Jargon can be downright intimidating and seemingly impenetrable to the uninformed. While complicated vernacular is an unfortunate side effect of the similarly complicated world of machines, those involved in computers, data and a whole host of other tech-intensive sectors don’t do themselves any favors with

Data Science, Data Science 101, Understanding Big Data

As the scale of data grows across organizations, with terabytes and petabytes coming into systems every day, running ad hoc queries across the entire dataset to generate important metrics and intelligence is no longer feasible. Once the volume of data crosses a threshold, even simple questions such as what is

Data Science, Data Science 101

I recently stumbled across a research paper, “Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US”, which piqued my interest in derivative uses of data, an ongoing research interest of mine. A variety of deep learning techniques were used to draw conclusions about relationships
