Rapidly changing data management technology in the 21st century has created cost-saving, business-expanding opportunities. Large companies around the world have realized the potential of modern technology to take over complex operations more easily than a skilled worker could. There are debates about how it is pushing some traditional methods
During my years as a Consultant Data Scientist, I have received many requests from clients for frequency distribution reports tailored to their specific business data needs. These reports have helped company management make sound business decisions quickly. In this paper I would like to
R is ubiquitous in the machine learning community. Its ecosystem of more than 8,000 packages makes it the Swiss Army knife of modeling applications. Similarly, Apache Spark has rapidly become the big data platform of choice for data scientists. Its ability to perform calculations relatively quickly (due to features like in-memory
As the creation and consumption of data continue to grow among businesses of all sizes, so does the challenge of analyzing that data and turning it into actionable insights. According to IBM, 90 percent of the data in the world today has been created in the last two years, at 2.5
Data processing today is done in the form of pipelines, which include steps such as aggregation, sanitization, and filtering, and finally generate insights by applying statistical models. Amazon Kinesis is a platform for building pipelines for streaming data at the scale of terabytes per hour. Parts of the Kinesis platform are
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to brush up on the basics before getting started. This time we’ll focus on one very important type of query for Text Mining.