Uncategorized

The Week in Big Data – 18th August, 2014
This week saw several huge funding announcements in the big data sphere. Israeli cyber security startup GuardiCore raised $11 million to bolster data centre security; Everstring landed $12 million to enlist more data scientists; Nervana raised $1.1 million for their deep learning hardware venture, which remains shrouded in mystery. See

Distributed NoSQL: Riak
In the previous post, Distributed NoSQL: HBase and Accumulo, we explored two BigTable-like open source solutions. In this post we will learn about Riak, a highly available key-value store modeled after Amazon.com’s Dynamo. As we know, HBase and Accumulo provide strong consistency because each region/tablet is served by only one RegionServer/TabletServer at a time. However, this

Improving the Steam Recommender System, Emptying Your Wallet
We recently caught up with Kevin Wong, a business intelligence professional and machine learning enthusiast, to talk about his latest project: building a better recommender system for Steam. For anyone who isn’t familiar, Steam is a digital distribution platform for games, developed by Valve Corporation. It boasts over 75 million

Distributed NoSQL: HBase and Accumulo
NoSQL (Not Only SQL) databases, which depart from the relational model, are a hot topic nowadays, although the name is somewhat misleading. The data models (e.g., key-value, document, or graph) are certainly very different from the tabular relations of an RDBMS. However, these non-relational data models are actually not new. For

The Week in Big Data – August 11, 2014
This week has seen many announcements in the data science arena that will make our lives a little easier. The new CoLaborate app lets data scientists collaborate via Google Drive; Facebook is making its security better than ever; guest contributor Rick Delgado walked us through how five companies are using

Distributed NoSQL: MongoDB
We have explored several interesting distributed key-value databases, including HBase and Accumulo, Riak, and Cassandra. Although key-value pairs are very flexible, it is tedious to map them to objects in applications. In this post we will learn about MongoDB, a popular document-oriented database. MongoDB uses JSON-like documents with dynamic schemas, making the integration of data in

Choosing a NoSQL
Many people ask my opinion on different NoSQL databases, and they also want to know the benchmark numbers. I guess that the readers of this post probably have similar questions. When you start building your next cool cloud application, there are dozens of NoSQL options to choose from. It is natural

Poll: Will Analytics Change the Direction of Sports?
This week, we reported on how NHL teams are investing heavily in analytics. The Edmonton Oilers, for example, announced they are hiring a famous analytics blogger, Tyler Dellow, to consult with hockey operations. Furthermore, the New Jersey Devils, Toronto Maple Leafs, and one unnamed NHL team all

The Week in Big Data – August 4, 2014
This week, we have a special lineup of interesting big data news and articles. Alexandre Passant, the co-founder of Music & Data Geeks, looks at Rolling Stone’s 500 Greatest Songs of All Time and runs some “API-based data-science” to see what insights we can derive from this top 500. We also learnt