Databricks and Datastax have announced a partnership centring on a supported open-source integration of their Spark and Cassandra products respectively. This follows announcements of Databricks integrations with Cloudera, MapR and Alpine Data Labs.
Kelvin Chu, compute and data team lead at Ooyala, a video analytics platform that built its own integration between Spark and Cassandra, had this to say about the new partnership: “With Cassandra as the data store and Spark for data crunching, these new analytic capabilities are making the processing of large data volumes a breeze. Spark on Cassandra is giving us the power to act on things in real time, which means faster decisions and faster results.”
Cassandra will be integrated with the Spark Core Engine, meaning it can take advantage of every type of analysis the framework supports. The benefit will be in-memory analytics for use cases such as personalisation and recommendation, fraud detection and sensor-data detection. “Analytics on real-time data is important because people want to look at what the customer is looking at or buying right now, do a quick analysis against historical or location data, and offer them something different,” Martin Van Ryswyk, Datastax’s Executive VP of Engineering, stated. “This also happens in fraud scenarios when you need to stop a transaction right now, not two hours from now once you’ve seen all that data in batch mode on Hadoop or a data warehouse.”
However, not everyone is convinced that the combined power of Cassandra and Spark is something to get excited about. David Brinker, COO of ScaleOut Software, has voiced concerns about Spark’s lack of handling for real-time state changes, and Cassandra’s eventual consistency model. “Spark does not handle real-time state changes to individual data items in memory; it can only stream data and change the whole data set,” he says. “Cassandra has a similar limitation because you can’t update data in Cassandra; all you can do is delete it and create a new copy.”
Databricks and Datastax hope to have the integration ready over the summer. Only time will tell if this will prove to be a game-changing technology.