Databricks, a startup that builds software around the popular open-source project Apache Spark, announced on Monday at this year’s Spark Summit in San Francisco that it has raised $33 million in Series B funding. The announcement also included the launch of a new cloud computing service on Amazon Web Services, which aims to reduce the time and cost of setting up and analysing big data by sampling the data pipeline in the cloud. New Enterprise Associates led the funding round with participation from existing investor Andreessen Horowitz.
“Getting the full value out of their Big Data investments is still very difficult for organizations,” said Databricks’ CEO, Ion Stoica. “Clusters are difficult to set up and manage, and extracting value from your data requires you to integrate a hodgepodge of disparate tools, which are themselves hard to use. Our vision at Databricks is to dramatically simplify big data processing and free users to focus on turning data into value.”
Although Spark is currently deployable on AWS, the announcement of Databricks Cloud means that Spark will be a managed service that will be supported directly by Databricks. Data will be stored in AWS by default, but can also be stored in HDFS if customers already have Hadoop clusters running in AWS. Moreover, as GigaOM reports, “Databricks cloud can read data from, and export data to, MongoDB, MySQL and Amazon Redshift.”
While Databricks Cloud will run exclusively on AWS for now, Stoica said the company plans to explore other options such as Google Compute Engine and Microsoft Azure. Databricks Cloud is currently in beta but is expected to be ready on Amazon by this fall, starting at “a couple of hundred dollars” per user, per month.
Spark is currently the most active open-source project for big data, and Databricks’ recent funding has brought the total investment in the company to $47 million.