This week, Google released a research paper revealing a new data warehousing system, dubbed Mesa, built for the tech giant’s core business: Internet advertising. As GigaOM describes it, the new warehouse is “designed to handle needs relating to Google’s ad business (serving internal users, as well as a front-end query service for customers) but can also function as a generic data warehouse system for other use cases.”
The system can handle near real-time data and maintain performance even if an entire data center goes offline, the paper claims. As the researchers describe, “Mesa handles petabytes of data, processes millions of row updates per second, and serves billions of queries that fetch trillions of rows per day.”
Other database systems exist, from Facebook and Twitter as well as large vendors such as SAP and Teradata, but the researchers argue that these systems are better suited to bulk data loading than to fast updates. Mesa’s update cycle, on the other hand, takes minutes while processing hundreds of millions of rows, the researchers claim.
Although Vertica comes closest to matching the power of Google’s system, there is no other production system designed to manage replicated data across multiple data centers. “Furthermore, it is not clear if these systems are truly cloud enabled or elastic,” the paper continues.
Moreover, Mesa has the potential to be much more than an additional data warehouse system. An open source version, for example, as GigaOM notes, would probably be exceptionally popular. Another possibility is that Mesa will result in a new cloud service, which would help Google compete against rival cloud computing services such as Amazon Web Services and Microsoft Azure. Whether any of these projects will come to fruition remains to be seen, but as Google has shown previously with its research on the Dremel query system (which later became the basis for BigQuery), such developments are certainly conceivable.
(Image Credit: David Precious)