Last week I attended a specialized training/go-to-market program in Seattle hosted by Microsoft’s Azure team. It was a week full of clam chowder, excellent crab cakes, ridiculous discounts at the Microsoft store, and technical training on Azure, Microsoft’s massive and fully loaded penetration point into the world of cloud computing. Although late to the game (a troubling problem characteristic of this company), Microsoft has put hundreds of millions of dollars, countless man-hours, and its corporate reputation behind Azure. However, it must be admitted that Microsoft is a good target for technology-oriented derision and evil empire monikers. After all, they are the ones that released “Windows Me” and pioneered the use of independent consultants in order to reduce employee costs. And does anyone remember Clippy? Holy hell, what’s not to hate?

Azure, if I have anything to say about it. And after this deep dive into the technology behind the user interface, I have come to the conclusion that this is the beginning of the end for compute/network/storage vendors that do not dramatically change their business model going forward. However, this is a somewhat complicated story. Let’s start at the top.

Here are my three top takeaways from the training:

1)      Microsoft has a cost profile that is an order of magnitude cheaper than anything anyone can develop in-house for large databases and unstructured data. As consumption increases, price per terabyte declines. In addition, commodity hardware dramatically drives the price down; they are not using anything other than JBOD and white-box high-density servers. Given the size and scale of Microsoft’s Azure platform, it is unlikely that any company (other than another cloud provider) could replicate the cost advantages that come with buying at Microsoft’s volume. The “magic sauce” is the software and orchestration services that Microsoft layers on top of this massive compute farm; they ensure availability, resiliency, and elasticity. This in-house IP takes the place of the specialized hardware and exorbitantly expensive licensing associated with capabilities like data stream multi-pathing, replication technologies, business continuity, and other must-haves for a data center.

2)      Microsoft is working with a huge number of niche open-source players. What this translates into is an ability to take complex requirements from clients and craft a customized approach for platform integrity, data ingestion, and data manipulation. This opens the door for large and composite workload processing in the cloud. There is a high level of complexity associated with acquiring/ingesting/cleansing data in the oil/gas vertical: companies are not sure how to source the data, how to organize it in such a way that it can be aggregated, and how to ensure that the data conforms to an MDM model. Microsoft and Azure can make all of this much simpler. A uniform point in the cloud to upload data creates the opportunity for a “data lake,” an uncurated aggregation of disparate streams of data that can be tied together as needed by Hive and other map/reduce translators. Ingestion of data can be further simplified (for the client, anyway) using Storm, Sqoop, Flume, and other tools supported by Microsoft for a client-centric ingestion architecture. Cleansing can be approximated by statistical smoothing; given the size of the data sets that we will work with, this should be a reasonable (and cheaper) mechanism for cleansing data than a full-blown MDM architecture and associated data curation.

3)      The machine learning practice (real-time analytics for big data streams) is a technological priority for Microsoft. They are putting immense money and resources into both self-service big data analytics (which are admittedly still somewhat clunky and immature) and true statistical-model analysis for data streams. Microsoft has a pretty robust team of data scientists in the United States and across the globe that can construct the appropriate regression models that will be needed to identify relationships between disparate data elements coming from unrelated sources. Of course, an integration partner is still needed to act as a domain expert to capture and manage workflow, organizational dynamics, data element characterization, and other efforts. However, Microsoft has the muscle to work with these integration partners to understand how to map the data streams and data elements together in a way that will generate the highest-value knowledge for the client.
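
To make the smoothing-based cleansing idea from the second takeaway a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sensor readings are made up, and the Hampel-style rolling-median filter is one common smoothing choice picked for this example, not something the training confirmed Azure or Microsoft’s tooling uses. The function name and its parameters are hypothetical.

```python
from statistics import median


def smooth_outliers(values, window=5, threshold=3.0):
    """Replace points that deviate strongly from their local median.

    Simplified Hampel-style filter (an illustrative choice): each point is
    compared with the median of its neighbours and replaced by that median
    if it lies more than `threshold` scaled MADs away.
    """
    half = window // 2
    cleaned = list(values)
    for i, x in enumerate(values):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbourhood = values[lo:hi]
        local_median = median(neighbourhood)
        # Median absolute deviation, scaled to approximate a standard
        # deviation for normally distributed data.
        mad = median(abs(v - local_median) for v in neighbourhood)
        scale = 1.4826 * mad
        if scale > 0 and abs(x - local_median) > threshold * scale:
            cleaned[i] = local_median
    return cleaned


# Hypothetical pressure readings with one obviously bad value.
readings = [10.1, 10.3, 9.9, 250.0, 10.2, 10.0, 10.4]
cleaned = smooth_outliers(readings)
print(cleaned)
```

The appeal of this kind of approach is that it needs no curated reference data: the stream itself supplies the baseline, which is why it can be cheaper than full MDM-style curation, at the cost of occasionally smoothing away a genuine spike.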

So why do I think this means the end of the traditional data center? Simply put, there will not be a viable space for major players like NetApp, HP, or Cisco going forward. The era of corporate-owned data centers and associated compute resources is coming to a close; the advent of cloud computing is the beginning of the end for these hardware system manufacturers, at least in their current form. Yes, there is a lot of noise when it comes to security, platform interdependence, legislative mandates, and even the mechanics related to data ingestion and portability. However, these issues will resolve themselves in the next six to twelve years, and enterprise clients will find the cost model of public cloud irresistible as they evaluate their year-over-year capital expenditures. Companies will be forced to adapt, and those that lack the agility to do so will find themselves cornered into niche markets or buried in the graveyard of those corporations that they once supplanted. It will be interesting to see who makes it.



Jamal is a regular commentator on the Big Data industry. He is an executive and entrepreneur with over 15 years of experience driving strategy for Fortune 500 companies. In addition to technology strategy, his concentrations include digital oil fields, the geo-mechanics of multilateral drilling, well-site operations and completions, integrated workflows, reservoir stimulation, and extraction techniques. He has held leadership positions in Technology, Sales and Marketing, R&D, and M&A in some of the largest corporations in the world. He is currently a senior manager at Wipro where he focuses on emerging technologies.


(Image Credit: Rainer Stropek)
