The term “open source” was coined in February 1998 and has continued to gain prominence ever since. Companies are being built around Open Source Software (OSS), with business models that charge for technical support, training and enterprise feature sets for such software. Cloudera, Hortonworks and MapR are all well known in the Hadoop space for such business models.
OSS: The Hype
There is also a growing practice of technology-enabled companies open sourcing software that was developed for in-house use. Some recent examples include:
- Web publishing startup Medium has open sourced a tool, called Charted, that it built to visualize data within the company.
- eBay open sourced Kylin, a tool that the company says enables fast queries over even petabytes of data stored in Hadoop.
- LinkedIn continues to create and open source software; Kafka, a real-time data tool, is a popular example. At the time of this blog, LinkedIn announced its intention to open source yet another tool, Gobblin, which helps the company take in large volumes of data from a variety of sources so that it can be analyzed in its Hadoop-based data warehouses.
Why Open Source?
What might have set off this trend among technology-enabled companies? White papers published by Google on its distributed file system created a lot of interest in the technology community and inspired the Apache Hadoop open source project. Google continues to be able to attract the best and the brightest in Silicon Valley. Perhaps other technology-enabled companies such as LinkedIn, Facebook and Netflix took a cue and started open sourcing their software. Some of the benefits that such technology companies have been able to reap from open sourcing their software include:
- It is a great way to showcase innovation capabilities
- It is brand building in the technology and open source community
- It attracts top talent. Today’s most talented technologists want to work at a company that enables them to contribute and stay connected to the wider technical community
- Open sourcing software developed by internal teams invites critique from developers around the world. This enhances the overall quality and efficiency of the software and the team, and it can help benchmark your team's capabilities against the best and the brightest anywhere in the world
- Software developed with the intent to be open sourced tends to have cleaner and more efficient code (it is like your home being spruced up when you expect visitors)
- A growing number of contributors to the open source software benefits the company that created it. These contributors are people solving real world problems that are highly likely to be common to many companies
There are perks to developers that create or contribute to open source:
- Credibility in the tech community
- Personal branding
- It positions the developer as not just someone who can code, but someone who can code in a loose and divergent team and influence outcomes, which is an important differentiator between a good developer and a great one
- A sense of achievement and validation when your software powers companies worldwide
- A UI designer who designed the landing page of a popular web site has a natural and visible show-and-tell opportunity. A developer has little opportunity to showcase to the external world all the great code and technology behind it. Open sourcing software is a vehicle for ‘Self Realization’.
- Open sourcing software also means that the software will live on, advanced by others, regardless of the new projects the developer moves on to in the course of time
The Open Source Ecosystem
Open Source has created an ecosystem of developers, contributors, committers, OSS organizations such as the Apache Software Foundation, the Linux Foundation and the Eclipse Foundation, companies enabling OSS code sharing, and business models monetizing certain aspects of OSS.
For a developer who created OSS, keeping that work alive for years to come entails several activities:
- Promote the OSS to a credible foundation such as Apache, taking into account licensing and other library issues. This can often prove to be a laborious and time-consuming process, given the democratic nature of the review and approval process adopted by OSS foundations.
- Act as an evangelist and spread awareness of the OSS through meetups, white papers and presentations at conferences. For a developer, it is not just about a rapid increase in the install base. It is about how many installs continue to use and evolve the OSS year after year. Kafka, an OSS tool from LinkedIn, has found much success, with many companies using it.
Companies enabling code sharing and business models monetizing certain aspects of OSS are emerging as important constituents of this rapidly developing ecosystem.
- GitHub is a popular example of a source code sharing repository. Open source code repositories have become an important tool for developers to discover and collaborate on projects. Many developers have begun considering their credentials on GitHub as important as their conventional resume. Most of the team at Firebase, a company recently acquired by Google, have a link to their GitHub account on their company profile
- Confluent, a startup funded by LinkedIn, was recently formed to commercialize Kafka. The startup expects to develop proprietary tools to complement Kafka, while keeping Kafka open source. This is a growing monetization model, and there are other examples of companies monetizing open source software by building proprietary tools and offering support (DataStax was built around Cassandra OSS)
OSS in many ways is a self-regulating and self-propagating ecosystem:
- Governance models for OSS vary across a spectrum from ‘benevolent dictators’ to ‘meritocratic governance’, in which participants gain influence over a project in recognition of their contribution. OSS projects do seem to find an optimal governance model, over a period of time, to function.
- The freedom to fork your own version of the OSS and innovate from there is a highly desirable and built-in feature of the OSS world
- Open dialogue in OSS and tech forums reveals the good, the bad and the ugly of a piece of software, leading to improvement
- It is developed by true practitioners solving business problems through technology
- It creates new business and monetization models
Limitations of OSS
The OSS ecosystem is not without its limitations:
- Companies creating OSS seem to be concentrated in tech hubs like Silicon Valley. This is rapidly changing as we see contributions from all over the world.
- Some companies are known to harvest code from vast fields of open source software while obscuring their own code donations and distancing themselves from the wider world of computing
- Openness and contribution by companies adopting OSS are entirely voluntary. It is difficult to fully know the install base of an open source tool.
- As OSS projects become large and popular, code review and committing can become onerous. Companies behind commercial distributions of such OSS can approach this process with vested interests.
- For many enterprise companies that have traditional IT shops, there are barriers to OSS adoption:
  - The ‘no commercial support model’ in OSS. This holds in many cases, although it is changing (examples: Hadoop distributors, DataStax for NoSQL, Databricks for Apache Spark)
  - Lack of expertise to evaluate a variety of OSS options for a given business problem. Contrast this with an enterprise software vendor who will show use cases, do a PoC with you, give you references and put a solution architect at your disposal (for a cost, of course)
Open Source Software in the Future
Going by the growth in contributors, the number of companies creating and contributing software to the OSS community, and the increasing number of OSS repositories on GitHub, OSS is here to stay.
The concept of open source is growing in areas beyond software. The Open Compute Project Foundation, started by Facebook, aims to collectively develop the most efficient computing infrastructure possible at the lowest cost and with the widest distribution. Designs that emerge from this project are freely available for everyone to improve upon and contribute back to the project.
The notion that ‘free’ is not valuable no longer appears valid. Once software becomes free, something else gets built using that software and eventually becomes more valuable. Businesses and monetization models do arise from this. OSS, along with the cloud, is enabling people with big ideas and small budgets to innovate, collaborate and thrive without spending a lot of money. The OSS ecosystem is indeed very powerful.
Opinions expressed by the author are personal in nature. He acknowledges inputs from Kishore Gopalakrishna, LinkedIn.
Sendil Thangavelu is corporate director for analytics and data management at Flextronics. He has led analytics, data management and enterprise application initiatives spanning strategy, investment planning, architecture, delivery, and customer relationships for companies including PG&E and Toyota. Thangavelu also led consulting engagements for VMware, Cisco, Intel, Nissan, and a few startups in the Bay Area. He has a bachelor’s degree and an MBA. He is an experienced practitioner, thinker, blogger and speaker in the data arena and has taught business intelligence and data management courses at UC San Diego.
(Image credit: Thomas Galvez, via Flickr)