The Open Source Tradeoff
As developers, we have had access to high-quality programming languages and compilers for decades, and the tools to write clean, reusable code have been with us just as long. But the most important advancement of the last few years has been the acceptance of open source in our organizations.
In the past, when an organization built a system, it started from scratch. Apart from a few modules or libraries previously built in-house, or a collection of code carried over by an individual developer, the new system had to be built from the ground up. Every single part of the system, every need, had to be implemented all over again.
It is no wonder organizations cared deeply about their code: it was valuable intellectual property that gave them an edge over the competition. But that approach forced developers to reinvent the wheel time after time, and the resulting knowledge was fragile, easily lost to developer attrition or to the evolution of the tools and languages we used.
That is where the open-source movement comes in. The idea is to pool all of our knowledge, all of our intellectual capacity, into a common repository that is easily and freely shared. The needs that recur in every new system can then be met with a previously developed and tested solution.
The effect on our productivity is enormous.
We have access to thousands of readily available frameworks and libraries, from the simplest utilities to complete scaffolding for entire systems. We can do more, and faster, than ever before. And we can do it more securely: because those libraries are used by thousands of different organizations, they are tested both rigorously and in practice by all of us, making the detection and remediation of errors and vulnerabilities a shared task.
In short, open source enables the community to pool more resources into development and testing than any single organization could ever muster.
But it has a tradeoff, as everything does. We are trading that improvement in productivity for a loss of control, potentially making our systems more vulnerable. We do not fully control the libraries we use, and we should plan and adapt accordingly.
Let us give an example.
On March 23, 2016, a developer, entirely within his rights as the author, unpublished many packages he had chosen to make available on NPM, the most popular JavaScript package manager. One of those packages, left-pad, consisting of 11 lines of code doing a simple task, was used by thousands of other packages and frameworks, including some of the most widely used ones in the JavaScript community. Unable to find the left-pad dependency, those packages and frameworks became unusable, halting the development efforts of many organizations for hours or days.
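To appreciate how small the failure point was, here is an illustrative sketch of what a left-pad utility does (this is not the original left-pad source, just a minimal equivalent): it pads a string on the left with a fill character until it reaches a desired length.

```javascript
// Sketch of a left-pad utility: pad `str` on the left with `ch`
// (default: a space) until its length is at least `len`.
function leftPad(str, len, ch) {
  str = String(str);
  ch = ch !== undefined ? String(ch) : ' ';
  while (str.length < len) {
    str = ch + str;
  }
  return str;
}

console.log(leftPad('5', 3, '0')); // '005'
console.log(leftPad(7, 4));        // '   7'
```

A function this trivial was nonetheless a direct or transitive dependency of thousands of projects, which is exactly why its disappearance had such an outsized effect.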
What happened recently with the Log4j vulnerability is another instance of the same risk. Logging the behavior of our systems is an essential part of any well-made solution. It is a need we all share, and one that looks very similar for all of us. Using a well-understood and well-tested library like Apache Log4j is far more productive than building a new solution from scratch every single time, so it is used in almost every Java system there is. And when a serious vulnerability is found in it, every one of those systems is affected at the same time.
But let’s be very clear here: abandoning open-source libraries like left-pad or Log4j is not the solution. The productivity improvements they enable are vital for the software-centric times we live in. It is because of the open-source community that those libraries were developed, used, and tested in the first place. And it was because of their open nature that the Log4j vulnerability was found and remediated quickly.