Practical Strategies for Securing Open Source Code

When you think of open source, your mind may initially go to companies that prominently specialize in open source software, like Red Hat (now owned by IBM) or Canonical.

Yet the fact is that today, almost every company is an open source company, at least in part. As of 2018, more than half of organizations surveyed by the Linux Foundation contributed internally developed code to the open source community or planned to do so. And a full 93 percent of companies use open source software themselves. That means that, if you’re considering open sourcing some of your business’s internal code, you’re in good company.

Contributing code to the open source community is also a way to build your business’s brand, attract developer talent to your team and influence the open source projects that affect your ecosystem. For example, you can contribute code that helps third-party tools integrate with your own platforms.

Yet, while open sourcing software offers many benefits, it also presents challenges, especially in the realm of security. When you expose your internal codebases to the public, you inherently increase the potential attack surface of your organization.

In order to open source your code responsibly, then, it’s critical to plan ahead to mitigate the security, privacy and reputational risks that may arise when your business shares its source code with the world. To help meet that challenge, this article walks through several practical strategies for securing open source code that businesses share as part of an open source community program.

#1. Protect Your Digital Credentials

Source code is often rife with secrets, meaning any type of digital credential that either humans or applications use to access systems or data. Secrets include not just passwords, but also encryption keys, API keys, access tokens for services like GitHub and so on.

Modern applications tend to rely heavily on secrets in order to manage tasks such as authentication between different microservices or integrations between an application and a third-party resource, like a public cloud service. At the same time, developers or DevOps teams sometimes bake secrets into CI/CD configuration data to enable integrations between tools within the software delivery process.

For these reasons, keeping secrets secure is a challenge in any context. But it can be especially difficult when your code is open source and available to the public at large. In that case, any sensitive secrets that lurk within your code are exposed to third parties, who could abuse them to gain unauthorized access to your development tools, databases, servers and so on.

The lesson here is that, when you open source code, it’s critical to ensure that you scan it to identify any secrets within it, then remove those secrets before you share the code publicly. And remember that it’s not just application code that you need to check, but also any configuration data, databases or other content that you share alongside your source code, and which could also house sensitive secrets.
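As a rough illustration of what such a scan does, here is a minimal Python sketch that checks each line of a text against a few regular expressions. The three patterns, the rule names and the sample values are assumptions chosen for this example; dedicated scanning tools use far larger rule sets and also inspect Git history.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret|api_key)\b\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for lines that match a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Placeholder values, not real credentials.
sample = (
    'db_host = "localhost"\n'
    'password = "hunter2-example"\n'
    'key = "AKIAIOSFODNN7EXAMPLE"\n'
)
print(find_secrets(sample))
```

A check like this produces false positives and misses secrets it has no pattern for, so treat it as a starting point rather than a guarantee.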

You should also take steps to prevent secrets sprawl, which happens when you store secrets across multiple systems. That practice makes it difficult to monitor and manage secrets centrally, and increases the risk of secret leaks. You can mitigate this risk — and avoid the poor practice of inserting secrets directly in application or configuration code — by storing your secrets in a centralized secrets manager.
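One common way to keep secrets out of code is to have the secrets manager supply them to the application at runtime, for instance as environment variables. The sketch below assumes that setup; the `get_secret` helper, the `MissingSecretError` class and the `DB_PASSWORD` name are hypothetical and not part of any particular product.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name, environ=None):
    """Look up a secret by name, failing loudly instead of using a default."""
    # Assumption: a secrets manager injects values as environment variables.
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


# Demonstration with an explicit mapping standing in for the real environment.
print(get_secret("DB_PASSWORD", {"DB_PASSWORD": "example-value"}))
```

Because the code holds only the name of the secret, publishing it reveals nothing about the value.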

#2. Safeguard Against Data Leaks

Beyond secrets, a variety of other types of sensitive data can hide within source code or configuration files. An application that you develop internally, then later decide to open source, might contain information about your customers that you are not authorized to share publicly. Comments that developers leave on source code could include details about your internal system configurations that would help attackers breach your network. Your team might have hard-coded internal IP addresses, hostnames and sensitive security information into source code.

When you open source your code, you must ensure that it doesn’t contain sensitive data like this. Otherwise, any sensitive information stored in your code becomes open to the world at large.

Here again, scanning the code for sensitive information is a first step to prevent data leaks, but you can go further. Establishing internal governance policies that define which types of information your developers should and should not include within code can help prevent the insertion of sensitive data in the first place. Assessing your data security posture by taking stock of which types of data your business owns and which ones could cause the most harm if shared publicly will also help you establish stronger data security practices across all of your codebases, including but not limited to those that you open source.
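To make the scanning step concrete, the following Python sketch looks for internal IPv4 addresses in comments or code. It is a minimal example under simple assumptions: it handles IPv4 only, and it relies on the standard library's `is_private` flag, which covers loopback and some other reserved ranges in addition to the usual internal networks.

```python
import ipaddress
import re

# Four dot-separated groups of digits; validity is checked separately below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_private_ips(text):
    """Return (line_number, address) pairs for non-public IPv4 addresses."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                address = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # looks like an address but is not one, e.g. 999.1.1.1
            if address.is_private:
                found.append((line_number, candidate))
    return found


sample = (
    "# staging db lives at 10.20.30.40\n"
    'DNS = "8.8.8.8"\n'
    'VERSION = "1.2.3"\n'
)
print(find_private_ips(sample))
```

The same approach extends to internal hostnames or customer identifiers, as long as your governance policy spells out what counts as sensitive.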

#3. Lock Down Open Source Libraries

As we noted above, 93 percent of organizations use open source in some form or another. In many cases, that means relying on open source libraries, which a company’s developers import into their own internal projects.

There are good reasons to take advantage of open source libraries. Above all, doing so saves developers the time and effort of having to “reinvent the wheel” by writing their own libraries from scratch when an equivalent open source library already exists.

But there are also security risks associated with third-party libraries. It’s impossible to guarantee that the developers of the libraries adhere to the same security standards as your own developers. Many open source libraries are maintained by small teams of volunteer programmers, which makes them particularly vulnerable to coding flaws or exploits that the developers don’t notice. Even widely used open source libraries can contain critical security issues, as the Heartbleed bug showed: it resulted from a flaw in the OpenSSL library that millions of websites use to encrypt their traffic.

This isn’t to say you should avoid open source libraries. But you should use them responsibly. Establish a Software Bill of Materials (SBOM) that details which third-party libraries, and which versions of them, you’ve incorporated into your own code, and track vulnerability databases so you’ll know when a security flaw is discovered in one of your libraries.
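The comparison between a bill of materials and a list of advisories can be sketched in a few lines of Python. Everything in this example is hypothetical: the package names, the versions and the advisory format are invented for illustration, and the version parser assumes purely numeric dotted versions, which many real projects do not follow.

```python
def parse_version(version):
    """Turn '1.0.1' into (1, 0, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical inventory: package name -> version in use.
BILL_OF_MATERIALS = {"examplelib": "2.3.1", "othertool": "1.4.0"}

# Hypothetical advisories: package name -> first version that contains the fix.
ADVISORIES = {"examplelib": "2.4.0", "unusedpkg": "9.0.0"}


def vulnerable_dependencies(bom, advisories):
    """Return (name, installed, fixed_in) for packages older than the fix."""
    flagged = []
    for name, installed in sorted(bom.items()):
        fixed_in = advisories.get(name)
        if fixed_in and parse_version(installed) < parse_version(fixed_in):
            flagged.append((name, installed, fixed_in))
    return flagged


print(vulnerable_dependencies(BILL_OF_MATERIALS, ADVISORIES))
```

In practice you would generate the inventory from your package manager's lock files and pull advisories from a maintained vulnerability database rather than keeping either by hand.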

These practices will help you secure any code you maintain that depends on open source libraries, whether it’s code that you share through an open source program or code you use purely for internal purposes.

#4. Secure Your Git Repositories

Git is an excellent tool, and there is nothing inherently insecure about it. However, it can be very easy to make mistakes when using Git — like creating insecure directories on self-hosted Git servers or storing sensitive data in Git repositories — that leave your code at risk. And when you do things like this, Git itself will do nothing to alert you to the problem or protect you from yourself.

Because creating a public Git repository is probably the most obvious way to share open source code with the world, it’s essential to avoid Git security mistakes when you decide to open source internal code. It’s bad enough to have insecure internal Git servers. It’s much worse when anyone with an Internet connection can abuse your Git repositories.

You can mitigate these risks by establishing policies that enforce best practices surrounding Git security. Identify practices that are disallowed within your organization, like storing secrets in a Git repository (instead, keep them in a secure secrets manager). At the same time, scanning your repositories for sensitive information or insecure configurations will go far to ensure that you don’t accidentally invite a security breach when you publish your code through a public Git repo.
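A policy like this is easier to enforce when a script checks it. The Python sketch below flags tracked files whose names suggest they hold credentials or data. The list of filename patterns is an assumption made for this example and should be adapted to your own policy.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative filename patterns (assumptions, not an exhaustive list).
RISKY_NAME_PATTERNS = [
    ".env",
    "*.pem",
    "*.key",
    "id_rsa",
    "id_ed25519",
    "*.sqlite",
    "credentials.json",
]


def risky_paths(paths):
    """Return the paths whose file name matches a risky pattern."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in RISKY_NAME_PATTERNS):
            flagged.append(path)
    return flagged


tracked = ["src/app.py", "deploy/.env", "certs/server.pem", "README.md"]
print(risky_paths(tracked))
```

You could feed the function the output of `git ls-files` before publishing a repository. Note that it only sees the current tree: a file that was committed and later deleted still lives in the repository's history.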

#5. Prevent Code Injections

If you release code to the public that is vulnerable to attacks like code injection, you risk establishing a reputation for your business as one that writes insecure code, which is exactly the opposite of the goodwill you probably hope to gain by open sourcing your code.

The best way to prevent code injection vulnerabilities is to follow secure coding best practices, such as never treating untrusted input as executable code. But even skilled programmers sometimes overlook flaws that enable injections, which is why scanning your code for these risks is equally important. Your goal should be to find injection vulnerabilities before the public does.
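SQL injection is one common form of injection, and it makes a convenient illustration. The Python sketch below uses the standard library's `sqlite3` module with an in-memory database; the `users` table and both helper functions are hypothetical. It contrasts a query built by string formatting with a parameterized query.

```python
import sqlite3


def find_user_unsafe(connection, username):
    # Vulnerable: user input is spliced into the SQL text itself.
    query = f"SELECT name FROM users WHERE name = '{username}'"
    return connection.execute(query).fetchall()


def find_user_safe(connection, username):
    # Parameterized: the value is passed separately from the SQL text.
    query = "SELECT name FROM users WHERE name = ?"
    return connection.execute(query, (username,)).fetchall()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (name TEXT)")
connection.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
print(len(find_user_unsafe(connection, payload)))  # the payload matches every row
print(len(find_user_safe(connection, payload)))    # the payload matches nothing
```

The unsafe version lets the input rewrite the query's logic, while the parameterized version treats the same input as an ordinary string to compare against.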

#6. Use SSH Keys

Authenticating with passwords and usernames is convenient. But it’s hardly the most secure means of accessing sensitive information. When you are dealing with high-risk content like source code, you can gain an added layer of protection by relying on SSH keys for authentication rather than authenticating with passwords and usernames over HTTPS.

Yes, setting up the SSH keys takes a little bit of time, and you’ll have to find a way to manage them efficiently (here again, secrets managers can help). But it’s a lot less work in the long run than dealing with security incidents that stem from compromised usernames and passwords.
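For an existing clone, part of that setup is pointing the repository at an SSH remote instead of an HTTPS one. The following Python sketch rewrites an HTTPS remote URL into the SSH form. It assumes the hosting service accepts the `git@host:path` convention, as GitHub does; the `https_to_ssh` helper and the example repository are hypothetical.

```python
from urllib.parse import urlsplit


def https_to_ssh(remote_url):
    """Convert an HTTPS Git remote URL into the git@host:path SSH form."""
    parts = urlsplit(remote_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"not an HTTPS remote: {remote_url!r}")
    # Any username or password embedded in the URL is deliberately dropped.
    path = parts.path.lstrip("/")
    return f"git@{parts.hostname}:{path}"


print(https_to_ssh("https://github.com/example-org/example-repo.git"))
```

The resulting URL can then be applied with `git remote set-url origin <url>`.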

#7. Automate Code Security

Governance policies, risk assessments and best practices go a long way toward helping your team avoid making critical security mistakes when writing and sharing source code.

But we’re all human, even developers who may not seem fully human at times, and we make mistakes. That’s why automatically scanning source code and configuration files for secrets, sensitive information, security vulnerabilities, insecure dependencies and other risks is an essential secondary layer of defense against security risks.
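Automation of this kind usually comes down to running a set of checks on every change and failing the build when any of them reports a finding. Here is a minimal Python sketch of such a gate. The two checks are hypothetical stand-ins; a real pipeline would call proper scanning tools instead.

```python
def run_checks(checks, text):
    """Run every named check; return 0 if all are clean, 1 otherwise."""
    exit_code = 0
    for name, check in checks.items():
        for finding in check(text):
            print(f"[{name}] {finding}")
            exit_code = 1
    return exit_code


# Hypothetical stand-in checks for this sketch.
def private_key_check(text):
    return ["private key block"] if "PRIVATE KEY-----" in text else []


def internal_host_check(text):
    return [word for word in text.split() if word.endswith(".internal")]


CHECKS = {"private-key": private_key_check, "internal-host": internal_host_check}

print(run_checks(CHECKS, "connect to db01.corp.internal on startup"))
```

Passing the return value to `sys.exit` in a CI job makes the pipeline stop before code with findings is merged or published.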

Automated security helps ensure that all of the code your business depends on is as secure as possible. It mitigates security risks in your applications, while also protecting your business’s reputation against the potential harm it could suffer if you open source code that contains security issues.

Final Thoughts

Open sourcing code can yield significant rewards for your business. But it’s also a major undertaking, especially when you factor in all of the security risks involved. By following the steps described above, you can make the code you share with the world as secure as possible and, in turn, ensure that your open source contributions bolster your brand.