How Fuzzing Can Make A Large Open Source Project More Secure
Emily Ratliff of the Linux Foundation explains what to consider when planning to fuzz your open source project
One of the best practices for secure development is dynamic analysis. Among such techniques, fuzzing has been highly popular since its invention and a multitude of fuzzing tools of varying sophistication have been developed.
What is fuzzing? Fuzzing involves feeding invalid, unexpected or random data into software to identify flaws. It is often automated or semi-automated, so it can be invaluable in finding the causes of crashes, memory leaks or security problems.
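At its simplest, a fuzzer is just a loop that generates random inputs and watches for misbehaviour. The sketch below illustrates the idea in Python; parse_config is a hypothetical stand-in for the code under test, with a couple of planted flaws so the loop has something to find.

    import random

    def parse_config(data: bytes) -> None:
        # Hypothetical target with planted flaws for demonstration:
        # it crashes on invalid UTF-8 and on non-numeric values.
        for line in data.decode("utf-8").splitlines():
            key, _, value = line.partition("=")
            int(value)

    def fuzz(target, iterations: int = 10_000, max_len: int = 256):
        rng = random.Random(0)  # fixed seed so findings are reproducible
        findings = []
        for _ in range(iterations):
            data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
            try:
                target(data)
            except Exception as exc:  # any uncaught exception is a finding
                findings.append((data, exc))
        return findings

    print(f"{len(fuzz(parse_config))} crashing inputs found")

Real-world fuzzers such as AFL and libFuzzer add coverage feedback and input mutation on top of this basic loop, but the core idea is the same.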
When fuzzing a single application, it can be quite straightforward to spot and rectify flaws. But what if a developer is leading a large project that does not lend itself to being fuzzed easily? In that case, dynamic analysis must be approached in a far more considered fashion.
Deciding the goals
Deciding on the goals is a fundamental first step. Fuzzing might be used to screen only for security issues, or for all types of correctness issues. One of fuzzing’s strengths is that it detects many low severity issues which may never be encountered in normal use; these can look exactly like security vulnerabilities, with the only difference being that no trust boundary is actually crossed.
For example, fuzzing a tool that only ever expects its input to come from the output of a trusted tool may produce crashes that would never be encountered in normal usage. Are there other ways to get corrupt input into the tool? If so, there is a security vulnerability. If not, it may be a ‘low priority’ correctness issue that is never resolved, unless the goals clearly state that all issues will be addressed, rather than only those relating to security. Setting expectations for dynamic analysis up front can save a lot of time and frustration.
Understanding trust boundaries
Documenting and understanding exactly where error checking should occur is vital to the efficiency of the process. The most seasoned security professionals find it easy to create a mental model of strong security in which every function defensively checks every input. In the real world, however, it is more complex than that: this kind of hypervigilance is hugely wasteful and therefore never survives in production.
By putting in the extra work to establish a correct mental model of the security boundaries for the project, the analysis team will better understand where in the program’s control flow checks should be made and where they can be omitted, ensuring a more efficient use of the available time.
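As a rough illustration of the distinction, the sketch below validates untrusted input fully at the trust boundary and relies on cheap assertions inside it. load_record and render_record are hypothetical functions invented for this example.

    def load_record(raw: bytes) -> dict:
        # Trust boundary: raw bytes may come from anywhere,
        # so every field is checked defensively.
        if len(raw) < 4:
            raise ValueError("record too short")
        length = int.from_bytes(raw[:4], "big")
        if length != len(raw) - 4:
            raise ValueError("length field does not match payload size")
        return {"length": length, "payload": raw[4:]}

    def render_record(record: dict) -> str:
        # Inside the boundary: callers pass only validated records,
        # so an assertion (enabled during fuzzing) is enough.
        assert record["length"] == len(record["payload"])
        return record["payload"].hex()

    print(render_record(load_record(b"\x00\x00\x00\x03abc")))

A crash found beyond the boundary then points at missing validation in the boundary layer, not at the internal function.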
Segment the project based on interface
Different ‘fuzzers’ have different specialities. Therefore, the project should be segmented into areas for the different types of fuzzers based on the interface – file, network, API.
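For instance, a file-interface harness can be as simple as writing each generated input to a temporary file and running the target on it, while the network and API segments would each need a fuzzer that speaks to that interface. The sketch below assumes a hypothetical command-line binary called ./mytool that takes a file path.

    import random
    import subprocess
    import tempfile

    def run_file_case(data: bytes) -> int:
        # File-interface harness: hand the input to the target via a file.
        with tempfile.NamedTemporaryFile(suffix=".input") as f:
            f.write(data)
            f.flush()
            result = subprocess.run(["./mytool", f.name], capture_output=True)
        # On POSIX, a negative return code means the process died on a signal.
        return result.returncode

    rng = random.Random(1)
    for _ in range(100):
        case = bytes(rng.randrange(256) for _ in range(rng.randrange(1024)))
        if run_file_case(case) < 0:
            print("crash:", case[:16].hex())

In practice, a coverage-guided tool such as AFL can take over this role for file interfaces, while protocol-aware fuzzers are better suited to network interfaces.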
Explore existing tools
New fuzzing tools are being developed all the time and the existing tools are being updated with new capabilities. By staying up to date with these developments, it is possible to unearth new efficiency gains that can make a fuzzing process run that bit smoother – even if it is only helping within a subset of the project.
A situation may arise in which the requirement is to perform dynamic analysis on a large, mixed language project that does not lend itself to existing tools. In this instance, it is worth considering writing a fuzzer specific to the project’s APIs, generating random inputs based on them and adding plenty of assertions that are enabled at least during fuzzing. Creating a fuzzer is relatively straightforward if it is clear what the API is and how it can be introspected: take a random number generator, set up an isolated container or VM and go.
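As a sketch of that approach, the loop below drives an API with random inputs and asserts round-trip invariants. Python’s zlib module stands in for the project’s own API, and the expansion bound is an assumption made for this example rather than a documented guarantee.

    import random
    import zlib

    def fuzz_api(iterations: int = 10_000) -> None:
        rng = random.Random(2)
        for _ in range(iterations):
            data = bytes(rng.randrange(256) for _ in range(rng.randrange(4096)))
            level = rng.randrange(10)  # random value for an API parameter
            compressed = zlib.compress(data, level)
            # Invariant: decompressing what was compressed yields the original.
            assert zlib.decompress(compressed) == data
            # Loose sanity bound on expansion (an assumption for this sketch).
            assert len(compressed) <= len(data) + 1024

    fuzz_api()
    print("no invariant violations found")

Assertions like these turn silent corruption into loud failures, which is what makes a project-specific fuzzer worth the effort of writing it.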
Is fuzzing really worth it?
A common critique of fuzzing tools is that after they have been run for a while, they stop finding bugs. The obvious flaw in this criticism is that not finding bugs is exactly what you want to achieve. Throwing away an automated test suite because it finds so few regressions would be illogical, and the same rationale applies to fuzzing. If fuzzing tools are no longer finding bugs, that should be considered a success before moving on to hunt for the more elusive threats.
How much work?
The most time-consuming part of the fuzzing process is finding the best tool for the job. If a tool works with the project, or even covers just a subset of it, then it is simply a case of running it.
As the tool picks out issues in the system, decisions about whether to fix the low priority glitches will be made along the way, and those decisions will shape the project’s trust boundaries. There may be a crash for which a patch is generated and submitted, only for it to be rejected because the corrupt input generated by the fuzzer can never reach that part of the project in practice, making the extra check too expensive to justify.
Whichever approach is taken, make sure it is well established in written processes so that future developers can build on it and go beyond it.
Emily Ratliff is Senior Director of Infrastructure Security for the Core Infrastructure Initiative at The Linux Foundation