Open source software is made for the world by volunteers. To find bugs in it, you have to go searching for them. This practical book will guide you through it, and show you how to get involved in the open source community. You’ll find out where to look for bugs, how to report them, and what you can do to fix them. You’ll also learn about common problems that programmers run into, so you can ask the right questions when you’re looking at someone’s code.
You may think that finding bugs in a piece of software is only feasible if you have access to the source code. And while that may be the only way to debug some features, many open source projects contain instructions within the documentation on how to report a bug. These instructions tell you what problems are being targeted and what they look like, so that any issues will easily be spotted. Let’s take a look at how to find bugs in an open source project.
First up, consider the open source software you use. Think of the languages, and then think of any open source libraries you use. For example, if you’re a Python programmer you might use Django, the Requests library, and a whole lot of other packages. Pick a few that you think are very popular. These projects have larger teams and you’ll get good experience from them. However, you’ll also want to pick a few smaller projects. While you might get lost in the crowd with a larger project, one person can make a huge impact on a smaller project.
Once you’ve picked a few libraries, find their issue trackers. Then think of a good trigger for when you’ll spend some time on open source. Maybe it’s every day in the middle of your morning coffee, or right after your once-a-week team standup, or maybe you can use the time right before your once-a-month community meet-up. I maintain CodeTriage, an open source project that sends issues to your inbox. Find a consistent schedule and pace that works with your life.
When you sit down to triage, make a goal. Maybe you want to find five issues to comment on. What exactly should you say? There are three states to an issue:
10) A bug is reported.
20) Extra information is gathered to reproduce the bug.
30) If the bug can’t be reproduced GOTO 20.
It’s your job to gather information and move towards a resolution. Let’s look at some common issue types and things you can do to help.
Static defect detection tools
Static defect detection faces serious technical difficulties, and as a result, there are multiple specialized kinds of static analysis tools. These tools are targeted at their particular use cases, or limited by chosen technologies. Even though the properties listed in Section 2 seem to be desirable for most applications, many of the existing static analysis tools don’t have them.
We will examine the existing tools, focusing primarily on tools for checking C programs, based on how they specify the situations in the source code that indicate the presence of bugs, and on the limitations imposed by algorithms for finding these situations in the source code.
Simple source code analysis tools, such as ITS4 [VBKM00], RATS and Flawfinder, are used to help with manual code audits. Such systems find certain template situations, such as potentially dangerous function calls, and list them exhaustively. Because the analysis algorithms are so simple, most of the reported issues do not correspond to real bugs.
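To make the idea of "template situations" concrete, here is a minimal sketch of a purely textual scanner. It is not how ITS4, RATS or Flawfinder are implemented; the rule table `RISKY_CALLS`, the function `scan_c_source` and the sample C fragment are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical rule set for illustration; real tools ship much larger databases.
RISKY_CALLS = {
    "gets": "no bounds checking on input",
    "strcpy": "no bounds checking on destination",
    "sprintf": "no bounds checking on destination",
    "system": "runs a shell command",
}

# Match a risky function name followed by an opening parenthesis.
CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def scan_c_source(source):
    """Return (line_number, function_name, reason) for every textual match."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            name = match.group(1)
            findings.append((line_number, name, RISKY_CALLS[name]))
    return findings


# Assumed sample input: only the third line is a plausible defect.
SAMPLE = '''\
char buf[16];
strcpy(buf, "ok");          /* fits in buf, not a real bug */
strcpy(buf, user_input);    /* possible overflow */
/* never call gets(line) here */
'''

for finding in scan_c_source(SAMPLE):
    print(finding)
```

The scanner flags lines 2, 3 and 4 of the sample, although only line 3 is a plausible defect: it cannot tell that the literal on line 2 fits in the buffer, or that line 4 is a comment. That is the kind of noise a human auditor has to filter out.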
Tools used to verify the absence of bugs of a certain type without requiring specifications usually impose restrictions on the source code of analyzed programs. The restrictions follow from the inability of the existing sound analysis algorithms to work with arbitrary data structures (and, correspondingly, code constructions), and from the requirement to have correct, even if incomplete, information about library functions. One group of such tools is based on abstract interpretation techniques, and includes PolySpace (see the comparison with other systems in [ZLL04]) and ASTRÉE [CCF+05]. These tools verify runtime safety and other properties of source code and were applied to check embedded software in aviation and device drivers. Another group of tools is based on the counterexample-guided abstraction refinement algorithm, which uses theorem proving to work efficiently with an abstraction of the state space. This group is represented by BLAST [HJMS02] and SLAM [BBC+06], applied to the verification of runtime safety of device drivers. Limitations imposed by these tools make it difficult to apply them to regular open source programs, requiring either heavy revision of the source code, or isolation of parts of the code from incompatible features in preparation for checking.
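As a rough illustration of the abstract interpretation approach, the sketch below tracks integer variables as intervals and checks an array access against them. It is a toy interval domain written for this text, not the analysis used by PolySpace or ASTRÉE; the `Interval` class, `index_is_safe` and the three-statement program in the comments are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every integer between lo and hi, inclusive."""
    lo: int
    hi: int

    def add(self, other):
        # Sound addition: the result covers every possible concrete sum.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def join(self, other):
        # Merge the values arriving from two branches of a conditional.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))


def index_is_safe(index, array_length):
    """True only if every value in the interval is a valid array index."""
    return index.lo >= 0 and index.hi < array_length


# Abstractly execute:  i = (flag ? 0 : 4);  j = i + k, with k in [0, 3];  a[j]
i = Interval(0, 0).join(Interval(4, 4))   # i is somewhere in [0, 4]
k = Interval(0, 3)
j = i.add(k)                              # j is somewhere in [0, 7]

print(j, index_is_safe(j, 8), index_is_safe(j, 6))
```

Note that `join` forgets that `i` can only be 0 or 4 and keeps the whole range [0, 4]. Over-approximating like this is what lets a sound tool guarantee that it misses no bug of the checked type, at the cost of occasionally reporting an access as unsafe when it is not.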
Systems for user specification checking allow finding defects in more complex situations, and without restricting the source code, but require a large number of manually written specifications to be effective. Cqual allows adding qualifiers to C types, and was used to find format string vulnerabilities
Issues have a shelf life. If they’re not actively being worked on, they can go bad. Maybe the original reporter found a workaround, or maybe the bug resolved itself.
To find stale issues, sort by “updated at”, oldest first, if the issue tracker allows for it. Go through a few of them and leave a comment:
Can you confirm this is still an issue?
If the issue is really old and someone already did that, you can push for the inactive issue to be closed:
There has been no activity here for the last 2 months. Let’s close this issue for now and re-open it if needed.
An issue is only as good and helpful as the people working on it. If the issue isn’t important enough for the original reporter to care about, there are likely more pressing problems that maintainers should focus on. Closing an issue isn’t final. If the problem comes up again, the issue can be re-opened or referenced by a new issue. Often, if a thread goes on for too long, it becomes difficult to scan all the conversations, so sometimes starting fresh is the best way to move forward.
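If your tracker lets you export issues, the search for stale ones can be scripted. The sketch below assumes each issue is a dictionary with `number`, `state` and `updated_at` fields. These names are hypothetical, so adjust them to whatever your tracker actually provides. The 60-day default is an assumption that mirrors the two months mentioned above.

```python
from datetime import datetime, timedelta, timezone


def find_stale_issues(issues, now, max_idle_days=60):
    """Return open issues idle for longer than max_idle_days, oldest first."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = [
        issue for issue in issues
        if issue["state"] == "open" and issue["updated_at"] < cutoff
    ]
    return sorted(stale, key=lambda issue: issue["updated_at"])


# Made-up sample data; the field names are assumptions, not a real tracker's schema.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
issues = [
    {"number": 101, "state": "open",
     "updated_at": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    {"number": 57, "state": "open",
     "updated_at": datetime(2023, 11, 2, tzinfo=timezone.utc)},
    {"number": 88, "state": "open",
     "updated_at": datetime(2024, 2, 14, tzinfo=timezone.utc)},
    {"number": 12, "state": "closed",
     "updated_at": datetime(2022, 1, 5, tzinfo=timezone.utc)},
]

for issue in find_stale_issues(issues, now):
    print(issue["number"], issue["updated_at"].date())
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the result reproducible and makes the function easy to test.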
This guide is intended to help software developers and bug hunters contribute to open source projects by finding bugs. It covers tools that can be used to find bugs in different programming languages, and points out some common mistakes made by novice bug hunters.