During my projects, I often collaborated with other teams, integrating their code into my work. This required me to debug code written in various languages, sometimes outside of my area of expertise. I’m a TS/JS dev and I had to work with Swift/Objective-C, Kotlin/Java, and even Golang code.
Given that I wasn't a native developer and wasn't familiar with the various codebases, I had one major problem: investigating bugs was really time-consuming. I faced two main challenges:
For the first issue, there is no magic solution: you have to read the code and ask for explanations. However, the second issue is more manageable, and I’ll share an effective method to help you save time during your bug investigations.
Before that, it is interesting to understand the problem by answering the following question:
It's very simple. Just follow a short list of steps:
While this procedure may work with simple bugs, you are almost guaranteed to lose a lot of time when facing a more complex problem.
Moreover, even after a stroke of genius (or luck) that fixes the problem, you then need to clean up all your tests, identify which parts of the tests are necessary for the fix, remember which test was conducted, and most importantly, understand why it works to avoid a shaky fix that works half-heartedly and by chance.
In other words, even after our brilliant insight, the problem is far from being truly resolved.
Let's take an example of a bug I recently encountered:
I have an application where you can launch a video. The player used to play the video is an external library. When you press the escape key on your keyboard, you exit the video via an onExit function that calls the goBack function of React Navigation.
Here is the VideoPlayer component:
When I exit the page with the video, the memory does not decrease, the video remains in memory → I have a memory leak.
Step 1: My goal is to fix the memory leak. My plan is to fix the memory leak.
Step 2:
destroyPlayer function of the external library. What luck! I have access to this code, the environment to be able to debug it only takes a day to set up!
console.log and the debugger (I’m a pro 😎)
Step 3:
Step 4:
→ 1.5 days wasted with no result.
→ No usable result to prevent someone else from repeating the same tests I did.
When encountering a bug, it’s important to determine the scope of the bug. Which components are involved? Which pages? Under what conditions?
The first step is to be able to consistently reproduce this bug by identifying all the parameters (environment, branch, commit, procedure to reproduce). For simple problems, this step is often immediate: for example, I just added a function that causes side effects or simply does not work.
In our example, I quickly notice that the memory leak appears as soon as I exit with the escape key. This allows me to narrow the scope of my bug to the steps involved in closing the player. I don't waste time analyzing the creation of the player, how memory behaves during playback, etc. A well-defined scope already saves a lot of investigation time.
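The reproduction parameters listed above can be written down as a small typed record. This is only a sketch: the field names and the missingFields helper are illustrative assumptions, not a standard format. The environment, branch and commit are left empty on purpose, since they are not given here.

```typescript
// Illustrative shape for a reproduction record; all names are assumptions.
interface ReproductionRecord {
  environment: string;
  branch: string;
  commit: string;
  steps: string[];
  expected: string;
  actual: string;
}

// Returns the names of the parameters that still need to be filled in.
function missingFields(record: ReproductionRecord): string[] {
  const missing: string[] = [];
  if (record.environment.trim() === "") missing.push("environment");
  if (record.branch.trim() === "") missing.push("branch");
  if (record.commit.trim() === "") missing.push("commit");
  if (record.steps.length === 0) missing.push("steps");
  return missing;
}

const memoryLeak: ReproductionRecord = {
  environment: "", // not specified in this article
  branch: "", // not specified in this article
  commit: "", // not specified in this article
  steps: [
    "Launch a video",
    "Press the escape key to exit the video",
    "Check the memory once the page is closed",
  ],
  expected: "Memory decreases after leaving the video page",
  actual: "The video remains in memory",
};

// Lists what is still missing before the bug counts as reproducible by someone else.
console.log(missingFields(memoryLeak));
```

A record like this is enough to hand the bug to a teammate, and the helper makes it obvious when a parameter has been forgotten.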
Once the bug is identified, I do not dive into a deep code reading; I calmly read the logs first. Most of the time, the bug/error will be clear and can be fixed in 5 minutes without any problem.
Unfortunately for me, this time no logs 😓.
If the error persists, I can then start preparing. I take out paper and pens, Excalidraw or draw.io. I need to understand the bug. I managed to identify how to reproduce it; now I seek to understand the flow that causes it. Complex bugs are often due to interactions between multiple components, states, hooks, etc. I could try to remember everything, but to keep my future self from going insane, I prefer to write it all down.
I draw the buggy flow as well as the “perfect” situation. The drawing does not need to be exhaustive at first. I can iterate on it to make it more complete until I have a good representation of both flows.
Perfect situation:
Buggy situation:
At this stage, I don’t know why the destroyPlayer method isn’t fully executed. I only know that it is called but the player isn’t destroyed.
Having both situations drawn out will not only help identify the problem but also think through a solution. Additionally, these drawings will be very useful for seeking help and explaining the issue to someone else clearly.
Do not hesitate to diversify the diagrams at first to find the one that best illustrates the problem. Here I used a sequence diagram but you could have used an activity diagram or just rectangles and arrows if you want. The important thing is to have a visual representation of what happens.
Once the process is drawn, I can start thinking about why our bug is there. This is often when I dive into a tunnel and come out 2 days later feeling like I’ve been going in circles forever. Worse, after a 30-minute session trying to test something, I can simply forget why I was setting up this test. Here is my method to avoid these tunnels:
I think of several hypotheses and for each, a plan to confirm/refute them as simply as possible. The goal is to identify precisely where the bug is and to avoid wasting time on improvised implementations.
In my example, I know the error occurs during the call to destroyPlayer. I can then formulate two hypotheses covering all possibilities:
The destroyPlayer function doesn’t work → the problem comes from the external library.
Test plan: Call destroyPlayer manually rather than in the useEffect cleanup.
The test plan for hypothesis 1 will validate/refute this hypothesis.
Let’s document the results as exhaustively as possible. Some test results will invalidate other hypotheses without needing additional tests. This is the case for our test plan above.
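One possible way to keep hypotheses, test plans and results together is a small typed log. The sketch below is an assumption about format, not a prescribed tool: the type and function names are hypothetical, and the wording of the second hypothesis is inferred from the conclusion of the investigation rather than quoted.

```typescript
// Illustrative hypothesis log; every name here is an assumption.
type Status = "untested" | "confirmed" | "refuted";

interface Hypothesis {
  id: number;
  statement: string;
  testPlan: string;
  status: Status;
  notes: string;
}

// Returns a new log where the given hypothesis carries its result and notes.
function recordResult(log: Hypothesis[], id: number, status: Status, notes: string): Hypothesis[] {
  return log.map((h) => (h.id === id ? { ...h, status, notes } : h));
}

// Hypotheses that still need a test.
function remaining(log: Hypothesis[]): Hypothesis[] {
  return log.filter((h) => h.status === "untested");
}

const initialLog: Hypothesis[] = [
  {
    id: 1,
    statement: "destroyPlayer from the external library does not work",
    testPlan: "Call destroyPlayer manually rather than in the useEffect cleanup",
    status: "untested",
    notes: "",
  },
  {
    id: 2,
    statement: "The problem is in my own implementation (wording inferred)",
    testPlan: "Covered by the test plan of hypothesis 1",
    status: "untested",
    notes: "",
  },
];

// A single test settles both hypotheses.
let log = recordResult(initialLog, 1, "refuted", "Player is destroyed when destroyPlayer is called manually");
log = recordResult(log, 2, "confirmed", "Follows from hypothesis 1 being refuted");

console.log(remaining(log).length); // prints 0
```

Because recordResult returns a new array, the initial log stays available as a trace of what was planned before any test was run.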
For my memory leak, I manually tested the destruction functions of the player, without the useEffect cleanup, and confirmed that the player was properly destroyed. I concluded that the library's functions work correctly. The problem, therefore, lies in my implementation.
The function called in this cleanup is asynchronous and was trying to update a component in the DOM, but this component was already unmounted. The destroy function therefore crashed before finishing, hence the memory leak. The solution was to destroy the player before navigating.
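The ordering problem can be illustrated with a simplified, synchronous model. This is not the real component or the real library: FakeScreen, FakePlayer, goBack, exitBuggy and exitFixed are hypothetical stand-ins, the asynchronous part is left out, and swallowing the error is an assumption chosen to mirror the absence of logs.

```typescript
// Hypothetical stand-ins; none of these names come from the real library or from React Navigation.
class FakeScreen {
  mounted = true;
  status = "playing";

  // Mimics a state update that is only valid while the screen is mounted.
  setStatus(next: string): void {
    if (!this.mounted) {
      throw new Error("cannot update an unmounted screen");
    }
    this.status = next;
  }
}

class FakePlayer {
  released = false;

  // Updates the UI first, then frees the resources.
  // If the UI update throws, the release step is never reached.
  destroy(screen: FakeScreen): void {
    screen.setStatus("closing");
    this.released = true;
  }
}

// Stand-in for navigating back: the screen gets unmounted.
function goBack(screen: FakeScreen): void {
  screen.mounted = false;
}

// Buggy order: navigate first, destroy afterwards during cleanup.
function exitBuggy(player: FakePlayer, screen: FakeScreen): void {
  goBack(screen);
  try {
    player.destroy(screen);
  } catch {
    // Assumption: the failure is swallowed, so nothing shows up in the logs.
  }
}

// Fixed order: destroy while the screen is still mounted, then navigate.
function exitFixed(player: FakePlayer, screen: FakeScreen): void {
  player.destroy(screen);
  goBack(screen);
}

const leakedPlayer = new FakePlayer();
exitBuggy(leakedPlayer, new FakeScreen());

const cleanPlayer = new FakePlayer();
exitFixed(cleanPlayer, new FakeScreen());

console.log(leakedPlayer.released, cleanPlayer.released); // prints: false true
```

In the buggy order the resources are never released because the destroy step fails halfway; swapping the two calls is the whole fix.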
This bug investigation showed me how proper organisation can save hours, if not days. With a proper and simple plan I was able to identify the problem in less than an hour:
When formulating hypotheses and debugging, it’s very easy to fall into a loop of increasingly complex hypotheses and waste a lot of time. During an investigation, we often discover auxiliary problems or possible causes that do not directly impact the main issue. For example, while trying to resolve a memory leak, we might encounter a performance error or unexpected behavior in another part of the code.
The trap is to "dive" into this auxiliary bug or problem, forgetting the main objective of the investigation. This leads to dispersed efforts and can significantly prolong the resolution time of the initial problem. Instead of solving the main bug, we end up debugging several unrelated issues, which can become frustrating and inefficient.
By adopting these practices, you will be able to solve bugs more efficiently and systematically.
If you want to know more about debugging, I recommend reading the debug guide, which really helped me create this methodology.
Kudos to Delphine for her amazing drawing of me losing my mind over this bug.