How to Memorize a Complex Codebase?
5 tricks to remember important pieces of your massive codebase
Does your tech lead always have answers to code questions? Somehow, they always connect the dots between an esoteric symptom/issue and the codebase.
It almost feels like they have memorized the entire codebase. While it seems like magic, it is not. You can do it, too.
Note: Memorizing the codebase does not mean you remember everything line by line. It is about doing these five things:
🌐 1. Know the APIs
Memorize your team’s API surface and workflows. You should know them well enough that you do not need to refer to the documentation.
Here is what I do:
Make a list of the prioritized scenarios.
Use the API / product like my customer.
Run those scenarios under a debugger and add log lines to understand which code path is being executed.
Tests can shed light on nuanced behaviors, so understand & run them.
Fix bugs in different areas (very helpful!).
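To illustrate the "add log lines" step, here is a minimal sketch, assuming a Python service. The `trace` decorator and the three functions (`handle_get_order`, `lookup_cache`, `load_from_db`) are hypothetical names made up for this example; they stand in for whatever your real request path looks like.

```python
import functools
import logging

logger = logging.getLogger("codepath")


def trace(func):
    """Log entry and exit of func so the executed code path shows up in the logs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            logger.info("exit %s", func.__name__)
    return wrapper


# Hypothetical request flow: API handler -> cache lookup -> database read.
@trace
def load_from_db(order_id):
    return {"id": order_id, "status": "shipped"}


@trace
def lookup_cache(order_id):
    return None  # simulate a cache miss


@trace
def handle_get_order(order_id):
    cached = lookup_cache(order_id)
    return cached if cached is not None else load_from_db(order_id)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    handle_get_order(42)
```

Running one scenario with the decorator applied shows which functions were entered and in what order, which is usually enough to see where a cache miss falls through to the database.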
I have seen some engineers who have been on a team for a while but still do not know key areas. So, when a major issue strikes, they rely on others to resolve it.
📑 2. Document & understand the core “flow”
Next, I understand how the code is organized. While there is no alternative to actually reading the code in depth, there are some tricks:
Know which source files are important and which are not.
Know which lines to focus on and which to skim through.
I start by deep-diving into one workflow in code search and under the debugger. Next, I document the classes/files, starting from the API surface layer and working my way through all the dependencies.
Key classes to know are API handlers, caching layers, database interactions, etc.
Classes to skim through are the simple utilities, logging helpers, etc.
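One way to keep such notes in a form you can query is a small dependency map, walked from the API layer downward. This is only a sketch under assumptions: the component names in `FLOW` are hypothetical, and the breadth-first walk is my choice for the illustration, not a prescribed method.

```python
from collections import deque

# Hypothetical notes for one workflow: each component maps to what it calls.
FLOW = {
    "OrderApiHandler": ["OrderService"],
    "OrderService": ["OrderCache", "OrderRepository"],
    "OrderCache": [],
    "OrderRepository": ["Database"],
    "Database": [],
}


def walk_flow(flow, entry_point):
    """Return (depth, component) pairs in breadth-first order from entry_point."""
    seen = {entry_point}
    queue = deque([(0, entry_point)])
    ordered = []
    while queue:
        depth, name = queue.popleft()
        ordered.append((depth, name))
        for dep in flow.get(name, []):
            if dep not in seen:
                seen.add(dep)
                queue.append((depth + 1, dep))
    return ordered


if __name__ == "__main__":
    # Print the flow as an indented outline, API layer first.
    for depth, name in walk_flow(FLOW, "OrderApiHandler"):
        print("  " * depth + name)
```

The indented outline gives a quick reminder of which layers a request passes through, and the map is easy to extend each time you learn about a new dependency.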
Note that memorizing the code is not a one-day job. I learn something new every time I look at it, so I keep live notes.
Here is a sample version of my high-level notes:
🕵️ 3. Read older “issues”
Memorizing code is not just about what is visible in it but also about knowing system constraints, bottlenecks, and the sources of repeated problems. Reading about past issues is a great way to learn these.
Here are the key things I look for in past issues:
How did the system behave with bad inputs?
Overload incidents highlight bottlenecks and how the system deals with them.
How did the team handle the past issues, and which issues/areas still need attention?
Which issues repeat, and why?
How annoyed do critical customers get when our team has issues?
🧵 4. Threading model
I have dealt with many production issues, and quite a few were caused by race conditions.
The key to solving those was keeping the threading model in my head.
Examples of things I try to learn:
Can the request threads be blocked under any circumstances?
Do the key states get accessed or updated from different threads? Are they protected by the right locking mechanism?
Do long-running tasks run on their own thread pool? Can they be resource-intensive?
Were there past issues due to deadlocks or blocked threads?
How does the request thread use the cache, and does another thread invalidate the cache?
I document the different kinds of thread pools and their responsibilities, such as the states they update, the code they run, and the I/O operations they perform.
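The cache question above can be made concrete with a small sketch, assuming Python's `threading` module. `OrderCache` and its loader are hypothetical; the point is only to show shared state that request threads read while another thread invalidates it.

```python
import threading


class OrderCache:
    """Hypothetical cache read by request threads and invalidated by another thread."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._entries = {}

    def get(self, key):
        with self._lock:
            if key in self._entries:
                return self._entries[key]
        # Load outside the lock so a slow backend call does not block other request threads.
        value = self._loader(key)
        with self._lock:
            # setdefault keeps the first stored value if another thread stored one meanwhile.
            return self._entries.setdefault(key, value)

    def invalidate(self, key):
        with self._lock:
            self._entries.pop(key, None)


if __name__ == "__main__":
    cache = OrderCache(lambda key: f"order-{key}")
    workers = [threading.Thread(target=cache.get, args=(7,)) for _ in range(4)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    cache.invalidate(7)  # in a real system this would run on a background thread
    print(cache.get(7))
```

Even in this sketch there is a gotcha worth writing down: an invalidation that lands between the load and the store can leave a stale value in the cache. Spotting interactions like that is exactly what knowing the threading model is for.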
🔥 5. The crazy stuff
Every codebase has a few code blocks that are extremely complex, even with a giant comment section.
Knowing these can make you:
Effective when esoteric issues strike. It will reduce debugging time as you can connect the dots faster than others.
Good at identifying gotchas in new design proposals.
A great mentor who can help others navigate the complex codebase
I keep a log of such stuff and add to it whenever I encounter such code blocks.
Parting Note
Memorizing the codebase is an important skill for every engineer, especially tech leads. It does not mean you remember things line by line, but you must know the core scenarios, system constraints, and other code nuances.
Don’t expect to learn everything in a day and retain it forever. It is a continuous process. So, keep live notes and spend quality time learning this regularly.
🎤 Shoutout
Managing up: 3 things I wish I realized sooner by Jordan Cutler
Prioritize your Tasks Effectively by
It was great to read about your in-depth process on this, Raviraj. Only some of it applies for me on the frontend side, but I was able to do the translations and make it work.
Thanks as well for the shout-out on my recent article 🙇♂️
One effective way that I found was by assigning myself to resolve as many bugs as possible. That made me go through different pieces of code many times and that repetition helped me memorize the code.