The cost of refactoring
Most engineers have actively refactored code they're responsible for, and a lot think there's no cost to refactoring - but there is. Let's explore what that cost is.
Many teams like to live by the 'boy scouting' mantra - leave an area cleaner than you found it. Thus, refactoring is widespread in software engineering teams - but at what cost?
There's always a cost. Everything done, every decision made, has tradeoffs. It's essential to remember this when making decisions - whether architecture decisions, when and what to refactor, work prioritisation, or even just delaying replying to a message from somebody until later.
Contents
When is a good time to refactor?
What are the costs of refactoring?
What is the cost of NOT refactoring?
How to refactor well
When is a good time to refactor?
Deciding when to refactor is subjective - and relies heavily on context. You have to make an informed decision in which you have weighed the tradeoffs (costs) of refactoring something and concluded that the benefit is worth the price.
I can't make that decision for you, but I can describe the scenarios in which refactoring is worth considering.
The two most prominent ticks to go ahead with refactoring, in my mind, are:
The team is hurting from an earlier decision (such as architecture), and refactoring can remove or reduce the pain.
There is a business case (user requirement) to be working in the area that is giving your team pain.
Many engineers make it apparent that they want to refactor an area. Generally, engineers are more vocal and persistent about this than about anything else - but this doesn't mean it's an actual pain point that needs to be addressed, and that's an easy trap to fall into.
One way to ensure you/the team (truly) feel the pain of an area is to choose a priority area to refactor and provide reasons for it. If there's data to back it up, it's more likely a genuine pain point - not to say you need this data, but it helps validate the concern.
Before you go off and do any refactoring, ensure the team is on board - there's nothing worse than all the cost and no reward because the team disagrees the area needs refactoring.
Dedicate time to refactoring, or do it ad hoc?
I'm not a fan of carving out dedicated time for refactoring - putting that out there now.
The best time to refactor is ad hoc when you meet both criteria above. Having a user need gives you a reason to be in the area and feel the pain, which means you have work to do in the area - so you can refactor as part of business-as-usual (BAU) work without putting dedicated time aside, reducing one cost of refactoring and optimising time utilisation.
What are the costs of refactoring?
Many engineers find refactoring code rejuvenating work, so they vocalise this desire and overplay the importance of it. We must ensure we're pragmatic and consider the tradeoffs before committing.
Time is such a costly thing. When you are refactoring, what are you not doing? Revenue-generating work (features, infra, etc.). I know it's not black and white; it never is, but that's an easy way to look at it here. Time spent refactoring is time spent not building new features - but it's more than that. When you make a change (refactor), somebody has to use their time to review that change, and then somebody has to spend time ensuring no behaviours have changed and no bad ones have been introduced. You should use behaviour-driven tests to help with the last one, but that only slightly reduces costs.
Knowledge is lost, and confusion is added when we refactor - this is prone to being glossed over because we usually focus on the benefits of the refactoring. In the long run, your change may bring more clarity to a codebase, but your refactor breaks the mental map others have in the short term. Time needs to be expended to familiarise/learn the new codebase - this mental map that has been eradicated is costly to replace and essential. Without a mental map, everything in this area takes longer. If a new, severe bug is found and needs to be fixed, then there are potentially fewer people who understand the codebase and can quickly execute the changes required to fix the bug.
Unknown unknowns. What was known is now unknown. Akin to the mental map lost in the above point, you suddenly don't know what you don't know about the code section. As systems grow, bugs are discovered (as is almost inevitable) and often classified as 'won't fix'. These are known knowns. Your refactor may have fixed them or hidden them deeper - but we don't know that now. We need to determine whether they still exist or are gone, and whether new bugs will appear. Time is required to discover these things (if you're lucky enough to find them). Tests can help, particularly behaviour-driven tests, but they only catch some bugs.
What is the cost of NOT refactoring?
It's not all doom and gloom. Refactoring is almost inevitable in software; it's a constant, like the speed of light (and it seems to come up as often, too!), and there is usually a good reason it's wanted. It can be expensive not to refactor.
The three most significant costs of not refactoring:
Bus factor remains a risk.
Technical debt persists.
Morale steadily declines.
All three costs go hand in hand. Technical debt can lead to poor 'working conditions', declining morale and people moving on from the project/company, which leaves fewer people who understand the system, worsens the bus factor and creates a higher risk.
One trick here is that refactoring may not altogether remove the technical debt, but it can still improve morale (temporarily) - reducing the chance of the bus-factor risk growing. Sometimes, taking a more minor cost and performing a smaller refactor is the better tactical choice. Smaller cost, smaller reward.
The excitement fallacy
One of the worst reasons to refactor is that it's 'exciting' or 'energising' work. At most, you're going to increase morale minutely while incurring all the costs - not a worthy tradeoff.
Most reasons behind refactoring are emotional, but this is the one to avoid. Pain and data are your friends, not excitement. Find other ways to sate the desire for exciting work - do personal development and try out a new technology (as a POC, not production work!).
How to refactor well
So, given we know when and why - how do you refactor excellently?
Small increments - just like good old-fashioned trunk-based development recommends (we all know how important trunk-based is). Small, concise, targeted changes are ideal because they're easy/quick for somebody to review. When you make small changes, it's easier to determine the scope of their impact.
Highly focused refactors are essential. If you try to bite off more than you can chew, you'll fix nothing and break everything. Remember, the biggest cost of refactoring is time. If you are highly focused, you reduce the scope and thus the time investment. You can choose the highest-priority refactor targets, and once they have been completed, you can re-evaluate. Things move fast, and having small targets means you can pivot more successfully.
Use (behavioural) tests to minimise the risk of introducing unknown unknowns - one of the costs described earlier. You should ensure you have tests in place before you touch the code. Use TDD to guide you here. Understand the behaviours expected in the part of the system you are modifying, put tests in place, and then execute the refactor.
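As a rough illustration of putting tests in place before touching the code, here is a minimal Python sketch. Everything in it is hypothetical and invented for the example: the order_total functions, the "SAVE10" discount code, and the choice to hold prices as integer pence. The idea is to pin the current behaviour as a table of cases first, then run the same cases against the refactored version.

```python
# Hypothetical example: none of these names come from a real codebase.
# Prices are integer pence so the comparison avoids floating-point noise.


def order_total_legacy(prices, discount_code):
    """The 'before' code: it works, but is awkward to read and extend."""
    total = 0
    for p in prices:
        total = total + p
    if discount_code == "SAVE10":
        total = total - total * 10 // 100
    if total < 0:
        total = 0
    return total


# Step 1: pin the behaviour the system has today, before any refactor.
# Each entry is ((prices, discount_code), expected_total).
PINNED_CASES = [
    (([], None), 0),
    (([1000, 550], None), 1550),
    (([1000, 550], "SAVE10"), 1395),
    (([1000], "UNKNOWN"), 1000),  # unknown codes are ignored
    (([-500], None), 0),          # totals never go negative
]

DISCOUNT_PERCENT = {"SAVE10": 10}


def order_total_refactored(prices, discount_code):
    """Step 2: the 'after' code, intended to behave identically."""
    subtotal = sum(prices)
    percent = DISCOUNT_PERCENT.get(discount_code, 0)
    return max(subtotal - subtotal * percent // 100, 0)


def behaviour_mismatches(fn):
    """Return the pinned cases where fn disagrees with the expected total."""
    return [
        (args, expected, fn(*args))
        for args, expected in PINNED_CASES
        if fn(*args) != expected
    ]


if __name__ == "__main__":
    # The pinned cases must pass against the old code first, otherwise
    # they describe wishes rather than the current behaviour.
    assert behaviour_mismatches(order_total_legacy) == []
    # Step 3: the same cases guard the refactor.
    assert behaviour_mismatches(order_total_refactored) == []
    print("behaviour preserved for all pinned cases")
```

The pinned cases only cover the behaviours somebody thought to write down, so this reduces the risk of unknown unknowns rather than removing it.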
Another way of mitigating this pain is to pair/mob on the refactor. You'll also reduce the cost of knowledge loss because more people will be close to the changes and understand the new code.
In summary
Refactoring isn't as perfect as it's made out to be, and we should be conscious of the cost before doing it. It’s important work, but it’s not always important enough.
So, next time you want to do some refactoring, pause and take your time to find reasoning and break down the work with a focus on small deliverables in the highest priority areas.