Here I think out loud about an idea to turn technical debt into dollar amounts so that it can be discussed properly with managers.
Practically all codebases suffer from technical debt. Usually the developers complain that management never gives cleanup enough priority. The usual advice is to reserve a fixed amount of time for debt removal, like half a day per week or one day per sprint.
That sounds like a cop-out to me. Why don't we properly assess the value of cleaning it up? If you tell managers you want to invest 4 hours of work to save $10,000 over the next year, the value is clear even if they have no clue about software. The big question is how to come up with the number "$10,000". Technical debt is hard to quantify in money.
Technical Debt is Rule Violations
Let's first define what technical debt actually is. Ward Cunningham gets the credit for coming up with the debt analogy. He was consulting for a finance company, so a finance analogy suggested itself. It turned out the comparison was useful in other industries as well.
The concept describes that during software development we sometimes make choices that imply costs later. It is like borrowing money to buy a new car: later you have to pay it back plus interest. The interest aspect manifests in software development in two ways: either technical debt incurs ongoing costs, like slowing down development, or paying it off later requires more effort because further development has inherited and multiplied the debt. In financial terms, we either pay the compounding interest in ongoing installments or all at once at the end.
To identify and measure technical debt, we look for rule violations. For example, your code might violate naming conventions. This makes the code slightly harder to read and understand, which increases the risk of introducing bugs or missing them during a code review. There are tools to check naming conventions, so we can count all violations.
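As a toy sketch of such counting, assume we have collected the identifiers from a codebase and our convention is snake_case. The regex and the sample names are made up for illustration; a real setup would use a proper linter:

```python
import re

# Made-up convention: variable and function names must be snake_case.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9_]*$")

def count_naming_violations(identifiers):
    """Count identifiers that do not follow the snake_case convention."""
    return sum(1 for name in identifiers if not SNAKE_CASE.match(name))

names = ["parse_input", "writeOutput", "temp_value", "Buffer2"]
print(count_naming_violations(names))  # 2 (writeOutput and Buffer2)
```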
A funny implication of defining technical debt as rule violations is that you can increase and decrease it arbitrarily by changing the rules. If you declare that CamelCase is used instead of snake_case from now on, you have implicitly created technical debt which needs to be cleaned up. If you revert the rule decision, that technical debt disappears.
OK, so we can quantify technical debt, but only in a useless metric. Knowing that we have 5925 rule violations is not a dollar amount.
A similar problem occurs with the risks of the current Corona pandemic. We know that there is now a certain risk whenever you meet other people, and studies have measured the impact of specific behaviors. How can you turn this into practical guidance?
One interesting approach comes from the microCOVID project. One microCOVID represents a one-in-a-million chance of catching the Corona Virus Disease. For example, if I'm in a car for 15 minutes with one other person, that is 70 microCOVIDs. If we both wear FFP2 masks, it goes down to 4 microCOVIDs. Doing that twice a day for one week adds up (4 * 2 * 5) to 40 microCOVIDs. Their suggested default is a budget of 200 microCOVIDs per week, which corresponds to a 1% chance of COVID per year. You may adapt the budget to your needs: lower if you are more vulnerable, higher if you don't care that much.
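The budget arithmetic is simple enough to write down, using the numbers from the example above:

```python
# Numbers taken from the microCOVID car-ride example above.
ride = 4                      # microCOVIDs per masked 15-minute car ride
weekly_rides = 2 * 5          # twice a day, five days a week
weekly_exposure = ride * weekly_rides
print(weekly_exposure)        # 40 microCOVIDs

WEEKLY_BUDGET = 200           # suggested default, ~1% COVID risk per year
print(weekly_exposure <= WEEKLY_BUDGET)  # True: well within budget
```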
Could we reuse this idea for technical debt?
As seen above, we can count rule violations. What we want is a measure of the opportunity cost of fixing them. Let us steal the "microCOVID" idea and define "microdefects": a one-in-a-million chance of causing a serious defect. Some made-up examples:
- "Violating the naming conventions" costs 1 microdefect. If you have a million of them, one of them is probably related to a bug.
- "If-clauses without an else" costs 10 microdefects.
- "If-clause bodies without braces" costs 1000 microdefects.
- "Using a function pointer in pointer arithmetic" is 900,000 microdefects.
Of course, the microdefect amounts need to be calibrated, ideally by quantitative data.
Now you need to determine the cost of an average defect. Maybe it costs $1000 to fix one (including secondary costs like losing face in front of a customer). Then you know that putting braces around an if-body would save you $1 in expectation. You can easily determine how important fixing the 9375 violations in your codebase is and compare it to an effort estimate. This can be integrated into static analysis tooling, so it requires no manual effort to compute the costs (apart from calibration and setup).
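A minimal sketch of this accounting, using the made-up microdefect weights and the $1000-per-defect figure from above (a real setup would calibrate both against quantitative data):

```python
# Made-up microdefect weights from the examples above.
MICRODEFECTS = {
    "naming-convention": 1,
    "if-without-else": 10,
    "if-body-without-braces": 1000,
    "function-pointer-arithmetic": 900_000,
}

COST_PER_DEFECT = 1000  # dollars per average defect, including secondary costs

def dollar_value(violation_counts):
    """Expected dollar savings from fixing the given rule violations."""
    micro = sum(MICRODEFECTS[rule] * count
                for rule, count in violation_counts.items())
    return micro / 1_000_000 * COST_PER_DEFECT

# One pair of missing braces is worth about one dollar:
print(dollar_value({"if-body-without-braces": 1}))  # 1.0
```

The output of `dollar_value` is exactly the kind of number you can put in front of a manager next to an effort estimate.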
Is anybody out there already doing something like this?
A good discussion ensued on lobste.rs.