When you find two identical or near-identical pieces of code in your codebase this is identified as code duplication. There are many resources to learn why code duplication is an issue.
When you find duplicate code, the intuitive reaction is to fix it right away. Don’t. Here is why:
There is a subtle difference between fixing an issue and solving it. Fixing means to remove it. Solving means to analyze the root cause and fix that. When you fix code duplication the typical thought process is: “Oh, here is a piece of code, and here is the same again. That’s bad. Let us merge them. Hm, maybe we can extract a new private method. Or we create a class for it. Or we place it in this class here, since this is already used in both places anyway. Oh yes, let’s do that.”
However, when you find duplicate pieces of code this is usually not the root issue. It is only a smell. There might be a flaw in your design. The place of your code might be discrepant from where the associated semantic concept belongs to.
Take this example:
We have a model class that represents a service product in three shapes: Basic, Medium, Top. We find this in a property:
class Product { int ProductType { get; set; } }
Then we have a document generation module for printing contracts or offers. Therein we find multiple blocks of the form
if (product.ProductType == Constants.MEDIUM || product.ProductType == Constants.TOP) { ... }
We could fix that by extracting a local variable:
var isMediumOrTop = product.ProductType == Constants.MEDIUM || product.ProductType == Constants.TOP;
Or we could extract a method
private static bool IsMediumOrTop(product) {...}
No, much better. Let us introduce a helper class.
public class AllPurposeDocumentGenerationHelper { ... }
We won’t do that. Instead we lean back and think:
- We have duplicate code. Why is that?
- Is there a business meaning attached to this condition?
- What is it?
We find that actually the business wants to “unlock” a certain special offer in the product if you buy the medium version or above. So it is a business thing. It is nothing specific to generating a document, nor is “IsMediumOrTop” necessarily a good name to represent it.
Recognizing that, we decide that the model needs to be extended:
bool SpecialOfferIncluded { get { return ProductType == Constants.MEDIUM || ProductType == Constants.TOP; } }
The conditionals in the document generation will now look like this:
if (product.SpecialOfferIncluded) { ... }
The semantic meaning will be immediately clear to the reader (and coder). Also, by looking at the new model property you will learn a business rule: “If the product type is medium or top, the special offer is unlocked.”
Next we notice that in our front end code we have the same conditional but there it was not so obvious:
if (product.ProductType == Constants.BASIC) { ... }
Gotcha!
if (!product.SpecialOfferIncluded) { ... }
No, wait! We wanted to lean back and think first. While doing that, it might turn out that this piece of code is actually not related to the special offer thing. Instead it is really specific to the product type. Luckily we now have a way to make a semantic distinction between the two meanings.
Soon the business will come to you and tell you that in the next version, the SpecialOffer should be included only if the price exceeds a certain limit. Easy:
bool SpecialOfferIncluded { get { return Price > Constants.MIN_PRICE_FOR_SPECIAL_OFFER; } }
Done.
Conclusion
When you face duplicate code, take it as a opportunity to think about your design, the business, and how well the design represents the semantics of your application. There are good chances you will find a way to improve that, thus having really solved the root issue of the duplicate code rather than just having it fixed in an arbitrary way.
