Please allow me to start with a story of a classic legacy experience. I’m sure some readers have already faced this experience in one form or another:
A couple of years back, a very profitable company, already a leader in its industry, decided to launch a new service. The launch would require a considerable amount of development work on an existing large monolithic system – to roughly quantify, 100 developers spent 40% of their development time maintaining that system.
This legacy system served the core business and was essential for day-to-day operations. Like many others, its architecture had taken the shape of a big ball of mud over the years: intertwined connections everywhere made it very difficult to change anything without unwanted consequences. Conceptually, you could divide the system into different domains, but there were no clear boundaries or interfaces.
Because the new service was considered to be a potential game changer, management was keen on launching it as quickly as possible. To make this happen, they followed a proven strategy with a history of success. They formed a 15-member, cross-functional team, moved the whole team into the same room, and put one of the company’s most energetic business leaders in charge of the project.
The new feature was expected to touch almost all domains of the system, but 80% of the work would be concentrated in one specific domain. Initially, the team had nine developers, only two of whom had experience in that particular domain. The others had varying levels of experience, but none had a solid working knowledge of the legacy system.
The real work begins
With the draft requirements in hand, the team estimated and agreed on a rough deadline of two months.
The project kicked off with great energy and enthusiasm. However, after a month and a half, the team was still nowhere near finished. So, the project lead locked down the team and moved them to a more isolated room in the building so they could maintain focus on the project.
After another month – although the product was starting to take shape – the team was still far from delivering a finished product. In the meantime, the project manager realized the need for more senior developers with knowledge of different domains within the system. So, he intervened and brought in six more people, mostly senior developers and architects with expertise in different domains of the system.
Once those six people got to work, they quickly realized they couldn’t move forward with the code that had been written. The new members urged that the whole solution be redesigned, fearing that everything would become completely unusable if they simply pressed on. Eventually, they got approval from management and scrapped 70% of the code.
The real work begins – again
So, the project restarted almost from the beginning and continued for another ten months. Meanwhile, management was growing increasingly impatient: the sales team had already sold the new product to many customers and, having already missed one deadline, time was running out.
Finally, they set a hard deadline of three months. At this point, the team realized it was a nearly impossible feat unless people from other teams helped implement some of the features in other domains. Since this project was the company’s No. 1 priority, three different teams dropped everything and came to the rescue. Three months later, they just managed to launch the product. Finally, after 16 months, the product was delivered without compromising any features!
What does this all mean?
Yes, the story has a happy ending. Even though it nearly ran out of time, budget and resources, this company somehow managed to pull it off. However, for most companies, there are rarely happy endings in these scenarios. But before I get ahead of myself by calling this a complete success, let’s consider some facts:
- It was a two-month project that dragged on for almost 16 months.
- 21 additional people had to come to the rescue, sucking up company resources.
- Three teams had to suddenly drop everything that was in progress.
- Failing to keep promises to the customer was an embarrassment for the company.
- From a technical perspective, the project introduced a considerable amount of technical debt.
So, how do we handle this kind of scenario? How do we get out of this? And finally, how do we avoid this from even happening in the first place? I don’t believe there’s any silver-bullet approach. However, I would like to discuss a few things that, in my opinion, can make a difference.
Challenge the game plan
First of all, if you end up in a situation where you have to add features to an existing legacy system, never take it lightly. Make sure you’ve done all the necessary groundwork before you start. Let your developers and architects do a pre-analysis and come up with a well-thought-out solution. If you have people with legacy experience, involve them. Those who have spent time with a legacy system have a better understanding of all the quirks and oddities in the domain model, business model and data model, and they can help you avoid stumbling over obvious legacy pitfalls.
Also, have a person or a group who can challenge the solution and take a fresh look at the system. This is important because, deep inside a complex system, we often tend to miss the easier solutions. Discussing the design with people who are not involved in it is always a good idea, but in the case of a legacy system, it is especially worth doing.
Break up the big ball of mud
In the long run, whenever a system grows big, I think it is important to chop it up into smaller subsystems. The more you delay, the more difficult it becomes. As a first step, divide the system into different domains and fence those domains off with well-designed APIs/interfaces. Then gradually force all external calls into each domain/subsystem to go through those APIs.
The key factor here is deciding on the domains or subsystems. If a natural division exists, the job is comparatively easy. If you fail to cut in the right places, you’ll probably face many nasty challenges, like underlying data models hanging between multiple domains. Of course, it depends on how much ugliness exists in your system, but the point is that if you can spot the right cut points, your job becomes much easier.
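As a minimal sketch of the fencing step – all names here are hypothetical, not from the story – one way to do it is to define an explicit interface for a domain (say, billing) and wrap the tangled legacy internals in an adapter, so that every other domain depends only on the interface:

```python
from abc import ABC, abstractmethod


class BillingAPI(ABC):
    """The only entry point other domains may use to reach billing."""

    @abstractmethod
    def invoice_total(self, customer_id: str) -> float:
        ...


class LegacyBillingAdapter(BillingAPI):
    """Wraps the existing legacy data access behind the new boundary."""

    def __init__(self, legacy_store):
        # Stand-in for the legacy data access layer, left unchanged.
        self._store = legacy_store

    def invoice_total(self, customer_id: str) -> float:
        # Delegates to legacy internals; callers never see them.
        line_items = self._store.get(customer_id, [])
        return sum(line_items)


# Callers depend on BillingAPI, never on the legacy internals:
billing: BillingAPI = LegacyBillingAdapter({"c1": [10.0, 5.5]})
print(billing.invoice_total("c1"))  # → 15.5
```

Once all external calls go through `BillingAPI`, the adapter’s internals can be rewritten or replaced without touching any caller.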
On the other hand, the problem might not always be a big monolithic system. It might be a small yet extremely difficult-to-understand system – something people are afraid to touch. In those cases, I think new code should be added in as much isolation as possible, creating a green-field zone inside the legacy. Whenever there is an opportunity, existing services should be rewritten and extracted from the legacy mess.
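The green-field idea can be sketched with a routing seam – the function names and the pricing rule below are invented for illustration. New, well-tested code lives in its own module, and a small switch decides which path handles each call, so the legacy path keeps working while extraction proceeds:

```python
def legacy_price(order: dict) -> float:
    # Stand-in for tangled legacy pricing logic we leave untouched.
    return order["qty"] * order["unit_price"]


def new_discounted_price(order: dict) -> float:
    # New green-field code: isolated, readable, easy to test.
    base = order["qty"] * order["unit_price"]
    return base * (1 - order.get("discount", 0.0))


def price(order: dict, use_new_path: bool = False) -> float:
    """Routing seam: flip the flag per feature as extraction proceeds."""
    if use_new_path:
        return new_discounted_price(order)
    return legacy_price(order)


order = {"qty": 3, "unit_price": 10.0, "discount": 0.25}
print(price(order))                     # legacy path → 30.0
print(price(order, use_new_path=True))  # green-field path → 22.5
```

When the new path has proven itself in production, the flag and the legacy function can both be deleted – one small piece of the mess extracted.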
And, of course, before you go ahead with any legacy refactoring, you also need to plan how to safeguard the existing behavior of the system. Legacy systems are generally hard to test, especially when you are dealing with a big-ball-of-mud architecture. Plan carefully so that you at least have feature-level tests confirming you’re not breaking existing behavior. A pragmatic approach is very important here: decide on the right level of testing to give yourself enough of a safety net, then move ahead.
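One pragmatic form this safety net can take is a characterization test: before refactoring, you pin down what the system does today, not what you think it should do. A small sketch, with a made-up legacy function standing in for real tangled logic:

```python
import unittest


def legacy_shipping_cost(weight_kg: float) -> float:
    # Stand-in for legacy logic we dare not change blindly.
    if weight_kg <= 0:
        return 0.0
    if weight_kg < 5:
        return 4.99
    return 4.99 + (weight_kg - 5) * 0.8


class ShippingCharacterizationTests(unittest.TestCase):
    """Records observed behavior so refactoring can't silently change it."""

    def test_nonpositive_weight_is_free(self):
        self.assertEqual(legacy_shipping_cost(0), 0.0)

    def test_flat_rate_below_five_kg(self):
        self.assertEqual(legacy_shipping_cost(4.9), 4.99)

    def test_surcharge_above_five_kg(self):
        self.assertAlmostEqual(legacy_shipping_cost(10), 8.99)


if __name__ == "__main__":
    unittest.main()
```

The expected values come from running the existing code, not from a specification – which is exactly the point: the tests freeze current behavior so you can refactor underneath them with confidence.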
Don’t forget regular checkups
And finally, to avoid this kind of situation in the first place, I think it is important to find the time to do regular health checks of your system. If it gets too complicated for someone to understand, or if the architecture becomes difficult to explain, that’s usually a sign that things are getting out of hand.
Just like your car, your system needs to go through periodic servicing. If you don’t have a product owner or stakeholder who readily understands the consequences of skipping it, make the effort to explain them. There is always something new to add, but time is rarely set aside to refactor. In my opinion, it is you – the developers and architects – who need to make the tough call. The sooner you make it, the easier it becomes!