Legacy code - strangle or tame?
For me, as probably for every programmer, legacy code is something that I do not want to deal with. It is always not well written, hard to read, and very complicated.
Unfortunately, the world is not a perfect place and we must do it on a daily basis. The first thing we want to do when we are starting to work with a new legacy project is to rewrite everything from scratch. Such an approach in most cases will not work too well, it will take much time to deliver, it will stop introducing new features for a long time and do not bring any value to customers during this period.
A better approach is to rewrite step by step, one module at a time. Let legacy live with a new codebase, arm in arm. It can be achieved by using a strangler pattern. Unfortunately, such an approach is also not always possible. Sooner or later you will face the situations when the requested feature will touch almost every part of the system or there will be no part left that can be rewritten at once in a reasonable time.
From strangling to taming
Let me tell you a story about one of our projects which we started two years ago. It is a task management software with a lot of features and integrations with external systems. This software was developed a few years before we took it. It was used by many customers back then and nowadays it is getting more popular from month to month. Our main goal at the beginning was to make it faster, more robust, and more scalable.
We started as always by collecting as much knowledge about the domain as we can, then extracted some modules as refactoring candidates and get into coding. Refactoring a few modules took us almost a year. Almost a year with no big visible changes for the users.
The big disadvantage of refactoring legacy code is that it has no visible value for customers, even if it is a huge improvement for us, developers. Our hard work at this stage will pay off in the future for sure, but we should also give some value to users during the whole process, not only at the end.
This is why it is so important to properly design new codebase architecture and use possibilities it gives to parallelly refactor and introduce new features.
New architecture
Architecture should be always adjusted to project requirements, each project has specific characteristics and requirements. In our case, because of external systems integrations, and more integrations on the horizon, events were a very important part that we wanted to introduce. Based on that we decided to go with CQRS.
CQRS - Command Query Responsibility Segregation
Probably you heard about this more than once. It is known for some time and a commonly used approach. It allows us to separate responsibilities, to separate domain, and make code clear and easily extensible.
Base principles:
- each change must be done by Command
- each Command has one Handler
- each change can dispatch Event
- each Event can have multiple Handlers
- fetching data from the database is performed by Query
- each Query has one Handler
Taming
At some point in time, we decided to reduce refactoring activities and focus more on introducing new features, to finally make something visible to the customers. And so we did. We introduced a couple of new features in already refactored parts. Unfortunately, quite soon, we hit a wall. It was no longer possible to add something without touching legacy code, and such refactoring will always require too much time and it will probably end up with refactoring the whole system. Too many parts are connected.
Use events
Is the refactoring always really needed? Maybe we can add new features without refactoring?
One of the features that we wanted to implement was the notifications module. It is obvious that notifications are connected with many modules. And it is also obvious that from a business perspective it is not acceptable to refactor all these modules just because of notifications.
Luckily it is not required, we can prepare domain events for legacy code and dispatch such events when needed. The actual job will be done by a new code in event handlers.
Example
In this example, I will show you how we can notify the user when he was assigned to a task in our system.
Let’s start with the code currently responsible for assigning users to the task.
<?php
namespace App\Legacy\Controller;
class TaskController
{
public function assignUsers(Request $request, int $taskId)
{
$users = $request->get('users');
$taskHelper = $this->get('taskhelper');
if (!$this->taskExists($taskId)) {
return new Response('NOK', 404);
}
if ($users !== null) {
$usersIds = explode(',', $users);
foreach ($usersIds as $id) {
if ($this->userIsAssignedToTask($taskId, $id)) {
continue;
}
$taskHelper->assignUser($taskId, $id);
}
}
return new Response();
}
}
<?php
namespace App\Legacy\Service;
class TaskHelper
{
private $repository;
// …
public function assignUser(int $taskId, int $userId)
{
$this->repository->assignUser($taskId, $userId);
}
}
This is just an example code, but you can get the idea of what kind of legacy we are dealing with. Of course, we can inject some service from the new codebase to TaskHelper and use it to create a notification. Such a solution will work but will be less extensible. The solution which I recommend here is to dispatch a domain event and let the handler do the job.
<?php
namespace App\Task\Domain\Event;
class UserWasAssignedToTask
{
private $taskId;
private $userId;
public function __construct(int $taskId, int $userId)
{
$this->taskId = $taskId;
$this->userId = $userId;
}
// …
}
<?php
namespace App\Notification\Infrastructure\Event\Handler;
use App\Task\Domain\Event\UserWasAssignedToTask;
class NotifyUserAboutTaskAssignment
{
public function __invoke(UserWasAssignedToTask $event)
{
// here you can do everything you want
}
}
Now we have a nice event and handler in the new codebase, let use it in legacy to tie everything up.
namespace App\Legacy\Service;
use App\Task\Domain\Event\UserWasAssignedToTask;
class TaskHelper
{
private $eventBus;
private $repository;
// ...
public function assignUser(int $taskId, int $userId)
{
$this->repository->assignUser($taskId, $userId);
$this->eventBus->dispatch(
new UserWasAssignedToTask($taskId, $userId)
);
}
}
Now when we have it all in place, each time when a user is assigned to a task, a new event will be dispatched and handled. We can easily extend the functionality of our handler, we can also add more handlers, e.g. log this in some event log, or propagate this change to some external system.
What if you want to reuse legacy code?
Another new feature we wanted to implement were automated actions. The user can set some rules which will perform actions based on some condition. For example, when a task is finished it should be assigned to the proper user. Not too complicated, looks quite easy to implement.
But what if all actions which should be performed are in legacy (in a non-reusable form)? E.g. Method in Symfony controller handles the whole logic. Should we refactor all those places? How much time will it take?
Use command
What we can do is to add commands and handlers for all actions we need. Commands will contain all data needed and the handler will wrap up legacy code. We will not refactor anything in this process, just take it as it is and put in a new shiny box called “command handler”. Of course, this will not pay off the technical debt we had in legacy but allows to introduce new features easier and faster. And what is most important, will allow us to refactor smaller parts in the future. One command handler is for sure easier to refactor than the whole legacy controller/service/module.
Thanks to that we can easily use this code in a new codebase, not only it will be reusable but also it will be in line with the new approach.
And this was what we did, code we needed was extracted to handlers, and replaced everywhere in legacy by dispatching command.
Example
I will use code from the previous example. We will refactor TaskController to a new approach. Of course, this example will be shortened to just show the concept. In our project, there is much more logic behind this feature.
Let first prepare the command
<?php
namespace App\Task\Domain\Command;
class AssignUserToTask
{
private $taskId;
private $userId;
public function __construct(int $taskId, int $userId)
{
$this->taskId = $taskId;
$this->userId = $userId;
}
// …
}
and handler
<?php
namespace App\Task\Application\CommandHandler;
use App\Task\Domain\Command\AssignUserToTask;
use App\Task\Domain\Exception\TaskNotExistException;
class AssignUserToTaskHandler
{
private $taskHelper;
// ...
public function __invoke(AssingUserToTask $command)
{
$taskId = $command->getTaskId();
$userId = $command->getUserId();
if (!$this->taskExists($taskId)) {
throw new TaskNotExistException($taskId);
}
if ($this->userIsAssignedToTask($taskId, $userId)) {
return;
}
$this->taskHelper->assignUser($taskId, $userId);
// now we can also move event dispatching from TaskHelper to
// this command handler
// (but I will leave this at it is for now)
}
// …
}
Probably you noticed that I moved some parts of TaskController to the handler also, it is always a good idea to make such small improvements boy scout rule.
Now we can adjust TaskController.
<?php
namespace App\Legacy\Controller;
use App\Task\Domain\Command\AssignUserToTask;
class TaskController
{
public function assignUsers(Request $request, int $taskId)
{
$users = $request->get('users');
$commandBus = $this->get('command_bus');
if ($users !== null) {
$usersIds = explode(',', $users);
foreach ($usersIds as $id) {
$commandBus->dispatch(new AssignUserToTask($taskId, $id));
}
}
return new Response();
}
}
Looks quite easy, right? Probably you are wondering why we did this, not much changed in the code, the handler is still the same legacy code but in a different place. TaskController looks almost the same.
Let me remind why we did this. We did this to introduce automated actions. Thanks to the command/handler approach we can easily do this.
Our example automated action should work as follow: When the task is finished then assign the proper user to this task.
Let first add TaskFinished event
<?php
namespace App\Task\Domain\Event;
class TaskFinished
{
private $taskId;
// ...
}
Then we can add a handler for our automated action.
<?php
namespace App\Automations\Infrastructure\Event\Handler;
use App\Task\Domain\Command\AssignUserToTask;
use App\Task\Domain\Event\TaskFinished;
class AssignUserOnTaskFinished
{
private $commandBus;
// ...
public function __invoke(TaskFinished $event)
{
$taskId = $event->getTaskId();
if (!$this->isActionEnabled($taskId)) {
return;
}
$userId = $this->getUserIdForAction($taskId);
$this->commandBus->dispatch(
new AssignUserToTask($taskId, $userId)
);
}
// …
}
This is of course some kind of compromise, we will not get rid of legacy, it will still be there, but tamed. Now you can easily use legacy code even without knowing that in fact legacy code will be executed. Everything will look nice and as good as your new codebase.
You can later (when pressure for new features will be lower) take such AssignUserToTaskHanlder and refactor it properly to get rid of the legacy TaskHelper service.
Working with legacy?
And what is your approach to legacy? Strangling, taming, rewriting? To learn more on that topic, make sure to check our CTO's blogpost on possible scenarios of handling technical debt in legacy applications. Or maybe you are already tired of the constant struggle with the legacy? Just droup us a message, always glad to help.