How to delete… not to delete yourself?
I want to present you a real story that happened in a project we took over. On this day, Sentry revealed a strange error. The short analysis showed that the error occurred when a user was removed. But wait, there is no such feature as removing a user! What happened?!? I analyzed the logs and found out that someone wanted to delete his/her profile picture and instead... removed himself!
The different project we took over, another example. Someone published a post (let's say a blog post), and it was commented on. But later, the comment was deleted, and… the whole post disappeared!
about onDelete, cascade remove and orphanremoval
There were a few more projects I participated in with the same story to happen. The first one, already mentioned, occurred on prod., luckily the second case was spotted earlier. All these obstacles were caused by the same problem - cascade deletion. It may look like a junior mistake, but many people I talked to, have the same problem. So don't blame junior developers, let's dive deeper and wonder what we are dealing with.
In Doctrine, we can control how objects behave when they are removed in several ways.
For example, what we have here is the User and the Address entity with one-to-one relation Without any “cascade” removing defined.
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user")
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address")
*/
private $user;
}
In this case, the address table will have a foreign key referencing to the user table.
foreign key (user_id) references user (id)
So what happens if we want to delete a user who has an address? Like here:
$entityManager->remove($user);
$entityManager->flush();
It's simple. The user connects to the address with a foreign key, so we can not simply delete it.
An exception occurred while executing 'DELETE FROM user WHERE id = ?' with params [1]:
SQLSTATE[23000]: Integrity constraint violation: 1451 Cannot delete or update a parent row:
a foreign key constraint fails
(`address`, CONSTRAINT `FK_6ED395` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`))
onDelete
If we have a one-to-one, many-to-one or many-to-many relationship, we can use @JoinColumn (as annotation or other way) to specify how the column will look and how it will behave. We can use onDelete option there. It will force the database to delete associated objects, it will work on the database level, not in the ORM 1.
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user")
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address")
* @ORM\JoinColumn(name="user_id", onDelete="CASCADE")
*/
private $user;
}
That onDelete="CASCADE" in Address $user will generate update (I’m using mysql):
ALTER TABLE address
ADD CONSTRAINT FK_1234 FOREIGN KEY (user_id)
REFERENCES user (id)
ON DELETE CASCADE;
So the address table will have the foreign key with referential action ON DELETE CASCADE. Now, if we delete the user, the related address will also be automatically deleted (at the database level).
Easy.
In the example mentioned, what matters is that the address table stores the user's id. But what if we want to reverse it so that the user table will store the id of its address?
During this step, we can make a mistake, a costly mistake.
class User
{
/**
* @ORM\OneToOne(targetEntity="Address")
* @ORM\JoinColumn(onDelete="CASCADE")
*/
private $address;
}
class Address
{
}
What has changed? The relationship is now unidirectional, user stores id of his/her address and address knows nothing about the user. Additionally, now the User $address has @JoinColumn annotation with referential action onDelete="CASCADE".
It's easy to get confused and think that the user has a relation to address, and if we remove the user, an address will be also removed. But it's not the case. onDelete="CASCADE" will cause a foreign key with referential action ON DELETE CASCADE in the user table. So deleting addresses will delete users!
cascade={"remove"}
So what are these cascades? Doctrine provides the ability of the entity to cascade operations to the associated entities. What kind of cascade operations are available? Persist, remove, merge, detach, refresh and all. I’ve noticed that most of the people, without second thoughts, add persist (often with remove) or all. Let's focus on remove.
It’s a Doctrine mechanism, so it will work on the ORM level, the database will not know anything about it. So if we delete something from the database "manually", outside the ORM context, then nothing will be automatically removed, not like in onDelete="CASCADE" example.
Below we have an example of cascade removing.
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user", cascade={"remove"})
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address")
*/
private $user;
}
If we have a relation defined in this way and we delete a user in some part of the system, like here:
$entityManager->remove($user);
$entityManager->flush();
ORM will generate two deletion queries in one transaction. It will remove the address and the user. Well, that's what we expected.
But what if someone makes a mistake? And how can it happen? Look at the example below:
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user")
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address", cascade={"remove"})
*/
private $user;
}
The relationship is the same, the only thing that has changed is the place where the cascade={"remove"} is.
Now if we remove the address like here:
$entityManager->remove($address);
$entityManager->flush();
The user will be also removed! Conclusions? Add cascade={"remove"} in the entity whose removal should delete other entities associated with it. So, do not add it in the entity to be deleted at the moment when another related object is deleted.
I've seen many times cascade={"remove"} placed not in the entity in which it should be. What is the reason for it? Maybe someone thought that one should add cascade={"remove"} in the entity to remove? (please don’t do that). We often write code with no second thought. Using our experience, we code more "automatically", and overconfidence can also be a reason for mistakes. If we don't have a lot of experience or we don't have a bad experience with cascade removal, it's easy to make such a bug.
Remember that cascade={"remove"} works on the ORM level, and onDelete="CASCADE" works on the database level. What are the consequences? ORM operates earlier so if you add both of them, onDelete="CASCADE” will be omitted, because ORM will prepare queries that will bypass this option when deleting.
orphanRemoval=true
It’s another way to cascade remove objects. In relations one-to-one, one-to-many and many-to-many we can use orphanRemoval option. It works on ORM level.
Look at the example of its use:
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user", orphanRemoval=true)
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address")
* @ORM\JoinColumn(name="user_id", referencedColumnName="id")
*/
private $user;
}
And now with the user removed:
$entityManager->remove(user);
$entityManager->flush();
His address is deleted as well. But it was expected to happen. By deleting the address, we'll eliminate it and, of course, not remove the user, that's also what we want.
But what else can be messed up here? Take a look at the example below:
class User
{
/**
* @ORM\OneToOne(targetEntity="Address", mappedBy="user")
*/
private $address;
}
class Address
{
/**
* @ORM\OneToOne(targetEntity="User", inversedBy="address", orphanRemoval=true)
* @ORM\JoinColumn(name="user_id", referencedColumnName="id")
*/
private $user;
}
If by mistake someone adds orphanRemoval into the address instead of the user, the user is deleted as well, when the address is removed.
So what is the difference between orphanRemoval and cascade={"remove"}?
If we used cascade={"remove"} and set null on the user’s address. Like here:
$user>setAddress(null);
$entityManager->flush();
Nothing happens, because we modified the user, who is the inverse side of the relation.
If the user is the owning side, the above action will write null in his row in the database and the address would become an "orphan", not used by anyone.
This is the difference. If we used the orphanRemoval option, the above action would remove the address from the database, because it would no longer be used by anyone (orphan).
Option orphanRemoval should only be used only in objects that own the relationship2. What does it mean to be a relationship owner? It means that the other side of the relationship can't exist without the owner. For example, the user has an address, but the address without the user is useless.
Where to use it?
If you want to delete the user along with his related address:
onDelete="CASCADE” | cascade={"remove"} | orphanRemoval=true |
---|---|---|
class Address { … onDelete="CASCADE"… private $user; } |
class User { …cascade={"remove"}… private $address; } |
class User { …orphanRemoval=true… private $address; } |
Entities that need to be deleted together with the other side of relationship | In the entity whose removal should cascade delete related |
So how to delete and not delete yourself in the process?
The most important thing is to use all these options consciously and know how they work. If you are not sure whether to use cascade remove, onDelete or orphanRemoval, then don't use any of them! We need to be sure of our code and know the consequences of each option. As you could see, doctrine gives great opportunities to "automate" removal process, but automation that is used incorrectly can be very dangerous.
I hope I showed you how to solve this problem, and you'll never have to deal with a situation where something suddenly "disappears" from your database.