Unindexed foreign keys again: and yes, it does matter!

Some months ago, I was investigating the classical “unindexed foreign key locking scenario”: what happens if, in a parent table/child table relationship, the parent table gets updated or deleted, including its primary key (see https://recurrentnull.wordpress.com/2013/12/17/unindexed-foreign-keys-considered-harmful-but-are-they/).
Basically, it turned out that as long as you’re not using on delete cascade, few spectacular things do happen …

However, in the real world, what can happen – and might even be a probable and common scenario – is the following:
The application has to deal with complex objects, combined by some form of composition, that on the database side turn up as parent/child relationships. Now if the application does not want to rely on the database to take care of dependencies (using on delete cascade), it has to clean up the dependent child entries before deleting the parent – a nice and clean approach…
However, if this is done concurrently, the following scenario may emerge:

Session wants to delete complex object 1:

SQL> delete from child where parent_id=1; -- session 1, object 1

1 row deleted.

SQL> select sid, type, object_name, id1, id2, lmode, request, block
    from gv$lock l left join dba_objects o
    on (l.id1=o.object_id)
    where type='TM' or type='TX'
    order by sid, type;

SID TY OBJECT_NAME          ID1        ID2      LMODE    REQUEST      BLOCK
-- -- -------------- ---------- ---------- ---------- ---------- ----------
280 TM PARENT            446690          0          3          0          2
280 TM CHILD             446692          0          3          0          2
280 TX                  1179664    1314092          6          0          2       

At the same time, another session needs to delete complex object 2:

SQL> delete from child where parent_id=1; -- session 2, object 2

1 row deleted.

SQL> select sid, type, object_name, id1, id2, lmode, request, block
    from gv$lock l left join dba_objects o
    on (l.id1=o.object_id)
    where type='TM' or type='TX'
    order by sid, type;

SID TY OBJECT_NAME          ID1        ID2      LMODE    REQUEST      BLOCK
-- -- -------------------- ---- ---------- ---------- ---------- ----------
278 TM CHILD              446692          0        3          0          2
278 TM PARENT             446690          0        3          0          2
278 TX                    1114138   1429973        6          0          2
280 TM PARENT             446690          0        3          0          2
280 TM CHILD              446692          0        3          0          2
280 TX                    1179664   1314092        6          0          2

So far, so good… Both sessions have the parent as well as child tables locked in row exclusive mode, no conflicts…
Now session 1 wants to continue, updating the corresponding parent row:

SQL> delete from parent where id=1;

And there we have it, session 1 hangs – it needs to upgrade its lock on the child table to mode 5 (share row exclusive), which is incompatible with the table being locked in row exclusive mode by session 2:

SQL> select sid, type, object_name, id1, id2, lmode, request, block
    from gv$lock l left join dba_objects o
    on (l.id1=o.object_id)
    where type='TM' or type='TX'
    order by sid, type;

SID TY OBJECT_NAME                 ID1        ID2      LMODE    REQUEST      BLOCK
--- -- -------------------- ---------- ---------- ---------- ---------- ----------
278 TM CHILD                    446692          0          3          0          1
278 TM PARENT                   446690          0          3          0          2
278 TX                         1114138    1429973          6          0          2
280 TM PARENT                   446690          0          3          0          2
280 TM CHILD                    446692          0          3          5          1
280 TX                         1179664    1314092          6          0          2

Looking at this scenario, it might be a bit of an unlucky timing, but certainly this course of events is not improbable at all! …Such that in the real world, it might make a lot of sense still to check for unindexed foreign keys …

Advertisements

Unindexed foreign keys considered harmful – but are they?

For quite some time, I’ve had it at the back of my brain to check our database application for unindexed foreign keys. After all, don’t they say unindexed foreign keys are evil, taking full table locks on the child tables and all. Now recently there was a locking situation again, at first glance the table names seemed to match the suspicion.. and it seemed time to take an action, implementing the missing foreign keys.
First, however, I’ve wanted to quickly build a test case, to demonstrate the issue, and that’d be it.
Well, it turned out differently. The results were so little in the line of what I expected that I ended up doing a systematic test.

The following results were obtained on 11.2.0.3 as well as 12.1. Every single sub-test consists of deleting one of the parent rows and trying to perform one other action (insert, update, delete) on either the child or the parent table. In order not to miss any further surprises, I’ve been pedantic and have also performed those kinds of sub-tests where the outcome seemed totally clear from the beginning. (You can just skip over these but I’m listing them for completeness).

Importantly, every sub-test was done for two cases – a “normal” foreign key constraint (on delete=”no action” as it says in dba_constraints) and “on delete cascade” – as these turned out to behave vey differently. Then, for the interesting cases, the effects of introducing an index on the foreign key will be reported. The sql for creating and populating the test tables will be found at the end of the article.

Delete from parent, and perform DML on the child

Here are the results for an initial delete from the parent. In the “on delete cascade” case, we’re deleting a row that has references in the child table, in the “no action” case of course we can’t and we’re choosing another row. (What is included in brackets is always what was performed in the “no action” case.)
The object_ids of the relative tables are:

  • parent_ondc (parent table referenced by “on delete cascade” constraint): 102000
  • parent_ondna (parent table referenced by “on delete no action” constraint): 102002
  • child: 102004
constraint: on delete cascade constraint: on delete n.a.
session 1: delete from parent where id=1 [3] <initial action>
RX on child
<initial action>
NO lock on child
select sid, type, id1, id2, lmode, request, block from v$lock …
27 TM 102002 0 3 0
27 TM 102004 0 3 0
27 TM 102000 0 3 0
27 TX 655370 2239 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: update child set name=’noname’ where id=1 [3] HANGS (for TX; no locks on parent) not possible (does not exist)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102004 0 3 0
43 TX 655370 2239 0 6
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: update child set name=’noname’ where id=2 WORKS (no locks on parent) WORKS (RX on parent)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102004 0 3 0
43 TX 262148 1958 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102002 0 3 0
43 TM 102004 0 3 0
27 TM 102000 0 3 0
27 TX 655370 2239 6 0
session 2: insert into child values (5, 1, 1, ‘3’) [ (5, 3, 3, ‘5’) ] HANGS (for TX; takes RX on parent) HANGS (for TX; takes RX on parent)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 655370 2239 0 4
43 TX 131098 2766 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 327692 2696 0 4
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: insert into child values (5, 2, 1, ‘3’) [(5, 3, 4, ‘5’)] WORKS (takes RX on parent) WORKS (takes RX on parent)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 589826 2473 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: delete from child where id=1 [3] HANGS (for TX; takes RS on parent) not possible (doesn’t exist)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 2 0
43 TM 102000 0 2 0
43 TM 102004 0 3 0
43 TX 655370 2239 0 6
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: delete from child where id=2 WORKS (takes RS on parent) WORKS (takes RS on parent)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 2 0
43 TM 102000 0 2 0
43 TM 102004 0 3 0
43 TX 262168 1968 0 6
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102002 0 2 0
43 TM 102000 0 2 0
43 TM 102004 0 3 0
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0

So, with the pending “delete from parent”, DML on the child works just fine. If it hangs, it is blocked directly by the open transaction from session 1 – and this happens exactly in those cases where you’d expect it to happen – when trying to update a child row referring to the key just being deleted (scenario possible with on delete cascade only), when trying to insert a child row with exactly that key (both cases), and when trying to delete a row with that key in the child (again, on delete cascade only).
Now, on to DML on the parent table.

Delete from parent, and perform DML on the parent

constraint: on delete cascade constraint: on delete n.a.
session 1: delete from parent where id=1 [3] <initial action>
RX on child
<initial action>
NO lock on child
session 2: delete from parent where id=2 HANGS (requesting SRX on child) not possible
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 0 5
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: delete from parent where id=3 HANGS (requesting SRX on child) WORKS
select sid, type, id1, id2, lmode, request, block from v$lock … same as above
43 TM 102002 0 3 0
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: update parent set id=2 where id=1 HANGS (requesting S on child) not possible
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102000 0 2 0
43 TM 102004 0 0 4
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: update parent set id=2 where id=2 HANGS (requesting S on child) WORKS
select sid, type, id1, id2, lmode, request, block from v$lock … same as above
43 TM 102002 0 3 0
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: insert into parent values(1, ‘1’) [(3,’3′)] HANGS (for TX) HANGS (for TX)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102000 0 3 0
43 TM 102004 0 2 0
43 TX 262168 1968 6 0
43 TX 655370 2239 0 4
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102004 0 2 0
43 TM 102002 0 3 0
43 TX 327692 2696 0 4
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0
session 2: insert into parent values(5, ‘5’) WORKS (taking RS on child) WORKS (taking RS on child)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102000 0 3 0
43 TM 102004 0 2 0
43 TX 262168 1968 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102004 0 2 0
43 TM 102002 0 3 0
43 TX 458757 2032 6 0
27 TM 102002 0 3 0
27 TX 327692 2696 6 0

Now this is where there came the big surprise for me! In the on delete cascade case, I cannot delete another row from the parent table, be it referenced by a child row or not! The second session is requesting an SRX lock on the child table, which it cannot get as the table is locked in RX mode by session 1!
Secondly, updating a different key in the parent table does not work either (even a “fake update” like “update parent set id=2 where id=2”). This time, session 2 requests an S lock, which again it cannot get for being blocked by session 1.
At least, inserts on the parent work as expected. And no locking issues with “normal” foreign keys.

Now, will something change, in the on delete cascade case, when we add an index on the foreign key? Let’s try.

Comparing locking behaviour with and without index

on delete cascade: no index on delete cascade: with index
session 1: delete from parent where id=1 [3] <initial action> <initial action>
session 2: delete from parent where id=2 HANGS (requesting SRX on child) WORKS (taking another RX on child)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 0 5
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102002 0 3 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 327696 2730 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: update parent set id=2 where id=1 HANGS (requesting S on child) HANGS (this time, for TX)
select sid, type, id1, id2, lmode, request, block from v$lock …
43 TM 102000 0 2 0
43 TM 102004 0 0 4
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 655370 2239 0 6
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0
session 2: update parent set id=2 where id=2 HANGS (requesting S on child) WORKS (taking another RX on child)
select sid, type, id1, id2, lmode, request, block from v$lock … same as above
43 TM 102000 0 3 0
43 TM 102004 0 3 0
43 TX 327696 2730 6 0
27 TM 102002 0 3 0
27 TM 102000 0 3 0
27 TM 102004 0 3 0
27 TX 655370 2239 6 0

So really, adding the index fixed the situation. Everything works that should work, and in the case where we were trying to update the parent key that was being deleted from the other session, we are not blocked for requesting a share lock on the child table, but simply and regularly by the TX lock from session 1.

And the conclusion is…?

… now well. These results have been obtained on 11.2.0.3 and 12.1, in a very simple and straightforward setup… On the other hand, in the 12c Concepts Guide it still says

“When both of the following conditions are true, the database acquires a full table lock on the child table:
– No index exists on the foreign key column of the child table.
– A session modifies a primary key in the parent table (for example, deletes a row or modifies primary key attributes) or merges rows into the parent table. Inserts into the parent table do not acquire table locks on the child table.”

Frankly, I have no idea. Actually, this text does not say exactly which actions would get blocked by the “full table lock”, but the following example mentions updates on the child table (“For example, sessions cannot update employee phone numbers.”)
So that definitely contradicts what we saw. And by the way – what is a full table lock? Just some lines above what I quoted,the Guide explains the lock modes, but it never says what a “full table lock” maps to. Mode 3 (RX), anyway, is what was seen here.

There is one other thing, actually. As you will see from the sql below, the way I performed this test was creating 2 parent tables, one linked to by the on delete cascade constraint, the other, by the normal constraint. Now as you can see from the output above, when I delete a row from the “on delete cascade” parent, not only does the child table get locked in RX mode, the other parent table does, too! This must be some generalization / extension of what Jonathan Lewis, in Lock modes, says about RX locks

“Since 11.1 used at opposite end of RI during DML”

… but still you might ask – why should this happen?
Which might be an interesting question for another experiment 🙂

The test setup

sql> drop table child;
sql> drop table parent_ondc;
sql> drop table parent_ondna;
sql> create table parent_ondc (id number primary key, name varchar2(30));
sql> create table parent_ondna (id number primary key, name varchar2(30));
sql> create table child (id number primary key, parent_id_ondc number references parent_ondc (id) on delete cascade, parent_id_ondna number references parent_ondna (id), name varchar2(30));
sql> insert into parent_ondc values (1, '1');
sql> insert into parent_ondna values (1, '1');
sql> insert into parent_ondc values (2, '2');
sql> insert into parent_ondna values (2, '2');
sql> insert into parent_ondc values (3, '3');
sql> insert into parent_ondna values (3, '3');
sql> insert into parent_ondc values (4, '4');
sql> insert into parent_ondna values (4, '4');
sql> insert into child  values (1, 1, 1, '1');
sql> insert into child  values (2, 2, 2,'2');
sql> commit;
sql> --create index parent_ondnc_fk on child(parent_id_ondnc);