Commit 793dd8e

Add discussion and example about predicate locking and why "serializable"
mode isn't really serializable. I had thought this was covered already in our docs, but I sure can't find it.
1 parent 11d8138 commit 793dd8e

1 file changed: +100 -14 lines changed

doc/src/sgml/mvcc.sgml

@@ -1,5 +1,5 @@
 <!--
-$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.43 2003/12/13 23:59:06 neilc Exp $
+$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.44 2004/08/14 22:18:23 tgl Exp $
 -->

 <chapter id="mvcc">
@@ -394,6 +394,90 @@ ERROR: could not serialize access due to concurrent update
 a transaction executes several successive commands that must see
 identical views of the database.
 </para>
+
+<sect3 id="mvcc-serializability">
+<title>Serializable Isolation versus True Serializability</title>
+
+<indexterm>
+<primary>serializability</primary>
+</indexterm>
+
+<indexterm>
+<primary>predicate locking</primary>
+</indexterm>
+
+<para>
+The intuitive meaning (and mathematical definition) of
+<quote>serializable</> execution is that any two successfully committed
+concurrent transactions will appear to have executed strictly serially,
+one after the other --- although which one appeared to occur first may
+not be predictable in advance. It is important to realize that forbidding
+the undesirable behaviors listed in <xref linkend="mvcc-isolevel-table">
+is not sufficient to guarantee true serializability, and in fact
+<productname>PostgreSQL</productname>'s Serializable mode <emphasis>does
+not guarantee serializable execution in this sense</>. As an example,
+consider a table <structname>mytab</>, initially containing
+<screen>
+ class | value
+-------+-------
+     1 |    10
+     1 |    20
+     2 |   100
+     2 |   200
+</screen>
+Suppose that serializable transaction A computes
+<screen>
+SELECT SUM(value) FROM mytab WHERE class = 1;
+</screen>
+and then inserts the result (30) as the <structfield>value</> in a
+new row with <structfield>class</> = 2. Concurrently, serializable
+transaction B computes
+<screen>
+SELECT SUM(value) FROM mytab WHERE class = 2;
+</screen>
+and obtains the result 300, which it inserts in a new row with
+<structfield>class</> = 1. Then both transactions commit. None of
+the listed undesirable behaviors have occurred, yet we have a result
+that could not have occurred in either order serially. If A had
+executed before B, B would have computed the sum 330, not 300, and
+similarly the other order would have resulted in a different sum
+computed by A.
+</para>
+
+<para>
+To guarantee true mathematical serializability, it is necessary for
+a database system to enforce <firstterm>predicate locking</>, which
+means that a transaction cannot insert or modify a row that would
+have matched the <literal>WHERE</> condition of a query in another concurrent
+transaction. For example, once transaction A has executed the query
+<literal>SELECT ... WHERE class = 1</>, a predicate-locking system
+would forbid transaction B from inserting any new row with class 1
+until A has committed.
+<footnote>
+<para>
+Essentially, a predicate-locking system prevents phantom reads
+by restricting what is written, whereas MVCC prevents them by
+restricting what is read.
+</para>
+</footnote>
+Such a locking system is complex to
+implement and extremely expensive in execution, since every session must
+be aware of the details of every query executed by every concurrent
+transaction. And this large expense is mostly wasted, since in
+practice most applications do not do the sorts of things that could
+result in problems. (Certainly the example above is rather contrived
+and unlikely to represent real software.) Accordingly,
+<productname>PostgreSQL</productname> does not implement predicate
+locking, and so far as we are aware no other production DBMS does either.
+</para>
+
+<para>
+In those cases where the possibility of nonserializable execution
+is a real hazard, problems can be prevented by appropriate use of
+explicit locking. Further discussion appears in the following
+sections.
+</para>
+</sect3>
 </sect2>
 </sect1>

@@ -434,7 +518,8 @@ ERROR: could not serialize access due to concurrent update
 <para>
 The list below shows the available lock modes and the contexts in
 which they are used automatically by
-<productname>PostgreSQL</productname>.
+<productname>PostgreSQL</productname>. You can also acquire any
+of these locks explicitly with the command <xref linkend="sql-lock">.
 Remember that all of these lock modes are table-level locks,
 even if the name contains the word
 <quote>row</quote>; the names of the lock modes are historical.
@@ -736,8 +821,8 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
 <para>
 The best defense against deadlocks is generally to avoid them by
 being certain that all applications using a database acquire
-locks on multiple objects in a consistent order. That was the
-reason for the previous deadlock example: if both transactions
+locks on multiple objects in a consistent order. In the example
+above, if both transactions
 had updated the rows in the same order, no deadlock would have
 occurred. One should also ensure that the first lock acquired on
 an object in a transaction is the highest mode that will be
@@ -778,7 +863,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
 Another way to think about it is that each
 transaction sees a snapshot of the database contents, and concurrently
 executing transactions may very well see different snapshots. So the
-whole concept of <quote>now</quote> is somewhat suspect anyway.
+whole concept of <quote>now</quote> is somewhat ill-defined anyway.
 This is not normally
 a big problem if the client applications are isolated from each other,
 but if the clients can communicate via channels outside the database
@@ -801,8 +886,8 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
 </para>

 <para>
-Global validity checks require extra thought under <acronym>MVCC</acronym>. For
-example, a banking application might wish to check that the sum of
+Global validity checks require extra thought under <acronym>MVCC</acronym>.
+For example, a banking application might wish to check that the sum of
 all credits in one table equals the sum of debits in another table,
 when both tables are being actively updated. Comparing the results of two
 successive <literal>SELECT sum(...)</literal> commands will not work reliably under
@@ -824,17 +909,17 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;

 <para>
 Note also that if one is
-relying on explicit locks to prevent concurrent changes, one should use
+relying on explicit locking to prevent concurrent changes, one should use
 Read Committed mode, or in Serializable mode be careful to obtain the
-lock(s) before performing queries. An explicit lock obtained in a
+lock(s) before performing queries. A lock obtained by a
 serializable transaction guarantees that no other transactions modifying
 the table are still running, but if the snapshot seen by the
 transaction predates obtaining the lock, it may predate some now-committed
 changes in the table. A serializable transaction's snapshot is actually
 frozen at the start of its first query or data-modification command
 (<literal>SELECT</>, <literal>INSERT</>,
 <literal>UPDATE</>, or <literal>DELETE</>), so
-it's possible to obtain explicit locks before the snapshot is
+it's possible to obtain locks explicitly before the snapshot is
 frozen.
 </para>
 </sect1>
@@ -888,10 +973,11 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
 </term>
 <listitem>
 <para>
-Share/exclusive page-level locks are used for read/write
-access. Locks are released after the page is processed.
-Page-level locks provide better concurrency than index-level
-ones but are liable to deadlocks.
+Share/exclusive hash-bucket-level locks are used for read/write
+access. Locks are released after the whole bucket is processed.
+Bucket-level locks provide better concurrency than index-level
+ones, but deadlock is possible since the locks are held longer
+than one index operation.
 </para>
 </listitem>
 </varlistentry>
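
The anomaly described in the new mvcc-serializability section of this diff can be reproduced outside the database with a toy model. The Python sketch below is not PostgreSQL code and does not talk to a server. It assumes a snapshot is simply a copy of the table taken before either transaction writes, and the helper names (`sum_class`, `txn_a`, `txn_b`, `run_concurrent`, `run_serial`) are hypothetical, introduced only for illustration. It compares the concurrent outcome with both possible serial orders.

```python
# Toy model of the mytab example; rows are (class, value) tuples.
# Assumption: a "snapshot" is just a copy of the table taken before
# either transaction has written anything.

INITIAL = [(1, 10), (1, 20), (2, 100), (2, 200)]


def sum_class(rows, cls):
    """Model of: SELECT SUM(value) FROM mytab WHERE class = cls."""
    return sum(value for c, value in rows if c == cls)


def txn_a(visible_rows):
    """Transaction A: sum class 1, return the new class-2 row to insert."""
    return (2, sum_class(visible_rows, 1))


def txn_b(visible_rows):
    """Transaction B: sum class 2, return the new class-1 row to insert."""
    return (1, sum_class(visible_rows, 2))


def run_concurrent(initial):
    """Both transactions read the same snapshot, then both commit."""
    snapshot = list(initial)
    new_a = txn_a(snapshot)
    new_b = txn_b(snapshot)
    return sorted(initial + [new_a, new_b])


def run_serial(initial, order):
    """Each transaction sees everything committed by the ones before it."""
    table = list(initial)
    for txn in order:
        table.append(txn(table))
    return sorted(table)


concurrent = run_concurrent(INITIAL)
a_then_b = run_serial(INITIAL, [txn_a, txn_b])
b_then_a = run_serial(INITIAL, [txn_b, txn_a])

print("concurrent:", concurrent)
print("A then B:  ", a_then_b)
print("B then A:  ", b_then_a)
print("matches a serial order:", concurrent in (a_then_b, b_then_a))
```

Under these assumptions the concurrent run stores the sums 30 and 300, while either serial order stores a sum of 330 in one of the new rows, so the concurrent result matches neither. Explicit locking, as the added documentation text suggests, is one way to force one of the serial outcomes.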
