Commit a37ab1d

committed
Improve MVCC discussion.
1 parent 6fec216 commit a37ab1d

1 file changed

+146
-97
lines changed

doc/src/sgml/mvcc.sgml

Lines changed: 146 additions & 97 deletions
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.8 2000/09/29 20:21:34 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.9 2000/10/11 17:38:36 tgl Exp $
 -->

 <chapter id="mvcc">
@@ -70,7 +70,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.8 2000/09/29 20:21:34 petere
 <listitem>
 <para>
 A transaction re-reads data it has previously read and finds that data
-has been modified by another committed transaction.
+has been modified by another transaction (that committed since the
+initial read).
 </para>
 </listitem>
 </varlistentry>
@@ -82,8 +83,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.8 2000/09/29 20:21:34 petere
 <listitem>
 <para>
 A transaction re-executes a query returning a set of rows that satisfy a
-search condition and finds that additional rows satisfying the condition
-has been inserted by another committed transaction.
+search condition and finds that the set of rows satisfying the condition
+has changed due to another recently-committed transaction.
 </para>
 </listitem>
 </varlistentry>
@@ -175,7 +176,9 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.8 2000/09/29 20:21:34 petere
 </tbody>
 </tgroup>
 </table>
+</para>

+<para>
 <productname>Postgres</productname>
 offers the read committed and serializable isolation levels.
 </para>
@@ -187,55 +190,70 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.8 2000/09/29 20:21:34 petere
 <para>
 <firstterm>Read Committed</firstterm>
 is the default isolation level in <productname>Postgres</productname>.
-When a transaction runs on this isolation level, a query sees only
-data committed before the query began and never sees either dirty data or
-concurrent transaction changes committed during query execution.
+When a transaction runs on this isolation level,
+a <command>SELECT</command> query sees only data committed before the
+transaction began and never sees either dirty data or concurrent
+transaction changes committed during transaction execution. (However, the
+<command>SELECT</command> does see the effects of previous updates
+executed within this same transaction.)
 </para>

 <para>
-If a row returned by a query while executing an
+If a target row found by a query while executing an
 <command>UPDATE</command> statement
-(or <command>DELETE</command>
-or <command>SELECT FOR UPDATE</command>)
-is being updated by a
+(or <command>DELETE</command> or <command>SELECT FOR UPDATE</command>)
+has already been updated by a
 concurrent uncommitted transaction then the second transaction
 that tries to update this row will wait for the other transaction to
 commit or rollback. In the case of rollback, the waiting transaction
 can proceed to change the row. In the case of commit (and if the
 row still exists; i.e. was not deleted by the other transaction), the
-query will be re-executed for this row to check that new row
-version satisfies query search condition. If the new row version
-satisfies the query search condition then row will be
-updated (or deleted or marked for update).
+query will be re-executed for this row to check that the new row
+version still satisfies the query search condition. If the new row version
+satisfies the query search condition then the row will be
+updated (or deleted or marked for update). Note that the starting point
+for the update will be the new row version; moreover, after the update
+the doubly-updated row is visible to subsequent <command>SELECT</command>s
+in the current transaction. Thus, the current transaction is able to see
+the effects of the other transaction for this specific row.
 </para>

 <para>
-Note that the results of execution of <command>SELECT</command>
-or <command>INSERT</command> (with a query)
-statements will not be affected by concurrent transactions.
+The partial transaction isolation provided by Read Committed level is
+adequate for many applications, and this level is fast and simple to use.
+However, for applications that do complex queries and updates, it may
+be necessary to guarantee a more rigorously consistent view of the
+database than Read Committed level provides.
 </para>
 </sect1>

 <sect1 id="xact-serializable">
 <title>Serializable Isolation Level</title>

 <para>
-<firstterm>Serializable</firstterm> provides the highest transaction isolation.
+<firstterm>Serializable</firstterm> provides the highest transaction
+isolation. This level emulates serial transaction execution,
+as if transactions had been executed one after another, serially,
+rather than concurrently. However, applications using this level must
+be prepared to retry transactions due to serialization failures.
+</para>
+
+<para>
 When a transaction is on the serializable level,
-a query sees only data
-committed before the transaction began and never see either dirty data
-or concurrent transaction changes committed during transaction
-execution. So, this level emulates serial transaction execution,
-as if transactions would be executed one after another, serially,
-rather than concurrently.
+a <command>SELECT</command> query sees only data committed before the
+transaction began and never sees either dirty data or concurrent
+transaction changes committed during transaction execution. (However, the
+<command>SELECT</command> does see the effects of previous updates
+executed within this same transaction.) This is the same behavior as
+for Read Committed level.
 </para>

 <para>
-If a row returned by query while executing a
-<command>UPDATE</command>
+If a target row found by a query while executing an
+<command>UPDATE</command> statement
 (or <command>DELETE</command> or <command>SELECT FOR UPDATE</command>)
-statement is being updated by
-a concurrent uncommitted transaction then the second transaction
+has already been updated by a
+concurrent uncommitted transaction then the second transaction
 that tries to update this row will wait for the other transaction to
 commit or rollback. In the case of rollback, the waiting transaction
 can proceed to change the row. In the case of a concurrent
@@ -250,13 +268,75 @@ ERROR: Can't serialize access due to concurrent update
 other transactions after the serializable transaction began.
 </para>

-<note>
-<para>
-Note that results of execution of <command>SELECT</command>
-or <command>INSERT</command> (with a query)
-will not be affected by concurrent transactions.
-</para>
-</note>
+<para>
+When the application receives this error message, it should abort
+the current transaction and then retry the whole transaction from
+the beginning. The second time through, the transaction sees the
+previously-committed change as part of its initial view of the database,
+so there is no logical conflict in using the new version of the row
+as the starting point for the new transaction's update.
+Note that only updating transactions may need to be retried --- read-only
+transactions never have serialization conflicts.
+</para>
+
+<para>
+Serializable transaction level provides a rigorous guarantee that each
+transaction sees a wholly consistent view of the database. However,
+the application has to be prepared to retry transactions when concurrent
+updates make it impossible to sustain the illusion of serial execution,
+and the cost of redoing complex transactions may be significant. So
+this level is recommended only when update queries contain logic
+sufficiently complex that it may give wrong answers in Read Committed
+level.
+</para>
+</sect1>
+
+<sect1 id="applevel-consistency">
+<title>Data consistency checks at the application level</title>
+
+<para>
+Because readers in <productname>Postgres</productname>
+don't lock data, regardless of
+transaction isolation level, data read by one transaction can be
+overwritten by another concurrent transaction. In other words,
+if a row is returned by <command>SELECT</command> it doesn't mean that
+the row still exists at the time it is returned (i.e. sometime after the
+current transaction began); the row might have been modified or deleted
+by an already-committed transaction that committed after this one started.
+Even if the row is still valid "now", it could be changed or deleted
+before the current transaction does a commit or rollback.
+</para>
+
+<para>
+Another way to think about it is that each
+transaction sees a snapshot of the database contents, and concurrently
+executing transactions may very well see different snapshots. So the
+whole concept of "now" is somewhat suspect anyway. This is not normally
+a big problem if the client applications are isolated from each other,
+but if the clients can communicate via channels outside the database
+then serious confusion may ensue.
+</para>
+
+<para>
+To ensure the current existence of a row and protect it against
+concurrent updates one must use <command>SELECT FOR UPDATE</command> or
+an appropriate <command>LOCK TABLE</command> statement.
+(<command>SELECT FOR UPDATE</command> locks just the returned rows against
+concurrent updates, while <command>LOCK TABLE</command> protects the
+whole table.)
+This should be taken into account when porting applications to
+<productname>Postgres</productname> from other environments.
+
+<note>
+<para>
+Before version 6.5 <productname>Postgres</productname>
+used read-locks and so the
+above consideration is also the case
+when upgrading to 6.5 (or higher) from previous
+<productname>Postgres</productname> versions.
+</para>
+</note>
+</para>
 </sect1>

 <sect1 id="locking-tables">
@@ -268,17 +348,11 @@ ERROR: Can't serialize access due to concurrent update
 access to data in tables. Some of these lock modes are acquired by
 <productname>Postgres</productname>
 automatically before statement execution, while others are
-provided to be used by applications. All lock modes (except for
-AccessShareLock) acquired in a transaction are held for the duration
+provided to be used by applications. All lock modes acquired in a
+transaction are held for the duration
 of the transaction.
 </para>

-<para>
-In addition to locks, short-term share/exclusive latches are used
-to control read/write access to table pages in shared buffer pool.
-Latches are released immediately after a tuple is fetched or updated.
-</para>
-
 <sect2>
 <title>Table-level locks</title>

@@ -290,10 +364,8 @@ ERROR: Can't serialize access due to concurrent update
 </term>
 <listitem>
 <para>
-An internal lock mode acquiring automatically over tables
-being queried. <productname>Postgres</productname>
-releases these locks after statement is
-done.
+A read-lock mode acquired automatically on tables
+being queried.
 </para>

 <para>
@@ -425,22 +497,28 @@ ERROR: Can't serialize access due to concurrent update
 <title>Row-level locks</title>

 <para>
-These locks are acquired when internal
-fields of a row are being updated (or deleted or marked for update).
-<productname>Postgres</productname>
-doesn't remember any information about modified rows in memory and
-so has no limit to the number of rows locked without lock escalation.
+These locks are acquired when rows are being updated (or deleted or
+marked for update).
+Row-level locks don't affect data querying. They block
+writers to <emphasis>the same row</emphasis> only.
 </para>

 <para>
-However, take into account that <command>SELECT FOR UPDATE</command> will modify
-selected rows to mark them and so will results in disk writes.
+<productname>Postgres</productname>
+doesn't remember any information about modified rows in memory and
+so has no limit to the number of rows locked at one time. However,
+locking a row may cause a disk write; thus, for example,
+<command>SELECT FOR UPDATE</command> will modify
+selected rows to mark them and so will result in disk writes.
 </para>

 <para>
-Row-level locks don't affect data querying. They are used to block
-writers to <emphasis>the same row</emphasis> only.
-</para>
+In addition to table and row locks, short-term share/exclusive locks are
+used to control read/write access to table pages in the shared buffer
+pool. These locks are released immediately after a tuple is fetched or
+updated. Application writers normally need not be concerned with
+page-level locks, but we mention them for completeness.
+</para>
 </sect2>
 </sect1>

@@ -449,9 +527,9 @@ ERROR: Can't serialize access due to concurrent update

 <para>
 Though <productname>Postgres</productname>
-provides unblocking read/write access to table
-data, unblocked read/write access is not provided for every
-index access methods implemented
+provides nonblocking read/write access to table
+data, nonblocking read/write access is not currently offered for every
+index access method implemented
 in <productname>Postgres</productname>.
 </para>

@@ -482,21 +560,21 @@ ERROR: Can't serialize access due to concurrent update
 </para>

 <para>
-Page-level locks produces better concurrency than index-level ones
+Page-level locks provide better concurrency than index-level ones
 but are subject to deadlocks.
 </para>
 </listitem>
 </varlistentry>

 <varlistentry>
 <term>
-Btree
+Btree indices
 </term>
 <listitem>
 <para>
-Short-term share/exclusive page-level latches are used for
-read/write access. Latches are released immediately after the index
-tuple is inserted/fetched.
+Short-term share/exclusive page-level locks are used for
+read/write access. Locks are released immediately after each index
+tuple is fetched/inserted.
 </para>

 <para>
@@ -507,39 +585,10 @@ ERROR: Can't serialize access due to concurrent update
 </varlistentry>
 </variablelist>
 </para>
-</sect1>
-
-<sect1 id="applevel-consistency">
-<title>Data consistency checks at the application level</title>
-
-<para>
-Because readers in <productname>Postgres</productname>
-don't lock data, regardless of
-transaction isolation level, data read by one transaction can be
-overwritten by another. In the other words, if a row is returned
-by <command>SELECT</command> it doesn't mean that this row really
-exists at the time it is returned (i.e. sometime after the
-statement or transaction began) nor
-that the row is protected from deletion or update by concurrent
-transactions before the current transaction does a commit or rollback.
-</para>

 <para>
-To ensure the actual existance of a row and protect it against
-concurrent updates one must use <command>SELECT FOR UPDATE</command> or
-an appropriate <command>LOCK TABLE</command> statement.
-This should be taken into account when porting applications using
-serializable mode to <productname>Postgres</productname> from other environments.
-
-<note>
-<para>
-Before version 6.5 <productname>Postgres</productname>
-used read-locks and so the
-above consideration is also the case
-when upgrading to 6.5 (or higher) from previous
-<productname>Postgres</productname> versions.
-</para>
-</note>
+In short, btree indices are the recommended index type for concurrent
+applications.
 </para>
 </sect1>
 </chapter>
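
The new Serializable text says that an application receiving "Can't serialize access due to concurrent update" should abort the current transaction and retry the whole transaction from the beginning. A minimal sketch of that client-side pattern follows. It is not part of this commit and does not talk to a database: `SerializationFailure`, `run_with_retry`, and `make_flaky_transaction` are hypothetical names used only for illustration, and the failure is simulated rather than raised by a real driver.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for the error a client library would raise
    when the server reports a serialization failure."""


def run_with_retry(txn_body, max_attempts=5):
    """Call txn_body(attempt) until it succeeds or attempts run out.

    txn_body is assumed to perform the WHOLE transaction (begin, statements,
    commit), so each retry starts again from the beginning.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body(attempt)
        except SerializationFailure:
            # A real client would roll back the aborted transaction here.
            if attempt == max_attempts:
                raise


def make_flaky_transaction(failures):
    """Build a simulated transaction that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def body(attempt):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise SerializationFailure(
                "Can't serialize access due to concurrent update")
        return attempt  # report which attempt finally succeeded

    return body


if __name__ == "__main__":
    print(run_with_retry(make_flaky_transaction(2)))
```

Bounding the number of attempts is a design choice of this sketch, not something the documentation prescribes; as the diff notes, read-only transactions never hit this path.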
