Commit eb0eadb

Add.
1 parent d2c2551 commit eb0eadb

2 files changed: +131 -2 lines changed

doc/TODO

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 TODO list for PostgreSQL
 ========================
-Last updated: Wed Jan 24 08:38:34 EST 2001
+Last updated: Wed Jan 24 09:24:35 EST 2001
 
 Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
 
@@ -303,7 +303,7 @@ MISC
 connection pooling
 * Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
 ANALYZE, and CLUSTER
-* Delay fsync() when other backends are about to commit too
+* Delay fsync() when other backends are about to commit too [fsync]
 
 SOURCE CODE
 -----------

doc/TODO.detail/fsync

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
+From pgsql-hackers-owner+M908@postgresql.org Sun Nov 19 14:27:43 2000
+Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
+by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA10885
+for <pgman@candle.pha.pa.us>; Sun, 19 Nov 2000 14:27:42 -0500 (EST)
+Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
+by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAJJSMs83653;
+Sun, 19 Nov 2000 14:28:22 -0500 (EST)
+(envelope-from pgsql-hackers-owner+M908@postgresql.org)
+Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46] (may be forged))
+by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAJJQns83565
+for <pgsql-hackers@postgreSQL.org>; Sun, 19 Nov 2000 14:26:49 -0500 (EST)
+(envelope-from pgman@candle.pha.pa.us)
+Received: (from pgman@localhost)
+by candle.pha.pa.us (8.9.0/8.9.0) id OAA06790;
+Sun, 19 Nov 2000 14:23:06 -0500 (EST)
+From: Bruce Momjian <pgman@candle.pha.pa.us>
+Message-Id: <200011191923.OAA06790@candle.pha.pa.us>
+Subject: Re: [HACKERS] WAL fsync scheduling
+In-Reply-To: <002101c0525e$2d964480$b97a30d0@sectorbase.com> "from Vadim Mikheev
+at Nov 19, 2000 11:23:19 am"
+To: Vadim Mikheev <vmikheev@sectorbase.com>
+Date: Sun, 19 Nov 2000 14:23:06 -0500 (EST)
+CC: Tom Samplonius <tom@sdf.com>, Alfred@candle.pha.pa.us,
+Perlstein <bright@wintelcom.net>, Larry@candle.pha.pa.us,
+Rosenman <ler@lerctr.org>,
+PostgreSQL-development <pgsql-hackers@postgresql.org>
+X-Mailer: ELM [version 2.4ME+ PL77 (25)]
+MIME-Version: 1.0
+Content-Transfer-Encoding: 7bit
+Content-Type: text/plain; charset=US-ASCII
+Precedence: bulk
+Sender: pgsql-hackers-owner@postgresql.org
+Status: OR
+
+[ Charset ISO-8859-1 unsupported, converting... ]
+> > There are two parts to transaction commit. The first is writing all
+> > dirty buffers or log changes to the kernel, and second is fsync of the
+> ^^^^^^^^^^^^
+> Backend doesn't write any dirty buffer to the kernel at commit time.
+
+Yes, I suspected that.
+
+>
+> > log file.
+>
+> The first part is writing commit record into WAL buffers in shmem.
+> This is what XLogInsert does. After that XLogFlush is called to ensure
+> that entire commit record is on disk. XLogFlush does *both* write() and
+> fsync() (single slock is used for both writing and fsyncing) if it needs to
+> do it at all.
+
+Yes, I realize there are new steps in WAL.
+
+>
+> > I suggest having a per-backend shared memory byte that has the following
+> > values:
+> >
+> > START_LOG_WRITE
+> > WAIT_ON_FSYNC
+> > NOT_IN_COMMIT
+> > backend_number_doing_fsync
+> >
+> > I suggest that when each backend starts a commit, it sets its byte to
+> > START_LOG_WRITE.
+> ^^^^^^^^^^^^^^^^^^^^^^^
+> Isn't START_COMMIT more meaningful?
+
+Yes.
+
+>
+> > When it gets ready to fsync, it checks all backends.
+> ^^^^^^^^^^^^^^^^^^^^^^^^^^
+> What do you mean by this? The moment just after XLogInsert?
+
+Just before it calls fsync().
+
+>
+> > If all are NOT_IN_COMMIT, it does fsync and continues.
+>
+> 1st edition:
+> > If one or more are in START_LOG_WRITE, it waits until no one is in
+> > START_LOG_WRITE. It then checks all WAIT_ON_FSYNC, and if it is the
+> > lowest backend in WAIT_ON_FSYNC, marks all others with its backend
+> > number, and does fsync. It then clears all backends with its number to
+> > NOT_IN_COMMIT. Other backend will see they are not the lowest
+> > WAIT_ON_FSYNC and will wait for their byte to be set to NOT_IN_COMMIT
+> > so they can then continue, knowing their data was synced.
+>
+> 2nd edition:
+> > I have another idea. If a backend gets to the point that it needs
+> > fsync, and there is another backend in START_LOG_WRITE, it can go to an
+> > interruptible sleep, knowing another backend will perform the fsync and
+> > wake it up. Therefore, there is no busy-wait or timed sleep.
+> >
+> > Of course, a backend must set its status to WAIT_ON_FSYNC to avoid a
+> > race condition.
+>
+> The 2nd edition is much better. But I'm not sure do we really need in
+> these per-backend bytes in shmem. Why not just have some counters?
+> We can use a semaphore to wake-up all waiters at once.
+
+Yes, that is much better and clearer. My idea was just to say, "if no
+one is entering commit phase, do the commit. If someone else is coming,
+sleep and wait for them to do the fsync and wake me up with a signal."
+
+>
+> > This allows a single backend not to sleep, and allows multiple backends
+> > to bunch up only when they are all about to commit.
+> >
+> > The reason backend numbers are written is so other backends entering the
+> > commit code will not interfere with the backends performing fsync.
+>
+> Being waked-up backend can check what's written/fsynced by calling XLogFlush.
+
+Seems that may not be needed anymore with a counter. The only issue is
+that other backends may enter commit while fsync() is happening. The
+process that did the fsync must be sure to wake up only the backends
+that were waiting for it, and not other backends that may also be
+doing fsync as a group while the first fsync was happening. I leave
+those details to people more experienced. :-)
+
+I am just glad people liked my idea.
+
+--
+Bruce Momjian | http://candle.pha.pa.us
+pgman@candle.pha.pa.us | (610) 853-3000
+ + If your life is a hard drive, | 830 Blythe Avenue
+ + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
+
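
The counter-based variant discussed in the archived message (counters in place of per-backend status bytes, plus a wake-up of all waiters) might be sketched roughly as follows. This is a toy Python model under stated assumptions, not PostgreSQL code: the class `GroupCommitLog`, its methods, the `sync_fn` callback, and `run_demo` are hypothetical names invented for illustration. `insert` and `flush` only loosely mirror the roles the thread assigns to XLogInsert and XLogFlush, and a `threading.Condition` stands in for the semaphore Vadim suggests.

```python
import threading
import time


class GroupCommitLog:
    """Toy model (hypothetical, not PostgreSQL code) of a delayed group fsync."""

    def __init__(self, sync_fn):
        self._sync_fn = sync_fn     # stands in for write() + fsync()
        self._cond = threading.Condition()
        self._starting = 0          # backends that began commit, not yet at flush
        self._flushing = False      # True while one backend runs sync_fn
        self._inserted = 0          # position of the newest commit record
        self.flushed_upto = 0       # records at or below this are durable
        self.sync_calls = 0

    def start_commit(self):
        # Announce that this backend is about to commit (START_COMMIT).
        with self._cond:
            self._starting += 1

    def insert(self):
        # Append a commit record and return its position.
        with self._cond:
            self._inserted += 1
            return self._inserted

    def flush(self, lsn):
        # Return once the record at position `lsn` is durable.
        with self._cond:
            self._starting -= 1
            self._cond.notify_all()  # the count may have reached zero
        while True:
            with self._cond:
                # Sleep while someone is syncing or is about to commit too.
                while self.flushed_upto < lsn and (
                    self._flushing or self._starting > 0
                ):
                    self._cond.wait()
                if self.flushed_upto >= lsn:
                    return           # another backend synced our record
                self._flushing = True
                target = self._inserted
            synced = False
            try:
                self._sync_fn()      # runs without the lock held
                synced = True
            finally:
                with self._cond:
                    if synced:
                        self.flushed_upto = target
                        self.sync_calls += 1
                    self._flushing = False
                    self._cond.notify_all()


def run_demo(n_backends=8):
    # Each thread plays one backend committing a single transaction.
    log = GroupCommitLog(sync_fn=lambda: time.sleep(0.01))
    positions = []

    def backend():
        log.start_commit()
        lsn = log.insert()
        log.flush(lsn)
        positions.append(lsn)

    threads = [threading.Thread(target=backend) for _ in range(n_backends)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, sorted(positions)


if __name__ == "__main__":
    log, positions = run_demo()
    print("commits:", len(positions), "sync calls:", log.sync_calls)
```

In this sketch a woken backend rechecks its own record position rather than trusting the wake-up, which is one way to handle the concern raised at the end of the message: a backend that inserted its record while a sync was already running finds `flushed_upto` still below its position and either waits again or performs the next sync itself. A lone backend never sleeps, because the counter is zero when it reaches `flush`. One caveat of the sketch is that a steady stream of overlapping commits could keep the counter above zero for a long time, so a real design would presumably bound the wait.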
