|
1 | 1 | pg_autovacuum README
|
| 2 | +-------------------- |
2 | 3 |
|
3 |
| -pg_autovacuum is a libpq client program that monitors all the databases of a |
4 |
| -postgresql server. It uses the stats collector to monitor insert, update and |
5 |
| -delete activity. When an individual table exceeds it's insert or delete |
6 |
| -threshold (more detail on thresholds below) then that table is vacuumed or |
7 |
| -analyzed. This allows postgresql to keep the fsm and table statistics up to |
8 |
| -date without having to schedule periodic vacuums with cron regardless of need. |
| 4 | +pg_autovacuum is a libpq client program that monitors all the |
| 5 | +databases associated with a postgresql server. It uses the stats |
| 6 | +collector to monitor insert, update and delete activity. |
9 | 7 |
|
10 |
| -The primary benefit of pg_autovacuum is that the FSM and table statistic information |
11 |
| -are updated as needed. When a table is actively changed pg_autovacuum performs the |
12 |
| -necessary vacuums and analyzes, when a table is inactive, no cycles are wasted |
13 |
| -performing vacuums and analyzes that are not needed. |
| 8 | +When a table exceeds its insert or delete threshold (more detail |
| 9 | +on thresholds below) then that table will be vacuumed or analyzed. |
| 10 | + |
| 11 | +This allows postgresql to keep the fsm and table statistics up to |
| 12 | +date, and eliminates the need to schedule periodic vacuums. |
| 13 | + |
| 14 | +The primary benefit of pg_autovacuum is that the FSM and table |
| 15 | +statistic information are updated as needed. When a table is actively |
| 16 | +changing, pg_autovacuum will perform the necessary vacuums and |
| 17 | +analyzes, whereas if a table remains static, no cycles will be wasted |
| 18 | +performing unnecessary vacuums/analyzes. |
| 19 | + |
| 20 | +A secondary benefit of pg_autovacuum is that it ensures that a |
| 21 | +database wide vacuum is performed prior to xid wraparound. This is an |
| 22 | +important, if rare, problem, as failing to do so can result in major |
| 23 | +data loss. |
| 24 | + |
| 25 | + |
| 26 | +KNOWN ISSUES: |
| 27 | +------------- |
| 28 | +pg_autovacuum has been tested under Redhat Linux (by me) and Solaris (by |
| 29 | +Christopher B. Browne) and all known bugs have been resolved. Please report |
| 30 | +any problems to the hackers list. |
| 31 | + |
| 32 | +pg_autovacuum does not get started automatically by either the postmaster or |
| 33 | +by pg_ctl. Along the same lines, when the postmaster exits no one tells |
| 34 | +pg_autovacuum. The result is that at the start of the next loop, |
| 35 | +pg_autovacuum fails to connect to the server and exits. Any time it fails |
| 36 | +to connect pg_autovacuum exits. |
| 37 | + |
| 38 | +pg_autovacuum requires that the stats system be enabled and reporting row |
| 39 | +level stats. The overhead of the stats system has been shown to be |
| 40 | +significant under certain workloads. For instance a tight loop of queries |
| 41 | +performing "select 1" was nearly 30% slower with stats enabled. However, |
| 42 | +in practice with more realistic workloads, the stats system overhead is |
| 43 | +usually nominal. |
14 | 44 |
|
15 |
| -A secondary benefit of pg_autovacuum is that it guarantees that a database wide |
16 |
| -vacuum is performed prior to xid wraparound. This is important as failing to do |
17 |
| -so can result in major data loss. |
18 | 45 |
|
19 | 46 | INSTALL:
|
20 |
| -To use pg_autovacuum, uncompress the tar.gz into the contrib directory and modify the |
21 |
| -contrib/Makefile to include the pg_autovacuum directory. pg_autovacuum will then be made as |
22 |
| -part of the standard postgresql install. |
| 47 | +-------- |
| 48 | + |
| 49 | +As of postgresql v7.4 pg_autovacuum is included in the main source tree |
| 50 | +under contrib. Therefore you just run make && make install (similar to most other |
| 51 | +contrib modules) and it will be installed for you. |
| 52 | + |
| 53 | +If you are using an earlier version of postgresql just uncompress the tar.gz |
| 54 | +into the contrib directory and modify the contrib/Makefile to include the pg_autovacuum |
| 55 | +directory. pg_autovacuum will then be made as part of the standard |
| 56 | +postgresql install. |
23 | 57 |
|
24 | 58 | make sure that the following are set in postgresql.conf
|
25 |
| -stats_start_collector = true |
26 |
| -stats_row_level = true |
27 | 59 |
|
28 |
| -start up the postmaster |
29 |
| -then, just execute the pg_autovacuum executable. |
| 60 | + stats_start_collector = true |
| 61 | + stats_row_level = true |
| 62 | + |
| 63 | +start up the postmaster, then execute the pg_autovacuum executable. |
30 | 64 |
|
31 | 65 |
|
32 | 66 | Command line arguments:
|
| 67 | +----------------------- |
| 68 | + |
33 | 69 | pg_autovacuum has the following optional arguments:
|
| 70 | + |
34 | 71 | -d debug: 0 silent, 1 basic info, 2 more debug info, etc...
|
| 72 | +-D daemonize: Detach from tty and run in background. |
35 | 73 | -s sleep base value: see "Sleeping" below.
|
36 | 74 | -S sleep scaling factor: see "Sleeping" below.
|
37 |
| --t tuple base threshold: see Vacuuming. |
38 |
| --T tuple scaling factor: see Vacuuming. |
39 |
| --U username: Username pg_autovacuum will use to connect with, if not specified the |
40 |
| - current username is used |
| 75 | +-v vacuum base threshold: see Vacuum and Analyze. |
| 76 | +-V vacuum scaling factor: see Vacuum and Analyze. |
| 77 | +-a analyze base threshold: see Vacuum and Analyze. |
| 78 | +-A analyze scaling factor: see Vacuum and Analyze. |
| 79 | +-L log file: Name of file to which output is submitted, otherwise STDERR |
| 80 | +-U username: Username pg_autovacuum will use to connect with, if not |
| 81 | + specified the current username is used. |
41 | 82 | -P password: Password pg_autovacuum will use to connect with.
|
42 | 83 | -H host: host name or IP to connect to.
|
43 | 84 | -p port: port used for connection.
|
44 | 85 | -h help: list of command line options.
|
45 | 86 |
|
46 |
| -All arguments have default values defined in pg_autovacuum.h. At the time of this |
47 |
| -writing they are: |
48 |
| -#define AUTOVACUUM_DEBUG 1 |
49 |
| -#define BASETHRESHOLD 100 |
50 |
| -#define SCALINGFACTOR 2 |
51 |
| -#define SLEEPVALUE 3 |
52 |
| -#define SLEEPSCALINGFACTOR 2 |
53 |
| -#define UPDATE_INTERVAL 2 |
| 87 | +All arguments have default values defined in pg_autovacuum.h. At the |
| 88 | +time of writing they are: |
| 89 | + |
| 90 | +-d 1 |
| 91 | +-v 1000 |
| 92 | +-V 2 |
| 93 | +-a 500 (half of -v if not specified) |
| 94 | +-A 1 (half of -V if not specified) |
| 95 | +-s 300 (5 minutes) |
| 96 | +-S 2 |
54 | 97 |
|
55 | 98 |
|
56 | 99 | Vacuum and Analyze:
|
57 |
| -pg_autovacuum performes either a vacuums analyze or just analyze depending on the table activity. |
58 |
| -If the number of (inserts + updates) > insertThreshold, then an only an analyze is performed. |
59 |
| -If the number of (deletes + updates ) > deleteThreshold, then a vacuum analyze is performed. |
60 |
| -deleteThreshold is equal to: tuple_base_value + (tuple_scaling_factor * "number of tuples in the table") |
61 |
| -insertThreshold is equal to: 0.5 * tuple_base_value + (tuple_scaling_factor * "number of tuples in the table") |
62 |
| -The insertThreshold is half the deleteThreshold because it's a much lighter operation (approx 5%-10% of vacuum), |
63 |
| -so running it more often costs us little in performance degredation. |
| 100 | +------------------- |
| 101 | + |
| 102 | +pg_autovacuum performs either a vacuum analyze or just analyze depending |
| 103 | +on the quantity and type of table activity (insert, update, or delete): |
| 104 | + |
| 105 | +- If the number of (inserts + updates + deletes) > AnalyzeThreshold, then |
| 106 | + only an analyze is performed. |
| 107 | + |
| 108 | +- If the number of (deletes + updates ) > VacuumThreshold, then a |
| 109 | + vacuum analyze is performed. |
| 110 | + |
| 111 | +VacuumThreshold is equal to: |
| 112 | + vacuum_base_value + (vacuum_scaling_factor * "number of tuples in the table") |
| 113 | + |
| 114 | +AnalyzeThreshold is equal to: |
| 115 | + analyze_base_value + (analyze_scaling_factor * "number of tuples in the table") |
| 116 | + |
| 117 | +The AnalyzeThreshold defaults to half of the VacuumThreshold since it |
| 118 | +represents a much less expensive operation (approx 5%-10% of vacuum), and |
| 119 | +running it more often should not substantially degrade system performance. |
64 | 120 |
|
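The threshold rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual C implementation: the function names are hypothetical, the defaults are the -v/-V/-a/-A values listed earlier, and checking the vacuum threshold before the analyze threshold is an assumption (the README does not state the order).

```python
# Illustrative sketch only; names and check order are assumptions,
# not taken from the pg_autovacuum source.

def vacuum_threshold(reltuples, base=1000, scale=2):
    """vacuum_base_value + (vacuum_scaling_factor * number of tuples)."""
    return base + scale * reltuples


def analyze_threshold(reltuples, base=500, scale=1):
    """analyze_base_value + (analyze_scaling_factor * number of tuples)."""
    return base + scale * reltuples


def choose_action(inserts, updates, deletes, reltuples):
    """Return "vacuum analyze", "analyze", or None for one table.

    The counts are the activity figures reported by the stats collector.
    The vacuum test comes first here (an assumption) because a vacuum
    analyze also refreshes the statistics.
    """
    if deletes + updates > vacuum_threshold(reltuples):
        return "vacuum analyze"
    if inserts + updates + deletes > analyze_threshold(reltuples):
        return "analyze"
    return None
```

For a table of 100 tuples with the default settings, the vacuum threshold works out to 1000 + 2 * 100 = 1200 and the analyze threshold to 500 + 1 * 100 = 600, so 700 inserts alone would trigger only an analyze.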
65 | 121 | Sleeping:
|
66 |
| -pg_autovacuum sleeps after it is done checking all the databases. It does this so as |
67 |
| -to limit the amount of system resources it consumes. This also allows the system |
68 |
| -administrator to configure pg_autovacuum to be more or less aggressive. Reducing the |
69 |
| -sleep time will cause pg_autovacuum to respond more quickly to changes, be they database |
70 |
| -addition / removal, table addition / removal, or just normal table activity. However, |
71 |
| -setting these values to high can have a negative net effect on the server. If a table |
72 |
| -gets vacuumed 5 times during the course of a large update, it might take much longer |
73 |
| -than if it was vacuumed only once. |
| 122 | +--------- |
| 123 | + |
| 124 | +pg_autovacuum sleeps for a while after it is done checking all the |
| 125 | +databases. It does this in order to limit the amount of system |
| 126 | +resources it consumes. This also allows the system administrator to |
| 127 | +configure pg_autovacuum to be more or less aggressive. |
| 128 | + |
| 129 | +Reducing the sleep time will cause pg_autovacuum to respond more |
| 130 | +quickly to changes, whether they be database addition/removal, table |
| 131 | +addition/removal, or just normal table activity. |
| 132 | + |
| 133 | +On the other hand, setting pg_autovacuum's sleep values too aggressively |
| 134 | +(for too short a period of time) can have a negative effect on server |
| 135 | +performance. If a table gets vacuumed 5 times during the course of a |
| 136 | +large update, this is likely to take much longer than if the table was |
| 137 | +vacuumed only once, at the end. |
| 138 | + |
74 | 139 | The total time it sleeps is equal to:
|
75 |
| -base_sleep_value + sleep_scaling_factor * "duration of the previous loop" |
76 | 140 |
|
77 |
| -What it monitors: |
78 |
| -pg_autovacuum dynamically generates a list of databases and tables to monitor, in |
79 |
| -addition it will dynamically add and remove databases and tables that are |
80 |
| -removed from the database server while pg_autovacuum is running. |
| 141 | + base_sleep_value + sleep_scaling_factor * "duration of the previous |
| 142 | + loop" |
| 143 | + |
| 144 | +Note that timing measurements are made in seconds; specifying |
| 145 | +"pg_autovacuum -s 1" means pg_autovacuum could poll the database up to 60 times |
| 146 | +per minute. In a system with large tables where vacuums may run for several |
| 147 | +minutes, longer times between vacuums are likely to be appropriate. |
| 148 | + |
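As a rough illustration of the sleep formula above, the following Python sketch computes the pause between loops. The function name is hypothetical and the defaults are the -s and -S values listed earlier; it shows the arithmetic only and is not the program's actual code.

```python
# Illustrative sketch only; the function name is an assumption.

def sleep_seconds(previous_loop_seconds, base=300, scale=2):
    """base_sleep_value + sleep_scaling_factor * duration of the previous loop.

    All values are in seconds.
    """
    return base + scale * previous_loop_seconds
```

With the defaults, a loop that took 10 seconds is followed by a sleep of 300 + 2 * 10 = 320 seconds. With a base of 1 and a loop that finishes almost instantly, the sleep approaches 1 second, which is where the figure of up to 60 polls per minute comes from.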
| 149 | +What pg_autovacuum monitors: |
| 150 | +---------------------------- |
| 151 | + |
| 152 | +pg_autovacuum dynamically generates a list of all databases and tables that |
| 153 | +exist on the server. It will dynamically add and remove databases and |
| 154 | +tables that are created on or removed from the database server while |
| 155 | +pg_autovacuum is running. Overhead is fairly small per object. For example: 10 databases |
| 156 | +with 10 tables each appears to use less than 10k of memory on my Linux box. |