|
| 1 | +pg_autovacuum README |
| 2 | + |
| 3 | +pg_autovacuum is a libpq client program that monitors all the databases of a |
| 4 | +PostgreSQL server. It uses the stats collector to monitor insert, update and |
| 5 | +delete activity. When an individual table exceeds its insert or delete |
| 6 | +threshold (more detail on thresholds below), that table is vacuumed or |
| 7 | +analyzed. This allows PostgreSQL to keep the FSM and table statistics up to |
| 8 | +date without having to schedule periodic vacuums with cron regardless of need. |
| 9 | + |
| 10 | +The primary benefit of pg_autovacuum is that the FSM and table statistic information |
| 11 | +are updated as needed. When a table is actively changed, pg_autovacuum performs the |
| 12 | +necessary vacuums and analyzes; when a table is inactive, no cycles are wasted |
| 13 | +performing vacuums and analyzes that are not needed. |
| 14 | + |
| 15 | +A secondary benefit of pg_autovacuum is that it guarantees that a database-wide |
| 16 | +vacuum is performed prior to xid wraparound. This is important as failing to do |
| 17 | +so can result in major data loss. |
| 18 | + |
| 19 | +INSTALL: |
| 20 | +To use pg_autovacuum, uncompress the tar.gz into the contrib directory and modify the |
| 21 | +contrib/Makefile to include the pg_autovacuum directory. pg_autovacuum will then be made as |
| 22 | +part of the standard postgresql install. |
| 23 | + |
| 24 | +Make sure that the following are set in postgresql.conf: |
| 25 | +stats_start_collector = true |
| 26 | +stats_row_level = true |
| 27 | + |
| 28 | +Start up the postmaster, |
| 29 | +then just execute the pg_autovacuum executable. |
| 30 | + |
| 31 | + |
| 32 | +Command line arguments: |
| 33 | +pg_autovacuum has the following optional arguments: |
| 34 | +-d debug: 0 silent, 1 basic info, 2 more debug info, etc... |
| 35 | +-s sleep base value: see "Sleeping" below. |
| 36 | +-S sleep scaling factor: see "Sleeping" below. |
| 37 | +-t tuple base threshold: see "Vacuum and Analyze" below. |
| 38 | +-T tuple scaling factor: see "Vacuum and Analyze" below. |
| 39 | +-U username: Username pg_autovacuum will use to connect with. If not specified, the |
| 40 | + current username is used. |
| 41 | +-P password: Password pg_autovacuum will use to connect with. |
| 42 | +-H host: host name or IP to connect to. |
| 43 | +-p port: port used for connection. |
| 44 | +-h help: list of command line options. |
| 45 | + |
| 46 | +All arguments have default values defined in pg_autovacuum.h. At the time of this |
| 47 | +writing they are: |
| 48 | +#define AUTOVACUUM_DEBUG 1 |
| 49 | +#define BASETHRESHOLD 100 |
| 50 | +#define SCALINGFACTOR 2 |
| 51 | +#define SLEEPVALUE 3 |
| 52 | +#define SLEEPSCALINGFACTOR 2 |
| 53 | +#define UPDATE_INTERVAL 2 |
| 54 | + |
| 55 | + |
| 56 | +Vacuum and Analyze: |
| 57 | +pg_autovacuum performs either a vacuum analyze or just an analyze, depending on the table activity. |
| 58 | +If the number of (inserts + updates) > insertThreshold, then only an analyze is performed. |
| 59 | +If the number of (deletes + updates) > deleteThreshold, then a vacuum analyze is performed. |
| 60 | +deleteThreshold is equal to: tuple_base_value + (tuple_scaling_factor * "number of tuples in the table") |
| 61 | +insertThreshold is equal to: 0.5 * (tuple_base_value + (tuple_scaling_factor * "number of tuples in the table")) |
| 62 | +The insertThreshold is half the deleteThreshold because analyze is a much lighter operation (approx 5%-10% of vacuum), |
| 63 | +so running it more often costs us little in performance degradation. |
| 64 | + |
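The threshold rules in "Vacuum and Analyze" can be sketched in C as follows. This is only an illustration, not the code from pg_autovacuum itself: the function and enum names are hypothetical, the insert threshold is taken as half of the delete threshold as the text states, and checking the vacuum analyze condition first is an assumption (the README does not say which test wins when both thresholds are exceeded).

```c
/* Defaults quoted from pg_autovacuum.h in this README. */
#define BASETHRESHOLD 100
#define SCALINGFACTOR 2

/* Hypothetical names, used here for illustration only. */
typedef enum { ACTION_NONE, ACTION_ANALYZE, ACTION_VACUUM_ANALYZE } av_action;

/* deleteThreshold = tuple_base_value + tuple_scaling_factor * ntuples */
static double delete_threshold(double base, double scale, double ntuples)
{
    return base + scale * ntuples;
}

/* insertThreshold is half the deleteThreshold, per the README text. */
static double insert_threshold(double base, double scale, double ntuples)
{
    return 0.5 * delete_threshold(base, scale, ntuples);
}

/*
 * Decide what to do for one table, given the activity counted since the
 * last vacuum or analyze. Assumption: the vacuum analyze test is made
 * first, since a vacuum analyze also refreshes the statistics.
 */
static av_action choose_action(double inserts, double updates, double deletes,
                               double base, double scale, double ntuples)
{
    if (deletes + updates > delete_threshold(base, scale, ntuples))
        return ACTION_VACUUM_ANALYZE;
    if (inserts + updates > insert_threshold(base, scale, ntuples))
        return ACTION_ANALYZE;
    return ACTION_NONE;
}
```

With the defaults above and a table of 1000 tuples, deleteThreshold works out to 100 + 2 * 1000 = 2100 and insertThreshold to 1050, so 1200 inserts alone would trigger an analyze, while 2500 deletes would trigger a vacuum analyze.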
| 65 | +Sleeping: |
| 66 | +pg_autovacuum sleeps after it is done checking all the databases. It does this so as |
| 67 | +to limit the amount of system resources it consumes. This also allows the system |
| 68 | +administrator to configure pg_autovacuum to be more or less aggressive. Reducing the |
| 69 | +sleep time will cause pg_autovacuum to respond more quickly to changes, be they database |
| 70 | +addition / removal, table addition / removal, or just normal table activity. However, |
| 71 | +making pg_autovacuum too aggressive can have a negative net effect on the server. If a table |
| 72 | +gets vacuumed 5 times during the course of a large update, it might take much longer |
| 73 | +than if it were vacuumed only once. |
| 74 | +The total time it sleeps is equal to: |
| 75 | +base_sleep_value + sleep_scaling_factor * "duration of the previous loop" |
| 76 | + |
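The sleep formula in "Sleeping" amounts to one line of arithmetic. The sketch below uses a hypothetical function name and assumes that the base value and the loop duration are expressed in the same time unit, which the README does not spell out.

```c
/* Defaults quoted from pg_autovacuum.h in this README. */
#define SLEEPVALUE 3
#define SLEEPSCALINGFACTOR 2

/*
 * Hypothetical helper: total sleep time after one pass over all databases.
 * sleep = base_sleep_value + sleep_scaling_factor * previous loop duration
 * All three quantities are assumed to share one time unit.
 */
static double sleep_time(double base, double scale, double prev_loop_duration)
{
    return base + scale * prev_loop_duration;
}
```

For example, with the defaults above and a previous loop that took 4 time units, the next sleep is 3 + 2 * 4 = 11 units. A longer loop therefore leads to a longer pause, which keeps pg_autovacuum from dominating a busy server.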
| 77 | +What it monitors: |
| 78 | +pg_autovacuum dynamically generates a list of databases and tables to monitor. In |
| 79 | +addition, it will dynamically add and remove databases and tables that are |
| 80 | +created on or removed from the database server while pg_autovacuum is running. |