
Latest Blog Posts

A roadmap to scaling Postgres
Posted by Kaarel Moppel on 2025-04-09 at 21:00
After meeting with another database fellow recently, the evergreen topic of database scaling popped up again. It won’t go out of fashion anytime soon, I guess, despite the AI advancements and the rise of Postgres-aware chatbots à la Xata Agent. Namely: when to reach for which designs, technical...

Introducing pgNow: Indexing
Posted by Ryan Booz on 2025-04-09 at 17:54
A few weeks ago I wrote about the introduction of pgNow, a free, cross-platform desktop application from Redgate that is designed to assist users who might lack extensive Postgres experience or a robust monitoring solution. It provides real-time data on queries, indexes, high-value configuration settings, and vacuuming setup and efficiency to help you pinpoint and ... Read more

Geopolitical Postgres
Posted by Bruce Momjian in EDB on 2025-04-09 at 13:45

Postgres is a data storage tool, and as a tool it is neutral in how it is used. Tools can be used for good or for harm, and because Postgres is open source, the community doesn't even know how it is used.

My first knowledge of non-traditional uses of Postgres was in 2003 during a Postgres training. During a break, someone whispered to me that one of the attendees was from a pornographic video company, and that Postgres was very popular in that industry. The more I thought about it, the more it made sense — many database companies didn't want to do business with them, but Postgres was open source so they could use it easily.

The flexibility of Postgres to be used without restrictions didn't become apparent to me again until 2014, when Russia invaded Crimea and fomented a war in Ukraine's Donbas region. Russia was internationally sanctioned, and many technology companies stopped doing business in Russia, causing Russian companies to scramble for alternatives. In reaction, Russia implemented a nation-wide effort called "import substitution" to replace imported goods and products with locally produced ones. A logical database alternative was Postgres, for the same reason it was a popular option for pornographic video companies.

Continue Reading »

pgai: Importing Wikipedia into PostgreSQL
Posted by Hans-Juergen Schoenig in Cybertec on 2025-04-08 at 05:00

When dealing with AI, one of the key things is: Where can I find data to experiment with? While there are many places out there on the internet to satisfy the need for test data, Wikipedia is certainly one of the most interesting candidates to get you started. So: How can we import Wikipedia into PostgreSQL and make use of it? Here is how it works.

Using pgai to load Wikipedia

By far one of the most useful extensions out there is pgai, which is a massive toolbox for Artificial Intelligence in PostgreSQL. Luckily, it also contains an easy way to load Wikipedia into the database.

Once you have installed pgai on your system or launched the ready-made containers as shown on GitHub, you can enable the extension inside your database:

cybertec=# CREATE EXTENSION IF NOT EXISTS ai CASCADE;
NOTICE:  installing required extension "vector"
NOTICE:  installing required extension "plpython3u"
CREATE EXTENSION

All we have to do here is call the ai.load_dataset function, which has a ton of useful parameters we can utilize:

cybertec=# \x
Expanded display is on.
cybertec=# \df+ ai.load_dataset
List of functions
-[ RECORD 1 ]-------+-------------------------------------------------
Schema              | ai
Name                | load_dataset
Result data type    | bigint
Argument data types | name text, config_name text DEFAULT NULL::text, 
                      split text DEFAULT NULL::text, 
                      schema_name name DEFAULT 'public'::name, 
                      table_name name DEFAULT NULL::name, 
                      if_table_exists text DEFAULT 'error'::text, 
                      field_types jsonb DEFAULT NULL::jsonb, 
batch_size integer DEFAULT 5000, 
                      max_batches integer DEFAULT NULL::integer, 
                      kwargs jsonb DEFAULT '{}'::jsonb
Type                | func
Volatility          | volatile
Parallel            | unsafe
Owner               | hs
Security            | invoker
Access privileges   | hs=X*/hs +
                    | pg_datab
[...]
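
The excerpt above truncates the full output, but the call itself is compact. Here is a minimal sketch, assuming the Wikipedia dataset published on Hugging Face as wikimedia/wikipedia with the 20231101.en configuration (check the pgai documentation for the current names and defaults):

SELECT ai.load_dataset(
    name        => 'wikimedia/wikipedia',  -- Hugging Face dataset id (assumed)
    config_name => '20231101.en',          -- English snapshot (assumed)
    table_name  => 'wiki',                 -- target table to create
    batch_size  => 5000,
    max_batches => 10                      -- cap the volume for a first experiment
);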

Waiting for Postgres 18 – Docker Containers 34% Smaller
Posted by Jeremy Schneider on 2025-04-08 at 01:50

On February 25, 2025, Christoph Berg committed the patch:

Subject: [PATCH] Move JIT to new postgresql-18-jit package. (Closes: #927182)

Make LLVM architectures a inclusion list so it works in the Architecture field.

This closed Debian bug 927182 which had been opened in April 2019 by Laurence Parry. That bug had raised concerns over the significant size increase of adding LLVM as a requirement to support Postgres JIT functionality.

Postgres supports packaging LLVM as a separate optional package without needing to recompile database binaries. Postgres is compiled once, and it performs a runtime check whether LLVM libraries are present. It gracefully disables JIT functionality in the database if LLVM libraries are not installed.

From: Andres Freund
Subject: Re: llvm dependency and space concerns
Date: 2025-01-12 00:03:43
Lists: pgsql-hackers

Hi,

On 2025-01-11 13:22:39 -0800, Jeremy Schneider wrote:
> It's a cleaner solution if JIT works more like an extension, and we can
> run a single build and split JIT into a separate package.

It does work like that. Only llvmjit.so has the llvm dependency, the main
postgres binary doesn't link to llvm. If llvmjit.so isn't available, jit is
silently disabled.

Andres
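
As Andres describes, whether JIT is actually usable is a runtime property. A quick way to check any installation (pg_jit_available() has existed since PostgreSQL 11):

-- true only if the server was built with JIT support and
-- the JIT provider library (llvmjit.so) can be loaded:
SELECT pg_jit_available();

-- jit defaults to 'on' since PostgreSQL 12, but it is silently
-- ignored when the JIT library is not installed:
SHOW jit;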

In the official community Postgres RPM/yum repository, LLVM was already split out into a separate package – but making the same change in the official community Debian repositories had never been prioritized until now.

Christoph noted on the mailing lists that the work is not yet finished – there is still a little more that needs to be done:

From: Christoph Berg
Cc: pgsql-pkg-debian@lists.postgresql.org
Subject: Re: pg18 patch: separate package for llvm/jit
Date: Fri, 28 Feb 2025 16:02:42 +0100

Re: Jeremy Schneider
> I would like to propose this change for Postgres 18.  

Hi,

I committed a change to the PG18 packaging that implements that split.
The new package is called postgresql-18-jit.

There is more work to do on each of the extension packages currently
depending on "postgre
[...]

Mini Summit 3: APT Extension Packaging
Posted by David Wheeler in Tembo on 2025-04-07 at 18:33
Orange card with large black text reading “APT Extension Packaging”. Smaller text below reads “Christoph Berg, Debian/Cybertec” and “04.09.2025”. A photo of Christoph looking cooly at the camera appears on the right.

Join us this Wednesday, April 9, at noon America/New_York (16:00 UTC) for Extension Mini-Summit #3, where Christoph Berg will take us on a tour of the PostgreSQL Global Development Group’s APT repository with a focus on packaging extensions. For those of us foolish enough to consider building our own binary packaging systems for extensions, this will be an essential session. For everyone else, come be amazed by the sheer volume of extensions readily available from the repository. Browse on over to the Meetup to register for this live video conference.

Frédéric Delacourt
Posted by Andreas 'ads' Scherbaum on 2025-04-07 at 14:00
PostgreSQL Person of the Week Interview with Frédéric Delacourt: I live in Normandy, France. I’m sure you know the place, since it is where D-Day (6 June 1944) and the Battle of Normandy took place.

SELECT FOR UPDATE – Reduce Contention and Avoid Deadlocks to Improve Performance in PostgreSQL
Posted by Umair Shahid in Stormatics on 2025-04-07 at 10:19

Relational databases are at the heart of countless applications around the world, from high-traffic e-commerce websites to enterprise resource planning (ERP) systems and financial services. Concurrency management—where multiple database transactions operate on the same data simultaneously—is critical to getting good performance and avoiding problems like deadlocks or data inconsistencies.

When multiple transactions need to modify the same rows, ensuring data consistency can become tricky. A single wrong approach to locking can lead to suboptimal performance or even bring your application to a standstill as numerous transactions block one another. One tool in PostgreSQL’s arsenal to handle concurrency is SELECT FOR UPDATE. It allows you to lock specific rows before updating them, preventing other transactions from modifying those rows until your transaction completes.

In this blog, we will dive deep into SELECT FOR UPDATE in PostgreSQL. We will explore how it helps in reducing contention, avoiding deadlocks, and ultimately boosting performance when dealing with highly concurrent applications.
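
As a taste of the pattern the post explores, here is a minimal sketch (the accounts table and the balance logic are invented for illustration):

BEGIN;
-- Lock the target row; concurrent transactions that try to lock
-- or update it will wait until this transaction finishes:
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;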

The post SELECT FOR UPDATE – Reduce Contention and Avoid Deadlocks to Improve Performance in PostgreSQL appeared first on Stormatics.

Tips for Tracking Django Model Changes with django-pghistory
Posted by Colin Copeland on 2025-04-07 at 10:00

Django's admin interface is a big part of why Caktus uses Django, but the admin's ability to log database changes is limited. For example, it shows only changes made via the Django admin, not those made via other parts of the site.

We've written previously on the Caktus blog about django-simple-history, a tool we use to track model changes in the admin and other parts of our Django projects. django-simple-history works well for some cases, but as a Python solution, it is not able to track changes made directly in the database with raw SQL.

Over the last year, we've been using yet another tool, django-pghistory, to track data changes in Postgres tables with 5+ million records, so I thought I'd write a short post with some of the things we've learned over this time.

Track changes selectively

django-pghistory works using Postgres triggers, which are a great solution for tracking and recording changes at a low level in the database (no matter what initiated the changes). That said, there are two caveats to this approach which are worth noting:

  1. The triggers need to be removed and re-added during schema changes. django-pghistory handles this for you; however, we found it makes database migrations longer and harder to read during code reviews due to the extra queries required. It's an expense worth incurring, but we recommend employing django-pghistory only on the models that really need it (probably not on every model in your project).
  2. django-pghistory includes an event viewer in the Django admin that shows you all the changes across all tracked models. This is great for small and simple projects, but it can quickly get out of hand for large projects with many tracked models. For this reason, we again recommend limiting tracked models only to those that really need it. For particularly large projects, it may be helpful to disable the "all events" viewer. This can be done by adding PGHISTORY_ADMIN_ALL_EVENTS = False to your Django sett
[...]

Size matter
Posted by Federico Campoli on 2025-04-07 at 07:00

Nowadays it’s not uncommon to deal with machines with hundreds of GB of RAM.

Abundant memory can give PostgreSQL a massive performance boost. However, things work slightly differently than you might expect.

Let’s find out!

🎙️ Talking Postgres Podcast: Helping Rails developers learn Postgres with Andrew Atkinson
Posted by Andrew Atkinson on 2025-04-07 at 00:00

Back in November, I met with Claire Giordano, host of the Talking Postgres podcast, who asked a ton of great questions about my experience writing a Postgres book aimed at Ruby on Rails web developers.

Some questions

Claire had a lot of thoughtful questions. Here’s a few:

  • Why write a technical book? Was there some moment or spark?
  • Why write a book about Postgres?
  • Why Ruby on Rails?

Fun topics

Claire also brought up a lot of fun points and reactions. Here’s a sample:

  • The importance of planting seeds and encouraging others with ambitious projects
  • Would I consider writing a book for Django and Python for Postgres?
  • Where does the book fit in the landscape?
  • How long did it take to write this book?
  • Did I ever want to quit writing, even for a moment?
  • Did I have a party when the book was fully complete?
  • I talked about “little parties” with Rails developer communities at events like Rails World and Sin City Ruby
  • What was my experience like working with other publishers?
  • I shared my deep appreciation for the efforts of the technical reviewers of the book!
  • We talked about cheese! 🧀 (stories and connections with Postgres icons David Rowley and Melanie Plageman)
  • What was my favorite chapter?
  • Is there a frequently asked question I get about databases from Rails developers?
  • For my consulting services, do clients hire me for my Rails expertise or my Postgres expertise?

Quote mentioned by Claire:

Writing is thinking. To write well is to think clearly. That’s why it’s so hard.
—David McCullough

It was a real honor to be a guest on this prestigious podcast. I’m lucky to call Claire a friend as well! Thank you for the opportunity Claire, Aaron, and team!

Check out more episodes of Talking Postgres!

A Repository Of Handy Database Administration Queries
Posted by Dave Stokes on 2025-04-04 at 18:58

I have started a repository of handy SQL queries for database administration, which I have collected over the years. Finding unused or duplicate indexes, queries missing indexes, and other such information can speed up response times. Knowing statistics like cache hit rates can aid in planning upgrades.

I am currently populating this repo as I run across the queries as part of my 'spring cleaning.' I welcome your participation if you have a handy bit of SQL for an open-source database you'd like to share. The repo is at https://github.com/davestokes/HandySQL, and I will be adding to it over the next few months.
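
For a flavor of the kind of query such a repository might hold, here is a common cache-hit-ratio check (a sketch; the versions in the repo may differ):

-- Fraction of block requests served from shared buffers,
-- across all databases:
SELECT sum(blks_hit)::numeric
       / nullif(sum(blks_hit) + sum(blks_read), 0) AS cache_hit_ratio
FROM pg_stat_database;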

If you have a question that you would like addressed, please don't hesitate to let me know. I will try to accommodate you.

Creating Histograms with Postgres
Posted by Christopher Winslett in Crunchy Data on 2025-04-04 at 14:00

Histograms were first used in a lecture in 1892 by Karl Pearson — the godfather of mathematical statistics. With how many data presentation tools we have today, it’s hard to imagine that representing data as a graphic once counted as “innovation”, but it did. A histogram is a graphic presentation of the distribution and frequency of data. If you haven’t seen one recently, or the word histogram doesn’t spring to mind - it is a bar chart in which each bar represents the count of data within a defined range of values. When Pearson built the first histogram, he calculated it by hand. Today we can use SQL (or even Excel) to extract this data continuously across large data sets.

While true statistical histograms have a bit more complexity for choosing bin ranges, for many business intelligence purposes, Postgres's width_bucket is good enough for counting data inside bins with minimal effort.

Postgres width_bucket for histograms

Given the number of buckets and the max/min values, width_bucket returns the index of the bucket that a value falls into. For instance, given a minimum value of 0, a maximum value of 100, and 10 buckets, a value of 43 falls in bucket #5: select width_bucket(43, 0, 100, 10) AS bucket; But 5 is not correct for 43, or is it?

You can see how the values would fall using generate_series:

SELECT value, width_bucket(value, 0, 100, 10) AS bucket FROM generate_series(0, 100) AS value;

[Image: histogram of width_bucket output for values 0-100]

When running the query, the values 0 through 9 go into bucket 1. As you can see in the image above, width_bucket behaves as a step function that starts indexing at 1. In this scenario, when passed a value of 100, width_bucket returns 11, because the maximum value given to width_bucket is exclusive (i.e. the logic is minimum <= value < maximum).
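
One simple way to handle that exclusive upper bound is to clamp bucket 11 back into bucket 10 while counting, for example:

-- Count values per bucket, folding the exclusive maximum (100)
-- into the last bucket with least():
SELECT least(width_bucket(value, 0, 100, 10), 10) AS bucket,
       count(*) AS frequency
FROM generate_series(0, 100) AS value
GROUP BY bucket
ORDER BY bucket;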

We can use the bucket value to generate more readable labels.

Auto-formatting histogram with SQL

Let’s build out a larger query that creates ranges, range labels, and formats the histogram. We will start by using a synthetic table within

[...]

Don't let your PostgreSQL get exploited
Posted by Thom Brown on 2025-04-04 at 08:30

As you may have heard, there are reportedly over 1,500 PostgreSQL servers that have been exploited to mine Bitcoin. And your server could be next if you haven't taken precautions. Firstly, you need to update to the latest minor release, just so no known exploitable bugs exist on your system. But regardless of whether you update, your PostgreSQL instance could still be misconfigured in a way that would allow unwelcome visitors access, so you need to make sure you shore up your defenses. Here are some steps you should take.

1. Get it off the internet

Your database shouldn't be accessible from the open internet. People run port scanners on public IP addresses all the time, and yours will eventually be found. Edit postgresql.conf and ensure listen_addresses isn't set to "*", because this tells PostgreSQL to accept connections from any network interface. Instead, make sure it only listens to connections from the local network.

For example:

listen_addresses = 'localhost,192.168.0.133'

2. Nail down authentication

Make sure you have no "trust" entries in pg_hba.conf, as these allow users to access the database unchallenged and unauthenticated. That's the absolute minimum. Really, you should have very restricted entries: ones that allow certain users to access certain databases from certain sources, requiring them to authenticate with a password or certificate, and over an SSL connection. And if using a password, it should be hashed with SCRAM rather than MD5, because MD5 is considered weak nowadays and has been compromised. You can make sure all passwords are hashed using SCRAM by setting the following in postgresql.conf:

password_encryption = scram-sha-256

You can find all users that have an MD5 password with the following query:

SELECT usename
FROM pg_shadow
WHERE passwd LIKE 'md5%';

If this returns any users, it is strongly recommended that a new password be set; it will be hashed correctly with SCRAM if password_encryption is set as described above in postgresql.conf.

ALTER USER myuser WITH PASSWO
[...]

PostgreSQL 18: part 3 or CommitFest 2024-11
Posted by Pavel Luzanov in Postgres Professional on 2025-04-03 at 22:45

We continue the series of articles about new patches coming to PostgreSQL 18, this one covering the news of the November CommitFest.

If you missed the previous reviews, you can check them out here: 2024-07, 2024-09.

  • initdb: checksum calculation enabled by default
  • Planner: array lookup instead of multiple similar conditions under OR
  • Planner: switching around expressions in DISTINCT clauses
  • GROUPING SETS: HAVING -> WHERE
  • Data type cache entry invalidation and temporary tables
  • Planner: incremental sorting during Merge Join
  • New function array_reverse
  • Functions min and max for the type bytea
  • Parallel worker usage statistics
  • New function pg_ls_summariesdir
  • new contrib module: pg_logicalsnapinspect
  • Improved extension installation error messages
  • NOT NULL constraints in the system catalog
  • A TOAST table for pg_index
  • COPY... FROM and file_fdw: rejected row limit
  • LIKE support with nondeterministic collations
  • TLS v1.3: cipher suites

...

CNPG Recipe 17 - PostgreSQL In-Place Major Upgrades
Posted by Gabriele Bartolini in EDB on 2025-04-03 at 08:31

CloudNativePG 1.26 introduces one of its most anticipated features: declarative in-place major upgrades for PostgreSQL using pg_upgrade. This new approach allows you to upgrade PostgreSQL clusters by simply modifying the imageName in their configuration—just like a minor version update. While it requires brief downtime, it significantly reduces operational overhead, making it ideal for managing large fleets of PostgreSQL databases in Kubernetes. In this article, I will explore how it works, its benefits and limitations, and cover an upgrade of a 2.2TB database.
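
Based on that description, the upgrade boils down to a one-line manifest change. A hedged sketch (field names follow the CloudNativePG Cluster resource; the image tags and sizes are illustrative):

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  # Bumping the major version in imageName (e.g. from a 16.x tag)
  # triggers the declarative in-place major upgrade via pg_upgrade:
  imageName: ghcr.io/cloudnative-pg/postgresql:17.4
  storage:
    size: 10Gi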

Postgres vs. SQL Server: B-Tree Index Differences & the Benefit of Deduplication
Posted by Lukas Fittl on 2025-04-03 at 05:45
When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways. In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree index implementations behave under the hood and how they…

Using pgvector for timeseries data
Posted by Hans-Juergen Schoenig in Cybertec on 2025-04-03 at 05:00

pgvector is a widely adopted extension for PostgreSQL that is often used to handle semantic search, and countless topics and posts deal with it in the context of AI. Recently, we have posted about semantic search in PostgreSQL (see post).

However, pgvector is much more than that - a vector can be anything, and the technology can be applied to other fields, such as timeseries analysis, as well. This article will explain how this works and what can be done to leverage the technology to get a handle on timeseries data.

Using pgvector to analyze stock indexes

For the purpose of this example, we will use some historic timeseries showing the development of the German stock market (DAX):

cybertec=# SELECT  * 
           FROM    stock_data 
           ORDER BY d DESC;
     d      | symbol |        open        |        high        |        low         |       close        |  volume   
------------+--------+--------------------+--------------------+--------------------+--------------------+-----------
 2025-03-20 | ^GDAXI |            23009.5 |    23315.490234375 |     22842.94921875 |    23295.720703125 |         0
 2025-03-19 | ^GDAXI |    23288.060546875 |    23372.080078125 |      23136.5390625 |    23272.150390625 |  79641400
 2025-03-18 | ^GDAXI |     23380.69921875 |    23476.009765625 |    23240.560546875 |    23266.650390625 |  80663300
 2025-03-17 | ^GDAXI |      23154.5703125 |      23154.5703125 |      22933.5703125 |    22998.529296875 |  67152000
 2025-03-14 | ^GDAXI |      22986.8203125 |     23049.48046875 |    22465.119140625 |     22501.33984375 |  93287400
 2025-03-13 | ^GDAXI |       22567.140625 |    22752.830078125 |     22417.51953125 |    22578.099609375 |  78955600
 2025-03-12 | ^GDAXI |     22676.41015625 |     22813.83984375 |     22461.76953125 |    22525.740234375 |  80929100
 2025-03-11 | ^GDAXI |     22328.76953125 |    22835.099609375 |     22258.30078125 |       22680.390625 |  97374800
 2025-03-10 | ^GDAXI |     22620.94921875 |    23164.240234375 |      22519
[...]
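
The excerpt ends here, but the core idea can be sketched: turn a sliding window of closing prices into a vector and use pgvector's distance operator to find dates with similar patterns. The window length of 5 and the query pattern below are assumptions for illustration:

-- 5-day windows of closing prices, compared with L2 distance (<->):
WITH windows AS (
    SELECT d,
           array_agg(close) OVER (
               ORDER BY d
               ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
           ) AS last5
    FROM stock_data
)
SELECT d,
       last5::vector <-> '[22500, 22600, 22550, 22700, 22800]'::vector AS distance
FROM windows
WHERE array_length(last5, 1) = 5   -- skip incomplete leading windows
ORDER BY distance
LIMIT 5;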

Multi-Version Concurrency Control (MVCC) in PostgreSQL: Learning PostgreSQL with Grant
Posted by Grant Fritchey in Redgate on 2025-04-02 at 20:52

It’s a tale as old as time. You want to read data. Your mate wants to write data. You’re stepping on each other’s toes, all the time. When we’re talking about relational data stores, one aspect that makes them what they are is the need to comply with the ACID properties. These are:

  • Atomicity: A transaction fails or completes as a unit
  • Consistency: Once a transaction completes, the database is in a valid, consistent, state
  • Isolation: Each transaction occurs on its own and shouldn’t interfere with the others
  • Durability: Basically, writes are writes and will survive a system crash

A whole lot of effort is then made to build databases that both allow you to meet the necessary ACID properties while simultaneously letting lots of people into your database. PostgreSQL does this through the Multi-version Concurrency Control (MVCC). In this article we’ll discuss what MVCC is and how PostgreSQL deals with concurrency in order to both meet ACID properties and provide a snappy performance profile. Along the way we’ll also be talking once more about the VACUUM process in PostgreSQL (you can read my introduction to the VACUUM here).

Let me start by giving you the short version of what MVCC is, and then the rest of the article explains more details. Basically, PostgreSQL is focused on ensuring, as much as possible, that reads don’t block writes and writes don’t block reads. This is done by only ever inserting rows (tuples): no in-place updates to existing rows, and no physical deletes. Instead, it uses a logical delete mechanism, which we’ll get into. This means that data in motion doesn’t interfere with data at rest, meaning a write doesn’t interfere with a read, so there is less contention & blocking. There’s a lot to how all that works, so let’s get into it.
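
You can watch this row-versioning happen via the hidden system columns; a small sketch:

CREATE TABLE mvcc_demo (id int, val text);
INSERT INTO mvcc_demo VALUES (1, 'a');
-- xmin = transaction that created this row version,
-- xmax = transaction that deleted/updated it (0 while live):
SELECT xmin, xmax, ctid, * FROM mvcc_demo;
UPDATE mvcc_demo SET val = 'b' WHERE id = 1;
-- The UPDATE inserted a new row version (note the new ctid);
-- the old version is dead and will be reclaimed by VACUUM:
SELECT xmin, xmax, ctid, * FROM mvcc_demo;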

Concurrency Modes in PostgreSQL

The world can be a messy place. If everything in a database were ordered, completely in series, including exactly who could access what and when they could access it, we’d nev

[...]

Do not expose port 5432 to the public Internet
Posted by Christophe Pettus in pgExperts on 2025-04-02 at 16:30

Sometimes, we run into a client who has port 5432 exposed to the public Internet, usually as a convenience measure to allow remote applications to access the database without having to go through an intermediate server application.

Do not do this.

This report of a “security issue” in PostgreSQL is alarmist, because it describes a basic brute-force attack on PostgreSQL, attempting to get superuser credentials. Once it does so, it uses the superuser’s access to the underlying filesystem to drop malware payloads.

There’s nothing special about this. You could do this with password-auth ssh.

But it’s one more reason not to expose PostgreSQL’s port to the public. There are others:

  • You open yourself up to a DDoS attack on the database itself. PostgreSQL is not hard to DoS, since each incoming connection forks a new process.
  • There have been, in the past, bugs in PostgreSQL that could cause data corruption even if the incoming connection was not authenticated.

As good policy:

  • Always have PostgreSQL behind a firewall. Ideally, it should have a non-routable private IP address, and only applications that are within your networking infrastructure can get at it.
  • Never allow remote logins by superusers.
  • Make sure your access controls (pg_hba.conf, AWS security groups, etc.) are locked down to the minimum level of access required.
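
For that last point, here is a sketch of what a locked-down pg_hba.conf entry can look like (the database, user, and network are placeholders):

# TYPE    DATABASE  USER     ADDRESS       METHOD
hostssl   appdb     appuser  10.0.1.0/24   scram-sha-256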

Please Welcome Prairie Postgres!
Posted by Henrietta Dombrovskaya on 2025-04-02 at 10:44

Dear Postgres Community! In this post, I want to formally introduce Prairie Postgres, a project that has kept me really busy for the past several months.

Recently, you’ve seen a lot of me online, talking about things I want to do and things I am already doing. Thinking about all these activities holistically, I can group them all around three common themes: Postgres “elementary education,” connecting with application developer communities, and connecting with educational institutions. The overall goal is broader promotion of knowledge about Postgres and its best practices.

Why do I think this is important? Postgres is undoubtedly the most beloved and most well-known open-source database, but too many people jump into the ocean of possibilities without being equipped with even basic knowledge. This creates a barrier to adoption and results in disappointment and in failing to get the most out of PostgreSQL.

The most important mission of Prairie Postgres is creating Postgres educational opportunities in the US Midwest. We want to focus on the Midwest because other parts of the US have PostgreSQL User Groups and conferences, while the Midwest has a lot of Postgres but not enough community activities. We know that there are many people in the region working with Postgres, and we want to reach out to them. I often hear the sentiment that we see “all familiar faces” at Postgres events, and this includes not only the speakers but the attendees as well, and we want to change that.

We are making our very first steps, but even these first steps wouldn’t be possible without tremendous help from many people. Firstly, I want to thank Pat Wright who gave me the idea to create an NFP, and then helped navigate the legal field – I can’t even imagine how long it would have taken us without his help.

Next I want to thank Dian Fay and Anna Bailliekova, Prairie Postgres co-founders, who jumped into this adventure and took upon themselves the most important task of never letting me make any dec

[...]

Postgres backend statistics (Part 2): WAL statistics
Posted by Bertrand Drouvot on 2025-04-02 at 05:26

Introduction

PostgreSQL 18 should (there is always a risk of something being reverted before the GA release) include these commits. The first is Add data for WAL in pg_stat_io and backend statistics:

commit a051e71e28a12342a4fb39a3c149a197159f9c46
Author: Michael Paquier 
Date:   Tue Feb 4 16:50:00 2025 +0900

Add data for WAL in pg_stat_io and backend statistics

This commit adds WAL IO stats to both pg_stat_io view and per-backend IO
statistics (pg_stat_get_backend_io()).
.
.

and Add WAL data to backend statistics:

commit 76def4cdd7c2b32d19e950a160f834392ea51744
Author: Michael Paquier 
Date:   Tue Mar 11 09:04:11 2025 +0900

Add WAL data to backend statistics

This commit adds per-backend WAL statistics, providing the same
information as pg_stat_wal, except that it is now possible to know how
much WAL activity is happening in each backend rather than an overall
aggregate of all the activity.  Like pg_stat_wal, the implementation
relies on pgWalUsage, tracking the difference of activity between two
reports to pgstats.

This data can be retrieved with a new system function called
pg_stat_get_backend_wal(), that returns one tuple based on the PID
provided in input.  Like pg_stat_get_backend_io(), this is useful when
joined with pg_stat_activity to get a live picture of the WAL generated
for each running backend, showing how the activity is [un]balanced.
.
.

It means that:

  • WAL IO statistics are available per backend through the pg_stat_get_backend_io() function (already introduced in Postgres backend statistics (Part 1))
  • WAL statistics are available per backend through the pg_stat_get_backend_wal() function

Together, these let us see the WAL activity in each backend.

Let’s look at some examples

Thanks to the pg_stat_get_backend_io() function, we can:

Retrieve the WAL IO statistics for my backend

db1=# SELECT backend_type, object, context, reads, read_bytes, read_time, writes, write_bytes, write_time, fsyncs, fsync_time  FROM pg
[...]
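
Based on the commit messages above, the per-backend WAL numbers can presumably be joined against pg_stat_activity along these lines (a sketch against the PostgreSQL 18 development tree; names may still change before GA):

SELECT a.pid,
       a.backend_type,
       w.*
FROM pg_stat_activity AS a,
     LATERAL pg_stat_get_backend_wal(a.pid) AS w
WHERE a.backend_type = 'client backend';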

2025 Postgres Extensions Mini Summit Two
Posted by David Wheeler in Tembo on 2025-04-01 at 19:32

Last Wednesday, March 26, we hosted the second of five virtual Extension Mini-Summits in the lead up to the big one at the Postgres Development Conference (PGConf.dev) on May 13 in Montreal, Canada. Peter Eisentraut gave a very nice presentation on the history, design decisions, and problems solved by “Implementing an Extension Search Path”. That talk, plus another 10-15m of discussion, is now available for your viewing pleasure:

If you’d like to attend any of the next three Mini-Summits, join the Meetup!

Once again, with many thanks again to Floor Drees for the effort, here’s the transcript from the session.

Introduction

Floor Drees introduced the organizers:

Peter Eisentraut, contributor to PostgreSQL development since 1999, talked about implementing an extension search path.

The stream and the closed captions available for the recording are supported by PGConf.dev and their gold level sponsors, Google, AWS, Huawei, Microsoft, and EDB.

Implementing an extension search path

Peter: Thank you for having me!

I’m gonna talk about a current project by me and a couple of people I have worked with, and that will hopefully ship with Postgres 18 in a few months.

So, what do I know about extensions? I’m a Postgres core developer, but I’ve developed a few extensions in my time, here’s a list of extensions that I’ve built over the years.

[...]

Postgres on Kubernetes for the Reluctant DBA
Posted by Karen Jex in Crunchy Data on 2025-04-01 at 17:29

Slides and transcript from my talk, "Postgres on Kubernetes for the Reluctant DBA", at Data on Kubernetes Day Europe in London on 1 April 2025.



Introduction

This is me!
As you can see from the diagram representing my career so far (and as you already know if you've read my posts or watched my talks before), I have a database background.
I was a DBA for 20 years before I moved into database consultancy, and I’m now a senior solutions architect at Crunchy Data, working with customers to design, implement and manage their database environments, almost exclusively on Kubernetes.




Over the past few years, I’ve given a lot of talks about running Postgres on Kubernetes, and I work with a lot of customers who are at various points on their databases-on-Kubernetes journey.
The questions from the audience and hallway conversations at conferences are always interesting, and tend to come from one of two groups of people:

  1. People who are deep into running databases on kubernetes and are looking for answers to some tricky technical issue or architectural question.
  2. Self-proclaimed “old-school DBAs” who still aren’t convinced that running databases in containers is a good idea.

I prepared this talk especially for that 2nd group of people, so I hope some of you are in the audience today!
And don’t forget, as you saw on the previous slide, I come from an old-school DBA background, so I’ve gone through that process myself.




To get an idea of who was in the audience, I asked the question: “What’s your main area of responsibility in your current role?” There was a reasonably even split between:

  • Databases
  • System or Platform Administration
  • Devops or Automation
  • Development



The plan for the session was:

  1. Some Databases on Kubernetes background.
  2. Some audience participation.*
  3. A look at some of the main concerns that DBAs have about running DBs on Kubernetes.
  4. Some of the challenges you might encounter, and how you can overc
[...]

Swiss Database Synergy Day 2025: Oracle and PostgreSQL User Groups Unite
Posted by Laurenz Albe in Cybertec on 2025-04-01 at 05:00

Since I sang the praises of the 2024 Swiss PGDay, my people asked me to report from yet another small conference: the Swiss Database Synergy Day 2025, organized by the Swiss Oracle and PostgreSQL user groups.

A joint event with Oracle and PostgreSQL? You must be kidding!

It may indeed be surprising to hear of such an event. But then, PostgreSQL people have held talks at the yearly conference of the German Oracle User Group. The Austrian Oracle User Group held an event dedicated to the interplay of Oracle and PostgreSQL. Furthermore, previous editions of the Austrian PGDay were co-sponsored by an Oracle support company. And this list only covers events in my part of the world.

In addition, there is this nice reminder that the Polish Oracle User Group sent my boss after he had talked there a while ago:

[Image: Polish Oracle Group picture]

Still, there is a difference between PostgreSQL people talking at an Oracle conference and a joint event. I can understand why PostgreSQL people are interested in Oracle conferences: after all, a lot of our business comes from Oracle users who have fallen out of love with their database. But what motivates the Oracle users? Are they just curious and open-minded? Do they think the future belongs to PostgreSQL? Or is it just their way of showing dissatisfaction with Oracle?

Then there is the ambivalent word “synergy” in “Database Synergy Day”. Fundamentally, “synergy” is something positive, but in today's business speak it usually refers to merging two companies or departments. Rather than increasing productivity, that means that the two entities spend a year fighting for dominance until one of them ends up on top. All the while, productivity is near zero. I sincerely hope that I am not about to witness that kind of synergy...

Well, I'm going to find out.

The speakers' dinner

It is customary to invite the speakers to a dinner as a “thank you” for their (unpaid) effort. This conference is no exception. As I said in my article on the Swiss PGDay 2024, a PostgreSQL feels a

[...]

Akshat Jaimini
Posted by Andreas 'ads' Scherbaum on 2025-03-31 at 14:00
PostgreSQL Person of the Week Interview with Akshat Jaimini: Hi! I am Akshat Jaimini, a final-year undergraduate currently based out of Dehradun, India. I am working as an SDE Intern at a fintech firm called ION Group and will graduate in June 2025.

PgPedia Week, 2025-03-30
Posted by Ian Barwick on 2025-03-30 at 20:59

A lot going on this week (well, last week - this edition of PgPedia Week is unfortunately delayed for personal reasons), as CommitFest 52 (2025-03) - the final CommitFest in the PostgreSQL 18 development cycle - draws to an end, though as usual activity will continue right up until code freeze on April 8th.

PostgreSQL 18 changes

New features:

  • Dynamically loaded shared libraries can provide user-queryable metadata
  • Support for NOT NULL constraints on virtual generated columns
  • New conflict type multiple_unique_conflicts added
  • New mathematical functions gamma() and lgamma()
  • pg_get_loaded_modules() reports information about loaded modules
  • max_files_per_process now controls additionally opened files
  • pg_recvlogical option --drop-slot no longer requires provision of --dbname
  • initdb option --no-sync-data-files added
  • psql: \dx now shows the default extension version; the \watch default interval is now configurable
  • pg_createsubscriber option --all added
  • pg_dump option --sequence-data added; options --with-data / --with-schema / --with-statistics added
  • pg_upgrade option --swap added
  • amcheck function gin_index_check() added
  • dblink: SCRAM authentication pass-through
  • pg_overexplain: new contrib module for expanding EXPLAIN output

Reversions:

  • query_id_squash_values removed

more...

How to fix Hibernate bug by conditional index
Posted by Pavel Stehule on 2025-03-30 at 05:59

Yesterday I noticed a significant growth in sequentially read tuples. After some investigation I found a query with a strange predicate:

WHERE 1 = case when pb1_0.parent_id is not null then 0 end

It is really strange, and I had to ask who wrote it.

The reply was: Hibernate. It is a transformation of the predicate parent_id = (?) when the list of ids is empty.

Unfortunately, PostgreSQL is not able to detect that this predicate is always false, so the repeated execution ended in repeated full scans.

Fortunately, Postgres has a simple workaround - a conditional index:

CREATE INDEX ON pb (id)
  WHERE 1 = CASE WHEN parent_id IS NOT NULL THEN 0 END;

This index is always empty, so an index scan is fast.

This issue should be fixed in more recent versions of Hibernate, where the predicate 1=0 is generated instead.
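
To reproduce the effect on a toy schema (the table and column names are assumed from the predicate above):

CREATE TABLE pb (id bigint PRIMARY KEY, parent_id bigint);
CREATE INDEX pb_always_false_idx ON pb (id)
  WHERE 1 = CASE WHEN parent_id IS NOT NULL THEN 0 END;
-- With the partial index in place, EXPLAIN for the Hibernate-generated
-- query should show a (trivially empty) scan of pb_always_false_idx
-- instead of a full sequential scan:
EXPLAIN SELECT * FROM pb
WHERE 1 = CASE WHEN parent_id IS NOT NULL THEN 0 END;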

Transparent Data Encryption for PostgreSQL Release Candidate is Here!
Posted by Jan Wieremjewicz in Percona on 2025-03-28 at 13:23
PostgreSQL is, without a doubt, one of the most popular open source databases in the world. Why? Well, there are many reasons, but if I had to pick just one, I’d say it’s extensibility. PostgreSQL isn’t just a database; it’s an ecosystem of extensions that can transform it to tackle any upcoming challenges. By enabling […]

Converting JSON documents to relational tables
Posted by Regina Obe in PostGIS on 2025-03-28 at 04:03

JSON is one of the most popular ways of disseminating data between systems, and it is probably the most common format offered by web services. PostgreSQL is a database perfectly suited to grabbing that data and transforming it into a more structured relational format, and all of this can be done directly in the database. We'll go over some ways to load and restructure JSON data.
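
As a small taste of the techniques the article covers, jsonb_to_recordset turns an array of JSON documents into typed rows:

-- Each object in the array becomes one row, with the column
-- names and types declared in the AS clause:
SELECT *
FROM jsonb_to_recordset(
       '[{"name": "alice", "age": 30},
         {"name": "bob",   "age": 25}]'::jsonb
     ) AS person(name text, age int);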

Continue reading "Converting JSON documents to relational tables"
