ANSI SQL Implementation Guide PostgreSQL
by
Stefan Simkovics
Paul Petersgasse 36
A - 2384 Breitenfurt
It was the author’s task to add the support for the two missing features to the existing source
code. Before the implementation could be started an intensive study of the relevant parts
of the SQL92 standard and the implementation of the existing features of PostgreSQL had
been necessary. This document presents not only the results of the implementation but
also the knowledge collected while studying the SQL language and the source code of the
already existing features.
Chapter 1 presents an overview of the SQL92 standard. It gives a description of the
relational data model and the theoretical (mathematical) background of SQL. Next the
SQL language itself is described. The most important SQL statements are presented and a
lot of examples are included for better understanding. The information given in this chapter
has mainly been taken from the books [DATE96], [DATE94] and [ULL88].
Chapter 2 gives a description of how to use PostgreSQL. First it is shown how the
backend (server) can be started and how a connection from a client to the server can be
established. Next some basic database management tasks like creating a database, creating
a table etc. are described. Finally some of PostgreSQL’s special features like user defined
functions, user defined types, the rule system etc. are presented and illustrated using a lot of
examples. The information given in chapter 2 has mainly been taken from the PostgreSQL
documentation (see [LOCK98]) and the PostgreSQL manual pages and was verified by the
author through various examples which have also been included.
Chapter 3 concentrates on the internal structure of the PostgreSQL backend. First the
stages that a query has to pass in order to retrieve a result are described using a lot of
figures to illustrate the involved data structures. The information given in that part of
chapter 3 has been collected while intensively studying the source code of the relevant
parts of PostgreSQL. This intensive and detailed examination of the source code had been
necessary to be able to add the missing functionality. The knowledge gathered during that
period of time has been summarized here in order to make it easier for programmers who
are new to PostgreSQL to find their way in.
The following sections cover the author’s ideas for the implementation of the two missing
SQL features mentioned above and a description of the implementation itself.
Section 3.7 deals with the implementation of the having logic. As mentioned earlier the
having logic is one of the two missing SQL92 features that the author had to implement.
The first part of the section describes how aggregate functions are realized in PostgreSQL;
after that the enhancements applied to the code of the planner/optimizer and the executor
in order to realize the new functionality are described. The functions and data structures
used and added to the source code are also covered here.
Section 3.8 deals with the implementation of the intersect and except functionality
which was the second missing SQL92 feature that had to be added by the author. First
a theoretical description of the basic idea is given. The intersect and except logic is
implemented using a query rewrite technique (i.e. a query involving an intersect and/or except
operation is rewritten to a semantically equivalent form that does not use these set
operations any more). After presenting the basic idea the changes made to the parser and the
rewrite system are described and the added functions and data structures are presented.
Contents
1 SQL
1.1 The Relational Data Model
1.1.1 Formal Notion of the Relational Data Model
Domains vs. Data Types
1.2 Operations in the Relational Data Model
1.2.1 Relational Algebra
1.2.2 Relational Calculus
Tuple Relational Calculus
1.2.3 Relational Algebra vs. Relational Calculus
1.3 The SQL Language
1.3.1 Select
Simple Selects
Joins
Aggregate Operators
Aggregation by Groups
Having
Subqueries
Union, Intersect, Except
1.3.2 Data Definition
Create Table
Data Types in SQL
Create Index
Create View
Drop Table, Drop Index, Drop View
1.3.3 Data Manipulation
Insert Into
Update
Delete
1.3.4 System Catalogs
1.3.5 Embedded SQL
Chapter 1
SQL
SQL has become one of the most popular relational query languages all over the world. The
name "SQL" is an abbreviation for Structured Query Language. In 1974 Donald Chamberlin
and others defined the language SEQUEL (Structured English Query Language) at
IBM Research. This language was first implemented in an IBM prototype called SEQUEL-
XRM in 1974-75. In 1976-77 a revised version of SEQUEL called SEQUEL/2 was defined
and the name was changed to SQL subsequently.
A new prototype called System R was developed by IBM in 1977. System R imple-
mented a large subset of SEQUEL/2 (now SQL) and a number of changes were made to
SQL during the project. System R was installed in a number of user sites, both internal
IBM sites and also some selected customer sites. Thanks to the success and acceptance of
System R at those user sites IBM started to develop commercial products that implemented
the SQL language based on the System R technology.
Over the next years IBM and also a number of other vendors announced SQL products
such as SQL/DS (IBM), DB2 (IBM), ORACLE (Oracle Corp.), DG/SQL (Data General
Corp.) and SYBASE (Sybase Inc.).
SQL is also an official standard now. In 1982 the American National Standards Institute
(ANSI) chartered its Database Committee X3H2 to develop a proposal for a standard
relational language. This proposal was ratified in 1986 and consisted essentially of the IBM
dialect of SQL. In 1987 this ANSI standard was also accepted as an international standard
by the International Organization for Standardization (ISO). This original standard version
of SQL is often referred to, informally, as "SQL/86". In 1989 the original standard was
extended and this new standard is often, again informally, referred to as "SQL/89". Also in
1989, a related standard called Database Language Embedded SQL was developed.
The ISO and ANSI committees have been working for many years on the definition
of a greatly expanded version of the original standard, referred to informally as
"SQL2" or "SQL/92". This version became a ratified standard - "International Standard
ISO/IEC 9075:1992, Database Language SQL" - in late 1992. "SQL/92" is the version
normally meant when people refer to "the SQL standard". A detailed description of "SQL/92"
is given in [DATE96]. At the time of writing this document a new standard informally
referred to as "SQL3" is under development. It is planned to make SQL a Turing-complete
language, i.e. all computable queries (e.g. recursive queries) will be possible. This is a very
complex task and therefore the completion of the new standard cannot be expected before
1999.
SUPPLIER is a table storing the number (SNO), the name (SNAME) and the city
(CITY) of a supplier.
PART is a table storing the number (PNO), the name (PNAME) and the price (PRICE)
of a part.
SELLS stores information about which part (PNO) is sold by which supplier (SNO).
It serves in a sense to connect the other two tables together.
The tables PART and SUPPLIER may be regarded as entities and SELLS may be regarded
as a relationship between a particular part and a particular supplier.
As we will see later, SQL operates on tables like the ones just defined but before that
we will study the theory of the relational model.
Definition 1.2 A Relation is any subset of the Cartesian product of one or more domains:
R ⊆ D1 × D2 × … × Dk.
The Relational Algebra which is an algebraic notation, where queries are expressed
by applying specialized operators to the relations.
The Relational Calculus which is a logical notation, where queries are expressed by
formulating some logical restrictions that the tuples in the answer must satisfy.
SELECT (σ): extracts tuples from a relation that satisfy a given restriction. Let R
be a table that contains an attribute A. σ_A=a(R) = {t ∈ R ∣ t(A) = a}, where t
denotes a tuple of R and t(A) denotes the value of attribute A of tuple t.
PROJECT (π): extracts specified attributes (columns) from a relation. Let R be a
relation that contains an attribute X. π_X(R) = {t(X) ∣ t ∈ R}, where t(X) denotes
the value of attribute X of tuple t.
PRODUCT (×): builds the Cartesian product of two relations. Let R be a table with
arity k1 and let S be a table with arity k2. R × S is the set of all (k1 + k2)-tuples
whose first k1 components form a tuple in R and whose last k2 components form a
tuple in S.
UNION (∪): builds the set-theoretic union of two tables. Given the tables R and S
(both must have the same arity), the union R ∪ S is the set of tuples that are in R or
S or both.
INTERSECT (∩): builds the set-theoretic intersection of two tables. Given the tables
R and S, R ∩ S is the set of tuples that are in R and in S. We again require that R
and S have the same arity.
DIFFERENCE (− or ∖): builds the set difference of two tables. Let R and S again
be two tables with the same arity. R − S is the set of tuples in R but not in S.
JOIN (⋈): connects two tables by their common attributes. Let R be a table
with the attributes A, B and C and let S be a table with the attributes C, D and E.
There is one attribute common to both relations, the attribute C. R ⋈ S =
π_R.A,R.B,R.C,S.D,S.E(σ_R.C=S.C(R × S)). What are we doing here? We first calculate
the Cartesian product R × S. Then we select those tuples whose values for
the common attribute C are equal (σ_R.C=S.C). Now we have a table that contains the
attribute C two times and we correct this by projecting out the duplicate column.
Example 1.1 Let’s have a look at the tables that are produced by evaluating the steps
necessary for a join.
Let the following two tables be given:
R A | B | C S C | D | E
---+---+--- ---+---+---
1 | 2 | 3 3 | a | b
4 | 5 | 6 6 | c | d
7 | 8 | 9
First we calculate the Cartesian product R × S and get:
R x S A | B | R.C | S.C | D | E
---+---+-----+-----+---+---
1 | 2 | 3 | 3 | a | b
1 | 2 | 3 | 6 | c | d
4 | 5 | 6 | 3 | a | b
4 | 5 | 6 | 6 | c | d
7 | 8 | 9 | 3 | a | b
7 | 8 | 9 | 6 | c | d
After the selection σ_R.C=S.C(R × S) we get:
A | B | R.C | S.C | D | E
---+---+-----+-----+---+---
1 | 2 | 3 | 3 | a | b
4 | 5 | 6 | 6 | c | d
To remove the duplicate column S.C we project it out by the following operation:
π_R.A,R.B,R.C,S.D,S.E(σ_R.C=S.C(R × S)) and get:
A | B | C | D | E
---+---+---+---+---
1 | 2 | 3 | a | b
4 | 5 | 6 | c | d
DIVIDE (÷): builds the division of two tables. Let R be a table with the attributes
A, B, C and D and let S be a table with the attributes C and D. Then we define the
division as:
R ÷ S = {t ∣ ∀ ts ∈ S ∃ tr ∈ R such that tr(A,B) = t ∧ tr(C,D) = ts}
where tr(x,y) denotes a tuple of table R that consists only of the components x and
y. Note that the tuple t only consists of the components A and B of relation R.
R A | B | C | D S C | D
---+---+---+--- ---+---
a | b | c | d c | d
a | b | e | f e | f
b | c | e | f
e | d | c | d
e | d | e | f
a | b | d | e
R ÷ S is derived as
A | B
---+---
a | b
e | d
For a more detailed description and definition of the relational algebra refer to [ULL88] or
[DATE94].
Example 1.3 Recall that we formulated all those relational operators to be able to retrieve
data from the database. Let’s return to our example of section 1.2 where someone wanted
to know the names of all suppliers that sell the part ’Screw’. This question can be answered
using relational algebra by the following operation:
π_SUPPLIER.SNAME(σ_PART.PNAME=’Screw’(SUPPLIER ⋈ SELLS ⋈ PART))
We call such an operation a query. If we evaluate the above query against the tables from
figure 1.1 The suppliers and parts database we will obtain the following result:
SNAME
-------
Smith
Adams
The Domain Relational Calculus (DRC), where variables stand for components (at-
tributes) of the tuples.
The Tuple Relational Calculus (TRC), where variables stand for tuples.
We want to discuss the tuple relational calculus only because it is the one underlying
most relational languages. For a detailed discussion on DRC (and also TRC) see [DATE94]
or [ULL88].
The queries used in TRC are of the following form:
{x(A) ∣ F(x)}
where x is a tuple variable, A is a set of attributes and F is a formula. The resulting relation
consists of all tuples t(A) that satisfy F(t).
Example 1.4 If we want to answer the question from example 1.3 using TRC we formulate
the following query:
{x(SNAME) ∣ x ∈ SUPPLIER ∧
 ∃ y ∈ SELLS ∃ z ∈ PART (y(SNO) = x(SNO) ∧
 z(PNO) = y(PNO) ∧
 z(PNAME) = ’Screw’)}
Evaluating the query against the tables from figure 1.1 The suppliers and parts database
again leads to the same result as in example 1.3.
1.3.1 Select
The most often used command in SQL is the SELECT statement that is used to retrieve
data. The syntax is:
SELECT [ALL|DISTINCT]
{ * | <expr_1> [AS <c_alias_1>] [, ...
[, <expr_k> [AS <c_alias_k>]]]}
FROM <table_name_1> [t_alias_1]
[, ... [, <table_name_n> [t_alias_n]]]
[WHERE condition]
[GROUP BY <name_of_attr_i>
[,... [, <name_of_attr_j>]] [HAVING condition]]
[{UNION | INTERSECT | EXCEPT} SELECT ...]
[ORDER BY <name_of_attr_i> [ASC|DESC]
[, ... [, <name_of_attr_j> [ASC|DESC]]]];
Now we will illustrate the complex syntax of the SELECT statement with various exam-
ples. The tables used for the examples are defined in figure 1.1 The suppliers and parts
database.
Simple Selects
Example 1.5 Here are some simple examples using a SELECT statement:
To retrieve all tuples from table PART where the attribute PRICE is greater than 10
we formulate the following query
SELECT *
FROM PART
WHERE PRICE > 10;
and get the table:
PNO | PNAME | PRICE
-----+---------+--------
3 | Bolt | 15
4 | Cam | 25
Using "*" in the SELECT statement will deliver all attributes from the table. If we want to
retrieve only the attributes PNAME and PRICE from table PART we use the statement:
SELECT PNAME, PRICE
FROM PART
WHERE PRICE > 10;
PNAME | PRICE
--------+--------
Bolt | 15
Cam | 25
Note that the SQL SELECT corresponds to the ”projection” in relational algebra not to the
”selection” (see section 1.2.1 Relational Algebra).
The qualifications in the WHERE clause can also be logically connected using the
keywords OR, AND and NOT:
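A query of the following shape would produce the single ’Bolt’ row shown below (a reconstruction; the original qualification may have been worded differently):

```sql
SELECT PNAME, PRICE
FROM PART
WHERE PNAME = 'Bolt' AND
      (PRICE = 0 OR PRICE <= 15);
```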
PNAME | PRICE
--------+--------
Bolt | 15
Arithmetic operations may be used in the selectlist and in the WHERE clause. For example
if we want to know how much it would cost if we take two pieces of a part we could use
the following query:
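Judging from the result below (the column DOUBLE holds twice the price, and the part ’Cam’ is excluded), the query would look something like this; the qualification PRICE * 2 < 50 is an assumption consistent with the shown rows:

```sql
SELECT PNAME, PRICE * 2 AS DOUBLE
FROM PART
WHERE PRICE * 2 < 50;
```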
and we get:
PNAME | DOUBLE
--------+---------
Screw | 20
Nut | 16
Bolt | 30
Note that the word DOUBLE after the keyword AS is the new title of the second column.
This technique can be used for every element of the selectlist to assign a new title to the
resulting column. This new title is often referred to as alias. The alias cannot be used
throughout the rest of the query.
Joins
Example 1.6 The following example shows how joins are realized in SQL:
To join the three tables SUPPLIER, PART and SELLS over their common attributes
we formulate the following statement:
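A statement of the following shape joins the three tables over their common attributes (a sketch consistent with the tables of figure 1.1; the selected columns are an assumption):

```sql
SELECT S.SNAME, P.PNAME
FROM SUPPLIER S, PART P, SELLS SE
WHERE S.SNO = SE.SNO AND
      P.PNO = SE.PNO;
```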
Aggregate Operators
SQL provides aggregate operators (e.g. AVG, COUNT, SUM, MIN, MAX) that take the
name of an attribute as an argument. The value of the aggregate operator is calculated over
all values of the specified attribute (column) of the whole table. If groups are specified in
the query the calculation is done only over the values of a group (see next section).
Example 1.7 If we want to know the average cost of all parts in table PART we use the
following query:
SELECT AVG(PRICE) AS AVG_PRICE
FROM PART;
The result is:
AVG_PRICE
-----------
14.5
If we want to know how many parts are stored in table PART we use the statement:
SELECT COUNT(PNO)
FROM PART;
and get:
COUNT
-------
4
Aggregation by Groups
SQL allows one to partition the tuples of a table into groups. The aggregate operators
described above can then be applied to the groups (i.e. the value of the aggregate operator is
no longer calculated over all the values of the specified column but over the values of a
group; thus the aggregate operator is evaluated individually for every group).
The partitioning of the tuples into groups is done by using the keywords GROUP BY
followed by a list of attributes that define the groups. If we have GROUP BY A1, …, Ak
we partition the relation into groups, such that two tuples are in the same group if and only
if they agree on all the attributes A1, …, Ak.
Example 1.8 If we want to know how many parts are sold by every supplier we formulate
the query:
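This is presumably the query of example 1.9 below without its HAVING clause, i.e. something like:

```sql
SELECT S.SNO, S.SNAME, COUNT(SE.PNO)
FROM SUPPLIER S, SELLS SE
WHERE S.SNO = SE.SNO
GROUP BY S.SNO, S.SNAME;
```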
and get:
Next we partition the tuples into groups by putting all tuples together that agree on both
attributes S.SNO and S.SNAME:
In our example we got four groups and now we can apply the aggregate operator COUNT
to every group leading to the total result of the query given above.
Note that for the result of a query using GROUP BY and aggregate operators to make sense,
the attributes grouped by must also appear in the selectlist. All further attributes not
appearing in the GROUP BY clause can only be selected by using an aggregate function. On
the other hand you cannot use aggregate functions on attributes appearing in the GROUP
BY clause.
Having
The HAVING clause works much like the WHERE clause and is used to consider only
those groups satisfying the qualification given in the HAVING clause. The expressions
allowed in the HAVING clause must involve aggregate functions. Every expression using
only plain attributes belongs to the WHERE clause. On the other hand every expression
involving an aggregate function must be put into the HAVING clause.
Example 1.9 If we want only those suppliers selling more than one part we use the query:
SELECT S.SNO, S.SNAME, COUNT(SE.PNO)
FROM SUPPLIER S, SELLS SE
WHERE S.SNO = SE.SNO
GROUP BY S.SNO, S.SNAME
HAVING COUNT(SE.PNO) > 1;
and get:
SNO | SNAME | COUNT
-----+-------+-------
1 | Smith | 2
3 | Adams | 2
4 | Blake | 3
Subqueries
In the WHERE and HAVING clauses the use of subqueries (subselects) is allowed in every
place where a value is expected. In this case the value must be derived by evaluating the
subquery first. The usage of subqueries extends the expressive power of SQL.
Example 1.10 If we want to know all parts having a greater price than the part named
’Screw’ we use the query:
SELECT *
FROM PART
WHERE PRICE > (SELECT PRICE FROM PART
WHERE PNAME=’Screw’);
The result is:
PNO | PNAME | PRICE
-----+---------+--------
3 | Bolt | 15
4 | Cam | 25
When we look at the above query we can see the keyword SELECT two times. The first
one at the beginning of the query - we will refer to it as outer SELECT - and the one in
the WHERE clause which begins a nested query - we will refer to it as inner SELECT.
For every tuple of the outer SELECT the inner SELECT has to be evaluated. After every
evaluation we know the price of the tuple named ’Screw’ and we can check if the price of
the current tuple is greater.
If we want to know all suppliers that do not sell any part (e.g. to be able to remove
these suppliers from the database) we use:
SELECT *
FROM SUPPLIER S
WHERE NOT EXISTS
(SELECT * FROM SELLS SE
WHERE SE.SNO = S.SNO);
In our example the result will be empty because every supplier sells at least one part.
Note that we use S.SNO from the outer SELECT within the WHERE clause of the inner
SELECT. As described above the subquery is evaluated for every tuple from the outer query,
i.e. the value for S.SNO is always taken from the current tuple of the outer SELECT.
Union, Intersect, Except
These operations calculate the union, the intersection and the set theoretic difference of the
tuples derived by two subqueries. The only tuple returned by both parts of such an intersect
query is the one that satisfies the qualifications of both subqueries.
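An INTERSECT query over the example tables could look like the following; the qualifications here are illustrative assumptions, not necessarily the original ones:

```sql
-- Hypothetical qualifications: only the supplier satisfying
-- both subqueries (here the one with SNO = 2) is returned.
SELECT S.SNO, S.SNAME, S.CITY
  FROM SUPPLIER S
 WHERE S.SNO > 1
INTERSECT
SELECT S.SNO, S.SNAME, S.CITY
  FROM SUPPLIER S
 WHERE S.SNO < 3;
```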
Create Table
The most fundamental command for data definition is the one that creates a new relation (a
new table). The syntax of the CREATE TABLE command is:
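Following the notation of the other syntax boxes in this section, the command has roughly this shape (a sketch of the SQL92 core form, without constraint clauses):

```sql
CREATE TABLE <table_name>
    (<name_of_attr_1> <type_of_attr_1>
     [, <name_of_attr_2> <type_of_attr_2>
     [, ...]]);
```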
Example 1.12 To create the tables defined in figure 1.1 the following SQL statements are
used:
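Statements of the following kind create the three tables; the exact column widths and the DECIMAL precision are assumptions consistent with the example data:

```sql
CREATE TABLE SUPPLIER
    (SNO   INTEGER,
     SNAME VARCHAR(20),
     CITY  VARCHAR(20));

CREATE TABLE PART
    (PNO   INTEGER,
     PNAME VARCHAR(20),
     PRICE DECIMAL(4,2));

CREATE TABLE SELLS
    (SNO INTEGER,
     PNO INTEGER);
```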
DECIMAL (p, q): signed packed decimal number of p digits precision with assumed
q of them right to the decimal point (15 ≥ p ≥ q ≥ 0). If q is omitted it is
assumed to be 0.
FLOAT: signed doubleword floating point number.
CHAR(n): fixed length character string of length n.
VARCHAR(n): varying length character string of maximum length n.
Create Index
Indices are used to speed up access to a relation. If a relation R has an index on attribute
A then we can retrieve all tuples t having t(A) = a in time roughly proportional to the
number of such tuples t rather than in time proportional to the size of R.
To create an index in SQL the CREATE INDEX command is used. The syntax is:
CREATE INDEX <index_name>
ON <table_name> ( <name_of_attribute> );
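Applied to the example database, an index on attribute SNAME of table SUPPLIER would be created like this (the index name I is chosen arbitrarily):

```sql
CREATE INDEX I ON SUPPLIER (SNAME);
```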
Create View
A view may be regarded as a virtual table, i.e. a table that does not physically exist in the
database but looks to the user as if it did. By contrast, when we talk of a base table there
is really a physically stored counterpart of each row of the table somewhere in the physical
storage.
Views do not have their own, physically separate, distinguishable stored data. Instead,
the system stores the definition of the view (i.e. the rules about how to access physically
stored base tables in order to materialize the view) somewhere in the system catalogs (see
section 1.3.4 System Catalogs). For a discussion on different techniques to implement
views refer to section 3.4.1 Techniques To Implement Views.
In SQL the CREATE VIEW command is used to define a view. The syntax is:
CREATE VIEW <view_name>
AS <select_stmt>
where <select_stmt> is a valid select statement as defined in section 1.3.1. Note that
the <select_stmt> is not executed when the view is created. It is just stored in the
system catalogs and is executed whenever a query against the view is made.
Example 1.14 Let the following view definition be given (we use the tables from figure
1.1 The suppliers and parts database again):
CREATE VIEW London_Suppliers
AS SELECT S.SNAME, P.PNAME
FROM SUPPLIER S, PART P, SELLS SE
WHERE S.SNO = SE.SNO AND
P.PNO = SE.PNO AND
S.CITY = ’London’;
Now we can use this virtual relation London_Suppliers as if it were another base table:
SELECT *
FROM London_Suppliers
WHERE PNAME = ’Screw’;
SNAME | PNAME
-------+-------
Smith | Screw
To calculate this result the database system has to do a hidden access to the base tables
SUPPLIER, SELLS and PART first. It does so by executing the query given in the view
definition against those base tables. After that the additional qualifications (given in the
query against the view) can be applied to obtain the resulting table.
Drop Table, Drop Index, Drop View
Example 1.15 To destroy the SUPPLIER table use the following statement:
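In standard SQL this is done with the DROP TABLE command:

```sql
DROP TABLE SUPPLIER;
```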
Insert Into
Example 1.16 To insert the first tuple into the relation SUPPLIER of figure 1.1 The
suppliers and parts database we use the following statement:
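A statement of the following shape inserts the tuple; the values are taken from the supplier ’Smith’ in ’London’ as it appears throughout the examples:

```sql
INSERT INTO SUPPLIER (SNO, SNAME, CITY)
    VALUES (1, 'Smith', 'London');
```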
Update
To change one or more attribute values of tuples in a relation the UPDATE command is
used. The syntax is:
UPDATE <table_name>
SET <name_of_attr_1> = <value_1>
[, ... [, <name_of_attr_k> = <value_k>]]
WHERE <condition>;
Example 1.17 To change the value of attribute PRICE of the part ’Screw’ in the relation
PART we use:
UPDATE PART
SET PRICE = 15
WHERE PNAME = ’Screw’;
The new value of attribute PRICE of the tuple whose name is ’Screw’ is now 15.
Delete
To delete a tuple from a particular table use the command DELETE FROM. The syntax is:
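Following the notation of the other syntax boxes, the command has this shape:

```sql
DELETE FROM <table_name>
WHERE <condition>;
```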
Example 1.18 To delete the supplier called ’Smith’ of the table SUPPLIER the following
statement is used:
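In standard SQL the deletion reads:

```sql
DELETE FROM SUPPLIER
WHERE SNAME = 'Smith';
```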
There are queries that cannot be formulated using pure SQL (e.g. recursive queries).
To be able to perform such queries we need a host language with a greater expressive
power than SQL.
We simply want to access a database from some application that is written in the host
language (e.g. a ticket reservation system with a graphical user interface is written in
C and the information about which tickets are still left is stored in a database that can
be accessed using embedded SQL).
A program using embedded SQL in a host language consists of statements of the host
language and of embedded SQL (ESQL) statements. Every ESQL statement begins with
the keywords EXEC SQL. The ESQL statements are transformed to statements of the host
language by a precompiler (mostly calls to library routines that perform the various SQL
commands).
When we look at the examples throughout section 1.3.1 we realize that the result of
the queries is very often a set of tuples. Most host languages are not designed to operate
on sets so we need a mechanism to access every single tuple of the set of tuples returned
by a SELECT statement. This mechanism can be provided by declaring a cursor. After
that we can use the FETCH command to retrieve a tuple and set the cursor to the next tuple.
This chapter contains information that will be useful for people that only want to use Post-
greSQL. It gives a listing and description of the available features including a lot of exam-
ples. The users interested in the internals of PostgreSQL should read chapter 3 PostgreSQL
from the Programmer’s Point of View.
Copyright of PostgreSQL
PostgreSQL is not public domain software. It is copyrighted by the University of California
but may be used according to the licensing terms of the copyright included in
every distribution (refer to the file COPYRIGHT included in every distribution for more
information).
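Judging from the surrounding text (the process survives logout and its messages go to server.log), a typical invocation would be something like the following; the exact path and options depend on the installation:

```
$ nohup postmaster > server.log 2>&1 &
```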
which will start postmaster in the background and even if you log out of the system the
process remains active. All errors and messages will be logged to the file server.log.
$ createdb testdb
which will create a database called testdb. The user executing the command will
become the database administrator and will therefore be the only user (except the database
superuser postgres) who can destroy the database again.
Note: To create the database you don’t need to know anything about the tables (re-
lations) that will be used within the database. The tables will be defined later using the
SQL statements shown in section 2.4.4 Defining and Populating Tables.
$ psql testdb
which will make a connection to a database called testdb. psql is a command line
interface using GNU readline. It can handle a connection to only one database at a time.
When the connection is established psql presents itself as follows:
testdb=>
Now you can either enter any valid SQL statement terminated by a ’;’ or use one of the
slash commands. A list of all available slash commands can be obtained by typing ’\?’.
\q quits psql.
\d lists all tables, views and indexes existing in the current database.
Example 2.1 To create and populate the table SUPPLIER used in figure 1.1 The suppliers
and parts database we could use the following session:
$ psql testdb
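The session would look something like the following sketch; the int4 columns and varchar widths are assumptions, and only the first insert is shown in full:

```sql
testdb=> create table supplier (
testdb->     sno   int4,
testdb->     sname varchar(20),
testdb->     city  varchar(20));
CREATE
testdb=> insert into supplier values (1, 'Smith', 'London');
testdb=> insert into supplier values (2, 'Jones', 'Paris');
```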
Note: The data type int4 is not part of the SQL92 standard. It is a built-in PostgreSQL
type denoting a four byte signed integer. For information on which data types are
available you can use the \dT command which will give a list and short description of all
data types currently known to PostgreSQL.
Example 2.2 We assume that all the tables from figure 1.1 The suppliers and parts
database exist in the database testdb. If we want to know all parts that are sold in
London we use the following session:
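The query that yields the result shown below (a reconstruction; the table aliases are assumptions) joins the three tables and restricts the supplier city:

```sql
testdb=> select p.pname from part p, supplier s, sells se
testdb-> where s.city = 'London' and
testdb->       s.sno = se.sno and
testdb->       p.pno = se.pno;
```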
pname
-----
Screw
Nut
(2 rows)
testdb=>
Example 2.3 We use again the tables given in figure 1.1. Now we want to retrieve all
suppliers selling no parts at all (to remove them from the suppliers table for example):
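The session presumably mirrors the NOT EXISTS query of section 1.3.1, i.e. something like:

```sql
testdb=> select * from supplier s
testdb-> where not exists
testdb->       (select * from sells se
testdb->        where se.sno = s.sno);
```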
testdb=>
The result relation is empty in our example telling us that every supplier contained in the
database sells at least one part. Note that we used a nested subselect to formulate the query.
inheritance
rules
Some other features, implemented in most modern RDBMSs provide additional power and
flexibility:
triggers
transaction integrity
2.5.1 Inheritance
Inheritance is a feature well known from object oriented programming languages such as
Smalltalk or C++. PostgreSQL refers to tables as classes and the definition of a class may
inherit the contents of another class:
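A sketch of such a definition, consistent with the city/capital example queried below (the column types are assumptions):

```sql
testdb=> create table city (
testdb->     name       varchar(20),
testdb->     population int4,
testdb->     altitude   int4);
CREATE
testdb=> create table capital (
testdb->     state varchar(2)
testdb-> ) inherits (city);
CREATE
```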
testdb=>
If we want to know the names of all cities (including capitals) that are located at an altitude
over 100 meters the query is:
testdb=> select * from city*
testdb-> where altitude > 100;
name | population | altitude
---------+------------+---------
Linz | 200000 | 270
Graz | 250000 | 350
Villach | 50000 | 500
Salzburg | 150000 | 420
Vienna | 1500000 | 200
(5 rows)
testdb=>
Here the ’*’ after city indicates that the query should be run over city and all classes
below city in the inheritance hierarchy. Many of the commands that we have already
discussed (SELECT, UPDATE, DELETE, etc) support this ’*’ notation.
Before PostgreSQL was extended to support nested subqueries user defined query
language (SQL) functions could be used to simulate them. Consider example 2.3 where we
wanted to know the names of all suppliers that do not sell any part at all. Normally
we would formulate the query as we did in example 2.3. Here we want to show a possible
way of formulating the query without using a subquery. This is done in two steps. In the
first step we define the function my_exists. In the second step we formulate a query
using the new function.
In the first step we define the new function my_exists(int4) which takes an integer as
argument:
testdb=> create function my_exists(int4) returns int4
testdb-> as ’select count(pno) from sells
testdb-> where sno = $1;’ language ’sql’;
CREATE
testdb=>
Here is the second step which performs the intended retrieve:
testdb=> select s.sname from supplier s
testdb-> where my_exists(s.sno) = 0;
sname
-----
(0 rows)
testdb=>
Now let’s have a look at what is happening here. The function my_exists(int4) takes
one argument which must be of type integer. Within the function definition this argument
can be referred to using the $1 notation (if there were further arguments they could be
referred to by $2, $3, . . . ). my_exists(int4) returns the number of tuples from table
sells where the attribute sno is equal to the given argument $1 (sno = $1). The
keyword language ’sql’ tells PostgreSQL that the new function is a query language
function.
The query in the second step examines every tuple from table supplier and checks if
it satifies the given qualification. It does so by taking the supplier id sno of every tuple and
giving it as an argument to the function my exists(int4). In other words the function
my exists(int4) is called once for every tuple of table supplier. The function
returns the number of tuples having the given supplier id sno contained in table sells. A
result of zero means that no such tuple is available meaning that the corresponding supplier
does not sell a single part. We can see that this query is semantically equivalent to the one
given in example 2.3.
Example 2.6 This example shows how to use a composite type in a function definition.
Imagine that the price of every part was doubled over night. If you want to look at
the part table with the new values you could use the following function which uses the
composite type part for its argument:
testdb=> create function new_price(part) returns float
testdb-> as ’select $1.price * 2;’ language ’sql’;
CREATE
testdb=> select pno, pname, new_price(part) as new_price
testdb-> from part;
pno | pname | new_price
----+---------+-----------
1 | Screw | 20
2 | Nut | 16
3 | Bolt | 30
4 | Cam | 50
(4 rows)
testdb=>
Note that this could have been done by a normal query (without using a user defined func-
tion) as well but it’s an easy to understand example for the usage of functions.
Example 2.7 We show how to realize the user defined function pg_my_sqrt(int4).
The implementation can be divided into three steps:
formulating the new function in C
compiling and linking it to a shared library
making the new function known to PostgreSQL
Formulating the New Function in C: We create a new file called sqrt.c and add the
following lines:
#include <postgres.h>
#include <utils/palloc.h>
#include <math.h>
The function pg_my_sqrt() takes one argument of type int4 which is a Post-
greSQL type defined in postgres.h and returns the integer value nearest to the
square root of the argument. As with query language functions (see previous sec-
tion) the arguments can be of base or of composite type. Special care must be taken
when using base types that are larger than four bytes in length. PostgreSQL supports
three types of passing a value to the user defined function:
pass by value, fixed length
pass by reference, fixed length
pass by reference, variable length
Only data types that are 1, 2 or 4 bytes in length can be passed by value. We just
give an example for the usage of base types that can be used for pass by value here.
For information on how to use types that require pass by reference or how to use
composite types refer to [LOCK98].
Compiling and Linking It to a Shared Library: PostgreSQL binds the new function to
the runtime system by using a shared library containing the function. Therefore we
have to create a shared library out of the objectfile(s) containing the function(s). It
depends on the system and the compiler how this can be done. On a Linux ELF
system using gcc it can be done by using the following commands:
$ gcc -I$PGROOT/include -fpic -c sqrt.c -o sqrt.o
$ gcc -shared -o sqrt.so sqrt.o
where $PGROOT is the path PostgreSQL was installed to. The important options
given to gcc here are -fpic in the first line which tells gcc to produce position in-
dependent code that can be loaded to any address of the process image and -shared
in the second line telling gcc to produce a shared library. If you have another
system where the above described steps do not work you will have to refer to the
manual pages of your C compiler (often man cc) and your linker (man ld) to see
how shared libraries can be built.
Making the New Function Known to PostgreSQL: Now we have to tell PostgreSQL
about the new function. We do so by using the create function command
within psql as we did for query language functions:
testdb=> create function pg_my_sqrt(int4) returns int4
testdb-> as '/<where_ever_you_put_it>/sqrt.so'
testdb-> language 'c';
CREATE
testdb=>
From now on the function pg_my_sqrt(int4) can be used in every query, for
example in the target list or qualification of a query against table part.
Adding a new data type to PostgreSQL also requires the definition of an input and an output
function. These functions are implemented using the techniques presented in the previous
section Programming Language Functions. The functions determine how the type appears
in strings (for input by the user and output to the user) and how the type is organized
in memory. The input function takes a null-terminated character string as its input and
returns the internal (in memory) representation of the type. The output function takes the
internal representation of the type and returns a null-terminated character string. Besides the
definition of input and output functions it is often necessary to enhance operators (e.g. ’+’)
and aggregate functions for the new data type. How this is done is described in section 2.5.4
Extending Operators and section 2.5.5 Extending Aggregates.
Example 2.8 Suppose we want to define a complex type which represents complex num-
bers. Therefore we create a new file called complex.c with the following contents:
#include <postgres.h>
#include <utils/palloc.h>
#include <math.h>
/* Output Function */
/* Takes a pointer to an instance of structure Complex
 * as argument and returns a character pointer to a
 * string representation of the given argument */
char *
complex_out(Complex *complex)
{
    char *result;

    if (complex == NULL)
        return NULL;
    result = (char *) palloc(60);
    sprintf(result, "(%g,%g)", complex->x, complex->y);
    return result;
}
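The file also has to contain the structure Complex itself and the corresponding input function, neither of which is reproduced above. A possible reconstruction is sketched below (with standard C stand-ins: malloc replaces palloc so the sketch compiles outside the backend, and a real input function would report parse errors through elog):

```c
#include <stdio.h>
#include <stdlib.h>

/* In-memory representation of the type: two doubles, i.e. 16 bytes,
 * matching the internallength = 16 given in the create type command. */
typedef struct Complex {
    double x;
    double y;
} Complex;

/* Input Function */
/* Takes a null-terminated string like "(1,2)" and returns a pointer
 * to the internal (in memory) representation of the type. */
Complex *
complex_in(char *str)
{
    double x, y;
    Complex *result;

    if (sscanf(str, " ( %lf , %lf )", &x, &y) != 2)
        return NULL; /* parse error */
    result = (Complex *) malloc(sizeof(Complex));
    result->x = x;
    result->y = y;
    return result;
}
```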
Note that the functions defined above operate on types that require pass by reference. The
functions take a pointer to the data as argument and return a pointer to the derived data
instead of passing and returning the data itself. That’s why we have to reserve memory
using palloc within the functions. (If we just defined local variables and returned
their addresses the system would fail, because the memory used by local
variables is freed when the function defining these variables completes.)
The next step is to compile the C-functions and create the shared library complex.so.
This is done in the way described in the previous section Programming Language Functions
and depends on the system you are using. On a Linux ELF system using gcc it would look
like this:
$ gcc -I$PGROOT/include -fpic -c complex.c -o complex.o
$ gcc -shared -o complex.so complex.o
Now we are ready to define the new data type but before that we have to make the input and
output function known to PostgreSQL:
testdb=> create function complex_in(opaque)
testdb-> returns complex
testdb-> as ’/<where_ever_you_put_it>/complex.so’
testdb-> language ’c’;
NOTICE: ProcedureCreate: type ’complex’ is not
yet defined
CREATE
testdb=> create function complex_out(opaque)
testdb-> returns opaque
testdb-> as ’/<where_ever_you_put_it>/complex.so’
testdb-> language ’c’;
CREATE
testdb=> create type complex (
testdb-> internallength = 16,
testdb-> input = complex_in,
testdb-> output = complex_out
testdb-> );
CREATE
testdb=>
Note that the argument type given in the definition of complex_out() and
complex_in() - opaque - is needed by PostgreSQL to be able to provide a
uniform mechanism for the definition of the input and output functions needed by a new
data type. It is not necessary to specify the exact type of the arguments given to the
functions. The input function is never called explicitly and when it is called implicitly
(e.g. by a statement like insert into) it is clear that a character string (i.e. a part of
the insert query) will be passed to it. The output function is only called (by an internal
mechanism of PostgreSQL) when data of the corresponding user defined type has to be
displayed. In this case it is also clear that the input is of the type used for the internal
representation (e.g. complex). The output is of type character string.
The new type can now be used as if it were another base type:
testdb=> create table complex_test
testdb-> (val complex);
CREATE
testdb=> insert into complex_test
testdb-> (val) values (’(1,2)’);
INSERT 155872 1
testdb=>
Example 2.9 We show how the operator ’=’ can be adapted for the usage on the complex
data type defined in section 2.5.3 User Defined Types. We need a user defined function
complex_cmp(complex,complex) that returns true if the complex numbers given
as arguments are equal and false otherwise. This function is defined as described in sec-
tion 2.5.2 User Defined Functions. In our case there are already two functions present for
the usage of type complex - the input and output function defined in example 2.8. So we
can add the new function complex_cmp(complex,complex) by simply appending
the following lines to the file complex.c given in example 2.8:
/* Comparison Function */
/* returns true if arg1 and arg2 are equal */
bool complex_cmp(Complex *arg1, Complex *arg2)
{
if((arg1->x == arg2->x) &&
(arg1->y == arg2->y))
{
return true;
}
else
{
return false;
}
}
Now we create the shared library again:
$ gcc -I$PGROOT/include -fpic -c complex.c -o complex.o
$ gcc -shared -o complex.so complex.o
Note that all the functions defined in complex.c (complex_in(), complex_out()
and complex_cmp()) are now contained in the shared library complex.so.
Now we make the new function known to PostgreSQL and after that we define the new
operator ’=’ for the complex type:
testdb=> create function complex_cmp(complex, complex)
testdb-> returns bool
testdb-> as '/<where_ever_you_put_it>/complex.so'
testdb-> language 'c';
CREATE
testdb=> create operator = (
testdb-> leftarg = complex,
testdb-> rightarg = complex,
testdb-> procedure = complex_cmp
testdb-> );
CREATE
testdb=>
From now on we are able to perform comparisons between complex numbers in a query’s
qualification (we use the table complex_test as defined in example 2.8).
sfunc1 (state function one): is called for every tuple of the current group and the
appropriate attribute’s value of the current tuple is passed to the function. The given
argument is used to change the internal state of the function in the way given by the
body of the function. For example sfunc1 of the aggregate function sum is called
for every tuple of the current group. The value of the attribute the sum is built on is
taken from the current tuple and added to the internal sum state of sfunc1.
sfunc2 is also called for every tuple of the group but it does not use any argument
from outside to manipulate its state. It just keeps track of its own internal state. A
typical application for sfunc2 is a counter that is incremented for every tuple of the
group that has been processed.
finalfunc is called after all tuples of the current group have been processed. It
takes the internal state of sfunc1 and the state of sfunc2 as arguments and derives
the result of the aggregate function from the two given arguments. For example with
the aggregate function average, sfunc1 sums up the attribute values of each tuple
in the group, sfunc2 counts the tuples in the group. finalfunc divides the sum
by the count to derive the average.
If we define an aggregate using only sfunc1 we get an aggregate that computes a running
function of the attribute values from each tuple. sum is an example for this kind of aggre-
gate. On the other hand, if we create an aggregate function using only sfunc2 we get an
aggregate that is independent of the attribute values from each tuple. count is a typical
example of this kind of aggregate.
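The division of labor between the three functions can be sketched outside PostgreSQL as follows (a toy model of the mechanism described above, not backend code; all names are made up for the illustration):

```c
/* sfunc1: folds the attribute value of the current tuple into its state */
static double sum_step(double state, double value) { return state + value; }

/* sfunc2: ignores the tuple's values and only advances its own state */
static int count_step(int state) { return state + 1; }

/* finalfunc: derives the result from the two internal states */
static double avg_final(double state1, int state2) { return state1 / state2; }

/* Process one group of attribute values the way the aggregate machinery
 * would: call sfunc1 and sfunc2 once per tuple, then finalfunc once
 * after the whole group has been processed. */
static double
run_average(const double *vals, int n)
{
    double state1 = 0.0; /* corresponds to initcond1 */
    int state2 = 0;

    for (int i = 0; i < n; i++) {
        state1 = sum_step(state1, vals[i]);
        state2 = count_step(state2);
    }
    return avg_final(state1, state2);
}
```

Dropping count_step and avg_final from this sketch leaves a running sum (the sum-style aggregate); dropping sum_step and keeping only the counter leaves a count-style aggregate.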
Example 2.10 Here we want to realize the aggregate functions complex_sum and
complex_avg for the user defined type complex (see example 2.8).
First we have to create the user defined functions complex_add and
complex_scalar_div. We can append these two functions to the file complex.c
from example 2.8 again (as we did with complex_cmp):
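The two functions are not reproduced above; they might look like the following sketch (assuming the Complex structure from example 2.8, repeated here so the sketch is self-contained, and with malloc standing in for palloc so it compiles outside the backend):

```c
#include <stdlib.h>

/* The in-memory representation from example 2.8 */
typedef struct Complex { double x; double y; } Complex;

/* sfunc1 candidate: add two complex numbers component-wise */
Complex *
complex_add(Complex *a, Complex *b)
{
    Complex *result = (Complex *) malloc(sizeof(Complex));
    result->x = a->x + b->x;
    result->y = a->y + b->y;
    return result;
}

/* finalfunc candidate: divide the accumulated sum by the tuple count */
Complex *
complex_scalar_div(Complex *a, int count)
{
    Complex *result = (Complex *) malloc(sizeof(Complex));
    result->x = a->x / count;
    result->y = a->y / count;
    return result;
}
```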
Next we create the shared library complex.so again, which will contain all func-
tions defined in the previous examples as well as the new functions complex_add and
complex_scalar_div:
$ gcc -I$PGROOT/include -fpic -c complex.c -o complex.o
$ gcc -shared -o complex.so complex.o
Now we have to make the functions needed by the new aggregates known to PostgreSQL.
After that we define the two new aggregate functions complex_sum and complex_avg
that make use of the functions complex_add and complex_scalar_div:
testdb=> create aggregate complex_sum (
testdb-> sfunc1 = complex_add,
testdb-> basetype = complex,
testdb-> stype1 = complex,
testdb-> initcond1 = '(0,0)'
testdb-> );
CREATE
testdb=> create aggregate complex_avg (
testdb-> sfunc1 = complex_add,
testdb-> basetype = complex,
testdb-> stype1 = complex,
testdb-> initcond1 = '(0,0)',
testdb-> sfunc2 = int4inc,
testdb-> stype2 = int4,
testdb-> initcond2 = '0',
testdb-> finalfunc = complex_scalar_div
testdb-> );
CREATE
testdb=>
The aggregate function complex_sum is defined using only sfunc1. basetype is
the type of the result of the aggregate function. The function complex_add is used as
sfunc1 and stype1 defines the type sfunc1 will operate on. initcond1 gives the
initial value of the internal state of sfunc1.
If we look at the definition of the aggregate function complex_avg we can see that
the part concerning sfunc1 is identical to the corresponding part of the definition
of complex_sum. The only difference is the additional definition of sfunc2 and
finalfunc. The built in function int4inc is used as sfunc2 and increments the
internal state of sfunc2 for every tuple processed. After all tuples have been processed
complex_scalar_div is used as finalfunc to create the average.
2.5.6 Triggers
PostgreSQL supports the calling of C functions as trigger actions. Triggers are not a fea-
ture that is only present in PostgreSQL. In fact most modern RDBMSs support triggers. We
describe them here because the previous sections are necessary to understand the imple-
mentation chosen.
At the moment it is possible to define trigger actions that are executed before and after
the SQL commands insert, update or delete for a tuple. Triggers can be used
to ensure data integrity. For example we can define a trigger action that returns an error
whenever somebody wants to insert (or update) a tuple with a negative supplier id sno into
table supplier defined in figure 1.1.
The system stores information about when a trigger action has to be per-
formed. Whenever a command triggering an action is detected, the trigger
manager is called within PostgreSQL, which initializes a global data structure
TriggerData *CurrentTriggerData and calls the appropriate trigger function.
The global data structure TriggerData *CurrentTriggerData plays a central
role in the definition of trigger functions (the global pointer
CurrentTriggerData can be accessed from within every trigger function):
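The structure itself is not reproduced above; based on the fields described below it can be sketched like this (the stand-in typedefs are assumptions so that the fragment compiles outside the backend, where Relation and HeapTuple are real backend types):

```c
/* Stand-ins for backend types (assumptions for this sketch) */
typedef unsigned int TriggerEvent;
typedef void *Relation;
typedef void *HeapTuple;

typedef struct TriggerData
{
    TriggerEvent tg_event;     /* when and for which command the
                                * trigger was fired */
    Relation     tg_relation;  /* the relation the trigger is
                                * defined on */
    HeapTuple    tg_trigtuple; /* the tuple the trigger is fired for */
    HeapTuple    tg_newtuple;  /* new tuple version on update,
                                * NULL otherwise */
} TriggerData;

/* set by the trigger manager before the trigger function is called */
TriggerData *CurrentTriggerData;
```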
Now we give a short description of the structure’s contents relevant for the example below.
For a detailed description of this and other structures and functions refer to [LOCK98]:
tg_event: Describes the event the function is called for. Contains information
about when the function was called (before or after the command execution) and for
which command it was called (insert, update or delete).
tg_relation: Is a pointer to a structure describing the relation.
tg_relation->rd_att is of special interest for us because it can be given as an
argument to the function SPI_getbinval() described later.
tg_trigtuple: Is a pointer to the tuple for which the trigger is fired. If the com-
mand is insert or delete this is the tuple to be returned by the trigger function.
tg_newtuple: If the command is update this is a pointer to the new version of
the tuple and NULL otherwise. This is what has to be returned by the trigger function
if the command is update.
Example 2.11 We define a trigger function called trigf() that is designed to prevent
inserting (updating) tuples with a negative supplier id sno into table supplier.
First we have to define the function trigf() using C and therefore we create a
new file trigger.c. The function definition is done in exactly the same way as for the
definition of user defined functions (see section 2.5.2).
HeapTuple
trigf()
{ TupleDesc tupdesc;
  HeapTuple rettuple;
  bool isnull;
  int val;
  if (!CurrentTriggerData)
      elog(ERROR, "trigf: triggers are not initialized");
  if (TRIGGER_FIRED_BY_UPDATE(CurrentTriggerData->tg_event))
      rettuple = CurrentTriggerData->tg_newtuple;
  else
      rettuple = CurrentTriggerData->tg_trigtuple;
  tupdesc = CurrentTriggerData->tg_relation->rd_att;
  CurrentTriggerData = NULL;
  val = SPI_getbinval(rettuple, tupdesc, 1, &isnull);
  if (val > 0)
      return rettuple;
  elog(ERROR, "trigf: supplier id sno must be greater than zero");
  return NULL;
}
The function SPI getbinval is part of the Server Programming Interface (SPI) de-
scribed in section 2.5.7. It takes the current tuple, the description of the relation, the num-
ber of the attribute and the address of a variable isnull as arguments and returns the
binary value of the given attribute from the current tuple. If the attribute’s value is NULL
isnull is set to true otherwise to false.
trigf() first checks whether it was called by an update command or by an
insert and sets rettuple accordingly. Then it gets the value of the first attribute
of the current tuple (remember sno is the first attribute in the relation supplier). If
the value is greater than zero the current tuple is returned (and inserted or updated by the
executor of PostgreSQL) otherwise an error is returned (elog(ERROR, ...) logs an
error and aborts processing) and the table supplier won’t be affected.
Next we have to make the function trigf() known to PostgreSQL by the create
function command in the same way we did for the functions in the previous examples
(e.g. example 2.8). After that we can define the trigger tbefore. The definition ensures
that function trigf() given as the trigger action will be executed before the execution
of the commands insert and update affecting the table supplier. Note that
insert and update commands executed against tables other than supplier won’t
cause the execution of function trigf().
testdb=> create function trigf() returns opaque
testdb-> as ’/<where_ever_you_put_it>/trigger.so’
testdb-> language ’c’;
CREATE
testdb=> create trigger tbefore
testdb-> before insert or update on supplier
testdb-> for each row execute procedure trigf();
CREATE
testdb=>
Now we can check if the trigger is working correctly. If the value for sno is greater than
zero the insert and update commands against table supplier are not affected.
testdb=> insert into supplier
testdb-> (sno, sname, city)
testdb-> values(5, ’Miles’, ’Berlin’);
INSERT 156064 1
The set of functions of SPI can be divided into the following parts:
Interface Functions: These functions are used to establish a connection to a running
backend. Whenever you want to execute a SQL query within a user defined function
you will have to connect to a backend by SPI_connect().
– SPI_connect() opens a connection to the PostgreSQL backend.
– SPI_finish() closes a connection to the PostgreSQL backend.
– SPI_exec() takes a character string containing a SQL query and a number
tcount as arguments and executes the query. The result can be obtained from
the global data structure SPI_tuptable which is set by SPI_exec(). If
tcount is zero then the query is executed for all tuples returned by the query
scan. Using tcount > 0 restricts the number of tuples for which the query
will be executed. This function should only be called after SPI_connect()
has been processed and a connection has been established.
– SPI_prepare() creates and returns an execution plan
(parser+planner+optimizer) but doesn’t execute the query. (The function
performs the same steps as SPI_exec() except that it does not execute the
plan.) Should only be called after a connection has been established.
– SPI_saveplan() stores a plan prepared by SPI_prepare() in safe mem-
ory protected from freeing by SPI_finish().
– SPI_execp() executes a plan prepared by SPI_prepare() or returned by
SPI_saveplan().
Interface Support Functions: These functions can be used from within a connected or
within an unconnected user defined function. An example for the use from within an
unconnected function was given in example 2.11 where SPI_getbinval() was
used to deliver the value of attribute sno from the new (to be inserted) tuple. An
example for the use from within a connected function will be given in example 2.12.
– SPI_copytuple() makes a copy of the tuple given as argument.
– SPI_modifytuple() modifies one or more attributes of a given tu-
ple. The new values for the attributes to be changed are passed to
SPI_modifytuple() as arguments.
– SPI_fnumber() takes the description of a tuple and the name of an attribute
as arguments and returns the number of the attribute.
– SPI_fname() takes the description of a tuple and the number of an attribute
as arguments and returns the name of the attribute.
– SPI_getvalue() takes the description of a tuple, the tuple and the number
of an attribute as arguments and returns a string representation of the given
attribute’s value.
– SPI_getbinval() takes the description of a tuple, the tuple and the number
of an attribute as arguments and returns the binary value of the given attribute.
– SPI_gettype() returns a copy of the type name for the specified attribute.
– SPI_gettypeid() returns the type OID for the specified attribute.
Example 2.12 We want PostgreSQL to automatically generate a value for the supplier id
sno whenever a new tuple is inserted into table supplier. Therefore we define a trigger
function trigf_sno() that is called before the execution of every insert statement.
trigf_sno() has to perform the following steps:
get the greatest supplier id sno contained in table supplier using SPI_exec()
modify attribute sno of the tuple to be inserted to contain the next greater number
using SPI_modifytuple()
Here are the contents of function trigf_sno() that can be appended to the file
trigger.c from example 2.11:
HeapTuple
trigf_sno()
{ HeapTuple rettuple;
bool isnull;
int ret, max;
int atts_to_be_changed[1];
Datum new_value[1];
char nulls;
if (!CurrentTriggerData)
elog(ERROR,
"trigf: triggers are not initialized");
  /* Connect to backend */
  if ((ret = SPI_connect()) < 0)
      elog(ERROR,
           "trigf_sno: SPI_connect returned %d", ret);
  /* Get the greatest supplier id sno currently
   * contained in table supplier */
  ret = SPI_exec("select max(sno) from supplier", 0);
  max = (int) SPI_getbinval(SPI_tuptable->vals[0],
                            SPI_tuptable->tupdesc, 1, &isnull);
  /* Modify attribute sno (the first attribute) of the
   * tuple to be inserted to the next greater number */
  atts_to_be_changed[0] = 1;
  new_value[0] = Int32GetDatum(max + 1);
  nulls = ' ';
  rettuple = SPI_modifytuple(CurrentTriggerData->tg_relation,
                             CurrentTriggerData->tg_trigtuple,
                             1, atts_to_be_changed,
                             new_value, &nulls);
  if (rettuple == NULL)
      elog(ERROR,
           "trigf_sno: SPI_modifytuple failed");
  SPI_finish();
  CurrentTriggerData = NULL;
  return (rettuple);
}
Next we have to make the function trigf_sno() known to PostgreSQL and after that
we can define the trigger sno_before which is called before an insert to the table
supplier. The trigger function ensures that the next greater number for sno will be
used. Every time a new tuple is inserted into the table supplier a new sno is assigned
automatically regardless of the value given in the insert statement.
To realize a view we define a rule that is fired every time the new table is accessed.
Instead of retrieving the data from the virtual table the rule body is executed retrieving
the data from one or more physically stored tables.
Example 2.13 We want to show how a view could be realized in PostgreSQL. Note that
there is of course a dedicated command create view in PostgreSQL which
performs the steps of our example internally.
Next we create the rule that will be fired whenever a select against the table my view
shows up:
testdb=> create rule my_view_rule
testdb-> as on select to my_view
testdb-> do instead select s.sname, p.pname
testdb-> from supplier s, part p, sells se
testdb-> where s.sno=se.sno and p.pno=se.pno;
CREATE
testdb=>
Now we can use the table my view as if it were populated with tuples:
testdb=> select * from my_view;
sname |pname
------+-----
Smith |Screw
Smith |Nut
Jones |Cam
Adams |Screw
Adams |Bolt
Blake |Nut
Blake |Bolt
Blake |Cam
(8 rows)
testdb=>
Chapter 3
PostgreSQL from the Programmer’s Point of View
This chapter gives an overview of the internal structure of the backend of PostgreSQL. After
having read the following sections you should have an idea of how a query is processed.
Don’t expect a detailed description here (I think such a description dealing with all data
structures and functions used within PostgreSQL would exceed 1000 pages!). This chapter
is intended to help understanding the general control and data flow within the backend from
receiving a query to sending the results.
52 CHAPTER 3. POSTGRESQL FROM THE PROGRAMMER’S POINT OF VIEW
(Figure: PostgreSQL’s client/server architecture - a user application linked against
LIBPQ on the client host communicates with the postmaster process on the server host.)
The parser defined in gram.y and scan.l is built using the UNIX tools yacc
and lex.
The transformation process does some necessary transformations to the data structure
returned by the parser.
3.3. THE PARSER STAGE 53
3.3.1 Parser
The parser has to check the query string (which arrives as plain ASCII text) for valid syn-
tax. If the syntax is correct a parse tree is built up and handed back otherwise an error is
returned. For the implementation the well known UNIX tools lex and yacc are used.
The lexer is defined in the file scan.l and is responsible for recognizing identifiers,
the SQL keywords etc. For every keyword or identifier that is found, a token is generated
and handed to the parser.
The parser is defined in the file gram.y and consists of a set of grammar rules and
actions that are executed whenever a rule is fired. The code of the actions (which is actually
C-code) is used to build up the parse tree.
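The cooperation of the two files can be illustrated with a toy tokenizer (not PostgreSQL code; a hand-written stand-in for what the generated lexer does, handing one token at a time to the parser):

```c
#include <ctype.h>
#include <string.h>

typedef enum { TOK_KEYWORD, TOK_IDENT, TOK_END } TokenType;

/* Return the next token from *input and advance the pointer, the way
 * the lexer generated from scan.l hands tokens to the parser. */
TokenType
next_token(const char **input, char *text, size_t cap)
{
    const char *p = *input;
    size_t n = 0;

    while (isspace((unsigned char) *p))
        p++;
    if (*p == '\0') {
        *input = p;
        return TOK_END;
    }
    while (isalnum((unsigned char) *p) && n + 1 < cap)
        text[n++] = *p++;
    text[n] = '\0';
    *input = p;
    /* a few of the SQL keywords scan.l recognizes */
    if (strcmp(text, "select") == 0 || strcmp(text, "from") == 0 ||
        strcmp(text, "where") == 0)
        return TOK_KEYWORD;
    return TOK_IDENT; /* everything else is treated as an identifier */
}
```

A real parser would consume this token stream through its grammar rules and build the parse tree in the rule actions.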
The file scan.l is transformed to the C-source file scan.c using the program lex
and gram.y is transformed to gram.c using yacc. After these transformations have
taken place a normal C-compiler can be used to create the parser. Never make any changes
to the generated C-files as they will be overwritten the next time lex or yacc is called.
(Note that the mentioned transformations and compilations are normally done automati-
cally using the makefiles shipped with the PostgreSQL source distribution.)
A detailed description of yacc or the grammar rules given in gram.y would be
beyond the scope of this paper. There are many books and documents dealing with lex
and yacc. You should be familiar with yacc, before you start to study the grammar given
in gram.y otherwise you won’t understand what happens there.
For a better understanding of the data structures used in PostgreSQL for the pro-
cessing of a query we use an example to illustrate the changes made to these data structures
in every stage.
Example 3.1 This example contains the following simple query that will be used in various
descriptions and figures throughout the following sections. The query assumes that the
tables given in example 1.1 have already been defined.
select s.sname, se.pno
from supplier s, sells se
where s.sno > 2 and
s.sno = se.sno;
Figure 3.2 shows the parse tree built by the grammar rules and actions given in gram.y
for the query given in example 3.1 (without the operator tree for the where clause which
is shown in figure 3.3 because there was not enough space to show both data structures in
one figure).
The top node of the tree is a SelectStmt node. For every entry appearing in the from
clause of the SQL query a RangeVar node is created holding the name of the alias and
a pointer to a RelExpr node holding the name of the relation. All RangeVar nodes are
collected in a list which is attached to the field fromClause of the SelectStmt node.
For every entry appearing in the select list of the SQL query a ResTarget node
is created holding a pointer to an Attr node. The Attr node holds the relation name
of the entry and a pointer to a Value node holding the name of the attribute. All
ResTarget nodes are collected to a list which is connected to the field targetList of
the SelectStmt node.
Figure 3.3 shows the operator tree built for the where clause of the SQL query
given in example 3.1 which is attached to the field qual of the SelectStmt node. The
top node of the operator tree is an A_Expr node representing an AND operation. This
node has two successors called lexpr and rexpr pointing to two subtrees. The subtree
attached to lexpr represents the qualification s.sno > 2 and the one attached to
rexpr represents s.sno = se.sno. For every attribute an Attr node is created holding the name
of the relation and a pointer to a Value node holding the name of the attribute. For the
constant appearing in the query a Const node is created holding the value.
(Figure 3.2: parse tree for the query of example 3.1 - the SelectStmt node with its
targetList of ResTarget/Attr/Value nodes and its fromClause list of RangeVar/RelExpr
nodes for the relations "supplier" (alias "s") and "sells" (alias "se").)
(Figure 3.3: operator tree for the where clause - the A_Expr node for the AND whose
lexpr subtree (A_Expr with opname ">", an Attr node for s.sno and an A_Const node
with val.val.ival: 2) and rexpr subtree (A_Expr with opname "=", Attr nodes for
s.sno and se.sno) represent the two qualifications.)
Next it is checked if the attribute names used are contained in the relations given in the
query. For every attribute that is found a TLE node is created holding a pointer to a
Resdom node (which holds the name of the column) and a pointer to a VAR node. There
are two important numbers in the VAR node. The field varno gives the position of the
relation containing the current attribute in the range table entry list created above. The
field varattno gives the position of the attribute within the relation. If the name of an
attribute cannot be found an error will be returned and the query processing will be aborted.
(Figure: the transformed Query node - command: 1, an rtable holding RTE nodes for
"supplier" and "sells", and a targetlist of TLE nodes whose RESDOM entries carry
resno/resname ("sname", resno 1, and "pno", resno 2) and whose VAR nodes carry the
varno/varattno positions described above.)
Figure 3.4: Transformed TargetList and FromList for query of example 3.1
Figure 3.5 shows the transformed where clause. Every A Expr node is transformed to an
Expr node. The Attr nodes representing the attributes are replaced by VAR nodes as
it has been done for the targetlist above. Checks if the appearing attributes are valid and
known to the system are made. If there is an attribute used which is not known an error
will be returned and the query processing will be aborted.
(Figure 3.5: the transformed where clause - an Expr node with opType: AND whose args
list holds two Expr nodes of opType: OP; each carries an Oper node with the operator id
(e.g. opno: 521) and VAR nodes with the varno/varattno of the involved attributes.)
The given rule will be fired whenever a select against the relation test_view is
detected. Instead of selecting the tuples from test_view the select statement given in
the action part of the rule is executed.
select sname
from test_view
where sname <> ’Smith’;
Here is a list of the steps performed by the query rewrite system whenever a user-query
against test_view appears. (The following listing is a very informal description of
the algorithm just intended for basic understanding. For a detailed description refer to
[STON89].)
Adapt the targetlist to meet the number and order of attributes given in the user-query.
Add the qualification given in the where clause of the user-query to the qualification
of the query given in the action part of the rule.
Given the rule definition above, the user-query will be rewritten to the following form (Note
that the rewriting is done on the internal representation of the user-query handed back by
the parser stage but the derived new data structure will represent the following query):
select s.sname
from supplier s, sells se, part p
where s.sno = se.sno and
p.pno = se.pno and
s.sname <> ’Smith’;
3.5 Planner/Optimizer
The task of the planner/optimizer is to create an optimal execution plan. It first considers
all possible ways of scanning and joining the relations that appear in a query. All the
created paths lead to the same result and it’s the optimizer’s task to estimate the cost of
executing each path and find out which one is the cheapest.
nested iteration join: The right relation is scanned once for every tuple found in the
left relation. This strategy is easy to implement but can be very time consuming.
merge sort join: Each relation is sorted on the join attributes before the join starts.
Then the two relations are merged together taking into account that both relations
are ordered on the join attributes. This kind of join is more attractive because every
relation has to be scanned only once.
hash join: the right relation is first hashed on its join attributes. Next the left relation
is scanned and the appropriate values of every tuple found are used as hash keys to
locate the tuples in the right relation.
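The merge phase of the merge sort join can be sketched on two integer arrays that are already sorted on the join attribute (an illustration of the strategy, not PostgreSQL’s executor code):

```c
/* Count the joining pairs of two inputs already sorted on the join
 * attribute; each input is scanned only once from left to right
 * (the inner loop only re-reads tuples carrying an equal key). */
int
merge_join_count(const int *left, int nl, const int *right, int nr)
{
    int i = 0, j = 0, matches = 0;

    while (i < nl && j < nr) {
        if (left[i] < right[j])
            i++;            /* advance the side with the smaller key */
        else if (left[i] > right[j])
            j++;
        else {
            /* pair the current left tuple with every right tuple
             * carrying the same join value */
            int k = j;
            while (k < nr && right[k] == left[i]) {
                matches++;
                k++;
            }
            i++;
        }
    }
    return matches;
}
```

Contrast this with the nested iteration join, which would rescan the whole right input once per left tuple.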
Figure 3.6 shows the plan created for the query of example 3.1. The top node of the plan
is a MergeJoin node and each of its subnodes represents one relation of the join. As
mentioned above a merge sort join requires each relation to be sorted. That’s why
we find a Sort node in each subplan. The additional
qualification given in the query (s.sno > 2) is pushed down as far as possible and is
attached to the qpqual field of the leaf SeqScan node of the corresponding subplan.
The list attached to the field mergeclauses of the MergeJoin node contains in-
formation about the join attributes. The values 65000 and 65001 for the varno
fields in the VAR nodes appearing in the mergeclauses list (and also in the targetlist)
mean that not the tuples of the current node should be considered but the tuples of the next
”deeper” nodes (i.e. the top nodes of the subplans) should be used instead.
Note that every Sort and SeqScan node appearing in figure 3.6 has a targetlist, but
because there was not enough space only the one for the MergeJoin node could be drawn.
Another task performed by the planner/optimizer is fixing the operator ids in the Expr
and Oper nodes. As mentioned earlier, PostgreSQL supports a variety of different data
types and even user defined types can be used. To be able to maintain the huge amount of
functions and operators it is necessary to store them in a system table. Each function and
operator gets a unique operator id. According to the types of the attributes used within the
qualifications etc., the appropriate operator ids have to be used.
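As an illustration of such a lookup (the table below is a toy stand-in for the system catalog; the id 96 appears in figure 3.6, the other entry and the helper function are invented):

```python
# Toy stand-in for the system table of operators: the key is the operator
# name plus the operand types, the value is the operator id.
OPERATOR_TABLE = {
    ("=", "int4", "int4"): 96,   # id as shown in figure 3.6
    ("=", "text", "text"): 98,   # hypothetical entry for illustration
}

def lookup_opno(opname, ltype, rtype):
    """Pick the operator id matching the types of the attributes used in
    the qualification; user defined types simply add more entries."""
    return OPERATOR_TABLE[(opname, ltype, rtype)]
```

The point of the sketch is only that the same operator name resolves to different ids depending on the operand types.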
3.6 Executor
The executor takes the plan handed back by the planner/optimizer and starts processing the
top node. In the case of our example (the query given in example 3.1) the top node is a
MergeJoin node.
Before any merge can be done two tuples have to be fetched (one from each subplan).
So the executor recursively calls itself to process the subplans (it starts with the subplan
attached to lefttree). The new top node (the top node of the left subplan) is a SeqScan
node and again a tuple has to be fetched before the node itself can be processed. The
executor calls itself recursively another time for the subplan attached to lefttree of the
SeqScan node.
Now the new top node is a Sort node. As a sort has to be done on the whole relation,
the executor starts fetching tuples from the Sort node’s subplan and sorts them into a
temporary relation (in memory or a file) when the Sort node is visited for the first time.
(Further examinations of the Sort node will always return just one tuple from the sorted
temporary relation.)
Every time the processing of the Sort node needs a new tuple the executor is recur-
sively called for the SeqScan node attached as subplan. The relation (internally refer-
enced by the value given in the scanrelid field) is scanned for the next tuple. If the
tuple satisfies the qualification given by the tree attached to qpqual it is handed back,
otherwise the next tuple is fetched until the qualification is satisfied. If the last tuple of the
relation has been processed a NULL pointer is returned.
After a tuple has been handed back by the lefttree of the MergeJoin the
righttree is processed in the same way. If both tuples are present the executor pro-
cesses the MergeJoin node. Whenever a new tuple from one of the subplans is needed a
recursive call to the executor is performed to obtain it. If a joined tuple could be created it
is handed back and one complete processing of the plan tree has finished.
Now the described steps are performed once for every tuple, until a NULL pointer is
returned for the processing of the MergeJoin node, indicating that we are finished.
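The demand-driven, recursive processing described above can be mimicked with Python generators (an illustration only, with invented sample data and simplified node logic; join keys are assumed unique within each input so the MergeJoin step stays short):

```python
def seq_scan(relation, qpqual=lambda t: True):
    """SeqScan node: scan the relation and hand back the next tuple
    satisfying the qualification attached to qpqual."""
    for tup in relation:
        if qpqual(tup):
            yield tup

def sort_node(subplan, key):
    """Sort node: on the first visit fetch all tuples from the subplan and
    sort them; later visits just return one tuple from the sorted result."""
    yield from sorted(subplan, key=key)

def merge_join_node(left, right, lkey, rkey):
    """MergeJoin node: pull one tuple from each sorted subplan on demand
    (join keys assumed unique within each input in this sketch)."""
    l, r = next(left, None), next(right, None)
    while l is not None and r is not None:
        if lkey(l) < rkey(r):
            l = next(left, None)
        elif lkey(l) > rkey(r):
            r = next(right, None)
        else:
            yield l + r
            l, r = next(left, None), next(right, None)

# Invented sample data: supplier(sno, sname) and sells(sno, pno).
supplier = [(4, "Smith"), (1, "Jones"), (3, "Adams")]
sells = [(3, "p1"), (4, "p2")]
plan = merge_join_node(
    sort_node(seq_scan(supplier, lambda s: s[0] > 2), key=lambda s: s[0]),
    sort_node(seq_scan(sells), key=lambda se: se[0]),
    lkey=lambda s: s[0], rkey=lambda se: se[0])
result = list(plan)
```

Consuming the plan pulls tuples through the tree exactly in the order described: the MergeJoin node asks its subplans for tuples, the Sort nodes materialize their SeqScan input on first access, and the scans apply their qualification before handing a tuple back.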
[Figure 3.6: The plan for the query of example 3.1. The MergeJoin node holds a
targetlist, the mergeclauses list and the two subplans attached to lefttree and
righttree; the mergeclauses entry is an Expr node (Oper node with opType OP,
opno 96 and opid 65) over the VAR nodes with varno 65001/varattno 1 and
varno 65000/varattno 1. Each subplan is a Sort node over a SeqScan node
(scanrelid 1 and 2); the pushed-down qualification compares a VAR node
(varno 1, varattno 1) against a Const node with constvalue 2.]
The formula given in the where clause is evaluated for every tuple. If the evaluation
returns true the tuple is returned; every tuple not satisfying the qualification is
ignored.
In the case of groups the having clause is evaluated for every group. If the evaluation
returns true the group is taken into account, otherwise it is ignored.
Example 3.2 Here is an example of the usage of the aggregate function count which
counts the number of part numbers (pno) of every group. (The table sells is defined in
example 1.1.)
A query like the one in example 3.2 is processed by the usual stages:
the parser stage
the rewrite system
the planner/optimizer
the executor
and in the following sections we will describe what every stage does to the query in order
to obtain the appropriate result.
The parser stage builds up a querytree containing the where qualification and information
about the grouping that has to be done (i.e. a list of all attributes to group for is attached
to the field groupClause). The main difference to querytrees built up for queries with-
out aggregate functions is given in the field hasAggs which is set to true and in the
targetlist. The expr field of the second TLE node of the targetlist shown in figure 3.7
does not point directly to a VAR node but to an Aggreg node representing the aggregate
function used in the query.
A check is made that every attribute grouped for appears only without an aggregate
function in the targetlist and that every attribute which appears without an aggregate func-
tion in the targetlist is grouped for.
3.7. THE REALIZATION OF THE HAVING CLAUSE 63
[Figure 3.7: The querytree built up for the query of example 3.2. The Query node has
hasAggs set to true; its rtable holds an RTE for the relation sells and its
groupClause a GroupClause node whose entry points to the VAR node for sno. The
targetlist consists of two TLE nodes: the first holds a Resdom node (resname sno)
and a VAR node (varno 1, varattno 1), the second a Resdom node (resname count)
and an Aggreg node (aggname count) whose target is the VAR node with varno 1
and varattno 2.]
The rewriting system does not make any changes to the querytree as long as the query
involves just base tables. If any views are present the query is rewritten to access the tables
specified in the view definition.
Planner/Optimizer
[Figure 3.8: The queryplan for the query of example 3.2. The top node is an AGG node
whose aggs list holds the Aggreg node (aggname count) and whose qptargetlist
contains the two TLE nodes already shown in figure 3.7 (Resdom sno with the VAR node
varno 1/varattno 1, and Resdom count with the Aggreg node over the VAR node
varno 1/varattno 2). Below it, connected through the lefttree fields, follow a GRP
node, a SORT node and a SeqScan node; the qptargetlist of the GRP node points into
the targetlist of the SORT node.]
Executor
The executor uses the function execAgg() to execute AGG nodes. As described earlier it
uses one main function ExecProcNode which is called recursively to execute subtrees.
The following steps are performed by execAgg():
The list attached to the field aggs of the AGG node is examined and for every ag-
gregate function included the transition functions are fetched from a function table.
Calculating the value of an aggregate function is done using three functions:
– The first transition function xfn1 is called with the current value of the at-
tribute the aggregate function is applied to and changes its internal state using
the attribute’s value given as an argument.
– The second transition function xfn2 is called without any arguments and
changes its internal state only according to internal rules.
– The final function finalfn takes the final states of xfn1 and xfn2 as argu-
ments and finishes the aggregation.
Example 3.3 Recall the functions necessary to implement the aggregate function
avg building the average over all values of an attribute in a group (see section 2.5.5
Extending Aggregates):
– The first transition function xfn1 has to be a function that takes the actual value
of the attribute avg is applied to as an argument and adds it to the internally
stored sum of previous calls.
– The second transition function xfn2 only increases an internal counter every
time it is called.
– The final function finalfn divides the result of xfn1 by the counter of xfn2
to calculate the average.
Note that xfn2 and finalfn may be absent (e.g. for the aggregate function sum
which simply sums up all values of the given attribute within a group).
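The three functions for avg can be sketched like this (an illustration only; in PostgreSQL the transition functions are real catalog functions fetched from the function table):

```python
class Avg:
    """The aggregate avg modelled with the three functions from above."""

    def __init__(self):
        self.sum = 0       # internal state changed by xfn1
        self.count = 0     # internal state changed by xfn2

    def xfn1(self, value):
        """Called with the current value of the attribute: add it to the
        internally stored sum of previous calls."""
        self.sum += value

    def xfn2(self):
        """Called without any arguments: just increase an internal counter."""
        self.count += 1

    def finalfn(self):
        """Divide the final state of xfn1 by the counter of xfn2."""
        return self.sum / self.count
```

Applying xfn1 and xfn2 once per tuple of a group and finalfn once at the end of the group yields the group's average.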
execAgg() creates an array containing one entry for every aggregate func-
tion found in the list attached to the field aggs. The array will hold information
needed for the execution of every aggregate function (including the transition
functions described above).
The following steps are executed in a loop as long as there are still tuples returned
by the subplan (i.e. as long as there are still tuples left in the current group). When
there are no tuples left in the group a NULL pointer is returned indicating the end of
the group.
– A new tuple from the subplan (i.e. the plan attached to the field lefttree)
is fetched by recursively calling ExecProcNode() with the subplan as argu-
ment.
– For every aggregate function (contained in the array created before) apply the
transition functions xfn1 and xfn2 to the values of the appropriate attributes
of the current tuple.
When we get here, all tuples of the current group have been processed and the tran-
sition functions of all aggregate functions have been applied to the values of the
attributes. We are now ready to complete the aggregation by applying the final func-
tion (finalfn) for every aggregate function.
66 CHAPTER 3. POSTGRESQL FROM THE PROGRAMMER’S POINT OF VIEW
Store the tuple containing the new values (the results of the aggregate functions) and
hand it back.
Note that the procedure described above only returns one tuple (i.e. it processes just one
group and when the end of the group is detected it processes the aggregate functions and
hands back one tuple). To retrieve all tuples (i.e. to process all groups) the function
execAgg() has to be called (returning a new tuple every time) until it returns a NULL
pointer indicating that there are no groups left to process.
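The loop performed by execAgg() can be pictured as follows (a loose Python sketch with invented sample rows; one generator step corresponds to one call of execAgg(), and the end of the iterator plays the role of the NULL pointer):

```python
from itertools import groupby

def exec_agg(subplan, group_key, xfn1, finalfn, init):
    """Process one group per iteration: apply the transition function to
    every tuple of the group, then finalize and hand back one tuple."""
    for key, group in groupby(subplan, key=group_key):
        state = init                    # fresh state for every group
        for tup in group:               # fetch tuples until the group ends
            state = xfn1(state, tup)
        yield (key, finalfn(state))     # one result tuple per group

# count(pno) grouped by sno; the subplan must deliver its tuples sorted
# on the grouping attribute (sample rows invented for the sketch).
sells = [(1, 1), (1, 2), (2, 4), (3, 1), (3, 3)]      # (sno, pno)
result = list(exec_agg(sells, group_key=lambda t: t[0],
                       xfn1=lambda n, t: n + 1,
                       finalfn=lambda n: n, init=0))
```

As in the text, each step of the generator processes exactly one group and hands back exactly one tuple; iterating until exhaustion corresponds to calling execAgg() until it returns a NULL pointer.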
In order to implement the having clause a variety of changes have been made to the
following stages:
The parser stage has been modified slightly to build up and transform an operator
tree for the having clause.
The rewrite system has been adapted to be able to use views with the having logic.
The planner/optimizer now takes the operator tree of the having clause and attaches
it to the AGG node (which is the top node of the queryplan).
The executor has been modified to evaluate the operator tree (i.e. the internal repre-
sentation of the having qualification) attached to the AGG node and the results of the
aggregation are only considered if the evaluation returns true.
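The last of these points can be pictured as a plain filter over the per-group results (a sketch with invented numbers; the real executor evaluates the operator tree attached to the AGG node):

```python
def apply_having(agg_results, having):
    """Keep a group's result only if the having qualification evaluates
    to true for it; all other groups are ignored."""
    return [row for row in agg_results if having(row)]

# (sno, count(pno)) per group; keep the groups whose count exceeds one.
groups = [(1, 2), (2, 1), (3, 2)]
kept = apply_having(groups, lambda row: row[1] > 1)
```

The filter runs after aggregation, which is precisely what distinguishes a having qualification from a where qualification.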
In the following sections we will describe the changes made to every single stage in detail.
The grammar rules of the parser defined in gram.y did not require any changes (i.e. the
rules had already been prepared for the having clause). The operator tree built up for the
having clause is attached to the field havingClause of the SelectStmt node handed
back by the parser.
The transformation procedures applied to the tree handed back by the parser transform
the operator tree attached to the field havingClause using exactly the same functions
used for the transformation of the operator tree for the where clause. This is possible
because both trees are built up by the same grammar rules of the parser and are therefore
compatible. Additional checks which make sure that the having clause involves at least one
aggregate function etc. are performed at a later point in time in the planner/optimizer stage.
The necessary changes have been applied to the following functions included in the
file /src/backend/parser/analyze.c. Note that only the relevant parts of
the affected code are presented instead of the whole functions. Every added source line
will be marked by a '+' at the beginning of the line and every changed source line will
be marked by a '!' throughout the following code listings. Whenever a part of the code
which is not relevant at the moment is skipped, three vertical dots are inserted instead.
transformInsertStmt()
This function is invoked every time an SQL insert statement involving a
select is used, as the following example illustrates:
insert into t2
select x, y
from t1;
Two statements have been added to this function. The first one performs the transfor-
mation of the operator tree attached to the field havingClause using the function
transformWhereClause() as done for the where clause. It is possible to use
the same function for both clauses, because they are both built up by the same gram-
mar rules given in gram.y and are therefore compatible.
The second statement makes sure that aggregate functions are involved in the query
whenever a having clause is used; otherwise the query could have been formulated
using only a where clause.
static Query *
transformInsertStmt(ParseState *pstate,
InsertStmt *stmt)
{
/* make a new query tree */
Query *qry = makeNode(Query);
.
.
.
/* fix where clause */
qry->qual = transformWhereClause(pstate,
stmt->whereClause);
transformSelectStmt()
Exactly the same statements added to the function transformInsertStmt()
above have been added here as well.
static Query *
transformSelectStmt(ParseState *pstate,
SelectStmt *stmt)
{
Query *qry = makeNode(Query);
qry->commandType = CMD_SELECT;
.
.
.
qry->qual = transformWhereClause(pstate,
stmt->whereClause);
This section describes the changes to the rewrite system of PostgreSQL that have been
necessary to support the use of views within queries using a having clause and to support
the definition of views by queries using a having clause.
As described in section 3.4.1 Techniques To Implement Views a query rewrite technique
is used to implement views. There are two cases to be handled within the rewrite system as
far as the having clause is concerned:
The view definition does not contain a having clause but the queries evaluated against
this view may contain having clauses.
The view definition contains a having clause. In this case queries evaluated against
this view must meet some restrictions as we will describe later.
No having clause in the view definition: First we will look at the changes necessary to
support queries using a having clause against a view defined without a having clause.
select *
from testview
where sno <> 5;
The query given in the definition of the view testview is the backbone of the rewritten
query. The targetlist is taken from the user’s query and also the where qualification of the
user’s query is added to the where qualification of the new query by using an AND operation.
From now on it is no longer sufficient to add just the where clause and the targetlist of the
user’s query to the new query. The group clause and the having qualification also have to
be added to the rewritten query:
Several changes that have already been applied to the targetlist and the where clause also
have to be applied to the having clause. Here is a collection of all additional steps that have
to be performed in order to rewrite a query using a having clause against a simple view (i.e.
a view whose definition does not use any group and having clauses):
Rewrite the subselects contained in the having clause if any are present.
Adapt the varno and varattno fields of all VAR nodes contained in the operator
tree representing the having clause in the same way as it has been done for the tree
representing the where clause. The varno fields are changed to use the base tables
given in the view definition (which have been inserted into the range table entry list
in the meantime) instead of the virtual tables. The positions of the attributes used
in the view may differ from the positions of the corresponding attributes in the base
tables. That’s why the varattno fields also have to be adapted.
Adapt the varno and varattno fields of all VAR nodes contained in the
groupClause of the user’s query in the way and for the reasons described above.
Attach the tree representing the having qualification (which is currently attached
to the field havingClause of the Query node for the user’s query) to the field
havingClause of the Query node for the new (rewritten) query.
Attach the list representing the group clause (currently attached to the field
groupClause of the Query node for the user's query) to the field groupClause of
the Query node for the new (rewritten) query.
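The steps above can be summarized on a toy representation of query trees (plain dicts standing in for Query nodes; the field names follow the text, everything else, including the omission of the varno/varattno adaption, is a simplification):

```python
def rewrite_having_against_view(user_query, view_query):
    """Carry the targetlist, where clause, group clause and having
    qualification of the user's query over into the rewritten query;
    the view definition's query is the backbone of the result."""
    new = dict(view_query)
    # AND the user's where qualification onto the view's qualification.
    q_view, q_user = view_query.get("qual"), user_query.get("qual")
    if q_view is None:
        new["qual"] = q_user
    elif q_user is not None:
        new["qual"] = ("AND", q_view, q_user)
    # Attach the user's group clause and having qualification (their VAR
    # nodes would have been adapted to the base tables at this point).
    new["groupClause"] = user_query.get("groupClause")
    new["havingClause"] = user_query.get("havingClause")
    new["targetList"] = user_query.get("targetList")
    return new
```

The sketch shows only the data movement between the two trees, not the VAR-node fixups that the real rewrite system performs along the way.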
The view definition contains a having clause: Now we will look at the problems that
can arise using views that are defined using a query involving a having clause.
Simple queries against this view will not cause any troubles:
select *
from test_view
where sno <> 5;
This query can easily be rewritten by adding the where qualification of the user's query
(sno <> 5) to the where qualification of the view definition's query.
The next query is also simple but it will cause troubles when it is evaluated against
the above given view definition:
select *
from test_view
where number > 1; /* count(pno) in the view def.
* is called number here */
The currently implemented techniques for query rewriting will rewrite the query to:
select *
from sells
where sno > 2 and
count(pno) > 1
group by sno
having count(pno) > 1;
which is an invalid query because an aggregate function appears in the where clause.
As you can see, this query involves neither a where clause nor a having clause, but it
contains a group clause which groups by the attribute pno. The query in the definition
of the view also contains a group clause that groups by the attribute sno. The two group
clauses are in conflict with each other and therefore the query cannot be rewritten to a
form that would make sense.
Note: There is no solution to the above mentioned problems at the moment and it
does not make sense to put much effort into that because the implementation of the support
for queries like:
(which is part of the SQL92 standard) will automatically also solve these problems.
In the next part of the current section we will present the changes applied to the
source code in order to realize the above described items. Note that it is not necessary to
understand the meaning of every single source line here and therefore we will not discuss
detailed questions like "Why does the variable varno have to be increased by 3?".
Questions like that belong to a chapter dealing with the implementation of views in
PostgreSQL, and to be able to answer them it would be necessary to know all the functions
and not only those described here. The important point for us is to make sure that
whatever is applied to the targetlist and the data structures representing the where clause
is also applied to the data structures for the having clause. There are three files affected:
/src/backend/rewrite/rewriteHandler.c
/src/backend/rewrite/rewriteManip.c
/src/backend/commands/view.c
Here is a description of the changes made to the functions contained in the file
/src/backend/rewrite/rewriteHandler.c:
ApplyRetrieveRule()
This function is invoked whenever a select statement against a view
is recognized and applies the rewrite rule stored in the system catalogs. The
additional source lines given in the listing below make sure that the functions
OffsetVarNodes() and ChangeVarNodes() that are invoked for the where
clause and the targetlist of the query given in the view definition are also called for
the having clause and the group clause of the query in the view definition. These
functions adapt the varno and varattno fields of the VAR nodes involved.
The additional source lines at the end of ApplyRetrieveRule() attach the data
structures representing the having clause and the group clause of the query in the
view definition to the rewritten parsetree. As mentioned earlier, a view definition
involving a group clause will cause troubles whenever a query using a different group
clause against this view is executed. There is no mechanism preventing these troubles
included at the moment.
Note that the functions OffsetVarNodes() , ChangeVarNodes() and
AddHavingQual() appearing in ApplyRetrieveRule() are described at a
later point in time.
static void
ApplyRetrieveRule(Query *parsetree, RewriteRule *rule,
int rt_index, int relation_level,
Relation relation, int *modified)
{
Query *rule_action = NULL;
Node *rule_qual;
List *rtable,
.
.
.
OffsetVarNodes((Node *) rule_action->targetList,
rt_length);
OffsetVarNodes(rule_qual, rt_length);
+ OffsetVarNodes((Node *) rule_action->groupClause,
+ rt_length);
+ OffsetVarNodes((Node *) rule_action->havingQual,
+ rt_length);
.
.
.
ChangeVarNodes(rule_qual,
PRS2_CURRENT_VARNO + rt_length,
rt_index, 0);
+ ChangeVarNodes((Node *) rule_action->groupClause,
+ PRS2_CURRENT_VARNO + rt_length,
+ rt_index, 0);
+ ChangeVarNodes((Node *) rule_action->havingQual,
+ PRS2_CURRENT_VARNO + rt_length,
+ rt_index, 0);
.
.
.
QueryRewriteSubLink()
This function is called by QueryRewrite() to process possibly contained sub-
queries first. It searches for nested queries by recursively tracing through the parse-
tree given as argument. The additional statement makes sure that the having clause
is also examined.
static void
QueryRewriteSubLink(Node *node)
{
if (node == NULL)
return;
switch (nodeTag(node))
{
case T_SubLink:
{
.
.
.
QueryRewriteSubLink((Node *) query->qual);
+ QueryRewriteSubLink((Node *)
+ query->havingQual);
.
.
.
}
.
.
.
}
return;
}
QueryRewrite()
This function takes the parsetree of a query and rewrites it using PostgreSQL’s
rewrite system. Before the query itself can be rewritten, subqueries that are
possibly part of the query have to be processed. Therefore the function
QueryRewriteSubLink() is called for the where clause and for the having
clause.
List *
QueryRewrite(Query *parsetree)
{
QueryRewriteSubLink(parsetree->qual);
+ QueryRewriteSubLink(parsetree->havingQual);
return QueryRewriteOne(parsetree);
}
Here we present the changes applied to the functions that are contained in the file
/src/backend/rewrite/rewriteManip.c:
OffsetVarNodes()
Recursively steps through the parsetree given as the first argument and increments
the varno and varnoold fields of every VAR node found by the offset given as
the second argument. The additional statements are necessary to be able to handle
GroupClause nodes and SubLink nodes that may appear in the parsetree from
now on.
void
OffsetVarNodes(Node *node, int offset)
{
if (node == NULL)
return;
switch (nodeTag(node))
{
.
.
.
+ /* This has to be done to make queries using
+ * groupclauses work on views
+ */
+ case T_GroupClause:
+ {
+ GroupClause *group = (GroupClause *) node;
+
+ OffsetVarNodes((Node *)(group->entry),
+ offset);
+ }
+ break;
.
.
.
+ case T_SubLink:
+ {
+ SubLink *sublink = (SubLink *) node;
+ List *tmp_oper, *tmp_lefthand;
+
ChangeVarNodes()
This function is similar to the above described function OffsetVarNodes()
but instead of incrementing the fields varno and varnoold of all VAR nodes
found, it processes only those VAR nodes whose varno value matches the
parameter old_varno given as argument and whose varlevelsup value matches
the parameter sublevels_up. Whenever such a node is found, the varno and
varnoold fields are set to the value given in the parameter new_varno. The
additional statements are necessary to be able to handle GroupClause and
SubLink nodes.
void
ChangeVarNodes(Node *node, int old_varno,
int new_varno, int sublevels_up)
{
if (node == NULL)
return;
switch (nodeTag(node))
{
.
.
.
+ /* This has to be done to make queries using
+ * groupclauses work on views */
+ case T_GroupClause:
+ {
+ GroupClause *group = (GroupClause *) node;
+
+ ChangeVarNodes((Node *)(group->entry),
+ old_varno, new_varno,
+ sublevels_up);
+ }
+ break;
.
.
.
case T_Var:
{
.
.
.
/* This is a hack: Whenever an attribute
* from the "outside" query is used within
* a nested subquery, the varlevelsup will
* be >0. Nodes having varlevelsup > 0 are
* forgotten to be processed. The call to
* OffsetVarNodes() should really be done at
* another place but this hack makes sure
* that also those VAR nodes are processed.
*/
+ if (var->varlevelsup > 0)
+ OffsetVarNodes((Node *)var,3);
}
break;
.
.
.
case T_SubLink:
{
.
.
.
+ ChangeVarNodes((Node *) query->havingQual,
+ old_varno, new_varno,
+ sublevels_up);
+ ChangeVarNodes((Node *) query->targetList,
+ old_varno, new_varno,
+ sublevels_up);
+
+ /* We also have to adapt the variables used in
+ * sublink->lefthand and sublink->oper
+ */
+ ChangeVarNodes((Node *) (sublink->lefthand),
+ old_varno, new_varno,
+ sublevels_up);
}
break;
.
.
.
}
}
AddHavingQual()
This function adds the operator tree given by the parameter havingQual to the
one attached to the field havingQual of the parsetree given by the parameter
parsetree. This is done by adding a new AND node and attaching the old and
the new operator tree as arguments to it. AddHavingQual() did not exist
before v6.3.2; it has been created for the having logic.
void
AddHavingQual(Query *parsetree, Node *havingQual)
{
Node *copy, *old;
if (havingQual == NULL)
return;
copy = havingQual;
old = parsetree->havingQual;
if (old == NULL)
parsetree->havingQual = copy;
else
parsetree->havingQual =
(Node *) make_andclause(
makeList(parsetree->havingQual,
copy, -1));
}
AddNotHavingQual()
This function is similar to the above described function AddHavingQual(). It
also adds the operator tree given by the parameter havingQual but prefixes it by
a NOT node. AddNotHavingQual() also did not exist before v6.3.2 and
has been created for the having logic.
void
AddNotHavingQual(Query *parsetree,
Node *havingQual)
{
Node *copy;
if (havingQual == NULL)
return;
nodeHandleViewRule()
This function is called by HandleViewRule(). It replaces all VAR nodes of
the user query evaluated against the view (the fields of these VAR nodes represent
the positions of the attributes in the virtual table) by VAR nodes that have already
been prepared to represent the positions of the corresponding attributes in the
physical tables (given in the view definition). The additional statements make sure
that GroupClause nodes and SubLink nodes are handled correctly.
static void
nodeHandleViewRule(Node **nodePtr, List *rtable,
List *targetlist, int rt_index,
int *modified, int sublevels_up)
{
Node *node = *nodePtr;
if (node == NULL)
return;
switch (nodeTag(node))
{
.
.
.
+ /* This has to be done to make queries using
+ * groupclauses work on views
+ */
+ case T_GroupClause:
+ {
+ GroupClause *group = (GroupClause *) node;
+ nodeHandleViewRule((Node **) (&(group->entry)),
+ rtable, targetlist, rt_index,
+ modified, sublevels_up);
+ }
+ break;
.
.
.
case T_Var:
{
.
.
.
if (n == NULL)
{
*nodePtr = make_null(((Var *)node)->vartype);
}
else
+ /* This is a hack: The varlevelsup of the
+ * original variable and the new one should
+ * be the same. Normally we adapt the node
+ * by changing a pointer to point to a var
+ * contained in ’targetlist’. In the
+ * targetlist all varlevelsups are 0 so if
+ * we want to change it to the original
+ * value we have to copy the node before!
+ * (Maybe this will cause troubles with some
+ * sophisticated queries on views?)
+ */
+ {
+ if(this_varlevelsup>0)
+ {
+ *nodePtr = copyObject(n);
+ }
+ else
+ {
+ *nodePtr = n;
+ }
+ ((Var *)*nodePtr)->varlevelsup =
+ this_varlevelsup;
+ }
*modified = TRUE;
}
break;
.
.
.
case T_SubLink:
{
.
.
.
+ nodeHandleViewRule(
+ (Node **) &(query->havingQual),
+ rtable, targetlist, rt_index,
+ modified, sublevels_up);
+ nodeHandleViewRule(
+ (Node **) &(query->targetList),
+ rtable, targetlist, rt_index,
+ modified, sublevels_up);
+ /* We also have to adapt the variables used
+ * in sublink->lefthand and sublink->oper
+ */
+ nodeHandleViewRule(
+ (Node **) &(sublink->lefthand),
+ rtable, targetlist, rt_index,
+ modified, sublevels_up);
+ /* Make sure the first argument of
+ * sublink->oper points to the same var as
+ * sublink->lefthand does otherwise we will
+ * run into troubles using aggregates
+ * (aggno will not be set correctly!)
+ */
+ pfree(lfirst(((Expr *)
+ lfirst(sublink->oper))->args));
+ tmp_lefthand = sublink->lefthand;
+ foreach(tmp_oper, sublink->oper)
+ {
+ lfirst(((Expr *) lfirst(tmp_oper))->args) =
+ lfirst(tmp_lefthand);
+ tmp_lefthand = lnext(tmp_lefthand);
+ }
}
break;
.
.
.
}
}
HandleViewRule()
This function calls nodeHandleViewRule() for the where clause, the targetlist,
the group clause and the having clause of the user query evaluated against the given
view.
void
HandleViewRule(Query *parsetree, List *rtable,
List *targetlist, int rt_index,
int *modified)
{
.
.
.
+ /* The variables in the havingQual and
+ * groupClause also have to be adapted
+ */
+ nodeHandleViewRule(&parsetree->havingQual, rtable,
+ targetlist, rt_index,
+ modified, 0);
+ nodeHandleViewRule(
+ (Node **)(&(parsetree->groupClause)),
+ rtable, targetlist, rt_index, modified, 0);
}
The following function is contained in /src/backend/commands/view.c:
UpdateRangeTableOfViewParse()
This function updates the range table of the parsetree given by the parameter
viewParse. The additional statement makes sure that the VAR nodes of the having
clause are modified in the same way as the VAR nodes of the where clause are.
static void
UpdateRangeTableOfViewParse(char *viewName,
Query *viewParse)
{
.
.
.
OffsetVarNodes(viewParse->qual, 2);
+ OffsetVarNodes(viewParse->havingQual, 2);
.
.
.
}
Planner/Optimizer
The planner builds a queryplan like the one shown in figure 3.8 and in addition to that
it takes the operator tree attached to the field havingClause of the Query node and
attaches it to the qpqual field of the AGG node.
Unfortunately this is not the only thing to do. Remember from section 3.7.1 How
Aggregate Functions are Implemented that the targetlist is searched for aggregate functions
which are appended to a list that will be attached to the field aggs of the AGG node. This
was sufficient as long as aggregate functions have only been allowed to appear within the
targetlist. Now the having clause is another source of aggregate functions. Consider the
following example:
Here the aggregate functions max and count are in use. If only the targetlist is scanned
(as it was the case before the having clause had been implemented) we will only find and
process the aggregate function max. The second function count is not processed and
therefore any reference to the result of count from within the having clause will fail.
The solution to this problem is to scan the whole operator tree representing the having
clause for aggregate functions not contained in the targetlist yet and add them to the list of
aggregate functions attached to the field aggs of the AGG node. The scanning is done by
the function check_having_qual_for_aggs() which steps recursively through the
tree.
While scanning the having clause for aggregate functions not contained in the tar-
getlist yet, an additional check is made to make sure that aggregate functions are used
within the having clause (otherwise the query could have been formulated using the where
clause). Consider the following query which is not a valid SQL92 query:
There is no need to express this query using a having clause, this kind of qualification
belongs to the where clause:
There is still an unsolved problem left. Consider the following query where we want to
know just the supplier numbers (sno) of all suppliers selling more than one part:
select sno
from sells
group by sno
having count(pno) > 1;
The planner creates a queryplan (like the one shown in figure 3.8) where the targetlists of
all nodes involved contain only entries of those attributes listed after the select keyword
of the query. Looking at the example above this means that the targetlists of the AGG
node, the GRP node, the SORT node and the SeqScan node contain only the entry for the
attribute sno. As described earlier the aggregation logic operates on attributes of the tuples
returned by the subplan of the AGG node (i.e. the result of the GRP node). Which attributes
are contained in the tuples handed back by a subplan is determined by the targetlist. In
the case of our example the attribute pno needed for the aggregate function count is not
included and therefore the aggregation will fail.
82 CHAPTER 3. POSTGRESQL FROM THE PROGRAMMER’S POINT OF VIEW
The extended targetlist is used to create the subplan attached to the lefttree field
of the AGG node. That means that the targetlists of the GRP node, of the SORT
node and of the SeqScan node will now contain an entry for the attribute pno.
The targetlist of the AGG node itself will not be changed because we do not want to
include the attribute pno into the result returned by the whole query.
Care has to be taken that the varattno fields of the VAR nodes used in the targetlists
contain the position of the corresponding attribute in the targetlist of the subplan (i.e. the
subplan delivering the tuples for further processing by the actual node).
The following part deals with the source code of the new and changed functions involved
in the planner/optimizer stage. The files affected are:
/src/backend/optimizer/plan/setrefs.c
/src/backend/optimizer/plan/planner.c
Since all of the functions presented here are very long and would take up too much
space if presented as a whole, we just list the most important parts.
List *
check_having_qual_for_aggs(Node *clause,
List *subplanTargetList,
List *groupClause)
{
List *t, *l1;
List *agg_list = NIL;
int contained_in_group_clause = 0;
if (IsA(clause, Var))
{
3.7. THE REALIZATION OF THE HAVING CLAUSE 83
TargetEntry *subplanVar;
select sid
from part
group by sid
having min(pid) > 1;
In the above query the attribute pid is used in the having clause but it does not appear
in the targetlist (i.e. the list of attributes after the keyword select). Unfortunately
only those attributes are delivered by the subplan and can therefore be used within
the having clause. To be able to handle queries like that correctly, we have to
extend the actual targetlist by those attributes used in the having clause but not
already appearing in the targetlist.
List *
check_having_qual_for_vars(Node *clause,
List *targetlist_so_far)
{
List *t;
if (IsA(clause, Var))
{
Rel tmp_rel;
tmp_rel.targetlist = targetlist_so_far;
/* Check if the VAR is already contained in the
* targetlist
*/
if (tlist_member((Var *)clause,
(List *)targetlist_so_far) == NULL)
{
add_tl_element(&tmp_rel, (Var *)clause);
}
return tmp_rel.targetlist;
}
else
if (is_funcclause(clause) || not_clause(clause)
|| or_clause(clause) || and_clause(clause))
{
/* This is a function. Recursively call this
* routine for its arguments...
*/
foreach(t, ((Expr *) clause)->args)
{
targetlist_so_far =
check_having_qual_for_vars(lfirst(t),
targetlist_so_far);
}
return targetlist_so_far;
}
else
if (IsA(clause, Aggreg))
{
targetlist_so_far =
check_having_qual_for_vars(
((Aggreg *)clause)->target,
targetlist_so_far);
return targetlist_so_far;
}
.
.
.
}
The next function is found in /src/backend/optimizer/plan/planner.c:
union_planner()
This function creates a plan from the parsetree given to it by the parameter parse
that can be executed by the executor.
If aggregate functions are present (indicated by parse->hasAggs set to true) the
first step is to extend the targetlist by those attributes that are used within the having
clause (if one is present) but do not appear in the select list (refer to the description
of check_having_qual_for_vars() above).
The next step is to call the function query_planner(), creating a plan without
taking the group clause, the aggregate functions and the having clause into account
for the moment.
Next insert a GRP node at the top of the plan according to the group clause of the
parsetree if any is present.
Add an AGG node to the top of the current plan if aggregate functions are present and
if a having clause is present additionally perform the following steps:
Plan *
union_planner(Query *parse)
{
List *tlist = parse->targetList;
Executor
The executor takes the queryplan produced by the planner/optimizer in the way just described
and processes all aggregate functions as described in section 3.7.1 (The Implementation
of Aggregate Functions). But before the tuple derived is handed back, the operator
tree attached to the field qpqual is evaluated by calling the function ExecQual().
This function recursively steps through the operator tree (i.e. the having clause) and eval-
uates the predicates appearing there. Thanks to the changes made to the planner, the
values of all operands needed to evaluate the predicates (e.g. the values of all aggregate
functions) are already present and can be accessed throughout the evaluation
without any problems.
If the evaluation of the having qualification returns true the tuple is returned by the
function execAgg() otherwise it is ignored and the next group is processed.
The necessary changes and enhancements have been applied to the following function
in the file /src/backend/executor/nodeAgg.c:
execAgg() Whenever the executor gets to an AGG node this function is called.
Before the having logic had been implemented, all the tuples of the current group
were fetched from the subplan and all aggregate functions were applied to these
tuples. After that, the results were handed back to the calling function.
Since the having logic has been implemented there is one additional step executed.
Before the results of applying the aggregate functions are handed back, the function
ExecQual() is called with the representation of the having clause as an argument.
If true is returned, the results are handed back, otherwise they are ignored and we
start from the beginning for the next group until a group meeting the restrictions
given in the having clause is found.
TupleTableSlot *
ExecAgg(Agg *node)
{
.
.
.
/* We loop retrieving groups until we find one
* matching node->plan.qual
*/
+ do
+ {
.
.
.
/* Apply *all* aggregate functions to the
* tuples of the *current* group
*/
.
.
.
econtext->ecxt_scantuple =
aggstate->csstate.css_ScanTupleSlot;
resultSlot = ExecProject(projInfo, &isDone);
Example 3.4 The following tables A, B and C are used in the examples of this section:
A: C1|C2|C3      B: C1|C2|C3      C: C1|C2|C3
   --+--+--         --+--+--         --+--+--
   1| a| b          1| a| b          4| e| f
   2| a| b          5| a| b          8| e| f
   3| c| d          3| c| d
   4| e| f          8| e| f
select * from A
union
select * from B;
C1|C2|C3
--+--+--
1| a| b
2| a| b
3| c| d
4| e| f
5| a| b
8| e| f
select C1, C3 from A
union
select C1, C2 from B;
C1|C3
--+--
1| b
2| b
3| d
4| f
1| a
5| a
3| c
8| e
Note that the selected columns do not need to have identical names, they only have to be
of the same type. In the previous example we selected C1 and C3 in the first select
statement and C1 and C2 in the second one. The names of the resulting columns are
taken from the first select statement.
select * from A
intersect
select * from B;
will return:
C1|C2|C3
--+--+--
1| a| b
3| c| d
select * from A
except
select * from B;
will return:
C1|C2|C3
--+--+--
2| a| b
4| e| f
The last examples were rather simple because they only used one set operator at a time with
only two operands. Now we look at some more complex queries involving more operators:
select * from A
union
select * from B
intersect
select * from C;
will return:
C1|C2|C3
--+--+--
4| e| f
8| e| f
The above query performs the set theoretic computation (A ∪ B) ∩ C. When no parentheses
are used, the operations are considered to be left associative, i.e. A union B intersect C will be
treated as (A union B) intersect C.
The same query using parentheses can lead to a completely different result:
select * from A
union
(select * from B
intersect
select * from C);
3.8. THE REALIZATION OF UNION, INTERSECT AND EXCEPT 91
performs A ∪ (B ∩ C) and will return:
C1|C2|C3
--+--+--
1| a| b
2| a| b
3| c| d
4| e| f
8| e| f
parser stage
rewrite system
planner/optimizer
executor
and we will now describe what every single stage does to the query. For our explanation we
assume a simple query is being processed (i.e. a query without subselects, aggregates and
without involving views).
The parser (given by gram.y) processes all three select statements, creates a
SelectStmt node for every select and attaches the where qualifications, targetlists
etc. to the corresponding nodes. Then it creates a list of the second and the third
SelectStmt node (of B and C) and attaches it to the field unionClause of the first
node (of A). Finally it hands back the first node (node A) with the list of the remaining
nodes attached as shown in figure 3.9.
The following transformation routines process the data structure handed back by the
parser. First the top node (node A) is transformed from a SelectStmt node to a
Query node. The targetlist, the where qualification etc. attached to it are transformed
as well. Next the list of the remaining nodes (attached to unionClause of node A) is
transformed and in this step also a check is made if the types and lengths of the targetlists
of the involved nodes are equal. The new Query nodes are now handed back in the same
way as the SelectStmt nodes were before (i.e. the Query nodes B and C are collected
in a list which is attached to unionClause of Query node A).
If any rewrite rules are present for the Query nodes (i.e. one of the select statements
uses a view) the necessary changes to the Query nodes are performed (see section 3.4.1
Techniques To Implement Views). Otherwise no changes are made to the nodes in this stage.
Planner/Optimizer
This stage has to create a plan out of the querytree produced by the parser stage that can
be executed by the executor. In most cases there are several ways (paths) with different
cost to get to the same result. It’s the planner/optimizer’s task to find out which path is the
cheapest and to create a plan using this path. The implementation of unions in PostgreSQL
is based on the following idea:
The set derived by evaluating A ∪ B must contain every member of A and every
member of B. So if we append the members of B to the members of A we are almost
done. If there exist members common to A and B these members are now contained twice
in our new set, so the only thing left to do is to remove these duplicates.
In the case of our example the planner would build up the tree shown in figure 3.10. Every
Query node is planned separately and results in a SeqScan node in our example. The
three SeqScan nodes are put together into a list which is attached to unionplans of an
Append node.
Figure 3.10: Queryplan for the union query: a Unique node whose lefttree points to a
Sort node, whose lefttree in turn points to an Append node holding the list of the three
subplans in its unionplans field.
Executor
The executor will process all the SeqScan nodes and append all the delivered tuples to
a single result relation. Now it is possible that duplicate tuples are contained in the result
relation which have to be removed. The removal is done by the Unique node and the sort
is just performed to make its work easier.
The idea is based on the fact that intersect and except statements are redundant in
SQL, i.e. for every intersect or except statement it is possible to formulate a semantically
equivalent statement without using intersect or except.
Example 3.5 This example shows how a query using intersect can be transformed to a
semantically equivalent query without an intersect:
select * from A
intersect
select * from B;
is equivalent to:
select * from A
where (C1, C2, C3) in
(select * from B);
Definition 3.1
The intersection of two sets S1 and S2 is defined as:
S1 ∩ S2 = {x | x ∈ S1 ∧ x ∈ S2}
The intersection of n sets S1, ..., Sn is defined as:
S1 ∩ S2 ∩ ... ∩ Sn = {x | x ∈ Si for all 1 ≤ i ≤ n}
Definition 3.2
The difference of two sets S1 and S2 is defined as:
S1 − S2 = {x | x ∈ S1 ∧ x ∉ S2}
Definition 3.3
The union of two sets S1 and S2 is defined as:
S1 ∪ S2 = {x | x ∈ S1 ∨ x ∈ S2}
The union of n sets S1, ..., Sn is defined as:
S1 ∪ S2 ∪ ... ∪ Sn = {x | x ∈ Si for at least one 1 ≤ i ≤ n}
Example 3.6 In the following example the Li are literals, i.e. of the form x ∈ S or
¬(x ∈ S):
(L1 ∧ L2) ∨ (L3 ∧ L4) ∨ L5 is a formula in DNF
(L1 ∨ L2) ∧ L3 is not in DNF.
The transformation of any formula in propositional logic into DNF is done by successively
applying the following rules:
(R1) ¬(A ∨ B) → (¬A ∧ ¬B)
(R2) ¬(A ∧ B) → (¬A ∨ ¬B)
(R3) (A ∧ (B ∨ C)) → ((A ∧ B) ∨ (A ∧ C))
(R4) ((A ∨ B) ∧ C) → ((A ∧ C) ∨ (B ∧ C))
It can be shown that the transformation using the rules (R1) to (R4) always terminates
after a finite number of steps.
As figure 3.9 illustrates, the operands of a union operation are just Query nodes
(the first operand is the top node and all further operands form a list which is attached to
the field unionClause of the top node). So our goal will be to transform every query
involving set operations into this form. (Note that the operands to the union operation may
be complex, i.e. subselects, grouping, aggregates etc. are allowed.)
The transformation of a query involving set operations in any order into a query that
can be accepted by the union logic is equivalent to transforming the membership formula
(see definitions 3.1, 3.2 and 3.3) in propositional logic into disjunctive normal form (DNF).
The transformation of any formula in propositional logic into DNF is always possible in a
finite number of steps.
The advantage of this transformation technique is its small impact on the whole system
and the implicit invocation of the planner/optimizer. The only changes necessary are made
to the parser stage and the rewrite system.
Here are some of the changes that had to be applied to the source code before the parser
stage and the rewrite system could be adapted:
Add the new keywords EXCEPT and INTERSECT to the file
/src/backend/parser/keywords.c and make the node output and input functions
aware of the new intersectClause field:
static void
_outQuery(StringInfo str, Query *node)
{
.
.
.
appendStringInfo(str, " :unionClause ");
_outNode(str, node->unionClause);
+ appendStringInfo(str, " :intersectClause ");
+ _outNode(str, node->intersectClause);
}
static Query *
_readQuery()
{
.
.
.
token = lsptok(NULL, &length);
local_node->unionClause = nodeRead(true);
+ token = lsptok(NULL, &length);
+ local_node->intersectClause = nodeRead(true);
return (local_node);
}
void
ExecReScan(Plan *node, ExprContext *exprCtxt,
Plan *parent)
{
.
.
.
switch (nodeTag(node))
{
.
.
.
+ case T_Group:
+ ExecReScanGroup((Group *) node,
+ exprCtxt, parent);
+ break;
.
.
.
}
}
void
ExecReScanGroup(Group *node, ExprContext *exprCtxt,
Plan *parent)
{
GroupState *grpstate = node->grpstate;
grpstate->grp_useFirstTuple = FALSE;
grpstate->grp_done = FALSE;
grpstate->grp_firstTuple = (HeapTupleData *)NIL;
/*
* if chgParam of subnode is not null then plan
* will be re-scanned by first ExecProcNode.
*/
if (((Plan *) node)->lefttree->chgParam == NULL)
ExecReScan(((Plan *) node)->lefttree,
exprCtxt, (Plan *) node);
}
Parser
The parser defined in the file /src/backend/parser/gram.y had to be modified
in two ways:
The grammar had to be adapted to support the usage of parentheses (to be able to
specify the order of execution of the set operators).
The code building up the data structures handed back by the parser had to be inserted.
Here is the part of the grammar responsible for select statements, with the code that
builds up the data structures inserted:
if IsA($1, SelectStmt)
{
SelectStmt *n = (SelectStmt *)$1;
n->sortClause = $2;
$$ = (Node *)n;
}
else
{
/* Create a "flat list" of the operator
* tree built up by ’select_w_o_sort’ and
* let select_list point to it
*/
create_select_list((Node *)op,
&select_list,
&unionall_present);
/* Replace all the A_Expr nodes in the
* operator tree by Expr nodes.
*/
op = A_Expr_to_Expr(op, &intersect_present);
.
.
.
/* Get the leftmost SelectStmt node (which
* automatically represents the first Select
* Statement of the query!) */
first_select =
(SelectStmt *)lfirst(select_list);
/* Attach the list of all SelectStmt nodes
* to unionClause
*/
first_select->unionClause = select_list;
The keywords SELECT, EXCEPT, UNION, INTERSECT, '(' and ')' are terminal
symbols and SelectStmt, select_w_o_sort, sort_clause, opt_union,
SubSelect, opt_unique, res_target_list2, result, from_clause,
where_clause, group_clause, having_clause are nonterminal symbols. The
symbols EXCEPT, UNION and INTERSECT are left associative, meaning that a statement
like:
select * from A
union
select * from B
union
select * from C;
will be treated as:
((select * from A
union
select * from B)
union
select * from C)
The select_w_o_sort rule builds up an operator tree using nodes of type A_Expr. For
every union an OR node is created, for every intersect an AND node and for every except
an AND NOT node, building up a representation of a formula in propositional logic. If the
query parsed did not contain any set operations the rule hands back a SelectStmt node
representing the query; otherwise the top node of the operator tree is returned. Figure 3.11
shows a typical operator tree returned by the select_w_o_sort rule.
Figure 3.11: Operator tree of A_Expr nodes (AND, OR and NOT) with the SelectStmt
nodes, each carrying its unionClause and intersectClause fields, as leaves.
The rule SelectStmt transforms the operator tree built of A_Expr nodes into an
operator tree using Expr nodes by a call to the function A_Expr_to_Expr(), which
additionally replaces every OR node by an AND node and vice versa. This is performed in
order to be able to use the function cnfify() later on.
Create a flat list of all SelectStmt nodes of the operator tree (by a call to the
function create_select_list()) and attach the list to the field unionClause
of the leftmost SelectStmt (see next step).
Find the leftmost leaf (SelectStmt node) of the operator tree (this is auto-
matically the first member of the above created list because of the technique
create_select_list() uses to create the list).
Attach the whole operator tree (including the leftmost node itself) to the field
intersectClause of the leftmost SelectStmt node.
Attach the data structure built up by the sort clause rule to the field
sortClause of the leftmost SelectStmt node.
Hand back the leftmost SelectStmt node (with the operator tree, the list of all
SelectStmt nodes and the sortClause attached to it).
Figure 3.12 shows the final data structure derived from the operator tree shown in figure
3.11 handed back by the SelectStmt rule:
Figure 3.12: The leftmost SelectStmt node with the whole Expr operator tree (OR,
AND, NOT) attached to its intersectClause and the flat list of all SelectStmt
nodes attached to its unionClause.
Here is a description of the functions used above. They can be found in the file
/src/backend/parser/analyze.c.
void
create_select_list(Node *ptr, List **select_list,
bool *unionall_present)
{
if(IsA(ptr, SelectStmt))
{
*select_list = lappend(*select_list, ptr);
if(((SelectStmt *)ptr)->unionall == TRUE)
*unionall_present = TRUE;
return;
}
/* Otherwise the node is an A_Expr: recurse into both
 * operands (lexpr first, so the leftmost SelectStmt
 * ends up first in the list)
 */
create_select_list(((A_Expr *)ptr)->lexpr,
select_list, unionall_present);
create_select_list(((A_Expr *)ptr)->rexpr,
select_list, unionall_present);
}
A_Expr_to_Expr():
This function recursively steps through the operator tree handed to it by the param-
eter ptr and replaces A Expr nodes by Expr nodes. Additionally it exchanges
AND nodes with OR nodes and vice versa. The reason for this exchange is easy to
understand. We implement intersect and except clauses by rewriting these queries
to semantically equivalent queries that use IN and NOT IN subselects. To be able
to use all three operations (unions, intersects and excepts) in one complex query, we
have to translate the queries into Disjunctive Normal Form (DNF). Unfortunately
there is no function dnfify(), but there is a function cnfify() which produces
DNF when we exchange AND nodes with OR nodes and vice versa before calling
cnfify() and exchange them again in the result.
Node *
A_Expr_to_Expr(Node *ptr,
bool *intersect_present)
{
Node *result;
switch(nodeTag(ptr))
{
case T_A_Expr:
{
A_Expr *a = (A_Expr *) ptr;
switch (a->oper)
{
case AND:
{
Expr *expr = makeNode(Expr);
Node *lexpr =
A_Expr_to_Expr(((A_Expr *)ptr)->lexpr,
intersect_present);
Node *rexpr =
A_Expr_to_Expr(((A_Expr *)ptr)->rexpr,
intersect_present);
*intersect_present = TRUE;
expr->typeOid = BOOLOID;
expr->opType = OR_EXPR;
expr->args = makeList(lexpr, rexpr, -1);
result = (Node *) expr;
break;
}
case OR:
{
Expr *expr = makeNode(Expr);
Node *lexpr =
A_Expr_to_Expr(((A_Expr *)ptr)->lexpr,
intersect_present);
Node *rexpr =
A_Expr_to_Expr(((A_Expr *)ptr)->rexpr,
intersect_present);
expr->typeOid = BOOLOID;
expr->opType = AND_EXPR;
expr->args = makeList(lexpr, rexpr, -1);
result = (Node *) expr;
break;
}
case NOT:
{
Expr *expr = makeNode(Expr);
Node *rexpr =
A_Expr_to_Expr(((A_Expr *)ptr)->rexpr,
intersect_present);
expr->typeOid = BOOLOID;
expr->opType = NOT_EXPR;
expr->args = makeList(rexpr, -1);
result = (Node *) expr;
break;
}
}
break;
}
3.8. THE REALIZATION OF UNION, INTERSECT AND EXCEPT 105
default:
{
result = ptr;
}
}
return result;
}
Note that the stmtmulti and OptStmtMulti rules had to be changed in order to avoid
shift/reduce conflicts. The old rules allowed an invalid syntax (e.g. the concatenation of
two statements without a ';' in between) which is now prevented. The new rules simply
have their second line commented out.
Transformations
This step normally transforms every SelectStmt node found into a Query node and
does a lot of transformations to the targetlist, the where qualification etc. As mentioned
above this stage expects a SelectStmt node and cannot handle an A_Expr node. That's
why we made the changes to the operator tree shown in figure 3.12.
The transformation of the list attached to unionClause is prevented. The raw list
is now passed through instead and the necessary transformations are performed at a
later point in time.
The changes described in the above paragraph have been applied to the functions
transformInsertStmt() and transformSelectStmt(), which are contained
in the file /src/backend/parser/analyze.c:
transformInsertStmt():
static Query *
transformInsertStmt(ParseState *pstate,
InsertStmt *stmt)
{
.
.
.
106 CHAPTER 3. POSTGRESQL FROM THE PROGRAMMER’S POINT OF VIEW
transformSelectStmt():
static Query *
transformSelectStmt(ParseState *pstate,
SelectStmt *stmt)
{
.
.
.
/* Just pass through the unionClause and
* intersectClause. We will process it in
* the function Except_Intersect_Rewrite()
*/
qry->unionClause = stmt->unionClause;
qry->intersectClause = stmt->intersectClause;
.
.
.
return (Query *) qry;
}
In this stage the information contained in the operator tree attached to the topmost
SelectStmt node is used to form a tree of Query nodes representing the rewritten
query (i.e. the semantically equivalent query that contains only union but no intersect or
except operations).
Recursively step through the operator tree and transform every SelectStmt node
to a Query node using the function intersect_tree_analyze() described
below. The one node already transformed (the topmost node) is still contained in the
operator tree and must not be transformed again because this would cause trouble
in the transforming logic.
Figure 3.13: Data structure of the query after transformation to DNF: Expr (OR) nodes
over Expr (NOT) nodes with the Query nodes, each carrying its unionClause and
intersectClause fields, as leaves.
Transform the new operator tree into DNF (disjunctive normal form). PostgreSQL
does not provide any function for the transformation into DNF but it provides a
function cnfify() that performs a transformation into CNF (conjunctive normal
form). So we can easily make use of this function when we exchange every OR with
an AND and vice versa before calling cnfify() as we did already in the parser
(compare figure 3.11 to figure 3.12). When cnfify() is called with a special flag,
removeAndFlag, set to true, it returns a list whose entries can be thought of as being
connected together by ANDs, so the explicit AND nodes are removed.
After cnfify() has been called we normally would have to exchange OR and AND
nodes again. We skip this step by simply treating every OR node as an AND node
throughout the following steps (remember, that there are no AND nodes left that have
to be treated as OR nodes because of the removeAndFlag).
Figure 3.13 shows what the data structure looks like after the transformation to DNF
has taken place for the following query:
(select * from A
union
select * from B)
except
select * from C;
For every entry of the list returned by cnfify() (i.e. for every operator tree which
may only contain OR and NOT operator nodes and Query nodes as leaves) contained
in the union list perform the following steps:
– Check if the targetlists of all Query nodes appearing are compatible (i.e. all
targetlists have the same number of attributes and the corresponding attributes
are of the same type)
– There must be at least one positive OR node (i.e. at least one OR node which is
not preceded by a NOT node). Create a list of all Query nodes (or of Query
nodes preceded by NOT nodes if OR NOT is found) with the non-negated node
first, using the function create_list() described below.
– The first (non-negated) node of the list will be the topmost Query node of
the current union operand. For all other nodes found in the list add an IN
subselect (NOT IN subselect if the Query node is preceded by a NOT) to the
where qualification of the topmost node. Adding a subselect to the where
qualification is done by logically ANDing it to the original qualification.
– Append the Query node set up in the last steps to a list which is held by the
pointer union_list.
Take the first node of union_list as the new topmost node of the whole query
and attach the rest of the list to the field unionClause of this topmost node. Since
the new topmost node might differ from the original one (i.e. from the node which
was topmost when we entered the rewrite stage) we have to attach the fields saved
in the first step to the new topmost node (i.e. the sortClause, targetList,
unionFlag, etc.).
Hand the new topmost Query node back. Now the normal query rewriting takes
place (in order to handle views if present) and then the planner/optimizer and executor
functions are called to get a result. No changes have been made to the code
of these stages.
Figure 3.14 shows the rewritten data structure of the query:
select C1, C2 from A
intersect
select C1, C3 from C;
against the tables defined in example 3.4. The rewritten data structure represents the query:
select C1, C2 from A
where (C1, C2) in
(select C1,C3 from C);
The field lefthand of the Sublink node points to a list where every entry points to a
VAR node of the targetlist of the topmost node (node A). The field oper of the Sublink
node points to a list holding a pointer to an Expr node for every attribute of the topmost
targetlist. Every Expr node is used to compare a VAR node of the topmost targetlist with
the corresponding VAR node of the subselect’s targetlist. So the first argument of every
Expr node points to a VAR node of the topmost targetlist and the second argument points
to the corresponding VAR node of the subselect’s targetlist.
Figure 3.14: Data structure of the query after query rewriting: the Query node for A
(targetlist entries for C1 and C2, rtable entry for "A") with a Sublink node in its
qualification; the Sublink's oper list holds '=' Expr nodes comparing the VAR nodes of
the topmost targetlist with the corresponding VAR nodes of the targetlist of the subselect's
Query node for C.
If the user’s query involves union as well as intersect or except there will be more Query
nodes of the form shown in figure 3.14. One will be the topmost node (as described above)
and the others will be collected in a list which is attached to the field unionClause of
the topmost node. (The intersectClause fields of all Query nodes will be set to NULL
because they are no longer needed.)
List *
pg_parse_and_plan(char *query_string, Oid *typev,
int nargs,
QueryTreeList **queryListP,
CommandDest dest)
{
.
.
.
/* Rewrite Union, Intersect and Except Queries
* to normal Union Queries using IN and NOT
* IN subselects */
if(querytree->intersectClause != NIL)
{
querytree = Except_Intersect_Rewrite(querytree);
}
.
.
.
}
Query *
Except_Intersect_Rewrite (Query *parsetree)
{
.
.
.
/* Save some fields, to be able to restore them
* to the resulting top node at the end of the
* function
*/
sortClause = parsetree->sortClause;
uniqueFlag = parsetree->uniqueFlag;
into = parsetree->into;
isBinary = parsetree->isBinary;
isPortal = parsetree->isPortal;
while(intersect_list != NIL)
{
n = makeNode(SubLink);
n->lefthand =
lappend(n->lefthand, tent->expr);
}
foreach(elist, left_expr)
{
Node *lexpr = lfirst(elist);
Node *rexpr = lfirst(right_expr);
TargetEntry *tent = (TargetEntry *) rexpr;
Expr *op_expr;
create_list()
Create a list of nodes that are either Query nodes or NOT nodes followed by a
Query node. The tree given in ptr contains at least one non-negated Query node.
This node is put at the beginning of the list.
void
create_list(Node *ptr, List **intersect_list)
{
if(IsA(ptr,Query))
{
/* The non negated node is attached at the
* beginning (lcons) */
*intersect_list = lcons(ptr, *intersect_list);
return;
}
if(IsA(ptr,Expr))
{
if(((Expr *)ptr)->opType == NOT_EXPR)
{
/* negated nodes are appended to the
* end (lappend)
*/
*intersect_list =
lappend(*intersect_list, ptr);
return;
}
else
{
foreach(arg, ((Expr *)ptr)->args)
{
create_list(lfirst(arg), intersect_list);
}
return;
}
return;
}
}
intersect_tree_analyze():
if(IsA(tree, SelectStmt))
{
QueryTreeList *qtree;
.
.
.
PostgreSQL has become one of the most popular non-commercial RDBMSs in the UNIX
world. It provides an extended subset of the SQL92 standard as query language, allows
concurrent database access, provides a huge amount of datatypes, etc. At the time of
writing this document the implemented part of SQL92 in PostgreSQL lacked two important
features:
The having clause was not implemented.
The set operations intersect and except were not supported.
The implementation of these two items was the motivation for the whole work. When
I started to implement the above mentioned features, I noticed that there was almost
no documentation on the internal structure of PostgreSQL available that could help a
programmer find his way in. So I decided to include all the knowledge I collected
while working with the source code in this document, hoping that it will be useful for
any newcomer who wants to enhance PostgreSQL. Additionally I included a short
discussion on SQL and a description of PostgreSQL's features (like triggers etc.) and how
they can be used.
Chapter 3 first gives an overview on the internal structure of PostgreSQL and presents
the stages and data structures that are involved whenever a SQL query arrives. The
parser stage, the rule system (which is mainly used for the implementation of views), the
planner/optimizer and the executor are described and illustrated by a lot of figures. After
that the changes necessary for the implementation of the missing features (having clause
and except/intersect) are presented including parts of the added source code.
Bibliography
[DATE96] C. J. Date with H. Darwen: A Guide to the SQL Standard, Fourth Edition,
Addison-Wesley 1996