PL/pgSQL - SQL Procedural Language
PL/pgSQL is a loadable procedural language for the
Postgres database system.
This package was originally written by Jan Wieck.
Overview
The design goals of PL/pgSQL were to create a loadable procedural
language that
can be used to create functions and trigger procedures,
adds control structures to the SQL language,
can perform complex computations,
inherits all user defined types, functions and operators,
can be defined to be trusted by the server,
is easy to use.
The PL/pgSQL call handler parses the function's source text and
produces an internal binary instruction tree the first time the
function is called. The produced bytecode is identified
in the call handler by the object ID of the function. This ensures
that changing a function by a DROP/CREATE sequence will take effect
without establishing a new database connection.
For all expressions and SQL statements used in
the function, the PL/pgSQL bytecode interpreter creates a
prepared execution plan using the SPI manager's SPI_prepare() and
SPI_saveplan() functions. This is done the first time the individual
statement is processed in the PL/pgSQL function. Thus, a function with
conditional code that contains many statements for which execution
plans would be required, will only prepare and save those plans
that are really used during the lifetime of the database
connection.
Because PL/pgSQL saves execution plans in this way, queries that appear
directly in a PL/pgSQL function must refer to the same tables and fields
on every execution; that is, you cannot use a parameter as the name of
a table or field in a query. To get around
this restriction, you can construct dynamic queries using the PL/pgSQL
EXECUTE statement --- at the price of constructing a new query plan
on every execution.
Except for input/output conversion and calculation functions
for user defined types, anything that can be defined in C language
functions can also be done with PL/pgSQL. It is possible to
create complex conditional computation functions and later use
them to define operators or use them in functional indices.
Description
Structure of PL/pgSQL
The PL/pgSQL language is case insensitive. All keywords and
identifiers can be used in mixed upper- and lower-case.
PL/pgSQL is a block oriented language. A block is defined as
<<label>>
DECLARE
declarations
BEGIN
statements
END;
There can be any number of sub-blocks in the statement section
of a block. Sub-blocks can be used to hide variables from outside a
block of statements. The variables
declared in the declarations section preceding a block are
initialized to their default values every time the block is entered,
not only once per function call.
It is important not to confuse the use of BEGIN/END for
grouping statements in PL/pgSQL with the database commands for
transaction control. PL/pgSQL's BEGIN/END are only for grouping;
they do not start or end a transaction. Functions and trigger procedures
are always executed within a transaction established by an outer query
--- they cannot start or commit transactions, since
Postgres does not have nested transactions.
Comments
There are two types of comments in PL/pgSQL. A double dash --
starts a comment that extends to the end of the line. A /*
starts a block comment that extends to the next occurrence of */.
Block comments cannot be nested, but double dash comments can be
enclosed into a block comment and a double dash can hide
the block comment delimiters /* and */.
Declarations
All variables, rows and records used in a block or its
sub-blocks must be declared in the declarations section of a block,
except for the loop variable of a FOR-loop iterating over a range
of integer values. Parameters given to a PL/pgSQL function are
automatically declared with the usual identifiers $1, $2, etc.
The declarations have the following syntax:
name CONSTANT
type NOT NULL DEFAULT | :=
value ;
Declares a variable of the specified base type. If the variable
is declared as CONSTANT, the value cannot be changed. If NOT NULL
is specified, an assignment of a NULL value results in a runtime
error. Since the default value of all variables is the
SQL NULL value, all variables declared as NOT NULL
must also have a default value specified.
The default value is evaluated every time the block is entered. So
assigning 'now' to a variable of type
timestamp causes the variable to have the
time of the actual function call, not when the function was
precompiled into its bytecode.
name table%ROWTYPE;
Declares a row with the structure of the given table. table must be
an existing table or view name of the database. The fields of the row
are accessed in the dot notation. Parameters to a function can
be composite types (complete table rows). In that case, the
corresponding identifier $n will be a rowtype, but it
must be aliased using the ALIAS command described below. Only the user
attributes of a table row are accessible in the row, no Oid or other
system attributes (because the row could be from a view and view rows
don't have useful system attributes).
The fields of the rowtype inherit the table's field sizes
or precision for char() etc. data types.
name RECORD;
Records are similar to rowtypes, but they have no predefined structure.
They are used in selections and FOR loops to hold one actual
database row from a SELECT operation. One and the same record can be
used in different selections. Accessing a record or an attempt to assign
a value to a record field when there is no actual row in it results
in a runtime error.
The NEW and OLD rows in a trigger are given to the procedure as
records. This is necessary because in Postgres
one and the same trigger procedure can handle trigger events for
different tables.
name ALIAS FOR $n;
For better readability of the code it is possible to define an alias
for a positional parameter to a function.
This aliasing is required for composite types given as arguments to
a function. The dot notation $1.salary as in SQL functions is not
allowed in PL/pgSQL.
RENAME oldname TO newname;
Change the name of a variable, record or row. This is useful
if NEW or OLD should be referenced by another name inside a
trigger procedure.
Data Types
The type of a variable can be any of the existing base types of
the database. type in the declarations
section above is defined as:
Postgres-basetype
variable%TYPE
table.field%TYPE
variable is the name of a variable,
previously declared in the
same function, that is visible at this point.
table is the name of an existing table
or view where field is the name of
an attribute.
Using the table.field%TYPE
causes PL/pgSQL to look up the attributes definitions at the
first call to the function during the lifetime of a backend.
Suppose we have a table with a char(20) attribute and some PL/pgSQL functions
that deal with its content in local variables. Now someone
decides that char(20) is not enough, dumps the table, drops it,
recreates it now with the attribute in question defined as
char(40) and restores the data. Hah - he forgot about the
functions. The computations inside them will truncate the values
to 20 characters. But if they are defined using the
table.field%TYPE
declarations, they will automagically handle the size change or
if the new table schema defines the attribute as text type.
Expressions
All expressions used in PL/pgSQL statements are processed using
the backend's executor. Expressions that appear to contain
constants may in fact require runtime evaluation (e.g., 'now' for the
timestamp type) so
it is impossible for the PL/pgSQL parser
to identify real constant values other than the NULL keyword. All
expressions are evaluated internally by executing a query
SELECT expression
using the SPI manager. In the expression, occurrences of variable
identifiers are substituted by parameters and the actual values from
the variables are passed to the executor in the parameter array. All
expressions used in a PL/pgSQL function are only prepared and
saved once. The only exception to this rule is an EXECUTE statement
if parsing of a query is needed each time it is encountered.
The type checking done by the Postgres
main parser has some side
effects to the interpretation of constant values. In detail there
is a difference between what the two functions
CREATE FUNCTION logfunc1 (text) RETURNS timestamp AS '
DECLARE
logtxt ALIAS FOR $1;
BEGIN
INSERT INTO logtable VALUES (logtxt, ''now'');
RETURN ''now'';
END;
' LANGUAGE 'plpgsql';
and
CREATE FUNCTION logfunc2 (text) RETURNS timestamp AS '
DECLARE
logtxt ALIAS FOR $1;
curtime timestamp;
BEGIN
curtime := ''now'';
INSERT INTO logtable VALUES (logtxt, curtime);
RETURN curtime;
END;
' LANGUAGE 'plpgsql';
do. In the case of logfunc1(), the
Postgres main parser knows when
preparing the plan for the INSERT, that the string
'now' should be interpreted as
timestamp because the target field of logtable is of
that type. Thus, it will make a constant from it at this time and
this constant value is then used in all invocations of
logfunc1() during the lifetime of the
backend. Needless to say that this isn't what the programmer
wanted.
In the case of logfunc2(), the Postgres
main parser does not know
what type 'now' should become and therefore it returns a data type of
text containing the string 'now'. During the assignment
to the local variable curtime, the PL/pgSQL interpreter casts this
string to the timestamp type by calling the text_out() and timestamp_in()
functions for the conversion.
This type checking done by the Postgres main
parser got implemented after PL/pgSQL was nearly done.
It is a difference between 6.3 and 6.4 and affects all functions
using the prepared plan feature of the SPI manager.
Using a local
variable in the above manner is currently the only way in PL/pgSQL to get
those values interpreted correctly.
If record fields are used in expressions or statements, the data types of
fields should not change between calls of one and the same expression.
Keep this in mind when writing trigger procedures that handle events
for more than one table.
Statements
Anything not understood by the PL/pgSQL parser as specified below
will be put into a query and sent down to the database engine
to execute. The resulting query should not return any data.
Assignment
An assignment of a value to a variable or row/record field is
written as
identifier := expression;
If the expressions result data type doesn't match the variables
data type, or the variable has a size/precision that is known
(as for char(20)), the result value will be implicitly cast by
the PL/pgSQL bytecode interpreter using the result types output- and
the variables type input-functions. Note that this could potentially
result in runtime errors generated by the types input functions.
An assignment of a complete selection into a record or row can
be done by
SELECT INTO target expressions FROM ...;
target can be a record, a row variable or a
comma separated list of variables and record-/row-fields. Note that
this is quite different from Postgres' normal interpretation of
SELECT INTO, which is that the INTO target is a newly created table.
(If you want to create a table from a SELECT result inside a PL/pgSQL
function, use the equivalent syntax CREATE TABLE AS SELECT.)
if a row or a variable list is used as target, the selected values
must exactly match the structure of the target(s) or a runtime error
occurs. The FROM keyword can be followed by any valid qualification,
grouping, sorting etc. that can be given for a SELECT statement.
There is a special variable named FOUND of type boolean that can be used
immediately after a SELECT INTO to check if an assignment had success.
SELECT INTO myrec * FROM EMP WHERE empname = myname;
IF NOT FOUND THEN
RAISE EXCEPTION ''employee % not found'', myname;
END IF;
If the selection returns multiple rows, only the first is moved
into the target fields. All others are silently discarded.
Calling another function
All functions defined in a Postgres
database return a value. Thus, the normal way to call a function
is to execute a SELECT query or doing an assignment (resulting
in a PL/pgSQL internal SELECT). But there are cases where someone
is not interested in the function's result.
PERFORM query
executes a SELECT query over the
SPI manager and discards the result. Identifiers like local
variables are still substituted into parameters.
Executing dynamic queries
EXECUTE query-string
where query-string is a string of
type text containing the query to be
executed.
Unlike all other queries in PL/pgSQL, a
query run by an EXECUTE statement
is not prepared and saved just once during the life of the
server. Instead, the query is
prepared each time the statement is run. The
query-string can be dynamically created
within the procedure to perform actions on variable tables and
fields.
The results from SELECT queries are discarded by EXECUTE, and
SELECT INTO is not currently supported within EXECUTE. So, the
only way to extract a result from a dynamically-created SELECT
is to use the FOR ... EXECUTE form described later.
An example:
EXECUTE ''UPDATE tbl SET ''
|| quote_ident(fieldname)
|| '' = ''
|| quote_literal(newvalue)
|| '' WHERE ...'';
This example shows use of the functions
quote_ident(TEXT) and
quote_literal(TEXT).
Variables containing field and table identifiers should be
passed to function quote_ident().
Variables containing literal elements of the dynamic query
string should be passed to
quote_literal(). Both take the
appropriate steps to return the input text enclosed in single
or double quotes and with any embedded special characters
properly escaped.
Obtaining other results status
GET DIAGNOSTICS variable = item , ...
This command allows retrieval of system status indicators. Each
item is a keyword identifying a state
value to be assigned to the specified variable (which should be of
the right datatype to receive it). The currently available status
items are ROW_COUNT>, the number of rows processed by
the last SQL query sent down to the SQL engine; and
RESULT_OID>, the Oid of the last row inserted by the
most recent SQL query. Note that RESULT_OID> is only
useful after an INSERT query.
Returning from the function
RETURN expression
The function terminates and the value of expression
will be returned to the upper executor. The return value of a function
cannot be undefined. If control reaches the end of the top-level block
of the function without hitting a RETURN statement, a runtime error
will occur.
The expressions result will be automatically casted into the
function's return type as described for assignments.
Aborting and messages
As indicated in the above examples there is a RAISE statement that
can throw messages into the Postgres
elog mechanism.
RAISE level 'format' , identifier ...;
Inside the format, % is used as a placeholder for the
subsequent comma-separated identifiers. Possible levels are
DEBUG (silently suppressed in production running databases), NOTICE
(written into the database log and forwarded to the client application)
and EXCEPTION (written into the database log and aborting the transaction).
Conditionals
IF expression THEN
statements
ELSE
statements
END IF;
The expression must return a value that
is of type boolean or can be casted to a boolean.
Loops
There are multiple types of loops.
<<label>>
LOOP
statements
END LOOP;
An unconditional loop that must be terminated explicitly
by an EXIT statement. The optional label can be used by
EXIT statements of nested loops to specify which level of
nesting should be terminated.
<<label>>
WHILE expression LOOP
statements
END LOOP;
A conditional loop that is executed as long as the evaluation
of expression is true.
<<label>>
FOR name IN REVERSE expression .. expression LOOP
statements
END LOOP;
A loop that iterates over a range of integer values. The variable
name is automatically created as type
integer and exists only inside the loop. The two expressions giving
the lower and upper bound of the range are evaluated only when entering
the loop. The iteration step is always 1.
<<label>>
FOR record | row IN select_clause LOOP
statements
END LOOP;
The record or row is assigned all the rows resulting from the select
clause and the loop body is executed for each row. If the loop is
terminated with an EXIT statement, the last assigned row is still
accessible after the loop.
<<label>>
FOR record | row IN EXECUTE text_expression LOOP
statements
END LOOP;
This is like the previous form, except that the source SELECT
statement is specified as a string expression, which is evaluated
and re-planned on each entry to the FOR loop. This allows the
programmer to choose the speed of a pre-planned query or the
flexibility of a dynamic query, just as with a plain EXECUTE
statement.
EXIT label WHEN expression ;
If no label given,
the innermost loop is terminated and the
statement following END LOOP is executed next.
If label is given, it
must be the label of the current or an upper level of nested loop
blocks. Then the named loop or block is terminated and control
continues with the statement after the loops/blocks corresponding
END.
Trigger Procedures
PL/pgSQL can be used to define trigger procedures. They are created
with the usual CREATE FUNCTION command as a function with no
arguments and a return type of OPAQUE.
There are some Postgres specific details
in functions used as trigger procedures.
First they have some special variables created automatically in the
top-level blocks declaration section. They are
NEW
Data type RECORD; variable holding the new database row on INSERT/UPDATE
operations on ROW level triggers.
OLD
Data type RECORD; variable holding the old database row on UPDATE/DELETE
operations on ROW level triggers.
TG_NAME
Data type name; variable that contains the name of the trigger actually
fired.
TG_WHEN
Data type text; a string of either 'BEFORE' or 'AFTER' depending on the
triggers definition.
TG_LEVEL
Data type text; a string of either 'ROW' or 'STATEMENT' depending on the
triggers definition.
TG_OP
Data type text; a string of 'INSERT', 'UPDATE', or 'DELETE' telling
for which operation the trigger is actually fired.
TG_RELID
Data type oid; the object ID of the table that caused the
trigger invocation.
TG_RELNAME
Data type name; the name of the table that caused the trigger
invocation.
TG_NARGS
Data type integer; the number of arguments given to the trigger
procedure in the CREATE TRIGGER statement.
TG_ARGV[]
Data type array of text; the arguments from the CREATE TRIGGER statement.
The index counts from 0 and can be given as an expression. Invalid
indices (< 0 or >= tg_nargs) result in a NULL value.
Second they must return either NULL or a record/row containing
exactly the structure of the table the trigger was fired for.
Triggers fired AFTER might always return a NULL value with no
effect. Triggers fired BEFORE signal the trigger manager
to skip the operation for this actual row when returning NULL.
Otherwise, the returned record/row replaces the inserted/updated
row in the operation. It is possible to replace single values directly
in NEW and return that or to build a complete new record/row to
return.
Exceptions
Postgres does not have a very smart
exception handling model. Whenever the parser, planner/optimizer
or executor decide that a statement cannot be processed any longer,
the whole transaction gets aborted and the system jumps back
into the main loop to get the next query from the client application.
It is possible to hook into the error mechanism to notice that this
happens. But currently it is impossible to tell what really
caused the abort (input/output conversion error, floating point
error, parse error). And it is possible that the database backend
is in an inconsistent state at this point so returning to the upper
executor or issuing more commands might corrupt the whole database.
And even if, at this point the information, that the transaction
is aborted, is already sent to the client application, so resuming
operation does not make any sense.
Thus, the only thing PL/pgSQL currently does when it encounters
an abort during execution of a function or trigger
procedure is to write some additional DEBUG level log messages
telling in which function and where (line number and type of
statement) this happened.
Examples
Here are only a few functions to demonstrate how easy it is to write PL/pgSQL
functions. For more complex examples the programmer
might look at the regression test for PL/pgSQL.
One painful detail in writing functions in PL/pgSQL is the handling
of single quotes. The function's source text in the CREATE FUNCTION command must
be a literal string. Single quotes inside of literal strings must be
either doubled or quoted with a backslash. We are still looking for
an elegant alternative. In the meantime, doubling the single quotes
as in the examples below should be used. Any solution for this
in future versions of Postgres will be
forward compatible.
A Simple PL/pgSQL Functions
The following two PL/pgSQL functions are identical to their
counterparts from the C language function discussion.
CREATE FUNCTION add_one (integer) RETURNS integer AS '
BEGIN
RETURN $1 + 1;
END;
' LANGUAGE 'plpgsql';
CREATE FUNCTION concat_text (text, text) RETURNS text AS '
BEGIN
RETURN $1 || $2;
END;
' LANGUAGE 'plpgsql';
A PL/pgSQL Function on a Composite Type
Again, this is the PL/pgSQL equivalent to the example from
the C functions.
CREATE FUNCTION c_overpaid (EMP, integer) RETURNS boolean AS '
DECLARE
emprec ALIAS FOR $1;
sallim ALIAS FOR $2;
BEGIN
IF emprec.salary ISNULL THEN
RETURN ''f'';
END IF;
RETURN emprec.salary > sallim;
END;
' LANGUAGE 'plpgsql';
A PL/pgSQL Trigger Procedure
This trigger ensures that any time a row is inserted or updated
in the table, the current user name and time are stamped into the
row. And it ensures that an employees name is given and that the
salary is a positive value.
CREATE TABLE emp (
empname text,
salary integer,
last_date timestamp,
last_user text);
CREATE FUNCTION emp_stamp () RETURNS OPAQUE AS '
BEGIN
-- Check that empname and salary are given
IF NEW.empname ISNULL THEN
RAISE EXCEPTION ''empname cannot be NULL value'';
END IF;
IF NEW.salary ISNULL THEN
RAISE EXCEPTION ''% cannot have NULL salary'', NEW.empname;
END IF;
-- Who works for us when she must pay for?
IF NEW.salary < 0 THEN
RAISE EXCEPTION ''% cannot have a negative salary'', NEW.empname;
END IF;
-- Remember who changed the payroll when
NEW.last_date := ''now'';
NEW.last_user := current_user;
RETURN NEW;
END;
' LANGUAGE 'plpgsql';
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
FOR EACH ROW EXECUTE PROCEDURE emp_stamp();