EXASOL User Manual 6.0.9 en
Version 6.0.9
Empowering
analytics.
Experience the world's fastest,
most intelligent, in-memory analytics
database.
Copyright © 2018 Exasol AG. All rights reserved.
The information in this publication is subject to change without notice. EXASOL SHALL NOT BE HELD LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN NOR FOR ACCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. No part of this publication may be photocopied or reproduced in any form without prior written consent from Exasol. All named trademarks and registered trademarks are the property of their respective owners.
Exasol User Manual
Table of Contents
Foreword ..................................................................................................................................... ix
Conventions ................................................................................................................................. xi
Changes in Version 6.0 ................................................................................................................. xiii
1. What is Exasol? .......................................................................................................................... 1
2. SQL reference ............................................................................................................................ 5
2.1. Basic language elements .................................................................................................... 5
2.1.1. Comments in SQL ................................................................................................. 5
2.1.2. SQL identifier ....................................................................................................... 5
2.1.3. Regular expressions ............................................................................................... 7
2.2. SQL statements .............................................................................................................. 11
2.2.1. Definition of the database (DDL) ............................................................................ 12
2.2.2. Manipulation of the database (DML) ....................................................................... 37
2.2.3. Access control using SQL (DCL) ............................................................................ 60
2.2.4. Query language (DQL) ......................................................................................... 72
2.2.5. Verification of the data quality ................................................................................ 82
2.2.6. Other statements .................................................................................................. 87
2.3. Data types ................................................................................................................... 103
2.3.1. Overview of Exasol data types .............................................................................. 104
2.3.2. Data type details ................................................................................................. 104
2.3.3. Data type aliases ................................................................................................ 107
2.3.4. Type conversion rules .......................................................................................... 108
2.3.5. Default values .................................................................................................... 110
2.3.6. Identity columns ................................................................................................ 112
2.4. Geospatial data ............................................................................................................. 113
2.4.1. Geospatial objects .............................................................................................. 114
2.4.2. Geospatial functions ........................................................................................... 115
2.5. Literals ....................................................................................................................... 117
2.5.1. Numeric literals ................................................................................................. 118
2.5.2. Boolean literals .................................................................................................. 119
2.5.3. Date/Time literals ............................................................................................... 119
2.5.4. Interval literals ................................................................................................... 119
2.5.5. String literals ..................................................................................................... 121
2.5.6. NULL literal ..................................................................................................... 121
2.6. Format models ............................................................................................................. 121
2.6.1. Date/Time format models ..................................................................................... 122
2.6.2. Numeric format models ....................................................................................... 124
2.7. Operators .................................................................................................................... 126
2.7.1. Arithmetic Operators ........................................................................................... 127
2.7.2. Concatenation operator || ...................................................................................... 128
2.7.3. CONNECT BY Operators .................................................................................... 129
2.8. Predicates .................................................................................................................... 129
2.8.1. Introduction ...................................................................................................... 130
2.8.2. List of predicates ................................................................................................ 130
2.9. Built-in functions .......................................................................................................... 135
2.9.1. Scalar functions ................................................................................................. 136
2.9.2. Aggregate functions ............................................................................................ 140
2.9.3. Analytical functions ............................................................................................ 140
2.9.4. Alphabetical list of all functions ............................................................................ 143
3. Concepts ................................................................................................................................ 257
3.1. Transaction management ............................................................................................... 257
3.1.1. Basic concept .................................................................................................... 257
3.1.2. Differences to other systems ................................................................................. 258
3.1.3. Recommendations for the user .............................................................................. 258
3.2. Rights management ....................................................................................................... 258
3.2.1. User ................................................................................................................. 259
3.2.2. Roles ............................................................................................................... 259
3.2.3. Privileges .......................................................................................................... 260
3.2.4. Access control with SQL statements ....................................................................... 260
3.2.5. Meta information on rights management ................................................................. 261
3.2.6. Rights management and transactions ...................................................................... 261
3.2.7. Example of rights management ............................................................................. 261
3.3. Priorities ..................................................................................................................... 262
3.3.1. Introduction ...................................................................................................... 263
3.3.2. Priorities in Exasol ............................................................................................. 263
3.3.3. Example ........................................................................................................... 264
3.4. ETL Processes .............................................................................................................. 264
3.4.1. Introduction ...................................................................................................... 265
3.4.2. SQL commands IMPORT and EXPORT ................................................................. 265
3.4.3. Scripting complex ETL jobs ................................................................................. 266
3.4.4. User-defined IMPORT using UDFs ........................................................................ 266
3.4.5. User-defined EXPORT using UDFs ....................................................................... 268
3.4.6. Hadoop and other systems .................................................................................... 268
3.4.7. Using virtual schemas for ETL .............................................................................. 269
3.4.8. Definition of file formats (CSV/FBV) .................................................................... 269
3.5. Scripting ..................................................................................................................... 271
3.5.1. Introduction ...................................................................................................... 272
3.5.2. General script language ....................................................................................... 273
3.5.3. Database interaction ............................................................................................ 280
3.5.4. Libraries ........................................................................................................... 287
3.5.5. System tables .................................................................................................... 294
3.6. UDF scripts ................................................................................................................ 294
3.6.1. What are UDF scripts? ........................................................................................ 295
3.6.2. Introducing examples .......................................................................................... 296
3.6.3. Details for different languages .............................................................................. 301
3.6.4. The synchronous cluster file system BucketFS ......................................................... 323
3.6.5. Expanding script languages using BucketFS ............................................................ 325
3.7. Virtual schemas ............................................................................................................ 329
3.7.1. Virtual schemas and tables ................................................................................... 330
3.7.2. Adapters and properties ....................................................................................... 331
3.7.3. Grant access on virtual tables ................................................................................ 332
3.7.4. Privileges for administration ................................................................................. 332
3.7.5. Metadata ........................................................................................................... 333
3.7.6. Details for experts .............................................................................................. 334
3.8. SQL Preprocessor ........................................................................................................ 335
3.8.1. How does the SQL Preprocessor work? .................................................................. 336
3.8.2. Library sqlparsing .............................................................................................. 336
3.8.3. Best Practice ..................................................................................................... 339
3.8.4. Examples .......................................................................................................... 339
3.9. Profiling ...................................................................................................................... 345
3.9.1. What is Profiling? ............................................................................................... 346
3.9.2. Activation and Analyzing ..................................................................................... 346
3.9.3. Example ........................................................................................................... 347
3.10. Skyline ...................................................................................................................... 348
3.10.1. Motivation ....................................................................................................... 349
3.10.2. How Skyline works ........................................................................................... 349
3.10.3. Example ......................................................................................................... 350
3.10.4. Syntax elements ............................................................................................... 350
4. Clients and interfaces ............................................................................................................... 353
4.1. EXAplus ..................................................................................................................... 353
4.1.1. Installation ........................................................................................................ 353
4.1.2. The graphical user interface .................................................................................. 354
4.1.3. The Console mode .............................................................................................. 358
4.1.4. EXAplus-specific commands ................................................................................ 361
List of Tables
2.1. Pattern elements in regular expressions ......................................................................................... 9
2.2. Overview of Exasol data types ................................................................................................. 104
2.3. Summary of Exasol aliases ..................................................................................................... 108
2.4. Possible implicit conversions ................................................................................................... 109
2.5. Elements of Date/Time format models ...................................................................................... 123
2.6. Elements of numeric format models ......................................................................................... 125
2.7. Precedence of predicates ........................................................................................................ 130
3.1. Additional basics for UDF scripts ............................................................................................. 296
4.1. Information about the work with EXAplus ................................................................................. 356
4.2. EXAplus command line parameters (Console mode only) ............................................................. 359
4.3. Known problems associated with using the ODBC driver on Windows ............................................ 381
4.4. Known problems when using the ODBC driver for Linux/Unix ...................................................... 382
4.5. Supported DriverProperties of the JDBC driver ........................................................................... 391
4.6. Keywords in the ADO.NET Data Provider connection string ......................................................... 394
B.1. System privileges in Exasol .................................................................................................... 460
B.2. Object privileges in Exasol ..................................................................................................... 461
B.3. Required privileges for running SQL statements ......................................................................... 462
C.1. SQL 2008 Mandatory Features ............................................................................................... 468
C.2. SQL 2008 Optional Features supported by Exasol ...................................................................... 472
Foreword
This User Manual provides an overview of Exasol and documents the user interfaces and the extent to which the
SQL language is supported. Further technical information about Exasol can also be found in our Online Solution
Center [https://www.exasol.com/portal/display/SOL].
We always strive to ensure the highest possible quality standards. With this in mind, Exasol warmly invites you
to participate in improving the quality of the documentation.
Send us your suggestions and comments to the address shown below or add them directly online in our IDEA
project [https://www.exasol.com/support/projects/IDEA/issues/]. We thank you sincerely and will endeavor to
implement your suggestions in the next version.
Further details about our support can be found in our Support Dashboard
[https://www.exasol.com/portal/display/EXA/Support+Dashboard]
Conventions
Symbols
In this User Manual, the following symbols are used:
Note: e.g. "Please consider that inside the database, empty strings are interpreted as NULL
values."
Tip: e.g. "We recommend using local variables to make variable declarations explicit."
Caution: e.g. "Executing this command can take some time."
Changes in Version 6.0
You can now enjoy the following features at no extra cost in the Standard Edition: Query Cache, LDAP
authentication, and the IMPORT/EXPORT interfaces JDBC (JDBC data sources) and ORA (native Oracle interface).
If you have questions regarding these editions, please contact your EXASOL account manager or ask our
support team.
• The new virtual schemas concept provides a powerful abstraction layer to conveniently access arbitrary
data sources. Virtual schemas are a read-only link to an external source and contain virtual tables which
look like regular tables except that the data is not stored locally. Details can be found in Section 3.7, “Virtual schemas”.
• Using the new synchronous file system BucketFS, you can store files on the cluster and provide local access
for UDF scripts. Details can be found in Section 3.6.4, “The synchronous cluster file system BucketFS”.
• EXASOL's script framework was enhanced so that you can now install new script languages on the EXASOL
cluster. You can either use several versions of a language (e.g. Python 2 and Python 3), add additional
libraries (even for R), or integrate completely new languages (Julia, C++, ...). Further details can be found
in Section 3.6.5, “Expanding script languages using BucketFS”.
• The option IF NOT EXISTS was introduced in statements CREATE SCHEMA, CREATE TABLE
and ALTER TABLE (column). You can execute DDL scripts automatically without receiving error messages
when certain objects already exist.
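For example, a deployment script can now be re-run safely. The schema, table and column names below are purely illustrative; the exact placement of IF NOT EXISTS is given in the respective statement references:

```sql
CREATE SCHEMA IF NOT EXISTS staging;

CREATE TABLE IF NOT EXISTS staging.orders (
    id     DECIMAL(18,0),
    amount DOUBLE
);

ALTER TABLE staging.orders ADD COLUMN IF NOT EXISTS remark VARCHAR(200);
```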
• PRELOAD loads certain tables or columns and the corresponding internal indices from disk in case they
are not yet in the database cache. Due to EXASOL's smart cache management, we highly recommend using
this command only in exceptional cases. Otherwise you risk degrading overall system performance.
• Similar to the structures in the UDF scripts, we added the metadata information to the scripting language.
Additionally, this metadata was extended by the script schema and the current user. Please refer to the
corresponding sections of Section 3.5, “Scripting” and Section 3.6, “UDF scripts ” for further details.
• UDF scripts can handle dynamic input and output parameters to become more flexible. For details see
section Dynamic input and output parameters in Section 3.6, “UDF scripts ”.
• In SCALAR UDF scripts, you can now combine an EMITS output in expressions within the select list
without the need to nest it in a separate subselect.
• Function APPROXIMATE_COUNT_DISTINCT calculates the approximate number of distinct elements,
at a much faster rate than the COUNT function.
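As a sketch (the table and column names are illustrative), both of the following queries determine the number of distinct customers, but the approximate variant trades a small error for speed:

```sql
-- Exact, but potentially expensive on large tables
SELECT COUNT(DISTINCT customer_id) FROM sales;

-- Approximate, typically much faster
SELECT APPROXIMATE_COUNT_DISTINCT(customer_id) FROM sales;
```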
• By using the new parameter NLS_DATE_LANGUAGE in function TO_CHAR (datetime) you can overwrite
the session settings.
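As a sketch (the date literal and the language code 'ENG' are illustrative assumptions; see the TO_CHAR (datetime) reference for the accepted values):

```sql
-- Print the day name in English regardless of the session language
SELECT TO_CHAR(DATE '2018-05-01', 'DAY', 'NLS_DATE_LANGUAGE=ENG');
```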
• Function NVL2 was added which is similar to NVL, but has an additional parameter for replacing the
values which are not NULL.
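A minimal sketch (table and column names are illustrative):

```sql
-- Returns 'has phone' for non-NULL values, 'no phone' for NULL values
SELECT NVL2(phone, 'has phone', 'no phone') FROM customers;
```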
• The bitwise functions BIT_LSHIFT, BIT_RSHIFT, BIT_LROTATE and BIT_RROTATE were added.
Additionally, the bit range was extended from 59 to 64 for the other bitwise functions.
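The rotate semantics on the 64-bit range can be modeled in pure Python as follows. This is an illustrative model of the behavior described above, not the database implementation; the function names merely mirror BIT_LROTATE/BIT_RROTATE:

```python
MASK64 = (1 << 64) - 1  # values are treated as 64-bit unsigned integers

def bit_lrotate(value, n):
    """Rotate the 64-bit pattern of value left by n positions."""
    n %= 64
    return ((value << n) | (value >> ((64 - n) % 64))) & MASK64

def bit_rrotate(value, n):
    """Rotate the 64-bit pattern of value right by n positions."""
    return bit_lrotate(value, (64 - n) % 64)

print(bit_lrotate(1, 1))              # 2
print(bit_lrotate(1 << 63, 1))        # 1, the top bit wraps around
print(bit_rrotate(1, 1) == 1 << 63)   # True
```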
• The new session/system parameter TIMESTAMP_ARITHMETIC_BEHAVIOR defines the behavior for
+/- operators:
• Adding small amounts of data through INSERT (e.g. only a single row) is significantly faster due to an
automatic, hybrid storage mechanism. The "latest" data is stored row-wise and automatically merged into
the column-wise storage in the background. The user does not notice anything except a higher insert rate,
because fewer data blocks have to be committed to disk.
• In scenarios with many parallel write transactions, the overall system throughput was improved.
• The optimization of SELECT statements which access the same view (or WITH clause) several times was
improved. As a result, the decision whether such a view (or WITH clause) should be materialized upfront
is now made correctly in more cases than before. Overall, the performance should improve, although
the optimizer does not always make the right decision due to the complexity of the subject. Customers who
have explicitly deactivated that view optimization via the extra database parameter -disableviewoptimization
might want to check whether this parameter is still necessary.
• By optimizing the internal data transfer for distributed table rows, several operations will gain performance
(such as MERGE, IMPORT and cluster enlargements).
• The MERGE statement is faster in certain situations, e.g. if the target table is large or if source and
target tables are not distributed by the ON condition columns (DISTRIBUTE BY).
• When deleting data, the affected rows are internally just marked as deleted, but not physically dropped yet.
After reaching a certain threshold, this data is deleted and the table reorganized. This reorganization is
faster due to advanced internal parallelization.
• Due to an intelligent replication mechanism small tables are no longer persistently replicated, but are kept
incrementally in the cache. Consequently, INSERT operations on small tables are significantly faster.
• GROUP BY calculations with a very small number of groups can be faster, because the method by which
the group aggregations are distributed across the cluster nodes has been improved.
• The execution of BETWEEN filters on DATE columns already takes advantage of existing internal indices
to accelerate the computation. We extended this capability by supporting the data types TIMESTAMP,
DECIMAL, DOUBLE and INTERVAL.
• A smart recompress logic has been implemented so that tables and columns are recompressed only if a
substantial improvement can be achieved. This is evaluated automatically on the fly, and will accelerate
the DELETE statement, since DELETE internally invokes a recompression after a certain threshold of deleted
rows. You can still enforce a full compression of a table using the new ENFORCE option of the
RECOMPRESS command. Further, you can now specify certain columns in the RECOMPRESS statement to
selectively recompress only these columns.
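A sketch of the syntax described above; the table and column names are illustrative, and the exact grammar is given in the RECOMPRESS reference:

```sql
-- Force a full recompression even if the estimated gain is small
RECOMPRESS TABLE sales ENFORCE;

-- Restrict the recompression to selected columns
RECOMPRESS TABLE sales (amount, order_date);
```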
• For the statement ALTER TABLE ADD COLUMN, only the constraints for the new columns are checked
instead of all constraints of the table.
• For newly (!) created EXAStorage volumes, the read performance has been improved based on an optimized
physical storage layout. Read operations on big and heavily fragmented data volumes are faster. This results
in better overall read performance in situations where the main memory is not sufficient and data is
continuously read from many different tables.
• Local backup and restore operations have been significantly accelerated. In internal tests we achieved a
speedup factor of 4 to 5 for backup and 2 to 3 for restore operations. Please note that the actual speedup
on your system depends on the specific hardware configuration.
• Smart recovery mechanisms reduce the amount of data which has to be restored by EXAStorage in case
of node failures.
• The startup and restart times for databases have been reduced, especially for larger clusters.
• System info:
• You'll find lots of information about the usage of the various storage volumes in the system table
EXA_VOLUME_USAGE.
• The DURATION column has been added in the EXA_DBA_AUDIT_SQL and EXA_SQL_LAST_DAY
system tables. It stores the duration of the statement in seconds.
• The number of rows of a table can be found in column TABLE_ROW_COUNT of system tables
EXA_ALL_TABLES, EXA_DBA_TABLES and EXA_USER_TABLES.
• Interfaces:
• Please note that with version 6.0 we dropped the support for AIX, HP-UX and Solaris. More details about
our life cycle policies can be found here: https://www.exasol.com/portal/display/DOWNLOAD/EXASolution+Life+Cycle
• A connection-oriented JSON over WebSockets API has been introduced. It is now possible to integrate
EXASOL with nearly any programming language of your choice. Another advantage is that the protocol supports
client/server compression. Initially, we have published an open-source EXASOL Python driver that uses
this interface, but additional implementations for further languages will likely follow. Further details can
be found in Section 4.5, “WebSockets ”.
• In case of severe overload situations, the user SYS can now analyze the system and terminate problematic
processes via the new connection parameter SUPERCONNECTION (ODBC) or superconnection
(JDBC and ADO.NET).
• The parameter CONNECTTIMEOUT (ODBC) or connecttimeout (JDBC and ADO.NET) defines the
maximum time in milliseconds (default: 2000) the driver will wait to establish a TCP connection to a server.
This timeout is intended to limit the overall login time, especially in the case of a large cluster with several
reserve nodes.
• Encrypted client/server connections are now activated by default and based on the algorithm ChaCha20
(RFC 7539).
• The evaluation of data types of prepared parameters in the PREPARE step has been improved. Previously,
the generic type VARCHAR(2000000) was always used.
• ODBC driver
• If no schema is specified in the connection parameters, the driver will no longer try to open the schema
with the corresponding user name. This behavior had been implemented for historical reasons and has
been removed for consistency.
• JDBC
• The JDBC driver added support for the java.sql.Driver service file. This mechanism was added in the JDBC 4.0
standard and allows Java applications to use JDBC drivers without explicitly loading a driver via
Class.forName or DriverManager.registerDriver. It is now sufficient to add the driver to the classpath.
• The data types of query result sets are optimized. In detail, the function getColumnType() of class
ResultSetMetaData determines the minimal matching integer type (smallint, int or bigint)
for decimals without scale.
• The JDBC driver supports parallel read and insert from the cluster nodes via so-called sub-connections.
Details and examples can be found in our Solution Center: https://www.exasol.com/support/browse/SOL-546
• The class EXAPreparedStatement has the new method setFixedChar() which sets the
parameter type to CHAR. Since the standard method setString() uses VARCHAR, this new
function can be useful for comparisons with CHAR values.
• ADO.NET driver
• You can use the parameter querytimeout to define how many seconds a statement may run before
it is automatically aborted.
• The DDEX provider and pluggable SQL cartridge for SQL Server Analysis Services supports SQL
Server 2016 (v 13.0).
• EXAplus
• The new graphical system usage statistics can be displayed by clicking on the pie chart button in the
tool bar.
• The bar at the bottom-right corner shows the memory allocation pool (heap) usage of the current EXAplus
process. By double-clicking it, you can trigger the program to execute a garbage collection.
• You can specify additional JDBC parameters using the command-line parameter -jdbcparam or the
advanced settings in a connection profile.
• The console version has new parameters to handle user profiles (-lp for printing existing profiles,
-dp <profile> for deleting a profile, and -wp <profile> for writing certain connection options into a
profile).
• SDK
• Until now we shipped two versions of our SDK library: libexacli and libexacli_c. The first library
contained dependencies on C++, and due to a customer request we added the latter one without any
dependencies in version 5. We have now removed the C++ dependencies from libexacli and will ship only
this single version from now on.
• Operation:
• Details about how to upgrade to version 6 can be found in the following article in our Solution Center: https://www.exasol.com/support/browse/SOL-504
Chapter 1. What is Exasol?
Exasol is the technology of choice for all kinds of analytic challenges in today's business world. From business-
critical operational data applications and interactive advanced customer analytics systems to enterprise (logical)
data warehouses - Exasol helps you attain optimal insights from all sorts of big data using a completely new level
of analytical capabilities.
Exasol enables you to extend your historical data frame, increase the precision of your analysis, improve
the user experience through real-time responsiveness, and multiply the number of users.
The deep integration of parallel R, Python, Java and Lua (see Section 3.6, “UDF scripts ”) and the ability to combine
regular SQL queries with Map Reduce jobs directly within our in-memory database offers a very powerful and
flexible capability for optimal in-database analytics. Graph analytics capabilities (see CONNECT BY in SELECT),
our Skyline extension (see Section 3.10, “Skyline”) and the support of geospatial analysis (see Section 2.4, “Geospatial data”) complete the analytical spectrum.
Exasol combines in-memory, column-based and massively parallel technologies to provide unprecedented performance,
flexibility and scalability. It can be smoothly integrated into every IT infrastructure and heterogeneous analytical
platform.
Please be aware that Exasol is available in two different editions. Compared to the Standard Edition, the Advanced
Edition contains the following extra features:
Interfaces
Currently, there are three important standardized interfaces for which Exasol makes appropriate driver software
available: ODBC, JDBC and ADO.NET. In addition, the OLE DB Provider for ODBC Drivers™ from Microsoft
allows Exasol to be accessed via an OLE DB interface.
Our JSON over WebSockets API allows you to integrate Exasol into nearly any programming language of your
choice. Drivers built on top of that protocol, such as a native Python package, are provided as open source projects.
And in order to be able to connect Exasol to in-house C/C++ applications, Exasol additionally provides a client
SDK (Software Development Kit).
The driver connection makes it possible to send SQL queries and display the results in the connected application
as well as to load data into Exasol. However, for reasons of performance the IMPORT statement is recommended
for larger load operations.
EXAplus
EXAplus is a console for entering SQL statements. It simplifies the task of running SQL scripts or performing
administrative tasks. EXAplus is implemented in Java; hence, it is platform-independent and can be used as
either a graphical user interface or a text console.
Chapter 2. SQL reference
A detailed summary of the supported SQL standard features can be found in Appendix C, Compliance to the SQL
standard.
• Line comments begin with the character -- and indicate that the remaining part of the current line is a comment.
• Block comments are indicated by the characters /* and */ and can be spread across several lines. All of the
characters between the delimiters are ignored.
Examples
-- This is a comment
SELECT * FROM dual;
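A block comment can span several lines; as a further illustration:

```sql
/* This block comment spans
   several lines and is ignored
   by the parser */
SELECT * FROM dual;
```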
Regular identifiers
Regular identifiers are stated without quotation marks. As defined in the SQL standard, Exasol only allows a
subset of the Unicode characters, where the first character is more restricted than the following ones. The detailed
list of characters of the specified classes is defined in the Unicode standard.
2.1. Basic language elements
• First letter: all letters of unicode classes Lu (upper-case letters), Ll (lower-case letters), Lt (title-case letters),
Lm (modifier letters), Lo (other letters) and Nl (letter numbers).
• Following letters: the classes of the first letter (see above) and additionally all letters of unicode classes Mn
(nonspacing marks), Mc (spacing combining marks), Nd (decimal numbers), Pc (connector punctuations), Cf
(formatting codes) and the unicode character U+00B7 (middle dot).
Restricted to the simple ASCII character set, these rules mean that a regular identifier may start with letters of
the set {a-z,A-Z} and may further contain letters of the set {a-z,A-Z,0-9,_}.
A further restriction is that reserved words (see also below) cannot be used as a regular identifier.
If you want to use characters which are prohibited for regular identifiers, you can use
delimited identifiers (see next section). E.g. if you want to use the word table as iden-
tifier, you have to quote it ("table"), since it's a reserved word.
Regular identifiers are always stored in upper case and are not case sensitive. The two identifiers ABC and aBc
are therefore equal.
Examples
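As an illustration (table and column names invented), the following statements all address the same table, since regular identifiers are stored in upper case:

```sql
CREATE TABLE abc (my_col DECIMAL);
SELECT my_col FROM ABC;   -- same table as abc
SELECT MY_COL FROM aBc;   -- same table as abc
```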
Delimited identifiers
These identifiers are names enclosed in double quotation marks. Any character except the dot ('.') can be contained
within the quotation marks. Only in the case of users and roles is the dot allowed, so that e.g. email addresses can
be used as names. This is possible because users and roles do not conflict with schema-qualified objects (see next
section).
A peculiarity arises from the use of double quotation marks: if you want to use this character within a name, you
must write two double quotation marks next to one another (i.e. "ab""c" denotes the name ab"c).
Except for users and roles, identifiers in quotation marks are always stored case sensitive in the database.
Examples
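As an illustration (all names are invented):

```sql
CREATE TABLE "table" (my_col DECIMAL);    -- reserved word, hence quoted
CREATE TABLE "MyTable" (my_col DECIMAL);  -- stored case sensitive as MyTable
CREATE TABLE "ab""c" (my_col DECIMAL);    -- name containing a double quote: ab"c
```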
Schema-qualified names
If database objects are addressed via schema-qualified names, the names are separated by a dot. This enables access
to schema objects that are not located in the current schema.
Examples
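For instance (schema, table, column and function names invented):

```sql
SELECT my_column FROM my_schema.my_table;
SELECT my_schema.my_function(my_column) FROM my_schema.my_table;
```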
Reserved words
A series of reserved words exist which cannot be used as regular identifiers. For example, the keyword SELECT
is a reserved word. If a table needs to be created with the name SELECT, this is only possible if the name is
enclosed in double quotation marks. "SELECT" as a table name is therefore different from a table name such as
"Select" or "seLect".
The list of reserved words in Exasol can be found in the EXA_SQL_KEYWORDS system table.
This section describes the basic functionality of regular expressions in Exasol. Detailed information about the
PCRE dialect can be found at www.pcre.org [http://www.pcre.org].
• REGEXP_INSTR
• REGEXP_REPLACE
• REGEXP_SUBSTR
• [NOT] REGEXP_LIKE
Examples
In the description of the corresponding scalar functions and predicates you can find examples of regular expressions
within SQL statements. The following examples demonstrate the general possibilities of regular expressions:
Meaning                                                 Regular Expression
American Express credit card number (15 digits,         3[47][0-9]{13}
starting with 34 or 37)
IP addresses like 192.168.0.1                           \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
Floating point numbers like -1.23E-6 or 40              [+-]?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
Date values with format YYYY-MM-DD, e.g. 2010-12-01     (19|20)\d\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])
(restricted to 1900-2099)
Email addresses like hello.world@yahoo.com              (?i)[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}

Please note that the example expressions are simplified for reasons of clarity. Production expressions, e.g. for identifying valid
email addresses, are much more complex.
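Such an expression might be used with one of the functions or predicates listed above, e.g. (table and column names invented):

```sql
SELECT my_string FROM my_table
 WHERE my_string REGEXP_LIKE '(19|20)\d\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])';
```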
Pattern elements
The following elements can be used in regular expressions:
\1, \2, ...  Captures are defined by subpatterns which are delimited by brackets. They are numbered
             by the opening brackets, from left to right. If you want to use more than 9 captures, you
             have to name the capture (see below).
             Example: \1 = ab, \2 = b, \3 = 123
             Important: the name of a named capture must start with a non-digit. Named captures do
             not affect the ascending numeration (see example).
             Example: \1 = ab, \2 = 123
(?i:) To simplify the syntax, you can set/unset modifiers between ? and :
2.2. SQL statements
CREATE SCHEMA
Purpose
A new schema is created with this statement. This can either be a physical schema, where you can create standard
schema objects, or a virtual schema, which is a kind of virtual link to an external data source, mapping the remote
tables including their data and metadata into the virtual schema.
Prerequisite(s)
Syntax
create_schema::=

    CREATE SCHEMA [IF NOT EXISTS] schema

create_virtual_schema::=

    CREATE VIRTUAL SCHEMA [IF NOT EXISTS] schema
Note(s)
Please note that virtual schemas are part of the Advanced Edition of Exasol.
Example(s)
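A minimal sketch (the schema name is invented):

```sql
CREATE SCHEMA my_schema;
CREATE SCHEMA IF NOT EXISTS my_schema;
```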
DROP SCHEMA
Purpose
This statement can be used to completely delete a schema and all of the objects contained therein.
Prerequisite(s)
• Physical schema: System privilege DROP ANY SCHEMA, or the schema belongs to the current user or one
of that user's roles.
• Virtual schema: System privilege DROP ANY VIRTUAL SCHEMA, or the schema belongs to the current
user or one of that user's roles.
Syntax
drop_schema::=

    DROP [FORCE] [VIRTUAL] SCHEMA [IF EXISTS] schema [CASCADE | RESTRICT]
Note(s)
• Details about the concept of virtual schemas can be found in Section 3.7, “Virtual schemas”. If you specify
the FORCE option, the corresponding adapter script is not informed about the action. Otherwise, the adapter
script will be called and can react to the action depending on its implementation.
Please note that virtual schemas are part of the Advanced Edition of Exasol.
Example(s)
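A minimal sketch (the schema name is invented):

```sql
DROP SCHEMA my_schema;                    -- fails if the schema still contains objects
DROP SCHEMA IF EXISTS my_schema CASCADE;  -- drops the schema and all contained objects
```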
ALTER SCHEMA
Purpose
This statement alters a schema. With the CHANGE OWNER option you can assign a schema to another user or
role. The schema and all the schema objects contained therein then belong to the specified user or role. For
virtual schemas, you can additionally change their properties or refresh their metadata.
Prerequisite(s)
• Physical schemas: The system privilege ALTER ANY SCHEMA or object privilege ALTER on the schema.
Simply owning the schema is insufficient!
• Virtual schemas:
• CHANGE OWNER: The system privilege ALTER ANY VIRTUAL SCHEMA or object privilege ALTER
on the schema. Simply owning the schema is insufficient!
• SET: System privilege ALTER ANY VIRTUAL SCHEMA, object privilege ALTER on the schema, or the schema
is owned by the current user or one of its roles. Additionally, access rights are necessary on the corresponding
adapter script.
• REFRESH: System privilege ALTER ANY VIRTUAL SCHEMA or ALTER ANY VIRTUAL SCHEMA
REFRESH, object privilege ALTER or REFRESH on the schema, or the schema is owned by the current user
or one of its roles. Additionally, access rights are necessary on the corresponding adapter script.
Syntax
alter_schema::=

    ALTER SCHEMA schema CHANGE OWNER {user | role}

alter_virtual_schema::=

    ALTER VIRTUAL SCHEMA schema
        { REFRESH [TABLES table [, ...]]
        | CHANGE OWNER {user | role} }
Note(s)
Please note that virtual schemas are part of the Advanced Edition of Exasol.
Example(s)
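A minimal sketch (schema, user and role names are invented):

```sql
ALTER SCHEMA my_schema CHANGE OWNER my_user;
ALTER SCHEMA my_schema CHANGE OWNER my_role;
ALTER VIRTUAL SCHEMA my_virtual_schema REFRESH;
```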
CREATE TABLE
Purpose
This statement can be used to create a table, either by specifying the column types, by using another table as a
sample (LIKE clause), or by assigning a subquery.
Prerequisite(s)
• System privilege CREATE TABLE if the table shall be created in your own schema or in that of an assigned
role, or system privilege CREATE ANY TABLE otherwise. If the table is defined via a subquery, the user
additionally requires the appropriate SELECT privileges for all the objects referenced in the subquery.
• If the OR REPLACE option has been specified and the table already exists, the rights for DROP TABLE are
also needed.
Syntax
create_table::=

    CREATE [OR REPLACE] TABLE [IF NOT EXISTS] table
        ( { column_definition | out_of_line_constraint | like_clause } [, ...] )
        [distribute_by] [COMMENT IS string]
  | CREATE [OR REPLACE] TABLE [IF NOT EXISTS] table
        [like_clause] AS subquery [WITH [NO] DATA] [COMMENT IS string]

column_definition::=

    column datatype [DEFAULT expr | IDENTITY [int]] [inline_constraint]
        [COMMENT IS string]

inline_constraint::=

    [CONSTRAINT [name]] { NOT NULL | PRIMARY KEY | references_clause }
        [ENABLE | DISABLE]

references_clause::=

    REFERENCES table [( column [, ...] )]

out_of_line_constraint::=

    [CONSTRAINT [name]] PRIMARY KEY ( column [, ...] ) [ENABLE | DISABLE]

like_clause::=

    LIKE { table | view } [AS alias] [( column [, ...] )]
        [{INCLUDING | EXCLUDING} DEFAULTS]
        [{INCLUDING | EXCLUDING} IDENTITY]
        [{INCLUDING | EXCLUDING} COMMENTS]

distribute_by::=

    DISTRIBUTE BY column [, ...]
Note(s)
• The OR REPLACE option can be used to replace an already existing table without having to explicitly delete
it by means of DROP TABLE.
• If you have specified the option IF NOT EXISTS, no error message will be thrown if the table already exists.
• Detailed information on the usable data types can be found in Section 2.3, “Data types”.
• For details about constraints see also command ALTER TABLE (constraints).
• A specified default value must be convertible to the data type of its column. Detailed information on the per-
mitted expressions for default values can be found in Section 2.3.5, “Default values”.
• Details on identity columns can be found in Section 2.3.6, “Identity columns”.
• If you use the WITH NO DATA option within the CREATE TABLE AS statement, the query is not executed,
but an empty table with the same data types is defined.
• The following should be noted with regard to table definition with a subquery (CREATE TABLE AS): if
columns are not directly referenced in the subquery but are assembled expressions, aliases must be specified for
these columns.
• Alternatively to the CREATE TABLE AS statement you can also use the SELECT INTO TABLE syntax (see
SELECT INTO).
• Using the LIKE clause you can create an empty table with the same data types as the specified sample table or
view. If you specify the option INCLUDING DEFAULTS, INCLUDING IDENTITY or INCLUDING
COMMENTS, then the default values, the identity column and the column comments, respectively, are adopted.
A NOT NULL constraint is always adopted, in contrast to primary and foreign keys (never adopted).
• LIKE expressions and normal column definitions can be mixed, but then have to be enclosed in brackets.
• The distribution_clause defines the explicit distribution of the table across the cluster nodes. For details
see also the notes of the command ALTER TABLE (distribution).
• Comments on table and columns can also be set or changed afterwards via the command COMMENT.
Example(s)
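A minimal sketch (table and column names are invented):

```sql
CREATE TABLE t1 (id   DECIMAL(18,0) IDENTITY,
                 name VARCHAR(100) DEFAULT 'unknown' NOT NULL);
CREATE TABLE t2 AS SELECT id, name FROM t1 WITH NO DATA;
CREATE TABLE t3 LIKE t1 INCLUDING DEFAULTS INCLUDING IDENTITY;
```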
SELECT INTO
Purpose
Via this command you can create a table, similar to CREATE TABLE AS (see CREATE TABLE).
Prerequisite(s)
• System privilege CREATE TABLE if the table shall be created in your own schema or in that of an assigned
role, or system privilege CREATE ANY TABLE otherwise. Furthermore, the user requires the appropriate
SELECT privileges for all referenced objects.
Syntax
select_into::=
Note(s)
• For details about the syntax elements (like e.g. select_list, ...) see SELECT.
• If columns are not directly referenced in the select list but are compound expressions, aliases must be specified
for these columns.
Example(s)
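A minimal sketch (table and column names are invented):

```sql
SELECT id, UPPER(name) AS upper_name
INTO TABLE t2
FROM t1;
```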
DROP TABLE
Purpose
This statement deletes a table.
Prerequisite(s)
• System privilege DROP ANY TABLE or the table belongs to the current user or one of the roles of the current
user.
Syntax
drop_table::=

    DROP TABLE [IF EXISTS] table [CASCADE CONSTRAINTS]
Note(s)
• If a foreign key references the table which should be deleted, you have to specify the option CASCADE
CONSTRAINTS. In this case the foreign key is also deleted even though the referencing table doesn't belong
to the current user.
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the table does
not exist.
Example(s)
Purpose
Add, drop or rename a column, change a column's data type, or define default values and column identities.
Prerequisite(s)
• System privilege ALTER ANY TABLE, object privilege ALTER on the table or its schema, or the table belongs
to the current user or one of his roles.
Syntax
alter_column::=

    ALTER TABLE table
        { add_column | drop_column | modify_column | rename_column
        | alter_column_default | alter_column_identity }

add_column::=

    ADD [COLUMN] [IF NOT EXISTS] column datatype
        [DEFAULT expr | IDENTITY [int]] [inline_constraint]

drop_column::=

    DROP [COLUMN] [IF EXISTS] column [CASCADE CONSTRAINTS]

modify_column::=

    MODIFY [COLUMN] column datatype [DEFAULT expr | IDENTITY [int]]
        [inline_constraint]

inline_constraint::=

    [CONSTRAINT [name]] { NOT NULL | PRIMARY KEY | references_clause }
        [ENABLE | DISABLE]

references_clause::=

    REFERENCES table [( column [, ...] )]

rename_column::=

    RENAME COLUMN old_name TO new_name

alter_column_default::=

    ALTER [COLUMN] column { SET DEFAULT expr | DROP DEFAULT }

alter_column_identity::=

    ALTER [COLUMN] column { SET IDENTITY [int] | DROP IDENTITY }
Note(s)
ADD COLUMN • If the table already contains rows, the content of the inserted column will be
set to the default value if one has been specified, otherwise it will be
initialized with NULL.
• If the default value expr is not appropriate for the specified data type, an error
message is given and the statement is not executed. Details on the permissible
expressions for the default value expr can be found in Section 2.3.5, “Default
values”.
• For identity columns a monotonically increasing number is generated if the
table already possesses rows. Details on identity columns can be found in
Section 2.3.6, “Identity columns”.
• If you have specified the option IF NOT EXISTS, no error message will be
thrown if the column already exists.
• For details about constraints see also command ALTER TABLE (constraints).
DROP COLUMN • If a foreign key references the column which should be deleted, you have to
specify the option CASCADE CONSTRAINTS. In this case the foreign key
is also deleted even though the referencing table doesn't belong to the current
user.
• If a column is dropped which is part of the distribution key (see also ALTER
TABLE (distribution)), then the distribution keys are deleted completely and
the new rows of the table will be distributed randomly across the cluster
nodes.
• If the optional IF EXISTS clause is specified, then the statement does not
throw an exception if the column does not exist.
MODIFY COLUMN • The specified types must be convertible. This concerns the data types (e.g.
BOOL is not convertible into TIMESTAMP) and the content of the column
(CHAR(3) with content '123' can be converted into DECIMAL, but 'ABC'
can not).
• A default value must be convertible to the data type. If a default value has
not been specified, any older default value that exists will be used; this must
be appropriate for the new data type. Details on the permissible expressions
for the default value expr can be found in Section 2.3.5, “Default values”.
• Details on identity columns can be found in Section 2.3.6, “Identity columns”.
If the identity column option has not been specified for an existing identity
column, then the column still has the identity column property and the new
data type must be appropriate.
• For details about constraints see also command ALTER TABLE (constraints).
By specifying constraints in the MODIFY command you can only
add constraints, but not modify them. Exception: If you specify a NULL
constraint, a previously existing NOT NULL constraint is removed. If you
don't specify a constraint, then existing constraints are not changed.
• If a column is modified which is part of the distribution key (see also ALTER
TABLE (distribution)), the table is redistributed.
RENAME COLUMN • This statement will not change the contents of the table.
• The table may not contain a column with the name new_name.
ALTER COLUMN DEFAULT • This statement will not change the contents of the table.
• A default value must be convertible to the data type.
• Details on the permissible expressions for the default value expr can be found
in Section 2.3.5, “Default values”.
• DROP DEFAULT should be given priority over SET DEFAULT NULL.
ALTER COLUMN IDENTITY • The content of the tables is not affected by this statement.
• When specifying the optional parameter int the number generator will be
set to this number, otherwise to 0. Starting with this number monotonically
increasing numbers are generated for INSERT statements which do not insert
an explicit value for the identity column. Please consider that those numbers
will be generated monotonically increasing, but can include gaps. By the use
of this command you can also "reset" the number generator.
• Details on identity columns can be found in Section 2.3.6, “Identity columns”.
Example(s)
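A minimal sketch (table and column names are invented):

```sql
ALTER TABLE t ADD COLUMN new_col DECIMAL DEFAULT 0;
ALTER TABLE t MODIFY COLUMN new_col VARCHAR(10);
ALTER TABLE t RENAME COLUMN new_col TO my_col;
ALTER TABLE t ALTER COLUMN my_col DROP DEFAULT;
ALTER TABLE t DROP COLUMN my_col;
```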
Purpose
By using this statement you can control the distribution of rows across the cluster nodes.
Prerequisite(s)
• System privilege ALTER ANY TABLE, object privilege ALTER on the table or its schema, or the table belongs
to the current user or one of his roles.
Syntax
distribute_by::=

    ALTER TABLE table
        { DISTRIBUTE BY column [, ...]
        | DROP DISTRIBUTION KEYS }
Note(s)
• Explicitly distributed tables can accelerate certain SQL statements significantly. E.g. a join between two tables
which are both distributed by the join columns can be executed locally on the cluster nodes. GROUP BY queries
on single tables can also be accelerated if all distribution keys (columns) are part of the grouping elements.
However, it is pointless to explicitly distribute small tables with only a few thousand rows.
The execution time of this command can be considerable. It should only be used
if you are aware of its consequences; in particular, inappropriate distribution
keys can negatively influence the performance of queries.
• If the table distribution was not defined explicitly, the system distributes the rows equally across the cluster
nodes.
• If a table is explicitly distributed, the distribution of the rows across the cluster nodes is determined by the
hash value of the distribution keys. The order of the defined columns is irrelevant. The distribution of a table
is maintained, especially when inserting, deleting or updating rows.
• The actual mapping between values and database nodes can be evaluated using the function VALUE2PROC.
• If you define inappropriate distribution keys which would lead to a heavy imbalance of rows in the cluster, the
command is aborted and a corresponding error message is thrown.
• The explicit distribution of a table can also be defined directly in the CREATE TABLE statement.
• By setting or unsetting the distribution key, all internal indices are recreated.
• Via the DROP DISTRIBUTION KEYS statement you can undo the explicit distribution of a table. Afterwards
the rows will again be distributed randomly, but equally across the cluster nodes.
• By executing the DESC[RIBE] statement or by using the information in the system table
EXA_ALL_COLUMNS you can identify the distribution keys of a table. In system table EXA_ALL_TABLES
you can see whether a table is explicitly distributed.
Example(s)
DESCRIBE my_table;
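In addition, an explicit distribution might be set and removed like this (table and column names are invented):

```sql
ALTER TABLE my_table DISTRIBUTE BY shop_id, branch_no;
ALTER TABLE my_table DROP DISTRIBUTION KEYS;
```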
Purpose
This statement adds constraints to a table.
Prerequisite(s)
• System privilege ALTER ANY TABLE, object privilege ALTER on the table or its schema, or the table belongs
to the current user or one of his roles.
Syntax
alter_table_constraint::=

    ALTER TABLE table ADD out_of_line_constraint

out_of_line_constraint::=

    [CONSTRAINT [name]]
        { PRIMARY KEY ( column [, ...] )
        | FOREIGN KEY ( column [, ...] ) references_clause }
        [ENABLE | DISABLE]

references_clause::=

    REFERENCES table [( column [, ...] )]
Note(s)
PRIMARY KEY All values have to be unique, NULL values are not allowed. A table may only have one
primary key.
FOREIGN KEY A foreign key always references the primary key of a second table. The column content
must either exist in the primary key column or must be NULL (in case of a composite key
at least one of the columns). The datatype of a foreign key and its corresponding primary
key must be identical. Foreign keys cannot reference a virtual object.
NOT NULL No NULL values can be inserted. A NOT NULL constraint can only be specified either
directly in the table definition or via the ALTER TABLE MODIFY COLUMN statement.
• Constraints can have a name for easier identification and always have one of the following two states:
ENABLE The constraint is directly checked after DML statements (see Section 2.2.2, “Manipulation of
the database (DML)”). This process costs some time, but ensures the data integrity.
DISABLE The constraint is deactivated and not checked. This state can be interesting if you want to define
the metadata within the database, but avoid a negative performance impact.
If no explicit state is defined, then the session parameter CONSTRAINT_STATE_DEFAULT is used (see
also ALTER SESSION).
• The corresponding metadata for primary and foreign key constraints can be found in the system tables
EXA_USER_CONSTRAINTS, EXA_ALL_CONSTRAINTS and EXA_DBA_CONSTRAINTS, the metadata
for NOT NULL constraints in EXA_USER_CONSTRAINT_COLUMNS, EXA_ALL_CONSTRAINT_COLUMNS
and EXA_DBA_CONSTRAINT_COLUMNS. If no explicit name was specified, the
system implicitly generates a unique name.
• Constraints can also be defined directly within the CREATE TABLE statement and can be modified by the
command ALTER TABLE (constraints).
Example(s)
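A minimal sketch (table, column and constraint names are invented):

```sql
ALTER TABLE t1 ADD CONSTRAINT my_pk PRIMARY KEY (id);
ALTER TABLE t2 ADD CONSTRAINT my_fk FOREIGN KEY (t1_id) REFERENCES t1 (id) DISABLE;
```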
CREATE VIEW
Purpose
This statement can be used to create a view.
Prerequisite(s)
• System privilege CREATE VIEW if the view shall be created in your own schema or in that of an assigned
role or CREATE ANY VIEW. Additionally, the owner of the view (who is not automatically the creator) must
possess the corresponding SELECT privileges on all the objects referenced in the subquery.
• If the OR REPLACE option has been specified and the view already exists, the user also needs the same rights
as for DROP VIEW.
Syntax
create_view::=

    CREATE [OR REPLACE] [FORCE] VIEW view
        [( column [COMMENT IS string] [, ...] )]
        AS subquery [COMMENT IS string]
Note(s)
• An already existing view can be replaced with the OR REPLACE option, without having to explicitly delete
it with DROP VIEW.
• By specifying the FORCE option you can create a view without compiling its text. This can be very useful if
you want to create many views which depend on each other and which otherwise would have to be created in a
certain order. Please consider that syntax or other errors will not be found until the view is first used.
• The column names of the view are defined through the specification of column aliases. If these are not specified,
the column names are derived from the subquery. If columns are not directly referenced in the subquery but
assembled expressions, aliases must be specified for these either within the subquery or for the entire view.
• The view creation text is limited to 20,000 characters.
• A view is set to INVALID if one of its underlying objects was changed (see also column STATUS in system
tables like e.g. EXA_ALL_VIEWS). At the next read access, the system tries to automatically recompile the
view. If this is successful, the view is valid again; otherwise an appropriate error message is thrown. This
means in particular that an invalid view can still be usable.
Example(s)
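A minimal sketch (view, table and column names are invented):

```sql
CREATE VIEW my_view AS SELECT id, name FROM t1;
CREATE OR REPLACE VIEW my_view (v_id, v_name) AS SELECT id, name FROM t1;
```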
DROP VIEW
Purpose
This statement deletes a view.
Prerequisite(s)
• System privilege DROP ANY VIEW, or the view belongs to the current user or one of his roles.
Syntax
drop_view::=

    DROP VIEW [IF EXISTS] view [CASCADE | RESTRICT]
Note(s)
• RESTRICT and CASCADE have no significance, but are supported syntactically for reasons of compatibility.
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the view does
not exist.
Example(s)
CREATE FUNCTION
Purpose
This statement can be used to create a user-defined function.
Prerequisite(s)
• System privilege CREATE FUNCTION if the function shall be created in your own schema or in that of an
assigned role or CREATE ANY FUNCTION.
• If the OR REPLACE option has been specified and the function already exists, the rights for DROP FUNCTION
are also needed.
• In order to use a function, one needs the system privilege EXECUTE ANY FUNCTION, the object privilege
EXECUTE on the function or its schema, or must be the owner.
Syntax
create_function::=

    CREATE [OR REPLACE] FUNCTION function
        ( [param [IN] data_type [, ...]] ) RETURN data_type
        IS
            [variable data_type; ...]
        BEGIN
            function_statement ...
        END [function];

function_statement::=

    { assignment | if_branch | for_loop | while_loop | RETURN expr; }

assignment::=

    identifier := expr;

if_branch::=

    IF condition THEN function_statement ...
        [{ ELSIF | ELSEIF } condition THEN function_statement ...]
        [ELSE function_statement ...]
    END IF;

for_loop::=

    FOR identifier := expr TO expr DO function_statement ... END FOR;

while_loop::=

    WHILE condition DO function_statement ... END WHILE;
Note(s)
Example(s)
-- if-branch
IF input_variable = 0 THEN
res := NULL;
ELSE
res := input_variable;
END IF;
-- for loop
FOR cnt := 1 TO input_variable
DO
res := res*2;
END FOR;
-- while loop
WHILE cnt <= input_variable
DO
res := res*2;
cnt := cnt+1;
END WHILE;
DROP FUNCTION
Purpose
This statement deletes a function.
Prerequisite(s)
• System privilege DROP ANY FUNCTION or the function belongs to the current user.
Syntax
drop_function::=

    DROP FUNCTION [IF EXISTS] function
Note(s)
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the function
does not exist.
Example(s)
CREATE SCRIPT
Purpose
A script can be created by this statement, either a user defined function (UDF), a scripting program or an adapter
script.
Prerequisite(s)
• System privilege CREATE SCRIPT if the script shall be created in your own schema or in that of an assigned
role or CREATE ANY SCRIPT.
• If the OR REPLACE option has been specified and the script already exists, the rights for DROP SCRIPT are
also needed.
• To execute a script (either UDF or scripting program) you need the system privilege EXECUTE ANY SCRIPT,
the object privilege EXECUTE on the script or its schema or the user has to be the owner of the script.
Syntax
create_scripting_script::=

    CREATE [OR REPLACE] SCRIPT script
        [( [ARRAY] param_name [, ...] )]
        [RETURNS { TABLE | ROWCOUNT }] AS
        script_content

create_udf_script::=

    CREATE [OR REPLACE] { JAVA | PYTHON | LUA | alias } { SCALAR | SET }
        SCRIPT script udf_metadata
        script_content

udf_metadata::=

    ( [param_name data_type [, ...] | ...] [udf_order_by_clause] )
        { RETURNS data_type | EMITS ( param_name data_type [, ...] | ... ) } AS

udf_order_by_clause::=

    ORDER BY param_name [ASC | DESC] [NULLS { FIRST | LAST }] [, ...]

create_adapter_script::=

    CREATE [OR REPLACE] { JAVA | PYTHON | LUA | alias } ADAPTER SCRIPT script AS
        script_content
Note(s)
• An already existing script can be replaced with the OR REPLACE option, without having to explicitly delete
it with DROP SCRIPT.
• The ending slash ('/') is only required when using EXAplus.
• The content of a script is stored in the corresponding system tables (e.g. EXA_ALL_SCRIPTS - see also
Appendix A, System tables). It begins at the first line after the AS keyword that is not a blank line.
• If you want to integrate further script languages for user defined functions and adapter scripts, please see also
Section 3.6.5, “Expanding script languages using BucketFS”.
• Notes for scripting programs:
• Scripting programs provide a way of controlling the execution of several SQL commands (e.g. for ETL
jobs) and are executed with the EXECUTE SCRIPT command (see also Section 3.5, “Scripting”).
• For scripting programs only Lua is supported as programming language. More details can be found in
Section 3.5, “Scripting”.
• If neither of the two options RETURNS TABLE and RETURNS ROWCOUNT is specified, the option
RETURNS ROWCOUNT is used implicitly (for details about those options see also section Return value of a
script).
• You can execute a script via the statement EXECUTE SCRIPT.
• Notes for user defined functions:
• User defined functions (UDF) can be used directly within SELECT statements and can process big data
volumes (see also Section 3.6, “UDF scripts ”).
Please note that UDF scripts are part of the Advanced Edition of Exasol.
• In addition to scalar functions, you can also create aggregate and analytical functions. Even MapReduce
algorithms can be implemented. More details can be found in Section 3.6, “UDF scripts ”.
• If you define the ORDER BY clause, the groups of the SET input data are processed in an ordered way.
You can also specify this clause in the function call within a SELECT statement.
• Notes for adapter scripts:
• Details about adapter scripts and virtual schemas can be found in Section 3.7, “Virtual schemas”.
Please note that virtual schemas are part of the Advanced Edition of Exasol.
• The existing open source adapters provided by Exasol can be found in our GitHub repository:
https://www.github.com/exasol
• Adapter scripts can only be implemented in Java and Python.
Example(s)
-- UDF example
CREATE LUA SCALAR SCRIPT my_average (a DOUBLE, b DOUBLE)
RETURNS DOUBLE AS
function run(ctx)
    return (ctx.a + ctx.b) / 2
end
/

SELECT x, y, my_average(t.x, t.y) FROM t;

X                 Y                 MY_AVERAGE(T.X,T.Y)
----------------- ----------------- -------------------
                1                 4                 2.5
                2                 6                   4
                3                 3                   3
DROP SCRIPT
Purpose
A script can be dropped by this statement (UDF, scripting program or adapter script).
Prerequisite(s)
• System privilege DROP ANY SCRIPT, or the current user is the owner of the script (i.e. of the whole schema).
Syntax
drop_script::=

    DROP [ADAPTER] SCRIPT [IF EXISTS] script
Note(s)
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the script does
not exist.
• In case of an adapter script that is still referenced by a virtual schema, an exception is thrown and the script
cannot be dropped.
Example(s)
RENAME
Purpose
This statement renames a schema, a schema object, a user, a role or a connection.
Prerequisite(s)
• If the object is a schema, it must belong to the user or one of his roles.
• If the object is a schema object, the object must belong to the user or one of his roles (i.e. be located in one's own
schema or that of an assigned role).
• If the object is a user or role, the user requires the CREATE USER or CREATE ROLE privilege, respectively.
• If the object is a connection, the user requires the ALTER ANY CONNECTION system privilege or must
have received the connection with the WITH ADMIN OPTION.
Syntax
rename::=

    RENAME [{ SCHEMA | TABLE | VIEW | FUNCTION | SCRIPT | USER | ROLE
            | CONNECTION }] old_name TO new_name
Note(s)
• Schema objects cannot be moved to another schema with the RENAME statement, i.e. "RENAME TABLE
s1.t1 TO s2.t2" is not allowed.
• Distinguishing between schema, table, etc. is optional and only necessary if two database objects of different
types share the same name.
Example(s)
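A few illustrative examples (all object names are invented):

```sql
RENAME SCHEMA s1 TO s2;
RENAME TABLE t1 TO t2;
RENAME USER user1 TO user2;
```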
COMMENT
Purpose
Adds a comment on a database object or on individual table columns.
Prerequisite(s)
Syntax
comment_table::=
COMMENT ON [TABLE] table [IS string] [( column IS string [, ...] )]

comment_object::=

COMMENT ON { COLUMN | SCHEMA | FUNCTION | USER | ROLE | CONNECTION } object IS string
Note(s)
• Via the corresponding metadata system tables (e.g. EXA_ALL_OBJECTS, EXA_ALL_TABLES, ...) and the
command DESC[RIBE] (with FULL option) the comments can be displayed.
• Comments can be dropped by assigning NULL or the empty string.
• Comments can also be defined in the CREATE TABLE statement.
• View comments can only be specified in the CREATE VIEW statement.
Example(s)
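Illustrative examples (schema, table and column names are invented):

```sql
COMMENT ON SCHEMA s1 IS 'My first schema';
COMMENT ON TABLE t1 IS 'Table of customers'
    (id IS 'Customer number', name IS 'Customer name');
```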
INSERT
Purpose
This statement makes it possible for the user to insert constant values as well as the result of a subquery into a table.
Prerequisite(s)
• System privilege INSERT ANY TABLE, object privilege INSERT on the table or its schema, or the table
belongs to the current user or one of his roles.
• If the result of a subquery is to be inserted, the user requires the appropriate SELECT rights for the objects
referenced in the subquery.
Syntax
insert::=
INSERT INTO table [AS t_alias] [( column [, ...] )]
{ VALUES ( { expr | DEFAULT } [, ...] ) | DEFAULT VALUES | subquery }
Note(s)
• The number of target columns of the target table must match the number of constants or the number of SELECT
columns in a subquery. Otherwise, an exception occurs.
• If only a specific number of target columns (<column_list>) are specified for the target table, the entries of the
remaining columns will be filled automatically. For identity columns a monotonically increasing number is
generated and for columns with default values their respective default value is used. For all other columns the
value NULL is inserted.
• If for INSERT INTO t VALUES the 'value' DEFAULT is specified for a column, then the behavior is the same
as the implicit insert for unspecified columns (see above).
• INSERT INTO t DEFAULT VALUES has the same behavior as if you would specify the literal DEFAULT
for each column.
• Details on default values can be found in Section 2.3.5, “Default values”, on identity columns in Section 2.3.6,
“Identity columns”.
• Details about the syntax of subqueries can be found in the description of the SELECT statement in Section 2.2.4,
“Query language (DQL)”.
Example(s)
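The variants above can be sketched as follows (tables t and u are invented; assume matching column counts and types):

```sql
INSERT INTO t VALUES (1, 2.34, 'abc');
INSERT INTO t (i, j) VALUES (2, DEFAULT);
INSERT INTO t DEFAULT VALUES;
INSERT INTO t SELECT * FROM u;
```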
UPDATE
Purpose
The UPDATE statement makes it possible to make targeted changes to the contents of a table. The restriction of
the WHERE condition even makes it possible to change only one column entry of a single row.
Prerequisite(s)
• System privilege UPDATE ANY TABLE, object privilege UPDATE on the table or its schema, or the table
belongs to the current user or one of his roles.
• Appropriate SELECT privileges on the schema objects referenced in the optional FROM clause.
Syntax
update::=
UPDATE table [AS t_alias]
SET column = { expr | DEFAULT } [, ...]
[FROM from_item [, ...]] [WHERE condition] [PREFERRING preferring_term]
Note(s)
• By using the FROM clause, you can define several tables which are joined through the WHERE clause. By
that, you can specify complex update conditions which are similar to complex SELECT statements. Please
note that you have to specify the updated table within the FROM clause.
• If a column is set to the 'value' DEFAULT, then the rows in this column affected by the UPDATE are filled
automatically. For identity columns a monotonically increasing number is generated and for columns with
default values their respective default value is used. For all other columns the value NULL is used.
• Details on default values can be found in Section 2.3.5, “Default values”, on identity columns in Section 2.3.6,
“Identity columns”.
• The PREFERRING clause defines a Skyline preference term. Details can be found in Section 3.10, “Skyline”.
Example(s)
--Salary increase by 10 %
UPDATE staff SET salary=salary*1.1 WHERE name='SMITH';
--Euro conversion
UPDATE staff AS U SET U.salary=U.salary/1.95583, U.currency='EUR'
WHERE U.currency='DM';
MERGE
Purpose
The MERGE statement makes it possible to import the contents of an update table into a target table. The rows of
the update table determine which rows will be changed, deleted or inserted. Hence, the MERGE statement unites
the three statements UPDATE, DELETE, and INSERT.
For example, the update table can contain the data of new customers or customers to be dropped or the change
information of already existing customers. The MERGE statement can now be used to insert the new customers
into the target table, delete non-valid customers, and import the changes of existing customers.
The UPDATE, DELETE and INSERT clauses are optional, which means that at any given time only one part of
the actions described above is possible. Overall, the MERGE statement is a powerful tool for manipulating the
database.
Prerequisite(s)
• Appropriate INSERT, UPDATE and DELETE privileges on the target table, depending on which clauses are specified.
• Appropriate SELECT privileges on the change table.
Syntax
merge::=
MERGE INTO table [AS t_alias]
USING { table | view | subquery } [AS t_alias] ON ( condition )
[WHEN MATCHED THEN { merge_update_clause | merge_delete_clause }]
[WHEN NOT MATCHED THEN merge_insert_clause]

merge_update_clause::=

UPDATE SET column = { expr | DEFAULT } [, ...] [where_clause] [DELETE where_clause]

merge_delete_clause::=

DELETE [where_clause]

merge_insert_clause::=

INSERT [( column [, ...] )] VALUES ( { expr | DEFAULT } [, ...] ) [where_clause]
Note(s)
• The ON condition describes the correlation between the two tables (similar to a join). The MATCHED clause
is used for matching row pairs, the NOT MATCHED clause is used for those that do not match. In the ON
condition only equivalence conditions (=) are permitted.
• UPDATE clause: the optional WHERE condition specifies the circumstances under which the UPDATE is
conducted, whereby it is permissible for both the target table and the change table to be referenced for this.
With the aid of the optional DELETE condition it is possible to delete rows in the target table. Only the rows
that have been changed are taken into account and used to check the values after the UPDATE.
• DELETE clause: the optional WHERE condition specifies the circumstances under which the DELETE is
conducted.
• INSERT clause: the optional WHERE condition specifies the circumstances under which the INSERT is
conducted. In this respect, it is only permissible to reference the columns of the change table.
• The change table can be a physical table, a view or a subquery.
• The UPDATE or DELETE and INSERT clauses are optional with the restriction that at least one must be
specified. The order of the clauses can be exchanged.
• Default values and identity columns are treated by the INSERT and UPDATE clauses in exactly the same way
as by the INSERT and UPDATE statements (see there), with the only exception that INSERT DEFAULT
VALUES is not allowed.
• If there are several entries in the change table that could apply to an UPDATE of a single row in the target
table, this leads to the error message "Unable to get a stable set of rows in the source tables" if the original
value of the target table would be changed by the UPDATE candidates.
• If there are several entries in the change table that could apply to a DELETE of a single row in the target table,
this leads to the error message "Unable to get a stable set of rows in the source tables".
Example(s)
/* Sample tables
staff: changes: deletes:
name | salary | lastChange name | salary name
-----------------|----------- ---------------- ---------
meier | 30000 | 2006-01-01 schmidt | 43000 meier
schmidt | 40000 | 2006-05-01 hofmann | 35000
mueller | 50000 | 2005-08-01 meier | 29000
*/
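Based on the sample tables above, the changes and deletes could be applied as follows (a sketch; the use of CURRENT_DATE for the lastChange column is an assumption):

```sql
-- Update existing staff, insert new ones
MERGE INTO staff S
    USING changes C ON (S.name = C.name)
    WHEN MATCHED THEN UPDATE
        SET S.salary = C.salary, S.lastChange = CURRENT_DATE
    WHEN NOT MATCHED THEN INSERT
        VALUES (C.name, C.salary, CURRENT_DATE);

-- Delete the staff listed in the deletes table
MERGE INTO staff S
    USING deletes D ON (S.name = D.name)
    WHEN MATCHED THEN DELETE;
```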
DELETE
Purpose
The DELETE statement deletes rows from a table.
Prerequisite(s)
• System privilege DELETE ANY TABLE, object privilege DELETE on the table or its schema, or the table
belongs to the current user or one of his roles.
Syntax
delete::=
DELETE [*] FROM table [AS t_alias] [WHERE condition] [PREFERRING preferring_term]
Note(s)
• Internally, rows are not immediately deleted, but marked as deleted. When a certain threshold is reached (default
is 25% of the rows), these rows are finally dropped. Hence DELETE statements can have varying execution
times. The current percentage of marked rows can be found in system tables EXA_*_TABLES (e.g.
EXA_ALL_TABLES) in the column DELETE_PERCENTAGE.
• The PREFERRING clause defines a Skyline preference term. Details can be found in Section 3.10, “Skyline”.
Example(s)
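Illustrative examples (the staff table reuses the MERGE sample data):

```sql
DELETE FROM staff WHERE name = 'SMITH';
DELETE FROM staff;
```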
TRUNCATE
Purpose
The TRUNCATE statement makes it possible to completely delete the contents of a table.
Prerequisite(s)
• System privilege DELETE ANY TABLE, object privilege DELETE on the table or its schema, or the table
belongs to the current user or one of his roles.
Syntax
truncate::=
TRUNCATE TABLE table
Example(s)
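A minimal example:

```sql
TRUNCATE TABLE staff;
```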
IMPORT
Purpose
Via the IMPORT command you can transfer data from external data sources into a table.
Prerequisite(s)
• In the source system: corresponding privileges to read the table contents or the files
• In Exasol: corresponding privileges to insert rows into the table (see INSERT)
• When using a connection (see also CREATE CONNECTION in Section 2.2.3, “Access control using SQL
(DCL)”) you need either the system privilege USE ANY CONNECTION or the connection has to be granted
via the GRANT statement to the user or to one of its roles
• When using an error table you need the appropriate rights for writing or inserting data
Syntax
import::=
IMPORT INTO table [( cols )]
FROM { dbms_src | file_src | script_src }
[error_clause]
dbms_src:=
{ EXA | ORA | JDBC [DRIVER = string] } connection_def
{ TABLE table [( cols )] | STATEMENT stmt_string }
connection_def:=
AT { connection | string } [USER user IDENTIFIED BY password]
file_src:=
{ CSV | FBV } connection_def FILE string [FILE string ...]
| [SECURE] LOCAL { CSV | FBV } FILE string [FILE string ...]
[csv_cols | fbv_cols] [file_opts]
csv_cols:=
( col_nr [FORMAT = string] [, ...] )
fbv_cols:=
( [SIZE = bytes] [START = int] [FORMAT = string] [ALIGN = align] [PADDING = string] [, ...] )
file_opts:=
[ENCODING = string] [SKIP = int] [TRIM | LTRIM | RTRIM] [NULL = string]
[ROW SEPARATOR = string] [COLUMN SEPARATOR = string] [COLUMN DELIMITER = string] [ROW SIZE = int]
error_clause:=
[ERRORS INTO error_dst [( expr )] [REPLACE | TRUNCATE]]
[REJECT LIMIT { int | UNLIMITED } [ERRORS]]
error_dst:=
table | [SECURE] LOCAL CSV FILE string
script_src:=
SCRIPT script [AT connection] [WITH property = value ...]
Note(s)
• Additional information about ETL processes can be found in Section 3.4, “ETL Processes”
• The current progress of the data transfer can be seen within a second connection via the system table
EXA_USER_SESSIONS (column ACTIVITY)
• Import statements can also be transparently used within SELECT queries. Further details can be found in
Section 2.2.4, “Query language (DQL)”.
• Please note that in case of an IMPORT from JDBC or CSV sources, decimals are truncated if the target data
type has less precision than the source data type.
• Overview of the different elements and their meaning:
Element Meaning
dbms_src Defines the database source whose connection data is specified in the connection_def
(see below). You can choose among an Exasol connection (EXA), a native connection
to an Oracle database (ORA) or a JDBC connection to any database (JDBC).
Some JDBC drivers are already delivered as default (visible in EXAoperation) and can
be addressed within the connection string (e.g. jdbc:mysql, jdbc:postgres). You can
additionally configure JDBC drivers in EXAoperation and choose them via the DRIVER
option if its prefix is ambiguous.
The source data can either be a database table (as identifier like e.g.
MY_SCHEMA.MY_TABLE) or a database statement (as string like e.g. 'SELECT
"TEST" FROM DUAL'). In the second case this expression is executed on the database,
e.g. a SQL query or a procedure call.
Please note that table names are treated similar to Exasol tables.
Therefore you have to quote case-sensitive table names.
• Importing from Exasol databases is always parallelized. Please note that for Exasol,
loading tables directly is significantly faster than using database statements.
• If you import data from Oracle sources, partitioned tables will be loaded in parallel.
• Specifying multiple statements is only possible for JDBC sources.
1. Remote file(s)
FTP, FTPS, SFTP, HTTP and HTTPS servers are supported whose connection data
is defined via the connection_def (see below).
Notes:
• Certificates are not verified for encrypted connections.
• If you specify a folder instead of a file name, then the list of contained files will
be imported if the server supports that.
• In case of URLs starting with 'ftps://', the implicit encryption is used.
• In case of URLs starting with 'ftp://', Exasol encrypts user name and
password (explicit encryption) if this is supported by the server. If the server
demands encryption, then the whole data transfer is done encrypted.
• For HTTP and HTTPS servers only basic authentication is supported.
• For HTTP and HTTPS connections, http query parameters can be specified by appending them to the file name (e.g. FILE 'file.csv?op=OPEN&user.name=user').
2. Local file(s)
You can also import local files from your client system. When specifying the SECURE option, the data is transferred encrypted, but also with slower performance.
The source files can either be CSV or FBV files and shall comply with the format specifications in The CSV Data format and The Fixblock Data format (FBV). File names may only consist of ASCII characters. A BOM is not supported.
Compressed files are recognized by their file extension. Supported extensions are .zip,
.gz (gzip) and .bz2 (bzip2).
When System.in is specified as filename, data is read from the standard input stream
(System.in).
script_src Specifies the UDF script to be used for a user-defined import. Optionally, you can define a connection or properties which will be forwarded to the script. The specified script will generate an SQL statement internally which does the actual import using INSERT INTO SELECT. The script has to implement a special callback function which receives the import specification (e.g. parameters and connection information) and returns an SQL statement. For details and examples, we refer to Section 3.4.4, “User-defined IMPORT using UDFs”.
connection_def
Optional connection definition for being able to encapsulate connection information such
as password. See also the separate section in this table for the exact syntax.
Optional parameters to be passed to the script. Each script can define the mandatory and optional parameters it supports. Parameters are simple key-value pairs, with value being a string.
connection_def Defines the connection to the external database or file server. This can be specified
within a connection string (e.g. 'ftp://192.168.1.1/') and the corresponding
login information.
For regular ETL jobs you can also make use of connections where the connection data like user name and password can easily be encapsulated. For details and examples please refer to CREATE CONNECTION in Section 2.2.3, “Access control using SQL (DCL)”.
The declaration of user and password within the IMPORT command is optional. If they are omitted, the data of the connection string or the connection object is used.
col_nr Defines the column number (starting from 1). Alternatively, you can define
a certain column range via .. (e.g. 5..8 for columns 5,6,7,8). Please note
that column numbers have to be in ascending order!
FORMAT Optional format definition for numbers or datetime values (default: session
format). Please consider Section 2.6.2, “Numeric format models” and Sec-
tion 2.6.1, “Date/Time format models”.
In the following example the first 4 columns of the CSV file are loaded, the last one with the specified date format:
(1..3,4 FORMAT='DD-MM-YYYY')
fbv_cols Defines which and how the columns of the FBV file are interpreted. Please also refer to
The Fixblock Data format (FBV).
SIZE Defines the number of bytes of the column and must always be specified.
START Start byte of the column (starting with 0). Note that the START values have
to be in ascending order!
FORMAT Optional format definition for numbers or datetime values (default: session
format). Please consider Section 2.6.2, “Numeric format models” and Sec-
tion 2.6.1, “Date/Time format models”.
PADDING Padding characters for columns. In the default case, Space is used. You can specify any ASCII character, either in plain text (e.g.: '+'), as hexadecimal value (e.g: '0x09') or as abbreviation (one of 'NUL','TAB','LF','CR','ESC').
In the following example 4 columns of a FBV file are imported. The first column is right-
aligned and padded with x characters. After the first 12 bytes a gap exists and the fourth
column has the specified date format:
file_opts ENCODING Encoding of the CSV or FBV file (default is UTF8). All supported encodings can be found in Appendix D, Supported Encodings for ETL processes and EXAplus.
TRIM, LTRIM, RTRIM Defines whether spaces are deleted at the border of CSV columns (LTRIM: from the left, RTRIM: from the right, TRIM: from both sides). In the default case, no spaces are trimmed.
COLUMN SEPARATOR Defines the field separator for CSV files. In the default
case, the comma (,) is used. You can specify any string,
either as plain text (e.g.: ','), as a hexadecimal value (e.g.:
'0x09') or as an abbreviation (one of 'NUL', 'TAB', 'LF',
'CR', 'ESC'). A plain text value is limited to 10 characters,
which will be automatically converted to the file's specified
ENCODING (see above). A hexadecimal value is limited
to 10 bytes (not characters) and will not be converted.
COLUMN DELIMITER Defines the field delimiter for CSV files. In the default case, the double quote (") is used. You can specify any string, either as plain text (e.g.: '"'), as a hexadecimal value (e.g.: '0x09') or as an abbreviation (one of 'NUL', 'TAB', 'LF', 'CR', 'ESC'). A plain text value is limited to 10 characters, which will be automatically converted to the file's specified ENCODING (see above). A hexadecimal value is limited to 10 bytes (not characters) and will not be converted. If you don't want to use any field delimiter, you can define the empty string ('').
ROW SIZE Only for FBV files. If the last column of the FBV file is
not used, this value must be specified to recognize the end
of a row. Otherwise the end is implicitly calculated by the
last column which was defined in the IMPORT command
(e.g. in case of (SIZE=4 START=5) it is assumed that
one column is read with 4 bytes and that the row consists
of overall 9 bytes).
error_clause This clause defines how many invalid rows of the source are allowed. E.g. in case of REJECT LIMIT 5 the statement would work fine if at most five invalid rows occur, and would throw an exception upon the sixth error.
Additionally you can write the faulty rows into a file or a local table within Exasol to
process or analyze them afterwards:
Table For every faulty row, the following columns are created: row number,
error message, (expression), truncated flag and the actual data. The trun-
cated flag indicates whether the data was truncated to the maximal string
length.
CSV file For every faulty row, a comment row is created with row number, error
message and (expression), followed by the actual data row.
The (optional) expression can be specified for identification reasons in case you use the
same error table or file multiple times. You could e.g. use CURRENT_TIMESTAMP
for that.
Example(s)
IMPORT INTO table_1 FROM CSV
       AT 'http://192.168.1.1:8080/' USER 'agent_007' IDENTIFIED BY 'secret'
       FILE 'tab1_part1.csv' FILE 'tab1_part2.csv'
       COLUMN SEPARATOR = ';'
       SKIP = 5;
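Further sketches for a local file and a JDBC source (table and connection names are invented; my_connection is assumed to be defined via CREATE CONNECTION):

```sql
-- local CSV file, skipping one header row
IMPORT INTO table_2 FROM LOCAL CSV FILE '~/my_table.csv'
       COLUMN SEPARATOR = ';'
       SKIP = 1;

-- JDBC source using a predefined connection
IMPORT INTO table_3 FROM JDBC AT my_connection
       STATEMENT 'SELECT * FROM remote_schema.remote_table';
```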
EXPORT
Purpose
Via the EXPORT command you can transfer data from Exasol into external files or database systems.
Prerequisite(s)
• In the target system: corresponding privileges to insert rows or writing files. If you specify the corresponding
options, you need rights to replace or truncate the target.
• In Exasol: corresponding privileges to read the table contents.
• When using a connection (see also CREATE CONNECTION in Section 2.2.3, “Access control using SQL
(DCL)”) you need either the system privilege USE ANY CONNECTION or the connection has to be granted
via the GRANT statement to the user or to one of its roles
Syntax
export::=
EXPORT { table [( col_list )] | ( query ) }
INTO { dbms_dst | file_dst | script_dst }
[error_clause]
dbms_dst:=
{ EXA | ORA | JDBC [DRIVER = string] } connection_def
{ TABLE table [( cols )] [REPLACE | TRUNCATE] | STATEMENT stmt_string }
connection_def:=
AT { connection | string } [USER user IDENTIFIED BY password]
file_dst:=
{ CSV | FBV } connection_def FILE string [FILE string ...]
| [SECURE] LOCAL { CSV | FBV } FILE string [FILE string ...]
[csv_cols | fbv_cols] [file_opts]
csv_cols:=
( col_nr [FORMAT = string] [DELIMIT = { ALWAYS | NEVER | AUTO }] [, ...] )
fbv_cols:=
( [SIZE = bytes] [FORMAT = string] [ALIGN = align] [PADDING = string] [, ...] )
file_opts:=
[REPLACE | TRUNCATE] [ENCODING = string] [NULL = string] [BOOLEAN = string]
[ROW SEPARATOR = string] [COLUMN SEPARATOR = string] [COLUMN DELIMITER = string]
[DELIMIT = { ALWAYS | NEVER | AUTO }] [WITH COLUMN NAMES]
error_clause:=
REJECT LIMIT { int | UNLIMITED } [ERRORS]
script_dst:=
SCRIPT script [AT connection] [WITH property = value ...]
Note(s)
• Additional information about ETL processes can be found in Section 3.4, “ETL Processes”
• If no other option is specified (see below), the data is appended to the target
• The current progress of the data transfer can be seen within a second connection via the system table
EXA_USER_SESSIONS (column ACTIVITY)
• Data is exported in sorted order only for statements or views with an ORDER BY clause at the top level, and only when exporting into files.
• Overview of the different elements and their meaning:
Element Meaning
data_src The source data can either be a table (as identifier like e.g. MY_SCHEMA.MY_TABLE)
or a query (as string like e.g. 'SELECT "TEST" FROM DUAL'). For tables you
can also specify the columns to be used.
dbms_dst Defines the database destination whose connection data is specified in the connection_def (see below). You can choose among an Exasol connection (EXA), a native connection to an Oracle database (ORA) or a JDBC connection to any database (JDBC).
Some JDBC drivers are already delivered as default (visible in EXAoperation) and
can be addressed within the connection string (e.g. jdbc:mysql, jdbc:postgres). You
can additionally configure JDBC drivers in EXAoperation and choose them via the
DRIVER option if its prefix is ambiguous.
For the target you can define either a table or a prepared statement (e.g. an INSERT
statement or a procedure call). In the latter case the data is passed as input data to
the prepared statement. Please note that you have to use schema-qualified table
names.
1. Remote file
FTP, FTPS, SFTP, HTTP and HTTPS servers are supported whose connection
data is defined via the connection_def (see below).
Notes:
• Certificates are not verified for encrypted connections.
• In case of URLs starting with 'ftps://', the implicit encryption is used.
• In case of URLs starting with 'ftp://', Exasol encrypts user name and
password (explicit encryption) if this is supported by the server. If the
server demands encryption, then the whole data transfer is done encrypted.
• For HTTP and HTTPS servers only basic authentication is supported.
• For HTTP and HTTPS connections, http query parameters can be specified by appending them to the file name (e.g. FILE 'file.csv?op=CREATE&user.name=user').
2. Local file
You can also export into local files on your client system. When specifying the SECURE option, the data is transferred encrypted, but also with slower performance.
For exporting local files, the JDBC driver opens an internal connection to the cluster and provides an HTTP or HTTPS (SECURE option) server. But this is all transparent for the user.
The target file can either be a CSV or FBV file and shall comply with the format specifications in The CSV Data format and The Fixblock Data format (FBV). File names may only consist of ASCII characters. A BOM is not supported.
Compressed files are recognized by their file extension. Supported extensions are
.zip, .gz (gzip) and .bz2 (bzip2).
When specifying multiple files, the actual data distribution depends on several factors. It is also possible that some files are completely empty.
script_dst Specifies the UDF script to be used for a user-defined export. Optionally, you can
define a connection or properties which will be forwarded to the script. The specified
script will generate a SELECT statement internally that will be executed to do the
actual export. The script has to implement a special callback function that receives
the export specification (e.g. parameters and connection information) and returns a
SELECT statement. For details and examples, refer to Section 3.4.5, “User-defined
EXPORT using UDFs”.
connection_def
Optional parameters to be passed to the script. Each script can define the mandatory and optional parameters it supports. Parameters are simple key-value pairs, with value being a string.
connection_def Defines the connection to the external database or file server. This can be specified
within a connection string (e.g. 'ftp://192.168.1.1/') and the corresponding
login information.
For regular ETL jobs you can also make use of connections where the connection data like user name and password can easily be encapsulated. For details please refer to CREATE CONNECTION in Section 2.2.3, “Access control using SQL (DCL)”.
The declaration of user and password within the EXPORT command is optional. If they are omitted, the data of the connection string or the connection object is used.
For JDBC connections, it is possible to use Kerberos authentication by specifying specific data in the IDENTIFIED BY field. This data consists of a key which indicates that Kerberos authentication should be used (ExaAuthType=Kerberos), a base64 encoded configuration file and a base64 encoded keytab file containing the credentials for the principal. The syntax looks like the following: IMPORT INTO table1 FROM JDBC AT '<JDBC_URL>' USER '<kerberos_principal>' IDENTIFIED BY 'ExaAuthType=Kerberos;<base64_krb_conf>;<base64_keytab>' TABLE table2; Further details and examples can be found in our solution center: https://www.exasol.com/portal/display/SOL-512
csv_cols Defines which and how the columns of the CSV file are written. Please also refer to
The CSV Data format.
col_nr Defines the column number (starting from 1). Alternatively, you can
define a certain column range via .. (e.g. 5..8 for columns 5,6,7,8).
Please note that column numbers have to be in ascending order.
DELIMIT In default case (AUTO), the field delimiters are written only if special
characters occur within the data: the COLUMN SEPARATOR, the
ROW SEPARATOR, the COLUMN DELIMITER or a whitespace
character. By using the options ALWAYS or NEVER you can define
whether column separators shall be written always or never. This local
column option overwrites the global option (see file_opts).
Please note that if you use the NEVER option, then it's not guaranteed
that the exported data can be imported again into Exasol!
In the following example the first 4 columns of the source are loaded into the CSV
file, the last column with the specified date format:
(1..3,4 FORMAT='DD-MM-YYYY')
fbv_cols Defines which and how the columns of the FBV file are interpreted. Please also refer
to The Fixblock Data format (FBV).
SIZE Defines the number of bytes of the column, default is calculated by the
source data type:
PADDING Padding characters for columns. In the default case, Space is used. You can specify any ASCII character, either in plain text (e.g.: '+'), as hexadecimal value (e.g: '0x09') or as abbreviation (one of 'NUL','TAB','LF','CR','ESC').
In the following example 4 columns of a FBV file are written. The first column is right-aligned and filled to 8 bytes with + characters, the fourth column has the specified date format:
COLUMN SEPARATOR Defines the field separator for CSV files. In the default case, the comma (,) is used. You can specify any string, either as plain text (e.g.: ','), as a hexadecimal value (e.g.: '0x09') or as an abbreviation (one of 'NUL', 'TAB', 'LF', 'CR', 'ESC'). A plain text value is limited to 10 characters, which will be automatically converted to the file's specified ENCODING (see above). A hexadecimal value is limited to 10 bytes (not characters) and will not be converted.
COLUMN DELIMITER Defines the field delimiter for CSV files. In the default case, the double quote (") is used. You can specify any string, either as plain text (e.g.: '"'), as a hexadecimal value (e.g.: '0x09') or as an abbreviation (one of 'NUL', 'TAB', 'LF', 'CR', 'ESC'). A plain text value is limited to 10 characters, which will be automatically converted to the file's specified ENCODING (see above). A hexadecimal value is limited to 10 bytes (not characters) and will not be converted. If you don't want to use any field delimiter, you can define the empty string ('') or use the option DELIMIT NEVER (see below).
Please note that if you use the NEVER option, then it's not guaranteed that the exported data can be imported again into Exasol!
WITH COLUMN NAMES With the help of this option (only possible for CSV files), an additional row is written at the beginning of the file which contains the column names of the exported table. In case of a subselect these can also be expressions. The other options like e.g. the column separator are also applied for that row. If you want to import the same file again using the IMPORT statement, you can use the option SKIP 1.
error_clause This clause defines how many invalid rows of the source are allowed. E.g. in case of REJECT LIMIT 5 the statement would work fine if at most five invalid rows occur, and would throw an exception upon the sixth error. REJECT LIMIT 0 has the same behavior as omitting the error clause completely.
Example(s)
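An illustrative export into a remote CSV file (server address and credentials are invented):

```sql
EXPORT tab1 INTO CSV
       AT 'ftp://192.168.1.1/' USER 'agent_007' IDENTIFIED BY 'secret'
       FILE 'tab1.csv'
       COLUMN SEPARATOR = ';'
       WITH COLUMN NAMES;
```

The WITH COLUMN NAMES option writes a header row, so the file can later be re-imported with SKIP 1.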
An introduction to the basic concepts of rights management can be found in Section 3.2, “Rights management”.
Further details such as a list of all privileges, a summary of the access rights for SQL statements as well as the
system tables relevant to rights management are set out in Appendix B, Details on rights management.
In addition, within the SQL reference for each SQL statement, the prerequisites in terms of which privileges are
necessary for the respective statement are specified.
CREATE USER
Purpose
Adds a user to the database. Exasol uses the specified password for authentication.
Prerequisite(s)
• System privilege CREATE USER.
Syntax
create_user::=
CREATE USER user IDENTIFIED { BY password | AT LDAP AS dn_string | BY KERBEROS PRINCIPAL principal }
Note(s)
• In order for the user to be able to login subsequently, the system privilege CREATE SESSION must also be
granted.
• For the user name, the same rules as for SQL identifiers (see Section 2.1.2, “SQL identifier”) apply. However,
even with identifiers in quotation marks no attention is paid to case sensitivity. This means that the usernames
"Test", "TEST" and test are synonymous.
• You can choose one of the following authentication methods:
Via Password During the log in, the database checks the given password directly. Please consider that
passwords have to be specified as identifiers (see Section 2.1.2, “SQL identifier”). If
you use delimited (quoted) identifiers, then the password is case sensitive.
Via Kerberos The drivers authenticate via Kerberos service (single sign-on). Typically, the defined
principal looks like the following: <user>@<realm>. Additional information about
the overall Kerberos configuration can be found in the corresponding driver chapters
(Chapter 4, Clients and interfaces) and in our Operational Manual: https://www.exasol.com/portal/display/DOC/Operational+Manual
Via LDAP The database checks the password against an LDAP server which can be configured per database within EXAoperation. The parameter dn-string (string in single quotes) specifies the so-called distinguished name, which is the user name configured in the LDAP server. SASL and certificate management are not supported.
• After creation of the user, no schema exists for this user.
• A user can be renamed by the command RENAME.
Example(s)
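Illustrative examples for password and LDAP authentication (user names and the dn string are invented):

```sql
CREATE USER user_1 IDENTIFIED BY "h12_xhz";
CREATE USER user_2 IDENTIFIED AT LDAP
    AS 'cn=user_2,dc=authorization,dc=exasol,dc=com';
GRANT CREATE SESSION TO user_1;
```

Note that without CREATE SESSION the new user cannot log in.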
ALTER USER
Purpose
Changes the password or the authentication settings of a user.
Prerequisite(s)
• If password authentication is set, a user can always change their own password.
• Setting a new password for other users or defining the Kerberos / LDAP authentication requires the system privilege ALTER USER.
Syntax
alter_user::=
ALTER USER user IDENTIFIED { BY password [REPLACE old_password] | AT LDAP AS dn_string | BY KERBEROS PRINCIPAL principal }
Note(s)
• If a user possesses the system privilege ALTER USER, the REPLACE clause is optional and the old password
is not verified.
• For security reasons, the old password must also be specified if a user wishes to change their own password
(unless they possess the system privilege ALTER USER).
• Details on the Kerberos / LDAP authentication and the rules for password creation can be found at CREATE
USER.
Example(s)
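A sketch of typical usages (names and passwords are illustrative):

```sql
-- A user changes their own password; the old password must be given
ALTER USER user_1 IDENTIFIED BY h22_xhz REPLACE h12_xhz;
-- With the system privilege ALTER USER, the REPLACE clause is optional
ALTER USER user_1 IDENTIFIED BY h12_xhz;
-- Switch a user to LDAP authentication
ALTER USER user_2 IDENTIFIED AT LDAP AS 'cn=user_2,dc=authorization,dc=exasol,dc=com';
```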
DROP USER
Purpose
Deletes a user as well as the schemas of that user including all of the schema objects contained therein.
Prerequisite(s)
• The system privilege DROP USER.
Syntax
drop_user::=
DROP USER [IF EXISTS] user [CASCADE]
Note(s)
• If CASCADE is specified, all of the schemas of the user as well as their contents will be deleted! Furthermore,
all foreign keys which reference the tables of the user are deleted database-wide.
• If schemas that belong to the user still exist, CASCADE must be specified or these must be explicitly deleted
beforehand (using DROP SCHEMA).
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the user does
not exist.
• If the user to be deleted is logged-in at the same time, an error message is thrown and the user is not dropped.
In this case it is recommended to revoke the CREATE SESSION privilege from this user and terminate its
session using the command KILL.
Example(s)
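A minimal sketch (user names are illustrative):

```sql
-- Fails if user_1 still owns schemas
DROP USER user_1;
-- Also deletes all schemas of the user, including their contents
DROP USER user_2 CASCADE;
-- No error if the user does not exist
DROP USER IF EXISTS user_3;
```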
CREATE ROLE
Purpose
Creates a role.
Prerequisite(s)
• The system privilege CREATE ROLE.
Syntax
create_role::=
CREATE ROLE role
Note(s)
• A role possesses no privileges after creation. These are assigned with the GRANT statement. A role is either
granted privileges directly or other roles are assigned to it.
• The same rules apply for role names as for usernames (see CREATE USER).
• A role can be renamed by the command RENAME.
Example(s)
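A minimal sketch (the role name is illustrative):

```sql
CREATE ROLE test_role;
-- A role possesses no privileges until they are granted
GRANT CREATE SESSION TO test_role;
```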
DROP ROLE
Purpose
Deletes a role.
Prerequisite(s)
• Either the system privilege DROP ANY ROLE or this role with the WITH ADMIN OPTION must be assigned
to the user.
Syntax
drop_role::=
DROP ROLE [IF EXISTS] role [CASCADE]
Note(s)
• If CASCADE is specified, all of the schemas of the role as well as their contents will be deleted!
• If schemas that belong to the role still exist, CASCADE must be specified or these must be explicitly deleted
beforehand (using DROP SCHEMA).
• This statement will also remove the role from other users who possessed it. However, open transactions of
such users are not affected.
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the role does
not exist.
• Having created a role does not automatically mean that you are allowed to delete it.
Example(s)
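A minimal sketch (role names are illustrative):

```sql
DROP ROLE test_role;
-- Also deletes the schemas of the role, including their contents
DROP ROLE test_role2 CASCADE;
```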
CREATE CONNECTION
Purpose
Creates a connection which stores the connection data for an external system (e.g. another database or a file server).
Prerequisite(s)
• The system privilege CREATE CONNECTION.
Syntax
create_connection::=
CREATE [OR REPLACE] CONNECTION connection TO string [USER user_string IDENTIFIED BY password_string]
Note(s)
• External connections can be used within the statements IMPORT and EXPORT. Users must have the corresponding
access rights to the connection (via GRANT). The connection is automatically granted to the creator
(including the ADMIN OPTION).
• Further, connections can control the read access for users who want to use scripts for processing data from
local buckets stored in BucketFS. Details can be found in Section 3.6.4, “The synchronous cluster file system
BucketFS”.
• You can define an Exasol connection, a native connection to an Oracle database or a JDBC connection to any
database. Some JDBC drivers are already delivered as default (visible in EXAoperation) and can be addressed
within the connection string (e.g. jdbc:mysql, jdbc:postgres). You can additionally configure JDBC drivers
in EXAoperation and choose them via the DRIVER option if their prefix is ambiguous.
Only the pre-installed JDBC drivers (marked gray in EXAoperation) are tested and
officially supported. But our support will try to help you in case of problems with
other drivers.
• The declaration of user and password is optional and can be specified within the IMPORT and EXPORT
statements.
• Invalid connection data is not detected until the connection is used within the IMPORT and EXPORT statements.
• The list of all database connections can be found in the system table EXA_DBA_CONNECTIONS (see Ap-
pendix A, System tables).
• A connection can be renamed by the command RENAME.
Example(s)
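The sketch below shows the three connection kinds mentioned above; all names, addresses and credentials are illustrative:

```sql
-- Connection to another Exasol cluster
CREATE CONNECTION exa_conn TO '192.168.6.11..14:8563' USER 'my_user' IDENTIFIED BY 'my_secret';
-- Native connection to an Oracle database
CREATE CONNECTION ora_conn
  TO '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.6.54)(PORT=1521))(CONNECT_DATA=(SID=orcl)))'
  USER 'my_user' IDENTIFIED BY 'my_secret';
-- JDBC connection to a MySQL database
CREATE CONNECTION jdbc_conn TO 'jdbc:mysql://192.168.6.1/my_db';
```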
ALTER CONNECTION
Purpose
Changes the connection data of an external connection.
Prerequisite(s)
• System privilege ALTER ANY CONNECTION or the connection must be granted to the user with the WITH
ADMIN OPTION.
Syntax
alter_connection::=
ALTER CONNECTION connection TO string [USER user_string IDENTIFIED BY password_string]
Example(s)
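A minimal sketch (the connection name, address and credentials are illustrative):

```sql
-- Change address and credentials of an existing connection
ALTER CONNECTION exa_conn TO '192.168.6.11..14:8563' USER 'my_user' IDENTIFIED BY 'new_secret';
```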
DROP CONNECTION
Purpose
Deletes an external connection.
Prerequisite(s)
• System privilege DROP ANY CONNECTION or the connection must be granted to the user with the WITH
ADMIN OPTION.
Syntax
drop_connection::=
DROP CONNECTION [IF EXISTS] connection
Note(s)
• If the optional IF EXISTS clause is specified, then the statement does not throw an exception if the connection
does not exist.
Example(s)
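A minimal sketch (connection names are illustrative):

```sql
DROP CONNECTION exa_conn;
-- No error if the connection does not exist
DROP CONNECTION IF EXISTS jdbc_conn;
```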
GRANT
Purpose
The GRANT statement can be used to grant system privileges, object privileges, roles or the access to connections
to users or roles.
Prerequisite(s)
• For system privileges the grantor requires the GRANT ANY PRIVILEGE system privilege or the user must
have received this system privilege with the WITH ADMIN OPTION.
• For object rights the grantor must either be the owner of the object or possess the GRANT ANY OBJECT
PRIVILEGE system privilege.
With regard to GRANT SELECT on views, note that the grantor must be
permitted to grant the SELECT on the view and that the owner of the view must possess
SELECT privileges on the underlying base tables which are grantable to other
users, i.e. the owner either owns the base tables or possesses the privilege
GRANT ANY OBJECT PRIVILEGE. Otherwise, it would be possible to allow any
user access to a foreign table by creating a view.
• For roles the grantor requires the GRANT ANY ROLE system privilege or he must have received the role
with the WITH ADMIN OPTION.
• For priorities the grantor requires the GRANT ANY PRIORITY system privilege.
• For connections the grantor requires the GRANT ANY CONNECTION system privilege or he must have re-
ceived the connection with the WITH ADMIN OPTION.
Syntax
grant_system_privileges::=
GRANT { ALL [PRIVILEGES] | system_privilege [, ...] } TO { user | role } [, ...]

grant_object_privileges::=
GRANT { ALL [PRIVILEGES] | { ALTER | DELETE | EXECUTE | INSERT | REFERENCES | SELECT | UPDATE } [, ...] }
ON [ SCHEMA | TABLE | VIEW | FUNCTION | SCRIPT ] object TO { user | role } [, ...]

grant_roles::=
GRANT role [, ...] TO { user | role } [, ...] [WITH ADMIN OPTION]

grant_priority::=
GRANT PRIORITY { LOW | MEDIUM | HIGH } TO { user | role } [, ...]

grant_connection::=
GRANT CONNECTION connection [, ...] TO { user | role } [, ...]

grant_connection_restricted::=
GRANT ACCESS ON CONNECTION connection [FOR { SCRIPT udf_script | SCHEMA schema }] TO { user | role } [, ...]
Note(s)
• The list of system privileges supported by Exasol can be found in Table B.1, “System privileges in Exasol”.
• In order to ensure the security of the database, the GRANT statement should only be used in a very targeted
manner. Some of the privileges and roles lead to full control of the database. The DBA role possesses all possible
system privileges with the ADMIN option. With the privilege, GRANT ANY PRIVILEGE, it is possible to
grant all system privileges. With the ALTER USER privilege it is possible to change the password of SYS.
And with the GRANT ANY ROLE privilege it is possible to grant all roles (i.e. also the role DBA for example).
• With GRANT ALL the user is granted all system and object privileges.
• When granting an object privilege to a schema, this privilege is applied to all contained schema objects. Object
privileges for virtual schemas and their contained tables cannot be granted.
• The object privilege REFERENCES cannot be granted to a role.
• Assigned roles cannot be activated or deactivated by the user.
• The ACCESS privilege grants access to the details of a connection (also the password!) for (certain) UDF
scripts. This is necessary for adapter scripts of virtual schemas (see also Section 3.7, “Virtual schemas”).
• Details about priorities and connections can be found in Section 3.3, “Priorities” and in the description of the
statement CREATE CONNECTION.
Example(s)
-- System privilege
GRANT CREATE SCHEMA TO role1;
GRANT SELECT ANY TABLE TO user1 WITH ADMIN OPTION;
-- Object privileges
GRANT INSERT ON my_schema.my_table TO user1, role2;
GRANT SELECT ON VIEW my_schema.my_view TO user1;
-- Roles
GRANT role1 TO user1, user2 WITH ADMIN OPTION;
GRANT role2 TO role1;
-- Priority
GRANT PRIORITY HIGH TO role1;
-- Connection
GRANT CONNECTION my_connection TO user1;
REVOKE
Purpose
The REVOKE statement can be used to withdraw system privileges, object privileges, roles or the access to connections.
Prerequisite(s)
• For system privileges the revoker requires the GRANT ANY PRIVILEGE system privilege or he must have
received this system privilege with the WITH ADMIN OPTION.
• For object privileges the revoker requires the GRANT ANY OBJECT PRIVILEGE system privilege or he
must be the owner of the object.
• For roles the revoker requires the GRANT ANY ROLE system privilege or he must have received the role
with the WITH ADMIN OPTION.
• For priorities the revoker requires the GRANT ANY PRIORITY system privilege.
• For connections the revoker requires the GRANT ANY CONNECTION system privilege or he must have
received the connection with the WITH ADMIN OPTION.
Syntax
revoke_system_privileges::=
REVOKE { ALL [PRIVILEGES] | system_privilege [, ...] } FROM { user | role } [, ...]

revoke_object_privileges::=
REVOKE { ALL [PRIVILEGES] | { ALTER | DELETE | EXECUTE | INSERT | REFERENCES | SELECT | UPDATE } [, ...] }
ON [ SCHEMA | TABLE | VIEW | FUNCTION | SCRIPT ] object FROM { user | role } [, ...] [CASCADE CONSTRAINTS]

revoke_roles::=
REVOKE role [, ...] FROM { user | role } [, ...]

revoke_connection_restricted::=
REVOKE ACCESS ON CONNECTION connection [FOR { SCRIPT udf_script | SCHEMA schema }] FROM { user | role } [, ...]

revoke_priority::=
REVOKE PRIORITY [ LOW | MEDIUM | HIGH ] FROM { user | role } [, ...]

revoke_connections::=
REVOKE CONNECTION connection [, ...] FROM { user | role } [, ...]
Note(s)
• If the user has received the same privilege or the same role from several users, a corresponding REVOKE will
delete all of these.
• If an object privilege was granted to a single schema object, but also to its schema (that means implicitly to
all contained objects), and the privilege of the schema was revoked again, then the object privilege for the
single schema object is still retained.
• The object privilege REFERENCES can only be revoked if the corresponding user has not yet created foreign
keys on that table. In this case you can automatically drop those foreign keys by specifying the option CASCADE
CONSTRAINTS.
• As opposed to Oracle, REVOKE ALL [PRIVILEGES] will delete all system or object privileges, even if the
user was not granted all rights beforehand by means of GRANT ALL.
Example(s)
-- System privilege
REVOKE CREATE SCHEMA FROM role1;
-- Object privileges
REVOKE SELECT, INSERT ON my_schema.my_table FROM user1, role2;
REVOKE ALL PRIVILEGES ON VIEW my_schema.my_view FROM PUBLIC;
-- Role
REVOKE role1 FROM user1, user2;
-- Priority
REVOKE PRIORITY FROM role1;
-- Connections
REVOKE CONNECTION my_connection FROM user1;
SELECT
Purpose
The SELECT statement can be used to retrieve data from tables or views.
Prerequisite(s)
• System privilege SELECT ANY TABLE or appropriate SELECT privileges on the tables or views which are
referenced in the SELECT list. Either the tables or views belong to the current user or one of their roles, or the
current user owns the object privilege SELECT on the table/view or its schema.
• When accessing views, it is necessary that the owner of the view has appropriate SELECT privileges on the
referenced objects of the view.
• If you use a subimport, you need the appropriate rights, similar to the IMPORT statement.
Syntax
subquery::=
[with_clause] SELECT [DISTINCT | ALL] select_list [FROM from_item [, ...]] [WHERE condition]
[connect_by_clause] [preferring_clause] [group_by_clause] [order_by_clause] [limit_clause]

with_clause::=
WITH query_name [( column_aliases )] AS ( subquery ) [, ...]

select_list::=
{ expr [[AS] column_alias] | table.* | view.* | * } [, ...]
from_item::=
{ [TABLE] table | view | ( subquery ) | ( subimport ) | values_table } [[AS] table_alias [( col_aliases )]]
| join_clause
connect_by_clause::=
CONNECT BY [NOCYCLE] condition [AND condition]... [START WITH condition]
| START WITH condition CONNECT BY [NOCYCLE] condition [AND condition]...
preferring_clause::=
PREFERRING preference_term [PARTITION BY expr]

preference_term::=
{ ( preference_term )
| HIGH expr
| LOW expr
| boolean_expr
| preference_term PLUS preference_term
| preference_term PRIOR TO preference_term
| INVERSE ( preference_term ) }
group_by_clause::=
GROUP BY { expr | position | cube_rollup_clause | grouping_sets_clause | () } [, ...] [HAVING condition]

cube_rollup_clause::=
{ CUBE | ROLLUP } ( grouping_expression_list )

grouping_sets_clause::=
GROUPING SETS ( { cube_rollup_clause | grouping_expression_list } [, ...] )

grouping_expression_list::=
{ expr | ( expr [, ...] ) } [, ...]
order_by_clause::=
ORDER BY { expr | position | col_alias } [ASC | DESC] [NULLS { FIRST | LAST }] [, ...]
limit_clause::=
LIMIT { [offset ,] count | count [OFFSET offset] }
join_clause::=
from_item { inner_outer_clause | CROSS JOIN from_item }

inner_outer_clause::=
[ INNER | { LEFT | RIGHT | FULL } [OUTER] ] JOIN from_item [ ON condition | USING ( cols ) ]
subimport::=
IMPORT [INTO ( cols )] FROM { dbms_src | file_src | script_src } [error_clause]

values_table::=
VALUES ( expr [, ...] ) [, ...]
Note(s)
• You can calculate scalar expressions by omitting the FROM clause (e.g. SELECT 'abc').
• Using the WITH clause you can define temporary views which are only valid during the execution of a subquery.
• In case of DISTINCT, identical rows will be eliminated. If you use the keyword ALL (default), all rows will
be present in the result table.
• Source tables and views are defined in the FROM clause. Through the values_table you can easily define
static tables, e.g. by (VALUES (1,TRUE), (2,FALSE), (3, NULL)) AS t(i, b) a table with
two columns and three rows is specified.
• The SELECT list defines the columns of the result table. If * is used, then all columns will be listed.
• In complex expressions within the SELECT list, the usage of column aliases can be very useful. Such
aliases can be referenced directly only inside the ORDER BY clause. But you can also reference those aliases indirectly
via the keyword LOCAL within the other clauses (WHERE, GROUP BY, HAVING) and even in the SELECT
list. An example of the indirect referencing: SELECT ABS(x) AS x FROM t WHERE local.x>10. Moreover,
column aliases define the column names of the result table.
• The WHERE clause can be used to restrict the result by certain filter conditions.
• Equality join conditions among tables can be specified within the WHERE clause by using the = operator. If
you want to define an outer condition, add (+) after the outer-expression. This kind of syntax is more readable
than using the join_clause.
• You can use the USING clause within a join if the corresponding column names are identical in both tables.
In that case, you simply specify the list of unqualified column names which shall be joined, i.e. without specifying
table names or aliases. Please also note that a COALESCE expression is used in case of an outer join.
Hence, in case of non-matching values, the existing value is returned and not NULL. Afterwards, only this calculated
column can be referenced, but not the original columns of the two tables.
• The CONNECT BY clause can be used to define hierarchical conditions. It is evaluated before the WHERE
conditions - except the join conditions on tables which are referenced in the CONNECT BY clause.
The following elements are relevant for the CONNECT BY clause (a detailed example can be found below):
START WITH By this condition you can specify the set of root nodes in the graph.
condition(s) You can define several conditions. A hierarchical connection between father and son rows
can be defined via the keyword PRIOR (e.g. PRIOR employee_id = manager_id).
If you don't specify such a PRIOR condition, the cross product will be computed. Thus,
the statement SELECT LEVEL FROM dual CONNECT BY LEVEL<=100 results
in a table with 100 rows, because 100 cross products were calculated for the table dual.
NOCYCLE If you specify this option, the query also returns results if there exists a cycle. In this case
the expansion will be terminated when a cycle is detected.
The following functions and operators can be used in the SELECT list and the WHERE conditions to qualify
the results of a hierarchical query.
SYS_CONNECT_BY_PATH(expr, char) Returns a string containing the full path from the root node to the current
node, containing the values for expr and separated by char.
LEVEL Returns the hierarchy level of a row, i.e. 1 for the root node, 2 for its
direct sons, and so on.
PRIOR References the row's parent row. By that, you can define the father-son
condition. But furthermore, you can use this operator to access the values
of the parent row. Hence, the following two CONNECT BY conditions
are equivalent:
• PRIOR employee_id = manager_id AND PRIOR employee_id=10.
• PRIOR employee_id = manager_id AND manager_id=10.
CONNECT_BY_ROOT Instead of a row's value, the corresponding value of the root node is
used (e.g. CONNECT_BY_ROOT last_name would be evaluated by
the name of the highest manager of an employee if the condition
PRIOR employee_id = manager_id was defined in the
CONNECT BY clause).
CONNECT_BY_ISLEAF This expression returns 1 if a row is a leaf within the tree (i.e. it has no
sons), otherwise 0.
CONNECT_BY_ISCYCLE Returns whether the current row causes a cycle. In the path (see above)
such a row will occur exactly twice. This expression can only be used
in combination with the NOCYCLE option.
The function LEVEL and the operator PRIOR can also be used within the CONNECT
BY clause.
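The elements above can be combined in a hierarchical query such as the following sketch (table and column names are illustrative):

```sql
-- Walk the employee hierarchy starting at one root employee
SELECT last_name,
       LEVEL,
       SYS_CONNECT_BY_PATH(last_name, '/') AS path
  FROM employees
  CONNECT BY PRIOR employee_id = manager_id
  START WITH last_name = 'Clark'
  ORDER BY LEVEL;
```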
• The PREFERRING clause defines a Skyline preference term. Details can be found in Section 3.10, “Skyline”.
• The GROUP BY clause defines groups of rows which will be aggregated in the result table. Inside the SELECT
list, you can use aggregate functions. Using a numerical value x (position) results in aggregating the result
table by the x-th column. If GROUP BY is used, all SELECT list elements have to be aggregated except those
which define the grouping keys.
• CUBE, ROLLUP and GROUPING SETS are extensions of the GROUP BY clause for calculating superaggregates.
Those are hierarchical aggregation levels like e.g. partial sums for days, months and years and can be computed
within one single GROUP BY statement instead of using a UNION of several temporary results.
You can distinguish between regular result rows (normal aggregation on the deepest level) and superaggregate
rows. The total of all arguments results in the normal aggregation, the subsets result in the corresponding
superaggregates. You can discern the result row types by using the function GROUPING[_ID].
CUBE Calculates the aggregation levels for all possible combinations of the arguments (2^n
combinations).
Example: Via CUBE(countries,products) you can sum up all subtotal revenues of all
country/product pairs (regular result rows), but additionally the subtotals of each
country, the subtotals of each product and the total sum (3 additional superaggregate
rows).
ROLLUP Calculates the aggregation levels for the first n, n-1, n-2, ... 0 arguments (overall n+1
combinations). The last level corresponds to the total sum.
Example: Via ROLLUP(year,month,day) you can sum up the revenues of each single
date (regular result rows), but additionally for each month of each year, for each year and
the total sum (superaggregate rows).
GROUPING SETS Calculates the aggregation levels for the specified combinations. CUBE and ROLLUP
are special forms of GROUPING SETS and simplify the notation.
() Is similar to GROUPING SETS () and aggregates the whole table as one single group.
If multiple hierarchical groupings are specified, separated by a comma, then the result is the set of all combinations
of partial groupings (cross product). E.g. the expression ROLLUP(a,b),ROLLUP(x,y) results in overall
9 combinations. Starting with the subsets (a,b), (a), () and (x,y), (x), () you get the following combinations:
(a,b,x,y), (a,b,x), (a,b), (a,x,y), (a,x), (a), (x,y), (x), ().
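A sketch of a superaggregate query of this kind (table and column names are illustrative):

```sql
-- Regular rows per (year, month) plus superaggregates per year and the total sum
SELECT year, month, SUM(revenue) AS sum_rev
  FROM sales
  GROUP BY ROLLUP(year, month)
  ORDER BY year, month;
```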
• By using the HAVING clause you can restrict the number of groups.
• The result table can be sorted by specifying the ORDER BY clause. Using a numerical value x (position)
results in sorting the result table by the x-th column.
• The NULLS LAST (default) and NULLS FIRST options can be used to determine whether NULL values
are sorted at the end or the beginning.
• The number of result rows can be restricted by defining the LIMIT clause. The optional offset can only be
used in combination with ORDER BY, because otherwise the result would not be deterministic. LIMIT is not
allowed in aggregated SELECT statements and within correlated subqueries of an EXISTS predicate.
• By using the subimport clause, you can integrate the import of external data sources directly in your query.
Please note the following:
• Details about the usage of external data sources and their options can be found in the description of the
IMPORT statement in Section 2.2.2, “Manipulation of the database (DML)”.
• It is highly recommended to explicitly specify the target column types (see example below). Otherwise,
the column names and data types are chosen in a generic way. For importing files, these types are mandatory.
• Local files cannot be imported directly within queries.
• By creating a view, external data sources can be transparently integrated in Exasol as a sort of external
tables.
• Local filter conditions on such imports are not propagated to the source databases. However, you can
achieve that by using the STATEMENT option.
• SELECT statements can be combined using the Table operators UNION [ALL], INTERSECT, MINUS.
• Please note that SELECT queries can be directly returned from the query cache in case the syntactically equi-
valent query was already executed before. Details can be found in the notes of command ALTER SYSTEM.
Example(s)
C_ID NAME
---------- ----------
1 smith
2 jackson
STORE VOLUME
---------- --------
TOKYO 653.58
NEW YORK 1516.78
MUNICH 252.98
NAME VOLUME
---------- -------
jackson 1569.77
smith 853.57
Purpose
In order to combine the results of various queries with one another, the table operators UNION ALL, UNION,
INTERSECT and MINUS (=EXCEPT) exist. These calculate the set union, the set union without duplicates, the
intersection without duplicates, and the set difference without duplicates from two subqueries.
UNION ALL Union from both subqueries. All of the rows from both operands are taken into account.
UNION The set union from both subqueries without duplicates. All of the rows from both operands
are taken into account. Duplicate entries in the result are eliminated.
INTERSECT The intersection from both subqueries without duplicates. All of the rows that appear in
both operands are accounted for in the result. Duplicate entries in the result are eliminated.
MINUS or EXCEPT The set difference from both subqueries without duplicates. The result comprises those
rows in the left operand that do not exist in the right operand. Duplicate entries in the
result are eliminated.
Syntax
table_operators::=
( subquery ) { UNION [ALL] | INTERSECT | MINUS | EXCEPT } ( subquery )
Note(s)
• The table operators (except UNION ALL) are expensive operations and can lead to performance problems, in
particular with very large tables. This is primarily because the result must not contain duplicates. Removing
duplicates is an expensive operation.
• The number of columns of both operands must match and the data types of the columns of both operands must
be compatible.
• The column names of the left operand are used as column names for the result.
• Additionally, several table operators can be combined next to one another. In this respect, INTERSECT has
higher priority than UNION [ALL] and MINUS. Within UNION [ALL] and MINUS evaluation is performed
from left to right. However, for reasons of clarity parentheses should always be used.
• EXCEPT comes from the SQL standard; MINUS is an alias which is supported by Oracle, for example. Exasol supports
both alternatives.
Example(s)
I1 C1
---------- ---
1 abc
2 def
3 abc
3 abc
5 xyz
I2 C2
---------- ---
1 abc
abc
3
4 xyz
4 abc
I1 C1
---------- ---
1 abc
2 def
3 abc
3 abc
5 xyz
1 abc
abc
3
4 xyz
4 abc
I1 C1
---------- ---
1 abc
3 abc
4 abc
abc
2 def
4 xyz
5 xyz
3
I1 C1
---------- ---
1 abc
I1 C1
---------- ---
3 abc
2 def
5 xyz
Purpose
This construct can be used to check whether a number of columns have the primary key property. This is the case
if the specified columns do not contain duplicates and no NULL values. Rows that do not conform to the primary
key property are selected.
Syntax
select_invalid_primary_key::=
SELECT [DISTINCT | ALL] [select_list] WITH INVALID PRIMARY KEY ( cols ) FROM table
Note(s)
• In the formulation without a select_list, the invalid rows of the columns to be checked for the primary key
property are selected.
• ROWNUM cannot be used in combination with this statement.
• Verification of the primary key property occurs in the table stated in the FROM clause. It is not until after this
that WHERE, GROUP BY, etc. are used on the table with the columns that violate the property.
Example(s)
NR NAME FIRST_NAME
---------- ---------- ----------
1 meiser inge
2 mueller hans
3 meyer karl
3 meyer karl
5 schmidt ulla
6 benno
2 fleischer jan
FIRST_NAME NAME
---------- ----------
hans mueller
jan fleischer
karl meyer
karl meyer
NR NAME FIRST_NAME
---------- ---------- ----------
3 meyer karl
3 meyer karl
6 benno
FIRST_NAME
----------
karl
karl
Purpose
This construct can be used to verify whether the rows of a number of columns cols are unique. This is the case
if the specified columns cols do not contain data records in duplicate. Rows in the specified columns cols
which only contain NULL values are classified as being unique (even if there is more than one). Non-unique rows
are selected.
Syntax
select_invalid_unique::=
SELECT [DISTINCT | ALL] [select_list] WITH INVALID UNIQUE ( cols ) FROM table
Note(s)
• In the formulation without a select_list, the invalid rows of the columns to be checked for uniqueness are
selected.
Example(s)
NR NAME FIRST_NAME
---------- ---------- ----------
1 meiser inge
2 mueller hans
3 meyer karl
3 meyer karl
5 schmidt ulla
6 benno
2 fleischer jan
3
FIRST_NAME NAME
---------- ----------
karl meyer
karl meyer
NR NAME FIRST_NAME
---------- ---------- ----------
3 meyer karl
3 meyer karl
SELECT first_name WITH INVALID UNIQUE (nr, name, first_name) from T1;
FIRST_NAME
----------
karl
karl
Purpose
Selects rows that violate the specified foreign key property. This is the case if the row values of the specified
columns contain NULL values or do not exist in the specified columns of the referenced table.
Syntax
select_invalid_foreign_key::=
SELECT [DISTINCT | ALL] [select_list] WITH INVALID FOREIGN KEY ( cols ) FROM table
REFERENCING refTable [( ref_column [, ...] )]
Note(s)
• Preferably, the referenced columns of the reference table should possess the primary key property. However,
this is not verified by this statement.
• In the formulation without a select_list, the invalid rows of the columns to be checked for the foreign key
property are selected.
• ROWNUM cannot be used in combination with this statement.
• Verification of the foreign key property occurs directly in the table specified in the FROM clause. It is not
until after this that WHERE, GROUP BY, etc. are used on the table with the columns that violate the property.
Example(s)
NR NAME FIRST_NAME
---------- ---------- ----------
1 meiser inge
2 mueller hans
3 meyer karl
3 meyer karl
5 schmidt ulla
6 benno
2 fleischer jan
ID NAME FIRST_NAME
---------- ---------- --------------------
1 meiser otto
2 mueller hans
3 meyer karl
5 schmidt ulla
6 benno
7 fleischer jan
FIRST_NAME NAME
---------- ----------
NR NAME FIRST_NAME
---------- ---------- ----------
1 meiser inge
6 benno
NR FIRST_NAME NAME
---------- ---------- ----------
1 inge meiser
6 benno
2 jan fleischer
COMMIT
Purpose
The COMMIT statement is used to persistently store changes of the current transaction in the database.
Prerequisite(s)
• None.
Syntax
commit::=
COMMIT [WORK]
Note(s)
• The keyword WORK is optional and is only supported in order to conform to the SQL standard.
• The automatic running of COMMIT after each SQL statement is possible with the EXAplus SET AUTOCOMMIT
command.
• More information on transactions can be found in Section 3.1, “Transaction management ”.
Example(s)
COUNT(*)
--------
1
COUNT(*)
--------
1
ROLLBACK
Purpose
The ROLLBACK statement is used to withdraw the changes of the current transaction.
Prerequisite(s)
• None.
Syntax
rollback::=
ROLLBACK [WORK]
Note(s)
• The keyword WORK is optional and is only supported in order to conform to the SQL standard.
• More information on transactions can be found in Section 3.1, “Transaction management ”.
Example(s)
COUNT(*)
--------
1
ROLLBACK;
COUNT(*)
--------
0
EXECUTE SCRIPT
Purpose
Executes a script.
Prerequisite(s)
• System privilege EXECUTE ANY SCRIPT, object privilege EXECUTE on the script, or the current user
owns the script.
Syntax
execute_script::=
EXECUTE SCRIPT script [( script_param [, ...] )] [WITH OUTPUT]

script_param::=
{ expr | ARRAY ( expr [, ...] ) }
Note(s)
• A script can be created and dropped by using the statements CREATE SCRIPT and DROP SCRIPT.
• An extended introduction to the script language can be found in Section 3.5, “Scripting”.
• Content and parameters of a script are integrated in the corresponding system tables (e.g. EXA_ALL_SCRIPTS
− see also Appendix A, System tables).
• The return value of a script can be a table or a rowcount. It is specified as option in the CREATE SCRIPT
statement (see also section Return value of a script in Section 3.5, “Scripting”).
• When specifying the option WITH OUTPUT, the return value of the script is ignored. In this case, a
result table is always returned which consists of all debug messages created via the function output()
during the script execution (see also section Debug output in Section 3.5, “Scripting”).
• Contrary to views, a script is executed with the privileges of the executing user, not with the privileges of the
user who created the script via CREATE SCRIPT.
Example(s)
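A sketch of typical invocations (script names and parameters are illustrative):

```sql
-- Execute a script with two parameters
EXECUTE SCRIPT script_1 (3.1415, 'a string');
-- Array parameters are possible as well
EXECUTE SCRIPT script_2 (ARRAY(1, 2, 3));
-- Return the debug output instead of the regular return value
EXECUTE SCRIPT script_3 WITH OUTPUT;
```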
KILL
Purpose
Terminates a session or the current statement of a session.
Prerequisite(s)
• In order to terminate a foreign session or query, the user requires the system privilege KILL ANY SESSION.
Syntax
kill::=
KILL { SESSION { session_id | CURRENT_SESSION } | STATEMENT [stmt_id] IN SESSION session_id }
Note(s)
• In case of KILL SESSION the corresponding user gets an error message and is logged out of the database.
• KILL STATEMENT terminates the current statement of the corresponding session, but not the session itself.
If you specify a certain statement id (see also EXA_ALL_SESSIONS), you can constrain the termination to
a certain statement of a session. If this statement does not exist anymore (e.g. already finished), then a
corresponding exception will be thrown. The termination of statements is similar to the Query Timeout (see ALTER
SESSION or ALTER SYSTEM). When a statement is terminated, it may finish with an exception within a
few seconds through an internal cancellation point. If this fails - for example because there are no such
cancellation points (e.g. in Lua) or the query is slowed down due to disk operations - the query is terminated forcefully
and the transaction is rolled back (including an internal reconnect). If the affected statement is part of EXECUTE
SCRIPT, the whole script is terminated.
Example(s)
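A minimal sketch (the session id is illustrative; session ids can be found in EXA_ALL_SESSIONS):

```sql
KILL SESSION 7792436882684342285;
-- Terminate only the current statement of that session, not the session itself
KILL STATEMENT IN SESSION 7792436882684342285;
```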
ALTER SESSION
Purpose
Sets session-based parameters for the current session.
Prerequisite(s)
• None.
Syntax
alter_session::=
ALTER SESSION SET parameter = value
Note(s)
• The session-based parameters are initialized with the system-wide parameters (see ALTER SYSTEM), but
can be overwritten with the ALTER SESSION statement. The current settings can be found in the
EXA_PARAMETERS system table.
• At the moment that a user logs out, changes to settings made via ALTER SESSION are lost.
• The following parameters can be set:
2.2. SQL statements
TIME_ZONE
    Defines the time zone in which the values of type TIMESTAMP WITH LOCAL TIME ZONE
    are interpreted. Further information can be found in section Date/Time data types in
    Section 2.3, “Data types”. The list of supported timezones can be found in the system
    table EXA_TIME_ZONES. The function SESSIONTIMEZONE returns the current session time zone.
TIME_ZONE_BEHAVIOR
    Defines the course of action for ambiguous and invalid timestamps within a certain
    time zone. Further information can be found in section Date/Time data types in
    Section 2.3, “Data types”.
TIMESTAMP_ARITHMETIC_BEHAVIOR
    Defines the behavior of the +/- operators:
    • INTERVAL - The difference of two datetime values is an interval, and when adding a
      decimal value to a timestamp, the number is rounded to an integer, so in effect a
      certain number of full days is added.
    • DOUBLE - The difference of two datetime values is a double, and when adding a
      decimal value to a timestamp, the fraction of days (hours, minutes, ...) is added.
NLS_DATE_FORMAT
    Sets the date format used for conversions between dates and strings. Possible formats
    are described in Section 2.6.1, “Date/Time format models”.
NLS_TIMESTAMP_FORMAT
    Sets the timestamp format used for conversions between timestamps and strings. Possible
    formats are described in Section 2.6.1, “Date/Time format models”.
NLS_DATE_LANGUAGE
    Sets the language of the date format used in abbreviated and full month and day names
    (see Section 2.6.1, “Date/Time format models”). Possible languages are English
    (ENG = default) and German (DEU). English can be set using ENG or ENGLISH, German
    using DEU, DEUTSCH, or GERMAN.
NLS_FIRST_DAY_OF_WEEK
    Defines the first day of a week (integer 1-7 for Monday-Sunday).
NLS_NUMERIC_CHARACTERS
    Defines the decimal and group characters used for representing numbers. This parameter
    is also relevant to the use of numeric format models (see also Section 2.6.2, “Numeric
    format models”).
DEFAULT_LIKE_ESCAPE_CHARACTER
    Defines the escape character for the LIKE predicate (see Section 2.8, “Predicates”)
    in case no explicit one was specified.
QUERY_CACHE
    Defines the usage of a read cache for SELECT queries. If a syntactically identical
    query is sent multiple times (ignoring upper/lower case, spaces, ...), the database
    can read the result directly out of a cache instead of executing the query. This is
    only applicable if the corresponding schema objects haven't changed in the meantime.
    The following values can be set:
    • ON - The query cache is actively used, i.e. query results are read from and written
      into the cache.
    • OFF - The query cache is not used.
    • READONLY - Results are read from the cache, but additional new queries will not be
      cached.
    Whether a query was returned from the cache can be determined by the column
    EXECUTION_MODE in the corresponding system tables (e.g. EXA_SQL_LAST_DAY).
QUERY_TIMEOUT
    Defines how many seconds a statement may run before it is automatically aborted. When
    this point is reached, the statement may finish with an exception within a few seconds
    through an internal cancellation point. If this fails, for example because there are
    no such cancellation points (e.g. in Lua) or the query is slowed down by disk
    operations, the query is terminated forcefully and the transaction is rolled back
    (including an internal reconnect). Time spent waiting for other transactions (in state
    Waiting for session) is not excluded. In case of EXECUTE SCRIPT the QUERY_TIMEOUT is
    applied to the script as a whole; when the timeout is reached, the script is terminated
    (including any statements being executed by the script). Please note that changes of
    the QUERY_TIMEOUT within a script only take effect when the script exits. The default
    value for QUERY_TIMEOUT is '0' (no restrictions).
Example(s)
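The example statements were lost in this copy; the following is a plausible reconstruction that would produce the output below (the exact format masks are assumptions):

```sql
ALTER SESSION SET NLS_DATE_LANGUAGE='ENG';
ALTER SESSION SET NLS_DATE_FORMAT='DAY-DD-MONTH-YYYY';
SELECT TO_CHAR(DATE '2007-12-31') TO_CHAR1;

-- decimal character ',' and group character '.'
ALTER SESSION SET NLS_NUMERIC_CHARACTERS=',.';
SELECT TO_CHAR(123123123.45, '999G999G999D99') TO_CHAR2;
```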
TO_CHAR1
----------------------------
MONDAY -31-DECEMBER -2007
TO_CHAR2
--------------
123.123.123,45
ALTER SYSTEM
Purpose
Sets system-wide parameters. These serve as initial values for the parameters of new sessions.
Prerequisite(s)
• The user requires the system privilege ALTER SYSTEM.
Syntax
alter_system::=
ALTER SYSTEM SET parameter = value
Note(s)
• The session-based parameters are initialized with the system-wide parameters (ALTER SYSTEM), but can be
overwritten with the ALTER SESSION statement. The current settings can be found in the EXA_PARAMETERS
system table.
• If a value is changed via ALTER SYSTEM, it will only impact new connections to the database.
• The following parameters can be set:
TIME_ZONE
    Defines the time zone in which the values of type TIMESTAMP WITH LOCAL TIME ZONE
    are interpreted. Further information can be found in section Date/Time data types in
    Section 2.3, “Data types”. The list of supported timezones can be found in the system
    table EXA_TIME_ZONES. The function SESSIONTIMEZONE returns the current session time zone.
TIME_ZONE_BEHAVIOR
    Defines the course of action for ambiguous and invalid timestamps within a certain
    time zone. Further information can be found in section Date/Time data types in
    Section 2.3, “Data types”.
TIMESTAMP_ARITHMETIC_BEHAVIOR
    Defines the behavior of the +/- operators:
    • INTERVAL - The difference of two datetime values is an interval, and when adding a
      decimal value to a timestamp, the number is rounded to an integer, so in effect a
      certain number of full days is added.
    • DOUBLE - The difference of two datetime values is a double, and when adding a
      decimal value to a timestamp, the fraction of days (hours, minutes, ...) is added.
NLS_DATE_FORMAT
    Sets the date format used for conversions between dates and strings. Possible formats
    are described in Section 2.6.1, “Date/Time format models”.
NLS_TIMESTAMP_FORMAT
    Sets the timestamp format used for conversions between timestamps and strings. Possible
    formats are described in Section 2.6.1, “Date/Time format models”.
NLS_DATE_LANGUAGE
    Sets the language of the date format used in abbreviated and full month and day names
    (see Section 2.6.1, “Date/Time format models”). Possible languages are English
    (ENG = default) and German (DEU). English can be set using ENG or ENGLISH, German
    using DEU, DEUTSCH, or GERMAN.
NLS_FIRST_DAY_OF_WEEK
    Defines the first day of a week (integer 1-7 for Monday-Sunday).
NLS_NUMERIC_CHARACTERS
    Defines the decimal and group characters used for representing numbers. This parameter
    is also relevant to the use of numeric format models (see also Section 2.6.2, “Numeric
    format models”).
DEFAULT_LIKE_ESCAPE_CHARACTER
    Defines the escape character for the LIKE predicate (see Section 2.8, “Predicates”)
    in case no explicit one was specified.
QUERY_CACHE
    Defines the usage of a read cache for SELECT queries. If a syntactically identical
    query is sent multiple times (ignoring upper/lower case, spaces, ...), the database
    can read the result directly out of a cache instead of executing the query. This is
    only applicable if the corresponding schema objects haven't changed in the meantime.
    The following values can be set:
    • ON - The query cache is actively used, i.e. query results are read from and written
      into the cache.
    • OFF - The query cache is not used.
    • READONLY - Results are read from the cache, but additional new queries will not be
      cached.
    Whether a query was returned from the cache can be determined by the column
    EXECUTION_MODE in the corresponding system tables (e.g. EXA_SQL_LAST_DAY).
QUERY_TIMEOUT
    Defines how many seconds a statement may run before it is automatically aborted. When
    this point is reached, the statement may finish with an exception within a few seconds
    through an internal cancellation point. If this fails, for example because there are
    no such cancellation points (e.g. in Lua) or the query is slowed down by disk
    operations, the query is terminated forcefully and the transaction is rolled back
    (including an internal reconnect). Time spent waiting for other transactions (in state
    Waiting for session) is not excluded. In case of EXECUTE SCRIPT the QUERY_TIMEOUT is
    applied to the script as a whole; when the timeout is reached, the script is terminated
    (including any statements being executed by the script). Please note that changes of
    the QUERY_TIMEOUT within a script only take effect when the script exits. The default
    value for QUERY_TIMEOUT is '0' (no restrictions).
CONSTRAINT_STATE_DEFAULT
    This parameter defines the default state of constraints ('ENABLE' or 'DISABLE') in
    case the state wasn't explicitly specified during the creation (see also CREATE TABLE
    and ALTER TABLE (constraints)).
PROFILE
    Activates/deactivates profiling (values 'ON' or 'OFF'). For details see also
    Section 3.9, “Profiling”.
SCRIPT_LANGUAGES
    Defines the script language aliases. For details see Section 3.6.5, “Expanding script
    languages using BucketFS”.
SQL_PREPROCESSOR_SCRIPT
    Defines a preprocessor script. If such a script is specified (a regular script which
    was created via CREATE SCRIPT), then every executed SQL statement is preprocessed by
    that script. Please refer to Section 3.8, “SQL Preprocessor” for more information on
    SQL preprocessing. Details about the script language can be found in Section 3.5,
    “Scripting”.
Example(s)
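A hedged sketch of system-wide settings (the chosen values are illustrative assumptions):

```sql
ALTER SYSTEM SET NLS_DATE_FORMAT='DD-MM-YYYY';
-- abort statements automatically after 120 seconds
ALTER SYSTEM SET QUERY_TIMEOUT=120;
```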
OPEN SCHEMA
Purpose
This statement opens a schema, which affects the name resolution.
Prerequisite(s)
• None.
Syntax
open_schema::=
OPEN SCHEMA schema
Note(s)
Example(s)
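A minimal sketch (the schema name is a placeholder):

```sql
OPEN SCHEMA my_schema;
-- objects in my_schema can now be referenced without a schema prefix
SELECT CURRENT_SCHEMA;
```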
CLOSE SCHEMA
Purpose
This statement closes the current schema, which affects the name resolution.
Prerequisite(s)
• None.
Syntax
close_schema::=
CLOSE SCHEMA
Note(s)
• If no schema is open, all schema objects must be referenced using schema-qualified names (see
also Section 2.1.2, “SQL identifier”).
Example(s)
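A minimal sketch (the schema name is a placeholder):

```sql
OPEN SCHEMA my_schema;
CLOSE SCHEMA;
-- from here on, objects must be referenced schema-qualified, e.g. my_schema.t
```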
DESC[RIBE]
Purpose
This statement displays column information for a given table or view.
Prerequisite(s)
• If the object to be described is a table, one of the following conditions must be fulfilled:
• The current user has one of the following system privileges: SELECT ANY TABLE (or SELECT ANY
DICTIONARY in the context of system tables, respectively), INSERT ANY TABLE, UPDATE ANY TABLE,
DELETE ANY TABLE, ALTER ANY TABLE or DROP ANY TABLE.
• The current user has any object privilege on the table.
• The table belongs to the current user or one of his roles.
• If the object to be described is a view, one of the following conditions must be fulfilled:
• The current user has one of the following system privileges: SELECT ANY TABLE or DROP ANY VIEW.
• The current user has any object privilege on the view.
• The view belongs to the current user or one of his roles.
Syntax
describe::=
{ DESCRIBE | DESC } [FULL] object_name
Note(s)
• The SQL_TYPE column displays the datatype. In case of a string type, the used character set will be additionally
shown (ASCII or UTF8).
• The NULLABLE column indicates whether the column is permitted to contain NULL values.
• The value of DISTRIBUTION_KEY shows whether the column is part of the distribution key (see also the
ALTER TABLE (distribution) statement). For views this value is always NULL.
• If you specify the option FULL, the additional column COLUMN_COMMENT displays the column
comment (truncated to a maximum of 200 characters) if it was set either implicitly by the CREATE TABLE
command or explicitly via the statement COMMENT.
• DESCRIBE can be abbreviated by DESC (e.g., DESC my_table;).
Example(s)
-- reconstructed: the CREATE TABLE preceding this DESCRIBE was lost (column list assumed)
CREATE TABLE t (i DECIMAL, v VARCHAR(100), DISTRIBUTE BY i);
DESCRIBE FULL t;
EXPLAIN VIRTUAL
Purpose
This statement is useful to analyze what an adapter script is pushing down to the underlying data source of a virtual
object.
Prerequisite(s)
Syntax
explain_virtual::=
EXPLAIN VIRTUAL select_statement
Note(s)
• Details about adapter scripts and virtual schemas can be found in Section 3.7, “Virtual schemas”.
• If you access virtual objects in a query and use the EXPLAIN command, the actual query is not executed and
no underlying data is transferred. The query is only compiled and optimized for pushing down as much logic
as possible into the query for the underlying system. The result of that statement is then a table containing the
effective queries for the external systems.
• You can use the EXPLAIN command as a subselect and thus process its result via SQL.
• Similar information about the details of a pushdown can also be found in profiling information, but this way
it is far easier to access.
Example(s)
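A hedged sketch (the schema and table names are placeholders):

```sql
-- shows the effective query pushed down to the external system
EXPLAIN VIRTUAL SELECT * FROM my_virtual_schema.my_table WHERE i > 10;
```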
RECOMPRESS
Purpose
Recompresses the data of tables, schemas, or the whole database.
Prerequisite(s)
• Access to all tables by the ownership of the user or any of its roles or by any of the modifying ANY TABLE
system privileges or any modifying object privilege. Modifying privileges are all except SELECT.
Syntax
recompress::=
RECOMPRESS { TABLE table [ ( column, ... ) ]
           | TABLES table, ...
           | SCHEMA schema, ...
           | SCHEMAS schema, ...
           | DATABASE } [ENFORCE]
Note(s)
This command implies a COMMIT before and after recompressing any of the specified
tables, unless you use the single-table alternative RECOMPRESS TABLE!
• The execution of this command can take some time. A smart logic tries to recompress only those columns
where a significant improvement can be achieved, but you can enforce recompression using the ENFORCE
option.
• Recompressing a table is especially useful after inserting a large amount of new data. Note, however,
that the resulting compression ratios are not necessarily significantly better than before.
• You can also specify certain columns to only partially compress a table.
• The compressed and raw data size of a table can be found in the system table EXA_ALL_OBJECT_SIZES.
Example(s)
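A hedged sketch (table names are placeholders):

```sql
RECOMPRESS TABLE t ENFORCE;   -- enforce recompression of all columns of t
RECOMPRESS TABLES t1, t2;     -- implies a COMMIT before and after each table
```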
REORGANIZE
Purpose
By using this command the database can be reorganized internally, which is necessary if the database cluster has
been enlarged. This statement redistributes the data across the nodes and reconstitutes the distribution status (see
also ALTER TABLE (distribution)). It is therefore recommended to execute this statement immediately after a
cluster enlargement; otherwise the system performance can even be worse than before the enlargement.
Prerequisite(s)
• Access to all tables by the ownership of the user or any of its roles or by any of the modifying ANY TABLE
system privileges or any modifying object privilege. Modifying privileges are all except SELECT.
Syntax
reorganize::=
REORGANIZE { TABLE table
           | TABLES table, ...
           | SCHEMA schema, ...
           | SCHEMAS schema, ...
           | DATABASE } [ENFORCE]
Note(s)
• If you specify the option DATABASE, all existing tables are reorganized.
• Multiple tables are reorganized step by step, and an implicit COMMIT is executed after each table. This leads
to fewer transaction conflicts and makes already reorganized tables accessible with improved performance as
early as possible.
• The reorganization consists of the following actions:
• DELETE reorganization: physically removing rows that are only marked as deleted (a kind of defragmentation).
This happens automatically after a certain threshold is reached, but can be explicitly triggered with this
command. See also the DELETE statement for further details.
• Recompressing the columns whose compression ratio decreased significantly over time.
• Recreation of all internal indices.
• TABLE redistribution: distributing the rows evenly across the cluster nodes and re-establishing the
DISTRIBUTE BY status after a cluster enlargement.
• If you specify the ENFORCE option, all specified tables are reorganized. Otherwise only those tables are ad-
justed where this operation is absolutely necessary (e.g. due to a cluster enlargement or lots of rows marked
as deleted).
• The execution of this command can take some time. In particular, each COMMIT potentially writes a lot
of data to disk.
Example(s)
REORGANIZE DATABASE;
TRUNCATE AUDIT LOGS
Purpose
Via this command you can clear the data of the system tables EXA_DBA_AUDIT_SESSIONS,
EXA_DBA_AUDIT_SQL, EXA_USER_TRANSACTION_CONFLICTS_LAST_DAY and
EXA_DBA_TRANSACTION_CONFLICTS. This can be necessary e.g. if the gathered data volume is too big.
Prerequisite(s)
Syntax
truncate_audit_logs::=
TRUNCATE AUDIT LOGS [KEEP { LAST { DAY | MONTH | YEAR } | FROM datetime }]
Note(s)
• By using the KEEP option you can retain the most recent data.
Example(s)
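A minimal sketch:

```sql
-- delete all audit data except that of the last month
TRUNCATE AUDIT LOGS KEEP LAST MONTH;
```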
FLUSH STATISTICS
Purpose
Statistical information in system tables is gathered continuously, but committed in certain periods. Using the
FLUSH STATISTICS command, you can force this commit.
Prerequisite(s)
• None.
Syntax
flush_statistics::=
FLUSH STATISTICS
Note(s)
• To show the newest flushed statistics, you may need to open a new transaction.
• This command generates additional load for the DBMS and should be used with caution. The statistical
information is updated every minute anyway.
• The available statistical information is described in Section A.2.3, “Statistical system tables”.
Example(s)
FLUSH STATISTICS;
COMMIT;
SELECT * FROM EXA_USAGE_LAST_DAY ORDER BY MEASURE_TIME DESC LIMIT 10;
PRELOAD
Purpose
Loads certain tables or columns and the corresponding internal indices from disk in case they are not yet in the
database cache. Due to Exasol's smart cache management, we highly recommend using this command only in
exceptional cases; otherwise you risk degrading the overall system performance.
Prerequisite(s)
• Access to all tables by the ownership of the user or any of its roles or by any read/write system or object
privileges.
Syntax
preload::=
PRELOAD { TABLE table [ ( column, ... ) ]
        | TABLES table, ...
        | SCHEMAS schema, ...
        | DATABASE }
Note(s)
• If you specify schemas or the complete database, the corresponding tables will be loaded into the cache.
Example(s)
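A hedged sketch (table and column names are placeholders):

```sql
-- load two columns of t and the corresponding indices into the cache
PRELOAD TABLE t (i, j);
```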
2.3. Data types
p ≥ 1; s ≥ 0
DOUBLE PRECISION
GEOMETRY[(srid)]                        srid defines the coordinate system (see also EXA_SPATIAL_REF_SYS)
INTERVAL DAY [(p)] TO SECOND [(fp)]     1 ≤ p ≤ 9, 0 ≤ fp ≤ 9, accuracy precise to a millisecond
INTERVAL YEAR [(p)] TO MONTH            1 ≤ p ≤ 9
TIMESTAMP                               Timestamp with accuracy precise to a millisecond
TIMESTAMP WITH LOCAL TIME ZONE          Timestamp which considers the session time zone
VARCHAR(n)                              1 ≤ n ≤ 2,000,000
• In case of TIMESTAMP WITH LOCAL TIME ZONE columns, the timestamps are internally normalized to
UTC, while the input and output value is interpreted in the session time zone. Hence, users in different time
zones can easily insert and display data without having to care about the internal storage. However, you should
be aware that executing the same SQL statement in sessions with different time zones can lead to different
results.
• While TIMESTAMP is a simple structure consisting of year, month, day, hour, minute and second, data of
type TIMESTAMP WITH LOCAL TIME ZONE represents a specific moment on the time axis. Internally,
the data is normalized to UTC, because within time zones there are time shifts (e.g. when switching
from winter to summer time) and ambiguous periods (e.g. when switching from summer to winter time). If
such problematic data is inserted within the local session time zone, the session value TIME_ZONE_BEHAVIOR
(changeable via ALTER SESSION) defines the course of action.
INVALID
When the time is shifted forward in a timezone, a gap arises. If timestamps are located within this gap,
they can be treated in different ways:
SHIFT Corrects the value by adding the daylight saving time offset (typically one hour)
ADJUST Rounds the value to the first valid value after the time shift
NULLIFY Sets the value to NULL
REJECT Throws an exception
AMBIGUOUS
When the time is shifted backward in a timezone, then ambiguous timestamps exist which can be treated in
different ways:
ST Interprets the value in Standard Time (ST)
DST Interprets the value in Daylight Saving Time (DST)
NULLIFY Sets the value to NULL
REJECT Throws an exception
• When casting between the data types TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE, the
session time zone is evaluated and the TIMESTAMP WITH LOCAL TIME ZONE value is transformed into a
normal TIMESTAMP. This is the same approach as when displaying a TIMESTAMP WITH LOCAL TIME ZONE
value in e.g. EXAplus, where the internally UTC-normalized value is also converted into a normal TIMESTAMP,
considering the session time zone.
• Special literals do not exist for the data type TIMESTAMP WITH LOCAL TIME ZONE. The normal
TIMESTAMP literals are expected (see Section 2.5, “Literals”), and the corresponding moment on the time
axis is defined via the session time zone. Details about arithmetic on datetime values and the datetime
functions can be found in the corresponding chapters (Section 2.9.1, “Scalar functions” and Section 2.7,
“Operators”).
• Please note that timestamp values logged in statistical system tables are interpreted in the database time
zone (DBTIMEZONE), which can be set via EXAoperation. This is especially relevant if you want to use the
different functions for the current timestamp:
SYSTIMESTAMP         Returns the current timestamp, interpreted in the database time zone (DBTIMEZONE)
CURRENT_TIMESTAMP    Returns the current timestamp, interpreted in the session time zone (SESSIONTIMEZONE)
LOCALTIMESTAMP       Synonym for CURRENT_TIMESTAMP
NOW                  Synonym for CURRENT_TIMESTAMP
• The list of supported timezones can be found in the system table EXA_TIME_ZONES.
The CHAR(n) data type has a fixed, pre-defined length n. When storing shorter strings, these are filled with
space characters ("padding").
VARCHAR(n) can contain any string of the length n or smaller. These strings are stored in their respective length.
The length of both types is limited to 2,000 characters (CHAR) and 2,000,000 characters (VARCHAR), respectively
both can be defined with a character set: ASCII or UTF8 (Unicode). If you omit this definition, UTF8 will be used.
stringtype_definition::=
{ CHAR [(n)] | VARCHAR (n) } [CHARACTER SET { ASCII | UTF8 }]
The character set of a certain column can be displayed by using the command DESC[RIBE].
p ≥ 1; s ≥ 0
Alias                Equivalent data type     Note
DECIMAL              DECIMAL(18,0)
DECIMAL(p)           DECIMAL(p,0)             1 ≤ p ≤ 36
DOUBLE               DOUBLE PRECISION
FLOAT                DOUBLE PRECISION
INT                  DECIMAL(18,0)
INTEGER              DECIMAL(18,0)
LONG VARCHAR         VARCHAR(2000000)
NCHAR(n)             CHAR(n)
NUMBER               DOUBLE PRECISION         Possible loss in precision
NUMBER(p)            DECIMAL(p,0)             1 ≤ p ≤ 36
NUMBER(p,s)          DECIMAL(p,s)             s ≤ p ≤ 36; p ≥ 1; s ≥ 0
NUMERIC              DECIMAL(18,0)
NUMERIC(p)           DECIMAL(p,0)             1 ≤ p ≤ 36
NUMERIC(p,s)         DECIMAL(p,s)             s ≤ p ≤ 36; p ≥ 1; s ≥ 0
NVARCHAR(n)          VARCHAR(n)               1 ≤ n ≤ 2,000,000
NVARCHAR2(n)         VARCHAR(n)               1 ≤ n ≤ 2,000,000
REAL                 DOUBLE PRECISION
SHORTINT             DECIMAL(9,0)
SMALLINT             DECIMAL(9,0)
TINYINT              DECIMAL(3,0)
VARCHAR2(n)          VARCHAR(n)               1 ≤ n ≤ 2,000,000
In many places in SQL queries, certain data types are expected, e.g. the CHAR or VARCHAR data type in the
string function SUBSTR[ING]. Should a user wish to use a different type, it is advisable to work with explicit
conversion functions (see also the list of Conversion functions).
If explicit conversion functions are not used, the system attempts to perform an implicit conversion. If this is not
possible or if one single value cannot be successfully converted during the computation, the system generates an
error message.
Comments:
• When converting from CHAR(n) or VARCHAR(n) to BOOLEAN, you can use strings '0', 'F', 'f'
or 'FALSE' (case-insensitive) for value FALSE and strings '1', 'T', 't' or 'TRUE' (case-insensitive)
for value TRUE.
• In operations with multiple operands (e.g. the operators +,-,/,*) the operands are implicitly converted to the
biggest occurring data type (e.g. DOUBLE is bigger than DECIMAL) before executing the operation. This
rule is also called numeric precedence.
In the following example an implicit conversion is conducted in order to insert the BOOLEAN entry into the
DECIMAL column of the created table.
Example(s)
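The statements that created the table were lost here; a plausible reconstruction (names assumed) in which the BOOLEAN value is implicitly converted to DECIMAL:

```sql
CREATE TABLE t (d DECIMAL);
INSERT INTO t VALUES (TRUE);  -- TRUE is implicitly converted to 1
```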
SELECT * FROM t;
D
----
1
Introduction
Default values are preconfigured values which are always inserted into table columns instead of NULL if no explicit
value is specified during an insert operation (e.g. INSERT).
Example
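A minimal sketch of a column default (table and column names assumed):

```sql
CREATE TABLE t (id DECIMAL,
                name VARCHAR(20) DEFAULT 'unknown');
-- name is filled with 'unknown' instead of NULL
INSERT INTO t (id) VALUES (1);
```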
Default values are explicitly or implicitly used in tables with the following statements:
• INSERT: DEFAULT as a 'value' for a column (INSERT INTO t(i,j,k) VALUES (1, DEFAULT,5))
or DEFAULT VALUES for all columns (INSERT INTO t DEFAULT VALUES) or implicitly at column
selection (for the columns that were not selected)
• IMPORT: implicitly at column selection (for the columns that were not selected)
• UPDATE: DEFAULT as a 'value' for a column (SET i=DEFAULT)
• MERGE: DEFAULT as a 'value' for a column in the INSERT and UPDATE parts or implicitly in the INSERT
part at column selection (for the columns that were not selected)
• ADD COLUMN: if a default value was specified for the inserted column with DEFAULT
The following can be used as default values:
• Constants
• Values which are constant at evaluation time, such as CURRENT_USER or CURRENT_DATE
• Value expressions which only contain functions of the two types of expressions mentioned above
Examples
Example:
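The query producing the output below was lost; column defaults can be inspected via the system table EXA_ALL_COLUMNS (the table and column names are assumptions):

```sql
SELECT column_default FROM exa_all_columns
 WHERE column_table = 'T' AND column_name = 'I';
```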
COLUMN_DEFAULT
--------------
3+4
• With MODIFY COLUMN, an existing default value is adopted when changing the data type. If the old default
value is not appropriate for the new data type, an error message is given.
• If entries used as default values can have varying lengths and thus might not fit into the table column
(e.g. CURRENT_USER or CURRENT_SCHEMA), an error message is given at insertion time if the value
fit when the default value was set but no longer does. In this case, the insertion is not performed.
If in doubt, the default value should be truncated to the column length via SUBSTR[ING].
• Interpretation of default values can depend on the format model (e.g. a DATE value). In this case it is advisable
to explicitly enforce the format (e.g. by means of TO_DATE with format specification).
Example
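A hedged sketch of setting a default afterwards while enforcing the column length (names assumed):

```sql
-- truncate the evaluated default to the 20-character column
ALTER TABLE t ALTER COLUMN name SET DEFAULT SUBSTR(CURRENT_USER, 1, 20);
```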
Introduction
Identity columns can be used to generate ids. They are similar to default values, but are evaluated dynamically.
Example
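The example statements were lost; a plausible reconstruction producing the output below (table and column names assumed):

```sql
CREATE TABLE actors (id DECIMAL(18,0) IDENTITY,
                     lastname VARCHAR(20),
                     surname VARCHAR(20));
INSERT INTO actors (lastname, surname) VALUES ('Pacino', 'Al');
INSERT INTO actors (lastname, surname) VALUES ('Willis', 'Bruce');
INSERT INTO actors (lastname, surname) VALUES ('Pitt', 'Brad');
SELECT * FROM actors;
```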
ID LASTNAME SURNAME
------------------- -------------------- --------------------
1 Pacino Al
2 Willis Bruce
3 Pitt Brad
Notes
• If you specify an explicit value for the identity column while inserting a row, then this value is inserted.
• In all other cases monotonically increasing numbers are generated by the system, but gaps can occur between
the numbers.
• The current value of the number generator can be changed via ALTER TABLE (column). Explicitly inserted
values have no influence on the number generator.
• Via DML statements (INSERT, IMPORT, UPDATE, MERGE) you can manipulate the values of an identity
column at any time.
You should not mistake an identity column for a constraint, i.e. identity columns
do not guarantee unique values. However, the values are unique as long as they are
inserted only implicitly and are not changed manually.
• Identity columns must have an exact numeric data type without any scale (INTEGER, DECIMAL( x ,0)).
The range for the generated values is limited by the data type.
• Tables can have at most one identity column.
• A column cannot be an identity column and have a default value at the same time.
The dynamically created values of an identity column are explicitly or implicitly used in the following statements:
• INSERT: DEFAULT as 'value' for a column (INSERT INTO t(i,j,k) VALUES (1,DEFAULT,5))
or implicitly in case of a column selection where the identity column is not included
• IMPORT: implicitly in case of a column selection where the identity column is not included
• UPDATE: DEFAULT as 'value' for a column (SET i=DEFAULT)
• MERGE: DEFAULT as 'value' for a column within the INSERT and UPDATE parts or implicitly in the INSERT
part in case of a column selection where the identity column is not included
• ADD COLUMN: if the added column is an identity column
Example:
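The query producing the output below was lost; the current state of the number generator can be inspected via the system table EXA_ALL_COLUMNS (the table and column names are assumptions):

```sql
SELECT column_identity FROM exa_all_columns
 WHERE column_table = 'T' AND column_name = 'ID';
```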
COLUMN_IDENTITY
-------------------------------------
30
2.4. Geospatial data
Please note that the geospatial data feature is part of the Advanced Edition of Exasol.
Please note that GEOMETRY columns are filled with strings (e.g. 'POINT(2
5)'). If you read this data externally via the drivers, it is automatically converted
to strings. The same applies to the commands IMPORT and EXPORT.
For geospatial objects, a multitude of functions are provided to execute calculations and operations.
Constructor Description
Geometry General abstraction of any geospatial objects
POINT(x y) Point within the two-dimensional area
LINESTRING(x y, x y, ...) LineString which connects several two-dimensional Points
LINEARRING(x y, x y, ...) A linear ring is a LineString whose start and end points are
identical.
POLYGON((x y, ...), [(x y, ...), ...])  Area which is defined by a linear ring and an optional list
                                        of holes within this area
GEOMETRYCOLLECTION(geometry, ...) A collection of any geospatial objects
MULTIPOINT(x y, ...) Set of Points
MULTILINESTRING((x y, ...), ...) Set of LineStrings
MULTIPOLYGON((x y, ...), ...) Set of Polygons
Instead of numerical arguments you can also use the keyword EMPTY for creating the
empty set of an object (e.g. POLYGON EMPTY)
Examples
POINT(2 5) -- PT
LINESTRING(10 1, 15 2, 15 10) -- L
POLYGON((5 1, 5 5, 9 7, 10 1, 5 1),
        (6 2, 6 3, 7 3, 7 2, 6 2)) -- PG
Function Description
Point Functions
ST_X(p) x coordinate of a Point.
ST_Y(p) y coordinate of a Point.
(Multi-)LineString Functions
ST_ENDPOINT(ls) End point of a LineString.
ST_ISCLOSED(mls) Defines whether all contained LineStrings are rings, i.e. whether their start
and end points are identical.
ST_ISRING(ls) Defines whether a LineString is a closed ring, i.e. start and end points are
identical.
ST_LENGTH(mls) Length of a LineString or the sum of lengths of all objects of a MultiLineString.
ST_NUMPOINTS(ls) Number of Points within the LineString.
ST_POINTN(ls, n) The n-th point of a LineString, starting with 1. Returns NULL if
ST_NUMPOINTS(ls)<n.
ST_STARTPOINT(ls) Start point of a LineString.
(Multi-)Polygon Functions
ST_AREA(mp) Area of a polygon or sum of areas of all objects of a MultiPolygon.
ST_EXTERIORRING(pg) Outer ring of the object.
ST_INTERIORRINGN(pg,n) The n-th hole of a Polygon, starting with 1. Returns NULL if
ST_NUMINTERIORRINGS(pg)<n.
ST_NUMINTERIORRINGS(pg) Number of holes within a Polygon.
GeometryCollection Functions
ST_GEOMETRYN(gc,n) The n-th object of a GeometryCollection, starting with 1. Returns NULL
if ST_NUMGEOMETRIES(gc)<n.
ST_NUMGEOMETRIES(gc) Number of objects within a collection of geometry objects.
General Functions
ST_BOUNDARY(g) Geometric boundary of a geospatial object (e.g. the end points of a
LineString or the outer LinearRing of a Polygon).
ST_BUFFER(g,n) Returns a geospatial object whose points have at most distance n to the
first argument. This effectively extends the borders of an object. Around
corners, an arc is created which is approximated by a number of points.
ST_CENTROID(g) Geometric center of mass of an object.
ST_CONTAINS(g,g) Defines whether the first object fully contains the second one.
ST_CONVEXHULL(g) Convex hull of a geospatial object.
ST_CROSSES(g,g) Defines whether the two objects cross each other. This is the case if
• The intersection is not empty and does not equal one of the objects
• The dimension of the intersection is smaller than the maximal dimension
of both arguments
LINESTRING Simple if it does not pass through the same Point
twice (except start and end points).
ST_OVERLAPS(g,g) Defines whether two geospatial objects overlap. This is the case if the
objects are not identical, their intersection is not empty and has the same
dimension as the two objects.
ST_SETSRID(g,srid) Sets the SRID for a geometry object (the coordinate system, see also
EXA_SPATIAL_REF_SYS).
ST_SYMDIFFERENCE(g,g) Symmetric difference set of two geospatial objects.
ST_TOUCHES(g,g) Defines whether two geospatial objects touch each other. This is the case
if the intersection is not empty and is only located on the boundaries of the
objects (see ST_BOUNDARY).
ST_TRANSFORM(g,srid) Converts a geospatial object into the given reference coordinate system
(see also EXA_SPATIAL_REF_SYS).
ST_UNION(g,g) Union set of two geospatial objects. This function can also be used as an
aggregate function.
ST_WITHIN(g,g) Defines whether the first object is fully contained by the second one
(opposite of ST_CONTAINS).
Examples
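A hedged sketch of calls on geometry literals passed as strings:

```sql
SELECT ST_AREA('POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))');  -- area of a 2x2 square
SELECT ST_LENGTH('LINESTRING(0 0, 3 4)');              -- length of a 3-4-5 segment
SELECT ST_CONTAINS('POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))',
                   'POINT(1 1)');
```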
2.5. Literals
Literals represent constants which possess a specific value and a corresponding data type. Literals can be
entered into tables as values or used within queries as constants, e.g. for comparisons or function parameters. If
a literal has a different type than the assigned or compared value, the "smaller" data type is implicitly converted.
Literals are grouped as follows: numeric literals, Boolean literals, date/time literals and string literals. There is
also a special literal for the NULL value.
Examples
123 Integer number (integral decimal number)
-123.456 Decimal number
1.2345E-15 Double value
TRUE Boolean value
DATE '2007-03-31' Date
TIMESTAMP '2007-03-31 12:59:30.123' Timestamp
INTERVAL '13-03' YEAR TO MONTH Interval (YEAR TO MONTH )
INTERVAL '1 12:00:30.123' DAY TO SECOND Interval (DAY TO SECOND)
'ABC' String
NULL NULL value
integer_literal::=
[+ | -] digit...

decimal_literal::=
[+ | -] digit... [. digit...]

double_literal::=
decimal_literal E [+ | -] digit...
boolean_literal::=
TRUE
FALSE
UNKNOWN
date_literal::=
DATE string
timestamp_literal::=
TIMESTAMP string
interval_year_to_month_literal::=
INTERVAL 'int' YEAR [(precision)]
INTERVAL 'int' MONTH [(precision)]
INTERVAL 'int-int' YEAR [(precision)] TO MONTH
Notes:
• int[-int] defines integer values for the number of years and months. In case of YEAR TO MONTH you
have to specify int-int.
• The optional parameter precision (1-9) specifies the maximal number of digits. Without this parameter 2
digits are allowed (months from 0 to 11).
• Examples:
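Typical literals matching the grammar above (the concrete values are illustrative):

```sql
INTERVAL '5' MONTH                 -- 5 months
INTERVAL '130' MONTH(3)            -- 130 months, 3 leading digits allowed
INTERVAL '27' YEAR                 -- 27 years
INTERVAL '2-1' YEAR TO MONTH       -- 2 years and 1 month
INTERVAL '100-1' YEAR(3) TO MONTH  -- 100 years and 1 month
```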
interval_day_to_second_literal::=
INTERVAL '[int ]time_expr' { DAY | HOUR | MINUTE | SECOND } [(precision [, fractional_precision])] [TO { HOUR | MINUTE | SECOND [(fractional_precision)] }]
Notes:
• int defines the number of days (see precision for the maximal number of digits).
• time_expr specifies a time value in the format HH[:MI[:SS[.n]]], MI[:SS[.n]] or SS[.n]. Valid
values are 0-23 for hours (HH), 0-59 for minutes (MI) and 0-59.999 for seconds (SS). The parameter precision
defines the maximal number of digits for the leading field (see below), which also allows you to use
larger numbers for hours, minutes and seconds (see examples). The parameter fractional_precision
defines the position at which the fractional part of the seconds is rounded (0-9, default is 3). In 'time_expr'
you have to specify the whole used range, e.g. 'HH:MI' in case of HOUR TO MINUTE.
• Please note that the precision of seconds is limited to 3 decimal places (as for timestamp values), although
you can specify more digits in the literal.
• The optional parameter precision (1-9) specifies the maximal number of digits for the leading field. By
default, two digits are allowed.
• The interval boundaries must be in descending order. That is why e.g. SECOND TO MINUTE is not valid.
• Examples:
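Typical literals matching the grammar above (the concrete values are illustrative):

```sql
INTERVAL '5' DAY                              -- 5 days
INTERVAL '100 23:59:59.999' DAY(3) TO SECOND  -- 100 days, 23 hours, 59 minutes, 59.999 seconds
INTERVAL '6' MINUTE                           -- 6 minutes
INTERVAL '23:59' HOUR TO MINUTE               -- 23 hours and 59 minutes
INTERVAL '1.99999' SECOND(2,2)                -- rounded at the 2nd fractional digit
```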
The smallest character set is automatically used for a string literal. If only ASCII characters are included, then the
ASCII character set will be used, otherwise the UTF8 character set (Unicode).
string_literal::=
' [character]... '
In order to use single quotes within a string, two single quotes are written next to one
another. For example, the literal 'AB''C' represents the value AB'C.
null_literal::=
NULL
2.6. Format models
If no format is specified, the current default format is used. This is defined in the NLS_DATE_FORMAT and
NLS_TIMESTAMP_FORMAT session parameters and can be changed for the current session by means of ALTER
SESSION or for the entire database by means of ALTER SYSTEM.
The default format is also important for implicit conversions (see also Section 2.3.4, “Type conversion rules”). If,
for example, a string value is inserted into a date column, a TO_DATE is executed implicitly.
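The implicit conversion described above can be sketched as follows (the table and column names are hypothetical):

```sql
CREATE TABLE t (d DATE);
INSERT INTO t VALUES ('2006-12-31');  -- the string is implicitly converted via TO_DATE
```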
If no language is specified, the current session language is used for representing the date. This is defined in the
NLS_DATE_LANGUAGE session parameter and can be changed for the current session by means of ALTER
SESSION or for the entire database by means of ALTER SYSTEM.
The current values for the NLS parameters can be found in the EXA_PARAMETERS system table.
For abbreviated month and day formats and those written in full, representation of the first two letters can be varied
via use of upper case and lower case letters (see the examples on the next page).
The following table lists all the elements that can occur in date/time format models. It is important to note that
each of the elements may only appear once in the format model. Exceptions to this are separators, padding characters
and the output format for TO_CHAR (datetime). It should also be noted that in the English language setting
abbreviated weekdays are displayed with three letters, but with only two in the German setting.
Examples
SELECT TO_DATE('06-2003-MON','WW-YYYY-DY');
The following table lists all the elements that can occur in numeric format models.
Comments
• If the format string is too short for the number, a string filled with the character '#' is returned.
• The group and decimal separators are defined in the NLS_NUMERIC_CHARACTERS parameter and can be
changed for the current session by means of ALTER SESSION or for the entire database by means of ALTER
SYSTEM.
• A point is used for the decimal separator and a comma for the group separator by default. The current values
for the NLS parameters can be found in the EXA_PARAMETERS system table.
The following examples illustrate the use of numeric format models in function TO_CHAR (number).
2.7. Operators
Operators combine one or two values (operands), e.g. in case of an addition of two numbers.
The precedence, that is the order in which the operators are evaluated, is the following:
Syntax
+ operator::=
number + number
interval + interval
datetime + { integer | interval }

- operator::=
number - number
interval - interval
date - date
datetime - { integer | interval }

* operator::=
{ number | interval } * number

/ operator::=
{ number | interval } / number
Note(s)
• + operator
• If you add a decimal value to a timestamp, the result depends on the session/system parameter
TIMESTAMP_ARITHMETIC_BEHAVIOR:
Example(s)
Purpose
Returns the concatenation of string1 and string2. This concatenation operator is equivalent to the
CONCAT function.
Syntax
|| operator::=
string1 || string2
Example(s)
'abc'||'DEF'
------------
abcDEF
Purpose
Details about the usage of operators can be found in the description of the SELECT statement in Section 2.2.4,
“Query language (DQL)”.
Syntax
PRIOR operator::=
PRIOR expr
CONNECT_BY_ROOT operator::=
CONNECT_BY_ROOT expr
Example(s)
SELECT last_name,
PRIOR last_name PARENT_NAME,
CONNECT_BY_ROOT last_name ROOT_NAME,
SYS_CONNECT_BY_PATH(last_name, '/') "PATH"
FROM employees
CONNECT BY PRIOR employee_id = manager_id
START WITH last_name = 'Clark';
2.8. Predicates
2.8.1. Introduction
Predicates are expressions which return a boolean value as the result, i.e. FALSE, TRUE, or NULL (or its alias,
UNKNOWN).
Predicates can be used in the following places:
• in the SELECT list as well as in the WHERE and HAVING clauses of a SELECT query
• in the WHERE clause of the UPDATE and DELETE statements
• in the ON clause and the WHERE clauses of the MERGE statement
Comparison predicates
Logical join predicates
[NOT] BETWEEN
EXISTS
[NOT] IN
IS [NOT] NULL
[NOT] REGEXP_LIKE
[NOT] LIKE
The order in which predicates are evaluated in complex expressions is determined by the precedence of the respective
predicate. The following table defines this in descending order, i.e. the predicates of the first row will be evaluated
first. However, the desired evaluation sequence can be specified by enclosing the expressions in parentheses.
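The effect of parentheses on the evaluation order can be sketched as follows (the boolean literals are illustrative):

```sql
SELECT TRUE OR FALSE AND FALSE   AS r1,  -- AND binds more strongly: TRUE
       (TRUE OR FALSE) AND FALSE AS r2;  -- parentheses evaluated first: FALSE
```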
Comparison predicates
Purpose
Comparison predicates compare two expressions and return whether the comparison is true.
Syntax
expr1 { = | != | < | <= | > | >= } expr2
Note(s)
= Equality check
!= Inequality check (the aliases <> and ^= also exist for this)
< or <= Check for "less than" or "less than or equal to"
> or >= Check for "greater than" or "greater than or equal to"
NULL values If one of the two expressions is the NULL value, the result is also the NULL value.
Example(s)
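A minimal sketch of comparison predicates (the literal values are illustrative):

```sql
SELECT 1 = 1    AS res1,  -- TRUE
       1 != 2   AS res2,  -- TRUE
       NULL = 1 AS res3;  -- NULL, since one operand is NULL
```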
Logical join predicates
Purpose
Logical join predicates combine conditions using the boolean operators AND, OR and NOT.
Syntax
condition AND condition
condition OR condition
NOT condition
Note(s)
Example(s)
[NOT] BETWEEN
Purpose
Syntax
expr1 [NOT] BETWEEN expr2 AND expr3
Note(s)
Example(s)
RES
-----
TRUE
EXISTS
Purpose
Syntax
EXISTS ( subquery )
Example(s)
i
-----
1
2
i
-----
2
[NOT] IN
Purpose
Syntax
expr [NOT] IN ( expr [, expr]... | subquery )
Example(s)
x
-----
2
4
x
-----
2
4
IS [NOT] NULL
Purpose
Syntax
expr IS [NOT] NULL
Example(s)
[NOT] REGEXP_LIKE
Purpose
Syntax
string [NOT] REGEXP_LIKE reg_expr
Note(s)
• Details and examples for regular expressions can be found in Section 2.1.3, “Regular expressions”.
• See also functions SUBSTR[ING], REGEXP_INSTR, REGEXP_SUBSTR and REGEXP_REPLACE.
Example(s)
CONTAINS_EMAIL
--------------
TRUE
[NOT] LIKE
Purpose
Similar to the predicate [NOT] REGEXP_LIKE, but only with simple pattern matching.
Syntax
like::=
string [NOT] LIKE pattern [ESCAPE esc_chr]
Note(s)
• Special characters:
esc_chr Character with which the special characters _ and % can be used literally in a pattern. By default this is
the value of the session parameter DEFAULT_LIKE_ESCAPE_CHARACTER (see also ALTER SESSION).
Example(s)
RES1 RES2
----- -----
FALSE TRUE
2.9. Built-in functions
The individual sections summarize the functions according to their use. The actual description of the functions
can be found in Section 2.9.4, “Alphabetical list of all functions”.
Examples:
SELECT SIN(1);
SELECT LENGTH(s) FROM t;
SELECT EXP(1+ABS(n)) FROM t;
Scalar functions normally expect a specific data type for their arguments. If an argument has a different type, an
implicit conversion to the expected data type is attempted, or an error message is issued.
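For instance, a string argument to a numeric function is implicitly converted (the literal is illustrative):

```sql
SELECT SIN('1');  -- the string '1' is implicitly converted to a number first
```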
Numeric functions
Numeric functions are given a numeric value as input and normally deliver a numeric value as output.
ABS
ACOS
ASIN
ATAN
ATAN2
CEIL[ING]
COS
COSH
COT
DEGREES
DIV
EXP
FLOOR
LN
LOG
LOG10
LOG2
MOD
PI
POWER
RADIANS
RAND[OM]
ROUND (number)
SIGN
SIN
SINH
SQRT
TAN
TANH
TO_CHAR (number)
TO_NUMBER
TRUNC[ATE] (number)
String functions
String functions can either return a string (e.g. LPAD) or a numeric value (e.g. LENGTH).
ASCII
BIT_LENGTH
CHARACTER_LENGTH
CH[A]R
COLOGNE_PHONETIC
CONCAT
DUMP
EDIT_DISTANCE
INSERT
INSTR
LCASE
LEFT
LENGTH
LOCATE
LOWER
LPAD
LTRIM
MID
OCTET_LENGTH
POSITION
REGEXP_INSTR
REGEXP_REPLACE
REGEXP_SUBSTR
REPEAT
REPLACE
REVERSE
RIGHT
RPAD
RTRIM
SOUNDEX
SPACE
SUBSTR[ING]
TO_CHAR (datetime)
TO_CHAR (number)
TO_NUMBER
TRANSLATE
TRIM
UCASE
UNICODE
UNICODECHR
UPPER
Date/Time functions
Date/Time functions manipulate the DATE, TIMESTAMP, TIMESTAMP WITH LOCAL TIME ZONE and
INTERVAL data types.
ADD_DAYS
ADD_HOURS
ADD_MINUTES
ADD_MONTHS
ADD_SECONDS
ADD_WEEKS
ADD_YEARS
CONVERT_TZ
CURDATE
CURRENT_DATE
CURRENT_TIMESTAMP
DATE_TRUNC
DAY
DAYS_BETWEEN
DBTIMEZONE
EXTRACT
FROM_POSIX_TIME
HOUR
HOURS_BETWEEN
LOCALTIMESTAMP
MINUTE
MINUTES_BETWEEN
MONTH
MONTHS_BETWEEN
NOW
NUMTODSINTERVAL
NUMTOYMINTERVAL
POSIX_TIME
ROUND (datetime)
SECOND
SECONDS_BETWEEN
SESSIONTIMEZONE
SYSDATE
SYSTIMESTAMP
TO_CHAR (datetime)
TO_DATE
TO_DSINTERVAL
TO_TIMESTAMP
TO_YMINTERVAL
TRUNC[ATE] (datetime)
WEEK
YEAR
YEARS_BETWEEN
Geospatial functions
There exist a lot of functions to analyze geospatial data (see ST_* and Section 2.4, “Geospatial data”).
Bitwise functions
Bitwise functions can compute bit operations on numerical values.
BIT_AND
BIT_CHECK
BIT_LROTATE
BIT_LSHIFT
BIT_NOT
BIT_OR
BIT_RROTATE
BIT_RSHIFT
BIT_SET
BIT_TO_NUM
BIT_XOR
Conversion functions
Conversion functions can be used to convert values to other data types.
CAST
CONVERT
IS_*
NUMTODSINTERVAL
NUMTOYMINTERVAL
TO_CHAR (datetime)
TO_CHAR (number)
TO_DATE
TO_DSINTERVAL
TO_NUMBER
TO_TIMESTAMP
TO_YMINTERVAL
Functions for hierarchical queries
CONNECT_BY_ISCYCLE
CONNECT_BY_ISLEAF
LEVEL
SYS_CONNECT_BY_PATH
Other scalar functions
CASE
COALESCE
CURRENT_SCHEMA
CURRENT_SESSION
CURRENT_STATEMENT
CURRENT_USER
DECODE
GREATEST
HASH_MD5
HASH_SHA[1]
HASH_TIGER
IPROC
LEAST
NULLIF
NULLIFZERO
NPROC
NVL
NVL2
ROWID
SYS_GUID
USER
VALUE2PROC
ZEROIFNULL
Aggregate functions
If a GROUP BY clause is not stated, an aggregate function always refers to the entire table. This type of query
then returns exactly one result row.
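For example, an aggregate query without GROUP BY (the table and column names are hypothetical):

```sql
SELECT COUNT(*), MAX(price) FROM articles;  -- exactly one result row
```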
APPROXIMATE_COUNT_DISTINCT
AVG
CORR
COUNT
COVAR_POP
COVAR_SAMP
FIRST_VALUE
GROUP_CONCAT
GROUPING[_ID]
LAST_VALUE
MAX
MEDIAN
MIN
PERCENTILE_CONT
PERCENTILE_DISC
REGR_*
ST_INTERSECTION (see ST_* and Section 2.4, “Geospatial data”)
ST_UNION (see ST_* and Section 2.4, “Geospatial data”)
STDDEV
STDDEV_POP
STDDEV_SAMP
SUM
VAR_POP
VAR_SAMP
VARIANCE
If an ORDER BY is specified for an analytical function, the number of data records relevant to the computation
can be further restricted by defining a window (WINDOW). Normally the window encompasses the data records
from the beginning of the partition to the current data record; however, it can also be explicitly restricted with
ROWS (physical border). Restriction using RANGE (logical border) is not yet supported by Exasol.
With the exception of the ORDER BY clause, analytical functions are evaluated last, i.e. after the WHERE,
GROUP BY, and HAVING clauses. Therefore, they may only be used in the SELECT list or the ORDER BY
clause.
Analytical functions enable complex evaluations and analyses through a variety of statistical functions and are a
valuable addition to aggregate functions (see Section 2.9.2, “Aggregate functions”). The following analytical
functions are supported:
AVG
CORR
COUNT
COVAR_POP
COVAR_SAMP
DENSE_RANK
FIRST_VALUE
LAG
LAST_VALUE
LEAD
MAX
MEDIAN
MIN
PERCENTILE_CONT
PERCENTILE_DISC
RANK
RATIO_TO_REPORT
REGR_*
ROW_NUMBER
STDDEV
STDDEV_POP
STDDEV_SAMP
SUM
VAR_POP
VAR_SAMP
VARIANCE
Analytical query
Purpose
Analytical functions evaluate a set of input values. However, unlike aggregate functions they return a result value
for each database row and not for each group of rows.
Syntax
analytical_query :=
function ( [expr [, ...]] ) over_clause

over_clause :=
OVER ( [partitionby] [orderby [window]] )

partitionby :=
PARTITION BY expr [, ...]

orderby :=
ORDER BY expr [, ...]

window :=
ROWS BETWEEN lower_boundary AND upper_boundary

lower_boundary :=
UNBOUNDED PRECEDING | CURRENT ROW

upper_boundary :=
UNBOUNDED FOLLOWING | CURRENT ROW
Note(s)
• Analytical functions are always evaluated after WHERE, GROUP BY and HAVING but before ORDER BY
(global).
• If the table is divided into multiple partitions via the PARTITION BY clause, the results within each partition
are calculated independently from the rest of the table. If no PARTITION BY clause is stated, an analytical
function always refers to the entire table.
• A reliable ordering of rows for an analytical function can only be achieved using ORDER BY within the OVER
clause.
• If an ORDER BY clause is used, you can additionally limit the set of relevant rows for the computation by
specifying a window. The default window is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT
ROW.
Example(s)
SELECT age,
FLOOR(age/10) || '0ies' AS agegroup,
COUNT(*) OVER (
PARTITION BY FLOOR(age/10)
ORDER BY age
) AS COUNT
FROM staff;
25 20ies 1
26 20ies 2
27 20ies 3
28 20ies 4
31 30ies 1
39 30ies 2
ABS
Purpose
Returns the absolute value of number n.
Syntax
abs::=
ABS ( n )
Example(s)
ABS
----
123
ACOS
Purpose
Returns the arccosine of number n. The result is between 0 and π.
Syntax
acos::=
ACOS ( n )
Note(s)
Example(s)
ACOS
-----------------
1.047197551196598
ADD_DAYS
Purpose
Adds a specified number of days to a date or timestamp value.
Syntax
add_days::=
Note(s)
Example(s)
AD1 AD2
---------- --------------------------
2000-02-29 2001-03-01 12:00:00.000000
ADD_HOURS
Purpose
Adds a specified number of hours to a timestamp value.
Syntax
add_hours::=
Note(s)
Example(s)
AH1 AH2
-------------------------- --------------------------
2000-01-01 01:00:00.000000 2000-01-01 11:23:45.000000
ADD_MINUTES
Purpose
Adds a specified number of minutes to a timestamp value.
Syntax
add_minutes::=
Note(s)
Example(s)
AM1 AM2
-------------------------- --------------------------
1999-12-31 23:59:00.000000 2000-01-01 00:02:00.000000
ADD_MONTHS
Purpose
Adds a specified number of months to a date or timestamp value.
Syntax
add_months::=
Note(s)
Example(s)
AM1 AM2
---------- --------------------------
2006-02-28 2006-03-31 12:00:00.000000
ADD_SECONDS
Purpose
Adds a specified number of seconds to a timestamp value.
Syntax
add_seconds::=
Note(s)
Example(s)
AS1 AS2
-------------------------- --------------------------
1999-12-31 23:59:59.000000 2000-01-01 00:00:01.234000
ADD_WEEKS
Purpose
Adds a specified number of weeks to a date or timestamp value.
Syntax
add_weeks::=
Note(s)
Example(s)
AW1 AW2
---------- --------------------------
2000-03-07 2005-01-24 12:00:00.000000
ADD_YEARS
Purpose
Adds a specified number of years to a date or timestamp value.
Syntax
add_years::=
Note(s)
• If the resulting month has fewer days than the day of the date of entry, the last day of this month is returned.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
AY1 AY2
---------- --------------------------
2001-02-28 2004-01-31 12:00:00.000000
APPROXIMATE_COUNT_DISTINCT
Purpose
Returns the approximate number of distinct values of expr.
Syntax
approximate_count_distinct::=
APPROXIMATE_COUNT_DISTINCT ( expr )
Note(s)
• The result isn't exact as it is with function COUNT, but it can be computed a lot faster.
• For the calculation, the algorithm HyperLogLog is used internally.
Example(s)
COUNT_EXACT COUNT_APPR
----------- ----------
10000000 10143194
ASCII
Purpose
Returns the numeric ASCII value of a character.
Syntax
ascii::=
ASCII ( char )
Note(s)
Example(s)
SELECT ASCII('X');
ASCII('X')
----------
88
ASIN
Purpose
Returns the arcsine of number n. The result is between -π/2 and π/2.
Syntax
asin::=
ASIN ( n )
Note(s)
Example(s)
SELECT ASIN(1);
ASIN(1)
-----------------
1.570796326794897
ATAN
Purpose
Returns the arctangent of number n. The result is between -π/2 and π/2.
Syntax
atan::=
ATAN ( n )
Example(s)
SELECT ATAN(1);
ATAN(1)
-----------------
0.785398163397448
ATAN2
Purpose
Returns the arctangent of two numbers n and m. The expression is equivalent to ATAN(n/m).
Syntax
atan2::=
ATAN2 ( n , m )
Example(s)
ATAN2(1,1)
-----------------
0.785398163397448
AVG
Purpose
Returns the mean value of the given entries.
Syntax
avg::=
AVG ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• Only numeric operands are supported.
Example(s)
AVG
-----------------
36.25
BIT_AND
Purpose
Computes the bitwise AND operation of two numerical values.
Syntax
bit_and::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The result data type is DECIMAL(20,0).
Example(s)
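A query of the following form presumably produced the output below (the arguments are taken from the result column header):

```sql
SELECT BIT_AND(9,3);
```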
BIT_AND(9,3)
-------------------
1
BIT_CHECK
Purpose
Checks whether a certain bit of a numerical value is set. The position parameter starts from 0 which means the
lowest bit.
Syntax
bit_check::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The value pos may be between 0 and 63.
Example(s)
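A query consistent with the column aliases and values below (the input value 3 is an assumption that matches the output, since 3 has only bits 0 and 1 set):

```sql
SELECT BIT_CHECK(3,0) B0, BIT_CHECK(3,1) B1,
       BIT_CHECK(3,2) B2, BIT_CHECK(3,3) B3;
```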
B0 B1 B2 B3
----- ----- ----- -----
TRUE TRUE FALSE FALSE
BIT_LENGTH
Purpose
Returns the bit length of a string. If only ASCII characters are used, then this function is equivalent to
CHARACTER_LENGTH * 8.
Syntax
bit_length::=
BIT_LENGTH ( string )
Example(s)
BIT_LENGTH
----------
24
BIT_LENGTH
----------
48
BIT_LROTATE
Purpose
Rotates the bits of a numerical value to the left by the specified number of positions.
Syntax
bit_lrotate::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The second parameter can be between 0 and 63.
• The result data type is DECIMAL(20,0).
Example(s)
SELECT BIT_LROTATE(1024,63);
BIT_LROTATE(1024,63)
--------------------
512
BIT_LSHIFT
Purpose
Shifts the bits of a numerical value to the left by the specified number of positions.
Syntax
bit_lshift::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The second parameter can be between 0 and 63.
• The result data type is DECIMAL(20,0).
Example(s)
SELECT BIT_LSHIFT(1,10);
BIT_LSHIFT(1,10)
----------------
1024
BIT_NOT
Purpose
Computes the bitwise negation of a numerical value.
Syntax
bit_not::=
BIT_NOT ( integer )
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The result data type is DECIMAL(20,0).
Example(s)
BIT_NOT(0) BIT_NOT(18446744073709551615)
--------------------- ---------------------------
18446744073709551615 0
BIT_AND(BIT_NOT(1),5)
---------------------
4
BIT_OR
Purpose
Computes the bitwise OR operation of two numerical values.
Syntax
bit_or::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The result data type is DECIMAL(20,0).
Example(s)
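A query of the following form presumably produced the output below (the arguments are taken from the result column header):

```sql
SELECT BIT_OR(9,3);
```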
BIT_OR(9,3)
-------------------
11
BIT_RROTATE
Purpose
Rotates the bits of a numerical value to the right by the specified number of positions.
Syntax
bit_rrotate::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The second parameter can be between 0 and 63.
• The result data type is DECIMAL(20,0).
Example(s)
SELECT BIT_RROTATE(1024,63);
BIT_RROTATE(1024,63)
--------------------
2048
BIT_RSHIFT
Purpose
Shifts the bits of a numerical value to the right by the specified number of positions.
Syntax
bit_rshift::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The second parameter can be between 0 and 63.
• The result data type is DECIMAL(20,0).
Example(s)
SELECT BIT_RSHIFT(1024,10);
BIT_RSHIFT(1024,10)
-------------------
1
BIT_SET
Purpose
Sets a certain bit of a numerical value. The position parameter starts from 0 which means the lowest bit.
Syntax
bit_set::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The value pos may be between 0 and 63.
• The result data type is DECIMAL(20,0).
Example(s)
SELECT BIT_SET(8,0);
BIT_SET(8,0)
-------------------
9
BIT_TO_NUM
Purpose
Creates a number from a list of single bits; the first argument is interpreted as the most significant bit.
Syntax
bit_to_num::=
BIT_TO_NUM ( digit )
Note(s)
Example(s)
SELECT BIT_TO_NUM(1,1,0,0);
BIT_TO_NUM(1,1,0,0)
-------------------
12
BIT_XOR
Purpose
Computes the bitwise exclusive OR operation of two numerical values. The result in each position is 1 if the two
corresponding bits are different.
Syntax
bit_xor::=
Note(s)
• Bit functions are limited to 64 bits, which means to positive numbers between 0 and 18446744073709551615.
• The result data type is DECIMAL(20,0).
Example(s)
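A query of the following form presumably produced the output below (the arguments are taken from the result column header):

```sql
SELECT BIT_XOR(9,3);
```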
BIT_XOR(9,3)
-------------------
10
CASE
Purpose
With the aid of the CASE function, an IF THEN ELSE logic can be expressed within the SQL language.
Syntax
case::=
CASE { simple_case_expr | searched_case_expr } END

simple_case_expr::=
expr { WHEN comparison_expr THEN result }... [ ELSE expr ]

searched_case_expr::=
{ WHEN condition THEN result }... [ ELSE expr ]
Note(s)
• With the simple_case_expr the expr is compared with the specified alternatives. The THEN part of the
first match defines the result.
• With the searched_case_expr the row is evaluated using all of the conditions until one equates to the
TRUE value. The THEN part of this condition is the result.
• If none of the options apply, the ELSE value is returned. If this was not specified, the NULL value is returned.
Example(s)
NAME GRADE
------- -----------
Fischer VERY GOOD
Schmidt FAIR
NAME CLASS
------ --------
Meier STANDARD
Huber PREMIUM
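Both CASE forms can be sketched as follows (the table and column names are hypothetical):

```sql
-- searched CASE: conditions are evaluated in order
SELECT name,
       CASE WHEN points >= 90 THEN 'VERY GOOD'
            WHEN points >= 50 THEN 'FAIR'
            ELSE 'POOR' END AS grade
FROM students;

-- simple CASE: expr is compared against the WHEN alternatives
SELECT name,
       CASE status WHEN 1 THEN 'PREMIUM' ELSE 'STANDARD' END AS class
FROM customers;
```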
CAST
Purpose
Converts an expression into the specified data type. If this is not possible, then an exception is thrown.
Syntax
cast::=
Note(s)
Example(s)
DATECAST
----------
2006-01-01
CEIL[ING]
Purpose
Returns the smallest whole number that is larger or equal to the given number.
Syntax
ceiling::=
{ CEIL | CEILING } ( number )
Example(s)
CEIL
----
1
CHARACTER_LENGTH
Purpose
Returns the length of a string in characters.
Syntax
character_length::=
CHARACTER_LENGTH ( string )
Example(s)
C_LENGTH
----------
8
CH[A]R
Purpose
Returns the ASCII character whose ordinal number is the given integer.
Syntax
chr::=
{ CHR | CHAR } ( integer )
Note(s)
Example(s)
CHR
---
X
COALESCE
Purpose
Returns the first value from the argument list which is not NULL. If all of the values are NULL, the function returns
NULL.
Syntax
coalesce::=
COALESCE ( expr [, expr]... )
Note(s)
• The COALESCE(expr1,expr2) function is equivalent to the CASE expression CASE WHEN expr1 IS
NOT NULL THEN expr1 ELSE expr2 END
Example(s)
COALES
------
abc
COLOGNE_PHONETIC
Purpose
This function returns a phonetic representation of a string. You can use it to compare words which sound similar
but are spelled differently.
Syntax
cologne_phonetic::=
COLOGNE_PHONETIC ( string )
Note(s)
Example(s)
COLOGNE_PHONETIC('schmitt') COLOGNE_PHONETIC('Schmidt')
--------------------------- ---------------------------
862 862
CONCAT
Purpose
Returns the concatenation of the given strings.
Syntax
concat::=
CONCAT ( string [, string]... )
Note(s)
Example(s)
CONCAT
------
abcdef
CONNECT_BY_ISCYCLE
Purpose
Returns for a CONNECT BY query whether a row causes a cycle. Details can be found in the description of the
SELECT statement in Section 2.2.4, “Query language (DQL)”.
Syntax
connect_by_iscycle::=
CONNECT_BY_ISCYCLE
Example(s)
SELECT CONNECT_BY_ISCYCLE,
SYS_CONNECT_BY_PATH(last_name, '/') "PATH"
FROM employees WHERE last_name = 'Clark'
CONNECT BY NOCYCLE PRIOR employee_id = manager_id
START WITH last_name = 'Clark';
CONNECT_BY_ISCYCLE PATH
------------------ ----------------------------
0 /Clark
1 /Clark/Jackson/Johnson/Clark
CONNECT_BY_ISLEAF
Purpose
Returns for a CONNECT BY query whether a row is a leaf within the tree. Details can be found in the description
of the SELECT statement in Section 2.2.4, “Query language (DQL)”.
Syntax
connect_by_isleaf::=
CONNECT_BY_ISLEAF
Example(s)
CONVERT
Purpose
Converts an expression into the specified data type (similar to CAST, but with a different argument order).
Syntax
convert::=
Note(s)
Example(s)
STRINGCAST
---------------
ABC
DATECAST
----------
2006-01-01
CONVERT_TZ
Purpose
Converts a timestamp from one time zone into another one. Please note that timestamps don't contain any timezone
information. This function therefore just adds the time shift between two specified timezones.
Syntax
convert_tz::=
, options
CONVERT_TZ ( datetime , from_tz , to_tz )
Note(s)
• The list of supported timezones can be found in the system table EXA_TIME_ZONES.
• If the input value has type TIMESTAMP WITH LOCAL TIME ZONE, then this function is only allowed if
the session time zone ( SESSIONTIMEZONE) is identical to the parameter from_tz. However, the result
type is still the TIMESTAMP data type.
• The optional fourth parameter (string) specifies how problematic input data caused by time shifts should
be handled. The following alternatives exist:
Details about the options can be found in section Date/Time data types in Section 2.3, “Data types”. The last
option is a special option to ensure the reversibility of the conversion. An exception is thrown if the input data
is invalid or ambiguous, and if the result timestamp would be ambiguous (which means it couldn't be converted
back without information loss).
When omitting the fourth parameter, the default behavior is defined by the session value TIME_ZONE_BEHA-
VIOR (see ALTER SESSION).
Example(s)
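A call consistent with the output below (the timestamp and the time zone names are assumptions; UTC 12:00 corresponds to 14:00 Berlin summer time):

```sql
SELECT CONVERT_TZ(TIMESTAMP '2012-05-10 12:00:00',
                  'UTC', 'Europe/Berlin') CONVERT_TZ;
```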
CONVERT_TZ
-------------------
2012-05-10 14:00:00
CORR
Purpose
Returns the coefficient of correlation of a set of number pairs (a type of relation measure). This equates to the
following formula:
CORR(expr1, expr2) = COVAR_POP(expr1, expr2) / ( STDDEV_POP(expr1) · STDDEV_POP(expr2) )
Syntax
corr::=
over_clause
CORR ( expr1 , expr2 )
Note(s)
• If either expr1 or expr2 is the value NULL, then the corresponding number pair is not considered for the
computation.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
INDUSTRY CORR
---------- -----------------
Finance 0.966045513268967
IT 0.453263345203583
COS
Purpose
Returns the cosine of number n.
Syntax
cos::=
COS ( n )
Example(s)
SELECT COS(PI()/3);
COS(PI()/3)
-----------------
0.5
COSH
Purpose
Returns the hyperbolic cosine of number n.
Syntax
cosh::=
COSH ( n )
Example(s)
SELECT COSH(1);
COSH(1)
-----------------
1.543080634815244
COT
Purpose
Returns the cotangent of number n.
Syntax
cot::=
COT ( n )
Example(s)
SELECT COT(1);
COT(1)
-----------------
0.642092615934331
COUNT
Purpose
Returns the number of rows, respectively the number of (distinct) non-NULL values of expr.
Syntax
count::=
COUNT ( { * | [DISTINCT | ALL] { expr | ( expr [, expr]... ) } } ) [over_clause]
Note(s)
Example(s)
CNT_ALL
-------------------
10
CNT_DIST
-------------------
8
COVAR_POP
Purpose
Returns the population covariance of a set of number pairs (a type of relation measure). This equates to the following
formula:
COVAR_POP(expr1, expr2) = ( Σ i=1..n (expr1i − mean(expr1)) · (expr2i − mean(expr2)) ) / n
Syntax
covar_pop::=
over_clause
COVAR_POP ( expr1 , expr2 )
Note(s)
• If either expr1 or expr2 is the value NULL, then the corresponding number pair is not considered for the
computation.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
INDUSTRY COVAR_POP
---------- -----------------
Finance 209360
IT 31280
COVAR_SAMP
Purpose
Returns the sample covariance of a set of number pairs (a type of relation measure). This equates to the following
formula:
COVAR_SAMP(expr1, expr2) = ( Σ i=1..n (expr1i − mean(expr1)) · (expr2i − mean(expr2)) ) / (n − 1)
Syntax
covar_samp::=
over_clause
COVAR_SAMP ( expr1 , expr2 )
Note(s)
• If either expr1 or expr2 is the value NULL, then the corresponding number pair is not considered for the
computation.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
INDUSTRY COVAR_SAMP
---------- -----------------
Finance 261700
IT 39100
CURDATE
Purpose
Returns the current date. CURDATE is an alias for CURRENT_DATE.
Syntax
curdate::=
CURDATE ( )
Note(s)
Example(s)
CURDATE
----------
1999-12-31
CURRENT_DATE
Purpose
Returns the current date.
Syntax
current_date::=
CURRENT_DATE
Note(s)
Example(s)
SELECT CURRENT_DATE;
CURRENT_DATE
------------
1999-12-31
CURRENT_SCHEMA
Purpose
Returns the schema currently open. If a schema is not open, the NULL value is the result.
Syntax
current_schema::=
CURRENT_SCHEMA
Example(s)
SELECT CURRENT_SCHEMA;
CURRENT_SCHEMA
--------------
MY_SCHEMA
CURRENT_SESSION
Purpose
Returns the id of the current session. This id is also referenced e.g. in system table EXA_ALL_SESSIONS.
Syntax
current_session::=
CURRENT_SESSION
Example(s)
SELECT CURRENT_SESSION;
CURRENT_SESSION
---------------------
7501910697805018352
CURRENT_STATEMENT
Purpose
Returns the id of the current statement, which is serially numbered within the current session.
Syntax
current_statement::=
CURRENT_STATEMENT
Example(s)
SELECT CURRENT_STATEMENT;
CURRENT_STATEMENT
---------------------
26
CURRENT_TIMESTAMP
Purpose
Returns the current timestamp, interpreted in the current session time zone.
Syntax
current_timestamp::=
CURRENT_TIMESTAMP
Note(s)
• The return value is of data type TIMESTAMP WITH LOCAL TIME ZONE.
• The function NOW is an alias for CURRENT_TIMESTAMP.
• Other functions for the current moment:
• LOCALTIMESTAMP
• SYSTIMESTAMP
Example(s)
SELECT CURRENT_TIMESTAMP;
CURRENT_TIMESTAMP
-------------------
1999-12-31 23:59:59
CURRENT_USER
Purpose
Returns the name of the current user.
Syntax
current_user::=
CURRENT_USER
Example(s)
SELECT CURRENT_USER;
CURRENT_USER
------------
SYS
DATE_TRUNC
Purpose
Truncates a date or timestamp value to the specified unit (format).
Syntax
date_trunc::=
Note(s)
• As format you can use one of the following elements: 'microseconds', 'milliseconds', 'second', 'minute', 'hour',
'day', 'week', 'month', 'quarter', 'year', 'decade', 'century', 'millennium'
• The first day of a week (for format element 'week') is defined via the parameter NLS_FIRST_DAY_OF_WEEK
(see ALTER SESSION and ALTER SYSTEM).
• A similar functionality provides the Oracle compatible function TRUNC[ATE] (datetime).
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
DATE_TRUNC
----------
2006-12-01
DATE_TRUNC
--------------------------
2006-12-31 23:59:00.000000
DAY
Purpose
Returns the day of a date value.
Syntax
day::=
DAY ( date )
Note(s)
Example(s)
DAY
---
20
DAYS_BETWEEN
Purpose
Returns the number of days between two date values.
Syntax
days_between::=
Note(s)
• If a timestamp is entered, only the date contained therein is applied for the computation.
• If the first date value is earlier than the second date value, the result is negative.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
DB1 DB2
---------- ----------
-1 1
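The output above (DB1=-1, DB2=1) could stem from a query of this form; the concrete dates are illustrative:

```sql
SELECT DAYS_BETWEEN(DATE '1999-12-31', DATE '2000-01-01') DB1,
       DAYS_BETWEEN(DATE '2000-01-01', DATE '1999-12-31') DB2;
```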
DBTIMEZONE
Purpose
Returns the database time zone which is set system-wide in EXAoperation and represents the local time zone of
the EXASOL servers.
Syntax
dbtimezone::=
DBTIMEZONE
Note(s)
Example(s)
SELECT DBTIMEZONE;
DBTIMEZONE
-------------
EUROPE/BERLIN
DECODE
Purpose
The DECODE function returns the result value for which the expression expr matches the expression search.
If no match is found, the default value (if specified) or NULL is returned.
Syntax
decode::=
,
, default
DECODE ( expr , search , result )
Note(s)
• DECODE is similar to CASE, but has slightly different functionality (to be compliant with other databases):
• The expression expr can be compared directly with the value NULL (e.g. DECODE(my_column,NULL,0,my_column))
• String comparisons are done "non-padded" (that is why DECODE(my_column,'abc',TRUE,FALSE) on a CHAR(10) column always returns FALSE)
• For readability reasons we recommend using CASE.
Example(s)
DECODE
------
2
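A result of 2 as shown above could be produced by a call like the following, shown next to a roughly equivalent CASE expression; the literal values are illustrative (remember that only DECODE can match NULL directly):

```sql
SELECT DECODE('abc', 'xyz', 1, 'abc', 2, 3) AS with_decode,
       CASE 'abc' WHEN 'xyz' THEN 1 WHEN 'abc' THEN 2 ELSE 3 END AS with_case;
-- both columns return 2
```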
DEGREES
Purpose
Syntax
degrees::=
DEGREES ( n )
Note(s)
Example(s)
SELECT DEGREES(PI());
DEGREES(PI())
-----------------
180
DENSE_RANK
Purpose
Syntax
dense_rank::=
partitionby
DENSE_RANK ( ) OVER ( orderby )
Note(s)
• DENSE_RANK can only be used as an analytical function (in combination with OVER(...), see also Section 2.9.3, “Analytical functions”).
• The OVER clause must contain an ORDER BY part and may not contain a window clause.
• The same value is returned for rows with equal ranking. However, the following values are not skipped (as they are with RANK).
Example(s)
DIV
Purpose
Syntax
div::=
DIV ( m , n )
Example(s)
DIV
---
2
DUMP
Purpose
Returns the byte length and the character set of string, as well as the internal representation of the characters
specified by start position start and length length.
Syntax
dump::=
, length
, start
, format
DUMP ( string )
Note(s)
• The argument format specifies the format of the return value. There are four valid format-values:
• 8: Octal notation
• 10: Decimal notation (Default)
• 16: Hexadecimal notation
• 17: ASCII characters are directly printed, multi-byte-characters are printed in hexadecimal format
• The argument length specifies the maximum number of selected characters, beginning at start position start. If length=0, all possible characters are selected. For negative numbers the absolute value of length is used.
• The argument start specifies the start position of the character selection. If the character length of string is less than the absolute value of start, the function returns NULL. For negative numbers the start position is set to the absolute value of start, counted from the right (default=1).
• If the argument string is NULL, the function returns the character string 'NULL'.
Example(s)
DUMP
-------------------------------------------
Len=6 CharacterSet=ASCII: 49,50,51,97,98,99
DUMP
------------------------------------------------
Len=8 CharacterSet=UTF8: c3,bc,c3,a4,c3,b6,34,35
EDIT_DISTANCE
Purpose
This function returns the distance between two strings, indicating how similar they are.
Syntax
edit_distance::=
EDIT_DISTANCE ( string1 , string2 )
Note(s)
• To check the phonetic equivalence of strings you can use the functions SOUNDEX and COLOGNE_PHONETIC.
• The number of changes is calculated which need to be done to convert one string into the other.
• The result is a number between 0 and the length of the longer string.
Example(s)
EDIT_DISTANCE('schmitt','Schmidt')
----------------------------------
2
EXP
Purpose
Syntax
exp::=
EXP ( n )
Example(s)
SELECT EXP(1);
EXP(1)
-----------------
2.718281828459045
EXTRACT
Purpose
Syntax
extract::=
YEAR
MONTH
DAY datetime
EXTRACT ( FROM )
HOUR interval
MINUTE
SECOND
Note(s)
• When extracting seconds, the milliseconds contained in the timestamp or interval are also extracted.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
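A possible example (the input values are illustrative):

```sql
SELECT EXTRACT(MONTH FROM DATE '2010-10-20') AS m,
       EXTRACT(SECOND FROM TIMESTAMP '2010-10-20 11:59:40.123') AS s;
-- m = 10; s = 40.123 (the extracted seconds include the milliseconds)
```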
FIRST_VALUE
Purpose
Syntax
first_value::=
over_clause
FIRST_VALUE ( expr )
Note(s)
• Because the rows in EXASOL are distributed across the cluster, FIRST_VALUE is non-deterministic as an aggregate function. Accordingly, FIRST_VALUE serves primarily as a helper function in the event that all elements within a group are identical.
• The same applies when it is used as an analytical function (see also Section 2.9.3, “Analytical functions”) if the OVER clause does not include an ORDER BY part.
Example(s)
SELECT name,
hire_date,
FIRST_VALUE(hire_date) OVER (ORDER BY hire_date) FIRST_VAL
FROM staff;
FLOOR
Purpose
Syntax
floor::=
FLOOR ( n )
Example(s)
FLOOR
-----
4
FROM_POSIX_TIME
Purpose
Posix time (also known as Unix time) is a system for describing points in time, defined as the number of seconds
elapsed since midnight of January 1, 1970 (UTC). By using this function you can convert the Posix Time (that
means a numerical value) to a timestamp.
Syntax
from_posix_time::=
FROM_POSIX_TIME ( number )
Note(s)
Example(s)
FPT1 FPT2
-------------------------- --------------------------
1970-01-01 00:00:01.000000 2009-02-13 23:31:30.000000
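The results shown above (FPT1, FPT2) correspond to calls like the following, converting 1 second and 1234567890 seconds after the epoch:

```sql
SELECT FROM_POSIX_TIME(1) FPT1,
       FROM_POSIX_TIME(1234567890) FPT2;
```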
GREATEST
Purpose
Syntax
greatest::=
GREATEST ( expr )
Note(s)
Example(s)
GREATEST
--------
5
GROUP_CONCAT
Purpose
Syntax
group_concat::=
DISTINCT
GROUP_CONCAT ( expr
ASC FIRST
NULLS
DESC LAST
ORDER BY expr
SEPARATOR sep_string
)
Note(s)
• If you specify DISTINCT, duplicate strings are eliminated if they would appear consecutively in the concatenation.
• When using ORDER BY, the rows within a group are sorted before the aggregation (concatenation). By default, the options ASC NULLS LAST are used (ascending, NULL values at the end).
• If the ORDER BY option is omitted and the DISTINCT option is specified, then the rows of a group are implicitly sorted by expr to ensure that the result is deterministic.
• By using SEPARATOR sep_string, you can define the delimiter between the concatenated elements. By default the delimiter ',' is used. By using the empty string ('') you can also omit the delimiter completely.
• The resulting data type is the maximal string type (VARCHAR(2000000)).
Example(s)
DEPARTMENT GROUP_CONCAT
---------- ------------
sales carsten,joe,thomas
marketing alex,monica
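A query of the following shape could produce the output above; the table staff with columns department and name is an assumption:

```sql
SELECT department,
       GROUP_CONCAT(name ORDER BY name) AS GROUP_CONCAT
FROM staff
GROUP BY department;
```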
GROUPING[_ID]
Purpose
By using this function you can distinguish between regular result rows and superaggregate rows, which are
created in case of GROUPING SETS, CUBE or ROLLUP clauses.
Syntax
grouping::=
,
GROUPING
( expr )
GROUPING_ID
Note(s)
Example(s)
REVENUE Y M SUPERAGGREGATE
------------------- -------- ---------- --------------
1725.90 2010 December
1725.90 2010 yearly
735.88 2011 April
752.46 2011 February
842.32 2011 March
931.18 2011 January
3261.84 2011 yearly
4987.74 total
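The superaggregate rows above could be labeled with a query of the following form; the table sales and its columns y, m and volume are assumptions:

```sql
SELECT SUM(volume) AS revenue,
       y, m,
       DECODE(GROUPING(y, m),
              1, 'yearly',   -- month rolled up
              3, 'total',    -- year and month rolled up
              '') AS superaggregate
FROM sales
GROUP BY ROLLUP(y, m)
ORDER BY y, superaggregate;
```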
HASH_MD5
Purpose
Syntax
hash_md5::=
HASH_MD5 ( expr )
Note(s)
• Return values have data type CHAR(32) and contain hex characters.
• The data types of the input parameters are significant. That is why HASH_MD5(123) is different from HASH_MD5('123').
• Multiple input expressions are concatenated (in their internal byte representation) before the hash value is computed. Please note that in general, HASH_MD5(c1,c2) is not the same as HASH_MD5(c1||c2).
• The function returns NULL if all input expressions are NULL.
Example(s)
SELECT HASH_MD5('abc');
HASH_MD5('abc')
--------------------------------
900150983cd24fb0d6963f7d28e17f72
HASH_SHA[1]
Purpose
Syntax
hash_sha1::=
,
HASH_SHA1
( expr )
HASH_SHA
Note(s)
• Return values have data type CHAR(40) and contain hex characters.
• The data types of the input parameters are significant. That is why HASH_SHA1(123) is different from HASH_SHA1('123').
• Multiple input expressions are concatenated (in their internal byte representation) before the hash value is computed. Please note that in general, HASH_SHA1(c1,c2) is not the same as HASH_SHA1(c1||c2).
• The function returns NULL if all input expressions are NULL.
• HASH_SHA() is an alias for HASH_SHA1().
Example(s)
SELECT HASH_SHA1('abc');
HASH_SHA1('abc')
----------------------------------------
a9993e364706816aba3e25717850c26c9cd0d89d
HASH_TIGER
Purpose
Syntax
hash_tiger::=
HASH_TIGER ( expr )
Note(s)
• Return values have data type CHAR(48) and contain hex characters.
• The data types of the input parameters are significant. That is why HASH_TIGER(123) is different from HASH_TIGER('123').
• Multiple input expressions are concatenated (in their internal byte representation) before the hash value is computed. Please note that in general, HASH_TIGER(c1,c2) is not the same as HASH_TIGER(c1||c2).
• The function returns NULL if all input expressions are NULL.
Example(s)
SELECT HASH_TIGER('abc');
HASH_TIGER('abc')
------------------------------------------------
2aab1484e8c158f2bfb8c5ff41b57a525129131c957b5f93
HOUR
Purpose
Syntax
hour::=
HOUR ( datetime )
Note(s)
Example(s)
HOU
---
11
HOURS_BETWEEN
Purpose
Returns the number of hours between timestamp timestamp1 and timestamp timestamp2.
Syntax
hours_between::=
HOURS_BETWEEN ( timestamp1 , timestamp2 )
Note(s)
• If timestamp timestamp1 is earlier than timestamp timestamp2, then the result is negative.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated internally within UTC.
Example(s)
HB
---------------
0.9819166666667
INSERT
Purpose
Replaces the substring of string, with length length beginning at position, with string new_string.
Syntax
insert::=
INSERT ( string , position , length , new_string )
Note(s)
• The first character of string has position 1. If position is 0 or outside the string, the string is not changed. If it is negative, the function counts backwards from the end.
• If length=0, then new_string is just inserted and nothing is replaced.
• If position+length>length(string) or if length<0, then the string is replaced from position through to its end.
• If one of the parameters is NULL, then NULL is returned.
Example(s)
SELECT INSERT('abc',2,2,'xxx'),
INSERT('abcdef',3,2,'CD');
INSERT('abc',2,2,'xxx') INSERT('abcdef',3,2,'CD')
----------------------- -------------------------
axxx abCDef
INSTR
Purpose
Returns the position in string at which search_string appears. If this is not contained, the value 0 is returned.
Syntax
instr::=
, occurrence
, position
INSTR ( string , search_string )
Note(s)
• The optional parameter position defines from which position the search shall begin (the first character has position 1). If the value is negative, EXASOL counts and searches backwards from the end (e.g. INSTR(string,'abc',-3) searches backwards starting from the third-last character).
• The optional positive number occurrence defines which occurrence shall be searched for.
• INSTR(string,search_string) is similar to INSTR(string,search_string,1,1).
• The functions POSITION and LOCATE are similar.
Example(s)
INSTR1 INSTR2
---------- ----------
3 19
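The two results above match calls like the following; the input strings are illustrative. The second call searches backwards from the end for the second occurrence:

```sql
SELECT INSTR('abcabcabc', 'cab') INSTR1,
       INSTR('user1,user2,user3,user4,user5', 'user', -1, 2) INSTR2;
-- INSTR1 = 3, INSTR2 = 19
```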
IPROC
Purpose
Returns the local node number within the cluster. This lets you see which rows are stored on which nodes.
Syntax
iproc::=
IPROC ( )
Note(s)
Example(s)
C1 IPROC
-- -----
1 0
2 1
3 2
4 3
5 0
6 1
IPROC CNT
----- ---
0 2
1 2
2 1
3 1
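The outputs above suggest queries like the following against a table t on a four-node cluster; all names are assumptions:

```sql
SELECT c1, IPROC() FROM t ORDER BY c1;  -- which node stores each row
SELECT IPROC(), COUNT(*) CNT
FROM t GROUP BY IPROC() ORDER BY 1;     -- number of rows per node
```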
IS_*
Purpose
Returns TRUE if string can be converted to a certain data type. If e.g. IS_NUMBER returns TRUE, you can
convert the string via TO_NUMBER.
Syntax
is_datatype::=
IS_NUMBER
, format
IS_DATE ( string )
IS_TIMESTAMP
IS_BOOLEAN
IS_DSINTERVAL ( string )
IS_YMINTERVAL
Note(s)
Example(s)
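Illustrative calls; no output is asserted here, since the accepted formats depend on the session's format settings:

```sql
SELECT IS_NUMBER('+12.34') N1,
       IS_DATE('2010-12-31', 'YYYY-MM-DD') D1,
       IS_BOOLEAN('true') B1,
       IS_DSINTERVAL('3 10:59:59.123') DS1;
```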
LAG
Purpose
By using LAG in an analytical function, you are able to access previous rows within a partition. The expression
expr is evaluated on that row which is located exactly offset rows prior to the current row.
Syntax
lag::=
, default
, offset
LAG ( expr )
partitionby
OVER ( orderby )
Note(s)
• LAG can only be used as an analytical function (in combination with OVER(...), see also Section 2.9.3,
“Analytical functions”).
• The OVER clause must contain an ORDER BY part and may not contain a window clause.
• If the ORDER BY part doesn't define a unique sort order, the result is non-deterministic.
• If the access is beyond the scope of the current partition, LAG returns the value of parameter default or
NULL if default was not specified.
• If you omit the parameter offset, value 1 is used. Negative values are not allowed for offset. Rows with
offset NULL are handled as if they were beyond the scope of the current partition.
• To access following rows you can use the function LEAD.
Example(s)
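A sketch of a typical use, accessing the previous row within each partition; the table maintenance and its columns are assumptions:

```sql
SELECT airplane,
       maintenance_date,
       LAG(maintenance_date, 1) OVER (PARTITION BY airplane
                                      ORDER BY maintenance_date) AS last_maintenance
FROM maintenance;
```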
LAST_VALUE
Purpose
Syntax
last_value::=
over_clause
LAST_VALUE ( expr )
Note(s)
• Because the rows in EXASOL are distributed across the cluster, LAST_VALUE is non-deterministic as an aggregate function. Accordingly, LAST_VALUE serves primarily as a helper function in the event that all elements within a group are identical.
• The same applies when it is used as an analytical function (see also Section 2.9.3, “Analytical functions”) if the OVER clause does not include an ORDER BY part.
Example(s)
SELECT name,
hire_date,
LAST_VALUE(hire_date) OVER (ORDER BY hire_date
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) LAST_VAL
FROM staff;
LCASE
Purpose
Syntax
lcase::=
LCASE ( string )
Note(s)
Example(s)
LCASE
------
abcdef
LEAD
Purpose
By using LEAD in an analytical function, you are able to access following rows within a partition. The expression
expr is evaluated on that row which is located exactly offset rows beyond the current row.
Syntax
lead::=
, default
, offset
LEAD ( expr )
partitionby
OVER ( orderby )
Note(s)
• LEAD can only be used as an analytical function (in combination with OVER(...), see also Section 2.9.3,
“Analytical functions”).
• The OVER clause must contain an ORDER BY part and may not contain a window clause.
• If the ORDER BY part doesn't define a unique sort order, the result is non-deterministic.
• If the access is beyond the scope of the current partition, LEAD returns the value of parameter default or
NULL if default was not specified.
• If you omit the parameter offset, value 1 is used. Negative values are not allowed for offset. Rows with
offset NULL are handled as if they were beyond the scope of the current partition.
• To access previous rows you can use the function LAG.
Example(s)
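A sketch of a typical use, accessing the following row within each partition; the table maintenance and its columns are assumptions:

```sql
SELECT airplane,
       maintenance_date,
       LEAD(maintenance_date, 1) OVER (PARTITION BY airplane
                                       ORDER BY maintenance_date) AS next_maintenance
FROM maintenance;
```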
LEAST
Purpose
Syntax
least::=
LEAST ( expr )
Note(s)
Example(s)
LEAST
-----
1
LEFT
Purpose
Syntax
left::=
Note(s)
Example(s)
LEFT_SUBSTR
-----------
abc
LENGTH
Purpose
Syntax
length::=
LENGTH ( string )
Example(s)
LENGTH
----------
3
LEVEL
Purpose
Returns for CONNECT BY queries the level of a node within the tree. Details can be found in the description of
the SELECT statement in Section 2.2.4, “Query language (DQL)”.
Syntax
level::=
LEVEL
Example(s)
SELECT last_name,
LEVEL,
SYS_CONNECT_BY_PATH(last_name, '/') "PATH"
FROM employees
CONNECT BY PRIOR employee_id = manager_id
START WITH last_name = 'Clark';
LN
Purpose
Returns the natural logarithm of number n. The function LN(n) is equivalent to LOG(EXP(1),n).
Syntax
ln::=
LN ( n )
Note(s)
Example(s)
LN
---------------
4.6051701859881
LOCALTIMESTAMP
Purpose
Returns the current timestamp, interpreted in the current session time zone.
Syntax
localtimestamp::=
LOCALTIMESTAMP
Note(s)
Example(s)
SELECT LOCALTIMESTAMP;
LOCALTIMESTAMP
-----------------------
2000-12-31 23:59:59.000
LOCATE
Purpose
Returns the position in string at which search_string appears. If this is not contained, the value 0 is returned.
Syntax
locate::=
, position
LOCATE ( search_string , string )
Note(s)
• The optional parameter position defines from which position the search shall begin (starting with 1). If the value is negative, EXASOL counts and searches backwards from the end (e.g. LOCATE('abc',string,-3) searches backwards starting from the third-last character).
• LOCATE(search_string,string) is similar to LOCATE(search_string,string,1).
• The functions POSITION and INSTR are similar.
Example(s)
LOCATE1 LOCATE2
---------- ----------
3 25
LOG
Purpose
Syntax
log::=
LOG ( base , n )
Note(s)
Example(s)
SELECT LOG(2,1024);
LOG(2,1024)
-----------------
10
LOG10
Purpose
Syntax
log10::=
LOG10 ( n )
Note(s)
Example(s)
LOG10
-----------------
4
LOG2
Purpose
Syntax
log2::=
LOG2 ( n )
Note(s)
Example(s)
LOG2
-----------------
10
LOWER
Purpose
Syntax
lower::=
LOWER ( string )
Note(s)
Example(s)
SELECT LOWER('AbCdEf');
LOWER('AbCdEf')
---------------
abcdef
LPAD
Purpose
Returns a string of length n, which is string, filled from the left with expression padding.
Syntax
lpad::=
, padding
LPAD ( string , n )
Note(s)
Example(s)
SELECT LPAD('abc',5,'X');
LPAD('abc',5,'X')
-----------------
XXabc
LTRIM
Purpose
LTRIM deletes all of the characters specified in the expression trim_chars from the left border of string.
Syntax
ltrim::=
, trim_chars
LTRIM ( string )
Note(s)
Example(s)
MAX
Purpose
Syntax
max::=
DISTINCT
ALL over_clause
MAX ( expr )
Note(s)
Example(s)
MAX
-------------------
57
MEDIAN
Purpose
MEDIAN is an inverse distribution function. In contrast to the average function (see AVG) the median function
returns the middle value or an interpolated value which would be the middle value once the elements are sorted
(NULL values are ignored).
Syntax
median::=
Note(s)
Example(s)
COUNT
-------------------
50
100
200
900
MEDIAN
-------------------
150
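For the four values shown above, the median interpolates between the two middle values; a query of the following form would return 150 (table and column names are assumptions):

```sql
SELECT MEDIAN(cnt) AS MEDIAN FROM counts;
-- for cnt values 50, 100, 200, 900: (100 + 200) / 2 = 150
```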
MID
Purpose
Returns a substring of length length from position position out of the string string.
Syntax
mid::=
, length
MID ( string , position )
Note(s)
• If length is not specified, all of the characters to the end of the string are used.
• The first character of a string has position 1. If position is negative, counting begins at the end of the string.
• See also functions RIGHT and LEFT.
• MID is an alias for SUBSTR[ING].
Example(s)
MID
---
bcd
MIN
Purpose
Syntax
min::=
DISTINCT
ALL over_clause
MIN ( expr )
Note(s)
Example(s)
MIN
-------------------
25
MINUTE
Purpose
Syntax
minute::=
MINUTE ( datetime )
Note(s)
Example(s)
MIN
---
59
MINUTES_BETWEEN
Purpose
Returns the number of minutes between two timestamps timestamp1 and timestamp2.
Syntax
minutes_between::=
MINUTES_BETWEEN ( timestamp1 , timestamp2 )
Note(s)
• If timestamp timestamp1 is earlier than timestamp timestamp2, then the result is negative.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated internally within UTC.
Example(s)
MINUTES
---------------
0.9666666666667
MOD
Purpose
Syntax
mod::=
MOD ( m , n )
Example(s)
MODULO
------
3
MONTH
Purpose
Syntax
month::=
MONTH ( date )
Note(s)
Example(s)
MON
---
10
MONTHS_BETWEEN
Purpose
Syntax
months_between::=
MONTHS_BETWEEN ( datetime1 , datetime2 )
Note(s)
• If a timestamp is entered, only its date part is used for the computation.
• If the days are identical or both are the last day of a month, the result is an integer.
• If the first date value is earlier than the second date value, the result is negative.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
MB1 MB2
----------------- -----------------
0.548387096774194 7
NOW
Purpose
Returns the current timestamp, interpreted in the current session time zone.
Syntax
now::=
NOW ( )
Note(s)
Please note that the result data type will be changed from TIMESTAMP to
TIMESTAMP WITH LOCAL TIME ZONE in the next major version. To avoid any
impact on existing processes within a minor version, we have deferred that change.
The impact will be rather small, since the values will stay the same (the current
timestamp interpreted in the session time zone); only the data type will differ.
Example(s)
NOW
-------------------
1999-12-31 23:59:59
NPROC
Purpose
Syntax
nproc::=
NPROC ( )
Note(s)
Example(s)
NPROC
-----
4
NULLIF
Purpose
Returns the value NULL if two expressions are identical. Otherwise, the first expression is returned.
Syntax
nullif::=
NULLIF ( expr1 , expr2 )
Note(s)
• The NULLIF function is equivalent to the CASE expression CASE WHEN expr1=expr2 THEN NULL
ELSE expr1 END
Example(s)
NULLIF1 NULLIF2
------- -------
1
NULLIFZERO
Purpose
Returns the value NULL if number has value 0. Otherwise, number is returned.
Syntax
nullifzero::=
NULLIFZERO ( number )
Note(s)
• The NULLIFZERO function is equivalent to the CASE expression CASE WHEN number=0 THEN NULL
ELSE number END.
• See also ZEROIFNULL.
Example(s)
NIZ1 NIZ2
---- ----
1
NUMTODSINTERVAL
Purpose
Syntax
numtodsinterval::=
NUMTODSINTERVAL ( n , ’ interval_unit ’ )
Note(s)
Example(s)
NUMTODSINTERVAL
-----------------------------
+000000000 03:12:00.000000000
NUMTOYMINTERVAL
Purpose
Syntax
numtoyminterval::=
NUMTOYMINTERVAL ( n , ’ interval_unit ’ )
Note(s)
Example(s)
NUMTOYMINTERVAL
---------------
+000000003-06
NVL
Purpose
Syntax
nvl::=
Note(s)
Example(s)
NVL_1 NVL_2
----- -----
abc xyz
NVL2
Purpose
Syntax
nvl2::=
Note(s)
Example(s)
NVL_1 NVL_2
----- -----
3 2
OCTET_LENGTH
Purpose
Returns the octet length of a string. If only ASCII characters are used, then this function is equivalent to
CHARACTER_LENGTH and LENGTH.
Syntax
octet_length::=
OCTET_LENGTH ( string )
Example(s)
OCT_LENGTH
----------
4
OCT_LENGTH
----------
6
PERCENTILE_CONT
Purpose
PERCENTILE_CONT is an inverse distribution function and expects as input parameter a percentile value and a
sorting specification which defines the rank of each element within a group. The function returns the percentile
of this sort order (e.g. in case of percentile 0.7 and 100 values, the 70th value is returned).
If the percentile cannot be assigned exactly to an element, then the linear interpolation between the two nearest
values is returned (e.g. in case of percentile 0.71 and 10 values, the interpolation between the 7th and 8th value).
Syntax
percentile_cont::=
DESC
ASC
PERCENTILE_CONT ( percentile ) WITHIN GROUP ( ORDER BY expr )
Note(s)
Example(s)
SELECT region,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY count),
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY count)
FROM sales
GROUP BY region;
PERCENTILE_DISC
Purpose
PERCENTILE_DISC is an inverse distribution function and returns the value from the group set which has the
smallest cumulative distribution value (corresponding to the given sort order) which is larger than or equal to the
specified percentile value. NULL values are ignored for the calculation.
Syntax
percentile_disc::=
DESC
ASC
PERCENTILE_DISC ( percentile ) WITHIN GROUP ( ORDER BY expr )
Note(s)
Example(s)
SELECT region,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY count),
PERCENTILE_DISC(0.7) WITHIN GROUP (ORDER BY count)
FROM sales
GROUP BY region;
PI
Purpose
Syntax
pi::=
PI ( )
Example(s)
SELECT PI();
PI
-----------------
3.141592653589793
POSITION
Purpose
Returns the position in the string, string, at which the string, search_string, first appears. If this is not
contained, the value 0 is returned.
Syntax
position::=
POSITION ( search_string IN string )
Note(s)
Example(s)
POS
----------
3
POSIX_TIME
Purpose
Posix time (also known as Unix time) is a system for describing points in time, defined as the number of seconds
elapsed since midnight of January 1, 1970 (UTC). By using this function you can convert a datetime value to a
numerical value.
Syntax
posix_time::=
datetime
POSIX_TIME ( )
Note(s)
Example(s)
PT1 PT2
----------------- -----------------
1.000 1234567890.000
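The results above (PT1, PT2) correspond to calls like the following, assuming a UTC session time zone:

```sql
SELECT POSIX_TIME(TIMESTAMP '1970-01-01 00:00:01') PT1,
       POSIX_TIME(TIMESTAMP '2009-02-13 23:31:30') PT2;
```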
POWER
Purpose
Syntax
power::=
Example(s)
POWER
-----------------
1024
RADIANS
Purpose
Syntax
radians::=
RADIANS ( n )
Note(s)
Example(s)
RADIANS
-----------------
3.141592653589793
RAND[OM]
Purpose
Syntax
random::=
Note(s)
Example(s)
RANDOM_1 RANDOM_2
----------------- -----------------
0.379277567626116 12.7548096816858
RANK
Purpose
Syntax
rank::=
partitionby
RANK ( ) OVER ( orderby )
Note(s)
• RANK can only be used as an analytical function (in combination with OVER(...), see also Section 2.9.3,
“Analytical functions”).
• The OVER clause must contain an ORDER BY part and may not contain a window clause.
• The same value is returned for rows with equal ranking. The subsequent rank values are then skipped (in contrast to DENSE_RANK).
Example(s)
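A sketch of a typical use; the table staff and its columns are assumptions:

```sql
SELECT name, salary,
       RANK() OVER (ORDER BY salary DESC) AS rnk
FROM staff;
```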
RATIO_TO_REPORT
Purpose
Syntax
ratio_to_report::=
partitionby
RATIO_TO_REPORT ( expr ) OVER ( )
Note(s)
• The OVER clause may include neither an ORDER BY part nor a window clause (see also Section 2.9.3, “Analytical functions”).
Example(s)
REGEXP_INSTR
Purpose
Searches the regular expression pattern in string. If this is not contained, the value 0 is returned, otherwise
the corresponding position of the match (see notes for details).
Syntax
regexp_instr::=
, return_opt
, occurrence
, position
REGEXP_INSTR ( string , pattern )
Note(s)
• Details and examples for regular expressions can be found in Section 2.1.3, “Regular expressions”.
• The optional parameter position defines from which position the search shall begin (starting with 1).
• The optional positive number occurrence defines which occurrence shall be searched for. Please note that
the search of the second occurrence begins at the first character after the first occurrence.
• The optional parameter return_opt defines the result of the function in case of a match:
0 (default) Function returns the beginning position of the match (counting starts from 1)
1 Function returns the end position of the match (character following the occurrence, counting
starts from 1)
• REGEXP_INSTR(string,pattern) is similar to REGEXP_INSTR(string,pattern,1,1).
• See also functions INSTR, REGEXP_REPLACE and REGEXP_SUBSTR and the predicate [NOT] REGEXP_LIKE.
Example(s)
REGEXP_INSTR1 REGEXP_INSTR2
-------------- --------------
8 31
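A usage sketch; the input string and pattern are illustrative:

```sql
SELECT REGEXP_INSTR('From: my_mail@yahoo.com',
                    '[a-z0-9._%+-]+@[a-z0-9.-]+') AS match_start;
-- 7, the position at which 'my_mail@yahoo.com' begins
```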
REGEXP_REPLACE
Purpose
Syntax
regexp_replace::=
, occurrence
, position
, replace_string
REGEXP_REPLACE ( string , pattern )
Note(s)
• Details and examples for regular expressions can be found in Section 2.1.3, “Regular expressions”.
• If pattern is NULL, string is returned.
• If replace_string is omitted or NULL, the matches of pattern are deleted in the result.
• In replace_string you can use captures via \1, \2, ..., \9 or \g<name> which are defined by pattern.
• The optional parameter position defines from which position the search shall begin (starting with 1).
• The optional positive number occurrence defines which occurrence shall be searched for. Please note that
occurrences do not overlap. So the search of the second occurrence begins at the first character after the first
occurrence. In case of 0 all occurrences are replaced (default). In case of a positive integer n the n-th occurrence
will be replaced.
• See also functions REPLACE, REGEXP_INSTR and REGEXP_SUBSTR and the predicate [NOT] REGEXP_LIKE.
Example(s)
SELECT REGEXP_REPLACE(
'From: my_mail@yahoo.com',
'(?i)^From: ([a-z0-9._%+-]+)@([a-z0-9.-]+\.[a-z]{2,4}$)',
'Name: \1 - Domain: \2') REGEXP_REPLACE;
REGEXP_REPLACE
---------------------------------
Name: my_mail - Domain: yahoo.com
REGEXP_SUBSTR
Purpose
Syntax
regexp_substring::=
, occurrence
, position
REGEXP_SUBSTR ( string , pattern )
Note(s)
• Details and examples for regular expressions can be found in Section 2.1.3, “Regular expressions”.
• Function REGEXP_SUBSTR is similar to function REGEXP_INSTR, but it returns the whole matching
substring instead of returning the position of the match.
• The parameter pattern defines a regular expression to be searched for. If no match is found, NULL is returned.
Otherwise the corresponding substring is returned.
• The optional parameter position defines from which position the search shall begin (starting with 1).
• The optional positive number occurrence defines which occurrence shall be searched for. Please note that
the search of the second occurrence begins at the first character after the first occurrence.
• REGEXP_SUBSTR(string,pattern) is similar to REGEXP_SUBSTR(string,pattern,1,1).
• See also functions SUBSTR[ING], REGEXP_INSTR and REGEXP_REPLACE and the predicate [NOT] REGEXP_LIKE.
Example(s)
EMAIL
-----------------
my_mail@yahoo.com
REGR_*
Purpose
With the help of the linear regression functions you can determine a least-squares regression line.
Syntax
regr_functions::=
REGR_AVGX
REGR_AVGY
REGR_COUNT
REGR_INTERCEPT
over_clause
REGR_R2 ( expr1 , expr2 )
REGR_SLOPE
REGR_SXX
REGR_SXY
REGR_SYY
Note(s)
• If either expr1 or expr2 is NULL, then the corresponding number pair is not considered for the computation.
• Description for the regression functions:
• In the following example, two regression lines are determined. The crosses correspond to entries in the table staff (red=Finance, blue=IT), and the two lines correspond to the determined regression lines. In the example you can see that the line for the finance sector shows a better fit (see the REGR_R2 value), meaning that the salary is more strongly dependent on age.
Example(s)
SELECT industry,
REGR_SLOPE(salary,age) AS REGR_SLOPE,
REGR_INTERCEPT(salary,age) AS REGR_INTERCEPT,
REGR_R2(salary,age) AS REGR_R2
FROM staff GROUP BY industry;
REPEAT
Purpose
Syntax
repeat::=
REPEAT ( string , n )
Note(s)
Example(s)
SELECT REPEAT('abc',3);
REPEAT('abc',3)
---------------
abcabcabc
REPLACE
Purpose
Returns the string that results when all occurrences of search_string within string are replaced by
replace_string.
Syntax
replace::=
, replace_string
REPLACE ( string , search_string )
Note(s)
• If replace_string is omitted or if it is NULL, all occurrences of search_string are deleted from the
result.
• If search_string is NULL, string is returned.
• If the input parameters are not strings, they will be automatically converted to strings.
• The return type is always a string, even if all of the parameters possess another type.
Example(s)
REPLACE_1
---------------------
Orange juice is great
REPLACE_2
-------------
Text
REVERSE
Purpose
Syntax
reverse::=
REVERSE ( string )
Note(s)
Example(s)
REVERSE
-------
edcba
RIGHT
Purpose
Syntax
right::=
Note(s)
Example(s)
RIGHT_SUBSTR
------------
def
ROUND (datetime)
Purpose
Syntax
round (datetime)::=
, format
ROUND ( date )
Note(s)
Example(s)
ROUND
----------
2007-01-01
ROUND
-----------------------
2006-12-31 12:35:00.000
ROUND (number)
Purpose
Rounds number n to integer digits behind the decimal point (round to nearest; in case of a tie, away from zero).
Syntax
round (number)::=
, integer
ROUND ( n )
Note(s)
Example(s)
ROUND
-------
123.46
ROW_NUMBER
Purpose
Syntax
row_number::=
partitionby
ROW_NUMBER ( ) OVER ( orderby )
Note(s)
• ROW_NUMBER can only be used as an analytical function (in combination with OVER(...), see also Section 2.9.3, “Analytical functions”).
• The OVER clause must contain an ORDER BY part and may not contain a window clause.
• The value is non-deterministic with rows of equal ranking.
Example(s)
NAME ROW_NUMBER
---------- -------------------
Huber 3
Meier 6
Müller 2
Schmidt 1
Schultze 4
Schulze 5
ROWID
Purpose
Every row of a base table in the database has a unique address, the so-called ROWID. Read access to this address
is possible via the ROWID pseudo column (data type DECIMAL(36,0)).
Syntax
rowid::=
schema .
table .
ROWID
Note(s)
The ROWID pseudo column can be used in various SQL constructs.
The ROWIDs of a table are managed by the DBMS. This ensures that the ROWIDs within a table are distinct; in
contrast, it is quite acceptable for ROWIDs of different tables to be the same. With DML statements such as
INSERT, UPDATE, DELETE, TRUNCATE or MERGE, all the ROWIDs of the affected tables are invalidated
and reassigned by the DBMS. In contrast, structural table changes such as adding a column leave the ROWIDs
unchanged.
The ROWID pseudo column is only valid for base tables, not for views.
An example of using ROWIDs would be the targeted deletion of specific rows in a table, e.g. in order to restore
the UNIQUE attribute.
Example(s)
ROWID I
------------------------------------- -------------------
318815196395658560306020907325849600 1
318815196395658560306020907325849601 1
318815196395658560306302382302560256 2
318815196395658560306302382302560257 3
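The restoration of the UNIQUE attribute mentioned above can be sketched as follows (the table name t and column i are assumptions): for every value of i, all rows except the one with the smallest ROWID are deleted.

```sql
DELETE FROM t
WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM t GROUP BY i);
```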
RPAD
Purpose
Returns a string of length n, which is string filled from the right with the expression padding.
Syntax
rpad::=
RPAD ( string, n [, padding] )
Note(s)
Example(s)
SELECT RPAD('abc',5,'X');
RPAD('abc',5,'X')
-----------------
abcXX
RTRIM
Purpose
RTRIM deletes all of the characters specified in the expression trim_chars from the right border of string.
Syntax
rtrim::=
RTRIM ( string [, trim_chars] )
Note(s)
Example(s)
SELECT RTRIM('abcdef','afe');
RTRIM('abcdef','afe')
---------------------
abcd
SECOND
Purpose
Syntax
second::=
SECOND ( datetime [, precision] )
Note(s)
• The optional second parameter defines the number of digits behind the decimal point.
• This function can also be applied to strings, in contrast to the function EXTRACT.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
SECOND
------
40.12
SECONDS_BETWEEN
Purpose
Syntax
seconds_between::=
Note(s)
• If timestamp timestamp1 is earlier than timestamp timestamp2, then the result is negative.
• Additionally, the result contains the difference in milliseconds.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated internally within UTC.
Example(s)
SB
-----------------
62.345
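The result above could stem from a statement like the following (a sketch; the timestamp literals are assumptions, since the original query is not shown):

```sql
SELECT SECONDS_BETWEEN(TIMESTAMP '2000-01-01 12:01:02.345',
                       TIMESTAMP '2000-01-01 12:00:00') SB;
```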
SESSIONTIMEZONE
Purpose
Returns the session time zone which was set via ALTER SESSION.
Syntax
sessiontimezone::=
SESSIONTIMEZONE
Note(s)
Example(s)
SELECT SESSIONTIMEZONE;
SESSIONTIMEZONE
---------------
EUROPE/BERLIN
SIGN
Purpose
Syntax
sign::=
SIGN ( n )
Example(s)
SELECT SIGN(-123);
SIGN(-123)
----------
-1
SIN
Purpose
Syntax
sin::=
SIN ( n )
Example(s)
SELECT SIN(PI()/6);
SIN(PI()/6)
-----------------
0.5
SINH
Purpose
Syntax
sinh::=
SINH ( n )
Example(s)
SINH
-----------------
0
SOUNDEX
Purpose
SOUNDEX returns a phonetic representation of a string. You can use SOUNDEX to compare words which sound
similar, but are spelled differently.
Syntax
soundex::=
SOUNDEX ( string )
Note(s)
• SOUNDEX is computed using the algorithm described in: Donald Knuth, The Art of
Computer Programming, Vol. 3.
• The result is always a string with 4 characters (1 letter and 3 digits).
• This function is similar to COLOGNE_PHONETIC which is more appropriate for German words.
Example(s)
SOUNDEX('smythe') SOUNDEX('Smith')
----------------- ----------------
S530 S530
SPACE
Purpose
Syntax
space::=
SPACE ( integer )
Note(s)
Example(s)
MY_STRING
---------
x x
SQRT
Purpose
Syntax
sqrt::=
SQRT ( n )
Note(s)
Example(s)
SELECT SQRT(2);
SQRT(2)
-----------------
1.414213562373095
ST_*
Purpose
Several functions for geospatial objects. Please refer to Section 2.4, “Geospatial data” for details about geospatial
objects and their functions.
Please note that UDF scripts are part of the Advanced Edition of Exasol.
Syntax
geospatial_functions_1::=
ST_AREA | ST_BOUNDARY | ST_BUFFER | ST_CENTROID | ST_CONTAINS | ST_CONVEXHULL |
ST_CROSSES | ST_DIFFERENCE | ST_DIMENSION | ST_DISJOINT | ST_DISTANCE | ST_ENDPOINT ( args )
geospatial_functions_2::=
ST_ENVELOPE | ST_EQUALS | ST_EXTERIORRING | ST_FORCE2D | ST_GEOMETRYN | ST_GEOMETRYTYPE |
ST_INTERIORRINGN | ST_INTERSECTION | ST_INTERSECTS | ST_ISCLOSED | ST_ISEMPTY | ST_ISRING |
ST_ISSIMPLE ( args )
geospatial_functions_3::=
ST_LENGTH | ST_NUMGEOMETRIES | ST_NUMINTERIORRINGS | ST_NUMPOINTS | ST_OVERLAPS |
ST_SETSRID | ST_POINTN | ST_STARTPOINT | ST_SYMDIFFERENCE | ST_TOUCHES | ST_TRANSFORM |
ST_UNION | ST_WITHIN | ST_X | ST_Y ( args )
Example(s)
ST_DISTANCE
-----------------
10
STDDEV
Purpose
Returns the standard deviation within a random sample. This equates to the following formula:
STDDEV(expr) = sqrt( SUM(i=1..n) (expr_i - mean(expr))^2 / (n-1) )
Syntax
stddev::=
STDDEV ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
Example(s)
SELECT STDDEV(salary) STDDEV FROM staff WHERE age between 20 and 30;
STDDEV
-----------------
19099.73821810132
STDDEV_POP
Purpose
Returns the standard deviation within a population. This equates to the following formula:
STDDEV_POP(expr) = sqrt( SUM(i=1..n) (expr_i - mean(expr))^2 / n )
Syntax
stddev_pop::=
STDDEV_POP ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
STDDEV_POP
-----------------
20792.12591343175
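Analogous to the STDDEV example above, the result could stem from a query like the following (a sketch; the table and columns are taken from the STDDEV example, since the original query is not shown):

```sql
SELECT STDDEV_POP(salary) STDDEV_POP FROM staff WHERE age BETWEEN 20 AND 30;
```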
STDDEV_SAMP
Purpose
Returns the standard deviation within a random sample. This equates to the following formula:
STDDEV_SAMP(expr) = sqrt( SUM(i=1..n) (expr_i - mean(expr))^2 / (n-1) )
Syntax
stddev_samp::=
STDDEV_SAMP ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• STDDEV_SAMP is identical to the STDDEV function. However, if the random sample only encompasses
one element, the result is NULL instead of 0.
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
STDDEV_SAMP
-----------------
19099.73821810132
SUBSTR[ING]
Purpose
Returns a substring of the length length from the position position, out of the string string.
Syntax
substring::=
SUBSTR ( string, position [, length] )
SUBSTRING ( string FROM position [FOR length] )
Note(s)
• If length is not specified, all of the characters to the end of the string are used.
• If position is negative, counting begins at the end of the string.
• If position is 0 or 1, the result begins from the first character of the string.
• MID is an alias for this function.
• See also REGEXP_SUBSTR.
Example(s)
S1 S2
--- --
bcd de
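The two results above could stem from a statement like the following (a sketch; the string literals are assumptions, since the original query is not shown):

```sql
SELECT SUBSTR('abcdef', 2, 3) S1,
       SUBSTRING('abcdef' FROM 4 FOR 2) S2;
```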
SUM
Purpose
Syntax
sum::=
SUM ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• Only numeric operands are supported.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
SELECT SUM(salary) SUM FROM staff WHERE age between 20 and 30;
SUM
------------------------------
220500
SUM SALARY
------------------------------ -------------------
25000 25000
60500 35500
97500 37000
145500 48000
220500 75000
301500 81000
SYS_CONNECT_BY_PATH
Purpose
For a CONNECT BY query, returns a string containing the full path from the root node to the current node, with
the values of expr separated by char. Details can be found in the description of the SELECT
statement in Section 2.2.4, “Query language (DQL)”.
Syntax
sys_connect_by_path::=
Example(s)
PATH
----------------------
/Clark/Jackson/Johnson
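The path above could be produced by a hierarchical query of the following shape (a sketch; the table and column names are assumptions, since the original query is not shown):

```sql
SELECT SYS_CONNECT_BY_PATH(last_name, '/') PATH
FROM employees
START WITH last_name = 'Clark'
CONNECT BY PRIOR employee_id = manager_id;
```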
SYS_GUID
Purpose
Syntax
sys_guid::=
SYS_GUID ( )
Example(s)
SELECT SYS_GUID();
SYS_GUID()
------------------------------------------------
069a588869dfcafb8baf520c801133ab139c04f964f047f1
SYSDATE
Purpose
Returns the current system date by evaluating TO_DATE(SYSTIMESTAMP), thus interpreted in the current
database time zone.
Syntax
sysdate::=
SYSDATE
Note(s)
Example(s)
SELECT SYSDATE;
SYSDATE
----------
2000-12-31
SYSTIMESTAMP
Purpose
Returns the current timestamp, interpreted in the current database time zone.
Syntax
systimestamp::=
SYSTIMESTAMP
Note(s)
Example(s)
SELECT SYSTIMESTAMP;
SYSTIMESTAMP
-------------------
2000-12-31 23:59:59
TAN
Purpose
Syntax
tan::=
TAN ( n )
Example(s)
SELECT TAN(PI()/4);
TAN(PI()/4)
-----------------
1
TANH
Purpose
Syntax
tanh::=
TANH ( n )
Example(s)
TANH
-----------------
0
TO_CHAR (datetime)
Purpose
Syntax
to_char (datetime)::=
TO_CHAR ( datetime | interval [, format [, 'nlsparam']] )
Note(s)
• The standard format is used if no format is specified; it is defined in the session parameter NLS_DATE_FORMAT
or NLS_TIMESTAMP_FORMAT.
• Possible formats can be found in Section 2.6.1, “Date/Time format models”.
• Via the optional third parameter you can specify the language setting for the format (e.g. 'NLS_DATE_LANGUAGE=GERMAN'). Supported languages are German (DEU, DEUTSCH or GERMAN) and English (ENG,
ENGLISH).
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
TO_CHAR
----------
1999-12-31
TO_CHAR
-------------------
23:59:00 31-12-1999
TO_CHAR
------------
16. DEZ 2013
TO_CHAR (number)
Purpose
Syntax
to_char (number)::=
TO_CHAR ( number [, format] )
Note(s)
Example(s)
TO_CHAR
-----------
12345.6789
TO_CHAR
----------------
12345.678900000
TO_CHAR
-------------------
000,012,345.678900-
TO_DATE
Purpose
Syntax
to_date::=
TO_DATE ( string [, format] )
Note(s)
• If no format is specified, the standard format is used to interpret string; it is defined in the session parameter
NLS_DATE_FORMAT.
• Possible formats can be found in Section 2.6.1, “Date/Time format models”.
• ISO formats (IYYY, IW, ID) may not be mixed with non-ISO formats.
• If single elements are omitted, then their minimum values are assumed (e.g. TO_DATE('1999-12',
'YYYY-MM') is interpreted as December 1st, 1999).
• Session parameter NLS_DATE_FORMAT defines the output of the date.
Example(s)
TO_DATE
----------
1999-12-31
TO_DATE
----------
1999-12-31
TO_DSINTERVAL
Purpose
Syntax
to_dsinterval::=
TO_DSINTERVAL ( string )
Note(s)
• The string always has the format [+|-]DD HH:MI:SS[.FF]. Valid values are 0-999999999 for days (DD),
0-23 for hours (HH), 0-59 for minutes (MI) and 0-59.999 for seconds (SS[.FF]).
• Please note that the fractional seconds are truncated after the third decimal place, although you can specify more than three.
• See also TO_YMINTERVAL, NUMTODSINTERVAL and NUMTOYMINTERVAL.
Example(s)
TO_DSINTERVAL
-----------------------------
+000000003 10:59:59.123000000
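The result above could stem from a statement like the following (a sketch; the interval literal is an assumption, since the original query is not shown):

```sql
SELECT TO_DSINTERVAL('3 10:59:59.123') TO_DSINTERVAL;
```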
TO_NUMBER
Purpose
Syntax
to_number::=
TO_NUMBER ( string [, format] )
Note(s)
• The format has no influence on the value, but only on its representation.
• If a format is specified, the corresponding string may only contain digits as well as the characters plus, minus,
NLS_NUMERIC_CHARACTERS, decimal point (decimal separator) and comma (group separator). However,
plus and minus may only be used at the beginning of the string.
• The format for every digit in the string must contain format element nine or zero at least once, see also Section 2.6.2, “Numeric format models”. If the string contains a decimal separator, the format must also contain
the corresponding decimal separator element (D or .).
• If the data type of the parameter string is not a string, then the value is implicitly converted.
• The data type of the result of this function is dependent on the format, but typically a DECIMAL type. If no
format is specified, the result type is DECIMAL(1,0) in case of a boolean input parameter, and DOUBLE in
any other case.
Example(s)
TO_NUMBER1 TO_NUMBER2
----------------- ----------
123 -123.450
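The two results above could stem from a statement like the following (a sketch; the literals and the format are assumptions, since the original query is not shown):

```sql
SELECT TO_NUMBER('123') TO_NUMBER1,
       TO_NUMBER('-123.45', '99999.999') TO_NUMBER2;
```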
TO_TIMESTAMP
Purpose
Syntax
to_timestamp::=
TO_TIMESTAMP ( string [, format] )
Note(s)
• The standard format is used if no format is specified; it is defined in the session parameter
NLS_TIMESTAMP_FORMAT.
• Possible formats can be found in Section 2.6.1, “Date/Time format models”.
• Session parameter NLS_TIMESTAMP_FORMAT defines the output of the timestamp.
Example(s)
TO_TIMESTAMP
--------------------------
1999-12-31 23:59:00.000000
TO_TIMESTAMP
--------------------------
1999-12-31 23:59:00.000000
TO_YMINTERVAL
Purpose
Syntax
to_yminterval::=
TO_YMINTERVAL ( string )
Note(s)
• The string always has the format [+|-]YY-MM. Valid values are 0 to 999999999 for years (YY) and 0 to 11 for
months (MM).
• See also TO_DSINTERVAL, NUMTODSINTERVAL and NUMTOYMINTERVAL.
Example(s)
TO_YMINTERVAL
-------------
+000000003-11
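The result above could stem from a statement like the following (a sketch; the interval literal is an assumption, since the original query is not shown):

```sql
SELECT TO_YMINTERVAL('3-11') TO_YMINTERVAL;
```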
TRANSLATE
Purpose
Replaces each character of from_string with the corresponding character of to_string within the string
expr.
Syntax
translate::=
Note(s)
• Those characters in expr which do not exist in from_string are not replaced.
• If from_string is longer than to_string, the relevant characters are deleted and not replaced.
• If to_string is longer than from_string, the relevant characters are ignored during the replacement
process.
• If one of the parameters is the empty string, then the result is NULL.
Example(s)
TRANSLATE
---------
xyd
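The result above could stem from a call like the following (a sketch; the literals are assumptions — here 'c' has no counterpart in to_string and is therefore deleted):

```sql
SELECT TRANSLATE('abcd', 'abc', 'xy') TRANSLATE;
```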
TRIM
Purpose
TRIM deletes all of the characters specified in the expression, trim_string, from both the right and left border
of string.
Syntax
trim::=
TRIM ( string [, trim_string] )
Note(s)
Example(s)
TRIM
----
bcde
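The result above could stem from a call like the following (a sketch; the literals are assumptions — the characters 'a' and 'f' are removed from both ends):

```sql
SELECT TRIM('abcdef', 'af') TRIM;
```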
TRUNC[ATE] (datetime)
Purpose
Returns a date and/or a timestamp, which is trimmed in accordance with the format definition. Accordingly,
TRUNC(datetime) behaves in exactly the same way as ROUND (datetime), with the exception that TRUNC
rounds down.
Syntax
trunc (datetime)::=
TRUNC[ATE] ( date [, format] )
Note(s)
Example(s)
TRUNC
----------
2006-12-01
TRUNC
--------------------------
2006-12-31 23:59:00.000000
TRUNC[ATE] (number)
Purpose
Syntax
trunc (number)::=
TRUNC[ATE] ( n [, integer] )
Note(s)
Example(s)
TRUNC
-------
123.45
UCASE
Purpose
Syntax
ucase::=
UCASE ( string )
Note(s)
Example(s)
UCASE
-------
ABCDEF
UNICODE
Purpose
Syntax
unicode::=
UNICODE ( char )
Note(s)
Example(s)
UNICODE
----------
228
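The result above could stem from a call like the following (a sketch; the character literal is an assumption — 'ä' has the Unicode code point 228):

```sql
SELECT UNICODE('ä') UNICODE;
```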
UNICODECHR
Purpose
Syntax
unicodechr::=
UNICODECHR ( n )
Note(s)
Example(s)
UNICODECHR
----------
ü
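The result above could stem from a call like the following (a sketch; the number is an assumption — code point 252 corresponds to 'ü'):

```sql
SELECT UNICODECHR(252) UNICODECHR;
```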
UPPER
Purpose
Syntax
upper::=
UPPER ( string )
Note(s)
Example(s)
UPPER
-------
ABCDEF
USER
Purpose
Syntax
user::=
USER
Note(s)
Example(s)
SELECT USER;
USER
----
SYS
VALUE2PROC
Purpose
This function returns the corresponding database node for a certain value. This mapping corresponds to the data
distribution that would be achieved if you distributed a table by that value (DISTRIBUTE BY).
Syntax
value2proc::=
VALUE2PROC ( expr )
Note(s)
• This function can be used to understand the actual data distribution across the cluster nodes which can be
achieved with the ALTER TABLE (distribution) statement.
• The return value is an integer between 0 and NPROC-1.
• In this context, please also note functions IPROC and NPROC.
Example(s)
SELECT IPROC(),
c1, VALUE2PROC(c1) V2P_1,
c2, VALUE2PROC(c2) V2P_2 FROM t;
VAR_POP
Purpose
Returns the variance within a population. This equates to the following formula:
VAR_POP(expr) = SUM(i=1..n) (expr_i - mean(expr))^2 / n
Syntax
var_pop::=
VAR_POP ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
VAR_POP
-----------------
432312500
VAR_SAMP
Purpose
Returns the variance within a random sample. This equates to the following formula:
VAR_SAMP(expr) = SUM(i=1..n) (expr_i - mean(expr))^2 / (n-1)
Syntax
var_samp::=
VAR_SAMP ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
• VAR_SAMP is identical to the VARIANCE function. However, if the random sample only encompasses one
element, the result is NULL instead of 0.
• If ALL or nothing is specified, then all of the entries are considered. If DISTINCT is specified, duplicate entries
are only accounted for once.
• See also Section 2.9.3, “Analytical functions” for the OVER() clause and analytical functions in general.
Example(s)
SELECT VAR_SAMP(salary) AS VAR_SAMP FROM staff WHERE age between 20 and 30;
VAR_SAMP
-----------------
364800000
VARIANCE
Purpose
Returns the variance within a random sample. This equates to the following formula:
VARIANCE(expr) = SUM(i=1..n) (expr_i - mean(expr))^2 / (n-1)
Syntax
variance::=
VARIANCE ( [DISTINCT | ALL] expr ) [over_clause]
Note(s)
Example(s)
SELECT VARIANCE(salary) VARIANCE FROM staff WHERE age between 20 and 30;
VARIANCE
-----------------
364800000
WEEK
Purpose
Syntax
week::=
WEEK ( date )
Note(s)
Example(s)
WEEK
----
1
YEAR
Purpose
Syntax
year::=
YEAR ( date )
Note(s)
Example(s)
YEAR
-----
2010
YEARS_BETWEEN
Purpose
Syntax
years_between::=
Note(s)
• If a timestamp is entered, only the date contained therein is applied for the computation.
• If the months and days are identical, the result is an integer.
• If the first date value is earlier than the second date value, the result is negative.
• For data type TIMESTAMP WITH LOCAL TIME ZONE this function is calculated within the session time
zone.
Example(s)
YB1 YB2
--------------- ---
0.5456989247312 1
ZEROIFNULL
Purpose
Syntax
zeroifnull::=
ZEROIFNULL ( number )
Note(s)
• The ZEROIFNULL function is equivalent to the CASE expression CASE WHEN number IS NULL THEN
0 ELSE number END.
Example(s)
ZIN1 ZIN2
------------------- ----
0 1
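The two results above could stem from a statement like the following (a sketch; the input values are assumptions, since the original query is not shown):

```sql
SELECT ZEROIFNULL(NULL) ZIN1, ZEROIFNULL(1) ZIN2;
```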
Chapter 3. Concepts
This chapter introduces some fundamental concepts in Exasol.
Example
-- Transaction 1
CREATE SCHEMA my_schema;
COMMIT;
-- Transaction 2
CREATE TABLE t (i DECIMAL);
SELECT * FROM t;
ROLLBACK;
-- Transaction 3
CREATE TABLE t (i VARCHAR(20));
COMMIT;
The aim of a transaction-based system is the maintenance of complete transaction security, i.e. each transaction
should return a correct result and leave the database in a consistent condition. To achieve this, transactions must
comply with the so-called ACID principles:
In order to ensure compliance with the ACID principles, every transaction is subject to an evaluation by the TMS.
If necessary, the TMS intervenes and automatically rectifies conflicts through the enforcement of waiting times
or by rolling back transactions in the event of a collision.
In order to keep the number of colliding transactions as low as possible, Exasol supports the so-called "MultiCopy
Format". This means that multiple versions of every database object may exist (temporarily). In this manner system
throughput (number of fully executed transactions per unit of time) can be significantly increased compared to
databases with SingleCopy format.
3.1. Transaction management
The individual transactions are isolated from one another by the TMS by means of a lock procedure. The granularity
of the lock procedure always covers an entire database object, e.g. one schema or one table. This means, for
example, that two transactions cannot simultaneously update different rows of a table.
Due to the TMS, for each transaction that is started the user is confronted with one of the following scenarios:
This reduces the risk of a collision when simultaneously executing transactions with schema statements, but has
the disadvantage that a rollback, for example, cannot undo the schema statement that has been executed.
In contrast, the Exasol TMS does not conceal transactions from the user and does not persistently store statements
in the database automatically.
However, for performance reasons it may be more effective not to run a COMMIT after
each SQL statement. This is especially true if intermediate tables, which are not intended to be saved persistently,
are computed in scripts. Therefore, in EXAplus the user has the possibility of disabling this option with the command
"SET AUTOCOMMIT OFF".
AUTOCOMMIT = OFF increases the risk of collisions and can affect other users negatively.
If the autocommit mode is disabled, the option "-x" in the EXAplus console is recommended. This causes a SQL
script to be aborted if an error occurs, particularly during an automatic rollback after a transaction conflict. If batch
scripts are started without this option, processing of the SQL statement sequence would continue despite the error,
which could lead to incorrect results and should, therefore, be avoided.
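A typical EXAplus session using this option could look as follows (a sketch; the table and column names are invented for illustration):

```sql
-- EXAplus session: compute intermediate results without per-statement commits
SET AUTOCOMMIT OFF;
CREATE TABLE tmp_agg AS
    SELECT customer_id, SUM(amount) total FROM sales GROUP BY customer_id;
INSERT INTO report SELECT * FROM tmp_agg WHERE total > 1000;
DROP TABLE tmp_agg;
COMMIT;  -- persist only the final state, in one transaction
```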
This chapter is merely an introduction to the basic concepts of rights management. Full details are set out in
Appendix B, Details on rights management and Section 2.2.3, “Access control using SQL (DCL)”.
3.2.1. User
An administrator must create a USER account for each user who wishes to connect to the database (with the
CREATE USER SQL statement). In the process, the user receives a password that can be changed at any time
and with which he authenticates himself to the database.
The naming conventions for user names and passwords are the same as for SQL identifiers (identifier for database
objects such as table names, see also Section 2.1.2, “SQL identifier”). However, case sensitivity is of no
significance here.
Appropriate privileges are necessary for a user to perform actions in the database. These are granted or withdrawn
by an administrator or other users. For example, a user needs the system privilege CREATE SESSION in order
to connect to the database. This system privilege is withdrawn if one wishes to temporarily disable a user.
A special user, SYS, exists in the database; it cannot be deleted and it possesses universal privileges.
Initially the SYS password is exasol, however, this should be changed immediately after
the first login in order to prevent potential security risks.
3.2.2. Roles
Roles facilitate the grouping of users and simplify rights management considerably. They are created with the
CREATE ROLE statement.
A user can be assigned one or several roles with the GRANT SQL statement. This provides the user with the
rights of the respective role. Instead of granting many "similar" users the same privileges, one can simply create
a role and grant this role the appropriate privileges. By assigning roles to roles, a hierarchical structure of privileges
is even possible.
Roles cannot be disabled (as is possible in, e.g., Oracle). If an assignment to a role needs to be reversed, the role can be
withdrawn by using the REVOKE SQL statement.
The PUBLIC role stands apart because every user receives this role automatically. This makes it very simple to
grant and later withdraw certain privileges to/from all users of the database. However, this should only occur if
one is quite sure that it is safe to grant the respective rights and the shared data should be publicly accessible. The
PUBLIC role cannot be deleted.
Another pre-defined role is the DBA role. It stands for database administrator and has all possible privileges. This
role should only be assigned to very few users because it provides these with full access to the database. Similar
to PUBLIC, the DBA role cannot be deleted.
3.2.3. Privileges
Privileges control access to the database. Privileges are granted and withdrawn with the SQL statements GRANT
and REVOKE. Distinction is made between two different privilege types:
System privileges control general rights such as "Create new schema", "Create new user" or "Access any table".
Object privileges allow access to single schema objects (e.g. "SELECT access to table t in schema s"). Tables,
views, functions and scripts are referred to as schema objects. It should be noted that each schema and all the
schema objects contained therein belong to exactly one user or one role. This user and all owners of this role have
the right to delete these objects and grant other users access to them. If an object privilege is granted for a schema,
then this privilege is applied to all schema objects contained therein.
A detailed list of all privileges available in Exasol can be found in Section B.1, “List of system and object privileges
”.
In order to be able to grant or withdraw privileges to/from a user or a role, the user himself must possess certain
privileges. The exact rules are explained in the detailed description of the GRANT and REVOKE statements in
the SQL reference (see also Section 2.2.3, “Access control using SQL (DCL)”).
One should always be aware of whom one grants what privileges and what the potential consequences are of doing
so. In this respect, particular emphasis is given to the system privileges: GRANT ANY ROLE, GRANT ANY
PRIVILEGE and ALTER USER. They allow full access to the database and should only be granted to a limited
number of users. Through the system privilege, GRANT ANY ROLE, a user can assign the DBA role to any other
user (naturally, also himself) and in doing so would have full access to the database. If the user has the GRANT
ANY PRIVILEGE system privilege, he can grant himself or any other user the GRANT ANY ROLE system
privilege and in turn receive full access. With the ALTER USER privilege it is possible to change the SYS password
and in doing so also receive full access.
The GRANT ANY ROLE, GRANT ANY PRIVILEGE, and ALTER USER system
privileges provide full access to the database.
An overview of all SQL statements supported by Exasol as well as the necessary privileges can be found in Sec-
tion B.2, “Required privileges for SQL statements”. In Chapter 2, SQL reference in the detailed description of
each SQL statement, this information is also provided.
Access rights to the individual columns of a table or view are not supported. If only part of a table should be visible
to certain users/roles, this can be achieved by using views, which select the relevant part. Instead of granting access
to the actual table, this is only permitted for the generated view. This allows access protection for specific columns
and/or rows.
A list of all the system tables relevant to rights management can be found in Section B.3, “System tables for rights
management”.
Who has access to which system tables is also controlled by privileges. There are some to which only a DBA has
access for reasons of security. In addition, there are system tables that are visible to all but that only show individually
permitted information (for example, EXA_ALL_OBJECTS: all schema objects to which the user has access).
This means that changes are not visible until the transaction has been confirmed by means of COMMIT. Due to
the fact that with all SQL statements a read operation on the user's privileges is performed, where possible DCL
statements should always be conducted with AUTOCOMMIT switched on. If not, there is an increase in the risk
of transaction conflicts.
More information on this issue can be found in Section 3.1, “Transaction management ”.
• Role, ANALYST: performs analyses on the database, and is therefore permitted to read all tables and create his
own schemas and schema objects
• Role, HR: manages the staff, and is therefore permitted to edit the STAFF table
• Role, DATA_ADMIN: gives full access to the data schema
• Table, STAFF: contains information on the company's staff
• User, SCHNEIDER: is an administrator and can do anything
• User, SCHMIDT: works in marketing and has the ANALYST role
• User, MAIER: works in the personnel office, therefore he has the HR role
• User, MUELLER: is an administrator for the DATA schema and can only gain access to this. Therefore, he
owns the DATA_ADMIN role
The following SQL script could be used to implement this scenario in the database:
--create table
CREATE SCHEMA infos;
CREATE TABLE infos.staff (id DECIMAL,
last_name VARCHAR(30),
name VARCHAR(30),
salary DECIMAL);
CREATE SCHEMA data_schema;
--create roles
CREATE ROLE analyst;
CREATE ROLE hr;
CREATE ROLE data_admin;
--create users
--for connecting to db
GRANT CREATE SESSION TO schneider, schmidt, maier, mueller;
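The user creation and the remaining privilege assignments of the scenario could be sketched as follows (the names are taken from the scenario above; the passwords and the exact privilege lists are assumptions, since the original statements are not shown):

```sql
--create users (passwords are placeholders)
CREATE USER schneider IDENTIFIED BY "pw_schneider";
CREATE USER schmidt   IDENTIFIED BY "pw_schmidt";
CREATE USER maier     IDENTIFIED BY "pw_maier";
CREATE USER mueller   IDENTIFIED BY "pw_mueller";

--privileges for the roles
GRANT SELECT ANY TABLE, CREATE SCHEMA, CREATE TABLE TO analyst;
GRANT SELECT, INSERT, UPDATE, DELETE ON infos.staff TO hr;
GRANT ALL ON data_schema TO data_admin;

--assign roles to users
GRANT DBA        TO schneider;
GRANT analyst    TO schmidt;
GRANT hr         TO maier;
GRANT data_admin TO mueller;
```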
3.3. Priorities
3.3.1. Introduction
By the use of priorities, Exasol resources can be systematically distributed across users and roles.
Even during the execution of one single query, Exasol attempts to use as many resources (CPU, RAM, network,
I/O) as possible by internal parallelization (multithreading). But since not all execution steps can be parallelized,
the utilization of all hardware resources will be achieved only with multiple parallel queries. If more parallel
queries are executed than the number of cores per server, Exasol's resource manager schedules the parallel queries
to ensure a constant system performance.
Using a time slot model, the resource manager distributes the resources evenly across the parallel queries while
limiting the overall resource consumption by regularly pausing and restarting certain queries. Without such
scheduling the overall usage could exhaust the system resources, leading to decreased throughput. Without
explicitly setting priorities, Exasol treats all queries equally, except short queries running less than 5 seconds, which
get a higher weighting for minimizing the system's latency.
If you want to influence the execution of many parallel queries, you can use the priorities which can be assigned
via the GRANT statement. This intervention should only be necessary in case of a highly parallel usage of your
system, combined with a certain user group which needs maximal performance. Priorities could, for instance, be
useful if you connect a web application which has low latency requirements, and your system frequently has to
process 10 or more parallel queries. On the other hand, long-running ETL processes or queries from less important
applications could get a lower priority.
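The priority assignment via GRANT could be sketched as follows (the role names are assumptions, invented for illustration):

```sql
-- prioritize the latency-sensitive web application over ETL jobs
GRANT PRIORITY HIGH TO web_app;
GRANT PRIORITY LOW  TO etl_jobs;
```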
Valuable information about the usage of your database can be found in the statistical
system tables (e.g. EXA_USAGE_DAILY and EXA_SQL_DAILY, see also Section A.2.3,
“Statistical system tables”).
Notes
• Users without an explicit priority obtain the default priority group MEDIUM.
• A user inherits the highest priority of his roles, but a directly granted priority overrides it.
• Multiple sessions of a certain user obtain the same priority.
• Please note that all sessions of a priority group share its resources equally. As a consequence, e.g. each of many
parallel HIGH sessions can get less resources than a single LOW session. On the other hand, this way of system
allocation has the great advantage that certain user groups can be assured of getting a certain amount of resources
(e.g. at least 69% for the HIGH users).
• The execution times of queries do not exactly follow the resource distribution. Hence an identical query with
twice as many resources as another one will be executed significantly faster, but not exactly by a factor of 2.
• A user can set the NICE attribute via the statement ALTER SESSION (ALTER SESSION SET NICE='ON').
This indicates to the system that the sessions of other users should be affected as little as possible by this user's
session. The resource manager then divides the session's weight by the number of currently active queries.
Thereby such sessions can be processed without affecting sessions from the same or other priority groups.
• The priority and concrete resource allocation of users, roles and their sessions can be displayed via several
system tables (e.g. EXA_ALL_USERS, EXA_ALL_ROLES, EXA_ALL_SESSIONS).
3.3.3. Example
In the following example some roles with different priorities are defined. In the picture below, the resulting resource
distribution for a list of active sessions is displayed.
3.4.1. Introduction
ETL processes are the transfer of data from source systems into a target system.
With regard to the source systems, these usually involve one or more operational databases, the data of which are
to be combined in a target system. This normally begins with an initial load; the periodic loading of data then occurs
incrementally in order to keep the data warehouse up-to-date.
ETL or ELT?
Instead of using ETL tools, more complex data transformations than mere data type conversions can of course also
be performed directly within the database using SQL statements. This includes, for example, data modifications,
aggregations or schema manipulations.
One can also speak of ELT (instead of ETL) processes in this context, because the Transformation phase is performed
after the Extract and Load phases. Exasol recommends this approach because the high performance of the parallel
cluster and the database can be exploited for complex operations.
Instead of an external bulk loading tool, Exasol's loading engine is directly integrated within the cluster. This
provides optimal performance by leveraging a powerful, parallel cluster, and simplifies the overall ETL process
by just using SQL statements instead of having to install and maintain client tools on different platforms and systems.
The credentials for external systems can easily be encapsulated by using connection objects (see also CREATE
CONNECTION in Section 2.2.3, “Access control using SQL (DCL)”).
Therefore, data integration from other databases or data management systems is very simple, powerful and
flexible. You can load data with a simple SQL command and combine that with further post-processing via subselects.
You can even specify a statement instead of a table name for the external data source which is executed on that
system, e.g. to just load a certain part of a table into Exasol.
File load/export
For pure file processing, both the CSV (Comma Separated Values) and the FBV (Fix Block Values) file formats are
supported which are described later on. You achieve the best performance if you specify an HTTP or FTP server
and even split the files into several parts to allow Exasol to read/write the data in parallel.
3.4. ETL Processes
In case you want to read from or write into local files of your computer, you can also use the statements IMPORT
and EXPORT, but only in combination with EXAplus or the JDBC driver.
In some cases, the integration of an external bulk loading tool is still necessary, and for these cases we deliver the
tool exajload as part of our JDBC driver package.
That is why we recommend reading Section 3.5, “Scripting” if you plan to develop a more sophisticated ETL process.
With our scripting capabilities, you can execute several SQL statements within one script, use exception handling
and even process smaller amounts of data directly within the programming language.
The foundation of this concept is the creation of a row-emitting UDF script (i.e. with return type EMIT) by using
the CREATE SCRIPT statement. If you integrate such a script within an INSERT INTO <table> (SELECT
my_emit_script('param') [FROM ...]) command, then this UDF script is called on the cluster during
the execution of the SQL statement and the emitted rows will be inserted into the target table.
Within the script code you can implement all possible processes, such as establishing connections to external
systems, extracting the data, data type conversion and even more complex transformations. Via parameters you
can hand over all necessary information to the script.
The big advantage of this architecture is the capability to integrate all kinds of (open source) libraries from various
script languages to perform specific functions. For example: if you use libraries from a script language to take care
of data formats, you can easily use new data formats from Hadoop systems within Exasol, without needing to wait
for the next software release.
By leveraging the ability to use dynamic input and output parameters (see also Dynamic input and output parameters)
you can develop scripts generically which can be re-used for all kinds of tables with different structures and data
types.
The following example demonstrates how easy it is to load data in parallel with the help of UDF scripts. The inner
SELECT invokes a UDF script (once) which connects to a service and returns the list of existing JSON files, plus
a certain partition number. The outer SELECT calls another UDF script (with input type SET) that finally loads
that data in parallel, according to partition ID.
The only precondition is that the UDF script implements a specific method that creates a corresponding SELECT
statement string out of the IMPORT information. In the simplest case, the same UDF script will be called within
that SELECT that invokes the actual data loading. Specifics about this method for the various script languages
can be found in Section 3.6.3, “Details for different languages”.
Therefore the internal execution consists of two phases: the SQL text generation and the subsequent embedding
into an INSERT INTO SELECT statement. Exasol handles the second phase automatically by adding the target
table name and columns.
The following simple example illustrates the general concept. The user-defined IMPORT generates a sequence of
integer values whose size is defined via a parameter.
CREATE PYTHON SCALAR SCRIPT etl.generate_seq (num INT) EMITS (val INT) AS
def generate_sql_for_import_spec(import_spec):
    if "RECORDS" not in import_spec.parameters:
        raise ValueError('Mandatory parameter RECORDS is missing')
    return "SELECT ETL.GENERATE_SEQ(" + import_spec.parameters["RECORDS"] + ")"
def run(ctx):
    for i in range(1, ctx["num"]+1):
        ctx.emit(i)
/
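The sequence shown below is the result of invoking the script via a user-defined IMPORT. A plausible invocation (a sketch; this statement is not part of the excerpt above) is:

```sql
IMPORT INTO (val INT) FROM SCRIPT etl.generate_seq WITH RECORDS = '5';
```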
VAL
-------
1
2
3
4
5
The example also shows how the IMPORT parameter (see WITH clause of the IMPORT command) is handed
over to the script call within the SELECT. The object import_spec provides the script with various information,
e.g. the parameters (WITH clause), the connection used, or the column names of the target table of the IMPORT
statement.
By using this information, you can dynamically control the appropriate SQL text generation and the interaction
with the external system. The specific name of the metadata object and its content can vary across the script languages
and is specified in detail in Section 3.6.3, “Details for different languages”.
For more details, we recommend having a look at our open source project "Hadoop ETL UDFs", which implements
the data transfer to/from Hadoop systems via IMPORT/EXPORT commands by using UDF scripts: https://github.com/exasol/hadoop-etl-udfs
The advantages of this approach:
• IMPORT remains the central command for loading data into Exasol
• The user interface is simple and intuitive. Just define the parameters required for your IMPORT. This hides
the complexity of the SQL statements, which can become considerable for real-world scenarios.
• IMPORT FROM SCRIPT supports named parameters, which makes it easier to work with mandatory and
optional parameters and improves the readability.
• Similar to the normal IMPORT, the IMPORT FROM SCRIPT command can be used as a subselect (and hence
in views), which allows you to embed the data access.
Similar to the concept of user-defined IMPORT processes, you can implement user-defined EXPORT processes.
You simply have to create a UDF script that establishes a connection to the external system within the script code
and transfers the data afterwards. A simple SELECT my_export_script(...) FROM <table> could send
all table rows to an external system. It is recommended to use a SET script and control the parallelism via a GROUP
BY clause to minimize the overhead of establishing connections.
The only precondition is again that the UDF script implements a specific method that creates a corresponding SQL
string using the export information. Specifics about this method for the various script languages can be found in
Section 3.6.3, “Details for different languages”.
In the following section you can find a corresponding example for our open-sourced Hadoop integration.
In our open source repository (see https://github.com/exasol) we provide UDF scripts for easily integrating external
systems. Just download the corresponding UDF scripts, execute them on your Exasol database, and in most cases
you can start right away. We would be very happy if you contributed to our open source community by using,
extending and adding to our open-source tools.
One of the existing script implementations is an integration for Hadoop systems (see https://github.com/exasol/hadoop-etl-udfs). In the following example, Apache HCatalog™ tables are imported and exported using the
provided UDF scripts import_hcat_table and export_hcat_table. You can see how simple the resulting
commands look.
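The commands referred to could look as follows (a sketch; the WITH parameters such as HCAT_DB, HCAT_TABLE, HCAT_ADDRESS and HDFS_USER are taken from the project's documentation and may vary between versions):

```sql
IMPORT INTO my_table
FROM SCRIPT etl.import_hcat_table WITH
  HCAT_DB      = 'default'
  HCAT_TABLE   = 'my_hcat_table'
  HCAT_ADDRESS = 'thrift://hcat-host:50111'
  HDFS_USER    = 'hdfs';
```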
If you are not familiar with the concept of virtual schemas, we refer to Section 3.7,
“Virtual schemas”.
If you don't want to materialize all tables, you could also create a mix of views pointing to the virtual tables and
materialized ones, and decide case by case which tables should be permanently stored within Exasol to gain optimal
performance.
A further advantage of virtual tables is that you don't need to adjust the ETL process if columns are introduced
or renamed in the source system.
Comments A data record which has a "#" (sharp) as its first character is ignored.
Comments, header and footer can be used in this manner.
Separation of the fields The fields within a data record are separated with the field separator.
Row separation The data records are separated from one another with the row separator.
Each row is equal to one data record.
Spacing characters Spacing characters at the beginning and/or the end of a field can optionally be trimmed
by specifying the TRIM option in the IMPORT statement. In case of the option
RTRIM, the data is trimmed at the right side.
NULL values NULL values are represented by empty fields. Hence, in the record
John,,Any St.
the second field is NULL, while in
John,NULL,Any St.
the second field contains the literal string NULL.
Numbers Numbers can be used in floating point or exponential notation. Optionally you can
use a format (see also Section 2.6.2, “Numeric format models”). Please also consider
the settings for the session parameter NLS_NUMERIC_CHARACTERS.
Examples:
Timestamp/Date If no explicit format (see also Section 2.6.1, “Date/Time format models”) was spe-
cified, then the current default formats are used, defined by the session parameters
NLS_DATE_FORMAT and NLS_TIMESTAMP_FORMAT. Please also consider
the session parameter NLS_DATE_LANGUAGE for certain format elements.
Boolean value Boolean values are TRUE or FALSE. Those values can be represented by different
values like e.g. 0 for FALSE or T for TRUE (see also Boolean data type).
Special characters If the field separator, field delimiter or row separator occurs within the data, the
affected field must be enclosed in the field delimiter. To include the field delimiter itself
in the data, it has to be written twice consecutively.
Field #3: Empty string, which corresponds to a NULL value within the database!
The same example without using the field delimiter would result in an error:
For the interpretation of the data you have to consider the following elements:
Column width A fixed number of bytes is used per column. If the size of the content is smaller than
the column width, the content is padded with the specified padding characters.
Column alignment With left alignment the content is at the beginning of the column and the padding
characters follow. With right alignment the content is at the end of the column and is preceded
by the padding characters.
Row separation The linefeed character optionally follows at the end of a data record.
NULL values In order to realize NULL values, a column is written across the entire length with
padding characters.
Numbers Numbers are stored in a human-readable form, whereby you can import floating
point and scientific notation numbers. However, the column width must be maintained
and padding characters used as necessary.
Explicit formats Optionally you can specify a format for numbers or datetime values. Please consider
Section 2.6.2, “Numeric format models” and Section 2.6.1, “Date/Time format models”.
3.5. Scripting
3.5.1. Introduction
In this chapter we introduce scripting programming, an interface for executing several
SQL commands sequentially and for handling errors during those executions. Hence you can run control jobs
within the database (e.g. complex loading processes) and simplify recurring jobs through parameterized scripts, like
the creation of a user including its password and privileges.
You can indeed process result tables of SQL queries, but a scripting program is a sequential program
that runs on a single cluster node only (except for the contained SQL statements). Therefore it is not reasonable to do
iterative operations on big data sets. For this purpose we recommend the use of user defined functions (see
Section 3.6, “UDF scripts”).
• For control jobs including several SQL commands you can use scripting programs
• Iterative operations on big data sets can be done via user defined functions (see also
Section 3.6, “UDF scripts ”).
For scripting programs, only the programming language Lua (in version 5.1) is available, extended by a couple of
specific features. The description of the Lua language in the following chapters should enable you to work with
the scripting language of Exasol. If you want to learn the fundamentals of Lua, we recommend reading the
official documentation (see also http://www.lua.org).
A script program is created, executed and dropped by the commands CREATE SCRIPT, EXECUTE SCRIPT
and DROP SCRIPT. The return value of the EXECUTE SCRIPT command is either a number (as a rowcount) or
a result table.
Example
In this example a script is created which can simply copy all tables from one schema into a new schema.
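The original example is not reproduced here; a minimal sketch of such a copy script (schema, script and function names are illustrative; error handling is reduced to dropping the target schema on failure) could look like this:

```sql
CREATE SCRIPT copy_schema (src, dst) AS
    function cleanup()
        pquery([[DROP SCHEMA ::s CASCADE]], {s=dst})
    end
    query([[CREATE SCHEMA ::s]], {s=dst})
    local res = query([[SELECT table_name FROM exa_all_tables
                        WHERE table_schema = :s]], {s=src})
    for i=1, #res do
        local ok = pquery([[CREATE TABLE ::d.::t AS SELECT * FROM ::s.::t]],
                          {d=dst, s=src, t=res[i][1]})
        if not ok then
            cleanup()
            error('could not copy table '..res[i][1])
        end
    end
/
```

The script would be invoked e.g. via EXECUTE SCRIPT copy_schema('S1', 'S2').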
Lexical conventions
Contrary to the general SQL language, the script language is case-sensitive, i.e. upper and lower case has
to be considered for e.g. variable definitions. Furthermore, there exists the constraint that variable and function
identifiers may only consist of ASCII characters. The ending semicolon (;) after a script statement is
optional.
Scripts are case-sensitive; this also applies to variable and function names
Comments
There are two different kinds of comments in Exasol:
• Line comments begin with the character -- and indicate that the remaining part of the current line is a comment.
• Block comments are indicated by the characters /* and */ and can be spread across several lines. All of the
characters between the delimiters are ignored.
Examples
/*
This is
a multiline comment
*/
Please note that the type decimal is not a standard Lua type, but a user-defined Exasol type (of type userdata,
similar to the special value NULL). This decimal type supports the following operators and methods for the usual
mathematical calculations and conversions:
Constructor
decimal(value [, precision [, scale]]) value can be of type string, number or decimal.
The default for precision and scale is (18,0),
i.e. decimal(5.1) is rounded to the value 5!
Operators
+, -, *, /, % Addition, subtraction, multiplication, division and modulo calculation
of two numerical values. The return type is determined dynamically:
decimal or number
== , <, <=, >, >= Comparison operators for numerical values. Return type: boolean
Methods
var:add(), var:sub(), var:mul(), var:mod() Addition, subtraction, multiplication and modulo calculation of two
numerical values.
d1 = decimal(10)
d2 = decimal(5.9, 2, 1)
s = d1:scale() -- s=0
str = tostring(d2) -- str='5.9'
d1:add(d2) -- result is 16
Simple Variables
Script variables are typed dynamically. That means that variables themselves have no type; only the values which
are assigned to them do. An assignment is done by the operator =.
By default, the scope of a variable is global. But it can be limited to the current execution block by using the
keyword local.
Examples
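The examples for this section are not contained in the excerpt; a plain Lua sketch (names are illustrative) would be:

```lua
a = 1           -- global variable; the value, not the variable, carries the type
a = 'text'      -- the same variable may later hold a string
local b = 2.5   -- 'local' limits the scope to the current execution block
```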
Arrays
An array consists of a list of values (my_array={2,3,1}) which can also be heterogeneous (with different
types).
An element of an array can be accessed through its position, beginning from 1 (my_array[position]). The
size of an array can be determined by the #-operator (#my_array). In case of a nil value you will get an exception.
The elements of an array can also be an array. That is how you can easily create multidimensional arrays.
Examples
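The lost examples can be sketched in plain Lua as follows (variable names are illustrative):

```lua
local my_array = {2, 3, 1, 'text'}   -- values may be heterogeneous
first = my_array[1]                  -- positions start at 1
size = #my_array                     -- size via the # operator
local matrix = {{1, 2}, {3, 4}}      -- arrays of arrays give multidimensional arrays
cell = matrix[2][1]                  -- second row, first column
```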
Dictionary Tables
Beside simple variables and arrays you can also use dictionary tables which consist of a collection of key/value
pairs. The keys and values can be heterogeneous (with different types).
The access to a specific value is accomplished by the key, either by using the array notation (variable[key])
or by using the point notation (variable.key).
By using the function pairs(t) you can easily iterate through all entries of the dictionary (for k,v in
pairs(t) do end).
In the Lua documentation there is no difference between arrays and dictionary tables -
they are both simply named table.
275
3.5. Scripting
Examples
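A plain Lua sketch of the lost examples (keys and values are illustrative):

```lua
local t = {}
t['key1'] = 'value1'      -- array notation
t.key2 = 'value2'         -- point notation
count = 0
for k, v in pairs(t) do   -- iterate over all key/value pairs
  count = count + 1
end
```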
Execution blocks
Execution blocks are elements which limit the scope of local variables. The script itself is the outermost execution
block. Other blocks are defined through control structures (see next section) or function declarations (see section
Functions).
Explicit blocks can be declared via do end. They are mainly useful to limit the scope of local variables.
Examples
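The lost example can be sketched as follows; the explicit do ... end block limits the scope of the inner local variable:

```lua
x = 5
do
  local x = 10   -- visible only inside this do ... end block
  inner = x
end
outer = x        -- the global x is unchanged
```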
Control structures
The following control structures are supported:
Element Syntax
if if <condition> then <block>
[elseif <condition> then <block>]
[else <block>]
end
while while <condition> do
<block>
end
for for <var>=<start>,<end>[,<step>] do
<block>
end
repeat repeat
<block>
until <condition>
Notes
• The condition <condition> is evaluated as false if its value is false or nil, otherwise it is evaluated
as true. That means in particular that the value 0 and the empty string are evaluated as true!
• The control expressions <start>, <end>, and <step> of the for loop are evaluated only once, before the
loop starts. They must all result in numbers. Within the loop, you may not assign a value to the loop variable
<var>. If you do not specify a value for <step>, then the loop variable is incremented by 1.
• The break statement can be used to terminate the execution of a while, repeat and for, skipping to the
next statement after the loop. For syntactic reasons, the break statement can only be written as the last statement
of a block. If it is really necessary to break in the middle of a block, then an explicit block can be used (do
break end).
Examples
if var == false
then a = 1
else a = 2
end
while a <= 6 do
p = p*2
a = a+1
end
repeat
p = p*2
b = b+1
until b == 6
for i=1,6 do
if p< 0 then break end
p = p*2
end
Operators
The following operators are supported within the script language:
Operators Notes
+, -, *, /, % Common arithmetic operators
^ Power (2^3=8)
.. String concatenation ('a'..'b' results in 'ab')
==, ~= If the operands of the equality operator (==) have different types, the condition is always
evaluated as false! The inequality operator (~=) is exactly the opposite of the equality
operator.
<, <=, >, >= Comparison operators
and, or, not • and returns the first operand, if it is nil or false, otherwise the second one
• or returns the first operand, if it is not nil or false, otherwise the second one
• Both operators use short-cut evaluation, that is, the second operand is evaluated
only if necessary
• not only returns true, if the operand is nil or false, otherwise it returns
false.
Operator precedence follows the order below, from higher to lower priority:
1. ^
2. not, - (unary operand)
3. *, /, %
4. +, -
5. ..
6. <, >, <=, >=, ~=, ==
7. and
8. or
Examples
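A plain Lua sketch of the lost examples, illustrating the operator semantics described above:

```lua
r1 = nil or 'default'                    -- or returns the second operand
r2 = false and error('never evaluated')  -- short-cut: second operand is skipped
r3 = 2^3 * 2                             -- ^ binds stronger than *
r4 = (1 == '1')                          -- operands of different types: always false
r5 = not nil                             -- not returns true for nil and false
```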
Functions
Scripts can be easily structured by the usage of functions.
Syntax
Notes
• Simple variables are passed by value. That means they cannot be manipulated within the function.
However, arrays and dictionary tables are passed by reference, which means their entries are
mutable. But if you assign a completely new object to the parameter, the original function argument is not affected.
• If you call a function with too many arguments, the extra ones are ignored. If you call it with too
few arguments, the remaining parameters are initialized with nil.
• Via return, you can exit a function and return one or more return values. For syntactic reasons, the return
statement can only be written as the last statement of a block, if it returns a value. If it is really necessary to
return in the middle of a block, then an explicit block can be used (do return end).
• Functions are first-class values, they can be stored in a variable or passed in a parameter list.
Examples
function min_max(a,b,c)
local min,max=a,b
if a>b then min,max=b,a
end
if c>max then max=c
elseif c<min then min=c
end
return min,max
end
pcall() Can be used to protect a function call. The parameters are the function name and all parameters of
the function, e.g. pcall(my_function,param1,param2) instead of
my_function(param1,param2). The function name pcall stands for protected call.
pcall() returns false plus the error message if an error occurred, and true plus the function's results otherwise.
These return values can be assigned to variables and evaluated afterwards (e.g.
success,result=pcall(my_function,param1)).
Examples
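The lost examples can be sketched in plain Lua (function name and error text are illustrative):

```lua
local function risky(x)
  if x < 0 then error('negative input') end
  return x * 2
end
ok1, res1 = pcall(risky, 21)    -- protected call succeeds: ok1=true, res1=42
ok2, res2 = pcall(risky, -1)    -- protected call fails: ok2=false, res2 holds the message
```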
If an error occurs during the call of query(), the script terminates and returns the corresponding exception
message. To protect such a call, you can use either the special function pquery() or pcall() (see also section
Error handling via pcall() and error()).
The return value of query() delivers diverse information, depending on the query type:
1. SELECT statements
Returns a two-dimensional array whose values are read-only. If you want to process these values, you have
to create another two-dimensional array and copy the values. However, please consider that scripts are not
suitable for operations on big data volumes.
The first dimension of the result represents the rows, the second one the columns. Besides addressing by
numbers, you can also use column names, though you have to respect the case (e.g.
result[i]["MY_COLUMN"]).
The number of rows can be determined via the # operator (#result), the number of columns via the #
operator on the first row (#result[1]).
Additionally, you can access the SQL text via the key statement_text.
2. Other statements
When executing IMPORT and EXPORT commands, the following keys are defined:
By using parameterized SQL commands, you can execute dynamic SQL statements without having to create new
SQL text. For example, you could execute a command within a script to create a table multiple times by passing different
table names. In such a case, you can use the table name as a parameter of the SQL statement.
However, such parameters are evaluated either as identifier or as value before they are placed into the text. The
reason for that approach is to avoid any security risks through SQL Injection.
:variable Evaluates the content of the variable as a constant value, e.g. in a column filter
(query([[SELECT * FROM t WHERE i=:v]],{v=1})).
::variable Evaluates the content of the variable as an identifier. Delimited identifiers have to be delimited by
double quotes within the string (e.g. table_name=[["t"]]).
If you use a parameterized query() or pquery() function, you have to specify a dictionary table as the second
function parameter to define the variables. In this definition you can either specify constant values (e.g.
query([[SELECT * FROM t WHERE ::c=:v]],{c=column_name,v=1})) or assign script variables.
It is not possible to use script variables directly within the functions query() and pquery().
If no second parameter is specified, the SQL text is executed without interpretation. This is important, for example, if
you want to create a script via query() or pquery() whose text contains such variables.
Examples
/*
* just executing simple SQL statements
*/
query([[CREATE TABLE t(c CHAR(1))]]) -- creates table T
query([[CREATE TABLE "t"(c CHAR(1))]]) -- creates table t
local my_sql_text = [[INSERT INTO "T" VALUES 'a','b','c','d']]
query(my_sql_text)
/*
* using result tables of a query by concatenating
* all table entries into one result string
*/
local res_table = query([[SELECT c FROM t]])
local first_element = res_table[1][1] -- access first row of first column
local row_size = #res_table -- row_size=4
if row_size==0 then
col_size = 0
else
col_size=#res_table[1] -- col_size=1
end
local concat_str = ''
for col=1, col_size do
for row=1,row_size do
concat_str = concat_str..res_table[row][col] --> 'abcd'
end
end
/*
* using result information about manipulated rows
*/
local res = query([[
MERGE INTO staff s USING update_table u ON (s.id=u.id)
WHEN MATCHED THEN UPDATE SET s.salary=u.salary DELETE WHERE u.delete_flag
WHEN NOT MATCHED THEN INSERT VALUES (u.id,u.salary)
]])
local i, u, d = res.rows_inserted, res.rows_updated, res.rows_deleted
/*
* using pquery to avoid abort of script
*/
local success, result = pquery([[DROP USER my_user]])
if not success then
-- accessing error message
local error_txt = 'Could not drop user. Error: '..result.error_message
query([[INSERT INTO my_errors VALUES (CURRENT_TIMESTAMP, :err)]],
{err=error_txt})
end
/*
* now using variables inside the SQL statements
* and create 5 tables at once
*/
for i=1,5 do
local tmp_table_name = 'TABLE_'..i
query([[CREATE TABLE ::t (c CHAR(1))]], {t=tmp_table_name})
end
Scripting parameters
To be able to pass parameters to a script, you can specify them simply as an input list within the script definition
(CREATE SCRIPT). Those parameters do not need any type definition. Simple variables are just specified as
identifiers which are case-sensitive, like within the script.
If you want to pass an array as a parameter, you have to declare that by using the keyword ARRAY before the
identifier. When calling the script, you have to use the construct ARRAY(value1, value2, ...). By using this
concept, you can easily pass dynamic lists to a script. E.g. a script could be called for a list of user names and could
grant a certain system privilege, independent of the number of passed users.
Examples
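The example that produced the output below is not included in the excerpt; a sketch (script name illustrative) using an ARRAY parameter could be:

```sql
CREATE SCRIPT create_tables (ARRAY table_names) AS
    for i=1, #table_names do
        query([[CREATE TABLE ::t (c CHAR(1))]], {t=table_names[i]})
    end
/
EXECUTE SCRIPT create_tables(ARRAY('T1', 'T2', 'T3', 'T4'));
SELECT table_name, table_type FROM cat;
```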
TABLE_NAME TABLE_TYPE
---------- ----------
T1 TABLE
T2 TABLE
T3 TABLE
T4 TABLE
When including a script via the import() function, the imported script is executed and
all functions and global variables are provided in the corresponding namespace.
In the following example the script other_script is created which defines the function min_max(). This
function can be accessed in the second script my_script, after the inclusion via
import("schema1.other_script", "my_alias"), by using the syntax my_alias.min_max().
If you want to import a parameterized script, you have to specify those parameters at the end of the import() function
call, i.e. after the optional alias. Such parameters can be either constant values or script variables.
Examples
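A sketch of the two scripts described above (reusing the min_max() function shown earlier; names as given in the text):

```sql
CREATE SCRIPT schema1.other_script AS
    function min_max(a, b, c)
        local min, max = a, b
        if a > b then min, max = b, a end
        if c > max then max = c
        elseif c < min then min = c end
        return min, max
    end
/
CREATE SCRIPT my_script AS
    import('schema1.other_script', 'my_alias')
    local lo, hi = my_alias.min_max(3, 1, 2)
    output(lo..', '..hi)
/
```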
The possible return type is specified within the CREATE SCRIPT command:
ROWCOUNT The value which is returned by the drivers as rowcount. If you omit the exit() parameter or if the script is
not exited explicitly, then this value is 0. If you directly return the result of a query execution via query()
or pquery() (e.g. exit(query([[INSERT INTO...]]))), then the rowcount of that query is passed
(not possible when executing a SELECT query). Alternatively, you can specify the rowcount explicitly by
passing a dictionary table which contains the key rows_affected (e.g. exit({rows_affected=30})).
TABLE If you specify this option, the result of the script execution will be a table.
If no parameter is passed to the exit() function or if the script is not terminated explicitly, the result is an
empty table. If you directly return the result of a query execution via query() or pquery() (e.g.
exit(query([[SELECT...]])), then the result table of this query is passed (only possible for SELECT
queries).
Alternatively, you can specify a two-dimensional array. In that case, however, you have to specify the column
types of the result table as the second parameter (see example; analogous to the column definition of a CREATE
TABLE command).
The result table of a script can be processed in other scripts. But a persistent storage
of that table in the database via SQL (similar to CREATE TABLE
<table> AS SELECT ...) is not possible.
If you specify the WITH OUTPUT option within the EXECUTE SCRIPT statement, the
RETURNS option of the script creation is overwritten (see section Debug output).
Examples
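The result shown below could stem from a script like the following (a sketch; script name and values illustrative, using the two-dimensional-array variant of exit() with an explicit column definition):

```sql
CREATE SCRIPT return_table RETURNS TABLE AS
    exit({{1, 'abc', true}, {2, 'xyz', false}, {3, null, null}},
         "i INT, c CHAR(3), b BOOLEAN")
/
EXECUTE SCRIPT return_table;
```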
I  C    B
-- ---- ------
1  abc  TRUE
2  xyz  FALSE
3
Metadata
Within every script you can access various metadata via global variables to enhance the control of a script.
Debug output
Especially during the development of scripts it is very useful to analyze the sequence of actions via debug output.
That is why the WITH OUTPUT option of the EXECUTE SCRIPT statement is provided. If you specify this option,
every output which was created via the output() function is returned as a result table with one column and
multiple rows - independent of the actual return value of the script.
The input parameter of the output() function is a single string. If you pass a different value, an implicit conversion
is attempted; if that conversion is not possible, the value NULL is inserted into the result table.
The column name of the result table is called OUTPUT, the column type is a VARCHAR with the length of the
longest string which was inserted via output().
If you omit the WITH OUTPUT option when executing a script, all debug output via output() is ignored.
Examples
if param1==false then
output('PARAM1 is false, exit SCRIPT')
exit()
else
output('PARAM2 is '..param2)
end
output('End of SCRIPT reached')
/
OUTPUT
---------------------
SCRIPT started
PARAM2 is 5
End of SCRIPT reached
• quote(param)
Adds quotes around the parameter and doubles embedded quotes. Primarily, this function is useful for delimited
identifiers within SQL commands.
• join(p1, p2, ...)
Generates a string which includes all parameters p2,p3,..., separated by p1. By using the function call
join(".",...), you can easily create schema-qualified identifiers (see also Section 2.1.2, “SQL identifier”).
Examples
Further Notes
Line numbers The script text is stored beginning from the first line after the AS keyword that is not a blank
line, and can also be found in the corresponding system tables. That is why the following
example returns an error in "line 5":
Scope schema During the execution of a script, the current schema is used as scope schema (unless you
change the schema explicitly within the script). Therefore, schema objects should preferably
be accessed schema-qualified.
3.5.4. Libraries
String library
This library provides generic functions for string manipulation, such as finding and extracting substrings, and
pattern matching. More information about patterns can be found below.
Please note that when indexing a string, the first character is at position 1. Indices are allowed to be negative and
are interpreted as indexing backwards, from the end of the string. Thus, the last character is at position -1.
Functions
• string.find(s, pattern [, init [, plain]])
Looks for the first match of pattern in the string s. If it finds a match, then find returns the indices of s
where this occurrence starts and ends; otherwise, it returns nil. A third, optional numerical argument init
specifies where to start the search; its default value is 1 and may be negative. A value of true as a fourth,
optional argument plain turns off the pattern matching facilities, so the function does a plain "find substring"
operation, with no characters in pattern being considered "magic". Note that if plain is given, then init
must be given as well. If the pattern has captures, then in a successful match the captured values are also returned,
after the two indices.
Looks for the first match of pattern in the string s. If it finds one, then match returns the captures from
the pattern; otherwise it returns nil. If pattern specifies no captures, then the whole match is returned. A
third, optional numerical argument init specifies where to start the search; its default value is 1 and may be
negative.
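The difference between find and match can be seen in a short example:

```lua
local s = 'hello world'
-- find returns the start and end indices of the first match
local i, j = string.find(s, 'o wo')                 --> 5, 8
-- with captures, the captured values follow the two indices
local b, e, k, v = string.find('key=value', '(%w+)=(%w+)')
                                                    --> 1, 9, 'key', 'value'
-- match returns only the captures (or the whole match)
local num = string.match('order 42 shipped', '%d+') --> '42'
```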
• string.gmatch(s, pattern)
Returns an iterator function that, each time it is called, returns the next captures from pattern over string
s. If pattern specifies no captures, then the whole match is produced in each call.
The example collects all pairs key=value from the given string into a dictionary table:
local t = {}
local s = 'from=world, to=moon'
for k, v in string.gmatch(s, '(%w+)=(%w+)') do
t[k] = v
end
--> t['from']='world'
--> t['to'] ='moon'
• string.sub(s, i [, j])
Returns the substring of s that starts at i and continues until j; i and j may be negative. If j is absent, then
it is assumed to be equal to -1 (which is the same as the string length). In particular, the call
string.sub(s,1,j) returns a prefix of s with length j, and string.sub(s, -i) returns a suffix of
s with length i.
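A few calls illustrate the index handling:

```lua
local s = 'hello'
local mid = string.sub(s, 2, 4)   --> 'ell'
local pre = string.sub(s, 1, 3)   --> 'hel' (prefix of length 3)
local suf = string.sub(s, -3)     --> 'llo' (suffix of length 3)
```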
Chapter 3. Concepts
• string.gsub(s, pattern, repl [, n])
Returns a copy of s in which all occurrences of the pattern have been replaced by a replacement string
specified by repl. gsub also returns, as its second value, the total number of substitutions made. The optional
last parameter n limits the maximum number of substitutions to occur. For instance, when n is 1 only the first
occurrence of pattern is replaced.
The string repl is used for replacement. The character % works as an escape character: Any sequence in
repl of the form %n, with n between 1 and 9, stands for the value of the n-th captured substring (see below).
The sequence %0 stands for the whole match. The sequence %% stands for a single %.
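For example, replacing characters, limiting the number of substitutions, and reordering captures:

```lua
-- global replacement; the second result is the substitution count
local r, cnt = string.gsub('hello world', 'o', '0')   --> 'hell0 w0rld', 2
-- n = 1 replaces only the first occurrence
local first = string.gsub('hello world', 'o', '0', 1) --> 'hell0 world'
-- %1 and %2 refer to the captured substrings
local swapped = string.gsub('from=world', '(%w+)=(%w+)', '%2=%1')
                                                      --> 'world=from'
```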
• string.len(s)
Receives a string and returns its length. The empty string '' has length 0.
• string.rep(s, n)
Returns a string that is the concatenation of n copies of the string s. Example:
local x = string.rep('hello',3)
--> x='hellohellohello'
• string.reverse(s)
Returns a string that is the string s reversed.
• string.lower(s)
Receives a string and returns a copy of this string with all uppercase letters changed to lowercase. All other
characters are left unchanged.
• string.upper(s)
Receives a string and returns a copy of this string with all lowercase letters changed to uppercase. All other
characters are left unchanged. Example:
• string.format(formatstring, ...)
Returns a formatted version of its variable number of arguments following the description given in its first argument (which must be a string). The format string follows the same rules as the printf() family of standard C functions. The only differences are that the options/modifiers *, l, L, n, p, and h are not supported and that there is an extra option, q. The q option formats a string in a form suitable to be safely read back by the Lua interpreter. The options c, d, E, e, f, g, G, i, o, u, x, and X all expect a number as argument, whereas q and s expect a string.
Options:
%d, %i Signed integer
%u Unsigned integer
%f, %g, %G Floating point
%e, %E Scientific notation
%o Octal integer
%x, %X Hexadecimal integer
%c Character
%s String
%q Safe string
Example:
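A short illustration of typical format options (values are illustrative):

```lua
local msg = string.format('%d errors in %s (%.1f%%)', 3, 'run.log', 2.5)
--> msg = '3 errors in run.log (2.5%)'
local quoted = string.format('%q', 'a "quoted" string')
-- %q escapes the quotes so that Lua can read the string back safely
```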
Pattern
x Represents the character x itself (if x is not one of the magic characters ^$()%.[]*+-?)
. (a dot) represents all characters
%a represents all letters
%c represents all control characters
%d represents all digits
%l represents all lowercase letters
%p represents all punctuation characters
%s represents all space characters
%u represents all uppercase letters
%w represents all alphanumeric characters
%x represents all hexadecimal digits
%z represents the character with representation 0
%x represents the character x (where x is any non-alphanumeric character). This is the standard way to
escape the magic characters. Any punctuation character (even the non magic) can be preceded by a
`%´ when used to represent itself in a pattern.
[set] represents the class which is the union of all characters in set. A range of characters may be specified
by separating the end characters of the range with a `-´. All classes %x described above may also be
used as components in set. All other characters in set represent themselves. The interaction between
ranges and classes is not defined. Therefore, patterns like [%a-z] or [a-%%] have no meaning.
[0-7%l%-] represents the octal digits plus the lowercase letters plus the `-´ character
[^set] represents the complement of set, where set is interpreted as above.
For all classes represented by single letters (%a, %c, etc.), the corresponding uppercase letter represents the complement of the class. For instance, %S represents all non-space characters.
A pattern item can be one of the following:
• a single character class, which matches any single character in the class;
• a single character class followed by `*´, which matches 0 or more repetitions of characters in the class. These
repetition items will always match the longest possible sequence;
• a single character class followed by `+´, which matches 1 or more repetitions of characters in the class. These
repetition items will always match the longest possible sequence;
• a single character class followed by `-´, which also matches 0 or more repetitions of characters in the class.
Unlike `*´, these repetition items will always match the shortest possible sequence;
• a single character class followed by `?´, which matches 0 or 1 occurrence of a character in the class;
• %n, for n between 1 and 9; such item matches a substring equal to the n-th captured string (see below);
• %bxy, where x and y are two distinct characters; such an item matches strings that start with x, end with y, and where the x and y are balanced. This means that, if one reads the string from left to right, counting +1 for an x and -1 for a y, the ending y is the first y where the count reaches 0. For instance, the item %b() matches expressions with balanced parentheses.
A pattern is a sequence of pattern items. A `^´ at the beginning of a pattern anchors the match at the beginning of
the subject string. A `$´ at the end of a pattern anchors the match at the end of the subject string. At other positions,
`^´ and `$´ have no special meaning and represent themselves.
A pattern may contain sub-patterns enclosed in parentheses; they describe captures. When a match succeeds, the
substrings of the subject string that match captures are stored (captured) for future use. Captures are numbered
according to their left parentheses. For instance, in the pattern "(a*(.)%w(%s*))", the part of the string matching
"a*(.)%w(%s*)" is stored as the first capture (and therefore has number 1); the character matching "." is captured
with number 2, and the part matching "%s*" has number 3.
As a special case, the empty capture () captures the current string position (a number). For instance, if we apply
the pattern "()aa()" on the string "flaaap", there will be two captures: 3 and 5.
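The anchor, balanced-match, and position-capture rules can be checked directly:

```lua
-- '^' anchors the match at the beginning of the subject string
local word = string.match('abc123', '^%a+')          --> 'abc'
-- %b() matches the outermost balanced pair of parentheses
local bal = string.match('f(a(b)c)d', '%b()')        --> '(a(b)c)'
-- empty captures () return string positions: here 3 and 5
local i, j = string.match('flaaap', '()aa()')        --> 3, 5
```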
Unicode
If you want to process unicode characters, you can use the library unicode.utf8, which contains functions similar to those of the string library, but with unicode support. Another library called unicode.ascii exists, which however has the same functionality as the library string.
XML parsing
The library lxp contains several features to process XML data. The official reference to those functions can be
found under http://matthewwild.co.uk/projects/luaexpat.
lxp.new(callbacks, [separator]) The parser is created by a call to the function lxp.new, which returns the
created parser or raises a Lua error. It receives the callbacks table and op-
tionally the parser separator character used in the namespace expanded
element names.
close() Closes the parser, freeing all memory used by it. A call to parser:close()
without a previous call to parser:parse() could result in an error.
parse(s) Parse some more of the document. The string s contains part (or perhaps
all) of the document. When called without arguments the document is closed
(but the parser still has to be closed). The function returns a non-nil value
when the parser has been successful, and when the parser finds an error it
returns five results: nil, msg, line, col, and pos, which are the error message,
the line number, column number and absolute position of the error in the
XML document.
pos() Returns three results: the current parsing line, column, and absolute position.
setbase(base) Sets the base to be used for resolving relative URIs in system identifiers.
setencoding(encoding) Set the encoding to be used by the parser. There are four built-in encodings,
passed as strings: "US-ASCII", "UTF-8", "UTF-16", and "ISO-8859-1".
stop() Abort the parser and prevent it from parsing any further through the data it
was last passed. Use to halt parsing the document when an error is discovered
inside a callback, for example. The parser object cannot accept more data
after this call.
SQL parsing
The self developed library sqlparsing contains a various number of functions to process SQL statements.
Details about the functions and their usage can be found in Section 3.8, “SQL Preprocessor ”.
Internet access
Via the library socket you can open http, ftp and smtp connections. More documentation can be found under
http://w3.impa.br/~diego/software/luasocket/.
Math library
The math library is an interface to the standard C math library and provides the following functions and values:
math.atan2(y,x) Arc tangent of y/x (in radians), but uses the signs of both operands to find the
quadrant of the result
math.fmod(x,y) Modulo
math.frexp(x) Returns m and n so that x = m*2^n, where n is an integer and the absolute value of m is in
the range [0.5;1) (or zero if x is zero)
math.huge Value HUGE_VAL which is larger than any other numerical value
math.modf(x) Returns two numbers, the integral part of x and the fractional part of x
math.pi Value of π
math.random([m,[n]]) Interface to random generator function rand from ANSI C (no guarantees can be
given for statistical properties). When called without arguments, returns a pseudo-
random real number in the range [0;1). If integer m is specified, then the range is
[1;m]. If called with m and n, then the range is [m;n].
math.tan(x) Tangent of x
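A few of these functions in action:

```lua
local r = math.fmod(7, 3)          --> 1
local m, n = math.frexp(8)         --> 0.5 and 4, since 8 = 0.5 * 2^4
local int, frac = math.modf(3.25)  --> 3 and 0.25 (exact in binary)
```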
Table library
This library provides generic functions for table manipulation. Most functions in the table library assume that the
table represents an array or a list. For these functions, when we talk about the "length" of a table we mean the
result of the length operator (#table).
Functions
• table.insert(table, [pos,] value)
Inserts element value at position pos in table, shifting up other elements if necessary. The default value
for pos is n+1, where n is the length of the table, so that a call table.insert(t,x) inserts x at the end
of table t.
• table.remove(table [, pos])
Removes from table the element at position pos, shifting down other elements if necessary. Returns the
value of the removed element. The default value for pos is n, where n is the length of the table, so that a call
table.remove(t) removes the last element of table t.
• table.concat(table [, sep [, i [, j]]])
Returns table[i]..sep..table[i+1] ... sep..table[j]. The default value for sep is the
empty string, the default for i is 1, and the default for j is the length of the table. If i is greater than j, returns
the empty string.
• table.sort(table [, comp])
Sorts table elements in a given order, in-place, from table[1] to table[n], where n is the length of the table. If
comp is given, then it must be a function that receives two table elements, and returns true when the first is
less than the second (so that not comp(a[i+1],a[i]) will be true after the sort). If comp is not given, then the
standard Lua operator < is used instead. The sort algorithm is not stable; that is, elements considered equal by
the given order may have their relative positions changed by the sort.
• table.maxn(table)
Returns the largest positive numerical index of the given table, or zero if the table has no positive numerical
indices. (To do its job, this function does a linear traversal of the whole table.)
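A short example combining these functions:

```lua
local t = {3, 1, 2}
table.sort(t)                        --> t = {1, 2, 3}
table.insert(t, 4)                   -- appends 4 at the end
table.insert(t, 1, 0)                -- inserts 0 at position 1, shifting the rest
local joined = table.concat(t, ',')  --> '0,1,2,3,4'
local last = table.remove(t)         --> 4 (removes and returns the last element)
```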
• EXA_USER_SCRIPTS
• EXA_ALL_SCRIPTS
• EXA_DBA_SCRIPTS
Please note that UDF scripts are part of the Advanced Edition of Exasol.
Among others, the following can be implemented through UDF scripts:
• Scalar functions
• Aggregate functions
• Analytical functions
• MapReduce algorithms
• User-defined ETL processes
To take advantage of the variety of UDF scripts, you only need to create a script (CREATE SCRIPT, see Section 2.2.1, “Definition of the database (DDL)”) and use this script afterwards within a SELECT statement. Through this close embedding within SQL, you can achieve ideal performance and scalability.
Exasol supports multiple programming languages (Java, Lua, Python, R) to simplify your start. Furthermore, the different languages provide different advantages due to their respective focus (e.g. statistical functions in R) and their delivered libraries (XML parsers, etc.). Please note the next chapters, in which the specific characteristics of each language are described.
The current versions of the scripting languages can be listed via the corresponding metadata functions.
Within the CREATE SCRIPT command, you have to define the type of input and output values. You can e.g.
create a script which generates multiple result rows out of a single input row (SCALAR ... EMITS).
Input values:
SCALAR The keyword SCALAR specifies that the script processes single input rows. Its code is
therefore called once per input row.
SET If you define the option SET, then the processing refers to a set of input values. Within
the code, you can iterate through those values (see Section 3.6.2, “Introducing examples”).
Output values:
RETURNS In this case the script returns a single value.
EMITS If the keyword EMITS was defined, the script can create (emit) multiple result rows
(tuples). In the case of input type SET, the EMITS result can only be used alone, thus not
combined with other expressions. However, you can of course nest it within a subselect
to do further processing of those intermediate results.
The data types of input and output parameters can be defined to specify the conversion between internal data types
and the database SQL data types. If you don't specify them, the script has to handle that dynamically (see details
and examples below).
Please note that input parameters of scripts are always treated as case-sensitive, similar to
the script code itself. This is different from SQL identifiers, which are only treated as
case-sensitive when delimited.
3.6. UDF scripts
Scripts must contain the main function run(). This function is called with a parameter providing access to the
input data of Exasol. If your script processes multiple input tuples (thus a SET script), you can iterate through the
single tuples by the use of this parameter.
For scalar functions, the input rows are distributed across the virtual machines to
achieve maximal parallelism.
ORDER BY clause: Either when creating a script (CREATE SCRIPT) or when calling it, you can specify an ORDER BY clause, which leads to a sorted processing of the groups of SET input data. For some algorithms this can be reasonable. But if it is necessary for the algorithm, then you should already specify this clause during the creation to avoid wrong results due to misuse.
Performance comparison between the programming languages: The performance of the different languages can hardly be compared, since the specific elements of the languages can have different capacities. Thus, string processing can be faster in one language, while XML parsing is faster in another.
Scalar functions
User defined scalar functions (keyword SCALAR) are the simplest case of user defined scripts, returning one
scalar result value (keyword RETURNS) or several result tuples (keyword EMITS) for each input value (or tuple).
Please note that scripts have to implement a function run() in which the processing is done. This function is
called during the execution and receives a kind of context as parameter (named data in the examples), which is the
actual interface between the script and the database.
In the following example, a script is defined which returns the maximum of two values. This is equivalent to the
CASE expression CASE WHEN x>=y THEN x WHEN x<y THEN y ELSE NULL END.
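A minimal Lua sketch of such a script (an illustrative version, using the special value null for NULL handling as described in Section 3.5, “Scripting”):

```sql
CREATE LUA SCALAR SCRIPT my_maximum (x DOUBLE, y DOUBLE)
RETURNS DOUBLE AS
function run(ctx)
    -- if either input is NULL, the result is NULL
    if ctx.x == null or ctx.y == null then
        return null
    end
    if ctx.x >= ctx.y then
        return ctx.x
    else
        return ctx.y
    end
end
/
```

Called for example as SELECT x, y, my_maximum(t.x, t.y) FROM t;, which produces the result shown below.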
X Y MY_MAXIMUM(T.X,T.Y)
----------------- ----------------- -------------------
1 2 2
2 2 2
3 2 3
Furthermore, scripts can either return a single scalar value (keyword RETURNS) or multiple result tuples (keyword
EMITS).
The following example defines two scripts: the aggregate function my_average (simulates AVG) and the ana-
lytical function my_sum which creates three values per input row (one sequential number, the current value and
the sum of the previous values). The latter one processes the input data in sorted order due to the ORDER BY
clause.
MY_AVERAGE(T.X)
-----------------
7.75
In order to access and evaluate dynamic input parameters in UDF scripts, extract the number of input parameters
and their types from the metadata and then access each parameter value by its index. For instance, in Python the
number of input parameters is stored in the variable exa.meta.input_column_count. Please note the details
for the different programming languages in Section 3.6.3, “Details for different languages”.
If the UDF script is defined with dynamic output parameters, the actual output parameters and their types are de-
termined dynamically whenever the UDF is called. There are three possibilities:
1. You can specify the output parameters directly in the query after the UDF call using the EMITS keyword
followed by the names and types the UDF shall output in this specific call.
2. If the UDF is used in the top level SELECT of an INSERT INTO SELECT statement, the columns of the
target table are used as output parameters.
3. If neither EMITS is specified, nor INSERT INTO SELECT is used, the database tries to call the function
default_output_columns() (the name varies, here for Python) which returns the output parameters
dynamically, e.g. based on the input parameters. This method can be implemented by the user. See Section 3.6.3,
“Details for different languages” on how to implement the callback method in each language.
-- Define a pretty simple sampling script where the last parameter defines
-- the percentage of samples to be emitted.
CREATE PYTHON SCALAR SCRIPT sample_simple (...) EMITS (...) AS
from random import randint, seed
seed(1001)
def run(ctx):
    percent = ctx[exa.meta.input_column_count-1]
    if randint(0,100) <= percent:
        currentRow = [ctx[i] for i in range(0, exa.meta.input_column_count-1)]
        ctx.emit(*currentRow)
/
-- This is the same UDF, but output arguments are generated automatically
-- to avoid explicit EMITS definition in SELECT.
-- In default_output_columns(), a prefix 'c' is added to the column names
-- because the input columns are autogenerated numbers
CREATE PYTHON SCALAR SCRIPT sample (...) EMITS (...) AS
from random import randint, seed
seed(1001)
def run(ctx):
    percent = ctx[exa.meta.input_column_count-1]
    if randint(0,100) <= percent:
        currentRow = [ctx[i] for i in range(0, exa.meta.input_column_count-1)]
        ctx.emit(*currentRow)
def default_output_columns():
    output_args = list()
    for i in range(0, exa.meta.input_column_count-1):
        name = exa.meta.input_columns[i].name
        type = exa.meta.input_columns[i].sql_type
        output_args.append("c" + name + " " + type)
    return str(", ".join(output_args))
/
-- Example table
ID USER_NAME PAGE_VISITS
-------- --------- -----------
1 Alice 12
2 Bob 4
3 Pete 0
4 Hans 101
5 John 32
6 Peter 65
7 Graham 21
8 Steve 4
9 Bill 64
10 Claudia 201
-- The first UDF requires specifying the output columns via EMITS.
-- Here, 20% of the rows should be extracted randomly.
ID USER_NAME PAGE_VISITS
-------- --------- -----------
2 Bob 4
5 John 32
C0 C1 C2
-------- --------- -----------
2 Bob 4
5 John 32
-- In case of INSERT INTO, the UDF uses the target types automatically
CREATE TABLE people_sample LIKE people;
INSERT INTO people_sample
SELECT sample_simple(id, user_name, page_visits, 20) FROM people;
MapReduce programs
Due to its flexibility, the UDF scripts framework is able to implement any kind of analysis you can imagine. To
show its power, we list an example of a MapReduce program which calculates the frequency of single words
within a text, a problem which cannot be solved with standard SQL.
In the example, the script map_words extracts single words out of a text and emits them. This script is integrated
within a SQL query without the need for an additional aggregation script (the typical Reduce step of
MapReduce), because we can use the built-in SQL function COUNT. This reduces the implementation effort,
since a whole set of built-in SQL functions is already available in Exasol. Additionally, the performance can
be increased by this, since the execution of built-in SQL functions is more native.
WORDS COUNT(*)
--------------------------- -------------------
the 1376964
slyly 649761
regular 619211
final 542206
carefully 535430
furiously 534054
ironic 519784
blithely 450438
even 425013
quickly 354712
In the following example, a list of URLs (stored in a table) is processed, the corresponding documents are read
from the webserver and finally the length of the documents is calculated. Please note that every script language
provides different libraries to connect to the internet.
URL DOC_LENGTH
-------------------------------------------------- -----------------
http://en.wikipedia.org/wiki/Main_Page.htm 59960
http://en.wikipedia.org/wiki/Exasol 30007
Lua
Besides the following information, you can find additional details for Lua in Section 3.5, “Scripting”. Furthermore,
we recommend the official documentation (see http://www.lua.org).
run() and cleanup() methods: The method run() is called for each input tuple (SCALAR) or each group (SET). Its parameter is a kind of execution context and provides access to the data and the iterator in case of a SET script.

To initialize expensive steps (e.g. opening external connections), you can write code outside the run() method, since this code is executed once at the beginning by each virtual machine. For deinitialization purposes, the method cleanup() exists, which is called once for each virtual machine at the end of the execution.
Parameters Note that the internal Lua data types and the database SQL types are not identical. Therefore
casts must be done for the input and output data:
SQL type               Lua type
DOUBLE                 number
DECIMAL and INTEGER    decimal
BOOLEAN                boolean
VARCHAR and CHAR       string
Other                  not supported

Please also consider the details about Lua types in Section 3.5, “Scripting”, especially for the special type decimal.
But you can also use a dynamic number of parameters via the notation (...), e.g. CREATE LUA SCALAR SCRIPT my_script (...). The parameters can then be accessed through an index, e.g. data[1]. The number of parameters and their data types (both determined during the call of the script) are part of the metadata.
Metadata The following metadata can be accessed via global variables:
Data iterator (next(), size() and reset()): For scripts having multiple input tuples per call (keyword SET), you can iterate through that data via the method next(), which is accessible through the context. Initially, the iterator points to the first input row. That is why a repeat...until loop can ideally be used to iterate through the data (see examples).
If the input data is empty, then the run() method will not be called, and,
similar to aggregate functions, the NULL value is returned as result (as
for e.g. SELECT MAX(x) FROM t WHERE false).
Additionally, there is a method reset() that resets the iterator to the first input element.
Hereby you can do multiple iterations through the data, if this is necessary for your algorithm.
The method size() returns the number of input values.
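For illustration, a SET script summing a group could iterate with repeat...until as follows (the input column name x is an assumed example, and the sketch runs only inside the Exasol script runtime):

```lua
function run(ctx)
    local sum = 0
    repeat
        -- 'x' is an assumed input column; null marks SQL NULL values
        if ctx.x ~= null then
            sum = sum + ctx.x
        end
    until not ctx.next()
    return sum
end
```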
emit() You can return multiple output tuples per call (keyword EMITS) via the method emit().
The method expects as many parameters as output columns were defined. In the case of
dynamic output parameters, it is handy in Lua to pass an array using unpack(), e.g.
ctx.emit(unpack({1,"a"})).
Import of other scripts: Other scripts can be imported via the method exa.import(). Scripting programs (see Section 3.5, “Scripting”) can also be imported, but their parameters will be ignored.

Accessing connection definitions: The data that has been specified when defining connections with CREATE CONNECTION is available in Lua UDF scripts via the method exa.get_connection("<connection_name>"). The result is a Lua table with the following fields:
type: The type of the connection definition. Currently only the type "PASSWORD" is used.
address: The part of the connection definition that followed the TO keyword in the CREATE CONNECTION command.
user: The part of the connection definition that followed the USER keyword in the CREATE CONNECTION command.
password: The part of the connection definition that followed the IDENTIFIED BY keyword in the CREATE CONNECTION command.
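As a sketch, these fields can be accessed like this (the connection name MY_CONN and the helper function login() are illustrative, and exa.get_connection() is only available inside the Exasol script runtime):

```lua
-- inside a Lua UDF script
local conn = exa.get_connection("MY_CONN")  -- 'MY_CONN' is an assumed connection name
if conn.type == "PASSWORD" then
    -- login() is a hypothetical helper that uses the connection data
    login(conn.address, conn.user, conn.password)
end
```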
Auxiliary libraries: In Section 3.5, “Scripting” you can find details about the supported libraries of Lua.

Dynamic output parameters callback function default_output_columns(): If the UDF script was defined with dynamic output parameters and the output parameters cannot be determined (via specifying EMITS in the query or via INSERT INTO SELECT), the database calls the function default_output_columns(), which you can implement. The expected return value is a string with the names and types of the output columns, e.g. "a int, b varchar(100)". See Dynamic input and output parameters for an explanation of when this method is called, including examples.
User defined import callback function generate_sql_for_import_spec() (see Section 3.4.4, “User-defined IMPORT using UDFs”): To support a user defined import, you can implement the callback method generate_sql_for_import_spec(import_spec). Please see also Dynamic input and output parameters and the IMPORT statement for the syntax. The parameter import_spec contains all information about the executed IMPORT FROM SCRIPT statement. The function has to generate and return a SELECT SQL statement which will retrieve the data to be imported. import_spec is a Lua table with the following fields:

parameters: Parameters specified in the IMPORT statement. For example, the parameter FOO could be obtained by accessing import_spec.parameters.FOO. The value nil is returned if the accessed parameter was not specified.

is_subselect: This is true if the IMPORT is used inside a SELECT statement and not inside an IMPORT INTO table statement.

subselect_column_names: This is only defined if is_subselect is true and the user specified the target column names and types. It returns a list of strings with the names of all specified columns. The value nil is returned if the target columns are not specified.

subselect_column_types: This is only defined if is_subselect is true and the user specified the target column names and types. It returns a list of strings with the types of all specified columns. The types are returned in SQL format (e.g. "VARCHAR(100)"). The value nil is returned if the target columns are not specified.

connection_name: Returns the name of the connection, if one was specified; otherwise it returns nil. The UDF script can then obtain the connection information via exa.get_connection(name).

connection: This is only defined if the user provided connection information. It returns a Lua table similar to the one returned by exa.get_connection(name). Returns nil if no connection information was specified.
User defined export callback function generate_sql_for_export_spec() (see Section 3.4.5, “User-defined EXPORT using UDFs”): To support a user defined export, you can implement the callback method generate_sql_for_export_spec(export_spec). Please see also Dynamic input and output parameters and the EXPORT statement for the syntax. The parameter export_spec contains all information about the executed EXPORT INTO SCRIPT statement. The function has to generate and return a SELECT SQL statement which will generate the data to be exported. The FROM part of that string can be a dummy table (e.g. DUAL), since the export command is aware which table should be exported. But it has to be specified to be able to compile the SQL string successfully.
Example:
/*
This example loads from a webserver
and processes the following file goalies.xml:
p = lxp.new(
function run(ctx)
    content = http.request(ctx.url)
    p:parse(content); p:parse(); p:close();
    for i = 1, #users do
        ctx.emit(users[i].first_name, users[i].last_name)
    end
end
/
Java
Besides the following information you can find additional details for Java in the official documentation.
Java main class: The default convention is that the script main class (which includes the methods run() and optionally init() and cleanup()) must be named exactly like the name of the script (please consider the general rules for identifiers).

You can also specify the script class explicitly using the keyword %scriptclass (e.g. %scriptclass com.mycompany.MyScriptClass;).
Java package: All classes which are defined directly in the script are implicitly inside the package com.exasol, because the statement package com.exasol; is implicitly added to the beginning of the script code.
Using the Maven repository: The Exasol Java API for scripts is available in Maven to facilitate the development of Java code. Please add the following repository and dependency to the build configuration of your project (e.g. pom.xml for Maven):
<repositories>
<repository>
<id>maven.exasol.com</id>
<url>https://maven.exasol.com/artifactory/exasol-releases</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>com.exasol</groupId>
<artifactId>exasol-script-api</artifactId>
<version>6.0.0</version>
</dependency>
</dependencies>
run(ExaMetadata, ExaIterator), init(ExaMetadata) and cleanup() methods: The method static <Type> run(ExaMetadata, ExaIterator) is called for each input tuple (SCALAR) or each group (SET). Its parameters are a metadata object (ExaMetadata) and an iterator for the data access (ExaIterator).

To initialize expensive steps (e.g. opening external connections), you can define the method static void init(ExaMetadata). This will be executed once at the beginning by each virtual machine. For deinitialization purposes, the method static void cleanup(ExaMetadata) exists, which is called once for each virtual machine at the end of execution.
Data iterator (next(), emit(), size() and reset()): The following methods are provided in the class ExaIterator:
• next()
• emit(Object... values)
• reset()
• size()
• getInteger(String name) and getInteger(int column)
• getLong(String name) and getLong(int column)
• getBigDecimal(String name) and getBigDecimal(int column)
• getDouble(String name) and getDouble(int column)
• getString(String name) and getString(int column)
• getBoolean(String name) and getBoolean(int column)
• getDate(String name) and getDate(int column)
• getTimestamp(String name) and getTimestamp(int column)
For scripts having multiple input tuples per call (keyword SET), you can iterate through that data via
the method next(). Initially, the iterator points to the first input row. For iterating you can use a
while(true) loop which is aborted with if (!ctx.next()) break;.
If the input data is empty, then the run() method will not be called, and similar to aggregate functions the NULL value is returned as the result (e.g. SELECT MAX(x) FROM t WHERE false).
Additionally, there is a reset() method which resets the iterator to the first input element. Hereby
you can do multiple iterations through the data, if this is necessary for your algorithm.
The method size() returns the number of input values.
You can return multiple output tuples per call (keyword EMITS) via the method emit(). The
method expects as many parameters as output columns were defined. In the case of dynamic output
parameters, it is handy in Java to use an Object Array (Example: iter.emit(new Object[]
{1,"a"})).
3.6. UDF scripts
Parameters: Note that the internal Java data types and the database SQL types are not identical. Therefore casts must be done for the input and output data. The input data can be accessed via get() methods, e.g. ctx.getString("url").
You can use a dynamic number of parameters via the notation (...), e.g. CREATE JAVA SCALAR SCRIPT my_script (...). The parameters can then be accessed through an index, e.g. ctx.getString(0) for the first parameter. The number of parameters and their data types (both determined during the call of the script) are part of the metadata.
Metadata: The following metadata can be accessed via the object ExaMetadata:
Chapter 3. Concepts
• ExaCompilationException
• ExaDataTypeException
• ExaIterationException
Import of other scripts: Besides the importScript() method of class ExaMetadata (see above), other scripts can easily be imported via the keyword %import (e.g. %import OTHER_SCRIPT;). Afterwards that script is accessible in the namespace (e.g. OTHER_SCRIPT.my_method()).
Accessing connection definitions: The data that has been specified when defining connections with CREATE CONNECTION is available in Java UDF scripts via the method ExaMetadata.getConnection(String connectionName). The result is a Java object that implements the interface com.exasol.ExaConnectionInformation, which features the following methods:
• String getAddress(): the part of the connection definition that followed the TO keyword in the CREATE CONNECTION command.
• String getUser(): the part of the connection definition that followed the USER keyword in the CREATE CONNECTION command.
• String getPassword(): the part of the connection definition that followed the IDENTIFIED BY keyword in the CREATE CONNECTION command.
Integrate your own JAR packages: Please see Section 3.6.5, “Expanding script languages using BucketFS”.
JVM Options: To enable the tuning of script performance depending on its memory requirements, the %jvmoption keyword can be used to specify the following Java VM options:
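For example, a %jvmoption directive matching the values described in the sentence below could look like this (a sketch, assuming the standard JVM flags -Xms, -Xmx and -Xss):

```
%jvmoption -Xms128m -Xmx1024m -Xss512k;
```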
This sets the initial Java heap size to 128 MB, maximum heap size to 1024 MB, and thread stack size
to 512 kB.
Please note that if multiple values are given for a single option (e.g., resulting from the import of another script), the last value will be used.
Dynamic output parameters callback function getDefaultOutputColumns(ExaMetadata exa): If the UDF script was defined with dynamic output parameters and the output parameters cannot be determined (via specifying EMITS in the query or via INSERT INTO SELECT), the database calls the method static String getDefaultOutputColumns(ExaMetadata exa) which you can implement. The expected return value is a String with the names and types of the output columns, e.g. "a int, b varchar(100)". See Dynamic input and output parameters for an explanation when this method is called, including examples.
User defined import callback function generateSqlForImportSpec() (see Section 3.4.4, “User-defined IMPORT using UDFs”): To support a user defined import you can implement the callback method public static String generateSqlForImportSpec(ExaMetadata meta, ExaImportSpecification importSpec). Please see also Dynamic input and output parameters and the IMPORT statement for the syntax. The parameter importSpec contains all information about the executed IMPORT FROM SCRIPT statement. The function has to generate and return a SELECT SQL statement which will retrieve the data to be imported.
importSpec is an object with the following fields:
• Map<String, String> getParameters(): parameters specified in the IMPORT statement.
• boolean isSubselect(): true if the IMPORT is used inside a SELECT statement and not inside an IMPORT INTO table statement.
• List<String> getSubselectColumnNames(): if isSubselect() is true and the user specified the target column names and types, this returns the names of all specified columns.
• List<String> getSubselectColumnTypes(): if isSubselect() is true and the user specified the target column names and types, this returns the types of all specified columns. The types are returned in SQL format (e.g. "VARCHAR(100)").
• boolean hasConnectionName(): returns true if a connection was specified. The UDF script can then obtain the connection information via ExaMetadata.getConnection(name).
• String getConnectionName(): if hasConnectionName() is true, this returns the name of the specified connection.
• boolean hasConnectionInformation(): returns true if connection information was provided. The UDF script can then obtain the connection information via getConnectionInformation().
• ExaConnectionInformation getConnectionInformation(): if hasConnectionInformation() is true, this returns the connection information provided by the user. See above for the definition of ExaConnectionInformation.
User defined export callback function generateSqlForExportSpec() (see Section 3.4.5, “User-defined EXPORT using UDFs”): To support a user defined export you can implement the callback method public static String generateSqlForExportSpec(ExaMetadata meta, ExaExportSpecification export_spec). Please see also Dynamic input and output parameters and the EXPORT statement for the syntax. The parameter export_spec contains all information about the executed EXPORT INTO SCRIPT statement. The function has to generate and return a SELECT SQL statement which will generate the data to be exported. The FROM part of that string can be a dummy table (e.g. DUAL) since the export command is aware which table should be exported. But it has to be specified to be able to compile the SQL string successfully.
export_spec is an object with the following fields:
Example:

/*
    This example loads from a webserver
    and processes the following file goalies.xml:

    <users>
        <user active="1">
            <first_name>Manuel</first_name>
            <last_name>Neuer</last_name>
        </user>
        <user active="1">
            <first_name>Joe</first_name>
            <last_name>Hart</last_name>
        </user>
        <user active="0">
            <first_name>Oliver</first_name>
            <last_name>Kahn</last_name>
        </user>
    </users>
*/
import java.net.URL;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;

class PROCESS_USERS {
    static void run(ExaMetadata exa, ExaIterator ctx) throws Exception {
        URL url = new URL(ctx.getString("url"));
        URLConnection conn = url.openConnection();
        DocumentBuilder docBuilder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = docBuilder.parse(conn.getInputStream());
        NodeList nodes =
            doc.getDocumentElement().getElementsByTagName("user");
        for (int i = 0; i < nodes.getLength(); i++) {
            if (nodes.item(i).getNodeType() != Node.ELEMENT_NODE)
                continue;
            Element elem = (Element)nodes.item(i);
            if (!elem.getAttribute("active").equals("1"))
                continue;
            Node name = elem.getElementsByTagName("first_name").item(0);
            String firstName = name.getChildNodes().item(0).getNodeValue();
            name = elem.getElementsByTagName("last_name").item(0);
            String lastName = name.getChildNodes().item(0).getNodeValue();
            ctx.emit(firstName, lastName);
        }
    }
}
/
SELECT process_users ('http://www.my_valid_webserver/goalies.xml')
FROM DUAL;
FIRSTNAME LASTNAME
-------------------- --------------------
Manuel Neuer
Joe Hart
Python
Besides the following information, you can find additional details for Python in the official documentation (see http://www.python.org).
run() and cleanup() methods: The method run() is called for each input tuple (SCALAR) or each group (SET). Its parameter is a kind of execution context and provides access to the data and the iterator in case of a SET script.
To initialize expensive steps (e.g. opening external connections), you can write code outside the run() method, since this code is executed once at the beginning by each virtual machine. For deinitialization purposes, the method cleanup() exists which is called once for each virtual machine, at the end of the execution.
Parameters: Note that the internal Python data types and the database SQL types are not identical. Therefore casts must be done for the input and output data:

    DECIMAL(p,0)        int
    DECIMAL(p,s)        decimal.Decimal
    DOUBLE              float
    DATE                datetime.date
    TIMESTAMP           datetime.datetime
    BOOLEAN             bool
    VARCHAR and CHAR    unicode
    Other               Not supported
You can also use a dynamic number of parameters via the notation (...), e.g. CREATE PYTHON SCALAR SCRIPT my_script (...). The parameters can then be accessed through an index, e.g. data[0] for the first parameter. The number of parameters and their data types (both determined during the call of the script) are part of the metadata.
Metadata: The following metadata can be accessed via global variables:
Data iterator, next(), size() and reset(): For scripts having multiple input tuples per call (keyword SET), you can iterate through that data via the method next(), which is accessible through the context. Initially, the iterator points to the first input row. For iterating you can use a while True loop which is exited as soon as ctx.next() returns False.
If the input data is empty, then the run() method will not be called, and similar to aggregate functions the NULL value is returned as result (e.g. SELECT MAX(x) FROM t WHERE false).
Additionally, there exists a method reset() which resets the iterator to the first input element. This way you can iterate through the data multiple times, if this is necessary for your algorithm.
The method size() returns the number of input values.
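As an illustration, this iteration pattern can be sketched with a small stand-in context object (FakeContext and the input column name val are inventions for this sketch; in Exasol the context is created by the database and handed to run()):

```python
class FakeContext:
    """Stand-in for the Exasol SET-script context, for illustration only."""
    def __init__(self, rows):
        self._rows = rows
        self._pos = 0
        self.emitted = []

    @property
    def val(self):
        # In a real script you access the input column by its declared name.
        return self._rows[self._pos]

    def next(self):
        # Advance the iterator; return False once the group is exhausted.
        if self._pos + 1 >= len(self._rows):
            return False
        self._pos += 1
        return True

    def size(self):
        return len(self._rows)

    def emit(self, *values):
        self.emitted.append(values)

def run(ctx):
    # The iterator already points at the first row when run() is called.
    total = 0
    while True:
        total += ctx.val
        if not ctx.next():
            break
    ctx.emit(total, ctx.size())

ctx = FakeContext([1, 2, 3, 4])
run(ctx)
print(ctx.emitted)  # [(10, 4)]
```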
emit(): You can return multiple output tuples per call (keyword EMITS) via the method emit(). The method expects as many parameters as output columns were defined. In the case of dynamic output parameters, it is handy in Python to use a list object which can be unpacked using * (e.g. ctx.emit(*currentRow)).
Import of other scripts: Other scripts can be imported via the method exa.import_script(). The return value of this method must be assigned to a variable, representing the imported module.
Accessing connection definitions: The data that has been specified when defining connections with CREATE CONNECTION is available in Python UDF scripts via the method exa.get_connection("<connection_name>"). The result is a Python object with the following fields:
• address: the part of the connection definition that followed the TO keyword in the CREATE CONNECTION command.
• user: the part of the connection definition that followed the USER keyword in the CREATE CONNECTION command.
• password: the part of the connection definition that followed the IDENTIFIED BY keyword in the CREATE CONNECTION command.
Auxiliary libraries: The following libraries are provided which are not already part of the language:
Dynamic output parameters callback function default_output_columns(): If the UDF script was defined with dynamic output parameters and the output parameters cannot be determined (via specifying EMITS in the query or via INSERT INTO SELECT), the database calls the method default_output_columns() which you can implement. The expected return value is a String with the names and types of the output columns, e.g. "a int, b varchar(100)". See Dynamic input and output parameters for an explanation when this method is called, including examples.
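In its simplest form, such a callback just returns a fixed column definition; the names and types below are placeholders:

```python
def default_output_columns():
    # Fallback column definition used when the output columns cannot be
    # derived from the query; names and types are illustrative placeholders.
    return "a int, b varchar(100)"

print(default_output_columns())  # a int, b varchar(100)
```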
User defined import callback function generate_sql_for_import_spec() (see Section 3.4.4, “User-defined IMPORT using UDFs”): To support a user defined import you can implement the callback method generate_sql_for_import_spec(import_spec). Please see also Dynamic input and output parameters and the IMPORT statement for the syntax. The parameter import_spec contains all information about the executed IMPORT FROM SCRIPT statement. The function has to generate and return a SELECT SQL statement which will retrieve the data to be imported.
import_spec has the following fields:
User defined export callback function generate_sql_for_export_spec() (see Section 3.4.5, “User-defined EXPORT using UDFs”): To support a user defined export you can implement the callback method generate_sql_for_export_spec(export_spec). Please see also Dynamic input and output parameters and the EXPORT statement for the syntax. The parameter export_spec contains all information about the executed EXPORT INTO SCRIPT statement. The function has to generate and return a SELECT SQL statement which will generate the data to be exported. The FROM part of that string can be a dummy table (e.g. DUAL) since the export command is aware which table should be exported. But it has to be specified to be able to compile the SQL string successfully.
Example:

/*
    This example loads from a webserver
    and processes the file goalies.xml
    (same content as in the Java example above)
*/
from lxml import etree       # assumed imports: the listing's original
from urllib2 import urlopen  # import lines were lost to a page break

def run(ctx):
    content = etree.parse(urlopen(ctx.url))
    for user in content.findall('user/[@active="1"]'):
        ctx.emit(user.find('first_name').text, user.find('last_name').text)
/
FIRSTNAME LASTNAME
-------------------- --------------------
Manuel Neuer
Joe Hart
R
Besides the following information, you can find additional details for R in the official documentation (see http://www.r-project.org).
run() and cleanup() methods: The method run() is called for each input tuple (SCALAR) or each group (SET). Its parameter is a kind of execution context and provides access to the data and the iterator in case of a SET script.
To initialize expensive steps (e.g. opening external connections), you can write code outside the run() method, since this code is executed once at the beginning by each virtual machine. For deinitialization purposes, the method cleanup() exists which is called once for each virtual machine, at the end of the execution.
Parameters: Note that the internal R data types and the database SQL types are not identical. Therefore casts must be done for the input and output data.
You can also use a dynamic number of parameters via the notation (...), e.g. CREATE R SCALAR SCRIPT my_script (...). The parameters can then be accessed through an index. Please note that in this case a special notation is necessary, e.g. ctx[[1]]() for the first parameter. The number of parameters and their data types (both determined during the call of the script) are part of the metadata.
Metadata: The following metadata can be accessed via global variables:
Data iterator, next_row(), size() and reset(): For scripts having multiple input tuples per call (keyword SET) there are two ways to access the data of your group:
1. Iterate over the data, one record at a time.
2. Load all (or multiple) rows for the group into a vector in memory to run R operations on it.
To iterate over the data, use the method next_row(), which is accessible through the context. The function differs from the other languages because next is a reserved keyword in R. Initially, the iterator points to the first input row. That's why a repeat loop can ideally be used to iterate through the data (see examples). If the input data is empty, then the run() method will not be called, and similar to aggregate functions the NULL value is returned as result (e.g. SELECT MAX(x) FROM t WHERE false).
To access the data as a vector, you can call the method next_row with a parameter that specifies how many records you want to read into the vector (e.g. next_row(1000) or next_row(NA)). If you specify NA, all records of the group will be read into a single vector.
Please keep in mind that the vector will be held completely in memory. For very large groups this might exceed the memory limits; in that case you should fetch e.g. 1000 rows at a time until you have processed all records of the group.
To get the actual vector with data, you can access the context property with the name of your input parameter (e.g. ctx$input_word, see the example below). Note that you have to call the next_row function first in this case, before accessing the data.
Additionally, there exists a method reset() which resets the iterator to the first input element. This way you can iterate through the data multiple times, if this is necessary for your algorithm.
The method size() returns the number of input values.
emit(): You can return multiple output tuples per call (keyword EMITS) via the method emit(). The method expects as many parameters as output columns were defined. In the case of dynamic output parameters, it is handy to use a list and the do.call method like in the following example: do.call(ctx$emit, list(1,"a"))
For scripts of type RETURNS, vectors can be used to improve the performance (see examples below).
Import of other scripts: Other scripts can be imported via the method exa$import_script(). The return value of this method must be assigned to a variable, representing the imported module.
Accessing connection definitions: The data that has been specified when defining connections with CREATE CONNECTION is available in R UDF scripts via the method exa$get_connection("<connection_name>"). The result is an R list with the following entries:
• address: the part of the connection definition that followed the TO keyword in the CREATE CONNECTION command.
• user: the part of the connection definition that followed the USER keyword in the CREATE CONNECTION command.
• password: the part of the connection definition that followed the IDENTIFIED BY keyword in the CREATE CONNECTION command.
Auxiliary libraries: The following libraries are provided which are not already part of the language:
Dynamic output parameters callback function defaultOutputColumns(): If the UDF script was defined with dynamic output parameters and the output parameters cannot be determined (via specifying EMITS in the query or via INSERT INTO SELECT), the database calls the method defaultOutputColumns() which you can implement. The expected return value is a String with the names and types of the output columns, e.g. "a int, b varchar(100)". See Dynamic input and output parameters for an explanation when this method is called, including examples.
User defined import callback function generate_sql_for_import_spec() (see Section 3.4.4, “User-defined IMPORT using UDFs”): To support a user defined import you can implement the callback method generate_sql_for_import_spec(import_spec). Please see also Dynamic input and output parameters and the IMPORT statement for the syntax. The parameter import_spec contains all information about the executed IMPORT FROM SCRIPT statement. The function has to generate and return a SELECT SQL statement which will retrieve the data to be imported.
import_spec has the following fields:
User defined export callback function generate_sql_for_export_spec() (see Section 3.4.5, “User-defined EXPORT using UDFs”): To support a user defined export you can implement the callback method generate_sql_for_export_spec(export_spec). Please see also Dynamic input and output parameters and the EXPORT statement for the syntax. The parameter export_spec contains all information about the executed EXPORT INTO SCRIPT statement. The function has to generate and return a SELECT SQL statement which will generate the data to be exported. The FROM part of that string can be a dummy table (e.g. DUAL) since the export command is aware which table should be exported. But it has to be specified to be able to compile the SQL string successfully.
Example:

LAST_NAME R_UPPER
--------- -------
Smith     SMITH
Miller    MILLER
/*
    The following example loads from a webserver
    and processes the file goalies.xml
    (same content as in the Java example above)
*/

FIRSTNAME            LASTNAME
-------------------- --------------------
Manuel               Neuer
Joe                  Hart
3.6.5. Expanding script languages using BucketFS
The Exasol BucketFS file system has been developed for use cases where data should be stored synchronously and replicated across the cluster. In the following sections we will also explain how this concept can be used to extend script languages and even to install completely new script languages on the Exasol cluster.
If you are interested in a concrete practical data science use case, we recommend the following entry in our Solution
Center: https://www.exasol.com/support/browse/SOL-257
What is BucketFS?
The BucketFS file system is a synchronous file system which is available in the Exasol cluster. This means that
each cluster node can connect to this service (e.g. through the http interface) and will see exactly the same content.
The data is replicated locally on every server and automatically synchronized. Hence,
you shouldn't store large amounts of data there.
The data in BucketFS is not part of the database backups and has to be backed up manually
if necessary.
One BucketFS service contains any number of so-called buckets, and every bucket stores any number of files.
Every bucket can have different access privileges as we will explain later on. Folders are not supported directly,
but if you specify an upload path including folders, these will be created transparently if they do not exist yet. If
all files from a folder are deleted, the folder will be dropped automatically.
Writing data is an atomic operation. There are no locks on files, so the latest write operation will overwrite the file. In contrast to the database itself, BucketFS is a pure file-based system and has no transactional semantics.
If you follow the link of a BucketFS ID, you can create and configure any number of buckets within this BucketFS. Besides the bucket name, you have to specify read/write passwords and define whether the bucket should be publicly readable, i.e. accessible for everyone.
A default bucket already exists in the default BucketFS which contains the pre-installed script languages (Java, Python, R). However, for storing larger user data we highly recommend creating a separate BucketFS instance on a separate partition.
For accessing a bucket through http(s) the users r and w are always configured and are
associated with the configured read and write passwords.
In the following example the http client curl is used to list the existing buckets, upload the files file1 and
tar1.tgz into the bucket bucket1 and finally display the list of contained files in this bucket. The relevant
parameters for our example are the port of the BucketFS (1234), the name of the bucket (bucket1) and the
passwords (readpw and writepw).
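As an illustration of these HTTP calls, the corresponding requests can be sketched from Python as well (host, port, bucket name, passwords and file contents are placeholders; the requests are only constructed, not sent):

```python
import urllib.request

BASE = "http://192.168.0.10:1234"  # placeholder host and BucketFS port

# Listing the buckets of a BucketFS service is a plain GET on the root URL.
list_req = urllib.request.Request(BASE)

# Uploading file1 into bucket1 is an HTTP PUT, authenticated as the
# always-present write user "w" with the configured write password.
put_req = urllib.request.Request(
    BASE.replace("http://", "http://w:writepw@") + "/bucket1/file1",
    data=b"...file contents...",
    method="PUT",
)

print(list_req.get_method(), put_req.get_method())  # GET PUT
```

Reading works analogously with the user r and the read password.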
From scripts running on the Exasol cluster, you can simply access the files locally. You don't need to specify any IP address and can be sure that the data is read from the local node. The corresponding path for a bucket can be found in EXAoperation in the overview of a bucket.
The access control is organized by using a database CONNECTION object (see also CREATE CONNECTION),
because for the database, buckets look similar to normal external data sources. The connection contains the path
to the bucket and the read password. After granting that connection to someone using the GRANT command, the
bucket becomes visible/accessible for that user. If you want to allow all users to access a bucket, you can define
that bucket in EXAoperation as public.
Similar to external clients, write access from scripts is only possible via http(s), but you
still would have to be careful with the parallelism of script processes.
In the following example, a connection to a bucket is defined and granted. Afterwards, a script is created which
lists the files from a local path. You can see in the example that the equivalent local path for the previously created
bucket bucket1 is /buckets/bfsdefault/bucket1.
import subprocess

def run(c):
    p = None  # initialized so the finally block is safe if Popen fails
    try:
        p = subprocess.Popen('ls ' + c.my_path,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT,
                             close_fds=True,
                             shell=True)
        out, err = p.communicate()
        for line in out.strip().split('\n'):
            c.emit(line)
    finally:
        if p is not None:
            try: p.kill()
            except: pass
/
SELECT ls('/buckets/bfsdefault/bucket1');
FILES
---------------------
file1
tar1
SELECT ls('/buckets/bfsdefault/bucket1/tar1/');
FILES
---------------------
a
b
As you might have recognized in the example, archives (.zip, .tar, .tar.gz or .tgz) are always extracted for the script access on the local file system. From outside (e.g. via curl) you see the archive, while locally you can use the extracted files from within the scripts.
If you store archives (.zip, .tar, .tar.gz or .tgz) in BucketFS, both the original files and the extracted files are stored and therefore need twice the storage space.
If you want to work on an archive directly, you can avoid the extraction by renaming the file extension (e.g. .zipx instead of .zip).
The script language Lua is not expandable, because it is natively compiled into the Exasol
database software.
For Java, you can integrate .jar files in Exasol easily. You only have to save the file in a bucket and reference the corresponding path directly in your Java script.
If you for instance want to use Google's library to process telephone numbers (http://mavensearch.io/repo/com.googlecode.libphonenumber/libphonenumber/4.2), you could upload the file similarly to the examples above into the bucket named javalib. The corresponding local bucket path would look like the following: /buckets/bfsdefault/javalib/libphonenumber-4.2.jar.
In the script below you can see how this path is specified (via the %jar keyword) to be able to import the library.

%jar /buckets/bfsdefault/javalib/libphonenumber-4.2.jar;

import com.google.i18n.phonenumbers.PhoneNumberUtil;
import com.google.i18n.phonenumbers.NumberParseException;
import com.google.i18n.phonenumbers.Phonenumber.PhoneNumber;

class JPHONE {
    static String run(ExaMetadata exa, ExaIterator ctx) throws Exception {
        PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();
        try {
            PhoneNumber swissNumberProto = phoneUtil.parse(ctx.getString("num"),
                                                           "DE");
            return swissNumberProto.toString();
        } catch (NumberParseException e) {
            System.err.println("NumberParseException thrown: " + e.toString());
        }
        return "failed";
    }
}
/
Python libraries
Python libraries are often provided in the form of .whl files. You can easily integrate such libraries into Exasol by uploading the file to a bucket and extending the Python search path appropriately.
If you for instance want to use Google's library for processing phone numbers (https://pypi.python.org/pypi/phonenumbers), you could upload the file into the bucket named pylib. The corresponding local path would look like the following: /buckets/bfsdefault/pylib/phonenumbers-7.7.5-py2.py3-none-any.whl.
In the script below you can see how this path is specified to be able to import the library.
import sys
import glob
sys.path.extend(glob.glob('/buckets/bfsdefault/pylib/*'))
import phonenumbers

def run(c):
    return str(phonenumbers.parse(c.phone_number, None))
/
GHOST_BUSTERS
-----------------------------------------
Country Code: 1 National Number: 55552368
R packages
The installation of R packages is a bit more complex because they have to be compiled using the C compiler. To manage that, you can download the existing Exasol Linux container, compile the package in that container and upload the resulting package into BucketFS. Details about the Linux container are explained in the following chapter.
Again we want to use an existing package for processing phone numbers, this time from Steve Myles (https://cran.r-project.org/web/packages/phonenumber/index.html).
Within the Docker container, you can start R and install it:
# export R_LIBS="/r_pkg/"
# R
> install.packages('phonenumber', repos="http://cran.r-project.org")
Installing package into '/r_pkg'
(as 'lib' is unspecified)
trying URL 'http://cran.r-project.org/src/contrib/phonenumber_0.2.2.tar.gz'
Content type 'application/x-gzip' length 10516 bytes (10 KB)
==================================================
downloaded 10 KB
The first command sets the library path so that the package is installed into the subfolder r_pkg, which can be accessed outside the Docker container.
Afterwards, the resulting tgz archive is uploaded into the bucket named rlib:
In the following script you can see how the resulting local path (/buckets/bfsdefault/rlib/r_pkg) is
specified to be able to use the library.
How to create a language client is explained in the next chapter. In principle, the language has to be expanded by some small APIs which implement the communication between Exasol and the language. Afterwards the client is compiled for use in the pre-installed Exasol Linux container, so that it can be started within a secure process on the Exasol cluster.
Uploading and providing files through BucketFS has already been explained in the previous chapters.
The last step creates a link between the language client and Exasol's SQL language, namely via the CREATE SCRIPT command. This opens up many options to try out new versions of a language or completely new languages, and finally to replace them completely.
If you for instance plan to migrate from Python 2 to Python 3, you could upload a new client and link the alias
PYTHON temporarily to the new version via ALTER SESSION. After a thorough testing phase, you can finally
switch to the new version for all users via the ALTER SYSTEM statement.
It is also possible to use both language versions in parallel by defining two separate aliases (e.g. PYTHON2 and PYTHON3).
The main task of installing new languages is developing the script client. If you are interested in a certain language, you should first check whether the corresponding client has already been published in our open source repository (see https://github.com/exasol/script-languages). Otherwise we would be very happy if you contributed self-developed clients to our open source community.
A script client is based on a Linux container which is stored in BucketFS. The pre-installed script client for languages
Python, Java and R is located in the default bucket of the default BucketFS. Using the Linux container technology,
an encapsulated virtual machine is started in a safe environment whenever an SQL query contains script code. The
Linux container includes a complete Linux distribution and starts the corresponding script client for executing the
script code. In general, you could also upload your own Linux container and combine it with your script client.
The script client has to implement a certain protocol that controls the communication between the scripts and the
database. For that purpose, the established technologies ZeroMQ™ (see http://zeromq.org) and Google's Protocol
Buffers™ (https://github.com/google/protobuf) are used. Because of the length, the actual protocol definition is
not included in this user manual. For details, please have a look at our open source repository (see https://github.com/exasol/script-languages) where you'll find the following:
• Protocol specification
• Necessary files for building the Exasol Linux container (Docker configuration file and build script)
• Example implementations for further script clients (e.g. C++ and a native Python client that supports both Python
2 and Python 3)
After building and uploading a script client, you have to configure an alias within Exasol. The database then knows
where each script language has been installed.
You can change the script aliases via the commands ALTER SESSION and ALTER SYSTEM, either session-
wide or system-wide. This is handy especially for using several language versions in parallel, or migrating from
one version to another.
The current session and system parameters can be found in EXA_PARAMETERS. The scripts aliases are defined
via the parameter SCRIPT_LANGUAGES:
PARAMETER_NAME
---------------------------------------------------
PYTHON=builtin_python R=builtin_r JAVA=builtin_java
These values are not very meaningful since they are just internal macros to make that parameter compact and to
dynamically use the last installed version. Written out, the alias for Java would look like the following:
JAVA=localzmq+protobuf:///bfsdefault/default/EXAClusterOS/ScriptLanguages-2016-10-21/?lang=java#buckets/bfsdefault/default/EXASolution-6.0.0/exaudfclient
That alias means that for all CREATE JAVA SCRIPT statements, the Exasol database will use the script client
exaudfclient from the local path buckets/bfsdefault/default/EXASolution-6.0.0/,
started within the Exasol Linux container from the path bfsdefault/default/EXAClusterOS/ScriptLan-
guages-2016-10-21. The communication protocol used is localzmq+protobuf (the only supported
protocol so far).
For the three pre-installed languages (Python, R, Java), Exasol uses one single script client which evaluates the
parameter lang=java to differentiate between these languages. That's why the internal macro for the Python
alias looks nearly identical. Script clients implemented by users can of course define and evaluate such optional
parameters individually.
<alias>=localzmq+protobuf:///<path_to_linux-container>/[?<client_param_list>]#<path_to_client>
Maybe you have noticed in the Java alias that the path of the Linux container begins with the BucketFS path while
the client path carries the prefix buckets/. The reason is that the client is started in a secure environment
within the Linux container and may only access the existing (and visible) buckets. Access is granted only to the
embedded buckets (via mount), not to the server's underlying file system. The path /buckets/ can be
used read-only from within scripts, as described in the previous chapter.
You can define or rename any number of aliases. As mentioned before, we recommend testing such adjustments
in your own session first (ALTER SESSION) before making the change globally visible via ALTER SYSTEM.
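A sketch of such an adjustment (the MYPYTHON3 alias value is hypothetical and merely follows the alias format described above):

```sql
-- Test the adjusted parameter in the current session first
ALTER SESSION SET SCRIPT_LANGUAGES =
  'PYTHON=builtin_python R=builtin_r JAVA=builtin_java
   MYPYTHON3=localzmq+protobuf:///bfsdefault/default/my_container/?lang=python#buckets/bfsdefault/default/my_container/exaudfclient';
```

Once the alias behaves as expected, the same value can be set system-wide via ALTER SYSTEM.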
3.7. Virtual schemas
Please note that virtual schemas are part of the Advanced Edition of Exasol.
After creating a virtual schema, its tables can be used in SQL queries and even combined with persistent
tables stored directly in Exasol, or with virtual tables from other virtual schemas. The SQL optimizer internally
translates the virtual objects into connections to the underlying systems and implicitly transfers the necessary data.
Exasol tries to push SQL conditions down to the data sources to ensure minimal data transfer and optimal
performance.
This concept therefore creates a kind of logical view on top of several data sources, which could be databases or
other data services. You can either implement a harmonized access layer for your reporting tools, or
use this technology for agile and flexible ETL processing, since nothing needs to change in Exasol
if you change or extend the objects in the underlying system.
The following basic example shows you how easy it is to create and use a virtual schema by using our JDBC adapter
to connect Exasol with a Hive system.
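The example itself could look like the following sketch; the property names follow our published JDBC adapter, but the exact names, the connection string and the credentials are assumptions and depend on the adapter version:

```sql
CREATE VIRTUAL SCHEMA hive
  USING adapter.jdbc_adapter
  WITH SQL_DIALECT       = 'HIVE'
       CONNECTION_STRING = 'jdbc:hive2://hive-host:10000/default'
       SCHEMA_NAME       = 'default'
       USERNAME          = 'hive-usr'
       PASSWORD          = 'hive-pwd';

-- The virtual tables can then be queried like normal tables
SELECT count(*) FROM hive.users;
```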
CREATE VIRTUAL SCHEMA Creating a virtual schema (see also CREATE SCHEMA)
DROP VIRTUAL SCHEMA Deleting a virtual schema and all contained virtual tables (see also DROP
SCHEMA)
ALTER VIRTUAL SCHEMA Adjust the properties of a virtual schema or refresh the metadata using the
REFRESH option (see also ALTER SCHEMA)
EXPLAIN VIRTUAL Useful to analyze which resulting queries for external systems are created by
the Exasol compiler (see also EXPLAIN VIRTUAL)
Instead of shipping just a certain number of supported connectors (so-called adapters) to other technologies, we
decided to provide users an open, extensible framework where the connectivity logic is shared as open source. By
that, you can easily use existing adapters, optimize them for your needs, or even create additional adapters for all
kinds of data sources on your own, without having to wait for a new release from Exasol.
In the following chapters, we will explain how that framework works and where you'll find the existing adapters.
Read metadata Receive information about the objects included in the schema (tables, columns, types)
and define the logic for mapping the data types of the source system to the Exasol data
types.
Push down query Push down parts of the Exasol SQL query into an appropriate query for the external system.
The adapter defines what kind of logic Exasol can push down (e.g. filters or certain
functions). The Exasol optimizer will then try to push down as much as possible and
execute the rest of the query locally on the Exasol cluster.
Adapters are similar to UDF scripts (see also Section 3.6, “UDF scripts” and the details in the next chapter). They
can be implemented in one of the supported programming languages, for example Java or Python, and they can
access the same metadata which is available within UDF scripts. To install an adapter, you simply download and
execute the SQL scripts which create the adapter script in one of your normal schemas.
The existing open source adapters provided by Exasol can be found in our GitHub repos-
itory: https://www.github.com/exasol/virtual-schemas
A very generic implementation is our JDBC adapter, with which you can integrate nearly
any data source providing a Linux JDBC driver. For some database systems, an appropriate
dialect has already been implemented to push as much processing as possible down to the
underlying system. Please note that for using this JDBC adapter, you have to upload the
corresponding JDBC driver to BucketFS for access from adapter scripts (see also Section 3.6.4,
“The synchronous cluster file system BucketFS”). Additionally, the driver has to be
installed via EXAoperation, because the JDBC adapter executes an implicit IMPORT
command.
Afterwards you can create virtual schemas by providing certain properties which are required for the adapter script
(see the initial example). These properties typically define the information necessary to establish a connection to the
external system. In the example, this was the JDBC connection string and the credentials.
But properties can be flexibly defined and hence can contain all kinds of auxiliary data controlling how the
data source is used. If you implement or adjust your own adapter scripts, you can define your own properties
and use them appropriately.
The list of specified properties of a specific virtual schema can be seen in the system table EXA_ALL_VIRTU-
AL_SCHEMA_PROPERTIES. After creating the schema, you can adjust these properties using the SQL command
ALTER SCHEMA.
After the virtual schema has been created in the described way, you can use the contained tables in the same way as
normal tables in Exasol, and even combine them in SQL queries. And if you want to use this technology just
for simple ETL jobs, you can of course simply materialize a query on the virtual tables:
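Such a materialization could be sketched as follows (schema, table and column names are hypothetical):

```sql
CREATE TABLE local_users AS
  SELECT name, city FROM hive.users WHERE active = TRUE;
```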
Internally, this works similarly to views, since the privilege check is executed in the name of the script owner.
The details, i.e. the access to adapter scripts and the credentials for the external system, are thereby completely
encapsulated.
Views Instead of granting direct access to the virtual schema you can also create
views on that data and provide indirect access for certain users.
Logic within the adapter script It is possible to solve this requirement directly within the adapter script.
E.g. in our published JDBC adapter, there exists the parameter
TABLE_FILTER through which you can define a list of tables which should
be visible (see https://www.github.com/exasol). If this virtual schema
property is not defined, then all available tables are made visible.
In this case you only define the connection name but not the actual credentials:
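A sketch of this variant, again using the property names of the published JDBC adapter (names and values are assumptions):

```sql
CREATE CONNECTION hive_conn
  TO 'jdbc:hive2://hive-host:10000/default'
  USER 'hive-usr' IDENTIFIED BY 'hive-pwd';

CREATE VIRTUAL SCHEMA hive
  USING adapter.jdbc_adapter
  WITH SQL_DIALECT     = 'HIVE'
       CONNECTION_NAME = 'HIVE_CONN'
       SCHEMA_NAME     = 'default';
```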
Of course, the adapter script has to support this and extract the credentials from the connection to be able to establish
a connection to the external system.
The administrator of the virtual schema needs the following privileges to enable the encapsulation via connections:
1. You have to grant the connection itself to the administrator (GRANT CONNECTION), because in most cases,
an adapter script will internally generate an IMPORT statement in the two-phased execution (see also Sec-
tion 3.7.6, “Details for experts”) which uses this connection like standard IMPORT statements do. Since
IMPORT is an internal Exasol command and not a public script, the credentials cannot be extracted at all.
2. In most cases you also need access to the connection details (actually the user and password), because the
adapter script needs a direct connection to read the metadata from the external system for commands such as
CREATE and REFRESH. For that purpose, the special ACCESS privilege has been introduced, given how
critical these data-protection-relevant connection details are. With the statement GRANT ACCESS ON
CONNECTION [FOR SCRIPT], you can also limit that access to a specific script (FOR SCRIPT
clause) and ensure that the administrator cannot access that data himself (e.g. by creating a new script which
extracts and simply returns the credentials). Of course, that user should only be allowed to execute, but not
alter, the script by any means.
In the example below, the administrator gets the appropriate privileges to create a virtual schema by using the adapter
script (jdbc_adapter) and a certain connection (hive_conn).
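The example is not reproduced here; the grants could be sketched roughly as follows (the user name is hypothetical, the object names are taken from the text above, and the exact privilege syntax may vary by version):

```sql
GRANT CREATE VIRTUAL SCHEMA TO vs_admin;
GRANT EXECUTE ON adapter.jdbc_adapter TO vs_admin;
GRANT CONNECTION hive_conn TO vs_admin;
GRANT ACCESS ON CONNECTION hive_conn
  FOR SCRIPT adapter.jdbc_adapter TO vs_admin;
```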
3.7.5. Metadata
You'll find detailed information about all created adapter scripts and virtual schemas in the following system tables:
Virtual schemas
• EXA_VIRTUAL_SCHEMAS
• EXA_ALL_VIRTUAL_SCHEMA_PROPERTIES
• EXA_USER_VIRTUAL_SCHEMA_PROPERTIES
• EXA_DBA_VIRTUAL_SCHEMA_PROPERTIES
• EXA_ALL_VIRTUAL_TABLES
• EXA_DBA_VIRTUAL_TABLES
• EXA_USER_VIRTUAL_TABLES
• EXA_ALL_VIRTUAL_COLUMNS
• EXA_DBA_VIRTUAL_COLUMNS
• EXA_USER_VIRTUAL_COLUMNS
Adapter scripts
• EXA_ALL_SCRIPTS (SCRIPT_TYPE='ADAPTER')
• EXA_DBA_SCRIPTS (SCRIPT_TYPE='ADAPTER')
• EXA_USER_SCRIPTS (SCRIPT_TYPE='ADAPTER')
Connections
• EXA_ALL_CONNECTIONS
• EXA_DBA_CONNECTIONS
• EXA_SESSION_CONNECTIONS
• EXA_DBA_CONNECTION_PRIVS
• EXA_USER_CONNECTION_PRIVS
But if you want to know more about the underlying concepts, or if you even plan to create or adjust your own adapter,
then please read on.
Every time you access data of a virtual schema, a container of the corresponding language, e.g. a JVM or a Python
container, is started on one node of the cluster. Inside this container, the code of the adapter script is loaded.
Exasol interacts with the adapter script using a simple request-response protocol encoded in JSON. The database
takes the active part, sending the requests by invoking a callback method.
In case of Python, this method is called adapter_call per convention, but this can vary.
1. Exasol determines that a virtual table is involved, looks up the corresponding adapter script and starts the
language container on one single node in the Exasol cluster.
2. Exasol sends a request to the adapter, asking for the capabilities of the adapter.
3. The adapter returns a response including the supported capabilities, for example whether it supports specific
WHERE clause filters or specific scalar functions.
4. Exasol sends an appropriate pushdown request by considering the specific adapter capabilities. For example,
the information for column projections (in the example above, only the single column name is necessary) or
filter conditions is included.
5. The adapter processes this request and sends back a certain SQL query in Exasol syntax which will be executed
afterwards. This query is typically an IMPORT statement, or a SELECT statement including a row-emitting
UDF script which takes care of the data processing.
The example above could be transformed into these two alternatives (IMPORT and SELECT):
SELECT name FROM ( IMPORT FROM JDBC AT ... STATEMENT
  'SELECT name FROM remoteschema.users WHERE name LIKE ''A%''' );
In the first alternative, the adapter can handle filter conditions and creates an IMPORT command including
a statement which is sent to the external system. In the second alternative, a UDF script is used with two
parameters handing over the address of the data source and the column projection, but without any logic
for the filter condition. The filter would then be processed by Exasol rather than by the data source.
Please be aware that the examples show the fully transformed query while only the inner statements are created
by the adapter.
6. The received data is directly integrated into the overall query execution of Exasol.
To understand the full API of the adapter scripts, we refer to our open source repository (https://www.github.com/exasol)
and our existing adapters. They include API documentation, and it is far easier to read these concrete examples
than to explain all details in this user manual.
If you want to enhance existing adapters or create completely new ones, we recommend using EXPLAIN VIRTUAL
to easily analyze what exactly the adapter pushes down to the underlying system.
3.8. SQL Preprocessor
If you want to use the SQL Preprocessor, please read this chapter carefully, since the
impacts on your database system could be extensive.
The SQL Preprocessor is deactivated by default. Via the statements ALTER SESSION and ALTER SYSTEM,
you can define, session-wide or system-wide, a script which is responsible for preprocessing all SQL commands.
Before an SQL statement is passed to the actual database compiler, the preprocessor performs a kind of text
transformation. Within the script, you can get and set the original text and manipulate it using our auxiliary library
(see next section). Details on the scripting language can be found in Section 3.5, “Scripting”.
The SQL Preprocessor is a powerful tool to flexibly extend Exasol's SQL language. But you should also be
very careful before activating such an SQL manipulation system-wide. In the worst case, not a single SQL statement
will work anymore. But you can always deactivate the preprocessing via ALTER SESSION and ALTER SYSTEM
(by setting the parameter SQL_PREPROCESSOR_SCRIPT to the NULL value), because these statements are
deliberately excluded from the preprocessing. For data security reasons, we also excluded all statements which include
passwords (CREATE USER, ALTER USER, CREATE CONNECTION, ALTER CONNECTION, IMPORT,
EXPORT if the IDENTIFIED BY clause was specified).
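Activation and deactivation could be sketched like this (the script name is hypothetical; the exact identifier form of the value may vary):

```sql
ALTER SESSION SET sql_preprocessor_script = myschema.my_preprocessor;
-- deactivate the preprocessing again
ALTER SESSION SET sql_preprocessor_script = NULL;
```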
In the auditing table EXA_DBA_AUDIT_SQL, a separate entry for the execution of the preprocessor script is
added (EXECUTE SCRIPT and the original text within a comment). The executed transformed SQL statement is
listed in another entry.
Overview
Split into tokens • tokenize()
Details
• sqlparsing.tokenize(sqlstring)
Splits an input string into an array of strings which correspond to the tokens recognized by the database compiler.
If you concatenate these tokens, you will get exactly the original input string (including upper/lower case, line
breaks, whitespaces, etc.). Hence, the equation table.concat(tokens)==sqlstring is valid.
OUTPUT
---------------------
SELECT
dummy
FROM
dual
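The non-whitespace tokens shown above could be produced by a Lua sketch like the following inside a script; the debug function output() is an assumption and only available when script output is activated:

```lua
local sqltext = 'SELECT dummy FROM dual'
local tokens  = sqlparsing.tokenize(sqltext)
-- Concatenating all tokens reproduces the input exactly (see above)
assert(table.concat(tokens) == sqltext)
for i = 1, #tokens do
    -- skip whitespace and comment tokens
    if not sqlparsing.iswhitespaceorcomment(tokens[i]) then
        output(tokens[i])
    end
end
```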
• sqlparsing.iscomment(tokenstring)
• sqlparsing.iswhitespace(tokenstring)
• sqlparsing.iswhitespaceorcomment(tokenstring)
Returns whether the input string is a whitespace or comment token. These functions can be useful as the ignore
function for find(), because they filter all tokens that are irrelevant according to the SQL standard.
• sqlparsing.isidentifier(tokenstring)
• sqlparsing.iskeyword(tokenstring)
Returns whether the input string is an SQL keyword token (e.g. SELECT, FROM, TABLE). The functions
isidentifier() and iskeyword() both return true for non-reserved keywords. Hence, you are also
able to identify non-reserved keywords.
• sqlparsing.isstringliteral(tokenstring)
• sqlparsing.isnumericliteral(tokenstring)
• sqlparsing.isany(tokenstring)
Always returns true. This can e.g. be useful if you want to find any first relevant token (as match function
within the method find()).
• sqlparsing.normalize(tokenstring)
Returns a normalized string for similar representations (e.g. upper/lower case identifiers), on the basis of the
following rules:
• Regular identifiers are transformed into upper-case letters, e.g. dual -> DUAL
• Keywords are transformed in upper-case letters, e.g. From -> FROM
• Whitespace tokens of any size are replaced by a single whitespace
• In numerical literals, an optional lower-case 'e' is replaced by 'E', e.g. 1.2e34 -> 1.2E34
• sqlparsing.find(tokens, startTokenNr, searchForward, searchSameLevel, ignoreFunction, match1, ..., matchN)
Searches the token list, starting from position startTokenNr, forward or backward (searchForward)
and optionally only within the current level of brackets (searchSameLevel), for the directly successive
sequence of tokens matched by the parameters match1, ..., matchN. During that search process, all tokens
matched by the function ignoreFunction will not be considered by the match functions.
If the searched token sequence is found, then an array of size N is returned (in case of N match elements) whose
X-th entry contains the position of the token within the token list which was matched by matchX. If the token
sequence was not found, the function returns nil.
Details on parameters:
searchForward Defines whether the search should be applied forward (true) or backward (false).
This affects only the direction in which the search process moves across the list;
the match functions always search forward.
That means that if you e.g. search the token sequence KEYWORD, IDENTIFIER,
within the token list 'select' 'abc' 'from' 'dual', and start from position 3, then 'from'
'dual' will be matched and not 'from' 'abc', even when searching backward. If you start
your search at position 2, then the backward search will return 'select' 'abc', and the
forward search will return 'from' 'dual'.
searchSameLevel Defines whether the search should be limited to the current level of brackets (true)
or also beyond (false). This applies only to the match of the first token of the
sequence. Subsequent tokens can also be located in more inner bracket levels. That
means that the search for the token sequence '=' '(' 'SELECT' is also possible if it is
constrained to the current level, although the 'SELECT' is located in the next inner
bracket level. The option searchSameLevel is especially useful for finding the
corresponding closing bracket, e.g. of a subquery.
Example: Search the closing bracket within the token sequence 'SELECT' 't1.x' '+' '('
'SELECT' 'min' '(' 'y' ')' 'FROM' 't2' ')' 'FROM' 't1' which corresponds to the bracket
ignoreFunction Here, a function of type function(string)->bool is expected. Tokens for which
ignoreFunction returns true will be ignored by the match functions.
In particular, this means you can specify token types which may occur within
the sequence without breaking the match. In many cases, the function
iswhitespaceorcomment is useful for this purpose.
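A small sketch of how these parameters play together (the call form follows the usage in the examples below; the statement text is hypothetical):

```lua
local tokens = sqlparsing.tokenize('SELECT dummy FROM dual')
-- search forward from token 1, across bracket levels, ignoring
-- whitespace and comments, for 'FROM' followed by an identifier
local pos = sqlparsing.find(tokens, 1, true, false,
                            sqlparsing.iswhitespaceorcomment,
                            'FROM', sqlparsing.isidentifier)
if pos ~= nil then
    -- pos[1] points at 'FROM', pos[2] at the identifier 'dual'
    local fromPos, idPos = pos[1], pos[2]
end
```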
• sqlparsing.getsqltext()
Returns the SQL statement text which is currently being processed.
This function is only available within the main SQL Preprocessor script.
• sqlparsing.setsqltext(string)
Sets the SQL statement text to a new value which will eventually be passed to the database compiler for exe-
cution.
This function is only available within the main SQL Preprocessor script.
• You should extensively test a preprocessor script in your own session before activating it system wide.
• The SQL processing should be implemented in separate auxiliary scripts and integrated into one main
script which is just a wrapper handing over the SQL text (e.g. sqlparsing.setsqltext(myscript.preprocess(sqlparsing.getsqltext()))). The reason for this approach is that the functions getsqltext() and setsqltext() are only available during preprocessing and not in normal script executions.
With this separation you can test the processing on several test SQL constructs (e.g. your own daily SQL history
stored in a table) before activating the main script as preprocessor script.
• Make sure that all necessary privileges are granted to execute the preprocessor script. It is recommended that you
start a test with a user without special rights. Otherwise, certain user groups could be blocked from executing any
SQL statements.
• The preprocessing should be as simple as possible. In particular, query() and pquery() should only be used
in exceptional cases if you activate the preprocessing globally, because all SQL queries will be slowed down,
and parallel access to the same tables increases the risk of transaction conflicts.
3.8.4. Examples
In the following you find some examples of preprocessor scripts, which demonstrate the functionality and power
of this feature.
'',
ifEnd[1]+1)
end
return sqltext
end
/
COL1
----
6
Example 2: ls command
In this example, the familiar Unix command ls is transferred to the database world. This command returns either
the list of all objects within a schema, or the list of all schemas if no schema is opened. Additionally, you can apply
case-insensitive filters like e.g. ls '%name%' to display all objects whose name contains the text 'name'.
sqlparsing.isany)
while (not(foundPos==nil) )
do
local foundToken = tokens[foundPos[1]]
if (sqlparsing.isstringliteral(foundToken)) then
addFilters[#addFilters+1] = "UPPER("..searchCol..") \
LIKE UPPER("..foundToken .. ")"
elseif (not (sqlparsing.normalize(foundToken) == ';')) then
error("only string literals allowed as arguments for ls,\
but found '"..foundToken.."'")
end
lastValid = foundPos[1]
foundPos = sqlparsing.find(tokens,
lastValid+1,
true,
false,
sqlparsing.iswhitespaceorcomment,
sqlparsing.isany)
end
if ( #addFilters > 0 ) then
local filterText = table.concat(addFilters, " OR ")
return returnText.." AND ("..filterText..")".." ORDER BY "..searchCol
else
return returnText.." ORDER BY "..searchCol
end
end
function processUnixCommands(input)
local tokens = sqlparsing.tokenize(input)
local findResult = sqlparsing.find(tokens,
1,
true,
false,
sqlparsing.iswhitespaceorcomment,
sqlparsing.isany)
if (findResult==nil) then
return input
end
local command = tokens[findResult[1]]
if (sqlparsing.normalize( command )=='LS') then
return processLS(input, tokens, findResult[1])
end
return input;
end
/
OBJECT_NAME OBJECT_TYPE
-------------------------- ---------------
ADDUNIXCOMMANDS SCRIPT
PREPROCESSWITHUNIXTOOLS SCRIPT
CLOSE SCHEMA;
LS;
SCHEMA_NAME
--------------------------
SCHEMA_1
SCHEMA_2
SCHEMA_3
SQL_PREPROCESSING
Example 3: ANY/ALL
ANY and ALL are SQL constructs which are currently not supported by Exasol. Using the following script,
you can add this functionality.
foundPositions = sqlparsing.find(tokens,
openBracketPos,
true,
true,
sqlparsing.iswhitespaceorcomment,
')');
if (foundPositions ~= nil) then
local closeBracketPos = foundPositions[1]
local operatorToken = tokens[operatorPos];
local anyOrAll = sqlparsing.normalize(tokens[anyAllPos]);
if (operatorToken=='<' or operatorToken=='<='
or operatorToken=='>' or operatorToken=='>=') then
-- now we have <|<=|>|>= ANY|ALL (SELECT <something> FROM
-- rebuild to <|<=|>|>= (SELECT MIN|MAX(<something>) FROM
local setfunction = 'MIN';
if ( ((anyOrAll=='ANY' or anyOrAll=='SOME') and
(operatorToken=='<' or operatorToken=='<=')
) or
(anyOrAll=='ALL' and (operatorToken=='>'
or operatorToken=='>=')
)
) then
setfunction = 'MAX';
end
tokens[anyAllPos] = '';
tokens[openBracketPos] =
'(SELECT ' .. setfunction .. '(anytab.anycol) FROM (';
tokens[closeBracketPos] = ') as anytab(anycol) )';
elseif (operatorToken=='=' and anyOrAll=='ALL') then
-- special rebuild for = ALL
-- rebuild to=(SELECT CASE WHEN COUNT(DISTINCT <something>)==1
-- THEN FIRST_VALUE(<something>) ELSE NULL END FROM
tokens[anyAllPos] = '';
tokens[openBracketPos] =
'(SELECT CASE WHEN COUNT(DISTINCT anytab.anycol) = 1 \
THEN FIRST_VALUE(anytab.anycol) ELSE NULL END FROM (';
tokens[closeBracketPos] = ') as anytab(anycol) )';
elseif ((operatorToken=='!=' or operatorToken=='<>')
and anyOrAll=='ALL') then
-- special rebuild for != ALL
-- rebuild to NOT IN
tokens[operatorPos] = ' NOT IN '
tokens[anyAllPos] = ''
elseif (operatorToken=='!=' and
(anyOrAll=='ANY' or anyOrAll=='SOME')) then
--special rebuild for != ANY, rebuild to
-- CASE WHEN (SELECT COUNT(DISTINCT <something>) FROM ...) == 1
-- THEN operand != (SELECT FIRST_VALUE(<something>) FROM ...)
-- ELSE operand IS NOT NULL END
--note: This case would normally require to determine the operand
-- which requires full understanding of a value expression
-- in SQL standard which is nearly impossible in
-- preprocessing (and very susceptible to errors)
-- so we evaluate the
-- SELECT COUNT(DISTINCT <something) FROM ...) == 1 here and
-- insert the correct expression
--
I
-------------------
1
2
3
4
3.9. Profiling
Nevertheless, there are situations where you want to know how much time certain execution parts of a query
take. Long-running queries can then be analyzed and possibly rewritten. Furthermore, this kind of information can
be provided to Exasol to continuously improve the query optimizer.
In Exasol, you can switch on the profiling feature on demand. Afterwards, the corresponding information is gathered
and provided to the customer via system tables. Further details are described in the following sections.
You can switch on the general profiling feature through the statements ALTER SESSION or ALTER SYSTEM
by setting the option PROFILE. Afterwards, the corresponding profiling data is gathered during the execution of
queries and is collected in the system tables EXA_USER_PROFILE_LAST_DAY and EXA_DBA_PROFILE_LAST_DAY.
Please note that this profiling data is part of the statistical system tables, which are committed
only periodically. Therefore, a certain delay occurs until the data is provided in the system tables. If you want
to analyze the profiling information directly after executing a query, you can use the command FLUSH STATISTICS
to enforce the COMMIT of the statistical data.
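A typical profiling session could be sketched as follows (the selected columns are an excerpt; exact column names per the system table documentation):

```sql
ALTER SESSION SET profile = 'ON';
-- ... execute the queries to be analyzed ...
ALTER SESSION SET profile = 'OFF';
FLUSH STATISTICS;
SELECT stmt_id, part_id, part_name, object_name, duration
  FROM exa_user_profile_last_day
 WHERE session_id = CURRENT_SESSION
 ORDER BY stmt_id, part_id;
```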
The profiling system tables list, for each statement of a session, certain information about the corresponding execution
parts, e.g. the execution time (DURATION), the used CPU time (CPU), the memory usage (MEM_PEAK and
TEMP_DB_RAM_PEAK) and the network communication (NET). Additionally, the number of processed rows
(OBJECT_ROWS), the number of resulting rows (OUT_ROWS) and further information about a part (PART_INFO)
are gathered. Please note that some parts do not contain the full set of data.
COMPILE / EXECUTE Compilation and execution of the statement (including query optimizing and
e.g. the automatic creation of table statistics)
SCAN Scan of a table
JOIN Join with a table
FULL JOIN Full outer join to a table
OUTER JOIN Outer Join to a table
EXISTS EXISTS computation
GROUP BY Calculation of the GROUP BY aggregation
GROUPING SETS Calculation of the GROUPING SETS aggregation
SORT Sorting the data (ORDER BY, also in case of analytical functions)
ANALYTICAL FUNCTION Computation of analytical functions (without sorting of data)
CONNECT BY Computation of hierarchical queries
PREFERENCE PROCESSING Processing of Skyline queries (for details see Section 3.10, “Skyline”)
PUSHDOWN Pushdown SQL statement generated by the adapter for queries on virtual objects
QUERY CACHE RESULT Accessing the Query Cache
CREATE UNION Under certain circumstances, the optimizer can create a combined table out of
several tables connected by UNION ALL, and process it much faster
UNION TABLE This part is created for each individual UNION ALL optimized table (see also
part "CREATE UNION")
INSERT Inserting data (INSERT, MERGE or IMPORT)
3.9.3. Example
The following example shows what the profiling data looks like for a simple query:
-- run query
SELECT YEAR(o_orderdate) AS "YEAR", COUNT(*)
FROM orders
GROUP BY YEAR(o_orderdate)
ORDER BY 1 LIMIT 5;
YEAR COUNT(*)
----- -------------------
1992 227089
1993 226645
1994 227597
1995 228637
1996 228626
3.10. Skyline
3.10.1. Motivation
When optimal results for a specific question have to be found in big data volumes with many dimensions, several
severe problems arise:
1. Data flooding
Large data volumes can hardly be analyzed. When navigating through millions of data rows, you normally
sort by one dimension and look at the top N results. This way, you will nearly always miss the optimum.
2. Empty result
Filters are often used to simplify the problem of large data volumes. Mostly, however, those are too restrictive,
which leads to completely empty result sets. By iteratively adjusting the filters, people try to extract a controllable
data set. This procedure generally prevents finding the optimum, too.
When many dimensions are relevant, you can hardly find an optimum via normal SQL. By the use of metrics,
analysts try to find an adequate heuristic to weight the different attributes. But by simplifying the problem to
only one single number, a lot of information and correlation within the data is eliminated. The optimum is
also mostly missed by this strategy.
In summary, analysts often navigate iteratively through large data volumes by using diverse filters, aggregations
and metrics. This approach is time-consuming and produces hardly comprehensible results, whose relevance often
remains disputable. Instead of a real optimum for a certain question, only a kind of compromise is found which
disregards the complexity between the dimensions.
Instead of hard filters via the WHERE clause and metrics between the columns, the relevant dimensions are simply
specified in the PREFERRING clause. Exasol will then determine the actual optimum. The optimal set is the
set of non-dominated points in the search space, also known as the Pareto set. By definition, a point is dominated
by another one if it is inferior in all dimensions.
For better illustration, please imagine an optimization space with just two dimensions, for example the decision
for a car using the two attributes "high power" and "low price". A car A is consequently dominated by a car B
if B's price is lower and its power higher. If only one dimension is superior, but the other one is inferior,
then no obvious decision can be made.
In consequence, the Pareto set means in this case the set of cars with preferably high performance and low price.
Without Skyline, you can only try to find a reasonable metric by combining the performance and price in a formula,
or by limiting the results by certain price or performance ranges. But the actual optimal cars won't be identified
by that approach.
With Skyline, you can simply specify the two dimensions within the PREFERRING clause (PREFERRING HIGH
power PLUS LOW price), and Exasol will return the optimal combinations. Bear in mind that
the Skyline algorithm compares all rows of a table with all others. In contrast, with a simple metric the result
can be determined easily (by sorting a number), but the complexity of the correlations is totally eliminated. Skyline
hence gives you the possibility to actually consider the structure of the data and to find the truly optimal result set.
The advantages of Skyline were hopefully demonstrated by this simple two-dimensional example. But the
algorithm can also handle a large number of dimensions, which the human brain can hardly imagine any more.
Incidentally, the PREFERRING clause can be used in the SELECT, DELETE and UPDATE statements.
3.10.3. Example
To illustrate the power of Skyline, we consider the selection of the best funds in the market, a daily problem
for financial investors. To select good funds, one can use many attributes, for example performance, volatility,
investment fees, ratings, yearly costs, and many more.
To simplify the problem, we want to concentrate on the first two attributes. In reality, performance and volatility
have an inverse correlation: the more conservative a fund is, the lower typically its volatility, but also its
performance.
In the picture above, thousands of funds are plotted by their performance and volatility. The green points are the
result of the given Skyline query; these funds represent the optimal subset regarding the two dimensions. The
decision whether to select a more conservative or a more risky fund can then be made in a subsequent, subjective
step.
Even this simple example shows how much the problem can be reduced: out of thousands of funds, only the best
23 are extracted, and this subset is the actual optimum for the given problem.
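A Skyline query producing such a result could, as a sketch, look like this (table and column names are illustrative):

```sql
-- Return the Pareto-optimal funds regarding high performance and low volatility
SELECT fund_name, performance, volatility
FROM funds
PREFERRING HIGH performance PLUS LOW volatility;
```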
preferring_clause::=

    PREFERRING preference_term [ PARTITION BY expr [, expr]... ]

preference_term::=

      ( preference_term )
    | HIGH expr
    | LOW expr
    | boolean_expr
    | preference_term PLUS preference_term
    | preference_term PRIOR TO preference_term
    | INVERSE ( preference_term )
PARTITION BY: If you specify this option, the preferences are evaluated separately for each partition.
HIGH and LOW: Defines whether an expression should have high or low values. Note that numerical expressions
are expected here.
Boolean expressions: In the case of boolean expressions, elements are preferred for which the condition evaluates
to TRUE. The expression x>0 is therefore equivalent to HIGH (x>0); the latter expression is implicitly converted
into the numbers 0 and 1.
PLUS: Via the keyword PLUS, multiple expressions of equal importance can be specified.
PRIOR TO: With this clause you can nest two expressions hierarchically. The second term is only considered if
two elements have the same value for the first term.
INVERSE: By using the keyword INVERSE, you create the opposite/inverse preference expression. Hence, the
expression LOW price is equivalent to INVERSE(HIGH price).
The following, more complex example shows the selection of the best cars with nested expressions:
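For illustration, a nested preference using PRIOR TO and INVERSE could look like the following sketch (table and column names are hypothetical):

```sql
-- Hypothetical: prefer cheap price classes first; within the same price
-- class, prefer high power together with low (inverted high) mileage
SELECT *
FROM cars
PREFERRING LOW ROUND(price/1000)
           PRIOR TO (HIGH power PLUS INVERSE(HIGH mileage));
```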
Chapter 4. Clients and interfaces
4.1. EXAplus
EXAplus is a user interface for working with SQL statements in Exasol. It is implemented in Java and is available
as a graphical application and a simple console version, both on Windows and Linux.
4.1.1. Installation
MS Windows
EXAplus has been successfully tested on the following systems:
– Windows 10 (x86/x64)
– Windows 8.1 (x86/x64)
– Windows 7, Service Pack 1 (x86/x64)
– Windows Server 2012 R2 (x86/x64)
– Windows Server 2012 (x86/x64)
– Windows Server 2008 R2, Service Pack 1 (x86/x64)
– Windows Server 2008, Service Pack 2 (x86/x64)
To install EXAplus, please follow the instructions of the installation wizard which will be started when executing
the installation file.
Administrator rights and the Microsoft .NET Framework 4.0 Client Profile™ are required
for installing EXAplus.
• EXAplus requires a Java runtime environment for execution. To be able to use all features of EXAplus, we
recommend at least Java 7 with current updates.
It is further recommended to select support for additional languages during the Java installation. For correct
formatting of special Unicode characters (e.g. Japanese characters), a font capable of displaying those characters
must be installed on your Windows system.
• After you have successfully installed EXAplus, a corresponding start menu entry is added for starting the
graphical EXAplus version. You can also use EXAplus in your command-line interface (cmd.exe) by calling
exaplus64.exe or exaplus.exe (32-bit executable). If you do not want to type in the complete path to
EXAplus, choose the option Add EXAplus to path during the installation.
Linux/Unix
EXAplus has been successfully tested on the following systems:
• Java 7 or higher has to be installed and the program "java" must be included in the path.
For correct formatting of special unicode characters (e.g. Japanese characters), a font that is capable of displaying
those characters must be installed on your system.
• If you want to use Japanese characters and your input manager is Scim, we recommend installing at least version
1.4.7 of Scim. Additionally, Scim-Anthy (Japanese character support) should be installed in version 1.2.4 or
higher.
The EXAplus installation archive can be unpacked with the command tar -xzf <archive>. If the software requirements
above are met, EXAplus can then be started by invoking the startup script exaplusgui (graphical version) or exaplus
(command-line version). A separate installation is not necessary.
Database Browser: Shows the existing database objects like schemas, tables, scripts, users, etc. Double-clicking
an object opens its object information in the editor area. The context menu of an object (right-click) lists the
actions you can execute on it.
History: Overview of the recently executed commands. To save a query as a favorite, use the context menu in
the History (right-click on the SQL).
Favorites: Preferred or frequently used queries can be saved as favorites. You can import and export your
favorites via the context menu (right-click on a folder).
Templates: Lists numerous templates for SQL statements and functions, Lua functions and EXAplus commands.
This is a quick reference if you don't remember syntax details or e.g. the list of available string functions. Via
drag & drop you can copy the templates into your SQL editor.
SQL Editor: Working area to display, edit and execute SQL scripts. Object information is also opened here.
Result Area: Displays query results in table form and the log protocol of EXAplus.
Status Bar: Shows the current connection information and configuration, parts of which can be changed directly
(e.g. the autocommit mode or the limitation of result tables). You can also see the current memory allocation.
Most features of EXAplus are intuitive, but the following table lists some topics that facilitate daily work.
Connection profiles: If a user works with several databases or wants to connect to a database through multiple
user profiles, connection profiles are useful. They can be created and configured via the menu entry
EXAplus -> Connection Profiles.
Afterwards you can choose these profiles when clicking the connect icon or via the menu entry
EXAplus -> Connect To. If parts such as the password were not configured in the profile, you have to complete
them in the connection dialog. Once profiles have been created, the Custom connection dialog no longer
appears when EXAplus starts.
To automate the connection process completely, you can define a default connection which is then used when
EXAplus is started.
You can also use the created connections in console mode (command-line parameter -profile).
Multiple EXAplus instances: If a user wants to connect to multiple databases or open multiple sessions to a
single one, he can start multiple instances of EXAplus by starting the program multiple times or via the menu
entry EXAplus -> New Window.
You can also define the maximum number of rows that is displayed. This option limits the data volume sent
from the database to the client and can be useful especially for big result sets.
Result tables can be reordered by clicking on a column, and their content can be copied to external applications
like Excel via copy & paste.
Current schema In the status bar you can find and switch the current schema. Additionally, you can restrict
the list of displayed schemas in the database browser via the menu entry View -> Show Only
Current Schema.
Autocommit mode: The autocommit mode is displayed in the status bar and can also be switched there.
Drag&Drop: Via drag & drop you can copy SQL commands from the history directly into the editor area.
Equally, you can insert the schema-qualified name of a database object (e.g. MY_SCHEMA.MY_TABLE) from
the database browser into the editor area. Only for columns is the schema name omitted.
Topic Annotation
Autocompletion: By pressing the shortcut CTRL+SPACE, EXAplus tries to complete the current word in the
editor by considering metadata of the database (like column names of tables or table names in the current scope).
Additionally, information and examples are provided for e.g. functions.
At the top of the object info you find some useful icons, e.g. to automatically create the corresponding DDL
statements of an object (also available for whole schemas), drop the object, export a table or view, or edit views
or scripts. These and more actions can also be found in the context menu of a database browser object.
System monitoring By clicking the pie chart button on the tool bar, you can open a separate window showing
graphical usage statistics of the connected database. You can add more statistics, let graphs
stack upon each other or be displayed in a grid, and select the interval, type, and data group
for each graph. The graphs are updated about every 3 minutes.
Language settings: The preferred language can be switched via EXAplus -> Preferences.
Error behavior Via the menu entry SQL -> Error Handling you can specify the behavior in case of an error.
See also WHENEVER command for details.
Memory Usage: The bar at the bottom-right corner shows the memory allocation pool (heap) usage of the
current EXAplus process. A double-click hints the program to trigger a garbage collection.
Associate EXAplus with SQL files: If you enable this option during the installation, you can open SQL files in
EXAplus by double-clicking them (on Windows systems). For the encoding, the preferences last used when a
file was opened or saved apply.
Migration of EXAplus preferences: To migrate the preferences, e.g. to another computer, you can simply copy
the corresponding XML files from the EXAplus folder (exasol or .exasol) in your home directory. There
you will find the favorites (favorites.xml), the history (history*.xml), the connection profiles
(profiles.xml) and the general EXAplus preferences (exaplus.xml).
The XML files should only be copied between EXAplus instances of similar version.
In the background, EXAplus opens a second connection to the database which requests metadata for e.g. the
Database Browser or the autocompletion feature. Note that database changes therefore cannot be displayed
in the Database Browser until they are committed in the main connection.
After successfully connecting to the database an interactive session is started where the user can send SQL commands
(see Chapter 2, SQL reference) or EXAplus commands (see Section 4.1.4, “EXAplus-specific commands”) to the
database. Result tables are then displayed in text form.
Notes:
• You can exit EXAplus via the commands EXIT and QUIT
• All commands have to be finished by a semicolon (except view, function and script definitions which must be
finished by a / in a new line and except the commands EXIT and QUIT)
• The command history can be used via the arrow keys (up/down)
• Multiline commands can be aborted by using the shortcut <CTRL>-T
• By pressing <TAB> EXAplus tries to complete the input
Default: ON
-profile <profile name> Name of connection profile defined in <configDir>/profiles.xml
(changeable via EXAplus GUI or with profile handling parameters).
-k Use Kerberos based single sign-on.
Profile handling options
-lp Print a list of existing profiles and exit.
-dp <profile name> Delete a specified profile and exit.
-wp <profile name> Write a specified profile and exit. The profile is defined by connection
options.
File options
SQL files can be integrated via the parameters -f or -B or via the EXAplus commands start, @ and @@ (see
also Section 4.1.4, “EXAplus-specific commands”). EXAplus searches for the specified files first relative to the
working directory and then in the folders defined in the environment variable SQLPATH. If a file is not found,
EXAplus also tries appending an implicit .sql or .SQL extension. Like the variable PATH, SQLPATH contains
a list of folders, separated by ":" on Linux and by ";" on Windows.
-init <file> File which is initially executed after the start.
Default: exaplus.sql
-f <file> EXAplus executes that script and terminates.
-B <file> EXAplus executes that script in batch mode and terminates.
-encoding <encoding> Sets the character set for reading SQL scripts started with -f or -B. For
supported encodings see Appendix D, Supported Encodings for ETL processes
and EXAplus. By default the character set UTF8 is used.
By using the command SET ENCODING you can also change the character
set during a session. Note, however, that the change has no impact on files
that are already open.
Parameter Annotation
-- <args> SQL files can use arguments passed via the parameter "--" by evaluating
the variables &1, &2, etc.
--test.sql
SELECT * FROM &1;
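Assuming the script above is stored as test.sql, it could be invoked from within EXAplus like this (the table name is illustrative):

```sql
-- Pass "my_table" as argument &1 to the script
@test.sql my_table;
-- the executed statement then becomes: SELECT * FROM my_table;
```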
Default: UNLIMITED
-retryDelay <num> Minimum interval between two recovery attempts in seconds.
Default: 5
-closeOnConnectionLost Exits EXAplus after the loss of a connection that cannot be recovered.
Default: ON
Other options
-autocommit <ON|OFF|EXIT> Sets the autocommit mode. ON enables autocommit, OFF disables it. EXIT
causes a COMMIT when the program exits or disconnects.
Default: ON
-lang <EN|DE|JA> Defines the language of EXAplus messages.
lang=EN or DE HALF
lang=JA FULL
Default: 2000
-Q <num> Query timeout in seconds. A query is aborted if the timeout is exceeded.
Default: -1 (unlimited)
-autoCompletion <ON|OFF> If this function is enabled, the user can obtain proposals by pressing TAB.
When using scripts, this prediction aid will be automatically deactivated.
Default: ON
-pipe With this parameter you can use pipes on Linux/Unix systems (e.g. echo
"SELECT * FROM dual;" | exaplus -pipe -c ... -u ... -p ...).
Default: OFF
-sql <SQL statement> With this parameter you can execute a single SQL statement; EXAplus
quits afterwards.
Overview
Command Function
File and operating system commands
@ and START Loads a text file and executes the statements contained therein.
@@ Loads a text file and executes the statements contained therein. However, if
this statement is used in a SQL script, the search path begins in the folder in
which the SQL script is located.
HOST Performs an operating system command and then returns to EXAplus.
SET SPOOL ROW SEPARATOR Defines the row separator for the SPOOL command.
SPOOL Saves the input and output in EXAplus to a file.
Controlling EXAplus
BATCH Switches batch mode on or off.
CONNECT Establishes a new connection with Exasol.
DISCONNECT Disconnects the current connection with the database.
EXIT and QUIT Terminates all open connections and closes EXAplus.
PAUSE Issues some text in the console and waits for confirmation.
PROMPT Issues some text in the console.
SET AUTOCOMMIT Controls whether Exasol should automatically perform COMMIT statements.
SET AUTOCOMPLETION Switches auto-completion on or off.
SHOW Displays EXAplus settings.
TIMING Controls the built-in timer.
WHENEVER Defines the behavior of EXAplus in the event of errors.
Formatting
COLUMN Shows the formatting settings or amends them.
SET COLSEPARATOR Sets the string that separates two columns
SET ENCODING Selects a character set or outputs the current one.
SET ESCAPE Sets the escape character, which makes it possible to input special characters.
SET FEEDBACK Controls the output of confirmations.
SET HEADING Switches the output of column headings on or off.
SET LINESIZE Sets width of the output lines.
SET NULL Defines the string for displaying NULL values in tables.
SET NUMFORMAT Sets the formatting of numeric columns in tables.
SET PAGESIZE Sets how many lines of text there should be before column headings are repeated.
SET TIME Switches output of the current time of the client system at the input prompt
on or off.
SET TIMING Switches display of the time needed for execution of an SQL statement on or
off.
SET TRUNCATE HEADING Defines whether column headings are truncated or not.
SET VERBOSE Switches additional program information on or off.
Statements for handling variables
The user can define any number of variables, which remain effective for the duration of the EXAplus session or
until explicitly deleted with an appropriate statement. The value of a variable can be accessed in all statements
with &variable. Variables are always treated as strings.
ACCEPT Receives input from the user.
DEFINE Assigns a value to a variable.
SET DEFINE Sets the characters with which the user variables are initiated.
UNDEFINE Deletes a variable.
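A short sketch of how such variables are used (names are illustrative):

```sql
DEFINE tab=customers;    -- create a variable
SELECT * FROM &tab;      -- expands to: SELECT * FROM customers
UNDEFINE tab;            -- delete the variable again
```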
@ and START
Syntax
@ <file> [args];
Description
Loads a text file and executes the statements contained therein. @ and START are synonymous.
If no absolute path is specified, the file is searched for in relation to the working directory of EXAplus. Rather
than local paths, it is also possible to use http and ftp URLs. If the file cannot be opened at the first attempt, the
extension .sql is appended to the name and another search is made for the file.
A script can be given any number of arguments via command line parameter (Console mode), which can be called
from the script as &1, &2 ... The variable &0 contains the name of the script. These variables are only defined
during the lifetime of the script.
Example(s)
@test1.sql 2008 5;
@ftp://frank:swordfish@ftp.scripts/test2.sql;
@http://192.168.0.1/test3.sql;
@@
Syntax
@@ <file> [args];
Description
Similar to @ or START, however, the file is searched for in the path in which the called script is located (if the
call is made through a script, otherwise the working directory of EXAplus is used).
Example(s)
ACCEPT
Syntax
Description
Receives the value of the <variable> variable from the user as keyboard input. If the prompt parameter is
given, the specified <text> is output beforehand. The value passed via the <default> parameter is used if
the user simply presses the return key.
Example(s)
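Based on the description above, a hypothetical usage, assuming the parameter keywords PROMPT and DEFAULT, might be:

```sql
-- Ask the user for a value; fall back to 100 if return is pressed
ACCEPT maxrows PROMPT 'Maximum rows: ' DEFAULT 100;
```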
BATCH
Syntax
BATCH BEGIN|END|CLEAR;
Description
Switches batch mode on or off. In this mode, SQL statements are not executed immediately but bundled, which
can increase the execution speed.
BEGIN Switches to batch mode. From this point all entered SQL statements are saved and only executed
after BATCH END has been entered.
END Runs all SQL statements entered since the last BATCH BEGIN and exits batch mode.
If an exception is raised for a statement in batch mode, the following statements are
not executed and the status before the batch execution is restored. The single exception
is the COMMIT statement, which finishes a transaction and stores changes persistently.
Example(s)
BATCH BEGIN;
insert into t values(1,2,3);
insert into v values(4,5,6);
BATCH END;
COLUMN
Syntax
Description
Displays the formatting settings for the <name> column or changes the formatting options for the <name> column.
<command> can be a combination of one or more of the following options:
Format strings of numbers consist of the elements '9', '0', '.' (point) and
'EEEE'. The width of the column results from the length of the specified
format string.
9 At this position one digit is displayed, unless it is a leading zero
before or a trailing zero after the decimal point.
0 At this position one digit is always displayed, even if it is a leading
zero before or a trailing zero after the decimal point.
The following "COLUMN" SQL*Plus statements are not supported. However, for reasons of compatibility they
will not generate a syntax error:
• NEWLINE
• NEW_VALUE
• NOPRINT
• OLD_VALUE
• PRINT
• FOLD_AFTER
• FOLD_BEFORE
Example(s)
SQL_EXA> column A;
COLUMN A ON
FORMAT 9990
A B
-------- ------
0011.0 11
0044.0 44
0045.0 45
0019.0 19
0087.0 87
0099.0 99
0125.0 125
0033.0 33
0442.0 442
CONNECT
Syntax
Description
If a connection already exists, it is disconnected once the new connection has been established successfully. If
no password is specified, EXAplus requests it. If <connection string> is not specified, the information
of the last connection is used. A COMMIT is also performed if "SET AUTOCOMMIT EXIT" has been set.
Example(s)
CONN scott/tiger;
CONNECT scott/gondor@191.168.2.1:8563;
DEFINE
Syntax
DEFINE [<variable>[=<value>]];
Description
Assigns the string <value> to the variable <variable>. Single and double quotes have to be doubled as in
SQL strings. If the variable does not exist, it is created. If DEFINE is called with only the name of a variable,
the value of that variable is displayed. Calling DEFINE without parameters lists all variables and their assigned
values.
The dot is used as delimiter for variable names (e.g. after calling define v=t, the string
&v.c1 is replaced by tc1). Therefore, if you want to keep a literal dot, you have to
specify two dots (&v..c1 evaluates to t.c1).
Example(s)
define tablename=usernames;
define message='Action successfully completed.';
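The dot rule from the note above, as a sketch (schema and table names are illustrative):

```sql
DEFINE s=my_schema;
SELECT * FROM &s..my_table;  -- double dot: expands to my_schema.my_table
-- a single dot would be consumed as delimiter: &s.my_table -> my_schemamy_table
```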
DISCONNECT
Syntax
DISCONNECT;
Description
If EXAplus is connected to the database, this command terminates the current connection. A COMMIT will also
be performed if "SET AUTOCOMMIT EXIT" has been set. The command has no effect if there is no connection.
Example(s)
DISCONNECT;
EXIT and QUIT
Syntax
[EXIT|QUIT][;]
Description
Exit or Quit closes the connection with the database and quits EXAplus (not available for the GUI version). A
terminating semicolon is not needed for this.
Example(s)
exit;
quit;
HOST
Syntax
HOST <command>;
Description
Performs an operating system command on the client host and then returns to EXAplus. Single and double quotes
have to be doubled like in SQL strings.
It is not possible to use programs that expect input from the keyboard with the HOST
command.
Example(s)
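For example (the concrete command depends on the client operating system):

```sql
HOST ls;    -- Linux/Unix: list the current directory, then return to EXAplus
HOST dir;   -- Windows equivalent
```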
PAUSE
Syntax
PAUSE [<text>];
Description
Similar to PROMPT, but waits for the user to press the return key after the text is displayed. Single and double
quotes have to be doubled like in SQL strings.
Example(s)
PROMPT
Syntax
PROMPT [<text>];
Description
Prints <text> to the console. Single and double quotes have to be doubled like in SQL strings. If the parameter is
not specified, an empty line is output.
Example(s)
prompt Ready.;
SET AUTOCOMMIT
Syntax
SET AUTOCOMMIT ON|OFF|EXIT;
Description
ON After each SQL statement a COMMIT statement is executed automatically. This is preset by default.
OFF Automatic COMMIT statements are not executed.
EXIT A COMMIT statement is executed when EXAplus is exited.
The recommended setting for normal applications is "ON". See also Section 3.1,
“Transaction management ”.
Example(s)
SET AUTOCOMPLETION
Syntax
SET AUTOCOMPLETION ON|OFF;
Description
Switches auto-completion on or off (see also the command-line parameter -autoCompletion).
Example(s)
SET COLSEPARATOR
Syntax
Description
Sets the string that separates two columns. Preset to a space character.
Example(s)
A |B
-------------------|--------------------
11|Meier
SET DEFINE
Syntax
Description
Sets the character with which user variables are initiated. The default is the character "&". <C> must be one
single special character. Via ON and OFF you can activate and deactivate user variables.
Example(s)
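A sketch of switching the variable character (illustrative):

```sql
SET DEFINE %;        -- user variables are now introduced with %
SELECT * FROM %tab;  -- instead of &tab
SET DEFINE OFF;      -- deactivate user variables entirely
```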
SET ENCODING
Syntax
Description
Sets the encoding for the different files which EXAplus reads or writes. The default encoding for reading and
writing files is UTF8; it can also be changed via the command-line parameter -encoding. For all supported
encodings see Appendix D, Supported Encodings for ETL processes and EXAplus. When called without
arguments, the current settings are listed.
Example(s)
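For example (Latin1 is used here as an assumed encoding name; the valid names are listed in Appendix D):

```sql
SET ENCODING;         -- without arguments: show the current settings
SET ENCODING Latin1;  -- change the encoding for files opened from now on
```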
SET ESCAPE
Syntax
Description
Specifies the escape character, which makes it possible to input special characters like &. <C> must be one
single special character. You can activate/deactivate the escape character by using SET ESCAPE ON|OFF.
When a character is set, it is activated automatically. By default, the escape character is \, but it is deactivated.
Example(s)
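A sketch based on the description (the output text is illustrative):

```sql
SET ESCAPE \;        -- setting a character also activates it
PROMPT Tom \& Jerry; -- the escaped & is printed literally, not substituted
SET ESCAPE OFF;      -- deactivate the escape character again
```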
SET FEEDBACK
Syntax
Description
Controls the output of confirmations. This includes the output after statements such as "INSERT INTO" and the
display of row numbers in the output of a table. "OFF" suppresses all confirmations, "ON" switches all
confirmations on. If <num> is a numeric value greater than 0, tables with at least <num> rows have row numbers
displayed. The "SET VERBOSE OFF" statement and the command-line parameter "-q" set this setting to "OFF".
Example(s)
A B
------------------- --------------------
11 Meier
44 Schmitt
SET HEADING
Syntax
SET HEADING ON|OFF;
Description
Switches the output of column headings on or off.
Example(s)
SET LINESIZE
Syntax
Description
Sets the width of the output lines to "n" characters. If a table with wider lines is output, EXAplus adds line
breaks between two columns.
Example(s)
A
-------------------
B
--------------------
11
Meier
SET NULL
Syntax
Description
Defines the string for displaying NULL values in tables. Preset to an empty field. This setting is only used for
columns that have not had a different representation of NULL values defined with the COLUMN statement.
Example(s)
NULL
-----
EMPTY
SET NUMFORMAT
Syntax
Description
Sets the formatting of numeric columns in tables. The format string is the same as that of the COLUMN statement,
where a closer description of the possible formats can be found. This setting is only used for numeric columns
that have not had a different format defined with the COLUMN statement.
Example(s)
43
---------
43.00
SET PAGESIZE
Syntax
Description
Sets how many text lines there should be before column headings are repeated. Preset to "UNLIMITED", which
means that column headings are only displayed before the first line. If the HEADING setting has the value OFF,
no column headings are displayed.
Example(s)
A B
------------------- --------------------
11 Meier
44 Schmitt
45 Huber
19 Kunze
87 Mueller
99 Smith
125 Dooley
33 Xiang
A B
------------------- --------------------
442 Chevaux
SET SPOOL ROW SEPARATOR
Syntax
Description
Defines the row separator for the SPOOL command. The option AUTO (default) uses the row separator that is
common on the local system.
Example(s)
SET TIME
Syntax
Description
Switches output of the current time of the client system at the input prompt on or off.
Example(s)
SET TIMING
Syntax
Description
Switches display of the time needed for execution of a SQL statement on or off.
Example(s)
Timing element: 1
Elapsed: 00:00:00.313
SET TRUNCATE HEADING
Syntax
Description
Defines whether column headings are truncated to the corresponding column data type length. Preset to "ON".
Example(s)
SET VERBOSE
Syntax
Description
Switches additional program information on or off. Preset to ON. The parameter -q is used to set the value to OFF
at program startup.
Example(s)
SHOW
Syntax
SHOW [<var>];
Description
Displays the EXAplus settings (see below). If no setting is specified, all settings are displayed.
Example(s)
SQL_EXA> show;
AUTOCOMMIT = "ON"
AUTOCOMPLETION = "ON"
COLSEPARATOR = " "
DEFINE = "&"
ENCODING = "UTF-8"
ESCAPE = "OFF"
FEEDBACK = "ON"
HEADING = "ON"
LINESIZE = "200"
NULL = "null"
NUMFORMAT = "null"
PAGESIZE = "UNLIMITED"
SPOOL ROW SEPARATOR = "AUTO"
TIME = "OFF"
TIMING = "OFF"
TRUNCATE HEADING = "ON"
VERBOSE = "ON"
SPOOL
Syntax
SPOOL [<file>|OFF];
Description
If a filename is specified as a parameter, the file is opened and the output of EXAplus is saved to this file (using
the encoding set via the command-line parameter -encoding or via the command SET ENCODING). If the
file already exists, it is overwritten. SPOOL OFF terminates saving and closes the file. Entering SPOOL without
parameters displays the name of the spool file.
Example(s)
spool log.out;
select * from table1;
spool off;
TIMING
Syntax
TIMING START|STOP|SHOW;
Description
TIMING START [name] Starts a new timer with the specified name. If no name is specified, the number of
the timer is used as a name.
TIMING STOP [name] Halts the timer with the specified name and displays the measured time. If no name
is specified, the most recently started timer is halted.
TIMING SHOW [name] Displays the measured time of the specified timer without stopping it. If no name
is specified, all timers are displayed.
Example(s)
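For example (timer name and query are illustrative):

```sql
TIMING START t1;        -- start a named timer
SELECT COUNT(*) FROM t;
TIMING SHOW t1;         -- display elapsed time without stopping the timer
TIMING STOP t1;         -- stop the timer and display the measured time
```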
UNDEFINE
Syntax
UNDEFINE <variable>;
Description
Deletes the variable <variable>. If there is no variable with the specified name, an error is reported.
Example(s)
undefine message;
WHENEVER
Syntax
Description
This statement defines the behavior of EXAplus in the event of errors. WHENEVER SQLERROR responds to
errors in the execution of SQL statements, WHENEVER OSERROR to operating system errors (file not found,
etc.).
CONTINUE is the default setting for both types of error (and EXIT if EXAplus is started
with the parameter -x).
Example(s)
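For illustration (a sketch based on the described CONTINUE and EXIT behaviors):

```sql
WHENEVER SQLERROR EXIT;    -- abort on the first failing SQL statement
WHENEVER OSERROR CONTINUE; -- ignore operating system errors
```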
4.2. ODBC driver
If the rowcount of a query exceeds 2147483647 (2^31-1), the ODBC driver returns the
value 2147483647 for the function SQLRowCount(). The reason for this behavior is the
32-bit limitation for this return value defined by the ODBC standard.
System requirements
The Exasol ODBC driver is provided for both the 32-bit version and the 64-bit version of Windows. The requirements
for installing the ODBC driver are listed below:
• The Windows user who installs the Exasol ODBC and its components must be the computer administrator or
a member of the administrator group.
• Microsoft .NET Framework 4.0 Client Profile™ has to be installed on your system.
• All applications and services that integrate the ODBC must be stopped during installation. If "Business Objects
XI" is installed on this machine, the "Web Intelligence Report Server" service must be stopped.
The ODBC driver has been successfully tested on the following systems:
– Windows 10 (x86/x64)
– Windows 8.1 (x86/x64)
– Windows 7, Service Pack 1 (x86/x64)
– Windows Server 2012 R2 (x86/x64)
– Windows Server 2012 (x86/x64)
– Windows Server 2008 R2, Service Pack 1 (x86/x64)
– Windows Server 2008, Service Pack 2 (x86/x64)
Installation
A simple setup wizard leads you through the installation steps after executing the installation file. Please note that
separate installation executables are provided for 32-bit and 64-bit applications.
Already existing ODBC data sources are automatically configured to the new driver
version in case of an update.
Within the tool, please select the "User DSN" or "System DSN" tab. Click on "Add" and select "Exasol Driver"
in the window "Create New Data Source".
Test connection The user can use this to test whether the connection data has been entered correctly.
An attempt to connect with Exasol is made.
In the advanced settings you can define additional options and arbitrary connection string parameters (see also
Section 4.2.5, “Connecting via Connection Strings”):
When you click the "OK" button in the Exasol configuration window, your new connection will appear in the list
of Windows data sources.
The Connection Pooling of the driver manager is deactivated by default. You can explicitly
activate it in the configuration tool "ODBC Data Source Administrator". But please note
that in that case reused connections keep their session settings which were set via SQL
commands (see ALTER SESSION).
Known problems
This chapter covers known problems associated with using the ODBC driver. Generally the causes are easy to
correct.
Table 4.3. Known problems associated with using the ODBC driver on Windows
Problem: System error code: 126, 127, 193 or 14001
Solution: Important elements of the ODBC driver are not installed correctly. Quit all applications and services
that may be using the ODBC driver and reinstall the Exasol ODBC. In the case of error 14001, important
system libraries needed by the Exasol ODBC are not installed on this machine. Reinstall the Visual C++
Redistributables and afterwards the Exasol ODBC. This error can also be resolved by installing the latest
Windows updates.

Problem: Installation error: Incompatible dll versions
Solution: A similar message may also occur during the installation. This message occurs if an older version of
the Exasol ODBC was already installed and was not overwritten during the installation. Quit all applications
and services that may be using the ODBC driver and reinstall the Exasol ODBC.

Problem: Error opening file for writing
Solution: The installer has detected that it cannot overwrite an Exasol ODBC component. Quit all applications
and services that may be using the ODBC driver and reinstall the Exasol ODBC.

Problem: Data source name not found and no default driver specified
Solution: Please check the data source name. On 64-bit systems, also check whether the created data source
is a 64-bit data source although the application expects a 32-bit data source, or vice versa.
The list of options you can set in the file odbc.ini can be found in Section 4.2.5,
“Connecting via Connection Strings”.
The Exasol ODBC driver for Linux/Unix has been designed to run on as many distributions as possible. It was
successfully tested on the following systems:
The ODBC driver was tested on Mac OS X with the driver manager iODBC in two
variations; which one is used depends on the application.
Known problems
Table 4.4. Known problems when using the ODBC driver for Linux/Unix
Description: Error "Data source name not found, and no default driver specified"
Solution: Possibly the unixODBC driver manager uses the wrong odbc.ini. You can set the file which should
be used via the environment variable ODBCINI.
SQLConnect()
In this method you choose a DSN entry and define user and password.
Example call:
SQLConnect(connection_handle,
           (SQLCHAR*)"exa_test", SQL_NTS,
           (SQLCHAR*)"sys", SQL_NTS,
           (SQLCHAR*)"exasol", SQL_NTS);
SQLDriverConnect()
For the function SQLDriverConnect() there are two alternatives: either you choose a DSN entry from
the file odbc.ini, or you choose a certain DRIVER from the file odbcinst.ini. In both cases a connection
string is used to define further options (see next section).
Example call:
SQLDriverConnect(
connection_handle, NULL,
(SQLCHAR*)"DSN=exa_test;UID=sys;PWD=exasol", SQL_NTS,
NULL, 0, NULL, SQL_DRIVER_NOPROMPT);
SQLDriverConnect(
connection_handle, NULL,
(SQLCHAR*)
"DRIVER={EXASOL Driver};EXAHOST=192.168.6.11..14:8563;UID=sys;PWD=exasol",
SQL_NTS, NULL, 0, NULL, SQL_DRIVER_NOPROMPT);
The data in the connection string takes precedence over the values of the odbc.ini
file on Linux/Unix systems and over the configuration of the data source on Windows systems.
DSN=exa_test;UID=sys;PWD=exasol;EXASCHEMA=MY_SCHEMA
DRIVER={EXASOL Driver};EXAHOST=192.168.6.11..14:8563;UID=sys;PWD=exasol
EXAHOST Defines the servers and the port of the Exasol cluster (e.g.
192.168.6.11..14:8563).
Examples:
myhost:8563              Single server with name myhost and port 8563.
myhost1,myhost2:8563     Two servers with port 8563.
myhost1..4:8563          Four servers (myhost1, myhost2, myhost3, myhost4) and port 8563.
192.168.6.11..14:8563    Four servers from 192.168.6.11 up to 192.168.6.14 and port 8563.
Instead of a concrete list you can also specify a file which contains such a
list (e.g. //c:\mycluster.txt). The two leading slashes ("//") indicate that
a filename is specified.
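The range notation described above can be expanded mechanically. The following sketch is a hypothetical helper, not part of the driver; it expands only the trailing "first..last" form into concrete host:port pairs, leaving comma lists and file references aside for brevity:

```java
import java.util.ArrayList;
import java.util.List;

public class ExaHostExpander {
    // Expands e.g. "myhost1..4:8563" to myhost1:8563 .. myhost4:8563.
    public static List<String> expand(String exahost) {
        int colon = exahost.lastIndexOf(':');
        String hosts = exahost.substring(0, colon);
        String port = exahost.substring(colon + 1);
        List<String> result = new ArrayList<>();
        int dots = hosts.indexOf("..");
        if (dots < 0) {                          // single host, no range
            result.add(hosts + ":" + port);
            return result;
        }
        String head = hosts.substring(0, dots);        // e.g. "192.168.6.11"
        String lastSuffix = hosts.substring(dots + 2); // e.g. "14"
        // Split head into prefix and trailing number ("192.168.6." + "11").
        int i = head.length();
        while (i > 0 && Character.isDigit(head.charAt(i - 1))) i--;
        String prefix = head.substring(0, i);
        int first = Integer.parseInt(head.substring(i));
        int last = Integer.parseInt(lastSuffix);
        for (int n = first; n <= last; n++) {
            result.add(prefix + n + ":" + port);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("192.168.6.11..14:8563"));
    }
}
```

The same expansion logic applies to both numeric IP suffixes and named hosts with numeric suffixes.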
EXAUID or UID Username for the login. UID is automatically removed from the connection
string by some applications.
EXAPWD or PWD Password of user. PWD is automatically removed from the connection string
by some applications.
LOGMODE Specifies the mode for the log file. The following values are possible:
KERBEROSSERVICENAME Principal name of the Kerberos service. If nothing is specified, the name
"exasol" will be used as default.
KERBEROSHOSTNAME Host name of the Kerberos service. If nothing is specified, the host name
of the parameter EXAHOST will be used as default.
ENCRYPTION Switches on the automatic encryption. Valid values are "Y" and "N" (default
is "Y").
AUTOCOMMIT Autocommit mode of the connection. Valid values are "Y" and "N" (default
is "Y").
QUERYTIMEOUT Defines the query timeout in seconds for a connection. If set to 0, the query
timeout is deactivated. Default is 0.
CONNECTTIMEOUT Maximal time in milliseconds the driver will wait to establish a TCP con-
nection to a server. This timeout is useful for limiting the overall login time,
especially in the case of a large cluster with several reserve nodes. Default:
2000
SUPERCONNECTION Enables the user to execute queries even if the limit for active sessions
(executing a query) has been reached. Valid values are "Y" and "N" (default
is "N").
DEFAULTPARAMSIZE Default size for VARCHAR parameters in prepared statements whose type
cannot be determined at prepare time. Default is 2000.
COGNOSSUPPORT If you want to use the Exasol ODBC driver in combination with Cognos,
we recommend setting this option to "Y". Default is "N".
CONNECTIONLCCTYPE Sets LC_CTYPE to the given value during the connection. Example values
for Windows: "deu", "eng". Example values for Linux/Unix: "de_DE",
"en_US". On Linux/Unix, you can also set an encoding, e.g. "en_US.UTF-
8".
CONNECTIONLCNUMERIC Sets LC_NUMERIC to the given value during the connection. Example
values for Windows: "deu", "eng". Example values for Linux/Unix: "de_DE",
"en_US".
SHOWONLYCURRENTSCHEMA Defines whether the ODBC driver considers all schemas or just the current
schema for metadata like the list of columns or tables. Valid values are "Y"
and "N" (default is "N").
STRINGSNOTNULL Defines whether the ODBC driver returns empty strings for NULL
strings when reading table data (please note that the database internally
doesn't distinguish between NULL and empty strings and returns NULL
values to the driver in both cases). Valid values are "Y" and "N" (default is
"N").
INTTYPESINRESULTSIFPOSSIBLE If you switch on this option, DECIMAL types without scale will be
returned as SQL_INTEGER (up to 9 digits) or SQL_BIGINT (up to 18 digits) instead
of SQL_DECIMAL. Valid values are "Y" and "N" (default is "N").
On Windows servers the data which is sent to the ODBC driver has to be in an encoding which is installed locally.
The encoding can be specified via the language settings of the connection (see above).
On Linux/Unix servers, the encoding can be set via environment variables. To set e.g. the German language and
the encoding UTF8 in the console, the command export LC_CTYPE=de_DE.utf8 can be used. For graphical
applications, a wrapper script is recommended.
It is important that the used encoding has to be installed on your system. You can identify the installed encodings
via the command "locale -a".
To achieve the best performance you should try to fetch about 50-100 MB
of data by choosing the number of rows per SQLFetch.
Inserting data into the database Instead of using single insert statements like "INSERT INTO t VALUES
1, 2, ..." you should use the more efficient interface of prepared statements
and their parameters. Prepared statements achieve optimal performance
when using parameter sets between 50 and 100 MB.
Moreover you should insert the data by using the native data types. E.g. for
number 1234 you should use SQL_C_SLONG (Integer) instead of
SQL_CHAR ("1234").
Autocommit mode Please note that on Windows systems you should deactivate the autocommit
mode only via the method SQLSetConnectAttr() and not in the data source
settings. Otherwise, the Windows driver manager doesn't notice this change,
assumes that autocommit is switched on, and doesn't pass SQLEndTran()
calls to the database. This behavior could lead to problems.
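As a rough illustration of the 50-100 MB fetch guideline above, the following sketch (written in Java for consistency with the other examples in this chapter; it is not part of any driver API, and the row-width estimate is an assumption you would derive from the result set's column definitions) computes a row count per SQLFetch:

```java
public class FetchSizer {
    // Aim for the middle of the recommended 50-100 MB window per fetch.
    public static long rowsPerFetch(long bytesPerRow) {
        long target = 75L * 1024 * 1024;
        return Math.max(1, target / bytesPerRow);
    }

    public static void main(String[] args) {
        // For rows of roughly 1 kB, one fetch should cover tens of thousands of rows.
        System.out.println(rowsPerFetch(1024));
    }
}
```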
• Access to the driver via the DriverManager-API and configuration of the driver using the DriverPropertyInfo
interface
• Execution of SQL statements directly, as a prepared statement, and in batch mode
• Support of more than one open ResultSet
• Support of the following metadata APIs: DatabaseMetaData and ResultSetMetaData
• Savepoints
• User-defined data types and the types Blob, Clob, Array and Ref.
• Stored procedures
• Read-only connections
• The API ParameterMetaData
If the rowcount of a query exceeds 2147483647 (2^31-1), the JDBC driver will return the
value 2147483647. The reason for this behavior is the 32-bit limitation for this return
value defined by the JDBC standard.
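The capping described above amounts to the following clamp. The class below is an illustrative sketch, not driver code:

```java
public class RowCountClamp {
    // JDBC's 32-bit row counts cap at Integer.MAX_VALUE (2147483647).
    public static int clamp(long actualRowCount) {
        return (int) Math.min(actualRowCount, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(clamp(5_000_000_000L)); // capped
        System.out.println(clamp(42L));            // unchanged
    }
}
```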
Detailed information about the supported interfaces is provided in the API Reference,
which you can find either in the start menu (Windows) or in the folder html of the install
directory (Linux/Unix).
– Windows 10 (x86/x64)
– Windows 8.1 (x86/x64)
– Windows 7, Service Pack 1 (x86/x64)
– Windows Server 2012 R2 (x86/x64)
– Windows Server 2012 (x86/x64)
– Windows Server 2008 R2, Service Pack 1 (x86/x64)
– Windows Server 2008, Service Pack 2 (x86/x64)
4.3. JDBC driver
For Windows systems an automatic installation wizard exists. In this case, Microsoft .NET Framework 4.0 Client
Profile™ has to be installed on your system.
All classes of the JDBC driver belong to the Java "com.exasol.jdbc" package. The main class of the driver is
com.exasol.jdbc.EXADriver
<repositories>
<repository>
<id>maven.exasol.com</id>
<url>https://maven.exasol.com/artifactory/exasol-releases</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>com.exasol</groupId>
<artifactId>exasol-jdbc</artifactId>
<version>6.0.0</version>
</dependency>
</dependencies>
URL of Exasol
The URL for the Exasol used by the driver has the following form:
jdbc:exa:<host>:<port>[;<prop_1>=<value_1>]...[;<prop_n>=<value_n>]
When opening a connection, the driver will randomly choose an address from the specified
address range. If the connection fails, the driver will continue to try all other possible
addresses.
Examples:
Instead of a concrete list you can also specify a file which contains such a list (e.g.
//c:\mycluster.txt). The two leading slashes ("//") indicate that a filename is specified.
<prop_i=value_i> An optional list of properties separated by ";" follows the port; their values
are set when logging in. These properties correspond to the supported Driver
Properties and are described in the following section. It is important to note that the
values of properties within the URL can only consist of alphanumeric characters.
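The URL syntax and the alphanumeric restriction on property values can be sketched as follows. The helper class below is hypothetical, not part of the driver:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExaUrlBuilder {
    // Assembles a jdbc:exa URL from a host range and a property map,
    // rejecting non-alphanumeric property values as the URL syntax requires.
    public static String build(String hostPort, Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:exa:").append(hostPort);
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (!e.getValue().matches("[A-Za-z0-9]*")) {
                throw new IllegalArgumentException(
                    "property value must be alphanumeric: " + e.getKey());
            }
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("schema", "sys");
        props.put("debug", "1");
        System.out.println(build("192.168.6.11..14:8563", props));
    }
}
```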
import java.sql.*;
import com.exasol.jdbc.*;

public class jdbcsample {
    public static void main(String[] args) throws Exception {
        // Load the Exasol driver class
        Class.forName("com.exasol.jdbc.EXADriver");
        // Connect to the cluster 192.168.6.11..14 on port 8563 and open schema SYS
        Connection con = DriverManager.getConnection(
            "jdbc:exa:192.168.6.11..14:8563;schema=SYS",
            "sys", "exasol");
        Statement stmt = con.createStatement();
        // List all tables in the current schema
        ResultSet rs = stmt.executeQuery("SELECT * FROM CAT");
        while (rs.next()) {
            System.out.println(rs.getString("TABLE_NAME"));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}
The above code sample builds a JDBC connection to Exasol, which runs on servers 192.168.6.11 up to 192.168.6.14
on port 8563. User sys with the password exasol is logged in and the schema sys is opened. After that, all
tables in this schema are shown.
Supported DriverProperties
The following properties can be transferred to the JDBC driver via the URL:
debug 0=off, 1=on Switches on the driver's log function. The driver then writes a log file
named jdbc_timestamp.log for each established connection.
These files contain information on the called driver methods and the progress
of the JDBC connection, and can assist the Exasol Support in the
diagnosis of problems.
Default: 0
logdir String Defines the directory where the JDBC debug log files shall be written
(in debug mode).
Example: jdbc:exa:192.168.6.11..14:8563;debug=1;logdir=/tmp/my folder/;schema=sys
clientname String Tells the server what the application is called. Default: "Generic JDBC
client"
clientversion String Tells the server the version of the application. Default: empty ("")
logintimeout numeric, >=0 Maximal time in seconds the driver will wait for the database in case of a
connect or disconnect request. Default is 0 (unlimited)
connecttimeout numeric, >=0 Maximal time in milliseconds the driver will wait to establish a TCP con-
nection to a server. This timeout is useful for limiting the overall login
time, especially in the case of a large cluster with several reserve nodes. Default:
2000
querytimeout numeric, >=0 Defines how many seconds a statement may run before it is automatically
aborted. Default is 0 (unlimited)
slave 0=off, 1=on So-called sub-connections for parallel read and insert have this flag switched
on. Details and examples can be found in our Solution Center:
https://www.exasol.com/support/browse/SOL-546
Default: 0
slavetoken numeric, >=0 Is necessary to establish parallel sub-connections. Default: 0
Reading big data volumes Via the parameter "fetchsize", you can determine the data volume which
should be fetched from the database per communication round. If this value
is too low, the data transfer can take too long. If this value is too high, the
JVM can run out of memory. We recommend a fetch size of 1000-2000.
Inserting data into the database Instead of using single insert statements like "INSERT INTO t VALUES
1, 2, ..." you should use the more efficient interface of prepared statements
and their parameters. Prepared statements achieve optimal performance
when using parameter sets between 500 kB and 20 MB. Moreover you
should insert the data by using the native data types.
Unused resources Unused resources should be freed immediately, e.g. Prepared Statements
via "close()".
Connection servers Don't specify a needlessly wide IP address range. Since those addresses are
tried in random order until a connection succeeds, the connect could take a long
time.
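The 500 kB to 20 MB guideline for prepared-statement parameter sets can be turned into a batch row count. The class below is a hypothetical sizing helper, not part of the JDBC driver, and the row width is an estimate you would derive from your table's column types:

```java
public class BatchSizer {
    static final long MIN_BYTES = 500L * 1024;        // 500 kB
    static final long MAX_BYTES = 20L * 1024 * 1024;  // 20 MB

    // Number of rows to buffer per batch so the parameter set
    // lands inside the recommended window.
    public static long rowsPerBatch(long bytesPerRow, long targetBytes) {
        if (targetBytes < MIN_BYTES || targetBytes > MAX_BYTES) {
            throw new IllegalArgumentException("target outside recommended window");
        }
        return Math.max(1, targetBytes / bytesPerRow);
    }

    public static void main(String[] args) {
        // e.g. rows of roughly 200 bytes, aiming at a 2 MB parameter set
        System.out.println(rowsPerBatch(200, 2L * 1024 * 1024));
    }
}
```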
– Windows 10 (x86/x64)
– Windows 8.1 (x86/x64)
– Windows 7, Service Pack 1 (x86/x64)
– Windows Server 2012 R2 (x86/x64)
– Windows Server 2012 (x86/x64)
– Windows Server 2008 R2, Service Pack 1 (x86/x64)
– Windows Server 2008, Service Pack 2 (x86/x64)
Microsoft .NET Framework 4.0 Client Profile™ has to be installed on your system.
The Data Provider is available for download in the form of an executable installer file. The installation requires
administrator rights and is started by double-clicking on the downloaded file. The Data Provider is installed in the
global assembly cache and registered in the global configuration file of the .NET framework.
Besides the ADO.NET driver, the additional tool Data Provider Metadata View™ (DPMV)
is installed, with which you can easily check the connectivity and run simple metadata
queries.
SQL Server Integration Services Installation of the Data Destination as a data flow destination for Integration
Services projects for:
• Visual Studio 2005 with SQL Server 2005 (v 9.0)
• Visual Studio 2008 with SQL Server 2008 (v 10.0)
• Newer versions work with generic ADO.NET and ODBC and don't need
a Data Destination
SQL Server Reporting Services Installation of the Data Processing Extension and adjustment of the config-
uration of the report server for:
• Visual Studio 2005 with SQL Server 2005 (v 9.0)
• Visual Studio 2008 with SQL Server 2008 (v 10.0)
• Visual Studio 2010 with SQL Server 2012 (v 11.0)
• Visual Studio 2012 with SQL Server 2012 (v 11.0)
• Visual Studio 2013 with SQL Server 2014 (v 12.0)
SQL Server Analysis Services Installation of the DDEX provider and pluggable SQL cartridge for:
• Visual Studio 2008 with SQL Server 2008 (v 10.0)
• Visual Studio 2010 with SQL Server 2012 (v 11.0)
• Visual Studio 2012 with SQL Server 2012 (v 11.0)
• Visual Studio 2013 with SQL Server 2014 (v 12.0)
• Visual Studio 2015 with SQL Server 2016 (v 13.0)
4.4. ADO.NET Data Provider
entering an invariant identifier, with which the Data Provider is selected. The identifier of the Exasol Data Provider
is "Exasol.EXADataProvider".
In order to connect with Exasol, the Data Provider must be given a connection string containing any information
necessary for establishing a connection from the client application. The connection string is a sequence of
keyword/value pairs separated by semicolons. A typical example of a connection string for the Exasol Data Provider
is:
host=192.168.6.11..14:8563;UID=sys;PWD=exasol;Schema=test
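The keyword/value structure of such a connection string can be illustrated with a small parser sketch. It is written in Java here for consistency with the JDBC examples (the Data Provider itself is .NET), and the class name is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConnectionStringParser {
    // Splits a "key=value;key=value" connection string into its pairs.
    public static Map<String, String> parse(String connectionString) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String part : connectionString.split(";")) {
            int eq = part.indexOf('=');
            if (part.isEmpty() || eq < 0) continue;  // skip malformed parts
            pairs.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse("host=192.168.6.11..14:8563;UID=sys;PWD=exasol;Schema=test"));
    }
}
```

Note that only the first "=" in each part separates key and value, so values such as host:port lists pass through unchanged.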
using System;
using System.Collections.Generic;
using System.Text;
using System.Data.Common;
namespace ConnectionTest
{
class Program
{
static void Main(string[] args)
{
DbProviderFactory factory=null;
try
{
factory = DbProviderFactories.GetFactory("Exasol.EXADataProvider");
Console.WriteLine("Found Exasol driver");
// Create the connection via the factory and set the connection string
DbConnection connection = factory.CreateConnection();
connection.ConnectionString =
    "host=192.168.6.11..14:8563;UID=sys;PWD=exasol;Schema=test";
connection.Open();
Console.WriteLine("Connected to server");
DbCommand cmd = connection.CreateCommand();
cmd.Connection = connection;
cmd.CommandText = "SELECT * FROM CAT";
// Execute the query and print the result rows
DbDataReader reader = cmd.ExecuteReader();
while (reader.Read())
{
    Console.WriteLine(reader.GetString(0));
}
reader.Close();
connection.Close();
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
}
}
Schema Collections
The Exasol ADO.NET driver supports the following Schema Collections:
• MetaDataCollections
• DataSourceInformation
• DataTypes
• Restrictions
• ReservedWords
• Tables
• Columns
• Procedures
• ProcedureColumns
• Views
• Schemas
Instead of using single insert statements like "INSERT INTO t VALUES 1, 2, ..." you should use the more
efficient interface of prepared statements and their parameters. Prepared statements achieve optimal performance
when using parameter sets between 500 kB and 20 MB. Moreover you should insert the data by using the
native data types. For the execution of prepared statements we recommend using the interface "IParameterTable".
An instance of "ParameterTable" can be created with a "Command" via the method "CreateParameterTable()".
Exasol ADO.NET provides a special interface to execute prepared statements (class ParameterTable)
since it is not possible to update rows in Exasol via the ADO.NET specific interfaces DataSet and
DataAdapter.
Executing parametrized queries is done by using SQL statements that include question marks at the positions
of the parameters (e.g. INSERT INTO t VALUES (?, ?)). The values of the ParameterTable are used to
replace the parameters when executing the statement.
• Decimal types
The ADO.NET DECIMAL data type can only store values with a precision of a maximum of 28 digits.
Decimal columns in Exasol with greater precision do not fit in these data types and have to be provided as
strings.
• Column names
• Batch execution
If you want to execute several statements in a batch, you can use the functions AddBatchCommandText()
and ExecuteBatch() of the class EXACommand. Example for a batch execution:
Creating a DataReaderSource
• Create a new "Integration Services Project" in Visual Studio.
• Create a new data flow and insert a "Data Reader Source" from the toolbox section "data flow sources" in the
data flow.
• Create a new "ADO.NET Connection" in the "Connection Manager" and choose the Exasol Data Provider
from the list of .NET providers.
• Configure the Exasol Data Provider and then test the connection.
• Open the new "Data Reader Source" from the data flow via double-clicking.
Component Properties In section "User-defined Properties", enter the SQL that will read the data
from Exasol (e.g. "select * from my_schema.my_table").
• Create an instance of the EXADataDestination by using the mouse to drag the object from the toolbox to the
workspace.
• Connect the output of the data source (green) with the EXADataDestination instance.
I/O Properties Here you can define further options per column, e.g. the names and data types.
• Once the data flow destination is configured, the data flow can be tested or started. If the sequence is successful,
the data flow elements turn green.
• Start Visual Studio 2005, create a new report server project, and open the Solution Explorer.
• Create a new data source (EXADataSource) and insert the connection string, user and password (see tab
"Credentials")
• In the Solution Explorer, add a new report using the wizard. When selecting the data source, choose the
EXADataSource that was previously configured.
• As soon as the report is ready, it must be created and published. Both functions are contained in the "Create"
menu.
• The report can now be run in your browser. To do this, open the browser (e.g. Internet Explorer) and enter the
address of the report server (e.g. "http://localhost/Reports"). Check with your system administrator that you
have the necessary permissions for running reports in the report server.
• After being run, the output in the browser could look like this:
4.5. WebSockets
The JSON over WebSockets client-server protocol allows customers to implement their own drivers for all kinds
of platforms using a connection-based web protocol. The main advantages are flexibility regarding the programming
languages you want to integrate Exasol into, and more native access compared to the standardized ways of
communicating with a database, such as JDBC, ODBC or ADO.NET, which are mostly old, static standards
that create additional complexity through the necessary driver managers.
If you are interested in learning more about our WebSockets API, please have a look at our open-source GitHub
repository (https://www.github.com/exasol), where you'll find many details about the API specification and example
implementations, such as a native Python driver based on the protocol. We would be very happy if you
contributed to our open-source community by using, extending and adding to our open-sourced tools.
4.6. SDK
Exasol provides the SDK (Software Development Kit) to connect client applications with Exasol. You'll find the
package in the download area on our website which includes the following subfolders:
R and Python R- and Python packages for easily executing Python and R code on the Exasol database
within your usual programming environment. More details can be found in the corresponding
readme files.
CLI Call level interface for developing C++ applications. More details can be found in the following
section.
• Exasol CLI Libraries: EXACLI.lib and all dynamic libraries (*.dll) in the directory "lib32" or "lib64",
compiled for 32 and 64-bit Windows.
• The header files "exaCInterface.h" and "exaDefs.h". These contain declarations and constants of the
Exasol CLI Interface.
• A fully functional sample program, which illustrates integration of the CLI interface. The sample is supplied
as source code (.cpp). Additionally, the package contains a suitable project file for Visual Studio 10 (.vcproj).
The CLI can be used on 32-bit and 64-bit Windows versions and has been successfully tested on the following
systems:
– Windows 10 (x86/x64)
– Windows 8.1 (x86/x64)
– Windows 7, Service Pack 1 (x86/x64)
– Windows Server 2012 R2 (x86/x64)
– Windows Server 2012 (x86/x64)
– Windows Server 2008 R2, Service Pack 1 (x86/x64)
– Windows Server 2008, Service Pack 2 (x86/x64)
Microsoft .NET Framework 4.0 Client Profile™ has to be installed on your system.
Open the project file (.vcproj) from the CLI in the directory "examples\sqlExec\" with Visual Studio. The sample
sources (.cpp) are already integrated in the project. The Lib and include paths point to the corresponding directories,
which were installed with the CLI. If files from the CLI package have been moved or changed, the project must
be adapted accordingly.
The sample program can be compiled for 32 and 64-bit as a debug and release version. The generated .exe file
is a standard Windows console application. The source code of the sample program contains comments, which
describe the parts of the application.
This submits the SQL string "select * from exa_syscat" to the sample program via standard input, and the CLI
function EXAExecDirect() then executes it in Exasol.
Delivery scope
For configuration, be sure to read the file README.txt which is included in the install-
ation package.
System requirements
The Exasol CLI has been successfully tested on the following systems:
– Debian 8 (x86/x64)
– Ubuntu 16.04 LTS (x86/x64)
– Ubuntu 14.04 LTS (x86/x64)
– SUSE Linux Enterprise Server 12 (x64)
– SUSE Linux Enterprise Desktop 12 (x64)
– SUSE Linux Enterprise Server 11 (x86/x64)
– openSUSE Leap 42.2 (x64)
You can compile the example program using the Makefile in the subfolder examples. GCC 4.1.0 or higher is
recommended for that.
This example executes the given sql statement (select * from exa_syscat) and returns the number of
columns and rows of the result.
Example
The following example illustrates how to send SQL queries to the database. The return values are not checked to
simplify the code.
// Like in the ODBC interface, handles are created before the connection
// is established (for environment and connection)
SQLHENV henv;
SQLHDBC hdbc;
EXAAllocHandle(SQL_HANDLE_ENV, NULL, &henv);
EXAAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);
To achieve the best performance you should try to fetch about 50-100 MB
of data by choosing the number of rows per SQLFetch.
Inserting data into the database Instead of using single insert statements like "INSERT INTO t VALUES
1, 2, ..." you should use the more efficient interface of prepared statements
and their parameters. Prepared statements achieve optimal performance
when using parameter sets between 50 and 100 MB.
Moreover you should insert the data by using the native data types. E.g. for
number 1234 you should use SQL_C_SLONG (Integer) instead of
SQL_CHAR ("1234").
Appendix A. System tables
Some system tables are critical to security; these can only be accessed by users with the "SELECT
ANY DICTIONARY" system privilege (users with the DBA role have this privilege implicitly). This includes all
system tables with the "EXA_DBA_" prefix.
There are also system tables to which everyone has access; however, their content depends on the current
user. In EXA_ALL_OBJECTS, for example, only the database objects the current user has access to are displayed.
EXA_SYSCAT
This system table lists all existing system tables.
Column Meaning
SCHEMA_NAME Name of the system schema
OBJECT_NAME Name of the system table
OBJECT_TYPE Type of object: TABLE or VIEW
OBJECT_COMMENT Comment on the object
EXA_ALL_COLUMNS
This system table contains information on all the table columns to which the current user has access.
Column Meaning
COLUMN_SCHEMA Associated schema
COLUMN_TABLE Associated table
COLUMN_OBJECT_TYPE Associated object type
A.2. List of system tables
COLUMN_NAME Name of column
COLUMN_TYPE Data type of column
COLUMN_TYPE_ID ID of data type
COLUMN_MAXSIZE Maximum number of characters for strings
COLUMN_NUM_PREC Precision for numeric values
COLUMN_NUM_SCALE Scale for numeric values
COLUMN_ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_IS_VIRTUAL States whether the column is part of a virtual table
COLUMN_IS_NULLABLE States whether NULL values are allowed (TRUE or FALSE).
In case of views this value is always NULL.
COLUMN_IS_DISTRIBUTION_KEY States whether the column is part of the distribution key
(TRUE or FALSE)
COLUMN_DEFAULT Default value of the column
COLUMN_IDENTITY Current value of the identity number generator, if this column
has the identity attribute.
COLUMN_OWNER Owner of the corresponding object
COLUMN_OBJECT_ID ID of the column
STATUS Status of the object
COLUMN_COMMENT Comment on the column
EXA_ALL_CONNECTIONS
Lists all connections of the database.
Column Meaning
CONNECTION_NAME Name of the connection
CREATED Time of the creation date
CONNECTION_COMMENT Comment on the connection
EXA_ALL_CONSTRAINTS
This system table contains information about constraints of tables to which the current user has access.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT
NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_ENABLED Displays whether the constraint is checked or not
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
EXA_ALL_CONSTRAINT_COLUMNS
This system table contains information about referenced table columns of all constraints to which the current user
has access.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT
NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_NAME Name of the column
REFERENCED_SCHEMA Referenced schema (only for foreign keys)
REFERENCED_TABLE Referenced table (only for foreign keys)
REFERENCED_COLUMN Name of the column in the referenced table (only for foreign
keys)
EXA_ALL_DEPENDENCIES
Lists all direct dependencies between schema objects to which the current user has access. Please note that e.g.
dependencies between scripts cannot be determined. In case of a view, entries with REFERENCE_TYPE=NULL
values can be shown if underlying objects have been changed and the view has not been accessed again yet.
Column Meaning
OBJECT_SCHEMA Schema of object
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
OBJECT_OWNER Owner of the object
OBJECT_ID ID of the object
REFERENCE_TYPE Reference type (VIEW, CONSTRAINT)
REFERENCED_OBJECT_SCHEMA Schema of the referenced object
REFERENCED_OBJECT_NAME Name of the referenced object
REFERENCED_OBJECT_TYPE Type of the referenced object
REFERENCED_OBJECT_OWNER Owner of the referenced object
REFERENCED_OBJECT_ID ID of the referenced object
EXA_ALL_FUNCTIONS
This system table describes all functions of the database to which the current user has access.
Column Meaning
FUNCTION_SCHEMA Schema of the function
FUNCTION_NAME Name of the function
FUNCTION_OWNER Owner of the function
FUNCTION_OBJECT_ID ID of the function
FUNCTION_TEXT Generation text of the function
FUNCTION_COMMENT Comment on the function
EXA_ALL_INDICES
This system table describes all indices on tables to which the current user has access. Please note that indices are
created and managed automatically by the system. Hence, the purpose of this table is mainly for transparency.
Column Meaning
INDEX_SCHEMA Schema of the index
INDEX_TABLE Table of the index
INDEX_OWNER Owner of the index
INDEX_OBJECT_ID ID of the index
INDEX_TYPE Index type
MEM_OBJECT_SIZE Index size in bytes (at last COMMIT)
CREATED Timestamp of when the index was created
LAST_COMMIT Last time the object was changed in the DB
REMARKS Additional information about the index
EXA_ALL_OBJ_PRIVS
This table contains all of the object privileges granted for objects in the database to which the current user has access.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_ALL_OBJ_PRIVS_MADE
Lists all object privileges that are self-assigned by the user or those on objects that belong to the user.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_ALL_OBJ_PRIVS_RECD
Lists all object privileges granted to the user directly or via PUBLIC.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_ALL_OBJECTS
This system table describes all of the database objects to which the current user has access.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID ID of the object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
OBJECT_IS_VIRTUAL States whether this is a virtual object
OBJECT_COMMENT Comment on the object
EXA_ALL_OBJECT_SIZES
This system table contains the sizes of all of the database objects to which the current user has access. The values are calculated recursively, i.e. the size of a schema includes the sizes of all schema objects contained in it.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
RAW_OBJECT_SIZE Uncompressed volume of data in the object in bytes (at last COMMIT)
MEM_OBJECT_SIZE Compressed volume of data in the object in bytes (at last COMMIT)
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID ID of the object
OBJECT_IS_VIRTUAL States whether this is a virtual object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
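Because the sizes are computed recursively, this table makes it easy to see, for example, which schemas occupy the most space. The following query is a sketch (the LIMIT value is an arbitrary example):

```sql
-- Top 5 schemas by compressed size (values as of the last COMMIT)
SELECT object_name,
       mem_object_size / 1024 / 1024 AS mem_mib
  FROM exa_all_object_sizes
 WHERE object_type = 'SCHEMA'
 ORDER BY mem_object_size DESC
 LIMIT 5;
```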
EXA_ALL_ROLES
A list of all roles known to the system.
Column Meaning
ROLE_NAME Name of role
CREATED Timestamp of when the role was created
ROLE_PRIORITY Priority of the role
ROLE_COMMENT Comment on the role
EXA_ALL_SCRIPTS
This system table describes all scripts of the database to which the current user has access.
Column Meaning
SCRIPT_SCHEMA Name of the schema of the script
SCRIPT_NAME Name of the script
SCRIPT_OWNER Owner of the script
SCRIPT_OBJECT_ID ID of the script object
SCRIPT_TYPE Type of the script (PROCEDURE, ADAPTER or UDF)
SCRIPT_LANGUAGE Script language
SCRIPT_INPUT_TYPE Script input type (NULL, SCALAR or SET)
SCRIPT_RESULT_TYPE Return type of the script (ROWCOUNT, TABLE, RETURNS or EMITS)
SCRIPT_TEXT Complete creation text for a script
SCRIPT_COMMENT Comment on the script
EXA_ALL_SESSIONS
This system table contains information on user sessions. Among other things, the most recent SQL statement is shown. For security reasons, only the command name is displayed; the complete SQL text can be found in EXA_USER_SESSIONS and EXA_DBA_SESSIONS.
Column Meaning
SESSION_ID Id of the session
USER_NAME Logged-in user
STATUS Current status of the session. The most important of these are:
EXA_ALL_TABLES
This system table describes all of the tables in the database to which the current user has access.
Column Meaning
TABLE_SCHEMA Name of the schema of the table
TABLE_NAME Name of the table
TABLE_OWNER Owner of the table
TABLE_OBJECT_ID ID of the table
TABLE_IS_VIRTUAL States whether this is a virtual table
TABLE_HAS_DISTRIBUTION_KEY States whether the table is explicitly distributed
TABLE_ROW_COUNT Number of rows in the table
DELETE_PERCENTAGE Fraction of the rows which are merely marked as deleted, but not yet physically deleted (in percent)
TABLE_COMMENT Comment on the table
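DELETE_PERCENTAGE can be used to spot tables carrying many rows that are marked as deleted but not yet physically removed. A sketch (the 25% threshold is an arbitrary example):

```sql
-- Tables with a high fraction of logically deleted rows
SELECT table_schema, table_name, delete_percentage
  FROM exa_all_tables
 WHERE delete_percentage > 25
 ORDER BY delete_percentage DESC;
```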
EXA_ALL_USERS
This table provides restricted information on all of the users known to the system.
Column Meaning
USER_NAME Name of the user
CREATED Time the user was created
USER_PRIORITY Priority of the user
USER_COMMENT Comment on the user
EXA_ALL_VIEWS
Lists all of the views accessible to the current user.
Column Meaning
VIEW_SCHEMA Name of the schema in which the view was created
VIEW_NAME Name of the view
SCOPE_SCHEMA Schema from which the view was created
VIEW_OWNER Owner of the view
VIEW_OBJECT_ID Internal ID of the view
VIEW_TEXT Text of the view, with which it was created
VIEW_COMMENT Comment on the view
EXA_ALL_VIRTUAL_COLUMNS
Lists all columns of virtual tables to which the current user has access. It contains the information specific to virtual columns. Virtual columns are also listed in the table EXA_ALL_COLUMNS.
Column Meaning
COLUMN_SCHEMA Associated virtual schema
COLUMN_TABLE Associated virtual table
COLUMN_NAME Name of the virtual column
COLUMN_OBJECT_ID ID of the virtual columns object
ADAPTER_NOTES The adapter can store additional information about the virtual column in this field
EXA_ALL_VIRTUAL_SCHEMA_PROPERTIES
This system table contains information on the properties of all virtual schemas to which the current user has access.
Column Meaning
SCHEMA_NAME Name of the virtual schema
SCHEMA_OBJECT_ID ID of the virtual schema object
PROPERTY_NAME Name of the property of the virtual schema
PROPERTY_VALUE Value of the property of the virtual schema
EXA_ALL_VIRTUAL_TABLES
Lists all virtual tables to which the current user has access. It contains the information specific to virtual tables. Virtual tables are also listed in the table EXA_ALL_TABLES.
Column Meaning
TABLE_SCHEMA Name of the virtual schema containing the virtual table
TABLE_NAME Name of the virtual table
TABLE_OBJECT_ID ID of the virtual table
LAST_REFRESH Timestamp of the last metadata refresh (when the metadata was committed)
LAST_REFRESH_BY Name of the user that performed the last metadata refresh
ADAPTER_NOTES The adapter can store additional information about the table in this field
EXA_DBA_COLUMNS
This system table contains information on all table columns.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
COLUMN_SCHEMA Associated schema
COLUMN_TABLE Associated table
COLUMN_OBJECT_TYPE Associated object type
COLUMN_NAME Name of column
COLUMN_TYPE Data type of column
COLUMN_TYPE_ID ID of data type
COLUMN_MAXSIZE Maximum number of characters for strings
COLUMN_NUM_PREC Precision for numeric values
COLUMN_NUM_SCALE Scale for numeric values
COLUMN_ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_IS_VIRTUAL States whether the column is part of a virtual table
COLUMN_IS_NULLABLE States whether NULL values are allowed (TRUE or FALSE). For views this value is always NULL.
COLUMN_IS_DISTRIBUTION_KEY States whether the column is part of the distribution key (TRUE or FALSE)
COLUMN_DEFAULT Default value of the column
COLUMN_IDENTITY Current value of the identity number generator, if this column has the identity attribute.
COLUMN_OWNER Owner of the associated object
COLUMN_OBJECT_ID ID of the column
STATUS Status of the object
COLUMN_COMMENT Comment on the column
EXA_DBA_CONNECTIONS
Lists all connections of the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
CONNECTION_NAME Name of the connection
CONNECTION_STRING Defines the target of the connection
USER_NAME User name which is used when a connection is used
CREATED Time the connection was created
CONNECTION_COMMENT Comment on the connection
EXA_DBA_CONNECTION_PRIVS
Lists all connections which were granted to the user or one of his roles.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
GRANTEE Name of the user or role that has been granted the connection
GRANTED_CONNECTION Name of the connection
ADMIN_OPTION Specifies whether GRANTEE is allowed to grant the right to the connection to other users or roles
EXA_DBA_RESTRICTED_OBJ_PRIVS
Lists all connection objects to which certain restricted scripts have been granted access.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema of the object for which the restricted privilege has been granted
OBJECT_NAME Name of the object for which the restricted privilege has been granted
OBJECT_TYPE Type of the object for which the restricted privilege has been granted
FOR_OBJECT_SCHEMA Schema of the object that the privilege is restricted to
FOR_OBJECT_NAME Name of the object that the privilege is restricted to
FOR_OBJECT_TYPE Type of the object that the privilege is restricted to
PRIVILEGE The restricted privilege that is granted
GRANTEE Name of the user/role who/which has been granted the restricted privilege
GRANTOR Name of the user/role who/which has granted the restricted privilege
OWNER Name of the owner of the object for which the restricted privilege is granted
EXA_DBA_CONSTRAINTS
This system table contains information about all constraints of the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_ENABLED Displays whether the constraint is checked or not
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
EXA_DBA_CONSTRAINT_COLUMNS
This system table contains information about referenced table columns of all constraints of the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_NAME Name of the column
REFERENCED_SCHEMA Referenced schema (only for foreign keys)
REFERENCED_TABLE Referenced table (only for foreign keys)
REFERENCED_COLUMN Name of the column in the referenced table (only for foreign keys)
EXA_DBA_DEPENDENCIES
Lists all direct dependencies between schema objects. Note that some dependencies, e.g. between scripts, cannot be determined. For a view, entries with REFERENCE_TYPE=NULL can be shown if underlying objects have been changed and the view has not been accessed again since.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema of object
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
OBJECT_OWNER Owner of the object
OBJECT_ID ID of the object
REFERENCE_TYPE Reference type (VIEW, CONSTRAINT)
REFERENCED_OBJECT_SCHEMA Schema of the referenced object
REFERENCED_OBJECT_NAME Name of the referenced object
REFERENCED_OBJECT_TYPE Type of the referenced object
REFERENCED_OBJECT_OWNER Owner of the referenced object
REFERENCED_OBJECT_ID ID of the referenced object
EXA_DBA_DEPENDENCIES_RECURSIVE
Lists all direct and indirect dependencies between schema objects (i.e. evaluated recursively). Note that some dependencies, e.g. between scripts, cannot be determined. Views are not shown if underlying objects have been changed and the view has not been accessed again since.
Please note that all database objects are read when accessing this system table.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema of object
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
OBJECT_OWNER Owner of the object
OBJECT_ID ID of the object
REFERENCE_TYPE Reference type (VIEW, CONSTRAINT)
REFERENCED_OBJECT_SCHEMA Schema of the referenced object
REFERENCED_OBJECT_NAME Name of the referenced object
REFERENCED_OBJECT_TYPE Type of the referenced object
REFERENCED_OBJECT_OWNER Owner of the referenced object
REFERENCED_OBJECT_ID ID of the referenced object
DEPENDENCY_LEVEL Hierarchy level in the dependency graph
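DEPENDENCY_LEVEL allows the dependency graph beneath a single object to be inspected. A sketch (MY_SCHEMA and MY_VIEW are placeholder names); remember that accessing this system table reads all database objects:

```sql
-- All objects the view MY_SCHEMA.MY_VIEW depends on, directly or indirectly
SELECT referenced_object_schema,
       referenced_object_name,
       referenced_object_type,
       dependency_level
  FROM exa_dba_dependencies_recursive
 WHERE object_schema = 'MY_SCHEMA'
   AND object_name   = 'MY_VIEW'
 ORDER BY dependency_level;
```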
EXA_DBA_FUNCTIONS
This system table describes all functions of the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
FUNCTION_SCHEMA Schema of the function
FUNCTION_NAME Name of the function
FUNCTION_OWNER Owner of the function
FUNCTION_OBJECT_ID ID of the function
FUNCTION_TEXT Generation text of the function
FUNCTION_COMMENT Comment on the function
EXA_DBA_INDICES
This system table describes all indices on tables. Note that indices are created and managed automatically by the system; this table exists mainly for transparency.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
INDEX_SCHEMA Schema of the index
INDEX_TABLE Table of the index
INDEX_OWNER Owner of the index
INDEX_OBJECT_ID ID of the index
INDEX_TYPE Index type
MEM_OBJECT_SIZE Index size in bytes (at last COMMIT)
CREATED Timestamp of when the index was created
LAST_COMMIT Last time the object was changed in the DB
REMARKS Additional information about the index
EXA_DBA_OBJ_PRIVS
This table contains all of the object privileges granted for objects in the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_DBA_OBJECTS
This system table describes all of the database objects.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID Unique ID of the object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
OBJECT_IS_VIRTUAL States whether this is a virtual object
OBJECT_COMMENT Comment on the object
EXA_DBA_OBJECT_SIZES
This system table contains the sizes of all database objects. The values are calculated recursively, i.e. the size of a schema includes the sizes of all schema objects contained in it.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
RAW_OBJECT_SIZE Uncompressed volume of data in the object in bytes (at last COMMIT)
MEM_OBJECT_SIZE Compressed volume of data in the object in bytes (at last COMMIT)
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID ID of the object
OBJECT_IS_VIRTUAL States whether this is a virtual object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
EXA_DBA_ROLES
A list of all roles known to the system.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
ROLE_NAME Name of role
CREATED Timestamp of when the role was created
ROLE_PRIORITY Priority of the role
ROLE_COMMENT Comment on the role
EXA_DBA_ROLE_PRIVS
List of all roles granted to a user or a role.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
GRANTEE Name of the user or role that has been granted the right to adopt GRANTED_ROLE
GRANTED_ROLE Name of role that was granted
ADMIN_OPTION Specifies whether GRANTEE is allowed to grant the right to the role to other users or roles
EXA_DBA_SCRIPTS
This system table describes all scripts of the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SCRIPT_SCHEMA Name of the schema of the script
SCRIPT_NAME Name of the script
SCRIPT_OWNER Owner of the script
SCRIPT_OBJECT_ID ID of the script object
SCRIPT_TYPE Type of the script (PROCEDURE, ADAPTER or UDF)
SCRIPT_LANGUAGE Script language
SCRIPT_INPUT_TYPE Script input type (NULL, SCALAR or SET)
SCRIPT_RESULT_TYPE Return type of the script (ROWCOUNT, TABLE, RETURNS or EMITS)
SCRIPT_TEXT Complete creation text for a script
SCRIPT_COMMENT Comment on the script
EXA_DBA_SESSIONS
This system table contains information on user sessions. Among other things, the most recent SQL statement is shown.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
USER_NAME Logged-in user
STATUS Current status of the session. The most important of these are:
EXA_DBA_SYS_PRIVS
This table shows the system privileges granted to all users and roles.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
GRANTEE Name of user or role who/which has been granted the privilege.
PRIVILEGE System privilege that was granted.
ADMIN_OPTION Specifies whether GRANTEE is allowed to grant the right.
EXA_DBA_TABLES
This system table describes all tables in the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
TABLE_SCHEMA Name of the schema of the table
TABLE_NAME Name of the table
TABLE_OWNER Owner of the table
TABLE_OBJECT_ID ID of the table
TABLE_IS_VIRTUAL States whether this is a virtual table
TABLE_HAS_DISTRIBUTION_KEY States whether the table is explicitly distributed
TABLE_ROW_COUNT Number of rows in the table
DELETE_PERCENTAGE Fraction of the rows which are merely marked as deleted, but not yet physically deleted (in percent)
TABLE_COMMENT Comment on the table
EXA_DBA_USERS
The DBA_USERS table provides complete information on all of the users known to the system.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
USER_NAME Name of the user
CREATED Time the user was created
DISTINGUISHED_NAME Distinguished name for the authorization against an LDAP server
KERBEROS_PRINCIPAL Kerberos principal
PASSWORD Encoded hash value of the password
USER_PRIORITY Priority of the user
USER_COMMENT Comment on the user
EXA_DBA_VIEWS
Lists all views in the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
VIEW_SCHEMA Name of the schema in which the view was created
VIEW_NAME Name of the view
SCOPE_SCHEMA Schema from which the view was created
VIEW_OWNER Owner of the view
VIEW_OBJECT_ID Internal ID of the view
VIEW_TEXT Text of the view, with which it was created
VIEW_COMMENT Comment on the view
EXA_DBA_VIRTUAL_COLUMNS
Lists all columns of virtual tables. It contains the information specific to virtual columns. Virtual columns are also listed in the table EXA_DBA_COLUMNS.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
COLUMN_SCHEMA Associated virtual schema
COLUMN_TABLE Associated virtual table
COLUMN_NAME Name of the virtual column
COLUMN_OBJECT_ID ID of the virtual columns object
ADAPTER_NOTES The adapter can store additional information about the virtual column in this field
EXA_DBA_VIRTUAL_SCHEMA_PROPERTIES
This system table lists the properties of all virtual schemas in the database.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SCHEMA_NAME Name of the virtual schema
SCHEMA_OBJECT_ID ID of the virtual schema object
PROPERTY_NAME Name of the property of the virtual schema
PROPERTY_VALUE Value of the property of the virtual schema
EXA_DBA_VIRTUAL_TABLES
Lists all virtual tables and contains the information specific to virtual tables. Virtual tables are also listed in the table EXA_DBA_TABLES.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
TABLE_SCHEMA Name of the virtual schema containing the virtual table
TABLE_NAME Name of the virtual table
TABLE_OBJECT_ID ID of the virtual table
LAST_REFRESH Timestamp of the last metadata refresh (when the metadata was committed)
LAST_REFRESH_BY Name of the user that performed the last metadata refresh
ADAPTER_NOTES The adapter can store additional information about the table in this field
EXA_LOADAVG
This system table contains information on the current CPU load in each of the Exasol nodes.
Column Meaning
IPROC Number of the node
LAST_1MIN Average load in the last minute
LAST_5MIN Average load in the last 5 minutes
LAST_15MIN Average load in the last 15 minutes
RUNNING Contains two numbers separated by "/". The first indicates the number of active processes or threads at the time of evaluation; the second indicates the overall number of processes and threads on the node.
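EXA_LOADAVG can serve as a quick check of the load distribution across the cluster. A sketch; the comparison below is only an illustrative heuristic:

```sql
-- Nodes whose short-term load exceeds their long-term average
SELECT iproc, last_1min, last_15min
  FROM exa_loadavg
 WHERE last_1min > last_15min;
```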
EXA_METADATA
This system table contains information that describes the properties of the database.
Column Meaning
PARAM_NAME Name of the property
PARAM_VALUE Value of the property
IS_STATIC TRUE if the value of this property remains constant, FALSE if it can change at any time
EXA_PARAMETERS
This system table provides the database parameters; both system-wide and session-based information is displayed.
Column Meaning
PARAMETER_NAME Name of the parameter
SESSION_VALUE Value of the session parameter
SYSTEM_VALUE Value of the system parameter
EXA_ROLE_CONNECTION_PRIVS
Lists all connections that the current user possesses indirectly via other roles.
Column Meaning
GRANTEE Name of the role which received the right
GRANTED_CONNECTION Name of the connection which was granted
ADMIN_OPTION Information on whether the connection can be passed on to other users/roles
EXA_ROLE_RESTRICTED_OBJ_PRIVS
Lists all connection objects to which certain restricted scripts have been granted access, either granted directly to the current user or indirectly via other roles.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema of the object for which the restricted privilege has been granted
OBJECT_NAME Name of the object for which the restricted privilege has been granted
OBJECT_TYPE Type of the object for which the restricted privilege has been granted
FOR_OBJECT_SCHEMA Schema of the object that the privilege is restricted to
FOR_OBJECT_NAME Name of the object that the privilege is restricted to
FOR_OBJECT_TYPE Type of the object that the privilege is restricted to
PRIVILEGE The restricted privilege that is granted
GRANTEE Name of the user/role who/which has been granted the restricted privilege
GRANTOR Name of the user/role who/which has granted the restricted privilege
OWNER Name of the owner of the object for which the restricted privilege is granted
EXA_ROLE_OBJ_PRIVS
Lists all object privileges that have been granted to roles of the user.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Name of the role
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_ROLE_ROLE_PRIVS
Lists any role that the current user possesses indirectly via other roles.
Column Meaning
GRANTEE Name of the role through which the user possesses the role indirectly
GRANTED_ROLE Name of the role possessed by the user indirectly
ADMIN_OPTION Information on whether the role can be passed on to other users/roles
EXA_ROLE_SYS_PRIVS
Lists any system privilege that the current user possesses via roles.
Column Meaning
GRANTEE Name of the role
PRIVILEGE Name of the system privilege
ADMIN_OPTION Shows whether the privilege may be passed on to other users/roles
EXA_SCHEMAS
This system table lists all the schemas of the database.
Column Meaning
SCHEMA_NAME Name of the schema object
SCHEMA_OWNER Owner of the object
SCHEMA_OBJECT_ID ID of the schema object
SCHEMA_IS_VIRTUAL States whether this is a virtual schema
SCHEMA_COMMENT Comment on the object
EXA_SCHEMA_OBJECTS
Lists all objects that exist in the current schema.
Column Meaning
OBJECT_NAME Name of the object
OBJECT_TYPE Type of the object
EXA_SESSION_CONNECTIONS
List of all connections the user can access. The entries for columns CONNECTION_STRING and USER_NAME
are only displayed if the user can edit the connection (see also ALTER CONNECTION).
Column Meaning
CONNECTION_NAME Name of the connection
CONNECTION_STRING Defines the target of the connection
USER_NAME User name which is used when a connection is used
CREATED Time the connection was created
CONNECTION_COMMENT Comment on the connection
EXA_SESSION_PRIVS
Lists all of the system privileges the current user holds.
Column Meaning
PRIVILEGE Name of the system privilege
EXA_SESSION_ROLES
Lists all roles held by the current user.
Column Meaning
ROLE_NAME Name of the role
ROLE_PRIORITY Priority of the role
ROLE_COMMENT Comment on the role
EXA_SPATIAL_REF_SYS
List of supported spatial reference systems.
Column Meaning
SRID Spatial reference system identifier
AUTH_NAME Spatial reference system authority name
AUTH_SRID Authority specific spatial reference system identifier
SRTEXT WKT description of the spatial reference system
PROJ4TEXT Parameters for Proj4 projections
EXA_SQL_KEYWORDS
This system table contains all SQL keywords in Exasol.
Column Meaning
KEYWORD Keyword
RESERVED Defines whether the keyword is reserved. Reserved keywords cannot be used as SQL identifiers (see also Section 2.1.2, “SQL identifier”).
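This table can be queried to check identifiers against the reserved words, for example:

```sql
-- All reserved keywords, which cannot be used as regular SQL identifiers
SELECT keyword
  FROM exa_sql_keywords
 WHERE reserved = TRUE
 ORDER BY keyword;
```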
EXA_SQL_TYPES
This system table describes the SQL data types of Exasol.
Column Meaning
TYPE_NAME Name according to SQL standard
TYPE_ID ID of the data type
PRECISION The precision for numeric values; the (maximum) length in bytes for strings and other types
LITERAL_PREFIX Prefix, with which a literal of this type must be initiated
LITERAL_SUFFIX Suffix, with which a literal of this type must be terminated
CREATE_PARAMS Information on what is necessary in order to create a column of this type
IS_NULLABLE States whether NULL values are allowed (TRUE or FALSE).
CASE_SENSITIVE States whether case sensitivity is relevant to the type
SEARCHABLE States how the type can be used in a WHERE clause:
0 Cannot be searched
1 Can only be searched with WHERE .. LIKE
2 Can be searched with any WHERE clause except LIKE
3 Can be searched with any WHERE clause
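The SEARCHABLE codes can be used to restrict queries to fully searchable types, for example:

```sql
-- Data types that can be used with any WHERE clause
SELECT type_name
  FROM exa_sql_types
 WHERE searchable = 3;
```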
EXA_STATISTICS_OBJECT_SIZES
This system table contains the sizes of all statistical system tables aggregated by the type of the statistical info (see
also Section A.2.3, “Statistical system tables”).
Column Meaning
STATISTICS_TYPE Type of statistics
OTHER Miscellaneous statistics, e.g. for internal optimizations
EXA_TIME_ZONES
This system table lists all named timezones supported by the database.
Column Meaning
TIME_ZONE_NAME Time zone name
EXA_USER_COLUMNS
This system table contains information on the columns of tables owned by the current user.
Column Meaning
COLUMN_SCHEMA Associated schema
COLUMN_TABLE Associated table
COLUMN_OBJECT_TYPE Associated object type
COLUMN_NAME Name of column
COLUMN_TYPE Data type of column
COLUMN_TYPE_ID ID of data type
COLUMN_MAXSIZE Maximum number of characters for strings
COLUMN_NUM_PREC Precision for numeric values
COLUMN_NUM_SCALE Scale for numeric values
COLUMN_ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_IS_VIRTUAL States whether the column is part of a virtual table
COLUMN_IS_NULLABLE States whether NULL values are allowed (TRUE or FALSE). For views this value is always NULL.
COLUMN_IS_DISTRIBUTION_KEY States whether the column is part of the distribution key (TRUE or FALSE)
COLUMN_DEFAULT Default value of the column
COLUMN_IDENTITY Current value of the identity number generator, if this column has the identity attribute.
COLUMN_OWNER Owner of the associated object
COLUMN_OBJECT_ID ID of the column
STATUS Status of the object
COLUMN_COMMENT Comment on the column
EXA_USER_CONNECTION_PRIVS
Lists all connections which were granted directly to the user.
Column Meaning
GRANTEE Name of the user who received the right
GRANTED_CONNECTION Name of the connection which was granted
ADMIN_OPTION Information on whether the connection can be passed on to other users/roles
EXA_USER_RESTRICTED_OBJ_PRIVS
Lists all connection objects to which certain restricted scripts have been granted access, granted directly to the current user.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
OBJECT_SCHEMA Schema of the object for which the restricted privilege has been granted
OBJECT_NAME Name of the object for which the restricted privilege has been granted
OBJECT_TYPE Type of the object for which the restricted privilege has been granted
FOR_OBJECT_SCHEMA Schema of the object that the privilege is restricted to
FOR_OBJECT_NAME Name of the object that the privilege is restricted to
FOR_OBJECT_TYPE Type of the object that the privilege is restricted to
PRIVILEGE The restricted privilege that is granted
GRANTEE Name of the user/role who/which has been granted the restricted privilege
GRANTOR Name of the user/role who/which has granted the restricted privilege
OWNER Name of the owner of the object for which the restricted privilege is granted
EXA_USER_CONSTRAINTS
This system table contains information about constraints of tables owned by the current user.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_ENABLED Displays whether the constraint is checked or not
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
EXA_USER_CONSTRAINT_COLUMNS
This system table contains information about referenced table columns of all constraints owned by the current
user.
Column Meaning
CONSTRAINT_SCHEMA Associated schema
CONSTRAINT_TABLE Associated table
CONSTRAINT_TYPE Constraint type (PRIMARY KEY, FOREIGN KEY or NOT NULL)
CONSTRAINT_NAME Name of the constraint
CONSTRAINT_OWNER Owner of the constraint (which is the owner of the table)
ORDINAL_POSITION Position of the column in the table beginning at 1
COLUMN_NAME Name of the column
REFERENCED_SCHEMA Referenced schema (only for foreign keys)
REFERENCED_TABLE Referenced table (only for foreign keys)
REFERENCED_COLUMN Name of the column in the referenced table (only for foreign keys)
EXA_USER_DEPENDENCIES
Lists all direct dependencies between schema objects owned by the current user. Note that some dependencies, e.g. between scripts, cannot be determined. For a view, entries with REFERENCE_TYPE=NULL can be shown if underlying objects have been changed and the view has not been accessed again since.
Column Meaning
OBJECT_SCHEMA Schema of object
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
OBJECT_OWNER Owner of the object
OBJECT_ID ID of the object
REFERENCE_TYPE Reference type (VIEW, CONSTRAINT)
REFERENCED_OBJECT_SCHEMA Schema of the referenced object
REFERENCED_OBJECT_NAME Name of the referenced object
REFERENCED_OBJECT_TYPE Type of the referenced object
REFERENCED_OBJECT_OWNER Owner of the referenced object
REFERENCED_OBJECT_ID ID of the referenced object
EXA_USER_FUNCTIONS
This system table describes all functions in the database owned by the current user.
Column Meaning
FUNCTION_SCHEMA Schema of the function
FUNCTION_NAME Name of the function
FUNCTION_OWNER Owner of the function
FUNCTION_OBJECT_ID ID of the function
FUNCTION_TEXT Generation text of the function
FUNCTION_COMMENT Comment on the function
EXA_USER_INDICES
This system table describes all indices on tables owned by the current user. Note that indices are created and managed automatically by the system; this table exists mainly for transparency.
Column Meaning
INDEX_SCHEMA Schema of the index
INDEX_TABLE Table of the index
INDEX_OWNER Owner of the index
INDEX_OBJECT_ID ID of the index
INDEX_TYPE Index type
MEM_OBJECT_SIZE Index size in bytes (at last COMMIT)
CREATED Timestamp of when the index was created
LAST_COMMIT Last time the object was changed in the DB
REMARKS Additional information about the index
EXA_USER_OBJ_PRIVS
This table contains all of the object privileges granted for objects in the database to which the current user has access
(except via the PUBLIC role).
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
A.2. List of system tables
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_USER_OBJ_PRIVS_MADE
Lists all of the object privileges related to objects owned by the current user.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_USER_OBJ_PRIVS_RECD
Lists all object privileges granted directly to the user.
Column Meaning
OBJECT_SCHEMA Schema in which the target object is located
OBJECT_NAME Name of the object
OBJECT_TYPE Object type
PRIVILEGE The granted right
GRANTEE Recipient of the right
GRANTOR Name of the user who granted the right
OWNER Owner of the target object
EXA_USER_OBJECTS
This system table lists all of the objects owned by the current user.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID ID of the object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
OBJECT_IS_VIRTUAL States whether this is a virtual object
OBJECT_COMMENT Comment on the object
EXA_USER_OBJECT_SIZES
Contains the size of all database objects owned by the current user. The values are calculated recursively, i.e. the
size of a schema includes the total of all of the sizes of the schema objects contained therein.
Column Meaning
OBJECT_NAME Name of object
OBJECT_TYPE Type of object
RAW_OBJECT_SIZE Uncompressed volume of data in the object in bytes (at last
COMMIT)
MEM_OBJECT_SIZE Compressed volume of data in the object in bytes (at last COMMIT)
CREATED Timestamp of when the object was created
LAST_COMMIT Last time the object was changed in the DB
OWNER Owner of the object
OBJECT_ID ID of the object
OBJECT_IS_VIRTUAL States whether this is a virtual object
ROOT_NAME Name of the containing object
ROOT_TYPE Type of the containing object
ROOT_ID ID of the containing object
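As an illustration of how this table can be used, the following query (a sketch, not part of the official examples; it relies only on the columns listed above) estimates the compression ratio of the current user's tables:

```sql
-- Compression ratio per table, largest raw size first
SELECT OBJECT_NAME,
       RAW_OBJECT_SIZE,
       MEM_OBJECT_SIZE,
       RAW_OBJECT_SIZE / NULLIF(MEM_OBJECT_SIZE, 0) AS COMPRESSION_RATIO
  FROM EXA_USER_OBJECT_SIZES
 WHERE OBJECT_TYPE = 'TABLE'
 ORDER BY RAW_OBJECT_SIZE DESC;
```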
EXA_USER_ROLE_PRIVS
This table lists all of the roles directly granted to the current user (not via other roles).
Column Meaning
GRANTEE Name of the user
GRANTED_ROLE Name of the role possessed by the user
ADMIN_OPTION Information on whether the role can be passed on to other users/roles
EXA_USER_SCRIPTS
This system table describes all of the scripts in the database owned by the current user.
SCRIPT_SCHEMA Name of the schema of the script
SCRIPT_NAME Name of the script
SCRIPT_OWNER Owner of the script
SCRIPT_OBJECT_ID ID of the script object
SCRIPT_TYPE Type of the script (PROCEDURE, ADAPTER or UDF)
SCRIPT_LANGUAGE Script language
SCRIPT_INPUT_TYPE Script input type (NULL, SCALAR or SET)
SCRIPT_RESULT_TYPE Return type of the script (ROWCOUNT, TABLE, RETURNS or
EMITS)
SCRIPT_TEXT Complete creation text for a script
SCRIPT_COMMENT Comment on the script
EXA_USER_SESSIONS
This system table contains information on user sessions. Among other things, the most recent SQL statement is
shown.
Column Meaning
SESSION_ID Id of the session
USER_NAME Logged-in user
STATUS Current status of the session
OS_NAME Operating system of the client server
SCOPE_SCHEMA Name of the schema in which the user is located
PRIORITY Priority group
NICE NICE attribute
RESOURCES Allocated resources in percent
SQL_TEXT SQL text of the statement
EXA_USER_SYS_PRIVS
Lists all system privileges that have been directly granted to the user.
Column Meaning
GRANTEE Name of the user
PRIVILEGE Name of the system privilege
ADMIN_OPTION States whether it is permissible for the system privilege to be passed
on to other users/roles
EXA_USER_TABLES
This system table describes all of the tables in the database owned by the current user.
Column Meaning
TABLE_SCHEMA Name of the schema of the table
TABLE_NAME Name of the table
TABLE_OWNER Owner of the table
TABLE_OBJECT_ID ID of the table
TABLE_IS_VIRTUAL States whether this is a virtual table
TABLE_HAS_DISTRIBUTION_KEY States whether the table is explicitly distributed
TABLE_ROW_COUNT Number of rows in the table
DELETE_PERCENTAGE Fraction of the rows which are just marked as deleted, but not
yet physically deleted (in percent)
TABLE_COMMENT Comment on the table
EXA_USER_USERS
This table provides the same information as EXA_ALL_USERS, however, it is limited to the user currently logged
in.
Column Meaning
USER_NAME Name of the user
CREATED Time the user was created
USER_PRIORITY Priority of the user
USER_COMMENT Comment on the user
EXA_USER_VIEWS
Lists all views owned by the current user.
Column Meaning
VIEW_SCHEMA Name of the schema in which the view was created
VIEW_NAME Name of the view
SCOPE_SCHEMA Schema from which the view was created
VIEW_OWNER Owner of the view
VIEW_OBJECT_ID Internal ID of the view
VIEW_TEXT Text of the view, with which it was created
VIEW_COMMENT Comment on the view
EXA_USER_VIRTUAL_COLUMNS
Lists all columns of virtual tables owned by the current user. It contains the information that is specific to virtual
columns. Virtual columns are also listed in the table EXA_USER_COLUMNS.
Column Meaning
COLUMN_SCHEMA Associated virtual schema
COLUMN_TABLE Associated virtual table
COLUMN_NAME Name of the virtual column
COLUMN_OBJECT_ID ID of the virtual columns object
ADAPTER_NOTES The adapter can store additional information about the virtual
column in this field
EXA_USER_VIRTUAL_SCHEMA_PROPERTIES
This system table contains information on the properties of all virtual schemas belonging to the current user.
Column Meaning
SCHEMA_NAME Name of the virtual schema
SCHEMA_OBJECT_ID ID of the virtual schema object
PROPERTY_NAME Name of the property of the virtual schema
PROPERTY_VALUE Value of the property of the virtual schema
EXA_USER_VIRTUAL_TABLES
Lists all virtual tables owned by the current user. It contains the information that is specific to virtual tables.
Virtual tables are also listed in the table EXA_USER_TABLES.
Column Meaning
TABLE_SCHEMA Name of the virtual schema containing the virtual table
TABLE_NAME Name of the virtual table
TABLE_OBJECT_ID ID of the virtual table
LAST_REFRESH Timestamp of the last metadata refresh (when metadata was commit-
ted)
LAST_REFRESH_BY Name of the user that performed the last metadata refresh
ADAPTER_NOTES The adapter can store additional information about the table in this
field
EXA_VIRTUAL_SCHEMAS
Lists all virtual schemas and shows the properties that are specific to virtual schemas. Virtual schemas are also
listed in the table EXA_SCHEMAS.
Column Meaning
SCHEMA_NAME Name of the virtual schema
SCHEMA_OWNER Owner of the virtual schema
SCHEMA_OBJECT_ID ID of the virtual schema object
ADAPTER_SCRIPT Name of the adapter script used for this virtual schema
LAST_REFRESH Timestamp of the last metadata refresh
LAST_REFRESH_BY Name of the user that performed the last metadata refresh
ADAPTER_NOTES The adapter can store additional information about the schema in
this field
EXA_VOLUME_USAGE
Shows details of the database usage of the storage volumes.
Column Meaning
TABLESPACE The tablespace of the volume
VOLUME_ID The identifier of the volume
IPROC Number of the node
LOCALITY Describes the locality of the master segment. If the value is FALSE, then a
performance degradation of I/O operations is probable.
REDUNDANCY Redundancy level of the volume
HDD_TYPE Type of HDDs of the volume
HDD_COUNT Number of HDDs of the volume
HDD_FREE Physical free space on the node in GiB of all disks of the volume
VOLUME_SIZE Size of the volume on the node in GiB
USE Usage of the volume on the node in percent, i.e. 100 * (VOLUME_SIZE -
UNUSED_DATA) / VOLUME_SIZE
COMMIT_DATA Committed data on the node in GiB
SWAP_DATA Swapped data on the node in GiB
UNUSED_DATA Unused data on the node in GiB. Be aware that volume fragmentation might
prevent complete use of all unused data.
DELETE_DATA Deleted data on the node in GiB. This includes all data which cannot be
deleted immediately, e.g. because of an active backup or shrink operation.
MULTICOPY_DATA Multicopy data on the node in GiB. Multicopy data occurs in case of
transactions where one transaction reads old data and the other one writes new
data of the same table in parallel.
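For example, the following sketch flags nodes whose persistent volume is nearly full or whose master segment is not local. It is illustrative only; the column USE is quoted defensively in case it collides with a reserved word in your environment, and the 80% threshold is an arbitrary choice:

```sql
-- Per-node volume health check (illustrative)
SELECT IPROC, VOLUME_SIZE, "USE", LOCALITY
  FROM EXA_VOLUME_USAGE
 WHERE "USE" > 80
    OR LOCALITY = FALSE
 ORDER BY "USE" DESC;
```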
Statistics are updated periodically; for an explicit update, the command FLUSH STATISTICS is available (see
Section 2.2.6, “Other statements”).
Statistical system tables can be read by all users and are subject to the transactional concept (see
Section 3.1, “Transaction management ”). Therefore, you may have to open a new transaction to see the most
recent data.
To minimize transactional conflicts for the user and the DBMS, a transaction should access
either statistical system tables or normal database objects exclusively, not both.
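A minimal sketch of this workflow — forcing a statistics update and then reading a statistical table from a fresh transaction — could look as follows (EXA_SYSTEM_EVENTS is used here merely as an example of a statistical table):

```sql
FLUSH STATISTICS;   -- explicitly update the statistical system tables
COMMIT;             -- start a new transaction so the fresh data becomes visible

SELECT *
  FROM EXA_SYSTEM_EVENTS
 ORDER BY MEASURE_TIME DESC;
```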
EXA_DBA_AUDIT_SESSIONS
Lists all sessions if auditing is switched on in EXAoperation.
This system table can be cleared by the statement TRUNCATE AUDIT LOGS.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
LOGIN_TIME Login time
LOGOUT_TIME Logout time
USER_NAME User name
CLIENT Client application used by the user
DRIVER Used driver
ENCRYPTED Flag whether the connection is encrypted
HOST Computer name or IP address from which the user has logged-in
OS_USER User name under which the user logged into the operating system of
the computer from which the login came
OS_NAME Operating system of the client server
SUCCESS TRUE if the login was successful
EXA_DBA_AUDIT_SQL
Lists all executed SQL statements if the auditing is switched on in EXAoperation.
This system table can be cleared by the statement TRUNCATE AUDIT LOGS.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
SESSION_ID Id of the session (see also EXA_DBA_AUDIT_SESSIONS)
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
DURATION Duration of the statement in seconds
START_TIME Start point of the statement
STOP_TIME Stop point of the statement
CPU CPU utilization in percent
TEMP_DB_RAM_PEAK Maximal usage of temporary DB memory of the query in MiB (cluster wide)
HDD_READ Hard disk read ratio in MiB per second (per node, averaged over the duration)
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the duration)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged over the duration)
SUCCESS TRUE if the statement was executed successfully
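Assuming auditing is enabled in EXAoperation and the current user holds the "SELECT ANY DICTIONARY" privilege, a sketch that combines both audit tables to find the longest-running statements might look like this:

```sql
-- Ten longest-running audited statements with the executing user
SELECT s.USER_NAME,
       q.SESSION_ID,
       q.STMT_ID,
       q.COMMAND_NAME,
       q.DURATION,
       q.START_TIME
  FROM EXA_DBA_AUDIT_SQL q
  JOIN EXA_DBA_AUDIT_SESSIONS s
    ON q.SESSION_ID = s.SESSION_ID
 ORDER BY q.DURATION DESC
 LIMIT 10;
```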
EXA_DBA_PROFILE_LAST_DAY
Lists all profiling information of sessions with activated profiling. Details for this topic can also be found in
Section 3.9, “Profiling”.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
PART_ID Serially numbered id of the execution part within the statement
PART_NAME Name of the execution part (see also Section 3.9, “Profiling”)
PART_INFO Extended information of the execution part
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the duration)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged over the duration)
REMARKS Additional information
SQL_TEXT Corresponding SQL text
EXA_DBA_PROFILE_RUNNING
Lists all profiling information of running queries. Details for this topic can also be found in Section 3.9, “Profiling”.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
PART_ID Serially numbered id of the execution part within the statement
PART_NAME Name of the execution part (see also Section 3.9, “Profiling”)
PART_INFO Extended information of the execution part
PART_FINISHED Defines whether the execution part has already been finished
OBJECT_SCHEMA Schema of the processed object
OBJECT_NAME Name of the processed object
OBJECT_ROWS Number of rows of the processed object
OUT_ROWS Number of result rows of the execution part
DURATION Duration of the execution part in seconds
CPU CPU utilization in percent of the execution part (averaged over the duration)
TEMP_DB_RAM_PEAK Usage of temporary DB memory of the execution part in MiB (cluster wide,
maximum over the duration)
HDD_READ Hard disk read ratio in MiB per second (per node, averaged over the duration)
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the duration)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged over the duration)
REMARKS Additional information
SQL_TEXT Corresponding SQL text
EXA_DBA_SESSIONS_LAST_DAY
Lists all sessions of the last day.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
LOGIN_TIME Time of login
LOGOUT_TIME Time of logout
USER_NAME User name
CLIENT Client application used by the user
DRIVER Used driver
ENCRYPTED Flag whether the connection is encrypted
HOST Computer name or IP address from which the user has logged-in
OS_USER User name under which the user logged into the operating system of
the computer from which the login came
OS_NAME Operating system of the client server
SUCCESS Information whether the login was successful
ERROR_CODE Error code if the login failed
ERROR_TEXT Error text if the login failed
EXA_DBA_TRANSACTION_CONFLICTS
Lists all transaction conflicts.
This system table can be cleared by the statement TRUNCATE AUDIT LOGS.
Only users with the "SELECT ANY DICTIONARY" system privilege have access.
Column Meaning
SESSION_ID Id of the session
CONFLICT_SESSION_ID Session which produces the conflict
START_TIME Start time of the conflict
STOP_TIME End time of the conflict or NULL if the conflict is still open
CONFLICT_TYPE Type of the conflict (e.g. WAIT FOR COMMIT: one session has to wait
until the other one has committed)
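Because STOP_TIME is NULL while a conflict is still open, a sketch for monitoring currently unresolved conflicts could be:

```sql
-- Transaction conflicts that have not been resolved yet
SELECT SESSION_ID,
       CONFLICT_SESSION_ID,
       START_TIME,
       CONFLICT_TYPE
  FROM EXA_DBA_TRANSACTION_CONFLICTS
 WHERE STOP_TIME IS NULL
 ORDER BY START_TIME;
```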
EXA_DB_SIZE_LAST_DAY
This system table contains the database sizes of the recent 24 hours. The information is aggregated across all
cluster nodes.
Column Meaning
MEASURE_TIME Point of the measurement
RAW_OBJECT_SIZE Uncompressed data volume in GiB
MEM_OBJECT_SIZE Compressed data volume in GiB
AUXILIARY_SIZE Size in GiB of auxiliary structures like indices
STATISTICS_SIZE Size in GiB of statistical system tables
RECOMMENDED_DB_RAM_SIZE Recommended DB RAM size in GiB to exploit the maximal system
performance
STORAGE_SIZE Size of the persistent volume in GiB
USE Ratio of effectively used space of the persistent volume size in
percent
OBJECT_COUNT Number of schema objects in the database.
EXA_DB_SIZE_HOURLY
This system table describes the hourly aggregated database sizes sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
RAW_OBJECT_SIZE_AVG Average uncompressed data volume in GiB
RAW_OBJECT_SIZE_MAX Maximum uncompressed data volume in GiB
MEM_OBJECT_SIZE_AVG Average compressed data volume in GiB
MEM_OBJECT_SIZE_MAX Maximum compressed data volume in GiB
AUXILIARY_SIZE_AVG Average size in GiB of auxiliary structures like indices
AUXILIARY_SIZE_MAX Maximum size in GiB of auxiliary structures like indices
STATISTICS_SIZE_AVG Average size in GiB of statistical system tables
STATISTICS_SIZE_MAX Maximum size in GiB of statistical system tables
RECOMMENDED_DB_RAM_SIZE_AVG Average recommended DB RAM size in GiB to exploit the optimal
system performance
RECOMMENDED_DB_RAM_SIZE_MAX Maximum recommended DB RAM size in GiB to exploit the optimal
system performance
STORAGE_SIZE_AVG Average size of the persistent volume in GiB
STORAGE_SIZE_MAX Maximum size of the persistent volume in GiB
USE_AVG Average ratio of effectively used space of the persistent volume
size in percent
USE_MAX Maximum ratio of effectively used space of the persistent volume
size in percent
OBJECT_COUNT_AVG Average number of schema objects in the database.
OBJECT_COUNT_MAX Maximum number of schema objects in the database.
EXA_DB_SIZE_DAILY
This system table describes the daily aggregated database sizes sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
RAW_OBJECT_SIZE_AVG Average uncompressed data volume in GiB
RAW_OBJECT_SIZE_MAX Maximum uncompressed data volume in GiB
MEM_OBJECT_SIZE_AVG Average compressed data volume in GiB
MEM_OBJECT_SIZE_MAX Maximum compressed data volume in GiB
AUXILIARY_SIZE_AVG Average size in GiB of auxiliary structures like indices
AUXILIARY_SIZE_MAX Maximum size in GiB of auxiliary structures like indices
STATISTICS_SIZE_AVG Average size in GiB of statistical system tables
STATISTICS_SIZE_MAX Maximum size in GiB of statistical system tables
RECOMMENDED_DB_RAM_SIZE_AVG Average recommended DB RAM size in GiB to exploit the optimal
system performance
RECOMMENDED_DB_RAM_SIZE_MAX Maximum recommended DB RAM size in GiB to exploit the optimal
system performance
STORAGE_SIZE_AVG Average size of the persistent volume in GiB
STORAGE_SIZE_MAX Maximum size of the persistent volume in GiB
USE_AVG Average ratio of effectively used space of the persistent volume
size in percent
USE_MAX Maximum ratio of effectively used space of the persistent volume
size in percent
OBJECT_COUNT_AVG Average number of schema objects in the database.
OBJECT_COUNT_MAX Maximum number of schema objects in the database.
EXA_DB_SIZE_MONTHLY
This system table describes the monthly aggregated database sizes sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
RAW_OBJECT_SIZE_AVG Average uncompressed data volume in GiB
RAW_OBJECT_SIZE_MAX Maximum uncompressed data volume in GiB
MEM_OBJECT_SIZE_AVG Average compressed data volume in GiB
MEM_OBJECT_SIZE_MAX Maximum compressed data volume in GiB
AUXILIARY_SIZE_AVG Average size in GiB of auxiliary structures like indices
AUXILIARY_SIZE_MAX Maximum size in GiB of auxiliary structures like indices
STATISTICS_SIZE_AVG Average size in GiB of statistical system tables
STATISTICS_SIZE_MAX Maximum size in GiB of statistical system tables
RECOMMENDED_DB_RAM_SIZE_AVG Average recommended DB RAM size in GiB to exploit the optimal
system performance
RECOMMENDED_DB_RAM_SIZE_MAX Maximum recommended DB RAM size in GiB to exploit the optimal
system performance
STORAGE_SIZE_AVG Average size of the persistent volume in GiB
STORAGE_SIZE_MAX Maximum size of the persistent volume in GiB
USE_AVG Average ratio of effectively used space of the persistent volume
size in percent
USE_MAX Maximum ratio of effectively used space of the persistent volume
size in percent
OBJECT_COUNT_AVG Average number of schema objects in the database.
OBJECT_COUNT_MAX Maximum number of schema objects in the database.
EXA_MONITOR_LAST_DAY
This system table describes monitoring information (the maximal values in the cluster).
The data ratios are not indicators of hardware performance. They were introduced to
improve comparability in case of varying measure intervals. If you multiply a ratio
by the duration of the last interval, you get the actual data volume.
Column Meaning
MEASURE_TIME Point of the measurement
LOAD System load (equals the load value of program uptime)
CPU CPU utilization in percent (of the database instance, averaged over the last measure
interval)
TEMP_DB_RAM Usage of temporary DB memory in MiB (of the database instance, maximum over
the last measure interval)
HDD_READ Hard disk read ratio in MiB per second (per node, averaged over the last measure
interval)
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the last measure
interval)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged
over the last measure interval)
SWAP Swap ratio in MiB per second (averaged over the last measure interval). If this
value is higher than 0, a system configuration problem may exist.
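Following the note above — a ratio multiplied by the interval duration yields the actual data volume — the following sketch approximates the transferred network volume per measure interval. It assumes the analytic function LAG and the function SECONDS_BETWEEN are available in your version:

```sql
-- Approximate network volume in MiB per measure interval
SELECT MEASURE_TIME,
       NET * SECONDS_BETWEEN(
               MEASURE_TIME,
               LAG(MEASURE_TIME) OVER (ORDER BY MEASURE_TIME)
             ) AS NET_VOLUME_MIB
  FROM EXA_MONITOR_LAST_DAY
 ORDER BY MEASURE_TIME;
```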
EXA_MONITOR_HOURLY
This system table describes the hourly aggregated monitoring information (of values from
EXA_MONITOR_LAST_DAY), sorted by the interval start.
The data ratios are not indicators of hardware performance. They were introduced to
improve comparability in case of varying measure intervals. If you multiply a ratio
by the duration of the last interval, you get the actual data volume.
Column Meaning
INTERVAL_START Start point of the aggregation interval
LOAD_AVG Average system load (equals the 1-minute load value of program uptime)
LOAD_MAX Maximal system load (equals the 1-minute load value of program uptime)
CPU_AVG Average CPU utilization in percent (of the database instance)
CPU_MAX Maximal CPU utilization in percent (of the database instance)
TEMP_DB_RAM_AVG Average usage of temporary DB memory in MiB (of the database instance)
TEMP_DB_RAM_MAX Maximal usage of temporary DB memory in MiB (of the database instance)
HDD_READ_AVG Average hard disk read ratio in MiB per second
HDD_READ_MAX Maximal hard disk read ratio in MiB per second
HDD_WRITE_AVG Average hard disk write ratio in MiB per second
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second
NET_AVG Average network traffic ratio in MiB per second
NET_MAX Maximal network traffic ratio in MiB per second
SWAP_AVG Average swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
SWAP_MAX Maximal swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
EXA_MONITOR_DAILY
This system table describes the daily aggregated monitoring information (of values from
EXA_MONITOR_LAST_DAY), sorted by the interval start.
The data ratios are not indicators of hardware performance. They were introduced to
improve comparability in case of varying measure intervals. If you multiply a ratio
by the duration of the last interval, you get the actual data volume.
Column Meaning
INTERVAL_START Start point of the aggregation interval
LOAD_AVG Average system load (equals the 1-minute load value of program uptime)
LOAD_MAX Maximal system load (equals the 1-minute load value of program uptime)
CPU_AVG Average CPU utilization in percent (of the database instance)
CPU_MAX Maximal CPU utilization in percent (of the database instance)
TEMP_DB_RAM_AVG Average usage of temporary DB memory in MiB (of the database instance)
TEMP_DB_RAM_MAX Maximal usage of temporary DB memory in MiB (of the database instance)
HDD_READ_AVG Average hard disk read ratio in MiB per second
HDD_READ_MAX Maximal hard disk read ratio in MiB per second
HDD_WRITE_AVG Average hard disk write ratio in MiB per second
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second
NET_AVG Average network traffic ratio in MiB per second
NET_MAX Maximal network traffic ratio in MiB per second
SWAP_AVG Average swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
SWAP_MAX Maximal swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
EXA_MONITOR_MONTHLY
This system table describes the monthly aggregated monitoring information (of values from
EXA_MONITOR_LAST_DAY), sorted by the interval start.
The data ratios are not indicators of hardware performance. They were introduced to
improve comparability in case of varying measure intervals. If you multiply a ratio
by the duration of the last interval, you get the actual data volume.
Column Meaning
INTERVAL_START Start point of the aggregation interval
LOAD_AVG Average system load (equals the 1-minute load value of program uptime)
LOAD_MAX Maximal system load (equals the 1-minute load value of program uptime)
CPU_AVG Average CPU utilization in percent (of the database instance)
CPU_MAX Maximal CPU utilization in percent (of the database instance)
TEMP_DB_RAM_AVG Average usage of temporary DB memory in MiB (of the database instance)
TEMP_DB_RAM_MAX Maximal usage of temporary DB memory in MiB (of the database instance)
HDD_READ_AVG Average hard disk read ratio in MiB per second
HDD_READ_MAX Maximal hard disk read ratio in MiB per second
HDD_WRITE_AVG Average hard disk write ratio in MiB per second
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second
NET_AVG Average network traffic ratio in MiB per second
NET_MAX Maximal network traffic ratio in MiB per second
SWAP_AVG Average swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
SWAP_MAX Maximal swap ratio in MiB per second. If this value is higher than 0, a
system configuration problem may exist.
EXA_SQL_LAST_DAY
This system table contains all executed SQL statements, without any reference to the executing user or the
detailed SQL text. Only statements that could be successfully compiled are considered.
Column Meaning
SESSION_ID Id of the session
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
DURATION Duration of the statement in seconds
START_TIME Start point of the statement
STOP_TIME Stop point of the statement
CPU CPU utilization in percent
TEMP_DB_RAM_PEAK Maximal usage of temporary DB memory of the query in MiB (cluster wide)
HDD_READ Maximal hard disk read ratio in MiB per second (per node, averaged over the
last measure interval)
HDD_WRITE Maximal hard disk write ratio in MiB per second (per node, averaged over
the last measure interval)
NET Maximal network traffic ratio in MiB per second (sum of send/receive, per
node, averaged over the last measure interval)
SUCCESS Result of the statement
EXA_SQL_HOURLY
This system table contains the hourly aggregated number of executed SQL statements, sorted by the interval start.
Per interval, several entries are created, one for each command type (e.g. SELECT).
Column Meaning
INTERVAL_START Start point of the aggregation interval
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
SUCCESS Result of the statement
CPU_MAX Maximal CPU utilization in percent
TEMP_DB_RAM_PEAK_AVG Average usage of temporary DB memory of queries in MiB (cluster wide)
TEMP_DB_RAM_PEAK_MAX Maximal usage of temporary DB memory of queries in MiB (cluster wide)
HDD_READ_AVG Average hard disk read ratio in MiB per second (per node)
HDD_READ_MAX Maximal hard disk read ratio in MiB per second (per node)
HDD_WRITE_AVG Average hard disk write ratio in MiB per second (per node, COMMIT
only)
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second (per node, COMMIT
only)
NET_AVG Average network traffic ratio in MiB per second (per node)
NET_MAX Maximal network traffic ratio in MiB per second (per node)
ROW_COUNT_AVG Average number of result rows for queries, or number of affected rows
for DML and DDL statements
ROW_COUNT_MAX Maximal number of result rows for queries, or number of affected rows
for DML and DDL statements
EXECUTION_MODE Execution mode of the statement (e.g. EXECUTE for normal execution)
EXA_SQL_DAILY
This system table contains the daily aggregated number of executed SQL statements, sorted by the interval start.
Per interval, several entries are created, one for each command type (e.g. SELECT).
Column Meaning
INTERVAL_START Start point of the aggregation interval
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
SUCCESS Result of the statement
HDD_READ_AVG Average hard disk read ratio in MiB per second (per node)
HDD_READ_MAX Maximal hard disk read ratio in MiB per second (per node)
HDD_WRITE_AVG Average hard disk write ratio in MiB per second (per node, COMMIT
only)
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second (per node, COMMIT
only)
NET_AVG Average network traffic ratio in MiB per second (per node)
NET_MAX Maximal network traffic ratio in MiB per second (per node)
ROW_COUNT_AVG Average number of result rows for queries, or number of affected rows
for DML and DDL statements
ROW_COUNT_MAX Maximal number of result rows for queries, or number of affected rows
for DML and DDL statements
EXECUTION_MODE Execution mode of the statement (e.g. EXECUTE for normal execution)
EXA_SQL_MONTHLY
This system table contains the monthly aggregated number of executed SQL statements, sorted by the interval start.
Per interval, several entries are created, one for each command type (e.g. SELECT).
Column Meaning
INTERVAL_START Start point of the aggregation interval
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
SUCCESS Result of the statement
HDD_WRITE_MAX Maximal hard disk write ratio in MiB per second (per node, COMMIT
only)
NET_AVG Average network traffic ratio in MiB per second (per node)
NET_MAX Maximal network traffic ratio in MiB per second (per node)
ROW_COUNT_AVG Average number of result rows for queries, or number of affected rows
for DML and DDL statements
ROW_COUNT_MAX Maximal number of result rows for queries, or number of affected rows
for DML and DDL statements
EXECUTION_MODE Execution mode of the statement (e.g. EXECUTE for normal execution)
EXA_SYSTEM_EVENTS
This system table contains system events like startup or shutdown of the DBMS.
Column Meaning
MEASURE_TIME Time of the event
EVENT_TYPE Type of the event
EXA_USAGE_LAST_DAY
This system table contains information about the DBMS usage of the recent 24 hours.
Column Meaning
MEASURE_TIME Point of the measurement
USERS Number of users connected to the DBMS
QUERIES Number of concurrent queries
EXA_USAGE_HOURLY
This system table describes the hourly aggregated usage information of the DBMS, sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
USERS_AVG Average number of users connected to the DBMS
USERS_MAX Maximum number of users connected to the DBMS
QUERIES_AVG Average number of concurrent queries
QUERIES_MAX Maximum number of concurrent queries
IDLE Percentage of the interval during which no query was running at all
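The aggregated columns above can be used, for instance, to find recent load peaks. A hedged sketch (the EXA_STATISTICS schema location is assumed; ADD_DAYS and LIMIT are described in the SQL reference):

```sql
-- Hypothetical example: the ten busiest hours of the last week
SELECT interval_start, queries_max, users_max, idle
  FROM exa_statistics.exa_usage_hourly
 WHERE interval_start >= ADD_DAYS(CURRENT_TIMESTAMP, -7)
 ORDER BY queries_max DESC
 LIMIT 10;
```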
EXA_USAGE_DAILY
This system table describes the daily aggregated usage information of the DBMS, sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
USERS_AVG Average number of users connected to the DBMS
USERS_MAX Maximum number of users connected to the DBMS
QUERIES_AVG Average number of concurrent queries
QUERIES_MAX Maximum number of concurrent queries
IDLE Percentage of the interval during which no query was running at all
EXA_USAGE_MONTHLY
This system table describes the monthly aggregated usage information of the DBMS, sorted by the interval start.
Column Meaning
INTERVAL_START Start point of the aggregation interval
USERS_AVG Average number of users connected to the DBMS
USERS_MAX Maximum number of users connected to the DBMS
QUERIES_AVG Average number of concurrent queries
QUERIES_MAX Maximum number of concurrent queries
IDLE Percentage of the interval during which no query was running at all
EXA_USER_PROFILE_LAST_DAY
Lists all profiling information for the current user's sessions with activated profiling. Details on this topic can be found in Section 3.9, “Profiling”.
Column Meaning
SESSION_ID Id of the session
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
PART_ID Serially numbered id of the execution part within the statement
PART_NAME Name of the execution part (see also Section 3.9, “Profiling”)
PART_INFO Extended information of the execution part
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the duration)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged over the duration)
REMARKS Additional information
SQL_TEXT Corresponding SQL text
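A minimal profiling run might look like the following sketch; the table name in the profiled statement is hypothetical, while ALTER SESSION SET PROFILE, FLUSH STATISTICS and CURRENT_SESSION are described elsewhere in this manual:

```sql
-- Sketch: enable profiling, run a statement, then inspect its execution parts
ALTER SESSION SET PROFILE = 'ON';
SELECT COUNT(*) FROM my_schema.my_table;   -- hypothetical statement to profile
FLUSH STATISTICS;                          -- persist the statistics data
SELECT stmt_id, part_id, part_name, part_info, sql_text
  FROM exa_statistics.exa_user_profile_last_day
 WHERE session_id = CURRENT_SESSION
 ORDER BY stmt_id, part_id;
```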
EXA_USER_PROFILE_RUNNING
Lists all profiling information of your own running queries. Details on this topic can be found in Section 3.9, “Profiling”.
Column Meaning
SESSION_ID Id of the session
STMT_ID Serially numbered id of statement within a session
COMMAND_NAME Name of the statement (e.g. SELECT, COMMIT, MERGE etc.)
COMMAND_CLASS Class of statement (e.g. DQL, TRANSACTION, DML etc.)
PART_ID Serially numbered id of the execution part within the statement
PART_NAME Name of the execution part (see also Section 3.9, “Profiling”)
PART_INFO Extended information of the execution part
PART_FINISHED Defines whether the execution part has already been finished
OBJECT_SCHEMA Schema of the processed object
OBJECT_NAME Name of the processed object
OBJECT_ROWS Number of rows of the processed object
OUT_ROWS Number of result rows of the execution part
DURATION Duration of the execution part in seconds
CPU CPU utilization in percent of the execution part (averaged over the duration)
TEMP_DB_RAM_PEAK Usage of temporary DB memory of the execution part in MiB (cluster wide, maximum over the duration)
HDD_READ Hard disk read ratio in MiB per second (per node, averaged over the duration)
HDD_WRITE Hard disk write ratio in MiB per second (per node, averaged over the duration)
NET Network traffic ratio in MiB per second (sum of send/receive, per node, averaged over the duration)
REMARKS Additional information
SQL_TEXT Corresponding SQL text
EXA_USER_SESSIONS_LAST_DAY
Lists all of the current user's sessions of the last day.
Column Meaning
SESSION_ID Id of the session
LOGIN_TIME Time of login
LOGOUT_TIME Time of logout
USER_NAME User name
CLIENT Client application used by the user
DRIVER Used driver
ENCRYPTED Flag whether the connection is encrypted
HOST Computer name or IP address from which the user has logged in
OS_USER User name under which the user logged into the operating system of the computer from which the login came
OS_NAME Operating system of the client server
SUCCESS Information whether the login was successful
ERROR_CODE Error code if the login failed
ERROR_TEXT Error text if the login failed
EXA_USER_TRANSACTION_CONFLICTS_LAST_DAY
Lists all transaction conflicts linked to the current user's sessions.
This system table can be cleared by the statement TRUNCATE AUDIT LOGS.
Column Meaning
SESSION_ID Id of the session
CONFLICT_SESSION_ID Session that produced the conflict
START_TIME Start time of the conflict
STOP_TIME End time of the conflict or NULL if the conflict is still open
CONFLICT_TYPE Type of the conflict:
WAIT FOR COMMIT One session has to wait until the other one is committed
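A query for conflicts that are still open follows directly from the STOP_TIME definition above (the EXA_STATISTICS schema location is assumed):

```sql
-- Open transaction conflicts of the current user's sessions
SELECT session_id, conflict_session_id, start_time, conflict_type
  FROM exa_statistics.exa_user_transaction_conflicts_last_day
 WHERE stop_time IS NULL;
```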
CAT
This system table lists all of the tables and views in the current schema. CAT is not fully compatible with Oracle's, because here all objects owned by the current user are visible. This definition was chosen because in Exasol several schemas can belong to one user.
Column Meaning
TABLE_NAME Name of the table or the view
TABLE_TYPE Type of object: TABLE or VIEW
DUAL
Along the lines of Oracle's identically named system table, this system table can be used to output static information (e.g. "SELECT CURRENT_USER FROM DUAL"). It contains a single row with a single column.
Column Meaning
DUMMY Contains a NULL value
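Because DUAL contains exactly one row, it is handy whenever a scalar expression should be returned as a one-row result:

```sql
SELECT CURRENT_USER FROM DUAL;
SELECT 40 + 2 AS answer FROM DUAL;  -- returns a single row
```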
Appendix B. Details on rights management
An introduction to the basic concepts of rights management can be found in Section 3.2, “Rights management”. Further details on the various SQL statements can be found in Section 2.2.3, “Access control using SQL (DCL)”.
B.1. List of system and object privileges
B.2. Required privileges for SQL statements
B.2. Required privileges for SQL statements
B.3. System tables for rights management
User
Roles
EXA_ROLE_ROLE_PRIVS Roles possessed by the current user indirectly via other roles
EXA_SESSION_ROLES Roles possessed by the current user
Connections
System privileges
Object privileges
Appendix C. Compliance to the SQL standard
Although many manufacturers claim compliance with the SQL standard in their advertising, we are not aware of any system that actually supports all of the mandatory features of the SQL standard.
Symbol meaning:
✔ Fully supported
✓ Partially supported (i.e. not all sub-features are supported)
C.1. SQL 2008 Standard Mandatory Features
Feature Exasol
E061-07 Quantified comparison predicate
E061-08 EXISTS predicate ✔
E061-09 Subqueries in comparison predicate ✔
E061-11 Subqueries in IN predicate ✔
E061-12 Subqueries in quantified comparison predicate
E061-13 Correlated Subqueries ✔
E061-14 Search condition ✔
E071 Basic query expressions ✔
E071-01 UNION DISTINCT table operator ✔
E071-02 UNION ALL table operator ✔
E071-03 EXCEPT DISTINCT table operator ✔
E071-05 Columns combined via table operators need not have exactly the same data type. ✔
E071-06 Table operators in subqueries ✔
E081 Basic privileges ✓
E081-01 SELECT privilege at the table level ✔
E081-02 DELETE privilege ✔
E081-03 INSERT privilege at the table level ✔
E081-04 UPDATE privilege at the table level ✔
E081-05 UPDATE privilege at the column level
E081-06 REFERENCES privilege at the table level ✔
E081-07 REFERENCES privilege at the column level
E081-08 WITH GRANT OPTION
E081-09 USAGE privilege
E081-10 EXECUTE privilege ✔
E091 Set functions ✔
E091-01 AVG ✔
E091-02 COUNT ✔
E091-03 MAX ✔
E091-04 MIN ✔
E091-05 SUM ✔
E091-06 ALL quantifier ✔
E091-07 DISTINCT quantifier ✔
E101 Basic data manipulation ✔
E101-01 INSERT statement ✔
E101-03 Searched UPDATE statement ✔
E101-04 Searched DELETE statement ✔
E111 Single row SELECT statement
E121 Basic cursor support
E121-01 DECLARE CURSOR
E121-02 ORDER BY columns need not be in select list
E121-03 Value expressions in ORDER BY clause
E121-04 OPEN statement
E121-06 Positioned UPDATE statement
E121-07 Positioned DELETE statement
E121-08 CLOSE statement
E121-10 FETCH statement: implicit NEXT
E121-17 WITH HOLD cursors
E131 Null value support (nulls in lieu of values) ✔
E141 Basic integrity constraints ✓
E141-01 NOT NULL constraint ✔
E141-02 UNIQUE constraints of NOT NULL columns
E141-03 PRIMARY KEY constraint ✔
E141-04 Basic FOREIGN KEY constraint with the NO ACTION default for both referential delete action and referential update action. ✔
E141-06 CHECK constraint
E141-07 Column defaults ✔
E141-08 NOT NULL inferred on PRIMARY KEY ✔
E141-10 Names in a foreign key can be specified in any order
E151 Transaction support ✔
E151-01 COMMIT statement ✔
E151-02 ROLLBACK statement ✔
E152 Basic SET TRANSACTION statement
E152-01 SET TRANSACTION statement: ISOLATION LEVEL SERIALIZABLE clause
E152-02 SET TRANSACTION statement: READ ONLY and READ WRITE clauses
E153 Updatable queries with subqueries
E161 SQL comments using leading double minus ✔
E171 SQLSTATE support
E182 Module language
F031 Basic schema manipulation ✔
F031-01 CREATE TABLE statement to create persistent base tables ✔
F031-02 CREATE VIEW statement ✔
F031-03 GRANT statement ✔
F031-04 ALTER TABLE statement: ADD COLUMN clause ✔
F031-13 DROP TABLE statement: RESTRICT clause ✔
F031-16 DROP VIEW statement: RESTRICT clause ✔
F031-19 REVOKE statement: RESTRICT clause ✔
F041 Basic joined table ✔
F041-01 Inner join (but not necessarily the INNER keyword) ✔
F041-02 INNER keyword ✔
F041-03 LEFT OUTER JOIN ✔
F041-04 RIGHT OUTER JOIN ✔
F041-05 Outer joins can be nested ✔
F041-07 The inner table in a left or right outer join can also be used in an inner join ✔
F041-08 All comparison operators are supported (rather than just =) ✔
F051 Basic date and time ✓
F051-01 DATE data type (including support of DATE literal) ✔
F051-02 TIME data type (including the support of TIME literal) with fractional seconds precision of at least 0
F051-03 TIMESTAMP data type (including the support of TIMESTAMP literal) with fractional seconds precision of at least 0 and 6 ✓
F051-04 Comparison predicate on DATE, TIME and TIMESTAMP data types ✔
F051-05 Explicit CAST between datetime types and character string types ✔
F051-06 CURRENT_DATE ✔
F051-07 LOCALTIME
F051-08 LOCALTIMESTAMP ✔
F081 UNION and EXCEPT in views ✔
F131 Grouped operations ✓
F131-01 WHERE, GROUP BY and HAVING clauses supported in queries with grouped views ✔
F131-02 Multiple tables supported in queries with grouped views ✔
F131-03 Set functions supported in queries with grouped views ✔
F131-04 Subqueries with GROUP BY and HAVING clauses and grouped views ✔
F131-05 Single row SELECT with GROUP BY and HAVING clauses and grouped views
F181 Multiple module support
F201 CAST function ✔
F221 Explicit defaults ✔
F261 CASE expression ✔
F261-01 Simple CASE ✔
F261-02 Searched CASE ✔
F261-03 NULLIF ✔
F261-04 COALESCE ✔
F311 Schema definition statement ✓
F311-01 CREATE SCHEMA ✔
F311-02 CREATE TABLE for persistent base tables (within CREATE SCHEMA)
F311-03 CREATE VIEW (within CREATE SCHEMA)
F311-04 CREATE VIEW: WITH CHECK OPTION (within CREATE SCHEMA)
F311-05 GRANT STATEMENT (within CREATE SCHEMA)
F471 Scalar subquery values ✔
F481 Expanded NULL predicate ✔
F812 Basic flagging
S011 Distinct data types
T321 Basic SQL-invoked routines ✓
T321-01 User-defined functions with no overloading ✔
T321-02 User-defined stored procedures with no overloading
T321-03 Function invocation ✔
T321-04 CALL statement
T321-05 RETURN statement ✔
C.2. SQL 2008 Standard Optional Features
Feature Exasol
T631 IN predicate with one list element ✔
Appendix D. Supported Encodings for ETL
processes and EXAplus
Encoding Aliases
ASCII US-ASCII, US, ISO-IR-6, ANSI_X3.4-1968, ANSI_X3.4-1986, ISO_646.IRV:1991, ISO646-US, IBM367, IBM-367, CP367, CP-367, 367
ISO-8859-1 ISO8859-1, ISO88591, LATIN-1, LATIN1, L1, ISO-IR-100, ISO_8859-1:1987, ISO_8859-1, IBM819, IBM-819, CP819, CP-819, 819
ISO-8859-2 ISO8859-2, ISO88592, LATIN-2, LATIN2, L2, ISO-IR-101, ISO_8859-2:1987, ISO_8859-2
ISO-8859-3 ISO8859-3, ISO88593, LATIN-3, LATIN3, L3, ISO-IR-109, ISO_8859-3:1988, ISO_8859-3
ISO-8859-4 ISO8859-4, ISO88594, LATIN-4, LATIN4, L4, ISO-IR-110, ISO_8859-4:1988, ISO_8859-4
ISO-8859-5 ISO8859-5, ISO88595, CYRILLIC, ISO-IR-144, ISO_8859-5:1988, ISO_8859-5
ISO-8859-6 ISO8859-6, ISO88596, ARABIC, ISO-IR-127, ISO_8859-6:1987, ISO_8859-6, ECMA-114, ASMO-708
ISO-8859-7 ISO8859-7, ISO88597, GREEK, GREEK8, ISO-IR-126, ISO_8859-7:1987, ISO_8859-7, ELOT_928, ECMA-118
ISO-8859-8 ISO8859-8, ISO88598, HEBREW, ISO-IR-138, ISO_8859-8:1988, ISO_8859-8
ISO-8859-9 ISO8859-9, ISO88599, LATIN-5, LATIN5, L5, ISO-IR-148, ISO_8859-9:1989, ISO_8859-9
ISO-8859-11 ISO8859-11, ISO885911
ISO-8859-13 ISO8859-13, ISO885913, LATIN-7, LATIN7, L7, ISO-IR-179
ISO-8859-15 ISO8859-15, ISO885915, LATIN-9, LATIN9, L9
IBM850 IBM-850, CP850, CP-850, 850
IBM852 IBM-852, CP852, CP-852, 852
IBM855 IBM-855, CP855, CP-855, 855
IBM856 IBM-856, CP856, CP-856, 856
IBM857 IBM-857, CP857, CP-857, 857
IBM860 IBM-860, CP860, CP-860, 860
IBM861 IBM-861, CP861, CP-861, 861, CP-IS
IBM862 IBM-862, CP862, CP-862, 862
IBM863 IBM-863, CP863, CP-863, 863
IBM864 IBM-864, CP864, CP-864, 864
IBM865 IBM-865, CP865, CP-865, 865
IBM866 IBM-866, CP866, CP-866, 866
IBM868 IBM-868, CP868, CP-868, 868, CP-AR
IBM869 IBM-869, CP869, CP-869, 869, CP-GR
WINDOWS-1250 CP1250, CP-1250, 1250, MS-EE
WINDOWS-1251 CP1251, CP-1251, 1251, MS-CYRL
WINDOWS-1252 CP1252, CP-1252, 1252, MS-ANSI
WINDOWS-1253 CP1253, CP-1253, 1253, MS-GREEK
WINDOWS-1254 CP1254, CP-1254, 1254, MS-TURK
WINDOWS-1255 CP1255, CP-1255, 1255, MS-HEBR
WINDOWS-1256 CP1256, CP-1256, 1256, MS-ARAB
WINDOWS-1257 CP1257, CP-1257, 1257, WINBALTRIM
WINDOWS-1258 CP1258, CP-1258, 1258
WINDOWS-874 CP874, CP-874, 874, IBM874, IBM-874
WINDOWS-31J WINDOWS-932, CP932, CP-932, 932
WINDOWS-936 CP936, CP-936, 936, GBK, MS936, MS-936
CP949 WINDOWS-949, CP-949, 949
BIG5 WINDOWS-950, CP950, CP-950, 950, BIG, BIG5, BIG-5, BIG-FIVE, BIGFIVE, CN-BIG5, BIG5-CP950
SHIFT-JIS SJIS
UTF8 UTF-8, ISO10646/UTF8
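These encoding names (or any of their aliases) can be used, for example, in the file options of an IMPORT statement. A sketch with hypothetical table and file names:

```sql
-- Hypothetical example: load a Latin-1 encoded CSV file
IMPORT INTO my_schema.my_table
FROM LOCAL CSV FILE '/tmp/data.csv'
ENCODING = 'ISO-8859-1'
COLUMN SEPARATOR = ';';
```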
Appendix E. Customer Service
Requests
We recommend sending all requests to the following email address: <service@exasol.com>. A brief
description of the issue by email or, in urgent cases, by phone will suffice. Please always include your
contact information.
Internal processes
Customer Service receives all requests and, where possible, responds immediately. If an immediate reply is not possible,
the request is categorized and passed on to the relevant department at Exasol. In this case, the customer receives
feedback on when a conclusive or intermediate answer can be expected.
Downloads
This manual, the latest software (such as drivers), and further information are available in the customer
portal on our website, www.exasol.com.
Contact information
Exasol AG
Customer Service
Phone: 00800 EXASUPPORT (00800 3927 877 678)
Email: <service@exasol.com>
Abbreviations
A
ADO.NET ActiveX Data Objects .NET
B
BI Business Intelligence
C
CLI Call Level Interface
D
DB Database
E
ETL Extract, Transform, Load
F
FBV Fix Block Values
G
GB Gigabyte = 10^9 = 1,000,000,000 bytes
H
HPC High Performance Computing
I
ISO International Organization for Standardization
J
JDBC Java DataBase Connectivity
JSON JavaScript Object Notation - an open-standard format that uses human-readable text to transmit data objects consisting of attribute-value pairs
K
kB Kilobyte = 10^3 = 1,000 bytes
L
LDAP Lightweight Directory Access Protocol (authentication service)
M
MB Megabyte = 10^6 = 1,000,000 bytes
O
ODBC Open DataBase Connectivity
P
PCRE Perl Compatible Regular Expressions
S
SASL Simple Authentication and Security Layer
T
TMS Transaction Management System
U
UDF User Defined Function
W
WKB Well Known Binary (binary representation of geospatial data)
Index

A
ABS function, 143
ACOS function, 143
ADD_DAYS function, 144
ADD_HOURS function, 144
ADD_MINUTES function, 145
ADD_MONTHS function, 145
ADD_SECONDS function, 146
ADD_WEEKS function, 146
ADD_YEARS function, 147
ADO.NET Data Destination, 397
ADO.NET Data Processing Extension, 399
ADO.NET Data Provider, 393
  Installation, 393
  Use, 393
Alias, 77
ALTER ANY CONNECTION system privilege, 35
ALTER ANY CONNECTION System privilege, 66
ALTER ANY SCHEMA system privilege, 14
ALTER ANY TABLE system privilege, 19, 23, 24, 36
ALTER ANY VIRTUAL SCHEMA REFRESH system privilege, 14
ALTER ANY VIRTUAL SCHEMA system privilege, 14
ALTER CONNECTION statement, 66
ALTER object privilege, 14, 19, 23, 24
ALTER SCHEMA statement, 14
ALTER SESSION statement, 91
ALTER SYSTEM statement, 93
ALTER TABLE statement
  ADD COLUMN, 19
  ADD CONSTRAINT, 24
  ALTER COLUMN DEFAULT, 19
  ALTER COLUMN IDENTITY, 19
  DISTRIBUTE BY, 23
  DROP COLUMN, 19
  DROP CONSTRAINT, 24
  DROP DISTRIBUTION KEYS, 23
  MODIFY COLUMN, 19
  MODIFY CONSTRAINT, 24
  RENAME COLUMN, 19
  RENAME CONSTRAINT, 24
ALTER USER statement, 62
ALTER USER System privilege, 36
ALTER USER system privilege, 62
AND predicate, 131
APPROXIMATE_COUNT_DISTINCT Function, 148
ASC, 78
ASCII function, 148
ASIN function, 149
Assignment, 28
ATAN function, 149
ATAN2 function, 149
Auditing, 439
Authentication
  LDAP, 62
  Password, 61
AVG function, 150

B
BETWEEN predicate, 132
BIT_AND function, 150
BIT_CHECK function, 151
BIT_LENGTH function, 152
BIT_LROTATE function, 152
BIT_LSHIFT function, 153
BIT_NOT function, 153
BIT_OR function, 154
BIT_RROTATE function, 154
BIT_RSHIFT function, 155
BIT_SET function, 155
BIT_TO_NUM function, 156
BIT_XOR function, 157
BOOLEAN data type, 105
BucketFS, 65, 323

C
CASCADE
  in DROP SCHEMA statement, 13
  in DROP VIEW Statement, 27
CASCADE CONSTRAINTS
  in ALTER TABLE statement, 21
  in DROP TABLE statement, 19
  in REVOKE statement, 72
CASE function, 157
CAST function, 158
CEIL function, 159
CEILING function, 159
CHAR data type, 107
CHARACTER_LENGTH function, 159
CH[A]R function, 160
CLI
  Best Practice, 403
  Example, 402
  General, 400
  Linux/Unix version, 401
  Windows version, 400
CLOSE SCHEMA statement, 96
Cluster Enlargement, 100
COALESCE function, 160
COLOGNE_PHONETIC function, 161
Column
  Add column, 19
  Alter column identity, 19
  Change data type, 19
  Comment column, 36
  Delete column, 19
  Rename column, 19
  Set default value, 19
Column alias, 77
COMMENT statement, 36
Comments, 5
COMMIT Statement, 88
CONCAT function, 161
CONNECT BY, 77
  CONNECT_BY_ISCYCLE, 78
  CONNECT_BY_ISLEAF, 78
  CONNECT_BY_ROOT, 77
  LEVEL, 77
  NOCYCLE, 77
  START WITH, 77
  SYS_CONNECT_BY_PATH, 77
Connection
  Alter connection, 66
  Comment connection, 36
  Create connection, 65
  Drop Connection, 66
  Grant connection, 67
  Rename connection, 35
  Revoke connection, 70
CONNECT_BY_ISCYCLE, 78
CONNECT_BY_ISCYCLE function, 162
CONNECT_BY_ISLEAF, 78
CONNECT_BY_ISLEAF function, 162
CONNECT_BY_ROOT, 77
Constraints, 17
  FOREIGN KEY, 25
  NOT NULL, 25
  PRIMARY KEY, 25
  Status
    DISABLE, 25, 93, 95
    ENABLE, 25, 93, 95
CONSTRAINT_STATE_DEFAULT, 25, 93, 95
CONVERT function, 163
CONVERT_TZ function, 164
CORR function, 165
COS function, 165
COSH function, 166
COT function, 166
COUNT Function, 167
COVAR_POP function, 168
COVAR_SAMP function, 168
CREATE ANY FUNCTION system privilege, 27
CREATE ANY SCRIPT system privilege, 30
CREATE ANY TABLE system privilege, 15, 18
CREATE ANY VIEW system privilege, 26
CREATE CONNECTION statement, 65
CREATE CONNECTION system privilege, 36
CREATE CONNECTION System privilege, 65
CREATE FUNCTION statement, 27
CREATE FUNCTION system privilege, 27
CREATE ROLE statement, 63
CREATE ROLE system privilege, 35, 36
CREATE SCHEMA statement, 12
CREATE SCHEMA system privilege, 12
CREATE SCRIPT statement, 30
CREATE SCRIPT system privilege, 30
CREATE TABLE statement, 15
CREATE TABLE system privilege, 15, 18
CREATE USER statement, 61
CREATE USER system privilege, 35, 36, 61
CREATE VIEW statement, 25
CREATE VIEW system privilege, 26
CREATE VIRTUAL SCHEMA system privilege, 12
CROSS JOIN, 76
CSV Data format, 269
CUBE, 78
CURDATE function, 169
CURRENT_DATE function, 169
CURRENT_SCHEMA, 12
CURRENT_SCHEMA function, 170
CURRENT_SESSION function, 170
CURRENT_STATEMENT function, 171
CURRENT_TIMESTAMP function, 171
CURRENT_USER function, 172

D
Data Destination, 393, 397
Data Processing Extension, 393, 399
Data types, 104
  Aliases, 107, 108
  Date/Time
    DATE, 105
    INTERVAL DAY TO SECOND, 107
    INTERVAL YEAR TO MONTH, 107
    TIMESTAMP, 105
    TIMESTAMP WITH LOCAL TIME ZONE, 105
  Details, 104
  GEOMETRY, 107
  Numeric
    DECIMAL, 104
    DOUBLE PRECISION, 104
  Overview, 104
  Strings
    CHAR, 107
    VARCHAR, 107
  Type conversion rules, 108
Database
  Reorganize, 100
DATE data type, 105
DATE_TRUNC function, 172
DAY function, 173
DAYS_BETWEEN function, 173
DBTIMEZONE function, 174
DCL statements, 61
  ALTER CONNECTION, 66
  ALTER USER, 62
  CREATE CONNECTION, 65
  CREATE ROLE, 63
  CREATE USER, 61
  DROP CONNECTION, 66
  DROP ROLE, 64
  DROP USER, 63
  GRANT, 67
  REVOKE, 70
DDEX provider, 393
SET AUTOCOMMIT, 368 AVG, 150
SET AUTOCOMPLETION, 369 CORR, 165
SET COLSEPARATOR, 369 COUNT, 167
SET DEFINE, 369 COVAR_POP, 168
SET ENCODING, 370 COVAR_SAMP, 168
SET ESCAPE, 370 FIRST_VALUE, 179
SET FEEDBACK, 371 GROUPING[_ID], 183
SET HEADING, 371 GROUP_CONCAT, 182
SET LINESIZE, 372 LAST_VALUE, 190
SET NULL, 372 MAX, 199
SET NUMFORMAT, 372 MEDIAN, 200
SET PAGESIZE, 373 MIN, 201
SET SPOOL ROW SEPARATOR, 373 PERCENTILE_CONT, 209
SET TIME, 374 PERCENTILE_DISC, 210
SET TIMING, 374 REGR_*, 218
SET TRUNCATE HEADING, 374 REGR_AVGX, 219
SET VERBOSE, 375 REGR_AVGY, 219
SHOW, 375 REGR_COUNT, 219
SPOOL, 376 REGR_INTERCEPT, 219
START, 362 REGR_R2, 219
TIMING, 376 REGR_SLOPE, 219
UNDEFINE, 377 REGR_SXX, 219
WHENEVER, 377 REGR_SXY, 219
EXA_TIME_ZONES, 92, 94, 106, 164 REGR_SYY, 219
EXCEPT, 80 STDDEV, 233
(see also MINUS) STDDEV_POP, 234
EXECUTE ANY FUNCTION system privilege, 27 ST_INTERSECTION, 116
EXECUTE ANY SCRIPT system privilege, 31 ST_UNION, 117
EXECUTE ANY SCRIPT System privilege, 89 SUM, 236
EXECUTE object privilege, 27 VARIANCE, 252
EXECUTE Object privilege, 31, 89 VAR_POP, 251
EXECUTE SCRIPT statement, 89 VAR_SAMP, 251
EXISTS predicate, 132 Aggregated functions
EXP function, 178 STDDEV_SAMP, 235
EXPLAIN VIRTUAL statement, 98 Analytical functions, 140
EXPORT statement, 52 AVG, 150
EXTRACT function, 179 CORR, 165
COUNT, 167
F COVAR_POP, 168
FBV Data format, 271 COVAR_SAMP, 168
FIRST_VALUE function, 179 DENSE_RANK, 176
FLOOR function, 180 FIRST_VALUE, 179
FLUSH STATISTICS Statement, 101 LAG, 189
FOR loop, 28 LAST_VALUE, 190
FOREIGN KEY, 25 LEAD, 191
Verification of the property, 85 MAX, 199
Format models, 122 MEDIAN, 200
Date/Time, 122 MIN, 201
Numeric, 124 PERCENTILE_CONT, 209
FROM, 77 PERCENTILE_DISC, 210
FROM_POSIX_TIME Function, 181 RANK, 215
FULL OUTER JOIN, 76 RATIO_TO_REPORT, 215
Function REGR_*, 218
Rename function, 35 REGR_AVGY, 219
Functions, 136 REGR_COUNT, 219
Aggregate functions, 140 REGR_INTERCEPT, 219
APPROXIMATE_COUNT_DISTINCT, 148 REGR_R2, 219
REGR_SLOPE, 219
ST_OVERLAPS, 117 HASH_MD5, 183
ST_POINTN, 115 HASH_SHA[1], 184
ST_SETSRID, 117 HASH_TIGER, 185
ST_STARTPOINT, 115 IPROC, 188
ST_SYMDIFFERENCE, 117 LEAST, 193
ST_TOUCHES, 117 NPROC, 205
ST_TRANSFORM, 117 NULLIF, 206
ST_UNION, 117 NULLIFZERO, 206
ST_WITHIN, 117 NVL, 208
ST_X, 115 NVL2, 208
ST_Y, 115 ROWID, 225
Hierarchical queries, 139 SYS_GUID, 237
CONNECT_BY_ISCYCLE, 162 USER, 249
CONNECT_BY_ISLEAF, 162 VALUE2PROC, 250
LEVEL, 194 ZEROIFNULL, 254
SYS_CONNECT_BY_PATH, 237 String functions, 137
Numeric functions, 136 ASCII, 148
ABS, 143 BIT_LENGTH, 152
ACOS, 143 CHARACTER_LENGTH, 159
ASIN, 149 CH[A]R, 160
ATAN, 149 COLOGNE_PHONETIC, 161
ATAN2, 149 CONCAT, 161
CEIL, 159 DUMP, 177
CEILING, 159 EDIT_DISTANCE, 178
COS, 165 INSERT, 186
COSH, 166 INSTR, 187
COT, 166 LCASE, 191
DEGREES, 175 LEFT, 193
DIV, 176 LENGTH, 194
EXP, 178 LOCATE, 196
FLOOR, 180 LOWER, 198
LN, 195 LPAD, 198
LOG, 196 LTRIM, 199
LOG10, 197 MID, 201
LOG2, 197 OCTET_LENGTH, 209
MOD, 203 POSITION, 212
PI, 211 REGEXP_INSTR, 216
POWER, 213 REGEXP_REPLACE, 217
RADIANS, 213 REGEXP_SUBSTR, 218
RAND[OM], 214 REPEAT, 220
ROUND (number), 224 REPLACE, 221
SIGN, 229 REVERSE, 221
SIN, 229 RIGHT, 222
SINH, 229 RPAD, 226
SQRT, 231 RTRIM, 226
TAN, 239 SOUNDEX, 230
TANH, 239 SPACE, 230
TO_CHAR (number), 241 SUBSTR, 235
TRUNC[ATE] (number), 247 SUBSTRING, 235
Other scalar functions, 139 TRANSLATE, 245
CASE, 157 TRIM, 245
COALESCE, 160 UCASE, 247
CURRENT_SCHEMA, 170 UNICODE, 248
CURRENT_SESSION, 170 UNICODECHR, 248
CURRENT_STATEMENT, 171 UPPER, 249
CURRENT_USER, 172 User-defined
DECODE, 174 CREATE FUNCTION, 27
GREATEST, 181 DROP FUNCTION, 30
G POLYGON, 114
getBigDecimal(), 307
GEOMETRY data type, 107
getBoolean(), 307
GEOMETRY Object, 114
getDate(), 307
GEOMETRYCOLLECTION Object, 114
getDouble(), 307
Geospatial data, 114
getInteger(), 307
Empty set, 114
getLong(), 307
Functions, 115
getString(), 307
ST_AREA, 115
getTimestamp(), 307
ST_BOUNDARY, 116
GRANT ANY CONNECTION system privilege, 67, 70
ST_BUFFER, 116
GRANT ANY OBJECT PRIVILEGE system privilege,
ST_CENTROID, 116
67, 70
ST_CONTAINS, 116
GRANT ANY PRIORITY system privilege, 67, 70
ST_CONVEXHULL, 116
GRANT ANY PRIVILEGE system privilege, 67, 69,
ST_CROSSES, 116
70
ST_DIFFERENCE, 116
GRANT ANY ROLE system privilege, 67, 70
ST_DIMENSION, 116
GRANT statement, 67
ST_DISJOINT, 116
Graph analytics, 77
ST_DISTANCE, 116
Graph search, 77
ST_ENDPOINT, 115
GREATEST function, 181
ST_ENVELOPE, 116
GROUPING SETS, 78
ST_EQUALS, 116
GROUPING[_ID] function, 183
ST_EXTERIORRING, 116
GROUP_CONCAT function, 182
ST_FORCE2D, 116
ST_GEOMETRYN, 116
ST_GEOMETRYTYPE, 116 H
ST_INTERIORRINGN, 116 Hadoop, 266, 268
ST_INTERSECTION, 116 HASH_MD5 function, 183
ST_INTERSECTS, 116 HASH_SHA[1] function, 184
ST_ISCLOSED, 115 HASH_TIGER function, 185
ST_ISEMPTY, 116 HAVING, 78
ST_ISRING, 115 HCatalog, 268
ST_ISSIMPLE, 116 HOUR function, 185
ST_LENGTH, 115 HOURS_BETWEEN function, 186
ST_NUMGEOMETRIES, 116
ST_NUMINTERIORRINGS, 116 I
ST_NUMPOINTS, 115 Identifier, 5
ST_OVERLAPS, 117 delimited, 6
ST_POINTN, 115 regular, 5
ST_SETSRID, 117 reserved words, 7
ST_STARTPOINT, 115 schema-qualified, 7
ST_SYMDIFFERENCE, 117 Identity columns, 19, 21, 38, 39, 41, 112
ST_TOUCHES, 117 Display, 113
ST_TRANSFORM, 117 Example, 112
ST_UNION, 117 Set and change, 113
ST_WITHIN, 117 IF branch, 28
ST_X, 115 IF EXISTS
ST_Y, 115 in DROP COLUMN statement, 21
Objects, 114 in DROP CONNECTION statement, 67
Empty set, 114 in DROP FUNCTION statement, 30
GEOMETRY, 114 in DROP ROLE statement, 64
GEOMETRYCOLLECTION, 114 in DROP SCHEMA statement, 13
LINEARRING, 114 in DROP SCRIPT statement, 34
LINESTRING, 114 in DROP TABLE statement, 19
MULTILINESTRING, 114 in DROP USER statement, 63
MULTIPOINT, 114 in DROP VIEW statement, 27
MULTIPOLYGON, 114 IMPORT statement, 44, 79
POINT, 114 IN predicate, 133
INNER JOIN, 76 LIMIT, 79
INSERT ANY TABLE system privilege, 38 LINEARRING Object, 114
INSERT function, 186 LINESTRING Object, 114
INSERT object privilege, 38 Literals, 118
INSERT statement, 38 Boolean, 119
INSTR function, 187 Date/Time, 119
Interfaces Examples, 118
ADO.NET Data Provider, 393 Interval, 119
EXAplus, 353 NULL, 121
JDBC driver, 387 Numeric, 118
ODBC driver, 378 Strings, 121
SDK, 400 LN function, 195
WebSockets, 400 LOCAL, 77
INTERSECT, 80 LOCALTIMESTAMP function, 195
INTERVAL DAY TO SECOND data type, 107 LOCATE function, 196
INTERVAL YEAR TO MONTH data type, 107 LOG function, 196
IPROC function, 188 LOG10 function, 197
IS NULL predicate, 134 LOG2 function, 197
IS_BOOLEAN function, 188 LOWER function, 198
IS_DATE function, 188 LPAD function, 198
IS_DSINTERVAL function, 188 LTRIM function, 199
IS_NUMBER function, 188 Lua, 301
IS_TIMESTAMP function, 188
IS_YMINTERVAL function, 188 M
MapReduce, 300
J MAX function, 199
Java, 306 MEDIAN function, 200
JDBC driver, 387 MERGE statement, 40
Best Practice, 392 MID function, 201
Standards, 387 MIN function, 201
System requirements, 387 MINUS, 80
Use, 388 MINUTE function, 202
Join, 76, 77 MINUTES_BETWEEN function, 202
CROSS JOIN, 76 MOD function, 203
FULL OUTER JOIN, 76 MONTH function, 203
INNER JOIN, 76 MONTHS_BETWEEN function, 204
LEFT JOIN, 76 MULTILINESTRING Object, 114
OUTER JOIN, 76 MULTIPOINT Object, 114
RIGHT JOIN, 76 MULTIPOLYGON Object, 114
K N
Kerberos, 61, 384, 391 next(), 302, 307, 314
KILL statement, 90 next_row(), 318
NICE, 93, 263, 411, 420, 435, 440, 449
L NLS_DATE_FORMAT, 92, 94
LAG function, 189 NLS_DATE_LANGUAGE, 92, 94
LAST_VALUE function, 190 NLS_FIRST_DAY_OF_WEEK, 92, 94, 172, 223, 246
LCASE function, 191 NLS_NUMERIC_CHARACTERS, 92, 94
LDAP, 62 NLS_TIMESTAMP_FORMAT, 92, 94
LEAD function, 191 NOCYCLE, 77
LEAST function, 193 NOT NULL, 25
LEFT function, 193 NOT predicate, 132
LEFT JOIN, 76 NOW function, 204
LENGTH function, 194 NPROC function, 205
LEVEL, 77 NULL
LEVEL function, 194 Literal, 121
LIKE predicate, 135 NULLIF function, 206
REGEXP_SUBSTR function, 218 open, 96
REGR_* functions, 218 rename, 35
REGR_AVGX function, 219 Schema objects, 12
REGR_AVGY function, 219 Schema-qualified identifiers, 7
REGR_COUNT function, 219 Script
REGR_INTERCEPT function, 219 Comment script, 36
REGR_R2 function, 219 Creating a script, 30
REGR_SLOPE function, 219 Dropping a script, 34
REGR_SXX function, 219 Executing a script, 89
REGR_SXY function, 219 Functions
REGR_SYY function, 219 exit(), 284
Regular expressions, 8 import(), 284
Examples, 8 output(), 286
Pattern elements, 8 Rename script, 35
RENAME statement, 35 Scripting, 272
REORGANIZE statement, 100 Arrays, 275
REPEAT function, 220 Auxiliary functions for identifiers, 286
REPLACE function, 221 Comments, 273
Reserved words, 7 Control structures, 276
reset(), 302, 307, 314, 318 Debug output, 286
RESTRICT Dictionary Tables, 275
in DROP SCHEMA statement, 13 Error Handling, 279
in DROP VIEW statement, 27 Example, 272
RETURNS, 295 Executing SQL statements, 280
REVERSE function, 221 Execution blocks, 276
REVOKE statement, 70 Functions, 279
RIGHT function, 222 error(), 279
RIGHT JOIN, 76 join(), 287
Rights management, 259 pairs(), 275
Access control with SQL statements, 260 pcall(), 279
Example, 261 pquery(), 280
Meta information, 261 query(), 280
Privileges, 260 quote(), 287
Roles, 259 sqlparsing.find(), 338
User, 259 sqlparsing.getsqltext(), 339
Roles sqlparsing.isany(), 338
Create roles, 63 sqlparsing.iscomment(), 337
Delete roles, 64 sqlparsing.isidentifier(), 337
Grant rights, 67 sqlparsing.iskeyword(), 337
Revoke rights, 70 sqlparsing.isnumericliteral(), 338
ROLLBACK statement, 89 sqlparsing.isstringliteral(), 338
ROLLUP, 78 sqlparsing.iswhitespace(), 337
ROUND (datetime) function, 223 sqlparsing.iswhitespaceorcomment(), 337
ROUND (number) function, 224 sqlparsing.normalize(), 338
ROWID, 225 sqlparsing.setsqltext(), 339
ROW_NUMBER function, 224 sqlparsing.tokenize(), 337
RPAD function, 226 string.find(), 288
RTRIM function, 226 string.format(formatstring, e1, e2, ...), 290
string.gmatch(s, pattern), 288
S string.gsub(s, pattern, repl [, n]), 288
SCALAR, 295 string.len(s), 289
Schema string.lower(s), 289
change owner, 14 string.match(s, pattern [, init]), 288
close, 96 string.rep(s, n), 289
Comment schema, 36 string.reverse(s), 289
create, 12 string.sub(s, i [, j]), 288
delete, 13 string.upper(s), 289
table.concat(), 294
SUM function, 236
SYSDATE function, 238
System parameter
    CONSTRAINT_STATE_DEFAULT, 93, 95
    DEFAULT_LIKE_ESCAPE_CHARACTER, 92, 94
    NICE, 93
    NLS_DATE_FORMAT, 92, 94
    NLS_DATE_LANGUAGE, 92, 94
    NLS_FIRST_DAY_OF_WEEK, 92, 94
    NLS_NUMERIC_CHARACTERS, 92, 94
    NLS_TIMESTAMP_FORMAT, 92, 94
    PROFILE, 93, 95
    QUERY_CACHE, 92, 94
    QUERY_TIMEOUT, 92, 95
    SCRIPT_LANGUAGES, 93, 95
    SQL_PREPROCESSOR_SCRIPT, 93, 95
    TIMESTAMP_ARITHMETIC_BEHAVIOR, 92, 94
    TIME_ZONE, 92, 94
System privilege
    ACCESS ANY CONNECTION, 460
    ALTER ANY CONNECTION, 35, 66, 460
    ALTER ANY SCHEMA, 14, 460
    ALTER ANY TABLE, 19, 460
    ALTER ANY VIRTUAL SCHEMA, 14, 460
    ALTER ANY VIRTUAL SCHEMA REFRESH, 14, 460
    ALTER SYSTEM, 460
    ALTER USER, 62, 460
    CREATE ANY FUNCTION, 27, 461
    CREATE ANY SCRIPT, 30, 461
    CREATE ANY TABLE, 15, 18, 460
    CREATE ANY VIEW, 461
    CREATE CONNECTION, 65, 460
    CREATE FUNCTION, 27, 461
    CREATE ROLE, 35, 460
    CREATE SCHEMA, 12, 460
    CREATE SCRIPT, 30, 461
    CREATE SESSION, 61, 460
    CREATE TABLE, 15, 18, 460
    CREATE USER, 35, 61, 460
    CREATE VIEW, 461
    CREATE VIRTUAL SCHEMA, 12, 460
    DELETE ANY TABLE, 43, 460
    DROP ANY CONNECTION, 460
    DROP ANY FUNCTION, 461
    DROP ANY ROLE, 460
    DROP ANY SCHEMA, 13, 460
    DROP ANY SCRIPT, 461
    DROP ANY TABLE, 19, 460
    DROP ANY VIEW, 461
    DROP ANY VIRTUAL SCHEMA, 13, 460
    DROP USER, 63, 460
    EXECUTE ANY FUNCTION, 27, 461
    EXECUTE ANY SCRIPT, 31, 89, 461
    GRANT ANY CONNECTION, 67, 70, 460
    GRANT ANY OBJECT PRIVILEGE, 67, 70, 460
    GRANT ANY PRIORITY, 67, 70, 460
    GRANT ANY PRIVILEGE, 67, 69, 70, 460
    GRANT ANY ROLE, 67, 70, 460
    INSERT ANY TABLE, 38, 460
    KILL ANY SESSION, 460
    SELECT ANY DICTIONARY, 461
    SELECT ANY TABLE, 73, 461
    Summary, 459
    UPDATE ANY TABLE, 39, 461
    USE ANY CONNECTION, 44, 52, 460
System tables, 405
    CAT, 457
    DUAL, 457
    EXA_ALL_COLUMNS, 405
    EXA_ALL_CONNECTIONS, 406
    EXA_ALL_CONSTRAINTS, 406
    EXA_ALL_CONSTRAINT_COLUMNS, 407
    EXA_ALL_DEPENDENCIES, 407
    EXA_ALL_FUNCTIONS, 407
    EXA_ALL_INDICES, 408
    EXA_ALL_OBJECTS, 409
    EXA_ALL_OBJECT_SIZES, 410
    EXA_ALL_OBJ_PRIVS, 408
    EXA_ALL_OBJ_PRIVS_MADE, 408
    EXA_ALL_OBJ_PRIVS_RECD, 409
    EXA_ALL_ROLES, 410
    EXA_ALL_SCRIPTS, 410
    EXA_ALL_SESSIONS, 411
    EXA_ALL_TABLES, 411
    EXA_ALL_USERS, 412
    EXA_ALL_VIEWS, 412
    EXA_ALL_VIRTUAL_COLUMNS, 412
    EXA_ALL_VIRTUAL_SCHEMA_PROPERTIES, 413
    EXA_ALL_VIRTUAL_TABLES, 413
    EXA_DBA_COLUMNS, 413
    EXA_DBA_CONNECTIONS, 414
    EXA_DBA_CONNECTION_PRIVS, 414
    EXA_DBA_CONSTRAINTS, 415
    EXA_DBA_CONSTRAINT_COLUMNS, 415
    EXA_DBA_DEPENDENCIES, 416
    EXA_DBA_DEPENDENCIES_RECURSIVE, 416
    EXA_DBA_FUNCTIONS, 417
    EXA_DBA_INDICES, 417
    EXA_DBA_OBJECTS, 418
    EXA_DBA_OBJECT_SIZES, 418
    EXA_DBA_OBJ_PRIVS, 417
    EXA_DBA_RESTRICTED_OBJ_PRIVS, 415
    EXA_DBA_ROLES, 419
    EXA_DBA_ROLE_PRIVS, 419
    EXA_DBA_SCRIPTS, 419
    EXA_DBA_SESSIONS, 420
    EXA_DBA_SYS_PRIVS, 420
    EXA_DBA_TABLES, 421
    EXA_DBA_USERS, 421
    EXA_DBA_VIEWS, 421
    EXA_DBA_VIRTUAL_COLUMNS, 422
    EXA_DBA_VIRTUAL_SCHEMA_PROPERTIES, 422
    EXA_DBA_VIRTUAL_TABLES, 422
TIME_ZONE_BEHAVIOR, 92, 94, 105
TO_CHAR (datetime) function, 240
TO_CHAR (number) function, 241
TO_DATE function, 241
TO_DSINTERVAL function, 242
TO_NUMBER function, 243
TO_TIMESTAMP function, 243
TO_YMINTERVAL function, 244
Transaction, 88, 89, 102, 257, 261, 439
Transaction conflict, 428, 439, 443, 456
Transaction management, 257
TRANSLATE function, 245
TRIM function, 245
TRUNCATE AUDIT LOGS statement, 101
TRUNCATE statement, 43
TRUNC[ATE] (datetime) function, 246
TRUNC[ATE] (number) function, 247

U
UCASE function, 247
UDF scripts, 295
    Access to external services, 301
    Aggregate and analytical functions, 297
    BucketFS, 323
    cleanup(), 301, 307, 313, 317
    Dynamic input and output parameters, 298
    Dynamic parameter list, 302, 308, 313, 317
    emit(), 302, 307, 314, 319
    EMITS, 295
    init(), 307
    Introducing examples, 296
    Introduction, 295
    Java, 306
    Lua, 301
    MapReduce programs, 300
    Metadata, 308
    next(), 302, 307, 314
    next_row(), 318
    ORDER BY, 296
    Parameter, 308
    Parameters, 301, 313, 317
    Performance, 296
    Python, 313
    R, 317
    reset(), 302, 307, 314, 318
    RETURNS, 295
    run(), 301, 307, 313, 317
    SCALAR, 295
    Scalar functions, 296
    SET, 295
    size(), 302, 307, 314, 318
    User-defined ETL using UDFs, 301
UNICODE function, 248
UNICODECHR function, 248
UNION [ALL], 80
UNIQUE
    Verification of the property, 84
UPDATE ANY TABLE system privilege, 39
UPDATE object privilege, 39
UPDATE statement, 39
UPPER function, 249
User
    Change password, 62
    Create user, 61
    Delete user, 63
    Grant rights, 67
    Revoke rights, 70
USER function, 249
User-defined functions
    Assignment, 28
    CREATE FUNCTION, 27
    DROP FUNCTION, 30
    FOR loop, 28
    IF branch, 28
    Syntax, 27
    WHILE loop, 28
USING, 77

V
VALUE2PROC function, 250
VARCHAR data type, 107
VARIANCE function, 252
VAR_POP function, 251
VAR_SAMP function, 251
View
    Comment view, 36
    Create view, 25
    Delete view, 26
    INVALID, 26
    Rename view, 35
    Status, 26
Virtual schemas, 330
    Access concept, 332
    Adapters and properties, 331
    Details for experts, 334
    EXPLAIN VIRTUAL, 98
    Metadata, 333
    Privileges for administration, 332
    Virtual schemas and tables, 330

W
WebSockets, 400
WEEK function, 253
WHERE, 77
WHILE loop, 28
WITH, 76

Y
YEAR function, 253
YEARS_BETWEEN function, 254

Z
ZEROIFNULL function, 254
ZeroMQ, 328