IBM Informix Warehouse Accelerator Administration Guide
Version 11.70
SC27-3851-00
Note: Before using this information and the product it supports, read the information in Notices on page E-1.
Edition notice
This document contains proprietary information of IBM. It is provided under a license agreement and is protected by copyright law. The information contained in this publication does not include any product warranties, and any statements provided in this manual should not be interpreted as such. When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. Copyright IBM Corporation 2010, 2011. US Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Introduction . . . v
About this publication . . . v
Types of users . . . v
Software dependencies . . . v
Assumptions about your locale . . . v
What's New in the Informix Warehouse Accelerator . . . vi
Example code conventions . . . vii
Additional documentation . . . vii
Compliance with industry standards . . . viii
Syntax diagrams . . . viii
How to read a command-line syntax diagram . . . ix
Keywords and punctuation . . . x
Identifiers and names . . . x
How to provide documentation feedback . . . xi
Chapter 2. Accelerator installation . . . 2-1
Accelerator directory structure . . . 2-1
Installing the accelerator . . . 2-3
Installing the administration interface . . . 2-4
Uninstalling the accelerator . . . 2-5
Chapter 3. Accelerator configuration . . . 3-1
Configuring the accelerator (non-cluster installation) . . . 3-1
Configuring the accelerator (cluster installation) . . . 3-2
dwainst.conf configuration file . . . 3-3
Connecting the database server to the accelerator . . . 3-6
Enabling and disabling query acceleration . . . 3-7
The ondwa utility . . . 3-9
Users who can run the ondwa commands . . . 3-9
ondwa setup command . . . 3-10
ondwa start command . . . 3-11
ondwa status command . . . 3-12
ondwa getpin command . . . 3-13
ondwa tasks command . . . 3-13
ondwa stop command . . . 3-14
ondwa reset command . . . 3-14
ondwa clean command . . . 3-15
Creating data mart definitions by using workload analysis
Deploying a data mart
Loading data into data marts
Refreshing the data in a data mart
Updating the data in a data mart
Drop a data mart
Handling schema changes
Removing probing data from the database
Monitoring AQTs
Chapter 5. Reversion requirements for an Informix warehouse edition server and Informix Warehouse Accelerator . . . 5-1
Chapter 6. Troubleshooting . . . 6-1
Missing sbspace . . . 6-1
Memory issues for the coordinator node and the worker nodes . . . 6-1
Ensuring a result set includes the most current data . . . 6-1
Appendix A. Sample warehouse schema . . . A-1
Appendix B. Sysmaster interface (SMI) pseudo tables for query probing data . . . B-1
Appendix C. Supported locales . . . C-1
Appendix D. Accessibility . . . D-1
Accessibility features for IBM Informix products . . . D-1
Accessibility features . . . D-1
Keyboard navigation . . . D-1
Related accessibility information . . . D-1
IBM and accessibility . . . D-1
Dotted decimal syntax diagrams . . . D-1
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-1
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-3
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . X-1
Introduction
About this publication
This publication contains comprehensive information about using the Informix Warehouse Accelerator to process data warehouse queries more quickly than processing the queries using the Informix database server.
Types of users
This publication is written for the following users:
v Database administrators
v System administrators
v Performance engineers
v Application developers
This publication is written with the assumption that you have the following background:
v A working knowledge of your computer, your operating system, and the utilities that your operating system provides
v Some experience working with relational and dimensional databases or exposure to database concepts
v Some experience with database server administration, operating-system administration, network administration, or application development
You can access the Informix information centers, as well as other technical information such as technotes, white papers, and IBM Redbooks publications online at http://www.ibm.com/software/data/sw-library/.
Software dependencies
This publication is written with the assumption that you are using IBM Informix Version 11.70.xC2 or later as your database server.
Assumptions about your locale
This publication is written with the assumption that you are using the default locale, en_us.8859-1. The default locales support English format conventions for displaying and entering date, time, number, and currency values. They also support the ISO 8859-1 code set (on UNIX and Linux) or the Microsoft 1252 code set (on Windows), which includes the ASCII code set plus many 8-bit accented characters. You can specify another locale if you plan to use characters from other locales in your data or your SQL identifiers, or if you want to conform to other collation rules for character data.
See Users who can run the ondwa commands on page 3-9.
See the list of scalar functions at Supported functions and expressions on page 1-9.
Table 2. What's New in IBM Informix Warehouse Accelerator Administration Guide for IBM Informix Version 11.70.xC3

Create data mart definitions automatically: Creating data mart definitions is one of the more time-consuming tasks in setting up the accelerator. You can use a new capability in Informix Warehouse Accelerator to automatically create the data mart definitions for you. This capability is especially useful if there are a very large number of tables in your database and using the administration interface to create the data mart definitions is cumbersome. It is also useful when you are not intimately familiar with the table schemas in your database. See Creating data mart definitions by using the administration interface on page 4-8.

Additional locales supported: Previously, Informix Warehouse Accelerator supported only the default locale en_us.8859-1. Additional locales are now supported.
To use this SQL code for a specific product, you must apply the syntax rules for that product. For example, if you are using an SQL API, you must use EXEC SQL at the start of each statement and a semicolon (or other appropriate delimiter) at the end of the statement. If you are using DB-Access, you must delimit multiple statements with semicolons. Tip: Ellipsis points in a code example indicate that more code would be added in a full application, but it is not necessary to show it to describe the concept being discussed. For detailed directions on using SQL statements for a particular application development tool or SQL API, see the documentation for your product.
Additional documentation
Documentation about this release of IBM Informix products is available in various formats. You can access or install the product documentation from the Quick Start CD that is shipped with Informix products. To get the most current information, see the Informix information centers at ibm.com. You can access the information centers
and other Informix technical information such as technotes, white papers, and IBM Redbooks publications online at http://www.ibm.com/software/data/sw-library/.
Syntax diagrams
Syntax diagrams use special components to describe the syntax for statements and commands.
Table 3. Syntax Diagram Components. The components indicate, in order: where a statement begins (>>---); where a statement continues on the next line (--->); where a statement continues from the previous line; and where a statement ends. Required items appear on the main line. Optional items appear below the main line. Optional items with a choice are shown stacked below the main line, one of which you might specify.
Table 3. Syntax Diagram Components (continued). The remaining components indicate: a stack of values where the value above the main line is the default, and one of the values below the line might be specified instead (if you do not specify an item, the value above the line is used by default); optional items, several of which are allowed, where a comma must precede each repetition; and segment references, shown as a boxed name such as | Table Reference |, which point to a separately drawn segment diagram with its own start and end components.
This diagram has a segment named Setting the Run Mode, which according to the diagram footnote is on page Z-1. If this was an actual cross-reference, you would find this segment on the first page of Appendix Z. Instead, this segment is shown in the following segment diagram. Notice that the diagram uses segment start and end components.
To see how to construct a command correctly, start at the upper left of the main diagram. Follow the diagram to the right, including the elements that you want. The elements in this diagram are case-sensitive because they illustrate utility syntax. Other types of syntax, such as SQL, are not case-sensitive. The Creating a No-Conversion Job diagram illustrates the following steps: 1. Type onpladm create job and then the name of the job. 2. Optionally, type -p and then the name of the project. 3. Type the following required elements: v -n v -d and the name of the device v -D and the name of the database v -t and the name of the table 4. Optionally, you can choose one or more of the following elements and repeat them an arbitrary number of times: v -S and the server name v -T and the target server name v The run mode. To set the run mode, follow the Setting the Run Mode segment diagram to type -f, optionally type d, p, or a, and then optionally type l or u. 5. Follow the diagram to the terminator.
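Following the numbered steps above, a command assembled from the Creating a No-Conversion Job diagram might look like the sketch below. All of the names used (job1, project1, /dev/my_device, mydb, mytab, srv1, target1) are hypothetical placeholders, not values from this guide:

```shell
# Assemble an onpladm command by walking the diagram left to right.
# Every name below is a hypothetical placeholder.
CMD="onpladm create job job1"                      # 1. command and job name
CMD="$CMD -p project1"                             # 2. optional project name
CMD="$CMD -n -d /dev/my_device -D mydb -t mytab"   # 3. required elements
CMD="$CMD -S srv1 -T target1 -fd"                  # 4. optional, repeatable elements
echo "$CMD"
```

Note that step 4's run mode (-fd here: -f followed by the optional d) follows the Setting the Run Mode segment diagram.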
The following syntax diagram uses variables to illustrate the general form of a simple SELECT statement.
SELECT column_name FROM table_name
When you write a SELECT statement of this form, you replace the variables column_name and table_name with the name of a specific column and table.
[Figure content: multiple client applications connect through client connectivity to the Informix server.]
Figure 1-1. Informix Warehouse Accelerator installed on the same computer as the Informix database server.
You can use Informix Warehouse Accelerator with an Informix database server that supports a mixed workload (online transactional processing (OLTP) database and a data warehouse database), or use it with a database server that supports only a data warehouse database. However, before you use Informix Warehouse Accelerator you must design and implement a dimensional database that uses a star or snowflake schema for your data warehouse. This design includes selecting the business subject areas that you want to model, determining the granularity of the fact tables, and identifying the dimensions and hierarchies for each fact table. Additionally, you must identify the measures for the fact tables and determine the attributes for each dimension table. The following figure shows a sample snowflake schema with a fact table and multiple dimension tables.
[Figure content: the DAILY_SALES fact table surrounded by the dimension tables STORE, PERIOD, PROMOTION, PRODUCT_LINE, DEMOGRAPHICS, and CONTACT, with the related tables REGION, CITY, and MONTH.]
Figure 1-2. A sample snowflake schema that has the DAILY_SALES table as the fact table.
Administration interface
Informix Warehouse Accelerator includes an Eclipse-based administration interface, IBM Smart Analytics Optimizer Studio. You use this interface to administer the accelerator, and the data contained within the accelerator. The administration tasks are performed using a set of stored procedures in the Informix database server. The stored procedures are called through the administration interface.
Accelerator utilities
You also use utilities that are supplied with Informix Warehouse Accelerator to create the files and subdirectories that are required to run an accelerator instance and start the accelerator nodes. By default, the Informix Warehouse Accelerator uses one coordinator node and one worker node. The coordinator node is a process that manages the query tasks, such as load query and run query. The Informix database server and the ondwa utility connect to the coordinator node. The worker node is a process that communicates only with the coordinator node. The worker node has all of the data in main memory, compresses the data, and processes the queries.
Accelerator samples
Informix Warehouse Accelerator also includes a sample set of Java classes that you can use from the command line or in an application. You use these classes to perform many of the same tasks that you can perform with the administration interface. Tip: One reason that you might use these classes is to automate the steps required to refresh the data stored in the data marts. Instead of dropping and recreating the data marts manually through the administration interface, you can create an application that runs whenever it is convenient for your organization. See the dwa_java_reference.txt file in the dwa/example/cli/ directory for more information.
Accelerator architecture
The following figure shows that the database server communicates with the worker nodes through the coordinator node, and describes the roles of the database server, the coordinator node, and the worker nodes.
Coordinator node: manages the distribution tasks, such as loading data and query processing.
Worker nodes: store the data in main memory, spread across all of the nodes, and perform the data compression and query processing.
Figure 1-3. A sample accelerator node architecture with one coordinator node and four worker nodes.
Informix Warehouse Accelerator must have its own copy of the data. The data is stored in logical collections of related data, or data marts. After you create a data mart, information about the data mart is sent to the database server in the form of a special view referred to as an accelerated query table, or AQT. The architecture that is implemented with Informix Warehouse Accelerator is optimal for processing data warehouse queries. The demands of data warehouse queries are substantially different from those of OLTP queries. A typical data warehouse query requires the processing of a large set of data and returns a small result set. The overhead required to accelerate a query and return a result set is negligible compared with the benefits of using Informix Warehouse Accelerator:
v The accelerator uses a copy of the data.
v To expedite query processing, the copy of the data is kept in memory in a special compressed format. Advances in compression techniques, query processing on compressed data, and hybrid columnar organization of compressed data enable the accelerator to query the compressed data.
v The data is compressed and stored with the accelerator to maximize parallel query processing.
In addition to the performance gain of the warehouse query itself, the resources on the database server can be better utilized for other types of queries, such as OLTP queries, which perform more efficiently on the database server.
You can install Informix Warehouse Accelerator on the same computer as your Informix database server, on a separate computer, or on a cluster. There must be a TCP/IP connection between the database server and the accelerator. If the accelerator is installed on the same computer as the database server, the connection must be a local loop-back TCP/IP connection. The query optimizer on the database server identifies which data warehouse queries can be accelerated and sends those queries to the accelerator. The result set is sent back to the database server, which passes the result set back to the client. If the query cannot be processed by the accelerator, the query is processed on the database server. You use an administration tool called IBM Smart Analytics Optimizer Studio to perform the administration tasks that are required on the accelerator. The IBM Smart Analytics Optimizer Studio is commonly referred to as the administration interface. The administration tasks are implemented as a set of stored procedures that are called through the administration interface. You can install the administration interface on the same computer as the Informix database server or on a separate computer.
[Figure content: a client SQL query reaches the optimizer in the Informix database server, which decides whether to use the accelerator. Accelerated queries travel over the local loopback TCP/IP connection to the accelerator on the same symmetric multiprocessing system; other queries are processed by the database server. In either case the result set is returned to the client. The administrator interface connects to the database server.]
Figure 1-4. The accelerator and the database server installed on the same computer.
Figure 1-5. The accelerator and the database server installed on different computers.
Figure 1-6. The accelerator installed on a cluster system and the database server and administration interface installed on separate computers.
v Queries that access a large subset of the database, often by using sequential scans
v Queries that involve aggregation functions such as COUNT, SUM, AVG, MAX, MIN, and VARIANCE
v Queries that often create reports that group data by time, product, geography, customer set, or market
v Queries that involve star joins or snowflake joins of a large fact table with several dimension tables
Related concepts:
Analyze queries for acceleration on page 4-7
Types of queries that are not accelerated on page 1-8
THEN EXTENDED_PRICE ELSE 0 END) AS PRIOR_MONTH
FROM PERIOD, PRODUCT, DAILY_SALES, STORE
WHERE PRODUCT.PRODKEY = DAILY_SALES.PRODKEY
AND PERIOD.PERKEY = DAILY_SALES.PERKEY
AND STORE.STOREKEY = DAILY_SALES.STOREKEY
AND CALENDAR_DATE BETWEEN '7/1/2010' AND '8/14/2010'
AND ITEM_DESC LIKE 'NESTLE%'
GROUP BY STORE_NUMBER
ORDER BY STORE_NUMBER;
Queries that search only a small number of rows of data should be processed by the database server to avoid the overhead of sending the query to the accelerator.
Queries that would change the data cannot be processed by the accelerator and must be processed by the database server. The data in the accelerator is a snapshot view of the data and is read only. There is no mechanism to change the data in the data marts and replicate those changes back to the source database server. Other queries that are not processed by the accelerator include queries that contain INSERT, UPDATE, or DELETE statements, queries that contain subqueries, and other OLTP queries.
Related concepts:
Queries that benefit from acceleration on page 1-6
v VARIANCE
User-defined functions
User-defined functions are not supported.
Scalar functions
The following scalar functions are supported by the accelerator:
v ABS
v ADD_MONTHS
v CASE
v CEIL
v CONCAT
v COUNT
v DATE
v DAY
v DECODE
v FLOOR
Supported joins
Equality join predicates, INNER joins, and LEFT OUTER joins are the supported join types. The fact table referenced in the query must be on the left side of the LEFT OUTER join.
Unsupported joins
The following joins are not supported:
v RIGHT OUTER joins
v FULL OUTER joins
v Informix outer joins
v Joins that do not use an equality predicate
v Subqueries
Software prerequisites
There are separate software prerequisites for the Informix Warehouse Accelerator and the administration interface, IBM Smart Analytics Optimizer Studio. Informix Warehouse Accelerator must be installed on a computer that uses a Linux Intel x86 64-bit operating system. Informix Warehouse Accelerator can be installed on the same computer as the Informix database server, on a separate computer, or on a cluster. If you install the accelerator on a separate computer, then the Informix database server must be installed on a computer that uses one of the following operating systems:
v AIX 64-bit
v HP IA 64-bit
v Solaris SPARC 64-bit
v Linux Intel x86 64-bit
For a detailed list of the operating systems supported by the current version of Informix and by other IBM Informix products, download the platform availability spreadsheet from http://www.ibm.com/software/data/informix/pubs/roadmaps.html. Search for the product name, or sort the spreadsheet by name.
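The accelerator-host requirement above (Linux, Intel x86 64-bit) can be sanity-checked from a shell. This is only a sketch of the stated prerequisite; the installer performs its own checks:

```shell
# Verify the accelerator host runs 64-bit Linux on x86,
# per the software prerequisite above.
if [ "$(uname -s)" = "Linux" ] && [ "$(uname -m)" = "x86_64" ]; then
    OS_OK=yes
else
    OS_OK=no
fi
echo "Accelerator OS prerequisite satisfied: $OS_OK"
```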
v su command
Hardware prerequisites
Make certain that you have the appropriate hardware to support the Informix Warehouse Accelerator and the administration interface. Informix Warehouse Accelerator can be installed on the same computer as the Informix database server, on a separate computer, or on a cluster. You can install the administration interface on the same computer as the Informix database server or on a separate computer. Important: The accelerator caches compressed data in memory. It is essential that the computer where the accelerator is installed is configured with a large amount of memory.
The computer on which you install Informix Warehouse Accelerator must have a CPU with the Streaming SIMD Extensions 3 (SSE3) instruction set. To verify what is installed on the computer, you can run the cat /proc/cpuinfo command and look at the flags that are returned. For example, you can use the configuration that is shown in the following table:
Component            Capacity / Size
System               IBM System x3850 X5
Processor            Intel Xeon CPU X7560 @ 2.26GHz (8-core)
Number of processors 4
Memory               512 GB
For additional information about the IBM System x3850 X5, see the specifications at http://www.ibm.com/systems/x/hardware/enterprise/x3850x5/specs.html.
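The cat /proc/cpuinfo check described above can be scripted. Note that on Linux the SSE3 capability is usually reported in the flags line as pni (Prescott New Instructions), and on some kernels also as sse3; this sketch checks for either:

```shell
# Check whether the CPU reports the SSE3 instruction set required
# by the accelerator. SSE3 appears in /proc/cpuinfo flags as "pni"
# (and on some kernels as "sse3").
if grep -qE '^flags.*\b(pni|sse3)\b' /proc/cpuinfo 2>/dev/null; then
    SSE3_OK=yes
else
    SSE3_OK=no
fi
echo "SSE3 instruction set available: $SSE3_OK"
```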
Installation directory
The accelerator is installed in the directory that is specified by the INFORMIXDIR environment variable, if the variable is set in the environment in which the installer is launched. If the variable is not set, the default installation directory is /opt/IBM/informix. Whenever there is a reference to the file path for the accelerator installation directory, the file path appears as $IWA_INSTALL_DIR.
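The directory-selection rule above amounts to a simple fallback, sketched here with shell parameter expansion (the /usr/informix path in the second case is only an illustrative value):

```shell
# The installer uses INFORMIXDIR when it is set in the environment;
# otherwise it falls back to the default /opt/IBM/informix.
unset INFORMIXDIR                                    # simulate "not set"
IWA_INSTALL_DIR="${INFORMIXDIR:-/opt/IBM/informix}"
echo "$IWA_INSTALL_DIR"

INFORMIXDIR=/usr/informix                            # simulate "set" (example path)
IWA_INSTALL_DIR="${INFORMIXDIR:-/opt/IBM/informix}"
echo "$IWA_INSTALL_DIR"
```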
Storage directory
The accelerator instance resides in its own directory, referred to as the accelerator storage directory. This directory stores the accelerator catalog, data marts, logs, traces, and so forth. You create this directory when you configure the accelerator. The file path for this directory is stored in the DWADIR parameter in the dwainst.conf file.
Documentation directory
Before you install the accelerator, you can access the accelerator documentation in the following directories:
v The Release Notes file is in the $IWA_ROOT_DIR/doc directory
v The Quick Start Guide is in the $IWA_ROOT_DIR/quickstart directory
After you install the accelerator, you can access the accelerator documentation in the following directories:
v The release notes file is in the $IWA_INSTALL_DIR/release/en_us/0333/doc directory
Related tasks:
Configuring the accelerator (non-cluster installation) on page 3-1
Related reference:
dwainst.conf configuration file on page 3-3
You must update the ONCONFIG file before you start the database server. b. Use the onspaces command to create the sbspace. The following example creates an sbspace named sbspace1:
onspaces -c -S sbspace1 -p $INFORMIXDIR/tmp/sbspace1 -o 0 -s 30000
Note: The size of the sbspace can be relatively small, for example between 30 and 50 MB. c. Restart the Informix database server.
Related concepts:
Missing sbspace on page 6-1
Related tasks:
Setting up the sqlhosts file (UNIX) (Administrator's Guide)
Related reference:
NETTYPE Configuration Parameter (Administrator's Reference)
b. Read the license agreement and accept the terms.
c. Respond to the prompts in the installation program as the program guides you through the installation.
v For the silent mode:
a. Make a copy of the response file template that is located in the same directory as the Informix Warehouse Accelerator installation program. The name of the template is iwa.properties.
b. In the response file, change the value for license from FALSE to TRUE, to indicate that you accept the license terms. For example:
DLICENSE_ACCEPTED=TRUE
Chapter 2. Accelerator installation
c. Issue the installation command for the silent mode. The command for the silent mode installation is:
./iwa_install -i silent -f "file_path"
Tip: Specify the absolute path for the response file. For example, to use the silent mode with a response file called installer.properties that is located in the /usr3/iwa/ directory, the command is:
./iwa_install -i silent -f "/usr3/iwa/installer.properties"
After you complete the installation, there are additional steps if you want to install the administration interface.
v The accelerator is installed in the directory that is specified by the INFORMIXDIR environment variable, if the INFORMIXDIR environment variable is set in the environment in which the installer is launched. Otherwise, the default installation directory is /opt/IBM/informix.
v The configuration file, $IWA_INSTALL_DIR/dwa/etc/dwainst.conf, is generated during the installation. This configuration file is required to start the accelerator.
v The installation log file, $IWA_INSTALL_DIR/IBM_Informix_Warehouse_Accelerator_InstallLog.log, is generated during the installation. This log file provides information on the actions performed during installation and the success or failure status of those actions.
You must configure and start the accelerator before you can use it.
Related tasks:
Installing the administration interface
Chapter 3, Accelerator configuration, on page 3-1
Uninstalling the accelerator on page 2-5
2. If you are installing the administration interface on a separate computer, you can insert the provided media or FTP the installation program to the separate computer. 3. Run the installation program to install the administration interface. The administration interface is installed in the path that you specify in the installer, for example $IWA_INSTALL_DIR/dwa_gui.
Tip: Linux and UNIX only - You can use the -i swing command for graphical installation or the -i silent command for silent installation. The -i console command is not supported.
4. Ensure that the interface opens without any errors. After you complete the installation, open the directory where the administration interface is installed and run the ./datastudio command to start the administration interface.
You must configure and start the accelerator before you can use it.
Related tasks:
Installing the accelerator on page 2-3
Chapter 3, Accelerator configuration, on page 3-1
Uninstalling the accelerator
Because the amount of data in the accelerator storage directory might increase significantly, do not create the accelerator storage directory in the accelerator installation directory. You will specify the file path for this directory in the value for the DWADIR parameter in the dwainst.conf file. 4. Open the $IWA_INSTALL_DIR/dwa/etc/dwainst.conf configuration file. Review and edit the values in the dwainst.conf configuration file on page 3-3. Important: Specify the network interface value for the DRDA_INTERFACE parameter in the dwainst.conf file.
5. Run the ondwa setup command on page 3-10 to create the files and subdirectories that are required to run the accelerator instance.
6. Run the ondwa start command on page 3-11 to start all of the accelerator nodes.
Related concepts:
Accelerator directory structure on page 2-1
Related reference:
The ondwa utility on page 3-9
ondwa setup command on page 3-10
ondwa start command on page 3-11
dwainst.conf configuration file on page 3-3
Because the amount of data in the accelerator storage directory might increase significantly, do not create the accelerator storage directory in the accelerator installation directory. 4. Open the $IWA_INSTALL_DIR/dwa/etc/dwainst.conf configuration file. Review and edit the values in the dwainst.conf configuration file on page 3-3: a. For the DRDA_INTERFACE parameter, specify the network interface value that you identified in step 2. b. For the DWADIR parameter, specify the file path for the storage directory that you created in step 3. On all cluster nodes, the DWADIR parameter must be the same file path.
c. For the CLUSTER_INTERFACE parameter, specify the network device name for the connection between the cluster nodes. For example, eth0. d. If only one coordinator node or one worker node will run on each cluster node, add the following additional parameters and values:
CORES_FOR_SCAN_THREADS_PERCENTAGE=100 CORES_FOR_LOAD_THREADS_PERCENTAGE=100 CORES_FOR_REORG_THREADS_PERCENTAGE=25
5. In the $IWA_INSTALL_DIR/dwa/etc directory, create a file named cluster.conf to store the cluster node host names or IP addresses. In the cluster.conf file, enter one cluster node per line. For example:
node0001 node0002 node0003 node0004
The order in which you list the cluster node host names (or their IP addresses) is the order in which the cluster nodes are started or stopped with the ondwa start and ondwa stop commands. 6. Use the ondwa commands to set up and start the accelerator. You can run the ondwa commands from any node in the cluster. The ondwa commands apply to all the nodes listed in the cluster.conf file. a. Run the ondwa setup command on page 3-10 to create the files and subdirectories that are required to run the accelerator instance. Example output:
Checking for DWA_CM_node0 on node0001: stopped
Checking for DWA_CM_node1 on node0002: stopped
Checking for DWA_CM_node2 on node0003: stopped
Checking for DWA_CM_node3 on node0004: stopped
b. Run the ondwa start command on page 3-11 to start all of the cluster nodes. Example output:
Starting DWA_CM_node0 on node0001: started
Starting DWA_CM_node1 on node0002: started
Starting DWA_CM_node2 on node0003: started
Starting DWA_CM_node3 on node0004: started
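The start sequence shown above follows the order of the host names in cluster.conf. The following standalone sketch (not part of the ondwa utility; the function name is ours) shows how node IDs map to hosts in file order, with the first host becoming the coordinator node (node0):

```shell
# Print the start order implied by a cluster.conf file:
# node IDs are assigned in file order; the first host is node0.
print_start_order() {
    conf="$1"
    id=0
    while read -r host; do
        if [ -n "$host" ]; then
            echo "Starting DWA_CM_node${id} on ${host}"
            id=$((id + 1))
        fi
    done < "$conf"
}

# Demo with a temporary file matching the example cluster.conf above
tmpconf=$(mktemp)
printf 'node0001\nnode0002\nnode0003\nnode0004\n' > "$tmpconf"
print_start_order "$tmpconf"
rm -f "$tmpconf"
```

The ondwa stop command processes the same list, so reordering cluster.conf changes which host hosts the coordinator node.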
Table 3-1. Parameters in the dwainst.conf file

COORDINATOR_SHM
    The shared memory on the coordinator node.
    Guidance: The total of the shared memory on the coordinator node and worker nodes should not exceed the free memory on the computer where the accelerator is installed.
    Tip: The coordinator node does not need as much shared memory as the worker nodes. A value between 5 and 10% of the total memory set aside for the accelerator is a good estimate for this parameter.

CORES_FOR_LOAD_THREADS_PERCENTAGE
    Guidance: If only one coordinator node or one worker node will run on each cluster node, set this value to 100.

CORES_FOR_REORG_THREADS_PERCENTAGE
    Guidance: If only one coordinator node or one worker node will run on each cluster node, set this value to 25.

CORES_FOR_SCAN_THREADS_PERCENTAGE
    Guidance: If only one coordinator node or one worker node will run on each cluster node, set this value to 100.

DRDA_INTERFACE
    The network device name that you will use for the connection from the Informix database server to the accelerator. The default value is lo.
    Guidance: Ask your system administrator and network administrator which network interface to use for the DRDA_INTERFACE value. If the accelerator is installed on a separate computer from the Informix database server, you cannot use the local loopback value.

DWADIR
    The accelerator storage directory.
    Guidance: Create the directory first and then specify the directory in the dwainst.conf file.
    Note: Specify the directory before you run the ondwa setup command.
Table 3-1. Parameters in the dwainst.conf file (continued)

NUM_NODES
    The number of nodes (DWA_CM processes).
    Guidance: The number of accelerator nodes should not overload the computer where the accelerator is installed. The number of worker nodes is the value of the NUM_NODES parameter - 1.

START_PORT
    The starting port number for the coordinator node and the worker nodes.
    Guidance: The accelerator instance assigns the port numbers that immediately follow the starting port number to the coordinator node and the worker nodes. These port numbers should not already be used by other processes. Each accelerator node needs to be configured with four different port numbers. Beginning with the starting port number, the nodes are assigned incremental port numbers. For example, if your instance has five nodes and you specify the START_PORT number as 21020, the accelerator instance will use ports 21020 - 21039 because each node uses four port numbers.

WORKER_SHM
    The shared memory on the worker nodes. This value is the total shared memory, combined, on all the worker nodes. You can specify the value as a percentage, such as .70, or a value in megabytes.
    Guidance: The combined total shared memory on the worker nodes and the coordinator node should not exceed the free memory on the computer where the accelerator is installed.
    Tip: The data marts, with all their data in the compressed format, must fit into the shared memory on the worker nodes. Plan on using approximately two-thirds of the total memory for the accelerator as worker node shared memory.
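The START_PORT arithmetic described above can be sketched as a small helper (the function name is ours, for illustration only): each node occupies four consecutive ports, so an instance uses the range START_PORT through START_PORT + 4 * NUM_NODES - 1.

```shell
# Compute the port range an accelerator instance occupies,
# given START_PORT and NUM_NODES (four ports per node).
port_range() {
    start_port="$1"
    num_nodes="$2"
    last=$((start_port + num_nodes * 4 - 1))
    echo "${start_port}-${last}"
}

# The example from the table: five nodes starting at 21020 occupy 21020-21039.
port_range 21020 5
```

Such a helper is useful when checking that the chosen range does not collide with ports already registered in /etc/services or in use by other processes.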
The following example shows parameters and their values in a dwainst.conf configuration file:
DWADIR=$IWA_INSTALL_DIR/dwa/demo
START_PORT=21020
NUM_NODES=2
WORKER_SHM=500
COORDINATOR_SHM=250
DRDA_INTERFACE="eth0"
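Before running ondwa setup, it can help to verify that the configuration file defines all of the parameters discussed above. This pre-flight check is our own sketch, not part of the ondwa utility:

```shell
# Report any of the expected dwainst.conf parameters that are not set.
# Returns 0 if all are present, 1 otherwise.
check_dwainst() {
    conf="$1"
    missing=0
    for p in DWADIR START_PORT NUM_NODES WORKER_SHM COORDINATOR_SHM DRDA_INTERFACE; do
        if ! grep -q "^${p}=" "$conf"; then
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}

# Demo against a temporary file with the example values from the text
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
DWADIR=/data/dwa
START_PORT=21020
NUM_NODES=2
WORKER_SHM=500
COORDINATOR_SHM=250
DRDA_INTERFACE="eth0"
EOF
check_dwainst "$tmp" && echo "dwainst.conf: OK"
rm -f "$tmp"
```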
Chapter 3. Accelerator configuration
Related concepts: Accelerator directory structure on page 2-1 Memory issues for the coordinator node and the worker nodes on page 6-1 Related tasks: Configuring the accelerator (non-cluster installation) on page 3-1 Configuring the accelerator (cluster installation) on page 3-2 Related reference: ondwa setup command on page 3-10
d. Using the information you gathered in Step 2 on page 3-6, type the name and pairing information for the accelerator. In the Pairing code field, type the PIN.
e. Click OK. A connection between the accelerator and the database server is established, and the sqlhosts file on the Informix database server is updated with the connection information. An example of the sqlhosts file is:
FLINS2    group     -          -      c=1,a=4b3f3f457d5f552b613b4c587551362d2776496f226e714d75217e22614742677b424224
FLINS2_1  dwsoctcp  127.0.0.1  21022  g=FLINS2
7. Use the SET EXPLAIN statement to see the query plan. If the query is accelerated, the Remote SQL Request section appears in the query plan. For example:
QUERY: (ISAO-Executed) (OPTIMIZATION TIMESTAMP: 02-20-2011 01:05:57)
------
select sum(units) from salesfact

Estimated Cost: 242522
Estimated # of Rows Returned: 1
Maximum Threads: 0

1) sk@FLINS2:dwa.aqt0f957100-cca1-406b-93cc-cae2117122ae: Remote SQL Request:
   {QUERY {FROM dwa.aqt0f957100-cca1-406b-93cc-cae2117122ae}
    {SELECT {SUM {SYSCAST COL10 AS BIGINT} } } }
QUERY: (ISAO-FYI) (OPTIMIZATION TIMESTAMP: 02-20-2011 01:05:57)
------
select sum(units) from salesfact

Estimated Cost: 242522
Estimated # of Rows Returned: 1
Maximum Threads: 0

1) informix.salesfact: SEQUENTIAL SCAN

Query statistics:
-----------------
Table map :
----------------------------
Internal name     Table name
----------------------------

type     rows_prod  est_rows  time      est_cost
-------------------------------------------------
remote   1          0         00:00.00  0
determine the fact table in the query. If the fact table in the query is not the fact table in the AQT, the query is sent to Informix for processing.
   Tip: Run the statistics again if the distribution of the data in the database changes.
2. Set the PDQPRIORITY session variable. This variable must be set so that a star-join plan is considered by the optimizer and the right fact table is chosen.
3. Set the use_dwa session variable. Setting this variable enables the Informix query optimizer to consider using the accelerator to process the query when the optimizer generates the query plans. If the variable is set to use_dwa 0 or is not set at all, queries are not accelerated. You can specify that the variable is set automatically or you can set the variable manually:
   v To have this variable set when you connect to the Informix database, add the use_dwa session variable to the sysdbopen() stored procedure on your Informix database server. Specifying the variable in the procedure avoids changing your applications or recompiling.
   v To set the use_dwa variable manually, set the use_dwa session variable from the Informix database client. For example:
SET ENVIRONMENT use_dwa 1;
Turns acceleration ON and sends debugging information to a LOG file. Queries that match one of the accelerated query tables (AQTs) are sent to the accelerator for processing. Debugging information is saved to the Informix message log file.
Related concepts: The query plan (Performance Guide) Statistics held for the table and index (Performance Guide) Related reference: Configure session properties (Administrator's Guide) PDQPRIORITY environment variable (SQL Reference)
Prerequisites
The ondwa utility is a Bash shell script. Because the accelerator is supported only on Linux operating systems, the ability to run Bash shell scripts is built into the operating system.

The following prerequisites must be installed on the machine where you installed the accelerator:
v Telnet client program
v Expect utility
v su command
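A quick way to confirm the prerequisites listed above are on the PATH is a loop over `command -v`. This check is our own sketch (the function name is ours; ondwa performs its own validation):

```shell
# Report which of the given commands are available on this machine.
# Returns 0 if all are found, 1 if any are missing.
check_cmds() {
    status=0
    for cmd in "$@"; do
        if command -v "$cmd" > /dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "MISSING: $cmd"
            status=1
        fi
    done
    return "$status"
}

# Check the three prerequisites named above.
check_cmds telnet expect su || echo "Install the missing tools before running ondwa commands."
```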
Table 3-2. Resources that must be set to unlimited for user informix to run ondwa commands

Resource            Bash shell ulimit command
max locked memory   ulimit -l
max memory size     ulimit -m
virtual memory      ulimit -v
If you installed Informix Warehouse Accelerator on a cluster system with user informix, you can set the following equivalent parameters to unlimited in the /etc/security/limits.conf file on each cluster node:

memlock    max locked-in-memory address space
rss        max resident set size
as         address space limit
For example:
informix soft memlock unlimited
informix hard memlock unlimited
informix soft rss     unlimited
informix hard rss     unlimited
informix soft as      unlimited
informix hard as      unlimited
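The six entries above follow the pattern "<user> <soft|hard> <resource> unlimited" (limits.conf ignores extra whitespace between fields). A small sketch of ours that generates them, handy when provisioning many cluster nodes:

```shell
# Emit the limits.conf entries shown above for the given user:
# one line per (soft|hard) x (memlock|rss|as) combination.
emit_limits() {
    user="$1"
    for res in memlock rss as; do
        for kind in soft hard; do
            echo "$user $kind $res unlimited"
        done
    done
}

emit_limits informix
```

You could append the output to /etc/security/limits.conf on each node, for example with `emit_limits informix >> /etc/security/limits.conf` run as root.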
The ondwa setup command uses a file named dwainst.conf in the $IWA_INSTALL_DIR/dwa/etc directory to configure the accelerator instance. Using the dwainst.conf file, the ondwa setup command creates the following structure in the accelerator storage directory:
v The directory shared between the accelerator nodes (shared)
v The directory containing the accelerator node private directories (local)
v For each accelerator node:
  - The accelerator node private directory: local/node
  - The accelerator node configuration file: node.conf
  - A link to the DWA_CM executable file: DWA_CM_node

The ondwa utility automatically determines whether a node takes the role of the coordinator node or a worker node. The first node is the coordinator node, whereas the remaining nodes are worker nodes. The number of worker nodes is determined by the following calculation: NUM_NODES - 1.
Tip: If you have the accelerator installed on the same symmetric multiprocessing (SMP) system as your Informix database server, edit the dwainst.conf file and change NUM_NODES to 5. This setting generates one coordinator node and four worker nodes on the accelerator.

Each accelerator node must be configured with four different port numbers. These port numbers are listed in the configuration file for the accelerator node. The starting port number is taken from the dwainst.conf file and incremented in turn. For example, if your instance has five accelerator nodes and you specify the START_PORT number as 21020, the accelerator instance will use ports 21020 - 21039 because each accelerator node uses four port numbers.

When an accelerator node is started, the corresponding link to the accelerator binary DWA_CM_node is used. The ondwa setup command creates a symbolic link for each accelerator node to the DWA_CM binary. The link makes it easier to find the processes of your accelerator instance in a ps output because the ps command shows the symbolic links and not the DWA_CM binary itself.

Related tasks:
Configuring the accelerator (non-cluster installation) on page 3-1

Related reference:
ondwa start command
dwainst.conf configuration file on page 3-3
The output of an accelerator node is recorded in the log file for the node, for example: node0.log, node1.log, and so forth. These files are located in the accelerator storage directory.

After the ondwa start command has finished, your accelerator instance is ready to use. After you run ondwa start, run the ondwa status command to check the status of the accelerator.
Related tasks: Uninstalling the accelerator on page 2-5 Configuring the accelerator (non-cluster installation) on page 3-1 Related reference: ondwa setup command on page 3-10 ondwa status command
The HB-Status (heartbeat status) column can have one of the following values:

Initializing
    The accelerator node is initializing and loading data into memory.
Healthy
    The initialization is complete and the accelerator node is ready for queries.
QuiescePend
    The accelerator node is waiting to go into a quiesced state.
Quiesced
    The accelerator node is in a quiesced state.
Resuming
    The accelerator node is resuming after a quiesced or a maintenance state.
Maintenance
    The accelerator node is in maintenance state.
MaintPend
    The accelerator node is waiting to go into a maintenance state.
Missing
    The accelerator node has a missing heartbeat.
Shutdown
    The accelerator node is shutting down.

Related reference:
ondwa start command on page 3-11
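When scripting around ondwa status output, only the Healthy value means a node can serve queries. The following helper is our own sketch (the function name is ours, not part of ondwa) for classifying the HB-Status values listed above:

```shell
# Classify an HB-Status value: 0 = ready for queries,
# 1 = a known not-ready state, 2 = unrecognized value.
node_ready() {
    case "$1" in
        Healthy)
            return 0 ;;
        Initializing|QuiescePend|Quiesced|Resuming|Maintenance|MaintPend|Missing|Shutdown)
            return 1 ;;
        *)
            return 2 ;;
    esac
}

if node_ready Healthy; then
    echo "node is ready for queries"
fi
```

A monitoring script could loop over the status output of each node and alert when any node stays in a not-ready state too long.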
------------------+--------------------+---------+---------+--------+-----------
Task 316822777484 (of type QUERY with name Query execution @ coordinator - 0:00)
------------------+--------------------+---------+---------+--------+-----------
Location          | Status             | Progr.  | Upd. ms | Memory | Monitor
------------------+--------------------+---------+---------+--------+-----------
Primary Node 0    | OPNQRY             | 0       | 10      | 17K    | Fine
 -> Node 1        | Fact               | 0       | 3       | 22M    | Fine
 (Total Memory)   |                    |         |         | 22M    |
------------------+--------------------+---------+---------+--------+-----------
Used Resources    | Mart ID: 2 on node 0
------------------+--------------------+---------+---------+--------+-----------
Task 437785070080 (of type DAEMON with name DRDADaemon - 4:48)
------------------+--------------------+---------+---------+--------+-----------
Location          | Status             | Progr.  | Upd. ms | Memory | Monitor
------------------+--------------------+---------+---------+--------+-----------
Primary Node 0    | Running            | 2       | 30      | 0      | Fine
 (Total Memory)   |                    |         |         | 0      |
------------------+--------------------+---------+---------+--------+-----------
Used Resources    | DRDA device: lo address: 127.0.0.1:21022 on node 0
                  | Unbound on node 0
------------------+--------------------+---------+---------+--------+-----------
- End of Tasklist -
The ondwa stop action ends when all of the DWA_CM_* processes of your accelerator instance have stopped.

Use the -f option to stop the full set of DWA_CM processes. Use this option if the ondwa stop action is not able to shut down one or more of the processes.

Related tasks:
Uninstalling the accelerator on page 2-5
$ ondwa reset
The ondwa reset command also removes all of the entries of your accelerator instance under the /dev/shm directory. This command does not remove the log files for the accelerator node and the files created by the ondwa setup command. After you run the ondwa reset command, you can initialize the accelerator instance by using the ondwa start command.
The ondwa clean command removes the following files and directories:
v All of the files created by the ondwa setup command
v The complete shared and local directory trees in the accelerator storage directory
v The log files for the accelerator nodes

After you run the ondwa clean command, you can run the ondwa setup command to set up the accelerator instance again.
very large number of tables. The data mart definitions are created after you run a series of statements, stored procedures, and functions that analyze your schema and queries.
3. Deploying a data mart on page 4-20
4. Loading data into data marts on page 4-21

Related tasks:
Creating data mart definitions by using the administration interface on page 4-8
Creating data mart definitions by using workload analysis on page 4-10
Or you can activate SQLTRACE by using the SQL Administration API Functions. For example:
EXECUTE FUNCTION task("set sql tracing on","1500","4","high","global");
c. Restart the database server to activate the configuration parameters.
d. Run the query workload.
e. Review the results of the workload by using one of the following methods:
   v Run the onstat -g his command.
   v Use information from the syssqltrace table in the sysmaster database. For example, in dbaccess run this SQL statement:
SELECT sql_runtime, sql_statement FROM syssqltrace WHERE sql_stmtname matches "SELECT" ORDER BY sql_runtime DESC
4. Analyze the queries that you want to accelerate.
5. From the query information, exclude the OLTP queries. Analyze the remaining queries to determine which columns in the dimension tables are being used by the queries.
6. If you have large dimension tables, identify specific columns in the tables to load into the data mart instead of loading all of the columns from each table.
Related reference: Enable SQL tracing (Administrator's Guide) set sql tracing argument: Set global SQL tracing (SQL administration API) (Administrator's Reference) EXPLAIN_STAT Configuration Parameter (Administrator's Reference) SQLTRACE Configuration Parameter (Administrator's Reference)
Data marts
For efficient query processing, the Informix Warehouse Accelerator must have its own copy of the data. The data is stored in logical collections of related data, or data marts. A data mart specifies the collection of tables that are loaded into an accelerator and the relationships, or references, between these tables.

Typically, the data marts contain a subset of the tables in your database. The data marts can also contain a subset of the columns within a table. To improve query processing, limit the number of dimension tables, and columns within the dimension tables, in the data mart by identifying only those columns that are necessary to respond to your queries. However, the data marts do not need to duplicate the design of your warehouse fact and dimension tables. For example, you can designate a dimension table in your warehouse schema as a fact table in a data mart.

When you create a data mart, you use the accelerator administrator interface to specify the fact table, the dimension tables, and the references between the tables. A newly created data mart has all of the necessary structures defined but is empty and must be filled with a snapshot of the data from the Informix database server. When the data from the database server is loaded into the data mart on the accelerator, the data is compressed. After the data is loaded in the data mart, the data mart becomes operational.
[Figure 4-1 (partial extraction): warehouse schema including the DAILY_FORECAST fact table (PERKEY, STOREKEY, PRODKEY, QUANTITY_FORECAST, EXTENDED_PRICE_FORECAST, EXTENDED_COST_FORECAST) and the PROMOTION dimension table (PROMOKEY, PROMOTYPE, PROMODESC, PROMOVALUE, PROMOVALUE2, PROMO_COST)]
Using the schema in Figure 4-1, you can create two data marts. The first data mart is based on the DAILY_SALES fact table and the dimension tables that it links to, as shown in Figure 4-2 on page 4-5. A second data mart is based on the DAILY_FORECAST fact table and the dimension tables that it links to, as shown in Figure 4-3 on page 4-6.
[Figure 4-3: data mart based on the DAILY_FORECAST fact table (PERKEY, STOREKEY, PRODKEY, QUANTITY_FORECAST, EXTENDED_PRICE_FORECAST, EXTENDED_COST_FORECAST) joined to the PRODUCT dimension table (PRODKEY, BRANDKEY, PRODLINEKEY, UPC_NUMBER, P_PRICE, P_COST, ITEM_DESC, PACKAGE_TYPE, CATEGORY, SUB_CATEGORY, PACKAGE_SIZE)]
The information stored in the catalog tables of the Informix database server is the same information that is stored in the catalog tables for other types of views. The information in the AQTs is used by the database server to determine which queries can be processed by the accelerator. The database server attempts to match a query with an AQT. If a query matches an AQT, the query is sent to the accelerator. There can be many AQTs. If the first AQT is not able to satisfy a query, the search for a match continues until the query has been checked against all of the AQTs. If a match is found, the query is sent to the accelerator for processing. If no match is found, the query is processed by Informix.

To be sent to the accelerator, the query must meet the following criteria:
v The query must refer to a subset of the tables in the AQT.
v The table references, or joins, specified in the query must be the same as the references in the data mart definition.
v The query must include only one fact table.
v The query must have an INNER JOIN or LEFT JOIN with the fact table on the left dominant side.
v The scalar and aggregate functions in the query must be supported by the accelerator.

A data mart can be in different states such as: LOAD PENDING, ENABLED, DISABLED. The associated AQTs reflect the basic states to facilitate correct query matching and have only two states: active and inactive. When the administrator drops a data mart from the accelerator, the associated AQTs are removed automatically from the database server.

Related concepts:
Analyze queries for acceleration
Consideration: Does the query reference only the tables and columns that are included in the data mart definition?
Description: Only queries that include the supported data types are loaded into the accelerator.

Consideration: Does the query reference a table that is marked as a fact table in the data mart definition?
Consideration: Is the query a long-running analytical query, rather than a transactional query that should not be processed by the accelerator? For example, a transactional query might return only a few rows by using a selective condition on an indexed column.

Consideration: Does the query use only supported join types? Do the join sequence and the join predicates of the query match the definition of the accelerated query table (AQT)?
Description: Equality join predicates, INNER joins, and LEFT OUTER joins are the supported join types. The fact table referenced in the query must be on the left side of the LEFT OUTER join. You need to know which joins are supported and unsupported.

Consideration: Does the query use only supported aggregate and scalar functions?
Description: Only queries that include the supported functions are sent to the accelerator for processing.
Related concepts: Accelerated query tables on page 4-6 Queries that benefit from acceleration on page 1-6
v If you add a table to your data mart definition that includes a column of an unsupported data type, you must remove that column from the data mart definition.
5. Create references, or joins, between tables in the data mart definition (if necessary).
6. Designate the fact table for the data mart definition.
7. Specify the table columns to load into the accelerator. Select the table in the Canvas and look at the Properties view. Click the Columns page to view a list of the columns in that table. By default, all of the columns in the table are included in the data mart. Clear the columns that you do not want included.
8. Check the size of a data mart definition before it is deployed. You can compare a size estimate of the data mart definition with the memory that is available on your accelerator. If necessary, you can change the join type of table references or omit specific columns that are rarely or never needed to reduce the required memory.
9. Validate the integrity of the data mart definition. Ensure that the syntax and structure of a data mart definition are correct and that the data mart can be safely deployed to the accelerator.

After you create the data mart definition, you need to deploy and load data into the data mart.

Related tasks:
Creating data marts on page 4-1
Deploying a data mart on page 4-20
One-to-many joins
A one-to-many join connects the columns of a primary key, unique constraint, or unique nonnullable index of the parent table with columns of the child table. Any row or tuple in the child table is related to a maximum of one row or tuple in the parent table.

If the table has a primary key, the corresponding key columns are selected automatically. You can override this automatic selection by selecting another unique constraint or unique index. If the parent table does not have a primary key, select one of the unique keys. At least one unique constraint or unique nonnullable index is required; otherwise the one-to-many reference cannot be created.

Important: One-to-many joins lead to better query performance than many-to-many joins. If one of the tables that you want to use has a unique constraint, unique index, or primary key on the join columns, use a one-to-many join.
Many-to-many joins
In a many-to-many join, one or more columns of the parent table are joined with an equal number of columns in the child table. The values of these columns do not have to be unique and you do not have to enforce uniqueness through the selection of a constraint. This means that any row or tuple in the child table can
Chapter 4. Accelerator data marts and AQTs
relate to multiple tuples in the parent table.

Join tables at run time: When you specify that the tables are joined at run time, the tables are joined in the system memory of the accelerator when the query is run. Runtime joins require less system memory to hold the data mart.
You can run this command through an application or with the sysdbopen() procedure.
4. Optional. You can enable SQL tracing. With SQL tracing on, you can identify the probing data that resulted from a specific SQL statement because each query statement is assigned an individual number, a statement ID.
With SQL tracing on, you can identify specific statements, for example statements that took a certain length of time to process, or statements that accessed specific tables. With that information, you can include the probing data that resulted from only these statements in the data mart definition. Without SQL tracing, the probing data is collected into one single set of data, and when you create the data mart definition, the entire set of probing data must be used.
To turn SQL tracing on, use the following steps: a. On the computer where the Informix database server is installed, log on as user informix. b. Enable the SQLTRACE configuration parameter in the onconfig file in the $INFORMIXDIR/etc directory. For example:
SQLTRACE level=low,ntraces=1000,size=4,mode=global
Or you can activate SQLTRACE by using the SQL Administration API Functions. For example:
EXECUTE FUNCTION task("set sql tracing on","1000","4","low","global");
c. If you changed the onconfig file, restart the database server to activate the configuration parameter.
5. Optional. To run the query probing more quickly, issue the SET EXPLAIN ON AVOID_EXECUTE statement before you run your query workload. When you issue this statement, the queries are optimized and the probing data is collected, but a result set is not determined or returned.
   Important: If you want to process the probing data based on the runtime of the queries, turn on SQL tracing and do not use the AVOID_EXECUTE option of the SET EXPLAIN statement. If you avoid running the queries, you will not know how long the queries really take to run.
6. Run the query workload. The probing data is stored in memory.
7. If you want to view the SQL trace information about the workload, use one of the following methods:
   v Run the onstat -g his command.
   v Use information from the syssqltrace table in the sysmaster database. For example, in dbaccess run this SQL statement:
SELECT sql_runtime, sql_statement FROM syssqltrace WHERE sql_stmtname matches "SELECT" ORDER BY sql_runtime DESC
8. If you want to view the probing data, use one of the following methods:
   v Run the onstat -g probe command.
   v Query the system monitoring interface (SMI) pseudo tables that contain probing data.
9. Create a separate logging database. Even though your warehouse database might be configured for logging, you should create a separate database. This separate database is used to store the data mart definition.
10. Convert the probing data into a data mart definition by using the probe2mart() stored procedure. You can create a data mart definition from all of the probing data or from the data from specific queries (if SQL tracing is ON).
11. Use the genmartdef() function to generate the data mart definition. This function returns a CLOB that contains the data mart definition in XML format. Store the data mart definition in a file.
12. Import the file into the administration interface.
    a. On the computer where the IBM Smart Analytics Optimizer Studio is installed, start the administration interface:
       v On UNIX, open the $IWA_INSTALL_DIR/dwa_gui directory and run the ./datastudio command.
       v On Windows, select Start > Programs > IBM Smart Analytics Optimizer Studio 1.1.
    b. Create a new accelerator project. Right-click the Accelerators folder and select New Accelerator, or choose File > New > Accelerator project.
    c. Right-click the new accelerator project in the Project Explorer window. Select Import. The value Data Mart Import is selected by default. Click Next.
    d. Locate the generated file. Verify that Import into existing project and the name of the current project are selected. Click Finish.

After you create the data mart definition, you need to deploy and load data into the data mart.

Related tasks:
Creating data marts on page 4-1
Deploying a data mart on page 4-20

Related reference:
UPDATE STATISTICS statement (SQL Syntax)
Star-Join Directives (SQL Syntax)
Contents of query probing data on page 4-18
The probe2mart stored procedure on page 4-18
Appendix B, Sysmaster interface (SMI) pseudo tables for query probing data, on page B-1
The query selects the names of the top five customers from the state CA and the total ship weight of their already shipped orders. The query is an inner join. The orders table is the fact table. The customer table is the dimension table. Since the
orders table has fewer rows than the customer table, the {+ FACT(orders)} optimizer hint is required. Otherwise, the customer table would be considered the fact table.

The following commands correspond to steps in the task Creating data mart definitions by using workload analysis on page 4-10. The SQL statements used in this example were executed by using dbaccess and are prompted by ">". Commands executed from the shell are prompted by "$".
(sum)
The reason no rows are returned in this example is that the SET EXPLAIN ON AVOID_EXECUTE statement has been used.
Statement # 2:  @ 0x4dd51058

 Database: 0x100153
 Statement text:
  select {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
  from customer c,orders o where c.customer_num=o.customer_num
  and state='CA' and ship_date is not null group by 1,2 order by 3 desc

 Statement information:
  Sess_id: 51    User_id: 29574    Stmt Type: SELECT
  Finish Time: 10:39:09    Run Time: 0.0000    TX Stamp: 33f4e    PDQ: 0

 Statement Statistics:
  Page Read: 0              Buffer Read: 0          Read % Cache: 0.00
  Buffer IDX Read: 0        Page Write: 0           Buffer Write: 0
  Write % Cache: 0.00       Lock Requests: 0        Lock Waits: 0
  LK Wait Time (S): 0.0000  Log Space: 0.000 B      Num Sorts: 0
  Disk Sorts: 0             Memory Sorts: 0         Avg Time (S): 0.0000
  Max Time (S): 0.0000      Avg IO Wait: 0.000000   I/O Wait Time (S): 0.000000
  Avg Rows Per Sec: 678122.8357                     SQL Memory: 25304
  SQL Error: 0              ISAM Error: 0           Isolation Level: NL
> SELECT sql_id, sql_runtime, sql_statement FROM sysmaster:syssqltrace
  WHERE sql_stmtname='SELECT' ORDER BY sql_runtime DESC;

sql_id         2
sql_runtime    1.47450285e-06
sql_statement  select {+ FACT(orders)} first 5 fname,lname,sum(ship_weight)
               from customer c,orders o where c.customer_num=o.customer_num
               and state='CA' and ship_date is not null group by 1,2 order by 3 desc
Output description:
v Statement 2 accesses two tables: the customer table with tabid 100 and the orders table with tabid 102.
v In the customer table, columns 1, 2, 3, and 8 are accessed.
v In the orders table, columns 3, 7, and 8 are accessed.
v The orders table is the fact table.
v Column 3 in the orders table is joined with column 1 in the customer table.
v The join is an inner join.
v The customer table has a unique index on column 1.
You can verify that the probing data was gathered by connecting to the sysmaster database and querying the SMI tables:
> SELECT * FROM sysprobetables;

dbname  stores_demo
sql_id  2
tabid   100
fact    n

dbname  stores_demo
sql_id  2
tabid   102
fact    y

2 row(s) retrieved.

> SELECT * FROM sysprobecolumns;

dbname  stores_demo
sql_id  2
tabid   100
colno   1

dbname  stores_demo
sql_id  2
tabid   100
colno   2

dbname  stores_demo
sql_id  2
tabid   100
colno   3

dbname  stores_demo
sql_id  2
tabid   100
colno   8

dbname  stores_demo
sql_id  2
tabid   102
colno   3

dbname  stores_demo
sql_id  2
tabid   102
colno   7

dbname  stores_demo
sql_id  2
tabid   102
colno   8

7 row(s) retrieved.

> SELECT * FROM sysprobejds;

dbname  stores_demo
sql_id  2
jd      1
ctabid  102
ptabid  100
type    i
uniq    y

1 row(s) retrieved.

> SELECT * FROM sysprobejps;

dbname  stores_demo
sql_id  2
jd      1
jp      1
ccolno  3
pcolno  1

1 row(s) retrieved.
Step 11: Run the genmartdef() function and store the result in a file
Connect to the logging database stores_mart, where the probing data that was generated by the probe2mart stored procedure is stored:
$ dbaccess stores_mart

> execute function lotofile(genmartdef('orders_customer_mart'),
  'orders_customer_mart.xml!','client');

(expression)  orders_customer_mart.xml

1 row(s) retrieved.
Step 12: Import the file into a project in the administration interface
The detailed instructions for importing the file into the administration interface are already documented in the task. Alternatively, you can use the Java classes that are included with the accelerator to deploy and load the data mart. To deploy the data mart:
$ java createMart <accelerator name> orders_customer_mart.xml
See the dwa_java_reference.txt file in the dwa/example/cli/ directory for more information about these sample Java classes.
A join descriptor consists of:
v All of the join predicates on the same pair of tables. For example:
table_1.col_a = table_2.col_x and table_1.col_b = table_2.col_y and table_1.col_c = table_2.col_z
v The type of the join: an inner join or a left outer join.
v Information about any unique indexes on the dimension table of the join. The dimension table is the right table in the case of a left outer join. The unique index must be on the complete set of columns contained in this join descriptor.

Related tasks:
Creating data mart definitions by using workload analysis on page 4-10
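As an illustration (the query and all names in it are hypothetical, reusing the predicate list above), all three predicates in the following WHERE clause join the same pair of tables, so query probing records them as a single join descriptor that contains three join predicates:

```sql
-- Hypothetical query: probing records one join descriptor for the pair
-- (table_1, table_2) with three join predicates, and the join type is
-- an inner join.
SELECT COUNT(*)
FROM   table_1, table_2
WHERE  table_1.col_a = table_2.col_x
  AND  table_1.col_b = table_2.col_y
  AND  table_1.col_c = table_2.col_z;
```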
Syntax
The syntax of the stored procedure is:
probe2mart('database', 'mart_name', sqlid);
database
    The name of the database that contains the data warehouse. This is the warehouse database on which the workload queries are run. See Usage.

mart_name
    The name that you want to use for the data mart definition. The name is also the name of the data mart that is created later, based on the data mart definition. If the name that you specify is an existing data mart definition, the probing data is merged into the existing data mart definition. If the data mart definition that you specify does not exist, it is created.

sqlid
    Optional. The ID of the query SQL statement, which identifies the probing data from that query. If the sqlid is not provided, all of the probing data from the specified database is added to the data mart definition.
Usage
The probe2mart stored procedure should be run from a different database than the warehouse database. This separate database must be a logging database. You can use a test database, if the test database is already a logging database, or you can create a different database that keeps these tables separate from your other tables.

Note: Create a separate logging database to use with the probe2mart stored procedure. Using a separate database makes it much easier to revert from Informix 11.70 to an earlier version of Informix.

When the stored procedure is run, the probing data is processed and stored in the logging database in a set of permanent tables. The tables keep the data mart definition in a relational format. The tables are automatically created when probing data is processed into a data mart definition for the first time.

The probe2mart stored procedure creates a data mart definition by converting the probing data and inserting rows into the following tables.
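For example, a separate logging database can be created and used like this (the database name probe_log is illustrative, not from this guide):

```sql
-- Create a dedicated logging database for the probe2mart output tables,
-- then run the procedure from inside it. Names are examples only.
CREATE DATABASE probe_log WITH LOG;

DATABASE probe_log;

EXECUTE PROCEDURE probe2mart('sales', 'salesmart');
```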
Table 4-1. Tables created by the probe2mart stored procedure

Table                    Description
'informix'.iwa_marts     Names of the data mart definitions
'informix'.iwa_tables    All of the tables used in any data mart definition
'informix'.iwa_columns   All columns used in any data mart definition
'informix'.iwa_mtabs     Tables for a specific data mart definition
'informix'.iwa_mcols     Columns for a specific data mart definition
'informix'.iwa_mrefs     References (join descriptors) of a specific data mart definition
'informix'.iwa_mrefcols  Reference columns (join predicates) of a specific data mart definition
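Because the definition is kept in ordinary tables, you can inspect it with SQL. A hedged sketch follows: the martname column appears in the examples in this guide, but the column layouts of the other iwa_* tables are not shown here, so SELECT * is used:

```sql
-- List the data mart definitions stored in the logging database,
-- then dump the tables recorded for them (column names assumed unknown).
SELECT martname FROM 'informix'.iwa_marts;

SELECT * FROM 'informix'.iwa_mtabs;
```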
Examples
To convert, or merge, all of the probing data into a data mart definition, use this form of the syntax:
EXECUTE PROCEDURE probe2mart('database', 'mart_name');
For example, to generate a data mart definition named salesmart from all of the probing data that is available for the database sales, use this statement:
EXECUTE PROCEDURE probe2mart('sales', 'salesmart');
You can also merge the probing data from a specific query into a data mart definition. To do so, look up the SQL ID of the query, which is captured by SQL tracing. SQL tracing must be turned on to identify the data from specific queries; queries are identified by a statement ID.
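SQL tracing is enabled through the admin API in the sysadmin database. A sketch, with illustrative argument values (number of traces, trace buffer size, level, and mode):

```sql
-- Run as an administrative user connected to the sysadmin database.
-- The argument values are examples, not recommendations.
EXECUTE FUNCTION task('set sql tracing on', 1000, 1, 'low', 'global');
```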
For example, to merge the probing data from SQL statement 8372 into the data mart definition salesmart, run this command:
EXECUTE PROCEDURE probe2mart('sales', 'salesmart', 8372);
To create a data mart definition from queries that run longer than 10 seconds, use this SQL statement:
SELECT probe2mart('sales', 'salesmart', sql_id)
FROM sysmaster:syssqltrace
WHERE sql_runtime > 10;
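You can filter on other syssqltrace columns in the same way. For example, assuming the table exposes the traced statement text in a sql_statement column, probing data can be merged only for queries that touch a particular table (an illustrative sketch):

```sql
-- Merge probing data only for traced queries that mention daily_sales.
SELECT probe2mart('sales', 'salesmart', sql_id)
FROM   sysmaster:syssqltrace
WHERE  sql_statement MATCHES '*daily_sales*';
```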
Related tasks: Creating data mart definitions by using workload analysis on page 4-10 Related reference: Chapter 5, Reversion requirements for an Informix warehouse edition server and Informix Warehouse Accelerator, on page 5-1
Syntax
The syntax of the function is:
genmartdef('mart_name');
You can either issue the genmartdef() function by itself, or incorporate it as a parameter within the LOTOFILE() function. Using the genmartdef() function with the LOTOFILE() function places the CLOB into an operating system file.
Examples
Use the genmartdef() function as a parameter within the LOTOFILE() function:
EXECUTE FUNCTION LOTOFILE(genmartdef('salesmart'), 'salesmart.xml!', 'client');
The following example generates the data mart definition for salesmart. The resulting CLOB is used as a parameter within the LOTOFILE() function. The LOTOFILE() function stores the resulting CLOB in an operating system file named salesmart.xml on the client computer.
SELECT lotofile(genmartdef('salesmart'), 'salesmart.xml!', 'client')
FROM iwa_marts
WHERE martname = 'salesmart';
1. On the computer where IBM Smart Analytics Optimizer Studio is installed, start the administration interface:
   v On UNIX, open the $IWA_INSTALL_DIR/dwa_gui directory and run the ./datastudio command.
   v On Windows, select Start > Programs > IBM Smart Analytics Optimizer Studio 1.1.
2. Deploy the data mart from either the Data Source Explorer or from the Properties view of the data mart.

Related tasks:
Loading data into data marts
Creating data mart definitions by using the administration interface on page 4-8
Creating data mart definitions by using workload analysis on page 4-10
Option  Description
TABLE   Each table is locked for the time it takes to gather the load data from
        that table. The loaded data is consistent within each table, but not
        necessarily across different tables.
MART    All of the tables in the data mart are locked for the duration of the
        load. The loaded data is consistent across all of the tables. However,
        all other user sessions are blocked from changing the data in the
        tables that are involved in the load.
Related concepts: Updating the data in a data mart Related tasks: Creating data marts on page 4-1
3. Validate the data marts.
4. Deploy the data marts.
5. Load the data into the data marts again.

After you refresh the data marts, rerun the queries to confirm that no processing errors are returned.
To check whether the probing data has been removed, run the onstat -g probe command.

Important: The probing data is stored in memory. The data is automatically removed when the database server is shut down.
Monitoring AQTs
You can use the onstat -g aqt command to view information about the data marts and the associated accelerated query tables (AQTs). Related reference: onstat -g aqt command: Print data mart and accelerated query table information (Administrator's Reference)
Chapter 5. Reversion requirements for an Informix warehouse edition server and Informix Warehouse Accelerator
If you need to revert from the Informix 11.70 instance to an earlier version, there are some reversion tasks you need to perform.
Reversion of the database server that contains data marts that were created with Informix Warehouse Accelerator
Before you use the onmode -b command to revert the database server, you must drop all of the data marts that are associated with that database server. You can drop the data marts by using the administration interface, IBM Smart Analytics Optimizer Studio, or by using the command-line interface. If you do not drop the data marts and the reversion succeeds, the system databases are rebuilt and the AQTs disappear. As a result, the data marts that are associated with the reverted databases are orphaned and consume space and memory.
Reversion of the database server that performed workload analysis to create data mart definitions
If you created the data mart definitions by using workload analysis, the tables that are created by the probe2mart stored procedure must be dropped before you start the reversion:
v If you created a separate logging database to use with the probe2mart stored procedure, all of the data mart definition information is in tables in that separate database. You can simply drop that database without affecting your other databases.
v If you used your warehouse database as the logging database with the probe2mart stored procedure, you must manually drop each of the permanent tables that were created by the probe2mart stored procedure.

If you do not drop the tables before you start the reversion, the reversion fails.

Related reference:
The probe2mart stored procedure on page 4-18
Chapter 6. Troubleshooting
Missing sbspace
If you do not have a default sbspace created in the Informix database server and you attempt to install the accelerator, an error is returned. You must create and configure a default sbspace in the Informix database server, set the name of the default sbspace in the SBSPACENAME configuration parameter, and restart the Informix database server.

Related tasks:
Preparing the Informix database server on page 2-2
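For illustration, after the sbspace exists (for example, one created with the onspaces utility), the onconfig entry might look like this; the sbspace name sbsp1 is a placeholder, not a value from this guide:

```
# onconfig fragment (illustrative): the default sbspace name must match
# an sbspace that already exists; restart the server after the change.
SBSPACENAME sbsp1
```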
Memory issues for the coordinator node and the worker nodes
If you do not have sufficient memory assigned to the coordinator node and to the worker nodes, you might receive errors when you load data from the database server or when you run queries.

The more worker nodes that you designate, the faster the data is loaded from the database server. However, the more worker nodes you designate, the more memory you need, because each worker node stores a copy of the data in the dimension tables. You specify the memory in the dwainst.conf configuration file.

Related reference:
dwainst.conf configuration file on page 3-3
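For orientation, a hypothetical dwainst.conf fragment is shown below. The parameter names appear elsewhere in this guide; the values are placeholders only, not sizing recommendations:

```
# Illustrative dwainst.conf fragment -- all values are placeholders.
DWADIR=/opt/IBM/dwa/demo        # storage directory for the accelerator
NUM_NODES=3                     # number of cluster nodes
START_PORT=21020                # first port in the range the nodes use
CLUSTER_INTERFACE=eth0          # network interface for cluster traffic
COORDINATOR_SHM=4096            # coordinator shared memory; see the
                                # dwainst.conf parameter reference for units
```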
SQL statements
The following SQL statements create the tables, indexes, and key constraints for the sample warehouse schema.
CREATE TABLE DAILY_FORECAST (
    PERKEY                  INTEGER NOT NULL ,
    STOREKEY                INTEGER NOT NULL ,
    PRODKEY                 INTEGER NOT NULL ,
    QUANTITY_FORECAST       INTEGER ,
    EXTENDED_PRICE_FORECAST DECIMAL(16,2) ,
    EXTENDED_COST_FORECAST  DECIMAL(16,2) );

CREATE TABLE DAILY_SALES (
    PERKEY                 INTEGER NOT NULL ,
    STOREKEY               INTEGER NOT NULL ,
    CUSTKEY                INTEGER NOT NULL ,
    PRODKEY                INTEGER NOT NULL ,
    PROMOKEY               INTEGER NOT NULL ,
    QUANTITY_SOLD          INTEGER ,
    EXTENDED_PRICE         DECIMAL(16,2) ,
    EXTENDED_COST          DECIMAL(16,2) ,
    SHELF_LOCATION         INTEGER ,
    SHELF_NUMBER           INTEGER ,
    START_SHELF_DATE       INTEGER ,
    SHELF_HEIGHT           INTEGER ,
    SHELF_WIDTH            INTEGER ,
    SHELF_DEPTH            INTEGER ,
    SHELF_COST             DECIMAL(16,2) ,
    SHELF_COST_PCT_OF_SALE DECIMAL(16,2) ,
    BIN_NUMBER             INTEGER ,
    PRODUCT_PER_BIN        INTEGER ,
    START_BIN_DATE         INTEGER ,
    BIN_HEIGHT             INTEGER ,
    BIN_WIDTH              INTEGER ,
    BIN_DEPTH              INTEGER ,
    BIN_COST               DECIMAL(16,2) ,
    BIN_COST_PCT_OF_SALE   DECIMAL(16,2) ,
    TRANS_NUMBER           INTEGER ,
    HANDLING_CHARGE        INTEGER ,
    UPC                    INTEGER ,
    SHIPPING               INTEGER ,
    TAX                    INTEGER ,
    PERCENT_DISCOUNT       INTEGER ,
    TOTAL_DISPLAY_COST     DECIMAL(16,2) ,
    TOTAL_DISCOUNT         DECIMAL(16,2) ) ;

CREATE TABLE CUSTOMER (
    CUSTKEY        INTEGER NOT NULL ,
    NAME           CHAR(30) ,
    ADDRESS        CHAR(40) ,
    C_CITY         CHAR(20) ,
    C_STATE        CHAR(5) ,
    ZIP            CHAR(5) ,
    PHONE          CHAR(10) ,
    AGE_LEVEL      SMALLINT ,
    AGE_LEVEL_DESC CHAR(20) ,
    INCOME_LEVEL   SMALLINT ,
    INCOME_LEVEL_DESC CHAR(20) ,
    MARITAL_STATUS    CHAR(1) ,
    GENDER            CHAR(1) ,
    DISCOUNT          DECIMAL(16,2) ) ;

ALTER TABLE CUSTOMER ADD CONSTRAINT PRIMARY KEY ( CUSTKEY );

CREATE TABLE PERIOD (
    PERKEY           INTEGER NOT NULL ,
    CALENDAR_DATE    DATE ,
    DAY_OF_WEEK      SMALLINT ,
    WEEK             SMALLINT ,
    PERIOD           SMALLINT ,
    YEAR             SMALLINT ,
    HOLIDAY_FLAG     CHAR(1) ,
    WEEK_ENDING_DATE DATE ,
    MONTH            CHAR(3) ) ;

ALTER TABLE PERIOD ADD CONSTRAINT PRIMARY KEY ( PERKEY );

CREATE UNIQUE INDEX PERX1 ON PERIOD ( CALENDAR_DATE ASC, PERKEY ASC );
CREATE TABLE PRODUCT (
    PRODKEY           INTEGER NOT NULL ,
    UPC_NUMBER        CHAR(11) NOT NULL ,
    PACKAGE_TYPE      CHAR(20) ,
    FLAVOR            CHAR(20) ,
    FORM              CHAR(20) ,
    CATEGORY          INTEGER ,
    SUB_CATEGORY      INTEGER ,
    CASE_PACK         INTEGER ,
    PACKAGE_SIZE      CHAR(6) ,
    ITEM_DESC         CHAR(30) ,
    P_PRICE           DECIMAL(16,2) ,
    CATEGORY_DESC     CHAR(30) ,
    P_COST            DECIMAL(16,2) ,
    SUB_CATEGORY_DESC CHAR(70) ) ;

ALTER TABLE PRODUCT ADD CONSTRAINT PRIMARY KEY ( PRODKEY );

CREATE UNIQUE INDEX PRODX2 ON PRODUCT ( CATEGORY ASC, PRODKEY ASC );
CREATE UNIQUE INDEX PRODX3 ON PRODUCT ( CATEGORY_DESC ASC, PRODKEY ASC );
CREATE TABLE PROMOTION (
    PROMOKEY    INTEGER NOT NULL ,
    PROMOTYPE   INTEGER ,
    PROMODESC   CHAR(30) ,
    PROMOVALUE  DECIMAL(16,2) ,
    PROMOVALUE2 DECIMAL(16,2) ,
    PROMO_COST  DECIMAL(16,2) ) ;

ALTER TABLE PROMOTION ADD CONSTRAINT PRIMARY KEY ( PROMOKEY );

CREATE UNIQUE INDEX PROMOX1 ON PROMOTION ( PROMODESC ASC, PROMOKEY ASC );
CREATE TABLE STORE (
    STOREKEY     INTEGER NOT NULL ,
    STORE_NUMBER CHAR(2) ,
    CITY         CHAR(20) ,
    STATE        CHAR(5) ,
    DISTRICT     CHAR(14) ,
    REGION       CHAR(10) ) ;

ALTER TABLE STORE ADD CONSTRAINT PRIMARY KEY ( STOREKEY );

CREATE INDEX DFX1 ON DAILY_FORECAST ( PERKEY   ASC );
CREATE INDEX DFX2 ON DAILY_FORECAST ( STOREKEY ASC );
CREATE INDEX DFX3 ON DAILY_FORECAST ( PRODKEY  ASC );
CREATE INDEX DSX1 ON DAILY_SALES    ( PERKEY   ASC );
CREATE INDEX DSX2 ON DAILY_SALES    ( STOREKEY ASC );
CREATE INDEX DSX3 ON DAILY_SALES    ( CUSTKEY  ASC );
CREATE INDEX DSX4 ON DAILY_SALES    ( PRODKEY  ASC );
CREATE INDEX DSX5 ON DAILY_SALES    ( PROMOKEY ASC );

ALTER TABLE daily_sales ADD CONSTRAINT FOREIGN KEY (perkey)
    REFERENCES period(perkey);
ALTER TABLE daily_sales ADD CONSTRAINT FOREIGN KEY (prodkey)
    REFERENCES product(prodkey);
ALTER TABLE daily_sales ADD CONSTRAINT FOREIGN KEY (storekey)
    REFERENCES store(storekey);
ALTER TABLE daily_sales ADD CONSTRAINT FOREIGN KEY (custkey)
    REFERENCES customer(custkey);
ALTER TABLE daily_sales ADD CONSTRAINT FOREIGN KEY (promokey)
    REFERENCES promotion(promokey);

ALTER TABLE daily_forecast ADD CONSTRAINT FOREIGN KEY (perkey)
    REFERENCES period(perkey);
ALTER TABLE daily_forecast ADD CONSTRAINT FOREIGN KEY (prodkey)
    REFERENCES product(prodkey);
ALTER TABLE daily_forecast ADD CONSTRAINT FOREIGN KEY (storekey)
    REFERENCES store(storekey);

update statistics medium;
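As an illustration of how this schema is used (the query below is a sketch, not part of the sample schema), a typical star-join query aggregates the DAILY_SALES fact table over the STORE and PERIOD dimension tables:

```sql
-- Illustrative star-join over the sample schema:
-- revenue by region and month.
SELECT s.region, p.month, SUM(d.extended_price) AS revenue
FROM   daily_sales d, store s, period p
WHERE  d.storekey = s.storekey
  AND  d.perkey   = p.perkey
GROUP BY s.region, p.month
ORDER BY s.region, p.month;
```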
Appendix A. Sample warehouse schema
Appendix B. Sysmaster interface (SMI) pseudo tables for query probing data
The SMI tables provide a way to access probing data in a relational form, which is most convenient for further processing. The sysmaster database provides the following pseudo tables for accessing probing data:
v For tables: sysprobetables
v For columns: sysprobecolumns
v For join descriptors: sysprobejds
v For join predicates: sysprobejps
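For example, the join descriptors and their join predicates can be combined in one query. This is a sketch based only on the columns shown in the examples in this guide:

```sql
-- One row per join predicate, with its descriptor's child/parent table
-- IDs, join type, and unique-index flag.
SELECT d.ctabid, d.ptabid, d.type, d.uniq, p.ccolno, p.pcolno
FROM   sysprobejds d, sysprobejps p
WHERE  d.dbname = p.dbname
  AND  d.sql_id = p.sql_id
  AND  d.jd     = p.jd;
```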
Related tasks: Creating data mart definitions by using workload analysis on page 4-10
Locales utf-8
The following table lists the supported locales:
Table C-4. Information for locales utf-8

Locale utf-8 (ar_ae - hu_hu):
ar_ae.utf8, ar_bh.utf8, ar_kw.utf8, ar_om.utf8, ar_qa.utf8, ar_sa.utf8,
cs_cz.utf8, da_dk.utf8, de_at.utf8, de_ch.utf8, de_de.utf8, en_au.utf8,
en_gb.utf8, en_us.utf8, es_es.utf8, fi_fi.utf8, fr_be.utf8, fr_ca.utf8,
fr_ch.utf8, fr_fr.utf8, hr_hr.utf8, hu_hu.utf8

Locale utf-8 (ja_jp - zh_tw):
ja_jp.utf8, ko_kr.utf8, nl_be.utf8, nl_nl.utf8, no_no.utf8, pl_pl.utf8,
pt_br.utf8, pt_pt.utf8, ro_ro.utf8, ru_ru.utf8,
sh_hr.utf8 (needs mapping to hr_hr.utf8 for Java/JDBC),
sk_sk.utf8, sv_se.utf8, th_th.utf8, tr_tr.utf8, tr_tr.utf8@ifix,
zh_cn.utf8, zh_hk.utf8 (needs mapping to zh_tw.utf8 for Java/JDBC),
zh_tw.utf8
C-3
Table C-5. Information for locales PC-Latin-1, PC-Latin-2, and 858

Locale PC-Latin-1:
da_dk.PC-Latin-1, de_at.PC-Latin-1, de_at.PC-Latin-1@bund,
de_at.PC-Latin-1@dude, de_ch.PC-Latin-1, de_ch.PC-Latin-1@bund,
de_ch.PC-Latin-1@dude, de_de.PC-Latin-1, de_de.PC-Latin-1@bun1,
de_de.PC-Latin-1@bund, de_de.PC-Latin-1@dud1, de_de.PC-Latin-1@dude,
en_au.PC-Latin-1, en_gb.PC-Latin-1, en_us.PC-Latin-1, en_us.PC-Latin-1@dict,
es_es.PC-Latin-1, es_es.PC-Latin-1@rae, fi_fi.PC-Latin-1, fr_be.PC-Latin-1,
fr_ca.PC-Latin-1, fr_ch.PC-Latin-1, fr_fr.PC-Latin-1, is_is.PC-Latin-1,
it_it.PC-Latin-1, nl_be.PC-Latin-1, nl_nl.PC-Latin-1, no_no.PC-Latin-1,
pt_pt.PC-Latin-1, sv_se.PC-Latin-1

Locale PC-Latin-2:
cs_cz.PC-Latin-2, hr_hr.PC-Latin-2, hu_hu.PC-Latin-2, pl_pl.PC-Latin-2,
ro_ro.PC-Latin-2, sh_hr.PC-Latin-2, sk_sk.PC-Latin-2

Locale 858:
da_dk.858, de_at.858, de_at.858@bund, de_at.858@dude, de_de.858,
de_de.858@bun1, de_de.858@bund, de_de.858@dud1, de_de.858@dude, es_es.858,
fi_fi.858, fr_be.858, fr_fr.858, it_it.858, nl_be.858, nl_nl.858, pt_pt.858
Appendix D. Accessibility
IBM strives to provide products with usable access for everyone, regardless of age or ability.
Accessibility features
The following list includes the major accessibility features in IBM Informix products. These features support:
v Keyboard-only operation.
v Interfaces that are commonly used by screen readers.
v The attachment of alternative input and output devices.

Tip: The information center and its related publications are accessibility-enabled for the IBM Home Page Reader. You can operate all features by using the keyboard instead of the mouse.
Keyboard navigation
This product uses standard Microsoft Windows navigation keys.
alternatives. If you hear the lines 3.1 USERID and 3.1 SYSTEMID, your syntax can include either USERID or SYSTEMID, but not both.

The dotted decimal numbering level denotes the level of nesting. For example, if a syntax element with dotted decimal number 3 is followed by a series of syntax elements with dotted decimal number 3.1, all the syntax elements numbered 3.1 are subordinate to the syntax element numbered 3.

Certain words and symbols are used next to the dotted decimal numbers to add information about the syntax elements. Occasionally, these words and symbols might occur at the beginning of the element itself. For ease of identification, if the word or symbol is a part of the syntax element, the word or symbol is preceded by the backslash (\) character. The * symbol can be used next to a dotted decimal number to indicate that the syntax element repeats. For example, syntax element *FILE with dotted decimal number 3 is read as 3 \* FILE. Format 3* FILE indicates that syntax element FILE repeats. Format 3* \* FILE indicates that syntax element * FILE repeats.

Characters such as commas, which are used to separate a string of syntax elements, are shown in the syntax just before the items they separate. These characters can appear on the same line as each item, or on a separate line with the same dotted decimal number as the relevant items. The line can also show another symbol that provides information about the syntax elements. For example, the lines 5.1*, 5.1 LASTRUN, and 5.1 DELETE mean that if you use more than one of the LASTRUN and DELETE syntax elements, the elements must be separated by a comma. If no separator is given, assume that you use a blank to separate each syntax element.

If a syntax element is preceded by the % symbol, that element is defined elsewhere. The string following the % symbol is the name of a syntax fragment rather than a literal. For example, the line 2.1 %OP1 refers to a separate syntax fragment OP1.
The following words and symbols are used next to the dotted decimal numbers:

?   Specifies an optional syntax element. A dotted decimal number followed by the ? symbol indicates that all the syntax elements with a corresponding dotted decimal number, and any subordinate syntax elements, are optional. If there is only one syntax element with a dotted decimal number, the ? symbol is displayed on the same line as the syntax element (for example, 5? NOTIFY). If there is more than one syntax element with a dotted decimal number, the ? symbol is displayed on a line by itself, followed by the syntax elements that are optional. For example, if you hear the lines 5 ?, 5 NOTIFY, and 5 UPDATE, you know that syntax elements NOTIFY and UPDATE are optional; that is, you can choose one or none of them. The ? symbol is equivalent to a bypass line in a railroad diagram.

!   Specifies a default syntax element. A dotted decimal number followed by the ! symbol and a syntax element indicates that the syntax element is the default option for all syntax elements that share the same dotted decimal number. Only one of the syntax elements that share the same dotted decimal number can specify a ! symbol. For example, if you hear the lines 2? FILE, 2.1! (KEEP), and 2.1 (DELETE), you know that (KEEP) is the default option for the FILE keyword. In this example, if you include the FILE keyword but do not specify an option, default option KEEP is applied. A default option also applies to the next higher dotted decimal number. In this example, if the FILE keyword is omitted, default FILE(KEEP) is used.
However, if you hear the lines 2? FILE, 2.1, 2.1.1! (KEEP), and 2.1.1 (DELETE), the default option KEEP only applies to the next higher dotted decimal number, 2.1 (which does not have an associated keyword), and does not apply to 2? FILE. Nothing is used if the keyword FILE is omitted.

*   Specifies a syntax element that can be repeated zero or more times. A dotted decimal number followed by the * symbol indicates that this syntax element can be used zero or more times; that is, it is optional and can be repeated. For example, if you hear the line 5.1* data-area, you know that you can include more than one data area or you can include none. If you hear the lines 3*, 3 HOST, and 3 STATE, you know that you can include HOST, STATE, both together, or nothing.
    Notes:
    1. If a dotted decimal number has an asterisk (*) next to it and there is only one item with that dotted decimal number, you can repeat that same item more than once.
    2. If a dotted decimal number has an asterisk next to it and several items have that dotted decimal number, you can use more than one item from the list, but you cannot use the items more than once each. In the previous example, you can write HOST STATE, but you cannot write HOST HOST.
    3. The * symbol is equivalent to a loop-back line in a railroad syntax diagram.

+   Specifies a syntax element that must be included one or more times. A dotted decimal number followed by the + symbol indicates that this syntax element must be included one or more times. For example, if you hear the line 6.1+ data-area, you must include at least one data area. If you hear the lines 2+, 2 HOST, and 2 STATE, you know that you must include HOST, STATE, or both. As for the * symbol, you can repeat a particular item if it is the only item with that dotted decimal number. The + symbol, like the * symbol, is equivalent to a loop-back line in a railroad syntax diagram.
Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation J46A/G4
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject to change before the products described become available.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy,
modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written.

These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. Copyright IBM Corp. _enter the year or years_. All rights reserved.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml.

Adobe, the Adobe logo, and PostScript are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

Intel, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, or service names may be trademarks or service marks of others.
T
Troubleshooting memory issues sbspace 6-1 6-1
U
Uninstalling Informix Warehouse Accelerator 2-5 UPDATE STATISTICS LOW statement 3-7 use_dwa variable 3-7, 6-1 User-defined functions 1-9
V
Variables PDQPRIORITY 3-7 use_dwa 3-7 Visual disabilities reading syntax diagrams
D-1
W
Worker nodes 1-1 memory issues 6-1 WORKER_SHM parameter 3-3 Workload analysis creating data marts 4-10
Index
X-3
X-4
Printed in USA
SC27-3851-00