Mainframe Job Tutorial
The software delivered to Licensee may contain third-party software code. See Legal Notices (LegalNotices.pdf) for
more information.
How to Use this Guide
Chapter 6 Explains the details of working with simple flat file data.
Chapter 7 Explains the details of working with complex flat file data.
Chapter 15 Covers the process of generating code and uploading jobs to the
mainframe.
Appendix A Contains table and column definitions for the mainframe data
sources used in the tutorial.
Related Documentation
To learn more about documentation from other Ascential products as
they relate to Ascential DataStage Enterprise MVS Edition, refer to the
following table.
Ascential DataStage Server Job Developer’s Guide: Describes the tools
that are used in building a server job, and supplies programmer’s
reference information.
Ascential DataStage Parallel Job Developer’s Guide: Describes the tools
that are used in building a parallel job, and supplies programmer’s
reference information.
These guides are also available online in PDF format. You can read
them with the Adobe Acrobat Reader supplied with Ascential
DataStage. See Ascential DataStage Install and Upgrade Guide for
details on installing the manuals and the Adobe Acrobat Reader.
You can use the Acrobat search facilities to search the whole Ascential
DataStage document set. To use this feature, select Edit > Search,
then choose the All PDF Documents in option and specify the
Ascential DataStage docs directory (by default this is
C:\Program Files\Ascential\DataStage\Docs).
Extensive online help is also supplied. This is especially useful when
you have become familiar with using Ascential DataStage and need to
look up particular pieces of information.
Documentation Conventions
This manual uses the following conventions:
(The conventions illustration labels the interface elements referred to
throughout this guide, including drop-down lists, browse buttons, tabs,
fields, check boxes, option buttons, and buttons.)
Contacting Support
To reach Customer Care, please refer to the information below:
Call toll-free: 1-866-INFONOW (1-866-463-6669)
Email: support@ascentialsoftware.com
Ascential Developer Net: http://developernet.ascential.com
Please consult your support agreement for the location and
availability of customer support personnel.
To find the location and telephone number of the nearest Ascential
Software office outside of North America, please visit the Ascential
Software Corporation website at http://www.ascential.com.
Chapter 1
Introduction to DataStage Mainframe Jobs
Ascential DataStage Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
MVS Edition Terms and Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Chapter 2
DataStage Administration
The DataStage Administrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
Exercise 1: Set Project Defaults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
Chapter 3
Importing Table Definitions
The DataStage Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
Exercise 2: Import Mainframe Table Definitions . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Chapter 4
Designing a Mainframe Job
The DataStage Designer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Exercise 3: Specify Designer Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Exercise 4: Create a Mainframe Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
Chapter 5
Defining Constraints and Derivations
Exercise 5: Define a Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Exercise 6: Define a Stage Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
Exercise 7: Define a Job Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
Chapter 6
Working with Simple Flat Files
Simple Flat File Stage Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Exercise 8: Read Delimited Flat File Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Exercise 9: Write Data to a DB2 Load Ready File . . . . . . . . . . . . . . . . . . . . . . . 6-9
Exercise 10: Use an FTP Stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
Chapter 7
Working with Complex Flat Files
Complex Flat File Stage Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
Exercise 11: Use a Complex Flat File Stage. . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
Exercise 12: Flatten an Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
Exercise 13: Work with an ODO Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
Exercise 14: Use a Multi-Format Flat File Stage . . . . . . . . . . . . . . . . . . . . . . . 7-12
Exercise 15: Merge Multi-Format Record Types . . . . . . . . . . . . . . . . . . . . . . . 7-17
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
Chapter 8
Working with IMS Data
Exercise 16: Import IMS Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
Exercise 17: Read Data from an IMS Source . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-6
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-9
Chapter 9
Working with Relational Data
Relational Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-1
Exercise 18: Read Data from a Relational Source . . . . . . . . . . . . . . . . . . . . . . . 9-2
Exercise 19: Write Data to a Relational Target . . . . . . . . . . . . . . . . . . . . . . . . . 9-5
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-8
Chapter 10
Working with External Sources and Targets
Exercise 20: Read Data From an External Source . . . . . . . . . . . . . . . . . . . . . . 10-2
Exercise 21: Write Data to an External Target . . . . . . . . . . . . . . . . . . . . . . . . . 10-6
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-8
Chapter 11
Merging Data Using Joins and Lookups
Exercise 22: Merge Data Using a Join Stage . . . . . . . . . . . . . . . . . . . . . . . . . . 11-2
Exercise 23: Merge Data Using a Lookup Stage . . . . . . . . . . . . . . . . . . . . . . . 11-5
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11-9
Chapter 12
Sorting and Aggregating Data
Exercise 24: Sort Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-2
Exercise 25: Aggregate Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-3
Exercise 26: Use ENDOFDATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-6
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12-9
Chapter 13
Defining Business Rules
Exercise 27: Controlling Relational Transactions . . . . . . . . . . . . . . . . . . . . . . 13-1
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
Chapter 14
Calling External Routines
Exercise 28: Define Routine Meta Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-1
Exercise 29: Call an External Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-2
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14-7
Chapter 15
Generating Code
Exercise 30: Modify JCL Templates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-1
Exercise 31: Validate a Job and Generate Code . . . . . . . . . . . . . . . . . . . . . . . 15-3
Exercise 32: Define a Machine Profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-4
Exercise 33: Upload a Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-6
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15-7
Chapter 16
Summary
Main Features in Ascential DataStage Enterprise MVS Edition. . . . . . . . . . . 16-1
Recap of the Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-2
Contacting Ascential Software Corporation . . . . . . . . . . . . . . . . . . . . . . . . . . 16-4
Appendix A
Sample Data Definitions
COBOL File Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
DB2 DCLGen File Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
IMS Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5
Index
Server Components
Ascential DataStage has three server components:
Repository. A central store that contains all the information
required to build a data mart or data warehouse.
DataStage Server. Runs executable server jobs, under the
control of the DataStage Director, that extract, transform, and load
data into a data warehouse.
DataStage Package Installer. A user interface used to install
packaged DataStage jobs and plug-ins.
Client Components
Ascential DataStage has four client components, which are installed
on any PC running Windows 2000, Windows NT 4.0, or Windows XP
Professional:
DataStage Manager. A user interface used to view and edit the
contents of the Repository.
DataStage Designer. A graphical tool used to create DataStage
server, mainframe, and parallel jobs.
DataStage Administrator. A user interface used to perform
basic configuration tasks such as setting up users, creating and
deleting projects, and setting project properties.
DataStage Director. A user interface used to validate, schedule,
run, and monitor DataStage server jobs. The Director is not used
in mainframe jobs.
The DataStage Manager, Designer, and Administrator are introduced
during the mainframe tutorial exercises. You learn how to use these
tools to accomplish specific tasks and, in doing so, you gain some
familiarity with the capabilities they provide.
The server components require little interaction, although the
exercises in which you use the DataStage Manager also give you the
opportunity to examine the Repository.
Projects
In Ascential DataStage, all development work is done in a project.
Projects are created during the installation process. After installation,
new projects can be added using the DataStage Administrator.
Jobs
DataStage jobs consist of individual stages, linked together to
represent the flow of data from one or more data sources into a data
warehouse. Each stage describes a particular database or process. For
example, one stage may extract data from a data source, while
another transforms it. Stages are added to a job and linked together
using the Designer.
The following diagram represents the simplest job you could have: a
data source, a Transformer (conversion) stage, and the target data
warehouse. The links between the stages represent the flow of data
into or out of a stage.
You must specify the data you want to use at each stage and how it is
handled. For example, do you want all the columns in the source data
or only a select few? Should the data be joined, aggregated, or sorted
before being passed on to the next stage? What data transformations,
if any, are needed to put data into a useful format in the data
warehouse?
There are three basic types of DataStage job:
Server jobs. These are developed using the DataStage client
tools, and compiled and run on the DataStage server. A server job
connects to databases on other machines as necessary, extracts
data, processes it, then writes the data to the target data
warehouse.
Parallel jobs. These are developed, compiled and run in a similar
way to server jobs, but support parallel processing on SMP, MPP,
and cluster systems.
Mainframe jobs. These are developed using the DataStage client
tools. Ascential DataStage generates a COBOL program and JCL for
each mainframe job, which you then upload to the mainframe, where
the job is compiled and run.
Stages
A stage can be passive or active. Passive stages handle access to files
and tables for the extraction and writing of data. Active stages model
the flow of data and provide mechanisms for combining data streams,
aggregating data, and converting data from one data type to another.
A stage usually has at least one data input and one data output.
However, some stages can accept more than one data input and can
output to more than one stage. The properties of each stage and the
data on each input and output link are specified using a stage editor.
There are four stage types in mainframe jobs:
Source stages. Used to read data from a data source. Mainframe
source stage types include:
– Complex Flat File
– Delimited Flat File (can also be used as a target stage)
– External Source
– Fixed-Width Flat File (can also be used as a target stage)
– IMS
– Multi-Format Flat File
– Relational (can also be used as a target stage)
– Teradata Export
– Teradata Relational (can also be used as a target stage)
Target stages. Used to write data to a target data warehouse.
Mainframe target stage types include:
– DB2 Load Ready Flat File
– Delimited Flat File (can also be used as a source stage)
– External Target
– Fixed-Width Flat File (can also be used as a source stage)
– Relational (can also be used as a source stage)
– Teradata Load
– Teradata Relational (can also be used as a source stage)
Processing stages. Used to transform data before writing it to
the target. Mainframe processing stage types include:
– Aggregator
– Business Rule
– External Routine
– Join
– Link Collector
– Lookup
– Sort
– Transformer
Post-processing stage. Used to post-process target files
produced by a mainframe job. There is one type of post-
processing stage:
– FTP
These stage types are described in more detail in Chapter 4.
Getting Started
This tutorial is designed to familiarize you with the features and
functionality in DataStage mainframe jobs. As you work through the
tutorial exercises, you create jobs that read data, transform it, then
load it into target files or tables. You need not have an active
mainframe connection to complete the tutorial, as final job upload is
simulated.
At the end of this tutorial, you will understand how to:
Attach to a project and specify project defaults for mainframe jobs
in the DataStage Administrator
Import meta data from mainframe sources in the DataStage
Manager
Design a mainframe job in the DataStage Designer
Term Description
.cfd The file extension for COBOL File Definition (CFD) files.
Business Rule stage A stage that transforms data using SQL business rule
logic.
COBOL Common Business-Oriented Language. An English-
like programming language used for business
applications.
Complex Flat File stage A stage that reads data from complex flat file data
structures. A complex flat file may contain one or
more GROUP, REDEFINES, OCCURS, or OCCURS
DEPENDING ON clauses.
DB2 Load Ready Flat File stage A stage that writes data to a sequential
file or a delimited file in a format that is compatible for use
with the DB2 bulk loader facility.
DD name The data definition name for a file used in the JCL.
DD names are required to be unique in a job.
Delimited Flat File stage A stage that reads data from or writes data to a
delimited flat file.
External Source stage A stage that extracts data from an external source by
defining a call to a user-written subroutine.
Fixed-Width Flat File stage A stage that reads data from or writes data to a
simple flat file.
JCL templates Customizable templates provided by DataStage to
produce the JCL specific to a job.
Link Collector stage A stage that combines data from multiple input links
into a single output link.
Multi-Format Flat File stage A stage that reads data from files containing multiple
record types. The source data may contain one or
more GROUP, REDEFINES, OCCURS, or OCCURS
DEPENDING ON clauses per record type.
project A DataStage application. A project contains
DataStage jobs, built-in components used in jobs,
and user-defined components that perform specific
tasks in a job. The DataStage Server may have
several discrete projects, and each project may
contain several jobs.
Relational stage A stage that reads data from or writes data to a DB2
database table on an OS/390 platform.
Teradata Export stage A stage that reads data from a Teradata database
table on an OS/390 platform using the Teradata
FastExport utility.
Teradata Relational stage A stage that reads data from or writes data to a
Teradata database table on an OS/390 platform.
variable-block file A complex flat file that contains variable record
lengths.
2 Select the project to connect to. This page displays all the projects
installed on your DataStage server. If you have administrator
status, you can create a new project by clicking Add… .
Summary
In this chapter you logged on to the DataStage Administrator, selected
a project, and defined default project properties. You became familiar
with the mainframe project settings that are used during job design,
code generation, and job upload.
Next, you use the DataStage Manager to import mainframe table
definitions.
Before you design a DataStage job, you need to create meta data for
your mainframe data sources. There are two ways to create meta data
in Ascential DataStage:
Import table definitions
Enter table definitions manually
This chapter focuses on importing table definitions to help you get off
to a quick start. The DataStage Manager allows you to import meta
data from COBOL File Definitions (CFDs), DB2 DCLGen files,
Assembler File Definitions, PL/I File Definitions, Teradata tables, and
IMS definitions.
Sample CFD files, DCLGen files, and IMS files are provided with the
tutorial. Exercise 2 demonstrates how to import CFDs and DB2
DCLGen files into the DataStage Repository. You start the DataStage
Manager and become acquainted with its functionality. The first part
of the exercise provides step-by-step instructions to familiarize you
with the import process. The second part is less detailed, giving you
the opportunity to test what you have learned. You will work with IMS
data later in the tutorial.
Toolbar
The Manager toolbar contains the following buttons:
(The toolbar illustration identifies buttons such as New Data Element,
New Machine Profile, New Routine, Properties, Usage Analysis, and Host
View, along with the Small Icons and Details view options.)
You can display ToolTips for the toolbar by letting the cursor rest on a
button in the toolbar.
Project Tree
The project tree contains a summary of the project contents. It is
divided into the following main branches:
Data Elements. A category exists for the built-in data elements
and any additional ones you define. These are used only for server
jobs.
IMS Databases (DBDs). This branch stores any IMS databases
that you import. It appears only if you have the IMS source
license.
IMS Viewsets (PSBs/PCBs). This branch stores any IMS
viewsets that you import. It appears only if you have the IMS
source license.
Jobs. A category exists for each group of jobs in the project.
Machine Profiles. This branch stores mainframe machine
profiles, which are used during job upload and in FTP stages.
Routines. Categories exist for built-in routines and any additional
routines you define, including external source and target routines.
Shared Containers. These are used only for server jobs.
Stage Types. The plug-ins you create or import are stored in
categories under this branch.
Table Definitions. Table definitions are stored according to the
data source. If you import a table or file definition, a category is
created under the data source type (for example, COBOL FD or
DB2 Dclgen). You see this demonstrated in the exercises later in
this chapter. If you manually enter a table or file definition, you
can create a new category anywhere under the main Table
Definitions branch.
Display Area
The display area in the right pane of the Manager window is known as
the Project View. It displays the contents of the branch chosen in the
project tree. You can display items in the display area in one of four
ways:
Large icons. Items are displayed as large icons arranged across
the display area.
Small icons. Items are displayed as small icons arranged across
the display area.
List. Items are displayed in a list going down the display area.
Details. Items are displayed in a table with Name, Description,
and Date/Time Modified columns.
2 Click the browse (…) button next to the COBOL file description
pathname field to select the ProductsCustomers.cfd file on
the tutorial CD. The names of the tables in the file automatically
appear in the Tables list. They are the names found for each
COBOL 01 level.
3 Keep the default setting in the Start position field. This is where
Ascential DataStage looks for the 01 level that defines the
beginning of a COBOL table definition.
4 Notice the Platform type field. This is the operating system for
the mainframe platform.
5 Notice the Column comment association option. This specifies
whether a comment line in a CFD file should be associated with
the column that follows it (the default) or the column that
precedes it. Keep the default setting.
6 Click the browse button next to the To category field to open the
Select Category dialog box. A default category is displayed in
the Current category field. Replace the default by typing
COBOL FD\Sales.
The top half of this dialog box displays Ascential DataStage’s view
of the column. The COBOL tab displays the COBOL view of the
column. There are different versions of this dialog box, depending
on the data source.
12 Click Close to close the Edit Column Meta Data dialog box.
13 Click Layout. The COBOL button is selected by default. This page
displays the file view layout of the column definitions in the table.
14 Click OK to close the Table Definition dialog box.
Summary
In this chapter, you learned the basics of importing meta data from
mainframe data sources into the DataStage Repository. You imported
table definitions from both CFD and DCLGen files.
Next you find out how to create a mainframe job with the DataStage
Designer.
Toolbar
The following buttons on the Designer toolbar are active for
mainframe jobs:
(The toolbar illustration identifies the New Job, Open Job, Save Job,
Save All Jobs, Job Properties, Cut, Copy, Paste, Undo, Redo, Generate
Code, Link Markers, Snap to Grid, Zoom In, Zoom Out, Print, Grid Lines,
Toggle Annotations, and Help buttons.)
You can display ToolTips for the toolbar by letting the cursor rest on a
button in the toolbar. The status bar then also displays an expanded
description of that button’s function.
The toolbar appears under the menu bar by default, but you can drag
and drop it anywhere on the screen. If you move the toolbar to the
edge of the Designer window, it attaches to the side of the window.
Tool Palette
The tool palette contains buttons that represent the components you
can add to your job design. There are separate tool palettes for server
jobs, mainframe jobs, parallel jobs, and job sequences. The palette
displayed depends on what type of job is currently active in the
Designer. You can customize the tool palette by adding or removing
buttons, creating, deleting, or renaming groups, changing the icon
size, and creating new shortcuts to suit your requirements. You can
also save your settings as your project defaults. For details on
customizing the palette, see Ascential DataStage Designer Guide.
The palette is docked to the Diagram window, but you can drag and
drop it anywhere on the screen. You can also resize it. To display
ToolTips, let the cursor rest on a button in the tool palette. The status
bar then also displays an expanded description of the button’s
function.
By default the tool palette for mainframe jobs is divided into four
groups containing the following buttons:
Complex Flat File. Reads data from a complex flat file data
structure. This is a passive stage.
Join. Joins two incoming data streams and passes the data to
another stage in the job. This is an active stage.
The General group on the tool palette contains three additional icons:
Annotation. Contains notes that you enter to describe the
stages or links in a job.
Description Annotation. Displays either the short or long
description from the job properties. You can edit this within
the annotation if required. There is only one of these per job.
Link. Joins the stages in a job together.
3 Now link the job components together to define the flow of data in
the job:
a Click the Link button on the tool palette. Click and drag
between the Fixed-Width Flat File stage on the left side of the
diagram window and the Transformer stage. Release the
mouse to link the two stages.
b In the same way, link the Transformer stage to the Fixed-Width
Flat File stage on the right side of the diagram window.
Your diagram window should now look similar to this:
Note An asterisk (*) next to the job title indicates that the job has
changed since the last time it was saved.
Since the column push option is turned on, you could bypass
this step if you wanted to output all of the columns. However,
in this case you are going to output only a subset of the
columns.
b Click the >> button to move all columns in the Available
columns list to the Selected columns list.
c Select DATA_NOT_NEEDED and FILLER_178_277 in the
Selected columns list and click <. These columns will not be
output from the stage.
d Click OK to close the Fixed-Width Flat File Stage dialog box.
e In the diagram window, notice the small icon that is attached to
the CustomersOut link. This link marker indicates that meta
data has been defined for the link. Link marking is enabled by
default, but you can turn it off by clicking the link markers
button in the Designer toolbar.
You have finished defining the input stage for the job. Ascential
DataStage makes it easy to build the structure of a job in the Designer,
then bind specific files to the job.
Transformer Stage
With the input and output stages of the job defined, the next step is to
define the Transformer stage. This is the stage where you specify
what transformations you want to apply to the data before it is output
to the target file.
The upper part of the Transformer Editor is called the Links area. It
is split into two panes:
The left pane shows the columns on the input link.
The right pane shows the columns on the output link and any
stage variables you have defined.
The Derivation cells on the output link are where you specify
what transformations you want to perform on the data. As
derivations are defined, the output column names change from
red to black, and relationship lines are drawn between the input
columns and the output columns.
Beneath the Links area is the Meta Data area. It is also split into
two panes:
The left pane contains the meta data for the input link, which is
read-only.
The right pane contains the meta data for the output link, which
you can edit.
These panes display the column definitions you viewed earlier in
the exercise on the Columns pages in the source and target
Fixed-Width Flat File Stage dialog boxes.
Note A great feature of the DataStage Designer is that you
only have to define or edit something on one end of a
link. The link causes the information to automatically
propagate to the stage at the other end.
You can view ToolTips for the toolbar by letting the cursor rest on
a button in the toolbar.
For more details on the Transformer Editor, refer to Ascential
DataStage Mainframe Job Developer’s Guide. However, the steps
in the tutorial exercises tell you everything you need to know
about the Transformer Editor to enable you to run the exercises.
2 You now need to link the input and output columns and specify
what transformations you want to perform on the data. In this
simple example, you are going to map each column on the input
link to the equivalent column on the output link.
You can drag and drop input columns to output columns, or you
can use Ascential DataStage’s column auto-match facility to map
the columns automatically.
Before continuing, take a look at the HTML file you created in the
source stage. Open the file to review the information that was
captured, including the Ascential DataStage version number, job
name, user name, project name, server name, stage name, and date
written, as well as a copy of the file view layout showing the columns
and storage length. This becomes useful reference information for
your job.
Generating Code
To generate code:
1 Choose File > Generate Code or click the Generate Code
button on the toolbar. The Code generation dialog box is
displayed:
2 Notice the Code generation path field. This is the fully qualified
path, which consists of the default root path you specified in the
Options dialog box, followed by the server name, project name,
and job name.
3 Look at the names in the Cobol program file name, Compile
JCL file name, and Run JCL file name fields. These are
member names. During job upload these members are loaded
into the mainframe libraries you specify in the machine profile
used for upload. You will delve into the details of this later.
Summary
In this chapter, you learned how to design a simple job. You created
source and target Fixed-Width Flat File stages and a Transformer
stage to link input columns to output columns. You used the
DataStage Designer to go through the process of building, saving, and
generating code for a job.
Next, you try some more advanced techniques. You use the
mainframe Expression Editor to build derivation expressions and
constraints. From this point forward, the exercises give shorter
directions for steps you have already performed. It is assumed that
you are now familiar with the Designer and Manager interfaces and
that you understand the basics of designing jobs and editing stages.
Detailed instructions are provided, however, for new tasks.
This chapter shows you how to use the Expression Editor to define
constraints and column derivations in mainframe jobs. You also learn
how to specify job parameters and stage variables and incorporate
them into constraint and derivation expressions.
In Exercise 5 you define constraints to filter output data. You expand
the job you created in Exercise 4 by adding two more target stages.
You then use the constraints to conditionally direct data down the
different output links, including a reject link. You also define the link
execution order.
In Exercise 6 you specify a stage variable that derives customer
account descriptions. You insert a new column into each of your
output links, then use the stage variable in the output column
derivations. You then finish configuring the two target stages.
In Exercise 7 you define and use a job parameter related to customer
credit ratings. You modify the constraint created in Exercise 5 so that
only customers with a selected credit rating are written to the output
links.
The left pane displays input link ordering and the right pane
displays output link ordering. Since Transformer stages have just
one input link in mainframe jobs, only output link ordering
applies.
2 View the output link order displayed. RejectedCustomersOut
should be last in the execution order. If it isn’t, use the arrow
buttons on the right to rearrange the order.
3 Click OK to save your settings and to close the Output Link
Execution Order dialog box.
4 Click OK to save the Transformer stage settings and to close the
Transformer Editor.
5 Save the job.
You define job parameters in the Job Properties dialog box, and you
store their values in a flat file on the mainframe that is accessed when
a job is run.
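Once defined, a job parameter is referenced by name inside constraint
and derivation expressions. As a sketch only (the parameter name
CreditRating and the column name CREDIT_RATING are assumed here for
illustration, and the exact reference form follows what the Expression
Editor inserts when you double-click the parameter), the modified
constraint from Exercise 5 might include a condition such as:

   CustomersOut.CREDIT_RATING = CreditRating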
Summary
This chapter familiarized you with the mainframe Expression Editor.
You learned how to define constraints and derivation expressions.
You also saw how stage variables and job parameters are defined and
used.
Next you work with several types of flat files. You learn about their
unique characteristics and find out how to use them in mainframe
jobs. You also see the differences between the various flat file stage
editors.
This chapter explores the details of working with simple flat files in
mainframe jobs. You will build on what you learned in the last chapter
by working with more advanced capabilities in Fixed-Width Flat File
stages. You will also become familiar with the unique features of
Delimited Flat File and DB2 Load Ready Flat File stages.
In Exercise 8 you design a job that selects employees who are eligible
to receive an annual bonus and calculates the bonus amount. It reads
data from a delimited flat file, transforms it, and loads it to a fixed-
width flat file. You test what you’ve learned so far by configuring the
three stages, specifying a constraint, and defining an output column
derivation. You also see how easy it is to save column definitions as a
table definition in the Repository.
In Exercise 9 you modify the job to calculate hiring bonuses for new
employees. You add a constraint to the source stage, practice defining
and using a stage variable in a Transformer stage, and learn how to
configure a DB2 Load Ready Flat File target stage. Finally, in Exercise
10 you add an FTP stage to the job design so you can transfer the
target file to another machine.
Column name     Native type   Length   Scale
LAST_NAME       CHAR          20       0
HIRE_DATE       CHAR          10       0
DEPARTMENT      CHAR          15       0
JOB_TITLE       CHAR          25       0
SALARY          DECIMAL       8        2
BONUS_TYPE      CHAR          1        0
BONUS_PERCENT   DECIMAL       2        2
This is where you specify the delimiters for your source file. Let’s
assume your file uses a comma delimiter to separate columns and
quotation marks to denote strings, so you can keep the default
settings in the Delimiter area. Select the First line is column
names check box to specify that the first line in the file contains
the column names.
6 Click Outputs. The Constraint tab is active by default. Define a
constraint that selects only employees who were hired before
January 1, 2004, and are eligible for annual bonuses, which are
designated by an ‘A’ in the BONUS_TYPE field, as shown on the
next page.
When you are done, the Expression Editor should look similar to
this:
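As a sketch only (the output link name EmployeesOut is assumed for
illustration, and the date literal must match the format in which
HIRE_DATE is stored), the finished constraint is equivalent to an
expression of the form:

   EmployeesOut.HIRE_DATE < '2004-01-01' AND
   EmployeesOut.BONUS_TYPE = 'A'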
3 Open the Delimited Flat File source stage and specify a constraint
for the NewEmployeesOut link:
a Click Outputs.
b On the Constraint tab, select NewEmployeesOut from the
Output name drop-down list.
c Click Clear All to clear the contents of the Constraint grid.
d Define a new constraint that selects employees whose hire date
is on or after January 1, 2004.
e Click OK to save your changes to the stage.
4 Open the xNewEmployees stage and edit it:
a Map the input columns straight across to the
HiringBonusesOut link.
b Create a stage variable named HiringBonus that has an initial
value of 0, Decimal data type, length 5, and scale 2.
c Recalling what you learned in Chapter 5, create the following
derivation for HiringBonus:
IF NewEmployeesOut.DEPARTMENT = 'ENGINEERING' THEN
   1000
ELSE
   IF NewEmployeesOut.DEPARTMENT = 'MARKETING' THEN
      500
   ELSE
      300
   END
END
f Click OK.
5 Open the DB2 Load Ready Flat File target stage and specify the
following on the General tab:
a The filename is HR.HIRING.BONUS.
b The DD name is NEWBONUS.
c The write option is Create a new file.
d Select Delimited flat file as the file type.
6 Click the Bulk Loader tab, which is where you set the parameters
to run the DB2 bulk loader utility and generate the control file:
a The user name is dstage.
b The DB2 subsystem id is DB2D.
c The table name is BONUS.
d The table owner is DB2OWN.
7 Click the Format tab to specify delimiter information for the target
file:
a Keep the default settings in the Column delimiter, String
delimiter, and Decimal point fields.
b Select Always delimit string data to delimit all string fields
in the target file. (If this box is not selected, then string fields
are delimited only if the data contains the column delimiter
character itself).
8 On the Options tab, specify the following:
a The volume serial number is MVS123.
b The database version is 6.1.
c The expiration date is 2004/365.
9 Click OK to save your changes.
10 Click Generate Code and enter BONUS04 as the member name
for all three generated files. Generate code for the job and view
the Run JCL to see how it differs from that of the last exercise.
3 Open the FTP stage and notice that the Machine Profile field on
the General page is empty. This is because you have not created
any machine profiles in the Manager. You can specify the
attributes for the target machine from within the stage as follows:
a The host name is Riker.
b The file exchange method is FTP. Note that FTP stages also
support Connect:Direct as a file exchange method.
c The user name and password are dstage.
d The transfer mode is Stream.
e The transfer type is ASCII.
f Keep the default settings in the rest of the fields. The FTP
Stage dialog box should look similar to this:
Summary
In this chapter you learned how to work with different types of simple
flat files. You read data from delimited flat files and saved columns as
a table definition in the Repository. You wrote data to both fixed-width
and DB2 load ready flat files. You specified target file parameters such
as volume serial number and tape expiration date. You also used an
FTP stage to transfer your target file to another machine. The
exercises in this chapter also gave you a chance to test what you’ve
learned about defining constraints, declaring stage variables, and
creating output column derivations.
You have worked with simple flat files in mainframe jobs. Now you
see how to read data from complex flat files. Ascential DataStage
Enterprise MVS Edition has two complex flat file stage types: Complex
Flat File and Multi-Format Flat File. The exercises in this chapter show
you how to configure them as sources and manipulate their complex
data structures.
In Exercise 11 you create a job that provides information about several
products in a product line. It extracts data from a complex flat file,
transforms it, and loads it to a delimited flat file. You practice what
you’ve learned so far by configuring the three stages, specifying a job
parameter, and defining a constraint. You also see how easy it is to
convert dates from one format to another.
Exercise 12 takes you a step further with complex flat files by showing
you how to flatten an array. You manipulate the flattened data to
create an output file that lists product colors. At the end of each
exercise you generate code for the job and look at the results.
In Exercise 13 you learn about OCCURS DEPENDING ON clauses. You
design a job that flattens an array containing product discount
information. Your then create an output file that indicates whether a
product discount is in effect as of the current date. As part of this, you
define and use stage variables.
Exercise 14 introduces you to multi-format flat files. You create a job
that reads variable-length records from a purchase order file and
writes them to three DB2 load ready target files. You also practice
importing table definitions in the Manager. In Exercise 15, you see
how to merge multiple record types down a single output link.
When you work with Multi-Format Flat File stages, you define the
record types of the data being read by the stage. Only those records
required by the job need to be included, even if the source file
contains other records. More than one record definition can be written
to each output link, and the same record definition can be written to
more than one output link.
4 Click the Selection tab on the Outputs page and move the
following columns to the Selected columns list in this order:
PRODUCT_ID, PRODUCT_DESC, COLOR_CODE,
COLOR_DESC, UNIT_PRICE, and EFF_START_DATE.
Notice that the PROD_DISCOUNTS column is not selectable.
This is because it is a group item that has sublevel items of
DECIMAL native type. Group items can only be selected if the
sublevel items are of CHARACTER native type.
5 Define a constraint on the Constraint tab that selects only
products from the product line specified by the job parameter:
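As a sketch only (the column name PROD_LINE, the output link name
ProductsOut, and the job parameter name ProductLine are assumed here
for illustration), such a constraint might read:

   ProductsOut.PROD_LINE = ProductLine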
c Click the Selection tab on the Outputs page and scroll down
the Available columns list. Notice that AVAILABLE_
COLORS appears four times, with a suffix showing the
occurrence number.
d Modify the Selected columns list on the Selection tab to
include the following columns: PRODUCT_ID,
PRODUCT_DESC, COLOR_DESC, COLOR_DESC_2,
COLOR_DESC_3, COLOR_DESC_4, UNIT_PRICE, and
EFF_START_DATE. Use the arrow buttons to the right of the
Selected columns list to arrange the columns in this order.
e Do not change the constraint on the Constraint tab.
f Click OK to save your changes to the source stage.
3 Open the Delimited Flat File target stage and change the filename
on the General tab to SLS.PRODUCT.COLORS.LIST. Delete the
COLOR_CODE column on the Columns tab.
4 Open the Transformer Stage and edit the COLOR_DESC column
derivation so that it results in a string of the form:
‘This product comes in colors: <color1>, <color2>, <color3> and
<color4>’
b Click the || operator. This joins the initial text string with the
next component of the expression.
c Since the length of the color descriptions varies, you want to
trim any blank spaces to make the result more readable.
Expand the Built-in Routines branch of the Item type list.
Click String to display the string functions. Double-click the
TRIM function that trims trailing characters from a string.
d In the Expression syntax box, replace <Character> with ‘ ‘
(single quote, space, single quote). This specifies that the
spaces are to be trimmed from the color description.
e In the Expression syntax box, highlight <String> and replace
it with the COLOR_DESC column. This inserts the first color
into the expression.
f Insert the || operator at the end of the expression.
g Type ‘, ’ to insert a comma and space after the first color.
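When all four colors have been added, the finished derivation might
look like the following sketch (the link name ProductsOut is assumed,
and the TRIM argument order simply follows the <Character> and
<String> placeholders that the Expression Editor inserts):

   'This product comes in colors: ' ||
   TRIM(' ', ProductsOut.COLOR_DESC) || ', ' ||
   TRIM(' ', ProductsOut.COLOR_DESC_2) || ', ' ||
   TRIM(' ', ProductsOut.COLOR_DESC_3) || ' and ' ||
   TRIM(' ', ProductsOut.COLOR_DESC_4)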
5 In the Meta Data area of the Transformer Editor, change the length
of the COLOR_DESC output column to 100. This will ensure that
the entire list of colors appears in the column derivation.
6 Save the job, then generate code to make sure the job successfully
validates. Remember to change the job name in the Code
generation path field so that you don’t overwrite the COBOL and
JCL files that were generated in the last exercise.
The expression checks the current date against the
dates of both sales and returns the appropriate discount
percent, or 0 if the current date falls outside of the sale dates.
Use the BETWEEN function to compare dates. Replace
<Expression1> with CURRENT_DATE, a constant in the
Constants branch of the Item type list. Replace
<Expression2> and <Expression3> with your stage variables.
When you are done, the expression should look similar to this:
IF ProductsOut.DISCOUNT_CODE = 0 THEN
0
ELSE
IF ProductsOut.DISCOUNT_CODE = 1 THEN
IF CURRENT_DATE BETWEEN
DiscountStartDate1 AND DiscountEndDate1 THEN
ProductsOut.DISC_PCT
ELSE
0
END
ELSE
IF ProductsOut.DISCOUNT_CODE = 2 THEN
IF CURRENT_DATE BETWEEN
DiscountStartDate1 AND DiscountEndDate1 THEN
ProductsOut.DISC_PCT
ELSE
IF CURRENT_DATE BETWEEN
DiscountStartDate2 AND DiscountEndDate2
THEN
ProductsOut.DISC_PCT_2
ELSE
0
END
END
ELSE
0
END
END
END
5 Open the Delimited Flat File stage and change the filename to
SLS.PRODUCT.DISCOUNT and the DD name to DISCOUNT. Verify
that the DISCOUNT column appears on the Columns tab.
6 Save the job and generate code. Change the job name to
Exercise13 in the code generation path and enter PRODDISC as
the member name for all three generated files. View the generated
COBOL program to see the results.
You have designed a job that flattens an OCCURS DEPENDING ON
array. You defined stage variables to convert the data type of the input
columns to Date. You then used the Expression Editor to create a
complex output column derivation. The derivation determines the
number of times a product is discounted, then compares the current
date to the discount start and end dates. It returns the appropriate
discount percentage, or 0 if no discount is in effect.
3 Click the Records ID tab. You must specify a record ID for each
output link in Multi-Format Flat File stages. The record ID field
should be in the same position in each record.
To specify the record ID:
a For the ORDERS record, select the column
PurchaseOrders.ORDERS.MORD_TYPE in the Column
field, choose the = operator, and type ‘O’ in the Column/
Value field. Notice that the record ID appears in the
Constraint box at the bottom of the page.
b For the CUSTOMERS record, define a record ID where
PurchaseOrders.CUSTOMERS.MCUST_TYPE = ‘C’.
c For the INVOICES record, define a record ID where
PurchaseOrders.INVOICES.MINV_TYPE = ‘I’.
4 Click the Records view tab. Notice that the total file length of the
selected record is displayed at the bottom of the page. Find the
length of the largest record. You will use this later to verify the
value in the Maximum file record size field.
5 Click the Outputs page. The Selection tab is displayed by
default. The column push option does not operate in Multi-Format
Flat File stages (even if you selected it in Designer options) so you
must select columns to output from the stage:
a Select the OrdersOut link in the Output name field. Highlight
the ORDERS record name in the Available columns list and
click >> to move all of its columns to the Selected columns
list.
b Select the CustomersOut link in the Output name field and
move all the columns from the CUSTOMERS record to the
Selected columns list.
c Select the InvoicesOut link and move all the columns from
the INVOICES record to the Selected columns list.
6 Click the Constraint tab. You can optionally define a constraint on
the Constraint grid to filter your output data. For the OrdersOut
link, define a constraint that selects only orders totaling $100.00 or
more.
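As a sketch only, assuming the order total is held in the
MORD_TOTAL_AMT column (the column used again in Exercise 15) and
qualified in the style shown on the Records ID tab, the constraint
might read:

   PurchaseOrders.ORDERS.MORD_TOTAL_AMT >= 100.00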
7 Click OK to accept the settings and close the Multi-Format Flat File
stage editor.
8 Reopen the stage editor and verify that Ascential DataStage
calculated the correct value in the Maximum file record size
field.
The source stage is now complete.
3 Open the source stage and edit the Selection tab so that it
contains the following columns from the three records:
MORD_TOTAL_AMT, MORD_TOTAL_QTY, MCUST_PART,
MCUST_PART_AMT, MINV_DATE, and
MINV_MISC_COMMENT.
4 Open the Transformer stage, delete the existing output columns,
and map the input columns straight across to the output link.
5 Open the target stage and change the filename to
SLS.ORDERS.SUM and the DD name to SUMMARY. Verify the
columns on the Columns tab and change the table name on the
Bulk Loader tab to SUMMARY.
6 Save the job and generate code, first changing the job name to
Exercise15 in the code generation path.
Now you have seen how to send data from multiple record types
down a single output link from a Multi-Format Flat File stage. This is
useful in business situations where data is stored in a multi-format flat
file with a hierarchical structure, but needs to be normalized and
moved to a relational database.
Summary
In this chapter you created jobs to work with different types of flat file
data. You read data from both complex and multi-format flat files and
learned how to normalize and flatten arrays. You wrote data to
delimited and DB2 load ready flat files and specified the target file
parameters. The exercises in this chapter gave you a chance to test
what you’ve learned about importing meta data, configuring stages,
defining constraints and stage variables, and specifying job
parameters.
This chapter introduces you to the IMS stage in mainframe jobs. IMS
stages are used to read data from databases in IMS version 5 and
above. When you use an IMS stage, you can view the segment
hierarchy of an IMS database and select a path of segments to output
data from. You can choose to perform either partial path or complete
path processing. You can also add an end-of-data indicator, normalize
or flatten arrays, and define a constraint to limit output data.
The exercises in this chapter show you how to import meta data from
IMS definitions and configure the IMS stage as a source in a job. In
Exercise 16 you import meta data from an IMS Data Base Description
(DBD) file and an IMS Program Specification Block (PSB) file. You
become familiar with the structure of the imported meta data by
viewing the details of the data using Ascential DataStage’s IMS DBD
Editor and IMS Viewset Editor.
In Exercise 17 you create a job that provides information about
inventory for an auto dealership. It reads data from an IMS source,
transforms it, and writes it to a flat file target. You see how to select an
IMS segment path and output columns, and you define a constraint to
limit output data.
2 Browse for the Dealer.psb file on the tutorial CD in the IMS file
description pathname field.
3 Notice the Create associated tables field, which is selected by
default. This has Ascential DataStage create a table in the
Repository that corresponds to each sensitive segment in the PSB
file, and columns in the table that correspond to each sensitive
field. If no sensitive fields exist in the PSB, then the created
columns correspond to the segments in the DBD. Only those fields
that are defined in the PSB become columns; fillers are created
where necessary to maintain proper field displacement and
segment size.
The associated tables are stored in the Table Definitions branch
of the project tree, in a subcategory called Viewset. You can
change the associated table for each segment in the IMS Viewset
Editor, as you’ll see later.
4 Create a Sales subcategory under Viewset in the To category
field.
5 Select DLERPSBR in the Viewset names list, then click Import.
After the import is complete, locate the PSB in the IMS Viewsets
(PSBs/PCBs) branch of the project tree and the associated tables in
the Table Definitions branch of the project tree. Now let’s take a look
at the imported meta data.
To view the DBD:
1 Expand the IMS Databases (DBDs) branch of the Manager
project tree to display the Sales subcategory, then double-click
the DEALERDB database in the right pane. This opens the IMS
Database Editor:
This dialog box is divided into two panes. The left pane displays
the IMS database, segments, and datasets in a tree structure, and
the right pane displays the properties of selected items. When the
database is selected, the right pane has a General page and a
Hierarchy page. The General page describes the general
properties of the database including the name, version number,
access type, organization, category, and short and long
descriptions. All of these fields are read-only except for the
descriptions.
2 Click the Hierarchy page. This displays the segment hierarchy of
the database. Right-click anywhere on the page and select Details
from the shortcut menu to view the hierarchy in detailed mode.
3 In the left pane, select the DEALER segment in the tree. The right
pane now has a General page and a Fields page. Look over the
fields on both pages.
4 Next click the DLERDB dataset in the left pane. The properties of
the dataset appear on a single page in the right pane. This
includes the DD names used in the JCL to read the file.
5 Click OK to close the IMS Database Editor. Now you are familiar
with the properties of the IMS database.
Next let’s take a look at the properties of the imported PSB.
This dialog box is also divided into two panes, the left for the IMS
viewset (PSB), its views (Program Communication Blocks, or
PCBs), and the sensitive segments, and the right for the properties
of selected items. Take a look at the PSB properties shown in the
right pane.
2 Select UNNAMED-PCB-1 in the left pane to view the PCB
properties, which are described on a General page and a
Hierarchy page. On the General page, click the Segment/Table
Mapping… button to open the Segment/Associated Table
Mapping dialog box. This dialog box allows you to create or
change the associated tables for the PCB segments. Since you
created associated tables during PSB import, the current
mappings are displayed.
The left pane displays available tables in the Repository which are
of type QSAM_SEQ_COMPLEX. The right pane displays the
segment names and the tables currently associated with them.
You can clear one or all of the current table mappings using the
right mouse button. To change the table association for a
segment, select a table in the left pane and drag it to the segment
in the right pane. When you are finished, click OK. In this case,
keep the current mappings and click Cancel to return to the IMS
Viewset Editor.
3 Click the Hierarchy page and view the PCB segment hierarchy in
detailed mode.
4 Select one of the sensitive segments in the left pane, such as
DEALER. Its properties are displayed on a General page, a Sen
Fields page, and a Columns page. Notice the browse button next
to the Associate table field on the General page; clicking this
lets you change the table associated with a particular segment if
desired.
5 Click OK to close the IMS Viewset Editor.
You have now defined the meta data for your IMS source and viewed
its properties.
3 Open the IMS source stage. The View tab is displayed by default.
This is where you specify details about the IMS source file you are
reading data from:
a Type IMS1 in the IMS id field.
b Select DLERPSBR from the PSB drop-down list. This defines
the view of the IMS database.
c Select UNNAMED-PCB-1 in the PCB drop-down list. The
drop-down list displays all PCBs that allow for IMS database
retrieval.
d Review the segment hierarchy diagram. You can view the
hierarchy in detailed mode by selecting Details from the
shortcut menu. Detailed mode displays the name of the
associated table, its record length, and the segment key field.
6 Click the Selection tab and move everything except the two filler
columns to the Selected columns list.
7 On the Constraint tab, define a constraint that selects all vehicles
with a price less than $25,000.00.
8 Click OK to accept the settings. The IMS source stage is now
complete.
9 Propagate the input columns to the output link in the Transformer
stage.
10 Configure the target Fixed-Width Flat File stage to write data to a
new file named INSTOCK.
11 Save the job and generate code. In the Code generation dialog
box, notice the IMS Program Type field. This specifies the type
of IMS program being read by the job. Keep the default setting of
DLI.
You have now read data from an IMS source. You specified the
segment path for reading data and selected the columns to be output
from the stage.
Summary
In this chapter you learned how to import data from IMS sources and
use an IMS stage in a job. You viewed the details of the imported meta
data, including the segment hierarchy, and saw how table
associations for each segment are created in the Manager. You then
configured the IMS stage as a source in a job that determined the
available stock of cars priced under $25,000 from auto dealerships.
You selected the segment path to read data from, and defined a
constraint to limit the output data.
Next you learn how to work with Relational stages.
Relational Stages
Relational stages extract data from and write data to tables in DB2
UDB 5.1 and later. When used as a source, Relational stages have
separate tabs for defining a SQL SELECT statement. You identify the
source table, select columns to be output from the stage, and define
the conditions needed to build WHERE, GROUP BY, HAVING, and
ORDER BY clauses. You can also type your own SQL statement if you
need to perform complex joins or subselects. An integrated parser
validates your syntax against SQL-92 standards.
When used as a target, Relational stages provide a variety of options
for writing data to an existing DB2 table. You can choose to insert new
rows, update existing rows, replace existing rows, or delete rows,
depending on your requirements. You identify the table to write data
to, select the update action and the columns to update, and specify
the update condition.
4 Open the Relational source stage. The Tables tab on the Outputs
page is displayed by default. The Available tables list contains
all table definitions that have DB2 as the access type. Expand the
Sales branch under DB2 Dclgen, and move both the SALESREP
and SALESTERR tables to the Selected tables list.
5 Click the Select tab and select all columns from the SALESREP
table except SLS_REP_LNAME, SLS_REP_FNAME,
SLS_TERR_NBR, and TAX_ID. Select all columns from
SALESTERR.
6 Define a computed column that is the concatenation of a sales
representative’s first and last names:
a Click New on the Select tab. The Computed Column dialog
box appears.
b Type FullName in the As name field.
c Keep the default value of CHARACTER in the Native data
type field.
d Type 40 in the Length field.
e Click Functions and choose the concatenation function
(CONCAT) from the list of DB2 functions. Notice the expression
that appears in the Expression text box.
f Highlight <Operand1> in the Expression box, click Columns,
and double-click SALESREP.SLS_REP_FNAME. This
replaces <Operand1> in the Expression box.
g Follow the same procedure to replace <Operand2> with
SALESREP.SLS_REP_LNAME. The Computed Column
dialog box should now look similar to this:
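The finished derivation in the Expression box corresponds roughly to the following SQL fragment (a sketch; the exact expression text may differ slightly):

    -- FullName computed column, as defined above
    CONCAT(SALESREP.SLS_REP_FNAME, SALESREP.SLS_REP_LNAME) AS FullName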
7 Click the Where tab to build a WHERE clause that specifies the
join and select conditions:
a Join the two tables on sales territory number.
b Select sales representatives from the ‘NJ’ and ‘NY’ sales
regions.
When you are done, the Where tab should look similar to this:
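In SQL terms, the completed WHERE clause corresponds roughly to the following (SLS_TERR_NBR is assumed here to be the territory-number column on both tables, since the join is on sales territory number):

    WHERE SALESREP.SLS_TERR_NBR = SALESTERR.SLS_TERR_NBR   -- join condition
      AND SALESTERR.SLS_REGION IN ('NJ', 'NY')             -- select condition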
Summary
In this chapter you learned how to work with Relational stages, both
as sources and as targets. You saw how to join data from two input
tables, define a computed column, and build a SQL statement to
select a subset of data for output. You also learned how to specify the
criteria necessary for updating an existing DB2 table when the
Relational stage is a target.
Next you learn how to work with external data sources and targets.
You have seen how to work with a variety of flat files and relational
databases in DataStage mainframe jobs. This chapter shows you how
to work with external data sources and targets. These are file types
that do not have built-in support within Ascential DataStage
Enterprise MVS Edition.
Before you design a job using an external source or target, you must
first write a program outside of Ascential DataStage that reads data
from the external source or writes data to the external target. You can
write the program in any language that is callable from COBOL.
Ascential DataStage calls your program from its generated COBOL
program. The call interface between the two programs consists of two
parameters:
The address of the control structure
The address of the record definition
For information on defining the call interface, see Ascential DataStage
Mainframe Job Developer’s Guide.
After you write the external program, you create a routine definition in
the DataStage Manager. The routine specifies the attributes of the
external program, including the library path, invocation method and
routine arguments, so that it can be called by Ascential DataStage.
The last step is to design the job, using an External Source stage or an
External Target stage to represent the external program.
In Exercise 20 you learn how to define and call an external source
program in a mainframe job. You create an external source routine in
the Manager and design a job using an External Source stage.
3 Click Creator and look at the fields on this page. You can
optionally enter vendor and author information here.
4 Click Arguments to define the routine arguments. The arguments
are treated as the fields of a record, which is passed to the external
source program. Load the arguments from the EXT_ORDERS
table.
When you are done, the Arguments page should look similar to
this:
6 Click Save to save the routine definition and Close to close the
Mainframe Routine dialog box.
You have finished creating the meta data for your external source
program. Now you are ready to design the job.
3 Define the Relational source stage to read data from the ORDERS
table you saved in the last exercise. Group the columns by sales
rep and order them by order date.
4 Define the External Target stage:
a Click the Routine tab on the Stage page. Notice that you can
edit the Name field here, which was not allowed in the
External Source stage. This is because Ascential DataStage
allows you to push columns from a previous stage in the job
design to an External Target stage. You can then simply enter
the routine name on this page. However, you would still need
to create a routine definition in the Manager for your job to run
successfully.
b Load the arguments from the SALESORD routine you have
already defined.
c Verify that the JCL matches what you entered in the Manager.
Summary
This chapter showed you how to work with external sources and
targets in mainframe jobs. You learned how to create a routine
definition for your external source and target programs. You designed
one job that read external purchase order data from an external
source, and another job that wrote sales order information to an
external target for analysis.
You are now familiar with all of the passive stages in mainframe jobs,
including those that provide built-in support for various file types and
those that allow you to work with external sources and targets. Next,
you start working with the active stages. You’ll see the powerful
options Ascential DataStage provides for manipulating data so that it
is efficiently organized in the data warehouse.
Now that you understand how to work with data sources and targets
in mainframe jobs, you are ready to use active stages to process the
data being moved into a data warehouse. This chapter introduces you
to Join and Lookup stages.
Join stages are used to join data from two sources. You can use the
Join stage to perform inner joins, outer joins, or full joins:
Inner joins return only the matching rows from both input tables.
Outer joins return all rows from the outer table (you designate one
of the inputs as the outer link) even if no matches are found.
Full joins return all rows that match the join condition, plus the
unmatched rows from both input tables.
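If you prefer to think in SQL, the three join types correspond roughly to the queries below. T1, T2, and KEYCOL are placeholder names, not tables from the tutorial.

    -- Inner join: matching rows only
    SELECT T1.KEYCOL, T2.KEYCOL
      FROM T1 INNER JOIN T2 ON T1.KEYCOL = T2.KEYCOL;

    -- Outer join: all rows from T1, the outer link, even without a match
    SELECT T1.KEYCOL, T2.KEYCOL
      FROM T1 LEFT OUTER JOIN T2 ON T1.KEYCOL = T2.KEYCOL;

    -- Full join: matches plus the unmatched rows from both tables
    SELECT T1.KEYCOL, T2.KEYCOL
      FROM T1 FULL OUTER JOIN T2 ON T1.KEYCOL = T2.KEYCOL;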
Lookup stages are used to look up reference information. There are
two lookup types:
A singleton lookup returns a single matching row
A cursor lookup returns all matching rows
You can also perform conditional lookups, which are based on a pre-lookup condition that must be met before the lookup occurs.
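As a rough SQL analogy only (INPUTROWS, REFTABLE, and their columns are placeholder names):

    -- Singleton lookup: at most one reference value per input row
    SELECT I.KEYCOL,
           (SELECT MIN(R.DESCRIPTION)
              FROM REFTABLE R
             WHERE R.KEYCOL = I.KEYCOL) AS DESCRIPTION
      FROM INPUTROWS I;

    -- Cursor lookup: every matching reference row is returned
    SELECT I.KEYCOL, R.DESCRIPTION
      FROM INPUTROWS I, REFTABLE R
     WHERE R.KEYCOL = I.KEYCOL;

A conditional lookup simply adds a test that must evaluate to true before either form of lookup is attempted.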
In Exercise 22 you join two data sources. You specify the join type and the join technique, define the join condition, and then map the joined data to your output link.
In Exercise 23 you look up information from a reference table. You
specify the lookup technique and the action to take if the lookup fails.
You then define the lookup condition and the output column mappings.
f Click the Mapping tab. Map all columns to the output link
using the following drag-and-drop technique: Click the title bar
of one of the input links and, without releasing the mouse
button, drag the mouse pointer to the first empty Derivation
cell on the output link. This automatically maps all of the input
link columns to the output link. Repeat this for the second input
link.
g Click OK to save your changes to the Join stage.
6 Define the Transformer stage by simply moving all the input
columns through to the output link. You might wonder if this stage
is necessary, since you already mapped data in the Join stage and
you are not performing any complex derivations. Your instincts
are correct – this stage is really not required in this job. However,
you will use it later in another exercise.
7 Define the Fixed-Width Flat File target stage:
a The filename is SLS.REPS.ORDERS.
b The DD name is REPORDER.
c Select Delete and recreate existing file as the write option.
d Click Columns to verify the column definitions being pushed
from the Join stage.
e Click Options and specify a retention period of 90 days.
8 Save the job and generate code.
You have designed a job that merges data from the SALESREP and
SALES_ORDERS input tables. The SLS.REPS.ORDERS output table contains the merged data for analysis.
Summary
This chapter took you through the process of merging data using Join
and Lookup stages. You became familiar with the types of joins and
lookups that can be performed, and you learned the differences
between the various join and lookup techniques that Ascential
DataStage provides. You also saw how to build the key expression
that determines the conditions under which a join or a lookup is
performed.
You are beginning to see the powerful capabilities that Ascential
DataStage offers for manipulating data. Next, you look at two more
active stage types that are used for aggregating and sorting data.
In this chapter you learn two more ways to process data in mainframe
jobs: sorting and aggregating. These techniques are especially useful
for data warehousing because they allow you to group and
summarize data for easier analysis.
Sort stages allow you to sort data from a single input link. You can
select multiple columns to sort by. You then specify whether to sort
them in ascending or descending order.
Aggregator stages allow you to group and summarize data from a
single input link. You can perform a variety of aggregation functions
such as count, sum, average, first, last, min, and max.
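In SQL terms, a Sort stage behaves like an ORDER BY clause and an Aggregator stage like a GROUP BY query with aggregation functions. For example (BACKORDER_ITEMS and its columns are placeholder names, not tutorial meta data):

    -- Sorting: order rows by one or more columns, ascending or descending
    SELECT PRODUCT_ID, COLOR, QTY
      FROM BACKORDER_ITEMS
     ORDER BY PRODUCT_ID ASC, COLOR ASC;

    -- Aggregating: group rows and summarize them
    SELECT PRODUCT_ID,
           COUNT(*)              AS ITEM_COUNT,
           SUM(QTY * UNIT_PRICE) AS BOOKED_REVENUE
      FROM BACKORDER_ITEMS
     GROUP BY PRODUCT_ID;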
Exercise 24 shows you how to sort data using Sort stages. You see
how to select sort columns and specify the sort order.
Exercise 25 introduces you to Aggregator stages. You learn about the
two methods of aggregating data and the different aggregation
functions that can be performed. You also see how to pre-sort your
source data as an alternative to using a Sort stage. When you use the
pre-sort function, Ascential DataStage generates an extra JCL step to
pre-sort the data prior to executing the generated COBOL program.
Exercise 26 demonstrates how to use DataStage’s ENDOFDATA
variable to perform special aggregation. You add an end-of-data row
to your source stage, then use this indicator in a Transformer stage
constraint to determine when the last row of input data has been
processed. A stage variable keeps a running total of revenue for all
products on back order, and sends the result to an output link after the
end-of-data flag is reached.
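The grand total that the stage variable accumulates row by row is, in effect, the same figure a single SQL aggregate would produce (placeholder names again); the difference is that the Transformer releases the total down its second output link only after the end-of-data row is seen.

    SELECT SUM(QTY * UNIT_PRICE) AS TOTAL_BACKORDER_REVENUE
      FROM BACKORDER_ITEMS;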
b Since the column push option is turned on, you do not need to
define column mappings on the Mapping tab. Simply click OK
to save your changes and to close the Sort Stage dialog box.
Now reopen the dialog box, click the Mapping tab, and notice
that Ascential DataStage has created the output columns and
defined the mappings for you.
5 Define the SortedItems target stage:
a The filename is SLS.SORTED.ITEMS.
b The write option is Overwrite existing file.
6 Save the job and generate code.
You have successfully designed a job that sorts the back order items
by product ID and color. The sorted information is loaded into the
SLS.SORTED.ITEMS flat file for analysis.
To aggregate data:
1 Create a new job named Exercise25.
2 Add a Fixed-Width Flat File source stage, a Transformer stage,
another Fixed-Width Flat File stage, an Aggregator stage, and a
Fixed-Width Flat File target stage to the Designer canvas. Link the
stages and rename them as shown:
variable, and create a new stage variable that calculates the total
revenue and sends it down a second output link.
To use ENDOFDATA:
1 Save the current job as Exercise26.
2 Add a Fixed-Width Flat File stage after the Transformer stage in
the job design. Link the stages and rename them as shown:
Summary
This chapter showed you how to sort and aggregate data. You
designed one job that sorted back order items and another that
summarized the number of items on back order and the total booked
revenue for each product. A third job calculated the total revenue for
all products on back order using an end-of-data indicator in the source
stage.
Now you are familiar with most of the active stages in DataStage
mainframe jobs. You understand a variety of ways to manipulate data
as it flows from source to target in a data warehousing environment.
In the next chapter, you learn how to specify more complex data
transformations using SQL business rule logic.
This chapter shows you how to use Business Rule stages to define
complex data transformations in mainframe jobs. Business Rule
stages are similar to Transformer stages in two ways:
They allow you to define stage variables.
They have a built-in editor, similar to the Expression Editor, where
you specify SQL business rule logic.
The main difference is that Business Rule stages provide access to the
control-flow features of SQL, such as conditional and looping
statements. This allows you to perform conditional mappings and
looping transformations in your jobs. You can also use SQL’s COMMIT
and ROLLBACK statements, allowing for greater transaction control in
jobs with relational databases.
Exercise 27 demonstrates how to use a Business Rule stage for
transaction control. You redesign a job from Chapter 9 that has a
Relational target stage. You add a Business Rule stage to determine
whether the updates to the target table are made successfully or not.
If so, the changes are committed. If not, the changes are rolled back
and the job is terminated.
This is where you specify the business rule logic for the stage.
This tab is divided into four panes: Templates, Business rule
editor, Operators, and Status.
To create a business rule, you can either type directly in the
Business rule editor pane or you can select items from the
Templates and Operators panes. You can also use the Build
Rule button to automatically generate the SET and INSERT
statements needed to map input columns to output columns.
You want to define a business rule that determines whether to
commit or roll back changes to the target table. You will use the
built-in variable SQLCA.SQLCODE to check the status of the
updates. This variable returns zero if data is successfully written to
an output link, or a nonzero value if there were errors. You will
include a DISPLAY statement to communicate the results, and an
EXIT statement to terminate the job in case of errors.
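In outline, the logic you are about to build looks something like the sketch below. This is illustrative only: the exact statement syntax comes from the Templates pane in the following steps, and the SET and INSERT statements that map the input columns are generated for you by Build Rule.

    -- Hypothetical outline of the business rule; exact syntax may differ.
    -- (column mappings generated by Build Rule go here)
    IF SQLCA.SQLCODE = 0 THEN
       COMMIT;                                        -- updates succeeded
       DISPLAY('Insert successful at ', CURRENT TIMESTAMP);
    ELSE
       ROLLBACK;                                      -- undo the changes
       DISPLAY('Insert failed, rolling back');
       EXIT;                                          -- terminate the job
    END IF;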
To define the business rule:
a Click Build Rule to define column mappings for the output
link. The Rule tab appears, which is similar to the Mapping
tab in other active stages:
b Use the right mouse button to select all columns on the input
link and then drag them to the output link. Click OK.
This will confirm that the insert was successful and will display
the time it was made.
The Business rule editor pane should now look similar to
this:
m Click Verify to check the expression for any syntax errors.
n Click OK to close the stage.
4 Save the job and generate code, first changing the job name to
Exercise27 in the code generation path.
Now you understand how to use a Business Rule stage to control
transactions in jobs using Relational or Teradata Relational stages.
Summary
This chapter introduced you to Business Rule stages, which are used
to perform complex transformations using SQL business rule logic.
You designed a job that determines whether to commit or roll back
changes to a relational table by checking to see if data is successfully
written to the output link.
Next you explore one more active stage that provides the means for
incorporating more advanced programming into your mainframe
jobs.
3 Click Save to save the routine definition and Close to close the
Mainframe Routine dialog box.
You have finished creating the routine meta data. Now you can call
the routine in a job.
Summary
This chapter familiarized you with calling external routines in
mainframe jobs. You specified the routine definition in the DataStage
Manager. You then used an External Routine stage in a job to calculate
the number of days between an order date and its shipment date.
At this point you know how to use most of the stage types in Ascential
DataStage Enterprise MVS Edition. The last step is to take a closer
look at the process of generating code and uploading jobs to the
mainframe.
To upload a job:
1 In the Designer, open the job named Exercise4 and choose File ➤ Upload Job. The Remote System dialog box appears:
Summary
This chapter gave you an understanding of the post-development
tasks you do after you design a mainframe job. First you modified the
JCL templates to suit your environment. Then you generated code,
which also validated your job. Finally, you defined a machine profile
and saw how to upload the job to the target machine.
This appendix contains table and column definitions for the data used
in the exercises.
The following tables contain the complete table and column
definitions for the sample data. They illustrate how the properties for
each table should appear when viewed in the Repository.
The COBOL file definitions are listed first, in alphabetical order,
followed by the DB2 DCLGen file definitions and the IMS definitions.
05 ADDRESS_TYPE No Char 2 No 2
05 ADDRESS_NAME No Char 30 No 30
05 ADDRESS_LINE1 No Char 26 No 26
05 ADDRESS_LINE2 No Char 26 No 26
05 ADDRESS_LINE3 No Char 26 No 26
05 ADDRESS_LINE4 No Char 26 No 26
05 ADDRESS_ZIP No Char 9 No 9
05 ADDRESS_CITY No Char 20 No 20
05 ADDRESS_STATE No Char 2 No 2
05 ADDRESS_COUNTRY No Char 4 No 4
05 ADDRESS_PHONE No Char 12 No 12
05 ADDRESS_LAST_UPD_DATE No Char 8 No 8
05 CUSTOMER_STATUS No Char 1 No 1
05 CUSTOMER_SINCE_YEAR No Decimal 4 No 4
05 CREDIT_RATING No Char 4 No 4
05 SIC_CODE No Char 10 No 10
05 TAX_ID No Char 10 No 10
05 ACCOUNT_TYPE No Char 1 No 1
05 ACCOUNT_CONTACT No Char 25 No 25
05 ACCOUNT_CONTACT_PHONE No Char 12 No 12
05 MISC_1 No Char 10 No 10
05 MISC_2 No Char 10 No 10
05 MISC_3 No Char 10 No 10
05 MISC_4 No Char 10 No 10
05 MISC_5 No Char 10 No 10
05 MISC_6 No Char 10 No 10
05 MISC_7 No Char 10 No 10
05 MISC_8 No Char 10 No 10
05 MISC_9 No Char 10 No 10
05 MISC_10 No Char 10 No 10
SLS_REP_LNAME No Char 15 No 15
SLS_REP_FNAME No Char 15 No 15
SLS_TERR_NBR No Char 4 No 4
STREET1 No Char 30 No 30
STREET2 No Char 30 No 30
STREET3 No Char 30 No 30
CITY No Char 20 No 20
STATE No Char 2 No 2
ZIP No Char 10 No 10
TAX_ID No Char 9 No 9
SLS_TERR_NAME No Char 10 No 10
SLS_REGION No Char 2 No 2
IMS Definitions
The following table definitions are associated with the IMS segments
contained in the sample data.
05 DLRNAME No Char 30 No
05 FILLER_2 No Char 60 No
05 MAKE No Char 10 No
05 MODEL No Char 10 No
05 YR No Char 4 No
05 MSRP No Decimal 5 No
05 FILLER_2 No Char 6 No
05 FILLER_2 No Char 43 No
05 CUSTNAME No Char 50 No
06 FIRSTNME No Char 25 No
06 LASTNME No Char 25 No
05 FILLER_3 No Char 25 No
05 SLSPERSN No Char 50 No
06 FIRSTNME No Char 25 No
06 LASTNME No Char 25 No
05 FILLER_2 No Char 50 No
05 STKVIN No Char 20 No
05 FILLER_2 No Char 20 No
05 COLOR No Char 10 No
05 PRICE No Decimal 7 No
05 LOT No Char 10 No