
ETL Specification Document

Cisco Systems, Inc.

<Measure/Subject Area/Project Name/Application Name>
ETL Specification Document

Author: <Project Team Name>

Creation Date: <date>


Last Updated: <date>
Version: <version #>
1. Document Control
1.1. Revision History

Version Number | Date | Author | Details of Change

1.2. Document Reviewers

Date | Reviewer’s Name | Reviewer’s Job Title

1.3. Document Approvers

Date | Approver’s Name | Approver’s Job Title

Cisco Systems Inc. Confidential 12/8/2021


Table of Contents
<Any change to this ETL design document should be followed up by an update to the TOC field below. To update:
right-click anywhere in the TOC field and select Update Field, then select “Entire Table” and click ok.>

1. Document Control
    1.1. Revision History
    1.2. Document Reviewers
    1.3. Document Approvers
2. High level overview
    2.1. Purpose / Overview
    2.2. ETL / Technical Architecture
3. Specification Details
    3.1. Table / View Structures
        Table Name: Table 1
        Table Name: Table 2
        Source Table Name: Table 1
        Source Table Name: Table 2
        Target Table Name: Table 3
        Target Table Name: Table 4
    3.2. Program List
        <Provide the name of the package/program/mapping>
        Subprogram 1 <Provide the name of the package/program>
        Subprogram 2 <Provide the name of the package/program>
        Source Qualifier Transformation 1 <Provide the transformation name>
        <Provide Transformation Type> Transformation 1 <Provide the transformation name>
        Workflow 1 <Provide the name of the workflow>
        Session 1 <Provide the name of the session>
        Session 2 <Provide the name of the session>
    3.3. Detailed Column Mapping Specification
    3.4. Job Configuration and Scheduling Details
        Job Name 1
        Job Name 2
4. Appendix
    4.1. Livelink URL

2. High level overview
<The goal of this template is to capture all data movement and transformation occurring within a given
measure, subject area, project, or application name. It is recommended to create and maintain a single
document to capture all data movement and transformation occurring within one of the categories mentioned
above.>

2.1. Purpose / Overview


The purpose of this document is to record and communicate the ETL design for the
<Measure/Subject Area/Project/Application Name> to IT and business users. It explains the
components of the ETL design, including the data flows, and outlines the standards and
guidelines followed during the development process.

2.2. ETL / Technical Architecture


<Include architecture information such as a High Level System Flow, or provide a link to where the
diagram resides. Also include any detailed system flows or diagrams that would help provide a
clearer picture of the data movement process.>

3. Specification Details
3.1. Table / View Structures
<Provide the list of all the tables/views that are used in this measure/subject area/project.>

Database:    Schema:    Layer:
Table Name: Table 1    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description

Database:    Schema:    Layer:
Table Name: Table 2    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description

<If identifying the table usage as Source, Target, or Staging adds clarity to the ETL design, feel free to indicate that in the Table Name field.
However, if the table is used as a combination of usage types, the more generic table specification template above is preferred;
clarify the table usage in the Program sections below.>
Database:    Schema:    Layer:
Source Table Name: Table 1    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description

Database:    Schema:    Layer:
Source Table Name: Table 2    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description

Database:    Schema:    Layer:
Target Table Name: Table 3    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description

Database:    Schema:    Layer:
Target Table Name: Table 4    Last Revise Date:    By:
Description:

Sizing | Current | 3 mo | 6 mo | 1 yr
rows |
Mb |

Column / Field | Format | Null | IN DW | Key Type (PK/FK/UK) | Description
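
As an illustration only, a completed table template could also be captured in machine-readable form. The Python sketch below models the template fields (Database, Schema, Layer, and the per-column Format, Null, IN DW, and Key Type entries) and renders a simple CREATE TABLE statement from them. The class names, the DDL rendering, and every table and column name are hypothetical; none of them are part of this template.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnSpec:
    name: str                       # "Column / Field"
    fmt: str                        # "Format", e.g. a SQL data type
    nullable: bool = True           # "Null"
    in_dw: bool = True              # "IN DW"
    key_type: Optional[str] = None  # "Key Type": PK, FK, UK, or None
    description: str = ""


@dataclass
class TableSpec:
    database: str
    schema: str
    layer: str
    table_name: str
    columns: List[ColumnSpec] = field(default_factory=list)

    def to_ddl(self) -> str:
        """Render a simple CREATE TABLE statement from the column list."""
        lines = []
        for col in self.columns:
            null_sql = "" if col.nullable else " NOT NULL"
            lines.append(f"    {col.name} {col.fmt}{null_sql}")
        pk_cols = [c.name for c in self.columns if c.key_type == "PK"]
        if pk_cols:
            lines.append(f"    PRIMARY KEY ({', '.join(pk_cols)})")
        body = ",\n".join(lines)
        return f"CREATE TABLE {self.schema}.{self.table_name} (\n{body}\n)"


# Hypothetical example values, for illustration only.
spec = TableSpec(
    database="EDW_DEV",
    schema="STG",
    layer="Staging",
    table_name="ORDER_LINES",
    columns=[
        ColumnSpec("ORDER_LINE_ID", "INTEGER", nullable=False, key_type="PK"),
        ColumnSpec("ORDER_ID", "INTEGER", nullable=False, key_type="FK"),
        ColumnSpec("AMOUNT", "NUMERIC(12,2)"),
    ],
)
ddl = spec.to_ddl()
print(ddl)
```

Keeping the specification as data makes it possible to compare the documented structure against the deployed table, although whether that is worthwhile depends on the project.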

3.2. Program List


This section lists the ETL programs for this measure/subject area/project.
<Copy and paste the following section along with at least 1 sub program section to document all programs as needed. Each program must
have at least one sub program.>

Program: <Provide the name of the package/program/mapping>
Technology Used: <Indicate the ETL technology used to develop the program. Examples of ETL technologies may include Informatica, OWB, PL/SQL, etc.>
Description: <Provide a textual description about what this package/program does. Also include any other diagrams or flow charts to help describe the ETL process.>

Program Level Filters: <Describe any table/dataset level filters>
Extraction Strategy: <Provide the types of extraction being used by this program/package, i.e. Incremental or Complete. Describe in detail if the extraction is incremental.>
Load Strategy: <Describe the types of load being used by this program/package, i.e. Insert, Update, or Delete. It may be a combination of these strategies. Is this an incremental load or a complete table rebuild?>
Error Strategy: <Describe the error strategy to be followed within the Extraction Strategy and Load Strategy.>
    Error handling within Extraction Strategy:
    Error handling within Load Strategy:
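
To make the Extraction Strategy and Error Strategy fields more concrete, here is a minimal Python sketch of one common way to do an incremental extraction: a stored watermark records the last extract point, only newer rows are read, and rows that fail conversion are routed to a reject list. This is an assumption about how a program might be designed, not the method prescribed by this template. The tables `src_orders` and `etl_watermark`, the program name `load_orders`, and the sample rows are all hypothetical, and SQLite stands in for the real source database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_orders (id INTEGER, amount TEXT, updated_at TEXT);
    CREATE TABLE etl_watermark (program TEXT PRIMARY KEY, last_extract TEXT);
    INSERT INTO src_orders VALUES
        (1, '10.50', '2021-12-01'),
        (2, 'bad',   '2021-12-05'),
        (3, '7.25',  '2021-12-07');
    INSERT INTO etl_watermark VALUES ('load_orders', '2021-12-03');
    """
)


def extract_incremental(conn, program):
    """Return rows changed since the last recorded extract for this program."""
    (last_extract,) = conn.execute(
        "SELECT last_extract FROM etl_watermark WHERE program = ?", (program,)
    ).fetchone()
    return conn.execute(
        "SELECT id, amount, updated_at FROM src_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_extract,),
    ).fetchall()


def split_valid_and_rejected(rows):
    """Error strategy: route rows that fail conversion to a reject list."""
    valid, rejected = [], []
    for row_id, amount, updated_at in rows:
        try:
            valid.append((row_id, float(amount), updated_at))
        except ValueError as exc:
            rejected.append((row_id, str(exc)))
    return valid, rejected


rows = extract_incremental(conn, "load_orders")
valid, rejected = split_valid_and_rejected(rows)

# Advance the watermark only after the batch has been processed.
if rows:
    conn.execute(
        "UPDATE etl_watermark SET last_extract = ? WHERE program = ?",
        (rows[-1][2], "load_orders"),
    )
    conn.commit()
```

A complete extraction would simply omit the watermark filter and read the whole source table on every run.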

<Copy and paste the following sections to describe the ETL process as needed. Sub program sections are applicable to documenting
procedural program design. Transformation sections have been provided for documenting ETL designs for implementation using Informatica
(the current ETL standard within the EDW environment).

NOTE: Depending on your ETL design you may have 1) all procedural sub programs, 2) all Informatica transformations, or 3) a mixture of
procedural and Informatica ETL. Use the sub program and transformation templates below according to your needs to document the
ETL design. For example, if additional fields are required, add them. If your ETL design will be fully deployed with Informatica, delete
the sub program sections. If you need to design a different type of transformation, such as an Aggregator transformation, copy the
Transformation template below and tailor the fields to the transformation being designed.>

Sub Program
Subprogram 1: <Provide the name of the package/program>
Description: <Describe what this package/program does>
Sources: <Source Table 1 Name>, <Source Table 2 Name>, <Source Table 3 Name>
Targets: <Target Table 1 Name>, <Target Table 2 Name>, <Target Table 3 Name>
Sub Program Level Filters: <Describe any table/dataset level filters>

Table Join Condition: <Describe the basic join information>
DFD Reference:
BRD Reference:

Sub Program
Subprogram 2: <Provide the name of the package/program>
Description: <Describe what this package/program does>
Sources: <Source Table 1 Name>, <Source Table 2 Name>, <Source Table 3 Name>
Targets: <Target Table 1 Name>, <Target Table 2 Name>, <Target Table 3 Name>
Sub Program Level Filters: <Describe any table/dataset level filters>
Table Join Condition: <Describe the basic join information>
DFD Reference:
BRD Reference:

<Templates for documenting ETL designs to be implemented within Informatica. Only the Source Qualifier and generic templates are provided as
examples. Feel free to create new templates for different types of Transformations.>

Transformation
Source Qualifier Transformation 1: <Provide the transformation name>
Table / Synonym / View: <Provide the table, synonym, or view which the Source Qualifier references>
Filter(s)/Condition(s): <Provide filters or join conditions which should be applied to the SQ query>
SQL Override: <Provide pseudocode or the exact SQL which should drive this transformation>
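
When filling in the SQL Override field, it can help to see how the other fields combine into a single query. The Python sketch below assembles a SELECT statement from a table name, a column list, and a list of filter predicates. It does not call any Informatica API; the function `build_sq_query` and the table, column, and filter values are hypothetical, and the inputs are assumed to be trusted identifiers written by the designer rather than user input.

```python
from typing import Sequence


def build_sq_query(table: str, columns: Sequence[str], filters: Sequence[str] = ()) -> str:
    """Assemble the SELECT a Source Qualifier would issue for one table.

    Each entry in `filters` is a complete SQL predicate; predicates are
    combined with AND.
    """
    sql = f"SELECT {', '.join(columns)}\nFROM {table}"
    if filters:
        sql += "\nWHERE " + "\n  AND ".join(filters)
    return sql


# Hypothetical table, columns, and filters.
query = build_sq_query(
    "STG.ORDER_LINES",
    ["ORDER_LINE_ID", "ORDER_ID", "AMOUNT"],
    ["AMOUNT > 0", "ORDER_ID IS NOT NULL"],
)
print(query)
```

The resulting text is what would be pasted into the SQL Override field once reviewed.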

Transformation
<Provide Transformation Type> Transformation 1: <Provide the transformation name>
Field 1: <Field name required to clearly and accurately articulate the transformation design.>
Field 2: <Field name required to clearly and accurately articulate the transformation design.>

<This section is specific to Informatica and can be used to document ETL design implemented within Workflows and Sessions which use the
Mapping (Program) of this chapter. These sections will typically be used when a generic mapping has been defined which can be used by
multiple Sessions along with special SQL filters or conditions defined in each session.>

Workflow
Workflow 1: <Provide the name of the workflow>
Description: <Provide a textual description about what this workflow does. Include the workflow diagram, which
consists of various tasks that are run concurrently or sequentially.>
Session
Session 1: <Provide the name of the session>
Targets:
Table Name | Insert | Update as Update | Update as Insert | Update else Insert | Delete | Truncate Table
<Target table 1> |
<Target table 2> |
Session Level Filters: <Describe any table/dataset level filters>

Session
Session 2: <Provide the name of the session>
Targets:
Table Name | Insert | Update as Update | Update as Insert | Update else Insert | Delete | Truncate Table
<Target table 1> |
<Target table 2> |
Session Level Filters: <Describe any table/dataset level filters>
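
The target options in the Session tables map onto familiar load patterns. As a rough sketch, assuming a single-key target table, the Python example below shows what "Update else Insert" and "Truncate Table" followed by Insert amount to in plain SQL. The table `tgt_customer` and its rows are hypothetical, SQLite stands in for the real target database, and the code does not reproduce how Informatica implements these options internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt_customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO tgt_customer VALUES (1, 'Old Name')")


def update_else_insert(conn, rows):
    """Update a row when its key already exists; otherwise insert it."""
    for row_id, name in rows:
        cur = conn.execute(
            "UPDATE tgt_customer SET name = ? WHERE id = ?", (name, row_id)
        )
        if cur.rowcount == 0:  # no existing row matched the key
            conn.execute(
                "INSERT INTO tgt_customer (id, name) VALUES (?, ?)", (row_id, name)
            )


def truncate_and_load(conn, rows):
    """Complete rebuild: clear the target, then insert every row."""
    conn.execute("DELETE FROM tgt_customer")  # SQLite has no TRUNCATE statement
    conn.executemany("INSERT INTO tgt_customer (id, name) VALUES (?, ?)", rows)


update_else_insert(conn, [(1, "New Name"), (2, "Second")])
after_upsert = conn.execute("SELECT id, name FROM tgt_customer ORDER BY id").fetchall()

truncate_and_load(conn, [(3, "Third")])
after_rebuild = conn.execute("SELECT id, name FROM tgt_customer ORDER BY id").fetchall()
```

Documenting which option applies to each target table tells the reader whether a rerun of the session is safe or would duplicate rows.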

3.3. Detailed Column Mapping Specification

<Column Mapping with respect to process (Double click on the following spreadsheet to get to the details)>

Microsoft Excel Worksheet (embedded object)
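
The contents of the embedded worksheet are not reproduced here. Purely as an illustration of what a column mapping expresses, the Python sketch below represents each mapping row as a source column, a target column, and an optional transformation, then applies the mappings to one source row. The `ColumnMapping` structure and all column names are hypothetical and are not taken from the worksheet.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional


class ColumnMapping(NamedTuple):
    source_column: str
    target_column: str
    transform: Optional[Callable[[Any], Any]] = None  # None means copy as-is


# Hypothetical mapping rows.
MAPPINGS: List[ColumnMapping] = [
    ColumnMapping("cust_nm", "CUSTOMER_NAME", str.strip),
    ColumnMapping("ord_amt", "ORDER_AMOUNT", float),
    ColumnMapping("ord_id", "ORDER_ID"),
]


def apply_mappings(source_row: Dict[str, Any], mappings: List[ColumnMapping]) -> Dict[str, Any]:
    """Build a target row by applying each mapping to the source row."""
    target_row = {}
    for m in mappings:
        value = source_row[m.source_column]
        target_row[m.target_column] = m.transform(value) if m.transform else value
    return target_row


target = apply_mappings(
    {"cust_nm": "  Acme ", "ord_amt": "19.90", "ord_id": 42}, MAPPINGS
)
print(target)
```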

3.4. Job Configuration and Scheduling Details
<This section provides details regarding how the backend jobs will be configured, which backend job will execute each ETL program within
this design, and scheduling details. This section assumes that the EDW standard job configuration (DW_JOBS tables) and job
scheduling (Dollar Universe) environments are being used. If your environment uses a different job configuration or scheduling setup, you may
want to modify this section accordingly. "Existing Job?" indicates whether the following specifications are changes within an existing production job.
"Job Change Type" further qualifies the change as a “New”, “Modified”, or “Dropped” change within the existing job. Job Change Type
is not required for designing new backend jobs.>

Job Name: Job Name 1    Existing Job?

Dollar Universe Job Specifications | DW Jobs Specifications | Scheduling Specs
Job Change Type | Session Name | UPROC Name | Dependencies | Prerequisites | CONC | Stream | ETL program | Frequency | Day/Time

Job Name: Job Name 2    Existing Job?

Dollar Universe Job Specifications | DW Jobs Specifications | Scheduling Specs
Job Change Type | Session Name | UPROC Name | Dependencies | Prerequisites | CONC | Stream | ETL program | Frequency | Day/Time
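
The Dependencies and Prerequisites columns imply a run order among jobs. As a small sketch of how such an order could be checked before the jobs are handed to the scheduler, the Python example below derives a valid order from a dependency map using the standard library. The job names are hypothetical, and the code says nothing about how Dollar Universe itself resolves dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: each key lists the jobs it depends on.
dependencies = {
    "load_staging": set(),
    "load_dimensions": {"load_staging"},
    "load_facts": {"load_staging", "load_dimensions"},
}

# static_order() yields each job only after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

If the documented dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which makes this a quick consistency check on the table above.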

4. Appendix

4.1. Livelink URL


The latest version of this document can be found on LiveLink at:
http://ework.cisco.com/Livelink/livelink.exe?func=ll&objId=2248909&objAction=Open
