XML Usage in DB2
Developers
Troy Coleman
CA Technologies
Session Code: E11
Wednesday, 10 November 2010, 13:00 - 14:00
Platform: z/OS
Agenda
XML Terminology
XML: Extensible Markup Language
Universal means of exchanging data
Expose structure and content of documents
Textual Data Format
Tag Language
<Name>
<FirstName> Troy </FirstName>
<LastName>Coleman</LastName>
</Name>
The eXtensible Markup Language (XML) is seen by some people as the universal format for sharing data between applications.
Any application running on any platform using any database or file system can process data as long as that application understands how to process XML.
The XML document is self-describing. That is, you know what each element and each repeating group of elements represents from the tags embedded within the document. Each element has a begin-tag and an end-tag. The structure or order of the data
is inherent in the hierarchical order of the tags.
XML Terminology
Well Formed Document
Well Formed:
<Name><First>Troy</First><Last>Coleman</Last></Name>
Not well formed:
<Name><First>Troy<Last>Coleman</First></Last></Name>
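The XML document is considered well formed when each element has a start-tag <> and a corresponding end-tag </>, and the elements are properly nested.
The last example is not well formed because the <First> element overlaps the <Last> element: <First> is closed before <Last>, which was opened inside it.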
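As a minimal sketch of the rule above, the two sample documents can be run through any XML parser. Python's standard library is used here purely as a neutral illustration (it is an assumption of this note, not the COBOL or DB2 parser), and the helper name is_well_formed is made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for overlapping or unclosed tags, among other errors.
        return False


good = "<Name><First>Troy</First><Last>Coleman</Last></Name>"
bad = "<Name><First>Troy<Last>Coleman</First></Last></Name>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

The parser rejects the second document as soon as it meets </First> while <Last> is still open.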
XML Terminology
Document Validation
DTD: Document Type Definition (oldest schema language)
XML Schema (XSD): XML Schema Definition
Validation Support
XML System Services in z/OS R10 supports optional XML Validation
COBOL XML PARSE validation added in V4.2
DB2 V9
No DTD support
XML Schema registered in XML Schema Repository
Must use explicit function DSN_XMLVALIDATE
Well-formedness alone does not enforce the structure of the document. The DTD language was developed to validate and enforce the document structure.
The DTD was an important step toward defining a family of documents that can be shared between producers and consumers of the data.
XML Schema is a W3C-recommended language that improves upon the limited capabilities of DTD. It governs the order of elements, supports Boolean predicates and data types that constrain the content of
elements, and provides specialized rules for uniqueness and referential integrity (RI) constraints.
XML Terminology
Parser: a program that can read an XML document and
provide programmatic access to the document.
The parser processes the markup tags in the XML document and passes structured information back to an application.
Some parsers process the XML document through the SAX specification and others process using the DOM specification.
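To make the SAX and DOM distinction concrete, here is a small sketch that pulls the same value out of the Name document both ways. It uses Python's standard xml.dom.minidom and xml.sax modules as stand-ins; the function and class names are hypothetical, and nothing here reflects how the COBOL parser is implemented.

```python
import xml.sax
from xml.dom import minidom

DOC = "<Name><FirstName>Troy</FirstName><LastName>Coleman</LastName></Name>"


def last_name_dom(text: str) -> str:
    """DOM style: the parser builds the whole tree, then we navigate it."""
    tree = minidom.parseString(text)
    node = tree.getElementsByTagName("LastName")[0]
    return node.firstChild.data


class LastNameHandler(xml.sax.ContentHandler):
    """SAX style: the parser calls back into our code for each event."""

    def __init__(self):
        super().__init__()
        self.in_last_name = False
        self.parts = []

    def startElement(self, name, attrs):
        self.in_last_name = (name == "LastName")

    def endElement(self, name):
        self.in_last_name = False

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.in_last_name:
            self.parts.append(content)


def last_name_sax(text: str) -> str:
    handler = LastNameHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return "".join(handler.parts)


print(last_name_dom(DOC))
print(last_name_sax(DOC))
```

The DOM version holds the entire document in memory, while the SAX version keeps only the state it chooses to track, which is the style the COBOL XML PARSE statement follows with its event-driven processing procedure.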
The next few slides will go into details on the benefits of using XML along with the benefits of using XML specifically on z/OS with COBOL.
Data is platform-independent
B2B Industry Standard Schema
Self Documenting Data Structure
Changing content and structure does not require program
changes
Benefits:
With XML, applications can more easily read information from a variety of
platforms. The data is platform-independent, so now the sharing of data between
you and your customers can be simplified.
B2B - Businesses are developing DTDs and schemas for their industry. The ability to parse
standardized XML documents gives business products an opportunity to be
exploited in the B2B environment.
Self-documenting data: Applications understand how to read a schema, which describes the data in the XML document.
Changes to content and structure are easier in XML. The data is tagged, so you can add and remove elements without impacting existing
elements. You will be able to change the data without having to change the application.
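A short sketch of that last point: a reader that looks elements up by tag name keeps working when the producer adds a new element. Python is used only for illustration, and the MiddleInitial element and its value are hypothetical additions invented for this example.

```python
import xml.etree.ElementTree as ET


def read_name(text: str) -> tuple:
    """Pick out only the elements this program cares about, by tag name."""
    root = ET.fromstring(text)
    return (root.findtext("FirstName"), root.findtext("LastName"))


original = "<Name><FirstName>Troy</FirstName><LastName>Coleman</LastName></Name>"

# Hypothetical later version of the document with one extra element.
extended = (
    "<Name><FirstName>Troy</FirstName><MiddleInitial>X</MiddleInitial>"
    "<LastName>Coleman</LastName></Name>"
)

print(read_name(original))
print(read_name(extended))
```

Both calls return the same pair of values, because the reader never depended on element positions or record offsets.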
Why z/OS?
No other platform can compete on enterprise-scale workloads
Reliability
Scalability
Security
Availability
Finance Industry
Transportation Industry
Health Industry
Government
Each new release of Enterprise COBOL for z/OS continues to improve on XML features as well as performance processing XML.
The latest release, V4.2, takes advantage of the new z/OS XML System Services, which provide the ability to validate a document against a schema.
XML-TEXT: set to the document fragments that are returned as alphanumeric data
To take advantage of the new XML services you will need to use the compiler option XMLPARSE(XMLSS).
A few new special registers become available when you use this option: the XML namespace and namespace prefix (XML-NAMESPACE, XML-NAMESPACE-PREFIX), along with their national-encoded counterparts (XML-NNAMESPACE, XML-NNAMESPACE-PREFIX).
XML-EVENT               XML-TEXT
START-OF-DOCUMENT
VERSION-INFORMATION     1.0
START-OF-ELEMENT        msg
ATTRIBUTE-NAME          type
ATTRIBUTE-CHARACTERS    short
CONTENT-CHARACTERS      Hello, World!
END-OF-ELEMENT          msg
END-OF-DOCUMENT
This is a simple sample of what the parser will return for each event.
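The same kind of event stream can be reproduced with any event-driven parser. The sketch below uses Python's xml.sax and labels each callback with the matching COBOL event name. The input document is reconstructed from the events listed above and is an assumption, as are the class and function names. SAX does not report the XML declaration, so there is no VERSION-INFORMATION entry in this version.

```python
import xml.sax

# Assumed input, inferred from the event list: one msg element with a
# type attribute and a short text content.
DOC = '<?xml version="1.0"?><msg type="short">Hello, World!</msg>'


class EventRecorder(xml.sax.ContentHandler):
    """Record (event, text) pairs, using COBOL-style event names."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append(("START-OF-DOCUMENT", ""))

    def startElement(self, name, attrs):
        self.events.append(("START-OF-ELEMENT", name))
        for attr_name in attrs.getNames():
            self.events.append(("ATTRIBUTE-NAME", attr_name))
            self.events.append(("ATTRIBUTE-CHARACTERS", attrs.getValue(attr_name)))

    def characters(self, content):
        self.events.append(("CONTENT-CHARACTERS", content))

    def endElement(self, name):
        self.events.append(("END-OF-ELEMENT", name))

    def endDocument(self):
        self.events.append(("END-OF-DOCUMENT", ""))


def record_events(text: str) -> list:
    handler = EventRecorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


events = record_events(DOC)
for event, text in events:
    print(f"{event:<24}{text}")
```

Each tuple corresponds to one pass through the processing procedure in COBOL, with the first item playing the role of XML-EVENT and the second the role of XML-TEXT.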
The previous example showed specific events. In general, events are triggered on a text node, element node, processing instruction, or comment.
In this example I ran into an error caused by the bracket [ used in the CDATA section. The default code page was causing the error.
I had to specify the compiler option CODEPAGE(1047) to correct the problem.
xml-document.
    05 pic x(39) value ...
    (17 entries of this form, one for each 39-byte line of the XML document)
To jump-start working with XML, I didn't want to take the time to read a file or interface with CICS or IMS.
I decided to create a document in working storage to practice with.
This is a sample of an XML document I used to learn about the different events and parsing features.
xml-text-len         computational pic 999999999.
current-element                    pic x(30).
ws-compensation      comp          pic s9(9)v99 value zero.
ws-dis-salary                      pic $$$,$$$,$$9.99.
ws-salary            comp          pic s9(9)v99 value zero.
ws-bonus             comp          pic s9(9)v99 value zero.
ws-comm              comp          pic s9(9)v99 value zero.
ws-empno                           pic x(06) value spaces.
WS-START-OF-DOC                    pic X.
    88 WS-START-DOC                VALUE 'N'.
    88 WS-NOT-START-DOC            VALUE 'Y'.
This is a simple set of working storage variables used to process the XML document and do some computations for the final output.
To keep things simple, I'm setting the start of processing at the beginning of the program.
I'm then invoking the XML PARSE statement, passing the working-storage xml-document.
Each time an XML event is triggered, control is passed to the xml-handler section.
Once XML processing is complete, the rest of the mainline is processed.
XML-Handler
xml-handler section.
    if ws-start-doc then
        compute xml-text-len = function length(xml-text)
*
        display 'doc length:' xml-text-len
    end-if.
    evaluate XML-EVENT
* ==> Order XML events most frequent first
      when 'START-OF-ELEMENT'
        display 'Start element tag: <' XML-TEXT '>'
        move XML-TEXT to current-element
      when 'CONTENT-CHARACTERS'
        display 'Content characters: <' XML-TEXT '>'
* ==> Transform XML content to operational COBOL data item...
        evaluate current-element
          when 'salary'
The xml-handler section looks for the event type and then performs some task.
In my example, at start-of-document processing I compute the length of the entire document just as a debugging aid.
I then display all the different XML events along with the data that triggered each event, as another learning aid.
I also want to do some processing with some of the elements, so I'm looking for the element names salary, bonus, and comm.
Continued on the next slide.
XML-Handler
* ==> Using function NUMVAL-C...
            compute ws-salary = function numval-c(XML-TEXT)
          when 'bonus'
            compute ws-bonus = function numval-c(XML-TEXT)
          when 'comm'
            compute ws-comm = function numval-c(XML-TEXT)
          when 'empno'
            move xml-text to ws-empno
        end-evaluate
      when 'END-OF-ELEMENT'
        display 'End element tag: <' XML-TEXT '>'
        move spaces to current-element
      when 'START-OF-DOCUMENT'
        display 'Start of document processing'
        SET WS-START-DOC TO TRUE
      when 'END-OF-DOCUMENT'
        display 'End of document.'
XML-Handler
      when 'VERSION-INFORMATION'
        display 'Version: <' XML-TEXT '>'
      when 'ENCODING-DECLARATION'
        display 'Encoding: <' XML-TEXT '>'
      when 'STANDALONE-DECLARATION'
        display 'Standalone: <' XML-TEXT '>'
      when 'ATTRIBUTE-NAME'
        display 'Attribute name: <' XML-TEXT '>'
      when 'ATTRIBUTE-CHARACTERS'
        display 'Attribute value characters: <' XML-TEXT '>'
      when 'ATTRIBUTE-CHARACTER'
        display 'Attribute value character: <' XML-TEXT '>'
      when 'START-OF-CDATA-SECTION'
        display 'Start of CData: <' XML-TEXT '>'
      when 'END-OF-CDATA-SECTION'
        display 'End of CData: <' XML-TEXT '>'
      when 'CONTENT-CHARACTER'
        display 'Content character: <' XML-TEXT '>'
XML-Handler
      when 'PROCESSING-INSTRUCTION-TARGET'
        display 'PI target: <' XML-TEXT '>'
      when 'PROCESSING-INSTRUCTION-DATA'
        display 'PI data: <' XML-TEXT '>'
      when 'COMMENT'
        display 'Comment: <' XML-TEXT '>'
      when 'EXCEPTION'
        display 'Exception' XML-CODE
        display 'text:' xml-text
        display 'len:' xml-text-len
      when other
        display 'Unexpected XML event:' XML-EVENT '.'
    end-evaluate .
End program XMLPARS1.
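For readers who do not work in COBOL, the core of the handler (remember the current element, then convert its content into a numeric item) can be sketched in Python. This is only an analogy under stated assumptions: numval_c below is a rough stand-in for the NUMVAL-C intrinsic function and not its full definition, the sample document and its currency formatting are hypothetical, and the content is buffered until the end tag because this parser may deliver text in more than one chunk.

```python
import xml.sax
from decimal import Decimal


def numval_c(text: str) -> Decimal:
    """Rough stand-in for NUMVAL-C: drop currency sign, commas, and blanks."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned)


class EmpHandler(xml.sax.ContentHandler):
    NUMERIC = ("salary", "bonus", "comm")

    def __init__(self):
        super().__init__()
        self.current = ""   # plays the role of current-element
        self.buffer = []
        self.values = {}

    def startElement(self, name, attrs):
        self.current = name
        self.buffer = []

    def characters(self, content):
        if self.current:
            self.buffer.append(content)

    def endElement(self, name):
        text = "".join(self.buffer)
        if self.current in self.NUMERIC:
            self.values[self.current] = numval_c(text)
        elif self.current == "empno":
            self.values[self.current] = text.strip()
        self.current = ""   # like "move spaces to current-element"
        self.buffer = []


def parse_emp(text: str) -> dict:
    handler = EmpHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.values


# Hypothetical document for the illustration.
DOC = (
    "<emp><empno>123456</empno><salary>$78,250.00</salary>"
    "<bonus>$500.00</bonus><comm>0.00</comm></emp>"
)

values = parse_emp(DOC)
compensation = values["salary"] + values["bonus"] + values["comm"]
print(values["empno"], compensation)
```

The running total mirrors what a ws-compensation field would hold once salary, bonus, and comm have all been converted.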
Parsed Output
XML Event
XML Text
Start of document processing
Version: <1.0>
Encoding: <ibm-1140>
Comment: < Employee HR Information >
Start element tag: <emp>
Start element tag: <empno>
Content characters: <123456>
End element tag: <empno>
Start element tag: <name>
Content characters: <
>
Start element tag: <firstnme>
Content characters: <Troy
>
End element tag: <firstnme>
Page 1
Page 2
These are pages 1 and 2 of the sample output generated by parsing the working-storage xml-document.
COBOL-Data-Item
01 EMP.
   10 EMPNO        PIC X(6)      value '123456'.
   10 NAME.
      15 FIRSTNME  pic x(12)     value 'Troy'.
      15 LASTNAME  pic x(15)     value 'Coleman'.
   10 WORKDEPT     PIC X(3)      value 'D01'.
   10 PHONENO      PIC X(4)      value '4459'.
   10 HIREDATE     PIC X(10)     value '03/30/2001'.
   10 JOB          PIC X(8)      value 'MANAGER'.
   10 EDLEVEL      PIC S9(4) USAGE COMP value 16.
   10 SEX          PIC X(1)      value 'M'.
   10 BIRTHDATE    PIC X(10)     value '01/13/1963'.
   10 SALARY       PIC 9(7)V9(2) USAGE display value 78250.00.
   10 BONUS        PIC 9(7)V9(2) USAGE display value 500.00.
   10 COMM         PIC 9(7)V9(2) USAGE display value zero.
In my sample program I'm using a static working-storage area as the source of
input. You will of course read this in from CICS, IMS, DB2, or files.
Remember that the XML tag names are taken from the COBOL variable
names.
In this case the document element will be EMP, and it will have a tag for EMPNO,
NAME, and so on through COMM.
END-XML
As you can see, the first line of text has the XML declaration, followed by the
EMP document.
The tag names are taken from the COBOL variable names.
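The idea that tag names come straight from the data names, with group items becoming nested elements, can be sketched outside COBOL as well. The Python below is an analogy only and not how XML GENERATE is implemented: the nested dictionary stands in for a subset of the EMP group item, and the function name generate is hypothetical.

```python
import xml.etree.ElementTree as ET


def generate(name: str, value) -> ET.Element:
    """Build an element named after the data item; recurse into group items."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for child_name, child_value in value.items():
            element.append(generate(child_name, child_value))
    else:
        element.text = str(value).strip()
    return element


# A subset of the EMP structure, used as stand-in data.
emp = {
    "EMPNO": "123456",
    "NAME": {"FIRSTNME": "Troy", "LASTNAME": "Coleman"},
    "WORKDEPT": "D01",
    "SALARY": "78250.00",
}

document = ET.tostring(generate("EMP", emp), encoding="unicode")
print(document)
```

Renaming a key in the dictionary changes the tag in the output, which is the same coupling between data names and tag names described above.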
Now I would like to talk about publishing XML documents from DB2.
The first XML function needed as you are building the document is
XMLDOCUMENT.
Without it you will not have a well-formed document, and an insert into an XML
column will fail.
The XMLELEMENT function is used to build an element, which includes its
name, optional namespace, attributes, and content.
In this example I've built the element CUSTOMER, which is made up of a
namespace, the attribute CID, and the element NAME.
You may find that you need to add a comment in the XML document. The
XMLCOMMENT function will insert that comment for you.
XMLCONCAT
XMLFOREST
XMLPARSE
XMLPI
XMLQUERY
XMLSERIALIZE
XMLTEXT
There are many more XML publishing functions that can be used for more
complex XML documents.
This is a list of those functions.
This is a sample SQL statement used to publish relational data into an XML
Document.
Notice I start with the XMLDOCUMENT function followed by an
XMLELEMENT function that is made up of a list of other XMLELEMENT
functions.
This is the result of the previous SQL statement using the XML publishing
functions.
You may find that you have some XML data and would like to publish this
data as a relational table to be used in join processing with other tables.
The XMLTABLE function will do this for you.
In my example on the next slide I did not use XMLNAMESPACES.
I did use $cust in the row-xquery-expression-constant; it refers to the variable
bound by the PASSING row-xquery-argument.
The COLUMNS clause looks like what you would see in a CREATE TABLE
statement for each column being defined.
XMLTABLE Statement
select cx.cust_id, cx.firstname, cx.lastname, cx.street
     , cx.city, cx.state, cx.country
  from dsn8910.customer c
     , xmltable('$cust/customer' passing info as "cust"
         columns
           cust_id   integer     path '@cid'
         , firstname varchar(10) path 'name/firstname'
         , lastname  varchar(10) path 'name/lastname'
         , street    varchar(20) path 'addr/street'
         , city      varchar(11) path 'addr/city'
         , state     varchar(10) path 'addr/state'
         , country   varchar(03) path 'addr/@country'
       ) as cx
 order by cx.lastname asc
In this SQL statement I'm taking the XML column INFO found in the
customer table and parsing it with the XMLTABLE function to return what
looks like a table. The table returned is CX. Notice I'm doing an ORDER BY on
CX.LASTNAME.
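To show what the row expression and the column paths are doing, here is a sketch of the same shredding done by hand in Python. It is an analogy and not DB2's implementation: the two INFO documents are hypothetical, every value in them is made up, and to_row is an invented helper. Each document yields one row, each path in the COLUMNS clause becomes one lookup, and the rows are then sorted by lastname.

```python
import xml.etree.ElementTree as ET

# Hypothetical INFO documents; the shape matches the paths in the SQL
# statement, but the values are invented for this illustration.
INFO_DOCS = [
    '<customer cid="1000"><name><firstname>Troy</firstname>'
    '<lastname>Coleman</lastname></name>'
    '<addr country="USA"><street>1 Main St</street><city>Anytown</city>'
    '<state>IL</state></addr></customer>',
    '<customer cid="1001"><name><firstname>Ann</firstname>'
    '<lastname>Baker</lastname></name>'
    '<addr country="USA"><street>2 Oak Ave</street><city>Anytown</city>'
    '<state>IL</state></addr></customer>',
]


def to_row(info: str) -> dict:
    """Turn one customer document into one relational-style row."""
    customer = ET.fromstring(info)          # the '$cust/customer' row node
    addr = customer.find("addr")
    return {
        "cust_id": int(customer.get("cid")),             # path '@cid'
        "firstname": customer.findtext("name/firstname"),
        "lastname": customer.findtext("name/lastname"),
        "street": customer.findtext("addr/street"),
        "city": customer.findtext("addr/city"),
        "state": customer.findtext("addr/state"),
        "country": addr.get("country"),                  # path 'addr/@country'
    }


# Equivalent of ORDER BY cx.lastname ASC.
rows = sorted((to_row(doc) for doc in INFO_DOCS), key=lambda r: r["lastname"])

for row in rows:
    print(row["cust_id"], row["lastname"], row["firstname"], row["country"])
```

Paths that start with @ read an attribute, and the others read the text of a child element, which is the same distinction the path expressions in the COLUMNS clause make.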
When I'm testing my SQL, I usually prototype the statement and verify that it
produces a valid XML document.
To do this I publish an XML document using a SELECT with the publishing
functions, and once it looks good I prefix the SELECT statement with an
INSERT statement.
In this case I have two columns: CID and INFO. The CID is set to 1000 and
INFO will contain the published document.
Using the XMLTABLE function against the row I just inserted into INFO
would produce the following relational table, which can be joined with other
tables.
DB2 now supports validation of the XML document in the engine. In this case
the CUSTSCHEMA was registered before processing this statement.
XML Validation is very CPU intensive. You will want to avoid this in a
production environment.
It's been a long-standing requirement to have the ability to update a portion of the
XML document and not take the performance hit of updating the entire
document.
The XMLMODIFY function provides the ability to update a portion of the XML
document.
DB2 10 Multi-Versioning
XML Column supporting multiple versions
Universal Table Space
DB2 10 NFM
The binary XML format is not supported by COBOL at this point in time.
The current API support is for JDBC, SQLJ, ODBC, and the UNLOAD utility.
Native SQL routines support parameters and variables with the XML data
type.
XML parameters can be used in SQL statements in the same way as variables
of any other data type.
In addition, variables with the XML data type can be passed as parameters to
XQuery expressions in XMLEXISTS, XMLQUERY and XMLTABLE
expressions.
xs:dateTime
xs:time
xs:date
xs:duration
xs:yearMonthDuration
xs:dayTimeDuration
Version 10 includes date and time support for XML data types and functions.
This includes matching index support using comparison operators on duration,
date, and time values.
Version 10 adds functionality to the CHECK DATA utility, so that you can
use this utility to verify the consistency of XML documents that are stored in a
separate XML table space.
The CHECK DATA utility verifies that all nodes in that XML document are
structurally intact and that the node ID index is consistent with the content that
is in the XML table space.
In addition, this utility verifies that all of the XML documents of an XML
column are valid against at least one XML schema that is
specified in the XML type modifier.
Troy Coleman
CA Technologies
troy.coleman@ca.com
E11
XML for z/OS COBOL Developers