
Overview of XML


WEBS_C03.qxd 22/6/07 11:19 AM Page 89

CHAPTER 3

Brief overview of XML

Learning objectives
In Chapter 2 we covered the distributed and Web-based computing roots of Web
services. In this chapter we explain how XML structures, describes, and exchanges
information. One of the appealing features of XML is that it enables diverse
applications to exchange information flexibly, which is why it serves as the building
block of Web services. All Web services technologies are based on XML and the XML
Schema Definition Language, which make possible precise machine interpretation of
data and fine-tuning of the entire process of information exchange between trading
enterprises and heterogeneous computing infrastructures.
It is important that readers have a good understanding of XML, XML namespaces,
and the W3C Schema Language so that they are in a position to understand
fundamental Web services technologies such as SOAP, WSDL, UDDI, and BPEL. Therefore,
this chapter provides a brief overview of XML to help readers understand the
material that follows in this book. It specifically covers the following topics:

◆ XML document structure.
◆ XML namespaces.
◆ Defining schemas.
◆ Reusing schemas by deriving complex type extensions and polymorphic types.
◆ Reusing schemas by importing and including schemas.
◆ Document navigation and the XML Path Language.
◆ Document transformation and the eXtensible Stylesheet Language Transformations.

For more details about XML and XML schemas, in particular, we refer interested readers
to the following books: [Walmsley 2002], [Skonnard 2002], [Valentine 2002].

3.1 XML document structure


XML is an extensible markup language used for the description and delivery of marked-up
electronic text over the Web. Two important characteristics of XML distinguish it from
other markup languages: its document type concept and its portability.
An important aspect of XML is its notion of a document type. XML documents are
regarded as having types. A document’s constituent parts and their structure formally
define its type.
Another basic design feature of XML is to ensure that documents are portable between
different computing environments. All XML documents, whatever language or writing
system they employ, use the same underlying character set. This character set
is defined by the international standard Unicode, which supports the characters of
diverse natural languages.
An XML document is composed of named containers and their contained data values.
Typically, these containers are represented as declarations, elements, and attributes. A
declaration declares the version of XML used to define the document. The technical term
used in XML for a textual unit, viewed as a structural component, is element. Element
containers may be defined to hold data, other elements, both data and other elements, or
nothing at all.
An XML document is also known as an instance or XML document instance. This
signifies the fact that an XML document instance represents one possible set of data for a
particular markup language. The example in Listing 3.1 typifies an XML document
instance. This example shows billing information associated with a purchase order issued
by a plastics manufacturer. We assume that this company has built a business based on
providing “specialty” and custom-fabricated plastics components on a spot and contract basis.

<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation>
  <Name> Right Plastic Products </Name>
  <BillingDate> 2002-09-15 </BillingDate>
  <Address>
    <Street> 158 Edward st. </Street>
    <City> Brisbane </City>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
  </Address>
</BillingInformation>

Listing 3.1 Example of an XML document instance
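
As a rough illustration of how software sees this container structure, the following Python sketch parses the document of Listing 3.1 with the standard library module xml.etree.ElementTree and records every element together with its nesting depth. The choice of Python and the helper name outline are assumptions made for this illustration; they are not part of the original example.

```python
import xml.etree.ElementTree as ET

# Listing 3.1, passed as bytes so that the parser applies the declared encoding.
LISTING_3_1 = b"""<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation>
  <Name> Right Plastic Products </Name>
  <BillingDate> 2002-09-15 </BillingDate>
  <Address>
    <Street> 158 Edward st. </Street>
    <City> Brisbane </City>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
  </Address>
</BillingInformation>
"""


def outline(element, depth=0):
    """Return (depth, tag, text) triples for an element and its descendants."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


root = ET.fromstring(LISTING_3_1)
rows = outline(root)
for depth, tag, text in rows:
    # Indent each element according to its level in the hierarchy.
    print("  " * depth + tag + (": " + text if text else ""))
```

Elements that hold only other elements, such as BillingInformation and Address, show up with empty text, while the leaf elements carry the data values.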

3.1.1 XML declaration


An XML document should begin with an XML declaration; when the declaration is
present, it must be the very first thing in the document.
The XML processing software uses the declaration to determine how to deal with the
subsequent XML content. A typical XML document begins with a prolog that
contains a declaration of conformity to version 1.0 of the XML standard and to the UTF-8
encoding standard: <?xml version="1.0" encoding="UTF-8"?>. This is shown
in Figure 3.1.

Figure 3.1 Layout of typical XML document


3.1.2 Elements
The internal structure of an XML document is roughly analogous to a hierarchical
directory or file structure. The topmost element of the XML document is a single element
known as the root element. The content of an element can be character data, other nested
elements, or a combination of both. Elements contained in other elements are referred to
as nested elements. The containing element is the parent element and the nested element
is called the child element. This is illustrated in Figure 3.1, where a Purchase Order
element is shown to contain a Customer element, which in turn contains Name and
BillingAddress and ShippingAddress elements.
The data values contained within a document are known as the content of the document.
When descriptive names have been applied to the elements and attributes that contain
the data values, the content of the document becomes intuitive and self-explanatory to a
person. This signifies the “self-describing” property of XML [Bean 2003].
Different types of elements are given different names, but XML provides no way of
expressing the meaning of a particular type of element, other than its relationship to
other element types. For instance, all one can say about an element such as <Address> in
Listing 3.2 is that instances of it may (or may not) occur within elements of type
<BillingInformation>, and that it may (or may not) be decomposed into elements of type
<Street> and <City>.

3.1.3 Attributes
Another way of putting data into an XML document is by adding attributes to start tags.
Attributes are used to better specify the content of an element on which they appear by
adding information about a defined element. An attribute specification is a name–value pair
that is associated with an element. Listing 3.2 is an example of an element that uses
an attribute, customer-type, to specify the type of a particular customer as being a manufacturer.

<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation customer-type="manufacturer">
  <Name> Right Plastic Products </Name>
  <BillingDate> 2002-09-15 </BillingDate>
  <Address>
    <Street> 158 Edward st. </Street>
    <City> Brisbane </City>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
  </Address>
</BillingInformation>

Listing 3.2 Example of attribute use for Listing 3.1

Each attribute is a name–value pair where the value must be in either single or double
quotes. Unlike elements, attributes cannot be nested. They must also always be declared in
the start tag of an element.
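
The following Python sketch, using the standard library module xml.etree.ElementTree, shows one way in which a program might read the attribute of Listing 3.2. The element body is shortened, and the attribute name region as well as the throwaway element a are hypothetical and serve only to show what happens in the other cases.

```python
import xml.etree.ElementTree as ET

# Start tag of Listing 3.2 with a shortened body.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation customer-type="manufacturer">
  <Name> Right Plastic Products </Name>
</BillingInformation>
"""

root = ET.fromstring(DOC)

# Attributes are exposed as a flat name-to-value mapping on their element.
customer_type = root.get("customer-type")
all_attributes = dict(root.attrib)

# "region" is a hypothetical attribute that is absent from the document;
# get() then returns the supplied default instead of raising an error.
missing = root.get("region", "unspecified")

# Single quotes are accepted, but an unquoted value is not well formed.
single_quoted = ET.fromstring("<a k='v'/>").get("k")
try:
    ET.fromstring("<a k=v/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(customer_type, all_attributes, missing, single_quoted, unquoted_accepted)
```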

3.2 URIs and XML namespaces


The Web is a universe of resources. A resource is defined to be anything that has identity.
Examples include documents, files, menu items, machines, and services, as well as people,
organizations, and concepts [Berners-Lee 1998]. A Web architecture starts with a uniform
syntax for resource identifiers, so that one can refer to resources, access them, describe
them, and share them. The Uniform Resource Identifier (URI) is the basis for identifying
resources on the WWW. A URI consists of a string of characters that uniquely identifies a
resource. The URI provides the capability for an element name to be unique, such that it
does not conflict with any other element names.
The W3C uses the newer and broader term URI to describe network resources rather
than the familiar but narrower term Uniform Resource Locator (URL). URI is all-inclusive,
referring to Internet resource addressing strings that use any of the present or future
addressing schemes [Berners-Lee 1998]. URIs include URLs, which use traditional
addressing schemes such as HTTP and FTP, and Uniform Resource Names (URNs).
URNs are another form of URI that provide persistence as well as location independence.
URNs address Internet resources in a location-independent manner and unlike URLs they
are stable over time.
XML allows designers to choose the names of their own tags and as a consequence it is
possible that name clashes (i.e., situations where the same tag name is used in different
contexts) occur when two or more document designers choose the same tag names for
their elements. XML namespaces provide a way to distinguish between elements that use
the same local name but are in fact different, thus avoiding name clashes. For instance, a
namespace can identify whether an address is a postal address, an e-mail address, or an
IP address. Tag names within a namespace must be unique.
To understand the need for namespaces consider the example in Listing 3.3. This listing
illustrates an example of an XML document containing address information without an
associated namespace.

<?xml version="1.0" encoding="UTF-8"?>
<Address>
  <Street> 158 Edward st. </Street>
  <City> Brisbane </City>
  <State> QLD </State>
  <PostalCode> 4000 </PostalCode>
</Address>

Listing 3.3 XML example with no associated namespace

Now, if we compare the instance of the Address markup in Listing 3.3 against the
BillingInformation markup in Listing 3.2, we observe that both markups contain
references to Address elements. In fact, the Address markup has its own schema in
XML Schema Definition Language. It is desirable that, whenever address information
is used in an XML document, the Address declaration is reused and thus validated
against the Address markup schema. This means that the Address element in Listing 3.2
should conform to the Address markup while the rest of the elements in this listing
conform to the BillingInformation markup. We achieve this in XML by means of
namespaces.
Namespaces in XML provide a facility for associating the elements and/or attributes in
all or part of a document with a particular schema. All namespace declarations have a
scope, i.e., the set of elements to which they apply. A namespace declaration is in scope for
the element on which it is declared and for all of that element’s descendants. The namespace name
and the local name of the element together form a globally unique name known as a
qualified name [Skonnard 2002]. A qualified name is often referred to as a QName and
is written as a prefix and the local name separated by a colon.
A namespace declaration is indicated by a URI denoting the namespace name. The URI
may be mapped to a prefix that may then be used in front of tag and attribute names,
separated by a colon. In order to reference a namespace, an application developer needs to first
declare one by creating a namespace declaration using the form

xmlns:<NamespacePrefix>="<someURI>"

When the prefix is attached to local names of elements and attributes, the elements and
attributes then become associated with the correct namespace. An illustrative example can
be found in Listing 3.4. As the most common URI is a URL, we use URLs as namespace
names in our example (always assuming that they are unique identifiers). The two URLs
used in this example serve as namespaces for the BillingInformation and Address
elements, respectively. These URLs are simply used for identification and scoping purposes
and it is, of course, not necessary that they point to any actual resources or documents.

<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation customer-type="manufacturer"
    xmlns="http://www.plastics_supply.com/BillInfo">
  <Name> Right Plastic Products </Name>
  <Address xmlns="http://www.plastics_supply.com/Addr">
    <Street> 158 Edward st. </Street>
    <City> Brisbane </City>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
  </Address>
  <BillingDate> 2002-09-15 </BillingDate>
</BillingInformation>

Listing 3.4 An XML example using namespaces

The xmlns declarations in Listing 3.4 are the default namespaces for their associated
element and all of its descendants. The scope of a default namespace declaration covers only the
element on which it appears and all of that element’s descendants. This means that the declaration
xmlns="http://www.plastics_supply.com/Addr" applies only to the Address element
and the elements nested within it. The declaration
xmlns="http://www.plastics_supply.com/BillInfo" applies to all elements declared
within BillingInformation but not to
Address elements as they define their own default namespace.
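
The scoping rule can be checked with a short Python sketch based on the standard library module xml.etree.ElementTree, which reports every element tag in the form {namespace-URI}local-name. The helper namespace_of is a name introduced for this illustration only.

```python
import xml.etree.ElementTree as ET

BILL_INFO = "http://www.plastics_supply.com/BillInfo"
ADDR = "http://www.plastics_supply.com/Addr"

# Listing 3.4 with its two default namespace declarations.
LISTING_3_4 = b"""<?xml version="1.0" encoding="UTF-8"?>
<BillingInformation customer-type="manufacturer"
    xmlns="http://www.plastics_supply.com/BillInfo">
  <Name> Right Plastic Products </Name>
  <Address xmlns="http://www.plastics_supply.com/Addr">
    <Street> 158 Edward st. </Street>
    <City> Brisbane </City>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
  </Address>
  <BillingDate> 2002-09-15 </BillingDate>
</BillingInformation>
"""


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:tag.index("}")] if tag.startswith("{") else ""


root = ET.fromstring(LISTING_3_4)

# Map each local element name to the namespace that is in scope for it.
namespaces = {el.tag.split("}")[-1]: namespace_of(el) for el in root.iter()}

# A default namespace applies to elements only; the unprefixed attribute
# keeps its plain name.
attribute_names = list(root.attrib)

for name, uri in namespaces.items():
    print(name, "->", uri)
```

BillingDate follows the Address element in the document, yet it falls back to the BillInfo namespace because the inner declaration ends with the Address end tag.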
Using default namespaces can get messy when elements are interleaved or when
different markup languages are used in the same document. To avoid this problem, XML
defines a shorthand notation for associating elements and attributes with namespaces.
Listing 3.5 illustrates.

<?xml version="1.0" encoding="UTF-8"?>
<bi:BillingInformation customer-type="manufacturer"
    xmlns:bi="http://www.plastics_supply.com/BillInfo"
    xmlns:addr="http://www.plastics_supply.com/Addr">
  <bi:Name> Right Plastic Products </bi:Name>
  <addr:Address>
    <addr:Street> 158 Edward st. </addr:Street>
    <addr:City> Brisbane </addr:City>
    <addr:State> QLD </addr:State>
    <addr:PostalCode> 4000 </addr:PostalCode>
  </addr:Address>
  <bi:BillingDate> 2002-09-15 </bi:BillingDate>
</bi:BillingInformation>

Listing 3.5 Using qualified names in XML

The example in Listing 3.5 illustrates the use of QNames to disambiguate and scope
XML documents. As explained earlier, a QName combines two parts: the XML
namespace, identified through its prefix, and the local name. For instance, the QName
addr:City is composed of the "http://www.plastics_supply.com/Addr" namespace and the
local name City.
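
A consequence of this composition is that the prefix itself carries no meaning; only the namespace URI bound to it does. The Python sketch below, again based on xml.etree.ElementTree, parses a shortened version of Listing 3.5 and a variant that uses the arbitrarily chosen prefixes x and y (an assumption for this illustration), and compares the resulting names.

```python
import xml.etree.ElementTree as ET

BILL_INFO = "http://www.plastics_supply.com/BillInfo"
ADDR = "http://www.plastics_supply.com/Addr"

# Shortened version of Listing 3.5.
prefixed = f"""<bi:BillingInformation xmlns:bi="{BILL_INFO}" xmlns:addr="{ADDR}">
  <addr:Address><addr:City> Brisbane </addr:City></addr:Address>
</bi:BillingInformation>"""

# Same structure with different, arbitrarily chosen prefixes.
renamed = f"""<x:BillingInformation xmlns:x="{BILL_INFO}" xmlns:y="{ADDR}">
  <y:Address><y:City> Brisbane </y:City></y:Address>
</x:BillingInformation>"""


def expanded_names(xml_text):
    """List the {namespace-URI}local-name form of every element."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


names = expanded_names(prefixed)
same_names = names == expanded_names(renamed)

# Elements can be looked up through a caller-supplied prefix-to-URI map;
# the prefix used here need not match the one used in the document.
city = ET.fromstring(renamed).find("addr:Address/addr:City", {"addr": ADDR})

print(names)
print(same_names, city.text.strip())
```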
The use of valid documents can greatly improve the quality of document processes.
Valid XML documents allow users to take advantage of content management, e-business
transactions, enterprise integration, and all other kinds of business processes that require
the exchange of meaningful and constrained XML documents.

3.3 Defining structure in XML documents


One way to define XML tags and structure is with schemas. Schemas provide much-needed
capabilities for describing XML documents. They provide support for metadata
characteristics such as structural relationships, cardinality, valid values, and data types. Each type
of schema acts as a method of describing data characteristics and applying rules and
constraints to a referencing XML document [Bean 2003]. The term schema is commonly used
in the area of databases to refer to the logical structure of a database. When the term is
used in the XML community, it refers to a document that defines the content and
structure of a class of XML documents.

3.3.1 The XML Schema Definition Language


The XML Schema Definition Language (XSD) as proposed by W3C provides a type
system for XML processing environments. XSD provides a granular method for describing
the content of an XML document and provides extensive capabilities in the areas of data
types, customization, and reuse [Bean 2003]. XSD provides a very powerful and flexible
way in which to validate XML documents. It includes facilities for declaring elements and
attributes, reusing elements from other schemas, defining complex elements,
and defining restrictions for even the simplest of data types. This gives the XML
schema developer explicit control over specifying a valid construction for an XML
document. For instance, a document definition can specify the data type of the contents of an
element, the range of values for elements, the minimum as well as maximum number of
times an element may occur, annotations to schemas, and much more.
An XML schema is made up of schema components. These are building blocks that
make up the abstract data model of the schema. Element and attribute declarations,
complex and simple type definitions, and notation declarations are all examples of schema components.
Schema components can be used to assess the validity of well-formed element and attribute
information items and furthermore may specify augmentations to those items and their
descendants.
XML schema components include the following [Valentine 2002]: data types, which
embrace simple, complex/composite, and extensible data types; element type
and attribute declarations; constraints; relationships, which express associations between
elements; and namespaces and import/include options, which support modularity by
making it possible to include reusable structures, containers, and custom data types through
externally managed XML schemas.

3.3.2 The XML schema document


Schemas are a powerful means of validating an XML document because of their ability to
specify the data types of the values stored within the XML document. Because schemas can precisely
define the types of data that are to be contained in an XML document, they allow for a
closer check on the accuracy of XML documents. Listing 3.6 illustrates an XML schema
for a sample purchase order.

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">

  <!-- Purchase Order schema -->

  <xsd:element name="PurchaseOrder" type="PO:PurchaseOrderType"/>

  <xsd:complexType name="PurchaseOrderType">
    <xsd:all>
      <xsd:element name="ShippingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="BillingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="Order" type="PO:OrderType"
                   minOccurs="1" maxOccurs="1"/>
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="Customer">
    <xsd:sequence>
      <xsd:element name="Name" minOccurs="1" maxOccurs="1">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Address" type="PO:AddressType"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:choice>
        <xsd:element name="BillingDate" type="xsd:date"/>
        <xsd:element name="ShippingDate" type="xsd:date"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
      <xsd:element name="State" type="xsd:string"/>
      <xsd:element name="PostalCode" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Product" type="PO:ProductType"
                   minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="Total">
      <xsd:simpleType>
        <xsd:restriction base="xsd:decimal">
          <xsd:fractionDigits value="2"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
  </xsd:complexType>

  <xsd:complexType name="ProductType">
    <xsd:attribute name="Name" type="xsd:string"/>
    <xsd:attribute name="Price">
      <xsd:simpleType>
        <xsd:restriction base="xsd:decimal">
          <xsd:fractionDigits value="2"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <xsd:attribute name="Quantity" type="xsd:positiveInteger"/>
  </xsd:complexType>
</xsd:schema>

Listing 3.6 A sample purchase order schema



Listing 3.6 depicts the schema of a purchase order for various items. A conforming document allows a customer
to receive the shipment of the goods at the customer’s manufacturing plant and billing
information to be sent to the customer’s headquarters. Such a document also contains
specific information about the products ordered, such as how much each product costs, how
many were ordered, and so on. The root element of an XML schema document, such as
the purchase order schema, is always the schema element. Nested within the schema
element are element and type declarations. For instance, the purchase order schema
consists of a schema element and a variety of sub-elements, most notably element, complexType,
and simpleType, that determine the appearance of elements and their content in instance
documents. These components are explained in the following sections.
The schema element binds the XML schema namespace
("http://www.w3.org/2001/XMLSchema") to the prefix xsd. This namespace is the standard
namespace defined by the XML schema specification and all XML schema elements must
belong to it. The schema element also defines the targetNamespace
attribute, which declares the XML namespace of all new types explicitly created
within this schema. In addition, the schema element binds the prefix PO to the
same namespace URI, so that the schema can refer to its own types, as in PO:Customer.
By assigning a target namespace for a schema, we indicate
that an XML document whose elements are declared as belonging to the schema’s
namespace should be validated against the XML schema. Therefore, the
target namespace can be used within document instances so that they can conform to
the purchase order schema.
As the purpose of a schema is to define a class of XML documents, the term instance
document is often used to describe an XML document that conforms to a particular schema.
Listing 3.7 illustrates an instance document conforming to the schema in Listing 3.6.
The remainder of this section is devoted to understanding the XML schema
shown in Listing 3.6.

3.3.3 Type definitions, element, and attribute declarations


The XSD differentiates between complex types, which define their content in terms of
elements that may consist of further elements and attributes, and simple types, which
define their content in terms of elements and attributes that can contain only data. The
XSD also introduces a sharp distinction between definitions that create new types (both
simple and complex) and declarations that enable elements and attributes with specific
names and types (both simple and complex) to appear in document instances. To declare
an element or attribute in a schema means to allow an element or attribute with a specified
name, type, and other features to appear in a particular context within a conforming XML
document.

3.3.3.1 Element declarations


Elements are the primary ingredients of an XML schema and can be declared using the
<xsd:element> element from the XSD. The element declaration defines the element
name, content model, and allowable attributes and data types for each element type. W3C
XML schemas provide extensive data type support, including numerous built-in and
derived data types that can be applied as constraints to any element or attribute. The
<xsd:element> element either denotes an element declaration, defining a named
element and associating that element with a type, or is a reference to such a declaration
[Skonnard 2002].

<?xml version="1.0" encoding="UTF-8"?>
<PO:PurchaseOrder
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.plastics_supply.com/PurchaseOrder
                        purchaseOrder.xsd">

  <ShippingInformation>
    <Name> Right Plastic Products Co. </Name>
    <Address>
      <Street> 459 Wickham st. </Street>
      <City> Fortitude Valley </City>
      <State> QLD </State>
      <PostalCode> 4006 </PostalCode>
    </Address>
    <ShippingDate> 2002-09-22 </ShippingDate>
  </ShippingInformation>

  <BillingInformation>
    <Name> Right Plastic Products Inc. </Name>
    <Address>
      <Street> 158 Edward st. </Street>
      <City> Brisbane </City>
      <State> QLD </State>
      <PostalCode> 4000 </PostalCode>
    </Address>
    <BillingDate> 2002-09-15 </BillingDate>
  </BillingInformation>

  <Order Total="253000.00" ItemsSold="2">
    <Product Name="Injection Molder" Price="250000.00"
             Quantity="1"/>
    <Product Name="Adjustable Worktable" Price="3000.00"
             Quantity="1"/>
  </Order>
</PO:PurchaseOrder>

Listing 3.7 An XML instance document conforming to the schema in Listing 3.6

The topmost element container in an XML document is known as the root element (of
which there is only one per XML document). Within the root element, there may be many
occurrences of other elements and groups of elements. The containing of elements by
other elements presents the concept of nesting. Each layer of nesting results in another
hierarchical level. Elements may also contain attributes. Some elements may also be
defined intentionally to remain empty.
The location at which an element is defined determines its availability within the
schema. The element declarations that appear as immediate descendants of the
<xsd:schema> element are known as global element declarations and can be referenced
from anywhere within the schema document or from other schemas. For example, the
PurchaseOrder element in Listing 3.6 is declared globally and in fact constitutes the root
element of conforming instance documents. Global element declarations describe elements that are always part
of the target namespace of the schema. Element declarations that appear as part of
complex type definitions either directly or indirectly – through a group reference – are known
as local element declarations. In Listing 3.6 local element declarations include elements
such as ShippingInformation, Name, and Product.
An element declared with element content may use compositors to aggregate existing
types into a structure and to define and constrain the behavior of child elements. A compositor
specifies the sequence and selective occurrence of the containers defined within a complex
type or group. There are three types of compositors that can be used within XML schemas.
These are sequence, choice, and all. The sequence construct requires that the
elements defined within a complex type or group appear in the corresponding XML document
in the order in which they are declared. The choice construct requires
that exactly one of a number of defined options in a complex
type or group appears in the document. Finally, the all construct specifies that each of the elements contained in a
complex type or group may appear once or not at all, and that they may appear in any order.
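
The behavior of the three compositors can be sketched in a few lines of Python. This is a deliberately simplified model and not a schema validator: it looks only at the names of the child elements, it assumes that every element under sequence and choice occurs exactly once, and the function names matches_sequence, matches_choice, and matches_all are hypothetical. The required parameter stands in for elements of an all group declared with minOccurs="1", as in Listing 3.6.

```python
def matches_sequence(children, names):
    """sequence: the listed elements appear once each, in declared order."""
    return list(children) == list(names)


def matches_choice(children, names):
    """choice: exactly one of the listed elements appears."""
    return len(children) == 1 and children[0] in names


def matches_all(children, names, required=()):
    """all: each listed element appears at most once, in any order.

    Elements named in `required` must be present (minOccurs="1").
    """
    present = set(children)
    return (
        len(present) == len(children)      # no element repeated
        and present <= set(names)          # no undeclared element
        and set(required) <= present       # mandatory elements present
    )


# Child names taken from the types in Listing 3.6.
ADDRESS = ["Street", "City", "State", "PostalCode"]
DATES = ["BillingDate", "ShippingDate"]
ORDER_PARTS = ["ShippingInformation", "BillingInformation", "Order"]

print(matches_sequence(["Street", "City", "State", "PostalCode"], ADDRESS))
print(matches_sequence(["City", "Street", "State", "PostalCode"], ADDRESS))
print(matches_choice(["BillingDate"], DATES))
print(matches_choice(["BillingDate", "ShippingDate"], DATES))
print(matches_all(["Order", "BillingInformation", "ShippingInformation"],
                  ORDER_PARTS, required=ORDER_PARTS))
```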

3.3.3.2 Attribute declarations


Attributes in an XML document are contained by elements. XML attributes cannot be
nested and do not exhibit cardinality or multiplicity. To indicate that a complex element
has an attribute, we use the <xsd:attribute> element of the XSD. For instance, from
Listing 3.6 we observe that, when declaring an attribute (such as ItemsSold), we must specify
its type. This type must be a simple type, either one of the built-in types (boolean, byte, date, dateTime,
decimal, double, duration, float, integer, language, long, short, string,
time, token, etc.) or a type derived from them. The Total attribute in the same listing shows
that an attribute may also be defined by means of a nested simpleType element.

3.3.4 Simple types


Most programming languages only allow developers to arrange the various built-in types
into a structured type of some sort, but do not allow developers to define new simple types
that have user-defined value spaces. XML Schema is different in this regard because it allows
users to define their own custom simple types, whose value spaces are subsets of the
predefined built-in types. In XML Schema, custom data types can be defined by creating a simpleType
with one of the supported data types as a base and adding constraining facets to it.
Listing 3.6 indicates that the values of the simple element Name in Customer are
restricted to only string values. Moreover, this listing specifies that the elements
Name and Address must each appear exactly once as a
child of an element of type Customer. This is indicated by the presence of the occurrence
constraint attributes minOccurs and maxOccurs, which specify that the minimum and
maximum number of times these elements may appear is set to one. By the same token
Listing 3.6 indicates that attributes such as Total and Price are restricted to
decimal values with at most two digits to the right of the decimal point.
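
To make the effect of such a restriction concrete, the following Python sketch mimics the check that the fractionDigits facet imposes on the Total and Price attributes. It is an approximation written for this overview, not the algorithm of any particular schema processor. The function name is_valid_amount is hypothetical, and the sketch assumes that the facet applies to the numeric value, so that trailing zeros in the fraction do not count.

```python
import re
from decimal import Decimal

# Lexical form of a decimal: optional sign, digits, optional fraction.
# Exponent notation such as 1e3 is not part of this form.
_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_valid_amount(lexical, fraction_digits=2):
    """Check a string against a decimal type limited to `fraction_digits`."""
    text = lexical.strip()  # surrounding whitespace is not significant
    if not _DECIMAL.fullmatch(text):
        return False
    # normalize() drops trailing zeros, so 3000.500 counts one fraction digit.
    exponent = Decimal(text).normalize().as_tuple().exponent
    return -exponent <= fraction_digits


for candidate in ["253000.00", " 3000.00 ", "3000.500", "3000.005", "1e3"]:
    print(repr(candidate), is_valid_amount(candidate))
```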

3.3.5 Complex types


The complexType element is used to define structured types. An element is considered
to be of a complex type if it contains child elements and/or attributes. Named complex type
definitions appear as children of an xsd:schema element and can be referenced from
elsewhere in the schema and from other schemas. Complex types typically contain a set of
element declarations, element references, and attribute declarations.
An example of a complex type in Listing 3.6 is PurchaseOrderType. This
particular type contains three child elements: ShippingInformation,
BillingInformation, and Order. The maxOccurs and minOccurs attributes on these
element declarations, each with a value of one, indicate that every one of the three
elements must occur exactly once within an element of type PurchaseOrderType.
An element declared with a content model can contain one or more child elements
of the specified type or types, as well as attributes. To declare an element with element
content, a schema developer must define the type of the element using the
xsd:complexType element and include within it a content model that describes all permissible
child elements, the arrangement of these elements, and rules for their occurrences. In the
XSD the elements all, choice, and sequence, or a combination of them, can
be used for this purpose. As an example, notice the use of the xsd:sequence and
xsd:choice composition elements in Listing 3.6 in the definition of the complex
type Customer. When a group of elements is declared within an xsd:sequence
element, the elements must appear
in the exact order listed. This is the case with the Name and Address elements in the
complex type Customer. When a group of elements is declared within an
xsd:choice element, exactly one
of the child elements may appear in the context of the parent element. This is
the case with the BillingDate and ShippingDate elements in the complex type
Customer.

3.4 XML schemas reuse


In enterprise-level solutions, one of the most challenging problems facing XML designers
is how to design structures that can be reused. There are many benefits to designing XML
schemas using reusable components. These benefits include shorter development
cycles, reduced application development costs, simpler maintenance of code, and
wider use of enterprise data standards.

3.4.1 Deriving complex types


XML Schema allows the derivation of a complex type from an already existing simple or
complex type. Complex types are derived from other types either by extension or by
restriction [Walmsley 2002]. Extension allows for adding additional descendants and/or
attributes to an existing (base) type. Restriction restricts the value contents of a type. The
values of the new type are a subset of those for the base type.

3.4.1.1 Complex type extensions


Complex types may be extended by adding attributes and adding to the content model but
one cannot modify or remove existing attributes. When defining a complex content extension
the XML processor handles the extensions by appending the new content model after the
base type’s content model, as if they were together in a sequence compositor construct.

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">

  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="Number" type="xsd:decimal"/>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="AustralianAddress">
    <xsd:complexContent>
      <xsd:extension base="PO:Address">
        <xsd:sequence>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:decimal"/>
          <xsd:element name="Country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>

Listing 3.8 Extending XML complex types



Listing 3.8 illustrates how to extend a complex type such as Address (which includes
number, street, and city). The derived type AustralianAddress appends the State,
PostalCode, and Country elements to the content model of Address. The City element in
the listing is optional, as indicated by the value of zero for its minOccurs attribute. The
base type Address in Listing 3.8 can also be used to create other derived types, such as
EuropeanAddress or USAddress.
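
For instance, a USAddress type could be derived in the same way. The following is only a sketch: the State and ZipCode element names and the pattern facet are assumptions made for illustration and do not appear in the listings of this chapter. The declaration is meant to sit inside the schema of Listing 3.8, where the PO prefix is bound.

```xml
<!-- Hypothetical derived type; would be placed inside the xsd:schema element of Listing 3.8 -->
<xsd:complexType name="USAddress">
  <xsd:complexContent>
    <xsd:extension base="PO:Address">
      <xsd:sequence>
        <!-- Element names below are assumed for illustration -->
        <xsd:element name="State" type="xsd:string"/>
        <xsd:element name="ZipCode">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <!-- Five digits, optionally followed by a hyphen and four more digits -->
              <xsd:pattern value="\d{5}(-\d{4})?"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
```

An instance of this type would list Number, Street, and (optionally) City first, followed by State and ZipCode.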

3.4.1.2 Complex type restrictions


Complex types may be restricted by eliminating or restricting attributes, and subsetting
content models. When restriction is used, instances of the derived type will always be
valid for the base type as well.
For instance, a developer can create an additional type, named
AustralianPostalAddress, from the AustralianAddress type that omits the
City element, as shown in Listing 3.9. If the state and postal code are included in an
Australian address it is not necessary to include the city as well.

<!-- Uses the data type declarations from Listing 3.8 -->
<xsd:complexType name="AustralianPostalAddress">
  <xsd:complexContent>
    <xsd:restriction base="PO:AustralianAddress">
      <xsd:sequence>
        <xsd:element name="Number" type="xsd:decimal"/>
        <xsd:element name="Street" type="xsd:string"/>
        <xsd:element name="City" type="xsd:string"
                     minOccurs="0" maxOccurs="0"/>
        <xsd:element name="State" type="xsd:string"/>
        <xsd:element name="PostalCode" type="xsd:decimal"/>
        <xsd:element name="Country" type="xsd:string"/>
      </xsd:sequence>
    </xsd:restriction>
  </xsd:complexContent>
</xsd:complexType>

Listing 3.9 Defining complex types by restriction

The purpose of complex content restrictions is to allow designers to restrict the content
model and/or attributes of a complex type. Listing 3.9 shows how the restriction element
achieves this purpose. In this example, the derived type AustralianPostalAddress
contains the Number, Street, State, PostalCode, and Country elements but
omits the City element. City is excluded because both its minOccurs and its
maxOccurs attributes are set to zero.
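
To make the subset relationship concrete, the fragment below sketches an address that conforms to AustralianPostalAddress. Because City is optional in the base type, the same content is also valid for AustralianAddress. The element name postalAddress and the values are assumptions chosen for illustration.

```xml
<!-- Hypothetical instance fragment; postalAddress is an assumed element name -->
<postalAddress>
  <Number>158</Number>
  <Street>Edward st.</Street>
  <!-- No City element: the restriction sets maxOccurs to zero for it -->
  <State>QLD</State>
  <PostalCode>4000</PostalCode>
  <Country>Australia</Country>
</postalAddress>
```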

3.4.1.3 Polymorphism
One of the attractive features of XML Schema is that derived types can be used
polymorphically with elements of the base type. This means that a designer can use a
derived type in an instance document in place of a base type specified in the schema.

Listing 3.10 defines a variant of the PurchaseOrder type introduced in Listing 3.6
that uses the base type Address for its billingAddress and shippingAddress
elements.

<!-- Uses the data type declarations from Listing 3.8 -->

<xsd:complexType name="PurchaseOrder">
  <xsd:sequence>
    <xsd:element name="Name" minOccurs="1" maxOccurs="1">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string"/>
      </xsd:simpleType>
    </xsd:element>
    <xsd:element name="shippingAddress" type="PO:Address"
                 minOccurs="1" maxOccurs="1"/>
    <xsd:element name="billingAddress" type="PO:Address"
                 minOccurs="1" maxOccurs="1"/>
    <xsd:choice minOccurs="1" maxOccurs="1">
      <xsd:element name="BillingDate" type="xsd:date"/>
      <xsd:element name="ShippingDate" type="xsd:date"/>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>

Listing 3.10 Defining types polymorphically

Since XML Schema supports polymorphism, an instance document can now use any
type derived from the base type Address for its billingAddress and shippingAddress
elements. Listing 3.11 illustrates a PurchaseOrder instance that uses the derived
AustralianAddress type, selected through the xsi:type attribute, for both its
shippingAddress and billingAddress elements. A further derived type such as
AustralianPostalAddress could be substituted in the same way.

3.4.2 Importing and including schemas


W3C XML schemas provide extensive capabilities in the area of cross-domain reuse.
Leveraging W3C XML schemas for cross-domain reuse implies that a schema (or
subschema) can represent a repeatable pattern that can be used in different contexts and
by different applications. The method of implementation is to define modular W3C XML
schemas as external subschemas. When combined, these modules provide the complete
framework for the document module. This approach allows developers to reuse schema
components and each other’s schemas, which reduces the complexity of developing
schemas and eases testing and maintenance.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Uses type declarations from Listing 3.10 -->

<PO:PurchaseOrder
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <Name> Plastic Products </Name>

  <shippingAddress xsi:type="PO:AustralianAddress">
    <Number> 459 </Number>
    <Street> Wickham st. </Street>
    <City> Fortitude Valley </City>
    <State> QLD </State>
    <PostalCode> 4006 </PostalCode>
    <Country> Australia </Country>
  </shippingAddress>

  <billingAddress xsi:type="PO:AustralianAddress">
    <Number> 158 </Number>
    <Street> Edward st. </Street>
    <State> QLD </State>
    <PostalCode> 4000 </PostalCode>
    <Country> Australia </Country>
  </billingAddress>
  <BillingDate> 2002-09-15 </BillingDate>
</PO:PurchaseOrder>

Listing 3.11 Using polymorphism in an XML schema instance

Combining schemas can be achieved by using the include and import elements in
the XSD. Through the use of these two elements, we can effectively “inherit” attributes
and elements from referenced schemas.

3.4.2.1 Including schemas


The include element allows for modularization of schema documents by including other
schema documents in a schema document that has the same target namespace. The
include syntax targets a specific context using a namespace to provide uniqueness. This
option is useful when a schema becomes large and difficult to manage. In this case it is
desirable to partition the schema into separate subschemas (modules), which we can
eventually combine by using the include element.
The declaration in Listing 3.12 illustrates that the Customer type has been placed
in its own schema document, which has the same target namespace as the purchase
order schema depicted in Listing 3.6. We assume that the same applies to the
ProductType type, which has also been placed in its own schema document with the same
target namespace as the purchase order schema.

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">

  <xsd:complexType name="Customer">
    <xsd:sequence>
      <xsd:element name="Name" minOccurs="1" maxOccurs="1">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Address" type="PO:AddressType"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:choice minOccurs="1" maxOccurs="1">
        <xsd:element name="BillingDate" type="xsd:date"/>
        <xsd:element name="ShippingDate" type="xsd:date"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Listing 3.12 Sample customer subschema

Now these two subschemas can be combined in the context of the purchase order schema
using the include element. This is illustrated by the two include statements at the top
of Listing 3.13.
Notice that in Listing 3.13 we do not need to specify the namespaces for the two included
schemas, as these are expected to match the namespace of the purchase order schema.
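
The product subschema itself is not listed in this chapter. A minimal sketch of what productType.xsd might contain is given below. The Name, Price, and Quantity attributes are assumptions inferred from the Product element that appears in Listing 3.16, and their types are guesses made for illustration. The point to note is that the target namespace matches that of the including schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical productType.xsd; attribute names and types are assumed -->
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">

  <!-- Same target namespace as the purchase order schema, so xsd:include applies -->
  <xsd:complexType name="ProductType">
    <xsd:attribute name="Name" type="xsd:string" use="required"/>
    <xsd:attribute name="Price" type="xsd:decimal" use="required"/>
    <xsd:attribute name="Quantity" type="xsd:positiveInteger" use="required"/>
  </xsd:complexType>
</xsd:schema>
```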

3.4.2.2 Importing schemas


The import element is used when we wish to use schema modules that belong to
different namespaces. An import element is used to instruct the XML parser that it should
refer to components from other namespaces. The import element differs from the
include element in two important ways [Walmsley 2002]. First, the include element
can only be used within the same namespace, while the import element is used across
namespaces. The second, subtler distinction is their general purpose. The purpose of the
include element is specifically to introduce other schema documents, while the purpose
of the import element is to record a dependency on another namespace, not necessarily
another schema document. The import mechanism enables designers to combine schemas
to create a larger, more complex schema. It is very useful in cases where some parts of a
schema, such as address types, are reusable and need their own namespace and schema.
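
The second distinction can be sketched as follows. In XML Schema the schemaLocation attribute of import is optional and serves only as a hint, so an import may name a namespace without pointing to any document. How the components are then located is left to the processor. The schema below is a hypothetical fragment: the Invoice namespace and the DeliveryAddress element are invented for illustration, and AddressType is assumed to be defined in the imported Address namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema; the Invoice namespace and element name are assumed -->
<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:addr="http://www.plastics_supply.com/Address"
    targetNamespace="http://www.plastics_supply.com/Invoice">

  <!-- Records the dependency on the Address namespace; no document is named -->
  <xsd:import namespace="http://www.plastics_supply.com/Address"/>

  <!-- The imported namespace is referenced through its declared prefix -->
  <xsd:element name="DeliveryAddress" type="addr:AddressType"/>
</xsd:schema>
```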

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">

  <xsd:include
      schemaLocation="http://www.plastics_supply.com/customerType.xsd"/>

  <xsd:include
      schemaLocation="http://www.plastics_supply.com/productType.xsd"/>

  <xsd:element name="PurchaseOrder" type="PO:PurchaseOrderType"/>

  <xsd:complexType name="PurchaseOrderType">
    <xsd:all>
      <xsd:element name="ShippingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="BillingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="Order" type="PO:OrderType"
                   minOccurs="1" maxOccurs="1"/>
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="AddressType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string"/>
      <xsd:element name="State" type="xsd:string"/>
      <xsd:element name="PostalCode" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Product" type="PO:ProductType"
                   minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="Total">
      <xsd:simpleType>
        <xsd:restriction base="xsd:decimal">
          <xsd:fractionDigits value="2"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
  </xsd:complexType>
</xsd:schema>

Listing 3.13 Using the include element in the purchase order schema

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:addr="http://www.plastics_supply.com/NewAddress"
    targetNamespace="http://www.plastics_supply.com/NewAddress">
  <xsd:import namespace="http://www.plastics_supply.com/Address"
      schemaLocation="addressType.xsd"/>

  <xsd:complexType name="AddressType" abstract="true">
    <xsd:sequence>
      <xsd:element name="Number" type="xsd:decimal"/>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="AustralianAddress">
    <xsd:complexContent>
      <xsd:extension base="addr:AddressType">
        <xsd:sequence>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:decimal"/>
          <xsd:element name="Country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="AustralianPostalAddress">
    <xsd:complexContent>
      <xsd:restriction base="addr:AustralianAddress">
        <xsd:sequence>
          <xsd:element name="Number" type="xsd:decimal"/>
          <xsd:element name="Street" type="xsd:string"/>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:decimal"/>
          <xsd:element name="Country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>

Listing 3.14 The address markup schema

Listing 3.14 defines a separate schema and namespace for all types related to
addresses in the purchase order example. This schema defines a complete address
markup language for purchase orders that contains all address-related types such as
AddressType, AustralianAddress, EuropeanAddress, USAddress,
AustralianPostalAddress, EuropeanPostalAddress, and so on. The
namespace attribute for the address markup schema is
"http://www.plastics_supply.com/Address", which is a distinct and separate namespace
from that of the purchase order elements.
As the purchase order example depends on the AddressType type, we shall need
to import the address markup schema into the purchase order schema as illustrated in
Listing 3.15.
Listing 3.15 shows the use of both the import and the include elements. These
appear together at the top level of the purchase order schema definition document. In
particular, it illustrates that the import statement references the namespace and location
of the schema document that contains the address markup language for purchase orders.
The imported namespace needs to be assigned a prefix before we can use it. In this case it
is assigned the prefix addr. In this way, the declaration of the Address element in the
complex type Customer is able to reference the AddressType type by using this prefix.
For reasons of brevity we included the definition of the complex type Customer as part
of the purchase order schema, instead of defining it in a separate subschema document as
we did with Listing 3.12.

3.5 Document navigation and transformation


In contrast to languages such as HTML, XML is primarily used to describe and contain
data. Although the most obvious and effective use of XML is to describe data, other
technologies such as the eXtensible Stylesheet Language Transform (XSLT) can also be
used to format or transform XML content for presentation to users. XML transactions that
are targeted for direct viewing by individuals will generally require an applied stylesheet
transformation (by means of XSLT). The XSLT process transforms an XML structure into
a presentation format such as HTML or into any other required form or structure.
XSLT makes intensive use of the XML Path Language or XPath (defined as a separate
specification at the W3C) to address and locate sections of XML documents [Gardner
2002]. XPath is a standard for creating expressions that can be used to find specific pieces
of information within an XML document.

3.5.1 The XML Path Language


The XPath data model views a document as a tree of nodes. Nodes correspond to
document components, such as elements and attributes. It is very common to think of XML
documents as trees comprising roots, branches, and leaves. This is quite natural as trees
are hierarchical in nature, just as XML documents are.

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    xmlns:addr="http://www.plastics_supply.com/Address">

  <xsd:include
      schemaLocation="http://www.plastics_supply.com/productType.xsd"/>
  <xsd:import namespace="http://www.plastics_supply.com/Address"
      schemaLocation="http://www.plastics_supply.com/addressType.xsd"/>

  <xsd:element name="PurchaseOrder" type="PO:PurchaseOrderType"/>

  <xsd:complexType name="PurchaseOrderType">
    <xsd:all>
      <xsd:element name="ShippingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="BillingInformation" type="PO:Customer"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:element name="Order" type="PO:OrderType"
                   minOccurs="1" maxOccurs="1"/>
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="Customer">
    <xsd:sequence>
      <xsd:element name="Name" minOccurs="1" maxOccurs="1">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string"/>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Address" type="addr:AddressType"
                   minOccurs="1" maxOccurs="1"/>
      <xsd:choice minOccurs="1" maxOccurs="1">
        <xsd:element name="BillingDate" type="xsd:date"/>
        <xsd:element name="ShippingDate" type="xsd:date"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="Product" type="PO:ProductType"
                   maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="Total">
      <xsd:simpleType>
        <xsd:restriction base="xsd:decimal">
          <xsd:fractionDigits value="2"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
  </xsd:complexType>
</xsd:schema>

Listing 3.15 A purchase order schema using import and include statements together

XPath uses genealogical taxonomy to describe the hierarchical makeup of an XML
document, referring to children, descendants, parents, and ancestors [Goldfarb 2001]. The
parent is the element that contains the element under discussion, while an element’s list of
ancestors includes its parent as well as the entire set of nodes preceding its parent in a
directed path leading from this element up to the root. A list of descendants includes the
children of an element, and their children in turn, in a direct path from this element all the
way down to leaf nodes. The topmost node in XPath is known as the root or document
root. The root is not an element. It is rather a logical construct that holds the entire XML
document. The root element is the single element from which all other elements in the
XML document instance are children or descendants. The root element is itself the child
of the root. The root element is also known as the document element, because it is the
first element in a document and it contains all other elements in the document.
Figure 3.2 exemplifies the previous points as it shows an abridged version of the logical
(XPath tree) structure for the instance document defined in Listing 3.7. Note that the root
element in this figure is Purchase Order. Attributes and namespaces are associated
directly with nodes (see dashed lines) and are not represented as children of an element.
The document order of nodes is based on the tree hierarchy of the XML instance. Element
nodes are ordered prior to their children (to which they are connected via solid lines),
so the first element node would be the document element, followed by its descendants.
Child nodes of a given element (as in conventional tree structures) are processed prior
to sibling nodes. Finally, attributes and namespace attachments of a given element are
ordered prior to the children of the element.
The code in Listing 3.7 provides a good baseline sample XML structure that we can use
for defining XPath examples. Listing 3.16 illustrates a sample XPath expression and the
resulting node set.

XPath Query#1: /PurchaseOrder/Order[2]/child::*

Resulting Node Set#1:
=====================
<Product Name="Adjustable Worktable" Price="3000.00"
         Quantity="1"/>

Listing 3.16 Sample XPath query and resulting node set

The XPath query in Listing 3.16 consists of three location steps, the first one being
PurchaseOrder. The second location step is Order[2], which specifies the second
Order element within the PurchaseOrder. Finally, the third location step is child::*,
which selects all child elements of the second Order element. It is important to
understand that each location step has a different context node. For the first location step
(PurchaseOrder), the current context node is the root of the XML document. The
context for the second location step (Order[2]) is the node PurchaseOrder, while the
context for the third location step is the second Order node (not shown in Figure 3.2).
More information on XPath, as well as sample XPath queries, can be found in books
such as [Gardner 2002], [Schmelzer 2002].
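
A few further expressions may help to illustrate location paths. Listing 3.7 is not reproduced here, so the small document below is a hypothetical stand-in: it uses no namespaces, and all content other than the Adjustable Worktable product taken from Listing 3.16 is invented. The expressions in the trailing comment are XPath 1.0, and the notes beside them describe the nodes they would be expected to select in this document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stand-in for the purchase order instance; values are invented -->
<PurchaseOrder>
  <Order Total="1500.00" ItemsSold="1">
    <Product Name="Injection Molder" Price="1500.00" Quantity="1"/>
  </Order>
  <Order Total="3000.00" ItemsSold="1">
    <Product Name="Adjustable Worktable" Price="3000.00" Quantity="1"/>
  </Order>
</PurchaseOrder>
<!--
  /PurchaseOrder/Order[2]/child::*      the Product element of the second Order
  /PurchaseOrder/Order[2]/Product       same node; child is the default axis
  /PurchaseOrder/Order/Product/@Name    the Name attribute of every Product
  //Product[@Price > 2000]              Products priced above 2000 (the worktable only)
  /PurchaseOrder/Order[@ItemsSold]      every Order that carries an ItemsSold attribute
-->
```

Each step narrows or shifts the context node for the step that follows it, which is why the same Product node can be reached through either of the first two expressions.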

3.5.2 Using XSLT to transform documents


To perform document transformation, a document developer usually needs to supply a
style sheet, which is written in XSLT. The style sheet specifies how the XML data will be
displayed. XSLT uses the formatting instructions in the style sheet to perform the
transformation. The converted document can be another XML document or a document in
another format, such as HTML, that can be displayed on a browser. Formatting languages,
such as XSLT, can access only the elements of a document that are defined by the
document structure, e.g., XML schema.

Figure 3.2 XPath tree model for instance document in Listing 3.7

An XSLT style sheet or script contains instructions that inform the transformation
processor of how to process a source document to produce a target document. This
makes XSLT transformations very useful for business applications. Consider, for example,
XML documents generated and used internally by an enterprise that may need to be
transformed into an equivalent format that customers or service providers of this
enterprise are more familiar with. Such a transformation makes it easier to transfer
information to and from an enterprise’s partners.
Figure 3.3 shows an example of such a transformation. This figure shows an XML
fragment which represents the billing element of a purchase order message. As shown in
this message, the source XML application uses separate elements to represent street
numbers, street addresses, states, postal codes, and countries. The target application uses
a slightly different format for postal codes: it represents them with seven characters by
combining state information with conventional four-digit postal codes.
More information on XSLT and transformations, as well as examples, can be found in
books such as [Gardner 2002], [Tennison 2001].

Figure 3.3 Using XSLT to transform business-related information
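
The kind of transformation depicted in Figure 3.3 could be sketched in XSLT 1.0 roughly as follows. The figure itself is not reproduced here, so this is only an illustration under stated assumptions: the source document is taken to have a billingAddress root element shaped like the one in Listing 3.11, and the target element names (BillingAddress, StreetAddress) are invented. The stylesheet joins State and PostalCode into a single seven-character postal code.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical stylesheet; source and target element names are assumed -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Rewrites the source billing address into the assumed target format -->
  <xsl:template match="billingAddress">
    <BillingAddress>
      <!-- Street number and street name are merged into one element -->
      <StreetAddress>
        <xsl:value-of select="concat(normalize-space(Number), ' ',
                                     normalize-space(Street))"/>
      </StreetAddress>
      <!-- State code followed directly by the four-digit postal code -->
      <PostalCode>
        <xsl:value-of select="concat(normalize-space(State),
                                     normalize-space(PostalCode))"/>
      </PostalCode>
      <Country>
        <xsl:value-of select="normalize-space(Country)"/>
      </Country>
    </BillingAddress>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the billing address of Listing 3.11, the PostalCode element of the result would carry the value QLD4000. The select attributes are XPath expressions evaluated relative to the matched billingAddress element, which shows how XSLT relies on XPath to locate the parts of the source document it transforms.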



3.6 Summary
XML is an extensible markup language used for the description and delivery of marked-up
electronic text over the Web. Important characteristics of XML are its emphasis on
descriptive rather than prescriptive (or procedural) markup, its document type concept,
its extensibility, and its portability. In XML, the instructions needed to process a
document for some specific purpose, e.g., to format it, are sharply distinguished from the
descriptive markup, which occurs within the actual XML document. With descriptive
instead of procedural markup the same document can readily be processed in many
different ways, using only those parts of it that are considered relevant.
An important aspect of XML is its notion of a document type. XML documents are
regarded as having types. Its constituent parts and their structure formally define the type
of a document. XML Schema describes the elements and attributes that may be contained
in a schema-conforming document and the ways that the elements may be arranged within
a document structure. Schemas are more powerful when validating an XML document
because of their ability to clarify data types stored within the XML document.
XML can be perceived as a dynamic trading language that enables diverse applications
to exchange information flexibly and cost-effectively. XML allows the inclusion of tags
pertinent to the contextual meaning of data. These tags make possible precise machine
interpretation of data, fine tuning the entire process of information exchange between
trading enterprises.
As XML can be used to encode complex business information it is ideally suited to
support open standards, which are essential to allow rapid establishment of business
information exchange and interoperability. For example, XML is well suited to transactional
processing in a heterogeneous, asynchronous, open, and distributed architecture that is
built upon open standard technologies, such as parsers and interfaces. This applies equally
well to enterprise application integration and to the e-business style of integration
with trading partners.
The ability of XML to model complex data structures, combined with the additional
ability of XML Schema to reuse and refine the data model of other schema architectures,
enables reuse and extension of components, reduces the development cycle, and promotes
interoperability. It is precisely the suite of technologies grouped under XML that
has had a profound influence on the development of Web services technologies and provides
the fundamental building blocks for Web services and service-oriented architectures.

Review questions
◆ What are the two important features of XML that distinguish it from other markup
languages?
◆ What are XML elements and what are XML attributes? Give examples of both.
◆ Describe URIs and XML namespaces using examples.
◆ What is the purpose of the XML Schema Definition Language?
◆ List and describe the main XML schema components.
◆ What are simple and what are complex XML types?
◆ How do you achieve reusability in XML?
◆ Give an example of a derived complex type.
◆ Define and describe the concept of polymorphism in XML.
◆ What is the purpose of the include and import elements in the XML Schema
Definition Language? How do they differ?
◆ What is the purpose of the XPath data model? Describe how it views an XML
document.
◆ How can XSLT help with document transformation?

Exercises

3.1 Define a simple purchase order schema for a hypothetical on-line grocery. Each
purchase order should contain various items. The schema should allow one
customer to receive the shipment of the goods and an entirely different individual,
e.g., spouse, to pay for the purchase. This document should include a method of
payment that allows customers to pay by credit card, direct debit, check, etc., and
should also contain specific information about the products ordered, such as how
much each product cost, how many items were ordered, and so on.
3.2 Extend the purchase order schema in the previous exercise to describe the case of
an on-line grocery that sells products to its customers by accepting only credit
cards as a payment medium. This simple order processing transaction should
contain basic customer, order, and product type information as well as different
methods of delivery, and a single method of payment, which includes fields for
credit card number, expiration date, and payment amount. Show how this purchase
order schema can import schema elements that you developed for Exercise 3.1.
3.3 Define a schema for a simple clearinghouse application that deals with credit card
processing and (PIN-based) debit card processing for its customers who are
electronic merchants. In order to process a credit card a merchant will need to have a
valid merchant account with the clearinghouse. A merchant account is a
commercial bank account established by contractual agreement between a merchant and
the clearinghouse and enables a merchant that provides shopping facilities to
accept credit card payments from its customers. A merchant account is required to
authorize transactions. A typical response for a credit card is authorized, declined,
or cancelled. When the clearinghouse processes credit card sales it returns a
transaction identifier (TransID) only when a credit card sale is authorized. If the
merchant needs to credit or void a transaction the TransID of the original credit
card sale will be required. For simplicity assume that one credit is allowed per sale
and that a credit amount cannot exceed the original sale amount. The application
should be able to process a number of payments in a single transmission as a batch
transaction.
3.4 Define a schema for a flight availability request application that requests flight
availability for a city pair on a specific date for a specific number and type of
passengers. Optional request information can include: time or time window,
connecting cities, client preferences, e.g., airlines, flight types, etc. The request can
be narrowed to request availability for a specific airline, specific flight, or specific
booking class on a specific flight.
3.5 Define a schema for handling simple requests for the reservation of rental vehicles.
The schema should assume that the customer has already decided to use a specific
rental branch. It should then define all the information that is needed when
requesting information about a vehicle rental. The schema should include information
such as rate codes, rate type, promotional descriptions, and so on, as well as rate
information that had been supplied in a previous availability response, along with
any discount number or promotional codes that may affect the rate. For instance,
the customer may have a frequent renter number that should be associated with the
reservation. Typically rates are offered as either leisure rates or corporate rates. The
schema should also define the rental period, as well as information on a distance
associated with a particular rate, e.g., limited or unlimited miles per rental period,
and customer preferences regarding the type of vehicle and special equipment that
can be included with the reservation of a rental vehicle.
3.6 Define a simple hotel availability request schema that provides the ability to search
for hotel products available for booking by specific criteria that may include: dates,
date ranges, price range, room types, regular and qualifying rates, and/or services and
amenities. A request can also be made for a non-room product, such as banquets
and meeting rooms. An availability request should be made with the intent to
ultimately book a reservation for an event or for a room stay. The schema should
allow a request for “static” property data published by the hotel that includes
information about the hotel facilities, amenities, services, etc., as well as “dynamic”
(e.g., rate-oriented) data. For example, a hotel may have an AAA rate, a corporate
rate (which it does not offer all the time), or may specify a negotiated code as a
result of a negotiated rate, which affects the availability and price of the rate.
