Overview of XML
CHAPTER 3
Learning objectives
In Chapter 2 we covered the distributed and Web-based computing roots of Web
services. In this chapter we explain how XML structures, describes, and exchanges
information. One of the appealing features of XML is that it enables diverse
applications to exchange information flexibly, which is why it serves as the building
block of Web services. All Web services technologies are based on XML and the XML
Schema Definition Language, which make possible precise machine interpretation of
data and fine-tuning of the entire process of information exchange between trading
enterprises and heterogeneous computing infrastructures.
It is important that readers have a good understanding of XML, XML namespaces,
and the W3C Schema Language so that they are in a position to understand
fundamental Web services technologies such as SOAP, WSDL, UDDI, and BPEL. Therefore,
this chapter provides a brief overview of XML to help readers understand the
material that follows in this book. It specifically covers the following topics:
For more details about XML and XML schemas, in particular, we refer interested readers
to the following books: [Walmsley 2002], [Skonnard 2002], [Valentine 2002].
WEBS_C03.qxd 22/6/07 11:19 AM Page 90
subsequent XML content. A typical XML document begins with a prolog that usually
contains a declaration of conformity to version 1.0 of the XML standard and to the UTF-8
encoding standard: <?xml version="1.0" encoding="UTF-8"?>. This is shown
in Figure 3.1.
3.1.2 Elements
The internal structure of an XML document is roughly analogous to a hierarchical
directory or file structure. The topmost element of the XML document is a single element
known as the root element. The content of an element can be character data, other nested
elements, or a combination of both. Elements contained in other elements are referred to
as nested elements. The containing element is the parent element and the nested element
is called the child element. This is illustrated in Figure 3.1, where a Purchase Order
element is shown to contain a Customer element, which in turn contains Name,
BillingAddress, and ShippingAddress elements.
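The parent and child relationships described above can be explored programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample document and its values are assumptions modeled on the description of Figure 3.1; they are not a reproduction of the figure.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order, loosely modeled on the description of Figure 3.1.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<PurchaseOrder>
  <Customer>
    <Name>Right Plastic Products Co.</Name>
    <BillingAddress>158 Edward st., Brisbane</BillingAddress>
    <ShippingAddress>459 Wickham st., Fortitude Valley</ShippingAddress>
  </Customer>
</PurchaseOrder>"""

root = ET.fromstring(DOCUMENT)      # the single root element
customer = root[0]                  # first (and only) child of the root
child_names = [child.tag for child in customer]

print(root.tag)       # PurchaseOrder
print(child_names)    # ['Name', 'BillingAddress', 'ShippingAddress']
```

Indexing an element returns its nested (child) elements in the order in which they appear in the document.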
The data values contained within a document are known as the content of the document.
When descriptive names have been applied to the elements and attributes that contain
the data values, the content of the document becomes intuitive and self-explanatory to a
person. This signifies the “self-describing” property of XML [Bean 2003].
Different types of elements are given different names, but XML provides no way of
expressing the meaning of a particular type of element, other than its relationship to
other element types. For instance, all one can say about an element such as <Address> in
Listing 3.2 is that instances of it may (or may not) occur within elements of type
<Customer>, and that it may (or may not) be decomposed into elements of type
<StreetName> and <StreetNumber>.
3.1.3 Attributes
Another way of putting data into an XML document is by adding attributes to start tags.
Attributes further qualify the element on which they appear by
adding information about it. An attribute specification is a name–value pair
that is associated with an element. Listing 3.2 is an example of an element declaration using
an attribute (shaded) to specify the type of a particular customer as being a manufacturer.
Each attribute is a name–value pair where the value must be in either single or double
quotes. Unlike elements, attributes cannot be nested. They must also always be declared in
the start tag of an element.
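As a small illustration of how an application reads attributes, the sketch below parses a single element with Python's standard xml.etree.ElementTree module. The attribute name "type" and the element content are assumptions that follow the description of Listing 3.2 rather than its exact text.

```python
import xml.etree.ElementTree as ET

# Hypothetical element; the attribute name and value follow the description
# of Listing 3.2 but are not copied from it.
customer = ET.fromstring(
    '<Customer type="manufacturer"><Name>Right Plastic Products Co.</Name></Customer>'
)

customer_type = customer.get("type")      # attribute value, or None if absent
all_attributes = dict(customer.attrib)    # attributes form a flat name-value map

print(customer_type)     # manufacturer
print(all_attributes)    # {'type': 'manufacturer'}
```

Because attributes cannot be nested, they surface as a flat mapping, whereas nested elements surface as a tree.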
referring to Internet resource addressing strings that use any of the present or future
addressing schemes [Berners-Lee 1998]. URIs include URLs, which use traditional
addressing schemes such as HTTP and FTP, and Uniform Resource Names (URNs).
URNs are another form of URI that provide persistence as well as location independence.
URNs address Internet resources in a location-independent manner and unlike URLs they
are stable over time.
XML allows designers to choose the names of their own tags and as a consequence it is
possible that name clashes (i.e., situations where the same tag name is used in different
contexts) occur when two or more document designers choose the same tag names for
their elements. XML namespaces provide a way to distinguish between elements that use
the same local name but are in fact different, thus avoiding name clashes. For instance, a
namespace can identify whether an address is a postal address, an e-mail address, or an
IP address. Tag names within a namespace must be unique.
To understand the need for namespaces consider the example in Listing 3.3. This listing
illustrates an example of an XML document containing address information without an
associated namespace.
Now, if we compare the instance of the Address markup in Listing 3.3 against the
BillingInformation markup in Listing 3.2, we observe that both markups contain
references to Address elements. In fact, the Address markup has its own schema in
XML Schema Definition Language. Ideally, whenever address information is used in an
XML document, the Address declaration is reused and is thus validated
against the Address markup schema. This means that the Address element in Listing 3.2
should conform to the Address markup while the rest of the elements in this listing
conform to the BillingInformation markup. We achieve this in XML by means of
namespaces.
Namespaces in XML provide a facility for associating the elements and/or attributes in
all or part of a document with a particular schema. All namespace declarations have a
scope, i.e., the set of elements to which they apply. A namespace declaration is in scope for
the element on which it is declared and for all of that element’s descendants. The namespace name
and the local name of the element together form a globally unique name known as a
qualified name [Skonnard 2002]. A qualified name is often referred to as a QName and
is written as a prefix and the local name separated by a colon.
A namespace declaration is indicated by a URI denoting the namespace name. The URI
may be mapped to a prefix that may then be used in front of tag and attribute names,
separated by a colon. In order to reference a namespace, an application developer needs to first
declare one by creating a namespace declaration using the form xmlns:prefix="namespace-URI".
When the prefix is attached to local names of elements and attributes, the elements and
attributes then become associated with the correct namespace. An illustrative example can
be found in Listing 3.4. As the most common URI is a URL, we use URLs as namespace
names in our example (always assuming that they are unique identifiers). The two URLs
used in this example serve as namespaces for the BillingInformation and Address
elements, respectively. These URLs are simply used for identification and scoping purposes
and it is, of course, not necessary that they point to any actual resources or documents.
The xmlns declarations in Listing 3.4 define the default namespaces for their associated
elements and all of their contents. The scope of a default namespace declaration covers only the
element on which it appears and all of its descendants. This means that the declaration
xmlns="http://www.plastics_supply.com/Addr" applies only to Address and the elements nested
within it. The declaration xmlns="http://www.plastics_supply.com/BillInfo"
applies to all elements declared within BillingInformation but not to
Address elements, as they define their own default namespace.
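The scoping of default namespaces can be observed with a namespace-aware parser. The following sketch uses Python's standard xml.etree.ElementTree module, which reports every tag together with the namespace that is in scope for it. The document is an assumption: only the two namespace URIs are taken from the text, while the element names and values are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the two namespace URIs come from the text.
DOCUMENT = """
<BillingInformation xmlns="http://www.plastics_supply.com/BillInfo">
  <Name>Right Plastic Products Co.</Name>
  <Address xmlns="http://www.plastics_supply.com/Addr">
    <City>Brisbane</City>
  </Address>
</BillingInformation>"""

root = ET.fromstring(DOCUMENT)

# ElementTree reports each tag as "{namespace-URI}local-name".
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

Name inherits the BillInfo namespace from its parent, whereas Address and City fall under the Addr namespace because the inner declaration overrides the outer one for that subtree.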
Using default namespaces can get messy when elements are interleaved or when
different markup languages are used in the same document. To avoid this problem, XML
defines a shorthand notation for associating elements and attributes with namespaces.
Listing 3.5 illustrates.
The example in Listing 3.5 illustrates the use of QNames to disambiguate and scope
XML documents. As explained earlier, QNames comprise two parts: the XML
namespace and the local name. For instance, the QName of an element like City is
composed of the "http://www.plastics_supply.com/Addr" namespace and the
local name City.
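The two-part nature of a QName can be seen when a prefixed document is queried. The sketch below again relies on Python's standard xml.etree.ElementTree module. The prefixes bi and addr, the lookup prefix a, and the element content are arbitrary assumptions; only the namespace URIs come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the prefixes "bi" and "addr" are arbitrary choices.
DOCUMENT = """
<bi:BillingInformation
    xmlns:bi="http://www.plastics_supply.com/BillInfo"
    xmlns:addr="http://www.plastics_supply.com/Addr">
  <bi:Name>Right Plastic Products Co.</bi:Name>
  <addr:Address>
    <addr:City>Brisbane</addr:City>
  </addr:Address>
</bi:BillingInformation>"""

root = ET.fromstring(DOCUMENT)

# Lookup prefixes need not match the prefixes used inside the document,
# because only the URI identifies a namespace.
NAMESPACES = {"a": "http://www.plastics_supply.com/Addr"}

city = root.find("a:Address/a:City", NAMESPACES)
print(city.tag)     # {http://www.plastics_supply.com/Addr}City
print(city.text)    # Brisbane
```

The prefix is only a local shorthand: the query succeeds with a different prefix because the comparison is made on the namespace URI and the local name.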
The use of valid documents can greatly improve the quality of document processes.
Valid XML documents allow users to take advantage of content management, e-business
transactions, enterprise integration, and all other kinds of business processes that require
the exchange of meaningful and constrained XML documents.
way in which to validate XML documents. It includes facilities for declaring elements and
attributes, reusing elements from other schemas, defining complex element definitions,
and for defining restrictions for even the simplest of data types. This gives the XML
schema developer explicit control over specifying a valid construction for an XML
document. For instance, a document definition can specify the data type of the contents of an
element, the range of values for elements, the minimum as well as maximum number of
times an element may occur, annotations to schemas, and much more.
An XML schema is made up of schema components. These are building blocks that
make up the abstract data model of the schema. Element and attribute declarations,
complex and simple type definitions, and notations are all examples of schema components.
Schema components can be used to assess the validity of well-formed element and attribute
information items and furthermore may specify augmentations to those items and their
descendants.
XML schema components include the following [Valentine 2002]: data types which
embrace both simple and complex/composite and extensible data types; element type
and attribute declarations; constraints; relationships which express associations between
elements; and namespaces and import/include options to support modularity as they
make it possible to include reusable structures, containers, and custom data types through
externally managed XML schemas.
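<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
targetNamespace="http://www.plastics_supply.com/PurchaseOrder">
<xsd:complexType name="PurchaseOrderType">
<xsd:all>
<xsd:element name="ShippingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="BillingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="Order" type="PO:OrderType" minOccurs="1"
maxOccurs="1"/>
</xsd:all>
</xsd:complexType>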
<xsd:complexType name="Customer">
<xsd:sequence>
<xsd:element name="Name" minOccurs="1" maxOccurs="1">
<xsd:simpleType>
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
</xsd:element>
<xsd:element name="Address" type="PO:AddressType"
minOccurs= "1" maxOccurs="1"/>
<xsd:choice>
<xsd:element name="BillingDate" type="xsd:date"/>
<xsd:element name="ShippingDate" type="xsd:date"/>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="AddressType">
<xsd:sequence>
<xsd:element name="Street" type="xsd:string"/>
<xsd:element name="City" type="xsd:string"/>
<xsd:element name="State" type="xsd:string"/>
<xsd:element name="PostalCode" type="xsd:decimal"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="OrderType">
<xsd:sequence>
<xsd:element name="Product" type="PO:ProductType"
minOccurs= "1" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="Total">
<xsd:simpleType>
<xsd:restriction base="xsd:decimal">
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
</xsd:complexType>
<xsd:complexType name="ProductType">
<xsd:attribute name="Name" type="xsd:string"/>
<xsd:attribute name="Price">
<xsd:simpleType>
<xsd:restriction base="xsd:decimal">
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="Quantity" type="xsd:positiveInteger"/>
</xsd:complexType>
</xsd:schema>
Listing 3.6 depicts a purchase order for various items. This document allows a customer
to receive the shipment of the goods at the customer’s manufacturing plant and billing
information to be sent to the customer’s headquarters. This document also contains
specific information about the products ordered, such as how much each product cost, how
many were ordered, and so on. The root element of an XML schema document, such as
the purchase order schema, is always the schema element. Nested within the schema
element are element and type declarations. For instance, the purchase order schema
consists of a schema element and a variety of sub-elements, most notably element, complexType,
and simpleType, that determine the appearance of elements and their content in instance
documents. These components are explained in the following sections.
The schema element binds the XML schema namespace
("http://www.w3.org/2001/XMLSchema") to the prefix xsd. This is the standard
namespace defined by the XML schema specification and all XML schema elements must
belong to this namespace. The schema element also defines the targetNamespace
attribute, which declares the XML namespace of all new types explicitly created
within this schema. The schema element is also shown to assign the prefix PO to the
target namespace. By assigning a target namespace for a schema, we indicate
that an XML document whose elements are declared as belonging to the schema’s
namespace should be validated against the XML schema. Therefore, the PO
targetNamespace can be used within document instances so that they can conform to
the purchase order schema.
As the purpose of a schema is to define a class of XML documents, the term instance
document is often used to describe an XML document that conforms to a particular schema.
Listing 3.7 illustrates an instance document conforming to the schema in Listing 3.6.
The remainder of this section is devoted to understanding the XML schema
shown in Listing 3.6.
<PO:PurchaseOrder
xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.plastics_supply.com/PurchaseOrder
purchaseOrder.xsd">
<ShippingInformation>
<Name> Right Plastic Products Co. </Name>
<Address>
<Street> 459 Wickham st. </Street>
<City> Fortitude Valley </City>
<State> QLD </State>
<PostalCode> 4006 </PostalCode>
</Address>
<ShippingDate> 2002-09-22 </ShippingDate>
</ShippingInformation>
<BillingInformation>
<Name> Right Plastic Products Inc. </Name>
<Address>
<Street> 158 Edward st. </Street>
<City> Brisbane </City>
<State> QLD </State>
<PostalCode> 4000 </PostalCode>
</Address>
<BillingDate> 2002-09-15 </BillingDate>
</BillingInformation>
Listing 3.7 An XML instance document conforming to the schema in Listing 3.6
derived data types that can be applied as constraints to any element or attribute. The
<xsd:element> element either denotes an element declaration, defining a named
element and associating that element with a type, or is a reference to such a declaration
[Skonnard 2002].
The topmost element container in an XML document is known as the root element (of
which there is only one per XML document). Within the root element, there may be many
occurrences of other elements and groups of elements. The containing of elements by
other elements presents the concept of nesting. Each layer of nesting results in another
hierarchical level. Elements may also contain attributes. Some elements may also be
defined intentionally to remain empty.
The location at which an element is defined determines its availability within the
schema. The element declarations that appear as immediate descendants of the
<xsd:schema> element are known as global element declarations and can be referenced
from anywhere within the schema document or from other schemas. For example, the
PurchaseOrderType in Listing 3.6 is defined globally and in fact constitutes the root
element in this schema. Global element declarations describe elements that are always part
of the target namespace of the schema. Element declarations that appear as part of
complex type definitions either directly or indirectly – through a group reference – are known
as local element declarations. In Listing 3.6 local element declarations include elements
such as ShippingInformation and Product.
An element with element content may use compositors to aggregate existing
types into a structure and to define and constrain the behavior of child elements. A compositor
specifies the sequence and selective occurrence of the containers defined within a complex
type or group. There are three types of compositors that can be used within XML schemas.
These are sequence, choice, and all. The sequence construct requires that the
elements in an instance document appear in the order in which they are defined within the
complex type or group (the content model). The choice construct requires
that the document designer choose exactly one of the options defined in a
complex type or group. Finally, the all construct specifies that each of the elements contained in a
complex type or group may appear once or not at all, and that they may appear in any order.
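The difference between the three compositors can be made concrete with a deliberately simplified checker. The sketch below is written in plain Python. It is not a schema validator, and the function name conforms and its parameters are hypothetical. It assumes that every declared element occurs exactly once under sequence and choice, and at most once under all, which mirrors the description above rather than the full occurrence rules of XML Schema.

```python
from typing import List

def conforms(compositor: str, declared: List[str], children: List[str]) -> bool:
    """Check child element names against a simplified compositor.

    Simplifying assumptions: minOccurs=maxOccurs=1 for "sequence" and
    "choice"; minOccurs=0 and maxOccurs=1 for "all".
    """
    if compositor == "sequence":
        return children == declared              # same elements, same order
    if compositor == "choice":
        return len(children) == 1 and children[0] in declared
    if compositor == "all":
        no_repeats = len(set(children)) == len(children)
        return no_repeats and set(children) <= set(declared)
    raise ValueError(f"unknown compositor: {compositor}")

address = ["Street", "City", "State", "PostalCode"]
dates = ["BillingDate", "ShippingDate"]

print(conforms("sequence", address, ["Street", "City", "State", "PostalCode"]))  # True
print(conforms("sequence", address, ["City", "Street", "State", "PostalCode"]))  # False
print(conforms("choice", dates, ["ShippingDate"]))                               # True
print(conforms("choice", dates, ["BillingDate", "ShippingDate"]))                # False
print(conforms("all", address, ["State", "Street"]))                             # True
```

A real schema processor additionally honors minOccurs and maxOccurs on each declaration and on the compositor itself.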
<xsd:schema
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
targetNamespace="http://www.plastics_supply.com/PurchaseOrder">
<xsd:complexType name="Address">
<xsd:sequence>
<xsd:element name="Number" type="xsd:decimal"/>
<xsd:element name="Street" type="xsd:string"/>
<xsd:element name="City" type="xsd:string"
minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="AustralianAddress">
<xsd:complexContent>
<xsd:extension base="PO:Address">
<xsd:sequence>
<xsd:element name="State"
type="xsd:string"/>
<xsd:element name="PostalCode"
type="xsd:decimal"/>
<xsd:element name="Country"
type="xsd:string"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
</xsd:schema>
Listing 3.8 illustrates how to extend a complex type such as Address (which includes
number, street, and city). The City element in the listing is optional and this is indicated by
the value of zero for the attribute minOccurs. The base type Address in Listing 3.8 can be
used to create other derived types, such as EuropeanAddress or USAddress as well.
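To see what an extension contributes, the schema of Listing 3.8 can be inspected as an ordinary XML document. The sketch below uses Python's standard xml.etree.ElementTree module to compute the content model of a derived type: the elements of the base type followed by the elements added by the extension. The helper element_names is hypothetical, and resolving the base type by stripping the PO: prefix is a shortcut that only works because all types here share one target namespace.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# The schema from Listing 3.8, re-indented.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="Number" type="xsd:decimal"/>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="City" type="xsd:string" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="AustralianAddress">
    <xsd:complexContent>
      <xsd:extension base="PO:Address">
        <xsd:sequence>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:decimal"/>
          <xsd:element name="Country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)
types = {t.get("name"): t for t in schema.findall(XSD + "complexType")}

def element_names(type_name):
    """Return the child element names of a type, base-type elements first."""
    definition = types[type_name]
    extension = definition.find(f"{XSD}complexContent/{XSD}extension")
    if extension is None:
        return [e.get("name") for e in definition.findall(f"{XSD}sequence/{XSD}element")]
    base = extension.get("base").split(":")[-1]    # shortcut: drop the "PO:" prefix
    own = [e.get("name") for e in extension.findall(f"{XSD}sequence/{XSD}element")]
    return element_names(base) + own

print(element_names("AustralianAddress"))
# ['Number', 'Street', 'City', 'State', 'PostalCode', 'Country']
```

The derived type therefore carries six elements, of which only City is optional.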
<!-- Uses the data type declarations from Listing 3.8 -->
<xsd:complexType name="AustralianPostalAddress">
<xsd:complexContent>
<xsd:restriction base="PO:AustralianAddress">
<xsd:sequence>
<xsd:element name="Number" type="xsd:decimal"/>
<xsd:element name="Street" type="xsd:string"/>
<xsd:element name="City" type="xsd:string"
minOccurs="0" maxOccurs="0"/>
<xsd:element name="State" type="xsd:string"/>
<xsd:element name="PostalCode" type="xsd:decimal"/>
<xsd:element name="Country" type="xsd:string"/>
</xsd:sequence>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
The purpose of the complex content restrictions is to allow designers to restrict the content
model and/or attributes of a complex type. Listing 3.9 shows how the restriction element
achieves this purpose. In this example, the derived type AustralianPostalAddress
contains the Number, Street, State, PostalCode, and Country elements but
omits the City element. It is omitted as the value of both attributes minOccurs and
maxOccurs is set to zero.
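The effect of the restriction can be checked by reading the occurrence attributes directly. The following sketch uses Python's standard xml.etree.ElementTree module. It wraps the type definition from Listing 3.9 in a schema element, an addition made here only so that the fragment can be parsed on its own, and then separates the permitted elements from the prohibited ones.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# The restriction from Listing 3.9, wrapped in a schema element so that the
# fragment can be parsed on its own.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">
  <xsd:complexType name="AustralianPostalAddress">
    <xsd:complexContent>
      <xsd:restriction base="PO:AustralianAddress">
        <xsd:sequence>
          <xsd:element name="Number" type="xsd:decimal"/>
          <xsd:element name="Street" type="xsd:string"/>
          <xsd:element name="City" type="xsd:string"
              minOccurs="0" maxOccurs="0"/>
          <xsd:element name="State" type="xsd:string"/>
          <xsd:element name="PostalCode" type="xsd:decimal"/>
          <xsd:element name="Country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)
declarations = schema.findall(
    f"{XSD}complexType/{XSD}complexContent/{XSD}restriction/{XSD}sequence/{XSD}element"
)

# An element whose maxOccurs is "0" can never appear in an instance document.
permitted = [d.get("name") for d in declarations if d.get("maxOccurs") != "0"]
prohibited = [d.get("name") for d in declarations if d.get("maxOccurs") == "0"]

print(permitted)     # ['Number', 'Street', 'State', 'PostalCode', 'Country']
print(prohibited)    # ['City']
```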
3.4.1.3 Polymorphism
One of the attractive features of XML Schema is that derived types can be used
polymorphically with elements of the base type. This means that a designer can use a derived
type in an instance document in place of a base type specified in the schema.
Listing 3.10 defines a variant of the PurchaseOrder type introduced in Listing 3.6
to use the base type Address for its billingAddress and shippingAddress
elements.
<!-- Uses the data type declarations from Listing 3.8 -->
<xsd:complexType name="PurchaseOrder">
<xsd:sequence>
<xsd:element name="Name" minOccurs="1" maxOccurs="1">
<xsd:simpleType>
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
</xsd:element>
<xsd:element name="shippingAddress" type="PO:Address"
minOccurs= "1" maxOccurs="1"/>
<xsd:element name="billingAddress" type="PO:Address"
minOccurs= "1" maxOccurs="1"/>
<xsd:choice minOccurs="1" maxOccurs="1">
<xsd:element name="BillingDate" type="xsd:date"/>
<xsd:element name="ShippingDate" type="xsd:date"/>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
Since XML Schema supports polymorphism, an instance document can now use any
type derived from base type Address for its billingAddress and shippingAddress
elements. Listing 3.11 illustrates a PurchaseOrder instance that uses the derived
AustralianAddress type for its billingAddress element and the derived
AustralianPostalAddress type for its shippingAddress element.
<billingAddress xsi:type="PO:AustralianAddress">
<Number> 158 </Number>
<Street> Edward st. </Street>
<State> QLD </State>
<PostalCode> 4000 </PostalCode>
<Country> Australia </Country>
</billingAddress>
<BillingDate> 2002-09-15 </BillingDate>
</PO:PurchaseOrder>
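An application that processes such an instance can discover which derived type was substituted by reading the xsi:type attribute. The sketch below uses Python's standard xml.etree.ElementTree module on a fragment modeled on Listing 3.11. The enclosing root element and its namespace declarations are assumptions added here so that the fragment is well-formed on its own.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical fragment modeled on Listing 3.11.
DOCUMENT = """
<PO:PurchaseOrder
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <billingAddress xsi:type="PO:AustralianAddress">
    <Number>158</Number>
    <Street>Edward st.</Street>
    <State>QLD</State>
    <PostalCode>4000</PostalCode>
    <Country>Australia</Country>
  </billingAddress>
</PO:PurchaseOrder>"""

root = ET.fromstring(DOCUMENT)
billing = root.find("billingAddress")

# The attribute value is a QName naming the derived type used by this element.
declared_type = billing.get(XSI_TYPE)
print(declared_type)    # PO:AustralianAddress
```

ElementTree only exposes the attribute; checking that the named type is really derived from Address is the job of a schema validator.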
Combining schemas can be achieved by using the include and import elements in
the XSD. Through the use of these two elements, we can effectively “inherit” attributes
and elements from referenced schemas.
<xsd:complexType name="Customer">
<xsd:sequence>
<xsd:element name="Name" minOccurs="1" maxOccurs="1">
<xsd:simpleType>
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
</xsd:element>
<xsd:element name="Address" type="PO:AddressType"
minOccurs= "1" maxOccurs="1"/>
<xsd:choice minOccurs="1" maxOccurs="1">
<xsd:element name="BillingDate" type="xsd:date"/>
<xsd:element name="ShippingDate" type="xsd:date"/>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
Now these two subschemas can be combined in the context of the purchase order schema
using the include element. This is illustrated in Listing 3.13, where the two include
statements are shaded.
Notice that in Listing 3.13 we do not need to specify the namespaces for the two included
schemas, as these are expected to match the namespace of the purchase order schema.
<xsd:include
schemaLocation="http://www.plastics_supply.com/customerType.xsd"/>
<xsd:include
schemaLocation="http://www.plastics_supply.com/productType.xsd"/>
<xsd:complexType name="PurchaseOrderType">
<xsd:all>
<xsd:element name="ShippingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="BillingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="Order" type="PO:OrderType" minOccurs="1"
maxOccurs="1"/>
</xsd:all>
</xsd:complexType>
<xsd:complexType name="AddressType">
<xsd:sequence>
<xsd:element name="Street" type="xsd:string"/>
<xsd:element name="City" type="xsd:string"/>
<xsd:element name="State" type="xsd:string"/>
<xsd:element name="PostalCode" type="xsd:decimal"/>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="OrderType">
<xsd:sequence>
<xsd:element name="Product" type="PO:ProductType"
minOccurs= "1" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="Total">
<xsd:simpleType>
<xsd:restriction base="xsd:decimal">
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
</xsd:complexType>
</xsd:schema>
Listing 3.13 Using the include element in the purchase order schema
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:addr="http://www.plastics_supply.com/NewAddress"
targetNamespace="http://www.plastics_supply.com/NewAddress">
<xsd:import namespace="http://www.plastics_supply.com/Address"
schemaLocation="addressType.xsd"/>
<xsd:complexType name="AustralianAddress">
<xsd:complexContent>
<xsd:extension base="addr:AddressType">
<xsd:sequence>
<xsd:element name="State" type="xsd:string"/>
<xsd:element name="PostalCode" type="xsd:decimal"/>
<xsd:element name="Country" type="xsd:string"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
<xsd:complexType name="AustralianPostalAddress">
<xsd:complexContent>
<xsd:restriction base="addr:AustralianAddress">
<xsd:sequence>
<xsd:element name="Number" type="xsd:decimal"/>
<xsd:element name="Street" type="xsd:string"/>
<xsd:element name="State" type="xsd:string"/>
<xsd:element name="PostalCode" type="xsd:decimal"/>
<xsd:element name="Country" type="xsd:string"/>
</xsd:sequence>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:schema>
Listing 3.14 defines a separate schema and namespace for all types related to
addresses in the purchase order example. This schema defines a complete address
markup language for purchase orders that contains all address-related elements such as
the AddressType, AustralianAddress, EuropeanAddress, USAddress,
AustralianPostalAddress, EuropeanPostalAddress, and so on. The
<xsd:complexType name="PurchaseOrderType">
<xsd:all>
<xsd:element name="ShippingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="BillingInformation" type="PO:Customer"
minOccurs="1" maxOccurs="1"/>
<xsd:element name="Order" type="PO:OrderType" minOccurs="1"
maxOccurs="1"/>
</xsd:all>
</xsd:complexType>
<xsd:complexType name="Customer">
<xsd:sequence>
<xsd:element name="Name" minOccurs="1" maxOccurs="1">
<xsd:simpleType>
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>
</xsd:element>
<xsd:element name="Address" type="addr:AddressType"
minOccurs= "1" maxOccurs="1"/>
<xsd:choice minOccurs="1" maxOccurs="1">
<xsd:element name="BillingDate" type="xsd:date"/>
<xsd:element name="ShippingDate" type="xsd:date"/>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="OrderType">
<xsd:sequence>
<xsd:element name="Product" type="PO:ProductType"
maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="Total">
<xsd:simpleType>
<xsd:restriction base="xsd:decimal">
<xsd:fractionDigits value="2"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="ItemsSold" type="xsd:positiveInteger"/>
</xsd:complexType>
</xsd:schema>
Listing 3.15 A purchase order schema using import and include statements together
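The practical difference between include and import shows up in the attributes each statement carries. The sketch below uses Python's standard xml.etree.ElementTree module to list both kinds of reference in a schema header. The header itself is an assumption: its locations and namespace echo Listings 3.13 and 3.14, but the way they are combined here is hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema header; the locations echo Listings 3.13 and 3.14.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:PO="http://www.plastics_supply.com/PurchaseOrder"
    targetNamespace="http://www.plastics_supply.com/PurchaseOrder">
  <xsd:include schemaLocation="http://www.plastics_supply.com/productType.xsd"/>
  <xsd:import namespace="http://www.plastics_supply.com/Address"
      schemaLocation="addressType.xsd"/>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# include: same target namespace as the including schema, so only a location.
included = [i.get("schemaLocation") for i in schema.findall(XSD + "include")]

# import: components come from a different namespace, which is named explicitly.
imported = [(i.get("namespace"), i.get("schemaLocation"))
            for i in schema.findall(XSD + "import")]

print(included)    # ['http://www.plastics_supply.com/productType.xsd']
print(imported)    # [('http://www.plastics_supply.com/Address', 'addressType.xsd')]
```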
rather a logical construct that holds the entire XML document. The root element is the
single element from which all other elements in the XML document instance are children
or descendants. The root element is itself the child of the root. The root element is
also known as the document element, because it is the first element in a document and it
contains all other elements in the document.
Figure 3.2 exemplifies the previous points as it shows an abridged version of the logical
(XPath tree) structure for the instance document defined in Listing 3.7. Note that the root
element in this figure is Purchase Order. Attributes and namespaces are associated
directly with nodes (see dashed lines) and are not represented as children of an element.
The document order of nodes is based on the tree hierarchy of the XML instance. Element
nodes are ordered prior to their children (to which they are connected via solid lines),
so the first element node would be the document element, followed by its descendants.
Children nodes of a given element (as in conventional tree structures) are processed prior
to sibling nodes. Finally, attributes and namespace attachments of a given element are
ordered prior to the children of the element.
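Document order for element nodes can be reproduced with a depth-first traversal. The sketch below uses Python's standard xml.etree.ElementTree module. The document is an abridged, hypothetical instance loosely based on Listing 3.7, and ElementTree exposes attributes through a separate mapping rather than as nodes in this traversal.

```python
import xml.etree.ElementTree as ET

# Abridged, hypothetical instance loosely based on Listing 3.7.
DOCUMENT = """
<PurchaseOrder>
  <ShippingInformation>
    <Name>Right Plastic Products Co.</Name>
    <Address><City>Fortitude Valley</City></Address>
  </ShippingInformation>
  <BillingInformation>
    <Name>Right Plastic Products Inc.</Name>
  </BillingInformation>
</PurchaseOrder>"""

root = ET.fromstring(DOCUMENT)

# iter() walks the tree depth-first: each element, then its descendants,
# then its following siblings.
order = [element.tag for element in root.iter()]
print(order)
# ['PurchaseOrder', 'ShippingInformation', 'Name', 'Address', 'City',
#  'BillingInformation', 'Name']
```

City is visited before BillingInformation because the descendants of ShippingInformation are processed before its sibling.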
The code in Listing 3.7 provides a good baseline sample XML structure that we can use
for defining XPath examples. Listing 3.16 illustrates a sample XPath expression and the
resulting node set.
The XPath query in Listing 3.16 consists of three location steps, the first one being
PurchaseOrder. The second location step is Order[2], which specifies the second
Order element within the PurchaseOrder. Finally, the third location step is child::*,
which selects all child elements of the second Order element. It is important to
understand that each location step has a different context node. For the first location step
(PurchaseOrder), the current context node is the root of the XML document. The
context for the second location step (Order[2]) is the node PurchaseOrder, while the
context for the third location step is the second Order node (not shown in Figure 3.2).
More information on XPath as well as sample XPath queries can be found in books
such as [Gardner 2002], [Schmelzer 2002].
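The three location steps discussed above can be tried out with the limited XPath support in Python's standard xml.etree.ElementTree module. This is a sketch under several assumptions: the document and its product names are hypothetical, ElementTree has no child:: axis, so the equivalent abbreviation * is used, and the query is evaluated relative to the PurchaseOrder element instead of the document root.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two Order elements; product names are invented.
DOCUMENT = """
<PurchaseOrder>
  <Order>
    <Product Name="Injection Molder"/>
  </Order>
  <Order>
    <Product Name="Adjustable Worktable"/>
    <Product Name="Extruder"/>
  </Order>
</PurchaseOrder>"""

root = ET.fromstring(DOCUMENT)    # context node: the PurchaseOrder element

# Step 1 (PurchaseOrder) is already the context; steps 2 and 3 follow.
selected = root.findall("Order[2]/*")
names = [product.get("Name") for product in selected]
print(names)    # ['Adjustable Worktable', 'Extruder']
```

The predicate [2] is evaluated with PurchaseOrder as the context node, and the final step is evaluated with the second Order element as the context node.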
Figure 3.2 XPath tree model for instance document in Listing 3.7
displayed. XSLT uses the formatting instructions in the style sheet to perform the
transformation. The converted document can be another XML document or a document in another
format, such as HTML, that can be displayed on a browser. Formatting languages, such as
XSLT, can access only the elements of a document that are defined by the document
structure, e.g., XML schema.
An XSLT style sheet or script contains instructions that inform the transformation
processor of how to process a source document to produce a target document. This
makes XSLT transformations very useful for business applications. Consider, for example,
XML documents generated and used internally by an enterprise that may need to be
transformed into an equivalent format that customers or service providers of this
enterprise are more familiar with. Such transformations make it easier to transfer information to and from an
enterprise’s partners.
Figure 3.3 shows an example of such a transformation. This figure shows an XML
fragment which represents the billing element of a purchase order message. As shown in this
message the source XML application uses separate elements to represent street numbers,
street addresses, states, postal codes, and countries. The target application is shown to use
a slightly different format to represent postal codes as it uses seven characters to represent
postal codes by combining state information with conventional four-digit postal codes.
More information on XSLT and transformations as well as examples can be found in
books such as [Gardner 2002], [Tennison 2001].
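The restructuring described for Figure 3.3 can also be expressed directly in a general-purpose language. The sketch below is not an XSLT style sheet: it performs an equivalent transformation in plain Python with the standard xml.etree.ElementTree module, purely to make the mapping between source and target formats explicit. The element names follow the description of Figure 3.3, while the values and the function name transform are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment; element names follow the description of
# Figure 3.3, the values are illustrative.
SOURCE = """
<BillingAddress>
  <Number>158</Number>
  <Street>Edward st.</Street>
  <State>QLD</State>
  <PostalCode>4000</PostalCode>
  <Country>Australia</Country>
</BillingAddress>"""

def transform(source: ET.Element) -> ET.Element:
    """Copy the address, merging State and PostalCode into one element."""
    target = ET.Element(source.tag)
    for child in source:
        if child.tag == "State":
            continue                              # folded into PostalCode
        copy = ET.SubElement(target, child.tag)
        copy.text = child.text
        if child.tag == "PostalCode":
            copy.text = source.findtext("State") + child.text
    return target

result = transform(ET.fromstring(SOURCE))
print(result.findtext("PostalCode"))        # QLD4000
print([child.tag for child in result])
# ['Number', 'Street', 'PostalCode', 'Country']
```

An XSLT style sheet would state the same mapping declaratively, as template rules matched against the source tree.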
3.6 Summary
XML is an extensible markup language used for the description and delivery of marked-up
electronic text over the Web. Important characteristics of XML are its emphasis on
descriptive rather than prescriptive (or procedural) markup, its document type concept,
its extensibility, and its portability. In XML, the instructions needed to process a
document for some specific purpose, e.g., to format it, are sharply distinguished from the
descriptive markup, which occurs within the actual XML document. With descriptive
instead of procedural markup the same document can readily be processed in many
different ways, using only those parts of it that are considered relevant.
An important aspect of XML is its notion of a document type. XML documents are
regarded as having types. Its constituent parts and their structure formally define the type
of a document. XML Schema describes the elements and attributes that may be contained
in a schema-conforming document and the ways that the elements may be arranged within
a document structure. Schemas are more powerful when validating an XML document
because of their ability to clarify data types stored within the XML document.
XML can be perceived as a dynamic trading language that enables diverse applications
to exchange information flexibly and cost-effectively. XML allows the inclusion of tags
pertinent to the contextual meaning of data. These tags make possible precise machine
interpretation of data, fine tuning the entire process of information exchange between
trading enterprises. In addition the ability of the XML schemas to reuse and refine the data
model of other schema architectures enables reuse and extension of components, reduces
the development cycle, and promotes interoperability.
As XML can be used to encode complex business information it is ideally suited to
support open standards, which are essential to allow rapid establishment of business
information exchange and interoperability. For example, XML is well suited to transactional
processing in a heterogeneous, asynchronous, open, and distributed architecture that is
built upon open standard technologies, such as parsers and interfaces. This applies equally
to enterprise application integration and to the e-business style of integration
with trading partners.
The ability of XML to model complex data structures, combined with the additional
ability of XML Schema to reuse and refine the data model of other schema architectures,
enables reuse and extension of components, reduces the development cycle, and promotes
interoperability. It is precisely the suite of technologies grouped under XML that has
had a profound influence on the development of Web services technologies and provides
the fundamental building blocks for Web services and service-oriented architectures.
Review questions
◆ What are the two important features of XML that distinguish it from other markup
languages?
◆ What are XML elements and what are XML attributes? Give examples of both.
Exercises
3.1 Define a simple purchase order schema for a hypothetical on-line grocery. Each
purchase order should contain various items. The schema should allow one
customer to receive the shipment of the goods and an entirely different individual,
e.g., spouse, to pay for the purchase. This document should include a method of
payment that allows customers to pay by credit card, direct debit, check, etc., and
should also contain specific information about the products ordered, such as how
much each product costs, how many items were ordered, and so on.
3.2 Extend the purchase order schema in the previous exercise to describe the case of
an on-line grocery that sells products to its customers by accepting only credit
cards as a payment medium. This simple order processing transaction should
contain basic customer, order, and product type information as well as different
methods of delivery, and a single method of payment, which includes fields for
credit card number, expiration date, and payment amount. Show how this purchase
order schema can import schema elements that you developed for Exercise 3.1.
3.3 Define a schema for a simple clearinghouse application that deals with credit card
processing and (PIN-based) debit card processing for its customers who are
electronic merchants. To process a credit card, a merchant will need to have a
valid merchant account with the clearinghouse. A merchant account is a
commercial bank account established by contractual agreement between a merchant and
the clearinghouse and enables a merchant that provides shopping facilities to
accept credit card payments from its customers. A merchant account is required to
authorize transactions. A typical response for a credit card is authorized, declined,
or cancelled. When the clearinghouse processes credit card sales it returns a
transaction identifier (TransID) only when a credit card sale is authorized. If the
merchant needs to credit or void a transaction the TransID of the original credit
card sale will be required. For simplicity assume that one credit is allowed per sale
and that a credit amount cannot exceed the original sale amount. The application
should be able to process a number of payments in a single transmission as a batch
transaction.
3.4 Define a schema for a flight availability request application that requests flight
availability for a city pair on a specific date for a specific number and type of
passengers. Optional request information can include: time or time window,
connecting cities, client preferences, e.g., airlines, flight types, etc. The request can
be narrowed to request availability for a specific airline, specific flight, or specific
booking class on a specific flight.
3.5 Define a schema for handling simple requests for the reservation of rental vehicles.
The schema should assume that the customer has already decided to use a specific
rental branch. It should then define all the information that is needed when
requesting information about a vehicle rental. The schema should include information
such as rate codes, rate type, promotional descriptions, and so on, as well as rate
information that had been supplied in a previous availability response, along with
any discount number or promotional codes that may affect the rate. For instance,
the customer may have a frequent renter number that should be associated with the
reservation. Typically rates are offered as either leisure rates or corporate rates. The
schema should also define the rental period, as well as information on a distance
associated with a particular rate, e.g., limited or unlimited miles per rental period,
and customer preferences regarding the type of vehicle and special equipment that
can be included with the reservation of a rental vehicle.
3.6 Define a simple hotel availability request schema that provides the ability to search
for hotel products available for booking by specific criteria that may include: dates,
date ranges, price range, room types, regular and qualifying rates, and/or services and
amenities. A request can also be made for a non-room product, such as banquets
and meeting rooms. An availability request should be made with the intent to
ultimately book a reservation for an event or for a room stay. The schema should
allow a request for “static” property data published by the hotel that includes
information about the hotel facilities, amenities, services, etc., as well as “dynamic”
(e.g., rate-oriented) data. For example, a hotel may have an AAA rate, a corporate
rate (which it does not offer all the time), or may specify a negotiated code as a
result of a negotiated rate, which affects the availability and price of the rate.