Module 3 (Chapter 1)

Uploaded by

uhani2323
Unit-3: "Introduction to XML"

Module-3
Chapter 1
Introduction to XML
3.1 Introduction
a) History
 A meta-markup language is a language for defining markup languages.
 The Standard Generalized Markup Language (SGML) is a meta-markup language for
defining markup languages that can describe a wide variety of document types.
 In 1986, SGML was approved as an International Organization for Standardization (ISO)
standard.
 In 1990, SGML was used as the basis for the development of HTML as the standard
markup language for Web documents.
 In 1996, the World Wide Web Consortium (W3C) began work on XML, another
meta-markup language.
 The first XML standard, 1.0, was published in February 1998. The second, 1.1, was
published in 2004.

b) Problems with HTML


 One problem with HTML is that it was defined to describe the layout of information
without considering the meaning of that information.
 Eg: suppose that a document stores a list of used cars for sale and the color and price
are included for each car.
 With HTML, two pieces of information about a car could be stored as the content of
paragraph elements, but there would be no way to find them in the document because
paragraph tags could have been used for many different kinds of information.
 Another potential problem with HTML is that it enforces few restrictions on the
arrangement or order of tags in a document.

c) Solution:
 One solution is for each group of users with common document needs to develop its own set
of tags and attributes and then use the SGML standard to define a new markup language to
meet those needs.
 SGML includes a large number of capabilities that are only rarely used.
 A program capable of parsing SGML documents would be very large and costly to
develop.
 In addition, SGML requires that a formal definition be provided with each new
markup language.
 An alternative solution to the problems of HTML is to define a simplified version of
SGML and allow users to define their own markup languages based on it.
 XML was designed to be that simplified version of SGML.

d) Features of XML
 XML is far more than a solution to the deficiencies of HTML.
 It provides a simple and universal way of storing any textual data.

Dept of CSE,GST,Bengaluru Page 1


 Data stored in XML documents can be electronically distributed and processed by any
number of different applications.
 XML is a universal data interchange language.
 It is a meta-markup language that specifies rules for creating markup languages.
 XML documents can be written by hand with a simple text editor.

3.2 Syntax of XML


 The syntax of XML can be thought of at two distinct levels.
 First, there is the general low-level syntax of XML, which imposes its rules on all
XML documents.
 The other syntactic level is specified by either document type definitions (DTDs) or XML
schemas.
 DTDs and XML schemas specify the set of tags and attributes that can appear in a
particular document or collection of documents and also the orders and arrangements
in which they can appear.
 An XML document can include several different kinds of statements. The most
common of these statements are the data elements of the document.
 Comments in XML are the same as in HTML. They cannot contain two adjacent
dashes, for obvious reasons.
 XML names are used to name elements and attributes. An XML name must begin
with a letter or an underscore and can include digits, hyphens, and periods.
 XML names are case sensitive, so Body, body, and BODY are all distinct names.
There is no length limitation for XML names.
 Every XML document defines a single root element, whose opening tag must come before every
other element; only the XML declaration, a DOCTYPE declaration, comments, and processing
instructions may precede it.
 All other elements of an XML document must be nested inside the root element. The
root element of every XHTML document is html, but in XML it has whatever name
the author chooses. XML tags, like those of XHTML, are surrounded by angle
brackets.
 Every XML element that can have content must have a closing tag. Elements that do
not include content must use a tag with the following form:
<element_name />
 As is the case with XHTML, XML tags can have attributes, which are specified with
name–value assignments.
 As with XHTML, all attribute values must be enclosed by either single or double
quotation marks.

Example:
<?xml version = "1.0" encoding = "utf-8"?>
<ad>
  <year>1960</year>
  <make> cessna </make>
  <model> alto </model>
  <color> yellow with white trim </color>
  <location>
    <city> Bangalore </city>
    <state> Karnataka </state>
  </location>
</ad>
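
As a rough illustration of how a program might read the ad document above, the following Python sketch uses the standard-library xml.etree.ElementTree module. The variable names are illustrative and are not part of the notes.

```python
import xml.etree.ElementTree as ET

# The ad document from the example above (XML declaration omitted for brevity).
AD_XML = """<ad>
  <year>1960</year>
  <make> cessna </make>
  <model> alto </model>
  <color> yellow with white trim </color>
  <location>
    <city> Bangalore </city>
    <state> Karnataka </state>
  </location>
</ad>"""

root = ET.fromstring(AD_XML)      # root is the single <ad> element

# Child elements are looked up by tag name. Element text keeps its
# surrounding spaces, so strip() is applied before use.
year = int(root.find("year").text)
make = root.find("make").text.strip()
city = root.find("location/city").text.strip()

print(root.tag, year, make, city)
```

Because every other element is nested inside the root, a single call to ET.fromstring gives access to the whole document.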

 In some cases, nested tags are better than attributes.


 A document or category of documents for which tags are being defined might need to
grow in structural complexity in the future.
 Nested tags can be added to any existing tag to describe its growing size and
complexity.
 The following versions of an element named patient illustrate three possible choices
between tags and attributes:

Eg:
<!-- A tag with one attribute -->
<patient name = "Maggie Dee Mapie">
...
</patient>

<!-- A tag with one nested tag -->
<patient>
<name>Maggie Dee Mapie </name>
</patient>

<!-- A tag with one nested tag, which contains three nested tags -->
<patient>
<name>
<first> Maggie </first>
<middle> Dee </middle>
<last> Mapie </last>
</name>
</patient>
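
The practical difference between the attribute form and the nested form can be sketched in Python with the standard-library xml.etree.ElementTree module. This is only an illustration; the variable names are assumptions, and the element names first, middle, and last follow the patient example.

```python
import xml.etree.ElementTree as ET

with_attribute = ET.fromstring('<patient name = "Maggie Dee Mapie"></patient>')
with_nested = ET.fromstring(
    "<patient><name><first>Maggie</first><middle>Dee</middle>"
    "<last>Mapie</last></name></patient>"
)

# An attribute value is one opaque string; a program must split it itself.
full_name = with_attribute.get("name")

# Nested tags expose each part separately, and more parts can be added
# later without changing how the existing ones are read.
name = with_nested.find("name")
parts = [child.text for child in name]

print(full_name)
print(parts)
```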

3.3 XML Document Structure


 An XML document often uses two auxiliary files:
a) one that defines its tag set and structural syntactic rules, and
b) one that contains a style sheet to describe how the content of the document is to be
printed or displayed.
 The structural syntactic rules are given as either a DTD or an XML schema.
 An XML document consists of one or more entities, which are logically related
collections of information, ranging in size from a single character to a chapter of a
book.
 One of these entities, called the document entity, is always physically in the file that
represents the document.
 The document entity can be the entire document, but in many cases it includes
references to the names of entities that are stored elsewhere.
For example, the document entity for a technical article might contain the beginning
material and ending material but have references to the article body sections, which
are entities stored in separate files.
 Many documents include information that cannot be represented as text, such as
images. Such information units are usually stored as binary data.
 A reference to an entity is its name with a prepended ampersand and an appended semicolon.
For example, if apple_image is the name of an entity, &apple_image; is a reference to it.


 If a binary data unit is logically part of a document, it must be a separate entity
because XML documents cannot include binary data. These entities are called binary
entities.
 Binary entities can be handled only by applications that deal with the document, such
as browsers. XML processors deal only with text.
 When several predefined entities must appear near each other in an XML document,
their references clutter the content and make it difficult to read. In such cases, a
character data section can be used. The content of a character data section is not
parsed, so it needs no entity references.
The form of a character data section is as follows:
<![CDATA[ content ]]>
For example, instead of
The last word of the line is &gt;&gt;&gt; here &lt;&lt;&lt;.
the following could be used:
<![CDATA[The last word of the line is >>> here <<<]]>
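
A quick way to see that the two forms carry the same text is to parse both. The sketch below uses Python's standard-library xml.etree.ElementTree; the wrapping element name line is an assumption made only for this illustration.

```python
import xml.etree.ElementTree as ET

# The same sentence written with entity references and as a CDATA section.
escaped = ET.fromstring(
    "<line>The last word of the line is &gt;&gt;&gt; here &lt;&lt;&lt;.</line>"
)
cdata = ET.fromstring(
    "<line><![CDATA[The last word of the line is >>> here <<<.]]></line>"
)

print(escaped.text)
print(escaped.text == cdata.text)
```

The parser hands the application identical character data in both cases; the CDATA section only changes how the text is written in the source file.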

3.4 Document Type Definitions (DTD)


 A document type definition (DTD) is a set of structural rules called declarations,
which specify a set of elements and attributes that can appear in a document, as well
as how and where these elements and attributes may appear.
 DTDs also provide entity definitions. Not all XML documents need a DTD.
 A document can be tested against the DTD to determine whether it conforms to the
rules the DTD describes.
 Application programs that process the data in the collection of XML documents can
be written so that they assume the particular document form.
 A DTD can be embedded in the XML document whose syntax rules it describes, in
which case it is called an internal DTD
 The alternative is to have the DTD stored in a separate file, in which case it is called
an external DTD. Because external DTDs allow use with more than one XML
document, they are preferable.
 A DTD with an incorrect or inappropriate declaration can have widespread
consequences. Fixing the DTD and all copies of it is the first and simplest step.
 After the correction of the DTD is completed, all documents that use the DTD must
be tested against it and often modified to conform to the changed DTD. Changes to
associated style sheets also might be necessary.
 Syntactically, a DTD is a sequence of declarations, each of which has the form of a
markup declaration:
<!keyword ... >

Four possible keywords can be used in declarations:


a) ELEMENT
b) ATTLIST
c) NOTATION
d) ENTITY

i) Declaring Elements

 The element declarations of a DTD have a form that is related to that of the rules of
context-free grammars, which are commonly written in Backus–Naur form (BNF).
 BNF is used to define the syntactic structure of programming languages.
 A DTD describes the syntactic structure of a particular set of documents, so it is
natural for its rules to be similar to those of BNF.
 Each element declaration in a DTD specifies the structure of one category of
elements.
 The declaration provides the name of the element whose structure is being defined,
along with the specification of the structure of that element.
 A document described by a DTD can be viewed as a tree. An element is a node in such a
tree, either a leaf node or an internal node.
 If the element is a leaf node, its syntactic description is its character pattern.
 If the element is an internal node, its syntactic description is a list of its child
elements, each of which can be a leaf node or an internal node.
 The form of an element declaration for elements that contain elements is as follows:
<!ELEMENT element_name (list of names of child elements)>

For example, consider the following declaration:


<!ELEMENT memo (from, to, date, re, body)>

 In many cases, it is necessary to specify the number of times that a child element may
appear. This can be done in a DTD declaration by adding a modifier to the child
element specification.
+   One or more occurrences
*   Zero or more occurrences
?   Zero or one occurrence
Consider the following DTD declaration:
<!ELEMENT person (parent+, age, spouse?, sibling*)>
 In most cases, the content of an element is type PCDATA, for parsable character data.
 Parsable character data is a string of any printable characters except "less than" (<)
and the ampersand (&), which must be written as entity references; "greater than" (>) is
usually written as &gt; as well. Two other content types can be specified: EMPTY and ANY.
 The form of a leaf element declaration is as follows:
<!ELEMENT element_name (#PCDATA)>
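
The occurrence modifiers behave much like the +, *, and ? operators of regular expressions. The Python sketch below makes that analogy concrete for the person declaration shown earlier: the content model is rewritten by hand as a regular expression over the names of the child elements. It is an illustration of the idea, not how a validating parser is implemented; the function name and sample documents are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# <!ELEMENT person (parent+, age, spouse?, sibling*)> rewritten by hand as a
# regular expression over the space-terminated names of the child elements.
PERSON_MODEL = re.compile(r"(parent )+(age )(spouse )?(sibling )*")

def matches_person_model(xml_text):
    """Return True if the children of <person> follow the content model."""
    person = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in person)
    return PERSON_MODEL.fullmatch(child_names) is not None

good = "<person><parent/><parent/><age/><sibling/></person>"
bad = "<person><age/><parent/></person>"   # age comes before parent

print(matches_person_model(good), matches_person_model(bad))
```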

ii) Declaring Attributes
 The attributes of an element are declared separately from the element declaration in a
DTD.
 An attribute declaration must include the name of the element to which the attribute
belongs, the attribute’s name, its type, and a default option.
 The general form of an attribute declaration is as follows:
<!ATTLIST element_name attribute_name attribute_type default_option>
 If more than one attribute is declared for a given element, the declarations can be
combined, as in the following form:
<!ATTLIST element_name
    attribute_name_1 attribute_type default_value_1
    attribute_name_2 attribute_type default_value_2
    ...
    attribute_name_n attribute_type default_value_n
>
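 There are 10 different attribute types; only CDATA, which is any string of characters, is
used here.
 The default option in an attribute declaration can specify either an actual value or a
requirement for the value of the attribute in the XML document. The possible default
options are:
A value        The quoted value, which is used if none is specified in an element
#FIXED value   The quoted value, which every element has and which cannot be changed
#REQUIRED      No default value is given; every instance of the element must specify a value
#IMPLIED       No default value is given; the value may or may not be specified in an element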

Example:
<!ATTLIST airplane places CDATA "4">
<!ATTLIST airplane engine_type CDATA #REQUIRED>
<!ATTLIST airplane price CDATA #IMPLIED>
<!ATTLIST airplane manufacturer CDATA #FIXED "Cessna">

iii) Declaring Entities
 Entities can be defined so that they can be referenced anywhere in the content of an
XML document, in which case they are called general entities. The predefined entities
are all general entities.
 Entities can also be defined so that they can be referenced only in DTDs, in which
case they are called parameter entities.
 The form of an entity declaration is
<!ENTITY [%] entity_name "entity_value">
 When the optional percent sign (%) is present in an entity declaration, it specifies that
the entity is a parameter entity rather than a general entity.
Consider the following example of an entity:
<!ENTITY jfk "John Fitzgerald Kennedy">
 Any XML document that uses a DTD that includes this declaration can specify the
complete name with just the reference &jfk;.
 When an entity is longer than a few words, such as a section of a technical article, its
text is defined outside the DTD. In such cases, the entity is called an external text
entity.
 The form of the declaration of an external text entity is
<!ENTITY entity_name SYSTEM "file_location">
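
The effect of a general entity can be observed with any XML parser that reads the internal DTD. The Python sketch below uses the standard-library xml.etree.ElementTree module; the element name speech and the surrounding sentence are assumptions made for the illustration, while the jfk entity is the one declared above.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE speech [
  <!ELEMENT speech (#PCDATA)>
  <!ENTITY jfk "John Fitzgerald Kennedy">
]>
<speech>Speech by &jfk;</speech>"""

speech = ET.fromstring(DOC)

# The parser replaces the reference &jfk; with the declared entity value
# before the application sees the text.
print(speech.text)
```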

iv) Sample DTD


<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.dtd -->
<!ELEMENT planes_for_sale (ad+)>
<!ELEMENT ad (year, make, model, color, pescription, price?, seller, location)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT make (#PCDATA)>
<!ELEMENT model (#PCDATA)>
<!ELEMENT color (#PCDATA)>
<!ELEMENT pescription (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT seller (#PCDATA)>
<!ELEMENT location (city, state)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ATTLIST seller phone CDATA #REQUIRED>
<!ATTLIST seller email CDATA #IMPLIED>
<!ENTITY c "cessna">
<!ENTITY p "piper">
 Some XML parsers check documents that have DTDs in order to ensure that the
documents conform to the structure specified in the DTDs. These parsers are called
validating parsers.

v) Internal and External DTD

 A DTD can be embedded in the XML document whose syntax rules it describes, in
which case it is called an internal DTD.
Syntax:
<?xml version = "1.0" encoding = "utf-8"?>
<!DOCTYPE planes [
<!-- The DTD for planes -->
]>
 The alternative is to have the DTD stored in a separate file, in which case it is called
an external DTD. Because external DTDs allow use with more than one XML
document, they are preferable.
Eg:
<?xml version = "1.0" encoding = "utf-8"?>
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
<planes_for_sale>
  <ad>
    <year>1960</year>
    <make> &c; </make>
    <model> alto </model>
    <color> yellow with white trim </color>
    <pescription> New point, nearly new interior</pescription>
    <price> 23,495 </price>
    <seller phone = "555-222-333"> Sky way </seller>
    <location>
      <city> Bangalore </city>
      <state> Karnataka </state>
    </location>
  </ad>
  <ad>
    <year>1980</year>
    <make> &p; </make>
    <model> cherokee </model>
    <color> gold</color>
    <pescription> Old point, nearly old interior</pescription>
    <seller phone = "555-222-333"
            email = "jseller@axl.com"> John Seller </seller>
    <location>
      <city> Bangalore </city>
      <state> Karnataka </state>
    </location>
  </ad>
</planes_for_sale>

3.5 Namespaces
 It is often convenient to construct XML documents that use tag sets that are defined
for and used by other documents.
 When a tag set is available and appropriate for a particular XML document or class of
documents, it is better to use it than to invent a new collection of element types.
 For example, suppose you must define an XML markup language for a furniture
catalog with <chair>, <sofa>, and <table> tags. Suppose also that the catalog
document must include as well several different tables of specific furniture pieces,
wood types, finishes, and prices.
 One problem with using different markup vocabularies in the same document is that
collisions between names that are defined in two or more of those tag sets could result
 To deal with this problem, the W3C has developed a standard for XML namespaces
(at http://www.w3.org/TR/REC-xml-names)
 An XML namespace is a collection of element and attribute names used in XML
documents. The name of a namespace usually has the form of a uniform resource
identifier (URI).

 A namespace for the elements and attributes of the hierarchy rooted at a particular
element is declared as the value of the attribute xmlns.
The form of a namespace declaration for an element is
<element_name xmlns[:prefix] = "URI">
 The square brackets indicate that what is within them is optional. The prefix, if
included, is the name that must be attached to the names in the declared namespace.
 If the prefix is not included, the namespace is the default for the document.
 A prefix is used for two reasons. First, most URIs are too long to be typed on every
occurrence of every name from the namespace. Second, a URI includes characters
that are invalid in XML names. Note that the element for which a namespace is declared is
usually the root of a document.
For example, XHTML documents declare the xmlns namespace on the root element, html:
<html xmlns = "http://www.w3.org/1999/xhtml">
This declaration defines the default namespace for XHTML documents, which is
http://www.w3.org/1999/xhtml.

As an example of a prefixed namespace declaration, consider the following:
<birds xmlns:bd = "http://www.audubon.org/names/species">
Within the birds element, including all of its children elements, the names from the
given namespace must be prefixed with bd, as in the following element:
<bd:lark>

An example that uses two namespaces:

<states
    xmlns = "http://www.states-info.org/states"
    xmlns:cap = "http://www.states-info.org/state-capitals">
  <state>
    <name>South Dakota</name>
    <population>75844</population>
    <capital>
      <cap:name>Pierre</cap:name>
      <cap:population>12576</cap:population>
    </capital>
  </state>
</states>
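
To see how a program tells the two name elements apart, the following Python sketch parses a shortened copy of the states document with the standard-library xml.etree.ElementTree module. The prefixes st and cap in the NS dictionary are local choices of this script (an assumption of the sketch); only the namespace URIs have to match the document.

```python
import xml.etree.ElementTree as ET

STATES_XML = """<states xmlns="http://www.states-info.org/states"
        xmlns:cap="http://www.states-info.org/state-capitals">
  <state>
    <name>South Dakota</name>
    <capital>
      <cap:name>Pierre</cap:name>
    </capital>
  </state>
</states>"""

# Prefix-to-URI map used only for the lookups below.
NS = {"st": "http://www.states-info.org/states",
      "cap": "http://www.states-info.org/state-capitals"}

root = ET.fromstring(STATES_XML)
state_name = root.find("st:state/st:name", NS).text
capital_name = root.find("st:state/st:capital/cap:name", NS).text

print(root.tag)                   # stored as {namespace URI}local-name
print(state_name, capital_name)
```

Unprefixed elements such as state and capital belong to the default namespace, so they are looked up with the st prefix here, while the capital's name comes from the second namespace.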

3.6 Displaying Raw XML Documents


 An XML-enabled browser— or any other system that can deal with XML
documents—cannot know how to format the tags defined in any given document.
 Therefore, if an XML document is displayed without a style sheet that defines
presentation styles for the document’s tags, the displayed document will not have
formatted content.
 Contemporary browsers include default style sheets that are used when no style sheet
is specified in the XML document.
 The display of such an XML document is only a somewhat stylized listing of the
XML markup. The FX3 browser display of the planes.xml document is shown in
Figure

XML Program
<?xml version = "1.0" encoding = "utf-8"?>
<!DOCTYPE planes_for_sale [
<!ELEMENT planes_for_sale (ad+)>
<!ELEMENT ad (year, make, model, color, pescription, price?, seller, location)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT make (#PCDATA)>
<!ELEMENT model (#PCDATA)>
<!ELEMENT color (#PCDATA)>
<!ELEMENT pescription (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT seller (#PCDATA)>
<!ELEMENT location (city, state)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ATTLIST seller phone CDATA #REQUIRED>
<!ATTLIST seller email CDATA #IMPLIED>
<!ENTITY c "cessna">
<!ENTITY p "piper">
]>

<planes_for_sale>
<ad>
<year>1960</year>
<make> &c; </make>
<model> alto </model>
<color> yellow with white trim </color>
<pescription> New point ,nearly new interior</pescription>
<price> 23,495 </price>
<seller phone ="555-222-333"> Sky way </seller>
<location>
<city> Bangalore </city>
<state> Karnataka </state>
</location>
</ad>
<ad>
<year>1980</year>
<make> &p; </make>
<model> cherokee </model>
<color> gold</color>
<pescription> Old point ,nearly old interior</pescription>
<seller phone ="555-222-333"
email ="jseller@axl.com"> John Seller </seller>
<location>
<city> Bangalore </city>
<state> Karnataka </state>
</location>
</ad>
</planes_for_sale>
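
For comparison with the browser's raw listing, the sketch below shows how a program might read a shortened version of this document with Python's standard-library xml.etree.ElementTree module. That parser is non-validating: it reads the internal DTD and expands the entities c and p, but it does not check the document against the element declarations. The shortened DTD and the variable names are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

PLANES_XML = """<!DOCTYPE planes_for_sale [
  <!ELEMENT planes_for_sale (ad+)>
  <!ELEMENT ad (year, make, price?, seller)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT make (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
  <!ELEMENT seller (#PCDATA)>
  <!ATTLIST seller phone CDATA #REQUIRED>
  <!ATTLIST seller email CDATA #IMPLIED>
  <!ENTITY c "cessna">
  <!ENTITY p "piper">
]>
<planes_for_sale>
  <ad><year>1960</year><make>&c;</make><price>23,495</price>
      <seller phone="555-222-333">Sky way</seller></ad>
  <ad><year>1980</year><make>&p;</make>
      <seller phone="555-222-333" email="jseller@axl.com">John Seller</seller></ad>
</planes_for_sale>"""

root = ET.fromstring(PLANES_XML)
rows = []
for ad in root.findall("ad"):
    price = ad.findtext("price", default="(no price)")    # price? is optional
    email = ad.find("seller").get("email", "(no email)")  # email is #IMPLIED
    rows.append((ad.findtext("year"), ad.findtext("make"), price, email))

for row in rows:
    print(row)
```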

3.7 Displaying XML Documents with CSS

/* planes.css */

ad {display: block; margin-top: 15px; color: blue;}
year, make, model {color: red; font-size: 16pt;}
color {display: block; margin-left: 20px; font-size: 12pt;}
pescription {display: block; margin-left: 15px; font-size: 12pt;}
seller {display: block; margin-left: 15px; font-size: 14pt;}
location {display: block; margin-left: 40px;}
city {font-size: 12pt;}
state {font-size: 12pt;}
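
A style sheet is attached to an XML document with an xml-stylesheet processing instruction placed before the root element. The Python sketch below builds that instruction with the standard-library xml.dom.minidom module. The one-line document is a stand-in for planes.xml, and the file name planes.css is taken from the comment in the style sheet above; both are assumptions of the sketch.

```python
from xml.dom import minidom

# A stand-in for the planes document.
doc = minidom.parseString("<planes_for_sale><ad/></planes_for_sale>")

# Build <?xml-stylesheet type="text/css" href="planes.css"?> and place it
# before the root element, where browsers look for it.
pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'type="text/css" href="planes.css"')
doc.insertBefore(pi, doc.documentElement)

print(doc.toxml())
```

Written by hand, the same line would simply be typed directly after the XML declaration of planes.xml.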

3.8 XML Schemas


DTDs have several disadvantages:
 One is that DTDs are written in a syntax unrelated to XML, so they cannot be
analysed with an XML processor.
 Also, it can be confusing for people to deal with two different syntactic forms, one
that defines a document and one that defines its structure.
 Another disadvantage is that DTDs do not allow restrictions on the form of data that
can be the content of a particular tag.

 Several alternatives to DTDs have been developed to attempt to overcome their
weaknesses. The XML Schema standard, which was designed by the W3C, is one of
these alternatives.
 An XML schema is an XML document, so it can be parsed with an XML parser. It
also provides far more control over data types than do DTDs.
 The content of a specific element can be required to be any one of 44 different data
types.

a) Schema Fundamentals
 Schemas can conveniently be related to the idea of a class and an object in an object-
oriented programming language.
 A schema is similar to a class definition; an XML document that conforms to the
structure defined in the schema is similar to an object of the schema’s class.
Schemas have two primary purposes.
 First, a schema specifies the structure of its instance XML documents, including
which elements and attributes may appear in the instance document, as well as where
and how often they may appear.
 Second, a schema specifies the data type of every element and attribute in its instance
XML documents.
 It has been said that XML schemas are “namespace centric.”

b) Defining a Schema
 Schemas themselves are written with the use of a collection of tags, or a vocabulary,
from a namespace that is, in effect, a schema of schemas. The name of this namespace
is http://www.w3.org/2001/XMLSchema.
 Some of the elements in the namespace are element, schema, sequence, and string.

 Every schema has schema as its root element.


 As stated, the schema element specifies the namespace for the schema of schemas
from which the schema’s elements and attributes will be drawn.
This namespace specification appears as follows:
xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
The specification provides the prefix xsd for the names from the namespace for the
schema of schemas.

 A schema defines a namespace in the same sense as a DTD defines a tag set.
 The name of the namespace defined by a schema must be specified with the
targetNamespace attribute of the schema element.

 The name of every top-level (not nested) element that appears in a schema is placed in
the target namespace, which is specified by assigning a namespace to the
targetNamespace attribute:
targetNamespace = "http://cs.uccs.edu/planeSchema"

 If the elements and attributes that are not defined directly in the schema element
(because they are nested inside top-level elements) are to be included in the target
namespace, schema’s elementFormDefault must be set to qualified, as follows:
elementFormDefault = "qualified"

 The default namespace, which is the source of the unprefixed names in the schema, is
given with another xmlns specification, but this time without the prefix:
xmlns = "http://cs.uccs.edu/planeSchema"

An example of a complete opening tag for a schema is as follows:

<!-- xmlns:xsd           namespace for the schema itself
     targetNamespace     namespace where the defined elements will be placed
     xmlns               default namespace for this document
     elementFormDefault  places non-top-level elements in the target namespace -->
<xsd:schema
    xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
    targetNamespace = "http://cs.uccs.edu/planeSchema"
    xmlns = "http://cs.uccs.edu/planeSchema"
    elementFormDefault = "qualified">

c) Defining a Schema Instance
 An instance of a schema must include specifications of the namespaces it uses. These
specifications are given as attribute assignments in the tag for the root element of the
instance document.
 First, an instance document normally defines its default namespace to be the one
defined in its schema.
For example, if the root element is planes, we could have
<planes
xmlns = "http://cs.uccs.edu/planeSchema"
... >
 The second attribute specification in the root element of an instance document declares
the standard namespace for instances, whose name includes XMLSchema-instance. It is needed
because the schemaLocation attribute belongs to that namespace.
 This namespace corresponds to the XMLSchema namespace used for schemas. The
following attribute assignment specifies the XMLSchema-instance namespace and
defines the prefix, xsi, for it:
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
 Third, the instance document must specify the filename of the schema in which the
default namespace is defined.
 This is accomplished with the schemaLocation attribute, which takes two values: the
namespace of the schema and the filename of the schema:
xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd"
The complete opening tag of the root element is then:
<planes
    xmlns = "http://cs.uccs.edu/planeSchema"
    xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd">

d) Overview of Data Types:


 There are two categories of user-defined schema data types: simple and complex.
 A simple data type is a data type whose content is restricted to strings. A simple type
cannot have attributes or include nested elements.
 A complex type can have attributes and include other data types as child elements.
 The XML Schema defines 44 data types, 19 of which are primitive and 25 of which
are derived.
 The primitive data types include string, boolean, float, decimal, time, and anyURI.
 The predefined derived types include byte, long, unsignedInt,
positiveInteger, and NMTOKEN.
 User-defined data types are defined by specifying restrictions on an existing type,
which is then called a base type. Such user-defined types are derived types.
Constraints on derived types are given in terms of the facets of the base type.
 Elements in a DTD are all global.
 Data declarations in an XML schema can be either local or global.
 A local declaration is a declaration that appears inside an element that is a child of the
schema element. A locally declared element is visible only in that element.
 A global declaration is a declaration that appears as a child of the schema element.
Global elements are visible in the whole schema in which they are declared.

e) Simple Types:
 Elements are defined in an XML schema with the element tag, which is from the
XMLSchema namespace
 An element that is named includes the name attribute for that purpose. The other
attribute that is necessary in a simple element declaration is type, which is used to
specify the type of content allowed in the element.
Here is an example:
<xsd:element name = "engine" type = "xsd:string" />
 An element can be given a default value with the default attribute:
<xsd:element name = "engine" type = "xsd:string" default = "fuel injected" />
 Elements can also have constant values, which are given with the fixed attribute:
<xsd:element name = "plane" type = "xsd:string" fixed = "single wing" />

 A simple user-defined data type is described in a simpleType element with the use of
facets.
 Facets must be specified in the content of a restriction element, which gives the base
type name. The facets themselves are given in elements named for the facets:
For example, the following element declares a user-defined type, firstName, for
strings of fewer than 11 characters:
<xsd:simpleType name = "firstName">
<xsd:restriction base = "xsd:string">
<xsd:maxLength value = "10" />
</xsd:restriction>
</xsd:simpleType>
 The number of digits of a decimal number is restricted with the precision facet (named
totalDigits in the final W3C XML Schema Recommendation), which is placed inside the
restriction element:
<xsd:simpleType name = "phoneNumber">
<xsd:restriction base = "xsd:decimal">
<!-- the digit-count facet element goes here -->
</xsd:restriction>
</xsd:simpleType>
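
What the two kinds of facet express can be mimicked with a few lines of Python. This is only a sketch of the constraints, not a schema validator; the function names and the limit of 7 digits used in the calls are hypothetical values chosen for illustration, while the limit of 10 characters comes from the firstName description above.

```python
from decimal import Decimal

def is_first_name(value, max_length=10):
    """Mimic a string restriction that allows at most max_length characters."""
    return isinstance(value, str) and len(value) <= max_length

def has_at_most_digits(value, total_digits):
    """Mimic a decimal restriction that caps the number of digits."""
    digits = Decimal(value).as_tuple().digits
    return len(digits) <= total_digits

# "Maximiliana" has 11 characters, so it fails the firstName restriction.
print(is_first_name("Maggie"), is_first_name("Maximiliana"))

# 7 is a hypothetical digit limit for this illustration.
print(has_at_most_digits("5552223", 7), has_at_most_digits("55522233", 7))
```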

f) Complex Types:
 Most XML documents include nested elements, so few XML schemas do not have
complex types.
 Although there are several categories of complex element types, the discussion here is
restricted to those called element-only elements, which can have elements in their
content, but no text.
 Complex types are defined with the complexType tag. The elements that are the
content of an element-only element must be contained in an ordered group, an
unordered group, a choice, or a named group.
 The sequence element is used to contain an ordered group of elements
<xsd:complexType name = "sports_car">
  <xsd:sequence>
    <xsd:element name = "make" type = "xsd:string" />
    <xsd:element name = "model" type = "xsd:string" />
    <xsd:element name = "engine" type = "xsd:string" />
    <xsd:element name = "year" type = "xsd:decimal" />
  </xsd:sequence>
</xsd:complexType>
 A complex type whose elements are an unordered group is defined in an all element:
<xsd:element name = "planes">
  <xsd:complexType>
    <xsd:all>
      <xsd:element name = "make"
                   type = "xsd:string"
                   minOccurs = "1"
                   maxOccurs = "unbounded" />
    </xsd:all>
  </xsd:complexType>
</xsd:element>

A complete example of a schema

<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.xsd -->
<xsd:schema
    xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
    targetNamespace = "http://cs.uccs.edu/planeSchema"
    xmlns = "http://cs.uccs.edu/planeSchema"
    elementFormDefault = "qualified">

  <xsd:element name = "planes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name = "make" type = "xsd:string" />
        <xsd:element name = "model" type = "xsd:string" />
        <xsd:element name = "engine" type = "xsd:string" />
        <xsd:element name = "year" type = "xsd:decimal" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

<?xml version = "1.0" encoding = "utf-8"?>
<!-- newplanes.xml -->
<planes
    xmlns = "http://cs.uccs.edu/planeSchema"
    xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd">
  <make>Maruthi</make>
  <model>alto</model>
  <engine>Diesel engine</engine>
  <year>2016</year>
</planes>
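
Because a schema is itself an XML document, an ordinary XML parser can read it. The Python sketch below uses the standard-library xml.etree.ElementTree module to pull the element names out of the sequence in planes.xsd and compare them with the children of the instance document. It checks only the order of an ordered group, so it is an illustration of the idea and not a schema validator; the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"   # namespace of the schema tags

SCHEMA = """<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://cs.uccs.edu/planeSchema"
    xmlns="http://cs.uccs.edu/planeSchema"
    elementFormDefault="qualified">
  <xsd:element name="planes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="make" type="xsd:string"/>
        <xsd:element name="model" type="xsd:string"/>
        <xsd:element name="engine" type="xsd:string"/>
        <xsd:element name="year" type="xsd:decimal"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

INSTANCE = """<planes xmlns="http://cs.uccs.edu/planeSchema">
  <make>Maruthi</make><model>alto</model>
  <engine>Diesel engine</engine><year>2016</year>
</planes>"""

schema = ET.fromstring(SCHEMA)
target_ns = "{" + schema.get("targetNamespace") + "}"

# Names declared inside the sequence, in declaration order, qualified with
# the target namespace because elementFormDefault is "qualified".
sequence = schema.find(f"{XSD}element/{XSD}complexType/{XSD}sequence")
expected = [target_ns + e.get("name") for e in sequence.findall(f"{XSD}element")]

# An ordered group requires the instance children in exactly this order.
actual = [child.tag for child in ET.fromstring(INSTANCE)]
print(actual == expected)
```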

g) Validating Instance of Schemas


 An XML schema provides a definition of a category of XML documents.
 Fortunately, several XML schema validation tools are available. One of them is
named xsv, an abbreviation for XML Schema Validator.
 If the schema and the instance document are available on the Web, xsv can be used
online, like the XHTML validation tool at the W3C Web site.
 This tool can also be downloaded and run on any computer.
 The Web site for xsv is http://www.w3.org/XML/Schema#XSV.
 The output of xsv is an XML document. When the tool is run from the command line,
the output document appears on the screen with no formatting, so it is a bit difficult to
read. The following is the output of xsv run on planes.xml:

3.9 XML Parser


 An XML parser is a software library or package that provides interfaces for client
applications to work with an XML document.
 The XML Parser is designed to read the XML and create a way for programs to use
XML.
 An XML parser checks that the document is well formed; a validating parser also checks
the document against its DTD or schema.
 The working of an XML parser is shown in the figure given below:

Types of XML Parsers

These are the two main types of XML Parsers:

1. DOM
2. SAX

DOM (Document Object Model)


 A DOM document is an object which contains all the information of an XML
document. It is composed like a tree structure.
 The DOM Parser implements a DOM API. This API is very simple to use.
 The XML DOM makes a tree-structure view for an XML document.
 We can access all elements through the DOM tree.
 We can modify or delete their content and also create new elements. The elements,
their content (text and attributes) are all known as nodes.
 For example, consider this table, taken from an HTML document:
<TABLE>
<ROWS>
<TR>
<TD>A</TD>
<TD>B</TD>
</TR>
<TR>
<TD>C</TD>
<TD>D</TD>
</TR>
</ROWS>
</TABLE>
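
The operations listed above (reading nodes, modifying content, and creating new elements) can be tried on this table with Python's standard-library xml.dom.minidom module, which implements a DOM API. The sketch below is one possible illustration; the new cell values Z and E are arbitrary choices made for the example.

```python
from xml.dom import minidom

TABLE_XML = """<TABLE><ROWS>
<TR><TD>A</TD><TD>B</TD></TR>
<TR><TD>C</TD><TD>D</TD></TR>
</ROWS></TABLE>"""

doc = minidom.parseString(TABLE_XML)

# Every TD element is a node; its text is a separate child text node.
cells = [td.firstChild.data for td in doc.getElementsByTagName("TD")]
print(cells)

# The tree can be modified: change one cell and append a new one.
first_row = doc.getElementsByTagName("TR")[0]
first_row.getElementsByTagName("TD")[0].firstChild.data = "Z"
new_cell = doc.createElement("TD")
new_cell.appendChild(doc.createTextNode("E"))
first_row.appendChild(new_cell)

print(first_row.toxml())
```

The whole document is held in memory as a tree, which is what makes this kind of random access and in-place editing possible.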
