
Extensible Markup Language (XML)


Indian Institute of Technology Kharagpur

Extensible Markup Language (XML)

Prof. Indranil Sen Gupta


Dept. of Computer Science & Engg.
I.I.T. Kharagpur, INDIA

Lecture 16: Extensible Markup Language (XML)

On completion, the student will be able to:
1. Explain the structure of an XML document.
2. Explain the different types of document type declarations.
3. Explain the basic concepts of simple and extended links.

Introduction

• What is XML?
  ◦ A markup language for creating documents containing structured information.
  ◦ Markup language:
    ▪ A mechanism to identify structures in a document.
  ◦ Structured information:
    ▪ Contains content (text, images, etc.).
    ▪ Contains an indication of what role the content plays (e.g., heading, footnote, address).

XML vs. HTML

• Both are markup languages, but there are differences.
  ◦ In HTML, both the tag set and the tag semantics are predefined and fixed.
  ◦ XML specifies neither a tag set nor semantics.
    ▪ It provides a facility to define tags.
    ▪ Semantics are defined by the applications that process the documents (or by stylesheets).
    ▪ XML is thus a meta-language for describing markup languages.

XML Development Goals

• It should be easy to use XML over the Internet.
• XML shall support a wide variety of applications.
• It shall be easy to write programs that process XML documents.
• The number of optional features in XML is to be kept to a minimum (ideally zero).
• The design of XML shall be formal and concise.
• XML documents should be easy to create.

How is XML Defined?

• XML is defined by the following specifications:
  ◦ Extensible Markup Language (XML) 1.0
    ▪ Defines the syntax of XML.
  ◦ XML Pointer Language (XPointer) and XML Linking Language (XLink)
    ▪ Define a standard way to represent links between resources.
  ◦ Extensible Stylesheet Language (XSL)
    ▪ Defines the standard stylesheet language for XML.

An Example XML Document

<?xml version="1.0"?>
<quotation>
  <isay> Hello, how are you </isay>
  <yousay> I am not well </yousay>
  <frown/>
</quotation>
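
As a minimal sketch of how an application might read the example document, the snippet below parses it with Python's standard xml.etree.ElementTree module and lists its child elements. The variable names (DOC, child_tags, lines) are illustrative and not part of the lecture.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, written with straight quotes.
DOC = """<?xml version="1.0"?>
<quotation>
  <isay> Hello, how are you </isay>
  <yousay> I am not well </yousay>
  <frown/>
</quotation>"""

root = ET.fromstring(DOC)

# Names of the child elements, in document order.
child_tags = [child.tag for child in root]

# Text of each non-empty child, with surrounding white space trimmed.
# The empty element <frown/> has no text and is skipped.
lines = {child.tag: child.text.strip() for child in root if child.text}

print(root.tag)
print(child_tags)
print(lines)
```

The empty element appears in child_tags but carries no text, which is why it is missing from lines.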

Structure of an XML Document

• An XML document consists of:
  ◦ Prolog
  ◦ Elements
  ◦ Attributes
  ◦ Entity references
  ◦ Comments

XML: Prolog

• The prolog is the first structural element present in an XML document.
• It is usually divided into an XML declaration and an (optional) DTD.
• Examples (the declaration starts with "<?xml": lowercase, with no white space after "<?"):
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0"?>

XML: Elements

• Elements are the most common form of markup.
• A non-empty XML element must have a start tag and a matching end tag, whose name is prefixed by a slash.
<city> Kharagpur </city>
• An empty element can be written as <city/> instead of a start tag and an end tag with no content.
• Remember: XML is case-sensitive.
• Element naming convention:
  ◦ Names must begin with an underscore or a letter.
  ◦ Names can contain letters, digits, underscores, hyphens, and periods.
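
The naming convention above can be sketched as a small check. This is an ASCII-only approximation written for illustration: the function name is_simple_xml_name is hypothetical, and the full XML name grammar also admits the colon and many non-ASCII letters.

```python
import re

# ASCII-only approximation of the naming rule on the slide:
# first character is a letter or underscore; the rest may be
# letters, digits, underscores, hyphens, or periods.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_simple_xml_name(name: str) -> bool:
    """Return True if name satisfies the simplified naming rule."""
    return NAME_RE.fullmatch(name) is not None

for candidate in ["city", "_id", "first-name.v2", "2nd", "my city"]:
    print(candidate, is_simple_xml_name(candidate))
```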

XML: Attributes

• XML attributes are attached to elements.
  ◦ They are name-value pairs that occur inside start tags, after the element name.
  ◦ Attribute names must begin with a letter or an underscore.
  ◦ Attribute names must not contain any white space.
<faculty name="Indranil Sen Gupta">
  isg@cse.iitkgp.ac.in
</faculty>
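
A minimal sketch of reading the attribute and the element content of the example above, using Python's standard xml.etree.ElementTree module. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring(
    '<faculty name="Indranil Sen Gupta">\n'
    "  isg@cse.iitkgp.ac.in\n"
    "</faculty>"
)

# Attributes are exposed as name-value pairs on the element.
name = elem.get("name")

# The element content is separate from its attributes.
email = elem.text.strip()

print(name)
print(email)
```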

XML: Entity References

• They are used to reference data that is not directly in the structure.
  ◦ They can be internal or external.
  ◦ Built-in entity references are used to represent &, <, >, " and '.
  ◦ The string
      Tom&Jerry("Don't write x<y")
    would be written as
      Tom&amp;Jerry(&quot;Don&apos;t write x&lt;y&quot;)

  ◦ A special form of entity reference, called a character reference, can be used to insert arbitrary Unicode characters into the document.
    ▪ Decimal reference: &#8478;
    ▪ Hexadecimal reference: &#x211E;
    Both refer to the Rx prescription symbol (℞).
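
The escaping rule and the two character references can be checked with a short sketch. It uses xml.sax.saxutils.escape from the Python standard library, which handles &, < and > by default; the extra mapping for the two quote characters is supplied here by hand.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'Tom&Jerry("Don\'t write x<y")'

# escape() replaces &, < and > first, then applies the extra mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# Parsing the escaped text gives back the original string.
round_trip = ET.fromstring(f"<s>{escaped}</s>").text

# Decimal and hexadecimal character references for the same code point.
symbols = ET.fromstring("<s>&#8478;&#x211E;</s>").text
```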

XML: Comments

• Comments begin with <!-- and end with -->.
• They can contain any data except the literal string "--".
• All data between these two delimiters is ignored by the XML processor.
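
A small sketch of both points, using Python's standard xml.etree.ElementTree module: its default parser drops comments, and a comment containing "--" is rejected as not well-formed. The helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = "<note><!-- reviewer remark, not content -->Meeting at 10 am</note>"
root = ET.fromstring(doc)
print(root.text)  # only the text outside the comment is kept

def parses(text: str) -> bool:
    """Return True if text can be parsed as XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# "--" inside a comment is not allowed.
bad_comment_ok = parses("<note><!-- a -- b --></note>")
```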

Processing Instructions

• Used to provide information to an application.
  ◦ Like comments, they are not part of the character data of the XML document.
  ◦ Unlike comments, the XML processor is required to pass them to the application.
• They have the form:
<?name pidata?>
  ◦ PI names beginning with "xml" are reserved.
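
As a sketch of how a processing instruction reaches the application, the snippet below uses Python's standard xml.dom.minidom module, which keeps PIs as nodes with a target (the name) and data (the pidata). The target "render-hint" is made up for this illustration.

```python
from xml.dom import minidom, Node

# "render-hint" is a hypothetical PI target, used only for illustration.
doc = minidom.parseString('<?render-hint columns="2"?><report/>')

# Collect (target, data) for every processing instruction at the top level.
pis = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
print(pis)
```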

CDATA Sections

• A CDATA section instructs the XML parser to ignore most markup characters.
• An example:
<![CDATA[
temp = *p;
*p = *q;
*q = temp;
if (temp < 0) temp = -temp;
]]>
• All character data between <![CDATA[ and ]]> is passed to the application without interpretation.
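
The effect of a CDATA section can be seen in a short sketch with Python's standard xml.etree.ElementTree module: the same text parses inside a CDATA section but is rejected outside one, because the bare "<" is read as markup. The element name code is illustrative.

```python
import xml.etree.ElementTree as ET

with_cdata = """<code><![CDATA[
if (temp < 0) temp = -temp;
]]></code>"""

# Inside CDATA the "<" is ordinary character data.
snippet = ET.fromstring(with_cdata).text.strip()
print(snippet)

# Outside CDATA the same "<" makes the document not well-formed.
try:
    ET.fromstring("<code>if (temp < 0) temp = -temp;</code>")
    plain_parses = True
except ET.ParseError:
    plain_parses = False
```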

Document Type Declarations (DTD)

• XML allows us to create our own tag names.
• A DTD allows a document to send meta-information about its contents to the parser.
  ◦ Sequence and ordering of tags, etc.
• Four kinds of declarations in XML:
  ◦ Element type declarations
  ◦ Attribute list declarations
  ◦ Entity declarations
  ◦ Notation declarations
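
The four kinds of declarations can be sketched in DTD syntax and observed with Python's standard xml.parsers.expat module, which reports each declaration through a separate handler. The small DTD below is an assumption written for illustration (the entity name iit is hypothetical); it is not taken from the lecture.

```python
from xml.parsers import expat

# One declaration of each kind, in an internal DTD subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE faculty [
  <!ELEMENT faculty (#PCDATA)>
  <!ATTLIST faculty city CDATA #FIXED "Kharagpur">
  <!ENTITY iit "Indian Institute of Technology">
  <!NOTATION GIF87A SYSTEM "GIF">
]>
<faculty city="Kharagpur">&iit;</faculty>"""

seen = []  # (kind, declared name) pairs reported by the parser

parser = expat.ParserCreate()
parser.ElementDeclHandler = lambda name, model: seen.append(("element", name))
parser.AttlistDeclHandler = (
    lambda elname, attname, type_, default, required:
        seen.append(("attlist", attname))
)
parser.EntityDeclHandler = (
    lambda name, is_param, value, base, system_id, public_id, notation:
        seen.append(("entity", name))
)
parser.NotationDeclHandler = (
    lambda name, base, system_id, public_id: seen.append(("notation", name))
)
parser.Parse(DOC, True)

print(seen)
```

expat only reports the declarations; it does not validate the document against them.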

Element Type Declaration

• They identify the names of the elements and the nature of their content.
  ◦ Elements can contain simple, predefined data types.
  ◦ They can refer to other elements.
  ◦ They can be defined w.r.t. their cardinality.
• Example (written in XML Schema notation):
<xsd:element name="faculty"
             type="xsd:string"
             maxOccurs="unbounded"/>

Attribute List Declaration

• Like elements, attributes must have a name and a type.
  ◦ Attributes can use custom data types.
  ◦ They can be restricted w.r.t. cardinality or default values.
  ◦ They can refer to other attribute definitions.
• Example (written in XML Schema notation):
<xsd:attribute name="city"
               type="xsd:string"
               fixed="Kharagpur"/>

Entity Declarations

• They allow us to associate a name with some other fragment of content.
• Two types:
  ◦ Internal entities:
    ▪ They associate a name with a string of literal text.
    ▪ Five entities are predefined: &lt;, &gt;, &amp;, &apos; and &quot;

  ◦ External entities:
    ▪ They associate a name with the contents of another file.
    ▪ The contents of the (text) file are inserted at the point of reference.
    ▪ Example:
<!ENTITY IITLOGO
         SYSTEM "/institute/logo.gif">
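
For the internal entities described above, a minimal sketch shows the declared text being substituted at the point of reference. It uses Python's standard xml.etree.ElementTree module; the entity name iit and the element name address are assumptions made for this illustration. External entities are left out, since they depend on files outside the document.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE address [
  <!ENTITY iit "Indian Institute of Technology">
]>
<address>&iit; Kharagpur</address>"""

# The parser replaces &iit; with the literal text from the declaration.
expanded = ET.fromstring(DOC).text
print(expanded)

# A reference to an entity that was never declared is an error.
try:
    ET.fromstring("<address>&nowhere; Kharagpur</address>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
```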

Notation Declarations

• They identify specific types of external binary data.
• This information is forwarded to the processing application.
• Example:
<!NOTATION GIF87A SYSTEM "GIF">

Linking Documents in XML

• The XPointer and XLink specifications provide a standard linking model for XML.
• We look into some of the features of XLink.
  ◦ It gives us control over the semantics of the link.
  ◦ It introduces the concept of extended links, which can involve more than two resources.
• XML processors identify links by looking for the attribute "xml:link" (the final XLink 1.0 Recommendation uses the attribute xlink:type for this purpose).

Simple Links

• A simple link strongly resembles an HTML <A> link.
<link xml:link="simple"
      href="http://www.iitkgp.ac.in">
  Our Institute Home page </link>
• The simple link identifies a link between
two resources, one of which is the content
of the linking element itself.
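
A sketch of how an application might recognize the simple link above, using Python's standard xml.etree.ElementTree module. Because the xml prefix is bound to a fixed namespace, ElementTree exposes the xml:link attribute under a namespace-qualified key. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"

doc = """<link xml:link="simple"
      href="http://www.iitkgp.ac.in">
  Our Institute Home page </link>"""

elem = ET.fromstring(doc)

kind = elem.get(f"{{{XML_NS}}}link")   # the link type
target = elem.get("href")              # the remote resource
label = elem.text.strip()              # the local resource: the element content

print(kind, target, label)
```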

Extended Links

• They allow us to express relationships between more than two resources.
<elink xml:link="extended">
  <locator xml:link="locator" href="text.htm">
    Some text here </locator>
  <locator xml:link="locator" href="face.jpg">
    Photo of the face </locator>
  ……..
</elink>
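
A sketch of collecting the resources that take part in an extended link, using Python's standard xml.etree.ElementTree module. Only the two locators shown on the slide are included; the elided part of the example is left out.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified key under which ElementTree exposes xml:link.
LINK_ATTR = "{http://www.w3.org/XML/1998/namespace}link"

doc = """<elink xml:link="extended">
  <locator xml:link="locator" href="text.htm">Some text here</locator>
  <locator xml:link="locator" href="face.jpg">Photo of the face</locator>
</elink>"""

root = ET.fromstring(doc)

# Each locator child names one resource that participates in the link.
resources = [
    (child.get("href"), child.text.strip())
    for child in root
    if child.get(LINK_ATTR) == "locator"
]
print(resources)
```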

Issue of White Space

• By default, white space in an XML document is not significant.
• We can change this:
  ◦ The special attribute xml:space can be used to specify that white space is significant.
    ▪ On any element that includes the attribute specification xml:space="preserve", all white space is significant.
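
The parser hands all white space to the application, so honoring xml:space is the application's job. The sketch below shows one way an application might do that; the helper element_text is hypothetical, and it checks only the element itself, not xml:space values inherited from ancestors.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified key under which ElementTree exposes xml:space.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def element_text(elem: ET.Element) -> str:
    """Return elem's text, collapsing white space unless xml:space="preserve"."""
    text = elem.text or ""
    if elem.get(XML_SPACE) == "preserve":
        return text
    return " ".join(text.split())

plain = ET.fromstring("<p>  two   words  </p>")
kept = ET.fromstring('<p xml:space="preserve">  two   words  </p>')

print(repr(element_text(plain)))
print(repr(element_text(kept)))
```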

Including a DTD

<?xml version="1.0" standalone="no"?>

<!DOCTYPE chapter SYSTEM "mybook.dtd" [
………
………
]>

<chapter>
……..
……..
</chapter>

Validity of XML Documents

• Two categories of XML documents:
  ◦ Well-formed
    ▪ The document obeys the syntax of XML.
    ▪ It can be parsed.
  ◦ Valid
    ▪ A well-formed document is valid only if it contains a proper DTD and obeys the constraints of that DTD.
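
Well-formedness can be tested with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is a non-validating parser: it can tell whether a document is well-formed, but checking validity against a DTD needs a validating parser, which is not shown here. The function name is_well_formed is illustrative.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text obeys XML syntax, i.e. it can be parsed."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed("<city>Kharagpur</city>"))   # matching tags
print(is_well_formed("<city>Kharagpur</City>"))   # case mismatch in end tag
print(is_well_formed("<a><b></a></b>"))           # improperly nested tags
```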

Standard XML Languages

• Synchronized Multimedia Integration Language (SMIL)
  ◦ An XML language for combining audio, video, text and graphics in a precise, synchronized fashion.
• Scalable Vector Graphics (SVG)
  ◦ A language for specifying two-dimensional graphics in XML.
• Mathematical Markup Language (MathML)
  ◦ An XML application for describing mathematical notation and capturing both its structure and its content.

• Wireless Markup Language (WML)
  ◦ An XML application for marking up documents to be delivered to handheld devices.
• Chemical Markup Language (CML)
  ◦ Used for managing and presenting molecular and technical information.
• Open Financial Exchange (OFX)
  ◦ An XML application for describing financial transactions that take place over the Internet.

To Summarize

• We have discussed most of the major features of XML.
• Details and complete examples were beyond the scope of the discussion.
• With this background, XML documents can be interpreted and understood without much difficulty.

SOLUTIONS TO QUIZ QUESTIONS ON LECTURE 15

Quiz Solutions on Lecture 15

1. What are the HTML tags associated with table definitions?
   <TABLE>, <TH>, <TD>, <TR>

2. How do you specify table entries spanning multiple columns?
   By using the colspan attribute associated with the <td> tag; the rowspan attribute does the same for rows.


3. What is the purpose of the <FRAMESET> tag?
   It is used to define a collection of frames. The <FRAME> tag can be embedded inside it.

4. What is the purpose of the <NOFRAMES> tag?
   To handle browsers that do not support frames.


5. What does "*" signify when specifying the width/height of a frame?
   "*" specifies a relative value with respect to the available space.

6. What does "%" signify when specifying the width/height of a frame?
   It specifies a percentage of the available space.


7. What is the inline style for specifying style sheets? Give an example.
   The style is specified "in-line", as part of the same document.
   <H2 style="color: blue"> This will appear as blue. </H2>

8. What is the external style for specifying style sheets?
   All styles exist in a separate document, a link to which is specified.

QUIZ QUESTIONS ON LECTURE 16

Quiz Questions on Lecture 16

1. What is a markup language?
2. What are the three main specifications defining XML?
3. Give an example of an XML element. How can an empty element be specified?
4. What is an XML attribute? Give an example.
5. Using entity references, how will the string "Hello ma'm" be represented?
6. How do you insert comments in XML?


7. Why is the CDATA section used?
8. What does an element type declaration do?
9. What does an attribute list declaration do?
10. Give an example of a simple link.
11. How do you specify extended links in XML?
12. How do you retain white space in the document?
