Overview
• What is XML?
• Use of XML
• Difference between HTML & XML
• Rules of XML
• Encoding
• Example of XML Document
• XML Trees
• DTD
• XML Schema
• Benefits of XML
• Obstacles of XML
• References
What is XML?
• A family of technologies:
- XML 1.0
- XLink
- XPointer & XFragments
- CSS, XSL, XSLT
- DOM
- XML Namespaces
- XML Schemas
XML
XML Introduction
• XML stands for eXtensible Markup Language.
• XML is designed to transport and store data.
• XML is important to know, and very easy to learn.
• XML is a markup language much like HTML.
• XML was designed to carry data, not to display
data.
• Tags are added to the document to provide
extra information about the data they enclose.
XML Introduction (cont.)
• XML and HTML have a similar syntax;
both are derived from SGML.
• An XML document usually resides in its own file,
conventionally with an '.xml' extension.
• XML has been a W3C Recommendation since 1998.
• It was developed by a W3C working group chaired by
Jon Bosak of Sun Microsystems.
Why is XML used?
• XML documents are used to transfer data
from one place to another, often over the
Internet.
• XML-based languages (vocabularies) are designed
for particular applications.
• One is RSS (Rich Site Summary or Really
Simple Syndication). It is used to send
breaking news bulletins from one web site
to another.
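As a rough illustration of how a program might consume such a feed, the sketch below parses a small hand-written RSS 2.0 fragment with Python's standard `xml.etree.ElementTree` module. The feed title, URLs, and headlines are made up for the example, and `headlines` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 fragment; the feed title, URLs, and headlines are made up.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Made-up headlines for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def headlines(rss_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in headlines(RSS_SAMPLE):
    print(title, "->", link)
```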
Why is XML used? (cont.)
• A number of fields have their own XML-based languages.
These include chemistry, mathematics, and
book publishing.
• Many of these languages are published as open
specifications, some of them by the W3C, and are
available for anyone's use.
Quick Comparison
HTML
• uses tags and attributes.
• content and formatting
can be placed together
<p><font face="Arial">text</font></p>
• tags and attributes are
pre-determined and rigid.
XML
• uses tags and attributes.
• content and format are
separate, formatting is
contained in a stylesheet.
• allows user to specify
what each tag and
attribute means.
The Basic Rules
• Tags are enclosed in angle brackets.
• Tags come in pairs with start-tags and end-tags.
• Tags must be properly nested.
– <name><email>…</name></email> is not
allowed.
– <name><email>…</email></name> is
allowed.
• Empty elements, which have no end-tag, must be terminated
by a '/' before the closing bracket.
– <br /> is an example from XHTML.
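One way to see these rules in action is to hand small fragments to a parser and check whether it accepts them. The sketch below uses Python's standard `xml.etree.ElementTree`; `is_well_formed` is a hypothetical helper name, and the fragments are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags.
print(is_well_formed("<name><email>a@b.c</email></name>"))
# Overlapping tags: </name> arrives before </email>.
print(is_well_formed("<name><email>a@b.c</name></email>"))
# Empty element terminated with '/'.
print(is_well_formed("<p>line<br /></p>"))
# Empty element left unterminated.
print(is_well_formed("<p>line<br></p>"))
```

The second and fourth fragments are rejected because a start-tag is never matched by its own end-tag.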
The Basic Rules (cont.)
• Tags are case sensitive.
– <address> is not the same as <Address>.
• Names beginning with "xml" (in any combination of upper
and lower case) are reserved and may not be used for your own tags.
• Tag names may not contain '<' or '&'. In text content these
characters must be written as &lt; and &amp;.
• Tag names resemble Java identifiers, except that hyphens,
periods, and a colon (reserved for namespace prefixes) are also allowed.
They must begin with a letter or an underscore and may not
contain white space.
• Documents must have a single root element that
encloses all other elements.
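A short sketch of two of these rules, using Python's standard `xml.etree.ElementTree`: lookups by tag name are case sensitive, and the serializer escapes '&' and '<' in text content. The element names and text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Tag names are case sensitive: <Address> is a different element from <address>.
root = ET.fromstring("<address><Address>uppercase A</Address></address>")
lower = root.find("address")   # no child is named 'address'
upper = root.find("Address")   # the child is named 'Address'
print(lower, upper.text)

# '&' and '<' in text content are escaped when the element is serialized.
note = ET.Element("note")
note.text = "AT&T says 1 < 2"
serialized = ET.tostring(note, encoding="unicode")
print(serialized)
```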
Encoding
• XML uses Unicode to encode characters.
• Unicode text can be stored in several encodings. The most
common one used in the West is UTF-8.
• UTF-8 is a variable-length encoding. Each character is
encoded in 1 to 4 bytes.
• The first 128 characters in Unicode are ASCII, and UTF-8
encodes each of them in a single byte.
• Unicode code points 128 to 255 cover some of the more common
characters used in western Europe, such as ã, á, å, or ç.
UTF-8 encodes each of these in 2 bytes.
<?xml version="1.0" encoding="UTF-8"?>
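The variable length of UTF-8 can be checked directly by encoding a few characters and counting the bytes. The four sample characters below are chosen for illustration (an ASCII letter, ç, the euro sign, and an emoji).

```python
# One sample character for each UTF-8 length: ASCII letter, c-cedilla,
# euro sign, and an emoji outside the Basic Multilingual Plane.
samples = ["A", "\u00e7", "\u20ac", "\U0001F600"]

byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} {ch!r}: {n} byte(s)")
```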
Walking through an Example
<?xml version="1.0"?>
<address>
  <name>
    <first>Alena</first>
    <last>Lee</last>
  </name>
  <email>alena@aaol.com</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1978</year>
    <month>09</month>
    <day>17</day>
  </birthday>
</address>
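• Root element: address
• Parent node of the first element: name
• Child nodes of the name element: first and last
• Siblings: elements that share a parent, such as first and last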
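The relationships labelled in this example can be explored programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the variable names are chosen for the example only.

```python
import xml.etree.ElementTree as ET

ADDRESS_XML = """<address>
  <name><first>Alena</first><last>Lee</last></name>
  <email>alena@aaol.com</email>
  <phone>123-45-6789</phone>
  <birthday><year>1978</year><month>09</month><day>17</day></birthday>
</address>"""

root = ET.fromstring(ADDRESS_XML)            # the root element
name = root.find("name")                      # parent of <first> and <last>

children_of_root = [child.tag for child in root]
siblings = [child.tag for child in name]      # <first> and <last> share a parent
first_name = root.findtext("name/first")      # simple path from the root

print(root.tag, children_of_root, siblings, first_name)
```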
XML Files are trees
address
  name
    first
    last
  email
  phone
  birthday
    year
    month
    day
XML Trees
• An XML document has a single root node.
• The tree is a general ordered tree.
– A parent node may have any number of
children.
– Child nodes are ordered, and may have siblings.
• Pre-order traversals are usually used for getting
information out of the tree.
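A pre-order traversal visits a node before its children, and the children from left to right. The sketch below writes that out as a small recursive generator (`preorder` is a hypothetical name) over the address tree, stripped of its text for brevity.

```python
import xml.etree.ElementTree as ET

DOC = ("<address><name><first/><last/></name><email/><phone/>"
       "<birthday><year/><month/><day/></birthday></address>")


def preorder(node):
    """Visit the node first, then each child subtree from left to right."""
    yield node.tag
    for child in node:
        yield from preorder(child)


root = ET.fromstring(DOC)
visited = list(preorder(root))
print(visited)
```

ElementTree's own `Element.iter()` walks the tree in the same document order, so `[e.tag for e in root.iter()]` gives the same list.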
DTD (Document Type Definition)
• A DTD describes the tree structure of a document
and something about its data.
• There are two data types:
1. PCDATA: parsed character data, used for the text
content of elements.
2. CDATA: character data that is not usually parsed,
used for attribute values.
• A DTD determines how many times a node may
appear, and how child nodes are ordered.
DTD for address example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
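Python's standard library does not validate documents against a DTD, so the sketch below is only a hand-rolled approximation of what this DTD enforces: the content models are copied by hand into a dictionary, and a recursive check compares each element's children with the required sequence. `CONTENT_MODEL` and `conforms` are hypothetical names, and the check handles strict sequences only (no `*`, `+`, `?`, or `|`).

```python
import xml.etree.ElementTree as ET

# Hand-copied from the DTD above: element name -> required child sequence.
# An empty tuple stands in for (#PCDATA): text only, no child elements.
CONTENT_MODEL = {
    "address": ("name", "email", "phone", "birthday"),
    "name": ("first", "last"),
    "birthday": ("year", "month", "day"),
    "first": (), "last": (), "email": (), "phone": (),
    "year": (), "month": (), "day": (),
}


def conforms(element):
    """Check that the element and its descendants match CONTENT_MODEL."""
    expected = CONTENT_MODEL.get(element.tag)
    if expected is None:                      # element not declared at all
        return False
    if tuple(child.tag for child in element) != expected:
        return False                          # wrong children or wrong order
    return all(conforms(child) for child in element)


good = ET.fromstring("<name><first>Alena</first><last>Lee</last></name>")
bad = ET.fromstring("<name><last>Lee</last><first>Alena</first></name>")
print(conforms(good), conforms(bad))
```

The second fragment fails because the DTD requires first to come before last.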
XML Schema
• Schemas are themselves XML documents.
• They were standardized after DTDs and provide
more information about the document.
• They have a number of data types including string,
decimal, integer, boolean, date, and time.
• They divide elements into simple and complex
types.
• They also determine the tree structure and how
many children a node may have.
Schema for a simplified address example
• Simplified: name is declared as a single string and birthday as a
single xs:date, unlike the nested elements in the earlier example.
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
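Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below does not validate anything (Python's standard library has no XML Schema validator); it only lists the elements this schema declares and their types. The XML declaration is omitted from the string for brevity.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"   # the XML Schema namespace

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

root = ET.fromstring(SCHEMA)

# ElementTree spells namespaced tags as '{namespace-uri}localname'.
declared = {el.get("name"): el.get("type")
            for el in root.iter(f"{{{XS}}}element")}

for name, type_name in declared.items():
    print(name, "->", type_name or "(complex type)")
```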
Benefits of XML
• XML is text (Unicode) based.
– Compresses well, although the raw markup is verbose.
– Can be transmitted over any protocol that carries text.
• One XML document can be displayed differently in
different media.
– HTML, video, CD, DVD.
– You only have to change the XML document in order to
change all the rest.
• XML documents can be modularized. Parts can be
reused.
• Easy to understand for human users.
• Very expressive (semantics along with the data).
• Well structured, easy to read and write from programs.
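To illustrate reading and writing from programs, the sketch below builds part of the address document in code, serializes it, and parses it back, using Python's standard `xml.etree.ElementTree`. The variable names are chosen for the example only.

```python
import xml.etree.ElementTree as ET

# Write: build the tree element by element.
address = ET.Element("address")
name = ET.SubElement(address, "name")
ET.SubElement(name, "first").text = "Alena"
ET.SubElement(name, "last").text = "Lee"
ET.SubElement(address, "email").text = "alena@aaol.com"

xml_text = ET.tostring(address, encoding="unicode")
print(xml_text)

# Read: parse the serialized text back into a tree.
round_trip = ET.fromstring(xml_text)
print(round_trip.findtext("name/last"))
```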
Obstacles of XML
• XML syntax is verbose compared with
other text-based data
transmission formats.
• No intrinsic data type support.
• XML syntax is redundant.
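The lack of intrinsic data types shows up as soon as a program reads values: without a schema, everything arrives as text and must be converted by hand. A small sketch using the birthday element from the earlier example:

```python
import xml.etree.ElementTree as ET
from datetime import date

birthday = ET.fromstring(
    "<birthday><year>1978</year><month>09</month><day>17</day></birthday>"
)

raw_month = birthday.findtext("month")   # the string '09', not the number 9

# The application has to convert each piece of text itself.
born = date(
    int(birthday.findtext("year")),
    int(raw_month),
    int(birthday.findtext("day")),
)
print(repr(raw_month), born.isoformat())
```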
References
• W3Schools Online Web Tutorials,
http://www.w3schools.com.
• www.tutorialpoint.com
• www.basicxml.com
• www.slideshare.com