WAP UNIT I - Merged-1
KRISHNAGIRI
PAPER NAME: WAP AND XML
SEMESTER - III
SYLLABUS
Unit I
Overview of WAP: WAP and the wireless world – WAP application architecture – WAP
internal structure – WAP versus the Web – WAP 1.2 – WTA and push features. Setting up
WAP: Available software products – WAP resources – The Development Toolkits.
Unit II
WAP gateways: Definition – Functionality of a WAP gateway – The Web model versus
the WAP model – Positioning of a WAP gateway in the network – Selecting a WAP gateway.
Basic WML: Extensible markup language – WML structure – A basic WML card – Text
formatting – navigation – Advanced display features.
Unit III
Interacting with the user: Making a selection – Events – Variables – Input and parameter
passing. WML Script: Need for WML script – Lexical Structure – Variables and literals –
Operators – Automatic data type conversion – Control Constructs – Functions – Using the standard
libraries – programs – Dealing with Errors.
Unit IV
XML: Introduction to XML: An Eagle’s Eye view of XML – XML Definition – Life of an
XML Document – Related Technologies – An introduction to XML Applications – XML
Applications – XML for XML – First XML Documents and Structuring Data: Examining the
Data XMLizing the data – The advantages of the XML format – Preparing a style sheet for
Document Display.
Unit V
Attributes, Empty Tags and XSL: Attributes – Attributes Versus Elements – Empty Tags
– XSL – Well formed XML documents – Foreign Languages and Non Roman Text – Non
Roman Scripts on the Web – Scripts, Character sets, Fonts and Glyphs – Legacy character sets –
The Unicode Character set – Procedure to Write XML Unicode.
Text Books:
1) For Unit I, II, III: Charles Arehart and Others, “Professional WAP with WML, WMLScript,
ASP, JSP, XML, XSLT, WTA, Push and VoiceXML”, Shroff Publishers and Distributors Pvt.
Ltd, 2000.
2) For Unit IV & V: Elliotte Rusty Harold, “XML Bible”, Books India (P) Ltd, 2000.
UNIT : I
The WAP standard represents the first successful attempt to establish a broadly accepted
environment for delivering information, data and services to both enterprise and consumer users
over wireless networks.
Over the years, several companies have attempted to establish standards that would drive
widespread commercial adoption of wireless data.
Several attempts have been made to enhance the data capabilities of these networks.
Several companies have deployed wireless middleware solutions that extend the transport
protocols in an effort to improve throughput, reliability, or user experience. Middleware
solutions include the following offerings:
Secureway Wireless Gateway from IBM - Software on the client and the server
transparently manages the use of the air link beneath any standard TCP/IP
applications. Optimizations include data compression, packet encryption,
faster reconnection after shutting down the link, and delayed acknowledgement.
ExpressQ from NetTech Systems (now BroadBeam) – It provides
asynchronous messaging, push, encryption and roaming capabilities on a variety
of networks and devices.
Handheld Device Markup Language (HDML) and Handheld Device
Transport Protocol (HDTP) from Unwired Planet (now Phone.com) – This
technology includes a micro-browser and optimized protocol stack for supporting
web access from mobile phones.
AirMobile from Motorola – provides optimized access to Lotus Notes and cc:Mail
over wireless networks.
Narrowband Sockets (NBS) from Nokia and Intel – provides optimized data
transport services for using UDP over circuit-switched data and GSM Short
Message Service (SMS).
Data-Optimized Networks
Several Wireless Networks have been deployed to specifically support wireless data:
Cellular Digital Packet Data (CDPD) – transmits digital data within an analog
voice network. It offers raw 19.2 kbps throughput with one-second latencies.
ARDIS – developed by Motorola and IBM, offers effective throughput of
approximately 2.4 kbps with four- to ten-second latencies.
RAM Mobile Data – developed by RAM, offers throughput of 4 kbps with four- to
eight-second latencies.
Ricochet offers 128 kbps throughput with one-second latencies.
- AT&T wireless services (AWS) sought to roll out wireless data services in time for
Christmas 1997.
- In June 1997, AWS hosted a meeting in Seattle, Washington, that brought together
representatives of Ericsson, Motorola, Nokia and Unwired Planet to reach
agreement on a single wireless data infrastructure standard.
- The resulting effort to develop a standard, known as the Wireless Application
Protocol (WAP), was announced on June 26, 1997.
- As of June 1997, the WAP standard was to incorporate three existing technologies:
o HDML, developed by Unwired Planet, would become common markup
language.
o NBS developed by Nokia, as part of its Smart Messaging Technology, would
become the optimized transport protocol and HDTP would become optimized
session protocol.
o Intelligent-Terminal Transfer Protocol (ITTP), developed by Ericsson, would
provide the foundation for the telephony application services.
- The layering issues were resolved in two stages. During the fall of 1997, HDTP (now
the Wireless Session Protocol [WSP]), the security layer (now Wireless Transport
Layer Security [WTLS]), and NBS (now the Wireless Transport Protocol [WTP]),
were redesigned.
[Figure: Evolution of the WAP Protocol Stack from core technologies to version 1.0. The core technologies (HDTP, the security layer, NBS) are shown evolving in fall 1997 and again in spring 1998 into WSP, WTLS, WTP/D, WTP/T and WTP/C.]
Evolution of the WAP standard
By late 1998, several errors and ambiguities had been found in the Version 1.0
WAP standard.
In May 1999, Version 1.1 was adopted for use in all initial commercial
deployments.
Version 1.2 of the standard was formally adopted in December 1999, and a
maintenance release was adopted in June 2000 to improve interoperability.
MARKET CONVERGENCE
This convergence (union) between the wireless and wired worlds leads to the
notion of mobile internet services, which would empower the user to access the
same suite of rich value-added applications at work, at home, and on the road.
It offers: access to services at any moment in time, from any device, at any place,
instantly, and over any network.
In the context of mobile devices and the classic web, convergence encompasses
the seamless integration of heterogeneous networks, terminals, applications, and
content. Devices should be capable of accessing the internet over a variety of networks.
The browsers and applications running on mobile devices need to be easier to use
than those on the desktop.
Mobile devices have limited display sizes, keypads, and battery power.
ENABLING CONVERGENCE
The mobile internet continues to improve business’s bottom line and enhance
people’s lifestyles.
The first people to accept the new gadgets were high-end business users known as
Early Adopters. These users were willing to pay more and explore untested
technology to realize significant improvements in their productivity or achieve a
sizable competitive advantage.
Technology Drivers
Usage Drivers
Business Drivers
Productivity Applications
The mobile Internet gives users immediate access to relevant information and
allows them to subsequently take actions on that information.
Kiosk to content – News, Stock quotes, weather, traffic updates, sport scores,
directions to restaurants and ATMs, and train schedules, and other information.
Also includes searching for directory information such as that found in the white
or yellow pages.
Electronic commerce – Electronic commerce is the concept of carrying out trade
over the Internet. B2C – Business to Consumer commerce. Banking, purchase of
goods and services, day trading, bill payment and so on.
Motivation for Mobile E-Commerce:
o Generate more revenue through wider sales and distribution channel.
o Customer acquisition and retention are enhanced through repeated visits by
loyal customers.
o Brand recognition unlike any seen before.
o Reductions in the infrastructure and transactional costs.
o Improve customer access.
Mobile Commerce Applications
o Electronic banking - Mobile banking.
o Bill payment – online bill payment.
o Online trading
o Electronic purchases of goods and services
o Contracts – insurance policies, agreeing to corporate papers, etc.
Transaction Layer
Wireless Transaction Protocol (WTP). WTP runs on top of a datagram service such as the
User Datagram Protocol (UDP), which is part of the standard TCP/IP protocol suite. WTP
provides a simplified transaction protocol suitable for low-bandwidth wireless stations.
Security Layer
Wireless Transport Layer Security (WTLS). WTLS incorporates security features that are
based upon the established Transport Layer Security (TLS) protocol standard. It includes data
integrity checks, privacy, denial-of-service protection, and authentication services.
Transport Layer
Wireless Datagram Protocol (WDP). The WDP allows WAP to be bearer-independent by
adapting the transport layer of the underlying bearer. The WDP presents a consistent data format
to the higher layers of the WAP protocol stack, thereby offering the advantage of bearer
independence to application developers.
Each of these layers provides a well-defined interface to the layer above it. This means
that the internal workings of any layer are transparent or invisible to the layers above it. The
layered architecture allows other applications and services to utilize the features provided by the
WAP-stack as well. This makes it possible to use the WAP-stack for services and applications
that currently are not specified by WAP.
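The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Layer` class, the bracketed text headers and the `build_stack` helper are hypothetical and do not reflect the real wire formats of WTP, WTLS or WDP. The point is that each layer talks only to the layer directly beneath it.

```python
from typing import Optional


class Layer:
    """One protocol layer: wraps outgoing data and unwraps incoming data."""

    def __init__(self, name: str, lower: Optional["Layer"] = None):
        self.name = name
        self.lower = lower  # the layer directly beneath this one, if any

    def send(self, payload: bytes) -> bytes:
        # Prefix a toy header, then hand the result to the layer below.
        framed = f"[{self.name}]".encode() + payload
        return self.lower.send(framed) if self.lower else framed

    def receive(self, frame: bytes) -> bytes:
        # The lowest layer's header is outermost, so unwrap from the bottom up.
        data = self.lower.receive(frame) if self.lower else frame
        header = f"[{self.name}]".encode()
        if not data.startswith(header):
            raise ValueError(f"{self.name}: unexpected header")
        return data[len(header):]


def build_stack() -> Layer:
    """Build the three layers named in the text and return the top one."""
    wdp = Layer("WDP")          # transport layer (bottom)
    wtls = Layer("WTLS", wdp)   # security layer
    wtp = Layer("WTP", wtls)    # transaction layer (top of this sketch)
    return wtp


top = build_stack()
wire = top.send(b"hello")       # each layer adds its own header on the way down
original = top.receive(wire)    # each layer strips only its own header
```

Because `Layer.send` and `Layer.receive` use nothing but the interface of the layer below, one layer could be swapped for a different implementation without touching the others, which is the separability advantage listed in the next section.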
Advantages of WAP architecture
1. Layering allows the design of each protocol to evolve independently of other
protocols.
2. Layering allows subsets of the standard to be implemented.
3. Layering permits effective bridging to the Internet.
4. Layering supports the principle of separability.
5. Layering potentially allows the implementation of each layer to change independently
of the other implementation.
Disadvantage of Layering
1. Layered protocol implementations are more difficult to optimize and, once optimized,
more difficult to maintain.
2. Layered protocols tend to be less efficient.
The content stored on the servers is of various formats, but HTML is the predominant one. HTML
provides the content developer with a means to describe the appearance of a service in a flat
document structure. If more advanced features like procedural logic are needed, then scripting
languages such as JavaScript or VB Script may be utilised.
The figure below shows how a WWW client requests a resource stored on a web server. On the
Internet standard communication protocols, like HTTP and Transmission Control
Protocol/Internet Protocol (TCP/IP) are used.
The content available at the web server may be static or dynamic. Static content is
produced once and not changed or updated very often; for example, a company
presentation. Dynamic content is needed when the information provided by the
service changes more often; for example, timetables, news, stock quotes, and
account information. Technologies such as Active Server Pages (ASP), Common
Gateway Interface (CGI), and Servlets allow content to be generated dynamically.
A markup language − the Wireless Markup Language (WML) has been adapted to develop
optimized WAP applications. In order to save valuable bandwidth in the wireless network, WML
can be encoded into a compact binary format. Encoding WML is one of the tasks performed by
the WAP Gateway/Proxy.
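To see why a binary form saves bandwidth, the following Python sketch replaces each tag with a single byte. It is a toy under stated assumptions: the token values in `TAG_TOKENS`, `END_TOKEN` and `TEXT_TOKEN` are chosen for illustration and are not claimed to match the real WAP binary encoding, attributes are dropped, and the input is assumed to have its XML prolog already removed.

```python
import re

# Illustrative one-byte tokens; these values are assumptions for this sketch.
TAG_TOKENS = {"wml": 0x10, "card": 0x11, "p": 0x12}
END_TOKEN = 0x01    # marks the end of the current element
TEXT_TOKEN = 0x03   # introduces an inline, NUL-terminated text run


def encode(wml: str) -> bytes:
    """Encode a small WML fragment (no prolog, known tags only) into bytes."""
    out = bytearray()
    # Each match is either a tag (group 1) or a run of text (group 2).
    for tag, text in re.findall(r"<(/?\w+)[^>]*>|([^<]+)", wml):
        if tag.startswith("/"):
            out.append(END_TOKEN)
        elif tag:
            out.append(TAG_TOKENS[tag])   # attributes are ignored in this sketch
        elif text.strip():
            out.append(TEXT_TOKEN)
            out += text.strip().encode("utf-8") + b"\x00"
    return bytes(out)


source = "<wml><card><p>Hi</p></card></wml>"
encoded = encode(source)
saving = len(source) - len(encoded)   # bytes saved on this tiny deck
```

Most of the saving comes from the markup itself: every opening or closing tag shrinks to one byte while the text is carried unchanged.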
The user selects an option on their mobile device that has a URL with Wireless Markup
Language (WML) content.
The server processes the request just as it would any other request. If the URL refers to a
static WML file, the server delivers it. If a CGI script is requested, it is processed and the
content returned as usual.
The Web server adds the HTTP header to the WML content and returns it to the gateway.
The gateway then sends the WML response back to the phone.
The micro-browser processes the WML and displays the content on the screen.
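The request and response steps above can be simulated in memory. The sketch below is hypothetical throughout: `PAGES`, `web_server`, `gateway` and `micro_browser` are illustrative names, no real network traffic takes place, and `zlib` compression merely stands in for the gateway's compact binary encoding of WML.

```python
import re
import zlib

# Hypothetical origin-server content, keyed by URL.
PAGES = {
    "http://example.com/foo.wml": "<wml><card><p>Hello</p></card></wml>",
}


def web_server(url):
    """Return (HTTP-style headers, WML body) for the requested URL."""
    body = PAGES[url]
    headers = {"Content-Type": "text/vnd.wap.wml"}
    return headers, body


def gateway(device_request):
    """Turn the device request into a conventional fetch and encode the reply."""
    headers, body = web_server(device_request["url"])
    if headers["Content-Type"] != "text/vnd.wap.wml":
        raise ValueError("this sketch only handles WML content")
    # zlib is a stand-in for the gateway's binary encoding step.
    return zlib.compress(body.encode("utf-8"))


def micro_browser(url):
    """Request a URL through the gateway and return the text to display."""
    encoded = gateway({"url": url})
    wml = zlib.decompress(encoded).decode("utf-8")
    return re.sub(r"<[^>]+>", "", wml).strip()   # crude rendering: drop the tags


screen_text = micro_browser("http://example.com/foo.wml")
```

The phone never talks to the web server directly in this sketch; every request and response passes through `gateway`, which is where the translation and encoding happen.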
3.WAP versus The Web
Comparison of WAP 1.0, WAP 2.0 and the Conventional Web
Display
    WAP 1.0: Assumes black-and-white, limited screen display.
    WAP 2.0: Limited screen display with limited color.
    Conventional Web: Web browsers capable of rendering HTML in full color.
Content
    WAP 1.0: WML using a card-and-deck metaphor.
    WAP 2.0: XHTML Basic using a combination of card-and-deck metaphor and web pages.
    Conventional Web: HTML based on web pages.
Server
    WAP 1.0: WAP server.
    WAP 2.0: WAP server or Web server with a WAP Proxy.
    Conventional Web: Web server.
Protocols
    WAP 1.0: WAP Protocols including WSP (Wireless Session Protocol), WTP (Wireless Transaction Protocol), WTLS (Wireless Transport Layer Security), and WDP (Wireless Datagram Protocol).
    WAP 2.0: WAP Protocols or WP-HTTP (Wireless Profiled HTTP), TLS (Transport Layer Security), and WP-TCP (Wireless Profiled TCP).
    Conventional Web: HTTP over TCP/IP.
Architecture
    WAP 1.0: WAP gateway connecting devices to web servers.
    WAP 2.0: WAP gateway connecting devices to web servers.
    Conventional Web: Web servers connected to browsers.
Programmatic Control
    WAP 1.0: WMLScript.
    WAP 2.0: WMLScript.
    Conventional Web: JavaScript/ECMAScript.
4.WTA
Browsing the web using the WML browser is only one application for a
handheld device user. Say a user still wants to make phone calls and access all the
features of the mobile phone network as with a traditional mobile phone. This is
where the wireless telephony application (WTA), the WTA user agent (as shown in
Figure), and the wireless telephony application interface WTAI come in. WTA is a
collection of telephony specific extensions for call and feature control mechanisms,
merging data networks and voice networks.
● Content push: A WTA origin server can push content, i.e., WML decks or
WMLScript, to the client. A push can take place without prior client request. The
content can enable, e.g., the client to handle new network events that were
unknown before.
● Security model: Mandatory for WTA is a security model as many frauds happen
with wrong phone numbers or faked services. WTA allows the client to only
connect to trustworthy gateways, which then have to check if the servers providing
content are authorized to send this content to the client. Obviously, it is not easy to
define trustworthy in this context. In the beginning, the network operator's
gateway may be the only trusted gateway and the network operator may decide
which servers are allowed to provide content. Figure 10.30 gives an overview of
the WTA logical architecture.
The components shown are not all mandatory in this architecture; however,
firewalls or other origin servers may be useful. A minimal configuration could be a
single server from the network operator serving all clients. The client is connected
via a mobile network with a WTA server, other telephone networks (e.g., fixed
PSTN), and a WAP gateway. A WML user agent running on the client or on other
user agents is not shown here.
The client may have voice and data connections over the mobile network. Other
origin servers within the trusted domain may be connected via the WAP gateway.
A firewall is useful to connect third-party origin servers outside the trusted domain.
One difference between WTA servers and other servers besides security is the
tighter control of QoS. A network operator knows (more or less precisely) the
latency, reliability, and capacity of its mobile network and can have more control
over the behavior of the services. Other servers, probably located in the internet,
may not be able to give as good QoS guarantees as the network operator.
Similarly, the WTA user agent has a very rigid and real-time context
management for browsing the web compared to the standard WML user agent.
Figure shows an exemplary interaction between a WTA client, a WTA gateway, a
WTA server, the mobile network (with probably many more servers) and a voice
box server. Someone might leave a message on a voice box server as indicated.
Without WAP, the network operator then typically generates an SMS indicating
the new message on the voice box via a little symbol on the mobile phone.
However, it is typically not indicated who left a message, what messages are stored
etc. Users have to call the voice box to check and cannot choose a particular
message. In a WAP scenario, the voice box can induce the WTA server to generate
new content for pushing to the client. An example could be a WML deck
containing a list of callers plus length and priority of the calls. The server does not
push this deck immediately to the client, but sends a push message containing a
single URL to the client. A short note, e.g., "5 new calls are stored", could accompany
the push message. The WTA gateway translates the push URL into a service
indication and codes it into a more compact binary format. The WTA user agent
then indicates that new messages are stored. If the user wants to listen to the stored
messages, he or she can request a list of the messages.
This is done with the help of the URL. A WSP get requests the content the URL
points to. The gateway translates this WSP get into an HTTP get and the server
responds with the prepared list of callers.
After displaying the content, the user can select a voice message from the list. Each
voice message in this example has an associated URL, which can request a certain
WML card from the server. The purpose of this card is to prepare the client for an
incoming call. As soon as the client receives the card, it waits for the incoming
call. The call is then automatically accepted. The WTA server also signals the
voice box system to set up a (traditional) voice connection to play the selected
voice message. Setting up the call and accepting the call is shown using dashed
lines, as these are standard interactions from the mobile phone network, which are
not controlled by WAP.
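The voice box scenario above can be sketched as a push of a URL followed by a pull of the prepared content. Everything in this Python sketch is hypothetical: the `ServiceIndication` fields, the `PREPARED` store, the example URL and the caller entries are invented for illustration, and the call setup handled by the phone network is left out.

```python
from dataclasses import dataclass


@dataclass
class ServiceIndication:
    """What reaches the client first: a URL and a short note, not the deck."""
    url: str
    note: str


# Hypothetical content prepared by the WTA server, keyed by URL.
PREPARED = {
    "http://wta.example/voicebox/calls": ["Caller A (0:42)", "Caller B (1:10)"],
}


class WtaUserAgent:
    def __init__(self):
        self.pending = []   # service indications not yet opened by the user

    def on_push(self, indication: ServiceIndication) -> str:
        # Store the indication and return the note to show on the display.
        self.pending.append(indication)
        return indication.note

    def open_next(self):
        # Only now is the content fetched, using the pushed URL.
        indication = self.pending.pop(0)
        return PREPARED[indication.url]


agent = WtaUserAgent()
shown = agent.on_push(
    ServiceIndication("http://wta.example/voicebox/calls", "2 new calls are stored")
)
callers = agent.open_next()
```

Pushing only the URL and a short note keeps the unsolicited message small; the larger list of callers crosses the air link only if the user asks for it.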
– Event handling
– Telephony functions
– Call control
– Event processing
Security model: segregation
The client is connected via a mobile network with a WTA server, other telephone networks and
a WAP gateway.
The client may have voice and data connections over the network.
5.WAP SOFTWARE
MobileDev
MobileDev is the first Wireless Development Environment (WDE) specifically for WAP Internet
applications. Its innovative open-ended development model integrates a graphical application
mapper with a wizard interface and a rich tool set. MobileDev supports WAP technologies like
WML, HDML, Microsoft Active Server Pages (ASP), Perl and Java Server Pages (JSP).
Using the GUI application mapper to show the relationships between objects, developers can
quickly outline the components of a WAP application. Then they can take advantage of wizards
that generate Decks and Cards in both WML and HDML, and use MobileDev's code builder to
write WML/HDML syntax that complements the wizard-generated code. The integration of the
application mapper with wizards and the code builder provides a seamless WDE that delivers
results fast.
MobileDev comes complete with its own powerful integrated runtime engine, MobileDev Server
Script. Server Script can quickly create prototypes or build full-blown business WAP
applications that can be natively connected to an RDBMS. MobileDev WDE is also designed to
support development in ASP, JSP, Perl or other template-based server technologies.
Nokia Mobile Internet Toolkit 3.0
This toolkit provides a set of tools for preparing mobile
Internet services and includes many features and components such as: device simulators; WML
and WMLScript encoders; WML, WMLScript, WBMP [7], push, multipart content, XHTML
[10], and CSS editors; WAP protocol stack, HTTP and file access modules; debugging views;
WAP server simulator, a WAP gateway for local toolkit use only; Wireless Telephony
Application Interface (WTAI) [8] features; Nokia SoftID [8], a Wireless Identity Module (WIM)
[8] card simulation; basic WML and XHTML sample applications and source code. Content that
users create is encoded if necessary and usually immediately displayed. Compilation status
messages are displayed at the bottom of the editing window.
The toolkit is also designed to assist developers during the process of developing a push
application. A push message has both a binary (encoded) and a text (source) form. There are
facilities for helping the developer observe and debug the effects of various pushed content on
the client system: displaying a push message on the simulator, effects on the cache, when URLs
get loaded, and so on. The NMIT also provides a mechanism for simulating phone calls. That
mechanism is based on WTAI. The Nokia Mobile Internet Toolkit offers a wide range of
configuration and customization functions, which are easily accessible through numerous
intuitive menus. These features allow customizing device settings, connections, toolkit
preferences, etc.
Openwave SDK WAP Edition 5.0
The user interface is very intuitive, typical for programs
running under Windows operating systems. The various windows displayed within the IDE can
be moved and docked, or can be displayed outside the IDE window.
Figure 1. Openwave SDK WAP Edition 5.0 – Main Window
The two main windows are the editor window and the browser simulator window. There is also
a group of debugging windows: variables; cookies; history; HTTP response; browser output.
Together with the Openwave SDK program files, basic documentation (Getting started guide,
WML/WMLScript guides and references) can be installed. The information included is
sufficient even for absolute
beginners in this area. Experienced users can explore the Openwave Developer site. The
Openwave Software Development Kit WAP Edition 5.0 is a user-friendly application, easy to
learn for beginners. The program itself is stable; it has no extraordinary hardware requirements
and offers a huge range of customization possibilities. All three toolkits can be downloaded for
free. There are dedicated websites for network operators, developers, content providers,
technology suppliers and others interested in mobile Internet services and applications,
innovative mobility projects, etc. [1], [2], [3].
Embedded in the WAP browser is a WMLScript bytecode Virtual Machine that runs compiled
WMLScript applications. The WAP Developer Toolkit also features a WML Encoder/Decoder
for converting WML elements into bytecoded format so as to achieve optimal use of low-
bandwidth wireless communication channels.
The conversion is performed off-line and the encoded WML modules can then be made available
for downloading by WML browsers. Included is a WMLScript compiler and assembler for
converting WMLScript source code into the compressed bytecode format that is run by WAP
client devices.
Developers can use WMLScript Debug Library for debugging of their WAP applications during
development.
UNIT II
WAP GATEWAY:
The Wireless Application Protocol (WAP) gateway is part of a protocol for protected use
of the Internet. In WAP, requests to access a website are sent through a WAP gateway for
security purposes.
A WAP gateway is a device for converting the TCP/IP protocols to the different
WAP protocols and vice versa. It is able to translate HTML to WML. The WAP Gateway is
often a proxy server, meaning that it acts both as server and client for the purpose of making
requests on behalf of other clients.
A Wireless Application Protocol gateway is software that decodes and encodes requests and
responses between smartphone microbrowsers and the Internet. It decodes the encoded WAP
requests from the microbrowser and sends the HTTP requests to the Internet or to a local
application server. It encodes WML and HDML data returning from the Web for transmission to
the microbrowser in the handset.
This diagram depicts Openwave's Mobile Access Gateway which combines the WAP
gateway with an application server that hosts WAP pages and provides numerous services (push,
fax, etc.) along with encoding and decoding.
A WAP gateway sits between mobile devices using the Wireless Application Protocol
(WAP) and the World Wide Web, passing pages from one to the other much like a proxy. It
translates pages into a form suitable for the mobiles, for instance using the Wireless Markup
Language (WML).
The WAP model closely resembles the Internet model of working. On the Internet, a
WWW client requests a resource stored on a web server by identifying it using a unique URL,
that is, a text string constituting an address to that resource. Standard communication protocols,
like HTTP and Transmission Control Protocol/Internet Protocol (TCP/IP) manage these requests
and transfer of data between the two ends. The content that is transferred can either be static like
html pages or dynamic like Active Server Pages (ASP), Common Gateway Interface (CGI), and
Servlets.
The following figure helps draw a parallel to the Internet protocols. You can see
how WAP extends or reuses Internet protocols to achieve mobile Internet access.
The strength of WAP (some call it the problem with WAP) lies in the fact that it very
closely resembles the Internet model. In order to accommodate wireless access to the
information space offered by the WWW, WAP is based on well-known Internet technology that
has been optimized to meet the constraints of a wireless environment. Corresponding to HTML,
WAP specifies a markup language adapted to the constraints of low bandwidth available with the
usual mobile data bearers and the limited display capabilities of mobile devices - the Wireless
Markup Language (WML). WML offers a navigation model designed for devices with small
displays and limited input facilities (no mouse and limited keyboard). WAP also provides a
means for supporting more advanced tasks, comparable to those solved by using
for example JavaScript in HTML. The solution in WAP is called WMLScript.
WAP MODEL.
This process is hidden from the phone, so it may access the page in the same way as a
browser accesses HTML, using a URL (for example, http://example.com/foo.wml), provided the
mobile phone operator has not specifically prevented this.
WAP gateway software encodes and decodes requests and responses between the
smartphone's microbrowser and the Internet. It decodes the encoded WAP requests from the
microbrowser and sends the HTTP requests to the Internet or to a local application server. It also
encodes the WML and HDML data returning from the web for transmission to the microbrowser
in the handset.
WAP GATEWAY
The position of a mobile phone can be located using information from the radio network
it uses. This paper presents two applications that make use of Wireless Application Protocol
(WAP) and mobile positioning technology to provide location-based information to the user in
real time.
The applications use the user's position to generate dynamic information. Besides, both
applications also have a web interface to perform administering tasks. The first application
allows the user to locate the nearest resource (pizza restaurant, theater, gas station, and so on) to
his/her position.
The second application is a data acquisition system and it demonstrates how the user can
directly introduce data into Geographical Information Systems (GIS) from a WAP phone. Our
proposal uses existing radio network and does not need the use of Global Positioning System
(GPS).
SELECTING A WAP GATEWAY
BASIC WML:
WML is the markup language defined in the WAP specification. WAP sites are written
in WML, while web sites are written in HTML. WML is very similar to HTML.
Extensible Markup Language (XML) is a markup language and file format for storing,
transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in
a format that is both human-readable and machine-readable.
Editors: Tim Bray, Jean Paoli, Michael Sperberg-McQueen, Eve Maler, François Yergeau, John
W. Cowan
Abbreviation: XML
Organization: World Wide Web Consortium (W3C)
Latest version: 1.1 (2nd ed.); September 29, 2006
Domain: Serialization
XML (Extensible Markup Language) is used to describe data. The XML standard is a
flexible way to create information formats and electronically share structured data via the public
internet, as well as via corporate networks.
XML's primary function is to create formats for data that is used to encode information
for documentation, database records, transactions and many other types of data. The same XML
data may be used to generate different types of content, including web, print and mobile
content.
Like Hypertext Markup Language (HTML), which is also based on the SGML standard,
XML documents are stored as plain-text files (ASCII or, more generally, a Unicode encoding
such as UTF-8) and can be edited using any text editor.
XML's primary function is to provide a "simple text-based format for representing structured
information," according to the World Wide Web Consortium (W3C), the standards body for the
web, including for the following:
technical documentation;
books;
transactions; and
invoices.
XML enables sharing of structured information among and between the following:
W3C defines the XML standard and recommends its use for web content. While XML and
HTML are both based on the SGML standard, W3C has also defined the XHTML and XHTML5
document formats that mirror, respectively, the HTML and HTML5 standards for web content.
XML works by providing a predictable data format. XML is strict on formatting; if the
formatting is off, programs that process or display the encoded data will return an error.
<warning>
<para>
<emphasis type="bold">May cause serious injury</emphasis>
Exercise extreme caution as this procedure could result in serious injury or death if
precautions are not taken.
</para>
</warning>
In this example, this data is interpreted and displayed in different ways, depending on the
form factor of the technical documentation. On a webpage, this element could be displayed in the
following way:
The same XML code is rendered differently on an appliance user interface (UI) or in
print. This element could be interpreted to display the text tagged as emphasis differently, such
as having it appear in red and with flashing highlights. In printed form, the content might be
provided in a different font and format.
XML documents do not define presentation, and there are no default XML tags. Most
XML applications use predefined sets of tags that differ, depending on the XML format. Most
users rely on predefined XML formats to compose their documents, but users may also define
additional XML elements as needed.
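The strictness of XML formatting can be checked with Python's standard `xml.etree.ElementTree` parser. This is a minimal sketch: the helper name `is_well_formed` is an assumption, and the warning text is shortened from the example above.

```python
import xml.etree.ElementTree as ET

WELL_FORMED = """<warning>
<para>
<emphasis type="bold">May cause serious injury</emphasis>
Exercise extreme caution.
</para>
</warning>"""

# Same content, but with the closing </para> tag removed.
MALFORMED = WELL_FORMED.replace("</para>\n", "")


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


ok = is_well_formed(WELL_FORMED)      # the parser accepts this document
broken = is_well_formed(MALFORMED)    # the parser rejects this one

# The markup says what the text is (emphasis of type "bold"), not how to draw it.
emphasis = ET.fromstring(WELL_FORMED).find("para/emphasis")
kind = emphasis.get("type")
```

The parser only reports structure and attributes; how the emphasized text actually appears is left to whatever style sheet or application processes the document.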
EXAMPLE :
WML Structure
The topmost layer in the WAP (Wireless Application Protocol) architecture is made up of
WAE (Wireless Application Environment), which consists of WML and WML scripting
language.
WML is based on HDML and is modified so that it can be compared with HTML.
WML takes care of the small screen and the low bandwidth of transmission.
WAP sites are written in WML, while web sites are written in HTML.
WML is very similar to HTML. Both of them use tags and are written in plain text
format.
WML files have the extension ".wml". The MIME type of WML is "text/vnd.wap.wml".
The scripting language used with WML is WMLScript.
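A server has to send WML files with the MIME type given above, or the handset may not treat them as WML. As a minimal sketch, Python's standard `mimetypes` module can be taught the mapping; the helper name `content_type_for` and the fallback type are assumptions for illustration.

```python
import mimetypes

# Register the WML mapping in case the local MIME table does not include it.
mimetypes.add_type("text/vnd.wap.wml", ".wml")


def content_type_for(path: str) -> str:
    """Return the Content-Type a server could send for the given file path."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or "application/octet-stream"   # generic fallback


wml_type = content_type_for("index.wml")
```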
WML Versions:
The WAP Forum has released a later version, WAP 2.0. The markup language defined in
WAP 2.0 is XHTML Mobile Profile (MP). XHTML MP is a subset of XHTML. A style
sheet language called WCSS (WAP CSS) has been introduced along with XHTML MP. WCSS is a
subset of CSS2.
Most of the new mobile phone models released are WAP 2.0-enabled. Because WAP 2.0
is backward compatible to WAP 1.x, WAP 2.0-enabled mobile devices can display both
XHTML MP and WML documents.
WML 1.x is an earlier technology. However, that does not mean it is of no use, since a lot
of wireless devices that only support WML 1.x are still being used. The latest version of WML is
2.0, and it was created for backward compatibility purposes. So WAP site developers need not
worry about WML 2.0.
When a WML page is accessed from a mobile phone, all the cards in the page are
downloaded from the WAP server. So if the user goes to another card of the same deck, the
mobile browser does not have to send any requests to the server since the file that contains the
deck is already stored in the wireless device.
You can put links, text, images, input fields, option boxes and many other elements in a
card.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card id="one" title="First Card">
<p>
This is the first card in the deck
</p>
</card>
<card id="two" title="Second Card">
<p>
This is the second card in the deck
</p>
</card>
</wml>
The first line of this text says that this is an XML document and the version is 1.0. The
second line selects the document type and gives the URL of the document type definition (DTD).
One WML deck (i.e. page ) can have one or more cards as shown above. We will see
complete details on WML document structure in subsequent chapter.
Unlike HTML 4.01 Transitional, text cannot be enclosed directly in the <card>...</card>
tag pair. So you need to put content inside <p>...</p> as shown above.
Wireless devices are limited by the size of their displays and keypads. It's therefore very
important to take this into account when designing a WAP Site.
While designing a WAP site you must ensure that you keep things simple and easy to
use. You should always keep in mind that there are no standard microbrowser behaviors and that
the data link may be relatively slow, at around 10Kbps. However, with GPRS, EDGE, and
UMTS, this may not be the case for long, depending on where you are located.
The following are general design tips that you should keep in mind when designing a
service:
Keep text brief and meaningful, and as far as possible try to precode options to minimize
user data entry.
Use standard layout tags such as <big> and <b>, and logically structure your information.
Don't go overboard with the use of graphics, as many target devices may not support
them.
TEXT FORMATTING
Line Break:
The <br /> element defines a line break, and almost all WAP browsers support the line
break tag.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Line Break Example">
<p align="center">
This is a <br /> paragraph with a line break.
</p>
</card>
</wml>
Text Paragraphs:
The <p> element defines a paragraph of text, and WAP browsers always render a
paragraph on a new line. Its align attribute sets the horizontal alignment of the text,
for example center or right.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Paragraph Example">
<p align="center">
This is first paragraph
</p>
<p align="right">
This is second paragraph
</p>
</card>
</wml>
WML Tables:
The <table> element, along with <tr> and <td>, is used to create a table in WML. WML
does not allow the nesting of tables.
Attribute  Value      Description
align      L, C or R  Specifies the horizontal text alignment of the columns. Assign one
                      letter per column (three letters for a three-column table). Each
                      letter can be L (left), C (center), or R (right).
For example, if you want the following settings to be applied to your
table:
First column: left-aligned
Second column: center-aligned
Third column: right-aligned
Then you should set the value of the align attribute to LCR.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="WML Tables">
<p>
<table columns="3" align="LCR">
<tr>
<td>Col 1</td>
<td>Col 2</td>
<td>Col 3</td>
</tr>
<tr>
<td>A</td>
<td>B</td>
<td>C</td>
</tr>
<tr>
<td>D</td>
<td>E</td>
<td>F</td>
</tr>
</table>
</p>
</card>
</wml>
Preformatted Text:
The <pre> element is used to specify preformatted text in WML. Preformatted text is text
that is displayed exactly as it is typed in the WML document.
This tag preserves all the white space enclosed inside it. Make sure you do not
put this tag inside <p>...</p>.
The <pre> element supports following attributes:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Preformatted Text">
<pre>
This is preformatted
text and will appear
as it is.
</pre>
</card>
</wml>
You can enclose text or an image, along with a task tag, inside the
<anchor>...</anchor> tag pair.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Anchor Element">
<p>
<anchor>
<go href="nextpage.wml"/>
Next Page
</anchor>
</p>
<p>
<anchor>
<prev/>
Previous Page
</anchor>
</p>
</card>
</wml>
The <a>...</a> tag pair can also be used to create an anchor link and is the preferred
way of creating links.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="A Element">
<p> Link to Next Page:
<a href="nextpage.wml">Next Page</a>
</p>
</card>
</wml>
END OF UNIT II
UNIT-III
INTERACTION WITH THE USER
MAKING A SELECTION:
The <select>...</select> WML elements are used to define a selection list and the
<option>...</option> tags are used to define an item in a selection list. Items are presented as
radio buttons in some WAP browsers. The <option>...</option> tag pair should be enclosed
within the <select>...</select> tags.
Attributes:
Attribute  Value        Description
iname      text         Names the variable that is set with the index result of the selection
multiple   true, false  Sets whether multiple items can be selected. Default is "false"
name       text         Names the variable that is set with the result of the selection
tabindex   number       Sets the tabbing position for the select element
value      text         Sets the default value of the variable in the "name" attribute
Example:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Selectable List">
<p> Select a Tutorial :
<select>
<option value="htm">HTML Tutorial</option>
<option value="xml">XML Tutorial</option>
<option value="wap">WAP Tutorial</option>
</select>
</p>
</card>
</wml>
When you load this deck, the browser shows the selection list. Once you highlight the list and
press enter, it displays the options.
If you want to let the user select multiple options, set the multiple attribute to true as
follows:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card title="Selectable List">
<p> Select a Tutorial :
<select multiple="true">
<option value="htm">HTML Tutorial</option>
<option value="xml">XML Tutorial</option>
<option value="wap">WAP Tutorial</option>
</select>
</p>
</card>
</wml>
EVENTS
WML also supports events, and you can specify an action to be taken whenever an
event occurs. This action can be written in WMLScript or simply in WML.
WML supports the following four event types:
onenterbackward: This event occurs when the user reaches a card by normal backward
navigational means. That is, the user presses the Back key on a later card and arrives back at
this card in the history stack.
onenterforward: This event occurs when the user reaches a card by normal forward
navigational means.
onpick: This is more like an attribute, but it is used like an event. This event occurs
when an item of a selection list is selected or deselected.
ontimer: This event is triggered after a given time period.
These event names are case sensitive and must be lowercase.
The <onevent>...</onevent> tags are used to create event handlers. Its usage takes the following
form:
<onevent type="event_type">
A task to be performed.
</onevent>
You can use a go, prev or refresh task inside <onevent>...</onevent> tags to handle an event.
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card id="card1" title="Card 1">
<onevent type="onenterbackward">
<go href="#card3"/>
</onevent>
<p>
<anchor>
<go href="#card2"/>
Go to card 2
</anchor>
</p>
</card>
<card id="card2" title="Card 2">
<p>
<anchor>
<prev/>
Going backwards
</anchor>
</p>
</card>
<card id="card3" title="Card 3">
<p>
Hello World!
</p>
</card>
</wml>
VARIABLES
A variable is used to store some data. You can modify or read the value of a variable during
execution. The var keyword is used to declare WMLScript variables. It should be used in the
following form (the part enclosed within brackets [] is optional):
var variable_name [= value_to_be_initialized];
Variable initialization is optional. If you do not initialize a variable, the WMLScript interpreter
will assign an empty string to it automatically, i.e. the following line of script:
var wmlscript_variable;
is equivalent to:
var wmlscript_variable = "";
You can use the var keyword once, but declare more than one variable. For example, the
following WMLScript code:
var wmlscript_variable1;
var wmlscript_variable2;
var wmlscript_variable3;
is equivalent to:
var wmlscript_variable1, wmlscript_variable2, wmlscript_variable3;
Note that in WMLScript, you must use the var keyword to declare a variable before it can be
used. This is different from JavaScript in which automatic declaration of variables is supported.
For example, the following function is NOT valid in WMLScript since
the wmlscript_variable variable has not been declared:
function wmlscript_func()
{
wmlscript_variable = "Welcome to our WMLScript tutorial";
}
WMLScript does not support global variables. Global variables are variables that can be
accessed from any function. This is different from JavaScript, in which global variables are
supported. For example, the following script is valid in JavaScript but not in WMLScript:
var wmlscript_variable;
function wmlscript_func1()
{
wmlscript_variable = "Hello";
}
function wmlscript_func2()
{
wmlscript_variable = "Welcome to our WMLScript tutorial";
}
In WMLScript, arguments are passed to functions by value, which means if you specify a
variable as the argument of a function, the value of this variable will not be affected by any
operation inside the function. This is because when an argument is passed by value, a copy of the
variable is passed to the function instead of the original variable.
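The effect of passing by value can be seen in a small sketch. The function names addOne and passByValueDemo are hypothetical and only serve as an illustration:

```wmlscript
// Receives a copy of the caller's value.
function addOne(number)
{
  number = number + 1;   // changes only the local copy
  return number;
}

extern function passByValueDemo()
{
  var count = 5;
  var bumped = addOne(count);
  // count is still 5 here, while bumped is 6
  return count + "," + bumped;   // "5,6"
}
```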
Passing arguments by reference means a reference to the variable is passed to the function
instead of a copy of the variable. The value of the variable in the calling function can be changed
by operations inside the called function.
WMLScript does not support passing arguments by reference. This creates a problem for us
since arguments have to be passed by reference in some situations. One situation is when we
want to return multiple values back to the calling function. (Remember that the return statement
can only be used to return one value, so it cannot help us in this situation.) To solve the
problem, we can imitate the "passing arguments by reference" way: return values are
assigned to the argument variables, whose values can then be read in the calling function.
The following example demonstrates how to do this. We will take a date in the MM-DD-YYYY
format (e.g. 08-30-2005) from the user and change it to a different format DD/MM/YYYY (e.g.
30/08/2005). Below is the WML document of the example:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.3//EN"
"http://www.wapforum.org/DTD/wml13.dtd">
<wml>
<card id="card1" title="WMLScript Tutorial">
<p>
Please enter a date in the MM-DD-YYYY format:<br/>
<input name="datef1"/><br/>
<a href="passByRefEg1.wmls#changeDateFormat('$(datef1)')">Run
WMLScript</a><br/><br/>
</p>
<pre>$(result)</pre>
</card>
</wml>
Select the "Run WMLScript" anchor link and the changeDateFormat() function of
the passByRefEg1.wmls file will be executed. Here is the code of the function:
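A minimal sketch of what such a script could look like. It assumes that parseDate() hands its three results back through WML variables set with WMLBrowser.setVar(). The variable names mm, dd and yyyy are made up for illustration, and this is not necessarily how the original passByRefEg1.wmls was written:

```wmlscript
// passByRefEg1.wmls (illustrative sketch, not the original listing)

// Splits a date such as "08-30-2005" and stores the three parts in the WML
// variables whose names are passed in. Passing variable names stands in for
// passing by reference, which WMLScript does not support.
function parseDate(date, monthVar, dayVar, yearVar)
{
  WMLBrowser.setVar(monthVar, String.elementAt(date, 0, "-"));
  WMLBrowser.setVar(dayVar, String.elementAt(date, 1, "-"));
  WMLBrowser.setVar(yearVar, String.elementAt(date, 2, "-"));
}

// Called from the WML deck. Converts MM-DD-YYYY to DD/MM/YYYY and
// writes the outcome to the WML variable "result" shown by the card.
extern function changeDateFormat(date)
{
  parseDate(date, "mm", "dd", "yyyy");
  var newDate = WMLBrowser.getVar("dd") + "/" +
                WMLBrowser.getVar("mm") + "/" +
                WMLBrowser.getVar("yyyy");
  WMLBrowser.setVar("result", newDate);
  WMLBrowser.refresh();   // redraw the card so $(result) is displayed
}
```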
As you can see above, we make use of the String standard library's elementAt() function
in parseDate(). It helps us break down a string using the specified delimiter. Details about it will
be mentioned in the "Getting the Element at a Certain Index in a String: elementAt() Function"
section of this tutorial. Now all you need to know is that it breaks down a date, say "08-30-
2005", into "08", "30" and "2005".
WML Script
WMLScript (Wireless Markup Language Script) is the client-side scripting language of WML
(Wireless Markup Language). A scripting language is similar to a programming language, but is
of lighter weight. With WMLScript, the wireless device can do some of the processing and
computation. This reduces the number of requests and responses to/from the server.
This chapter gives a brief description of all the important WMLScript components.
WMLScript is very similar to JavaScript. WMLScript components have almost the same
meaning as they have in JavaScript. The WMLScript program components are summarized
here.
Arithmetic Operators
Comparison Operators
Logical (or Relational) Operators
Assignment Operators
Conditional (or ternary) Operators
Control statements are used for controlling the sequence and iterations in a program.
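As a rough illustration of how control statements steer a script, the sketch below uses if...else, for and while. The function names sumTo and countDigits are hypothetical and are not part of any standard library:

```wmlscript
// Adds the integers 1..n with a for loop; returns 0 when n is less than 1.
extern function sumTo(n)
{
  var total = 0;
  var i;
  if (n < 1) {
    return 0;
  } else {
    for (i = 1; i <= n; i++) {
      total += i;
    }
  }
  return total;
}

// Counts the decimal digits of a non-negative integer with a while loop.
extern function countDigits(n)
{
  var digits = 1;
  while (n >= 10) {
    n = n div 10;   // div is WMLScript's integer division operator
    digits++;
  }
  return digits;
}
```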
User-defined functions are declared in a separate file with the extension .wmls. A function is
called using the file name, followed by a hash, followed by the function name −
maths.wmls#squar()
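A minimal sketch of what the maths.wmls file referred to above might contain. The body of squar() is an assumption, since only the call is given here:

```wmlscript
// maths.wmls
// The extern keyword makes the function callable from outside this file,
// for example from a WML deck. Functions without extern can only be
// called from within the same file.
extern function squar(number)
{
  return number * number;
}
```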
Lang − The Lang library provides functions related to the WMLScript language core.
Example Function − abs(), abort(), characterSet(), exit(), float(), isFloat(), isInt(), max(),
min(), minInt(), maxInt(), parseFloat(), parseInt(), random(), seed()
Float − The Float library contains functions that help us perform floating-point arithmetic
operations.
Example Function − sqrt(), round(), pow(), ceil(), floor(), int(), maxFloat(), minFloat()
String − The String library provides a number of functions that help us manipulate
strings.
Example Function − length(), charAt(), find(), replace(), trim(), compare(), format(),
isEmpty(), squeeze(), toString(), elementAt(), elements(), insertAt(), removeAt(),
replaceAt()
URL − The URL library contains functions that help us manipulate URLs.
Example Function − getPath(), getReferer(), getHost(), getBase(), escapeString(),
isValid(), loadString(), resolve(), unescapeString(), getFragment()
WMLBrowser − The WMLBrowser library provides a group of functions to control the
WML browser or to get information from it.
Example Function − go(), prev(), newContext(), getCurrentCard(), refresh(), getVar(), setVar()
Dialogs − The Dialogs library contains the user interface functions.
Example Function − prompt(), confirm(), alert()
Single-line comment − To add a single-line comment, begin a line of text with the //
characters.
Multi-line comment − To add a multi-line comment, enclose the text within /* and */.
These rules are the same in WMLScript, JavaScript, Java, and C++. The WMLScript engine will
ignore all comments. The following WMLScript example demonstrates the use of comments −
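A short sketch of both comment styles. The function greet is hypothetical and is there only so that the file contains something to comment on:

```wmlscript
// This is a single-line comment.

/* This is a
   multi-line comment. */

extern function greet()
{
  // Comments can also sit inside a function body.
  return "Welcome to our WMLScript tutorial";  /* or trail a statement */
}
```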
The WMLScript language is case-sensitive. For example, a WMLScript function with the name
WMLScript_Function is different from wmlscript_function. So, be careful with capitalization
when defining or referring to a function or a variable in WMLScript.
The following elements in the Structure Module are used to specify the structure of a WML2
document:
body
html
wml:card
head
title
The body Element:
The wml:newcontext attribute specifies whether the browser context is initialised to a well-
defined state when the document is loaded. If the wml:newcontext attribute value is "true", the
browser MUST reinitialise the browser context upon navigation to this card.
The wml:card element specifies a fragment of the document body. Multiple wml:card elements
may appear in a single document. Each wml:card element represents an individual presentation
and/or interaction with the user.
If the wml:card element's newcontext attribute value is "true", the browser MUST reinitialise the
browser context upon navigation to this card.
The head element keeps header information of the document, such as meta elements and style sheets.
WML - Variables
Because multiple cards can be contained within one deck, some mechanism needs to be in place
to hold data as the user traverses from card to card. This mechanism is provided via WML
variables.
WML is case sensitive. No case folding is performed when parsing a WML deck. All
enumerated attribute values are case sensitive. For example, the following attribute values are all
different: id="Card1", id="card1", and id="CARD1".
Variables can be created and set using several different methods. Following are two examples:
The <setvar> element:
The <setvar> element is used as a result of the user executing some task. The <setvar> element
can be used to set a variable's state within the following elements: <go>, <prev>, and <refresh>.
The following element would create a variable named a with a value of 1000:
<setvar name="a" value="1000"/>
The input elements:
Variables are also set through any input element like input, select, option, etc. A variable is
automatically created that corresponds with the name attribute of an input element.
For example, the following element would create a variable named b:
<select name="b">
<option value="value1">Option 1</option>
<option value="value2">Option 2</option>
</select>
Using Variables:
Variable expansion occurs at runtime, in the microbrowser or emulator. This means a variable
reference can be concatenated with or embedded in other text.
Variables are referenced with a preceding dollar sign, and any single dollar sign in your WML
deck is interpreted as a variable reference. To display a literal dollar sign, write $$.
<p> Selected option value is $(b) </p>
Operators :
Arithmetic Operators
Comparison Operators
Following are the comparison operators supported by the WML Script language −
Logical Operators
Following are the logical operators supported by the WML Script language −
Assignment Operators
Conditional Operator
There is one more operator, called the conditional operator. It first evaluates an expression for a
true or false value and then executes one of the two given statements depending upon the result of
the evaluation. The conditional operator has this syntax −
condition ? expression1 : expression2
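A minimal sketch of the conditional operator in use, following the general form condition ? expression1 : expression2. The function name larger is hypothetical:

```wmlscript
// Returns the larger of two values.
// If (a > b) is true the result is a, otherwise it is b.
extern function larger(a, b)
{
  return (a > b) ? a : b;
}
```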
Operators Categories
All the operators we have discussed above can be categorised into following categories −
Operator precedence determines the grouping of terms in an expression. This affects how an
expression is evaluated. Certain operators have higher precedence than others; for example, the
multiplication operator has higher precedence than the addition operator −
For example, x = 7 + 3 * 2; Here x is assigned 13, not 20, because the operator * has higher
precedence than +, so 3 * 2 is evaluated first and the result is then added to 7.
Here operators with the highest precedence appear at the top of the table, those with the lowest
appear at the bottom. Within an expression, higher precedence operators will be evaluated first.
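The precedence rule for * and + can be tried out with a small sketch. The function name precedenceDemo is hypothetical:

```wmlscript
extern function precedenceDemo()
{
  var x = 7 + 3 * 2;     // 3 * 2 is evaluated first, so x is 13
  var y = (7 + 3) * 2;   // parentheses force the addition first, so y is 20
  return x + "," + y;    // "13,20"
}
```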
WMLScript is a weakly typed language. This means you never specify the datatype of a variable
or the return type of a function. All expressions have a type internally, but WMLScript itself
converts values back and forth between the different types as required, so that you don’t have to.
For example, the value 1234 is an integer, but if you pass it to the String.length( ) library
function, which expects a string, it’s implicitly converted to the string "1234", the length of
which is 4.
Similarly, if you try to evaluate the expression:
"1234" * "2"
both values are converted to integers before use, and the result is the integer 2468. This is what is
meant by weak typing.
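Both conversions described above can be written out as a short sketch. The function name weakTypingDemo is hypothetical:

```wmlscript
extern function weakTypingDemo()
{
  // The integer 1234 is converted to the string "1234", so len is 4.
  var len = String.length(1234);

  // Both strings are converted to integers, so product is 2468.
  var product = "1234" * "2";

  return product;
}
```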
Datatypes
WMLScript has five datatypes: string, integer, floating-point number, Boolean, and the special
type invalid. Every value in WMLScript belongs to one of these types, although most can be
converted to others. The format of a literal value determines its type. (A literal is a constant
included explicitly in the code.)
WMLScript has only one type of variable, which is var. However, WMLScript variables are
actually handled as five primitive data types internally. A variable can be used to store a value of
any of the five primitive data types. The data types supported in WMLScript are:
1. Boolean.
A Boolean value can be true or false.
Examples:
var variable1 = true;
var variable2 = false;
2. Integer.
WMLScript uses 32-bit integers with two's complement. This means an integer value can be in
the range from -2^32/2 to 2^32/2-1, i.e. -2147483648 to 2147483647.
Examples:
var variable1 = 10000;
var variable2 = -10000;
3. Float.
WMLScript uses 32-bit single precision format to represent floating-point numbers. The
maximum value supported is 3.40282347E+38. The smallest positive nonzero value supported is
1.17549435E-38. Note that some mobile devices do not support floating-point numbers.
The float() function of WMLScript's Lang standard library can be used to check whether a
mobile device supports floating-point numbers.
Examples:
var variable1 = 11.11;
var variable2 = -11.11;
4. String.
A string contains some characters.
Example:
var variable = "WMLScript Tutorial";
5. Invalid.
This is used to indicate that a variable is invalid.
Example:
var variable = invalid;
Sometimes the result of an operation is of the invalid type. This means errors have occurred
during the operation. One example is the divide-by-zero error:
var variable = 100 / 0;
After the execution of the above script, variable contains the invalid value. You can use
the isvalid operator to find out whether a variable is of the invalid data type. It will be covered
shortly in this WMLScript tutorial.
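As a brief sketch of the isvalid operator applied to the divide-by-zero case above (the function name isvalidDemo is hypothetical):

```wmlscript
extern function isvalidDemo()
{
  var bad = 100 / 0;             // division by zero yields invalid
  var ok = 10000;                // an ordinary integer

  var badIsValid = isvalid bad;  // false
  var okIsValid = isvalid ok;    // true

  return okIsValid && !badIsValid;   // true
}
```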
The data types of operands determine the action of an operator. For example, + is both the
addition operator and the string concatenation operator. If the operands of the + operator are
numeric values, the addition operation will be done. For example, the result of 10+15 is 25. If
any operands of the + operator are of the string data type, the string concatenation operation will
be done. For example, the result of "10"+"15" is "1015". The WMLScript interpreter will carry
out a data type conversion automatically if necessary. For example, if the WMLScript interpreter
encounters the operation 10+"15", the numeric value 10 will first be converted to the string data
type. After that the string concatenation operation will be performed. The result will be "1015".
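The three cases of the + operator described above, written as a sketch. The function name plusDemo is hypothetical:

```wmlscript
extern function plusDemo()
{
  var sum = 10 + 15;          // both operands are numeric: 25
  var joined = "10" + "15";   // both operands are strings: "1015"
  var mixed = 10 + "15";      // 10 is converted to "10" first: "1015"
  return joined == mixed;     // true
}
```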
1. WMLBrowser library
The WMLBrowser library provides a group of functions to control the WML browser or to get
information from it.
2. Dialogs library
The Dialogs library contains functions for displaying alert messages, confirmation messages and
input boxes to users.
3. String library
The String library provides a number of functions that help us manipulate strings.
4. Float library
The Float library contains functions that help us perform floating-point arithmetic operations.
5. Lang library
The Lang library provides functions related to the WMLScript language core. It contains
functions for generating random numbers, performing some arithmetic operations and dealing
with data type conversion.
6. URL library
The URL library contains functions that help us manipulate URLs.
The WMLScript standard libraries are included in the WAP specification. Hence, mobile devices
that support WMLScript should also support the WMLScript standard libraries. Note that the
Float library is only available on mobile devices that can perform floating-point calculations.
WMLScript programmers should get familiar with the standard libraries. With such knowledge,
you can know whether a particular task can be done by using the functions in the standard
libraries. If the standard libraries do not have the function you need, that means you have to
implement the function yourself or the task cannot be done with the WMLScript language.
Calling a function in the WMLScript standard libraries is simple. It is the same as calling your
own function, except that you have to add the library name and a dot character before the
function name. This is the syntax:
library_name.function_name(parameter1, parameter2...);
The following WMLScript example demonstrates how to call the sqrt() function of the Float
standard library to calculate the square root of a number:
function findSquareRoot(number)
{
return Float.sqrt(number);
}
WML program :
A WML program is typically divided into two parts: the document prolog and the body.
Consider the following code:
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.2//EN"
"http://www.wapforum.org/DTD/wml12.dtd">
<wml>
<card id="one" title="First Card">
<p>
This is the first card in the deck
</p>
</card>
<card id="two" title="Second Card">
<p>
This is the second card in the deck
</p>
</card>
</wml>
The first line of this text says that this is an XML document and the version is 1.0. The second
line selects the document type and gives the URL of the document type definition (DTD). The
DTD referenced is defined in WAP 1.2, but this header changes with the versions of the WML.
The header must be copied exactly, so tool kits usually generate this prolog automatically.
The prolog components are not WML elements and they should not be closed, i.e. you should
not give them an end tag or finish them with />.
WML Document Body:
The body is enclosed within a <wml> </wml> tag pair. The body of a WML document can
consist of one or more of the following:
Deck
Card
Content to be shown
Navigation instructions
Unlike HTML 4.01 Transitional, text cannot be enclosed directly in the <card>...</card> tag
pair. So you need to put content inside <p>...</p> as shown above.
Put the above code in a file called test.wml, store this WML file locally on your hard disk,
then view it using an emulator.
This is by far the most efficient way of developing and testing WML files. Since your aim is,
however, to develop a service that is going to be available to WAP phone users, you should
upload your WML files onto a server once you have developed them locally and test them over a
real Internet connection. As you start developing more complex WAP services, this is how you
will identify and rectify performance problems, which could, if left alone, lose your site visitors.
In uploading the file test.wml to a server, you will be testing your WML emulator to see how it
looks and behaves, and checking your Web server to see that it is set up correctly. Now start your
emulator and use it to access the URL of test.wml. For example, the URL might look something
like this:
http://websitename.com/wapstuff/test.wml
Dealing with Errors:
Error Handling
WMLScript does not provide an exception-handling mechanism like that found in Java and other
heavier-weight languages. You should be careful when testing your WMLScript programs to
exercise all the branches of your scripts during testing. Common coding mistakes that can lead to
errors include these:
Dividing by zero
Nesting functions too deeply (the stack memory is very limited on many handsets)
Predictable errors, such as user errors, should be trapped, and a consistently formatted error
message should be displayed along with a means to cancel or continue as appropriate.
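One way to trap a predictable error such as division by zero is sketched below. It assumes the script is free to show a dialog, and the function name safeDivide and the message text are hypothetical:

```wmlscript
// Divides a by b. If the division cannot be performed (for example
// because b is 0 or an operand is not a number), the user sees an
// error message and the function returns invalid.
extern function safeDivide(a, b)
{
  var result = a / b;
  if (!(isvalid result)) {
    Dialogs.alert("Error: the division could not be performed. Please check your input.");
    return invalid;
  }
  return result;
}
```

Checking the result with isvalid right after the risky operation keeps the invalid value from spreading silently through later calculations.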
<wml:errors/>
Retrieves the set of error messages from the request object with the default key
of Action.ERROR_KEY or the value specified by attribute name. If ActionErrors are
found then the errors are displayed. This tag also requires the following two message
keys in the application scope MessageResources.
Introduction of XML
Parts of it can be shown or hidden depending on user actions. This is extremely useful when
you’re working with large information repositories like relational databases.
The Life of an XML Document
XML is, at the root, a document format. It is a series of rules about what XML
documents look like. There are two levels of conformity to the XML standard. The first is
well-formedness and the second is validity.
HTML is a document format designed for use on the Internet and inside Web
browsers. XML can certainly be used for that, as this book demonstrates. However, XML is
far more broadly applicable.
As previously discussed, it can be used as a storage format for word processors, as a
data interchange format for different programs, as a means of enforcing conformity with
Intranet templates, and as a way to preserve data in a human-readable fashion.
Editors
XML documents are most commonly created with an editor. This may be a basic text
editor like Notepad or vi that doesn’t really understand XML at all. On the other hand, it may
be a completely WYSIWYG editor like Adobe FrameMaker that insulates you almost
completely from the details of the underlying XML format. Or it may be a structured editor
like JUMBO that displays XML documents as trees. For the most part, the fancy editors
aren’t very useful yet, so this book concentrates on writing raw XML by hand in a text editor.
For example, the document may be a record or a field in a database, or it may be a
stream of bytes received from a network.
Parsers and Processors
An XML parser (also known as an XML processor) reads the document and verifies
that the XML it contains is well formed. It may also check that the document is valid, though
this test is not required. The exact details of these tests will be covered in Part II. But
assuming the document passes the tests, the processor converts the document into a tree of
elements.
Browsers and Other Tools
Finally the parser passes the tree or individual nodes of the tree to the end application.
This application may be a browser like Mozilla or some other program that understands what
to do with the data. If it’s a browser, the data will be displayed to the user. But other
programs may also receive the data. For instance, the data might be interpreted as input to a
database, a series of musical notes to play, or a Java program that should be launched. XML
is extremely flexible and can be used for many different purposes.
The Process Summarized
To summarize, an XML document is created in an editor. The XML parser reads the
document and converts it into a tree of elements. The parser passes the tree to the browser
that displays it. Figure 1 shows this process.
It’s easy to apply CSS rules to XML documents. You simply change the names of the
tags you’re applying the rules to. Mozilla 5.0 directly supports CSS style sheets combined
with XML documents, though at present, it crashes rather too frequently.
Extensible Style Language
The Extensible Style Language (XSL) is a more advanced style-sheet language
specifically designed for use with XML documents. XSL documents are themselves well-
formed XML documents.
XSL documents contain a series of rules that apply to particular patterns of XML
elements. An XSL processor reads an XML document and compares what it sees to the
patterns in a style sheet. When a pattern from the XSL style sheet is recognized in the XML
document, the rule outputs some combination of text. Unlike cascading style sheets, this
output text is somewhat arbitrary and is not limited to the input text plus formatting
information.
CSS can only change the format of a particular element, and it can only do so on an
element-wide basis. XSL style sheets, on the other hand, can rearrange and reorder elements.
CSS has the advantage of broader browser support. However, XSL is far more
flexible and powerful, and better suited to XML documents. Furthermore, XML documents
with XSL style sheets can be easily converted to HTML documents with CSS style sheets.
URLs and URIs
XML documents can live on the Web, just like HTML and other documents. When
they do, they are referred to by Uniform Resource Locators (URLs), just like HTML files.
For example, at the URL http://www.hypermedic.com/style/xml/tempest.xml you’ll find the
complete text of Shakespeare’s Tempest marked up in XML.
Although URLs are well understood and well supported, the XML specification uses
the more general Uniform Resource Identifier (URI). URIs are a more general architecture
for locating resources on the Internet that focuses a little more on the resource and a little
less on the location.
A URI can find the closest copy of a mirrored document or locate a document that has
been moved from one site to another. In practice, URIs are still an area of active research, and
the only kinds of URIs that are actually supported by current software are URLs.
XLinks and XPointers
As long as XML documents are posted on the Internet, you’re going to want to be
able to address them and hot link between them. Standard HTML link tags can be used in
XML documents, and HTML documents can link to XML documents. For example, this
HTML link points to the aforementioned copy of the Tempest rendered in XML:
<a href="http://www.hypermedic.com/style/xml/tempest.xml">
The Tempest by Shakespeare
</a>
XML lets you go further with XLinks for linking to documents and XPointers for
addressing individual parts of a document.
XLinks enable any element to become a link, not just an A element. Furthermore,
links can be bi-directional, multidirectional, or even point to multiple mirror sites from which
the nearest is selected. XLinks use normal URLs to identify the site they’re linking to.
XPointers enable links to point not just to a particular document at a particular
location, but to a particular part of a particular document. An XPointer can refer to a
particular element of a document, to the first, the second, or the 17th such element, to the first
element that’s a child of a given element, and so on. XPointers provide extremely powerful
connections between documents that do not require the targeted document to contain
additional markup just so its individual pieces can be linked to it.
The Unicode Character Set
The Web is international, yet most of the text you’ll find on it is in English. XML is
starting to change that. XML provides full support for the Unicode character set, both in its
two-byte encoding and in more compact encodings such as UTF-8. This character set supports
almost every character commonly used in every modern script on Earth.
Unfortunately, XML alone is not enough. To read a script you need three things:
1. A character set for the script
2. A font for the character set
3. An operating system and application software that understands the character set
How the Technologies Fit Together
XML defines a grammar for tags you can use to mark up a document. An XML
document is marked up with XML tags. The default encoding for XML documents is
Unicode.
An Introduction to XML Applications
What Is an XML Application?
XML is a meta-markup language for designing domain-specific markup languages.
Each XML-based markup language is called an XML application. This is not an application
that uses XML like the Mozilla Web browser, the Gnumeric spreadsheet, or the XML Pro
editor, but rather an application of XML to a specific domain such as Chemical Markup
Language (CML) for chemistry or GedML for genealogy.
Each XML application has its own syntax and vocabulary. This syntax and
vocabulary adheres to the fundamental rules of XML.
This is much like human languages, which each have their own vocabulary and
grammar, while at the same time adhering to certain fundamental rules imposed by human
anatomy and the structure of the brain.
Chemical Markup Language
Peter Murray-Rust’s Chemical Markup Language (CML) may have been the first
XML application. CML was originally developed as an SGML application, and gradually
transitioned to XML as the XML standard developed. In its most simplistic form, CML is
“HTML plus molecules”, but it has applications far beyond the limited confines of the Web.
Molecular documents often contain thousands of different, very detailed objects. For
example, a single medium-sized organic molecule may contain hundreds of atoms, each with
several bonds.
The water molecule H2O
<?xml version="1.0"?>
<CML>
<MOL TITLE="Water">
<ATOMS>
<ARRAY BUILTIN="ELSYM">H O H</ARRAY>
</ATOMS>
<BONDS>
<ARRAY BUILTIN="ATID1">1 2</ARRAY>
<ARRAY BUILTIN="ATID2">2 3</ARRAY>
<ARRAY BUILTIN="ORDER">1 1</ARRAY>
</BONDS>
</MOL>
</CML>
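Because CML is ordinary XML, any general-purpose XML parser can read the water listing. The following is a minimal sketch in Python, assuming the standard xml.etree.ElementTree module; the function name read_molecule is made up for this illustration and is not part of CML.

```python
import xml.etree.ElementTree as ET

# The water-molecule listing from the text.
CML_WATER = """<?xml version="1.0"?>
<CML>
  <MOL TITLE="Water">
    <ATOMS>
      <ARRAY BUILTIN="ELSYM">H O H</ARRAY>
    </ATOMS>
    <BONDS>
      <ARRAY BUILTIN="ATID1">1 2</ARRAY>
      <ARRAY BUILTIN="ATID2">2 3</ARRAY>
      <ARRAY BUILTIN="ORDER">1 1</ARRAY>
    </BONDS>
  </MOL>
</CML>"""


def read_molecule(cml_text):
    """Return (title, element symbols, bonds) for the first MOL element."""
    mol = ET.fromstring(cml_text).find("MOL")
    # Each ARRAY holds space-separated values; index them by BUILTIN name.
    arrays = {a.get("BUILTIN"): a.text.split() for a in mol.iter("ARRAY")}
    # A bond is (first atom id, second atom id, bond order).
    bonds = list(zip(map(int, arrays["ATID1"]),
                     map(int, arrays["ATID2"]),
                     map(int, arrays["ORDER"])))
    return mol.get("TITLE"), arrays["ELSYM"], bonds


title, atoms, bonds = read_molecule(CML_WATER)
print(title, atoms, bonds)
```

The parser knows nothing about chemistry; it only delivers the tree. The meaning of ELSYM or ATID1 comes from the CML vocabulary itself.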
Mathematical Markup Language
The Mathematical Markup Language (MathML) is an XML application for
mathematical equations. MathML is sufficiently expressive to handle pretty much all forms
of math—from grammar-school arithmetic through calculus and differential equations. It can
handle many considerably more advanced topics as well, though there are definite gaps in
some of the more advanced and obscure notations used by certain sub-fields of mathematics.
Channel Definition Format
Microsoft’s Channel Definition Format (CDF) is an XML application for defining
channels. Web sites use channels to upload information to readers who subscribe to the site
rather than waiting for them to come and get it. This is alternately called Webcasting or push.
CDF was first introduced in Internet Explorer 4.0.
A CDF document is an XML file, separate from, but linked to an HTML document on
the site being pushed. The channel defined in the CDF document determines which pages are
sent to the readers, how the pages are transported, and how often the pages are sent. Pages
can either be pushed by sending notifications, or even whole Web sites, to subscribers; or
pulled down by the readers at their convenience.
Synchronized Multimedia Integration Language
The Synchronized Multimedia Integration Language (SMIL, pronounced “smile”) is a
W3C recommended XML application for writing “TV-like” multimedia presentations for the
Web. SMIL documents don’t describe the actual multimedia content (that is the video and
sound that are played) but rather when and where they are played.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE smil PUBLIC "-//W3C//DTD SMIL 1.0//EN"
"http://www.w3.org/TR/REC-smil/SMIL10.dtd">
<smil>
<body>
<seq id="Kubrick">
<audio src="beethoven9.mid"/>
<video src="corange.mov"/>
<text src="clockwork.htm"/>
<audio src="zarathustra.mid"/>
<video src="2001.mov"/>
<text src="aclarke.htm"/>
</seq>
</body>
</smil>
Vector Markup Language
Microsoft has developed an XML application for vector graphics called the Vector Markup
Language (VML).
VML is more finished than SVG, and is already supported by Internet Explorer 5.0
and Microsoft Office 2000.
MusicML
The Connection Factory has created an XML application for sheet music called
MusicML. MusicML includes notes, beats, clefs, staffs, rows, rhythms, rests, beams,
chords and more. Listing 2-8 shows the first bar from Beth Anderson’s Flute Swale in
MusicML.
VoxML
Motorola’s VoxML (http://www.voxml.com/) is an XML application for the spoken
word. In particular, it’s intended for those annoying voice mail and automated phone
response systems (“If your hair turned green after using our product, please press one. If your
hair turned purple after using our product, please press two. If you found an unidentifiable
insect in the product, please press 3. Otherwise, please stay on the line until your hair grows
back to its natural color.”).
Extensible Forms Description Language
XML lets you develop a markup language with the right combination of power and rigor to
meet your needs, and forms are no exception. In particular, UWI.COM has proposed an XML
application called the Extensible Forms Description Language (XFDL) for forms with
extremely tight legal requirements that are to be signed with digital signatures. XFDL further
offers the option to do simple mathematics in the form, for instance to automatically fill in
the sales tax and shipping and handling charges and total up the price.
Human Resources Markup Language
HireScape’s Human Resources Markup Language (HRML) is an XML application
that provides a simple vocabulary for describing job openings. It defines elements matching
the parts of a typical classified want ad such as companies, divisions, recruiters, contact
information, terms, experience, and more.
Resource Description Framework
XML adds structure to documents. The Resource Description Framework (RDF) is an
XML application that adds semantics. RDF can be used to specify anything from the author
and abstract of a Web page to the version and dependencies of a software package to the
director, screenwriter, and actors in a movie. What links all of these uses is that what’s being
encoded in RDF is not the data itself (the Web page, the software, the movie) but information
about the data. This data about data is called meta-data, and is RDF’s raison d’être.
XML for XML
XML is an extremely general-purpose format for text data. Some of the things it is
used for are further refinements of XML itself. These include the XSL style-sheet language,
the XLL linking language, and the Document Content Description for XML.
XSL
XSL, the Extensible Style Language, is itself an XML application. XSL has two
major parts. The first part defines a vocabulary for transforming XML documents. This part
of XSL includes XML tags for trees, nodes, patterns, templates, and other elements needed
for matching and transforming XML documents from one markup vocabulary to another (or
even to the same one in a different order).
The second part of XSL defines an XML vocabulary for formatting the transformed
XML document produced by the first part. This includes XML tags for formatting objects
including pagination, blocks, characters, lists, graphics, boxes, fonts, and more.
XLL
The Extensible Linking Language, XLL, defines a new, more general kind of link
called an XLink. XLinks accomplish everything possible with HTML’s URL-based
hyperlinks and anchors. However, any element can become a link, not just A elements. For
instance a footnote element can link directly to the text of the note like this:
<footnote xlink:form="simple" href="footnote7.xml">7</footnote>
DCD
XML’s facilities for declaring how the contents of an XML element should be
formatted are weak to nonexistent. For example, suppose as part of a date, you set up
MONTH elements like this:
<MONTH>9</MONTH>
Nothing in XML itself says that the content of a MONTH element must be an integer
between 1 and 12. A number of schemes have been proposed to use XML itself to more tightly
restrict what can appear in the contents of any given element. One such proposal is the
Document Content Description (DCD). For example, here’s a DCD that declares that MONTH
elements may only contain an integer between 1 and 12:
<DCD>
<ElementDef Type="MONTH" Model="Data" Datatype="i1"
Min="1" Max="12" />
</DCD>
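The constraint that the DCD expresses can also be checked by hand. The following is a rough Python sketch of the same rule, not a DCD processor; the function name month_is_valid and its low and high parameters are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def month_is_valid(xml_text, low=1, high=12):
    """Check that a MONTH element holds an integer between low and high."""
    element = ET.fromstring(xml_text)
    if element.tag != "MONTH":
        return False
    try:
        value = int((element.text or "").strip())
    except ValueError:
        return False  # the content is not an integer at all
    return low <= value <= high


print(month_is_valid("<MONTH>9</MONTH>"))
print(month_is_valid("<MONTH>13</MONTH>"))
print(month_is_valid("<MONTH>September</MONTH>"))
```

All three inputs are well-formed XML. Only a content description such as a DCD, or code like this, can tell the first one apart from the other two.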
Behind-the-Scene Uses of XML
Not all XML applications are public, open standards. A lot of software vendors are
moving to XML for their own data simply because it’s a well-understood, general-purpose
format for structured data that can be manipulated with easily available cheap and free tools.
Microsoft Office 2000 promotes HTML to a coequal file format with its native binary
formats. However, HTML 4.0 doesn’t provide support for all of the features Office requires,
such as revision tracking, footnotes, comments, index and glossary entries, and more.
Additional data that can’t be written as HTML is embedded in the file in small chunks of
XML. Word’s vector graphics will be stored in VML. In this case, embedded XML’s
invisibility in standard browsers is the crucial factor.
<?xml version="1.0" standalone="yes"?>
This is an example of an XML processing instruction. Processing instructions begin
with <? and end with ?>. The first word after the <? is the name of the processing
instruction, which is xml in this example.
The XML declaration has version and standalone attributes. An attribute is a
name-value pair separated by an equals sign. The name is on the left-hand side of the equals
sign and the value is on the right-hand side, enclosed in double quote marks.
Every XML document should begin with an XML declaration that specifies the version of
XML in use. In the above example, the version attribute says this document conforms to
XML 1.0.
<FOO>
Hello XML!
</FOO>
Collectively these three lines form a FOO element. Separately, <FOO> is a start tag;
</FOO> is an end tag; and Hello XML! is the content of the FOO element.
You may be asking what the <FOO> tag means. The short answer is “whatever you
want it to.” Rather than relying on a few hundred predefined tags, XML lets you create the
tags that you need. The <FOO> tag therefore has whatever meaning you assign it. The same
XML document could have been written with different tag names, as shown in Listings 3, 4,
and 5, below:
Listing 3: greeting.xml
<?xml version="1.0" standalone="yes"?>
<GREETING>
Hello XML!
</GREETING>
Listing 4: paragraph.xml
<?xml version="1.0" standalone="yes"?>
<P>
Hello XML!
</P>
Listing 5: document.xml
<?xml version="1.0" standalone="yes"?>
<DOCUMENT>
Hello XML!
</DOCUMENT>
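To a parser, the three listings differ only in the name of the root element. The following Python sketch, which assumes the standard xml.etree.ElementTree module and keeps the listings in a dictionary instead of separate files, makes that visible.

```python
import xml.etree.ElementTree as ET

DECLARATION = '<?xml version="1.0" standalone="yes"?>\n'

# The three listings from the text, keyed by their file names.
LISTINGS = {
    "greeting.xml": DECLARATION + "<GREETING>\nHello XML!\n</GREETING>",
    "paragraph.xml": DECLARATION + "<P>\nHello XML!\n</P>",
    "document.xml": DECLARATION + "<DOCUMENT>\nHello XML!\n</DOCUMENT>",
}

# Map each file name to (root tag name, text content without whitespace).
summary = {}
for name, text in LISTINGS.items():
    root = ET.fromstring(text)
    summary[name] = (root.tag, root.text.strip())
    print(name, summary[name])
```

The content is identical in all three cases. Whatever GREETING, P, or DOCUMENT means is decided by the author or by the program that reads the file, not by the parser.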
Semantic meaning exists outside the document, in the mind of the author or reader or
in some computer program that generates or reads these files.
Structuring Data
XML markup defines the structure of a document.
Examining the Data
In most sports you hear about heart, guts, ability, skill, determination, and more. But
only in baseball do the fans get so worked up about raw numbers: batting average, earned run
average, slugging average, on-base average, fielding percentage, batting average against
right-handed pitchers, batting average against left-handed pitchers, batting average against
right-handed pitchers when batting left-handed, and so on.
In the next two sections, for the benefit of the less baseball-obsessed reader, we will
examine the commonly available statistics that describe an individual player’s batting and
pitching.
Batters
We analyzed all possible batting orders for all teams in the 1989 National League.
The results of that paper were mildly interesting. The worst batter on the team, generally the
pitcher, should bat eighth rather than the customary ninth position, at least in the National
League, but what concerns me here is the work that went into producing this paper.
Pitchers
Pitchers are not expected to be home-run hitters or base stealers. Indeed a pitcher who
can reach first on occasion is a surprise bonus for a team.
The number of runs for a pitcher is the number of runs scored by the opposing teams
against this pitcher.
Organization of the XML Data
XML is based on a containment model. Each XML element can contain text or other
XML elements called its children. A few XML elements may contain both text and child
elements, though in general this is bad form and should be avoided wherever possible.
XMLizing the Data
Let’s begin the process of marking up the data for the 1998 Major League season in
XML with tags that you define. Remember that in XML we’re allowed to make up the tags as
we go along.
Teams contain players. Players will have statistics including games played, at bats,
runs, hits, doubles, triples, home runs, runs batted in, walks, and hits by pitch.
XML documents may be recognized by the XML declaration. This is a processing
instruction placed at the start of all XML files that identifies the version in use. The only
version currently understood is 1.0.
<?xml version="1.0"?>
XMLizing League, Division, and Team Data
Major league baseball is divided into two leagues, the American League and the
National League. Each league has a name. The two names could be encoded like this:
<?xml version="1.0"?>
<SEASON>
<YEAR>1998</YEAR>
<LEAGUE>
<LEAGUE_NAME>National League</LEAGUE_NAME>
</LEAGUE>
<LEAGUE>
<LEAGUE_NAME>American League</LEAGUE_NAME>
</LEAGUE>
</SEASON>
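The containment model maps directly onto a tree that a program can walk. As a small Python sketch, assuming the standard xml.etree.ElementTree module, the season listing can be read back like this:

```python
import xml.etree.ElementTree as ET

SEASON_XML = """<?xml version="1.0"?>
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
  </LEAGUE>
  <LEAGUE>
    <LEAGUE_NAME>American League</LEAGUE_NAME>
  </LEAGUE>
</SEASON>"""

season = ET.fromstring(SEASON_XML)

# YEAR is a direct child of SEASON; each LEAGUE contains one LEAGUE_NAME.
year = season.findtext("YEAR")
league_names = [league.findtext("LEAGUE_NAME")
                for league in season.findall("LEAGUE")]

print(year, league_names)
```

Divisions, teams, and players would be reached the same way, one level of containment at a time.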
<PLAYER>
<GIVEN_NAME>Andy</GIVEN_NAME>
<SURNAME>Pettitte</SURNAME>
</PLAYER>
<PLAYER>
<GIVEN_NAME>Hideki</GIVEN_NAME>
<SURNAME>Irabu</SURNAME>
</PLAYER>
</TEAM>
The Advantages of the XML Format
There are several benefits. Among them:
✦ The data is self-describing
✦ The data can be manipulated with standard tools
✦ The data can be viewed with standard tools
✦ Different views of the same data are easy to create with style sheets
The first major benefit of the XML format is that the data is self-describing. The
meaning of each number is clearly and unmistakably associated with the number itself. When
reading the document, you know that the 121 in <HITS>121</HITS> refers to hits and not
runs batted in or strikeouts.
The second benefit to providing the data in XML is that it enables the data to be
manipulated in a wide range of XML-enabled tools, from expensive payware like Adobe
FrameMaker to free open-source software like Python and Perl.
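As an illustration of both benefits, here is a short Python sketch that totals the hits of a team. The TEAM fragment, the surnames Alpha and Beta, and all the numbers are invented for this example and are not real statistics; only the element names come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the players and numbers are made up for illustration.
TEAM_XML = """<TEAM>
  <PLAYER><SURNAME>Alpha</SURNAME><HITS>121</HITS><RBI>40</RBI></PLAYER>
  <PLAYER><SURNAME>Beta</SURNAME><HITS>70</HITS><RBI>31</RBI></PLAYER>
</TEAM>"""

team = ET.fromstring(TEAM_XML)

# Because each number is wrapped in a named element, the program asks
# for HITS by name instead of relying on a column position.
hits_by_player = {player.findtext("SURNAME"): int(player.findtext("HITS"))
                  for player in team.iter("PLAYER")}
total_hits = sum(hits_by_player.values())

print(hits_by_player, total_hits)
```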
Preparing a Style Sheet for Document Display
A CSS style sheet associates particular formatting with each element of the document.
The complete list of elements used in our XML document is:
SEASON
YEAR
LEAGUE
LEAGUE_NAME
DIVISION
DIVISION_NAME
TEAM
TEAM_CITY
TEAM_NAME
PLAYER
SURNAME
GIVEN_NAME
POSITION
GAMES
GAMES_STARTED
AT_BATS
RUNS
HITS
DOUBLES
TRIPLES
HOME_RUNS
RBI
STEALS
CAUGHT_STEALING
SACRIFICE_HITS
SACRIFICE_FLIES
ERRORS
WALKS
STRUCK_OUT
HIT_BY_PITCH
Linking to a Style Sheet
The style sheet for the XML document 1998shortstats.xml might be called
1998shortstats.css. On the other hand, if the same style sheet is going to be applied to many
documents, then it should probably have a more generic name like baseballstats.css.
To attach a style sheet to the document, you simply add an
<?xml-stylesheet?> processing instruction between the XML declaration and the root element,
like this:
<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="baseballstats.css"?>
<SEASON>
...
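A program can see the same processing instruction that the browser uses to locate the style sheet. The following is a minimal Python sketch, assuming the standard xml.dom.minidom module; the function name stylesheet_instructions is made up for this illustration, and the SEASON body is shortened.

```python
from xml.dom import minidom, Node

DOCUMENT = """<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="baseballstats.css"?>
<SEASON>
  <YEAR>1998</YEAR>
</SEASON>"""


def stylesheet_instructions(xml_text):
    """Return the data of every xml-stylesheet processing instruction
    that appears at the top level of the document."""
    document = minidom.parseString(xml_text)
    return [node.data for node in document.childNodes
            if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
            and node.target == "xml-stylesheet"]


found = stylesheet_instructions(DOCUMENT)
print(found)
```

The type and href pairs look like attributes, but to the parser they are simply the text of the processing instruction.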
Assigning Style Rules to the Root Element
Child elements inherit many style properties from their parents. The most important style,
therefore, is the one for the root element, which is SEASON in this example. This defines the
default for all the other elements on the page.
SEASON {font-size: 14pt; background-color: white;
color: black; display: block}
Place this statement in a text file, save the file with the name baseballstats.css in the
same directory as Listing 4-1, 1998shortstats.xml, and open 1998shortstats.xml in your
browser.
Assigning Style Rules to Titles
The YEAR element is more or less the title of the document. Therefore, let’s make it
appropriately large and bold—32 points should be big enough.
YEAR {display: block; font-size: 32pt; font-weight: bold;
text-align: center}
Assigning Style Rules to Player and Statistics Elements
The trickiest formatting this document requires is for the individual players and
statistics. Each team has a couple of dozen players. Each player has statistics. You could
think of a TEAM element as being divided into PLAYER elements, and place each player in
his own block-level section as you did for previous elements.
UNIT – V
With experience, you’ll gain a feel for when attributes are easier than child elements
and vice versa. Until then, one good rule of thumb is that the data itself should be stored in
elements.
Information about the data (meta-data) should be stored in attributes. And when in
doubt, put the information in the elements.
Elements are generally the safer choice for several reasons. These include:
✦ Attributes can’t hold structure well.
✦ Elements allow you to include meta-meta-data (information about the information
about the information).
✦ Not everyone always agrees on what is and isn’t meta-data.
✦ Elements are more extensible in the face of future changes.
Structured Meta-data
One important principle to remember is that elements can have substructure and
attributes can’t. This makes elements far more flexible, and may convince you to encode
meta-data as child elements.
Meta-Meta-Data
Using elements for meta-data also easily allows for meta-meta-data, or information
about the information about the information. For example, the author of a poem may be
considered to be meta-data about the poem. The language in which the author’s name is
written is then data about the meta-data about the poem.
<POET LANGUAGE="English">Homer</POET>
<POET LANGUAGE="Greek"> </POET>
What’s Your Meta-data Is Someone Else’s Data
“Metaness” is in the mind of the beholder. Who is reading your document and why
they are reading it determines what they consider to be data and what they consider to be
meta-data.
For example, if you’re simply reading an article in a scholarly journal, then the author
of the article is tangential to the information it contains.
Elements Are More Extensible
Attributes are certainly convenient when you only need to convey one or two words
of unstructured information.
In these cases, there may genuinely be no current need for a child element. However,
this doesn’t preclude such a need in the future.
Empty Tags
Empty tags are distinguished from start tags by a closing /> instead of simply a
closing >. For instance, instead of <PLAYER></PLAYER> you would write <PLAYER/>.
Empty tags may contain attributes. For example, here’s an empty tag for Joe Girardi
with several attributes:
<PLAYER GIVEN_NAME="Joe" SURNAME="Girardi"
GAMES="78" AT_BATS="254" RUNS="31" HITS="70"
DOUBLES="11" TRIPLES="4" HOME_RUNS="3"
RUNS_BATTED_IN="31" WALKS="14" STRUCK_OUT="38"
STOLEN_BASES="2" CAUGHT_STEALING="4"
SACRIFICE_FLY="1" SACRIFICE_HIT="8"
HIT_BY_PITCH="2"/>
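An empty tag carries all of its information in attributes, so a program reads it with attribute lookups instead of child-element lookups. The following Python sketch uses a shortened version of the tag above and assumes the standard xml.etree.ElementTree module; the batting-average calculation is added only for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened version of the empty PLAYER tag from the text.
PLAYER_XML = ('<PLAYER GIVEN_NAME="Joe" SURNAME="Girardi" '
              'GAMES="78" AT_BATS="254" HITS="70"/>')

player = ET.fromstring(PLAYER_XML)

# An empty element has no text and no children, only attributes.
has_content = player.text is not None or len(player) > 0

# Attribute values are always strings, so convert before doing arithmetic.
batting_average = int(player.get("HITS")) / int(player.get("AT_BATS"))

print(player.get("SURNAME"), has_content, round(batting_average, 3))
```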
XSL
The Extensible Style Language (XSL) is a style-sheet language designed specifically for
XML documents; it is also supported by Internet Explorer 5.0, at least in part. XSL is divided
into two sections, transformations and formatting.
The transformation part of XSL enables you to replace one tag with another. You can
define rules that replace your XML tags with standard HTML tags, or with HTML tags plus
CSS attributes. You can also do a lot more including reordering the elements in the document
and adding additional content that was never present in the XML document.
The formatting part of XSL defines an extremely powerful view of documents as
pages. XSL formatting enables you to specify the appearance and layout of a page including
multiple columns, text flow around objects, line spacing, assorted font properties, and more.
It’s designed to be powerful enough to handle automated layout tasks for both the Web and
print from the same source document.
XSL Style Sheet Templates
An XSL style sheet contains templates into which data from the XML document is
poured. For example, one template might look something like this:
<HTML>
<HEAD>
<TITLE>
XSL Instructions to get the title
</TITLE>
</HEAD>
<BODY>
<H1>XSL Instructions to get the title</H1>
XSL Instructions to get the statistics
</BODY>
</HTML>
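The idea of pouring data into a template can be imitated in ordinary code. The sketch below is not XSL: it is a Python analogy that uses string.Template in place of an XSL processor, with a shortened SEASON document and a made-up render function, to show what the "instructions to get the title" and "instructions to get the statistics" would have to do.

```python
import xml.etree.ElementTree as ET
from html import escape
from string import Template

# The HTML skeleton plays the role of the XSL template; the $ placeholders
# stand where the XSL instructions would go.
PAGE = Template("""<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY>
<H1>$title</H1>
$statistics
</BODY>
</HTML>""")

SOURCE = """<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE><LEAGUE_NAME>National League</LEAGUE_NAME></LEAGUE>
  <LEAGUE><LEAGUE_NAME>American League</LEAGUE_NAME></LEAGUE>
</SEASON>"""


def render(xml_text):
    """Fill the HTML skeleton with data taken from the XML document."""
    season = ET.fromstring(xml_text)
    title = escape(season.findtext("YEAR"))
    paragraphs = "\n".join(f"<P>{escape(name.text)}</P>"
                           for name in season.iter("LEAGUE_NAME"))
    return PAGE.substitute(title=title, statistics=paragraphs)


html_page = render(SOURCE)
print(html_page)
```

A real XSL style sheet expresses the same selection and substitution declaratively, in XML, instead of in program code.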
CSS or XSL
CSS and XSL overlap to some extent. XSL is certainly more powerful than CSS.
However XSL’s power is matched by its complexity. This chapter only touched on the basics
of what you can do with XSL. XSL is more complicated, and harder to learn and use than
CSS, which raises the question, “When should you use CSS and when should you use XSL?”
CSS is more broadly supported than XSL. Parts of CSS Level 1 are supported for
HTML elements by Netscape 4 and Internet Explorer 4 (although annoying differences exist).
Well-Formed XML Documents
XML Documents Are Made Of
An XML document contains text that comprises XML markup and character data. It is a
sequential set of bytes of fixed length, which adheres to certain constraints. It may or may not
be a file. For instance, an XML document may:
✦ Be stored in a database
✦ Be created on the fly in memory by a CGI program
✦ Be some combination of several different files, each of which is embedded in another
✦ Never exist in a file of its own
XML documents are made up of storage units called entities. Each entity contains either
text or binary data, never both. Text data is comprised of characters. Binary data is used for
images and applets and the like. To use a concrete example, a raw HTML file that includes
<IMG> tags is an entity but not a document. An HTML file plus all the pictures embedded in
it with <IMG> tags is a complete document.
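A document that consists of a single entity can be understood completely on its own and
is called a stand-alone document. Such a document normally contains a standalone attribute
in its XML declaration with the value yes, like the one following:
<?xml version="1.0" standalone="yes"?>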
Markup and Character Data
XML documents are text. Text is made up of characters. A character is a letter, a digit, a
punctuation mark, a space, a tab or something similar. XML uses the Unicode character set,
which not only includes the usual letters and symbols from the English and other Western
European alphabets, but also the Cyrillic, Greek, Hebrew, Arabic, and Devanagari alphabets.
The text of an XML document serves two purposes, character data and markup.
Character data is the basic information of the document. Markup, on the other hand, mostly
describes a document’s logical structure.
Comments
XML comments are almost exactly like HTML comments. They begin with <!-- and
end with -->. All data between the <!-- and --> is ignored by the XML processor. It’s as if
it wasn’t there. Comments can be used to make notes to yourself or to temporarily comment
out sections of the document that aren’t ready. For example,
<?xml version="1.0" standalone="yes"?>
<!--This is Listing 3-2 from The XML Bible-->
<GREETING>
Hello XML!
<!--Goodbye XML-->
</GREETING>
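That comments are invisible to the application can be checked with a parser. The following is a small Python sketch, assuming the standard xml.etree.ElementTree module with its default settings, which discard comments:

```python
import xml.etree.ElementTree as ET

LISTING = """<?xml version="1.0" standalone="yes"?>
<!--This is Listing 3-2 from The XML Bible-->
<GREETING>
Hello XML!
<!--Goodbye XML-->
</GREETING>"""

root = ET.fromstring(LISTING)

# Collect all character data inside GREETING; comment text is not part of it.
visible_text = "".join(root.itertext())

print(repr(visible_text))
```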
Entity References
Entity references are markup that is replaced with character data when the document
is parsed. Entity references are used in XML documents in place of specific characters that
would otherwise be interpreted as part of markup.
CDATA
Most of the time anything inside a pair of angle brackets (<>) is markup and anything
that’s not is character data. However there is one exception. In CDATA sections all text
is pure character data. Anything that looks like a tag or an entity reference is really just
the text of the tag or the entity reference. The XML processor does not try to interpret it
in any way.
CDATA sections are used when you want all text to be interpreted as pure character data
rather than as markup. This is primarily useful when you have a large block of text that
contains a lot of <, >, &, or " characters, but no markup. This would be true for much C
and Java source code.
CDATA sections are also extremely useful if you’re trying to write about XML in XML.
For example, this book contains many small blocks of XML code.
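A CDATA section begins with <![CDATA[ and ends with ]]>. The following Python sketch, assuming the standard xml.etree.ElementTree module and an invented CODE element holding a line of C-like source, shows that everything between the delimiters reaches the application as plain character data.

```python
import xml.etree.ElementTree as ET

# The C-like line contains <, >, & and " characters, plus something
# that looks like a tag. Inside CDATA none of it is treated as markup.
SNIPPET = """<CODE><![CDATA[if (a < b && b > c) { puts("<B>ok</B>"); }]]></CODE>"""

code = ET.fromstring(SNIPPET)

print(code.text)
```

Without the CDATA section, each < and & in the line would have to be written as an entity reference.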
Tags
What distinguishes XML files from plain text files is markup. The largest part of the
markup is the tags. While you saw how tags are used in the previous chapter, this section will
define what tags are and provide a broader picture of how they’re used.
In brief, a tag is anything in an XML document that begins with < and ends with >
and is not inside a comment or a CDATA section. Thus, an XML tag has the same form as an
HTML tag. Start or opening tags begin with a < which is followed by the name of the tag.
End or closing tags begin with a </ which is followed by the name of the tag. The first >
encountered closes the tag.
Tag Names
Every tag has a name. Tag names must begin with a letter or an underscore (_).
Subsequent characters in the name may include letters, digits, underscores, hyphens, and
periods. They may not include white space. (The underscore often substitutes for white
space.) The following are some legal XML tags:
<HELP>
<Book>
<volume>
<heading1>
<section.paragraph>
<Mary_Smith>
<_8ball>
The following are not syntactically correct XML tags:
<Book%7>
<volume control>
<1heading>
<Mary Smith>
<.employee.salary>
The rules for tag names actually apply to names of many other things as well. The
same rules are used for attribute names, ID attribute values, entity names, and a number of
other constructs you’ll encounter in the next several chapters.
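The naming rules stated above can be written as a regular expression. The following Python sketch is deliberately simplified: it assumes ASCII letters only, whereas XML also allows letters from other scripts, and the function name is_legal_name is made up for this illustration.

```python
import re

# First character: a letter or underscore.
# Later characters: letters, digits, underscores, hyphens, periods.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_legal_name(name):
    """Return True if name follows the simplified naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


LEGAL = ["HELP", "Book", "volume", "heading1",
         "section.paragraph", "Mary_Smith", "_8ball"]
ILLEGAL = ["Book%7", "volume control", "1heading",
           "Mary Smith", ".employee.salary"]

for name in LEGAL + ILLEGAL:
    print(name, is_legal_name(name))
```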
Closing tags have the same name as their opening tag but are prefixed with a / after
the initial angle bracket. For example, if the opening tag is <FOO>, then the closing tag is
</FOO>. These are the end tags for the previous set of legal start tags.
</HELP>
</Book>
</volume>
</heading1>
</section.paragraph>
</Mary_Smith>
</_8ball>
XML names are case sensitive. This is different from HTML where <P> and <p> are
the same tag, and a </p> can close a <P> tag. The following are not end tags for the set of
legal start tags we’ve been discussing.
</help>
</book>
</Volume>
</HEADING1>
</Section.Paragraph>
</MARY_SMITH>
</_8BALL>
Empty Tags
Many HTML tags that do not contain data do not have closing tags. For example,
there are no </LI>, </IMG>, </HR>, or </BR> tags in HTML. Some page authors do include
</LI> tags after their list items, and some HTML tools also use </LI>. However the HTML
4.0 standard specifically denies that this is required. Like all unrecognized tags in HTML, the
presence of an unnecessary </LI> has no effect on the rendered output.
XML distinguishes between tags that have closing tags and tags that do not, called
empty tags. Empty tags are closed with a slash and a closing angle bracket (/>). For example,
<BR/> or <HR/>.
Current Web browsers deal inconsistently with tags like this. However, if you’re
trying to maintain backwards compatibility, you can use closing tags instead, and just not
include any text in them. For example,
<BR></BR>
<HR></HR>
<IMG></IMG>
Attributes
As discussed in the previous chapter, start tags and empty tags may optionally contain
attributes. Attributes are name-value pairs separated by an equals sign (=).
For example,
<GREETING LANGUAGE="English">
Hello XML!
<MOVIE SRC="WavingHand.mov"/>
</GREETING>
Attribute Names
Attribute names are strings that follow the same rules as tag names. That is, attribute
names must begin with a letter or an underscore (_). Subsequent characters in the name may
include letters, digits, underscores, hyphens, and periods. They may not include white space.
(The underscore often substitutes for whitespace.)
The same tag may not have two attributes with the same name. For example, the
following is illegal:
<RECTANGLE SIDE="8cm" SIDE="10cm"/>
Attribute Values
Attribute values are also strings. Even when the string shows a number, as in the
LENGTH attribute below, that number is the two characters 7 and 2, not the binary number
72.
<RULE LENGTH="72"/>
Unlike attribute names, there are few limits on the content of an attribute value.
Attribute values may contain white space, begin with a number, or contain any punctuation
characters (except, sometimes, single and double quotes).
XML attribute values are delimited by quote marks. Unlike HTML attributes, XML
attributes must be enclosed in quotes. Most of the time double quotes are used. However, if
the attribute value itself contains a double quote, then single quotes may be used. For
example:
<RECTANGLE LENGTH='7"' WIDTH='8.5"'/>
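The choice between single and double quotes can be left to a library. The following is a short Python sketch, assuming the standard xml.sax.saxutils.quoteattr function and xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Parsing: the single quotes are delimiters, the double quote is data.
rectangle = ET.fromstring("""<RECTANGLE LENGTH='7"' WIDTH='8.5"'/>""")
length = rectangle.get("LENGTH")

# Writing: quoteattr picks a delimiter that does not clash with the value.
quoted_inches = quoteattr('7"')
quoted_plain = quoteattr("72")

print(length, quoted_inches, quoted_plain)
```

Note that length is the string 7" and not a number, in line with the earlier remark that attribute values are always strings.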
1. The XML declaration must begin the document
This is the XML declaration for stand-alone documents in XML 1.0:
<?xml version="1.0" standalone="yes"?>
If the declaration is present at all, it must be absolutely the first thing in the file
because XML processors read the first several bytes of the file and compare those bytes
against various encodings of the string <?xml to determine which character set is being used
(UTF-8, big-endian Unicode, or little-endian Unicode).
For example, the following is not a well-formed XML file because of the extra spaces at the
front of the line:
   <?xml version="1.0" standalone="yes"?>
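This rule is easy to confirm with a parser. The following Python sketch assumes the standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

DECLARATION = '<?xml version="1.0" standalone="yes"?>'
BODY = "<GREETING>Hello XML!</GREETING>"


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


at_start = is_well_formed(DECLARATION + BODY)
after_spaces = is_well_formed("   " + DECLARATION + BODY)
without_declaration = is_well_formed(BODY)

print(at_start, after_spaces, without_declaration)
```

The third case shows that the declaration may be left out altogether; what is not allowed is placing anything in front of it.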
2. Use Both Start and End Tags in Non-Empty Tags
Web browsers are relatively forgiving if you forget to close an HTML tag. For
instance, if a document includes a <B> tag but no corresponding </B> tag, the entire
document after the <B> tag will be made bold. However, the document will still be
displayed. XML is stricter: every start tag must be matched by a corresponding end tag.
3. End Empty Tags with “/>”
Tags that do not contain data, such as HTML’s <BR>, <HR>, and <IMG>, do not
require closing tags. However, empty XML tags must be identified by closing with a /> rather
than just a >. For example, the XML equivalents of <BR>, <HR>, and <IMG> are <BR/>,
<HR/>, and <IMG/>. Alternatively, an empty element may be written as a start tag followed
immediately by its end tag:
<BR></BR>
<HR></HR>
<IMG></IMG>
4. Let One Element Completely Contain All Other Elements
An XML document has a root element that completely contains all other elements
of the document. This is sometimes called the document element instead.
These tags may have, but do not have to have, the name root or DOCUMENT. For
instance, in the following document the root element is GREETING.
<?xml version="1.0" standalone="yes"?>
<GREETING>
Hello XML!
</GREETING>
5. Do Not Overlap Elements
Elements may contain (and indeed often do contain) other elements. However,
elements may not overlap. Practically, this means that if an element contains a start tag for an
element, it must also contain the corresponding end tag.
For example, the following is acceptable XML:
<PRE><CODE>n = n + 1;</CODE></PRE>
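The no-overlap rule amounts to saying that tags must close in the reverse of the order in which they opened, which is exactly what a stack checks. The following Python sketch is a simplified illustration and not a real XML parser: it assumes ASCII tag names and ignores comments and CDATA sections, and the name properly_nested is made up.

```python
import re

# Captures: optional "/" of an end tag, the tag name, optional "/" of an empty tag.
TAG = re.compile(r"<(/?)([A-Za-z_][A-Za-z0-9_.\-]*)[^<>]*?(/?)>")


def properly_nested(markup):
    """Return True if every end tag closes the most recently opened element."""
    open_elements = []
    for closing, name, empty in TAG.findall(markup):
        if empty:
            continue                      # <BR/> opens and closes at once
        if not closing:
            open_elements.append(name)    # start tag: remember it
        elif not open_elements or open_elements.pop() != name:
            return False                  # end tag does not match
    return not open_elements              # nothing may be left open


nested = properly_nested("<PRE><CODE>n = n + 1;</CODE></PRE>")
overlapping = properly_nested("<PRE><CODE>n = n + 1;</PRE></CODE>")

print(nested, overlapping)
```

In the second call the CODE element starts inside PRE but ends outside it, so the two elements overlap and the check fails.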
6. Enclose Attribute Values in Quotes
XML requires all attribute values to be enclosed in quote marks, whether or not the
attribute value includes spaces. For example:
<A HREF="http://metalab.unc.edu/xml/">
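A quick way to see the quoting rule in action is to parse the same element with and without quote marks. This is a sketch under assumptions: Python's standard xml.etree.ElementTree module stands in for an XML processor, and the link text and </A> end tag are added so that each sample is a complete document.

```python
import xml.etree.ElementTree as ET

DOUBLE_QUOTED = '<A HREF="http://metalab.unc.edu/xml/">link</A>'
SINGLE_QUOTED = "<A HREF='http://metalab.unc.edu/xml/'>link</A>"
UNQUOTED = "<A HREF=http://metalab.unc.edu/xml/>link</A>"

# Either kind of quote mark is acceptable and yields the same attribute value.
href_double = ET.fromstring(DOUBLE_QUOTED).get("HREF")
href_single = ET.fromstring(SINGLE_QUOTED).get("HREF")

# An unquoted value is a well-formedness error.
try:
    ET.fromstring(UNQUOTED)
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(href_double == href_single, unquoted_accepted)
```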
7. Only Use < and & to Start Tags and Entities
XML assumes that the opening angle bracket always starts a tag, and that the
ampersand always starts an entity reference. (This is often true of HTML as well, but most
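browsers will assume the semicolon if you leave it out.) For example, consider this line:
<H1>A Homage to Ben & Jerry’s
New York Super Fudge Chunk Ice Cream</H1>
A Web browser will probably display it correctly, but in XML the raw ampersand must be
escaped as &amp;:
<H1>A Homage to Ben &amp; Jerry’s
New York Super Fudge Chunk Ice Cream</H1>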
8. Only Use the Five Preexisting Entity References
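You’re probably familiar with a number of entity references from HTML. For
example, &copy; inserts the copyright symbol © and &reg; inserts the registered trademark
symbol ®. XML predefines only five entity references: &amp;, &lt;, &gt;, &apos;, and &quot;.
Any other entity reference must be declared in a DTD before it is used.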
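Rules 7 and 8 can be tried out together. The following sketch assumes Python's standard xml.etree.ElementTree module as the XML processor and uses shortened sample strings written for the illustration. It shows that a raw ampersand and an HTML-only entity such as &copy; are both rejected, while the predefined references and a numeric character reference are accepted.

```python
import xml.etree.ElementTree as ET


def parse_text(doc):
    """Hypothetical helper: the root element's text, or None if doc is not well-formed."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


raw_ampersand = parse_text("<H1>A Homage to Ben & Jerry's</H1>")
escaped_ampersand = parse_text("<H1>A Homage to Ben &amp; Jerry's</H1>")

# The five predefined entity references need no declaration.
predefined = parse_text("<T>&amp; &lt; &gt; &apos; &quot;</T>")

# &copy; comes from HTML and is undefined in a document without a DTD.
html_entity = parse_text("<T>&copy;</T>")

# A numeric character reference for the same symbol works everywhere.
numeric_reference = parse_text("<T>&#169;</T>")

print(raw_ampersand, escaped_ampersand, predefined, html_entity, numeric_reference)
```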
Well-Formed HTML
Real-World Web Page Problems
Real-world Web pages are extremely sloppy. Tags aren’t closed. Elements overlap.
Raw less-than signs are included in pages. Semicolons are omitted from the ends of entity
references. Web pages with these problems are formally invalid, but most Web browsers
accept them. Nonetheless, your Web pages will be cleaner, display faster, and be easier to
maintain if you fix these problems.
Some of the common problems that Web pages have include the following:
1. Start tags without matching end tags (unclosed elements)
2. End tags without start tags
3. Overlapping elements
4. Unquoted attributes
5. Unescaped <, >, &, and " signs
6. No root element
7. End tag case doesn’t match start tag case
There are also some rules that only apply to XML documents, and that may actually
cause problems if you attempt to integrate them into your existing HTML pages. These
include:
1. Begin with an XML declaration
2. Empty tags must be closed with a />
3. The only entity references used are &amp;, &lt;, &gt;, &apos; and &quot;
Close All Start Tags
Any element that contains content, whether text or other child elements, should have a
start tag and an end tag. HTML doesn’t absolutely require this. For instance, <P>, <DT>,
<DD>, and <LI> are often used in isolation.
A Character Set for the Script
Computers only understand numbers. Before they can work with text, that text has to
be encoded as numbers in a specified character set. For example, the popular ASCII character
set encodes the capital letter ‘A’ as 65. The capital letter ‘B’ is encoded as 66. ‘C’ is 67, and
so on.
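The mapping from characters to numbers is easy to inspect. This small sketch uses Python's built-in ord and chr functions and the ascii codec; the variable names are chosen only for the example.

```python
# Look up the ASCII code of each capital letter mentioned above.
codes = {letter: ord(letter) for letter in "ABC"}
print(codes)

# Encoding text in a character set yields exactly those numbers as bytes.
encoded = list("ABC".encode("ascii"))
print(encoded)

# The mapping also runs the other way: a number identifies a character.
print(chr(65))
```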
A Font for the Character Set
A font is a collection of glyphs for a character set, generally in a specific size, face,
and style. For example, the letter C set in three different typefaces is the same character
each time, but each is drawn with a different glyph. Nonetheless their essential meaning is the same.
Exactly how the glyphs are stored varies from system to system. They may be
bitmaps or vector drawings; they may even consist of hot lead on a printing press. The form
they take doesn’t concern us here. The key idea is that a font tells the computer how to draw
each character in the character set.
An Input Method for the Character Set
An input method enables you to enter text. English speakers don’t think much about
the need for an input method for a script. We just type on our keyboards and everything’s
hunky-dory. The same is true in most of Europe, where all that’s needed is a slightly
modified keyboard with a few extra umlauts, cedillas, or thorns (depending on the country).
XML addresses this problem by moving beyond small, local character sets to one
large set that’s supposed to encompass all scripts used in all living languages (and a few dead
ones) on planet Earth. This character set is called Unicode.
An XML document is divided into text and binary entities. Each text entity has an
encoding. If the encoding is not explicitly specified in the entity’s definition, then the default
is UTF-8, a compressed form of Unicode which leaves pure ASCII text unchanged.
Thus XML files that contain nothing but the common ASCII characters may be edited
with tools that are unaware of the complications of dealing with multi-byte character sets like
Unicode.
The ASCII Character Set
ASCII, the American Standard Code for Information Interchange, is one of the
original character sets, and is by far the most common. It forms a sort of lowest common
denominator for what a character set must support. It defines all the characters needed to
write U.S. English, and essentially nothing else.
languages. Unicode characters 0 through 255 are identical to Latin-1 characters 0 through
255.
Procedure to Write XML Unicode
Unicode is the native character set of XML, and XML browsers will probably do a
pretty good job of displaying it, at least to the limits of the available fonts. Nonetheless, there
simply aren’t many if any text editors that support the full range of Unicode. Consequently,
you’ll probably have to tackle this problem in one of a couple of ways:
1. Write in a localized character set like Latin-3; then convert your file to Unicode.
2. Include Unicode character references in the text that numerically identify particular
characters.
The first option is preferable when you’ve got a large amount of text to enter in
essentially one script, or one script plus ASCII. The second works best when you need to mix
small portions of multiple scripts into your document.
Inserting Characters in XML Files with Character References
Every Unicode character is identified by a number; those discussed here lie between 0 and
65,535, although the Unicode code space now extends to 1,114,111. If you do not have a text
editor that can write in Unicode, you can always use a character reference to insert the
character in your XML file instead.
A Unicode character reference consists of the two characters &# followed by the
character code, followed by a semicolon. For instance, the Greek letter π has Unicode value
960 so it may be inserted in an XML file as &#960;. The Cyrillic character Ҷ has Unicode value
1206 so it can be included in an XML file with the character reference &#1206;.
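Character references can be generated and checked mechanically. The sketch below assumes Python's standard xml.etree.ElementTree module as the XML processor; the helper name char_ref and the LETTERS element are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def char_ref(character):
    """Hypothetical helper: the decimal character reference for one character."""
    return "&#%d;" % ord(character)


# Build the references for the Greek and Cyrillic letters discussed above.
pi_ref = char_ref("\u03c0")   # Greek small letter pi, value 960
che_ref = char_ref("\u04b6")  # Cyrillic capital che with descender, value 1206

# A pure-ASCII document that still carries both letters.
document = "<LETTERS>" + pi_ref + " " + che_ref + "</LETTERS>"

# The parser turns the references back into the characters themselves.
decoded = ET.fromstring(document).text

print(document)
print(decoded)
```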
Converting to and from Unicode
Application software that exports XML files, such as Adobe FrameMaker, handles the
conversion to Unicode or UTF-8 automatically. Otherwise you’ll need to use a conversion
tool. Sun’s freely available Java Development Kit (JDK) includes a simple command-line
utility called native2ascii that converts between many common and uncommon localized
character sets and Unicode.
For example, the following command converts a text file named myfile.txt from the
platform’s default encoding to Unicode:
C:\> native2ascii myfile.txt myfile.uni
You can specify other encodings with the -encoding option:
C:\> native2ascii -encoding Big5 chinese.txt chinese.uni
You can also reverse the process to go from Unicode to a local encoding with the -reverse
option:
C:\> native2ascii -encoding Big5 -reverse chinese.uni chinese.txt
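Where the JDK tool is not at hand, the same kind of conversion can be sketched with Python's built-in codecs. This is a swapped-in technique, not a reimplementation of native2ascii: it converts between two byte encodings in memory and does not produce native2ascii's output format. The function name convert and the sample text are assumptions made for the example.

```python
def convert(data, source_encoding, target_encoding="utf-8"):
    """Hypothetical helper: re-encode bytes from one character set to another."""
    return data.decode(source_encoding).encode(target_encoding)


# Two Chinese characters stored in the Big5 character set (two bytes each).
big5_bytes = "\u4e2d\u6587".encode("big5")

# Local encoding to UTF-8 ...
utf8_bytes = convert(big5_bytes, "big5")

# ... and back again, the counterpart of the reverse direction.
round_trip = convert(utf8_bytes, "utf-8", "big5")

print(len(big5_bytes), len(utf8_bytes), round_trip == big5_bytes)
```

For files, the same function can be applied to the bytes read from one file before writing them to another.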
How to Write XML in Other Character Sets
Unless told otherwise, an XML processor assumes that text entity characters are
encoded in UTF-8. Since UTF-8 includes ASCII as a subset, ASCII text is easily parsed by
XML processors as well.
The only character set other than UTF-8 that an XML processor is required to
understand is raw Unicode (UTF-16). If you cannot convert your text into either UTF-8 or raw
Unicode, you can leave the text in its native character set and tell the XML processor which
set that is. This should be a last resort, though, because there’s no guarantee an arbitrary
XML processor can process other encodings. Nonetheless Netscape Navigator and Internet
Explorer both do a pretty good job of interpreting the common character sets.
To warn the XML processor that you’re using a non-Unicode encoding, you include
an encoding attribute in the XML declaration at the start of the file. For example, to specify
that the entire document uses Latin-1 by default (unless overridden by another processing
instruction in a nested entity) you would use this XML declaration:
<?xml version="1.0" encoding="ISO-8859-1" ?>
An external parsed entity can declare its own encoding in a text declaration, which must be
the first thing in that entity:
<?xml encoding="ISO-8859-1"?>
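The difference an encoding declaration makes can be seen by parsing the same Latin-1 bytes with and without it. The sketch assumes Python's standard xml.etree.ElementTree module as the XML processor; the NAME element and its content are invented for the example.

```python
import xml.etree.ElementTree as ET

# "Cafe" with an accented final e; in Latin-1 that letter is the single byte 0xE9.
content = "<NAME>Caf\u00e9</NAME>"

declared = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + content).encode("iso-8859-1")
undeclared = content.encode("iso-8859-1")

# With the declaration, the parser decodes the Latin-1 bytes correctly.
name = ET.fromstring(declared).text

# Without it, the parser assumes UTF-8, where the lone byte 0xE9 is invalid.
try:
    ET.fromstring(undeclared)
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False

print(name, undeclared_accepted)
```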
WAP AND XML
ONE MARK QUESTIONS
UNIT - I
1. WAP is:
wired
wireless
both 1 and 2
None of these
computers
networks
pagers
mobile phones
WAP
WML
WASP
WHTML
internet
network
computer
mobile
False
True
not always
None of these
built
developed
inherited
linked
internet browsers
web browsers
macro browsers
micro browsers
9. WAP homepages are not too different from the HTML homepages.
False
Not always
True
None of these
case sensitive
case insensitive
not always
None of these
tables
images
text
pictures
<br/>
<br>
%br%
?br
cards
decks
both 1 and 2
None of these
decks
pages
files
cards
True
False
None of these
UNIT – II
1. What is WMLScript ?
programming language
scripting language
system software
operating system
ASP pages
XML pages
WML pages
HTML pages
Web
Mobile
WAP
All of these
None of these
5. WML is:
Wired Markup Language
None of these
ASP
DHTML
HTML
PHP
DECKS
CARDS
PAGES
FORMS
PAGES
FORMS
TABLES
CARDS
high
light
strong
URLs
forms
pages
links
object code
binary code
byte code
Hexa code
W3C
WAP
Java
HTTP
.wml
.wap
.wmlscript
.wmls
YES
Can't say
None of these
alert()
confirm()
error()
prompt()
UNIT-III
1. What is the function of prompt() ?
None of these
numeric handling
math
string
character
Returns the nearest integer that is not greater than a specified number
Returns the nearest integer that is not smaller than a specified number
None of these
6. The <prev> task represents the action of going back to the previous:
deck
page
card
block
page
deck
block
card
everything
nothing
something
None of these
9. The <do> tag can be used to activate the task when the user clicks on a
word or phrase on the screen.
Not always
False
True
None of these
10. Which of these are not task elements?
<go>
<noop>
<refresh>
<do>
UNIT- IV
1. What does XML stand for?
A. eXtra Modern Link
B. eXtensible Markup Language
C. Example Markup Language
D. X-Markup Language
2. What is the correct syntax of the declaration which defines the XML version?
A. <xml version="1.0" />
B. <?xml version="1.0"?>
C. <?xml version="1.0" />
D. None of the above
20. A DTD includes the specifications about the markup that can be used within the
document. The specification consists of all EXCEPT
A. the browser name
B. the size of element name
C. entity declarations
D. element declarations
28. In XML
A. the internal DTD subset is read before the external DTD
B. the external DTD subset is read before the internal DTD
C. there is no external type of DTD
D. there is no internal type of DTD
A. (i) is correct
B. (i),(ii) are correct
C. (ii),(iii) are correct
D. (i),(ii),(iii) are correct
30. To use the external DTD we have the syntax
A. <?xml version="1.0" standalone="no"?>
<! DOCTYPE DOCUMENT SYSTEM "order.dtd"?>
B. <?xml version="1.0" standalone="yes"?>
<! DOCTYPE DOCUMENT SYSTEM "order.dtd"?>
C. <?xml version="1.0" standalone="no"?>
<! DOCTYPE DOCUMENT "order.dtd"?>
D. <?xml version="1.0" standalone="yes"?>
<! DOCTYPE DOCUMENT SYSTEM "order.dtd"?>
31. To add the attribute named Type to the <customer> tag the syntax will be
A. <customer attribute Type="excellent">
B. <customer Type attribute="excellent">
C. <customer Type attribute_type="excellent">
D. <customer Type="excellent">
33. You can name the schema using the name attribute like
A. <schema attribute="schema1">
B. <schema nameattribute="schema1">
C. <schema nameattri="schema1">
D. <schema name="schema1">
34. The default model for complex type, in XML schemas for element is
A. textOnly
B. elementOnly
C. no default type
D. both A & B
35. Microsoft XML Schema data type for hexadecimal digits representing octets
A. UID
B. UXID
C. UUID
D. XXID
39. Among the simple types built into XML Schema, the boolean type holds
A. True, False
B. 1,0
C. both A & B
D. True/False and any number except 0
40. Among the simple types built into XML Schema, the float type has single precision of ________ floating
point
A. 16 bit
B. 32 bit
C. 8 bit
D. 4 bit
A. (i) only
B. (ii) only
C. (ii),(iii) only
D. all
43. The default model for complex type, in XML schemas for element is
A. textOnly
B. elementOnly
C. no default type
D. both A & B
47. To Bind the HTML elements with DSO we use _________ attribute
A. DATASOURCE
B. DATAFIELD
D. DATAFLD
48. To bind an HTML <INPUT> element of type text to the data source "dsoCustomer"
we use
A. <INPUT TYPE="TEXT" DATAFIELD="#dsoCustomer">
B. <INPUT TYPE="TEXT" DATASRC="dsoCustomer">
C. <INPUT TYPE="TEXT" DATASRC="#dsoCustomer">
D. <INPUT TYPE="TEXT" DATAFLD="#dsoCustomer">
49. XML DSOs have a property for the number of pages of data the recordset contains
A. count
B. number
C. pageCount
D. pageNumber
UNIT-V
51. For an XML document to be valid
A. the document needs to be well formed also
B. the document need not be well formed
C. the document needs to be well formed & valid
D. document validity has no relationship with well formedness
A. (i) is correct
B. (ii) is correct
C. both are correct
55. To match the root node in an XSLT transform the syntax will be
A. <xsl:template match="Document">
B. <xsl:template match="Root">
C. <xsl:template match="RootNode">
D. <xsl:template match="/">
56. To match a specific XML child element such as <NAME> of a parent element such as <PLANET>, the syntax will be
A. <xsl:template match="PLANET_NAME">
B. <xsl:template match="PLANET/NAME">
C. <xsl:template match="/NAME">
D. <xsl:template match="//">
57. PI in XML specification stands for
A. 3.14
B. priceless instruction
C. processing instruction
D. polymorphic inheritance
62. Identify the most accurate statement about the application of XML:
A. XML must be used to produce XML and HTML output.
B. XML cannot specify or contain presentation information.
C. XML is used to describe hierarchically organized information.
D. XML performs the conversion of information between different e-business applications.
63. The XSL formatting object which formats the data and caption of a table is
A. table
B. table-content
C. table-text
D. none of the above
64. The XSL formatting object which holds the content of the table body
A. table
B. table-body
C. table-content
D. table-footer
65. The XSL formatting object which formats the data in a table
A. table
B. table-body
C. title
D. table-content
66. The XSL formatting object used to hold the content of the label of a list item is
A. list-block
B. list item
C. list-item-body
D. list-item-label
67. The XSL formatting object used to hold the contents of the body of a list item is
A. list-block
B. list item
C. list-item-body
D. list-item-label
70. The syntax for writing the minimum occurrence for an element is
A. <xsd:element ref="note" min="0" />
B. <xsd:elements ref="note" min="0" />
C. <xsd:elements ref="note" minOccur="0" />
D. <xsd:elements ref="note" minOccurs="0" />
A. the input and output of the XSLT processor must be unparsed XML documents
B. the input and output of the XSLT processor must be a hierarchical tree representing an
XML document
C. the XSLT processor must be called from a web agent
D. the XSLT processor must be given the DTD as well as the XML document instance
A. XPath identifies the order or path of processing to be followed as the XSL language is
processed
B. XPath identifies locations in XML data to be transformed in the source tree and the
locations to be generated in output tree specified in XSL translation prescriptions
C. XPath identifies the path to be followed in the execution of XSL translation prescriptions
D. XPath specifies which XSL transform files are to be used in the translation of XML
74. Which statement correctly describes the capabilities of the XSLT language?
A. XSLT uses the DTD to determine how XML documents will be translated
B. XSLT specifies how a hierarchical tree, representable by an XML document, may be
translated into non-hierarchical formats
C. XSLT specifies how a hierarchical tree, representable by an XML document, may be
translated into another hierarchical tree, also representable by an XML document
D. XSLT specifies the formatting style to be used to render an XML document
76. The transformation of an XML document into another type of document by XSLT can be
done by
77: To match the root node in an XSLT transform the syntax will be
A. <xsl:template match="Document">
B. <xsl:template match="Root">
C. <xsl:template match="RootNode">
D. <xsl:template match="/">
78: To match a specific XML element in XSLT, the syntax for the given name "rootnode"
is
A. <xsl:template match="root">
B. <xsl:template match="/">
C. <xsl:template match="rootnode">
D. <xsl:template match="//">
79. To match a specific XML child element such as <NAME> of a parent element such as <PLANET>, the syntax will be
A. <xsl:template match="PLANET_NAME">
B. <xsl:template match="PLANET/NAME">
C. <xsl:template match="/NAME">
D. <xsl:template match="//">
80. In an XSLT style sheet we have syntax to match elements with id as (if id is "change")
81. To match the text node (in XSLT) the syntax will be
A. <xsl:template match="text">
B. <xsl:template match-text="text">
C. <xsl:template match=text()>
D. <xsl:template match="text()">
84: Which of the following specify that the order and content of "membership" is not
important
A. <!ELEMENT membership NORULE>
B. <!ELEMENT membership EMPTY>
C. <!ELEMENT membership ALL>
D. <!ELEMENT membership ANY>
85: Which of the following is used to specify the attribute list of an element
A. ATTLIST
B. ?ATTLIST
C. !ATTLIST
D. #ATTLIST
86: Which of the following instruct the browser which stylesheet to use
A. <xml-stylesheet type="text/xsl" href="cd.xsl">
B. <xml-stylesheet type="text/xsl" xsl="cd.xsl">
C. <?xml-stylesheet type="text/xsl" href="cd.xsl"?>
D. <?xml-stylesheet type="text/xsl" xsl="cd.xsl"?>
88: Which of the following XSLT Patterns is used to match any descendant nodes
A. /
B. //
C. .
D. ..
89: Which of the following XSLT Patterns is used to match the parent node
A. /
B. //
C. .
D. ..
A. XML developed from HTML because WEB browsers became more powerful.
B. XML is designed as a replacement because SGML can not be used for document
development.
C. XML builds on HTML's ability to provide content to virtually any audience by adding
the power of intelligent content.
D. XML is the modern replacement for HTML and SGML, taking the good points from each,
making both of those languages obsolete.
A. Develop DTD, conduct a pilot project, create a modular library, train staff.
B. Train staff, convert legacy documents, develop DTD, create modular library.
C. Conduct pilot program, train staff, create modular library, develop DTD
D. Conduct pilot program, train staff, develop DTD, convert documents, purchase XML tools.