MST Unit-1
Web Application
The client sends an HTTP request, and the server answers with an HTML page, also delivered over HTTP.
HTTP Methods:
HTTP requests can be made using a variety of methods, but the ones you will use most often are GET and POST. The method name tells the server the kind of request that is being made and how the rest of the message will be formatted.
HTTP Methods and Descriptions:
CONNECT: Reserved for use with a proxy that can switch to being a tunnel.
PUT: This is the same as POST, but while POST is used to create a resource, PUT is used to create or replace content at a URL the client already knows.
Differences between GET and POST:
GET: Data is sent in the URL (as a query string) to the server. POST: Data is sent in the request body.
GET: Only a limited amount of data can be sent. POST: A large amount of data can be sent.
GET: Less secure, because data is exposed in the URL. POST: More secure, because data is not exposed in the URL.
GET: Requests can be bookmarked and are more efficient. POST: Requests cannot be bookmarked.
Following are some basic differences between the PUT and the POST methods:
PUT must be used for CREATE when the client already knows the URL before the resource is created; with POST, the server decides the URL of the new resource.
PUT is also idempotent: repeating the same PUT request has the same effect as sending it once, which is not true of POST.
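To make these differences concrete, here is a minimal Python sketch using the standard http.client module. The host httpbin.org and its paths are illustrative assumptions (it simply echoes requests back), not part of the notes above.

# Sketch of GET, POST, and PUT requests with Python's standard library.
import http.client
import json

conn = http.client.HTTPSConnection("httpbin.org")

# GET: data travels in the URL as a query string, visible and limited in size.
conn.request("GET", "/get?name=alice&page=2")
resp = conn.getresponse()
print("GET", resp.status)
resp.read()        # drain the body so the connection can be reused

# POST: data travels in the request body, so larger payloads are possible.
body = json.dumps({"name": "alice"})
headers = {"Content-Type": "application/json"}
conn.request("POST", "/post", body=body, headers=headers)
resp = conn.getresponse()
print("POST", resp.status)
resp.read()

# PUT: like POST, but targets a URL the client already knows and replaces
# whatever is stored there (idempotent).
conn.request("PUT", "/put", body=body, headers=headers)
resp = conn.getresponse()
print("PUT", resp.status)
resp.read()

conn.close()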
Put simply, an authoritative DNS server is a server that actually holds, and is
responsible for, DNS resource records. This is the server at the bottom of the DNS
lookup chain that will respond with the queried resource record, ultimately allowing
the web browser making the request to reach the IP address needed to access a website
or other web resources. An authoritative nameserver can satisfy queries from its own
data without needing to query another source, as it is the final source of truth for certain DNS records.
It’s worth mentioning that in instances where the query is for a subdomain such as
foo.example.com or blog.cloudflare.com, an additional nameserver will be added to
the sequence after the authoritative nameserver, which is responsible for storing the
subdomain’s CNAME record.
There is a key difference between many DNS services and the one that Cloudflare
provides. Different DNS recursive resolvers such as Google DNS, OpenDNS, and
providers like Comcast all maintain data center installations of DNS recursive
resolvers. These resolvers allow for quick and easy queries through optimized
clusters of DNS-optimized computer systems, but they are fundamentally different
than the nameservers hosted by Cloudflare.
Note: Often DNS lookup information will be cached either locally inside the querying
computer or remotely in the DNS infrastructure. There are typically 8 steps in a DNS
lookup. When DNS information is cached, steps are skipped from the DNS lookup
process which makes it quicker. The example below outlines all 8 steps when nothing
is cached.
1. A user types 'example.com' into a web browser, and the query travels into the Internet where it is received by a DNS recursive resolver.
2. The resolver then queries a DNS root nameserver.
3. The root server then responds to the resolver with the address of a Top Level Domain (TLD) DNS server (such as .com or .net), which stores the information for its domains. When searching for example.com, our request is pointed toward the .com TLD.
4. The resolver then makes a request to the .com TLD.
5. The TLD server then responds with the IP address of the domain's nameserver, example.com.
6. Lastly, the recursive resolver sends a query to the domain's nameserver.
7. The IP address for example.com is then returned to the resolver from the nameserver.
8. The DNS resolver then responds to the web browser with the IP address of the domain requested initially.
Once the 8 steps of the DNS lookup have returned the IP address for example.com, the browser is able to make the request for the web page:
9. The browser makes an HTTP request to the IP address.
10. The server at that IP returns the webpage to be rendered in the browser.
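As a quick illustration of the end result of this lookup, here is a minimal Python sketch using the standard socket module, which simply hands the query to the operating system's stub resolver and the recursive resolver behind it.

# Resolve a hostname to IP addresses via the OS stub resolver, which relies
# on a DNS recursive resolver exactly as described in the steps above.
import socket

infos = socket.getaddrinfo("example.com", 80, proto=socket.IPPROTO_TCP)
for family, _type, _proto, _canon, sockaddr in infos:
    # prints the address family (IPv4/IPv6) and the resolved IP address
    print(family.name, sockaddr[0])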
There are three types of DNS queries:
1. Recursive query - in a recursive query, a DNS client requires that a DNS server (typically a DNS recursive resolver) respond to the client with either the requested resource record or an error message if the resolver can't find the record.
2. Iterative query - in this situation the DNS client will allow a DNS server to return the best answer it can. If the queried DNS server does not have a match for the query name, it will return a referral to a DNS server authoritative for a lower level of the domain namespace. The DNS client will then make a query to the referral address. This process continues with additional DNS servers down the query chain until either an error or timeout occurs.
3. Non-recursive query - typically this will occur when a DNS resolver client
queries a DNS server for a record that it has access to either because it's
authoritative for the record or the record exists inside of its cache. Typically, a
DNS server will cache DNS records to prevent additional bandwidth
consumption and load on upstream servers.
In Chrome, you can see the status of your DNS cache by going to chrome://net-
internals/#dns.
When the recursive resolver inside the ISP receives a DNS query, like all previous
steps, it will also check to see if the requested host-to-IP-address translation is
already stored inside its local persistence layer.
The recursive resolver also has additional functionality depending on the types of
records it has in its cache:
1. If the resolver does not have the A records, but does have the NS records for the
authoritative nameservers, it will query those name servers directly, bypassing
several steps in the DNS query. This shortcut prevents lookups from the root and
.com nameservers (in our search for example.com) and helps the resolution of
the DNS query occur more quickly.
2. If the resolver does not have the NS records, it will send a query to the TLD
servers (.com in our case), skipping the root server.
3. In the unlikely event that the resolver does not have records pointing to the TLD
servers, it will then query the root servers. This event typically occurs after a DNS
cache has been purged.
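A rough sketch of the "query the authoritative nameservers directly" shortcut, assuming the third-party dnspython package is installed; the NS lookup here stands in for records a recursive resolver would already have cached.

# Bypass the root and TLD servers by asking an authoritative nameserver
# directly (requires the third-party dnspython package).
import dns.resolver

# Step 1: find the authoritative nameservers for the zone (normally these
# NS records would already sit in the recursive resolver's cache).
ns_names = [r.target.to_text() for r in dns.resolver.resolve("example.com", "NS")]

# Step 2: send the A query straight to one of those nameservers.
direct = dns.resolver.Resolver(configure=False)
direct.nameservers = [dns.resolver.resolve(ns_names[0], "A")[0].to_text()]
answer = direct.resolve("example.com", "A")
print([r.to_text() for r in answer])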
Protocols:
Standardized protocols are like a common language that computers can use, similar
to how two people from different parts of the world may not understand each
other's native languages, but they can communicate using a shared third language. If
one computer uses the Internet Protocol (IP) and a second computer does as well,
they will be able to communicate — just as the United Nations relies on its 6 official
languages to communicate amongst representatives from all over the globe. But if
one computer uses IP and the other does not know this protocol, they will be unable
to communicate.
On the Internet, there are different protocols for different types of processes.
Protocols are often discussed in terms of which OSI model layer they belong to.
Types of Protocols:
Transmission Control Protocol (TCP)
Internet Protocol (IP)
User Datagram Protocol (UDP)
Post Office Protocol (POP)
Simple Mail Transfer Protocol (SMTP)
File Transfer Protocol (FTP)
Hyper Text Transfer Protocol (HTTP)
Hyper Text Transfer Protocol Secure (HTTPS)
An overview of HTTP:
HTTP is a protocol which allows the fetching of resources, such as HTML documents.
It is the foundation of any data exchange on the Web and it is a client-server
protocol, which means requests are initiated by the recipient, usually the Web
browser. A complete document is reconstructed from the different sub-documents
fetched, for instance text, layout description, images, videos, scripts, and more.
HTTP sits at the application layer, on top of TCP (transport layer) and IP (network layer), and below the presentation layer. Designed in the early 1990s, HTTP is an
extensible protocol which has evolved over time. It is an application layer protocol
that is sent over TCP, or over a TLS-encrypted TCP connection, though any reliable
transport protocol could theoretically be used. Due to its extensibility, it is used to
not only fetch hypertext documents, but also images and videos or to post content
to servers, like with HTML form results. HTTP can also be used to fetch parts of
documents to update Web pages on demand.
HTTP is a client-server protocol: requests are sent by one entity, the user-agent (or a
proxy on behalf of it). Most of the time the user-agent is a Web browser, but it can
be anything, for example a robot that crawls the Web to populate and maintain a
search engine index.
Each individual request is sent to a server, which handles it and provides an answer,
called the response. Between the client and the server there are numerous entities,
collectively called proxies, which perform different operations and act as gateways or
caches, for example.
In reality, there are more computers between a browser and the server handling the
request: there are routers, modems, and more. Thanks to the layered design of the
Web, these are hidden in the network and transport layers. HTTP is on top, at the
application layer. Although important to diagnose network problems, the underlying
layers are mostly irrelevant to the description of HTTP.
The user-agent is any tool that acts on behalf of the user. This role is primarily
performed by the Web browser; other possibilities are programs used by engineers
and Web developers to debug their applications.
The browser is always the entity initiating the request. It is never the server (though
some mechanisms have been added over the years to simulate server-initiated
messages).
To present a Web page, the browser sends an original request to fetch the HTML
document that represents the page. It then parses this file, making additional
requests corresponding to execution scripts, layout information (CSS) to display, and
sub-resources contained within the page (usually images and videos). The Web
browser then mixes these resources to present to the user a complete document, the
Web page. Scripts executed by the browser can fetch more resources in later phases
and the browser updates the Web page accordingly.
A Web page is a hypertext document. This means some parts of displayed text are
links, which can be activated (usually by a click of the mouse) to fetch a new Web
page, allowing the user to direct their user-agent and navigate through the Web. The
browser translates these directions into HTTP requests, and further interprets the HTTP
responses to present the user with a clear response.
The Web server:
On the opposite side of the communication channel is the server, which serves the
document as requested by the client. A server appears as only a single machine
virtually: this is because it may actually be a collection of servers, sharing the load
(load balancing) or a complex piece of software interrogating other computers (like
cache, a DB server, or e-commerce servers), totally or partially generating the
document on demand.
A server is not necessarily a single machine, but several server software instances can
be hosted on the same machine. With HTTP/1.1 and the Host header, they may even
share the same IP address.
Proxies:
Between the Web browser and the server, numerous computers and machines relay
the HTTP messages. Due to the layered structure of the Web stack, most of these
operate at the transport, network or physical levels, becoming transparent at the
HTTP layer and potentially making a significant impact on performance. Those
operating at the application layers are generally called proxies. These can be
transparent, forwarding on the requests they receive without altering them in any
way, or non-transparent, in which case they will change the request in some way
before passing it along to the server. Proxies may perform numerous functions:
caching (the cache can be public or private, like the browser cache)
filtering (like an antivirus scan or parental controls)
load balancing (to allow multiple servers to serve the different requests)
authentication (to control access to different resources)
logging (allowing the storage of historical information)
HTTP is simple:
HTTP is generally designed to be simple and human readable, even with the added
complexity introduced in HTTP/2 by encapsulating HTTP messages into frames. HTTP
messages can be read and understood by humans, providing easier testing for
developers, and reduced complexity for newcomers.
HTTP is extensible:
Introduced in HTTP/1.0, HTTP headers make this protocol easy to extend and
experiment with. New functionality can even be introduced by a simple agreement
between a client and a server about a new header's semantics.
HTTP is stateless: there is no link between two requests being successively carried
out on the same connection. This can be problematic for users attempting to interact with certain pages coherently, for example, using e-
commerce shopping baskets. But while the core of HTTP itself is stateless, HTTP
cookies allow the use of stateful sessions. Using header extensibility, HTTP Cookies
are added to the workflow, allowing session creation on each HTTP request to share
the same context, or the same state.
Before a client and server can exchange an HTTP request/response pair, they must
establish a TCP connection, a process which requires several round-trips. The default
behavior of HTTP/1.0 is to open a separate TCP connection for each HTTP
request/response pair. This is less efficient than sharing a single TCP connection
when multiple requests are sent in close succession.
In order to mitigate this flaw, HTTP/1.1 introduced pipelining (which proved difficult
to implement) and persistent connections: the underlying TCP connection can be
partially controlled using the Connection header. HTTP/2 went a step further by
multiplexing messages over a single connection, helping keep the connection warm
and more efficient.
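A small sketch of connection reuse with Python's standard http.client: both requests below ride on one TCP connection, because HTTP/1.1 keeps the connection open by default. The host and paths are illustrative.

# Two HTTP/1.1 requests over a single persistent TCP connection.
import http.client

conn = http.client.HTTPSConnection("developer.mozilla.org")

for path in ("/en-US/", "/en-US/docs/Web/HTTP"):
    conn.request("GET", path)
    resp = conn.getresponse()
    print(path, resp.status, resp.getheader("Connection"))
    resp.read()        # drain the body so the same connection can be reused

conn.close()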
Caching: How documents are cached can be controlled by HTTP. The server can instruct proxies and clients about what to cache and for how long. The client can instruct intermediate cache proxies to ignore the stored document.
Relaxing the origin constraint: To prevent snooping and other privacy invasions, Web browsers enforce strict separation between Web sites. Only pages from the same origin can access all the information of a Web page. Though such a constraint is a burden to the server, HTTP headers can relax this strict separation on the server side, allowing a document to become a patchwork of information sourced from different domains; there could even be security-related reasons to do so.
Authentication: Some pages may be protected so that only specific users can access them. Basic authentication may be provided by HTTP, either using the WWW-Authenticate and similar headers, or by setting a specific session using HTTP cookies.
Proxy and tunneling: Servers or clients are often located on intranets and hide their true IP address from other computers. HTTP requests then go through proxies to cross this network barrier. Not all proxies are HTTP proxies. The SOCKS protocol, for example, operates at a lower level. Other protocols, like FTP, can be handled by these proxies.
Sessions: Using HTTP cookies allows you to link requests with the state of the server. This creates sessions, despite basic HTTP being a stateless protocol. This is useful not only for e-commerce shopping baskets, but also for any site allowing user configuration of the output.
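A brief sketch of cookie-based sessions using the Python standard library: a CookieJar stores any Set-Cookie headers from the first response and replays them on later requests, giving the stateless protocol a shared context. The URL is a placeholder.

# Stateful sessions on top of stateless HTTP, using a cookie jar.
import http.cookiejar
import urllib.request

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# First request: the server may answer with Set-Cookie headers,
# which the jar records.
opener.open("https://example.com/")
for cookie in jar:
    print("stored cookie:", cookie.name, "=", cookie.value)

# Later requests through the same opener automatically send the stored
# cookies back, so the server can link them to the same session.
opener.open("https://example.com/")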
HTTP flow:
When a client wants to communicate with a server, either the final server or an
intermediate proxy, it performs the following steps:
Open a TCP connection: The TCP connection is used to send a request, or several,
and receive an answer. The client may open a new connection, reuse an existing
connection, or open several TCP connections to the servers.
Send an HTTP message: HTTP messages (before HTTP/2) are human-readable. With
HTTP/2, these simple messages are encapsulated in frames, making them impossible
to read directly, but the principle remains the same.
For example, a request:
GET / HTTP/1.1
Host: developer.mozilla.org
Accept-Language: fr
Read the response sent by the server, such as:
HTTP/1.1 200 OK
Server: Apache
ETag: "51142bc1-7449-479b075b2891b"
Accept-Ranges: bytes
Content-Length: 29769
Content-Type: text/html
Close or reuse the connection for further requests.
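To see this human-readable exchange on the wire, here is a minimal Python sketch that sends the request above over a raw TLS socket and prints the start of the response; the exact headers returned by the real server will differ from the example.

# Send the example HTTP/1.1 request by hand and print the response start.
import socket
import ssl

request = (
    "GET / HTTP/1.1\r\n"
    "Host: developer.mozilla.org\r\n"
    "Accept-Language: fr\r\n"
    "Connection: close\r\n"
    "\r\n"
)

ctx = ssl.create_default_context()
with socket.create_connection(("developer.mozilla.org", 443)) as tcp:
    with ctx.wrap_socket(tcp, server_hostname="developer.mozilla.org") as tls:
        tls.sendall(request.encode("ascii"))
        print(tls.recv(4096).decode("utf-8", errors="replace"))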
FTP:
o FTP stands for File transfer protocol.
o FTP is a standard internet protocol provided by TCP/IP used for transmitting the files
from one host to another.
o It is mainly used for transferring the web page files from their creator to the
computer that acts as a server for other computers on the internet.
o It is also used for downloading the files to computer from other servers.
Objectives of FTP
o It provides the sharing of files.
o It is used to encourage the use of remote computers.
o It transfers the data more reliably and efficiently.
Why FTP?
Although transferring files from one system to another seems simple and straightforward, it can sometimes cause problems. For example, two systems may
have different file conventions. Two systems may have different ways to represent
text and data. Two systems may have different directory structures. FTP protocol
overcomes these problems by establishing two connections between hosts. One
connection is used for data transfer, and another connection is used for the control
connection.
Mechanism of FTP
The basic FTP model uses two connections between the hosts: a control connection and a data connection. The FTP client has three components: the user interface, control process, and data transfer process. The server has two components: the server control process and the server data transfer process.
FTP Clients
o An FTP client is a program that implements the File Transfer Protocol, which allows you to transfer files between two hosts on the internet.
o It allows a user to connect to a remote host and upload or download the files.
o It has a set of commands that we can use to connect to a host, transfer the files
between you and your host and close the connection.
o The FTP program is also available as a built-in component in a Web browser. This GUI-based FTP client makes file transfer very easy and does not require you to remember the FTP commands.
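A minimal sketch of an FTP client session using Python's standard ftplib; the host, credentials, directory, and file name are placeholders, not real servers.

# Minimal FTP session: connect on the control connection, then transfer
# data (a directory listing and a file download) over data connections.
from ftplib import FTP

ftp = FTP("ftp.example.com")              # control connection, port 21
ftp.login("username", "password")         # credentials sent in clear text
ftp.cwd("/pub")                           # change remote directory

ftp.retrlines("LIST")                     # directory listing via a data connection

with open("readme.txt", "wb") as local_file:
    ftp.retrbinary("RETR readme.txt", local_file.write)   # download a file

ftp.quit()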
Advantages of FTP:
o Speed: One of the biggest advantages of FTP is speed. FTP is one of the fastest ways to transfer files from one computer to another.
o Efficient: It is more efficient as we do not need to complete all the operations to get
the entire file.
o Security: To access the FTP server, we need to login with the username and
password. Therefore, we can say that FTP is more secure.
o Back & forth movement: FTP allows us to transfer the files back and forth. Suppose
you are a manager of the company, you send some information to all the employees,
and they all send information back on the same server.
Disadvantages of FTP:
o The standard requirement of the industry is that all FTP transmissions should be encrypted. However, not all FTP providers are equal and not all of them offer encryption, so we have to look out for FTP providers that provide encryption.
o FTP serves two operations, i.e., sending and receiving large files on a network. However, the maximum size of a single file that can be sent is 2 GB. It also doesn't allow you to run simultaneous transfers to multiple receivers.
o Passwords and file contents are sent in clear text that allows unwanted
eavesdropping. So, it is quite possible that attackers can carry out the brute force
attack by trying to guess the FTP password.
o It is not compatible with every system.
SMTP:
o SMTP stands for Simple Mail Transfer Protocol.
o SMTP is a set of communication guidelines that allow software to transmit electronic mail over the internet.
o It is a program used for sending messages to other computer users based on e-mail
addresses.
o It provides a mail exchange between users on the same or different computers, and it
also supports:
o It can send a single message to one or more recipients.
o The message being sent can include text, voice, video, or graphics.
o It can also send the messages on networks outside the internet.
o The main purpose of SMTP is to set up communication rules between servers. The servers have a way of identifying themselves and announcing what kind of communication they are trying to perform. They also have a way of handling errors such as an incorrect email address. For example, if the recipient address is wrong, then the receiving server replies with an error message of some kind.
Components of SMTP
o First, we break the SMTP client and SMTP server into two components: the user agent (UA) and the mail transfer agent (MTA). The user agent (UA) prepares the
message, creates the envelope and then puts the message in the envelope. The mail
transfer agent (MTA) transfers this mail across the internet.
o SMTP allows a more complex system by adding a relaying system. Instead of just
having one MTA at sending side and one at receiving side, more MTAs can be added,
acting either as a client or server to relay the email.
o The relaying system without TCP/IP protocol can also be used to send the emails to
users, and this is achieved by the use of the mail gateway. The mail gateway is a relay
MTA that can be used to receive an email.
Working of SMTP:
1. Composition of Mail: A user sends an e-mail by composing an electronic mail
message using a Mail User Agent (MUA). Mail User Agent is a program which is used
to send and receive mail. The message contains two parts: body and header. The
body is the main part of the message while the header includes information such as
the sender and recipient address. The header also includes descriptive information
such as the subject of the message. In this case, the message body is like a letter and
header is like an envelope that contains the recipient's address.
2. Submission of Mail: After composing an email, the mail client then submits the
completed e-mail to the SMTP server by using SMTP on TCP port 25.
3. Delivery of Mail: E-mail addresses contain two parts: username of the recipient and
domain name. For example, vivek@gmail.com, where "vivek" is the username of the
recipient and "gmail.com" is the domain name.
If the domain name of the recipient's email address is different from the sender's domain name, then the Mail Submission Agent (MSA) will send the mail to the Mail Transfer Agent (MTA). To relay
the email, the MTA will find the target domain. It checks the MX record from Domain
Name System to obtain the target domain. The MX record contains the domain name
and IP address of the recipient's domain. Once the record is located, MTA connects to
the exchange server to relay the message.
4. Receipt and Processing of Mail: Once the incoming message is received, the
exchange server delivers it to the incoming server (Mail Delivery Agent) which stores
the e-mail where it waits for the user to retrieve it.
5. Access and Retrieval of Mail: The stored email in MDA can be retrieved by using
MUA (Mail User Agent). MUA can be accessed by using login and password.
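A rough sketch of the composition and submission steps using Python's standard smtplib and email modules; the server name, port, and addresses are placeholders (many real servers require TLS and authentication on port 587 rather than plain port 25).

# Compose a message (like a Mail User Agent would) and submit it to an SMTP server.
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Hello from SMTP"          # header: descriptive information
msg["From"] = "vivek@example.com"           # header: sender address (placeholder)
msg["To"] = "priya@example.com"             # header: recipient address (placeholder)
msg.set_content("This is the body of the mail.")   # body: the letter itself

with smtplib.SMTP("smtp.example.com", 25) as server:   # submission over TCP port 25
    server.send_message(msg)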
HTML5:
HTML5 tutorial provides details of all 40+ HTML tags including audio, video,
header, footer, data, datalist, article etc. This HTML tutorial is designed for beginners
and professionals.
HTML5 is the next version of HTML. Here, you will get some brand new features which will make HTML much easier. These newly introduced features make your website layout clearer to both website designers and users. There are some elements like
<header>, <footer>, <nav> and <article> that define the layout of a website.
Why use HTML5:
It is enriched with advanced features which make it easy and interactive for designers/developers and users.
It facilitates designing better forms and building web applications that work offline.
It provides advanced features for things you would normally have to write JavaScript to do.
The most important reason to use HTML5 is that we believe it is not going anywhere. It will be here to serve for a long time, according to the W3C recommendation.
HTML 5 Example:
Let's see a simple example of HTML5.
<!DOCTYPE html>
<html>
<body>
<h1>Write Your First Heading</h1>
<p>Write Your First Paragraph.</p>
</body>
</html>
Supporting browsers:
1.Chrome
2.IE
3.Firefox
4.Opera
5.Safari
List of the Advantages and Disadvantages of HTML5:
ADVANTAGES:
You are not required to pay royalties if you decide to use HTML5 for your
website. It’s cross-platform, which means you can use it on virtually any
device. It works the same whether you access a website through a
desktop, a laptop, a smartphone, or even your television. As long as the
browser you are using supports HTML5, then there is a good chance that it
will work as it should.
5. There are more page layout elements available for your content.
If you’ve grown familiar with the older versions of HTML, then you know
what your options are already: Div, Heading, Paragraph, and Span. With
HTML5, you’ve got a lot of elements to play around with when designing
your page layouts. Headers, footers, areas, and sections are all available to
you. That makes it possible to develop a page with representative mark-
ups that guide users through the purpose of the content they are
encountering.
CSS3:
Cascading Style Sheets (CSS) is a style sheet language used for describing the
look and formatting of a document written in a markup language. CSS3 is the latest standard of CSS and builds on earlier versions (CSS2). The main differences between CSS2 and CSS3 are as follows:
Media Queries
Namespaces
Selectors Level 3
Color
CSS3 is used with HTML to create and format content structure. It is responsible
for colours, font properties, text alignments, background images, graphics, tables,
etc. It provides the positioning of various elements with the values being fixed,
absolute, and relative.
CSS3 modules:
CSS3 is a collaboration of the CSS2 specification and new specifications; each of these pieces is called a module. Some of the modules are shown below:
Selectors
Box Model
Backgrounds
Image Values and Replaced Content
Text Effects
2D Transformations
3D Transformations
Animations
Multiple Column Layout
User Interface
CSS3 rounded corners are used to add rounded corners to the body or to a text element by using the border-radius property. A simple example of rounded corners is as follows:
#rcorners7 {
border-radius: 60px/15px;
background: #FF0000;
padding: 20px;
width: 200px;
height: 150px;
}
Features of CSS3:
The features of the CSS3 are as follows:
1. Selectors
Selectors allow the designer to select elements at more precise levels of the web page. Attribute selectors help match attributes and attribute values, and new selectors can target pseudo-classes to style elements, such as checked radio buttons or the element targeted in the URL.
Other modules add support for paged media, including running headers and footers, and there are additional properties for printing footnotes.
5. Multi-Column Layout
This feature includes properties that allow designers to present their content in multiple columns, using properties such as column-count and column-width.
Advantages of CSS3:
CSS3 provides consistent and precise positioning of navigable elements.
Web pages need to look appealing and attractive, and this can be achieved with the help of CSS3.
CSS3 allows the designer to create websites that are rich in content and low in code. This technology brings some exciting features that make the page look good, simple for the user to navigate, and function flawlessly.
Some designs like drop shadows, rounded corners, and gradients find use in just about every web page. These design enhancements can make the site look appealing when used appropriately. Formerly, such designs had to be created as images; CSS3 lets pages include these designs directly, leading to simpler, cleaner, and faster pages.
Starting at the top of the web page, let's go through the anatomy of a
web page:
Page Title
The page title is set using the <title> </title> set of tags in the
head section of the html coding. This is the only web page
element within the head section of the web page the visitor will
see.
URL:
The URL is the domain name of the website. If the visitor just typed www.domainname.com they would be taken to the home page of the website.
File Name:
File name is the web page file name. It cannot contain any
spaces! The file name can be written as one long name (e.g.
basichtmlarticles.htm), with hyphens (e.g. basic-html-
articles.htm, as shown in the image above) or with underscores
(e.g. basic_html_articles.htm).
When you create a web page you have to give it a name. The
file name has what is called an extension at the end of it.
The extension at the end of the file name tells the browser
what kind of file it is. An HTML document would have an extension of .htm or .html. If your web page uses a certain
programming language it would have the appropriate
extension. e.g. .php is for the PHP programming language, .asp
is for the ASP programming language.
Scroll Bars:
Scroll bars are on the right side and bottom of the browser
window. If there is a scroll bar at the bottom (horizontal scroll
bar) your web page content is too wide for the browser
window.
A web page layout should be designed so there is no horizontal
scroll bar. You need to test your web page at different
resolutions and on different operating systems to see if the way
the page is laid out will result in horizontal scroll bars when
viewed at smaller resolutions or by different operating systems.
Header:
The header is at the very top of the web page. It usually contains a logo
for the website.
Navigation:
The navigation system provides the links visitors use to move between the main sections of the website.
Web Page Content:
Web page content includes everything between the <body> and </body> tags. We have already looked at some of the web page content, the header and navigation system. Also considered web page content are the web page footer (we will discuss this next) and the center section of the page that you are reading now.
Footer:
This section is where you usually put your copyright notice, link to your
privacy policy and your website contact information.
Header:
Header is the upper (top) part of the webpage. Being the area
people see before scrolling the page in their first seconds on the
website, the header is an element of strategic importance. It is
expected from the header to provide the core navigation around the
website so that users could scan it in split seconds and jump to the
main pages that can help them. Headers are also referred to as site
menus and positioned as an element of primary navigation in the
website layout.
XML:
XML tags identify the data and are used to store and organize
the data, rather than specifying how to display it like HTML
tags, which are used to display the data.
Key XML terminology:
1. Character
2. Processor and application
3. Markup and content
4. Tag
5. Element
6. Attribute
A tag is a markup construct that begins with < and ends with >. Tags come in three flavors: start-tags (such as <section>), end-tags (such as </section>), and empty-element tags (such as <line-break />).
Advantages of XML:
XML Schema:
XML Schemas define the elements of your XML files. A simple element
is an XML element that contains only text. It cannot contain any other
elements or attributes.
1. simpleType
2. complexType
simpleType
XML DOM:
The XML DOM (Document Object Model) defines a standard way to access and manipulate an XML document as a tree of objects. We can modify or delete element content and also create new elements. The elements and their content (text and attributes) are all known as nodes.
<TABLE>
<ROWS>
<TR>
<TD>A</TD>
<TD>B</TD>
</TR>
<TR>
<TD>C</TD>
<TD>D</TD>
</TR>
</ROWS>
</TABLE>
note.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>sonoojaiswal@javatpoint.com</to>
<from>vimal@javatpoint.com</from>
<body>Hello XML DOM</body>
</note>
Properties of DOM: Let's see the properties of the document object that can be accessed and modified through the document object.
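A short sketch of reading the note.xml listing above with Python's built-in DOM parser (xml.dom.minidom), showing a few document-object properties; it assumes note.xml is saved alongside the script, and the new address used below is just a placeholder.

# Parse note.xml into a DOM tree and inspect and modify some nodes.
from xml.dom import minidom

doc = minidom.parse("note.xml")

root = doc.documentElement                 # the <note> element
print(root.nodeName)                       # "note"

to_node = root.getElementsByTagName("to")[0]
print(to_node.firstChild.nodeValue)        # text inside <to> ... </to>

# Nodes can also be modified or created.
to_node.firstChild.nodeValue = "newaddress@javatpoint.com"
new_el = doc.createElement("heading")
new_el.appendChild(doc.createTextNode("Reminder"))
root.appendChild(new_el)

print(doc.toxml())                         # serialized, modified document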
XSLT:
XSLT (Extensible Stylesheet Language Transformations) is a language
for transforming XML documents into other XML documents, or
other formats such as HTML for web pages, plain text or XSL Formatting
Objects, which may subsequently be converted to other formats, such as
PDF, PostScript and PNG.
How XSLT Works:
The XSLT processor takes two inputs, an XML source document and an XSLT stylesheet, and applies the stylesheet's template rules to the source document to produce the result document.
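A small sketch of such a transformation in Python, assuming the third-party lxml package is available; the stylesheet shown is an illustrative one that turns a tiny note document into an HTML page.

# Apply an XSLT stylesheet to an XML document using lxml (third-party package).
from lxml import etree

xml_doc = etree.XML("<note><to>sonoo</to><body>Hello XML DOM</body></note>")

xslt_doc = etree.XML("""
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/note">
    <html><body>
      <h1><xsl:value-of select="to"/></h1>
      <p><xsl:value-of select="body"/></p>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(xslt_doc)            # compile the stylesheet
result = transform(xml_doc)                 # run the transformation
print(str(result))                          # serialized HTML output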
Advantage of XSLT:
The same XML source can be transformed into several different output formats (HTML, plain text, XSL Formatting Objects) without changing the source document itself, keeping the data separate from its presentation.
SAX Approach:
Benefits:
A SAX parser only needs to report each parsing event as it
happens, and normally discards almost all of that information once
reported (it does, however, keep some things, for example a list of
all elements that have not been closed yet, in order to catch later
errors such as end-tags in the wrong order). Thus, the minimum
memory required for a SAX parser is proportional to the maximum
depth of the XML file (i.e., of the XML tree) and the maximum data
involved in a single XML event (such as the name and attributes of
a single start-tag, or the content of a processing instruction, etc.).
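A brief sketch of this event-driven style with Python's built-in xml.sax module: the handler only reacts to start-tag, character, and end-tag events as they stream past, without ever holding the whole tree in memory.

# Stream-parse an XML document with SAX; each tag triggers a callback event.
import xml.sax

class NoteHandler(xml.sax.ContentHandler):
    def startElement(self, name, attrs):
        print("start:", name)

    def characters(self, content):
        if content.strip():
            print("  text:", content.strip())

    def endElement(self, name):
        print("end:", name)

xml.sax.parseString(
    b"<note><to>sonoo</to><body>Hello XML DOM</body></note>",
    NoteHandler(),
)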
Drawbacks:
The event-driven model of SAX is useful for XML parsing, but it does
have certain drawbacks.
Virtually any kind of XML validation requires access to the document in
full. The most trivial example is that an attribute declared in the DTD to
be of type IDREF, requires that there be only one element in the
document that uses the same value for an ID attribute. To validate this in
a SAX parser, one must keep track of all ID attributes (any one of
them might end up being referenced by an IDREF attribute at the very
end); as well as every IDREF attribute until it is resolved. Similarly, to
validate that each element has an acceptable sequence of child
elements, information about what child elements have been seen for
each parent must be kept until the parent closes.