Java XML and JSON
Document Processing for Java SE
Second Edition
Jeff Friesen
Java XML and JSON: Document Processing for Java SE
Jeff Friesen
Dauphin, MB, Canada
About the Author
Jeff Friesen is a freelance teacher and software developer
with an emphasis on Java. In addition to authoring Java I/O,
NIO and NIO.2 (Apress), Java Threads and the Concurrency
Utilities (Apress), and the first edition of this book, Jeff has
written numerous articles on Java and other technologies
(such as Android) for JavaWorld (JavaWorld.com), informIT
(InformIT.com), Java.net, SitePoint (SitePoint.com),
and other web sites. Jeff can be contacted via his web site
at JavaJeff.ca or via his LinkedIn (LinkedIn.com) profile
(www.linkedin.com/in/javajeff).
About the Technical Reviewer
Massimo Nardone has more than 24 years of experience
in Security, web/mobile development, Cloud, and IT
architecture. His true IT passions are Security and Android.
He has been programming and teaching how to program
with Android, Perl, PHP, Java, VB, Python, C/C++, and
MySQL for more than 20 years.
He holds a Master of Science degree in Computing
Science from the University of Salerno, Italy.
He has worked as a Project Manager, Software Engineer,
Research Engineer, Chief Security Architect, Information
Security Manager, PCI/SCADA Auditor, and Senior Lead IT Security/Cloud/SCADA
Architect for many years.
His technical skills include Security, Android, Cloud, Java, MySQL, Drupal, Cobol,
Perl, web and mobile development, MongoDB, D3, Joomla, Couchbase, C/C++, WebGL,
Python, Pro Rails, Django CMS, Jekyll, Scratch, etc.
He worked as a visiting lecturer and supervisor for exercises at the Networking
Laboratory of the Helsinki University of Technology (Aalto University). He holds four
international patents (PKI, SIP, SAML, and Proxy areas).
He currently works as Chief Information Security Officer (CISO) for Cargotec Oyj,
and he is a member of the ISACA Finland Chapter Board.
Massimo has reviewed more than 45 IT books for different publishing
companies, and he is the coauthor of Pro Android Games (Apress, 2015), Pro JPA 2 in
Java EE 8 (Apress, 2018), and Beginning EJB in Java EE 8 (Apress, 2018).
Acknowledgments
I thank Apress Acquisition Editor Jonathan Gennick and the Apress Editorial Board for
giving me the opportunity to create this second edition. I also thank Editor Jill Balzano
for guiding me through the book development process. Finally, I thank my technical
reviewer and copy editor for catching mistakes and making the book look great.
Introduction
XML and (the more popular) JSON let you organize data in textual formats. This book
introduces you to these technologies along with Java APIs for integrating them into your
Java code. Coverage is current as of Java 11.
Chapter 1 introduces XML, where you learn about basic language features (such
as the XML declaration, elements and attributes, and namespaces). You also learn
about well-formed XML documents and how to validate them via the Document Type
Definition and XML Schema grammar languages.
Chapter 2 focuses on Java’s SAX API for parsing XML documents. You learn how to
obtain a SAX 2 parser; you then tour XMLReader methods along with handler and entity
resolver interfaces. Finally, you explore a demonstration of this API and learn how to
create a custom entity resolver.
Chapter 3 addresses Java’s DOM API for parsing and creating XML documents. After
discovering the various nodes that form a DOM document tree, you explore the DOM
API, where you learn how to obtain a DOM parser/document builder and how to parse
and create XML documents. You then explore the Java DOM APIs related to the Load
and Save, and Traversal and Range specifications.
Chapter 4 places the spotlight on Java’s StAX API for parsing and creating XML
documents. You learn how to use StAX to parse XML documents with stream-based and
event-based readers and to create XML documents with stream-based and event-based
writers.
Moving on, Chapter 5 presents Java’s XPath API for simplifying access to a DOM
tree’s nodes. You receive a primer on the XPath language, learning about location path
expressions and general expressions. You also explore advanced features starting with
namespace contexts.
Chapter 6 completes my coverage of XML by targeting Java’s XSLT API. You learn
about transformer factories and transformers, and much more. You also go beyond the
XSLT 1.0 and XPath 1.0 APIs supported by Java.
Chapter 7 switches gears to JSON. You receive an introduction to JSON, take a tour of
its syntax, explore a demonstration of JSON in a JavaScript context (because Java doesn’t
yet officially support JSON), and learn how to validate JSON objects in the context of
JSON Schema.
You’ll need to work with third-party libraries to parse and create JSON
documents. Chapter 8 introduces you to the mJson library. After learning how
to obtain and use mJson, you explore the Json class, which is the entry point for
working with mJson.
Google has released an even more powerful library for parsing and creating JSON
documents. The Gson library is the focus of Chapter 9. In this chapter, you learn how
to parse JSON objects through deserialization, how to create JSON objects through
serialization, and much more.
Chapter 10 focuses on the JsonPath API for performing XPath-like operations on
JSON documents.
Chapter 11 introduces you to Jackson, a popular suite of APIs for parsing and
creating JSON documents.
Chapter 12 introduces you to JSON-P, an Oracle API that was planned for inclusion in
Java SE, but was made available to Java EE instead.
Each chapter ends with assorted exercises that are designed to help you master the
content. Along with long answers and true/false questions, you are often confronted
with programming exercises. Appendix A provides the answers and solutions.
Thanks for purchasing this book. I hope you find it helpful in understanding XML
and JSON in a Java context.
Jeff Friesen (October 2018)
Note You can download this book’s source code by pointing your web browser to
www.apress.com/9781484243299 and clicking the Source Code tab followed
by the Download Now link.
PART I
Exploring XML
CHAPTER 1
Introducing XML
Applications commonly use XML documents to store and exchange data. XML defines
rules for encoding documents in a format that is both human-readable and machine-
readable. Chapter 1 introduces XML, tours the XML language features, and discusses
well-formed and valid documents.
What Is XML?
XML (eXtensible Markup Language) is a meta-language (a language used to describe
other languages) for defining vocabularies (custom markup languages), which is the key
to XML’s importance and popularity. XML-based vocabularies (such as XHTML) let you
describe documents in a meaningful way.
XML vocabulary documents are like HTML (see http://en.wikipedia.org/wiki/HTML)
documents in that they are text-based and consist of markup (encoded
descriptions of a document’s logical structure) and content (document text not
interpreted as markup). Markup is evidenced via tags (angle bracket–delimited syntactic
constructs), and each tag has a name. Furthermore, some tags have attributes (name/
value pairs).
Note XML and HTML are descendants of Standard Generalized Markup Language
(SGML), which is the original meta-language for creating vocabularies—XML is
essentially a restricted form of SGML, while HTML is an application of SGML. The
key difference between XML and HTML is that XML invites you to create your own
vocabularies with their own tags and rules, whereas HTML gives you a single
pre-created vocabulary with its own fixed set of tags and rules. XHTML and other
XML-based vocabularies are XML applications. XHTML was created to be a cleaner
implementation of HTML.
© Jeff Friesen 2019
J. Friesen, Java XML and JSON, https://doi.org/10.1007/978-1-4842-4330-5_1
If you haven’t previously encountered XML, you might be surprised by its simplicity
and how closely its vocabularies resemble HTML. You don’t need to be a rocket scientist
to learn how to create an XML document. To prove this to yourself, check out Listing 1-1.
<recipe>
<title>
Grilled Cheese Sandwich
</title>
<ingredients>
<ingredient qty="2">
bread slice
</ingredient>
<ingredient>
cheese slice
</ingredient>
<ingredient qty="2">
margarine pat
</ingredient>
</ingredients>
<instructions>
Place frying pan on element and select medium heat.
For each bread slice, smear one pat of margarine on
one side of bread slice. Place cheese slice between
bread slices with margarine-smeared sides away from
the cheese. Place sandwich in frying pan with one
margarine-smeared side in contact with pan. Fry for
a couple of minutes and flip. Fry other side for a
minute and serve.
</instructions>
</recipe>
Listing 1-1 presents an XML document that describes a recipe for making a grilled
cheese sandwich. This document is reminiscent of an HTML document in that it consists
of tags, attributes, and content. However, that’s where the similarity ends. Instead of
presenting HTML tags such as <html>, <head>, <img>, and <p>, this informal recipe
language presents its own <recipe>, <ingredients>, and other tags.
Note Although Listing 1-1’s <title> and </title> tags are also found in
HTML, they differ from their HTML counterparts. Web browsers typically display
the content between these tags in their title bars or tab headers. In contrast, the
content between Listing 1-1’s <title> and </title> tags might be displayed as
a recipe header, spoken aloud, or presented in some other way, depending on the
application that parses this document.
XML Declaration
An XML document usually begins with the XML declaration, special markup telling an
XML parser that the document is XML. The absence of the XML declaration in Listing 1-1
reveals that this special markup isn’t mandatory. When the XML declaration is present,
nothing can appear before it.
The XML declaration minimally looks like <?xml version="1.0"?> in which the
nonoptional version attribute identifies the version of the XML specification to which
the document conforms. The initial version of this specification (1.0) was introduced in
1998 and is widely implemented.
Note The World Wide Web Consortium (W3C), which maintains XML, released
version 1.1 in 2004. This version mainly supports the use of line-ending characters
used on EBCDIC platforms (see http://en.wikipedia.org/wiki/EBCDIC)
and the use of scripts and characters that are absent from Unicode (see
http://en.wikipedia.org/wiki/Unicode) 3.2. Unlike XML 1.0, XML 1.1 isn’t widely
implemented and should be used only when its unique features are needed.
XML supports Unicode, which means that XML documents consist entirely of
characters taken from the Unicode character set. The document’s characters are
encoded into bytes for storage or transmission, and the encoding is specified via the
XML declaration’s optional encoding attribute. One common encoding is UTF-8 (see
http://en.wikipedia.org/wiki/UTF-8), which is a variable-length encoding of the
Unicode character set. UTF-8 is a strict superset of ASCII (see
http://en.wikipedia.org/wiki/ASCII), which means that pure ASCII text files are also UTF-8 documents.
Note In the absence of the XML declaration or when the XML declaration’s
encoding attribute isn’t present, an XML parser typically looks for a special
character sequence at the start of a document to determine the document’s
encoding. This character sequence is known as the byte-order-mark (BOM) and
is created by an editor program (such as Microsoft Windows Notepad) when it
saves the document according to UTF-8 or some other encoding. For example,
the hexadecimal sequence EF BB BF signifies UTF-8 as the encoding. Similarly,
FE FF signifies UTF-16 (see http://en.wikipedia.org/wiki/UTF-16) big
endian, FF FE signifies UTF-16 little endian, 00 00 FE FF signifies UTF-32
(see http://en.wikipedia.org/wiki/UTF-32) big endian, and FF FE 00
00 signifies UTF-32 little endian. UTF-8 is assumed when no BOM is present.
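The BOM rules in the preceding note can be sketched in a few lines of Java. This is only an illustration of the byte patterns listed above, not how any particular XML parser is implemented; the BomSniffer class and its detect() method are hypothetical names.

```java
final class BomSniffer {
    /**
     * Returns the encoding implied by a leading byte-order mark,
     * or "UTF-8" when no BOM is present.
     */
    static String detect(byte[] data) {
        // UTF-32 little endian must be tested before UTF-16 little endian
        // because FF FE 00 00 also begins with FF FE.
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return "UTF-8"; // no BOM: fall back to the default
    }

    // Compares the first bytes of data with the given unsigned values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, '<', 0};
        System.out.println(detect(utf8));
        System.out.println(detect(utf16le));
    }
}
```

The order of the tests matters: checking the four-byte UTF-32 patterns first prevents a UTF-32 little endian document from being mistaken for UTF-16 little endian.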
If you’ll never use characters apart from the ASCII character set, you can probably
forget about the encoding attribute. However, when your native language isn’t English
or when you’re called to create XML documents that include non-ASCII characters, you
need to properly specify encoding. For example, when your document contains ASCII
plus characters from a non-English Western European language (such as ç, the cedilla
used in French, Portuguese, and other languages), you might want to choose ISO-8859-1
as the encoding attribute’s value—the document will probably have a smaller size when
encoded in this manner than when encoded with UTF-8. Listing 1-2 shows you the
resulting XML declaration.
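The size difference between the two encodings is easy to check for yourself. The following sketch assumes a hypothetical sample word containing the cedilla (written as the escape \u00e7 so that the source file stays pure ASCII); the EncodingSize class and sizeIn() method are illustrative names only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSize {
    // Returns the number of bytes needed to store text in the given charset.
    static int sizeIn(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Hypothetical content: "garcon" spelled with a c-cedilla (U+00E7).
        String text = "gar\u00e7on";
        // ISO-8859-1 stores every one of its characters in a single byte.
        System.out.println("ISO-8859-1: "
            + sizeIn(text, StandardCharsets.ISO_8859_1) + " bytes");
        // UTF-8 needs two bytes for U+00E7 and one byte for each ASCII letter.
        System.out.println("UTF-8:      "
            + sizeIn(text, StandardCharsets.UTF_8) + " bytes");
    }
}
```

For this six-character word, ISO-8859-1 uses six bytes and UTF-8 uses seven. The saving grows with the proportion of accented characters, and it disappears entirely for pure ASCII text.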
The final attribute that can appear in the XML declaration is standalone. This
optional attribute, which is only relevant with DTDs (discussed later), determines
whether or not there are external markup declarations that affect the information passed
from an XML processor (a parser) to the application. Its value defaults to no, implying
that there are or may be such declarations. A yes value indicates that there are no such
declarations. For more information, check out “The standalone pseudo-attribute is only
relevant if a DTD is used” (www.xmlplease.com/xml/standalone/).
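Once a document has been parsed, Java's DOM API reports what the XML declaration said. The sketch below assumes a hypothetical one-element document and uses the Document interface's getXmlVersion(), getXmlEncoding(), and getXmlStandalone() methods; the DeclarationInfo class name is an assumption made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class DeclarationInfo {
    // Parses XML supplied as bytes so that the parser itself must
    // honor the encoding named in the XML declaration.
    static Document parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        return DocumentBuilderFactory.newInstance()
                                     .newDocumentBuilder()
                                     .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document carrying all three declaration attributes.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" "
                   + "standalone=\"yes\"?><note/>";
        Document doc = parse(xml);
        System.out.println("version:    " + doc.getXmlVersion());
        System.out.println("encoding:   " + doc.getXmlEncoding());
        System.out.println("standalone: " + doc.getXmlStandalone());

        // Without a declaration, standalone is reported as false,
        // which corresponds to the default value of no.
        System.out.println("default standalone: "
            + parse("<note/>").getXmlStandalone());
    }
}
```

Passing bytes rather than a string to the parser is deliberate: when a parser is handed characters that have already been decoded, the declared encoding no longer plays a part in decoding the document.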
Figure 1-1. Listing 1-1’s tree structure is rooted in the recipe element
<?xml version="1.0"?>
<article title="The Rebirth of JavaFX" lang="en">
<abstract>
JavaFX 2 marks a significant milestone in the history
of JavaFX. Now that Sun Microsystems has passed the
torch to Oracle, JavaFX Script is gone and
JavaFX-oriented Java APIs (such as
<code>javafx.application.Application</code>) have
emerged for interacting with this technology. This
article introduces you to this refactored JavaFX,
where you learn about JavaFX 2 architecture and key
APIs.
</abstract>
<body>
</body>
</article>
This document’s root element is article, which contains abstract and body child
elements. The abstract element mixes content with a code element, which contains
content. In contrast, the body element is empty.
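The structure just described can be confirmed by walking the parsed tree. The following sketch assumes a condensed, hypothetical version of the article document (with the empty body element written as <body/>) and Java's DOM API; MixedContent and childKinds() are illustrative names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MixedContent {
    // Hypothetical, condensed version of the article document.
    static final String ARTICLE =
        "<article title=\"The Rebirth of JavaFX\" lang=\"en\">"
        + "<abstract>Java APIs (such as "
        + "<code>javafx.application.Application</code>"
        + ") have emerged.</abstract>"
        + "<body/>"
        + "</article>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                                     .newDocumentBuilder()
                                     .parse(new InputSource(new StringReader(xml)));
    }

    // Describes each child of parent as "text" or "element:<name>".
    static List<String> childKinds(Node parent) {
        List<String> kinds = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                kinds.add("element:" + child.getNodeName());
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                kinds.add("text");
            }
        }
        return kinds;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(ARTICLE);
        Node abstractNode = doc.getElementsByTagName("abstract").item(0);
        Node body = doc.getElementsByTagName("body").item(0);
        System.out.println("root: " + doc.getDocumentElement().getNodeName());
        System.out.println("abstract children: " + childKinds(abstractNode));
        System.out.println("body has children: " + body.hasChildNodes());
    }
}
```

The abstract element's children alternate between text and the code element, which is what mixed content looks like to a program, whereas the body element has no children at all.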
Note As with Listings 1-1 and 1-2, Listing 1-3 also contains whitespace (invisible
characters such as spaces, tabs, carriage returns, and line feeds). The XML
specification permits whitespace to be added to a document. Whitespace appearing
within content (such as spaces between words) is considered part of the content. In
contrast, the parser typically ignores whitespace appearing between an end tag and
the next start tag. Such whitespace isn’t considered part of the content.
An XML element’s start tag can contain one or more attributes. For example,
Listing 1-1’s <ingredient> tag has a qty (quantity) attribute, and Listing 1-3’s
<article> tag has title and lang attributes. Attributes provide additional details
about elements. For example, qty identifies the amount of an ingredient that can be
added, title identifies an article’s title, and lang identifies the language in which the
article is written (en for English). Attributes can be optional. For example, when qty
isn’t specified, a default value of 1 is assumed.
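An application can supply the default quantity itself when the attribute is missing. The sketch below assumes a hypothetical ingredients fragment based on Listing 1-1 and relies on the DOM Element interface's getAttribute() method, which returns an empty string for an attribute that has no specified or default value; the Quantities class and its methods are illustrative names.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class Quantities {
    // Hypothetical fragment based on Listing 1-1.
    static final String INGREDIENTS =
        "<ingredients>"
        + "<ingredient qty=\"2\">bread slice</ingredient>"
        + "<ingredient>cheese slice</ingredient>"
        + "<ingredient qty=\"2\">margarine pat</ingredient>"
        + "</ingredients>";

    // Returns the qty attribute's value, or 1 when qty isn't specified.
    static int quantity(Element ingredient) {
        String qty = ingredient.getAttribute("qty"); // "" when absent
        return qty.isEmpty() ? 1 : Integer.parseInt(qty);
    }

    // Maps each ingredient's name to its quantity, in document order.
    static Map<String, Integer> quantities(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("ingredient");
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element ingredient = (Element) nodes.item(i);
            result.put(ingredient.getTextContent().trim(),
                       quantity(ingredient));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        quantities(INGREDIENTS).forEach(
            (name, qty) -> System.out.println(qty + " x " + name));
    }
}
```

In this sketch the default of 1 lives in application code. A grammar such as a DTD can also declare attribute defaults, in which case a parser that reads the DTD supplies the value.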
Consider <expression>6 < 4</expression>. You could replace the < with numeric
reference &#60;, yielding <expression>6 &#60; 4</expression>, or better yet with &lt;,
yielding <expression>6 &lt; 4</expression>. The second choice is clearer and easier to
remember.
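A parser treats the numeric reference &#60; and the predefined entity reference &lt; identically, and it rejects the unescaped form. The following sketch demonstrates this with Java's DOM parser; the CharRefs class and its content() and isWellFormed() methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class CharRefs {
    // Parses xml and returns the root element's text content.
    static String content(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Returns false when the parser reports a well-formedness error.
    static boolean isWellFormed(String xml)
            throws ParserConfigurationException, IOException {
        try {
            content(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String numeric = "<expression>6 &#60; 4</expression>";
        String named = "<expression>6 &lt; 4</expression>";
        String raw = "<expression>6 < 4</expression>";

        System.out.println(content(numeric)); // reference resolved to <
        System.out.println(content(named));   // reference resolved to <
        // The default error handler may also log the error to standard error.
        System.out.println("raw form well formed? " + isWellFormed(raw));
    }
}
```

Both escaped documents yield the content 6 < 4, so the choice between them is purely a matter of readability.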