The Representation of Physical Motions by Various Types of Quaternions

D. H. Delphenich †

Abstract: It is shown that the groups of Euclidian rotations, rigid motions, proper, orthochronous Lorentz
transformations, and the complex rigid motions can be represented by the groups of unit-norm elements in
the algebras of real, dual, complex, and complex dual quaternions, respectively. It is shown how some of
the physically-useful tensors and spinors can be represented by the various kinds of quaternions. The basic
notions of kinematical states are described in each case, except complex dual quaternions, where their
possible role in describing the symmetries of the Maxwell equations is discussed.


E-mail: david_delphenich@yahoo.com, Website: neo-classical-physics.info.
CONTENTS

INTRODUCTION
References

I. ALGEBRAS

1. General notions
2. Ideals in algebras
3. Automorphisms of algebras
4. Representations of algebras
5. Tensor products of algebras
References

II. REAL QUATERNIONS

1. The group of Euclidian rotations
2. The algebra of real quaternions
3. The action of rotations on quaternions
4. The kinematics of fixed-point rigid bodies
References

III. DUAL QUATERNIONS

1. The group of Euclidian rigid motions
2. The algebra of dual numbers
3. Functions of dual numbers
4. Dual linear algebra
5. The algebra of dual quaternions
6. The action of rigid motions on dual quaternions
7. Some line geometry
8. The kinematics of translating rigid bodies
References

IV. COMPLEX QUATERNIONS

1. Functions of complex variables
2. The group of complex rotations
3. The algebra of complex quaternions
4. The action of the Lorentz group on complex quaternions
5. Some complex line geometry
6. The kinematics of Lorentzian frames
References

V. COMPLEX DUAL QUATERNIONS

1. The group of complex rigid motions
2. The algebra of complex dual numbers
3. Functions of complex dual numbers
4. Complex dual linear algebra
5. The algebra of the complex dual quaternions
6. The action of the group of complex rigid motions on complex dual quaternions
7. The role of complex rigid frames in physics

INTRODUCTION

According to Wilhelm Blaschke [1], the first vestiges of the concept of a quaternion
go back to the work of Leonhard Euler on rigid-body rotations in 1776 [2]. Olinde
Rodrigues seems to have developed much of the basics of the subject in 1840, without
actually introducing the term “quaternion.” However, the first major attempt to develop
the concept in its own right, along with its applications to physics, seems to have been
the posthumously published book of Sir William Rowan Hamilton, Elements of
Quaternions [3], which appeared in 1866, almost a century after Euler, although the work
was begun in 1843.
Hamilton envisioned quaternions as a geometric algebra that would extend the three-
dimensional algebra of the vector cross product to a four-dimensional algebra in which
the Euclidian scalar product also played a role in defining the product of its elements.
Thus, the algebra would be closely related to rotations by the fact that the vector cross
product defines a Lie algebra on R³ that is isomorphic to so(3; R), which consists of
infinitesimal generators of Euclidian rotations, as well as the fact that rotations preserve
the Euclidian scalar product.
In that monumental treatise, the coefficients of quaternions were treated as if they
could be either real or complex, as it suited the purpose, although Hamilton suggested
that the complex kind might be referred to as “biquaternions.” However, since the work
predated the theory of relativity, no mention was made of the role that the complex
quaternions might play in regard to relativistic motions.
Apparently, the subsequent history of the theory of quaternions was somewhat
marginal to the mainstream of both mathematics and physics for quite some time.
However, from 1899 to 1913 there was an international Quaternion Society, formed
around Alexander Macfarlane, that was dedicated to keeping the concept alive. For the
most part, the main acceptance of quaternions in the early years seems to have been with
the algebraists (cf., e.g., Shaw [4]), who saw them as a useful example of an algebra.
However, later researchers in physics and mechanical engineering continued to expand
upon the usefulness of quaternions in applications that involved the representation of
three-dimensional Euclidian rotations.
For instance, to this day, the inertial navigation community regards the representation
of kinematics by real quaternions as being their most computationally efficient algorithm
for the propagation of rotating frames. Although the method of Euler angles requires one
less coordinate to describe a rotation, the differential equations that one must integrate
involve products of trigonometric functions, which tend to slow down the computation
considerably. Although the method of direction cosine matrices involves only basic
arithmetic operations on the components of the matrices, those matrices have nine
components, which also tends to slow things down. Thus, the four components of a unit
quaternion seem to provide an ideal compromise.
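
As a rough illustration of the quaternion approach, the following minimal Python sketch advances a unit quaternion through a time step during which the angular velocity is assumed constant. It is not taken from any navigation library: the names quat_mul and propagate, the component order (w, x, y, z), and the convention that the angular velocity is expressed in the body frame (so that the increment multiplies on the right) are all assumptions made here for the sake of the example.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def propagate(q, omega, dt):
    """Advance the unit quaternion q over dt for a constant body-frame angular velocity omega."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    if rate == 0.0:
        return q  # no rotation during this step
    half_angle = 0.5 * rate * dt
    s = math.sin(half_angle) / rate
    dq = (math.cos(half_angle), wx * s, wy * s, wz * s)  # unit increment
    return quat_mul(q, dq)

def norm(q):
    """Euclidian norm of the four components."""
    return math.sqrt(sum(c * c for c in q))

# Rotate at pi/2 rad/s about the z-axis for one second, in 100 steps.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = propagate(q, (0.0, 0.0, math.pi / 2), 0.01)
```

Only four numbers are carried from step to step, and the update is a single quaternion product; the result stays on the unit sphere up to rounding, so an occasional renormalization is all that is needed to keep q a valid rotation.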

In a different part of geometry and mechanics, geometers such as Louis Poinsot [5]
and Michel Chasles [6] were exploring the geometry of the larger group of rigid motions,
which also includes translations, in addition to the rotations; both two- and
three-dimensional rigid motions were dealt with in detail. The geometry that they were
considering was actually projective geometry, not affine Euclidian geometry. Chasles
found that the concept of an infinitesimal center of rotation for a planar rigid motion
could be extended to a “central axis” for any three-dimensional rigid motion, which
allowed one to decompose the motion into a rotation about the axis and a translation
along it. Previously, in the context of statics, Poinsot found that a finite set of spatially-
distributed force vectors that acted on a rigid body could be replaced with an equivalent
force-moment about a central axis and a force along it. The German geometers Julius
Plücker [7] and his illustrious student Felix Klein [8] expanded upon the projective-
geometric nature of these constructions, and introduced the term “Dyname” for a finite
spatial distribution of force vectors. (The French were using the term “torseur,” which
eventually became the more modern term “torsor.”) In the meantime, Sir Robert Ball [9]
had introduced the term “screw” to describe the canonical form of the rigid motion and
“wrench” to describe the canonical form of a force distribution. To some extent, the
German school of geometrical kinematics culminated in the 1903 treatise [10] of another
student of Plücker named Eduard Study that was entitled Die Geometrie der Dynamen.
One of the innovations that Study introduced in that work was the algebra of “dual
numbers,” which he applied to the study of quaternions to produce “dual quaternions.”
Actually, William Kingdon Clifford had previously sketched out a theory [11] of what he
was calling “biquaternions,” although his usage of that word was inconsistent with that of
Hamilton, since Clifford’s biquaternions were quaternions whose components were dual
numbers, while Hamilton’s biquaternions had complex components. However, the
treatise of Study developed the concepts in much more detail than that of Clifford. Since
dual quaternions represent an algebra over R⁸, some authors (e.g., MacAulay [12])
referred to them as “octonions.” Unfortunately, that usage is inconsistent with the
modern usage of that term to refer to Cayley numbers, which define a division algebra
over R⁸, unlike dual quaternions, which have divisors of zero, and therefore cannot be a
division algebra.
The dual numbers represent an algebra over R², just as the complex numbers do, but
the difference is that the basic object of the dual number algebra is a symbol ε that is
nilpotent – viz., ε² = 0 – while the basic object of the complex algebra is a symbol i with
the property that i² = −1. The effect of introducing ε is to produce an algebra that is not a
division algebra – in particular, it has divisors of zero – with the property that the product
of two dual numbers a + εb and c + εd is ac + ε(ad + bc), which then combines the usual
multiplication of real numbers with their addition. Although it is not obvious at this stage
of the discussion, when one uses such numbers for the components of quaternions, it
allows one to represent three-dimensional rigid motions by means of the group of unit dual
quaternions, just as the unit quaternions carry a representation of the group of three-
dimensional Euclidian rotations; in fact, the group of unit quaternions is isomorphic to
SU(2), which gives the spin representation of the rotation group.
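
The product rule for dual numbers that was just stated can be checked with a few lines of Python. This is only a sketch: the class name Dual and its field names a and b are chosen here for illustration and do not come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """Dual number a + eps*b, where eps*eps = 0 (a: real part, b: dual part)."""
    a: float
    b: float = 0.0

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + eps b)(c + eps d) = ac + eps(ad + bc); the eps^2 bd term vanishes.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

eps = Dual(0.0, 1.0)
x = Dual(2.0, 3.0)
y = Dual(5.0, 7.0)

product = x * y              # 2*5 + eps*(2*7 + 3*5)
eps_squared = eps * eps      # nilpotency of eps
zero_divisor_product = Dual(0.0, 4.0) * Dual(0.0, -1.5)  # two non-zero factors
```

The last line exhibits the divisors of zero: any two dual numbers with vanishing real part multiply to zero, so no such number can have a multiplicative inverse.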
Apparently, the main acceptance of the methods of dual quaternions was by the
mechanical engineering community. One of the earliest applications of screws and dual
quaternions to mechanics was by the Russian Zanchevskiy in 1889 [13], although no
copies of that paper remain, as they were destroyed by the Bolsheviks during the revolution
of 1917, along with some later work of Kotjelnikoff [14]. Richard von Mises developed
the application to mechanics in a pair of papers in 1924 [15] that are widely cited to this
day. Another classic of the Russian school was the book by Dimentberg [16], which also
contained a bibliography of Russian work that was not widely known outside of the
Soviet Union. The main fields of application of dual quaternions to mechanics that are
being discussed today seem to be the theory of mechanisms [17−19], and especially
robot manipulators [20−22]. Just as real quaternions give a computationally-efficient
algorithm for dealing with rigid-body rotations in real time, dual quaternions give a
computationally-efficient algorithm for dealing with rigid-body motions that also include
translations, such as the motion of joints in manipulators.
Nonetheless, in the physics community itself, one finds the observation of Herbert
Goldstein in a footnote to the 1980 edition [23] of his standard textbook Classical
Mechanics in the context of rigid-body kinematics:

“…Such a combination of translation and rotation is called a screw motion. There
seems to be little present use for this version of Chasles’ theorem, nor for the
elaborate mathematics of screw motions as developed in the nineteenth century…”

Meanwhile, in a different part of the physics community, the complex quaternions
that Hamilton had only alluded to were being applied to the emerging physics of special
relativity. One of the earliest researchers to develop that application was the Polish
physicist Ludwik Silberstein [24], who showed how the Lorentz transformations could be
represented by the action of unit complex quaternions on the quaternions. Because the
complex quaternions, like the dual quaternions, are also an algebra – but not a division
algebra – over R⁸, and they admit more automorphisms than the real quaternions, due to
the possibility of complex conjugation, they also admit more ways of defining the action
of unit complex quaternions on the quaternions, and more types of invariant subspaces
for the actions, which correspond to the different types of tensors that one can represent
by quaternions.
As for the acceptance of the methods of complex quaternions into relativity theory,
one should note that according to Silberstein, in the cited reference, Minkowski felt that
quaternions were “too narrow and clumsy for the purpose.” However, various papers on
the subject of quaternions and relativity followed, just the same. (See, e.g., Weiss [25]
and Rastall [26].)

The fact that the representation of various tensors (scalar, vector, bivector, spinor)
by complex quaternions corresponds, not to differing numbers of component indices, but
to differing actions, makes these representations somewhat more esoteric than the
more conventional tensor representations. However, the power of the methodology lies
in the fact that one can represent different types of tensors by the same basic algebra.
In particular, various attempts were made – notably, by Lanczos [27] – to
apply the methods of complex quaternions to the modeling of both the Maxwell
equations for electromagnetism and the Dirac equation for the wave function of the
electron. The big problem that emerged was the unification problem of finding a field (or
wave) equation that would include both Maxwell and Dirac as special cases.
This unification problem was closely related to the Einstein-Maxwell problem, which
concerned finding a field equation that would imply both Maxwell’s equations of
electromagnetism and Einstein’s equations of gravitation as consequences. One of the
many attempts [28] that Einstein, together with Mayer, made along those lines involved
the use of what he was calling “semi-vectors,” which were later shown by Blaton [29] to
be a slight generalization of the spinors that were gradually being introduced into
quantum physics due to the discovery of the magnetic moment of the electron and the
Uhlenbeck-Goudsmit hypothesis that it was due to some form of intrinsic angular
momentum.
One of the intriguing aspects of the Lanczos equations was that they involved an even
higher-dimensional field space than the complex quaternions: since they seemed
to involve pairs of complex quaternions, one might think of the wave function as taking
its values in an algebra of real dimension sixteen. Interestingly, both Alexandre Proca [30]
and Sir Arthur Stanley Eddington [31] were suggesting that the correct form of the
quantum wave function should take its values in the full sixteen-dimensional Clifford
algebra of Minkowski space, and not just C⁴. However, as is often the case, the
Lanczos equations did not attract widespread attention, mostly due to the problem of the
physical interpretation of the pairs of complex quaternions.
One suggestion that is worthy of consideration was made more recently by Gsponer
and Hurni [32], who conjectured that perhaps the pairing of wave functions is simply due
to the more modern notion of isospin symmetry, such as one finds in the proton-neutron
doublet or the electron-neutrino doublet. Certainly, such a concept was not known to
Lanczos at the time of his papers in 1929, so it would not have been considered back then.
Another sixteen-real-dimensional algebra that has a natural place in physical
mechanics is the algebra of complex dual quaternions, which have the same
multiplication table for the basis elements as real quaternions, but their components come
from the four-real-dimensional algebra of complex dual numbers. These numbers look
essentially the same as real dual numbers, except that the components are complex
numbers. Thus, the complex dual quaternions can be regarded as a complexification of
the real dual quaternions, so they have some of the features of both the dual and the
complex numbers. As a complexification of the real dual quaternions, the group of unit
complex dual quaternions naturally carries a representation of the group ISO(3; C) of
complex rigid motions, which is then the semi-direct product of the three-dimensional
complex Euclidian rotation group with the three-dimensional complex translation group.
Although the complex rotation group is actually isomorphic to the identity component of
the Lorentz group – i.e., the proper, orthochronous Lorentz group – and the translation
group C³ includes R⁴ as a subgroup, nonetheless, the nature of the semi-direct product
does not permit one to find a Poincaré subgroup of ISO(3; C). However, this does not
mean that there are no physically useful applications of the group, since it acts quite
naturally on C³, which can be used as a model for the field spaces of electromagnetism,
namely, the spaces of bivectors and 2-forms over R⁴.
The representation of electromagnetic fields as fields with values in C³, such as in the
form E + iB, goes back at least as far as Riemann’s lectures on partial differential
equations (see Weber [33]), although it only got a brief passing mention at the time. It
was next discussed by Silberstein [34a, b] in 1907 and independently by Arthur Conway
[35] in 1911. Later, it was notably employed by Ettore Majorana in order to put
Maxwell’s and Dirac’s equations into a common formalism, although that work took the
form of notes that were not published until much later in a compilation volume [36].
Independently of him, J. Robert Oppenheimer [37] also employed the complex
representation in order to discuss the problem of finding a wave function for the photon
that would be analogous to the one that the Dirac equation gives for the electron.
Interestingly, that problem is still open, since the statistical interpretation of the wave
function assumes that one can localize it to a point particle, which is impossible for the
photon, although one can still speak of a momentum-space wave function for it. The
complex representation has also been developed in the context of general relativity
theory, and some researchers referred to it as the method of “3-spinors.” (See,
e.g., [38, 39].)

Although the main application of quaternions in this monograph is to kinematics,
nevertheless, for the sake of completeness, we shall also discuss complex dual
quaternions, which have not been the subject of as much discussion in physics as real,
dual, and complex quaternions. As we see it, the main application of complex dual
quaternions is to the symmetries of electromagnetic field equations, so the kinematics of
complex rigid frames will not be pursued at the moment. Since the complex dual
quaternions provide a sixteen-real-dimensional field space, that fact might explain the
Lanczos equations in a physically reasonable way as an alternative to isospin doublets.

The basic structure of this monograph is straightforward:

In chapter I, we define the general notion of an algebra and discuss some of the
elementary notions concerning them that will be applied in the later chapters. The fact
that an algebra is a special type of ring is emphasized, so some of the material is general
to rings, while other material is specific to algebras. In particular, the notion of the tensor
product of algebras will be crucial to the discussion of dual, complex, and complex dual
quaternions.
In chapter II we introduce the real quaternions, which are at the root of all of the other
variants that we will subsequently discuss. In particular, we show that the group of unit
real quaternions is a Lie group that is isomorphic to SU(2) by a straightforward
association of quaternions with 2×2 complex matrices. We then show how the unit real
quaternions act on the three-dimensional vector subspace of pure – or “vector” –
quaternions in a manner that represents the action of SO(3; R) on three-dimensional real
Euclidian space, although diametral pairs of unit quaternions get associated with the same
proper rotation in the manner of the spin representation of the orthogonal group. We then
show how one derives the basic kinematical objects from such an action when the unit
quaternion varies differentiably in time. In particular, we discuss both velocity and
acceleration for vectors and frames and show the forms that they take in both inertial and
co-moving frames.
In chapter III, we introduce the algebra of real dual numbers and show how one does
some of the usual trigonometry and linear algebra using them instead of real numbers.
We then define the algebra of dual quaternions to be the tensor product of the real
quaternions with the dual numbers and go over the same basic topics that we did in the
chapter on real quaternions. We then see that the unit dual quaternions carry a
representation of the Lie group of rigid motions in three-dimensional real Euclidian space
and examine how that group acts on dual vectors, which are the analogue of pure
quaternions in this case. We then show how many of the kinematical expressions that
were derived before are obtained essentially by changing the ring of coefficients for the
vectors from real numbers to dual numbers.
In chapter IV, we discuss complex quaternions. The basic flow of ideas is the same
as in the previous two chapters, although we will see that there are more automorphisms
that one can introduce in the complex case, which define more actions of the unit
complex quaternions on the complex quaternions and more invariant subspaces. Since
that group is easily seen to be isomorphic to SL(2; C), one sees that the complexification
of real rotations gives transformations that have fundamental significance in special
relativity, as does the group SO(3; C). Furthermore, one finds that when the scalars
are complex one can have non-trivial “null quaternions,” which are related to the
isotropic vectors of Minkowski space. They also make it possible to find actions of the
group of unit complex quaternions whose invariant subspaces could represent SL(2; C)
spinors. We then examine the kinematics of frames in the various invariant subspaces.
Finally, in chapter V, we attempt to follow the same basic template for exploring the
algebra of complex dual quaternions and examine what issues become relevant when one
complexifies the algebra of real dual quaternions, or similarly, “dualizes” the algebra of
complex quaternions. We conclude by suggesting that the most immediate application of
complex dual quaternions is to the symmetries of the Maxwell equations, which then
becomes a matter for a later study of the role of quaternions in physical field theories.

A certain familiarity with the basic definitions of differentiable manifolds and Lie
groups will be assumed in what follows, such as one might learn from Frankel [40].
However, most of the discussion is algebraic in character, and assumes just the rudiments
of linear algebra, while concepts from abstract algebra, such as rings, modules, and fields,
will be introduced as necessary for the benefit of physicists who are not familiar with
them.

References

1. W. Blaschke:
a. and H. R. Müller, Ebene Kinematik, Oldenbourg, Munich, 1956.
b. “Anwendungen dualer Quaternionen auf Kinematik,” Annales Academiae
Scientiarum Fennicae (1958), 1-13; Gesammelte Werke, v. 2; English
translation by D. H. Delphenich at neo-classical-physics.info.
c. Kinematik und Quaternionen, Mathematische Monographien, VEB Deutscher
Verlag der Wissenschaften, Berlin, 1960; English translation by D. H.
Delphenich at neo-classical-physics.info.
2. L. Euler, “Formulae generales pro translatione quacunque corporum rigidorum,”
Novi Commentarii Acad. Petropolitanae 20 (1776), 189-207.
3. W. R. Hamilton, Sir, Elements of Quaternions, Longmans, Green, and Co., London,
1866.
4. J. B. Shaw, Synopsis of linear associative algebras, Carnegie Institute, Washington,
D.C., 1907.
5. L. Poinsot:
a. Éléments de statique, 1st ed., Gauthier-Villars, Paris, 1803; 2nd ed., with a
preface by Bertrand, 1877.
b. Outlines of a new theory of rotational motion. Extract from a memoir he
presented to the French Institute in 1834 that was translated into English by C.
Whitley, Pitt Press, Cambridge, 1834.
6. M. Chasles, Aperçu historique sur l’origine et le développement des méthodes en
géométrie, Hayez, Brussels, 1837.
7. J. Plücker:
a. “Fundamental views regarding mechanics,” Phil. Trans. Roy. Soc. London
156 (1866), 361-380.
b. Neue Geometrie des Raumes, Teubner, Leipzig, 1868.
8. F. Klein:
a. “Notiz, betreffend den Zusammenhang der Liniengeometrie mit der Mechanik
starrer Körper,” Math. Ann. 4 (1871); Ges. math. Abh., art. XIV; English
translation by D. H. Delphenich at neo-classical-physics.info.
b. “Zur Schraubentheorie von Sir Robert Ball,” Zeit. Math. Phys. 47 (1902),
republished with an appendix in Math. Ann. 62 (1906); Gesammelte
mathematische Abhandlungen, art. XXIX; English translation by D. H.
Delphenich at neo-classical-physics.info.
c. Elementary Mathematics from an Advanced Standpoint: Geometry, trans.
from the third German edition by E. R. Hedrick, Dover, Mineola, NY, 2004 (see
esp. pp. 21-38). (First German edition was published in 1908.)
9. R. S. Ball, Sir, The Theory of Screws, Hodges, Foster, and Co., Dublin, 1876.
10. E. Study, Die Geometrie der Dynamen, Teubner, Leipzig, 1903.
11. W. K. Clifford, “A preliminary sketch of biquaternions,” Proc. Lond. Math. Soc. 4
(1873),
12. A. MacAulay, Octonions: A development of Clifford’s Bi-quaternions, Cambridge
University Press, Cambridge, 1898.
13. I. Zanchevskiy, “The Theory of Screws,” Bulletin of the Mathematics Division of
the Novorossiysk Society of Natural Scientists, v. IX, Odessa, 1889 (in Russian, and
no longer available).
14. A. P. Kotjelnikoff, “Screw Calculus and Some of its Applications to Geometry and
Mechanics,” Kazan U., 1895 (in Russian, but no longer available).
15. R. v. Mises:
a. “Motorrechnung: Ein neues Hilfsmittel der Mechanik,” ZAMM 4 (1924),
b. “Anwendungen der Motorrechnung,” ZAMM 4 (1924),
(English translations of both papers are available at neo-classical-physics.info.)
16. F. M. Dimentberg, “The Screw Calculus and its Applications in Mechanics,”
Fiziko-Matematicheskoy Literatury, Moscow, 1965 (in Russian); English
translation by Foreign Technology Division of U.S.A.F., available through DTIC.
17. A. T. Yang and F. Freudenstein, “Application of dual-number quaternion algebra to
the analysis of spatial mechanisms,” J. of Appl. Mech., Trans. of the ASME, E
(1964).
18. G. R. Veldkamp, “On the use of dual numbers, vectors, and matrices in
instantaneous, spatial kinematics,” Mechanism and Machine Theory, 11 (1976),
141-156.
19. K. H. Hunt, Kinematic Geometry of Mechanisms, Oxford University Press, Oxford,
1978.
20. J. M. Selig, Geometric Fundamentals of Robotics, Springer, Berlin, 2005.
21. J. M. McCarthy, Introduction to Theoretical Kinematics, M.I.T. Press, Cambridge,
MA, 1990.
22. J. Duffy, Statics and Kinematics, with Applications to Robotics, Cambridge
University Press, Cambridge, 1996.
23. H. Goldstein, Classical Mechanics, 2nd ed., Addison-Wesley, Reading, MA, 1980.
24. L. Silberstein, The Theory of Relativity, MacMillan, London, 1914.
25. P. Weiss, “On some applications of quaternions to restricted relativity and classical
radiation theory,” Proc. Roy. Irish Acad. A: Math. Phys. Sci. 46 (1940/1941), 129-
168.
26. P. Rastall, “Quaternions in relativity,” Rev. Mod. Phys. (1964), 820-832.
27. C. Lanczos:
a. “Die tensoranalytischen Beziehungen der Diracschen Gleichung,” Zeit. Phys.
57 (1929), 447-473. English translation by D. H. Delphenich at neo-classical-
physics.info.
b. “Zur kovarianten Formulierung der Diracschen Gleichung,” Zeit. Phys. 57
(1929), 474-483. English translation by D. H. Delphenich at neo-classical-
physics.info.
c. “Die Erhaltungssätze in der feldmässigen Darstellung der Diracschen
Theorie,” Zeit. Phys. 57 (1929), 484-493. English translation by D. H.
Delphenich at neo-classical-physics.info.
28. A. Einstein and W. Mayer, “Semivektoren und Spinoren,” Sitz. d. preuss. Akad. d.
Wiss. (1932), 522-550.
29. J. Blaton, “Quaternionen, Semivektoren, und Spinoren,” Zeit. Phys. 95 (1935), 337-
354. English translation by D. H. Delphenich at neo-classical-physics.info.
30. A. Proca, “Sur l’équation de Dirac,” J. Phys. Rad. 1 (1930), 235-248.
31. A. S. Eddington, Relativity Theory of Protons and Electrons, Cambridge University
Press, Cambridge, 1936.
32. A. Gsponer and J.-P. Hurni, “Lanczos-Einstein-Petiau: From Dirac’s equations to
nonlinear wave mechanics,” in W. R. Davis, et al., Cornelius Lanczos Collected
Published Papers with Commentaries, North Carolina State University, Raleigh,
NC, 1998. v. III, pp. 2-1248 to 2-1277; also available at arXiv.org,
physics/0508036.
33. H. Weber, Die partiellen Differentialgleichungen der mathematischen Physik, nach
Riemann’s Vorlesungen, v. 2, Vieweg and Son, Braunschweig, 1901; see § 138,
especially.
34. L. Silberstein:
a. “Elektromagnetische Grundgleichungen in bivectorieller Behandlung,” Ann. d.
Phys. 327 (1907), 579-586. English translation by D. H. Delphenich at neo-
classical-physics.info.
b. “Nachtrag zur Abhandlung über ‘Elektromagnetische Grundgleichungen in
bivectorieller Behandlung’,” Ann. d. Phys. 329 (1907), 783-784. English
translation by D. H. Delphenich at neo-classical-physics.info.
35. A. Conway, “On the application of quaternions to some recent developments of
electrical theory,” Proc. Roy. Irish Acad. A: Math. Phys. Sci. 29 (1911/1912), 1-9.
36. E. Majorana, personal notes that were later compiled in S. Esposito, E. Recami, A.
van der Merwe, and R. Battiston, Ettore Majorana: Research Notes in Theoretical
Physics, Springer, Heidelberg, 2008.
37. J. R. Oppenheimer, “Note on light quanta and the electromagnetic field,” Phys.
Rev. 38 (1931), 725-746.
38. A. Peres, “Three-component spinors,” J. Math. Mech. 11 (1962), 61-79.
39. M. Cahen, R. Debever, and L. Defrise, “A complex vectorial formalism in general
relativity,” J. Math. Mech. 16 (1967), 761-785.
40. T. Frankel, The Geometry of Physics, an Introduction, Cambridge University Press,
Cambridge, 1997.
CHAPTER I

ALGEBRAS.

1. General notions [1-3]. If V is an n-dimensional vector space whose scalars come
from a field K, which will always be either the field R of real numbers or the field C of
complex numbers in what follows, then an algebra over V is defined by a K-bilinear map
V×V → V, (a, b) ↦ ab that one regards as a multiplication of vectors to produce another
vector. Thus, for any elements a, b, c ∈ V and any scalar λ ∈ K, one must have:

(a + b)c = ac + bc, a(b + c) = ab + ac, (λa)b = a(λb) = λ(ab).

Since the vector space V has an Abelian group structure that is defined by vector
addition and the first two conditions that were just stated represent right and left
distributivity, any algebra can be regarded as a ring algebraically (cf., e.g., Jacobson [4],
which also includes a chapter on algebras). However, the last set of conditions, which
relates the product to scalar multiplication, specializes the definition to vector spaces.
The multiplication that defines an algebra does not have to be associative or
commutative, nor does it have to admit a unity element or multiplicative inverses.
Indeed, it might admit divisors of zero, which would be non-zero elements a and b such
that:
ab = 0.

If {e_i, i = 1, …, n} is a basis for the vector space that underlies an algebra A then one
can obtain all of the important structure of the algebra from the multiplication table for
the basis elements, since any other elements are linear combinations of them and the
product is assumed to be bilinear. One can then summarize the multiplication table for
the basis elements in the form of a set of linear equations that express the various
products as linear combinations of the basis elements again:

e_i e_j = a_{ij}^k e_k , (1.1)

in which a sum over the repeated index k is implied, and the component array a_{ij}^k is
referred to as the set of structure constants for the algebra in that basis. These constants
can also be regarded as the components of a third-rank tensor of mixed type over A, since
the algebra multiplication is a bilinear map from A×A to A, and thus defines an element
of A* ⊗ A* ⊗ A.
There is an essential difference between defining a basis for the vector space that
underlies an algebra A and defining a set of generators for the algebra. A set S ⊂ A
consists of generators for A if every element of A can be expressed as a linear
combination of products of elements of S. Thus, although any basis will generate the
algebra, often, as we shall see, a subset of a basis might generate the other basis elements
by way of products of the basis elements.
If $v = v^i e_i$ and $w = w^i e_i$ are arbitrary elements of A then one sees that their product vw
has components with respect to $e_i$ that can be obtained from:

$$(vw)^k = a_{ij}^{k} v^i w^j. \tag{1.2}$$
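As a concrete check of (1.2), the following minimal Python sketch multiplies two elements directly from a table of structure constants. The choice of algebra (the complex numbers regarded as a two-dimensional real algebra with basis e_0 = 1, e_1 = i), the 0-based indexing, and the names `product` and `a_complex` are assumptions made purely for illustration.

```python
def product(a, v, w):
    """Components (vw)^k = a_ij^k v^i w^j, with a[i][j][k] storing a_ij^k."""
    n = len(v)
    return [sum(a[i][j][k] * v[i] * w[j] for i in range(n) for j in range(n))
            for k in range(n)]

# Assumed example: C as a real algebra with basis e_0 = 1, e_1 = i.
a_complex = [
    [[1, 0], [0, 1]],    # e_0 e_0 = e_0,   e_0 e_1 = e_1
    [[0, 1], [-1, 0]],   # e_1 e_0 = e_1,   e_1 e_1 = -e_0
]

v = [1, 2]    # 1 + 2i
w = [3, -1]   # 3 - i
vw = product(a_complex, v, w)

# Cross-check against Python's built-in complex multiplication.
agrees_with_builtin = complex(*vw) == complex(*v) * complex(*w)
```

The same `product` function works for any finite-dimensional algebra once its structure constants are tabulated; nothing in it assumes associativity or commutativity.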

An especially important class of non-associative algebras is given by Lie algebras,
for which the multiplication of a and b is written [a, b]. The Lie bracket is then required
to be anti-symmetric (1) and satisfy the Jacobi identity:

[a, b] = − [b, a], [a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.

The Jacobi identity can be regarded as a measure of the non-associativity of the Lie
bracket. However, if one has an associative algebra over a vector space V then one can
define a Lie algebra over V by the commutator bracket:

[a, b] = ab – ba. (1.3)

The associativity of the product guarantees that the bracket satisfies the
Jacobi identity.
A particular example of a Lie algebra that will recur in what follows is the algebra
over R3 that is defined by the vector cross product (2):

$$[a, b] = a \times b = (\varepsilon_{ijk}\, a^j b^k)\, e_i, \tag{1.4}$$

in which {e1, e2, e3} constitutes the canonical basis {(1, 0, 0), (0, 1, 0), (0, 0, 1)} for R3,
while $a^i$ and $b^i$ are the components of a and b with respect to that basis. Thus, the
Levi-Civita symbol $\varepsilon_{ijk}$ also gives one the structure constants for that Lie algebra.
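The two defining properties of a Lie bracket can be checked numerically for the cross product. The sketch below is only an illustration: the closed-form expression used for the Levi-Civita symbol, the 0-based indices, the helper names, and the sample vectors are all assumptions that do not come from the text.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for 0-based indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(a, b):
    """[a, b]^i = epsilon_ijk a^j b^k, i.e. the cross product a x b."""
    return [sum(epsilon(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

def add(*vectors):
    """Component-wise sum of any number of vectors."""
    return [sum(parts) for parts in zip(*vectors)]

a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 10]

# [a, b] + [b, a] should vanish (anti-symmetry).
anti_symmetry = add(bracket(a, b), bracket(b, a))

# [a, [b, c]] + [b, [c, a]] + [c, [a, b]] should vanish (Jacobi identity).
jacobi = add(bracket(a, bracket(b, c)),
             bracket(b, bracket(c, a)),
             bracket(c, bracket(a, b)))
```

Because the sample vectors have integer components, both sums vanish exactly rather than only up to rounding.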
This Lie algebra is isomorphic to the Lie algebra so(3; R) of infinitesimal three-
dimensional Euclidian rotations. We regard the latter as defined by anti-symmetric real
3×3 matrices, when given the commutator bracket. A useful basis for the Lie algebra
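so(3; R) is defined by the elementary infinitesimal rotation matrices:

$$J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad J_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{1.5}$$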

for infinitesimal rotations around the x, y, and z axes, respectively.

(1) For some authors, if one wishes to admit vector spaces over Z2, the anti-symmetry is replaced by the
requirement that [a, a] = 0 in any case. We shall not, however, need such generality for our purposes.
(2) We shall adhere to the notational convention that lower-case Latin indices always range from 1 to 3
and lower-case Greek indices range from 0 to 3.
The isomorphism of (R3, ×) with so(3; R) is then defined by the adjoint
representation, ad: (R3, ×) → so(3; R), a ↦ [ad(a)], where:

ad(a) b = a × b (1.6)

and [ad(a)] is the matrix of the linear map ad(a): R3 → R3 with respect to some basis. If
that basis is the canonical basis then one finds that:

Ji = [ad(ei)], (i = 1, 2, 3). (1.7)
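The relations between the cross product and the matrices (1.5) can be verified with a short Python sketch. It checks that the commutators of the J_i reproduce the cross-product structure constants and that the linear combination a^i J_i acts on a vector b as a × b. The helper names and the sample vectors are assumptions made for illustration.

```python
# The elementary infinitesimal rotation matrices of (1.5).
J = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],   # J_1
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],   # J_2
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],   # J_3
]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(3)] for r in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def ad_matrix(a):
    """Matrix of the map b -> a x b, built as the combination a^i J_i."""
    return [[sum(a[i] * J[i][r][c] for i in range(3)) for c in range(3)]
            for r in range(3)]

def apply(M, b):
    return [sum(M[r][c] * b[c] for c in range(3)) for r in range(3)]

a, b = [1, 2, 3], [4, 5, 6]
same_action = apply(ad_matrix(a), b) == cross(a, b)
```

The cyclic commutators [J_1, J_2] = J_3, [J_2, J_3] = J_1, [J_3, J_1] = J_2 mirror e_1 × e_2 = e_3 and its cyclic permutations, which is what the isomorphism asserts on basis elements.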


One can then say that:

$$\mathrm{ad}(a) = a^i J_i = \begin{pmatrix} 0 & -a^3 & a^2 \\ a^3 & 0 & -a^1 \\ -a^2 & a^1 & 0 \end{pmatrix}. \tag{1.8}$$

Interestingly, when one defines the vector product of complex 3-vectors in C3, the
resulting Lie algebra so(3; C) is isomorphic to the Lie algebra so(1, 3) of infinitesimal
Lorentz transformations of Minkowski space, which, for us, will be R4 given the scalar
product of signature type (+1, −1, −1, −1). These infinitesimal Lorentz transformations
will be defined by real 4×4 matrices l with the property that:

$$\eta l + l^{T} \eta = 0,$$

in which η = diag[+1, −1, −1, −1] is the matrix of the scalar product.
If the basis for the complex vector space C3 is given by the canonical basis then one
can give a basis for the complex Lie algebra so(3; C) by way of Ji = [ad(ei)], as before. If
one regards so(3; C) as a real Lie algebra then one can give a basis for it by way of Ji and Ki
= iJi, and one sees that, from the C-bilinearity of the Lie bracket, the commutation rules
for the basis are:

$$[J_i, J_j] = \varepsilon_{ijk} J_k, \quad [J_i, K_j] = \varepsilon_{ijk} K_k, \quad [K_i, K_j] = -\varepsilon_{ijk} J_k, \tag{1.9}$$

which are then isomorphic to those of so(1, 3), when one replaces Ji and Ki with the real
4×4 matrices:

$$\hat{J}_i = \begin{pmatrix} 0 & 0 \\ 0 & J_i \end{pmatrix}, \quad \hat{K}_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad \hat{K}_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad \hat{K}_3 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \tag{1.10}$$

which then describe the three elementary infinitesimal rotations and boosts, respectively.
Thus, one can see that, in a sense, an infinitesimal boost is like an imaginary
infinitesimal rotation.
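The commutation rules (1.9) can be confirmed for the real 4×4 rotation and boost generators with a few lines of Python. In this sketch the generators are built as the block embedding of the J_i and the symmetric boost matrices, and the defining condition is tested in the form η l + l^T η = 0 (written with the transpose). The helper names and the 0-based index conventions are assumptions made for illustration.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for 0-based indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def combination(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(s * M[r][c] for s, M in zip(coeffs, mats)) for c in range(n)]
            for r in range(n)]

J = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def embed(Ji):
    """Place a 3x3 rotation generator in the spatial block of a 4x4 matrix."""
    M = [[0] * 4 for _ in range(4)]
    for r in range(3):
        for c in range(3):
            M[r + 1][c + 1] = Ji[r][c]
    return M

def boost(i):
    """Symmetric boost generator mixing time with the i-th spatial axis."""
    M = [[0] * 4 for _ in range(4)]
    M[0][i + 1] = M[i + 1][0] = 1
    return M

J_hat = [embed(Ji) for Ji in J]
K_hat = [boost(i) for i in range(3)]
ETA = [1, -1, -1, -1]   # diagonal of the Minkowski scalar product

def is_lorentz_generator(l):
    """Check eta l + l^T eta = 0 entry by entry."""
    return all(ETA[r] * l[r][c] + l[c][r] * ETA[c] == 0
               for r in range(4) for c in range(4))

def commutation_rules_hold():
    for i in range(3):
        for j in range(3):
            eps = [epsilon(i, j, k) for k in range(3)]
            minus_eps = [-s for s in eps]
            if commutator(J_hat[i], J_hat[j]) != combination(eps, J_hat):
                return False
            if commutator(J_hat[i], K_hat[j]) != combination(eps, K_hat):
                return False
            if commutator(K_hat[i], K_hat[j]) != combination(minus_eps, J_hat):
                return False
    return True
```

The sign in the last rule is the one that distinguishes boosts from rotations: two boosts along different axes commute to minus a rotation.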

We have just given two examples of a fundamental class of associative algebras in the
form of matrix algebras, which are defined by vector spaces of square matrices under
matrix multiplication. We will use the notation M(n, R) for the algebra of real n×n
matrices and M(n, C) for complex n×n matrices. Both of these algebras have a unity in
the form of the identity matrix. An early theorem of Cayley stated that any associative
algebra can be represented by a matrix algebra. In many cases, this representation is an
isomorphism.
A vector subspace S of an algebra A is called a subalgebra if the product of any two
elements of S is another element of S; from the bilinearity of the product, the product of
any linear combination of elements of S with any other linear combination of elements of
S will then belong to S. For instance, 0 and A are (improper) subalgebras, and in the case
of the Lie algebra so(3; R), the set of all anti-symmetric 3×3 real matrices $\omega^i_j$ such that
$\omega^i_j v^j = 0$ for a fixed non-zero vector v is a subalgebra, and, in fact, a one-dimensional
subalgebra that is isomorphic to so(2; R). Similarly, M(n, R) can be represented as a
subalgebra of M(n, C); for instance, by representing any matrix in M(n, C) as the sum of a
real matrix, which then belongs to M(n, R), and an imaginary one. Note that although the
imaginary matrices in M(n, C) take the form of i times a real matrix, nonetheless, the
product of two imaginary matrices is real, so the imaginary matrices do not form a subalgebra.
An important example of a subalgebra is the center of any algebra A. The center Z(A)
consists of all elements of A that commute with every other element; i.e., Z(A)
= {z ∈ A | az = za for all a∈ A}. At the very least, 0 and 1 will have this property (if
there is a unity), so if A has a unity all scalar multiples of 1 will be contained in the
center. For instance, the Schur lemma says that the only square matrices that commute
with all other square matrices are scalar multiples of the identity matrix, which says that
the center of those algebras will be defined by those scalar multiples.
One can easily see that Z(A) must be a subalgebra of A, since if z and z′ are elements
of Z(A) and a is an arbitrary element of A then:

a (zz′) = z a z′ = (zz′) a

so zz′ commutes with any element of A, as well.

Any algebra product can be polarized into a sum of a commutator and an
anti-commutator:

$$ab = \tfrac{1}{2} [a, b] + \tfrac{1}{2} \{a, b\},$$

where:

$$\{a, b\} = ab + ba.$$

This results in a corresponding polarization of the structure constants with respect to
any choice of basis:

$$a_{ij}^{k} = b_{ij}^{k} + c_{ij}^{k}, \quad b_{ij}^{k} = b_{ji}^{k}, \quad c_{ij}^{k} = -c_{ji}^{k}.$$

For a Lie algebra, by anti-symmetry, only the $c_{ij}^{k}$ will be non-vanishing.

A Clifford algebra C(n, <.,.>) over an n-dimensional orthogonal space (V, <.,.>) has
the property that:
{a, b} = 2<a, b>, a, b ∈ V. (1.11)

A Clifford algebra over an n-dimensional vector space will be $2^n$-dimensional and can be
represented by some matrix algebra. Any orthonormal frame {ea, a = 1, .., n} for V will
then define a set of generators for the algebra, since any element of the algebra can be
expressed as a linear combination of products of the generators. One then sees that a
basis for C(n, <.,.>) is defined by all of the linearly independent products {1, ea, ea eb, …,
e1 … en}, when one takes into account the basic defining relation (1.11).
The Clifford algebra that will be of interest to us is the one C(3, δij) that is defined
over real, three-dimensional Euclidian space (R3, δij). Relative to the canonical basis,
which is also orthogonal, one then has:
ei ej + ej ei = 2δij . (1.12)

This algebra is then eight-dimensional as a real algebra and has a basis that is defined by
the set of products {1, ei , e2e3, e3e1, e1e2, e1e2e3}.
From (1.12), the set {1, e2e3, e3e1, e1e2} closes under multiplication and thus defines a
four-dimensional subalgebra of C(3, δij) that one calls the even subalgebra. The four-
dimensional vector space complement that is spanned by the set {ei, e1e2e3} does not
close, so it is not a subalgebra, and one calls the elements of this space the odd elements
of C(3, δij).
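The defining relation (1.12) and the closure of the even subalgebra can be tested in a concrete matrix model. The sketch below assumes one particular representation that the text has not introduced at this point, namely the three Pauli matrices as images of e_1, e_2, e_3; that choice, and all of the helper names, are illustrative assumptions only.

```python
I2 = [[1, 0], [0, 1]]

# Assumed matrix model of the generators: the Pauli matrices.
e = [
    [[0, 1], [1, 0]],        # e_1
    [[0, -1j], [1j, 0]],     # e_2
    [[1, 0], [0, -1]],       # e_3
]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def anticommutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] + BA[r][c] for c in range(2)] for r in range(2)]

# e_i e_j + e_j e_i = 2 delta_ij for every pair of generators.
clifford_relation_holds = all(
    anticommutator(e[i], e[j]) == scale(2 if i == j else 0, I2)
    for i in range(3) for j in range(3))

# Basis of the even part: 1, e2 e3, e3 e1, e1 e2.
e23, e31, e12 = matmul(e[1], e[2]), matmul(e[2], e[0]), matmul(e[0], e[1])
even_basis = [I2, e23, e31, e12]

def is_signed_even_basis_element(M):
    return any(M == scale(s, B) for B in even_basis for s in (1, -1))

# Products of even basis elements are again +/- even basis elements.
even_part_closes = all(is_signed_even_basis_element(matmul(A, B))
                       for A in even_basis for B in even_basis)

# The product of two odd elements lands in the even part, so the odd part
# is not closed under multiplication.
odd_times_odd_is_even = is_signed_even_basis_element(matmul(e[0], e[1]))
```

All entries are products of 0, ±1 and ±i, so the equality tests are exact even though Python stores them as floating-point complex numbers.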
Although one might think that we might also use the Clifford algebra C(4, ηµν) over
Minkowski space, nonetheless, we shall find that C(3, δij) is still fundamental to the
complex quaternions, which relate to Lorentz transformations, when one complexifies it.

An algebra that has no divisors of zero is an integral domain. An algebra with unity
becomes a division algebra when every element then admits a multiplicative inverse.
According to Adams’s theorem, the only real division algebras, up to isomorphism, are
R, C, H, O, which are algebras of real numbers, complex numbers, real quaternions, and
octonions – or Cayley algebras – respectively. A commutative division algebra is then a
field. The division algebras R and C are both associative and commutative, while H is
associative, but not commutative, and O is neither associative nor commutative. The
only complex division algebra is C itself (see Dickson [2], in the section on complex
algebras).

2. Ideals in algebras. If U and V are subsets of an algebra A then we shall define
their product to be the linear subspace UV = span{uv | u ∈ U, v∈ V}; that is, it consists of
all finite linear combinations of products of elements in the two sets. Thus, UU ⊂ U iff U
is a subalgebra of A.
A linear subspace I of an algebra A is called a left-sided ideal of A iff AI ≤ I. Clearly, A and 0
are always left-ideals in any algebra. The latter situation is, however, distinct from the
notion of a zero left-ideal; for such an ideal, AI = 0. However, if A is a division algebra
then there would be no zero left-ideals in it other than 0 itself. If A has a unity e then the
only left-ideal that contains e is A itself.
If S is a subset of A then the left-ideal generated by S is defined to be the set I(S),
which can be characterized as the smallest (with respect to inclusion) left-ideal that
contains S. For instance, if the set consists of only the unity e then I(e) = A.
As we said above, by definition, a left-ideal is a linear subspace of an algebra A. One
can easily show that any left-ideal I must also be a sub-algebra; i.e., II ≤ I. One simply
takes two representative elements u and u′ in I and notes that their products uu′ and u′u
both belong to I since the element on the left is a general element in A, while the
elements on the right are presumed to be elements of I. However, it is not always true
that a sub-algebra must be a left-ideal. For instance, the line through the origin that is
spanned by the unity element (when there is one) will be a subalgebra, but it will not
generally be fixed by left-multiplication by every element of the algebra unless the algebra
is one-dimensional to begin with. Complex multiplication gives a familiar example: the real
axis is a subalgebra of C, but multiplication by i rotates it onto the imaginary axis.
There are two important classes of special elements in any left-ideal I: An element n
∈ I is called nilpotent if $n^p = 0$ for some positive integer p, and the minimum such p is
called the degree of nilpotency. For instance, in the algebra of 2×2 real matrices, the
matrix:

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

is nilpotent of degree 2. It is clear that a division algebra can have no non-zero nilpotent
elements, since $n\, n^{p-1} = 0$ would then define a pair of divisors of zero.
An element ε ∈ A is called idempotent if $\varepsilon^2 = \varepsilon$. For instance, if A has a unity element
then it will always be an idempotent. Similarly, 0 is always an idempotent, and we will
treat this as the trivial case.
If one considers matrix algebras then one sees that projection operators behave like
idempotents. Basically, the first application projects all of the elements of the vector
space that the matrix acts on onto a subspace, while the second application of the
projection acts like a unity on the subspace.
For a finite-dimensional vector space V one can always find a (non-unique)
supplementary subspace Sc to any given subspace S such that V = S ⊕ Sc. Once a
supplement has been chosen for S, any element v ∈ V can be uniquely expressed in the
form s + sc, where s ∈ S and sc ∈ Sc.
This then defines projection operators P: V → S, v ↦ s, Pc: V → Sc, v ↦ sc.
Furthermore, one must have:
Pc = I – P. (1.13)

There is then a corresponding unique decomposition of the identity transformation I into
the sum:
I = P + Pc. (1.14)

For instance, in two dimensions one can express the identity matrix as the sum:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

of the projection onto the x axis and the projection onto the y axis.
One also notes that the projection operators have the property that PPc = PcP = 0,
which derives from the fact that S ∩ Sc = 0.
One can reverse the logic and say that if one is given two linear transformations P, Pc
of A to itself such that:

1. $P^2 = P$, $(P^c)^2 = P^c$ (idempotency),
2. PPc = PcP = 0 (orthogonality),
3. I = P + Pc (decomposition of the identity)

then there are subspaces S and Sc of A such that A = S ⊕ Sc and S ∩ Sc = 0; the subspaces
are defined simply by the images of the linear transformations.
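The three conditions can be seen at work in a case where the supplement is not the orthogonal one. In the sketch below, S is the x-axis in the plane and the supplement is chosen (as an assumption for illustration) to be the line y = x, which shows that the choice of supplement changes the projections. The matrix entries and helper names are illustrative.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def apply(M, v):
    return [sum(M[r][c] * v[c] for c in range(2)) for r in range(2)]

I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

# Projection onto the x-axis along the line y = x (an oblique projection).
P = [[1, -1],
     [0, 0]]
# Complementary projection, Pc = I - P, onto the line y = x along the x-axis.
Pc = [[I2[r][c] - P[r][c] for c in range(2)] for r in range(2)]

v = [3, 1]
s = apply(P, v)     # component of v in S (the x-axis)
sc = apply(Pc, v)   # component of v in the chosen supplement (the line y = x)
```

Replacing the supplement by the y-axis would give the diagonal projections instead, with a different splitting of the same vector v.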
One similarly defines two idempotent elements ε1 and ε2 in an algebra A to be
orthogonal iff ε1ε2 = ε2 ε1 = 0. If one has, moreover, that e = ε1 + ε2 then one can express
A as the direct sum I(ε1) ⊕ I(ε2) = Aε1 + Aε2. There is always the question of
reducibility to address at a time like this, so we define an idempotent ε to be primitive iff
it cannot be expressed as the sum of two orthogonal non-trivial idempotents and imprimitive
otherwise. For instance, in the present case, if e = ε1 + ε2 then e would not be a primitive
idempotent.
Furthermore, from the definition of an idempotent, one will have:

$$(e - \varepsilon)(e - \varepsilon) = e - 2\varepsilon + \varepsilon^2 = e - \varepsilon, \quad \varepsilon (e - \varepsilon) = 0.$$

Thus, εc = e – ε is also an idempotent and is orthogonal to ε. One can then obtain a
decomposition of the unity element:
e = ε + εc.

Along with left-ideals, one can also define a right-ideal I in A to be a linear subspace
such that IA ≤ I; I is therefore also a sub-algebra of A. There are then corresponding
definitions for the right-ideal generated by a subset and the decomposition of the unity
element by right-ideals of orthogonal idempotents.

Finally, one can define a two-sided ideal – or simply, ideal – in an algebra A to be a
subspace I that is both a left-ideal and a right-ideal. This can also be expressed by saying
that AIA ≤ I. A two-sided ideal of an algebra (i.e., ring) is more closely analogous to a
normal subgroup of a group than the previous two types of one-sided ideals, since one
finds that the quotient vector space A/I, which is composed of equivalence classes of
elements in A that differ by an element of I, is an algebra iff I is a two-sided ideal. Note
that if A is a commutative algebra then all ideals will be two-sided.

3. Automorphisms of algebras. An automorphism of an algebra A is a linear
isomorphism α: A → A, v ↦ $v^\alpha$ that respects the order of multiplication:

$$(vw)^\alpha = v^\alpha w^\alpha.$$

For instance, complex conjugation of complex numbers has this property:

(z1 z2)* = z1∗ z2∗ .

One calls α an anti-automorphism when the order of multiplication is reversed:

$$(vw)^\alpha = w^\alpha v^\alpha.$$

The transposition of square matrices and the inversion of invertible matrices fall into this
category:

$$(AB)^T = B^T A^T, \quad (AB)^{-1} = B^{-1} A^{-1}.$$

All of the examples that were given so far are also examples of involutions; i.e., $\alpha^2 = I$.
When one composes two automorphisms, the result is an automorphism. However,
the composition of two anti-automorphisms is an automorphism, while the composition
of an automorphism and an anti-automorphism − in either order − is an anti-
automorphism. For instance, the Hermitian conjugation of square complex matrices is
the composition of complex conjugation and transposition:

† = T* = *T,

which then becomes an anti-automorphism:


(AB)† = B† A†.

As long as the sum or difference $v \pm v^\alpha$ is still a member of A, one can define the
polarization of any element v ∈ A into a part $v_+$ that is fixed by the automorphism (or
anti-automorphism) α and a part $v_-$ that goes to its negative under α:

$$v = v_+ + v_-, \quad (v_+)^\alpha = v_+, \quad (v_-)^\alpha = -v_-,$$

by setting:

$$v_\pm = \tfrac{1}{2} (v \pm v^\alpha).$$

When α is complex conjugation, one polarizes a complex number into a sum of a real
and an imaginary part. When α is the transposition of square matrices, the result is the
sum of a symmetric matrix and a skew-symmetric one, while if α is Hermitian
conjugation, the result is the sum of a Hermitian matrix and a skew-Hermitian one.
However, inversion of invertible matrices would not admit this decomposition, since the
sum or difference of invertible matrices does not have to be invertible.
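For the case in which α is transposition, the polarization can be carried out explicitly. The following sketch uses exact rational arithmetic so that the factor of one half introduces no rounding. The function name `polarize`, the way the involution is passed in as a function, and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def polarize(M, alpha):
    """Return (M_plus, M_minus) with M_pm = (M +/- alpha(M)) / 2."""
    Ma = alpha(M)
    half = Fraction(1, 2)
    n = len(M)
    plus = [[half * (M[r][c] + Ma[r][c]) for c in range(n)] for r in range(n)]
    minus = [[half * (M[r][c] - Ma[r][c]) for c in range(n)] for r in range(n)]
    return plus, minus

v = [[1, 2],
     [4, 3]]
v_plus, v_minus = polarize(v, transpose)

# The two parts add back up to the original matrix.
recombined = [[v_plus[r][c] + v_minus[r][c] for c in range(2)] for r in range(2)]
```

Passing a different involution in place of `transpose` (for example, entry-wise complex conjugation) gives the corresponding real/imaginary splitting with the same function.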
Polarization induces a decomposition of the identity operator on A into the sum of
two projections:
I = P+ + P−,
with:
P±(v) = v±.

There is a corresponding direct sum decomposition of A, as a vector space, into
$A_+ \oplus A_-$, where:

$$P_\pm(A) = A_\pm.$$

These subspaces do not have to be sub-algebras, though. For instance, the product of
imaginary numbers is a real number, while the product of symmetric or skew-symmetric
matrices does not have to be symmetric or skew-symmetric, respectively.
When α is an involutory anti-automorphism, the operation of A on itself by
α-conjugation:

$$A \times A \to A, \quad (a, b) \mapsto a\, b\, a^\alpha$$

has the useful property that it always has $A_\pm$ for invariant subspaces. That is, if $b \in A_\pm$
then $a\, b\, a^\alpha \in A_\pm$, as well. This follows from the fact that if $b^\alpha = \pm b$ then:

$$(a\, b\, a^\alpha)^\alpha = a\, b^\alpha a^\alpha = \pm\, a\, b\, a^\alpha.$$

This fact will prove repeatedly useful in our discussions of the action of unit quaternions
of various types on the quaternions, more generally.
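The invariance of the two subspaces under α-conjugation can be illustrated with matrices, taking α to be transposition, so that the fixed and sign-reversed subspaces are the symmetric and skew-symmetric matrices. This is only a sketch under that assumed choice of α; the sample matrices and helper names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def conjugate(a, b):
    """alpha-conjugation b -> a b a^alpha, with alpha = transposition."""
    return matmul(matmul(a, b), transpose(a))

a = [[1, 2],
     [3, 4]]
b_sym = [[1, 3],
         [3, 3]]      # fixed by transposition
b_skew = [[0, -1],
          [1, 0]]     # reversed in sign by transposition

image_sym = conjugate(a, b_sym)
image_skew = conjugate(a, b_skew)
```

Note that `a` itself is neither symmetric nor skew-symmetric; the invariance is a property of the action, not of the element that acts.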

4. Representations of algebras. A representation of an n-dimensional algebra A
over a field K is a homomorphism ρ: A → M(m; K). That is, it is a linear map that
associates every element a ∈ A with a unique matrix ρ(a) in the algebra of m×m matrices
with elements in the field K that also has the property that:

ρ(ab) = ρ(a) ρ(b);

i.e., it respects the products.
The image ρ(A) will be a sub-algebra of M(m; K), but not necessarily one that is
isomorphic to A. For instance, the trivial map that takes every element of A to the 0
matrix is still a homomorphism, but not a very fascinating one. Indeed, ρ(A) will be
isomorphic to A iff ρ is also injective, which is true iff ker ρ = 0. In such a case, one will
call the representation faithful, and if $n = m^2$, in addition, then the representation is an
isomorphism onto M(m; K), and one says that A is itself a matrix algebra.
Any algebra admits at least two non-trivial representations, which are defined by left-
multiplication and right-multiplication. If the algebra is a division algebra then one can
also define another representation by means of conjugation.
If a ∈ A then left-multiplication by a defines a linear map L(a): A → A, b ↦ ab.
However, the linear map does not have to be invertible. In fact, L(a) is invertible iff a is
invertible. Thus, if A is a division algebra then its non-zero elements will be represented
in GL(n; K).
If one chooses a basis $e_i$ for A, which we assume to be n-dimensional, then the linear
transformation L(a) can be represented by an n×n matrix $[L(a)]^j_i$ with entries in the field
of scalars for A by way of:

$$L(a)\, e_i = e_j\, [L(a)]^j_i. \tag{1.15}$$

Thus, one can define L: A → M(n; K), a ↦ $[L(a)]^j_i$, and one finds that it is, in fact, a
representation of A. If A has a unity e then the representation must be faithful, since
otherwise there would be distinct elements a ≠ a′ in A such that L(a) = L(a′). That would
mean that one would have to have ab = a′b for all b ∈ A. In particular, this would have to
be true for b = e, which would imply that a = a′.
However, one immediately sees that the representation L is not usually likely to be an
isomorphism, since the dimension of M(n; K) is n2, as opposed to n for A. Thus, the only
possible dimensions in which this might happen are 0 and 1.
If one chooses a basis for the algebra A then the components of the product vw of two
elements v and w can be expressed in terms of the structure constants $a_{ij}^{k}$ as in (1.2), and
this gives us the matrix $[L(v)]^i_j$ in terms of the structure constants, as well; namely:

$$(vw)^i = a_{jk}^{i} v^j w^k = [L(v)]^i_j\, w^j,$$

with:

$$[L(v)]^i_j = a_{kj}^{i} v^k.$$

Thus, the matrix of the map L itself is given by the structure constants in that basis.
The representation by right-multiplication is entirely analogous to the case of
left-multiplication. First one defines R(a): A → A, b ↦ ba and then R: A → M(n; K), a
↦ [R(a)]. From the previous argument, one sees that if one has chosen a basis for A such
that its structure constants are $a_{ij}^{k}$ then the matrix of R(v) in that basis will be:

$$[R(v)]^i_j = a_{jk}^{i} v^k.$$

Hence, depending upon the symmetry of the product, left and right multiplication might
or might not be closely related processes. In particular, for Lie algebras, whose structure
constants are then anti-symmetric, it is only necessary to examine the left multiplication,
which then gives the adjoint representation of the Lie algebra, which allows one to define
the “roots” of the Lie algebra as eigenvalues of the matrices that one associates with
elements in a “Cartan subalgebra.”
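The matrices of left and right multiplication can be generated directly from the structure constants. The sketch below does this for two assumed examples: the complex numbers as a two-dimensional real algebra, where L respects products, and the cross-product Lie algebra on R3, where the right-multiplication matrix is minus the left one. The storage convention `a[i][j][k]` for a_ij^k, the 0-based indices, and the helper names are illustrative assumptions.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for 0-based indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def product(a, v, w):
    """(vw)^k = a_ij^k v^i w^j."""
    n = len(v)
    return [sum(a[i][j][k] * v[i] * w[j] for i in range(n) for j in range(n))
            for k in range(n)]

def left_matrix(a, v):
    """[L(v)]^i_j = a_kj^i v^k  (row i, column j)."""
    n = len(v)
    return [[sum(a[k][j][i] * v[k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def right_matrix(a, v):
    """[R(v)]^i_j = a_jk^i v^k  (row i, column j)."""
    n = len(v)
    return [[sum(a[j][k][i] * v[k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# Example 1 (assumed): C as a real algebra with basis {1, i}.
a_complex = [[[1, 0], [0, 1]], [[0, 1], [-1, 0]]]
v, w = [1, 2], [3, -1]
L_respects_products = (left_matrix(a_complex, product(a_complex, v, w))
                       == matmul(left_matrix(a_complex, v),
                                 left_matrix(a_complex, w)))

# Example 2 (assumed): the cross product on R3, with a_ij^k = epsilon_ijk.
a_cross = [[[epsilon(i, j, k) for k in range(3)] for j in range(3)]
           for i in range(3)]
u = [1, 2, 3]
L_u = left_matrix(a_cross, u)
R_u = right_matrix(a_cross, u)
```

In the second example L_u coincides with the matrix of b ↦ u × b, which is the sense in which left multiplication gives the adjoint representation of a Lie algebra.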

Because the structure of the algebra is essentially contained in the structure constants,
one can determine much of that structure by looking at the properties of the general
matrix $[L(v)]^i_j$. In particular, looking for its eigenvectors and eigenvalues gives one the
characteristic and minimal polynomials with coefficients that depend upon $a_{ij}^{k}$ and the
components of the general element v. The roots and factorizability of these polynomials
then have much to say about the structure of the algebra itself. Note, furthermore, that
although the matrix $[L(v)]^i_j$ will change with a change of basis, the characteristic
polynomial will not.
Since we will have no immediate need for this approach to the structure of algebras,
we simply refer the interested readers to some of the earlier literature (e.g., Shaw [1],
Dickson [2], or Albert [3]).

The representation of an n-dimensional division algebra A over K in M(n; K) by
conjugation is called the adjoint representation (although this is a different usage from the
one that relates to Lie algebras). First, if a ∈ A then one defines conjugation by a as the
linear map ad(a): A → A, b ↦ $a\, b\, a^{-1}$. Once again, since A is a division algebra, as long
as a ≠ 0 the linear transformation ad(a) will be invertible. One then defines the adjoint
representation of A by ad: A → M(n; K), a ↦ $[\mathrm{ad}(a)]^j_i$, where:

$$a\, e_i\, a^{-1} = e_j\, [\mathrm{ad}(a)]^j_i. \tag{1.16}$$

The three types of representations that we just defined are closely related to the
previous kinds of ideals, since both concepts are related to the multiplication of elements.
Thus, one can think of a left-ideal I in A as an invariant subspace of the representation L
since L(a)I ≤ I for every a ∈ A; i.e., L restricts to a representation on I. If I = I(ε)
for some idempotent element ε ∈ A then the representation on I is irreducible iff ε is
primitive.
Analogous statements apply to the case of right-ideals and right-multiplication.
One also sees that the adjoint representation of any division algebra A has two-sided
ideals for invariant subspaces. There is also a closely-related “chiral” representation of
A×A on A that takes any ((a, b), c) to acb. This representation would also have two-sided
ideals for its invariant subspaces.

5. Tensor products of algebras. Although the representations of physical fields in
the various types of quaternions are essentially an alternative to the tensor and spinor
product representations that are customarily used in theoretical physics, nevertheless, we
shall still have to clarify what we mean by saying that the various types of quaternion
algebras are obtained by tensoring the algebra of real quaternions by various coefficient
rings. We mean that those coefficient rings can all be regarded as real algebras of
varying dimensions, and that the quaternion algebras in question are obtained by taking
the tensor product of the two algebras as real vector spaces.
If A and B are both K-algebras of dimensions n and m, respectively, then their tensor
product algebra is a K-algebra A⊗KB of dimension nm that is defined over the
corresponding tensor product of vector spaces by also accounting for the products on A
and B to give a product on A ⊗KB:
(a⊗b)(a′⊗b′) = aa′⊗bb′.

One can also think of A⊗KB as consisting of linear combinations of elements in A
with coefficients in B.
If {ei, i = 1, …, n} is a basis for A and {fα, α = 1, …, m} is a basis for B then {ei ⊗ fα,
i = 1, …, n, α = 1, …, m} is a basis for A⊗KB, when the components of any element of
A⊗KB are taken from K. Not all elements of A⊗KB are of the form a⊗b, but only the
decomposable ones. In that case, the components of an element of A⊗KB with respect to
the basis ei ⊗ fα are of the form $a^i b^\alpha$; a more general element simply has a component
matrix $\beta^{i\alpha}$ with elements in K.
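When A and B are realized as matrix algebras, the tensor product of two elements can be modeled by the Kronecker product of their matrices, and the product rule (a⊗b)(a′⊗b′) = aa′⊗bb′ becomes the mixed-product property. The sketch below checks that property on sample matrices. The matrix realization, the sample entries, and the helper names are assumptions made for illustration and are not taken from the text.

```python
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]

def kron(A, B):
    """Kronecker product: entry (i*p + k, j*q + l) equals A[i][j] * B[k][l]."""
    p, q = len(B), len(B[0])
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(q)]
            for i in range(len(A)) for k in range(p)]

a1 = [[1, 2], [0, 1]]
a2 = [[0, 1], [1, 0]]
b1 = [[2, 0], [1, 1]]
b2 = [[1, 1], [0, 1]]

# (a1 (x) b1)(a2 (x) b2) versus (a1 a2) (x) (b1 b2)
lhs = matmul(kron(a1, b1), kron(a2, b2))
rhs = kron(matmul(a1, a2), matmul(b1, b2))

# A sum of two decomposable elements; a general element is of this kind.
general = [[x + y for x, y in zip(r1, r2)]
           for r1, r2 in zip(kron(a1, b1), kron(a2, b2))]
```

The dimension count is also visible here: two 2×2 factors give 4×4 matrices, in line with the tensor product of an n-dimensional and an m-dimensional algebra being nm-dimensional.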
Since the main use that we will have for tensor products of algebras will involve
tensoring the real quaternions with various coefficient algebras, we shall now show how
the above remarks simplify somewhat in such a case. For example, we consider the
complexification of a real algebra.
Let A be an n-dimensional real algebra with a basis defined by {ei, i = 1, …, n} and a
multiplication table that is defined by

$$e_i e_j = a_{ij}^{k} e_k.$$

We regard C as a division algebra over R2 with a basis defined by {1, i} whose
multiplication table is:

1·1 = 1, 1·i = i·1 = i, i·i = −1.

When one takes the tensor product A ⊗R C, the basis that one can define on it from
the given ones consists of 2n members {ei ⊗1, ei ⊗ i}, which we abbreviate to {ei, iei}.
Thus, in order to extend the two given multiplication tables, we only need to account for the
products that involve i, which we do by way of:

$$e_i (i e_j) = (i e_i) e_j = i (e_i e_j) = a_{ij}^{k} (i e_k) = (i a_{ij}^{k}) e_k, \tag{1.17}$$

$$(i e_i)(i e_j) = i^2 e_i e_j = - e_i e_j = - a_{ij}^{k} e_k. \tag{1.18}$$

If one regards A ⊗R C as a real algebra then a typical element v can be represented in
the form:

$$v = v_{\mathrm{Re}}^{i}\, e_i + v_{\mathrm{Im}}^{i}\, (i e_i),$$

but if one regards it as a complex algebra with a basis given by ei then the same element
takes the form:

$$v = (v_{\mathrm{Re}}^{i} + i\, v_{\mathrm{Im}}^{i})\, e_i.$$

Similarly, if one regards A ⊗R C as a real algebra then the new set of structure
constants is now $a_{IJ}^{K}$, where I, J, K run from 1 to 2n, the structure constants $a_{ij}^{k}$ are
unchanged, and from (1.17), (1.18), the missing ones are:

$$a_{i+n,\,j}^{k+n} = a_{i,\,j+n}^{k+n} = a_{ij}^{k}, \quad a_{i+n,\,j+n}^{k} = -a_{ij}^{k},$$

all others being zero.
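The enlarged table of structure constants can be built mechanically and compared with a direct computation using complex coefficients. The sketch below does this for the cross-product algebra on R3, which is chosen here only as an assumed example; the ordering of the 2n basis elements (first the e_i, then the i e_i), the 0-based indices, and the helper names are likewise assumptions.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for 0-based indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def product(a, v, w):
    """(vw)^k = a_ij^k v^i w^j, with a[i][j][k] storing a_ij^k."""
    n = len(v)
    return [sum(a[i][j][k] * v[i] * w[j] for i in range(n) for j in range(n))
            for k in range(n)]

def complexify(a):
    """Structure constants of the complexification, viewed as a real algebra.

    Basis order: e_0, ..., e_{n-1}, i e_0, ..., i e_{n-1}.
    """
    n = len(a)
    A = [[[0] * (2 * n) for _ in range(2 * n)] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                A[i][j][k] = a[i][j][k]               # e_i e_j
                A[i + n][j][k + n] = a[i][j][k]       # (i e_i) e_j
                A[i][j + n][k + n] = a[i][j][k]       # e_i (i e_j)
                A[i + n][j + n][k] = -a[i][j][k]      # (i e_i)(i e_j)
    return A

a_cross = [[[epsilon(i, j, k) for k in range(3)] for j in range(3)]
           for i in range(3)]
A = complexify(a_cross)

v_re, v_im = [1, 0, 2], [0, 1, 0]
w_re, w_im = [0, 3, 1], [1, 0, 0]

# Product computed in the 2n-dimensional real algebra.
real_result = product(A, v_re + v_im, w_re + w_im)

# Same product computed with complex components and the original constants.
v_c = [complex(x, y) for x, y in zip(v_re, v_im)]
w_c = [complex(x, y) for x, y in zip(w_re, w_im)]
complex_result = product(a_cross, v_c, w_c)
```

The first n entries of `real_result` are the real parts and the last n the imaginary parts of `complex_result`, which is the statement that the structure constants for the complex basis are unchanged.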
However, if we regard A ⊗R C as a complex algebra then the structure constants
remain unchanged. In fact, more generally, when one tensors a given algebra with
various other coefficient algebras, the structure constants for a real basis remain the same,
while the character of the multiplication is solely due to the character of the
multiplication in the coefficient algebra B, since one assumes that products are B-bilinear.
Therefore, one can always deal with the products of coefficients and products of basis
elements separately. In particular, all of the algebras that we will be dealing with are
obtained by tensoring the real quaternions with various other algebras, namely, the
complex numbers, the dual numbers, and the complex dual numbers. Hence, the basic
multiplication table for the basis elements will not change fundamentally, while the
idiosyncrasies of the coefficient algebra will affect the products of coefficients.
References

1. J. B. Shaw, Synopsis of linear associative algebras, Carnegie Institute, Washington,
D.C., 1907.
2. L. E. Dickson, Algebras and their Arithmetics, Constable and Co., Ltd., London,
1923; reprinted by Dover, Mineola, NY, 1960.
3. A. A. Albert, Structure of Algebras, A. M. S. Colloquium Publications, v. 24,
Providence, RI, 1939.
4. N. Jacobson, The Theory of Rings, Mathematical Surveys of the A. M. S., NY,
1943.
CHAPTER II

REAL QUATERNIONS

The algebra of real quaternions is fundamental to the extensions that follow, so we
first introduce the formalism at that level, and then show how one extends it in the
subsequent chapters.

1. The group of Euclidian rotations. Three-dimensional Euclidian space, for now,
will be R3 when it is given the Euclidian scalar product <.,.>. A scalar product on a
K-linear space V is, of course, a bilinear functional V × V → K, (v, w) ↦ <v, w> that is
symmetric and non-degenerate. Thus, one always has:

<v, w> = <w, v>, (2.1)

and the linear map V → V*, v ↦ v*, where v*(w) = <v, w>, is an isomorphism.
If {ei, i = 1, 2, 3} is a basis for R3 then it will be said to be orthonormal iff:

<ei, ej> = δij . (2.2)

We shall also refer to an orthonormal basis for E3 = (R3, <.,.>) as an orthonormal frame
for that vector space.
The scalar product of any two vectors v = viei and w = wjej can be obtained from the
scalar products of the frame members using bilinearity:

<v, w> = δij vi wj = v1 w1 + v2 w2 + v3 w3. (2.3)

In particular, the scalar product of any vector with itself takes the form:

$$\| v \|^2 = \langle v, v \rangle = \delta_{ij}\, v^i v^j = \sum_{i=1}^{3} (v^i)^2, \tag{2.4}$$

and we will refer to || v ||2 as the norm-squared of v and its square root as the norm of v.
Because we are dealing with real numbers, the only way that || v || can vanish is if v =
0. Thus, one refers to the scalar product as positive-definite. (Of course, this property
does not apply to the Minkowski space scalar product.)
A map R: R3 → R3 will be called an orthogonal transformation – or simply, a
rotation – if it preserves the scalar product; i.e.:

<Rv, Rw> = <v, w> for all v, w. (2.5)


Since the scalar product is bilinear and symmetric, one can then show:

Theorem:

Any orthogonal map R must be an invertible linear transformation.

Proof:
Linearity:
<R(αv), Rw> = α <v, w> = α <Rv, Rw> = <αRv, Rw>,
<R(v + v′), Rw> = <v + v′, w> = <v, w> + <v′, w>
= <Rv, Rw> + <Rv′, Rw> = <Rv + Rv′, Rw>,

and as these relations must be true for all vectors, one can conclude the linearity:

R(αv) = αRv, R(v + v′) = Rv + Rv′.

Invertibility:

If Rv = 0 then:
<Rv, Rv> = <v, v> = 0,

which is only possible if v = 0 in the positive-definite case; thus, ker R = 0, which makes
R injective. The fact that it is also surjective then follows from the nullity-rank theorem.
Q.E.D.

Thus, R can be represented by a matrix, which we shall denote by either R or R^j_i,
with respect to a chosen basis ei, since:
R e_i = e_j R^j_i . (2.6)

The matrix can also be said to act on the components of a vector v = viei since:

Rv = v^i (e_j R^j_i) = (R^j_i v^i) e_j ; (2.7)
i.e., if v′ = Rv then:
v′^j = R^j_i v^i . (2.8)

Note that the same matrix acts on the column vector [vi] directly on the left and on the
row vector [ei] by its transpose on the right.
Because of (2.3) and (2.5), the matrix of any rotation R has the property that:

\delta_{kl} R^k_i R^l_j = \delta_{ij} . (2.9)


Thus, one can say that:
R−1 = RT, (2.10)

where the T refers to the transpose operator.



Since this means that RRT = RTR = I, if one takes the determinant of both sides then
one finds that:
det(R) = ± 1. (2.11)

The positive sign refers to proper rotations, while the negative sign gives improper
ones, which are the product of a proper rotation with a reflection through the origin,
whose matrix is then – I. Only the proper rotations are regarded as physical motions.

If one looks at the eigenvalues of a typical rotation R then the characteristic
polynomial will take the form:

det(R − λI) = aλ3 + bλ2 + cλ + d,

with all real coefficients. Although the roots do not have to all be real, nonetheless, the
complex roots must only occur in complex conjugate pairs (3). Thus, at least one of them
must be real. From orthogonality, however, one sees that for a real eigenvalue λ with a
corresponding eigenvector v, one must have:

<Rv, Rv> = λ2<v, v> = <v, v>,

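which makes λ = ± 1 unless v = 0 (again, in the positive-definite case). For a proper
rotation, the product of the three eigenvalues is det(R) = 1; since a complex-conjugate
pair has a positive product, and three real eigenvalues ± 1 with product 1 must include
+ 1, the eigenvalue + 1 always occurs.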
Thus a three-dimensional rotation will always have an axis; i.e., a line through the
origin whose points are all fixed by the rotation. The effect of the rotation on any plane
perpendicular to that axis will be a planar rotation, whose matrix relative to an
orthonormal frame in that plane can be given the form:

R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} , (2.12)

where the angle of rotation θ is measured positive counterclockwise when the matrix acts
on column vectors, as in (2.8).


One sees that the characteristic polynomial of such a matrix is:

λ2 – 2 cos θ λ + 1,

and its eigenvalues will take the form:

λ = e^{± iθ} = cos θ ± i sin θ, (2.13)

which makes all of them complex numbers on the unit circle. We shall see formulas that
are analogous to this one show up in all of the following sections.
(3) Caveat: This will no longer be true when we get to complex rotations.

The corresponding eigenvectors for any e^{iθ} can then be chosen to take the form:

\begin{pmatrix} 1 \\ \pm i \end{pmatrix} ,

independently of θ; in particular, they are the same for the complex conjugate of λ, which
amounts to a rotation through an angle of – θ. Since they are clearly complex vectors,
this will only be of interest when we go on to complex rotations.
As a result of all of this, the characteristic polynomial for a proper three-dimensional
rotation factors into either of two forms:

(λ − 1)(λ2 – 2 cos θ λ + 1) or (λ – 1)3,

depending upon whether one root is real and the other two are complex or whether all of
them are real ( = 1), which can only be true for the identity matrix.
There is a useful formula for the effect of a rotation on a vector v that is discussed in
theoretical kinematics [1] and is called Rodrigues’s formula:

v′ = cos θ v + (1 – cos θ) <v, u> u + sin θ u × v, (2.14)

in which the axis of rotation is described by a unit vector u and the angle of rotation is θ.
This formula will also recur in the sequel in various analogous forms.
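
As an illustration of (2.14), the following is a minimal sketch in Python, which is not part of the original text. The function names rodrigues, dot, and cross are hypothetical, and the axis u is assumed to be already normalized.

```python
import math

def dot(a, b):
    """Euclidean scalar product <a, b> of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector cross product a x b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rodrigues(v, u, theta):
    """Rotate v about the unit axis u through the angle theta, term by term as in (2.14)."""
    c, s = math.cos(theta), math.sin(theta)
    uv = dot(u, v)        # <v, u>
    uxv = cross(u, v)     # u x v
    return tuple(c * v[i] + (1.0 - c) * uv * u[i] + s * uxv[i] for i in range(3))

# A quarter turn about e3 applied to e1 (assumed test data).
v_rot = rodrigues((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
```

Under these assumptions, the quarter turn about e3 carries e1 to e2 up to rounding, the norm of any vector is preserved, and the axis itself is left fixed.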

Any general rotation can be expressed uniquely as a product of elementary rotations
about the three orthonormal axes of a frame. Their matrices take the form:

R(θ, 0, 0) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix} ,

R(0, φ, 0) = \begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix} , (2.15)

R(0, 0, ψ) = \begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} ,

once one has chosen a particular order for the product, since the product of rotations does
not generally commute, unless they are both performed about the same axis.
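
The elementary rotations (2.15) can be checked numerically with the following Python sketch, which is an illustration only. The helper names and the particular order R(0, 0, ψ) R(0, φ, 0) R(θ, 0, 0) of the product are assumptions, since the text does not fix an order.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(psi):
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_diff(a, b):
    """Largest absolute difference between corresponding matrix entries."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

theta, phi, psi = 0.3, 0.5, 0.7   # arbitrary sample angles
R = matmul(rot_z(psi), matmul(rot_y(phi), rot_x(theta)))
```

For these sample angles, R R^T agrees with the identity and det(R) with 1 to rounding error, while rot_x(theta) rot_y(phi) and rot_y(phi) rot_x(theta) differ visibly, which illustrates why an order must be chosen.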
The real numbers θ, φ, ψ are called the Euler angles for the sequence of rotations,
and are sometimes referred to as the roll, pitch, and yaw angles, respectively. They then
define a local coordinate chart for the differentiable manifold O(3; R), which can then be
seen to be three-dimensional. Since it also has a group structure under the composition of
rotations, and the group operations of product and inversion are differentiable, one then
sees that O(3; R) is a real, three-dimensional, non-Abelian Lie group, while O(2; R) is a
real, one-dimensional, compact one that is diffeomorphic to a pair of circles, and whose
identity component SO(2; R) is Abelian.

Since the determinant function is continuous, the level sets corresponding to ± 1 are
disjoint connected components, and the connected component that contains the identity
matrix is a group SO(3; R), which is then composed of proper Euclidian rotations. It can
be shown to be diffeomorphic to RP3 as a manifold, so it is compact. The proof follows
easily using quaternions, as we shall see.
The Lie algebra so(3; R) of SO(3; R), which represents the infinitesimal generators of
one-parameter subgroups of proper rotations, can be obtained by differentiating the basic
property of orthogonal matrices when one assumes that the rotations define a
differentiable curve through the identity in SO(3; R):

\frac{d}{ds}\Big|_{s=0} [R(s) R^T(s)] = [\dot{R}(s) R^T(s) + R(s) \dot{R}^T(s)]_{s=0} = 0.

When one sets Ṙ(0) = ω, so Ṙ^T(0) = ω^T, and uses R(0) = I, one gets the defining
property of infinitesimal rotation matrices:

ω + ωT = 0; (2.16)
i.e., they are anti-symmetric.
The Lie algebra so(3; R) can also be conveniently represented by the vector cross
product that is defined on R3:
[v, w] = v × w = εijk vi wj ek . (2.17)

In order to show the isomorphism, one needs only to define the adjoint action of so(3;
R), in the present form, on itself, which was discussed in Chapter I. The isomorphism of
these two representations of so(3; R) can be obtained by associating the basis elements ei
of R3 with the elementary matrices Ji, which define a basis for the matrix representation
of that Lie algebra.
The structure constants – i.e., the commutation relations – for the Lie algebra can be
obtained from the Lie brackets of the orthonormal basis vectors:

[ei, ej] = εijk ek . (2.18)
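
The correspondence between the cross product and anti-symmetric matrices can be sketched in Python as follows. This is an illustration under the assumption that the elementary matrices Ji are the matrices of w ↦ ei × w, which the text does not write out explicitly; the names hat and commutator are hypothetical.

```python
def cross(a, b):
    """Vector cross product a x b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hat(v):
    """Anti-symmetric matrix of the linear map w -> v x w."""
    x, y, z = v
    return [[0, -z, y],
            [z, 0, -x],
            [-y, x, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

# Assumed elementary matrices J1, J2, J3 associated with e1, e2, e3.
E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
J = [hat(e) for e in E]
```

With this choice, the commutator of hat(v) and hat(w) equals hat(v × w), so [J1, J2] = J3 and its cyclic permutations reproduce (2.18).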

One sees that the eigenvalues of the elementary matrices are 0, ± i, while the
eigenvectors are unchanged from those of the corresponding finite rotations. Thus, the
axis of a finite rotation can also be obtained from the eigenvector of its
infinitesimal generator that has eigenvalue zero. If the latter is expressed in the form ad(v) then its zero
eigenspace will be the line through v itself, since v × w = 0 iff v is collinear with w.

Three-dimensional, Euclidian rotations can also be represented by 2×2 complex
unitary matrices with unity determinant, which defines a group that is usually denoted by
SU(2). Thus, the space of its defining representation is C2, when it is given the Hermitian
inner product. Such an inner product is not symmetric in the same sense as the usual
scalar product, but must satisfy:
(v, w) = (w, v)*.

As a result, the norm-squared of any vector will be real.
Moreover, a complex basis {e1, e2} for C2 is called unitary iff:

(ea, eb) = δab ,

and one finds that the general component expression for the inner product becomes:

(v, w) = δab va wb*.

A C-linear transformation U of C2 is then called unitary when it preserves the
Hermitian inner product:

(Uv, Uw) = (v, w), for all v, w ∈ C2.

As a result, the matrix of a unitary transformation must satisfy:

UU† = U†U = I, i.e., U−1 = U†,

which implies, moreover, that every unitary transformation is invertible.
This also implies that the modulus of det U must be unity for any unitary matrix U,
since:
det(UU†) = det U (det U)* = | det U |2 = 1.

One finds that because of the unitarity constraint on its elements a typical 2×2 unitary
matrix U does not need four independent complex numbers to specify it uniquely, but
only two:

U = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} .

The complex numbers α, β, γ, δ are then the Cayley-Klein parameters (4), which go back
to Klein’s work on the theory of tops, and when one expresses α and β in terms of real
and imaginary components:
α = e0 + ie3, β = e2 + ie1,

(4) A standard reference for Cayley-Klein parameters and Euler parameters, as well as their relationship
to Euler angles, is Goldstein [2].

the four real parameters e0, …, e3 that one introduces are sometimes referred to as the
Euler parameters (as distinct from the Euler angles). We shall soon see that the Euler
parameters are essentially the components of a real unit quaternion.

When one further imposes the constraint that U have unity determinant, this implies
that:
αα* + ββ* = 1,

and one sees that if one represents the two column vectors of U as a pair of vectors {U1,
U2} in C2 then the conditions that were imposed on a matrix in SU(2) say that this pair of
vectors must constitute a special unitary frame. Since the association of a pair (U, −U) of
matrices in SU(2) with a rotation matrix R in SO(3) – a process that will become quite
straightforward when we have introduced quaternions – is often referred to as defining
the “spin” covering group of SO(3), we will call a special unitary frame in C2 a spin
frame. Thus, any oriented, orthonormal frame in E3 is associated with two spin frames.
Just as an oriented, orthonormal frame in R2 is defined by specifying one of the two
frame members, similarly, a spin frame in C2 is defined by specifying one of the two
complex vectors. For instance, one can take the column U1 = [α, β]T in U to be the first
frame member and then define the other by:

U_2 = \begin{pmatrix} -\beta^* \\ \alpha^* \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}^* = J U_1^* ,

in which we have introduced J for the matrix of a counterclockwise rotation through π/2 radians.

Because of this, and the fact that U1 has unit norm, one sees that a matrix in SU(2)
can just as well be described by a unit vector in C2. This one-to-one correspondence
between SU(2) matrices and unit vectors in C2 is at the heart of the description of spin by
Pauli spinors, and is also quite elegantly incorporated into the theory of real quaternions,
as we shall see.

2. The algebra of real quaternions [3-6]. The algebra H of real quaternions is
defined over the vector space R4 by giving the multiplication table for the canonical basis
{e0, …, e3}:
e0 eµ = eµ e0 = eµ , ei ej = − δij e0 + εijk ek . (3.1)

One immediately notes that since products of some of the basis elements can produce
other basis elements, the basis in question does not constitute a minimal set of generators.

For instance, since e1e1 = − e0 and e1e2 = e3 , one could use {e1, e2} to generate H, since
the given basis then consists of {− e1e1, e1, e2, e1e2}.
One can read off the structure constants for H relative to the canonical basis from
(3.1) directly:
a^{\kappa}_{0\mu} = a^{\kappa}_{\mu 0} = \delta^{\kappa}_{\mu} , a^{0}_{ij} = - \delta_{ij} , a^{k}_{ij} = \varepsilon_{ijk} . (3.2)

A typical quaternion then takes the form:

q = qµ eµ , (3.3)

and if p = pµ eµ is another quaternion then their product pq can be expressed in the
explicit form:
pq = (p0q0 – p1q1 – p2q2 – p3q3) e0
+ (p1q0 + p0q1 – p3q2 + p2q3) e1
+ (p2q0 + p3q1 + p0q2 – p1q3) e2
+ (p3q0 – p2q1 + p1q2 + p0q3) e3 . (3.4)
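
The explicit product (3.4) translates directly into code. The following Python sketch is an illustration only, with quaternions stored as 4-tuples (q0, q1, q2, q3); the name qmul is a hypothetical choice.

```python
def qmul(p, q):
    """Quaternion product pq, component by component as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

# Canonical basis e0, ..., e3 of H as 4-tuples.
e0 = (1, 0, 0, 0)
e1 = (0, 1, 0, 0)
e2 = (0, 0, 1, 0)
e3 = (0, 0, 0, 1)
```

One can then confirm the multiplication table (3.1) on the basis elements, for instance that qmul(e1, e2) gives e3 while qmul(e2, e1) gives its negative, and that the product is associative on sample integer quaternions.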

The element e0 then represents the unity element of the algebra H, and, in fact, it
generates the center of H, which is defined to be the set of all elements that commute
with all other elements, so it consists of all scalar multiples q0e0 . Hence, it will often be
convenient to simply abbreviate e0 by 1. However, when we get to orthogonality, it is
important to remember that 1 is still a vector, so, in particular, <1, ei> = 0, as we will see.
The algebra H contains an infinitude of subalgebras that are R-isomorphic to C, such
as the subalgebras generated by {1, e1}, {1, e2}, and {1, e3}. In fact, more generally, if q
= qiei satisfies <q, q> = 1, with a definition of the scalar product that we will give shortly,
then {1, q} generates a subalgebra that is R-isomorphic to C.

Any quaternion can be expressed in the scalar-plus-vector form:

q = q^0 + \mathbf{q}, (\mathbf{q} = q^i e_i), (3.5)

where one calls S(q) = q0 the scalar part of q and V(q) = \mathbf{q} the vector or pure
quaternion part of q. Thus, the scalars represent the center of H.
One can introduce the conjugation automorphism: If q = q^0 + \mathbf{q} then:

\bar{q} = q^0 - \mathbf{q}. (3.6)

In fact, this is an anti-automorphism, since:

\overline{pq} = \bar{q} \, \bar{p}. (3.7)

One sees that polarizing the identity operator with respect to conjugation expresses it
as the sum:
I = S + V, (3.8)

of two complementary projections S : H → SH and V : H → VH that are defined by:

Sq = ½ (q + q̄) , Vq = ½ (q − q̄) . (3.9)

One then has a corresponding direct sum decomposition H = SH ⊕ VH.


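The product of any two quaternions q and r can be expressed in the scalar-plus-vector
form:
qr = (q^0 r^0 - \langle \mathbf{q}, \mathbf{r} \rangle) + r^0 \mathbf{q} + q^0 \mathbf{r} + \mathbf{q} \times \mathbf{r}, (3.10)
since:
\mathbf{q}\mathbf{r} = - \langle \mathbf{q}, \mathbf{r} \rangle + \mathbf{q} \times \mathbf{r}. (3.11)

Note that the scalar part of the product then behaves like the Minkowski scalar product,
even though we are still only talking about Euclidian geometry. We then define:

(q, r) = S(qr) = \tfrac{1}{2}(qr + \bar{r}\,\bar{q}) = q^0 r^0 - \langle \mathbf{q}, \mathbf{r} \rangle. (3.12)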

Of particular interest in what follows will be the general expression for the square of
any quaternion:
q^2 = (q^0)^2 - \langle \mathbf{q}, \mathbf{q} \rangle + 2 q^0 \mathbf{q}. (3.13)

Although the algebra H is associative, it is not commutative. One notes that, in fact:

[q, r] = 2 \, \mathbf{q} \times \mathbf{r}, (3.14)

so although the Lie algebra that is defined by the commutator bracket is isomorphic to
so(3; R), there is a factor of 2 involved that relates to the fact that the three-dimensional
Euclidian rotations will be represented by half-angle rotations.
The algebra H is, as we mentioned before, a division algebra; in particular, it has no
divisors of zero. In order to find the multiplicative inverse to any non-zero quaternion q,
we can go back to the expression (3.10) for qr and set it equal to 1. This gives the
following conditions on r:

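q^0 r^0 - \langle \mathbf{q}, \mathbf{r} \rangle = 1, r^0 \mathbf{q} + q^0 \mathbf{r} = - \mathbf{q} \times \mathbf{r}.

In the second equation, we see that a linear combination of the vector parts of q and r
can be collinear with their cross product, which is orthogonal to both of them, only if it
equals zero. Thus, the vector part of r is λ times the vector part of q for some real
number λ. But, from the left-hand side this makes r0 = − λq0. From the first equation,
we see that one must then have (q0)2 + <q, q> = − 1/λ. We shall now see that, in fact,
λ = − || q ||−2.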

From (3.10), one sees that:

q\bar{q} = \bar{q}q = (q^0)^2 + \langle \mathbf{q}, \mathbf{q} \rangle \equiv \| q \|^2. (3.15)

Thus, we are now looking at the Euclidian norm over R4, as well as over R3. The level
surfaces of that norm are then real 3-spheres of radius || q ||.
More generally, we can define another scalar product on H by way of:

\langle q, r \rangle = S(q\bar{r}) = q^0 r^0 + \langle \mathbf{q}, \mathbf{r} \rangle. (3.16)

One then sees that this scalar product amounts to the Euclidian scalar product for
quaternions of vector type; in particular, one sees that one also has || q ||2 = <q, q>.
Since || q || is positive-definite for real q, as long as q itself is non-zero, one can define
the multiplicative inverse to q by:

q−1 = q̄ / || q ||2 . (3.17)

The non-zero quaternions Q* then define a multiplicative group that is also a four-
dimensional real Lie group. It contains the subgroup Q1 of all unit quaternions, and since
any non-zero quaternion q can be expressed in “polar” form || q || q̂ , where q̂ = q / || q ||
is a unit quaternion, one sees that the group Q* is the product R+ × Q1 of the group of
positive real numbers under multiplication and the group of unit quaternions.
This polar form of any q can then be expressed in the form:

q = \| q \| (\cos \tfrac{1}{2}\alpha + \sin \tfrac{1}{2}\alpha \, \hat{\mathbf{q}}), (3.18)

in which \hat{\mathbf{q}} is a unit vector that generates an axis of rotation, so:

\mathbf{q} = \| q \| \sin \tfrac{1}{2}\alpha \, \hat{\mathbf{q}}, (3.19)

and the angle α, which is then defined by:

\cos \tfrac{1}{2}\alpha = \frac{q^0}{\| q \|}, (3.20)

represents an angle of rotation about that axis, only one-half of which appears in the
polar form. The appearance of the factor 1/2 will become clearer when we see that the
group of unit quaternions is isomorphic to SU(2), which doubly covers the group SO(3).
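
The conjugate, the norm, the inverse (3.17), and the polar form (3.18)-(3.20) can be sketched in Python as follows. This is an illustration only; quaternions are stored as 4-tuples, all function names are hypothetical, and the use of atan2 to recover the half-angle is a choice made here rather than something taken from the text.

```python
import math

def qmul(p, q):
    """Quaternion product pq, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def conj(q):
    """Conjugate: the scalar part is kept, the vector part changes sign."""
    return (q[0], -q[1], -q[2], -q[3])

def norm2(q):
    """Norm-squared || q ||^2 = (q0)^2 + <q, q>, as in (3.15)."""
    return sum(c * c for c in q)

def inverse(q):
    """Multiplicative inverse conj(q) / || q ||^2 of a non-zero quaternion."""
    n2 = norm2(q)
    if n2 == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(c / n2 for c in conj(q))

def polar(q):
    """Return (norm, alpha, axis) such that q = norm (cos(alpha/2) + sin(alpha/2) axis)."""
    n = math.sqrt(norm2(q))
    vn = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    alpha = 2.0 * math.atan2(vn, q[0])   # half-angle in [0, pi], so cos(alpha/2) = q0 / norm
    axis = tuple(c / vn for c in q[1:]) if vn > 0 else (0.0, 0.0, 0.0)
    return n, alpha, axis

q = (1.0, 2.0, 3.0, 4.0)          # arbitrary sample quaternion
unit = qmul(q, inverse(q))        # should be close to (1, 0, 0, 0)
n, alpha, axis = polar(q)
```

For the sample q, the product with its inverse is the unity element to rounding error, and the polar data reproduce q when substituted back into the polar form.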
One easily verifies that H has no non-trivial nilpotents of degree two by setting the
expression (3.13) for q2 equal to 0, which would make:

(q^0)^2 = \langle \mathbf{q}, \mathbf{q} \rangle, q^0 \mathbf{q} = 0.



From the second equation, we know that either q0 or the vector part of q is 0. In the
former case, this would make <q, q> = 0 for the vector part, and for real vectors this
would imply that the vector part vanishes, which is the trivial case q = 0. Similarly, in
the latter case, if the vector part vanishes then so does <q, q>, and with it q0, which
again gives the trivial case.
In order to find the non-trivial idempotents in H, one goes back to the expression
(3.13) for q2 and sets it equal to q = q^0 + \mathbf{q}. This implies that one must have:

q^0 = (q^0)^2 - \langle \mathbf{q}, \mathbf{q} \rangle, \mathbf{q} = 2 q^0 \mathbf{q}.

If we address the second one first then we see that either the vector part of q vanishes or
it does not. In the former case, from the first equation, one must have q0 = 0 or q0 = 1,
which gives the trivial idempotents q = 0 and q = 1. If the vector part does not vanish
then q0 = 1/2 , which implies that <q, q> = − 1/4. As long as we are dealing
with real vectors this is impossible, although it will be possible when we go on to
complex vectors.
We conclude that there are no non-trivial idempotents in H.

Although H is not a complex algebra, nonetheless, for some purposes – such as SU(2)
spinors – one can represent H as a real algebra over C2. In order to see this, one first
reverts to the classical notation 1, i, j, k for the principal units of H. Since k = ij, one can
rearrange the terms in the expansion of a typical quaternion as follows:

q = q0 + q1i + q2j + q3ij = (q0 + q1i) + (q2 + q3i) j = z1 + z2j, (3.21)

in which we have introduced the complex components:

z1 = q0 + q1i, z2 = q2 + q3i. (3.22)

One should be careful about regarding j as another version of i, since even though j2 =
−1, nonetheless, ij = − ji. As a result, one sees that left scalar multiplication and right
scalar multiplication are distinct in this case, since, for example, if z = u + iv is a complex
number then:
zj = (u + iv)j = uj + vij = uj – vji = j(u – iv) = jz*.

This explains the sense in which this algebra over C2 is not really a complex algebra,
since one finds that the product is R-bilinear, but not C-bilinear, namely:

(z1 + z2j)(w1 + w2j) = z1w1 + z2j w1 + z1 w2j + z2j w2j


= z1w1 + z2 w1*j + z1w2j + z2w2*jj
= (z1w1 − z2w2*) + (z1w2 + z2 w1*) j.

If the product were C-bilinear then the complex conjugates would not appear.
However, complex conjugation is still R-linear, so the product is R-bilinear. This
situation is closely related to the fact that SU(2) is not a complex Lie group, although it is
defined in terms of complex 2×2 matrices, since its real dimension – viz., 3 – is not even.
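
The product rule for quaternions in the form z1 + z2j can be compared with the real component formula (3.4) in the following Python sketch. It is an illustration only: the quaternion unit i is identified with Python's complex unit, and the names to_pair, from_pair, and pair_mul are hypothetical.

```python
def qmul(p, q):
    """Quaternion product pq in real components, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def to_pair(q):
    """Complex components z1 = q0 + q1 i, z2 = q2 + q3 i of q = z1 + z2 j, as in (3.22)."""
    return (complex(q[0], q[1]), complex(q[2], q[3]))

def from_pair(z):
    """Recover the real components (q0, q1, q2, q3) from (z1, z2)."""
    z1, z2 = z
    return (z1.real, z1.imag, z2.real, z2.imag)

def pair_mul(z, w):
    """(z1 + z2 j)(w1 + w2 j) = (z1 w1 - z2 w2*) + (z1 w2 + z2 w1*) j."""
    z1, z2 = z
    w1, w2 = w
    return (z1 * w1 - z2 * w2.conjugate(),
            z1 * w2 + z2 * w1.conjugate())

p = (1.0, 2.0, 3.0, 4.0)   # arbitrary sample quaternions
q = (5.0, 6.0, 7.0, 8.0)
via_pairs = from_pair(pair_mul(to_pair(p), to_pair(q)))
```

The complex conjugates in pair_mul are exactly the terms that prevent the product from being C-bilinear, yet the result agrees with qmul(p, q) in real components.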
In order to be consistent with the real case, one defines the conjugate of q as:

q̄ = z1* − z2j. (3.23)

This makes:
|| q ||2 = qq̄ = z1z1* + z2 z2*, (3.24)

which then defines a Hermitian scalar product on H. In particular, || q ||2 is still a real
number.
The inverse of q is still q̄ / || q ||2 , although the definition of conjugate and norm-
squared have changed. Similarly, one can still define unit quaternions by || q || = 1,
although the explicit form for || q || is Hermitian complex, now.
Although we have put j to the right of the complex component z2 in the above
expressions, one can also represent quaternions by putting i to the left and using j as the
imaginary unit, as well:

q = q0 + q1i + q2j + q3ij = (q0 + q2j) + i(q1 + q3j) = z1 + i z2, (3.25)


in which:
z1 = q0 + q2j, z2 = q1 + q3j, (3.26)
this time.
One still has iz = z*i, so the product of two quaternions takes the form:

zw = (z1 + i z2)(w1 + i w2) = z1w1 − z2*w2 + i(z1*w2 + z2w1). (3.27)

The conjugate of q takes the form:

q̄ = z1* − iz2, (3.28)

which differs from the previous expression only by the placement of the i. Thus, the
inverse of a non-zero quaternion still has the same form, as does the product qq̄ = || q ||2.
The distinction between these two ways of representing a real quaternion as a pair of
complex numbers will become essential when we discuss the action of SU(2) on Pauli
spinors in the next section. For now, we observe that the pair (z1, z2) of complex numbers
that we associated with the tetrad (q0, …, q3) of real numbers that define a real quaternion
are essentially two of the four Cayley-Klein parameters that one associates with the four
Euler parameters in order to represent a rotation, as discussed in the first section of this
chapter.

3. The action of rotations on quaternions. We shall first prove that the Lie group
Q1 is isomorphic to SU(2) and then go on to show how one can isometrically represent

vectors and orthonormal frames in E3 by quaternions of vector type in such a way that a
certain action of unit quaternions on the latter space becomes two-to-one equivalent to
the action of rotations on E3. We will discuss the representation of spin frames and Pauli
spinors by real quaternions, shortly.
The algebra H can be represented isomorphically as a real subspace of the complex
algebra M(2; C) by simply defining the association of basis elements.
One defines a basis {τµ , µ = 0, …, 3} for the complex four-dimensional vector space
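M(2; C) by way of:

\tau_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tau_1 = i \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tau_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tau_3 = i \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, (4.1)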

which are then seen to verify the multiplication table:

τ0τµ = τµτ0 = τµ , τiτj = − δijτ0 + εijkτk . (4.2)

This is formally identical to (3.1), so the linear map defined by taking eµ to τµ and
extending to the other elements by R-linearity represents H isomorphically as a real
subalgebra of the complex algebra M(2; C). The typical quaternion qµ eµ then goes to the
matrix:

[q] = q^\mu \tau_\mu = \begin{pmatrix} q^0 + iq^1 & q^2 + iq^3 \\ -q^2 + iq^3 & q^0 - iq^1 \end{pmatrix} = q^0 \tau_0 + \begin{pmatrix} iq^1 & q^2 + iq^3 \\ -q^2 + iq^3 & -iq^1 \end{pmatrix} . (4.3)

The τ matrices have the property that τ0 represents the unity of the algebra and the
other three matrices τi, i = 1, 2, 3 are anti-Hermitian:

τ i† = − τi . (4.4)

In fact, they relate to the usual Pauli σ matrices, which are Hermitian, by the rule:

τ1 = iσ3, τ2 = iσ2, τ3 = iσ1, (4.5)

which also involves a permutation of the axes. The reason that we shall use anti-
Hermitian matrices, instead of Hermitian ones, is that the Lie algebra that is generated by
the τi is su(2), which then represents infinitesimal rotations directly, and when we extend
this to sl(2; C) by adding the Hermitian matrices as infinitesimal boosts, it will not be so
confusing as to what roles are played by the two types of matrices.
The matrix that represents the conjugate of q takes the form:

[\bar{q}] = \begin{pmatrix} q^0 - iq^1 & -q^2 - iq^3 \\ q^2 - iq^3 & q^0 + iq^1 \end{pmatrix} = q^0 \tau_0 - \begin{pmatrix} iq^1 & q^2 + iq^3 \\ -q^2 + iq^3 & -iq^1 \end{pmatrix} = [q]^\dagger . (4.6)

Thus, conjugation corresponds to the Hermitian adjoint operation in this real case.
The determinant of [q] is:
det [q] = || q ||2, (4.7)

and one sees that the multiplicative group Q* of non-zero quaternions then corresponds to
a real, four-dimensional subgroup of GL(2; C), and the subgroup Q1 of unit
quaternions then corresponds to a real subgroup of GL(2; C) that is isomorphic to SU(2),
since the inverse of the matrix [q] for any unit quaternion q is:

[q]^{-1} = \begin{pmatrix} q^0 - iq^1 & -q^2 - iq^3 \\ q^2 - iq^3 & q^0 + iq^1 \end{pmatrix} = [q]^\dagger . (4.8)

This association of unit quaternions with elements of SU(2) gives a concise way of
showing that the manifold of the Lie group SU(2) is diffeomorphic to a real 3-sphere.
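
The matrix representation (4.3) can be checked numerically with the following Python sketch, which is an illustration only. Matrices are stored as nested lists of Python complex numbers, and the function names are hypothetical.

```python
def qmul(p, q):
    """Quaternion product pq in real components, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def to_matrix(q):
    """The 2x2 complex matrix [q] of (4.3)."""
    q0, q1, q2, q3 = q
    return [[complex(q0, q1), complex(q2, q3)],
            [complex(-q2, q3), complex(q0, -q1)]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian adjoint (conjugate transpose)."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

p = (1.0, 2.0, 3.0, 4.0)            # arbitrary sample quaternions
q = (5.0, 6.0, 7.0, 8.0)
u = (0.5, 0.5, 0.5, 0.5)            # a unit quaternion
```

For these samples, [pq] agrees with [p][q], det [q] equals || q ||2, the matrix of the conjugate is the Hermitian adjoint, and [u][u]† is the identity with det [u] = 1, which is consistent with [u] lying in SU(2).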

When one defines H as a real algebra over C2, as we did at the end of the last section,
the association of a quaternion q = z1 + z2j with a 2×2 complex matrix becomes:

[q] = \begin{pmatrix} z^1 & z^2 \\ -z^{2*} & z^{1*} \end{pmatrix} . (4.9)

We recognize that this amounts to the association of the complex Cayley-Klein
parameters, which are now z1 and z2, with the four real Euler parameters, which are now
the components qµ of a real quaternion, before one imposes the unitarity constraint.
The determinant of this matrix then becomes:

det [q] = z1z1* + z2z2* = || q ||2, (4.10)

and:

[\bar{q}] = \begin{pmatrix} z^{1*} & -z^2 \\ z^{2*} & z^1 \end{pmatrix} = [q]^\dagger . (4.11)

This once more shows the Hermitian nature of H in this formulation.


Since the matrices [qiτi] that represent pure quaternions define a Lie algebra under Lie
bracket that is isomorphic to su(2), which is, in turn, isomorphic to so(3; R), one sees that
the Lie algebra of infinitesimal rotations can be represented by the Lie algebra of pure
quaternions.

Having established the isomorphism of the Lie group Q1 with SU(2), one then defines
an action of Q1 on H as follows:
Q1 × H → H, (u, q) ↦ uqū . (4.12)
This action has the spaces SH and VH of quaternions of scalar and vector type,
respectively, as invariant subspaces. Clearly, if q is a scalar then q goes to itself under
the above action, so the action is trivial on scalars. However, if q is a vector then q̄ = −
q, which means that:
\overline{u q \bar{u}} = u \bar{q} \bar{u} = - u q \bar{u} ,

so q′ = uqū is also a vector. If two vectors x and y go to the same vector under the
action of u then uxū = uyū , which makes x = y, by cancellation on the left and right, so
the action is injective. Similarly, it is surjective, since the equation x′ = uxū can be
solved by way of x = ūx′u for any x′. It is also linear since:

u(αx + βy)ū = α(uxū) + β(uyū) .

One notes that the restriction of scalar product <⋅,⋅> on quaternions to the three-
dimensional vector space VH of pure quaternions, namely:

<x, y> = S(xȳ) = δij xi yj, (4.13)

makes VH isometric to E3. The action in question is then an isometry of that Euclidian
structure, since if q′ = uqū then:

q′q̄′ = uqū uq̄ū = u qq̄ ū = uū ⋅ qq̄ = qq̄ .

If ei is an orthonormal frame in VH then the action of a unit quaternion u on ei can
also be defined by a matrix R^j_i by way of:

u e_i ū = e_j R^j_i . (4.14)

Since the action is an isometry, the matrix Ri j is orthogonal. It also becomes clear that
both u and – u can be associated with the same rotation matrix Ri j , due to the quadratic
nature of the action of the unit quaternions on vectors.
The pairs of unit quaternions {q, − q} are, in fact, antipodal points on the unit 3-
sphere in H. The line through the origin of R4 that connects the two antipodal points can
then represent the rotation in SO(3), which is, in fact, diffeomorphic to RP3 as a

manifold. Thus, in a sense, the components of quaternions behave like the homogeneous
coordinates of points in RP3, and the two-fold covering map SU(2) → SO(3) can be
regarded as the association of {u, − u} to Ri j , and topologically this is the association of
antipodal points on a real 3-sphere to points in RP3 by way of the line through the
antipodal points.
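
The two-to-one association of unit quaternions with rotation matrices can be illustrated by the following Python sketch, which builds the matrix of (4.14) column by column from the images of the frame vectors. It is an illustration only, and the names rotation_matrix and neg are hypothetical.

```python
import math

def qmul(p, q):
    """Quaternion product pq in real components, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def neg(q):
    return tuple(-c for c in q)

def rotation_matrix(u):
    """Matrix R with R[j][i] = R^j_i, defined by u e_i conj(u) = e_j R^j_i."""
    images = []
    for i in range(3):
        e = [0.0, 0.0, 0.0, 0.0]
        e[i + 1] = 1.0                      # pure quaternion e_i
        img = qmul(qmul(u, tuple(e)), conj(u))
        images.append(img[1:])              # vector part of the image of e_i
    return [[images[i][j] for i in range(3)] for j in range(3)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Unit quaternion for a quarter turn about e3 (half-angle pi/4).
u = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
R = rotation_matrix(u)
```

Because the action is quadratic in u, rotation_matrix(u) and rotation_matrix(neg(u)) coincide, and for this particular u the matrix carries e1 to e2.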
One can show that the association of u with R is an order-reversing homomorphism
by noting that the action of uu′ on x is:

(uu′) x \overline{(uu′)} = u (u′ x ū′) ū = u (x R′) ū = x R′R.

If one puts u into polar form then the action (4.12) on vectors v takes the form:

(cos ½θ + sin ½θ u) v (cos ½θ − sin ½θ u)

= cos θ v + (1 – cos θ) <u, v>u + sin θ u × v, (4.15)

which is again Rodrigues’s formula.
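
The agreement (4.15) between the quaternion action and Rodrigues's formula can be verified numerically with the following Python sketch. It is an illustration only; the function names are hypothetical, and the axis is assumed to be a unit vector.

```python
import math

def qmul(p, q):
    """Quaternion product pq in real components, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def sandwich(axis, theta, v):
    """Left-hand side of (4.15): u v conj(u) with u = cos(theta/2) + sin(theta/2) axis."""
    h = theta / 2.0
    u = (math.cos(h),) + tuple(math.sin(h) * c for c in axis)
    return qmul(qmul(u, (0.0,) + tuple(v)), conj(u))

def rodrigues(axis, theta, v):
    """Right-hand side of (4.15)."""
    c, s = math.cos(theta), math.sin(theta)
    uv = sum(a * b for a, b in zip(axis, v))
    uxv = (axis[1] * v[2] - axis[2] * v[1],
           axis[2] * v[0] - axis[0] * v[2],
           axis[0] * v[1] - axis[1] * v[0])
    return tuple(c * v[i] + (1.0 - c) * uv * axis[i] + s * uxv[i] for i in range(3))

axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # unit vector, since 1 + 4 + 4 = 9
theta = 1.1                                 # arbitrary sample angle
v = (0.3, -1.2, 2.5)                        # arbitrary sample vector
lhs = sandwich(axis, theta, v)
rhs = rodrigues(axis, theta, v)
```

The scalar part of lhs vanishes to rounding error, and its vector part agrees with rhs, so the half-angles in u combine into the full angle θ of the rotation.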

In summation, if E3 is represented isometrically by the vector space VH, when it is
given the scalar product that is the restriction of the one defined on H, then the action of
Q1 ≅ SU(2) on VH that was defined in (4.12) is isometric and maps to the action of SO(3)
on E3 by right matrix multiplication in such a way that antipodal unit quaternions go to
the same proper rotation.

One can also use the left and right multiplication of a quaternion q by a unit
quaternion u, when expressed in the complex form, to account for SU(2) spinors.
However, one finds that in order to get the most intuitively appealing results, one must
represent real quaternions in the form z1 + iz2 when dealing with left multiplication and in
the form z1 + z2j when dealing with right multiplication.
Namely, since {1, i} defines a C-basis for C2, which we temporarily denote by {c1,
c2} and left multiplication by a unit quaternion u is an invertible R-linear map, the image
of the basis ca by L(u) can be expressed in terms of ca by means of an invertible 2×2
complex matrix:
u c_a = c_b [L(u)]^b_a . (4.16)

One can derive an explicit expression for the matrix [L(u)]^b_a by considering the
expression for the product uq, although one must multiply the unit i on the left in order to
represent the quaternions. Thus, if u = u1 + iu2 and q = q1 + iq2 then:

uq = u1q1 – u2*q2 + i(u1*q2 + u2q1). (4.17)

This makes:

[L(u)]^b_a = \begin{pmatrix} u^1 & -u^{2*} \\ u^2 & u^{1*} \end{pmatrix} , (4.18)

which is a matrix in SU(2).
If q′ = L(u)q = uq then:
q′q̄′ = uqq̄ū = qq̄ , (4.19)

so left multiplication is an isometry of the Hermitian structure, which is consistent with
our previous statement about [L(u)]^b_a .
Therefore, if one represents an element of C2, such as an SU(2) spinor, by q in its
complex form and an element of SU(2) by a unit quaternion u then the left action uq
corresponds to the multiplication of a matrix in SU(2) times the element of C2.
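
The statement that left multiplication by u acts on the complex components by the matrix (4.18) can be checked with the following Python sketch. It is an illustration only: in the form z1 + iz2 the imaginary unit of the components is j, which is identified here with Python's complex unit, and all function names are hypothetical.

```python
def qmul(p, q):
    """Quaternion product pq in real components, as in (3.4)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
            p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
            p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3)

def to_left_pair(q):
    """Components of q = z1 + i z2 with z1 = q0 + q2 j, z2 = q1 + q3 j, as in (3.26)."""
    return (complex(q[0], q[2]), complex(q[1], q[3]))

def left_matrix(u):
    """The matrix [L(u)] of (4.18) built from the complex components of u."""
    u1, u2 = to_left_pair(u)
    return [[u1, -u2.conjugate()],
            [u2, u1.conjugate()]]

def apply(m, z):
    """Matrix times column vector in C^2."""
    return (m[0][0] * z[0] + m[0][1] * z[1],
            m[1][0] * z[0] + m[1][1] * z[1])

u = (0.5, 0.5, 0.5, 0.5)        # a unit quaternion
q = (5.0, 6.0, 7.0, 8.0)        # arbitrary sample quaternion
by_matrix = apply(left_matrix(u), to_left_pair(q))
by_product = to_left_pair(qmul(u, q))
```

Both routes give the same pair of complex numbers, and for a unit quaternion u the determinant of left_matrix(u) is u1u1* + u2u2* = 1.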
Similar statements are true for right multiplication, which defines a matrix [R(u)]^b_a by
way of:
c_a u = c_b [R(u)]^b_a , (4.20)

and this matrix is also an element of SU(2). However, this time, the basis ca refers to {1,
j}.
From considering the product qu, when one multiplies the j on the right, this time:

qu = q1u1 – q2u2* + (q1u2 + q2u1*) j, (4.21)


one derives:

[R(u)]^b_a = \begin{pmatrix} u^1 & u^2 \\ -u^{2*} & u^{1*} \end{pmatrix} , (4.22)

which is seen to be a unitary matrix with determinant 1, as well as the transpose of the
matrix [L(u)]^b_a , which one sees immediately from comparing (4.22) to (4.18).
However, the right multiplication of a quaternion q by u now corresponds to the right
multiplication of a row vector in C2* by the matrix [R(u)]^b_a . Thus, left and right
multiplication of a quaternion q by a unit quaternion u correspond to the action of SU(2)
on C2 vectors and covectors, respectively, as long as one represents a C2 vector (z1, z2) in
the form z1 + iz2 and a covector in C2* in the form z1 + z2j.
If one takes the tensor product zawb of a C2 vector and a C2 covector then one obtains
a second-rank mixed tensor whose components define a complex 2×2 matrix, which can
also represent a vector in R3. The simultaneous left action of u and right action of ū on
that tensor then corresponds to the conjugation of its component matrix by the SU(2)
matrices that represent u and ū , which is the usual action that one encounters in spinor
algebra for vectors.

4. The kinematics of fixed-point rigid bodies [7]. The mechanics of fixed-point
rigid bodies (i.e., tops) is usually first treated as being based on differentiable curves in
SO(3). That is because the assumption of rigidity allows one to represent the state of a
rigid body at each point in time by some chosen orthonormal frame at a chosen point.
For instance, flight mechanics often defines a body frame that is centered at the center
of mass of the vehicle and has a positive x-axis along the longitudinal axis, pointing
forward, a positive y-axis that points out the left wing, and a z-axis that points vertically
upward; one might invert the y and z directions for some purposes. We shall use the
term attitude that is used in that discipline to refer to the angular position of a rigid body,
rather than orientation, which has an unrelated, but established, usage in the context of
frames.
The assumption of a fixed point for the rotational motion then means that there is no
translation of the frame through the course of time. All of the other possible attitudes of
the rigid body are then in one-to-one correspondence with points of SO(3). Hence, when
one initial frame is chosen, the time evolution of frames in the rigid body can be
associated with a curve R: R → SO(3), t ↦ R(t), such that R(0) = I, since the initial frame
is the reference for the other ones.
If one assumes sufficient differentiability conditions for the curve R(t) then the first
time derivative $\dot R(t)$ can be regarded as the angular velocity of the motion relative to the
initial frame, which then represents the angular velocity in an “inertial” frame. If one
translates the tangent vector $\dot R(t)$ in the tangent space to SO(3) at R(t) back to the identity
I then one gets a corresponding element 5):

$$\omega(t) = \dot R(t)\, R(t)^{-1} \tag{5.1}$$

of $T_I SO(3)$, which represents the angular velocity in a co-moving (i.e., non-inertial)
frame. Since $T_I SO(3)$ can be identified with the Lie algebra so(3), when it is defined by
the right-invariant vector fields on SO(3), one can think of angular velocity as a curve in
so(3).
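As a minimal numerical sketch of (5.1), one can take a steady rotation about the z-axis, R(t) = rot_z(rate · t), which satisfies R(0) = I. The helper names (`rot_z`, `rot_z_dot`, `angular_velocity`) and the choice of curve are illustrative assumptions, not part of the text. The inverse is taken to be the transpose, which holds for any rotation matrix.

```python
import math

def rot_z(theta):
    """Rotation by angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_z_dot(theta, rate):
    """Time derivative of rot_z(rate * t), evaluated where rate * t = theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[-rate * s, -rate * c, 0.0],
            [rate * c, -rate * s, 0.0],
            [0.0, 0.0, 0.0]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def angular_velocity(rate, t):
    """omega(t) = Rdot(t) R(t)^{-1}, with R^{-1} = R^T for a rotation."""
    theta = rate * t
    return matmul(rot_z_dot(theta, rate), transpose(rot_z(theta)))

omega = angular_velocity(rate=0.7, t=1.3)
```

For this particular curve, `omega` comes out anti-symmetric and independent of t, which is what one expects of a constant element of so(3).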
If one goes to the next time derivative $\ddot R(t)$ then that takes the form of a curve in
$T^2 SO(3)$ that takes its value in the vector space $T_{\dot R(t)} T_{R(t)} SO(3)$ at each t. This curve then
represents the angular acceleration of the motion relative to the initial frame, which must
be erected in both the tangent space $T_{R(t)} SO(3)$ and the tangent space to the point $\dot R(t)$ in
that space, as well.
In order to get the angular acceleration relative to the co-moving frame, one can
differentiate (5.1):
5) We choose right translation over left translation since that is usually how one assumes that rotations
act on orthonormal frames. It is also consistent with the way that one converts from motion relative to an
inertial frame to motion relative to a co-moving one.

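$$\frac{d\omega}{dt} = \ddot R R^{-1} + \dot R\, \frac{d(R^{-1})}{dt} = \ddot R R^{-1} - \dot R R^{-1} \dot R R^{-1} = \ddot R R^{-1} - \omega\omega,$$

and define the angular acceleration relative to the co-moving frame by:

$$\alpha(t) = \ddot R R^{-1} = \frac{d\omega}{dt} + \omega\omega. \tag{5.2}$$

This then defines a curve in $T_\omega T_I SO(3; R) = T_\omega\, so(3; R)$.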


Higher derivatives of R(t) can then be right-translated back to the identity to give the
corresponding values in the co-moving frame.
The kinematical state of the rigid body relative to the initial frame can then be
described by the “k-jet”:

$$j^k(t) = (t, R(t), \dot R(t), \ldots, R^{(k)}(t)),$$

while the kinematical state relative to the co-moving frame can take the form:

$$j^k(t)\, R(t)^{-1} = (t, I, \dot R(t) R(t)^{-1}, \ldots, R^{(k)}(t) R(t)^{-1}) = (t, I, \omega(t), \ldots, \omega^{(k-1)}(t)).$$

In order to represent the kinematical state of a rigid body by quaternions, one maps
the vector space R3 to the vector space $V_H$ of pure quaternions by taking every vector x =
(x1, x2, x3) to the quaternion $x = x^i e_i$. Thus, every frame $f_i = e_i R$ in R3 becomes a frame $f_i$
in $V_H$. Similarly, every sufficiently differentiable curve x(t) in R3 becomes a sufficiently
differentiable curve $x(t) = x^i(t)\, e_i$ in $V_H$, so every moving frame $f_i(t)$ in R3 also becomes a
moving frame $f_i(t)$ in $V_H$.
However, since the action of the rotation group on pure quaternions is by conjugation,
not right-multiplication, in order to represent the frame $f_i$ in terms of $e_i$ one must first
represent the rotation matrix R by a unit quaternion q, and then let q act on $e_i$ by
conjugation:

$$f_i = q\, e_i\, \bar q.$$

In order to represent a moving frame $f_i(t)$, one then represents the curve R(t) as a
curve q(t) in $Q_1$:

$$f_i(t) = q(t)\, e_i\, \bar q(t).$$

One then differentiates this curve directly to get:

$$v_i \equiv \frac{df_i}{dt} = \dot q\, e_i\, \bar q + q\, e_i\, \dot{\bar q}. \tag{5.3}$$

This system of equations expresses the time derivative of the rotation of the frame in
terms of the original frame $e_i$, which is then the way that things appear in an inertial
frame. In the co-moving frame, one substitutes $e_i = \bar q f_i q$ and gets:

$$\frac{df_i}{dt} = \dot q \bar q\, f_i + f_i\, q \dot{\bar q} = \dot q \bar q\, f_i - f_i\, \dot q \bar q,$$

in which we have used the fact that since $q \bar q = 1$, one gets $\dot q \bar q = - q \dot{\bar q}$ by differentiation.
If one introduces the quaternion:

$$\omega = \dot q\, \bar q, \tag{5.4}$$

which is analogous to (5.1), to play the role of angular velocity then we can summarize
the previous formula as:

$$\frac{df_i}{dt} = [\omega, f_i] = 2\omega \times f_i. \tag{5.5}$$

Once again, there is a factor of 2 compared to the Euclidian expression.
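Equations (5.3) to (5.5) can be checked numerically with a small quaternion sketch. It assumes the Hamilton product convention with quaternions stored as (scalar, x, y, z), and a steady rotation about e3 as the test curve q(t). The helper names (`qmul`, `qconj`, `spin_about_z`, `check_5_5`) are illustrative and do not appear in the text.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (scalar, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def spin_about_z(rate, t):
    """Unit quaternion q(t) and its time derivative for rotation about e3."""
    h = 0.5 * rate * t
    q = (math.cos(h), 0.0, 0.0, math.sin(h))
    q_dot = (-0.5 * rate * math.sin(h), 0.0, 0.0, 0.5 * rate * math.cos(h))
    return q, q_dot

def check_5_5(rate, t, e=(0.0, 1.0, 0.0, 0.0)):
    """Return df/dt from (5.3), the commutator [omega, f], and 2 omega x f."""
    q, q_dot = spin_about_z(rate, t)
    q_bar, q_bar_dot = qconj(q), qconj(q_dot)
    f = qmul(qmul(q, e), q_bar)
    lhs = tuple(x + y for x, y in zip(qmul(qmul(q_dot, e), q_bar),
                                      qmul(qmul(q, e), q_bar_dot)))
    omega = qmul(q_dot, q_bar)  # (5.4)
    comm = tuple(x - y for x, y in zip(qmul(omega, f), qmul(f, omega)))
    w, v = omega[1:], f[1:]
    cross = (w[1] * v[2] - w[2] * v[1],
             w[2] * v[0] - w[0] * v[2],
             w[0] * v[1] - w[1] * v[0])
    twice_cross = (0.0,) + tuple(2.0 * c for c in cross)
    return lhs, comm, twice_cross

lhs, comm, twice_cross = check_5_5(rate=0.9, t=0.4)
```

For this curve ω = (rate/2) e3, so the factor of 2 in (5.5) restores the full Euclidian rate of rotation of the frame vector.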


If one differentiates (5.3) again then one gets the acceleration of the moving frame
relative to the initial frame as:

$$a_i \equiv \frac{dv_i}{dt} = \ddot q\, e_i\, \bar q + 2 \dot q\, e_i\, \dot{\bar q} + q\, e_i\, \ddot{\bar q}. \tag{5.6}$$

When one substitutes for $e_i$ one then gets the corresponding expression relative to the
co-moving frame:

$$a_i = \ddot q \bar q\, e_i + 2 \dot q \bar q\, e_i\, q \dot{\bar q} + e_i\, q \ddot{\bar q} = \ddot q \bar q\, e_i - 2 \dot q \bar q\, e_i\, \dot q \bar q + e_i\, q \ddot{\bar q}.$$

If one differentiates (5.4) then one sees that:

$$\dot\omega = \ddot q\, \bar q + \dot q\, \dot{\bar q} = \ddot q\, \bar q + \|\dot q\|^2.$$

One also finds that:

$$\|\omega\|^2 = \dot q\, \bar q\, q\, \dot{\bar q} = \|\dot q\|^2,$$

which makes:

$$\ddot q\, \bar q = \dot\omega - \|\omega\|^2.$$

After substituting this in the last equation for $a_i$, one gets:

$$a_i = -\|\omega\|^2 e_i - \langle \omega, e_i \rangle\, \omega + [\dot\omega, e_i] \tag{5.7}$$

for the acceleration of the moving frame relative to itself.

When one goes on to the spin representation of Euclidian rotations, one sees that
there are two ways of representing the motion of spin frames: by elements of SU(2) and
by unit vectors in C2, when it is given the Hermitian inner product, which we call H2.
When one chooses a reference spin frame on H2, any other spin frame can be
described by a unique matrix in SU(2). Thus, a time-parameterized family of spin frames
is just a sufficiently-differentiable curve U(t) in SU(2). We shall then denote its first and
second derivatives with respect to t by $\dot U$ and $\ddot U$, respectively.
Of course, these matrices represent the time evolution of the frame with respect to,
say, the initial frame, which amounts to the inertial description of the motion. If one
wishes to describe it with respect to the moving spin frame U(t) then one must right-translate
U(t) back to the identity matrix by means of U(t)†, and in so doing, translate $\dot U$
back to a tangent vector to I, namely:

$$\Omega = \dot U U^\dagger,$$

and $\ddot U$ to a tangent vector to (I, Ω):

$$\alpha = \ddot U U^\dagger.$$

Thus, Ω is an element of the Lie algebra su(2) of infinitesimal generators of special
unitary transformations, while α is tangent to that vector space at Ω.
If one differentiates Ω with respect to t, one gets:

$$\dot\Omega = \ddot U U^\dagger + \dot U \dot U^\dagger = \ddot U U^\dagger + \Omega\, U U^\dagger\, \Omega^\dagger = \ddot U U^\dagger - \Omega\Omega,$$

since Ω is skew-Hermitian; i.e.:

$$\alpha = \dot\Omega + \Omega\Omega.$$

Of course, when one associates the 2×2 skew-Hermitian matrix Ω with a 3×3 real
orthogonal matrix R, a factor of 1/2 must be introduced into Ω in order to account for the
double-valuedness of the covering map, which makes the rotational angle θ in E3
correspond to the angle θ/2 in H2, and therefore, under differentiation, the angular
velocity ω and angular acceleration α in E3 will correspond to the angular velocity ω/2
and α/2 in H2, as well.
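The relations Ω = U̇U† and α = Ω̇ + ΩΩ can be illustrated with a simple one-parameter subgroup of SU(2). The sketch assumes the diagonal curve U(t) = diag(exp(i·rate·t/2), exp(−i·rate·t/2)), for which Ω is constant, so Ω̇ = 0 and α should reduce to ΩΩ. The function names (`curve`, `dagger`, `matmul`) are illustrative only.

```python
import cmath

def curve(rate, t):
    """U(t) and its first two time derivatives for the diagonal SU(2) curve."""
    p = cmath.exp(0.5j * rate * t)
    pc = p.conjugate()
    k = 0.5j * rate
    U = [[p, 0j], [0j, pc]]
    U_dot = [[k * p, 0j], [0j, -k * pc]]
    U_ddot = [[k * k * p, 0j], [0j, k * k * pc]]
    return U, U_dot, U_ddot

def dagger(A):
    """Hermitian conjugate of a 2x2 complex matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

U, U_dot, U_ddot = curve(rate=0.8, t=0.6)
Omega = matmul(U_dot, dagger(U))    # right-translated first derivative
alpha = matmul(U_ddot, dagger(U))   # right-translated second derivative
Omega_sq = matmul(Omega, Omega)
```

Here `Omega` is diag(i·rate/2, −i·rate/2), which is traceless and skew-Hermitian, and the half-angle rate/2 is the factor of 1/2 that the covering map introduces.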

The non-relativistic description of spinning matter was introduced into quantum
mechanics by Wolfgang Pauli [8] in order to properly account for the spin of an electron
in the Schrödinger equation, as demanded by the Uhlenbeck-Goudsmit hypothesis that
the newly-discovered magnetic dipole moment of the electron might be proportional to an
intrinsic angular momentum, or “spin,” that the electron possessed. Of course, various
theoreticians emphasized that this spin did not have to represent an actual kinematical
state of the electron, such as the way that the Earth rotates about its own axis while
orbiting around the Sun. Indeed, nowadays, spin is seen to be more related to the
dimension of the space in which the quantum wave function takes its values.
Pauli’s modification of the Schrödinger equation involved first replacing the wave
function that took its values in C with one that took its values in C2, which one then
called a Pauli spinor. Just as the real Euclidian rotation group acts on rigid bodies in E3,
its covering group SU(2) acts on rigid bodies when they are represented in H2.
We have already observed in the first section of this chapter that the attitude of a rigid
body can be equivalently described by a matrix in SU(2) or a column vector in H2 with
unit Hermitian norm. We have also pointed out that H can be re-organized into a real
algebra over C2. Thus, we see that since $Q_1$ is isomorphic to SU(2) and the unit sphere in
H2, the multiplication of two real unit quaternions also describes the kinematics of a Pauli
spinor when one assumes that its time evolution can be described by the action of a
sufficiently-differentiable curve in SU(2) on an initial spinor $\psi_0$ in H2:

$$\psi(t) = U(t)\, \psi_0.$$

Thus, by successive differentiations, we get:

$$\dot\psi = \dot U \psi_0 = \Omega\, \psi, \qquad \ddot\psi = \ddot U \psi_0 = \alpha\, \psi,$$

with Ω and α defined as above in the case of SU(2).


The interpretation of the Pauli equation for the spinning electron in terms of the
so-called “hydrodynamical” interpretation of wave mechanics, which went back to
Madelung, and was expanded upon by Takabayasi, Schönberg, and others, was discussed
in the pair of papers by Bohm, Schiller, and Tiomno [9]. This interpretation of wave
mechanics is not the same thing as examining its classical limit, since one does not take h
to zero in the process, but only gives the complex wave equations of quantum theory a
real tensorial form. This is, in fact, the process by which one associates physical
observables with the wave function when one wishes to represent those observables as
tensor fields of various ranks, instead of Hermitian operators. The basic construction that
emerges – perhaps by starting with the field Lagrangian and deriving the Noether
currents that follow from the basic symmetries of the action functional – is that of
bilinear covariants, which we now briefly discuss.
When one represents a Pauli spinor as $\psi = [z^1, z^2]^T$, the basic physical observables
generally follow from considering expressions of the form $\psi^\dagger \sigma^\mu \psi$, where $\sigma^0 = I$ and the
$\sigma^i$, i = 1, 2, 3 are the Pauli matrices, with their indices raised using the Euclidian metric.
One thus obtains four real functions from the four real functions $q^\mu$, µ = 0, …, 3 that
define $z^1$ and $z^2$. A straightforward calculation gives:

$$\psi^\dagger \psi = z^1 z^{1*} + z^2 z^{2*} = \sum_\mu (q^\mu)^2, \tag{5.8}$$

$$\psi^\dagger \sigma^1 \psi = z^{1*} z^2 + z^{2*} z^1 = 2(q^0 q^2 + q^1 q^3), \tag{5.9}$$

$$\psi^\dagger \sigma^2 \psi = i(z^1 z^{2*} - z^2 z^{1*}) = 2(q^0 q^3 - q^1 q^2), \tag{5.10}$$

$$\psi^\dagger \sigma^3 \psi = z^1 z^{1*} - z^2 z^{2*} = (q^0)^2 + (q^1)^2 - (q^2)^2 - (q^3)^2. \tag{5.11}$$

The first number ψ†ψ that one obtains is usually taken to represent a scalar density,
such as a number density, or, when normalized to have unity for its integral over all
space, a probability density function. The other three numbers ψ†σiψ, i = 1, 2, 3 are
taken to be proportional to the spin density of the particle that is described by ψ.
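The bilinear expressions (5.8) to (5.11) can be evaluated directly in a short sketch. It assumes the component convention z1 = q0 + i q1, z2 = q2 + i q3, which is the one consistent with (5.9), (5.11), and (5.15), and the standard Pauli matrices. The function name `pauli_bilinears` is illustrative only. Under this convention the σ2 expression evaluates to 2(q0 q3 − q1 q2).

```python
def pauli_bilinears(q0, q1, q2, q3):
    """Return [psi^dagger sigma^mu psi for mu = 0..3] as real numbers.

    Assumes z1 = q0 + i q1 and z2 = q2 + i q3 (a convention inferred from
    the surrounding formulas, not stated explicitly at this point).
    """
    psi = [complex(q0, q1), complex(q2, q3)]
    sigma = [
        [[1, 0], [0, 1]],       # sigma^0 = identity
        [[0, 1], [1, 0]],       # sigma^1
        [[0, -1j], [1j, 0]],    # sigma^2
        [[1, 0], [0, -1]],      # sigma^3
    ]
    values = []
    for s in sigma:
        total = sum(psi[a].conjugate() * s[a][b] * psi[b]
                    for a in range(2) for b in range(2))
        values.append(total.real)  # each bilinear is real for Hermitian sigma
    return values

density, s1, s2, s3 = pauli_bilinears(1.0, 2.0, 3.0, 4.0)
```

For the sample components (1, 2, 3, 4) this gives a density of 30 and the three spin-type components 22, −4, and −20.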
The four scalars can be assembled into a four-dimensional vector $(\psi^\dagger \sigma^\mu \psi)\, \sigma_\mu$ with real
coefficients, and since we know now that the $\sigma_\mu$ represent the basis for H in a subalgebra
of the matrix algebra M(2; C), this suggests that the four-vector $(\psi^\dagger \sigma^\mu \psi)\, e_\mu$, when the
coefficients are expressed in terms of the $q^\mu$, is an element of H. If we examine the
expressions $q\, e_\mu\, \bar q$ then we see that:

$$q\, e_0\, \bar q = q \bar q = \psi^\dagger \psi, \tag{5.12}$$
$$q\, e_1\, \bar q = (\psi^\dagger \sigma^3 \psi)\, e_1 + (\psi^\dagger \sigma^2 \psi)\, e_2 + (\psi^\dagger \sigma^1 \psi)\, e_3. \tag{5.13}$$

If one recalls that in (4.5) we were permuting 1 and 3 in our association of the Pauli
matrices with the $\tau_i$ that followed most naturally from our quaternion basis then one sees
that, in effect, the quaternion $q\, e_0\, \bar q + q\, e_1\, \bar q$ is essentially the one that corresponds to
$(\psi^\dagger \sigma^\mu \psi)\, \sigma_\mu$. One can evaluate the other products $q\, e_2\, \bar q$ and $q\, e_3\, \bar q$, but all that one finds is
expressions for the coefficients that resemble those of $q\, e_1\, \bar q$ with various permutations
and sign changes of the components $q^\mu$. One then presumes that the definition of the
Pauli matrices essentially “favors” σ1, in that sense, much as one usually singles out the z
component of angular momentum in quantum mechanics.
One also obtains physical observables from bilinear expressions that involve the
differentials dψ and dψ†. In particular, the Noether current that is associated with the
U(1) phase invariance of the Lagrangian is the vector field that corresponds to the 1-form:

$$J = \frac{\hbar}{2mi} (\psi^\dagger d\psi - d\psi^\dagger \psi) = \frac{\hbar}{2mi} (z^{1*} dz^1 - z^1 dz^{1*} + z^{2*} dz^2 - z^2 dz^{2*}). \tag{5.14}$$

When we substitute the quaternion component expressions for $z^1$ and $z^2$, the current
takes the form:

$$J = \frac{\hbar}{m} (q^0 dq^1 - q^1 dq^0 + q^2 dq^3 - q^3 dq^2). \tag{5.15}$$

If we form the quaternion expression that corresponds to ψ†dψ − dψ†ψ then we get:

$$\bar q\, dq - d\bar q\, q = 2(q^0 dq^i - q^i dq^0 + \varepsilon_{ijk}\, q^j dq^k)\, e_i. \tag{5.16}$$

Once again, we see that the component of e1 is the desired expression, while the
components of the other two basis elements are obtained by permuting the indices on the
spatial components.

References

1. O. Bottema and B. Roth, Theoretical Kinematics, North Holland, Amsterdam,
1979; reprinted by Dover, Mineola, NY, 1990.
2. H. Goldstein, Classical Mechanics, 2nd ed., Addison-Wesley, Reading, MA, 1980.
3. J. B. Shaw, Synopsis of linear associative algebras, Carnegie Institute, Washington,
D.C., 1907.
4. L. E. Dickson, Algebras and their Arithmetics, Constable and Co., Ltd., London,
1923; reprinted by Dover, Mineola, NY, 1960.
5. A. A. Albert, Structure of Algebras, A. M. S. Colloquium Publications, v. 24,
Providence, RI, 1939.
6. N. Jacobson, The Theory of Rings, Mathematical Surveys of the A. M. S., NY,
1943.
7. W. Blaschke:
a. “Anwendungen dualer Quaternionen auf Kinematik,” Annales Academiae
Scientiarum Fennicae (1958), 1-13; Gesammelte Werke, v. 2; English
translation available at neo-classical-physics.info.
b. Kinematik und Quaternionen, Mathematische Monographien, VEB Deutscher
Verlag der Wissenschaften, Berlin, 1960; English translation available at
neo-classical-physics.info.
8. W. Pauli, “Zur Quantenmechanik des magnetischen Elektrons,” Zeit. Phys. 43
(1927), 601-623; English translation available at neo-classical-physics.info.
9. D. Bohm, R. Schiller, and J. Tiomno, “A causal interpretation of the Pauli equation
(A),” Supp. Nuovo Cimento 1 (1955), 48-66; (B), ibid., 67-91.
CHAPTER III

DUAL QUATERNIONS

Now that we have established the basic structure of the real quaternions, in order to
discuss the dual quaternions, which belong to the real algebra H ⊗ D, where D is the
algebra of dual numbers, we mostly need to introduce that algebra and then observe what
would change as a result of the tensoring operation. We will then see that the effect of
introducing the dual numbers as components is to make it possible to represent
translations in Euclidian space, as well as rotations, by means of quaternions.

1. The group of rigid motions. Since rigid motions are a special type of affine
transformation, we must now regard E3 as a three-dimensional affine space A3 that has
been given a Euclidian scalar product on its tangent spaces. Hence, one can no longer
form scalar combinations of points in the space A3 and it has no uniquely-defined origin.
One only knows that there is a transitive action of the translation group (R3, +) that
allows one to associate any pair of points x, y ∈ A3 with a unique vector s ∈ R3 that one
interprets as the displacement vector from x to y. Therefore the opposite displacement
vector from y to x will be – s. One can write this association in either form:

y – x = s, y = x + s.

One can then regard the action of the translation group as equivalent to a two-point
map s : A3 × A3 → R3, (x, y) ↦ y − x. This also allows us to define a position vector field
x: A3 → R3, x ↦ x(x) relative to any choice of reference point O by way of:

x(x) = x – O = s(O, x). (6.1)

This map is invertible, and one obtains a global coordinate system on A3 by this
means. Hence, as a differentiable manifold, A3 is diffeomorphic to R3, and one says that
the affine space A3 is modeled on the vector space R3. If one chooses a frame in any
tangent space TxA3, and thus a linear isomorphism of R3 with TxA3, then one can also say
that A3 is modeled on any of its tangent spaces.
One can define a line [l] through a point x ∈ A3 by the set of points of the form x +
α l, where α is a real number and l is a vector in R3 that defines the direction of the line.
Two lines [l] and [l′] in A3 are said to be parallel iff the points of [l′] can be obtained
from the points of [l] by translating any point x in [l] to a point x + s in [l′] by means of
some displacement vector s that is the same for all points of [l]. One can also say that the
differential map dτ(s) of the map τ(s): A3 → A3, x ↦ x + s parallel-translates vectors in
$T_x A^3$ to vectors in $T_{x+s} A^3$, since it is a linear isomorphism, due to the invertibility of the
translation of points. The map τ(s) is then referred to as the translation map associated
with s; since the addition of vectors is commutative, it is unnecessary to specify whether it
is right or left translation.
Since the only frames in an affine space are in its tangent spaces, in order to define an
affine frame (x, ei) one must specify the point of application x along with the linear frame
ei in TxA3. Thus, an affine frame is really a triple of points in the tangent bundle T(A3)
that project to the same point x and have linearly independent vector parts.
However, since there are no linear combinations that are defined in A3, an affine
frame does not give one coordinates for the points of T(A3) directly. In order to get
coordinates for a point (x, v) ∈ T(A3) from a choice of affine frame (O, ei), one first
parallel-translates the frame ei from O to all of the other tangent spaces and thus defines a
global parallel frame field ei(x).
This construction of $e_i(x)$ allows one to give the coordinates $x^i$ of any point x and the
components $v^i$ of any tangent vector v ∈ $T_x(A^3)$ by using the same global frame field for
both. Namely, $x = O + x^i e_i(O)$ and $v = v^i e_i(x)$. Thus, a point (x, v) ∈ T(A3) gets mapped
to a point $(x^i, v^i)$ ∈ R3 × R3 and an affine frame $(x, f_i)$ gets mapped to a point $(x^i, L^j_i)$ ∈
R3 × GL(3; R), where:

$$f_i = e_j(x)\, L^j_i. \tag{6.2}$$

An affine transformation τ : A3 → A3, x ↦ τ(x) is defined to be a map that preserves
parallelism. That is, if the lines [l] and [l′] are parallel then the lines τ([l]) and τ([l′]) will
also be parallel. This condition includes the condition that an affine
transformation takes lines to lines, which means that it is a collineation.
If we choose a reference point O, and thus define a position vector field x(x) relative
to O, then we can use the invertible map x: A3 → R3 to define a map $\bar\tau$ : R3 → R3 that
has the same effect on the position vectors x(x) of points in R3 that τ has on points in A3.
The definition is:

$$\bar\tau(x(x)) = x(\tau(x)). \tag{6.3}$$

Since τ is a collineation of A3, $\bar\tau$ will be a collineation of R3 (because a line in A3 will
become a line in R3), and thus, an invertible affine transformation of R3. The group A(3;
R) of invertible affine transformations on R3 can be characterized by the semi-direct product
R3 ×s GL(3; R), so every such transformation is described by a pair (a, L) that consists of
a translation a and an invertible linear map L. The composition of two transformations
takes the form:
(a, L)(b, M) = (a + Lb, LM), (6.4)
and the inverse of the element (a, L) is:

$$(a, L)^{-1} = (-L^{-1} a,\; L^{-1}). \tag{6.5}$$

The action of (a, L) on R3 is simply:

(a, L)v = a + Lv. (6.6)
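The semi-direct product structure in (6.4) and the action (6.6) can be sketched as follows. Transformations are stored as pairs (a, L) of a list and a nested list, and the names `compose`, `act`, and `inverse_rigid` are illustrative only. To avoid a general matrix inverse, the sketch restricts the linear part to a rotation, whose inverse is its transpose; that restriction is an assumption of the example, not of the text. The inverse compatible with (6.4) is then (−L⁻¹a, L⁻¹).

```python
import math

def matvec(L, v):
    return [sum(L[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(L, M):
    return [[sum(L[i][k] * M[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(L):
    return [list(row) for row in zip(*L)]

def compose(g, h):
    """(a, L)(b, M) = (a + L b, L M), as in (6.4)."""
    (a, L), (b, M) = g, h
    return ([x + y for x, y in zip(a, matvec(L, b))], matmul(L, M))

def act(g, v):
    """(a, L) v = a + L v, as in (6.6)."""
    a, L = g
    return [x + y for x, y in zip(a, matvec(L, v))]

def inverse_rigid(g):
    """Inverse (-L^{-1} a, L^{-1}) when L is a rotation, so L^{-1} = L^T."""
    a, L = g
    Lt = transpose(L)
    return ([-x for x in matvec(Lt, a)], Lt)

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

g = ([1.0, -2.0, 0.5], rot_z(0.7))
h = ([0.3, 0.0, 4.0], rot_z(-1.1))
v = [2.0, 1.0, -1.0]
identity_candidate = compose(g, inverse_rigid(g))
```

Acting with `compose(g, h)` on `v` agrees with acting first with `h` and then with `g`, which is the compatibility between (6.4) and (6.6).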

It is important to remember that since the association of τ with $\bar\tau$ depended upon the
choice of O, so will the association of τ with some pair (s, L). If one chooses another
point O′ = O + d then the new position vector field will be:

x′(x) = x − O′ = x – O – d = x(x) – d,

and one will also have a new definition of $\bar\tau$:

$$\bar\tau'(x'(x)) = x'(\tau(x)) = x(\tau(x)) - d = \bar\tau(x(x)) - d = s - d + L\, x(x) = s - d + L(x'(x) + d);$$

i.e.:

$$\bar\tau'(x'(x)) = (s + (L - I)d) + L\, x'(x). \tag{6.7}$$

Thus, if $\bar\tau$ gets associated with (s, L) then $\bar\tau'$ will get associated with (s + (L – I)d, L).
Although the linear part remains unchanged, the translational part changes by more than
just − d, namely:
s′ = s + (L – I)d. (6.8)

This fact will be at the basis for our later discussion of Chasles’s theorem about rigid
motion.
A convenient way of representing A(3; R) by matrices is to regard R3 as the space of
inhomogeneous coordinates for a chart in RP3 and then embed R3 in the space R4 of
homogeneous coordinates as the affine hyperplane (1, xi). Relative to the canonical basis
for R4, the matrix of any affine transformation (a, L) then takes the form:

$$[(a, L)] = \begin{pmatrix} 1 & 0 \\ a^i & L^i_j \end{pmatrix}. \tag{6.9}$$

An interesting aspect of projective geometry is the fact that if one regards the linear
hyperplane $(0, x^i)$ as the hyperplane at infinity then the translation subgroup of A(3; R)
acts trivially on the points at infinity, which is easy to see in this matrix representation
when one sets $L^i_j = \delta^i_j$. This fact is related to the fact that parallel lines in projective
spaces intersect in a point at infinity, so one can no longer define the parallel translation
of lines in projective geometry.

If we introduce the Euclidian scalar product <.,.> into the tangent spaces of A3 then
we can further reduce the group of affine transformations to the ones whose differential
maps preserve the scalar product:

<dτ|x(v), dτ|x(w)> = <v, w>. (6.10)

When we look at the image of this in R3 using some position vector field x(x), we see that
if $\bar\tau(v) = a + Lv$ then:

$$d\bar\tau = L. \tag{6.11}$$

Since this differential map is independent of the choice of point in R3, it becomes
unnecessary to deal with the differential map and we see that we are simply restricting
the linear part L of the affine map τ by:

<Lv, Lw> = <v, w>; (6.12)

i.e., L must be an orthogonal transformation, or rotation.
In order to deal with physical motions, one must further restrict oneself to only proper
rotations, and one sees that one has reduced A(3; R) to the group ISO(3; R) = R3 ×s SO(3;
R) of Euclidian spatial rigid motions. Thus, since it is a subgroup of the semi-direct
product R3 ×s GL(3; R), the products and inverses behave the same way as before.
As a real Lie group, ISO(3; R) is non-Abelian, six-dimensional, and connected, but
not simply connected, and the presence of R3 as a factor makes it non-compact. Since R3
is a normal subgroup, it is also not simple. Its two-to-one simply connected covering group
is R3 ×s SU(2).
A convenient way to represent ISO(3; R) by matrices is to restrict the representation
(6.9) accordingly, so the matrix of any rigid motion (a, R) takes the form:

$$[(a, R)] = \begin{pmatrix} 1 & 0 \\ a^i & R^i_j \end{pmatrix}. \tag{6.13}$$

The Lie algebra iso(3; R) of ISO(3; R) can be obtained by differentiation of a curve
through the identity transformation and is seen to consist of the semi-direct sum R3 ⊕s
so(3; R). If v + ω and v′ + ω′ are two elements of iso(3; R) then their Lie bracket is
obtained by using bilinearity and the fact that the Lie algebra of R3 is Abelian:

[v + ω, v′ + ω′] = [v, ω′] + [ω, v′] + [ω, ω′], (6.14)

and if one represents ω and ω′ as vectors $\boldsymbol\omega$ and $\boldsymbol\omega'$, respectively, in (R3, ×) by setting
ω = ad($\boldsymbol\omega$) and ω′ = ad($\boldsymbol\omega'$) then one can also say that:

[v + ω, v′ + ω′] = v × ω′ + ω × v′ + ω × ω′. (6.15)

The matrix representation of v + ω that corresponds to (6.13) can also be obtained by
differentiation at the identity:

$$[(v, \omega)] = \begin{pmatrix} 0 & 0 \\ v^i & \omega^i_j \end{pmatrix}, \tag{6.16}$$

in which $\omega^i_j$ is a real, anti-symmetric 3×3 matrix.

We now extend our previous discussion of axes for SO(3; R) to ISO(3; R) by first
pointing out that although translations are not actually linear transformations, they can
still have eigenvectors. Namely, if one considers the basic definition of an eigenvector
for a translation a:
v + a = λv or (λ – 1)v = a

then one sees that all that this requires is that v be collinear with a, while λ can take on
any real value except 1, unless a = 0. Recall that in order to have an eigenvalue of 1, one
must have a fixed point, and non-zero translations act without fixed points.
Now, an axis for a rigid motion is still an invariant line in A3, so if v points in the
direction of the line [l] then its image under the rigid motion (a, R) will also point in that
direction; i.e.:
a + Rv = λv. (6.17)
for some real scalar λ.
If we change to a different reference point O′ in A3 then the translation vector a will
become:
a′ = a + (R – I)d, (6.18)
in which d = O′ – O.
If d is a real eigenvector of R then the change of reference point has no effect on the
translational part of the rigid motion. The line [d] would then be a rotational axis for R,
so already we see that we have singled out a class of lines in A3 that get associated with a
rigid motion by way of its rotational part. However, if we consider the translational part
then we can also single out a unique line among them. One does this by looking for a
translation d that will make a′ collinear with d:

a + (R – I)d = λd.

Let us now denote the rotational axis of R by [l] and decompose both a and d into
sums a|| + a⊥ and d|| + d⊥ whose component vectors are parallel to [l] and perpendicular
to it, respectively. This last equation then becomes two equations:

a|| = λd|| , a⊥ = [(λ + 1)I − R] d⊥ .

The fact that these equations still involve the unspecified parameter λ is due to the
fact that the solution to our problem is a line, not a point. As long as one chooses a non-
zero value for λ, the first equation is soluble for d|| . Since the second equation actually
only involves the planar part of the rotation in the plane perpendicular to the rotational
axis, the matrix [(λ + 1)I − R] will be invertible and one can also solve for d⊥. Choosing
λ = 1 gives the simplest solution:

d|| = a||, d⊥ = [2I − R]−1 a⊥ . (6.19)


That is:

Chasles’s theorem [1, 2]: Given a choice of reference point O in A3 for which a
rigid motion gets associated with the pair (a, R), one can translate to another reference point
O′ such that the rigid motion takes the form (a||, R(θ)), where the translation a|| is in the
direction of the rotational axis [l] of R that passes through O′ and the rotation R(θ) is
about the axis [l].
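The algebra behind (6.19) can be checked in a small numerical sketch. It assumes that the rotational axis is e3, so that the perpendicular part lives in the (x, y)-plane and the planar block of 2I − R can be inverted in closed form. The function names `screw_offset` and `shifted_translation` are illustrative only. The sketch verifies only that the displacement d from (6.19), with λ = 1, satisfies a + (R − I)d = d; it is not meant as a full proof of the theorem.

```python
import math

def screw_offset(a, theta):
    """Displacement d from (6.19) for a rotation by theta about e3."""
    c, s = math.cos(theta), math.sin(theta)
    # Planar block of 2I - R is [[2 - c, s], [-s, 2 - c]]; its determinant
    # (2 - c)^2 + s^2 is always positive, so the block is invertible.
    det = (2.0 - c) ** 2 + s ** 2
    d1 = ((2.0 - c) * a[0] - s * a[1]) / det
    d2 = (s * a[0] + (2.0 - c) * a[1]) / det
    return (d1, d2, a[2])  # parallel part: d_par = a_par

def shifted_translation(a, theta, d):
    """a' = a + (R - I) d, the change of translational part in (6.18)."""
    c, s = math.cos(theta), math.sin(theta)
    Rd = (c * d[0] - s * d[1], s * d[0] + c * d[1], d[2])
    return tuple(ai + rdi - di for ai, rdi, di in zip(a, Rd, d))

a = (1.0, -2.0, 0.5)
theta = 0.9
d = screw_offset(a, theta)
a_prime = shifted_translation(a, theta, d)
```

For these sample values `a_prime` coincides with `d` up to rounding, which is the collinearity condition with λ = 1.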

A rigid motion is then canonically associated with a central axis [l] such that the
motion consists of a rotation around the axis and a translation along it. For that reason,
Sir Robert Ball referred to the canonical form of a rigid motion as a screw. What
Plücker, Klein, and Study were calling a “dyname” was then a wrench, while the French
used the term “torseur,” which got Anglicized to the modern term “torsor,” which refers
to an element of the dual space to a Lie algebra of infinitesimal motions.
Actually, the concept of a central axis went back further in time to Poinsot, who
showed that an analogous (in fact, the same) axis is defined for a finite spatial distribution
of force vectors that act on a rigid body. In that case, the central axis had the property
that the collective effect of the forces was equivalent to a single force that acted in the
direction of the axis and a force moment that acted in a plane perpendicular to the axis.
Ball then referred to this configuration of force and moment as a “wrench.”
The set of all oriented, orthonormal, affine frames in (E3, <.,.>) is also the bundle
SO(E3) of oriented, orthonormal linear frames in the tangent spaces. The group ISO(3;
R) of rigid motions acts on SO(E3) on the right, but not as a structure group. For one
thing, the structure group of SO(E3) is SO(3; R), not ISO(3; R), and for another, the
action of a structure group takes frames at a given point to other frames at that same point.
The action is:

$$SO(E^3) \times ISO(3; R) \to SO(E^3), \qquad ((x, e_i), (s^i, R^i_j)) \mapsto (x', e'_i),$$

with:

$$x' = x + s^i e_i, \qquad e'_i = e_j R^j_i. \tag{6.20}$$

One sees that the action of ISO(3; R) on SO(E3) is similar to the action of ISO(3; R)
on itself by multiplication. Indeed, one recalls that SO(E3) is diffeomorphic to ISO(3; R)
as a manifold by choosing any orthonormal affine frame $(O, e_i)$, mapping it to $(0, \delta^i_j)$ and
mapping any other orthonormal affine frame $(x, f_i)$ to the element $(s^i, R^i_j)$ ∈ ISO(3; R)
that makes $x = O + s^i e_i$ and $f_i = e_j R^j_i$. Thus, the right action of ISO(3; R) on both ISO(3;
R) and SO(E3) commutes with that diffeomorphism. One then says that the two actions
are equivariant. In particular, the diffeomorphism takes orbits of one action to orbits of
the other.

2. The algebra of dual numbers [3-5]. The algebra D of dual numbers can be
regarded as an R-algebra over R2, in which one denotes an ordered pair (a, b) ∈ R2 by a +
εb. The number a in this case is referred to as the real part of the dual number a + εb,
while εb is the pure dual part. Thus, one can regard the set {1, ε} as a basis for R2.
The Abelian group structure on D is defined by vector addition of ordered pairs:

(a + εb) + (c + εd) = (a + c) + ε(b + d),

while the scalar multiplication by a real number λ is also defined component-wise:

λ(a + εb) = λ a + ελb.

The multiplication of dual numbers is defined by polynomial multiplication, modulo
the condition on ε that:

$$\varepsilon^2 = 0.$$

Hence, one can give the multiplication table for the basis elements:

11 = 1, 1ε = ε1 = ε, εε = 0.

Thus, if α = a + εb and β = c + εd then their product is defined to be:

αβ = ac + ε(ad + bc),

which is easily verified to be associative, commutative, and to have:

1 = 1 + ε(0)

for a multiplicative unity.
The fact that the multiplication distributes over addition follows from the definition of
that operation and is part of the proof of the bilinearity of the product. The other part is
that for any real scalar λ and any two dual numbers α, β, one must have:

(λα)β = α(λβ) = λ(αβ).

Since εε = 0, one already sees that non-zero divisors of zero exist, and, in fact, any
product of pure dual numbers will vanish, as well, from bilinearity. Thus, the ring is not
an integral domain; i.e., the algebra is not a division algebra.
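The multiplication rule αβ = ac + ε(ad + bc) and the existence of zero divisors can be illustrated with a minimal sketch. The class name `Dual` and its field names are illustrative choices, not notation from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A dual number a + eps*b with eps*eps = 0."""
    a: float  # real part
    b: float  # coefficient of the pure dual part

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + eps b)(c + eps d) = ac + eps (ad + bc); the eps^2 term drops out.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def scale(self, lam):
        """Scalar multiplication by a real number lam."""
        return Dual(lam * self.a, lam * self.b)

ONE = Dual(1.0, 0.0)
EPS = Dual(0.0, 1.0)

alpha = Dual(2.0, 3.0)
beta = Dual(-1.0, 4.0)
gamma = Dual(5.0, -2.0)
zero_product = Dual(0.0, 7.0) * Dual(0.0, -3.0)  # product of two pure dual numbers
```

Any two pure dual numbers multiply to zero, as `zero_product` shows, so no pure dual number can have a multiplicative inverse.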
The two-real-dimensional algebra D can be represented quite simply by a sub-algebra
of the algebra of real 3×3 matrices in a manner that is suggestive of projective geometry.
Namely, one takes the dual number a + εb to the matrix:

1 0 0
[a + ε b] =  0 a 0  .
b 0 1 

Although D is not a division algebra, some elements of D do have multiplicative
inverses. Namely, if one defines the conjugate to any dual number α = a + εb to be:

$$\bar\alpha = a - \varepsilon b$$

then one sees that:

$$\alpha \bar\alpha = a^2.$$

We can then define the modulus-squared of any dual number α to be:

$$|\alpha|^2 = \alpha \bar\alpha = a^2,$$

and see that | α | = 0 iff a = 0.
Therefore, if α is not a pure dual number then one can define a multiplicative inverse
to α by way of:

$$\alpha^{-1} = \frac{\bar\alpha}{|\alpha|^2}.$$

One can then define the multiplicative group D* of invertible dual numbers when it is
given multiplication. It is a non-compact, two-dimensional, real, Abelian, Lie group with
two connected components – viz., the dual numbers with positive and negative real parts,
respectively. It includes the subgroup D1 of elements with unit modulus, which then all
have the form:

α = ± 1 + εb.

Since any invertible dual number α has a non-vanishing modulus that is equal to | a |,
one can factor the modulus out and express every element of D* in the form of a positive
real number times a dual number of unit modulus:

$$\alpha = |a| \left( \pm 1 + \varepsilon\, \frac{b}{|a|} \right).$$

Thus, the group D* is isomorphic to the direct product R+ × D1, in which R+ is the
multiplicative group of positive real numbers.
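The conjugate, modulus, inverse, and the factorization into a positive real number times a unit-modulus element can be sketched with dual numbers stored as plain pairs (a, b). The function names (`dmul`, `dconj`, `dmod2`, `dinv`, `polar`) are illustrative only.

```python
def dmul(x, y):
    """Product of dual numbers stored as (real part, dual part)."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])

def dconj(x):
    """Conjugate a - eps b of a + eps b."""
    return (x[0], -x[1])

def dmod2(x):
    """Modulus-squared alpha * conj(alpha) = a^2 (the dual part cancels)."""
    return dmul(x, dconj(x))[0]

def dinv(x):
    """Inverse conj(alpha) / |alpha|^2; undefined for pure dual numbers."""
    m2 = dmod2(x)
    if m2 == 0:
        raise ZeroDivisionError("pure dual numbers have no inverse")
    c = dconj(x)
    return (c[0] / m2, c[1] / m2)

def polar(x):
    """Split alpha = |a| * (+-1 + eps b/|a|) into (|a|, unit-modulus part)."""
    r = abs(x[0])
    return r, (x[0] / r, x[1] / r)

alpha = (2.0, 3.0)
alpha_inv = dinv(alpha)
radius, unit = polar((-4.0, 2.0))
```

Multiplying `alpha` by `alpha_inv` returns (1, 0), and the unit part of −4 + 2ε is −1 + 0.5ε, which lies in the component with negative real part.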

3. Functions of dual numbers [3-5]. When one starts with the simplest functions f :
D → D, namely, polynomial functions, one sees that if x = x + εs is any dual number
then one sees by induction that:
dx n
x n = xn + ε n xn−1s = xn + ε s. (8.1)
dx

Since a polynomial $P[\underline{x}]$ is a (real) linear combination of powers of
$\underline{x}$, and differentiation is linear, one can say that for any polynomial
function $P[\underline{x}]$ of the dual variable $\underline{x}$, one will have:
\[ P[\underline{x}] = P[x] + \varepsilon\, P'[x]\, s. \tag{8.2} \]

By generalization, if one assumes that f(x) is a differentiable function of x then one
can simply define the function:
\[ f(\underline{x}) = f(x) + \varepsilon\, f'(x)\, s. \tag{8.3} \]

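In particular, for a dual angle $\underline{\theta} = \theta + \varepsilon s$, we will
need to know that:

\[ \cos\underline{\theta} = \cos\theta - \varepsilon\, s \sin\theta, \tag{8.4} \]
\[ \sin\underline{\theta} = \sin\theta + \varepsilon\, s \cos\theta. \tag{8.5} \]
This also makes:
\[ \cos^2\underline{\theta} + \sin^2\underline{\theta} = 1, \tag{8.6} \]
\[ \cos 2\underline{\theta} = \cos^2\underline{\theta} - \sin^2\underline{\theta}, \tag{8.7} \]
\[ \sin 2\underline{\theta} = 2 \sin\underline{\theta}\,\cos\underline{\theta}, \tag{8.8} \]
as in conventional trigonometry.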
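
As a numerical illustration of (8.1) and (8.4)-(8.7), one might represent a dual number as a (real part, dual part) pair and compare both sides of each identity. This is a sketch under that assumed representation; the function names are hypothetical.

```python
import math


def dual_mul(x, y):
    """Product of dual numbers stored as (real part, dual part) pairs."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])


def dual_cos(t):
    theta, s = t
    return (math.cos(theta), -s * math.sin(theta))   # (8.4)


def dual_sin(t):
    theta, s = t
    return (math.sin(theta), s * math.cos(theta))    # (8.5)


def dual_pow(x, n):
    """x**n by repeated multiplication, for comparison with (8.1)."""
    result = (1.0, 0.0)
    for _ in range(n):
        result = dual_mul(result, x)
    return result


t = (0.7, 1.3)                                   # sample dual angle theta + eps*s
c, sn = dual_cos(t), dual_sin(t)
c2, s2 = dual_mul(c, c), dual_mul(sn, sn)
pythagoras = (c2[0] + s2[0], c2[1] + s2[1])      # (8.6): close to (1, 0)
double_angle = (c2[0] - s2[0], c2[1] - s2[1])    # (8.7): dual cosine of 2*t
cube = dual_pow((2.0, 5.0), 3)                   # (8.1): (2**3, 3 * 2**2 * 5)
```

The pure dual part of every result is the derivative of the real function times s, which is the content of (8.3).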

4. Dual linear algebra. The Cartesian product Dn, whose elements all look like
$(\underline{x}^1, \ldots, \underline{x}^n)$, can be treated as either an R-vector space
or a D-module, depending upon whether one is considering scalars that come from the
field R or the ring D. The elements of Dn will be referred to as dual vectors.
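A module over a ring (6) (which is D, in the present case) behaves in many ways like
a vector space, except that one must take into account the possible existence of zero
divisors and non-invertible elements in the ring. That is, one still has an Abelian group
under addition – Dn, in the present case – and the addition of elements is interpreted as
(dual) vector addition. Furthermore, the ring D acts on Dn in the manner of scalar
multiplication by way of the obvious definition:

\[ \underline{\alpha}\,(\underline{x}^1, \ldots, \underline{x}^n) = (\underline{\alpha}\,\underline{x}^1, \ldots, \underline{\alpha}\,\underline{x}^n). \]

Since D is commutative under multiplication, it is unnecessary to distinguish left from
right multiplication. Scalar multiplication has the same properties as it does for a vector
space; namely, if $\underline{\alpha}$ and $\underline{\beta}$ are dual numbers, while
$\underline{v}$ and $\underline{w}$ are dual vectors, then one always has:
\[ 1\,\underline{v} = \underline{v}, \qquad 0\,\underline{v} = 0, \qquad \underline{\alpha}(\underline{\beta}\,\underline{v}) = (\underline{\alpha}\,\underline{\beta})\,\underline{v}, \]
\[ (\underline{\alpha} + \underline{\beta})\,\underline{v} = \underline{\alpha}\,\underline{v} + \underline{\beta}\,\underline{v}, \qquad \underline{\alpha}(\underline{v} + \underline{w}) = \underline{\alpha}\,\underline{v} + \underline{\alpha}\,\underline{w}. \]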

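However, since there are divisors of zero in D, one will also have pairs
$\underline{\alpha}$, $\underline{v}$ of non-zero scalars and vectors whose product is
zero, such as $\underline{\alpha} = \varepsilon\alpha$, $\underline{v} = \varepsilon v$,
where α and v are real.
Any dual vector $\underline{v}$ can be put into real-plus-pure-dual form v + εa, where v
and a are real vectors, as can any dual number $\underline{\alpha} = \alpha + \varepsilon\beta$,
so we can also express the scalar multiplication in that way:
\[ \underline{\alpha}\,\underline{v} = \alpha v + \varepsilon(\alpha a + \beta v). \]
In particular:
\[ \alpha\,\underline{v} = \alpha v + \varepsilon\,\alpha a, \qquad (1 + \varepsilon)\,\underline{v} = v + \varepsilon(a + v). \]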

Since the Abelian group Dn has no torsion factors – i.e., there is no non-zero integer n
and non-zero $\underline{v}$ such that $n\,\underline{v} = \underline{v} + \ldots + \underline{v}$
(n summands) = 0 – the D-module Dn is free. Hence, one can find a basis for it, which is
a set $\{\underline{e}_1, \ldots, \underline{e}_n\}$ of n dual vectors such that any dual
vector $\underline{v}$ can be expressed as a linear combination with dual number
coefficients:

\[ \underline{v} = \sum_{i=1}^{n} \underline{v}^i\, \underline{e}_i . \]

For instance, one has the usual canonical basis, where $\underline{e}_i$ = (0, …, 0, 1, 0, …, 0),
with the 1 in the ith place.

(6) See Jacobson [6] or MacLane and Birkhoff [7] for the general theory of modules, among other
references.
One must note that the existence of divisors of zero makes the usual definition of
linear independence less useful, since there might be linear combinations
$\sum_{i=1}^{n} \underline{x}^i\, \underline{e}_i$ that go to zero even though not all of
the $\underline{x}^i$ are zero. For instance, one might have (ελ)(εei),
where λ is real and non-zero and ei is one of the canonical basis vectors. This does not
imply that one cannot have a basis for a dual vector space, only that not as many sets of n
dual vectors can be used for that purpose.

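A map $\underline{L}$ : V → W, where V is an n-dimensional dual vector space and W is an
m-dimensional one, is said to be D-linear if it takes linear combinations to linear
combinations:
\[ \underline{L}(\underline{\alpha}\,\underline{v} + \underline{\beta}\,\underline{w}) = \underline{\alpha}\,\underline{L}(\underline{v}) + \underline{\beta}\,\underline{L}(\underline{w}). \]

When a basis $\{\underline{e}_1, \ldots, \underline{e}_n\}$ has been chosen for V, and
another basis $\{\underline{f}_1, \ldots, \underline{f}_m\}$ has been chosen for W, any
D-linear map $\underline{L}$ can be associated with a dual matrix $\underline{L}^a_i$ by
the usual process:
\[ \underline{L}(\underline{e}_i) = \underline{f}_a\, \underline{L}^a_i. \tag{9.1} \]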

A dual matrix can also be put into real-plus-pure-dual form:

\[ \underline{L}^a_i = L^a_i + \varepsilon A^a_i, \]

where $L^a_i$ and $A^a_i$ are real matrices.
The action of a dual linear map $\underline{L}$ on a dual vector $\underline{v}$ can also
be regarded as an action of the dual matrix $\underline{L}^a_i$ on $\underline{v}^i$ when
a basis has been chosen or as an action of that matrix on the basis itself:
\[ \underline{L}(\underline{v}) = (\underline{L}^a_i\, \underline{v}^i)\, \underline{f}_a = \underline{v}^i\, (\underline{L}^a_i\, \underline{f}_a). \]

Hence, one can relate the action of dual linear maps to the multiplication
$\underline{L}^a_i\, \underline{v}^i$ of a dual matrix times a dual column vector of
components. In real-plus-pure-dual form this is:

\[ \underline{L}^a_i\, \underline{v}^i = L^a_i v^i + \varepsilon(L^a_i a^i + A^a_i v^i). \tag{9.2} \]

For the sake of clarity, we shall usually omit the matrix indices, since they behave as
they do in conventional linear algebra.

The product of two dual matrices $\underline{L} = L + \varepsilon A$ and
$\underline{M} = M + \varepsilon B$ takes on the real-plus-pure-dual form:
\[ \underline{L}\,\underline{M} = LM + \varepsilon(LB + AM). \tag{9.3} \]

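The identity transformation I : Dn → Dn, $\underline{v} \mapsto \underline{v}$ still has
the usual matrix $\delta^i_j$ for any basis, and a dual matrix $\underline{L}$ is
invertible iff there is some dual matrix $\tilde{\underline{L}} = \tilde{L} + \varepsilon\tilde{A}$
such that:
\[ \underline{L}\,\tilde{\underline{L}} = \tilde{\underline{L}}\,\underline{L} = I. \]

From (9.3), we see that if $\tilde{\underline{L}}$ exists then one must have that
$\tilde{L}$ is the inverse of the real matrix L (which must then be invertible) and that:

\[ \tilde{A} = -\tilde{L}\, A\, \tilde{L}, \]

which will exist as long as $\tilde{L}$ exists, regardless of whether A is invertible.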
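
The product rule (9.3) and the resulting formula for the inverse can be tried out on small matrices. The sketch below assumes that a dual matrix is stored as a pair (real part, pure dual part) of 2×2 nested lists; the helper names are hypothetical, and the pure dual part is deliberately chosen to be singular.

```python
def matmul(A, B):
    """Product of real matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inv2(M):
    """Inverse of a real 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def dual_matmul(X, Y):
    """(L + eps A)(M + eps B) = LM + eps(LB + AM), as in (9.3)."""
    (L, A), (M, B) = X, Y
    return (matmul(L, M), matadd(matmul(L, B), matmul(A, M)))


def dual_inverse(X):
    """Real part: inverse of L. Pure dual part: -(L^-1) A (L^-1)."""
    L, A = X
    Li = inv2(L)
    return (Li, [[-x for x in row] for row in matmul(matmul(Li, A), Li)])


X = ([[2.0, 0.0], [0.0, 4.0]],   # real part L, invertible
     [[1.0, 2.0], [2.0, 4.0]])   # pure dual part A, singular
real, dual = dual_matmul(X, dual_inverse(X))   # identity and zero matrix
```

Only the real part has to be inverted, so the invertibility of the dual matrix does not depend on A.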

The invertible D-linear maps form a group, as do the invertible dual matrices, and we
refer to either as GL(n; D). That group contains GL(n; R) as a subgroup, by way of the
invertible pure real matrices. It also contains a subgroup that consists of the matrices of
the form:
L = I + εA.

Since the product of two such matrices $\underline{L}$ and $\underline{M} = I + \varepsilon B$ is:

\[ \underline{L}\,\underline{M} = I + \varepsilon(A + B), \]

the inverse of such a matrix is of the form:

\[ \tilde{\underline{L}} = I - \varepsilon A, \]

which is the dual conjugate of the matrix $\underline{L}$. One then sees that this
subgroup is isomorphic to the translation group of $R^{n^2}$.
Note that the action of pure real matrices is linear, while the action of the latter class
of matrices gives an affine transformation of the dual part of any vector:

(I + εA)(v + εa) = v + ε(a + Av).

One can define a scalar product <.,.> on Dn to be a symmetric, D-bilinear functional
on Dn that is non-degenerate in the sense that the map Dn → Dn*,
$\underline{v} \mapsto \langle \underline{v}, . \rangle$ is a D-linear isomorphism. Here,
one must remember that <.,.> takes its values in D, and we are defining Dn* to be the
dual space to Dn, namely the D-module of all D-linear functionals on Dn. The dual
Euclidian scalar product is the one that makes the canonical basis orthonormal:
\[ \langle e_i, e_j \rangle = \delta_{ij}. \]

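Thus, if two dual vectors $\underline{v}$ and $\underline{w}$ are expressed with respect
to that basis then their scalar product takes the form:
\[ \langle \underline{v}, \underline{w} \rangle = \delta_{ij}\, \underline{v}^i\, \underline{w}^j, \]

and if both sets of components are expressed in real-plus-pure-dual form as
$v^i + \varepsilon a^i$ and $w^i + \varepsilon b^i$ then their scalar product takes the forms:

\[ \langle \underline{v}, \underline{w} \rangle = \delta_{ij}\, v^i w^j + \varepsilon\, \delta_{ij}\, (v^i b^j + a^i w^j) \tag{9.4} \]
or:
\[ \langle \underline{v}, \underline{w} \rangle = \langle v, w \rangle + \varepsilon(\langle v, b \rangle + \langle w, a \rangle). \tag{9.5} \]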

A D-linear map $\underline{L}$ is said to be dual orthogonal iff it preserves the dual
Euclidian scalar product; i.e., for all dual vectors $\underline{v}$ and $\underline{w}$,
one must have:

\[ \langle \underline{L}\,\underline{v}, \underline{L}\,\underline{w} \rangle = \langle \underline{v}, \underline{w} \rangle. \]

If one expresses this in terms of components then the condition on the dual matrix
$\underline{L}^i_j$ is that:
\[ \delta_{kl}\, \underline{L}^k_i\, \underline{L}^l_j = \delta_{ij} \qquad \text{or} \qquad \underline{L}^T \underline{L} = I. \]

Once again, this implies that $\underline{L}$, as well as $L^i_j$, must be invertible
and that the inverse of $\underline{L}^i_j$ is its transpose, in the usual sense of the
word. Thus, the dual orthogonal transformations or matrices form a group O(n; D), which
contains O(n; R) as a subgroup by way of the pure real orthogonal matrices. However, one
notes that in order for a translation $\delta^i_j + \varepsilon A^i_j$ to be orthogonal
its inverse, namely, $\delta^i_j - \varepsilon A^i_j$, must be its transpose, namely,
$\delta^i_j + \varepsilon A^j_i$. Thus, the matrix $A^i_j$ must be anti-symmetric. For
the case of n = 3, which is the most interesting one to us at the moment, this means that
the translation subgroup is isomorphic to R3 and the matrix $A^i_j$ takes the form of
ad(a) for some real 3-vector a:

\[ A^i_j = \begin{pmatrix} 0 & -a^3 & a^2 \\ a^3 & 0 & -a^1 \\ -a^2 & a^1 & 0 \end{pmatrix}. \]

Thus, one sees that O(3; D) has much in common with IO(3; R). For one thing, any
dual orthogonal 3×3 matrix $\underline{R} = R + \varepsilon A$ can be expressed as a
product of a rotation and a translation:
\[ \underline{R} = R\,(I + \varepsilon\, \tilde{R} A), \]

and the two subgroups O(3; R) and R3 intersect only at the identity matrix. Thus, both
O(3; D) and IO(3) have R3 × O(3; R) as their underlying group manifold. However,
when one compares the group multiplications, one sees that although the association of
rotations is straightforward, the association of the translations with the dual matrices of
the form I + εA is not. The product (a, R)(a′, R′) has a translational part a + Ra′, while
the product of two dual orthogonal matrices (R + εA)(R′ + εA′) has a pure dual part of AR′
+ RA′, so a direct association of a with A and a′ with A′ does not appear to be
consistent.
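
A small numerical check can make the orthogonality condition concrete. The sketch below assumes that a dual 3×3 matrix is stored as a pair (real part, pure dual part) of nested lists and that ad(a) is the anti-symmetric matrix with ad(a)b = a × b; the function names are hypothetical. It evaluates the transpose-times-matrix product for a translation I + ε ad(a) and for a matrix of the form R(I + εA).

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def ad(a):
    """Anti-symmetric matrix of the real 3-vector a, so that ad(a) b = a x b."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]]


def dual_gram(X):
    """X^T X for X = R + eps A, namely R^T R + eps (R^T A + A^T R)."""
    R, A = X
    Rt, At = transpose(R), transpose(A)
    return (matmul(Rt, R), matadd(matmul(Rt, A), matmul(At, R)))


I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gram_real, gram_dual = dual_gram((I3, ad((1.0, -2.0, 0.5))))

# A rotation by a quarter turn about the z-axis, times a translation.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
A = matmul(R, ad((3.0, 1.0, 2.0)))       # pure dual part of R (I + eps ad(b))
rot_real, rot_dual = dual_gram((R, A))
translation_part = matmul(transpose(R), A)   # recovers ad((3, 1, 2))
```

In both cases the real part of the product is the identity and the pure dual part vanishes, whereas a pure dual part that is not anti-symmetric would leave a non-zero remainder.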
However, O(3; D) is, in fact, isomorphic to IO(3; R), by a different association of the
translations with pure dual matrices, but the isomorphism is easier to exhibit at the
infinitesimal level of Lie algebras, since the adjoint map that one uses is more germane to
the Lie algebras than the Lie groups themselves. If v + ω ∈ iso(3; R) and one
associates that element with the anti-symmetric dual matrix ω + ε ad(v) ∈ so(3; D) then
one finds that not only does this give an R-linear isomorphism of the two vector spaces,
but the Lie brackets are consistent, as well:

[v + ω, v′ + ω′] = [v, ω′] + [ω, v′] + [ω, ω′] = −ω′v + ωv′ + [ω, ω′],

[ω + ε ad(v), ω′ + ε ad(v′)] = [ω, ω′] + ε([ad(v), ω′] + [ω, ad(v′)])
= [ω, ω′] + ε([v, ω′] + [ω, v′]),

since [v, ω′] + [ω, v′] is the translational part of the former expression.
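
The consistency of the two brackets rests on the fact that ad converts the action of an anti-symmetric matrix on a vector into a matrix commutator: if ω = ad(w) then [ω, ad(v)] = ad(ωv) = ad(w × v). The following sketch checks that identity for one sample pair of vectors; the helper names and the sample values are illustrative assumptions.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ad(a):
    """Anti-symmetric matrix with ad(a) b = a x b."""
    a1, a2, a3 = a
    return [[0.0, -a3, a2], [a3, 0.0, -a1], [-a2, a1, 0.0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(XY, YX)]


w = (1.0, 2.0, 3.0)    # axis vector of the infinitesimal rotation omega = ad(w)
v = (-1.0, 0.0, 2.0)   # infinitesimal translation
lhs = commutator(ad(w), ad(v))   # [omega, ad(v)]
rhs = ad(cross(w, v))            # ad(omega v), since omega v = w x v
```

Both sides agree, which is what allows the pure dual part of the matrix commutator to be read as the translational part of the bracket in iso(3; R).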
In order to get some idea of why the association of group elements is more
complicated, one can examine what happens to the exponential of an anti-symmetric dual
matrix. First, one notes that if $\underline{\omega} = \omega + \varepsilon V$ then:

\[ \underline{\omega}^n = \omega^n + \varepsilon \sum_{k=0}^{n-1} \omega^{n-k-1}\, V\, \omega^k. \tag{9.6} \]

If the matrices ω and V commuted then the summation would give simply:

\[ n\, \omega^{n-1} V = \frac{d\omega^n}{d\omega}\, V, \]

which is analogous to what happened for functions of dual variables, but, of course,
matrix multiplication is not generally commutative.
When one forms the exponential sum:

\[ \exp \underline{\omega} = \sum_{n=0}^{\infty} \frac{1}{n!}\, \underline{\omega}^n = \sum_{n=0}^{\infty} \frac{1}{n!}\, \omega^n + \varepsilon \left( \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n-1} \omega^{n-k-1}\, V\, \omega^k \right), \tag{9.7} \]

one sees that it has exp ω for its real part, but a more complicated expression than one
might prefer for the pure dual part. If ω and V commuted then it would simplify to (exp
ω)V, but that would hardly be typical.

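The reduction from O(n; D) to SO(n; D) comes about by defining a volume element
on Dn. For a given basis $\{\underline{e}_1, \ldots, \underline{e}_n\}$ one can first
define its reciprocal basis $\{\underline{\theta}^1, \ldots, \underline{\theta}^n\}$ on
Dn* in the usual way:
\[ \underline{\theta}^i(\underline{e}_j) = \delta^i_j, \]

and then form the dual n-form $\underline{V} = \underline{\theta}^1 \wedge \cdots \wedge \underline{\theta}^n$,
which is now a completely anti-symmetric D-multilinear functional on Dn. When each
$\underline{\theta}^i$ is expressed in real-plus-pure-dual form as $\theta^i + \varepsilon\eta^i$,
one sees that the only mixed products of real and pure dual 1-forms that survive the
multiplication must include only one pure dual 1-form $\varepsilon\eta^i$, so
$\underline{V}$ takes the form:

\[ \underline{V} = \theta^1 \wedge \cdots \wedge \theta^n + \varepsilon(\eta^1 \wedge \theta^2 \wedge \cdots \wedge \theta^n + \cdots + \theta^1 \wedge \cdots \wedge \theta^{n-1} \wedge \eta^n), \tag{9.8} \]

which we can also write in the more concise form:

\[ \underline{V} = V + \varepsilon(\eta^i \wedge \# e_i), \tag{9.9} \]

in which we have defined:
\[ V = \theta^1 \wedge \cdots \wedge \theta^n, \qquad \# e_i = i_{e_i} V. \]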

When one changes to another basis $\underline{L}(\underline{e}_i) = \underline{e}_j\, \underline{L}^j_i$
by way of an invertible D-linear map $\underline{L}$, the effect on $\underline{V}$ is to
multiply it by $\det(\underline{L}^i_j)$, which is a dual number, now, although as an
algebraic expression in the components of $\underline{L}^i_j$ it is the same as in the
real case. Thus, $\underline{L}$ preserves the volume element $\underline{V}$ iff
$\det(\underline{L}^i_j) = 1$, relative to the chosen basis. The invertible D-linear
transformations that preserve the volume element or the invertible dual matrices with
determinant 1 then form a subgroup SL(n; D) of GL(n; D), and one can also restrict
O(n; D) to SO(n; D).
One can define dual eigenvectors and dual eigenvalues in the predictable way: A dual
vector $\underline{v}$ is an eigenvector of a dual linear map $\underline{L}$ with
eigenvalue $\underline{\lambda}$ ∈ D iff:

\[ \underline{L}\,\underline{v} = \underline{\lambda}\,\underline{v}. \]

When one puts everything into real-plus-pure-dual form – $\underline{L} = L + \varepsilon A$,
$\underline{v} = v + \varepsilon a$, $\underline{\lambda} = \lambda + \varepsilon\alpha$ –
one gets:
\[ Lv + \varepsilon(La + Av) = \lambda v + \varepsilon(\lambda a + \alpha v). \]

Thus, one must necessarily have that v is an eigenvector of L with an eigenvalue λ,
but when one equates the pure dual parts of the equation, one gets:

\[ La + Av = \lambda a + \alpha v, \]

which is not as strong a condition on a as the condition on v.


We can rewrite this latter condition as:

(L − λI)a = − (A − αI)v,

and this shows us that if a is not an eigenvector of L with eigenvalue λ then the matrix (L
– λI) is invertible, and the condition on a is that:

a = − (L − λI)−1(A − αI)v,

and α can be arbitrary. This is reminiscent of the fact that the eigenvalues of a given
translation a can take on any value, depending upon what vector one chooses to be
collinear to a.
If a is an eigenvector of L with eigenvalue λ then if λ is non-degenerate, a must be
collinear with v, and v must be an eigenvector of A with eigenvalue α. This would imply
that LAv = ALv, which is not as strong as saying that A and L must commute, unless the
same condition is true for all their eigenvectors. If λ is degenerate then a can be simply
contained in the same eigenspace as v, without being collinear, but v would still have to
be an eigenvector of A with eigenvalue α.

5. The algebra of dual quaternions [3-5]. One can regard the real vector space H
⊗ D as a D-module, which we then denote by HD , by forming linear combinations of the
canonical basis vectors eµ in R4 with coefficients in the ring D to give dual quaternions:

\[ \underline{q} = (q^\mu + \varepsilon r^\mu)\, e_\mu = q^\mu e_\mu + r^\mu\, \varepsilon e_\mu = q + \varepsilon r. \tag{10.1} \]

Thus, one can also regard a dual quaternion as a pair of elements q, r in R4; i.e., an
element of R8. One refers to q as the quaternion part $Q(\underline{q})$ of
$\underline{q}$ and εr as the pure dual quaternion part $DQ(\underline{q})$. This gives
two complementary projections of HD onto the two direct summands of H ⊕ H, so one can
say that I = Q + DQ. One defines them more specifically by polarizing the automorphism
of HD that takes any $\underline{q} = q + \varepsilon r$ to:

\[ \underline{q}^{*} = q - \varepsilon r, \tag{10.2} \]
which makes:
\[ Q(\underline{q}) = q = \tfrac{1}{2}(\underline{q} + \underline{q}^{*}), \qquad DQ(\underline{q}) = \varepsilon r = \tfrac{1}{2}(\underline{q} - \underline{q}^{*}). \tag{10.3} \]

One can also decompose HD into a direct sum of the form D ⊕ DV, where D is a
two-real-dimensional subalgebra that is isomorphic to D and DV is a six-real-dimensional
subspace of dual quaternions of the form:

\[ \mathbf{q} + \varepsilon\mathbf{r} = (q^i + \varepsilon r^i)\, e_i, \tag{10.4} \]

that one refers to as dual quaternions of vector type or, more concisely, as dual vectors.
A typical dual quaternion is then expressed in “scalar-plus-vector” form as:

\[ \underline{q} = \underline{q}^0 + \underline{\mathbf{q}}. \tag{10.5} \]

This decomposition of HD defines a decomposition I = DS + DV of the identity
operator into a sum of projections onto the relevant subspaces. One can define them more
specifically by:
\[ DS(\underline{q}) = \underline{q}^0 = \tfrac{1}{2}(\underline{q} + \bar{\underline{q}}), \qquad DV(\underline{q}) = \underline{\mathbf{q}} = \tfrac{1}{2}(\underline{q} - \bar{\underline{q}}), \tag{10.6} \]

in which the conjugate of a dual quaternion is defined by:

\[ \bar{\underline{q}} = \underline{q}^0 - \underline{\mathbf{q}} = \bar{q} + \varepsilon\, \bar{r}. \tag{10.7} \]

The algebra of HD is simply the one that it inherits from H by D-bilinearity, modulo
the relation ε2 = 0. One can first extend the basis {eµ , µ = 0, …, 3} for R4 to the basis
{eµ , εeµ , µ = 0, …, 3} for R8 and then define the missing products:

eµ (εeν ) = (εeµ )eν = ε (eµ eν), (εeµ )(εeν) = 0 . (10.8)

One can also define the product of any two dual quaternions q + εr and q′ + εr′ by D-
bilinearity:
q q ′ = (q +εr)(q′ + εr′) = qq′ + ε(rq′ + qr′). (10.9)

Thus, HD is still an associative, but not commutative, ring and has a unity element 1 that
is still defined by e0, with a center that is defined by all of the dual scalars, which have
the form (q0 + εr0) e0 .
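
The product (10.9) is straightforward to implement once the Hamilton product of real quaternions is available. The following is a minimal sketch in which a quaternion is assumed to be stored as a tuple (scalar, x, y, z) and a dual quaternion q + εr as the pair (q, r); the function names and sample values are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of real quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))


def dq_mul(X, Y):
    """(q + eps r)(q' + eps r') = qq' + eps(rq' + qr'), as in (10.9)."""
    (q, r), (q2, r2) = X, Y
    return (qmul(q, q2), qadd(qmul(r, q2), qmul(q, r2)))


ZERO = (0.0, 0.0, 0.0, 0.0)
a = ((1.0, 2.0, 0.0, -1.0), (0.5, 0.0, 3.0, 1.0))
b = ((0.0, 1.0, 1.0, 0.0), (2.0, 0.0, 0.0, 1.0))
c = ((1.0, 0.0, 0.0, 2.0), (0.0, 1.0, 0.0, 0.0))

associative = dq_mul(dq_mul(a, b), c) == dq_mul(a, dq_mul(b, c))
commutative = dq_mul(a, b) == dq_mul(b, a)
# Two pure dual quaternions always multiply to zero.
pure_product = dq_mul((ZERO, (1.0, 2.0, 3.0, 4.0)), (ZERO, (0.0, -1.0, 5.0, 2.0)))
```

For these sample values the product is associative but not commutative, and the product of the two pure dual quaternions vanishes.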
In particular, the square of a dual quaternion takes the form:

q 2 = q2 + ε(qr + rq). (10.10)

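One finds that the product of dual quaternions admits an expansion that is analogous
to the one for real quaternions:

\[ \underline{q}\,\underline{q}' = (\underline{q}, \underline{q}') + \underline{q}^0\, \underline{\mathbf{q}}' + \underline{q}'^0\, \underline{\mathbf{q}} + \underline{\mathbf{q}} \times \underline{\mathbf{q}}', \tag{10.11} \]
in which
\[ (\underline{q}, \underline{q}') = DS(\underline{q}\,\underline{q}') = \underline{q}^0\, \underline{q}'^0 - \sum_{i,j=1}^{3} \delta_{ij}\, \underline{q}^i\, \underline{q}'^j, \tag{10.12} \]

and one finds that if $\underline{\mathbf{q}} = \mathbf{q} + \varepsilon\mathbf{r}$ and
$\underline{\mathbf{q}}' = \mathbf{q}' + \varepsilon\mathbf{r}'$ then:

\[ \underline{\mathbf{q}} \times \underline{\mathbf{q}}' = \tfrac{1}{2}[\underline{\mathbf{q}}, \underline{\mathbf{q}}'] = \mathbf{q} \times \mathbf{q}' + \varepsilon(\mathbf{q} \times \mathbf{r}' + \mathbf{r} \times \mathbf{q}'). \tag{10.13} \]

For dual quaternions of dual vector type, one then has:

\[ \underline{\mathbf{q}}\,\underline{\mathbf{q}}' = -\left( \sum_{i,j=1}^{3} \delta_{ij}\, \underline{q}^i\, \underline{q}'^j \right) + \underline{\mathbf{q}} \times \underline{\mathbf{q}}'. \tag{10.14} \]
It is also useful to know that:

\[ (\underline{\mathbf{a}} \times \underline{\mathbf{b}}) \times \underline{\mathbf{c}} = \langle \underline{\mathbf{a}}, \underline{\mathbf{c}} \rangle\, \underline{\mathbf{b}} - \langle \underline{\mathbf{b}}, \underline{\mathbf{c}} \rangle\, \underline{\mathbf{a}}, \tag{10.15} \]

in which:
\[ \langle \underline{\mathbf{a}}, \underline{\mathbf{b}} \rangle = -DS(\underline{\mathbf{a}}\,\underline{\mathbf{b}}) = \sum_{i,j=1}^{3} \delta_{ij}\, \underline{a}^i\, \underline{b}^j. \tag{10.16} \]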

Like the algebra D itself, the algebra HD has divisors of zero; in particular, from the
last set of equations in (10.8), one sees that the product of any two pure dual quaternions
is zero. They are then nilpotents of degree two; in fact they are the only ones. In order to
verify this, one goes back to the expression (10.10) for $\underline{q}^2$ and sets it
equal to zero. This gives:
\[ q^2 = 0, \qquad qr + rq = 0. \]

Hence, either q = 0 or q is a non-trivial nilpotent real quaternion, which we have seen is
impossible.
As for idempotents, one sets $\underline{q}^2 = \underline{q}$ in (10.10) and deduces the
conditions:

\[ q^2 = q, \qquad qr + rq = r. \]

Thus q must be an idempotent in H, which can only mean 0 or 1. In either case, the second
condition forces r to vanish, so $\underline{q}$ itself is 0 or 1. Hence, there are no
non-trivial idempotents in HD, either.
In order to find the invertible elements, one can look at (10.9) when:

\[ qq' = 1, \qquad rq' + qr' = 0. \]

If q is invertible then this can be solved uniquely by setting:

\[ q' = q^{-1} = \frac{\bar{q}}{\| q \|^2}, \qquad r' = -q^{-1}\, r\, q^{-1}. \]

Thus, $\underline{q} = q + \varepsilon r$ is invertible iff q is invertible iff q ≠ 0, and:

\[ \underline{q}^{-1} = q^{-1} - \varepsilon\, q^{-1}\, r\, q^{-1}. \tag{10.17} \]

Hence, HD, unlike H, is not a division algebra, although it contains a subset Q ∗ that
defines a multiplicative group, namely, the set complement of the vector subspace of pure
dual quaternions. As a set, it takes the form of the product Q* × R4, since the only
restriction on the quaternion part q is that it be non-zero, while there is no restriction
placed on the pure dual quaternion r. The group structure is somewhat more involved,
since although the non-zero quaternions still have the same group structure − namely, Q *
− as before, the pure dual quaternions by themselves have no group structure (since their
product is always 0), and the extension of Q* to the pure dual quaternions does not have
an obvious interpretation, at the moment.
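
The inverse (10.17) can be checked numerically. As before, this is only a sketch: quaternions are assumed to be (scalar, x, y, z) tuples, a dual quaternion is a pair (q, r), and the function names are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of real quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])


def qinv(q):
    """Conjugate over norm-squared; raises ZeroDivisionError when q = 0."""
    n2 = sum(c * c for c in q)
    return tuple(c / n2 for c in qconj(q))


def dq_mul(X, Y):
    (q, r), (q2, r2) = X, Y
    dual = tuple(a + b for a, b in zip(qmul(r, q2), qmul(q, r2)))
    return (qmul(q, q2), dual)


def dq_inv(X):
    """Inverse of q + eps r as in (10.17): q^-1 - eps q^-1 r q^-1."""
    q, r = X
    qi = qinv(q)
    return (qi, tuple(-c for c in qmul(qmul(qi, r), qi)))


X = ((1.0, 2.0, 0.0, -1.0), (0.5, 0.0, 3.0, 1.0))
product = dq_mul(X, dq_inv(X))   # close to ((1, 0, 0, 0), (0, 0, 0, 0))
```

Calling `dq_inv` on a pure dual quaternion fails with a division by zero, which reflects the fact that the invertible elements are exactly those with non-zero quaternion part.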
Nonetheless, one finds that the dual quaternions of the form 1 + εq do form a
subgroup of Q ∗ , and the product of two of them takes the form:

(1+ εq)(1 + εq′) = 1 + ε(q + q′). (10.18)

Thus, the subgroup that they define is isomorphic to the four-dimensional translation
group. Therefore, since the subgroup of non-zero pure quaternions and the subgroup of
translations have only the element 1 in common and both are four-dimensional, we see
that the group Q ∗ is eight dimensional.
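In order to go further into the structure of the group Q*, we now examine the nature
of the most natural scalar product that we can define on HD , which is analogous to the
one that we defined on H, except that it takes its values in D, not R:

\[ \langle \underline{q}, \underline{q}' \rangle = DS(\underline{q}\,\bar{\underline{q}}') = \langle q, q' \rangle + \varepsilon(\langle r, q' \rangle + \langle q, r' \rangle). \tag{10.19} \]
In particular:
\[ \langle \underline{q}, \underline{q} \rangle = \| q \|^2 + 2\varepsilon\, \langle q, r \rangle \equiv \| \underline{q} \|^2, \tag{10.20} \]

and a dual quaternion $\underline{q}$ is said to be a unit dual quaternion iff
$\| \underline{q} \| = 1$, which is true iff:

\[ \| q \| = 1, \qquad \langle q, r \rangle = 0. \tag{10.21} \]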

The first condition says that the quaternion part q must lie on the unit sphere in H.
The second condition defines a homogeneous quadratic hypersurface in R8, and therefore
a quadric in RP7 that is called the Study quadric. When written out in terms of scalar and
vector parts, it reads:
\[ q^0 r^0 + \langle \mathbf{q}, \mathbf{r} \rangle = \sum_{\mu=0}^{3} q^\mu r^\mu = 0. \tag{10.22} \]

When one passes from a non-zero dual quaternion $\underline{q}$ to the line
$[\underline{q}]$ through the origin that it defines, the first condition in (10.21)
becomes superfluous.
We now see that for unit dual quaternions the expression (10.17) for the inverse
becomes:
\[ \underline{q}^{-1} = \bar{q} - \varepsilon\, \bar{q}\, r\, \bar{q}. \tag{10.23} \]

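One can now regard the set Q1 of all unit dual quaternions as a group, since if
$\underline{q}$ and $\underline{q}'$ are unit dual quaternions then the product
$\underline{q}\,\underline{q}' = qq' + \varepsilon(qr' + rq')$ is also a unit dual
quaternion. This follows from the fact that:

\[ \langle \underline{q}\,\underline{q}', \underline{q}\,\underline{q}' \rangle = DS(\underline{q}\,\underline{q}'\,\bar{\underline{q}}'\,\bar{\underline{q}}) = 1. \]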

The group Q1 includes SU(2) as a subgroup in the form of the unit quaternions, as
well as the translation group R3, in the form of all elements of the form 1 + εs. Indeed,
for a general element q + εs, the conditions that || q || = 1 and <q, s> = 0 amount
to the statement that q describes a point on the unit 3-sphere in H, while s lies in the plane
tangent to it. Thus, as a manifold, we can think of Q1 as the manifold TS3 = S3 × R3.

One should note that, from (10.20), HD admits null elements, for which
$\| \underline{q} \| = 0$. In fact, $\underline{q}$ is a null dual quaternion iff it is a
pure dual quaternion.
When the scalar product that we just defined is restricted to dual quaternions of vector
type – i.e., dual vectors − one gets:

< q, q′ > = <q, q′> + ε(<r, q′> + <q, r′>). (10.24)

The norm-squared is then:

|| q ||2 = < q, q > = || q ||2 + 2ε <q, r>, (10.25)

and a unit dual quaternion $\underline{\mathbf{q}}$ of dual vector type must satisfy:

\[ \| \mathbf{q} \|^2 = 1, \qquad \langle \mathbf{q}, \mathbf{r} \rangle = 0, \tag{10.26} \]

which represents a unit vector q in R3 and a vector r of unspecified length that is
perpendicular to it. One then refers to such a $\underline{\mathbf{q}}$ as a dual unit
vector. The set of all dual unit vectors is then TS2, which only locally looks like
S2 × R2, since S2 is not parallelizable. Thus, that set cannot be a group manifold, since
every Lie group is parallelizable. Indeed, the product of two dual unit vectors q + εr
and q′ + εr′ is qq′ + ε(rq′ + qr′), which does not have to be a dual vector, since it has
a dual scalar part equal to − <q, q′> − ε(<r, q′> + <q, r′>) =
$-\langle \underline{\mathbf{q}}, \underline{\mathbf{q}}' \rangle$.
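We can express any dual quaternion in a polar form that is analogous to the one
obtained for real quaternions, namely:

\[ \underline{q} = \| \underline{q} \| \left( \cos\tfrac{1}{2}\underline{\theta} + \sin\tfrac{1}{2}\underline{\theta}\; \underline{\mathbf{u}} \right). \tag{10.27} \]

The factor in parentheses then represents a typical unit dual quaternion, where
$\underline{\theta} = \theta + \varepsilon s$ is a dual angle and
$\underline{\mathbf{u}} = \mathbf{u} + \varepsilon\mathbf{m}$ is a dual unit vector.
From (10.27), together with (8.4) and (8.5), the polar form of a unit dual quaternion is then:

\[ \underline{u} = \cos\tfrac{1}{2}\underline{\theta} + \sin\tfrac{1}{2}\underline{\theta}\; \underline{\mathbf{u}} = \left( \cos\tfrac{1}{2}\theta + \sin\tfrac{1}{2}\theta\; \mathbf{u} \right) + \varepsilon \left[ -\frac{s}{2} \left( \sin\tfrac{1}{2}\theta - \cos\tfrac{1}{2}\theta\; \mathbf{u} \right) + \sin\tfrac{1}{2}\theta\; \mathbf{m} \right], \tag{10.28} \]

which is analogous to the form that we introduced for real quaternions. This time, the
rigid motion involves a dual axis $\underline{\mathbf{u}} = \mathbf{u} + \varepsilon\mathbf{m}$
(m = x × u) that represents a dual unit vector and a dual angle
$\underline{\theta} = \theta + \varepsilon s$ that combines a rotation around u by the
angle θ with a translation along u through a distance s. This differs from the previous
case of real quaternions also by the fact that the rotational axis u no longer has to go
through the origin.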
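
To see that (10.28) does produce a unit dual quaternion, one can assemble it from an angle θ, a distance s, a unit direction u, and a point x on the axis, and then evaluate the two conditions in (10.21). The sketch below assumes the tuple layout (scalar, x, y, z) for quaternions; the function name `screw` and the sample values are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def screw(theta, s, u, x):
    """Real and pure dual parts of (10.28) as (w, x, y, z) tuples.

    u is assumed to be a unit vector and x any point on the axis.
    """
    m = cross(x, u)                        # moment of the axis about O
    c, sn = math.cos(theta / 2), math.sin(theta / 2)
    real = (c, sn * u[0], sn * u[1], sn * u[2])
    dual = (-(s / 2) * sn,
            (s / 2) * c * u[0] + sn * m[0],
            (s / 2) * c * u[1] + sn * m[1],
            (s / 2) * c * u[2] + sn * m[2])
    return real, dual


q, r = screw(1.1, 0.8, (0.0, 0.6, 0.8), (1.0, -2.0, 0.5))
norm_squared = sum(a * a for a in q)            # first condition in (10.21)
study = sum(a * b for a, b in zip(q, r))        # second condition in (10.21)
```

The second condition holds because m = x × u is perpendicular to u, so the only surviving terms cancel in pairs.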

There is a simple isomorphic representation of the Lie algebra iso(3) in the Lie
algebra q0 of dual vectors given the commutator bracket. If one associates the
infinitesimal rigid motion ω + v with the dual vector ½(ω + εv) then one sees that, from
the bilinearity of the commutator bracket, the Lie bracket of two such infinitesimal
motions:

[ω + v, ω′ + v′] = [ω, ω′] + [ω, v′] + [v, ω′] = ω × ω′ + ω × v′ + v × ω′

(since [v, v′] = 0 for the translations) is consistent with the commutator of the
corresponding dual vectors:

½ [ω + εv, ω′ + εv′] = ½ {[ω, ω′] + ε([v, ω′] + [ω, v′])} = ω × ω′ + ε(ω × v′ + v × ω′),

in which we have used the fact that [q, q′] = 2 q × q′ for quaternions of vector type.
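
The identity used here can be confirmed by computing half of the commutator of two dual vectors directly from the quaternion product. The sketch below assumes the same tuple representation as before, with a dual vector ω + εv stored as a pair of pure quaternions; all names and sample values are illustrative.

```python
def qmul(p, q):
    """Hamilton product of real quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def dq_mul(X, Y):
    (q, r), (q2, r2) = X, Y
    dual = tuple(a + b for a, b in zip(qmul(r, q2), qmul(q, r2)))
    return (qmul(q, q2), dual)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pure(v):
    """Embed a 3-vector as a quaternion of vector type."""
    return (0.0, v[0], v[1], v[2])


def half_commutator(X, Y):
    XY, YX = dq_mul(X, Y), dq_mul(Y, X)
    return tuple(tuple(0.5 * (a - b) for a, b in zip(p, q)) for p, q in zip(XY, YX))


w, v = (1.0, 0.0, 2.0), (0.0, 3.0, 1.0)       # omega and v
w2, v2 = (2.0, 1.0, 0.0), (-1.0, 0.0, 1.0)    # omega' and v'
lhs = half_commutator((pure(w), pure(v)), (pure(w2), pure(v2)))
rhs = (pure(cross(w, w2)),
       pure(tuple(a + b for a, b in zip(cross(w, v2), cross(v, w2)))))
```

The real part reproduces ω × ω′ and the pure dual part reproduces ω × v′ + v × ω′, as in the bracket of iso(3).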

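6. The action of rigid motions on dual quaternions. The group of unit dual
quaternions can be used to represent rigid motions of affine E3 and its bundle SO(E3) of
orthonormal affine frames.
The action of unit dual quaternions on dual quaternions is simply the predictable
extension of the action $q' = u\, q\, \bar{u}$ of a unit quaternion u on a quaternion q
to an action of a unit dual quaternion $\underline{u}$ on a dual quaternion
$\underline{q}$, namely:

\[ Q_1 \times H_D \to H_D, \qquad \underline{q} \mapsto \underline{q}' = \underline{u}\,\underline{q}\,\bar{\underline{u}}. \tag{11.1} \]

The proof that the transformation of HD is D-linear, surjective, and two-to-one
follows from the same proofs as in the case of H, since we are still assuming that
$\underline{u}$ is invertible.
The proof that the action is by isometries of <.,.> is straightforward, since:

\[ \langle \underline{p}', \underline{q}' \rangle = DS(\underline{u}\,\underline{p}\,\bar{\underline{u}}\,\underline{u}\,\bar{\underline{q}}\,\bar{\underline{u}}) = DS(\underline{u}\,\underline{p}\,\bar{\underline{q}}\,\bar{\underline{u}}) = DS(\underline{p}\,\bar{\underline{q}}) = \langle \underline{p}, \underline{q} \rangle, \]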



in which we have implicitly used the fact that the action (11.1) has DSHD and DVHD as
invariant subspaces. Therefore, the action thus defined takes dual scalars to other dual
scalars and dual vectors to dual vectors.
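
The isometry property and the invariance of the dual vectors can be illustrated numerically. The sketch below builds a unit dual quaternion by normalizing a sample quaternion part and projecting a sample dual part onto its orthogonal complement (one convenient way to satisfy (10.21), chosen here only for illustration), then applies the action (11.1) to two dual vectors. All names and values are hypothetical.

```python
import math


def qmul(p, q):
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])


def dq_mul(X, Y):
    (q, r), (q2, r2) = X, Y
    dual = tuple(a + b for a, b in zip(qmul(r, q2), qmul(q, r2)))
    return (qmul(q, q2), dual)


def dq_conj(X):
    return (qconj(X[0]), qconj(X[1]))


def act(U, X):
    """The action (11.1): X goes to U X conj(U)."""
    return dq_mul(dq_mul(U, X), dq_conj(U))


def dual_inner(X, Y):
    """Dual scalar part of X conj(Y), returned as (real part, dual part)."""
    P = dq_mul(X, dq_conj(Y))
    return (P[0][0], P[1][0])


def make_unit(q0, r0):
    """Normalize q0 and remove from r0 its component along q0."""
    n = math.sqrt(sum(c * c for c in q0))
    q = tuple(c / n for c in q0)
    k = sum(a * b for a, b in zip(q, r0))
    return (q, tuple(rc - k * qc for rc, qc in zip(r0, q)))


U = make_unit((1.0, 2.0, -1.0, 0.5), (0.3, -1.0, 2.0, 1.0))
P = ((0.0, 1.0, 2.0, 3.0), (0.0, -1.0, 0.5, 2.0))   # dual vector
Q = ((0.0, 0.0, 1.0, -1.0), (0.0, 2.0, 2.0, 0.0))   # dual vector
before = dual_inner(P, Q)
after = dual_inner(act(U, P), act(U, Q))            # agrees with `before`
image = act(U, P)                                   # dual scalar part stays zero
```

Up to rounding, the dual scalar product is unchanged and the image of a dual vector has vanishing dual scalar part.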
Since the action (11.1) is linear in q for each u and has DVHD for an invariant
subspace, if one chooses a basis { e i , i = 1, 2, 3} for DVHD then one can associate the
action of u with an invertible dual 3×3 matrix R ij by way of:

u e i u = e j Ri j . (11.2)

The fact that the action is by isometries then implies that R ij ∈ SO(3; D) and the
resulting map Q1 → SO(3; D), u ֏ R ij is then a two-to-one homomorphism of the
group of unit dual quaternions with the group of three-dimensional rigid motions, as
represented by the volume-preserving, dual orthogonal transformations. The fact that the
association preserves the product follows from the fact that the product u u′ of two unit
dual quaternions acts on a dual orthonormal frame e i to give:

u u′ e i u u′ = u (u′ e i u′)u = (u e j u ) R′i j = e j R′k j Rik .

Thus, the association takes the quaternion product $\underline{u}\,\underline{u}'$ to the
matrix product $\underline{R}'^j_k\, \underline{R}^k_i$, so it is an order-reversing
two-to-one homomorphism from Q1 to SO(3; D).

The transition from an action of Q1 on dual vectors in DVHD to an action on
orthonormal affine frames (x, fi) at the various points of E3 is also straightforward,
since one has the vector x from O to x, while the unit vectors fi define moments
mi = x × fi about O. The orthonormal affine frame (x, fi) at the point x ∈ E3 then
becomes the three dual unit vectors:
\[ \underline{\mathbf{f}}_i = \mathbf{f}_i + \varepsilon\, \mathbf{m}_i. \tag{11.3} \]
One finds that:
\[ \langle \underline{\mathbf{f}}_i, \underline{\mathbf{f}}_j \rangle = \delta_{ij} + \varepsilon(\langle \mathbf{f}_i, \mathbf{m}_j \rangle + \langle \mathbf{m}_i, \mathbf{f}_j \rangle). \tag{11.4} \]
One then observes that:
\[ \langle \mathbf{f}_i, \mathbf{m}_j \rangle + \langle \mathbf{m}_i, \mathbf{f}_j \rangle = \det[\mathbf{f}_i \,|\, \mathbf{x} \,|\, \mathbf{f}_j] + \det[\mathbf{x} \,|\, \mathbf{f}_i \,|\, \mathbf{f}_j] = 0, \tag{11.5} \]

since switching any two columns in a matrix changes the sign of the determinant. We
can therefore assert that if (x, fi) is an orthonormal affine frame at a point x in affine E3
then the dual frame $\underline{\mathbf{f}}_i$ that was defined in (11.3) is an
orthonormal dual frame in DVHD .
Since the action (11.1) is by D-linear isometries, and (x, fi) = (O, ei)(xi, Ri j ), one sees
that:

f i = e j Ri j + ε x × ei = e j Ri j + ε ei ad(x) = ej ( Ri j + ε [ad(x)]ij ) = ej R ij ,

in which we have introduced the right-adjoint operator for the vector x:

y ad(x) = x × y = − εijk xk yj ei , (11.6)


and the dual matrix:
R ij = Ri j + ε x k [ad(e k )]ij = Ri j − ε ε ijk x k . (11.7)

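One finds that there is an analogue of Rodrigues’s formula that applies to dual
vectors:
\[ \underline{u}\,\underline{\mathbf{x}}\,\bar{\underline{u}} = \cos\underline{\theta}\; \underline{\mathbf{x}} + (1 - \cos\underline{\theta})\, \langle \underline{\mathbf{u}}, \underline{\mathbf{x}} \rangle\, \underline{\mathbf{u}} + \sin\underline{\theta}\; \underline{\mathbf{u}} \times \underline{\mathbf{x}}. \tag{11.8} \]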

7. Some line geometry. One use for the set of dual unit vectors is that they give a
faithful representation of the manifold of all lines in E3, which we regard as an affine
space, not a vector space, now. First, one represents a line [l] in E3 by means of two
(tangent) vectors x and u, where x is the displacement vector x − O that takes one from a
chosen reference point O to a point x on [l] and u is a unit vector tangent to x that defines
the direction of [l]. Since – u would also define the direction of [l], unless one orients the
line (which would then make it a “spear,” in the language of Study), the set of all lines
through x is a manifold that is diffeomorphic to RP2, which is doubly covered by S2.
Thus, one can associate [l] with a line through the origin in H that lies in the subspace of
pure quaternions by taking u to the corresponding pure quaternion, and thus with two
points of intersection with the unit sphere in the subspace. As for the vector x, one finds
it more convenient to define the moment m of u about O by way of:

m = x × u, (12.1)

which is then independent of the choice of x, but dependent on the choice of O, since any
other point x′ on [l] could be expressed by a position vector of the form x + αu, for some
scalar α, and the moment of u about O would then go to:

(x + αu) × u = m + α u × u = m.

One can define a canonical choice of x by specifying that x be perpendicular to u.


A different choice of O – say, O′ = O + s – would then make the new moment of u
take the form:
m′ = (x + s) × u = m + s × u,

and this would be unchanged only if s were parallel to u. Since this defines a line [a]
through O that is parallel to [l], one can also think of m as the moment of [l] about [a].
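
The behavior of the moment under these changes is easy to verify with concrete vectors. The following sketch uses sample values chosen so that the arithmetic is exact; the helper names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(a, b):
    return tuple(p + q for p, q in zip(a, b))


def scale(k, a):
    return tuple(k * p for p in a)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


x = (1.0, 2.0, 3.0)      # a point on the line, as a position vector from O
u = (0.0, 0.0, 1.0)      # unit direction of the line
m = cross(x, u)          # moment (12.1)

m_other_point = cross(add(x, scale(2.5, u)), u)       # another point on the line
m_parallel_shift = cross(add(x, scale(-4.0, u)), u)   # shift s parallel to u
m_general_shift = cross(add(x, (0.5, 0.0, 0.0)), u)   # shift s not parallel to u
```

The first two results coincide with m, the last one differs from it by s × u, and m is perpendicular to u in every case.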
Another advantage of using m, in place of x, is that m will automatically be
perpendicular to u. Thus, if one associates the line [l] with the pair of vectors (u, m) and
then with the dual vector $\underline{\mathbf{l}} = \mathbf{u} + \varepsilon\mathbf{m}$,
one finds that $\underline{\mathbf{l}}$ is also a dual unit vector.
Conversely, every dual unit vector u + εm (and its negative) defines a line in E3 by way
of the line through the origin of H that contains u and the vector m that is tangent to the
unit 2-sphere, which then gives the moment of the line about O, and thus, the canonical
normal x from O to that line.
If one has two lines [l] and [l′] in E3 that have a distance of closest approach s that lies
along a common perpendicular to both lines, which is described by the points x and x′,
and define an angle θ when they are parallel-translated along that line until they intersect
then one finds that:
<u, u′> = cos θ, (12.2)
<m, u′> + <m′, u> = − det[x′ − x | u | u′] = − s sin θ, (12.3)

in which m = x × u and m′ = x′ × u′ are the moments of the two lines, as in (12.1).
When the two lines are represented by dual unit vectors $\underline{\mathbf{l}}$ and
$\underline{\mathbf{l}}'$, one finds, from (8.4), that the latter set of equations
consolidate into:

\[ \langle \underline{\mathbf{l}}, \underline{\mathbf{l}}' \rangle = \cos\underline{\theta}, \qquad \underline{\theta} = \theta + \varepsilon s. \tag{12.4} \]

Thus, the orthogonality of the lines, in the dual sense, is equivalent to the statement
that the lines intersect at a right angle, since the real part of $\underline{\theta}$
describes the angle between them and the dual part describes the distance of closest
approach.
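
A concrete pair of lines shows how the dual scalar product encodes both the angle and the distance. In the sketch below, the first line is the x-axis and the second is obtained by rotating it through θ about the z-axis and lifting it a distance s along that axis, so that the z-axis is the common perpendicular; these choices, and the function names, are assumptions made only for the example.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def line(x, u):
    """Dual unit vector u + eps m of the line through x with unit direction u."""
    return (u, cross(x, u))


def dual_dot(l1, l2):
    """<u + eps m, u' + eps m'> = <u, u'> + eps(<u, m'> + <m, u'>)."""
    (u, m), (u2, m2) = l1, l2
    return (dot(u, u2), dot(u, m2) + dot(m, u2))


theta, s = 0.6, 2.0
l1 = line((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
l2 = line((0.0, 0.0, s), (math.cos(theta), math.sin(theta), 0.0))
real_part, dual_part = dual_dot(l1, l2)
dual_cosine = (math.cos(theta), -s * math.sin(theta))   # (8.4) for theta + eps*s
```

The pair (real_part, dual_part) agrees with the dual cosine of θ + εs, and it vanishes only when θ is a right angle and s = 0.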
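One also finds that there is an analogue of the usual formula for the cross product:

\[ \underline{\mathbf{l}} \times \underline{\mathbf{l}}' = \sin\underline{\theta}\; \underline{\mathbf{n}}, \tag{12.5} \]

in which $\underline{\mathbf{n}}$ is the dual unit vector that is orthogonal to the plane
of $\underline{\mathbf{l}}$ and $\underline{\mathbf{l}}'$ in the right-hand sense.
The key to making the association of a rigid motion of E3 with $\underline{\mathbf{u}}$
and $\underline{\theta}$ is given by Chasles’s theorem, which we discussed above. The
dual unit vector $\underline{\mathbf{u}} = \mathbf{u} + \varepsilon\mathbf{m}$ then
defines the central axis [l] of the rigid motion with respect to some chosen point O in E3.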

8. The kinematics of translating rigid bodies. When a rigid body is allowed to
translate, as well as rotate, the rotating orthonormal frame fi(t) at the fixed point O
becomes the orthonormal frame (x(t), fi(t)) at a moving point x(t). One can then represent
(x(t), fi(t)) in the form:
\[ (x(t), \mathbf{f}_i(t)) = (x_0, \mathbf{f}_{0j})\,(s^j(t), R^j_i(t)) = (x_0 + \mathbf{s}(t),\; \mathbf{f}_{0j}\, R^j_i(t)), \tag{13.1} \]

in which x0 = x(0), f0i = fi(0), and s(t) = sj(t) f0j . One can also express this
relationship by the pair of equations:
\[ x(t) = x_0 + \mathbf{s}(t), \qquad \mathbf{f}_i(t) = \mathbf{f}_{0j}\, R^j_i(t). \tag{13.2} \]

If one assumes that all functions of time are sufficiently differentiable then a first
differentiation gives:
73 The representation of physical motions by quaternions

dx df
v= = sɺ , fɺi = i = f0j Rɺi j . (13.3)
dt dt

One solves for (x0, f0i) in terms of (x, fi):

x0 = x – s, f0i = fi Rɶi j , (13.4)

and substitutes this in (13.3) to get:

v = sɺ , fɺi = fi ωi j , (13.5)

in which we have introduce the angular velocity of the moving frame with respect to the
initial one:
ωi j = Rɺkj Rɶik . (13.6)

A second differentiation of (13.3) gives the acceleration of the moving frame relative
to the initial one:

$$ a = \frac{dv}{dt} = \ddot{s}, \qquad \ddot{f}_i = \frac{d\dot{f}_i}{dt} = f_{0j} \ddot{R}^j_i. \tag{13.7} $$

When one substitutes for f0j, one gets:

$$ \ddot{f}_i = f_j \alpha^j_i, \tag{13.8} $$

in which we have introduced the angular acceleration of the moving frame:

$$ \alpha^j_i = \tilde{R}^j_k \ddot{R}^k_i. \tag{13.9} $$

One can also differentiate (13.5) to get:

$$ \ddot{f}_i = \dot{f}_j \omega^j_i + f_j \dot{\omega}^j_i = f_j (\omega^j_k \omega^k_i + \dot{\omega}^j_i), \tag{13.10} $$

which also makes:

$$ \alpha^j_i = \omega^j_k \omega^k_i + \dot{\omega}^j_i, \tag{13.11} $$

since fi is a frame.
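As a numerical sanity check on (13.11), one can take the simplest non-trivial case of a planar rotation through the angle θ(t) = t². The sketch below is only an illustration under that assumption; the helper names (matmul, rot, kinematics) are hypothetical and the derivatives of R are entered analytically rather than obtained from the text.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot(theta):
    """Planar rotation matrix through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

J = [[0.0, -1.0], [1.0, 0.0]]  # generator of planar rotations, J*J = -I

def kinematics(t):
    """Return (omega, omega_dot, alpha) for the assumed motion R(t) = rot(t**2)."""
    theta, theta_d, theta_dd = t * t, 2.0 * t, 2.0
    R, R_inv = rot(theta), rot(-theta)
    JR = matmul(J, R)
    # dR/dt = theta' J R  and  d2R/dt2 = theta'' J R - theta'^2 R
    R_dot = [[theta_d * x for x in row] for row in JR]
    R_ddot = [[theta_dd * JR[i][j] - theta_d ** 2 * R[i][j] for j in range(2)]
              for i in range(2)]
    omega = matmul(R_inv, R_dot)    # analogue of (13.6)
    alpha = matmul(R_inv, R_ddot)   # analogue of (13.9)
    omega_dot = [[theta_dd * x for x in row] for row in J]
    return omega, omega_dot, alpha

omega, omega_dot, alpha = kinematics(0.7)
ww = matmul(omega, omega)
# Largest deviation from alpha = omega*omega + omega_dot, i.e. from (13.11)
residual = max(abs(alpha[i][j] - (ww[i][j] + omega_dot[i][j]))
               for i in range(2) for j in range(2))
```

The residual should vanish up to rounding error, which reflects the fact that the quadratic term in ω contributes the centripetal part −θ̇² of the acceleration.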
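If one now represents the orthonormal frame (x0, f0i) by the dual unit vector $\underline{f}_{0i} = f_{0i} + \varepsilon m_{0i}$,
with the predictable definition for m0i , then if the rigid motion $(s^j(t), R^j_i(t))$ is
represented by the unit dual quaternion $\underline{q}(t) = q(r) + \varepsilon s(t)$, the time evolution of the initial
frame (x0, f0i) under the action of the one-parameter family of rigid motions can be
expressed in a form that follows from (11.1):

$$ \underline{f}_i(t) = \underline{q}(t)\, \underline{f}_{0i}\, \bar{\underline{q}}(t). \tag{13.12} $$

If we assume that the curve $\underline{q}(t)$ in Q1 is sufficiently differentiable then one can
express the velocity of the moving frame $\underline{f}_i(t)$ relative to the initial frame in the form:

$$ \underline{v}_i = \frac{d\underline{f}_i}{dt} = \dot{\underline{q}}\, \underline{f}_{0i}\, \bar{\underline{q}} + \underline{q}\, \underline{f}_{0i}\, \dot{\bar{\underline{q}}}. \tag{13.13} $$

When we solve (13.12) for $\underline{f}_{0i} = \bar{\underline{q}}\, \underline{f}_i\, \underline{q}$ and substitute in (13.13), we get the velocity
relative to the moving frame:

$$ \underline{v}_i = \underline{\omega}\, \underline{f}_i + \underline{f}_i\, \bar{\underline{\omega}} = [\underline{\omega}, \underline{f}_i], \tag{13.14} $$

in which we have introduced the absolute velocity:

$$ \underline{\omega} = \dot{\underline{q}}\, \bar{\underline{q}}, \tag{13.15} $$

and used the fact that $\underline{\omega}$ is a dual vector, so $\bar{\underline{\omega}} = -\underline{\omega}$.

The absolute velocity $\underline{\omega} = \omega + \varepsilon v$ corresponds to an element ω + v of the Lie algebra
iso(3) of the group of rigid motions in E3, so it consists of a rotational part ω and a
translational part v.
Another differentiation of (13.13) gives the acceleration relative to the initial frame:

$$ \underline{a}_i = \frac{d\underline{v}_i}{dt} = \ddot{\underline{q}}\, \underline{f}_{0i}\, \bar{\underline{q}} + 2\, \dot{\underline{q}}\, \underline{f}_{0i}\, \dot{\bar{\underline{q}}} + \underline{q}\, \underline{f}_{0i}\, \ddot{\bar{\underline{q}}}, \tag{13.16} $$

and upon substituting for $\underline{f}_{0i}$, one gets the absolute acceleration:

$$ \underline{a}_i = [\underline{\alpha}, \underline{f}_i] - 2\, \underline{\omega}\, \underline{f}_i\, \underline{\omega}, \tag{13.17} $$

into which we have introduced the generalized angular acceleration:

$$ \underline{\alpha} = \ddot{\underline{q}}\, \bar{\underline{q}}. \tag{13.18} $$

One can also differentiate (13.14) to get:

$$ \underline{a}_i = [\dot{\underline{\omega}}, \underline{f}_i] + [\underline{\omega}, \dot{\underline{f}}_i] = [\dot{\underline{\omega}}, \underline{f}_i] + [\underline{\omega}, [\underline{\omega}, \underline{f}_i]], \tag{13.19} $$

which makes:

$$ [\underline{\alpha}, \underline{f}_i] = [\dot{\underline{\omega}}, \underline{f}_i] + [\underline{\omega}, [\underline{\omega}, \underline{f}_i]] + 2\, \underline{\omega}\, \underline{f}_i\, \underline{\omega}. \tag{13.20} $$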

References

1. H. Goldstein, Classical Mechanics, 2nd ed., Addison-Wesley, Reading, MA, 1980.
2. O. Bottema and B. Roth, Theoretical Kinematics, North Holland, Amsterdam,
1979; reprinted by Dover, Mineola, NY, 1990.
3. E. Study, Geometrie der Dynamen, Teubner, Leipzig, 1903.
4. W. Blaschke:
a. “Anwendungen dualer Quaternionen auf Kinematik,” Annales Academiae
Scientiarum Fennicae (1958), 1-13; Gesammelte Werke, v. 2; English
translation available at neo-classical-physics.info.
b. Kinematik und Quaternionen, Mathematische Monographien, VEB Deutscher
Verlag der Wissenschaften, Berlin, 1960; English translation available at
neo-classical-physics.info.
5. G. R. Veldkamp, “On the use of dual numbers, vectors, and matrices in
instantaneous, spatial kinematics,” Mechanism and Machine Theory, 11 (1976),
141-156.
6. N. Jacobson, Lectures in Abstract Algebra, Van Nostrand, Princeton, NJ, 1951.
7. S. MacLane and G. Birkhoff, Algebra, 2nd ed., MacMillan, NY, 1979.
CHAPTER IV

COMPLEX QUATERNIONS

Since the algebra of complex quaternions HC comes about by taking the real tensor
product H ⊗R C, in a manner that is analogous to the way that dual quaternions came
about from the tensor product H ⊗R D, we will proceed in a manner that is analogous to
what we did in the last chapter. The main difference is in the fact that the algebra C is
now a division algebra, as well as a field, since i² = − 1, as opposed to the way that ε² = 0.
In fact, one can continuously deform the linear automorphism i into the nilpotent ε by
representing both of them as 2×2 real matrices. If one defines the one-parameter family
of matrices:

$$ \sigma(\lambda) = \begin{pmatrix} 0 & 1 \\ -\lambda & 0 \end{pmatrix} \tag{13.21} $$

then one sees that for λ = 0 the matrix represents ε, while for λ = +1 it represents i.
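The deformation can be checked directly, since the square of σ(λ) is −λ times the identity matrix. The following Python sketch (with hypothetical helper names) is only meant to illustrate that statement for a few values of λ.

```python
def sigma(lam):
    """The one-parameter family of 2x2 real matrices from (13.21)."""
    return [[0.0, 1.0], [-lam, 0.0]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def sigma_squared(lam):
    """Square of sigma(lam); algebraically this equals -lam times the identity."""
    s = sigma(lam)
    return matmul2(s, s)

nilpotent_case = sigma_squared(0.0)   # zero matrix: sigma(0) behaves like epsilon
imaginary_case = sigma_squared(1.0)   # minus the identity: sigma(1) behaves like i
```

Intermediate values of λ then interpolate between the two algebras D and C.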

1. Functions of complex variables. Since the theory of functions of complex
variables is quite vast and the elements are commonly taught, we shall simply summarize
some of the formulas that are relevant to our immediate purposes.
If:
z = x + iy = r(cos θ + i sin θ)

is a complex number then some of the basic functions that one encounters in terms of real
variables take the complex form:

$$ z^n = r^n (\cos n\theta + i \sin n\theta), \tag{14.1} $$

$$ e^z = e^x (\cos y + i \sin y), \quad \text{so} \quad e^{iy} = \cos y + i \sin y, \tag{14.2} $$

$$ \sin z = \sin(x + iy) = \sin x \cos iy + \cos x \sin iy = \sin x \cosh y + i \cos x \sinh y, \tag{14.3} $$

$$ \cos z = \cos(x + iy) = \cos x \cos iy - \sin x \sin iy = \cos x \cosh y - i \sin x \sinh y, \tag{14.4} $$

since:

$$ \sin ix = i \sinh x, \qquad \cos ix = \cosh x. \tag{14.5} $$

Some other useful formulas that we will need are the complex analogues of the usual
real formulas:

$$ \cos^2 z + \sin^2 z = 1, \tag{14.6} $$

$$ \cos(z + z') = \cos z \cos z' - \sin z \sin z', \tag{14.7} $$

$$ \sin(z + z') = \sin z \cos z' + \cos z \sin z', \tag{14.8} $$

so, in particular:

$$ \sin 2z = 2 \sin z \cos z, \tag{14.9} $$

$$ \cos 2z = \cos^2 z - \sin^2 z. \tag{14.10} $$

One also finds that:

$$ (\sin z)^* = \sin z^*, \qquad (\cos z)^* = \cos z^*. \tag{14.11} $$
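These identities are easy to spot-check numerically. The sketch below uses Python's standard cmath module at one arbitrarily chosen sample point; it illustrates (14.3), (14.4), (14.6), and (14.11) and is not a proof.

```python
import cmath
import math

z = complex(0.8, -0.3)          # arbitrary sample point z = x + iy
x, y = z.real, z.imag

# Right-hand sides of (14.3) and (14.4)
sin_formula = complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))
cos_formula = complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))

err_sin = abs(cmath.sin(z) - sin_formula)
err_cos = abs(cmath.cos(z) - cos_formula)
# (14.6): the Pythagorean identity persists for complex arguments
err_pyth = abs(cmath.cos(z) ** 2 + cmath.sin(z) ** 2 - 1)
# (14.11): complex conjugation commutes with sin
err_conj = abs(cmath.sin(z).conjugate() - cmath.sin(z.conjugate()))
```

All four error terms should be of the order of floating-point rounding.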

2. The group of complex rotations. Although our ultimate objective will be the
representation of the proper, orthochronous Lorentz group, nevertheless, for the purposes
of quaternions, its representations by means of SO(3; C) and SL(2; C) will appear most
directly. Thus, we shall begin by examining what happens when one complexifies
Euclidian space E3 to EC3 and then show how that relates to the more physically familiar
Lorentz group.
We define EC3 to be (C3, <.,.>), where <.,.> is the Euclidian scalar product. That is, a
complex frame {ei, i = 1, 2, 3} allows one to express every complex vector v by a linear
combination viei with complex components vi, and that frame is said to be orthonormal
iff:

$$ \langle e_i, e_j \rangle = \delta_{ij}, \tag{15.1} $$

which makes:

$$ \langle v, w \rangle = \delta_{ij} v^i w^j = \sum_{i=1}^{3} v^i w^i, \tag{15.2} $$

and this differs from the corresponding real expression only by the fact that the resulting
number is complex. Thus, the definitions of orthogonality and normality do not change.
However, when one defines the norm-squared of any complex vector:

$$ \| v \|^2 = \sum_{i=1}^{3} (v^i)^2, \tag{15.3} $$

one finds that, unlike the real analogue, it does not have to be positive-definite. That is,
non-zero solutions of the quadratic equation || v ||² = 0 can exist, and one calls such vectors
null vectors. If one puts the vector v into real-plus-imaginary form a + ib then one sees
that:

$$ \| v \|^2 = \| a \|^2 - \| b \|^2 + 2i \langle a, b \rangle. \tag{15.4} $$

In order for this to vanish, the real and imaginary parts must satisfy the conditions:

|| a || = || b ||, <a, b> = 0. (15.5)

That is, the real vectors a and b must have the same length and they must be orthogonal.
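A concrete null vector may help here. In the sketch below (hypothetical helper name bilinear), one takes a and b to be the first two standard basis vectors, which have equal length and are orthogonal, so a + ib should have vanishing norm-squared.

```python
def bilinear(v, w):
    """Complex-bilinear Euclidian scalar product (no complex conjugation)."""
    return sum(vi * wi for vi, wi in zip(v, w))

a = (1.0, 0.0, 0.0)
b = (0.0, 1.0, 0.0)
v = tuple(complex(ai, bi) for ai, bi in zip(a, b))   # v = a + ib

norm_sq = bilinear(v, v)          # 1 + i^2 + 0 = 0, so v is a null vector

# Right-hand side of (15.4), assembled from the real vectors a and b
rhs = bilinear(a, a) - bilinear(b, b) + 2j * bilinear(a, b)

# Doubling b breaks the equal-length condition: the norm-squared becomes -3
w = tuple(complex(ai, 2.0 * bi) for ai, bi in zip(a, b))
norm_sq_w = bilinear(w, w)
```

Note that the scalar product is bilinear rather than Hermitian, which is precisely what allows non-zero null vectors.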
This situation is quite fundamental to the theory of electromagnetism, in which the
vector a becomes the electric field strength E and the vector b becomes the magnetic
field strength B. The two vectors can be combined into a complex vector E + iB, which
is a concept that goes back at least as far as Riemann [1], and was expanded upon by
Conway [2], Silberstein [3], Majorana [4], and Oppenheimer [5] in various contexts.
They can also be combined into a 2-form dτ ^ E + #B, where dτ is the proper time 1-
form that allows one to decompose four-dimensional Minkowski spacetime M4 into a
one-dimensional proper time axis T and a complementary spatial subspace Σ, and # : Λ¹Σ
→ Λ²Σ, v ↦ i_v Vs is the Poincaré isomorphism that one gets from a choice of spatial
volume element Vs . The null vectors or 2-forms include the fields of electromagnetic
waves, but not exclusively.
However, since our immediate interest in this monograph is kinematics, we shall not
go further into such matters at the moment. We shall, however, revisit them in the
context of complex line geometry later in this chapter.
A complex orthogonal transformation is still defined to be a C-linear isomorphism L:
EC3 → EC3 with the property that:

<Lv, Lw> = <v, w> for all v, w, (15.6)

and this still implies the basic property of the matrix [L] of any complex orthogonal
transformation that:
[L]−1 = [L]T. (15.7)
This still implies that:
det(L) = ± 1, (15.8)

and the group O(3; C) of all complex orthogonal transformations of EC3 still splits
into two connected components, since the determinant is a continuous function that takes
only the values + 1 and – 1 on O(3; C), even though + 1 and – 1 can be connected to each
other in the complex plane by a continuous path that does not go through 0. The set of
transformations with the positive sign on their determinant includes the identity, and is
a subgroup, which we denote SO(3; C) and call the proper complex orthogonal
group in three-dimensions. If one introduces an orientation on C3 then one can think of
its elements as orientation-preserving complex orthogonal transformations, since they all
have det(L) = 1.
However, just as we now have null vectors in EC3 , we also find that although the
typical element L ∈ SO(3; C) can be expressed in real-plus-imaginary form, that
representation is not as physically illuminating as when one uses polar decomposition to
express L as a product RB of a real rotation R and a matrix B that will be seen to
correspond to a Lorentz boost. Once again, the presence of null vectors complicates the
Gram-Schmidt process by which one orthonormalizes L into R, but we shall see that the
expression of matrices in real-plus-imaginary form makes this decomposition quite
elementary in the context of the Lie algebra so(3; C) of infinitesimal generators of one-
parameter subgroups of orientation-preserving complex orthogonal transformations.
Meanwhile, one can still use the complexified elementary rotation matrices R(θ, 0, 0),
R(0, φ, 0), R(0, 0, ψ) as the generators of all complex rotations, except that now the Euler
angles are complex. However, using the identities that we established in the previous
section, one can factor an elementary complex rotation into the product of a real rotation
and an imaginary one. We illustrate this for matrices in SO(2; C), but the principle is the
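same for the elementary rotations in SO(3; C). If α = θ + iβ is a complex angle then:

$$ \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} \cos\theta \cosh\beta - i \sin\theta \sinh\beta & -\sin\theta \cosh\beta - i \cos\theta \sinh\beta \\ \sin\theta \cosh\beta + i \cos\theta \sinh\beta & \cos\theta \cosh\beta - i \sin\theta \sinh\beta \end{pmatrix} $$

$$ = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cosh\beta & -i \sinh\beta \\ i \sinh\beta & \cosh\beta \end{pmatrix}. $$

The right-hand matrix can also be written in the form:

$$ \begin{pmatrix} \cos i\beta & -\sin i\beta \\ \sin i\beta & \cos i\beta \end{pmatrix}, $$

which makes it clear that one is dealing with a planar rotation through an imaginary
angle.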
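The factorization can be confirmed numerically for sample values of θ and β. The following sketch (hypothetical helper names, standard-library cmath only) builds the rotation through the complex angle θ + iβ and compares it with the product of the real and imaginary rotations, in both orders.

```python
import cmath

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def rotation(alpha):
    """Planar rotation matrix through a (possibly complex) angle alpha."""
    c, s = cmath.cos(alpha), cmath.sin(alpha)
    return [[c, -s], [s, c]]

def max_diff(a, b):
    """Largest absolute entry of the difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

theta, beta = 0.6, 1.1                      # arbitrary sample values
full = rotation(complex(theta, beta))       # rotation through theta + i*beta
real_part = rotation(theta)                 # real rotation
imag_part = rotation(complex(0.0, beta))    # rotation through i*beta

err_factor = max_diff(full, matmul2(real_part, imag_part))
err_commute = max_diff(matmul2(real_part, imag_part),
                       matmul2(imag_part, real_part))
```

Both errors should be at rounding level, the second one because planar rotations through any complex angles commute.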
One finds that in this two-complex-dimensional case the imaginary rotations do form
a group under multiplication, and it is of real dimension one, but since cosh β and sinh β
both grow like e^β/2 as β → ∞, the Lie group SO(2; iR) that is generated by all such
matrices is not compact. Since it is connected, it must therefore be diffeomorphic to a
line, while the real rotations form a one-real-dimensional Lie group that is diffeomorphic
to a circle. In fact, the group SO(2; C) then becomes isomorphic to the group (C*, ×) of
non-zero complex numbers under multiplication.
One notes that the eigenvalues of matrices in SO(2; C) immediately take the form:

λ = cos α ± i sin α = e^{±iα},

but since α is complex, when one expands it into its real components θ + iβ the ultimate
result is:
λ = e^{∓β} (cos θ ± i sin θ),

which again includes all non-zero complex numbers.



However, although one can commute the two matrices in this case, that is only
because they act about the same complex axis. In three complex dimensions, when two
rotations act about different axes they do not commute, in general. Hence, if one factors
R(α, β, γ) into the product R(α) R(β) R(γ) of three complex elementary rotations, and
then factors the complex rotations into the products of real and imaginary rotations, then
the resulting sequence:
R(θ) R(βx) R(φ) R(βy) R(ψ) R(βz)

cannot generally be rearranged into a product of three real rotations and a product of
three imaginary ones.

By differentiating a curve through the identity transformation, one can see that a
typical element of so(3; C) is again an anti-symmetric matrix ϖ, except that now it
consists of complex entries. One still has the condition Tr(ϖ) = 0 that corresponds to the
condition det(L) = 1 on the finite transformations.
When one uses the bilinearity of the Lie bracket, which is now assumed to be C-
bilinearity, one sees that if one expresses ϖ as ω + iζ then one gets:

[ϖ,ϖ′] = [ω, ω′] – [ζ, ζ′] + i([ω, ζ′] + [ζ, ω′]). (15.9)

In particular, [ω, ω′] and [iζ, iζ′] both belong to so(3; R), while [ω, iζ′] will belong to
its imaginary complement i so(3; R). Hence, although so(3; R) is a subalgebra of so(3;
C), its complement i so(3; R) is not. Therefore, the decomposition so(3; C) = so(3; R)
⊕ i so(3; R) of vector spaces does not correspond to a direct sum of subalgebras.
If one takes a real basis for so(3; C) in the form of {ei, iei, i = 1, 2, 3} then one finds
the commutation relations for the Lie algebra so(3; C) immediately from those of so(3;
R) and bilinearity:

[ei, ej] = εijk ek, [ei, iej] = εijk iek , [iei, iej] = − εijk ek . (15.10)

Thus, although the vector space of real vectors defines a sub-Lie-algebra, from the last set
of relations, the vector space of imaginary ones does not.
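The commutation relations (15.10) can be verified with explicit matrices. The sketch below assumes the common choice of basis (e_i)_{jk} = −ε_{ijk} for so(3; R) (with indices running over 0, 1, 2 in the code); that choice and all helper names are illustrative assumptions, not taken from the text.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

# Assumed basis of so(3; R): (E[i])[j][k] = -eps(i, j, k)
E = [[[-eps(i, j, k) for k in range(3)] for j in range(3)] for i in range(3)]

def combo(i, j, c):
    """The matrix c * sum_k eps(i, j, k) * E[k]."""
    return [[c * sum(eps(i, j, k) * E[k][r][s] for k in range(3))
             for s in range(3)] for r in range(3)]

def relations_hold():
    """Check all three families of relations in (15.10)."""
    for i in range(3):
        for j in range(3):
            iEi, iEj = scale(1j, E[i]), scale(1j, E[j])
            if bracket(E[i], E[j]) != combo(i, j, 1):
                return False
            if bracket(E[i], iEj) != combo(i, j, 1j):
                return False
            if bracket(iEi, iEj) != combo(i, j, -1):
                return False
    return True
```

The last family of relations is the one that shows that brackets of imaginary generators land back in the real subalgebra.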

When one looks at the situation involving the eigenvalues of complex orthogonal
transformations, one sees that there are two immediate differences from the real case:
Firstly, polynomials with complex coefficients do not always have to admit conjugate
pairs of roots, and secondly, they are always factorizable into the product of powers of
linear factors; i.e., they are always reducible. The general form for a characteristic
polynomial will then be (λ – λ1)(λ – λ2)(λ – λ3), naively.

Thus, when one considers that for an orthogonal L the equation Lv = λv is equivalent
to the equation LTv = (1/λ) v, and that both L and LT have the same eigenvalues, one sees
that the only symmetry we can generally find in the roots of the characteristic polynomial
for a given L is that if λ is a root then so is 1/λ. Hence, we can further specify the form
as (λ ± 1)(λ – λ1)(λ – 1/λ1), since the only numbers that equal their reciprocals are ± 1.
However, λ = − 1 does not correspond to a proper rotation.
The fact that any proper complex orthogonal transformation L must have 1 as an
eigenvalue still implies the existence of an axis for any complex rotation. However, it is
now a complex line, which is then a real plane in C3. Hence, it consists of a real axis for
the real rotational part, as well as another one for the imaginary rotation.

3. The algebra of complex quaternions [6-10]. If {eµ , µ = 0, …, 3} is the
canonical basis for C4 then a typical complex quaternion can be represented in the form:

$$ q = q^0 + q^i e_i, \tag{16.1} $$

in which the components qµ = pµ + irµ are now assumed to be complex numbers.
Therefore, analogous to what we did with dual quaternions, we can also express a typical
complex quaternion as the sum of a real quaternion and an imaginary one:

q = p + ir = ( p0 + piei) + i(r0 + riei). (16.2)

Thus in addition to the quaternion conjugation automorphism, one can also introduce
complex conjugation and the adjunction operator:

$$ \bar{q} = \bar{p} + i \bar{r}, \qquad q^* = p - i r, \qquad q^\dagger = \bar{q}^{\,*} = \bar{p} - i \bar{r}. \tag{16.3} $$

These automorphisms define projections of HC onto direct summands in
decompositions of HC by polarizing the identity operator:

$$ I = CS + CV = \mathrm{Re} + \mathrm{Im} = H^+ + H^-, \tag{16.4} $$

in which one then has:

$$ CS(q) = \tfrac{1}{2}(q + \bar{q}), \qquad CV(q) = \tfrac{1}{2}(q - \bar{q}), \tag{16.5} $$

$$ \mathrm{Re}(q) = \tfrac{1}{2}(q + q^*), \qquad \mathrm{Im}(q) = \tfrac{1}{2}(q - q^*), \tag{16.6} $$

$$ H^+(q) = \tfrac{1}{2}(q + q^\dagger), \qquad H^-(q) = \tfrac{1}{2}(q - q^\dagger). \tag{16.7} $$

Note that a self-adjoint quaternion will have the form:

q = q0 + i qi ei (qµ all real), (16.8)

while an anti-self-adjoint quaternion will have the form:



q = i q0 + qi ei (qµ all real). (16.9)

The real dimensions of the spaces of complex scalars, complex vectors, real
quaternions, imaginary quaternions, self-adjoint complex quaternions, and anti-self-
adjoint complex quaternions are then 2, 6, 4, 4, 4, 4, respectively.
The multiplication of two complex quaternions q, q′ follows from the assumption of
bilinearity, if one preserves the same multiplication table for the basis elements as in the
real case and uses the basic property i² = − 1:

q q′ = (pp′ − rr′) + i(pr′ + rp′), (16.10)

which can also be expressed in complex scalar plus complex vector form:

q q′ = (q0 q′0 − <q, q′>) + (q0q′ + q′0q + q × q′), (16.11)

in which all of the scalars and vectors are complex, now.


This multiplication is still associative, but not commutative, and has a unity element
1, just like the real quaternions, but it is no longer a division algebra since it has divisors
of zero. In fact, one can show that the only complex division algebra, up to isomorphism,
is C itself (see Dickson [11], p. 126).
In order to give an example of a pair of divisors of zero, one again looks at $q \bar{q}$:

$$ q \bar{q} = (q^0)^2 + \langle \mathbf{q}, \mathbf{q} \rangle \equiv \| q \|^2. \tag{16.12} $$

Whereas in the real case this could not vanish for any non-zero q, in the complex
case, this is no longer true. One then refers to the complex quaternions for which || q ||
vanishes as the null quaternions. Hence, any non-zero null quaternion and its conjugate
represent divisors of zero.
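For a concrete pair of divisors of zero, one can take the null quaternion e1 + i e2. The sketch below represents a complex quaternion as a 4-tuple of components (q0, q1, q2, q3) and implements the product in the scalar-plus-vector form (16.11); the function names are hypothetical.

```python
def qmul(a, b):
    """Quaternion product of 4-tuples (scalar, x, y, z); entries may be complex."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,   # scalar part
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,   # vector part, incl. cross product
        a0 * b2 + a2 * b0 + a3 * b1 - a1 * b3,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )

def conj(q):
    """Quaternion conjugate: flips the sign of the vector part only."""
    return (q[0], -q[1], -q[2], -q[3])

def norm_sq(q):
    """Complex norm-squared from (16.12); no complex conjugation is involved."""
    return sum(c * c for c in q)

q = (0, 1, 1j, 0)                 # q = e1 + i e2, which is non-zero
product = qmul(q, conj(q))        # should be the zero quaternion
```

Here neither q nor its conjugate is zero, yet their product vanishes, which cannot happen for real quaternions.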
One can introduce the following complex scalar products by using complex
components for the quaternions, this time:

$$ (q, q') = CS(q q') = q^0 q'^0 - \langle \mathbf{q}, \mathbf{q}' \rangle = \eta_{\mu\nu}\, q^\mu q'^\nu, \tag{16.13} $$

$$ \langle q, q' \rangle = CS(q \bar{q}') = q^0 q'^0 + \langle \mathbf{q}, \mathbf{q}' \rangle = \delta_{\mu\nu}\, q^\mu q'^\nu. \tag{16.14} $$

Thus, the first one makes C4 into a complex Minkowski space, while the second one
makes it into a complex Euclidian one. From the definition of the (Euclidian) norm in
(16.12), one can also say that:

$$ \| q \|^2 = \langle q, q \rangle. \tag{16.15} $$

If one puts q and q′ into real-plus-imaginary form p + ir, p′ + ir′ then their scalar
product takes the form:

$$ \langle q, q' \rangle = \langle p, p' \rangle - \langle r, r' \rangle + i(\langle p, r' \rangle + \langle p', r \rangle). \tag{16.16} $$

Thus:

$$ \| q \|^2 = \| p \|^2 - \| r \|^2 + 2i \langle p, r \rangle. \tag{16.17} $$

One can then characterize a null quaternion by the pair of quadratic conditions on the
real quaternions p, r:

$$ \| p \|^2 = \| r \|^2, \qquad \langle p, r \rangle = 0. \tag{16.18} $$

Thus, p and r must both lie on real 3-spheres of the same radius || p ||, or rather, they must
each lie separately on a disjoint pair of real 3-spheres in C4 = R4 × R4, while the second
condition once more singles out the Study quadric in R4 × R4. It is interesting that the
change of coefficient ring from D to C has not affected the condition for the vanishing of
the pure dual or imaginary part of the scalar product, even though one has defined two
different real algebras over R4 × R4. This is related to the fact that in a sense the
translations of E3 are a non-relativistic (c → ∞) version of the Lorentz boosts.
We can examine the possible existence of nilpotents of degree two and idempotents
by specializing the general expression (16.11) for the product of complex quaternions to
an expression for the square of one:

$$ q^2 = (q^0)^2 - \langle \mathbf{q}, \mathbf{q} \rangle + 2 q^0 \mathbf{q}. \tag{16.19} $$

Of course, the only essential difference between this expression and the
corresponding one for real quaternions is in the fact that now everything is complex.
However, that still implies some new consequences.
First we set q² = 0 and get the same conditions as in the real case, which we repeat for
the sake of logical continuity:

$$ (q^0)^2 = \langle \mathbf{q}, \mathbf{q} \rangle, \qquad q^0 \mathbf{q} = 0. $$

As before, setting $\mathbf{q}$ = 0 still makes q0 = 0, which is still trivial, and setting q0 = 0 still
makes $\langle \mathbf{q}, \mathbf{q} \rangle$ = 0, but now that we are dealing with complex vectors this equation can
admit non-trivial solutions. If we express $\mathbf{q}$ in real-plus-imaginary form as p + ir then
this condition expands into:

$$ \langle p, p \rangle = \langle r, r \rangle, \qquad \langle p, r \rangle = 0. \tag{16.20} $$

As we shall see when we discuss complex line geometry, the quadric that is defined by
the last condition in the three-complex-dimensional vector space of pure quaternions
relates to something else that has been well-studied (7), namely, the Klein quadric. In
fact, in electromagnetism, if one replaces p and r with E and B, respectively, then one
finds that the two conditions on nilpotents of degree two amount to the same necessary
(but not sufficient) conditions on electromagnetic field strengths in order for them to
represent the fields of electromagnetic waves.
We thus conclude that HC does, in fact, admit nilpotent elements of degree two,
which are moreover, physically significant.

(7) No pun intended!


As for idempotents, if we set q² = q in (16.19) then this gives the necessary
conditions:
$$ q^0 = (q^0)^2 - \langle \mathbf{q}, \mathbf{q} \rangle, \qquad \mathbf{q} = 2 q^0 \mathbf{q}. \tag{16.21} $$

These are also the same as for the real case, except that now everything is complex. If
one sets $\mathbf{q}$ = 0 in the latter equation then the former one reduces to q0 = (q0)², which
gives only the trivial idempotents 0 and 1. Otherwise, q0 = 1/2, as before, which still
implies that:

$$ \langle \mathbf{q}, \mathbf{q} \rangle = -\tfrac{1}{4}, \tag{16.22} $$

except that now it admits non-trivial solutions. In real-plus-imaginary form $\mathbf{q}$ = p + ir,
it gives:

$$ \langle p, p \rangle - \langle r, r \rangle = -\tfrac{1}{4}, \qquad \langle p, r \rangle = 0, \tag{16.23} $$

which differs from the nilpotent case in the first equation, but not the second one.
These necessary conditions are clearly sufficient.
If we replace $\mathbf{q}$ with || $\mathbf{q}$ || u, where || u || = 1, then one finds that || $\mathbf{q}$ || = ± i/2. Thus, we
can express any non-trivial idempotent in HC in the form:

$$ q = \tfrac{1}{2}(1 + i u). \tag{16.24} $$
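The simplest instance of (16.24) takes u to be the basis vector e3. The sketch below (hypothetical function names, components stored as 4-tuples) checks that the resulting quaternion squares to itself and that its product with its conjugate vanishes.

```python
def qmul(a, b):
    """Quaternion product of 4-tuples (scalar, x, y, z); entries may be complex."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 + a2 * b0 + a3 * b1 - a1 * b3,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )

def conj(q):
    """Quaternion conjugate: flips the sign of the vector part only."""
    return (q[0], -q[1], -q[2], -q[3])

idem = (0.5, 0, 0, 0.5j)              # (1 + i e3) / 2
idem_sq = qmul(idem, idem)            # should reproduce idem
comp = conj(idem)                     # (1 - i e3) / 2, which equals 1 - idem
cross_term = qmul(idem, comp)         # should be the zero quaternion

# Vector part squared: <q, q> = (i/2)^2 = -1/4, as required by (16.22)
vec_sq = sum(c * c for c in idem[1:])
```

The vanishing of the product with the conjugate shows that this idempotent is also a null quaternion.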

A complex quaternion is a unit quaternion iff || q ||² = 1, which then leads to the pair
of real quadrics in R4 × R4:

$$ \| p \|^2 - \| r \|^2 = 1, \qquad \langle p, r \rangle = 0. \tag{16.25} $$

Thus, one is still dealing with the Study quadric, although the quadric that is defined by
the unity constraint is no longer homogeneous. However, if one reverts to the complex
form of the norm-squared:

$$ \| q \|^2 = (q^0)^2 + (q^1)^2 + (q^2)^2 + (q^3)^2 \tag{16.26} $$

then one sees that the unit complex quaternions simply define a complex 3-sphere of unit
radius, just as the unit real quaternions defined a real 3-sphere of unit radius.

One now sees how to define the inverse element to any invertible element from this.
If q is not a null quaternion then one sets:

$$ q^{-1} = \frac{\bar{q}}{\| q \|^2}, \tag{16.27} $$

and one sees that q−1 is, in fact, the multiplicative inverse of q. Of course, this, too, is
simply the complex analogue of the result for real quaternions.
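The inverse formula (16.27) can be illustrated with a sample non-null complex quaternion. In the sketch below the function names and the sample value are hypothetical; the function refuses null quaternions, for which no inverse exists.

```python
def qmul(a, b):
    """Quaternion product of 4-tuples (scalar, x, y, z); entries may be complex."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 + a2 * b0 + a3 * b1 - a1 * b3,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )

def norm_sq(q):
    """Complex norm-squared: sum of squares of the components."""
    return sum(c * c for c in q)

def qinv(q):
    """Inverse of a non-null complex quaternion, following (16.27)."""
    n2 = norm_sq(q)
    if n2 == 0:
        raise ZeroDivisionError("null quaternions are not invertible")
    return (q[0] / n2, -q[1] / n2, -q[2] / n2, -q[3] / n2)

q = (1, 1j, 0, 2)                    # norm-squared: 1 - 1 + 0 + 4 = 4
left = qmul(qinv(q), q)
right = qmul(q, qinv(q))
one = (1, 0, 0, 0)
err = max(max(abs(l - o), abs(r - o)) for l, r, o in zip(left, right, one))
```

The same element serves as both a left and a right inverse, as one expects from associativity.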

Thus, the multiplicative group CQ* of all invertible (i.e., non-null) complex
quaternions can be factored into a product C* × CQ1 of the multiplicative group of non-
zero complex numbers and the multiplicative group of unit complex quaternions. The
group CQ1 then consists of all points on a complex-Euclidian 3-sphere in C4 of unit
radius; in fact, as we shall demonstrate in the next subsection, it is a complex Lie group
that is isomorphic to SL(2; C), which is a two-to-one simply-connected covering group of
the proper, orthochronous Lorentz group.
In analogy to what we did for real and dual quaternions, we find that there is also a
polar form for any non-null complex quaternion:

$$ q = \| q \| \left( \cos \tfrac{1}{2}\alpha + \sin \tfrac{1}{2}\alpha \; \hat{u} \right), \tag{16.28} $$

in which the complex angle α = θ + iβ represents both an angle of rotation θ around a
spatial axis, which is generated by the real vector a, and a boost with a rapidity parameter
β in a direction that is defined by a real vector b. One can then obtain the angle α from
any non-null q by means of:

$$ \cos \tfrac{1}{2}\alpha = \frac{q^0}{\| q \|}. \tag{16.29} $$

The complex vector û = a + ib is then assumed to be a complex unit vector, so:

$$ 1 = \| \hat{u} \|^2 = -\hat{u}\hat{u} = -aa + bb - i(ab + ba) = \| a \|^2 - \| b \|^2 + 2i \langle a, b \rangle, \tag{16.30} $$

which imposes the conditions on a and b that follow from the restriction of (16.25) to
complex quaternions of vector type:

$$ 1 = \| a \|^2 - \| b \|^2, \qquad \langle a, b \rangle = 0. \tag{16.31} $$

One can obtain û from:

$$ \mathbf{q} = \| \mathbf{q} \| \, \hat{u}, \qquad \| \mathbf{q} \| = \| q \| \sin \tfrac{1}{2}\alpha, \tag{16.32} $$

as long as || $\mathbf{q}$ || is non-zero. If $\mathbf{q}$ = p + ir then one gets the individual component vectors
a and b of û from:

$$ \| \mathbf{q} \| \, \hat{u} = \| \mathbf{q} \| (a + ib) = p + ir, $$

which makes:

$$ a = \frac{p}{\| \mathbf{q} \|}, \qquad b = \frac{r}{\| \mathbf{q} \|}. \tag{16.33} $$

When one takes the commutator bracket of two complex quaternions, one gets:

$$ [q, q'] = 2\, \mathbf{q} \times \mathbf{q}'. \tag{16.34} $$

This tells us that the complex vector quaternions define a complex three-dimensional Lie
algebra that is isomorphic to so(3; C), and that the center of the quaternion algebra
consists of all complex quaternions of complex scalar type. One finds that so(3; C) is
also isomorphic to sl(2; C), as well as so(1, 3). In fact, when one expresses an element of
so(3; C) in real-plus-imaginary form:
Ω = ω + iβ, (16.35)

one finds that ω represents an infinitesimal Euclidian rotation (i.e., an element of so(3;
R)), while β is effectively an infinitesimal Lorentz boost.
It is amusing that although one is often told in elementary special relativity that the
vector cross product is no longer useful in four real dimensions, nonetheless, its extension
to three complex dimensions has a fundamental special-relativistic significance, after all.
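The statement (16.34) that the commutator of two complex quaternions is twice the complex cross product of their vector parts can be spot-checked as follows. The sample values and helper names in this sketch are arbitrary and hypothetical.

```python
def qmul(a, b):
    """Quaternion product of 4-tuples (scalar, x, y, z); entries may be complex."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 + a2 * b0 + a3 * b1 - a1 * b3,
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,
    )

def cross(u, v):
    """Cross product of two 3-vectors with complex components."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

q1 = (1 + 2j, 0.5, -1j, 2)           # arbitrary sample quaternions
q2 = (-1, 1j, 3, 0.5 - 1j)

ab, ba = qmul(q1, q2), qmul(q2, q1)
commutator = tuple(x - y for x, y in zip(ab, ba))
expected = (0,) + tuple(2 * c for c in cross(q1[1:], q2[1:]))
err = max(abs(x - y) for x, y in zip(commutator, expected))
```

The scalar parts drop out of the commutator entirely, which is why the complex scalars form the center of the algebra.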
For the sake of completeness, we include the anti-commutator bracket of two
complex quaternions:
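$$ \{q, q'\} = 2\left( (q, q') + q^0 \mathbf{q}' + q'^0 \mathbf{q} \right). \tag{16.36} $$

When q and q′ are complex vectors, one gets:

$$ \{\mathbf{q}, \mathbf{q}'\} = -2 \langle \mathbf{q}, \mathbf{q}' \rangle, \tag{16.37} $$

and one has:

$$ \mathbf{q}\, \mathbf{q}' = -\langle \mathbf{q}, \mathbf{q}' \rangle + \mathbf{q} \times \mathbf{q}', \tag{16.38} $$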
as in the real case.
Equation (16.37) is suggestive of the Clifford algebra of real E3 − which is also of
real dimension eight − although one finds that HC is not the completion of the even
subalgebra of C(3; δij) by associating the imaginary quaternions with the odd subspace,
but the complexification of that even real subalgebra to the even complex subalgebra of
the Clifford algebra over EC3 .
It will be useful to note the following facts, which are to be contrasted with the
corresponding dual results. If q = p + ir, q′ = p′ + ir′ then:

q × q′ = p × p′ − r × r′ + i(r × p′ + p × r′), (16.39)

and the triple vector product rule still holds for complex vectors:

a × (b × c) = <a, c> b − <a, b> c. (16.40)

When we get to the representation of spinors by complex quaternions, we shall need
to know a bit about the left and right ideals of the algebra HC . A left ideal I of HC is, by
definition, a linear subspace such that HC I ⊆ I, and therefore a sub-algebra. If one has
an idempotent ε then it will generate a left ideal I(ε) = HC ε. As mentioned above, the
element εc = 1 – ε is also an idempotent that is orthogonal to ε so one has a
decomposition of unity:
1 = ε + εc.

The left ideal generated by εc will then be I(εc) = HC εc, and one has a decomposition:

HC = I(ε) ⊕ I(εc).

Since we have four complex dimensions to start with, naively, the only distinct non-
trivial possibilities for direct sum decompositions are into vector subspaces of 1+3
dimensions and 2+2 dimensions. However, one immediately sees that a one-dimensional
ideal in HC − i.e., a complex line through the origin − cannot exist, since the line through
the origin of any non-zero complex quaternion will be rotated by some quaternion, so the
only possibility is a pair of complementary two-complex dimensional sub-algebras.
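We next examine the nature of an idempotent ε in HC more closely in its scalar-plus-
vector representation:

$$ \varepsilon = \varepsilon^0 + \boldsymbol{\varepsilon}. $$

From the definition of an idempotent, one has:

$$ \varepsilon^2 = (\varepsilon, \varepsilon) + 2 \varepsilon^0 \boldsymbol{\varepsilon} = \varepsilon^0 + \boldsymbol{\varepsilon}. $$

This makes:

$$ \varepsilon^0 = (\varepsilon, \varepsilon) = \tfrac{1}{2}, \tag{16.41} $$

so:

$$ \varepsilon^c = (1 - \varepsilon^0) - \boldsymbol{\varepsilon} = \bar{\varepsilon}. \tag{16.42} $$

Hence, any idempotent ε is a null quaternion:

$$ \| \varepsilon \|^2 = \varepsilon \bar{\varepsilon} = \varepsilon \varepsilon^c = 0. \tag{16.43} $$

Furthermore, any other element qε in I(ε) will have to be a null quaternion, since:

$$ \| q\varepsilon \|^2 = q\varepsilon\, \overline{q\varepsilon} = q\, \varepsilon \bar{\varepsilon}\, \bar{q} = 0. $$

Any linear combination of such elements can be written in the form:

$$ \sum_a \lambda^a (q_a \varepsilon) = \Big( \sum_a \lambda^a q_a \Big) \varepsilon = q\varepsilon, $$

so every element of I(ε) is a null quaternion.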


Therefore, any left ideal in HC defines a two-dimensional vector subspace in the
three-dimensional quadric hypersurface that is defined by the null quaternions.
As we observed above, there are only two (complex) degrees of freedom in the choice
of idempotents in HC , since their scalar part is always 1/2, while their vector part must lie
on the complex unit sphere CS2 in C3. Therefore, since the null quaternions lie on a
three-complex-dimensional hypersurface, not all null quaternions are going to be
idempotents. In particular, the nilpotents of degree two are also null quaternions.
One example of an idempotent can be given by assuming that u is in the z direction:

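$$ \varepsilon = \tfrac{1}{2}(1 \pm i e_3). \tag{16.44} $$

The complementary idempotent εc is then the conjugate quaternion, which simply inverts
the choice of sign.
One can adapt the basis {eµ} to the direct sum by defining:

$$ l_0 = \tfrac{1}{2}[(e_2 - i e_1) + \chi (e_0 + i e_3)], \qquad l_1 = \tfrac{1}{2}[-\phi (e_2 + i e_1) + (e_0 - i e_3)], \tag{16.45} $$

$$ l_2 = \tfrac{1}{2}[(e_2 - i e_1) + \phi (e_0 + i e_3)], \qquad l_3 = \tfrac{1}{2}[-\chi (e_2 + i e_1) + (e_0 - i e_3)], \tag{16.46} $$

in which:

$$ \phi = \frac{\varepsilon^0 - i \varepsilon^3}{\varepsilon^2 + i \varepsilon^1}, \qquad \chi = \frac{\varepsilon^0 + i \varepsilon^3}{\varepsilon^2 + i \varepsilon^1}. \tag{16.47} $$

From these two equations, and the conditions on ε that are imposed by the fact that it
is an idempotent, one can solve for the components of ε in terms of ϕ, χ:

$$ \varepsilon^0 = \tfrac{1}{2}, \qquad \varepsilon^1 = -\frac{i}{2}\, \frac{1 - \phi\chi}{\phi - \chi}, \qquad \varepsilon^2 = \frac{1}{2}\, \frac{1 + \phi\chi}{\phi - \chi}, \qquad \varepsilon^3 = -\frac{i}{2}\, \frac{\phi + \chi}{\phi - \chi}. \tag{16.48} $$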

Thus, we have one way of parameterizing the two-dimensional imaginary sphere that
the idempotents represent.

Analogous remarks apply to the case of right ideals. If confusion might arise, we
could distinguish left ideals from right ideals by means of appropriate subscripts, but we
shall generally treat them separately in the sequel.

4. The action of the Lorentz group on complex quaternions. We shall first
observe that the extension of the representation of real quaternions by 2×2 complex
matrices to a representation of complex quaternions still comes from the association of
the basis elements eµ with the 2×2 complex matrices τµ . Thus the matrix [q] that gets
associated with a complex quaternion q is still of the form:

 q 0 + iq1 q 2 − iq 3 
[q] = qµ τµ =  2 1
. (17.1)
 q + iq q − iq 
3 0

except that now the components qµ are generally complex.


Since the complex vector spaces HC and M(2; C) both have complex dimension four,
this association is a C-linear isomorphism of those vector spaces. Thus, it is by going
from real numbers to complex ones that one completes the association of quaternions
with 2×2 complex matrices, as the complex quaternions of real type still define a real
subspace of M(2; C) of real dimension four.
Furthermore, since the product of complex quaternions still goes to the corresponding
product of matrices:
[qq′] = [q][q′],

the association is also an isomorphism of complex algebras. One also still has:

det[q] = || q ||²,

except that the determinant and norm involved are complex, now. Thus, the null
quaternions go to matrices of zero determinant, which are then the non-invertible ones,
and the unit quaternions go to elements of SL(2; C). In fact, that association is also an
isomorphism of complex Lie groups of complex dimension three. This association of
unit complex quaternions with matrices in SL(2; C) also shows quite clearly that the latter
complex Lie group is diffeomorphic, as a complex manifold, to the complex 3-sphere.
Now that one also has an adjunction automorphism defined, one finds that, in fact:

[q]† = [q†]. (17.2)

Thus, self-adjoint complex quaternions go to Hermitian matrices, while the
anti-self-adjoint quaternions go to anti-Hermitian matrices. One finds that when one polarizes the
matrices of M(2; C) that have zero trace – i.e., the elements of sl(2; C) − with respect to
the Hermitian conjugate operation, the effect is to express any element of that Lie algebra
as a sum of a zero-trace anti-Hermitian matrix, which then belongs to the Lie algebra
su(2), and a zero-trace Hermitian one. Since the Lie bracket of two Hermitian matrices is
anti-Hermitian:

[H, H′]† = (HH′ – H′H)† = H′†H† – H†H′† = H′H – HH′ = − [H, H′],

those matrices do not define a complex Lie subalgebra of sl(2; C), but only a three-real-
dimensional subspace.
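The polarization described above can be written out explicitly. This is the standard decomposition of a matrix into anti-Hermitian and Hermitian parts, not a formula quoted from the source, and the symbols A, A₋, A₊ are introduced here purely for illustration:

```latex
A \;=\; \underbrace{\tfrac{1}{2}\,(A - A^{\dagger})}_{A_{-}}
   \;+\; \underbrace{\tfrac{1}{2}\,(A + A^{\dagger})}_{A_{+}},
\qquad
A_{-}^{\dagger} = -A_{-}, \quad
A_{+}^{\dagger} = A_{+}, \quad
\operatorname{Tr} A = 0 \;\Rightarrow\; \operatorname{Tr} A_{\pm} = 0 .
```

Since A₊ = i(−iA₊) and −iA₊ is anti-Hermitian with zero trace, the Hermitian summand can be regarded as i times an element of su(2), so that sl(2; C) = su(2) ⊕ i su(2) as real vector spaces.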

One can easily define the isomorphism of Lie algebras sl(2; C) with so(3; C), since
both have complex dimension three. First, one defines their C-linear isomorphism as
vector spaces by the association of the basis vectors τi with the elementary three-
dimensional infinitesimal rotation matrices Ji, i = 1, 2, 3. One then observes that since
the basis elements satisfy the same commutation relations the linear isomorphism is a Lie
algebra isomorphism. Under this association, one sees that anti-Hermitian matrices go to
real infinitesimal rotation matrices, while the Hermitian ones go to imaginary rotation
matrices. As we saw above, this means that the anti-Hermitian matrices represent
infinitesimal generators of real Euclidian rotations, while the Hermitian ones represent
the infinitesimal generators of real Minkowski space boosts; i.e., pure Lorentz
transformations.
Thus, we see that the Lie group of unit complex quaternions is isomorphic to SL(2;
C), which doubly covers the proper, orthochronous Lorentz group SO+(3, 1). Therefore,
if one can represent Minkowski space as a vector subspace in HC then one can represent
the action of SO+(3, 1) on Minkowski space by the action of unit complex quaternions on
other quaternions.
In fact, we saw above that the basic automorphisms of HC define several subspaces
that have real dimension four. In particular, the spaces of real, imaginary, self-adjoint,
and anti-self-adjoint quaternions all have real dimension four. Furthermore, one can
define a scalar product on each of them that makes them isometric to real Minkowski
space. In the case of the real and imaginary quaternions, the scalar product is (q, q′) =
CS(qq′), while in the case of self-adjoint and anti-self-adjoint quaternions, the scalar
product is <q, q′> = CS(q q̄′).
The problem at hand is to find linear actions of CQ1 on HC that leave these subspaces
invariant and act by isometries on them. There are five basic actions of the group CQ1 of
unit complex quaternions on HC . We introduce the following terminology:

1. Left-multiplication:
CQ1 × HC → HC, (u, q) ↦ uq,
2. Right-multiplication:
HC × CQ1 → HC, (q, u) ↦ qu,
3. Complex congruence:
CQ1 × HC → HC, (u, q) ↦ uqu*,
4. Conjugate congruence:
CQ1 × HC → HC, (u, q) ↦ uqū,
5. Adjoint congruence:
CQ1 × HC → HC, (u, q) ↦ uqu†.

There is also a “chiral” action of the product group CQ1 × CQ1:

CQ1 × CQ1 × HC → HC, (u, u′, q) ↦ uqu′.

One can also consider the conjugate actions of all of the above, in which the element u
gets applied to q by way of ū. As it turns out, going from an action to its conjugate
action amounts to the difference between vectors and covectors; i.e., duality in the real
linear spaces that are being represented by subspaces of quaternions. However, since any
unit quaternion u has a conjugate ū that is also a unit quaternion, the conjugate actions of
CQ1 will always have the same invariant subspaces as the direct action.
The actual representation of the conventional vector spaces of physical tensors, such
as scalars, vectors, covectors, bivectors, 2-forms, 3-vectors, and 3-forms, and spinors then
comes down to finding invariant subspaces of the various actions defined above that
have the same dimensions as the corresponding spaces of real or complex tensor or
spinor objects, and then identifying those spaces with the invariant subspaces.

The action that we are calling complex congruence above is certainly a linear action
and has H and iH as invariant subspaces. However, when one takes the scalar product
(p′, q′) of two real or imaginary quaternions p′ = upu*, q′ = uqu* one gets:

(p′, q′) = CS(p′, q′) = CS(upu*uqu*). (17.3)

One sees that since generally u* ≠ u the product in parentheses does not reduce to
CS(upqu*), which would then reduce to (p, q). Thus, the action in question does not
generally act by isometries on its invariant subspaces.
A more physically convenient action is what we are calling adjoint congruence,
which takes all self-adjoint quaternions to other self-adjoint quaternions, and similarly for
the anti-self-adjoint ones. This action then corresponds to the action of matrices in SL(2;
C) on matrices in M(2; C) by an analogous conjugation that involves the Hermitian
adjoint. The fact that this action leaves the subspaces H+ and H− invariant follows from
the general discussion in Chap. I, sec. 3.
One finds that the linear action just defined is also by isometries of the restriction of
the complex quaternion scalar product to self-adjoint and anti-self-adjoint complex
quaternions. Since these latter quaternions take the forms (16.8) and (16.9), respectively,
the restrictions in question take the form:

$$\langle q, q'\rangle_+ = \eta_{\mu\nu}\, q^\mu q'^\nu, \qquad \langle q, q'\rangle_- = -\,\langle q, q'\rangle_+ , \tag{17.4}$$

respectively. Thus, the restrictions of the complex quaternion scalar product to those
subspaces make them isometric to Minkowski space, as well as linearly isomorphic. The
fact that the linear action in question preserves this scalar product follows from direct
calculation: If x′ = uxu†, y′ = uyu† then, since u† ū† = (ūu)† = 1, one has:

$$\langle x', y'\rangle_+ = \mathrm{CS}(x'\bar{y}') = \mathrm{CS}(u x u^\dagger\, \bar{u}^\dagger\, \bar{y}\, \bar{u}) = \mathrm{CS}(u x \bar{y}\, \bar{u}) = \mathrm{CS}(u\bar{u})\, \mathrm{CS}(x\bar{y}) = \langle x, y\rangle_+ ,$$

and similarly for <x′, y′>− .
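The isometry property of adjoint congruence can be illustrated numerically. This is only a sketch under stated assumptions: self-adjoint quaternions are taken to have a real scalar part and an imaginary vector part (which follows from q† = q̄* but is not copied from (16.8)), the scalar product is taken to be CS(x ȳ), and all function names and sample values are hypothetical.

```python
import cmath

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def bar(q):
    """Quaternion conjugate: negate the vector part."""
    return (q[0], -q[1], -q[2], -q[3])

def dagger(q):
    """Adjoint: complex-conjugate every component, then take the quaternion conjugate."""
    return bar(tuple(complex(c).conjugate() for c in q))

def unit(q):
    """Rescale q to unit complex norm (assumes the norm is non-zero)."""
    root = cmath.sqrt(sum(c * c for c in q))
    return tuple(c / root for c in q)

def self_adjoint(t, x1, x2, x3):
    """Self-adjoint quaternion built from four real numbers."""
    return (complex(t), 1j*x1, 1j*x2, 1j*x3)

def minkowski(x, y):
    """<x, y>_+ = CS(x ybar); equals t*s - x.y on self-adjoint quaternions."""
    return qmul(x, bar(y))[0]

def adjoint_congruence(u, q):
    return qmul(qmul(u, q), dagger(u))

u = unit((1+2j, 0.5-1j, -2+0j, 3j))      # arbitrary unit complex quaternion
x = self_adjoint(2.0, 1.0, -1.0, 0.5)
y = self_adjoint(-1.0, 3.0, 0.0, 2.0)
x2, y2 = adjoint_congruence(u, x), adjoint_congruence(u, y)

before = minkowski(x, y)                 # 2*(-1) - (1*3 + (-1)*0 + 0.5*2) = -6
after = minkowski(x2, y2)
still_self_adjoint = all(abs(a - b) < 1e-9 for a, b in zip(x2, dagger(x2)))
```

Here `before` and `after` agree to rounding error, and `still_self_adjoint` confirms that the transformed element stays in the invariant subspace.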


Since Minkowski space vectors can be represented as self-adjoint or anti-self-adjoint
complex quaternions, so can linear frames; in particular, one can represent a Lorentzian
frame {f_µ , µ = 0, …, 3} by a corresponding Lorentzian frame in H+ or H− , which we
shall denote by the same symbols. If one then defines the transformation of this frame by
a unit complex quaternion u by:

$$f'_\mu = u\, f_\mu\, u^\dagger \tag{17.5}$$

then one sees that since f_µ is a frame on H± the new frame f′_µ can be expressed in terms
of it by way of a coefficient matrix L^ν_µ :

$$f'_\mu = f_\nu\, L^\nu_\mu . \tag{17.6}$$

Since the transformation that is defined by u is an isometry of H± , one infers that the
matrix L^ν_µ must be Lorentz-orthogonal. One can also see that the matrix L^ν_µ must be real,
since the fact that the transformation takes self-adjoint elements to other self-adjoint
elements implies that [f′_µ]† = f′_µ , and when one expands this using (17.6), one gets:

$$[f_\nu L^\nu_\mu]^\dagger = [f_\nu]^\dagger\, [L^\nu_\mu]^* = f_\nu\, [L^\nu_\mu]^* = f_\nu\, L^\nu_\mu ,$$

which is possible only if:

$$[L^\nu_\mu]^* = L^\nu_\mu .$$

Furthermore, one sees that since u and −u both produce the same Lorentz
transformation, the association of ±u to L^ν_µ becomes the 2-1 covering map SL(2; C) →
SO+(3, 1).
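To make the covering map concrete, one can compute the matrix L^ν_µ of (17.6) from a given unit quaternion. The sketch below assumes the particular self-adjoint frame f_0 = e_0, f_i = i e_i and the signature (+, −, −, −); both are illustrative choices rather than the author's stated conventions, and the helper names are hypothetical.

```python
import cmath
import math

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def dagger(q):
    """Adjoint: complex-conjugate the components and negate the vector part."""
    c = [complex(x).conjugate() for x in q]
    return (c[0], -c[1], -c[2], -c[3])

# Assumed Lorentzian frame in H+: f_0 = e_0, f_i = i e_i.
FRAME = ((1, 0, 0, 0), (0, 1j, 0, 0), (0, 0, 1j, 0), (0, 0, 0, 1j))
ETA = (1, -1, -1, -1)

def lorentz_matrix(u):
    """Return L with L[nu][mu] = L^nu_mu defined by u f_mu u† = f_nu L^nu_mu."""
    ud = dagger(u)
    cols = []
    for f in FRAME:
        c = qmul(qmul(u, f), ud)
        # Convert e-components to f-components: divide the vector part by i.
        cols.append((c[0], c[1] / 1j, c[2] / 1j, c[3] / 1j))
    return [[cols[mu][nu] for mu in range(4)] for nu in range(4)]

def is_real(L, tol=1e-9):
    return all(abs(complex(x).imag) < tol for row in L for x in row)

def is_lorentz(L, tol=1e-9):
    """Check eta_{nu kappa} L^nu_mu L^kappa_lambda = eta_{mu lambda}."""
    for m in range(4):
        for l in range(4):
            s = sum(ETA[n] * L[n][m] * L[n][l] for n in range(4))
            if abs(s - (ETA[m] if m == l else 0)) > tol:
                return False
    return True

# A boost of rapidity phi along the first axis.
phi = 0.8
boost = (math.cosh(phi / 2), 1j * math.sinh(phi / 2), 0, 0)
L_boost = lorentz_matrix(boost)

# A generic unit complex quaternion and its antipode.
g = (1+2j, 0.5-1j, -2+0j, 3j)
root = cmath.sqrt(sum(c * c for c in g))
u = tuple(c / root for c in g)
L_u = lorentz_matrix(u)
L_minus_u = lorentz_matrix(tuple(-c for c in u))
```

For the boost, the entries L_boost[0][0] and L_boost[1][0] come out as cosh φ and sinh φ, while L_u and L_minus_u coincide, which is the two-to-one property.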
One also sees that since the antipodal pair {u, − u} on the complex 3-sphere is
associated with a complex line through the origin of C4 – namely, all points of the form
λu, with λ complex – it then defines a point in CP3. This shows one a direct path to
proving that the identity component of the Lorentz group is diffeomorphic to CP3 as a
manifold.
The action of CQ1 on HC that takes the form of conjugate congruence has the spaces
of complex scalars and complex vectors as invariant subspaces, from the general
considerations regarding automorphisms of algebras above. It also acts by isometries,
since if q′ = uqū then:

$$q'\bar{q}' = u q \bar{u}\, u \bar{q}\, \bar{u} = q\bar{q} .$$

If one reverts to the polar form u = cos ½α + sin ½α û for the unit complex quaternion u
then one finds for its action on complex vectors v in the manner that is currently at issue that:

$$u v \bar{u} = (\cos\tfrac12\alpha + \sin\tfrac12\alpha\; \hat{u})\, v\, (\cos\tfrac12\alpha - \sin\tfrac12\alpha\; \hat{u}) = \cos\alpha\; v + (1 - \cos\alpha)\, \langle \hat{u}, v\rangle\, \hat{u} + \sin\alpha\; \hat{u} \times v, \tag{17.7}$$

which still has the form of Rodrigues’s formula, except that now the angle α and unit
vector û are both complex. Thus, one is now dealing with a rotation along one axis and a
boost along another.
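The complex form of Rodrigues’s formula can be checked against direct quaternion multiplication. In this sketch the complex unit vector is chosen as (cosh s, i sinh s, 0), whose bilinear square is 1; that choice, the sample angle, and the function names are all assumptions made for illustration.

```python
import cmath
import math

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def bar(q):
    """Quaternion conjugate: negate the vector part."""
    return (q[0], -q[1], -q[2], -q[3])

def dot(a, b):
    """Complex-bilinear (not Hermitian) Euclidean scalar product on C^3."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

s = 0.7
n = (math.cosh(s), 1j * math.sinh(s), 0)   # <n, n> = cosh^2 s - sinh^2 s = 1
alpha = 0.9 + 0.4j                          # complex angle
v = (1+1j, -2, 0.5j)                        # arbitrary complex vector

# Unit complex quaternion in polar form and its action by conjugate congruence.
u = (cmath.cos(alpha / 2),) + tuple(cmath.sin(alpha / 2) * c for c in n)
lhs = qmul(qmul(u, (0,) + v), bar(u))

# Right-hand side of (17.7).
nxv = cross(n, v)
rhs = tuple(cmath.cos(alpha) * v[i]
            + (1 - cmath.cos(alpha)) * dot(n, v) * n[i]
            + cmath.sin(alpha) * nxv[i] for i in range(3))

scalar_part_vanishes = abs(lhs[0]) < 1e-9
formula_matches = all(abs(lhs[i + 1] - rhs[i]) < 1e-9 for i in range(3))
```

The vanishing scalar part reflects the invariance of the vector quaternions under this action.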
Since the complex vector space of complex quaternions of vector type is C-linearly
isomorphic to C3 by a choice of complex 3-frame – such as e_i – one can also represent the
action of a unit complex quaternion u on complex 3-frames by a 3×3 complex matrix L^j_i :

$$u\, e_i\, \bar{u} = e_j\, L^j_i . \tag{17.8}$$

Because this action is also an isometry of the complex Euclidian scalar product, one sees
that the matrix L^j_i must be complex orthogonal. Thus, since ±u once more produce the
same L^j_i , the association of ±u with L^j_i amounts to the two-to-one covering map SL(2;
C) → SO(3; C), and one also finds that SO(3; C) is isomorphic to SO+(3, 1). One also
notes that the covering SL(2; C) → SO(3; C) is simply the complexification of the
covering SU(2) → SO(3; R), while the covering CS3 → CP3 is the complexification of
the covering S3 → RP3. Thus, one can regard the transition from non-relativistic physics
to relativistic physics as having as much to do with the transition from real to complex
numbers as it does with the transition from three dimensions to four.

Because the complex quaternions of vector type represent a real vector space of
dimension six, it seems, on the surface of things, that they would be less well-adapted to
the motions of points in a four-dimensional real vector space, such as Minkowski space.
However, this is only partially true, since they are eminently adapted to the problem of
describing the motions of lines and 2-planes in R4, as well as bivectors and 2-forms, which
are at the root of modern electromagnetism. We shall discuss this shortly, but first we
want to discuss how the action of the Lorentz group on spinors can be represented by a
linear action of the unit complex quaternions on the complex quaternions.

One can represent SL(2; C) spinors by means of complex quaternions, as long as one has
chosen an idempotent ε. As discussed above, such an element generates both a left ideal
IL(ε) = HCε and a right ideal IR(ε) = ε HC , both of which are two-dimensional
subalgebras that are composed of nothing but null quaternions. We also note that any left
ideal is, by definition, an invariant subspace of the action of SL(2; C) on HC by
left-multiplication. Analogously, any right ideal is an invariant subspace of the action of
SL(2; C) on HC by right-multiplication.
The representation of a unit quaternion u by an invertible 4×4 complex matrix
[L(u)]^µ_ν that one gets from left-multiplication is the one that we discussed in Chapter I
that one gets by choosing a basis for HC and using the structure constants a^µ_κν and the
components u^κ to define:

$$[L(u)]^\mu_\nu = a^\mu_{\kappa\nu}\, u^\kappa . \tag{17.9}$$

In the case of the structure constants for the quaternions (which are the same
regardless of the coefficient ring), one can simply write out the components of the
product uq and identify the matrix that takes q^µ to (uq)^µ. This makes:

$$[L(u)]^\mu_\nu = \begin{bmatrix} u^0 & -u^1 & -u^2 & -u^3 \\ u^1 & u^0 & -u^3 & u^2 \\ u^2 & u^3 & u^0 & -u^1 \\ u^3 & -u^2 & u^1 & u^0 \end{bmatrix} = a^\mu_{0\nu} u^0 + a^\mu_{1\nu} u^1 + a^\mu_{2\nu} u^2 + a^\mu_{3\nu} u^3 , \tag{17.10}$$

with

$$a^\mu_{0\nu} = I, \quad a^\mu_{1\nu} = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad a^\mu_{2\nu} = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix}, \quad a^\mu_{3\nu} = \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}. \tag{17.11}$$

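With the notation of Chapter II, we can express these matrices in the form:

$$a^\mu_{0\nu} = \begin{bmatrix} \tau_0 & 0 \\ 0 & \tau_0 \end{bmatrix}, \quad a^\mu_{1\nu} = -\begin{bmatrix} \tau_2 & 0 \\ 0 & \tau_2 \end{bmatrix}, \quad a^\mu_{2\nu} = i \begin{bmatrix} 0 & \tau_1 \\ -\tau_1 & 0 \end{bmatrix}, \quad a^\mu_{3\nu} = i \begin{bmatrix} 0 & \tau_3 \\ -\tau_3 & 0 \end{bmatrix}. \tag{17.12}$$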

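Similarly, for right multiplication one finds that the matrix [R(u)]^µ_ν comes from:

$$[R(u)]^\mu_\nu = a^\mu_{\nu\kappa}\, u^\kappa , \tag{17.13}$$

which makes:

$$[R(u)]^\mu_\nu = \begin{bmatrix} u^0 & -u^1 & -u^2 & -u^3 \\ u^1 & u^0 & u^3 & -u^2 \\ u^2 & -u^3 & u^0 & u^1 \\ u^3 & u^2 & -u^1 & u^0 \end{bmatrix} = a^\mu_{\nu 0} u^0 + a^\mu_{\nu 1} u^1 + a^\mu_{\nu 2} u^2 + a^\mu_{\nu 3} u^3 , \tag{17.14}$$

with:

$$a^\mu_{\nu 0} = I, \quad a^\mu_{\nu 1} = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix}, \quad a^\mu_{\nu 2} = \begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad a^\mu_{\nu 3} = \begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}, \tag{17.15}$$

or

$$a^\mu_{\nu 0} = \begin{bmatrix} \tau_0 & 0 \\ 0 & \tau_0 \end{bmatrix}, \quad a^\mu_{\nu 1} = \begin{bmatrix} -\tau_2 & 0 \\ 0 & \tau_2 \end{bmatrix}, \quad a^\mu_{\nu 2} = \begin{bmatrix} 0 & -\tau_0 \\ \tau_0 & 0 \end{bmatrix}, \quad a^\mu_{\nu 3} = \begin{bmatrix} 0 & -\tau_2 \\ -\tau_2 & 0 \end{bmatrix}. \tag{17.16}$$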

One can easily show these actions preserve the scalar product <.,.>, since if q′ = uq
then:

$$q'\bar{q}' = u q \bar{q}\, \bar{u} = u\bar{u} \cdot q\bar{q} = q\bar{q} ,$$

while if q′ = qu then:

$$q'\bar{q}' = q u \bar{u}\, \bar{q} = q\bar{q} .$$

Of course, since the ideals IL(ε) and IR(ε) are composed exclusively of null
quaternions, the scalar product is always zero.
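The construction of the left and right multiplication matrices from structure constants, as in (17.9) and (17.13), can be sketched in a few lines. The code assumes the structure constants are defined by e_κ e_ν = a^µ_κν e_µ with the usual quaternion multiplication table; the array layout A[kappa][mu][nu] and the function names are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

# Basis quaternions e_0, ..., e_3.
E = [tuple(1 if i == k else 0 for i in range(4)) for k in range(4)]

# Structure constants: A[kappa][mu][nu] is the e_mu component of e_kappa e_nu.
A = [[[qmul(E[k], E[nu])[mu] for nu in range(4)] for mu in range(4)]
     for k in range(4)]

def left_matrix(u):
    """[L(u)]^mu_nu = a^mu_{kappa nu} u^kappa, as in (17.9)."""
    return [[sum(A[k][mu][nu] * u[k] for k in range(4)) for nu in range(4)]
            for mu in range(4)]

def right_matrix(u):
    """[R(u)]^mu_nu = a^mu_{nu kappa} u^kappa, as in (17.13)."""
    return [[sum(A[nu][mu][k] * u[k] for k in range(4)) for nu in range(4)]
            for mu in range(4)]

def apply(M, q):
    """Matrix acting on the component column of q."""
    return tuple(sum(M[mu][nu] * q[nu] for nu in range(4)) for mu in range(4))

u = (1, 2, 3, 4)      # integer components keep the comparison exact
q = (5, -1, 2, 7)
L, R = left_matrix(u), right_matrix(u)
left_ok = apply(L, q) == qmul(u, q)
right_ok = apply(R, q) == qmul(q, u)
```

With u = (1, 2, 3, 4), the rows of L and R reproduce the sign patterns displayed in (17.10) and (17.14).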
Similarly, IL,R(εc) = IL,R(ε̄) is an invariant subspace under these actions,
respectively, and thus, if ε is a primitive idempotent then this defines a decomposition of
HC into irreducible representations:

HC = IL(ε) ⊕ IL(ε̄) = IR(ε) ⊕ IR(ε̄) .

Since the invariant subspaces IL(ε) or IR(ε), as well as their complements, are
two-dimensional, if one chooses a complex 2-frame {f_a , a = 1, 2} in any of them then the
action of CQ1 by left or right multiplication will produce another complex 2-frame in the
same space, regardless of what unit quaternion u one chooses to act on the frame. Hence,
one can associate u with an invertible 2×2 matrix Λ^b_a or Γ^b_a by way of:

$$u\, f_a = f_b\, \Lambda^b_a , \qquad f_a\, u^\dagger = f_b\, \Gamma^b_a , \tag{17.17}$$

depending upon whether one is dealing with left or right translation, respectively.

Since the action preserves the scalar product, the matrix will have determinant 1; i.e.,
it will belong to SL(2; C). Thus, the action of the group of complex unit quaternions on
the invariant subspaces of left and right multiplication behaves like the defining
representation of SL(2; C) on C2 by left or right matrix multiplication on column or row
vectors, respectively.
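A concrete idempotent makes the picture of spinors as elements of a left ideal easier to see. The sketch below assumes the idempotent ε = ½(1 + i e_1) and the matrices τ_µ inferred from (17.12); neither is stated in this form in the text, so both should be read as illustrative choices, as should the function names.

```python
import cmath

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def norm2(q):
    """Complex norm-squared, the sum of squared components."""
    return sum(c * c for c in q)

def to_matrix(q):
    """2x2 matrix [q] for the assumed tau_mu."""
    q0, q1, q2, q3 = q
    return ((q0 + 1j*q1, q2 + 1j*q3),
            (-q2 + 1j*q3, q0 - 1j*q1))

def close(x, y, tol=1e-9):
    return abs(x - y) < tol

EPS = (0.5, 0.5j, 0, 0)                       # assumed idempotent (1 + i e_1)/2
is_idempotent = all(close(a, b) for a, b in zip(qmul(EPS, EPS), EPS))
eps_is_null = close(norm2(EPS), 0)

q = (1+2j, 0.5-1j, -2+0j, 3j)                 # arbitrary complex quaternion
s = qmul(q, EPS)                              # element of the left ideal HC eps
s_is_null = close(norm2(s), 0)
S = to_matrix(s)                              # first column vanishes
first_column_zero = close(S[0][0], 0) and close(S[1][0], 0)

# Left multiplication by a unit quaternion acts on the surviving column by [u].
g = (0.3j, 2+1j, 1-1j, -1+0.5j)
root = cmath.sqrt(norm2(g))
u = tuple(c / root for c in g)
U = to_matrix(u)
S2 = to_matrix(qmul(u, s))
column_ok = all(close(S2[i][1], U[i][0]*S[0][1] + U[i][1]*S[1][1]) for i in range(2))
det_U = U[0][0]*U[1][1] - U[0][1]*U[1][0]     # equals 1 for a unit quaternion
```

In this picture the left ideal is the set of matrices whose first column vanishes, and the remaining column transforms as a vector in C2 under a matrix of determinant 1.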

Following Blaton [6], we will call quaternions, when given the action of
right-multiplication by u†, where u is a unit quaternion, semi-quaternions of the first kind, and
when they are given the action of left-multiplication by u, they will be called
semi-quaternions of the second kind. In order to distinguish them, we shall also use a single
underbar for the first kind and a double underbar for the second kind.
One immediately sees that products of the form pq, with p of the second kind and q of
the first kind, also transform like vectors, since:

$$p'q' = u\,p\,q\,u^\dagger = u\,(pq)\,u^\dagger . \tag{17.18}$$

This is analogous to the way that one expresses vectors in Minkowski space as tensor
products of spinors. Of course, the components of that tensor product will define a 2×2
complex matrix, and we know that SL(2; C) acts on such things by matrix conjugation,
since it acts on the column vectors of C2 by left-multiplication and the row vectors of C2*
by right-multiplication.
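In general, products of the form p q̄ are invariant under right-multiplication, while
products of the form p̄ q are invariant under left-multiplication:

$$p'\bar{q}' = p\, u^\dagger\, \bar{u}^\dagger\, \bar{q} = p\, (\bar{u} u)^\dagger\, \bar{q} = p\,\bar{q} ,$$
$$\bar{p}'q' = \bar{p}\, \bar{u}\, u\, q = \bar{p}\, q .$$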

A semi-quaternion q is of the first (second, resp.) kind iff q† is of the second (first,
resp.) kind:
( q u † )† = u q † , (uq )† = q †u † .

If one decomposes a semi-quaternion of the first kind q into its projections qε and
qε̄ in the left ideals IL(ε) and IL(ε̄) for a primitive idempotent ε and its conjugate ε̄ then
qε and qε̄ will be referred to as spinors of the first kind. Similarly, the decomposition
into right ideals gives spinors of the second kind. The general quaternion will then
represent a sum of linearly independent spinors of the same kind, or a bispinor:

$$q = q\varepsilon + q\bar{\varepsilon} = \varepsilon q + \bar{\varepsilon} q . \tag{17.19}$$

It is important to point out that the definition of a spinor in this way clearly depends
upon the choice of idempotent ε.

The “chiral” action of CQ1 on HC that we defined above, which takes (u1, u2, q) to
u1qu2, is also an isometry of the complex Euclidian scalar product, since if q′ = u1qu2 then
one must have:

$$q'\bar{q}' = u_1 q u_2\, \bar{u}_2 \bar{q}\, \bar{u}_1 = q\bar{q} .$$

Thus, if we express this action by means of an invertible 4×4 complex matrix M^ν_µ by
means of:

$$u_1\, e_\mu\, u_2 = e_\nu\, M^\nu_\mu$$

then we can see that one must have that M^ν_µ ∈ SO(4; C). One thus defines a
homomorphism SL(2; C) × SL(2; C) → SO(4; C), (u1, u2) ↦ M^ν_µ that also defines an
isomorphism at the level of Lie algebras.
The chiral action can also be written as the product of a left-translation matrix and a
right-translation matrix. In fact, the associativity of the quaternion multiplication implies
that the two matrices must commute:

$$(u_1 e_\mu)\, u_2 = e_\nu\, L^\nu_\kappa(u_1)\, R^\kappa_\mu(u_2) , \qquad u_1\, (e_\mu u_2) = e_\nu\, R^\nu_\kappa(u_2)\, L^\kappa_\mu(u_1) ,$$

so

$$L^\nu_\kappa(u_1)\, R^\kappa_\mu(u_2) = R^\nu_\kappa(u_2)\, L^\kappa_\mu(u_1) .$$
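The commutation of the left and right translation matrices, and the complex orthogonality of their product, can be verified on sample data. This sketch builds each matrix column by column from the images of the basis quaternions; the helper names and the two sample unit quaternions are hypothetical, and the determinant is not checked.

```python
import cmath

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def unit(q):
    """Rescale q to unit complex norm (assumes the norm is non-zero)."""
    root = cmath.sqrt(sum(c * c for c in q))
    return tuple(c / root for c in q)

E = [tuple(1 if i == k else 0 for i in range(4)) for k in range(4)]

def from_columns(images):
    """Matrix M with M[nu][mu] = nu-th component of the image of e_mu."""
    return [[images[mu][nu] for mu in range(4)] for nu in range(4)]

def left_matrix(u):
    return from_columns([qmul(u, e) for e in E])

def right_matrix(u):
    return from_columns([qmul(e, u) for e in E])

def chiral_matrix(u1, u2):
    return from_columns([qmul(qmul(u1, e), u2) for e in E])

def matmul4(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def same(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

u1 = unit((1+2j, 0.5-1j, -2+0j, 3j))
u2 = unit((0.3j, 2+1j, 1-1j, -1+0.5j))
L, R, M = left_matrix(u1), right_matrix(u2), chiral_matrix(u1, u2)

commute_ok = same(matmul4(L, R), matmul4(R, L))
product_ok = same(matmul4(L, R), M)

# Complex orthogonality: M^T M = I (transpose only, no complex conjugation).
MT = [[M[j][i] for j in range(4)] for i in range(4)]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
orthogonal_ok = same(matmul4(MT, M), identity)
```

Note that the orthogonality test uses the plain transpose, in keeping with the complex-bilinear scalar product.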

One can identify various subgroup actions by restricting the chiral action to
preserving various invariant subspaces. For instance, if one demands that it take real
quaternions to real ones and imaginary quaternions to imaginary ones then this would
make q′* = ± q′, which would imply that:

$$(u_1 q u_2)^* = u_1^*\, q^*\, u_2^* = \pm\, u_1^*\, q\, u_2^* = \pm\, u_1 q u_2 .$$

If one pre-multiplies both sides of the last equality by ũ1 and post-multiplies it by ũ2 then
one gets ũ1 u1* q* u2* ũ2 = q and if this is to be true for all real q then one must have:

$$u_1 = u_1^* , \qquad u_2 = u_2^* ;$$

i.e., u1 and u2 must be real quaternions.


Similarly, if one considers the corresponding statement regarding the matrix M^ν_µ then
one must have:

$$(e_\nu M^\nu_\mu)^* = e_\nu^*\, (M^\nu_\mu)^* = \pm\, e_\nu\, (M^\nu_\mu)^* = \pm\, e_\nu\, M^\nu_\mu .$$

This then implies that (M^ν_µ)* = M^ν_µ , which makes the matrix real. Thus, one now has a
homomorphism SU(2) × SU(2) → SO(4; R), (u1, u2) ↦ M^ν_µ .
If the action is to take scalars to scalars and vectors to vectors then one must have:

$$\overline{u_1 e_\mu u_2} = \bar{u}_2\, \bar{e}_\mu\, \bar{u}_1 = \pm\, \bar{u}_2\, e_\mu\, \bar{u}_1 = \pm\, u_1 e_\mu u_2 ,$$

which leads to the condition:

$$u_1 = \bar{u}_2 ,$$

and the action reduces to that of conjugate congruence.


In order to get back to the proper, orthochronous, Lorentz group, one needs only to
restrict the action in such a way that it takes (anti) self-adjoint quaternions to other (anti)
self-adjoint ones:

$$q'^\dagger = u_2^\dagger\, q^\dagger\, u_1^\dagger = \pm\, u_2^\dagger\, q\, u_1^\dagger = \pm\, q' = \pm\, u_1 q u_2 ,$$

which implies that one must have:

$$u_1 = u_2^\dagger .$$

The action then reduces to adjoint congruence.


Since this also implies that the matrix L^µ_ν(u1) must be the complex conjugate of
R^µ_ν(u2), we have the following theorem that goes back to Einstein and Mayer [12], and is
also discussed by Scherrer [13], Blaton [6], and Lanczos [7]:

Theorem:

Any proper, orthochronous Lorentz matrix M^µ_ν can be expressed as the product
L^µ_κ(u1) R^κ_ν(u2) of two matrices such that:

1. L^µ_κ(u1) R^κ_ν(u2) = R^µ_κ(u2) L^κ_ν(u1).

2. R^µ_ν(u2) = (L^µ_ν(u1))*.
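The two properties in the theorem can be tested numerically once a frame is fixed. The sketch below assumes that the matrices are expressed in the self-adjoint frame f_0 = e_0, f_i = i e_i (in which adjoint congruence has real coefficients) and that u2 = u1†; these are interpretive assumptions, since the text does not spell out the basis, and all names are hypothetical.

```python
import cmath

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def dagger(q):
    """Adjoint: complex-conjugate the components and negate the vector part."""
    c = [complex(x).conjugate() for x in q]
    return (c[0], -c[1], -c[2], -c[3])

E = [tuple(1 if i == k else 0 for i in range(4)) for k in range(4)]
D = (1, 1j, 1j, 1j)          # f_mu = D[mu] e_mu, the assumed self-adjoint frame
ETA = (1, -1, -1, -1)

def in_frame(images):
    """Matrix in the f-frame from the images of e_0..e_3 (given in e-components)."""
    return [[images[n][m] * D[n] / D[m] for n in range(4)] for m in range(4)]

def left_f(u):
    return in_frame([qmul(u, e) for e in E])

def right_f(u):
    return in_frame([qmul(e, u) for e in E])

def matmul4(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

g = (1+2j, 0.5-1j, -2+0j, 3j)
root = cmath.sqrt(sum(c * c for c in g))
u1 = tuple(c / root for c in g)
u2 = dagger(u1)

Lf, Rf = left_f(u1), right_f(u2)
M = matmul4(Lf, Rf)
M_swapped = matmul4(Rf, Lf)

tol = 1e-9
commute_ok = all(abs(M[i][j] - M_swapped[i][j]) < tol for i in range(4) for j in range(4))
conjugate_ok = all(abs(Rf[i][j] - complex(Lf[i][j]).conjugate()) < tol
                   for i in range(4) for j in range(4))
real_ok = all(abs(complex(M[i][j]).imag) < tol for i in range(4) for j in range(4))
lorentz_ok = all(
    abs(sum(ETA[n] * M[n][m] * M[n][l] for n in range(4)) - (ETA[m] if m == l else 0)) < tol
    for m in range(4) for l in range(4))
```

In the basis e_µ itself the conjugation property does not hold entry by entry, which is why the change of frame is applied before comparing the two factors.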

We summarize the various actions of the group of complex unit quaternions in Table
1.

5. Some complex line geometry. In order to see how bivectors relate to lines, one
needs to define the Plücker-Klein embedding of the manifold of lines in RP3 in the vector
space Λ2R4 of bivectors over R4. First, one notes that under the projection R4 – {0} →
RP3, x ↦ [x], which takes any point in R4 that is not the origin to the line through the
origin that goes through it, a line [x, y] in RP3 will be the projection of a 2-plane through
the origin in R4. If one spans that plane by means of two linearly independent vectors –
say x and y – then there is a bivector x ^ y that gets associated with that line.

Table 1. Representation of metric spaces by invariant subspaces of HC.

Metric space    Invariant subspace                  Isometric action of unit quaternions

(C, | |2)       Scalar quaternions                  Conjugate congruence
(C2, 0)         Left or right ideals                Left or right multiplication
(R4, δµν)       Real or imaginary quaternions       Complex congruence
(R4, ηµν)       (Anti-)self-adjoint quaternions     Adjoint congruence
(C3, δij)       Pure quaternions                    Conjugate congruence

However, the choice of spanning vectors is not unique, and if one chooses any other
pair – say x′ and y′ − then they will be related to the first pair by an invertible linear
transformation:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a^1_1 & a^1_2 \\ a^2_1 & a^2_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \tag{18.1}$$

One then finds that:

$$x' \wedge y' = \det[a]\; x \wedge y . \tag{18.2}$$

Thus, any two choices for spanning vectors will produce bivectors that differ only by
a non-zero scalar multiple. Hence, they all define the same point [x ^ y] in PΛ2R4, which
is the five-dimensional real projective space of lines through the origin of Λ2R4. If RP13
is the manifold of lines in RP3 then the map RP13 → PΛ2R4, [x, y] ↦ [x ^ y] that takes a
line to the equivalence class of bivectors that are associated with it is an embedding that
one calls the Plücker-Klein embedding.
The image of the latter embedding does not consist of all bivectors, but only ones
that are decomposable; i.e., of the form x ^ y, instead of x ^ y + v ^ w. Decomposable
bivectors b have the characteristic property that:

b ^ b = 0. (18.3)

This is a homogeneous, quadratic condition on the bivectors, which then defines a
quadric hypersurface in Λ2R4, and because of the homogeneity, in PΛ2R4, as well. This
quadric is called the Klein quadric, and since it is a hypersurface in the five-dimensional
space PΛ2R4, this shows that the manifold of lines in RP3 is four-dimensional.
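The relationship between spanning vectors, bivector components, and the Klein quadric can be illustrated with integer arithmetic. In this sketch the components p_ij = x_i y_j − x_j y_i are the usual Plücker coordinates, and the condition b ^ b = 0 is written as p01 p23 − p02 p13 + p03 p12 = 0; the function names and sample vectors are hypothetical.

```python
from itertools import combinations

PAIRS = list(combinations(range(4), 2))   # (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)

def wedge(x, y):
    """Components of the bivector x ^ y on the basis e_i ^ e_j, i < j."""
    return {(i, j): x[i]*y[j] - x[j]*y[i] for i, j in PAIRS}

def klein(p):
    """Half the coefficient of e_0 ^ e_1 ^ e_2 ^ e_3 in b ^ b."""
    return p[(0, 1)]*p[(2, 3)] - p[(0, 2)]*p[(1, 3)] + p[(0, 3)]*p[(1, 2)]

x = (1, 2, 3, 4)
y = (0, 1, -1, 2)
b = wedge(x, y)

# Another spanning pair of the same plane, related by a = [[2, 1], [1, 3]] as in (18.1).
a = ((2, 1), (1, 3))
det_a = a[0][0]*a[1][1] - a[0][1]*a[1][0]
x2 = tuple(a[0][0]*xi + a[0][1]*yi for xi, yi in zip(x, y))
y2 = tuple(a[1][0]*xi + a[1][1]*yi for xi, yi in zip(x, y))
b2 = wedge(x2, y2)
rescaled_ok = all(b2[k] == det_a * b[k] for k in PAIRS)   # statement (18.2)

# A bivector that is not decomposable: e_0 ^ e_1 + e_2 ^ e_3.
c = {k: 0 for k in PAIRS}
c[(0, 1)] = 1
c[(2, 3)] = 1
```

Here klein(b) vanishes, since b is decomposable, while klein(c) does not, so c does not correspond to any line in RP3.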
The connection with complex quaternions is straightforward when one first notes that
the real vector space Λ2R4 is six-dimensional, which is also the real dimension of the
subspace of complex quaternions of vector type. One first assumes that R4 has been
given a “time-space decomposition” into a direct sum R ⊕ R3, where, for example, e0
might span the R summand and {ei, i = 1, 2, 3} might span the R3 summand. One then
notes that a basis for Λ2R4 can be defined by {ε_i, *ε_i , i = 1, 2, 3}, in which:

$$\varepsilon_i = e_0 \wedge e_i , \qquad {*\varepsilon_i} = \tfrac12\, \epsilon_{ijk}\, e_j \wedge e_k . \tag{18.4}$$

This not only defines a basis for Λ2R4 as a real vector space of dimension six, but if
we define the linear isomorphism * : Λ2R4 → Λ2R4, b ↦ *b by its effect on the basis
elements:

$$*(\varepsilon_i) = {*\varepsilon_i} , \qquad *({*\varepsilon_i}) = -\,\varepsilon_i \tag{18.5}$$

then we find that the map * also allows one to define a complex structure on Λ2R4 by
simply setting:
ib = *b, so (α + iβ)b = αb + β*b. (18.6)

One thus has a way of defining complex scalar multiplication on Λ2R4 that makes {ε_i,
i = 1, 2, 3} into a complex basis. Under this complex structure the bivectors of “electric”
type are the ones in the real subspace spanned by {ε_i, i = 1, 2, 3}, while the ones of
“magnetic” type are in the real subspace spanned by {*ε_i, i = 1, 2, 3}. Thus, the electric
bivectors correspond to the real subspace of the complex vector space, while the
magnetic ones correspond to the imaginary subspace. This suggests an obvious C-linear
isomorphism of Λ2R4 to C3 that takes the complex basis {ε_i, i = 1, 2, 3} to the canonical
basis {e_i, i = 1, 2, 3} in C3. The association of bivectors then takes the form:

$$E^i \varepsilon_i + B^i\, {*\varepsilon_i} \;\mapsto\; (E^i + iB^i)\, e_i .$$
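The statement that the * isomorphism plays the role of multiplication by i can be checked directly on components. This is a minimal sketch: a bivector is stored as a pair (E, B) of real triples relative to the basis {ε_i, *ε_i}, the action of * is read off from (18.5), and the function names are hypothetical.

```python
def to_complex(E, B):
    """Map E^i eps_i + B^i *eps_i to the complex 3-vector with components E^i + i B^i."""
    return tuple(e + 1j * b for e, b in zip(E, B))

def hodge(E, B):
    """Apply * using (18.5): eps_i -> *eps_i and *eps_i -> -eps_i."""
    return tuple(-b for b in B), tuple(E)

E = (1.0, 0.0, 2.0)       # "electric" components (sample values)
B = (0.5, -1.0, 0.0)      # "magnetic" components (sample values)

F = to_complex(E, B)
F_star = to_complex(*hodge(E, B))
star_is_i = all(abs(fs - 1j * f) < 1e-12 for fs, f in zip(F_star, F))

# Applying * twice returns minus the original bivector, matching i*i = -1.
E2, B2 = hodge(*hodge(E, B))
```

After two applications one finds E2 = −E and B2 = −B, which is what makes * a complex structure.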

As mentioned previously, this association of a bivector (or 2-form, for that matter)
with a complex 3-vector goes back to lectures of Riemann on partial differential
equations in mathematical physics, and was resurrected numerous times by many other
researchers to this day. The subsequent association of a bivector with a complex
quaternion of vector type then becomes obvious if one regards ei as also spanning the
vector subspace of H, which we have been treating as an algebra over C4.
The action of CQ1 on bivectors over R4 or lines in RP3 that is most appropriate is
then conjugate congruence, which has the complex quaternions of vector type as an
invariant subspace. This gives us a different way of geometrically characterizing the unit
complex quaternions of vector type. As we mentioned previously, since û = a + ib must
satisfy || û ||2 = − û û = 1, one must have <a, a> − <b, b> = 1 and <a, b> = 0. Thus, the
(real) spatial vectors a and b must be orthogonal, which means that their tips generate a
line, namely, (1 –λ)a + λb, and the vectors themselves span a plane through the origin.

6. The kinematics of Lorentzian frames. The (proper, orthochronous) Lorentz
group can be represented by any of three groups: SO+(3, 1), whose defining
representation acts on real Minkowski space M4; SO(3; C), whose defining representation
is on complex Euclidian space EC3; or SL(2; C), whose defining representation is on C2,
which is implicitly given the trivial scalar product. In the last section, we saw how each of the
representations can take the form of linear actions of the group CQ1 of complex unit
quaternions on various invariant subspaces of HC that were associated with each action.
Moreover, we saw how that action would affect the various types of frames that were
appropriate to each action.

In order to go on to relativistic kinematics, we only need to start with a differentiable
curve in the Lie group CQ1 and differentiate the action of that Lie group on the invariant
subspaces in each case. Hence, the actions will all have in common some things that
pertain to the differentiation of curves in CQ1 .
If u(t) = u^µ(t) e_µ is such a sufficiently differentiable curve then its velocity vector field
will be a vector field on that curve:

$$\dot{u}(t) = \frac{du}{dt} = \frac{du^\mu}{dt}\, e_\mu , \tag{19.1}$$

as will its acceleration vector field:

$$\ddot{u}(t) = \frac{d\dot{u}}{dt} = \frac{d^2 u}{dt^2} = \frac{d^2 u^\mu}{dt^2}\, e_\mu . \tag{19.2}$$

If one right-translates all points of the curve u(t) to the identity element 1 by means of
u(t)⁻¹ then the differential map to each individual right translation takes the curve u̇(t) in
the tangent spaces Tu(t)CQ1 to a curve in T1CQ1:

$$\omega(t) = \dot{u}(t)\, u(t)^{-1} . \tag{19.3}$$

Since T1CQ1 can be identified with the Lie algebra of CQ1, which is isomorphic to
sl(2; C), one can think of the elements ω(t) as being infinitesimal Lorentz
transformations, which makes the curve ω(t) represent a relativistic analogue of angular
velocity for whatever frame CQ1 is acting on. However, since sl(2; C) decomposes into
a direct sum of vector spaces su(2) ⊕ h(2), the transformations of ω(t) include both
infinitesimal rotations and infinitesimal boosts, and one can generally represent ω(t) in
the form ϖ(t) + η(t), where ϖ(t) represents a curve in su(2) and η(t) represents a curve in
h(2).
Since the acceleration ü(t) is a curve in the second tangent bundle T_u̇(t) T_u(t) CQ1, when
one right-translates u(t) back to 1, the effect is to right-translate u̇(t) to ω(t) and produce
a curve in T_ω(t) T1CQ1 = T_ω(t) sl(2; C), namely:

$$\alpha(t) = \ddot{u}(t)\, u(t)^{-1} . \tag{19.4}$$

This then represents a relativistic analogue of angular acceleration.


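If one takes the time derivative of ω(t) − namely:

$$\dot{\omega} = \ddot{u}\, u^{-1} + \dot{u}\, \frac{d}{dt}\!\left(u^{-1}\right) = \ddot{u}\, u^{-1} - \dot{u}\, u^{-1} \dot{u}\, u^{-1} = \alpha - \omega\omega ,$$

then one sees that:

$$\alpha = \dot{\omega} + \omega\omega . \tag{19.5}$$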

When one decomposes ω into ϖ + η, this decomposes α into:

$$\alpha = (\dot{\varpi} + \varpi\varpi) + (\dot{\eta} + \eta\eta) + \{\varpi, \eta\} . \tag{19.6}$$

Thus, the contributions from the rotations and boosts must be augmented by a coupling
term.
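The origin of the coupling term is just the expansion of the quadratic term in (19.5). Written out, with {ϖ, η} denoting the anticommutator ϖη + ηϖ, the intermediate step is:

```latex
\omega\omega = (\varpi + \eta)(\varpi + \eta)
             = \varpi\varpi + \eta\eta + \varpi\eta + \eta\varpi
             = \varpi\varpi + \eta\eta + \{\varpi, \eta\},
\qquad
\alpha = \dot{\omega} + \omega\omega
       = (\dot{\varpi} + \varpi\varpi) + (\dot{\eta} + \eta\eta) + \{\varpi, \eta\}.
```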
Let us now apply this to the action of CQ1 on Minkowski space, as it is represented
by either invariant subspace H±, namely, by adjoint congruence, which takes any (u, q) to
uqu†. If we regard the curve q(t) as being produced by the action of u(t) on some initial
element q0 = q(0), namely:
q(t) = u(t) q0 u(t)†, (19.7)

then its velocity vector field takes the form:

$$\dot{q} = \dot{u}\, q_0\, u^\dagger + u\, q_0\, \dot{u}^\dagger , \tag{19.8}$$

in which we have dropped the explicit reference to the curve parameter, for brevity.
This velocity amounts to the one that is observed by an “inertial” observer, whose
velocity relative to q(t) is then u̇. If one substitutes:

$$q_0 = u^{-1}\, q\, u^{-\dagger} \tag{19.9}$$

in (19.8) then the result is:

$$\dot{q} = \dot{u}\, u^{-1} q\, u^{-\dagger} u^\dagger + u\, u^{-1} q\, u^{-\dagger} \dot{u}^\dagger = \omega\, q + q\, \omega^\dagger . \tag{19.10}$$

This is then the form that is taken by the velocity of q(t) with respect to a co-moving – or
non-inertial − observer.
Now that we are dealing with the Lie algebra sl(2; C), we see that we cannot simply
assume that ω is anti-Hermitian, since that is only true for the part of it that belongs to
su(2). The other part is then Hermitian, and if we express ω = ϖ + η, such that ϖ is
anti-Hermitian and η is Hermitian, then we see that this makes:

$$\dot{q} = [\varpi, q] + \{\eta, q\} . \tag{19.11}$$
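The co-moving form of the velocity, (19.10), and its split into commutator and anticommutator parts, (19.11), can be compared with a finite-difference derivative. The sketch assumes a one-parameter curve u(t) = exp(t a) generated by a fixed complex vector quaternion a, for which ω = a is constant; that choice, the closed form used for the exponential, and the sample values are all illustrative assumptions.

```python
import cmath

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (complex entries allowed)."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def dagger(q):
    """Adjoint: complex-conjugate the components and negate the vector part."""
    c = [complex(x).conjugate() for x in q]
    return (c[0], -c[1], -c[2], -c[3])

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def qscale(c, q):
    return tuple(c * x for x in q)

def qexp(a, t):
    """exp(t a) for a complex vector quaternion a = (a1, a2, a3) with <a, a> != 0."""
    theta = cmath.sqrt(sum(c * c for c in a))
    s = cmath.sin(theta * t) / theta
    return (cmath.cos(theta * t), s * a[0], s * a[1], s * a[2])

a = (0.3 + 0.2j, -0.1j, 0.5)        # generator: mixes a rotation and a boost
omega = (0,) + a                     # omega = du/dt u^-1 = a for this curve
q0 = (1.0, 0.5j, -1j, 2j)            # initial self-adjoint quaternion

def q_at(t):
    """q(t) = u(t) q0 u(t)†, as in (19.7)."""
    u = qexp(a, t)
    return qmul(qmul(u, q0), dagger(u))

t, h = 0.4, 1e-5
numeric = qscale(1 / (2 * h), qsub(q_at(t + h), q_at(t - h)))   # central difference
q = q_at(t)
exact = qadd(qmul(omega, q), qmul(q, dagger(omega)))            # (19.10)

varpi = qscale(0.5, qsub(omega, dagger(omega)))                 # anti-self-adjoint part
eta = qscale(0.5, qadd(omega, dagger(omega)))                   # self-adjoint part
split = qadd(qsub(qmul(varpi, q), qmul(q, varpi)),              # [varpi, q]
             qadd(qmul(eta, q), qmul(q, eta)))                  # {eta, q}

err_velocity = max(abs(x - y) for x, y in zip(numeric, exact))
err_split = max(abs(x - y) for x, y in zip(split, exact))
```

Both errors should be small: the first is limited by the finite-difference step, while the second is pure rounding error, since (19.11) is an algebraic rearrangement of (19.10).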

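A second differentiation of q(t) gives the acceleration in an inertial frame:

$$\ddot{q} = \ddot{u}\, q_0\, u^\dagger + 2\dot{u}\, q_0\, \dot{u}^\dagger + u\, q_0\, \ddot{u}^\dagger , \tag{19.12}$$

and in a co-moving frame:

$$\ddot{q} = \ddot{u}\, u^{-1} q\, u^{-\dagger} u^\dagger + 2\dot{u}\, u^{-1} q\, u^{-\dagger} \dot{u}^\dagger + u\, u^{-1} q\, u^{-\dagger} \ddot{u}^\dagger = \alpha\, q + q\, \alpha^\dagger + 2\omega\, q\, \omega^\dagger . \tag{19.13}$$

If we replace the moving point q(t) with the moving frame:

$$f_\mu(t) = u(t)\, f_{0\mu}\, u(t)^\dagger \tag{19.14}$$

then one finds that its velocity takes the forms:

$$\dot{f}_\mu = \dot{u}\, f_{0\mu}\, u^\dagger + u\, f_{0\mu}\, \dot{u}^\dagger = \omega\, f_\mu + f_\mu\, \omega^\dagger , \tag{19.15}$$

while its acceleration takes the forms:

$$\ddot{f}_\mu = \ddot{u}\, f_{0\mu}\, u^\dagger + 2\dot{u}\, f_{0\mu}\, \dot{u}^\dagger + u\, f_{0\mu}\, \ddot{u}^\dagger = \alpha\, f_\mu + f_\mu\, \alpha^\dagger + 2\omega\, f_\mu\, \omega^\dagger . \tag{19.16}$$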

If we wish to now examine the action of CQ1 on bivectors, we recall that they are
modeled by the invariant subspace CV of complex vector quaternions under the action of
conjugate congruence, which takes (u, q) to uqū. Thus, if q0 = q(0) represents an initial
bivector then its time evolute can be defined by:

$$q(t) = u(t)\, q_0\, \bar{u}(t) . \tag{19.17}$$

Although we are using a different automorphism in order to define the action of CQ1 ,
nonetheless, the calculations that follow are essentially the same, except for a change of
symbol. Thus, velocity and acceleration in an inertial frame take the form:

$$\dot{q} = \dot{u}\, q_0\, \bar{u} + u\, q_0\, \dot{\bar{u}} , \qquad \ddot{q} = \ddot{u}\, q_0\, \bar{u} + 2\dot{u}\, q_0\, \dot{\bar{u}} + u\, q_0\, \ddot{\bar{u}} , \tag{19.18}$$

while in a co-moving frame they become:

$$\dot{q} = \omega\, q + q\, \bar{\omega} , \qquad \ddot{q} = \alpha\, q + q\, \bar{\alpha} + 2\omega\, q\, \bar{\omega} . \tag{19.19}$$

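The kind of frame that is most appropriate to CV is an oriented, orthonormal, complex
3-frame {f_i, i = 1, 2, 3}. When one substitutes f_i(t) for q(t) and f_0i for q0, the last two sets
of equations take the analogous forms:

$$\dot{f}_i = \dot{u}\, f_{0i}\, \bar{u} + u\, f_{0i}\, \dot{\bar{u}} = \omega\, f_i + f_i\, \bar{\omega} , \tag{19.20}$$

$$\ddot{f}_i = \ddot{u}\, f_{0i}\, \bar{u} + 2\dot{u}\, f_{0i}\, \dot{\bar{u}} + u\, f_{0i}\, \ddot{\bar{u}} = \alpha\, f_i + f_i\, \bar{\alpha} + 2\omega\, f_i\, \bar{\omega} . \tag{19.21}$$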

The kinematics of spinors (see, e.g., Gürsey [14] or Proca [15]) involves the action of
CQ1 on the invariant subspaces of either left or right multiplication, namely, the left or
right ideals. If HC = IL(ε) ⊕ IL(ε̄) is a decomposition of HC into left ideals relative to a
primitive idempotent ε then the action of CQ1 on IL(ε) or IL(ε̄) takes the form (u, q) ↦
uq. Thus, if u(t) is a sufficiently differentiable curve in CQ1 and q0 is an element of
either ideal then one defines a sufficiently differentiable curve q(t) in the respective ideal
by way of:

$$q(t) = u(t)\, q_0 . \tag{19.22}$$

Differentiation gives the velocity and acceleration in the inertial frame as:

$$\dot{q} = \dot{u}\, q_0 , \qquad \ddot{q} = \ddot{u}\, q_0 , \tag{19.23}$$

and in the co-moving frame, one substitutes q0 = u⁻¹q in order to get:

$$\dot{q} = \omega\, q , \qquad \ddot{q} = \alpha\, q . \tag{19.24}$$

The kind of frame that is most suited to this action is a complex, null 2-frame {f_a, a =
1, 2} for either IL(ε) or IL(ε̄). If one substitutes f_0a for q0 and f_a for q then the
kinematical equations take the forms:

$$\dot{f}_a = \dot{u}\, f_{0a} = \omega\, f_a , \qquad \ddot{f}_a = \ddot{u}\, f_{0a} = \alpha\, f_a . \tag{19.25}$$

The only difference between the left action and the right action is that the right action
also involves taking the adjoint of u before acting to the right. Thus, the curve q(t) in a
right ideal comes from:
q(t) = q0 u(t)†. (19.26)

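The kinematical equations can be obtained from (19.23) and (19.24) by inspection:

$$\dot{q} = q_0\, \dot{u}^\dagger = q\, \omega^\dagger , \qquad \ddot{q} = q_0\, \ddot{u}^\dagger = q\, \alpha^\dagger , \tag{19.27}$$

and the kinematical equations for moving frames become:

$$\dot{f}_a = f_{0a}\, \dot{u}^\dagger = f_a\, \omega^\dagger , \qquad \ddot{f}_a = f_{0a}\, \ddot{u}^\dagger = f_a\, \alpha^\dagger . \tag{19.28}$$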

References

1. H. Weber, Die partiellen Differentialgleichungen der mathematischen Physik, nach
Riemann’s Vorlesungen, v. 2, Vieweg and Son, Braunschweig, 1901; see § 138,
especially.
2. A. Conway, “On the application of quaternions to some recent developments of
electrical theory,” Proc. Roy. Irish Acad. A: Math. Phys. Sci. 29 (1911/1912), 1-9.
3. L. Silberstein:
a. “Elektromagnetische Grundgleichungen in bivectorieller Behandlung,” Ann. d.
Phys. 327 (1907), 579-586. English translation by D. H. Delphenich at neo-
classical-physics.info.
b. “Nachtrag zur Abhandlung über ‘Elektromagnetische Grundgleichungen in
bivectorieller Behandlung’,” Ann. d. Phys. 329 (1907), 783-784. English
translation by D. H. Delphenich at neo-classical-physics.info.
4. E. Majorana, personal notes that were later compiled in S. Esposito, E. Recami, A.
van der Merwe, and R. Battiston, Ettore Majorana: Research Notes in Theoretical
Physics, Springer, Heidelberg, 2008.
5. J. R. Oppenheimer, “Note on light quanta and the electromagnetic field,” Phys.
Rev. 38 (1931), 725-746.
6. J. Blaton, “Quaternionen, Semivektoren, und Spinoren,” Zeit. Phys. 95 (1935), 337-
354. English translation by D. H. Delphenich at neo-classical-physics.info.
7. C. Lanczos, “Die tensoranalytischen Beziehungen der Diracschen Gleichung,” Zeit.
Phys. 57 (1927), 447-473. English translation by D. H. Delphenich at neo-classical-
physics.info.
8. L. Silberstein, The Theory of Relativity, MacMillan, London, 1914.
9. P. Weiss, “On some applications of quaternions to restricted relativity and classical
radiation theory,” Proc. Roy. Irish Acad. A: Math. Phys. Sci. 46 (1940/1941), 129-
168.
10. P. Rastall, “Quaternions in relativity,” Rev. Mod. Phys. (1964), 820-832.
11. L. E. Dickson, Algebras and their Arithmetics, Dover, Mineola, NY, 1960; first
edition, 1923.
12. A. Einstein and W. Mayer, “Semivektoren und Spinoren,” Sitz. d. preuss. Akad. d.
Wiss. (1932), 522-550.
13. W. Scherrer, “Quaternionen und Semivektoren,” Comm. Math. Helv. 7 (1935), 141-
149. English translation by D. H. Delphenich at neo-classical-physics.info.
14. F. Gürsey, “Relativistic kinematics of a class of point particles in spinorial form,”
Nuov. Cim. 5 (1957), 784-809.
15. A. Proca:
a. “Mécanique du point,” J. Phys. Rad. 15 (1954), 65-72. English translation by
D. H. Delphenich at neo-classical-physics.info.
b. “Particules de trés grandes vitesse en mécanique spinorielle,” Nuov. Cim. 2
(1955), 962-971. English translation by D. H. Delphenich at neo-classical-
physics.info.
CHAPTER V

COMPLEX DUAL QUATERNIONS

1. The group of complex rigid motions. The group ISO(3; C) of complex rigid
motions is defined by complexification of the corresponding real group. That is, one
starts with complex three-dimensional affine Euclidian space $E^3_C = (A^3_C, \delta_{ij})$, upon which
one has an action of the three-dimensional complex translation group C3.
Thus, we must begin with complex three-dimensional affine space $A^3_C$. This is
simply a space on which one has defined a simply transitive action of the complex
translation group C3, which can either be written in the form $x \mapsto y = x + z^i$ or as an
anti-symmetric function from $A^3_C \times A^3_C$ to C3 that takes (x, y) to $y - x = z^i$. Hence, if one
chooses a point $O \in A^3_C$ to serve as “origin” or reference point then any $x \in A^3_C$ can be
associated with an ordered triple of complex numbers $(z^1, z^2, z^3)$ that makes $x = O + z^i$. If
one chooses a complex 3-frame {e1, e2, e3} in $T_O A^3_C$ then $z^i$ can be associated with the
tangent vector $z = z^i e_i$. The group A(3; C) of complex affine transformations acts on
$GL(A^3_C)$ on the right as:
$(x, f_i)(z^i, \tilde{L}^i_j) = (x + z^i f_i, \; f_j \tilde{L}^j_i),$ (20.1)

so it acts on the coordinates $(x^i, f^i_j)$ of a complex affine frame $(x, f_i)$ on the left by:

$(z^i, L^i_j)(x^i, f^i_j) = (z^i + L^i_j x^j, \; L^i_k f^k_j).$ (20.2)

One introduces a complex Euclidian structure on $T(A^3_C)$ in the form of a complex
scalar product $\langle v_x, w_x \rangle$ on the tangent vectors at each point. A complex linear frame $e_i$ in
$T_z A^3_C$ is said to be orthonormal iff:
$\langle e_i, e_j \rangle = \delta_{ij},$ (20.3)
so if $v_z = v^i e_i$ and $w_z = w^i e_i$ then:
$\langle v_z, w_z \rangle = \delta_{ij} v^i w^j.$ (20.4)

A complex rigid frame $(z, e_i)$ on $E^3_C$ then consists of a point z in complex
three-dimensional affine space $A^3_C$ and a complex-orthonormal frame {ei, i = 1, 2, 3} in the
tangent space $T_z A^3_C$. A complex rigid motion is then a transformation T of $A^3_C$ whose
differential preserves the Euclidian scalar product:

$\langle dT|_z v, \; dT|_z w \rangle = \langle v, w \rangle,$

where $v, w \in T_z A^3_C$, so $dT|_z v, \; dT|_z w \in T_{T(z)} A^3_C$, and also preserves the orientation of any
frame. One can then show that $dT|_z$ is invertible at every z, and by the inverse function
theorem, T is locally invertible.
Since the differential of any uniform translation is zero, the group C3 will be a
subgroup of ISO(3; C), as will the subgroup SO(3; C) of all complex, orientation-
preserving rotations of any tangent space. In fact, one can represent ISO(3; C) as the
semi-direct product C3 ×s SO(3; C), although the representation depends upon a choice of
complex rigid frame. Thus, the complex Lie group ISO(3; C) has a complex dimension
of six or a real dimension of twelve. As a real manifold, it is diffeomorphic to the
product C3 × RP3 × R3, since SO(3; C) is diffeomorphic to SO(3) × R3, so it is
non-compact and connected, but not simply connected. Its
simply connected covering group is C3 ×s SL(2; C).

If $(z^i, R^i_j), (w^i, S^i_j) \in$ C3 ×s SO(3; C) then their product is:

$(z^i, R^i_j)(w^i, S^i_j) = (z^i + R^i_j w^j, \; R^i_k S^k_j).$ (20.5)

The group C3 ×s SO(3; C) also acts on complex rigid frames in an analogous way:

$(z, e_i)(s^i, R^i_j) = (z + s^i e_i, \; e_j R^j_i).$ (20.6)

One can represent an element $(s^i, R^i_j)$ of C3 ×s SO(3; C) by an invertible 4×4 complex
matrix if one treats the coordinates $z^i$ of C3 as inhomogeneous coordinates for a Plücker
coordinate chart on CP3 and regards C3 as embedded in the space of homogeneous
coordinates C4 – {0} as the affine hyperplane $z^0 = 1$. The matrix of $(s^i, R^i_j)$ is then the
complexification of the corresponding real one:

$\begin{pmatrix} 1 & 0 \\ s^i & R^i_j \end{pmatrix}.$

One notices that this construction is not as natural for the covering group C3 ×s SL(2;
C), since SL(2; C) is more closely related to the geometry of CP2. However, if one
restricts oneself to a subset of the form C2 × SL(2; C) then one can represent an element
$(s^a, L^a_b)$ as an invertible complex 3×3 matrix:

$\begin{pmatrix} 1 & 0 \\ s^a & L^a_b \end{pmatrix}.$

One notes, from (20.5), that, in fact, no C2 subspace Π of C3 will define a subgroup
of the form C2 ×s SO(3; C), since the issue is whether every complex rotation $R^i_j z^j$ of a
$z^i \in$ Π will still be a vector in Π. That is, Π will have to be an invariant subspace of the
action of SO(3; C) on C3 by left-multiplication.
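The 4×4 matrix representation introduced earlier in this section can be checked against the product rule (20.5) numerically. The sketch below is illustrative only: it assumes NumPy arrays for the pairs (z, R), and it builds elements of SO(3; C) by a Cayley transform of a complex anti-symmetric matrix, which is merely a convenient way of producing test data and not a construction used in the text. The helper names `antisym`, `cayley_rotation`, `to_matrix`, and `compose` are hypothetical.

```python
import numpy as np

def antisym(a):
    """Complex anti-symmetric 3x3 matrix built from a complex 3-vector."""
    a1, a2, a3 = a
    return np.array([[0, -a3, a2],
                     [a3, 0, -a1],
                     [-a2, a1, 0]], dtype=complex)

def cayley_rotation(K):
    """Cayley transform (I - K)(I + K)^{-1}; lies in SO(3; C) when K^T = -K."""
    I = np.eye(3)
    return (I - K) @ np.linalg.inv(I + K)

def to_matrix(s, R):
    """4x4 matrix [[1, 0], [s, R]] of the complex rigid motion (s, R)."""
    M = np.zeros((4, 4), dtype=complex)
    M[0, 0] = 1
    M[1:, 0] = s
    M[1:, 1:] = R
    return M

def compose(m1, m2):
    """Semi-direct product rule (20.5): (z, R)(w, S) = (z + R w, R S)."""
    (z, R), (w, S) = m1, m2
    return (z + R @ w, R @ S)

R = cayley_rotation(antisym([0.2 + 0.1j, -0.3j, 0.4]))
S = cayley_rotation(antisym([0.1, 0.5 - 0.2j, -0.3j]))
z = np.array([1 + 1j, 2, -1j])
w = np.array([0.5, 1j, 2 - 1j])

zs, RS = compose((z, R), (w, S))
print(np.allclose(to_matrix(z, R) @ to_matrix(w, S), to_matrix(zs, RS)))
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1))
```

Note that the orthogonality condition uses the plain transpose, not the Hermitian conjugate, since the scalar product on C3 is complex-bilinear.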
Since every complex rotation has an axis [l] − which is, of course, a complex line, not
a real one – that is a one-dimensional invariant subspace of C3, its orthogonal
complement [l]⊥ is also an invariant plane of the rotation that is, moreover, linearly
isomorphic to C2. However, since different rotations will generally have non-collinear
axes, it is impossible to find one axis such that all rotations will have the same invariant
plane.
Since C2 is isomorphic to R4 as a real Lie group and SO(3; C) is isomorphic to SO+(3,
1), finding such invariant subspaces would be essential to representing transformations of
the Poincaré group R4 ×s SO+(3, 1) as complex rigid motions. However, we now see that
the complex rigid motions are not an extension of the Poincaré group.
That does not mean that complex rigid motions have no physical relevance. Since
their most natural action is by matrix multiplication on vectors and covectors in C3, the
issue is what physical relevance that space would have. However, we have already
pointed out that the spaces of 2-forms and bivectors on a four-dimensional vector space,
when given a complex structure, can be modeled by that complex vector space. Hence,
one would expect that the physical applications of complex rigid motions would relate to
the time evolution of electromagnetic fields, such as electromagnetic waves. Therefore,
we shall confine our attention to that application.

2. The algebra of complex dual numbers. One can define the algebra CD of
complex dual numbers quite simply by saying that it is D ⊗R C, which means that one
starts with the same basic elements 1 and ε as a complex basis, as in the real case, and
forms all complex linear combinations a +εb, where a and b are complex numbers, now.
We will then call a the (complex) scalar part of the complex dual number and εb is the
pure (complex) dual part.
Thus, the basic vector space on which the algebra is defined is C2, this time. If one
prefers to regard CD as an algebra over R4, instead, then one regards the four elements 1,
i, ε, iε, which are assumed to be linearly independent, as a basis. The multiplication rules
for the basis elements are the ones that follow naturally from associativity. We
summarize them in a table:

         1      i      ε      iε
   1     1      i      ε      iε
   i     i      −1     iε     −ε
   ε     ε      iε     0      0
   iε    iε     −ε     0      0
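The multiplication table can be reproduced mechanically. The following minimal sketch assumes that a complex dual number a + εb is stored as a pair of Python complex numbers; the class name `CDual` and the dictionary `table` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CDual:
    """Complex dual number a + eps*b with eps**2 = 0 (illustrative type)."""
    a: complex  # complex scalar part
    b: complex  # coefficient of eps (pure complex dual part)

    def __mul__(self, other):
        # (a + eps b)(c + eps d) = ac + eps (ad + bc), because eps^2 = 0
        return CDual(self.a * other.a, self.a * other.b + self.b * other.a)

basis = {
    "1": CDual(1, 0),
    "i": CDual(1j, 0),
    "eps": CDual(0, 1),
    "i*eps": CDual(0, 1j),
}

# Full 4x4 table of products of the real basis elements
table = {(r, c): x * y for r, x in basis.items() for c, y in basis.items()}

for r in basis:
    print(r, [table[(r, c)] for c in basis])
```

The lower-right 2×2 block of zeros in the table is what makes every pure complex dual number a divisor of zero.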

The rules of addition and multiplication are unchanged from the real case, at least
formally. Thus, the ring CD is a commutative ring with unity, with zero divisors defined
by pairs of pure complex dual numbers; thus, it is not a division algebra. The conjugation
operation that takes α = a + εb to $\bar{\alpha} = a - \varepsilon b$ is defined essentially as before, as is the
resulting modulus-squared. Since $|\alpha|^2 = \alpha\bar{\alpha} = a^2$ is still well defined (although, a being
complex, it is no longer positive-definite), the multiplicative inverse to α is again defined
whenever α is not a pure complex dual number and has formally the same expression as before:

$\alpha^{-1} = \frac{\bar{\alpha}}{|\alpha|^2}.$
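As a small worked check of the inverse formula, the sketch below stores a complex dual number a + εb as a tuple (a, b) and divides the conjugate by the modulus-squared. The function names are hypothetical, and the choice to raise `ZeroDivisionError` for a pure complex dual number is an illustrative convention, not something prescribed by the text.

```python
import cmath

def cd_mul(x, y):
    """Product (a + eps b)(c + eps d) = ac + eps (ad + bc)."""
    (a, b), (c, d) = x, y
    return (a * c, a * d + b * c)

def cd_conj(x):
    """Dual conjugate: a + eps b -> a - eps b."""
    a, b = x
    return (a, -b)

def cd_inverse(x):
    """Inverse as conjugate over modulus-squared; fails when a = 0."""
    a, b = x
    if a == 0:
        raise ZeroDivisionError("pure complex dual numbers are zero divisors")
    n2 = cd_mul(x, cd_conj(x))[0]      # modulus-squared, equal to a**2
    c = cd_conj(x)
    return (c[0] / n2, c[1] / n2)

alpha = (2 + 1j, 3 - 2j)
prod = cd_mul(alpha, cd_inverse(alpha))
print(cmath.isclose(prod[0], 1), abs(prod[1]) < 1e-12)
```

The inverse therefore works out to 1/a − εb/a², which exists exactly when the complex scalar part a is non-zero.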

Therefore, the invertible complex dual numbers define a non-compact, two-dimensional,
Abelian, complex Lie group CD*, which consists of the points of C2 that lie off the
“complex y-axis” a = 0. Unlike the real case, removing that axis does not disconnect C2,
since a non-zero complex number can be joined to any other by a path that avoids zero, so
CD* is connected. The subgroup CD1 of complex dual numbers of unit modulus is only
slightly more involved, this time, since the complex dual numbers of unit dual modulus must
not be confused with the complex numbers of unit modulus, whose modulus-squared is defined
by zz*. Thus, a typical element of CD1 still has the form:

α = ± 1 + εb,

except that b is complex, this time, and a typical element of CD* can be expressed in the
form of a product of a non-zero complex number and a complex dual number of unit
modulus:

$\alpha = a\left(1 + \frac{\varepsilon b}{a}\right).$

This brings us to the essential differences between D and CD, which are mostly
concerned with the fact that since a complex dual number α = (a + ib) + ε(c + id) can also
be expressed as the sum α = β + iγ of two real dual numbers β and γ by way of:

α = (a + εc) + i(b + εd),

one can define another automorphism of the algebra that takes α to its complex
conjugate:
α* = β – iγ = (a − ib) + ε(c − id) = (a + εc) − i(b + εd).

One can then combine the two types of conjugation that we have defined on complex
dual numbers to give the adjunction automorphism, which takes α = a + εb to:

$\alpha^\dagger = \bar{\alpha}^* = a^* - \varepsilon b^* = \bar{\beta} - i\bar{\gamma},$
which makes:
$\alpha\,\alpha^\dagger = (a + \varepsilon b)(a^* - \varepsilon b^*) = |a|^2 + \varepsilon(a^* b - a b^*).$

3. Functions of complex dual numbers. The only differences between the formulas
that we derived for functions of dual variables in chapter III and the ones for complex
dual variables are based in the fact that now they reduce to sums of functions of complex
variables. Hence, one must be more careful about the differentiation of such functions,
since complex differentiation is more restrictive than real differentiation in that complex
functions must satisfy the Cauchy-Riemann equations in order to be continuously
differentiable – or holomorphic. As a consequence, such functions are also complex
analytic, in the sense that they can be expanded into a convergent Taylor series, at least
locally.
We simply repeat the formulas of the previous section on functions of real dual
variables, starting with the power formula for a complex dual variable $\underline{z} = z + \varepsilon s$:

$\underline{z}^n = z^n + \varepsilon\, n z^{n-1} s = z^n + \varepsilon\, \frac{dz^n}{dz}\, s.$ (22.1)

More generally, a polynomial $P[\underline{z}]$ is a complex linear combination of powers of $\underline{z}$,
and for any polynomial function $P[\underline{z}]$ of the complex dual variable $\underline{z}$, one will have:

$P[\underline{z}] = P[z] + \varepsilon P'[z]\, s.$ (22.2)

This generalizes to the definition:

$f(\underline{z}) = f(z) + \varepsilon f'(z)\, s,$ (22.3)

if one assumes that f(z) is a holomorphic function of z.


In particular, the trigonometric formulas remain intact with complex dual arguments
$\underline{\alpha} = \alpha + \varepsilon s$:

$\cos\underline{\alpha} = \cos\alpha - \varepsilon s \sin\alpha,$ (22.4)
$\sin\underline{\alpha} = \sin\alpha + \varepsilon s \cos\alpha,$ (22.5)
$\cos^2\underline{\alpha} + \sin^2\underline{\alpha} = 1,$ (22.6)
$\cos 2\underline{\alpha} = \cos^2\underline{\alpha} - \sin^2\underline{\alpha},$ (22.7)
$\sin 2\underline{\alpha} = 2 \sin\underline{\alpha} \cos\underline{\alpha}.$ (22.8)

This time, α and s in $\underline{\alpha} = \alpha + \varepsilon s$ are complex numbers, so we resort to the relevant
formulas in the section on functions of complex variables for the verification of the
present formulas.
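The rule f(z + εs) = f(z) + ε f′(z) s can be tried out directly. The sketch below is a minimal illustration that stores a complex dual number as a tuple (z, s) and uses the standard-library `cmath` functions for the holomorphic parts; the helper `lift`, which requires the derivative to be supplied by hand, is a hypothetical name and not something defined in the text.

```python
import cmath

def lift(f, df):
    """Extend a holomorphic f (with derivative df) to pairs (z, s) ~ z + eps*s."""
    def F(x):
        z, s = x
        return (f(z), df(z) * s)
    return F

def cd_add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def cd_mul(x, y):
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])

def cd_close(x, y, tol=1e-12):
    return abs(x[0] - y[0]) < tol and abs(x[1] - y[1]) < tol

dcos = lift(cmath.cos, lambda z: -cmath.sin(z))
dsin = lift(cmath.sin, cmath.cos)

alpha = (0.4 + 0.9j, 1.5 - 0.2j)          # arbitrary complex dual argument
c, s = dcos(alpha), dsin(alpha)
two_alpha = (2 * alpha[0], 2 * alpha[1])

# Pythagorean identity, cf. (22.6)
check_pythagoras = cd_close(cd_add(cd_mul(c, c), cd_mul(s, s)), (1, 0))
# double-angle formula for the sine, cf. (22.8)
check_double_angle = cd_close(dsin(two_alpha), cd_mul((2, 0), cd_mul(s, c)))
# power formula with n = 3, cf. (22.1)
cube = cd_mul(alpha, cd_mul(alpha, alpha))
check_power = cd_close(cube, (alpha[0] ** 3, 3 * alpha[0] ** 2 * alpha[1]))

print(check_pythagoras, check_double_angle, check_power)
```

This is the same mechanism that underlies forward-mode automatic differentiation, here with complex rather than real coefficients.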

4. Complex-dual linear algebra. Much of what we said previously in chapter III in
the context of dual linear algebra carries over to the case of complex-dual linear algebra
by complexification. In particular, one is still dealing with a module over the ring CD,
rather than a vector space over a field, and the model for such a CD-module is the Cartesian
product CDn of n copies of CD, whose elements then look like $\underline{z} = (\underline{z}^1, \ldots, \underline{z}^n)$, where
the coordinates $\underline{z}^i$ are complex-dual numbers, this time.
One forms linear combinations $\underline{\alpha}\,\underline{z} + \underline{\beta}\,\underline{w}$ coordinate-wise, as before, and a basis for
an n-dimensional complex-dual vector space V is still a set $\{\underline{e}_1, \ldots, \underline{e}_n\}$ of n complex-dual
vectors such that any complex-dual vector $\underline{z}$ in V can be expressed as a linear
combination:

$\underline{z} = \sum_{i=1}^{n} \underline{z}^i\, \underline{e}_i.$ (23.1)

Once again, the issue of linear independence is complicated by the fact that the
existence of divisors of zero in CD makes it possible for $\underline{z}^i \underline{e}_i$ to be zero without all of
the $\underline{z}^i$ being individually zero. However, this does not preclude the existence of bases; it
only reduces the number of acceptable sets of n complex-dual vectors that could serve as
bases to essentially complex bases – i.e., ones with no non-vanishing complex dual part.
Any complex-dual vector $\underline{z}$ can be decomposed into complex-plus-pure-complex-dual
form, real-dual-plus-imaginary-dual form, or a sum of four real vectors:

$\underline{z} = z + \varepsilon w = \underline{x} + i\underline{y} = x + \varepsilon a + iy + i\varepsilon b,$

in which z and w are complex n-vectors, $\underline{x}$ and $\underline{y}$ are real-dual, and x, a, y, b are all real
n-vectors.
A complex-dual linear map L : V → W from an n-dimensional CD-linear space V to
an m-dimensional one W is a function that takes CD-linear combinations to CD-linear
combinations:
$L(\underline{\alpha}\,\underline{z} + \underline{\beta}\,\underline{w}) = \underline{\alpha}\, L(\underline{z}) + \underline{\beta}\, L(\underline{w}).$

If V is an n-dimensional complex-dual vector space then a choice of basis defines a
CD-linear isomorphism of V with CDn that takes $\underline{z}$, as in (23.1), to the n-tuple
$(\underline{z}^1, \ldots, \underline{z}^n)$. If bases $\{\underline{e}_1, \ldots, \underline{e}_n\}$ and $\{\underline{f}_1, \ldots, \underline{f}_m\}$ are chosen for V and W, respectively,
then any complex-dual linear map L can be associated with a complex-dual m×n matrix
$\underline{L}^a_i$ by way of:
$L(\underline{e}_i) = \underline{f}_a\, \underline{L}^a_i.$

The action of L on $\underline{z}$ can be associated with a corresponding action of $\underline{L}^a_i$ on $\underline{z}^i$ by
matrix multiplication:
$L(\underline{z})^a = \underline{L}^a_i\, \underline{z}^i.$

Like complex-dual vectors, the complex-dual matrix $\underline{L}^a_i$ can be decomposed into
complex-plus-pure-complex-dual form, real-dual-plus-imaginary-dual form, or even a
sum of four real matrices:

$\underline{L}^a_i = L^a_i + \varepsilon A^a_i = \underline{R}^a_i + i\, \underline{I}^a_i = R^a_i + \varepsilon A^a_i + i I^a_i + i\varepsilon B^a_i,$

in which $L^a_i$ and $A^a_i$ (in the first expression) are complex matrices, $\underline{R}^a_i$ and $\underline{I}^a_i$ are real-dual
matrices, and all of the matrices in the last expression are real.
When expressed in complex-plus-pure-complex-dual form, the product of two
complex-dual square matrices $\underline{L}^i_j = L^i_j + \varepsilon A^i_j$ and $\underline{M}^i_j = M^i_j + \varepsilon B^i_j$ takes the form:

$\underline{L}^i_k\, \underline{M}^k_j = L^i_k M^k_j + \varepsilon (A^i_k M^k_j + L^i_k B^k_j).$

Thus, it behaves like the multiplication of complex square matrices for the complex parts,
but has the characteristically more involved form for the pure-complex-dual parts.
If the CD-linear map L : V → V is invertible then so is the matrix $\underline{L}^i_j$, so a matrix
$\underline{\tilde{L}}^i_j$ exists such that:

$\underline{\tilde{L}}^i_k\, \underline{L}^k_j = \underline{L}^i_k\, \underline{\tilde{L}}^k_j = \delta^i_j.$

In complex-plus-pure-complex-dual form, this implies the conditions that the
complex part $\tilde{L}^i_j$ of $\underline{\tilde{L}}^i_j$ must be, in fact, the inverse of $L^i_j$, which must then exist, and the
pure complex-dual part $\tilde{A}^i_j$ of $\underline{\tilde{L}}^i_j$ must satisfy:

$\tilde{A}^i_j = -\tilde{L}^i_k A^k_l \tilde{L}^l_j.$

As before, this places no restriction on $A^i_j$ itself.
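The product and inverse rules for complex-dual matrices are easy to test. The following sketch assumes NumPy, stores a complex-dual matrix L + εA as a pair of complex arrays, and draws random test matrices with a fixed seed; the names `dmul`, `dinv`, and `rand_c` are hypothetical. In the product, the dual part is AM + LB, with the factors kept in their original order, since matrices do not commute.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_c(n):
    """Random complex n x n matrix (test data only)."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def dmul(X, Y):
    """(L + eps A)(M + eps B) = LM + eps (AM + LB)."""
    (L, A), (M, B) = X, Y
    return (L @ M, A @ M + L @ B)

def dinv(X):
    """(L + eps A)^{-1} = L^{-1} - eps L^{-1} A L^{-1}; needs L invertible."""
    L, A = X
    Li = np.linalg.inv(L)
    return (Li, -Li @ A @ Li)

n = 3
X = (rand_c(n), rand_c(n))
left = dmul(dinv(X), X)
right = dmul(X, dinv(X))

I, Z = np.eye(n), np.zeros((n, n))
print(np.allclose(left[0], I), np.allclose(left[1], Z))
print(np.allclose(right[0], I), np.allclose(right[1], Z))
```

Only the complex part L has to be invertible; the pure complex-dual part A is unconstrained, as stated above.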


The set of all invertible complex dual linear maps, or all invertible complex-dual n×n
matrices, for that matter, then forms a group GL(n; CD). As a complex Lie group, it has
dimension $2n^2$, and as a real Lie group, it has dimension $4n^2$. It includes GL(n; C) as a
subgroup by way of the invertible complex n×n matrices and the complex translation
group $C^{n^2}$ by way of the matrices of the form I + εA, where we now drop the matrix
indices, for brevity.

We can introduce a scalar product on a CD-linear space V as we did for a D-linear
one – i.e., a symmetric CD-bilinear functional <.,.> on V that is non-degenerate, in the
sense that the map V → V*, $\underline{v} \mapsto \langle \underline{v}, \cdot \rangle$ is a CD-linear isomorphism. In particular, the
scalar product now takes its values in CD. When two complex dual vectors $\underline{v}$ and $\underline{w}$ are
expressed in complex-plus-pure-complex-dual form as v + εa and w + εb, respectively,
the scalar product takes the form:

$\langle \underline{v}, \underline{w} \rangle = \langle v, w \rangle + \varepsilon(\langle v, b \rangle + \langle w, a \rangle).$ (23.2)

Thus, orthogonality of v and w would imply two conditions:

<v, w> = 0, <v, b> + <w, a> = 0. (23.3)

That is, the complex parts would have to be orthogonal in the usual sense, while the pure
complex dual parts would have to satisfy a more elaborate condition.
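A basis $\{\underline{e}_1, \ldots, \underline{e}_n\}$ for V is (Euclidian) orthonormal for this scalar product if one
has:
$\langle \underline{e}_i, \underline{e}_j \rangle = \delta_{ij}.$ (23.4)

Thus the scalar product of two complex dual vectors $\underline{v} = \underline{v}^i \underline{e}_i$ and $\underline{w} = \underline{w}^i \underline{e}_i$ will
take the component form:
$\langle \underline{v}, \underline{w} \rangle = \delta_{ij}\, \underline{v}^i \underline{w}^j.$ (23.5)

When the components are expressed in complex-plus-pure-complex-dual form as $v^i + \varepsilon a^i$
and $w^i + \varepsilon b^i$, respectively, this takes the form:

$\langle \underline{v}, \underline{w} \rangle = \delta_{ij} v^i w^j + \varepsilon\, \delta_{ij}(v^i b^j + w^i a^j).$ (23.6)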

A CD-linear transformation $\underline{L}$ : V → V is CD-orthogonal iff it preserves the Euclidian
scalar product. That is, for every $\underline{v}, \underline{w} \in V$ one must have:

$\langle \underline{L}\,\underline{v}, \underline{L}\,\underline{w} \rangle = \langle \underline{v}, \underline{w} \rangle.$ (23.7)

As usual, the condition on the matrix that represents $\underline{L}$ with respect to some orthonormal basis
(which we shall also represent by $\underline{L}$) is:

$\underline{L}^T \underline{L} = \underline{L}\, \underline{L}^T = I;$ (23.8)
i.e.:
$\underline{\tilde{L}} = \underline{L}^T.$ (23.9)

If $\underline{L}$ is expressed in complex-plus-pure-complex-dual form as L + εA then the
condition (23.8) takes the form:

$L^T L = L L^T = I, \qquad A^T = -L^T A L^T.$ (23.10)

Thus, the complex part belongs to O(n; C), while the pure dual part satisfies a more
involved constraint.
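One way to see that the constraint on the pure dual part can actually be met is to note that it is equivalent to requiring that the product of the transposed complex part with A be anti-symmetric, so A = LW with W anti-symmetric always works. The sketch below checks this numerically for n = 3. It assumes NumPy and produces the complex orthogonal matrix L by a Cayley transform, which is only a convenient source of test data; the names `antisym`, `K`, and `W` are hypothetical.

```python
import numpy as np

def antisym(a):
    """Complex anti-symmetric 3x3 matrix built from a complex 3-vector."""
    a1, a2, a3 = a
    return np.array([[0, -a3, a2],
                     [a3, 0, -a1],
                     [-a2, a1, 0]], dtype=complex)

I3 = np.eye(3)
K = antisym([0.3, -0.2j, 0.1 + 0.4j])
L = (I3 - K) @ np.linalg.inv(I3 + K)     # complex part: L^T L = I
W = antisym([1j, 2.0, -0.5 + 0.3j])      # arbitrary anti-symmetric matrix
A = L @ W                                # candidate pure dual part

# (L + eps A)^T (L + eps A) = L^T L + eps (L^T A + A^T L)
complex_part = L.T @ L
dual_part = L.T @ A + A.T @ L

print(np.allclose(complex_part, I3))
print(np.allclose(dual_part, np.zeros((3, 3))))
print(np.allclose(A.T, -L.T @ A @ L.T))
```

The three complex parameters of W play the role of the translational degrees of freedom that accompany the complex rotation L.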
In order to reduce to SO(n; CD), one must introduce a volume element $\mathcal{V}$ on V, which
takes the form of a non-zero completely anti-symmetric CD-multilinear functional on V:

$\mathcal{V}(\underline{v}_1, \cdots, \underline{v}_n) = \frac{1}{n!}\, \epsilon_{i_1 \cdots i_n}\, \underline{v}_1^{i_1} \cdots \underline{v}_n^{i_n} = \det[\underline{v}_1 | \cdots | \underline{v}_n]\, \mathcal{V}.$ (23.11)

Of course, the determinant of the component matrix will take its values in CD.
Under a CD-linear transformation $\underline{L}$ one will have:

$\mathcal{V}(\underline{L}\,\underline{v}_1, \cdots, \underline{L}\,\underline{v}_n) = (\det \underline{L})\, \mathcal{V}(\underline{v}_1, \cdots, \underline{v}_n).$ (23.12)

Thus, for a volume-preserving CD-linear transformation, one will have that det $\underline{L}$ = 1.
The subgroup of GL(n; CD) for which this is true will be denoted by SL(n; CD), and the
corresponding subgroup of O(n; CD) will be denoted by SO(n; CD).

5. The algebra of complex dual quaternions. The algebra HCD of complex dual
quaternions is defined to be the tensor product H ⊗ CD. Thus, a typical element $\underline{q} \in$ HCD can
be expressed in the form:
$\underline{q} = \underline{q}^\mu e_\mu,$ (24.1)
in which the basis elements $e_\mu$ are the same as in the real case, but the components $\underline{q}^\mu$
are now complex dual numbers. One can also regard HCD as HC ⊗ D or HD ⊗ C. That is,
the elements of HCD can be regarded as complex quaternions with dual coefficients or
dual quaternions with complex coefficients. We shall often find that the latter
representation is most useful, since one then simply complexifies the corresponding dual
expressions.
Hence, if one regards the elements of HCD in the form (24.1) then the only thing that
changes in the expression for the product of two complex dual quaternions:

$\underline{q}\, \underline{q}' = \underline{q}^\mu\, \underline{q}'^\nu\, e_\mu e_\nu$ (24.2)

is in the way that products of components form, while the products of basis elements are
still the same as for real quaternions.
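There are various decompositions of complex dual quaternions that become useful,
since the number of dimensions in the coefficient algebra has doubled from either C or D.
We simply summarize some of them:

real + imaginary:  $\underline{q} = \underline{p} + i\underline{r}$,  $\underline{p}, \underline{r} \in$ HD,
scalar + vector:   $\underline{q} = \underline{q}^0 + \underline{\mathbf{q}}$,  $\underline{q}^0 \in$ DSHCD,  $\underline{\mathbf{q}} \in$ DVHCD,
complex + dual:    $\underline{q} = p + \varepsilon r$,  p, r ∈ HC.

These direct sum decompositions of the vector space HCD :

HCD = Re(HCD) ⊕ Im(HCD) = DSHCD ⊕ DVHCD = C(HCD) ⊕ D(HCD)

define projection operators Re, Im, DS, DV, C, D onto the summands, which can be
defined by polarizing the automorphisms *, which is still complex conjugation,
quaternion conjugation:
$\bar{\underline{q}} = \underline{q}^0 - \underline{\mathbf{q}},$ (24.3)
and dual conjugation:
$\underline{q}^\circ = p - \varepsilon r.$ (24.4)
One then gets:
$\mathrm{Re}(\underline{q}) = \tfrac{1}{2}(\underline{q} + \underline{q}^*), \qquad \mathrm{Im}(\underline{q}) = \tfrac{1}{2}(\underline{q} - \underline{q}^*),$ (24.5)
$DS(\underline{q}) = \tfrac{1}{2}(\underline{q} + \bar{\underline{q}}), \qquad DV(\underline{q}) = \tfrac{1}{2}(\underline{q} - \bar{\underline{q}}),$ (24.6)
$C(\underline{q}) = \tfrac{1}{2}(\underline{q} + \underline{q}^\circ), \qquad D(\underline{q}) = \tfrac{1}{2}(\underline{q} - \underline{q}^\circ).$ (24.7)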

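In the three forms that we introduced, the product (24.2) becomes:

$\underline{q}\, \underline{q}' = \underline{p}\, \underline{p}' - \underline{r}\, \underline{r}' + i(\underline{p}\, \underline{r}' + \underline{r}\, \underline{p}'),$ (24.8)
$\underline{q}\, \underline{q}' = (\underline{q}, \underline{q}') + \underline{q}^0\, \underline{\mathbf{q}}' + \underline{q}'^0\, \underline{\mathbf{q}} + \underline{\mathbf{q}} \times \underline{\mathbf{q}}',$ (24.9)
$\underline{q}\, \underline{q}' = pp' + \varepsilon(pr' + rp'),$ (24.10)

into which we have introduced the scalar product:

$(\underline{q}, \underline{q}') = DS(\underline{q}\, \underline{q}') = \underline{q}^0 \underline{q}'^0 - \sum_{i,j=1}^{3} \delta_{ij}\, \underline{q}^i \underline{q}'^j = (p, p') + \varepsilon[(p, r') + (p', r)],$ (24.11)

and if $\underline{\mathbf{q}} = \mathbf{q} + \varepsilon \mathbf{r}$ and $\underline{\mathbf{q}}' = \mathbf{q}' + \varepsilon \mathbf{r}'$ then one defines the cross product by bilinearity:

$\underline{\mathbf{q}} \times \underline{\mathbf{q}}' = \tfrac{1}{2}[\underline{\mathbf{q}}, \underline{\mathbf{q}}'] = \mathbf{q} \times \mathbf{q}' + \varepsilon(\mathbf{q} \times \mathbf{r}' + \mathbf{r} \times \mathbf{q}').$ (24.12)

Of course, both of these expressions (24.11) and (24.12) are merely the complexification
of their real counterparts.
One can also introduce another isomorphism that corresponds to the adjoint operation
for complex quaternions:
$\underline{q}^\dagger = p^\dagger + \varepsilon r^\dagger.$ (24.13)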

By polarization, this defines a decomposition of HCD into a direct sum H+ ⊕ H− ⊕ εH+ ⊕
εH− of four-real-dimensional subspaces, two of which are composed of pure dual real
quaternions. One can also rearrange this direct sum into the form (H+ ⊕ εH+) ⊕
(H− ⊕ εH−), which makes HCD the sum of two four-dimensional dual vector spaces.
As we see, HCD is still an associative, but not commutative, ring and has a unity
element 1 that is still defined by e0, with a center that is defined by all of the complex
dual scalars, which have the form (q0 + εr0) e0 .
Like the algebra of real dual quaternions, the algebra HCD has divisors of zero, and the
obvious examples are still the pure complex dual quaternions. Similarly, the only
difference between the expression for the inverse of an invertible element q = p + εr:

q −1 = p−1 – ε p−1r p−1 (24.14)

and the previous one in the real case is that everything is complex now. In particular, p
must not be a null complex quaternion.
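The inverse formula (24.14) can be verified with a few lines of code. The sketch below assumes that a complex dual quaternion p + εr is stored as a pair of length-4 complex NumPy arrays and that the complex quaternion inverse is the conjugate divided by the complex-bilinear norm-squared; the function names are hypothetical.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as length-4 complex arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def qconj(a):
    return np.array([a[0], -a[1], -a[2], -a[3]])

def qinv(p):
    """Inverse of a complex quaternion; fails on the null hypersurface."""
    n2 = np.sum(p * p)                  # complex-bilinear norm-squared
    if abs(n2) < 1e-14:
        raise ZeroDivisionError("null complex quaternion has no inverse")
    return qconj(p) / n2

def dq_mul(x, y):
    """(p + eps r)(p' + eps r') = pp' + eps (pr' + rp')."""
    (p, r), (p2, r2) = x, y
    return (qmul(p, p2), qmul(p, r2) + qmul(r, p2))

def dq_inv(x):
    """Inverse p^{-1} - eps p^{-1} r p^{-1}, cf. (24.14)."""
    p, r = x
    pi = qinv(p)
    return (pi, -qmul(qmul(pi, r), pi))

ONE = np.array([1, 0, 0, 0], dtype=complex)
q = (np.array([1 + 1j, 0.5, -2j, 0.3], dtype=complex),
     np.array([0.2, 1j, 1 - 1j, -0.7], dtype=complex))

left, right = dq_mul(dq_inv(q), q), dq_mul(q, dq_inv(q))
print(np.allclose(left[0], ONE), np.allclose(left[1], 0))
print(np.allclose(right[0], ONE), np.allclose(right[1], 0))
```

For example, p = 1 + i e1, stored as [1, 1j, 0, 0], has vanishing norm-squared, so `qinv` rejects it, which is the non-trivial null case that has no counterpart for real dual quaternions.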

The other scalar product that we have been habitually defining is:

$\langle \underline{q}, \underline{q}' \rangle = DS(\underline{q}\, \bar{\underline{q}}') = \sum_{\mu,\nu=0}^{3} \delta_{\mu\nu}\, \underline{q}^\mu \underline{q}'^\nu,$ (24.15)

which now takes its values in CD.
If one puts the quaternions involved into quaternion-plus-dual form then the scalar
product looks like it did before, but with complex quaternions:

$\langle \underline{q}, \underline{q}' \rangle = \langle p, p' \rangle + \varepsilon(\langle r, p' \rangle + \langle p, r' \rangle).$ (24.16)

In particular:
$\| \underline{q} \|^2 = \| p \|^2 + 2\varepsilon \langle p, r \rangle.$ (24.17)

From (24.17), a null complex dual quaternion must satisfy:

$\| p \| = 0, \qquad \langle p, r \rangle = 0.$ (24.18)

Hence, unlike the real dual case, since there are non-trivial null complex quaternions,
there are also non-trivial null complex dual quaternions, as well. However, analogous to the
real case, we are now dealing with the complex Study quadric, as the second condition shows.
Similarly, a unit quaternion must satisfy:

|| p || = 1, <p, r> = 0. (24.19)

Therefore, it defines a pair (p, r) ∈ HC × HC that consists of a unit complex quaternion p
and a complex quaternion r that is complex-orthogonal to p.
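If $\underline{q}$ is non-null then $\| \underline{q} \|$ can be factored out and any invertible complex dual
quaternion can be expressed as a product:

$\underline{q} = \| \underline{q} \|\, \underline{u},$ (24.20)

with the obvious definition for the complex dual unit quaternion $\underline{u}$.
Furthermore, we have a canonical form for $\underline{u}$ that is the complex analogue of the real
expression:
$\underline{u} = \cos \tfrac{1}{2}\underline{\alpha} + \sin \tfrac{1}{2}\underline{\alpha}\; \underline{\mathbf{u}},$ (24.21)

in which the angle $\underline{\alpha}$ is a complex dual number, while $\underline{\mathbf{u}}$ is a complex dual unit vector.
Since:
$\| \underline{q}\, \underline{q}' \|^2 = DS(\underline{q}\, \underline{q}'\, \overline{\underline{q}\, \underline{q}'}) = DS(\underline{q}\, \underline{q}'\, \bar{\underline{q}}'\, \bar{\underline{q}}) = \| \underline{q} \|^2\, \| \underline{q}' \|^2,$ (24.22)
i.e.:
$\| \underline{q}\, \underline{q}' \| = \| \underline{q} \|\; \| \underline{q}' \|,$ (24.23)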

one sees that the product of two complex dual null quaternions is a complex dual null
quaternion and the product of two complex dual unit quaternions is again a unit
quaternion. Therefore, although null quaternions do not have multiplicative inverses, and
therefore do not define a group, nonetheless, the unit quaternions do form a group, since
one also has that the inverse of a unit quaternion is a unit quaternion. This follows from
(24.23) when the left-hand side is unity.
Thus, the group $\underline{CQ}^*$ of invertible complex dual quaternions and the group $\underline{Q}^*$ that
we introduced previously differ only by the complexification of everything. In particular,
it contains the subgroup CQ* of invertible complex quaternions (r = 0), which is the set
complement of the null hypersurface, and the subgroup of elements of the form 1 + εr,
which is then isomorphic to the translation group of C4. The two groups intersect only at
1 and are both four-complex-dimensional, so the group $\underline{CQ}^*$ has complex dimension 8.
From (24.20), $\underline{CQ}^*$ can be expressed as the product group CD* × $\underline{CQ}_1$ of the invertible
complex dual numbers with the group of complex dual unit quaternions.
As we shall see, the group $\underline{CQ}_1$ is isomorphic to the semi-direct product C3 ×s SL(2;
C), which is the two-to-one covering group of the group C3 ×s SO(3; C) of complex rigid
motions. Hence, it has complex dimension six or real dimension twelve.
The Lie algebra iso(3; C) = C3 ⊕s so(3; C) of infinitesimal complex rigid motions is
isomorphic to the Lie algebra q0C of complex dual vectors, which amounts to the
complexification of the corresponding statement for infinitesimal real rigid motions and
real dual vectors. One simply repeats the argument that was given above under the
assumption that all of the component vectors involved are complex, now.

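For complex dual vectors, one has:

$\underline{\mathbf{q}}\, \underline{\mathbf{q}}' = -\langle \underline{\mathbf{q}}, \underline{\mathbf{q}}' \rangle + \underline{\mathbf{q}} \times \underline{\mathbf{q}}',$ (24.24)
$\langle \underline{\mathbf{q}}, \underline{\mathbf{q}}' \rangle = \langle \mathbf{p}, \mathbf{p}' \rangle + \varepsilon(\langle \mathbf{p}, \mathbf{r}' \rangle + \langle \mathbf{r}, \mathbf{p}' \rangle),$ (24.25)
$\| \underline{\mathbf{q}} \|^2 = \| \mathbf{p} \|^2 + 2\varepsilon \langle \mathbf{p}, \mathbf{r} \rangle,$ (24.26)

which is, of course, merely the complexification of the real situation.
In particular, a null complex dual vector satisfies:

|| p || = 0, <p, r> = 0, (24.27)

which is now non-trivially possible, and a unit complex dual vector satisfies:

|| p || = 1, <p, r> = 0. (24.28)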

Thus the set of all unit complex dual vectors becomes the complexification of the real set,
or T(CS2). It is not, however, a group, which was also true for the real case.
Just as DV(HD) represented the real affine space A3, similarly, DV(HCD) represents the
tangent bundle $T(A^3_C)$ to the complex affine space $A^3_C$. The complex vector part v of v +
εa represents the vector part v of the point (x, v) ∈ $T(A^3_C)$, while the pure complex dual
part a indirectly represents the translation that takes some reference point O to x.
In order to find the nilpotents of degree two in HCD , we start with the expression for
$\underline{q}^2$, namely:
$\underline{q}^2 = p^2 + \varepsilon(pr + rp).$ (24.29)

Once again, this equation is unchanged from the real dual case, except that now
everything is complex, and similarly, for the necessary conditions for $\underline{q}^2$ to vanish, one
has:
$p^2 = 0, \qquad pr + rp = 0.$

The possibility that p = 0 still leads to the fact that the complex dual quaternions of pure
complex dual type are all nilpotents, as they were in the real case. However, p is now a
complex quaternion, and we saw that HC also admits non-trivial nilpotents of degree two.
Thus, in addition to the pure dual case, one can consider nilpotents for which p is
non-trivial. Recall that for HC they took the form $p = \mathbf{p}$ with $\langle \mathbf{p}, \mathbf{p} \rangle = 0$; in particular, $p^0 = 0$.
Now put everything into scalar-plus-vector form and look at the corresponding form
of the latter conditions. After some straightforward calculations, one gets that:

$pr + rp = 2(r^0 \mathbf{p} - \langle \mathbf{p}, \mathbf{r} \rangle).$

The vanishing of the left-hand side would imply that:

$r^0 \mathbf{p} = 0, \qquad \langle \mathbf{p}, \mathbf{r} \rangle = 0.$

If $\mathbf{p} = 0$ then $\underline{q} = \varepsilon r$ is a pure complex dual quaternion.
The case in which $r^0$ vanishes makes $r = \mathbf{r}$ a pure complex quaternion, as well as p,
although it must be restricted by the last condition above. Thus, such a nilpotent takes
the form:
$\underline{q} = \mathbf{p} + \varepsilon \mathbf{r}, \qquad \langle \mathbf{p}, \mathbf{r} \rangle = 0.$ (24.30)
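A concrete nilpotent of this second kind is easy to exhibit. The sketch below assumes length-4 complex NumPy arrays for complex quaternions and picks, purely for illustration, the isotropic vector p = e1 + i e2 together with a vector r that is complex-orthogonal to it; neither choice comes from the text.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as length-4 complex arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def dq_square(p, r):
    """(p + eps r)^2 = p^2 + eps (pr + rp), cf. (24.29)."""
    return (qmul(p, p), qmul(p, r) + qmul(r, p))

p = np.array([0, 1, 1j, 0], dtype=complex)    # pure vector with <p, p> = 0
r = np.array([0, 1j, -1, 2], dtype=complex)   # pure vector with <p, r> = 0

isotropic = np.isclose(np.sum(p * p), 0)
orthogonal = np.isclose(np.sum(p * r), 0)
complex_part, dual_part = dq_square(p, r)

print(isotropic, orthogonal)
print(np.allclose(complex_part, 0), np.allclose(dual_part, 0))
```

Dropping the orthogonality condition (for instance, replacing r by e1) leaves the complex part of the square at zero but makes the dual part non-zero, so both conditions are needed.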

As for the idempotents, if one sets $\underline{q}^2 = \underline{q}$ and expands this into complex-plus-dual
form then this gives the following necessary conditions:

$p^2 = p, \qquad pr + rp = r.$

Thus, p must be an idempotent in HC, which we know exist non-trivially. It is clear
that setting p to either 0 or 1 would make r = 0, which are both trivial cases. Thus, we set
p equal to its canonical form:
$p = \tfrac{1}{2}(1 + i\mathbf{u}), \qquad \| \mathbf{u} \| = 1.$ (24.31)

If we put r into its scalar-plus-vector form $r^0 + \mathbf{r}$ then the second condition above
takes the form:

$r^0 - i\langle \mathbf{u}, \mathbf{r} \rangle = r^0, \qquad \mathbf{r} + i r^0 \mathbf{u} = \mathbf{r},$

which imply:
$\langle \mathbf{u}, \mathbf{r} \rangle = 0, \qquad r^0 \mathbf{u} = 0.$

Hence, either $\mathbf{u} = 0$, which does not lie on the unit sphere and is therefore
unacceptable, or $r^0 = 0$. We can then put a non-trivial idempotent $\underline{\iota}$ in HCD into the
form:
$\underline{\iota} = \tfrac{1}{2}(1 + i\mathbf{u}) + \varepsilon \mathbf{r}, \qquad \langle \mathbf{u}, \mathbf{u} \rangle = 1, \quad \langle \mathbf{u}, \mathbf{r} \rangle = 0.$ (24.32)

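Since HCD admits non-trivial idempotents $\underline{\iota}$, it will also admit the non-trivial left and
right ideals $I_l(\underline{\iota})$ and $I_r(\underline{\iota})$ that are generated by them. As in the complex case, the
conjugate $\bar{\underline{\iota}}$ of an idempotent is also idempotent and since it is therefore null, one still
has $\underline{\iota}\, \bar{\underline{\iota}} = 0$, which means that $\bar{\underline{\iota}}$ is orthogonal to $\underline{\iota}$ and one can decompose HCD into direct
sums $I_l(\underline{\iota}) \oplus I_l(\bar{\underline{\iota}})$ and $I_r(\underline{\iota}) \oplus I_r(\bar{\underline{\iota}})$, with a corresponding decomposition of the
identity operator:
$I = \underline{\iota} + \bar{\underline{\iota}}.$ (24.33)

The vector subspaces $I_l(\underline{\iota})$, $I_l(\bar{\underline{\iota}})$, $I_r(\underline{\iota})$, $I_r(\bar{\underline{\iota}})$, which are also subalgebras, then
have complex dual dimension two and are invariant subspaces under left and right
multiplication, respectively.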
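The canonical form (24.32) and the resulting decomposition of the identity can be checked on a concrete example. In the sketch below, complex dual quaternions are pairs of length-4 complex NumPy arrays, the conjugate is the quaternion conjugate applied to both parts, and the particular complex unit vector u and orthogonal vector r are arbitrary illustrative choices.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as length-4 complex arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def qconj(a):
    return np.array([a[0], -a[1], -a[2], -a[3]])

def dq_mul(x, y):
    (p, r), (p2, r2) = x, y
    return (qmul(p, p2), qmul(p, r2) + qmul(r, p2))

def dq_conj(x):
    """Quaternion conjugate of p + eps r, applied to both parts."""
    return (qconj(x[0]), qconj(x[1]))

def dq_close(x, y):
    return np.allclose(x[0], y[0]) and np.allclose(x[1], y[1])

ONE = np.array([1, 0, 0, 0], dtype=complex)
ZERO = np.zeros(4, dtype=complex)
s2 = np.sqrt(2)
u = np.array([0, s2, 1j, 0], dtype=complex)       # <u, u> = 2 - 1 = 1
r = np.array([0, 1, 1j * s2, 3], dtype=complex)   # <u, r> = 0

iota = (0.5 * (ONE + 1j * u), r)
iota_bar = dq_conj(iota)

print(dq_close(dq_mul(iota, iota), iota))                  # idempotent
print(dq_close(dq_mul(iota, iota_bar), (ZERO, ZERO)))      # orthogonal to conjugate
print(dq_close((iota[0] + iota_bar[0], iota[1] + iota_bar[1]), (ONE, ZERO)))
```

The last line is the decomposition of the identity (24.33) for this particular idempotent.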

6. The action of the group of complex rigid motions on complex dual
quaternions. The various actions that we introduced in the context of real dual
quaternions can all be complexified, which also introduces the possibilities that we
introduced for complex quaternions, due to the increased number of automorphisms.
As in the real case, the action of the group $\underline{CQ}_1$ of complex dual unit quaternions:

$\underline{CQ}_1 \times$ HCD → HCD , $(\underline{u}, \underline{q}) \mapsto \underline{u}\, \underline{q}\, \bar{\underline{u}}$ (25.1)

has DS(HCD) and DV(HCD) for invariant subspaces, so that much has merely been
complexified. Similarly, the argument that makes the action isometric for the scalar
product <.,.> works the same, except that the dual numbers become complex dual ones.
One also sees that both $\underline{u}$ and $-\underline{u}$ produce the same effect on $\underline{q}$, as usual. Thus,
when one expresses $\underline{q}$ in component form as $\underline{q}^\mu e_\mu$, with complex dual coefficients, one
can associate the above action of $\underline{u}$ with a 4×4 matrix $\underline{L}^\mu_\nu$ with complex dual coefficients
by way of:
$(\underline{u}\, \underline{q}\, \bar{\underline{u}})^\mu = \underline{L}^\mu_\nu\, \underline{q}^\nu,$ (25.2)
which can also be defined by the corresponding transformation of the frame $e_\mu$ :

$\underline{u}\, e_\mu\, \bar{\underline{u}} = \underline{L}^\nu_\mu\, e_\nu.$ (25.3)

Since DS(HCD) and DV(HCD) are invariant subspaces, the matrix $\underline{L}^\mu_\nu$, in a frame that is
adapted to the decomposition, splits into a direct sum of a 1×1 – complex dual scalar –
matrix and a 3×3 complex dual submatrix $\underline{L}^i_j$. We now need to show that $\underline{L}^i_j$ represents a
complex rigid motion. However, this follows directly from the fact that $\underline{u}\, \underline{q}\, \bar{\underline{u}}$ is an
isometry of the complex dual scalar product on HCD , and thus, of its restriction to
DV(HCD). Of course, the map $\underline{CQ}_1$ → ISO(3; C) that takes $\{\underline{u}, -\underline{u}\}$ to $\underline{L}^i_j$ is the
two-to-one covering homomorphism C3 ×s SL(2; C) → C3 ×s SO(3; C).
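The main properties of the action (25.1) can be illustrated numerically. The following is only a sketch under stated assumptions: complex dual quaternions are pairs of length-4 complex NumPy arrays, the unit element is built as p + ε(½ t p) with p a unit complex quaternion and t a pure complex vector (a parametrization borrowed from the usual real dual-quaternion convention, not taken from the text), and all function names are hypothetical.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as length-4 complex arrays."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def qconj(a):
    return np.array([a[0], -a[1], -a[2], -a[3]])

def dq_mul(x, y):
    (p, r), (p2, r2) = x, y
    return (qmul(p, p2), qmul(p, r2) + qmul(r, p2))

def dq_conj(x):
    return (qconj(x[0]), qconj(x[1]))

def act(u, q):
    """The action q -> u q u-bar of (25.1)."""
    return dq_mul(dq_mul(u, q), dq_conj(u))

def dq_dot(x, y):
    """Complex-dual-valued scalar product, returned as (complex part, dual part)."""
    return (np.sum(x[0] * y[0]), np.sum(x[0] * y[1] + x[1] * y[0]))

ONE = np.array([1, 0, 0, 0], dtype=complex)
theta = 0.9 - 0.4j                                  # hypothetical complex angle
n = np.array([0, 0.6, 0.8, 0], dtype=complex)      # unit axis
t = np.array([0, 1 + 1j, 2, -1j], dtype=complex)   # pure vector (assumed translation)

p = np.cos(theta / 2) * ONE + np.sin(theta / 2) * n
u = (p, 0.5 * qmul(t, p))                           # unit: ||p|| = 1 and <p, r> = 0
minus_u = (-u[0], -u[1])

q = (np.array([0, 1, -2j, 0.5], dtype=complex),    # complex dual vector v + eps a
     np.array([0, 0.3j, 1, -1], dtype=complex))
q_new = act(u, q)

print(np.allclose(dq_dot(u, u), (1, 0)))                        # u has unit norm
print(np.isclose(q_new[0][0], 0), np.isclose(q_new[1][0], 0))   # DV is invariant
print(np.allclose(dq_dot(q_new, q_new), dq_dot(q, q)))          # isometry
print(np.allclose(act(minus_u, q)[0], q_new[0]),
      np.allclose(act(minus_u, q)[1], q_new[1]))                # u and -u agree
```

Reading off the columns of the resulting linear map on the three basis vectors would give the 3×3 complex dual matrix of (25.2).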

7. The role of complex rigid frames in physics. Rather than examine the
kinematics of complex rigid frames in terms of complex dual quaternions, since no
discussion of such matters seems to exist in the physics mainstream, we shall first attempt
to motivate the physical significance of such a study. The first issue to address is the
physical applicability of the complex affine space AC3 .
Since the best-established application of the complex vector space C^3 is to
electromagnetism, it seems reasonable to extend that discussion to the affine case. Thus,
one needs to justify that the concept of the translation of a 2-form or bivector by another
2-form or bivector actually occurs naturally in the theory of electromagnetism. In fact,
this is the case.
This is because the Maxwell equation for the electromagnetic field
strength 2-form F – namely, dF = 0 – is linear and homogeneous in F, so any general
solution to that equation is determined only up to an additive constant field F_0. Of
course, imposing boundary or initial-value conditions will eliminate that indeterminacy,
but it does exist for the general solutions; i.e., it is a symmetry of the system of equations.
By contrast, the dual equation (8) δH = J for the electromagnetic excitation bivector
field H has that symmetry only in the absence of sources, such as for points outside the
support of the source current J.
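
The translation symmetry of dF = 0 can be written out in one line. This is only a sketch, and it assumes that "constant" means that the components of F_0 are constant in a global inertial coordinate system x^µ on Minkowski space, which the text does not state explicitly:

```latex
d(F + F_0) = dF + dF_0 = 0,
\qquad\text{since}\qquad
dF_0 = \tfrac{1}{2}\,\partial_\lambda (F_0)_{\mu\nu}\,
       dx^\lambda \wedge dx^\mu \wedge dx^\nu = 0 .
```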
However, that would suggest that the application of complex dual quaternions to
physical models is most natural in the context of the symmetries of the field equations for
electromagnetism, while the main focus of the present study is the role of
quaternions in kinematics. Therefore, we shall defer a more detailed study of the
physical role of complex dual quaternions to a later paper that will deal with the
application of quaternions to physical field theories.

(8) Here, we are defining the divergence operator δ: Λ_k M → Λ_{k−1} M to be the adjoint #^{−1} ⋅ d ⋅ # of the
exterior derivative operator d with respect to the Poincaré isomorphism #: Λ_k M → Λ^{n−k} M that is defined by
a choice of volume element on T(M). This definition does, in fact, agree with the usual divergence operator
for vector fields.
