The Rational Spirit in Modern Continuum Mechanics
Essays and Papers Dedicated to the Memory of
Clifford Ambrose Truesdell III
Edited by
CHI-SING MAN
University of Kentucky,
Lexington, U.S.A.
and
ROGER L. FOSDICK
University of Minnesota,
Minneapolis, U.S.A.
Reprinted from Journal of Elasticity: The Physical and Mathematical Science
of Solids, Vols. 70, 71, 72 (2003)
KLUWER ACADEMIC PUBLISHERS
NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-2308-1
Print ISBN: 1-4020-1828-2
©2005 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers
Dordrecht
All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic,
mechanical, recording, or otherwise, without written consent from the Publisher
Created in the United States of America
Visit Springer's eBookstore at: http://ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Portrait by Joseph Sheppard
Table of Contents
Portrait by Joseph Sheppard
Foreword by Chi-Sing Man and Roger Fosdick
Published Works of Clifford Ambrose Truesdell III
Serials Edited by Clifford Ambrose Truesdell III
Eulogium by Roger Fosdick
Photograph: Bloomington, Indiana, 1959
BERNARD D. COLEMAN / Memories of Clifford Truesdell
ENRICO GIUSTI / Clifford Truesdell (1919–2000), Historian of Mathematics
WALTER NOLL / The Genesis of Truesdell’s Nonlinear Field Theories of Mechanics
JAMES SERRIN / An Appreciation of Clifford Truesdell
D. SPEISER / Clifford A. Truesdell’s Contributions to the Euler and the Bernoulli Edition
Photograph: Baltimore, Maryland, 1978
STUART S. ANTMAN / Invariant Dissipative Mechanisms for the Spatial Motion of Rods Suggested by Artificial Viscosity
MILLARD F. BEATTY / An Average-Stretch Full-Network Model for Rubber Elasticity
MICHELE BUONSANTI and GIANNI ROYER-CARFAGNI / From 3-D Nonlinear Elasticity Theory to 1-D Bars with Nonconvex Energy
GIOVANNI BURATTI, YONGZHONG HUO and INGO MÜLLER / Eshelby Tensor as a Tensor of Free Enthalpy
SANDRO CAPARRINI and FRANCO PASTRONE / E. Frola (1906–1962): An Attempt Towards an Axiomatic Theory of Elasticity
GIANFRANCO CAPRIZ and PAOLO MARIA MARIANO / Symmetries and Hamiltonian Formalism for Complex Materials
DONALD E. CARLSON, ELIOT FRIED and DANIEL A. TORTORELLI / Geometrically-based Consequences of Internal Constraints
YI-CHAO CHEN / Second Variation Condition and Quadratic Integral Inequalities with Higher Order Derivatives
ELENA CHERKAEV and ANDREJ CHERKAEV / Principal Compliance and Robust Optimal Design
JOHN C. CRISCIONE / Rivlin’s Representation Formula is Ill-Conceived for the Determination of Response Functions via Biaxial Testing
CESARE DAVINI and ROBERTO PARONI / Generalized Hessian and External Approximations in Variational Problems of Second Order
F. DELL’ISOLA, G. SCIARRA and R.C. BATRA / Static Deformations of a Linear Elastic Porous Body Filled with an Inviscid Fluid
GIANPIETRO DEL PIERO / A Class of Fit Regions and a Universe of Shapes for Continuum Mechanics
LUCA DESERI and DAVID R. OWEN / Toward a Field Theory for Elastic Bodies Undergoing Disarrangements
MARCELO EPSTEIN and IOAN BUCATARU / Continuous Distributions of Dislocations in Bodies with Microstructure
MARCELO EPSTEIN and MAREK ELŻANOWSKI / A Model of the Evolution of a Two-dimensional Defective Structure
J.L. ERICKSEN / On the Theory of Rotation Twins in Crystal Multilattices
MAURO FABRIZIO and MURROUGH GOLDEN / Minimum Free Energies for Materials with Finite Memory
ROGER FOSDICK and LEV TRUSKINOVSKY / About Clapeyron’s Theorem in Linear Elasticity
M. FOSS, W. HRUSA and V.J. MIZEL / The Lavrentiev Phenomenon in Nonlinear Elasticity
GIOVANNI P. GALDI / Steady Flow of a Navier–Stokes Fluid around a Rotating Obstacle
TIMOTHY J. HEALEY and ERROL L. MONTES-PIZARRO / Global Bifurcation in Nonlinear Elasticity with an Application to Barrelling States of Cylindrical Columns
MOJIA HUANG and CHI-SING MAN / Constitutive Relation of Elastic Polycrystal with Quadratic Texture Dependence
MASARU IKEHATA and GEN NAKAMURA / Reconstruction Formula for Identifying Cracks
R.J. KNOPS and PIERO VILLAGGIO / An Approximate Treatment of Blunt Body Impact
I-SHIH LIU / On the Transformation Property of the Deformation Gradient under a Change of Frame
KONSTANTIN A. LURIE / Some New Advances in the Theory of Dynamic Materials
GERARD A. MAUGIN / Pseudo-plasticity and Pseudo-inhomogeneity Effects in Materials Mechanics
A. IAN MURDOCH / On the Microscopic Interpretation of Stress and Couple Stress
PABLO V. NEGRÓN-MARRERO / The Hanging Rope of Minimum Elongation for a Nonlinear Stress–Strain Relation
MARIO PITTERI / On Certain Weak Phase Transformations in Multilattices
PAOLO PODIO-GUIDUGLI / A New Quasilinear Model for Plate Buckling
G. RODNAY and R. SEGEV / Cauchy’s Flux Theorem in Light of Geometric Integration Theory
U. SARAVANAN and K.R. RAJAGOPAL / A Comparison of the Response of Isotropic Inhomogeneous Elastic Cylindrical and Spherical Shells and Their Homogenized Counterparts
M. ŠILHAVÝ / On SO(n)-Invariant Rank 1 Convex Functions
K. WILMAŃSKI / On Thermodynamics of Nonlinear Poroelastic Materials
WAN-LEE YIN / Anisotropic Elasticity and Multi-Material Singularities
Foreword
Through his voluminous and influential writings, editorial activities, organizational leadership, intellectual acumen, and strong sense of history, Clifford Ambrose Truesdell III (1919–2000) was the main architect for the renaissance of rational continuum mechanics since the middle of the twentieth century. The present
collection of 42 essays and research papers pays tribute to this man of mathematics,
science, and natural philosophy as well as to his legacy.
The first five essays by B.D. Coleman, E. Giusti, W. Noll, J. Serrin, and
D. Speiser were texts of addresses given by their authors at the Meeting in memory
of Clifford Truesdell, which was held in Pisa in November 2000. In these essays the
reader will find personal reminiscences of Clifford Truesdell the man and of some
of his activities as scientist, author, editor, historian of exact sciences, and principal
founding member of the Society for Natural Philosophy.
The bulk of the collection comprises 37 research papers which bear witness to
the Truesdellian legacy. These papers cover a wide range of topics; what ties them
together is the rational spirit. Clifford Truesdell, in his address upon receipt of a
Birkhoff Prize in 1978, put the essence of modern continuum mechanics succinctly
as “conceptual analysis, analysis not in the sense of the technical term but in the
root meaning: logical criticism, dissection, and creative scrutiny.” It is in celebration of this spirit and this essence that these research papers are dedicated to the
memory of their bearer, driving force, and main promoter for half a century. Most
of these papers were presented at the Symposium on Recent Advances and New
Directions in Mechanics, Continuum Thermodynamics, and Kinetic Theory – In
Memory of Clifford A. Truesdell III, held in Blacksburg, Virginia, in June 2002;
parts of two papers were delivered at the meeting Remembering Clifford Truesdell,
held in Turin in November 2002; and the rest was written especially for the present
collection.
The portrait, a photo of which serves as the frontispiece of this collection,
adorns the Clifford A. Truesdell III Room of History of Science in the library of
the Scuola Normale Superiore (Pisa, Italy), which was inaugurated in October 2003
and permanently houses Clifford Truesdell’s previously private collection of books,
papers, and correspondence. We are grateful to Mrs. Charlotte Truesdell for helping
us secure a digital file of this photo and for providing us with the list of published
works of Clifford Truesdell.
CHI-SING MAN
University of Kentucky
Lexington
ROGER FOSDICK
University of Minnesota
Minneapolis
Published Works of
Clifford Ambrose Truesdell III ⋆
The year of publication is omitted from the entry unless it differs from the year
under which the entry is listed. Letters following a number indicate subsidiary
separate publications, as follows:
P   Preliminary report or preprint,
A   Abstract, separately published or only published version,
C   Condensed or extracted version,
L   Lecture concerning part or all of the contents of main entry,
R   Reprint, entire,
RE  Reprint of an extract,
T   Translation, entire,
TC  Translation, condensed,
TE  Translation of an extract.
The list excludes some 600 reviews published between 1949 and 1971 in Mathematical Reviews, Applied Mechanics Reviews, Zentralblatt für Mathematik, Industrial Laboratories, and Mathematics of Computation but includes reviews published in other journals.
1943
1. (Co-author P. NEMÉNYI) A stress function for the membrane theory of shells
of revolution, Proceedings of the National Academy of Sciences (U.S.A.) 29,
159–162.
Other publication in 1943: No. 3A1.
1944
2. ALONZO CHURCH, Introduction to Mathematical Logic, Part I, Notes by
C.A. TRUESDELL, Annals of Mathematics Studies No. 13, Princeton, University Press, vi + 118 pp.
⋆ Note by the editors: This list and the list on p. 29 are slightly edited versions of those that we
received from Mrs. C. Truesdell, to whom we are heartily grateful. In our editorial work we have
added a few entries, updated several items, and made a small number of other minor corrections. To
G.P. Galdi, K. Hutter, R.G. Muncaster, F. Pastrone, and D. Speiser, we are beholden for their help in
tracking down article titles and numbers of journal volumes. In what follows, explanatory remarks
set off by square brackets were made by Clifford Truesdell himself.
PUBLISHED WORKS OF C.A. TRUESDELL
1945
3. The membrane theory of shells of revolution, Transactions of the American
Mathematical Society 58, 96–166.
3A1. The differential equations of the membrane theory of shells of revolution, Bulletin of the American Mathematical Society 49 (1943), 863–864.
3A2. The membrane theory of shells of revolution, Bulletin of the American
Mathematical Society 51, 225.
4. On a function which occurs in the theory of the structure of polymers, Annals
of Mathematics 46, 144–157.
5. Generalizations of Euler’s summations of the series $\sum_{n=1}^{\infty} n^{-2m}$, $\sum_{n=0}^{\infty} (-)^n (2n+1)^{-2m-1}$, etc., Annals of Mathematics 46, 194–195.
Other publication in 1945: No. 12A1.
1946
6. (Co-author R.C. PRIM) On Linearized Axially Symmetric Flow of a Compressible Fluid, U.S. Naval Ordnance Laboratory Memorandum 8885, 16 December, 4 pp.
7. On Behrbohm and Pinl’s linearization of the equation of two-dimensional
steady polytropic flow of a compressible fluid, Proceedings of the National
Academy of Sciences (U.S.A.) 32, 289–293 = U.S. Naval Ordnance Laboratory Memorandum 8888, 18 December, 6 pp.
7A. On Behrbohm and Pinl’s linearization of the two dimensional steady
flow of a compressible adiabatic fluid, Bulletin of the American Mathematical Society 53 (1947), 59.
Other publications in 1946: Nos. 8A and 12A2.
1947
8. On Sokolovsky’s “Momentless shells”, Transactions of the American Mathematical Society 61, 128–133.
8A. Same title, Bulletin of the American Mathematical Society 52 (1946),
240.
9. (Co-author R.N. SCHWARTZ) The Newtonian mechanics of continua, U.S.
Naval Ordnance Laboratory Memorandum 9223, 18 July, 25 pp.
9A. (Co-author R. SCHWARTZ) On the Newtonian Mechanics of Continua,
Bulletin of the American Mathematical Society 53, 1125.
10. A note on the Poisson–Charlier functions, Annals of Mathematical Statistics
18, 450–454.
11. Review of L. Brand’s “Vector and Tensor Analysis”, Science 106, 623.
Other publications in 1947: Nos. 7A, 12A3, 13P, 14P, 16P, 16A, 23P, 48P.
1948
12. An Essay toward a Unified Theory of Special Functions, based on the Functional Equation ∂F(z, α)/∂z = F(z, α + 1), Annals of Mathematics Studies
No. 18, Princeton, Princeton University Press, iv + 182 pp.
12A1. On the functional equation ∂F(z, α)/∂z = F(z, α + 1), Bulletin of the
American Mathematical Society 51 (1945), 883.
12A2. On a class of differential-difference equations, Bulletin of the American Mathematical Society 52 (1946), 823.
12A3. On the Functional Equation (∂/∂z)F(z, a) = F(z, a + 1), U.S. Naval
Ordnance Laboratory Memorandum 8975, 17 February 1947, 13 pp. =
Proceedings of the National Academy of Sciences (U.S.A.) 33 (1947),
82–93.
12A4. A unified theory of special functions, American Mathematical Monthly
56 (1949), 368.
12L. Une méthode nouvelle concernant les fonctions spéciales, pp. 53–72
of Three Lectures on Mathematics and Mechanics, U.S. Naval
Research Laboratory Theoretical Mechanics Section Memorandum
No. 3836-1, August 1, 1949.
13. On the total vorticity of motion of a continuous medium, Physical Review (2)
73, 510–512.
13P. The Transport of Vorticity, U.S. Naval Ordnance Laboratory Memorandum 9260, 11 August 1947, 7 pp.
14. On the transfer of energy in continuous media, Physical Review (2) 73, 513–515.
14P. The Energy Theorem for Newtonian Continua, U.S. Naval Ordnance
Laboratory Memorandum 9224, 21 July 1947, 8 pp.
15. A New Definition of a Fluid, U.S. Naval Ordnance Laboratory Memorandum
9487, 5 January, 31 pp.
15A1. On the differential equations of slip flow, Proceedings of the National
Academy of Sciences (U.S.A.) 34, 342–347.
15A2. On the differential equations for slip flow, Physical Review (2) 73,
1255.
16. On the reliability of the membrane theory of shells of revolution, Bulletin of
the American Mathematical Society 54, 994–1008.
16P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9270, 14
August 1947, 15 pp.
16A. Same title, Bulletin of the American Mathematical Society 53 (1947),
1125.
17. Généralisation de la formule de Cauchy et des théorèmes de Helmholtz au
mouvement d’un milieu continu quelconque, Comptes Rendus Hebdomadaires
des Séances de l’Académie des Sciences (Paris) 227, 757–759.
17L. Sur la cinématique des mouvements tourbillonaires, pp. 3–20, 73–74
of Three Lectures on Mathematics and Mechanics, U.S. Naval
Research Laboratory Theoretical Mechanics Section Memorandum
No. 3836-1, August 1, 1949.
18. Une formule pour le vecteur tourbillon d’un fluide visqueux élastique, Comptes
Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 227,
821–823.
18L. Des théorèmes tourbillonaires de la mécanique des fluides, pp. 21–37,
75–76 of Three Lectures on Mathematics and Mechanics, U.S. Naval
Research Laboratory Theoretical Mechanics Section Memorandum
No. 3836-1, August 1, 1949.
Other publications in 1948: Nos. 19P, 32P, 64P.
1949
19. The effect of viscosity on circulation, Journal of Meteorology 6, 61–62.
19P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9516, 27
January 1948, 6 pp.
19A. Same title, Physical Review (2) 76 (1949), 192–193.
20. Deux formes de la transformation de Green, Comptes Rendus Hebdomadaires
des Séances de l’Académie des Sciences (Paris) 229, 1199–1200.
Other publications in 1949: Nos. 12A4, 12L, 17L, 18L, 22P, 26P, 26L1, 26L2,
26L3A, 29P, 35P, 64A.
1950
21A. On finite strain of an elastic body, Bulletin of the American Mathematical
Society 55, 1072.
22. Bernoulli’s theorem for viscous compressible fluids, Physical Review (2) 77,
535–536.
22P. Same title, U.S. Naval Research Laboratory Report 3558, October 12,
1949, iv + 3 pp.
22A. Bernoulli’s theorem for viscous fluids, Bulletin of the American Mathematical Society 56, 253.
23. (Co-author R. PRIM) A derivation of Zorawski’s criterion for permanent
vectorlines, Proceedings of the American Mathematical Society 1, 32–34.
23P. (Co-author R.C. PRIM) Zorawski’s Kinematic Theorems, U.S. Naval
Ordnance Laboratory Memorandum No. 9354, 20 September 1947,
4 pp.
24. On the effect of a current of ionized air upon the earth’s magnetic field,
Journal of Geophysical Research 55, 247–260; 56 (1951), 134.
25. On the balance between deformation and rotation in the motion of a continuous medium, Journal of the Washington Academy of Sciences 40, 313–317.
26. A new definition of a fluid, I: The Stokesian fluid, Journal de Mathématiques
Pures et Appliquées (9) 29, 215–244; 30 (1951), 156–158.
26P. Same title, pp. 351–364 of Proceedings of the 7th International
Congress of Applied Mechanics (1948), Volume 2, 1949 = [with
minor alterations] U.S. Naval Research Laboratory Report P-3457,
April 26, 1949, iv + 11 pp.
26L1. Deformation: Elastic, plastic, and fluid masses, Research Reviews
(U.S. Office of Naval Research), 15 April 1949, pp. 10–14.
26L2. Une définition nouvelle des fluides, pp. 38–52, 76 of Three Lectures
on Mathematics and Mechanics, U.S. Naval Research Laboratory
Theoretical Mechanics Section Memorandum No. 3836-1, August 1,
1949.
26L3A. Recent continuum theories of fluid dynamics, Physical Review (2)
75 (1949), 1293.
27. On the addition and multiplication theorems for special functions, Proceedings of the National Academy of Sciences (U.S.A.) 36, 752–755.
28. The effect of the compressibility of the earth on its magnetic field, Physical
Review (2) 78, 823.
Other publications in 1950: Nos. 29A, 30A1, 30A2.
1951
29. A form of Green’s transformation, American Journal of Mathematics 73, 43–47.
29P. Same title, U.S. Naval Research Laboratory Report No. 3554, 11
October 1949, iii + 4 pp.
29A. Same title, Bulletin of the American Mathematical Society 56 (1950),
171.
30. Vorticity averages, Canadian Journal of Mathematics 3, 69–86.
30A1. On Poincaré’s analogy between vorticity and mass density, Bulletin
of the American Mathematical Society 56 (1950), 347.
30A2. Vorticity averages, Physical Review (2) 79 (1950), 229.
31. Verallgemeinerung und Vereinheitlichung der Wirbelsätze ebener und rotationssymmetrischer Flüssigkeitsbewegungen, Zeitschrift für Angewandte
Mathematik und Mechanik 31, 65–71.
31A. A new vorticity theorem, pp. 639–640 of Proceedings of the International Congress of Mathematicians, 1950, Volume 1, 1952.
32. On Ertel’s vorticity theorem, Zeitschrift für Angewandte Mathematik und Physik 2, 109–114.
32P. On Ertel’s Theorem of the Diffusion of Vorticity, U.S. Naval Ordnance Laboratory Memorandum No. 9528, 3 February 1948, 8 pp.
33. Caractérisation des champs vectoriels qui s’annulent sur une frontière fermée,
Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences
(Paris) 232, 1277–1279.
34. Analogue tri-dimensionnel au théorème de M. Synge concernant les champs
vectoriels qui s’annulent sur une frontière fermée, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 232, 1396–1397.
35. A new definition of a fluid, II: The Maxwellian fluid, Journal de Mathématiques Pures et Appliquées (9) 30, 111–155.
35P. Same title, U.S. Naval Research Laboratory Report No. 3553, September 20, 1949, viii + 36 pp.
36. Proof that Ertel’s vorticity theorem holds in average for any medium suffering no tangential acceleration on the boundary, Geofisica Pura e Applicata
19, 1–3.
37. On the equation of the bounding surface, Bulletin of the Technical University
of Istanbul 3, 71–78.
38. On the velocity of sound in fluids, Journal of the Aeronautical Sciences 18,
501.
39. The analogy between irrotational gas flow and minimal surfaces, Journal of
the Aeronautical Sciences 18, 502.
40A. Severe pure shear of an elastic body, Indiana Academy of Science Proceedings 61, 271.
41. Discussion of the paper by W.R. Osgood and J.A. Joseph, “On the general
theory of thin shells”, Journal of Applied Mechanics 18, 231–232.
42. Review of J.L. Synge and R.A. Griffith’s “Principles of Mechanics”, 2nd
edn, American Mathematical Monthly 57, 351–354.
Other publications in 1951: Nos. 24 (corrections), 26 (corrections), 54A1.
1952
43. The mechanical foundations of elasticity and fluid dynamics, Journal of
Rational Mechanics and Analysis 1, 125–300; 2 (1953), 593–616; 3 (1954),
801.
43R. [corrected, with a preface, annotations, and appendices (1962)],
pp. i–vxi, 1–186, 204–214 of Continuum Mechanics I, New York,
Gordon & Breach, 1966.
44. A program of physical research in classical mechanics, Zeitschrift für Angewandte Mathematik und Physik 3, 79–95.
44R. [corrected and annotated] pp. 187–203, 215–218 of Continuum Mechanics I, New York, Gordon & Breach, 1966.
45. On the viscosity of fluids according to the kinetic theory, Zeitschrift für
Physik 131, 273–289.
46. On curved shocks in steady plane flow of an ideal fluid, Journal of the
Aeronautical Sciences 19, 826–828.
47. Longueur critique pour la propagation des ondes libres dans un fluide visqueux, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 235, 702–704.
48. Vorticity and the Thermodynamic State in a Gas Flow, Mémorial des Sciences Mathématiques No. 119, Paris, Gauthier-Villars, 56 pp.
48P. (Co-author R.C. PRIM) Vorticity and the Thermodynamic State in the
Flow of an Inviscid Fluid, U.S. Naval Ordnance Laboratory Memorandum 9416, 12 November 1947, 14 pp.
49. Review of “Advances in Applied Mechanics”, Volume 2, Bulletin of the
American Mathematical Society 58, 403–407.
50. Review of F.D. Murnaghan’s “Finite Deformation of an Elastic Solid”, Bulletin of the American Mathematical Society 58, 577–579.
50C. Same title, Science 115, 634.
51. Discussion of H.M. Trent’s paper, “An alternative formulation of the laws of
mechanics”, Journal of Applied Mechanics 19, 569–570.
52. Discussion of R.A. Toupin’s paper, “A variational principle for the mesh-type
analysis of a mechanical system”, Journal of Applied Mechanics 19, 574.
53. Review of W. Prager and P.G. Hodge’s “Theory of Perfectly Plastic Solids”,
Bulletin of the American Mathematical Society 58, 674–677.
Other publications in 1952: Nos. 31A, 57P, 59C1, 59C1T, 59C2.
1953
54. Two measures of vorticity, Journal of Rational Mechanics and Analysis 2,
173–217.
54A1. A measure of vorticity, Bulletin of the American Mathematical
Society 57 (1951), 138.
54A2. La velocità massima nel moto di Gromeka–Beltrami, Accademia
Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche,
Matematiche e Naturali (8) 13, 378–379.
54A3. A measure of vorticity, pp. 245–246 of Proceedings of the 8th International Congress on Theoretical and Applied Mechanics 1953,
1954.
55. Notes on the history of the general equations of fluid dynamics, American
Mathematical Monthly 60, 445–458.
55R. Same title, Journal of the American Society of Naval Engineers 66
(1954), 97–108.
56. Generalization of a geometrical theorem of Euler, Commentarii Mathematici
Helvetici 27, 233–234.
57. Precise theory of the absorption and dispersion of forced plane infinitesimal waves according to the Navier–Stokes equations, Journal of Rational
Mechanics and Analysis 2, 643–742.
57P. Preliminary Report: Non-linear absorption and dispersion of plane
ultrasonic waves in pure fluids, Journal of the Washington Academy
of Sciences 42 (1952), 33–36.
58. The physical components of vectors and tensors, Zeitschrift für Angewandte
Mathematik und Mechanik 33, 345–356; 34 (1954), 69–70.
59. Paul Felix Neményi, Journal of the Washington Academy of Sciences 43,
62–63.
59C1. Same title, Science (2) 116 (1952), 215–216.
59C1T. [inaccurate] Same title, Physikalische Blätter 7 (1952), 325–326.
59C2. Same title, Zeitschrift für Angewandte Mathematik und Physik 3
(1952), 400–401.
59C3. Same title, Zeitschrift für Angewandte Mathematik und Mechanik 33,
72.
60. Review of H.M. Westergaard’s “Theory of Elasticity and Plasticity”, Bulletin
of the American Mathematical Society 59, 412–413.
61. Review of V.V. Novozhilov’s “Foundations of the Nonlinear Theory of Elasticity”, Bulletin of the American Mathematical Society 59, 467–473.
62. Václav Hlavatý, International Mathematical News No. 29/30, 2–3.
Other publications in 1953: Nos. 43 (corrections and additions), 65A.
1954
63. A new chapter in the theory of elastica, pp. 52–55 of Proceedings of the First
Midwestern Conference on Solid Mechanics, 1953.
64. The Kinematics of Vorticity, Indiana University Science Series No. 19, xvii +
232 pp.
64P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9591, 11
March 1948, 35 pp.
64A. Same title, Bulletin of the American Mathematical Society 55 (1949),
296 and 699.
65. Le pendule hydraulique, pp. 383–396 of Mémoires sur la Mécanique des Fluides offerts à M.D. Riabouchinsky à l’occasion de son Jubilé scientifique, Publications Scientifiques et Techniques du Ministère de l’Air, Paris.
65A. The hydraulic pendulum, Indiana Academy of Science Proceedings 63
(1953), 263.
66. Editor’s Introduction: Rational fluid mechanics, 1687–1765, pp. VII–CXXV
of Leonhardi Euleri Opera Omnia, Series II, Volume 12, Zürich, Füssli.
67. The present status of the controversy regarding the bulk viscosity of fluids,
Proceedings of the Royal Society (London) A 226, 59–65.
68. Mathematics, pp. 618–619 of The American Peoples Encyclopedia Yearbook
for 1953.
69. Review of J. Pérès’ “Mécanique Générale”, Bulletin of the American Mathematical Society 60, 286.
70. Review of E.J. McShane, J.L. Kelley and F.J. Reno’s “Exterior Ballistics”,
Scripta Mathematica 20, 172–174.
71. Review of A. Erdélyi, W. Magnus, F. Oberhettinger, and F. Tricomi’s “Higher
Transcendental Functions”, Volumes I and II, American Mathematical Monthly
61, 576–578.
Other publications in 1954: Nos. 43 (corrections), 55R, 54A3 and 84C.
1955
72. Hypo-elasticity, Journal of Rational Mechanics and Analysis 4, 83–133, 1019–1020.
72L. L’ipoelasticità, Conferenze del Seminario di Matematica dell’Università
di Bari No. 29, Bologna, Zanichelli, 1957, 16 pp. [The text was somewhat mangled by the editor.]
72R. [corrected] pp. 43–92 of Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach, 1965.
73. The simplest rate theory of pure elasticity, Communications on Pure and Applied Mathematics 8, 123–132.
73R. [corrected] pp. 32–41 of Continuum Mechanics III: Foundations of
Elasticity Theory, New York, Gordon & Breach, 1965.
74. Review of F.I. Frankl and E.A. Karpovich’s “Gas Dynamics of Thin Bodies”,
Science 121, 163–164.
75. Some things you don’t know about mathematics and mechanics, Indiana
Alumni Magazine 17, 2–5. [The title was supplied by the editor; C.T. would
not have accepted it, had he been informed.]
76. IU Prof says pupils can’t write, either, Indianapolis Times, June 16. [The title
was supplied by the editor.]
77. I. Editor’s Introduction: The first three sections of Euler’s treatise on fluid mechanics (1766). II. The theory of aerial sound, 1687–1788. III. Rational fluid
mechanics, 1765–1788, pp. VII–CXVII of Leonhardi Euleri Opera Omnia,
Series II, Volume 13, Zürich, Füssli.
Other publication in 1955: No. 80A.
1956
78. (Co-author E. IKENBERRY) On the pressures and the flux of energy in a gas
according to Maxwell’s kinetic theory, I, Journal of Rational Mechanics and
Analysis 5, 1–54.
79. On the pressures and the flux of energy in a gas according to Maxwell’s kinetic
theory, II, Journal of Rational Mechanics and Analysis 5, 55–128.
79L1. La crise actuelle dans la théorie cinétique des gaz (1955), Journal de
Mathématiques Pures et Appliquées (9) 37 (1958), 103–118.
79L1T. By B.H. Aleksanova and N.T. Pashchenko: Sovremennyi krizis v kineticheskoi teorii gazov, Mekhanika No. 4/62 (1960), 65–75.
79L2. Une solution exacte des équations de Maxwell (1955), Journal de
Mathématiques Pures et Appliquées (9) 37 (1958), 119–133.
79L3. Congetture intorno ad un nuovo metodo di approssimazione asintotica
(1961), Rendiconti di Matematica 23 (1964), 185–192.
80. Das ungelöste Hauptproblem der endlichen Elastizitätstheorie, Zeitschrift für
Angewandte Mathematik und Mechanik 36, 97–103.
80A. Same title, Physikalische Verhandlungen 68 (1955), 129.
80T1. By G.Yu. Dzhanelidze: Nereshennaya glavnaya zadacha nelineinoi teorii uprugosti, Mekhanika No. 1/41 (1957), 67–74.
80T2. By C.T.: The main open problem in the finite theory of elasticity,
pp. 102–108 of Continuum Mechanics III: Foundations of Elasticity
Theory, New York, Gordon & Breach, 1965.
81. Hypo-elastic shear, Journal of Applied Physics 27, 441–447.
81R. Pp. 93–100 of Continuum Mechanics III: Foundations of Elasticity
Theory, New York, Gordon & Breach, 1965.
82. Zur Geschichte des Begriffes “innerer Druck”, Physikalische Blätter 12, 315–
326.
83. Experience, theory, and experiment, pp. 13–18 of Proceedings of the Sixth
Hydraulics Conference, Bulletin 36, State University of Iowa Studies in Engineering.
84. Review of “Advances in Applied Mechanics”, Volume 3, Scripta Mathematica
22, 65–68.
84C. A comment on scientific writing, Science 120 (1954), 434.
85. Review of R. Dugas’ “La Mécanique au XVIIe Siècle”, Isis 47, 449–452.
86. Query No. 150 [Bounded magic], Isis 47, 59.
1957
87. (Co-author B. BERNSTEIN) The solution of linear differential equations by
quadratures, Journal für die Reine und Angewandte Mathematik 197, 104–111.
88. Sulle basi della termomeccanica, Accademia Nazionale dei Lincei, Rendiconti
della Classe di Scienze Fisiche, Matematiche e Naturali (8) 22, 33–38, 158–166.
88T. By C.T.: On the foundations of mechanics and energetics, pp. 293–305
of Continuum Mechanics II: The Rational Mechanics of Materials, New
York, Gordon & Breach, 1965.
89. Eulers Leistungen in der Mechanik, Enseignement Mathématique 3, 251–262.
90. General solution for the stresses in a curved membrane, Proceedings of the
National Academy of Sciences (U.S.A.) 43, 1070–1072.
91. Review of “The Principal Works of Simon Stevin”, Volume 1, edited by
E. Crone, E.J. Dijksterhuis, R.J. Forbes, M.G.J. Minnaert, A. Pannekoek, Physikalische Blätter 13, 578–579.
Other publications in 1957: Nos. 72L, 80T1, 95A.
1958
92. The new Bernoulli edition, Isis 49, 54–62.
93. Geometric interpretation for the reciprocal deformation tensors, Quarterly of
Applied Mathematics 15, 434–435.
94. Recent advances in rational mechanics, Science 127, 729–739.
94R. [corrected] Essay VIII in No. 165 below.
95. Neuere Anschauungen über die Geschichte der allgemeinen Mechanik, Zeitschrift für angewandte Mathematik und Mechanik 38, 148–157.
95A. Neuere Anschauungen über die Geschichte der Mechanik, Physikalische
Verhandlungen 83 (1957), 50.
96. Neuere Entwicklungen in der klassischen statistischen Mechanik und in der
kinetischen Gastheorie, ausgearbeitet von D. MORGENSTERN, Ergebnisse der
exakten Naturwissenschaften 30, 286–343.
97. (Co-author J.L. ERICKSEN) Exact theory of stress and strain in rods and
shells, Archive for Rational Mechanics and Analysis 1 (1957/8), 295–323.
97R. Pp. 307–323 of Continuum Mechanics II: The Rational Mechanics of
Materials, New York, Gordon & Breach, 1965.
Other publications in 1958: Nos. 79L1, 79L2, 98P, 104P, 107P.
1959
98. The rational mechanics of materials – past, present, future, Applied Mechanics Reviews 12, 75–80.
98P. Same title, Mathematics Research Center, United States Army, The
University of Wisconsin, Technical Summary Report No. 41, July
1958, 28 pp.
98R. [corrected and modified] pp. 225–236 of Applied Mechanics Surveys,
Washington, Spartan Books, 1966.
99. Invariant and complete stress functions for general continua, Archive for
Rational Mechanics and Analysis 4 (1959/60), 1–29.
100. 20 Lectures on the Elements of Fluid Mechanics, notes taken by R. Wells,
Rheology Section, National Bureau of Standards, June 30–September 11,
multiplied typescript, 131 pp.
101. Review of H. Rouse and S. Ince’s “History of Hydraulics”, Isis 50, 69–71.
102. Review of “Rheology, theory and applications”, edited by F. Eirich, Quarterly of Applied Mathematics 17, 221–222.
103. Query No. 158, “Physical Intuition”, Isis 50, 480.
Other publication in 1959: No. 110A.
1960
104. Intrinsic equations of spatial gas flow, Zeitschrift für Angewandte Mathematik und Mechanik 40, 9–14.
104P. Same title, Mathematics Research Center, United States Army, The
University of Wisconsin, Technical Summary Report No. 33, July
1958, 13 pp.
105. (Co-author R.P. KANWAL) Electric current and fluid spin created by the passage of a magnetosonic wave, Archive for Rational Mechanics and Analysis
5, 432–439.
106. (Co-author B.D. COLEMAN) On the reciprocal relations of Onsager, Journal
of Chemical Physics 33, 28–31.
107. (Co-author R. TOUPIN) The classical field theories, pp. 226–793 of Flügge’s
Handbuch der Physik, Volume 3, Part 1, Berlin/Göttingen/Heidelberg,
Springer-Verlag.
107P. [Chapter C only] Kinematics of singular surfaces and waves, Mathematics Research Center, United States Army, The University of Wisconsin, Technical Summary Report No. 43, October 1958, 89 pp.
108. (translation by Mathäi) Zu den Grundlagen der Mechanik und Thermodynamik, Physikalische Blätter 16, 512–517.
108T. [English original] Text of the Chairman’s Introduction to the Colloquium on the Foundations of Mechanics and Thermodynamics
held at the U.S. National Bureau of Standards, Washington, October
21–23, 1959, Appendix to No. 153, 1966.
109. A program toward rediscovering the rational mechanics of the age of reason,
Archive for History of Exact Sciences 1, 3–36.
109TE. By C.T.: La scienza del moto dai ‘Principia Mathematica Naturalis
Philosophiae’ di Newton alla ‘Méchanique Analitique’ di Lagrange,
Atti e Memorie della Accademia Nazionale di Scienze, Lettere ed
Arti, Modena (6) 2, 3–32.
109R1. [corrected] Essay II in No. 165 below.
109R2. [of the foregoing] No. HS-76 in The Bobbs-Merrill Reprint Series
in History of Science, 1972.
110. Modern theories of materials, Transactions of the Society of Rheology 4, 9–
22.
110A. Same title, Rheology Bulletin 28, No. 3 (1959), p. 5.
111. The Rational Mechanics of Elastic or Flexible Bodies, 1638–1788, L. Euleri
Opera Omnia, Series II, Volume 11, Part 2, Zürich, Füssli, 435 pp.
111A. Outline of the history of flexible or elastic bodies to 1788, Journal
of the Acoustical Society of America 32, 1647–1656.
111L1. Origin of the theory of vibrating systems, Res Mechanica 21 (1987),
291–311. [This text was drawn by others from Truesdell’s notes for
a lecture.]
112. [unsigned] Potentials (physics), pp. 539–542 of McGraw-Hill Encyclopedia
of Science and Technology, Volume 10.
113. [unsigned] Unified field theories, pp. 200–201 of McGraw-Hill Encyclopedia
of Science and Technology, Volume 14.
114. Review of “Die Deutsch–Russische Begegnung und Leonhard Euler”, edited
by E. Winter, Isis 51, 115.
115. Review of Leonhard Euler’s “Vollständige Anleitung zur Algebra”, edited by
J.E. Hofmann, Isis 51, 434.
116. Query No. 161, Approximate theories in early research, Isis 51, 207. [Answered by B.L. VAN DER WAERDEN on pp. 567–568.]
Other publication in 1960: No. 79L1T.
1961
117. Stages in the development of the concept of stress, pp. 556–564 of Problems
of Continuum Mechanics [Muskhelishvili Anniversary Volume], Philadelphia,
Society for Industrial and Applied Mathematics.
117T. Этапы развития понятия напряжения, pp. 439–447 of Проблемы механики сплошной среды, Moscow, Издательство
Академии Наук СССР.
118. Exact theory of self-expanding piston rings, Ingenieur-Archiv 30, 77–87.
119. The Principles of Continuum Mechanics, Socony Mobil Oil Company Colloquium Lectures in Pure and Applied Science No. 5 (February, 1960), (x) +
371 + XVIII pp. Reprinted in 1963 and 1965.
120. General and exact theory of waves in finite elastic strain, Archive for Rational
Mechanics and Analysis 8, 263–296.
120L. Second-order theory of wave propagation in isotropic elastic materials, pp. 187–199 of Proceedings of the International Conference
on Second-order Effects, Haifa (1962), 1964.
120R1. [corrected] pp. 230–263 of Continuum Mechanics IV: Problems of
Nonlinear Elasticity, New York, Gordon & Breach, 1965.
120R2. Ibid, [not repaginated] Memoir 1 in Wave Propagation in Dissipative Materials, a Reprint of Five Memoirs by B.D. COLEMAN,
M.E. GURTIN, I. HERRERA R., and C. TRUESDELL, New York,
Springer-Verlag, 1965.
121. Ergodic theory in classical statistical mechanics, pp. 21–56 of Rendiconti
della Società Italiana di Fisica, XIV Corso = Ergodic Theory, ed. P. Caldirola,
New York, Academic Press.
122. Review of M. Clagett’s “The Science of Mechanics in the Middle Ages”,
Speculum 36, 119–121.
123. Review of “Critical Problems in the History of Science”, edited by M. Clagett,
Manuscripta 5, 101–103.
124. Review of “Die Berliner und die Petersburger Akademie im Briefwechsel
Leonhard Eulers, Teil I, Der Briefwechsel L. Eulers mit G.F. Müller, 1735–
1767”, edited by A.P. Juškevič, E. Winter, and P. Hoffmann, Isis 52, 113–114.
125. Review of M. Dyck’s “Novalis and Mathematics: A Study of Friedrich von
Hardenberg’s Fragments on Mathematics and its Relation to Magic, Music,
Religion, Philosophy, Language and Literature”, Isis 52, 606–607.
1962
126. Mechanical basis of diffusion, Journal of Chemical Physics 37, 2336–2344.
126L. Una teoria meccanica della diffusione, pp. 161–168 of Celebrazioni
Archimedee del Secolo XX (Siracusa, 1961), Volume 3.
127. Reactions of the history of mechanics upon modern research, pp. 35–47 of
Proceedings of the Fourth U.S. National Congress of Applied Mechanics.
127A. Same title, Journal of Applied Mechanics 29, 225.
127R. [corrected] Essay VII in No. 165, below.
127T. [of the foregoing] by P. Zimmermann: Rückwirkungen der Geschichte der Mechanik auf die moderne Forschung, Humanismus
und Technik 13 (1969), 1–25.
128. Solutio generalis et accurata problematum quamplurimorum de motu corporum elasticorum incomprimibilium in deformationibus valde magnis, Archive
for Rational Mechanics and Analysis 11, 106–113; 12 (1963), 427–428; 28
(1968), 397–398.
129. (Co-author R.P. KANWAL) Fluid and magnetic distortion carried by magnetosonic waves, The Physics of Fluids 5, 368–369.
130. Review of “Die Berliner und die Petersburger Akademie im Briefwechsel
Leonhard Eulers, Teil II, der Briefwechsel L. Eulers mit Nartov, Razumovskij,
Schumacher, Teplov und der Petersburger Akademie, 1730–1763”, edited
by A.P. Juškevič, E. Winter, P. Hoffmann, and Ju.Ch. Kopelevič, Isis 53,
411–413.
Other publication in 1962: No. 144P1.
1963
131. (Co-author R.A. TOUPIN) Static grounds for inequalities in finite strain of
elastic materials, Archive for Rational Mechanics and Analysis 12, 1–33; 19
(1965), 407.
132. The meaning of Betti’s reciprocal theorem, Journal of Research of the National Bureau of Standards 67B, 85–86.
133. Remarks on hypo-elasticity, Journal of Research of the National Bureau of
Standards 67B, 141–143.
134. Review of M. Jammer’s “Concepts of Mass in Classical and Modern Physics”,
Isis 54, 290–291.
135. Review of D. Morgenstern and I. Szabò’s “Vorlesungen über theoretische
Mechanik”, Bulletin of the American Mathematical Society 69, 330–332.
136. Query 170 – Portrait of George Green, Isis 54, 277.
Other publication in 1963: No. 128 (corrections).
1964
137. Second-order effects in the mechanics of materials, pp. 1–47 of Proceedings
of the International Conference on Second-order Effects, Haifa (1962).
138. The natural time of a visco-elastic fluid: its significance and measurement,
The Physics of Fluids 7, 1134–1142.
139. A theorem on the isotropy groups of a hyperelastic material, Proceedings of
the National Academy of Sciences (U.S.A.) 52, 1081–1083.
140. Whence the law of moment of momentum?, pp. 588–612 of Mélanges Alexandre Koyré, Volume 1, Paris, Hermann.
140R. [corrected] Essay V of No. 165, below.
140TE. By C.T., with a different appendix: “Die Entwicklung des Drallsatzes”, Zeitschrift für Angewandte Mathematik und Mechanik 44,
149–158.
141. The modern spirit in applied mathematics, I.C.S.U. Review of World Science
6, 195–205.
142. Fluid mechanics before the Society for Natural Philosophy, Science 143, 382.
143. [Gratiae ob lauream honoris causa ab Academia Polytechnica Mediolanensi
collatam], p. 40 of Cerimonie Celebrative del Centenario del Politecnico, 2–4
Aprile 1964, Milano.
Other publications in 1964: Nos. 79L3, 120L, 144P2.
1965
144. Rational mechanics of deformation and flow [Bingham Medal Address],
pp. 3–30 of Proceedings of the 4th International Congress on Rheology
(1963), Volume 2.
144P1. Il punto di vista invariantivo nella meccanica dei corpi continui,
Rendiconti del Seminario Matematico e Fisico di Milano 32 (1962),
91–104.
144P2. Die Rationale Mechanik der Kontinua, Zeitschrift für Angewandte
Mathematik und Mechanik 44 (1964), 341–347.
144P2T. By A.I. Vandiner: Рациональная механика сплошной
среды, Механика No. 4/92 (1965), 103–111.
144RE. [with editorial changes in incorrect English] Buletinul Institutului
Politehnic din Iasi (n.s.) 13 (17) (1967), 415–418; 14 (18) (1968),
131–136.
145. (Co-author W. NOLL) The Non-Linear Field Theories of Mechanics, Flügge’s
Handbuch der Physik, Volume 3, Part 3, Berlin-Heidelberg-New York,
Springer-Verlag, viii + 602 pp.
146. (Co-author B.D. COLEMAN) Homogeneous motions of incompressible materials, Zeitschrift für Angewandte Mathematik und Mechanik 45, 547–551.
147. Fluids of the second grade regarded as fluids of convected elasticity [with an
appendix by C.-C. WANG], The Physics of Fluids 8, 1936–1938.
148. Twenty prefaces in Continuum Mechanics II: The Rational Mechanics of
Materials, New York, Gordon & Breach.
149. Sixteen prefaces in Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach.
150. Seventeen prefaces in Continuum Mechanics IV: Problems of Non-Linear
Elasticity, New York, Gordon & Breach.
151. Preface to Wave Propagation in Dissipative Materials, a Reprint of Five
Memoirs by B.D. COLEMAN, M.E. GURTIN, I. HERRERA R., and C. TRUESDELL, New York, Springer-Verlag, 1965.
Other publications in 1965: Nos. 72R, 73R, 80T2, 81R, 88T, 97R, 120R1, 120R2,
135 (corrections).
1966
152. Instabilities of isotropic perfectly elastic materials in simple shear, pp. 139–
142 of Proceedings of the Eleventh International Congress of Applied Mechanics, Munich (1964).
153. Six Lectures on Modern Natural Philosophy, New York, Springer-Verlag,
(viii) + 117 pp.
153T1. By Magdalena Staszel and Wojciech Zakrewski: Sześć Wykładów
Nowoczesnej Filozofii Przyrody, Warsaw, Panstwowe Wydawnictwo
Naukowe, 1969, 143 pp.
153TE. [first three lectures] by I.T. Rabotnova: Главы из книги
«Шесть лекций по современной натурфилософии»,
Механика No. 4/122 (1970), 99–136.
153RE. [Lecture 5], pp. 55–73 of A Taste of Science, ed. R.J. Tykodi, Westport, Connecticut, Technomic Publishing Co., 1975.
154. The Elements of Continuum Mechanics, New York, Springer-Verlag, [iv] +
279 pp. Corrected second printing, 1985.
154L1. Foundations of continuum mechanics, pp. 35–48 of Delaware Seminar in the Philosophy of Physics (1965), edited by M. Bunge, New
York, Springer-Verlag, 1967.
154L2. Thermodynamics of deformation, pp. 101–112 of Non-Equilibrium
Thermodynamics, Variational Techniques and Stability, Chicago,
University of Chicago Press.
154L3. Thermodynamics of deformation, pp. 1–12 of Modern Developments in the Mechanics of Continua, New York, Academic Press.
154L4. The nonlinear field theories in mechanics (1966), pp. 19–215 of
Topics in Nonlinear Physics, Berlin-Heidelberg-New York, Springer-Verlag, 1968.
154L5. La thermodynamique de la déformation, pp. 207–231 of Canadian
Congress of Applied Mechanics (1967), Proceedings, Volume 3,
1968.
154L6. Classical and modern continuum theories, pp. 79–92 of Polymers
in the Engineering Curriculum, Proceedings of the Third Buhl International Conference on Materials, Pittsburgh, October 28–29,
1968, 1971.
155. Existence of longitudinal waves, Journal of the Acoustical Society of America 40, 729–730.
156. Preface, pp. IVA–IVL, to the second edition of G.G. Stokes’s Mathematical
and Physical Papers, New York, Johnson Reprint Co., Volume 1.
Other publications in 1966: Nos. 43R, 44R, 98R, 163L1.
1967
157. Reactions of late baroque mechanics to success, conjecture, error, and failure
in Newton’s Principia, The Texas Quarterly, Autumn, 238–258.
157R1. [corrected] Essay III in No. 165, below.
157R2. [of the preceding], pp. 2–47 of Mechanics 1970, American Academy of Mechanics, 1970.
157R3. [of No. 157R1], pp. 192–232 of The Annus Mirabilis of Sir Isaac
Newton, 1666–1966, edited by R. Palter, Cambridge, Massachusetts,
M.I.T. Press, 1971.
157R4. [of No. 157R1] No. HS77 in The Bobbs-Merrill Reprint Series in
History of Science, 1972.
158. Reply to the paper “Zum Begriff des Elastischen Körpers” by H. Ziegler and
D. McVean, Zeitschrift für Angewandte Mathematik und Physik 18, 293.
159. Review of C.W. Kilmister and J.E. Reeve’s “Rational Mechanics”, American
Mathematical Monthly 74, 748–749.
160. Review of “Manuscripta Euleriana Archivi Academiae Scientiarum URSS,
Tomus 1, Descriptio Scientifica”, Isis 58, 271–273.
160C. Same title, Scripta Mathematica 28 (1968), 210–211.
161. Review of “Manuscripta Euleriana Archivi Academiae Scientiarum URSS,
Tomus 2, Opera Mechanica”, Isis 58, 273–274 = Scripta Mathematica 28
(1968), 211–212.
162. Review of I.E. Farquhar’s “Ergodic Theory in Statistical Mechanics”, Quarterly of Applied Mathematics 24, 386.
Other publications in 1967: Nos. 144RE, 154L1, 165P.
1968
163. Thermodynamics for beginners, pp. 373–387 of Irreversible Aspects of Continuum Mechanics, Proceedings of the IUTAM Symposia Vienna, June 22–28,
1966, Wien/New York, Springer-Verlag.
163L1. Termodinamica per principianti, Atti e Memorie della Accademia
Nazionale di Scienze, Lettere ed Arti, Modena (6) 8 (1966), 136–
144.
163L2. Termodinamica elementare, Rendiconti del Seminario Matematico
dell’Università e del Politecnico di Torino 27 (1967/68), 19–33.
163T. By I.T. Rabotnova: Термодинамика для начинающих,
Механика No. 3/121 (1970), 116–128.
164. Sulle basi della termodinamica delle miscele, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8)
44, 381–383.
164T. By C.T.: On the foundations of the thermodynamics of mixtures,
pp. 273–297 of Mechanics 1971, American Academy of Mechanics,
1973.
165. Essays in the History of Mechanics, New York, Springer-Verlag, (x)+384 pp.
165P. [Essay I only] Leonardo da Vinci, The Myths and the Reality, Johns
Hopkins Magazine, Spring, 1967, 29–42. [The title was supplied by
the editor without informing C.T., who would not have accepted it.]
165A. “Essays in the History of Mechanics”, Archives Internationales
d’Histoire des Sciences 23 (1970), 177–178.
165T. By J.C. Navascues Howard and E.T. Perez-Relaño: Ensayos de Historia de la Mecánica, Madrid, Editorial Tecnos, 1975, 343 pp.
165TE. [Essay VI] by P. Zimmermann: Frühe kinetische Gastheorien, Humanismus und Technik 14 (1970), 1–29.
See also Nos. 94, 109, 127, 140, 157.
166. Comment on longitudinal waves, Journal of the Acoustical Society of America 43, 170.
167. Parole [di ringraziamento per il V Premio internazionale con medaglia d’oro
“Prof. Modesto Panetti”], Atti della Accademia di Scienze di Torino 102,
21–24.
168. Preface, pp. III–V of Continuum Theory of Inhomogeneities in Simple Bodies, Berlin-Heidelberg-New York, Springer-Verlag.
Other publications in 1968: Nos. 128 (corrections), 144RE, 154L4, 154L5, 161
(second publication).
1969
169. Rational Thermodynamics. A Course of Lectures on Selected Topics, New
York, McGraw-Hill, (x) + 208 pp.
169T1. By D.J. Fernandez Ferrer: Termodinámica Racional, Barcelona, Editorial Reverte 1973, X + 221 pp.
169T2. By M. Fichera Colautti: Termodinamica Razionale, with an appendix, Contributi del Centro Linceo di Scienze Matematiche e Applicazioni No. 20, Roma, Accademia dei Lincei, 1976, 235 pp.
170. A precise upper limit for the correctness of the Navier–Stokes theory with
respect to the kinetic theory, Journal of Statistical Physics 1, 313–318.
170T. By E.G. Berner: Точный верхний предел корректности
теории Навье–Стокса с учетом кинетической теории,
Механика No. 4/122 (1970), 137–142.
171. A teaching assistant remembers, Wall Street Journal, January 27, p. 14. [The
title was supplied by the editor.]
Other publications in 1969: Nos. 127T, 153T1, 157T.
1970
172. De pressionibus negativis in sinu et in pariete regionis fluido viscoso moventi
impletae schedula, Annali di Matematica Pura ed Applicata (4) 84, 213–
224.
172A. Same title, Zentralblatt für Mathematik 207 (1971), 252–253.
173. Review of L. Suklje’s “Rheological Aspects of Soil Mechanics”, American
Scientist 58, 210–211.
Other publications in 1970: Nos. 153TE, 163T, 165A, 165TE, 170T, 181C.
1971
174. Letter to the Editor, The Johns Hopkins Magazine, Spring, 3–4.
175. Review of O. Penrose’s “Foundations of Statistical Mechanics”, American
Scientist 59, 638.
176. Review of R.M. Christensen’s “Theory of Viscoelasticity: An Introduction”,
American Scientist 59, 615.
177. Review of T.L. Hankins’ “Jean d’Alembert: Science and the Enlightenment”,
Centaurus 16, 56–59.
Other publications in 1971: Nos. 154L6, 157R3, 172A.
1972
178. Leonard Euler, supreme geometer (1707–1783), pp. 51–95 of Studies in Eighteenth Century Culture, Volume II, Irrationalism in the Eighteenth Century,
Case Western Reserve University Press.
179. Review of G.A. Tokaty’s “A History and Philosophy of Fluidmechanics”,
Nature 236, 84–85.
180. Review of C. Naux’s “Histoire des logarithmes de Neper à Euler, Tome II, La
promotion des logarithmes au rang de valeur analytique”, Isis 63, 443–444.
Other publications in 1972: Nos. 109R2 and 157R4.
1973
181. Is there a philosophy of science? An essay review of “Induction and Intuition
in Scientific Thought” by Peter Brian Medawar, Centaurus 17, 142–172.
181C. Review of Medawar’s “Induction and Intuition in Scientific Thought”,
Die Naturwissenschaften 57 (1970), 314.
182. (Co-author C.-C. WANG) Introduction to Rational Elasticity, Leyden,
Wolters–Noordhoff, xii + 566 pp.
183. Introduction à la Mécanique Rationnelle des Milieux Continus [translation
by D. Euvrard of an unpublished English text], vii + 367 pp., Paris, Masson
[published as of 1974].
184. Theoria de effectibus mechanicis caloris pridem ab illmo Sadi Carnoto verbis
physicis promulgata nunc primum mathematice enucleata, Atti della Accademia di Scienze dell’Istituto di Bologna Classe di Scienze Fisiche (12) 10,
29–41.
184T. By B. Cimbleris, annotated: Carnot finalmente matematizado, Revista
da Escola da Engenharia da Universidade Federal do Minas Gerais
2 (1974), 3–21.
185. The efficiency of a homogeneous heat engine, Journal of Mathematical and
Physical Sciences (Madras) 7 [Milne-Thomson anniversary volume], 349–
371; 9 (1975), 193–194.
185A. Sul rendimento delle macchine termiche omogenee, Accademia
Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche,
Matematiche e Naturali (8) 53, 549–553.
185T. By S. Benenti: Sul rendimento di una macchina termica omogenea,
Rendiconti del Seminario Matematico dell’ Università e del Politecnico di Torino 31 (1971/3), 47–68 (1974).
186. Mathematical Aspects of the Kinetic Theory of Gases, Notas de Matemática
Física, Volume III, Instituto de Matemática, Universidade Federal do Rio de
Janeiro, (viii) + 246 pp.
187. The scholar’s workshop and tools, Centaurus 17, 1–10.
188. Review of I. Bernard Cohen’s “Introduction to Newton’s Principia”, Physics
Today, April, p. 59.
189. Review of “Die Werke von Jakob Bernoulli, Band I”, Isis 64, 112–114.
190. Review of W. Flügge’s “Tensor Analysis and Continuum Mechanics”, American Scientist 61, 100.
191. Review of C.S. Gillmor’s “Coulomb and the Evolution of Physics and Engineering in Eighteenth-Century France”, Eighteenth-Century Studies 7
(1973/4), 213–225.
Other publications in 1973: Nos. 164T, 169T1, 203PT, 224P.
1974
192. The meaning of viscometry in fluid dynamics, Annual Review of Fluid Mechanics 6, 111–146.
193. (Co-author H. MOON) Interpretation of adscititious inequalities through the
effects pure shear stress produces upon an isotropic elastic solid, Archive for
Rational Mechanics and Analysis 55, 1–17.
194. Preface, pp. V–VI, The Foundations of Mechanics and Thermodynamics,
Selected Papers by W. Noll, Berlin-Heidelberg-New York, Springer-Verlag.
195. A simple example of an initial-value problem with more than one solution,
Istituto Lombardo, Accademia di Scienze e Lettere. Rendiconti, Classe di
Scienze Matematiche e Naturali (A) 108, 301–304.
Other publications in 1974: Nos. 184T and 185T.
1975
196. Первоначальный курс рациональной механики сплошных сред,
translation by Р.В. Гольдштейн and В.М. Ентов of an unpublished
English text, edited by П.А. Жилин and А.И. Лурье, Moscow, Mir,
592 pp.
197. (Co-author H. MOON) Inequalities sufficient to ensure semi-invertibility of
isotropic functions, Journal of Elasticity 5, 183–189.
197A. Same title, Zentralblatt für Mathematik 324 (1976), 513–514.
198. Early kinetic theories of gases, Archive for History of Exact Sciences 15,
1–66.
199. Les bases axiomatiques de la thermodynamique, Entropie 11, No. 63, 6–11;
No. 64, 4–10; No. 65, 4–8. [The title is an unauthorized editorial replacement for the author’s “Trois conférences sur la structure conceptuelle de la
thermodynamique, 1973”.]
200. Review of P. Costabel’s “Leibniz and Dynamics”, Historia Mathematica 2,
360–361.
201. Review of A.C. Eringen and E.S. Suhubi’s “Elastodynamics, Volume I,
Finite Motions”, Journal of the Acoustical Society of America 58, 539–540.
Other publications in 1975: Nos. 153RE, 185 (corrections), 213A.
1976
202. History of classical mechanics, Die Naturwissenschaften 63, 53–62, 119–
130.
202T. História da Mecânica clássica, Revista Brasileira de Ciências
Mecânicas 4 (1982), No. 2, 3–17, and No. 3, 3–21.
203. The scholar, a species threatened by professions, Critical Inquiry 2, 631–
648.
203A. Same title, Sociological Abstracts 26 (1978), 807.
203PT. By C. Wintzer and C.J. Scriba: Der Gelehrte: Eine durch die Professionen bedrohte Spezies, Humanismus und Technik 17 (1973), 113–
127. [An entire page of text is accidentally omitted.]
203R. [corrected] Speculations in Science and Technology 3 (1980), 517–
532.
204. Improved estimates for the efficiencies of irreversible heat engines, Annali di
Matematica Pura ed Applicata (4) 108, 305–323.
205. Review of “The Mathematical Papers of Isaac Newton, Volume VI”, edited
by D.T. Whiteside, American Scientist 64, 230.
206. Questioni vecchie e nuove di termodinamica razionale (Corso Linceo di
1973), published as an appendix (pp. 209–235) to No. 169T2.
207. Irreversible heat engines and the second law of thermodynamics, Letters in
Heat and Mass Transfer 3, 267–289.
207L. Macchine termiche irreversibili e la seconda legge della termodinamica, pp. 297–307 of Problemi attuali di meccanica teorica e applicata, Atti del Convegno Internazionale a ricordo di Modesto Panetti,
Torino.
208. Review of “L. Euleri Opera Omnia Series IVA, Volume 1”, edited by A. Juškevič, V. Smirnov, and W. Habicht, Eighteenth-Century Studies 9, 627–634 =
[corrected] Archives Internationales d’Histoire des Sciences 27 (1977),
292–296.
209. (Co-authors G. ASTARITA and G.G. SARTI) Insegnamento della termodinamica nella Facoltà di Ingegneria con i metodi della termodinamica razionale, La Chimica e l’Industria 58, 204–206.
210. Review of K. Walters’ “Rheometry”; “Theoretical Rheology”, edited by
J.F. Hutton, J.R.A. Pearson, and K. Walters; R.R. Huilgol’s “Continuum
Mechanics of Viscoelastic Liquids”; and P. Chadwick’s “Continuum Mechanics”, American Scientist 64, 705–706.
1977
211. A First Course in Rational Continuum Mechanics, Part I: Fundamental Concepts, New York, Academic Press, xxiii + 280 pp.
212. (Co-author R. FOSDICK) Universal flows in the simplest theories of fluids,
Annali della Scuola Normale Superiore di Pisa (IV) 2, 323–341.
212R. Pp. 330–348 of Volume 1 of Raccolta degli Scritti dedicati a Jean
Leray, Scuola Normale Superiore, Pisa, 1978.
213. (Co-author S. BHARATHA) The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines, Rigorously Constructed Upon the
Foundation Laid by S. Carnot and F. Reech, New York, Springer-Verlag,
xxii + 154 pp.
213A. How to understand and teach the logical structure and the history of
classical thermodynamics, pp. 577–586 of Volume 2 of Proceedings
of the International Congress of Mathematicians, Vancouver, 1974,
1975.
214. Correction of two errors in the kinetic theory of gases which have been used
to cast unfounded doubt upon the principle of material frame-indifference,
Meccanica 11 (1976), 196–199.
215. Review of “Commentationes Mechanicae et Astronomicae, Commentationes
ad Scientiam Navalem Pertinentes, Volumen Prius, Leonhardi Euleri Opera
Omnia Series II, Volumen 20”, edited by W. Habicht, Centaurus 21, 76–77.
216. Review of “Lodovico Ferrari e Niccolò Tartaglia, Cartelli di Sfida Matematica”, edited by A. Masotti, Isis 68, 643–644.
1978
217. [Address upon receipt of a Birkhoff Prize, 1978], The Mathematical Intelligencer 1, 99–101, 193.
218. Review of “Die Berliner und die Petersburger Akademie der Wissenschaften
im Briefwechsel Leonhard Eulers, Teil 3, Wissenschaftliche und Wissenschaftsorganisatorische Korrespondenzen 1726–1744”, edited by A.P. Juškevič, E. Winter, P. Hoffmann, I.N. Klado, and Ju.Ch. Kopelevič, Isis 69, 301–
303.
219. Some challenges offered to analysis by rational thermomechanics, pp. 495–
603 of Contemporary Developments in Continuum Mechanics and Partial
Differential Equations (Proceedings of the International Conference on Continuum Mechanics and Partial Differential Equations, Rio de Janeiro, August 1977), edited by G.M. de LaPenha and L.A. Medeiros, Amsterdam,
North-Holland.
1979
220. Absolute temperatures as a consequence of Carnot’s General Axiom, Archive
for History of Exact Sciences 20, 357–380.
221. Schizzo concettuale della termodinamica per gli studiosi di meccanica, Bollettino della Unione Matematica Italiana (5) 16-A, 1–20.
222. Essay review of I. Szabò’s “Die Geschichte der Mechanischen Prinzipien”,
Centaurus 23, 163–175.
223. [translation of an unpublished English text] Meccanica e Termomeccanica
razionale (1974), pp. 33–52 of Volume IV of Enciclopedia del Secolo XX,
Rome.
1980
224. (Co-author R.G. MUNCASTER) Fundamentals of Maxwell’s Kinetic Theory
of a Simple Monatomic Gas, Treated as a Branch of Rational Mechanics,
New York, Academic Press, xxvii + 593 pp.
225. The Tragicomical History of Thermodynamics, 1822–1854, New York,
Springer-Verlag, XII + 372 pp.
225P. The Tragicomedy of Classical Thermodynamics (1971), International Centre for Mechanical Sciences, Udine, Courses and Lectures, No. 70, Wien and New York, Springer-Verlag, 41 pp., 1973.
[Publication in this form was not authorized by C.T. and was
contrary to his wishes.]
225RE. The disastrous effects of experiment upon the early development
of thermodynamics, pp. 415-423 of Scientific Philosophy Today.
Essays in Honor of Mario Bunge, Dordrecht, Reidel, 1981.
226. Sketch for a history of constitutive relations, pp. 1–27 of Proceedings of
the 8th International Congress on Rheology, Volume 1.
227A. The nature and function of constitutive relations, pp. 9-41 through 9-44 of
Volume 2 EPRI Workshop Proceedings: Basic Two-Phase Flow Modelling
in Reactor Safety and Performance, Electric Power Research Institute, Palo
Alto, March.
228. (Co-author J.F. BELL) §§3–6 of Physics of Music, pp. 666, 667 of Grove’s
Dictionary of Music and Musicians, Volume 14. [Editorial revisions introduced a good many errors.]
229. Biographies [mangled by the editor and hence all but one unsigned] of
D. Bernoulli, Chladni, Euler (co-author J.F. Bell), Helmholtz, Hooke, Lagrange, Lambert, and Sauveur, p. 628 of Volume 2, pp. 289, 290 of Volume 4, p. 292 of Volume 6, pp. 465, 466 and 686 of Volume 8, pp. 361 and
397 of Volume 10, p. 524 of Volume 16 of Grove’s Dictionary of Music
and Musicians.
230. Rapport sur le pli cachété No. 126, paquet présenté à l’Académie des Sciences dans le séance du 1er Octobre 1827, par M. Cauchy, et contenant
le Mémoire “Sur l’équilibre et le mouvement intérieur d’un corps, solide
considéré comme un système de molécules distinctes les unes des autres”,
Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences
(Paris) 291 (Vie Académique), 33–46.
230A. “Cauchy’s first attempt at molecular theory of elasticity”, Bollettino di Storia delle Scienze Matematiche 1 (1981), 135–143.
231. Proof that my work estimate implies the Clausius–Planck inequality, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche,
Matematiche e Naturali (8) 68, 191–199.
Other publication in 1980: No. 203R.
1981
232. [Paroles de reconnaissance] Comptes Rendus Hebdomadaires des Séances
de l’Académie des Sciences (Paris) 292 (Vie Académique), 45.
233. The role of mathematics in science as exemplified by the work of the
Bernoullis and Euler, Verhandlungen der Naturforschenden Gesellschaft
in Basel 91, 5–22.
234. [Translation of an unpublished text in English] Il calcolatore: rovina della
scienza e minaccia per il genere umano, pp. 37–65 of La nuova ragione,
Scienza e cultura nella società contemporanea, Bologna, Scientia/Il mulino.
234R. Notiziario della Unione Matematica Italiana 11 (1984), 52–80.
Other publication in 1981: No. 230A.
1982
235. The kinetic theory of gases, a challenge to analysts, pages 321–344 of Contributions to Analysis and Geometry (The Philip Hartman Symposium), edited
by D.N. Clark, G. Pecelli, and R. Sacksteder, Johns Hopkins University
Press.
236. Our debt to the French tradition; our search for structure today, Scientia 76,
63–77.
236T. Il nostro debito verso la tradizione francese: le “catastrofi” e l’attuale
ricerca di struttura, ibid. 79–87. [This translation is faulty at some
essential points.]
237. Perpetual motion consistent with classical thermodynamics, Atti della Accademia di Scienze di Torino 114 (1980), 433–436.
238. Fundamental mechanics in the Madrid Codices, pp. 309–324 of Leonardo e
l’Età della Ragione, Milano, Scientia.
238T. [gravely defective] I primi principi di meccanica nei codici di Madrid,
pp. 325–332 of ibid.
Other publication in 1982: No. 202T.
1983
239. Euler’s contribution to the theory of ships and mechanics, Centaurus 26,
323–335.
240. The influence of elasticity on analysis: The classic heritage, Bulletin of the
American Mathematical Society (2) 9, 293–310.
1984
241. Preface to the reissue of Handbuch der Physik, Volume VIa, pp. V–VIII of
each of the four parts.
242. Correction of some errors published in this journal, Journal of Non-Newtonian
Fluid Mechanics 15, 249–251.
243. An Idiot’s Fugitive Essays on Science: Methods, Criticism, Training, Circumstances, New York, Springer-Verlag, XVII + 645 pp. Second printing,
revised and augmented, 1987, xvii + 661 pp.
243C. Essay 33d with most of the quotations from Gulliver’s Travels omitted, pages vii–xxxix of L. Euler, Elements of Algebra, New York,
Springer, 1984.
244. Rational Thermodynamics, A Course of Lectures on Selected Topics, with an
appendix by C.-C. Wang, second edition, corrected and enlarged, to which
are adjoined appendices by R.M. Bowen, G. Capriz, P.J. Chen, B.D. Coleman, C.M. Dafermos, W.A. Day, J.L. Ericksen, M. Feinberg, M.E. Gurtin,
R. Lavine, I.-S. Liu, I. Müller, J.W. Nunziato, S.L. Passman, M. Pitteri,
P. Podio-Guidugli, D.R. Owen, P.A.C. Raats, M. Šilhavý, C. Truesdell,
E.K. Walsh, and W.O. Williams, New York, Springer-Verlag, xvii+578 pp.
244LA. [of Appendix 1A] Classical Thermodynamics is a Mathematical
Science, pp. 40–53 of Proceedings of the International Conference
on Nonlinear Mechanics, Shanghai, October 28–31, 1985, Beijing,
Science Press, 1985.
245. A puzzle divided: English and Continental chairs following a unique design
of the early eighteenth century, Furniture History (John Hayward Memorial)
20, 56–60 and plates 76–80.
Other publications in 1985: Nos. 154R, 244LA.
1986
246. A third line of argument in thermodynamics, pp. 79–83 of New Perspectives
in Thermodynamics (workshop at the Institute for Mathematics and its Applications, University of Minnesota, June, 1983), New York, Springer-Verlag.
247. What did Gibbs and Carathéodory leave us about thermodynamics?, pp. 101–
124 of New Perspectives in Thermodynamics (workshop at the Institute for
Mathematics and its Applications, University of Minnesota, June, 1983),
New York, Springer-Verlag.
248. Classical thermodynamics cleansed and cured, pp. 265–291 of Meeting on
Finite Thermoelasticity, Contributi del centro Linceo interdisciplinare di
Scienze Matematiche e loro applicazioni No. 76, Roma, Accademia Nazionale dei Lincei. Corrected reprint circulated in 1988.
249. Preface, p. V of The Breadth and Depth of Continuum Mechanics, A Collection of Papers Dedicated to J.L. Ericksen on His Sixtieth Birthday, edited by
C.M. Dafermos, D.D. Joseph, and F.M. Leslie, Berlin, Springer-Verlag.
1987
250. Great Scientists of Old as Heretics in “The Scientific Method”, Charlottesville, University of Virginia Press, 96 pp.
251. Preface, pp. V–VI of Analysis and Thermodynamics, A Collection of Papers
Dedicated to W. Noll on His Sixtieth Birthday, edited by B.D. Coleman,
M. Feinberg, and J. Serrin, Berlin, Springer-Verlag.
Other publications in 1987: Nos. 111L1, 254A.
1988
252. Editorial, Archive for Rational Mechanics and Analysis 100 (1987/8), IX–
XXII.
253. On the vorticity numbers of monotonous motions, Archive for Rational Mechanics and Analysis 104, 105–109.
254. Review of U. Bottazzini’s “The Higher Calculus: A History of Real and
Complex Analysis from Euler to Weierstrass”, Archives Internationales
d’Histoire des Sciences 38, 125–137.
254A. Same title, Bulletin of the American Mathematical Society (2) 17,
186–189 (1987).
1989
255. Preface, pp. V–VI of Analysis and Continuum Mechanics, A Collection of Papers Dedicated to J. Serrin on His Sixtieth Birthday, edited by S.S. Antman,
H. Brezis, B.D. Coleman, M. Feinberg, J.A. Nohel, and W.P. Ziemer, Berlin,
Springer-Verlag.
256. [Comment on the article by Charles J. Sykes on The Johns Hopkins University, September 6], Wall Street Journal, September 19.
257. Newton’s influence on the mechanics of the eighteenth century.
257T. (By K. Hutter) Newtons Einfluß auf die Mechanik des 18. Jahrhunderts, pp. 47–73 of Die Anfänge der Mechanik: Newtons Principia
gedeutet aus ihrer Zeit und ihrer Wirkung auf die Physik, edited by
K. Hutter, Berlin, Springer-Verlag.
258. Maria Gaetana Agnesi, Archive for History of Exact Sciences 40, 113–142;
corrections and additions, 43 (1992), 385–386.
1991
259. Foreword, pp. vii–x of Edoardo Benvenuto, An Introduction to the History of
Structural Mechanics, New York, Springer-Verlag (2 Volumes).
260. Letter to the Editor, Isis 82, 90.
261. A First Course in Rational Continuum Mechanics, Volume 1, Second Edition, corrected, revised, and augmented, Academic Press, xviii + 391 pp.
1992
262. Jacopo Riccati, un grande “Savant” del ’700: Vita, Studi, Carattere, pp. 1–25
of J. Riccati e la Cultura della Marca nel Settecento Europeo (Atti del Convegno Internazionale di Studio, Castelfranco Veneto, 5–6 Aprile 1990), edited
by Gregorio Piaia and Maria Laura Soppelsa, Leo S. Olschki, Firenze.
263. Cauchy and the modern mechanics of continua (1989), Revue d’Histoire des
Sciences 45, 5–24.
264. Sophie Germain: Fame earned by stubborn error, Bollettino di Storia delle
Scienze Matematiche 11, 3–24.
265. Functionals in the modern mechanics of continua, Convegno Internazionale
in Memoria di Vito Volterra (1990), Atti dei Convegni Lincei 92, 225–242.
266. (Co-author W. NOLL) The Non-Linear Field Theories of Mechanics, Second
Edition, Berlin-Heidelberg-New York, Springer-Verlag, X + 591 pp.
266T. (By Chen Zhaoxun) Fei xian xing chang zhi li xue li lun, National
Institute for Compilation and Translation, Taipei, I+III+VI+ 742 pp.,
2001.
1993
267. Mechanics, especially elasticity, in the correspondence of Jacob Bernoulli
with Leibniz, pp. 13–26 of Der Briefwechsel von Jacob Bernoulli, edited
by A. Weil, in Die gesammelten Werke der Mathematiker und Physiker der
Familie Bernoulli, Basel, Birkhäuser.
1994
268. A modern exposition of classical thermodynamics, in: La Termodinamica e
la Termocinetica nelle Scuole di Ingegneria, a ricordo del Prof. Cesare Codegone (Atti della giornata di studio tenutasi presso il Politecnico di Torino il
15 ottobre 1992), Atti della Accademia delle Scienze di Torino, Classe di
Scienze Matematiche, Fisiche e Naturali 128, suppl. 2, 71–94.
1995
269. A che serve la storia delle scienze matematiche?, pp. 45–52 of Honoris
Causa, Lezioni Dottorali di Insigniti di Laurea ad Honorem in Occasione del
VI Centenario dell’Ateneo, Anno Accademico 1991/92, Ferrara, Università
degli Studi di Ferrara.
1996
270. Jean-Baptiste-Marie Charles Meusnier de la Place (1754–1793): An historical note, Meccanica 31, 607–610.
271. The thirty-fifth anniversary of this Archive, by the Editor, Archive for History
of Exact Sciences 50, 1–4.
2000
272. (Co-author K.R. RAJAGOPAL) An Introduction to the Mechanics of Fluids,
xiii + 277 pp., Birkhäuser, Boston.
2001
Other publication in 2001: No. 266T.
2004
273. (Co-author W. Noll) The Non-Linear Field Theories of Mechanics, Third
Edition, edited by S.S. Antman, Berlin-Heidelberg-New York, Springer-Verlag, XXIX + 602 pp.
Serials Edited by
Clifford Ambrose Truesdell III
I. Serials, as Founder or Co-founder
(Co-founder and co-editor T.Y. Thomas, later co-editor V. Hlavatý) Journal of
Rational Mechanics and Analysis, Indiana University, 5 volumes, 1952–1956.
Archive for Rational Mechanics and Analysis, Berlin, Springer-Verlag, 1957–
1989 (co-editor J. Serrin, 1967–1985).
Archive for History of Exact Sciences, Berlin, Springer-Verlag, 1960–1990.
Springer Tracts in Natural Philosophy, New York, Springer-Verlag, 1963–
1966; co-editor, 1967–1978; editor, 1979–2000.
II. Other Serials
Editor, Reihe für Mechanik, Ergebnisse der Angewandten Mathematik, Berlin,
Springer, 1957–1962, 3 volumes in all.
Member of the Editorial Committee, Rendiconti del Circolo Matematico di
Palermo, 1971–2000.
Member of the International Editorial Committee, Meccanica, 1974–1994.
Member of the International Editorial Committee, Annali della Scuola Normale Superiore di Pisa, 1974–1999.
Member of the International Editorial Board, Il Nuovo Cimento B, 1979–1987.
Member of the Editorial Council, Bollettino di Storia delle Scienze Matematiche, Unione Matematica Italiana, 1979–2000.
Member of the Editorial Board, Speculations in Science and Technology, 1980–
1987.
Member of the Editorial Board, Ganita-Bhãrati, 1981–1993.
Member of the Editorial Board, Stability and Applied Analysis of Continuous
Media, 1991–1993.
Eulogium
CLIFFORD AMBROSE TRUESDELL III
(b. February 18, 1919; d. January 14, 2000)
Clifford Ambrose Truesdell III died on January 14, 2000. This man of mathematics,
science and natural philosophy focused his strong sense of history and his talents
and taste for identifying major advances in rational mechanics to establish a renaissance in mechanics and materials research that has prospered since the middle
of the 20th century. He contributed substance and spirit to the areas of continuum
mechanics, thermodynamics and kinetic theory, challenged the establishment and
its dogmatic thinking, and engaged the community of young researchers with a new
and fundamental direction of inquiry which concentrated on foundations, structure
and logical implication. His letters, his books, his essays, his university courses, his
co-founding of the Journal of Rational Mechanics and Analysis, his founding of
the Archive for Rational Mechanics and Analysis, the Archive for History of Exact
Sciences, Springer Tracts in Natural Philosophy, the Society for Natural Philosophy and his support of the research of other scientists were exceptional. His joint
encyclopedic articles, The Classical Field Theories in 1960 and The Non-Linear
Field Theories of Mechanics in 1965, were masterful, erudite, comprehensive, and
pioneering works of lasting value.
Clifford Truesdell was a scholar of immense creativity, a linguist, a connoisseur
of the arts and a historian unfettered by fashion. Throughout his life, he taught
us to preserve scholarship, to question foundations and to follow a path of reason
with principle, purpose and passion. His actions constantly provided a stimulus,
an environment and a framework for scientific discovery. He made a profound
contribution to our science.
ROGER FOSDICK
University of Minnesota
Minneapolis
Bloomington, Indiana, 1959
Memories of Clifford Truesdell
BERNARD D. COLEMAN
Department of Mechanics and Materials Science, Rutgers University, 98 Brett Road, Piscataway,
NJ 08854-8058, U.S.A.
Received 10 February 2003; in revised form 2 March 2003
Below is a shortened version of the text of a talk given at the Meeting in memory
of Clifford Truesdell held in Pisa in November of 2000 and at the Symposium on
Recent Advances and New Directions in Mechanics, Continuum Thermodynamics,
and Kinetic Theory held in Blacksburg in June of 2002. Appended to that text is
the Curriculum Vitae of Professor Truesdell as he kept it up-to-date until October
1993, at which time, with his approval, I had it transcribed into its present format.
Clifford Truesdell and Thermodynamics
I consider myself to have been among the most fortunate of men: I have had a
teacher and friend, indeed, more than a friend, in effect, an elder brother, who was
the leading scholar in my science and who gave me encouragement, sound advice,
and every type of help that I might need, even when I did not know that I needed
it. Most important of all, he taught me that careful scholarship and the persistent
search for insight and understanding are far more important than facile skill in the
use of contemporary techniques for the solution of currently popular problems.
Clifford Ambrose Truesdell III was born in Los Angeles, February 18, 1919. In
his 23rd and 24th years he received, from the California Institute of Technology,
the B.S. Degree in Mathematics, the B.S. Degree in Physics, and the M.S. Degree
in Mathematics, and, in addition, from Brown University, a Certificate in Mechanics. In his 25th year he received, from Princeton University, the Ph.D. Degree in
Mathematics.
In the course of his career he received numerous awards and prizes, among
which are: the Euler medal of the USSR Academy of Sciences, which was received
twice, in 1958 and 1983, the Bingham Medal of the Society of Rheology, the
Panetti Prize and Gold Medal of the Accademia di Scienze di Torino, the Birkhoff
Prize of the American Mathematical Society and the Society for Industrial and
Applied Mathematics, and the Ordine del Cherubino of the Università di Pisa. He
received honorary doctorates from five universities and was awarded membership
in twelve international academies of science; among them is the illustrious Italian
Accademia Nazionale dei Lincei.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 1–13.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Although I was an undergraduate student at Indiana University from February
1948 to June 1951, and hence my stay in Bloomington, Indiana, did overlap, albeit
partially, that of Clifford and Charlotte Truesdell, we first met years later, in the
spring of 1958, at a scientific meeting in Lancaster, Pennsylvania, on the subject
of rheology. The meeting was followed by an exchange of letters about thermodynamics, which was in turn followed by a one-week visit with Clifford and Charlotte
Truesdell at their house in Bloomington in the winter of 1958 and not long after
that by his two-week visit to the Mellon Institute. If you bear with me, I should
like to tell you about events of that period from the point of view of one whose
subsequent views of science, the arts, and life itself were completely changed by his
interaction with Clifford Truesdell.
In the summer of 1957 I left a position in the chemical industry to become a
Senior Fellow of the Mellon Institute in Pittsburgh, and soon after my arrival I
started to attend courses given by Walter Noll on continuum mechanics and related
branches of mathematics. Before the academic year was over, we both went to
the rheology meeting in Lancaster. The list of speakers for that meeting included
Clifford Truesdell, Walter Noll, Jerry Ericksen, and Ronald Rivlin. At a luncheon
that was held there, Walter Noll and I were sitting at a table with several persons
other than those just mentioned, and I expressed the view that although we were all
told in school that thermodynamics is a closed subject whose general principles are
known and pertain to only equilibrium states or to processes that stay so close to
equilibrium that all departures from equilibrium are governed by linear constitutive
relations, I could not believe that such is the case, and I felt that our knowledge of
what the science of thermodynamics could be was in some way analogous to what
the cultivators of mechanics knew about their subject at the time of the publication
of Newton’s Principia and the early work of the Basel School. All within hearing,
with the exception of Walter Noll, agreed with each other that I was wrong. Walter
agreed with me and suggested that I read certain papers of Clifford Truesdell and
that we talk more when we were back in Pittsburgh.
We did talk more, much more, and I, with my eyes open wide with excitement,
read whatever I could of Clifford Truesdell’s writings on thermodynamics.
There were papers that carried forward Maxwell’s idea that a properly formulated theory of diffusion of mass in fluid mixtures should account, as Fick’s Law
does not, for balance of linear and angular momentum.
There were passages decrying the vagueness that rendered nearly empty, at least
for mathematicians, the text-book versions of the Second Law of Thermodynamics.
Among these was a footnote to a discussion of classical thermodynamics in his
paper, The Mechanical Foundations of Elasticity and Fluid Dynamics, published
in 1952 in Volume 1 of the Journal of Rational Mechanics and Analysis. The footnote urges the rational student “to cleave the stinging fog of pseudo-philosophical
mysticism” hiding the mathematics behind a certain formulation of the Second
Law. It was clear that he saw that thermodynamics, far from being a closed subject,
was in a terrible state.
Today we know that he was then doing research that would supply the key to
setting things straight. In that paper of 1952 there appears a preliminary version of
what he, with great generosity, called the Clausius–Duhem inequality, and which
appeared in its present form in The Classical Field Theories of Mechanics, by
Clifford Truesdell and Richard Toupin, published in 1960 in the Handbuch der
Physik, Vol. III:
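\[
  \frac{d}{dt} H \;\ge\; -\int_{\partial P} \frac{q \cdot n}{\theta}\, da \;+\; \int_{P} \frac{r}{\theta}\, dm,
  \qquad H = \int_{P} \eta \, dm .
\]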
Here H is the total entropy of the part P, θ is the thermodynamical temperature,
q is the inward directed heat flux, r is the supply of heat from external sources, and
n is the outward directed unit normal vector.
On reading those two works one sees that before 1960 it was clear to Clifford
Truesdell that that inequality is the correct mathematical form of the second law
of thermodynamics for the materials or systems such that the total entropy H is
an integral over P of an entropy density η, and nearly all the thermodynamical
systems that we consider in continuum physics have that property. The question
that seemed open at the time was the following: How does one use the inequality?
Is it a restriction on the process, or a relation to be obeyed by all processes?
In the early 1960’s Walter Noll put to me the idea, as if it should be obvious to
everyone, that the inequality is a restriction on all processes that are admissible in
the material of which the body is composed, and, because one defines each material
by giving a set of constitutive relations, the Clausius–Duhem inequality, as it must
hold for all processes compatible with those relations, becomes a restriction on
constitutive relations. In another act of great generosity, Walter suggested that we
develop the idea together. It took a while to sort the argument out and to present it
in a way that would convince the wary. The paper was written while he and I were
on sabbatical leave and were guests of Clifford Truesdell at Hopkins.
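The argument can be illustrated by a brief sketch in standard modern notation. The sketch is not part of the talk. The choice of a thermoelastic material with heat conduction is an assumption made here for illustration only, as are the symbols ρ (mass density), ε (specific internal energy), ψ = ε − θη (specific free energy), T (Cauchy stress), F (deformation gradient), L (velocity gradient) and g = grad θ. Assuming smooth fields, θ > 0, and balance of energy in the form ρ ε̇ = T·L − div q + ρ r, the heat supply r can be eliminated from the local form of the inequality. If the response functions depend on (F, θ, g), requiring the result to hold in every process yields conditions on those functions:

```latex
% Local form of the inequality (dm = \rho\,dv), with q and r as in the text:
\[
  \rho\,\dot{\eta} \;\ge\; -\operatorname{div}\!\Big(\frac{q}{\theta}\Big) + \frac{\rho\, r}{\theta}.
\]
% Eliminate r with the assumed energy balance and \psi = \varepsilon - \theta\eta:
\[
  -\rho\,\big(\dot{\psi} + \eta\,\dot{\theta}\big) + T\cdot L - \frac{1}{\theta}\, q\cdot g \;\ge\; 0 .
\]
% Assumed constitutive class: \psi, \eta, T, q are functions of (F,\theta,g); \dot F = L F.
\[
  \big(T - \rho\,\partial_F\hat\psi\,F^{\mathsf T}\big)\cdot L
  \;-\; \rho\,\big(\hat\eta + \partial_\theta\hat\psi\big)\,\dot\theta
  \;-\; \rho\,\partial_g\hat\psi\cdot\dot g
  \;-\; \frac{1}{\theta}\,\hat q\cdot g \;\ge\; 0 .
\]
% L, \dot\theta and \dot g can be assigned arbitrarily at a given (F,\theta,g), hence:
\[
  \partial_g\hat\psi = 0,\qquad
  \hat\eta = -\,\partial_\theta\hat\psi,\qquad
  T = \rho\,\partial_F\hat\psi\,F^{\mathsf T},\qquad
  \hat q\cdot g \le 0 .
\]
```

Under these assumptions the inequality places no condition on the process itself. It limits the admissible response functions, which is the sense in which it becomes a restriction on constitutive relations.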
Shortly thereafter, I used this approach to the Clausius–Duhem inequality to
render mathematical ideas I had been struggling to express for years about thermodynamical restrictions on materials with gradually fading memory.
Many of the people in this room have done research on the Clausius–Duhem
inequality, in the study of its implications for new classes of constitutive relations,
in the study of its implications for the theory of the evolution of singular surfaces,
or in the study of its logical relation to other mathematically precise statements of
the second law. I am certain that I do not exaggerate when I say that every one
of them feels a deep debt of gratitude to Clifford Truesdell for finding the tool to
cleave the fog that once obscured the science of thermodynamics.⋆ But we have
even greater debts to him.
⋆ The remark was appropriate to both the meeting in Pisa and symposium in Blacksburg.
I should like to return to the time of that rheology meeting in Lancaster and elaborate on my own debt to him. A few months after that meeting, I received a letter
from Clifford in which he said that he heard from Walter Noll that I had studied at
Yale, had read some of the works of J. Willard Gibbs, and knew some things about
thermodynamics. He then asked if I could clarify some passages in Gibbs’ paper on
The Equilibrium of Heterogeneous Substances. With Walter’s help I studied those
passages for months and finally sent off a long letter in which Clifford’s questions
were answered but issues were raised which for decades affected my work on the
stability of thermodynamical systems. In his reply he invited me to visit him and
Charlotte in Bloomington. Many years later, in the summer of 1993, I wrote to him
about that period and with your indulgence I shall read from the letter. I know there
are others here who could say similar things about our beloved friend, and when I
have finished you will see what I mean about our greater debt to him. What I read
now are three brief consecutive paragraphs out of a long letter.
“My correspondence with you about Gibbs’ conditions for the stability of fluid
phases occurred in this happy period. I visited your house in Bloomington for
a week in, I believe, the winter of 58–59. That visit had a major influence
on my view of what is important. As if struck by lightning, like the one
Christians call Paul, I suddenly saw clearly something for which I was ready
by instinct. In my case it was not a new religion, but a way to get out of a rut,
by seeking to study the languages, the writings, the art, the customs, the lives,
and the music and diversions of the ages in which our science originated and
the works we admire were produced. Your example gave me the impetus to
try to learn properly other languages. I became serious in my study of Italian.
In later years I have tried to improve my French. (In my 62nd year, I started
working on Attic and Homeric Greek, but was too old for such efforts.)”
“The influence of our friendship on my intellectual development has been too
great to describe in a few phrases couched in generalities. A brief summary is
impossible. Only examples will do, and there is space here for only one.”
“That one concerns my behavior when I am writing something for publication.
Invariably, upon completion of a passage I put down the pencil and ask myself:
‘What would Clifford say if he saw what I have just written?’ The subsequent
imagined conversation is often such that I feel obliged to rewrite the passage.”
Would that he were here now, to help us live up to the standards of scholarship
and clarity that he set for us!
Curriculum Vitae of Clifford Ambrose Truesdell III
Born in Los Angeles, California, February 18, 1919
Studies
European Travel and Private Study, 1936–1938.
California Institute of Technology: B.S. (Mathematics), 1941; B.S. (Physics),
1941; M.S. (Mathematics), 1942.
Brown University: Certificate in Mechanics, 1942.
Princeton University: Ph.D. (Mathematics), 1943.
Primary Employment
California Institute of Technology:
Assistant in history, debating, and mathematics, 1940–1942.
Brown University:
Assistant in Mechanics, 1942.
Princeton University:
Instructor of Mathematics, 1942–1943.
University of Michigan:
Instructor of Mathematics, 1943–1944.
Radiation Laboratory, Massachusetts Institute of Technology:
Staff Member, 1944–1946.
U.S. Naval Ordnance Laboratory, White Oak, Maryland:
Chief, Theoretical Mechanics Subdivision, 1946–1948.
U.S. Naval Research Laboratory, Washington, D.C.:
Head, Theoretical Mechanics Section, 1948–1950.
Indiana University:
Professor of Mathematics, 1950–1961.
Johns Hopkins University:
Professor of Rational Mechanics, 1961–1989; Emeritus, 1989–
Part-time, Temporary, and Visiting Appointments
University of Maryland, College Park:
Lecturer in Mathematics, 1946–1947,
Assistant Professor of Mathematics, 1947–1949,
Associate Professor of Mathematics, 1949–1950.
U.S. Naval Research Laboratory, Washington, D.C.:
Consultant, 1951–1955.
Universität Marburg an der Lahn:
Gastprofessor, 1957.
Mathematics Research Center, U.S. Army, University of Wisconsin, Madison:
Member, 1958.
Mellon Institute, Pittsburgh: Visitor, 1959.
Socony-Mobil Research Laboratory, Dallas, Texas:
Colloquium Lecturer, 1960.
U.S. National Bureau of Standards, Washington, D.C.:
Consultant, 1950–1962.
University of California at Los Angeles: Special Lecturer, 1963.
Technische Universität Berlin–Charlottenburg: Gastprofessor, 1964.
University of Washington, Seattle:
Walker-Ames Professor, 1964.
Australian Mathematical Society Summer Research Institute, Melbourne:
Lecturer, 1965.
Syracuse University, New York:
Distinguished Visiting Professor, 1965.
International School on Nonlinear Problems in Physics, München:
Lecturer, 1966.
Università di Pisa:
Visiting Lecturer, 1966, 1973–1975, 1978, 1980, 1982, 1985, 1987.
Sandia Corporation, Albuquerque, N.M.:
Visitor, 1966.
Drexel Institute of Technology, Philadelphia:
Seventy-Fifth Anniversary Lecturer, 1966–1967.
Accademia dei Lincei, Roma:
Professore Linceo, 1970, 1973.
Universidade Federal do Rio de Janeiro:
Lecturer for the Coordenação dos Programas de Pós-Graduação
de Engenharia and the Instituto de Matemática, 1972.
Georgia Institute of Technology, Atlanta:
Consultant, 1973–1974.
Scuola Normale Superiore, Pisa:
Ospite Linceo, 1974.
Brookhaven National Laboratories, Long Island:
Consultant (Advanced Codes Review Committee, U.S. Nuclear Regulatory
Commission), 1975–1983.
University of Delaware:
Bicentennial Scholar in Residence, 1976.
Instituto de Ingeniería Mecánica y Mecánica Teórica y Aplicada,
Universidad Autónoma de Mexico, Mexico D.F.:
Lecturer, 1977.
Università di Bologna:
Professore Visitatore, 1978, 1987, 1988.
Université Catholique de Louvain:
Visiting Professor, 1979.
Institut des Hautes Études Scientifiques, Bures-sur-Yvette:
Visitor, 1981.
Cornell University:
First Distinguished Visiting Professor of Theoretical and Applied
Mechanics, 1982.
Università di Firenze, Scuola di Architettura:
Professore a Contratto, 1985.
Scuola di Ingegneria Strutturale, Università di Roma “La Sapienza”:
Visiting Professor, 1990.
Short Lecture Series and Named Single Lectures
University of Toronto, 1949.
Sorbonne, Paris, 1949, 1955.
State University of Iowa, 1956.
Indiana University, 1959.
Scuola Internazionale di Fisica, Varenna, 1960.
Università di Padova, 1961.
Università e Politecnico di Milano, 1961.
Midwest Mechanics Seminar Tour, 1962.
Academy of Sciences, Warsaw, 1963, 1964.
The Johns Hopkins University, 1965.
Gibson Lecturer in the History of Mathematics, University of Glasgow, 1965.
Distinguished Visiting Lecturer, Centennial of the University of Kentucky,
1965.
NSF Conference on Recent Developments in Continuum Mechanics, Virginia
Polytechnic Institute, 1966, 1969.
Koerner Lecturer, Simon Fraser University, Burnaby, BC, 1969.
Distinguished Lecturer in Chemical Engineering, University of Rochester,
1970.
International Centre of Mechanical Sciences, Udine, 1971.
Centennial Lecturer in Engineering Mechanics, Virginia Polytechnic Institute,
1971.
Section de Transferts Thermiques, Centre de Recherches Nucléaires,
Grenoble, 1973.
Bajer Lecture, Princeton University, 1975.
Durelli Lecture, Catholic University of America, 1977.
International Symposium on Continuum Mechanics and Partial Differential
Equations, Universidade do Rio de Janeiro, 1977.
University of Chicago, 1979.
Thermofluids Lectures, Departments of Chemical and Mechanical
Engineering, School of Mines, University of Arizona, Tucson, 1980.
Ritt Lectures, Department of Mathematics, Columbia University, 1982.
First MTU Lectures in Engineering Science, Michigan Technical University,
Houghton, 1983.
St. Andrews University, Scotland, 1983.
Distinguished Scientist Lecture, Trinity University, San Antonio, TX, 1984.
Allen Lecture in Mathematical Sciences, Rensselaer Polytechnic Institute,
1985.
Page-Barbour Lectures, University of Virginia, Charlottesville, 1985.
Franklin Lecture, Auburn University, Auburn, AL, 1986.
Invited Single Lectures to Meetings and Symposia
American Mathematical Association (Baltimore, 1948).
American Physical Society (Charlottesville, 1949; State College, PA, 1953;
San Diego, CA, 1971).
International Conference on Theoretical Fluid Mechanics, Harvard, 1950.
Conference on Elasticity, University of Maryland, 1952.
Sigma Xi (Indiana University, 1952; State University of Iowa, 1956;
Illinois Institute of Technology, 1960; Georgia Institute of Technology
(Monie A. Ferst Memorial Lecture), 1969; University of Tennessee, 1976;
McGill University, 1976).
Symposium on Ultrasonic Absorption and Dispersion in Fluids, Brown
University, 1952.
Discussion Meeting on the Second Viscosity of Fluids, The Royal Society,
London, 1953.
First Midwestern Conference on Solid Mechanics, University of Illinois, 1953.
Symposium of the Office of Ordnance Research and the American
Mathematical Society, Chicago, 1954.
American Society for Engineering Education, Urbana, 1954.
Gesellschaft für Angewandte Mathematik und Mechanik (General Lectures),
Berlin, 1955; Hamburg, 1957.
Sixth Conference on Hydraulics (General Lecture), Iowa City, 1955.
Eulerfeier (Main Lecture), Basel, 1957.
Washington Philosophical Society, 1958.
Accademia Nazionale di Scienze, Lettere ed Arti (Inaugural Address),
Modena, 1960.
Celebrazioni Archimedee, Siracusa, 1961.
I.U.T.A.M. Symposium on Second-order Effects in Elasticity, Plasticity,
and Fluid Dynamics (General Lecture), Haifa, 1962.
Fourth U.S. National Congress of Applied Mechanics (General Lecture),
Berkeley, 1962.
Summer Conference on Non-ideal Mechanical Behavior, Princeton, 1962.
American Society of Mechanical Engineers, Washington, 1962.
Symposium on Hemodynamics and Hydrodynamics, Baltimore, 1962.
International Congress of Rheology (Bingham Medal Address), Providence,
1963.
Society for Natural Philosophy at Pittsburgh, 1963; Notre Dame, 1971;
Seattle, 1972; Pisa, 1974, 1978; Williamsburg (Fifteenth Anniversary
Lecture), 1978; Rolla, 1980; Brown, 1983; Baltimore, 1987; Pittsburgh
(Walter Noll retirement symposium), 1993.
Eleventh International Congress of Applied Mechanics, Munich, 1964.
Philosophy of Science Seminar, University of Delaware, 1965.
Convegno dei Meccanici Italiana, Modena, 1966.
I.U.T.A.M. Symposium on Irreversible Thermodynamics in Continuous
Media, Vienna, 1966.
Commemoration of Newton’s Annus Mirabilis, Austin, 1966.
First Canadian National Congress of Applied Mechanics (General Lecture),
Quebec, 1967.
Third Buhl International Conference on Materials, Mellon Institute, 1968.
Symposium on “The Interplay between Mathematics and Physics – The Rise
of Mathematical Physics” at the University of Aarhus, 1970.
Southwest Graduate Research Conference, Houston, Texas, 1971.
Second Annual Meeting of the American Society for Eighteenth Century
Studies, College Park, Maryland, 1971.
Banquet address, meeting of the History of Science Society and Society for
the History of Technology, Washington, 1972.
Sectional address (History and Paedagogy), International Congress of
Mathematicians, Vancouver, 1974.
Address at the Engineering Commencement, Tulane University, New Orleans,
1976.
Address on receipt of a Birkhoff Prize, Annual meeting of the American
Mathematical Society and the Society for Industrial and Applied
Mathematics, 1978.
Euromech Colloquium, Pisa, 1978.
Italo–American Co-operative Science Seminar, Venice, 1978.
Organizer’s address, Special Symposium on “Conceptual Analysis in Rational
Thermomechanics”, Summer meeting of the American Mathematical
Society, Providence, 1978.
Keynote address on Constitutive Relations, E.P.R.I. Workshop on Two-Phase
Flow, Tampa, Florida, 1979.
Colloquium on Continuum Thermodynamics, Society of Engineering
Science, Northwestern University, Evanston, 1979.
Celebration of the 75th anniversary of Scientia, Milano, 1980.
Plenary lecture, 8th International Congress of Rheology, Naples, 1980.
General Lecture, Society of Engineering Science, Atlanta, 1980.
Colloquium on the History of Mathematics, Winter meeting of the American
Mathematical Society, San Francisco, 1981.
Keynote address, 11th Southeastern Conference on Theoretical and Applied
Mechanics, Huntsville, Alabama, 1982.
Joint Session on the History of Mathematics, meetings of the American
Mathematical Society and American Mathematical Association, Toronto,
1982.
Festakt Daniel Bernoulli (Main Lecture), Basel, 1982.
Leonardo e l’età della ragione (Congress organized by Scientia and the
governing bodies of Milano and Lombardy), Milano, 1982.
25th British Theoretical Mechanics Colloquium, Manchester, 1983.
International Symposium, “The Codex Hammer in Context”, Walters Art
Gallery, Baltimore, 1983.
Workshop on the Laws and Structure of Continuum Thermomechanics,
Institute for Mathematics and its Applications, University of Minnesota,
Minneapolis, 1983.
Convegno sul tema “Termoelasticità finita”, Accademia Nazionale dei Lincei,
Rome, 1985.
International Conference on Nonlinear Mechanics, Shanghai, 1985.
900th Anniversary celebrations, University of Bologna, 1987, 1988.
300 Years of Gravitation, University of Cambridge, England, 1987.
International Conference dedicated to the Tricentenary of the Publication of
Newton’s Principia, U.S.S.R. Academy of Sciences, Moscow, 1987.
First Plenary Lecture, 4th National Congress of Theoretical and Applied
Mechanics, Coimbra, Portugal, 1987.
Celebration of the 300th anniversary of Newton’s Principia, Technische
Hochschule Darmstadt, 1988.
First Plenary Lecture, III International Workshop on Mathematical Aspects of
Fluid and Plasma Dynamics, Salice Terme, Italy, 1988.
First Plenary Lecture, IX Congresso Nazionale dell’Associazione Italiana di
Meccanica Teorica ed Applicata, Bari, 1988.
Imola Conference, Università degli Studi di Bologna, September 5–7, 1988.
Inaugural Charles E. Foster Lecture, School of Aerospace and Mechanical
Engineering, University of Oklahoma, Norman, 1990.
Convegno Internazionale “I Riccati e la cultura della Marca nel Settecento
Europeo”, Castelfranco Veneto, 1990.
First Rutgers Conference on Theoretical Mechanics: The Dynamics of Rods,
August 24–27, 1990, Rutgers University, New Brunswick.
Convegno Internazionale in Memoria di Vito Volterra, Accademia Nazionale
dei Lincei, October 8–11, 1990.
Editorial Positions
Co-founder and Co-editor, Journal of Rational Mechanics and Analysis,
1952–1956.
Editor or Co-editor, Leonhardi Euleri Opera Omnia, Series II, Vols. 10–13,
18–19, 1952–1971.
Co-editor, Handbuch der Physik, (Springer) Vols. 8/I, 8/II, 9 and 6a/1–6a/4,
1956–1974.
Founder and Editor, Archive for Rational Mechanics and Analysis, 1957–
1967, Co-editor, 1967–1985; Editor, 1985–1989.
Editor, Reihe für Mechanik, Ergebnisse der Angewandten Mathematik, 1957–
1962.
Founder and Editor, Archive for History of Exact Sciences, 1960–.
Founder and Editor, Springer Tracts in Natural Philosophy, 1962–1966;
Co-editor, 1967–1978; Editor, 1979–.
Co-editor, Studies in the Foundations, Methodology and Philosophy of
Science, 1966–1970.
Member of the Editorial Board, Rendiconti del Circolo Matematico di
Palermo, 1971–.
Member of the Editorial Board, Annali della Scuola Normale Superiore, Pisa,
1974–.
Member of the Editorial Board, Meccanica, 1974–.
Member of the International Editorial Board, Il Nuovo Cimento B, 1979–
1981; Il Nuovo Cimento D, 1982–1987.
Member of the Editorial Council, Bollettino di Storia delle Scienze
Matematiche, Unione Matematica Italiana, 1979–.
Member of the Editorial Board, Speculations in Science and Technology,
1980–1987.
Member of the Editorial Board, Ganita-Bharati, 1981–.
Member of the Editorial Board, Stability and Applied Analysis of Continuous
Media, 1991–.
Organizational Positions
U.S. Correspondent, International Mathematical News (Austria), 1952–1956.
Member, Committee on Applied Mathematics, U.S. National Research
Council, 1954–1956.
Sponsor for Elasticity, American Society of Mechanical Engineers, 1956–
1958.
General Chairman, Conference on the Foundations of Mechanics and
Thermodynamics, National Bureau of Standards, 1959.
Member of organizing committee, International Conference on Rarefied Gas
Dynamics, Berkeley, California, 1960.
Member of organizing committee, International Congress of Logic and the
Philosophy of Science, Stanford, 1960.
Member of organizing committee, I.U.T.A.M. Conference on Second-order
Effects in Elasticity, Plasticity, and Fluid Mechanics, Haifa, 1962.
Co-founder, Society for Natural Philosophy, 1963; Director, 1963–1984;
Secretary, 1963–1965, 1970–1971, 1980–1981; Chairman, 1967–1968,
1983–1984; Member of the Program Committee, 1975–1976.
Co-chairman of the local committee and Chairman of the Round-Table
Discussion, meetings of the Society for Natural Philosophy at Baltimore,
1963, and Bressanone, 1965. Chairman of a Round-Table Discussion at the
meetings at Chicago, 1966; Cincinnati, 1970; Cincinnati, 1977; Madison,
1984. Co-chairman of the local committee for the meeting at Baltimore,
1965.
Coordinator, C.I.M.E. Course on Nonlinear Continuum Theories,
Bressanone, 1965.
Co-Chairman, First Joint Italian-American Cooperative Science Seminar,
Udine, 1971.
Member of the Scientific Committee, Symposium on Problems of Plasticity,
Polish Academy of Sciences, Warsaw, 1972.
Co-Chairman, Italian–American Cooperative Science Seminar, Udine, 1974.
Organizer of Special Symposium “Conceptual Analysis in Rational
Thermomechanics”, Summer Meeting, American Mathematical Society,
Providence, 1978.
Member of the Steering Committee, International Conference on Nonlinear
Mechanics, Shanghai, 1985.
Honorary Doctorates
Dott.ing.h.c. in Mechanical Engineering (Fluid Mechanics and History of
Science), Centenary of the Politecnico di Milano, 1965.
D.Sc. (Engineering), Tulane University, 1976.
Fil. D. h.c. (Physics), Uppsala University, 1979.
Dr. Phil. h.c. (Sciences), University of Basel, 1979.
Dott. mat. h.c. (Mathematics), University of Ferrara, 1992.
Memberships in National or International Academies of Science, etc.
Socio Onorario dell’Accademia Nazionale di Scienze, Lettere ed Arti,
Modena, from 1960.
Membre Correspondant de l’Académie Internationale d’Histoire des Sciences,
Paris, 1961–1968. Membre Effectif from 1968.
Membro Straniero dell’Istituto Lombardo Accademia di Scienze e Lettere,
from 1968.
Socio Corrispondente Straniero dell’Istituto Veneto di Scienze, Lettere ed
Arti, from 1969.
Accademico Corrispondente Straniero dell’Accademia delle Scienze
dell’Istituto di Bologna, from 1971.
Socio Straniero dell’Accademia Nazionale dei Lincei, Rome, from 1972.
Membre Titulaire de l’Académie Internationale de Philosophie des Sciences,
Bruxelles, from 1974.
Socio Straniero dell’Accademia delle Scienze, Torino, from 1978.
Membro Corrispondente, Academia Brasileira de Ciências, from 1981.
Honorary Foreign Member, Polish Society for Theoretical and Applied
Mechanics, from 1985.
Membrum Ordinarium, Regia societas scientiarum Upsaliensis, from 1987.
Fellow, American Academy of Arts and Sciences, from 1991.
Awards, Prizes
California Institute of Technology, Institute Scholar and LaVerne Noyes
Scholar, 1938–1941; Conger Peace Prize, 1940, 1941.
Fellow of the John Simon Guggenheim Memorial Foundation, 1957.
Euler medal of the USSR Academy of Sciences, 1958, 1983.
Senior Post-Doctoral Fellow, U.S. National Science Foundation, 1960–1961.
Bingham Medal of the Society of Rheology, 1963.
Gold Medal and International Prize “Modesto Panetti” (applied mechanics),
Accademia di Scienze di Torino, 1967.
Birkhoff Prize (applied mathematics), American Mathematical Society and
Society for Industrial and Applied Mathematics, 1978.
Ordine del Cherubino, University of Pisa, 1978.
Visiting Research Scholar, Japan Society for the Promotion of Science, Kyoto,
1980.
Senior U.S. Scientist Award (Humboldtpreis), West Germany, 1985.
Clifford Truesdell (1919–2000), Historian of
Mathematics
ENRICO GIUSTI
Department of Mathematics, University of Florence, Italy
Received 26 August 2003
In many ways, Clifford Truesdell’s research on the history of
mathematics can be summarized by the title of the first article in
the first issue of the Archive for History of Exact Sciences: “A program toward
rediscovering the rational mechanics of the age of reason”. Two themes come together
here and recur throughout Truesdell’s research. The first is reason: in an age when
the term “enlightenment” had also taken on negative meanings, Truesdell never stopped
claiming, in decisive and clear language, the supremacy of reason as the only
guide to human behavior. He saw the end of this “age of reason” in the French
revolution, the source, in his opinion, of all atrocities of modern times, from the
lager to the gulag, from the nuclear threat to universal suffrage. Truesdell sees all
subsequent historical events in an exclusively negative way. Although we cannot
understand or agree with Truesdell on all of this, we see that his theory envisages
reason as the only compass able to guide mankind in its daily choices, and we discover in his mathematical philosophy a model of that rational culture that
sometimes seems increasingly remote from us and obsolete.
The second theme is rational mechanics. All of Truesdell’s historical work
places at its center the onset and development of modern rational mechanics,
a discipline well known to Italian scientists but somewhat foreign
to the Anglo-Saxon tradition. Truesdell considers rational mechanics, first
of all, an out-and-out mathematical discipline, equally far from pure abstract
speculation and from “big science” with its large-scale projects and few ideas.
Not only private, individual experimental researches were performed in the eighteenth century; there were also large, cooperative projects. As today, they cost more
than real science, and they attracted administrators. But the effect of all this expense
on what we now consider the achievement of the period was nil. The method used in
the great researches was entirely mathematical, but the result was not what would now
be called pure mathematics. Experience was the guide; experience, physical experience and the experience of accumulated previous theory. If we were to seek a word
for what was done, it would not be physics and it would not be pure mathematics;
least of all would it be applied mathematics: It would be rational mechanics.⋆
⋆ Essays in the History of Mechanics, Springer-Verlag, Berlin, 1968, p. 136.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 15–22.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
In rational mechanics, Truesdell has studied especially the founders, “the immortal ones” in his personal definition: Euler, Jacob, Johann and Daniel Bernoulli,
Cauchy. This choice is evidently a reflection of his personal orientations and taste;
there are, however, deeper reasons than his personal preferences: these reasons
originate from the way Truesdell conceived the study of the historical aspects of
mathematics.
One of the key points of Truesdell’s perspective on the history of mathematics
concerns competence. Truesdell did not indulge in sociological investigations or
in the description of cultural circles and cultural institutions, except for the cases
where those aspects had a decisive function for the development of the research.
His history was about contents rather than circumstances. According to Truesdell,
the history of science has reason to exist only if it tells about science; otherwise
it will just exert a bad influence both on history and, especially, on science itself.
For this branch of history, one must have detailed, deep and diverse knowledge:
in the first place, knowledge of mathematics, based on an extensive study of the
subject and preferably accompanied by some familiarity with research procedures, and then historical and philological competence. All this knowledge can
only be acquired through long study: the history (or the philosophy) of mathematics cannot be the occupation of retired mathematicians or of men of letters without any
scientific background:
The philosophy of science, I believe, should not be the preserve of senile scientists
and of teachers of philosophy who have themselves never so much as understood the
contents of a textbook of theoretical physics, let alone done a bit of mathematical
research or even enjoyed the confidence of a creating scientist.⋆
Since it is not a matter of describing an environment that looks the same to each
observer, but rather of analyzing and understanding contributions
of the most diverse depth and quality, the specificity and the
vastness of the history of mathematics raise the urgent problem of relevance. As he
cannot describe and reproduce all existing treatises and documents on
mathematics, the historian must make a choice and, at the same time, state the
criteria on which he bases it. Truesdell has only one criterion: relevance.
First of all, after examining the huge body of authors and literature, the historian
must identify who and what made a concrete contribution to the development
of the discipline, and then comprehend and describe the contents. This absolutely requires specific competence; from it comes, furthermore, the need to evaluate,
to express judgements, and to take a position.
In writings on the history of science today, as in all aspects of social intercourse,
it becomes increasingly bad taste to call a spade a spade. In the particular application to the history of science, the compulsion to euphemy assumes the form of
a solemn refusal to admit that there is such a thing as wrong in science. All that
matters is how the scientists of one epoch thought and felt about nature, their own
⋆ An Idiot’s Fugitive Essays on Science, Springer-Verlag, New York, 1984, p. vii.
work, and the work of others. In particular, the beginner is enjoined above all not
to take sides, like the positivist historians of the last century, and set past science
into categories of true and false according as it does or does not agree with what is
taught in school science of our own day. However admirable this philosophy may
be in promoting peace and mutual love among historians of science, it disregards
one aspect of science that is not altogether negligible, namely, that scientists seek the
truth, not a truth. He who refuses to “take sides” in science in effect negates science
itself by denying its one and common purpose. He reduces science to just one more
social manifestation. In so doing, not only does he by implication belittle the great
scientists of the past, but also he sins against history, for in his attempt at historical
impartiality he destroys the object, namely science, the history of which he claims
to write.⋆
From this point of view, the mathematician, and consequently also the historian
of mathematics, derives an undoubted advantage from the fact that mathematics,
unlike other disciplines, applies objective criteria to distinguish truth from
error and relevant aspects from secondary ones:
Now a mathematician has a matchless advantage over general scientists, historians,
politicians, and exponents of other professions: He can be wrong. A fortiori, he can
also be right. There are errors in EUCLID, and, to within a set certainly of measure
zero on the ordinary human scale, what EUCLID proved to be true in ancient Greece is
true even in the colossal, unprecedented, nucleospatial, totally welfared today. In the
advance through the physical, social, historical, and other sciences, the demarcation
between truth and falsehood grows vaguer, until in some areas truth can be rezoned as
falsehood and falsehood enshrined into truth by consensus of “acknowledged experts
and authorities” or even popular vote.⋆⋆
According to Truesdell, to take a position is a duty, perhaps the main duty, of a
historian, who must distinguish relevance from fame and put each author in the
right perspective. In his treatises about a period or an author, Truesdell never
fails to take a clear position; in some cases, it even looks
as if he intentionally chooses topics that allow him to take a position against
established interpretations and evaluations, often accepted and repeated without
criticism. His heroes, as we explained, are Euler, the Bernoullis, Cauchy and, in history,
Pierre Duhem. On the opposite side, he puts D’Alembert, Lagrange and Mach: the
last two are considered responsible, by the way, for the misunderstandings and the
forgeries present in the history of mechanics.
The first volume [of the Encyclopédie] carries as frontispiece a magnificent engraving
of D’ALEMBERT, who, under the guise of authoritative reviews, filled its pages with
shoddy hashes of antiquated science served up with a sauce of his own prejudices,
advertisements for his researches, and attacks on his opponents.‡
⋆ Essays in the History of Mechanics, pp. 145–146.
⋆⋆ Essays in the History of Mechanics, p. 140.
‡ The rational mechanics of flexible or elastic bodies 1638–1788 (Leonhardi Euleri Opera
Omnia II, XI (2)), p. 245n.
LAGRANGE’s talent for algebra was undoubtedly great, but in respect to fundamental
questions of analysis or mechanics his work does not attain the logical and conceptual standards of his great predecessors. Also, the proportion of nontrivial errors in
LAGRANGE’s calculation is high compared with other major mathematicians’. This
body of errors seems to have attracted little notice, so that Lagrange is generally
given credit for having solved several problems on which his work is largely or totally
wrong.
It is time for a reappraisal of the work of the French mathematicians, a reappraisal
constructed, in defiance of the généralités from the obituaries and the descendents of
the obituaries, upon critical study of the work done. I am confident such a reappraisal
would much reduce the importance of D’ALEMBERT and LAGRANGE, would yield
a more realistic view of LAPLACE, MONGE, and FOURIER, would raise CLAIRAUT
and POISSON to their just level, and would reveal CAUCHY as the towering giant of
his age and nation.⋆
While LAGRANGE’s book is a good starting place, experience with it has led me
to the following working hypotheses:
1. There was little new in the Méchanique Analitique; its contents derive from
earlier papers of LAGRANGE himself or from works of EULER and other predecessors.
2. General principles or concepts of mechanics are misunderstood or neglected by
LAGRANGE.
3. LAGRANGE’s histories usually give the right references but misrepresent or slight
the contents.⋆⋆
Not even Leonardo escapes his criticism. Commenting on the attribution to
Leonardo of the laws governing free fall on inclined planes, Truesdell writes:
From many other examples we know that LEONARDO often proposed rules of simple
proportion, and that usually they came out wrong. Here is one that is right. Did
LEONARDO know it? Was not this, for him, just one of the hundred linear rules he
proposed and never got around to trying out, the only difference being that this one
concerns a problem later seen to have central importance, and this answer just turned
out to be right?
I fear so. We remember that in regard to free fall, LEONARDO proposed several
linear laws, mutually contradictory, and in the single one that turned out to be correct,
he may simply have forgotten to repeat his other, incorrect, statements.
The facts before us are simple:
1. In physics, some relations are linear and some are not.
2. LEONARDO never proposed any relation other than a linear one.
3. LEONARDO did propose dozens of linear relations.
4. From 1 and 3, some of LEONARDO’s rules may be expected to come out right.
Counting only the cases when he was right and disregarding those when he was wrong
may give a somewhat distorted estimate of his capacity as a natural philosopher.‡
⋆ The rational mechanics of flexible or elastic bodies, p. 412n.
⋆⋆ Essays in the History of Mechanics, pp. 246–247.
‡ Essays in the History of Mechanics, p. 36.
Once again, he does not fear the wrath of defenders of the “politically correct”
principle, when he writes an article bearing the title “Sophie Germain: fame earned
by stubborn error”, in which he makes statements such as:
The above suffices to show that Sophie Germain was ignorant not only of elementary
mechanics but also of the calculus of double integrals.⋆
Of course, Truesdell does not blame the authors; they did all that they possibly
could. Instead, he blames the historians, who are not able to distinguish between
sumptuously dressed banalities and conclusive results, who keep repeating already
repeated statements without thinking them through and who, by doing so, contribute to a historical treacle in which nobody is any longer able to
distinguish a valid theorem from a futile exercise.
While a physicist writing on the history of physics usually tells us what he thinks the
old scientists must have thought, and a historian tells us whom they knew and what
books they read, Professor JAMMER tells us mainly what they said they did . . . .
A history apparently was not intended, since, despite critical remarks here and there,
Professor JAMMER seems to be content with quiet juxtaposition of conflicting opinions. We read what was thought about mass not only by NEWTON and HERTZ but
also by ALOIS HÖFLER and CLÉMENTICH DE ENGELMEYER. Now if Professor
JAMMER had found that HÖFLER and DE ENGELMEYER, although forgotten today,
in fact did something important in mechanics, everyone should congratulate him on
his success as a historian, but when he merely tells us what they thought – HÖFLER
is quoted as saying, “The tonomonic quantity ‘dyne’ precedes logically the notion of
‘one gram mass’ ”, and DE ENGELMEYER “our daily experience prepares us much
better for the comprehension of the notion of force than of mass . . .” – then we may
well ask, who cares? Indeed, if it had been Huygens and Euler who had made the
statement just quoted, a historian would do neither their memories nor his readers
any service by perpetuating these flat vacuities.⋆⋆
Once the history of our culture was our common heritage, our pride and our lesson
book for conduct both private and public. . . . For me, as long as I have tried to do
research, the one and only school of method has been study, study and study of the
masters.
Today these simple truths are as obsolete currency as gold coin. . . . The beginner
in history of science must be taught first of all what will make him, if he completes apprenticeship, different from and independent of historians and scientists
alike. Mathematics cannot be defined now except as that which mathematicians do;
for physics, we substitute the word “physicists”, and soon the history of science will
be defined as that which historians of science do and will likewise live a Parkinsonian
life, independent equally of science and of history. Just as books on political history
are written now to be read by political historians alone, and works on mathematics
to be read by none but professional mathematicians, soon we can expect that books
on the history of science will be meaningless except to historians-of-science, dumb
⋆ Bollettino di Storia delle Scienze Matematiche, 11-2 (1991), p. 12.
⋆⋆ An Idiot’s Fugitive Essays on Science, pp. 170–171.
to scientists and to historians, serving only to produce more and more historians-of-science who are paid, if they can get jobs, to do nothing but indoctrinate more
historians-of-science.⋆
The history of science is different in kind from science itself and from ordinary history. The material of the history of science is compact: Being history, it necessarily
concerns the past, and because in the past science was a tiny and select vocation, not
the factory job it is today, there is little to be read. What little there is, includes the
highest intellectual achievement of our culture as well as a part of its finest artistic
creation. . . . The example set today by the professions of scientists and historians
is the worst that historians of science could choose to follow. Indeed, the history
of science needs to be cleared and established. Thereafter, it ought to be learned.
Although only a handful of persons could ever acquire the eccentric conjunction of
skills and knowledge necessary in him who would do sound research in the history of
science, there are many who can and should learn the results of that research. History
of science should be studied and learned by every scientist, every historian, every
person who seeks any intellectual footing in the Western culture. The great need of
history of science today is for teachers.⋆⋆
These last two citations bring us to another topic on which Truesdell repeatedly
insisted, in particular in his Lectio doctoralis, delivered on the occasion of the conferral of the “laurea honoris causa” at the University of Ferrara: the importance of
the history of mathematics and, more generally, of the history of science, not only
for the culture of the average citizen and, in particular, for the culture of scientists,
but also for the interaction between historical and scientific culture. Without the
history of science, a scientist’s culture is an end in itself; without at least a basic
knowledge of science, the history of mathematics is impossible.
All attempts to write a history of mathematics to be read by non-mathematicians
turned out as disastrous failures: horrendous and stupid myths, the caricature or even
the degradation of mathematics. Usually, the mathematician is represented as a sort
of quack, rather than the way a great mathematician really is: a thinker, an organiser
of ideas, a creator of new concepts, the bearer of the truth, the indefatigable worker
in the fields of knowledge and beauty. In such works of vulgarization, mathematics
itself is described as an arcane science, a spell, a crazy revelation.
My discussion here refers not only to mathematics in its modern and specific sense,
but also to mathematical science. The history of mathematical science, although
probably not very useful to individuals who are ignorant of any form of mathematics,
should be understandable to a much vaster public nowadays, more than any common
research paper. It should be interesting, perhaps even useful, to anybody with some
knowledge of mathematics, not necessarily very deep and detailed, and to anybody
with some awareness of the value of mathematical science. A treatise on the history
of mathematical science should differ profoundly from a modern research essay.‡
⋆ An Idiot’s Fugitive Essays on Science, pp. 585–586.
⋆⋆ An Idiot’s Fugitive Essays on Science, p. 589.
‡ Translated from Honoris Causa. Lezioni dottorali di insigniti di laurea ad honorem in occasione
del VI centenario dell’Ateneo. Academic Year 1991/1992. University of Ferrara, 1995, p. 49.
One of the main functions it should fulfill is to help scientists understand some aspects
of specific areas of mathematics about which they still don’t fully know. What’s more
important, it helps them too. By satisfying their natural curiosity, typically present in
everybody towards his or her own forefathers, it helps them indeed to get acquainted
with their ancestors in spirit. As a consequence, they become able to put their efforts
into perspective and, in the end, also able to give those efforts a more complete
meaning. By seeing beyond one or two leaves of the great tree of mathematics, they
can comprehend the whole structure, discern the subdivision of its branches and trace
back its roots.
When a scientist gets the chance to measure his own efforts by comparing them with
the results of the immortals, he will feel perhaps less compelled to vie for supremacy
over the pygmies of his level and will try, instead, to achieve one or two small goals,
which will have some chance of surviving the test of time. This is a moral advantage.⋆
Mathematical science offers examples of what human reason can achieve, of the
safety that comes together with reason; by reaching a rational clearness, they put also
in evidence the limits of reason and the unstable foundations on which all sciences
are based. The methods that turned out to be successful and those that have been
surpassed and failed, the paths that guided to the final destination and the blind ones,
can be learned exclusively through the history of mathematical sciences. This is an
old-style history, the history of men and of their actions; with the words of Savile:
No other study is as suitable as history to guide the life of mankind.
This is a social advantage.⋆⋆
Still, relevance and fame are two different concepts. Truesdell does not choose
his heroes simply among the recognized mathematicians. The scientists who are
brought by Truesdell to the level of heroes, and take, in this way, a curtain call
on the scene of Truesdell’s investigations, are sometimes practically unknown, not
only to the general public, but also to historians of science.
Particularly enlightening examples are John Herapath and John James Waterston. They were the first to make substantial contributions to the kinetic
theory of gases after the publication of Daniel Bernoulli’s Hydrodynamica; both
sent their work to the Royal Society, which in both cases rejected it,
giving rise to a series of disputes. Herapath, though he used the momentum instead
of the kinetic energy of particles for his definition of temperature, had shown that
the kinetic theory was suitable as a first explanation for phase transitions, diffusion, and the propagation of sound. In a series of letters, in which Herapath was
exhorted to abandon these speculative aspects and devote himself to experiments,
the president of the Royal Society, Sir Humphry Davy, wrote, “having considered
a good deal the subject of the supposed real zero, I have never been satisfied with
any conclusions respecting it. I cannot see any necessary connexion between the
⋆ Translated from Honoris Causa, pp. 49–50.
⋆⋆ Translated from Honoris Causa, p. 50.
capacity of bodies for heat, and the absolute quantity they contain; and temperature
does not measure a quantity, but merely a property of heat.”
As far as Waterston is concerned, long before the work of Joule, Thomson and
Krönig, he had sent in 1845 a note on kinetic theory to the Royal Society. This
document was rejected by the Society as well. On that occasion, one of the two
referees objected that the principle that pressure depended on molecular collisions
with the walls, a central hypothesis of Waterston’s document, was “by no means
a satisfactory basis for a mathematical theory”. The second referee just wrote that
“the paper is nothing but nonsense, unfit even for reading before the Society”. The
work in question was rediscovered by Lord Rayleigh in 1891 and published in
1893, ten years after Waterston’s death.
In the contrast between the genius and the Academy, Truesdell sides decidedly with the former and draws two important conclusions from the events described above.
In the first place, he praises an anarchical approach to academic research
and organization. Truesdell himself depicts the academic world in his incomparable style, partly serious and partly ironic, in terms
that, apart from some details, could very well fit the Italian situation:
Our academic life presents to the foreigner a lamentable scene of chaos. No-one
knows who is on top. If in University 1 Professor A is a demigod, we have only to
consult Professor B in University 2 to learn that in his department A would not qualify
even as an assistant. True, A belongs to six national committees, has a million dollar
grant from the Central Spy Bureau, and has published eight successful textbooks,
but B, who points to A’s textbooks as models of nonsense, has written 216 research
papers with twenty-three co-authors and also is consultant for four major corporations, assistant editor of five journals and second vice-president of a professional
trade-union.⋆
The second conclusion drawn by Truesdell from the events in which Herapath
and Waterston had been involved is more important. He thought that a system making use of anonymous referees favored a reactionary attitude and made it difficult
for new ideas to emerge. In the journals he founded, the two Archives, this system
was replaced by one in which the name of the communicator of a paper was published
immediately beneath the name of the work’s author. He thought that this change
would successfully replace anonymous irresponsibility with clear individual
responsibility. Unfortunately, I fear that he was wrong: in the framework
of modern organization, the problem is no longer the protection of a genius from
the closed and reactionary attitude of the establishment, but rather the protection of
journals from the huge amount of irrelevant material that threatens to flood them.
From this point of view, a comparison between different opinions is preferable to
leaving it all to the whim of just one communicator. After all, no system is perfect.
⋆ An Idiot’s Fugitive Essays on Science, p. 399.
The Genesis of Truesdell’s Nonlinear Field
Theories of Mechanics ⋆
WALTER NOLL
308 Field Club Ridge Road, Pittsburgh, PA 15238, U.S.A. E-mail: opanoll@yahoo.com
Received 14 March 2003
Clifford Truesdell was a singularity among all prominent scientist-scholars of the
twentieth century. He believed that the pinnacle of civilization had been reached
in the 18th century and that things have gone downhill ever since. He had no
television set and no radio, and I doubt that he ever used a typewriter, let alone a
computer. Many of the letters he sent me were written with a quill pen. (However,
he did not reject such modern conveniences as flush toilets and air-conditioning.)
He loved baroque music and did not care very much for what was composed later
on. He owned very fine harpsichords and often invited masters of the instrument
to play, often before a large audience, in his large home in Baltimore, which he
called the “Palazzetto”. He collected art and antique furniture, mostly from the 18th
century. He even often dressed in the manner of an 18th century gentleman. Most
importantly, he admired the scientists of the 18th century and, above all, Leonhard
Euler, whom he considered to be the greatest mathematician of all time. Actually,
in the 18th century, mathematics and physics were not the separate specialities
they are today, and the term “Natural Philosophy” was the term then used for the
endeavor to understand nature by using mathematics as a conceptual tool. C.T. tried
and succeeded to some extent in reviving this term.
Clifford Truesdell was extremely prolific and he worked very hard most of the
time. When he didn’t, he liked to eat well and drink good wine. This is perhaps
one reason why he loved Italy so much, and spent extended periods of time there.
Clifford Truesdell was not only an eminent scientist but also a superb scholar.
He mastered Latin perfectly. He could not only read and understand the classical
literature, which was written in Latin, but even wrote at least one paper in Latin.
He was fluent not only in his native language, English, but also in French, German,
and, above all, Italian. He wrote papers and gave lectures in all of these languages.
All of these attributes are particularly rare for somebody born in Los Angeles,
California.
⋆ This paper is the text, with minor revisions, of a lecture given by the author at the Meeting in
memory of Clifford Truesdell, held in Pisa, Italy, in November 2000.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 23–30.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Clifford Truesdell was only in his early 30s in the early 1950s when he had
already established himself as perhaps the world’s best informed person in the field
of continuum mechanics, which included not only the classical theories of fluid
mechanics and elasticity but also newer attempts to mathematically describe the
mechanical behavior of materials for which the classical theories were inadequate.
In 1952, he published a 175-page account entitled “The Mechanical Foundations
of Elasticity and Fluid Dynamics” in the first volume of the Journal of Rational
Mechanics and Analysis.
Springer-Verlag, with headquarters in Heidelberg, Germany, has been an important publisher of scientific literature since the beginning of the 20th century. In
the 1920s it published the Handbuch der Physik, a multi-volume encyclopedia of
all the then known knowledge in Physics, mostly written in German. By the 1950s
the Handbuch had become hopelessly obsolete, and Springer-Verlag decided to
publish a new version, written mostly in English and called “The Encyclopedia
of Physics”. Here is Clifford Truesdell’s own account, in a 1988 editorial in the
Archive for Rational Mechanics and Analysis, of how he became involved in this
enterprise:
In 1952 Siegfried Flügge and his wife Charlotte were organizing and carrying through
the press of Springer-Verlag the vast, new Encyclopedia of Physics (Handbuch der
Physik), an undertaking that was to last for decades. Joseph Meixner in that year
had called Flügge’s attention to my article The mechanical foundations of elasticity
and fluid dynamics . . . On October 1, 1952, Flügge invited me to compose for the
Encyclopedia a more extensive article along the same lines but presenting also the
elementary aspects of continuum mechanics, which my paper . . . had presumed to be
already known by the reader. In the second week of June 1953, Flügge spent some
days with us in Bloomington, mainly discussing problems of getting suitable articles
on fluid dynamics for the Encyclopedia. . . . He asked me to advise him, and by the
end of his visit I had agreed to become co-editor of the two volumes under discussion.
It was C.T., in his capacity as co-editor, who invited James Serrin to write the
basic article on Fluid Mechanics. C.T. himself decided to make two contributions
dealing with the foundations of continuum mechanics, the first to be called “The
Classical Field Theories of Mechanics” and the second “The Nonlinear Field Theories of Mechanics”. Here is what he wrote about this decision, in the preface to a
1962 reprint of his Mechanical Foundations:
By September, 1953, I had agreed to write, in collaboration with others, a new exposition for Flügge’s Encyclopedia of Physics. It was to include everything in The
Mechanical Foundations, supplemented by fuller development of the field equations
and their properties in general; emphasis was to be put on principles of invariance and
their representations; but the plan was much the same. As the work went into 1954,
the researches of Rivlin and Ericksen, and of Noll made the underlying division into
fluid and elastic phenomena, indicated by the very title The Mechanical Foundations
of Elasticity and Fluid Dynamics no longer a natural one. It was decided to split the
projected article in two parts, one on the general principles, and one on constitutive
equations. The former part, with the collaboration of Mr. Toupin and Mr. Ericksen,
was completed and published in 1960 . . . . In over 600 pages it gives the full material
sketched in Chapters II and III of The Mechanical Foundations. The second article, to
be called “The Nonlinear Field Theories of Mechanics”, Mr. Noll and I are presently
engaged in writing. The rush of fine new work has twice caused us to change our basic
plan and rewrite almost everything. The volume of grand and enlightening discoveries
since 1955 has made us set aside the attempt at complete exposition of older things.
It seems better now to let The Mechanical Foundations stand as final for most of the
material included in it, and to regard the new article not only as beginning from a
deeper and sounder basis than could have been reached in 1949 but also as leaving
behind it certain types of investigation that no longer seem fruitful, no matter what
help they may have provided when new.
In 1951 I started employment as a “Wissenschaftlicher Assistent” (scientific assistant) at an Institute of Engineering Mechanics (Lehrstuhl für Technische Mechanik) at the Technical University of Berlin. In late 1952 I came across a leaflet from
Indiana University offering “research assistantships” at the Graduate Institute of
Mathematics and Mechanics. I applied and was accepted to start there in the Fall
of 1953. After arriving, I found out that I had become a “graduate student”. At
first I did not know what that meant, because German universities have nothing
corresponding to it. But the situation was better than I expected, because no work
was assigned to me, and I could pursue work towards a Ph.D. degree full time.
I asked Clifford Truesdell to be my thesis advisor, because he was the local
expert on mechanics and I thought that would help me after my return to Germany.
At the time I did not know that C.T. had already become the global expert on
mechanics. I also did not know at the time that I was C.T.’s first doctoral student.
I was told later that other graduate students did not select him as a thesis advisor
because they thought that he was too tough. It is true that he was tough and asked
much of me, but he was also extremely kind and helpful, and, in the end, he had
more influence on my professional career than any other person. I learned from
him not only the basic principles and the important open questions of mechanics,
but also how to write clear English. In addition, his wife Charlotte and he did very
much to make me feel comfortable in Bloomington. They invited me to their home
very frequently to share the gourmet dinners prepared by Charlotte. As a thesis
topic, C.T. gave me a paper written by S. Zaremba in 1903 and asked me to make
sense of it. I found that, in order to do so, a general principle was needed, which
I called “The Principle of Isotropy of Space”. This principle also served to clarify
many assertions in mechanics that had been obscure before (at least to me). C.T.
had a wonderful sense of humor. One day he put up his son, then about 10 years
old, to ask me: “Mr. Noll, please explain the Principle of Isotropy of Space to me.”
Actually, I think I have found a good explanation only recently, about 40 years later.
I had a very good background in the concepts of coordinate-free linear algebra and
used it in my thesis. At the time, C.T. was uncomfortable with this approach and
forced me to add a coordinate version to all formulas. However, he did not have a
closed mind and later, in a letter to me dated August 4, 1958, he wrote:
I must also admit that the direct notations you use are better suited to fundamental
questions than are indicial notations. Your present mathematical style is smoother
and simpler than that in your thesis.
After I received my Ph.D. in September 1954 and returned to Germany, C.T. and
I stayed in contact by mail, and I saw him again when he gave a lecture in Berlin in
June 1955 (“Das ungelöste Hauptproblem der endlichen Elastizitätstheorie”). The
term of my position in Berlin ran out in the Fall of 1955 and I needed a new job.
On C.T.’s recommendation I was offered an Associate Professorship at Carnegie
Institute of Technology (now Carnegie Mellon University), where I moved in the
Fall of 1956 after having spent a year at the University of Southern California. Here
is an excerpt from a letter he wrote to me on February 14, 1956, after I accepted
the position at Carnegie Tech:
I think you deserve your position at Carnegie and hope you will like it. If I might
offer a word of advice, it would be to maintain the level set by your two papers
in the Journal. Nowadays so many people are publishing like rabbits that volumes
of papers make little impression. Many young people publish too much when they
find out how easy it is to do something others have not done, especially something a
senior colleague is too dull to see. It is of course necessary to avoid setting such high
standards that one’s publication ceases entirely, as is the case with many well known
savants.
Later, in December of the same year, he wrote me the following letter:
Dear Noll,
Ericksen and I have long promised to write a second article, The nonlinear field
theories of mechanics, to cover exact work on elastic, fluid, and plastic phenomena.
Ericksen is far behind on the first article, and therefore we have not even outlined this
second one. Although I have not yet consulted Ericksen, I feel sure that he would be
relieved if you would consent to join us and do a major part of the second article.
The order of authors would be according to the amount of work provided, and if you
could do most of the work, you would be the senior author.
You know both the old Mechanical Foundations and the new Classical Field
Theories thoroughly. Therefore you have as much background as anyone else, and
you have made important contributions yourself. I have not been able to think of a
good organization. All I can say is that I should prefer to cut down on the amount
of space given to special proposals prior to 1948 and to emphasize general work such
as you and Rivlin have done. Whether it is better to treat very general visco-elastic
theories first or to give a definitive presentation of classical finite elasticity first is not
clear to me. The article should be exhaustive, but for the work prior to 1948 I think
we need only condense the Mechanical Foundations or change the emphasis, as I
believe the list of sources cited there is virtually complete. The presentation should
be concise, but the length is not critical.
If this proposal is acceptable to you, I will put it up to Ericksen at once. Also,
if you could start right away on the organization and suggest a distribution of material
among the three authors it would help. I believe I can begin active work within two
months, but I do not expect to be able to prosecute exclusively this one project in the
near future.
Here is an excerpt from my answer, written on January 2, 1957:
I shall be glad to accept your proposal and join you and Ericksen as co-author of
“the nonlinear field theories of mechanics”. It will be a challenging task for me, but I
shall do the best I can. I have thought a little bit about the organization. Here is very
roughly what I would like to propose:
The first section should contain general remarks concerning constitutive equations, their classification, invariance requirements, constraints, definitions of terms
like “material”, “isotropic”, “aeolotropic”, etc. I have been thinking about this subject
for quite a while, and I plan to include my ideas on it in a paper on the axiomatic
foundation of mechanics, which I plan to write in the near future.
I think it is best to give then an account of finite elasticity and nonlinear fluid
theory (Reiner–Rivlin). These are the simplest and best known nonlinear theories.
Next, more general stress-strain relations (as those proposed by Rivlin and Ericksen) could be treated. Finally, constitutive equations involving stress rates (as those
proposed by Oldroyd, Cotter and Rivlin, and in my thesis) and their special cases
(hypo-elasticity) could be worked out, and the connection of these general theories
with the simpler ones could be established.
I would be willing to do the general remarks and also most of the treatment of
the various constitutive equations in general. However, I would appreciate some help
with the special cases and solutions. I would think, for instance, that Ericksen would
have little trouble to present all special solutions valid for an arbitrary strain energy
function in finite elasticity by taking his two papers (ZAMP 5: 466–489 and J. Math.
Phys. 2: 126–128) as a basis. Also, it is probably easier for one of you to deal with
finite deformations of shells, anisotropic elastic bodies, and similar material. I would
think, too, that you would want to do the section on hypo-elasticity yourself.
I am open to alternative and more detailed proposals concerning the organization as well as the distribution of the material to the authors, and I am looking forward
to have your opinion.
The new task of working on the Nonlinear Field Theories of Mechanics (NLFT)
changed my professional life forever, because it focused my attention on the foundations of continuum mechanics. C.T. was very good at reading, digesting, and
summarizing existing literature. I am not. I cannot help rethinking and recreating
whatever subject I wish to understand. This had perhaps the effect of improving
the NLFT in the end, but it also slowed down the progress towards completion.
Sometimes C.T. became very impatient with me, and with good reason. The low
point was reached when C.T. wrote the following letter to me:
Dear Walter,
Upon my return from the West, I did not find any manuscript of our article.
As I said in Pittsburgh, I feel now that you should withdraw from the article. This
is the third period of half a year or more in which you have not sent me a line, and
again the subject is slipping away from us. My having had to disturb you every few
weeks or days for the past six months so as to get your assurance that a large section
of manuscript would be ready in a few days has been very painful, the more so since
fruitless. I have put in this treatise the better part of all my work in the past five years,
and as I prepare for another trip to Europe, I cannot drag this weight of responsibility
any longer.
I plan to finish the article by intensive work in January and February, and I
request you to send me all material I gave you before I went to Europe last spring.
Surely, you know that I feel this loss deeply. The parts written by you are
far better than anything I could write, and your criticism and correction of the parts
I wrote have been of the highest value. There is no one else who could do the job
you are capable of doing, but you have not done it, and some sort of article must be
published nevertheless. There is too much invested in it already.
Well, I pulled myself together and I helped to finish NLFT.
During our collaboration we were in constant contact by mail or in person.
I visited Bloomington in November 1958. C.T. came to Pittsburgh in June 1959 and
gave lectures at the Mellon Institute (later a part of Carnegie Mellon University). In
August 1961, C.T. left Indiana University and he and Charlotte moved to Baltimore
where he joined The Johns Hopkins University with the title Professor of Rational
Mechanics. In September 1961, C.T. visited Pittsburgh again. During the academic
year 1962/63, C.T. arranged for me to come to Johns Hopkins University as a
visiting professor.
Occasionally, C.T. and I had disagreements on terminology. Perhaps the most
interesting concerned what we finally called “The Principle of Material Frame-Indifference”. Here are some excerpts from letters that we exchanged on this subject:
C.T. to W.N., August 1, 1958:
I was just planning to write something mentioning your old “isotropy of space”. It
seems to me the term “principle of objectivity” shares one of the undesirable features
of the old term, namely, it is too vague. I was going to use “the principle of material
indifference”. However, it would be bad for there to be three names running around
for the same principle. If you find my suggestion appealing, let me know right away;
otherwise, I will use your present name.
W.N. to C.T., August 5, 1958:
I have switched from “isotropy of space” to “objectivity” because the former suggests
only that there are no preferred directions in static space, which is much less than is
implied by the principle of objectivity.
If “objectivity” means independence of the observer, as I believe it does, then
it seems to me that it is the correct term. One could remove some of the vagueness
you mentioned by using “principle of objectivity of material properties” (or perhaps,
“principle of material objectivity”). If you think that this is advisable, please insert
“of material properties” after “principle of objectivity” in the table of contents (section 11), in line 2 of page 3, in the title of section 11 on page 20, and in line 15 from
the bottom of page 21. I think that no additional changes are necessary because the
other references to the “principle of objectivity” cannot lead to confusion.
C.T. to W.N., August 9, 1958 (handwritten):
Dear Noll,
I have to confess that my dislike of “objective” is subjective. Psychologists,
sociologists, and all sorts of other charlatans and fakers use “objective” when the
only sense I can find for what they claim is “nonsensical” or “thoughtless”. Thus I
exclude, banish and ostracize “objective” as the opposite of “subjective” in my own
usage, but of course there is no reason why this should influence you.
Cordial regards, CT
C.T.’s objection did influence me and I am now glad it did. I suggested that we
change indifference to frame-indifference. C.T. accepted that and I believe now that
the term we finally used expresses the idea better than any of the ones used before.
The entire task of writing the NLFT turned out more extensive than was expected, and C.T. called it the “monsterino”. (He called the earlier Classical Field
Theories the “monster”.) The monsterino was finally published in 1965. There were
many people who helped us with the work by reading parts of the manuscript and
offering corrections, critique, and suggestions. They include B.D. Coleman, J. Ericksen, M.E. Gurtin, R. Toupin, K. Zoller, C.-C. Wang, D.C. Leigh, C.C. Hsiao,
W. Jaunzemis, and A.J.A. Morgan. On June 25, 1961 C.T. wrote to me:
I now have an excellent secretary, so that the preparation of the manuscript should be
easier.
That helped, too.
Clifford Truesdell was very meticulous about attributions. I believe he bent over
backwards in my favor when he described, in a footnote to the Introduction to
NLFT, how the work was divided between us:
Acknowledgment. This treatise, while it covers the entire domain indicated by its
title, emphasizes the reorganization of classical mechanics by Noll and his associates.
He laid down the outline followed here and wrote the first drafts of most sections in
Chapters B, C, and E and of a few in Chapter D. Among the places where he has
given new results not published elsewhere, shorter proofs, or major simplifications
of older ideas may be mentioned (21 items). The larger part of the text was written
by Truesdell, who also took the major share in searching the literature. While Noll
revised many of the sections drafted by Truesdell, it is the latter who prepared the
final text and must take responsibility for such oversights, crudities, and errors as
may remain.
The Nonlinear Field Theories was reprinted in 1992. Here is an excerpt from
the preface to the second edition:
This volume is a second, corrected edition of The Nonlinear Field Theories of Mechanics, which first appeared as Volume III/3 of the Encyclopaedia of Physics, 1965.
Its principal aims were to replace the conceptual, terminological, and notational chaos
that existed in the literature of the field by at least a modicum of order and coherence,
and second, to describe, or at least to summarize, everything that was both known
and worth knowing in the field at the time. Inspecting the literature that has appeared
since then, we conclude that the first aim was achieved to some degree. Many of the
concepts, terms, and notations we introduced have become more or less standard,
and thus communication among researchers in the field has been eased. On the other
hand, some ill-chosen terms are still current. Examples are the use of “configuration”
and “deformation” for what we should have called, and now call, “placement” and
“transplacement”, respectively. (To classify translations and rotations as deformations
clashes too severely with the dictionary meaning of the latter.) We believe that the
second aim was largely achieved also. We have found little published before 1965
that should have been included in the treatise but was not. However, a large amount
of relevant literature has appeared since 1965, some of it important. As a result, were
the treatise to be written today, it should be very different. On p. 12 of the Introduction
we stated “. . . we have subordinated detail to importance and, above all, clarity and
finality”. We believe now that finality is much more elusive than it seemed at the
time. The General theory of material behavior presented in Chapter C, although still
useful, can no longer be regarded as the final word. The Principle of Determinism
for the Stress stated on p. 56 has only limited scope. It should be replaced by a more
inclusive principle, using the concept of state rather than a history of infinite duration,
as a basic ingredient. In fact, forcing the theory of materials of the rate type into the
general framework of the treatise as is done on p. 95 must now be regarded as artificial
at best, and unworkable in general. This difficulty was alluded to in footnote 1 on
p. 98 and in the discussion of B. Bernstein’s concept of a material on p. 405. This
major conceptual issue was first resolved in 1972 [“A New Mathematical Theory of
Simple Materials” by W. Noll, Archive for Rational Mechanics and Analysis, Vol. 48,
pp. 1–50], and then only for simple materials. The new concept of material makes it
possible, also, to include theories of plasticity in the general framework, and one can
now do much more than “refer the reader to the standard treatises”, as we suggested
on p. 11 of the Introduction.
About three years ago, Springer-Verlag sold the right to translate NLFT into
Chinese. I recently received copies of the translation.¹
¹ After this paper was written, I was informed by Springer-Verlag that a second reprinting will
appear in 2003.
An Appreciation of Clifford Truesdell
JAMES SERRIN
School of Mathematics, University of Minnesota, Minneapolis, MN 55455, U.S.A.
Received 9 June 2003
The following comments are a slightly emended version of a talk delivered at a
meeting of the Society for Natural Philosophy in December 1985, together with
additional remarks appended in May 2003.
This meeting of the Society for Natural Philosophy has been arranged as a special tribute to its principal founding member, Clifford Truesdell, in recognition of
his many contributions to the Society since its beginning at The Johns Hopkins
University in March 1963. In this group picture of the participants of the first
meeting (see Figure 1) you will see Clifford Truesdell second from the left in the
front row, and no doubt you will also recognize a number of other faces, all 22
years younger and perhaps 22 years wiser, than today.
In the intervening years the Society has sponsored 27 additional meetings, holding fast throughout its existence to the original form, to small size, no publications,
and the encouragement of quality of research – primarily in rational continuum
mechanics and its foundations, though with occasional forays into related mathematical disciplines. And outstanding menus, as well.
Consistent maintenance of the original principles of the Society is an achievement due more to Clifford Truesdell than to any other single individual. It is impossible in a few short minutes to do justice to his remarkable and myriad accomplishments, in papers, monographs, memoirs, and books ranging from mathematical
sciences, to rational mechanics, natural philosophy, and the history of science.
Already in the forties there was clear evidence of this future in several unusual
publications, perhaps not today known to everyone here.
As a student at Indiana University I was privileged early to see the power of his
thought in exceptional lectures which he offered in elasticity and kinetic theory –
perfect in content and stately in development – thus, the appearance in 1952 of the
seminal paper “The Mechanical Foundations of Elasticity and Fluid Dynamics” came
to me not so much as a surprise but rather as a remarkable opportunity to learn
more from a great master. Several years before, he had initiated research on the
kinematics of vorticity, together with the interrelations of vorticity to thermodynamics. This culminated in his next work, the elegant monograph “The Kinematics
of Vorticity”, published by the Indiana University Press but now out of print,
and, following without pause, his great paper on the absorption and dispersion
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 31–38.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Figure 1. First Meeting of the Society for Natural Philosophy, March 25–26, 1963. The front row left-to-right: L.M. Milne-Thomson, C. Truesdell,
B. Coleman, J. Serrin, J.L. Ericksen, H. Markovitz, E. Sternberg, W. Noll, H. Grad, R. Toupin, R. Rivlin. Back row second from the right: Charlotte
Truesdell.
of sound according to the viscosity formulae of Navier and Stokes. Altogether
one of the most audacious first appearances of a new star in the firmament of
mechanics!
While time makes it impossible to dwell upon still another string to his bow,
I must add that it was exactly following these works that he began the three celebrated prefaces for the Opera Omnia of Leonhard Euler – one each on fluid
mechanics, acoustics, and elasticity. These established his fame as the leading
modern interpreter of Eighteenth Century Mechanics, and of course brought to
our attention, as nothing else, the contributions of Euler to our science.
In 1960 Truesdell and Richard Toupin discovered the important entropy relations now known as the Clausius–Duhem inequality, specifically including in it
the crucial external heat supply term. This development made possible the later
theory of Coleman and Noll, providing thereby, for the first time, a rational and
rigorous approach to the foundational aspect of constitutive restrictions. Moreover,
as appeared in retrospect many years following, the radiation supply term provided
a vital link between classical thermodynamics and modern theory.
Truesdell’s work on the Clausius–Duhem inequality was severely criticized by
those whose understanding of thermodynamics had been formed without benefit
from study of the masters of the subject. The criticisms could not be fully answered, ironically, because the foundations of thermodynamics were themselves
open to question. In any case, this neglected and nettle-choked thicket of physics
became one of the principal areas of Truesdell’s later work: of course his detractors
never dreamed that anyone would set foot in those Augean stables, nor of course
did they have the courage to do so themselves. Nevertheless, the difficulties of
thermodynamics were enough to require many further years of thought before the
ideas were ripe for presentation, and other projects loomed immediately.
Two monumental volumes for the Handbuch der Physik appeared in the early
sixties, “The Classical Field Theories” with Richard Toupin, and “Non-linear Field
Theories of Mechanics” with Walter Noll. It is almost impossible to know how to
approach these volumes. Their weight of scholarship and perception, their very
completeness, daunt the mere mortal who opens them. Even so, the rich veins of
ore were, individually and internationally, mined by the many. Here one may also
recall the impress which Truesdell’s views had on the seven companion volumes of
the Handbuch der Physik; for the most part he selected not only the general subject
matter, but also the corresponding authors. Even now, thirty years later, many of
these volumes are as fresh as ever – required reading for the acolyte.
After 1964 thermodynamics and kinetic theory began to occupy Clifford more
heavily than ever before, as if only the most monumental problems were now sufficient of challenge. In 1957 he had published a first paper on chemical mixtures,
followed by a second paper in 1968 which inspired much further work on this most
difficult of subjects. Then came in 1969 the polemical masterpiece “Rational Thermodynamics”, summarizing his thoughts to that time and forming a springboard
(I almost wrote “launching pad” but rejected this) for future work.
Figure 2. Journal of Rational Mechanics and Analysis: Volume 1, Number 1.
If I may digress a moment, let me recall that this scientific activity was paralleled by continuing effort to re-establish both Rational Mechanics and the methods
of Natural Philosophy as integral parts of physical science. Beyond the formation of
the Society for Natural Philosophy, there is the founding of the Journal of Rational
Mechanics and Analysis in 1952 (see Figure 2), its reappearance as the Archive
for Rational Mechanics and Analysis in 1957, and the inauguration of the Archive
for History of Exact Sciences in 1960, to name his major and most longstanding
contributions in this direction. Galileo had said “Nature first made things in her
own way, and then made human reason skillful enough to understand – but only
by hard work – some parts of her secrets.” Clifford saw well that a combination
of mathematical analysis and historical sense was the key to this understanding,
and he emphasized always, as Dedekind put it, “What is provable should not be
believed in science without proof.” In these endeavors his practice emulated his
preachment to the Society at an earlier meeting. “The aim of rational mechanics is
to provide a sound conceptual framework for description of nature as human senses
perceive it and to create patterns of systematic inquiry and inference such as to order and interrelate the phenomena thereby conceived.” Deeper understanding was
to come through the successive works of generations, together with our experience
of materials themselves.
I would like to bring you the message that Clifford’s evangelism succeeded
beyond his expectations, but must sadly report that the world is still peopled by
pagans (and worse) – a theme which Clifford of course has voiced frequently in
his scientific essays. In any case I herewith dedicate for his perusal two sentences
culled at random from today’s scientific journalese – the first “If space has nine
dimensions and matter is strings then the mysteries of the universe may soon come
clear” (so help me God), and the second, “Among enhancement techniques the
National Research Council will investigate are sleep learning, group cohesion technologies, accelerated intelligencing and parapsychological stress.” But one must
put aside the ridiculous in the study of nature.
Returning then to the question of classical thermodynamics, one observes from
the outset that this is the sole discipline of classical physics where serious argument about fundamentals remains. Even while moguls of science grant entropy
and energy all-pervasive influence, it is impossible to obtain agreement from these
grandees on the two laws of nature from which these concepts derive. As recently
as 1970 the subject was innocent of mathematical structure, and the lack of precision showed throughout. The introductory essay in “Rational Thermodynamics”,
written in 1968, tells this story faultlessly.
Truesdell attacked this miasmic swamp with vital energy, and brought others as
well to the fray. The outcome is now clear; there is no longer doubt that energy
and entropy can be set upon sure and credible foundations,⋆ vindicating Clifford’s
intuition when he proposed the Clausius–Duhem inequality in 1960. His own part
of this research, carried out from 1974 to 1984, included in particular a rigorous
axiomatization of reversible thermodynamics. There is also in this work a corollary
result, a beautiful pearl, which I think deserves to be better known than it is. This is
his discovery that, for reversible systems, the first law of thermodynamics can be stated
without recourse to the mechanical equivalent of heat, indeed in such a way that the
existence of this mechanical equivalent can be deduced, much as the existence of
absolute temperature can be inferred from the second law. Generalizations of this
to arbitrary thermal systems have been given by later writers, leading me to predict
that ultimately this method of viewing the First Law will become a new paradigm
of physics.
Books issue in continuing stream from “Il Palazzetto”: “Introduction to Rational
Elasticity” (with Wang), “The Tragicomical History of Thermodynamics, 1822–
1854”, “A First Course in Rational Mechanics”, “An Idiot’s Fugitive Essays on
Science”, “Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic
Gas” (with Muncaster), the last his magnum opus on a subject which has been with
him for 40 years, and even papers on antique furniture and the routes to Hell. What
more proof is needed for the converse of the famous epigram of Leonardo da Vinci:
“Quiet water becomes stagnant. Iron rusts from disuse. So doth inactivity sap the
vigor of the mind.”
⋆ Note added, March 2003. This statement needs some clarification: The context should be restricted to phenomenological thermodynamics, in which heat and work are the primitive elements,
and where also there are given differential forms defining the heat and work associated to processes in
question. Beyond this, there are systems where one may credibly define energy and entropy, though
such definitions may restrict the set of available processes of the systems.
Clifford Truesdell inspired a generation to study Natural Philosophy, and persuaded the learned that Athena is not dead. He has repeatedly and deservedly been
honored for his many accomplishments. There is the Bingham Medal of the Society of Rheology which he received in 1963, the Modesto Panetti Gold Medal and
Prize from the Turin Academy of Sciences in 1967, and the George David Birkhoff
Award for Applied Mathematics from the American Mathematical Society in 1978,
as well as Foreign Membership in the Accademia Nazionale dei Lincei. I dare say
that the Society for Natural Philosophy as well would wish to present formally
to him its highest constitutional award: the great regard in which he is held, the
high esteem he is owed, and our continuing best wishes for the furtherance of the
scientific program in rational mechanics which he initiated at the beginning of his
career.
The following additional words were appended in May, 2003, drawn from a
lecture given in November, 2000, at Pisa at the Meeting in memory of Clifford
Truesdell.
Generous in praise, criticism, judgement and friendship, none bestowed lightly:
ordinary words do not suffice for Clifford Truesdell. He had great vision accompanied by exceptional accomplishments, at the same time supremely gifted, prodigiously learned and with almost overwhelming energy. He lived life not just to
the full, but to overflowing in every activity he undertook: his prodigal lifestyle,
his many volumes of correspondence, his historical research and essays, his many
books, both scientific and philosophical, and his exceptional editorial activities.
He was lavish in his help for young researchers and was happy with the success
of those who followed in the directions he pioneered. He chose his friends well,
and from the group around him and around the Archive some of the greatest work
of mechanics and thermomechanics in the last century resulted.
Theodore Roosevelt’s old fashioned words apply still to Clifford and to his
goals, “Far better it is to dare mighty things, to win glorious triumphs, even though
checkered by failure, than to take rank with these poor spirits who neither enjoy
nor suffer much, because they live in the gray twilight that knows neither victory
nor defeat.”
All the same, it is necessary to add that Clifford was not without the faults of
the gods – perhaps he was aware of this – even allowing this aspect of his character
to be a full part of his personality. Clifford could become upset on occasion, not
always fairly; there were exasperating times when he became all too human.
Let me mention one moment engraved on my memory. On one occasion when
my wife Barbara and I were visiting Clifford and Charlotte, after the dinner, while
he and I were standing at the top of the magnificent divided stairway at the
Palazzetto, the subject of absolute temperature in classical thermodynamics came
up. After all at that time we were both violently infected by the thermodynamic
plague. He would not accept my view of the “hotness manifold”, which perhaps
unduly boldly I had ventured to discuss. While always expressed with politeness,
the differing views clearly upset him – he was finally led to utter the ultimate put
down: “The problem with you, dear James, is that you never understood Bernard
Coleman.” Of course, that is no doubt true, . . . .
Let me add something about the Archive (Archive for Rational Mechanics and
Analysis). I will ramble a bit, because the main aspect of the Archive, what it is
and what it has meant, is familiar.
In the years between 1946 and 1960 there was a wonderful period of optimistic
activity in the United States, based on the successful ending of the Great War
against Germany and Japan, and also on the need to supply many of the requirements of life which had not been met since the beginning of the Great Depression
in 1930.
Thus a period of optimism was just beginning and American mathematics was
coming to maturity. Before that time, American higher mathematics had been concentrated almost entirely in the universities of the East Coast – Harvard, MIT,
Yale, Columbia and Princeton; and otherwise only at the University of Chicago
and at Berkeley. There had been only 500 members of the American Mathematical
Society, when now there are 10,000. A year’s collection of Mathematical Reviews
was less than half what we now receive in one month. This small world was to
change dramatically after the war, with the coming of European mathematicians,
mostly German, who had escaped from Hitler’s tyranny or from the devastation of
Europe. They were welcomed in America, to aid in the war as scientists, or as part
of the generosity towards the world which Americans in those days so strongly
showed.
It was my good fortune to go to Indiana University at just this time, to study
with a newly formed but impressive faculty, including among others Max Zorn,
Eberhard Hopf and David Gilbarg. Clifford Truesdell arrived in Bloomington in
1950, one year before I left, brought there to solidify the study of continuum mechanics. The students were completely in awe of him, and indeed his lectures were
amazing tours de force, which those who knew him later can easily imagine. That
was the setting.
Now let me turn specifically to the Archive, at first in Clifford’s words.
“A mathematical journal of a different kind had to arise. It was only a question
of persons, place, and time. In those circumstances, T.Y. Thomas, the Head of the
Department, asked me to join him in founding a journal to serve the then growing
fields of mathematical continuum mechanics and the analysis of nonlinear partial
differential equations.”
“Beyond the scope and editorial policy, there was some discussion about the
unusual title ‘Rational Mechanics’, but in the end it was adopted because Newton
had introduced it in his Principia and had not only exemplified it but defined it.”
Even the form of the cover of the new journal occupied his mind – for what it is
worth, my small contribution in this direction was to choose the color, red instead
of blue, to be used as the contrasting ink on the cover. Thus the JOURNAL of
Rational Mechanics and Analysis was born, complete with Latin inscription.
Number 1 of Volume 1 of the Journal of Rational Mechanics and Analysis
appeared in January, 1952.
Clifford left Indiana in 1957 for Johns Hopkins, and the old Journal became
the new Journal of Mathematics and Mechanics (since that time renamed again as
the Indiana University Mathematics Journal). At the same time, a new periodical
emerged from the ashes, the ARCHIVE for Rational Mechanics and Analysis.
This time by the way, there was no problem in choosing a color for the cover,
the publisher Springer-Verlag being inextricably committed to yellow, and yellow
only. Both journals have of course flourished and are still here, a tribute to the
far-seeing eye and organizational ability of Truesdell.
The early years of the Archive saw a great revival of mechanics as a rational doctrine, with the contributions of Stuart Antman, Bernard Coleman, Jerald Ericksen,
Roger Fosdick, Morton Gurtin, Daniel Joseph, Victor Mizel, Walter Noll, David
Owen, Richard Toupin, as well as the celebrated analysts, Antonio Ambrosetti,
Haïm Brezis, Constantine Dafermos, P.L. Lions, J.B. McLeod, Paul Rabinowitz,
and others. The Archive had become necessary for every fine scientific library.
It was thus fitting and natural that when Clifford Truesdell retired from the
editorship in 1989, Stuart Antman was chosen to follow in the same tradition. The
Archive requires special qualities of mind of its main editors – taste in subject and
style, judgement of scientific merit, dedication to an ideal, sheer plodding effort
– together with the ability to see a union between mathematical analysis and the
physical or rational world. These are talents which Stu had in abundance – and,
during the time I was co-editor of the Archive, talents which I could draw on from
Clifford Truesdell when required.
The focus of the Archive has turned gradually in recent years, in part due to
changed research directions in rational mechanics and analysis, a change which affected both contributors and Members of the Editorial Board as well as the response
of the chief Editors.
Clifford viewed mathematics and mechanics as a type of art, as part of our living
culture. Here are the words of a famous Finnish composer – which seem equally
relevant to the life of mathematics:
“It is my belief that art is great if, at some moment, it catches ‘a glimpse of
eternity through the window of time’ – if the experience is one which we might
call ‘the oceanic feeling’. This to my mind, is the only true justification for all art.
All else is of secondary importance.” (Einojuhani Rautavaara)
Clifford A. Truesdell’s Contributions to the Euler
and the Bernoulli Edition
D. SPEISER
Université Catholique de Louvain, Louvain-la-Neuve, Belgium
Present address: Bromhübelweg 5, CH - 4144 Arlesheim, Switzerland.
E-mail: rspeiser@datacomm.ch
Received 2 March 2003
1. Introduction
Allow me to recall that the young scientist, who was formed at Caltech and had won
acclaim through a series of articles on special subjects, soon acquired extraordinary
fame through his comprehensive Handbuch articles, the second of which was written with Walter Noll, where they reformulated the foundations of the mechanics of
continua. The mechanics of continua, which before had appeared in all textbooks
as a conglomerate, even as a sandhill of sometimes isolated subjects between which
there was no connection, logical or mathematical, became now connected and
unified by a powerful system of axioms in the way Hilbert had postulated at the
beginning of the century.
This unification of the entire field of mechanics became possible, among other
things, through the sharp distinction between dynamical principles and constitutive equations. The latter formulate the special properties of the materials only –
something that cannot be deduced in classical mechanics, but must be left to quantum mechanics. In classical mechanics such a property must be formulated by an
additional hypothesis. Noll and Truesdell then showed that this distinction could
be traced back to Cauchy and from Cauchy to Euler and to Jacob Bernoulli. And
thereby we have now stepped onto the domain of the history of science.
But here we must now pause for a moment. While it is obvious that having
worked out a systematic organization of mechanics is indeed an incomparable
preparation for analyzing, ordering and understanding also the historical discoveries and the various processes of the development of science, one cannot stress
enough, on the other hand, that science and history are two radically different
endeavors of the human spirit.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 39–53
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The essence of science lies in its property of being systematic since science
ultimately always wishes to grasp the laws of nature, which it strives to uncover and
to formulate in the simplest and most transparent form. But human history, and thus
also the history of science, is the complete opposite of this: it is totally unsystematic, always complex and never simple nor transparent. So, for writing the history
of science two different, indeed totally opposite, endeavors must simultaneously be
at work in the same man. Thus from the same man two almost irreconcilable gifts
are requested – gifts from his intellect as well as from his heart. This confrontation,
one might say “clash”, of the endeavor to systematize and to extract the universally
valid from the documents which the historian finds before him, with the aim to
determine the conditions under which this, always unique, discovery was made,
under very special circumstances and by one distinct individual different from all
others, and then to interpret its significance for the development of science, is the
character of the history of science. It is its very essence, even its unique prerogative
and also its characteristic charm.
Beyond and above all of the interesting facts which we learn and insights which
we gain about the progress that we are allowed to observe and learn to appreciate,
it is precisely this constant confrontation which fascinates the reader of Truesdell’s
articles and books, and there perhaps more so than for any other historian of science
I know. No wonder then that the interest of such a scientist-historian is directed
especially towards the great systematizers and unifiers, those who at the end of
previous long developments could lay the definite foundations of a whole field,
like Newton, Euler, Cauchy, and Maxwell, or prepare new ways like Hilbert did for
quantum mechanics. But as we shall also see, Truesdell had a special admiration
also for Jacob Bernoulli. Here I shall restrict myself almost exclusively to what
Truesdell did for the Euler Edition, which was the cause of my first contacts with
him, and later, which is a very different story, for the Bernoulli Edition.
2. Truesdell’s Introductions to Euler’s Works on Hydrodynamics
Truesdell was invited in the fifties by the mathematician Andreas Speiser, then
General Editor of Euler’s Opera Omnia, to edit volumes 12 and 13, series II, which
contain Euler’s writings on hydrodynamics. The great systematizer Truesdell was,
I need remind no one, cut out for that job much more than Speiser could have
known. Indeed, Truesdell’s introductions to the two volumes II.12 and 13 offer
much more than what their titles promise and what Speiser possibly could have
hoped for. He traced the subject back to its origins, Archimedes for the hydrostatic
and Torricelli for the hydrodynamical part, and then presented the development
of these branches of science in detail, from the second half of the 17th century
throughout the whole 18th century and beyond. Truesdell presented this scientific
development, where many strands lived for a long time their separate lives. Later
they became more and more intertwined and now form whole domains, but in the
unplanned and rationally never fully explicable way that history, which does not
care much for its future historians, always proceeds. And eventually he showed
how these separate domains came together and why they had to be formulated by a
new mathematical language – a language the very creation of which was stimulated
largely by this development.
In these two introductions we can see how the systematic understanding of a
science from today’s point of view not only helps to unravel the various strands,
but is indeed also indispensable for showing what exactly each strand and each
domain has contributed, and thus for doing justice to each of them. Here it is not so
much the experimental versus the theoretical progress that must be balanced; for,
as Truesdell observed, because of the sudden enormous progress of mathematics
during this period, theory was then mostly far ahead. Incidentally, this is perhaps
the secret reason why this period was always, and still is today, so much neglected
by the historians of physics. Rather, the difficulty is to attribute proper credit not
only to the stimuli that come from new, now suddenly accessible, problems of
mechanics on the one hand but also on the other to the impulses due to the analytical and, to a minor degree, geometric discoveries made around the middle of the
century.
First we have Newton’s heritage, deposited in the second book of the Principia.
It is well known that precisely here are some of the book’s most difficult, even
obscure passages. Truesdell’s summary of it and his evaluations are extremely succinct; they fill merely two pages. Even so they are a very valuable help to the reader.
This holds true especially for his careful separation of what Newton had derived
by starting with the corpuscular view from those results that he had derived using
a continuum view and, finally, from those results for which he used both kinds of
assumptions. The only regret the reader feels is that this whole section is so short.
And then we find Truesdell’s examination of Daniel Bernoulli’s Hydrodynamica. This book contains, albeit in splendid isolation from the rest, the celebrated
excursion into the kinetic theory of gases, as we say today, which Truesdell places
in the proper historical perspective. Here, Bernoulli was over a century ahead of his
time. But then in this book hydrostatics is unified with hydraulics. This was the first
of the two big unifications of theories during this period. It was achieved through
the celebrated Bernoulli equation, which, as Truesdell carefully explains, was by
no means written by Bernoulli in the form in which we know it today. Indeed, not
less than four major discoveries were needed before its transparent form, due to
Euler, became possible.
The first of the four discoveries just mentioned was Johann I Bernoulli’s introduction of Newton’s concept of force into hydrodynamics, if only in a one-dimensional setting. It was Truesdell who rediscovered this fact, acknowledged already
by Euler, thus clearing Johann from the reproach of having plagiarized his own
son. Later Szabó confirmed Truesdell’s vindication in even stronger terms. And
Truesdell underlined that the roots of Euler’s hydrodynamics were to be found in
Johann’s rather than in Daniel’s work.
The second important discovery was Euler’s deeply penetrating understanding
of what we call today an “inertial system” and, in particular, the fact that only in
such a frame can one expect the laws of nature to have a simple form, a form that
one can guess through mathematical or physical imagination. Truesdell calls this
paper E 177 “a research, that will change the whole face of mechanics”. This is
the moment to mention the recent book by the Italian mathematician and historian
Giulio Maltese in Rome: La storia di “F = ma”. This book deals with the development of particle mechanics from Newton to Euler and it is perhaps the first one
to profit fully from Truesdell’s work.
The third important discovery was Euler’s creation of the concept of “inner
pressure”, as opposed to Stevin’s “outer pressure”. This new concept was the key
to his equations.
The fourth important step lies on the mathematical side of this development.
It was the formulation of a field description due to d’Alembert, a discovery that
again was first pointed out by Truesdell. And on the mathematical side, one can,
of course, directly observe the greatest step due to the mathematics of the 18th
century: the systematic development of partial differential equations, due again
largely to Euler and to d’Alembert in their work on hydrodynamics. The interplay
between these two lines of research was masterfully pursued and presented by
Truesdell. If we think how loath scientists usually are to recognize how much their
own field owes to the progress of other areas, Truesdell’s presentation is doubly
remarkable and welcome.
Thus the ingredients were together for the equations of hydrodynamics, published in 1752 by Euler in his paper E 258. Thereby, hydrodynamics and aerodynamics were unified into one dynamical theory, which can be applied equally to
either of them by selecting a special constitutive equation. These equations mark
more than only a period in the history of science: they represent the first field
theory in physics! The Bernoulli equation appears now in the form we know it
today, namely as an integral of the equations of motion, valid only under certain
circumstances.
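For the reader who wishes to see the structure just described, the equations may be sketched in today's vector notation. This is a modern textbook form and not Euler's own notation; the symbols (density ρ, velocity v, pressure p, body force b per unit mass, potential Φ) are assumptions introduced here for illustration only.

```latex
% Modern-notation sketch; all symbols are illustrative assumptions.
% Balance of mass and balance of linear momentum for an inviscid fluid:
\begin{align}
  \frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}) &= 0, \\
  \rho\left(\frac{\partial \mathbf{v}}{\partial t}
        + (\mathbf{v}\cdot\nabla)\,\mathbf{v}\right) &= -\nabla p + \rho\,\mathbf{b}.
\end{align}
% A constitutive equation closes the system, for example
%   \rho = \text{const.}  (incompressible liquid)   or   p = p(\rho)  (elastic fluid).
% For steady flow with \rho = \text{const.} and \mathbf{b} = -\nabla\Phi,
% integration along a streamline gives the Bernoulli equation:
\begin{equation}
  \tfrac{1}{2}\,|\mathbf{v}|^{2} + \frac{p}{\rho} + \Phi = \text{const.}
\end{equation}
```

The restriction to steady flow, constant density and a conservative body force is one common set of the "certain circumstances" under which the integral holds; other versions exist under different hypotheses.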
To this culmination point in the history of mechanics there corresponds a culmination in Truesdell’s narratives: his commentaries on Euler’s two comprehensive
presentations of hydrodynamics. The first of these consisted of three papers written
in the 1750’s in French, and the second was written in the 1760’s in Latin. The latter
is part of Euler’s great project, conceived in 1734, to present the entire science of
mechanics in 6 volumes. The first of Euler’s presentations reflects the moment of
the discovery when it was still fresh in his mind, and the second, a more polished
one, shows the ambition to present everything as clearly as possible, for beginners.
I mention here only Truesdell’s comments on Euler’s paper E 331: “On the
motion of fluids arising from different degrees of heat”, written in 1764. This paper
belongs to the prehistory of meteorology. Euler pursued a line taken up 150 years
later by Milankovitch, but Truesdell pointed out, and again he was here the first to
do this, what this paper brought to thermodynamics.
But he by no means restricted his account to the leading figures of history;
even if they do receive the lion’s share, others are not overlooked. Thus, the
work of Simon Stevin, one of those who are always underestimated, eloquently received its due. Truesdell drew the reader’s attention also to Jacob Hermann, Jacob
Bernoulli’s most important disciple next to his brother Johann. It will be an important task of the Bernoulli Edition to scrutinize Hermann’s works for important
contributions.
In the introduction to the second volume one finds Truesdell’s account of the development of acoustics. Besides Euler’s, two names stand out here: Daniel Bernoulli
and Lagrange. I prefer to mention Bernoulli’s work later in connection with the introduction to the history of elasticity. Of Lagrange’s work, Truesdell gave a detailed
account of his correspondence with Euler as well as of Lagrange’s own published
papers: he wrote that “the velocity-potential theorem and the impulse theorem are
first rate creative works and LAGRANGE’s greatest discoveries in fluid dynamics.”
For organizing this vast amount of material, as one can appreciate, an intimate
acquaintance with the sources is not enough. What was essential here was a deep
insight into the science of mechanics itself – an insight that could be gained only
from its most modern formulation. Such an understanding was essential to provide
a systematic and powerful enough reference system for organizing adequately this
enormous body of material. It is the combination of both of these qualities, which
Truesdell possessed to the highest degree, which lends to his presentation relief
and gives to the narrative at the supreme moments a dramatical quality.
But let us return to Euler’s equations and their two great presentations. From
Euler these equations came to Cauchy, who, with the help of his new creation –
the stress tensor – laid the foundation of the theory of elasticity in its definite form.
During the same period the field idea, through Euler’s “Lettres à une princesse”,
reached Faraday, who quoted them frequently in his journal, and from Cauchy and
Faraday the concept of a field came to Maxwell. One may fairly say that either
of Euler’s great works, especially the Latin one, if supplemented with some of
Truesdell’s comments which place it in a modern perspective, is, or alas, as
I rather must say, would still today be, the best introduction to hydrodynamics at
the level of high school, and even more so, at the level of university teaching. The
presentation of the idea of what a law of nature is, how it works and what use is
made of it under special circumstances including technical applications has hardly
ever been surpassed. This presentation is never clouded by mere formalisms nor by
going into the details of technological applications which, more often than not, just
obscures the basic idea. By Euler and by Truesdell one is led from one essential
point to the next with only the absolutely necessary excursions until one reaches
the summit.
3. Truesdell’s Introduction to Euler’s Work on Elasticity
“The rational mechanics of flexible or elastic bodies 1638–1788”, as the title implies, reviews its subject from Galileo’s Discorsi to Lagrange’s Méchanique Analitique. But in fact it opens with a survey of the prehistory of the whole field in
early Greek antiquity and also deals with the Middle Ages. Here we find such
Truesdellian pearls as the following first sentence: “Duhem’s great historical
studies showed that the apparent darkness of mediaeval physics is but darkness
of our knowledge of it.” Indeed this was a call to fill an immense gap in science.
I will mention here only one glaring hole: our ignorance of mediaeval technology. How, exactly, did the mediaeval architects and engineers proceed with the
construction of their enormous cathedrals, especially of the towers, which surpass
in height, refinement and daring everything that the Greeks and even the Romans
had achieved? Truesdell’s clarion call ought to remind all historians of science as
well as of the arts that they should direct their attention much more to the Middle
Ages than they have done so far. He mentions explicitly the elusive Jordanus de
Nemore and notes: “The only writing of value on deformable bodies that I have
been able to see is the fourth book of JORDAN DE NEMORE’s Theory of Weight
(13th century), and remarkable it is, Western in spirit, ambitious beyond anything
in the Greek or Arab tradition. The seventeen propositions on fluid flow, resistance,
fracture and elasticity are all original.” And the next section contains an evaluation
of Leonardo’s achievements.
The whole account fills the separate volume II, 11b of 400 pages that contains
the introduction to the two volumes II, 10 and 11! An immense achievement:
theories, experiments, simple historical facts, etc. are not only enumerated but
thoroughly “digested”; that is, their scientific content is explained and imbedded
into the historical development. To satisfy both requirements makes, of course,
heavy demands on the writer. Add to this that the account is based not only on
the published writings, but also on an enormous number of letters, for instance, on
the quite complex epistolary exchanges between the Bernoullis and Euler, which
Truesdell searched through. From these we can get an idea of the Herculean labor
that he has gone through. And besides theories, we also find experiments discussed – experiments done, among others, by Musschenbroek, Giordano Riccati
and Chladni and the account reaches far into technology and engineering.
Since the field of elasticity has a more complex structure than hydrodynamics,
its history presents more isolated strands and subdomains. Thus, a lucid arrangement is here much more difficult to attain. Here I must restrict myself to a small
part of Truesdell’s accounts of elasticity proper and the notion of flexibility. I shall
begin with the latter.
While the modern theories of flexibility began with Galileo’s and Mersenne’s
investigations of the string, it was only the investigation of static problems like
the hanging cord and the suspension bridge that led science to ask for mechanical
explanations. In 1712, Brook Taylor computed the fundamental frequency of the
string. While Daniel Bernoulli, as we learn from Truesdell, had a bit later the theory
of the overtones in his hand, he did not publish it. Instead he studied the double-, the
triple-, the multiple-pendulum, and asked the following question, characteristic of
his whole research: under what condition is the oscillation stationary? His answer
was: a pendulum with n masses has exactly n stationary oscillations, and he computed up to n = 5 its n frequencies as a function of the various masses and length’s.
Turning next to the hanging cord, by going to the limit he proved that it has
infinitely many stationary oscillations given by the zeros of a new transcendental
function, today denoted as the Bessel function $J_0(x)$.
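As a numerical illustration of this result, the following minimal sketch computes the first few zeros of J0 and the corresponding frequencies of a uniform chain hanging from one end. It assumes the standard modern small-oscillation formula omega_k = (j_k / 2) * sqrt(g / L), where j_k is the k-th positive zero of J0, L is the length of the chain and g is the gravitational acceleration; this formula, the value g = 9.81 and all function names are assumptions introduced here, not taken from the essay. The power series used for J0 is adequate only for the small arguments needed here.

```python
import math


def bessel_j0(x, terms=60):
    """Power series J0(x) = sum_{m>=0} (-1)^m (x/2)^(2m) / (m!)^2 (small x only)."""
    total = 0.0
    term = 1.0  # the m = 0 term
    q = (x / 2.0) ** 2
    for m in range(terms):
        total += term
        term *= -q / ((m + 1) ** 2)  # ratio of consecutive series terms
    return total


def bisect_zero(f, lo, hi, tol=1e-12):
    """Bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def chain_frequencies(length, gravity=9.81, count=3):
    """Return the first `count` zeros of J0 and the assumed chain frequencies.

    The frequencies use the assumed formula omega_k = (j_k / 2) * sqrt(g / L).
    """
    zeros = []
    x, step = 0.1, 0.1
    while len(zeros) < count:
        # A sign change of J0 on [x, x + step] brackets one zero.
        if bessel_j0(x) * bessel_j0(x + step) < 0.0:
            zeros.append(bisect_zero(bessel_j0, x, x + step))
        x += step
    omegas = [0.5 * z * math.sqrt(gravity / length) for z in zeros]
    return zeros, omegas


if __name__ == "__main__":
    zeros, omegas = chain_frequencies(length=1.0)
    for k, (z, w) in enumerate(zip(zeros, omegas), start=1):
        print(f"mode {k}: zero of J0 = {z:.6f}, omega = {w:.4f} rad/s")
```

Because the zeros of J0 are not integer multiples of one another, the frequencies obtained in this way are not harmonic, in contrast to those of the taut string.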
A little later Johann I Bernoulli formulated the problem of the motion of several
particles attached to a string. Truesdell saw that in this field, too, Bernoulli was the first to
use Newton’s equation of motion.
Then d’Alembert, who had studied Daniel Bernoulli’s two papers, published
in the Traité de Dynamique in 1743 a partial differential equation for the hanging
cord, the first partial differential equation in mechanics. Truesdell called this discovery “a turning point in the whole history of mechanics”. Then, in 1746, in his
paper “Recherches sur la courbe que forme une corde tendue mise en vibration”,
d’Alembert presented the differential equation of the string, this time together with
the solution, and a short time later Euler published the solution found in a different
way.
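In modern notation, the equation of the string and its general solution can be sketched as follows. The symbols are introduced here for illustration only and are not taken from the essay: y(x, t) is the transverse displacement, l the length of the string, and c the wave speed, with c² equal to the tension divided by the mass per unit length.

```latex
\[
  \frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2},
  \qquad
  y(x,t) = f(x + ct) + g(x - ct).
\]
% Fixed ends, y(0,t) = y(l,t) = 0, force
\[
  g(z) = -f(-z), \qquad f(z + 2l) = f(z),
\]
% so that a single 2l-periodic function f determines the motion:
\[
  y(x,t) = f(ct + x) - f(ct - x).
\]
```

Which functions f are to be admitted in such a solution is the question on which the subsequent dispute turned.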
With these discoveries began the last of the big scientific polemics of the 18th
century. Truesdell was highly critical of the polemic itself, which he called “deplorable”. He commented that it “confirms the principle that ever the greatest
quantity of paper is smeared over with the dullest matter.” But this “great quantity” was searched through by him with great care, and the questions at issue were
explained by him incisively.
In this triangular struggle both d’Alembert and Euler maintained that only the
partial differential equation would yield all solutions. Their quarrel concerned the
class of functions that are admissible as solutions. While d’Alembert tenaciously,
but erroneously, maintained that only the functions today called analytic can serve
as solutions of a mechanical problem, Euler admitted a much larger class, the
class of piecewise smooth functions. Hereby he achieved in Truesdell’s words
“the greatest advance of scientific methodology in the whole century”, because
it contradicted the Leibniz postulate that in mechanics functions must be analytic,
a postulate which, according to Truesdell, had not been contradicted by anyone, not
even by Newton. Daniel Bernoulli, however, claimed that the trigonometric series would
equally well yield all solutions. As he had overlooked the arbitrary phases he was
wrong, as we know today. The approach of his adversaries carried the day, but
250 years later we can see that Bernoulli had in fact solved the first finite and
infinite eigenvalue problems, which occupy now, after the past 75 years, the center
of quantum mechanics. Thus are the meanderings of the developments of science!
All these problems, however, were merely 1-dimensional ones. Only one
2-dimensional problem from the field of flexibility was solved during this period:
Euler discovered in 1759 the equation of the drum.
The elaboration of the theory of elastic bodies moved during the same period on
very different lines, especially with respect to the mathematics involved. Truesdell
traced its modern development back to Beeckman and to Galileo. In the Discorsi,
Galileo had asked: what is the proportion between the minimal weight $P_l$ needed
to break a beam simply by elongation and the minimal weight $P_t$ needed to break
it transversally when one of its ends is clamped into a wall? Galileo’s prediction
was
$$P_l : P_t = (b : a)\,N,$$
where a and b are the length and breadth of the beam, respectively, and N is a
numerical factor. By assuming a special model he then could compute N = 1/2.
Mariotte, who tried to confirm Galileo’s predictions experimentally, found N = 1/4
rather than 1/2, and he reported his results to Leibniz. Leibniz suggested in 1684 that
one should take into account also the energy needed to bend the beam before it
breaks. Using Hooke’s law he found N = 1/3. According to Truesdell this was the
first computation which took account of the dilation of fibres.
A few years later in 1687 Jacob Bernoulli asked Leibniz, in a letter, for explanations concerning his new calculus and in the same letter he included the
results of his own experiments which in certain cases clearly disproved Hooke’s
law. Leibniz, because of his absence on a trip, replied three years later. He suggested that Bernoulli should determine the exact form of a beam that is bent by a
weight. Bernoulli at once set out to work on this problem. And indeed by using the
principle of the balance of moments of forces and his own “golden theorem” –
his formula for the radius of curvature – he found the solution in the form of
an integral which contained an arbitrary function depending upon the constitutive
relation between the stress and the strain. Here, in 1692, Bernoulli recognized the
distinction between dynamical principle and constitutive equation, for his experiments had convinced him that there was no universal law valid for all materials.
In the simplest case, namely when Hooke’s law is assumed, the solution is given
by the famous elliptic integral, discovered and discussed by him in this paper. Of
his theory of the bent beam Truesdell wrote (p. 96): “the deepest and most difficult
problem yet to be solved in mechanics, is his alone.”
Truesdell pursued Bernoulli’s later investigations on this problem, especially
on the location of the neutral fibre. The effort ended, as Antoine Parent showed, in
failure. Truesdell remarked: “To the ironies and disappointments which filled [his]
life must be added that while he originated or assembled all the apparatus sufficient
to put [his final equation] on firm ground, he failed to do so, failed because his
attempt was on too grand a scale.” In fact the problem of the neutral fibre seems
still today not to be solved in full generality.
Truesdell’s evaluation of Jacob Bernoulli’s achievements is: “In our epoch for
study, 1638–1788, but one other, Euler, is to build himself a like monument in our
subject.” On the other hand, it is characteristic of Truesdell that he devoted a full
section to Parent, whom he rescued from near oblivion by showing that “Parent was
the first to apply statical principles correctly to the tensions of the fibres of a beam,
and that he recognized the existence of shearing stress.” These are no small merits,
indeed!
The next step in this development was taken by Daniel Bernoulli. He knew that
Euler was working on a book on variational calculus, and suggested to him a minimum
principle for the potential elastic energy stored in a curved beam. Euler immediately worked out its consequences, which he annexed as the first appendix to his
Methodus Inveniendi Lineas Curvas, where he derived a multitude of new results.
Truesdell commented also on Euler’s further work in this field, e.g., his discovery of the shear force and Coulomb’s discovery of the shear stress. In this field
all correctly solved problems were, again, 1-dimensional. Even the problem of the
oscillations of a massive plate was missed by the second Jacob Bernoulli, if also,
as D. Ó Mathúna showed, only by a hair’s breadth! Truesdell was here a bit severe,
for the definitive solution was given by Lagrange only more than 30 years later.
Especially impressive is Truesdell’s “modern evaluation”, which fills the last 10
pages of the book. He divides the task into three parts: the evaluation of Analysis,
Geometry and Mechanics. Who else could have dared to evaluate three basically
so different histories? Even a careful and well-informed reader, I suspect, will discover in
these few pages, here and there, something that he believed he had understood well, only to find that he had in fact failed to grasp a fundamental aspect of it. I repeat here the first sentences of the first summary. The triumphant lines show Truesdell’s rightful pride in
his own beloved science: Rational Mechanics. He wrote: “Prior to 1730, researches
on continuum mechanics applied mathematical techniques already developed in
other subjects, notably in geometry and in the mechanics of point masses. Starting with the research on vibrating systems by DANIEL BERNOULLI and EULER,
the situation was completely inverted. From then on until the end of the century,
continuum mechanics gave rise to all the major new problems of analysis.”
On the last two pages of the book Truesdell asked why the foundations of a complete theory of elasticity escaped this period, and wrote: “Neither physical intuition
nor experiment was what was needed here; rather, as both EULER and CHLADNI
said, it was want of differential geometry that blocked the way to theories of deformable surfaces and solids”. And after mentioning that Euler had introduced all
elements of the strain tensor in a paper on hydrodynamics, he notes on the very last
page: “In surveying all these brilliant individual achievements . . . , we are driven to
ask why, when Euler had succeeded in 1752 in creating a general theory of perfect
fluids . . . , nevertheless after many more years he failed to reach a general theory
of elasticity.” His answer was: “To succeed in hydrodynamics, the only hope lay
in abandoning a one-dimensional approach. But for elastic or flexible bodies one-dimensional theories led to one triumph after another. It was the brilliant successes
of the special theories that blocked the way to the general theory, for nothing is
harder to surmount than a corpus of true but too special knowledge.”
I could give here only an insufficient account of this monumental work of Truesdell: the history of the theory of elasticity is now, probably due to him, the best
charted and the best investigated domain of the history of physics. And I remain
convinced that the three introductions to which I referred are the best guide to
a deeper understanding and further study of the history of classical mechanics
and indeed of the history of science; every time I open one of them I find again
something new and interesting that had escaped me.
4. The Concepts and Logic of Classical Thermodynamics as a Theory of
Heat Engines. Rigorously Constructed upon the Foundation Laid by
S. Carnot and F. Reech
Before turning to my second main topic, Truesdell’s work for the Bernoulli Edition,
I would like to mention his book with S. Bharatha. Even if it lies somewhat apart
from the other works of this account, it brings several characteristics of Truesdell
to the fore, which seem to me fundamental for his thinking as well as significant.
I have quoted the full title of the book, since this book brings together, like no other
of his books that I know, science, history and, not surprisingly, conceptual logic.
And in no other book of those which I know, is Truesdell so preoccupied with
teaching. Not that I would recommend the book as a textbook for students, but
I recommend it highly to all those who teach physics. The aim of the book is to
construct a rigorous foundation of classical thermodynamics based on the idea of
the Carnot cycle. “Rigorous” refers here not only to mathematical rigor, but also, and
in fact even more so, to conceptual rigor – to a clear and adequate introduction and
a sharp definition of all concepts that will be used in the equations as well as a
precise reference as to how they are to be measured, i.e., how they are connected
with experiment. The very first sentence of the preface makes this clear and it is,
at the same time, a “critique” in the sense of Kant of the possibility of writing
the history of science. He writes: “I do not think it possible to write the history
of a science until that science itself shall have been understood, thanks to a clear,
explicit, and decent logical structure.” I have hardly found in Truesdell’s work positive references to philosophy, and probably he would be surprised to be referred to
as a philosopher; yet what he notes here and in the rest of the section is as important
a contribution to the philosophy of science and to the philosophy of history of
science as I have ever heard. Perhaps Truesdell, were he here, would react to this
compliment with a little smile. On the other hand, the aim which he pursued as a
historian is expressed by his dedication of the book “as an expression of respectful
gratitude for the legacy of the great French thermodynamicists CARNOT, REECH,
DUHEM”. This dedication incidentally refutes the accusation, which I have sometimes
heard, that he had an anti-French bias. The results are presented in three sections:
Calorimetry, Carnot’s General Axiom, and Universal Efficiency of Ordinary Carnot
Cycles.
I believe that the progress of science consists in establishing connections between various phenomena, between phenomena and their measurements, and it
includes a process called the formation of theories, i.e., constructing connections
between different restricted theories by erecting greater and even more comprehensive theories. The significance of the book then lies perhaps, in the first place, in the fact that
no other book of those that I have consulted connects thermodynamics so cogently
and intimately to classical mechanics.
But the book has another distinction too. It is a book written especially for the
teacher, I dare say even for the teacher who must speak to beginners. In other
words, the book has also a pedagogical aim. If it is not directly a textbook, this is
only because the authors wished to prove that their approach is powerful enough for
coming to grips with all situations that the practical applications demand. Hence the
careful analytical generalizations to cases where functions that are only piecewise
smooth are needed, etc. But for the explanation of the thermodynamical principles
themselves, these technical details are not necessary and can easily be suppressed
by the teacher. But what the teacher can learn and teach above all is not so much
the mathematical rigor, but the conceptual rigor of the theory or of any theory for
that matter, and the importance of a careful introduction and explanation of all
concepts. The importance, for instance, of the eternal question that looms over the
beginning of all introductory courses on mechanics: “Now what exactly is force,
professor?”
When I think of my own lecturing, my greatest regret is that I concentrated too
little and too late on the careful introduction of all concepts used in physics, and
that I spent in my lectures too little time on their discussion. It is with the help of
sentences that we prescribe the setting of a reproducible experiment – the concepts
connect the experiments with the mathematical formalisms.
Another fundamental point made clear in this book is that all theories are always
valid only with respect to a certain domain of the variables and under certain restrictions. It is the neglect of these caveats that makes pseudophilosophical and pseudoscientific generalizations possible. Here teachers can learn much that
will prevent a certain boastful offer of their merchandise and at the same time make
the understanding of what they present easier since it is focussed. I learned myself,
for example, that at the very beginning of each course the teacher must
enumerate explicitly all the restrictions under which alone the predictions of the theory
are valid.
Over 20 years ago I invited Truesdell to Louvain-la-Neuve to give a
series of lectures. In the first one he outlined the content of this book in one hour,
overestimating, of course, his audience, which was oriented mainly towards quantum mechanics and its applications. And then he changed to other subjects. But he
presented me with a copy of the book, and when I had read it, I regretted deeply
that the whole series of lectures had not been devoted to this one topic, but I did not
dare say so to him. But later, at a lunch, I told him that I was particularly impressed
by one special topic, namely his treatment of the anomaly of water between 0
and 4 degrees, of which I had never seen an adequate presentation. He then said
approvingly, “I can tell you that this subject was my special goal for writing this
booklet”, and then taking his glass he invited me to call him henceforth “Clifford”,
which from here on I shall also do in this discourse!
But during his stay in Louvain-la-Neuve there was one other topic towards
which many of our conversations were directed again and again. This was the
Bernoulli Edition, to which I shall now turn.
5. Truesdell’s Contribution to the Restart of the Bernoulli Edition
This other topic which occupied Clifford and me was precisely the new beginning
of the Bernoulli Edition, and here I must now go back a few years. Clifford had
been involved with the Euler Edition, as I mentioned already, through the mathematician Andreas Speiser, an uncle of mine, with whom I had close contacts.
My uncle was extremely proud of this acquisition and spoke to me often enthusiastically about Clifford. He gave me separata of the two introductions to the
hydrodynamical works, and my uncle’s enthusiasm caught on also with me.
I made Clifford’s acquaintance in 1957 on the occasion of Euler’s 250th anniversary, where at my uncle’s invitation he was the main speaker at the official
university ceremony. A few years later he wrote to me a complimentary letter for
my own introduction to Euler’s works in the domain of physical optics. Meanwhile
I had begun to read his introductions, so that when J.O. Fleckenstein asked me to
succeed Hans Straub as the editor of the works of Daniel Bernoulli, I said “Yes,
but . . .”. Namely, I stated as a condition that I should be paid the equivalent of a
half-time assistant. It so happened that a few months earlier a young student had asked
me if she could write a Ph.D. thesis under my direction. I wished to accept her,
for she had definitely “une tête bien organisée”, but no post seemed free. Hence
my proposal to Fleckenstein. I succeeded in persuading the student to work on
the history of science, although at first she found this a puzzling and somewhat
dubious proposition, and Fleckenstein arranged the financial side. Today, some
twenty-seven years later, this young student, Patricia Radelet-de Grave, is professor
at the Université Catholique de Louvain, where she teaches the history of science.
She has now succeeded me as editor of the Bernoulli Edition while an Italian Ph.D.
student of hers served as the Edition’s secretary.
Studying Clifford’s introductions, I had become convinced that he was the best
possible guide and counsellor for launching the whole enterprise again. In 1975,
I was in New York when I received a call from Clifford who inquired about what
was going on in the Bernoulli Edition. I gave him the little information I had,
but only later did I find out that he had not been terribly excited by my answers.
Nevertheless, later, as I shall mention, he accepted an invitation to be the editor of Daniel Bernoulli’s work on hydrodynamics. To my surprise he wrote to
me that he was not satisfied with his earlier work in the Euler Edition. His, as I
noted before, was the only critical voice about these introductions that I ever
heard.
In 1980, when I succeeded Fleckenstein as the Editor of the whole Edition,
he encouraged me to ask André Weil to become an editor of the works of Jacob
Bernoulli. Weil accepted very kindly first the volume Analysis and then later also
the volume Differential Geometry. Also from Weil I learned much on the art of
editing, and I wish to state here too that I cannot remember one difficult moment
with him. A bit later Weil introduced me to Herman Goldstine, who edited the
volume on Variational Calculus containing works of Jacob and of Johann. Meanwhile
Mrs. Radelet and I produced plans for editing the works of all Bernoullis;
so far there had not been any plans nor any reliable estimates of the work at
all, the old ones being all much too low. Precisely all these questions and many
more, including the choice of the typography and the design of the volumes, were
then discussed with Clifford in Louvain-la-Neuve. He took a detailed interest in all
problems and Mme. Radelet and I learned much from him.
At last we could publish the plans, which were printed in 1982 in an illustrated
brochure, which also contained a presentation of the Bernoulli family and the
importance of each member. The plans consisted of (i) a presentation of the whole
project, including what had already been achieved and the distribution of the works
into volumes and (ii) a determination of our priorities: first to complete what was
begun, i.e., Jacob I and Daniel Bernoulli and three volumes with the letters of
Joh. I B.; this first stage is now approaching its completion. Then only to complete
in a second stage all works of Joh. I and the “minor” B.’s; this stage is now opened
with the deposition of Vol. 8 of the works of Joh. I by P. Villaggio from Pisa.
A third stage was foreseen for the letters; for their publication a project was worked
out by F. Nagel and myself. Our plans also contained (iii) a list with the names of
all editors of the first stage. At first, when I wrote to Clifford about this brochure,
he did not seem terribly impressed, but when I sent him two copies he sent me his
enthusiastic congratulations together with a list of friends and colleagues to whom he
invited me to send a copy as well.
These plans, with the appointment of editors for the first stage, were the basis
of the 1982 restart of the Bernoulli Edition. The same year, on the occasion of
the bicentenary of Daniel Bernoulli’s death, the Curatorium, under its president
the historian A. Gasser, organized a Symposium. The main speaker was Clifford,
who in the “Alte Aula” between the portraits of Daniel Bernoulli and Euler gave
a speech about the research of both on the theory of oscillations, evaluating the
strengths and the weaknesses of both of them.
This was the beginning of a series of exchanges concerning the progress of the
Edition, and especially of my requests for Clifford’s opinion on various questions.
At the Symposium we presented the first new volume edited by L. Bouckaert and
B.L. van der Waerden. It received, besides several favorable reviews, one that put
the Edition at its start into serious trouble. Again Clifford came to our rescue and
refuted in a letter to the Swiss National Science Foundation (SNScF) all points,
with the exception of one, of the review. His letter especially restored, as I was told,
the confidence of the SNScF. Incidentally, the author of the review later graciously
apologized to Mrs. Radelet.
Of course, we had all very much hoped that Clifford might deposit the two
volumes containing Daniel Bernoulli’s work on hydrodynamics. This was not to
be. As all know, an enormous load of work kept him busy beyond his forces, and
his health eventually failed him, although he was in the best of care. He had to
resign from his engagement, and he advised me to invite the Russian Academician
Gleb Mikhailov to take over. I am happy to report that recently we began with the
printing of the first of the two volumes!
But, thanks to his friend André Weil, Clifford, nevertheless, became a Bernoulli
Editor! Indeed Weil had advised me to edit completely all letters of Jacob Bernoulli’s correspondence with Leibniz and I had unhesitatingly accepted his advice,
unaware of the existence of an agreement between the Leibniz and the Bernoulli
Edition, which had left the editing of the letters with Leibniz to its sister edition
in Hannover; these letters were, of course, the most interesting ones. But when I
explained the situation to our colleagues of the Leibniz Edition, they very generously accepted our plans, provided we would not undertake a “critical edition”.
During his work on the edition Weil persuaded Clifford to write an introduction to
the parts of the correspondence that dealt with questions of mechanics. It is there
that Leibniz drew Bernoulli’s attention to the problem of the curved arc. The result
of Weil’s invitation was again the appearance of a very penetrating introduction.
Thus the Edition is fortunate that Clifford’s name will remain connected to it,
and especially as an editor of Jacob Bernoulli, for whom he had done so much. But
even without this turn of events, after 30 years of work for the Bernoulli Edition,
I can state firmly that no one has done more to make the new beginning of the
Edition in 1982 possible than Clifford Truesdell.
6. Truesdell the Artist and the Man
My report on Truesdell’s work for the Euler and the Bernoulli Edition must end
here, although I could go on at length. But there remains a question: neither Clifford’s scientific expertise nor his penetration into history alone can explain the
full fascination which his works exert on their reader. We know that all scientific
theories are even at best only approximations to the observed world and the same,
but even more so, holds for the reconstruction of the historical path on which they
were founded. Clifford, more than most historians, was aware of this. All too often
we must be satisfied with guesses. Thus, like the scientific theories, the historical
reconstruction to some extent always remains a construct. So, what then produces
the great satisfaction that we experience when we read Truesdell’s works?
Here, for discovering the answer we must, I think, turn to another field: we
must enter another dimension – the realm of beauty. The man, who was so much
attached to all arts, music, painting, old books, etc., was himself also an artist.
He has composed his books, in the double sense of this word. As much as his
search for scientific precision in all details would allow it, his books are beautifully
constructed!
This brings me necessarily to Clifford the private man. Everyone who had the
intense pleasure to be received in the Palazzetto knows what I mean: the carefully
chosen objects of the collection, their carefully thought out presentation, and especially their owners’ passionate interests in all arts and also in the history of the
arts. Here too one could experience the truly enlightening comments which their
guests received. One could watch how they stimulated through their interest in all
crafts the artisans of Baltimore who made the Palazzetto into what it became. As
you realize, I slipped now, almost unconsciously, into the plural: the treasures of
the Palazzetto were offered to its guests by a couple! And so this was more than
only their home!
Can we imagine Clifford’s tremendous outpourings without the constant and
intense help of Charlotte? Can we imagine this without her painstaking proofreading
of his books, her corrections, her meticulous improvements of the last details, her
conscientious organization of the Archive as well as the classification of Clifford’s
correspondence in a private archive? Of course we cannot. Just as the Palazzetto’s
hospitality was the work of both, the Palazzetto’s soul was Clifford and Charlotte.
And Charlotte made sure that in spite of his harsh afflictions Clifford could spend
his last years there in dignity. For this, all of Clifford’s friends will always be in her
debt, and they will remain grateful to Charlotte and Clifford.
Acknowledgements
It is a pleasure to thank Professor G. Capriz for the invitation to contribute to the
Pisa Meeting in memory of Clifford Truesdell, and Professors Chi-Sing Man and
R. Fosdick for their careful editing of this article. Also, I would like to thank Professors L.A. Radicati di Brozolo and, especially, P. Villaggio for many interesting
conversations and my wife for linguistic advice.
References
1. G. Maltese, La Storia di “F = ma”. La seconda legge del moto nel XVIII secolo. Olschki,
Firenze (1992). After the Meeting in memory of Clifford Truesdell in Pisa, there appeared a
2nd volume by Giulio Maltese: Da “F = ma” alle leggi cardinali del moto. Hoepli, Milano
(2001). This is a worthy successor of the first volume: one must hope that many readers and
especially students will take profit from it. It is written in the spirit of Truesdell’s pioneering
work and it is a monument to it.
2. C.A. Truesdell, Leonhardi Euleri Commentationes Mechanicae, Ser. Secunda, Vol. XII, pp. IX–
CXXV; Ser. Secunda, Vol. XIII, pp. IX–CV; Ser. Secunda, Vol. X et XI Sect. Secunda. Orell
Füssli, Turici (1954; 1955; 1960).
3. C. Truesdell and S. Bharatha, The Concepts and Logic of Classical Thermodynamics as a
Theory of Heat Engines. Rigorously Constructed upon the Foundation Laid by S. Carnot and
F. Reech. Springer, New York (1977).
4. A. Weil (ed.), Der Briefwechsel von Jacob Bernoulli, with contributions by C. Truesdell and
F. Nagel. In: D. Speiser (general ed.), The Collected Scientific Papers of the Mathematicians
and Physicists of the Bernoulli Family. Birkhäuser, Basel (1993).
Photograph: Baltimore, Maryland, 1978
Invariant Dissipative Mechanisms for the Spatial
Motion of Rods Suggested by Artificial Viscosity
STUART S. ANTMAN
Department of Mathematics, Institute for Physical Science and Technology, and Institute for
Systems Research, University of Maryland, College Park, MD 20742-4015, U.S.A.
E-mail: ssa@math.umd.edu
Received 10 September 2002; in revised form 16 January 2003
Abstract. The introduction of artificial viscosity into the partial differential equations of mechanics
is often useful for both analytic and numerical studies. The traditional forms of artificial viscosity,
originally designed to treat problems for fluids, when applied to problems for solids often lead to
equations describing material properties that are not invariant under rigid motions. Consequently, for
rapidly rotating bodies, artificial viscosity could produce serious errors. In this paper it is shown how
to introduce artificial viscosity in a properly invariant way, and that the resulting systems have a rich
and attractive structure, which beckons analysis.
Mathematics Subject Classifications (2000): 35L65, 65M99, 74K10.
Key words: artificial viscosity, invariance under rigid motions, frame-indifference, spatial deformations of rods, hyperbolic conservation laws.
In memoriam Clifford Truesdell
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 55–64.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
A general version of a conservation law with one spatial variable s is a system of
partial differential equations of the form
$$u_t = g(u, s)_s + h(u, s) \tag{1.1}$$
where u is an n-tuple of unknown scalar functions of s and t; g and h are given
functions with values in $\mathbb{R}^n$; and derivatives are denoted by subscripts. Suppose that
g is differentiable and that h is continuous. In this case, this system is hyperbolic if
the matrix ∂g/∂u of partial derivatives of the components of g with respect to the
components of u is positive-definite. As is well known, such nonlinear hyperbolic
conservation laws admit shocks. In both analytic and numerical studies of (1.1), a
central role has been played by modifying it by appending to its right-hand side
an artificial viscosity term $[D(u, s) \cdot u_s]_s$, where $D(u, s)$ is a small positive-definite
matrix (often taken to be constant and diagonal; see [4]). Here $D \cdot u_s$ denotes the
image of the n-tuple $u_s$ under the matrix D. For problems of continuum physics,
the effect of D is to modify the material properties characterized by g and h.
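As a sketch of the modification just described, the regularized system can be written out explicitly. The linear scalar illustration that follows it is an assumption made here and is not taken from the paper: it takes n = 1, g = cu with a constant c > 0, h = 0, a constant D = ε > 0, and solutions periodic in s with period ℓ. Integrating by parts and using the periodicity shows why a positive D acts dissipatively.

```latex
\[
  u_t = g(u,s)_s + h(u,s) + \bigl[D(u,s)\cdot u_s\bigr]_s .
\]
% Illustrative scalar case (an assumption, not from the paper):
\[
  u_t = c\,u_s + \varepsilon\,u_{ss},
  \qquad
  \frac{d}{dt}\int_0^{\ell} \tfrac12 u^2\,ds
  = \int_0^{\ell}\bigl(c\,u\,u_s + \varepsilon\,u\,u_{ss}\bigr)\,ds
  = -\,\varepsilon\int_0^{\ell} u_s^2\,ds \le 0 .
\]
```

In this illustration the first-order term contributes nothing to the balance, so the decay of the quadratic quantity is due to the added second-order term alone.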
When we introduce such modifications into conservation laws from physics,
it behooves us to determine if the modified system has physical significance. If
not, the modification might introduce serious analytic or numerical errors. (This
happens when some standard numerical methods for the treatment of shocks are applied to the equations for the planar motion of elastic rods [3].) When (1.1) is a set
of equations of continuum mechanics, there are three sources of difficulty: (i) Vectorial geometry is suppressed because the unknown u is just an n-tuple of scalars,
typically components of vectors with respect to some moving basis with the basis
carrying most of the geometric information. (ii) The system typically includes both
momentum equations and compatibility equations, the latter expressing the equality of mixed s and t derivatives. It is necessary to determine what physical meaning,
if any, inheres in adding dissipation to such compatibility equations. (Slemrod [5]
first treated this question for a scalar equation, in which the issue of invariance
under rigid motions does not arise.) (iii) The form of the artificial viscosity D · us is
inspired by the viscosity term for 1-dimensional gas dynamics and more generally
for that in the Navier–Stokes equations. These equations are typically given in
a spatial (Eulerian) formulation. This form of the dissipation is not preserved in a
material (Lagrangian) formulation, typically used for problems of solid mechanics.
Indeed, we shall see that in the material formulation, the artificial viscosity D · us
corresponds to constitutive equations that are not invariant under rigid motions.
The purpose of this paper is to resolve these difficulties, producing properly
invariant dissipative mechanisms suggested by this artificial viscosity. We shall see
that such dissipative terms have a rich and attractive structure.
Notation. We employ Gibbs notation for vectors and tensors: Vectors, which are
elements of Euclidean 3-space E3 , and vector-valued functions are denoted by
lower-case, italic, bold-face symbols. The dot and cross products of (vectors) u
and v are denoted u · v and u × v. The value of tensor A at vector v is denoted A · v
(in place of the more usual Av). Twice-repeated lower-case Latin indices except
for the independent variables s and t are summed from 1 to 3 and twice-repeated
lower-case Greek indices are summed from 1 to 2.
Triples $(u_1, u_2, u_3)$ of components of any vector $\boldsymbol{u}$ with respect to a certain
nonconstant right-handed orthonormal basis $\{\boldsymbol{d}_k\}$ are denoted by the corresponding
lower-case, sans-serif, bold-face symbol $\mathsf{u}$. In view of the orthonormality of this
basis, we set $\mathsf{u}\cdot\mathsf{v} := u_i v_i$ $(= \boldsymbol{u}\cdot\boldsymbol{v})$, $|\mathsf{u}| = \sqrt{u_k u_k}$ $(= |\boldsymbol{u}|)$,
$\mathsf{u}\times\mathsf{v} := (u_2 v_3 - u_3 v_2,\ u_3 v_1 - u_1 v_3,\ u_1 v_2 - u_2 v_1)$ (so that the first component
of $\mathsf{u}\times\mathsf{v}$ is $(\boldsymbol{u}\times\boldsymbol{v})\cdot\boldsymbol{d}_1$, etc.). The matrix of a tensor $\boldsymbol{A}$ with respect to this basis
is denoted $\mathsf{A}$. Its action on a triple $\mathsf{u}$ is denoted $\mathsf{A}\cdot\mathsf{u}$. The dot product and norm
for other n-tuples are treated
analogously.
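As a quick numerical check of these conventions, the following sketch (not part of the original paper) builds an arbitrary right-handed orthonormal basis and confirms that the dot product, norm, and cross product of component triples agree with those of the vectors themselves. The helper names `random_right_handed_basis` and `components` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def random_right_handed_basis(rng):
    """Return a 3x3 array whose rows are a right-handed orthonormal basis d_1, d_2, d_3."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one direction so that d_1 x d_2 = d_3
    return q.T

def components(vec, basis):
    """Triple of components (vec . d_1, vec . d_2, vec . d_3)."""
    return basis @ vec

rng = np.random.default_rng(0)
d = random_right_handed_basis(rng)
u, v = rng.normal(size=3), rng.normal(size=3)
u_c, v_c = components(u, d), components(v, d)

dot_ok = np.isclose(u @ v, u_c @ v_c)
norm_ok = np.isclose(np.linalg.norm(u), np.linalg.norm(u_c))
# Components of the vector cross product equal the cross product of the triples.
cross_ok = np.allclose(components(np.cross(u, v), d), np.cross(u_c, v_c))
```

The cross-product check relies on the basis being right-handed; with a left-handed basis the two sides would differ by a sign.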
The (Gâteaux) differential of u → f (u) at v in the direction h is (d/dt)f (v +
th)|t =0 . When it is linear in h, we denote this differential by (∂f /∂u)(v) · h or
INVARIANT DISSIPATIVE MECHANISMS
fu (v) · h. We occasionally denote the function u → f (u) by f (·). The partial
derivative of a function f with respect to a scalar argument t is denoted by either ft
or ∂t f . The operator ∂t is assumed to apply only to the term immediately following
it. We shall always use notation like ∂t for a total derivative, i.e., a derivative of a
composite function. Obvious analogs of these notations will also be used.
2. Formulation of the Governing Equations
We briefly outline the formulation of geometrically exact equations governing the
motion in space of a rod that can suffer flexure, extension, torsion, and shear. We
follow [1, Chapter 8], which should be consulted for interpretations and for the
proofs of all our assertions.
The motion of a rod is defined here by three vector-valued functions
$$[0, 1] \times \mathbb{R} \ni (s, t) \mapsto \boldsymbol{r}(s, t),\ \boldsymbol{d}_1(s, t),\ \boldsymbol{d}_2(s, t) \in \mathbb{E}^3 \tag{2.1}$$
with {d1 (s, t), d2 (s, t)} orthonormal. The function r(·, t) may be interpreted as the
configuration at time t of the curve of centroids of a slender 3-dimensional body.
The vectors d1 (s, t) and d2 (s, t) may be interpreted as characterizing the orientation of the material section at s at time t. In particular, d1 (s, t) and d2 (s, t) may
be regarded as characterizing the configurations at time t of a pair of orthogonal
material lines of the section s. We assume that s is the arc-length parameter of the
reference configuration of r and we scale the length so that 0 ≤ s ≤ 1. We set
d3 := d1 × d2 .
(2.2)
Since {dk (s, t)} is a right-handed orthonormal basis for E3 for each (s, t), there are
vector-valued functions u and w such that
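$$\partial_s \boldsymbol{d}_k = \boldsymbol{u} \times \boldsymbol{d}_k, \qquad \partial_t \boldsymbol{d}_k = \boldsymbol{w} \times \boldsymbol{d}_k. \tag{2.3}$$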
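A small sketch, under stated assumptions, of how the vector u in (2.3) can be recovered from a given frame. It uses the identity u = (1/2) d_k × ∂s d_k (summed over k), which follows from (2.3) and the orthonormality of the basis; this identity and the test frame (a fixed frame rotated about the z-axis at the constant rate `kappa`) are illustrative choices, not taken from the paper.

```python
import numpy as np

def frame(s, kappa):
    """Rows are d_1, d_2, d_3: a right-handed frame rotated about the z-axis by kappa*s."""
    c, sn = np.cos(kappa * s), np.sin(kappa * s)
    return np.array([[c, sn, 0.0],
                     [-sn, c, 0.0],
                     [0.0, 0.0, 1.0]])

def frame_derivative(s, kappa, h=1e-6):
    """Central-difference approximation of the s-derivative of each d_k."""
    return (frame(s + h, kappa) - frame(s - h, kappa)) / (2.0 * h)

def axial_vector(d, d_s):
    """u = (1/2) sum_k d_k x (d_k)_s, valid whenever (d_k)_s = u x d_k."""
    return 0.5 * np.sum(np.cross(d, d_s), axis=0)

kappa, s0 = 2.0, 0.3
d = frame(s0, kappa)
d_s = frame_derivative(s0, kappa)
u_est = axial_vector(d, d_s)

# For this frame the exact answer is u = kappa * e_z, and (2.3) holds row by row.
u_matches = np.allclose(u_est, [0.0, 0.0, kappa], atol=1e-6)
relation_holds = np.allclose(d_s, np.cross(u_est, d), atol=1e-6)
```

The same recipe applied to time derivatives of the frame would give w.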
Since the basis {dk } is natural for the intrinsic description of deformation, we
decompose relevant vector-valued functions with respect to it:
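$$\boldsymbol{v} := \boldsymbol{r}_s = v_k \boldsymbol{d}_k, \qquad \boldsymbol{p} := \boldsymbol{r}_t = p_k \boldsymbol{d}_k, \qquad \boldsymbol{u} = u_k \boldsymbol{d}_k, \qquad \boldsymbol{w} = w_k \boldsymbol{d}_k. \tag{2.4}$$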
The equality of mixed partial derivatives of r and of the dk implies that
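$$\boldsymbol{p}_s = \boldsymbol{v}_t = (\partial_t v_k)\boldsymbol{d}_k + \boldsymbol{w} \times \boldsymbol{v}, \qquad \boldsymbol{w}_s = \boldsymbol{u}_t + \boldsymbol{u} \times \boldsymbol{w} = (\partial_t u_k)\boldsymbol{d}_k. \tag{2.5}$$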
We set
$$\mathsf{u} := (u_1, u_2, u_3), \qquad \mathsf{v} := (v_1, v_2, v_3), \qquad \mathsf{p} := (p_1, p_2, p_3), \qquad \mathsf{w} := (w_1, w_2, w_3). \tag{2.6}$$
u and v are the strain variables corresponding to the motion (2.1). (The strains
u1 and u2 measure flexure, u3 measures torsion, v1 and v2 measure shear, and v3
measures dilatation.)
In the configuration at time t, the resultant contact force and contact couple
exerted by the material of (s, 1] on the material of [0, s] (for 0 < s ≤ 1) are
respectively denoted n(s, t) and m(s, t). Provided that there are no body forces or
couples, the equations of motion have the form
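$$\boldsymbol{n}_s + \boldsymbol{f} = \rho A\, \boldsymbol{r}_{tt}, \tag{2.7}$$
$$\boldsymbol{m}_s + \boldsymbol{r}_s \times \boldsymbol{n} + \boldsymbol{l} = \partial_t(\rho J_{pq} w_q \boldsymbol{d}_p) =: \partial_t(\rho\boldsymbol{J} \cdot \boldsymbol{w}) \tag{2.8}$$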
where (ρA)(s) is the prescribed positive mass density per reference length at s, and
the $(\rho J_{\gamma\delta})(s)$, γ, δ = 1, 2, are the prescribed components of the positive-definite
symmetric 2 × 2 matrix of mass-moments of inertia of the section s. The positive-definite
symmetric 3 × 3 matrix $\rho\mathsf{J} := (\rho J_{pq})$ is defined by $\rho J_{\gamma 3} = \rho J_{3\gamma} = 0$,
$\rho J_{33} = \rho J_{\gamma\gamma}$, and $\rho\boldsymbol{J} := \rho J_{pq}\, \boldsymbol{d}_p \boldsymbol{d}_q$.
Let
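$$m_k := \boldsymbol{m} \cdot \boldsymbol{d}_k, \qquad \mathsf{m} := (m_1, m_2, m_3), \qquad n_k := \boldsymbol{n} \cdot \boldsymbol{d}_k, \qquad \mathsf{n} := (n_1, n_2, n_3). \tag{2.9}$$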
m1 and m2 are the bending couples, m3 is the twisting couple, n1 and n2 are the
shear forces, and n · rs /|rs | is the tension.
The rod is elastic if there are constitutive functions
$$(\mathsf{u}, \mathsf{v}, s) \mapsto \hat{\mathsf{m}}(\mathsf{u}, \mathsf{v}, s),\ \hat{\mathsf{n}}(\mathsf{u}, \mathsf{v}, s) \tag{2.10a}$$
such that
$$\mathsf{m}(s, t) = \hat{\mathsf{m}}(\mathsf{u}(s, t), \mathsf{v}(s, t), s), \quad \text{etc.} \tag{2.10b}$$
This form of the constitutive equations ensures that the material behavior is invariant under rigid motions.
For any function [0, 1] × R ∋ (s, t) → z(s, t) we define $z^{s,t}(\sigma, \tau) := z(s - \sigma, t - \tau)$
for all σ such that s − σ ∈ [0, 1] and for all τ ≤ t. The general constitutive
equations for a rod whose response at (s, t) depends nonlocally on other material
points (sections) of the rod and depends upon the past history have the form
$$\mathsf{m}(s, t) = \hat{\mathsf{m}}(\mathsf{u}^{s,t}, \mathsf{v}^{s,t}, s), \quad \text{etc.} \tag{2.11}$$
These very general constitutive equations are likewise invariant under rigid motions.
We now recast our governing partial differential equations as a vectorial system
of first order in the time derivative. Equations (2.3)–(2.5), (2.10), (2.11) imply that
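$$
\begin{aligned}
\partial_t \boldsymbol{d}_k &= \boldsymbol{w} \times \boldsymbol{d}_k, && \text{(2.12a)}\\
\boldsymbol{u}_t &= \boldsymbol{w}_s - \boldsymbol{u} \times \boldsymbol{w}, && \text{(2.12b)}\\
\boldsymbol{v}_t &= \boldsymbol{p}_s, && \text{(2.12c)}\\
\partial_t(\rho\boldsymbol{J} \cdot \boldsymbol{w}) &= \partial_s(\hat{m}_k \boldsymbol{d}_k) + \boldsymbol{v} \times \hat{n}_k \boldsymbol{d}_k, && \text{(2.12d)}\\
\rho A\, \boldsymbol{p}_t &= \partial_s(\hat{n}_k \boldsymbol{d}_k) && \text{(2.12e)}
\end{aligned}
$$
where the arguments of $\hat{m}_k$ and $\hat{n}_k$ are
$$\boldsymbol{u} \cdot \boldsymbol{d}_l, \quad \boldsymbol{v} \cdot \boldsymbol{d}_l, \quad \boldsymbol{w}_s \cdot \boldsymbol{d}_l, \quad \boldsymbol{p}_s \cdot \boldsymbol{d}_l - (\boldsymbol{w} \times \boldsymbol{v}) \cdot \boldsymbol{d}_l, \quad s. \tag{2.12f}$$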
Note that the ordinary differential equation (2.12a) preserves the dot products dk ·dl
and therefore ensures that {dk (s, t)} is an orthonormal basis for all s, t if {dk (s, 0)}
is an orthonormal basis for all s. By the continuation theory for ordinary differential equations, the linearity of (2.12a) and the constancy of d1 · d1 , . . . imply
that the solutions of initial-value problems for (2.12a) are defined for all t. The
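system (2.12) is hyperbolic if $(\hat{\mathsf{m}}, \hat{\mathsf{n}})$ satisfies the monotonicity condition that the
matrix
$$\frac{\partial(\hat{\mathsf{m}}, \hat{\mathsf{n}})}{\partial(\mathsf{u}, \mathsf{v})} \quad \text{is positive-definite.} \tag{2.13}$$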
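The remark above that (2.12a) preserves the dot products d_k · d_l can be checked numerically. The sketch below (an illustration, not the paper's argument) evaluates the time derivative of every dot product for randomly chosen vectors and a random w; it vanishes because (w × a) · b + a · (w × b) = 0 for any a, b. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = rng.normal(size=(3, 3))   # rows d_1, d_2, d_3; they need not be orthonormal here
w = rng.normal(size=3)

d_t = np.cross(w, d)          # row k is w x d_k, the right-hand side of (2.12a)

# Entry (k, l) is d/dt (d_k . d_l) = (w x d_k) . d_l + d_k . (w x d_l).
gram_rate = d_t @ d.T + d @ d_t.T
dot_products_conserved = np.allclose(gram_rate, 0.0, atol=1e-12)
```

Since every dot product is constant along solutions, a basis that is orthonormal at the initial time stays orthonormal.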
If we take the componential version of (2.12) with respect to the basis {dk }, we
can uncouple the equation (2.12a) for the dk from the remaining equations: Let eklm
denote the alternating symbol. Then (2.12a) has the componential form
$$\partial_t \boldsymbol{d}_k = e_{kij} w_j \boldsymbol{d}_i, \tag{2.14}$$
and system (2.12b–f) is equivalent to
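$$
\begin{aligned}
\partial_t u_i &= \partial_s w_i - e_{ijk} w_j u_k, && \text{(2.15a)}\\
\partial_t v_i &= \partial_s p_i + e_{ijk}(u_j p_k - w_j v_k), && \text{(2.15b)}\\
\partial_t(\rho J_{ij} w_j) &= \partial_s \hat{m}_i + e_{ijk}(u_j \hat{m}_k + v_j \hat{n}_k - w_j \rho J_{kq} w_q), && \text{(2.15c)}\\
\partial_t(\rho A p_i) &= \partial_s \hat{n}_i + e_{ijk}(u_j \hat{n}_k - w_j \rho A p_k), && \text{(2.15d)}
\end{aligned}
$$
where the arguments of $\hat{m}_k$ and $\hat{n}_k$ are
$$\mathsf{u}, \quad \mathsf{v}, \quad \mathsf{w}_s - \mathsf{w} \times \mathsf{u}\ (= \mathsf{u}_t), \quad \mathsf{p}_s + \mathsf{u} \times \mathsf{p} - \mathsf{w} \times \mathsf{v}\ (= \mathsf{v}_t), \quad s. \tag{2.15e}$$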
We can write this system in the compact form
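$$
\begin{aligned}
\mathsf{u}_t &= \mathsf{w}_s - \mathsf{w} \times \mathsf{u}, && \text{(2.16a)}\\
\mathsf{v}_t &= \mathsf{p}_s + \mathsf{u} \times \mathsf{p} - \mathsf{w} \times \mathsf{v}, && \text{(2.16b)}\\
\partial_t(\rho\mathsf{J} \cdot \mathsf{w}) \equiv \rho\mathsf{J} \cdot \mathsf{w}_t &= \partial_s \hat{\mathsf{m}} + \mathsf{u} \times \hat{\mathsf{m}} + \mathsf{v} \times \hat{\mathsf{n}} - \mathsf{w} \times (\rho\mathsf{J} \cdot \mathsf{w}), && \text{(2.16c)}\\
\partial_t(\rho A \mathsf{p}) \equiv \rho A\, \mathsf{p}_t &= \partial_s \hat{\mathsf{n}} + \mathsf{u} \times \hat{\mathsf{n}} - \mathsf{w} \times (\rho A \mathsf{p}). && \text{(2.16d)}
\end{aligned}
$$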
(Even though (2.16) is independent of the dk , boundary conditions, which we do
not study, may not be.)
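The passage from the index form (2.15) to the compact form (2.16) rests on the identity e_ijk a_j b_k = (a × b)_i for triples. A minimal sketch checking this for the right-hand sides of the two compatibility equations, with randomly generated triples standing in for the fields and their s-derivatives (the names `w_s` and `p_s` are placeholders for those derivatives, not computed from any actual motion):

```python
import numpy as np

def levi_civita():
    """Alternating symbol e_ijk as a 3x3x3 array."""
    e = np.zeros((3, 3, 3))
    e[0, 1, 2] = e[1, 2, 0] = e[2, 0, 1] = 1.0
    e[0, 2, 1] = e[2, 1, 0] = e[1, 0, 2] = -1.0
    return e

e = levi_civita()
rng = np.random.default_rng(2)
u, v, w, p, w_s, p_s = rng.normal(size=(6, 3))

# Right-hand side of (2.15a) versus that of (2.16a).
index_a = w_s - np.einsum('ijk,j,k->i', e, w, u)
cross_a = w_s - np.cross(w, u)

# Right-hand side of (2.15b) versus that of (2.16b).
index_b = p_s + np.einsum('ijk,j,k->i', e, u, p) - np.einsum('ijk,j,k->i', e, w, v)
cross_b = p_s + np.cross(u, p) - np.cross(w, v)

forms_agree = np.allclose(index_a, cross_a) and np.allclose(index_b, cross_b)
```

The momentum equations (2.15c,d) and (2.16c,d) can be compared in the same way.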
The systems (2.12) and (2.16) are in a general conservation form. It is important
to note that our original system (2.3)–(2.5), (2.10), (2.11) of governing equations,
the first-order vectorial system (2.12), and the first-order componential system
(2.16) are each equivalent.
3. Artificial Viscosity
We now modify system (2.16), which we identify with (1.1), by adding an artificial
viscosity D · us , where, for simplicity, we take D to be a constant positive-definite
diagonal matrix. In particular, let U, V, W, P be constant positive-definite diagonal
3×3 matrices. Then the modification of (2.16) with artificial viscosity has the form
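$$
\begin{aligned}
\mathsf{u}_t &= \mathsf{w}_s - \mathsf{w} \times \mathsf{u} + \mathsf{U} \cdot \mathsf{u}_{ss}, && \text{(3.1a)}\\
\mathsf{v}_t &= \mathsf{p}_s + \mathsf{u} \times \mathsf{p} - \mathsf{w} \times \mathsf{v} + \mathsf{V} \cdot \mathsf{v}_{ss}, && \text{(3.1b)}\\
\rho\mathsf{J} \cdot \mathsf{w}_t &= \partial_s \hat{\mathsf{m}} + \mathsf{u} \times \hat{\mathsf{m}} + \mathsf{v} \times \hat{\mathsf{n}} - \mathsf{w} \times (\rho\mathsf{J} \cdot \mathsf{w}) + \mathsf{W} \cdot (\rho\mathsf{J} \cdot \mathsf{w}_s)_s, && \text{(3.1c)}\\
\rho A\, \mathsf{p}_t &= \partial_s \hat{\mathsf{n}} + \mathsf{u} \times \hat{\mathsf{n}} - \mathsf{w} \times (\rho A \mathsf{p}) + \mathsf{P} \cdot (\rho A \mathsf{p}_s)_s. && \text{(3.1d)}
\end{aligned}
$$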
Suppose we were to decompose u and v with respect to {dk } as above, but
decompose w and p with respect to a Cartesian basis {ik }. It is a straightforward
exercise to show that the modification of the resulting system with artificial viscosity as in (3.1) is not equivalent to (3.1) (and therefore these two systems could have
solutions with very different properties). This is a portent of some of the difficulties
we must overcome.
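The mechanism behind this non-equivalence can be seen in a toy computation: taking second s-derivatives of components does not commute with a change to an s-dependent basis. The sketch below is an illustration under simplifying assumptions (a vector field that is constant in space, a frame rotating about the z-axis at rate `KAPPA`, and an identity viscosity matrix); it is not the full system (3.1).

```python
import numpy as np

KAPPA = 2.0
c = np.array([1.0, 0.5, -0.3])   # Cartesian components of a spatially constant vector

def frame(s):
    """Rows are d_1, d_2, d_3: a frame rotating about the z-axis at rate KAPPA."""
    co, si = np.cos(KAPPA * s), np.sin(KAPPA * s)
    return np.array([[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]])

def moving_components(s):
    """Components of the constant vector with respect to the rotating frame."""
    return frame(s) @ c

def second_derivative(f, s, h=1e-4):
    """Central second difference of a triple-valued function of s."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / h**2

s0 = 0.3
cartesian_term = second_derivative(lambda s: c, s0)       # vanishes identically
moving_term = second_derivative(moving_components, s0)    # does not vanish
cartesian_term_in_moving_basis = frame(s0) @ cartesian_term
mc = moving_components(s0)
```

Here the "viscous" term built from Cartesian components is zero, while the one built from components in the rotating basis equals −KAPPA² times the in-plane components, so the two modified equations differ.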
Modification of the momentum equations. We would like the artificial viscosity
terms in (3.1c,d) to represent a material dissipation, which would regularize the
behavior of solutions. Thus they should modify the constitutive functions in these
two equations. If these modified constitutive functions are to be invariant under
rigid motions, then the remarks surrounding (2.11) imply that W · ρJ · ws and
P · ps should depend solely (although possibly nonlocally in space and time) on
u, v. It follows from (2.5), which gives the actual kinematical relations, that these
viscosities lack the requisite form. It is also clear from (2.11) how to rectify this
deficiency. A particularly simple way to do this is to drop W · ρJ · ws and P · ps
from the right-hand sides of (3.1c) and (3.1d) and to replace $(\hat{\mathsf{m}}, \hat{\mathsf{n}})$ of (2.10b) with
$(\hat{\mathsf{m}} + \mathsf{M} \cdot \mathsf{u}_t,\ \hat{\mathsf{n}} + \mathsf{N} \cdot \mathsf{v}_t)$, where M, N are constant positive-definite diagonal 3 × 3
matrices. Then in place of (3.1c,d) we obtain
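$$\rho\mathsf{J} \cdot \mathsf{w}_t = \partial_s \hat{\mathsf{m}} + \mathsf{M} \cdot \mathsf{u}_{st} + \mathsf{u} \times (\hat{\mathsf{m}} + \mathsf{M} \cdot \mathsf{u}_t) + \mathsf{v} \times (\hat{\mathsf{n}} + \mathsf{N} \cdot \mathsf{v}_t) - \mathsf{w} \times (\rho\mathsf{J} \cdot \mathsf{w}), \tag{3.2}$$
$$\rho A\, \mathsf{p}_t = \partial_s \hat{\mathsf{n}} + \mathsf{N} \cdot \mathsf{v}_{st} + \mathsf{u} \times (\hat{\mathsf{n}} + \mathsf{N} \cdot \mathsf{v}_t) - \mathsf{w} \times (\rho A \mathsf{p}). \tag{3.3}$$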
These equations are equivalent to the following modifications of (2.12d,e):
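$$(\rho\boldsymbol{J} \cdot \boldsymbol{w})_t = \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + M_{kl}\, \partial_t u_l] \boldsymbol{d}_k\}_s + \boldsymbol{v} \times [\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\, \partial_t v_l] \boldsymbol{d}_k, \tag{3.4}$$
$$\rho A\, \boldsymbol{p}_t = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\, \partial_t v_l] \boldsymbol{d}_k\}_s. \tag{3.5}$$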
Modification of the compatibility equations. We now examine modifications like
(3.1a,b) of the compatibility equations (2.16a,b). Since such modifications are critical for numerical methods, we cannot avoid studying them by simply setting
U = O = V, as we would do in the analysis of the differential equations for
viscoelastic rods of strain-rate type. We first have to frame a notion of invariance
for modified compatibility equations, which come from purely kinematic considerations and have no intrinsic material significance. We define an invariant system
with artificial viscosity to be a system of equations with single time-derivatives on
the left-hand side and with each equation containing a dissipative term such that
it is equivalent to the system consisting of momentum equations (2.7), (2.8) and
constitutive equations of the invariant form (2.11). Thus to study this issue, we have
to reconstitute the governing system of equations of motion in their traditional form
involving second time-derivatives from a suitable modification of (2.16) involving
first time-derivatives. In the process of constructing an invariant modification of
(2.16a,b) we show that (3.1a,b) are not invariant.
Rather than modifying (3.1a,b), it is more convenient to modify (2.12a–c). Let
U and V be tensors whose matrices with respect to the basis {dk } are U and V. We
first consider a modification of (2.12c) in the form
$$\boldsymbol{v}_t = \boldsymbol{p}_s + [(\rho A)^{-1} \boldsymbol{V} \cdot \boldsymbol{v}_s]_s + [(\rho A)^{-1} \boldsymbol{\eta}]_s \tag{3.6}$$
where η is a function at our disposal to make this equation invariant. Since this
equation sets a t-derivative equal to an s-derivative on a simply-connected domain,
there is a vector-valued potential, which is convenient not only to denote as r but
also to treat as r, such that
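$$\boldsymbol{r}_t = \boldsymbol{p} + (\rho A)^{-1} \boldsymbol{V} \cdot \boldsymbol{v}_s + (\rho A)^{-1} \boldsymbol{\eta}, \qquad \boldsymbol{r}_s = \boldsymbol{v}. \tag{3.7}$$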
The treatment of the analogous modification of (2.12b) is a little trickier: We
seek a ξ so that
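$$\boldsymbol{u}_t = [\boldsymbol{w} + (\rho\boldsymbol{J})^{-1} \cdot (\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})]_s - \boldsymbol{u} \times [\boldsymbol{w} + (\rho\boldsymbol{J})^{-1} \cdot (\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})] \tag{3.8}$$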
is invariant. Although this equation does not have the form of a t-derivative equaling an s-derivative, it does have the form of (2.5)2 . We therefore conclude that there
is a triple of vectors, which is convenient to denote as {dk }, such that
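$$\partial_t \boldsymbol{d}_k = [\boldsymbol{w} + (\rho\boldsymbol{J})^{-1} \cdot (\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})] \times \boldsymbol{d}_k, \qquad \partial_s \boldsymbol{d}_k = \boldsymbol{u} \times \boldsymbol{d}_k. \tag{3.9}$$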
Since each of these equations conserves the dot products {dk · dl }, we see that these
{dk } are orthonormal if they are orthonormal at the initial time. Note that (3.9)1 is
a modification of (2.12a).
We replace p in the modified momentum equation (3.5) with its expression
coming from (3.7) and we now identify the dk appearing there with the new vectors
satisfying (3.9). We obtain
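$$\rho A\, \boldsymbol{r}_{tt} = (\boldsymbol{V} \cdot \boldsymbol{v}_s)_t + \boldsymbol{\eta}_t + \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\, \partial_t v_l] \boldsymbol{d}_k\}_s. \tag{3.10}$$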
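The substitution leading to (3.10) can be written out as follows. This is a sketch of the intermediate step, assuming (as stated after (2.8)) that ρA depends on s only; the hatted functions have the arguments (u, v) as in (3.5).

```latex
% Solve (3.7)_1 for p and differentiate with respect to t (rho A is independent of t):
\boldsymbol{p} = \boldsymbol{r}_t - (\rho A)^{-1}\,(\boldsymbol{V}\cdot\boldsymbol{v}_s + \boldsymbol{\eta})
\quad\Longrightarrow\quad
\rho A\,\boldsymbol{p}_t = \rho A\,\boldsymbol{r}_{tt} - (\boldsymbol{V}\cdot\boldsymbol{v}_s)_t - \boldsymbol{\eta}_t .
% Insert this into the modified momentum equation (3.5):
\rho A\,\boldsymbol{r}_{tt} - (\boldsymbol{V}\cdot\boldsymbol{v}_s)_t - \boldsymbol{\eta}_t
  = \{[\hat{n}_k + N_{kl}\,\partial_t v_l]\,\boldsymbol{d}_k\}_s ,
% and move the two extra terms to the right-hand side to obtain (3.10).
```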
For this equation to have the requisite invariance, the first two terms on the right-hand
side must have the form $[n_k^+ \boldsymbol{d}_k]_s$ where the $n_k^+$ depend (possibly nonlocally)
on (u, v). (The first term on the right-hand side lacks this form because the time-differentiation
of the base vectors given by (3.9)1 introduces w-terms not of this
form.) We make a particularly simple choice of $n_k^+$ by choosing η so that
$$(\boldsymbol{V} \cdot \boldsymbol{v}_s)_t + \boldsymbol{\eta}_t = (V_{kl}\, \partial_t v_l\, \boldsymbol{d}_k)_s. \tag{3.11}$$
The choice (3.11) gives (3.10) an invariant form:
$$\rho A\, \boldsymbol{r}_{tt} = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\, \partial_t v_l] \boldsymbol{d}_k\}_s \tag{3.12}$$
(where we use (3.9) to compute the derivatives of the dk ). We differentiate (3.6)
with respect to t and insert (3.11) into the resulting equation to get
$$\boldsymbol{v}_{tt} = \boldsymbol{p}_{st} + [(\rho A)^{-1} (V_{kl}\, \partial_t v_l\, \boldsymbol{d}_k)_s]_s \tag{3.13}$$
where again we use (3.9) to compute the derivatives of the dk .
The principal part of the partial differential operator acting on v in this equation
is
$$\boldsymbol{v}_{tt} - \boldsymbol{V} \cdot [(\rho A)^{-1} \boldsymbol{v}_{st}]_s. \tag{3.14}$$
It is just a vectorial version of the heat operator on vt . It is responsible for the
dissipativity and it gives this equation a parabolic character and gives the entire
system a parabolic-hyperbolic character; see Zheng [6]. Note that the dk in (3.13)
depend upon ξ , which has not yet been fixed.
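To illustrate the dissipative character of (3.14), consider a scalar caricature: with z standing for one component of vt and a constant `nu` standing for V/(ρA), the principal part reduces to the heat equation z_t = nu z_ss. The sketch below (an assumption-laden toy model with homogeneous boundary values, not the rod system itself) advances a single sine mode with an explicit finite-difference scheme and compares it with the exact exponential decay.

```python
import numpy as np

def heat_step(z, nu, ds, dt):
    """One explicit Euler step of z_t = nu * z_ss with fixed end values."""
    z_new = z.copy()
    z_new[1:-1] += nu * dt / ds**2 * (z[2:] - 2.0 * z[1:-1] + z[:-2])
    return z_new

nu = 0.05
s = np.linspace(0.0, 1.0, 101)
ds = s[1] - s[0]
dt = 0.25 * ds**2 / nu          # satisfies the stability bound nu*dt/ds**2 <= 1/2
steps = int(round(1.0 / dt))

z = np.sin(np.pi * s)           # initial profile of the "strain rate"
for _ in range(steps):
    z = heat_step(z, nu, ds, dt)

t_final = steps * dt
z_exact = np.exp(-nu * np.pi**2 * t_final) * np.sin(np.pi * s)
max_error = np.max(np.abs(z - z_exact))
```

The amplitude decays at the rate nu π², and higher modes decay faster, which is the smoothing effect the text attributes to the heat operator acting on vt.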
We now make (3.8) invariant by the same process by which we made (3.6)
invariant. Since (3.8) comes from (2.12b) by replacing w with w + (ρJ )−1 · (U ·
us + ξ ), we make this replacement in (3.4), use the modified constitutive functions
from (3.12), and interpret the dk in (3.4) as the new dk :
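$$(\rho\boldsymbol{J} \cdot \boldsymbol{w})_t = (\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})_t + \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + M_{kl}\, \partial_t u_l] \boldsymbol{d}_k\}_s + \boldsymbol{v} \times [\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\, \partial_t v_l] \boldsymbol{d}_k. \tag{3.15}$$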
To put this into invariant form, we simply choose ξ so that
$$(\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})_t = [U_{kl}\, \partial_t u_l\, \boldsymbol{d}_k]_s. \tag{3.16}$$
Thus (3.15) becomes the invariant
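$$(\rho\boldsymbol{J} \cdot \boldsymbol{w})_t = \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + (M_{kl} + U_{kl})\, \partial_t u_l] \boldsymbol{d}_k\}_s + \boldsymbol{v} \times [\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\, \partial_t v_l] \boldsymbol{d}_k, \tag{3.17}$$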
where once again we use (3.9) to compute the derivatives of the dk .
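Equation (3.16) may be regarded as a linear ordinary differential equation for ξ:
$$
\begin{aligned}
\boldsymbol{\xi}_t &= U_{kl}[(\partial_t u_l\, \boldsymbol{d}_k)_s - (\partial_s u_l\, \boldsymbol{d}_k)_t]\\
&\equiv U_{kl}\{\partial_t u_l\, \boldsymbol{u} - \partial_s u_l\, [\boldsymbol{w} + (\rho\boldsymbol{J})^{-1} \cdot (\boldsymbol{U} \cdot \boldsymbol{u}_s + \boldsymbol{\xi})]\} \times \boldsymbol{d}_k.
\end{aligned}
\tag{3.18}
$$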
(Had we replaced ξ by ρJ · ξ in (3.8), then the left-hand side of (3.18) would
be (ρJ · ξ )t ≡ ρJkl (ξl dk )t , and (3.18) would not be linear in the components of
ξ because the t-derivative of dk in this expression would introduce another factor
involving the ξl ).
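For the reader's convenience, the step from (3.16) to (3.18) can be sketched as follows, assuming the U_kl are constant and using (3.9) for the derivatives of the base vectors.

```latex
% Because \partial_s d_l = u \times d_l, we have u_s = (\partial_s u_l) d_l + u \times u = (\partial_s u_l) d_l, so
\boldsymbol{U}\cdot\boldsymbol{u}_s = U_{kl}\,\partial_s u_l\,\boldsymbol{d}_k .
% Hence (3.16) gives
\boldsymbol{\xi}_t
  = U_{kl}\big[(\partial_t u_l\,\boldsymbol{d}_k)_s - (\partial_s u_l\,\boldsymbol{d}_k)_t\big]
  = U_{kl}\big[\partial_t u_l\,\partial_s\boldsymbol{d}_k - \partial_s u_l\,\partial_t\boldsymbol{d}_k\big],
% since the mixed derivatives \partial_s\partial_t u_l cancel. Inserting (3.9):
\partial_s\boldsymbol{d}_k = \boldsymbol{u}\times\boldsymbol{d}_k, \qquad
\partial_t\boldsymbol{d}_k = \big[\boldsymbol{w} + (\rho\boldsymbol{J})^{-1}\cdot(\boldsymbol{U}\cdot\boldsymbol{u}_s + \boldsymbol{\xi})\big]\times\boldsymbol{d}_k ,
% yields the second line of (3.18), which is affine in \xi with coefficients depending on u and w.
```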
We regard the modified compatibility equations (3.8) and (3.18) as a system for
u and ξ . We cannot eliminate ξ from (3.8) as we did η from (3.6) in getting (3.13).
(We may regard ξ as an internal variable.) Nevertheless, if we differentiate (3.8)
with respect to t and then substitute (3.16) into the resulting equation where possible, we obtain an equation for which the principal part of the partial differential
operator acting on u is
$$\boldsymbol{u}_{tt} - (\rho\boldsymbol{J})^{-1} \cdot \boldsymbol{U} \cdot \boldsymbol{u}_{sst}, \tag{3.19}$$
which has the same character as (3.14).
4. Comments
It is clear that we could have replaced all the positive-definite diagonal matrices
appearing in the above development with positive-definite symmetric matrices.
(Numerical methods typically do not require even this sophistication.) Indeed,
we could introduce artificial viscosity through nonlinear constitutive laws, which
would just give another source to the quasilinearity of the governing system. In
short, there is no unique way to introduce invariant versions of artificial viscosity.
The following development shows that if we make a slight change in our procedure, then the alternative invariant versions of the momentum equations (3.4), (3.5)
have higher derivatives with respect to s corresponding to constitutive functions
depending on (us , vs ) also. The resulting theories are of strain-gradient type.
Consider the pair (2.12c,e) of equations. Suppose that the modification of (2.12e)
is taken to have the form
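$$\rho A\, \boldsymbol{p}_t = [\hat{n}_k(\mathsf{u}, \mathsf{v}) \boldsymbol{d}_k]_s + \boldsymbol{P} \cdot (\rho A \boldsymbol{p})_{ss} + \boldsymbol{\zeta} \tag{4.1}$$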
where the components of P with the basis {dk } form a constant positive-definite
diagonal matrix and where ζ is at our disposal to make (4.1) invariant. We retain
the modification (3.6) of the compatibility equation and replace the two visible p’s
in (4.1) with the expression coming from (3.7)1 . We choose ηt to control the term
(V · vs )t by again adopting (3.11), thus converting (4.1) to
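$$\rho A\, \boldsymbol{r}_{tt} = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + V_{kl}\, \partial_t v_l] \boldsymbol{d}_k\}_s + \boldsymbol{P} \cdot [\rho A \boldsymbol{r}_t - \boldsymbol{V} \cdot \boldsymbol{v}_s - \boldsymbol{\eta}]_{ss} + \boldsymbol{\zeta} \tag{4.2}$$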
where
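$$\boldsymbol{V}(s, t) \cdot \boldsymbol{v}_s(s, t) + \boldsymbol{\eta}(s, t) = \boldsymbol{V}(s, 0) \cdot \boldsymbol{v}_s(s, 0) + \boldsymbol{\eta}(s, 0) + \partial_s \int_0^t V_{kl}\, \partial_t v_l(s, \tau)\, \boldsymbol{d}_k(s, \tau)\, d\tau \tag{4.3}$$
by (3.11). Since (3.11) only involves the t-derivative of η, we choose the sum of
the first two initial terms on the right-hand side of (4.3) to vanish, whence
$$\boldsymbol{V}(s, t) \cdot \boldsymbol{v}_s(s, t) + \boldsymbol{\eta}(s, t) = \partial_s [V_{kl}\, v_l(s, t)\, \boldsymbol{d}_k(s, t)] - \partial_s \int_0^t V_{kl}\, v_l(s, \tau)\, \partial_t \boldsymbol{d}_k(s, \tau)\, d\tau. \tag{4.4}$$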
We next choose ζ to ensure that the resulting version of (4.1) is invariant, without
annihilating the expression involving P . Omitting details, we find that this equation
involves four s-derivatives of r.
The modifications associated with the imposition of artificial viscosity in hyperbolic conservation laws support numerical methods like the Lax–Friedrichs scheme
and the upwind schemes [4]. As mentioned above, standard versions of these methods, which do not correspond to invariant constitutive functions, can produce serious errors. The introduction of higher derivatives with respect to s represents, in
the language of hyperbolic conservation laws, a dispersive regularization, versions
of which support the Lax–Wendroff and Beam–Warming schemes [4]. The treatment of (4.1) shows that the invariant introduction of artificial viscosity may also
introduce some dispersive effects.
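The link between the Lax–Friedrichs scheme and artificial viscosity can be made concrete for a scalar law u_t = g(u)_s: one Lax–Friedrichs step coincides with a central-difference step for the equation augmented by the viscous term D u_ss with D = Δs²/(2Δt). The sketch below checks this algebraic identity on a periodic grid. The flux g(u) = −u²/2 is an arbitrary illustrative choice, and this is the standard scalar form of the scheme, not one of the invariant modifications constructed in this paper.

```python
import numpy as np

def flux(u):
    """Illustrative flux g(u) for the scalar law u_t = g(u)_s (an assumption)."""
    return -0.5 * u**2

def lax_friedrichs_step(u, ds, dt):
    """Standard Lax-Friedrichs update on a periodic grid."""
    up, um = np.roll(u, -1), np.roll(u, 1)
    return 0.5 * (up + um) + dt / (2.0 * ds) * (flux(up) - flux(um))

def viscous_central_step(u, ds, dt, D):
    """Central-difference update of u_t = g(u)_s + D * u_ss on a periodic grid."""
    up, um = np.roll(u, -1), np.roll(u, 1)
    central = dt / (2.0 * ds) * (flux(up) - flux(um))
    viscosity = dt * D * (up - 2.0 * u + um) / ds**2
    return u + central + viscosity

n = 200
ds = 1.0 / n
dt = 0.4 * ds
s = np.arange(n) * ds
u0 = np.sin(2.0 * np.pi * s)

D_lf = ds**2 / (2.0 * dt)       # viscosity implicitly added by Lax-Friedrichs
lf = lax_friedrichs_step(u0, ds, dt)
vc = viscous_central_step(u0, ds, dt, D_lf)
schemes_agree = np.allclose(lf, vc)
```

Because D_lf is proportional to Δs²/Δt, the implied viscosity tends to zero under mesh refinement at a fixed ratio Δt/Δs.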
Slemrod [5] first associated the artificial viscosity in the equation of mass conservation for 1-dimensional gas dynamics with capillarity. In material coordinates, the
equation for conservation of mass becomes a compatibility equation and capillarity
corresponds to strain-gradient effects for solid mechanics.
It seems likely that the introduction of (properly invariant) artificial viscosity
into compatibility equations gives additional regularity to the solutions. Thus the
analysis of the regularized system should have advantages that more than compensate for the additional complexity.
Acknowledgment
The work reported here was supported in part by grants from the NSF and ARO.
References
1. S.S. Antman, Nonlinear Problems of Elasticity. Springer, New York (1995).
2. S.S. Antman, Physically unacceptable viscous stresses. Z. Angew. Math. Phys. 49 (1998) 980–988.
3. S.S. Antman and J.-G. Liu, Errors in the numerical treatment of hyperbolic conservation laws caused by lack of invariance, in preparation.
4. R.J. LeVeque, Numerical Methods for Conservation Laws. Birkhäuser, Basel (1992).
5. M. Slemrod, Dynamics of first order phase transitions. In: M.E. Gurtin (ed.), Phase Transitions and Material Instabilities in Solids. Academic Press, New York (1984) pp. 163–203.
6. S. Zheng, Nonlinear Parabolic Equations and Hyperbolic-Parabolic Coupled Systems. Longman, New York (1995).
An Average-Stretch Full-Network Model for
Rubber Elasticity
MILLARD F. BEATTY
Department of Engineering Mechanics, University of Nebraska-Lincoln, P.O. Box 910215,
Lexington, KY 40591-0215, U.S.A. E-mail: mbeatty2@unl.edu
Received 24 April 2002; in revised form 14 March 2003
Abstract. Two constitutive models that are based on the classical non-Gaussian, Kuhn–Grün probability distribution function are reviewed. It is shown that all chains of a network cell structure
comprised of a finite number of identical chains in an affine deformation referred to principal axes
may have the same invariant stretch, if and only if the chains are oriented initially along any of eight
directions forming the diagonals of a unit cube. The 4-chain tetrahedral and the 8-chain cubic cell
structures are familiar admissible models having this property. An easy derivation of the constitutive
equation for the Wu and van der Giessen full-network model of initially identical chains arbitrarily
oriented in the undeformed state is presented. The constitutive equations for the neo-Hookean model,
the 3-chain model, and the equivalent 4- and 8-chain models are then derived from the Wu and van
der Giessen equation. The squared chain stretch of an arbitrarily directed chain averaged over a unit
sphere surrounding all chains radiating from a cross-link junction as its center is determined. An
average-stretch, full-network constitutive equation is then derived by approximation of the Wu and
van der Giessen equation. This result, though more general in that no special chain cell morphology
is introduced, is the same as the constitutive equation for the 4- and 8-chain models. Some concluding
remarks on extensions to amended models are presented.
Mathematics Subject Classifications (2000): 74B20, 82D60.
Key words: rubber elasticity, non-Gaussian chains, constitutive equations, full-network models.
Dedicated in memory of my friend and esteemed teacher,
Clifford Ambrose Truesdell III
1. Introduction
Various models of rubber elasticity are based on the non-Gaussian statistical characterization of a network of randomly oriented, perfectly flexible molecular chains
that occupy their most probable configurations in the natural, undeformed state.
These models use as a measure of deformation the change in length of the end-to-end vector r between molecular cross-links; and an affine deformation assumption
relates the chain stretch to the macroscopic stretch of the continuum. The configurational entropy of a chain, and hence its strain energy, is determined by specification of a probability distribution function P (r) that derives from non-Gaussian
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 65–86
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
M.F. BEATTY
statistical theory, the distribution P (r) being a rough measure of the range of variation among the very many possible configurations of a chain [1]. Some early studies
of such distribution functions for freely jointed chains are described by Treloar [2],
the most widely used and simplest of which is due to Kuhn and Grün [3].
The question then focuses on how to model a network of a great many such
chains to obtain a continuum theory that characterizes the mechanical response of
isotropic rubberlike materials in finite extension. Consequently, a number of specific network models have been proposed, including the James–Guth 3-chain [4],
Flory–Rehner 4-chain [5], Arruda–Boyce 8-chain [6], and the Treloar [7], Treloar–
Riding [8], and the Wu–van der Giessen [9] full-network models. The work by Treloar’s group focuses on simple uniaxial [7] and biaxial [8] deformations, whereas
the Wu and van der Giessen [9, 10] result admits general three-dimensional deformation states. All of these non-Gaussian network models use the Kuhn–Grün [3]
probability distribution function, which is essentially a first order approximation
developed from Rayleigh’s exact Fourier integral representation [11]. In an effort
to adjust for inaccuracies introduced through approximations in the Kuhn–Grün
function, Jernigan and Flory [12] proposed an amended form of the distribution
function. This idea was explored recently by Zúñiga and Beatty [13] in a study
of several amended models of rubberlike materials. Details on these models and
several additional references citing various phenomenological models, including
the review article [14], may be found there. The Arruda–Boyce model, however,
has proved to be the most successful in that it is mathematically simpler than
others, it compares most favorably with experiments on the mechanical response
of various elastomers under diverse loading conditions, and it requires determination of only two well-defined material constants. Nevertheless, in comparison with
experimental data, none of these theoretical models predict fully accurate material
response for all deformations studied, the greatest variance generally occurring for
equibiaxial extension.
Here we study the Wu and van der Giessen [9, 10] full-network model. Their
major result is a formidable, three-dimensional integral type of constitutive equation for an incompressible and isotropic, hyperelastic rubberlike material whose
microstructure is characterized by a full-network of initially identical chains randomly oriented in the undeformed state. In applications, however, they admit that
their rule requires time intensive numerical computation, so no specific analytical
results have been obtained. With their result in hand, we seek a simpler but general
constitutive equation for a uniform full-network microstructure. First, we show
that the constitutive equations for the classical neo-Hookean (Gaussian network)
model [2], the James–Guth 3-chain model [4], and the Arruda–Boyce 8-chain
model [6], or the equivalent Wang–Guth 4-chain model [11], may be derived from
the general Wu and van der Giessen equation. It is then shown that the squared
chain stretch of an arbitrarily directed chain averaged over a unit sphere is a certain
function of the first principal invariant I1 (B) of the Cauchy–Green deformation
tensor B. With the aid of this result, a general average-stretch, full-network
constitutive equation valid in every reference frame is obtained by approximation of the
Wu and van der Giessen principal stress-stretch equation. The reduced equation,
though more general in that no specific chain cell morphology is assumed, has precisely the same form as the constitutive equation for the equivalent 4- and 8-chain
models. The same average-stretch procedure may be applied to full-network models characterized by certain amended distribution functions [13], all of which are
thus approximated by the same formal constitutive equation, but each having a
different isotropic elastic response function. It is shown elsewhere [15] that parallel
results may be derived for the back stress tensor in amorphous glassy polymers.
2. Work of Deformation
We begin with a sketch of some relations for uniform non-Gaussian networks of
perfectly flexible chains whose end points occupy their most probable positions in
the reference configuration [1–3, 11, 12]. The non-Gaussian statistical treatment of
a single, freely jointed molecular chain model accounts for the finite extensibility of
the end-to-end chain vector length r up to its ultimate, fully extended chain length
rL ≡ Nl, where N is the number of rigid links, each of length l. In its undeformed
state, the mean end-to-end chain vector length is given by⋆ r0 = √N l [1, 2]. Hence,
the fully extended, chain locking stretch is defined by λL ≡ rL /r0 = √N. It
is useful to define the relative chain stretch λr as the ratio of the current chain
vector length r = rchain to its fully extended length rL . In terms of the chain stretch
λchain ≡ rchain /r0 , we then have
$$\lambda_r = \frac{r_{\text{chain}}}{r_L} = \frac{\lambda_{\text{chain}}}{\sqrt{N}}; \tag{2.1}$$
and hence λr varies from $N^{-1/2}$ in the undeformed state to the value 1 in the
deformed, fully extended state: $N^{-1/2} \le \lambda_r \le 1$, with λr → 0 as N → ∞.
The entropy s ≡ k ln P (r) for a single randomly oriented, freely jointed molecular chain is defined in terms of a probability distribution function P (r) depending
on r, where k is the universal Boltzmann constant [1, 2]. Kuhn and Grün [3] derived an approximate⋆⋆ non-Gaussian expression for P (r) that yields the following
⋆ Recent analysis [16] by computational simulations has shown that as a consequence of annealing, that is, finding the mechanically equilibrated states of network chains, each of several unimodular
networks has a mean undeformed end-to-end vector length that is roughly 10–20% smaller than the
classical root mean square (rms) value r0 = √N l [1, 2]. This revelation, however, is of no concern
in our current study of non-Gaussian networks for which we retain the classical rms value.
⋆⋆ This is essentially the first order approximation in a series representation of the Rayleigh distribution function discussed later in Section 7. See [1, Chapter VIII] for these and other details on the
Kuhn–Grün function for freely jointed chains. An alternative exact derivation of the chain tension
relation f = (1/Nl) dw(λr )/dλr = (kΘ/l)β, concluded without proceeding by way of the entropy
function, is provided by Weiner [17, pp. 244–247] based on the stress ensemble viewpoint. See also
[2, pp. 108–109] and the subsequent discussion there on the non-Gaussian network theory.
configurational entropy for a single, randomly oriented chain,
$$s = k\left[c - N\left(\lambda_r \beta + \ln\frac{\beta}{\sinh\beta}\right)\right], \tag{2.2}$$
wherein c is a constant and
$$\beta \equiv \mathscr{L}^{-1}(\lambda_r) \tag{2.3}$$
is the inverse of the Langevin function L(β). Therefore,
$$\lambda_r = \mathscr{L}(\beta) \equiv \coth\beta - \frac{1}{\beta}, \tag{2.4}$$
where we recall the relative chain stretch (2.1).
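Since the inverse Langevin function in (2.3) has no closed form, it is usually evaluated numerically. The following is a minimal sketch using plain bisection, which works because the Langevin function is increasing; the function names and tolerance are illustrative assumptions, and more refined approximations exist.

```python
import math

def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta of (2.4)."""
    if abs(beta) < 1e-4:
        # Series beta/3 - beta^3/45 avoids cancellation near beta = 0.
        return beta / 3.0 - beta**3 / 45.0
    return 1.0 / math.tanh(beta) - 1.0 / beta

def inverse_langevin(lam_r, tol=1e-12):
    """Solve L(beta) = lam_r for beta by bisection; requires 0 <= lam_r < 1."""
    if not 0.0 <= lam_r < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while langevin(hi) < lam_r:       # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < lam_r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta_half = inverse_langevin(0.5)     # beta corresponding to lambda_r = 0.5
residual = langevin(beta_half) - 0.5
```

As λr approaches 1 the value of β grows without bound, which reflects the finite extensibility of the chain.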
The work of deformation, the strain energy per chain, is given by w = −Θs,
in which Θ is the absolute temperature. Hence, for the Kuhn–Grün chain entropy
function (2.2), the strain energy per chain is determined by
$$w(\lambda_r) = k\Theta N\left(\lambda_r \beta + \ln\frac{\beta}{\sinh\beta}\right) - c^*, \tag{2.5}$$
c∗ being a constant chosen so that the energy vanishes in the undeformed state.
The relative chain stretch in an affine deformation of an arbitrarily directed
chain is readily related to the macroscopic principal stretches of the continuum,
which is considered incompressible. The total strain energy for a full-network
model, however, will depend on the orientation, distribution, and concentration
of all of the chains in the bulk material. We shall return to this farther on. For simplicity, however, this complication may be removed by the introduction of specific
chain cell structures. To characterize these structures, let n denote the chain density,
the number of freely jointed chains per unit volume of the bulk material. Suppose
that the microstructure consists of an assembly of certain unit cells of p chains
initially oriented in p distinct directions emanating from a cross-link and each
having a different relative chain stretch λpr . Assuming that the chain density np for
the pth directed set of chains is the same for each direction, we have np = n/p; and
the distribution of the network of chains is called homogeneous. Therefore, among
n chains per unit volume of a homogeneous distribution, the contribution to the
total strain energy from all chains in the pth direction is (n/p)w(λpr ). Thus, each
of the p distinct chains of the cell contributes an amount of energy wp = w(λpr )
to the total strain energy W , which is given by
$$ W = \frac{n}{p}\sum_{j=1}^{p} w_j. \tag{2.6} $$
Clearly, it is not necessary to include oppositely directed chains of a symmetric
cell structure; these have the same stretch and contribute the same energy, so p
may be replaced by p/2. If all chains in the cell should have the same relative
stretch $\lambda_{jr} = \lambda_r$, then $\sum_{j=1}^{p} w_j = p\,w(\lambda_r)$. In this case, the total strain energy per
unit volume for a homogeneous network of non-Gaussian chains is provided by
W (λr ) = nw(λr ).
(2.7)
We shall see momentarily that the 3-chain and 8-chain cell structures, for example,
are respectively characterized by (2.6) and (2.7).
We recall that the Gaussian (neo-Hookean) theory assumes that a molecular
chain adopts a tight configuration with end-to-end
separation r that is small compared to its fully extended length (i.e., for $\lambda_{\text{chain}} \ll \sqrt{N}$). Therefore, the results for
this model are limited to moderate stretches for which the approximation of (2.3)
is given by
β ≈ 3λr .
(2.8)
It follows that the non-Gaussian, Kuhn–Grün energy function (2.5) reduces to
the Gaussian strain energy function per chain. The variance encountered between
predictions of the Gaussian theory and experiments at moderate strains, therefore, very likely will not be significantly diminished by any non-Gaussian network
model based on the Kuhn–Grün distribution function, as remarked by Treloar and
Riding [8].
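The reduction of (2.5) to the Gaussian energy can be written out explicitly. The sketch below assumes only the leading terms of the small-argument expansions, writes Θ for the absolute temperature, and uses $\lambda_{\text{chain}} = \sqrt{N}\,\lambda_r$ from (2.1); the final form takes c* as the value that makes w vanish at $\lambda_{\text{chain}} = 1$.

```latex
\beta \approx 3\lambda_r, \qquad
\ln\frac{\beta}{\sinh\beta} \approx -\frac{\beta^2}{6} \approx -\frac{3}{2}\lambda_r^2,
\\[4pt]
w(\lambda_r) \approx k\Theta N\left(3\lambda_r^2 - \tfrac{3}{2}\lambda_r^2\right) - c^{*}
  = \tfrac{3}{2}\,k\Theta\,\lambda_{\text{chain}}^2 - c^{*}
  = \tfrac{3}{2}\,k\Theta\left(\lambda_{\text{chain}}^2 - 1\right).
```

The number of links N drops out at this order, which is why the Gaussian theory carries no information about limiting chain extensibility.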
The constitutive equations for several models of interest are sketched below.
First, however, let us note that the strain energy function for an incompressible,
isotropic hyperelastic material is a symmetric function $W = \hat{W}(\lambda_1, \lambda_2, \lambda_3)$ of
the principal stretches λj, subject to the incompressibility constraint λ1 λ2 λ3 = 1.
Then the stress-stretch equations for the principal Cauchy stress components Tj are
provided by
$$ T_j = -p + \lambda_j W_j, \quad j = 1, 2, 3 \ (\text{no sum}), \tag{2.9} $$
where p is an arbitrary pressure and $W_j \equiv \partial\hat{W}/\partial\lambda_j$.
3. The James–Guth 3-Chain Model
The James–Guth 3-chain model [4] considers a homogeneous full-network of n
chains per unit volume oriented along the three mutually orthogonal principal
directions of deformation, all having the same initial (rms) chain vector length
$r_0 = \sqrt{N_3}\,l$. Here N3 denotes the number of rigid links, each of length l, in a
molecular chain between cross-links of the network. In an affine deformation the
current chain vector length in the j th direction is then defined by rj chain ≡ λj r0 ,
where λj denotes the macroscopic principal stretch of the continuum along the j th
principal axis. The corresponding j th relative chain stretch λj r , in accordance with
(2.1), is defined by
$$ \lambda_{jr} \equiv \frac{\lambda_{j\,\text{chain}}}{\lambda_L} = \frac{\lambda_j}{\sqrt{N_3}}, \quad j = 1, 2, 3, \tag{3.1} $$
in which λj chain ≡ rj chain /r0 = λj defines the j th current chain stretch and λL is
the chain locking stretch. Since the 6-chain cell structure is symmetric and the
distribution of chains is homogeneous, the chain density nj for the j th set of
orthogonal chains is the same for each set: nj = n/3, though wj is not. Hence,
from (2.5) and (2.6) in which p = 3, the total strain energy $W = \hat{W}(\lambda_1, \lambda_2, \lambda_3)$
per unit volume for the James–Guth 3-chain network model may be written as
$$ W = \frac{nk\Theta}{3}\,N_3\sum_{j=1}^{3}\left[\beta_j\lambda_{jr} + \ln\frac{\beta_j}{\sinh\beta_j}\right] - c_3, \tag{3.2} $$
in which the constant c3 is chosen so that the strain energy vanishes in the undeformed state; and, by (2.3), βj ≡ L−1 (λj r ). Observing that λj ∂W/∂λj = λj r ∂W/
∂λj r and using (3.2) in (2.9), we obtain the constitutive equation for the James–
Guth 3-chain model in the principal reference system:
$$ T_j = -p + \frac{\mu_0}{3}\,N_3\,\beta_j\lambda_{jr}, \quad j = 1, 2, 3 \ (\text{no sum}), \tag{3.3} $$
wherein µ0 is the shear modulus in the undeformed state:
$$ \mu_0 \equiv nk\Theta. \tag{3.4} $$
The non-Gaussian stress-stretch relations (3.3) are determined by two parameters:
the shear modulus µ0 , and the number of links N3 in a single chain of the 3-chain
network model. The latter controls the stiffening behavior of rubber materials at
large strains and determines the ultimate extensibility of the network.
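To illustrate how (3.3) is used, the sketch below evaluates the axial Cauchy stress in simple extension, eliminating the pressure through the traction-free lateral condition T2 = T3 = 0. It is an illustrative calculation under that loading assumption; the helper `inv_langevin` (a bisection inverse of the Langevin function) and the function name `three_chain_uniaxial_stress` are not from the source.

```python
import math

def inv_langevin(x, tol=1e-12):
    """Bisection solve of coth(b) - 1/b = x for 0 <= x < 1."""
    def langevin(b):
        if b < 1e-2:  # series avoids cancellation near zero
            return b / 3.0 - b**3 / 45.0 + 2.0 * b**5 / 945.0
        return 1.0 / math.tanh(b) - 1.0 / b
    lo, hi = 0.0, 1.0
    while langevin(hi) < x:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def three_chain_uniaxial_stress(stretch, n_links, mu0=1.0):
    """Axial Cauchy stress T1 from (3.3) with lateral stresses T2 = T3 = 0."""
    lam = (stretch, stretch**-0.5, stretch**-0.5)      # incompressible simple extension
    lam_r = [l / math.sqrt(n_links) for l in lam]       # relative chain stretches (3.1)
    s = [mu0 / 3.0 * n_links * inv_langevin(x) * x for x in lam_r]
    return s[0] - s[1]                                  # pressure eliminated via T2 = 0

# Neo-Hookean reference value mu0*(lam^2 - 1/lam) at the same stretch.
neo_hookean = 1.5**2 - 1.0 / 1.5
stiff = three_chain_uniaxial_stress(1.5, 4)       # short chains: strong stiffening
soft = three_chain_uniaxial_stress(1.5, 1.0e4)    # long chains: near-Gaussian response
```

The stretch must stay below the locking value $\sqrt{N_3}$, at which the relative chain stretch reaches 1 and the inverse Langevin function diverges.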
With the aid of the infinite series expansion [2, 7] for L−1 (λj r ) and use of
the Cayley–Hamilton theorem for which $\mathbf{B}^3 = I_1\mathbf{B}^2 - I_2\mathbf{B} + \mathbf{1}$ for an incompressible material, noting that the left Cauchy–Green deformation tensor $\mathbf{B} = \operatorname{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$
in the principal reference system, we see that (3.3) may be cast
in the familiar invariant tensorial form
$$ \mathbf{T} = -p\mathbf{1} + \aleph_1(I_1, I_2; N_3)\,\mathbf{B} + \aleph_2(I_1, I_2; N_3)\,\mathbf{B}^2, \tag{3.5} $$
where the elastic response functions ℵα (I1 , I2 ; N3 ) are defined by the infinite series
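$$
\begin{aligned}
\aleph_1(I_1, I_2; N_3) = \mu_0\Big[\,1 &- \frac{99}{175N_3^2}\,I_2 + \frac{513}{875N_3^3}\,(1 - I_1 I_2) \\
&+ \frac{42039}{67375N_3^4}\,(I_1 - I_1^2 I_2 + I_2^2) + \cdots\Big], \\
\aleph_2(I_1, I_2; N_3) = \mu_0\Big[\,\frac{3}{5N_3} &+ \frac{99}{175N_3^2}\,I_1 + \frac{513}{875N_3^3}\,(I_1^2 - I_2) \\
&+ \frac{42039}{67375N_3^4}\,(I_1^3 - 2I_1 I_2 + 1) + \cdots\Big].
\end{aligned} \tag{3.6}
$$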
A similar series of terms has been absorbed in the arbitrary pressure term which
is simply rewritten as p. It is evident that, except for its invariant tensorial representation in the form (3.5), there is little significant advantage gained over the
principal axis representation (3.3), except possibly for finite element applications.
By (2.8), for small to moderate stretches βj ≈ 3λj r , and (3.3) reduces easily to the
neo-Hookean constitutive model [2]:
T = −p1 + µ0 B.
(3.7)
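For readers who wish to check the coefficients in (3.6), the higher powers of B can be reduced by repeated use of the Cayley–Hamilton relation quoted above. The sketch below assumes only that relation (with I3 = 1) and the standard series of the inverse Langevin function, whose coefficients 3, 9/5, 297/175, 1539/875, 126117/67375 appear divided by 3.

```latex
\mathbf{B}^4 = (I_1^2 - I_2)\,\mathbf{B}^2 + (1 - I_1 I_2)\,\mathbf{B} + I_1\mathbf{1}, \\
\mathbf{B}^5 = (I_1^3 - 2I_1 I_2 + 1)\,\mathbf{B}^2 + (I_1 - I_1^2 I_2 + I_2^2)\,\mathbf{B} + (I_1^2 - I_2)\,\mathbf{1}, \\
\mathbf{T} + p\mathbf{1} = \mu_0\left[\mathbf{B} + \frac{3}{5N_3}\mathbf{B}^2 + \frac{99}{175N_3^2}\mathbf{B}^3
  + \frac{513}{875N_3^3}\mathbf{B}^4 + \frac{42039}{67375N_3^4}\mathbf{B}^5 + \cdots\right].
```

Collecting the coefficients of B and B² gives the two response functions; every multiple of the identity is absorbed into the arbitrary pressure.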
4. The Arruda–Boyce 8-Chain Model
We next develop the constitutive equation for the Arruda–Boyce 8-chain model. We
begin with a different viewpoint and thereby obtain a new auxiliary result.
Let us consider a single, perfectly flexible and freely jointed molecular chain
whose undeformed (rms) chain vector length is $r_0 = l\sqrt{N}$ and whose chain vector
$\mathbf{r}_0 = (X, Y, Z)$ in the reference configuration is directed along the line X = Y = Z
through O in a rectangular Cartesian frame ψ = {O; Ik}. Then, in the undeformed
network, the chain vector has length $r_0 = X\sqrt{3}$ and its direction in ψ
is $\mathbf{m} = (1, 1, 1)/\sqrt{3}$. (See Figure 1(a).) Now suppose that the network is subjected
to an affine deformation in which ψ coincides with the local principal axes. The
corresponding principal stretches are denoted by λj and the squared stretch of a
chain initially oriented in an arbitrary referential direction m is determined by
$$ \lambda_{\text{chain}}^2 = \mathbf{m}\cdot\mathbf{C}\mathbf{m} = \sum_{k=1}^{3} m_k^2\lambda_k^2, \tag{4.1} $$
where C is the right Cauchy–Green deformation tensor. Hence, for our special
single chain model, (4.1) yields the chain stretch
$$ \lambda_{\text{chain}} = \frac{r_{\text{chain}}}{r_0} = \sqrt{\frac{I_1}{3}}, \tag{4.2} $$
in which $I_1 \equiv \lambda_1^2 + \lambda_2^2 + \lambda_3^2$. We recall that I1 is the first principal invariant of
the Cauchy–Green deformation tensor C or B, each being equal to $\operatorname{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$
in its corresponding principal reference system. The same result holds for a chain
whose end point is initially situated along any of the three similar lines −X =
Y = Z; −X = −Y = Z; and X = −Y = Z. Hence, (4.2) is a necessary condition
in order that a chain may be initially oriented along any of the aforementioned eight
directions radiating from O in the principal referential frame ψ.
Let us consider the converse question. What are the orientations of all chains
whose chain stretch is given by (4.2)? Consider any chain C whose end points
in the principal frame ψ are at O and r0 = r0 mk Ik initially, where mk are its
direction cosines. In its deformed state in ψ, the squared stretch of the chain C
with direction m is determined by (4.1), that is, by
$$ \lambda_{\text{chain}}^2 = \lambda_1^2 m_1^2 + \lambda_2^2 m_2^2 + \lambda_3^2 m_3^2. \tag{4.3} $$
Figure 1. Various chain cell models having the same chain stretch. (a) 1-chain structure,
(b) 4-chain tetrahedral structure, (c) 8-chain cubic structure, (d) 4-chain semi-octahedral and
8-chain octahedral structures.
Now suppose that the same network is subjected to an affine deformation with the
same values of the principal stretches as before, but with λ1 and λ2 interchanged.
The chain C now has the squared stretch
$$ \lambda_{\text{chain}}^2 = \lambda_2^2 m_1^2 + \lambda_1^2 m_2^2 + \lambda_3^2 m_3^2 \tag{4.4} $$
for the same initial vector r0 and the same values of the stretches λk . Of course, the
interchange of stretches does not alter (4.2); so, if the chain stretch in both (4.3)
and (4.4) is the invariant chain stretch (4.2), then
$$ (\lambda_1^2 - \lambda_2^2)(m_1^2 - m_2^2) = 0 \tag{4.5} $$
must hold for arbitrary principal stretches λk that satisfy the incompressibility
condition. It follows that $m_1^2 = m_2^2$ must hold initially for the chain C. Clearly,
a relation similar to (4.5) holds when the same network is again subjected to a
deformation with the same values of the principal stretches as before, but with
λ1 and λ3, or λ2 and λ3 interchanged. Consequently, $m_1^2 = m_2^2 = m_3^2 = 1/3$
must hold initially for the chain vector r0 . It thus follows that the only chain
orientations for which (4.2) holds for all deformations with arbitrary principal
stretches λk are chains situated along straight lines through the origin and defined
by $X^2 = Y^2 = Z^2$ in ψ, that is, the four lines described earlier and directed along
the diagonals of a unit cube. We thus have the following auxiliary result.
The chain stretch λchain in an affine deformation with local principal stretches λk
has the invariant form (4.2) if and only if the chain is oriented initially in any of
the eight directions from O along the diagonals of a unit cube.
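The auxiliary result is easy to check numerically. The sketch below evaluates (4.1) for the eight cube-diagonal directions and, for contrast, for a chain along a principal axis; the stretch values are arbitrary illustrative choices and the names are not from the source.

```python
import itertools
import math

def chain_stretch_squared(m, stretches):
    """Squared chain stretch (4.1) for direction cosines m and principal stretches."""
    return sum((mk * lk) ** 2 for mk, lk in zip(m, stretches))

# Arbitrary illustrative stretches; the third follows from incompressibility.
lam1, lam2 = 1.7, 0.8
stretches = (lam1, lam2, 1.0 / (lam1 * lam2))
I1 = sum(l * l for l in stretches)

# The eight cube-diagonal directions (+-1, +-1, +-1)/sqrt(3).
diagonals = [tuple(s / math.sqrt(3.0) for s in signs)
             for signs in itertools.product((1.0, -1.0), repeat=3)]

diagonal_values = [chain_stretch_squared(m, stretches) for m in diagonals]
axis_value = chain_stretch_squared((1.0, 0.0, 0.0), stretches)  # chain along axis 1
```

Every entry of `diagonal_values` equals I1/3 to rounding error, whereas `axis_value` is simply the square of the first principal stretch and so changes when the stretches are permuted.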
As a consequence, we have only a few possible network cell structures having a
finite number of chain orientations for which (4.2) holds. These include the 1-chain
structure, the 4-chain tetrahedral structure, and the 8-chain cubic structure shown
in Figure 1, in which the origin O in Figures 1(b) and (c) is at the center of the
cube. The 4-chain semi-octahedral and 8-chain octahedral structures in Figure 1(d)
are identical to the cubic structure. The 4-chain and 8-chain models, therefore, are
important special members of the larger geometrical class of uniform polyhedra
networks having ν chains, all with initial vector length r0 . In the deformed state,
a uniform polyhedron chain structure is distorted with varying degrees of relative chain stretch among its many chains. The 4- and 8-chain network models,
however, are unique among these. They are the only uniform polyhedra chain
morphologies all of whose chains in the principal reference system have the same
chain stretch (4.2), and hence the same strain energy per chain. Therefore, these
geometrically similar models are said to be isomorphic.⋆
Equation (4.2) is the same as the Arruda–Boyce equation (16) in [6] obtained
for a network model based on eight chains of undeformed vector length $r_0 = l\sqrt{N_8}$
linked at the center of a cube and extending to its eight corners, N8 denoting the
number of chain links⋆⋆ of the 8-chain model. Henceforward, in regards to this
model, we shall write N = N8 and refer to our subsequent constitutive equation as
the Arruda–Boyce 8-chain model.
With (4.2), the relative chain stretch (2.1) is given by
$$ \lambda_r = \frac{\lambda_{\text{chain}}}{\lambda_L} = \sqrt{\frac{I_1}{3N_8}}, \tag{4.6} $$
⋆ This is viewed somewhat differently by Yeoh and Fleming [18]. According to them, the constitutive coincidence of the 4- and 8-chain models arises because the 8-chain model is isomorphic to a
body centered cubic lattice comprised of two tetrahedral diamond sublattices; and the 4-chain model
is isomorphic to a diamond lattice.
⋆⋆ Wu and van der Giessen [9] observed from simple extension data that $N_3 \approx 3N_8$; so, experimentally
determined values for N3 and N8 for the same physical parameter N may vary considerably. See also
the remarks in [13].
where $\lambda_L = \sqrt{N_8}$, the fully extended chain stretch. All eight chains of the symmetric 8-chain network, in an affine deformation of its cubic structure in the principal
frame ψ, have the same relative chain stretch (4.6). We shall assume that the network chain density n is distributed uniformly among these eight (or by symmetry,
four) chain directions. Hence, by (2.5) and (2.7), the total strain energy per unit
volume for the Arruda–Boyce 8-chain network model is
$$ W = \mu_0 N_8\left[\beta\lambda_r + \ln\frac{\beta}{\sinh\beta}\right] - c_8, \tag{4.7} $$
where c8 is a convenient constant, β is defined by (2.3), and we recall (3.4). Therefore, the strain energy for the Arruda–Boyce model is simply a function of the
principal invariant I1 alone. Hence, substitution of (4.7) into (2.9) yields⋆ the constitutive equation for the Arruda–Boyce 8-chain network model [6, 19]:
T = −p1 + ℵ(I1 )B,
(4.8)
where $\mathbf{B} = \operatorname{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$ in the principal reference system of the deformed state
and the material response function ℵ(I1) is defined by
$$ \aleph(I_1) \equiv \frac{\mu_0\beta}{3\lambda_r}, \tag{4.9} $$
with β = L−1 (λr ) and λr given by (4.6).
We see that the invariant 8-chain rule (4.8) with the single elastic response
function (4.9) is far simpler than the corresponding 3-chain result in (3.5). For small
to moderate values of the relative chain stretch (4.6) for which (2.8) holds, (4.9)
yields ℵ(I1 ) = µ0 , a constant; and (4.8) reduces to the neo-Hookean (Gaussian
network) relation (3.7), the same moderate principal stretch relation obtained from
the 3-chain model.
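The single response function (4.9) is straightforward to evaluate. The sketch below computes it from I1 and N8 and can be used to confirm the neo-Hookean limit for long chains. The bisection helper `inv_langevin` and the name `arruda_boyce_response` are illustrative assumptions, not part of the source.

```python
import math

def inv_langevin(x, tol=1e-12):
    """Bisection solve of coth(b) - 1/b = x for 0 <= x < 1."""
    def langevin(b):
        if b < 1e-2:  # series avoids cancellation near zero
            return b / 3.0 - b**3 / 45.0 + 2.0 * b**5 / 945.0
        return 1.0 / math.tanh(b) - 1.0 / b
    lo, hi = 0.0, 1.0
    while langevin(hi) < x:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def arruda_boyce_response(I1, n_links, mu0=1.0):
    """Response function (4.9), with the relative chain stretch taken from (4.6)."""
    lam_r = math.sqrt(I1 / (3.0 * n_links))
    if lam_r >= 1.0:
        raise ValueError("I1 must be below the locking value 3*N8")
    return mu0 * inv_langevin(lam_r) / (3.0 * lam_r)

def arruda_boyce_uniaxial_stress(stretch, n_links, mu0=1.0):
    """Axial Cauchy stress from (4.8) in simple extension with T2 = T3 = 0."""
    I1 = stretch**2 + 2.0 / stretch
    return arruda_boyce_response(I1, n_links, mu0) * (stretch**2 - 1.0 / stretch)
```

Because the response depends on the deformation only through I1, one scalar inversion per material point suffices, in contrast with the three inversions needed for (3.3).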
5. The Wu and van der Giessen Full-Network Model
The development of special network models, such as the 3-chain and 8-chain cell
structures, avoids the more difficult consideration of random chain orientations
studied by Wu and van der Giessen [9, 10]. A much simpler construction of their
major result is presented next, and some basic applications follow.
Let us consider a network cell of randomly oriented molecular chains radiating
from an arbitrarily chosen cross-link junction O. The other end point of each chain
is similarly connected with other randomly oriented network chains. For a homogeneous network, every network chain has the same initial, unstretched end-to-end
⋆ The result (4.8) in terms of principal stretches was first reported without demonstration by Wang
and Guth [11, 13] for the distinct Flory–Rehner 4-chain tetrahedral model [5]. Arruda and Boyce [6]
subsequently derived the same principal stretch result for the cubic 8-chain cell model and studied it
extensively in experiments on a variety of rubber materials. The constitutive equation (4.8), however,
holds in every reference system, a useful property noted in another context in [19]. See also [20].
chain vector length $r_0 = l\sqrt{N}$. Consequently, each chain emanating from O has
its end point on the surface of a sphere S of radius r0 and volume $v = 4\pi r_0^3/3$. The
volume averaged value of the strain energy per chain for the uniform network of
chains enclosed within S is defined by
$$ \langle w\rangle \equiv \frac{1}{v}\int_S w(\lambda_r)\,dv, \tag{5.1} $$
in which we recall the Kuhn–Grün strain energy (2.5) for a single randomly oriented molecular chain. The end point of a typical chain initially at (X, Y, Z) in a
referential Cartesian frame ϕ = {O; Ik} has the spherical coordinates (r0, θ0, φ0)
with $\theta_0 \in [0, \pi]$, $\phi_0 \in [0, 2\pi]$. Hence, the total strain energy $W = n\langle w\rangle$, per unit
referential volume of a uniform, full-network of chain density n, by (5.1), is given
by
$$ W = \frac{n}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} w(\lambda_r)\sin\theta_0\,d\theta_0\,d\phi_0. \tag{5.2} $$
This is equivalent to integrating (5.1) over a unit sphere. Though here derived
differently, (5.2) is the same constitutive equation obtained by Wu and van der
Giessen [9] for an initially homogeneous distribution of a large number of randomly oriented chains, an equation introduced earlier by Treloar and Riding [8].
The initial end-to-end chain vector r0 = r0 m in ϕ has the direction cosines
{mi (θ0 , φ0 )} = (sin θ0 cos φ0 , sin θ0 sin φ0 , cos θ0 );
(5.3)
and the relative stretch λr = λr (θ0 , φ0 ; λkr ) of a randomly oriented chain, defined
by (2.1), is determined by
$$ \lambda_r^2 = \lambda_r^2(\theta_0, \phi_0; \lambda_{kr}) = \sum_{k=1}^{3} m_k^2\lambda_{kr}^2. \tag{5.4} $$
The relative principal stretches λkr are defined by
$$ \lambda_{kr} = \frac{\lambda_k}{\sqrt{N}}, \tag{5.5} $$
as in (3.1). These are independent of the chain orientation angles θ0 , φ0 . Thus, by
(2.9) and (5.2), the Wu and van der Giessen constitutive equation for the Cauchy
stress may be concisely written as
$$ T_k = -p + \aleph_k\lambda_k^2, \quad k = 1, 2, 3 \ (\text{no sum}), \tag{5.6} $$
in which the elastic response functions ℵk ≡ ℵk(λ1, λ2, λ3) are defined by
$$ \aleph_k = \frac{\mu_0}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} \beta(\lambda_r)\,\frac{m_k^2}{\lambda_r}\,\sin\theta_0\,d\theta_0\,d\phi_0, \quad k = 1, 2, 3, \tag{5.7} $$
wherein we recall (5.4) and the shear modulus µ0 is given by (3.4).
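The response functions (5.7) have no closed form and must be integrated numerically. The sketch below uses a plain midpoint rule in the variables u = cos θ0 and φ0; it is one simple quadrature among many and is not the scheme used in [9]. The function names, the grid sizes `n_u` and `n_phi`, and the bisection helper are illustrative assumptions.

```python
import math

def inv_langevin(x, tol=1e-10):
    """Bisection solve of coth(b) - 1/b = x for 0 <= x < 1."""
    def langevin(b):
        if b < 1e-2:  # series avoids cancellation near zero
            return b / 3.0 - b**3 / 45.0 + 2.0 * b**5 / 945.0
        return 1.0 / math.tanh(b) - 1.0 / b
    lo, hi = 0.0, 1.0
    while langevin(hi) < x:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def full_network_response(stretches, n_links, mu0=1.0, n_u=40, n_phi=40):
    """Midpoint-rule estimate of the three response functions in (5.7)."""
    if max(stretches) >= math.sqrt(n_links):
        raise ValueError("largest stretch must stay below the locking stretch sqrt(N)")
    aleph = [0.0, 0.0, 0.0]
    du, dphi = 2.0 / n_u, 2.0 * math.pi / n_phi
    for i in range(n_u):
        u = -1.0 + (i + 0.5) * du          # u = cos(theta0); sin(theta0) dtheta0 -> du
        s = math.sqrt(1.0 - u * u)
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            m = (s * math.cos(phi), s * math.sin(phi), u)    # direction cosines (5.3)
            lam_r = math.sqrt(
                sum((mk * lk) ** 2 for mk, lk in zip(m, stretches)) / n_links)  # (5.4)
            weight = inv_langevin(lam_r) / lam_r * du * dphi
            for k in range(3):
                aleph[k] += weight * m[k] ** 2
    return [mu0 / (4.0 * math.pi) * a for a in aleph]
```

Each quadrature point needs its own inverse Langevin evaluation, which is the source of the computational cost noted for the full-network model.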
One might expect that the full-network model (5.6) should provide better predictions of observed experimental data than other approximate network models.
But Wu and van der Giessen [9] do not find significantly better predictions than
those demonstrated by the 8-chain model, which they attribute to factors other
than its superiority. In applications, however, the formidable constitutive equation (5.6) must be solved by time intensive numerical methods [9]. To get around
this computational difficulty, Wu and van der Giessen introduce an ad hoc phenomenological constitutive equation consisting of an additive mixture of 3-chain and
8-chain constitutive components with coefficients chosen to best fit their general
constitutive equation. They thus use this somewhat simpler composite phenomenological model in numerical applications in which N has the same value for
both contributions. We recall, however, that for the separate models experiments
exhibit distinct values for N3 and N8 . While their composite mixture rule is computationally simpler and easier to use in comparison of model results with test
data, it replaces an amorphous structure with specific chain cell morphologies
and it offers no significant analytical simplicity. (See also [13].) Consequently, we
seek a general, but approximate constitutive equation that may be derived directly
from (5.6). First, however, we shall show that three familiar special cases may be
readily derived from this full-network equation.
5.1. THE GAUSSIAN NETWORK MODEL
The Gaussian network is characterized by the moderate stretch approximation (2.8).
It thus follows that the response functions (5.7) are constants:
$$ \aleph_k = \frac{3\mu_0}{4\pi}\,I_k, \quad k = 1, 2, 3, \tag{5.8} $$
where
$$ I_k \equiv \int_0^{2\pi}\!\!\int_0^{\pi} m_k^2\,\sin\theta_0\,d\theta_0\,d\phi_0. \tag{5.9} $$
With the aid of (5.3) in (5.9), we obtain I1 = I2 = I3 = 4π/3; and hence each
ℵk = µ0 , a constant. Therefore, the full-network model (5.6) yields the familiar
constitutive equation (3.7) for the neo-Hookean, Gaussian network model.
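The three integrals in (5.9) follow from elementary quadratures once the direction cosines (5.3) are inserted; here Ik denotes the integrals of (5.9), not the principal invariants. A worked sketch:

```latex
I_3 = \int_0^{2\pi}\!\!\int_0^{\pi}\cos^2\theta_0\,\sin\theta_0\,d\theta_0\,d\phi_0
    = 2\pi\cdot\frac{2}{3} = \frac{4\pi}{3}, \\
I_1 = \int_0^{2\pi}\cos^2\phi_0\,d\phi_0\int_0^{\pi}\sin^3\theta_0\,d\theta_0
    = \pi\cdot\frac{4}{3} = \frac{4\pi}{3}, \qquad
I_2 = I_1 \ \ (\cos^2\phi_0 \to \sin^2\phi_0), \\
\aleph_k = \frac{3\mu_0}{4\pi}\cdot\frac{4\pi}{3} = \mu_0 .
```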
5.2. THE JAMES–GUTH 3-CHAIN MODEL
The principal stress components for the James–Guth 3-chain model depend only
on the homogeneous distribution of chains with density nk = n/3 in each of the
corresponding three orthogonal principal directions, the chains in opposite directions being equivalent. Therefore, in (5.7) we replace n with n/3; that is, in view
of (3.4), µ0 is replaced with µ0 /3. In addition, because each principal direction
corresponds to a different relative chain stretch, for the kth principal direction
we replace λr with λkr in accordance with (3.1); and each of the three directed
contributions must be taken into account. Because the chains are oriented initially
along the three principal directions Ij , the three chain directors mj = mj k Ik are
the constant vectors m1 = (1, 0, 0), m2 = (0, 1, 0), and m3 = (0, 0, 1), in accord
with (5.3). Consequently, (5.7) is now written as
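$$ \aleph_k = \frac{\mu_0}{12\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\sum_{j=1}^{3}\beta_j\,\frac{m_{jk}^2}{\lambda_{jr}}\,\sin\theta_0\,d\theta_0\,d\phi_0, \quad k = 1, 2, 3, \tag{5.10} $$
in which $\beta_j \equiv \mathcal{L}^{-1}(\lambda_{jr})$. Noting that $m_{jk}$ is the kth component of the jth principal
chain director, we see for k = 1 that $\sum_{j=1}^{3}\beta_j m_{j1}^2/\lambda_{jr} = \beta_1/\lambda_{1r}$, for example. It is
then easily seen that (5.10) yields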
$$ \aleph_k = \frac{\mu_0}{3}\,\frac{\beta_k}{\lambda_{kr}}, \quad k = 1, 2, 3. \tag{5.11} $$
Recalling (3.1), we have $\lambda_k^2 = N_3\lambda_{kr}^2$, and hence with (5.11) we find that the constitutive equation (5.6) reduces to (3.3) for the non-Gaussian, James–Guth 3-chain
model.
5.3. THE ARRUDA–BOYCE 8-CHAIN MODEL
The Arruda–Boyce model consists of a homogeneous distribution of identical
chains with density nk = n/8 in each of the eight chain directions oriented along
the diagonals of a cubic cell, so µ0 is replaced with µ0 /8. (Because chains in
opposite directions are equivalent, actually we need only consider distributions
of chains along the four diagonal directions.) All chains have the same squared
direction cosines $m_k^2$; and all experience the same relative chain stretch in an affine
deformation in the principal frame aligned with the edges of the referential cube.
Therefore, it turns out that the effect of our accounting for these eight identical
contributions in the manner described previously for the 3-chain model is simply
equivalent to our considering a homogeneous distribution of single chains with
chain density n and having the direction triple
$$ \{m_k\} = \frac{(1, 1, 1)}{\sqrt{3}}. \tag{5.12} $$
Then, with (5.5), we see that (5.4) yields $\lambda_r^2 = (\lambda_1^2 + \lambda_2^2 + \lambda_3^2)/(3N_8)$, i.e. the
same relative stretch given in (4.6). Therefore, β(λr )/λr is now independent of the
coordinates θ0 , φ0 ; and hence with the aid of (5.12) the response functions in (5.7)
may be written as
$$ \aleph_k = \frac{\mu_0\beta}{3\lambda_r}. \tag{5.13} $$
We see from (5.13) that all three functions ℵk = ℵ(I1) coincide with (4.9); and
(5.6) thus reduces to (4.8) for the Arruda–Boyce 8-chain model [6]. A parallel argument holds for the distinct Flory–Rehner 4-chain model [5] shown in Figure 1(b)
and again leads to (4.8), first recorded in terms of principal stretches and without
proof by Wang and Guth [11].
6. An Average-Stretch Non-Gaussian Full-Network Model
Let us return to our homogeneous full-network unit of randomly oriented molecular
chains of density n radiating from an arbitrarily chosen cross-link junction at the
origin of a principal reference frame ϕ = {O; Ik }. Each chain diverging from O
has its end point on the surface of a sphere S of radius r0 . In terms of spherical
coordinates, the end point r0 = r0 m of a representative chain has the initial direction cosines given by (5.3) in ϕ. When the continuum is subjected to a deformation
with local principal stretches λk in ϕ, the same end point in an affine deformation
has the end-to-end chain vector length rchain ; and its corresponding squared chain
stretch (4.1) may be written as
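$$ \lambda_{\text{chain}}^2 = \frac{r_{\text{chain}}^2}{r_0^2} = \sin^2\theta_0\,(\lambda_1^2\cos^2\phi_0 + \lambda_2^2\sin^2\phi_0) + \lambda_3^2\cos^2\theta_0. \tag{6.1} $$
Notice that upon forming the ratio $\lambda_r^2 = \lambda_{\text{chain}}^2/N$, the result (6.1) is the same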
as (5.4) in which we recall (5.3) for a chain having an arbitrary orientation in ϕ.
The volume averaged value of the squared stretch of an arbitrarily directed chain
within a sphere S of radius r0 centered at a cross-link junction is defined by
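$$ \hat\lambda_{\text{chain}}^2 = \frac{1}{v}\int_S \lambda_{\text{chain}}^2\,dv. \tag{6.2} $$
In terms of spherical coordinates, (6.2) becomes
$$ \hat\lambda_{\text{chain}}^2 = \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\lambda_{\text{chain}}^2\,\sin\theta_0\,d\theta_0\,d\phi_0, \tag{6.3} $$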
which is equivalent to averaging the squared stretch over a unit sphere at O. Hence,
for the squared stretch (6.1) of a typical arbitrarily directed chain in the full-network structure surrounding the central junction at O, we find eventually from
(6.3) the following simple, invariant relation for the averaged chain stretch⋆:
$$ \hat\lambda_{\text{chain}} = \sqrt{\frac{I_1}{3}}, \tag{6.4} $$
⋆ Note added in proof: I have discovered recently that Kearsley [24] apparently was the first to
have derived this rule. In addition, he shows [24] that the square of the stretch ratio of a material
area element averaged over all orientations of its normal vector is equal to I2/3. A simpler proof
of this result follows readily from the relation $\alpha^2 = I_3\,\mathbf{C}^{-1}\mathbf{m}\cdot\mathbf{m}$, given in [25, Equation (29.2)], in
which α denotes the areal stretch ratio α = da/dA of the deformed material area element da(x) to
its undeformed element dA(X) whose outward directed unit normal vector at X is m. The average
of $\alpha^2$ over a unit sphere with radius vector m is defined by $\hat\alpha^2 = \frac{1}{4\pi}\int_0^{2\pi}\!\int_0^{\pi}\alpha^2\sin\theta_0\,d\theta_0\,d\phi_0$,
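in which $I_1 = \operatorname{tr}\mathbf{B} = \lambda_1^2 + \lambda_2^2 + \lambda_3^2$. The average end-to-end chain vector length is thus
defined by $\hat r_{\text{chain}} \equiv \hat\lambda_{\text{chain}}\,r_0$, where $r_0 = l\sqrt{N}$, as usual. The locking chain length
is $r_L = Nl$, and hence the limiting chain stretch is $\lambda_L = r_L/r_0 = \sqrt{N}$. Therefore,
the mean relative chain stretch $\hat\lambda_r$ is defined by
$$ \hat\lambda_r \equiv \frac{\hat r_{\text{chain}}}{r_L} = \frac{\hat\lambda_{\text{chain}}}{\sqrt{N}} = \sqrt{\frac{I_1}{3N}}. \tag{6.5} $$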
The italicized auxiliary result below (4.5) shows that a single network chain has
the relative chain stretch (6.5) when and only when that chain is specifically oriented along a diagonal of a unit cube. The 4- and 8-chain network models are the
only uniform polyhedra chain morphologies all of whose chains in the principal
reference system have the same relative chain stretch (4.2). The mean result (6.5),
however, is more general in that no specific chain cell morphology is introduced; it
holds in the mean for any single randomly oriented chain.
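The averaging rule (6.4) can be confirmed by direct quadrature of (6.1) over the unit sphere. The sketch below uses a midpoint rule in u = cos θ0 and φ0; the function name and grid sizes are illustrative assumptions.

```python
import math

def mean_squared_chain_stretch(stretches, n_u=200, n_phi=64):
    """Midpoint-rule average of the squared chain stretch (6.1), as in (6.3)."""
    l1, l2, l3 = stretches
    du, dphi = 2.0 / n_u, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_u):
        u = -1.0 + (i + 0.5) * du               # u = cos(theta0)
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            # (6.1) with sin^2(theta0) = 1 - u^2
            in_plane = (l1 * math.cos(phi)) ** 2 + (l2 * math.sin(phi)) ** 2
            total += (1.0 - u * u) * in_plane + (l3 * u) ** 2
    return total * du * dphi / (4.0 * math.pi)

# Arbitrary incompressible stretch state for illustration.
lam = (1.8, 0.9, 1.0 / (1.8 * 0.9))
I1 = sum(l * l for l in lam)
average = mean_squared_chain_stretch(lam)       # close to I1 / 3
```

The quadrature reproduces I1/3 for any choice of principal stretches, which is the content of (6.4): the orientation average depends on the deformation only through the first invariant.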
In the affine deformation, the stretch of some chains in S clearly will exceed the
average value (6.4), while that of others will be smaller, the nature of the macroscopic deformation being captured by the invariant I1 . Recalling that (5.9) with
(5.3) yields the constant value Ik = 4π/3 and introducing the mean value (6.5)
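in (5.7) for the full-network model, we obtain for the elastic response functions ℵk
the approximate values $\hat\aleph_k \equiv \hat\aleph_k(\lambda_1, \lambda_2, \lambda_3)$ defined by the single relation
$$ \hat\aleph_k \equiv \frac{\mu_0\,\beta(\hat\lambda_r)}{3\hat\lambda_r} = \left.\frac{\mu_0\,\beta(\lambda_r)}{3\lambda_r}\right|_{\lambda_r = \hat\lambda_r}. \tag{6.6} $$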
With this estimate in hand, the general constitutive equation (5.6) for our average-stretch, full-network model for rubber elasticity is nicely approximated by the
invariant relation
$$ \mathbf{T} = -p\mathbf{1} + \mu(I_1)\,\mathbf{B}, \tag{6.7} $$
in which, with (6.5), the shear response function $\mu(I_1) \equiv \hat\aleph_k(\lambda_1, \lambda_2, \lambda_3)$ is given
by
$$ \mu(I_1) = \frac{\mu_0\,\beta(\hat\lambda_r)}{3\hat\lambda_r}. \tag{6.8} $$
It is remarkable that equation (6.7) is precisely the same as the constitutive
equation (4.8) for the Arruda–Boyce 8-chain model. The fundamental difference is
that the result holds more generally for an average-stretch, full-network model of
n randomly oriented chains per unit volume; it thus characterizes the amorphous
for all orientations m given by (5.3) in the principal frame of C. It is evident that $\mathbf{C}^{-1}\mathbf{m}\cdot\mathbf{m}$ may
be read from the right-hand side of (6.1) in which the $\lambda_k^2$ are replaced by $\lambda_k^{-2}$, the principal values
of $\mathbf{C}^{-1}$. The subsequent integration for $\hat\alpha^2$, therefore, yields a rule similar to (6.4). In consequence,
we have $\hat\alpha^2 = I_3(\mathbf{C})\,I_1(\mathbf{C}^{-1})/3 = I_2(\mathbf{C})/3$, which is Kearsley’s result.
molecular structure of rubberlike materials. Therefore, it is not necessary to emphasize the heuristic 8-chain cell structure in reference to their result. Equation (6.7)
with (6.8) is simply the Arruda–Boyce constitutive equation for an average-stretch,
full-network of arbitrarily oriented molecular chains.
Alternatively, we may begin with the Kuhn–Grün configurational entropy (2.2)
for a single, freely jointed and randomly oriented molecular chain in which we
introduce the mean relative stretch (6.5) to obtain the mean entropy $\hat s = s(\hat\lambda_r)$, per
chain. The mean strain energy per chain is then given by $\hat w = -\Theta\hat s$ in accord with
(2.5); and (2.7) provides the mean total strain energy $\hat W = n\hat w$ for a homogeneous
full-network of n such chains per unit volume. We thus obtain with (6.5) a mean
strain energy per unit volume $\hat W = W(I_1)$, namely,
$$ W(I_1) = n\,w(\hat\lambda_r) \tag{6.9} $$
depending on only the first principal invariant of B; and, with the aid of (2.9), the
general constitutive equation for the incompressible elastomer is given by (6.7)
with (6.8).
7. A General Non-Gaussian Network Constitutive Equation
The average-stretch procedure may be applied to amended non-Gaussian molecular network models based on the Wang and Guth [11] series developments of
Rayleigh’s exact, but formidable integral formulation of non-Gaussian chain distributions [1, p. 314]. Their relation for N ≫ 1 and r comparable to N, characterizes highly elastic materials and leads to a probability distribution function [11,
equation (2.20)] that depends on only the fractional extension ratio; namely,
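$$ P_N(\lambda_r) = \frac{1}{4\pi N l^3}\left(\frac{2}{\pi N}\right)^{1/2}\left[\frac{\sinh\beta}{\beta\exp(\lambda_r\beta)}\right]^{N} \times \left[\frac{\beta}{\lambda_r}\left(1 - \lambda_r^2 - \frac{2\lambda_r}{\beta}\right)^{-1/2}\left(1 + \frac{q(\lambda_r; N)}{N} + \cdots\right)\right], \tag{7.1} $$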
in which λr and β are defined in (2.1) and (2.3), and q(λr ; N) is a certain function
of λr and the chain parameter N. This development is valid for all $\lambda_r \in [N^{-1/2}, 1]$.
We are mainly interested in ln PN (λr ). We thus see that the first square bracket
in (7.1) gives the Kuhn–Grün distribution that leads to the configurational entropy (2.2) and the strain energy function (2.5). This term together with the first
term within the second square bracket in (7.1) yields the amended Kuhn–Grün
function introduced by Jernigan and Flory [12], and recently studied by Zúñiga and
Beatty [13]. The remaining terms become increasingly negligible for sufficiently
large N. In any such series formulation, however, we shall have some strain energy
function w(λr ), per chain, that depends only on the relative chain stretch; and for
all such models the constitutive equation for a uniform full-network of randomly
oriented chains is given by (5.2). But use of the general distribution in (5.2) further
complicates an already difficult equation. On the other hand, our average-stretch,
full-network model for randomly oriented chains considerably simplifies the general distribution function in that, by (7.1), we always have $P_N(\hat\lambda_r) = P(I_1)$; and for
a homogeneous network of chain density n, we may write the mean strain energy
per unit volume as $\hat W = -nk\Theta\ln P(I_1) = W(I_1)$. The most general form of the
constitutive equation for every model characterized by (7.1) evaluated for $\hat\lambda_r$ is
thus given by (6.7) in which the general shear response function, in accordance
with (2.9), is determined by
$$ \mu(I_1) = \frac{1}{3N\hat\lambda_r}\,\frac{\partial W(\hat\lambda_r)}{\partial\hat\lambda_r}, \tag{7.2} $$
where we recall (6.5).
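As a consistency check, the general rule (7.2) should return the Kuhn–Grün response (6.8) when the energy (4.7) is evaluated at the mean relative stretch. The sketch below assumes that energy form, with c an arbitrary constant, and uses (2.4) to eliminate the term multiplying dβ/dλ̂r.

```latex
W(\hat\lambda_r) = \mu_0 N\left[\hat\lambda_r\beta + \ln\frac{\beta}{\sinh\beta}\right] - c,
  \qquad \beta = \mathcal{L}^{-1}(\hat\lambda_r), \\
\frac{\partial W}{\partial\hat\lambda_r}
  = \mu_0 N\left[\beta + \left(\hat\lambda_r + \frac{1}{\beta} - \coth\beta\right)\frac{d\beta}{d\hat\lambda_r}\right]
  = \mu_0 N\beta, \\
\mu(I_1) = \frac{1}{3N\hat\lambda_r}\,\mu_0 N\beta = \frac{\mu_0\,\beta(\hat\lambda_r)}{3\hat\lambda_r}.
```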
It is seen from (6.3) that as $\lambda_{\text{chain}}^2 \to N$, the greatest value of the squared chain
stretch, $\hat\lambda_{\text{chain}}^2 \to N$ also; and hence $\hat\lambda_r \to 1$. Consequently, from (6.5), in any
given affine deformation the first principal invariant $I_1 \to I_m$, its greatest possible
value determined by the material constant N; that is,
$$ I_m \equiv 3N \tag{7.3} $$
is a material constant reflecting the locking stretch $\lambda_L = \sqrt{N}$ of any randomly
oriented molecular chain and is thus named the network locking constant. It follows
that for an average-stretch model the first principal invariant I1 in every affine
deformation is bounded by the network locking
constant⋆: $3 \le I_1 < I_m$. The
general response function (7.2) with $\hat\lambda_r = \sqrt{I_1/I_m}$ thus involves the two physical
constants µ0 and Im.
The constitutive equation (6.7) associated with (7.2) is valid for all deformations of rubberlike materials; it is an equation for which general solutions of many
boundary-value problems are well known, and for which specific knowledge of
the response function itself is not essential. Therefore, our average-stretch, full-network model for homogeneous non-Gaussian networks of randomly oriented
molecular chains is especially useful in the general formulation and solution of
a great variety of practical problems in finite elasticity. We find that the simplest
of these average-stretch, non-Gaussian full-network models is described by the
Arruda–Boyce constitutive equation (6.7) with shear response function (6.8).
⋆ In principle, specifically, the network locking constant may be determined (actually only estimated)
by a simple uniaxial experiment with limiting uniaxial stretch $\lambda_{\lim}$ at which network chain
locking will occur. We thus have
\[
I_m = 3N = \lambda_{\lim}^2 + \frac{2}{\lambda_{\lim}}.
\tag{7.4}
\]
It follows, as described in [13], that the limiting value $\lambda_{\lim}$ of the greater principal stretch of the
continuum in any other deformation is given by the equation $I_1 = I_m$. The limiting stretch (tension
or compression) in an equibiaxial stretch, for example, is given by the two positive roots of the
equation $2\lambda_{\lim}^6 - I_m\lambda_{\lim}^4 + 1 = 0$. Clearly, the limiting stretch $\lambda_{\lim}$ of the continuum in an affine
deformation is not equal to the molecular chain locking stretch $\lambda_L$.
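The limiting stretches described in this footnote can be computed numerically. The following is a minimal sketch, not the author's procedure: it assumes an incompressible material, a hypothetical value N = 8, and a hand-written bisection helper (`bisect`, `limiting_stretches` and the other names are illustrative only). It solves $I_1(\lambda) = I_m$ for uniaxial and equibiaxial stretch.

```python
import math

def bisect(fn, lo, hi, tol=1e-12, max_iter=200):
    """Root of fn on [lo, hi] by bisection; fn(lo) and fn(hi) must differ in sign."""
    f_lo = fn(lo)
    if f_lo * fn(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def i1_uniaxial(lam):
    # Incompressible uniaxial stretch: principal stretches (lam, lam**-0.5, lam**-0.5).
    return lam**2 + 2.0 / lam

def i1_equibiaxial(lam):
    # Incompressible equibiaxial stretch: principal stretches (lam, lam, lam**-2).
    return 2.0 * lam**2 + lam**-4

def limiting_stretches(i1_of_lam, I_m):
    """Compressive (< 1) and tensile (> 1) roots of I1(lam) = I_m, for I_m > 3."""
    g = lambda lam: i1_of_lam(lam) - I_m
    return bisect(g, 1e-3, 1.0), bisect(g, 1.0, math.sqrt(I_m))

N = 8.0                  # hypothetical number of links per chain
I_m = 3.0 * N            # network locking constant (7.3)
lam_L = math.sqrt(N)     # molecular chain locking stretch
uni_c, uni_t = limiting_stretches(i1_uniaxial, I_m)
equi_c, equi_t = limiting_stretches(i1_equibiaxial, I_m)
print("chain locking stretch:", lam_L)
print("uniaxial limiting stretches:", uni_c, uni_t)
print("equibiaxial limiting stretches:", equi_c, equi_t)
```

For these assumed values the continuum limiting stretches differ from the chain locking stretch $\sqrt{N}$, which is the point made in the last sentence of the footnote.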
8. Conclusion
We have shown that among all uniform polyhedra cell structures, the 4- and 8-chain
models share the unique property that all of their chains in a principal reference
system have the same stretch. Consequently, they are described by the same constitutive equation first derived by Arruda and Boyce [6]. These are special models
within the class of uniform non-Gaussian network models, a general model for
which is provided by the Wu and van der Giessen [9] constitutive equation which
is essentially based on the Treloar–Riding [8] general energy integral for a homogeneous full-network of non-Gaussian chains. All are based on the Kuhn–Grün
distribution function [2, 3]. We have shown that the constitutive equations for
the classical neo-Hookean, the 3-chain, and the 8-chain (hence also the 4-chain)
models may be derived from the Wu and van der Giessen full-network model. It
is also shown that the volume averaged, squared stretch of an arbitrarily directed
chain is the same squared stretch that characterizes the isomorphic 4- and 8-chain
cell models; the difference, however, is that no specific chain cell morphology is required. With this result in hand, an average-stretch constitutive equation is obtained
by approximation from the Wu and van der Giessen equation for uniform networks.
We find that this reduced equation is precisely the same as the Arruda–Boyce [6]
constitutive equation for the 8-chain model; and a similar equation with a different
shear response function may be derived for other models based on amended forms
of the Kuhn–Grün function. The same average-stretch, full-network approximation
is applied directly to obtain from the Kuhn–Grün function an approximate total
strain energy function for a uniform full-network model that leads again to the
Arruda–Boyce constitutive equation.
It is known that experimental data on homogeneous deformations of a variety
of rubberlike materials stand in very good overall agreement with results based on
the Arruda–Boyce constitutive equation [6, 9]; the greatest variance of the model in
comparison with all known data arises in equibiaxial deformation. All data reported
so far, however, focus on homogeneous deformations for which principal axes are
readily identified. It would be most useful and interesting to expand the comparison
of additional experimental data on the same materials with theoretical predictions
based on the Arruda–Boyce constitutive equation, represented in (6.7) and (6.8), for
the torsion of bars, the twisting and inflation of tubes, and the bending of rubber
rods, for example.
9. Endnote: Remarks on Some Related Unpublished Work
A reviewer has pointed out that some parallel results derivable from (5.2) have
been reported in unpublished work by Puso [21]. Puso remarks that the strain
energy functions for both the James–Guth and Arruda–Boyce models may be obtained from the Wu and van der Giessen (actually the Treloar–Riding) equation
(5.2) as Gauss point approximations with six and eight points, respectively. In fact,
no approximations appear necessary. It is customary to assume a priori for these
models that the chains are equally distributed along the three principal axes or along
the four diagonal lines of a cube with edges along principal axes; and, of course,
oppositely directed chains have the same stretch and are assumed equivalent. Thus,
for the 3-chain case, replacing $w(\lambda_r)$ with each directed chain density contribution
$w_k = w(\lambda_k)$ in summing over the unit sphere in (5.2) (opposite chains having the
same stretch) and noting that $(1/(4\pi))\int_0^{2\pi}\!\int_0^{\pi} \sin\theta_0 \, d\theta_0 \, d\phi_0 = 1$, we see that the
total strain energy (5.2) reduces to (2.6) (with $p = 3$) which thus leads to (3.2) for
the James–Guth model. Similarly, for the 8-chain model, replacing $w(\lambda_r)$ with each
$w_k = w(\tilde{\lambda}_r)$ and $\tilde{\lambda}_r = \sqrt{I_1/3N}$, the total energy (5.2) simplifies to (2.7) which thus
yields (4.7) for the Arruda–Boyce model. There is no mention of identical results
for the Flory–Rehner tetrahedral model [5] whose four distinct chains are similarly
oriented along four diagonal lines of a cube, as shown in Figure 1(b), all of which
have the same relative chain stretch $\tilde{\lambda}_r = \sqrt{I_1/3N}$, as shown earlier. Plainly, the
corresponding Cauchy stress for each model must then be obtained separately by
application of (2.9), though Puso prefers the second Piola–Kirchhoff stress representation. I think this is essentially the idea perceived by Puso for derivation
of the aforementioned constitutive equations from the Treloar–Riding total energy
relation (5.2).
A leading objective in [21] is to obtain a certain series approximation of (5.2) to
provide a simplified constitutive equation in terms of stretch invariants that demonstrates improved accuracy over either of the special chain models and which can
be used in finite element formulations of boundary value problems. The model
thereby avoids integration of the Langevin function following use of (2.5) in (5.2).
First, the inverse Langevin function in our current notation is approximated by
$\beta \cong 3\lambda_r/(1 - \lambda_r^3)$, an empirical estimate that exhibits very good graphical comparison
with $\beta = \mathcal{L}^{-1}(\lambda_r)$ up to $\lambda_r = 0.8$, after which it increases faster than the exact
value. This estimate is introduced in the force relation given in our footnote ∗∗,
page 3, which is then integrated with respect to $\lambda_r$ to obtain the strain energy $w(\lambda_r)$,
per chain, as a sum of certain logarithm and inverse tangent functions⋆. The result is
then used in (5.2), still not integrable. The next step in the constitutive formulation
for a chain with stretch $\lambda_{\text{chain}}$ in an arbitrary direction consists of a Taylor series
expansion of $w(\lambda_{\text{chain}}) = w(\lambda_r)$ about the chain stretch $\tilde{\lambda}_{\text{chain}} = \sqrt{I_1/3}$, or the
relative chain stretch $\lambda_r = \sqrt{I_1/3N}$ of the 8-chain model, an interesting idea. In
essence, though not stated in [21], it is assumed that the stretch in an arbitrary
direction varies only slightly from its mean value in (6.4). As may be expected, the
zero order term necessarily, by the assumed construction, leads to the constitutive
⋆ I find, however, that Puso’s integral result for $w(\lambda_r)$, which is to be used in the subsequent
Taylor series expansion of $w(\lambda_r)$, contains an error repeated in subsequent equations related to it.
I have not confirmed all of the subsequent details. The consistent appearance of this error and others,
including incorrect equations for λ2chain and for the second Piola–Kirchhoff stress tensor everywhere,
while troublesome, are probably typographical slips. I note, for example, that Puso reports the correct
value (6.4) obtained from the integral (6.3) that appears in the first order Taylor series term of the
total energy function.
equation for the 8-chain model. By use of Puso’s series estimate, the previously
non-integrable constitutive relation (5.2) in terms of the inverse Langevin function is much simplified to an algebraic constitutive equation having the general
tensorial representation $\mathbf{T} = \mathbf{F}\mathbf{S}\mathbf{F}^T = -p\mathbf{1} + \aleph_1(I_1, I_2)\mathbf{B} + \aleph_2(I_1)\mathbf{B}^2$, based on
the form of S given in [21], the accuracy of which I have not confirmed. Like all
other non-Gaussian and related phenomenological models, this one fails to capture
effects observed in equibiaxial extension tests, though it does better than others
in comparison with uniaxial data. Puso’s series model subsequently is modified to
include effects due to neighboring chain entanglement interactions. The adjusted
model shows very good comparison with equibiaxial data for natural gum rubber
by James et al. [22], and improved though still imperfect and stiffer response in
comparison with similar equibiaxial data by Treloar [23].
Although my construction exhibits some similarities of result, it differs fundamentally in its procedure, objectives, and simplicity. All of the familiar constitutive
equations for the Cauchy stress, including the classical neo-Hookean equation, are
readily deduced directly from the Wu and van der Giessen representation (5.7).
My direct average-stretch approximation of the Wu and van der Giessen equation (5.7) shows that the Arruda–Boyce constitutive equation is equivalent to an
average-stretch, full-network model. Moreover, it is shown that similar constitutive
equations of the type (6.7) with different shear response functions hold for other
models based on amended forms of the Kuhn–Grün function. None of these results
have been reported elsewhere.
Finally, let us recall that the Cauchy stress for a general incompressible, isotropic
hyperelastic material may be written as
\[
\mathbf{T} = -p\mathbf{1} + 2\frac{\partial W}{\partial I_1}\mathbf{B} - 2\frac{\partial W}{\partial I_2}\mathbf{B}^{-1},
\tag{9.1}
\]
in which the strain energy function W = W (I1 , I2 ). Now suppose that the relative
chain stretch in an arbitrary direction of a full-network of randomly oriented, perfectly flexible molecular chains does not stray very far from its mean value (6.5).
We may then obtain from the Treloar–Riding energy functional (5.2) the following
approximate total strain energy function
\[
W(I_1) = n\,w(\hat{\lambda}_r)
\tag{9.2}
\]
in which w(λ̂r ) is the approximate strain energy per chain. For all such models, W = W (I1 ), a function of I1 alone; and (9.1) then has the reduced general
form (6.7) with shear response function (7.2). This is precisely the result obtained
differently in (6.9) based on the mean Kuhn–Grün configurational entropy per
chain. Indeed, when w(λ̂r ) is given by the Kuhn–Grün chain energy function (2.5),
it readily follows that dw(λ̂r )/dλ̂r = Nk β(λ̂r ); and in this case, it is easily seen
from (9.2) that
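\[
2\frac{\partial W}{\partial I_1}
= 2n\,\frac{\partial w(\hat{\lambda}_r)}{\partial(\hat{\lambda}_r^2)}\,\frac{\partial \hat{\lambda}_r^2}{\partial I_1}
= \frac{\mu_0\,\beta(\hat{\lambda}_r)}{3\hat{\lambda}_r} \equiv \mu(I_1).
\tag{9.3}
\]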
Therefore, as mentioned before, (9.1) reduces precisely to the average-stretch, full-network result (6.7), i.e. the Arruda–Boyce constitutive equation for the isomorphic
4- and 8-chain models. Of course, the result (9.3) is more general in that no specific
chain cell morphology is assumed.
The estimate in (9.2), and hence my result (6.9), essentially corresponds to the
lowest order approximation in Puso’s series expansion of w(λr ) about the squared
relative chain stretch of the 8-chain model. The introduction in (9.3) of Puso’s
aforementioned empirical estimate now yields the following approximate shear
response function for an average-stretch, full-network constitutive model:
\[
\hat{\mu}(I_1) = \frac{\mu_0}{1 - \hat{\lambda}_r^3}.
\tag{9.4}
\]
We recall that λ̂r ∈ [0, 1], and hence the finite chain extensibility effect is evident
in (9.4). Indeed, recalling the network locking constant (7.3) in (6.5), we see that
the reduced shear response function (9.4), based on Puso’s estimate, is now given
explicitly in terms of the physical constants µ0 and Im :
\[
\hat{\mu}(I_1) = \frac{\mu_0}{1 - (I_1/I_m)^{3/2}}.
\tag{9.5}
\]
Clearly, the first principal invariant I1 in every affine deformation of this material
is bounded by the network locking constant: I1 ∈ [3, Im ]. The general constitutive
equation (9.1) for the average-stretch model thus simplifies to
\[
\mathbf{T} = -p\mathbf{1} + \mu_0\left[1 - \left(\frac{I_1}{I_m}\right)^{3/2}\right]^{-1}\mathbf{B}.
\tag{9.6}
\]
Though here obtained differently and expressed in altogether different terms, this
simple result is Puso’s first order constitutive equation [21, equation (1.2.27)].
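The empirical estimate and the reduced response function (9.5) are easy to explore numerically. The sketch below is only illustrative: the inverse Langevin function is obtained by a hand-written bisection, and all function names (`langevin`, `inverse_langevin`, `beta_estimate`, `shear_response`, `cauchy_extra_stress`) are assumptions introduced here, not notation from [21].

```python
import math

def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta."""
    if abs(beta) < 1e-4:
        return beta / 3.0  # leading series term; avoids cancellation near zero
    return 1.0 / math.tanh(beta) - 1.0 / beta

def inverse_langevin(lam, tol=1e-12):
    """Numerical inverse of the Langevin function for 0 <= lam < 1 (bisection)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while langevin(hi) < lam:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < lam:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def beta_estimate(lam):
    """Empirical estimate quoted in the text: beta ~ 3*lam / (1 - lam**3)."""
    return 3.0 * lam / (1.0 - lam**3)

def shear_response(I1, mu0, I_m):
    """Reduced shear response function (9.5); requires 3 <= I1 < I_m."""
    if not 3.0 <= I1 < I_m:
        raise ValueError("I1 must satisfy 3 <= I1 < I_m")
    return mu0 / (1.0 - (I1 / I_m) ** 1.5)

def cauchy_extra_stress(B, mu0, I_m):
    """mu_hat(I1) * B from (9.6); the pressure term -p*1 is fixed by boundary conditions."""
    I1 = B[0][0] + B[1][1] + B[2][2]
    mu = shear_response(I1, mu0, I_m)
    return [[mu * B[i][j] for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    for lam in (0.2, 0.5, 0.8, 0.9):
        print(lam, inverse_langevin(lam), beta_estimate(lam))
```

Printing the two columns side by side gives a quick way to judge, for any stretch of interest, how closely the algebraic estimate tracks the numerically inverted Langevin function.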
Acknowledgment
I thank two anonymous reviewers for helpful comments and for pointing out the
additional references [19] and [21] previously unknown to me.
References
1. P.J. Flory, Statistical Mechanics of Chain Molecules. Hanser Publishers, New York (1988)
   Chapter 1.
2. L.R.G. Treloar, The Physics of Rubber Elasticity, 3rd edn. Clarendon Press, Oxford (1975).
3. W. Kuhn and F. Grün, Beziehungen zwischen elastischen Konstanten und Dehnungsdoppelbrechung
   hochelastischer Stoffe. Kolloid-Z. 101 (1942) 248–271.
4. H.M. James and E. Guth, Theory of the elastic properties of rubber. J. Chem. Phys. 10 (1943)
   455–481.
5. P.J. Flory and J. Rehner, Statistical mechanics of cross-linked polymer networks: I. Rubber
   elasticity. J. Chem. Phys. 11 (1943) 512–520.
6. E.M. Arruda and M.C. Boyce, A three-dimensional constitutive model for the large stretch
   behavior of rubber elastic materials. J. Mech. Phys. Solids 41 (1993) 389–412.
7. L.R.G. Treloar, The photoelastic properties of short-chain molecular networks. Trans. Faraday
   Soc. 50 (1954) 881–896.
8. L.R.G. Treloar and G. Riding, A non-Gaussian theory for rubber in biaxial strain. I Mechanical
   properties. Proc. Roy. Soc. London A 369 (1979) 261–280.
9. P.D. Wu and E. van der Giessen, On improved network models for rubber elasticity and their
   application to orientation hardening in glassy polymers. J. Mech. Phys. Solids 41 (1993)
   427–456.
10. P.D. Wu and E. van der Giessen, On improved 3-D non-Gaussian network models for rubber
    elasticity. Mech. Res. Comm. 19 (1992) 427–433.
11. M.C. Wang and E. Guth, Statistical theory of networks of non-Gaussian flexible chains.
    J. Chem. Phys. 20 (1952) 1144–1157.
12. R.L. Jernigan and P.J. Flory, Distribution functions for chain molecules. J. Chem. Phys. 50
    (1969) 4185–4200.
13. A.E. Zúñiga and M.F. Beatty, Constitutive equations for amended non-Gaussian network
    models of rubber elasticity. Internat. J. Engrg. Sci. 40 (2002) 2265–2294.
14. M.C. Boyce and E.M. Arruda, Constitutive models of rubber elasticity: A review. Rubber
    Chem. Technol. 73 (2000) 504–523.
15. M.F. Beatty, Constitutive equations for the back stress in amorphous glassy polymers. Math.
    Mech. Solids, to appear.
16. P.R. von Lockette and E.M. Arruda, Computational annealing of simulated unimodal and
    bimodal networks. Comput. Theoret. Polymer Sci. 11 (2001) 415–428.
17. J.H. Weiner, Statistical Mechanics of Elasticity. Wiley, New York (1983).
18. O.H. Yeoh and P.D. Fleming, Constitutive modeling of the large strain time-dependent behavior
    of elastomers. J. Polymer Sci. B: Polymer Phys. 35 (1997) 1919–1931.
19. S. Socrate and M.C. Boyce, Micromechanics of toughened polycarbonate. J. Mech. Phys. Solids
    48 (2000) 233–273.
20. A.E. Zúñiga and M.F. Beatty, A new phenomenological model for stress softening in
    elastomers. Z. Angew. Math. Phys. 53 (2002) 794–814.
21. M.A. Puso, Mechanistic constitutive models for rubber elasticity and viscoelasticity. Doctoral
    dissertation, University of California, Davis (1994) 124 pages.
22. A.G. James, A. Green and G.M. Simpson, Strain energy functions of rubber: I. Characterization
    of gum vulcanizates. J. Appl. Polymer Sci. 19 (1975) 2033–2058.
23. L.R.G. Treloar, Stress–strain data for vulcanized rubber under various types of deformation.
    Trans. Faraday Soc. 40 (1944) 59–70.
24. E.A. Kearsely, Strain invariants expressed as average stretches. J. Rheology 33 (1989) 757–760.
25. C. Truesdell and R. Toupin, The Classical Field Theories. Flügge’s Handbuch der Physik,
    Vol. III/1. Springer, Berlin (1960).
From 3-D Nonlinear Elasticity Theory to 1-D Bars
with Nonconvex Energy
MICHELE BUONSANTI1 and GIANNI ROYER-CARFAGNI2,⋆
1 Department Mechanics and Materials, University Mediterranea, Reggio Calabria,
I-89026 Reggio Calabria, Italy
2 Department of Civil-Environmental Engineering and Architecture, University of Parma,
Parco Area delle Scienze 181/a, I-43100 Parma, Italy. E-mail: gianni.royer@unipr.it
Received 22 July 2002; in revised form 7 May 2003
Abstract. This paper represents a first attempt to derive one-dimensional models with non-convex
strain energy starting from “genuine” three-dimensional, nonlinear, compressible, elasticity theory.
Following the usual method of obtaining beam theories, we show here for a constrained kinematics
appropriate for long cylinders governed by a polyconvex, objective, stored energy function, that the
bar model originally proposed by Ericksen [3] is obtainable but enriched by an additional term in
the strain gradient. This term, characteristic of nonsimple grade-2 materials, penalizes interfacial
energies and makes single-interface two-phase solutions preferred. The resulting model has been
proposed by a number of authors to describe the phenomenon of necking and cold drawing in
polymeric fibers and, here, we discuss its suitability to interpret also the elastic-plastic behavior
of metallic tensile bars under monotone loading.
Mathematics Subject Classifications (2000): 74A45, 74B20, 74G55, 74G65, 74N20.
Key words: nonconvex energy, nonlinear elasticity, polyconvexity, nonsimple materials, necking,
polymer, cold-drawing, plasticity.
1. Introduction
Different-in-type material responses, such as ductile [1] or brittle [2], can be interpreted with a variational approach. In one-dimension, simple models with nonconvex stored energy, of which Ericksen’s [3] is perhaps the most cited example,
represent an extension to solids of the Van der Waals idea, which applies to a surprisingly wide range of materials. Theories of this kind, as more deeply considered
by Dunn and Fosdick [4], allow for stress- and deformation-induced phase transitions and they predict discontinuous strain fields in reasonable agreement with
experimental observations.
The basic difficulty in 1-D models of the kind given in [3, 4] is that phase-rearrangements are equienergetic and, consequently, minimizing strain fields are
in general highly non-unique. In elastic fluids, where a similar problem occurs
when the energy is assumed to be a nonconvex function of the density, several
⋆ Corresponding author.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 87–100.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
M. BUONSANTI AND G. ROYER-CARFAGNI
important results concerning the interfacial energy between phases were obtained
by Cahn and Hilliard [5] by introducing a further dependence of the energy upon
the density gradient. A number of authors have attempted to solve the congenital
indeterminacy for solids by similarly changing the energy functional. For example,
Carr et al. [6, 7] proposed a grade-2 model for one-dimensional bars, in which
the energy is nonconvex in the strain and quadratic in the strain gradient. Their
proposal is a particular case of a more general theory earlier advanced by Coleman
for thin polymeric fibers [8]. Such models represent the natural extension of the
Cahn and Hilliard idea and, in substance, predict that configurations with least energy are the two-phase single-interface solutions. Consequently, once the average
elongation of the bar is known, the minimizing strain field is uniquely determined
modulo reversal.
The aim of this paper is to discuss the soundness of models of this kind apart
from any agreement with experimental evidence and to exhibit their consistency
with 3-D nonlinear elasticity theory (respecting in particular classical requirements,
such as objectivity). The method is similar to the traditional approach to beam theory using a three-dimensional parent theory, i.e., one that assumes a restricted kinematics such as is common to the Bernoulli–Navier or Timoshenko hypotheses [9].
Here, it is shown that for particular choices of polyconvex objective strain energies,
the corresponding minimization problem, in a restricted class of kinematically constrained deformation fields, naturally leads to the aforementioned 1-D strain-gradient models.
This idea has already been pursued by Coleman and Newman, who showed
that the 1-D model in [8] for polymeric fibers can be obtained starting from an ad
hoc incompressible three dimensional elasticity parent theory [10, 11]. The main
difference between their work and ours consists in the choice of the elastic potential
for the 3-D parent theory. Here, we do not consider an incompressible material and,
indeed, it is the material reaction against volume changes that yields the nonconvex
(though polyconvex) character of the elastic potential; this is maintained in the one-dimensional reduction. Whether the nonconvexity of the 1-D reduced model is a
strict consequence of the assumed kinematical constraints, or is a general property
independent of any simplification, remains an open question at this stage, not even
discussed in [10, 11]. We conjecture that this conclusion holds true even under
weaker kinematical restrictions and for more general elastic potentials, but the
consequent analytical complications call for a numerical approach, which will be
considered in a further work.
In any case, despite drastic simplifications, the present analysis shows that bar
models with nonconvex energy may be naturally deduced from the most classical theories for three-dimensional compressible elastic bodies. The resulting one-dimensional stored energy function, nonconvex in the axial strain and quadratic
in the strain gradient, coincides with the earlier proposals of Carr et al. [6, 7] and
Coleman and Newman [8, 10–12]. The second-order term, characteristic of nonsimple grade-2 materials, penalizes interfaces between heterogeneous phases since
NONLINEAR ELASTICITY AND 1-D BARS WITH NONCONVEX ENERGY
they must be separated by a transition zone. This is characteristic of the well-known
phenomenon of necking in polymers and their cold drawing for increasing average
elongation. Moreover, we recognize that also the behavior of other materials, in
particular the yielding of metallic bars, can be interpreted using the same model;
this bears on the original idea of Müller and Villaggio [1].
2. The Variational Problem under Kinematical Constraints
Let us consider a straight bar with constant cross section and length L. A reference
orthogonal system (x, y, z), with associated unit vectors i, j, k, is introduced so
that the z-axis is parallel to the bar longitudinal axis and passes through the centroid
of its cross section. Let $B = \Omega \times (0, L) \subset \mathbb{R}^3$ denote the undistorted natural
configuration of the body, where $\Omega \subset \mathbb{R}^2$ is the domain representative of the cross
section. A deformation is defined through the mapping $\mathbf{y}(\mathbf{x}): B \to \mathbb{R}^3$ which
satisfies the usual hypotheses, i.e., regularity, injectivity, $\det \nabla\mathbf{y} > 0$.
The bar is made of a simple hyperelastic material defined through the strain
potential W (∇y): Lin+ → R, supposed to comply with the material objectivity
requirements and with the well-known polyconvexity condition of Ball [13]. Thus,
W takes the form
\[
W(\nabla\mathbf{y}) = g(\nabla\mathbf{y}, \operatorname{adj}\nabla\mathbf{y}, \det\nabla\mathbf{y}),
\tag{2.1}
\]
with $g(\cdot, \cdot, \cdot)$ convex in each term. Such a requirement is crucial to demonstrate
existence theorems in nonlinear elasticity theory [13], since it implies the lower-semicontinuity
of the energy functional. Indeed, lower-semicontinuity is assured by
the less-restrictive quasi-convexity condition of Morrey. But this weaker hypothesis
guarantees existence in general only if growth conditions for $W$ are assumed [13]
that prohibit any singular behavior, such as the physical requirement
\[
\det\nabla\mathbf{y} \to 0^+ \;\Rightarrow\; W(\nabla\mathbf{y}) \to +\infty.
\tag{2.2}
\]
The particular class of stored energy functions W here considered, usually referred
to as the Blatz–Ko potential, is given by
\[
W(\nabla\mathbf{y}) = a\left[\,|\nabla\mathbf{y}|^2 + \varphi(\det\nabla\mathbf{y})\right],
\tag{2.3}
\]
where a > 0 is a constant and ϕ is a smooth convex function such that ϕ(det ∇y) →
+∞ as det ∇y → 0+ or det ∇y → +∞. Clearly, W of (2.3) is polyconvex
and (2.2) is satisfied. It should be noticed, however, that W is nonconvex in ∇y.
Indeed, any hypothesis of convexity would imply serious physical inconsistencies, such as either failure of material objectivity and of (2.2) or unacceptable
monotonicity of forces [14].
For this class of stored energy functions, the Piola–Kirchhoff stress, defined by
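$\mathbf{S} \equiv W_{\nabla\mathbf{y}}$, has the form
\[
\mathbf{S} = a\left[2\nabla\mathbf{y} + \varphi'(\det\nabla\mathbf{y})\,\nabla\mathbf{y}^{-T}\right].
\tag{2.4}
\]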
The assumed condition that the undistorted reference configuration is natural, i.e.,
$\mathbf{S}|_{\nabla\mathbf{y}=\mathbf{I}} = \mathbf{0}$, implies the further requirement
\[
\varphi'(1) = -2.
\tag{2.5}
\]
Fosdick and Royer-Carfagni [15] have recently considered potentials of this kind
to exhibit possible material instabilities.
Let us consider a particular kinematics for B, defined through a class of deformations where a particle at x = xi + yj + zk is mapped to
\[
\mathbf{y} = \left[(1 + \nu)x - \nu x\,\partial_z f(x, y, z)\right]\mathbf{i}
 + \left[(1 + \nu)y - \nu y\,\partial_z f(x, y, z)\right]\mathbf{j}
 + f(x, y, z)\,\mathbf{k},
\tag{2.6}
\]
where f : B → R is a function to be determined. Notice that when the bar is uniformly stretched longitudinally so that f,z = const = 1 + e, it contracts laterally
in the ratio (1 − νe) : 1, so that ν may be referred to as the coefficient of lateral
contraction.
When the bar is extended in a hard-loading device, from (2.3) and (2.6) its
energy takes the form
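\[
\begin{aligned}
E[f] = \int_B \Bigl\{\, & a\Bigl[\,2(1 + \nu - \nu f_{,z})^2 - 2\nu(1 + \nu - \nu f_{,z})\,\partial_z(x f_{,x} + y f_{,y}) \\
 & \quad + f_{,x}^2 + f_{,y}^2 + f_{,z}^2 + \nu^2(x^2 + y^2)\bigl(f_{,zx}^2 + f_{,zy}^2 + f_{,zz}^2\bigr)\Bigr] \\
 & + a\,\varphi\Bigl(f_{,z}(1 + \nu - \nu f_{,z})^2 + \nu(1 + \nu - \nu f_{,z}) \\
 & \qquad\quad \times \bigl[f_{,zz}(x f_{,x} + y f_{,y}) - f_{,z}\,\partial_z(x f_{,x} + y f_{,y})\bigr]\Bigr) \Bigr\}\, dV .
\end{aligned}
\tag{2.7}
\]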
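The two ingredients of (2.7), namely $|\nabla\mathbf{y}|^2$ and $\det\nabla\mathbf{y}$ for the kinematics (2.6), can be checked numerically. The sketch below is an illustration under stated assumptions: the quadratic field in `f_parts`, the sample point, and the value of ν are hypothetical choices, and the deformation gradient is approximated by central differences rather than derived symbolically.

```python
def f_parts(x, y, z):
    """Hypothetical quadratic test field f and the partial derivatives used below."""
    return {
        "f": 1.2 * z + 0.1 * z * z + 0.05 * x * z + 0.02 * y * y,
        "fx": 0.05 * z,
        "fy": 0.04 * y,
        "fz": 1.2 + 0.2 * z + 0.05 * x,
        "fzx": 0.05,
        "fzy": 0.0,
        "fzz": 0.2,
        # d/dz (x f_x + y f_y) = x f_xz + y f_yz
        "dz_xfx_yfy": 0.05 * x,
    }

def deform(p, nu):
    """Map (2.6): y = [(1+nu)x - nu x f_z] i + [(1+nu)y - nu y f_z] j + f k."""
    x, y, z = p
    d = f_parts(x, y, z)
    s = 1.0 + nu - nu * d["fz"]  # common lateral stretch factor
    return (s * x, s * y, d["f"])

def numerical_gradient(p, nu, h=1e-4):
    """Central-difference approximation of the deformation gradient, F[i][k] = dy_i/dx_k."""
    cols = []
    for k in range(3):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        yp, ym = deform(plus, nu), deform(minus, nu)
        cols.append([(yp[i] - ym[i]) / (2.0 * h) for i in range(3)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def frobenius_sq(m):
    return sum(m[i][j] ** 2 for i in range(3) for j in range(3))

def det_closed_form(p, nu):
    """Argument of phi in (2.7)."""
    x, y, z = p
    d = f_parts(x, y, z)
    s = 1.0 + nu - nu * d["fz"]
    r = x * d["fx"] + y * d["fy"]
    return d["fz"] * s * s + nu * s * (d["fzz"] * r - d["fz"] * d["dz_xfx_yfy"])

def frobenius_sq_closed_form(p, nu):
    """Quadratic part of the integrand in (2.7), i.e. |grad y|^2."""
    x, y, z = p
    d = f_parts(x, y, z)
    s = 1.0 + nu - nu * d["fz"]
    second = d["fzx"] ** 2 + d["fzy"] ** 2 + d["fzz"] ** 2
    return (2.0 * s * s - 2.0 * nu * s * d["dz_xfx_yfy"]
            + d["fx"] ** 2 + d["fy"] ** 2 + d["fz"] ** 2
            + nu * nu * (x * x + y * y) * second)

p, nu = (0.3, -0.2, 0.5), 0.3  # hypothetical sample point and contraction coefficient
F = numerical_gradient(p, nu)
print(det3(F), det_closed_form(p, nu))
print(frobenius_sq(F), frobenius_sq_closed_form(p, nu))
```

Because the test field is quadratic, the central differences are exact up to round-off, so the printed pairs should agree to many digits.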
The associated variational problem is the following:
\[
\min_{f \in A} E[f],
\tag{2.8}
\]
where the class $A$ of admissible functions $f$ is characterized by conditions
\[
f(x, y, 0) = 0, \qquad f(x, y, L) = \beta L.
\tag{2.9}
\]
The parameter β defines the average stretch.
3. The Particular Case of Null Lateral Contraction
In order to analyze the variational problem (2.7)–(2.9), it is helpful to consider first
a preliminary case when the coefficient of lateral contraction ν is equal to zero.
Under this condition, the energy (2.7) reduces to
\[
E_0[f] = \int_B a\left[2 + f_{,x}^2 + f_{,y}^2 + f_{,z}^2 + \varphi(f_{,z})\right] dV .
\tag{3.1}
\]
For boundary conditions of the type (2.9), it can be easily seen that any minimizing
field $f^*$ must enjoy the property $f^*_{,x} = f^*_{,y} = 0$, i.e., there is no warping of the bar
cross sections. Moreover $f^*(x, y, \cdot) \equiv w^*(\cdot)$ $\forall (x, y)$, where $w^*: (0, L) \subset \mathbb{R} \to \mathbb{R}$
is a solution of the auxiliary one-dimensional problem
\[
\min_{w \in \hat{A}} E_0[w], \qquad \hat{A} = \{\, w \mid w(0) = 0,\ w(L) = \beta L \,\},
\tag{3.2}
\]
with
\[
E_0[w] = \int_0^L \psi(w'(z))\, dz, \qquad
\psi(w'(z)) = aA\left[2 + (w'(z))^2 + \varphi(w'(z))\right].
\tag{3.3}
\]
Here, $A$ is the bar cross-sectional area.
In fact, for any $f(x, y, z): B \subset \mathbb{R}^3 \to \mathbb{R}$, let $w_f(z): (0, L) \subset \mathbb{R} \to \mathbb{R}$
represent its restriction to the z-axis. Clearly $E_0[w_f] \le E_0[f]$, where the equal
sign holds iff $f_{,x} = f_{,y} = 0$. On the other hand, if $w^*$ solves (3.2), then
$E_0[w^*] \le E_0[w_f]$. Let us then define $f^*(x, y, z)$ as the extension of $w^*(z)$ such that
$f^*(x, y, \cdot) \equiv w^*(\cdot)$ $\forall (x, y)$, that is $f^*_{,x} = f^*_{,y} = 0$. We can therefore write
\[
E_0[f^*] = E_0[w^*] \le E_0[w_f] \le E_0[f], \qquad \forall f \in A.
\tag{3.4}
\]
In words, the terms f,x and f,y in (3.1) produce the coupling in the response of
fibers parallel to the z axis, which must deform to the same extent at any cross
section in order to minimize the energy.
This finding is important if the function ψ(·) in (3.3) is not convex. In this
situation, it has been clear since Ericksen’s analysis [3] that, for β in a given range,
two-phase solutions are highly non-unique, since phase-rearrangements are energetically equivalent. If the longitudinal fibers of the bar behaved independently of
one another, phase rearrangements would be completely arbitrary for each one of
them. On the other hand, the effect of the coupling terms f,x and f,y in (3.1) implies
the organization of phases in layers at right angles to the bar axis. In conclusion,
the phase-rearrangement indeterminacy still persists in the case ν = 0, but only
longitudinally and not transversally.
However, a substantial remark should be made at this point. If ϕ(·) is convex,
as is assumed in Section 2 to assure polyconvexity of the stored energy, from (3.3)
also ψ(·) will be convex. Consequently, it is well known that for any β the unique
solution of (3.2) is the trivial one, where the longitudinal strain is uniform in the
whole bar. Evidently, the assumption ν = 0 is too drastic and rules out a wide range
of interesting cases. In other words, necking due to lateral contraction should play
a crucial role.
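The statement that a convex ψ admits only the uniform solution can be traced to Jensen's inequality. As a short sketch, assuming only that ψ is convex and that $w$ is admissible for (3.2), so that its mean strain equals β:

```latex
\[
\frac{1}{L}\int_0^L \psi\bigl(w'(z)\bigr)\,dz
\;\ge\; \psi\!\left(\frac{1}{L}\int_0^L w'(z)\,dz\right)
= \psi\!\left(\frac{w(L) - w(0)}{L}\right)
= \psi(\beta).
\]
```

Hence $E_0[w] \ge L\,\psi(\beta)$, the value attained by the uniform field $w(z) = \beta z$. Since ψ in (3.3) contains the strictly convex term $(w')^2$ in addition to the convex φ, equality holds only for that field.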
4. The Bernoulli–Navier Case
Let us now consider for (2.7)–(2.9) a further kinematical hypothesis, similar to
that of Bernoulli–Navier for beams, i.e., the deformation preserves the flatness of
transverse planes normal to the cylinder axis. This is equivalent to assuming that
f (x, y, z) in (2.6) does not depend upon x and y but, for convenience, we do
not change notation and simply denote with f = f (z) the new corresponding
function and with f ′ (z) its derivative. A restricted kinematics of this kind was
earlier considered by Coleman and Newman [10, 11] in their derivation of the one-dimensional theory [8] as an approximation for the response of a three-dimensional
long fiber made of polymeric incompressible elastic material.
The total energy for B becomes
\[
E[f] = \int_B a\left[\,2(1 + \nu - \nu f')^2 + f'^2 + \nu^2 f''^2 (x^2 + y^2)
 + \varphi\bigl(f'(1 + \nu - \nu f')^2\bigr)\right] dV ,
\tag{4.1}
\]
which can be integrated in the cross section, to give
\[
E[f] = aA \int_0^L \left[\Phi(f') + \frac{1}{2}\kappa f''^2\right] dz.
\tag{4.2}
\]
Here, the function $\Phi: \mathbb{R} \to \mathbb{R}$, from (4.1), takes the form
\[
\Phi(t) = 2(1 + \nu - \nu t)^2 + t^2 + \varphi\bigl(t(1 + \nu - \nu t)^2\bigr),
\tag{4.3}
\]
while the constant κ is equal to
\[
\kappa = \frac{2\nu^2}{A} I_0,
\tag{4.4}
\]
having denoted with $I_0$ the second order polar moment of inertia of the bar cross-sectional area.
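To give a feel for the size of the gradient coefficient in (4.4), the following sketch evaluates κ for two common cross sections. The function names and the sample dimensions are hypothetical; the polar moments used are the standard centroidal values for a solid disc and a rectangle.

```python
import math

def kappa(nu, polar_moment, area):
    """Gradient coefficient (4.4): kappa = 2 nu^2 I0 / A."""
    return 2.0 * nu**2 * polar_moment / area

def kappa_circle(nu, radius):
    """Solid circular section: I0 is the integral of (x^2 + y^2) over the disc."""
    area = math.pi * radius**2
    polar_moment = 0.5 * math.pi * radius**4
    return kappa(nu, polar_moment, area)

def kappa_rectangle(nu, b, h):
    """Rectangular section b x h centred on the bar axis."""
    area = b * h
    polar_moment = b * h * (b * b + h * h) / 12.0
    return kappa(nu, polar_moment, area)

if __name__ == "__main__":
    print(kappa_circle(0.1, 2.0))
    print(kappa_rectangle(0.3, 1.0, 2.0))
```

For the disc the expression collapses to $\kappa = \nu^2 R^2$, so κ scales with the square of the cross-sectional dimension and vanishes with ν.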
What should be noted in (4.3) is that, despite the fact that $\varphi$ is assumed to
be convex, the cubic $c(t) = t(1 + \nu - \nu t)^2$ is not. Consequently, $\Phi(t)$ may be
nonconvex. To illustrate this aspect, let us consider the following example. Let $\varphi$ be
the convex function defined by
\[
\varphi(c) = 2\,\frac{c^{10} - (19 + 10^{-8})\,c + 10^{-12}\,c^{-10}}{9 + 10^{-11} + 10^{-8}},
\qquad c > 0.
\tag{4.5}
\]
For $\nu = 0.1$, $c(t) = t(1 + \nu - \nu t)^2$ is a nonconvex cubic and the composed
function $\varphi(c(t))$ is also nonconvex. The graphs corresponding to $c(t)$, $\varphi(c)$ and
$\varphi(c(t))$ are juxtaposed in Figure 1.
As a result, the graph of $\Phi(t)$, appearing in (4.2) and defined in (4.3), is of the
type represented in Figure 2. Notice that condition (2.5) assures that $\Phi(t)$ has a
minimum at $t = 1$, so that the undistorted configuration is natural.
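The properties claimed for this example can be probed with finite differences. The sketch below is only a numerical illustration: `Phi` is the name assumed here for the stored-energy density of (4.3), the step sizes are arbitrary choices, and the scan is restricted to 0 < t < 1 + 1/ν, where $c(t) > 0$.

```python
def phi(c):
    """Example function (4.5), defined for c > 0."""
    return 2.0 * (c**10 - (19.0 + 1e-8) * c + 1e-12 * c**-10) / (9.0 + 1e-11 + 1e-8)

def c_of_t(t, nu):
    """Cubic c(t) = t (1 + nu - nu t)^2, the determinant of the constrained deformation."""
    return t * (1.0 + nu - nu * t) ** 2

def Phi(t, nu):
    """Stored-energy density of (4.3)."""
    return 2.0 * (1.0 + nu - nu * t) ** 2 + t * t + phi(c_of_t(t, nu))

def first_derivative(fn, x, h=1e-6):
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

def second_derivative(fn, x, h=1e-3):
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / (h * h)

nu = 0.1
energy = lambda t: Phi(t, nu)

# Natural reference state: phi'(1) = -2 as in (2.5), and t = 1 is stationary.
print("phi'(1) ~", first_derivative(phi, 1.0))
print("Phi'(1) ~", first_derivative(energy, 1.0))

# Scan for stretches at which the second derivative is negative (loss of convexity).
ts = [0.5 + 0.1 * k for k in range(100)]
nonconvex_ts = [t for t in ts if second_derivative(energy, t) < 0.0]
print("nonconvex on sampled stretches from", min(nonconvex_ts), "to", max(nonconvex_ts))
```

The scan reports a band of stretches with negative curvature, in line with the qualitative shape described for Figure 2.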
In conclusion, it has been shown that classical nonlinear elasticity theory may
directly offer, under appropriate kinematical assumptions, 1-D models where the
Figure 1. (a) Graph of ϕ(c); (b) graph of c(t); (c) graph of ϕ(c(t)).
Figure 2. Graph of $\Phi(t)$, defined in (4.3).
stored energy is a nonconvex function of the axial strain. However, the example at hand contains the natural appearance of a second-order term in the resulting energy (4.2), which plays the important role of penalizing interfaces between
heterogeneous material phases.
5. The Resulting One-Dimensional Model
Setting $u := f'$, we find that the problem under consideration becomes the following:
PROBLEM $P_{1\text{-}D}$. Minimize
\[
\mathcal{E}[u] = aA \int_0^L \left[\Phi(u(z)) + \frac{1}{2}\kappa\,(u'(z))^2\right] dz,
\tag{5.1}
\]
over all $u \in W^{1,2}(0, L)$, $u > 0$, consistent with the constraint
\[
\int_0^L u(z)\, dz = \beta L.
\tag{5.2}
\]
This model is well studied in the literature. It was proposed by Coleman [8] and
Carr et al. [6, 7], and later developed by Coleman and Newman [10–12], as a generalization of the original idea by Ericksen [3]. Here, for the sake of completeness,
we briefly recall the main results.
Under mild smoothness hypotheses, direct methods in the calculus of variations [16] show existence of at least one minimizer, and that any such minimizer
solves the corresponding Euler–Lagrange equations.
Now, let α1 and α2 be defined by the Maxwell conditions
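\[
\Phi(\alpha_2) - \Phi(\alpha_1) = \sigma_0(\alpha_2 - \alpha_1)
\quad \text{with} \quad \sigma_0 = \Phi'(\alpha_1) = \Phi'(\alpha_2),
\tag{5.3}
\]
i.e., the line $\Phi(\alpha_1) + \sigma_0(t - \alpha_1)$ supports the graph of $\Phi(t)$ from below. It is easy
to show that when $\beta \le \alpha_1$ or $\beta \ge \alpha_2$ the solution of $P_{1\text{-}D}$ is the trivial one $u^*(z) = \beta$,
corresponding to uniform strain. In fact, such $u^*(z)$ minimizes $\mathcal{E}[u]$ of (5.1) for $\kappa = 0$ [3] and, consequently,
\[
\mathcal{E}[u^*] = aA\int_0^L \Phi(u^*(z))\, dz
\le aA\int_0^L \Phi(u(z))\, dz
\le aA\int_0^L \left[\Phi(u(z)) + \frac{1}{2}\kappa\,(u'(z))^2\right] dz = \mathcal{E}[u], \quad \forall u.
\tag{5.4}
\]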
The most interesting case is when α1 < β < α2 , whose characterization is given
by the following theorem, contained in [6, 7].
THEOREM (Carr, Gurtin and Slemrod). Assume that $\Phi$ is a smooth nonconvex function of the form shown in Figure 2 and, more precisely, $\Phi$ is of class
$C^5(0, L)$, $\Phi'' > 0$ on $(0, a_1) \cup (a_2, \infty)$ and $\Phi'' < 0$ on $(a_1, a_2)$, $\Phi'(0^+) = -\infty$
and $\Phi'(+\infty) = +\infty$. Then, for any $\beta \in (\alpha_1, \alpha_2)$:
NONLINEAR ELASTICITY AND 1-D BARS WITH NONCONVEX ENERGY
(i) when $\kappa > 0$ is small enough, problem P1-D has a unique (modulo reversal)
solution $u^*_\kappa(z)$;
(ii) $u^*_\kappa(z)$ is strictly monotone;
(iii) as $\kappa \to 0$, $u^*_\kappa(z)$ approaches the single-interface solution $u^*(z)$, defined
(modulo its reversal) as
$$u^*(z) = \begin{cases} \alpha_1, & 0 \le z \le l, \\ \alpha_2, & l < z \le L, \end{cases} \tag{5.5}$$
with $l = L(\alpha_2 - \beta)/(\alpha_2 - \alpha_1)$.
In other words, for infinitesimal κ the higher order term causes the two-phase
solutions with least energy to become the single-interface solutions. For the discussion of this very important case, we report an elegant argument by Alberti [17],
providing an elementary characterization of the transition zone between heterogeneous phases. To this aim, it is convenient to consider a problem equivalent to
P1-D , i.e.,
$$\min \int_0^L \Big[\Xi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2\Big]\,dz, \qquad \int_0^L u(z)\,dz = \beta L, \tag{5.6}$$
where
$$\Xi(u(z)) = \Phi(u(z)) - [\Phi(\alpha_1) + \sigma_0(u(z) - \alpha_1)], \tag{5.7}$$
and $\sigma_0$ has been defined in (5.3). With this choice, $\Xi(\cdot)$ is non-negative and
presents only two absolute minima at $\alpha_1$ and $\alpha_2$, with $\Xi(\alpha_1) = \Xi(\alpha_2) = 0$.
It is well known that minimizers of (5.6) coincide with solutions of P1-D since any
linear functional is a null-Lagrangian for $\Pi[u]$ in (5.1).
Now, case (iii) of the aforementioned theorem suggests that when β ∈ (α1 , α2 ),
the material tends to be separated into two different phases, but their interface is
not sharp because of the gradient term in (5.6). The two phases are connected by
a transition zone, which produces an effect in the bar equivalent to a distortion.
Under De Saint Venant’s principle, the effects of such a distortion should be negligible at distances that are large when compared with the diameter of the bar cross
section. Consequently, if the length of the bar is much larger than the diameter of
its cross section and β is far from α1 and α2 , recalling (5.5) the transition zone is
certainly distant from the bar extremities. Consequently, the following optimal profile problem may be considered for the determination of the shape of the transition
zone.
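PROBLEM POP (Optimal Profile). Find $u: \mathbb{R} \to \mathbb{R}$ such that
$$\begin{cases} aA \displaystyle\int_{-\infty}^{+\infty} \Big[\Xi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2\Big]\,dz = \min, \\[6pt] \displaystyle\lim_{z\to-\infty} u(z) = \alpha_1 \quad\text{and}\quad \lim_{z\to+\infty} u(z) = \alpha_2. \end{cases} \tag{5.8}$$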
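Following the argument proposed by Alberti [17] for the Γ-convergence of
Cahn–Hilliard models, the inequality $r^2 + s^2 \ge 2r \cdot s$ with $r = (\tfrac{1}{2}\kappa)^{1/2}u'$ and
$s := \sqrt{\Xi(u(z))}$ is applied to (5.8)$_1$ to give
$$aA\int_{-\infty}^{+\infty}\Big[\Xi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2\Big]\,dz \;\ge\; 2aA\int_{-\infty}^{+\infty}\sqrt{\Xi(u(z))}\cdot\sqrt{\tfrac{1}{2}\kappa}\;u'(z)\,dz \;=\; 2aA\int_{\alpha_1}^{\alpha_2}\sqrt{\tfrac{1}{2}\kappa\,\Xi(u)}\;du = \gamma. \tag{5.9}$$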
The quantity
$$\gamma = 2aA\int_{\alpha_1}^{\alpha_2}\sqrt{\tfrac{1}{2}\kappa\,\Xi(u)}\;du, \tag{5.10}$$
represents the energy necessary to produce an interface between the two phases
and depends only upon the shape of $\Xi$ and, consequently, of $\Phi$. Noticing that
$r^2 + s^2 = 2r \cdot s$ if and only if $r = s$, we find from (5.9) and (5.10) that the lower
bound $\gamma$ for the integral in (5.8)$_1$ is attained provided $u$ satisfies the differential
equation
$$\sqrt{\frac{\kappa}{2}}\;u'(z) = \sqrt{\Xi(u(z))}, \tag{5.11}$$
with boundary conditions (5.8)$_2$. Notice that $u$ is monotone and $u' \to 0$ as $u$
approaches the value $\alpha_1$ or $\alpha_2$.
To illustrate the characteristics of the transition zone in one example, suppose
that $\Phi$ takes the form
$$\Phi(u) = b_1(u - \alpha_1) + b_2 + \frac{1}{2\zeta^2}\,\kappa\,\inf\{(u - \alpha_1)^2, (u - \alpha_2)^2\} \tag{5.12}$$
for some constants $b_1$, $b_2$ and $\zeta$. Rigorously speaking, expression (5.12) does not
follow immediately from (4.3), but since what is important is the nonconvex character of $\Phi$ in a neighborhood of $(\alpha_1, \alpha_2)$, it is assumed that there exist particular
choices of $\varphi(\cdot)$ and $\nu$ such that (4.3) is well approximated by (5.12) in such a
neighborhood.
From (5.3) and (5.7), $\Xi$ becomes
$$\Xi(u) = \frac{1}{2\zeta^2}\,\kappa\,\inf\{(u - \alpha_1)^2, (u - \alpha_2)^2\}, \tag{5.13}$$
and the differential equation (5.11) has the form
$$u' = \begin{cases} \dfrac{|u - \alpha_1|}{\zeta} & \text{for } \alpha_1 \le u < \dfrac{\alpha_1 + \alpha_2}{2}, \\[8pt] \dfrac{|u - \alpha_2|}{\zeta} & \text{for } \dfrac{\alpha_1 + \alpha_2}{2} \le u \le \alpha_2. \end{cases} \tag{5.14}$$
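By symmetry, the boundary conditions (5.8)$_2$ may be replaced by requiring
$u(0) = (\alpha_1 + \alpha_2)/2$, so that a simple integration gives the following solution $u^*(z)$:
$$u^*(z) = \begin{cases} \tfrac{1}{2}(\alpha_2 - \alpha_1)\exp\Big(\dfrac{z}{\zeta}\Big) + \alpha_1 & \text{for } z < 0, \\[8pt] \tfrac{1}{2}(\alpha_1 - \alpha_2)\exp\Big(-\dfrac{z}{\zeta}\Big) + \alpha_2 & \text{for } z \ge 0. \end{cases} \tag{5.15}$$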
It should be noted that $\zeta$ represents a characteristic length for the transition
zone. Moreover, from (5.10) the energy $\gamma$ necessary to produce the transition
(interface energy) is
$$\gamma = aA\,\frac{\kappa}{\zeta}\Big(\frac{\alpha_2 - \alpha_1}{2}\Big)^2, \tag{5.16}$$
which depends also upon $\kappa$.
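The example (5.12)–(5.16) can be checked numerically. The Python sketch below evaluates the profile (5.15), tests the differential equation (5.11) with a central finite difference, and compares the closed form (5.16) with a midpoint-rule quadrature of (5.10). All numerical values (for $\alpha_1$, $\alpha_2$, $\zeta$, $\kappa$ and the prefactor $aA$) and all function names are illustrative assumptions, not data from the text.

```python
import math

ALPHA1, ALPHA2 = 1.2, 2.0   # assumed Maxwell stretches (illustrative only)
ZETA = 0.05                 # assumed characteristic length of the transition zone
KAPPA = 0.3                 # assumed strain-gradient coefficient
A_AREA = 1.0                # assumed value of the prefactor aA


def shifted_energy(u):
    """Shifted energy of (5.13): kappa/(2 zeta^2) * min((u-a1)^2, (u-a2)^2)."""
    return KAPPA / (2.0 * ZETA**2) * min((u - ALPHA1) ** 2, (u - ALPHA2) ** 2)


def profile(z):
    """Optimal profile (5.15), centred so that profile(0) = (a1 + a2)/2."""
    if z < 0.0:
        return 0.5 * (ALPHA2 - ALPHA1) * math.exp(z / ZETA) + ALPHA1
    return 0.5 * (ALPHA1 - ALPHA2) * math.exp(-z / ZETA) + ALPHA2


def ode_residual(z, h=1e-6):
    """Residual of (5.11): sqrt(kappa/2) * u' - sqrt(shifted_energy(u))."""
    du = (profile(z + h) - profile(z - h)) / (2.0 * h)
    return math.sqrt(KAPPA / 2.0) * du - math.sqrt(shifted_energy(profile(z)))


def interface_energy_quadrature(n=20000):
    """Midpoint rule for (5.10): 2 aA * integral of sqrt(kappa/2 * shifted_energy)."""
    du = (ALPHA2 - ALPHA1) / n
    total = 0.0
    for i in range(n):
        u = ALPHA1 + (i + 0.5) * du
        total += math.sqrt(0.5 * KAPPA * shifted_energy(u)) * du
    return 2.0 * A_AREA * total


def interface_energy_closed_form():
    """Closed form (5.16): aA * (kappa/zeta) * ((a2 - a1)/2)^2."""
    return A_AREA * KAPPA / ZETA * (0.5 * (ALPHA2 - ALPHA1)) ** 2
```

With these assumed values the two evaluations of $\gamma$ should agree up to rounding error, because the integrand of (5.10) is piecewise linear in $u$ for the energy (5.13).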
Recalling (2.6) and the notation u = f ′ , we see that the bar undergoes a lateral
contraction equal to ν(u − 1). Consequently, for β ∈ (α1 , α2 ), the shape of the
deformed bar would appear as represented in Figure 3. The lateral contraction of
the bar is ν(α1 − 1) or ν(α2 − 1) for the portions on the left-hand side or on the
right-hand side of the transition zone, respectively. Provided that $\zeta \ll L$, the length
of the phase-1 portion is approximately
$$l \cong \frac{\alpha_2 - \beta}{\alpha_2 - \alpha_1}\,L. \tag{5.17}$$
The quasi-static equilibrium states of the bar, as the average stretch β is gradually increased starting from the undistorted state β = 1, are as follows.
(i) For $1 < \beta \le \alpha_1$, the bar is uniformly stretched. The axial dilatation is $(\beta - 1)$
and the lateral contraction is $\nu(\beta - 1)$.
(ii) For $\alpha_1 < \beta < \alpha_2$, the deformation of the bar is as in Figure 3. As $\beta$ increases,
one phase evolves at the expense of the second phase, gradually invading the
whole bar.
(iii) For $\beta \ge \alpha_2$, the bar dilatation becomes again uniform and equal to $(\beta - 1)$ as
in case (i).
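The three stages can be summarized by the length of the low-stretch portion as a function of $\beta$. The following Python sketch uses the sharp-interface estimate (5.17) in stage (ii) and is only meaningful when $\zeta \ll L$; the function names and the sample numbers in the usage note are hypothetical.

```python
def phase_one_length(beta, alpha1, alpha2, length):
    """Length of the low-stretch portion of the bar in the three stages."""
    if beta <= alpha1:      # stage (i): uniform stretch, low-stretch phase only
        return length
    if beta >= alpha2:      # stage (iii): uniform stretch, high-stretch phase only
        return 0.0
    # stage (ii): sharp-interface estimate (5.17)
    return length * (alpha2 - beta) / (alpha2 - alpha1)


def average_stretch(beta, alpha1, alpha2, length):
    """Average stretch of the resulting state; it should reproduce beta."""
    if beta <= alpha1 or beta >= alpha2:
        return beta         # uniform bar: the stretch equals beta everywhere
    l1 = phase_one_length(beta, alpha1, alpha2, length)
    return (alpha1 * l1 + alpha2 * (length - l1)) / length
```

For instance, with the assumed values $\alpha_1 = 1.2$, $\alpha_2 = 2.0$ and $L = 10$, calling `phase_one_length(1.6, 1.2, 2.0, 10.0)` places the interface at mid-length, and `average_stretch` returns the prescribed $\beta$, which is the constraint (5.2).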
Figure 3. Predicted necking in the transition zone between heterogeneous phases.
In practice, the energy consumed in producing the necking is interpreted as a
surface energy between heterogeneous phases, assuring uniqueness (modulo reversal) of minimizers also in stage (ii). As β is increased the neck moves along the bar,
changing the ratio between the length of the regions occupied by heterogeneous
phases in order to accommodate the average elongation β. This particular motion
is usually referred to as drawing [8].
6. Comparison with Experiments and Conclusions
The model defined by (4.2) is a particular case of a general theory conceived by Coleman [8] specifically to describe the cold drawing of polymeric fibers such as
nylon. In the undistorted condition, nylon is composed of long chainlike molecules
that are oriented at random. Under a tensile force increasing from zero such filaments, tangled together, do not stretch much at first, but when a certain critical load
is reached, the long molecule chains suddenly rearrange themselves, so that they
become parallel to the filament axis. At this instant, a sharp constriction forms in
the filament profile, in general nucleated at one of the extremities where the fiber
is clamped by the loading device. A moment later, the contracted portion is seen
to increase its length and the “wave front” moves along the specimen towards the
opposite end (drawing), due to the successive rearrangement of the molecules in
the neighboring portions. Eventually, the filament shape looks as represented in
Figure 4, where the correspondence with the behavior of Figure 3, predicted by the
model, is evident. After the entire filament has contracted, all the molecules have
been rearranged and, if the fiber is further pulled, the longitudinal strain (and the
consequent lateral contraction) increases uniformly throughout the bar [18].
It is perhaps less known that such a behavior, despite being caused by a mechanism completely different at the microstructural level (dislocation movements), is
also analogous to the manner in which a bar of mild steel elongates. Metals have
the characteristic property of possessing a well-defined “yield point”, at which they
start to stretch permanently. At the macroscopic level, the load vs. displacement
curve is characterized by a pseudo-horizontal plastic plateau, but careful experiments [19] reveal that plastic strain does not progress uniformly throughout the
bar. When one element or layer yields, it strains a few percent almost instantly and
then nearly stops, while yielding is transferred to the neighboring portions.
Figure 4. Constriction forming on thin filaments of nylon after pulling (from [18, p. 300]).
In both examples, a microstructural rearrangement (molecular alignment or
plastic slip) produces a strain jump with the same characteristics as a phase transition, which occurs in 1-D elastic bars with nonconvex stored energy. Indeed,
Müller and Villaggio [1] recognized that Ericksen’s model [3] was suitable also
for an elastic-plastic body. However, if the segments of the bar could yield independently of one another, like the rings of a chain, the process would be highly
chaotic, in agreement with the non-uniqueness inherent in Ericksen’s model. On
the other hand, experiments on metals show an orderly deformation similar to that
of polymers. A possible explanation stems from the observation that the yield point
of metals is greatly influenced by any stress concentration. There is a wealth of
experimental evidence [19] that localized yielding produces a condition equivalent
to a stress concentration at the boundary of the yielded portion, which can greatly
influence [20] the behavior of the neighboring parts (nonlocal effect). The yielding
of the first portion triggers a chain reaction that spreads plastic
deformation through the specimen, much as in the cold drawing of polymers. For the
model of Section 5, this nonlocal effect is due to the constriction shown in Figure 3 that, while connecting heterogeneous phases, produces a condition somehow
equivalent to a stress concentration. The energy consumed by the formation of such
a transition plays the role of an interface energy.
In the class of models discussed here, the nonlocal effect is interpreted by the
presence of the quadratic term in the strain-gradient which appears in the energy
functional (4.2). A number of authors have proposed adding strain-gradient dependence in order to regularize Ericksen’s problem and to remove the inherent
energetic equivalence of different phase arrangements. In this paper we have emphasized that, for
a specified restricted kinematics, classical 3-D nonlinear elasticity theory naturally
yields a 1-D formulation with the aforementioned characteristics. In particular, in
our derivation the nonconvex strain dependence of the energy, that dependence
which allows for phase transformations, is inevitably associated with the appearance of a convex term in the strain gradient, thus providing the nonlocal effect
necessary to reproduce the phenomena of necking and drawing.
Acknowledgements
Our most thankful appreciation is devoted to Professors G. Del Piero and R. Fosdick for their helpful comments during the preparation of this work. Partial support of Italian MURST under grant CoFin “Modelli matematici per la scienza dei
materiali” is gratefully acknowledged.
References
1. I. Müller and P. Villaggio, A model for an elastic plastic body. Arch. Rational Mech. Anal. 65 (1977) 25–56.
2. L. Truskinovsky, Fracture as a phase transition. In: R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials, CIMNE, Barcelona (1996) pp. 322–332.
3. J.L. Ericksen, Equilibrium of bars. J. Elasticity 5 (1975) 191–201.
4. J.E. Dunn and R.L. Fosdick, The morphology and stability of material phases. Arch. Rational Mech. Anal. 74 (1980) 1–99.
5. J.W. Cahn and J.E. Hilliard, Free energy of a non uniform system. I. Interfacial free energy. J. Chem. Phys. 28 (1958) 258–267.
6. J. Carr, M. Gurtin, and M. Slemrod, One-dimensional structured phase transformations under prescribed loads. J. Elasticity 15 (1985) 133–142.
7. J. Carr, M. Gurtin, and M. Slemrod, Structured phase transition on a finite interval. Arch. Rational Mech. Anal. 86 (1984) 317–351.
8. B. Coleman, Necking and drawing in polymeric fibers under tension. Arch. Rational Mech. Anal. 83 (1983) 115–137.
9. P. Podio-Guidugli and M. Lembo, Internal constraints, reactive stresses and the Timoshenko beam theory. J. Elasticity 65 (2002) 131–148.
10. B. Coleman and D.C. Newman, Constitutive relations for elastic materials susceptible to drawing. In: V.K. Stokes and D. Krajcinovic (eds), Constitutive Modeling for Nontraditional Materials. ASME, New York (1987).
11. B. Coleman and D.C. Newman, On the rheology of cold drawing. I. Elastic materials. J. Polymer Sci. Part B Polymer Phys. 26 (1988) 1801–1822.
12. B. Coleman and D.C. Newman, Mechanics of neck formation in the cold drawing of elastic films. Polymer Engrg. Sci. 30 (1990) 1299–1302.
13. J. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 100 (1977) 337–403.
14. J.E. Marsden and T.J.R. Hughes, Mathematical Foundations of Elasticity. Dover, Mineola, NY (1994).
15. R. Fosdick and G. Royer-Carfagni, Multiple natural states for an elastic isotropic material with polyconvex stored energy. J. Elasticity 60 (2000) 223–231.
16. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, New York (1989).
17. G. Alberti, Variational models for phase transitions. An approach via Γ-convergence. In: Summer School on Differential Equations and Calculus of Variations, Pisa (16–28 September 1996) Lecture Notes.
18. A. Nadai, Theory of Flow and Fracture of Solids, Vol. I. McGraw-Hill, New York (1950).
19. M. Froli and G. Royer-Carfagni, Discontinuous deformation of tensile steel bars: Experimental results. J. Engrg. Mech. ASCE 125 (1999) 1243–1250.
20. M. Froli and G. Royer-Carfagni, A mechanical model for the elastic-plastic behavior of metallic bars. Internat. J. Solids Struct. 37 (2000) 3901–3918.
Eshelby Tensor as a Tensor of Free Enthalpy
GIOVANNI BURATTI1, YONGZHONG HUO2 and INGO MÜLLER3
1 Dipartimento di Ingegneria Strutturale, Università di Pisa, Pisa, Italy
2 Department of Mechanics, Fudan University, Shanghai, PR China
3 Thermodynamik, Technische Universität Berlin, 10623 Berlin, Germany
Received 18 September 2002; in revised form 16 September 2003
Abstract. The balance equations of mass, momentum, energy and entropy at a phase boundary
imply phase boundary conditions which determine the position of the boundary as a function of
temperature. This is true when either the phase boundary is sharp or when it occurs through a
transition zone, albeit the latter case seems to require strongly symmetric geometry.
Mathematics Subject Classifications (2000): 74A15, 74A50.
Key words: phase transitions, Eshelby tensor, free enthalpy.
Dedicated to C.A. Truesdell, who taught us rational thinking
1. Eshelby Tensor as a Tensor of Free Enthalpy
Viewed macroscopically phase boundaries may sometimes be sharp boundaries
between uniform phases. Such is the case in a liquid–vapour or a solid–liquid
phase boundary; some solid–solid phase boundaries are of that type too, notably for
martensitic transformations. In such cases the equations of balance of thermodynamics assume the form of jump conditions on a singular surface, the phase boundary. In addition to the balance equations coherency of the phases requires kinematic
compatibility conditions which relate the jumps of the velocity components across
the phase boundary to the mass rate of the phase transition.
Thus, one obtains an entropy inequality on the phase boundary which implies
that the mass rate of the phase transition is proportional to the discontinuity of
the normal component of the Eshelby tensor. This idea goes back to the procedure
of linear irreversible thermodynamics (e.g., see [1]), by which the thermodynamic
forces and fluxes, whose product forms the entropy inequality, are proportional.
This idea has been adapted in [2, 3] to the motion of a phase boundary. In particular, in equilibrium the normal component of the Eshelby tensor is therefore
continuous. This equilibrium condition generalizes the thermodynamic condition
of phase equilibrium by which the specific free enthalpy – or Gibbs free energy
– is continuous. In simple cases the condition gives rise to the “common tangent
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 101–112.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
G. BURATTI ET AL.
construction”, which in thermodynamics is a familiar way to determine the load
and deformations of phase equilibria as functions of temperature.
The usefulness of the continuity of the Eshelby tensor
as a universal tool is strongly limited by the need to know the generally non-uniform stress and strain fields adjacent to the phase boundary. The determination
of such fields requires numerical calculations and is therefore restricted to specific
cases.
More often than sharp phase boundaries we observe a transition zone between
the phases, in particular between two solid phases. Such a zone is narrow but
macroscopically noticeable. And while we cannot employ singular surface analysis
in such cases, we may integrate over the surface of the zone and thus compare
stresses and strains in the regions of the body adjacent to the transition zone. In
simple cases of high symmetry this procedure can lead to a variant of the “common
tangent construction”.
In recent years phase transitions in solids have become important, since structural elements of machines are subjected to high temperatures and strong tensile
forces which can affect the size and shape of inclusions. After the pioneering work
of Eshelby [4] for elastic inclusions, the thermodynamic character of the Eshelby
tensor was first recognized by Heidug and Lehner [5]. More recent references are
the works by Truskinovsky [6], Gurtin [7] and Liu [8].
Under most circumstances the exploitation of the phase boundary conditions
requires extensive numerical calculations and such calculations have been made
by Schmidt [9] and R. Müller [10], who have calculated the shapes of inclusions in
solids.
The present work emphasizes the similarity of the Eshelby tensor with the
free enthalpy, or Gibbs free energy of thermodynamics. After the derivation of
the general form of the phase equilibrium conditions, we specialize to a liquid–
vapour transition and to a solid–solid transition under shear. We also discuss the
case when the transition does not occur on a sharp interface but in a narrow but
smooth zone.
The mathematical formulation of the coherency of the phases makes use of the
displacement field, i.e., the position of material points relative to their positions at
an initial time, or in a reference configuration. And the displacement gradient affects the Eshelby tensor. This fact has given rise in the literature to the ideas that the
proper environment of the Eshelby tensor is a “configurational space”; see [11, 12],
who have gone so far as to call for a “configurational physics”. Similar ideas about
“configurational balances” have been proposed by Podio-Guidugli [13]. We see
no need for all these and we firmly place the Eshelby tensor in “physical space”,
where it belongs in our opinion, along with such classical subjects as stress, strain
and free enthalpy.
To keep arguments simple we have omitted to endow the phase boundary with
properties of its own like surface stress or surface energy. The formulae needed for
the consideration of such properties are available in [14] or [15].
Figure 1. Sharp phase boundary.
2. Equations of Balance
Figure 1 shows a cross section of a sharp phase boundary that separates the phases
(+) and (−) and has the unit normal $n_i$ pointing from (−) into (+). The velocity
of the boundary is $V_i = V n_i$ and the velocities of the phases on either side are $v_i^{\pm}$.
The conservation laws of mass, momentum and energy on the phase boundary
read
$$\begin{aligned} &[\rho(v_i - V_i)]\,n_i = 0, \\ &[\rho v_j(v_i - V_i) - t_{ji}]\,n_i = 0, \\ &\Big[\rho\Big(\varepsilon + \tfrac{1}{2}v^2\Big)(v_i - V_i) - t_{ji}v_j + q_i\Big]\,n_i = 0. \end{aligned} \tag{2.1}$$
They state that the fluxes of mass, momentum and energy in and out of the surface
are equal; the convective fluxes are relative to the moving surface. Equations (2.1)
employ the canonical letters for the physical quantities. Thus ρ is the mass density,
vi the velocity, tj i the stress, ε the specific internal energy, and qi the heat flux.
Square brackets denote the difference of the bracketed quantity on the two sides
such that $[c] = c^+ - c^-$ holds.
The balance of entropy states that the efflux of entropy from the surface is no smaller
than the influx. Thus, if we suppose that the efflux is on the (+)-side and the influx
is on the (−)-side, the balance reads
$$\Big[\rho\eta(v_i - V_i) + \frac{q_i}{T}\Big]\,n_i = \sigma \ge 0; \tag{2.2}$$
$\eta$ denotes the specific entropy, $T$ the absolute temperature and $\sigma$ is the surface
density of entropy production.
It is reasonable to assume that the temperature is continuous. Therefore $[q_i]n_i$
may be eliminated between (2.1)$_3$ and (2.2). With $\psi = \varepsilon - T\eta$ as the specific free
energy the result reads
$$\Big[\rho\Big(\psi + \tfrac{1}{2}v^2\Big)(v_\perp - V) - t_{ji}v_j n_i\Big] = -T\sigma \le 0, \tag{2.3}$$
where $v_\perp$ is short for $v_i n_i$. We define the arithmetic mean value $\langle c\rangle = \tfrac{1}{2}(c^+ + c^-)$
and use (2.1)$_{1,2}$ as well as the identity
$$[ab] = [a]\langle b\rangle + \langle a\rangle[b] \tag{2.4}$$
to rewrite (2.3) in the form
$$[\psi]\,\rho(v_\perp - V) - \langle t_{ji}\rangle n_i [v_j] \le 0. \tag{2.5}$$
In words this means that the difference between efflux and influx of the free energy
is smaller than or equal to the power of the mean stress vector $\langle t_{ji}\rangle n_i$ on the relative
velocity $[v_j]$.
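The product rule (2.4) for jumps is the only algebraic tool needed to pass from (2.3) to (2.5). A minimal Python sketch, with hypothetical helper names and arbitrary sample values, shows how it can be checked and how it yields the jump of the kinetic energy as the mean velocity times the velocity jump:

```python
def jump(plus, minus):
    """Jump [c] = c+ - c- across the phase boundary."""
    return plus - minus


def mean(plus, minus):
    """Arithmetic mean value (c+ + c-)/2 of the two limiting values."""
    return 0.5 * (plus + minus)


def product_jump(a_plus, a_minus, b_plus, b_minus):
    """Right-hand side of (2.4): [a]*mean(b) + mean(a)*[b]."""
    return (jump(a_plus, a_minus) * mean(b_plus, b_minus)
            + mean(a_plus, a_minus) * jump(b_plus, b_minus))
```

Applied with $a = b = v$, the identity gives $[\tfrac{1}{2}v^2] = \langle v\rangle[v]$, which is the step that removes the kinetic energy from (2.3) once the momentum balance (2.1)$_2$ is used.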
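An alternative form of (2.5) results if we decompose $[v_j]$ into its normal and
tangential components according to
$$[v_j] = [v_\perp]\,n_j + \sum_{\alpha=1}^{2}[v_\alpha]\,\tau_j^\alpha, \tag{2.6}$$
where $\tau_j^\alpha$ are two orthonormal tangent vectors on the phase boundary such that
$\tau_j^\alpha\tau_j^\beta = \delta_{\alpha\beta}$ holds; $v_\alpha = v_j\tau_j^\alpha$ are the corresponding tangential components of $v_j$.
Thus we obtain from (2.5), since (2.1)$_1$ implies $[v_\perp] = [1/\rho]\,\rho(v_\perp - V)$,
$$\Big[\psi - \frac{\langle t_{ji}\rangle n_i n_j}{\rho}\Big]\rho(v_\perp - V) - \sum_{\alpha=1}^{2}\langle t_{ji}\rangle n_i\tau_j^\alpha[v_\alpha] \le 0, \tag{2.7}$$
so that the power of the mean stress vector is split into three parts due to the normal
and tangential components of $\langle t_{ji}\rangle n_i$.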
3. Kinematic Compatibility
Coherency of the phases implies that the displacement vector ui is continuous
across the phase boundary, i.e.,
[uj ] = 0.
(3.1)
(The displacement field $u_i(x_j, t)$ refers to a material particle which at time $t$ is
at the position $x_j$, while at some prior time $t_0$ it was at $x_j^0$; we have
$u_i(x_j, t) = x_i(x_j^0, t) - x_i^0$.)
The gradient ∂uj /∂xi and the time derivative ∂uj /∂t may suffer jumps but,
because of (3.1), the jump of the gradient cannot have a tangential direction and the
jump of the time derivative is related to the normal speed V of the phase boundary.
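We have
$$\Big[\frac{\partial u_j}{\partial x_i}\Big] = a_j n_i \quad\text{and}\quad \Big[\frac{\partial u_j}{\partial t}\Big] = -V\Big[\frac{\partial u_j}{\partial x_i}\Big]n_i. \tag{3.2}$$
These conditions, where $a_i$ is an arbitrary vector on the phase boundary, are called
kinematic compatibility conditions (e.g., see [16, p. 503 ff]). The velocity is related
to the derivatives of the displacement by
$$v_i = \frac{\partial u_i}{\partial t} + v_j\frac{\partial u_i}{\partial x_j}, \quad\text{hence}\quad v_j = \Big(\delta_{ij} - \frac{\partial u_j}{\partial x_i}\Big)^{-1}\frac{\partial u_i}{\partial t}. \tag{3.3}$$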
We recall the decomposition (2.6) of $[v_j]$ into normal and tangential components
and use (2.1)$_1$, (2.4), (3.2) and (3.3) to determine these components in terms of the
mass rate of the phase transition
$$\begin{aligned} [v_\perp] &= \Big[\frac{1}{\rho}\Big]\rho(v_\perp - V), \\ [v_\alpha] &= -\tau_k^\alpha\,\Big[\underbrace{\Big\langle\delta_{kj} - \frac{\partial u_k}{\partial x_j}\Big\rangle^{-1}\Big(\delta_{ji} - \frac{\partial u_j}{\partial x_i}\Big)\Big\langle\frac{1}{\rho}\Big\rangle}_{B_{ki}}\Big]\,n_i\,\rho(v_\perp - V). \end{aligned} \tag{3.4}$$
By (3.4) the jumps of all three components are proportional to the mass rate
$\rho(v_\perp - V)$ of the phase transition. For the normal component the factor of proportionality is $[1/\rho]$, while for the tangential components the factors are complex
expressions in terms of the densities and displacement gradients of the phases. In
the sequel we simplify by introducing the abbreviation $B_{ki}$ as indicated in (3.4)$_2$.
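The normal part of (3.4) is a direct consequence of the mass balance (2.1)$_1$: on each side, $\rho(v_\perp - V)$ equals the same mass rate. The short Python sketch below solves that relation for the two normal velocities and returns their jump; the function names and sample numbers are assumptions made for illustration.

```python
def normal_velocities(mass_rate, rho_plus, rho_minus, boundary_speed):
    """Solve rho * (v_perp - V) = mass_rate for v_perp on the (+) and (-) sides."""
    v_plus = boundary_speed + mass_rate / rho_plus
    v_minus = boundary_speed + mass_rate / rho_minus
    return v_plus, v_minus


def normal_velocity_jump(mass_rate, rho_plus, rho_minus):
    """Jump of the normal velocity: [v_perp] = [1/rho] * mass_rate."""
    return (1.0 / rho_plus - 1.0 / rho_minus) * mass_rate
```

The jump does not depend on the boundary speed $V$, which only shifts both normal velocities by the same amount.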
4. Entropy Inequality, Speed of Phase Boundary, Equilibrium
We eliminate $[v_j]$ between (3.4) and (2.5) and obtain an alternative form of the
entropy inequality or free energy inequality, viz.
$$\Big[\underbrace{\Big(\psi - \frac{\langle t_{kn}\rangle n_k n_n}{\rho}\Big)\delta_{li} + \langle t_{jl}\rangle\sum_{\alpha=1}^{2}\tau_j^\alpha\tau_k^\alpha B_{ki}}_{\text{Eshelby tensor } g_{li}}\Big]\,n_l n_i\,\rho(v_\perp - V) \le 0. \tag{4.1}$$
The bracketed quantity defines the Eshelby tensor; here we denote it by $g_{ik}$ in
order to emphasize its kinship with the free enthalpy or Gibbs free energy, which
we shall exhibit later and which is usually denoted by $g$. Thus (4.1) may be written
in abbreviated form as
$$[g_{li}]\,n_l n_i\,\rho(v_\perp - V) \le 0. \tag{4.2}$$
The left-hand side of (4.2) may be interpreted as a product of a thermodynamic
force $[g_{li}]n_l n_i$ – the discontinuity of the normal component of the Eshelby tensor,
and a thermodynamic flux $\rho(v_\perp - V)$ – the mass rate of the phase transition.
The inequality (4.2) is satisfied by setting
$$\rho(v_\perp - V) = N[g_{ik}]\,n_i n_k, \tag{4.3}$$
where N is a negative factor of proportionality. Thus the mass rate of the phase
transition is proportional to the jump of the normal component of the Eshelby
tensor.
In particular, in equilibrium, when the phase transition comes to a standstill, we
must have
$$[g_{ik}]\big|_E\,n_i n_k = 0. \tag{4.4}$$
The index $E$ refers to equilibrium, where $v_\perp = V$ holds. In that case (2.1)$_2$ implies
$$[t_{ji}]\big|_E\,n_i = 0, \quad\text{hence}\quad \langle t_{ji}\rangle\big|_E\,n_i = t_{ji}n_i, \tag{4.5}$$
so that the stress vector $t_{ji}n_i$ is continuous. We obtain from (4.4) and (4.1)
$$[g_{ik}]\big|_E\,n_i n_k = \Big[\Big(\psi - \frac{t_{kn}}{\rho}n_k n_n\Big)\delta_{li} + t_{jl}\sum_{\alpha=1}^{2}\tau_j^\alpha\tau_k^\alpha B_{ki}\Big]_E\,n_l n_i = 0, \tag{4.6}$$
where $B_{ki}$ is defined in (3.4)$_2$. We shall refer to this equation as the Eshelby
condition for phase equilibrium.
Given the continuity of the temperature, of the stress vector and of the normal
component of the Eshelby tensor across the phase boundary, we can determine the
position of the phase boundary as a function of temperature.
We proceed to study two special cases.
5. Liquid–Vapour Phase Transition
We envisage a situation as shown in Figure 2(a) with liquid and vapour of a fixed
mass $m$ and in a fixed volume $V$. We ask for $V^{(-)}(V, T)$, the volume of the
liquid in equilibrium as a function of the total volume $V$ and temperature $T$.
Figure 2. Common tangent for liquid–vapour transition.
In equilibrium the stress of both liquid and vapour reduces to an isotropic pressure $p$, i.e., in (4.6) we have $t_{ji} = -p\delta_{ij}$. Therefore the stress vector $t_{jl}n_l$ has no
tangential component and (4.6) reads
$$\Big[\psi + \frac{p}{\rho}\Big] = 0 \quad\text{or}\quad p = -\frac{[\psi]}{[1/\rho]}, \tag{5.1}$$
since, by (4.5), the pressure is continuous in this case. Equation (5.1), which represents the Eshelby condition in the present case, expresses the continuity of the
specific free enthalpy or Gibbs free energy.
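We recall from basic thermodynamics that $p = -\partial\psi/\partial(1/\rho)$ holds in the liquid
and the vapour. Therefore it follows from (5.1)$_2$ that
$$\frac{\partial\psi}{\partial(1/\rho)}\bigg|_{+} = \frac{\partial\psi}{\partial(1/\rho)}\bigg|_{-} = \frac{[\psi]}{[1/\rho]} \tag{5.2}$$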
holds, which implies the well-known “common tangent construction”. This is a
graphical method for determining the densities $\rho^{\pm}(T)$ as functions of temperature.
These densities or, equivalently, the specific volumes $1/\rho^{\pm}(T)$ result as the abscissae of the points of contact of the tangent common to the two free energy functions
$\psi^{-}(1/\rho, T)$ and $\psi^{+}(1/\rho, T)$; see Figure 2. The pressure needed to maintain the
volume $V$ is given by the negative of the slope of the tangent; it depends on the temperature.
We are thus able to reach our objective and determine $V^{(-)}$ as a function of $T$,
given $V$. With $m$ being the mass of the fluid we obtain
$$V^{(-)}(V, T) = V\,\frac{m/V - \rho^{+}(T)}{\rho^{-}(T) - \rho^{+}(T)}. \tag{5.3}$$
Of course, this solution did require the knowledge of $\psi^{(+)}(1/\rho, T)$ and
$\psi^{(-)}(1/\rho, T)$. A well-known example for which this knowledge is available is the
van der Waals gas, where $\psi^{\pm}$ are given by the two convex parts of the free energy
function.
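Once the coexistence densities have been read off the common tangent, (5.3) is a lever rule. The Python sketch below evaluates it, taking the liquid as the (−) phase and the vapour as the (+) phase as in the text; the function name, the range check and the sample densities in the usage note are assumptions for illustration only.

```python
def liquid_volume(total_volume, mass, rho_liquid, rho_vapour):
    """Volume of the liquid phase from (5.3), given the coexistence densities."""
    mean_density = mass / total_volume
    if not (rho_vapour <= mean_density <= rho_liquid):
        # outside this range the fluid is single-phase and (5.3) does not apply
        raise ValueError("mean density lies outside the two-phase range")
    return total_volume * (mean_density - rho_vapour) / (rho_liquid - rho_vapour)
```

For example, with the made-up values $V = 2$, $m = 500$, $\rho^- = 1000$ and $\rho^+ = 1$, the liquid and vapour volumes returned by `liquid_volume(2.0, 500.0, 1000.0, 1.0)` add up to the prescribed mass, which is how (5.3) is obtained.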
6. Solid–Solid Phase Transition under Shear
We consider a situation as shown in Figure 3(a) where the upper plane of an elastic
body is displaced by $u_i(H) = (U, 0, 0)$. The transformation strain between the
phases (+) and (−) is given by
$$\varepsilon_{ij}^{(+)} - \varepsilon_{ij}^{(-)} = \bar\varepsilon_{ij} = \begin{bmatrix} 0 & \bar\varepsilon & 0 \\ \bar\varepsilon & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \tag{6.1}$$
We ask for $X(U, T)$, the position of the phase boundary in equilibrium as a function of temperature, if $U$ is given.
Figure 3. Common tangent construction for solid–solid transition in shear.
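We assume that both phases are linearly elastic isotropic bodies with equal
Lamé coefficients $\mu$ and $\lambda$. Therefore the stresses $t_{ij}$ in terms of the strains $\varepsilon_{ij} = \tfrac{1}{2}(\partial u_i/\partial x_j + \partial u_j/\partial x_i)$ are given by
$$\begin{aligned} t_{ij}^{(+)} &= 2\mu\big(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\big) + \lambda\varepsilon_{ll}^{(+)}\delta_{ij}, \\ t_{ij}^{(-)} &= 2\mu\varepsilon_{ij}^{(-)} + \lambda\varepsilon_{ll}^{(-)}\delta_{ij}. \end{aligned} \tag{6.2}$$
In both phases the equilibrium condition $\partial t_{ij}/\partial x_j = 0$ must be satisfied. It follows
that only the 1-components of the displacement fields are nonzero. They read
$$\begin{aligned} u_1^{(+)} &= \Big(\frac{U}{H} + 2\bar\varepsilon\,\frac{X}{H}\Big)(x_2 - H) + U, \\ u_1^{(-)} &= \Big(\frac{U}{H} + 2\bar\varepsilon\Big(\frac{X}{H} - 1\Big)\Big)x_2. \end{aligned} \tag{6.3}$$
These fields satisfy the boundary values $u_1^{(+)}(H) = U$, $u_1^{(-)}(0) = 0$ and the jump
conditions (3.1) and (4.5). It remains to exploit the condition (4.6) on the continuity
of the Eshelby tensor.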
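Obviously from (6.3) we have
$$\begin{aligned} \varepsilon_{12}^{(+)} = \varepsilon_{21}^{(+)} &= \frac{1}{2H}(U + 2\bar\varepsilon X), \\ \varepsilon_{12}^{(-)} = \varepsilon_{21}^{(-)} &= \frac{1}{2H}\big(U + 2\bar\varepsilon(X - H)\big), \\ t_{12}^{(\pm)} = t_{21}^{(\pm)} &= \frac{\mu}{H}\big(U + 2\bar\varepsilon(X - H)\big), \end{aligned} \tag{6.4}$$
and $\rho^{(+)} = \rho^{(-)}$ holds, since $\varepsilon_{ll} = 0$. Without loss of generality we set $\tau_j^1 = (1, 0, 0)$ and $\tau_j^2 = (0, 0, 1)$ on the boundary. Therefore (4.6) reads $[\psi + t_{12}B_{12}] = 0$
or, with (cf. (3.4)) $[B_{12}] = -(1/\rho)[\partial u_1/\partial x_2]$,
$$\Big[\rho\psi - t_{12}\frac{\partial u_1}{\partial x_2}\Big] = 0 \quad\text{or, by (6.1), (6.3),}\quad t_{12} = \frac{[\rho\psi]}{[2\varepsilon_{12}]}. \tag{6.5}$$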
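Equation (6.5)$_1$ or, equivalently, $[\rho\psi - t_{12}\,2\varepsilon_{12}] = 0$, represents the Eshelby
condition in this case. Clearly it may be expressed as the continuity of the free
enthalpy density appropriate to shear loading.
We recall from linear elasticity that the free energy densities of the phases are
given by
$$\begin{aligned} \rho\psi^{(+)} &= \rho f^{(+)}(T) + \mu\big(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\big)\big(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\big) + \frac{\lambda}{2}\big(\varepsilon_{ll}^{(+)}\big)^2 \quad\text{and} \\ \rho\psi^{(-)} &= \rho f^{(-)}(T) + \mu\,\varepsilon_{ij}^{(-)}\varepsilon_{ij}^{(-)} + \frac{\lambda}{2}\big(\varepsilon_{ll}^{(-)}\big)^2, \end{aligned} \tag{6.6}$$
so that $t_{12} = \partial(\rho\psi)/\partial(2\varepsilon_{12})$ holds; $f^{(\pm)}(T)$ are the specific non-elastic, thermal
parts of the free energy density. Therefore (6.5) may be written in the form
$$\frac{\partial(\rho\psi)}{\partial(2\varepsilon_{12})}\bigg|_{+} = \frac{\partial(\rho\psi)}{\partial(2\varepsilon_{12})}\bigg|_{-} = \frac{[\rho\psi]}{[2\varepsilon_{12}]}, \tag{6.7}$$
which implies the common tangent construction for the determination of $\varepsilon_{12}^{\pm}(T)$
and $t_{12}(T)$; see Figure 3.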
We are then able to calculate the position $X$ of the interface from (6.4)$_1$ as
$$X(U, T) = \frac{2H\varepsilon_{12}^{(+)}(T) - U}{2\big(\varepsilon_{12}^{(+)}(T) - \varepsilon_{12}^{(-)}(T)\big)}. \tag{6.8}$$
Given $U$, the position changes with temperature.
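For the quadratic energies (6.6) in pure shear the common tangent can be found in closed form: the elastic parts cancel in $[\rho\psi]$ when the slopes are equal, so that (6.5)$_2$ reduces to $t_{12} = \rho[f]/(2\bar\varepsilon)$. The Python sketch below works this out and then evaluates (6.8). The temperature dependence is assumed to enter only through the supplied number `delta_f` $= \rho\,(f^{(+)} - f^{(-)})$; all names and sample values are hypothetical.

```python
def shear_equilibrium(delta_f, mu, eps_bar, height, top_displacement):
    """Common-tangent state for the energies (6.6) restricted to pure shear.

    delta_f is rho * (f_plus - f_minus), the jump of the thermal free energy
    density at the temperature considered (an input assumed to be known).
    """
    t12 = delta_f / (2.0 * eps_bar)       # common slope, from (6.5) with (6.6)
    eps_minus = t12 / (2.0 * mu)          # (6.2): t12 = 2 mu eps12 in the (-) phase
    eps_plus = eps_minus + eps_bar        # (6.1): transformation strain
    x_boundary = ((2.0 * height * eps_plus - top_displacement)
                  / (2.0 * (eps_plus - eps_minus)))     # (6.8)
    return t12, eps_plus, eps_minus, x_boundary


def free_energy_density(eps12, f_thermal, mu, eps_transformation):
    """rho * psi from (6.6) when eps12 = eps21 are the only nonzero strains."""
    return f_thermal + 2.0 * mu * (eps12 - eps_transformation) ** 2
```

The result describes a two-phase state only if the returned position satisfies $0 \le X \le H$; otherwise the prescribed $U$ is accommodated by a single phase.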
7. Extrapolation to Transition Zones Rather than Sharp Boundaries
It is clear that simple and clear-cut considerations like those in Sections 5 and 6
require uniform fields of stress and strain up to the phase boundary whose position
may then be calculated as shown. This does not work very often. Usually the fields
are non-uniform and not known analytically. Also neither the position nor the shape
of the phase boundary is known. In such cases extensive numerical calculations are
needed to obtain solutions inside the phases that are coherent at the boundary and
satisfy the jump conditions of thermodynamics.
Such calculations have been made (e.g., see [9, 10]), and they have been used to
determine the size and shape of precipitates in alloys. Actually such investigations
are close in scope to the motivation of Eshelby’s original paper [4], in which the
Eshelby tensor appeared first.
Not wanting to enter such complex numerical studies we shall rest content to
point out, in this section, certain simple cases in which stress and strain are not
uniform and yet definite simple answers can be found like those of the previous
sections.
Figure 4. Tensile and compressive rods.
Specifically we shall consider circumstances in which the non-uniformity of the
stress and strain fields is confined to a narrow range, the transition zone between
two phases. Typical examples are shown in Figure 4: a cylindrical rod under tension
undergoing a phase change that is accompanied by a change of cross section, and
a rod under compression where the lateral expansion is constrained by a rigid pipe.
In both cases there is a transition zone of non-uniform stress but outside that zone
it is reasonable to assume uniformity.
We apply the equations of balance of mass, momentum, energy and entropy
to material volumes whose surfaces ∂ are indicated by the dashed lines in
Figure 4 so that the cross-sections lie outside the transition zones. The surfaces
move with the bodies and the mantle parts of ∂ contribute nothing except possibly
heating. Therefore the balance equations read
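\[
\begin{aligned}
&\{\rho v_i n_i A\} = 0,\\
&\{\rho v_j v_i n_i A - t_{ji} n_i A\} = 0,\\
&\Bigl\{\rho\bigl(\varepsilon + \tfrac{1}{2} v^2\bigr) v_i n_i A - t_{ji} n_i v_j A\Bigr\} + \int_{\partial} q_i n_i \,\mathrm{d}A = 0,\\
&\{\rho \eta v_i n_i A\} + \int_{\partial} \frac{q_i n_i}{T}\,\mathrm{d}A \ge 0.
\end{aligned}
\tag{7.1}
\]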
The brackets denote differences between the two cross-sections, such that $\{c\} = c^{+} - c^{-}$. Note that the cross-sections may be different in area A. In the case of
the tensile rod density and stress vanish on the mantle, while in the case of the
compressed rod the contributions of stress on the mantle cancel each other because
of symmetry.
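We assume that T is uniform and eliminate the heating between (7.1)$_{3,4}$: multiplying (7.1)$_4$ by T and subtracting (7.1)$_3$ leaves the free energy $\psi = \varepsilon - T\eta$. Thus we obtain
\[
\Bigl\{\rho\bigl(\psi + \tfrac{1}{2} v^2\bigr) v_i n_i A - t_{ji} n_i v_j A\Bigr\} \le 0
\tag{7.2}
\]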
or, by (2.4), which holds for the curly brackets as well as the square ones,
\[
\{\psi\}\,\rho v_i n_i A - t_{ji} n_i A\,\{v_j\} \le 0.
\tag{7.3}
\]
In both of our examples $v_j$ is normal to the cross-sections and, by (7.1)$_1$, we may therefore write (with $v_i n_i = v_\perp$)
\[
\{v_j\} = \Bigl\{\frac{1}{\rho A}\Bigr\}\, n_j\, \rho v_\perp A
\tag{7.4}
\]
so that (7.3) assumes the form
\[
\Bigl\{\psi - \frac{t_{ji} n_j n_i A}{\rho A}\Bigr\}\, \rho v_\perp A \le 0.
\tag{7.5}
\]
As before, in Section 4, we argue from (7.5) that the mass rate $\rho v_\perp A$ of the transition is proportional to its factor in (7.5), with a negative factor of proportionality so as to satisfy the inequality. Thus, with $t_{ij} n_i n_j A$ as the tensile or compressive force P, we have
\[
\rho v_\perp A = N \Bigl\{\psi - \frac{P}{\rho A}\Bigr\}
\tag{7.6}
\]
and in phase equilibrium
\[
\Bigl\{\psi - \frac{P/A}{\rho}\Bigr\} = 0,
\tag{7.7}
\]
since, by (7.1)$_2$, P is the same on the two cross-sections.
The phase equilibrium condition (7.7) is formally identical to the liquid–vapour condition, cf. (5.1)$_1$.
Note, however, that when $A^{+}$ and $A^{-}$ are unequal, as in the tensile rod, it is not the stress P/A which is equal in the two phases but the load P. In such a case there is an interesting alternative form of (7.7), which results from multiplying numerator and denominator by the lengths $L^{+}$ and $L^{-}$, see Figure 4. In this way, with $\rho A L = m$, we obtain
\[
\Bigl\{\psi - P\,\frac{L}{m}\Bigr\} = 0 \quad\text{or}\quad P = \frac{\{\psi\}}{\{l\}},
\tag{7.8}
\]
where $l^{\pm} = L^{\pm}/m^{\pm}$ are the specific lengths of the phases.
Along with $P = (\partial\psi/\partial l)|_{+} = (\partial\psi/\partial l)|_{-}$ this gives rise to a common tangent construction for the free energies $\psi^{\pm}(l, T)$. Such a construction was described by Huo and Müller [17] in connection with a shape memory rod from a minimization of the free energy.
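The common tangent construction can be illustrated with a minimal numerical sketch. The quadratic free energies used here, with a common stiffness `k`, reference lengths `l0_minus`, `l0_plus` and offsets `c_minus`, `c_plus`, are purely hypothetical and are not taken from the paper or from [17]; they merely make the two conditions, equal slopes and P = {ψ}/{l}, easy to verify.

```python
# Common-tangent construction for two free-energy branches psi_minus(l), psi_plus(l).
# ASSUMPTION (illustrative only): psi_pm(l) = 0.5 * k * (l - l0_pm)**2 + c_pm
# with the same stiffness k in both phases.

def psi(l, k, l0, c):
    """Hypothetical quadratic free energy of one phase."""
    return 0.5 * k * (l - l0) ** 2 + c

def common_tangent(k, l0_minus, c_minus, l0_plus, c_plus):
    """Return (P, l_minus, l_plus) satisfying
    P = dpsi/dl on both branches and P = {psi}/{l}."""
    # Equal slopes P = k * (l - l0) give l_pm = l0_pm + P / k, hence {l} = l0_plus - l0_minus.
    # The quadratic parts then cancel in {psi}, leaving {psi} = c_plus - c_minus.
    P = (c_plus - c_minus) / (l0_plus - l0_minus)
    return P, l0_minus + P / k, l0_plus + P / k

k, l0_minus, c_minus, l0_plus, c_plus = 2.0, 1.0, 0.0, 1.5, 0.1
P, l_minus, l_plus = common_tangent(k, l0_minus, c_minus, l0_plus, c_plus)

jump_psi = psi(l_plus, k, l0_plus, c_plus) - psi(l_minus, k, l0_minus, c_minus)
jump_l = l_plus - l_minus
slope_minus = k * (l_minus - l0_minus)
slope_plus = k * (l_plus - l0_plus)

print("load P:", P)
print("{psi}/{l}:", jump_psi / jump_l)
print("slopes:", slope_minus, slope_plus)
```

For branches of unequal stiffness the two conditions no longer decouple, and a small root-finding step would be needed in place of the closed-form expression.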
Acknowledgements
Giovanni Buratti and Yongzhong Huo gratefully acknowledge the support of their funding institutions. G. Buratti was a junior researcher in the TMR project: Phase Transformations in Crystalline Solids and Y. Huo has an Alexander von Humboldt scholarship.
References
1. S.R. de Groot and P. Mazur, Anwendung der Thermodynamik irreversibler Prozesse. Bibliographisches Institut, Mannheim (1974).
2. R. Abeyaratne and J.K. Knowles, On the driving traction acting on a surface of strain discontinuity in a continuum. J. Mech. Phys. Solids 38 (1990) 345–360.
3. E. Fried, Energy release, friction and supplemental relations at phase interfaces. Continuum Mech. Thermodyn. 7 (1995) 111–121.
4. J.D. Eshelby, The elastic energy momentum tensor. J. Elasticity 5 (1975) 321–335.
5. W. Heidug and F.K. Lehner, Thermodynamics of coherent phase transformations in nonhydrostatically stressed solids. Pure Appl. Geophys. 123 (1985) 91–98.
6. L.M. Truskinovsky, Dynamics of non-equilibrium phase boundaries in a heat-conducting nonlinearly elastic medium. J. Appl. Math. Mech. PMM USSR 51 (1987) 777–784.
7. M.E. Gurtin, The dynamics of solid–solid phase transitions – 1. Coherent transitions. Arch. Rational Mech. Anal. 123 (1993) 305–335.
8. I-Shih Liu, On interface equilibrium and inclusion problems. Continuum Mech. Thermodyn. 4 (1992) 177–188.
9. I. Schmidt, Gleichgewichtsmorphologien elastischer Einschlüsse. Dissertation TU Darmstadt. Shaker Verlag (1997).
10. R. Müller, 3D-Simulation der Mikrostrukturentwicklung in Zwei-Phasen-Materialien. Dissertation TU Darmstadt (2001).
11. G.A. Maugin, Material forces: Concepts and applications. ASME Appl. Mech. Rev. 48 (1995) 213–245.
12. R. Kienzler and G.A. Maugin, Configurational Mechanics of Materials. CISM Internat. Centre for Mechanical Sciences, Courses and Lectures 427 (1999).
13. P. Podio-Guidugli, Configurational balances via variational arguments. Interfaces Free Boundaries 3 (2001) 1–13.
14. I. Müller, Thermodynamics. Pitman, Boston (1985).
15. I. Müller, Eshelby tensor and phase equilibrium. Theor. Appl. Mech. 25 (1999) 77–89.
16. C. Truesdell and R. Toupin, The classical field theories. In: Handbuch der Physik, Vol. III/1. Springer, Heidelberg (1960) pp. 226–793.
17. Y. Huo and I. Müller, Thermodynamics of pseudoelasticity – an analytical approach. Acta Mechanica 99 (1993) 1–19.
E. Frola (1906–1962): an Attempt Towards an
Axiomatic Theory of Elasticity
SANDRO CAPARRINI and FRANCO PASTRONE
Dipartimento di Matematica, Università di Torino, Via Carlo Alberto 10, 10123 Torino, Italy.
E-mail: caparrini@libero.it, pastrone@dm.unito.it
Received 19 September 2002; in revised form 16 October 2003
Abstract. In a few papers published in the 1940’s, the Italian mathematician Eugenio Frola (1906–
1962) proposed an axiomatic formulation of the theory of linear elasticity as an autonomous branch
of mathematics, and he sketched a program for the logical foundations of that mathematical theory.
He gave only some general principles and a few hints, without obtaining any definite axiomatic
structure in the sense of the fundamental work of W. Noll and C. Truesdell a few years later.
Nevertheless, Frola’s attempt to find rigorous axiomatic foundations of classical elasticity must be acknowledged as a very first step in the right direction.
Mathematics Subject Classifications (2000): 74-03, 74Axx, 74B05, 74B20.
Key words: history of elasticity, axiomatic theory.
To Clifford Truesdell, Master of Science and Life
1. Introduction
As is well known, at the beginning of the XXth century the theory of elasticity
was widely developed in Italy, to the point that Felix Klein claimed that “elasticity
was the national question of Italians”. In 1935, in the annual report of the Italian Society for the Progress in Science (SIPS), it was remarked that, in a rather
unpleasant period for Mechanics in Italy, two fields were still flourishing: Hydrodynamics and Elasticity. Eugenio Frola was part of the scientific group working in
linear elasticity and mechanics of structures and he showed a marked interest in
generalizing or trying to generalize any of the many problems studied by himself
or by his colleagues in different fields, even if he dealt mainly with the theory
of elasticity. He was variously attracted by problems in engineering, structural
mechanics, vibrations, physics, functional analysis, PDE, integral equations, and
economics. His efforts were always directed not only to solving particular problems
but also to finding more general formulations which could satisfy, even partially,
his expectation of pointing out the rigorous logical structure that ought to underlie
every theory.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 113–125.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
In 1940 Frola wrote two papers, in which he sought to develop a systematic
approach to nonlinear elasticity, starting from a linear model. In Section 3 we
examine this attempt, first discussed in an article [1] that appeared in Il Saggiatore,
a journal devoted to the dissemination of science among cultivated people, then in
a more technical paper published in the Proceedings of the Academy of Sciences
of Torino [2], where he went deeper into the question. In Section 4 we turn our
analysis to a paper which appeared in 1948 in Atti del Centro Studi Metodologici,
a largely unknown scientific journal, where he went through the principles of the
theory of elasticity and tried to state them in a rigorous way, to point out the basic
ideas that were applied, consciously or unconsciously, in the treatises and papers
on this subject. Indeed he realized that there was a need for a logical, rigorous mathematical formulation of the theory of elasticity, both linear and nonlinear, and he would have liked to pursue this task, but his divergent interests and his ill health put an end to his research.
In this paper our goal is to present and comment on Frola’s ideas on this specific topic, which was disregarded by most of his contemporaries but is fundamental to any modern theory of elasticity or, more generally, of continua.
2. Sketch of a Biography and Comments on Previous Works
Eugenio Frola was born in Montanaro, a small village at the foot of the Alps, near
Torino, on September 28, 1906. He obtained two degrees, one in Civil Engineering
from the Politecnico of Torino in 1929, and the other in Mathematics from the
University of Torino in 1933. Frola was a student of G. Albenga and G. Fubini, the
first working in Mechanics, the second in Analysis with a taste for the applications
to elasticity and mechanics (for a brief sketch of their activities see [3]). He became
Assistant to Fubini and to F.G. Tricomi, an outstanding mathematician who worked
in Analysis and is famous for the PDE bearing his name (see again [3]). After
a few years Frola obtained a position as Professor at the Politecnico of Torino.
On February 24, 1940, he was elected Member of the Academy of Sciences of
Torino, one of the oldest Italian academies, founded in 1759 by J.L. Lagrange
among others. Between 1932 and 1955 Frola published 37 papers, mostly in the
Proceedings of the Academy of Sciences of Torino and the Accademia dei Lincei.
Thereafter he suffered from a long illness that eventually proved fatal. He turned his mind to Buddhism and translated a couple of Buddhist books, the second of which was published after his death on May 6, 1962.
His severe health problems diverted him from research, and he did not complete his program to build logical foundations for the theory of elasticity; this prevented him from obtaining the results and the recognition he deserved. According to
the philosopher of science L. Geymonat, who edited in 1964 a book of Frola’s
collected papers [4], Frola’s contributions to epistemology were highly significant,
even more so than it would appear from his dense but few and schematic essays on
foundations and methodology (see the introduction of the volume [4, pp. 7–33]).
A brief analysis of the works of Frola can be found in [5].
Frola’s work is marked by some characteristic features that distinguish his approach from that of the other Italian mechanicians of the same period. In his research he always tried to generalize known or new results, relaxing some hypotheses or bringing out analogies with other disciplines. In the first period (1932–1940) his research followed this pattern: for instance, Frola [6] tried to extend to dynamics the Colonnetti and Castigliano versions of Betti’s reciprocity theorem, even though it was D. Graffi who succeeded in doing so, in 1939 and 1963 (see [3, pp. 414–417]). It is worth noting that in the paper [7]
published in the Acta Pontificia in 1938, the abstract is in Latin as requested:
“Auctor ostendit theorema Colonnetti circa systemata elasto-plastica nihil aliud
esse nisi hypotheses illa fundamentalis de deformationis omnimodae congruentia.
Docet etiam rectam quamdam elementariam rationem, qua opus nos est algorithmis
minimizantibus uti”. Hence we can see that the use of Latin in scientific papers did
not disappear completely, even in the XXth century, and the revival of this language
in a couple of papers by C. Truesdell signified the continuation of a use inherited
from great scientists of the past. In the same period Frola applied to elasticity
techniques of quantum mechanics [8], namely: he used the Dirac delta function and
infinite dimensional vector spaces to obtain the equation of elastodynamics from
a microscopical approach. In other researches he made use of linear functionals
to obtain integral equations as field equations of elasticity, where the influence of
Volterra, who had been in Torino from 1893 to 1900, is evident.
3. A First Step: Linear and Nonlinear Theory of Elasticity
In 1940 two papers of different style but with the same objective appeared [1, 2].
In these papers Frola explored the mathematical foundations of linear elasticity as
this theory was established in his time. His purpose was to make clear which parts
of linear elasticity would remain unchanged and which parts had to be modified if
the nonlinear theory were to be developed. Clearly this is not a trivial problem. By
now we understand that it would be much more correct to develop an exact theory
first and then linearize in a mathematically rigorous way, but it was not natural
at all in those years, when elasticity meant linear elasticity for almost everyone
(and even nowadays one can find in textbooks of physics such an identification).
The first paper was published in the first issue of the journal Il Saggiatore, the
purpose of which was to disseminate the more advanced and up-to-date results
in sciences among scientists themselves with papers that would avoid technical
complications but expose the main ideas. In the editorial board there were, among
others, F.G. Tricomi (whom we have already mentioned in the preceding section)
and G.C. Wick, an outstanding physicist. In this paper Frola explained his ideas
in an almost colloquial way and, before tackling the main question, he gave a
simple sketch of what he thought were the main differences between elasticity
and plasticity, by means of simple ideal experiments as follows: if small rods of
steel and lead are strained and their deformations during loading and unloading are
measured, the steel rod will recover the initial configuration, while the lead rod will
“remember” the deformations undergone. Here Frola uses rigorous language: “If
we want to make use of the language of modern analysis, we would say that [the
displacements in the lead rod] are no more a function of the force, but a functional
depending on the force, which is a function of time, a continuous functional in the
sense that small variations of the law with which the force has changed in time
cause small variations in the displacements” [1, p. 75]. The influence of the works
of Volterra, Pincherle and Amaldi on functionals is clear, but the idea of applying
such techniques to plasticity, in this context, is due to Frola.
Returning to elasticity, Frola realizes that elastic bodies can buckle when the load exceeds some critical value, and that this does not contradict the assumption of an elastic material but is a nonlinear effect:
“Nevertheless, even after the abrupt variation in the behavior [the buckling of an
elastic bar, authors’ remark], the body continues to be elastic (no more in the linear
sense) [. . .]” [1, p. 78]. In fact, Frola was interested in the problem of buckling
of structures not only for its many different applications, but also for a deeper
analysis of the theory of elasticity. While his contemporaries devoted their attention
to many detailed and particular problems, which led to engineering and structural
mechanics (see the comment by Truesdell and Noll in the final remarks below),
he spent his efforts at a crucial point, which was not a feature only of buckling of
rods, but a basic problem of the theory of elasticity: the correct formulation of a
nonlinear theory. He was aware of the paper of Trefftz on buckling [9], as we will
show later, but his interest was differently orientated and he was not satisfied with
the way this problem had been treated.
Frola concludes that, for such bodies under such circumstances, the principle of
superposition of the effects is no longer valid. On the other hand, at that time the
experimentalists assumed that in elastic solids this principle held and, in order to
be consistent with this experimental hypothesis, the linear model must be assumed.
Finally, to justify buckling effects the nonlinear model must be used, and the main
assumption to be rejected is the principle of superposition of the effects, and experiments must take into account this new fact. Then Frola distilled what he took
to be the essence of linear elasticity, as summarized in two axioms:
1. The principle of invariance of the response in a sequence of subbodies, each
one included in the preceding one, down to a body point: in the words of Frola,
“discesa dal globale al locale”, which means descent from global to local. In
other words, we can claim that an elastic body consists of elastic subbodies
and, in the limit, of elastic points.
2. The existence of a local elastic energy function as a positive definite quadratic form in the first derivatives of the displacement (components): “l’energia potenziale elastica locale sia in particolare una forma quadratica, definita positiva, nelle sole derivate prime dello spostamento, invariante per moti rigidi infinitesimi [. . .]”, where the term “invariante” must be referred to the energy function (ibidem, p. 79).
In fact, Axiom 2 is partially a consequence of Axiom 1: if we assume the existence of a global elastic potential energy, because of Axiom 1 we can obtain it as the
“sum” of the elastic potentials of the parts which, we imagine, compose the body, in
the limit even when these parts are elastic points. Then we can define a local function (the local elastic potential energy), which depends on local properties of the
displacement, and obtain the total energy by integration over the body. Since Frola
was mainly interested in static problems, he added the unnecessary assumption that
the elastic potential be positive definite and used the virtual work principle (which
is a more general assumption in mechanics) to build up an equilibrium theory for
linearly elastic solids.
He ends his paper with the following proposal: “Le basi e i fondamenti delle
teorie non lineari, che permettono una visione più ampia dei fenomeni elastici
generali, potranno formare argomento di un ulteriore esame” [The basis and the
foundations of nonlinear theories, which allow a broader view of general elastic
phenomena, can be in the future a subject of further investigations] [1, p. 80].
The next step appeared the same year. In a longer and more technical paper [2],
Frola proves that buckling phenomena cannot be included in the linear theory of
elasticity and the difficulty lies in the principle of superposition, which he calls here
the first postulate. In fact, the same remark was already made in [1], but now the
approach is more detailed and mathematically consistent. Again Frola begins from
a critical analysis of the postulates of the “ordinaria” (in our language it means linear) theory of elasticity, remarking that the very basic postulate is the superposition
principle. It is a phenomenological principle, from which both a global and a local
theory ensue: in the first case it is expressed by means of integral field equations,
and in the second case by differential equations. In a global theory this principle
means that the stress–strain relations are expressed with linear functionals and that
the local equations thus obtained are not necessarily linear differential equations.
On the other hand, Hooke’s law is a physical assumption typical of linear theories,
requiring the algebraic linearity of the stress–strain relations. But, as Frola clearly
proves, the two assumptions are not equivalent: global linearity does not imply Hooke’s law, nor, obviously, does the converse hold. It follows that the first postulate can be saved, but must be generalized: a global principle of superposition is required.
A second postulate is added: “I quadrati delle derivate prime degli spostamenti
siano trascurabili rispetto alle derivate stesse” [the squares of the first derivatives
of the displacements must be negligible with respect to the derivatives themselves]
[2, p. 535].
A third postulate concerns the loads: in modern language it states that only
dead loads are taken into consideration. At this point, almost incidentally, Frola
remarks that, if we admit the existence of a strain energy function, the matrix
of the coefficients in Hooke’s law must be symmetric and vice versa. This is a
consequence of Betti’s reciprocal theorem, as it was clearly proved by C. Truesdell
[10] in the more general context of finite elasticity. (Truesdell proved that Betti’s
reciprocal theorem is a sufficient and necessary condition for an elastic material to
be hyperelastic.)
Finally, Frola rephrases the axioms listed in the previous paper, which are necessary to obtain a linear theory. In modern language they can be summarized in the
following way: (i) the body is linearly hyperelastic, (ii) the displacement gradient
is “small”, and (iii) the external loads are dead loads.
Returning to the problem of buckling of elastic structures, Frola proves that the
first and third postulates are not affected by this anomalous behavior, while the
second postulate fails: experience shows, for instance, in the well-known case of
the elastica, that we can have small deformations but large first derivatives. Hence
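Frola proposes to modify the strain–displacement relations, as follows:
\[
\varepsilon_{xx} = e_{xx} + \tfrac{1}{2} e_{xx}^2 + \tfrac{1}{8} e_{xy}^2 + \tfrac{1}{8} e_{zx}^2 + \tfrac{1}{2} e_{xy}\, r - \tfrac{1}{2} e_{zx}\, q + \tfrac{1}{2} q^2 + \tfrac{1}{2} r^2 ,
\tag{1}
\]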
and analogous expressions for the other components of the strain, where exx, eyy, ezz, exy, eyz, ezx are the classical linear strain components in a Cartesian coordinate system (x, y, z), (u, v, w) are the displacement components and (p, q, r) the components of the local rotation. In fact, according to Frola’s notation, the εxx, εyy, . . .
are the components of what we now call the Lagrangian strain tensor and the
exx , exy , . . . , p, q, r can be written in terms of the displacement gradient H,
namely as: exx = Exx , exy = 2 Exy , . . . , p = −Wyx , where Exx , . . . , Wxy , . . . are
the components of: E = Sym H, W = Skw H. Assuming that the squares of the
linear strain components and the mixed terms are negligible, while the squares of
the rotations are admitted, one can obtain the nonlinear relations:
\[
\varepsilon_{xx} = e_{xx} + \tfrac{1}{2}\,(q^2 + r^2), \qquad \varepsilon_{yz} = e_{yz} - qr, \;\ldots .
\tag{2}
\]
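The passage from the full expression to the truncated one can be traced in a short worked sketch. It assumes that εxx is the xx-component of the Lagrangian strain, that the shear components follow the engineering convention exy = 2Exy stated in the text, and that the rotation components are q = ½(∂u/∂z − ∂w/∂x) and r = ½(∂v/∂x − ∂u/∂y), the sign convention implied by Cicala's formula (5); this convention is an assumption about Frola's notation.

```latex
\begin{align*}
\varepsilon_{xx} &= \frac{\partial u}{\partial x}
  + \frac12\left[\left(\frac{\partial u}{\partial x}\right)^{2}
  + \left(\frac{\partial v}{\partial x}\right)^{2}
  + \left(\frac{\partial w}{\partial x}\right)^{2}\right], \\
\frac{\partial u}{\partial x} &= e_{xx}, \qquad
\frac{\partial v}{\partial x} = \tfrac12 e_{xy} + r, \qquad
\frac{\partial w}{\partial x} = \tfrac12 e_{zx} - q, \\
\varepsilon_{xx} &= e_{xx} + \tfrac12 e_{xx}^{2}
  + \tfrac12\left(\tfrac12 e_{xy} + r\right)^{2}
  + \tfrac12\left(\tfrac12 e_{zx} - q\right)^{2} \\
 &= e_{xx} + \tfrac12 e_{xx}^{2} + \tfrac18 e_{xy}^{2} + \tfrac18 e_{zx}^{2}
  + \tfrac12 e_{xy}\, r - \tfrac12 e_{zx}\, q + \tfrac12 q^{2} + \tfrac12 r^{2} \\
 &\approx e_{xx} + \tfrac12\left(q^{2} + r^{2}\right)
  \quad \text{(dropping squares of strains and mixed strain--rotation terms).}
\end{align*}
```

The shear relation follows in the same way: the only product of rotations in the yz-component of the Lagrangian strain is −½qr, which doubles to −qr in the engineering shear.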
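By using the first axiom the local elastic potential energy density takes the form:
\[
\begin{aligned}
W = W_0 &+ (\lambda + \mu)\,\theta\,(p^2 + q^2 + r^2) + \tfrac{1}{2}(\lambda + \mu)(p^2 + q^2 + r^2)^2 \\
&- \mu\,[\,e_{xx} p^2 + e_{yy} q^2 + e_{zz} r^2 + e_{xy}\, pq + e_{yz}\, qr + e_{zx}\, pr\,],
\end{aligned}
\tag{3}
\]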
where W0 is the quadratic elastic strain energy density used in the linear theory
and θ = (exx + eyy + ezz ) the ordinary cubic dilatation. In a few final lines Frola
states a variational principle by means of the virtual work theorem and refers to a
forthcoming paper with applications and local field equations given explicitly. But
the paper never appeared.
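The structure of the energy (3) can be checked numerically. The sketch below assumes, which the source does not state explicitly, that (3) is obtained by inserting the modified strains (2) into the isotropic quadratic energy W0 = ½λθ² + μ(exx² + eyy² + ezz²) + ½μ(exy² + eyz² + ezx²) written in engineering shear strains, with λ and μ the Lamé constants. All function names are illustrative.

```python
import random

def w0(lam, mu, exx, eyy, ezz, exy, eyz, ezx):
    """Isotropic quadratic energy in engineering strains (assumed form of W0)."""
    theta = exx + eyy + ezz
    return (0.5 * lam * theta ** 2
            + mu * (exx ** 2 + eyy ** 2 + ezz ** 2)
            + 0.5 * mu * (exy ** 2 + eyz ** 2 + ezx ** 2))

def w_from_relations_2(lam, mu, e, rot):
    """Insert the modified strains of relations (2) into the quadratic energy."""
    exx, eyy, ezz, exy, eyz, ezx = e
    p, q, r = rot
    return w0(lam, mu,
              exx + 0.5 * (q * q + r * r),
              eyy + 0.5 * (r * r + p * p),
              ezz + 0.5 * (p * p + q * q),
              exy - p * q, eyz - q * r, ezx - r * p)

def w_formula_3(lam, mu, e, rot):
    """Energy density in the expanded form of formula (3)."""
    exx, eyy, ezz, exy, eyz, ezx = e
    p, q, r = rot
    theta = exx + eyy + ezz
    omega2 = p * p + q * q + r * r
    return (w0(lam, mu, *e)
            + (lam + mu) * theta * omega2
            + 0.5 * (lam + mu) * omega2 ** 2
            - mu * (exx * p * p + eyy * q * q + ezz * r * r
                    + exy * p * q + eyz * q * r + ezx * p * r))

random.seed(0)
lam, mu = 1.3, 0.7  # arbitrary test values for the Lame constants
e = [random.uniform(-1.0, 1.0) for _ in range(6)]
rot = [random.uniform(-1.0, 1.0) for _ in range(3)]
difference = w_from_relations_2(lam, mu, e, rot) - w_formula_3(lam, mu, e, rot)
print("difference:", difference)
```

Under these assumptions the two expressions agree identically, not only to leading order, so the printed difference is at the level of floating-point rounding.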
Let us remark that this critical review of the foundations of elasticity was carried
out by Frola in a totally independent way. Analogous researches were carried out
at the same time only in the former Soviet Union and were revealed to western
scientists through the treatise by Valentin Valentinovich Novozhilov (1910–1987)
when its English version appeared in 1953 [11]. The subject was then definitively
cleared up in a general context by Truesdell and Toupin [12].
The different “weights” attributed to displacement gradients and local rotations
became a common feature after the works of Naghdi and others [13, 14], where
the concepts of small displacements and moderate rotations were introduced. This
modern terminology was not used in [2], but Frola explicitly said that the deformation components $e_{xx}$, $e_{yy}$, $e_{zz}$ are of the first order and the squares of the local rotation components $p^2$, $q^2$, $r^2$ must be of the same order. In other words, we do not deal with infinitesimal deformations and finite rotations, as many have mistakenly assumed, but with small rotations of a different order of infinitesimality with respect to the deformations. The definition of Frola in [2] fits exactly with the definition given by Naghdi and Vongsarnpigoon [14, Section 4.3, p. 282]: “Given [a small strain] $E = O(\varepsilon_0)$, a proper orthogonal tensor R is said to be a moderate rotation with respect to $\varepsilon_0$ if for any unit vector v, the vector β defined in (4.29) [β = Rv − v] satisfies: $\beta = O(\varepsilon_0^{1/2})$ as ε → 0”.
Only one person noticed this new idea of Frola. The reason is that the nonlinear theory of elasticity had been introduced independently in the 1930’s by A. Signorini in Italy and by F.D. Murnaghan in the USA. The approach of Signorini (and of Murnaghan, who was however not known in Italy) is completely different from Frola’s. Signorini did not pay much attention to the logical foundations of his theory. He put his
efforts in developing techniques useful in solving some problems he had in mind,
which could not be treated by the linear model. Unfortunately Signorini chose
the “dreadful notation” [C. Truesdell, verbatim] of the homographies, imposed in
Italy by C. Burali-Forti and T. Boggio, instead of tensor calculus. This awkward
formalism and the fact that the papers were written in Italian made the work of
Signorini almost unreadable outside Italy. Only C. Truesdell took the trouble to
scrutinize it thoroughly, so that many of Signorini’s results, clarified and cleansed,
appeared in the treatise of Truesdell and Toupin [12], with the credits he deserved.
The only one who noticed Frola’s results was Placido Cicala (1910–1996), a
professor of civil engineering at the Politecnico of Torino. Cicala had just written
a paper [15] in which he proposed to modify the classical equations of equilibrium
of linear elasticity in order to allow critical and postcritical behavior in loaded
structures, i.e., again the buckling problem. As pointed out by Frola [2, p. 532],
this approach can lead to a family of contradictory theories “in quanto partenti
da risultati acquisiti dalla teoria ordinaria dell’elasticità che deve d’altra parte nel
corso della ricerca essere negata nei suoi postulati” [because it is based on results
obtained from the linear theory of elasticity, which, on the other hand, must be
denied in its postulates during the same research]. Frola means that authors like
Cicala modified the linear theory by adding ad hoc terms in the strain–displacement
relations and in the constitutive relations relative to some specific problem, and in
this way they contradicted, without noticing it, the postulates of linear elasticity
they were still using. Cicala [16], in a one-page footnote of a paper devoted to the
nonlinear theory, rejected the above-mentioned critical remark of Frola, and tried
to prove with a simple numerical example that Frola’s deductions were wrong.
Incidentally, Cicala provided evidence that Frola was aware of the paper of Trefftz [9]. In a footnote on page 95, Cicala remarks: “(1 ) . . . This note represents an
appreciable development of the theory whose foundations are stated by Trefftz in
his master paper . . .” and, on page 97, we find another footnote: “This [variational]
procedure is used by Trefftz in the paper quoted above. Frola in his future notes,
will adopt, as he announces [annuncia], such method”. The Italian verb “annuncia”
used in this remark means that probably Frola mentioned this paper to Cicala in
some private discussion, because he never wrote such a sentence.
Cicala deals with a rigid rotation around the z-axis of angle α, which produces
the displacement:
u = x(cos α − 1) − y sin α,
v = y(cos α − 1) + x sin α,
w = 0.
(4)
Assuming the approximations required by Frola, Cicala obtains the explicit
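form for the deformation component $\varepsilon_{xx}$:
\[
\varepsilon_{xx} = \frac{\partial u}{\partial x}
 + \frac{1}{8}\left(\frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}\right)^{2}
 + \frac{1}{8}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)^{2},
\tag{5}
\]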
which is equal to the expression (2). By substitution of (4) in (5) Cicala finds the
following value for the first deformation component:
\[
\varepsilon_{xx} = -\tfrac{1}{2}(\cos\alpha - 1)^2
\tag{6}
\]
and since “the rotation α can be any, it can even result $\varepsilon_{xx} = 2$” (indeed, as Frola will remark: $\varepsilon_{xx} = -2$). And this is in contradiction with the hypothesis of small deformations.
In fact, as Frola himself wrote in [17], Cicala confused “infinitesimal” and
“small” quantities: “the theory [. . .] is obviously a limit theory which intends to
represent phenomena not encompassed in the classical [= linear] theory, but it
does not pretend to be either a theory of finite displacements or to interpret any
elastic phenomenon”. Frola tried to explain that, when we deal with “infinitesimal” quantities, they need not be of the same order, and, in any case, the example is misleading. The same rotation inserted in the formula (3) of Cicala’s paper will produce the strain $e_{xx} = \cos\alpha - 1$, and Cicala claims that this result is correct, because it is of the same order as the deformation. It is easy for Frola to argue that $e_{xx}$ is of order $\alpha^2$, while $\varepsilon_{xx}$ is of order $\alpha^4$; hence the rotations are of a different order of “smallness” with respect to the pure deformations, as postulated.
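The orders of magnitude in this exchange can be reproduced with a few lines of arithmetic. The sketch below evaluates the displacement gradient of the rigid rotation (4) and inserts it into expression (5), read here as εxx = ∂u/∂x + ⅛(∂u/∂z − ∂w/∂x)² + ⅛(∂v/∂x − ∂u/∂y)²; the function name `cicala_example` is only illustrative.

```python
import math

def cicala_example(alpha):
    """Return (e_xx, eps_xx) for a rigid rotation of angle alpha about the z-axis.

    Displacement (4): u = x(cos a - 1) - y sin a, v = y(cos a - 1) + x sin a, w = 0.
    """
    du_dx = math.cos(alpha) - 1.0
    du_dy = -math.sin(alpha)
    dv_dx = math.sin(alpha)
    du_dz = 0.0
    dw_dx = 0.0
    e_xx = du_dx  # classical linear strain component
    # Expression (5): linear strain plus the squared-rotation terms.
    eps_xx = du_dx + (du_dz - dw_dx) ** 2 / 8 + (dv_dx - du_dy) ** 2 / 8
    return e_xx, eps_xx

for alpha in (0.2, 0.1, 0.05):
    e_xx, eps_xx = cicala_example(alpha)
    # The ratios settle as alpha decreases, exposing the alpha**2 and alpha**4 scaling.
    print(alpha, e_xx / alpha ** 2, eps_xx / alpha ** 4)

# Cicala's extreme case: a half-turn, far outside the range of "small" rotations.
print(cicala_example(math.pi)[1])
```

As α decreases, the two printed ratios approach −1/2 and −1/8 respectively, which is Frola's point: for this displacement field the rotation enters εxx only at fourth order, while the linear strain is of second order.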
Finally, one must not confuse “small” with “finite”: “the theory, . . . , does not
pretend to be a theory of finite displacements . . . . All the first derivatives of the
displacement components must be considered small, . . . not necessarily of the
first order” [17, p. 259]. Moreover the criticism of Cicala does not apply to this
model “because it acts on a field (finite displacements) which is not included in my
theory” (ibidem, p. 260).
It has already been remarked that Frola’s point of view is consistent with the
theory of small deformations and moderate rotations as developed by Naghdi and
coworkers [13, 14], who surely did not know the papers of Frola.
Then, having been provoked, Frola proved that the counterexample produced by Cicala could be easily adapted to classical linear elasticity; he also stated that his own theory was not rigorous, but sufficient to interpret some physical facts, and that he was not dealing with a finite displacement theory. Furthermore, Frola showed that in Cicala’s paper [15] there were some “weak points”, not to say mistakes and “inesattezze alquanto pericolose” [quite dangerous inaccuracies], as, for example, the confusion of the modulus of a derivative with the derivative of the modulus, the use of wrong rules of approximation, and finally, the statement that “if a is very small, $\sqrt[8]{a}$ is very small too” [15, p. 213], which is misleading. In fact, “if a goes to zero, so does $\sqrt[8]{a}$, but if $a = 10^{-4}$ (really small), $\sqrt[8]{a} = 10^{-1/2} \approx 0.315$, which I would not say to be very small” [17, p. 262]. Cicala did not reply and the debate passed unnoticed in the Italian scientific community.
4. The Methodology of Mathematics and the Last Paper (1948)
At the end of World War II, Frola turned his mind to new perspectives: he was
mainly, if not exclusively, interested in problems related to the foundations and
methodology of mathematics. This change must surely be connected with the establishment of the “Centro Studi Metodologici di Torino”, whose founders were
prominent scientists working at the University and at the Politecnico of Torino:
mathematicians, physicists, chemists, economists, etc. It was a smaller version of
the Vienna Circle: a private association of scholars, linked by a common interest in
logical foundations of science. Frola was very active, from the beginning, even if
he was interested almost only in mathematics and its applications and did not seem
to be too much influenced by the Austrian neopositivism. On the history of the
Centro the reader is referred to a very complete article by Giacardi and Roero [19].
Included in the above mentioned book [4] (edited by Geymonat and published in
1964) are Frola’s main papers of this period, all concerned with mathematics and
its relationship with other sciences, including engineering, physics and economics.
This last subject is discussed in detail in his most celebrated methodological paper,
written with the economist Leoni [20]. Here Frola strongly denies that it would be
possible to apply mathematics to economics in the same way that it is applied to
physics. He shows that other mathematical techniques are needed, most of them
not even known to the economists of that period. Here we have no intention to go
into the details of this paper, which contains many interesting critical comments
and suggestions, but it is worth noting that at the beginning criticism is levelled at
those who blindly use mathematics without caution: “Writing down a system of
mathematical relations is useless [. . .] and meaningless, if theorems of existence of
the solutions are not proved and techniques of approximations are not determinated
[. . .]. This solution will be considered satisfactory, and the system of mathematical
relations efficient, if it [. . .] will provide us a good accuracy” [20, p. 88]. At the
end of the same paper Frola attacks again the improper use of mathematics and
quotes a book by P.A. Samuelson, Foundations of Economic Analysis (Cambridge,
1948), where, on page 10, a system of n equations involving some not well defined “functional relationships” in n unknowns and m parameters is considered.
Samuelson deduces that any of the unknowns can be expressed as a function of the
parameters. After some sharp considerations, Frola concludes: “[. . .] we cannot
but regard this approach as a free and easy use of mathematics. One could find
countless examples of such uninhibited use of mathematics in economics, which
are not considered here [. . .]” [20, p. 109].
Pursuing his new interest in the philosophy of mathematics, in 1948 Frola wrote
[18] a paper on the logical foundations of the theory of elasticity, where he does
not enter into the details of an axiomatic structure, but simply tries to fix the fundamental concepts of the theory. It is a sort of first draft, which was never followed
by a more complete work. Its relevance is due to the fact that almost certainly it is
the first attempt, together with the papers of 1940–1942, to formalize some basic
axioms of the theory of elasticity.
Frola makes a distinction amongst the characteristics of the axioms, dividing
them into three families. A similar distinction in different groups of hypotheses
can be found in the general and classical works of W. Noll and C. Truesdell, even
if, obviously, the classification of Frola is simpler, more schematic, and even naive.
In Frola’s scheme, there are three groups of axioms, each one containing three
hypotheses. The first group contains “geometrical” assumptions:
1. An elastic body is a domain in a Euclidean 3-dimensional space (what we can
call a “placement” of the body).
2. A vector field, representing the deformations, is defined over this domain.
3. This vector field must satisfy suitable smoothness conditions.
The second group contains the “mechanical” assumptions:
4. The loads are described by vectors, with the same properties as forces in rational
mechanics.
5. “On the existence of tensions”: internal tractions as surface forces are defined
by the assumption that a surface force acts on the boundary of any subbody and
represents the action of the remaining part of the body over the given subbody.
6. The nature of loads and tensions must be defined in a precise, mathematical
way, including hypotheses analogous to those required in point 3.
It is clear that such a scheme presumes a more general theoretical framework,
where the forces (i.e., volume and surface forces) must be introduced according
to some general theory of rational mechanics. The third group is called “physical
hypotheses” and concerns more specifically linear elasticity:
7. Linearity: there holds the superposition principle, as introduced in [1].
8. The displacements are small, in the usual sense.
9. “[D]iscesa all’infinitesimo”: all the preceding assumptions are valid for any
subbody of the body, down to the smallest part of the body, namely a point,
taken as the limit of a sequence of shrinking domains ordered by inclusion.
This final assumption, of a nature analogous to assumption 5, neither has the
support of experimental evidence nor represents any physical experience, but is
the result of a “metaphysical attitude” necessary to apply the macroscopic continuous model, which allows the derivation of differential equations as field equations.
In his final remark, Frola expresses his dissatisfaction with this inadequate construction. He realizes that it is not the only possible approach, and the mathematical
language used here is not even the best mathematical language one can imagine.
Surely there are gaps to be filled, and he hints vaguely that he would continue his
researches in this direction.
But Frola soon turned his mind away from elasticity and mathematics. He wrote
a few papers about methodology of mathematics, including the aforementioned
one on economics, but he never continued his interesting program on axiomatization of the theory of elasticity and became wholly immersed in his studies on
Buddhism.
In conclusion Frola’s ideas, even if his construction has not been completed,
remain exemplary for their consistency and coherence, and they stimulate further work to search for rigorous logical foundations of theories, for physically
meaningful hypotheses, and for correct applications in different fields.
The last lines of this paper [18, p. 14] are a sketch of a program: “Rimane
quindi allettante, e dal punto di vista critico e da quello applicativo della scienza
delle costruzioni, il tentare altre vie per la costruzione di nuovi complessi teorici
destinati a descrivere scientificamente il fenomeno elastico”. [It therefore remains appealing, from both the critical and the applied points of view of structural mechanics, to try
different ways of building new theoretical structures apt to describe the elastic
phenomenon scientifically.]
Frola failed, but this task would be successfully fulfilled a few years later by
W. Noll, C. Truesdell and others, in complete generality, not only for elastic materials, but also for a broad class of materials, within the framework of Rational
Continuum Mechanics.
5. Final Remarks
Even if Frola did not obtain the results he pursued, he felt the need of constitutive equations for the nonlinear theory of elasticity, and by extension for other
theories based on a mathematical model. In this sense he realized beforehand the
significance of one of the basic points in Rational Continuum Mechanics. The constitutive equations, correctly formulated, are fundamental in providing a rigorous
framework and the logical foundations of a physical-mathematical theory. They are
based on different sets of axioms, as Frola explicitly said, and his classification is
very similar, even if much sketchier, to the classification used nowadays.
The historical impact of Frola’s papers is slight, because they were few, incomplete and unknown, but it is surprising that he, an isolated and pathetic figure of a
scientist, working outside the mainstream of interest of his time, tried to find the
logical structure of the theory of elasticity, both linear and nonlinear.
The main interests in Italian mechanics between the 1930’s and the 1950’s
were surely not to build up general theories of continua, but to solve many particular problems, applying analytical tools, often complicated and sophisticated,
with the goal of proving existence, uniqueness, and regularity of the solutions of
various differential equations. In other countries different models were developed,
mainly linear, to study phenomena such as plasticity, dynamics of fluids, structural
mechanics, etc., and again many particular problems were solved, many results
were found, sometimes interesting, but “most of which have later turned out to be
unnecessary in the cases they are justified. Knowledge of the true principles of the
general theory seems to have diminished except in Italy, where it was kept alive
by the teaching and writing of Signorini” [21, p. 9]. Nevertheless the relevant work
done by Signorini, one of the pioneers, with Murnaghan, on the nonlinear theory of
elasticity, was ignored until C. Truesdell read his difficult papers (written in Italian)
and noticed their significance.
Frola did not follow the lead of Signorini. His approach was completely different: his goal was to find the logical structure at the basis of a theory in applied
mathematics and, even if he solved some particular problems in elasticity, he was
aware that the theory of elasticity, as well as other theories, had weak foundations
and it was necessary to reconstruct them firmly.
There are no important theorems in his papers, but we would like to point out
that Frola tried to go deeper into the logical foundations of linear and nonlinear
elasticity, while his contemporaries were not interested in it. He was not able to set
the theory of elasticity on a satisfactory basis, but we must recognize that his was
one of the very first attempts in this direction.
Acknowledgements
The first author was supported by the project MIUR-COFIN 2000 “Storia delle
scienze matematiche”. The second author was supported by the project MIUR-COFIN 2002 “Mathematical Models for Material Science” and, partially, by
GNFM-CNR. The authors are indebted to Chi-Sing Man for his careful reading
of the manuscript and helpful suggestions, which greatly improved the content of
this paper.
References
1. E. Frola, La teoria dell’elasticità. Il Saggiatore 1 (1940) 74–80.
2. E. Frola, Sull’elasticità non globalmente lineare. Principi e fondamenti delle teorie. Atti Accad. Sci. Torino 75 (1940) 531–540.
3. F. Pastrone, Fisica matematica e meccanica razionale. In: S. Di Sieno, A. Guerraggio and P. Nastasi (eds), La Matematica Italiana dopo l’Unità. Gli Anni tra le Due Guerre Mondiali. Marcos y Marcos, Milano (1998) pp. 381–504.
4. E. Frola, Scritti Metodologici, L. Geymonat (ed.). Giappichelli, Torino (1964).
5. L. Geymonat, Eugenio Frola. Atti Accad. Sci. Torino 151 (1961/62) 986–997.
6. E. Frola, Su di una generalizzazione dinamica del teorema di Betti diversa da quella di Lord Rayleigh. Rend. Accad. Lincei s. VI XXV (1937) 586–589.
7. E. Frola, Intorno al teorema di Colonnetti sui sistemi elasto-plastici. Acta Pont. Acad. Sci. 2(7) (1938) 61–71.
8. E. Frola, Il problema di Cauchy in grande e le equazioni alle derivate parziali lineari a coefficienti costanti. Rend. Accad. Lincei s. VI XXVII (1938) 518–524.
9. E. Trefftz, Ueber die Ableitung der Stabilitätskriterien des elastischen Gleichgewichtes aus der Elastizitätstheorie endlicher Deformationen. In: Verh. 3. Internat. Kongr. Techn. Mech. 3 (1931) 44–50.
10. C.A. Truesdell, The meaning of Betti’s reciprocal theorem. J. Research of N.B.S. 67B (1963) 85–86.
11. V.V. Novozhilov, Foundations of the Nonlinear Theory of Elasticity. Gostekhizdat, Moscow (1948) (English transl. by F. Bagemihl, H. Komm and W. Seidel, Graylock, Rochester, NY (1953)).
12. C.A. Truesdell and R.A. Toupin, The classical field theories. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/1. Springer, Berlin (1960) pp. 226–793.
13. J. Casey and P.M. Naghdi, Physically nonlinear and related approximate theories of elasticity, and their invariance properties. Arch. Rational Mech. Anal. 76 (1981) 355–390.
14. P.M. Naghdi and L. Vongsarnpigoon, Small strain accompanied by moderate rotation. Arch. Rational Mech. Anal. 80 (1982) 263–294.
15. P. Cicala, Sulla stabilità dell’equilibrio elastico. Atti Accad. Sci. Torino 75 (1940) 185–222.
16. P. Cicala, Sulla teoria non lineare dell’elasticità. Atti Accad. Sci. Torino 76 (1940) 94–104.
17. E. Frola, Su alcune questioni di elasticità non lineare. Atti Accad. Sci. Torino 77 (1942) 258–262.
18. E. Frola, Sui fondamenti logici della teoria dell’elasticità. Atti del Centro di Studi Metodologici I (1948) 12–14.
19. L. Giacardi and C.S. Roero, L’eredità del Centro di Studi Metodologici di Torino. In: Quaderni di Storia dell’Università di Torino, Vol. 2. (1998) pp. 289–356.
20. E. Frola and B. Leoni, Possibilità di applicazione delle matematiche alle discipline economiche. Il Politico 20 (1955) 190–210.
21. C. Truesdell and W. Noll, The nonlinear field theories of mechanics. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin (1965) pp. 1–602.
Symmetries and Hamiltonian Formalism
for Complex Materials
GIANFRANCO CAPRIZ¹ and PAOLO MARIA MARIANO²
¹ Dipartimento di Matematica, Università di Pisa, via Buonarroti 2, I-56127 Pisa, Italy.
E-mail: capriz@dm.unipi.it
² Dipartimento di Ingegneria Strutturale e Geotecnica, Università di Roma “La Sapienza”,
via Eudossiana 18, I-00184 Roma, Italy. E-mail: paolo.mariano@uniroma1.it
Received 25 March 2003; in revised form 6 October 2003
Abstract. Preliminary results toward the analysis of the Hamiltonian structure of multifield theories
describing complex materials are reported: we invoke the invariance under the action of a general
Lie group of the balance of substructural interactions. Poisson brackets are also introduced in the
material representation to account for general material substructures. A Hamilton–Jacobi equation
suitable for multifield models is presented. Finally, a spatial version of all these topics is discussed
without making use of the notion of paragon setting.
Mathematics Subject Classifications (2000): 74A30, 74A35, 74B20.
Key words: complex materials, microstructures, elasticity, multifield theories.
In memory of Clifford Ambrose Truesdell, our teacher and mentor
1. Lagrangian and Hamiltonian Descriptions of Elastic Bodies with
Substructure
In standard continuum mechanics, each material element of a body is “collapsed”
into the place occupied by its centre of mass; let X be that place in the reference
placement; the set of all X is taken to be a fit region B0 of the three-dimensional
Euclidean space E.
Sometimes such a simplistic model of physical reality is insufficient; then, to
render the picture adequate, the material element must be portrayed as a system and
at least some coarse-grained descriptor ν (an order parameter) enters the picture.
Here, as in [1] (see for other details and additional results [2–5]), we take ν as
an element of some differentiable manifold M, and presume that physical circumstances impose a single choice of metric and of connection for M.
We also assume that the region occupied by the body in the current placement
is obtained through a sufficiently smooth mapping x̃: B0 → E, so that the current
place of a material element at X in B0 is given by x = x̃(X), and B = x̃(B0)
is also fit. We denote as usual with F the placement gradient. We presume also
that another sufficiently smooth mapping ν̃: B0 → M shows the value of the order
parameter at X, namely ν = ν̃(X). A motion is a pair of time-parametrized families
x̃t(X) = x̃(X, t) and ν̃t(X) = ν̃(X, t), twice differentiable with respect to time.
Rates in the material representation are indicated with ẋ(X, t) and ν̇(X, t), and we
will write ẋ and ν̇ for brevity.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 127–140.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
We restrict here our attention to bodies for which a Lagrangian density L exists,
so that the total Lagrangian $L_{B_0}$ of the body is given by
$$L_{B_0} = \int_{B_0} L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu)\, d(\mathrm{vol}), \tag{1}$$
the gradient ∇ν being based on the mandatory connection.⋆ We presume that L is
of the form
$$L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu) = \tfrac{1}{2}\rho_0 |\dot{x}|^2 + \rho_0 \chi(\nu, \dot{\nu}) - \rho_0 e(X, F, \nu, \nabla\nu) - \rho_0 w(x, \nu), \tag{2}$$
where ρ0 is the referential mass density (conserved during the motion), χ the kinetic co-energy (see [1, p. 19]) associated with the substructure, e the elastic energy
density and w the density of the potential of external actions, all per unit mass.
Below we use the notation b = −∂x w for the density of standard external actions
and β = −∂ν w for the substructural ones. The kinetic energy density ρ0 κ(ν, ν̇)
pertaining to the substructure is the partial Legendre transform of χ with respect
to ν̇.
If L is sufficiently smooth, we may apply standard procedures to derive Euler–Lagrange equations for the functional $L_{B_0}$:
$$\dot{\overline{\partial_{\dot{x}} L}} = \partial_x L - \mathrm{Div}\,\partial_F L, \tag{3}$$
$$\dot{\overline{\partial_{\dot{\nu}} L}} = \partial_\nu L - \mathrm{Div}\,\partial_{\nabla\nu} L, \tag{4}$$
where Div is the divergence calculated with respect to X, i.e. Div = tr∇.
Put
$$\mathcal{H} = \dot{x} \cdot \partial_{\dot{x}} L + \dot{\nu} \cdot \partial_{\dot{\nu}} L - L. \tag{5}$$
⋆ The pair (ν̃, ∇ν̃) collects the peculiar elements of the tangent mapping Tν̃: B0 × Vec → TM,
where Vec is the translation space over E, and we identify B0 × Vec with TB0. More specifically, we
have Tν̃(X): $T_X B_0 \to T_\nu M$. Such elements cannot be separated invariantly unless M is endowed
with a parallelism (and one wants also to have a physically significant parallelism). A similar remark
holds also for each element (ν, ν̇) of TM. The Lagrangian density is then defined on
B0 × E × Vec × Hom(Vec, Vec) × TM × Hom(TB0, TM)
(satisfying some compatibility conditions, possibly of various kinds, depending on the substructure),
with Hom(A, B) the set of linear transformations between A and B.
Clearly, $\mathcal{H}$ is the density of the total energy. In fact, since ∂ν̇ L = ρ0 ∂ν̇ χ, the term
ν̇ · ∂ν̇ χ − χ in (5) coincides with the substructural kinetic energy density κ(ν, ν̇)
(hence the presence of χ rather than κ in the expression of L), so that
$$\mathcal{H} = \tfrac{1}{2}\rho_0 |\dot{x}|^2 + \rho_0 \kappa(\nu, \dot{\nu}) + \rho_0 e(X, F, \nu, \nabla\nu) + \rho_0 w(x, \nu), \tag{6}$$
as asserted.⋆
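The passage from χ to κ can be checked numerically. The following is a minimal sketch under simplifying assumptions that are ours, not the authors': a single scalar order parameter, a co-energy made of a quadratic term plus a linear addendum, and hypothetical coefficients `a` and `lam`. It evaluates the partial Legendre transform ν̇ · ∂ν̇ χ − χ by finite differences and shows that the linear addendum drops out of κ.

```python
# One-degree-of-freedom illustration with hypothetical coefficients a and lam:
# co-energy chi = 0.5*a*nu_dot**2 + lam*nu_dot,
# energy kappa = nu_dot * d(chi)/d(nu_dot) - chi.

def chi(nu_dot, a=2.0, lam=0.7):
    """Substructural kinetic co-energy with a linear addendum lam*nu_dot."""
    return 0.5 * a * nu_dot**2 + lam * nu_dot

def d_dnu_dot(fun, nu_dot, h=1e-6):
    """Central finite-difference derivative with respect to the rate."""
    return (fun(nu_dot + h) - fun(nu_dot - h)) / (2.0 * h)

def kappa(nu_dot):
    """Partial Legendre transform of chi with respect to nu_dot."""
    return nu_dot * d_dnu_dot(chi, nu_dot) - chi(nu_dot)

samples = [-1.5, 0.3, 2.0]
# kappa should reduce to the purely quadratic part 0.5*a*nu_dot**2 (a = 2.0).
residuals = [abs(kappa(r) - 0.5 * 2.0 * r**2) for r in samples]
print(residuals)
```

In this toy setting χ and κ differ exactly by the linear term, which is why the Lagrangian is written with χ while the energy density (6) contains κ.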
The balance of energy can be expressed in terms of $\mathcal{H}$ as follows:
$$\dot{\mathcal{H}} - \mathrm{Div}(\dot{x}P + \dot{\nu}S) = 0, \tag{7}$$
where P and S are respectively the Piola–Kirchhoff stress and the referential microstress,
$$P = -\partial_F L, \qquad S = -\partial_{\nabla\nu} L. \tag{8}$$
That (7) is true follows from direct computation. Notice that, in view of our
hypothesis on the existence of a unique, physically significant connection for M,
concrete meaning can be assured for microtractions Sn which represent interactions between neighboring material elements. As already remarked on various
occasions in [1] (see, e.g., pp. 26, 27), in general, a properly covariant separation
between ‘self-actions’ (−∂ν L) and ‘microstresses’ (−∂∇ν L) does not apply.
Equations (3) and (4) lead us to an appropriate version of Noether’s theorem
(see [4, p. 29]); here we follow the program of [6, p. 284]. We consider some virtual
motion of our system, by assigning two one-parameter families $f^i_{s_i}$ of sufficiently
smooth point-valued diffeomorphisms, i = 1, 2, acting respectively on B0 and E,
and a Lie group G of transformations of M. We indicate with a prime the derivative
with respect to the relevant s.
1. At each $s_1$, $f^1_{s_1}$ acts on B0 so that $X \mapsto f^1_{s_1}(X) \in E$, and is isochoric (no virtual
change of density), i.e., $\mathrm{Div}\, f^{1\prime}_{s_1} = 0$; $f^1_0$ is the identity. We put $f^{1\prime}_0(X) = w$.
2. At each $s_2$, $f^2_{s_2}$ is a diffeomorphism that transforms E into itself. We assume that
$f^2_0$ is the identity and put $f^{2\prime}_0(x) = v$.
3. A Lie group G, containing SO(3), acts on M. Let ξ be an element of the Lie
algebra of G. The infinitesimal generator of its action on ν ∈ M is indicated
with ξM(ν) (see [7, p. 256]); νg is the value of ν after the action of g ∈ G. If we
consider a one-parameter trajectory $s_3 \mapsto g_{s_3} \in G$ such that $g_0$ is the identity,
we have also $s_3 \mapsto \nu_{g_{s_3}}$ and $\xi_M(\nu) = (d/ds_3)\,\nu_{g_{s_3}}|_{s_3=0}$. When G coincides with
the special orthogonal group SO(3), we identify ξM(ν) with Aq̇, where q̇ is
the characteristic vector of a rotational rigid velocity and A a linear operator
mapping vectors into elements of the tangent space of M, namely, if νq is the
value of the order parameter measured by an observer after a rotation q, then
$A = (d\nu_q/dq)|_{q=0}$.
⋆ We do not identify a priori the substructural kinetic co-energy χ with its Legendre transform κ
(assuming thus for both the traditional quadratic form), to encompass cases in which such a distinction is necessary to capture prominent physical phenomena. Excluding for a while exotic cases, in
fact, even when κ(ν, ν̇) is a quadratic form $\tfrac{1}{2}\dot{\nu}^\alpha (\cdot)_{\alpha\beta} \dot{\nu}^\beta$ in the rate, with a coefficient tensor depending on ν and belonging to $\mathrm{Sym}^+(T_\nu M, T_\nu M)$, a priori χ differs from
κ by an addendum of the type $\lambda_\alpha \dot{\nu}^\alpha$, with $\lambda(\nu) \in T^*_\nu M$. Although such an addendum is commonly
neglected, it becomes prominent in some cases as, e.g., when the substructural kinetics has rotational
character, i.e., when the anholonomic constraint ν̇ = Am holds, with m an arbitrary vector. Such a
circumstance occurs in magnetostrictive solids for which m is the magnetization and one obtains
from (4) the standard Gilbert’s equation (see [13] for the relevant calculations).
Henceforth, to simplify notations, we use $f^1$, $f^2$ and $\nu_g$ to indicate $f^1_{s_1}(X)$, $f^2_{s_2}(x)$,
$\nu_{g_{s_3}}(X)$, and write $|_0$ for $|_{s_1=0,\,s_2=0,\,s_3=0}$. Moreover, grad indicates the gradient with
respect to x.
We say that L is invariant with respect to the $f^i_{s_i}$’s and G when
$$L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu) = L\big(f^1, f^2, (\mathrm{grad}\, f^2)\dot{x}, (\mathrm{grad}\, f^2)F(\nabla f^1)^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^1)^{-1}\big). \tag{9}$$
Let us define
$$Q = \partial_{\dot{x}} L \cdot (v - Fw) + \partial_{\dot{\nu}} L \cdot (\xi_M(\nu) - (\nabla\nu)w), \tag{10}$$
$$\mathfrak{F} = Lw + (\partial_F L)^T (v - Fw) + (\partial_{\nabla\nu} L)^T (\xi_M(\nu) - (\nabla\nu)w), \tag{11}$$
where v, w and ξM(ν) are as mentioned in items 1, 2, 3.
THEOREM 1 (Noether-like theorem for complex materials). If the Lagrangian
density L is invariant under $f^1_{s_1}$, $f^2_{s_2}$ and G, then
$$\dot{Q} + \mathrm{Div}\,\mathfrak{F} = 0. \tag{12}$$
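Proof. To prove the theorem, as a first step we note that (9) implies
$$\frac{d}{ds_1} L\big(f^1, f^2, (\mathrm{grad}\, f^2)\dot{x}, (\mathrm{grad}\, f^2)F(\nabla f^1)^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^1)^{-1}\big)\Big|_0 = 0, \tag{13}$$
$$\frac{d}{ds_2} L\big(f^1, f^2, (\mathrm{grad}\, f^2)\dot{x}, (\mathrm{grad}\, f^2)F(\nabla f^1)^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^1)^{-1}\big)\Big|_0 = 0, \tag{14}$$
$$\frac{d}{ds_3} L\big(f^1, f^2, (\mathrm{grad}\, f^2)\dot{x}, (\mathrm{grad}\, f^2)F(\nabla f^1)^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^1)^{-1}\big)\Big|_0 = 0, \tag{15}$$
which lead to
$$\partial_X L \cdot w - \partial_F L \cdot (F\nabla w) - \partial_{\nabla\nu} L \cdot (\nabla\nu\nabla w) = 0, \tag{16}$$
$$\partial_x L \cdot v + \partial_{\dot{x}} L \cdot ((\mathrm{grad}\, v)\dot{x}) + \partial_F L \cdot ((\mathrm{grad}\, v)F) = 0, \tag{17}$$
$$\partial_\nu L \cdot \xi_M(\nu) + \partial_{\dot{\nu}} L \cdot \xi'_M(\nu) + \partial_{\nabla\nu} L \cdot \nabla\xi_M(\nu) = 0, \tag{18}$$
as a consequence of the properties listed under items 1–3 above. Then, we calculate
the time rate of the scalar Q, the divergence of the vector $\mathfrak{F}$ and, by using the
equations (3) and (4), identifying $s_3$ with t, we recognize that
$$\dot{Q} + \mathrm{Div}\,\mathfrak{F} = \frac{d}{ds_1} L\Big|_0 + \frac{d}{ds_2} L\Big|_0 + \frac{d}{ds_3} L\Big|_0, \tag{19}$$
which proves the theorem. ✷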
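REMARK 1. As a first special case, we require that $f^2_{s_2}$ alone acts on L leaving v
arbitrary. By using (17) we obtain from (12)
$$\Big(\frac{\partial}{\partial t}\partial_{\dot{x}} L - \partial_x L + \mathrm{Div}\,\partial_F L\Big) \cdot v = 0, \tag{20}$$
i.e.,
$$\rho_0 \ddot{x} = \rho_0 b + \mathrm{Div}\,P, \tag{21}$$
which is the standard equation of balance of momentum.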
REMARK 2. With G arbitrary, we consider its action alone on L; by using (18),
with the identification $s_3 = t$, we obtain from (12) that
$$\Big(\frac{\partial}{\partial t}\partial_{\dot{\nu}} L - \partial_\nu L + \mathrm{Div}\,\partial_{\nabla\nu} L\Big) \cdot \xi_M(\nu) = 0 \tag{22}$$
or
$$\rho_0\big(\dot{\overline{\partial_{\dot{\nu}}\chi}} - \partial_\nu \chi\big) + z - \rho_0 \beta - \mathrm{Div}\,S = 0. \tag{23}$$
Here z = ρ0 ∂ν e, the sign required for (23) to follow from (22) with L as in (2), is called self-force in the terminology of [1]. This result assures the
covariance of the balance of substructural interactions. When G coincides with
SO(3), the co-vector in the parentheses in (22), namely the term multiplying ξM(ν),
must be an element of the null space of $A^T$ (see for details of this special case
[1–4]).
REMARK 3. As a second special choice, let $f^2_{s_2}$ be such that v = q̇ × (x − x0)
(with q̇ a rigid rotational velocity, depending on time only, and x0 a fixed point
in space) and G = SO(3); q̇× is an element of its Lie algebra, thus ξM(ν) = Aq̇. If
L is independent of x and we assume that only $f^2_{s_2}$ and G (in the form just defined)
act on L, we have
$$\mathrm{skw}(\partial_F L\, F^T) = \mathsf{e}\big(A^T \partial_\nu L + (\nabla A^T)^t\, \partial_{\nabla\nu} L\big), \tag{24}$$
where $\mathsf{e}$ is Ricci’s alternating tensor and skw(·) extracts the skew-symmetric part
of its argument.
REMARK 4. If we require that $f^1_{s_1}$ alone acts on L, with w arbitrary (but satisfying the condition in item 1), by using (16) we obtain from (12) that
$$\dot{\overline{(F^T \partial_{\dot{x}} L + \nabla\nu^T \partial_{\dot{\nu}} L)}} - \mathrm{Div}\Big(\mathbb{P} - \big(\tfrac{1}{2}\rho_0 |\dot{x}|^2 + \rho_0 \chi(\nu, \dot{\nu})\big) I\Big) - \partial_X L = 0, \tag{25}$$
where $\mathbb{P} = \rho_0 e I - F^T P - \nabla\nu^T * S$ is the modified Eshelby tensor for continua
with substructure (see [4] for a similar result in a non-conservative setting, where
the elastic potential e is substituted by the free energy). I is the second-order unit
tensor and, in writing the explicit expression of $\mathbb{P}$ above, we find it convenient to
introduce the product ∗ defined by $(\nabla\nu^T * S)n \cdot u = Sn \cdot (\nabla\nu)u$ for any pair of
vectors n and u.
REMARK 5. Let us assume as special choices that $f^1_{s_1}$ is such that w = q̇ ×
(X − X0) (with q̇ a rigid rotational velocity, and X0 a fixed point in space) and
that G = SO(3), q̇× being an element of its Lie algebra, thus ξM(ν) = Aq̇. If
the material is homogeneous, and we assume that $f^1$ and G alone (in the form just
defined) act on L, we have $\mathrm{skw}(F^T \partial_F L + (\nabla\nu)^T * \partial_{\nabla\nu} L) = 0$.
REMARK 6. The action of $f^1_{s_1}$ can be interpreted as a special virtual mutation of
a possibly existing smooth distribution of inhomogeneities throughout the body, in
the sense of [8]. In other words, we may say that (25) is the balance of interactions
arising when the body mutates its inhomogeneous structure. This interpretation has
also been suggested in [9] in a non-conservative setting.
1.1. HAMILTON EQUATIONS
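Define p and µ, respectively, the canonical momentum and the canonical substructural momentum, by
$$p = \partial_{\dot{x}} L, \qquad \mu = \partial_{\dot{\nu}} L. \tag{26}$$
The Hamiltonian density $\mathcal{H}$,
$$\mathcal{H}(X, x, p, F, \nu, \mu, \nabla\nu) = p \cdot \dot{x} + \mu \cdot \dot{\nu} - L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu), \tag{27}$$
has partial derivatives with respect to its entries; some of them are the opposite of
the corresponding derivatives of L so that (3), (4) can be also written respectively
as
$$\dot{p} = -\partial_x \mathcal{H} + \mathrm{Div}\,\partial_F \mathcal{H}, \qquad \dot{x} = \partial_p \mathcal{H}; \tag{28}$$
$$\dot{\mu} = -\partial_\nu \mathcal{H} + \mathrm{Div}\,\partial_{\nabla\nu} \mathcal{H}, \qquad \dot{\nu} = \partial_\mu \mathcal{H}. \tag{29}$$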
2. Canonical Poisson Brackets in Multifield Theories
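We now consider a general boundary value problem where the following boundary
conditions are associated with (28) and (29):
$$x(X) = \bar{x} \quad \text{on } \partial^{(x)} B_0, \tag{30}$$
$$\partial_F \mathcal{H}\, n = t \quad \text{on } \partial^{(t)} B_0, \tag{31}$$
$$\nu(X) = \bar{\nu} \quad \text{on } \partial^{(\nu)} B_0, \tag{32}$$
$$\partial_{\nabla\nu} \mathcal{H}\, n = \mathfrak{t} \quad \text{on } \partial^{(\mathfrak{t})} B_0; \tag{33}$$
$\bar{x}$, $t$, $\bar{\nu}$ and $\mathfrak{t}$ are prescribed on the relevant parts $\partial^{(\cdot)} B_0$ of the boundary, $\mathrm{Cl}(\partial B_0) = \mathrm{Cl}(\partial^{(x)} B_0 \cup \partial^{(t)} B_0)$, with $\partial^{(x)} B_0 \cap \partial^{(t)} B_0 = \emptyset$, and $\mathrm{Cl}(\partial B_0) = \mathrm{Cl}(\partial^{(\nu)} B_0 \cup \partial^{(\mathfrak{t})} B_0)$, with $\partial^{(\nu)} B_0 \cap \partial^{(\mathfrak{t})} B_0 = \emptyset$, where Cl indicates closure and n is the outward
unit normal to ∂B0 at all points in which it is well defined.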
Again, as below (8) we argue that physical significance can be attributed to (33)
in view of our hypotheses on M. It should be clear that those hypotheses need not
apply to all substructures, e.g., in “homogenized” theories of liquids containing gas
bubbles (order parameter the gas fraction) no such microstress could be expected
to have such physical substance; there the interactions are at least weakly nonlocal,
and no significant connection seems to exist then for M.
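We assume that there exist two surface densities $U(x)$ and $\mathfrak{U}(\nu)$ such that
$$t = \rho_0\, \partial_x U, \qquad \mathfrak{t} = \rho_0\, \partial_\nu \mathfrak{U}, \tag{34}$$
where $U$ and $\mathfrak{U}$ play here the role of surface potentials.
Then the Hamiltonian H of the whole body is given by
$$H(x, p, \nu, \mu) = \int_{B_0} \mathcal{H}(X, x, p, \nu, \mu)\, d(\mathrm{vol}) - \int_{\partial^{(t)} B_0} \big(U(x) - \mathfrak{U}(\nu)\big)\, d(\mathrm{area}). \tag{35}$$
Notice that we write $\mathcal{H}(X, x, p, \nu, \mu)$ instead of $\mathcal{H}(X, x, p, F, \nu, \mu, \nabla\nu)$ because
below we consider directly variational derivatives.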
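THEOREM 2. The canonical Hamilton equation
$$\dot{F} = \{F, H\} \tag{36}$$
is equivalent to the Hamiltonian system of balance equations (28), (29) for a continuum with substructure, where F is any functional of the type $\int_{B_0} f(X, x, p, \nu, \mu)\, d(\mathrm{vol})$,
with f a sufficiently smooth scalar density, and the Poisson bracket {·, ·} for a
complex material is given by
$$\begin{aligned}
\{F, H\} ={}& \int_{B_0} \Big(\frac{\delta f}{\delta x} \cdot \frac{\delta \mathcal{H}}{\delta p} - \frac{\delta \mathcal{H}}{\delta x} \cdot \frac{\delta f}{\delta p}\Big)\, d(\mathrm{vol}) \\
&+ \int_{\partial^{(t)} B_0} \Big(\frac{\delta f}{\delta x}\Big|_{\partial^{(t)} B_0} \cdot \frac{\delta \mathcal{H}}{\delta p}\Big|_{\partial^{(t)} B_0} - \frac{\delta \mathcal{H}}{\delta x}\Big|_{\partial^{(t)} B_0} \cdot \frac{\delta f}{\delta p}\Big|_{\partial^{(t)} B_0}\Big)\, d(\mathrm{area}) \\
&+ \int_{B_0} \Big(\frac{\delta f}{\delta \nu} \cdot \frac{\delta \mathcal{H}}{\delta \mu} - \frac{\delta \mathcal{H}}{\delta \nu} \cdot \frac{\delta f}{\delta \mu}\Big)\, d(\mathrm{vol}) \\
&+ \int_{\partial^{(\mathfrak{t})} B_0} \Big(\frac{\delta f}{\delta \nu}\Big|_{\partial^{(\mathfrak{t})} B_0} \cdot \frac{\delta \mathcal{H}}{\delta \mu}\Big|_{\partial^{(\mathfrak{t})} B_0} - \frac{\delta \mathcal{H}}{\delta \nu}\Big|_{\partial^{(\mathfrak{t})} B_0} \cdot \frac{\delta f}{\delta \mu}\Big|_{\partial^{(\mathfrak{t})} B_0}\Big)\, d(\mathrm{area}),
\end{aligned} \tag{37}$$
where the variational derivative $\delta \mathcal{H}/\delta x$ is obtained fixing p and allowing x to
vary;⋆ an analogous meaning is valid for the variational derivative with respect to
the order parameter.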
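The proof can be developed by direct calculation. Clearly, {·, ·} is bilinear and
skew-symmetric, and one can check easily that it satisfies the Jacobi identity. We
note that
$$\begin{aligned}
\{F, H\} ={}& \int_{B_0} \Big(\frac{\delta f}{\delta x} \cdot \frac{\partial \mathcal{H}}{\partial p} - \frac{\delta f}{\delta p} \cdot (\partial_x \mathcal{H} - \mathrm{Div}\,\partial_F \mathcal{H})\Big)\, d(\mathrm{vol}) \\
&+ \int_{\partial^{(t)} B_0} \Big(\frac{\delta f}{\delta x} \cdot \frac{\partial \mathcal{H}}{\partial p}\Big|_{\partial^{(t)} B_0} - \frac{\delta f}{\delta p} \cdot (\partial_x U - \partial_F \mathcal{H}\, n)\Big|_{\partial^{(t)} B_0}\Big)\, d(\mathrm{area}) \\
&+ \int_{B_0} \Big(\frac{\delta f}{\delta \nu} \cdot \frac{\partial \mathcal{H}}{\partial \mu} - \frac{\delta f}{\delta \mu} \cdot (\partial_\nu \mathcal{H} - \mathrm{Div}\,\partial_{\nabla\nu} \mathcal{H})\Big)\, d(\mathrm{vol}) \\
&+ \int_{\partial^{(\mathfrak{t})} B_0} \Big(\frac{\delta f}{\delta \nu} \cdot \frac{\partial \mathcal{H}}{\partial \mu}\Big|_{\partial^{(\mathfrak{t})} B_0} - \frac{\delta f}{\delta \mu} \cdot (\partial_\nu \mathfrak{U} - \partial_{\nabla\nu} \mathcal{H}\, n)\Big|_{\partial^{(\mathfrak{t})} B_0}\Big)\, d(\mathrm{area}),
\end{aligned} \tag{38}$$
and, in terms of functional partial derivatives,
$$\begin{aligned}
\dot{F} ={}& \int_{B_0} \Big(\frac{\delta f}{\delta x} \cdot \dot{x} + \frac{\delta f}{\delta p} \cdot \dot{p} + \frac{\delta f}{\delta \nu} \cdot \dot{\nu} + \frac{\delta f}{\delta \mu} \cdot \dot{\mu}\Big)\, d(\mathrm{vol}) \\
&+ \int_{\partial^{(t)} B_0} \frac{\delta f}{\delta x} \cdot \dot{x}\Big|_{\partial^{(t)} B_0}\, d(\mathrm{area}) + \int_{\partial^{(\mathfrak{t})} B_0} \frac{\delta f}{\delta \nu} \cdot \dot{\nu}\Big|_{\partial^{(\mathfrak{t})} B_0}\, d(\mathrm{area}).
\end{aligned} \tag{39}$$
By identifying analogous terms in (38) and (39), we obtain both the Hamiltonian
system (28), (29) and the boundary conditions (30)–(33).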
When we put F = H, (36) coincides with the equation of conservation of
energy. We have, in fact,
$$\dot{H} = \{H, H\} = 0. \tag{40}$$
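The algebraic reason behind (40) is the skew-symmetry of the bracket, and it can be illustrated in a finite-dimensional analogue. The sketch below is ours and rests on strong simplifications: the fields are replaced by four scalars (x, p, ν, µ), gradients and boundary terms are dropped, and the functions `hamiltonian` and `observable` are hypothetical examples, not taken from the paper.

```python
# Finite-dimensional analogue of the canonical bracket, with state
# z = (x, p, nu, mu) and derivatives approximated by central differences.

def grad(fun, z, h=1e-6):
    """Finite-difference gradient of fun at the state z."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((fun(zp) - fun(zm)) / (2.0 * h))
    return g

def bracket(f, g, z):
    """{f, g} = f_x g_p - g_x f_p + f_nu g_mu - g_nu f_mu evaluated at z."""
    fx, fp, fn, fm = grad(f, z)
    gx, gp, gn, gm = grad(g, z)
    return fx * gp - gx * fp + fn * gm - gn * fm

def hamiltonian(z):
    """Hypothetical energy: kinetic terms plus a coupled potential."""
    x, p, nu, mu = z
    return 0.5 * p**2 + 0.5 * mu**2 + 0.5 * x**2 + 0.25 * nu**4 + x * nu

def observable(z):
    """Hypothetical observable F = x p + nu mu."""
    x, p, nu, mu = z
    return x * p + nu * mu

z0 = [0.4, -1.1, 0.8, 0.5]
hh = bracket(hamiltonian, hamiltonian, z0)   # analogue of {H, H}
fh = bracket(observable, hamiltonian, z0)    # analogue of {F, H}
hf = bracket(hamiltonian, observable, z0)    # analogue of {H, F}
print(hh, fh, hf)
```

In this reduced setting {H, H} vanishes identically and {F, H} = −{H, F}, which is the discrete counterpart of the conservation statement (40).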
Geometrical properties of the Poisson brackets for direct models of rods, plates
and complex fluids have been discussed in [10, 11].
⋆ See relevant remarks in [10].
3. A Formal Approach toward an Hamilton–Jacobi Theory with Gradient
Effects
Let h be a smooth diffeomorphism
$$h: (X, x, p, F, \nu, \mu, \nabla\nu) \mapsto (X, x_*, p_*, F_*, \nu_*, \mu_*, \nabla\nu_*). \tag{41}$$
The transformation h generates a new Hamiltonian density
$$\mathcal{H}_*(X, x_*, p_*, F_*, \nu_*, \mu_*, \nabla\nu_*), \tag{42}$$
with corresponding Lagrangian density
$$L_* = p_* \cdot \dot{x}_* + \mu_* \cdot \dot{\nu}_* - \mathcal{H}_*. \tag{43}$$
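If h were such that $\mathcal{H}_* = 0$, then an immediate integration of the system (28),
(29) could be achieved. To this aim we choose h to be such that the integral of
the difference $L - L_*$ between two instants, say $t_1$ and $t_2$, equals the increment
over that interval of a generating function S of the type $S = S(t, X, x, p_*, \nu, \mu_*)$, i.e.,
$$\int_{t_1}^{t_2} (L - L_*)\, d\tau = S|_{t=t_2} - S|_{t=t_1}. \tag{44}$$
Then, from (44) we would have
$$(p \cdot \dot{x} + \mu \cdot \dot{\nu} - \mathcal{H}) - (p_* \cdot \dot{x}_* + \mu_* \cdot \dot{\nu}_* - \mathcal{H}_*) = \dot{S} = \partial_t S + \partial_x S \cdot \dot{x} + \partial_{p_*} S \cdot \dot{p}_* + \partial_\nu S \cdot \dot{\nu} + \partial_{\mu_*} S \cdot \dot{\mu}_*, \tag{45}$$
and hence
$$p = \partial_x S, \qquad \mu = \partial_\nu S, \tag{46}$$
$$x_* - x_0 = \partial_{p_*} S, \qquad \nu = \partial_{\mu_*} S, \tag{47}$$
$$\partial_t S + \mathcal{H} = \mathcal{H}_*. \tag{48}$$
To obtain (47) one makes use of the fact that $\delta \int_{t_1}^{t_2} (p \cdot (x - x_0) + \mu \cdot \nu) = 0$
for variations vanishing at $t_1$ and $t_2$ (in the sense that $\delta(\mu \cdot \nu)|_{t=t_1,t_2} = 0$
and $\delta(p \cdot (x - x_0))|_{t=t_1,t_2} = 0$) so that $p \cdot \dot{x} = \dot{p} \cdot (x - x_0)$ and $\mu \cdot \dot{\nu} = \dot{\mu} \cdot \nu$.
A necessary and sufficient condition to assure that $\mathcal{H}_* = 0$ is
$$\partial_t S + \mathcal{H}(X, x, \partial_x S, F, \nu, \partial_\nu S, \nabla\nu) = 0, \tag{49}$$
which is a Hamilton–Jacobi-like equation. Since $\mathcal{H}_* = 0$, $p_*$ and $\mu_*$ are constant in time, and the time derivative of S reduces to
$$\dot{S} = \partial_t S + \partial_x S \cdot \dot{x} + \partial_\nu S \cdot \dot{\nu} = -\mathcal{H} + p \cdot \dot{x} + \mu \cdot \dot{\nu} = L. \tag{50}$$
The relation (50) allows us to determine S to within a constant, namely
$$S = \int L\, dt + \mathrm{const}. \tag{51}$$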
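How (46) and (49) work together can be seen in the simplest possible setting. The sketch below is an illustration of ours, not the authors' construction: it assumes scalar x and ν, a Hamiltonian density with kinetic terms only (no dependence on F, ∇ν or external potentials), and hypothetical inertia coefficients `RHO` and `A`. For that case a complete integral is S = p∗ x + µ∗ ν − H(p∗, µ∗) t, and the residual of (49) is evaluated by finite differences.

```python
# Hamilton-Jacobi check for free motion with two scalar degrees of freedom.
RHO, A = 2.0, 0.5  # hypothetical inertia coefficients for x and nu

def hamiltonian(p, mu):
    """Kinetic-only Hamiltonian, an assumption made for this sketch."""
    return p**2 / (2.0 * RHO) + mu**2 / (2.0 * A)

def generating_function(t, x, nu, p_star, mu_star):
    """Complete integral S = p*.x + mu*.nu - H(p*, mu*) t for free motion."""
    return p_star * x + mu_star * nu - hamiltonian(p_star, mu_star) * t

def partial(fun, args, index, h=1e-6):
    """Central finite-difference partial derivative in the argument `index`."""
    up, dn = list(args), list(args)
    up[index] += h
    dn[index] -= h
    return (fun(*up) - fun(*dn)) / (2.0 * h)

point = [0.7, 1.3, -0.4, 0.9, -1.2]  # (t, x, nu, p*, mu*)
dS_dt = partial(generating_function, point, 0)
p = partial(generating_function, point, 1)   # p = dS/dx, as in (46)
mu = partial(generating_function, point, 2)  # mu = dS/dnu, as in (46)
residual = dS_dt + hamiltonian(p, mu)        # left-hand side of (49)
print(p, mu, residual)
```

The residual vanishes up to rounding, and the momenta recovered from S coincide with the constants p∗ and µ∗, as expected when the transformed Hamiltonian density is zero.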
4. The Spatial Form
Circumstances in which the notion of reference placement is wanting, as in the
case of fluids or granular flows, render the choice of a material or spatial representation not matter of form only (see, e.g., [12] for standard bodies). Here, having
in mind the study of complex fluids, we provide a spatial variational derivation of
the balance equations free of any concept of reference place or paragon setting⋆
and without even formal recourse to an inverse motion. So, in the present section
x ∈ B is just a point in space. The notation u = û(x, t) is used for the velocity field
over B. The order parameter is now ν = ν̃(x, t) (with some abuse of notation) and
we indicate with υ = υ̂(x, t) its rate in the present placement. The symmetric
tensor g is the spatial metric characterizing the present state of the body; it plays a
prominent rôle because in this case the counterpart of (2) of the Lagrangian density
is of the form
\[ L(x, u, g, \nu, \upsilon, \operatorname{grad}\nu) = \tfrac{1}{2}\rho v^2 + \rho\chi(\nu, \upsilon) - \rho e(g, \nu, \operatorname{grad}\nu) - \rho w(x, \nu), \tag{52} \]
with some slight abuse of notation. We then find balance equations as conditions
verifying the relation
\[ \hat\delta \int_0^{\bar t} d\tau \int_B L(x, u, g, \nu, \upsilon, \operatorname{grad}\nu)\, d(\mathrm{vol}) = 0, \tag{53} \]
where δ̂ denotes the total variation (we use the adjective “total”, perhaps inappropriately, in the sense sometimes accepted in elementary treatises
when speaking of total time derivatives, as will be clear in the developments
below).
To define the variation of the relevant fields, we make use of f2 introduced at
point 2 of Section 1 and identify δx with v. We consider a special (though wide)
subclass of possible vector fields x −→ v(x) characterized by the circumstance that
they are purely deformative; in other words, we choose v such that skewgrad v = 0.
We then define
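\[ \hat\delta g = \left.\frac{d}{ds_2} f^{2*}_{s_2} g\right|_{s_2=0} = L_v g = 2\operatorname{sym}\operatorname{grad} v = 2\operatorname{grad} v, \tag{54} \]
where f^{2∗}_{s2} means pull back and Lv is thus the autonomous Lie derivative following
the flow v. In an analogous way, we put
\[ \hat\delta\nu = \delta\nu + (\operatorname{grad}\nu)v, \tag{55} \]
\[ \hat\delta\operatorname{grad}\nu = \operatorname{grad}\hat\delta\nu + (\operatorname{grad}\nu)\operatorname{grad} v. \tag{56} \]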
⋆ With the words “paragon setting” we refer to an ideal model of paragon for the material element
and the body, say, e.g., an ideal crystal or any other choice that physical circumstances may suggest.
As an intermediate step we notice that
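\[ \hat\delta \int_B e(g, \nu, \operatorname{grad}\nu)\, d(\mathrm{vol}) = \int_B \hat\delta e\, d(\mathrm{vol}) = \int_B \big[ 2\partial_g e\cdot\operatorname{grad} v + \partial_\nu e\cdot\hat\delta\nu + \partial_{\operatorname{grad}\nu} e\cdot(\operatorname{grad}\hat\delta\nu + (\operatorname{grad}\nu)\operatorname{grad} v) \big]\, d(\mathrm{vol}). \tag{57} \]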
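By developing the variation of (53), making use of (54)–(57) and the Gauss theorem, we recognize that appropriate balances in the bulk are
\[ \dot{\overline{\partial_u L}} - \partial_x L + \operatorname{div}\big(2\partial_g L - (\operatorname{grad}\nu)^T \partial_{\operatorname{grad}\nu} L\big) = 0, \tag{58} \]
\[ \dot{\overline{\partial_\upsilon L}} - \partial_\nu L + \operatorname{div}(\partial_{\operatorname{grad}\nu} L) = 0. \tag{59} \]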
Cauchy stress T is then given by
T = −2∂g L − (gradν)T Sa ,
(60)
where the actual microstress Sa is defined by
Sa = −∂gradν L.
(61)
In the case of simple bodies, (60) reduces to the well known Doyle–Ericksen
formula.
REMARK 7. A requirement of invariance of e under the action of SO(3) implies
that
\[ \operatorname{skew}(2\partial_g L) = e\, A^T z_a + (\operatorname{grad} A^T)^t S_a, \tag{62} \]
where za = −ρ∂ν e is the actual self-force and e Ricci’s alternating tensor.
4.1. SPATIAL HAMILTON EQUATIONS
To find appropriate spatial Hamilton equations, we follow the pattern of Section 1.1.
To this end we define spatial canonical standard and substructural momenta (p̄ and
µ̄, respectively) through
p̄ = ∂u L,
µ̄ = ∂υ L.
(63)
Consequently, the spatial Hamiltonian density is given by
H(x, p̄, g, ν, µ̄, gradν) = p̄ · u + µ̄ · υ − L(x, u, g, ν, υ, gradν)
(64)
(with some slight abuse of notation) and has partial derivatives with respect to its
entries. By evaluating the variation of H, taking into account (54) and (56), and
comparing the result with the variation of L, after making use of the balances (58)
and (59), we obtain the spatial form of the Hamilton equations:
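\[ \dot{\bar p} = -\partial_x H + \operatorname{div}\big(2\partial_g H - (\operatorname{grad}\nu)^T \partial_{\operatorname{grad}\nu} H\big), \qquad u = \partial_{\bar p} H; \tag{65} \]
\[ \dot{\bar\mu} = -\partial_\nu H + \operatorname{div}(\partial_{\operatorname{grad}\nu} H), \qquad \upsilon = \partial_{\bar\mu} H. \tag{66} \]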
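The passage from (63) and (64) to the second relations in (65) and (66) can be checked on a pointwise toy model. The sketch below assumes a quadratic Lagrangian density with hypothetical constants `RHO`, `KAPPA` and `ENERGY`, ignores all spatial gradients, and recovers u and υ as derivatives of the Legendre transform H.

```python
# Pointwise Legendre transform for an assumed quadratic Lagrangian density
# L(u, upsilon) = RHO u^2 / 2 + KAPPA upsilon^2 / 2 - ENERGY.
# All constants are hypothetical; gradient terms are deliberately left out.

RHO = 1.5      # hypothetical mass density
KAPPA = 0.4    # hypothetical substructural inertia coefficient
ENERGY = 2.0   # hypothetical constant standing in for rho*e + rho*w


def lagrangian(u, upsilon):
    return 0.5 * RHO * u * u + 0.5 * KAPPA * upsilon * upsilon - ENERGY


def momenta(u, upsilon, h=1e-6):
    """Canonical momenta as in (63), by central differences of L."""
    p_bar = (lagrangian(u + h, upsilon) - lagrangian(u - h, upsilon)) / (2 * h)
    mu_bar = (lagrangian(u, upsilon + h) - lagrangian(u, upsilon - h)) / (2 * h)
    return p_bar, mu_bar


def hamiltonian(p_bar, mu_bar):
    """H = p_bar u + mu_bar upsilon - L, as in (64), with the rates eliminated."""
    u = p_bar / RHO            # inverse of p_bar = RHO * u
    upsilon = mu_bar / KAPPA   # inverse of mu_bar = KAPPA * upsilon
    return p_bar * u + mu_bar * upsilon - lagrangian(u, upsilon)


def rates_from_hamiltonian(p_bar, mu_bar, h=1e-6):
    """u = dH/dp_bar and upsilon = dH/dmu_bar, by central differences of H."""
    u = (hamiltonian(p_bar + h, mu_bar) - hamiltonian(p_bar - h, mu_bar)) / (2 * h)
    upsilon = (hamiltonian(p_bar, mu_bar + h) - hamiltonian(p_bar, mu_bar - h)) / (2 * h)
    return u, upsilon
```

Starting from any pair (u, υ), computing the momenta and then differentiating H returns the original pair, up to finite-difference error.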
4.2. SPATIAL HAMILTON – JACOBI FORM
We may obtain the spatial counterpart of (49) by considering a smooth diffeomorphism
h̄: (x, p̄, g, ν, µ̄, gradν) −→ (x∗ , p̄∗ , g∗ , ν∗ , µ̄∗ , gradν∗ ),
(67)
which generates a new Hamiltonian density
H∗ (x∗ , p̄∗ , g∗ , ν∗ , µ̄∗ , gradν∗ ).
(68)
Now, we may use a generating function S = S(t, x, p̄∗ , ν, µ̄∗ ), and, following
the same procedure of Section 3, we find that a necessary and sufficient condition
to assure that H∗ = 0 is
∂t S + H(x, ∂x S, g, ν, ∂ν S, gradν) = 0.
(69)
4.3. A SPATIAL FORM OF POISSON BRACKETS
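For the spatial Hamiltonian in equations (65), (66), taking into account (54)–(56),
we define a new variational derivative δH/δx through the relation
\[ \frac{\delta H}{\delta x}(x, \bar p, \nu, \bar\mu)\cdot v = \big(-\partial_x H + \operatorname{div}(2\partial_g H - (\operatorname{grad}\nu)^T \partial_{\operatorname{grad}\nu} H)\big)\cdot v, \tag{70} \]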
holding p̄ fixed and allowing x to vary, for any v of the kind used in (54)–(56).
Consider a boundary value problem of the type
\[ \big(2\partial_g H - (\operatorname{grad}\nu)^T \partial_{\operatorname{grad}\nu} H\big) n = \partial_x \bar u(x), \qquad (\partial_{\operatorname{grad}\nu} H) n = \partial_\nu u(\nu), \quad \text{on } \partial B, \tag{71} \]
(where ū(x) and u(ν) are the counterparts of the surface potentials U (x) and U (ν)).
The total Hamiltonian is now given by H (x, p̄, ν, µ̄) = ∫_B H (with some slight
abuse of notation) and we list only the entries (x, p̄, ν, µ̄) because we consider the
variational derivative (70). We consider also arbitrary functionals F of the
type ∫_B f (x, p̄, ν, µ̄), with f a sufficiently smooth scalar density.
THEOREM 3. The canonical Hamilton equation
Ḟ = {F, H }a
(72)
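is equivalent to the Hamiltonian system of balance equations (65), (66) with
\[
\begin{aligned}
\{F, H\}_a ={}& \int_B \left( \frac{\delta f}{\delta x}\cdot\frac{\delta H}{\delta p} - \frac{\delta H}{\delta x}\cdot\frac{\delta f}{\delta p} \right) d(\mathrm{vol}) \\
&+ \int_{\partial B} \left( \left.\frac{\delta f}{\delta x}\cdot\frac{\delta H}{\delta p}\right|_{\partial B} - \left.\frac{\delta H}{\delta x}\cdot\frac{\delta f}{\delta p}\right|_{\partial B} \right) d(\mathrm{area}) \\
&+ \int_B \left( \frac{\delta f}{\delta \nu}\cdot\frac{\delta H}{\delta \mu} - \frac{\delta H}{\delta \nu}\cdot\frac{\delta f}{\delta \mu} \right) d(\mathrm{vol}) \\
&+ \int_{\partial B} \left( \left.\frac{\delta f}{\delta \nu}\cdot\frac{\delta H}{\delta \mu}\right|_{\partial B} - \left.\frac{\delta H}{\delta \nu}\cdot\frac{\delta f}{\delta \mu}\right|_{\partial B} \right) d(\mathrm{area}),
\end{aligned}
\tag{73}
\]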
where {·, ·}a is bilinear, skew-symmetric and satisfies Jacobi’s identity.
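The structure of (72) can be illustrated with its finite-dimensional analogue, in which the functional derivatives become ordinary partial derivatives and the boundary terms are absent. The sketch below assumes a one-degree-of-freedom oscillator with hypothetical constants `MASS` and `STIFFNESS`; it is not the bracket of Theorem 3, only a toy with the same ordering of terms.

```python
# Finite-dimensional analogue of the bracket: {f, g} = f_x g_p - g_x f_p.
# The oscillator Hamiltonian and its constants are assumptions for illustration.

MASS = 2.0        # hypothetical
STIFFNESS = 3.0   # hypothetical


def bracket(f, g, x, p, h=1e-5):
    """Canonical bracket of f(x, p) and g(x, p) by central differences."""
    f_x = (f(x + h, p) - f(x - h, p)) / (2 * h)
    f_p = (f(x, p + h) - f(x, p - h)) / (2 * h)
    g_x = (g(x + h, p) - g(x - h, p)) / (2 * h)
    g_p = (g(x, p + h) - g(x, p - h)) / (2 * h)
    return f_x * g_p - g_x * f_p


def hamiltonian(x, p):
    return p * p / (2.0 * MASS) + 0.5 * STIFFNESS * x * x


def position(x, p):
    return x


def momentum(x, p):
    return p
```

With F taken as the position or the momentum, the rule Ḟ = {F, H} returns ẋ = p/MASS and ṗ = −STIFFNESS·x, the toy counterparts of the two pairs of equations in (65) and (66); the bracket is skew-symmetric by construction.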
5. Final Remarks
To illustrate possible uses of Theorem 2, we list below some special cases. Analogous results accrue from Theorem 3.
REMARK 8. If we choose f = p · v, with v an arbitrary vector, equation (28a )
and the boundary condition (31) follow immediately from (36).
REMARK 9. Let f = µ · ξM (ν), then from (36) we get (29a ) and the boundary
condition (33).
REMARK 10. Let f be of the form
f = p · (q̇ × (x − x0 )) + µ · Aq̇,
(74)
with q̇ arbitrary as in previous sections. Consider also, for the sake of simplicity, the absence of external bulk interactions (the ones accounted for by w(x, ν)). By using (28)
and (29), we obtain from (36)
e(∂F HFT ) = AT ∂ν H + (∇AT )t ∂∇ν H.
(75)
These remarks are the Hamiltonian counterparts of Remarks 1–3. Of course,
Poisson brackets not only allow one to write balance equations in a concise form, but also generate articulated geometric structures over the infinite-dimensional
manifold of mappings describing placements and order parameters; the properties of
these structures also depend strictly on the geometric properties of M.
Acknowledgements
This paper is an extended version of the first part of a communication of P.M.M. delivered at the Symposium honoring the memory of Clifford Ambrose Truesdell III,
which was held in conjunction with the 14th US National Congress of Theoretical
and Applied Mechanics, Blacksburg, June 2002. P.M.M. acknowledges gratefully
the support of the U.S. National Science Foundation (through a conference grant
to C.-S. Man). We also thank Reuven Segev for valuable discussions. The support of the Italian National Group of Mathematical Physics (INDAM-GNFM) is
acknowledged.
References
1. G. Capriz, Continua with Microstructure. Springer, Berlin (1989).
2. G. Capriz, Continua with substructure. Phys. Mesomech. 3 (2000) 5–14, 37–50.
3. G. Capriz and P.M. Mariano, Balance at a junction among coherent interfaces in materials
with substructure. In: G. Capriz and P.M. Mariano (eds), Advances in Multifield Theories of
Materials with Substructure. Birkhäuser, Basel (2003).
4. P.M. Mariano, Multifield theories in mechanics of solids. Adv. Appl. Mech. 38 (2001) 1–93.
5. R. Segev, A geometrical framework for the statics of materials with microstructure. Math.
Models Methods Appl. Sci. 4 (1994) 871–897.
6. J.E. Marsden and T.J.R. Hughes, Mathematical Foundations of Elasticity. Prentice-Hall,
Englewood Cliffs, NJ (1983).
7. R. Abraham and J.E. Marsden, Foundations of Mechanics. Benjamin/Cummings Publishing
(1978).
8. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal.
27 (1967) 1–32.
9. M. Epstein, The Eshelby tensor and the theory of continuous distributions of inhomogeneities.
Mech. Res. Comm. 29 (2002) 501–506.
10. J.C. Simo, J.E. Marsden, and P.S. Krishnaprasad, The Hamiltonian structure of nonlinear elasticity: The material and convective representation of solids, rods and plates. Arch. Rational
Mech. Anal. 104 (1988) 125–183.
11. H. Cendra, J.E. Marsden, and T.S. Ratiu, Cocycles, compatibility and Poisson brackets for
complex fluids. In: G. Capriz and P.M. Mariano (eds), Advances in Multifield Theories of
Materials with Substructure. Birkhäuser, Basel (2003).
12. G. Capriz, Spatial variational principles in continuum mechanics. Arch. Rational Mech.
Anal. 85 (1984) 99–109.
13. M. Brocato and G. Capriz, Spin fluids and hyperfluids. Theoret. Appl. Mech. 28/29 (2002)
39–53.
Geometrically-based Consequences of Internal
Constraints
DONALD E. CARLSON1, ELIOT FRIED1,2 and DANIEL A. TORTORELLI1,3
1 Department of Theoretical and Applied Mechanics, University of Illinois at Urbana-Champaign,
104 South Wright Street, Urbana, IL 61801-2935, USA. E-mail: dec@uiuc.edu
2 Department of Mechanical Engineering, Washington University in St. Louis, St. Louis,
MO 63130-4862, USA
3 Department of Mechanical and Industrial Engineering, University of Illinois at
Urbana-Champaign, Urbana, IL 61801-2935, USA
Received 23 October 2002; in revised form 19 June 2003
Abstract. When a body is subject to simple internal constraints, the deformation gradient must
belong to a certain manifold. This is in contrast to the situation in the unconstrained case, where
the deformation gradient is an element of the open subset of second-order tensors with positive
determinant. Commonly, following Truesdell and Noll [1], modern treatments of constrained theories
start with an a priori additive decomposition of the stress into reactive and active components with the
reactive component assumed to be powerless in all motions that satisfy the constraints and the active
component given by a constitutive equation. Here, we obtain this same decomposition automatically
by making a purely geometrical and general direct sum decomposition of the space of all second-order tensors in terms of the normal and tangent spaces of the constraint manifold. As an example,
our approach is used to recover the familiar theory of constrained hyperelasticity.
Mathematics Subject Classifications (2000): 74A20, 74B20.
Key words: continuum mechanics, internal constraints, constitutive theory, hyperelasticity.
Dedicated to the memory of Clifford A. Truesdell
1. Introduction
Most contemporary works in constrained theories of continuum mechanics follow
the approach of Truesdell and Noll [1],⋆ wherein the stress is decomposed a priori
into reactive and active terms with the reactive stress assumed to be powerless in
all motions consistent with the constraints and the active stress given by a constitutive equation. The approach of Truesdell and Noll was motivated by the Ericksen
⋆ See Carlson and Tortorelli [2] for a fuller account of other work in this area. To that account the
more recent work of Casey and Krishnaswamy [3, 4] must be added. As in earlier work of Casey [5]
cited by Carlson and Tortorelli [2], these considerations are based on the behavior on the constraint
manifold of associated unconstrained materials – an approach fundamentally different from ours.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 141–149.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
and Rivlin [6] treatment of constrained hyperelasticity, which is based on the requirement that the constitutive equations for the stress and internal energy satisfy
balance of energy in all motions consistent with the constraints. The main feature
of the Ericksen–Rivlin hyperelastic development is that the stress is automatically
decomposed into the sum of two terms. One term has zero power in any motion
meeting the constraints and is determined by the constraints to within scalar multipliers; it is natural to think of this term as being present to maintain the constraints
and to call it the reactive stress. The other term is, roughly speaking, the gradient
of the internal-energy density with respect to the strain, and it is called the active
stress. Carlson and Tortorelli [2] replaced the Lagrange multiplier formalism of the
Ericksen–Rivlin approach with an elementary geometrical argument – essentially,
the assertion that, if a vector a is orthogonal to every vector b that is orthogonal
to some vector c, then a is parallel to c – used in the Truesdell–Noll method for
determining the form of the reactive stress.
It is widely accepted that many of the advances in modern continuum mechanics rest in large part on the clear separation of kinematics, basic laws of balance
and growth, and constitutive equations that characterizes the subject. Where do
internal constraints fit into this hierarchy? While internal constraints do delimit
aspects of material response, they apply to broad classes of materials; for instance,
the constraint of incompressibility applies equally well to both hyperelastic solids
and viscous fluids. Hence, we view internal constraints as being more basic than
constitutive equations.⋆ It is natural then to attempt to ascertain the implications
of the kinematical nature of internal constraints. Motivated by this point of view,
Anderson, Carlson and Fried [8] used a modified version of the geometrical argument of Carlson and Tortorelli [2] to deal with the constraints of incompressibility
and microstructural inextensibility present in their theory of nematic elastomers.
They started with a purely geometrical direct sum decomposition of the relevant
fields based on the normal and tangent spaces of the constraint manifold to obtain
automatically the decompositions of the deformational stress, orientational stress,
and internal orientational body-force density into active and reactive components
– without the use of any balance laws or constitutive assumptions. In this paper,
we present this improved approach in the simpler context of isothermal continuum
mechanics. We also take this opportunity to treat multiple constraints.
In Section 2, we consider the case where the deformation gradient is restricted
by n independent constraints. Thus, the deformation gradient is constrained to
belong to a certain manifold in contrast to being an arbitrary element of the open
subset of second-order tensors with positive determinant as in the unconstrained
case. Next, we use the projection theorem to effect a unique orthogonal decomposition of the space of all second-order tensors in terms of the normal and tangent
spaces of the constraint manifold.
⋆ O’Reilly and Srinivasa [7] take an analogous view in their treatment of constrained discrete
mechanical systems.
In the absence of thermal contributions, the general thermomechanical principles of energy balance and entropy growth combine to yield a free-energy inequality, which may be simplified by means of the power identity. These considerations
are developed in Section 3.
In Section 4, the orthogonal decomposition of Section 2 is applied to the stress
tensor. We find that, for motions consistent with the constraints, the normal component is automatically powerless and only the tangential component enters into
the free-energy inequality. Consequently, the tangential component is called the
active stress, and one would expect to write a constitutive equation for it. On the
other hand, the normal component, termed the reactive stress, is determined by the
constraints to within scalar multipliers that we take to be constitutively indeterminate. Thus, our approach to internal constraints has the same level of generality as
that of Truesdell and Noll [1] and provides exactly the same results. However, our
decomposition of the stress, rather than being a priori, is dictated by the geometry
of the constraint manifold.
In Section 5, as an application of the general theory, we make elastic constitutive
assumptions for the free energy and the stress and require that the free-energy
inequality be satisfied for all motions consistent with the constraints to recover the
theory of constrained hyperelasticity; and, in this sense, the present paper replaces
the paper of Carlson and Tortorelli [2]. Finally, in Section 6, we show that when the
principle of material frame-indifference is invoked in constrained hyperelasticity,
the active and reactive stresses individually satisfy local balance of moment of
momentum.
Throughout, we use the notations of modern continuum mechanics; see, e.g.,
the text of Gurtin [9].
2. The Geometry of the Constraint Manifold
We use a referential formulation. Accordingly, the body is identified with the region
of space B that it occupies in a fixed reference configuration. We write y for the
motion of the body and
F = Grad y,
(2.1)
with det F > 0, for the deformation gradient.
We consider the case where the motion of the body is restricted by n simple
constraints; i.e., the deformation gradient is required to meet⋆
γ̂i (F ) = 0,
i = 1, . . . , n,
(2.2)
where the constraint functions γ̂i : Lin+ → R are suitably smooth and independent
in the sense that the set {Grad γ̂i (F ), i = 1, . . . , n} is linearly independent at each
⋆ At this level of generality, it must be required that n < 9. However, once the principle of material
frame-indifference is imposed (cf. the developments of Section 6), the constraint functions γ̂i are seen
to depend on F only through the symmetric tensor F ⊤F . Consequently, we must, in fact, have n < 6.
F belonging to Lin+ . In other words, the deformation gradient must belong to the
constraint manifold
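\[ \mathrm{Con} := \{ F \in \mathrm{Lin}^+ : \hat\gamma_i(F) = 0,\ i = 1, \dots, n \}. \tag{2.3} \]
Of great use to us will be the normal space to Con at F ,
\[ \mathrm{Norm}(F) := \mathrm{Lsp}\{ \operatorname{Grad}\hat\gamma_i(F),\ i = 1, \dots, n \}, \tag{2.4} \]
and its orthogonal complement in Lin,
\[
\begin{aligned}
(\mathrm{Norm}(F))^{\perp} &= \{ A \in \mathrm{Lin} : A\cdot B = 0,\ \forall B \in \mathrm{Norm}(F) \} \\
&= \{ A \in \mathrm{Lin} : A\cdot\operatorname{Grad}\hat\gamma_i(F) = 0,\ i = 1, \dots, n \} \\
&=: \mathrm{Tan}(F),
\end{aligned}
\tag{2.5}
\]
which is the tangent space to Con at F .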
Of course, the constraint equations (2.2) must hold for all time, and time differentiation yields
Grad γ̂i (F )· Ḟ = 0,
i = 1, . . . , n,
(2.6)
which, in view of (2.5), is equivalent to
Ḟ ∈ Tan(F ).
(2.7)
If the body actually occupies the reference configuration at some reference time (so
that γ̂i (I ) = 0, i = 1, . . . , n), then (2.6) implies (2.2) (see Carlson and Tortorelli
[2]); hence, in this case, (2.7) is equivalent to (2.2).
By the projection theorem, Lin admits the direct sum decomposition
Lin = Norm(F ) ⊕ Tan(F );
(2.8)
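i.e., each A ∈ Lin can be written uniquely as⋆
\[ A = A_\perp + A_\parallel, \qquad A_\perp \in \mathrm{Norm}(F), \quad A_\parallel \in \mathrm{Tan}(F). \tag{2.9} \]
In view of (2.5), (2.7), and (2.9),
\[ A_\perp\cdot\dot F = 0, \qquad A\cdot\dot F = A_\parallel\cdot\dot F. \tag{2.10} \]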
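The decomposition (2.9) and its consequence (2.10) can be made concrete for a single constraint. The sketch below assumes incompressibility, γ̂(F) = det F − 1, whose gradient is the cofactor of F, and uses a simple shear as the constrained motion; the particular tensors `F`, `F_dot` and `A` are illustrative choices, not taken from the paper.

```python
# Split a second-order tensor into normal and tangential parts for ONE
# assumed constraint, incompressibility: gamma(F) = det F - 1.


def cofactor(F):
    """Cofactor of a 3x3 matrix; it equals Grad det(F) = det(F) F^{-T}."""
    return [[F[(i + 1) % 3][(j + 1) % 3] * F[(i + 2) % 3][(j + 2) % 3]
             - F[(i + 1) % 3][(j + 2) % 3] * F[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]


def inner(A, B):
    """Tensor inner product A . B = tr(A^T B)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))


def split(A, F):
    """Return (A_perp, A_par) with A_perp in Norm(F) and A_par in Tan(F)."""
    N = cofactor(F)                      # spans Norm(F) for this constraint
    coeff = inner(A, N) / inner(N, N)
    A_perp = [[coeff * N[i][j] for j in range(3)] for i in range(3)]
    A_par = [[A[i][j] - A_perp[i][j] for j in range(3)] for i in range(3)]
    return A_perp, A_par


# Simple shear of amount 0.3: det F = 1, so F lies on the constraint manifold.
F = [[1.0, 0.3, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Rate of a continuing simple shear; it satisfies Grad gamma(F) . F_dot = 0.
F_dot = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]

A_perp, A_par = split(A, F)
```

For this choice the normal part does no work on the shear rate, and the full tensor and its tangential part have the same power, as (2.10) asserts.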
3. Free-energy Inequality
We restrict attention to processes in which the temperature is independent of position and time; in this case, the principles of energy balance and entropy growth
(in the form of the Clausius–Duhem inequality), or the first and second laws of
⋆ Our usage of the subscripts ⊥ and ∥ here is exactly opposite to that used by Anderson, Carlson
and Fried [8].
thermodynamics, combine to yield a free-energy inequality. On using P to denote
an arbitrary regular part of B with boundary ∂P and unit outward normal field n,
this free-energy inequality requires that
\[ \dot{\overline{\int_P \rho\left(\psi + \tfrac{1}{2}|v|^2\right) dv}} \le \int_{\partial P} Sn\cdot v\, da + \int_P \rho b\cdot v\, dv \tag{3.1} \]
for each instant and for all parts. Here, ρ is the referential mass density, v is the
velocity field, ψ is the free energy per unit mass in the reference configuration, S
is the first Piola–Kirchhoff stress tensor, b is the body force per unit mass in the
reference configuration, and the superposed dot indicates time differentiation.
Next, we recall that an easy consequence of the principles of mass balance and
momentum balance is the power identity, which asserts that
\[ \int_{\partial P} Sn\cdot v\, da + \int_P \rho b\cdot v\, dv = \int_P S\cdot\dot F\, dv + \dot{\overline{\int_P \tfrac{1}{2}\rho|v|^2\, dv}} \tag{3.2} \]
for each instant and all parts. Equations (3.1) and (3.2) imply that
\[ \dot{\overline{\int_P \rho\psi\, dv}} \le \int_P S\cdot\dot F\, dv \tag{3.3} \]
for each instant and all parts. The local equivalent of (3.3) is
\[ \rho\dot\psi \le S\cdot\dot F, \tag{3.4} \]
and it is this inequality on which our subsequent considerations of hyperelasticity
are based.
4. Active and Reactive Stresses
On employing the decomposition (2.10) in the particular case when A is identified
with the first Piola–Kirchhoff stress S, it follows from the power identity (3.2) that
only the component S∥ expends nonzero power over a constrained motion, and we
refer to S∥ as the active component of the stress and write
\[ S_\parallel = S_a. \tag{4.1} \]
On the other hand, the component S⊥ is powerless in a constrained motion, and we
refer to S⊥ as the reactive component of the stress and write
\[ S_\perp = S_r. \tag{4.2} \]
Finally, since S r belongs to Norm(F ), it follows from (2.4) that there exist scalar
fields λ1 , . . . , λn that we take to be constitutively indeterminate such that
\[ S_r = \sum_{i=1}^{n} \lambda_i \operatorname{Grad}\hat\gamma_i(F). \tag{4.3} \]
Thus, we have shown that, when a body is internally constrained by simple constraints of the form (2.2), the geometry of the constraint manifold dictates that
the stress is automatically decomposed into the sum of two components: a powerless component S r that is determined to within scalar multipliers by (4.3); and
a component S a that does expend power and consequently appears in the free-energy inequality. We emphasize that this result is independent of any constitutive
considerations other than the “simple” nature of the constraints; in particular, the
body need not be elastic.
A noteworthy feature of our approach is that, in view of (4.1), (4.2), and (2.9),
S a ·S r = 0.
(4.4)
This automatic normalization is important, because the presence of the constitutively indeterminate multipliers in S r (see (4.3)) means that the response function
for any component of S a not orthogonal to S r could not be measured.
5. Constrained Hyperelasticity
In the constrained case, it follows from (2.10) and (4.1) that the local free-energy
inequality (3.4) reduces to
\[ \rho\dot\psi \le S_a\cdot\dot F. \tag{5.1} \]
For hyperelasticity, we make the constitutive assumptions that
ψ = ψ̂(F ),
ψ̂: Con → R,
(5.2)
and
S a = Ŝ a (F ),
Ŝ a : Con → Tan(F ).
(5.3)
Now, with ψ̂ assumed to be smooth,
ψ̇ = Grad ψ̂(F )· Ḟ ;
(5.4)
so the local free-energy inequality becomes
\[ \big(\hat S_a(F) - \rho\operatorname{Grad}\hat\psi(F)\big)\cdot\dot F \ge 0. \tag{5.5} \]
In the spirit of Green [10, 11], Ericksen and Rivlin [6], and Coleman and Noll
[12], we require that our constitutive equations be restricted such that the local
free-energy inequality (5.5) is always satisfied. To make this precise, we say that a
constrained hyperelastic process consists of:
(i) a motion y consistent with the constraint equations (2.2);
(ii) scalar fields λ1 , . . . , λn ;
(iii) a free-energy field ψ given in terms of y by constitutive equation (5.2);
(iv) an active stress field S a given in terms of y by constitutive equation (5.3);
(v) a reactive stress field S r given in terms of y and λ1 , . . . , λn through (4.3); and
(vi) a body force field b determined in terms of the above fields through local
balance of momentum.
Then, we insist that the local free-energy inequality (5.5) be satisfied for every
constrained hyperelastic process. At least locally, it is possible to choose a constrained hyperelastic process such that, at any given position and time, F and Ḟ
take on arbitrary values in Con and Tan(F ), respectively. Since both S a (F ) and
Grad ψ̂(F ) belong to Tan(F ), we conclude that
Ŝ a (F ) = ρGrad ψ̂(F ).
(5.6)
In (5.4)–(5.6), Grad ψ̂(F ) represents the tangential gradient of ψ̂ at F . When
the response function ψ̂ admits a smooth extension off the constraint manifold to
an open subset of Lin+ , then
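\[ \operatorname{Grad}\hat\psi(F) = \Big( \mathbb{I} - \sum_{i=1}^{n} N_i \otimes N_i \Big) \operatorname{Grad}\hat\psi(F), \tag{5.7} \]
where, on the right-hand side, Grad ψ̂(F ) is the ordinary gradient of the extension, the fourth-order tensor I is the identity operator on Lin, {N i , i = 1, . . . , n}
is an orthonormal basis for the linear subspace Norm (F ), and A⊗B is the fourth-order tensor defined such that (A ⊗ B)C = (B · C)A for any second-order tensor C.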
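The projection in (5.7) can be sketched for two simultaneous constraints. The example below assumes incompressibility together with inextensibility along e1 (γ̂ = |Fe1|² − 1, with gradient 2(Fe1) ⊗ e1), builds the orthonormal basis {N i} by Gram–Schmidt, and applies the projector to the gradient of an assumed neo-Hookean-type extension with ρ Grad ψ̂ = MU·F. The modulus `MU`, the deformation `F` and the choice of energy are hypothetical.

```python
import math


def inner(A, B):
    """Tensor inner product A . B = tr(A^T B)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))


def combine(a, X, Y):
    """Return a*X + Y for 3x3 matrices stored as nested lists."""
    return [[a * X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]


def orthonormal_basis(gradients):
    """Gram-Schmidt on the constraint gradients; the result spans Norm(F)."""
    basis = []
    for G in gradients:
        R = G
        for N in basis:
            R = combine(-inner(R, N), N, R)
        size = math.sqrt(inner(R, R))
        if size < 1e-12:
            raise ValueError("constraint gradients are linearly dependent")
        basis.append([[R[i][j] / size for j in range(3)] for i in range(3)])
    return basis


def tangential_part(G, basis):
    """Apply I - sum_i N_i (x) N_i to G, using (A (x) B)C = (B . C) A."""
    T = G
    for N in basis:
        T = combine(-inner(N, G), N, T)
    return T


# Assumed state: simple shear, which satisfies both constraints
# (det F = 1 and |F e1| = 1).
F = [[1.0, 0.3, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
grad_incompressible = [[1.0, 0.0, 0.0], [-0.3, 1.0, 0.0], [0.0, 0.0, 1.0]]  # cofactor of F
grad_inextensible = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]     # 2 (F e1) (x) e1
MU = 1.0  # hypothetical shear modulus

basis = orthonormal_basis([grad_incompressible, grad_inextensible])
extension_gradient = [[MU * F[i][j] for j in range(3)] for i in range(3)]
active_stress = tangential_part(extension_gradient, basis)
```

The resulting active stress is orthogonal to both constraint gradients, in line with (4.4), and projecting it a second time leaves it unchanged.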
6. Material Frame-indifference and Moment-of-momentum Balance
An interesting feature of hyperelasticity in the unconstrained case is that the principle of balance of moment-of-momentum need not be taken as an axiom; rather it
appears as a theorem in the theory primarily as a consequence of the principle of
material frame-indifference. In this section, we show that this is the case also in the
constrained theory as developed above.
As noted in the introduction, internal constraints do delimit aspects of material
response. Thus, the kinematical restrictions embodied in (2.2) are subject to the
principle of material frame-indifference:
γ̂i (QF ) = γ̂i (F ),
i = 1, . . . , n, ∀(Q, F ) ∈ Orth+ × Lin+ .
(6.1)
A standard consequence of (6.1) is that for each i
γ̂i (F ) = γ̄i (C),
γ̄i : Psym → R,
(6.2)
where
C = F ⊤F
is the right Cauchy–Green deformation tensor.
(6.3)
By (6.2) and (6.3),
Grad γ̂i (F ) = 2F Grad γ̄i (C),
(6.4)
and (4.3) becomes
\[ S_r = \sum_{i=1}^{n} \lambda_i F \operatorname{Grad}\bar\gamma_i(C) \tag{6.5} \]
in terms of the reduced constraint functions γ̄i , where the factor of 2 has been
absorbed into the constitutively indeterminate multipliers. An immediate consequence of (6.5) is that
S r F ⊤ = F S ⊤r ,
(6.6)
which is the local form of balance of moment-of-momentum for the reactive stress.
Similarly, material frame-indifference requires that the constitutive equation
(5.2) for the free-energy density reduce to
\[ \psi = \bar\psi(C). \tag{6.7} \]
Here, of course, the domain of ψ̄ is the reduced constraint manifold
\[ \mathrm{Con}(C) := \{ C \in \mathrm{Psym} : \bar\gamma_i(C) = 0,\ i = 1, \dots, n \}. \tag{6.8} \]
In terms of ψ̄, (5.6) becomes
S a = S̄ a (C) = 2ρF Grad ψ̄(C),
(6.9)
where Grad now denotes the tangential gradient with respect to the manifold Con.
Furthermore, it follows from (6.9) that
S a F ⊤ = F S ⊤a ,
(6.10)
which is the local form of moment-of-momentum balance for the active stress.
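The symmetry statements (6.6) and (6.10) rest on the fact that gradients with respect to the symmetric tensor C are themselves symmetric, so that F G Fᵀ is symmetric whenever G is. The sketch below checks this for the reactive stress (6.5) under an assumed inextensibility constraint along e1, γ̄(C) = e1 · Ce1 − 1 with Grad γ̄ = e1 ⊗ e1; the deformation `F` and the multiplier value are arbitrary illustrative numbers.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def reactive_stress(F, reduced_gradients, multipliers):
    """S_r = sum_i lambda_i F Grad(gamma_bar_i)(C), following (6.5)."""
    S = [[0.0] * 3 for _ in range(3)]
    for lam, G in zip(multipliers, reduced_gradients):
        FG = matmul(F, G)
        S = [[S[i][j] + lam * FG[i][j] for j in range(3)] for i in range(3)]
    return S


def skew_defect(S, F):
    """Largest entry of S F^T - F S^T; zero when (6.6) holds."""
    M = matmul(S, transpose(F))
    return max(abs(M[i][j] - M[j][i]) for i in range(3) for j in range(3))


# Assumed deformation with |F e1| = 1, so the inextensibility constraint holds.
F = [[1.0, 0.3, 0.0], [0.0, 1.1, 0.2], [0.0, 0.4, 0.9]]
e1_e1 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # symmetric gradient
e1_e2 = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # non-symmetric, for contrast

S_r = reactive_stress(F, [e1_e1], [2.5])
S_bad = reactive_stress(F, [e1_e2], [2.5])
```

The contrast case with a non-symmetric tensor in place of Grad γ̄ shows that the symmetry of the reduced gradient, and hence frame-indifference, is what delivers moment-of-momentum balance here.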
Acknowledgements
This work was supported in part by the National Science Foundation under Grant
CMS96-10286.
References
1. C. Truesdell and W. Noll, The non-linear field theories of mechanics. In: S. Flügge (ed.),
Handbuch der Physik, Vol. III/3. Springer-Verlag, Berlin (1965).
2. D.E. Carlson and D.A. Tortorelli, On hyperelasticity with internal constraints. J. Elasticity 42
(1996) 91–98.
3. J. Casey and S. Krishnaswamy, On constrained thermoelastic materials. In: R.C. Batra and
M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials.
CIMNE, Barcelona (1996) pp. 359–371.
4. J. Casey and S. Krishnaswamy, A characterization of internally constrained thermoelastic
materials. Mathematics and Mechanics of Solids 3 (1998) 71–89.
5. J. Casey, A treatment of internally constrained materials. Trans. ASME J. Appl. Mech. 62 (1995)
542–544.
6. J.L. Ericksen and R.S. Rivlin, Large elastic deformations of homogeneous anisotropic materials. J. Rational Mech. Anal. 3 (1954) 281–301.
7. O.M. O’Reilly and A.R. Srinivasa, On a decomposition of generalized constraint forces. Proc.
Roy. Soc. London A 457 (2001) 1307–1313.
8. D.R. Anderson, D.E. Carlson and E. Fried, A continuum-mechanical theory for nematic
elastomers. J. Elasticity 56 (1999) 33–58.
9. M.E. Gurtin, An Introduction to Continuum Mechanics. Academic Press, New York (1981).
10. G. Green, On the laws of reflection and refraction of light at the common surface of two non-crystallized media. Trans. Cambridge Philos. Soc. 7 (1839) 245–269.
11. G. Green, On the propagation of light in crystallized media. Trans. Cambridge Philos. Soc. 7
(1841) 113–120.
12. B.D. Coleman and W. Noll, The thermodynamics of elastic materials with heat conduction and
viscosity. Arch. Rational Mech. Anal. 13 (1963) 167–178.
Second Variation Condition and Quadratic Integral
Inequalities with Higher Order Derivatives
YI-CHAO CHEN
Department of Mechanical Engineering, University of Houston, Houston, TX 77204, U.S.A.
E-mail: chen@uh.edu
Received 1 October 2002; in revised form 13 June 2003
Abstract. The positivity of quadratic integrals involving variable coefficients and derivatives of any
order is studied. The result is determined by the solution of an initial value problem for a system of
first order nonlinear differential equations. The system is identified as the matrix Riccati differential
equation in control theory. A complete conclusion is reached by considering the cases when the
solution is bounded and when the solution is unbounded.
Mathematics Subject Classifications (2000): 34A12, 34A34, 49K15, 49N10, 74B20.
Key words: stability, calculus of variations, Riccati equation.
Dedicated to Professor C. Truesdell with deepest admiration and appreciation
1. Introduction
In the analysis of mechanics, the need often arises to determine whether certain quadratic integral inequalities hold. For example, in a recent work, Chen and
Haughton [1] study the stability of inflation and stretch of thick-walled incompressible elastic cylinders. An energy stability criterion is used, which leads to a
variational problem. The second variation condition reads
\[ \int_\Omega \big\{ \nabla u\cdot W_{FF}[\nabla u] + W_F\cdot\nabla(\nabla u\, F^{-1} u) \big\}\, dV \ge 0, \tag{1} \]
where Ω is the cylindrical reference configuration, F the deformation gradient,
W (F) the strain energy function, and u the variation function. By incompressibility
and a Fourier analysis, inequality (1) is found to be equivalent to the following
quadratic integral inequality
\[ \int_a^b U(r)\cdot A(r)U(r)\, dr \ge 0, \]
where r is the radial coordinate, a and b the inner and outer radii, respectively,
of the cylinder, A(r) a 3 × 3 symmetric matrix whose components are smooth
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 151–167.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
functions of r, and U(r) = (u(r), u′ (r), u′′ (r))T is a column vector consisting of
an arbitrary smooth scalar function u(r) and its first and second order derivatives.
The above inequality was solved in [1] and is a special case of the problem studied
here. In this work, we develop a general method to determine the positivity of
quadratic integrals involving derivatives of arbitrary order.
Let [a, b] be an interval in R, and n be a positive integer. We define the space
of admissible functions U by
\[ U \equiv C^n([a, b]; \mathbb{R}). \]
Consider the following quadratic integral inequality
\[ \int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\, \frac{d^i u(x)}{dx^i}\, \frac{d^j u(x)}{dx^j}\, dx \ge 0, \tag{2} \]
where aij ∈ C n ([a, b]; R), i, j = 0, 1, . . . , n, satisfy
aij = aj i .
(3)
The objective of this work is to determine whether inequality (2) holds for all
u ∈ U.
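For a single trial function u, the left-hand side of (2) can simply be evaluated numerically, which already suffices to show that the inequality fails for a given set of coefficients. The sketch below uses composite Simpson quadrature and assumes that u and its derivatives are supplied in closed form; the example coefficients (integrand u′² − u² on [0, π/4]) are hypothetical.

```python
import math


def quadratic_integral(coeffs, derivs, a, b, panels=200):
    """Composite Simpson estimate of the left-hand side of (2).

    coeffs[i][j] is the function a_ij(x); derivs[i] is the i-th derivative of u.
    """
    if panels % 2:
        panels += 1  # Simpson's rule needs an even number of panels

    def integrand(x):
        d = [f(x) for f in derivs]
        return sum(coeffs[i][j](x) * d[i] * d[j]
                   for i in range(len(d)) for j in range(len(d)))

    h = (b - a) / panels
    total = integrand(a) + integrand(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3.0


# Assumed example with n = 1: a_00 = -1, a_11 = 1, a_01 = a_10 = 0.
coeffs = [[lambda x: -1.0, lambda x: 0.0],
          [lambda x: 0.0, lambda x: 1.0]]

value_sin = quadratic_integral(coeffs, [math.sin, math.cos], 0.0, math.pi / 4)
value_one = quadratic_integral(coeffs, [lambda x: 1.0, lambda x: 0.0],
                               0.0, math.pi / 4)
```

Here u = sin x gives a positive value while the constant function u = 1 gives a negative one, so (2) does not hold for all admissible u with these coefficients. A test of this kind can only disprove (2); establishing it for every u is the subject of the analysis in the paper.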
It is noted that (2) is the second variation condition for minimizing an integral
involving higher order derivatives of the competing function. The case where n = 1
has been treated by many authors, and has become a part of classical theory of
calculus of variations. See, for example, Hestenes [2] and Sagan [3]. The focus of
this work is on the cases where (2) contains derivatives of u(x) of any order. While
the conditions to be derived in this paper pertain to (2) as it stands, they can be
made appropriate for the strict inequality version of (2).
It is also noted that (2) can be recast in a form which is related to a constrained
minimization problem studied in control theory. Indeed, by writing ui (x) ≡ d^i u(x)/dx^i, the left-hand side of (2) can be rewritten as
\[ \int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\, u_i(x)\, u_j(x)\, dx. \tag{4} \]
A problem in control theory is to find un (x) that minimizes (4) subject to the
differential equation constraints
\[ \frac{du_i(x)}{dx} = u_{i+1}(x), \qquad i = 0, 1, \dots, n - 1. \]
Variations of this latter problem, often for the constant coefficients aij with ain =
0, i = 0, . . . , n − 1, have been studied in control theory (for example, see [4–6]),
where un (x) is called control function. An important development is the associated
matrix Riccati equation whose solution determines the optimal control function.
The solution of the Riccati equation has been studied extensively [7–10].
For unconstrained inequality (2) itself, some sufficient conditions and necessary
conditions can be derived by elementary arguments. Let A(x) be the $(n+1) \times (n+1)$ matrix function with $a_{ij}(x)$ being its elements. An obvious sufficient condition for (2) to hold for all u ∈ U is that A(x) be pointwise positive semi-definite, that is,
\[ v \cdot A(x)v \ge 0 \qquad \forall\, v \in \mathbb{R}^{n+1},\ x \in [a, b]. \tag{5} \]
This condition, although very simple, is in most cases too strong to be practically
useful.
Condition (5) is a pointwise algebraic inequality. A condition of this kind is
desirable as it usually allows a simple verification. However, a necessary and sufficient condition for (2) to hold is in general not pointwise.
When A is a constant matrix, one can derive various algebraic necessary conditions by taking special forms of u(x) in (2) and carrying out the integration. For example, by choosing $u(x) \equiv e^{kx}$, $k \in \mathbb{R}$, we find that (2) implies that
\[ v \cdot Av \ge 0 \qquad \forall\, v = (1, k, k^2, \ldots, k^n)^T. \]
This scheme, however, may not lead to a useful result when A is not constant. Even
when A is constant, the necessary conditions obtained this way may well be much
weaker than (2) itself.
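The exponential test above is easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: the function name `exponential_test` and the sample matrix are assumptions chosen for illustration (the matrix corresponds to n = 1 with integrand u'² − u²).

```python
def exponential_test(A, k):
    """Return v . A v for v = (1, k, ..., k^n), A a constant symmetric (n+1)x(n+1) matrix.

    A negative value for some real k shows that (2) fails for u(x) = exp(kx).
    A nonnegative value for every k is only a necessary condition.
    """
    n = len(A) - 1
    v = [k ** i for i in range(n + 1)]
    return sum(A[i][j] * v[i] * v[j] for i in range(n + 1) for j in range(n + 1))


# Hypothetical constant coefficients for n = 1: a00 = -1, a01 = 0, a11 = 1.
A = [[-1.0, 0.0],
     [0.0, 1.0]]

for k in (0.0, 0.5, 2.0):
    print(k, exponential_test(A, k))
```

For this sample matrix the quadratic form equals k² − 1, so any |k| < 1 already shows that (2) cannot hold for all admissible u.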
The difficulty in finding necessary and sufficient conditions lies in the fact
that U is an infinite-dimensional function space. This problem is dealt with in
the present work by relating the integral in (2) to a system of nonlinear ordinary
differential equations, which is identified as the Riccati equation in control theory.
The analysis presented here gives workers in mechanics direct access
to the solution of (2) without referring to the constrained minimization problem
studied in control theory, and without the usual restrictions on $a_{ij}$ that are specific to control theory. In addition, we present a detailed treatment of the case
where the solution of the Riccati equation becomes unbounded.
Besides its theoretical value, the method presented in this work offers great
numerical advantage in dealing with the integral inequality (2). While no numerical
method has been developed to solve such an inequality directly, numerous ODE solvers are
available. Once the coefficients $a_{ij}$ are given, the corresponding Riccati equation
can always be solved numerically, and the properties of the solution provide a
definite conclusion on whether (2) holds for all u ∈ U, as demonstrated in [1]
where the stability of inflation of elastic cylinders has been determined for the first
time.
In the next section, we state some well-known results in calculus of variations
regarding the integral in (2). Section 3 introduces the system of differential equations upon which the solution method is based. In Section 4, a necessary and
sufficient condition is derived when the solution of the differential equations is
bounded. In Section 5, the case where the solution is unbounded is analyzed. An
illustrative example is given in the concluding Section 6.
2. Preliminaries
The quadratic inequality (2), being the second variation condition of some minimization problem, forms a minimization problem itself. The first variation condition of this minimization problem reads
\[ \int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\,\frac{d^i u(x)}{dx^i}\,\frac{d^j v(x)}{dx^j}\,dx = 0. \tag{6} \]
If u(x) is a minimizing function of the integral in (2), it must satisfy (6) for any
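v(x) in a class of variation functions. If both u(x) and v(x) are smooth (say, of class $C^{2n}$), one can integrate (6) by parts n times to obtain
\[
\begin{aligned}
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j v}{dx^j}\,dx
&= \Biggl[\sum_{k=1}^{n}\sum_{j=k}^{n}\sum_{i=0}^{n} (-1)^{k-1}\frac{d^{k-1}}{dx^{k-1}}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr)\frac{d^{j-k} v}{dx^{j-k}}\Biggr]_a^b
 + \int_a^b \sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr) v\,dx \\
&= \Biggl[\sum_{l=0}^{n-1}\sum_{k=1}^{n-l}\sum_{i=0}^{n} (-1)^{k-1}\frac{d^{k-1}}{dx^{k-1}}\Bigl(a_{i,k+l}\frac{d^i u}{dx^i}\Bigr)\frac{d^{l} v}{dx^{l}}\Biggr]_a^b
 + \int_a^b \sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr) v\,dx \\
&= \Biggl[\sum_{l=1}^{n}\sum_{k=0}^{n-l}\sum_{i=0}^{n} (-1)^{k}\frac{d^{k}}{dx^{k}}\Bigl(a_{i,k+l}\frac{d^i u}{dx^i}\Bigr)\frac{d^{l-1} v}{dx^{l-1}}\Biggr]_a^b
 + \int_a^b \sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr) v\,dx \\
&= 0.
\end{aligned}
\]
Here a change of variables j = k + l for the summation indices has been used.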
By a standard argument [2, 3] in calculus of variations, one can derive the following Euler–Lagrange equation which must be satisfied by a minimizing function
u(x) at the points where u(x) is of $C^{2n}$:
\[ \sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr) = 0. \tag{7} \]
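As a quick illustration of (7), consider the lowest-order case n = 1. The block below is a worked instance, not part of the original text; the constant coefficients in the final comment are assumptions chosen only to give a familiar equation.

```latex
% n = 1 in (7): the terms with j = 0 are undifferentiated, those with j = 1 carry -d/dx.
% Using the symmetry a_{10} = a_{01} from (3):
a_{00}\,u + a_{01}\,u' - \frac{d}{dx}\bigl(a_{01}\,u + a_{11}\,u'\bigr) = 0 .
% Assumed sample coefficients a_{00} = -1, a_{01} = 0, a_{11} = 1 give  u'' + u = 0 .
```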
Furthermore, at a point where u(x) is not of $C^{2n}$, the Weierstrass–Erdmann corner condition must hold, which states that
\[ \sum_{k=0}^{n-l}\sum_{i=0}^{n} (-1)^k \frac{d^k}{dx^k}\Bigl(a_{i,k+l}\frac{d^i u}{dx^i}\Bigr), \qquad l = 1, \ldots, n, \]
must be continuous.
A well-known pointwise necessary condition for (2) to hold is the Legendre condition⋆
\[ a_{nn}(x) \ge 0 \qquad \forall\, x \in [a, b]. \tag{8} \]
Indeed, if $a_{nn}(x_0) < 0$ for some $x_0 \in (a, b)$, one can construct an admissible
function u(x) that has a non-empty support contained in a neighborhood of $x_0$, and
that is so oscillatory that (2) is violated for this u(x). The conclusion that (8) must
hold at the end points follows from the continuity of $a_{nn}$. In this work, we assume
that the following strengthened Legendre condition holds:
\[ a_{nn}(x) > 0 \qquad \forall\, x \in [a, b]. \tag{9} \]
3. A System of Differential Equations
Of central importance to the solution of (2) is the solution of a system of first order
differential equations. Via this system of equations, the left-hand side of (2) can
be written in a form whose positiveness can be determined unambiguously by the
solution of the system.
To be determined is an $n \times n$ matrix function Y(x), whose elements $y_{ij}(x)$, $i, j = 1, \ldots, n$, satisfy the following system of ordinary differential equations and initial conditions
\[ \frac{dy_{ij}}{dx} = a_{i-1,j-1} - y_{i,j-1} - y_{i-1,j} - \frac{(a_{i-1,n} - y_{in})(a_{j-1,n} - y_{jn})}{a_{nn}}, \qquad i, j = 1, \ldots, n, \tag{10} \]
\[ y_{ij}(a) = 0. \tag{11} \]
The notation in (10) is so chosen that
\[ y_{i0} = y_{0i} = 0, \qquad i = 1, \ldots, n. \tag{12} \]
It follows from (3) that Y is symmetric:
\[ y_{ij} = y_{ji}. \tag{13} \]
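To make the index conventions in (10) and (12) concrete, the scalar case n = 1 can be written out. This is a worked instance added for illustration; the sample coefficients in the comments are assumptions.

```latex
% n = 1: the only unknown is y = y_{11}; by (12), y_{10} = y_{01} = 0, so (10) and (11) read
\frac{dy}{dx} = a_{00} - \frac{(a_{01} - y)^2}{a_{11}}, \qquad y(a) = 0 .
% Assumed sample coefficients a_{00} = -1, a_{01} = 0, a_{11} = 1, a = 0:
%   y' = -1 - y^2,  y(0) = 0,  with solution  y(x) = -\tan x  on  [0, \pi/2).
```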
The initial value problem (10), (11) for a system of differential equations is
introduced to solve the integral inequality (2). Through this system of differential
equations, the left-hand side of (2) can be integrated by parts in such a way that
the resulting integrand is the square of a function which can be made to vanish for
a certain choice of the admissible function. Whether (2) holds is then determined
by the boundary terms, which are further related to the solutions of this initial
value problem.
⋆ The classical Legendre condition pertains to (2) for n = 1.
The system of differential equations (10) is identified as the matrix Riccati
equation in control theory where the coefficients aij are often taken to be constant
with ain = 0, i = 0, . . . , n − 1. The solution of the Riccati equation has been
widely studied [7–10].
The general theory for the solution of the initial value problem (10) and (11)
is well developed. See, for example, Ince [11]. It can be shown that under condition (9), a Lipschitz condition is satisfied in a neighborhood of the initial point.
A unique continuous solution of (10) and (11) then exists in the neighborhood.
This neighborhood, however, may or may not cover the entire interval [a, b]. The
solution may become unbounded as x approaches some c ≤ b. It will be shown
below that in this latter case inequality (2) is violated for some admissible function.
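A numerical treatment of the initial value problem (10), (11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the names `riccati_rhs`, `axpy` and `integrate`, the classical fourth-order Runge–Kutta scheme, the step count and the blow-up threshold `bound` are all choices made here. Row and column 0 of the working array are kept at zero to mirror (12).

```python
import math


def riccati_rhs(x, Y, a, n):
    """Right-hand side of (10). Y is (n+1)x(n+1) with row/column 0 held at zero, as in (12).
    a(x) must return the (n+1)x(n+1) coefficient matrix [a_ij(x)]."""
    A = a(x)
    dY = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            dY[i][j] = (A[i - 1][j - 1] - Y[i][j - 1] - Y[i - 1][j]
                        - (A[i - 1][n] - Y[i][n]) * (A[j - 1][n] - Y[j][n]) / A[n][n])
    return dY


def axpy(Y, K, h):
    """Return Y + h*K entrywise."""
    return [[y + h * k for y, k in zip(ry, rk)] for ry, rk in zip(Y, K)]


def integrate(a, n, x0, x1, steps=2000, bound=1e6):
    """RK4 integration of (10) from Y(x0) = 0, see (11).
    Returns (x, Y) with Y the n x n matrix at x1, or (x, None) if |y_ij| exceeds `bound`."""
    h = (x1 - x0) / steps
    x = x0
    Y = [[0.0] * (n + 1) for _ in range(n + 1)]
    for _ in range(steps):
        k1 = riccati_rhs(x, Y, a, n)
        k2 = riccati_rhs(x + h / 2, axpy(Y, k1, h / 2), a, n)
        k3 = riccati_rhs(x + h / 2, axpy(Y, k2, h / 2), a, n)
        k4 = riccati_rhs(x + h, axpy(Y, k3, h), a, n)
        Y = [[Y[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
              for j in range(n + 1)] for i in range(n + 1)]
        x += h
        if any(not abs(v) <= bound for row in Y for v in row):
            return x, None  # treated as an unbounded solution
    return x1, [row[1:] for row in Y[1:]]


# Assumed test case, n = 1 with a00 = -1, a01 = 0, a11 = 1, where y(x) = -tan(x).
coeffs = lambda x: [[-1.0, 0.0], [0.0, 1.0]]
print(integrate(coeffs, 1, 0.0, 1.0)[1], -math.tan(1.0))
print(integrate(coeffs, 1, 0.0, 2.0))  # blow-up expected near pi/2
```

A fixed-step scheme with a magnitude threshold is only a heuristic detector of unboundedness; an adaptive solver with event detection would be a more careful choice near the singular point.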
The system of differential equations (10) is nonlinear. It is known in control theory, as shown below, that this system can be related to a 2nth order linear ordinary
differential equation, which is identified as a generalization of the classical Jacobi
equation. The behavior of the solution to this equation is well understood. It is
found that if the solution of (10) and (11) becomes unbounded at a point in (a, b],
then the 2nth order equation has a nontrivial solution which, together with its first
n − 1 derivatives, vanishes at that point.
In the next section, we first consider the case where the solution of the initial
value problem is bounded in [a, b].
4. Bounded Solution
When the solution of (10) and (11) is bounded on [a, b], whether (2) holds for all
u ∈ U is determined completely by the values of the solution and its derivatives
at b. This will be proved by using the following lemma, which will also be utilized
for further development.
LEMMA 1. If the initial value problem (10) and (11) has a solution Y(x) that is
of $C^1$ on [a, c] for some c ∈ (a, b], then for any u ∈ U the following equality holds
\[
\int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(c)\,\frac{d^i u}{dx^i}(c)\,\frac{d^j u}{dx^j}(c)
+ \int_a^c a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2} dx. \tag{14}
\]
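Proof. Let the solution $y_{ij}(x)$ with the required properties be given. By (3), (12), (10), (11) and (9), we find that
\[
\begin{aligned}
\int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
&= \int_a^c \Biggl[\sum_{i,j=0}^{n-1} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 + \sum_{i=0}^{n-1} 2a_{in}\frac{d^i u}{dx^i}\frac{d^n u}{dx^n}
 + a_{nn}\frac{d^n u}{dx^n}\frac{d^n u}{dx^n}\Biggr] dx \\
&= \int_a^c \Biggl\{\sum_{i,j=0}^{n-1}\Biggl[\Bigl(a_{ij} - y_{i+1,j} - y_{i,j+1}
 - \frac{(a_{in} - y_{i+1,n})(a_{jn} - y_{j+1,n})}{a_{nn}}\Bigr)\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 + 2y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^{j+1} u}{dx^{j+1}}\Biggr] \\
&\qquad\quad + a_{nn}\frac{d^n u}{dx^n}\frac{d^n u}{dx^n}
 + \sum_{i=0}^{n-1} 2\bigl(a_{in} - y_{i+1,n}\bigr)\frac{d^i u}{dx^i}\frac{d^n u}{dx^n}
 + \sum_{i,j=0}^{n-1}\frac{(a_{in} - y_{i+1,n})(a_{jn} - y_{j+1,n})}{a_{nn}}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\Biggr\}\,dx \\
&= \int_a^c \Biggl\{\sum_{i,j=0}^{n-1}\Bigl[\frac{dy_{i+1,j+1}}{dx}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 + 2y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^{j+1} u}{dx^{j+1}}\Bigr]
 + a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2}\Biggr\}\,dx \\
&= \int_a^c \Biggl\{\frac{d}{dx}\Biggl[\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\Biggr]
 + a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2}\Biggr\}\,dx \\
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(c)\,\frac{d^i u}{dx^i}(c)\,\frac{d^j u}{dx^j}(c)
 + \int_a^c a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2} dx.
\end{aligned}
\]
✷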
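Identity (14) can be checked numerically in a simple setting. The sketch below is an illustration under assumptions made here, not taken from the paper: n = 1, a = 0, c = 1, constant coefficients a00 = −1, a01 = 0, a11 = 1 (for which (10), (11) give y(x) = −tan x), the test function u(x) = 1 + x, and composite Simpson quadrature in a hypothetical helper `simpson`.

```python
import math


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


c = 1.0
u = lambda x: 1.0 + x   # assumed test function
du = lambda x: 1.0      # its derivative

# Left-hand side of (14): integral of a00*u^2 + 2*a01*u*u' + a11*u'^2 = u'^2 - u^2.
lhs = simpson(lambda x: du(x) ** 2 - u(x) ** 2, 0.0, c)

# Right-hand side of (14): y(c)*u(c)^2 plus the integral of a11*(u' + ((a01 - y)/a11)*u)^2,
# with y(x) = -tan(x), so that (a01 - y)/a11 = tan(x).
rhs = -math.tan(c) * u(c) ** 2 + simpson(lambda x: (du(x) + math.tan(x) * u(x)) ** 2, 0.0, c)

print(lhs, rhs)
```

The two numbers should agree to quadrature accuracy; the left-hand side can also be integrated by hand and equals −4/3.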
Lemma 1 and inequality (9) readily render the following sufficient condition
for (2) to hold.
THEOREM 1. If the initial value problem (10) and (11) has a solution Y(x) that
is of $C^1$ on [a, b], and if the matrix Y(b) is positive semi-definite, then inequality (2) holds for all u ∈ U.
In Theorem 1, the condition that a $C^1$ solution of (10) and (11) exists on [a, b]
is crucial. For a particular problem, this condition may not be satisfied, as demonstrated by the example in Section 6. This case will be treated in the next section.
In the next theorem, we shall show that when the above-mentioned condition is
satisfied, the remaining conditions in Theorem 1 are actually necessary for (2) to
hold.
THEOREM 2. If the initial value problem (10) and (11) has a solution Y(x)
that is of $C^1$ on [a, b], and if the matrix Y(b) is not positive semi-definite, then
inequality (2) does not hold for some u ∈ U.
Proof. By the given conditions, there are $v_i \in \mathbb{R}$, $i = 1, \ldots, n$, such that
\[ \sum_{i,j=1}^{n} y_{ij}(b)\,v_i v_j < 0. \tag{15} \]
Consider the following initial value problem for u(x):
\[ \frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i} = 0, \tag{16} \]
\[ \frac{d^i u}{dx^i}(b) = v_{i+1}, \qquad i = 0, 1, \ldots, n - 1. \tag{17} \]
By the theory of linear ordinary differential equations, this initial value problem
has a unique solution u(x) on [a, b]. For this u(x), we have, with the help of (14),
\[
\begin{aligned}
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(b)\,\frac{d^i u}{dx^i}(b)\,\frac{d^j u}{dx^j}(b)
 + \int_a^b a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2} dx \\
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(b)\,v_{i+1} v_{j+1} \\
&< 0.
\end{aligned}
\]
The last two steps follow from (17), (16) and (15).
✷
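The construction in the proof can be followed by hand in the scalar case. The block below is a worked instance under assumed data (n = 1, a = 0, 0 < b < π/2, a00 = −1, a01 = 0, a11 = 1), not an example from the paper.

```latex
% Riccati solution: y(x) = -\tan x, so Y(b) = -\tan b < 0 and (15) holds with v_1 = 1.
% Equations (16), (17):  u' + (\tan x)\,u = 0,  u(b) = 1,  hence  u(x) = \cos x / \cos b .
\int_0^b \bigl(u'^2 - u^2\bigr)\,dx
  = \frac{1}{\cos^2 b}\int_0^b \bigl(\sin^2 x - \cos^2 x\bigr)\,dx
  = -\frac{\sin 2b}{2\cos^2 b}
  = -\tan b \;<\; 0 ,
% which equals y(b)\,v_1^2, as predicted by (14) with the squared term vanishing.
```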
5. Unbounded Solution
In this section we consider the case where the solution of the initial value problem (10) and (11) becomes unbounded as x approaches some c ∈ (a, b]. Theorems 1 and 2 are in this case inapplicable. We shall show that it is possible to
construct a function u ∈ U for which inequality (2) is violated. To this end, we
first relate the system of the first order nonlinear differential equations (10) to a
sequence of linear differential equations.
Let the solution Y(x) of (10) and (11), which is of $C^1$ in [a, c), be given. Consider
the following nth order linear differential equation for u:
\[ \frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i} = 0. \tag{18} \]
It is noted that equation (18) is identical with (16), and that a solution u(x) of (18)
makes the last integral in (14) vanish. By the theory of linear ordinary differential
equations, all solutions of (18) form an n-dimensional linear space. In this work,
we shall assume that these solutions are of class $C^{2n}$.
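LEMMA 2. A solution u(x) of (18) satisfies
\[
\frac{d}{dx}\Biggl(\sum_{i=0}^{n-1} y_{i+1,j}\frac{d^i u}{dx^i}\Biggr)
= \sum_{i=0}^{n} a_{i,j-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,j-1}\frac{d^i u}{dx^i},
\qquad j = 1, \ldots, n. \tag{19}
\]
Proof. By using (10), (18), (12), (13) and (3), we find that
\[
\begin{aligned}
\frac{d}{dx}\Biggl(\sum_{i=0}^{n-1} y_{i+1,j}\frac{d^i u}{dx^i}\Biggr)
&= \sum_{i=0}^{n-1} y_{i+1,j}\frac{d^{i+1} u}{dx^{i+1}}
 + \sum_{i=0}^{n-1}\Bigl[a_{i,j-1} - y_{i+1,j-1} - y_{ij}
 - \frac{(a_{in} - y_{i+1,n})(a_{j-1,n} - y_{jn})}{a_{nn}}\Bigr]\frac{d^i u}{dx^i} \\
&= \sum_{i=1}^{n} y_{ij}\frac{d^i u}{dx^i}
 + \sum_{i=0}^{n-1}\bigl(a_{i,j-1} - y_{i+1,j-1} - y_{ij}\bigr)\frac{d^i u}{dx^i}
 + \bigl(a_{j-1,n} - y_{jn}\bigr)\frac{d^n u}{dx^n} \\
&= \sum_{i=0}^{n-1}\bigl(a_{i,j-1} - y_{i+1,j-1}\bigr)\frac{d^i u}{dx^i} + a_{j-1,n}\frac{d^n u}{dx^n} \\
&= \sum_{i=0}^{n} a_{i,j-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,j-1}\frac{d^i u}{dx^i}.
\end{aligned}
\]
✷
PROPOSITION 1. A solution u(x) of (18) satisfies
\[
\sum_{j=0}^{l} (-1)^j \frac{d^j}{dx^j}\Biggl(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\Biggr)
- \sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i} = 0,
\qquad l = 0, 1, \ldots, n. \tag{20}
\]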
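Proof. Equation (18) can be rewritten as
\[ \sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n}\frac{d^i u}{dx^i} = 0. \tag{21} \]
Taking the derivative of (21) l times and using (19) repeatedly, we find that
\[
\begin{aligned}
\frac{d^l}{dx^l}&\Biggl(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n}\frac{d^i u}{dx^i}\Biggr) \\
&= \frac{d^l}{dx^l}\Biggl(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i}\Biggr)
 - \frac{d^{l-1}}{dx^{l-1}}\Biggl(\sum_{i=0}^{n} a_{i,n-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-1}\frac{d^i u}{dx^i}\Biggr) \\
&= \frac{d^l}{dx^l}\Biggl(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i}\Biggr)
 - \frac{d^{l-1}}{dx^{l-1}}\Biggl(\sum_{i=0}^{n} a_{i,n-1}\frac{d^i u}{dx^i}\Biggr)
 + \frac{d^{l-2}}{dx^{l-2}}\Biggl(\sum_{i=0}^{n} a_{i,n-2}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-2}\frac{d^i u}{dx^i}\Biggr) \\
&= \cdots \\
&= \sum_{j=2}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\Biggl(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\Biggr)
 + (-1)^{l-1}\frac{d}{dx}\Biggl(\sum_{i=0}^{n} a_{i,n-l+1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-l+1}\frac{d^i u}{dx^i}\Biggr) \\
&= \sum_{j=1}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\Biggl(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\Biggr)
 + (-1)^{l}\Biggl(\sum_{i=0}^{n} a_{i,n-l}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i}\Biggr) \\
&= \sum_{j=0}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\Biggl(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\Biggr)
 + (-1)^{l+1}\sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i} \\
&= 0.
\end{aligned}
\]
✷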
An important consequence of Proposition 1, obtained by taking l = n in equation (20) and using (12), is
COROLLARY 1. A solution u(x) of (18) satisfies
\[ \sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{ij}\frac{d^i u}{dx^i}\Bigr) = 0. \tag{22} \]
Equation (22) is found to be a generalization of the classical Jacobi equation,
which was originally derived for the variational problem involving the first order
derivative of the admissible function. It is also noted that (22) is identical with the
Euler–Lagrange equation (7) since the functional in (2) is quadratic. The theory for
such an equation is well developed. Among other things, since $a_{ij} \in C^n([a, b]; \mathbb{R})$,
all solutions of (22), and therefore all solutions of (18), are bounded on [a, b].
PROPOSITION 2. If the solution Y(x) of (10) and (11) is of $C^1$ in [a, c), and is
unbounded at x = c, then there exists a nontrivial solution û(x) of (18) on [a, c],
that satisfies
\[ \frac{d^i \hat u}{dx^i}(c) = 0, \qquad i = 0, 1, \ldots, n - 1. \tag{23} \]
Proof. The first n equations of (20) can be rewritten as a vector equation
\[ Y(x)\,w(x) = g(x), \tag{24} \]
where the $n \times n$ matrix function Y(x) is the solution of (10) and (11) under consideration, and w(x) and g(x) are n-dimensional vector functions whose components
are given, respectively, by
\[ w_i(x) \equiv \frac{d^{i-1} u(x)}{dx^{i-1}}, \qquad i = 1, \ldots, n, \tag{25} \]
\[ g_i(x) \equiv \sum_{j=0}^{n-i} (-1)^j \frac{d^j}{dx^j}\Biggl[\sum_{k=0}^{n} a_{k,i+j}(x)\frac{d^k u(x)}{dx^k}\Biggr], \qquad i = 1, \ldots, n. \tag{26} \]
Equation (18) has n linearly independent solutions $u^{(1)}(x), u^{(2)}(x), \ldots, u^{(n)}(x)$ on
[a, c). By Proposition 1, these solutions also satisfy (20), and therefore (24). Let
$w^{(1)}(x), w^{(2)}(x), \ldots, w^{(n)}(x)$ and $g^{(1)}(x), g^{(2)}(x), \ldots, g^{(n)}(x)$ be the corresponding vector functions defined through (25) and (26), respectively. Then each pair of
functions $w^{(l)}(x)$ and $g^{(l)}(x)$, $l = 1, \ldots, n$, satisfies (24). Since Y(x) is unbounded
at x = c, at least one eigenvalue of Y(x) is unbounded at x = c. It then follows
from (24) that the vectors $w^{(l)}(c)$, $l = 1, \ldots, n$, are all orthogonal to the associated eigenvector, and therefore are linearly dependent. Hence, there exist constants
$c_1, c_2, \ldots, c_n$, not all zero, such that
\[ \sum_{l=1}^{n} c_l\,w^{(l)}(c) = 0. \]
Now define
\[ \hat u(x) \equiv \sum_{l=1}^{n} c_l\,u^{(l)}(x). \tag{27} \]
This function satisfies (18) and (23). Moreover, û(x) is nontrivial since the functions
$u^{(l)}$, $l = 1, \ldots, n$, are linearly independent on [a, c).
✷
It is noted that Proposition 2 is related to the notion of conjugate point in the
classical theory of calculus of variations [2, 3]. For the case n = 1, the point x = c
is said to be conjugate to x = a if the Jacobi equation has a nontrivial solution that
vanishes at these two points.
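A scalar instance shows how Proposition 2 plays out. The data below are assumptions chosen for illustration (n = 1, a = 0, b > π/2, a00 = −1, a01 = 0, a11 = 1); they do not come from the paper.

```latex
% Riccati problem (10), (11):  y' = -1 - y^2,  y(0) = 0,  so  y(x) = -\tan x ,
% which is C^1 on [0, \pi/2) and unbounded at c = \pi/2 < b.
% Equation (18):  u' + (\tan x)\,u = 0,  with solutions  u(x) = C\cos x .
\hat u(x) = \cos x , \qquad \hat u(c) = \cos\tfrac{\pi}{2} = 0 \quad\text{as in (23)},
% and \hat u also satisfies the Jacobi equation (22), here  u'' + u = 0 .
% Consistent with Proposition 3:
\int_0^{\pi/2} \bigl(\hat u'^2 - \hat u^2\bigr)\,dx = -\int_0^{\pi/2} \cos 2x \, dx = 0 .
```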
PROPOSITION 3. If the solution Y(x) of (10) and (11) is of $C^1$ in [a, c), and is
unbounded at x = c, then there exists a nontrivial solution û(x) of (18) on [a, c],
such that
\[ \int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^i \hat u}{dx^i}\frac{d^j \hat u}{dx^j}\,dx = 0. \]
Proof. Let û(x) be given by (27). By (14), (18), (25), and (24), we find that
\[
\begin{aligned}
\int_a^c \sum_{i,j=0}^{n} a_{ij}(x)\frac{d^i \hat u(x)}{dx^i}\frac{d^j \hat u(x)}{dx^j}\,dx
&= \lim_{x \to c}\Biggl\{\sum_{i,j=0}^{n-1} y_{i+1,j+1}(x)\frac{d^i \hat u(x)}{dx^i}\frac{d^j \hat u(x)}{dx^j}
 + \int_a^x a_{nn}\Biggl(\frac{d^n \hat u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i \hat u}{dx^i}\Biggr)^{2} dx\Biggr\} \\
&= \lim_{x \to c}\sum_{i,j=0}^{n-1} y_{i+1,j+1}(x)\frac{d^i \hat u(x)}{dx^i}\frac{d^j \hat u(x)}{dx^j} \\
&= \lim_{x \to c} Y(x)\hat w(x) \cdot \hat w(x) \\
&= \hat g(c) \cdot \hat w(c),
\end{aligned}
\]
where ŵ(x) and ĝ(x) are defined by (25) and (26) through û(x). The desired
conclusion then follows from Proposition 2.
✷
We are now in a position to prove the main result of this section.
THEOREM 3. If the solution Y(x) of (10) and (11) is of $C^1$ in [a, c), and is
unbounded at x = c for some c < b, then inequality (2) does not hold for some
u ∈ U.
Proof. Let û(x) be given by (27). Define
\[ \tilde u(x) \equiv \begin{cases} \hat u(x) & \text{for } a \le x \le c, \\ 0 & \text{for } c < x \le b. \end{cases} \]
It follows from Proposition 3 that
\[ \int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i \tilde u}{dx^i}\frac{d^j \tilde u}{dx^j}\,dx = 0. \]
Suppose that (2) holds for all u ∈ U. Then we would have
\[ \int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i \tilde u}{dx^i}\frac{d^j \tilde u}{dx^j}\,dx
 \le \int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx \qquad \forall\, u \in U. \]
This implies that ũ is a minimizing function of the quadratic integral in (2), and
therefore must satisfy, in addition to the Euler–Lagrange equation (7), the
Weierstrass–Erdmann corner condition. This latter condition asserts that the expressions
\[ \sum_{j=0}^{n-k}\sum_{i=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{i,j+k}\frac{d^i \tilde u}{dx^i}\Bigr), \qquad k = 1, \ldots, n, \]
must be continuous at x = c. By the definition of ũ(x), we would then have
\[ \sum_{j=0}^{n-k}\sum_{i=0}^{n} (-1)^j \frac{d^j}{dx^j}\Bigl(a_{i,j+k}\frac{d^i \hat u}{dx^i}\Bigr)\Biggr|_{x=c} = 0, \qquad k = 1, \ldots, n. \tag{28} \]
By Corollary 1, function û is a solution of the 2nth order linear differential equation (22). The initial conditions (23) and (28) then imply that û(x) is identically
zero on [a, c], which is a contradiction.
✷
To complete the solution of the integral inequality (2), it remains to analyze the
case where the solution Y(x) of (10) and (11) is of C 1 in [a, b), and is unbounded
at the end point x = b. We shall show, in the following theorem, that a function
u ∈ U can be constructed for which (2) is violated.
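THEOREM 4. If the solution Y(x) of (10) and (11) is of $C^1$ in [a, b), and is
unbounded at x = b, then inequality (2) does not hold for some u ∈ U.
Proof. Let û(x) be given by (27) in Proposition 2, with c therein being replaced
by b. By Propositions 1 and 2, and Corollary 1, this û satisfies (18), (20), (22)
and (23), again with c being replaced by b. Define functions $\hat g_i(x)$, $i = 1, \ldots, n$,
through (26) and û(x). The values $\hat g_i(b)$, $i = 1, \ldots, n$, cannot be all zero, because
otherwise û(x) would be identically zero, in virtue of (22) and (23). Now define
\[ g(x) \equiv \sum_{i=0}^{n-1}\frac{\hat g_{i+1}(b)}{i!}\,(x - b)^i \]
and
\[ u(x) \equiv \hat u(x) - \epsilon\,g(x), \tag{29} \]
where ε is a small positive number. Obviously u ∈ U and
\[ \frac{d^i g}{dx^i}(b) = \hat g_{i+1}(b), \qquad i = 0, 1, \ldots, n - 1. \tag{30} \]
Furthermore, by (14), (29), (18), (20), (26), (23) and (30), we find that
\[
\begin{aligned}
\int_a^b \sum_{i,j=0}^{n} a_{ij}&\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
 - \epsilon^2\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}\,dx \\
&= \lim_{x \to b}\Biggl\{\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 + \int_a^x a_{nn}\Biggl(\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\Biggr)^{2} dx\Biggr\} \\
&\quad - \epsilon^2 \lim_{x \to b}\Biggl\{\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}
 + \int_a^x a_{nn}\Biggl(\frac{d^n g}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^i g}{dx^i}\Biggr)^{2} dx\Biggr\} \\
&= \lim_{x \to b}\Biggl(\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 - \epsilon^2\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}\Biggr) \\
&= \lim_{x \to b}\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i \hat u}{dx^i}\Bigl(\frac{d^j \hat u}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\Bigr) \\
&= \lim_{x \to b}\sum_{j=0}^{n-1}\sum_{k=0}^{n-j-1} (-1)^k \frac{d^k}{dx^k}\Biggl(\sum_{i=0}^{n} a_{i,j+k+1}\frac{d^i \hat u}{dx^i}\Biggr)\Bigl(\frac{d^j \hat u}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\Bigr) \\
&= \lim_{x \to b}\sum_{j=0}^{n-1} \hat g_{j+1}\Bigl(\frac{d^j \hat u}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\Bigr) \\
&= \sum_{j=0}^{n-1} \hat g_{j+1}(b)\bigl(-2\epsilon\,\hat g_{j+1}(b)\bigr) \\
&= -2\epsilon\,|\hat g(b)|^2.
\end{aligned}
\]
Since ĝ(b) is nonzero, the desired conclusion follows for sufficiently small ε.
✷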
In summary, Theorems 1, 2, 3 and 4 provide a definite conclusion on the positivity of the quadratic integral in (2). The conclusion hinges upon the solution
Y(x) of the initial value problem (10) and (11). For a practical problem, once the
coefficients aij (x) are given, this initial value problem can be solved numerically.
If the solution Y(x) becomes unbounded at some c ≤ b, the inequality (2) does
not hold for some u ∈ U. On the other hand, if the solution is bounded on [a, b],
whether (2) holds for all u ∈ U depends on whether Y(x) is positive semi-definite
at x = b.
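The decision procedure summarized above can be sketched for the scalar case. This is an illustrative sketch only: the function name `classify_scalar`, the restriction to n = 1 with constant coefficients, the fixed-step Runge–Kutta scheme and the threshold `bound` are assumptions introduced here, and a blow-up exactly at the end point b would need more careful numerics than this heuristic provides.

```python
def classify_scalar(a00, a01, a11, a, b, steps=4000, bound=1e6):
    """Decide (2) for n = 1 with constant coefficients (a11 > 0 assumed, cf. (9)).

    Integrates y' = a00 - (a01 - y)^2 / a11, y(a) = 0, by classical RK4 and
    applies Theorems 1 to 4 to the outcome.
    """
    f = lambda y: a00 - (a01 - y) ** 2 / a11
    h = (b - a) / steps
    y = 0.0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2)
        k4 = f(y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        if not abs(y) <= bound:
            return "fails: solution unbounded (Theorems 3 and 4)"
    if y >= 0.0:
        return "holds (Theorem 1)"
    return "fails: Y(b) not positive semi-definite (Theorem 2)"


# Assumed sample data:
print(classify_scalar(1.0, 0.0, 1.0, 0.0, 1.0))   # y = tanh(x), stays nonnegative
print(classify_scalar(-1.0, 0.0, 1.0, 0.0, 1.0))  # y = -tan(x), bounded but negative at b = 1
print(classify_scalar(-1.0, 0.0, 1.0, 0.0, 2.0))  # y = -tan(x) blows up at pi/2 < 2
```

For n > 1 the same logic applies with the matrix system (10) and an eigenvalue test on Y(b) in place of the sign test.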
6. An Example
As an illustrative example, we consider the inequality
\[ \int_0^1 \bigl[k^4 u^2 + 2k^4 x\,u u' + k^2(1 - kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\bigr]\,dx \ge 0, \tag{31} \]
where k is a constant, and a prime denotes the derivative with respect to x. This is
inequality (2) with
\[ n = 2, \quad a = 0, \quad b = 1, \quad a_{00} = k^4, \quad a_{01} = k^4 x, \quad a_{02} = 0, \quad a_{11} = k^2(1 - kx)^2, \quad a_{12} = 3k^2 x, \quad a_{22} = 1. \]
Not only do the theorems in the previous sections decide whether (31) holds for all
u ∈ U, the proofs of the theorems also provide the details of direct justification of
the conclusion as shown below.
The initial value problem (10) and (11) takes the form
\[
\begin{cases}
y_{11}' = k^4 - y_{12}^2, \\
y_{22}' = k^2(1 - kx)^2 - 2y_{12} - \bigl(3k^2 x - y_{22}\bigr)^2, \\
y_{12}' = k^4 x - y_{11} + y_{12}\bigl(3k^2 x - y_{22}\bigr), \\
y_{11}(0) = y_{22}(0) = y_{12}(0) = 0.
\end{cases}
\]
The solution is found to be
\[ y_{11} = k^4 x, \qquad y_{22} = \frac{k^2 x(1 - 2kx)}{1 - kx}, \qquad y_{12} = 0. \tag{32} \]
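The initial value problem of this example is small enough to integrate directly, which gives an independent check on the closed-form solution. The sketch below is an illustration under assumptions made here: the helper names `rhs` and `rk4`, the fixed-step Runge–Kutta scheme, and the sample value k = 0.4 are not from the paper. With $y_{12} = 0$ the first equation integrates to $y_{11} = k^4 x$, so the comparison at x = 1 is against $k^4$ and $k^2(1 - 2k)/(1 - k)$.

```python
def rhs(x, y, k):
    """Right-hand sides of the example system, ordered as (y11, y12, y22)."""
    y11, y12, y22 = y
    return (k ** 4 - y12 ** 2,
            k ** 4 * x - y11 + y12 * (3 * k ** 2 * x - y22),
            k ** 2 * (1 - k * x) ** 2 - 2 * y12 - (3 * k ** 2 * x - y22) ** 2)


def rk4(k, steps=2000):
    """Integrate from x = 0 to x = 1 with zero initial data; returns (y11, y12, y22) at x = 1."""
    h = 1.0 / steps
    x = 0.0
    y = (0.0, 0.0, 0.0)
    for _ in range(steps):
        s1 = rhs(x, y, k)
        s2 = rhs(x + h / 2, tuple(p + h / 2 * q for p, q in zip(y, s1)), k)
        s3 = rhs(x + h / 2, tuple(p + h / 2 * q for p, q in zip(y, s2)), k)
        s4 = rhs(x + h, tuple(p + h * q for p, q in zip(y, s3)), k)
        y = tuple(p + h / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
                  for p, q1, q2, q3, q4 in zip(y, s1, s2, s3, s4))
        x += h
    return y


k = 0.4  # assumed sample value with k < 1, so the solution stays bounded on [0, 1]
y11, y12, y22 = rk4(k)
print(y11, k ** 4)
print(y22, k ** 2 * (1 - 2 * k) / (1 - k))
print(y12)
```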
The solution is bounded in [0, 1] if k ∈ (−∞, 1). In this case, we have
\[ Y(1) = \begin{pmatrix} k^4 & 0 \\ 0 & k^2(1 - 2k)/(1 - k) \end{pmatrix}. \]
By Theorems 1 and 2, inequality (31) holds when k ∈ (−∞, 1/2], and does not
hold when k ∈ (1/2, 1). Indeed, when k ∈ (−∞, 1/2], we have, by integrating by
parts, that
\[
\begin{aligned}
\int_0^1 &\bigl[k^4 u^2 + 2k^4 x\,u u' + k^2(1 - kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\bigr]\,dx \\
&= k^4 u^2(1) + \frac{k^2 - 2k^3}{1 - k}\,u'^2(1) + \int_0^1 \Bigl(u'' + \frac{2k^2 x - k^3 x^2}{1 - kx}\,u'\Bigr)^{2} dx \\
&\ge 0.
\end{aligned}
\]
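On the other hand, when k ∈ (1/2, 1), we choose
\[ u = \exp\Bigl(\tfrac{1}{2}k^2 - k + kx - \tfrac{1}{2}k^2 x^2\Bigr) - 1, \]
and arrive at
\[
\begin{aligned}
\int_0^1 &\bigl[k^4 u^2 + 2k^4 x\,u u' + k^2(1 - kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\bigr]\,dx \\
&= \Bigl[k^4\bigl(x - 3kx^2 + 2k^2 x^3\bigr)\exp\Bigl(2\bigl(\tfrac{1}{2}k^2 - k + kx - \tfrac{1}{2}k^2 x^2\bigr)\Bigr)\Bigr]_0^1 \\
&= k^4(1 - k)(1 - 2k) \\
&< 0.
\end{aligned}
\]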
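The value of the integral for this choice of u can be confirmed by quadrature. The following is a sketch under assumptions made here: the helper names, the composite Simpson rule and the sample value k = 0.75 are illustrative choices, with the derivatives of u written out by hand.

```python
import math


def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def integrand(x, k):
    """Integrand of (31) for u = exp(k^2/2 - k + k x - k^2 x^2 / 2) - 1."""
    E = math.exp(0.5 * k * k - k + k * x - 0.5 * k * k * x * x)
    u = E - 1.0
    du = k * (1 - k * x) * E                         # u'
    d2u = (-k * k + (k * (1 - k * x)) ** 2) * E      # u''
    return (k ** 4 * u * u + 2 * k ** 4 * x * u * du
            + k ** 2 * (1 - k * x) ** 2 * du * du
            + 6 * k ** 2 * x * du * d2u + d2u * d2u)


k = 0.75  # assumed sample value in (1/2, 1)
value = simpson(lambda x: integrand(x, k), 0.0, 1.0)
print(value, k ** 4 * (1 - k) * (1 - 2 * k))
```

Both printed numbers should agree to quadrature accuracy and be negative, in line with Theorem 2.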
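Finally, when k ∈ [1, ∞), the solution (32) is unbounded at x = 1/k. Theorems 3 and 4 assert that inequality (31) does not hold for some u ∈ U. Here we
demonstrate it for k = 1. Taking in the left-hand side of (31),
\[ u = -\sqrt{e}\,(4 - 3x) + \exp\Bigl(x - \tfrac{1}{2}x^2\Bigr), \]
we find that
\[
\begin{aligned}
\int_0^1 &\bigl[u^2 + 2x\,u u' + (1 - x)^2 u'^2 + 6x\,u' u'' + u''^2\bigr]\,dx \\
&= \int_0^1 \Bigl[e\bigl(25 - 66x + 36x^2\bigr) - \sqrt{e}\,\bigl(2 + 14x + 4x^2 - 6x^3\bigr)\exp\Bigl(x - \tfrac{1}{2}x^2\Bigr) \\
&\qquad\quad + \bigl(2 - 2x - 4x^2 + 10x^3 - 4x^4\bigr)\exp\Bigl(2\bigl(x - \tfrac{1}{2}x^2\bigr)\Bigr)\Bigr]\,dx \\
&= \Bigl[e\bigl(25x - 33x^2 + 12x^3\bigr) - \sqrt{e}\,\bigl(2x + 6x^2\bigr)\exp\Bigl(x - \tfrac{1}{2}x^2\Bigr)
 + \bigl(2x - 3x^2 + 2x^3\bigr)\exp\Bigl(2\bigl(x - \tfrac{1}{2}x^2\bigr)\Bigr)\Bigr]_0^1 \\
&= -3e \\
&< 0.
\end{aligned}
\]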
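The case k = 1 can likewise be checked by quadrature. The sketch below uses the same kind of assumptions as before: the helper names and the composite Simpson rule are choices made here, and the derivatives of u are written out by hand from the formula for u above.

```python
import math


def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def integrand_k1(x):
    """Integrand of (31) with k = 1 for u = -sqrt(e)(4 - 3x) + exp(x - x^2/2)."""
    r = math.sqrt(math.e)
    E = math.exp(x - 0.5 * x * x)
    u = -r * (4 - 3 * x) + E
    du = 3 * r + (1 - x) * E            # u'
    d2u = (-1 + (1 - x) ** 2) * E       # u''
    return u * u + 2 * x * u * du + (1 - x) ** 2 * du * du + 6 * x * du * d2u + d2u * d2u


value_k1 = simpson(integrand_k1, 0.0, 1.0)
print(value_k1, -3 * math.e)
```

The quadrature value should be close to −3e, confirming that (31) is violated for this u.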
Acknowledgements
The author wishes to thank Professor A. Mielke for his critiques and comments
on an earlier version of this work. The support of ONR grant 99PR08596 and of
the Texas Institute for Intelligent Bio-Nano Materials and Structures for Aerospace
Vehicles, NASA NCC-1-02038 is acknowledged.
References
1. Y.C. Chen and D. Haughton, Stability and bifurcation of inflation of elastic cylinders. Proc. Roy. Soc. London A 459 (2003) 137–156.
2. M.R. Hestenes, Calculus of Variations and Optimal Control Theory. Wiley (1966).
3. H. Sagan, Introduction to the Calculus of Variations. Dover Publications (1992).
4. R.E. Kalman, Contribution to the theory of optimal control. Bol. Soc. Mat. Mexicana 5 (1960) 102–119.
5. M. Andjelic, On a matrix Riccati equation of cooperative control. Internat. J. Control 23 (1976) 427–432.
6. B.D. Anderson and J. Moore, Optimal Control Linear Quadratic Methods. Prentice-Hall (1989).
7. W.T. Reid, Riccati Differential Equations. Academic Press (1965).
8. M. Razzaghi, Solution of the matrix Riccati equation in optimal control. Inform. Sci. 16 (1978) 61–73.
9. L. Jodar and E. Navarro, Closed analytical solution of Riccati type matrix differential equations. Indian J. Pure Appl. Math. 23 (1992) 185–187.
10. J. Nazarzadeh, M. Razzaghi and K.Y. Nikravesh, Solution of the matrix Riccati equation for the linear quadratic control problems. Math. Comput. Modelling 27 (1998) 51–55.
11. E.L. Ince, Ordinary Differential Equations. Dover Publications (1956).
Principal Compliance and Robust Optimal Design
ELENA CHERKAEV and ANDREJ CHERKAEV
Department of Mathematics, University of Utah, U.S.A.
E-mail: elena@math.utah.edu, cherk@math.utah.edu
Received 1 November 2002; in revised form 17 September 2003
Abstract. The paper addresses a problem of robust optimal design of elastic structures when the
loading is unknown and only an integral constraint for the loading is given. We propose to minimize
the principal compliance of the domain equal to the maximum of the stored energy over all admissible
loadings. The principal compliance is the maximal compliance under the extreme, worst possible
loading. The robust optimal design is formulated as a min–max problem for the energy stored in
the structure. The maximum of the energy is chosen over the constrained class of loadings, while
the minimum is taken over the design parameters. It is shown that the problem for the extreme
loading can be reduced to an elasticity problem with mixed nonlinear boundary conditions; the
last problem may have multiple solutions. The optimization with respect to the designed structure
takes into account the possible multiplicity of extreme loadings and divides resources (reinforced
material) to equally resist all of them. Continuous change of the loading constraint causes bifurcation
of the solution of the optimization problem. It is shown that an invariance of the constraints under a
symmetry transformation leads to a symmetry of the optimal design. Examples of optimal design are
investigated; symmetries and bifurcations of the solutions are revealed.
Mathematics Subject Classifications (2000): 35B27, 35J50, 35P15, 49K20, 65K10, 74P05.
Key words: structural design, robustness, bifurcation, Steklov eigenvalues, minimax, constrained
optimization.
This paper is dedicated to the memory of Professor Clifford Truesdell.
1. Introduction
A typical structural optimization problem asks for a material layout in the stiffest
design. The stiffness is defined as an elastic energy of a domain loaded by external
boundary forces (loading). If the loading is fixed and known, an optimal structure
adapts itself to resist the loading. However, the optimal designs are usually unstable
to variations of the forces. This instability is a direct result of optimization: To best
resist the given loading, all the resistivity of the structure is concentrated against a
certain direction thus decreasing its ability to sustain loadings in other directions
[7, 8, 20]. For example, consider a problem of optimal design of a structure of a
cube of maximal stiffness made from an elastic material and void; assume that the
cube is supported on its lower side and loaded by a homogeneous vertical force
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 169–196.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
on its upper side. It is easy to demonstrate that the optimal structure is a periodic
array of unconnected, infinitely thin cylindrical rods. Obviously, this design does
not resist any loading other than the vertical one.
The instability to variations of the loading is not a defect of an optimization
procedure – the structure does exactly what it is asked to do; it is a defect of the
modeling. In order to find a more stable robust solution, one needs to optimize
a more general robust stiffness-like functional that characterizes an elastic body
loaded by unspecified (or partly unspecified) forces on its boundary, as it happens
with most engineering constructions. To avoid this vulnerability of the optimally
designed structures to variations of loading, we propose to minimize the principal
compliance of the domain equal to the maximum of the stored energy over all
admissible loadings. The principal compliance is the maximal compliance under
the extreme, worst possible loading. We formulate the robust optimal design problem as a min–max problem for the energy stored in the domain, where the inner
maximum is taken over the set of admissible loadings and the minimum is chosen
over the design parameters characterizing the structure. This formulation corresponds to physical situations when biological materials are created and engineering
constructions are designed to withstand loadings that are not known in advance.
This approach to the structural optimization was discussed in our papers [9, 12]
and (for the finite-dimensional model) in the papers [18, 19]. Various aspects of the
optimal design against partly unknown loadings were studied in [1, 5, 8, 21, 25–
27, 31, 32, 37], see also references therein. In some cases, the minimax design
problem, where the designed structure is chosen to minimize maximal compliance
of the domain, can be formulated as minimization of the largest eigenvalue of an
operator. The minimization of dominant eigenvalues was considered in a setting of
the inverse conductivity problem in [11, 13]. The multiplicity of optimal design that
we find in the minimax loading-versus-design problem is similar to multiplicity of
stationary solutions investigated in the engineering problems of the optimal design
against buckling [14, 34] and vibration [30, 28, 33, 22].
The structure of this paper is as follows. In Section 2, we introduce an integral
quantity of an elastic domain, the principal compliance, equal to the response
of the domain to the worst (extremal) boundary loading from the given class of
loadings; this quantity is a basic integral characteristic of the domain similar to the
capacity, the eigenfrequency, or the volume. The principal compliance is a solution
of a variational problem, which can be reduced to an eigenvalue problem or to a
bifurcation problem.
Examples of various constraints for admissible loadings and resulting variational problems are considered in Section 3. Particularly, the variational problem
for the principal compliance with a quadratically constrained class of loadings
is reduced to the Steklov eigenvalue problem. The principal compliance of the
domain in this case is a reciprocal of the principal Steklov eigenvalue. We also
consider the constraints of the Lp norm, p > 1, of the loading and inhomogeneous
constraints and show that the Lp norm constraints result in a nonlinear boundary
value problem. The constraint on the L1 norm of the loading leads to a variational
problem which does not have a classical solution, but a distribution: the optimal
loading turns out to be a δ-function or, physically speaking, a concentrated loading
(if such a loading does not lead to infinite energy).
Section 4 considers robust structural optimization which is formulated as a
problem of minimization of the principal compliance. The optimal design takes
into account the multiplicity of stationary solutions for extreme (most dangerous)
loadings; typically, the optimal structure equally resists several extreme loadings.
The set of the extreme loadings depends on the constraints of the problem. Continuous change of the constraints leads to modification of the set of extreme loadings;
the optimal structure changes in response. This corresponds to bifurcation of the
solution of the optimization problem. Another characteristic feature of the optimization problem is the symmetry of its solution. We show that the invariance of
the set of the constraints for the admissible loadings, together with the corresponding symmetry of the domain, leads to the symmetry of the optimally designed
structure.
Section 5 contains two examples of problems of structural design for uncertain
loadings. One example is provided by the problem of designing the optimally
supported beam loaded by an unknown loading with fixed mean value. The second example is a problem of determining the optimal structure of a composite
strip loaded by a force which deviates from the normal in an unknown direction.
The force is assumed to have a prescribed normal component and an additional
component which is arbitrarily directed and is unknown.
2. The Principal Compliance of a Domain
2.1. PROBLEM, EQUATIONS, CONSTRAINTS
2.1.1. Equations
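Consider a domain $\Omega$ with the boundary $\partial\Omega = \partial_0 \cup \partial$ filled with a linear anisotropic
elastic material, loaded on its boundary component $\partial$ by a force $f$, and fixed on the
boundary component $\partial_0$. The elastic equilibrium of such a body is described by the
system (see, for instance, [35]):
$$ \nabla \cdot \sigma = 0 \ \text{in } \Omega, \qquad \sigma = C : \epsilon, \qquad \epsilon(w) = \frac{1}{2}\left(\nabla w + (\nabla w)^T\right), \qquad \sigma = \sigma^T. \tag{1} $$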
Here $C = C(x)$ is the fourth-order stiffness tensor of an anisotropic inhomogeneous material, $w = w(x)$ is the displacement vector, $\epsilon$ is the strain tensor, $\sigma$ is the
stress tensor, and (:) represents contraction over two indices. Thus,
$$ \epsilon : \sigma = \sum_{i,j} \epsilon_{ij}\,\sigma_{ji}, \qquad (C : \epsilon)_{ij} = \sum_{k,l} C_{ijkl}\,\epsilon_{lk}. $$
172
E. CHERKAEV AND A. CHERKAEV
Equation (1) is supplemented with the boundary conditions
$$ \sigma \cdot n = f \ \text{on } \partial, \qquad w = 0 \ \text{on } \partial_0, \tag{2} $$
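where $n$ is the normal to the boundary $\partial$. These equations are the first variation
conditions of the variational problem,
$$ J(C, f) = -\min_{w:\, w|_{\partial_0}=0} \left[ \int_\Omega \Pi(C, \epsilon(w))\,dx - \int_\partial w \cdot f\,ds \right] = \max_{w:\, w|_{\partial_0}=0} \left[ \int_\partial w \cdot f\,ds - \int_\Omega \Pi(C, \epsilon(w))\,dx \right], \tag{3} $$
where $\Pi$ is the density of the elastic energy:
$$ \Pi(C, \epsilon(w)) = \frac{1}{2}\,\epsilon : \sigma = \frac{1}{2}\,\epsilon : C : \epsilon. \tag{4} $$
The nonnegative functional $J$ is called the compliance of the domain; (3) states that
it is maximal at equilibrium. At equilibrium, the energy stored in the body equals
the work of the applied external forces $f$,
$$ J_0(C, f) = \frac{1}{2} \int_\partial w \cdot f\,ds = \int_\Omega \Pi(C, \epsilon(w))\,dx. \tag{5} $$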
Along with the elasticity problem, we also consider the closely related problem
of the bending of a Kirchhoff plate (see, for example, [35]). The equilibrium of the
plate is described by the fourth-order equation
$$ \nabla\nabla : C_{pl} : \nabla\nabla w = f \ \text{in } \Omega \tag{6} $$
with homogeneous boundary conditions
$$ w = 0 \ \text{on } \partial\Omega, \qquad \frac{\partial w}{\partial n} = 0 \ \text{on } \partial\Omega, \tag{7} $$
corresponding to a clamped plate, or
$$ w = 0 \ \text{on } \partial\Omega, \qquad n^T (C_{pl} : \nabla\nabla w)\, n = 0 \ \text{on } \partial\Omega, \tag{8} $$
for a simply supported plate. Here, $w$ is the deflection orthogonal to the plane of
the plate, $C_{pl}$ is the fourth-order tensor of bending stiffness of the elastic material,
$\nabla\nabla w$ is the Hessian of $w$, and $f$ is the external loading. Notice that the force $f$
enters the equation as a right-hand-side term. The equation for the plate deflection
corresponds to maximization of the functional
$$ J_{pl}(C, f) = -\int_\Omega \left( \frac{1}{2}\, \nabla\nabla w : C_{pl} : \nabla\nabla w - w f \right) dx. \tag{9} $$
The results that we develop further in this paper apply to both the elasticity (1)
and the bending problem (6); therefore, we will drop the subscript in $J_{pl}(C, f)$,
and keep the notation J(C, f) for both compliance functionals. Where this causes
no confusion, we use the same notation w to denote both the displacement in the
elasticity problem (1) and the deflection in the bending problem (6), even though
the first one is a vector function, whereas the second one is a scalar function.
2.1.2. Admissible Loadings
Let $F$ be a set of admissible loadings $f$. The elastic energy over a finite domain
is assumed to be finite. We consider integral constraints to describe the set of
loadings $F$:
$$ F = \left\{ f : \int_{D_f} \phi(f)\,ds = 1 \right\}, \qquad D_f = \begin{cases} \partial, & \text{for problem (1)}, \\ \Omega, & \text{for problem (6)}. \end{cases} \tag{10} $$
Here $D_f$ is the domain of application of the forces: in the elasticity problem (1),
$D_f$ coincides with the part $\partial$ of the boundary, whereas for the plate bending problem (6), $D_f$ is the domain $\Omega$ or a part of it. We assume that $\phi$ is a convex function
of $f$, with the derivative $\psi: R^3 \to R^3$:
$$ \psi(f) = \frac{\partial \phi}{\partial f} = \left( \frac{\partial \phi}{\partial f_1}, \frac{\partial \phi}{\partial f_2}, \frac{\partial \phi}{\partial f_3} \right), $$
which has an inverse $\rho = \psi^{-1}$.
2.1.3. Principal Compliance
We define the principal compliance of an elastic domain in a class of loadings as a
compliance in the worst possible loading scenario.
DEFINITION. The principal compliance $\Lambda$ of the domain is
$$ \Lambda = \max_{f \in F} J(C, f). \tag{11} $$
The loadings that correspond to the principal compliance are the extreme or most
dangerous loadings; we denote them by $f_D$:
$$ \Lambda(C) = J(C, f_D) \ge J(C, f) \qquad \forall f \in F. \tag{12} $$
The most dangerous loadings exist if the set F is closed and convex, see [15].
2.2. CALCULATION OF THE PRINCIPAL COMPLIANCE
The concept of the principal compliance is useful if there are efficient algorithms
for computing the extreme loadings. We show here that the problem of computation of the principal compliance and the extreme loadings can be formulated as a
boundary value problem.
Consider problem (11) and assume that the loadings are constrained as in (10).
The augmented functional $J_A$ for the problem is:
$$ J_A = J(C, f) - \mu \left( \int_{D_f} \phi(f)\,ds - 1 \right), $$
where $\mu$ is the Lagrange multiplier. Clearly, $\max_{f \in F} J = \max_f J_A$. Variation of
the augmented functional with respect to $f$ gives the optimality condition for the
extreme loading(s):
$$ \delta_f J_A = \int_{D_f} \frac{\partial}{\partial f} \bigl( -f \cdot w + \mu \phi(f) \bigr)\, \delta f \, ds = 0, $$
or, since $\delta f$ is arbitrary,
$$ w - \mu \frac{\partial \phi}{\partial f} = 0 \ \text{on } D_f. $$
Solving for the extreme loading(s) $f_D = f$, we arrive at the condition
$$ f_D = \rho\!\left( \frac{w}{\mu} \right), \tag{13} $$
which links the loading $f_D$ to the displacement $w$ at the same boundary point
for the elasticity problem (1) or at the same point in the domain for the bending
problem. Condition (13) together with the first boundary condition in (2) allows us
to exclude $f$ from the boundary conditions, leading to a boundary value problem
for the displacement $w$. We arrive at:
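THEOREM 1. The principal compliance of the elasticity problem (1), (2) with
the constraints for the class of loadings (10) equals
$$ \Lambda = \frac{1}{2} \int_\partial w \cdot \rho\!\left( \frac{w}{\mu} \right) ds, \tag{14} $$
where $w$ satisfies the elasticity equations (1) in $\Omega$ with the boundary conditions
$$ \sigma \cdot n = \rho\!\left( \frac{w}{\mu} \right) \ \text{on } \partial, \qquad w = 0 \ \text{on } \partial_0. \tag{15} $$
The Lagrange multiplier $\mu$ is determined from the integral condition
$$ \int_\partial \phi\!\left( \rho\!\left( \frac{w}{\mu} \right) \right) ds = 1, \tag{16} $$
where the function $\rho(\cdot)$ is the inverse of $\psi = \partial\phi/\partial f$.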
Indeed, the displacement w, whose energy is the principal compliance, satisfies
the elasticity equations (1) in Ω with the boundary conditions obtained from (2)
and (13). The first condition in (15) relates the normal stress at a point on the
boundary ∂ to the displacement at this point. The boundary value problem (1),
(15), (16) allows us to compute w, µ, fD, and Λ.
For the bending problem (6), the calculation is similar. The principal compliance is the maximum of the functional (9) over all loadings bounded by the
constraint (10); its value is the following.
THEOREM 2. The principal compliance for the bending problem (6)–(8) with
the constraint for the class of loadings (10) is
$$ \Lambda = \frac{1}{2} \int_\Omega w\, \rho\!\left( \frac{w}{\mu} \right) dx, \tag{17} $$
where $w$ satisfies the equation
$$ \nabla\nabla : C_{pl} : \nabla\nabla w = \rho\!\left( \frac{w}{\mu} \right) \tag{18} $$
together with the corresponding homogeneous boundary conditions (7) or (8). The
function $\rho(\cdot)$ is the inverse of $\psi = \partial\phi/\partial f$. The Lagrange multiplier $\mu$ is determined
from
$$ \int_\Omega \phi\!\left( \rho\!\left( \frac{w}{\mu} \right) \right) ds = 1. \tag{19} $$
Indeed, the extreme loading $f$ is related to the displacement $w$ by the scalar
relation $w = \mu \phi'(f)$ or $f = \rho(w/\mu)$, and the plate equilibrium is described by
equation (18).
3. Examples of Constraints
3.1. HOMOGENEOUS QUADRATIC CONSTRAINT
Assume that the constraint (10) restricts a weighted L2 norm of $f$:
$$ \frac{1}{2} \int_\partial f^T \Psi f \, ds = 1 \quad \text{or} \quad \phi(f) = \frac{1}{2} f^T \Psi f, \tag{20} $$
where $\Psi(s)$ is a symmetric, positive definite matrix. In this case, $\rho$ is a linear mapping,
$\rho(f) = \Psi^{-1} f$, and the first of the boundary conditions (15) for the extremal
loading becomes linear:
$$ \frac{1}{\mu} \Psi^{-1} w - \sigma \cdot n = 0 \ \text{on } \partial. \tag{21} $$
The optimality condition states that $w$ and $\sigma \cdot n$ are proportional to each other
everywhere on the boundary $\partial$ with the same tensor of proportionality $\mu\Psi$.
REMARK 1. The stationary condition (21) allows for the following physical interpretation: The boundary ∂ is equipped with distributed springs with negative stiffness. The forces in them are proportional but opposite to the forces in conventional
linear springs.
The elasticity equations (1) with boundary conditions (21) form a linear eigenvalue problem that has a nonzero solution w only if 1/µ is one of its discrete
eigenvalues. Eigenvalue 1/µ relates the displacement on the boundary and the
normal stress.
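Like all eigenvalue problems of this type, problem (1), (21) represents the Euler–Lagrange
equations of a variational problem:
$$ \frac{1}{\mu} = \min_{w:\, w|_{\partial_0}=0} \frac{\int_\Omega \epsilon(w) : C : \epsilon(w)\,dx}{\int_\partial w \cdot \Psi^{-1} w\,ds} $$
or
$$ \int_\Omega \epsilon(w) : C : \epsilon(w)\,dx - \frac{1}{\mu} \int_\partial w \cdot \Psi^{-1} w\,ds \ \to \ \min_{w:\, w|_{\partial_0}=0}. \tag{22} $$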
The eigenvalue problem that contains the eigenvalue in the boundary condition is a
Steklov eigenvalue problem, and µ is a reciprocal to the Steklov eigenvalue, see [4].
The eigenfunctions are normalized by condition (20).
Using (20) and (21) in the form w = µΨf, we observe that the second term
in (22) is equal to µ; therefore µ = Λ. The Steklov problem has infinitely many
real positive eigenvalues (see [4, 23]), but the principal compliance of the domain
corresponds to the dominant eigenvalue, Λ = µmax. The dominant eigenfunction
is not necessarily unique; we will demonstrate below that the existence of many
stationary solutions is typical for the problems of minimization of the principal
compliance with respect to the structure. The dominant eigenfunctions are the
extreme loadings. The results are formulated as
THEOREM 3. If the L2-norm of admissible loadings is bounded, the principal
compliance $\Lambda$ is a solution of the eigenvalue problem:
$$ \nabla \cdot \sigma = 0 \ \text{in } \Omega, \qquad w = \Lambda \Psi\, \sigma \cdot n \ \text{on } \partial. \tag{23} $$
$\Lambda$ is the reciprocal of the principal eigenvalue $1/\mu$ of the problem (1), (21).
REMARK 2. The spectrum of the problem (1), (21) has one condensation point,
zero. Positive eigenvalues µk tend to zero but never reach it. This implies that the
dual problem of minimal compliance does not have a solution: the compliance can
be made arbitrarily small by choosing a fast alternating loading.
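A finite-dimensional analogue may help fix ideas. The sketch below is only an illustration under assumptions that are not in the text: the elastic body is replaced by a hypothetical three-node spring chain with stiffness matrix K, the loading acts on every node, and the weight Ψ is taken as the identity. In that setting the compliance is J = ½ f^T K^{-1} f, the constraint (20) reads ½ f^T Ψ f = 1, and the principal compliance is the largest eigenvalue Λ of K^{-1} f = Λ Ψ f, the discrete counterpart of (23).

```python
import numpy as np

def principal_compliance_l2(K, Psi):
    """Largest eigenvalue Lam of G f = Lam * Psi f with G = K^{-1}, together
    with the extreme loading f_D normalized so that 0.5 * f^T Psi f = 1."""
    G = np.linalg.inv(K)                  # flexibility (discrete Green's) matrix
    L = np.linalg.cholesky(Psi)           # Psi = L L^T
    Linv = np.linalg.inv(L)
    lam, Y = np.linalg.eigh(Linv @ G @ Linv.T)   # equivalent symmetric problem
    f = Linv.T @ Y[:, -1]                 # back-transform the dominant eigenvector
    f *= np.sqrt(2.0 / (f @ Psi @ f))     # enforce the quadratic constraint
    return lam[-1], f

def compliance(K, f):
    """J = 0.5 * f . w, where w solves the equilibrium equations K w = f."""
    w = np.linalg.solve(K, f)
    return 0.5 * f @ w

# Hypothetical chain of three unit springs, fixed at one end (illustrative values).
K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
Psi = np.eye(3)
Lam, f_D = principal_compliance_l2(K, Psi)
print("principal compliance:", Lam)
print("extreme loading:", f_D)
```

The smallest eigenvalues of the same pencil correspond to rapidly alternating loadings with small compliance, which is the discrete trace of the observation in Remark 2.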
REMARK 3. The problem becomes isomorphic to the problem of the principal
eigenfrequency of the domain, if the kinetic energy (and the inertia) are concentrated on the boundary: T = δ(x − xb )ww, where xb ∈ ∂.
In the bending problem (6), the analogy between the principal compliance and
the principal eigenfrequency of vibrations is complete. The equilibrium (18) of
the optimally loaded plate coincides with the equation for the magnitude of the
deflection of the oscillating plate,
$$ \nabla\nabla : C_{pl} : \nabla\nabla w = \frac{1}{\Lambda}\, w. $$
3.2. L1-NORM CONSTRAINT
Consider the L1-norm constraint for the class of admissible loadings, which assumes that the mean value of the loading's magnitude is fixed:
$$ \int_\partial |f|\,ds = \int_\partial \sqrt{f \cdot f}\,ds = 1. \tag{24} $$
From an engineering viewpoint, this case is probably the most interesting one: it
models the situation when the total weight applied to the structure is known but the
distribution of the loading over the boundary is uncertain.
In this case, the functional of the variational problem grows linearly as $|f| \to \infty$,
which leads to a significantly different analysis. The straightforward variational
technique does not provide the correct answer. Indeed, the variation with respect
to $f$ returns the vector condition
$$ \delta f: \quad w - \mu \frac{1}{\sqrt{f \cdot f}}\, f = 0 \ \text{on } \partial, $$
which says that
$$ |w| = \text{constant} \quad \text{and} \quad w \parallel f \ \text{on } \partial. $$
The last condition, together with the condition $\sigma \cdot n = f$ (see (2)), allows us to
exclude $f$ and end up with a pair of conditions on $w$:
$$ (\sigma \cdot n) \times w = 0, \qquad |w| = \text{constant} \ \text{on } \partial. $$
Generally, these conditions cannot be satisfied if the $\partial$-component of the boundary is adjacent to the component $\partial_0$ where $w = 0$, since $w$ is continuous. This
contradiction shows that the naive variational method does not apply.
REMARK 4. The appearance of discontinuous solutions in the variational problems of linear growth is well-known [36]. The famous classical example is the
existence of a non-smooth solution in the minimal surface problem.
To resolve the contradiction, we need to assume that the optimal loading f is a
distribution. Indeed, a distribution does not have to satisfy the Euler equations
of the variational problem, because these equations were derived under the assumption
that the optimal solution f is finite and smooth.
Dealing with distributions in the L1 -constrained set of loadings may cause difficulties because the distributions δ(x − x0 ) may or may not correspond to a finite
energy of the elastic system, as is stated in the Sobolev embedding theorem, see,
for example, [24]. For the compliance of the bending plate (9), the energy of the
concentrated loading and the Green’s function of the corresponding operator are
finite. We illustrate this case below considering a one-dimensional example of a
beam; the concentrated loadings of the type δ(x − x0 ) are acceptable because the
corresponding energy stored in the elastic beam is finite.
However, the linear elasticity problem does not allow a concentrated loading
because the corresponding energy is infinite; the Green's function $g(x, y)$ has a
singularity, $g(x, x) = \infty$. In this case, the restriction on the class of admissible
$f$ can be slightly tightened. We may assume, for example, that the force is piecewise constant within small domains of area $\epsilon$. Alternatively, we may constrain the
$L_{1+\epsilon}$-norm of the loading,
$$ \int_\partial |f|^{1+\epsilon}\,ds = 1, \tag{25} $$
where $\epsilon > 0$ is a fixed parameter. This loading can be supported by a linear elastic
material, although the displacement $w$ can grow indefinitely when $\epsilon \to 0$. The
analysis of this case leads to the optimality condition
$$ f = \left( \frac{|w|}{\mu} \right)^{1/\epsilon} \frac{w}{|w|}, $$
which shows that the magnitude of an optimal loading either stays arbitrarily close to
zero or is very large (of the order of $1/\epsilon$). The integral constraint (25) guarantees
that the measure of the set of large values of $f(s)$ goes to zero when $\epsilon \to 0$.
With this warning, we proceed with the formal analysis of the problem with the
L1 constraint, assuming that either the limit exists or that $\epsilon$ can be chosen arbitrarily
close to zero to preserve the qualitative properties of the solution.
The extremal loading is concentrated in several points,
$$ f = \sum_i c_i\, \xi_i\, \delta(x - x_i), $$
where $\{x_i\}$ is the set of points where the (concentrated) loading is applied, $x_i \in \partial$,
$\xi_i = (\xi_i^{(1)}, \xi_i^{(2)}, \xi_i^{(3)})$, $|\xi_i| = 1$, are directional vectors of the concentrated
loadings, and $c_i$ are their intensities; due to (24), the $c_i$ belong to the simplex
$$ \left\{ c_i : \ \sum_i c_i = 1, \ c_i \ge 0 \right\}. \tag{26} $$
Below, we show that the extreme loading is always applied to a single point. The
displacements $w_k = w(x_k)$ are
$$ w_k = \sum_i g(x_k, x_i)\, c_i\, \xi_i, $$
where $g(x_k, x_i)$ is the Green's function which relates the δ-function loading at the
point $x_i$ to the generated displacement $w$ at the point $x_k$. The compliance becomes
$$ J = \sum_i \sum_k c_i c_k\, \xi_i^T g(x_i, x_k)\, \xi_k. $$
The principal compliance corresponds to the maximum of J with respect to ci , ξi
and the points xi .
As a function of $c_i$, $J$ is a nonnegative quadratic form, because the work $J$ is
always nonnegative. Therefore, $J$ is a convex function of $c_i$ and its maximum is
reached at a vertex of the simplex (26): the maximum $J_c$ of $J$ corresponds to a
single concentrated loading $c_1 = 1$, $c_2 = \cdots = c_p = 0$. Next, we maximize
this maximum $J_c$ with respect to the direction $\xi_1 = (\xi_1^{(1)}, \xi_1^{(2)}, \xi_1^{(3)})$ of the single
applied loading. The resulting compliance $J_{\xi,c}$ is equal to the maximal eigenvalue
$\lambda^g_{\max}(x_1)$ of the Green's function $g(x_1, x_1)$ at the point $x = x_1$:
$$ J_{\xi,c} = \max_{\xi_1} \xi_1^T g(x_1, x_1)\, \xi_1 = \lambda^g_{\max}(x_1). $$
This implies that the applied loading $f(x)$ must be parallel to the displacement
$w(x)$. Finally, we choose the point $x_1 \in \partial$ of application of the extreme concentrated loading and obtain the principal compliance $\Lambda$. Summarizing, we obtain
THEOREM 4. The L1-principal compliance is
$$ \Lambda = \max_{x \in \partial} \lambda^g_{\max}(x), $$
where $\lambda^g_{\max}(x)$ is the maximal eigenvalue of the 3 × 3 tensor Green's function
$g(x, x)$ of the problem (1) at the point $x \in \partial$.
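The vertex argument behind Theorem 4 can be checked numerically in a scalar, finite-dimensional setting. The sketch below is an illustration under assumptions not made in the text: the Green's function is replaced by a hypothetical random symmetric positive semidefinite matrix G, so that J(c) = c^T G c, and the loading directions are dropped (in the vector case each diagonal entry would be replaced by the largest eigenvalue of the corresponding 3 × 3 block).

```python
import numpy as np

def l1_principal_compliance(G):
    """Maximum of the convex form J(c) = c^T G c over the simplex
    {c_i >= 0, sum c_i = 1}: it is attained at a vertex, i.e. at the
    largest diagonal entry of the symmetric PSD matrix G."""
    i = int(np.argmax(np.diag(G)))
    return G[i, i], i

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
G = A @ A.T                       # hypothetical symmetric PSD "Green's matrix"
Lam, i_star = l1_principal_compliance(G)

# Sample admissible intensity vectors c from the simplex and evaluate J(c).
samples = rng.dirichlet(np.ones(6), size=2000)
J_samples = np.einsum("ni,ij,nj->n", samples, G, samples)
print("vertex value:", Lam, "best sampled value:", J_samples.max())
```

No sampled distribution of intensities exceeds the single concentrated loading at the point with the largest diagonal entry, in line with the convexity argument.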
We stress that the point $x_1$ may not be unique although the extreme loading
is always concentrated at one point. For example, there may be two symmetric
extreme loadings if $\Omega$ is a symmetric domain. An example in Section 5.1 below
shows that there are several equally dangerous loadings in an optimal solution:
$\lambda^g_{\max}(x_1) = \cdots = \lambda^g_{\max}(x_q)$; the number $q$ depends on the structure.
3.3. OTHER SPECIAL CASES
3.3.1. Constrained Lp-norm of the Loading
If the constraint is imposed on the Lp-norm of the loading, i.e.,
$$ \frac{1}{p} \int_\partial |f|^p\,ds = 1, \qquad p > 1, $$
the problem has the form (1) but the boundary conditions (21) are replaced by
$$ \sigma \cdot n = \eta(w), \qquad \eta(w) = \left( \frac{|w|}{\mu} \right)^{1/(p-1)} \frac{w}{|w|}, \tag{27} $$
and the normalization (16) for $\mu$ becomes
$$ \mu = \left( \frac{1}{p} \int_\partial |w|^q\,ds \right)^{1/q} \quad \text{with} \quad \frac{1}{q} + \frac{1}{p} = 1. \tag{28} $$
In this case, the relation between the stress and displacement is nonlinear. Again,
the multiplicity of stationary solutions that satisfy (27), (28) is expected; this time
the solutions correspond to bifurcation points instead of spectrum points. The physical interpretation is similar to the one given in Remark 1, but the springs attached
to the boundary ∂ are nonlinear.
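The normalization (28) follows from substituting (27) into the Lp constraint, and this can be verified on sample data. The sketch below is a consistency check only, under assumptions not in the text: the boundary is represented by hypothetical quadrature points with weights ds, and w is random test data rather than a solution of the boundary value problem.

```python
import numpy as np

def extreme_loading_lp(w, ds, p):
    """Given boundary displacements w (n x d, all nonzero), quadrature weights
    ds (n,) and an exponent p > 1, return mu from (28) and f = eta(w) from (27)."""
    q = p / (p - 1.0)                               # conjugate exponent, 1/p + 1/q = 1
    mag = np.linalg.norm(w, axis=1)                 # |w| at each boundary point
    mu = (np.sum(mag**q * ds) / p) ** (1.0 / q)     # normalization (28)
    scale = (mag / mu) ** (1.0 / (p - 1.0)) / mag   # |f| / |w|
    return mu, scale[:, None] * w                   # f is parallel to w

rng = np.random.default_rng(2)
w = rng.normal(size=(50, 3))          # hypothetical boundary displacement samples
ds = np.full(50, 1.0 / 50)            # uniform weights on a boundary of unit measure
mu3, f3 = extreme_loading_lp(w, ds, p=3.0)
mu2, f2 = extreme_loading_lp(w, ds, p=2.0)
print("constraint value for p = 3:", np.sum(np.linalg.norm(f3, axis=1)**3 * ds) / 3.0)
```

For p = 2 the mapping reduces to the linear relation f = w/µ, which corresponds to the quadratic case with an identity weight.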
3.3.2. Nonhomogeneous Constraint
Let the loading $f$ consist of some known component $f^0$ and an unknown deviation
with a constrained Lp-norm:
$$ \| f^0 - f \|_{L_p} \le 1. \tag{29} $$
Applying the previous variational analysis, we conclude that an extremal loading
can be found from the elasticity problem with an inhomogeneous mixed boundary
condition:
$$ \sigma \cdot n = f^0 + \eta(w) \ \text{on } \partial. $$
Since the boundary condition is inhomogeneous, w = 0 is not a solution. Still, the
problem may have several stationary solutions. An example of this constraint is
discussed later in Section 5.2.
4. Robust Optimal Design
4.1. MULTIPLICITY OF EXTREME LOADINGS
Consider an optimal design problem: find a layout of elastic materials over the
domain $\Omega$ that minimizes the principal compliance $\Lambda$. Such a structure (stiffness
$C(x)$) corresponds to a solution of the extremal problem
$$ P_{\min\max} = \min_{C \in \mathcal{C}} \Lambda(C), \tag{30} $$
where $\mathcal{C}$ is a class of admissible layouts. We rewrite the problem using the definition of $\Lambda(C)$:
$$ P_{\min\max} = \min_{C \in \mathcal{C}} \max_{f \in F} J(C, f), \tag{31} $$
where the compliance $J = J(C, f)$ is defined in (3). The extremum over $w$ in (3)
is taken first, so that $w$ satisfies the elasticity equations. The two possible orders
of the extremal operations $\min_{C \in \mathcal{C}}$ and $\max_{f \in F}$ correspond to two
physically different situations. Minimax problem (31) is a problem of optimization
of the material layout when the applied loading is unknown, while in the maximin
problem
$$ P_{\max\min} = \max_{f \in F} \min_{C \in \mathcal{C}} J(C, f) \tag{32} $$
the loading is chosen to maximize the stored energy and is known to the designer;
so the design resists this particular loading. If $J$ is a saddle-point functional, the
solutions to these two problems coincide, and
$$ P_{\max\min} = P_{\min\max}. $$
Saddle point solutions are typical for 'weak' control as we will demonstrate below.
The general case
$$ P_{\max\min} < P_{\min\max} $$
corresponds to a situation when several loadings are 'equally dangerous.' The stiffness of the structure $C_{opt}$ should be fairly distributed to resist equally well each of
these extreme loadings, leading to the condition
$$ J(C_{opt}, f_i) = J(C_{opt}, f_j), \qquad f_i, f_j \in \mathcal{F}_D, $$
where $\mathcal{F}_D$ is the set of extreme loadings.
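The gap between the maximin and minimax values can be seen in a toy problem. The sketch below is an illustration under assumptions that are not in the text: a hypothetical structure of two independent springs with base stiffness k0, a design variable t that splits an extra stiffness kappa between them, and two admissible loadings, each a unit force on one spring.

```python
import numpy as np

def compliance(t, i, k0=1.0, kappa=2.0):
    """Energy stored by a unit force on spring i when the fraction t of the
    extra stiffness kappa goes to spring 0 and the fraction 1 - t to spring 1."""
    k = k0 + kappa * (t if i == 0 else 1.0 - t)
    return 0.5 / k

t_grid = np.linspace(0.0, 1.0, 101)                                   # admissible designs
J = np.array([[compliance(t, i) for i in (0, 1)] for t in t_grid])    # shape (101, 2)

P_minmax = J.max(axis=1).min()    # design chosen against an unknown loading
P_maxmin = J.min(axis=0).max()    # loading known to the designer in advance
t_opt = t_grid[J.max(axis=1).argmin()]
print("P_minmax =", P_minmax, "P_maxmin =", P_maxmin, "t_opt =", t_opt)
```

In this toy problem the robust design splits the stiffness evenly, both loadings become equally dangerous, and the maximin value is strictly smaller than the minimax value.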
Generally, the set of stationary loadings may consist of any number of elements.
They can be found from the following equations, see [16]. Consider a design $C_{opt}$
and the functional $J(C_{opt}, f)$. The extremal loadings that solve the variational
problem
$$ \frac{\delta}{\delta f} J(C_{opt}, f) = 0, \qquad \frac{\delta^2}{\delta f^2} J(C_{opt}, f) \le 0 $$
are denoted by $\hat f_i$, $i = 1, \ldots, p$, where $p \le \infty$; we assume that there are $p$
stationary loadings that can become extreme. The optimized principal compliance
$P_{\min\max}$ is determined from the problem
$$ \min_{C} \max_{\nu_i \ge 0} \left[ P_{\min\max} + \sum_i \nu_i J(C_{opt}, \hat f_i) \right], \tag{33} $$
where $\nu_i \ge 0$ are the Lagrange multipliers due to the constraints
$$ J(C_{opt}, \hat f_i) - P_{\min\max} \le 0, \qquad \sum_i \nu_i = 1. $$
Optimal design $C_{opt}$ is found from the following conditions that reformulate the
minimax problem as the problem of minimization of a sum of energies corresponding to extreme loadings.
THEOREM 5. The optimal principal compliance $P_{\min\max}$ equals
$$ P_{\min\max} = \min_{C \in \mathcal{C}} \max_{\{\nu_i\}:\, \nu_i > 0} \sum_{i=1}^{q} \nu_i J(C, \hat f_i), \qquad \sum_i \nu_i = 1, \tag{34} $$
where $q$ is the number of active extreme loadings.
The nonzero Lagrange multipliers correspond to the equalities
$$ J_0 = J(C_{opt}, \hat f_i), \quad i = 1, \ldots, q \quad \Rightarrow \quad \nu_i > 0, $$
and the multipliers equal zero if the stationary loading leads to a smaller value of
the functional, i.e.,
$$ J_0 > J(C_{opt}, \hat f_k), \quad k = q + 1, \ldots, p \quad \Rightarrow \quad \nu_k = 0. $$
These last conditions should be checked in the optimization procedure; that is,
while minimizing $J_0$ we check whether the value of the functional for the next loading $\hat f_{q+1}$
(not the most dangerous one) is still less than $J_0$. When this inequality becomes an
equality, the set of extreme loadings should be enlarged to include $\hat f_{q+1}$, and the
corresponding Lagrange multiplier $\nu_{q+1}$ becomes positive.
The multiplicity of equally dangerous loadings closely resembles the multiplicity of optimal solutions in a well studied problem of maximization of the
minimal eigenfrequency. The multiplicity of optimal eigenvalues in that problem
was observed first in a pioneering paper of Olhoff and Rasmussen [30]; then it was
investigated in [33, 14, 34].
REMARK 5. The optimization problem (34) also admits a probabilistic interpretation. Namely, assume that the optimal loading is a random variable which takes
$q$ stationary values with some probabilities $\nu_1, \ldots, \nu_q$. Then the sum $\sum_i \nu_i J(C, \hat f_i)$
in (34) is the expectation of the energy. The optimal design minimizes the expectation of the energy, while the loading chooses the probabilities $\nu_1, \ldots, \nu_q$ to
maximize it.
4.2. SYMMETRIES
Symmetries are typical for designs that minimize the principal compliance. Namely,
if the domain and the class of loadings are invariant under a symmetry transformation (translation, reflection, or rotation), then the set of extreme loadings and
the optimal design are invariant under this transformation as well. We state the
following:
THEOREM 6. If the domain $\Omega$, the boundary component $\partial$, and the set $F$ of
admissible loadings are invariant under a symmetry transformation $R$, i.e.,
$$ \Omega = R\,\Omega, \qquad \partial = R\,\partial, \qquad \text{and} \qquad F = R F, $$
then the set $\mathcal{F}_D$ of extreme loadings and the optimal materials' layout $C$ are invariant under this transformation, i.e.,
$$ \mathcal{F}_D = R\,\mathcal{F}_D, \qquad C = R\,C. \tag{35} $$
Figure 1. The force could be applied at arbitrary points along the elastically supported beam.
The mean value of the magnitude of the force is constrained.
Indeed, applying the above consideration we can see that if $f_0 \in \mathcal{F}_D$ is an extreme
loading in the set $\mathcal{F}_D$ of extreme loadings, then $R f_0$ is also an extreme loading. The compliance of the structure
should be the same for both loadings, which implies invariance of the design parameters with respect to the transformation $R$. In particular, when the loaded domain
is rotationally symmetric, and the loading can be applied from any direction, the
optimal layout is axisymmetric.
REMARK 6. Notice the symmetry of many natural ‘designs’ that are perfected
by evolution: The rotationally symmetric shape of trees allows them to sustain
wind from all directions; our natural “protective shell”, the skull, provides the best
protection for the brain against hits from any direction.
The conditions of the theorem do not require the symmetry of the extreme loading, only a possibility to apply a loading symmetric to any given one. In contrast,
the design must be symmetric.
5. Examples of Optimal Designs
The following examples highlight the discussed multiplicity of extreme loadings
and bifurcation of the optimal solution.
5.1. OPTIMAL DESIGN OF A SUPPORTED BEAM
5.1.1. Formulation
Consider a homogeneous elastic beam of unit length simply supported at both ends,
elastically supported from below by a distributed system of elastic vertical springs
with the specific stiffness $q(x) \ge 0$, and loaded by a distributed nonnegative force
$f(x) \ge 0$. The elastic equilibrium of the displacement $w$ is described by a one-dimensional version of (6):
$$ (E w'')'' + q w = f, \qquad w(0) = w(1) = 0, \qquad w''(0) = w''(1) = 0, \tag{36} $$
where $E$ is Young's modulus. The compliance is equal to
$$ J = \int_0^1 \left( f w - \frac{E}{2} (w'')^2 - \frac{q}{2} w^2 \right) dx, \tag{37} $$
where $w$ is a solution of (36). Assume that the mean value of the magnitude of
the loading (L1-norm constraint) is equal to one, and the integral stiffness of the
supporting springs is constrained by a constant $\kappa$:
$$ F = \left\{ f \in H^{-1}(0, 1): \ \int_0^1 f\,dx = 1 \right\}, \qquad Q = \left\{ q \in H^{-1}(0, 1): \ \int_0^1 q\,dx = \kappa \right\}. $$
The optimal design problem of minimization of the principal compliance by distributing the spring stiffness becomes:
$$ P_{\min\max} = \min_{q \in Q} \max_{f \in F} J. $$
Applying the above analysis, we conclude:
1. The domain, the class of loadings, and the boundary conditions are invariant under the
reflection $x \to 1 - x$; therefore the design (the spring stiffness) is symmetric
with respect to the center of the beam, see Section 4.2,
$$ q(x) = q(1 - x). $$
2. Necessary conditions in Section 3.2 show that the extreme loading is a delta-function $f(x) = \delta(x - x_j)$ applied at one of the points $\{x_1, x_2, \ldots, x_p\}$, where
$$ w'(x_j) = 0, \qquad w''(x_j) \le 0. \tag{38} $$
The extreme loading may be applied to different points symmetric with respect
to the center of the beam; the resulting stiffness must be equal.
3. The stiffness of an optimal spring is a distribution
$$ q(x) = \sum_i \alpha_i\, \delta(x - y_i), \qquad \sum_i \alpha_i = \kappa, \quad \alpha_i \ge 0. $$
Indeed, the assumption that $q(x)$ satisfies variational stationary conditions leads
to a contradiction similar to the contradiction discussed in Section 3.2. In particular, the optimal positions of the springs satisfy the necessary conditions (38),
and therefore the set of reinforcement points coincides with the set $\{x_1, x_2, \ldots, x_p\}$. The number $p$ of the critical points depends on the relative stiffness
of the springs $\kappa/E$.
Accounting for the loading and springs being concentrated, we reformulate the
problem (37) for the optimal principal compliance:
$$ P_{\min\max} = \min_{(\alpha_1, \ldots, \alpha_p)} \max_{x_k} \left\{ \sum_{i=1}^{p} \left( \delta_{ik} w_k - \frac{\alpha_i}{2} w_i^2 \right) - \int_0^1 \frac{E}{2} (w'')^2\,dx \right\}, \tag{39} $$
where $\delta_{ik}$ is the Kronecker delta.
The response of a supported beam can be characterized by a function
$$ v(x) = \max_{\zeta \in (0,1)} g(\zeta, x), \tag{40} $$
where g is the Green’s function of the boundary value problem (36): g(ζ, x) is the
displacement w(ζ ) at the point ζ corresponding to a delta-function loading applied
at the point x, f (ζ ) = δ(ζ − x), and v(x) is the maximal displacement under the
concentrated force applied at the point x. Figure 2 shows the response v(x) of the
beam supported by two symmetric springs. The family of the thin curves shows
the displacements wk (x) under several concentrated loadings applied at different
points along the beam. The thick curve shows the maximal displacement, v(x).
Notice that the point of application of the concentrated force is generally different
from the point of maximum of the displacement curve; see the caption to Figure 2.
However, the optimal springs are located at the points $x^i_{opt}$, $i = 1, 2$, of the maximum
of $v(x)$, and the extreme loading is the one applied at one of the same points,
$$ f_D = \delta(x - x^1_{opt}) \quad \text{or} \quad f_D = \delta(x - x^2_{opt}). $$
The numerical results demonstrate the following: if the springs are weak, $\kappa/E \le \kappa_1$, they are concentrated in the center of the beam. We are dealing with the saddle-point case: the most dangerous loading is a concentrated loading applied also at the
center. The maximal displacement $v(x)$ is a unimodal function of the position of
the loading, with the maximum in the center ($v'(1/2) = 0$, $v''(1/2) < 0$). There
is only one solution for the optimal applied force and the optimal position of the
spring:
$$ f(x) = \delta\!\left( x - \frac{1}{2} \right), \qquad q(x) = \kappa\, \delta\!\left( x - \frac{1}{2} \right). $$
Figure 2. Thin curves: The displacement functions generated by concentrated loadings applied at various points along the beam. The thick curve: maximal displacement v(x) generated
by a force applied at x ∈ (0, 1) as a function of the position of the force. The displacement
corresponding to the force applied at x = 0.15 has a maximum at x = 0.25. The figure shows the
responses of the beam optimally reinforced by two symmetric springs.
Figure 3. Maximal displacement v(x) as a function of the position of the applied loading:
(a) corresponds to a saddle point case, κ/E < κ1: the function v(x) is unimodal, the optimal
spring and the extreme loading are both located in the middle of the beam; (b) shows v(x)
corresponding to κ/E in the interval κ1 < κ/E < κ2 when the strong spring is located in
the center of the beam. Maximal displacement v(x) is not unimodal; design is not optimal;
(c) corresponds to κ/E in the same interval κ1 < κ/E < κ2, the maximal displacement v(x)
is shown for an optimally designed beam which is supported by two symmetric springs.
Figure 3(a) shows $v(x)$ for the beam supported by a weak spring in the center
of the beam. One can see that $v(x)$ is unimodal. If the spring becomes stronger,
$\kappa_1 < \kappa/E \le \kappa_2$, but is still located in the center, the maximum of $v(x)$ corresponds
to a noncentral applied force. The equally dangerous loadings could be applied
at two symmetric eccentric points. The maximum displacement $v(x)$, shown in
Figure 3(b), is not a unimodal function of the position of the moving applied
force; the design is not optimal. The optimal design for this case (Figure 3(c))
corresponds to two equally stiff springs located symmetrically with respect to the
center; the design experiences a bifurcation at the critical value $\kappa/E = \kappa_1$.
An optimally supported beam is shown in Figure 3(c), where two strong springs
are located symmetrically with respect to the center of the beam. The maximal displacement curve becomes unimodal again, with a large interval of almost constant
values in the middle. The next bifurcation occurs when $\kappa$ increases further, at the
point $\kappa/E = \kappa_2$. Three springs appear after this bifurcation. The number of
optimal supporting points increases and tends to infinity when the springs are much
stronger than the beam, $\kappa/E \gg 1$. The optimality conditions
$$ w'(x_i) = 0, \qquad w(x_i)|_{f = f_i} = \text{constant}(i), $$
give the optimal positions $x_i$ of the supporting springs and a requirement on their
stiffnesses $\alpha_i$.
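The qualitative behavior described above can be explored with a small finite-difference model. The sketch below is an illustration under assumptions that are not from the text: a uniform grid with N intervals, constant E = 1, point springs lumped at grid nodes, and the spring strengths 10, 1000, and 2 × 500 chosen only for illustration (the critical values κ1 and κ2 are not computed here). The matrix G approximates the Green's function g(x_i, x_j) of (36), and by Theorem 4 the L1 principal compliance is the largest diagonal entry.

```python
import numpy as np

def beam_green_matrix(N, E=1.0, springs=()):
    """G[i, j] ~ g(x_i, x_j) for (E w'')'' + q w = f on (0, 1), simply supported,
    at the interior nodes x_i = (i + 1) / N.
    springs: iterable of (node_index, alpha) for q = sum alpha * delta(x - x_node)."""
    h = 1.0 / N
    n = N - 1
    D2 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1)) / h**2     # second difference, zero end values
    A = E * D2 @ D2                                 # valid because w = w'' = 0 at both ends
    for j, alpha in springs:
        A[j, j] += alpha / h                        # point spring lumped at node j
    return np.linalg.inv(A) / h                     # a unit point load is 1/h at one node

def l1_principal_compliance(G):
    """Largest diagonal entry of G and the node where the extreme load acts."""
    j = int(np.argmax(np.diag(G)))
    return G[j, j], j

N, c = 100, 49                                      # node c sits at x = 0.5
G_free = beam_green_matrix(N)
G_weak = beam_green_matrix(N, springs=[(c, 10.0)])
G_strong = beam_green_matrix(N, springs=[(c, 1000.0)])
G_pair = beam_green_matrix(N, springs=[(29, 500.0), (69, 500.0)])   # x = 0.3 and x = 0.7

for name, G in [("free", G_free), ("weak", G_weak), ("strong", G_strong), ("pair", G_pair)]:
    Lam, j = l1_principal_compliance(G)
    print(f"{name:6s}  principal compliance = {Lam:.5f} at x = {(j + 1) / N:.2f}")
```

With these illustrative values, the weak central spring leaves the most dangerous load at the center, the strong central spring pushes it to an eccentric point, and splitting the same total stiffness between two symmetric springs gives a smaller principal compliance than the single strong central spring. The positions 0.3 and 0.7 are not claimed to be optimal.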
5.2. COMPOSITE STRIP WITH CONSTRAINED DEVIATION OF THE LOADING
This example shows the design of an optimal structure for the worst possible loading. Consider an infinite strip $\Omega = \{-\infty < x < \infty,\ -1 \le y \le 1\}$, made from a
two-component elastic composite with arbitrary structure but with fixed fractions
$m_A$ and $m_B = 1 - m_A$ of the isotropic components. The stiffness of the composite
$C(x, y)$ is an anisotropic elasticity tensor; it is assumed that the stiffness can vary
only along the strip, $C = \text{constant}(y)$.
Assume that the upper boundary is loaded by some unknown but uniform loading $f$,
$$ \sigma(x, 1) \cdot N = f \qquad \forall x, $$
where $N = (0, 1)$ is the normal vector. The loading $f$ consists of the fixed component $f_0 N = (0, 1)$ directed along the normal (so that $f_0 = 1$) and a variable component (deviation)
$(f_N, f_T)$; the magnitude of the deviation is constrained:
$$ f = (f_0 + f_N) N + f_T T, \qquad f_N^2 + f_T^2 = \gamma^2. \tag{41} $$
Here $T = (1, 0)$ is the tangent vector and $\gamma$ is the intensity of the deviation. The
constraint (41) can be rewritten as
$$ f = (f_0 + \gamma \cos\theta) N + (\gamma \sin\theta) T \qquad \text{for } y = 1, $$
Figure 4. An infinite composite strip loaded by a force f that could deviate from the normal
direction. If the norm γ of the deviation is smaller than a critical value γ1 , the optimal composite is a laminate with layers directed across the strip. If γ is greater than γ1 , the optimal
composite is second rank laminate with layers oriented along directions φ and −φ.
where $\theta$ is the angle of inclination of the deviation of the loading; see Figure 4. The
lower boundary of the strip is assumed to be loaded by a symmetrically deviated
force
$$ f_- = -f = -(f_0 + \gamma \cos\theta) N + (\gamma \sin(-\theta)) T \qquad \text{for } y = -1. $$
The symmetry of the loadings results in the horizontal strain being zero,
$$ \epsilon_{xx}(x, y) = 0, \qquad -1 \le y \le 1, \tag{42} $$
so that the strain tensor has only two nonzero components, vertical and shear. The
stiffness of the composite $C(x)$ is an anisotropic tensor that is assumed to vary only
along the $x$ coordinate. We consider the problem of optimization of the principal
compliance of the described domain.
5.2.1. Design Parameters
Applying the symmetry theorem, we conclude that:
1. The elastic properties of the optimally designed structure do not vary along the
strip, since the design is invariant to the translation x → x + χ. Together with
the assumption that the material properties do not vary with the thickness, this
leads to the conclusion that the elastic properties are uniform: the tensor C is
constant in x and y. This implies that the stress field σ is constant inside an
optimal strip and
\[ \sigma_{yy} = 1 + \gamma\cos\theta, \qquad \sigma_{xy} = \gamma\sin\theta. \tag{43} \]
2. The material in the optimal strip is orthotropic with main axes directed along
the x and y axes since the design is invariant to the reflection x → −x:
\[ C : \begin{pmatrix} 0 & \epsilon_{xy} \\ \epsilon_{xy} & \epsilon_{yy} \end{pmatrix} = C : \begin{pmatrix} 0 & -\epsilon_{xy} \\ -\epsilon_{xy} & \epsilon_{yy} \end{pmatrix}. \]
This implies orthotropy with the main axes codirected along the x, y axes.
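For the following calculations, we introduce an orthonormal (a_i : a_j = δ_ij) tensor basis
\[ a_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad a_2 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad a_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{44} \]
In this basis, the stress tensor σ,
\[ \sigma = \begin{pmatrix} \sigma_2 & \sigma_3 \\ \sigma_3 & \sigma_1 \end{pmatrix}, \]
is represented as a vector
\[ \sigma = \sigma_1 a_1 + \sigma_2 a_2 + \sqrt{2}\,\sigma_3 a_3. \]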
The compliance tensor S and stiffness tensor C = S^{-1} are presented as matrices with the components {S_ij} and {C_ij}; their orthotropy implies the representation
\[ S = \begin{pmatrix} S_{11} & S_{12} & 0 \\ S_{12} & S_{22} & 0 \\ 0 & 0 & S_{33} \end{pmatrix} \]
and a similar one for C.
5.2.2. The Optimization Problem
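The energy of an orthotropic material is computed either as a function of stresses and compliance tensor S = {S_ij} (stress energy):
\[ \Pi_\sigma(S, \sigma) = \tfrac{1}{2}\left(S_{11}\sigma_1^2 + S_{22}\sigma_2^2 + 2S_{12}\sigma_1\sigma_2 + 2S_{33}\sigma_3^2\right), \tag{45} \]
or as a function of strain ε and stiffness tensor C = {C_ij},
\[ \Pi_\epsilon(C, \epsilon) = \tfrac{1}{2}\left(C_{11}\epsilon_1^2 + C_{22}\epsilon_2^2 + 2C_{12}\epsilon_1\epsilon_2 + 2C_{33}\epsilon_3^2\right). \tag{46} \]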
Recall (see (43)) that two components σ1 = σyy and σ3 = σxy of the stress field σ are known, and the strain in the xx direction is zero, (42):
\[ \epsilon_2 = S_{12}\sigma_1 + S_{22}\sigma_2 = 0; \]
therefore, σ2 can be excluded. The elastic energy (46) becomes
\[ \Pi_\epsilon(C, \epsilon) = \tfrac{1}{2}\left(C_{11}\epsilon_1^2 + 2C_{33}\epsilon_3^2\right) \]
or, in terms of stress (see (45)),
\[ \Pi_\sigma(S, \sigma) = \tfrac{1}{2}\left[\left(S_{11} - \frac{S_{12}^2}{S_{22}}\right)\sigma_1^2 + 2S_{33}\sigma_3^2\right]. \]
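The elimination of σ2 can be checked numerically. The sketch below is only an illustration: the compliance entries and stress values are arbitrary assumed numbers, not data from the paper, and the function names are hypothetical. It evaluates the full stress energy (45) with σ2 chosen so that the horizontal strain vanishes and compares it with the reduced expression.

```python
def stress_energy(S11, S22, S12, S33, s1, s2, s3):
    """Stress energy (45) of an orthotropic material in the basis (44)."""
    return 0.5 * (S11 * s1**2 + S22 * s2**2 + 2.0 * S12 * s1 * s2 + 2.0 * S33 * s3**2)


def reduced_stress_energy(S11, S22, S12, S33, s1, s3):
    """Stress energy after sigma_2 is eliminated via eps_2 = S12*s1 + S22*s2 = 0."""
    return 0.5 * ((S11 - S12**2 / S22) * s1**2 + 2.0 * S33 * s3**2)


# Assumed, purely illustrative values (not taken from the paper).
S11, S22, S12, S33 = 0.9, 0.7, -0.3, 1.3
s1, s3 = 1.2, 0.4

# sigma_2 that enforces zero horizontal strain, eps_2 = 0.
s2 = -S12 * s1 / S22

full = stress_energy(S11, S22, S12, S33, s1, s2, s3)
reduced = reduced_stress_energy(S11, S22, S12, S33, s1, s3)
print(abs(full - reduced) < 1e-12)
```

The two values agree to rounding error for any positive S22, which is all the reduced formula relies on.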
The problem of robust optimal design becomes
\[ P_{\mathrm{strip}} = \min_{C \in G_m\text{-closure}} \; \max_{f \in F} \; \Pi_\sigma(S, \sigma), \tag{47} \]
where Gm closure is the set of all possible effective compliance tensors of a microstructure formed from the two given materials with the compliance tensors SA
and SB , taken in the proportion mA and mB = 1 − mA , respectively, see [8, 27]. We
reformulate the problem using a sum of weighted energies, where the minimized
functional is taken as a sum of the energies due to the extreme loadings.
5.2.3. Laminates of Third Rank: Symmetry
The description of the strongest structures that minimize the sum of the energies
due to several loadings is known, (see the original papers [2, 3, 17] and the books
[8, 29]); the best structures in 2D are so-called “laminates of the third rank” shown
in Figure 5. In 3D, they are the sixth rank laminates [17]. Structural optimization based on using the third rank composites was effectively developed for the
multi-loadings case in [6, 10, 25]. The effective compliance tensor S = C^{-1} of a third rank composite – the symmetric fourth-order tensor of elasticity – has the representation
\[ S = S_A + m_B\left[(S_B - S_A)^{-1} + m_A N\right]^{-1}, \tag{48} \]
where SA is the compliance of an enveloping (reinforcing) material, SB is the compliance of the material in the nucleus, N is the matrix of structural parameters that
depends on the structure of the composite, see [8, 29],
\[ N = E_A \sum_{i=1}^{3} \alpha_i P(\phi_i), \qquad \sum_{i=1}^{3} \alpha_i = 1, \qquad \alpha_i \ge 0. \]
Here EA is the Young’s modulus of the A-material, angles φi are the angles that define the directions of laminates (directions of reinforcement), P is a tensor product
of four directional vectors z_i = (cos φ_i, sin φ_i):
\[ P(\phi_i) = z_i \otimes z_i \otimes z_i \otimes z_i, \tag{49} \]
and α_i is the corresponding relative thickness of the reinforcing layer in the ith direction.
Figure 5. The schematic picture of the composite of the third rank.
The above mentioned symmetry of an optimal composite requires the orthotropy
of the optimal structure. Since the original materials are isotropic, the structure is
orthotropic if the matrix N is orthotropic. This can be achieved by setting
φ2 = −φ3 = φ,
α2 = α3 = α.
Generally, the optimal strip is reinforced by three layers of strong material; one
layer (with relative volume fraction 1 − 2α) is directed in the y-direction and two
other layers (with equal relative volume fractions α) are symmetrically inclined by
the angles ±φ. In addition, the structure may degenerate into a single layer (when α = 0) or two symmetric layers (when α = 1/2) with angles φ and −φ. Because of this symmetry, the matrix N for an optimal composite becomes
\[ N = (1 - 2\alpha)P(0) + \alpha P(\phi) + \alpha P(-\phi). \tag{50} \]
Let us compute the compliance of a third-rank composite in the basis (44).
Compliance S_A of an isotropic material A is given by a matrix
\[ S_A = \frac{1 + \nu_A}{E_A} \begin{pmatrix} 1 - \nu_A & -\nu_A & 0 \\ -\nu_A & 1 - \nu_A & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
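and similarly for the material B. To compute the effective compliance of a third-rank laminate, we first represent the matrix P(φ) of (49) in the basis (44),
\[ P(\phi) = \begin{pmatrix} \cos^4\phi & \sin^2\phi\cos^2\phi & \sqrt{2}\sin\phi\cos^3\phi \\ \sin^2\phi\cos^2\phi & \sin^4\phi & \sqrt{2}\sin^3\phi\cos\phi \\ \sqrt{2}\sin\phi\cos^3\phi & \sqrt{2}\sin^3\phi\cos\phi & 2\sin^2\phi\cos^2\phi \end{pmatrix}, \]
and obtain from (50)
\[ N = \begin{pmatrix} 1 - 2\alpha + 2\alpha\cos^4\phi & 2\alpha\sin^2\phi\cos^2\phi & 0 \\ 2\alpha\sin^2\phi\cos^2\phi & 2\alpha\sin^4\phi & 0 \\ 0 & 0 & 4\alpha\sin^2\phi\cos^2\phi \end{pmatrix}. \]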
The matrix N is the variable part of the compliance matrix, (see (48)); it depends
on only two scalar parameters, φ and α.
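The passage from (49) and (50) to the closed-form matrix N can be verified with a few lines of Python. This is a sketch under one stated assumption: the components of z ⊗ z in the basis (44) are taken as (cos²φ, sin²φ, √2 sin φ cos φ), which is the ordering that reproduces the matrix P(φ) printed above. The function names are hypothetical.

```python
import math


def zz_vector(phi):
    """Components of z (x) z in the basis (44), ordered (a1, a2, a3).

    Assumption: the angle convention is such that P(0) = a1 (x) a1, which
    reproduces the matrix P(phi) given in the text.
    """
    c, s = math.cos(phi), math.sin(phi)
    return [c * c, s * s, math.sqrt(2.0) * s * c]


def P(phi):
    """3x3 matrix of z (x) z (x) z (x) z in the basis (44): outer product v v^T."""
    v = zz_vector(phi)
    return [[vi * vj for vj in v] for vi in v]


def N_third_rank(phi, alpha):
    """N = (1 - 2 alpha) P(0) + alpha P(phi) + alpha P(-phi), as in (50)."""
    P0, Pp, Pm = P(0.0), P(phi), P(-phi)
    return [[(1.0 - 2.0 * alpha) * P0[i][j] + alpha * (Pp[i][j] + Pm[i][j])
             for j in range(3)] for i in range(3)]


def N_closed_form(phi, alpha):
    """Closed-form matrix N quoted in the text."""
    c2, s2 = math.cos(phi) ** 2, math.sin(phi) ** 2
    return [[1.0 - 2.0 * alpha + 2.0 * alpha * c2 ** 2, 2.0 * alpha * s2 * c2, 0.0],
            [2.0 * alpha * s2 * c2, 2.0 * alpha * s2 ** 2, 0.0],
            [0.0, 0.0, 4.0 * alpha * s2 * c2]]


phi, alpha = 0.7, 0.3  # arbitrary test values
A, B = N_third_rank(phi, alpha), N_closed_form(phi, alpha)
max_diff = max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))
print(max_diff < 1e-12)
```

The off-diagonal shear couplings of P(φ) and P(−φ) cancel, which is why N is orthotropic for any φ and α.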
The structural optimization problem (47) finally becomes an algebraic problem
\[ J_{\mathrm{strip}} = \min_{\phi, \alpha} \; \max_{\theta} \; \Pi_\sigma(S(\phi, \alpha), \sigma(\theta)); \tag{51} \]
the expressions for the quantities involved are described above. The angle θ is
the angle of deviation of the loading from the normal, and φ and α are structural
parameters.
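A brute-force way to explore the algebraic min–max problem (51) is a grid search over (φ, α) with an inner maximum over θ. The sketch below is not the authors' numerical method; it assumes NumPy, uses the reduced stress energy with σ2 eliminated, includes the factor E_A in N as in the definition preceding (49), and uses hypothetical function names and an arbitrary grid resolution.

```python
import numpy as np


def isotropic_compliance(E, nu):
    """Compliance matrix of an isotropic material in the basis (44)."""
    return (1.0 + nu) / E * np.array([
        [1.0 - nu, -nu, 0.0],
        [-nu, 1.0 - nu, 0.0],
        [0.0, 0.0, 1.0],
    ])


def structural_matrix(phi, alpha, E_A):
    """N = E_A [(1 - 2 alpha) P(0) + alpha P(phi) + alpha P(-phi)], cf. (50)."""
    c2, s2 = np.cos(phi) ** 2, np.sin(phi) ** 2
    return E_A * np.array([
        [1.0 - 2.0 * alpha + 2.0 * alpha * c2 ** 2, 2.0 * alpha * s2 * c2, 0.0],
        [2.0 * alpha * s2 * c2, 2.0 * alpha * s2 ** 2, 0.0],
        [0.0, 0.0, 4.0 * alpha * s2 * c2],
    ])


def effective_compliance(phi, alpha, S_A, S_B, m_A, E_A):
    """Effective compliance (48) of the laminate."""
    m_B = 1.0 - m_A
    inner = np.linalg.inv(S_B - S_A) + m_A * structural_matrix(phi, alpha, E_A)
    return S_A + m_B * np.linalg.inv(inner)


def reduced_energy(S, gamma, theta, f0=1.0):
    """Stress energy with sigma_2 eliminated; theta may be a scalar or an array."""
    s1 = f0 + gamma * np.cos(theta)   # sigma_yy from (43)
    s3 = gamma * np.sin(theta)        # sigma_xy from (43)
    return 0.5 * ((S[0, 0] - S[0, 1] ** 2 / S[1, 1]) * s1 ** 2 + 2.0 * S[2, 2] * s3 ** 2)


def min_max(gamma, S_A, S_B, m_A, E_A, n=41):
    """Grid search: minimize over (phi, alpha) the maximum over theta."""
    thetas = np.linspace(-np.pi, np.pi, 2 * n - 1)
    best = None
    for phi in np.linspace(0.0, np.pi / 2.0, n):
        for alpha in np.linspace(0.0, 0.5, n):
            S = effective_compliance(phi, alpha, S_A, S_B, m_A, E_A)
            worst = reduced_energy(S, gamma, thetas).max()
            if best is None or worst < best[0]:
                best = (worst, phi, alpha)
    return best


S_A = isotropic_compliance(E=1.0, nu=0.3)
S_B = isotropic_compliance(E=5.0, nu=0.3)
worst, phi_opt, alpha_opt = min_max(gamma=0.2, S_A=S_A, S_B=S_B, m_A=0.2, E_A=1.0)
print(f"worst-case energy {worst:.4f} at phi = {phi_opt:.3f}, alpha = {alpha_opt:.3f}")
```

A grid search only locates the optimum to within the grid spacing; a gradient-based or bisection refinement would be needed to resolve bifurcation points in γ accurately.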
5.2.4. Second Rank Structure is Optimal
Although in the general case of minimization of a sum of energies corresponding to multiple loadings the third-rank laminates are optimal, here the optimal structures are second-rank, not third-rank, laminates. To prove this statement we must find the derivative of the stress energy in the algebraic minimization problem (51) and demonstrate that it does not become zero; the optimal value of α would then lie on the boundary of the constraint. However, we skip this bulky calculation and give a physical argument supported by results of numerical optimization. Because of the absence of a displacement in the x-direction, there is no need to reinforce this direction. Moreover, the stress in the composite does not change if a layer with infinite stiffness oriented along the x-axis is added to the composition. If this infinitely stiff layer is counted, then the structure would be reinforced by three layers of stiff material. Since the stiffness of a structure with an infinitely stiff layer is not smaller than the stiffness of a structure without such a layer, the optimality of the second-rank laminates follows.
This conclusion is supported by results of numerical optimization, which gives
αopt = 1/2 for all settings. Physically, this means that the optimal structure is reinforced by either single laminates oriented across the strip (the case when φ = 0) or
by a second-rank laminate with two symmetric reinforcement directions φ and −φ,
see Figure 4. This degeneration of the third-rank laminates can be explained by the
special geometry of the strip and the loading, which do not allow for any strain ǫxx
along the strip, and the assumed independence of the design on the y-coordinate.
The formulas for the effective properties of a symmetric second-rank composite are simplified: they are still given by the expression (48), but the structural matrix N is
\[ N = \tfrac{1}{2}\left(P(\phi) + P(-\phi)\right) \]
instead of (50); in the basis (44) it has the form
\[ N = \begin{pmatrix} \cos^4\phi & \sin^2\phi\cos^2\phi & 0 \\ \sin^2\phi\cos^2\phi & \sin^4\phi & 0 \\ 0 & 0 & 2\sin^2\phi\cos^2\phi \end{pmatrix}. \]
We notice that the symmetry in this example efficiently reduces the dimension of
the computational problem, but the general method works with or without symmetry.
5.2.5. Numerical Example
For the first example, the following values of parameters were chosen:
\[ m_A = 1 - m_B = 0.2, \qquad \nu_A = \nu_B = 0.3, \qquad E_A = 1, \qquad E_B = 5, \qquad f_0 = 1. \]
Figure 6. Bifurcation diagram shows (1) the angle of deviation θ̂ (γ ) of the extreme, most
dangerous loading and (2) the angle φ̂(γ ) of optimal reinforcement of the second rank laminated composite. Notice that the bifurcation parameter γ has different critical values for the
deviation of the loading θ and for the angle of reinforcement φ.
The relative magnitude γ of the variable part of the loading is the parameter of
the problem; the angle θ of the optimal deviation of the extreme loading and the
structural parameters α and φ are determined from the solution of the min–max
optimization problem. We detect three regimes:
1. When γ < γ0 = 0.31, the extreme loading is vertical, θopt = 0, and the optimal
structure is a laminate with vertical layers directed across the strip, φopt = 0,
see Figure 6.
2. At the critical value γ0 of the parameter γ , the direction of the extreme deviation
undergoes a bifurcation, θopt = ±θ̂(γ ), shown by the curve 1 in Figure 6. But
for γ < γ1 = 0.46, the optimal structure remains the same: a laminate with
layers directed across the strip, φopt = 0 (curve 2 in Figure 6).
3. When the magnitude γ further increases, γ ≥ γ1, the optimal structure bifurcates as well; it becomes a second-rank matrix laminate with the angle φopt = ±φ̂(γ) (curve 2 in Figure 6).
Although the problem has two solutions for the extreme loading, the dependence of the compliance on the parameters φ and θ is a saddle-point surface as is
shown in Figure 7. Indeed, the problem is reformulated (relaxed) accounting for
non-uniqueness of the loading and for the symmetry in the design.
The following examples demonstrate the dependence of the optimal solution
on the ratio of Young’s moduli for the materials in the composite. Figure 8 shows
the bifurcation diagrams for different ratios of Young’s moduli. Qualitatively, the
picture remains the same, but the critical values of the bifurcation parameter γ are
different: The larger the ratio, the smaller the critical value of γ0 and γ1 at which
the bifurcation occurs. The interval (γ0 , γ1 ) decreases with an increase of the ratio
of Young’s moduli.
Figure 7. Energy stored in the composite is a saddle point function of the angle of deviation
of the loading θ and of the direction of reinforcement φ.
Figure 8. Bifurcation diagram for different ratios of Young’s moduli of the materials in the
composite ranging from 1 : 2 to 1 : 25. (a) Bifurcation of the angle θ̂(γ ) of deviation of the
extreme loading from the normal. (b) Bifurcation of the angle φ̂(γ ) of direction of the optimal
reinforcement for the second rank laminated composite.
5.3. DISCUSSION
The principal compliance is a basic characteristic of an elastic body which depends only on the shape of the domain and on the stiffness of the material. By
the proper normalization of using and C, this quantity is reduced to the
dimensionless parameter λ:
λ=
,
C
and can be treated as a basic integral characteristic of the filled domain along with
such properties as main eigenfrequency, the capacity, etc.
Optimal design aimed at decreasing the principal compliance is a minimax problem; typically, the problem does not have a saddle point, and the optimal design provides equal minimal compliance for several extreme loadings. Symmetries and relaxation bring the problem to a saddle-point type. Depending on the type of
constraints, the extreme loading can be a principal eigenfunction of an eigenvalue
problem, a concentrated loading, or a solution of a bifurcation problem.
Acknowledgement
The authors acknowledge the support from NSF and ARO.
References
1. G. Allaire, Shape Optimization by the Homogenization Method. Springer, Berlin (2002).
2. M. Avellaneda, Optimal bounds and microgeometries for elastic two-phase composites. SIAM J. Appl. Math. 47 (1987) 1216–1228.
3. M. Avellaneda and G.W. Milton, Bounds on the effective elastic tensor of composites based on two-point correlations. J. Appl. Mech. (1989) 89–93.
4. C. Bandle, Isoperimetric Inequalities and Applications. Pitman Publishing Program, London (1980).
5. M.P. Bendsoe, Optimization of Structural Topology, Shape, and Material. Springer, Berlin (1995).
6. M. Bendsoe, A. Diaz, R. Lipton and J. Taylor, Optimal design of material properties and material distribution for multiple loading conditions. Internat. J. Numer. Methods Engrg. 38(7) (1995) 1149–1170.
7. A. Cherkaev, Stability of optimal structures of elastic composites. In: M. Bendsoe and C.A. Mota Soares (eds), Topology Design of Structures. Kluwer, Dordrecht (1992) pp. 547–558.
8. A. Cherkaev, Variational Methods for Structural Optimization. Springer, New York (2000).
9. A. Cherkaev and E. Cherkaeva, Optimal design for uncertain loading conditions. In: V. Berdichevsky, V. Jikov and G. Papanicolaou (eds), Homogenization. World Scientific, Singapore (1999) pp. 193–213.
10. A. Cherkaev, L. Krog and I. Kucuk, Stable Optimal design of two-dimensional structures made from optimal composites. Control Cybernet. 27(2) (1998) 265–282.
11. E. Cherkaeva, Optimal source control and resolution in nondestructive testing. J. Structural Optim. 13(1) (1997) 12–16.
12. E. Cherkaeva and A. Cherkaev, Bounds for detectability of material damage by noisy electrical measurements. In: N. Olhoff and G.I.N. Rozvany (eds), Structural and Multidisciplinary Optimization. Pergamon, New York (1995) pp. 543–548.
13. E. Cherkaeva and A.C. Tripp, Inverse conductivity problem for inexact measurements. Inverse
Problems 12 (1996) 869–883.
14. S.J. Cox and M.L. Overton, On the optimal design of columns against buckling. SIAM J. Math.
Anal. 23(2) (1992) 287–325.
15. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, Berlin (1989).
16. V.F. Demyanov and V.N. Malozemov, Introduction to Minimax. Dover, New York (1990).
17. G.A. Francfort, F. Murat and L. Tartar, Fourth-order moments of nonnegative measures on S 2
and applications. Arch. Rational Mech. Anal. 131(4) (1995) 305–333.
18. M.B. Fuchs and E. Farhi, Shape of stiffest controlled structures under unknown loads. Comput.
Struct. 79(18) (2001) 1661–1670.
19. M.B. Fuchs and S. Hakim, Improved multivariate reanalysis of structures based on the
structural variation method. J. Mech. Struct. Mach. 24(1) (1996) 51–70.
20. L. Gibiansky and A. Cherkaev, Microstructures of composites of extremal rigidity and exact
bounds on the associated energy density. Ioffe Physico-Technical Institute, Academy of Sciences of USSR, Report N. 1115, Leningrad (1987). Translation in: A. Cherkaev and R. V. Kohn
(eds), Topics in the Mathematical Modelling of Composite Materials. Birkhäuser, Basel (1997)
pp. 273–317.
21. R.T. Haftka and Z. Gurdal, Elements of Structural Optimization. Kluwer, Dordrecht (1992).
22. L.A. Krog and N. Olhoff, Topology optimization of plate and shell structures with multiple
eigenfrequencies. In: N. Olhoff and G.I.N. Rozvany (eds), Structural and Multidisciplinary
Optimization. Pergamon, Oxford (1995) pp. 675–682.
23. J.R. Kuttler, Bounds for Stekloff eigenvalues. SIAM J. Numer. Anal. 19(1) (1982) 121–125.
24. O.A. Ladyzhenskaya and N.N. Uraltseva, Linear and Quasilinear Elliptic Equations. New
York/London (1968).
25. T. Lewinski and J.J. Telega, Plates, laminates and shells. Asymptotic Analysis and Homogenization. World Scientific, Singapore (2000).
26. R. Lipton, Optimal design and relaxation for reinforced plates subject to random transverse
loads. J. Probab. Engrg. Mech. 9 (1994) 167–177.
27. K.A. Lurie, Applied Optimal Control Theory of Distributed Systems. Plenum, New York (1993).
28. E.F. Masur, On structural design under multiple eigenvalue constraints. Internat. J. Solids
Struct. 20 (1984) 211–231.
29. G.W. Milton, Theory of Composites. Cambridge Univ. Press, Cambridge (2002).
30. N. Olhoff and S.H. Rasmussen, On bimodal optimum loads of clamped columns. Internat. J.
Solids Struct. 13 (1977) 605–614.
31. N. Olhoff and J.E. Taylor, On structural optimization. J. Appl. Mech. 50(4) (1983) 1139–1151.
32. G.I.N. Rozvany, Structural Design via Optimality Criteria. Kluwer Academic Publishers,
Dordrecht, The Netherlands (1989).
33. A.P. Seyranian, Multiple eigenvalues in optimization problems. Prikl. Mat. Mekh. 51 (1987)
272–275.
34. A.P. Seyranian, E. Lund and N. Olhoff, Multiple eigenvalues in structural optimization
problems. J. Struct. Optim. 8 (1994) 207–227.
35. S. Timoshenko, Theory of Elasticity, 3rd edn. McGraw-Hill, New York (1970).
36. R. Weinstock, Calculus of Variations with Applications to Physics and Engineering. Dover,
New York (1974).
37. J. Zowe, M. Kocvara and M.P. Bendsoe, Free material optimization via mathematical programming. Math. Programming 79(1–3) (1997) 445–466.
Rivlin’s Representation Formula is Ill-Conceived
for the Determination of Response Functions via
Biaxial Testing
JOHN C. CRISCIONE
Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843-3120,
U.S.A. E-mail: JCCriscione@tamu.edu
Received 6 August 2002; in revised form 14 March 2003
Abstract. The experimental determination of a strain energy function W for a rubber specimen
must address departures from an elastic ideal in a rational fashion. Herein, such a rational experimental method is developed for biaxial stretching experiments and applied to rubber data in the
literature. It is shown that Rivlin’s representation formula is experimentally ill-conceived because
experimental error is magnified to the extent that error obscures trends in the response function
plots. Upon developing direct tensor expressions for the response function calculations, we show
that Rivlin’s representation formula (or any such constitutive law that has high covariance amongst
the response terms) magnifies experimental error greatly. By “high covariance”, we mean the inner
product amongst the response terms in the constitutive law is nearly equal to the maximum possible
value – i.e., the product of their magnitudes. Moreover, we show that the second partials of W
with respect to I1 and I2 should approach infinity as the strain decreases. Using an alternate set
of invariants with minimal covariance (i.e., a null inner product amongst the response terms), a W
for rubber can be determined forthwith.
Mathematics Subject Classifications (2000): 74-05, 74B20.
Key words: finite elasticity, elastomer, biaxial testing.
This work is dedicated to the memory of Clifford A. Truesdell whom I have only
known through his writings. As evident by his publications and our own,
Professor Truesdell was a man of great reason and our rational investigations of
mechanics have benefited immensely from his devotion.
1. Introduction
By defining strain energy functions of the form W (I1 , I2 ) where I1 and I2 are
respectively the first and the second principal invariants of C, the right Cauchy–
Green deformation tensor, it is possible to find exact solutions to some nonlinear
boundary and initial value problems in mechanics of incompressible materials with
behavior that is isotropic and hyperelastic. This approach was pioneered by Rivlin (e.g., [1]). Moreover, many of these boundary value problems can be solved analytically to provide universal solutions – solutions that are valid regardless of the specific form of W.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 197–215.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Nevertheless, to use finite elasticity theory in practice or to verify results in
solid-state physical chemistry (e.g., statistical mechanics of polymer chains), it
is necessary to determine a W for a real material. Experimental finite elasticity,⋆
however, is in its infancy and is in need of much attention. In part, this is due to the
complexity of the task. Yet, even for the most basic case, the results of experiments
have been inconclusive and often contentious.
Specifically, let us consider the biaxial stretching of a rubber sheet – a statically
determinate test on an isotropic elastomer with minimal hysteresis and with nearly
incompressible behavior. Here, the response functions (i.e., ∂W/∂I1 and ∂W/∂I2 )
can be calculated directly from the biaxial stretch data. Because of a magnification
of experimental error in much of the deformation range of rubber, however, the
functional form of W (I1 , I2 ) remains elusive in the sense that experiments cannot
determine it in a definitive manner.
Ambiguity in determining W (I1 , I2 ) is, incidentally, used by Rivlin as a justification for not reporting a W in his seminal paper with Saunders [2]. Rivlin and
Sawyers [3] state: “. . . was not explicitly presented in the paper, since it was felt
that other expressions for W could fit the experimental results equally well and it
seemed invidious to select this one for special mention.” Some of the various forms
of W (I1 , I2 ) for elastomers in the literature [2, 4–9] are displayed in Table I.
Additionally, one cannot solve for the I1 and I2 response functions for uniaxial
tests (i.e., uniaxial stretch and equibiaxial stretch) unless one assumes a functional
form for W a priori. Whereby, calculations of ∂W/∂I1 and ∂W/∂I2 for uniaxial
tests are only as valid as that which is assumed for the functional form of W (I1 , I2 )
in the first place. As a notorious case in point (see the discussion in [10]), a Mooney
plot only yields C10 and C01 if indeed the test piece behaves like a Mooney material
(entry 1 in Table I).
Herein, we show that this indeterminacy for uniaxial tests and this magnification of experimental error in determining W (I1 , I2 ) is due to significant covariance amongst the response terms in the constitutive law for Cauchy stress or true
stress t. All constitutive theories with such covariance, moreover, will magnify the
experimental error that is inherent in tests on real materials.
To understand why, let us first define the covariance amongst tensors in an
explicit fashion. Toward this end, let the covariance ratio between second order
tensors A1 and A2 be defined as
\[ R_C(A_1, A_2) = \frac{\operatorname{abs}(A_1 : A_2)}{|A_1|\,|A_2|}, \tag{1.1} \]
⋆ The experimental determination of constitutive laws for elastic solids that undergo large
deformations.
Table I.
W = C10(I1 − 3) + C01(I2 − 3)    Mooney [4] (1940)
W = C10(I1 − 3) + C01(I2 − 3) + C02(I2 − 3)²    Rivlin et al. [2] (1951)
W = C10(I1 − 3) + C01 ln(I2/3)    Gent et al. [5] (1958)
W = C10 ∫ exp(c1(I1 − 3)²) dI1 + C01 ln(I2/3)    Hart-Smith [6] (1967)
W = C10(I1 − 3) + C10(I2 − 3) + C02 ln((I2 − 3 + c2)/c2)    Alexander [7] (1968)
W = C10 ∫ exp(c1(I1 − 3)²) dI1 + C01 ln(I2/3)    Alexander [7] (1968)
W = a0(I1 − 3) − a1(I1 − 3)^(−1) + (2/3) a2(I1 − 3)^(−3/2)(I2 − 3) + A0(I2 − 3) − A1 ln(I2 − 3)    Obata et al. [8] (1970)
W = C10(I1 − 3) + C20(I1 − 3)² + C30(I1 − 3)³ + C01(I2 − 3) + C11(I1 − 3)(I2 − 3)    James et al. [9] (1975)
The Cij , ci , Ai , and ai are all constants. Most of the constants have different symbols in the
corresponding reference. Obata et al. [8] notation is preserved because W was not given and it
was necessary to integrate their response functions [8, equation 15]. Rivlin and Saunders [2] did not
report the W corresponding to that above, yet they draw such a W response in their response function
plots.
where A1 : A2 = tr(A1ᵀA2) is the inner product of A1 and A2, and |A1|, for example, is the magnitude of A1, or √(A1 : A1). It follows (from the Cauchy–Schwarz inequality) that:
(1) RC (A1 , A2 ) ∈ [0, 1];
(2) RC (A1 , A2 ) = 1 iff A1 and A2 are colinear;⋆ and
(3) RC (A1 , A2 ) = 0 iff A1 and A2 are mutually orthogonal.⋆⋆
Hence, the covariance is high if RC (A1 , A2 ) is near 1 and low if near 0.
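The definition (1.1) and its two limiting cases can be illustrated with a short Python sketch. The function names and the example tensors are hypothetical and chosen only to show a colinear pair and a mutually orthogonal pair.

```python
import math


def inner(A1, A2):
    """Inner product A1 : A2 = tr(A1^T A2) = sum over i, j of A1_ij * A2_ij."""
    return sum(a * b for row1, row2 in zip(A1, A2) for a, b in zip(row1, row2))


def covariance_ratio(A1, A2):
    """Covariance ratio R_C of (1.1); undefined if either tensor is null."""
    return abs(inner(A1, A2)) / math.sqrt(inner(A1, A1) * inner(A2, A2))


# Illustrative symmetric, deviatoric tensors (assumed values).
A = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
A_scaled = [[2.0 * x for x in row] for row in A]                 # colinear with A
A_perp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -2.0]]    # A : A_perp = 0

rc_colinear = covariance_ratio(A, A_scaled)
rc_orthogonal = covariance_ratio(A, A_perp)
print(rc_colinear, rc_orthogonal)
```

By the Cauchy–Schwarz inequality the ratio cannot exceed 1, so values slightly above 1 can only arise from rounding.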
Let us consider a constitutive law of the form t = −qI+α1 A1 +α2 A2 wherein q
is an indeterminate pressure, α1 and α2 are scalar response functions and A1 and A2
are symmetric, deviatoric and kinematic tensors. Upon separately contracting A1
and A2 onto t, we obtain two equations that can be solved for response functions α1
and α2 in terms of stress and strain measurements. As shown in the Appendix, error
in stress measurements will propagate through response function calculations, and
the error will be magnified by the factor: (1 − RC (A1 , A2 )2 )−1/2 . Hence, if the covariance is high then error will be magnified greatly (with an infinite magnification
of error if RC (A1 , A2 ) = 1).
⋆ There exist two nonzero scalars a1 and a2 such that a1A1 + a2A2 is the null tensor.
⋆⋆ It is understood that A1 and A2 are mutually orthogonal when their inner product A1 : A2 vanishes. Terminology such as “A1 and A2 are orthogonal” has to be avoided because it is conventional to say “A1 and A2 are orthogonal tensors” when A1ᵀ = A1⁻¹ and A2ᵀ = A2⁻¹ rather than when A1 : A2 = 0.
Figure 1. Covariance ratio RC of the I1 and I2 response terms for the biaxial stretch of an incompressible sheet. λ1 and λ2 are the in-plane stretch ratios and the data points shown correspond to the biaxial stretch tests of Rivlin and Saunders [2]. The light blue and the black data are respectively the constant I2 and constant I1 testing protocols. RC is maximal and 1 iff the response terms are colinear. RC is zero iff the response terms are mutually orthogonal. For much of the stretch domain of rubber, covariance amongst response terms is significant in the sense that RC is close to unity.
For an incompressible material with W = W(I1, I2), the constitutive law for t is expressed by Rivlin’s representation formula, which is
\[ t = -pI + 2\frac{\partial W}{\partial I_1} B - 2\frac{\partial W}{\partial I_2} B^{-1}, \tag{1.2} \]
where B is the left Cauchy–Green deformation tensor. It follows from the results
in the Appendix that error in the response function calculations will be magnified
by the factor: (1 − RC (dev(B), dev(B−1 ))2 )−1/2 wherein dev(B), for example, denotes the deviatoric part of B. Figure 1 displays RC (dev(B), dev(B−1 )) for biaxial
stretching tests. Note that it is nearly 1 for moderate strain and it is equal to 1 for
uniaxial tests. The magnification of error is large and even infinite in much of the
strain domain of rubber.
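For a biaxial stretch of an incompressible sheet, B is diagonal in the principal frame, so the covariance ratio of dev(B) and dev(B⁻¹) and the associated magnification factor are easy to evaluate. The sketch below assumes λ3 = 1/(λ1λ2) from incompressibility; the function names and the sample stretches are illustrative assumptions.

```python
import math


def dev_diag(d):
    """Deviatoric part of a diagonal tensor given by its diagonal entries."""
    mean = sum(d) / 3.0
    return [x - mean for x in d]


def covariance_ratio_biaxial(lam1, lam2):
    """R_C(dev(B), dev(B^-1)) for biaxial stretch of an incompressible sheet.

    Undefined in the reference configuration (lam1 = lam2 = 1), where both
    deviators vanish.
    """
    lam3 = 1.0 / (lam1 * lam2)                 # incompressibility
    B = [lam1 ** 2, lam2 ** 2, lam3 ** 2]      # principal values of B
    B_inv = [1.0 / b for b in B]
    a, b = dev_diag(B), dev_diag(B_inv)
    dot = sum(x * y for x, y in zip(a, b))
    return abs(dot) / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


def magnification(rc):
    """Error magnification factor (1 - R_C^2)^(-1/2); infinite when R_C reaches 1."""
    gap = 1.0 - rc ** 2
    return math.inf if gap <= 0.0 else gap ** -0.5


rc_general = covariance_ratio_biaxial(2.3, 1.91)           # a generic biaxial state
rc_uniaxial = covariance_ratio_biaxial(2.0, 2.0 ** -0.5)   # uniaxial stretch
rc_equibiaxial = covariance_ratio_biaxial(1.5, 1.5)        # equibiaxial stretch
print(rc_general, magnification(rc_general), rc_uniaxial, rc_equibiaxial)
```

In the uniaxial and equibiaxial cases two principal stretches coincide, so both deviators are proportional to the same diagonal tensor and the ratio is 1 up to rounding.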
In Section 2, we provide a detailed analysis of the experimental error in assuming that rubber is elastic, and in Section 3, we develop a method of analysis that rationally adjusts for this error. In this rational experimental method, we fit the data with a stress tW(λ1, λ2) that is continuous and hyperelastic-like (i.e., tW satisfies the necessary conditions for an isotropic hyperelastic material being stretched biaxially). To facilitate direct comparison, we use the same tW to generate response function plots for Rivlin’s representation formula and for a novel representation formula [11]. In so doing, we show that the experimental determination of W(I1, I2) from biaxial stretch tests is ill-conceived. The representation formula of Criscione et al. [11], on the other hand, is well-posed. This latter formula has minimal covariance (i.e., null inner products amongst the response terms), and the form of W for rubber can be determined forthwith from biaxial stretch tests.
Furthermore, we show that the second partials of W with respect to I1 and/or I2 should approach infinity as the strain vanishes. Following Mooney [4], most elasticians assume the complete opposite – i.e., they consider rubber to be such that ∂²W/∂I1², ∂²W/∂I2², and ∂²W/∂I1∂I2 vanish. As a simple example of a W with singular second partials, consider W = µ|E|² + γ|E|³ where |E| is the magnitude of the Green strain tensor and µ and γ are positive constants. Such a material law has smooth, monotonic behavior that recapitulates linear elasticity when |E| is small. Nevertheless, it is easy to show that ∂²W/∂I1², ∂²W/∂I2², and ∂²W/∂I1∂I2 go to infinity as |E| vanishes. One cannot rule out the existence of cubic dependence on strain magnitude in W. Based on the analysis herein (Section 4) and the experiments⋆ of Obata et al. [8], one should, in fact, expect singular second partials for W(I1, I2) when I1 and I2 approach 3.
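The singular behavior of this example can be made explicit with a short calculation. It is a sketch that assumes the usual definition E = (C − I)/2 of the Green strain and treats I1 and I2 formally as independent variables; the auxiliary symbol s is introduced here only for convenience.

```latex
% Strain magnitude in terms of the invariants, using tr C^2 = I_1^2 - 2 I_2:
\[
  s := |E|^2 = \tfrac14\,\mathrm{tr}\big[(C - I)^2\big]
     = \tfrac14\left(I_1^2 - 2I_1 - 2I_2 + 3\right),
  \qquad
  \frac{\partial s}{\partial I_1} = \frac{I_1 - 1}{2},
  \quad
  \frac{\partial s}{\partial I_2} = -\frac12 .
\]
% With W = \mu s + \gamma s^{3/2}, the second partials are
\[
  \frac{\partial^2 W}{\partial I_2^2} = \frac{3\gamma}{16\,|E|},
  \qquad
  \frac{\partial^2 W}{\partial I_1\,\partial I_2} = -\frac{3\gamma\,(I_1 - 1)}{16\,|E|},
  \qquad
  \frac{\partial^2 W}{\partial I_1^2}
    = \frac{\mu}{2} + \frac{3\gamma\,(I_1 - 1)^2}{16\,|E|} + \frac{3\gamma}{4}\,|E| .
\]
```

Each expression contains a term proportional to 1/|E|, so all three second partials grow without bound in magnitude as |E| vanishes (I1 and I2 approaching 3), whereas the quadratic term µ|E|² alone contributes only the constant µ/2.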
2. Experimental Error in Biaxial Tests on Rubber
Ideally, a test specimen that is composed of an elastic solid should behave such that
there is one stress response for each state of deformation. In practice, however, and
even apart from measurement error, stress measurements display⋆⋆ variation when one particular configuration is retested multiple times. If one is investigating the inelastic behavior of metals, polymers, etc., then this departure from
elasticity is of primary interest. Nevertheless, many solids have a range of deformation wherein their behavior is predominately hyperelastic in the sense that the work
done on the specimen is mostly recoverable. For such materials, a hyperelasticity
framework may be useful for predicting material behavior and for investigating the
physical origin of the mechanical behavior.
Yet, if an elasticity law is to be determined from tests on rubber, then it is imperative that departures from an elastic ideal be considered as experimental error.
A direct relationship between irreproducibility of stress data and experimental error
is undeniable when one assumes there to be only one stress state for every strain
state. This fact is neglected, however, in almost all experimental reports on rubber.
To the knowledge of this author, only Treloar [10] and Jones and Treloar [12]
address this type of experimental error.
This type of error is an “error of definition” (e.g., see Beers [13]) as opposed to an “error of measurement”. It is error nevertheless, and like measurement error, it represents an amount by which we are uncertain of the elastic stresses in rubber. Rather than simply report the error as the resolution of the transducer, an experimentalist should retest the same values of λ1 and λ2 in a multitude of ways (e.g., loading and unloading). The variance amongst the stress measurements is the square of the error in assuming that the true stress is the average of these stress measurements at this particular strain.
⋆ In particular, note that the slope of ∂W/∂I1 vs. I2 becomes steeper as I1 and I2 approach 3.
⋆⋆ Provided that the force transducers have a good enough resolution.
Retests of data points are rarely (if ever) reported in the rubber literature for
error analysis purposes; however, such retests often arise unintentionally. For example, consider the biaxial stretching data of Rivlin and Saunders [2] wherein⋆
there are the data λ1 = 2.3, λ2 = 1.91, t1 = 21.5 kg/cm², and t2 = 16.5 kg/cm²
in Table I (constant I1 protocols), but in Table II (constant I2 protocols) they report
λ1 = 2.3, λ2 = 1.92, t1 = 21.6 kg/cm², and t2 = 16.0 kg/cm². These two data
points have the same λ1, but λ2 of the latter is greater. One would expect t2 of the
latter to be greater as well, yet in fact, it is lesser. This difference (0.5 kg/cm²)
cannot be attributed to measurement error (less than 0.1 kg/cm²). Hence, one must
suspect that the experimental error associated with Rivlin and Saunders’ data is
predominately that which is due to the inelasticity of their specimen rather than
that which is due to the resolution of their transducers (calibrated helical springs).
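As a rough illustration of the variance-based error estimate, the sketch below treats the two overlapping t2 values quoted above as repeated measurements of nearly the same configuration. Two points are far too few for a reliable estimate, and the use of the sample (n − 1) standard deviation is an assumption made here, not a choice stated in the text.

```python
import statistics

# The two overlapping t2 measurements quoted above (kg/cm^2).
t2_values = [16.5, 16.0]

t2_mean = statistics.mean(t2_values)
# Sample standard deviation (n - 1 in the denominator); an assumed choice.
t2_std = statistics.stdev(t2_values)
relative_error = t2_std / t2_mean

print(f"mean = {t2_mean:.2f} kg/cm^2, std = {t2_std:.3f} kg/cm^2, "
      f"relative error = {100.0 * relative_error:.2f}%")
```

The 0.5 kg/cm² spread between the two values is several times the stated measurement resolution of 0.1 kg/cm².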
From the hysteresis loop reported by Rivlin and Saunders it should be evident
that their rubber specimen has a small amount of inelasticity. As a fair quantification of their experimental error, let us assume that the loading curve and unloading
curve are bounds on the stress variance in such a manner that these curves are
one standard deviation on either side of an expected mean curve. Half
of the difference between the loading and unloading stresses at a given stretch would then be the
experimental error in uniaxial stress vs. stretch. Upon noticing that the hysteresis
loop is wider at larger strains, a first approximation may be that the experimental
error is proportional to the stress measurement. In particular, we estimate that the
error in assuming elasticity is 2% of the stress measurement.
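The half-difference rule described above can be sketched as follows. The function name `elastic_error` and the loading and unloading values are hypothetical, chosen only so that the relative error comes out at 2%; they are not data from Rivlin and Saunders.

```python
def elastic_error(loading, unloading):
    """Mean curve, half-difference error, and relative error at each stretch.

    Assumes, as in the text, that the loading and unloading curves lie one
    standard deviation on either side of the expected mean curve.
    """
    mean = [0.5 * (a + b) for a, b in zip(loading, unloading)]
    err = [0.5 * abs(a - b) for a, b in zip(loading, unloading)]
    rel = [e / m for e, m in zip(err, mean)]
    return mean, err, rel


# Hypothetical stresses at three stretches (arbitrary units).
loading = [10.2, 20.4, 30.6]
unloading = [9.8, 19.6, 29.4]
mean, err, rel = elastic_error(loading, unloading)
print(mean, err, rel)
```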
A detailed error analysis for tests on rubber remains wanting. However, in lieu
of such, all error bars in the figures herein are calculated assuming that the error in
knowing an elastic stress value (at a particular configuration) is 2% of the measured
stress value. Rivlin and Saunders tried to minimize the effects of hysteresis; yet for
the two overlapping data points mentioned above, the t2 data differ by more than 2%
of their mean (0.5 kg/cm² against a mean of 16.25 kg/cm²). Also recall that the difference is in the wrong direction.
In order to estimate the propagation of error in the response function calculations for each data point in the biaxial stretch tests of Rivlin and Saunders [2],
use their equations for calculating the response functions with rearrangement as
follows:
$$\frac{\partial W}{\partial I_1} = \frac{\lambda_1^2}{2(\lambda_1^2-\lambda_2^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_1 - \frac{\lambda_2^2}{2(\lambda_1^2-\lambda_2^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_2, \tag{2.1a}$$
$$\frac{\partial W}{\partial I_2} = \frac{1}{2(\lambda_2^2-\lambda_1^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_1 - \frac{1}{2(\lambda_2^2-\lambda_1^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_2, \tag{2.1b}$$
where t1 and t2 are the in-plane principal stresses that correspond respectively
with the in-plane principal stretches λ1 and λ2.
⋆ To be fair, there are two other overlapping data sets in these data tables, yet the set with the
greatest variance is given here. To justify so doing, note that the variance displayed by two data points
will be typically less than the full variance displayed by multiple data since it is relatively rare to have
both points significantly deviate to opposite sides of a true mean – i.e., that obtained from many data.
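Assuming the error in t (at particular values of λ1 and λ2) is 2% of the measured
stress, the error in t1, for example, is Δt1 = ±2t1/100. The errors Δt1 and Δt2 are
unrelated, and as such, the error propagates as the square-root of the sum of each
term squared,
$$\Delta\frac{\partial W}{\partial I_1} = \pm\left\{\left[\frac{\lambda_1^2}{(\lambda_1^2-\lambda_2^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,\frac{t_1}{100}\right]^2 + \left[\frac{\lambda_2^2}{(\lambda_1^2-\lambda_2^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,\frac{t_2}{100}\right]^2\right\}^{1/2}. \tag{2.2}$$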
Error propagation in the I2 response function calculation is obtained in a similar
manner.
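Equations (2.1a), (2.1b) and the error bound (2.2) can be evaluated for a single data point as sketched below. The function name `response_functions` and the `pct` parameter are assumptions introduced for illustration; the I2 error bound is written by analogy with (2.2), as the text indicates.

```python
import math


def response_functions(lam1, lam2, t1, t2, pct=2.0):
    """Return (dW/dI1, dW/dI2, error in dW/dI1, error in dW/dI2).

    pct is the assumed stress error as a percentage of the measured stress.
    The formulas are singular at equibiaxial stretch (lam1 == lam2).
    """
    l1, l2 = lam1 ** 2, lam2 ** 2
    l3 = 1.0 / (l1 * l2)                      # lambda_3^2 from incompressibility
    c1 = 1.0 / (2.0 * (l1 - l2) * (l1 - l3))  # coefficient multiplying t1
    c2 = 1.0 / (2.0 * (l1 - l2) * (l2 - l3))  # coefficient multiplying t2
    dW_dI1 = l1 * c1 * t1 - l2 * c2 * t2      # (2.1a)
    dW_dI2 = -c1 * t1 + c2 * t2               # (2.1b)
    dt1 = pct / 100.0 * t1                    # uncorrelated stress errors
    dt2 = pct / 100.0 * t2
    err1 = math.hypot(l1 * c1 * dt1, l2 * c2 * dt2)  # (2.2)
    err2 = math.hypot(c1 * dt1, c2 * dt2)            # analogous bound for I2
    return dW_dI1, dW_dI2, err1, err2


# Table I data point quoted in the text (stresses in kg/cm^2).
print(response_functions(2.3, 1.91, 21.5, 16.5))
```

Because the common factor (λ1² − λ2²) appears in every denominator, the error bounds grow without limit as the stretch state approaches equibiaxial.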
Figure 2. I1 and I2 response function plots of the biaxial stretch data of Rivlin and Saunders [2] with error bars obtained by assuming that the experimental error in the stress data is
2% of the stress measurement. There are more points in this plot than in Figure 6 of Rivlin
and Saunders because all of the data in their Tables I and II are plotted here. The symbols
correspond to those used by Rivlin and Saunders and they also correspond to those in Figure 1
herein. The magnification of error is greatest near equibiaxial stretch where, as evident in
Figure 1, the covariance is greatest.
J.C. CRISCIONE
Figure 2 displays the response function plots of the biaxial stretch data in [2]
with error bars included. The error bars are, in general, small for the data in the
vicinity of pure shear. Not surprisingly, the data near pure shear are in regions with
the least amount of covariance (see Figure 1). Data in the red zone of Figure 1
have substantial error bars – even at high values of I1 and I2. Moreover, Rivlin and
Saunders intentionally did not test regions with a higher magnification of experimental error. In the moderate strain range (i.e., extensions below 100%), error bars
would be much larger, approaching infinity as the strain decreases.
3. Rational Experimental Method
As shown in the prior section, biaxial stretch data on rubber specimens will display
departures from the elastic ideal. So too will the data depart from isotropic and
hyperelastic ideals. To address this error in a rational fashion, we assume that a
stress measurement tM at the strain state given by λ1 and λ2 is
$$\mathbf{t}_M = \mathbf{t}_W + \Delta\mathbf{t}, \tag{3.1}$$
where tW is a continuous function of the strain that satisfies the assumptions of
isotropy and hyperelasticity. The error Δt represents the amount by which measurements depart from the isotropic, hyperelastic ideal.
For a biaxial test with a specimen being stretched λ1 and λ2 in the associated
directions e1 and e2, let tW be written as
$$\mathbf{t}_W = t_{W1}(\lambda_1,\lambda_2)\,\mathbf{e}_1\otimes\mathbf{e}_1 + t_{W2}(\lambda_1,\lambda_2)\,\mathbf{e}_2\otimes\mathbf{e}_2, \tag{3.2}$$
where tW1 and tW2 are scalar functions of λ1 and λ2 and the sheet surface (with
normal e3) is traction free. In order for the stress response to be hyperelastic and
isotropic with respect to the reference configuration, the following constraints
should be evident
$$t_{W1}(\lambda_1,\lambda_2) = t_{W2}(\lambda_2,\lambda_1), \tag{3.3a}$$
$$t_{W2}(\lambda_1,\lambda_1^{-1/2}) = 0 \quad \forall\,\lambda_1 \ge 1, \tag{3.3b}$$
$$t_{W1}(\lambda_2^{-1/2},\lambda_2) = 0 \quad \forall\,\lambda_2 \ge 1, \tag{3.3c}$$
$$\frac{\partial(\lambda_2 t_{W2})}{\partial\lambda_1} = \frac{\partial(\lambda_1 t_{W1})}{\partial\lambda_2}. \tag{3.3d}$$
We sought a continuous, best fit for tW by defining 9 nodes and 4 elements
in the (λ1, λ2) plane. The nodes have (λ1, λ2) values as follows: (1, 1); (1, 1.75);
(1, 2.5); (1.75, 1); (1.75, 1.75); (1.75, 2.5); (2.5, 1); (2.5, 1.75); and (2.5, 2.5).
The domains of the 4 elements are: (1) λ1 < 1.75 and λ2 < 1.75; (2) λ1 ≥ 1.75
and λ2 < 1.75; (3) λ1 < 1.75 and λ2 ≥ 1.75; and (4) λ1 ≥ 1.75 and λ2 ≥ 1.75.
Constraint (3.3d) allows the definition of a potential ω(λ1, λ2) with λ1 tW1 =
∂ω/∂λ1 and λ2 tW2 = ∂ω/∂λ2. Hence, to find a best-fit for tW we sought a ω with
bicubic Hermite interpolation within each element (C¹ smoothness across element
boundaries).
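The node layout and the element domains listed above translate directly into a small lookup. This is a minimal sketch of the bookkeeping only, not of the bicubic Hermite fit itself; the names `NODES` and `element_of` are assumptions made here.

```python
NODE_COORDS = (1.0, 1.75, 2.5)

# Nodes in the order given in the text: (1, 1); (1, 1.75); (1, 2.5); (1.75, 1); ...
NODES = [(a, b) for a in NODE_COORDS for b in NODE_COORDS]


def element_of(lam1, lam2):
    """Return the element number (1 to 4) whose domain contains (lam1, lam2)."""
    if lam2 < 1.75:
        return 1 if lam1 < 1.75 else 2
    return 3 if lam1 < 1.75 else 4


print(len(NODES), element_of(1.2, 2.0))
```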
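Let there be n measurements of t1, t2, λ1, and λ2. With i = 1, 2, ..., n; let
tM1^[i] and tM2^[i] be the stress measurements of t1 and t2 at stretches λ1^[i] and λ2^[i].
To determine the nodal degrees of freedom, we minimized the following error
function:
$$\sum_{i=1}^{n}\left\{\left[\lambda_1^{[i]} t_{M1}^{[i]} - \left.\frac{\partial\omega}{\partial\lambda_1}\right|_{\lambda_1=\lambda_1^{[i]},\,\lambda_2=\lambda_2^{[i]}}\right]^2 + \left[\lambda_2^{[i]} t_{M2}^{[i]} - \left.\frac{\partial\omega}{\partial\lambda_2}\right|_{\lambda_1=\lambda_1^{[i]},\,\lambda_2=\lambda_2^{[i]}}\right]^2\right\}, \tag{3.4}$$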
subject to nodal constraints required by (3.3a). Since (3.3a) and (3.4) only constrain
derivatives of ω we enforced ω(1, 1) = 0 in order to obtain a solution. In total,
there are 19 independent degrees of freedom to be determined. Conditions (3.3b),
(3.3c) were not enforced during the fit (except at λ1 = λ2 = 1), yet the degrees of
freedom determined by minimizing (3.4) were such that (3.3b), (3.3c) were nearly
but not exactly satisfied. Let t*W1 be our initial data fit that does not satisfy (3.3c).
To satisfy (3.3c) exactly, we added the following function to t*W1(λ1, λ2):
$$\zeta_1(\lambda_1,\lambda_2) = \begin{cases} 0, & \lambda_1 \ge 1,\\[4pt] -\dfrac{(\lambda_1-1)^2}{(\lambda_2^{-1/2}-1)^2}\; t^{*}_{W1}(\lambda_2^{-1/2},\lambda_2), & \lambda_1 < 1. \end{cases} \tag{3.5}$$
In this work, we do not consider deformations with in-plane compression (i.e.,
λ1 < λ2^(−1/2) when λ2 ≥ 1), and hence the denominator of the fraction in (3.5) is
greater than or equal to the numerator. Since ζ1 = −t*W1(λ2^(−1/2), λ2) when λ2 > 0
and λ1 = λ2^(−1/2), the augmented data adjustment, tW1 (= t*W1 + ζ1) will vanish and
thus satisfy (3.3c). A similar method (yet with λ1 and λ2 interchanged) was used
to enforce (3.3b).
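The correction (3.5) can be sketched as a short function. Everything named here is an assumption for illustration: `zeta1`, the callable argument `t_star_w1`, and in particular the stand-in initial fit `t_star`, which is a made-up expression that does not vanish on the uniaxial curve and is not the fit obtained in the paper.

```python
def zeta1(lam1, lam2, t_star_w1):
    """Correction (3.5) for an initial fit t_star_w1(lam1, lam2)."""
    if lam1 >= 1.0:
        return 0.0
    if lam2 <= 1.0:
        # lam1 < 1 with lam2 <= 1 is in-plane compression, which is excluded.
        raise ValueError("in-plane compression is outside the domain of (3.5)")
    lam_u = lam2 ** -0.5  # lateral stretch for uniaxial extension along e2
    ratio = (lam1 - 1.0) ** 2 / (lam_u - 1.0) ** 2
    return -ratio * t_star_w1(lam_u, lam2)


def t_star(lam1, lam2):
    """Hypothetical initial fit with a small spurious uniaxial residual."""
    return lam1 ** 2 - 1.0 / (lam1 * lam2) ** 2 + 0.05 * (lam2 - 1.0)


lam2 = 2.0
lam1 = lam2 ** -0.5  # uniaxial extension in the e2 direction
residual_before = t_star(lam1, lam2)
residual_after = t_star(lam1, lam2) + zeta1(lam1, lam2, t_star)
print(residual_before, residual_after)
```

On the uniaxial curve the ratio in (3.5) equals one, so the corrected stress vanishes there, while the correction is switched off entirely for λ1 ≥ 1.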
Although not continuous at (1, 1) in general, ζ1 is continuous in this case. To
understand why, consider a deformation with λ1 = 1 − αε and λ2 = 1 + ε in the
vicinity of the reference configuration (i.e. ε ≪ 1) with ε positive.⋆ Since we do
not consider deformation with in-plane compression, α has its maximal value of
1/2 when the deformation is uniaxial extension in the e2 direction. When α ≤ 0
then λ1 ≥ 1 and ζ1 = 0. When α ∈ (0, 1/2], we have
$$|\zeta_1| \le \left|-\alpha^2 \lambda_2 \left(\sqrt{\lambda_2}+1\right)^2 t^{*}_{W1}\right| \le 2\,|t^{*}_{W1}| \tag{3.6}$$
for sufficiently small ε > 0. The continuity of ζ1 at (1, 1) follows from the smoothness
of t*W1 and the fact that t*W1 vanishes at λ1 = λ2 = 1.
⋆ Since we do not consider deformations with in-plane compression, negative ε would necessitate
λ1 > 1 and ζ1 = 0.
Figure 3. Fit of tW to biaxial stretch data in [14]. The data for both t1 and t2 are on this
plot with the λ1 t1 values shown directly and with the λ2 t2 values shown at the point with λ1
and λ2 transposed (see text). The fit is smooth, monotonic, and it satisfies the isotropy and
hyperelasticity assumptions. A segment at each data point bridges the data and fit values. The
two circles highlight regions (with transposed λ1 and λ2) where the data violate hyperelasticity
and thus depart from the fit (see text).
Rivlin and Saunders’ biaxial data are too sparse for our fitting method because
data are lacking for small and moderate strains. Jones and Treloar’s data, as tabulated
in [14], are better with n = 99 and uniform coverage of the (λ1, λ2) plane. Figure 3
plots the data and our fit. Only the λ1 tW1 surface is shown because (3.3a) requires a
mirror symmetry such that λ2 tW2 is obtained when λ1 and λ2 are interchanged (i.e.,
reflection about a plane given by the condition λ1 = λ2). The λ1^[i] tM1^[i] data points are
shown directly. The λ2^[i] tM2^[i] data points are displayed by interchanging λ1 and λ2.
Note that our fit is smooth, monotonic and representative of the test data. There
are systematic departures – two of which are circled. Such systematic departures
also occur when the number of degrees of freedom of the fit is increased (34 independent degrees of freedom obtained from 16 nodes and 9 elements) because these
departures violate the hyperelasticity constraint. To see this, combine constraints
(3.3a) and (3.3d) to obtain
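$$\left.\frac{\partial}{\partial\lambda_2}\Big[\lambda_1 t_{W1}(\lambda_1,\lambda_2)\Big]\right|_{\lambda_1=\lambda_1^{*},\,\lambda_2=\lambda_2^{*}} = \left.\frac{\partial}{\partial\lambda_2}\Big[\lambda_1 t_{W1}(\lambda_1,\lambda_2)\Big]\right|_{\lambda_1=\lambda_2^{*},\,\lambda_2=\lambda_1^{*}}, \tag{3.7}$$
where λ1* and λ2* are arbitrary constants. As highlighted on the left side of Figure 3,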
the surface is flat such that the left side of (3.7) is near zero. Yet in the region
highlighted on the right (i.e., the region with λ1 and λ2 transposed), the data have
a nonzero slope. Our fit must depart from the data because it models the stress in a
hyperelastic material – one that satisfies (3.7).
DETERMINATION OF RESPONSE FUNCTIONS VIA BIAXIAL TESTING
Figure 4. Biaxial stretching trajectories with I1 or I2 held constant at 3.5, 4, 5, 6, and 7.
Also shown are the equibiax line and the uniaxial stretch curves. The trajectories with I1
constant have a radius of curvature (on the equibiax line) with a center toward the origin. The
trajectories with I2 constant, on the other hand, have a radius of curvature (on the equibiax
line) with a center away from the origin.
4. W (I1 , I2 ) Has Singular Second Partials
The prior section described how to obtain a stress field tW for rubber that is a
continuous function of (λ1 , λ2 ) and satisfies the assumptions of isotropy and hyperelasticity. With this data adjustment, we can now attempt to find W (I1 , I2 ). Toward
this end, consider the ideal testing trajectories in Figure 4 which separately hold I1
or I2 at the values: 3.5, 4, 5, 6, and 7. Note that all deformations with I1 ≤ 7 have λ1
and λ2 within the domain of the fit displayed in Figure 3. Hence, subsequent plots
with the I1 axis truncated at 7 are within the stretch range used to determine tW –
i.e., we are interpolating, not extrapolating the Jones and Treloar data in [14].
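Trajectories such as those in Figure 4 can be generated by solving the incompressible invariant definitions for λ2 at a given λ1. The sketch below assumes the standard definitions I1 = λ1² + λ2² + λ1⁻²λ2⁻² and I2 = λ1⁻² + λ2⁻² + λ1²λ2²; the function names are hypothetical. The larger root is taken because the two roots of each quadratic are λ2² and λ3², and the traction-free thickness direction carries the smaller stretch.

```python
import math


def invariants(lam1, lam2):
    """I1 and I2 for an incompressible biaxial stretch."""
    a, b = lam1 ** 2, lam2 ** 2
    c = 1.0 / (a * b)  # lambda_3^2
    return a + b + c, 1.0 / a + 1.0 / b + 1.0 / c


def lam2_on_I1(I1, lam1):
    """lam2 on the constant-I1 trajectory, or None if no real solution."""
    a = lam1 ** 2
    c = I1 - a
    disc = c * c - 4.0 / a
    if c <= 0.0 or disc < 0.0:
        return None
    return math.sqrt(0.5 * (c + math.sqrt(disc)))


def lam2_on_I2(I2, lam1):
    """lam2 on the constant-I2 trajectory, or None if no real solution."""
    a = lam1 ** 2
    c = I2 - 1.0 / a
    disc = c * c - 4.0 * a
    if c <= 0.0 or disc < 0.0:
        return None
    return math.sqrt((c + math.sqrt(disc)) / (2.0 * a))


for level in (3.5, 4.0, 5.0, 6.0, 7.0):
    print(level, lam2_on_I1(level, 1.2), lam2_on_I2(level, 1.2))
```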
The response function plots in Figure 5 are calculated from λ1, λ2, and tW(λ1, λ2)
for the aforementioned I1 and I2 trajectories. Note that highly nonlinear behavior is
evident for small and moderate strain despite the fact that tW is smooth, monotonic,
and thus linearly elastic in the small strain limit. Singular second partials of W are
evident as I1 and I2 approach 3. Since the error bounds are large and even infinite
in the red zone of Figure 1, W(I1, I2) cannot be found from response function plots
– i.e., error obscures any trends in the plot.
5. Determining W (K2 , K3 ) Is Well-Posed
Magnification of error is not problematic for all phenomenological theories of rubber elasticity. In fact, Criscione et al. [11] developed an approach using natural
strain ln V which minimizes the covariance amongst response terms. For rubberlike materials, [11] reports⋆ W = W (K2 , K3 ) where K2 = |dev(ln V)| and K3 =
piud(ln V). Whereby, the constitutive law becomes
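$$\mathbf{t} = -q\mathbf{I} + \frac{\partial W}{\partial K_2}\operatorname{udev}(\ln\mathbf{V}) - \frac{\partial W}{\partial K_3}\,\frac{3\sqrt{1-K_3^2}}{K_2}\,\operatorname{cudev}(\ln\mathbf{V}). \tag{5.1}$$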
Although the operators udev(·), piud(·), and cudev(·) are not used in [11], they are
defined in the Appendix. With simple substitution, this formulation is consistent.
Note that K2 is the magnitude of the distortion strain⋆⋆ dev(ln V), and note that
the K2 response term is colinear with the distortion strain. As for K3 ∈ [−1, 1], it
is the mode-of-distortion (1 for uniaxial extension, 0 for pure shear, −1 for uniaxial
contraction), and its response term is orthogonal to the distortion strain. Although
K2 appears in the denominator, an important result from [11] is that ∂W/∂K3 must
vanish as order K2³ when K2 goes to zero.
To solve for the K2 and K3 response functions, respectively contract udev(ln V)
and cudev(ln V) onto (5.1) to obtain
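$$\frac{\partial W}{\partial K_2} = \operatorname{udev}(\ln\mathbf{V}) : \mathbf{t}, \tag{5.2a}$$
$$\frac{\partial W}{\partial K_3} = -\frac{K_2\operatorname{cudev}(\ln\mathbf{V})}{3\sqrt{1-K_3^2}} : \mathbf{t}. \tag{5.2b}$$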
The orthogonal nature of the response terms (i.e., udev(ln V) : I = 0, cudev(ln V) :
I = 0, udev(ln V) : cudev(ln V) = 0) makes isolation of the response functions
easy. Moreover, |udev(ln V)| = 1 and |cudev(ln V)| = 1.
With an approach similar to that in the Appendix, it follows from (5.2a) that
error in calculating ∂W/∂K2 is on the same order as the root-mean-squared error
of the principal stresses. Throughout the entire deformation range of rubber, thus,
∂W/∂K2 can be evaluated without magnification of experimental error.
⋆ We are liberal with our use of W to generally represent strain energy functions, and W(K2, K3)
is meant to imply that W can be expressed in terms of K2 and K3. It does not indicate that W depends
on K2 and K3 in exactly the same fashion that W(I1, I2) depends on I1 and I2.
⋆⋆ dev(ln V) is referred to as distortion strain because the spherical part of ln V solely depends on
dilatation whereas its deviatoric part does not depend on dilatation whatsoever.
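For a biaxial test all tensors in (5.1) are coaxial, so the contractions in (5.2a), (5.2b) can be sketched with principal values alone. The helper names below are assumptions, the cudev expression follows (A.11) in the Appendix, and the sign in the K3 line is the one obtained by contracting cudev(ln V) onto (5.1). The round trip uses made-up response values and an arbitrary q.

```python
import math

SQRT6 = math.sqrt(6.0)


def dot(p, q):
    return sum(x * y for x, y in zip(p, q))


def kinematics(lam1, lam2):
    """Return K2, K3 and the principal values of udev(ln V) and cudev(ln V)."""
    lnv = (math.log(lam1), math.log(lam2), -math.log(lam1 * lam2))
    mean = sum(lnv) / 3.0  # zero for an isochoric stretch
    d = tuple(x - mean for x in lnv)
    K2 = math.sqrt(dot(d, d))
    u = tuple(x / K2 for x in d)
    K3 = 3.0 * SQRT6 * u[0] * u[1] * u[2]
    s = 3.0 * math.sqrt(1.0 - K3 * K3)  # singular for uniaxial states
    c = tuple((SQRT6 + 3.0 * K3 * x - 3.0 * SQRT6 * x * x) / s for x in u)
    return K2, K3, u, c


def responses(lam1, lam2, t):
    """Evaluate (5.2a) and (5.2b) from principal stresses t = (t1, t2, t3)."""
    K2, K3, u, c = kinematics(lam1, lam2)
    dW_dK2 = dot(u, t)
    dW_dK3 = -K2 * dot(c, t) / (3.0 * math.sqrt(1.0 - K3 * K3))
    return dW_dK2, dW_dK3


# Round trip: build t from (5.1) with hypothetical values, then recover them.
K2, K3, u, c = kinematics(1.8, 1.2)
a, b, q = 1.3, 0.2, 0.7
coef = 3.0 * math.sqrt(1.0 - K3 * K3) / K2
t = tuple(-q + a * ui - coef * b * ci for ui, ci in zip(u, c))
print(responses(1.8, 1.2, t))
```

The arbitrary scalar q drops out because both unit tensors are deviatoric.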
Figure 5. I1 and I2 response function plots for the trajectories in Figure 4 with the stress
given by tW (i.e., the surface fit in Figure 3). Error bars are not shown, yet the error bounds
are larger than those in Figure 2 and go to infinity near the ends of each curve.
As discussed in [11], it is appropriate that the ∂W/∂K3 calculation be sensitive
to error for uniaxial, axis-symmetric deformations (i.e., K3² = 1) because the K3
response term must vanish in order to satisfy symmetry. In particular, note from
(5.1) that dev(t) is colinear to dev(ln V) for uniaxial deformations – a necessary
condition for an axis-symmetric deformation superimposed on an isotropic body.
Because of this, the K3 response function has an infinite magnification of error for
uniaxial tests.
Figure 6. Functional dependence of W on K2 and K3. W and its derivatives are in units of
MPa. The top panels show how ∂W/∂K2 (a) and W (b) depend on K3. The width of the line
spans the upper and lower error bounds. A functional form, given by W = g(K2) + K3 h(K2),
is appropriate. Fitting a line to the W vs. K3 relation at multiple values of K2 yields the
intercept g(K2) and slope h(K2) which are plotted in panels (c) and (d), respectively. As required by [11],
note that g(K2) goes to zero as order K2² whereas h(K2) vanishes faster.
Yet, it is a simple experimental task to measure how ∂W/∂K2 depends on K3
because the K2 response function is measurable, nevertheless, for uniaxial deformation and all neighboring configurations. In contrast, neither ∂W/∂I1 nor ∂W/∂I2
can be measured for uniaxial tests because their response terms are colinear – the
maximum of covariance.
Using our data adjustment (i.e., tW in Section 3) to calculate ∂W/∂K2, Figure 6(a) plots ∂W/∂K2 as a function of K3 when K2 is held constant at 0.25,
0.5, 0.75, 1.0, and 1.25. The thickness of each curve is determined by the error
propagation such that the upper and lower edges are maximal and minimal bounds,
respectively. With tW being continuous, we numerically integrate ∂W/∂K2 (with
dK2 = 0.01 and using the trapezoidal rule) at fixed values of K3. In so doing for
many K3 values, Figure 6(b) plots W as a function of K3 when K2 is held constant.
Note the functional form of W is nearly W = g(K2) + K3 h(K2), i.e., linear in K3
as suggested by [11]. When K2 is held constant, a linear regression of W vs. K3
has an intercept equal to g(K2) and a slope equal to h(K2). With such a regression
method, we plot g(K2) and h(K2) in Figures 6(c) and (d), respectively. Finding
W(K2, K3) is straightforward and accurate.
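The integration and regression steps described above can be sketched as follows. The response function `dW_dK2` is a hypothetical stand-in (it integrates to W = K2² + K3 K2³), not the response obtained from the data in [14]; the function names and the starting value W = 0 at K2 = 0 are likewise assumptions.

```python
def integrate_W(dW_dK2, K2_max, K3, dK2=0.01):
    """Trapezoidal integral of dW/dK2 from 0 to K2_max at fixed K3."""
    n = round(K2_max / dK2)
    W = 0.0
    for i in range(n):
        lo, hi = i * dK2, (i + 1) * dK2
        W += 0.5 * dK2 * (dW_dK2(lo, K3) + dW_dK2(hi, K3))
    return W


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def dW_dK2(K2, K3):
    """Hypothetical response function used only for this illustration."""
    return 2.0 * K2 + 3.0 * K3 * K2 ** 2


K3_values = [-1.0, -0.5, 0.0, 0.5, 1.0]
W_values = [integrate_W(dW_dK2, 0.5, K3) for K3 in K3_values]
g, h = fit_line(K3_values, W_values)  # g(K2) and h(K2) at K2 = 0.5
print(g, h)
```

Repeating the fit at each K2 of interest yields sampled curves for g(K2) and h(K2).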
6. Conclusions
It is shown that a representation formula for t in rubber is experimentally ill-conceived when there is significant covariance amongst the response terms. This
is so because experimental error, inherent in tests on rubber, is unacceptably magnified when response functions are calculated from biaxial test data. This result is
important because it explains why the most widely used phenomenological theory
for rubber elasticity (i.e., that of Rivlin) is the most intractable experimentally. In
contrast, the phenomenological theories of Criscione et al. [11] and Ogden [15]
contain response terms that are mutually orthogonal (the absolute minimum of
covariance), and they are experimentally tractable. Although not yet applied to
data, the approach of Laine et al. [16] is sure to be experimentally tractable since
the response terms are mutually orthogonal.
Furthermore, the analytical example in Section 1, the data in [8], and the results
of Section 4 all show that W should have singular second partials with respect
to I1 and I2 . In contrast, the functional form of W (K2 , K3 ) is simple. As shown
in Section 5, one should expect a W (K2 , K3 ) that is linear in K3 and smooth
and monotonic in K2 . The behavior of the second partials is an important matter
because they appear in the equilibrium equations.
To their credit, Rivlin and Saunders [2] recognized that experimental error is
magnified unacceptably for moderate strain. Subsequently, they restricted their
biaxial tests to large strain with I1 and I2 at 5 or above. By quantifying the magnification of error in terms of covariance, we show more precisely why W (I1 , I2 )
cannot be determined in the moderate strain domain and in some domains of high
strain (i.e., the red region in Figure 1). Since high covariance is inherent in Rivlin’s
representation formula itself, it is doubtful that tests such as torsion of a cylinder
would be able to determine the functional form of W (I1 , I2 ). Yet, this is an open
question – only biaxial stretching is considered in detail here.
As for the statistical theory of rubber, many consider W (I1 , I2 ) to be useful or at
least thermodynamically convenient (e.g., see Treloar’s book [10]). However, the
covariance amongst the I1 and I2 response terms is most pronounced in the moderate strain range where entropy terms (rather than internal energy or crystallization)
are most likely to predominate. In other words, entropically based formulations
of W are potentially valid for a strain range in which W (I1 , I2 ) cannot be found
experimentally. This is a compelling problem for a statistical theory. How unique or
useful is a W (I1 , I2 ) formulation that cannot be independently verified with biaxial
testing? We conclude that some large variations in W (I1 , I2 ) will perturb the stress
only minimally because we have shown the reverse to be true – i.e., small variations
in stress values can give rise to large variations in W (I1 , I2 ).
Section 2 focuses on the inelastic behavior of rubber in order to accurately
quantify the experimental error in assuming elastic behavior. Albeit important for
assessing error, this inelastic behavior is small in comparison to the total stress
response. It is worthwhile to note that the stress is predominately (say 98%) dependent on strain alone when the variation in stress for a given strain is 2%.
We do not report a specific functional form of W (K2 , K3 ) for rubber herein
because at present there is not an acceptable data set in the literature to do so.
Although the data in [14] are useful for showing that W (K2 , K3 ) is experimentally
well-posed, there are systematic departures in our fit for tW (Figure 3) that arise
because the data violate hyperelasticity. One possible explanation for this is that
the specimens may have been held (and stress relaxed) at the uniaxial stretch state
before they were stretched in the cross direction. Moreover, isotropy cannot be
verified because only half of the stretch domain is tested. More experimental work
needs to be done.
An ideal testing protocol would randomly test and retest a large set of stretch
values that cover the biaxial stretch domain. Upon randomizing the stretches and
stretch-rates during the test, the resulting best-fit tW (see Section 3) would be the
best-guess for the stress if the strain-rate and deformation-history were not known.
The standard deviation of the data from the tW fit would be the error in assuming
that the stress in rubber is that which is given by a strain energy function alone.
Regardless of what experimental method is used, a departure from an isotropic,
hyperelastic ideal introduces error that must be addressed in a rational fashion when
trying to determine W .
Acknowledgement
The Texas Engineering Experiment Station provided financial support for this investigation.
Appendix
This Appendix shows that there is an inherent magnification of experimental error
when there is significant covariance amongst response terms in t. Classically, t for
incompressible materials with isotropic elastic behavior is written as:
$$\mathbf{t} = -p\mathbf{I} + \alpha_1\mathbf{B} - \alpha_2\mathbf{B}^{-1}, \tag{A.1}$$
where p is an arbitrary scalar, α1 and α2 are scalar response functions, I is the identity tensor, and B = FFᵀ is the left Cauchy–Green deformation tensor where F is
the local deformation gradient tensor. To develop general equations for calculation
of α1 and α2, let q = p − α1 tr(B)/3 + α2 tr(B⁻¹)/3. Whereby,
$$\mathbf{t} = -q\mathbf{I} + \alpha_1\operatorname{dev}(\mathbf{B}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}), \tag{A.2}$$
where dev(·) denotes the deviatoric part of the argument.
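Two independent equations for α1 and α2 are obtained by taking the inner product of (A.2) with dev(B) for one and with dev(B⁻¹) for the other. In particular,
$$\mathbf{t}:\operatorname{dev}(\mathbf{B}) = \alpha_1\operatorname{dev}(\mathbf{B}):\operatorname{dev}(\mathbf{B}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}):\operatorname{dev}(\mathbf{B}), \tag{A.3a}$$
$$\mathbf{t}:\operatorname{dev}(\mathbf{B}^{-1}) = \alpha_1\operatorname{dev}(\mathbf{B}):\operatorname{dev}(\mathbf{B}^{-1}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}):\operatorname{dev}(\mathbf{B}^{-1}). \tag{A.3b}$$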
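These equations are easily solved, yet to express the solution in a more useful
manner, let us introduce the operator udev(·) which denotes the unit deviator of
its argument. The unit deviator of B, for example, is dev(B) divided by its magnitude, or equivalently, udev(B) = |dev(B)|⁻¹ dev(B). Upon solving (A.3) and
rearranging terms, we obtain
$$\alpha_1 = \frac{\operatorname{udev}(\mathbf{B}) - \big(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1})\big)\operatorname{udev}(\mathbf{B}^{-1})}{|\operatorname{dev}(\mathbf{B})|\,\big(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\big)} : \mathbf{t}, \tag{A.4a}$$
$$\alpha_2 = -\frac{\operatorname{udev}(\mathbf{B}^{-1}) - \big(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1})\big)\operatorname{udev}(\mathbf{B})}{|\operatorname{dev}(\mathbf{B}^{-1})|\,\big(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\big)} : \mathbf{t}. \tag{A.4b}$$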
Although udev(B) and udev(B⁻¹) must have unit magnitude, combinations of
them do not. In fact, the numerators in (A.4a), (A.4b) have a magnitude of
√(1 − (udev(B) : udev(B⁻¹))²). Note further that (udev(B) : udev(B⁻¹))² is equal
to RC(dev(B), dev(B⁻¹))², whereby α1 becomes
$$\alpha_1 = \frac{\operatorname{udev}\Big(\operatorname{udev}(\mathbf{B}) - \big(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1})\big)\operatorname{udev}(\mathbf{B}^{-1})\Big)}{|\operatorname{dev}(\mathbf{B})|\,\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}} : \mathbf{t}. \tag{A.5}$$
Now the tensor in the numerator always has unit magnitude, and hence the sum of
the squares of its principal values is unity (i.e., ξ1² + ξ2² + ξ3² = 1 where ξi are the
principal values of the tensor in the numerator).
As necessary for isotropy, B and t are coaxial. It follows that any combination
of B and B⁻¹ is coaxial to t. Consequently, (A.5) becomes
$$\alpha_1 = \frac{\xi_1 t_1 + \xi_2 t_2 + \xi_3 t_3}{|\operatorname{dev}(\mathbf{B})|\,\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}}, \tag{A.6}$$
where the ti are the corresponding principal values of t. Assuming that the error in
knowing each of the principal values (i.e., Δt1, Δt2, and Δt3) of t is uncorrelated
to the other principal values then the error bounds of α1 are
$$\Delta\alpha_1 = \pm\frac{\sqrt{\xi_1^2\,\Delta t_1^2 + \xi_2^2\,\Delta t_2^2 + \xi_3^2\,\Delta t_3^2}}{|\operatorname{dev}(\mathbf{B})|\,\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}}, \tag{A.7}$$
where Δt1², for example, is the variance in t1. If the ti have a similar variance (i.e.,
Δt1² = Δt2² = Δt3² = Δt0²), then the variance of α1 (i.e., Δα1²) is
$$\Delta\alpha_1^2 = \frac{\Delta t_0^2}{|\operatorname{dev}(\mathbf{B})|^2\,\big(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\big)}. \tag{A.8}$$
It should be evident that error in t will be greatly magnified when RC(dev(B),
dev(B⁻¹)) is near 1.
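The magnification factor implied by (A.8) can be evaluated for any biaxial stretch state. The sketch below works with principal values, which suffices because B and B⁻¹ are coaxial; the function names are assumptions. It takes RC as the inner product of the two unit deviators, as stated below (A.4), and only RC² enters (A.8), so its sign does not matter here.

```python
import math


def dev(p):
    mean = sum(p) / 3.0
    return tuple(x - mean for x in p)


def dot(p, q):
    return sum(x * y for x, y in zip(p, q))


def magnification(lam1, lam2):
    """Return (R_C, error in alpha_1 divided by error in stress), per (A.8)."""
    b = (lam1 ** 2, lam2 ** 2, 1.0 / (lam1 * lam2) ** 2)  # principal values of B
    b_inv = tuple(1.0 / x for x in b)
    db, dbi = dev(b), dev(b_inv)
    norm_db = math.sqrt(dot(db, db))
    rc = dot(db, dbi) / (norm_db * math.sqrt(dot(dbi, dbi)))
    return rc, 1.0 / (norm_db * math.sqrt(1.0 - rc * rc))


rc_shear, mag_shear = magnification(2.0, 1.0)    # pure shear
rc_equi, mag_equi = magnification(2.0, 1.95)     # close to equibiaxial stretch
print(rc_shear, mag_shear, rc_equi, mag_equi)
```

Exactly uniaxial or equibiaxial states make the two deviators colinear, so the factor is unbounded there and the function would fail with a division by zero or a domain error.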
To generalize for all isotropic, incompressible, elastic materials, note that one
does not have to use B and B−1 as the kinematic tensors in t. In particular, we may
write
$$\mathbf{t} = -q\mathbf{I} + \alpha_1\mathbf{A}_1 + \alpha_2\mathbf{A}_2, \tag{A.9}$$
wherein α1 and α2 are scalar response functions and A1 and A2 are deviatoric
tensors that are linearly independent combinations of dev(B) and dev(B−1 ). With
an approach similar to the above, it follows that the error in the calculation of α1
and α2 will grow as RC (A1 , A2 ) approaches unity.
In order to avoid magnification of experimental error, hence, an ideal choice
for A1 and A2 would be combinations of dev(B) and dev(B−1 ) that are mutually
orthogonal so that RC (A1 , A2 ) vanishes. Moreover, if these combinations were
normalized such that |A1 | = |A2 | = 1 then contraction of A1 and A2 onto (A.9)
would separately yield
$$\alpha_1 = \mathbf{t}:\mathbf{A}_1, \qquad \alpha_2 = \mathbf{t}:\mathbf{A}_2. \tag{A.10}$$
In so doing, the error in the response function calculations would be on the same
order as that of t itself. The response functions and the stress would have the
same units and the same variance.
REMARK. Such an orthogonal tensor basis with normalized magnitudes can be
defined rather easily (see [17] for verification of all statements in this remark).
Toward this end, let D be a linear combination of B and B−1 . udev(D) will be a
combination of dev(B) and dev(B−1 ) that has unit magnitude. As for a tensor that
is orthogonal to udev(D), use the complementary unit deviator of D, cudev(D), as
given by
$$\operatorname{cudev}(\mathbf{D}) = \frac{\sqrt{6}\,\mathbf{I} + 3\operatorname{piud}(\mathbf{D})\operatorname{udev}(\mathbf{D}) - 3\sqrt{6}\,\operatorname{udev}(\mathbf{D})^2}{3\sqrt{1 - \operatorname{piud}(\mathbf{D})^2}}, \tag{A.11}$$
where piud(D) = 3√6 det(udev(D)) is the principal invariant of the unit deviator
of D. Since tr(udev(D)) is zero and tr(udev(D)²) is unity, udev(D) only has one
principal invariant, det(udev(D)), which happens to be bounded by ±(3√6)⁻¹.
Hence, piud(D) ∈ [−1, 1], and it is such that piud(D) = −1 iff D is like the
strain of uniaxial contraction⋆ and piud(D) = 1 iff D is like the strain of uniaxial
extension.⋆⋆ With generous use of the Cayley–Hamilton equation for udev(D), it
is possible to show that cudev(D) : cudev(D) = 1 and cudev(D) : udev(D) = 0.
Being deviatoric, it should be evident that udev(D) : I = 0 and cudev(D) : I = 0.
⋆ For uniaxial, isochoric contraction, strain tensors have one negative principal value and two
positive ones that are equal.
⋆⋆ For uniaxial, isochoric extension, strain tensors have one positive principal value and two
negative ones that are equal.
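The limiting values of piud(·) quoted in the remark can be checked numerically on principal values. This is a minimal sketch with assumed helper names; the three strain states are natural-strain triples built here for illustration.

```python
import math


def udev(p):
    """Unit deviator of a tensor given by its principal values."""
    mean = sum(p) / 3.0
    d = [x - mean for x in p]
    norm = math.sqrt(sum(x * x for x in d))
    return tuple(x / norm for x in d)


def piud(p):
    """Principal invariant of the unit deviator: 3*sqrt(6)*det(udev)."""
    u = udev(p)
    return 3.0 * math.sqrt(6.0) * u[0] * u[1] * u[2]


s = math.log(2.0)
uniaxial_extension = (s, -0.5 * s, -0.5 * s)   # one positive, two equal negative
uniaxial_contraction = (-s, 0.5 * s, 0.5 * s)  # one negative, two equal positive
pure_shear = (s, 0.0, -s)
print(piud(uniaxial_extension), piud(uniaxial_contraction), piud(pure_shear))
```

At piud = ±1 the denominator of (A.11) vanishes, which is consistent with the K3 response term being undefined for uniaxial states.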
References
1. R.S. Rivlin, Large elastic deformations of isotropic materials: IV. Further developments of the general theory. Phil. Trans. Roy. Soc. A 241 (1948) 379–397.
2. R.S. Rivlin and D.W. Saunders, Large elastic deformations of isotropic materials: VII. Experiments on the deformation of rubber. Phil. Trans. Roy. Soc. A 243 (1951) 251–288.
3. R.S. Rivlin and K.N. Sawyers, The strain-energy function for elastomers. Trans. Soc. Rheol. 20 (1976) 545–557.
4. M. Mooney, A theory of large elastic deformation. J. Appl. Phys. 11 (1940) 582–592.
5. A.N. Gent and A.G. Thomas, Forms of the stored (strain) energy function for vulcanized rubber. J. Polym. Sci. 28 (1958) 625–628.
6. L.J. Hart-Smith, Elasticity parameters for finite deformations of rubber-like materials. Z. Angew. Math. Phys. 17 (1967) 608–626.
7. H. Alexander, A constitutive relation for rubber-like materials. Internat. J. Engrg. Sci. 6 (1968) 549–563.
8. Y. Obata, S. Kawabata and H. Kawai, Mechanical properties of natural rubber vulcanizates in finite deformation. J. Polymer Sci. A 2(8) (1970) 903–919.
9. A.G. James, A. Green and G.M. Simpson, Strain energy function of rubber. I. Characterization of gum vulcanization. J. Appl. Polymer Sci. 19 (1975) 2033–2058.
10. L.R.G. Treloar, The Physics of Rubber Elasticity. Clarendon Press, Oxford (1975) p. 225.
11. J.C. Criscione, J.D. Humphrey, A.S. Douglas and W.C. Hunter, An invariant basis for natural strain which yields orthogonal stress response terms in isotropic hyperelasticity. J. Mech. Phys. S. 48 (2000) 2445–2465.
12. D.F. Jones and L.R.G. Treloar, The properties of rubber in pure homogeneous strain. J. Phys. D: Appl. Phys. 8 (1975) 1285–1304.
13. Y. Beers, Introduction to the Theory of Error. Addison-Wesley, Reading, MA (1957).
14. D.W. Haines and W.D. Wilson, Strain-energy density function for rubber-like materials. J. Mech. Phys. S. 27 (1979) 345–360.
15. R.W. Ogden, Nonlinear Elastic Deformations. Halsted Press, New York (1984).
16. E. Laine, C. Vallee and D. Fortune, Nonlinear isotropic constitutive laws: Choice of the three invariants, convex potentials and constitutive inequalities. Internat. J. Engrg. Sci. 37 (1999) 1927–1941.
17. J.C. Criscione, Direct tensor expression for natural strain which yields a fast, accurate approximation. Internat. J. Comp. Struct. 80 (2002) 1895–1905.
Generalized Hessian and External Approximations
in Variational Problems of Second Order
CESARE DAVINI and ROBERTO PARONI
Dipartimento di Ingegneria Civile, Università degli Studi di Udine, Via delle Scienze,
208-33100 Udine, Italy. E-mail: {cesare.davini:roberto.paroni}@dic.uniud.it
Received 18 January 2002; in revised form 13 August 2002
Abstract. We introduce a suitable notion of generalized Hessian and show that it can be used
to construct approximations by means of piecewise linear functions to the solutions of variational
problems of second order. An important guideline of our argument is taken from the theory of the
Γ-convergence. The convergence of the method is proved for integral functionals whose integrand is
convex in the Hessian and satisfies standard growth conditions.
Mathematics Subject Classifications (2000): 65N12, 65N30, 46N10, 74K20, 74S05.
Key words: numerical methods, non-conforming approximations, Γ-convergence, anisotropic plates.
To the memory of Clifford Truesdell, with gratitude.
1. Introduction
The approximation of second, or higher order, variational problems by standard
conforming methods is quite cumbersome, since it requires the continuity of the
first, or higher order, derivatives across the mesh elements. It is therefore preferable
to use alternative methods that with various strategies provide external approximations to the solution, that is approximations within spaces of functions that are less
regular than what would be required by the variational problem. Several families
of these methods have been thoroughly studied in the past decades and a well
established theory has been developed.
In non-conforming finite elements the regularity requirements across the mesh
elements are relaxed and an external approximation is obtained by simply considering bases of non-conforming functions. The possible jumps and discontinuities
at element interfaces resulting from nonconformity are completely ignored and
just the sum of the contribution over the mesh elements is taken into account.
The convergence to the solution is then assured for the so called consistent finite elements [7]. The mixed methods instead provide external approximations
by enlarging the list of primal variables and adding suitable constraints as side
conditions. By the introduction of Lagrange multipliers the minimum problem is
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 217–242
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
C. DAVINI AND R. PARONI
then changed into a saddle point problem. Obviously to solve the latter problem a
peculiar choice of algorithms is needed [4, 17].
Here we follow a different approach that does not make use of Lagrange multipliers nor ignores the discontinuities across the mesh elements. We consider variational problems of second order in a two-dimensional bounded domain, and give
an approximation scheme using spaces of piecewise linear functions defined for a
chosen sequence of triangulations of the domain. Precisely, we introduce a notion
of generalized Hessian, based on a discrete Green’s formula, and show that it endows the union of these discrete spaces with a sequential topology that makes it
dense, in an appropriate sense, in the function space in which the given variational
problem is defined. This is established under a mild assumption on the triangulations. Then, for a generic functional of integral type whose integrand is convex with
respect to the Hessian and satisfies a standard quadratic growth condition, we construct a sequence of functionals defined on these discrete spaces and prove that it
Γ-converges to the given functional. Moreover, when the integrand is strictly convex, the minimizers converge to the minimizer of the original problem. So this
approach provides an approximation technique.
All this generalizes ideas discussed by Davini [10, 11] and Davini and Pitacco
[12, 13]. Credit must also be given to an early paper by Glowinski [15] that probably did not receive the attention it deserves.
Applications and the crucial issue of estimating the convergence rate are not
considered in the present paper (see however the related paper by Davini and
Pitacco [13] where the rate of convergence for the biharmonic problem is studied
within the general framework of the mixed method). Our attention is rather focused
on general aspects of the method and, particularly, on its connections with the
Γ-convergence of functionals. Although it is customarily used for different scopes,
it seems to us that the framework of Γ-convergence lends itself quite naturally to
approximation purposes. In particular, as in this case, it may lead in a direct way
to the introduction of sequences of unconstrained minimum problems for functionals defined in non-conforming spaces and whose minimizers provide external
approximations to the solution.
Our results are not confined to the quadratic functionals and cover a fairly broad
class of problems. It is worth recalling that various authors have proposed for
the quadratic case approximation techniques that turn out to be similar to ours,
although they have been worked out in a different perspective. We mention, among
others, the works by Bhattacharyya et al. [3, 16], who treated the case of linear
anisotropic plates within the scheme of the mixed methods, and those by Angelillo
et al. [1, 2], who adapted the argument of Davini and Pitacco [13] for the loading
problem of plane anisotropic linear elasticity.
2. Discretization of the Domain
Let $\Omega \subset \mathbb{R}^2$ be an open bounded domain with smooth boundary. Let $\mathcal{T}_h := \{T_j\}_{j=1,\dots,P_h}$, with $h$ taking values in some countable set $\mathcal{H}$ of real numbers, be
GENERALIZED HESSIAN AND EXTERNAL APPROXIMATIONS
a sequence of triangulations of $\Omega$ regular in the sense of Ciarlet [6], i.e., such that the ratio between
$$\rho_h = \inf_j \sup_S \{\operatorname{diam}(S) : S \text{ is a disk contained in } T_j\} \quad\text{and}\quad h := \sup_j \{\operatorname{diam} T_j\}$$
is bounded away from zero by a constant independent of $h$. We denote by $x_i$ the vertices of the triangles $T_j$ and call them the nodes of the mesh. We indicate by $\mathcal{P}_h := \{1, 2, \dots, P_h\}$ and $\mathcal{N}_h := \{1, 2, \dots, N_h\}$ the sets of values taken by the indexes of the triangles and the mesh nodes, respectively. We shall call $\mathcal{T}_h$ the primal mesh. Denoting
$$\Omega_h := \Big(\bigcup_{T_j \in \mathcal{T}_h} \overline{T}_j\Big)^{\circ}$$
we require that $\Omega_h$ invades $\Omega$ from inside.
Following Davini and Pitacco [12, 13], for each $h \in \mathcal{H}$ we also introduce a dual mesh $\hat{\mathcal{T}}_h := \{\hat T_i\}_{i=1,\dots,N_h}$ consisting of disjoint open polygonal domains, each containing just one primal node, as shown in Figure 1 where the dual elements are drawn with dashed lines. We assume that the sequence of dual meshes is also regular and that
$$\Omega_h = \Big(\bigcup_{\hat T_j \in \hat{\mathcal{T}}_h} \overline{\hat T}_j\Big)^{\circ}.$$
Let $X_h$ be the space of functions which are affine on each $T_j$ and continuous on $\Omega_h$ (briefly, the polyhedral functions over $\mathcal{T}_h$), and let $X_{0h} \subset X_h$ denote the set of functions that vanish on $\partial\Omega_h$. We regard $X_{0h}$ as a subspace of $H_0^1(\Omega)$ by extending the functions to zero in $\Omega \setminus \Omega_h$.
Figure 1. $\Omega$, $\Omega_h$, the primal and the dual mesh.
Let $\hat\varphi_i$ be the polyhedral splines in $X_h$ defined by the condition that $\hat\varphi_i(x_j) = \delta_{ij}$ for $i, j = 1, \dots, N_h$. We assume that
$$|\hat T_i| = \int_{\Omega_h} \hat\varphi_i \, dx = \frac{1}{3}\,|\operatorname{supp}(\hat\varphi_i)| \tag{1}$$
and call it ASSUMPTION (H0). Note that it is always possible to construct a mesh with this property, e.g., by taking the nodes of the dual mesh to be the center of mass and the middle points of the sides of the triangles of the primal mesh.
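The construction just mentioned can be checked numerically. The following is a minimal sketch, assuming a single arbitrary (hypothetical) primal triangle: it splits the triangle into the three quadrilaterals bounded by a vertex, the two adjacent edge midpoints and the center of mass, and compares each area with one third of the triangle area, which is the per-triangle content of ASSUMPTION (H0). The function names are illustrative only.

```python
import numpy as np

def polygon_area(points):
    """Unsigned area of a simple polygon via the shoelace formula."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def dual_cell_pieces(triangle):
    """Split a primal triangle into the three quadrilaterals
    (vertex, edge midpoint, centroid, edge midpoint), one per vertex."""
    tri = np.asarray(triangle, dtype=float)
    centroid = tri.mean(axis=0)
    pieces = []
    for i in range(3):
        v = tri[i]
        m_next = 0.5 * (v + tri[(i + 1) % 3])   # midpoint of the edge to the next vertex
        m_prev = 0.5 * (v + tri[(i + 2) % 3])   # midpoint of the edge to the previous vertex
        pieces.append(np.array([v, m_next, centroid, m_prev]))
    return pieces

triangle = [(0.0, 0.0), (2.0, 0.3), (0.5, 1.7)]   # arbitrary, hypothetical triangle
total = polygon_area(triangle)
areas = [polygon_area(p) for p in dual_cell_pieces(triangle)]
print(total, areas)
```

Summing these pieces over the triangles that share a node gives a dual element whose area is one third of the area of the support of the corresponding spline, as required by (1).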
In what follows, in order to keep the notation simple, we sometimes avoid labeling by $h$ certain quantities that are mesh dependent, such as, for instance, the mesh nodes or the mesh elements, assuming that the dependence is clear from the context. Also, it shall be useful to distinguish between the internal nodes, which are those that do not belong to $\partial\Omega_h$ and whose indices take value in the set $\mathcal{I}_h \subset \mathcal{N}_h$, and the boundary nodes, which are those sitting on it.
DEFINITION 1. We shall say that the sequence of partitions considered has the PROPERTY (⋆) if the fourth-order tensor
$$G_h^{(k)} := \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \otimes (x_j - x_k) \otimes (x_j - x_k)$$
satisfies the following inequality
$$\limsup_h \, \sup_{k \in \mathcal{I}_h} \big\|G_h^{(k)}\big\| \le 1,$$
where $\|\cdot\|$ denotes the sup-norm, i.e.,
$$\big\|G_h^{(k)}\big\| := \sup_{H \ne 0} \frac{|G_h^{(k)} H|}{|H|},$$
$H$ ranging over the space of second order tensors.
A similar condition was required by Glowinski [15]. Angelillo et al. [1, 2] considered instead meshes with the following property (PROPERTY (AFF)):
$$\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j) \cdot (x - x_j) \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = 0 \qquad \forall H$$
for every node $x_k$. In what follows we prove a couple of lemmas implying that if the mesh satisfies PROPERTY (AFF) at the internal nodes, then it has the PROPERTY (⋆). As we shall see, our assumption that $\Omega_h$ invades $\Omega$ is sufficient to control also the contribution coming from the boundary nodes.
REMARK 1. The fourth-order tensor $G_h^{(k)}$ maps second-order tensors into symmetric tensors. To deduce this it suffices to show that $\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx$ is a symmetric second-order tensor. Note that if $x_j$ and $x_k$ are not the nodes of the same
Figure 2. Two triangles of the primal mesh.
triangle then the integral considered is equal to zero. So, let us fix attention on any node $x_k$ and adopt local labels $t = 1, 2, \dots, t_k$ to denote the nodes all around it and the respective primal triangles. Also, choose counterclockwise ordering and indicate by $n_t$ the unit normals to the sides joining the central node $x_k$ to $x_t^{(k)}$. Let $T_t^{(k)}$ and $T_{t-1}^{(k)}$ be the triangles having the side $(x_k, x_t^{(k)})$, with $x_j \equiv x_t^{(k)}$, in common. The situation is represented in Figure 2. After denoting with $\hat\varphi_{t-1}^{(k)}$ and $\hat\varphi_t^{(k)}$ the restriction of $\hat\varphi_j$ on $T_{t-1}^{(k)}$ and $T_t^{(k)}$, respectively, we may write
$$\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = \int_{T_{t-1}^{(k)}} \nabla\hat\varphi_{t-1}^{(k)} \otimes \nabla\hat\varphi_k \, dx + \int_{T_t^{(k)}} \nabla\hat\varphi_t^{(k)} \otimes \nabla\hat\varphi_k \, dx,$$
from which, by applying Green formula, it follows that
$$\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = \int_{(x_k, x_{t+1}^{(k)})} \nabla\hat\varphi_t^{(k)} \otimes n_{t+1} \, \hat\varphi_k \, d\tau - \int_{(x_k, x_t^{(k)})} [|\nabla\hat\varphi_t^{(k)}|] \otimes n_t \, \hat\varphi_k \, d\tau - \int_{(x_k, x_{t-1}^{(k)})} \nabla\hat\varphi_{t-1}^{(k)} \otimes n_{t-1} \, \hat\varphi_k \, d\tau.$$
Now, the gradients $\nabla\hat\varphi_t^{(k)}$ and $\nabla\hat\varphi_{t-1}^{(k)}$ have the direction of $n_{t+1}$ and $n_{t-1}$, respectively. So, the integrands of the first and third integrals are symmetric dyads. On the other hand, by Hadamard lemma, $[|\nabla\hat\varphi_t^{(k)}|] = [|\hat\varphi_{t,n}^{(k)}|]\, n_t$ and the second integral also takes on symmetric values.
The following simple lemma shall be useful in what follows.
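LEMMA 1. Let $x_k$ be an internal node. Then, for all $H$, the identity
$$G_h^{(k)} H = \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \tag{2}$$
holds true.
Proof. We notice that, if $W$ is skew-symmetric, equation (2) is satisfied because $G_h^{(k)} W = 0$ by the definition. Hence, without loss in generality, let $H$ be any symmetric second-order tensor. Then, from
$$G_h^{(k)} H = \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} H(x_j - x_k) \cdot (x_j - x_k) \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx,$$
and by taking into account that $\sum_{j=1}^{N_h} \nabla\hat\varphi_j = 0$, since $\sum_{j=1}^{N_h} \hat\varphi_j = 1$, and that $\sum_{j=1}^{N_h} x_j \hat\varphi_j(x) = x$, we find
$$\begin{aligned}
G_h^{(k)} H &= \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} (H x_j \cdot x_j - 2 H x_j \cdot x_k) \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \\
&= \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx + \frac{1}{|\hat T_k|} \int_{\Omega_h} \Big[\nabla\Big(\sum_{j=1}^{N_h} x_j \hat\varphi_j\Big)\Big]^T H x_k \otimes \nabla\hat\varphi_k \, dx \\
&= \frac{-1}{2|\hat T_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx + \frac{1}{|\hat T_k|} H x_k \otimes \int_{\Omega_h} \nabla\hat\varphi_k \, dx.
\end{aligned}$$
By a Green formula the last integral on the right hand side vanishes because $\hat\varphi_k$ vanishes at the boundary of $\operatorname{supp} \hat\varphi_k$ when $x_k$ is an internal node. Thus, equation (2) holds true. ✷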
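The next lemma shows that PROPERTY (⋆) is more general than PROPERTY (AFF).
LEMMA 2. An internal node $x_k$ owns the property (AFF) if and only if $G_h^{(k)} = I$, where $I$ is the fourth-order identity tensor.
Proof. Without loss in generality it suffices to consider a generic symmetric tensor $H$. Then,
$$\begin{aligned}
\sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx &= \int_{\Omega_h} \Big[\nabla\Big(\sum_{j=1}^{N_h} x_j \hat\varphi_j\Big)\Big]^T H x \otimes \nabla\hat\varphi_k \, dx \\
&= \int_{\Omega_h} H x \otimes \nabla\hat\varphi_k \, dx = -\int_{\Omega_h} \nabla(Hx)\, \hat\varphi_k \, dx = -H \int_{\Omega_h} \hat\varphi_k \, dx,
\end{aligned}$$
where we have applied Green's formula and taken into account that $x_k$ is an internal node so that $\hat\varphi_k$ vanishes on the boundary of $\operatorname{supp} \hat\varphi_k$. By ASSUMPTION (H0) we then obtain
$$\sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = -H |\hat T_k|. \tag{3}$$
Moreover we have
$$\begin{aligned}
\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j) \cdot (x - x_j) \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
&= \int_{\Omega_h} H x \cdot x \, \nabla\Big(\sum_{j=1}^{N_h} \hat\varphi_j\Big) \otimes \nabla\hat\varphi_k \, dx - 2 \sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \\
&\quad + \sum_{j=1}^{N_h} \int_{\Omega_h} H x_j \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
\end{aligned}$$
and by Lemma 1, equation (3) and noticing again that $\sum_{j=1}^{N_h} \hat\varphi_j = 1$, we obtain
$$\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j) \cdot (x - x_j) \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = 2|\hat T_k| \big(H - G_h^{(k)} H\big).$$
From this identity it then follows that
$$G_h^{(k)} H = H \qquad \forall H$$
if and only if PROPERTY (AFF) applies. ✷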
As a consequence of Lemma 2 we deduce that PROPERTY (⋆) holds whenever the chosen triangulation forms hexagons or half hexagons formed by triplets of equal isosceles triangles, see Figure 3, since it was proved by Angelillo et al. [1, 2] that in this case PROPERTY (AFF) holds.
We now look at the case in which the primal mesh is generated by a rectangular grid of nodes, see Figure 4. In this case, after tedious calculations similar to those done in Remark 1, we deduce (with the notation shown in Figure 4) that $G_h^{(k)} = \gamma I$, where
$$\gamma := \frac{3(1 + \alpha\beta)}{2 + 2\alpha\beta + \alpha + \beta}$$
and where $\alpha := a_1/a_2$ and $\beta := b_1/b_2$. Hence PROPERTY (⋆) holds provided $\gamma \le 1$, that is for $(\{\alpha \le 1\} \cap \{\beta \ge 1\}) \cup (\{\beta \le 1\} \cap \{\alpha \ge 1\})$, cf. Figure 4. Note that PROPERTY (AFF) holds only for either $\alpha = 1$ or $\beta = 1$.
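The admissible region for the rectangular grid can be explored with a short script. This is only a sketch based on the formula for $\gamma$ stated above; the function names and the sample values of $\alpha$ and $\beta$ are illustrative assumptions. It uses the elementary identity $\gamma - 1 = (1-\alpha)(1-\beta)/(2 + 2\alpha\beta + \alpha + \beta)$, so that $\gamma \le 1$ exactly when $(1-\alpha)(1-\beta) \le 0$.

```python
def gamma(alpha, beta):
    """Scalar factor gamma in G = gamma * I for the rectangular-grid mesh."""
    return 3.0 * (1.0 + alpha * beta) / (2.0 + 2.0 * alpha * beta + alpha + beta)

def has_property_star(alpha, beta, tol=1e-12):
    """PROPERTY (star) for this mesh family amounts to gamma <= 1."""
    return gamma(alpha, beta) <= 1.0 + tol

# Hypothetical aspect ratios, chosen away from the borderline values alpha = 1, beta = 1.
samples = [(0.5, 2.0), (2.0, 0.5), (0.5, 0.8), (1.25, 2.0)]
report = [(a, b, gamma(a, b), has_property_star(a, b)) for a, b in samples]
for row in report:
    print(row)
```

For the uniform grid ($\alpha = \beta = 1$) the factor equals one, which corresponds to the borderline case where PROPERTY (AFF) holds.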
Figure 3. Partition for which PROPERTY (AFF) holds.
Figure 4. $\alpha := a_1/a_2$, $\beta := b_1/b_2$.
3. Generalized Hessian
In what follows we are interested in dealing with functions in $X_{0h}$ and shall imagine them extended to zero in the whole of $\mathbb{R}^2$. Let $\hat v \in X_{0h}$. Then, its second distributional derivatives are defined by $\langle \hat v_{,\alpha\beta}, \psi \rangle = -\int_{\Omega_h} \hat v_{,\alpha} \psi_{,\beta} \, dx$, for all functions $\psi \in C_0^\infty(\mathbb{R}^2)$. Hence the Hessian, $D^2 \hat v$, is a symmetric linear operator from $C_0^\infty(\mathbb{R}^2)$ into the space of two by two real matrices defined by
$$\langle D^2 \hat v, \psi \rangle = -\int_{\Omega_h} \nabla\hat v \otimes \nabla\psi \, dx.$$
By density we can extend this operator to the Sobolev space $H^1(\Omega_h)$. We shall still denote this extension by $D^2 \hat v$. It follows that
$$\langle D^2 \hat v, \hat\psi \rangle = -\int_{\Omega_h} \nabla\hat v \otimes \nabla\hat\psi \, dx, \qquad \hat\psi \in X_h.$$
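Noticing that every function $\hat\psi \in X_h$ can be written as $\hat\psi(x) = \sum_{j=1}^{N_h} \hat\psi(x_j) \hat\varphi_j(x)$, we can write
$$\langle D^2 \hat v, \hat\psi \rangle = -\sum_{j=1}^{N_h} \int_{\Omega_h} \nabla\hat v \otimes \nabla\hat\varphi_j \, dx \; \hat\psi(x_j) = \sum_{j=1}^{N_h} H_h \hat v(x_j) \, \hat\psi(x_j) \, |\hat T_j| \tag{4}$$
for every $\hat v \in X_{0h}$ and every $\hat\psi \in X_h$, where we have set
$$H_h \hat v(x_j) := \frac{-1}{|\hat T_j|} \int_{\Omega_h} \nabla\hat v \otimes \nabla\hat\varphi_j \, dx. \tag{5}$$
Given $\hat v \in X_{0h}$, we define the tensor valued function
$$H_h \hat v := \sum_{j=1}^{N_h} H_h \hat v(x_j) \, \chi_{\hat T_j}, \tag{6}$$
where $\chi_{\hat T_j}$ denotes the characteristic function of $\hat T_j$. We will call $H_h \hat v$ the generalized Hessian of $\hat v$.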
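Definition (5) is easy to evaluate on a single patch of triangles. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it takes one internal node surrounded by six equilateral triangles (a hexagonal patch of the kind for which PROPERTY (AFF) is quoted above), interpolates a hypothetical quadratic function at the nodes, and evaluates $H_h \hat v$ at the central node, using ASSUMPTION (H0) for the dual-cell area. For such a patch one expects the constant Hessian of the quadratic to be reproduced.

```python
import numpy as np

def p1_gradients(tri):
    """Rows are the (constant) gradients of the three P1 hat functions on a triangle."""
    A = np.hstack([np.ones((3, 1)), tri])   # rows [1, x_m, y_m], one per vertex
    coeffs = np.linalg.inv(A)               # column i: coefficients (c0, c1, c2) of hat i
    return coeffs[1:, :].T

def triangle_area(tri):
    d1, d2 = tri[1] - tri[0], tri[2] - tri[0]
    return 0.5 * abs(d1[0] * d2[1] - d1[1] * d2[0])

def generalized_hessian_at_node(center, ring, v):
    """H_h v(x_k) = -(1/|dual cell|) * integral of grad(v_hat) (x) grad(phi_k),
    for an internal node `center` surrounded by the ordered nodes `ring`."""
    S = np.zeros((2, 2))
    support_area = 0.0
    n = len(ring)
    for t in range(n):
        tri = np.array([center, ring[t], ring[(t + 1) % n]])
        grads = p1_gradients(tri)
        nodal_values = np.array([v(p) for p in tri])
        grad_v = nodal_values @ grads            # gradient of the P1 interpolant
        area = triangle_area(tri)
        S += area * np.outer(grad_v, grads[0])   # grads[0] belongs to the central node
        support_area += area
    return -S / (support_area / 3.0)             # dual-cell area from ASSUMPTION (H0)

A = np.array([[2.0, 1.0], [1.0, -3.0]])          # hypothetical constant Hessian
b = np.array([0.7, -0.2])
v = lambda p: 0.5 * p @ A @ p + b @ p + 1.5      # quadratic test function

a = 0.1                                          # mesh size of the patch
center = np.array([0.3, -0.4])
ring = [center + a * np.array([np.cos(s), np.sin(s)]) for s in np.arange(6) * np.pi / 3]
H = generalized_hessian_at_node(center, ring, v)
print(H)
```

The constant and linear parts of the test function do not contribute at an internal node, so only the quadratic part is seen by the generalized Hessian.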
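Furthermore, given a continuous function $f$ we define
$$c_h f(x) := \sum_{j=1}^{N_h} f(x_j) \, \chi_{\hat T_j}(x), \qquad r_h f(x) := \sum_{j=1}^{N_h} f(x_j) \, \hat\varphi_j(x). \tag{7}$$
Obviously, if $\hat v \in X_h$, then $r_h \hat v = \hat v$. For every $\hat v \in X_{0h}$ and $\hat\psi \in X_h$, then, from equation (4) we deduce that
$$\langle D^2 \hat v, \hat\psi \rangle = \int_{\Omega_h} H_h \hat v \; c_h \hat\psi \, dx. \tag{8}$$
For later use let us also denote by
$$\mathring{H}_h \hat v := \sum_{j \in \mathcal{I}_h} H_h \hat v(x_j) \, \chi_{\hat T_j} \tag{9}$$
the simple function that coincides with the generalized Hessian of $\hat v$ on the internal dual elements and vanishes outside. Note that, when $\hat\psi \in X_{0h}$, equation (8) becomes
$$\langle D^2 \hat v, \hat\psi \rangle = \sum_{j \in \mathcal{I}_h} H_h \hat v(x_j) \, \hat\psi(x_j) \, |\hat T_j| = \int_{\Omega_h} \mathring{H}_h \hat v \; c_h \hat\psi \, dx. \tag{10}$$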
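The two operators in (7) have a simple one-dimensional counterpart, which may help to visualize them. The sketch below is a hypothetical 1D analogue on the interval (0, 1), not part of the two-dimensional setting of the paper: the dual cells are the midpoint-to-midpoint intervals, so $c_h f$ takes the value of $f$ at the nearest node, while $r_h f$ is the nodal piecewise-linear interpolant. The test function and mesh sizes are arbitrary choices, and only $L^2$ errors are estimated.

```python
import numpy as np

def c_h(f, nodes, x):
    """Piecewise-constant operator: value of f at the node nearest to x
    (in 1D the dual cells are the midpoint-to-midpoint intervals)."""
    nearest = np.abs(x[:, None] - nodes[None, :]).argmin(axis=1)
    return f(nodes)[nearest]

def r_h(f, nodes, x):
    """Nodal piecewise-linear interpolation of f."""
    return np.interp(x, nodes, f(nodes))

def l2_error(approx, exact):
    """Midpoint-rule estimate of the L2(0,1) distance on uniform sample points."""
    return np.sqrt(np.mean((approx - exact) ** 2))

f = lambda s: np.sin(np.pi * s)
x = (np.arange(20000) + 0.5) / 20000          # fine sample of (0, 1)
errors = {}
for n in (10, 40):
    nodes = np.linspace(0.0, 1.0, n + 1)
    errors[n] = (l2_error(c_h(f, nodes, x), f(x)),
                 l2_error(r_h(f, nodes, x), f(x)))
print(errors)
```

Both errors shrink as the mesh is refined, the piecewise-linear one faster than the piecewise-constant one.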
REMARK 2. From what we have shown in Remark 1 it follows that the generalized Hessian is a symmetric tensor valued function for every $\hat v \in X_{0h}$. Its trace turns out to coincide with the generalized Laplacian defined by Davini in [11].
It is worth noticing an interesting interpretation that can be given to the generalized Hessian when the boundary of $\hat T_j$ intersects at the midpoints the edges of the primal triangles that concur at $x_j$. From calculations similar to those made in Remark 1 we get in fact
$$H_h \hat v(x_j) = \frac{1}{|\hat T_j|} \sum_{t=1}^{t_j} \int_{(x_j, x_t^{(j)})} [|\hat v_{,n}|] \, n_t \otimes n_t \, \hat\varphi_t^{(j)} \, ds = \frac{1}{|\hat T_j|} \sum_{t=1}^{t_j} \frac{1}{2} \, |x_j - x_t^{(j)}| \, [|\hat v_{,n}|] \, n_t \otimes n_t, \tag{11}$$
where $[|\hat v_{,n}|]$ denotes the jump of the directional derivative of $\hat v$ in the direction normal to the primal side $(x_j, x_t^{(j)})$. It is easy to see that $X_h$ is contained in the space of functions with special bounded Hessian, $SBH(\Omega)$, cf. [5], and that the distributional Hessian of $\hat v \in X_h$ is a Radon measure of the form
$$D^2 \hat v = [|\hat v_{,n}|] \, n \otimes n \, d\mathcal{H}^1 \lfloor S(\nabla\hat v), \tag{12}$$
where $\mathcal{H}^1 \lfloor (S(\nabla\hat v))$ denotes the one-dimensional Hausdorff measure restricted to the set where the gradient of $\hat v$ is discontinuous. By recalling that the boundary of $\hat T_j$ meets the sides of the primal mesh at the midpoints, we get
$$D^2 \hat v(\hat T_j) = \sum_{t=1}^{t_j} \frac{1}{2} \, |x_j - x_t^{(j)}| \, [|\hat v_{,n}|] \, n_t \otimes n_t, \tag{13}$$
when this measure is calculated on $\hat T_j$. Therefore, by $(11)_2$ it follows that
$$H_h \hat v(x_j) = \frac{D^2 \hat v(\hat T_j)}{|\hat T_j|}, \tag{14}$$
which motivates us to regard the generalized Hessian of a polyhedral function at $x_j$ as the mean value of the Hessian over $\hat T_j$.
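The mean-value interpretation in (14) has a transparent one-dimensional analogue, sketched below under the assumption of a piecewise-linear function on a non-uniform (hypothetical) mesh of an interval: the second derivative is a sum of Dirac masses at the nodes, weighted by the slope jumps, and dividing each jump by the length of the midpoint-to-midpoint dual cell gives a nodal second derivative. The function name and the sample data are illustrative only.

```python
import numpy as np

def generalized_second_derivative_1d(nodes, values):
    """1D analogue of (14): jump of the slope at each internal node divided by
    the length of the dual cell (the interval between adjacent midpoints)."""
    x = np.asarray(nodes, dtype=float)
    v = np.asarray(values, dtype=float)
    slopes = np.diff(v) / np.diff(x)            # slope on each primal interval
    jumps = slopes[1:] - slopes[:-1]            # slope jump at the internal nodes
    dual_lengths = 0.5 * (x[2:] - x[:-2])       # midpoint-to-midpoint length
    return jumps / dual_lengths

nodes = np.array([0.0, 0.1, 0.35, 0.5, 0.9, 1.0])   # non-uniform, hypothetical mesh
values = 3.0 * nodes**2 - nodes + 2.0                # v(x) = 3x^2 - x + 2, so v'' = 6
d2 = generalized_second_derivative_1d(nodes, values)
print(d2)
```

On a uniform mesh this quantity reduces to the usual centered second difference.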
4. Some Properties of a Sequential Topology in $\bigcup_{h \in \mathcal{H}} X_{0h}$
The set $\bigcup_{h \in \mathcal{H}} X_{0h}$ is obviously dense in $H_0^1(\Omega)$, with respect to the $H^1$ norm. The notion of generalized Hessian introduced in Section 3 can be used in order to endow the space $\bigcup_{h \in \mathcal{H}} X_{0h}$ with a sequential topology that makes it dense in $H^2(\Omega) \cap H_0^1(\Omega)$, or $H_0^2(\Omega)$, in an appropriate sense, and allows us to define converging discretization methods that provide external approximations to the solutions of variational problems involving the Hessian. This generalizes ideas discussed by Davini [10, 11] and Davini and Pitacco [12, 13].
It is routine to show that:
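LEMMA 3. There exist constants $c$ and $C$, which do not depend on $h$, such that
$$c \, \|c_h \hat v\|_{L^2(\Omega)}^2 \le \|\hat v\|_{L^2(\Omega)}^2 \le C \, \|c_h \hat v\|_{L^2(\Omega)}^2 \qquad \forall \hat v \in X_h.$$
Proof. Let $\hat v \in X_h$ and $T_j$ be one of the primal triangles. Then,
$$\hat v(x) = \sum_{t_1 \in \mathcal{N}_h(j)} \hat v(x_{t_1}) \, \hat\varphi_{t_1}(x) \quad \text{for } x \in T_j,$$
where the sum extends to the nodes that are vertices of $T_j$ and $\mathcal{N}_h(j) \subset \mathcal{N}_h$ denotes the set of values taken by their respective indices. Hence, we have
$$\int_{T_j} |\hat v|^2 \, dx = \sum_{t_1, t_2 \in \mathcal{N}_h(j)} \hat v(x_{t_1}) \int_{T_j} \hat\varphi_{t_1} \hat\varphi_{t_2} \, dx \; \hat v(x_{t_2}).$$
So, the matrix $\big(\int_{T_j} \hat\varphi_{t_1} \hat\varphi_{t_2} \, dx\big)$ is strictly positive. One can use an affine change of coordinates $x \to x'$ to transform $T_j$ into a normalized triangle, obtaining that
$$\int_{T_j} \hat\varphi_{t_1} \hat\varphi_{t_2} \, dx = |T_j| \, K_{t_1 t_2}, \qquad t_1, t_2 \in \mathcal{N}_h(j),$$
with $K_{t_1 t_2}$ the entries of a $3 \times 3$ symmetric matrix which is also strictly positive and independent of $j$. It follows that there are positive constants $\lambda$ and $\Lambda$ such that
$$\lambda |T_j| \sum_{t_1 \in \mathcal{N}_h(j)} \hat v(x_{t_1})^2 \le \int_{T_j} |\hat v|^2 \, dx \le \Lambda |T_j| \sum_{t_1 \in \mathcal{N}_h(j)} \hat v(x_{t_1})^2.$$
By summing up over the primal triangles and reorganizing the sums in the first and last term suitably, it follows that
$$\lambda \sum_{i \in \mathcal{N}_h} \hat v(x_i)^2 \sum_{j \in \mathcal{P}_h(i)} |T_j| \le \int_{\Omega_h} |\hat v|^2 \, dx \le \Lambda \sum_{i \in \mathcal{N}_h} \hat v(x_i)^2 \sum_{j \in \mathcal{P}_h(i)} |T_j|,$$
where $\mathcal{P}_h(i) \subset \mathcal{P}_h$ is the subset of index values relative to the primal triangles that have $x_i$ as a vertex. By observing that $\sum_{j \in \mathcal{P}_h(i)} |T_j| = |\operatorname{supp} \hat\varphi_i|$ and recalling ASSUMPTION (H0), we find that
$$3\lambda \sum_{i \in \mathcal{N}_h} \hat v(x_i)^2 \, |\hat T_i| \le \int_{\Omega_h} |\hat v|^2 \, dx \le 3\Lambda \sum_{i \in \mathcal{N}_h} \hat v(x_i)^2 \, |\hat T_i|,$$
which is our thesis. ✷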
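The constants in the proof can be made concrete. The sketch below relies on the standard fact that, for affine hat functions on a triangle, the normalized local mass matrix has entries 1/6 on the diagonal and 1/12 off the diagonal; it rebuilds that matrix with the edge-midpoint quadrature rule (exact for quadratic integrands) and computes its extreme eigenvalues, which are admissible values for the two positive constants of the proof. The variable names are illustrative.

```python
import numpy as np

# Normalized P1 element mass matrix: (1/|T|) * integral over T of phi_s * phi_t.
K = (np.ones((3, 3)) + np.eye(3)) / 12.0

# Cross-check with the edge-midpoint rule: each row lists the values of the
# three hat functions at one edge midpoint; each midpoint carries weight |T|/3.
midpoint_values = np.array([[0.5, 0.5, 0.0],
                            [0.0, 0.5, 0.5],
                            [0.5, 0.0, 0.5]])
K_quadrature = midpoint_values.T @ midpoint_values / 3.0

eigenvalues = np.linalg.eigvalsh(K)      # returned in ascending order
lam, Lam = eigenvalues[0], eigenvalues[-1]
print(lam, Lam, 3.0 * lam, 3.0 * Lam)
```

With these values the last chain of inequalities in the proof holds with lower constant $3\lambda = 1/4$ and upper constant $3\Lambda = 1$.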
LEMMA 4. There exists a constant $c$, independent of $h$, such that
$$\|\nabla\hat v\|_{L^2(\Omega)} \le c \, \big\|\mathring{H}_h \hat v\big\|_{L^2(\Omega)} \qquad \forall \hat v \in X_{0h}.$$
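Proof. This is a straightforward consequence of equation (10) and Lemma 3. In fact, by applying the Cauchy–Schwarz inequality to equation (10), with $\hat\psi$ equal to $\hat v$, and the Poincaré inequality successively, we deduce that
$$\Big|\int_{\Omega_h} \nabla\hat v \otimes \nabla\hat v \, dx\Big| \le \big\|\mathring{H}_h \hat v\big\|_{L^2(\Omega)} \|c_h \hat v\|_{L^2(\Omega)} \le c \, \big\|\mathring{H}_h \hat v\big\|_{L^2(\Omega)} \|\nabla\hat v\|_{L^2(\Omega)}$$
for some constant $c$. But
$$\|\nabla\hat v\|_{L^2(\Omega)}^2 = I \cdot \int_{\Omega_h} \nabla\hat v \otimes \nabla\hat v \, dx \le \sqrt{2} \, \Big|\int_{\Omega_h} \nabla\hat v \otimes \nabla\hat v \, dx\Big| \le c \, \big\|\mathring{H}_h \hat v\big\|_{L^2(\Omega)} \|\nabla\hat v\|_{L^2(\Omega)}. \tag{15}$$
Hence, the lemma is proved. ✷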
We now prove a compactness theorem.
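THEOREM 1. Let $\hat v_h \in X_{0h}$ and $v \in L^2(\Omega)$. If
$$\hat v_h \to v \quad \text{in } L^2(\Omega)$$
and
$$\sup_h \big\|\mathring{H}_h \hat v_h\big\|_{L^2(\Omega)} < +\infty,$$
then
$$v \in H^2(\Omega) \cap H_0^1(\Omega), \qquad \hat v_h \to v \quad \text{in } H^1(\Omega),$$
and
$$\mathring{H}_h \hat v_h \rightharpoonup \nabla^2 v \quad \text{in } L^2(\Omega).$$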
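Proof. Since $\sup_h \|\mathring{H}_h \hat v_h\|_{L^2(\Omega)} < +\infty$, from Lemma 4 we deduce that
$$\sup_h \|\nabla\hat v_h\|_{L^2(\Omega)} < +\infty,$$
and hence $\hat v_h \rightharpoonup v$ in $H^1(\Omega)$. But $\hat v_h \in H_0^1(\Omega)$ and hence also $v \in H_0^1(\Omega)$. Let $f \in C_0^\infty(\Omega)$. Then, for $h$ small enough $\operatorname{supp} f \Subset \Omega_h$ and we have that
$$\begin{aligned}
\int_\Omega \nabla\hat v_h \otimes \nabla r_h f \, dx &= \sum_{j \in \mathcal{I}_h} \int_{\Omega_h} \nabla\hat v_h \otimes \nabla\hat\varphi_j \, dx \; f(x_j) \\
&= -\sum_{j \in \mathcal{I}_h} |\hat T_j| \, H_h \hat v_h(x_j) \, f(x_j) = -\int_\Omega \mathring{H}_h \hat v_h \, c_h f \, dx \\
&= -\int_\Omega \mathring{H}_h \hat v_h \, f \, dx - \int_\Omega \mathring{H}_h \hat v_h \, (c_h f - f) \, dx
\end{aligned}$$
and hence
$$\int_\Omega \nabla\hat v_h \otimes \nabla f \, dx = -\int_\Omega \mathring{H}_h \hat v_h \, f \, dx - \int_\Omega \mathring{H}_h \hat v_h \, (c_h f - f) \, dx + \int_\Omega \nabla\hat v_h \otimes \nabla(f - r_h f) \, dx. \tag{16}$$
Let $\varepsilon > 0$. By using the fact that $c_h f \to f$ in $L^2(\Omega)$ and $r_h f \to f$ in $H^1(\Omega)$, we find that for $h$ sufficiently small we have
$$\Big|\int_\Omega \mathring{H}_h \hat v_h \, (c_h f - f) \, dx\Big| + \Big|\int_\Omega \nabla\hat v_h \otimes \nabla(f - r_h f) \, dx\Big| \le \varepsilon \, \|f\|_{L^2(\Omega)}.$$
From equation (16) we find
$$\Big|\int_\Omega \nabla\hat v_h \otimes \nabla f \, dx\Big| \le c \, \|f\|_{L^2(\Omega)}$$
and passing to the limit we deduce
$$\big|\langle D^2 v, f \rangle\big| = \Big|\int_\Omega \nabla v \otimes \nabla f \, dx\Big| \le c \, \|f\|_{L^2(\Omega)}.$$
Thus the Hessian is a bounded linear operator on $L^2(\Omega)$. By the Riesz representation theorem there exists a function $V \in (L^2(\Omega))^{2 \times 2}$ such that
$$\langle D^2 v, f \rangle = \int_\Omega V f \, dx.$$
As is usually done we shall denote the function $V$ by $\nabla^2 v$. Now, passing to the limit in equation (16) we deduce
$$\int_\Omega \nabla^2 v \, f \, dx = -\int_\Omega \nabla v \otimes \nabla f \, dx = \lim_h \int_\Omega \mathring{H}_h \hat v_h \, f \, dx,$$
for every $f \in C_0^\infty(\Omega)$, which implies that
$$\mathring{H}_h \hat v_h \rightharpoonup \nabla^2 v \quad \text{in } L^2(\Omega).$$
We now finish the proof by showing that $\nabla\hat v_h$ converges strongly in $L^2(\Omega)$ to $\nabla v$. To this end, from equation (8) we find
$$\int_\Omega \nabla\hat v_h \otimes \nabla\hat v_h \, dx = -\int_\Omega \mathring{H}_h \hat v_h \, \hat v_h \, dx + \int_\Omega \mathring{H}_h \hat v_h \, (\hat v_h - c_h \hat v_h) \, dx.$$
Hence,
$$\lim_h \int_\Omega \nabla\hat v_h \otimes \nabla\hat v_h \, dx = -\int_\Omega \nabla^2 v \, v \, dx = \int_\Omega \nabla v \otimes \nabla v \, dx, \tag{17}$$
as $\mathring{H}_h \hat v_h \rightharpoonup \nabla^2 v$, $\hat v_h \to v$ and $(\hat v_h - c_h \hat v_h) \to 0$ in $L^2(\Omega)$. By calculating the trace of the terms on the two sides of (17) it follows that
$$\lim_h \|\nabla\hat v_h\|_{L^2(\Omega)}^2 = \|\nabla v\|_{L^2(\Omega)}^2.$$
Therefore,
$$\nabla\hat v_h \to \nabla v \quad \text{in } L^2(\Omega)$$
since $\nabla\hat v_h \rightharpoonup \nabla v$ in $L^2(\Omega)$. ✷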
Assuming PROPERTY (⋆) we now deduce a density result.
THEOREM 2. Assume that PROPERTY (⋆) holds. For every $v \in H^2(\Omega) \cap H_0^1(\Omega)$ there exists a sequence $\{\hat v_h\} \subset X_{0h}$ such that
$$\hat v_h \to v \quad \text{in } H^1(\Omega),$$
and
$$\mathring{H}_h \hat v_h \to \nabla^2 v \quad \text{in } L^2(\Omega).$$
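Proof. We start by considering $v \in C^\infty(\overline{\Omega}) \cap H^2(\Omega) \cap H_0^1(\Omega)$ and let $\hat v_h := r_h v$ be its nodal interpolation defined in equation (7). Obviously, $\hat v_h \to v$ in $H^1(\Omega)$. We now compute
$$H_h \hat v_h(x_j) = \frac{-1}{|\hat T_j|} \int_{\Omega_h} \nabla\hat v_h \otimes \nabla\hat\varphi_j \, dx = \frac{-1}{|\hat T_j|} \sum_{k=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_k \otimes \nabla\hat\varphi_j \, dx \; v(x_k) \tag{18}$$
at any internal node $x_j$. Since the function $v$ is smooth we can write
$$v(x_k) = v(x_j) + \nabla v(x_j) \cdot (x_k - x_j) + \frac{1}{2} \nabla^2 v(x_j)(x_k - x_j) \cdot (x_k - x_j) + o\big(|x_k - x_j|^2\big).$$
Then, taking into account that
$$\Big(\sum_{k=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_k \otimes \nabla\hat\varphi_j \, dx\Big) v(x_j) = 0, \qquad j \in \mathcal{I}_h, \tag{19}$$
because $\sum_{k=1}^{N_h} \hat\varphi_k = 1$, and
$$\begin{aligned}
\sum_{k=1}^{N_h} \int_{\Omega_h} \nabla v(x_j) \cdot (x_k - x_j) \, \nabla\hat\varphi_k \otimes \nabla\hat\varphi_j \, dx
&= \int_{\Omega_h} \nabla\Big(\nabla v(x_j) \cdot \sum_{k=1}^{N_h} (x_k - x_j) \hat\varphi_k\Big) \otimes \nabla\hat\varphi_j \, dx \\
&= \int_{\Omega_h} \nabla\big(\nabla v(x_j) \cdot (x - x_j)\big) \otimes \nabla\hat\varphi_j \, dx \\
&= \int_{\Omega_h} \nabla v(x_j) \otimes \nabla\hat\varphi_j \, dx = 0, \qquad j \in \mathcal{I}_h,
\end{aligned} \tag{20}$$
we deduce
$$\begin{aligned}
H_h \hat v_h(x_j) &= \frac{-1}{2|\hat T_j|} \sum_{k=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_k \otimes \nabla\hat\varphi_j \, dx \; \nabla^2 v(x_j)(x_k - x_j) \cdot (x_k - x_j) - \frac{1}{|\hat T_j|} \sum_{k=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_k \otimes \nabla\hat\varphi_j \, dx \; o(h^2) \\
&= G_h^{(j)} \nabla^2 v(x_j) + o(1), \qquad j \in \mathcal{I}_h,
\end{aligned} \tag{21}$$
where we have taken into account that $x_k$ and $x_j$ must belong to the same triangle in order to appear in the expressions above. In equation (21) $o(1)$ indicates a quantity that tends to 0 with $h$, uniformly in $j$. Hence, by applying PROPERTY (⋆),
$$\begin{aligned}
\limsup_h \big\|\mathring{H}_h \hat v_h\big\|_{L^2(\Omega)}^2 &= \limsup_h \sum_{j \in \mathcal{I}_h} |H_h \hat v_h(x_j)|^2 \, |\hat T_j| \\
&\le \limsup_h \sum_{j \in \mathcal{I}_h} \big\|G_h^{(j)}\big\|^2 \, |\nabla^2 v(x_j)|^2 \, |\hat T_j| \\
&\le \limsup_h \, \sup_{j \in \mathcal{I}_h} \big\|G_h^{(j)}\big\|^2 \sum_{j \in \mathcal{I}_h} |\nabla^2 v(x_j)|^2 \, |\hat T_j| \\
&\le \limsup_h \sum_{j=1}^{N_h} |\nabla^2 v(x_j)|^2 \, |\hat T_j|.
\end{aligned}$$
It follows that
$$\limsup_h \big\|\mathring{H}_h \hat v_h\big\|_{L^2(\Omega)}^2 \le \big\|\nabla^2 v\big\|_{L^2(\Omega)}^2. \tag{22}$$
For a subsequence, not relabeled, we have that $\sup_h \|\mathring{H}_h \hat v_h\|_{L^2(\Omega)}^2 < +\infty$, and hence from Theorem 1 we deduce that
$$\mathring{H}_h \hat v_h \rightharpoonup \nabla^2 v \quad \text{in } L^2(\Omega),$$
while, from equation (22) we find that the convergence is indeed strong in $L^2(\Omega)$.
We now consider the general case, i.e., $v \in H^2(\Omega) \cap H_0^1(\Omega)$. Let $w_k \in C^\infty(\overline{\Omega}) \cap H^2(\Omega) \cap H_0^1(\Omega)$, $k \in \mathbb{N}$, be such that $w_k \to v$ in $H^2(\Omega)$. From the case discussed above we deduce that for every $k$ there exists an $h = h(k)$ such that if we let $\hat v_h := r_{h(k)} w_k$ we have
$$\|\hat v_h - w_k\|_{H^1(\Omega)}^2 \le \frac{1}{k}, \qquad \big\|\mathring{H}_h \hat v_h - \nabla^2 w_k\big\|_{L^2(\Omega)}^2 \le \frac{1}{k}.$$
Now, letting $k$ go to infinity we see that $\hat v_h$ is the sequence we were looking for. ✷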
The following important variants of the previous theorems apply if we take into
account the generalized Hessian up to the boundary nodes.
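THEOREM 3. Let $\hat v_h \in X_{0h}$ and $v \in L^2(\Omega)$. If
$$\hat v_h \to v \quad \text{in } L^2(\Omega)$$
and
$$\sup_h \|H_h \hat v_h\|_{L^2(\Omega)} < +\infty,$$
then
$$v \in H_0^2(\Omega), \qquad \hat v_h \to v \quad \text{in } H^1(\Omega),$$
and
$$H_h \hat v_h \rightharpoonup \nabla^2 v \quad \text{in } L^2(\Omega).$$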
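Proof. Since $\sup_h \|H_h \hat v_h\|_{L^2(\Omega)} < +\infty$ implies that $\sup_h \|\mathring{H}_h \hat v_h\|_{L^2(\Omega)} < +\infty$, the conclusions of Theorem 1 hold. Moreover, obviously, $\mathring{H}_h \hat v_h \rightharpoonup \nabla^2 v$ in $L^2(\Omega)$ implies that $H_h \hat v_h \rightharpoonup \nabla^2 v$ in $L^2(\Omega)$. Thence we have only to prove that
$$v_{,n} = 0 \quad \text{on } \partial\Omega.$$
To see this, we can replicate the argument of Davini [11, Lemma 3]. Let $f \in C^\infty(\overline{\Omega})$ and observe by the Green formula that
$$\int_{\partial\Omega} v_{,n} f \, ds = \int_\Omega \Delta v \, f \, dx + \int_\Omega \nabla v \cdot \nabla f \, dx. \tag{23}$$
Proceeding as in Theorem 1, we have that
$$\begin{aligned}
\int_\Omega \nabla v \cdot \nabla f \, dx &= \int_\Omega \nabla\hat v_h \cdot \nabla r_h f \, dx + \int_\Omega \nabla\hat v_h \cdot \nabla(f - r_h f) \, dx + \int_\Omega \nabla(v - \hat v_h) \cdot \nabla f \, dx \\
&= \int_\Omega \nabla\hat v_h \cdot \nabla r_h f \, dx + o(1),
\end{aligned} \tag{24}$$
where we have used the fact that $r_h f \to f$ and $\hat v_h \to v$ in $H^1(\Omega)$. By taking account of equation (24) in equation (23) and observing that
$$\int_\Omega \nabla\hat v_h \cdot \nabla r_h f \, dx = -\sum_{j=1}^{N_h} \operatorname{trace} H_h \hat v_h(x_j) \, f(x_j) \, |\hat T_j|,$$
cf. equations (4) and (8), it follows that
$$\int_{\partial\Omega} v_{,n} f \, ds = \int_\Omega \Delta v \, f \, dx - \sum_{j=1}^{N_h} \operatorname{trace} H_h \hat v_h(x_j) \, f(x_j) \, |\hat T_j| + o(1).$$
Hence, by using obvious inequalities and passing to the limit we get
$$\Big|\int_{\partial\Omega} v_{,n} f \, ds\Big| \le \Big(\sup_h \|H_h \hat v_h\|_{L^2(\Omega)} + \|\Delta v\|_{L^2(\Omega)}\Big) \|f\|_{L^2(\Omega)}.$$
Then, by density,
$$\Big|\int_{\partial\Omega} v_{,n} f \, ds\Big| \le \Big(\sup_h \|H_h \hat v_h\|_{L^2(\Omega)} + \|\Delta v\|_{L^2(\Omega)}\Big) \|f\|_{L^2(\Omega)} \tag{25}$$
for all $f \in H^1(\Omega)$. This implies that $v_{,n} = 0$ on $\partial\Omega$ because, for every $f \in H^1(\Omega)$, it is always possible to construct a sequence $\{f_k\}$ such that $f - f_k \in H_0^1(\Omega)$ and $f_k \to 0$ in $L^2(\Omega)$. Thus, $v \in H_0^2(\Omega)$. ✷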
THEOREM 4. Assume that PROPERTY (⋆) holds. For every $v \in H_0^2(\Omega)$ there exists a sequence $\{\hat v_h\} \subset X_{0h}$ such that
$$\hat v_h \to v \quad \text{in } H^1(\Omega),$$
and
$$H_h \hat v_h \to \nabla^2 v \quad \text{in } L^2(\Omega).$$
Proof. We start considering $v \in C_0^\infty(\Omega)$ and proceed as in Theorem 2. Let $\hat v_h := r_h v$. Obviously, $\hat v_h \to v$ in $H^1(\Omega)$ and the analysis that leads to equation (21) keeps on holding for the internal nodes. On the other hand, for $h$ small enough $v(x_j) = \nabla v(x_j) = 0$ at the boundary nodes. Thus, equations (19) and (20) hold true for every node. It follows that equation (21) applies to every node as well and thence we have that
$$\limsup_h \|H_h \hat v_h\|_{L^2(\Omega)}^2 \le \big\|\nabla^2 v\big\|_{L^2(\Omega)}^2. \tag{26}$$
One of the implications of Theorem 1 is that $H_h \hat v_h \rightharpoonup \nabla^2 v$ in $L^2(\Omega)$, as we have already noticed. Thus, we conclude from (26) that the convergence is indeed strong. By repeating the argument of Theorem 2, the thesis follows from the density of $C_0^\infty(\Omega)$ in $H_0^2(\Omega)$. ✷
5. External Approximations of Quadratic Functionals: the Equilibrium
Problem for Anisotropic Elastic Plates
We use the results of Sections 3 and 4 to prove the convergence of a direct nonconforming approximation scheme for quadratic variational problems involving the Hessian. Namely, we adopt the format of Γ-convergence theory in order to prove that a suitable sequence of discrete functionals $\{F_h\}$ defined in the spaces $X_{0h}$ Γ-converges in an appropriate topology to the functional $F$ that describes the problem we wish to approximate. In particular, since the limit functional we shall consider has a unique minimizer and the discrete functionals are equicoercive, one of the key properties of Γ-convergence stated in Theorems 7.8 and 7.24 of [9] applies:
$$\min F(v) = \lim_h \min F_h(v) \tag{27}$$
and
$$u_h \to u, \tag{28}$$
$u$ and $u_h$ being the minimizers of $F$ and $F_h$, respectively. Therefore, the $u_h$ can be used in order to approximate $u$. To compute them, a sequence of discrete unconstrained minimum problems for the functionals $F_h$ has to be solved, and we can use standard techniques to do it. It is fair to say that the customary perspective of Γ-convergence is reversed here, since the theory is used for validating approximation schemes rather than for getting a characterization of the limit problem, as is more common in other types of applications.
Let us consider the equilibrium problem for an elastic homogeneous anisotropic plate under transverse loads
$$C_{\alpha\beta\gamma\delta} \, u_{,\gamma\delta\alpha\beta} = f \ \text{ in } \Omega \subset \mathbb{R}^2, \qquad B_0 u = 0 \ \text{ on } \partial\Omega, \qquad B_1 u = 0 \ \text{ on } \partial\Omega, \tag{29}$$
where $f$ describes the applied loads, $C_{\alpha\beta\gamma\delta}$ are the components of the elasticity tensor $C$ of the plate and $B_0$ and $B_1$ are suitable boundary operators. We assume that $C$ has both the minor and major symmetries
$$C_{\alpha\beta\gamma\delta} = C_{\beta\alpha\gamma\delta} = C_{\alpha\beta\delta\gamma} = C_{\gamma\delta\alpha\beta} \tag{30}$$
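and is strictly positive, that is,
$$CK \cdot K \ge m |K|^2 \qquad \forall K \in \mathrm{Sym} \tag{31}$$
for some positive real number $m > 0$. Sym stands for the space of two by two symmetric matrices. For our purposes below we also assume that $f \in H^{-1}(\Omega)$.
As to the boundary conditions, we focus on two cases:
$$u = u_{,n} = 0 \ \text{ on } \partial\Omega \qquad \text{(clamping conditions)} \tag{32}$$
and
$$u = C_{\alpha\beta\gamma\delta} \, u_{,\gamma\delta} \, n_\alpha n_\beta = 0 \ \text{ on } \partial\Omega \qquad \text{(simple support conditions).} \tag{33}$$
To apply the techniques of Γ-convergence it is convenient to study the variational formulation of problem (29). Let us define
$$F(v) := \int_\Omega C \nabla^2 v \cdot \nabla^2 v \, dx - 2\langle f, v \rangle,$$
having denoted by $\langle \cdot, \cdot \rangle$ the duality pairing between $H^{-1}(\Omega)$ and $H_0^1(\Omega)$. Then, the solution of problem (29) can be found by minimizing $F$ among all $v \in H_E^2(\Omega)$, with $H_E^2(\Omega) = H_0^2(\Omega)$ in the clamping case and $H_E^2(\Omega) = H^2(\Omega) \cap H_0^1(\Omega)$ in the simple support case. We shall extend in fact $F$ to $L^2(\Omega)$ by letting it take the value $+\infty$ in $L^2(\Omega) \setminus H_E^2(\Omega)$, that is, we set
$$F(v) := \begin{cases} \displaystyle\int_\Omega C \nabla^2 v \cdot \nabla^2 v \, dx - 2\langle f, v \rangle & \text{if } v \in H_E^2(\Omega), \\ +\infty & \text{if } v \in L^2(\Omega) \setminus H_E^2(\Omega). \end{cases} \tag{34}$$
Accordingly, we introduce the sequences of discrete functionals
$$F_h(v) := \begin{cases} \displaystyle\int_\Omega C H_h v \cdot H_h v \, dx - 2\langle f, v \rangle & \text{if } v \in X_{0h}, \\ +\infty & \text{if } v \in L^2(\Omega) \setminus X_{0h}, \end{cases} \tag{35}$$
in the clamping case, and
$$F_h(v) := \begin{cases} \displaystyle\int_\Omega C \mathring{H}_h v \cdot \mathring{H}_h v \, dx - 2\langle f, v \rangle & \text{if } v \in X_{0h}, \\ +\infty & \text{if } v \in L^2(\Omega) \setminus X_{0h}, \end{cases} \tag{36}$$
in the simple support case.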
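The structure of the discrete minimum problem can be illustrated in one space dimension. The following sketch is a hypothetical 1D analogue of (36), with $C = 1$, a uniform mesh of the unit interval and a unit load, none of which come from the paper: the generalized second derivative at an internal node is the centered second difference, the boundary nodal values vanish, and the unconstrained minimizer of the resulting quadratic functional is obtained from its stationarity condition. It is compared with the closed-form solution of $u'''' = 1$ with $u = u'' = 0$ at both ends.

```python
import numpy as np

def solve_discrete_simple_support(n_intervals, load):
    """Minimize  h * sum_j (D2 v)_j^2 - 2 h * sum_j f_j v_j  over the interior
    nodal values, with v = 0 at both ends (1D analogue of (36) with C = 1)."""
    h = 1.0 / n_intervals
    x = np.linspace(0.0, 1.0, n_intervals + 1)[1:-1]     # interior nodes
    m = x.size
    # Second-difference matrix acting on interior values (zero boundary values).
    D2 = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
          + np.diag(np.ones(m - 1), -1)) / h**2
    f = load(x)
    # Stationarity of the quadratic functional: D2^T D2 v = f.
    v = np.linalg.solve(D2.T @ D2, f)
    return x, v

x, v = solve_discrete_simple_support(100, lambda s: np.ones_like(s))
exact = (x**4 - 2.0 * x**3 + x) / 24.0   # solves u'''' = 1, u = u'' = 0 at x = 0, 1
print(np.max(np.abs(v - exact)))
```

No Lagrange multiplier or auxiliary variable is introduced: only the nodal values of the piecewise-linear function are unknowns, which mirrors the unconstrained character of the scheme discussed above.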
The next theorem follows from the theory discussed so far.
THEOREM 5. Let $C$ be a strictly positive symmetric fourth-order tensor and let $f \in H^{-1}(\Omega)$. Assume that PROPERTY (⋆) holds. Let $F$ be defined as in (34) and $F_h$ as in (35) or (36) in the clamping and simple support case, respectively. Then, $F_h$ Γ-converges to the functional $F$ with respect to the $L^2(\Omega)$ topology.
Proof. The two cases can be treated alike, so let us consider the clamping case for illustration. Since $L^2(\Omega)$ is a metric space we can use the sequential characterization of Γ-convergence:
"$\Gamma(L^2)\text{-}\lim_h F_h = F$ if and only if the following conditions are satisfied:
(i) $\forall v \in L^2(\Omega)$ and $\forall \{v_h\} \subset L^2(\Omega)$, $v_h \to v$ in $L^2(\Omega)$: $F(v) \le \liminf_h F_h(v_h)$,
(ii) $\forall v \in L^2(\Omega)$, $\exists \{v_h\} \subset L^2(\Omega)$, $v_h \to v$ in $L^2(\Omega)$: $F(v) \ge \limsup_h F_h(v_h)$."
We refer to the book of Dal Maso [9] for more details. Let us call the two requirements the lim-inf inequality and the recovery sequence condition, respectively.
We start by proving the lim-inf inequality. Let $v_h$ be a sequence in $L^2(\Omega)$ converging to $v$ in the $L^2$ norm. If $\liminf_h F_h(v_h) = +\infty$ there is nothing to prove. Hence, suppose $\liminf_h F_h(v_h) < +\infty$. By passing to a subsequence, if necessary, we have that $\sup_h |F_h(v_h)| < +\infty$ and hence $\hat v_h := v_h \in X_{0h}$. Moreover, using simple inequalities and Lemma 4 we find
$$F_h(\hat v_h) \ge m \|H_h \hat v_h\|_{L^2(\Omega)}^2 - c \|\nabla\hat v_h\|_{L^2(\Omega)} \ge c_1 \|\nabla\hat v_h\|_{L^2(\Omega)}^2 - c \|\nabla\hat v_h\|_{L^2(\Omega)} \tag{37}$$
with $c_1$ a positive constant. From these inequalities we deduce that $\sup_h \|\nabla\hat v_h\|_{L^2(\Omega)} < +\infty$ and that $\sup_h \|H_h \hat v_h\|_{L^2(\Omega)} < +\infty$. Then, by Theorem 3, we obtain that
$$v \in H_0^2(\Omega), \qquad \hat v_h \to v \quad \text{in } H^1(\Omega),$$
and
$$H_h \hat v_h \rightharpoonup \nabla^2 v \quad \text{in } L^2(\Omega).$$
Since
$$0 \le C\big(H_h \hat v_h - \nabla^2 v\big) \cdot \big(H_h \hat v_h - \nabla^2 v\big) = C H_h \hat v_h \cdot H_h \hat v_h - 2 C H_h \hat v_h \cdot \nabla^2 v + C \nabla^2 v \cdot \nabla^2 v,$$
it follows that
$$C H_h \hat v_h \cdot H_h \hat v_h \ge 2 C H_h \hat v_h \cdot \nabla^2 v - C \nabla^2 v \cdot \nabla^2 v,$$
and using this inequality we find
$$\liminf_h F_h(\hat v_h) = \liminf_h \int_\Omega C H_h \hat v_h \cdot H_h \hat v_h \, dx - 2 \lim_h \langle f, \hat v_h \rangle \ge \int_\Omega C \nabla^2 v \cdot \nabla^2 v \, dx - 2\langle f, v \rangle = F(v).$$
The existence of a recovery sequence is an immediate consequence of Theorem 4, and thus we have finished the proof. ✷
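REMARK 3. It is possible to extend the analysis and allow the datum $f$ to be in $(H_E^2(\Omega))^*$, the dual of $H_E^2(\Omega)$, rather than just in $H^{-1}(\Omega)$. Indeed, by an easy application of the Lax–Milgram lemma, for every $f \in (H_E^2(\Omega))^*$ there exist $f^0 \in L^2(\Omega)$, $f^1 \in L^2(\Omega; \mathbb{R}^2)$ and $f^2 \in L^2(\Omega; \mathbb{R}^{2 \times 2})$ such that
$$\langle f, v \rangle = \int_\Omega \big(f^0 v + f^1 \cdot \nabla v + f^2 \cdot \nabla^2 v\big) \, dx$$
for every $v \in H_E^2(\Omega)$. Here, $\langle \cdot, \cdot \rangle$ denotes the duality pairing between $H_E^2(\Omega)$ and its dual $(H_E^2(\Omega))^*$. It is then natural to define
$$\langle f, \hat v \rangle_h := \int_\Omega \big(f^0 \hat v + f^1 \cdot \nabla\hat v + f^2 \cdot H_h \hat v\big) \, dx$$
for every $\hat v \in X_{0h}$. Note that if $\hat v_h \rightharpoonup v$ in $H^1(\Omega)$ and $H_h \hat v_h \rightharpoonup \nabla^2 v$ in $L^2(\Omega)$ then
$$\langle f, \hat v_h \rangle_h \to \langle f, v \rangle.$$
Thence for $f \in (H_E^2(\Omega))^*$ the previous theorem still holds provided we replace $\langle f, v \rangle$ with $\langle f, v \rangle_h$ in the definition of $F_h$, i.e., in equations (35) and (36).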
6. External Approximations of General Convex Functionals
The results of Section 5 extend to more general functionals. Giving symbols the same meaning they had in the previous sections, here we consider functionals of the form
$$F(v) := \int_\Omega W(x, v, \nabla v, \nabla^2 v) \, dx - 2\langle f, v \rangle \tag{38}$$
to be minimized in $H_E^2(\Omega)$. We assume that $W: \Omega \times \mathbb{R} \times \mathbb{R}^2 \times \mathbb{R}^{2 \times 2} \to [0, +\infty)$ is a function satisfying the following requirements:
(H1) $W$ is a Carathéodory function;
(H2) $W(x, s, \xi, \cdot)$ is convex for a.e. $(x, s, \xi) \in \Omega \times \mathbb{R} \times \mathbb{R}^2$;
(H3) there exist two positive constants $c$, $C$ such that
$$c |\Xi|^2 \le W(x, s, \xi, \Xi) \le C \big(1 + |\Xi|^2\big)$$
for a.e. $(x, s, \xi) \in \Omega \times \mathbb{R} \times \mathbb{R}^2$.
To stay with a mechanical interpretation, this class of problems encompasses the equilibrium of nonhomogeneous and nonlinearly elastic plates, including the case where the plate is supported by a Winkler foundation, for instance. Since the continuity of $W$ in $x$ is not required, the theory is applicable to materials with inclusions of different materials.
The goal of this section is to study the approximation of this kind of functionals in spaces $X_{0h}$ by using the generalized notion of the Hessian introduced
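above. As before, we imagine $F$ extended to $L^2(\Omega)$ by defining it equal to $+\infty$ in $L^2(\Omega) \setminus H_E^2(\Omega)$ and introduce the sequences of discrete functionals
$$F_h(v) := \begin{cases} \displaystyle\sum_{j=1}^{N_h} W\big(x_j, v(x_j), \nabla_h v(x_j), H_h v(x_j)\big) \, |\hat T_j| - 2\langle f, v \rangle & \text{if } v \in X_{0h}, \\ +\infty & \text{if } v \in L^2(\Omega) \setminus X_{0h} \end{cases} \tag{39}$$
in the clamping case, or
$$F_h(v) := \begin{cases} \displaystyle\sum_{j=1}^{N_h} W\big(x_j, v(x_j), \nabla_h v(x_j), \mathring{H}_h v(x_j)\big) \, |\hat T_j| - 2\langle f, v \rangle & \text{if } v \in X_{0h}, \\ +\infty & \text{if } v \in L^2(\Omega) \setminus X_{0h} \end{cases} \tag{40}$$
in the simple support case. In formulae (39) and (40) we have set
$$\nabla_h v(x_j) := \frac{1}{|\hat T_j|} \int_{\hat T_j} \nabla v \, dx.$$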
THEOREM 6. Assume that (H1)–(H3) and PROPERTY (⋆) hold. Let $f \in H^{-1}(\Omega)$, and let $F$ and $F_h$ be the functionals defined in (38) and (39), or (40), according to the studied case. Then $F_h$ Γ-converges to the functional $F$, with respect to the $L^2(\Omega)$ topology.
To prove the theorem above we shall use the following well known result:
LEMMA 5 (see [8]). Let $\Omega$ be a bounded open set of $\mathbb{R}^n$. Let $g: \Omega \times \mathbb{R}^m \times \mathbb{R}^N \to [0, +\infty]$ be a Carathéodory function. Let
$$G(u, \xi) := \int_\Omega g\big(x, u(x), \xi(x)\big) \, dx.$$
Assume that $g(x, u, \cdot)$ is convex and that $u_k \to u$ in $L^2(\Omega)$ and $\xi_k \rightharpoonup \xi$ in $L^2(\Omega)$. Then
$$\liminf_k G(u_k, \xi_k) \ge G(u, \xi).$$
Proof of Theorem 6. Again the proof is similar for the two kinds of boundary conditions, so let us consider the clamping case. By using equation (7) and defining
\[
c_h\,\mathrm{id}(x) := \sum_{j=1}^{N_h} x_j\,\chi_{T_j}(x), \qquad
\nabla_h v(x) := \sum_{j=1}^{N_h} \nabla_h v(x_j)\,\chi_{T_j}(x)
\]
we can write
\[
F_h(v) :=
\begin{cases}
\displaystyle\int_\Omega W(c_h\,\mathrm{id}, c_h v, \nabla_h v, H_h v)\,dx - 2\langle f, v\rangle & \text{if } v \in X_0^h,\\[1ex]
+\infty & \text{otherwise,}
\end{cases}
\]
where $\mathrm{id}\colon \mathbb{R}^2 \to \mathbb{R}^2$ is the identity function.
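We first prove the lim-inf inequality. Let $v_h$ be a sequence in $L^2(\Omega)$ converging to $v$ in the $L^2$ norm. If $\liminf_h F_h(v_h) = +\infty$ there is nothing to prove. Hence, suppose $\liminf_h F_h(v_h) < +\infty$. By using (H3) and proceeding as in Theorem 5 we deduce that $\hat v_h := v_h \in X_0^h$, $v \in H_0^2(\Omega)$,
\[
\hat v_h \to v \ \text{ in } H^1(\Omega) \qquad\text{and}\qquad H_h \hat v_h \rightharpoonup \nabla^2 v \ \text{ in } L^2(\Omega).
\]
By the Scorza–Dragoni theorem (cf. [14]), for every $\varepsilon > 0$ there exists a compact subset $K_\varepsilon$ of $\Omega$ such that $|\Omega \setminus K_\varepsilon| < \varepsilon$ and $W$ restricted to $K_\varepsilon \times \mathbb{R} \times \mathbb{R}^2 \times \mathbb{R}^{2\times 2}$ is continuous. Now, observing that $c_h\,\mathrm{id} \to \mathrm{id}$ in $L^\infty(\Omega)$, $c_h \hat v_h \to v$ in $L^2(\Omega)$, and $\nabla_h \hat v_h \to \nabla v$ in $L^2(\Omega)$, we have that $u_h := (c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h) \to u := (\mathrm{id}, v, \nabla v)$ in $L^2(\Omega)$; hence by applying Lemma 5 we find
\[
\liminf_h F_h(\hat v_h) \ge \liminf_h \int_\Omega \chi_{K_\varepsilon}(x)\,W(u_h, H_h \hat v_h)\,dx - 2\lim_h \langle f, \hat v_h\rangle
\ge \int_\Omega \chi_{K_\varepsilon}(x)\,W(u, \nabla^2 v)\,dx - 2\langle f, v\rangle.
\]
Letting $\varepsilon$ go to zero and applying Fatou's lemma we obtain
\[
\liminf_h F_h(\hat v_h) \ge \int_\Omega W(u, \nabla^2 v)\,dx - 2\langle f, v\rangle
= \int_\Omega W(x, v, \nabla v, \nabla^2 v)\,dx - 2\langle f, v\rangle = F(v).
\]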
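We now prove the existence of a recovery sequence. If $v$ does not belong to $H_0^2(\Omega)$ there is nothing to prove. So let us suppose that $v \in H_0^2(\Omega)$. By Theorem 4 there exists a sequence $\hat v_h \in X_0^h$ such that
\[
\hat v_h \to v \ \text{ in } H^1(\Omega) \qquad\text{and}\qquad H_h \hat v_h \to \nabla^2 v \ \text{ in } L^2(\Omega).
\]
It immediately follows that $c_h \hat v_h \to v$ in $L^2(\Omega)$, and $\nabla_h \hat v_h \to \nabla v$ in $L^2(\Omega)$. Moreover, by passing to a subsequence, if necessary, we may also suppose that $c_h \hat v_h$, $\nabla_h \hat v_h$ and $H_h \hat v_h$ converge almost everywhere to $v$, $\nabla v$ and $\nabla^2 v$, respectively. Since
\[
\lim_h \langle f, \hat v_h\rangle = \langle f, v\rangle
\]
it suffices to prove that
\[
\limsup_h \int_\Omega W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h)\,dx \le \int_\Omega W(x, v, \nabla v, \nabla^2 v)\,dx. \tag{41}
\]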
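Observing that the integrand below is nonnegative, by using (H3) and Fatou's lemma we have
\[
\begin{aligned}
\liminf_h \int_\Omega \bigl[ C\bigl(1 + |H_h \hat v_h|^2\bigr) - W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h) \bigr]\,dx
&\ge \liminf_h \int_{K_\varepsilon} \bigl[ C\bigl(1 + |H_h \hat v_h|^2\bigr) - W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h) \bigr]\,dx\\
&= \int_{K_\varepsilon} \bigl[ C\bigl(1 + |\nabla^2 v|^2\bigr) - W(x, v, \nabla v, \nabla^2 v) \bigr]\,dx,
\end{aligned}
\]
where $K_\varepsilon$ is the compact subset of $\Omega$ defined in the first part of the proof. Letting $\varepsilon$ go to zero and applying Fatou's lemma we find
\[
\liminf_h \int_\Omega \bigl[ C\bigl(1 + |H_h \hat v_h|^2\bigr) - W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h) \bigr]\,dx
\ge \int_\Omega \bigl[ C\bigl(1 + |\nabla^2 v|^2\bigr) - W(x, v, \nabla v, \nabla^2 v) \bigr]\,dx. \tag{42}
\]
On the other hand we also have
\[
\begin{aligned}
\liminf_h \int_\Omega \bigl[ C\bigl(1 + |H_h \hat v_h|^2\bigr) - W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h) \bigr]\,dx
&\le \limsup_h \int_\Omega C\bigl(1 + |H_h \hat v_h|^2\bigr)\,dx
+ \liminf_h \int_\Omega -W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h)\,dx\\
&= \int_\Omega C\bigl(1 + |\nabla^2 v|^2\bigr)\,dx - \limsup_h \int_\Omega W(c_h\,\mathrm{id}, c_h \hat v_h, \nabla_h \hat v_h, H_h \hat v_h)\,dx.
\end{aligned}
\tag{43}
\]
Combining equations (42) and (43) we obtain equation (41), and thus we have
completed the proof.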
✷
7. Convergence of the Minimizers
As pointed out at the beginning of Section 5, the functionals we have considered
have properties that ensure the convergence of the minimizers of $F_h$ to the minimizers of the functional $F$ in the $L^2$ topology, by general properties of Γ-convergence.
Here, however, that convergence is stronger. The considerations made in this short
section apply to the clamping case as well as to the simple support case. For
simplicity we discuss only the former.
Let $\hat u_h$ be a minimizer of the discrete functional $F_h$ defined by either one of
equations (35) and (39). Then, due to the equicoercivity of the functionals $F_h$, to
Lemma 4 and to Poincaré's inequality, the sequence $\hat u_h$ is bounded in
$H^1(\Omega)$, and hence it has a weakly convergent subsequence (not relabeled) converging to a function $u \in H^1(\Omega)$ which is a minimizer of $F$ by equation (27).
Indeed from Theorem 1 we deduce that
\[
\hat u_h \to u \ \text{ in } H^1(\Omega) \qquad\text{and}\qquad H_h \hat u_h \rightharpoonup \nabla^2 u \ \text{ in } L^2(\Omega). \tag{44}
\]
The next theorem shows that the generalized Hessian of $\hat u_h$ also converges strongly
in $L^2(\Omega)$ to the Hessian of $u$. By equation (27) this is obviously true for the
quadratic case, but a similar result also holds for the minimizers of the functional
defined by equation (39), provided the potential $W$ is strictly convex in the last
variable.
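THEOREM 7. Assume the hypotheses of Theorem 6 hold. Moreover, suppose there
exist a continuous function $\mathcal{W}\colon \Omega \times \mathbb{R} \times \mathbb{R}^2 \times \mathbb{R}^{2\times 2} \to \mathbb{R}^{2\times 2}$ and a strictly positive
constant $\gamma > 0$ such that
\[
W(x, s, \xi, \Xi_2) \ge W(x, s, \xi, \Xi_1) + \mathcal{W}(x, s, \xi, \Xi_1) \cdot (\Xi_2 - \Xi_1) + \gamma\,|\Xi_2 - \Xi_1|^2, \tag{45}
\]
for a.e. $(x, s, \xi, \Xi_1, \Xi_2) \in \Omega \times \mathbb{R} \times \mathbb{R}^2 \times \mathbb{R}^{2\times 2} \times \mathbb{R}^{2\times 2}$. Then if $\hat u_h$ is the minimizer
of the functional $F_h$ defined in equation (39) we have that
\[
\hat u_h \to u \ \text{ in } H^1(\Omega) \qquad\text{and}\qquad H_h \hat u_h \to \nabla^2 u \ \text{ in } L^2(\Omega),
\]
where $u$ is the unique minimizer of the functional $F$ defined in equation (38).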
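For a concrete instance of inequality (45), consider the quadratic density obtained by multiplying the squared Frobenius norm of the last argument by a positive constant. The sketch below is only an illustration under that assumption (the density, the constant `mu` and all function names are hypothetical, not taken from the paper): it evaluates the gap between the two sides of (45), which for this density equals `(mu - gamma)` times the squared distance between the two matrices, so the inequality holds exactly when `gamma <= mu`.

```python
def frob_inner(A, B):
    """Frobenius inner product A . B = tr(A B^T) of two equally sized matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def mat_sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def W_quad(Xi, mu):
    """Hypothetical density W(Xi) = mu * |Xi|^2 (independent of x, s, xi)."""
    return mu * frob_inner(Xi, Xi)


def dW_quad(Xi, mu):
    """Derivative of W_quad with respect to Xi, i.e. 2 * mu * Xi."""
    return [[2.0 * mu * x for x in row] for row in Xi]


def convexity_gap(Xi1, Xi2, gamma, mu):
    """Left-hand side minus right-hand side of (45) for W_quad.

    A nonnegative value means that (45) holds for this pair of matrices.
    """
    D = mat_sub(Xi2, Xi1)
    return (W_quad(Xi2, mu) - W_quad(Xi1, mu)
            - frob_inner(dW_quad(Xi1, mu), D)
            - gamma * frob_inner(D, D))
```

With `gamma = mu` the gap vanishes identically, which shows that the constant in (45) cannot be improved for this density.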
The proof of this theorem is standard; we include it here just for the reader’s
convenience.
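Proof. Because of (44) we just have to prove that $H_h \hat u_h \to \nabla^2 u$ in $L^2(\Omega)$. From
assumption (45) it follows that
\[
F_h(\hat u_h) \ge \int_\Omega \Bigl[ W(c_h\,\mathrm{id}, \hat u_h, \nabla_h \hat u_h, \nabla^2 u) + \gamma\,|H_h \hat u_h - \nabla^2 u|^2
+ \mathcal{W}(c_h\,\mathrm{id}, \hat u_h, \nabla_h \hat u_h, \nabla^2 u) \cdot (H_h \hat u_h - \nabla^2 u) \Bigr]\,dx - 2\langle f, \hat u_h\rangle.
\]
From the convexity of $W$ in the last variable and hypothesis (H3), we deduce
(cf. [8]) that $|\mathcal{W}(x, s, \xi, \Xi)| \le C(1 + |\Xi|)$ for a.e. $(x, s, \xi) \in \Omega \times \mathbb{R} \times \mathbb{R}^2$, and
from Lebesgue's convergence theorem we find that $\mathcal{W}(c_h\,\mathrm{id}, \hat u_h, \nabla_h \hat u_h, \nabla^2 u) \to \mathcal{W}(x, u, \nabla u, \nabla^2 u)$ in $L^2(\Omega)$. Hence, passing to the limit in the inequality above
and using the fact that $F_h(\hat u_h) \to F(u)$, we deduce that
\[
F(u) \ge F(u) + \gamma \limsup_h \int_\Omega |H_h \hat u_h - \nabla^2 u|^2\,dx,
\]
which concludes the proof.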
✷
Acknowledgements
A substantial part of this work was completed while C.D. was a visiting professor
at the University of Kentucky and R.P. was a visiting fellow at the University
of Oxford, within the European TMR project “Phase Transitions in Crystalline
Solids.” The authors thank Prof. C.-S. Man and Prof. J. M. Ball for providing them
with an appropriate context in which to carry it out.
References
1. M. Angelillo, A. Fortunato and F. Fraternali, The lumped stress method and the discrete-continuum approximation, Part I: Theory. Internat. J. Solids Struct. 39 (2002) 6211–6240.
2. M. Angelillo, A. Fortunato and F. Fraternali, The lumped stress method and the discrete-continuum approximation, Part II: Applications. Ibid.
3. S. Balasundaram and P.K. Bhattacharyya, A mixed finite element method for fourth order partial differential equations. Z. angew. Math. Mech. 66 (1986) 489–499.
4. F. Brezzi and M. Fortin, Mixed and Hybrid Finite Element Methods. Springer, New York (1991).
5. M. Carriero, A. Leaci and F. Tomarelli, Special bounded Hessian and elastic–plastic plate. Rend. Accad. Naz. Sci. XL Mem. Mat. 16 (1992) 223–258.
6. P.G. Ciarlet, The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam (1978).
7. P.G. Ciarlet, Basic error estimates for elliptic problems. In: P.G. Ciarlet and J.L. Lions (eds), Handbook of Numerical Analysis. North-Holland, Amsterdam (1991).
8. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, New York (1989).
9. G. Dal Maso, An Introduction to Γ-Convergence. Birkhäuser, Boston (1993).
10. C. Davini, Note on a parameter lumping in the vibrations of elastic beams. Rend. Ist. Matematica Università di Trieste 28 (1996) 83–99.
11. C. Davini, Γ-convergence of external approximations in boundary value problems involving the bi-Laplacian. J. Comput. Appl. Math. 140 (2002) 185–208.
12. C. Davini and I. Pitacco, Relaxed notions of curvature and a lumped strain method for elastic plates. SIAM J. Numer. Anal. 35 (1998) 677–691.
13. C. Davini and I. Pitacco, An unconstrained mixed method for the biharmonic problem. SIAM J. Numer. Anal. 38 (2000) 820–836.
14. I. Ekeland and G. Temam, Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976).
15. R. Glowinski, Approximations externes, par éléments finis de Lagrange d’ordre un et deux, du problème de Dirichlet pour l’opérateur biharmonique. Méthode itérative de résolution des problèmes approchés. In: J.J.H. Miller (ed.), Topics in Numerical Analysis. Academic Press, New York (1973) 123–171.
16. N. Nataraj, P.K. Bhattacharyya, S. Balasundaram and S. Gopalsamy, On a mixed-hybrid finite element method for anisotropic plate bending problems. Internat. J. Numer. Methods Engrg. 39 (1996) 4063–4089.
17. J.E. Roberts and J.-M. Thomas, Mixed and hybrid methods. In: P.G. Ciarlet and J.L. Lions (eds), Handbook of Numerical Analysis. North-Holland, Amsterdam (1991).
Static Deformations of a Linear Elastic Porous
Body Filled with an Inviscid Fluid
F. DELL’ISOLA¹, G. SCIARRA² and R.C. BATRA³
¹ Dip. Ingegneria Strutturale e Geotecnica, Università degli Studi di Roma “La Sapienza”,
via Eudossiana 18, 00184 Roma, Italy. E-mail: Francesco.Dellisola@uniroma1.it
² Dip. Ingegneria Chimica, dei Materiali, delle Materie Prime e Metallurgia, Università degli Studi
di Roma “La Sapienza”, via Eudossiana 18, 00184 Roma, Italy. E-mail: giulio.sciarra@uniroma1.it
³ Department of Engineering Science and Mechanics, M/C 0219, Virginia Polytechnic Institute and
State University, Blacksburg, VA 24061, U.S.A. E-mail: rbatra@vt.edu
Received 20 September 2002; in revised form 19 September 2003
Abstract. We study infinitesimal deformations of a porous linear elastic body saturated with an
inviscid fluid and subjected to conservative surface tractions. The gradient of the mass density of the
solid phase is also taken as an independent kinematic variable and the corresponding higher-order
stresses are considered. Balance laws and constitutive relations for finite deformations are reduced
to those for infinitesimal deformations, and expressions for partial surface tractions acting on the
solid and the fluid phases are derived. A boundary-value problem for a long hollow porous solid
cylinder filled with an ideal fluid is solved, and the stability of the stressed reference configuration
with respect to variations in the values of the coefficient coupling deformations of the two phases is
investigated. An example of the problem studied is a cylindrical cavity leached out in salt formations
for storing hydrocarbons.
Mathematics Subject Classifications (2000): 74F10, 74F20.
Key words: solid–fluid mixture, conservative tractions, principle of virtual power, partial tractions,
fluid-filled cylindrical cavity, stability analysis.
R.C. Batra dedicates this work with deep respect and admiration to Professor
C.A. Truesdell, a superb teacher and an excellent friend.
1. Introduction
Simple models of a mechanical system comprised of a deformable porous solid
matrix filled with a compressible fluid have been developed by Fillunger [22],
Biot [11], Truesdell [40] and Müller [30]. In these works a spatial point is simultaneously occupied by all constituents. This is readily comprehensible for gaseous
mixtures [30] and fluid solutions [41].
Observations on fluid-saturated solids have shown higher values of fluid percolation through pores of the solid matrix than those predicted by the aforestated
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 243–264.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
models (see, e.g., [13] or [33]). The increase of percolation is possibly not only due
to the higher externally applied pressure but also due to the opening of pores in the
vicinity of the boundary (see [17]). This provides a justification for endowing the
theory of mixtures with the volume fraction concept (see, e.g., [18]); this additional
scalar parameter can describe the nonstandard dilatation effects (see, e.g., [38]) as
follows. Once the mixture is modelled from a microscopic point of view and the
reference configuration of the solid matrix is required to be periodic, then those
dilatations of pores which do not involve global deformations of cells are captured
by the volume fraction.
In this paper we consider a binary mixture involving a second gradient solid
and a perfect fluid; we refer the reader to [23, 24, 28] for the relationship between
micro-structural and second gradient theories. Our approach is close to that of the
volume fraction concept as the latter reduces to the former once a suitable constraint among the enlarged set of state parameters is assumed [34]. For example,
for an incompressible solid constituent, it is easy to see [25] how a mixture model
endowed with the volume fraction concept transforms into a binary mixture whose
solid constituent has second gradient constitutive relations. After formulating a
general problem, we study deformations of a porous hollow linear elastic cylinder
filled with a perfect fluid. In particular, by assuming that the internal energy density
can be split into a part involving the first gradient of the displacements and a part
involving second-order gradients of the displacements, we perform a parametric analysis of the density profiles of the solid matrix with respect to a suitable energetic
coupling coefficient between the solid and the fluid. We limit our analysis to the
case when the external tractions applied on both constituents are conservative and
can be derived from a potential.
We also discuss stability of the prestressed reference configuration of the hollow
cylinder with respect to perturbations of the aforementioned energy coupling coefficient; an energetic criterion is proposed for this analysis. The distance in the space
of mixture configurations is described in terms of the total energy which equals the
sum of the mixture deformation energy and the potential of surface tractions.
Batra et al. [5–9] have analyzed numerically finite transient thermomechanical
deformations of a homogeneous body with the Cauchy stress and higher-order
stresses depending upon gradients of deformation.
2. Formulation of the Problem
Material particles of the fluid and the solid are identified respectively by their position vectors $X^{(f)}$ and $X^{(s)}$ in fixed reference configurations $\Omega_0^{f}$ and $\Omega_0^{s}$. We presume
that, at any time $t$, particles of both constituents occupy the same position $x$ in the
present configuration $\Omega$. The velocity $v^{(\alpha)}$ ($\alpha = f, s$) of the material particle $X^{(\alpha)}$
is defined by
\[
v^{(\alpha)} = \frac{d^{(\alpha)} u^{(\alpha)}(X^{(\alpha)}, t)}{dt}, \tag{1}
\]
where $d^{(\alpha)}/dt$ denotes the material time derivative following the motion of $X^{(\alpha)}$ and
$u^{(\alpha)}$ is the displacement of the $\alpha$th constituent from its reference configuration. Let
$\rho^{(f)}$ and $\rho^{(s)}$ denote, respectively, the apparent mass densities of the fluid and the
solid; then the mass density $\rho$ of the mixture equals $\rho^{(f)} + \rho^{(s)}$. The mean or
barycentric velocity $v$ of the mixture is defined by
\[
\rho v = \rho^{(f)} v^{(f)} + \rho^{(s)} v^{(s)}. \tag{2}
\]
Details of the theory of mixtures are given in [13, 19, 27, 29–31, 33, 40].
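As a small numerical illustration of definition (2), the following sketch computes the mixture density and the barycentric velocity from the apparent densities and the constituent velocities at one spatial point. The function name and the sample values are hypothetical.

```python
def barycentric_velocity(rho_f, v_f, rho_s, v_s):
    """Mixture density and mass-weighted mean velocity, following (2).

    rho_f, rho_s: apparent mass densities of the fluid and the solid.
    v_f, v_s: velocity components of the two constituents at the same point.
    """
    rho = rho_f + rho_s
    if rho <= 0.0:
        raise ValueError("mixture density must be positive")
    v = tuple((rho_f * a + rho_s * b) / rho for a, b in zip(v_f, v_s))
    return rho, v


# Example: a heavier solid moving along y and a lighter fluid moving along x.
rho, v = barycentric_velocity(1.0, (4.0, 0.0, 0.0), 3.0, (0.0, 4.0, 0.0))
```

The mean velocity is weighted towards the constituent with the larger apparent density, here the solid.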
2.1. BALANCE LAWS
We presume that there is no interconversion of mass between the solid and the
fluid. In the spatial description of motion, the balance of mass for each constituent
is given by
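\[
\frac{d^{(\alpha)} \rho^{(\alpha)}}{dt} + \rho^{(\alpha)} \operatorname{div} v^{(\alpha)} = 0, \tag{3}
\]
where $\operatorname{div} v = \operatorname{tr}(\operatorname{grad} v)$, and grad denotes derivatives with respect to coordinates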
in the present configuration. We use the principle of virtual power to derive the
balance of linear momentum and the boundary conditions for each constituent.
That is, we postulate that
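\[
\begin{aligned}
\int_\Omega \bigl( m^{(s)} \cdot \bar v^{(s)} + m^{(f)} \cdot \bar v^{(f)} + T^{(s)} \cdot \nabla \bar v^{(s)} - p^{(f)} \operatorname{div} \bar v^{(f)} + \Pi^{(s)} \cdot \nabla\nabla \bar v^{(s)} \bigr)\,dV
&= \int_\Omega \bigl( b^{(s)} \cdot \bar v^{(s)} + b^{(f)} \cdot \bar v^{(f)} \bigr)\,dV\\
&\quad + \int_{\partial\Omega} \Bigl( t^{(s)} \cdot \bar v^{(s)} + t^{(f)} \cdot \bar v^{(f)} + \tau^{(s)} \cdot \frac{\partial \bar v^{(s)}}{\partial n} \Bigr)\,dA.
\end{aligned}
\tag{4}
\]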
Here $m^{(\alpha)}$ is the bulk solid-fluid interaction force, $T^{(s)}$ the partial Cauchy stress in
the solid, $p^{(f)}$ the hydrostatic pressure in the fluid (we assume that the fluid is ideal;
therefore the partial Cauchy stress in it is spherical), $\Pi^{(s)}$ the second-order stress in
the solid, $\nabla$ the gradient operator with respect to coordinates in the present configuration, $b^{(\alpha)}$ the density of partial body forces, $t^{(\alpha)}$ the partial surface tractions,
$\tau^{(s)}$ the traction corresponding to the second-order stress tensor in the solid, $\bar v^{(s)}$
the virtual velocity in the solid that vanishes on the part of the boundary of the
solid where essential boundary conditions⋆ are prescribed, $\bar v^{(f)}$ the virtual velocity
in the fluid that vanishes on the part of the boundary of the fluid where essential
boundary conditions are specified, $a \cdot b$ the inner product between tensors $a$ and
$b$ of the same order, and $\partial \bar v^{(s)}/\partial n$ is the directional derivative of $\bar v^{(s)}$ along the
outward unit normal $n$ to the boundary $\partial\Omega$ of $\Omega$. The effect of inertia forces is
included in the density of body forces. The physical meaning of $\Pi^{(s)}$ and $\tau^{(s)}$ can be
described in a way similar to that done in different contexts in [20, 23]. We note that
the external action $\tau^{(s)}$ can be regarded as the sum of two different contributions:
the first is a doubly normal double force, i.e., an external areal action which
works on the rate of opening, $(\nabla v^{(s)} \cdot n \otimes n)$, along the outward unit normal $n$ of
pores on the boundary; the other is a tangential couple working on the vorticity
of the apparent velocity of the solid; this nomenclature is due to Germain [23].⋆
The areal action is also considered in the Cosserat model for granular materials
(see, e.g., [21]) and in the present problem vanishes. However, the doubly normal
double force plays an important role in the dilatancy phenomenon studied here.
⋆ For a boundary-value problem with field equations involving derivatives of order 2m, boundary
conditions involving derivatives of order at most (m−1) are called essential; others are called natural.
A motivation for considering higher-order stresses in the solid will be provided
below. We note that capillary-type forces in the fluid, modelled in the literature by
second-order stresses (see, e.g., [14–16, 26]), have been neglected here to keep the
analysis tractable. Whereas we have included in equation (4) the internal supplies
m(f) and m(s) of the linear momentum, the internal supplies of the moment of
momentum have been neglected. This is consistent with the assumption that the
stress in the fluid is a hydrostatic pressure. The symmetry of T(s) follows from
equation (4) by setting
v̄(f) = v̄(s) = velocity field of a rigid body motion,
that is, from the objectivity of the left-hand side of (4) which also implies that the
sum of m(f) and m(s) equals zero.
By using the divergence theorem and exploiting the fact that equation (4) must
hold for all virtual velocities vanishing on the part of the boundary where essential
boundary conditions are given, we obtain the following set of field equations and
boundary conditions:
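\begin{align}
\operatorname{div}\bigl( T^{(s)} - \operatorname{div} \Pi^{(s)} \bigr) - m^{(s)} + b^{(s)} &= 0 && \text{in } \Omega, \tag{5}\\
-\nabla p^{(f)} - m^{(f)} + b^{(f)} &= 0 && \text{in } \Omega, \tag{6}\\
m^{(s)} + m^{(f)} &= 0 && \text{in } \Omega, \tag{7}\\
\bigl( T^{(s)} - \operatorname{div} \Pi^{(s)} \bigr) n - \operatorname{div}_s\bigl( \Pi^{(s)} n \bigr) &= t^{(s)} && \text{on } \partial_1\Omega, \tag{8}\\
\bigl( \Pi^{(s)} n \bigr) n &= \tau^{(s)} && \text{on } \partial_1\Omega, \tag{9}\\
v^{(s)} &= \hat v^{(s)} && \text{on } \partial_2\Omega, \tag{10}\\
-p^{(f)} n &= t^{(f)} && \text{on } \partial_3\Omega, \tag{11}\\
v^{(f)} \cdot n &= \hat v^{(f)} && \text{on } \partial_4\Omega. \tag{12}
\end{align}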
Here $\operatorname{div}_s$ is the surface divergence on $\partial\Omega$, and $\partial_1\Omega$ and $\partial_2\Omega$ are complementary
parts of the boundary $\partial\Omega$ of $\Omega$, where natural and essential boundary conditions,
respectively, are prescribed for the solid; a similar interpretation holds for $\partial_3\Omega$ and
$\partial_4\Omega$ for the fluid.
⋆ Here $a \otimes b$ is the tensor product between the $n$th order tensor $a$ and the $m$th order tensor $b$
defined as $(a \otimes b)c = (b \cdot c)a$ for every $m$th order tensor $c$; $A \cdot B = \operatorname{tr}(AB^T)$ for tensors $A$ and $B$.
2.2. CONSTITUTIVE RELATIONS
The balance laws (5) and (6) are to be supplemented by constitutive relations; we
express these in terms of the internal energy. We presume that the mixture is at
a uniform temperature, the constituents are deformed quasistatically so that their
kinetic energy can be neglected, and no energy is dissipated. The continuum model
we use in order to describe deformations of a solid matrix saturated with a fluid
must account for the strain energy associated with the gradient of the mass density
of the solid. At the macroscopic level, the dependence of the strain energy on the
gradient of deformation describes the effects of the opening of neighboring pores
on the pore cluster which is modelled as a solid material point. We assume that
the internal energy density can be split into two parts: a part that depends upon
the “local” deformation of the solid and the fluid particles and another part that
depends upon a “nonlocal” measure of deformation of the solid particles; the latter
is taken to be proportional to $|\nabla\rho^{(s)}|^2$. Thus we write the balance of energy as
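\[
\begin{aligned}
\frac{d}{dt} \int_\Omega \rho \Bigl( \varepsilon\bigl(\rho^{(f)}, F^{(s)}, X^{(s)}\bigr) + \frac{\lambda_s}{2\rho} \bigl|\nabla\rho^{(s)}\bigr|^2 \Bigr)\,dV
&= \int_\Omega \bigl( b^{(s)} \cdot v^{(s)} + b^{(f)} \cdot v^{(f)} \bigr)\,dV\\
&\quad + \int_{\partial\Omega} \Bigl( t^{(s)} \cdot v^{(s)} + t^{(f)} \cdot v^{(f)} + \tau^{(s)} \cdot \frac{\partial v^{(s)}}{\partial n} \Bigr)\,dA,
\end{aligned}
\tag{13}
\]
which because of (4) implies that
\[
\begin{aligned}
\frac{d}{dt} \int_\Omega \rho \Bigl( \varepsilon\bigl(\rho^{(f)}, F^{(s)}, X^{(s)}\bigr) + \frac{\lambda_s}{2\rho} \bigl|\nabla\rho^{(s)}\bigr|^2 \Bigr)\,dV
= \int_\Omega \bigl( & m^{(s)} \cdot v^{(s)} + m^{(f)} \cdot v^{(f)} + T^{(s)} \cdot \nabla v^{(s)} - p^{(f)} \operatorname{div} v^{(f)}\\
& + \Pi^{(s)} \cdot \nabla\nabla v^{(s)} \bigr)\,dV.
\end{aligned}
\tag{14}
\]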
Here $F^{(s)} = \operatorname{Grad} x = \partial x/\partial X$ is the deformation gradient for the solid, $\lambda_s > 0$ is
a material parameter with units of N m⁶/kg², and $d/dt$ signifies the material
time derivative following the mean motion of the mixture.
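As a quick dimensional check (a sketch added for orientation, assuming that the energy density $\varepsilon$ is measured per unit mass of the mixture, so that $\rho\varepsilon$ is an energy per unit volume), the stated units of $\lambda_s$ make the gradient term in (13) an energy per unit volume as well:

```latex
\[
[\lambda_s]\,\bigl[\,|\nabla\rho^{(s)}|^2\,\bigr]
= \frac{\mathrm{N\,m^6}}{\mathrm{kg^2}} \cdot \frac{\mathrm{kg^2}}{\mathrm{m^8}}
= \frac{\mathrm{N}}{\mathrm{m^2}}
= \frac{\mathrm{J}}{\mathrm{m^3}} .
\]
```

Here the density gradient carries units of kg/m⁴, being a mass density divided by a length.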
By using the Reynolds transport theorem, the left-hand side of equation (14) can
be represented as a linear functional of the velocity fields of the two constituents.
Thus the following constitutive equations for the partial Cauchy stresses $T^{(a)}$, the
solid-fluid interaction forces $m^{(a)}$, $a = s, f$, and the second-order stress $\Pi^{(s)}$
associated with the solid constituent must hold:
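\begin{align}
T^{(s)} &= \rho\,\frac{\partial\varepsilon}{\partial F^{(s)}}\,F^{(s)T} - \lambda_s \Bigl[ \frac{f^{ss}}{2}\bigl(1 + \xi^{(f)}\bigr) I + \nabla\rho^{(s)} \otimes \nabla\rho^{(s)} \Bigr], \tag{15}\\
p^{(f)} &= \rho\,\rho^{(f)}\,\frac{\partial\varepsilon}{\partial\rho^{(f)}} - \frac{\lambda_s}{2}\,f^{ss}\,\xi^{(f)}, \tag{16}\\
\Pi^{(s)} &= -\lambda_s\,\rho^{(s)}\,I \otimes \nabla\rho^{(s)}, \tag{17}\\
m^{(s)} = -m^{(f)} &= -\rho \Bigl[ \xi^{(f)} \bigl(\nabla F^{(s)}\bigr)^{T} \frac{\partial\varepsilon}{\partial F^{(s)}} + \xi^{(f)} F^{(s)-T} \frac{\partial\varepsilon}{\partial X^{(s)}} - \xi^{(s)} \frac{\partial\varepsilon}{\partial\rho^{(f)}} \nabla\rho^{(f)} + \frac{\lambda_s}{2\rho} \nabla\bigl( \xi^{(f)} f^{ss} \bigr) \Bigr], \tag{18}
\end{align}
where
\[
f^{ss} = \nabla\rho^{(s)} \cdot \nabla\rho^{(s)}, \tag{19}
\]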
and $\xi^{(f)}$ is the mass fraction of the fluid phase. Note that the partial Cauchy stress
tensor $T^{(s)}$ is symmetric and $\partial\varepsilon/\partial F^{(s)}$ equals the partial first Piola–Kirchhoff stress
tensor of the solid constituent. Equations (15)–(18) are derived in [34] where Germain’s [23] arguments are used to obtain constitutive relations for a second gradient
porous matrix filled with an ideal fluid. Equation (13) does not include all features
of a general second gradient linear elastic matrix; only density gradients have been
assumed to affect the internal potential energy and contributions of other components of the third order tensor, Grad F(s) , have not been considered. Furthermore,
we only analyze static deformations of the mixture. Thus Darcy-type drag forces
are not modeled.
2.3. SPLITTING OF EXTERNAL SURFACE TRACTIONS INTO PARTIAL
TRACTIONS
We consider problems for which $b^{(s)} = b^{(f)} = 0$, i.e., there are only external surface
tractions. In a physical problem, total surface tractions are prescribed either on a
part or on all of the boundary of the region $\Omega$. Here we require that these tractions
be assigned on all of the boundary of the mixture in the current configuration.
However, the solution of the boundary-value problem defined by equations (5)–(12) requires that the partial surface tractions be specified. In order to
find the partial tractions we assume the existence of a potential function such that
\[
\int_{\partial\Omega} \Bigl( t^{(s)} \cdot v^{(s)} + t^{(f)} \cdot v^{(f)} + \tau^{(s)} \cdot \frac{\partial v^{(s)}}{\partial n} \Bigr)\,dA
= \frac{d}{dt} \int_\Omega \psi^{ext}\bigl( x, \rho^{(s)}, \rho^{(f)}, \nabla\rho^{(s)} \bigr)\,dV. \tag{20}
\]
The external surface tractions for which equation (20) holds are conservative. It
is readily apparent that not all conservative surface tractions are characterized
by equation (20). Here we consider only those surface tractions that satisfy
equation (20) with a potential $\psi^{ext}$ that depends upon deformations of the solid only through $\rho^{(s)}$
and $\nabla\rho^{(s)}$. Equation (20) is dictated by the intended application of studying static
deformations of an annular cylindrical porous region filled with an inviscid fluid.
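Requiring that equation (20) hold for all choices of the velocity field, we obtain
\begin{align}
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}} - \operatorname{div}\frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} &= C^{s} && \text{in } \Omega, \tag{21}\\
\frac{\partial\psi^{ext}}{\partial\rho^{(f)}} &= C^{f} && \text{in } \Omega, \tag{22}
\end{align}
\begin{align}
t^{(s)} ={}& \Bigl[ -\frac{\partial\psi^{ext}}{\partial\rho^{(s)}}\,\rho^{(s)} + \xi^{(s)}\psi^{ext} - \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \cdot \nabla\rho^{(s)} + \operatorname{div}\Bigl( \rho^{(s)} \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \Bigr) \notag\\
& \quad + \rho^{(s)} \Bigl( \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \cdot n \Bigr) \operatorname{tr}(\nabla^{s} n) - \Bigl( \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \cdot n \Bigr) \frac{\partial\rho^{(s)}}{\partial n} \Bigr] n \notag\\
& + \rho^{(s)}\,\nabla^{s}\Bigl( \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \cdot n \Bigr) && \text{on } \partial\Omega, \tag{23}\\
t^{(f)} ={}& \Bigl( -\frac{\partial\psi^{ext}}{\partial\rho^{(f)}}\,\rho^{(f)} + \xi^{(f)}\psi^{ext} \Bigr) n && \text{on } \partial\Omega, \tag{24}\\
\tau^{(s)} ={}& -\Bigl( \rho^{(s)} \frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} \cdot n \Bigr) n && \text{on } \partial\Omega, \tag{25}
\end{align}
where $\nabla^{s}$ is the surface gradient on $\partial\Omega$. Equations (21) and (22) with $C^{s}$ and $C^{f}$
as constants are necessary conditions for the existence of a $\psi^{ext}$ for which $b^{(s)} =
b^{(f)} = 0$.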
Once the constitutive relations for external actions are specified, which in our
model is equivalent to specifying ψ ext , the partial surface tractions can be expressed
in terms of the total surface tractions; see [33] for details. The consideration of
conservative external tractions limits the space of admissible surface tractions.
The often used assumption characterizing the partial surface tractions in terms of
the volume fraction of the constituents and the total surface tractions cannot be
deduced from our work. It is because no state parameter in addition to the solid
and the fluid placement maps has been introduced.
For a nonpolar medium whose response does not depend upon second-order
displacement gradients, Batra [4] showed that surface tractions cannot depend upon
∂u/∂n where u is the displacement of a point.
Results of this section can be summarized as follows: (i) a variational principle describing static deformations of a solid–fluid mixture is formulated, (ii) the
Euler–Lagrange equations are deduced from the principle, and are recognized as
the balance laws and the constitutive relations for the solid–fluid mixture, (iii) the
external and internal actions are specified by requiring that they be conservative.
3. Solution of a Boundary-Value Problem
We analyze, within a linearized second gradient theory, static infinitesimal deformations of a long hollow porous cylinder filled with an inviscid fluid and with the
inner and the outer surfaces subjected to uniform external pressures $p_1^{ext}$ and $p_2^{ext}$,
respectively. We assume that the pressure on the inner and the outer surfaces of
the cylinder, in the reference configuration, equals $p_{01}^{ext}$ and $p_{02}^{ext}$, respectively, and
postulate that
\[
\begin{aligned}
\varepsilon = \frac{1}{\rho^{0}} \Bigl\{ & T^{0(s)} \cdot H^{(s)} + \gamma^{0f} \Delta\rho^{(f)} + \frac{1}{2}\,\mathbb{C}\bigl[H^{(s)}\bigr] \cdot H^{(s)} - \frac{1}{4}\,H^{(s)} \cdot \bigl( T^{0(s)} H^{(s)T} - H^{(s)} T^{0(s)} \bigr)\\
& + \frac{1}{8}\,H^{(s)} \cdot \bigl( H^{(s)} T^{0(s)} - T^{0(s)} H^{(s)} + T^{0(s)} H^{(s)T} - H^{(s)T} T^{0(s)} \bigr)\\
& + \frac{1}{2}\,\gamma^{ff} \bigl( \Delta\rho^{(f)} \bigr)^2 + \Delta\rho^{(f)}\,K^{sf} \cdot H^{(s)} \Bigr\},
\end{aligned}
\tag{26}
\]
where
\[
H^{(s)} := \nabla u^{(s)}, \qquad \Delta\rho^{(f)} = \rho^{(f)} - \rho^{0(f)}, \tag{27}
\]
$\rho^{0(f)}$ is the mass density of the fluid in the reference configuration, $T^{0(s)}$ a symmetric tensor representing the partial stress in the solid in the reference configuration,
$\mathbb{C}$ is the classical elasticity tensor for the solid constituent mapping symmetric
second-order tensors into symmetric second-order tensors, and $\gamma^{0f}$, $\gamma^{ff}$ and $K^{sf} = K^{sfT}$
are material parameters.
Terms involving the second-order tensor $T^{0(s)}$ and the scalar $\gamma^{0f}$ in equation (26) represent contributions to the internal potential energy by the prestress
in the solid and the fluid constituents. Since a fluid can be in equilibrium only if
it is confined, it is necessary to consider a pre-stressed reference configuration.
Equation (26) is the most general expression that yields linear constitutive relations
for a pre-stressed solid-fluid mixture. Equations (15)–(18) imply that terms of order
one in $H^{(s)}$ and $\Delta\rho^{(f)}$ yield zeroth-order terms for the solid stress tensor and the fluid
pressure, and the second-order terms in $H^{(s)}$ and $\Delta\rho^{(f)}$ provide the first-order terms
for the solid stress tensor, the fluid pressure, and the bulk internal action $m^{(s)}$.
The second-order tensor Ksf accounts for the interaction between the solid and
the fluid phases because of deformations of pores; Ksf is not necessarily a spherical
tensor even when the solid and the fluid constituents are isotropic. It is because
pores need not be of uniform shapes and sizes.
The coupling coefficient Ksf can be explained in terms of the pore dilatancy as
follows: the dilatation of pores induced by the injection of the solid or the fluid
into a suitable elementary reference volume either deforms the solid constituent or
changes the mass density of the fluid or both.
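Denoting the first term on the right-hand side of equation (15) by $\bar T^{(s)}$ and using
(26) we obtain
\[
\begin{aligned}
\bar T^{(s)} ={}& T^{0(s)} + \frac{\Delta\rho^{(f)}}{\rho^{0}}\,T^{0(s)} - \xi^{0(s)} \operatorname{tr}\bigl(H^{(s)}\bigr)\,T^{0(s)} + \frac{1}{2} \bigl( T^{0(s)} H^{(s)T} + H^{(s)} T^{0(s)T} \bigr)\\
& + \frac{1}{2} \bigl( W^{(s)} T^{0(s)} - T^{0(s)} W^{(s)} \bigr) + \mathbb{C}\bigl[E^{(s)}\bigr] + \Delta\rho^{(f)} K^{sf},
\end{aligned}
\tag{28}
\]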
where $E^{(s)} := (H^{(s)} + H^{(s)T})/2$ and $W^{(s)} := (H^{(s)} - H^{(s)T})/2$. If each component
of the initial stress, $T^{0(s)}$, is much less than the smallest value of an elastic modulus
of the solid phase, then terms such as $T^{0(s)} H^{(s)T}$ can be neglected in equation (28).
We assume the following form for the elasticity tensor applied to $E^{(s)}$:
\[
\mathbb{C}\bigl[E^{(s)}\bigr] = \lambda \operatorname{tr}\bigl(E^{(s)}\bigr) I + 2\mu E^{(s)}, \tag{29}
\]
where $\lambda$ and $\mu$ are the Lamé constants for the solid. One can deduce expressions
for $p^{(f)}$ and $m^{(s)}$ by substituting for $\varepsilon$ from (26) into (16) and (18).
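The split of the displacement gradient into its symmetric and skew parts and the isotropic law (29) can be sketched as follows. This is an illustration only: the function names and the sample values of the Lamé constants are hypothetical, and the prestress terms of (28) are not included.

```python
def sym_skew(H):
    """Return (E, W): the symmetric and skew parts of a square matrix H."""
    n = len(H)
    E = [[0.5 * (H[i][j] + H[j][i]) for j in range(n)] for i in range(n)]
    W = [[0.5 * (H[i][j] - H[j][i]) for j in range(n)] for i in range(n)]
    return E, W


def isotropic_stress(E, lam, mu):
    """Apply the isotropic elasticity tensor of (29): lam * tr(E) * I + 2 * mu * E."""
    n = len(E)
    trace = sum(E[i][i] for i in range(n))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * E[i][j]
             for j in range(n)] for i in range(n)]


# Example: a small plane-strain displacement gradient (third row and column zero).
H = [[0.02, 0.02, 0.0],
     [0.00, 0.01, 0.0],
     [0.00, 0.00, 0.0]]
E, W = sym_skew(H)
S = isotropic_stress(E, lam=2.0, mu=1.0)
```

In this plane-strain example the out-of-plane normal stress `S[2][2]` is nonzero even though the corresponding strain vanishes, because of the term proportional to the trace.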
We assume that deformations of the mixture are plane strain, and in the plane
of deformation they are independent of the angular position. That is, in cylindrical
coordinates the two in-plane physical components ur and uθ of the displacement
are functions of the radial coordinate r only and uz = 0; we call such a deformation field radially symmetric. This assumption is reasonable because the body
and the boundary conditions are radially symmetric. Henceforth we use cylindrical
coordinates with orthonormal vectors er and eθ at a point in the radial and the
circumferential directions, respectively, and work in terms of physical components
of stresses. In order to find boundary conditions on the solid and the fluid phases,
we consider external tractions given by the following expression for $\psi^{ext}$:
\[
\psi^{ext} = C^{s} \rho^{(s)} + C^{f} \rho^{(f)} + \bigl( \mathbf{C}^{int}(r, \theta) \cdot \nabla\rho^{(s)} \bigr) \Delta\rho^{(s)} + p_0 + p_1 r. \tag{30}
\]
Here $\Delta\rho^{(s)} = \rho^{(s)} - \rho^{0(s)}$, $\rho^{0(s)}$ is the density of the solid phase in the reference
configuration, and $\mathbf{C}^{int}(r, \theta)$ is a solenoidal vector field whose radial component is
independent of $r$ and $\theta$. It is easy to verify that the aforementioned assumptions
on $\mathbf{C}^{int}(r, \theta)$ suffice to satisfy equation (21). Bearing in mind the hypothesis of
radially symmetric deformations for both phases, we have
\[
\bigl( \mathbf{C}^{int}(r, \theta) \cdot \nabla\rho^{(s)} \bigr) \Delta\rho^{(s)} = \bigl( \mathbf{C}^{int}(r, \theta) \cdot ( \rho^{(s)\prime} e_r ) \bigr) \Delta\rho^{(s)} = C^{int} \rho^{(s)\prime} \Delta\rho^{(s)}, \tag{31}
\]
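where $C^{int}$ is the constant radial component of $\mathbf{C}^{int}(r, \theta)$ and a prime indicates differentiation with respect to $r$. Substitution from (30) and (31) into equations (23)–(25) gives
\begin{align}
t^{(s)} &= \Bigl[ \xi^{(s)} \psi^{ext} - C^{s} \rho^{(s)} - C^{int} \Delta\rho^{(s)} \rho^{(s)\prime} + \frac{C^{int}}{r}\,\rho^{(s)} \Delta\rho^{(s)} \Bigr] n, \tag{32}\\
t^{(f)} &= \bigl[ \xi^{(f)} \psi^{ext} - C^{f} \rho^{(f)} \bigr] n, \tag{33}\\
\tau^{(s)} &= -\bigl( \rho^{(s)} \Delta\rho^{(s)} C^{int} \delta \bigr) n, \tag{34}
\end{align}
where $\delta = +1$ on the external surface and $\delta = -1$ on the internal surface. Note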
that values of C s , C f and C int depend upon the shape of the bounding surface,
constituents of the mixture, and on the interaction between the mixture and the
medium surrounding it. For example, at an impermeable wall, C int = 0 because
externally applied tractions of type (34) vanish; as a matter of fact, the absence of
fluid flux implies that the fluid belonging to the cluster of pores near the boundary
rests at a uniform pressure. In other words the external world cannot access pores
within the body and dilate them.
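The claim that the potential (30) satisfies equation (21) can be checked in a few lines. The following sketch rests on two assumptions that are not spelled out in the text: the reference density $\rho^{0(s)}$ is uniform, so that $\nabla\Delta\rho^{(s)} = \nabla\rho^{(s)}$, and the last factor of the third term of (30) is the density increment $\Delta\rho^{(s)} = \rho^{(s)} - \rho^{0(s)}$.

```latex
\begin{align*}
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}} &= C^{s} + \mathbf{C}^{int}\cdot\nabla\rho^{(s)},
\qquad
\frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} = \Delta\rho^{(s)}\,\mathbf{C}^{int},\\
\operatorname{div}\bigl(\Delta\rho^{(s)}\,\mathbf{C}^{int}\bigr)
&= \mathbf{C}^{int}\cdot\nabla\rho^{(s)} + \Delta\rho^{(s)}\operatorname{div}\mathbf{C}^{int}
= \mathbf{C}^{int}\cdot\nabla\rho^{(s)},\\
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}} - \operatorname{div}\frac{\partial\psi^{ext}}{\partial(\nabla\rho^{(s)})} &= C^{s}.
\end{align*}
```

The second line uses only the fact that the vector field is solenoidal. Equation (22) holds as well, since the only term of (30) that depends on the fluid density is linear in it with coefficient $C^{f}$.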
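Because of our interest in studying infinitesimal deformations under the action
of higher-order tractions (or double forces without moments), we retain terms in
(32)–(34) that are bilinear in $\rho^{(s)\prime}$ and $\Delta\rho^{(s)}$; the remaining terms are either linear
in $\rho^{(s)}$ and $\rho^{(f)}$ or independent of $\rho^{(s)}$ and $\rho^{(f)}$. We thus get the following linearized
constitutive characterizations of external tractions and double forces:⋆
\begin{align}
t^{(s)} ={}& \bigl[ -\xi^{0(f)} \xi^{0(s)} \rho^{0} ( C^{s} - C^{f} ) + \xi^{0(s)} \bar p_0 \bigr] n \notag\\
& + \Bigl[ \Bigl( -(\xi^{0(f)})^2 ( C^{s} - C^{f} ) + \xi^{0(f)} \frac{\bar p_0}{\rho^{0}} + \frac{C^{int}}{r}\,\rho^{0(s)} \Bigr) \Delta\rho^{(s)} \notag\\
& \quad + \Bigl( -(\xi^{0(s)})^2 ( C^{s} - C^{f} ) - \xi^{0(s)} \frac{\bar p_0}{\rho^{0}} \Bigr) \Delta\rho^{(f)} + \xi^{0(s)} ( \tilde p_0 + r p_1 ) \Bigr] n, \tag{35}\\
t^{(f)} ={}& \bigl[ \xi^{0(f)} \xi^{0(s)} \rho^{0} ( C^{s} - C^{f} ) + \xi^{0(f)} \bar p_0 \bigr] n \notag\\
& + \Bigl[ \Bigl( (\xi^{0(s)})^2 ( C^{s} - C^{f} ) + \xi^{0(s)} \frac{\bar p_0}{\rho^{0}} \Bigr) \Delta\rho^{(f)} \notag\\
& \quad + \Bigl( (\xi^{0(f)})^2 ( C^{s} - C^{f} ) - \xi^{0(f)} \frac{\bar p_0}{\rho^{0}} \Bigr) \Delta\rho^{(s)} + \xi^{0(f)} ( \tilde p_0 + r p_1 ) \Bigr] n, \tag{36}\\
\tau^{(s)} ={}& -\rho^{0(s)} C^{int} \Delta\rho^{(s)} \delta\, n. \tag{37}
\end{align}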
Here ξ 0(f) and ξ 0(s) are mass fractions of both phases in the initial configuration,
ρ 0 is the apparent density of the mixture in this configuration, p̄0 is the reference
value of p0 and p1 equals its infinitesimal variation. Note that terms in the first
brackets in the representation formulas of t(s) and t(f) describe tractions applied on
the boundary of the mixture in the reference configuration, and terms in the second
brackets are infinitesimal increments of tractions. Similarly equation (37) states
that the initial double force vanishes but its infinitesimal increment is nonzero.
Let R1 and R2 denote the inner and the outer radii of the porous cylinder in the
reference configuration. It is clear from equation (28) that T0(s) is the partial stress
in the solid in the reference configuration. Boundary conditions for the solid in the
reference configuration are
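$$\mathbf{T}^{0(s)}(R_1)\,\mathbf{n} = \mathbf{t}^{(s)}(R_1) = -d_{01}^{(s)}\,p_{01}^{\mathrm{ext}}\,\mathbf{n}, \tag{38}$$
$$\mathbf{T}^{0(s)}(R_2)\,\mathbf{n} = \mathbf{t}^{(s)}(R_2) = -d_{02}^{(s)}\,p_{02}^{\mathrm{ext}}\,\mathbf{n}, \tag{39}$$
where $d_{01}^{(s)}$ and $d_{02}^{(s)}$ denote fractions of the externally applied pressures, $p_{01}^{\mathrm{ext}}$ and $p_{02}^{\mathrm{ext}}$, carried by the solid phase on the inner and the outer surfaces of the porous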
⋆ We have assumed that the small parameter defining the linearization procedure is same for the
solid and the fluid kinematical descriptors. Thus small deformations of the solid matrix are associated
with small variations of the fluid apparent density.
cylindrical body. Through these equations a representation formula for both areal
fractions splitting external pressure into partial tractions is obtained. Let the state
of stress, T0(s) and p0(f) , be
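$$\mathbf{T}^{0(s)} = \Bigl(a_0 - \frac{b_0}{r^2}\Bigr)\mathbf{e}_r\otimes\mathbf{e}_r + \Bigl(a_0 + \frac{b_0}{r^2}\Bigr)\mathbf{e}_\theta\otimes\mathbf{e}_\theta,$$
$$p_0^{(f)} := \gamma^{0f}\rho^{0(f)} = \mathrm{const} = \bigl(1-d_{01}^{(s)}\bigr)\,p_{01}^{\mathrm{ext}} = \bigl(1-d_{02}^{(s)}\bigr)\,p_{02}^{\mathrm{ext}}, \tag{40}$$
then
$$a_0 = -\frac{R_2^2\,d_{02}^{(s)}p_{02}^{\mathrm{ext}} - R_1^2\,d_{01}^{(s)}p_{01}^{\mathrm{ext}}}{R_2^2-R_1^2}, \qquad b_0 = -\frac{R_1^2R_2^2\bigl(d_{02}^{(s)}p_{02}^{\mathrm{ext}} - d_{01}^{(s)}p_{01}^{\mathrm{ext}}\bigr)}{R_2^2-R_1^2}. \tag{41}$$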
It is easily verified (see, e.g., [32]) that the pre-stress (40) is admissible. Equation (40)2 implies that $d_{01}^{(s)}$, $d_{02}^{(s)}$, $p_{01}^{\mathrm{ext}}$ and $p_{02}^{\mathrm{ext}}$ cannot be independently prescribed.
For example, if $d_{01}^{(s)} = d_{02}^{(s)}$ for a reference configuration, then $p_{01}^{\mathrm{ext}}$ must equal $p_{02}^{\mathrm{ext}}$,
and the stress state in the solid phase must also be that of a hydrostatic pressure, as
b0 = 0 (see equation (41)). At an impermeable wall, $d_{01}^{(s)}$ need not equal 1 because
a part of external tractions applied to the impermeable wall may be carried by the
fluid.
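The relation between the boundary conditions (38)–(39) and the constants in equation (41) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the radii are those of Table I, and the two solid-carried pressures P1 = d01 p01 and P2 = d02 p02 are illustrative values chosen only for the check.

```python
def lame_prestress(R1, R2, P1, P2):
    """Constants (a0, b0) of equation (41).

    P1 and P2 stand for the products d01*p01_ext and d02*p02_ext,
    i.e. the parts of the external pressures carried by the solid.
    """
    denom = R2**2 - R1**2
    a0 = -(R2**2 * P2 - R1**2 * P1) / denom
    b0 = -(R1**2 * R2**2) * (P2 - P1) / denom
    return a0, b0


def radial_stress(a0, b0, r):
    """Radial component of the pre-stress (40)."""
    return a0 - b0 / r**2


def hoop_stress(a0, b0, r):
    """Circumferential component of the pre-stress (40)."""
    return a0 + b0 / r**2


R1, R2 = 2.0, 20.0        # radii from Table I, in metres
P1, P2 = 12.0e6, 15.0e6   # hypothetical solid-carried pressures, in Pa

a0, b0 = lame_prestress(R1, R2, P1, P2)
# Boundary conditions (38) and (39): the radial stress equals -P1 and -P2.
err_inner = radial_stress(a0, b0, R1) + P1
err_outer = radial_stress(a0, b0, R2) + P2

# Equal solid-carried pressures give a purely hydrostatic state (b0 = 0).
a0_h, b0_h = lame_prestress(R1, R2, P1, P1)
```

With equal values of P1 and P2 the second constant vanishes and a0 reduces to the negative of the common pressure, which is the hydrostatic case mentioned in the text.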
Equations for the determination of infinitesimal displacements in the radial and
in the circumferential directions and for ρ (f) are
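$$\begin{aligned}&-\lambda_s\,\rho^{0(s)2}\Bigl(u_r'''' + \frac{2}{r}u_r''' - \frac{3}{r^2}u_r'' + \frac{3}{r^3}u_r' - \frac{3}{r^4}u_r\Bigr) + \Bigl[2\xi^{0(f)}\Bigl(a_0-\frac{b_0}{r^2}\Bigr)+\lambda+2\mu\Bigr]u_r''\\ &\quad+ \frac{1}{r}\Bigl[2\xi^{0(f)}\Bigl(a_0+\frac{b_0}{r^2}\Bigr)+\frac{2b_0}{r^2}+\lambda+2\mu\Bigr]u_r' + \frac{1}{r^2}\Bigl[-2\xi^{0(f)}\Bigl(a_0+\frac{b_0}{r^2}\Bigr)-\frac{2b_0}{r^2}-(\lambda+2\mu)\Bigr]u_r\\ &\quad+ \Bigl[\frac{1}{\rho^0}\Bigl(a_0-\frac{b_0}{r^2}\Bigr)+K_{rr}^{sf}-\xi^{0(s)}\gamma^{0(f)}\Bigr]\rho^{(f)\prime} + \frac{1}{r}\bigl(K_{rr}^{sf}-K_{\theta\theta}^{sf}\bigr)\rho^{(f)} = 0,\end{aligned} \tag{42}$$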
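$$\frac{1}{r^2}\Bigl[\Bigl(\frac{a_0}{2}+\mu\Bigr)r^2-b_0\Bigr]u_\theta'' + \frac{1}{r^3}\Bigl[\Bigl(\frac{a_0}{2}+\mu\Bigr)r^2-b_0\Bigr]u_\theta' - \frac{1}{r^4}\Bigl[\Bigl(\frac{a_0}{2}+\mu\Bigr)r^2-b_0\Bigr]u_\theta + K_{r\theta}^{sf}\Bigl(\rho^{(f)\prime}+\frac{2}{r}\rho^{(f)}\Bigr) = 0, \tag{43}$$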
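$$\begin{aligned}\bigl(2\rho^{0(f)}\gamma^{0f}+\rho^{0}\rho^{0(f)}\gamma^{ff}\bigr)\rho^{(f)} ={}& c + \rho^{0(s)}\rho^{0(f)}\gamma^{0f}\Bigl(u_r'+\frac{1}{r}u_r\Bigr)\\ &- \rho^{0}\rho^{0(f)}\Bigl[K_{rr}^{sf}u_r' + K_{r\theta}^{sf}\Bigl(u_\theta'-\frac{1}{r}u_\theta\Bigr) + K_{\theta\theta}^{sf}\frac{1}{r}u_r\Bigr]\\ &- \rho^{0(f)}\Bigl(a_0-\frac{b_0}{r^2}\Bigr)u_r' - \rho^{0(f)}\Bigl(a_0+\frac{b_0}{r^2}\Bigr)\frac{1}{r}u_r.\end{aligned} \tag{44}$$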
In equation (44), c is a constant of integration. Equation (44) can be solved for
ρ (f) and the result substituted into equations (42) and (43) to obtain two coupled
ordinary differential equations for ur and uθ . These equations belong to Heun’s
family of equations (see [3]); their solution has four poles, one at r = 0, one at
r = ∞, and locations of the other two poles depend upon the coefficient of u′′θ in
equation (43). In order for a pole to be within the hollow cylinder the following
inequalities must hold
$$R_1 < \sqrt{\frac{b_0}{a_0/2+\mu}} < R_2. \tag{45}$$
Thus depending upon the shear modulus of the solid phase, the pre-stress, and the
fraction of externally applied pressures carried by the solid constituent, the solution
may blow up at a point within the annular cylinder.
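The condition for an interior pole can be tested directly for given constants. The sketch below is illustrative only: it assumes the pole sits where the coefficient of the second derivative of the circumferential displacement in equation (43) vanishes, that is where r squared equals b0/(a0/2 + µ), and all names and numerical values are hypothetical.

```python
def pole_radius(a0, b0, mu):
    """Radius at which (a0/2 + mu) * r**2 - b0 vanishes, or None if no
    positive real radius exists (assumed reading of the pole condition)."""
    denom = a0 / 2.0 + mu
    if denom == 0.0:
        return None
    ratio = b0 / denom
    if ratio <= 0.0:
        return None
    return ratio ** 0.5


def pole_inside(a0, b0, mu, R1, R2):
    """True when the pole lies strictly between the inner and outer radii."""
    r_p = pole_radius(a0, b0, mu)
    return r_p is not None and R1 < r_p < R2


# Hypothetical values in Pa; radii in metres as in Table I.
a0_ex, b0_ex, mu_ex = -15.0e6, 6.75e8, 75.0e6
r_pole = pole_radius(a0_ex, b0_ex, mu_ex)       # square root of 10
inside = pole_inside(a0_ex, b0_ex, mu_ex, 2.0, 20.0)
```

For b0 = 0 the function returns None, in line with the statement that no pole lies within the hollow cylinder in the simplified problem.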
3.1. A SIMPLIFIED PROBLEM
Henceforth we consider the case b0 = 0; then r = 0 is a triple pole and there
is no pole within the hollow cylinder. The stress in the solid phase in the reference
configuration is a hydrostatic pressure. Thus $d_{02}^{(s)}p_{02}^{\mathrm{ext}} = d_{01}^{(s)}p_{01}^{\mathrm{ext}} = -a_0$, i.e., $-a_0$
equals the reference value of the hydrostatic pressure in the solid. Moreover p̄0
equals the negative of the external pressure applied on both the external and the
internal surfaces of the hollow cylinder:
$$p_{01}^{\mathrm{ext}} = p_{02}^{\mathrm{ext}} = -\bar p_0. \tag{46}$$
We also assume that the second order tensor Ksf is spherical: Ksf = K sf I. Thus the
strain-energy density ε provides both the dependence of the hydrostatic component
of the solid partial stress tensor on ρ (f) , and that of the hydrostatic pressure acting
in the fluid on ρ (s). An increment ρ (f) in the apparent mass density of the fluid
does not induce nonzero deviatoric stresses in the solid constituent.
3.1.1. Influence of the Coupling Coefficient K sf on Density Profiles
With the aforestated assumptions and the existence of a first integral of equation (42), equations (42)–(44) can be reduced to the following two uncoupled
ordinary differential equations:
$$-\lambda_s\,\rho^{0(s)2}\Bigl(U_r''+\frac{1}{r}U_r'\Bigr) + \Bigl[2\mu+\lambda+2\xi^{0(f)}a_0 - \frac{(\xi^{0(s)}\gamma^{0(f)}-a_0/\rho^0-K^{sf})^2}{2\gamma^{0(f)}/\rho^0+\gamma^{ff}}\Bigr]U_r = \Gamma_s, \tag{47}$$
$$U_\theta'+\frac{2}{r}U_\theta = 0, \tag{48}$$
where Γs is an integration constant to be determined by boundary conditions, and Ur =
(u′r + (1/r) ur ) = tr H(s) and Uθ = (u′θ − (1/r) uθ ) are dimensionless quantities.
In particular, Ur is related to the increment of the solid apparent density by
ρ (s) = −ρ 0(s)Ur .
(49)
According to the previous assumptions we can also specify boundary conditions
arising from equations (9)–(12) and equations (35)–(37). Recalling the structure of
balance laws (5) and (6) these conditions prescribe external radial and circumferential tractions for the solid constituent acting on the inner and the outer surfaces, the
external pressure for the fluid constituent acting on the inner (or the outer) surface
and the double forces only for the solid constituent acting on disjoint parts of the
boundary. The differential equation for the variable Ur is a Bessel equation, and
that for Uθ is the classical Euler equation.
Let us first consider equation (48) and boundary conditions corresponding to
shear external tractions for the solid constituent. As these tractions are null, see
equation (35), we get
$$U_\theta = 0 \;\Rightarrow\; u_\theta = c_\theta\,r, \tag{50}$$
cθ being a constant of integration. Thus the only admissible infinitesimal displacement in the circumferential direction is a rigid body rotation.
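The step from equation (48) to equation (50) can be written out explicitly. The short derivation below is a sketch; the constant k is introduced here for illustration and does not appear in the text, and k = 0 is taken to follow from the vanishing shear tractions.

```latex
\begin{align*}
U_\theta' + \frac{2}{r}\,U_\theta = 0
  &\;\Longrightarrow\; \bigl(r^{2} U_\theta\bigr)' = 0
  \;\Longrightarrow\; U_\theta = \frac{k}{r^{2}},\\
U_\theta = 0 \quad (k = 0)
  &\;\Longrightarrow\; u_\theta' - \frac{1}{r}\,u_\theta
     = r\Bigl(\frac{u_\theta}{r}\Bigr)' = 0
  \;\Longrightarrow\; u_\theta = c_\theta\, r .
\end{align*}
```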
In order to solve equation (47), we recall that a typical Bessel equation is
$$x^2 y''(x) + x\,y'(x) + \bigl(x^2-\upsilon^2\bigr)\,y(x) = 0,$$
and a typical modified Bessel equation is
$$x^2 y''(x) + x\,y'(x) - \bigl(x^2+\upsilon^2\bigr)\,y(x) = 0.$$
It is evident that equation (47) for Γs = 0 is either a classical or a modified Bessel
equation, according to the sign of the coefficient of Ur . Therefore, two different
solutions for the increment of the solid apparent density can be obtained. Let the
coefficient of Ur in equation (47) be defined by
$$q := 2\mu+\lambda+2\xi^{0(f)}a_0 - \frac{(\xi^{0(s)}\gamma^{0(f)}-a_0/\rho^0-K^{sf})^2}{2\gamma^{0(f)}/\rho^0+\gamma^{ff}}, \tag{51}$$
then, through a change of variable, equation (47) for Γs = 0 becomes
$$\frac{d^2U_r(\xi)}{d\xi^2}+\frac{1}{\xi}\frac{dU_r(\xi)}{d\xi} - \mathrm{sign}(q)\,U_r(\xi)=0, \qquad \xi = \sqrt{\frac{|q|}{\lambda_s\,\rho^{0(s)2}}}\;r. \tag{52}$$
Depending upon the sign of q, equation (52) is either a Bessel or a modified Bessel
equation. If q > 0 then the solution of equation (52) is given by a linear combination of modified Bessel functions I0 (ξ ) and K0 (ξ ). However, if q < 0 then
the solution of equation (52) is given by a linear combination of classical Bessel
functions J0 (ξ ) and Y0 (ξ ). The subscript zero indicates that these functions are solutions of the Bessel equation with υ = 0. Note that the standard nomenclature for
Bessel equations has been adopted. Expressions for the classical Bessel functions
J0 (ξ ) and Y0 (ξ ) and the modified Bessel functions I0 (ξ ) and K0 (ξ ) can be found
in, e.g., [1].
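The correspondence between the sign of q and the type of Bessel function can be verified numerically. The following is a minimal sketch using only the standard library: the order-zero functions J0 and I0 are evaluated from their power series, and the residual of equation (52) is estimated with central finite differences. Only the solutions regular at the origin are checked (Y0 and K0 are omitted), and the function names, evaluation point and step size are arbitrary choices.

```python
import math


def j0(x, terms=40):
    """Power series of the classical Bessel function J0."""
    return sum((-1) ** k * (x * x / 4.0) ** k / math.factorial(k) ** 2
               for k in range(terms))


def i0(x, terms=40):
    """Power series of the modified Bessel function I0."""
    return sum((x * x / 4.0) ** k / math.factorial(k) ** 2
               for k in range(terms))


def residual(U, xi, sign_q, h=1e-3):
    """Finite-difference residual of U'' + U'/xi - sign(q) * U at xi."""
    d1 = (U(xi + h) - U(xi - h)) / (2.0 * h)
    d2 = (U(xi + h) - 2.0 * U(xi) + U(xi - h)) / h ** 2
    return d2 + d1 / xi - sign_q * U(xi)


res_modified = residual(i0, 1.5, +1)    # q > 0: I0 should satisfy (52)
res_classical = residual(j0, 1.5, -1)   # q < 0: J0 should satisfy (52)
res_mismatch = residual(i0, 1.5, -1)    # wrong pairing: residual is not small
```

The first two residuals are at the level of the finite-difference error, while the mismatched pairing leaves a residual of order one.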
Starting from a vanishing coupling coefficient we conduct a parametric study
of the solution of equation (47) when |K sf | is monotonically increased. This is
essentially necessitated by our inability to identify the suitable range of values
of K sf . Even though we know that it describes the effects of pore dilatancy, there
is no experimental data available to quantify this effect. We commence from the
solution of the homogeneous equation associated with equation (47). Two different
ranges of values of the coupling coefficient can be determined. Let
$$K_{1,2}^{sf} := \xi^{0(s)}\gamma^{0(f)} - \frac{a_0}{\rho^0} \mp \sqrt{\Bigl(\frac{2\gamma^{0(f)}}{\rho^0}+\gamma^{ff}\Bigr)\bigl(2\mu+\lambda+2\xi^{0(f)}a_0\bigr)} \tag{53}$$
be the values of K sf for which q vanishes. When K sf ∈ (K1sf , K2sf ), the solution
of the homogeneous equation is a linear combination of the modified
Bessel functions, as sign(q) = 1. For K sf ∈ (− ∞, K1sf ) ∪ (K2sf , ∞), the solution is a linear combination of the classical Bessel
functions, as sign(q) = −1.
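The two critical values of the coupling coefficient follow from setting q = 0 in equation (51). A minimal sketch of that computation is given below. The shorthand A = 2µ + λ + 2ξ0(f) a0, B = ξ0(s) γ0(f) − a0/ρ0 and D = 2γ0(f)/ρ0 + γff is introduced here for illustration only, both A and D are assumed positive, and the numbers are hypothetical rather than taken from Table I.

```python
import math


def q_coefficient(K_sf, A, B, D):
    """Coefficient q of equation (51), written as A - (B - K_sf)**2 / D."""
    return A - (B - K_sf) ** 2 / D


def coupling_thresholds(A, B, D):
    """Values (K1, K2) of the coupling coefficient at which q vanishes."""
    if A <= 0.0 or D <= 0.0:
        raise ValueError("this sketch assumes A > 0 and D > 0")
    half_width = math.sqrt(A * D)
    return B - half_width, B + half_width


def bessel_type(K_sf, A, B, D):
    """Type of Bessel equation obtained from (52) for a given K_sf."""
    q = q_coefficient(K_sf, A, B, D)
    if q > 0.0:
        return "modified"
    if q < 0.0:
        return "classical"
    return "degenerate"


A, B, D = 4.0, 1.0, 1.0                    # hypothetical, dimensionless
K1, K2 = coupling_thresholds(A, B, D)      # gives -1.0 and 3.0
kind_inside = bessel_type(1.0, A, B, D)    # K_sf between K1 and K2
kind_outside = bessel_type(5.0, A, B, D)   # K_sf outside [K1, K2]
```

Values of the coupling coefficient between the two thresholds give the modified Bessel case, and values outside give the classical (oscillatory) case.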
Once equation (52) has been solved, the solution of equation (47) is obtained
by simply adding a suitable constant to it. The following figures depict the ρ (s)
profiles for a salt matrix filled with brine; values of constitutive and geometric
parameters and surface tractions applied on the boundary are listed in Table I.
Values of constitutive parameters are deduced from the test data on salt rock and
brine [37]. Note that values for the constitutive coefficients λs and C int that describe second gradient effects are arbitrarily chosen since no experimental data is
available for their determination. However, they are reasonable and can describe
the pore-opening effect near the boundary of the mixture. The variation of ρ (s) in
the boundary layer due to second gradient is about 1% of its reference value (i.e. its
value due to the first gradient model only). The plot in Figure 1 is representative of
solutions in the range (K1sf , K2sf ) and that in Figure 2 corresponds to solutions in the
range (− ∞, K1sf ) ∪ (K2sf , ∞). In Figure 1 the typical behavior of fields exhibiting
boundary layers is shown; in our model these boundary layers are driven by the
Table I. Values of parameters used in the computation of results. E and ν are Young’s modulus
and Poisson’s ratio of the solid matrix; ρ̂ 0(s) and ρ̂ 0(f) are the densities of the solid and the
fluid constituent in the reference configuration, ν 0(s) and ν 0(f) their volume fractions. In the
reference configuration the mixture is saturated.

Constitutive parameters
  E = 200 MPa
  ν = 0.33
  λs = 200 N m^4/kg^2
  γff = 1.64 · 10^6 N m^4/kg^2
  C (s) = C (f)
  C int = 1 N m^3/kg^2

Geometric and reference state properties
  ρ̂ 0(s) = 1850 kg/m^3
  ρ̂ 0(f) = 1300 kg/m^3
  ν 0(s) = 0.97
  ν 0(f) = 1 − ν 0(s) = 0.03
  ρ 0(s) = ρ̂ 0(s) ν 0(s) = 1794.5 kg/m^3
  ρ 0(f) = ρ̂ 0(f) ν 0(f) = 39 kg/m^3
  R1 = 2 m, R2 = 20 m

Tractions
  p01 ext = 20 MPa
  p02 ext = p01 ext
  p̃0 = −2.21 MPa
  p1 = 0.1 · 10^6 N/m^3
Figure 1. Qualitative ρ (s) -profiles for K sf ∈ (K1sf , K2sf ).
applied double forces. Close to the boundary of the solid matrix their effect is either
a dilatation or a compaction induced by the applied fluid pressure. In Figure 2 the
solution apparently shows wide oscillations due to the change of type occurring in
the Bessel equation (52). This is usually an indication of instability and motivates
the following analysis.
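The derived entries of Table I can be reproduced from the primary ones. The sketch below is not part of the original computation: the variable names are hypothetical, the mixture density is assumed to be the sum of the two apparent densities, the mass fractions are assumed to be the ratios of apparent densities to that sum, and the Lamé constants are obtained from E and ν with the standard isotropic relations, which are assumed to apply to the λ and µ used in the text.

```python
# Primary values from Table I (SI units).
E = 200.0e6          # Young's modulus, Pa
nu = 0.33            # Poisson's ratio
rho_hat_s = 1850.0   # true density of the solid, kg/m^3
rho_hat_f = 1300.0   # true density of the fluid, kg/m^3
vol_s = 0.97         # solid volume fraction
vol_f = 1.0 - vol_s  # fluid volume fraction (saturated mixture)

# Apparent densities, as listed in Table I.
rho_s = rho_hat_s * vol_s
rho_f = rho_hat_f * vol_f

# Assumption: mixture density and mass fractions from apparent densities.
rho_0 = rho_s + rho_f
xi_s = rho_s / rho_0
xi_f = rho_f / rho_0

# Assumption: standard isotropic conversion from (E, nu) to Lame constants.
mu = E / (2.0 * (1.0 + nu))
lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
```

The two apparent densities agree with the values 1794.5 kg/m^3 and 39 kg/m^3 given in the table.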
3.1.2. Stability of the Stressed Reference Configuration
We now investigate the stability of the reference configuration with respect to
changes in the coupling coefficient K sf . Recalling our earlier remarks on the admissible values of K sf we delineate now the range of its values which assure the
uniqueness of the solution of the elastic problem. Said differently, our goal is to
characterize the coupling coefficient K sf which ensures the structural stability of
the partial differential equations defined on the space of state parameters u(s) and
ρ (f) and describing deformations of the mixture. One could also investigate the
stability of the prestressed configuration with respect to the value of the prestress.
Figure 2. Qualitative ρ (s) -profiles for K sf ∈ (−∞, K1sf ) ∪ (K2sf , ∞). Panel (a) corresponds
to 1.75 < K < 1.81, and panel (b) to 1.83 < K < 1.9, where $K = K^{sf}/\sqrt{\mu\,(2\gamma^{0f}/\rho^0+\gamma^{ff})}$.
According to the criterion stated, for example, by Arnold [2] or Thom [39], we discuss
the possibility that for two solutions corresponding to sufficiently close values
of K sf , a homeomorphism on the space of mixture states exists transforming one
solution into the other. The norm induced by the energy inner product is used to
define the neighborhood of an element in this space.
In order that the partial differential equations of our problem fulfill the condition
of topological equivalence (see [2]) it is assumed that a solution of equations (47)
and (48) describes available transformations of a reference configuration. This
provides an admissible criterion for the stability analysis; many other choices are
possible.
The following physically meaningful energetic criterion can be used to study the
stability. It requires that for the reference equilibrium configuration to be stable, the
total energy given by the functional
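$$E_{\mathrm{tot}}\bigl(\rho^{(f)},\mathbf{F}^{(s)}\bigr) = \int\Bigl[\rho\,\varepsilon\bigl(\rho^{(f)},\mathbf{F}^{(s)}\bigr) + \frac{\lambda_s}{2}\bigl|\nabla\rho^{(s)}\bigr|^2 - \psi^{\mathrm{ext}}\bigl(\mathbf{x},\rho^{(s)},\rho^{(f)},\nabla\rho^{(s)}\bigr)\Bigr]\,dV \tag{54}$$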
be minimum in the reference configuration. Here essential boundary conditions are
assumed to be prescribed on the entire boundary. In particular, we prove that the
reference configuration is stable when the coupling coefficient lies in a suitable
subset of the open interval (K1sf , K2sf ). The mathematical reasoning parallels that
given, for example, in [12]. That is, for the reference configuration to be stable, the
second functional derivative of Etot evaluated in the reference configuration must be
positive definite. We shall prove that, when ε is prescribed by equation (26), this is
equivalent to requiring that the following spectral problem has positive eigenvalues
only.
⎧
r2
1
1
⎪
⎪
2µ + λ + 2ξ 0(f) a0 − α
r,rr + r,r − 2 1 +
⎪
⎪
⎪
r
r
λs ρ 0(s)
⎪
⎪
⎪
⎪
0(f)2 0(s) 0f
⎪
⎪
(ξ γ − a0 /ρ 0 − K sf )2
⎨ +ρ
r = 0,
α − ρ 0(f) (2ξ 0(f) γ 0f + ρ 0(f) γ ff )
(55)
⎪
⎪
0(f) 0(s) 0f
0
sf
⎪
⎪
ρ (ξ γ − a0 /ρ − K )
⎪
⎪
−
r = 0,
⎪
⎪
α − ρ 0(f) (2ξ 0(f) γ 0f + ρ 0(f) γ ff )
⎪
⎪
⎪
⎩
(2µ + a0 − α)θ = 0,
in ;
⎧
1
2
⎪
⎪
λ − 2ξ 0(s) − 1 a0 + ρ 0(s) C int − α div vs
⎪
⎪
⎪
r
⎪
⎪
⎪
⎪
⎪
⎪ + ρ 0(f) ξ 0(s)γ 0f − a0 − K sf div v
⎪
f
⎪
⎪
ρ0
⎪
⎪
⎪
⎪
⎪
a0
1
⎪
0(s)2
⎪
⎪
(vs )r,r − λs ρ
+2 µ +
r,r + r = 0,
⎪
⎨
2
r
(56)
1
δ
⎪
⎪
(2µ + a0 − α) (vs )θ,r + (vs )θ = 0,
⎪
⎪
⎪
2
r
⎪
⎪
⎪
⎪
⎪ 0(f) 0(s) 0f a0
⎪
⎪
ξ γ − 0 − K sf div vs
ρ
⎪
⎪
ρ
⎪
⎪
⎪
⎪
⎪
⎪
+ ρ 0(f) 2ξ 0(f) γ 0f + ρ 0(f) γ ff − α div vf = 0,
⎪
⎪
⎪
⎩
λs r − C int div vs = 0,
on ∂.
Here α denotes a generic eigenvalue, and quantities r , θ and are defined by
1
r := ∇(div vs ) · er = (vs )r,r + (vs )r ,
r
,r
1
1
(57)
(vs )θ,r + (vs )θ ,
θ := div(skw (∇vs )) · eθ =
2
r
,r
:= ∇(div vf ) · er ,
in terms of the virtual velocity fields vs and vf . This equivalence can be proved by
following the procedure adopted by Seppecher [36]. In a virtual motion the second
time derivative of Etot evaluated in the reference configuration is required to equal
the integral over of a suitable quadratic form multiplied by α. Consequently its
sign depends on the sign of α. In order to get conditions for a positive definite
second derivative of Etot by means of a suitable eigenvalue problem, the proper
quadratic form is
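$$\left.\frac{d^2E_{\mathrm{tot}}}{dt^2}\right|_{(\rho^{0(f)},\,\rho^{0(s)},\,\bar p_0)} = \alpha\int\Bigl[(\mathrm{div}\,\mathbf{v}_s)^2 + \mathrm{skw}(\nabla\mathbf{v}_s)\cdot\mathrm{skw}(\nabla\mathbf{v}_s) + (\mathrm{div}\,\mathbf{v}_f)^2\Bigr]\,dV. \tag{58}$$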
It is clear that if α is positive then the left-hand side of equation (58) is positive
and the reference configuration is stable. Cumbersome calculations involved in the
derivation of equations (55) and (56) from equation (58) have been omitted. We
simply note that the divergence theorem applied to the right-hand side of equation (58) implies that field equations (55) and boundary conditions (56) depend on
the eigenvalue α.
Field equation (55)1 is a Bessel equation. Therefore, according to the sign of
$$Q := \frac{1}{\lambda_s\,\rho^{0(s)2}}\Bigl[2\mu+\lambda+2\xi^{0(f)}a_0-\alpha + \frac{\rho^{0(f)2}\,(\xi^{0(s)}\gamma^{0f}-a_0/\rho^0-K^{sf})^2}{\alpha-\rho^{0(f)}\bigl(2\xi^{0(f)}\gamma^{0f}+\rho^{0(f)}\gamma^{ff}\bigr)}\Bigr], \tag{59}$$
its solution is a linear combination of classical Bessel functions or of modified
Bessel functions. In particular, if Q < 0 the solution of equation (55)1 is a linear
combination of $J_1(\sqrt{-Q}\,r)$ and $Y_1(\sqrt{-Q}\,r)$; conversely, if Q > 0 it is a linear
combination of $I_1(\sqrt{Q}\,r)$ and $K_1(\sqrt{Q}\,r)$. In order to determine the range of K sf
for which eigenvalues are positive, we first determine the range of eigenvalues
corresponding to positive or negative values of Q. When α is negative, we restrict
our discussion to the case
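$$\alpha < \underbrace{\rho^{0(f)}\bigl(2\xi^{0(f)}\gamma^{0f}+\rho^{0(f)}\gamma^{ff}\bigr)}_{>0}. \tag{60}$$
Consequently we get
$$Q>0 \;\Rightarrow\; \alpha<\alpha_1,\ \alpha>\alpha_2, \qquad\qquad Q<0 \;\Rightarrow\; \alpha_1<\alpha<\alpha_2, \tag{61}$$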
where α1 and α2 are roots of the equation Q = 0. Consider the case when α < α1 ;
it is easy to check that, for values of the constitutive coefficients considered in
Table I, a characteristic root always exists in this range of eigenvalues. Therefore solutions of equations (55) are linear combinations of modified Bessel functions.
Figure 3. The solution of equation ϕ(α, K sf ) = 0 for α < α1 is presented, where $a = \alpha/(\lambda_s\,\rho^{0(s)2})$.
Assume that α = 2µ + a0 , i.e., require equation (55)3 to be satisfied by a rigid
body rotation. Then equations (55) imply that
√
√
r = C1 I1 Qr + C2 K1 Qr ,
θ = 0,
C1
C2
(62)
div vs = √ I0 Qr − √ K0 ( Qr) + C3 ,
Q
Q
C4
C1 C2 C3
I1 Qr +
K1 Qr +
r+
,
(vs )r =
Q
Q
2
r
where Ci (i = 1, . . . , 4) are constants of integration. Substituting from (62) into
the homogeneous boundary conditions (56) and requiring that the determinant of
this linear system vanish, one gets the equation which determines the admissible
eigenvalues of the system for α < α1 . These solutions are functions of the coupling coefficient K sf . Let ϕ be the determinant of the linear system obtained from
equations (56) and regarded as a function of α and K sf . In Figure 3 the curve
ϕ(α, K sf ) = 0 is exhibited. The gray area in the figure indicates the region of positive eigenvalues larger than α1 . For values of the coupling coefficient in the range
(K1sf , K2sf ) an eigenvalue smaller than α1 always exists. Consequently, the loss of
stability of the equilibrium configuration occurs when such an eigenvalue becomes
negative. A numerical simulation shows that the stability is guaranteed for values
of the coupling coefficient in a suitable open subset $(K_{1s}^{sf}, K_{2s}^{sf})$ of (K1sf , K2sf ). Thus
solutions of equations (47) and (48) are meaningful only for coupling coefficients
in this open interval.
Figure 4. Dependence of $K_{1s}^{sf}$ and $K_{2s}^{sf}$ upon λs .
These results indicate that the stability-instability transition does not involve a
change in the macroscopic deformation profile for the hollow cylinder. In other
words the loss of stability and wide oscillations occurring when the coupling coefficient belongs to the open set (− ∞, K1sf )∪(K2sf , ∞) are not correlated. This could
be due to the stability criterion used here. A nonlinear analysis and/or a different
choice of the energetic functional may provide a different stability limit.
It is interesting to note that the stability limits $K_{1s}^{sf}$ and $K_{2s}^{sf}$ are not strongly
affected by second gradient effects. An increase in λs does not induce a noticeable
change in these limits (see Figure 4). However, estimates of the stability limits provided by the first gradient theory (i.e. λs = 0) cannot be used when second gradient
effects are present, and the length of the stable region progressively decreases as
second gradient effects become more relevant (see Figure 4).
4. Concluding Remarks
We have studied infinitesimal static deformations of a long hollow porous isotropic
elastic cylinder initially saturated with a perfect fluid and subjected to a hydrostatic
pressure on the inner and the outer surfaces. The cylinder in the reference configuration is stressed. Equations governing deformations of the solid and the fluid that
are linear in displacement gradients and infinitesimal changes in the apparent mass
densities of the solid and the fluid have been derived; these are Heun’s equations
and may have singular points in the interior of the hollow cylinder. Constitutive
relations for the solid have been assumed to depend upon the gradients of the apparent mass density of the solid. Deformations of the solid and the fluid are coupled
through the scalar coefficient K sf multiplying gradients of the apparent mass density of the solid. When the initial stress state in the solid is that of uniform pressure,
equations governing the radial and the circumferential components of displacement
are uncoupled; the former is a nonhomogeneous Bessel equation and the latter an
Euler equation. These equations are solved for a somewhat arbitrarily chosen set
of material and geometric parameters; these correspond to a salt matrix filled with
brine. The stability of the reference configuration with respect to changes in the
values of the coupling coefficient K sf has been scrutinized. It is found that the
reference configuration is stable for values of the coupling coefficient in a suitable
open set. The computed profiles of the mass density of the solid phase exhibit an
oscillatory behavior and also a boundary layer near the outer surface. The variation
of changes in the apparent mass density of the solid due to the consideration of
second-gradient effects is about 1% of its value in the absence of second-gradient
effects and depends upon the value assigned to the coupling coefficient K sf .
References
1. G.E. Andrews, R. Askey and R. Roy, Special functions. In: Encyclopedia of Mathematics and Its Applications. Cambridge Univ. Press, Cambridge (1999).
2. V.I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, translated by J. Szucs; English translation edited by M. Levi. Springer, New York (1983).
3. F.M. Arscott, Heun’s equation. In: A. Ronveaux (ed.), Heun’s Differential Equations. Oxford Univ. Press, Oxford (1995) Part A.
4. R.C. Batra, On nonclassical boundary conditions. Arch. Rational Mech. Anal. 48 (1972) 163–191.
5. R.C. Batra, Thermodynamics of non-simple elastic materials. J. Elasticity 6 (1976) 451–456.
6. R.C. Batra, The initiation and growth of, and the interaction among adiabatic shear bands in simple and dipolar materials. Internat. J. Plasticity 3 (1987) 75–89.
7. R.C. Batra and L. Chen, Shear band spacing in gradient-dependent thermoviscoplastic materials. Comput. Mech. 23 (1999) 8–19.
8. R.C. Batra and J. Hwang, Dynamic shear band development in dipolar thermoviscoplastic materials. Comput. Mech. 12 (1994) 354–369.
9. R.C. Batra and C.H. Kim, Adiabatic shear banding in elastic-viscoplastic nonpolar and dipolar materials. Internat. J. Plasticity 6 (1990) 127–141.
10. P. Bérest, J. Bergues, B. Brouard, J.G. Durup and B. Guerber, A salt cavern abandonment test. Internat. J. Rock Mech. Min. 38 (2001) 357–368.
11. M.A. Biot, General theory of three-dimensional consolidation. J. Appl. Phys. 12 (1941) 155–164.
12. P. Blanchard and E. Bruning, Variational Methods in Mathematical Physics (a Unified Approach). Springer, Heidelberg (1992).
13. R.M. Bowen, Theory of mixtures. In: Continuum Physics, Vol. III (1976) pp. 2–127.
14. J.W. Cahn and J.E. Hilliard, Free energy of a non-uniform system. J. Chem. Phys. 31 (1959) 688–699.
15. P. Casal, La théorie du second gradient et la capillarité. C. R. Acad. Sci. Paris Sér. A 274 (1972) 1571–1574.
16. P. Casal and H. Gouin, Equations du mouvement des fluides thermocapillaires. C. R. Acad. Sci. Paris Sér. II 306 (1988) 99–104.
17. P. Cosenza, M. Ghoreychi, B. Bazargan-Sabet and G. de Marsily, In situ rock salt permeability measurement for long term safety assessment of storage. Internat. J. Rock Mech. Min. 36 (1999) 509–526.
18. R. de Boer, Theory of Porous Media. Springer, Berlin (2000).
19. F. dell’Isola, M. Guarascio and K. Hutter, A variational approach for the deformation of a saturated porous solid. A second gradient theory extending Terzaghi’s effective stress principle. Archive Appl. Mech. 70 (2000) 323–337.
20. F. dell’Isola and P. Seppecher, Edge contact forces and quasibalanced power. Meccanica 32 (1997) 33–52.
21. W. Ehlers, Toward finite theories of liquid-saturated elasto-plastic porous media. Internat. J. Plasticity 7 (1991) 433–475.
22. P. Fillunger, Erdbaumechanik. Selbstverlag des Verfassers, Wien (1936).
23. P. Germain, La méthode des puissances virtuelles en mécanique des milieux continus. J. Mécanique 12(2) (1973) 235–274.
24. P. Germain, The method of virtual power in continuum mechanics. Part 2: Microstructure. SIAM J. Appl. Math. 25(3) (1973) 556–575.
25. M.A. Goodman and S.C. Cowin, A continuum theory for granular materials. Arch. Rational Mech. Anal. 44 (1972) 249–266.
26. H. Gouin, Tension superficielle dynamique et effet Marangoni pour les interfaces liquide-vapeur en théorie de la capillarité interne. C. R. Acad. Sci. Paris Sér. II 303(1) (1986).
27. S. Krishnaswamy and R.C. Batra, A thermomechanical theory of solid-fluid mixtures. Math. Mech. Solids 2 (1997) 143–151.
28. G.A. Maugin, The method of virtual power in continuum mechanics: application to coupled fields. Acta Mech. 35 (1980) 1–70.
29. L.W. Morland, A simple constitutive theory for a fluid saturated porous solid. J. Geoph. Res. 77 (1972) 890–900.
30. I. Müller, A thermodynamic theory of mixtures of fluids. Arch. Rational Mech. Anal. 28 (1968) 1–39.
31. I. Müller, Thermodynamics. Pitman, Boston (1985).
32. N.I. Muskhelishvili, Some Basic Problems of the Mathematical Theory of Elasticity. Noordhoff, Groningen (1953).
33. K.R. Rajagopal and L. Tao, Mechanics of Mixtures. World Scientific, Singapore (1995).
34. G. Sciarra, F. dell’Isola and K. Hutter, A solid-fluid mixture model allowing for solid dilatation under external pressure. Continuum Mech. Thermodyn. 13 (2001) 287–306.
35. G. Sciarra, K. Hutter and G.A. Maugin, A variational approach to a micro-structured theory of solid-fluid mixtures, in preparation.
36. P. Seppecher, Equilibrium of a Cahn–Hilliard fluid on a wall: Influence of the wetting properties of the fluid upon the stability of a thin liquid film. European J. Mech. B Fluids 12(1) (1993) 69–84.
37. S.M.R.I. (Solution Mining Research Institute), Technical class guidelines for safety assessment of salt caverns, Fall Meeting, Rome, Italy (1998).
38. B. Svendsen and K. Hutter, On the thermodynamics of a mixture of isotropic materials with constraints. Internat. J. Engrg. Sci. 33 (1995) 2021–2054.
39. R. Thom, Stabilité structurelle et morphogénèse: Essai d’une Théorie Générale des Modèles. Benjamin, New York (1972).
40. C.A. Truesdell, Sulle basi della termomeccanica. Lincei Rend. Sc. Fis. Mat. Nat. XXII (Gennaio 1957).
41. C.A. Truesdell, Thermodynamics of diffusion, Lecture 5. In: C.A. Truesdell (ed.), Rational Thermodynamics. Springer, Berlin (1984) pp. 216–219.
A Class of Fit Regions and a Universe of Shapes for
Continuum Mechanics
GIANPIETRO DEL PIERO⋆
Dipartimento di Ingegneria, Università di Ferrara, Via Saragat 1, 44100 Ferrara, Italy
E-mail: gdpiero@ing.unife.it
Received 12 September 2002; in revised form 3 March 2003
Abstract. A new class of fit regions is proposed as an alternative to those available in the literature,
and specifically to the class defined by Noll and Virga in their paper [12]. An advantage of the
proposed class is that of being based mostly on topological concepts rather than on less familiar
concepts from geometric measure theory. A distinction is introduced between fit regions and shapes
of continuous bodies. The latter are defined as equivalence classes of fit regions, made of regions
all with the same interior and with the same closure. In the final part of the paper the axioms for a
universe of bodies, formulated by Noll and incorporated in Truesdell’s book [15], are re-discussed
and partially re-formulated.
Mathematics Subject Classifications: 73A05.
Key words: foundations of continuum mechanics, fit regions, universes of bodies.
Dedicated to the memory of Clifford A. Truesdell, teacher and friend
1. Introduction
In his textbook on Rational Continuum Mechanics, Clifford Truesdell writes: “Mechanics rests upon three substructures: a universe of bodies, a geometry with
its kinematics, and a theory of forces. These substructures provide the concepts
mechanics is to connect.” [15, p. 6].
In this paper I deal with the first of these substructures. A universe of bodies, or
material universe, is a collection of continuous bodies. Mathematically it is defined
as a pair (, ≺), where is a set and ≺ is a relation on , subject to a number of
axioms.⋆⋆ The elements of are the bodies, and if two bodies A and B satisfy the
relation A ≺ B we say that A is a part of B.‡
⋆ Presently on leave at the Centro Linceo Interdisciplinare “Beniamino Segre”, Accademia
Nazionale dei Lincei, Rome, Italy.
⋆⋆ The axioms define on (, ≺) the structure of a Boolean algebra, see [10, 15].
‡ The concept of a material universe was introduced by Noll in 1959 [8]. Formal definitions are
given in [9–11, 15]. Sometimes, a material universe is identified with a single continuous body, and
its elements are identified with the subbodies of the given body, see, e.g., [2, 3, 7, 11, 13, 14].
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 265–285.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The bodies of Continuum Mechanics are deformable. They may take different
shapes. A change of shape is called a transplacement, and the collection of all
shapes which can be taken by all bodies in a universe of bodies is a universe of
shapes. Usually, shapes are identified with regions of a Euclidean space.⋆ The idea
that shapes should belong to some “suitable class of nice subsets of Euclidean
spaces” appears in [11]. In the same paper, two examples of “suitable classes” are
provided:
– the set of all regularly closed subsets,
– the set of all regularly open subsets.⋆⋆
Purely mathematical examples are also given in [10]. The concept of a nice
subset was made precise by Noll and Virga in their paper [12]. In it, nice subsets
were called fit regions, and their properties were fixed as follows:
(F1) The set of all fit subregions of a given fit region should satisfy the axioms of
a material universe.
(F2) The class of fit regions should be invariant under transplacements, which
should include adjustments to fit regions of smooth diffeomorphisms from
one Euclidean space to another.
(F3) Each region should have a surface-like boundary for which a form of the
integral-gradient (Gauss–Green) theorem should be valid.
The same authors added:
(F4) It is also desirable that the class of fit regions include all that can possibly be
imagined by an engineer but exclude those that can be dreamt up only by an
ingenious mathematician.
The class of regularly open sets and that of regularly closed sets do not satisfy
the requirement (F3). In fact, the integral-gradient formula is usually established
within the class of sets with finite perimeter, see, e.g., [5, Section 5.8]. The class of
open sets with finite perimeter was proposed as fit regions by Banfi and Fabrizio [2].‡ As remarked by Noll and Virga in [12], this class does not satisfy
the requirement (F1). Later, Gurtin et al. added the condition that the sets be
d-regular.‡‡ The class of d-regular open sets with finite perimeter satisfies all requirements but, in Noll and Virga’s opinion, it is “unnecessarily large”.¶ In the
spirit of their statement (F4), they select a more restricted class, which also has the
⋆ [15, Section 2.1]. In papers preceding Truesdell’s book, there is some confusion between bodies
and regions occupied by bodies.
⋆⋆ Denote by int A the interior and by clo A the closure of a set A. Then A is regularly open if
A = int clo A, and is regularly closed if A = clo int A.
‡ See also [13].
‡‡ Gurtin et al. [7]. A set is d-regular, or normalized, if it coincides with the set of its density
points, see Section 2 below.
¶ In [3], Degiovanni et al. restricted this class to bounded sets. In [14], Šilhavý further restricted
this class to sets with negligible boundary, see condition (NV4) below.
A CLASS OF FIT REGIONS FOR CONTINUUM MECHANICS
267
advantage of involving a reduced amount of measure theoretic concepts. This is the
class of all subsets of a Euclidean space which are
(NV1) bounded,
(NV2) regularly open,
(NV3) with finite perimeter,
(NV4) with negligible boundary.⋆
Regions with these properties will be called NV-regions. In the present paper I define a new class of fit regions, which I call D-regions. In it, the only measure
theoretic concept involved is that of the Hausdorff measure of a set. Indeed, the
boundary of every D-region is required to have a finite (N − 1)-dimensional Hausdorff measure. This implies that D-regions are both with finite perimeter and with
negligible boundary.
For D-regions, properties (F2) and (F3) are proved in Section 2.⋆⋆ More precisely, the invariance property (F2) is proved for bi-Lipschitz homeomorphisms, a
class of mappings larger, and physically more realistic, than the class of smooth
diffeomorphisms considered in [12].‡
D-regions need not be either open or closed, but those which are open are NV-regions. This is proved in Section 3, where it is also shown by an example that
open D-regions form a proper subset of the set of all NV-regions. Referring to
the statement (F4), the new class of fit regions can be considered as a further step
in the process of eliminating regions which may be “dreamt up by an ingenious
mathematician” but never “imagined by an engineer”.
A new definition of a shape of a body is given in Section 4. While shapes are
usually identified with fit regions of Euclidean spaces, here they are defined as
equivalence classes of D-regions, called D-shapes.‡‡ Each class is made of regions
having the same interior and the same closure, so that regions within the same
class only differ by the portion of boundary included in the region. Accordingly, a
D-shape may be represented by an open region, by a closed region, or by a region
which is neither open nor closed. This may be helpful in facing many situations
encountered in problems of mechanics.¶ Also, at least in my opinion, the proposed
definition meets more directly some requirements suggested by physics. For example, as shown in Section 4, it provides a more satisfactory definition of the partition
of a shape, a definition which cannot be given in terms of open sets or of closed
sets alone.
⋆ I.e., the boundary has zero N-dimensional Lebesgue measure.
⋆⋆ If, as in the present paper, shapes are not identified with fit regions, then (F1) applies to shapes
rather than to individual fit regions.
‡ Indeed, bi-Lipschitz homeomorphisms need not be differentiable at every interior point of the
region, and cases of transplacements which are not differentiable on singular surfaces or on other
sets with zero measure are frequently met in problems of elasticity. Invariance under bi-Lipschitz
homeomorphisms was first assumed in [2].
‡‡ To my knowledge, the only reference to equivalence classes of regions as an alternative to
individual regions was made by Šilhavý in [13].
¶ For example, open sets are more appropriate to the study of boundary value problems of elasticity, while closed sets are more convenient when a specific material structure is prescribed to the
boundary or to a part of it.
The axioms of a material universe mentioned in (F1) are discussed in Section 5.
In the presentation given in [15] there are six axioms. The first three are:
(A1) A ≺ A,
(A2) A ≺ B and B ≺ A ⇒ A = B,
(A3) A ≺ B and B ≺ C ⇒ A ≺ C.
They state that the relation ≺ is reflexive, antisymmetric, and transitive. These are
the defining properties of a partial ordering.
It is then assumed that the universe of bodies Ω contains two elements ∅, ∞, called the null body and
the universal body, such that
∅ ≺ A ≺ ∞   ∀A ∈ Ω.   (1.1)
A body C is said to be an envelope of A and B if both A and B are parts of C,
and is said to be a common part to A and B if C is a part of both A and B. The
minimum envelope of A and B, if it exists, is noted A ∨ B and is called the join of
A and B, and the maximum common part of A and B, if it exists, is noted A ∧ B
and is called the meet of A and B.
The fourth axiom postulates the existence, in the universe Ω, of a unique exterior A^e for each
body A:
(A4) for each A ∈ Ω there is a unique A^e ∈ Ω such that A ∧ A^e = ∅ and A ∨ A^e = ∞,
the fifth axiom requires that all bodies disjoint from a body A be parts of the
exterior of A:
(A5) A ∧ B = ∅ ⇒ B ≺ A^e,
and the last axiom postulates the existence of the meet for every pair of bodies:
(A6) the meet A ∧ B exists for every A, B ∈ Ω.
There are some differences between the postulates listed above and those in
[10, 11].⋆ In [10], axiom (A1) is replaced by the assumption of the existence of
the relation ≺, and in [11] the same axiom is replaced by the existence of the
elements ∅, ∞, an assumption made in [10, 15] without giving it the status of an
axiom.⋆⋆
⋆ In fact, there are two lists of postulates in [10], one in the Appendix, to which I refer here, and
one in Section 8, which coincides with the list given in [11].
⋆⋆ In [10, 15], ∅ and ∞ are considered as improper bodies, to be included in order to perform a
sort of completion of the universe Ω.
While the introduction of ∅ is essential, that of ∞ can be avoided. This consideration, together with the discrepancies in the lists of axioms mentioned above,
induced me to re-consider the whole matter. The result is the new list of axioms
given in Section 5. In it, axioms (A1)–(A3) are eliminated, simply by declaring
from the outset that the relation ≺ is a partial ordering. About the remaining axioms, I found it convenient to replace assumption (A6) on the existence of the meet
by that of the existence of the join, and to substitute axiom (A5) with a separation
postulate asserting that a body disjoint from two other bodies is also disjoint from
their join. Moreover axiom (A4), which is meaningless if the universal body ∞ is
left out of the picture, is replaced by a partition postulate assuming the existence of
the complementary body. By adding the assumption of existence of the null body,
I obtain four axioms which I denote by (B1)–(B4). I prove that counterparts of the
statements (A4) and (A5), involving complementary sets instead of exteriors, can
be deduced from these axioms.
Finally, Section 6 is devoted to checking that the universe of shapes constructed
in Section 4 satisfies all axioms (B1)–(B4) of a universe of bodies, as required
by (F1).
2. A Class of Fit Regions
Consider the class of all measurable subsets Π of R^N with the following properties:
(i) Π is bounded,
(ii) int Π = int clo Π,
(iii) clo Π = clo int Π,
(iv) H^{N−1}(bdy Π) < +∞.
Here int, clo, bdy denote the topological interior, closure and boundary, respectively, and H^{N−1} denotes the (N − 1)-dimensional Hausdorff measure. A set Π with
the above properties will be called a D-region of R^N. In particular, property (ii)
states that the interior of Π is regularly open, and (iii) states that the closure of Π is
regularly closed. Thus, a D-region need be neither open nor closed. An illustration
of the meaning of (ii) and (iii) is given in Figure 1. It is shown there that, due to (ii),
a D-region cannot have missing lines or points and that, due to (iii), it cannot have
isolated lines or points.
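As a quick check of what properties (ii) and (iii) exclude, the following one-dimensional cases may help (N = 1, so that the Hausdorff measure in (iv) is the counting measure). The sets Π₁, Π₂, Π₃, and the use of Π for a candidate region, are choices made for this illustration only and do not come from the paper.

```latex
\begin{gather*}
\Pi_1 = (0,1)\cup(1,2):\quad
  \operatorname{int}\Pi_1 = (0,1)\cup(1,2) \neq (0,2) = \operatorname{int}\operatorname{clo}\Pi_1
  \quad\text{(property (ii) fails: a missing point)},\\
\Pi_2 = [0,1]\cup\{2\}:\quad
  \operatorname{clo}\Pi_2 = [0,1]\cup\{2\} \neq [0,1] = \operatorname{clo}\operatorname{int}\Pi_2
  \quad\text{(property (iii) fails: an isolated point)},\\
\Pi_3 = [0,1):\quad
  \operatorname{int}\Pi_3 = (0,1) = \operatorname{int}\operatorname{clo}\Pi_3,\qquad
  \operatorname{clo}\Pi_3 = [0,1] = \operatorname{clo}\operatorname{int}\Pi_3,\qquad
  H^{0}(\operatorname{bdy}\Pi_3) = 2 .
\end{gather*}
```

Under these assumptions Π₃ is bounded and satisfies (i)–(iv), so it is a D-region of R that is neither open nor closed, while Π₁ and Π₂ are not D-regions.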
The interiors, closures and boundaries of D-regions have the remarkable properties proved in the next propositions.
PROPOSITION 2.1. If Π is a D-region, then
bdy int Π = bdy Π = bdy clo Π.   (2.1)
Proof. By property (iii),
bdy(int Π) = clo(int Π) \ int(int Π) = clo Π \ int Π = bdy Π,   (2.2)
and, by property (ii),
bdy(clo Π) = clo(clo Π) \ int(clo Π) = clo Π \ int Π = bdy Π.   (2.3)
✷
Figure 1. A region Π of the plane, and the regions int Π, clo Π, int clo Π, clo int Π. Thick lines
are included in the region, thin lines are not included.
PROPOSITION 2.2. The interior and the closure of a D-region are D-regions.
Proof. In a metric space, a set Π is bounded if and only if int Π and clo Π are bounded
[4, Chapter 5]. For clo Π, property (ii) is trivially satisfied and (iii) is proved by
observing that, if Π is a D-region,
clo(clo Π) = clo Π = clo int Π = clo(int clo Π) = clo int(clo Π).   (2.4)
Similarly, for int Π property (iii) is trivially satisfied and (ii) is proved by interchanging “int” and “clo” in the preceding equations. Finally, (iv) follows
from (2.1). ✷
PROPOSITION 2.3. If Π and Π′ are D-regions, then
int Π ⊂ int Π′ ⇔ clo Π ⊂ clo Π′.   (2.5)
Proof. If int Π ⊂ int Π′, then
clo Π = clo int Π ⊂ clo int Π′ = clo Π′.   (2.6)
The inverse implication is proved by interchanging “int” and “clo”. ✷
An immediate corollary is that two D-regions have the same interior if and only
if they have the same closure:
int Π = int Π′ ⇔ clo Π = clo Π′.   (2.7)
The proof of the next proposition is based on the following properties of arbitrary
subsets A, B of R^N:
clo A ∪ clo B = clo(A ∪ B),   int A ∪ int B ⊂ int(A ∪ B),   bdy A ∪ bdy B ⊃ bdy(A ∪ B),
clo A ∩ clo B ⊃ clo(A ∩ B),   int A ∩ int B = int(A ∩ B),   bdy A ∪ bdy B ⊃ bdy(A ∩ B),   (2.8)
clo A \ int B ⊃ clo(A \ B),   int A \ clo B = int(A \ B),   bdy A ∪ bdy B ⊃ bdy(A \ B).
In what follows, (2.8)k denotes the k-th of these relations, counted row by row.
PROPOSITION 2.4. If Π and Π′ are D-regions, then clo(Π ∪ Π′), int(Π ∩ Π′) and
int(Π \ Π′) are D-regions.
Proof. For clo(Π ∪ Π′), properties (i) and (ii) are trivially verified. Moreover,
by (2.8)1 and (2.8)2,
clo(clo(Π ∪ Π′)) = clo(Π ∪ Π′) = clo Π ∪ clo Π′ = clo int Π ∪ clo int Π′
= clo(int Π ∪ int Π′) ⊂ clo int(Π ∪ Π′)
⊂ clo int(clo(Π ∪ Π′)).   (2.9)
Because the inverse inclusion is true for any set, we obtain (iii). Finally, from (2.8)1,
(2.8)3 and (2.1) we have
bdy clo(Π ∪ Π′) = bdy(clo Π ∪ clo Π′) ⊂ bdy clo Π ∪ bdy clo Π′
= bdy Π ∪ bdy Π′,   (2.10)
and this implies
H^{N−1}(bdy clo(Π ∪ Π′)) ≤ H^{N−1}(bdy Π) + H^{N−1}(bdy Π′) < +∞.   (2.11)
For int(Π ∩ Π′) properties (i) and (iii) are immediate, (ii) follows from (2.9) after
interchanging clo, ∪, ⊂ with int, ∩, ⊃, respectively, and (iv) is proved by using
(2.8)5, (2.8)6 and (2.1):
bdy int(Π ∩ Π′) = bdy(int Π ∩ int Π′) ⊂ bdy int Π ∪ bdy int Π′
= bdy Π ∪ bdy Π′.   (2.12)
Finally, for int(Π \ Π′) conditions (i) and (iii) are trivially satisfied. Moreover, by
(2.8)8 and (2.8)7,
int(int(Π \ Π′)) = int(Π \ Π′) = int Π \ clo Π′ = int clo Π \ clo int Π′
= int(clo Π \ int Π′) ⊃ int clo(Π \ Π′)
⊃ int clo(int(Π \ Π′)),   (2.13)
and because the inverse inclusion is true in general we obtain (ii). Finally, by (2.8)8,
(2.8)9 and (2.1),
bdy(int(Π \ Π′)) = bdy(int Π \ clo Π′) ⊂ bdy int Π ∪ bdy clo Π′
= bdy Π ∪ bdy Π′,   (2.14)
and (iv) follows. ✷
Next, I prove that D-regions have the properties (F2), (F3), so that they form a
class of fit regions. About the first property, I prove that a bi-Lipschitz homeomorphism maps D-regions onto D-regions. I recall that a homeomorphism is a bijective
continuous mapping with a continuous inverse, and that a mapping f : RN → RN
is bi-Lipschitz if there are positive constants c, m such that
c|x − y| ≤ |f(x) − f(y)| ≤ m|x − y|   ∀x, y ∈ R^N.   (2.15)
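To make the two-sided bound (2.15) concrete, here is a minimal numerical sketch for N = 1. The map `f` is a hypothetical example chosen for this illustration, not one taken from the paper: it is piecewise linear with slope 0.5 on the negative axis and 1.5 on the positive axis, hence not differentiable at the origin, yet it satisfies (2.15) with c = 0.5 and m = 1.5. The helper `ratio_bounds` only samples finitely many pairs, so it illustrates the bound rather than proving it.

```python
import itertools
import random


def f(x):
    """Hypothetical piecewise-linear map on R: slope 0.5 for x < 0, slope 1.5 for x >= 0."""
    return 1.5 * x if x >= 0 else 0.5 * x


def ratio_bounds(func, points):
    """Smallest and largest value of |func(x) - func(y)| / |x - y| over distinct sample pairs."""
    ratios = [
        abs(func(x) - func(y)) / abs(x - y)
        for x, y in itertools.combinations(points, 2)
        if x != y
    ]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.uniform(-1.0, 1.0) for _ in range(200)]
    low, high = ratio_bounds(f, sample)
    # Every sampled ratio should lie between c = 0.5 and m = 1.5 (up to rounding).
    print(low, high)
```

A map of this kind is the simplest instance of a transplacement that fails to be differentiable on a set of measure zero while remaining bi-Lipschitz.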
I also recall that if f is a homeomorphism, then
f (int A) = int f (A),
f (clo A) = clo f (A),
f (bdy A) = bdy f (A)
(2.16)
for all regions A in RN .
PROPOSITION 2.5. Let f : R^N → R^N be a bi-Lipschitz homeomorphism. Then
a subset Π of R^N is a D-region if and only if f(Π) is a D-region.
Proof. Let Π be a D-region. Then Π bounded and f continuous in clo Π implies
f(Π) bounded by the Weierstrass theorem. Thus f(Π) satisfies the first property
of a D-region. The second property is proved using (2.16) and the property (ii)
for Π:
int clo f(Π) = int f(clo Π) = f(int clo Π) = f(int Π) = int f(Π),   (2.17)
and property (iii) follows after interchanging “int” and “clo”. Finally, property (iv)
holds because H^{N−1}(bdy f(Π)) = H^{N−1}(f(bdy Π)) by (2.16)3 and
H^{N−1}(f(bdy Π)) ≤ m^{N−1} H^{N−1}(bdy Π),   (2.18)
by the Lipschitz continuity of f, see [5, Section 2.4.1, Theorem 1]. Thus, f(Π)
is a D-region whenever Π is a D-region. The inverse implication is proved by
interchanging Π and f with f(Π) and f^{−1}. ✷
I now turn to the proof that the integral-gradient formula
$$\int_{\Pi} \nabla f(x)\, dx = \int_{\operatorname{eby}\Pi} f(x) \otimes n(x)\, dH^{N-1} \tag{2.19}$$
holds for all D-regions Π of R^N and for all continuous functions f : R^N → R^N
whose gradient ∇f is locally summable. Here eby Π denotes the essential boundary of Π and n(x) is the measure theoretic outward normal at x. Given a region A
in R^N, the set of all density points for A is the set dns A made of all x in R^N such
that
$$\lim_{r \to 0} \frac{|B(x,r) \cap A|}{|B(x,r)|} = 1, \tag{2.20}$$
where B(x, r) is the open ball of R^N centered at x and with radius r, and |·| denotes
the N-dimensional Lebesgue measure.⋆ A point of rarefaction for A is a point for
which the above limit is zero, and the essential boundary eby A of A is the set of
all points which are neither points of density nor points of rarefaction for A. The
inclusions
int A ⊂ dns A ⊂ clo A,   eby A ⊂ bdy A   (2.21)
hold for every subset A of R^N.
⋆ For the basic definitions and results from geometric measure theory, I refer to the book [16] by
Vol’pert and Hudjaev.
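The distinction between density points, points of rarefaction and the essential boundary can be tried out numerically in one dimension. The sketch below is an illustration under stated assumptions, not the paper's construction: the set A is taken to be a finite union of disjoint open intervals, the ball B(x, r) is the interval (x − r, x + r), and the limit r → 0 in (2.20) is replaced by a single small radius. The function names are hypothetical.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals (a, b) and (c, d)."""
    return max(0.0, min(b, d) - max(a, c))


def density_ratio(intervals, x, r):
    """|B(x, r) ∩ A| / |B(x, r)| in one dimension, A being a finite union of disjoint open intervals."""
    covered = sum(overlap(a, b, x - r, x + r) for a, b in intervals)
    return covered / (2.0 * r)


def classify(intervals, x, r=1e-6):
    """Label x using the ratio at one small radius, a numerical stand-in for the limit r -> 0."""
    ratio = density_ratio(intervals, x, r)
    if abs(ratio - 1.0) < 1e-9:
        return "density"
    if abs(ratio) < 1e-9:
        return "rarefaction"
    return "essential boundary"


if __name__ == "__main__":
    # A = (-1, 0) ∪ (0, 1): the point 0 is a topological boundary point of A.
    A = [(-1.0, 0.0), (0.0, 1.0)]
    for point in (0.0, 0.5, 1.0, 2.0):
        print(point, classify(A, point))
```

For this A the origin belongs to bdy A but is classified as a density point, so it does not belong to eby A, whereas the endpoint 1 has ratio 1/2 and belongs to both. This is the one-dimensional shadow of the strict inclusion eby A ⊂ bdy A; note that this A violates property (ii) and is therefore not a D-region.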
A subset A of RN is a set with finite perimeter if the (N − 1)-dimensional
Hausdorff measure of the essential boundary is finite.⋆ From (2.21)2 , we have
H^{N−1}(eby A) ≤ H^{N−1}(bdy A),   (2.22)
so that every set whose boundary has finite (N −1)-dimensional Hausdorff measure
is a set with finite perimeter. In particular, from the property (iv) of D-regions it
follows that every D-region is a set with finite perimeter.
For sets with finite perimeter, formula (2.19) holds with the region Π replaced by dns Π
in the first integral [16, p. 198, Theorem 1]. Hence, formula (2.19) holds as it is,
whenever Π and dns Π differ by a set of Lebesgue measure zero:
|(Π \ dns Π) ∪ (dns Π \ Π)| = 0.   (2.23)
By the general inclusions (2.21)1 and int Π ⊂ Π ⊂ clo Π, both (Π \ dns Π) and
(dns Π \ Π) are included in (clo Π \ int Π) = bdy Π. Then the Lebesgue measure
of their union is not greater than |bdy Π|. For a D-region, the property (iv) implies
|bdy Π| = 0, because a set with finite (N − 1)-dimensional Hausdorff measure has zero
N-dimensional Lebesgue measure. Consequently, (2.23) holds for all D-regions, and this proves the
following
PROPOSITION 2.6. Let Π be a D-region of R^N. Then the integral-gradient formula (2.19) holds for any f : R^N → R^N continuous and with ∇f ∈ L^1_loc.
REMARK. There are D-regions for which inequality (2.22) is strict. An example
is the region D(x, y, d, l) defined in the following section. Thus, eby Π cannot be
replaced by bdy Π in (2.19).
3. A Comparison with Noll and Virga’s Class
If we identify the N-dimensional Euclidean space with RN and if we consider
all open D-regions of RN , we see that they have all properties (NV1)–(NV4) in
Section 1. Thus, all open D-regions are NV-regions. Here I wish to show that there
are NV-regions which are not D-regions.
For a fixed positive number l and for each d in (0, l) consider the following
subset of R²
$$D(x,y,d,l) := \bigcup_{h=1}^{\infty}\ \bigcup_{p=1}^{2^{h-1}} \Big\{ B(x + x_{h,k},\, y + y_h,\, r_h) \ \Big|\ k = 2p-1,\ x_{h,k} = \frac{kl}{2^{h}},\ y_h = \frac{d}{2^{h}},\ r_h = \frac{d}{4^{h}} \Big\}, \tag{3.1}$$
where B(x, y, r) is the open ball centered at (x, y) with radius r. This set is both a
NV-region and a D-region. Its essential boundary is the union of the boundaries of
all balls which form the set, and the boundary is the union of the essential boundary
⋆ This definition is equivalent to standard definitions given, for example, in [5] or in [16]. The
equivalence follows from Proposition 3.62 in the book [1] by Ambrosio et al.
and of the segment (x, x + l) × {y}. Thus,
$$H^{N-1}(\operatorname{eby} D(x,y,d,l)) = \sum_{h=1}^{\infty} 2^{h-1}\, 2\pi r_h = \sum_{h=1}^{\infty} 2^{h}\, 4^{-h}\, \pi d = \pi d, \tag{3.2}$$
and
H^{N−1}(bdy D(x, y, d, l)) = π d + l.   (3.3)
For each positive integer p, consider the set D_p := D(0, y_p, d_p, l), with
$$y_1 = 0, \qquad y_{p+1} = \sum_{q=1}^{p} d_q, \qquad d_q = 2^{-q}\, l, \tag{3.4}$$
and take
$$\Pi := \bigcup_{p=1}^{\infty} D_p. \tag{3.5}$$
The region Π is shown in Figure 2. It is bounded because it is included in the square
(0, l) × (0, l), and it is regularly open because it is the countable union of pairwise
disjoint open balls. Moreover, its perimeter is equal to
$$H^{N-1}(\operatorname{eby}\Pi) = \sum_{q=1}^{\infty} \pi d_q = \pi l, \tag{3.6}$$
and the boundary has zero Lebesgue measure because it is a countable union of sets
with zero measure. Therefore, Π is a NV-region. But it is not a D-region, because
$$H^{N-1}(\operatorname{bdy}\Pi) = \sum_{q=1}^{\infty} (\pi d_q + l) = +\infty. \tag{3.7}$$
Figure 2. A region in R² which is a NV-region but not a D-region.
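The sums (3.2), (3.6) and (3.7) are easy to check numerically. The sketch below assumes, as (3.2) indicates, that level h of D(x, y, d, l) consists of 2^(h−1) balls of radius r_h = d/4^h, and that the block D_q is built with d_q = 2^(−q) l. The function names and the truncation levels are choices made for this illustration.

```python
import math


def ball_perimeter_sum(d, levels):
    """Partial sum of (3.2): level h contributes 2**(h-1) circles of radius d / 4**h."""
    return sum(2 ** (h - 1) * 2 * math.pi * d / 4 ** h for h in range(1, levels + 1))


def region_sums(l, blocks, levels=60):
    """Partial sums of (3.6) and (3.7) over the first `blocks` sets D_q, with d_q = l / 2**q."""
    essential = 0.0
    topological = 0.0
    for q in range(1, blocks + 1):
        d_q = l / 2 ** q
        perimeter_q = ball_perimeter_sum(d_q, levels)
        essential += perimeter_q        # circles only, as in (3.6)
        topological += perimeter_q + l  # circles plus one segment of length l, as in (3.7)
    return essential, topological


if __name__ == "__main__":
    l = 1.0
    for blocks in (5, 10, 20, 40):
        essential, topological = region_sums(l, blocks)
        print(blocks, essential, topological)
```

The first column of sums approaches π l, while the second grows by l with every additional block, which is the reason why the boundary of the region fails to have finite one-dimensional Hausdorff measure.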
4. D-Shapes
In the set of all D-regions of R^N consider the equivalence relation
Π ∼ Π′ ⇔ int Π = int Π′,   (4.1)
stating that two D-regions are equivalent if they have the same interior. Notice that,
in view of (2.7), two D-regions are equivalent if and only if they have the same
closure.
If Π is a D-region, then int Π and clo Π are D-regions by Proposition 2.2.
Moreover, they all have the same interior, because int Π = int(int Π) is trivially
satisfied and int Π = int(clo Π) is property (ii) in the definition of a D-region.
Therefore, Π, int Π and clo Π are equivalent.
Given a D-region Π, denote by Σ the equivalence class containing Π. Then not
only int Π and clo Π belong to Σ but also, by (4.1) and (2.7), they are the unique
open region and the unique closed region in Σ, respectively. They will be called
the open representative and the closed representative of Σ. The notations
$$\Pi' \overset{\circ}{\in} \Sigma, \qquad \Pi'' \mathrel{\bar{\in}} \Sigma, \tag{4.2}$$
will be used to denote that Π′ is the open representative and that Π′′ is the closed
representative of Σ, respectively. An equivalence class of D-regions will be called
a D-shape,⋆ and the set of all D-shapes in R^N will be denoted by S^N.
On S^N, consider the partial ordering
Σ′ ≺ Σ ⇔ int Π′ ⊂ int Π   ∀Π′ ∈ Σ′, ∀Π ∈ Σ.   (4.3)
We say that Σ′ is a part of Σ if Σ′ ≺ Σ. Notice that, by Proposition 2.3, Σ′ ≺ Σ
if and only if clo Π′ ⊂ clo Π for all Π′ in Σ′ and for all Π in Σ.
There is in S^N a D-shape, denoted by ∅̃, such that
∅̃ ≺ Σ   ∀Σ ∈ S^N.   (4.4)
This is the equivalence class containing the empty set ∅. Because the empty set is
at the same time open and closed, we have int ∅ = ∅ = clo ∅, and because the only
set whose closure is the empty set is the empty set itself, the class ∅̃ consists of the
single element ∅.
⋆ In the terminology introduced in Truesdell’s book [15], bodies are sets in some abstract topological space, and shapes are regions in the Euclidean point space which can be occupied by a body
[15, pp. 16, 86]. Shapes were called places by Newton [15, p. 33].
A basic property of the equivalence relation ∼ is that it is preserved under bi-Lipschitz homeomorphisms.
PROPOSITION 4.1. Let f : R^N → R^N be a bi-Lipschitz homeomorphism, and
let Π, Π′ be D-regions in R^N. Then Π ∼ Π′ if and only if f(Π) ∼ f(Π′).
Proof. If Π and Π′ are D-regions, then f(Π) and f(Π′) are D-regions by Proposition 2.5. Moreover, Π ∼ Π′ implies int Π = int Π′, and from (2.16)1 we have
int f(Π) = f(int Π) = f(int Π′) = int f(Π′)   (4.5)
and therefore f(Π) ∼ f(Π′). The proof that f(Π) ∼ f(Π′) implies Π ∼ Π′ is
similar. ✷
The preceding proposition states that the image f(Σ) of the equivalence class Σ
containing Π is the equivalence class containing f(Π). Thus, the image of a D-shape under a
bi-Lipschitz homeomorphism is a D-shape.
Let Σ, Σ′ be D-shapes, and let Π, Π′ be D-regions in Σ, Σ′, respectively. Then
clo(Π ∪ Π′), int(Π ∩ Π′) and int(Π \ Π′) are D-regions by Proposition 2.4. I prove
below that these regions are determined by the equivalence classes Σ, Σ′ and not
by their specific representatives Π, Π′. In other words, they do not change if Π, Π′
are replaced by equivalent regions.
PROPOSITION 4.2. Let Σ, Σ′ be D-shapes, let Π, Π₁ be D-regions in Σ, and
let Π′, Π′₁ be D-regions in Σ′. Then,
clo(Π ∪ Π′) = clo(Π₁ ∪ Π′₁),   int(Π ∩ Π′) = int(Π₁ ∩ Π′₁),   int(Π \ Π′) = int(Π₁ \ Π′₁).   (4.6)
Proof. By (2.7), Π ∼ Π₁ and Π′ ∼ Π′₁ imply clo Π = clo Π₁ and clo Π′ =
clo Π′₁. Then by (2.8)1,
clo(Π ∪ Π′) = clo Π ∪ clo Π′ = clo Π₁ ∪ clo Π′₁ = clo(Π₁ ∪ Π′₁).   (4.7)
This proves the first equality in (4.6). The two remaining equalities are proved in a
similar way. ✷
Because clo(Π ∪ Π′) is a closed D-region, it can be taken as the closed representative of a D-shape. By the proposition just proved, that shape depends on the
D-shapes Σ, Σ′, and not on their specific elements. The D-shape Σ ∨ Σ′ defined
by
$$\Sigma \vee \Sigma' \mathrel{\bar{\ni}} \operatorname{clo}(\Pi \cup \Pi'), \qquad \Pi \in \Sigma,\ \Pi' \in \Sigma', \tag{4.8}$$
will be called the join of Σ and Σ′. Similarly, int(Π ∩ Π′) and int(Π \ Π′) can be
taken as the open representatives of a D-shape. The D-shapes
$$\Sigma \wedge \Sigma' \overset{\circ}{\ni} \operatorname{int}(\Pi \cap \Pi'), \qquad \Sigma \lhd \Sigma' \overset{\circ}{\ni} \operatorname{int}(\Pi \setminus \Pi'), \qquad \Pi \in \Sigma,\ \Pi' \in \Sigma', \tag{4.9}$$
define the meet and the difference of Σ and Σ′, respectively. From the inclusions
clo Π ⊂ clo(Π ∪ Π′),   int(Π ∩ Π′) ⊂ int Π,   int(Π \ Π′) ⊂ int Π,   (4.10)
it follows that
Σ ≺ Σ ∨ Σ′,   Σ ∧ Σ′ ≺ Σ,   Σ ⊲ Σ′ ≺ Σ   (4.11)
for all pairs of D-shapes Σ, Σ′. If Σ′ is a part of Σ, the difference Σ ⊲ Σ′ defines the
complementary part Σ′_Σ of Σ′ in Σ. I prove below that Σ′_Σ is the unique D-shape
with the properties
Σ′ ∨ Σ′_Σ = Σ,   Σ′ ∧ Σ′_Σ = ∅̃.   (4.12)
PROPOSITION 4.3. Let Σ, Σ′ and Σ′_Σ be D-shapes, and let Σ′ ≺ Σ. Then equations (4.12) hold if and only if Σ′_Σ = Σ ⊲ Σ′.
Proof. Let Π, Π′, Π′_Π be D-regions in Σ, Σ′, Σ′_Σ, respectively. Then equations (4.12) mean that
clo(Π′ ∪ Π′_Π) = clo Π,   int(Π′ ∩ Π′_Π) = ∅.   (4.13)
The only if part of the proposition consists in proving that the above equalities
imply
int(Π \ Π′) = int Π′_Π.   (4.14)
Using the relations (2.8), (4.13)1, the identity (A ∪ B)\B = A\B, and the fact that
Π and Π′_Π are D-regions, we get
int(Π \ Π′) = int Π \ clo Π′
= int clo Π \ clo Π′
= int clo(Π′ ∪ Π′_Π) \ clo Π′
= int(clo(Π′ ∪ Π′_Π) \ clo Π′)
= int((clo Π′ ∪ clo Π′_Π) \ clo Π′)
= int(clo Π′_Π \ clo Π′)
= int clo Π′_Π \ clo Π′
= int Π′_Π \ clo Π′.   (4.15)
But
∅ = int Π′_Π ∩ int Π′
= int Π′_Π ∩ clo int Π′
= int Π′_Π ∩ clo Π′.   (4.16)
Indeed, the first equality follows from (4.13)2, the second is due to the fact that
A ∩ clo B = ∅ for every pair of open sets A, B with A ∩ B = ∅, and the third
follows from property (iii) of D-regions. Then (4.16) implies int Π′_Π \ clo Π′ =
int Π′_Π, and (4.14) follows from (4.15).
To prove the if part of the proposition, assume that (4.14) holds. Then
clo(Π′ ∪ Π′_Π) = clo Π′_Π ∪ clo Π′
= clo int Π′_Π ∪ clo Π′
= clo int(Π \ Π′) ∪ clo Π′
= clo(int Π \ clo Π′) ∪ clo Π′
= clo((int Π \ clo Π′) ∪ clo Π′)
= clo(int Π ∪ clo Π′)
= clo int Π ∪ clo Π′
= clo Π ∪ clo Π′,   (4.17)
and the last set is equal to clo Π because clo Π′ ⊂ clo Π by the assumption Σ′ ≺ Σ.
This proves equation (4.13)1. Equation (4.13)2 follows from
int(Π′ ∩ Π′_Π) = int Π′_Π ∩ int Π′
= int(Π \ Π′) ∩ int Π′
= (int Π \ clo Π′) ∩ int Π′
= (int Π ∩ int Π′) \ clo Π′
= int Π′ \ clo Π′.   (4.18)
The last equality follows from the assumption Σ′ ≺ Σ which implies int Π′ ⊂
int Π, and the preceding equality follows from the identity (A\B) ∩ C = (A ∩ C)\B.
Then we have int(Π′ ∩ Π′_Π) = int Π′ \ clo Π′ = ∅. ✷
We say that two D-shapes Σ′, Σ′′ form a partition of Σ if
Σ′ ∨ Σ′′ = Σ   and   Σ′ ∧ Σ′′ = ∅̃.   (4.19)
This definition makes precise the idea of subdividing a D-shape into two subshapes.
Indeed, Proposition 4.3 tells us that a partition of Σ is obtained simply by taking
any part Σ′ of Σ and its complementary part in Σ.
To provide examples of join and meet, consider the plane R² and select a system
of Cartesian coordinates (x, y). Take the regions
Π := {(x, y) ∈ R² | −l < x ≤ 0, 0 ≤ y ≤ l},   (4.20)
Π′ := {(x, y) ∈ R² | 0 ≤ x < l, 0 ≤ y ≤ l}.   (4.21)
Both are squares of side l, neither open nor closed, and they are both D-regions.
Their union is the rectangle
Π ∪ Π′ = {(x, y) ∈ R² | −l < x < l, 0 ≤ y ≤ l},   (4.22)
their intersection is the segment
Π ∩ Π′ = {(x, y) ∈ R² | x = 0, 0 ≤ y ≤ l},   (4.23)
and the difference Π \ Π′ is the square
Π \ Π′ = {(x, y) ∈ R² | −l < x < 0, 0 ≤ y ≤ l}.   (4.24)
If Σ and Σ′ are the equivalence classes of Π, Π′, their join is the set of all rectangles
with the same sides as Π ∪ Π′, irrespective of which part of the sides is included
in the region. Their meet is the equivalence class including the interior of Π ∩ Π′,
that is, the empty class ∅̃, and their difference Σ ⊲ Σ′ coincides with Σ. Notice that,
by (4.19), Σ ∧ Σ′ = ∅̃ implies that Σ and Σ′ form a partition of Σ ∨ Σ′.
5. Universes of Bodies
A universe of bodies is a pair (Ω, ≺), with Ω a set and ≺ a partial ordering on Ω.
The elements A, B, . . . of Ω are called bodies, and A ≺ B is to be read as A is a
part of B. The pair (Ω, ≺) is subject to four axioms. The first is the existence of
the null body
(B1) There is an element ∅ of Ω such that ∅ ≺ A for all A ∈ Ω,
and the second is the existence of the minimum envelope
(B2) For every A, B in Ω, there is a C ∈ Ω such that:
(i) A and B are parts of C,
(ii) if A and B are parts of D then C is a part of D.
Note that this axiom consists of two statements: (i) postulates the existence of
an envelope for every pair of elements of Ω, and (ii) postulates the existence of a
minimum envelope. The uniqueness of the null body and of the minimum envelope
are easy to prove. Indeed, if ∅ and ∅′ are null bodies then ∅ ≺ ∅′ and ∅′ ≺ ∅, and
this is possible only if ∅′ = ∅. Similarly, if both C and C′ are minimum envelopes
of A and B, then A and B are parts of both C and C′ by (i). Then by (ii) C ≺ C′
and C′ ≺ C, and therefore C′ = C.
The minimum envelope of A and B is called the join of A and B and is denoted
by A ∨ B. The following properties are direct consequences of the definition:
A ∨ A = A,   A ∨ ∅ = A,   A ∨ B = B ∨ A,   (5.1)
A ≺ B ⇒ (A ∨ C) ≺ (B ∨ C) for all C in Ω,   (5.2)
(A ∨ B) ∨ C = A ∨ (B ∨ C).   (5.3)
Two bodies A, B are separate if their only common part is the null body:
A and B separate ⇔ (C ≺ A and C ≺ B ⇒ C = ∅).   (5.4)
Notice that
A and B separate and C ≺ A ⇒ C and B separate.   (5.5)
Indeed, if C ≺ A then (D ≺ C and D ≺ B) implies (D ≺ A and D ≺ B), and this
implies D = ∅ if A and B are separate.
The third axiom is a separation postulate
(B3) If A and C are separate and if B and C are separate, then A ∨ B and C are
separate,
and the last axiom is a partition postulate
(B4) If A ≺ C, there is a part AC of C such that
(i) A and AC are separate,
(ii) A ∨ AC = C.
AC is called the complementary part of A in C. The following properties are
easily proved:
CC = ∅,   (5.6)
A ≺ C ⇒ (AC)C = A,   (5.7)
A ≺ B ≺ C ⇒ A and BC are separate.   (5.8)
In particular, to prove (5.8) it is sufficient to observe that, by (5.5), B and BC
separate and A ≺ B implies A and BC separate.
For every body C and for every part A of C, the complementary part of A in C
is unique. To see this, assume that there are two complementary parts of A, A′C
and A′′C . Then AC := A′C ∨ A′′C is a complementary part of A as well. Indeed, A
and AC are separate by the separation postulate, and
A ∨ AC = A ∨ (A′C ∨ A′′C ) = (A ∨ A′C ) ∨ A′′C = C ∨ A′′C = C.
By definition, A′C is a part of AC . Let A′CC be a complementary part of A′C in AC .
Then A′C and A′CC are separate. Moreover, A′CC and A are separate, because AC
and A are separate and A′CC is a part of AC . Then, A′CC and C = A′C ∨ A are
separate by the separation postulate.
On the other hand, A′CC is a part of AC which is a part of C. So, we have at the
same time that A′CC is a part of C and that A′CC and C are separate. This is possible
only if A′CC is the null body. Then AC = A′C ∨ A′CC = A′C . In the same way it can
be proved that AC = A′′C . Thus, A′C = A′′C .
An example of a pair (Ω, ≺) for which the meet does not exist is given in [15,
p. 9]. A pair for which the meet exists but the separation postulate does not hold is
obtained by taking as Ω the set of all closed intervals of the real line, and as ≺ the
inclusion in the sense of set theory. Indeed, the join of A = [0, 1] and B = [4, 5]
is A ∨ B = [0, 5], and we see that the interval [2, 3] is separate from A and B but
not from A ∨ B.
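The closed-interval counterexample can be replayed in a few lines. In this sketch a closed interval [a, b] is represented by the tuple (a, b), the null body by None, and degenerate intervals [a, a] are counted as bodies; these representation choices, like the function names, are assumptions of the illustration.

```python
def is_part(a, b):
    """a ≺ b: set inclusion between closed intervals (None is the null body)."""
    if a is None:
        return True
    if b is None:
        return False
    return b[0] <= a[0] and a[1] <= b[1]


def join(a, b):
    """Minimum envelope: the smallest closed interval containing both a and b."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def separate(a, b):
    """True when the only common part of a and b is the null body."""
    if a is None or b is None:
        return True
    # Two closed intervals share a non-null closed interval exactly when they intersect.
    return max(a[0], b[0]) > min(a[1], b[1])


if __name__ == "__main__":
    A, B, C = (0.0, 1.0), (4.0, 5.0), (2.0, 3.0)
    print(join(A, B))               # the envelope [0, 5]
    print(separate(C, A), separate(C, B))
    print(separate(C, join(A, B)))  # the separation postulate (B3) would require True here
```

The last check fails because [2, 3] is itself a part of the join [0, 5], which is exactly the violation of the separation postulate described in the text.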
A pair (Ω, ≺) for which the axioms (B1)–(B3) are satisfied but the complementary set does not exist is obtained by taking as Ω the set of all closed subsets
of the real line and as ≺ again the set inclusion. Then if we take C = [0, 2] and
A = [0, 1] we have that A ≺ C but the complementary set AC does not exist. In
particular, AC is not (1, 2] because this is not a closed set, and AC is not [1, 2]
because [0, 1] and [1, 2] have as common part the singleton {1}, which belongs to
Ω and is different from the null set.
There are two remarkable consequences of axioms (B3) and (B4). The first is an
inverse of the implication (5.8). It can be regarded as a counterpart to axiom (A5)
in the Introduction, in the absence of a universal body.
PROPOSITION 5.1. If A and B are separate parts of C, then A is a part of BC .
Proof. Because B and BC are separate, B and A separate implies B and A ∨ BC
separate by the separation postulate. Moreover, by (5.3), (A ∨ BC ) ∨ B = A ∨
(BC ∨B) = A∨C = C. Then A∨BC = BC by the definition of the complementary
part of B, and therefore A is a part of BC .
✷
The second consequence of axioms (B3) and (B4) is the following:
PROPOSITION 5.2. Let A, B be parts of C. Then
A ≺ B ⇔ BC ≺ AC.   (5.9)
Proof. If A ≺ B, then A and BC are separate by (5.8), and if A and BC are
separate then BC is a part of AC by Proposition 5.1. Conversely, if BC ≺ AC then
BC and (AC )C = A are separate by (5.8). Then, A is a part of (BC )C = B by
Proposition 5.1.
✷
The meet of A and B is defined by
A ∧ B := (AC ∨ BC )C ,
(5.10)
where C is any envelope of A and B. For this definition to be meaningful, it is
necessary to prove that the right-hand side of (5.10) does not depend on C. This is
done in Proposition 5.4 below, after proving the following preliminary result.
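A finite model may help to see what the definition (5.10) says. In the sketch below the bodies are all subsets of a small finite set, ≺ is set inclusion, the join is the union, and the complementary part of A in C is the set difference C \ A; this power-set model satisfies (B1)–(B4), but it is an assumption of the illustration and not the universe of shapes built in the paper. The code checks that the right-hand side of (5.10) equals the intersection of A and B for every envelope C, so that in this model it does not depend on C.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite set, as frozensets (the bodies of the toy universe)."""
    items = sorted(universe)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def complement(a, c):
    """Complementary part of a in c; requires a to be a part of c."""
    assert a <= c
    return c - a


def meet_via_complements(a, b, c):
    """Right-hand side of (5.10): the complement in c of the join of the complements of a and b."""
    return complement(complement(a, c) | complement(b, c), c)


def check(universe):
    """True if (5.10) gives the intersection of a and b for every envelope c of a and b."""
    bodies = subsets(universe)
    for a in bodies:
        for b in bodies:
            for c in bodies:
                if a <= c and b <= c and meet_via_complements(a, b, c) != a & b:
                    return False
    return True


if __name__ == "__main__":
    print(check({1, 2, 3, 4}))
```

In this model the check is just De Morgan's law; the point of Lemma 5.3 and Proposition 5.4 is that the same independence from C follows from the axioms (B1)–(B4) alone.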
LEMMA 5.3. Let A ≺ C ≺ D, and let AC and AD be the complementary parts
of A in C and D, respectively. Then AD = AC ∨ CD .
Proof. By (5.2), AC ≺ C implies AC ∨ CD ≺ C ∨ CD = D. Moreover, by (5.3),
A ∨ (AC ∨ CD ) = (A ∨ AC ) ∨ CD = C ∨ CD = D.
(5.11)
It remains to prove that A and AC ∨ CD are separate. In fact, A and AC are separate
by the definition of the complementary set, and A and CD are separate because A
is a part of C and C and CD are separate. Then A and AC ∨ CD are separate by the
separation postulate.
✷
PROPOSITION 5.4. Let A and B be parts of both C and D. Then (AC ∨ BC )C =
(AD ∨ BD )D .
Proof. Assume first that C ≺ D. Then, by the preceding lemma and by (5.3),
AD ∨ BD = (AC ∨ CD ) ∨ (BC ∨ CD ) = (AC ∨ BC ) ∨ CD ,
(5.12)
and, again by the lemma,
(AC ∨ BC ) ∨ CD = ((AC ∨ BC )C )D .
(5.13)
Then,
(AC ∨ BC )C = ((AC ∨ BC ) ∨ CD )D = (AD ∨ BD )D .
(5.14)
If C is not a part of D, take an envelope E of C and D. Then (5.14) holds both
for C and E and for D and E. Combining the two equalities we get (5.14) for C
and D.
✷
The following consequences of the definition (5.10) are easy to prove.
A ∧ B ≺ A,   A ∧ B = B ∧ A,   (5.15)
D ≺ A and D ≺ B ⇒ D ≺ A ∧ B.   (5.16)
The last statement characterizes the meet A∧B as the maximum common part of A
and B. From (5.10) it also follows:
COROLLARY 5.5. A and B are separate if and only if A ∧ B = ∅.
Proof. If A and B are separate, then A ∧ B = ∅ follows from (5.4) and (5.15).
Conversely, if A ∧ B = ∅ then by (5.16) the only common part of A and B is
D = ∅. Therefore, A and B are separate.
✷
Other consequences of the definition (5.10) are:
(A ∧ B) ∧ C = A ∧ (B ∧ C),   (5.17)
(A ∨ B) ∧ C = (A ∧ C) ∨ (B ∧ C),   (5.18)
(A ∧ B) ∨ C = (A ∨ C) ∧ (B ∨ C).   (5.19)
Proof of (5.17). Take an envelope D of A, B, C. Then by (5.3) and (5.10),
(A ∧ B) ∧ C = (AD ∨ BD )D ∧ C = ((AD ∨ BD ) ∨ CD )D
= (AD ∨ (BD ∨ CD ))D = A ∧ (BD ∨ CD )D
= A ∧ (B ∧ C).
(5.20)
✷
Proof of (5.18). Assume first that
A ≺ C,   B ∧ C = ∅,   (5.21)
and set D := A ∨ B ∨ C. It is easy to check that
(A ∨ B)D = AC,   AD = AC ∨ B,   CD = B.   (5.22)
Then by the definition (5.10)
(A ∨ B) ∧ C = ((A ∨ B)D ∨ CD )D = (AC ∨ B)D = (AD )D = A.
(5.23)
On the other hand,
(A ∧ C) ∨ (B ∧ C) = A ∨ ∅ = A,
(5.24)
and (5.18) follows in the special case (5.21).
Now consider arbitrary bodies $A$, $B$, $C$, and set
\[ H := A \wedge C, \qquad K := B \wedge C. \tag{5.25} \]
Then
\[ A \vee B = (H \vee H^A) \vee (K \vee K^B) = (H \vee K) \vee (H^A \vee K^B), \tag{5.26} \]
with $(H \vee K) \prec C$ and $(H^A \vee K^B) \wedge C = \emptyset$.
Then (5.23) holds with $A$, $B$ replaced by $(H \vee K)$, $(H^A \vee K^B)$:
\[ ((H \vee K) \vee (H^A \vee K^B)) \wedge C = H \vee K, \tag{5.27} \]
and (5.18) follows from (5.25) and (5.26).
✷
Proof of (5.19). Let $D$ be an envelope of $A$, $B$, $C$. Then from the definition (5.10) we have
\[ (A \wedge B) \vee C = (A^D \vee B^D)^D \vee C = ((A^D \vee B^D) \wedge C^D)^D, \tag{5.28} \]
\[ \begin{aligned} (A \vee C) \wedge (B \vee C) &= (A^D \wedge C^D)^D \wedge (B^D \wedge C^D)^D \\ &= ((A^D \wedge C^D) \vee (B^D \wedge C^D))^D. \end{aligned} \tag{5.29} \]
The two right-hand sides are equal by (5.18). Then (5.19) follows.
✷
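The identities (5.17)–(5.19) can be checked by brute force in a toy model. The model is an assumption of this sketch and not part of the paper: bodies are finite sets, ≺ is inclusion, the join is the union, and the complement of A in D is the set difference. The meet is computed from the formula used in the proof of (5.17), that is, as the complement in an envelope D of the join of the complements of A and B in D. The function names are hypothetical.

```python
from itertools import combinations

def complement(a, d):
    """Complement of a in d; a must be a part (subset) of d."""
    assert a <= d
    return d - a

def meet(a, b, d):
    """Meet of a and b computed as (a^D v b^D)^D, with d an envelope of a and b."""
    return complement(complement(a, d) | complement(b, d), d)

def subsets(universe):
    """All parts of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def check_lattice_identities(universe):
    """Return True if the meet is envelope-independent and (5.17)-(5.19) hold."""
    d = frozenset(universe)
    parts = subsets(d)
    for a in parts:
        for b in parts:
            m = meet(a, b, d)
            # Proposition 5.4: the smaller envelope a | b gives the same meet.
            if m != meet(a, b, a | b) or m != a & b:
                return False
            for c in parts:
                if meet(m, c, d) != meet(a, meet(b, c, d), d):          # (5.17)
                    return False
                if meet(a | b, c, d) != meet(a, c, d) | meet(b, c, d):  # (5.18)
                    return False
                if m | c != meet(a | c, b | c, d):                      # (5.19)
                    return False
    return True

print(check_lattice_identities({1, 2, 3, 4}))
```

The check is only an illustration in the simplest Boolean algebra; it says nothing about D-shapes, for which the axioms are verified in Section 6.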
6. A Universe of Shapes
A universe of shapes is a pair $(\Sigma, \prec)$, where $\Sigma$ is a set of shapes, $\prec$ is a partial ordering on $\Sigma$, and $\Sigma$ and $\prec$ satisfy the axioms (B1)–(B4) of a universe of bodies stated in the preceding section. Here I take as $\Sigma$ the set $S_N$ of all D-shapes in $\mathbb{R}^N$ defined in Section 4, as $\prec$ I take the relation (4.3), and I show that with this choice the axioms (B1)–(B4) are satisfied. I recall that each element of $S_N$ is an equivalence class of D-regions with respect to the equivalence relation (4.1).
Axiom (B1) on the existence of a null shape is satisfied by the shape $\tilde\emptyset$ defined in Section 4. On the contrary, $S_N$ does not include a universal shape, i.e., a D-shape $\tilde\Pi_\infty$ with the property $\tilde\Pi \prec \tilde\Pi_\infty$ for all $\tilde\Pi$ in $S_N$.⋆
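Axiom (B2) on the existence of the join is satisfied by the definition (4.8). To check whether the separation postulate (B3) is satisfied notice that, according to the definition given in Section 5, two D-shapes $\tilde\Pi$, $\tilde\Pi'$ are separate if, for any other D-shape $\tilde\Pi''$,
\[ \operatorname{int}\Pi'' \subset \operatorname{int}\Pi \ \text{and}\ \operatorname{int}\Pi'' \subset \operatorname{int}\Pi' \ \Rightarrow\ \Pi'' = \emptyset \tag{6.1} \]
for any representatives $\Pi$, $\Pi'$, $\Pi''$ of the classes $\tilde\Pi$, $\tilde\Pi'$, $\tilde\Pi''$. It is easy to see that this condition is satisfied if and only if
\[ \operatorname{int}\Pi \cap \operatorname{int}\Pi' = \emptyset. \tag{6.2} \]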
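Then axiom (B3) is satisfied if
\[ (\operatorname{int}\Pi \cap \operatorname{int}\Pi'' = \emptyset \ \text{and}\ \operatorname{int}\Pi' \cap \operatorname{int}\Pi'' = \emptyset) \ \Rightarrow\ \operatorname{int}\operatorname{clo}(\Pi \cup \Pi') \cap \operatorname{int}\Pi'' = \emptyset, \tag{6.3} \]
and this implication is proved by the following chain of equalities
\[ \begin{aligned} \operatorname{int}\operatorname{clo}(\Pi \cup \Pi') \cap \operatorname{int}\Pi'' &= \operatorname{int}(\operatorname{clo}(\Pi \cup \Pi') \cap \operatorname{int}\Pi'') \\ &= \operatorname{int}((\operatorname{clo}\Pi \cup \operatorname{clo}\Pi') \cap \operatorname{int}\Pi'') \\ &= \operatorname{int}((\operatorname{clo}\Pi \cap \operatorname{int}\Pi'') \cup (\operatorname{clo}\Pi' \cap \operatorname{int}\Pi'')) \\ &= \operatorname{int}((\operatorname{clo}\operatorname{int}\Pi \cap \operatorname{int}\Pi'') \cup (\operatorname{clo}\operatorname{int}\Pi' \cap \operatorname{int}\Pi'')) \\ &= \emptyset. \end{aligned} \tag{6.4} \]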
In it, the first two equalities follow from (2.8)$_5$ and (2.8)$_1$, the third is due to the set identity $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$, the fourth comes from the property $\operatorname{clo}\Pi = \operatorname{clo}\operatorname{int}\Pi$ of D-regions, and the last follows from the left-hand side of (6.3), because $(\operatorname{clo} A) \cap B = \emptyset$ for any pair $A$, $B$ of open sets with $A \cap B = \emptyset$.
Finally, it is proved in Proposition 4.3 that axiom (B4) on the existence of the complementary part of any part $\tilde\Pi'$ of $\tilde\Pi$ is satisfied by the complementary D-shape defined in Section 4.
Acknowledgements
I thank the anonymous reviewers for valuable suggestions and comments. This research has been supported by the Programma Cofinanziato 2000 “Modelli Matematici per la Scienza dei Materiali” of the Italian Ministry for University and Scientific Research.
⋆ An alternative choice for a universe of shapes is to remove the requirement of boundedness from the definition of a D-region [6]. In this case, $\tilde\Pi_\infty$ is identified with the singleton $\{\mathbb{R}^N\}$. Dealing with unbounded regions would require replacing the condition (iv) of area-boundedness of the boundary with a condition of local area-boundedness: for every D-region $\Pi$ and for every ball $B$ of $\mathbb{R}^N$, the $(N-1)$-dimensional Hausdorff measure of $(\operatorname{bdy}\Pi) \cap B$ is finite.
References
[1] L. Ambrosio, N. Fusco and D. Pallara, Functions of Bounded Variation and Free Discontinuity
Problems. Oxford Science Publications, Oxford (2000).
[2] C. Banfi and M. Fabrizio, Sul concetto di sottocorpo nella meccanica dei continui. Rend. Accad.
Naz. Lincei 66 (1979) 136–142.
[3] M. Degiovanni, A. Marzocchi and A. Musesti, Cauchy fluxes associated with tensor fields
having divergence measure. Arch. Rational Mech. Anal. 147 (1999) 197–223.
[4] J. Dixmier, General Topology, Springer Undergraduate Texts in Mathematics. Springer, New
York (1984).
[5] L.C. Evans and R.F. Gariepy, Measure Theory and Fine Properties of Functions, CRC Press,
Boca Raton, FL (1992).
[6] M.E. Gurtin, Private communication, Blacksburg (June 2002).
[7] M.E. Gurtin, W.O. Williams and W.P. Ziemer, Geometric measure theory and the axioms of
continuum thermodynamics. Arch. Rational Mech. Anal. 92 (1986) 1–22.
[8] W. Noll, La mécanique classique, basée sur un axiome d’objectivité. In: La Méthode
Axiomatique dans les Mécaniques Classiques et Nouvelles, Colloque International, Paris, 1959.
Gauthier-Villars, Paris (1963).
[9] W. Noll, The foundations of mechanics. In: Non-linear Continuum Theories, CIME Lectures,
1965. Cremonese, Roma (1966) pp. 159–200.
[10] W. Noll, Lectures on the foundations of continuum mechanics and thermodynamics. Arch.
Rational Mech. Anal. 52 (1973) 62–92.
[11] W. Noll, Continuum Mechanics and geometric integration theory. In: F.W. Lawvere and
S.H. Schanuel (eds), Categories in Continuum Physics, Buffalo, 1982, Springer Lecture Notes
in Mathematics, Vol. 1174. Springer, Berlin (1986) pp. 17–29.
[12] W. Noll and E.G. Virga, Fit regions and functions of bounded variation. Arch. Rational Mech.
Anal. 102 (1988) 1–21.
[13] M. Šilhavý, The existence of the flux vector and the divergence theorem for general Cauchy
fluxes. Arch. Rational Mech. Anal. 90 (1985) 195–212.
[14] M. Šilhavý, Cauchy’s stress theorem and tensor fields with divergences in Lp . Arch. Rational
Mech. Anal. 116 (1991) 223–255.
[15] C.A. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1, 2nd edn. Academic
Press, Boston (1991).
[16] A.I. Vol’pert and S.I. Hudjaev, Analysis in Classes of Discontinuous Functions and Equations
of Mathematical Physics. Nijhoff, Dordrecht (1985).
Toward a Field Theory for Elastic Bodies
Undergoing Disarrangements
LUCA DESERI1 and DAVID R. OWEN2
1 Dipartimento di Ingegneria, Università di Ferrara, 44100 Ferrara, Italy.
E-mail: deseri@ing.unife.it
2 Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
E-mail: do04@andrew.cmu.edu
Received 26 September 2002; in revised form 19 June 2003
Abstract. Structured deformations are used to refine the basic ingredients of continuum field theories
and to derive a system of field equations for elastic bodies undergoing submacroscopically smooth
geometrical changes as well as submacroscopically non-smooth geometrical changes (disarrangements). The constitutive assumptions employed in this derivation permit the body to store energy
as well as to dissipate energy in smooth dynamical processes. Only one non-classical field G, the
deformation without disarrangements, appears in the field equations, and a consistency relation based
on a decomposition of the Piola–Kirchhoff stress circumvents the use of additional balance laws or
phenomenological evolution laws to restrict G. The field equations are applied to an elastic body
whose free energy depends only upon the volume fraction for the structured deformation. Existence
is established of two universal phases, a spherical phase and an elongated phase, whose volume
fractions are $(1 - \gamma_0)^3$ and $(1 - \gamma_0)$ respectively, with $\gamma_0 := (\sqrt{5} - 1)/2$ the “golden mean”.
Mathematics Subject Classifications (2002): 74A, 74B20, 74M25, 74H99, 76N99, 80A17.
Key words: structured deformations, multiscale, slips, voids, field equations, elasticity, dissipation.
1. Introduction
The vast scope of elasticity as a continuum field theory includes the description
at the macrolevel of the dynamical evolution of bodies that undergo large deformations, that respond to smooth changes in geometry by storing mechanical
energy, and that experience internal dissipation in isothermal motions only during
non-smooth macroscopic changes in geometry such as shock waves. The research
described in this paper represents the first step in a program to employ structured
deformations of continua to obtain a field theory capable of describing, in the
context of dynamics and large isothermal deformations, the evolution of bodies
that (i) undergo smooth deformations at the macroscopic length scale, that (ii)
can experience piecewise smooth deformations at submacroscopic length scales,
and that (iii) can not only store energy but can also dissipate energy during such
multiscale geometrical changes.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 287–326.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The main goal of the present study is the derivation of the following field relations governing the smooth vector field χ that describes the macroscopic changes
in geometry of a body and the smooth tensor field G that describes the contribution
at the macrolevel of only the smooth part of the submacroscopic piecewise smooth
geometrical changes experienced by the body. If we put $M := \nabla\chi - G$ and $K := (\nabla\chi)^{-1} G$, the desired relations are (10.1)–(10.5), which we record in the following detailed form:
\[ \operatorname{div}_X\bigl(D_M\tilde\Psi(M(X,t), G(X,t)) + D_G\tilde\Psi(M(X,t), G(X,t))\bigr) + b_{\mathrm{ref}}(X,t) = \rho_{\mathrm{ref}}(X)\,\ddot\chi(X,t), \tag{1.1} \]
\[ D_G\tilde\Psi(M(X,t), G(X,t))\,(K(X,t)^{-T} - I) + D_M\tilde\Psi(M(X,t), G(X,t))\,K(X,t)^{-T} = 0, \tag{1.2} \]
\[ \operatorname{sk}\bigl(D_G\tilde\Psi(M(X,t), G(X,t))\,M(X,t)^T\bigr) + \operatorname{sk}\bigl(D_M\tilde\Psi(M(X,t), G(X,t))\,G(X,t)^T\bigr) = 0, \tag{1.3} \]
\[ D_G\tilde\Psi(M(X,t), G(X,t)) \cdot \dot M(X,t) + D_M\tilde\Psi(M(X,t), G(X,t)) \cdot \dot G(X,t) \ge 0, \tag{1.4} \]
\[ \det(G(X,t) + M(X,t)) \ge \det G(X,t) > m(t) > 0. \tag{1.5} \]
Here, $(M, G) \mapsto \tilde\Psi(M, G)$ is the response function that gives the Helmholtz free energy density $\tilde\Psi(M(X,t), G(X,t))$ at each point $X$ in the reference configuration and each time $t$, $D_M\tilde\Psi$ and $D_G\tilde\Psi$ denote its partial derivatives, $b_{\mathrm{ref}}$ is the body force in the reference configuration, $\rho_{\mathrm{ref}}$ is the mass density in the reference configuration, $m(t)$ is a positive number depending on time, $I$ is the identity tensor, sk
denotes the skew part of a tensor, and superposed dots denote differentiation with
respect to time. The balance of linear momentum (1.1), the “consistency relation”
(1.2), and the frame-indifference relation (1.3) amount to 12 scalar equations that
restrict the unknown fields χ and G representing 12 scalar fields in all. The Piola–
Kirchhoff stress field S in the reference configuration is related constitutively to
the fields G and M by the stress relation
\[ S(X,t) = D_M\tilde\Psi(M(X,t), G(X,t)) + D_G\tilde\Psi(M(X,t), G(X,t)), \tag{1.6} \]
and the “mixed power” inequality (1.4) guarantees that the internal dissipation is
non-negative on each dynamical process of the body. Finally, the inequality (1.5)
guarantees that no interpenetration of matter occurs submacroscopically [1]. In
addition to the frame-indifference relation (1.3), the free energy response function
$\tilde\Psi$ is required to be frame-indifferent in the sense described in Section 9. There we
show that these two conditions of frame-indifference imply that the law of balance
of angular momentum is satisfied and, hence, need not be imposed directly.
The theory of structured deformations [1, 2] shows that the tensor field M =
∇χ − G describes the contributions at the macrolevel of “disarrangements,” i.e.,
of the non-smooth part of piecewise smooth submacroscopic geometrical changes,
and we use the term “elasticity with disarrangements” to distinguish the nascent field theory described in (1.1)–(1.5) from that embodied in the now standard field theory of non-linear elasticity:
\[ \operatorname{div}_X\bigl(D\Psi(\nabla\chi(X,t))\bigr) + b_{\mathrm{ref}}(X,t) = \rho_{\mathrm{ref}}(X)\,\ddot\chi(X,t), \tag{1.7} \]
\[ \det\nabla\chi(X,t) > 0 \tag{1.8} \]
along with the stress relation
\[ S(X,t) = D\Psi(\nabla\chi(X,t)). \tag{1.9} \]
We note that the balance of angular momentum need not be imposed explicitly if
the response function $F \mapsto \Psi(F)$ is required to be frame-indifferent.
We describe in this paper how the multiscale geometry embodied in structured
deformations affords not only the decomposition
\[ \nabla\chi = G + M \tag{1.10} \]
of the macroscopic deformation gradient $\nabla\chi$ into a part $M$ due to disarrangements and a part $G$ without disarrangements, but also the decomposition
\[ (\det K) S = S_\backslash + S_d, \tag{1.11} \]
with $S_\backslash := (\det K) S K^{-T}$ the stress without disarrangements and $S_d := (\det K) S - S_\backslash$ the stress due to disarrangements. The decomposition of stress (1.11) is the basis for the consistency relation (3.6) which, in turn, yields the field equation (1.2), once constitutive assumptions are laid down. The two decompositions (1.10) and (1.11) are central to the “top-down” nature of our methodology, in which standard macroscopic fields, such as the stress power $S \cdot \nabla\dot\chi$ and the volume density of moments due to contact forces $\operatorname{sk}(S(\nabla\chi)^T)$, are refined and enriched by substitution of the decompositions (1.10) and (1.11) for the factors $S$ and $\nabla\chi$:
\[ (\det K)\, S \cdot \nabla\dot\chi = S_\backslash \cdot \dot G + S_d \cdot \dot M + S_\backslash \cdot \dot M + S_d \cdot \dot G, \tag{1.12} \]
\[ (\det K) \operatorname{sk}(S F^T) = \operatorname{sk}(S_\backslash G^T) + \operatorname{sk}(S_d M^T) + \operatorname{sk}(S_\backslash M^T) + \operatorname{sk}(S_d G^T). \tag{1.13} \]
Here $F := \nabla\chi$.
We utilize in the sequel an “identification relation” (2.2)$_2$ for the field $G$, an identification relation (2.4) for $M$, and one for $\operatorname{div} S_\backslash$ (relation (A.3) in the Appendix), that are provided by the theory of structured deformations. These relations
(i) describe $G$, $M$ and $\operatorname{div} S_\backslash$ as limits of geometrical or statical quantities calculated in terms of the piecewise smooth, injective deformations that approximate a structured deformation,
(ii) justify the attributes “without disarrangements” and “due to disarrangements”,
and
(iii) provide unambiguous interpretations for the terms in the decompositions
(1.12) and (1.13).
This methodology is supplemented by factorizations of the type
\[ (\chi, G) = (\chi, \nabla\chi) \circ (i, K) \tag{1.14} \]
in which the pair $(\chi, \nabla\chi)$ represents only classical geometrical changes and $(i, K)$
represents purely submacroscopic geometrical changes. This factorization permits
us to introduce a “virgin configuration,” macroscopically identical to the reference
configuration, and to interpret the stress without disarrangements S\ as a stress in
the virgin configuration. In the case of invertible structured deformations, the virgin
configuration also can serve as an intermediate configuration for theories based on
classical deformations.
We are able with these tools to scrutinize and refine principal ingredients in
continuum field theories, namely,
· geometry
· kinematics
· forces and moments
· power
· dissipation
· constitutive relations
· material symmetry
· material frame indifference
· material uniformity,
and to arrive at the field relations (1.1)–(1.5) for an elastic body undergoing disarrangements. The new relations derived here incorporate the effects of submacroscopic disarrangements, such as slips, separations, the formation of voids, and the
switching or reorientation of submacroscopic units. They also cover submacroscopically smooth geometrical changes, such as the distortion of atomic lattices and
of molecular networks at length scales large enough to justify the use of smooth
fields to extrapolate the discrete geometrical changes of the lattice or network.
However, relations (1.1)–(1.5) do not directly incorporate the effect of jumps in the
gradients of approximating piecewise smooth deformations (“gradient disarrangements”), so that fine mixtures of phases are not captured. Moreover, the effects of
time-like disarrangements, in which changes in position occur at very short time
scales, are not incorporated, and macroscopic disarrangements such as fracture,
shear bands, shock waves, and acceleration waves are formally excluded by our
assumption that χ and G are smooth. Nevertheless, the inclusion of macroscopic
disarrangements can be accomplished in a manner analogous to that used in the
field theory based on (1.7)–(1.9). In addition, time-like disarrangements and gradient disarrangements are amenable to treatment via the concepts of “structured
motions” [3, 4] and of “second-order structured deformations” [5] that go beyond
the geometry and the kinematics in this paper. We note also that couple stresses
and other multipolar entities, temperature variations, electromagnetic fields, and
chemical reactions are left out of the present development.
It is evident from the constitutive expression $\tilde\Psi(M(X,t), G(X,t))$ for the volume density of the Helmholtz free energy that our theory permits energy to be
stored by means of both smooth and non-smooth submacroscopic geometrical
changes. For example, contributions to the energy both from the distortion of a
crystalline lattice between slip bands and the relative translations of parts of the
crystalline lattice across slip bands can be included here, because the former is captured by G(X, t) and the latter by M(X, t). Similarly, the macroscopic stretching
produced when a polymer network deforms can be identified through $\nabla\chi(X,t) = G(X,t) + M(X,t)$ while the submacroscopic reorientations of attached nematic particles can be described by $G(X,t)$, so that $\tilde\Psi(M(X,t), G(X,t))$ can reflect the energetic contributions of both. Our development includes the possibility that $\tilde\Psi$ depends upon the material point $X$ explicitly, and we discuss the concepts of material uniformity and homogeneity in Section 13.
The response function $(M, G) \mapsto \tilde\Psi(M, G)$ may or may not be obtained by means of a process of homogenization or relaxation from an “initial” response function describing the energy stored in piecewise smooth approximating deformations. In fact, it has been shown that such a relaxation procedure, starting from a standard form of the initial energy, does lead to response functions of the type $(M, G) \mapsto \tilde\Psi(M, G)$ [6], but also that the specific dependence on $M$ and $G$ obtained by such a relaxation may exclude response functions already found to be useful in applications ([3, 7–9]). We expect that response functions $\tilde\Psi$ identified in a variety of ways will play a role in applying the field relations (1.1)–(1.5) to specific bodies.
In the present paper we describe a class of response functions $\tilde\Psi$ restricted only
by considerations of material frame indifference, material symmetry, or material
uniformity as explained in Sections 9, 12, and 13. This choice provides a broad
view of elasticity with disarrangements, but does not provide for the moment insights into specific solid or fluid bodies encountered in the laboratory. However, we
do include a specific example, that of an “energetically nearsighted elastic body,”
in which the constitutive relation for the free energy takes the form
\[ (M, G) \mapsto \tilde\Psi(M, G) = \bar\psi\!\left(\frac{\det G}{\det(G + M)}\right). \tag{1.15} \]
We note by (1.5) that det G/ det(G + M) = det K takes values in (0, 1], and we
can interpret 1 − det K as the “void fraction” created by the purely submacroscopic
factor (i, K) in (1.14). Similarly, we call det K the “volume fraction” associated
with (i, K). We show that the consistency relation (1.2), rewritten in terms of the
variables ∇χ and K, and the constitutive assumption (1.15) imply that such an
elastic body can arise in two non-trivial “universal” phases: a spherical phase, in which $\det K = (1 - \gamma_0)^3$, and an elongated phase, in which $\det K = 1 - \gamma_0$, where $\gamma_0 := (\sqrt{5} - 1)/2$ is the golden mean. The stress relation (1.6) for the spherical phase
reduces to that of an ideal gas, so that the stress in the current configuration is a
hydrostatic pressure that depends linearly on the density in the current configuration. For the elongated phase, a uniaxial stress in the direction of submacroscopic
elongation is superposed on such a hydrostatic pressure, as an outcome of the stress
relation (1.6).
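For orientation, the two universal volume fractions quoted above can be evaluated numerically. The following is a minimal sketch; the function name `volume_fractions` is hypothetical, and the only inputs are the formulas for $\det K$ and $\gamma_0$ stated in the text together with the interpretation of $1 - \det K$ as the void fraction.

```python
import math

# Golden mean as defined in the text: gamma0 := (sqrt(5) - 1) / 2.
gamma0 = (math.sqrt(5.0) - 1.0) / 2.0

def volume_fractions(g=gamma0):
    """Return the volume fractions det K of the (spherical, elongated) phases."""
    return (1.0 - g) ** 3, 1.0 - g

spherical, elongated = volume_fractions()

print(f"gamma0            = {gamma0:.6f}")
print(f"spherical  det K  = {spherical:.6f}, void fraction = {1.0 - spherical:.6f}")
print(f"elongated  det K  = {elongated:.6f}, void fraction = {1.0 - elongated:.6f}")
```

Since $\gamma_0^2 + \gamma_0 = 1$, the elongated volume fraction $1 - \gamma_0$ equals $\gamma_0^2$ and the spherical one equals $\gamma_0^6$; both lie in $(0, 1]$, as required by (1.5).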
We turn now to further introductory remarks on the nature of the field relations
(1.1)–(1.5). Suppose that the response function $(M, G) \mapsto \tilde\Psi(M, G)$ is chosen to satisfy the condition
\[ D_M\tilde\Psi(0, G) = 0, \tag{1.16} \]
for all tensors G with det G > 0. If we consider a smooth deformation χ and put
G := ∇χ, then the tensor field M = ∇χ − G is identically zero and (1.2)–(1.4) are
satisfied identically, the last with “≥” replaced by “=”. Consequently a classical
motion satisfies (1.1)–(1.4) if and only if
\[ \operatorname{div}_X\bigl(D_G\tilde\Psi(0, \nabla\chi(X,t))\bigr) + b_{\mathrm{ref}}(X,t) = \rho_{\mathrm{ref}}(X)\,\ddot\chi(X,t), \tag{1.17} \]
which is equivalent to (1.7), the balance of linear momentum for a non-linearly
elastic body. The inequality (1.5) and the relation G = ∇χ yield (1.8), and the
stress relation (1.6) for an elastic body undergoing disarrangements reduces, by
virtue of (1.16), to
\[ S(X,t) = D_G\tilde\Psi(0, \nabla\chi(X,t)), \tag{1.18} \]
a relation equivalent to (1.9). Therefore, relation (1.16) implies that classical motions that satisfy the new relations (1.1)–(1.6) also satisfy the field relations (1.7)–
(1.9) of non-linear elasticity. We note also that, in the example of energetically
nearsighted elastic bodies treated in Section 14, the condition (1.16) is satisfied
only in exceptional cases.
The statical quantities underlying the relations (1.1)–(1.6) all can be expressed
in terms of the classical measure of stress S, and the only balance law directly
imposed is the classical balance of linear momentum. This observation provides a
point of contrast between elasticity with disarrangements and theories of “structured continua,” in which additional geometrical fields are accompanied by additional statical quantities and additional balance laws [10, 11]. Moreover, the
presence of the non-classical geometrical field G in the present theory does not
require that we impose constitutively an evolution law expressing Ġ in terms of
other geometrical and statical quantities, as is the case in theories of materials
with “internal variables.” Instead, the decomposition (1.11) for the stress leads to
the consistency relation (1.2) that restricts G and M = ∇χ − G in dynamical
processes for the body. We note further that the field relations (1.1)–(1.5) are not
obtained by imposing balance laws and constitutive relations at a submacroscopic
length scale followed by an averaging procedure that leads to corresponding relations at the macrolevel. This fact distinguishes the present theory from multiscale
approaches employed in the field of micromechanics that use homogenization or
other systematic schemes of averaging.
The present study does not address initial-boundary value problems for the field
relations (1.1)–(1.5). Nevertheless, we expect that the problem of existence and
uniqueness locally in time of smooth solutions χ, G satisfying initial conditions
on χ, χ̇, and G and boundary conditions on χ can be attacked by means of the
energy methods described in Chapter III of the monograph [12]. Our expectation
is based on preliminary calculations for one-dimensional versions of (1.1)–(1.5),
expressed in the equivalent form (10.23)–(10.27). Key issues in confirming our
expectation are the local solvability of the consistency relation (10.24) at the initial
time and satisfaction of the mixed power inequality (10.26) with strict inequality
at the initial time.
2. Structured Deformations
Specification of a structured deformation from a region A in a Euclidean space E
with translation space V includes the specification of two fields g: A → E and
G: A → Lin V called the macroscopic deformation and the deformation without
disarrangements, respectively. In the present study we assume that the fields g
and G are smooth, although discontinuities are permitted in the piecewise-smooth approximating deformations $f_n$ introduced below (2.1), as well as in $\nabla f_n$. This assumption excludes slip and separation at the macroscopic level while permitting such discontinuities at submacroscopic levels. To avoid some technical issues,
precise smoothness assumptions on these fields and on the region A will not be
specified here, but sufficient smoothness requirements on the fields can be inferred
from the context. Other than smoothness requirements, the only conditions imposed on the fields g and G are the injectivity of g and the existence of a positive
number m such that the inequalities
\[ m < \det G(X) \le \det\nabla g(X) \tag{2.1} \]
hold for all $X \in A$. The Approximation Theorem for structured deformations [1] assures that there is a sequence $n \mapsto f_n$ of piecewise smooth, injective deformations defined on $A$ (a determining sequence) such that
\[ g = \lim_{n\to\infty} f_n, \qquad G = \lim_{n\to\infty} \nabla f_n, \tag{2.2} \]
with the limits taken in the sense of essentially uniform convergence (i.e., $L^\infty$ convergence). The spatial derivatives $\nabla f_n$ are taken in the classical sense, and
the limit $G$ of derivatives in (2.2)$_2$ need not equal the corresponding derivative
∇g of the macroscopic deformation (nor need G even be the gradient of some
deformation). Specific quantitative information about the difference M := ∇g − G
is provided in the first subsection below and justifies the terminology deformation
due to disarrangements for M.
2.1. DECOMPOSITIONS AND IDENTIFICATION RELATIONS
The additive decomposition
∇g = G + M
(2.3)
for the macroscopic deformation gradient is given deeper significance by means of
the following limit relation [2] for M:
\[ M(X) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\, \frac{1}{\operatorname{vol} B(X;\delta)} \int_{\Gamma(f_n)\cap B(X;\delta)} [f_n](Y) \otimes \nu(Y)\, dA_Y. \tag{2.4} \]
In this relation, $n \mapsto f_n$ is an arbitrary sequence of piecewise smooth deformations that satisfies the limit relations (2.2). The symbol $B(X;\delta)$ denotes the ball of radius $\delta > 0$ centered at a point $X$ in $A$, and $\Gamma(f_n) \subset E$, $[f_n](Y) \in V$, and $\nu(Y) \in V$ denote, respectively, the jump set of the piecewise smooth deformation $f_n$, the jump of $f_n$ at a point $Y \in \Gamma(f_n)$, and the unit normal to the jump set $\Gamma(f_n)$ at the point $Y$. The integrand in (2.4) is the tensor product of $[f_n](Y)$ and $\nu(Y)$, both vectors in $V$.
The precise interpretations now available for $G$ and $M = \nabla g - G$ permit
us to understand and interpret various features of structured deformations in the
following subsections.
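The identification relation (2.4) can be illustrated on a simple example that is an assumption of this sketch and is not taken from the paper: a planar shear accomplished entirely by slips, with $f_n(X) = (X_1 + \gamma \lfloor n X_2 \rfloor / n,\; X_2)$. Here $f_n$ converges uniformly to $g(X) = (X_1 + \gamma X_2, X_2)$ and $\nabla f_n = I$ away from the slip lines, so $G = I$ and $M = \nabla g - G = \gamma\, e_1 \otimes e_2$. The jump set is the family of lines $X_2 = k/n$, the jump is $(\gamma/n)\, e_1$, and the normal is $e_2$. In two dimensions "vol" is an area and $dA_Y$ is arc length. The function name is hypothetical.

```python
import math

def m12_from_jumps(gamma, n, delta, x2=0.5):
    """(1,2) component of the averaged jump integral in (2.4), for fixed n and delta.

    Assumed setting: slips of size gamma/n across the lines X2 = k/n,
    averaged over the disc of radius delta centered at a point with
    second coordinate x2.
    """
    k_min = math.ceil((x2 - delta) * n)
    k_max = math.floor((x2 + delta) * n)
    total = 0.0
    for k in range(k_min, k_max + 1):
        h = k / n - x2  # signed distance of the slip line from the center
        chord = 2.0 * math.sqrt(max(delta * delta - h * h, 0.0))
        total += (gamma / n) * chord  # jump magnitude times length inside the disc
    return total / (math.pi * delta * delta)  # divide by the area of the disc

gamma = 0.3
for n in (10, 100, 1000, 10000):
    print(n, m12_from_jumps(gamma, n, delta=0.1))
```

As $n$ grows the sum is a Riemann sum for $\gamma$ times the area of the disc, so the printed values should approach $\gamma$, in agreement with $M = \gamma\, e_1 \otimes e_2$; in this homogeneous example the outer limit $\delta \to 0$ plays no role.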
2.2. FACTORIZATIONS , VIRGIN CONFIGURATIONS , AND INTERMEDIATE
CONFIGURATIONS
The definition of composition of two structured deformations [1] is provided in the
formula:
(g̃, G̃) ◦ (g, G) := (g̃ ◦ g, (G̃ ◦ g)G).
(2.5)
Here, the symbol “◦” on the left-hand side denotes the composition of two structured deformations, while on the right-hand side it denotes the composition of two
functions. In addition, (G̃ ◦ g)G denotes the pointwise composition of the two
tensor fields G̃ ◦ g and G. This formula provides the following factorizations for a
structured deformation (g, G):
\[ (g, G) = (g, \nabla g) \circ (i, K), \tag{2.6} \]
\[ (g, G) = (i, \tilde H) \circ (g, \nabla g), \tag{2.7} \]
where $i(X) := X$ for all $X \in A$, $K := (\nabla g)^{-1} G$ and $\tilde H := (G \circ g^{-1})((\nabla g)^{-1} \circ g^{-1})$. The first factorization (2.6) represents the given structured deformation as the classical deformation $(g, \nabla g)$ following a “purely submacroscopic” structured deformation $(i, K)$ that accomplishes all of the disarrangements associated with $(g, G)$. Analogously, the second factorization (2.7) represents $(g, G)$ as the same classical deformation followed by the purely submacroscopic structured deformation $(i, \tilde H)$. We emphasize that all of the factors in the above representations are
deformations of the entire body.
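The factorizations (2.6) and (2.7) can be checked numerically in the simplest setting. The sketch below assumes homogeneous structured deformations in two dimensions, $g(X) = FX$ with constant $F$ and constant $G$, so that the composition (2.5) reduces to $(\tilde F, \tilde G) \circ (F, G) = (\tilde F F, \tilde G G)$. The matrices and helper names are hypothetical and chosen only so that $0 < \det G \le \det F$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def compose(outer, inner):
    """Composition (2.5) for homogeneous pairs: (F~, G~) o (F, G) = (F~ F, G~ G)."""
    return matmul(outer[0], inner[0]), matmul(outer[1], inner[1])

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I = [[1.0, 0.0], [0.0, 1.0]]
F = [[1.2, 0.5], [0.0, 1.0]]   # gradient of g (hypothetical values), det F = 1.2
G = [[1.0, 0.1], [0.0, 0.9]]   # deformation without disarrangements, det G = 0.9

K = matmul(inv(F), G)          # K := (grad g)^{-1} G
H = matmul(G, inv(F))          # H~ reduces to G (grad g)^{-1} when the fields are constant

first = compose((F, F), (I, K))    # (g, grad g) o (i, K), as in (2.6)
second = compose((I, H), (F, F))   # (i, H~) o (g, grad g), as in (2.7)

print(close(first[0], F) and close(first[1], G))
print(close(second[0], F) and close(second[1], G))
print(det(K), det(H), det(G) / det(F))
```

Both compositions should reproduce the pair $(F, G)$, and the three printed determinants should agree up to rounding, which is the homogeneous version of the common value $\det K = \det \tilde H = \det G / \det \nabla g$.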
The factorization (2.6) provides a distinction between the body before and after it undergoes the purely submacroscopic deformation (i, K), a distinction that
permits us to distinguish between the reference configuration, from which the
classical deformation (g, ∇g) proceeds, and the virgin configuration, from which
both (i, K) and (g, G) proceed. Similarly, we may distinguish by means of (2.7)
between the deformed configuration without disarrangements, attained from the
virgin configuration via the classical deformation (g, ∇g) alone, and the deformed
configuration, attained from the deformed configuration without disarrangements
via the purely submacroscopic deformation (i, H̃ ). Of course, all of the configurations mentioned are global configurations of the body.
We note that the inequality (2.1) implies the relations
\[ 0 < \det K = \det\tilde H \le 1 \tag{2.8} \]
and permits us to call det K = det H̃ = det G/ det ∇g the volume fraction associated with the given structured deformation. The case det K < 1 reflects creation of
voids through the purely submacroscopic deformations (i, K) and (i, H̃ ). Of particular interest in applications such as crystalline plasticity are invertible structured
deformations (g, G), i.e., structured deformations for which the volume fraction
equals 1. The term “invertible” is appropriate, because the pair $(g^{-1}, G^{-1} \circ g^{-1})$ then is itself a structured deformation that is a two-sided inverse for $(g, G)$ with respect to the composition in (2.5) and with $(i, I)$ playing the role of the identity structured deformation (here $Iv = v$ for all $v \in V$). In this case, the purely submacroscopic factor $(i, K)$ also is an invertible structured deformation with inverse $(i, K)^{-1} = (i, K^{-1})$, and we have the following factorization
\[ (g, \nabla g) = (g, G) \circ (i, K)^{-1} \tag{2.9} \]
of the classical deformation $(g, \nabla g)$. For the structured deformation $(g, G)$, the purely submacroscopic deformation $(i, K)$ carries the virgin configuration into the reference configuration; consequently, its inverse $(i, K)^{-1}$ carries the reference configuration into the virgin configuration. Thus, the virgin configuration
for the invertible structured deformation (g, G) plays the role of a (global) intermediate configuration for the classical deformation (g, ∇g). Local intermediate
configurations play an important role in descriptions of single and polycrystalline
materials and of polymers (see [13, 14] and references cited therein).
2.3. MOTIONS VIA FAMILIES OF STRUCTURED DEFORMATIONS ; SPACE - LIKE
DISARRANGEMENTS
The most immediate way of capturing the possibility that a body evolves in time
while undergoing structured deformations at each instant is to consider a given
positive number T and a pair of smooth mappings χ: A ×(0, T ) → E and G: A ×
(0, T ) → Lin V such that the pair (χ(·, t), G(·, t)) is a structured deformation for
each t ∈ (0, T ). When the Approximation Theorem and the identification relation
in Section 2.1 are invoked at each time t, the relations (2.2) and (2.4) become:
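\[ \chi(\cdot, t) = \lim_{n\to\infty} \chi_n(\cdot, t), \qquad G(\cdot, t) = \lim_{n\to\infty} \nabla\chi_n(\cdot, t) \tag{2.10} \]
and
\[ M(X, t) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\, \frac{1}{\operatorname{vol} B(X;\delta)} \int_{\Gamma(\chi_n(\cdot,t))\cap B(X;\delta)} [\chi_n(\cdot, t)](Y) \otimes \nu(Y)\, dA_Y. \tag{2.11} \]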
In this context, the disarrangements associated with the approximating motions
χn that are captured in the tensor field M: A × (0, T ) → Lin V are space-like,
so that time-like jumps in χn do not affect the fields associated with the family
t → (χ(·, t), G(·, t)) of structured deformations. The more complete treatment of
“structured motions” described in [3], Part 2, introduces not only a deformation
without disarrangements G but also a velocity without disarrangements χ̇\ that
permit both space-like and time-like jumps to be captured in two analogues of the
identification relation (2.11).
We choose here to follow the more immediate route, bodies evolving through
time-parameterized families of structured deformations, and our theory of elasticity with disarrangements more accurately can be entitled elasticity with space-like
disarrangements. The inequality (2.1) becomes in the case of time-parameterized
families of structured deformations:
\[ 0 < m(t) < \det G(X, t) \le \det\nabla\chi(X, t). \tag{2.12} \]
3. Contact and Body Forces
3.1. DECOMPOSITIONS
Earlier studies of balance laws for bodies undergoing structured deformations ([3],
Part 2, Section 1, and [15]) showed that the classical law of balance of forces in the
reference configuration is equivalent to a “refined balance law” that may be written
as:
div(SK ∗ ) + div((det K)S − SK ∗ ) − S∇(det K) + (det K)bref = 0.
(3.1)
Here, S: A×(0, T ) → Lin V is the Piola–Kirchhoff stress field, K := (∇χ)−1 G,
bref is the body force per unit volume in the reference configuration, and A∗ :=
(det A)A−T for all invertible A ∈ Lin V. Moreover, the decomposition (A.1) and
the identification relations (A.2), (A.3) derived in earlier studies ([3, 15]) and
recorded in the Appendix, permit us to call
S\ := SK ∗ = (det K)SK −T
(3.2)
the stress without disarrangements, div(SK ∗ ) the volume density of contact forces
without disarrangements, and div((det K)S − SK ∗ ) − S∇(det K) the volume density
of contact forces due to disarrangements. We call
Sd := S[(det K)I − K ∗ ]
(3.3)
the stress due to disarrangements. The availability through structured deformations
of both a virgin configuration and a reference configuration permits one to view
(3.1) as balance of forces in the virgin configuration, differing from the reference configuration by a purely submacroscopic deformation as described in Section 2.2. Of course, the scalar field det K may be thought of as the volume fraction
associated with the given time-parameterized family of structured deformations.
FIELD THEORY FOR ELASTIC BODIES
297
The considerations above lead us not only to the decomposition
F =G+M
(3.4)
of the macroscopic deformation gradient F = ∇χ: A×(0, T ) → Lin V but also,
upon adding relations (3.2) and (3.3), to the decomposition of the stress:
(det K)S = S\ + Sd .
(3.5)
The stress tensor (det K)S is an analogue of the “weighted Cauchy tensor” (det F )T
discussed in [16], and equations (3.1) and (3.5) show that it is this weighted measure of stress that readily decomposes into a part without disarrangements plus a
part due to disarrangements.
3.2. CONSISTENCY RELATION
If we use the defining relation (3.2) for the stress without disarrangements S\ to
eliminate the Piola–Kirchhoff stress S from the decomposition (3.5), we obtain a
consistency relation between the stresses due to and without disarrangements:
S\ K T = S\ + Sd .
(3.6)
Roughly speaking, there is less freedom in the decomposition (3.5) of the weighted
stress (det K)S into parts with and without disarrangements than in the decomposition (3.4) of the macroscopic deformation gradient into parts with and without
disarrangements. Accordingly, we refer to (3.6) as the consistency relation. It will
provide, through the constitutive assumptions for S\ and Sd made in Section 7, the
restriction (1.2) on the dynamical processes that can occur in a given elastic body.
An equivalent form of the consistency relation,
S\ M T + Sd GT + Sd M T = 0,
(3.7)
follows from (3.6) after substitution of GT F −T for K T and multiplication on the right by F T . It implies that
sk(S\ M T ) + sk(Sd GT ) + sk(Sd M T ) = 0,
(3.8)
where skA := (A − AT )/2 denotes the skew part of A ∈ Lin V. This relation plays
a role in the analysis of moment densities in Section 5.
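The decomposition (3.5), the consistency relation (3.6), and its equivalent form (3.7) are purely algebraic consequences of the definitions (3.2) and (3.3), so they can be spot-checked numerically. The following is a minimal sketch under stated assumptions: the helper names and the sample tensors F, G, S are illustrative choices, not data from the paper, and plain Python lists are used to avoid any dependency.

```python
# Sketch: check (3.5), (3.6) and (3.7) for sample 3x3 tensors.
# Helper names and sample data are illustrative assumptions.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    # Inverse via the cofactor formula: adjugate divided by determinant.
    d = det(A)
    cof = [[A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def lin(a, A, b, B):
    """Return the linear combination a*A + b*B."""
    return [[a * A[i][j] + b * B[i][j] for j in range(3)] for i in range(3)]

def max_abs(A):
    return max(abs(x) for row in A for x in row)

F = [[1.2, 0.1, 0.0], [0.0, 1.1, 0.2], [0.1, 0.0, 1.3]]    # macroscopic gradient
G = [[1.0, 0.05, 0.0], [0.0, 1.0, 0.1], [0.0, 0.0, 1.1]]   # without disarrangements
S = [[2.0, 0.3, 0.1], [0.4, 1.5, 0.2], [0.0, 0.2, 1.0]]    # sample stress
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

M = lin(1.0, F, -1.0, G)                         # (3.4): F = G + M
K = matmul(inv(F), G)                            # K = F^{-1} G
detK = det(K)
K_star = [[detK * x for x in row] for row in transpose(inv(K))]  # K* = (det K) K^{-T}
S_no = matmul(S, K_star)                         # (3.2), stress without disarrangements
S_d = matmul(S, lin(detK, I3, -1.0, K_star))     # (3.3), stress due to disarrangements

total = lin(1.0, S_no, 1.0, S_d)
res_35 = lin(detK, S, -1.0, total)                           # (det K) S - (S_no + S_d)
res_36 = lin(1.0, matmul(S_no, transpose(K)), -1.0, total)   # S_no K^T - (S_no + S_d)
res_37 = lin(1.0, matmul(S_no, transpose(M)), 1.0,
             matmul(S_d, lin(1.0, transpose(G), 1.0, transpose(M))))
residuals = [max_abs(r) for r in (res_35, res_36, res_37)]
print(residuals)
```

All three residuals should vanish up to rounding for any invertible F and G, which reflects the fact that (3.6) and (3.7) restrict the pair (S\, Sd) rather than the sample data.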
4. Power Expended; Balance Laws
We postulate that in a family of structured deformations (χ, G) the power expended at time t ∈ (0, T ) on a subbody S ⊂ A by its exterior is given by the
classical formula
\[ P(S,t) = \int_{\operatorname{bdy}S} S(X,t)\,\nu(X)\cdot\dot\chi(X,t)\,dA_X + \int_{S} b^{*}(X,t)\cdot\dot\chi(X,t)\,dV_X. \tag{4.1} \]
Here, beyond the quantities S and χ̇ introduced in Sections 2 and 3, ν(X) denotes
the outward unit normal at the point X ∈ bdy S, and b∗ := bref − ρref χ̈ is the total
body force, with bref the body force in the reference configuration and ρref the mass
density in the reference configuration. Our use of the classical formula (4.1) for
the power allows us to preserve much of the structure of the standard field theory
of non-linear elasticity. (See [4] for a derivation of balance laws that arise from a
non-classical formula for the power.)
A standard argument [17, 18] based on invariance of the power expended under
superposed rigid motions yields the classical laws of balance of linear and angular
momentum:
divS + bref = ρref χ̈ ,
(4.2)
sk(SF T ) = 0.
(4.3)
The definition (4.1) of the power expended and the balance law (4.2) yield by
means of the divergence theorem and a standard product rule the following reduced
expression for the power expended
\[ P(S,t) = \int_{S} S(X,t)\cdot\nabla\dot\chi(X,t)\,dV_X. \tag{4.4} \]
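The step from (4.1) to (4.4) can be sketched as follows. The sketch assumes fields smooth enough for the divergence theorem and takes the "standard product rule" to be div(S^T χ̇) = (div S) · χ̇ + S · ∇χ̇; arguments (X, t) are suppressed.

```latex
\begin{align*}
\int_{\operatorname{bdy}S} S\nu\cdot\dot\chi\,dA_X
  &= \int_{\operatorname{bdy}S} (S^{T}\dot\chi)\cdot\nu\,dA_X
   = \int_{S} \operatorname{div}(S^{T}\dot\chi)\,dV_X
   = \int_{S} \bigl[(\operatorname{div}S)\cdot\dot\chi + S\cdot\nabla\dot\chi\bigr]\,dV_X, \\
P(S,t)
  &= \int_{S} \bigl[(\operatorname{div}S + b^{*})\cdot\dot\chi + S\cdot\nabla\dot\chi\bigr]\,dV_X
   = \int_{S} S\cdot\nabla\dot\chi\,dV_X .
\end{align*}
```

The last equality uses div S + b∗ = div S + bref − ρref χ̈ = 0, which is the balance law (4.2).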
We note that the formula ∇ χ̇ = (∇χ)· = Ḟ and the two basic decompositions
(3.4) and (3.5) permit us to decompose (det K)S · ∇ χ̇, the density of stress power
in the virgin configuration, in the following manner:
(det K)S · ∇ χ̇ = S\ · Ġ + Sd · Ṁ + S\ · Ṁ + Sd · Ġ.
(4.5)
The contribution S\ · Ġ + Sd · Ṁ to the stress power involves pairing like quantities
(a stress without disarrangements and a rate of deformation without disarrangements, or corresponding quantities due to disarrangements). Because the contribution S\ · Ṁ + Sd · Ġ mixes factors with and without disarrangements, we refer to
it as the mixed (stress) power.
In a similar way, we may decompose the volume density of moments in the
balance law (4.3):
(det K)sk(SF T ) = sk(S\ GT )+sk(Sd M T )+sk(S\ M T )+sk(Sd GT ),
(4.6)
and individual terms on the right-hand side may be interpreted as particular moment densities, as described in the next section. By the consistency relation (3.8),
the last three moment densities sk(Sd M T ), sk(S\ M T ), and sk(Sd GT ) appearing
on the right-hand side of (4.6) must add to zero. By the balance of angular momentum (4.3), by (4.6), and by (3.8), the first moment density sk(S\ GT ) on the
right-hand side of (4.6) must vanish.
5. Offset Moments
The volume densities of moments sk(S\ M T ), sk(Sd M T ), sk(Sd GT ) arose in Section 4 through the formula (4.6). In this section we identify the terms sk(S\ M T )
and sk(Sd M T ) as volume densities of “offset moments.”
The significance of the moment density sk(Sd M T ) is revealed by the following
identification relation:
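\[ \operatorname{sk}\bigl(S_d(X,t)M^{T}(X,t)\bigr) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\frac{\displaystyle\int_{\Gamma(\chi_n(\cdot,t))\cap B(X;\delta)} S_d(X,t)\nu(Y)\times[\chi_n(\cdot,t)](Y)\,dA_Y}{\operatorname{vol}B(X;\delta)}, \tag{5.1} \]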
which we verify below. The vector Sd (X, t)ν(Y ) is the traction due to disarrangements at the point Y on a disarrangement site in the virgin configuration, computed
using the stress due to disarrangements at the center X of the ball. The vector
product Sd (X, t)ν(Y ) × [χn (·, t)](Y ) is (minus) the moment per unit area produced
by that traction acting against the offset [χn (·, t)](Y ) caused by disarrangements.
An elementary instance of such moments would arise if a deck of cards, in equilibrium under a system of loads, is shifted near the middle card without changing
either the shape of the individual cards or the applied loads. The moment arising
from the change in geometry of the deck corresponds to the moment calculated
on the right-hand side of (5.1). Consequently, we call sk(Sd M T ) a volume density
of offset moments. Of course, replacing Sd by S\ in (5.1) permits us also to call
sk(S\ M T ) a volume density of offset moments. The identification relation (5.1)
follows immediately if we substitute the right-hand side of the identification relation (2.11) into the left-hand side of (5.1) and if we identify the skew tensor
sk(Sd (X, t)ν(Y ) ⊗ [χn (·, t)](Y )) with its axial vector.
Because G measures the deformation away from disarrangement sites, we interpret the moment density sk(Sd GT ) in (4.6), arising even in motions involving
disarrangements, as an analogue of the moment density sk(SF T ) arising in the
classical balance law. (Recall that in Section 4 we pointed out that the analogous
density sk(S\ GT ) vanishes, because it is a scalar multiple of sk(SF T ).) According
to (3.8), the offset moment densities sk(Sd M T ), sk(S\ M T ), and the moment density
sk(Sd GT ) in (4.6) must add to zero. Actually, we show in Section 9 that material
frame-indifference implies that sk(Sd M T ) vanishes and, therefore, that sk(S\ M T )
and sk(Sd GT ) also must add to zero.
6. Dynamical Processes, Constitutive Classes, and the Dissipation Inequality
A dynamical process is specified here by giving a motion χ, the deformation without disarrangements G, the stress field S, the volume density ψ of the Helmholtz
free energy in the reference configuration, and the mass density ρref in the reference configuration. Of course, the stresses without and due to disarrangements
S\ and Sd are determined by the Piola–Kirchhoff stress S, the motion χ, and
the deformation without disarrangements G through the relations (3.2) and (3.3),
and the body force bref also is determined by fields in our list from the balance
of linear momentum (4.2). (We may guarantee that the balance of angular momentum (4.3) is satisfied on every dynamical process by imposing the condition
sk(SF T ) = 0, but we refrain from doing so pending the discussion of frame-indifference in Section 9.) We will omit ρref in the list above for the sake of
conciseness.
The concept of a constitutive class is central to the specification of the particular
material that is to be considered. Here, following Gurtin [19], a constitutive class
C simply is a collection of dynamical processes. A particular choice of constitutive class limits the dynamical processes that are to be considered. In practice, a
constitutive class is specified by giving a list of response functions: the constitutive
class is the collection of those dynamical processes that satisfy the relations on the
fields χ, G, S, ψ provided by the response functions. Of course, these relations
may include inequalities as well as equations.
Another limitation on dynamical processes is provided by the second law of
thermodynamics which, in the present context of isothermal processes, is the dissipation inequality:
ψ̇(X, t) ≤ S(X, t) · ∇ χ̇(X, t),
(6.1)
asserting that the rate of change of the density of the Helmholtz free energy does not
exceed the stress power. We denote by D the collection of all dynamical processes
χ, G, S, ψ that satisfy the dissipation inequality. The dissipation inequality is
imposed by means of the requirement
C ⊂ D.
(6.2)
In other words, every dynamical process for the given material must obey (6.1).
The dissipation inequality may be used to impose restrictions on the response
functions that specify a constitutive class C, as first described in the context of the
Clausius–Duhem inequality by Coleman and Noll [20] and now widely followed in
continuum thermodynamics. According to this procedure, one seeks necessary and
sufficient conditions on the response functions that specify C in order that C ⊂ D.
We indicate in the next section that, when the free energy and stresses depend only
upon F = ∇χ and G, the restrictions obtained from the procedure of Coleman and
Noll include the vanishing of the internal dissipation S(X, t) · ∇ χ̇ (X, t) − ψ̇(X, t)
on dynamical processes in C. We shall maintain the premise that it is useful to identify and study constitutive classes that admit internal dissipation on a non-trivial
class of dynamical processes. Consequently, instead of following the procedure of
Coleman and Noll, we are led in the next section to impose sufficient conditions
on the constitutive class in order that it be included in D. To do so, we specify a
particular constitutive class Ed and show directly the inclusion Ed ⊂ D. Although
all of the fields in our description of a dynamical process are smooth, the present
approach echoes the standard use of the second law of thermodynamics to limit the
class of non-smooth processes that can occur in the presence of a shock wave (see,
for example, [21]). In our context, the non-smoothness occurs at a submacroscopic
level and is made explicit only through the piecewise-smooth motions χn that arise
in the Approximation Theorem.
In spite of the present choice not to pursue the procedure of Coleman and Noll,
the constitutive class C obtained via that procedure merits detailed study, because
it admits the possibility that internal dissipation arises via small jumps between
points on a constitutive manifold determined by the consistency relation. (See [7, 8]
for elementary examples.)
7. A Constitutive Class for Elastic Bodies Undergoing Disarrangements
The constitutive data that we employ initially for the specification of an elastic
body undergoing disarrangements are the smooth response functions (F, G) →
Ψ(F, G), (F, G) → S\ (F, G), and (F, G) → Sd (F, G) for the free energy, stress
without disarrangements, and stress due to disarrangements, all defined on pairs of
invertible tensors (F, G) satisfying the inequalities
0 < det G ≤ det F.
(7.1)
An equivalent description of these response functions entails the specification of
the mappings (M, G) → Ψ̃(M, G) := Ψ(M + G, G), (M, G) → S̃\ (M, G) :=
S\ (M + G, G), and (M, G) → S̃d (M, G) := Sd (M + G, G) defined on pairs of
tensors (M, G) satisfying
0 < det G ≤ det(M + G).
(7.2)
For future reference, we record here the relations
\[ D_M\tilde\Psi(M,G) = D_F\Psi(M+G,G), \tag{7.3} \]
\[ D_G\tilde\Psi(M,G) = D_F\Psi(M+G,G) + D_G\Psi(M+G,G). \tag{7.4} \]
We allow the free energy response function also to depend upon the material point
X at which the free energy is to be computed, but we delay until Section 9 making
explicit this dependence on X in the symbols Ψ(F, G) and Ψ̃(M, G).
The functions Ψ̃, S̃\ , S̃d now permit us to define the class C of dynamical
processes satisfying the constitutive relations
\[ \psi(X,t) = \tilde\Psi(M(X,t),G(X,t)), \tag{7.5} \]
\[ S_\backslash(X,t) = \tilde S_\backslash(M(X,t),G(X,t)), \tag{7.6} \]
and
\[ S_d(X,t) = \tilde S_d(M(X,t),G(X,t)) \tag{7.7} \]
for all X, t. We now indicate how the requirement (6.2) imposed via the procedure
of Coleman and Noll leads to a constitutive class in which no internal dissipation
occurs. (To simplify the relations below, we omit the argument (X, t) throughout.)
We multiply both sides of the dissipation inequality (6.1) by det K, use the constitutive relations (7.5)–(7.7) and the formula (4.5) for the stress power in the virgin
configuration, and we conclude that the internal dissipation
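\[ \begin{aligned} (\det K)(S\cdot\nabla\dot\chi - \dot\psi) ={}& \bigl(\tilde S_\backslash(M,G) + \tilde S_d(M,G) - (\det K)\,D_M\tilde\Psi(M,G)\bigr)\cdot\dot M \\ &+ \bigl(\tilde S_\backslash(M,G) + \tilde S_d(M,G) - (\det K)\,D_G\tilde\Psi(M,G)\bigr)\cdot\dot G \end{aligned} \tag{7.8} \]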
is not negative on each dynamical process in C. In spite of the restrictions that
the consistency relation (3.6) together with the constitutive relations (7.6), (7.7)
place on Ṁ and Ġ, we may reverse any dynamical process in C with respect to its
time-evolution and obtain another dynamical process in C. Consequently, Ṁ and Ġ
may be replaced by −Ṁ and −Ġ in (7.8), leaving all other quantities unchanged.
Therefore, the internal dissipation as given in (7.8) must vanish for every dynamical
process in C, and the dissipation inequality (6.1) must be satisfied as an equality.
In order to obtain a theory that admits internal dissipation, we consider now a
collection of dynamical processes different from C. The constitutive class that we
now specify is suggested by comparing the formula (4.5) for the stress power in
the virgin configuration with the formula for (det K)ψ̇ obtained by differentiating
both sides of (7.5) with respect to t:
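\[ (\det K)\,S\cdot\nabla\dot\chi = S_d\cdot\dot M + S_\backslash\cdot\dot G + S_d\cdot\dot G + S_\backslash\cdot\dot M, \tag{7.9} \]
\[ (\det K)\,\dot\psi = (\det K)\,D_M\tilde\Psi(M,G)\cdot\dot M + (\det K)\,D_G\tilde\Psi(M,G)\cdot\dot G. \tag{7.10} \]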
Our goal of specifying a material that can both store energy and dissipate energy
in smooth processes can be achieved first by choosing some terms on the right-hand side of (7.9) to be set equal to the entire right-hand side of (7.10), thereby
specifying the amount of work done on each time interval that will be stored by
the body. In order to satisfy the dissipation inequality (6.1), the remaining terms
on the right-hand side of (7.9) must be assumed to be non-negative. Accordingly,
given a response function Ψ̃ with domain {(M, G) | 0 < det G ≤ det(G + M)},
we consider the collection Ed of dynamical processes χ, G, S, ψ satisfying the
constitutive relations
\[ \psi(X,t) = \tilde\Psi(M(X,t),G(X,t)), \tag{7.11} \]
\[ S_d(X,t) = (\det K(X,t))\,D_M\tilde\Psi(M(X,t),G(X,t)), \tag{7.12} \]
\[ S_\backslash(X,t) = (\det K(X,t))\,D_G\tilde\Psi(M(X,t),G(X,t)), \tag{7.13} \]
and the mixed power inequality
\[ 0 \le S_\backslash(X,t)\cdot\dot M(X,t) + S_d(X,t)\cdot\dot G(X,t) \tag{7.14} \]
for all X, t.
In making these choices we appeal to the idea that forces separated from a site
of geometrical changes are unlikely to be able to maintain a metastable geometrical
configuration at that site and, therefore, should be capable of contributing to dissipation. Thus, we take into account the separation of the points of applications of the
contact forces due to disarrangements (produced by Sd ) from the sites where the
geometrical changes without disarrangements (that contribute to Ġ) occur. Analogous considerations can be made for the other term S\ · Ṁ in the mixed power. On
the contrary, for the “pure” term Sd · Ṁ the proximity of the points of application
of the contact forces due to disarrangements to the sites where changes in the disarrangements occur enables the maintainance of metastability, and so also for the
other “pure” term S\ · Ġ. The constitutive assumptions (7.12) and (7.13) embody
the non-dissipative character of the terms Sd · Ṁ and S\ · Ġ in the stress power.
An important conclusion that can be drawn from the definition of the constitutive class Ed is that the dissipation inequality is satisfied for every dynamical
process in Ed , i.e., Ed ⊂ D. Indeed, the constitutive relations (7.11)–(7.13), the
mixed power inequality (7.14), and relations (3.4) and (3.5) tell us that
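\[ \begin{aligned} (\det K)\,\dot\psi &= (\det K)\,D_M\tilde\Psi\cdot\dot M + (\det K)\,D_G\tilde\Psi\cdot\dot G \\ &\le (\det K)\,D_M\tilde\Psi\cdot\dot M + (\det K)\,D_G\tilde\Psi\cdot\dot G + (\det K)\,D_M\tilde\Psi\cdot\dot G + (\det K)\,D_G\tilde\Psi\cdot\dot M \\ &= (\det K)\,S\cdot\dot F, \end{aligned} \]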
which is equivalent to the dissipation inequality (6.1).
It is also significant that the consistency relation (3.6), through the constitutive
relations (7.12) and (7.13), imposes a restriction on dynamical processes in Ed : for
every dynamical process χ, G, S, ψ in Ed there holds
\[ D_G\tilde\Psi(M(X,t),G(X,t))\,K(X,t)^{T} = D_G\tilde\Psi(M(X,t),G(X,t)) + D_M\tilde\Psi(M(X,t),G(X,t)) \tag{7.15} \]
for all (X, t). Consequently, the pairs (M(X, t), G(X, t)) available through dynamical processes in Ed lie in a submanifold of Lin V × Lin V. In particular, for each
(X, t), the pairs (Ṁ(X, t), Ġ(X, t)) of time-derivatives available through dynamical processes in Ed lie in the tangent space of the submanifold at (M(X, t), G(X, t))
and, hence, cannot be arbitrary elements of Lin V × Lin V. Similarly, the mixed
power inequality (7.14) imposes a restriction on the quantities M, G, Ṁ and Ġ,
or, equivalently, on F , G, Ḟ and Ġ, that can arise for dynamical processes in the
constitutive class Ed , and we shall discuss some of these restrictions in Section 8.
Finally, for every classical dynamical process χ, ∇χ, S, ψ in Ed , the consistency
relation (7.15), and the fact that K = I when M = F − G = 0, yield for all X, t:
\[ D_M\tilde\Psi(0,\nabla\chi(X,t)) = 0, \tag{7.16} \]
and, equivalently, by (7.3),
\[ D_F\Psi(\nabla\chi(X,t),\nabla\chi(X,t)) = 0, \tag{7.17} \]
a restriction on the classical dynamical processes for the given elastic body.
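The passage from the consistency relation (7.15) to (7.16) and (7.17) can be written out as a short sketch. It uses only M = 0 and K = I for a classical process, together with the chain-rule relation (7.3), with Ψ̃ and Ψ denoting the free energy response functions in the variables (M, G) and (F, G); arguments (X, t) are suppressed.

```latex
\begin{align*}
D_G\tilde\Psi(0,\nabla\chi)\,I^{T}
  &= D_G\tilde\Psi(0,\nabla\chi) + D_M\tilde\Psi(0,\nabla\chi)
  \;\Longrightarrow\; D_M\tilde\Psi(0,\nabla\chi) = 0, \\
D_F\Psi(\nabla\chi,\nabla\chi)
  &= D_F\Psi(0+\nabla\chi,\nabla\chi)
   = D_M\tilde\Psi(0,\nabla\chi) = 0 .
\end{align*}
```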
Our theory thus implies that a given choice of free-energy response function Ψ̃
restricts the dynamical processes available to a body through relations (7.11)–
(7.15), and that choice also restricts the classical dynamical processes through
relation (7.16). In contrast, our theory restricts the choice of Ψ̃ itself only through
the condition of frame-indifference (9.1).
8. Internal Dissipation
The internal dissipation in the reference configuration for a dynamical process χ,
G, S, ψ in Ed is defined to be the excess of the stress-power over the rate of change
of free energy: S · ∇ χ̇ − ψ̇ = S · Ḟ − ψ̇. Because the dissipation inequality (6.1) is
satisfied for every dynamical process in Ed , the internal dissipation is non-negative,
and we consider from now on
ϒ := (det K)(S · Ḟ − ψ̇) ≥ 0,
(8.1)
the internal dissipation in the virgin configuration. It follows immediately from
(4.5), (7.12), and (7.13) that the internal dissipation in the virgin configuration
equals the mixed stress power:
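\[ \begin{aligned} \Upsilon &= S_\backslash\cdot\dot M + S_d\cdot\dot G \\ &= (\det K)\bigl(D_G\tilde\Psi\cdot\dot M + D_M\tilde\Psi\cdot\dot G\bigr) \\ &= (\det K)\bigl[(D_F\Psi + D_G\Psi)\cdot\dot F - D_G\Psi\cdot\dot G\bigr] \ge 0 \end{aligned} \tag{8.2} \]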
for each dynamical process χ, G, S, ψ in the constitutive class Ed . Our aim in this
section is to relate the internal dissipation to familiar quantities in the literature by
investigating the relative contributions of the two terms S\ · Ṁ and Sd · Ġ in (8.2).
An equivalent rewriting of (8.1) yields the relations
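\[ \begin{aligned} S\cdot\dot F &= (\det K)^{-1}\bigl(S_\backslash\cdot\dot G + S_d\cdot\dot M + S_\backslash\cdot\dot M + S_d\cdot\dot G\bigr) \\ &= D_G\tilde\Psi\cdot\dot G + D_M\tilde\Psi\cdot\dot M + (\det K)^{-1}\Upsilon \\ &= \dot\psi + (\det K)^{-1}\Upsilon, \end{aligned} \tag{8.3} \]
a decomposition of the stress-power in the reference configuration into a non-dissipative part ψ̇ and a dissipative part (det K)−1 ϒ ≥ 0. Thus, by (8.2), the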
dissipative part (det K)−1 ϒ of the stress-power equals the mixed stress-power
in the reference configuration. Moreover, for classical dynamical processes χ,
∇χ, S, ψ in the constitutive class Ed , the internal dissipation vanishes, because
Ṁ = Sd = 0.
For a given stress S and for given deformation rates Ṁ and Ġ, the relative
magnitudes of the terms S\ · Ṁ and Sd · Ġ can be altered by adjusting K = F −1 G,
because of the formulas S\ = SK ∗ and Sd = (det K)S − SK ∗ . In particular, for K
close to the identity I , we have S\ = S + O(K − I ) and Sd = O(K − I ), and we
expect that the term S\ · Ṁ dominates the term Sd · Ġ in the expression (8.2) for ϒ
as K tends to the identity I . (The symbol O(K −I ) denotes a tensor whose norm is
bounded above by a constant times the norm of K − I . ) In order to understand this
idea in more depth, it is enlightening to express the internal dissipation ϒ in terms
of the Cauchy stress T , the macroscopic deformation F and its time-derivative Ḟ ,
and the deformation without disarrangements G and its derivative Ġ. In doing so,
we employ (8.2) along with the formulas S\ = SK ∗ and Sd = (det K)S − SK ∗ ,
and we find it convenient to suppress the arguments X, t, and χ(X, t) for the sake
of simplicity of notation. We record the result here, omitting its routine derivation:
(det F )−1 ϒ = T H ∗ · (Ḟ F −1 − ĠG−1 ) + T H ∗ · ĠG−1 (H − I )2 ,
(8.4)
where H := GF −1 is the referential version of the tensor field H̃ appearing in
Section 2.2 and, as usual, H ∗ = (det H )H −T . We note that the expression T H ∗ ·
ĠG−1 (H − I )2 on the right-hand side of (8.4) is quadratic in H − I , while the
first term T H ∗ · (Ḟ F −1 − ĠG−1 ) equals T · (Ḟ F −1 − ĠG−1 ) plus a term linear in
H − I . In other words, we conclude from (8.4) that
(det F )−1 ϒ = T H ∗ · (Ḟ F −1 − ĠG−1 ) + O((H − I )2 )
= T · (Ḟ F −1 − ĠG−1 ) + O(H − I ).
(8.5)
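One possible route to (8.4) is sketched here. It assumes the standard relation S F^T = (det F) T between the Piola–Kirchhoff and Cauchy stresses and uses the identities K^{-T} = F^T G^{-T}, H^* H^T = (det H) I, and A B^T · C = A · C B for tensors A, B, C.

```latex
\begin{align*}
\det K &= \det H, \qquad
S_\backslash = (\det K)\,S K^{-T} = (\det F)\,T H^{*} F^{-T}, \qquad
(\det K)\,S = (\det F)\,T H^{*} H^{T} F^{-T}, \\
(\det F)^{-1}\Upsilon
  &= (\det F)^{-1}\bigl[S_\backslash\cdot(\dot F - \dot G)
     + \bigl((\det K)S - S_\backslash\bigr)\cdot\dot G\bigr] \\
  &= T H^{*}\cdot\dot F F^{-1} - 2\,T H^{*}\cdot\dot G F^{-1}
     + T H^{*} H^{T}\cdot\dot G F^{-1} \\
  &= T H^{*}\cdot\dot F F^{-1} - 2\,T H^{*}\cdot\dot G G^{-1} H
     + T H^{*}\cdot\dot G G^{-1} H^{2} \\
  &= T H^{*}\cdot(\dot F F^{-1} - \dot G G^{-1})
     + T H^{*}\cdot\dot G G^{-1}(H - I)^{2} .
\end{align*}
```

The third line uses Ġ F^{-1} = Ġ G^{-1} H. Dropping the term quadratic in H − I then gives the first line of (8.5).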
In order to relate the last formula for the internal dissipation to more familiar
quantities, we note that the fields LG := ĠG−1 and LM := Ḟ F −1 − ĠG−1
appeared in the study [13] of multiple slip in single crystals as the relative rate of
deformation without disarrangements and the relative rate of deformation due to
disarrangements, respectively. (In [13], the term “slip” replaced “disarrangement”
because of the particular context of that study.) Moreover, the factorization (2.7)
implies that the tensor field T\ := T H ∗ in (8.4) and (8.5) is analogous to S\ = SK ∗
and may be called the stress in the current configuration without disarrangements,
a configuration macroscopically identical to the current configuration but containing none of the disarrangements associated with χ and G. Accordingly, T\
represents a stress without disarrangements. (In view of (3.3), the tensor field Td :=
(det H )T −T\ is the analogue of Sd and represents a stress due to disarrangements.)
Therefore, (8.5) may now be recast in the form
(det F )−1 ϒ = T\ · LM + O((H − I )2 )
= T · LM + O(H − I ).
(8.6)
The tensor H − I measures the disarrangements from the current configuration
without disarrangements to the current configuration, and the decomposition (8.6)
tells us that the quantities T ·LM and T\ ·LM provide approximations to the internal
dissipation to within, respectively, linear and quadratic terms in the disarrangements from the current configuration without disarrangements. This result places
in perspective with respect to the present theory the frequent identification of the
internal dissipation as an expression of the form T · LM (sometimes called “plastic
power” in phenomenological theories of plasticity).
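Because (8.4) is an exact algebraic identity, it can be spot-checked numerically. The sketch below is illustrative only: the helper names and sample tensors are assumptions made here, the Cauchy stress is taken to be T = (det F)^{-1} S F^T (the standard relation, not restated in this section), and plain Python lists keep the example dependency-free.

```python
# Sketch: numerical spot-check of identity (8.4).
# Helper names and sample data are illustrative assumptions.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    # Inverse via the cofactor formula: adjugate divided by determinant.
    d = det(A)
    cof = [[A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def lin(a, A, b, B):
    """Return the linear combination a*A + b*B."""
    return [[a * A[i][j] + b * B[i][j] for j in range(3)] for i in range(3)]

def inner(A, B):
    """Tensor inner product A . B = tr(A^T B)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def star(A):
    """A* = (det A) A^{-T}."""
    d = det(A)
    return [[d * x for x in row] for row in transpose(inv(A))]

F = [[1.2, 0.1, 0.0], [0.0, 1.1, 0.2], [0.1, 0.0, 1.3]]
G = [[1.0, 0.05, 0.0], [0.0, 1.0, 0.1], [0.0, 0.0, 1.1]]
S = [[2.0, 0.3, 0.1], [0.4, 1.5, 0.2], [0.0, 0.2, 1.0]]
F_dot = [[0.1, 0.0, 0.2], [0.0, -0.1, 0.0], [0.3, 0.0, 0.1]]
G_dot = [[0.05, 0.1, 0.0], [0.0, 0.02, 0.0], [0.0, 0.1, -0.03]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Left-hand side: mixed power (8.2) divided by det F.
K = matmul(inv(F), G)
S_no = matmul(S, star(K))                 # S\ = S K*
S_d = lin(det(K), S, -1.0, S_no)          # Sd = (det K) S - S K*
M_dot = lin(1.0, F_dot, -1.0, G_dot)
lhs = (inner(S_no, M_dot) + inner(S_d, G_dot)) / det(F)

# Right-hand side of (8.4).
T = [[x / det(F) for x in row] for row in matmul(S, transpose(F))]
H = matmul(G, inv(F))
TH = matmul(T, star(H))                   # T H*
L_F = matmul(F_dot, inv(F))
L_G = matmul(G_dot, inv(G))
HmI = lin(1.0, H, -1.0, I3)
rhs = (inner(TH, lin(1.0, L_F, -1.0, L_G))
       + inner(TH, matmul(L_G, matmul(HmI, HmI))))
print(lhs, rhs)
```

The two printed numbers should agree up to rounding for any invertible F and G and any S, since (8.4) does not rely on the constitutive relations.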
9. Material Frame-Indifference
We consider here the transformation properties of the kinematical quantities associated with dynamical processes under changes of observer. These transformation properties can be obtained by replacing the motion χ, and the approximating motions χn from the Approximation Theorem, by (X, t) → r(χ(X, t), t)
and (X, t) → r(χn (X, t), t), where r denotes a rigid motion (X, t) → x0 (t) +
Q(t)(X − X0 ) with Q(t) a proper orthogonal tensor. From this observation and the
fact that G = limn→∞ ∇χn , we obtain the transformation rules
F → QF,
G → QG,
M → QM,
K → K,
Ḟ → QḞ + Q̇F,
Ġ → QĠ + Q̇G,
Ṁ → QṀ + Q̇M.
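Two of these rules can be checked in one line each; the following sketch uses only K = F^{-1}G, the orthogonality Q^T Q = I, and the product rule.

```latex
\begin{align*}
K = F^{-1}G &\;\longmapsto\; (QF)^{-1}(QG) = F^{-1}Q^{T}Q\,G = F^{-1}G = K, \\
\dot F &\;\longmapsto\; (QF)^{\cdot} = Q\dot F + \dot Q F .
\end{align*}
```

The rules for Ġ and Ṁ follow in the same way from G → QG and M = F − G → QM.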
In the present context of an elastic body undergoing disarrangements, we say that
the response function Ψ̃ is frame-indifferent if, for all proper orthogonal tensors Q
and pairs (M, G) with 0 < det G ≤ det(M + G), there holds
\[ \tilde\Psi(QM,QG) = \tilde\Psi(M,G), \tag{9.1} \]
or, equivalently,
\[ \Psi(QF,QG) = \Psi(F,G) \tag{9.2} \]
for all proper orthogonal tensors Q and pairs (F, G) with 0 < det G ≤ det F .
A useful characterization of this condition follows from the polar decompositions F = R_F U_F and G = R_G U_G . Indeed, we may put Q := R_G^T in (9.1) or (9.2),
or Q := R_F^T in (9.1), and use the relations R_F^T = U_F^{-1} F^T , R_G^T = U_G^{-1} G^T to obtain
for all pairs (M, G), with 0 < det G ≤ det(M + G), the representations
\[ \tilde\Psi(M,G) = \tilde\Psi(U_G^{-1}G^{T}M,\,U_G) = \overset{\leftrightarrow}{\tilde\Psi}(G^{T}M,\,C_G), \tag{9.3} \]
\[ \Psi(F,G) = \Psi(U_G^{-1}G^{T}F,\,U_G) = \overset{\leftrightarrow}{\Psi}(G^{T}F,\,C_G), \tag{9.4} \]
\[ \Psi(F,G) = \Psi(U_F,\,U_F^{-1}F^{T}G) = \hat\Psi(C_F,\,F^{T}G), \tag{9.5} \]
where C_F := F^T F and C_G := G^T G are the right Cauchy–Green tensors for F
and G, respectively, and the functions on the far right are the reduced response functions defined by these relations. Each one of these representations is both a necessary and a
sufficient condition for the frame-indifference of the response functions Ψ̃ and Ψ
in the context of elastic bodies undergoing disarrangements.
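The sufficiency half of this statement is easy to illustrate numerically: any function of (M, G) that depends on its arguments only through G^T M and C_G = G^T G is unchanged when (M, G) is replaced by (QM, QG). The sketch below uses a made-up energy `psi_reduced` and a made-up counterexample `psi_not_objective`; neither is taken from the paper, and the rotation and sample tensors are arbitrary choices.

```python
# Sketch: frame-indifference (9.1) for an energy of the reduced form (9.3).
# Both energies below are hypothetical illustrations.
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def psi_reduced(M, G, c=0.5):
    """Depends on (M, G) only through G^T M and C_G = G^T G."""
    Gt = transpose(G)
    return 0.5 * trace(matmul(Gt, G)) + c * trace(matmul(Gt, M))

def psi_not_objective(M, G):
    """Counterexample: tr(M) is not invariant under M -> QM."""
    return trace(M)

theta = 0.7  # rotation angle about the third axis
Q = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
M = [[0.1, 0.2, 0.0], [0.0, 0.05, 0.0], [0.0, 0.0, 0.3]]
G = [[1.0, 0.05, 0.0], [0.0, 1.0, 0.1], [0.0, 0.0, 1.1]]

QM, QG = matmul(Q, M), matmul(Q, G)
gap_reduced = abs(psi_reduced(QM, QG) - psi_reduced(M, G))
gap_counter = abs(psi_not_objective(QM, QG) - psi_not_objective(M, G))
print(gap_reduced, gap_counter)
```

The first gap vanishes up to rounding because (QG)^T(QM) = G^T M and (QG)^T(QG) = G^T G, while the second does not, which is why tr(M) alone could not serve as a frame-indifferent free energy.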
A second characterization of the frame-indifference of the response function
Ψ̃ follows by imposing (9.1) on smooth, time-parameterized families t → Q(t)
and t → (M(t), G(t)) and differentiating both sides of (9.1) with respect to t to
conclude that
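\[ \begin{aligned} & D_M\tilde\Psi(QM,QG)\cdot[\dot Q M + Q\dot M] + D_G\tilde\Psi(QM,QG)\cdot[\dot Q G + Q\dot G] \\ &\qquad = D_M\tilde\Psi(M,G)\cdot\dot M + D_G\tilde\Psi(M,G)\cdot\dot G. \end{aligned} \tag{9.6} \]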
Because the restriction (9.1) applies throughout the domain of Ψ̃, we may vary
Q̇, Ġ, and Ṁ independently (subject to the constraints sym(Q̇Q^T ) = 0 and 0 <
det G ≤ det(M + G)) to conclude from the smoothness of Ψ̃ that
\[ D_M\tilde\Psi(QM,QG) = Q\,D_M\tilde\Psi(M,G), \tag{9.7} \]
\[ D_G\tilde\Psi(QM,QG) = Q\,D_G\tilde\Psi(M,G), \tag{9.8} \]
and
\[ \operatorname{sk}\bigl(D_M\tilde\Psi(M,G)M^{T} + D_G\tilde\Psi(M,G)G^{T}\bigr) = 0 \tag{9.9} \]
for all proper orthogonal tensors Q and pairs (M, G) with 0 < det G ≤ det(M +
G). It is easy to verify that relations (9.7)–(9.9) imply that the response function Ψ̃
is frame-indifferent.
It is crucial to distinguish between, on the one hand, the smooth time-parameterized families t → (M(t), G(t)) used in establishing (9.6) and, on the other
hand, the families
t → (M(X, t), G(X, t)) = (∇χ(X, t) − G(X, t), G(X, t))
arising from dynamical processes in the constitutive class Ed . In particular, the time
derivatives of former pairs can be varied arbitrarily (when det G(t) < det(M(t) +
G(t))), while those of the latter pairs cannot, as we observed near the end of
Section 7.
We say that the mixed power S\ · Ṁ + Sd · Ġ = (det K)DG Ψ̃(M, G) · Ṁ +
(det K)DM Ψ̃(M, G) · Ġ is frame-indifferent if, for all smooth, time-parameterized
families t → Q(t) and for all families t → (M(X, t), G(X, t)) arising from
dynamical processes in Ed , there holds
\[ \begin{aligned} & (\det K)\,D_G\tilde\Psi(M,G)\cdot\dot M + (\det K)\,D_M\tilde\Psi(M,G)\cdot\dot G \\ &\qquad = (\det K)\,D_G\tilde\Psi(QM,QG)\cdot(QM)^{\cdot} + (\det K)\,D_M\tilde\Psi(QM,QG)\cdot(QG)^{\cdot}. \end{aligned} \tag{9.10} \]
This condition amounts to the assertion that the mixed power is invariant under superpositions of rigid motions on dynamical processes. We now show that, given the
frame-indifference of the response function Ψ̃, the mixed power is frame-indifferent
if and only if
\[ \operatorname{sk}\bigl(D_G\tilde\Psi(M,G)M^{T} + D_M\tilde\Psi(M,G)G^{T}\bigr) = 0 \tag{9.11} \]
for all pairs (M, G) arising from dynamical processes in Ed . In fact, expansion of
the derivatives on the right-hand side of (9.10) tells us that (9.10) is equivalent to
the relation
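\[ \begin{aligned} & \bigl(D_G\tilde\Psi(M,G) - Q^{T}D_G\tilde\Psi(QM,QG)\bigr)\cdot\dot M + \bigl(D_M\tilde\Psi(M,G) - Q^{T}D_M\tilde\Psi(QM,QG)\bigr)\cdot\dot G \\ &\qquad = \bigl(D_G\tilde\Psi(QM,QG)M^{T} + D_M\tilde\Psi(QM,QG)G^{T}\bigr)\cdot\dot Q. \end{aligned} \]
Given the frame-indifference of Ψ̃, we conclude from (9.7) and (9.8) that the
previous relation is equivalent to
\[ 0 = \bigl(D_G\tilde\Psi(M,G)M^{T} + D_M\tilde\Psi(M,G)G^{T}\bigr)\cdot Q^{T}\dot Q, \]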
and the relation sym(QT Q̇) = 0 along with the arbitrariness of t → Q(t) provides
the asserted characterization of frame-indifference of the mixed power.
Our main result on material frame-indifference is a generalization of a result
of Noll [22] in classical elasticity: if both the free energy response function and
the mixed power are frame indifferent, then the balance of angular momentum
(4.3) is satisfied for all dynamical processes in Ed . (Noll actually showed that the
frame-indifference of the free energy response is equivalent to the law of balance
of angular momentum in the classical context.) Indeed, if we add (9.11) to (9.9) we
conclude that
\[ \operatorname{sk}\bigl(D_M\tilde\Psi(M,G)F^{T} + D_G\tilde\Psi(M,G)F^{T}\bigr) = 0 \tag{9.12} \]
for all pairs (M, G) arising from dynamical processes in Ed . For each dynamical
process, the constitutive relations (7.12), (7.13), and the decomposition (3.5) may
be applied to yield (4.3), the law of balance of angular momentum.
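This last step can be spelled out as a brief sketch, with Ψ̃ the free energy response function of (7.11) and the arguments (M, G) suppressed. By (7.12), (7.13), and (3.5), and because det K > 0, the Piola–Kirchhoff stress is the sum of the two partial derivatives, so (9.12) is exactly (4.3).

```latex
\begin{align*}
(\det K)\,S &= S_\backslash + S_d
  = (\det K)\bigl(D_G\tilde\Psi + D_M\tilde\Psi\bigr)
  \;\Longrightarrow\; S = D_M\tilde\Psi + D_G\tilde\Psi, \\
\operatorname{sk}(S F^{T})
  &= \operatorname{sk}\bigl(D_M\tilde\Psi\,F^{T} + D_G\tilde\Psi\,F^{T}\bigr) = 0 .
\end{align*}
```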
This main result permits us to impose the law of balance of angular momentum
indirectly by requiring that both the free energy response function and the mixed
power be frame-indifferent, a requirement that we impose from now on through
the relations (9.1) and (9.11). In these considerations, it is important to remember
that (9.11) is a restriction on dynamical processes, while (9.1) is a restriction on
the free energy response function. Moreover, our main result and the imposition of
(9.1) and (9.11) permit us to omit the law of balance of angular momentum among
the field equations that we provide in the next section.
We note from relations (7.12) and (7.13), that (9.11) implies
sk(S\ M T + Sd GT ) = 0
(9.13)
on all dynamical processes in the constitutive class Ed , and relation (3.8) then
implies also that
sk(Sd M T ) = 0.
(9.14)
We conclude from frame-indifference as realized in (9.1) and (9.11) that:
(i) sk(SF T ) = 0,
(ii) sk(Sd M T ) = 0,
(iii) sk(S\ GT ) = 0,
(iv) sk(S\ M T + Sd GT ) = 0.
Thus, each of the “pure” moment densities sk(Sd M T ) and sk(S\ GT ) is self-equilibrated, while the mixed moment densities sk(S\ M T ) and sk(Sd GT ) add to zero.
Moreover, by (ii) and by the identification relation (5.1), the traction due to disarrangements Sd (X, t)ν(Y ) and the geometrical offset [χn ](Y, t) are collinear on
average.
It is useful to record the forms that relations (9.1) and (9.11) assume when the
response function (M, G) → Ψ̃(M, G) is replaced by (F, G) → Ψ(F, G) =
Ψ̃(F − G, G). Of course, (9.1) is replaced by (9.2), a restriction on the response
function Ψ. In view of (3.4), (7.5), and (7.11), the relation (9.11) is equivalent to
\[ \operatorname{sk}\bigl(D_F\Psi(F,G)F^{T} + D_G\Psi(F,G)(F^{T} - G^{T})\bigr) = 0, \tag{9.15} \]
a restriction on dynamical processes. Henceforth, when using F and G as the
arguments of the free energy, we assume that (9.2) and (9.15) are satisfied, the
first throughout the domain of Ψ and the second on all dynamical processes in Ed .
10. Field Relations
Our analysis in Sections 7–9 has led us to the specification of one response function
(M, G) → Ψ̃(M, G) satisfying relations (9.1) and (9.11), the former throughout
the domain of Ψ̃ and the latter on all dynamical processes in Ed . Given the body
force field bref in the reference configuration, the remaining relations employed in
deriving the field relations are restrictions on dynamical processes: the balance of
linear momentum (4.2), the constitutive relations (7.11)–(7.13), the mixed power
inequality (7.14), and the consistency relation (7.15). As demonstrated in Section 9,
the law of balance of angular momentum is a consequence of the assumptions of
frame-indifference (9.1) and (9.11).
We now are in a position to record and derive from (9.1), (9.11), (4.2), and
(7.11)–(7.15) the field relations for an elastic body undergoing disarrangements:
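\[
\operatorname{div}(D_M\tilde{\Psi} + D_G\tilde{\Psi}) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{10.1}
\]
\[
D_G\tilde{\Psi}\,(K^{-T} - I) + D_M\tilde{\Psi}\,K^{-T} = 0,
\tag{10.2}
\]
\[
\operatorname{sk}(D_G\tilde{\Psi}\,M^{T} + D_M\tilde{\Psi}\,G^{T}) = 0,
\tag{10.3}
\]
\[
D_G\tilde{\Psi}\cdot\dot{M} + D_M\tilde{\Psi}\cdot\dot{G} \ge 0,
\tag{10.4}
\]
\[
\det(G + M) \ge \det G > m > 0,
\tag{10.5}
\]
where, as in (3.4), $\nabla\chi = G + M$, and, as in (2.12), $m$ is a positive number depending upon $t$ alone. (These are relations (1.1)–(1.5), with arguments omitted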
for the sake of conciseness.) The law of balance of linear momentum (10.1), the
consistency relation (10.2), and the frame-indifference of the mixed power (10.3)
amount to 3 + 6 + 3 = 12 scalar equations for the unknowns χ and G, having
a total of 12 scalar components. (That the consistency relation amounts to only 6
scalar equations follows from the fact that both sides of the original consistency
relation (3.6), when multiplied by F T , are symmetric tensors.) The inequalities
(10.4) and (10.5) further restrict the dynamical processes satisfying (10.1), (10.2),
and (10.3). We emphasize that, given the body force field $b_{\mathrm{ref}}$, the field relations are restrictions on dynamical processes, while the relation (9.1) is a restriction on the response function $\tilde{\Psi}$. We must also keep in mind that $\tilde{\Psi}$ and its derivatives depend not only upon $M(X, t)$ and $G(X, t)$ but may also depend upon the material point $X$ itself, and we make this dependence explicit when needed for clarity. For example, the first term on the left-hand side of the equation of balance of linear momentum (10.1) is the field
\[
\begin{aligned}
(X, t) \mapsto \operatorname{div}_X\bigl[&D_M\tilde{\Psi}(\nabla\chi(X, t) - G(X, t), G(X, t), X)\\
&+ D_G\tilde{\Psi}(\nabla\chi(X, t) - G(X, t), G(X, t), X)\bigr].
\end{aligned}
\tag{10.6}
\]
The field relations follow readily from (9.1), (9.11), (4.2), (7.11)–(7.15), and
(2.12). The law of balance of linear momentum is a consequence of its counterpart
(4.2), of the constitutive relations (7.12) and (7.13) for the stresses without and with
disarrangements, and of the decomposition (3.5); the consistency relation (10.2) is
the relation (7.15), rewritten with trivial algebraic changes; (10.3) is (9.11), and the
mixed power inequality (10.4) is (7.14) with S\ and Sd replaced by the expressions
in the formulas (7.12) and (7.13). Moreover, by the decomposition (3.5) and the
constitutive relations (7.12), (7.13), one has the stress relation
\[
S(X, t) = D_M\tilde{\Psi}(M(X, t), G(X, t)) + D_G\tilde{\Psi}(M(X, t), G(X, t))
\tag{10.7}
\]
valid for all dynamical processes in Ed .
When the motion $\chi$ and $G$ determine a classical motion, i.e., $G = \nabla\chi$, then the relations $M = 0$, $K = I$, and $\det K = 1$ tell us that the balance of linear momentum (10.1), the consistency relation (10.2), and the inequality (10.5) become
\[
\operatorname{div} D_G\tilde{\Psi}(0, \nabla\chi) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{10.8}
\]
\[
D_M\tilde{\Psi}(0, \nabla\chi) = 0,
\tag{10.9}
\]
\[
\det\nabla\chi > m > 0.
\tag{10.10}
\]
The remaining relations (10.3) and (10.4) are satisfied identically in view of (10.9).
In some applications, it is easier to use the field relations when they are expressed in terms of the response function $(F, G) \mapsto \Psi(F, G) = \tilde{\Psi}(F - G, G)$. In this case, the response function $\Psi$ is assumed to satisfy (9.2), and the field relations (10.1)–(10.5) become
\[
\operatorname{div}(2D_F\Psi + D_G\Psi) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{10.11}
\]
\[
D_F\Psi\,(2K^{-T} - I) + D_G\Psi\,(K^{-T} - I) = 0,
\tag{10.12}
\]
\[
\operatorname{sk}\bigl((D_F\Psi + D_G\Psi)F^{T} - D_G\Psi\,G^{T}\bigr) = 0,
\tag{10.13}
\]
\[
(D_F\Psi + D_G\Psi)\cdot\dot{F} - D_G\Psi\cdot\dot{G} \ge 0,
\tag{10.14}
\]
\[
\det F \ge \det G > m > 0,
\tag{10.15}
\]
respectively, while the stress relation (10.7) becomes
\[
S(X, t) = 2D_F\Psi(F(X, t), G(X, t)) + D_G\Psi(F(X, t), G(X, t)).
\tag{10.16}
\]
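The passage from (10.1)–(10.4) to (10.11)–(10.14) is a chain-rule substitution. The following worked sketch writes $\Psi(F, G) = \tilde{\Psi}(F - G, G)$ with $M = F - G$ and suppresses all arguments; it is offered as a reader's check rather than as part of the original derivation.

```latex
% Chain rule for \Psi(F,G) = \tilde\Psi(F-G, G), with M = F - G:
\begin{align*}
D_F\Psi &= D_M\tilde\Psi, &
D_G\Psi &= D_G\tilde\Psi - D_M\tilde\Psi,\\
\text{hence}\quad D_M\tilde\Psi &= D_F\Psi, &
D_G\tilde\Psi &= D_F\Psi + D_G\Psi .
\end{align*}
% Substitution into the left-hand sides of (10.1)--(10.4):
\begin{align*}
D_M\tilde\Psi + D_G\tilde\Psi &= 2D_F\Psi + D_G\Psi,\\
D_G\tilde\Psi\,(K^{-T}-I) + D_M\tilde\Psi\,K^{-T}
  &= D_F\Psi\,(2K^{-T}-I) + D_G\Psi\,(K^{-T}-I),\\
D_G\tilde\Psi\,M^{T} + D_M\tilde\Psi\,G^{T}
  &= (D_F\Psi + D_G\Psi)F^{T} - D_G\Psi\,G^{T},\\
D_G\tilde\Psi\cdot\dot M + D_M\tilde\Psi\cdot\dot G
  &= (D_F\Psi + D_G\Psi)\cdot\dot F - D_G\Psi\cdot\dot G .
\end{align*}
```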
Corresponding to the expression (10.6), the first term on the left-hand side of the equation of balance of linear momentum (10.11) is the field
\[
\begin{aligned}
(X, t) \mapsto \operatorname{div}_X\bigl[&2D_F\Psi(\nabla\chi(X, t), G(X, t), X)\\
&+ D_G\Psi(\nabla\chi(X, t), G(X, t), X)\bigr].
\end{aligned}
\tag{10.17}
\]
It is convenient to record for future use the following formulas for the stresses without and with disarrangements in terms of $\Psi$ (omitting $X$ and $t$ for the sake of brevity):
\[
S_{\backslash} = (\det K)\bigl(D_F\Psi(F, G) + D_G\Psi(F, G)\bigr),
\tag{10.18}
\]
\[
S_d = (\det K)\,D_F\Psi(F, G).
\tag{10.19}
\]
In view of the significance of the purely submacroscopic factor $(i, K)$ in (2.6), some applications become more accessible if one employs the response function
\[
(F, K) \mapsto \Phi(F, K) := \Psi(F, FK)
\tag{10.20}
\]
with domain the set of pairs $(F, K)$ satisfying $0 < \det F$ and $0 < \det K \le 1$. The relations
\[
D_F\Psi(F, FK) = D_F\Phi(F, K) - F^{-T}D_K\Phi(F, K)K^{T}
\tag{10.21}
\]
and
\[
D_G\Psi(F, FK) = F^{-T}D_K\Phi(F, K)
\tag{10.22}
\]
permit us to express the field relations in terms of this choice of variables:
\[
\operatorname{div}\bigl[2D_F\Phi + F^{-T}D_K\Phi\,(I - 2K^{T})\bigr] + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{10.23}
\]
\[
D_F\Phi\,(2K^{-T} - I) + F^{-T}D_K\Phi\,K^{T}\bigl(\{K^{-T}\}^{2} - 3K^{-T} + I\bigr) = 0,
\tag{10.24}
\]
\[
\operatorname{sk}\bigl(D_F\Phi\,F^{T} + F^{-T}D_K\Phi\,(I - 2K^{T})F^{T}\bigr) = 0,
\tag{10.25}
\]
\[
\bigl(D_F\Phi + F^{-T}D_K\Phi\,(I - 2K^{T})\bigr)\cdot\dot{F} - D_K\Phi\cdot\dot{K} \ge 0,
\tag{10.26}
\]
\[
1 \ge \det K > 0.
\tag{10.27}
\]
In addition, the stress relation (10.16) becomes
\[
\begin{aligned}
S(X, t) ={}& 2D_F\Phi(F(X, t), K(X, t))\\
&+ F^{-T}(X, t)\,D_K\Phi(F(X, t), K(X, t))\,(I - 2K^{T}(X, t)).
\end{aligned}
\tag{10.28}
\]
It also is convenient to record for future use the following counterparts of (10.18) and (10.19):
\[
S_{\backslash} = (\det K)\bigl(D_F\Phi(F, K) + F^{-T}D_K\Phi(F, K)(I - K^{T})\bigr),
\tag{10.29}
\]
\[
S_d = (\det K)\bigl(D_F\Phi(F, K) - F^{-T}D_K\Phi(F, K)K^{T}\bigr).
\tag{10.30}
\]
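The relations (10.21) and (10.22) follow from the chain rule applied to the change of variables $G = FK$. The sketch below labels the response function of (10.20) by $\Phi$, so that $\Phi(F, K) := \Psi(F, FK)$ (the letter $\Phi$ is a labeling assumed here for readability), and uses the inner product $A \cdot B = \operatorname{tr}(A^{T}B)$.

```latex
% Assumed labeling: \Phi(F,K) := \Psi(F, FK); inner product A \cdot B := \operatorname{tr}(A^{T}B).
% Vary K at fixed F, so that \delta G = F\,\delta K:
\begin{align*}
D_K\Phi\cdot\delta K = D_G\Psi\cdot(F\,\delta K) = (F^{T}D_G\Psi)\cdot\delta K
\quad&\Longrightarrow\quad D_G\Psi = F^{-T}D_K\Phi .
\end{align*}
% Vary F at fixed K, so that \delta G = \delta F\,K:
\begin{align*}
D_F\Phi\cdot\delta F = D_F\Psi\cdot\delta F + D_G\Psi\cdot(\delta F\,K)
 = (D_F\Psi + D_G\Psi\,K^{T})\cdot\delta F
\quad&\Longrightarrow\quad D_F\Psi = D_F\Phi - F^{-T}D_K\Phi\,K^{T}.
\end{align*}
```

Substituting these two expressions into (10.11)–(10.14) and (10.16), together with $\dot{G} = \dot{F}K + F\dot{K}$, reproduces (10.23)–(10.26) and (10.28).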
11. Submacroscopic Stability
Suppose that, for a given tensor $F_0$, there is a tensor $G_0$ (with $0 < \det G_0 \le \det F_0$) that provides a local minimum for $G \mapsto \Psi(F_0, G)$, the Helmholtz free energy at the given macroscopic deformation gradient $F_0$. In this case, we say that the pair $(F_0, G_0)$ is submacroscopically stable. If the pair $(F_0, G_0)$ is submacroscopically stable and, in addition, $\det G_0 < \det F_0$, then $D_G\Psi(F_0, G_0) = 0$. By (7.12), (7.13), (7.3), (7.4), and (3.5), there holds
\[
S_{\backslash}(X, t) = S_d(X, t) = \tfrac{1}{2}(\det K(X, t))\,S(X, t)
\tag{11.1}
\]
for all dynamical processes in $E_d$ and pairs $(X, t)$ satisfying $F(X, t) = F_0$ and $G(X, t) = G_0$. By the first formula in (8.2), we may conclude that dynamical processes in $E_d$ through submacroscopically stable pairs $(F_0, G_0)$ with $\det G_0 < \det F_0$ proceed with internal dissipation $\Upsilon$ given by
\[
\Upsilon(X, t) = \tfrac{1}{2}(\det K(X, t))\,S(X, t)\cdot\dot{F}(X, t),
\tag{11.2}
\]
one half the stress-power in the virgin configuration. The stress-power in such submacroscopically stable processes thus is partitioned equally between energy stored and dissipated.
This result on equipartition of the stress-power does not apply to classical dynamical processes, because the relation $F(X, t) = G(X, t)$ for classical processes means that the strict inequality $\det G_0 < \det F_0$, assumed in the derivation of (11.2), would be violated. We shall provide now information about arbitrary dynamical processes that encounter a submacroscopically stable pair $(F_0, G_0)$ at $(X, t)$. To this end, we observe from (2.12) that, for a submacroscopically stable pair $(F_0, G_0)$, the tensor $G_0$ is a solution of the constrained minimization problem: minimize $G \mapsto \Psi(F_0, G)$ subject to the constraint $\det G \le \det F_0$. The Kuhn–Tucker theorem ([23], p. 314) implies that, corresponding to the given solution $G_0$, there is a number $\lambda \ge 0$ for which
\[
D_G\Psi(F_0, G_0) + \lambda(\det G_0)\,G_0^{-T} = 0.
\tag{11.3}
\]
Again, by (11.3), (7.12), (7.13), (7.3), (7.4), and (3.5), we have
\[
\begin{aligned}
-\lambda(\det G_0)(\det K(X, t))\,G_0^{-T}
&= (\det K(X, t))\,D_G\Psi(F_0, G_0)\\
&= S_{\backslash}(X, t) - S_d(X, t)\\
&= (\det K(X, t))\,S(X, t)\,(2K^{-T}(X, t) - I),
\end{aligned}
\tag{11.4}
\]
for a dynamical process and a pair $(X, t)$ satisfying $F(X, t) = F_0$ and $G(X, t) = G_0$. Because the Cauchy stress and the Piola–Kirchhoff stress are related by $T = (\det F)^{-1}SF^{T}$, relation (11.4) implies the formula
\[
T(X, t)\,(2I - H_0^{T}) = -\lambda I,
\tag{11.5}
\]
with $H_0 := G_0F_0^{-1}$, corresponding to the field $\tilde{H}$ defined below (2.7). For the special case when the submacroscopically stable pair $(F_0, G_0)$ corresponds to a classical deformation, i.e., $F_0 = G_0$, (11.5) reduces to the relation
\[
T(X, t) = -\lambda I.
\tag{11.6}
\]
In other words, a submacroscopically stable pair (F0 , F0 ) can arise in a classical
dynamical process in Ed only if the corresponding Cauchy stress is a hydrostatic
pressure.
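The last equality in (11.4) is stated without intermediate steps. One possible route, sketched here under the assumption that the stress relation (10.16), the consistency relation (10.12), and the formulas (10.18)–(10.19) are used (the text itself cites (7.12), (7.13), (7.3), (7.4), and (3.5)), is the following, with arguments suppressed:

```latex
% From (10.18)-(10.19):  S_\backslash - S_d = (\det K)\,D_G\Psi .
% From (10.12):          D_F\Psi\,(2K^{-T}-I) = -D_G\Psi\,(K^{-T}-I) .
\begin{align*}
S\,(2K^{-T}-I) &= (2D_F\Psi + D_G\Psi)(2K^{-T}-I)\\
 &= -2D_G\Psi\,(K^{-T}-I) + D_G\Psi\,(2K^{-T}-I)\\
 &= D_G\Psi ,
\end{align*}
% so that  S_\backslash - S_d = (\det K)\,S\,(2K^{-T}-I) .
```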
For general submacroscopically stable pairs $(F_0, G_0)$ encountered in dynamical processes, the relation (11.5) may be written in the equivalent form
\[
T_{\backslash} - T_d = -\lambda H_0^{*}
\tag{11.7}
\]
with $T_{\backslash}$ and $T_d$ the stresses without and due to disarrangements with respect to the current configuration defined in Section 8. This result applies even in cases where $\det G_0 < \det F_0$. An equivalent form of (11.7) is obtained from (11.4) and reads
\[
S_{\backslash} - S_d = -\lambda(\det K_0)\,G_0^{*},
\tag{11.8}
\]
where $(F_0, G_0)$ is a submacroscopically stable pair and $K_0 = F_0^{-1}G_0$.
The results in the previous paragraphs show that submacroscopic stability leads
to hydrostatic states of stress for classical dynamical processes in Ed , but not necessarily for non-classical dynamical processes in Ed , and describe the states of
stress encountered at arbitrary submacroscopically stable pairs. In the elastostatics
of crystals, the conclusion that equilibrium leads to hydrostatic states of stress
has been verified in several contexts ([24–27]). We note that the conclusions of
Ericksen [24], of Chipot and Kinderlehrer [25], and of Fonseca and Parry [26]
were based in part on the symmetries of the crystals they considered, whereas
the results of the previous paragraphs are independent of the notion of material
symmetry, a concept that we study in the next section. For isotropic elastic bodies,
Mizel’s results on the energetics of fractured states [28] foreshadow our results on
submacroscopic stability.
12. Material Symmetry
For each point X0 in the region A undergoing a dynamical process, we consider the
transformation properties of the kinematical quantities at that point under a change
of virgin configuration determined by a given unimodular tensor H0 . These transformation properties can be obtained by replacing the time-parameterized family
of structured deformations (χ, G) by the composition
\[
\begin{aligned}
(X, t) \mapsto{}& ((\chi, G)\circ(\xi_{H_0}, H_0))(X, t)\\
={}& (\chi(\xi_{H_0}(X, t), t),\; G(\xi_{H_0}(X, t), t)H_0),
\end{aligned}
\tag{12.1}
\]
where $\xi_{H_0}$ denotes the homogeneous, time-independent deformation $(X, t) \mapsto X_0 + H_0(X - X_0)$. From this observation we obtain the following transformation rules
\[
\begin{aligned}
F &\mapsto FH_0,\\
G &\mapsto GH_0,\\
M &\mapsto MH_0,\\
K &\mapsto H_0^{-1}KH_0
\end{aligned}
\tag{12.2}
\]
under change of virgin configuration. In this display, if a quantity on the left is evaluated at $(X, t)$, the corresponding quantity on the right is evaluated at $(X_0 + H_0(X - X_0), t)$.
We say that $H_0$ is a symmetry transformation at $X_0$ with respect to changes of virgin configuration for the elastic body undergoing disarrangements if the response function $(F, G) \mapsto \Psi(F, G; X_0)$ satisfies
\[
\Psi(FH_0, GH_0, X_0) = \Psi(F, G, X_0)
\tag{12.3}
\]
for all $(F, G)$ with $0 < \det G \le \det F$ or, equivalently, if the response function $(M, G) \mapsto \tilde{\Psi}(M, G, X_0)$ satisfies
\[
\tilde{\Psi}(MH_0, GH_0, X_0) = \tilde{\Psi}(M, G, X_0)
\tag{12.4}
\]
for all $(M, G)$ with $0 < \det G \le \det(M + G)$. As in elasticity without disarrangements, the symmetry transformations at $X_0$ form a group $\mathcal{G}^{\mathrm{virgin}}_{X_0}$.
In the special case when $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ is the proper orthogonal group, it is easy to obtain necessary and sufficient conditions that (12.3) holds for all $H_0 \in \mathcal{G}^{\mathrm{virgin}}_{X_0}$. Indeed, we can choose $H_0$ to be $R_F^{T}$ or $R_G^{T}$, from the polar decompositions $F = V_FR_F$ and $G = V_GR_G$, to obtain
\[
\begin{aligned}
\Psi(F, G, X_0) &= \Psi(V_FR_FR_F^{T}, GR_F^{T}, X_0) = \Psi(V_FR_FR_F^{T}, GF^{T}V_F^{-1}, X_0)\\
&= \Psi(B_F^{1/2}, GF^{T}B_F^{-1/2}, X_0) = \hat{\Psi}(B_F, GF^{T}, X_0),
\end{aligned}
\tag{12.5}
\]
or, alternatively,
\[
\Psi(F, G, X_0) = \check{\Psi}(FG^{T}, B_G, X_0).
\tag{12.6}
\]
Here, $B_F = FF^{T}$ and $B_G = GG^{T}$ are the left Cauchy–Green tensors for $F$ and $G$. The existence of a function $\hat{\Psi}$ such that (12.5) holds for all $(F, G)$ with $0 < \det G \le \det F$ (or, equivalently, of $\check{\Psi}$ such that (12.6) holds for all such pairs) is a necessary and sufficient condition that $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ be the proper orthogonal group. Similarly, the existence of a function $(M, G) \mapsto \Psi^{\#}(M, G, X_0)$ such that
\[
\tilde{\Psi}(M, G, X_0) = \Psi^{\#}(MG^{T}, B_G, X_0)
\tag{12.7}
\]
for all $(M, G)$ with $0 < \det G \le \det(M + G)$ is both necessary and sufficient for $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ to be the proper orthogonal group.
In the case when $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ is the proper unimodular group, we may put $H_0 = (\det F)^{1/3}F^{-1}$ or $H_0 = (\det G)^{1/3}G^{-1}$ to conclude that the Helmholtz free energy can be expressed as a function of the pair $(\det F, H)$ or, equivalently, in terms of $(\det G, H)$ with, as usual, $H = GF^{-1}$. In Section 14, we will consider the special case where the Helmholtz free energy depends on the volume fraction $\det K = \det G/\det F = \det H$ alone.
Alternatively, a notion of material symmetry may be formulated in terms of
invariance of response to changes in reference configuration. For each point X0
in the region A undergoing a dynamical process, we consider the transformation
properties of the kinematical quantities at that point obtained first by factoring
(χ, G) via the notion of composition introduced in (2.5),
\[
(\chi, G) = (\chi, \nabla\chi)\circ(\pi, \nabla\chi^{-1}G).
\tag{12.8}
\]
Here, $\pi(X, t) = X$ for all $X$ and $t$. The factor $(\chi, \nabla\chi)$ on the right-hand side of (12.8) is a family of classical deformations, while $(\pi, \nabla\chi^{-1}G)$ involves only purely submacroscopic deformations, because $\pi$ leaves each point fixed. We next replace the expression on the right-hand side of (12.8) by
\[
((\chi, \nabla\chi)\circ(\xi_{H_0}, H_0))\circ(\pi, \nabla\chi^{-1}G),
\tag{12.9}
\]
where, as above, $\xi_{H_0}$ denotes the homogeneous, time-independent deformation $(X, t) \mapsto X_0 + H_0(X - X_0)$. This replacement leaves the purely submacroscopic factor $(\pi, \nabla\chi^{-1}G)$ unchanged and changes only the classical factor $(\chi, \nabla\chi)$. From this replacement we obtain the following transformation rules
\[
\begin{aligned}
F &\mapsto FH_0,\\
G &\mapsto (FH_0F^{-1})G,\\
M &\mapsto (FH_0F^{-1})M,\\
K &\mapsto K
\end{aligned}
\tag{12.10}
\]
under change of reference configuration. In this display, if a quantity on the left is evaluated at $(X, t)$, the quantity on the right is evaluated at $(X_0 + H_0(X - X_0), t)$.
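The rules for $K$ in (12.2) and (12.10) both follow from $K = F^{-1}G$. As a numerical sanity check, the following sketch uses hypothetical sample tensors `F`, `G`, and `H0` (chosen only for illustration, with $\det H_0 = 1$ and $0 < \det G \le \det F$) and small hand-written 3×3 helpers; it confirms that $K \mapsto H_0^{-1}KH_0$ under a change of virgin configuration and $K \mapsto K$ under a change of reference configuration.

```python
# Minimal 3x3 helpers; matrices are lists of rows (standard library only).
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    d = det(A)

    def cof(i, j):
        # Signed minor obtained by deleting row i and column j.
        r = [x for x in range(3) if x != i]
        c = [x for x in range(3) if x != j]
        minor = A[r[0]][c[0]] * A[r[1]][c[1]] - A[r[0]][c[1]] * A[r[1]][c[0]]
        return (-1) ** (i + j) * minor

    # Inverse = transpose of the cofactor matrix divided by the determinant.
    return [[cof(j, i) / d for j in range(3)] for i in range(3)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

# Hypothetical sample data (not taken from the paper).
F = [[1.2, 0.1, 0.0], [0.0, 1.1, 0.2], [0.1, 0.0, 1.3]]
G = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.1, 1.0]]
H0 = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.25], [0.0, 0.0, 1.0]]  # unimodular

K = matmul(inv(F), G)

# Change of virgin configuration, rules (12.2): F -> F H0, G -> G H0.
K_virgin = matmul(inv(matmul(F, H0)), matmul(G, H0))

# Change of reference configuration, rules (12.10): F -> F H0, G -> (F H0 F^-1) G.
FH0 = matmul(F, H0)
K_ref = matmul(inv(FH0), matmul(matmul(FH0, inv(F)), G))
```

Here `K_virgin` agrees with `matmul(matmul(inv(H0), K), H0)` and `K_ref` agrees with `K`, up to rounding.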
For a pair $(H_0, K_0)$, with $0 < \det K_0 \le 1 = \det H_0$, we say that $H_0$ is a symmetry transformation at $X_0$ for $K_0$ with respect to changes of reference configuration for the elastic body undergoing disarrangements if the response function $F \mapsto \Psi(F, FK_0, X_0)$ satisfies
\[
\Psi(FH_0, FH_0K_0, X_0) = \Psi(F, FK_0, X_0)
\tag{12.11}
\]
for all tensors $F$ with $0 < \det F$. Equivalently, we may use the definition in Section 10, $\Phi(F, K, X_0) := \Psi(F, FK, X_0)$ for all pairs $(F, K)$ with $0 < \det F$ and $0 < \det K \le 1$, to write (12.11) in the simpler form
\[
\Phi(FH_0, K_0, X_0) = \Phi(F, K_0, X_0)
\tag{12.12}
\]
for all tensors $F$ with $0 < \det F$. We denote by $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ the group formed by the symmetry transformations at $X_0$ for $K_0$. The symmetry group $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ defined through (12.11) or (12.12) corresponds to the usual symmetry group of an elastic body undergoing only classical deformations, because the influence of disarrangements is removed by fixing the value of $K = F^{-1}G$ at $K_0$.
We remark that there is a notion of invariance dual to (12.12):
\[
\Phi(F, KP_0, X_0) = \Phi(F, K, X_0)
\tag{12.13}
\]
for all tensors $F$ and $K$ with $0 < \det F$ and $0 < \det K \le 1$, with $P_0$ a given unimodular tensor and $\Phi(F, K, X_0) = \Psi(F, FK, X_0)$. This invariance arises by replacing the right-hand side of the factorization (12.8) by
\[
(\chi, \nabla\chi)\circ((\pi, \nabla\chi^{-1}G)\circ(\pi, P_0)).
\tag{12.14}
\]
The purely submacroscopic factor $(\pi, P_0)$ alters the given one $(\pi, \nabla\chi^{-1}G)$ without changing the classical factor $(\chi, \nabla\chi)$. The resulting symmetry group $\mathcal{G}^{\mathrm{submac}}_{X_0}$ resembles a notion introduced by Šilhavý and Kratochvíl [29], in the context of Noll’s new theory of simple materials [30], and adapted by Bertram [31] to formulate and solve problems in the plasticity of materials undergoing large deformations. Moreover, $\mathcal{G}^{\mathrm{submac}}_{X_0}$ is obtained by means of non-classical changes $(\pi, P_0)$ in configuration, as distinct from the groups $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ and $\mathcal{G}^{\mathrm{virgin}}_{X_0}$, obtained via the classical changes $(\xi_{H_0}, H_0)$.
A case that merits further study consists of the assumption that, during a dynamical process of an elastic body undergoing disarrangements, there holds
\[
K(X, t) \in \mathcal{G}^{\mathrm{submac}}_{X}
\tag{12.15}
\]
for all $(X, t)$. It is easy to show that, with $\Phi(F, K, X) = \Psi(F, FK, X)$,
\[
D_F\Phi(F, K(X, t), X) = D_F\Phi(F, I, X)
\tag{12.16}
\]
and
\[
D_K\Phi(F, K(X, t), X)\,K(X, t)^{T} = D_K\Phi(F, I, X)
\tag{12.17}
\]
for all $F$ with $\det F > 0$ and for all dynamical processes satisfying (12.15). These relations should prove to be useful, because they restrict substantially the manner in which the field $(X, t) \mapsto K(X, t)$ can appear in the field relations (10.23)–(10.26). In other words, the field relations simplify significantly when the disarrangements embodied in $K$ correspond to submacroscopic symmetries of the elastic body.
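A short sketch of why (12.16) and (12.17) hold, assuming that every element $P$ of the group at $X$ satisfies the invariance (12.13) for all admissible pairs $(F, K)$, and writing $\Phi(F, K, X) = \Psi(F, FK, X)$ (the letter $\Phi$ is a labeling used here for the response function of (10.20)):

```latex
% Assumption: \Phi(F, KP, X) = \Phi(F, K, X) for every P in the group at X.
% Differentiate with respect to F, and with respect to K (chain rule in the second slot):
\begin{align*}
D_F\Phi(F, KP, X) &= D_F\Phi(F, K, X), &
D_K\Phi(F, KP, X)\,P^{T} &= D_K\Phi(F, K, X).
\end{align*}
% Evaluate at K = I and P = K(X,t), which is permitted by (12.15):
\begin{align*}
D_F\Phi(F, K(X,t), X) &= D_F\Phi(F, I, X), &
D_K\Phi(F, K(X,t), X)\,K(X,t)^{T} &= D_K\Phi(F, I, X).
\end{align*}
```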
In the next section, we study invariance properties of the response function $\Phi(F, K_0, X_0) = \Psi(F, FK_0, X_0)$ in (12.12) under simultaneous changes in $K_0$ and in $X_0$.
13. Material Uniformity
We now have evidence that the factorization
\[
(\chi, G) = (\chi, \nabla\chi)\circ(\pi, \nabla\chi^{-1}G),
\tag{13.1}
\]
with (X, t) → π(X, t) := X the trivial motion of the body, provides a useful way
to distinguish between, on the one hand, the virgin configuration from which the
purely submacroscopic family (π, ∇χ −1 G) proceeds while introducing all of the
disarrangements associated with the given family (χ, G), and, on the other hand,
the classical reference configuration, a macroscopically time-independent configuration from which the classical motion (χ, ∇χ) proceeds without introducing
further disarrangements. With a view toward capturing the influence of the purely
submacroscopic factor $(\pi, \nabla\chi^{-1}G)$ on the material response, for each $(X, t)$ we put as usual $K(X, t) = \nabla\chi(X, t)^{-1}G(X, t)$ and consider the mapping
\[
F \mapsto \Psi(F, FK(X, t), X) = \Phi(F, K(X, t), X),
\tag{13.2}
\]
the classical free-energy response induced at $(X, t)$ by the purely submacroscopic motion $(\pi, \nabla\chi^{-1}G)$. We may interpret this response as that of an elastic material element to local deformations in which the disarrangements are frozen at their values for the material point $X$ at time $t$. Because $\nabla\chi^{-1}G$ may vary with material point and time, the classical response will vary with the pair $(X, t)$. However, the response function $\Phi$ also depends upon $X$, and it may happen for a given time $t_0$ that the dependence of $\Phi$ on $X$ compensates for the dependence of $K(X, t_0)$ on $X$, i.e., for all $X, Y \in A$ and for every tensor $F$ with $\det F > 0$:
\[
\Phi(F, K(X, t_0); X) = \Phi(F, K(Y, t_0); Y).
\tag{13.3}
\]
The condition (13.3) embodies the idea of a materially uniform elastic body as
described by Truesdell and Noll [32], Noll [33]. (In particular, see relation (27.4) of
[32], in which the dependence of response on material point is compensated by the
choice of local configuration.) Moreover, the mapping X → K(X, t0 ) corresponds
to their concept of a uniform reference for the body at time t0 . It is possible that
one of the uniform references X → K(X, t0 ) = ∇χ(X, t0 )−1 G(X, t0 ) for the
body at time t0 is the gradient of a mapping on A, and this embodies the notion
of a materially uniform, homogeneous elastic body (Truesdell and Noll [32], Noll
[33]). If none of the uniform references at time t0 is a gradient, the induced classical
response is described as that of a materially uniform, inhomogeneous elastic body.
These considerations lead us to define the collection $E_d^{\mathrm{unif}}$ of dynamical processes in $E_d$ that satisfy the material uniformity condition
\[
\Phi(F, K(X, t); X) = \Phi(F, K(Y, t); Y)
\tag{13.4}
\]
for all $X, Y \in A$, for every tensor $F$ with $\det F > 0$, and for every time $t$, where $\Phi(F, K; X) = \Psi(F, FK; X)$. The dynamical processes in $E_d^{\mathrm{unif}}$ have the property that, for every $t$, $K(\cdot, t)$ is a uniform reference for the body. The material uniformity condition then may be viewed as a restriction on the field $(X, t) \mapsto K(X, t) = \nabla\chi(X, t)^{-1}G(X, t)$, to be appended to the field relations for elastic bodies undergoing disarrangements derived in Section 10. Determining or characterizing explicitly the collection $E_d^{\mathrm{unif}}$ would entail characterizing the solutions of the field relations, augmented by the material uniformity condition.
If we impose the requirement that the collection $E_d^{\mathrm{unif}}$ be non-empty, then the material uniformity condition (13.4) places restrictions on the response function. To make this observation more apparent, we consider a given classical dynamical process $(\chi, \nabla\chi, S, \psi)$ in $E_d$, and we ask how the response function would be restricted by requiring that this classical dynamical process be in $E_d^{\mathrm{unif}}$. An immediate answer follows from the fact that, for every classical dynamical process in $E_d$, $K(X, t) = I$ for all $X$, $t$. Therefore, the material uniformity condition applied to the given classical dynamical process in $E_d$ becomes the condition
\[
\Phi(F, I, X) = \Phi(F, I, Y)
\tag{13.5}
\]
for all $X, Y \in A$ and for every tensor $F$ with $\det F > 0$. Evidently, this relation restricts the constitutive function used to define $E_d$.
The material uniformity condition (13.4) may be differentiated with respect to $F$ to obtain the relation
\[
D_F\Phi(F, K(X, t), X) = D_F\Phi(F, K(Y, t), Y)
\]
valid for all $F$, $X$, $Y$, and $t$, where $\Phi(F, K, X) = \Psi(F, FK, X)$. In the relation (10.28) between the stress and free energy, the material uniformity condition does not seem to imply a corresponding uniformity condition on the response functions that determine any of the stresses $S$, $S_{\backslash}$, and $S_d$. Of course, one directly can impose in place of (13.4) – or in addition to it – a material uniformity condition on one or more of the stress responses, for example on the response function from (10.28) for the Piola–Kirchhoff stress $S$:
\[
\begin{aligned}
&2D_F\Phi(F, K(X, t), X) + F^{-T}D_K\Phi(F, K(X, t), X)(I - 2K^{T}(X, t))\\
&\qquad= 2D_F\Phi(F, K(Y, t), Y) + F^{-T}D_K\Phi(F, K(Y, t), Y)(I - 2K^{T}(Y, t))
\end{aligned}
\]
for all $F$, $X$, $Y$, and $t$. Although this choice would provide a different restriction on the response function, it can be interpreted and studied along the lines outlined for (13.4).
14. Energetically Nearsighted Elastic Bodies
The field relations obtained in Section 10, together with special properties of a body such as material uniformity and material symmetry, provide the setting for understanding the scope and range of applicability of elasticity with disarrangements. In this section we take a preliminary step in this direction by considering elastic bodies that are “energetically nearsighted” in the sense that only the purely submacroscopic factor $(\pi, (\nabla\chi)^{-1}G) = (\pi, K)$ in the factorization (13.1) affects the free energy. Thus, submacroscopic slips or formation of voids would permit the body to change its free energy, while the classical deformations embodied in the factor $(\chi, \nabla\chi)$ would not, and we consider now elastic bodies for which the free energy response $\Phi(F, K, X) = \Psi(F, FK, X)$ in (10.20) does not depend upon the macroscopic deformation $F$ and, therefore, satisfies $D_F\Phi(F, K, X) = 0$ for all triples $F$, $K$, $X$ in the domain of $\Phi$. Accordingly, the field relations (10.23)–(10.27) and the stress relation (10.28) take the form
\[
\operatorname{div}\bigl[F^{-T}\Phi'(K)(I - 2K^{T})\bigr] + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{14.1}
\]
\[
\Phi'(K)K^{T}\bigl(\{K^{-T}\}^{2} - 3K^{-T} + I\bigr) = 0,
\tag{14.2}
\]
\[
\operatorname{sk}\bigl(F^{-T}\Phi'(K)(I - 2K^{T})F^{T}\bigr) = 0,
\tag{14.3}
\]
\[
\bigl(F^{-T}\Phi'(K)(I - 2K^{T})\bigr)\cdot\dot{F} - \Phi'(K)\cdot\dot{K} \ge 0,
\tag{14.4}
\]
\[
1 \ge \det K > 0,
\tag{14.5}
\]
\[
S = F^{-T}\Phi'(K)(I - 2K^{T}),
\tag{14.6}
\]
where we have written $\Phi'$ in place of $D_K\Phi$ to simplify notation. In some of the considerations below, it is helpful to use the field $H := GF^{-1} = FKF^{-1}$ associated with the factorization (2.7), and we note the relation
\[
H^{-T} = F^{-T}K^{-T}F^{T} = (F^{T})^{-1}K^{-T}F^{T}.
\tag{14.7}
\]
14.1. UNIVERSAL PHASES AND THE GOLDEN MEAN
The consistency relation in the form (14.2) provides a restriction on the field $K$, and the form of this restriction depends in general upon the free energy response function. We observe, however, that there are solutions $K$ of the consistency relation that do not depend upon the response function, because the expression $\{K^{-T}\}^{2} - 3K^{-T} + I$ occurs multiplicatively in (14.2). Consequently, each tensor $K$ with $0 < \det K \le 1$ for which $K^{-T}$ is a solution of the quadratic tensor equation
\[
X^{2} - 3X + I = 0, \qquad X \in \operatorname{Lin}V,
\tag{14.8}
\]
determines a solution of the consistency relation (14.2). It is easy to see that $K^{-T}$ is a solution of the quadratic equation (14.8) if and only if $K$ itself is a solution. In turn, this is equivalent to the assertion that the tensor $H = FKF^{-1}$ is a solution of (14.8). We call solutions $K$ (with $0 < \det K \le 1$) of the consistency relation universal if they are solutions of (14.8), because they do not depend upon the free energy response function of the nearsighted elastic body. We also refer to the corresponding tensor $H = FKF^{-1}$ as a universal solution of the consistency relation.
A necessary condition for $H$ to be universal is the inclusion of the spectrum of $H$ in the solution set $\{(3 + \sqrt{5})/2, (3 - \sqrt{5})/2\} = \{2 + \gamma_0, 1 - \gamma_0\}$ of the scalar quadratic equation $x^{2} - 3x + 1 = 0$. Here, $\gamma_0 := (\sqrt{5} - 1)/2 \approx 0.618$ is the “golden mean,” the positive number satisfying the relation $1/x = x/(1 - x)$. By elementary linear algebra, $H$ is universal if and only if it is diagonalizable over the reals with diagonal entries given up to permutations by one of the two triples: $(1 - \gamma_0, 1 - \gamma_0, 1 - \gamma_0)$, $(1 - \gamma_0, 1 - \gamma_0, 2 + \gamma_0)$. (The possibilities $(1 - \gamma_0, 2 + \gamma_0, 2 + \gamma_0)$ and $(2 + \gamma_0, 2 + \gamma_0, 2 + \gamma_0)$ are ruled out by the restriction $\det H \in (0, 1]$.)
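The selection of the two admissible eigenvalue triples can be reproduced by direct enumeration. The following sketch (variable names are illustrative only) computes the two roots of $x^{2} - 3x + 1 = 0$, forms every unordered triple of roots, and keeps those whose product, i.e. $\det H$, lies in $(0, 1]$.

```python
import math
from itertools import combinations_with_replacement

# Golden mean and the two roots of x^2 - 3x + 1 = 0.
gamma0 = (math.sqrt(5) - 1) / 2
small = (3 - math.sqrt(5)) / 2   # equals 1 - gamma0
large = (3 + math.sqrt(5)) / 2   # equals 2 + gamma0

# Every unordered triple of eigenvalues drawn from the two roots.
triples = list(combinations_with_replacement((small, large), 3))

# Keep the triples whose determinant (product of eigenvalues) lies in (0, 1].
admissible = []
for triple in triples:
    d = math.prod(triple)
    if 0 < d <= 1:
        admissible.append((triple, d))

for triple, d in admissible:
    print([round(x, 3) for x in triple], "det =", round(d, 3))
```

Because $(1 - \gamma_0)(2 + \gamma_0) = 1$, the two surviving determinants are $(1 - \gamma_0)^{3}$ and $1 - \gamma_0$, the volume fractions of the two universal phases discussed next.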
Of course, the first triple $(1 - \gamma_0, 1 - \gamma_0, 1 - \gamma_0)$ determines the tensor
\[
H_{\mathrm{sph}} := (1 - \gamma_0)I
\tag{14.9}
\]
that, in turn, determines the purely submacroscopic structured deformation $(i, (1 - \gamma_0)I)$ to follow a classical deformation $(\chi(\cdot, t), \nabla\chi(\cdot, t))$. A piecewise smooth approximation $h_n$ for $(i, (1 - \gamma_0)I)$ takes a body in its current configuration without disarrangements, partitioned into congruent cubic cells of side $1/n$, and replaces each cell by one with the same center but now of side $(1 - \gamma_0)/n$. The simultaneous shrinking of each cell creates voids, and the resulting structured deformation has volume fraction $\det H_{\mathrm{sph}} = \det K_{\mathrm{sph}} = (1 - \gamma_0)^{3} \approx 0.056$. This change of submacroscopic geometry determines the (universal) spherical phase of the energetically nearsighted elastic body.
For each choice of basis $(d_{(1)}, d_{(2)}, d_{(3)})$, with corresponding reciprocal basis $(d^{(1)}, d^{(2)}, d^{(3)})$ satisfying $d_{(i)} \cdot d^{(j)} = \delta_i^{\,j}$, the second triple $(1 - \gamma_0, 1 - \gamma_0, 2 + \gamma_0)$ of diagonal entries determines a tensor
\[
\begin{aligned}
H_{\mathrm{long}} &:= (1 - \gamma_0)\,d_{(1)} \otimes d^{(1)} + (1 - \gamma_0)\,d_{(2)} \otimes d^{(2)} + (2 + \gamma_0)\,d_{(3)} \otimes d^{(3)}\\
&= H_{\mathrm{sph}} + (1 + 2\gamma_0)\,d_{(3)} \otimes d^{(3)}
\end{aligned}
\tag{14.10}
\]
that, in turn, determines a purely submacroscopic deformation $(i, H_{\mathrm{long}})$ to follow a classical deformation $(\chi(\cdot, t), \nabla\chi(\cdot, t))$ that we refer to as the (universal) elongated phase of the body. A piecewise smooth approximation $h_n$ for $(i, H_{\mathrm{long}})$ takes each of the basic cells of side $1/n$, with its edges now parallel to $d_{(1)}$, $d_{(2)}$, $d_{(3)}$, respectively, and stretches the “$d_{(3)}$” edge to the length $(2 + \gamma_0)/n \approx 2.618/n$ while shrinking the other two edges to the length $(1 - \gamma_0)/n \approx 0.382/n$. The simultaneous elongation of each cell creates voids, and the resulting structured deformation $(i, H_{\mathrm{long}})$ has volume fraction
\[
\det H_{\mathrm{long}} = \det K_{\mathrm{long}} = (1 - \gamma_0)^{2}(2 + \gamma_0) = 1 - \gamma_0 \approx 0.382.
\tag{14.11}
\]
We have used the fact that $(1 - \gamma_0)$ and $(2 + \gamma_0)$ are reciprocals in the last calculation. Moreover, after elongation, the cells may have to be translated slightly in order to avoid interpenetration of neighboring cells, because the piecewise smooth approximations $h_n$ are required to be injective. While the universal solution $H_{\mathrm{sph}}$ does not vary in space and time, the universal solution $H_{\mathrm{long}}$ may vary through dependence of the dyad $d_{(3)} \otimes d^{(3)}$ on position and time. In addition, the basis $(d_{(1)}, d_{(2)}, d_{(3)})$ need not be orthogonal, so that the approximating deformations $h_n$ map unit cubes into possibly non-rectangular parallelepipeds. For a particular class of energetically nearsighted elastic bodies, the basis $(d_{(1)}, d_{(2)}, d_{(3)})$ must be orthonormal, as we demonstrate below, and we have $d^{(i)} = d_{(i)}$ for $i = 1, 2, 3$.
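To illustrate the construction (14.10) for a basis that is not orthogonal, the following sketch builds $H_{\mathrm{long}}$ from a hypothetical basis (the vectors below are arbitrary sample values, not taken from the paper), computes the reciprocal basis with cross products, and checks numerically that the resulting tensor solves (14.8) and has determinant $1 - \gamma_0$.

```python
import math

gamma0 = (math.sqrt(5) - 1) / 2

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reciprocal_basis(d1, d2, d3):
    # Reciprocal vectors r_j satisfy d_i . r_j = delta_ij.
    vol = dot(d1, cross(d2, d3))
    return ([c / vol for c in cross(d2, d3)],
            [c / vol for c in cross(d3, d1)],
            [c / vol for c in cross(d1, d2)])

def h_long(basis):
    # H = sum_i lambda_i d_(i) (x) d^(i), with (a (x) b)[i][j] = a[i] * b[j].
    recip = reciprocal_basis(*basis)
    eig = (1 - gamma0, 1 - gamma0, 2 + gamma0)
    return [[sum(lam * d[i] * r[j] for lam, d, r in zip(eig, basis, recip))
             for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(A):
    return dot(A[0], cross(A[1], A[2]))

def residual(H):
    # Largest entry of H^2 - 3H + I in absolute value.
    H2 = matmul(H, H)
    return max(abs(H2[i][j] - 3 * H[i][j] + (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))

# Hypothetical non-orthogonal basis, for illustration only.
basis = ([1.0, 0.0, 0.0], [0.3, 1.0, 0.0], [0.2, -0.4, 1.0])
H = h_long(basis)
```

For this sample basis `H` is not symmetric, which is consistent with the remark that orthonormality of the basis is an additional requirement arising only for the particular class of bodies treated in Section 14.2.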
14.2. FIELD RELATIONS FOR A CLASS OF NEARSIGHTED ELASTIC BODIES
We now specialize the discussion above to the case
\[
\Phi(F, K) = \bar{\psi}(\det K) = \bar{\psi}(\det H)
\tag{14.12}
\]
in which only the volume fraction $f := \det K = \det H \in (0, 1]$ produced by the purely submacroscopic deformations $(i, H)$ and $(i, K)$ affects the free energy $\Phi(F, K) = \Psi(F, FK)$. We shall restrict our attention to universal phases of the energetically nearsighted elastic material under consideration, so that the consistency relation need not be considered further, and the formula
\[
\Phi'(K) = f\,\bar{\psi}'(f)\,K^{-T}
\tag{14.13}
\]
along with the fact that $f \in \{1 - \gamma_0, (1 - \gamma_0)^{3}\}$ is a constant for each phase yields after some computations the following forms of the remaining field relations (14.1), (14.3), and (14.4), as well as the stress relation (14.6):
\[
f\,\bar{\psi}'(f)\operatorname{div}\bigl((H^{-1} - 2I)F^{-T}\bigr) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{14.14}
\]
\[
\operatorname{sk}H = 0,
\tag{14.15}
\]
\[
\bar{\psi}'(f)(H^{-1} - 2I)\cdot\dot{F}F^{-1} - \bar{\psi}'(f)\operatorname{tr}(\dot{H}H^{-1}) \ge 0,
\tag{14.16}
\]
\[
S = f\,\bar{\psi}'(f)(H^{-1} - 2I)F^{-T}.
\tag{14.17}
\]
We have used the frame-indifference of the mixed power (14.15) to replace $H^{-T}$ by $H^{-1}$ in the balance law (14.14) and in the mixed power inequality (14.16), and we have assumed that $\bar{\psi}'(f) \ne 0$ for $f \in \{1 - \gamma_0, (1 - \gamma_0)^{3}\}$. The symmetry of $H$ implies that we may write
\[
H_{\mathrm{long}} = H_{\mathrm{sph}} + (1 + 2\gamma_0)\,d \otimes d,
\tag{14.18}
\]
with $d := d_{(3)} = d^{(3)}$. An easy calculation provides the formulas
\[
H_{\mathrm{sph}}^{-1} = (2 + \gamma_0)I, \qquad
H_{\mathrm{long}}^{-1} = (2 + \gamma_0)I - (1 + 2\gamma_0)\,d \otimes d,
\tag{14.19}
\]
and, using the constancy of $H_{\mathrm{sph}}$, we obtain the specific forms for the balance of linear momentum, the mixed power inequality, and the stress relation in the spherical phase
\[
\gamma_0(1 - \gamma_0)^{3}\,\bar{\psi}'((1 - \gamma_0)^{3})\operatorname{div}F^{-T} + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{14.20}
\]
\[
\bar{\psi}'((1 - \gamma_0)^{3})\operatorname{tr}(\dot{F}F^{-1}) \ge 0,
\tag{14.21}
\]
\[
S = \gamma_0(1 - \gamma_0)^{3}\,\bar{\psi}'((1 - \gamma_0)^{3})\,F^{-T}.
\tag{14.22}
\]
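The “some computations” leading to the stress relation (14.17) and its spherical-phase form (14.22) can be sketched as follows, assuming only (14.6), (14.7), (14.13), and (14.19), and writing $\Phi'$ for the derivative of the free energy response with respect to $K$:

```latex
% With \Phi'(K) = f\,\bar\psi'(f)\,K^{-T} from (14.13) and H = FKF^{-1}:
\begin{align*}
S &= F^{-T}\Phi'(K)\,(I - 2K^{T})
   = f\,\bar\psi'(f)\,F^{-T}(K^{-T} - 2I)\\
  &= f\,\bar\psi'(f)\,(F^{-T}K^{-T}F^{T} - 2I)\,F^{-T}
   = f\,\bar\psi'(f)\,(H^{-T} - 2I)\,F^{-T},
\end{align*}
% and H^{-T} = H^{-1} once \operatorname{sk} H = 0, which is (14.17).
% In the spherical phase, f = (1-\gamma_0)^3 and
\begin{align*}
H_{\mathrm{sph}}^{-1} - 2I = (2+\gamma_0)I - 2I = \gamma_0 I
\quad\Longrightarrow\quad
S = \gamma_0(1-\gamma_0)^{3}\,\bar\psi'\bigl((1-\gamma_0)^{3}\bigr)\,F^{-T}.
\end{align*}
```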
The stress relation (14.22) implies that the Cauchy stress in the spherical phase is given by
\[
T = \frac{SF^{T}}{\det F} = \frac{\rho\,SF^{T}}{\rho\det F} = C_0\rho I,
\tag{14.23}
\]
where $C_0 := \gamma_0(1 - \gamma_0)^{3}\,\bar{\psi}'((1 - \gamma_0)^{3})/\rho_{\mathrm{ref}}$ has the same sign as $\bar{\psi}'((1 - \gamma_0)^{3})$ and $\rho$ denotes the density in the current configuration. Thus, in the spherical phase, the energetically nearsighted elastic body experiences a hydrostatic stress that is proportional to the density in the current configuration. If $\bar{\psi}'((1 - \gamma_0)^{3}) < 0$, then the stress is a hydrostatic pressure, again proportional to the density, so that the equation of state of the energetically nearsighted elastic body in the spherical phase is that of an ideal gas undergoing isothermal dynamical processes. Of course, the balance of linear momentum then takes the standard form in the current configuration for gas dynamics:
\[
C_0\operatorname{grad}\rho + b = \rho\dot{v},
\tag{14.24}
\]
where $\rho$ and $b$ now denote the density and body force in the current configuration. However, the mixed power inequality now requires that
\[
\operatorname{div}v \le 0,
\tag{14.25}
\]
which tells us that, when $\bar{\psi}'((1 - \gamma_0)^{3}) < 0$, the spherical phase can arise only when the elastic material is not expanding.
By employing relations (14.18), (14.19), and the formulas
\[
\dot{H}_{\mathrm{long}} = (1 + 2\gamma_0)(\dot{d} \otimes d + d \otimes \dot{d}), \qquad d \cdot d = 1,
\]
we obtain in a similar way specific forms for the balance of linear momentum, the mixed power inequality, and the stress relation in the elongated phase
\[
(1 - \gamma_0)\,\bar{\psi}'(1 - \gamma_0)\operatorname{div}\bigl[(\gamma_0 I - (1 + 2\gamma_0)\,d \otimes d)F^{-T}\bigr] + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\ddot{\chi},
\tag{14.26}
\]
\[
(1 - \gamma_0)\,\bar{\psi}'(1 - \gamma_0)\bigl[\gamma_0 I - (1 + 2\gamma_0)\,d \otimes d\bigr]\cdot\dot{F}F^{-1} \ge 0,
\tag{14.27}
\]
\[
S = (1 - \gamma_0)\,\bar{\psi}'(1 - \gamma_0)\bigl[\gamma_0 I - (1 + 2\gamma_0)\,d \otimes d\bigr]F^{-T}.
\tag{14.28}
\]
(In relations (14.26)–(14.28), the symbol $d$ denotes the field introduced in (14.18) referred to the reference configuration.) The stress relation (14.28) implies that the Cauchy stress in the elongated phase is given by
\[
T = C_1\rho\bigl[I - (3 + \gamma_0)\,d \otimes d\bigr]
\tag{14.29}
\]
with $C_1 := \gamma_0(1 - \gamma_0)\,\bar{\psi}'(1 - \gamma_0)/\rho_{\mathrm{ref}}$ having the same sign as $\bar{\psi}'(1 - \gamma_0)$, and the relations (14.24) and (14.25) become in the elongated phase:
\[
C_1\operatorname{grad}\rho - C_1(3 + \gamma_0)\operatorname{div}[\rho\,d \otimes d] + b = \rho\dot{v}
\tag{14.30}
\]
and
\[
\operatorname{div}v \le (3 + \gamma_0)(\operatorname{grad}v)d \cdot d,
\tag{14.31}
\]
the latter when $\bar{\psi}'(1 - \gamma_0) < 0$. We conclude that the elongated phase can persist when $\operatorname{div}v$ is positive, as long as the stretching $(\operatorname{grad}v)d \cdot d$ in the direction of $d$ is at least $\operatorname{div}v/(3 + \gamma_0)$. Consequently, when $\bar{\psi}'(1 - \gamma_0) < 0$, the material in the elongated phase may expand or contract as long as the direction of submacroscopic elongation $d$ is strongly aligned with directions of stretching in the macroscopic motion.
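The coefficient $3 + \gamma_0$ in (14.29)–(14.31) comes from a golden-mean identity. A brief worked sketch, assuming the mass balance $\rho\det F = \rho_{\mathrm{ref}}$ and the identification $\dot{F}F^{-1} = \operatorname{grad}v$:

```latex
% Golden-mean identity: \gamma_0^2 + \gamma_0 = 1, so 1/\gamma_0 = 1 + \gamma_0 and
% (1 + 2\gamma_0)/\gamma_0 = 3 + \gamma_0 .
\begin{align*}
T = \frac{S F^{T}}{\det F}
  &= \frac{(1-\gamma_0)\,\bar\psi'(1-\gamma_0)}{\det F}\,\bigl[\gamma_0 I - (1+2\gamma_0)\,d\otimes d\bigr]\\
  &= \frac{\gamma_0(1-\gamma_0)\,\bar\psi'(1-\gamma_0)}{\rho_{\mathrm{ref}}}\,\rho\,\bigl[I - (3+\gamma_0)\,d\otimes d\bigr]
   = C_1\rho\,\bigl[I - (3+\gamma_0)\,d\otimes d\bigr].
\end{align*}
% Mixed power inequality (14.27) with \bar\psi'(1-\gamma_0) < 0:
\begin{align*}
\gamma_0\operatorname{div} v - (1+2\gamma_0)\,(\operatorname{grad} v)d\cdot d \le 0
\quad\Longleftrightarrow\quad
\operatorname{div} v \le (3+\gamma_0)\,(\operatorname{grad} v)d\cdot d .
\end{align*}
```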
Appendix: Decomposition of Flux Densities
We record here for the sake of completeness the principal relations obtained in
[3, 15], concerning the decomposition of flux densities arising through a structured
deformation, with g the macroscopic deformation and G the deformation without
disarrangements. If w: A → V is a smooth vector field and K = (∇g)−1 G, then
the identity
(det K)div w
= div(K ∗T w) + div((det K)w − K ∗T w) − w · ∇(det K)
(A.1)
is an immediate consequence of the product rule
div(ϕw) = ∇ϕ · w + ϕdiv w,
where ϕ is an arbitrary smooth scalar field. By an appropriate choice of determining sequence n → hn for the purely submacroscopic structured deformation
(i, K), one can derive the following identification relations for (det K)div w and
div(K ∗T w) at each point X ∈ A:
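\[ (\det K)\operatorname{div} w|_X = \lim_{r \to 0}\lim_{n \to \infty} r^{-3} \sum_{C \in \mathcal{C}_n} \int_{h_n(\mathrm{bdy}(C_r(X) \cap C))} w(Y) \cdot \nu(Y)\, dA_Y, \tag{A.2} \]
\[ \operatorname{div}(K^{*T} w)|_X = \lim_{r \to 0}\lim_{n \to \infty} r^{-3} \sum_{C \in \mathcal{C}_n} \int_{h_n(\mathrm{bdy}(C_r(X)) \cap C)} w(Y) \cdot \nu(Y)\, dA_Y. \tag{A.3} \]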
In (A.2) and (A.3), for each positive integer n, Cn is a collection of closed cubes
C that cover the region A and whose faces together include the jump-sites of hn
and of ∇hn , and Cr (X) is a cube centered at X of side r whose faces intersect the
jump-sites of all the functions hn and ∇hn in a set of area zero.
The surface integral in (A.2) is taken over the image under hn of all the faces of
the parallelepiped Cr (X) ∩ C, so that the sum in (A.2) represents the total flux of w
across the image of the faces of Cr (X) and across the image of the faces of cubes
C in Cn containing the jump-sites of hn and ∇hn inside Cr (X). Therefore, the
limit on the right-hand side of (A.2) and, hence, the left-hand side (det K)div w|X ,
represents the volume density of the total flux of w.
The surface integral in (A.3) is taken instead over the image under hn of only
those faces of the parallelepiped Cr (X) ∩ C that belong to the boundary of Cr (X)
and not to the images of faces of cubes C in Cn containing the jump-sites of hn
and ∇hn inside Cr (X). Therefore, the limit on the right-hand side of (A.3) and,
hence, the left-hand side div(K ∗T w)|X , represents the volume density of the flux of
w without disarrangements. Consequently, the remaining terms (div((det K)w −
K ∗T w) − w · ∇(det K))|X on the right-hand side of (A.1) represent the volume
density of the flux of w due to disarrangements.
The identification relation (A.3) also permits us to call the vector field
w\ := K ∗T w
(A.4)
the portion of w without disarrangements. Moreover, from (A.2) the divergence of
the vector field
wd := (det K)w − K ∗T w
(A.5)
together with the term −w · ∇(det K), account for all the volume density of flux
due to disarrangements, and we call wd the portion of w due to disarrangements.
By (A.5) we may write
(det K)w = w\ + wd
(A.6)
as an additive decomposition of the vector field (det K)w into the portion of w
without disarrangements and the portion of w due to disarrangements. Relations
(A.4)–(A.6) yield the consistency relation
K T w\ = w\ + wd .
(A.7)
We note that for each fixed vector a ∈ V, we may set w = S T a in (A.1)–(A.7) and,
in view of the arbitrariness of a, recover the relations (3.2), (3.3), (3.5) and (3.6).
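The algebra behind (A.4)–(A.7) can be checked numerically at a single point. The sketch below assumes that K∗ denotes the adjugate of K, that is, K∗ = (det K)K⁻¹; this is the reading under which (A.7) follows from (A.4)–(A.6), but it is an assumption here since the symbol is defined elsewhere in the paper. The matrix K and the vector w are hypothetical values.

```python
def cofactor(K):
    """Cofactor matrix of a 3x3 matrix (cyclic-index formula, signs included)."""
    return [[K[(i + 1) % 3][(j + 1) % 3] * K[(i + 2) % 3][(j + 2) % 3]
             - K[(i + 1) % 3][(j + 2) % 3] * K[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]


def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def decompose(K, w):
    """Return (det K, w_without, w_dis) following (A.4) and (A.5).

    Assumption: K^* is the adjugate of K, so K^{*T} is the cofactor matrix.
    """
    C = cofactor(K)
    det_K = sum(K[0][j] * C[0][j] for j in range(3))
    w_without = matvec(C, w)                                  # (A.4)
    w_dis = [det_K * w[i] - w_without[i] for i in range(3)]   # (A.5)
    return det_K, w_without, w_dis


K = [[1.0, 0.2, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.1]]  # hypothetical K
w = [1.0, -2.0, 0.5]                                     # hypothetical w
det_K, w_without, w_dis = decompose(K, w)

lhs = matvec(transpose(K), w_without)               # K^T w_without
rhs = [w_without[i] + w_dis[i] for i in range(3)]   # equals (det K) w by (A.6)
print(max(abs(lhs[i] - rhs[i]) for i in range(3)))  # residual of (A.7)
```

The printed residual should vanish up to rounding, since K^T applied to the cofactor matrix of K gives (det K) I.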
Acknowledgements
We thank Amit Acharya, Morton Gurtin, William Hrusa, and the referees for valuable comments and discussions related to this research. We also acknowledge the
support of the US National Science Foundation, Division of Mathematical Sciences, Award #0102477, and the support of the Italian Ministry of University and
Scientific Research through the Grant “Cofin 2000: Modelli Matematici per la
Scienza dei Materiali”, coordinated by P. Podio-Guidugli. We thank the Department of Mathematics of the University of Kentucky at Lexington and C.-S. Man
for the valuable support offered to L. Deseri as Visiting Professor in the Spring
Semester, 2002.
References
1. G. Del Piero and D.R. Owen, Structured deformations of continua. Arch. Rational Mech. Anal. 124 (1993) 99–155.
2. G. Del Piero and D.R. Owen, Integral-gradient formulae for structured deformations. Arch. Rational Mech. Anal. 131 (1995) 121–138.
3. G. Del Piero and D.R. Owen, Structured Deformations. Quaderni dell’Istituto Nazionale di Alta Matematica, Gruppo Nazionale di Fisica Matematica No. 58 (2000).
4. D.R. Owen, Twin balance laws for bodies undergoing structured motions. In: P. Podio-Guidugli and M. Brocato (eds), Rational Continua, Classical and New. Springer-Verlag, New York (2002); Research Report No. 01-CNA-005, February 2001, Center for Nonlinear Analysis, Department of Mathematical Sciences, Carnegie Mellon University.
5. D.R. Owen and R. Paroni, Second order structured deformations. Arch. Rational Mech. Anal. 155 (2000) 215–235.
6. R. Choksi and I. Fonseca, Bulk and interfacial energy densities for structured deformations of continua. Arch. Rational Mech. Anal. 138 (1997) 37–103.
7. R. Choksi, G. Del Piero, I. Fonseca and D.R. Owen, Structured deformations as energy minimizers in models of fracture and hysteresis. Mathematics and Mechanics of Solids 4 (1999) 321–356.
8. L. Deseri and D.R. Owen, Energetics of two-level shears and hardening of single crystals. Mathematics and Mechanics of Solids 7 (2002) 113–147.
9. G. Del Piero, The energy of a one-dimensional structured deformation. Mathematics and Mechanics of Solids 6 (2001) 387–408.
10. G. Capriz, Continua with Microstructure. Springer Tracts in Natural Philosophy 35. Springer-Verlag, New York (1989).
11. A. Eringen, Microcontinuum Field Theories, I. Foundations and Solids. Springer-Verlag, New York (1999).
12. M. Renardy, W. Hrusa and J.A. Nohel, Mathematical Problems in Viscoelasticity. Pitman Monographs and Surveys in Pure and Applied Mathematics 35. Longman Scientific and Technical (1987).
13. L. Deseri and D.R. Owen, Invertible structured deformations and the geometry of multiple slip in single crystals. Internat. J. Plasticity 18 (2002) 833–849.
14. M. Boyce, G. Weber and D. Parks, On the kinematics of finite strain plasticity. J. Mechanics and Physics of Solids 37 (1989) 647–665.
15. D.R. Owen, Structured deformations and the refinements of balance laws induced by microslip. Internat. J. Plasticity 14 (1998) 289–299.
16. P. Haupt, Continuum Mechanics and Theory of Materials. Springer-Verlag, Berlin (2000).
17. W. Noll, La mécanique classique, basée sur un axiome d’objectivité. In: La Méthode Axiomatique dans les Mécaniques Classiques et Nouvelles (Colloque International, Paris, 1959). Gauthier-Villars, Paris (1963) pp. 47–56.
18. A.E. Green and R.S. Rivlin, On Cauchy’s equations of motion. J. Appl. Math. Phys. 15 (1964) 290–292.
19. M.E. Gurtin, An Introduction to Continuum Mechanics. Academic Press, New York (1981).
20. B.D. Coleman and W. Noll, The thermodynamics of elastic materials with heat conduction and viscosity. Arch. Rational Mech. Anal. 13 (1963) 167–178.
21. C.M. Dafermos, Quasilinear hyperbolic systems with involutions. Arch. Rational Mech. Anal. 94 (1986) 373–389.
22. W. Noll, On the continuity of the solid and fluid states. J. Rational Mech. Anal. 4 (1955) 3–81.
23. D. Luenberger, Linear and Nonlinear Programming, 2nd edn. Addison-Wesley, Reading, MA (1989).
24. J.L. Ericksen, Loading devices and stability of equilibrium. In: Nonlinear Elasticity. Academic Press (1973) pp. 161–173.
25. M. Chipot and D. Kinderlehrer, Equilibrium configurations of crystals. Arch. Rational Mech. Anal. 103 (1988) 237–277.
26. I. Fonseca and G. Parry, Equilibrium configurations of defective crystals. Arch. Rational Mech. Anal. 120 (1992) 245–283.
27. C. Davini and G. Parry, On defect-preserving deformations in crystals. Internat. J. Plasticity 5 (1989) 337–369.
28. V. Mizel, On the ubiquity of fracture in non-linear elasticity. J. Elasticity 52 (1999) 257–266.
29. M. Šilhavý and J. Kratochvíl, A theory of inelastic behavior of materials, Part I. Arch. Rational Mech. Anal. 65 (1977) 97–129; Part II. Arch. Rational Mech. Anal. 65 (1977) 131–152.
30. W. Noll, A new mathematical theory of simple materials. Arch. Rational Mech. Anal. 48 (1972) 1–50.
31. A. Bertram, An alternative approach to finite plasticity based on material isomorphisms. Internat. J. Plasticity 14 (1999) 353–374.
32. C. Truesdell and W. Noll, The Non-Linear Field Theories of Mechanics, 2nd edn. Springer-Verlag, Berlin (1992).
33. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
Continuous Distributions of Dislocations in Bodies
with Microstructure
MARCELO EPSTEIN1 and IOAN BUCATARU2
1 Department of Mechanical and Manufacturing Engineering, The University of Calgary, Calgary,
AB T2N 1N4, Canada. E-mail: epstein@enme.ucalgary.ca
2 Faculty of Mathematics, “Al.I.Cuza” University, Iasi, 6600, Romania. E-mail: bucataru@uaic.ro
Received 16 August 2002; in revised form 13 June 2003
Abstract. A material body with smoothly distributed microstructure can be seen geometrically as a
fibre bundle. Within this very general framework, we show that a theory of continuous distributions
of dislocations can be formulated and specialized to particular applications, both old and new.
Mathematics Subject Classifications (2000): 74E05, 74M25, 53C10, 53B05, 55R10.
Key words: inhomogeneity, differential geometry, G-structures, fibre bundles, Eshelby stress.
Dedicated to the memory of Clifford Ambrose Truesdell III.
1. Introduction
The modern theory of continuous distributions of dislocations in simple bodies
can be traced back to the pioneering work of Kondo [13] and his collaborators
in Japan, and the works of Bilby [1], Kroener [14] and others in Europe. Within
the context of mainstream Continuum Mechanics, the seminal articles of Noll [18]
and Wang [19] paved the way for a formalism that is amenable to generalization
in a variety of directions. Some of these generalizations have been presented elsewhere [7, 15, 16]⋆ and they comprise theories of second-grade materials as well
as generalized Cosserat bodies. These are instances of bodies with some kind of
internal structure (or microstructure) along the lines of the classical ideas of the
Cosserat brothers [3] and of the more modern developments by numerous authors
(Truesdell, Ericksen, Toupin, Mindlin, Green and Naghdi, Eringen, etc.). A book
by Capriz [2] points towards more general theories and their possible interpretations and applications. The basic common feature underlying all these theories is
that the microstructure (be it of a granular nature, or of an orientational origin such
as in liquid crystals, or arising from any other physical motivation) is eventually
represented by a smoothed-out apparatus (such as Cartan’s repère mobile). The obvious differential-geometric object corresponding exactly to this conceptual model
is a fibre bundle. The nature of the fibre bundle depends naturally on the particular
application at hand. The most intuitively obvious fibre bundle, general enough to
subsume the Cosserats’ initial idea and many of its generalizations, is the frame
bundle of an ordinary body. In this paper, however, we choose to leave the nature
of the typical fibre unspecified and attempt to answer the question: is it possible
to develop a fully fledged theory of inhomogeneities before such specification is
made? We answer this question in the affirmative and then proceed to show how
old and new theories can be derived as particular cases of the general formulation.
⋆ In [15] and [16] there is a first attempt at a general theory of the type presented here, but the
behavior at the fibre level is still of the local type.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 327–344.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
2. The Body Bundle and Its Configurations
2.1. FIBRE BUNDLES
The simplest instance of a fibre bundle (and one that is convenient to bear in mind)
is a product bundle or trivial bundle. It consists of the Cartesian product M =
B × F of a base manifold B and a fibre manifold F . This product is, of course,
itself a manifold whose dimension is the sum of the dimensions of B and F . It is
endowed with two natural differentiable projection maps such that, given a point
p ∈ M consisting of the ordered pair (b, f ) (with b ∈ B and f ∈ F ), the first
projection (pr1 ) renders b, and the second f . In extending this idea to the general
notion of a fibre bundle, the second projection is lost. Thus, a fibre bundle consists
of a triple (M, B, π ), such that the projection:
π: M → B
(2.1)
is a differentiable map (a surjective submersion) with the following property: there
exists a manifold F , called the typical fibre, such that every point b ∈ B has an
open neighbourhood U and a diffeomorphism fU : π −1 (U ) → U × F making the
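following diagram commutative:
[Diagram: fU maps π⁻¹(U) to U × F, π maps π⁻¹(U) to U, and pr1 maps U × F to U, so that pr1 ◦ fU = π on π⁻¹(U).]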
We express this fact by saying that the fibre bundle is locally trivial, since it
is neighbourhood-wise diffeomorphic to a product bundle. We note that on the
non-empty intersection of any two trivialization neighbourhoods U and V, the
transition map fV ◦ fU⁻¹: F → F (restricted to each point of the intersection)
is a diffeomorphism of the typical fibre. More restrictively, it is possible to require that the
transition maps belong to a subgroup G of the group of diffeomorphisms of the
typical fibre. When this is done, the group G is called the structural group of the
fibre bundle. The transition maps are required to depend differentiably on the points
of the intersection U ∩ V .
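As an illustration of local triviality and of a structural group, the following sketch uses the Möbius band, a standard example that is not discussed in the paper: a bundle over the circle with typical fibre R whose transition maps lie in the two-element group {+1, −1}. The total space is modelled as pairs (θ, t) with θ in [0, 2π) and the seam at θ = 0 glued with t ↦ −t; all names are hypothetical.

```python
import math


def f_U(theta, t):
    """Trivialization over U: the circle minus the seam theta = 0."""
    return (theta, t)


def f_U_inv(theta, y):
    """Inverse of f_U on U x F."""
    return (theta, y)


def f_V(theta, t):
    """Trivialization over V: the circle minus the point theta = pi.

    The fibre coordinate is flipped past pi so that f_V is continuous
    across the glued seam at theta = 0.
    """
    return (theta, t) if theta < math.pi else (theta, -t)


def transition(theta, y):
    """Fibre part of f_V o f_U^{-1} at a base point theta in U ∩ V."""
    return f_V(*f_U_inv(theta, y))[1]


# On one component of U ∩ V the transition map is the identity of the fibre,
# on the other it is the reflection y -> -y.
print(transition(1.0, 2.5), transition(4.0, 2.5))
```

Both trivializations commute with the projection onto the base (the first component is left untouched), and the transition maps take values in {+1, −1}, which plays the role of the structural group G in this example.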
2.2. BUNDLE CONFIGURATIONS
In the physical picture, we shall refer to the fibre bundle M as the body bundle.
The base manifold B represents the macromedium, namely, an ordinary three-dimensional body upon which the microstructure is later superimposed. As such,
for this particular use in continuum mechanics, the base manifold is assumed to be
trivial, in the sense that it can be covered by an atlas consisting of just one chart.
We note in passing that this requirement is not essential for the development of
most of the conceptual framework of classical Continuum Mechanics. Rather, it is
imposed to represent the intuitive notion that the body must manifest itself in toto in
physical Euclidean space. Be that as it may, we shall adopt the standard assumption.
The microstructure will be represented by the typical fibre F , whose nature is left
undefined at this point beyond the fact that it is an m-dimensional differentiable
manifold. In particular, the typical fibre need not have a priori the property of being
trivial, nor is it necessary that the total body bundle M be globally trivializable. We
will, however, assume this last property for the sake of consistency. It is important
to distinguish between a trivial bundle (namely, a bundle that is a given product of
two manifolds) and a globally trivializable one (namely, a bundle that is globally
diffeomorphic to a trivial one). The difference resides in the fact that the former
has a particular singled-out trivialization, while the latter doesn’t.
A configuration of a body bundle is, by definition, a global trivialization κ given
in terms of a fibre-consistent embedding:
\[ \kappa: M \to E^3 \times F, \tag{2.2} \]
where E³ stands for a three-dimensional Euclidean space, which we may identify with R³. By fibre-consistency we simply mean that the following diagram is
commutative:
[Diagram: κ maps M to E³ × F, π maps M to B, χ maps B to E³, and pr1 maps E³ × F to E³, so that pr1 ◦ κ = χ ◦ π.]
where χ is an ordinary configuration of the macromedium B. Such a fibre-consistent embedding is also called a fibre-bundle morphism. In the rest of this paper we will take the liberty of using this κ/χ notation freely. Namely: the character χ (possibly with some subscripts) will always denote the configuration of the
macromedium induced by the body-bundle configuration denoted by κ (with the
same subscripts).
REMARK 2.1 For some particular theories, it may be desirable to further limit the
allowable configurations. Thus, for example, in a theory of second-grade materials
the body bundle is the principal bundle of frames of B and, in contradistinction
with the Cosserat (anholonomic) case, we may only allow embeddings which are lifts
of ordinary configurations of B.
Let X I , Y A (I = 1, 2, 3; A = 1, . . . , m) and x i , y a (i = 1, 2, 3; a = 1, . . . , m)⋆
be local coordinate systems in the body bundle and in the spatial product E 3 × F ,
respectively. Then, a configuration of the body bundle is represented locally by the
3 + m smooth functions:
\[ x^i = x^i(X^I); \qquad y^a = y^a(X^I, Y^A). \tag{2.3} \]
Equations (2.3) can also be regarded as representing a deformation from the
reference configuration implied by the assignment of the bundle chart (X I , Y A ),
into a spatial configuration expressed in the coordinate system (x i , y a ). Note that
the Jacobian of the transformation (2.3) must have maximal rank in the sense that:
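\[ \operatorname{rank}\left[\frac{\partial x^i}{\partial X^I}\right] = 3, \qquad \operatorname{rank}\left[\frac{\partial y^a}{\partial Y^A}\right] = m. \tag{2.4} \]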
For each fixed point X ∈ B, equation (2.3)b must be a diffeomorphism of the
typical fibre F belonging to the structural group G of the bundle. Consequently,
this equation is tantamount to a smooth map
\[ g: B \to G, \qquad X \mapsto g(X). \tag{2.5} \]
Accordingly, denoting by Lg the left action of G on F , equation (2.3)b can also
be understood as
\[ y^a = L_{g(X)}(Y^A). \tag{2.6} \]
In closing this section, we remark that whereas the physical space E 3 is assumed
to have an intrinsic physical meaning and a distinguished metric structure, the typical fibre does not, in principle, possess either, unless and until a concrete physical
context is established.
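The coordinate description (2.3)–(2.6) can be made concrete with a small sketch. It assumes, purely for illustration, a typical fibre F = R² with structural group G = SO(2) acting by rotations, a homogeneous macro-deformation χ, and a rotation angle that varies smoothly with X; none of these choices comes from the paper.

```python
import math

# Hypothetical (constant) gradient of chi; upper triangular, hence invertible,
# so the first rank condition in (2.4) holds.
A = [[1.2, 0.1, 0.0], [0.0, 0.9, 0.2], [0.0, 0.0, 1.1]]


def chi(X):
    """Macro-configuration x^i = x^i(X^I): here a homogeneous deformation."""
    return [sum(A[i][J] * X[J] for J in range(3)) for i in range(3)]


def g(X):
    """The smooth map B -> G of (2.5), encoded by a rotation angle."""
    return 0.3 * X[0] + 0.1 * X[1]


def left_action(angle, Y):
    """Left action L_g of a rotation on the fibre R^2."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * Y[0] - s * Y[1], s * Y[0] + c * Y[1]]


def kappa(X, Y):
    """Bundle configuration (2.3): x = chi(X), y = L_{g(X)}(Y), cf. (2.6)."""
    return chi(X), left_action(g(X), Y)


X, Y = [0.5, -1.0, 2.0], [1.0, 2.0]
x, y = kappa(X, Y)
print(x, y)
```

Fibre-consistency shows up in the fact that the first component of `kappa(X, Y)` depends on X only, and the second rank condition in (2.4) holds because a rotation is invertible on the fibre.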
⋆ From now on, we adopt the convention that the indices H , I , J , K, L, h, i, j , k, l vary within
the range 1,2,3, while the indices A, B, C, D, E, a, b, c, d, e vary within the range 1, . . . , m.
3. The Material Response
3.1. GENERALITIES
We should like to confine our attention to materials whose mechanical behavior
is in some sense local. At this point, however, it is not necessary to require any
particular locality of response insofar as the fibres (or microstructure carriers)
are concerned. We will only demand that the response functional be localized in
terms of its dependence on the points of the base manifold (the macromedium). In
particular, we wish to consider the case in which the material response is of the
first grade, namely, it involves only the local values of the first derivative of the
configuration with respect to the base-manifold coordinates. One has to ascertain,
however, that such presumed first-grade behavior can be established intrinsically,
independently of any particular coordinate system. We will now show that this is
indeed the case and that the entity that characterizes the independent argument of
the constitutive functional is a well-defined geometric object that we call a fibre jet.
3.2. FIBRE JETS
We start by defining an equivalence relation ∼0,X within the set of all possible
configurations of the body bundle M. Two configurations, κ and λ, are said to be
∼0,X -equivalent at a point X ∈ B if they take exactly the same values for each
point of the fibre π −1 (X). Using coordinates we may, therefore, write:
\[ \kappa \sim_{0,X} \lambda \iff \kappa(X^I, Y^A) = \lambda(X^I, Y^A) \quad \forall\, Y^A \in \pi^{-1}(X). \tag{3.7} \]
More explicitly, denoting with a “hat” the coordinates corresponding to the
λ-configuration, the equivalence of κ and λ at a point X ∈ B implies that x i (X) =
x̂ i (X) and y a (X I , Y A ) = ŷ a (X I , Y A ). The second equality is actually an identity
of the functions y a and ŷ a over the whole range of values of the fibre coordinates Y A for the given (fixed) X I .
REMARK 3.1 If we adopt κ as a reference configuration, then for any other
configuration λ which is ∼0,X -equivalent to κ, the deformation of the fibre at X,
as given by equation (2.5), corresponds to the value g = e, where e is the neutral
element of the structural group G.
The equivalence relation just defined partitions the class of all configurations
into equivalence classes. Each equivalence class will be called a zero-th fibre jet
at X and will be denoted as J0,X . In a similar way, we can define higher-order
fibre jets. In particular, the first-order fibre jets are obtained from the equivalence
relation ∼1,X defined by the conditions:
\[ \kappa \sim_{1,X} \lambda \iff \kappa_*(X^I, Y^A) = \lambda_*(X^I, Y^A) \quad \forall\, Y^A \in \pi^{-1}(X), \tag{3.8} \]
where the asterisk subscript indicates the tangent (or gradient) map. The maps
κ∗ (X I , Y A ) and λ∗ (X I , Y A ) are linear maps defined on the tangent space T(XI ,Y A ) M
for every Y A ∈ π −1 (X). The equivalence condition (3.8) may also be written as
κ∗ (X I , ·) = λ∗ (X I , ·). Obviously, the following implication holds:
\[ \kappa \sim_{1,X} \lambda \;\Longrightarrow\; \kappa \sim_{0,X} \lambda. \tag{3.9} \]
In any coordinate system the $\sim_{1,X}$-equivalence of κ and λ at a point X boils
down to $x^i(X^I) = \hat{x}^i(X^I)$, $y^a(X^I, Y^A) = \hat{y}^a(X^I, Y^A)$, $x^i_{,J}(X^I) = \hat{x}^i_{,J}(X^I)$ and
$y^a_{,J}(X^I, Y^A) = \hat{y}^a_{,J}(X^I, Y^A)$, $\forall\, Y^A \in \pi^{-1}(X)$, with commas denoting partial derivatives. It is important to note that these conditions are independent of the coordinate
chart since the derivatives along the fibres are automatically identical. Each equivalence class of the relation $\sim_{1,X}$ is called a first-order fibre jet (or a first fibre jet)
and is denoted by $J_{1,X}$. The first fibre jet at X corresponding to a configuration κ
is denoted as $J_{1,X}\kappa$.
REMARK 3.2 In local coordinates, we can identify a first fibre jet at $X^I$ with a
triple
\[ \big(x^i_{,J}(X^I),\; y^a(X^I, \cdot),\; y^a_{,J}(X^I, \cdot)\big). \tag{3.10} \]
Here, $x^i_{,J}(X^I)$ is simply an ordinary frame of B at the point X, and $y^a(X^I, \cdot)$ is
a diffeomorphism of F belonging to the structural group G. To reveal the nature
of the third entry, $y^a_{,J}(X^I, \cdot)$, we may provisionally adopt as a reference configuration one belonging to the given equivalence class, in which case, according to
remark 3.1, the second entry can be regarded as the identity e. It follows then that
the third entry, being the derivative map of (2.5) evaluated at the identity, can be
identified with an element of the set of linear transformations $L(\mathbb{R}^3, \mathfrak{g})$, where $\mathfrak{g}$ is
the Lie algebra of G.
The composition of fibre jets (of the same order) is defined by taking any two
representatives of the respective equivalence classes, effecting their composition
and then adopting the jet of the composition as the composition of the jets, namely
\[ J_{1,\chi(X)}\lambda \circ J_{1,X}\kappa = J_{1,X}(\lambda \circ \kappa). \tag{3.11} \]
It is not difficult to prove that this definition is independent of the particular representatives chosen.
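The composition rule (3.11) can be tested numerically in a deliberately simple setting. The sketch assumes a fibre F = R on which the structural group of positive scalings acts by y = g(X)Y, so that the first fibre jet reduces to the triple (∂x/∂X, g(X), ∂g/∂X); in the general theory the second and third entries are whole functions of the fibre coordinate. All functions and names below are hypothetical, and derivatives are approximated by central differences.

```python
def numerical_jet(chi, g, X, eps=1e-5):
    """First fibre jet at X of the configuration (X, Y) -> (chi(X), g(X) * Y).

    Returned as (F, g0, dg) with F[i][J] = dx^i/dX^J, g0 = g(X) and
    dg[J] = dg/dX^J, using central differences.
    """
    F = [[0.0] * 3 for _ in range(3)]
    dg = [0.0] * 3
    for J in range(3):
        Xp, Xm = list(X), list(X)
        Xp[J] += eps
        Xm[J] -= eps
        xp, xm = chi(Xp), chi(Xm)
        for i in range(3):
            F[i][J] = (xp[i] - xm[i]) / (2 * eps)
        dg[J] = (g(Xp) - g(Xm)) / (2 * eps)
    return F, g(X), dg


def compose_jets(jet_lam, jet_kap):
    """Composition of jets by the chain rule, cf. (3.11).

    jet_lam is taken at the point chi(X), jet_kap at the point X.
    """
    Fl, hl, dhl = jet_lam
    Fk, gk, dgk = jet_kap
    F = [[sum(Fl[i][k] * Fk[k][J] for k in range(3)) for J in range(3)]
         for i in range(3)]
    # d/dX^J [h(chi(X)) g(X)] = (dh/dx^i)(dx^i/dX^J) g + h dg/dX^J
    d = [sum(dhl[i] * Fk[i][J] for i in range(3)) * gk + hl * dgk[J]
         for J in range(3)]
    return F, hl * gk, d


# Two hypothetical configurations, kappa = (chi, g) and lambda = (psi, h).
chi = lambda X: [X[0] + 0.1 * X[1] ** 2, X[1] + 0.2 * X[0] * X[2], X[2]]
g = lambda X: 1.0 + 0.1 * X[0] ** 2 + 0.05 * X[2]
psi = lambda x: [x[0] + 0.05 * x[1] * x[2], x[1], x[2] + 0.1 * x[0] ** 2]
h = lambda x: 2.0 + 0.3 * x[1]

X = [0.3, -0.2, 0.5]
composed = compose_jets(numerical_jet(psi, h, chi(X)), numerical_jet(chi, g, X))
direct = numerical_jet(lambda Z: psi(chi(Z)), lambda Z: h(chi(Z)) * g(Z), X)
print(composed[1], direct[1])
```

The two triples `composed` and `direct` should agree up to the finite-difference error, which is the content of (3.11) in this reduced model.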
From a more physical point of view, we may say that a first fibre jet of a deformation represents the deformation of a “first-order neighbourhood” of a point and
its fibre. This terminology, although lacking in mathematical rigor, will be used
frequently in an attempt to trigger the right physical intuition.
3.3. FIRST-GRADE RESPONSE
We will now attempt to provide a more rigorous characterization of our introductory remarks as to the material response. For specificity, we will speak of a (time-independent) energy functional W, so that one may say that we are confining our attention to hyperelastic behavior. Although this is not, strictly speaking, a necessary
requirement for the developments that follow, we adopt it here so as to concentrate
on the more intricate geometric aspects of the theory. The energy functional W
at any time t is assumed to be given as the integral over the macromedium B of
an energy-density functional w whose independent argument is the first fibre jet
of the configuration. In any given reference configuration, the volume element for
integration is given by the underlying Cartesian volume element in E 3 . We may
thus write:
\[ W = \int_{\chi_0(B)} w(J_{1,X}\kappa; X)\, dV(X). \tag{3.12} \]
We note that the density w is still a functional as far as its dependence on the
functions y a (X I , ·) and their X-derivatives is concerned. In other words, the value
of w depends on the values of the functions x i and their derivatives at X I , but also
on the entire functions y a (X I , ·) and their derivatives as functions of the running
coordinate Y A . The behavior is, therefore, local only insofar as its dependence
on the deformation of the macromedium, but it may be global in terms of its
dependence on the deformation of the micromedium. To emphasize this fact, we
sometimes will write the energy density w more explicitly (in coordinates) as:
\[ w = w\big(x^i_{,J}(X^I),\; y^a(X^I, \cdot),\; y^a_{,J}(X^I, \cdot);\; X^I\big). \tag{3.13} \]
Note that the dependence on x i (X I ) has been eliminated, since the energy density
is assumed to be invariant under space translations. Further exploitations of the
principle of frame indifference are not pursued in this paper, since they are not
essential to the description of continuous distributions of dislocations at the fundamental level. Notice also that we have intentionally specified a dependence of
the energy density functional on the material point X to allow for the fact that the
material properties may change from point to point of B or that, even if the material
is the same, inhomogeneities may be present. In fact, the nature of this dependence
is a major issue in establishing a consistent theory of continuous distributions of
dislocations. As a prolegomenon to the theory, we record here the way in which
the energy density changes upon a change of reference configuration.
Let a change of reference configuration (i.e., a change of trivialization of M) be
given in terms of a fixed deformation κ̂0 of the reference configuration κ0 :
κ̂0: κ0 (M) → E 3 × F .
(3.14)
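The energy density functional $\hat{w}$ to be integrated over the new domain $\hat{\chi}_0(B)$ is
related to w by:
\[ \hat{w}(J_{1,\hat{\chi}_0(X)}\hat{\kappa};\, \hat{\chi}_0(X)) = [\det \hat{\chi}_{0*}(X)]^{-1}\, w(J_{1,\hat{\chi}_0(X)}\hat{\kappa} \circ J_{1,X}\hat{\kappa}_0;\, X), \tag{3.15} \]
where $\det \hat{\chi}_{0*}(X)$ is the determinant of the tangent map of the macromedium deformation at X, and κ̂ represents arbitrary deformations
measured from the new reference configuration. In coordinates, we may write:
\[ \hat{w}\left(\frac{\partial x^i}{\partial \hat{X}^I},\; y^a,\; \frac{\partial y^a}{\partial \hat{X}^I};\; \hat{X}^I\right) = \left[\det\left(\frac{\partial \hat{X}^I}{\partial X^J}\right)\right]^{-1} w\left(\frac{\partial x^i}{\partial \hat{X}^I}\frac{\partial \hat{X}^I}{\partial X^J},\; y^a,\; \frac{\partial y^a}{\partial \hat{X}^I}\frac{\partial \hat{X}^I}{\partial X^J} + \frac{\partial y^a}{\partial \hat{Y}^A}\frac{\partial \hat{Y}^A}{\partial X^J};\; X^J\right). \tag{3.16} \]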
Naturally, the independent variables of the functions appearing on either side are
different. Thus, for example, the functions y a on the left-hand side are y a (X̂ I , Ŷ A ),
whereas the functions y a on the right-hand side are to be understood (by composition) as y a = y a (X J , Y B ) = y a (X̂ I (X J ), Ŷ A (X J , Y B )), and so on. We remark
once again that the value of the coordinate expression at a point of B is invariant
under a change of representative of the first fibre-jet.
3.4. THE MATERIAL SYMMETRY GROUP
A change of reference configuration that maps a point X of the macromedium B
to itself may happen to have the property that it leaves the material response at
X unchanged. Since in order to check whether or not this is the case one has to
make use of the transformation equation (3.15) (or its coordinate version (3.16)),
and since this equation is sensitive only to the first fibre jet at X, we need only
consider the fibre jets of changes of configuration (that preserve the point X). By
analogy with the case of simple materials, we define a material symmetry H at the
point X ∈ B relative to the reference configuration κ0 of M as a first fibre-jet of a
change of configuration that preserves X and such that the equation:
w(J1,X κ; X) = w((J1,X κ) ◦ H ; X)
(3.17)
is satisfied identically for all deformations κ. Note that the determinant of the
gradient of the macromedium deformation does not appear since, in accordance
with the theory of unstructured continua, we assume that only volume preserving
local transformations can be physically meaningful symmetries.
A material symmetry at a point X ∈ B is given by a triple (G, Lg , L). Here G is
an element of the special linear group SL(3, R) and has the meaning of a symmetry
of the macromedium at X, Lg is the left translation induced by an element g of the
structural group G that acts on the fibre π −1 (X), and L(X, ·) represents a mixed
symmetry of micro- and macro-structures. We remark here also that L(X, ·) can
be regarded as an element of L(R3 , g). This follows from a consideration similar
to that embodied in remark 3.2 taking into account that all tangent spaces to a Lie
group are canonically isomorphic via the adjoint map.
The collection of all symmetries at X forms a group HX whose group operation
is represented by the composition of fibre jets. More explicitly, the material symmetry group HX is a subgroup of the semidirect product GL(3, R) × G × L(R3 , g),
where the multiplication law is given by:
\[ (G, L_g, L)(G', L_{g'}, L') = (GG',\; L_{gg'},\; LG' + (L_g)_* L'), \tag{3.18} \]
where (Lg )∗ is the automorphism of the Lie algebra g induced by the left translation Lg . The neutral element of this group is given by (I3 , e, 0), where e is the
neutral element of the structural group G. The inverse of an element (G, Lg , L) is
given by
\[ (G, L_g, L)^{-1} = (G^{-1},\; L_{g^{-1}},\; -(L_{g^{-1}})_* L G^{-1}). \]
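The multiplication law (3.18) and the stated inverse can be checked numerically. The sketch below assumes, for illustration only, that the structural group is a group of 2 × 2 matrices and that the automorphism (Lg)∗ of the Lie algebra is realized as the adjoint map ξ ↦ gξg⁻¹; an element L of L(R³, g) is stored as a list of three 2 × 2 matrices. These representation choices and all names are hypothetical.

```python
def matmul(A, B):
    """Product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inv2(g):
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def inv3(G):
    C = [[G[(i + 1) % 3][(j + 1) % 3] * G[(i + 2) % 3][(j + 2) % 3]
          - G[(i + 1) % 3][(j + 2) % 3] * G[(i + 2) % 3][(j + 1) % 3]
          for j in range(3)] for i in range(3)]          # cofactor matrix
    det = sum(G[0][j] * C[0][j] for j in range(3))
    return [[C[j][i] / det for j in range(3)] for i in range(3)]


def Ad(g, xi):
    """Assumed realization of (L_g)_* on the Lie algebra: xi -> g xi g^{-1}."""
    return matmul(matmul(g, xi), inv2(g))


def right_compose(L, G):
    """(L G)[J] = sum_I L[I] G[I][J]; L is a list of three 2x2 matrices."""
    return [[[sum(L[I][a][b] * G[I][J] for I in range(3)) for b in range(2)]
             for a in range(2)] for J in range(3)]


def multiply(h1, h2):
    """Group law (3.18) for triples (G, g, L)."""
    (G1, g1, L1), (G2, g2, L2) = h1, h2
    LG = right_compose(L1, G2)
    AL = [Ad(g1, L2[J]) for J in range(3)]
    L = [[[LG[J][a][b] + AL[J][a][b] for b in range(2)] for a in range(2)]
         for J in range(3)]
    return matmul(G1, G2), matmul(g1, g2), L


def inverse(h):
    """Inverse (G^{-1}, g^{-1}, -(L_{g^{-1}})_* L G^{-1})."""
    G, g, L = h
    Gi, gi = inv3(G), inv2(g)
    LGi = right_compose(L, Gi)
    return Gi, gi, [[[-x for x in row] for row in Ad(gi, LGi[J])]
                    for J in range(3)]


def flat(x):
    """Flatten nested lists/tuples of numbers, for comparisons."""
    return [v for y in x for v in flat(y)] if isinstance(x, (list, tuple)) else [x]


def close(h1, h2, tol=1e-9):
    return all(abs(p - q) < tol for p, q in zip(flat(h1), flat(h2)))


IDENTITY = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
            [[1.0, 0.0], [0.0, 1.0]],
            [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)])

h1 = ([[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
      [[2.0, 1.0], [0.0, 1.0]],
      [[[0.1, 0.0], [0.2, 0.3]], [[0.0, 1.0], [0.0, 0.0]], [[0.5, -0.5], [0.1, 0.0]]])
h2 = ([[1.0, 0.0, 0.0], [0.2, 1.0, 0.0], [0.0, 0.3, 1.0]],
      [[1.0, 0.0], [1.0, 1.0]],
      [[[0.0, 0.2], [0.0, 0.0]], [[0.3, 0.0], [0.0, -0.3]], [[1.0, 0.0], [0.0, 1.0]]])

print(close(multiply(h1, inverse(h1)), IDENTITY))
print(close(multiply(h1, h2), multiply(h2, h1)))
```

Under these assumptions the product of an element with its stated inverse returns the neutral element (I3, e, 0), and the group is not commutative, as the second comparison indicates for this pair of elements.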
We observe that every fibre jet determines uniquely by projection an ordinary jet
at the base manifold. More precisely, let J1,X κ be a fibre jet. The induced ordinary
first jet at X is given by π(J1,X κ) = j1,X (χ), where χ is the ordinary configuration
of B associated with the bundle configuration κ, and where j denotes an ordinary
jet. The collection GX = π(HX ) obtained by taking the projection of each and
every symmetry H ∈ HX is a subgroup of GL(3, R) called the induced symmetry
group of the macromedium at X.
An important subgroup of HX is the group SH X obtained by considering only
those symmetries stemming from changes of configuration whose zeroth fibre jet
at X is the identity. It follows that SH X is a normal subgroup of HX . The quotient group HX /SH X carries the physical meaning of global symmetries of the
isolated fibre manifold itself. Using the above notations, the normal subgroup can
be expressed as SH X = {(G, e, L)}. Then it is easy to see that the quotient group
HX /SH X is isomorphic to a subgroup of the structural group G of the fibre bundle.
The symmetry groups of one and the same material point relative to two different reference configurations are related by conjugation through the first fibre jet of
the change of reference configuration.
3.5. MATERIAL ISOMORPHISMS AND UNIFORMITY
The energy density functional w varies from point to point of the macromedium B,
as explicitly indicated in equations (3.12) or (3.13) by the dependence of w on the
last argument, X. It is, therefore, legitimate to ask the question: are two points,
X1 and X2 , of B made of the same material? A necessary and sufficient condition
for this to be the case is, most certainly, the existence of some reference configuration in which the identity: w(J1,X1 κ; X1 ) = w(J1,X2 κ; X2 ) is satisfied for all
J1,X1 κ = J1,X2 κ. By this last equation we mean, abusing the notation, that we are
comparing fibre jets at two different points by means of the parallelism induced by
the trivialization implied in the choice of reference configuration. In other words,
we say that the two points are made of the same material if, in some reference
configuration, they have exactly the same response to the “same” deformations.
The naive, but intuitively clear, definition just introduced already points at the
fact that, even if all the points of B happen to be made of the same material, there
might not exist a common global reference configuration for which the identity
just quoted is satisfied for every pair of points. Thus, we can intuitively distinguish
between the concept of uniformity (“all points are made of the same material”)
and the idea of homogeneity (“there exists a reference configuration in which the
energy density w is independent of position”). An intermediate situation of local
homogeneity is also possible (“for each point of B there exists a reference configuration for which the energy density is independent of position in a neighbourhood
of the point”). These ideas, which are at the heart of Noll and Wang’s treatment of
the theory of continuous distributions of dislocations in simple bodies, will now be
extended to materials with general microstructure.
We say that two points X1 and X2 of a macromedium B are materially isomorphic (read: “made of the same material”) if there exists a body-bundle morphism κ1,2 such that χ1,2 (X1 ) = X2 and
\[ w\bigl(J_{1,X_2}\kappa \circ J_{1,X_1}\kappa_{1,2};\, X_1\bigr) = \chi_{1,2}^{*}(X_1)\, w\bigl(J_{1,X_2}\kappa;\, X_2\bigr), \qquad (3.19) \]
for all jets J1,X2 κ. We will use the notation P (X1 , X2 ) = J1,X1 κ1,2 . Physically, this
jet represents a “transplant operation” which achieves a perfect graft, as far as the
mechanical response is concerned. What has been done is to cut out a first-order
neighbourhood of the point X1 , including its fibre π −1 (X1 ), deform it according to
the map P (X1 , X2 ), and implant it in the place of a similar neighbourhood of X2
and its fibre. The identity above expresses the fact that the graft has been successful,
and this can only happen if the materials are the same!
If we make the source and target points of a material isomorphism coincide,
we obtain a material automorphism, namely, we recover the concept of a material
symmetry at a point. Let H be a material symmetry at the point X1 ∈ B and let
P (X1 , X2 ) be a material isomorphism between X1 and another point X2 . Then the
composition of jets P (X1 , X2 ) ◦ H ◦ P −1 (X1 , X2 ) is a material symmetry at X2 .
It is not difficult to show that all material symmetries at X2 can be obtained in this
way and, consequently, that the symmetry groups at two materially isomorphic
points are conjugate to each other, the conjugation being effected by means of any
material isomorphism.
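The conjugation rule can be checked numerically in the simplest special case. The following Python sketch is an illustration only: it assumes a simple body without microstructure, so that jets reduce to 2×2 matrices, it takes a unimodular transplant P so that the volume factor in (3.19) equals one, and it uses a hypothetical energy `w1` at X1. None of these choices come from the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def w1(F):
    # Hypothetical energy at X1: tr(F^T F), unchanged by F -> F H for any rotation H.
    return sum(F[i][j] ** 2 for i in range(2) for j in range(2))

# Assumed transplant P = P(X1, X2); det P = 1, so no volume factor appears.
P = [[1.0, 0.5], [0.0, 1.0]]

def w2(F):
    # Energy at X2 defined so that w1(F P) = w2(F), i.e. P is a material isomorphism.
    return w1(matmul(F, P))

H = [[0.0, -1.0], [1.0, 0.0]]             # rotation by 90 degrees: a symmetry at X1
H2 = matmul(matmul(P, H), inverse(P))     # conjugate P H P^{-1}: candidate symmetry at X2

F = [[1.3, -0.4], [0.7, 2.1]]             # arbitrary test deformation gradient
sym_at_X1 = abs(w1(matmul(F, H)) - w1(F))
sym_at_X2 = abs(w2(matmul(F, H2)) - w2(F))
not_sym_at_X2 = abs(w2(matmul(F, H)) - w2(F))
```

In this sketch the unconjugated rotation H is not a symmetry at X2, while its conjugate is, which is the content of the statement above for this toy case.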
A body M → B with microstructure is said to be materially uniform if all
the points of B are pairwise materially isomorphic. Since material isomorphism
is clearly an equivalence relation, an alternative definition of material uniformity
consists of establishing the material isomorphisms of all points of B with a fixed
point X0 ∈ B. For clarity, one may conceive of this archetypal point X0 as placed
outside the body. Naturally, this archetype consists of a point carrying the typical
fibre and a first-order neighbourhood of both. Once the archetype is chosen, the
field of transplants becomes a function of one variable alone, namely: P (X) =
P (X0 , X). Let (X µ ) be the coordinates of the archetypal point X0 and (Y α ) the
fibre coordinates along the typical fibre. Then a uniformity field can be written as
P (X) = (PµI (X), Y A (X, ·), RµA (X, ·)). Here Y A (X, ·) and RµA (X, ·) are functions
of Y α . In terms of the archetypal energy density functional w̄ per unit volume of
the archetype the uniformity condition can be written as
\[ w(J_{1,X}\kappa;\, X) = P^{-1}(X)\, \bar{w}\bigl(J_{1,X}\kappa \circ P(X)\bigr). \qquad (3.20) \]
A body M is said to be smoothly uniform if the uniformity field (or “field of implants”) P (X) is smooth. Although in some important practical applications it may
be the singularities of this field that matter the most, we will henceforth assume
that the body is smoothly uniform.
3.6. THE MATERIAL LIE GROUPOID
We abandon for the moment the mechanical motivation and turn our attention to
the following interesting geometrical object. For a given fibre bundle M over the
base manifold B we consider a pair of points X1 and X2 of B and we construct all
possible fibre jets of fibre-bundle morphisms that map X1 to X2 . In other words,
we consider the collection of all possible transplants (regardless of the material
response). We now form the union of all these collections of jets for all pairs of
points in B. The result is an object J(B, M) endowed with two “projection” maps.
Indeed, given any element of J(B, M), namely a fibre jet between some points X1
and X2 , the first projection, α: J(B, M) → B, points at the source X1 , while the
second projection, β: J(B, M) → B, points to the target point X2 .
In addition to being endowed with two projections, the object J(B, M) enjoys other properties. Firstly, the subset JX (B, M) = {J ∈ J(B, M) | α(J ) =
β(J ) = X} is a group. Secondly, if J1 , J2 ∈ J(B, M) and β(J1 ) = α(J2 ), then
the composition J2 ◦ J1 belongs to J(B, M). Finally, if J ∈ J(B, M) then also
J −1 ∈ J(B, M). A set with these properties is called a groupoid, and we will
call J(B, M) the fibre jet groupoid associated with the fibre bundle M. It is not
difficult to see that all the groups JX (B, M) are conjugate to each other. Any one
of them can be rightly called the structural group of the groupoid. When, as in
the case J(B, M), the projections α and β are smooth functions, we have a Lie
groupoid. Notice that the projections are maps onto B, so it is appropriate to say
that B (and not M) is the base manifold of the groupoid J(B, M).
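The algebraic structure just described (two projections, partial composition, inverses, and a group over each point) can be made concrete with a small Python sketch. It is a toy model under stated assumptions: the class `Arrow` and the functions `compose`, `inverse` and `identity` are hypothetical names, and each jet is replaced by a single nonzero real number standing in for an invertible map.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    source: str    # alpha(J): the source point
    target: str    # beta(J): the target point
    value: float   # stand-in for the jet (an invertible 1x1 "matrix")

def compose(j2, j1):
    """Return j2 o j1, which is defined only when beta(j1) == alpha(j2)."""
    if j1.target != j2.source:
        raise ValueError("arrows are not composable")
    return Arrow(j1.source, j2.target, j2.value * j1.value)

def inverse(j):
    """Reverse the arrow and invert the stand-in jet."""
    return Arrow(j.target, j.source, 1.0 / j.value)

def identity(x):
    """Unit of the group of arrows from x to x."""
    return Arrow(x, x, 1.0)

p = Arrow("X1", "X2", 2.0)     # a transplant from X1 to X2
h = Arrow("X1", "X1", -3.0)    # an element of the group over X1

# Conjugation by p carries the group over X1 into the group over X2.
conjugated = compose(compose(p, h), inverse(p))
round_trip = compose(p, inverse(p))
```

Because the stand-in jets commute, conjugation here leaves the numerical value unchanged and only moves the base point; with genuine matrix-valued jets the value would change as well.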
If we reintroduce the temporarily abandoned material picture and consider, for a
given uniform body bundle M with energy density w, the set of all those fibre jets
representing material isomorphisms, we obtain a subset Jw (B, M) of J(B, M)
enjoying precisely all the properties required by the definition of a groupoid. We
say that Jw (B, M) is a subgroupoid of J(B, M). This material subgroupoid has
as its structural group the material symmetry group of any of the points of B. If
we adopt the idea of an archetypal point outside the body, we may conveniently
say that the structural group of the material subgroupoid is the material symmetry
group H0 of the archetype. By the assumed smoothness of the uniformity field,
Jw (B, M) is a Lie groupoid, which we shall call the material Lie groupoid of M.
There exists a parallel, almost equivalent, picture that is worth revealing. A first
fibre-jet J1,X κ of a local trivialization κ can be said to define a fibre frame at
X ∈ B. With this picture in mind, if we consider the collection of all fibre frames at
all points of B, we can construct a fibre bundle F (B, M) over B, with projection
τ: F (B, M) → B. The typical fibre of this bundle consists of all possible changes
of fibre frames and is, therefore, a group. It can be shown that F (B, M) is actually
a principal bundle, which we call the principal bundle of fibre frames of M. It is
worthwhile repeating that the base manifold of this bundle is B, not M.
If we now reconsider the notion of the material archetype, we realize that each
uniformity implant P (X) (or, rather, its inverse) is nothing but a fibre frame at X.
A uniformity field is then just a section of the principal bundle of fibre frames. The
set of all possible implants from a given archetype, consistent with a given constitutive law, forms a subbundle of F (B, M). This subbundle is itself a principal
bundle, whose structural group is the material symmetry group H0 of the archetype.
In this way, we have obtained a generalization of the concept of a G-structure, that
we call a fibre G-structure.
3.7. HOMOGENEOUS BODIES WITH MICROSTRUCTURE
The uniformity concept for a body with microstructure has been introduced and
studied in Section 3.5. In the same section we introduced in general terms the ideas
of homogeneity and local homogeneity for a bundle body. We will now make these
ideas more precise.
We say that a body bundle M is homogeneous with respect to a given fibre
frame of an archetypal point X0 if it admits a global deformation κ such that the
fibre jet J1,X κ −1 = P (X) is a uniformity field. If this is the case we say that the
associated fibre G-structure is integrable. In other words, integrability is equivalent
to the existence of a section P of the fibre G-structure that is the fibre jet J1,X κ −1
of a global deformation κ.
In local coordinates we have that the body bundle is homogeneous if there exists
a uniformity field P (X) = (PµI (X), Y A (X, y α ), RµA (X, y α )) which, by a global
change of reference configuration, can be brought to the trivial field {δµI , e, 0}. The
first component can be achieved if the inverse matrix of PµI (X) is derivable from
three scalar functions x µ (X I ) by (P −1 )µI = ∂x µ /∂X I . This will be the case if the
equality of mixed partial derivatives is satisfied, namely: (P −1 )µI,J = (P −1 )µJ,I . The
second entry can always be achieved by just inverting the function Y A (X, y α ) for
each X, thus obtaining a function y α (X, Y A ). If the first condition is satisfied, we
have determined the change of configuration x µ = x µ (X I ) and y α = y α (X I , Y A ).
We still have to check whether the third entry, for this particular change of configuration, vanishes. By the law of composition of first fibre jets, this will be the case if
\[ \frac{\partial y^\alpha}{\partial Y^A}\, R^A_\mu\, (P^{-1})^\mu_I + \frac{\partial y^\alpha}{\partial X^I} = 0. \]
REMARK 3.3 We note that if the symmetry group is discrete, then the test just
described is necessary and sufficient for (local) homogeneity. In the case of a
continuous symmetry group, however, the degree of freedom afforded by the continuity has to be taken into consideration, each group leading to different necessary
and sufficient criteria. In other words, given a particular uniformity field the test
is sufficient for homogeneity, but the body may still be homogeneous if the test is
violated, since there may exist another independent uniformity field that satisfies it.
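The first step of the test in this section, the symmetry of the derivatives of P −1, lends itself to a quick numerical check. The Python sketch below is only an illustration: it works in two dimensions, uses central finite differences, and compares two hypothetical fields that are not taken from the text, one derived from a deformation x(X) and one given by a position-dependent rotation.

```python
import math

def mixed_partial_defect(Pinv, X, h=1e-5):
    """Return [(P^-1)^mu_{0,1} - (P^-1)^mu_{1,0} for mu = 0, 1] by central differences.

    Pinv(X) must return a 2x2 nested list indexed as Pinv(X)[mu][I].
    """
    def d(mu, I, J):
        Xp, Xm = list(X), list(X)
        Xp[J] += h
        Xm[J] -= h
        return (Pinv(Xp)[mu][I] - Pinv(Xm)[mu][I]) / (2 * h)
    return [d(mu, 0, 1) - d(mu, 1, 0) for mu in range(2)]

def gradient_field(X):
    # Gradient of the assumed map x0 = X0 + 0.1*X1**2, x1 = X1 + 0.2*X0*X1.
    return [[1.0, 0.2 * X[1]],
            [0.2 * X[1], 1.0 + 0.2 * X[0]]]

def rotation_field(X):
    # Rotation by an angle equal to X0; not the gradient of any map.
    c, s = math.cos(X[0]), math.sin(X[0])
    return [[c, s], [-s, c]]

X = [0.3, 0.7]
defect_gradient = mixed_partial_defect(gradient_field, X)
defect_rotation = mixed_partial_defect(rotation_field, X)
```

The defect vanishes (up to discretization error) for the gradient field and does not for the rotation field, so only the former passes the first step of the test.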
In more geometrical language, we can say that a uniformity field for a body-bundle induces three different parallelisms:
(i) the ordinary material parallelism of Noll, as determined by the field of matrices PµI . On an appropriate coordinate patch the corresponding Christoffel
symbols will be curvature-free, since the parallelism is obviously distant;
(ii) a distant parallelism between the fibres of the body-bundle M induced by the
functions Y A (X, y α ), two points at two different fibres π −1 (X1 ) and π −1 (X2 )
being in correspondence if they have the same values of the coordinates y α ;
and
(iii) a curve dependent parallelism on M generated by the horizontal distribution
spanned by the vectors:
\[ H_\mu = P^I_\mu(X)\,\frac{\partial}{\partial X^I} + R^A_\mu(X, y^\alpha)\,\frac{\partial}{\partial Y^A}. \qquad (3.21) \]
The test of local homogeneity described above is equivalent to the following
geometric conditions: (i) the torsion of the first connection vanishes; (ii) the horizontal distribution spanned by Hµ is involutive, thus giving rise to a distant fibre
parallelism; and (iii) this distant parallelism coincides with the one induced by
Y A (X, y α ).
It is remarkable that a necessary condition for the above homogeneity criteria
to be satisfied can be expressed in terms of an ordinary linear connection on M,
namely, an object that lives in the principal bundle of frames of M, or in its associated tangent bundle T M. Indeed, consider the field of frames of M given by the
3 + m linearly independent vector fields Hµ and Vα = (∂Y A /∂y α )(∂/∂Y A ). Then,
a necessary condition for local homogeneity is the vanishing of the following three
inhomogeneity tensors:
\[
\begin{cases}
P^I_\mu \left( \dfrac{\partial (P^{-1})^\mu_J}{\partial X^K} - \dfrac{\partial (P^{-1})^\mu_K}{\partial X^J} \right); \\[2ex]
\dfrac{\partial Y^A}{\partial y^\alpha} \left( \dfrac{\partial R^\alpha_J}{\partial X^K} - \dfrac{\partial R^\alpha_K}{\partial X^J} \right); \\[2ex]
\dfrac{\partial Y^A}{\partial y^\alpha} \left( \dfrac{\partial R^\alpha_J}{\partial Y^B} - \dfrac{\partial^2 y^\alpha}{\partial Y^B\, \partial X^J} \right),
\end{cases}
\qquad (3.22)
\]
where we have set $R^\alpha_I = -(\partial y^\alpha/\partial Y^A)\, R^A_\mu\, (P^{-1})^\mu_I$. The above mentioned tensors are
nothing but the torsion components of the complete parallelism (linear connection)
D on M induced by the field of frames on M given by Hµ and Vα or, in matrix
form:
\[
F(X, Y) =
\begin{pmatrix}
(P^{-1})^\mu_I(X) & 0 \\[1ex]
R^\alpha_I(X, Y) & \dfrac{\partial y^\alpha}{\partial Y^A}(X, Y)
\end{pmatrix}.
\qquad (3.23)
\]
The complete parallelism on M induced by this field of frames determines a
unique linear connection D on M for which the frame {Hµ, Vα } is covariantly
constant. This means that DZ Hµ = DZ Vα = 0 for all vector fields Z on M.
The linear connection D is curvature free. If we express the torsion T (Z, W ) =
DZ W −DW Z−[Z, W ] with respect to the frame {Hµ , Vα }, then there are only three
nonzero components. These components are the inhomogeneity tensors (3.22).
REMARK 3.4 The field of frames (3.23) has some important features: it is projectable and preserves the vertical vectors with respect to the projection π . The
first tensor (3.22) measures the inhomogeneity of the macrostructure. This tensor
is the torsion of the complete parallelism (linear connection) ∇ on B induced by
the frame $(P^{-1})^\mu_I(X)$. It is easy to see that the linear connection ∇ on B is the
projection of the linear connection D on M.
4. Additional Topics
4.1. ON STRESSES AND CONFIGURATIONAL FORCES
Since our purpose has been merely to reveal the geometrical material structure
underlying a distribution of dislocations or other forms of inhomogeneity in a
body with general microstructure, we have not discussed the field equations that
govern the possible motion and material evolution of such bodies. Our intention in
this subsection is to touch upon this issue briefly so as to prepare the ground for
future work on the applications of the general theory to particular cases. We start by
recalling once again that the energy density function w appearing in equation (3.13)
is in fact a functional as far as its dependence on the fibre deformation y a (X, ·) and
its material gradient y,Ia (X, ·) are concerned. At this level of generality one may,
therefore, define the microstresses as functional derivatives (such as Gateaux or
Fréchet derivatives) of w with respect to the functions y a (X, ·) and y,Ia (X, ·). In
particular cases, the functional w may eventually turn out to be expressed in terms
of an ordinary function of a finite number of parameters. An important example of
this type is discussed in the next subsection. In such cases, then, the microstresses
will boil down to a finite number of “hyperstress tensors” defined at each point of
the macromedium B. Another case of practical importance is a situation in which
the functional can be represented as an integral over the fibre, via an appropriately
defined volume element arising from physical considerations. In these cases, it
appears that the microstresses will be reducible to a number of tensor fields defined
over the fibres themselves. In all cases, not necessarily confined to the two examples just mentioned, the relevant form of the equations of motion can be gathered
from an exercise consisting of postulating a form of the kinetic energy and then
demanding that the corresponding Lagrangian attain an extremal value.
An issue directly related to that of microstresses is that of Eshelby-like stresses
for materials with microstructure. Born from the classical article of Eshelby [11] on
the force on an isolated singularity in an elastic material, the notions of Eshelby
stresses and configurational forces have acquired in recent years considerable impetus. The conceptual reach of the original idea has been extended to a large variety
of physical situations, including phase transitions and material growth, some of
which are discussed at length in [17] and [12]. It is a fundamental feature of
the theory of continuous distributions of inhomogeneities that, once the nature
of the uniformity maps has been established for a particular theory on the basis
of the kinematic variables involved, the correct expression for the corresponding
configurational stresses arises canonically as the static dual in, say, the free energy expression (see [9, 10, 4]). Thus, the kinematics alone, by determining the
appropriate uniformity maps, determines the form of the Eshelby stresses. This
feature is particularly useful in the case of materials with microstructure. Within
the generality maintained thus far in this article, the Eshelby microstresses would
be measured, indeed, by means of functional derivatives of the right-hand side of
equation (3.20) with respect to the uniformity field P (X), namely, with respect
to the triple P (X) = (PµI (X), Y A (X, ·), RµA (X, ·)). In any particular theory, these
configurational microstresses will have exactly the same nature as their Newtonian
counterparts. The evolution of the material structure, namely, the time evolution of
the corresponding groupoid within the class of conjugate groupoids, will then be
subjected to a number of formal restrictions. For the case in which the microstructure describes a second gradient behaviour, these restrictions have been studied in
detail elsewhere [5, 6], but a general treatment is lacking.
4.2. BODIES WITH LINEAR MICROSTRUCTURE
In this section we shall show how the theory applies for the particular case when
the fibre bundle (M, π, B) is the tangent bundle (T B, π, B) of a material body B.
As the typical fibre is a linear space and the structural group is the general linear
group Gl(3, R) we say that the microstructure behaves linearly.
We consider a material body B, that is, a three-dimensional manifold that can
be covered with just one chart, and we denote by (T B, π, B) its tangent bundle.
Then a configuration of the body bundle is given by a map
κ: T B → E 3 × R3
(4.24)
that preserves the fibre structure, or, equivalently, by the six smooth functions:
\[
\begin{cases}
x^i = x^i(X^I), & \mathrm{rank}\left( \dfrac{\partial x^i}{\partial X^I} \right) = 3, \\[2ex]
y^i = H^i_I(X^I)\, Y^I, & \mathrm{rank}(H^i_I) = 3.
\end{cases}
\qquad (4.25)
\]
It is important to remark that, in this case, the function y i describing the fibre
deformation at a point X ∈ B is completely defined by the matrix HIi (X). This
means that the energy density functional w effectively has become an ordinary
function of a finite number of variables. In order to study the mechanical behavior
of a body with microstructure, we have defined in Section 3.2 the first order fibre
jet of a configuration. In our case, when a configuration κ has the form given by
(4.24) or (4.25), the fibre jet is given by
\[ J_{1,X}\kappa = \bigl( x^i_{,J}(X),\; H^i_J(X),\; H^i_{J,K}(X) \bigr). \qquad (4.26) \]
Here $H^i_J(X)\,Y^J$ and $H^i_{J,K}(X)\,Y^J$ have to be seen as linear maps. Consequently, the
material response at each point X ∈ B of the body bundle is given by an ordinary
function:
\[ w = w\bigl( x^i_{,J}(X),\; H^i_J(X),\; H^i_{J,K}(X);\; X^I \bigr), \qquad (4.27) \]
(compare with [7, 8]). A material symmetry H of a point X ∈ B is a triple (GIJ ,
KJI , LIJ M ) such that
\[ w\bigl( x^i_{,J},\; H^i_J,\; H^i_{J,M} \bigr) = w\bigl( x^i_{,I}\, G^I_J,\; H^i_I K^I_J,\; H^i_{I,N} K^I_J G^N_M + H^i_I K^I_N L^N_{JM} \bigr). \qquad (4.28) \]
As the structural group of the tangent bundle (T B, π, B) is Gl(3, R), and as the
Lie algebra of this group is the algebra of matrices M3 (R), then L(R3 , M3 (R))
can be identified with L2 (3), the set of all bilinear maps from R3 × R3 to R3 . The
material symmetry group is then a subgroup of the semidirect product Gl(3, R) ×
Gl(3, R) × L2 (3).
The uniformity of the body bundle B reduces to the existence of a globally
defined field
\[ P(X) = \bigl( P^I_\alpha(X),\; Q^I_\alpha(X),\; R^I_{\alpha\beta}(X) \bigr). \qquad (4.29) \]
For the general case, a body with microstructure is homogeneous if the three
inhomogeneity tensors (3.22) induced by a uniformity field vanish. Due to the
particular form (4.29) of the uniformity field, the second inhomogeneity tensor
vanishes if the third one does. Consequently, we may conclude that a body with
linear microstructure is homogeneous if there exists a field of uniformities (4.29)
with zero inhomogeneity tensors:
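\[
\begin{cases}
P^I_\alpha \left( \dfrac{\partial (P^{-1})^\alpha_J}{\partial X^K} - \dfrac{\partial (P^{-1})^\alpha_K}{\partial X^J} \right); \\[2ex]
Q^I_\alpha \left( R^\alpha_{LJ} - \dfrac{\partial (Q^{-1})^\alpha_L}{\partial X^J} \right).
\end{cases}
\qquad (4.30)
\]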
Here RIαJ is the third entry of the inverse of the uniformity field (4.29) and is given
by:
\[ R^\alpha_{IJ} = -(Q^{-1})^\alpha_L\, R^L_{\beta\gamma}\, (Q^{-1})^\beta_I\, (P^{-1})^\gamma_J. \qquad (4.31) \]
The vanishing of the two tensors (4.30) implies the existence of a diffeomorphism
κ: (X I , Y I ) ∈ T B → (x α (X), (Q−1 )αI (X)Y I ) ∈ E 3 × R3 such that
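\[ (P^{-1})^\alpha_I = \frac{\partial x^\alpha}{\partial X^I} \quad \text{and} \quad R^\alpha_{IJ} = \frac{\partial (Q^{-1})^\alpha_I}{\partial X^J}. \qquad (4.32) \]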
It is important to note here that, just as in the general case, the two inhomogeneity
tensors (4.30) are the torsion components of a linear connection D on T B. This
linear connection is the complete parallelism induced by the following field of
frames on T B
\[
F(X, Y) =
\begin{pmatrix}
(P^{-1})^\alpha_I(X) & 0 \\[1ex]
R^\alpha_{JI}(X)\, Y^J & (Q^{-1})^\alpha_I(X)
\end{pmatrix}.
\qquad (4.33)
\]
In [7], the homogeneity of a second order simple material body reduces to the
vanishing of two inhomogeneity tensors. These tensors are exactly the same as
the inhomogeneity tensors (4.30), but they were obtained in a different way. The
first inhomogeneity tensor is the torsion of the complete parallelism of the material
body B induced by the frame (P −1 )αI , while the second is the difference of the two
linear connections induced by the fields (Q−1 )αI and RIαJ .
The material Lie groupoid consists of all triples
\[ P(X, \hat{X}) = \bigl( P^I_{\hat{I}}(X, \hat{X}),\; Q^I_{\hat{I}}(X, \hat{X}),\; R^I_{\hat{I}\hat{J}}(X, \hat{X}) \bigr) \qquad (4.34) \]
of material isomorphisms from $\hat{X}$ to $X$ that are compatible with the energy density
functional w. A fibre frame, as defined in Section 3.6 can be seen now as a second
order frame (4.29) on B or as a special frame (4.33) on T B. Then the induced
fibre G-structure is isomorphic to a second order G-structure on B or a G-structure
on T B. The homogeneity of the body bundle is equivalent to the integrability of the
second order G-structure (see [8]) or to the integrability of the G-structure on T B
and each of these conditions is equivalent to the vanishing of the tensors (4.30).
Acknowledgements
This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada. The second author (I.B.) would like to thank
Dr. Epstein for his support during the visit at the University of Calgary.
The first author (M.E.) gratefully acknowledges an illuminating discussion with
Professor Gianfranco Capriz on the possible applications of the theory to higher-grade liquid crystals and nematic elastomers.
References
1. B.A. Bilby, Continuous distributions of dislocations. In: Progress in Solid Mechanics, Vol. 1. North-Holland, Amsterdam (1960) pp. 329–398.
2. G. Capriz, Continua with Microstructure. Springer (1989).
3. E. Cosserat and F. Cosserat, Théorie des Corps Déformables, Paris, Hermann (1909).
4. M. Epstein, Eshelby-like tensors in thermoelasticity. In: W. Muschik and G.A. Maugin (eds), Nonlinear Thermomechanical Processes in Continua, Vol. 61. TUB-Dokumentation, Berlin (1992) pp. 147–159.
5. M. Epstein, On the anelastic evolution of second-grade materials. Extracta Mathematicae 14 (1999) 157–161.
6. M. Epstein, Towards a complete second-order evolution law. Math. Mech. Solids 4 (1999) 251–266.
7. M. Epstein and M. de León, Homogeneity conditions for generalized Cosserat media. J. Elasticity 43 (1996) 189–201.
8. M. Epstein and M. de León, Geometrical theory of uniform Cosserat media. J. Geom. Physics 26 (1998) 127–170.
9. M. Epstein and G.A. Maugin, Sur le tenseur de moment matériel d’Eshelby en élasticité non linéaire. C. R. Acad. Sci. Paris 310/II (1990) 675–768.
10. M. Epstein and G.A. Maugin, The energy-momentum tensor and material uniformity in finite elasticity. Acta Mechanica 83 (1990) 127–133.
11. J.D. Eshelby, The force on an elastic singularity. Philos. Trans. Roy. Soc. London A 244 (1951) 87–112.
12. M.E. Gurtin, Configurational Forces as Basic Concepts of Continuum Physics. Springer, Berlin (2000).
13. K. Kondo, Geometry of elastic deformation and incompatibility. In: Memoirs of the Unifying Study of the Basic Problems in Engineering Science by Means of Geometry. Tokyo Gakujutsu Benken Fukyu-Kai (1955).
14. E. Kroener, Allgemeine Kontinuumstheorie der Versetzungen und Eigenspannungen. Arch. Rational Mech. Anal. 4 (1960) 273–334.
15. M. de León, A geometrical description of media with microstructure: Uniformity and homogeneity. In: Geometry, Continua and Microstructure, Collection Travaux en Cours 60. Hermann, Paris (1999) pp. 11–20.
16. M. de León and M. Epstein, Geometric characterization of the homogeneity of continua with microstructure. Extracta Mathematicae 11 (1996) 1116–1126.
17. G.A. Maugin, Material Inhomogeneities in Elasticity. Chapman and Hall, London (1993).
18. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
19. C.C. Wang, On the geometric structure of simple bodies: A mathematical foundation for the theory of continuous distributions of dislocations. Arch. Rational Mech. Anal. 27 (1967) 33–94.
A Model of the Evolution of a Two-dimensional
Defective Structure
MARCELO EPSTEIN1 and MAREK ELŻANOWSKI2
1 Department of Mechanical and Manufacturing Engineering, The University of Calgary, Calgary,
AB, Canada. E-mail: epstein@enme.ucalgary.ca
2 Department of Mathematical Sciences, Portland State University, Portland, OR, U.S.A.
E-mail: marek@mth.pdx.edu
Received 28 June 2002
Abstract. A model of the anelastic evolution law of a two-dimensional defective solid crystal body
is proposed. Assuming that the material body is made of triclinic crystals and that the evolution
process does not alter the basic material symmetry group, we postulate that the evolution is driven by
the present state of the density of the distribution of defects. We show that a linear relation between
the inhomogeneity velocity gradient and the torsion tensor is rich enough to model such phenomena
as relaxation of defects and dislocation pile-up.
Mathematics Subject Classifications (2000): 74E05, 53C10, 53B05.
Key words: defects, evolution, inhomogeneity, anelasticity.
1. Introduction
The theory of continuous distributions of dislocations in its various formulations
always results in a mathematical description of distributions of inhomogeneities in
terms of differential-geometric objects. An open question, however, is the formulation of constitutive laws that govern the possible time evolution of such geometric
structures so as to represent a variety of important physical phenomena involving the massive motion of defects. The driving force behind these phenomena
can perhaps be best explained in terms of configurational forces such as those
represented by the Eshelby tensor. Nevertheless, it is quite possible to conceive
of an evolutionary process that is driven by the dislocation pattern itself in its
natural tendency to eliminate residual stresses or, even if these stresses are absent,
to achieve a defect-free structure over time. These processes can be enhanced, for
example, by raising the temperature of the body so as to increase the probability of
the atoms in overcoming potential barriers. On the other hand, a dislocation pattern
may lead in the opposite direction, in the sense that a dislocation pile-up may
arise naturally out of an initially smooth distribution of defects. These typically
nonlinear phenomena are in need of a general theoretical framework consistent
with the differential-geometric apparatus mentioned above. The purpose of this
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 345–355.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
paper is to show how a relatively simple model, valid for solids endowed with only
a discrete material symmetry group and already possibly devoid of residual stress,
can explain, among other phenomena, the appearance of dislocation pile-ups. The
proposed evolution law consists of assuming nothing more than a linear relation
between the inhomogeneity velocity gradient and the instantaneous value of the
torsion of the (unique) material connection. That such a simple law can account
for nonlinear phenomena is an encouraging sign of the power of the theory of
continuous distributions of inhomogeneities, which is just beginning to be fully
tapped. A possible extension of the theory would include the modelling of the
release of residual stresses present in an isotropic solid. In this case, the dislocation
density can be completely characterized by the curvature tensor of an appropriately
defined Riemannian connection. The theory would be necessarily more involved
than the one presented in this paper not only because the curvature tensor is of
higher order than a torsion, but also because the evolution would involve a coupling
with the solution of the equilibrium boundary-value problem at each instant. It is
mainly for reasons of simplicity that we have limited the presentation to the solid
crystal case. Conspicuously absent from this treatment is the important issue of
thermodynamic restrictions on the form of the evolution law. For the case of a
stress-driven evolution, and within different phenomenological frameworks, such
restrictions have been considered, among others, by [7–9].
2. Uniformity
Let B denote an open, possibly unbounded, region in R3 . We shall view it as a
deformable continuum in a reference configuration. A deformation of the body B
is an embedding χ: B → R3 . Its tangent map evaluated at the material point
X ∈ B is called the deformation gradient at X, and it will be denoted by F(X). In
fact, due to the canonical identification of a tangent space of R3 with the Euclidean
vector space E3 we recognize the deformation gradient as an automorphism of E3 ,
and drop the explicit dependence of F on the material point X.
In pure elasticity the density of the stored energy per unit reference volume is
given by a function W (F; X) where, as mentioned earlier, F is the gradient of the
deformation from the reference configuration to the current configuration evaluated at X. Adopting a three-dimensional vector space V as a reference crystal (an
archetype material point) we say that the body B is materially uniform whenever
there exist smoothly distributed (throughout the body) uniformity maps P(X) from
the reference crystal V to the tangent space of the reference configuration at X, and
a real-valued function $\hat{W}$ such that
\[ W(F; X) = \hat{W}(F P(X)) \qquad (1) \]
for all deformation gradients F and for each material point X [11]; see also [4].
Given a basis Eα (α = 1, 2, 3) in the reference crystal V and a (right-handed)
coordinate system eI (I = 1, 2, 3) in R3 the mappings P(X) induce in the reference
configuration a field of bases
fβ (X) ≡ PβI (X)eI ,
(2)
called a uniform reference. The uniform frame at X is related to the uniform frame
at Y by the linear isomorphism
P(X; Y ) ≡ P(X)P−1(Y ),
(3)
called a material isomorphism from Y to X. Note that the choice of the basis Eα
in the reference crystal, although arbitrary, has no effect on the choice of
maps P(X; Y ).
A uniform reference (a moving frame) fβ is not, in general, induced by any
coordinate system on the body B even if considered only in some neighborhood
of a material point. However, if for every material point X there exists such a coordinate neighborhood (albeit different at different points) the body is called locally
homogeneous [12, 13]. By an appropriate change of reference configuration, the
uniformity maps P(X) can then be chosen as independent of X in each such neighborhood. This in turn implies that the parallelism induced on B by such a material
reference fβ is locally trivial. The material connection associated with such a parallelism is torsion-free, where a material connection of the mathematical theory of
inhomogeneities is a connection generated by any (homogeneous or not) uniform
reference [10]. Note that any material connection is locally integrable, i.e., its curvature tensor vanishes locally, as uniform references are induced from the reference
crystal by the smoothly distributed (throughout the body) mappings P(X).
For a solid crystal point the material symmetry group is finite. In particular,
the triclinic crystal is a solid crystal with the trivial symmetry group (there are no
symmetries other than the identity I).⋆ A material body made of solid crystals has
a unique material connection. This is in contrast with the case when the material
symmetry group is continuous, e.g., in an isotropic solid.
In this paper we shall only consider uniform material bodies made of triclinic
crystals and such that there exists a global reference configuration in which all material isomorphisms P(X; Y ) are proper rotations, i.e., the uniform reference corresponds to contorted aeolotropy [10] or, equivalently, a state of constant strain [5].
This can be realized if, for example, there exists a global stress-free reference, and
the reference crystal is assumed stress-free. Other states of stress are also possible.
Indeed, one can show that in a 2-dimensional solid crystal body the state of stress
compatible with a state of constant strain is hydrostatic [5].
⋆ One may also allow −I to be a symmetry of a triclinic solid [11].
In other words, if the body is in a state of constant strain, and if a (right-handed)
orthonormal basis $e_I$ ($I = 1, 2, 3$) defines a Cartesian coordinate system on $\mathbb{R}^3$,
then
\[ f_\beta(Z) = Q^I_{\ \beta}(Z)\, e_I, \tag{4} \]
where all $Q^I_{\ \beta}(Z)$ are proper orthogonal tensors. The Christoffel symbols of the
second kind of the unique (constant strain) material connection are given in the
Cartesian coordinate system by
\[ \Gamma^I_{KJ}(Z) = -Q^I_{\ \alpha,J}(Z)\, Q^{\alpha}_{\ K}(Z), \tag{5} \]
where a comma indicates partial differentiation. When the body is locally homogeneous, and the rotations QIβ (Z) are locally material point independent, the
Christoffel symbols of the material connection vanish.
3. Evolution Law
Consider a uniform solid crystal body. In the realm of pure elasticity the given
uniform reference remains unchanged. In other words, there are no processes of
elastic deformations which may change the existing structure. However, anelastic
processes usually involve mechanisms that modify the distribution of material inhomogeneities. This can be modelled by allowing the uniform reference to change
in time. As the uniform reference fα evolves, and assuming that the evolution does
not alter the symmetry group, its time derivative yields
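\[ \dot f_\beta = \dot P^I_{\ \beta}\, e_I = \dot P^I_{\ \beta}\, (P^{-1})^{\gamma}_{\ I}\, f_\gamma = L^{\gamma}_{\ \beta}\, f_\gamma, \tag{6} \]
as implied by relation (2). Here, $L^{\gamma}_{\ \beta}$ represent the components of the inhomogeneity
velocity gradient [6]
\[ L \equiv P^{-1}\dot P, \tag{7} \]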
which measures the temporal rate of change of uniform references pulled back to
the reference crystal. Note that for the triclinic crystal body in a state of constant
strain
\[ L^{\gamma}_{\ \beta} = \dot Q^I_{\ \beta}\, Q_I^{\ \gamma} \tag{8} \]
are components of a skew-symmetric matrix, as implied by (4).
Given a particular uniform reference of an arbitrary uniform material body the
torsion
\[ T^I_{KJ} \equiv \Gamma^I_{KJ} - \Gamma^I_{JK} \tag{9} \]
of the induced material connection is an indicator of whether or not the body is
homogeneous. Indeed, if the torsion vanishes the induced material parallelism is
trivial and the body is homogeneous. On the other hand, if the torsion of a particular
material connection does not vanish the corresponding uniform reference is not
integrable. The body may still be homogeneous as there may exist another uniform
reference, obtained by the action of the material symmetry group, inducing a flat
material connection. In the triclinic crystal case, however, as we pointed out earlier,
the material connection is unique. The torsion of such a material connection is not
only an indicator of inhomogeneity but it may be considered a true measure of the
density of the distribution of inhomogeneities.
We, therefore, postulate that, regardless of the state of stress, the distribution
(density) of inhomogeneities is the driving force behind the intrinsic anelastic evolution of these inhomogeneities. According to this idea, we suggest an evolution
law of the form:
Ṗ(X, t) = f (T(X, t), P(X, t))
(10)
where T is the torsion tensor of the instantaneous intrinsic material connection
(as generated by the current uniformity maps P), and where f is assumed not to
depend explicitly on X because of the assumed uniformity of the evolving body.⋆
Formulating an evolution law is a difficult constitutive modelling process. However, for such a law to describe a true evolution it must satisfy the principle of
covariance [6]. That is, it must be independent of any particular reference configuration chosen. If λ: R3 → R3 is a diffeomorphism representing a change of reference configuration and H denotes its gradient at a material point the corresponding
uniformity maps R and P are related by
R = HP.
(11)
As we want our evolution law to describe a particular physical situation in a manner
independent of the reference configuration and, since λ is time independent, we
have:
Ṙ = HṖ.
(12)
This implies that
f (HTH−1 H−1 , HP) = Hf (T, P)
(13)
for all nonsingular tensors H. Note that as the torsion T is a vector-valued two-form
the notation HTH−1 H−1 is a shorthand for the pull-back transformation whose
coordinate representation takes the form
\[ T^I_{JK} = (H^{-1})^I_{\ A}\, T^A_{BC}\, H^B_{\ J}\, H^C_{\ K}. \tag{14} \]
In particular, let us select (with some abuse of notation) $H = P^{-1}$ and define
\[ f_v(T) \equiv f(P^{-1}TPP,\, I). \tag{15} \]
Hence,
\[ \dot P = P f_v(T) = P f(T_v, I), \tag{16} \]
where
\[ T_v \equiv P^{-1}TPP \tag{17} \]
can be recognized as the density of the distribution of inhomogeneities (torsion tensor) seen from the perspective of the reference crystal. The evolution equation (16)
can now be rewritten in terms of the inhomogeneity velocity gradient as follows:
\[ L(P) = f_v(T). \tag{18} \]
⋆ We emphasize that, in principle, the function f may depend also on other parameters, such as
temperature, stress, etc. Our aim is to exhibit the richness of the theory even under the assumption of
a “self-driven” evolution. Note also that (10), although particularly appealing in the triclinic crystal
case, may as well be applicable in other situations, with possibly extra equivariance conditions.
It is not difficult to see that this form of the evolution law is completely invariant.
In particular, we may restrict the form of the evolution law by supposing a linear
relation such that
L(P) = CTv ,
(19)
where C is a fifth order tensor of material constants. In other words the evolution
law is given in component form by
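\[ (P^{-1})^{\alpha}_{\ I}\, \dot P^{I}_{\ \beta} = C^{\alpha}{}_{\beta\rho}{}^{\sigma\lambda}\, (P^{-1})^{\rho}_{\ M}\, P^{N}_{\ \sigma}\, P^{K}_{\ \lambda}\, T^{M}_{NK}. \tag{20} \]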
According to the principle of actual evolution [6] a process described by such
an evolution law is truly evolutive only if the inhomogeneity velocity gradient L is
outside of the Lie algebra of the material symmetry group of the reference crystal.
In the case of a material body made of triclinic crystals, when the material symmetry group is finite (so that its Lie algebra is trivial), this principle implies that every non-trivial evolution, i.e.,
$L^{\gamma}_{\ \beta} \neq 0$, represents a true evolution.
4. The Two-Dimensional Case
For the sake of specificity and to illustrate the range of phenomena within the
scope of this approach, we consider now a class of problems for which the uniform
reference is independent at all times of, say, the third Cartesian coordinate. In
doing so, we render the evolution problem two-dimensional and gain the added
computational simplicity afforded by the explicit representation of the rotation by
means of a single angular parameter.
Adopting an orthonormal basis in V and a Cartesian coordinate system x, y, z in
the fixed reference configuration, the assumption that at all times t and at all points
the uniform reference represents a state of constant strain results in the following
matrix representation of the uniformity maps P:
\[ [P] = \begin{bmatrix} \cos\theta(x, y, t) & \sin\theta(x, y, t) & 0 \\ -\sin\theta(x, y, t) & \cos\theta(x, y, t) & 0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{21} \]
where θ = θ(x, y, t) measures, say, the counterclockwise rotation between the
x-axis and the vector $f_1$. The non-vanishing Christoffel symbols of the second kind
of the induced material connection $\Gamma^I_{KJ}$ can now be calculated directly from (5) as
\[ \Gamma^1_{21} = -\Gamma^2_{11} = \theta_{,x}, \tag{22} \]
\[ \Gamma^1_{22} = -\Gamma^2_{12} = \theta_{,y}, \tag{23} \]
whence the non-vanishing torsion components are:
\[ T^1_{12} = -T^1_{21} = -\theta_{,x}, \tag{24} \]
\[ T^2_{12} = -T^2_{21} = -\theta_{,y}. \tag{25} \]
Similarly, the non-vanishing components of the inhomogeneity velocity gradient at
the reference crystal are
\[ L_{12} = -L_{21} = \theta_{,t}. \tag{26} \]
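As a numerical sanity check of (22)–(25), the following Python sketch evaluates the Christoffel symbols (5) and the torsion (9) by central finite differences for a planar rotation field. The field `theta`, the step `h` and all function names are illustrative assumptions and not part of the original analysis; `Q(x, y)[I][alpha]` stores the component $Q^I_{\ \alpha}$, so that $f_1 = (\cos\theta, \sin\theta)$, and indices are zero-based in the code.

```python
import math

def theta(x, y):
    # Hypothetical smooth rotation-angle field used only for this check.
    return 0.3 * math.sin(x) + 0.5 * x * y

def Q(x, y):
    # Q[I][alpha]: Cartesian components of the uniform reference vectors f_alpha.
    t = theta(x, y)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def dQ(x, y, J, h=1e-6):
    # Central difference of Q with respect to x (J = 0) or y (J = 1).
    dx, dy = (h, 0.0) if J == 0 else (0.0, h)
    Qp, Qm = Q(x + dx, y + dy), Q(x - dx, y - dy)
    return [[(Qp[I][a] - Qm[I][a]) / (2 * h) for a in range(2)] for I in range(2)]

def christoffel(x, y):
    # G[I][K][J] = -sum_alpha Q^I_{alpha,J} (Q^T)^alpha_K, as in (5).
    q = Q(x, y)
    G = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for J in range(2):
        d = dQ(x, y, J)
        for I in range(2):
            for K in range(2):
                G[I][K][J] = -sum(d[I][a] * q[K][a] for a in range(2))
    return G

def torsion(G):
    # T[I][K][J] = G[I][K][J] - G[I][J][K], as in (9).
    return [[[G[I][K][J] - G[I][J][K] for J in range(2)]
             for K in range(2)] for I in range(2)]
```

With these conventions `christoffel(x, y)[0][1][0]` approximates $\theta_{,x}$ and `torsion(...)[0][0][1]` approximates $-\theta_{,x}$, in agreement with (22) and (24).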
The most general evolution law (20) results (after some calculation effort) in
the single quasi-linear partial differential equation
\[ \theta_{,t} + (a\cos\theta - b\sin\theta)\,\theta_{,x} + (a\sin\theta + b\cos\theta)\,\theta_{,y} = 0, \tag{27} \]
where a and b are, respectively, the material constants $2C^{1}{}_{21}{}^{12}$ and $2C^{1}{}_{22}{}^{12}$. These
are the only two material constants left due to the skew-symmetry of the torsion
tensor and the form of the uniformity maps (21).
We may further simplify the form of the evolution equation (27) by writing it as
a single nonlinear balance law for the new variable β
β, t + c(sin β), x − c(cos β), y = 0,
(28)
where $\beta \equiv \theta + \theta_0$, $c = \sqrt{a^2 + b^2}$, and where $\theta_0$ is such that $\tan\theta_0 = b/a$.
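The change of variable can be checked numerically by comparing the coefficients of $\theta_{,x}$ and $\theta_{,y}$ in (27) with $c\cos\beta$ and $c\sin\beta$. The sketch below is a minimal illustration: it assumes that $\theta_0$ is taken on the branch with $\cos\theta_0 = a/c$ and $\sin\theta_0 = b/c$ (so that the amplitude is positive), and the function names are ours.

```python
import math

def amplitude_phase(a, b):
    # Amplitude c and phase theta0 with a = c*cos(theta0), b = c*sin(theta0).
    return math.hypot(a, b), math.atan2(b, a)

def coefficients_27(a, b, theta):
    # Coefficients of theta_x and theta_y as they appear in (27).
    return (a * math.cos(theta) - b * math.sin(theta),
            a * math.sin(theta) + b * math.cos(theta))

def coefficients_in_beta(a, b, theta):
    # The same coefficients written through beta = theta + theta0.
    c, theta0 = amplitude_phase(a, b)
    beta = theta + theta0
    return c * math.cos(beta), c * math.sin(beta)
```

For every choice of a, b (not both zero) and θ the two pairs agree to rounding error, and the amplitude returned for a = b = 1 is $\sqrt{2}$.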
The characteristic strips [2] of this equation are solutions of the following system of ordinary differential equations:
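\[ \frac{dt}{ds} = 1, \tag{29} \]
\[ \frac{dx}{ds} = c\cos\beta, \tag{30} \]
\[ \frac{dy}{ds} = c\sin\beta, \tag{31} \]
\[ \frac{d\beta}{ds} = 0, \tag{32} \]
\[ \frac{d\beta_{,t}}{ds} = -c\,\beta_{,t}\,[\beta_{,y}\cos\beta - \beta_{,x}\sin\beta], \tag{33} \]
\[ \frac{d\beta_{,x}}{ds} = -c\,\beta_{,x}\,[\beta_{,y}\cos\beta - \beta_{,x}\sin\beta], \tag{34} \]
\[ \frac{d\beta_{,y}}{ds} = -c\,\beta_{,y}\,[\beta_{,y}\cos\beta - \beta_{,x}\sin\beta]. \tag{35} \]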
As is well known, the quasi-linearity of the single partial differential equation
has several important consequences. Firstly, for given initial conditions x(0), y(0),
t (0) and β(x(0), y(0), 0), the first four equations can be solved independently from
the last three. A line x(s), y(s), t (s) thus obtained is called a characteristic curve
or simply a characteristic. Equation (32) implies that β is constant along each
characteristic. Moreover, the parameter s, according to (29), can be identified with
time t, except for an arbitrary additive constant. Finally, the constancy of β implies
that along a characteristic the right-hand sides of equations (30) and (31) are constant, and therefore that the characteristics are actually straight lines. The values
of the material constants, together with the initial condition, determine whether or
not the characteristics will tend to converge (intersect) or diverge. In the former
case, we will observe the creation of dislocation pile-ups, while the latter is a
representation of the tendency of the dislocations to dissipate after the passage
of a long enough time. Indeed, the general Cauchy problem for such a balance law
has, as is well known [1], no smooth global solution even for smooth compactly
supported initial conditions. A solution stays temporarily smooth but eventually develops singularities. The blow-up of a smooth solution, which in the context of our
model we identify with a dislocation pile-up, occurs when the spatial gradient of β
becomes unbounded. In a one-dimensional case, given any particular initial distribution of inhomogeneities, it is rather elementary to determine, as shown in [3],
such propagation characteristics as the blow-up time, the speed of propagation
(Rankine–Hugoniot condition), and the propagation condition for the amplitude of
the pile-up. Moreover, looking at the Rankine–Hugoniot condition for the evolution
equation (28), whether planar or one-dimensional, it is easy to realize a possibility
of the occurrence of a stationary pile-up, i.e., a singular pattern of inhomogeneities
which will not propagate.
5. Examples
For the sake of being even more specific and to better illustrate each of the above
mentioned types of evolutions, let us restrict further our analysis to the one-dimensional case by assuming that the uniform references depend only on one Cartesian
coordinate, say y. This renders the evolution equation particularly simple, namely:
β, t + cβ, y sin β = 0.
(36)
The general Cauchy problem for such a balance law has, as is well known, no
smooth global solution even for smooth compactly supported initial conditions.
A solution stays temporarily smooth but eventually develops singularities. The
blow-up of a smooth solution, which in the context of our model we identify with
a dislocation pile-up, occurs when $\beta_{,y}$ becomes unbounded. It is easy to show by
integrating along characteristics that this is possible provided
\[ c\,\beta_0'(y)\cos\beta_0(y) < 0 \tag{37} \]
at some $y \in \mathbb{R}$, where $\beta_0(y) \equiv \beta(y, 0)$ and where $k(y) \equiv c\cos\beta_0(y)$ is obviously
constant along the characteristics. The actual breaking of a continuous solution will
be observed at the critical time
\[ t_c \equiv \min_y \frac{-1}{c\,\beta_0'(y)\cos\beta_0(y)}. \tag{38} \]
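The condition (37) and the critical time (38) are easy to evaluate on a grid of initial positions. The following Python sketch does so; the grid, the function names and the value $c = \sqrt{2}$ used in the usage note are assumptions made for illustration only, and the minimum is taken over those grid points at which (37) holds.

```python
import math

def blowup_time(beta0, dbeta0, c, ys):
    """Estimate t_c of (38) on the grid ys.

    beta0  : initial profile beta_0(y)
    dbeta0 : its derivative beta_0'(y)
    Returns math.inf when condition (37) holds nowhere on the grid.
    """
    # The most negative value of c*beta0'*cos(beta0) gives the earliest breaking.
    worst = min(c * dbeta0(y) * math.cos(beta0(y)) for y in ys)
    if worst >= 0.0:
        return math.inf
    return -1.0 / worst

# Uniform grid on [-5, 5] (an arbitrary choice for this sketch).
grid = [i / 1000.0 for i in range(-5000, 5001)]
```

For instance, `blowup_time(lambda y: -math.atan(y), lambda y: -1.0 / (1.0 + y * y), math.sqrt(2.0), grid)` evaluates to about 0.707, that is $\sqrt{2}/2$, whereas the profile $\beta_0(y) = \arctan y$ gives `math.inf`.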
Such a singularity, once developed, will propagate, as implied by the Rankine–
Hugoniot condition, with the speed
\[ v = -c\,\frac{[\cos\beta]}{[\beta]} \tag{39} \]
along the shock-curve $y = \Gamma(s)$, where $(d/ds)\Gamma(s) = v(y(s), s)$. The evolution of
the amplitude $[\beta]$ of such a shock is given by the propagation condition
\[ \widetilde{[\beta]} = -c\left\{ \frac{[\cos\beta]}{[\beta]}\,[\beta_{,y}] + [\beta_{,y}\sin\beta] \right\}, \tag{40} \]
where $[f(\beta)] \equiv f(\beta^{+}) - f(\beta^{-})$ denotes the jump of the quantity f across the
shock-curve $\Gamma$, and where $\widetilde{[\beta]}$ indicates differentiation along $\Gamma$. Using the method
of singular surfaces the propagation of such a singularity can be further analyzed
by developing the infinite system of iterated compatibility conditions and solving
it numerically.
To show the relation between the form of the initial condition and the choice
of the material constants a and b we briefly discuss here some one-dimensional
evolution initial-value problems.
(i) Suppose that a = b = 1 and let $\beta_0(y) = \arctan y$. As $\beta_{0,y} > 0$ and $\cos\beta_0 > 0$, the condition (37) is never satisfied, proving that no pile-up of dislocations will ever
occur. A simple analysis of characteristics shows, in fact, that the solution
θ(y) tends asymptotically to −π/4 at every y ∈ R.
(ii) Let $\beta_0(y) = -\arctan y$ and let us keep the same material constants. This initial condition, in contrast to the previous one, will develop, as easily attested
by (37), into a shock. In fact, investigating the arrangement of characteristics
and calculating the critical blow-up time (38) one arrives at the conclusion
that the two shocks travelling in opposite directions (one front-shock and one
back-shock) will develop at the same time $t_c = \sqrt{2}/2$.
(iii) Suppose a = b = 1 and select a symmetric (about y = 0) initial condition,
e.g., $\beta_0(y) = (\pi/2)\,\mathrm{sech}\, y$. An elementary analysis of characteristics shows
that this solution will blow up in finite time into a front shock. Changing
the material constants to a = −b = −1, but keeping the initial condition
unchanged, will make very little difference. Indeed, rewriting the evolution
equation for the new material constants as $\beta_{,t} + \sqrt{2}\,\beta_{,y}\cos\beta = 0$ one
can easily conclude that the new solution also blows up in finite time. However, a different part of the initial condition contributes now to the pile-up,
slowing down its occurrence and propagation considerably.
(iv) As the last example we consider the rotationally symmetric planar problem.
In other words, we seek a solution to the evolution equation (28) such that
it is invariant at all times t ≥ 0 with respect to rotations about the origin.
Rewriting equation (28) in the polar coordinates $(\varrho, \psi)$ we obtain
\[ \beta_{,t} + c\,\beta_{,\varrho}\sin(\beta - \psi) - \frac{c}{\varrho}\,\beta_{,\psi}\cos(\beta - \psi) = 0, \tag{41} \]
where $\beta = \beta(\varrho, \psi, t)$. The solution β is truly rotationally invariant provided
\[ (\beta - \psi)_{,\psi} = 0. \tag{42} \]
Hence,
\[ \beta(\varrho, \psi, t) = \psi + F(\varrho, t), \tag{43} \]
where
\[ F_{,t} + c\,F_{,\varrho}\sin F - \frac{c}{\varrho}\cos F = 0. \tag{44} \]
What we have now is a one-dimensional balance law with a source. The characteristic curves are no longer straight lines and the solution F is no longer
constant along characteristics. The initial value problem is well posed only
locally in time. As in the case of a conservation law, the solution of (44)
generally stays smooth only up to some critical time at which a singularity
develops. Moreover, the source term may even cause the singular solution to
become unbounded in finite time and, if dissipative enough, it may altogether
prevent the breaking of some relatively weak waves. Note also that the source
term of (44) plays a prominent role close to the origin while it is negligible
very far away from the center. Indeed, the proximity of defects increases the
density of defects which, in turn, as expected, influences their evolution in a
more significant way.
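The behaviour described in example (i) can be reproduced by integrating (36) along its straight characteristics: before any breaking, β(y, t) equals $\beta_0(y_0)$, where $y_0$ solves $y = y_0 + c\,t\sin\beta_0(y_0)$. The Python sketch below solves this relation by bisection. It is only an illustration: it assumes that the map $y_0 \mapsto y_0 + c\,t\sin\beta_0(y_0)$ is increasing (true for $\beta_0 = \arctan$), the bracket and iteration count are arbitrary, and the function name is ours.

```python
import math

def beta_on_characteristic(beta0, c, y, t, lo=-1.0e6, hi=1.0e6, iters=200):
    """Return beta(y, t) for equation (36) with initial profile beta0.

    Solves y = y0 + c*t*sin(beta0(y0)) for y0 by bisection, assuming the
    left-hand map is increasing in y0 and that [lo, hi] brackets the root.
    """
    def g(y0):
        return y0 + c * t * math.sin(beta0(y0)) - y

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return beta0(0.5 * (lo + hi))
```

With `beta0 = math.atan` the returned value decays towards zero as t grows at any fixed y, so that θ = β − π/4 approaches −π/4, as stated in example (i).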
Acknowledgements
This work has been supported in part by the Natural Sciences and Engineering
Research Council of Canada (NSERC). The work was done in part when the second
author was visiting the University of Calgary in September–December 2001. The
financial support for this visit was provided by NSERC, the Department of Mathematics and Statistics of the University of Calgary, and Portland State University.
References
1. C.M. Dafermos, Hyperbolic Conservation Laws in Continuum Physics. Springer, New York (2000).
2. G.F.D. Duff, Partial Differential Equations. University of Toronto Press, Toronto (1956).
3. M. Elżanowski and M. Epstein, On the intrinsic evolution of material inhomogeneities. In: Proc. of the 2nd Canadian Conf. on Nonlinear Solid Mechanics, Vancouver, Canada (2002) in press.
4. M. Elżanowski, M. Epstein and J. Śniatycki, G-structures and material homogeneity. J. Elasticity 23(2/3) (1990) 167–180.
5. M. Epstein, A question of constant strain. J. Elasticity 17 (1987) 23–34.
6. M. Epstein and G.A. Maugin, On the geometrical material structure of anelasticity. Acta Mech. 115 (1996) 119–134.
7. M. Epstein and G.A. Maugin, Thermomechanics of volumetric growth in uniform bodies. Internat. J. Plasticity 16 (2000) 951–978.
8. M.E. Gurtin, A gradient theory of single crystal viscoplasticity that accounts for geometrically necessary dislocations. J. Mech. Phys. Solids 50 (2002) 5–32.
9. P.M. Naghdi and A.R. Srinivasa, A dynamical theory of structured solids. I Basic developments. II Special constitutive equations and special cases of the theory. Phil. Trans. Roy. Soc. London A 345 (1993) 425–476.
10. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
11. C. Truesdell and W. Noll, The Non-Linear Field Theories of Mechanics. Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
12. C.-C. Wang, On the geometric structure of simple bodies, a mathematical foundation for the theory of continuous distributions of dislocations. Arch. Rational Mech. Anal. 27 (1967) 33–94.
13. C.-C. Wang and C. Truesdell, Introduction to Rational Elasticity. Noordhoff, Leyden (1973).
On the Theory of Rotation Twins in Crystal
Multilattices
J.L. ERICKSEN
5378 Buckskin Bob Rd., Florence, OR 97439, U. S. A.
Received 1 May 2002; in revised form 1 October 2002
Abstract. Rotation twins form a subset of twins in crystals which at least closely resemble many
of the twins that are observed. My purpose is to characterize all solutions of this kind for twinning
equations in the X-ray theory of crystals. An analysis of a common kind of growth twins in staurolite
is presented.
Mathematics Subject Classifications (2000): 74E15, 82D25.
Key words: twinning theory, continuum theory of crystals.
Dedicated to the memory of Clifford Truesdell
1. Introduction
The definition of rotation twins to be studied here is that given by Barrett and
Massalski [1, p. 406],
“Crystals are rotation twins if a two-, three-, four- or sixfold rotation about
a twinning axis produces the orientation of the other. The rotation axis lies
either in the twinning plane or normal to it and is not a symmetry element of
the individual crystals.”
Of course, use of the adjective “rotation” is reasonably interpreted as implying
that not all configurations called twins are rotation twins. Except for the two-fold
possibility, which applies to almost all mechanical twins, all examples known to
me occur in growth twins. Here, my purpose is to describe all solutions of the
twinning equations in my [2] X-ray theory for such twins. I [3, 4] have described
how these equations are a bit different from some others used in studies of mechanical twins, which cannot reasonably be applied to growth twins. Also, I [5] have
verified that my equations describe several different kinds of growth twins that
are well-established in quartz, excepting those for which the two sides represent
crystallographically inequivalent surfaces.
It should be noted that workers have been unable to agree on a general definition of twins. In the words of one expert, Cahn [6, Section 1.1], one of the best
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 357–373.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
known proposals, attributed to Friedel, describes a “true twin” as involving a pair
of configurations such that “. . . the two crystals can be brought into one congruent
configuration by reflection in a lattice plane of low indices, or by a rotation through
60°, 90°, 120° or 180° about a lattice row of low indices.” The decision as to how
low the indices must be seems to be left to the individual, but practice favors using
single digits. Generally, Cahn seems to take this view fairly seriously, although
he describes what he considers to be some rare exceptions. Other workers call
other kinds of exceptions twins. Structural considerations motivated Hartman [7]
to exclude existence of three-, four- and six-fold rotations in his definition of twins.
Given that all measurements are subject to some error, there is no way to confirm
experimentally that the rotation involved in one of these examples is exactly 90°,
for example. Particularly in minerals, various kinds of complicated intergrowths
occur, and most workers are not happy to call all of these twins. I note that Barrett
and Massalski do not mention that the rotation axes should be restricted to the kind
of crystallographic directions mentioned by Cahn, and I will not assume this. However, according to my theory, it turns out that, in most cases, these axes are parallel
to some (parallel) rows of atoms or to normals of some (parallel) crystallographic
planes, and I will note kinds of cases where one of these conditions must hold.
I won’t expend more ink in trying to describe all of the different views on what
should be meant by a twin. For present purposes, I will regard an intergrowth as a
twin if it is described by some nontrivial solution of my twinning equations.
2. Background
A crystal n-lattice, pictured as filling all of space, consists of n geometrically
identical lattices translated relative to each other, with lattice vectors $e_a$ and their
duals $e^a$, the reciprocal lattice vectors. Physically, the atoms in any one of the
lattices are identical, but atoms in different lattices can be the same or different. To
describe the relative translations occurring when n > 1, we use shift vectors $p_i$,
i = 1, . . . , n − 1. Pick one atomic position in each of the lattices and take one as
a base point. Then the shifts are position vectors of the others relative to the base
point. For a given configuration, there are infinitely many ways of choosing the
vectors, $(e_a, p_i)$ and $(\bar e_a, \bar p_i)$ being two possibilities if the first is and the second
satisfies
\[ \bar e_a = m_a^b\, e_b \iff \bar e^a = (m^{-1})^a_b\, e^b, \tag{2.1} \]
where $m = \| m_a^b \|$ is a unimodular matrix of integers, and
\[ \bar p_i = \alpha_i^j\, p_j + l_i^a\, e_a, \qquad l_i^a \in \mathbb{Z}, \tag{2.2} \]
the matrices $\alpha = \| \alpha_i^j \|$ being discussed in some detail by Pitteri [8] and Ericksen [9]. Briefly, they describe interchanging identical atoms. Here, the detailed
descriptions of them are not important. In dealing with matrices, my convention
is that the lower index labels rows. The transformations (2.1) and (2.2) form an
infinite discrete group, with the group product indicated by
\[ \{\bar m, \bar\alpha, \bar l\} \cdot \{m, \alpha, l\} \overset{\mathrm{def}}{=} \{\bar m m,\ \bar\alpha\alpha,\ \bar\alpha l + \bar l m\}. \tag{2.3} \]
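The product rule (2.3) can be checked by letting the transformations act on sample data. In the Python sketch below, which is only an illustration under stated assumptions, the rows of `E` are lattice vectors $e_a$, the rows of `P` are shifts $p_i$, and `m[a][b]` stores $m_a^b$ in keeping with the convention that the lower index labels rows; the function names are ours, and the product is read as applying the unbarred transformation first.

```python
def matmul(A, B):
    # Plain matrix product of nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def act(g, E, P):
    # Apply (2.1) and (2.2): new lattice vectors m E, new shifts alpha P + l E.
    m, alpha, l = g
    return matmul(m, E), matadd(matmul(alpha, P), matmul(l, E))

def product(g_bar, g):
    # Group product (2.3): first g, then g_bar.
    m_bar, alpha_bar, l_bar = g_bar
    m, alpha, l = g
    return (matmul(m_bar, m),
            matmul(alpha_bar, alpha),
            matadd(matmul(alpha_bar, l), matmul(l_bar, m)))
```

Acting with `g` and then with `g_bar` gives the same vectors as acting once with `product(g_bar, g)`, which is the content of (2.3).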
For each configuration of an n-lattice with n > 1, we have four finite groups that
are relevant here, the lattice group
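\[ L(e_a, p_i) = \left\{ \{m, \alpha, l\} \;\middle|\; m_a^b e_b = Q e_a,\ \alpha_i^j p_j + l_i^a e_a = Q p_i,\ Q \in O(3) \right\}, \tag{2.4} \]
the skeletal lattice group
\[ L(e_a) = \left\{ m \;\middle|\; m_a^b e_b = Q e_a,\ Q \in O(3) \right\}, \tag{2.5} \]
the point group
\[ P(e_a, p_i) = \left\{ Q \;\middle|\; Q e_a = m_a^b e_b,\ Q p_i = \alpha_i^j p_j + l_i^a e_a,\ \{m, \alpha, l\} \in L(e_a, p_i) \right\}, \tag{2.6} \]
and the skeletal point group
\[ P(e_a) = \left\{ Q \;\middle|\; Q e_a = m_a^b e_b,\ m \in L(e_a) \right\}. \tag{2.7} \]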
Often, the latter is called the holohedral point group or holohedry. For a Bravais
lattice (1-lattice), the lattice and point groups are just the skeletal groups.
While workers seem unable to agree on a general definition of twins, they are
generally considered to involve jump discontinuities in lattice vectors and/or shifts
although, frequently, the latter are not considered explicitly. Normally, at least tacitly, these are considered as they occur in crystals at constant temperature that are
unstressed, or in which the stress is a constant hydrostatic pressure, with $e_a$, $e^a$
and $p_i$ piecewise constant. So, assume this and the fact that the aforementioned
Barrett–Massalski description of rotation twins presumes this. In this context, the
twinning equations I [2] proposed are of the form
\[ \bar e^a = (1 - n \otimes a)\, e^a = Q\, (m^{-1})^a_b\, e^b \iff \bar e_a = (1 - n \otimes a)^{-T} e_a = Q\, m_a^b\, e_b, \tag{2.8} \]
and
\[ \bar p_i = Q\left( \alpha_i^j\, p_j + l_i^a\, e_a \right), \qquad Q \in O(3). \tag{2.9} \]
Here, (ea , pi ) and (ēa , p̄i ) are values of these vectors on the two sides of the discontinuity surface, n being its unit normal, m, α and l being some choice of the
matrices referred to in (2.1) and (2.2), a being some vector not having a definite
physical interpretation. One finds it by solving (2.8), when possible, which depends
on the nature of data available for the other variables. Here, the description of rotation twins gives some information about Q, that it is a rotation with axis parallel or
perpendicular to n, angles of rotation being those commonly encountered in various
studies of crystals. There are solutions of (2.8) with Q a rotation with axis making
a different angle with n. Our task is to characterize all choices of the other variables
in (2.8) and (2.9) for the indicated possibilities of Q and n. In this setting, (2.9) is
almost trivial. For given pi , we could satisfy it by taking p̄i = Qpi , for example.
Of course, in trying to match a solution to some observations, one should match
the observed shifts, when data on these is available. In addition, ea and pi should
satisfy some equilibrium equations, which I [2] have described, but are not used
explicitly here. Whether they can be satisfied by reasonably stable configurations
for any one of the kinds of configurations discussed depends on the nature of the
particular constitutive equations considered.
It follows from (2.8) that
det(1 − n ⊗ a) = ±1,
(2.10)
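which distinguishes two kinds of possibilities. The upper sign gives
\[ \text{S-twins with}\quad a \cdot n = 0 \ \Rightarrow\ (1 - n \otimes a)^{-T} = 1 + a \otimes n, \tag{2.11} \]
and the lower gives
\[ \text{O-twins with}\quad a \cdot n = 2 \ \Rightarrow\ (1 - n \otimes a)^{-T} = 1 - a \otimes n. \tag{2.12} \]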
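The algebra behind (2.10)–(2.12) can be confirmed with exact integer arithmetic: $\det(1 - n \otimes a) = 1 - a \cdot n$, and the closed forms in (2.11) and (2.12) invert $(1 - n \otimes a)^{T}$. The Python sketch below is a minimal check; the sample vectors and the function names are illustrative assumptions.

```python
def outer(u, v):
    # Matrix of the dyad u ⊗ v.
    return [[x * y for y in v] for x in u]

def identity():
    return [[1 if r == c else 0 for c in range(3)] for r in range(3)]

def combine(A, B, s):
    # Return A + s*B for 3x3 matrices.
    return [[A[r][c] + s * B[r][c] for c in range(3)] for r in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def twin_kind(n, a):
    # "S" or "O" according to the sign in (2.10); None if (2.10) fails.
    d = det3(combine(identity(), outer(n, a), -1))
    return {1: "S", -1: "O"}.get(d)

def closed_form_inverts(n, a):
    # True if 1 + a⊗n (S) or 1 - a⊗n (O) is the inverse of (1 - n⊗a)^T.
    sign = {"S": 1, "O": -1}[twin_kind(n, a)]
    candidate = combine(identity(), outer(a, n), sign)
    F_T = transpose(combine(identity(), outer(n, a), -1))
    return matmul(F_T, candidate) == identity()
```

For n = (1, 0, 0), the choice a = (0, 2, −3) is classified as an S-twin and a = (2, 5, 7) as an O-twin, and `closed_form_inverts` returns `True` in both cases.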
Here, the S and O refer to the fact that $e_a$ and $\bar e_a$ or, equivalently, $e^a$ and $\bar e^a$ have the
same or opposite orientations, respectively. As will become clear, rotation twins of
both kinds or at least configurations closely resembling them are encountered, in
practice. For some deformation twins, S-twin solutions of (2.8) are used, with a
interpreted as the amplitude of a simple shearing deformation. The X-ray theory
does not require this interpretation. Note that, if (2.8) is satisfied for some values
of Q and m, it is also satisfied if we simply replace these by −Q and −m, leaving
the remaining arguments unchanged. However, it can be that (2.9) is satisfied by
one of these and not the other, for equivalent sets of shifts. In considering rotation
twins, I interpret the description as implying that we should assume that
Q = R ∈ SO(3),
(2.13)
from which it follows that
det m = 1 for S-twins,
det m = −1 for O-twins.
(2.14)
Obviously, it is easy to remove the restriction (2.13). Henceforth, I write R in place
of Q whenever these are considered as rotations and denote by $R_v^\psi$ the rotation
through angle ψ with axis in the direction of v.
There are solutions of (2.8) and (2.9) sometimes called fake twins, because the
discontinuities involved are not visible in typical X-ray observations. One kind
consists of the
lattice invariant shears: S-twins with Q = 1.
(2.15)
Sometimes, workers compose some of these with certain other solutions to adjust
values of a or n. The other kind involves solutions with
{m, α, l} ∈ L(ea , pi ),
(2.16)
perhaps combined with a suitable lattice invariant shear. It is easy to show that,
for S-twins of this kind, a = 0 and $Q^T$ is the corresponding element of $P(e_a, p_i)$.
For O-twins of this kind, a = 2n and $-Q^T R_n^\pi$ is the corresponding element of
$P(e_a, p_i)$. One can also compose these with another solution to get alternative
descriptions of the same twin, with different values of Q, m, etc. For n-lattices with
n > 1, having suitable symmetry, one can have somewhat similar but nontrivial
solutions not of this kind, for example,
\[ \text{S-twins with}\quad m \in L(e_a),\quad m \notin L(e_a, p_i) \ \Rightarrow\ a = 0, \tag{2.17} \]
n then being arbitrary. This applies to at least some twins called penetration twins.
I call them penetration twins, not implying that all things called this fit (2.17).
I am still looking for other possibilities for describing these. According to my [5]
analyses of Brazil and Dauphiné twins in α-quartz, they are of this kind. Observations of Dauphiné twins indicate that the discontinuity surfaces are rather random
surfaces. The isometry can be taken as a certain 180° rotation. Theoretically, one
could pick planes such that these fit the description of rotation twins and workers
have done this in considering occurrence of these in some Japan twins in quartz,
for example. The surfaces associated with Brazil twins are different, usually of
zigzag form, involving pieces of various kinds of crystallographic planes. I know
of no theoretical reason for this difference. The isometry involved can be taken as
a central inversion. There is also the analog of (2.17)
\[ \text{O-twins with}\quad m \in L(e_a),\quad m \notin L(e_a, p_i) \ \Rightarrow\ a = 2n \ \Rightarrow\ 1 - n \otimes a = -R_n^\pi. \tag{2.18} \]
One could also call these penetration twins, but I am not yet sure how consistent
this is with practice, although we will see an example that is. So, for the present,
I call them exceptional O-twins. According to my [5] analysis of Friedel twins in
α-quartz, (2.18) applies to them and they fit the description of rotation twins. The
isometry can be taken as a 90° rotation, with axis perpendicular to n. Theoretically,
the interfaces are orthogonal planes, and I do not know of any observations that are
inconsistent with this.
Those who have a little familiarity with twinning studies have encountered
examples of the rotation twins of the two-fold kind, said to be of types I and II.
Analyses of these are presented by Pitteri [10, 11]. These are S-twins, so a · n = 0
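and, for both, $m^2 = 1$, $\det m = 1$. This implies that m is of the form
\[ m_\alpha^\beta = -\delta_\alpha^\beta + u_\alpha v^\beta, \qquad u_\alpha v^\alpha = 2, \tag{2.19} \]
where $u_\alpha$ and $v^\alpha$ can be taken as integers. For both, the isometries are 180° rotations. Solving (2.8) gives
\[ \text{type I:}\quad Q = R_n^\pi, \qquad n \otimes a = u \otimes \left( \frac{2u}{|u|^2} - v \right), \tag{2.20} \]
and
\[ \text{type II:}\quad Q = R_a^\pi, \qquad n \otimes a = \left( u - \frac{2v}{|v|^2} \right) \otimes v, \tag{2.21} \]
where
\[ u = u_a e^a, \qquad v = v^a e_a. \tag{2.22} \]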
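The solutions (2.20) and (2.21) can be tested numerically against the S-twin form of (2.8), $(1 + a \otimes n)\, e_a = Q\, m_a^b\, e_b$. The Python sketch below does this for an arbitrary lattice; the sample lattice, the integers $u_a$, $v^a$ and all function names are illustrative assumptions, and n is normalized to unit length with the magnitude absorbed into a.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def axpy(s, u, v):
    # Return s*u + v.
    return [s * x + y for x, y in zip(u, v)]

def reciprocal(e):
    vol = dot(e[0], cross(e[1], e[2]))
    return [[c / vol for c in cross(e[(k + 1) % 3], e[(k + 2) % 3])]
            for k in range(3)]

def half_turn(axis, w):
    # 180-degree rotation of w about the direction of axis.
    return axpy(2.0 * dot(axis, w) / dot(axis, axis), axis, [-x for x in w])

def twin_residual(e, u_int, v_int, kind):
    """Largest error in (1 + a⊗n) e_k = Q m_k^b e_b, and the value of a·n."""
    e_up = reciprocal(e)
    u = [sum(u_int[k] * e_up[k][c] for k in range(3)) for c in range(3)]
    v = [sum(v_int[k] * e[k][c] for k in range(3)) for c in range(3)]
    if kind == "I":      # (2.20): n along u, Q = R_n^pi
        w = axpy(-1.0, v, [2.0 * x / dot(u, u) for x in u])
        size = math.sqrt(dot(u, u))
        n, a, axis = [x / size for x in u], [size * x for x in w], u
    else:                # (2.21): a along v, Q = R_a^pi
        w = axpy(-2.0 / dot(v, v), v, u)
        size = math.sqrt(dot(w, w))
        n, a, axis = [x / size for x in w], [size * x for x in v], v
    worst = 0.0
    for k in range(3):
        m_e = axpy(u_int[k], v, [-x for x in e[k]])   # m_k^b e_b from (2.19)
        lhs = half_turn(axis, m_e)
        rhs = axpy(dot(n, e[k]), a, e[k])
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
    return worst, dot(a, n)
```

With, say, `e = [[1, 0, 0], [0.2, 1.1, 0], [0.3, 0.4, 0.9]]`, `u_int = [1, 1, 0]` and `v_int = [2, 0, 1]` (so that $u_\alpha v^\alpha = 2$), both kinds give a residual and a value of a · n at rounding-error level.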
As is known from studies of deformation twins, it follows from (2.20) that, for
type I twins, n is rational, meaning that there is a vector in this direction with
integer components relative to the basis $e^a$. Such directions are normal to crystallographic planes. For type II twins, it is also known that (2.22) implies that a is
rational in a different sense, here meaning that there is a vector in this direction with
integer components relative to the basis $e_a$. Rows of atoms have such directions. Of
course, these are also the directions of the axes of rotation. Most mechanical twins
are of type I. Like the lattice invariant shears, these solutions are available for any
set of lattice vectors. Pitteri and Zanzotto [12] show that these are the only S-twins
with this property.
As is discussed in slightly different ways by Adeleke [13] and Ericksen [14],
one subset of rotation twins has been characterized completely,
\[ \text{rotation S-twins with}\quad Rn = n,\quad a = 0. \tag{2.23} \]
From the analyses of these, it follows that n, the axis of rotation, must be parallel to
a normal of some crystallographic planes and m must be similar to Q. Adeleke [13]
describes all S-twin solutions of (2.8) with a ≠ 0, making our task relatively easy.
In one respect, possibilities with a = 0 are somewhat like those mentioned above
for Dauphiné twins, since a value of Q included in any point group satisfies $Q^N = 1$
for N = 1, 2, 3, 4 or 6. Thus, take Q = R ≠ 1 to be a rotation of this kind and you
can pick n to be parallel to its axis, so you then have a description of all rotation
S-twins such that Rn = n. For any set of lattice vectors, R ∈ P (ea ) implies that
the axis of R is normal to some crystallographic planes. If you are not familiar
with this, it is easily seen by taking the inner product of the equation in (2.7) with
the axis, so this property extends to these penetration twins, assuming we are not
dealing with the trivial fake twins. Thus, we can consider these possibilities to be
characterized.
3. S-twins with $\mathbf{R}^N = \mathbf{1}$, N = 3, 4 or 6
Here, N is considered to be the smallest integer such that $\mathbf{R}^N = \mathbf{1}$. The analogous
assumption is made elsewhere. From the above remarks, the only S-twins that
need to be considered are those for which the axis of R is normal to n. The cases
indicated in the heading all have the property that the rotations do not take a plane
with normal n to itself. Such cases are special cases of the solutions covered by
Adeleke [13], in his subcase 3.3.2.1, so it is only necessary to specialize his results.
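Let i, j, k be an orthonormal basis with
$$\mathbf{n} = \mathbf{i},\qquad \mathbf{R}\mathbf{k} = \mathbf{k},\qquad \mathbf{R} = \mathbf{R}_{\mathbf{k}}^{\psi}, \tag{3.1}$$
for any of the angles ψ fitting the powers listed in the heading. There are some
restrictions on m, namely
$$\text{(a)}\quad m \text{ must have 1 as an eigenvalue} \;\Rightarrow\; m^a_b x^b = x^a, \tag{3.2}$$
where the $x^a$ are not all zero, and
$$\text{(b)}\quad \text{for some real numbers } y^a,\ x^a,\ y^a \text{ and } m^a_b y^b \text{ are linearly independent.} \tag{3.3}$$
In the following, one can use any such values of m and $y^a$. Translating Adeleke’s
notation to mine, one gets that $\mathbf{e}^a$ is obtained by solving
$$\mathbf{e}^a \cdot \mathbf{k} = x^a,\qquad \mathbf{e}^a \cdot \mathbf{j} = y^a,\qquad \mathbf{e}^a \cdot \mathbf{R}^T\mathbf{j} = m^a_b y^b, \tag{3.4}$$
from which
$$\mathbf{k} = x^a \mathbf{e}_a. \tag{3.5}$$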
If the eigenspace corresponding to 1 is one-dimensional, the $x^a$ are, to within a
scalar factor, integers, so k is parallel to some rows of atoms. Obviously, this is the
case if 1 is an isolated eigenvalue. The other logical possibility is that the eigenvalues are 1, 1, 1. Then, one can use the Jordan Canonical Theorem on matrices to
show that, if $m \neq 1$ and if m does not correspond to a lattice invariant shear, trivial
possibilities, this eigenspace is one-dimensional. Adeleke [13] uses this theorem
repeatedly. Using the results above, I calculate that
$$\mathbf{e}^a = \left(\csc\psi\, m^a_b y^b - \cot\psi\, y^a\right)\mathbf{i} + y^a\mathbf{j} + x^a\mathbf{k}. \tag{3.6}$$
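As a numerical sanity check, the expression (3.6) can be substituted back into the three conditions (3.4). The sketch below is not part of the original analysis: the function names and the sample values of m, $x^a$, $y^a$ are hypothetical, and the array `m[a][b]` is assumed to store $m^a_b$. It only tests that (3.6) solves (3.4); it does not impose the eigenvalue condition (3.2) or the independence condition (3.3).

```python
import math

def mat_vec(m, v):
    """Return the components m^a_b v^b for a 3x3 array m[a][b]."""
    return [sum(m[a][b] * v[b] for b in range(3)) for a in range(3)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_vectors(m, x, y, psi):
    """Components of e^a in the basis (i, j, k), following (3.6)."""
    my = mat_vec(m, y)
    csc = 1.0 / math.sin(psi)
    cot = math.cos(psi) / math.sin(psi)
    return [[csc * my[a] - cot * y[a], y[a], x[a]] for a in range(3)]

def residual_of_3_4(m, x, y, psi):
    """Largest violation of the three conditions in (3.4)."""
    e = reciprocal_vectors(m, x, y, psi)
    my = mat_vec(m, y)
    j = [0.0, 1.0, 0.0]
    k = [0.0, 0.0, 1.0]
    # R rotates by psi about k, so R^T j = sin(psi) i + cos(psi) j.
    rt_j = [math.sin(psi), math.cos(psi), 0.0]
    worst = 0.0
    for a in range(3):
        worst = max(worst,
                    abs(dot(e[a], k) - x[a]),
                    abs(dot(e[a], j) - y[a]),
                    abs(dot(e[a], rt_j) - my[a]))
    return worst

# Hypothetical sample data, chosen only for illustration.
m_example = [[1, 0, 0], [2, -1, 0], [0, 3, 1]]
x_example = [1.0, 0.0, 2.0]
y_example = [0.5, -1.0, 0.25]
print(residual_of_3_4(m_example, x_example, y_example, 2 * math.pi / 3))
```

The printed residual should be at the level of rounding error for any angle with $\sin\psi \neq 0$.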
I won’t belabor the elementary calculations giving $\mathbf{e}_a$. Then, Adeleke’s calculations
give
$$\mathbf{a} = m^a_b\,(\mathbf{e}^b \cdot \mathbf{i})\,\mathbf{R}\mathbf{e}_a - \mathbf{i}. \tag{3.7}$$
This generalizes solutions I [14] obtained for the special case Ra = a. Among
the solutions considered here, these and only these have m similar to R. Thus,
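$\mathbf{R}^N = \mathbf{1} \Rightarrow m^N = 1$ just for this special case.
By using (3.5) to calculate $\mathbf{R}_{\mathbf{k}}^{\psi}(m^{-1})^a_b \mathbf{e}^b$ and rearranging terms, I get an alternative to (3.7) as
$$\mathbf{a} = \left[\csc\psi\left(m^a_b + (m^{-1})^a_b\right) y^b - 2\cot\psi\, y^a\right]\mathbf{e}_a. \tag{3.8}$$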
Except for differences of notation, most of these calculations are included as special cases of those given by Adeleke [13], but (3.8) is not.
4. S-twins with $\mathbf{R}^2 = \mathbf{1}$
There are obvious possibilities of this kind with the axis of R perpendicular to n, the
type II twins described by (2.21), but these do not quite include all such solutions,
which are covered by Adeleke’s [13] subcase 3.3.4.3. For all, it is necessary that
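$$m \text{ have } 1, -1, -1 \text{ as its eigenvalues.} \tag{4.1}$$
If $m^2 = 1$, one gets type II twins. If not, proceed as follows. Of course, by
hypothesis, one will have, for some orthonormal basis i, j, k,
$$\mathbf{R}\mathbf{i} = \mathbf{i},\qquad \mathbf{R}\mathbf{j} = -\mathbf{j},\qquad \mathbf{R}\mathbf{k} = -\mathbf{k},\qquad \mathbf{n} = \mathbf{k}. \tag{4.2}$$
Introducing a pair of eigenvectors of m, we have
$$m^a_b x^b = x^a,\qquad m^a_b y^b = -y^a. \tag{4.3}$$
One can take $\mathbf{e}^a$ as any linearly independent vectors such that
$$\mathbf{e}^a \cdot \mathbf{i} = x^a,\qquad \mathbf{e}^a \cdot \mathbf{j} = y^a. \tag{4.4}$$
So,
$$\mathbf{e}^a = x^a\mathbf{i} + y^a\mathbf{j} + z^a\mathbf{k}, \tag{4.5}$$
where, except for the requirement of linear independence, the numbers $z^a$ are
arbitrary. Adeleke’s description of a is
$$\mathbf{a} = m^b_a\,(\mathbf{e}^a \cdot \mathbf{n})\,\mathbf{R}\mathbf{e}_b - \mathbf{n} = m^b_a z^a\,\mathbf{R}\mathbf{e}_b - \mathbf{k}. \tag{4.6}$$
The values of m involved here have the property that
$$m = m' m'',\qquad m''^2 = 1, \tag{4.7}$$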
where m′ is a value of m corresponding to a lattice invariant shear. One way to
see this is to use Adeleke’s observation that, by a similarity transformation using
matrices of integers with determinant one, m can be reduced to the form (he and I use different conventions, making his matrices transposes of mine)
$$m = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & r & -1 \end{Vmatrix}, \tag{4.8}$$
where the entries are, of course, integers. Said differently, given an m with the
properties described above, one can find lattice vectors relative to which it reduces
to the form (4.8). If $r = 0$, $m^2 = 1$, so assume that $r \neq 0$. It is then easy to verify
that (4.7) is satisfied by
$$m' = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ rp & -r & 1 \end{Vmatrix},\qquad m'' = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & 0 & -1 \end{Vmatrix}, \tag{4.9}$$
that m′ does represent a lattice invariant shear and that $m''^2 = 1$. Pitteri [11] first
produced an example of an S-twin solution of (2.8) with $\mathbf{R}^2 = \mathbf{1}$, $m^2 \neq 1$. For this
example, Zanzotto [15] showed that it is really a type II twin, in disguise. Here, we
can come to the same conclusion, by similar reasoning. To do so, note that we have
solutions of (2.8), which I put in a form more like that used by Zanzotto,
$$\mathbf{R}\, m^b_a \mathbf{e}_b = (\mathbf{1} + \mathbf{a} \otimes \mathbf{n})\mathbf{e}_a = \bar{\mathbf{e}}_a. \tag{4.10}$$
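With (4.7), this is equivalent to
$$\mathbf{R}\,(m'')^b_a \mathbf{e}_b = (\mathbf{1} + \mathbf{a} \otimes \mathbf{n})\,(m'^{-1})^b_a \mathbf{e}_b = \bar{\bar{\mathbf{e}}}_a = (m'^{-1})^b_a \bar{\mathbf{e}}_b, \tag{4.11}$$
$\bar{\mathbf{e}}_a$ and $\bar{\bar{\mathbf{e}}}_a$ being equally acceptable choices of lattice vectors on the second side.
Assuming the lattice vectors used correspond to (4.8), we use (4.5) to calculate
that
$$\mathbf{e}^3 = z^3\mathbf{k} = z^3\mathbf{n} \tag{4.12}$$
and, with this that
$$(m'^{-1})^b_a \mathbf{e}_b = (\mathbf{1} + \mathbf{b} \otimes \mathbf{n})\mathbf{e}_a,\qquad \mathbf{b} \cdot \mathbf{n} = 0, \tag{4.13}$$
where
$$\mathbf{b} = r z^3 (\mathbf{e}_2 - p\,\mathbf{e}_1). \tag{4.14}$$
Then, (4.11) becomes
$$\mathbf{R}\,(m'')^b_a \mathbf{e}_b = (\mathbf{1} + \mathbf{c} \otimes \mathbf{n})\mathbf{e}_a = \bar{\bar{\mathbf{e}}}_a,\qquad \mathbf{c} = \mathbf{a} + \mathbf{b}. \tag{4.15}$$
This is a solution of (2.8) with $m^2 = 1$, $\det m = 1$, $m = m''$, axis of R perpendicular to n, implying that it is a type II twin, so
$$\mathbf{R}\mathbf{c} = \mathbf{c}. \tag{4.16}$$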
The isometry is then the same as it is for the type II twin. In the example presented
by Pitteri [11], Zanzotto [15] reasoned from this that the apparently different solution is physically equivalent, and I agree. Thus, all of the cases considered in
this section are really type II twins. Actually, what appeared to be exceptions are
compound twins, meaning that they can also be described as type I twins.
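The factorization (4.7) with the matrices (4.8) and (4.9) is easy to confirm by direct multiplication. The following sketch does so for a range of integer values of p, q, r. The function names are hypothetical; the matrices are stored row by row as printed, and the check that $(m' - 1)^2 = 0$ is added here only as a simple consistency test for m′, not as the text's definition of a lattice invariant shear.

```python
from itertools import product

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def mat_mul(a, b):
    """Ordinary row-by-column product of two 3x3 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def m_reduced(p, q, r):
    """The reduced form (4.8)."""
    return [[1, 0, 0], [p, -1, 0], [q, r, -1]]

def m_prime(p, r):
    """The factor m' of (4.9)."""
    return [[1, 0, 0], [0, 1, 0], [r * p, -r, 1]]

def m_double_prime(p, q):
    """The factor m'' of (4.9)."""
    return [[1, 0, 0], [p, -1, 0], [q, 0, -1]]

def factorization_holds(p, q, r):
    """Check m = m' m'', m''^2 = 1 and (m' - 1)^2 = 0."""
    mp, mpp = m_prime(p, r), m_double_prime(p, q)
    shear_part = [[mp[i][j] - IDENTITY[i][j] for j in range(3)] for i in range(3)]
    return (mat_mul(mp, mpp) == m_reduced(p, q, r)
            and mat_mul(mpp, mpp) == IDENTITY
            and mat_mul(shear_part, shear_part) == ZERO)

def squares_to_identity(m):
    return mat_mul(m, m) == IDENTITY

print(all(factorization_holds(p, q, r)
          for p, q, r in product(range(-3, 4), repeat=3)))
```

The same helper confirms the remark that (4.8) satisfies $m^2 = 1$ exactly when $r = 0$.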
5. O-twins with Rn = n
Having disposed of the S-twins, we now consider the type of O-twins described in
the heading. So, we are to satisfy
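$$\mathbf{R}_{\mathbf{n}}^{\psi}(m^{-1})^a_b \mathbf{e}^b = (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})\mathbf{e}^a,\qquad \mathbf{a} \cdot \mathbf{n} = 2, \tag{5.1}$$
for any of the angles ψ associated with rotation twins. Operate on both sides with
$-\mathbf{R}_{\mathbf{n}}^{\pi}$, replace m by $\bar{m} = -m$ and note that
$$-\mathbf{R}_{\mathbf{n}}^{\pi}(\mathbf{1} - \mathbf{n} \otimes \mathbf{a}) = (\mathbf{1} - 2\mathbf{n} \otimes \mathbf{n})(\mathbf{1} - \mathbf{n} \otimes \mathbf{a}) = \mathbf{1} - \mathbf{n} \otimes \bar{\mathbf{a}}, \tag{5.2}$$
where
$$\bar{\mathbf{a}} = 2\mathbf{n} - \mathbf{a} \;\Rightarrow\; \bar{\mathbf{a}} \cdot \mathbf{n} = 0. \tag{5.3}$$
Also,
$$\mathbf{R}_{\mathbf{n}}^{\pi}\mathbf{R}_{\mathbf{n}}^{\psi} = \mathbf{R}_{\mathbf{n}}^{\psi+\pi}. \tag{5.4}$$
Thus, (5.1) gets transformed to the S-twin solution
$$\mathbf{R}_{\mathbf{n}}^{\bar{\psi}}(\bar{m}^{-1})^a_b \mathbf{e}^b = (\mathbf{1} - \mathbf{n} \otimes \bar{\mathbf{a}})\mathbf{e}^a,\qquad \bar{\psi} = \psi + \pi,\qquad \bar{m} = -m \tag{5.5}$$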
and, obviously, we can transform (5.5) in a similar way to get (5.1). There are
exceptional cases. For one,
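$$\psi = \pi \;\Rightarrow\; \bar{\psi} = 2\pi \;\Rightarrow\; \mathbf{R}_{\mathbf{n}}^{\bar{\psi}} = \mathbf{1}, \tag{5.6}$$
(5.5) then describing a lattice invariant shear. One is then combining the trivial
$-\mathbf{R}_{\mathbf{n}}^{\pi} = \mathbf{1} - 2\mathbf{n} \otimes \mathbf{n}$ with a lattice invariant shear, to get a solution which is rather
trivial, but perhaps not useless. Another exceptional case occurs if
$$\bar{m} \in L(\mathbf{e}_a) \;\Leftrightarrow\; m \in L(\mathbf{e}_a), \tag{5.7}$$
(5.5) then describing a penetration twin or a fake twin. Either way, $\bar{\mathbf{a}} = \mathbf{0}$. Then,
$\mathbf{R}_{\mathbf{n}}^{\bar{\psi}} \in P(\mathbf{e}_a)$, so the axis n is normal to some crystallographic planes.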
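The algebraic identity behind (5.2) and (5.3) can be checked numerically for any unit vector n and any a with $\mathbf{a} \cdot \mathbf{n} = 2$. The sketch below is an illustration only: the function names and the sample vectors are hypothetical, and it uses the relation $\mathbf{R}_{\mathbf{n}}^{\pi} = 2\mathbf{n} \otimes \mathbf{n} - \mathbf{1}$ noted in the text.

```python
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def identity_minus(t):
    """Return 1 - t for a 3x3 array t."""
    return [[(1.0 if i == j else 0.0) - t[i][j] for j in range(3)]
            for i in range(3)]

def rotation_by_pi(n):
    """180 degree rotation about the unit vector n: 2 n(x)n - 1."""
    nn = outer(n, n)
    return [[2.0 * nn[i][j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def check_5_2(n, a):
    """Return (max entry gap between both sides of (5.2), a_bar . n)."""
    minus_r = [[-entry for entry in row] for row in rotation_by_pi(n)]
    lhs = mat_mul(minus_r, identity_minus(outer(n, a)))
    a_bar = [2.0 * ni - ai for ni, ai in zip(n, a)]          # (5.3)
    rhs = identity_minus(outer(n, a_bar))
    gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
    return gap, sum(x * y for x, y in zip(a_bar, n))

# Hypothetical sample: n is a unit vector, t is perpendicular to n,
# so a = 2n + t satisfies a . n = 2.
n_example = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
t_example = [2.0, -1.0, 0.0]
a_example = [2.0 * ni + ti for ni, ti in zip(n_example, t_example)]
print(check_5_2(n_example, a_example))
```

Both returned numbers should vanish to rounding error, which is the content of (5.2) and of the implication in (5.3).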
For the remaining solutions, if ψ is one of the angles associated with rotation
twins, ψ + π is another, so one can take any of the S-twin solutions discussed
in Section 3 and transform it to get an O-twin solution. From the remarks after
(2.23), it follows that, for the twins considered in this section, the axis of rotation
is parallel to the normal of some crystallographic planes.
6. O-twins with axis of R perpendicular to n
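For these, we can use the fact that any rotation can be written as a product of two
180° rotations with axes perpendicular to that of the rotation and one of these axes
can be chosen at will. If u is a unit vector such that $\mathbf{u} \cdot \mathbf{n} = 0$, there is then a vector
v such that
$$\mathbf{R}_{\mathbf{u}}^{\psi} = \mathbf{R}_{\mathbf{n}}^{\pi}\mathbf{R}_{\mathbf{v}}^{\pi},\qquad \mathbf{v} \cdot \mathbf{u} = 0,\qquad |\mathbf{v}| = 1,\qquad 2(\mathbf{v} \cdot \mathbf{n})^2 = 1 + \cos\psi. \tag{6.1}$$
Given an O-twin solution of the form
$$\mathbf{R}_{\mathbf{u}}^{\psi}(m^{-1})^a_b \mathbf{e}^b = (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})\mathbf{e}^a = \bar{\mathbf{e}}^a,\qquad \mathbf{a} \cdot \mathbf{n} = 2, \tag{6.2}$$
we can transform it as we did (5.1) to get the S-twin solution
$$\mathbf{R}_{\mathbf{v}}^{\pi}(\bar{m}^{-1})^a_b \mathbf{e}^b = (\mathbf{1} - \mathbf{n} \otimes \bar{\mathbf{a}})\mathbf{e}^a = \tilde{\mathbf{e}}^a,\qquad \tilde{\mathbf{e}}^a = -\mathbf{R}_{\mathbf{n}}^{\pi}\bar{\mathbf{e}}^a,\qquad \bar{m} = -m, \tag{6.3}$$
with $\bar{\mathbf{a}}$ again given by (5.3).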
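The decomposition (6.1) of a rotation into two 180° rotations can be verified numerically. The sketch below makes several assumptions that are not in the text: rotations are built with Rodrigues' formula in the right-handed sense, w is taken as the cross product u × n, and v is chosen as $\cos\varphi\,\mathbf{n} + \sin\varphi\,\mathbf{w}$ with $\varphi = -\psi/2$. With the opposite sense of rotation, or with the two factors in the other order, the sign of φ flips. All function names are hypothetical.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(axis, angle):
    """Right-handed rotation about a unit axis (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    ux, uy, uz = axis
    k = [[0.0, -uz, uy], [uz, 0.0, -ux], [-uy, ux, 0.0]]
    return [[c * (1.0 if i == j else 0.0) + s * k[i][j]
             + (1.0 - c) * axis[i] * axis[j] for j in range(3)]
            for i in range(3)]

def check_6_1(u, n, psi):
    """Return (matrix gap in R_u^psi = R_n^pi R_v^pi, gap in 2(v.n)^2 = 1 + cos psi)."""
    w = cross(u, n)
    phi = -psi / 2.0                      # sign fixed by the conventions above
    v = [math.cos(phi) * ni + math.sin(phi) * wi for ni, wi in zip(n, w)]
    lhs = rotation(u, psi)
    rhs = mat_mul(rotation(n, math.pi), rotation(v, math.pi))
    matrix_gap = max(abs(lhs[i][j] - rhs[i][j])
                     for i in range(3) for j in range(3))
    angle_gap = abs(2.0 * dot(v, n) ** 2 - (1.0 + math.cos(psi)))
    return matrix_gap, angle_gap

# Hypothetical sample: orthonormal u and n that are not coordinate axes.
u_example = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
n_example = [2.0 / 3.0, 1.0 / 3.0, -2.0 / 3.0]
print(check_6_1(u_example, n_example, math.pi / 2))
```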
One exceptional case occurs when (6.3) describes a penetration twin, so that
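$$\bar{m} \in L(\mathbf{e}_a) \;\Leftrightarrow\; m \in L(\mathbf{e}_a),\qquad \bar{\mathbf{a}} = \mathbf{0}. \tag{6.4}$$
It then follows that $\bar{m}$ is similar to $\mathbf{R}_{\mathbf{v}}^{\pi}$, so
$$\bar{m}^2 = 1 \;\Rightarrow\; m^2 = 1. \tag{6.5}$$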
In these cases, it is not necessary that u be parallel to a row of atoms or to a normal
of a crystallographic plane. My [5] analysis of Friedel twins in quartz, which is
consistent with Zanzotto’s [16, 17], shows that they provide an example fitting
(6.4) and (6.5), with ψ = π/2. Another exceptional case occurs when
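$$\mathbf{v} \cdot \mathbf{n} = 0 \;\Rightarrow\; \mathbf{R}_{\mathbf{u}}^{\psi} = \mathbf{R}_{\mathbf{v} \wedge \mathbf{n}}^{\pi}. \tag{6.6}$$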
Then, (6.3) describes a rotation S-twin of the kind discussed in Section 4, essentially a type II twin. Using (2.21), one can transform it to get solutions of (5.1)
rather explicitly. In this case, u need not be parallel to a normal of crystallographic
planes or to rows of atoms. This covers the nontrivial possibilities with $m^2 = 1$, so
I now assume that $m^2 \neq 1$. If v is parallel to n, (6.1) gives $\mathbf{R}_{\mathbf{u}}^{\psi} = \mathbf{1}$, which is of no
interest. So, we can assume that v is neither parallel nor perpendicular to n. Then,
$\mathbf{R}_{\mathbf{v}}^{\pi}$ does not map a plane with normal n to itself, and we do have
$$\mathbf{R}_{\mathbf{v}}^{\pi}\mathbf{u} = -\mathbf{u},\qquad \mathbf{u} \cdot \mathbf{n} = 0. \tag{6.7}$$
Such solutions are characterized by Adeleke [13], in his subcase 3.3.2.2. Restrictions on $\bar{m}$ not already mentioned are that
$$\bar{m} \text{ must have } -1 \text{ as an eigenvalue} \;\Rightarrow\; \bar{m}^a_b k^b = -k^a \quad\text{and}\quad \bar{m}^b_a l_b = -l_a, \tag{6.8}$$
for some numbers $k^a$ and $l_a$, not all in either set being zero. Also,
$$\text{there must be numbers } y^a \text{ such that } k^a,\ y^a, \text{ and } \bar{m}^a_b y^b \text{ are linearly independent.} \tag{6.9}$$
Of course, $\bar{m}$ can be replaced by $\bar{m}^{-1}$ in (6.8). If I calculate correctly, such $y^a$ always exist for any particular $\bar{m}$ of the kind considered, but the linear independence
fails for some values of these. For calculations to follow, one can use any values of
$\bar{m}$ and $y^a$ that are consistent with (6.8) and (6.9).
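Take u as a unit vector and use the orthonormal basis u, n, w = u ∧ n. Then,
for some angle ϕ with $\sin 2\varphi \neq 0$,
$$\mathbf{v} = \cos\varphi\,\mathbf{n} + \sin\varphi\,\mathbf{w}. \tag{6.10}$$
Then, (6.1)$_4$ can be replaced by
$$\varphi = \pm\frac{\psi}{2}. \tag{6.11}$$
Adeleke’s prescription for $\mathbf{e}^a$ gives them as solutions of
$$\mathbf{e}^a \cdot \mathbf{u} = k^a,\qquad \mathbf{e}^a \cdot \mathbf{w} = y^a,\qquad \mathbf{e}^a \cdot \mathbf{R}_{\mathbf{v}}^{\pi}\mathbf{w} = \bar{m}^a_b y^b, \tag{6.12}$$
which gives
$$\mathbf{e}^a = k^a\mathbf{u} + \left(\csc 2\varphi\, \bar{m}^a_b y^b + \cot 2\varphi\, y^a\right)\mathbf{n} + y^a\mathbf{w}. \tag{6.13}$$
For $\bar{\mathbf{a}}$, his prescription gives
$$\bar{\mathbf{a}} = \bar{m}^a_b\,(\mathbf{e}^b \cdot \mathbf{n})\,\mathbf{R}_{\mathbf{v}}^{\pi}\mathbf{e}_a - \mathbf{n}. \tag{6.14}$$
Of course, one must transform this to get the corresponding solutions of (6.1), using
$\mathbf{a} = 2\mathbf{n} - \bar{\mathbf{a}}$, as follows from (5.3). From (6.12)$_1$, it follows that the axis of $\mathbf{R}_{\mathbf{u}}^{\psi}$ is given by
$$\mathbf{u} = k^a\mathbf{e}_a. \tag{6.15}$$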
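Now, starting with the combination $\mathbf{R}_{\mathbf{v}}^{\pi}(\bar{m}^{-1})^a_b \mathbf{e}^b$, use (6.7) and (6.10) to get
$$\mathbf{R}_{\mathbf{v}}^{\pi}\mathbf{n} = \cos 2\varphi\,\mathbf{n} + \sin 2\varphi\,\mathbf{w}, \tag{6.16}$$
$$\mathbf{R}_{\mathbf{v}}^{\pi}\mathbf{w} = \sin 2\varphi\,\mathbf{n} - \cos 2\varphi\,\mathbf{w} \tag{6.17}$$
to evaluate this. One finds that it can be put in the form $(\mathbf{1} - \mathbf{n} \otimes \bar{\mathbf{a}})\mathbf{e}^a$, with
$$\bar{\mathbf{a}} = \csc 2\varphi\left(\bar{m}^a_b - (\bar{m}^{-1})^a_b\right) y^b\,\mathbf{e}_a, \tag{6.18}$$
as an alternative to (6.14). With $\mathbf{l} = l_a\mathbf{e}^a \Rightarrow l_a = \mathbf{l} \cdot \mathbf{e}_a$, it follows from (6.8) and
(6.18) that
$$\mathbf{l} \cdot \bar{\mathbf{a}} = 0. \tag{6.19}$$
Using (6.13), one gets
$$\mathbf{l} = k^a l_a\,\mathbf{u} + \sec\varphi\, l_a y^a\left(-\sin\varphi\,\mathbf{n} + \cos\varphi\,\mathbf{w}\right) \;\Rightarrow\; \mathbf{l} \cdot \mathbf{v} = 0, \tag{6.20}$$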
where (6.10) is used. With the possibility $\mathbf{v} \cdot \mathbf{n} = 0$ excluded, $\bar{\mathbf{a}}$ and v cannot
be parallel, so these determine the direction of a plane with normal l. When the
eigenspace of $\bar{m}$ corresponding to −1 is one-dimensional, $k^a$ and $l_a$ are proportional to integers, implying that u is parallel to some rows of atoms and l is normal
to some crystallographic planes. Either −1 is an isolated eigenvalue, in which case
the eigenspace is one-dimensional, or the eigenvalues are 1, −1, −1. In the latter
case, one can reduce $\bar{m}$ to the form (4.8). It is then easy to show that if $\bar{m}^2 \neq 1$, the
eigenspace corresponding to −1 is one-dimensional, and we have already covered
possibilities with $\bar{m}^2 = 1$.
As is clear from (6.18), there are exceptional cases when $\bar{m}^2 = 1$. One possibility is that one has type I or type II twins, so $\sin 2\varphi = 0$, $\bar{\mathbf{a}}$ as given by (6.18) then
being indeterminate. The other possibility, mentioned earlier, is that
$$\bar{\mathbf{a}} = \mathbf{0} \;\Rightarrow\; \bar{m} \in L(\mathbf{e}_a),\qquad \mathbf{R}_{\mathbf{v}}^{\pi} \in P(\mathbf{e}_a). \tag{6.21}$$
With this, we have all solutions of the twinning equations describing rotation twins.
7. An example
In part, this is an attempt to warn those unfamiliar with research on minerals of
some of the pitfalls. Experimentally, it is not always easy to distinguish between
various twin laws which seem to be quite different. In the words of some experts,
Donnay et al. [18], “Whenever the crystal lattice or one of its multiple lattices
possesses pseudo-symmetry, the crystal may twin (twinning by pseudo-merohedry
or by reticular pseudo-merohedry). If the pseudo-symmetry is pronounced and
sufficiently high, several twin laws may lead to nearly identical orientations of
the twinned individual. The resulting twins have been called “neighboring twins”
(macles voisines, Friedel, 1926). Because the relative orientation of one of the
twinned individuals with respect to the other is known to morphologists only to
within the limits of error of optical goniometry, the identification of neighboring
twins may be a difficult problem, as is well illustrated by cryolite, staurolite, harmotome, morvenite, etc.” These workers developed a rather sensitive technique
for distinguishing between such possibilities. As an example, they considered four
likely possibilities for describing one kind of twin in staurolite. These involve four
rotations with different axes. Two have 120° angles, the angles for the other two
being 90° and 180°. According to their observations, that with the 180° angle is the
best fit, among the four.
I have looked at several references on staurolite and plan to look at more, since I
have found them rather confusing and incomplete, partly because of my ignorance
and meager experience with minerals. In one of the more recent references, the
book by Klein and Hurlbut [19, pp. 104, 105 and 438, 439], various information
is presented, including the chemical composition, from which it is clear that this
is a complicated multilattice, and that the composition is somewhat different in
different specimens. This is also the case in various other minerals. It is described
as monoclinic with a β angle of 90°, being pseudo-orthorhombic. As I interpret
this and other writings, the skeletal lattice is orthorhombic, but the configuration of
shifts reduces the symmetry to monoclinic. The space group is described as C2/m.
No details concerning shifts are given. Although writers tend not to say so, the
general practice seems to be to use as lattice vectors a mutually orthogonal set,
with magnitudes given by these writers as
$$a = |\mathbf{e}_1| = 7.83\ \text{Å},\qquad b = |\mathbf{e}_2| = 16.62\ \text{Å},\qquad c = |\mathbf{e}_3| = 5.65\ \text{Å}. \tag{7.1}$$
Other estimates I have seen of these numbers are not very different. As they describe them, two common kinds occur, both being called penetration twins, and
both being involved in crosses commonly found in this material,
“(1) with twin plane {0 3 1} in which the individuals cross at nearly 90° (Figure 13.24b);
(2) with twin plane {2 3 1} in which they cross at nearly 60° (Figure 13.24c).” (7.2)
I will only present an analysis of the first kind. Also, I note that Donnay et al. [18]
do not consider the twin plane {0 3 1} as one of the more likely possibilities. Here,
{a b c} represents the direction $a\,\mathbf{e}^1 + b\,\mathbf{e}^2 + c\,\mathbf{e}^3$ or a crystallographic equivalent. For
the particular direction noted, replace curly brackets by parentheses. Concerning
the first kind, in a paper written almost two decades earlier, Hartman [7] complains
that too many workers ignore experimental work by Friedel [20], then quite old,
which he accepts. This is also endorsed by Cahn [6], a major expert. Briefly, Friedel
found that the isometry is better described by
$$\mathbf{R}_{\mathbf{e}_1}^{\pi/2}. \tag{7.3}$$
From such remarks, I became doubtful that the interface is exactly a {0 3 1} plane.
Similarly, Klein and Hurlbut [19] ignore the work mentioned above, by Donnay
et al. For the second kind, they found that this isometry is better described as a
180° rotation about [3 1 3], the direction $3\mathbf{e}_1 + \mathbf{e}_2 + 3\mathbf{e}_3$. One could replace this by
a crystallographic equivalent, the set of these being described by ⟨3 1 3⟩. Various
writers just give one direction, assuming you know that the equivalents can be used
and, here, I follow suit.
I decided to try analyzing the first kind as a rotation twin, using (7.3). This axis
is perpendicular to the normal to the aforementioned (0 3 1) plane or an equivalent.
I assume that the interface is a plane with normal perpendicular to the axis, but
not necessarily this one. A priori, it might be an S-twin or an O-twin, these being
growth twins. I tried both possibilities, concluding that only the latter is appropriate. I shall sketch my analysis of the latter, using the analysis in Section 6. We are
given that
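$$\mathbf{u} = \frac{\mathbf{e}_1}{a} = a\,\mathbf{e}^1,\qquad \psi = \frac{\pi}{2}. \tag{7.4}$$
With the lattice vectors orthogonal, $\mathbf{e}^2$ and $\mathbf{e}^3$ must be in the plane determined by n
and w, where the vectors are defined as in Section 6, giving
$$b\,\mathbf{e}^2 = \cos\chi\,\mathbf{n} + \sin\chi\,\mathbf{w},\qquad c\,\mathbf{e}^3 = -\sin\chi\,\mathbf{n} + \cos\chi\,\mathbf{w}, \tag{7.5}$$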
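where χ is some angle, unknown since no particular value of n is assumed. Of
course, these values of $\mathbf{e}^a$ are to be consistent with (6.13), and I reject solutions
possible only for isolated values of b/c. This gives equations which can be solved
by elementary methods, to get two solutions. For m, one gets the values
$$m_1 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\qquad m_2 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \in L(\mathbf{e}_a). \tag{7.6}$$
For the first, one gets
$$(\mathbf{n} \otimes \mathbf{a})_1 = \left(b\,\mathbf{e}^2 + c\,\mathbf{e}^3\right) \otimes \left(\frac{\mathbf{e}_2}{b} + \frac{\mathbf{e}_3}{c}\right) \;\Rightarrow\; \mathbf{a}_1 = 2\mathbf{n}_1 \tag{7.7}$$
and, for the second
$$(\mathbf{n} \otimes \mathbf{a})_2 = \left(b\,\mathbf{e}^2 - c\,\mathbf{e}^3\right) \otimes \left(\frac{\mathbf{e}_2}{b} - \frac{\mathbf{e}_3}{c}\right) \;\Rightarrow\; \mathbf{a}_2 = 2\mathbf{n}_2. \tag{7.8}$$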
The two directions of n are orthogonal, these solutions being rather similar to the
solutions Zanzotto [16, 17] and I [5] got by different reasoning for the Friedel twins
in quartz. The latter form 90° crosses, and one can use much the same reasoning to
construct solutions for such crosses in staurolite. For (7.7), say, the direction of n
is given by
$$\frac{b}{c}\,\mathbf{e}^2 + \mathbf{e}^3,\qquad \frac{b}{c} \cong 2.94 \tag{7.9}$$
for the approximate values of b and c given in (7.1). In the usual jargon, this is an
irrational direction. In dealing with these, it is a common practice to give a rational
direction, using fairly small integers and, here, a likely choice is a (0 3 1) plane,
referred to above as a twin plane. While $m_1$ and $m_2$ are not in $L(\mathbf{e}_a, \mathbf{p}_i)$,
$$m_3 = m_1 m_2 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{Vmatrix} \in L(\mathbf{e}_a, \mathbf{p}_i) \;\Rightarrow\; m_1 = m_3 m_2, \tag{7.10}$$
from the claim that these are monoclinic crystals. Here, my interpretation of the
literature is that the 180° rotation included in the point group is $\mathbf{R}_{\mathbf{e}_1}^{\pi}$. So, the two
solutions are symmetry related. As far as I can tell, this seems to be a satisfactory
description of these twins.
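To see how close the irrational direction (7.9) is to the rational (0 3 1) normal, one can compute the angle between them from the values in (7.1). The sketch below assumes an orthogonal lattice, so that the reciprocal vectors $\mathbf{e}^2$ and $\mathbf{e}^3$ have lengths 1/b and 1/c along fixed Cartesian axes; the function names are hypothetical.

```python
import math

B, C = 16.62, 5.65   # b and c from (7.1), in Angstrom

def plane_normal(k2, k3):
    """Cartesian components, in the (e_2, e_3) plane, of k2 e^2 + k3 e^3."""
    return (k2 / B, k3 / C)

def angle_between_deg(u, v):
    cos_angle = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

ratio = B / C                                     # the ratio b/c of (7.9)
deviation = angle_between_deg(plane_normal(ratio, 1.0),   # irrational normal (7.9)
                              plane_normal(3.0, 1.0))     # normal of (0 3 1)
print(ratio, deviation)
```

With these lattice parameters the two normals differ by well under one degree, which is consistent with the remark that a (0 3 1) plane is a likely rational stand-in for the irrational one.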
What is the basis for calling these penetration twins, particularly by those like
Klein and Hurlbut who associate a definite twin plane with them? According to
Cahn [6, p. 388], the interfaces of penetration twins are crystallographically irregular, and interfaces are not always parallel to twin planes, when these exist. I have
not yet found reports of observations of details of these interfaces, so I might
be missing something. From looking at sketches of the 90° crosses and eyeing
one specimen, the interfaces at least resemble perpendicular planes. By looking at
sketches of the “nearly 60°” crosses, the reader will be as able as I am to judge
whether the interfaces are planes and if these are likely to be rotation twins. Other
kinds of intergrowths more or less like these, not always found in crosses, are
among the things called penetration twins. Sketches of these staurolite crosses and
other twinned configurations in this and other materials are presented by various
writers, for example, Klein and Hurlbut [19, pp. 103–106] and Dana [21, pp. 181–
194]. With (7.6), for the same values of m1 and m2 , one can get solutions for what
I call penetration S-twins without restrictions on n, or for what I call exceptional
O-twins with different values of n, but these involve different isometries.
There is a little mystery which I seem to have resolved. For the second kind
of twin described in (7.2), Cahn [6, p. 373] says nothing about the twin plane
mentioned there, but does mention a twin law “. . . with a three fold twin axis
[1 0 1], . . . ”. I expected to find this among the four likely prospects considered
by Donnay et al., but did not. At first, I thought that this could be due to some
misprint. Then, I found that Hartman [7, p. 234] noted that different workers use
different values of c, one based on “. . . the X-ray unit cell. . . ”, the other on “. . . the
morphological description. . . ”, the latter being twice the former. Assuming Cahn
but not the others used the latter, this corresponds to the [1 0 2] possibility considered by Donnay et al. As was mentioned earlier, this is not the one they consider
the most likely, which is a two fold rotation with twin axis [3 1 3]. However, this
seems to give a likely explanation of the indicated difference. Of the two, Cahn’s
paper was published a bit earlier so, at the time, he might not have learned of the
results of the other writers. I will not recommend a solution for these twins without
demonstrating that it is adequate to describe the “nearly 60°” crosses, something I
have not yet explored.
I do think it desirable to try to analyze more of the many growth twins, to better
understand how well my twinning equations apply to them, and to determine how
useful they might be in helping to resolve ambiguities mentioned at the beginning
of this Section.
References
1. C.S. Barrett and T.B. Massalski, The Structure of Metals, 3rd edn. McGraw-Hill, New York (1966).
2. J.L. Ericksen, Equilibrium theory for X-ray observations. Arch. Rational Mech. Anal. 139 (1997) 181–200.
3. J.L. Ericksen, On correlating two theories of twinning. Arch. Rational Mech. Anal. 151 (2000) 261–289.
4. J.L. Ericksen, Twinning analyses in the X-ray theory. Internat. J. Solids Structures 38 (2001) 967–995.
5. J.L. Ericksen, On the theory of growth twins in quartz. Math. Mech. Solids 6 (2001) 359–386.
6. R.W. Cahn, Twinned crystals. Adv. in Phys. Quart. Suppl. Phil. Mag. 3 (1954) 363–445.
7. P. Hartman, On the morphology of growth twins. Zeits. Krist. 107 (1956) 225–237.
8. M. Pitteri, On (ν + 1) lattices. J. Elasticity 15 (1985) 3–25.
9. J.L. Ericksen, On groups occurring in the theory of crystal multi-lattices. Arch. Rational Mech. Anal. 148 (1999) 145–178.
10. M. Pitteri, On the kinematics of mechanical twinning in crystals. Arch. Rational Mech. Anal. 88 (1985) 25–58.
11. M. Pitteri, On type-2 twins in crystals. Internat. J. Plasticity 2 (1986) 99–106.
12. M. Pitteri and G. Zanzotto, Beyond space groups: The arithmetic symmetry of deformable multilattices. Acta Cryst. A 54 (1998) 359–373.
13. S. Adeleke, On matrix equations of twinning in crystals. Math. Mech. Solids 5 (2000) 395–415.
14. J.L. Ericksen, Some surface defects in unstressed thermoelastic solids. Arch. Rational Mech. Anal. 88 (1985) 337–345.
15. G. Zanzotto, Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some experimental results. Mechanical twinning and growth twinning, Nota II. Atti Accad. Naz. Lincei 82 (1988) 725–741, 743–756.
16. G. Zanzotto, Geobarothermometric properties of growth twins and mathematical analyses of quartz for a broad range of temperatures and pressures. Phys. Chem. Minerals 16 (1989) 783–789.
17. G. Zanzotto, Thermoelastic stability of multiple growth twins in quartz and general barothermometric implications. J. Elasticity 23 (1990) 253–287.
18. G. Donnay, J.D.H. Donnay and V.J. Hurst, Precession goniometry to identify neighboring twins. Acta Cryst. 8 (1955) 507–509.
19. C. Klein and C.S. Hurlbut, Jr., Manual of Mineralogy (after James D. Dana), 21st edn, revised. Wiley, New York (1999).
20. G. Friedel, Sur les macles de la staurotide. Bull. Soc. Franç. Minéral. 45 (1922) 8–15.
21. B.W. Dana, A Textbook on Mineralogy with an Extended Treatise on Crystallography (revised and enlarged by W.E. Ford). Wiley, New York (1932).
Minimum Free Energies for Materials with Finite
Memory
MAURO FABRIZIO1 and MURROUGH GOLDEN2
1 Dipartimento di Matematica, Università degli Studi di Bologna, Piazza di Porta S. Donato 5,
40127 Bologna, Italia. E-mail: fabrizio@dm.unibo.it
2 School of Mathematical Sciences, Dublin Institute of Technology, Dublin, Ireland.
E-mail: jmgolden@maths1.kst.dit.ie
Received: 22 May 2002; in revised form 16 October 2003
Abstract. Finite memory viscoelastic materials are of interest because (a) they are not necessarily experimentally distinguishable from materials with infinite memory; and (b) the assumption of
infinite memory can, in certain contexts, lead to results that run counter to physical intuition. An
example of this – the quasi-static viscoelastic membrane in a frictional medium – is discussed. It
is shown that, for a finite memory material, the singularity structure of the Fourier transform of the
relaxation function derivative is quite different from the infinite memory case in the sense that it is
an entire function with all its singularities being essential singularities at infinity. The formula for the
minimum free energy [1] is still valid in this case. In contrast to the work function, this quantity, and
all other functions of the minimal state, depend only on the values of the history over the period when
the relaxation function derivative is nonzero. The factorization required to determine the form of the
minimum free energy can be carried out explicitly for simple step-function choices of the relaxation
function derivative. The two simplest cases are fully worked through and explicit formulae are given
for all relevant quantities.
Mathematics Subject Classifications (2000): 74A15, 74D05, 30E20.
Key words: linear viscoelasticity, thermodynamics, minimum free energy, finite memory.
To Clifford A. Truesdell, Founder of Rational Thermodynamics
1. Introduction.
General expressions have been given recently for the minimum free energy of
linear viscoelastic materials under isothermal conditions in the scalar case [1] and
in the full tensorial case [2]. An alternative derivation for the full tensorial case
was given in [3] together with expressions for a family of other free energies
that are functions of state in the sense of Noll [4] in the case of scalar, discrete
spectrum (relaxation function given by a sum of exponentials) response. The case
of a compressible fluid is considered in [5].
The aim of the present work is to derive an expression for the minimum free
energy corresponding to a relaxation function with the special property that its
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 375–397.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
derivative is nonzero over only a finite interval of time. It will be seen that there are
special features associated with the analytic behaviour of the frequency space representation of such relaxation functions which render this a non-trivial extension,
with unique features, of the general treatments referred to above. This property of
finite memory is of interest in the first instance because finite and infinite memories
are not necessarily experimentally distinguishable; also, the assumption of infinite
memory can lead to paradoxical results for many problems.
The scalar case is dealt with in this work. In Section 2, various relationships
required in later sections are presented. The problem of a viscoelastic membrane
in a frictional medium is discussed in Section 3 in order to illustrate that results running counter to physical intuition emerge from the assumption of infinite memory.
In fact, a result is quoted which shows that while (time) exponential decay in the
displacement occurs in the elastic problem, this is not so in the viscoelastic problem
if the viscoelastic function does not decay exponentially. One would expect that any
viscoelastic function would simply enhance the elastic exponential decay because
of the dissipative effects associated with viscoelasticity.
In Section 4, it is shown that the singularity structure of the Fourier transform of
the relaxation function derivative is quite different from the infinite memory case
in that it is an entire function with essential singularities of exponential type at
infinity, rather than poles and branch points generally in the finite complex plane, as
is the case for infinite memory materials. The latter may of course include essential
singularities at infinity, though these have been excluded for simplicity from earlier
work [1–3].
In Section 5, an explicit expression for the minimum free energy is derived. This
is very similar to the developments in [1, 2], though with the difference that relative
strain histories are used, which leads to certain simplifications in frequency space –
associated with better convergence at infinity. The main reason why the derivation
is presented in some detail is to ascertain that the very different singularity structure
in the finite memory quantities does not affect the final expression. An important
result is proved at the end of this section, namely that the minimum free energy (and
other functions of the minimal state) depends only on that portion of past history for
which the memory is nonzero, while the work function depends on all past times.
In Section 6 the crucial factorization required to determine the minimum free
energy is discussed for specific (step-function) examples of finite memory materials, while explicit forms of the minimum free energy, and the corresponding
work function, are given in Section 7 for two choices of finite memory relaxation
functions.
2. Basic Relationships
We consider a linear viscoelastic solid, subject to stress in such a way that there is
only one nonzero component of stress T (t) and strain E(t) related by
$$T(t) = G_0 E(t) + \int_0^\infty G'(s)\,E^t(s)\,ds = G_\infty E(t) + \int_0^\infty G'(s)\,E_r^t(s)\,ds, \tag{2.1}$$
$$E^t(s) = E(t - s),\quad s \in \mathbb{R},\qquad E_r^t(s) = E^t(s) - E(t),$$
where $E \in L^1(\mathbb{R}^+) \cap L^2(\mathbb{R}^+) \cap C^1(\mathbb{R}^+)$ and $G' \in L^1(\mathbb{R}^+) \cap L^2(\mathbb{R}^+)$, using
the notation here and below: $\mathbb{R}$ is the set of reals, $\mathbb{R}^+$ the positive reals, and $\mathbb{R}^{++}$
the strictly positive reals; similarly $\mathbb{R}^-$, $\mathbb{R}^{--}$ are the negative and strictly negative
reals. The relative history $E_r^t$ will be used extensively later.⋆ The relaxation function
$$G(s) = G_0 + \int_0^s G'(u)\,du \tag{2.2}$$
is then well-defined along with $G_\infty = \lim_{s\to\infty} G(s)$. We take
G∞ > 0
(2.3)
so that the body is a solid.
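The equality of the two forms of the stress in (2.1) can be illustrated numerically for a simple finite-memory material. The sketch below is not taken from the paper: it assumes a hypothetical step-function relaxation derivative, $G'(s) = -g$ for $0 \le s < d$ and zero afterwards, a hypothetical strain function, and midpoint quadrature over the memory interval. The names `relaxation_derivative` and `stress_both_forms` are likewise invented for illustration.

```python
import math

def relaxation_derivative(s, g, d):
    """Hypothetical finite-memory G'(s): equal to -g on [0, d), zero elsewhere."""
    return -g if 0.0 <= s < d else 0.0

def stress_both_forms(strain, t, g0, g, d, steps=4000):
    """Evaluate both right-hand sides of (2.1) by the midpoint rule on [0, d]."""
    h = d / steps
    g_inf = g0 - g * d                      # G_inf = G_0 + integral of G' over [0, inf)
    memory_abs = 0.0                        # integral of G'(s) E^t(s)
    memory_rel = 0.0                        # integral of G'(s) E_r^t(s)
    for i in range(steps):
        s = (i + 0.5) * h
        gp = relaxation_derivative(s, g, d)
        memory_abs += gp * strain(t - s) * h
        memory_rel += gp * (strain(t - s) - strain(t)) * h
    return g0 * strain(t) + memory_abs, g_inf * strain(t) + memory_rel

def sample_strain(tau):
    """Hypothetical smooth strain history."""
    return math.exp(-tau * tau)

# With g0 = 2, g = 0.5, d = 1.5 the equilibrium modulus is G_inf = 1.25 > 0, as (2.3) requires.
print(stress_both_forms(sample_strain, 0.3, 2.0, 0.5, 1.5))
```

The two returned values agree to rounding error because the difference between the two forms is exactly $E(t)\left(G_0 - G_\infty + \int_0^\infty G'(s)\,ds\right)$, which vanishes by (2.2).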
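Let $\Omega$ be the complex ω plane and
$$\Omega^+ = \{\omega \in \Omega \mid \operatorname{Im}(\omega) \in \mathbb{R}^+\},\qquad \Omega^{(+)} = \{\omega \in \Omega \mid \operatorname{Im}(\omega) \in \mathbb{R}^{++}\}. \tag{2.4}$$
These define the upper half-plane including and excluding the real axis, respectively. Similarly, $\Omega^-$, $\Omega^{(-)}$ are the lower half-planes including and excluding the
real axis, respectively.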
A viscoelastic state is defined in general by the current value of strain and the
history $(E(t), E^t)$. The concept of a minimal state, defined in [3] (see also [6–8, 2])
can be expressed as follows: two viscoelastic states $(E_1(t), E_1^t)$, $(E_2(t), E_2^t)$ are
in the same minimal state if
$$E_1(t) = E_2(t);\qquad \int_0^\infty G'(s + \tau)\left[E_1^t(s) - E_2^t(s)\right] ds = 0 \quad \forall \tau \ge 0. \tag{2.5}$$
We refer to any functional of $(E(t), E^t)$ which gives the same value for any $(E_1(t), E_1^t)$ and $(E_2(t), E_2^t)$ obeying (2.5), as a function of the minimal state.
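For a finite-memory material, condition (2.5) cannot distinguish histories that differ only beyond the memory interval. The sketch below illustrates this with a hypothetical step-function $G'$ that vanishes for $s \ge d$ and with three invented histories. The names and numerical values are assumptions made for illustration, and the integral in (2.5) is approximated by the midpoint rule on a truncated range.

```python
import math

DURATION = 1.5      # hypothetical memory length d
G_STEP = 0.5        # hypothetical step height g

def relaxation_derivative(s):
    """Hypothetical finite-memory G'(s): -g on [0, d), zero elsewhere."""
    return -G_STEP if 0.0 <= s < DURATION else 0.0

def minimal_state_integral(history_1, history_2, tau, s_max=20.0, steps=20000):
    """Midpoint approximation of the integral in (2.5) for a given tau >= 0."""
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += relaxation_derivative(s + tau) * (history_1(s) - history_2(s)) * h
    return total

def base_history(s):
    return math.exp(-s)

def altered_far_past(s):
    """Differs from base_history only for s >= d, outside the memory interval."""
    return math.exp(-s) + (0.3 if s >= DURATION else 0.0)

def altered_recent_past(s):
    """Differs from base_history inside the memory interval, but agrees at s = 0."""
    return math.exp(-s) + 0.3 * s * math.exp(-s)

for tau in (0.0, 0.4, 1.0, 2.0):
    print(tau,
          minimal_state_integral(base_history, altered_far_past, tau),
          minimal_state_integral(base_history, altered_recent_past, tau))
```

In this example the first pair of histories satisfies (2.5) for every τ, so the two states are in the same minimal state, while the second pair fails the condition at small τ even though the current strains agree.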
For any $f \in L^2(\mathbb{R})$, we denote its Fourier transform by
$$f_F(\omega) = \int_{-\infty}^{\infty} f(\xi)\,e^{-i\omega\xi}\,d\xi,\qquad f_F \in L^2(\mathbb{R}). \tag{2.6}$$
If f is a real-valued function – which will be the case for all functions of interest
here, in the time domain – then
$$\bar{f}_F(\omega) = f_F(-\omega), \tag{2.7}$$
where the bar denotes complex conjugate.
⋆ Note that this notation differs from that in [1].
We have
fF (ω) = f+ (ω) + f (− (ω),
∞
f (ξ )e−iωξ dξ,
f+ (ω) =
0
0
f (ξ )e−iωξ dξ,
f− (ω) =
−∞
(2.8)
f± ∈ L2 (R),
where $f_+$ is analytic in $\Omega^{(-)}$ since it is the Fourier transform of a function that is zero on $R^{--}$. For the cases of interest in the present work, we also assume that it is analytic on $R$ and thus on $\Omega^{-}$. Similarly, $f_-$ is analytic on $\Omega^{+}$. What we mean by extending the analyticity to $R$ for, say, $f_+$ is that there is a constant $\epsilon > 0$ such that there are no singularities for $\operatorname{Im}(\omega) < \epsilon$. This of course amounts to assuming that $f(\xi)$ decays exponentially as $|\xi| \to \infty$. However, it will sometimes be possible to weaken this assumption by taking continuous limits, once final results have been obtained, for example, by extending branch points at which no discontinuity or infinity occurs up to the real axis.
In general, the integral definitions of $f_\pm$ are convergent only over part of $\Omega$. On the remainder of the complex plane (where the singularities lie), $f_\pm$ can be defined by analytic continuation, except at the singularities. If an explicit expression can be obtained for the integral, this is a trivial procedure.
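As a worked illustration of these analyticity statements, consider the two-sided exponential $f(\xi) = e^{-a|\xi|}$ with $a > 0$. This function is a hypothetical choice made here for illustration; it does not appear in the paper. The half-range transforms can be evaluated in closed form:

```latex
\begin{aligned}
f_+(\omega) &= \int_0^\infty e^{-a\xi}e^{-i\omega\xi}\,d\xi = \frac{1}{a + i\omega},
  \qquad \text{convergent for } \operatorname{Im}(\omega) < a,\\
f_-(\omega) &= \int_{-\infty}^0 e^{a\xi}e^{-i\omega\xi}\,d\xi = \frac{1}{a - i\omega},
  \qquad \text{convergent for } \operatorname{Im}(\omega) > -a,\\
f_F(\omega) &= f_+(\omega) + f_-(\omega) = \frac{2a}{a^2 + \omega^2}.
\end{aligned}
```

Here $f_+$ has its only singularity at $\omega = ia$ in the upper half-plane, so it is analytic in the lower half-plane and on the real axis, and the constant $\epsilon$ may be taken equal to $a$. The closed form $1/(a + i\omega)$ also supplies the analytic continuation of $f_+$ to the region $\operatorname{Im}(\omega) \ge a$ where the defining integral diverges.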
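Using the inverse transform to express $f$ in terms of $f_F$, we obtain
$$f_+(\omega) = -\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{f_F(\omega')}{\omega' - \omega^-}\,d\omega', \qquad f_-(\omega) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{f_F(\omega')}{\omega' - \omega^+}\,d\omega', \qquad \omega^\pm = \lim_{\alpha\to 0^+}(\omega \pm i\alpha). \tag{2.9}$$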
Thus, we move ω slightly into the half-plane of analyticity of f± respectively to
achieve convergence in the time integration. This also ensures that the integrals on
the right-hand side of (2.9) have a well-defined meaning. The limit is taken after
the integration is carried out.
Functions on $R$ which vanish identically on $R^{--}$ are defined as functions on $R^{+}$. For such quantities, $f_F = f_c - if_s$, where $f_c$, $f_s$ are the Fourier cosine and sine transforms
$$f_c(\omega) = \int_0^\infty f(\xi)\cos\omega\xi\,d\xi = f_c(-\omega), \qquad f_s(\omega) = \int_0^\infty f(\xi)\sin\omega\xi\,d\xi = -f_s(-\omega). \tag{2.10}$$
Thus
$$G'_F(\omega) = \int_0^\infty G'(s)e^{-i\omega s}\,ds = G'_c(\omega) - iG'_s(\omega). \tag{2.11}$$
MINIMUM FREE ENERGIES FOR FINITE MEMORY
Properties of $G'_s(\omega)$ include [9]
$$G'_s(\omega) \le 0 \quad \forall\omega \in R^{++}, \qquad G'_s(-\omega) = -G'_s(\omega) \quad \forall\omega \in R, \tag{2.12}$$
the first relation being a consequence of the second law of thermodynamics and
the second being a particular case of (2.10). It follows that G′s (0) = 0. Actually,
it was proved in [9] that G′s (ω) < 0 for positive ω, based on a restricted form of
the second law. It will be shown in Section 6 that this does not hold for at least two
finite memory examples. We also have [9]
$$G_\infty - G_0 = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{G'_s(\omega)}{\omega}\,d\omega < 0 \tag{2.13}$$
so that $G'_s(\omega)/\omega \in L^1(R)$. It follows from (2.3) and (2.13) that $G_0$ is positive definite.
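As a quick consistency check of (2.12) and (2.13), consider the single-exponential kernel $G'(s) = -Ke^{-\alpha s}$ with $K, \alpha > 0$. This is an illustrative infinite-memory choice assumed here; it is not one of the examples treated in this paper.

```latex
\begin{aligned}
G'_s(\omega) &= -K\int_0^\infty e^{-\alpha s}\sin\omega s\,ds
             = -\frac{K\omega}{\alpha^2 + \omega^2} \le 0 \quad (\omega > 0),\\
\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{G'_s(\omega)}{\omega}\,d\omega
             &= -\frac{K}{\pi}\int_{-\infty}^{\infty}\frac{d\omega}{\alpha^2 + \omega^2}
             = -\frac{K}{\alpha}
             = \int_0^\infty G'(s)\,ds = G_\infty - G_0 .
\end{aligned}
```

The sine transform is odd in $\omega$ and vanishes at the origin, as (2.12) requires, and the frequency integral reproduces the total relaxation $G_\infty - G_0 < 0$.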
The function $G'_F(\omega)$ is analytic on $\Omega^{(-)}$. As noted after (2.8), this is a consequence of the fact that $G'$ vanishes on $R^{--}$, which is essentially the requirement of causality [10]. It is assumed that $G'_F(\omega)$ is analytic on $R$ and therefore on $\Omega^{-}$. The quantity $G'_F(-\omega) = \overline{G'_F}(\omega)$ is analytic in $\Omega^{+}$, a mirror image, in the real axis, of the singularity structure of $G'_F(\omega)$. Thus, $G'_s(\omega)$ has singularities in both $\Omega^{(+)}$ and $\Omega^{(-)}$ which are mirror images of one another. Similarly, its zeros will be mirror images of one another. The singularity structure of
$$H(\omega) = -\omega G'_s(\omega) = H(-\omega) \ge 0 \quad \forall\omega \in R \tag{2.14}$$
will be of central interest. We have
$$H(\omega) = H_1(\omega^2) \tag{2.15}$$
which is a consequence of the analyticity of $H(\omega)$ on the real axis. It follows that $H(\omega)$ goes to zero at least quadratically at the origin. Note that $H(\omega)$ is positive semi-definite. The possibility of it vanishing for nonzero frequencies is not excluded, following the remarks after (2.12). It will be required in later developments that $H(\omega)$ can be written in the form
$$H(\omega) = H_+(\omega)H_-(\omega), \tag{2.16}$$
where $H_+(\omega)$ has no singularities or zeros in $\Omega^{(-)}$ and is thus analytic in $\Omega^{-}$. Similarly, $H_-(\omega)$ is analytic in $\Omega^{+}$ with no zeros in $\Omega^{(+)}$.
By considering the inverse sine transform of $G'_s(\omega)$ ([9], for example) one can show that for the infinite memory case (where $G'_F$ is analytic at infinity; we shall see that this implies that it has no finite memory component)
$$H_\infty = \lim_{|\omega|\to\infty}H(\omega) = -\lim_{\omega\to\infty}\omega G'_s(\omega) = -G'(0) \ge 0. \tag{2.17}$$
The sign of G′ (0) has been deduced by various authors from thermodynamic constraints in the general three-dimensional case [11, 12, 9]. We assume for present
purposes that $G'(0)$ is nonzero so that $H_\infty$ is a finite, positive number. Then $H(\omega) \in R^{++}$ $\forall\omega \in R$, $\omega \neq 0$. The result corresponding to (2.17) for the finite memory case is discussed in Section 4.
There is a non-uniqueness in the factorization (2.16) up to a constant phase
factor. We eliminate this by taking
$$H_\pm(\omega) = H_\mp(-\omega) = \overline{H}_\mp(\omega), \qquad H(\omega) = |H_\pm(\omega)|^2. \tag{2.18}$$
There is still an arbitrariness of sign in the sense that −H± (ω) are also acceptable
choices. It should be remarked that there is the possibility of a further, deeper
non-uniqueness deriving from the fact that, since H (ω) is not positive definite for
nonzero ω, the condition for unique factorization is not met [2]. The following sufficient conditions are also stated in [2] for the full tensorial case: G′ (0) < 0 which
has been assumed; G(·) − G∞ integrable; and G′′ integrable. This last condition
does not hold for the examples in Section 6, and since such non-uniqueness in fact
occurs for one of these examples, it may therefore be concluded that the condition
that G′′ be integrable is also necessary.
Consider now the strain history $E^t$. Define
$$E_+^t(\omega) = \int_0^\infty E^t(s)e^{-i\omega s}\,ds, \qquad E_+^t \in L^2(R). \tag{2.19}$$
It is analytic in $\Omega^{(-)}$, a property which will be assumed to extend to $\Omega^{-}$. It is defined in all or part of $\Omega^{(+)}$ by analytic continuation. We also require the Fourier transform of the relative history,
$$E_{r+}^t(\omega) = E_+^t(\omega) - E(t)\int_0^\infty e^{-i\omega s}\,ds = E_+^t(\omega) - \frac{E(t)}{i\omega^-}, \tag{2.20}$$
where $\omega^-$ is defined as in (2.9), the limit being taken, as noted, after any integration involving the quantity $(\omega^-)^{-1}$ has been carried out. Under a similar assumption to that for $E_+^t$, we may conclude that $E_{r+}^t$ is analytic in $\Omega^{-}$.
Similarly, we define (if $E^t(s)$, $s \in R^{-}$, belongs to $L^2(R^{-})$)
$$E_-^t(\omega) = \int_{-\infty}^0 E^t(s)e^{-i\omega s}\,ds, \quad E_-^t \in L^2(R), \qquad E_{r-}^t(\omega) = \int_{-\infty}^0 E_r^t(s)e^{-i\omega s}\,ds = E_-^t(\omega) + \frac{E(t)}{i\omega^+}, \tag{2.21}$$
both of which are analytic in $\Omega^{+}$.
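Analyticity at infinity is assumed for $E_+^t(\omega)$ though not for $E_-^t(\omega)$. Note that
$$\frac{dE_+^t(\omega)}{dt} = -i\omega E_+^t(\omega) + E(t) \tag{2.22}$$
yielding that
$$\frac{dE_{r+}^t(\omega)}{dt} = -i\omega E_{r+}^t(\omega) - \frac{\dot E(t)}{i\omega^-}. \tag{2.23}$$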
Note also that $T(t)$, given by (2.1), is independent of $E^t(s)$, $s \in R^{--}$, which allows us to extend $G'$ to $R$ as, for example, an odd function. Using this fact and Plancherel's theorem for the Fourier transform [13, 14] gives that [1, 15, 16]
$$T(t) = G_0E(t) - \frac{1}{\pi i}\int_{-\infty}^{\infty}G'_s(\omega)E_+^t(\omega)\,d\omega = G_\infty E(t) - \frac{1}{\pi i}\int_{-\infty}^{\infty}G'_s(\omega)E_{r+}^t(\omega)\,d\omega, \tag{2.24}$$
where (2.13) has been used in writing (2.24)$_2$.
3. Viscoelastic Membrane
In this section we recall a result obtained in [17], in order to show that an infinite
memory, represented by a relaxation function G′ ∈ L1 (R+ ), places very strong
restrictions on the asymptotic behaviour of solutions of viscoelastic boundary and
initial value problems. To the extent that these restrictions run counter to physical
intuition, the results derived provide a motivation for considering finite memory
materials.
Let us consider a viscoelastic membrane occupying the region $B$ with boundary $\partial B$, represented by the system
$$u_{tt} = G_0\Delta u + \int_0^\infty G'(s)\Delta u^t(s)\,ds - au_t \tag{3.1}$$
with boundary and initial conditions given by
$$u|_{\partial B} = 0, \qquad u(x,0) = u_0(x), \qquad u_t(x,0) = u_1(x), \tag{3.2}$$
where $\Delta$ is the Laplacian operator and the constant $a \ge 0$ denotes the coefficient of a viscous force.
Let us consider the problem (3.1), (3.2) on the domain Q = B × (0, ∞). We
suppose a > 0, G′ ∈ L1 (R) ∩ L2 (R), u0 ∈ H 1 (B), u1 ∈ L2 (B).
DEFINITION 3.1. A function $u \in L^2(R^+, H_0^1(B)) \cap H^1(R^+, L^2(B))$ is said to be a weak solution of the problem (3.1), (3.2) with data $u_0 \in H^1(B)$, $u_1 \in L^2(B)$, if $u$ satisfies the integral equation
$$\int_Q u_t(x,t)\varphi_t(x,t)\,dx\,dt + \int_B u_1(x)\varphi(x,0)\,dx = \int_Q\Big[G_0\nabla u(x,t)\cdot\nabla\varphi(x,t) + \int_0^\infty G'(s)\nabla u^t(x,s)\cdot\nabla\varphi(x,t)\,ds + au_t(x,t)\varphi(x,t)\Big]dx\,dt \tag{3.3}$$
for all $\varphi \in L^2(R^+, H_0^1(B)) \cap H^1(R^+, L^2(B))$.
A function $g \in L^1(R^+)$ is said to decay exponentially if there exists a positive $\beta$ such that
$$\int_0^\infty e^{\beta t}|g(t)|\,dt < \infty. \tag{3.4}$$
We say that a weak solution $u$ decays exponentially if $E^{1/2}(t)$ decays exponentially, where the energy $E$ is given by
$$E(t) = \int_B\big(u_t^2(x,t) + [\nabla u(x,t)]^2\big)\,dx.$$
One of the main results contained in [17] is the following:
THEOREM 3.1. If G′ does not decay exponentially, then the weak solution of
problem (3.1), (3.2) does not decay exponentially.
When the relaxation function $G' = 0$ and $a > 0$, the solutions of the system (3.1), (3.2) exhibit exponential decay.⋆ In contrast, Theorem 3.1 gives that when $a > 0$ and $G' \neq 0$ does not decay exponentially, then the solutions do not exhibit exponential decay, even though the memory term represents a dissipative effect. In other words, the decay to zero of the solutions of (3.1), (3.2) for $t \to \infty$ cannot be faster than the decay to zero of the kernel $G'$ for $s \to \infty$. Of course
this constraint does not apply if the memory is finite. Roughly speaking, when
G′ is nonzero on an infinite interval, then the memory term not only provides a
dissipative effect, but also a braking effect on the decay to zero of the solutions.
For many problems, these two effects are physically contradictory. In such cases it
is more suitable to use a kernel G′ which is nonzero only in a finite interval.
4. Finite Memory
We now explore the case where
$$G'(t) = 0, \qquad t > d > 0, \tag{4.1}$$
so that (2.11) becomes
$$G'_F(\omega) = \int_0^d G'(s)e^{-i\omega s}\,ds \tag{4.2}$$
and the inverse relationship is
$$G'(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}G'_F(\omega)e^{i\omega s}\,d\omega. \tag{4.3}$$
⋆ We have the same behaviour if the memory is finite, i.e., there exists a $d \in R^{++}$ such that $G'(s) = 0$ for any $s \ge d$.
Before proving an important result, it is relevant to note that the function eiz has
an essential singularity at infinity. This is manifested as an exponential divergence
as Im(z) → −∞ and as an exponential decay as Im(z) → ∞. We shall, for
brevity, and in the context of this paper, refer to eiz as analytic/convergent on the
upper half-plane, meaning analytic on the finite part of the plane and exponentially
convergent to zero as Im(z) → ∞.
The function $e^{-i\omega b}$ on $\Omega$, where $b > 0$, diverges exponentially in $\Omega^{(+)}$ as $\operatorname{Im}(\omega) \to +\infty$. Similarly, $e^{i\omega b}$, where $b > 0$, diverges exponentially in $\Omega^{(-)}$ as $\operatorname{Im}(\omega) \to -\infty$. We refer to these as exponential divergences of order $b$ in the respective half-planes.
The following is now proved:
PROPOSITION 4.1. Relation (4.1) is true if and only if $G'_F$ has only essential singularities at infinity with exponential divergences in $\Omega^{(+)}$ of order $d$ and perhaps others of lower order.
Proof. By transforming the integration variable in (4.2), we obtain
$$G'_F(\omega) = e^{-i\omega d}g(\omega), \qquad g(\omega) = \int_{-d}^0 G'(s+d)e^{-i\omega s}\,ds. \tag{4.4}$$
It follows from its definition that $g(\omega)$ is analytic on $\Omega^{(+)}$ and (4.4)$_1$ gives that the only singularities of $G'_F$ in $\Omega^{(+)}$ are exponential divergences at $+\infty$ due to its factor $e^{-i\omega d}$. Since $g(\omega)$ may contain factors $e^{i\omega c}$, $c > 0$, we see that while the dominant divergence is of order $d$, there may be others of order $d - c$, $0 < c < d$. Note also that
$$g(\omega) = e^{i\omega d}G'_F(\omega) \tag{4.5}$$
has only exponential divergences in $\Omega^{(-)}$ of order $d$ and possibly others of lower order.
To prove the converse, we combine (4.3) and (4.4) to obtain
$$G'(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}g(\omega)e^{i\omega(s-d)}\,d\omega. \tag{4.6}$$
If $s > d$, the contour can be closed in $\Omega^{(+)}$ with the contribution from the infinite portion exponentially attenuated. The analyticity of $g(\omega)$ in $\Omega^{(+)}$ ensures that the result is zero.
✷
REMARK 4.1. The first part of Proposition 4.1 can be seen by the following,
perhaps more intuitive argument. From the fact that G′F , given by (4.2), is defined
by an integral of finite range, we see that it is an entire function on the complex
frequency plane; and therefore, if not a polynomial, must have an essential singularity at infinity [18]. The dominant singular behaviour can be deduced without
difficulty from (4.2). It is worth noting that the properties of Fourier transforms
of functions that are nonzero only on finite regions are of interest also in signal
processing applications [19].
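The divergence order in Proposition 4.1 can be observed numerically. The sketch below is an illustration only: it assumes the constant kernel $G' = -K_0$ on $[0, d)$ that is treated later in (6.2), uses arbitrary values $K_0 = 1$, $d = 2$, and all function names are hypothetical. It evaluates the closed form of (4.2), cross-checks it against a midpoint-rule quadrature, and estimates the exponential order from the growth along the positive imaginary axis.

```python
import cmath
import math


def box_kernel_transform(omega, K0=1.0, d=2.0):
    """Closed form of (4.2) for G'(s) = -K0 on [0, d): i*K0*(1 - exp(-i*omega*d))/omega."""
    return 1j * K0 * (1.0 - cmath.exp(-1j * omega * d)) / omega


def box_kernel_transform_quadrature(omega, K0=1.0, d=2.0, n=2000):
    """Midpoint-rule evaluation of the same finite-range integral (cross-check only)."""
    h = d / n
    total = 0j
    for j in range(n):
        s = (j + 0.5) * h
        total += -K0 * cmath.exp(-1j * omega * s) * h
    return total


def divergence_order(y, K0=1.0, d=2.0):
    """Estimate the exponential order from log(y*|G'_F(i*y)|/K0)/y for large y > 0."""
    magnitude = abs(box_kernel_transform(1j * y, K0, d))
    return math.log(magnitude * y / K0) / y
```

For large positive `y` the estimate returned by `divergence_order` approaches the memory length `d`, while on the negative imaginary axis the transform stays bounded by `K0/|y|`, which is the analytic/convergent behaviour described above.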
An immediate consequence of Proposition 4.1 is that the assumption that G′F is
analytic at infinity, which was made in earlier work, must now be dropped.
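We have
$$\lim_{\omega\to\infty}\omega G'_F(\omega) = -iG'(0)\Big[1 - \lim_{\omega\to\infty}e^{-i\omega d}\Big], \tag{4.7}$$
where the limit is taken along the real axis or any axis parallel to the real axis. This is of course finite but not well-defined. Also
$$\lim_{\operatorname{Im}(\omega)\to-\infty}\omega G'_F(\omega) = -iG'(0). \tag{4.8}$$
Furthermore,
$$\lim_{\operatorname{Im}(\omega)\to\infty}\omega g(\omega) = iG'(d^-). \tag{4.9}$$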
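These limits can be checked directly on the constant kernel $G' = -K_0$ on $[0, d)$ considered in (6.2), for which $G'(0) = G'(d^-) = -K_0$. The following short derivation is offered as an illustration under that assumption:

```latex
\begin{aligned}
\omega G'_F(\omega) &= iK_0\bigl(1 - e^{-i\omega d}\bigr), &
\omega g(\omega) &= \omega e^{i\omega d}G'_F(\omega) = iK_0\bigl(e^{i\omega d} - 1\bigr),\\
\lim_{\operatorname{Im}(\omega)\to-\infty}\omega G'_F(\omega) &= iK_0 = -iG'(0), &
\lim_{\operatorname{Im}(\omega)\to+\infty}\omega g(\omega) &= -iK_0 = iG'(d^-).
\end{aligned}
```

Along the real axis the first expression oscillates without settling, in line with the remark after (4.7).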
The quantities of central interest are $H$, given by (2.14), and its factors $H_\pm$ defined by (2.16). The function $H$ has exponential divergences as $\operatorname{Im}(\omega)$ approaches infinity in both $\Omega^{(+)}$ and $\Omega^{(-)}$; the factor $H_+$ has exponential divergences only in $\Omega^{(+)}$ and $H_-$ only in $\Omega^{(-)}$.
Along the real axis or any axis parallel to it,
$$\lim_{\omega\to\infty}H(\omega) = -G'(0)\Big[1 - \lim_{\omega\to\infty}\cos\omega d\Big] \tag{4.10}$$
by virtue of (4.7). This limit does not exist. There is however no resultant infinity. We have
$$\lim_{\operatorname{Im}(\omega)\to\mp\infty}H_\pm(\omega) = h_\infty = \{-G'(0)\}^{1/2}. \tag{4.11}$$
The method of factorization given in [1] is not useful in this case. Special
techniques must be employed, as outlined in Section 6.
5. An Expression for the Minimum Free Energy
Let ψ(t) be a free energy functional for the system under consideration. Then the
Clausius–Duhem inequality, adapted to the isothermal case:
$$T(t)\dot E(t) - \dot\psi(t) = D(t) \ge 0 \tag{5.1}$$
requires, by virtue of standard arguments [20, 21], that
$$T(t) = \frac{\partial\psi(t)}{\partial E(t)} \tag{5.2}$$
provided that ψ(t) has certain differentiability properties. The quantity D(t) is the
internal dissipation function. Denoting by ψs (t) the free energy for static histories
equal to E(t) for all past times, then
$$\psi_s(t) = \phi(t) = \tfrac{1}{2}G_\infty E^2(t) \tag{5.3}$$
or the elastic stored energy. The second law requires that for all histories [20, 21]
$$\psi(t) \ge \psi_s(t), \tag{5.4}$$
where equality occurs by definition for static histories. It follows that ψ(t) is nonnegative. We take (5.1)–(5.4) to be the defining properties of a free energy [9].
The derivation of the form of the minimum free energy as presented in [1] and
other relevant formulae will be sketched here, with some change and simplification,
in order to clarify that the essential singularities at infinity in H± do not invalidate
the argument.
Let us consider the work function which is also the maximum free energy if the state is defined as $(E(t), E^t)$ [22]:
$$W(t) = \psi_M(t) = \int_{-\infty}^{t}T(s)\dot E(s)\,ds = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty}H(\omega)|E_{r+}^t(\omega)|^2\,d\omega. \tag{5.5}$$
The second relation is derived in [15, 1]. We wish to establish an expression for
the minimum free energy at a specified state, which is given by the maximum
recoverable work from that state [12, 9]. It can be shown that this quantity is a
function of the minimal state [3]. It will be assumed that the strain tends to zero at
large times [2] though the eventual optimal continuation will not have this property.
The limit of $W(u)$ as $u \to +\infty$ gives [1]
$$\psi^t = \psi_M^t(\infty) = \int_{-\infty}^{\infty}ds\,T(s)\dot E(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,H(\omega)|E_+^t(\omega) + E_-^t(\omega)|^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,H(\omega)|E_{r+}^t(\omega) + E_{r-}^t(\omega)|^2. \tag{5.6}$$
The last form follows from the previous term on using (2.20) and (2.21) and the fact that $H(\omega)$ vanishes quadratically at the origin. It has been allowed that $\psi_M^t(\infty)$ may depend on $t$ in anticipation of the fact that the eventual optimal continuation will be dependent on the current time.
We assume that $E_{r+}^t(\omega)$ is given and wish to find the choice of $E_{r-}^t(\omega)$ which maximizes the recoverable work
$$W_r = -\int_t^{\infty}dt'\,T(t')\dot E(t') = \int_{-\infty}^{t}dt'\,T(t')\dot E(t') - \int_{-\infty}^{\infty}dt'\,T(t')\dot E(t') = \psi_M(t) - \psi^t. \tag{5.7}$$
Now, $\psi_M(t)$ is given, so we seek the choice of $E_{r-}^t(\omega)$ which minimizes $\psi^t$. Let $E_m^t(\omega)$ (not the same notation as in [1]) be that choice. It will be assumed that it (and other continuations) may have essential singularities at infinity, similar to those of $G'_F$. Thus, we will use the term analytic/convergent with respect to it. If we replace it by $E_m^t(\omega) + k(\omega)$ where $k(\omega)$ is arbitrary, apart from the fact that $\bar k(\omega) = k(-\omega)$, that it vanishes at least as strongly as $\omega^{-1}$ at large frequencies in $\Omega^{+}$ and that it is analytic/convergent in $\Omega^{+}$ (any choice of $E_{r-}^t$ with possible essential singularities similar to those of $G'_F$ must have these properties), the resulting integral must not be smaller. It is easy to show that this is assured if
$$\int_{-\infty}^{\infty}d\omega\,H(\omega)\operatorname{Re}\big\{k(-\omega)\big(E_{r+}^t(\omega) + E_m^t(\omega)\big)\big\} = 0. \tag{5.8}$$
Let us impose the equivalent condition that
$$\int_{-\infty}^{\infty}d\omega\,H(\omega)k(-\omega)\big[E_{r+}^t(\omega) + E_m^t(\omega)\big] = \int_{-\infty}^{\infty}d\omega\,H_+(\omega)k(-\omega)\big[H_-(\omega)E_{r+}^t(\omega) + H_-(\omega)E_m^t(\omega)\big] = 0. \tag{5.9}$$
Note that $E_{r+}^t$ is analytic and $H_+(\omega)$, $k(-\omega)$ are analytic/convergent in $\Omega^{-}$ while $E_m^t(\omega)$ and $H_-(\omega)$ are analytic/convergent in $\Omega^{+}$. Let
$$P^t(\omega) = H_-(\omega)E_{r+}^t(\omega) = p_-^t(\omega) - p_+^t(\omega), \tag{5.10}$$
where [23]
$$p^t(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}d\omega'\,\frac{P^t(\omega')}{\omega' - z} \tag{5.11}$$
and $p_-^t(\omega)$ is the limit of $p^t(z)$ on the real axis from above. It is analytic in $\Omega^{(+)}$. Also, $p_+^t(\omega)$ is the limit from below and is analytic in $\Omega^{(-)}$. This inverted notational convention is adopted to retain conformity with other notation introduced earlier. The function $P^t(\omega)$ is analytic on the real axis. Closing the contour on $\Omega^{(+)}$ (where $H_-(\omega)$ is analytic/convergent), we pick up the singularities of $E_{r+}^t$, none of which are on $R$ or at infinity. Thus, we see that $p_\pm^t$ are analytic on $R$. We have $\bar p_\pm^t(\omega) = p_\pm^t(-\omega)$, $\omega \in R$. The definitions of these functions may be extended to $\Omega$ by analytic continuation, as discussed before (2.9).
The product $H_+(\omega)k(-\omega)p_+^t(\omega)$ vanishes as $\omega^{-2}$ for large $\omega$ in $\Omega^{(-)}$ since $k(-\omega)$ and $p_+^t(\omega)$ each vanish as $\omega^{-1}$. Therefore the integral of this function over the real axis can be extended to an infinite contour on $\Omega^{(-)}$ without altering its value. The result is zero because of the convergence of the integrand on $\Omega^{(-)}$. Therefore (5.9) becomes
$$\int_{-\infty}^{\infty}d\omega\,H_+(\omega)k(-\omega)\big[p_-^t(\omega) + H_-(\omega)E_m^t(\omega)\big] = 0. \tag{5.12}$$
This will be true for arbitrary $k(-\omega)$ only if the expression in brackets is a function that is analytic or analytic/convergent in $\Omega^{(-)}$, vanishing at infinity. However, $E_m^t(\omega)$ must be analytic/convergent in $\Omega^{+}$. Remembering that $p_-^t(\omega)$ is analytic and $H_-(\omega)$ analytic/convergent in $\Omega^{+}$, we see that the expression in brackets must be analytic/convergent in both the upper and lower half-planes and on the real axis. Thus, it is analytic/convergent over the entire complex plane and thus analytic over the finite part. Now $p_-^t(\omega)$ vanishes as $\omega^{-1}$ at infinity (at least in $\Omega^{+}$), as also must $E_m^t(\omega)$ if the strain function is to be finite at $s = 0$. Therefore, the function is analytic everywhere, zero at infinity and so must vanish everywhere by Liouville's theorem. Thus
$$E_m^t(\omega) = -\frac{p_-^t(\omega)}{H_-(\omega)} = -\frac{1}{2\pi i\,H_-(\omega)}\int_{-\infty}^{\infty}d\omega'\,\frac{H_-(\omega')E_{r+}^t(\omega')}{\omega' - \omega^+}. \tag{5.13}$$
Observe that the pole at the origin due to $H_-(\omega)$ in the denominator must be shifted to $\Omega^{(-)}$, i.e., $[H_-(\omega)]^{-1}$ behaves as $(\omega^+)^{-1}$ near the origin.
Substituting (5.13) into (5.6) and using (5.10), we see that the minimum value of $\psi^t$ is given by
$$\psi_m^t = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,|p_+^t(\omega)|^2. \tag{5.14}$$
Also, from (5.5)
$$\psi_M(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,|p_+^t(\omega) - p_-^t(\omega)|^2 = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,\big[|p_+^t(\omega)|^2 + |p_-^t(\omega)|^2\big] \tag{5.15}$$
since the cross-term
$$p_-^t(-\omega)p_+^t(\omega) + p_-^t(\omega)p_+^t(-\omega) \tag{5.16}$$
consists of terms that are analytic in $\Omega^{(-)}$ and $\Omega^{(+)}$ respectively and which vanish as $\omega^{-2}$ on the infinite boundary of these domains. By closing the contour on the half-plane over which a given term is analytic, one obtains zero.
The minimum free energy $\psi_m(t)$ is equal to the quantity $W_r$ evaluated for $\psi^t = \psi_m^t$, giving
$$\psi_m(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty}|p_-^t(\omega)|^2\,d\omega \le \psi_M(t). \tag{5.17}$$
We can write (5.17) in the form:
$$\psi_m(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,H(\omega)|E_m^t(\omega)|^2. \tag{5.18}$$
Relation (5.2) can be shown to be obeyed by $\psi_m(t)$, using the relation
$$\frac{\partial p_-^t(\omega)}{\partial E(t)} = -\frac{H_-(\omega)}{i\omega}, \tag{5.19}$$
together with equation (2.24)$_2$ and carrying out certain contour integrals on the appropriate half-planes. Also, relations (5.3) and (5.4) follow from (5.18), on observing [1] that, for a static history, $E_m^t(\omega)$ vanishes.
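Lastly, we must show that
$$D_m(t) = T(t)\dot E(t) - \dot\psi_m(t) \tag{5.20}$$
is non-negative. From (5.7)$_3$ and (5.14) we see that
$$D_m(t) = \frac{d}{dt}\psi_m^t = \frac{1}{2\pi}\frac{d}{dt}\int_{-\infty}^{\infty}d\omega\,|p_+^t(\omega)|^2. \tag{5.21}$$
One finds that
$$D_m(t) = |K(t)|^2, \tag{5.22}$$
where $K(t)$ is a real number given by
$$K(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,H_-(\omega)E_{r+}^t(\omega) \tag{5.23}$$
on using the first of the relationships
$$\frac{d}{dt}p_+^t(\omega) = -i\omega p_+^t(\omega) - K(t), \qquad \frac{d}{dt}p_-^t(\omega) = -i\omega p_-^t(\omega) - K(t) - \frac{H_-(\omega)\dot E(t)}{i\omega}, \tag{5.24}$$
which are derived by using (2.23) in (5.11). Also required are the relationships for $p_+^t$ from
$$\lim_{|\omega|\to\infty}\omega p_\pm^t(\omega) = iK(t), \qquad \frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega\,p_\pm^t(-\omega) = \mp\frac{1}{2}K(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}p_\pm^t(\omega)\,d\omega. \tag{5.25}$$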
By steps similar to those in [1], it can also be shown with the aid of (5.13) and
(5.25) that the optimal relative continuation Emt (s) does not tend to zero as s → 0.
Relation (4.11) is required for this demonstration. Also, Emt (s) + E(t) does not
tend to zero as s → ∞.
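We see from (2.8) and (2.9) that if
$$Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}P^t(\omega)e^{i\omega s}\,d\omega, \tag{5.26}$$
where $P^t$ is defined by (5.10), then
$$p_+^t(\omega) = -\int_0^{\infty}Y^t(s)e^{-i\omega s}\,ds, \qquad p_-^t(\omega) = \int_{-\infty}^{0}Y^t(s)e^{-i\omega s}\,ds. \tag{5.27}$$
From (2.14) and (4.4) we have that for $\omega \in R$
$$H(\omega) = \frac{\omega}{2i}\big[g(\omega)e^{-i\omega d} - \bar g(\omega)e^{i\omega d}\big] \tag{5.28}$$
so that
$$H(\omega) \underset{\operatorname{Im}(\omega)\to+\infty}{\sim} \frac{\omega}{2i}g(\omega)e^{-i\omega d}, \qquad H(\omega) \underset{\operatorname{Im}(\omega)\to-\infty}{\sim} -\frac{\omega}{2i}\bar g(\omega)e^{i\omega d}, \tag{5.29}$$
where $\bar g$ indicates the complex conjugate of the function but not the argument. It follows from (4.9) that the dominant behaviour of the factors is
$$H_+(\omega) \underset{\operatorname{Im}(\omega)\to+\infty}{\sim} Ae^{-i\omega d}, \qquad H_-(\omega) \underset{\operatorname{Im}(\omega)\to-\infty}{\sim} \bar Ae^{i\omega d}, \tag{5.30}$$
where $A$ is a constant. One can deduce from (5.30) that $Y^t(s)$, given by (5.26), vanishes for $s < -d$ by closing the contour on $\Omega^{(-)}$, so that (5.27)$_2$ becomes
$$p_-^t(\omega) = \int_{-d}^{0}Y^t(s)e^{-i\omega s}\,ds. \tag{5.31}$$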
Finally, in this section, we prove the following result.
PROPOSITION 5.1. For a material with finite memory of duration $d$, the minimum free energy $\psi_m$ depends only on that part of the history for which $G'$ is nonzero, i.e., $E_r^t(s)$, $0 \le s \le d$; while $\psi_M$ may depend on the entire history of strain.
Proof. We have from (5.26) that
$$Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}H_-(\omega)\int_0^{\infty}E_r^t(u)e^{i\omega(s-u)}\,du\,d\omega. \tag{5.32}$$
It follows from (5.30)$_2$ that
$$\int_{-\infty}^{\infty}H_-(\omega)e^{i\omega(s-u)}\,d\omega = 0 \quad \forall\, s + d < u \tag{5.33}$$
so that
$$Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}H_-(\omega)\int_0^{s+d}E_r^t(u)e^{i\omega(s-u)}\,du\,d\omega. \tag{5.34}$$
It is clear now that $p_-^t(\omega)$, given by (5.31), depends only on $E_r^t(s)$, $0 \le s \le d$. However, $p_+^t(\omega)$, given by (5.27)$_1$, may depend on the entire history of strain. The result follows from (5.15) and (5.17).
✷
A consequence of Proposition 5.1 is that a time domain representation of the minimum free energy, given in [24], reduces to the form
$$\psi_m(t) = \phi(t) + \frac{1}{2}\int_0^d\!\!\int_0^d E_r^t(s_1)\,\frac{\partial^2}{\partial s_1\partial s_2}G(s_1, s_2)\,E_r^t(s_2)\,ds_1\,ds_2 \tag{5.35}$$
rather than this expression with infinite integrations as in the general case. An expression for $G(s_1, s_2)$ can be given in terms of time domain representations of the factors $H_\pm$ [24].
REMARK 5.1. It is interesting to consider Proposition 5.1 against a more general background. From (2.5), we see that, for a finite memory material, the condition that two viscoelastic states are in the same minimal state depends only on the values of the histories in the time interval $[0, d]$. In particular, the quantity $p_-^t$ is a function of the minimal state [2]. A function of the minimal state can depend only on the history in the interval $[0, d]$, since values outside of this interval can be varied arbitrarily without altering the minimal state. Thus $p_-^t$ and $\psi_m$ must have this property, as shown by Proposition 5.1.
6. Factorization of H (ω)
Let us now address the problem of factorization of H (ω) as given by (2.16) in the
finite memory case, for specified forms of the relaxation function.
Consider the ansatz
$$H_+(\omega) = e^{-i\omega d/2}\{H(\omega)\}^{1/2}, \qquad H_-(\omega) = e^{i\omega d/2}\{H(\omega)\}^{1/2}, \qquad \omega \in R, \tag{6.1}$$
where $\{H(\omega)\}^{1/2}$ is assumed to be analytic at all finite points in $\Omega$. We note that $H$ vanishes at $\omega = 0$ where it has a quadratic zero that does not produce a branch point. It is assumed that any other zero of $H$ in $\Omega$ is of even power type. The quantity $H_+$ has an exponential divergence of order $d$ as $\operatorname{Im}(\omega) \to +\infty$ and is analytic/convergent in $\Omega^{(-)}$. Similarly, $H_-$ has an exponential divergence of order $d$ as $\operatorname{Im}(\omega) \to -\infty$ and is analytic/convergent in $\Omega^{(+)}$. These are consistent with (5.30).
Let us now look at specific cases. Consider first the choice
$$G'(t) = \begin{cases}-K_0, & 0 \le t < d,\\ 0, & t \ge d,\end{cases} \tag{6.2}$$
where
$$K_0 = \frac{G_0 - G_\infty}{d} > 0. \tag{6.3}$$
Then
$$G'_F(\omega) = \frac{iK_0}{\omega}\big(1 - e^{-i\omega d}\big), \qquad H(\omega) = K_0(1 - \cos\omega d) = 2K_0\sin^2\frac{\omega d}{2} \tag{6.4}$$
which has zeros at $\omega d = 2n\pi$ for all integer values of $n$ and is thus not positive definite for nonzero $\omega$. Also
$$\{H(\omega)\}^{1/2} = \sqrt{2K_0}\,\sin\frac{\omega d}{2} \tag{6.5}$$
so that, from (6.1),
$$H_+(\omega) = \sqrt{\frac{K_0}{2}}\,\big(1 - e^{-i\omega d}\big), \qquad H_-(\omega) = \sqrt{\frac{K_0}{2}}\,\big(1 - e^{i\omega d}\big), \tag{6.6}$$
where a factor $i$ has been omitted, to obtain agreement with (2.18). Beyond this example, relation (6.1) would seem to have limited applicability.
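The factorization (6.6) is easy to confirm numerically. The following is a minimal sketch; the values of `K0` and `d` are arbitrary illustrative choices and the function names are not from the paper. It checks that the product of the factors reproduces (6.4) on the real axis and that the symmetry (2.18) holds.

```python
import cmath
import math

K0, d = 1.5, 2.0  # illustrative values only


def H(omega):
    """H of (6.4) for real omega."""
    return K0 * (1.0 - math.cos(omega * d))


def H_plus(omega):
    """Factor H_+ of (6.6)."""
    return math.sqrt(K0 / 2.0) * (1.0 - cmath.exp(-1j * omega * d))


def H_minus(omega):
    """Factor H_- of (6.6)."""
    return math.sqrt(K0 / 2.0) * (1.0 - cmath.exp(1j * omega * d))


def max_factorization_error(samples):
    """Largest deviation of H_+ * H_- from H over the given real frequencies."""
    return max(abs(H_plus(w) * H_minus(w) - H(w)) for w in samples)
```

Evaluating `max_factorization_error` on a grid of real frequencies gives values at rounding level, and `H` vanishes at every `omega = 2*n*pi/d`, which is the loss of positive definiteness noted after (6.4).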
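Consider next the case
$$G'(t) = \begin{cases}-K_0, & 0 \le t < d/2,\\ -K_1, & d/2 \le t < d,\\ 0, & t \ge d,\end{cases} \qquad 0 < K_1 < K_0. \tag{6.7}$$
We have that
$$\frac{K_0 + K_1}{2} = \frac{G_0 - G_\infty}{d} \tag{6.8}$$
and
$$G'_F(\omega) = \frac{i}{\omega}\big[K_0 - (K_0 - K_1)e^{-i\omega d/2} - K_1e^{-i\omega d}\big]. \tag{6.9}$$
It follows that
$$H(\omega) = K_0 - (K_0 - K_1)\cos\frac{\omega d}{2} - K_1\cos(\omega d) \ge 0 \quad \forall\omega \in R \tag{6.10}$$
which again vanishes for discrete nonzero values of $\omega$. We look for factors of the form
$$H_+(\omega) = b_0 + b_1e^{-i\omega d/2} + b_2e^{-i\omega d}, \qquad H_-(\omega) = b_0 + b_1e^{i\omega d/2} + b_2e^{i\omega d}, \tag{6.11}$$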
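where $b_0$, $b_1$, $b_2$ are real. Using (2.16) and comparing coefficients gives the conditions
$$b_0^2 + b_1^2 + b_2^2 = K_0, \qquad b_1(b_0 + b_2) = \frac{K_1 - K_0}{2}, \qquad b_0b_2 = -\frac{K_1}{2} \tag{6.12}$$
with solution
$$b_1 = \epsilon_1\sqrt{\frac{K_0 - K_1}{2}}, \quad \epsilon_1 = \pm 1, \qquad b_0 = \frac{1}{2}\Big[-b_1 + \epsilon_2\sqrt{b_1^2 + 2K_1}\Big], \qquad b_2 = \frac{1}{2}\Big[-b_1 - \epsilon_2\sqrt{b_1^2 + 2K_1}\Big], \quad \epsilon_2 = \pm 1. \tag{6.13}$$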
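Observe that $b_0 + b_1 + b_2 = 0$ as required for $H_\pm$ to have zeros at $\omega = 0$. If we choose
$$K_0 = 2K_1 \tag{6.14}$$
then these reduce to
$$b_1 = \epsilon_1\frac{\sqrt{K_0}}{2}, \qquad b_0 = \frac{\sqrt{K_0}}{4}\big[-\epsilon_1 + \epsilon_2\sqrt{5}\big], \qquad b_2 = \frac{\sqrt{K_0}}{4}\big[-\epsilon_1 - \epsilon_2\sqrt{5}\big]. \tag{6.15}$$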
Since H± are arbitrary up to a constant real phase factor, we choose ǫ1 = 1.
However, there remain two possible solutions, corresponding to an interchange
of b0 and b2 . A choice will be made between these two possibilities below. This is
the lack of uniqueness in the factorization referred to after (2.18).
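The closed-form solution (6.13) can be checked against the conditions (6.12) with a few lines of code. This is a sketch with hypothetical function names and arbitrary test values; it simply evaluates the coefficients for a given sign choice and returns the residuals of (6.12) together with the sum $b_0 + b_1 + b_2$.

```python
import math


def two_step_coefficients(K0, K1, eps1=1, eps2=-1):
    """Coefficients (b0, b1, b2) of (6.13) for step heights K0 > K1 > 0."""
    b1 = eps1 * math.sqrt((K0 - K1) / 2.0)
    root = math.sqrt(b1 * b1 + 2.0 * K1)
    b0 = 0.5 * (-b1 + eps2 * root)
    b2 = 0.5 * (-b1 - eps2 * root)
    return b0, b1, b2


def residuals(K0, K1, b):
    """Residuals of the three conditions (6.12), followed by b0 + b1 + b2."""
    b0, b1, b2 = b
    return (
        b0 * b0 + b1 * b1 + b2 * b2 - K0,
        b1 * (b0 + b2) - (K1 - K0) / 2.0,
        b0 * b2 + K1 / 2.0,
        b0 + b1 + b2,
    )
```

Switching `eps2` between `+1` and `-1` interchanges `b0` and `b2` while leaving every residual at zero, which is the two-fold ambiguity described in the text.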
We finish this section with a few observations on the case of more general step-function forms of $G'$. For $n$ steps of equal time gaps, (6.11) and (6.12) are replaced by
$$H_\pm(\omega) = \sum_{r=0}^{n}b_re^{\mp ir\omega d/n} \tag{6.16}$$
and
$$\sum_{l=0}^{n-r}b_lb_{l+r} = a_r, \quad r = 0, 1, \dots, n; \qquad a_0 = K_0; \quad a_r = \frac{K_r - K_{r-1}}{2}, \quad r = 1, 2, \dots, n-1; \quad a_n = -\frac{K_{n-1}}{2}, \tag{6.17}$$
where the quantities $K_r$ are obvious generalizations of $K_0$ and $K_1$ in (6.7).
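The targets $a_r$ in (6.17) follow from writing $H$ for an $n$-step kernel as a cosine sum. The sketch below is an illustration with hypothetical helper names: it builds the $a_r$ from a list of step heights, evaluates $H(\omega) = a_0 + 2\sum_{r=1}^{n}a_r\cos(r\omega d/n)$, compares it with $-\omega G'_s(\omega)$ integrated exactly step by step, and provides the sums $\sum_l b_lb_{l+r}$ so that a candidate set of coefficients can be tested.

```python
import math


def autocorrelation_targets(K):
    """Right-hand sides a_0..a_n of (6.17) for step heights K[0..n-1]."""
    n = len(K)
    a = [K[0]]
    a += [(K[r] - K[r - 1]) / 2.0 for r in range(1, n)]
    a.append(-K[n - 1] / 2.0)
    return a


def H_from_targets(omega, K, d):
    """H as the cosine sum a_0 + 2*sum_r a_r*cos(r*omega*d/n)."""
    n = len(K)
    a = autocorrelation_targets(K)
    return a[0] + 2.0 * sum(a[r] * math.cos(r * omega * d / n) for r in range(1, n + 1))


def H_from_kernel(omega, K, d):
    """H = -omega*G'_s(omega), integrating the sine transform exactly on each step."""
    n = len(K)
    edges = [r * d / n for r in range(n + 1)]
    return sum(K[r] * (math.cos(omega * edges[r]) - math.cos(omega * edges[r + 1]))
               for r in range(n))


def autocorrelation(b):
    """Left-hand sides of (6.17): sum over l of b[l]*b[l+r] for each r."""
    m = len(b)
    return [sum(b[l] * b[l + r] for l in range(m - r)) for r in range(m)]
```

For two steps with `K = [2.0, 1.0]` the targets are `[2.0, -0.5, -0.5]`, and the coefficients from (6.13) reproduce them through `autocorrelation`.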
In the case of three steps, we have four equations for b0 , b1 , b2 , b3 . Adding
the two middle relations, one obtains equations of the same form as (6.12) for b0 ,
b1 + b2 and b3 , but in terms of b1 b2 which is unknown. Subtracting the two middle
equations and squaring, one obtains a quadratic equation for b1 b2 with real roots.
Thus there are two solutions for each solution of (6.12).
This special procedure becomes more cumbersome for a larger number of steps.
In the general case, the simplest systematic procedure would seem to be to start with the last relation of (6.17)$_1$ ($r = n$), solving for $b_n, b_{n-1}, \dots$ in terms of $b_0$ and $b_1$. This procedure is complicated by the double occurrence of variables beyond a certain point. The penultimate equation for $b_1$ in terms of $b_0$ is a polynomial equation of degree $n - 1$. The final substitution into the first equation ($r = 0$ in (6.17)$_1$), which can be replaced by $\sum_{l=0}^{n}b_l = 0$, yields in general a nonpolynomial equation for $b_0$.
7. Explicit Forms of the Minimum Free Energy
For the relaxation function derivative given by (6.2) we have from (2.1) that the stress has the form
$$T(t) = G_\infty E(t) - K_0\int_0^d E_r^t(s)\,ds = G_\infty E(t) - K_0\int_{t-d}^{t}(E(u) - E(t))\,du \tag{7.1}$$
and the work function given by (5.5) becomes, after some algebra,
$$W(t) = \psi_M(t) = \phi(t) + \frac{K_0}{2}\int_0^d[E_r^t(s)]^2\,ds + \frac{K_0}{2}\int_0^\infty[E_r^t(s+d) - E_r^t(s)]^2\,ds. \tag{7.2}$$
Similarly, for $G'$ given by (6.7),
$$\begin{aligned}
T(t) &= G_\infty E(t) - K_0\int_0^{d/2}E_r^t(s)\,ds - K_1\int_{d/2}^{d}E_r^t(s)\,ds\\
&= G_\infty E(t) - K_0\int_{t-d/2}^{t}(E(u) - E(t))\,du - K_1\int_{t-d}^{t-d/2}(E(u) - E(t))\,du
\end{aligned} \tag{7.3}$$
and the work function has the form
$$\begin{aligned}
W(t) = \psi_M(t) &= \phi(t) + \frac{K_0}{2}\int_0^{d/2}[E_r^t(s)]^2\,ds + \frac{K_1}{2}\int_{d/2}^{d}[E_r^t(s)]^2\,ds\\
&\quad + \frac{K_0 - K_1}{2}\int_0^\infty\Big[E_r^t(s) - E_r^t\Big(s + \frac{d}{2}\Big)\Big]^2ds + \frac{K_1}{2}\int_0^\infty[E_r^t(s) - E_r^t(s+d)]^2\,ds\\
&= \phi(t) + K_0\int_0^\infty[E_r^t(s)]^2\,ds + (K_1 - K_0)\int_0^\infty E_r^t(s)E_r^t\Big(s + \frac{d}{2}\Big)ds - K_1\int_0^\infty E_r^t(s)E_r^t(s+d)\,ds.
\end{aligned} \tag{7.4}$$
We now write down the form of the minimum free energy and associated quantities for the explicit factorizations considered in Section 6. Consider first (6.2) where the factors are given by (6.6). The quantity $Y^t$, defined by (5.26), has the form
$$Y^t(s) = \frac{1}{2\pi}\sqrt{\frac{K_0}{2}}\int_{-\infty}^{\infty}\big(1 - e^{i\omega d}\big)E_{r+}^t(\omega)e^{i\omega s}\,d\omega = \sqrt{\frac{K_0}{2}}\,[E_c^t(s) - E_c^t(s+d)], \tag{7.5}$$
where $E_c^t$ is the function $E_r^t$ with $E_c^t(s) = 0$, $s < 0$, so that
$$p_-^t(\omega) = -\sqrt{\frac{K_0}{2}}\int_{-d}^{0}E_c^t(s+d)e^{-i\omega s}\,ds = -\sqrt{\frac{K_0}{2}}\,e^{i\omega d}\int_0^d E_c^t(u)e^{-i\omega u}\,du \tag{7.6}$$
and
$$p_+^t(\omega) = -\sqrt{\frac{K_0}{2}}\int_0^\infty[E_c^t(s) - E_c^t(s+d)]e^{-i\omega s}\,ds. \tag{7.7}$$
From Plancherel's theorem and (5.17) we see that
$$\psi_m(t) = \phi(t) + \frac{K_0}{2}\int_0^d[E_c^t(s)]^2\,ds. \tag{7.8}$$
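Formulae (7.2) and (7.8) are straightforward to evaluate for a given strain history. The sketch below is illustrative only: the helper names, the midpoint quadrature, and the finite `cutoff` that truncates the infinite integral in (7.2) are all assumptions made here, not part of the paper. It computes $\psi_m$ and $\psi_M$ for the constant kernel (6.2) from a callable `E` giving the strain as a function of time.

```python
import math


def relative_history(E, t):
    """Return s -> E_r^t(s) = E(t - s) - E(t) for s >= 0."""
    Et = E(t)
    return lambda s: E(t - s) - Et


def midpoint(f, a, b, n=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def psi_min(E, t, K0, d, G_inf):
    """Minimum free energy (7.8): phi + (K0/2) * integral of E_r^2 over [0, d]."""
    Er = relative_history(E, t)
    phi = 0.5 * G_inf * E(t) ** 2
    return phi + 0.5 * K0 * midpoint(lambda s: Er(s) ** 2, 0.0, d)


def psi_max(E, t, K0, d, G_inf, cutoff=60.0):
    """Work function (7.2); the infinite integral is truncated at `cutoff` (an assumption)."""
    Er = relative_history(E, t)
    tail = midpoint(lambda s: (Er(s + d) - Er(s)) ** 2, 0.0, cutoff)
    return psi_min(E, t, K0, d, G_inf) + 0.5 * K0 * tail
```

Changing the history only at times earlier than $t - d$ leaves `psi_min` unchanged while `psi_max` generally changes, and `psi_max` is never smaller than `psi_min`, since the two differ by a non-negative integral.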
Also, (5.15) gives an expression in agreement with (7.2).
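For the second case, (6.7), we have
$$Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\big(b_0 + b_1e^{i\omega d/2} + b_2e^{i\omega d}\big)E_{r+}^t(\omega)e^{i\omega s}\,d\omega = b_0E_c^t(s) + b_1E_c^t\Big(s + \frac{d}{2}\Big) + b_2E_c^t(s+d) \tag{7.9}$$
so that
$$p_-^t(\omega) = b_1\int_{-d/2}^{0}E_c^t\Big(s + \frac{d}{2}\Big)e^{-i\omega s}\,ds + b_2\int_{-d}^{0}E_c^t(s+d)e^{-i\omega s}\,ds \tag{7.10}$$
and
$$p_+^t(\omega) = -\int_0^\infty\Big[b_0E_c^t(s) + b_1E_c^t\Big(s + \frac{d}{2}\Big) + b_2E_c^t(s+d)\Big]e^{-i\omega s}\,ds. \tag{7.11}$$
Thus
$$\psi_m(t) = \phi(t) + b_1^2\int_0^{d/2}[E_c^t(s)]^2\,ds + b_2^2\int_0^{d}[E_c^t(s)]^2\,ds + 2b_1b_2\int_0^{d/2}E_c^t(s)E_c^t\Big(s + \frac{d}{2}\Big)ds \tag{7.12}$$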
after changing variables in the integrations. Formula (7.4) can also be reproduced
from (5.15) with the aid of the conditions (6.12). It is easier to use (5.15)1 for this
purpose. The vanishing of the cross-terms is readily checked directly from (7.10)
and (7.11).
Relation (7.12) can be written in the form
$$\psi_m(t) = \phi(t) + a_1\int_0^{d/2}[E_c^t(s)]^2\,ds + a_2\int_{d/2}^{d}[E_c^t(s)]^2\,ds - b_1b_2\int_0^{d/2}\Big[E_c^t(s) - E_c^t\Big(s + \frac{d}{2}\Big)\Big]^2ds,$$
$$a_1 = b_0^2 - b_1b_2 = \frac{K_0}{2}, \qquad a_2 = b_2^2 + b_1b_2 = \frac{K_1}{2}, \tag{7.13}$$
where (6.12) has been used. We see therefore that the largest choice of $b_2$ minimizes $\psi_m$ and thus choose $\epsilon_2 = -1$ in (6.13).
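The equivalence of (7.12) and (7.13), and the effect of the sign $\epsilon_2$, can be checked numerically. The sketch below uses hypothetical names, takes $\epsilon_1 = 1$, and relies on a simple midpoint rule; `Ec` is any callable standing in for $E_c^t$ on $[0, d]$. Each function returns $\psi_m(t) - \phi(t)$.

```python
import math


def two_step_coefficients(K0, K1, eps2):
    """(b0, b1, b2) from (6.13) with eps1 = +1."""
    b1 = math.sqrt((K0 - K1) / 2.0)
    root = math.sqrt(b1 * b1 + 2.0 * K1)
    return 0.5 * (-b1 + eps2 * root), b1, 0.5 * (-b1 - eps2 * root)


def midpoint(f, a, b, n=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))


def excess_7_12(Ec, b, d):
    """psi_m - phi written as in (7.12)."""
    _, b1, b2 = b
    I1 = midpoint(lambda s: Ec(s) ** 2, 0.0, d / 2)
    I2 = midpoint(lambda s: Ec(s) ** 2, d / 2, d)
    C = midpoint(lambda s: Ec(s) * Ec(s + d / 2), 0.0, d / 2)
    return b1 * b1 * I1 + b2 * b2 * (I1 + I2) + 2.0 * b1 * b2 * C


def excess_7_13(Ec, K0, K1, b, d):
    """psi_m - phi written as in (7.13), with a1 = K0/2 and a2 = K1/2."""
    _, b1, b2 = b
    I1 = midpoint(lambda s: Ec(s) ** 2, 0.0, d / 2)
    I2 = midpoint(lambda s: Ec(s) ** 2, d / 2, d)
    D = midpoint(lambda s: (Ec(s) - Ec(s + d / 2)) ** 2, 0.0, d / 2)
    return 0.5 * K0 * I1 + 0.5 * K1 * I2 - b1 * b2 * D
```

With $b_1 > 0$, the choice `eps2 = -1` makes $b_2$ positive, so the last term of (7.13) is subtracted and the result is the smaller of the two candidate values.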
The results of this section are consistent with Proposition 5.1.
For the general case of $n$ steps, where the coefficients $b_l$, $l = 1, 2, \dots$, obey (6.17), the generalization of (7.10)–(7.12) is straightforward.
Finally, we consider the question of minimal states. For $G'$ given by (6.2) or (6.7), the conditions (2.5) reduce to
$$E_1^t(s) = E_2^t(s), \qquad 0 \le s \le d, \tag{7.14}$$
which will be obeyed by a large class of histories. Strictly, this relation need apply only almost everywhere in $0 < s \le d$.
References
1. J.M. Golden, Free energies in the frequency domain: the scalar case. Quart. Appl. Math. 58 (2000) 127–150.
2. L. Deseri, G. Gentili and J.M. Golden, An explicit formula for the minimum free energy in linear viscoelasticity. J. Elasticity 54 (1999) 141–185.
3. M. Fabrizio and J.M. Golden, Maximum and minimum free energies for a linear viscoelastic material. Quart. Appl. Math. 60 (2002) 341–381.
4. W. Noll, A new mathematical theory of simple materials. Arch. Rational Mech. Anal. 48 (1972) 1–50.
5. M. Fabrizio, G. Gentili and J.M. Golden, The minimum free energy for a class of compressible viscoelastic fluids. Adv. in Differential Equations 7 (2002) 319–342.
6. G. Del Piero and L. Deseri, On the concepts of state and free energy in linear viscoelasticity. Arch. Rational Mech. Anal. 138 (1997) 1–35.
7. G. Del Piero and L. Deseri, On the analytic expression of the free energy in linear viscoelasticity. J. Elasticity 43 (1996) 247–278.
8. D. Graffi and M. Fabrizio, Sulla nozione di stato materiali viscoelastici di tipo 'rate'. Atti Accad. Naz. Lincei 83 (1990) 201–208.
9. M. Fabrizio and A. Morro, Mathematical Problems in Linear Viscoelasticity. SIAM, Philadelphia, PA (1992).
10. J.M. Golden and G.A.C. Graham, Boundary Value Problems in Linear Viscoelasticity. Springer, Berlin (1988).
11. M.E. Gurtin and I. Herrera, On dissipation inequalities and linear viscoelasticity. Quart. Appl. Math. 23 (1965) 235–245.
12. W.A. Day, Thermodynamics based on a work axiom. Arch. Rational Mech. Anal. 31 (1968) 1–34.
13. E.C. Titchmarsh, Introduction to the Theory of Fourier Integrals. Clarendon Press, Oxford (1937).
14. I.N. Sneddon, The Use of Integral Transforms. McGraw-Hill, New York (1972).
15. M. Fabrizio, C. Giorgi and A. Morro, Free energies and dissipation properties for systems with memory. Arch. Rational Mech. Anal. 125 (1994) 341–373.
16. M. Fabrizio, Existence and uniqueness results for viscoelastic materials. In: G.A.C. Graham and J.R. Walton (eds), Crack and Contact Problems for Viscoelastic Bodies. Springer, Vienna (1995).
17. M. Fabrizio and S. Polidoro, On the exponential decay for differential systems with memory. Applicable Analysis 81 (2002) 1245–1266.
18. E.T. Whittaker and G.N. Watson, A Course of Modern Analysis. Cambridge Univ. Press, Cambridge (1963).
19. J. Ramanathan, Methods of Applied Fourier Analysis. Birkhäuser, Boston (1998).
20. B.D. Coleman, Thermodynamics of materials with memory. Arch. Rational Mech. Anal. 17 (1964) 1–46.
21. B.D. Coleman and V.J. Mizel, A general theory of dissipation in materials with memory. Arch. Rational Mech. Anal. 27 (1967) 255–274.
22. A. Morro and M. Vianello, Minimal and maximal free energy for materials with memory. Boll. Un. Mat. Ital. 4A (1990) 45–55.
23. N.I. Muskhelishvili, Singular Integral Equations. Noordhoff, Groningen (1953).
24. L. Deseri, G. Gentili and J.M. Golden, Free energies and Saint-Venant's principle in linear viscoelasticity, submitted for publication.
About Clapeyron’s Theorem in Linear Elasticity ⋆
ROGER FOSDICK and LEV TRUSKINOVSKY
Department of Aerospace Engineering and Mechanics, University of Minnesota, Minneapolis,
MN 55455, U.S.A. E-mail: fosdick@aem.umn.edu, trusk@aem.umn.edu
Received: 6 August 2002; in revised form: 17 March 2003
Abstract. We examine some elementary interpretations of the classical theorem of Clapeyron
in linear elasticity theory. As we show, a straightforward application of this theorem in the purely
mechanical setting leads to an apparent paradox which can be resolved by referring either to dynamics or to thermodynamics. These richer theories play an essential part in understanding the physical
significance of this theorem.
Mathematics Subject Classifications (2000): 74A15, 74B05, 80A17
Key words: elasticity, Clapeyron, dissipation, viscoelasticity, thermoelasticity.
In remembrance of Clifford Truesdell and his scientific program of enlightenment.
1. Introduction
According to Love [11, p. 173], “The potential energy of deformation of a body,
which is in equilibrium under given load, is equal to half the work done by the
external forces, acting through the displacements from the unstressed state to the
state of equilibrium.” This is now commonly known as Clapeyron’s theorem
in linear elasticity theory.⋆⋆ In particular, this theorem, taken literally, implies that
the elastic stored energy accounts for only half of the energy spent to load the
body; the remaining half of the work done to the body by the external forces is
unaccounted for and is lost somewhere in achieving the equilibrium state. It is
particularly striking that this apparent paradox is reached within the framework of
⋆ The National Science Foundation Grant No. DMS-0102841 is gratefully acknowledged for their
support of this research.
⋆⋆ In 1852, Lamé [9] published his volume, Leçons sur la théorie mathématique de l’élasticité des
corps solides, in which he devoted his seventh lecture to what he termed Clapeyron’s Theorem.
(See [13, pp. 565 and 578], for relevant remarks.) Earlier, Lamé and Clapeyron [10] had noted this
result in a joint memoir of 1833. Although Emile Clapeyron [3], himself, first published on this
theorem in 1858, in a résumé of an original memoir that apparently was never published, it is argued
by Todhunter and Pearson [14, p. 419], that the “result of the memoir of 1833 was due entirely to
Clapeyron, for Lamé in his Leçons, of 1852, . . . terms it Clapeyron’s Theorem, and Clapeyron
here speaks of it as he would do only if it were entirely due to himself.”
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 399–426.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
the purely conservative linear theory of elasticity. Alternatively, however, within
elastostatics the common characterization of the work done to reach equilibrium is
conceptually ambiguous and a different interpretation may be required.
To illustrate the above concerns, let us first recall that in the linear theory of
elasticity the total strain energy of a body that occupies the region Ω ⊂ R³ and
supports a, generally, dynamical displacement field u = u(x, t) and strain field
e ≡ (∇u + (∇u)^T)/2 = e(x, t) relative to its undistorted state at time t = 0 is
defined by
\[ U[\mathbf{e}](t) \equiv \frac{1}{2}\int_{\Omega} \rho\,\mathsf{C}[\mathbf{e}]\cdot\mathbf{e}\; dv. \tag{1.1} \]
Here, ρ is the mass density of the body and C is the positive definite and completely
symmetric elasticity tensor. Further, the work done during the interval of time (0, t)
due to an applied boundary traction field t∗ = t∗ (x, t) and body force field b∗ =
b∗ (x, t) over the displacement u(x, t) is given by
\[ W[\mathbf{u}](t) = \int_0^t \left( \int_{\partial\Omega} \mathbf{t}^{*}\cdot\dot{\mathbf{u}}\; da + \int_{\Omega} \mathbf{b}^{*}\cdot\dot{\mathbf{u}}\; dv \right) dt. \tag{1.2} \]
The corresponding stress field in Ω at time t, T = T(x, t), satisfies the generalized
Hooke’s law T = ρC[e] and is symmetric. Throughout this paper we shall assume, for convenience, that the body is homogeneous, so that ρ and C are constant.
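If the loads t∗ and b∗ are ‘dead’, i.e., independent of time, so that t∗ = t̄(x) and
b∗ = b̄(x), then for a body that is undistorted at time t = 0, (1.2) may be integrated
to yield
\[ W[\mathbf{u}](t)\Big|_{(\mathbf{t}^{*},\mathbf{b}^{*})=(\bar{\mathbf{t}},\bar{\mathbf{b}})} = \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\mathbf{u}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\mathbf{u}\; dv \equiv W[\mathbf{u}](t). \tag{1.3} \]
This ‘dead load work’ represents the “work done by the external forces” to which
Love referred in his quote concerning equilibrium reproduced in the first line of
this introduction, above. Of course, in this case the loads are equilibrated so that
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\; da + \int_{\Omega} \bar{\mathbf{b}}\; dv = 0, \qquad \int_{\partial\Omega} \mathbf{x}\times\bar{\mathbf{t}}\; da + \int_{\Omega} \mathbf{x}\times\bar{\mathbf{b}}\; dv = 0 \tag{1.4} \]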
and u is an equilibrium displacement field, say ū(x); the corresponding ‘dead load
work’ is then
\[ W[\bar{\mathbf{u}}] \equiv W[\mathbf{u}](t)\Big|_{\mathbf{u}=\bar{\mathbf{u}}(\mathbf{x})}. \tag{1.5} \]
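Suppose that the displacement field u = ū(x) corresponds to an equilibrium
state with strain ē(x) and stress T̄(x) satisfying T̄ = ρC[ē] and
\[ \operatorname{div}\bar{\mathbf{T}} + \bar{\mathbf{b}} = 0 \ \ \text{in } \Omega, \qquad \bar{\mathbf{T}}\mathbf{n} = \bar{\mathbf{t}} \ \ \text{on } \partial\Omega, \tag{1.6} \]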
where n is the outer unit normal to Ω on ∂Ω. Without loss of generality, we may
eliminate the possibility of an added infinitesimal rigid displacement field in ū(x)
and render ū(x) unique by imposing the normalization conditions
\[ \int_{\Omega} \rho\,\bar{\mathbf{u}}\; dv = 0, \qquad \int_{\Omega} \rho\,\mathbf{x}\times\bar{\mathbf{u}}\; dv = 0, \tag{1.7} \]
where the mass density ρ is included only for later convenience. Then, according to
the usual derivation of Clapeyron’s theorem, we see, starting with (1.3) and (1.5)
and using (1.6), generalized Hooke’s law, the symmetry of T̄ and the divergence
theorem, that
\[ \frac{1}{2}W[\bar{\mathbf{u}}] = \frac{1}{2}\left( \int_{\partial\Omega} \bar{\mathbf{T}}\mathbf{n}\cdot\bar{\mathbf{u}}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\bar{\mathbf{u}}\; dv \right) = U[\bar{\mathbf{e}}]. \tag{1.8} \]
Literally following Love’s statement of Clapeyron’s theorem, one may infer
that elastostatics alone⋆ accounts for only half of the work that is expended to
reach equilibrium; the coefficient one-half is a result of the linearity of the theory. In the remainder of this paper, we continue within the linear framework and
consider, respectively, in Sections 2, 3 and 4, the richer dynamical theories of elasticity, viscoelasticity and thermoelasticity in order to shed light on this seemingly
paradoxical and incomplete conclusion.
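The factor one-half in (1.8) can be checked on the simplest possible case. The sketch below is an illustration only, assuming a single linear spring as a one-degree-of-freedom stand-in for the body; the function name `clapeyron_spring`, the stiffness `k` and the dead load `f` are hypothetical and are not taken from the paper. For any positive values the strain energy should come out as half the dead-load work.

```python
def clapeyron_spring(k, f):
    """Return (dead-load work, strain energy) for a linear spring at equilibrium.

    Hypothetical one-degree-of-freedom analogue: k is the stiffness and
    f the dead load, both assumed positive.
    """
    u_bar = f / k                   # equilibrium displacement, from k * u_bar = f
    work = f * u_bar                # analogue of the dead-load work W[u_bar] in (1.3), (1.5)
    energy = 0.5 * k * u_bar ** 2   # analogue of the strain energy U[e_bar] in (1.1)
    return work, energy


work, energy = clapeyron_spring(k=4.0, f=2.0)
print(work, energy, energy / work)
```

The ratio `energy / work` does not depend on the particular `k` and `f`, which mirrors the remark that the coefficient one-half is a consequence of linearity.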
SYNOPSIS
In Section 2, we argue that within ideal elasticity theory the quantity W[ū] of
Clapeyron’s theorem does not reasonably represent the work done by the external forces to reach an elastostatic equilibrium state u = ū(x). We then investigate
‘fast’ versus ‘slow’ time dependent loading conditions and conclude that within the
assumptions of elastostatics W[ū]/2 is a better representative of the work expended
to reach equilibrium. In Sections 3 and 4, we amend ideal elasticity theory so as
to include the mechanisms of viscous and thermal dissipation, respectively. Then
dead loading becomes compatible with the notion of achieving equilibrium and
we conclude that the quantity W[ū] of Clapeyron’s theorem does adequately
represent the corresponding work done by the external applied forces. Here we
find that half of W[ū] becomes stored in the body in the form of equilibrium strain
energy and the remaining half is dissipated either through the action of viscous
dissipation or heat transfer. In Section 5, we offer some conclusions.
⋆ In elastostatics, there is, of course, no time dependence and formally the work done by the loads
t̄(x) and b̄(x) to reach the equilibrium displacement u = ū(x) from an undistorted state commonly
is calculated by using (1.3) and (1.5), as was done in (1.8). As noted earlier, for purely equilibrium
theory this may not properly represent the ‘work done to reach equilibrium’ because this tacitly
assumes that the loads are ‘dead’ and applied over time and, therefore, impulsive. For an ideal elastic
body this circumstance is not compatible with the notion of reaching equilibrium, as we shall see in
Section 2 and related Appendices A and B.
In Appendices A and B, we show some example calculations to further illustrate the claims of Section 2. It should be noted that throughout the main body of
this paper we assume, for convenience, that the traction field is specified on the
complete boundary of Ω. However, in the elementary examples of these appendices we prefer to hold one part of the boundary fixed and specify the traction on
the complementary part for all time. While these boundary conditions clearly are
not consistent with (1.4)₁ and (2.4)₁, nevertheless they are normal and allowable;
moreover, they do not compromise the main purpose of illustrating the difference
between dead and retarded loading.
2. Elastodynamics
Here, we shall first consider the consequences of ‘dead’ loading within elastodynamics regarding work and energy and then show how equilibrium theory is best
accounted for by introducing a retarded system of loads.
2.1. ‘DEAD’ LOADING
Suppose that for all time t > 0 the body is ‘dead’ loaded with the same loads as in
the static situation described above, so that t∗ = t̄(x) and b∗ = b̄(x) in (1.2). On
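the boundary of Ω we set
\[ \mathbf{T}\mathbf{n} = \bar{\mathbf{t}} \ \ \text{on } \partial\Omega, \ \forall t > 0, \tag{2.1} \]
and initially the body is at rest and undistorted so that
\[ \mathbf{u}(\mathbf{x}, 0) = \dot{\mathbf{u}}(\mathbf{x}, 0) = 0 \ \ \text{in } \Omega. \tag{2.2} \]
The dynamical equation is
\[ \operatorname{div}\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} \ \ \text{in } \Omega, \ \forall t > 0, \tag{2.3} \]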
and we recall that t̄ = t̄(x) and b̄ = b̄(x) are supposed to be balanced in the
sense of (1.4). Of course, u, e and T are related through the strain-displacement
and stress–strain equations of Section 1. Under these conditions, it readily follows,
from (2.1), (2.3) and the symmetry of T, that the linear and angular momentum are
conserved. Thus, by integration in time and use of (2.2), it is clear that the resulting
motion naturally satisfies the normalization
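\[ \int_{\Omega} \rho\,\mathbf{u}\; dv = 0, \qquad \int_{\Omega} \rho\,\mathbf{x}\times\mathbf{u}\; dv = 0 \qquad \forall t \ge 0. \tag{2.4} \]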
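In addition, by forming the inner product of (2.3) with u̇, integrating over Ω and using the symmetry of C together with (2.1) and (1.1), we readily reach the classical
power theorem
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\; dv = \frac{d}{dt}U[\mathbf{e}](t) + \frac{d}{dt}K[\dot{\mathbf{u}}](t), \tag{2.5} \]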
where
\[ K[\dot{\mathbf{u}}](t) \equiv \frac{1}{2}\int_{\Omega} \rho\,|\dot{\mathbf{u}}|^{2}\; dv \tag{2.6} \]
is the kinetic energy of the body. Then, by integrating (2.5) in time and using (2.2)
and (1.3) we obtain the standard balance of mechanical energy
\[ W[\mathbf{u}](t) = U[\mathbf{e}](t) + K[\dot{\mathbf{u}}](t), \tag{2.7} \]
which is supposed to be valid for all time t ≥ 0.
Now, as a first elementary observation, let us assume that Ω may, at some time
t = t̄ during the motion, instantaneously support the equilibrium displacement field
in the sense that u(x, t̄) = ū(x); it may be that u̇(·, t̄) ≠ 0 so Ω is not coincidently
at rest. Be that as it may, nevertheless, (2.4) will be met at t = t̄ because of (1.7)
and, in addition, because W[u](t̄) = W[ū] and U[e](t̄) = U[ē], we see from (2.7)
that
\[ W[\bar{\mathbf{u}}] = U[\bar{\mathbf{e}}] + K[\dot{\mathbf{u}}](\bar{t}). \tag{2.8} \]
Then, recalling (1.8), we may conclude that half the work done during the time
interval (0, t̄) is stored in the body as strain energy and the remaining half satisfies
\[ \frac{1}{2}W[\bar{\mathbf{u}}] = K[\dot{\mathbf{u}}](\bar{t}); \tag{2.9} \]
it has been spent to produce the instantaneous kinetic energy of the body. Accordingly, under the present circumstances, it is this kinetic energy that must be
spontaneously extracted from Ω if the body is to be arrested in the equilibrium
state u(x, t̄) = ū(x). However, there is no mechanism in this conservative ideal
elastic system to do so!⋆
Let us now consider an alternative description of how the work may be channeled into strain energy and kinetic energy based upon time-averaging of the corresponding energies. The assumption here is that there is a time t = t∗ > 0, perhaps
one among many, at which the body instantaneously is at rest, i.e., u̇(x, t∗) ≡ 0
in Ω.⋆⋆ To describe the average motion, we introduce the time-average displacement field as
\[ \langle\mathbf{u}\rangle(\mathbf{x}) \equiv \frac{1}{t^{*}}\int_0^{t^{*}} \mathbf{u}(\mathbf{x}, t)\; dt, \tag{2.10} \]
⋆ According to [11, p. 123] (see also [13, p. 537, art. 988]), in 1839 Poncelet [12] was the
first to note that “a load suddenly applied may cause a strain twice as great as that produced by
a gradual application of the same load.” While this observation of Poncelet, which also contains
an interesting factor of 2, appeared contemporaneously with the original and later announcements of
Clapeyron’s theorem, there appears to have been no recognition of a possible relationship between
the claims of either author.
⋆⋆ While, generally, there may not be such a time, in the case of periodic motion there is a countable
set of such times; a specific one-dimensional example is discussed later in Appendix A. Notice,
though, that according to (2.4)₁ the average of u̇(x, t) over Ω is always zero.
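with the time-average strain ⟨e⟩(x) and stress ⟨T⟩(x) fields defined analogously.
Then, it readily follows that
\[ \langle\mathbf{e}\rangle = \frac{1}{2}\left( \nabla\langle\mathbf{u}\rangle + (\nabla\langle\mathbf{u}\rangle)^{T} \right), \qquad \langle\mathbf{T}\rangle = \rho\,\mathsf{C}[\langle\mathbf{e}\rangle]. \tag{2.11} \]
Moreover, because of (2.2), by time-averaging (2.3) and (2.1) we find
\[ \operatorname{div}\langle\mathbf{T}\rangle + \bar{\mathbf{b}} = 0 \ \ \text{in } \Omega, \qquad \langle\mathbf{T}\rangle\mathbf{n} = \bar{\mathbf{t}} \ \ \text{on } \partial\Omega. \tag{2.12} \]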
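Also, note that the time-average displacement field ⟨u⟩(x) satisfies the normalization (2.4) and recall that the loads t̄ = t̄(x) and b̄ = b̄(x) are balanced in the sense
of (1.4). Thus, because of uniqueness and the fact that ⟨u⟩(x) and ū(x) solve the
same equilibrium boundary-value problem, we may conclude that
\[ \langle\mathbf{u}\rangle(\mathbf{x}) = \bar{\mathbf{u}}(\mathbf{x}), \qquad \langle\mathbf{e}\rangle(\mathbf{x}) = \bar{\mathbf{e}}(\mathbf{x}), \qquad \langle\mathbf{T}\rangle(\mathbf{x}) = \bar{\mathbf{T}}(\mathbf{x}) \qquad \text{in } \Omega. \tag{2.13} \]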
Now, by time-averaging (2.7), using (1.3), recalling the notation established
in (2.10) and applying (2.13), we easily have
\[ \langle W[\mathbf{u}]\rangle = W[\langle\mathbf{u}\rangle] = W[\bar{\mathbf{u}}] = \langle U[\mathbf{e}]\rangle + \langle K[\dot{\mathbf{u}}]\rangle. \tag{2.14} \]
In particular, the average of the ‘dead load work’, ⟨W[u]⟩, is equal to the quantity
W[ū] of Clapeyron’s theorem in (1.8), and our immediate aim is to determine
how this average work expended is divided up between the average strain energy
⟨U[e]⟩ and the average kinetic energy ⟨K[u̇]⟩ of the body. To do so, we first
introduce the difference displacement field
\[ \mathbf{u}'(\mathbf{x}, t) \equiv \mathbf{u}(\mathbf{x}, t) - \bar{\mathbf{u}}(\mathbf{x}), \tag{2.15} \]
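with e′(x, t) and T′(x, t) defined analogously, and observe, using (1.1), the symmetry of C and (1.8), that
\[ \begin{aligned} U[\mathbf{e}](t) &= \frac{1}{2}\int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}} + \mathbf{e}']\cdot(\bar{\mathbf{e}} + \mathbf{e}')\; dv \\ &= U[\bar{\mathbf{e}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\mathbf{e}'\; dv + U[\mathbf{e}'](t) \\ &= \frac{1}{2}W[\bar{\mathbf{u}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\mathbf{e}'\; dv + U[\mathbf{e}'](t). \end{aligned} \]
Then, by time-averaging we have
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}W[\bar{\mathbf{u}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\langle\mathbf{e}'\rangle\; dv + \langle U[\mathbf{e}']\rangle. \]
However, because e′(x, t) = e(x, t) − ē(x) we see from (2.13) that ⟨e′⟩(x) =
⟨e⟩(x) − ē(x) = 0 and, consequently, we reach
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}W[\bar{\mathbf{u}}] + \langle U[\mathbf{e}']\rangle. \tag{2.16} \]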
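Now, to determine ⟨U[e′]⟩, it is convenient to observe, using (2.1)–(2.3), (2.15)
and the relationships T′ = ρC[e′] and e′ = (∇u′ + (∇u′)^T)/2, that
\[ \begin{aligned} &\operatorname{div}\mathbf{T}' = \rho\,\ddot{\mathbf{u}}' \ \ \text{in } \Omega, \ \forall t > 0, \\ &\mathbf{T}'\mathbf{n} = 0 \ \ \text{on } \partial\Omega, \ \forall t > 0; \\ &\mathbf{u}'(\mathbf{x}, 0) = -\bar{\mathbf{u}}(\mathbf{x}), \quad \dot{\mathbf{u}}'(\mathbf{x}, 0) = 0 \ \ \text{in } \Omega. \end{aligned} \tag{2.17} \]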
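Then, with the definition (1.1), the symmetry of C, (2.17) and the aid of the divergence theorem we find that
\[ \begin{aligned} U[\mathbf{e}'](t) &= \frac{1}{2}\int_{\Omega} \mathbf{T}'\cdot\nabla\mathbf{u}'\; dv \\ &= \frac{1}{2}\int_{\partial\Omega} \mathbf{T}'\mathbf{n}\cdot\mathbf{u}'\; da - \frac{1}{2}\int_{\Omega} \rho\,\ddot{\mathbf{u}}'\cdot\mathbf{u}'\; dv \\ &= -\frac{1}{2}\int_{\Omega} \rho\,\ddot{\mathbf{u}}'\cdot\mathbf{u}'\; dv \\ &= -\frac{1}{2}\int_{\Omega} \left( \rho\,\frac{d}{dt}\left(\dot{\mathbf{u}}'\cdot\mathbf{u}'\right) - \rho\,|\dot{\mathbf{u}}|^{2} \right) dv, \end{aligned} \]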
the last equation of which uses the fact that (2.15) implies u̇′(x, t) = u̇(x, t). Now,
by time-averaging, recalling that u̇(x, 0) = u̇(x, t∗) = 0 and using (2.6), we obtain
\[ \langle U[\mathbf{e}']\rangle = \langle K[\dot{\mathbf{u}}]\rangle, \tag{2.18} \]
which is a well known result concerning the equipartition between kinetic and
potential energies. Thus, by substituting (2.18) into (2.16) and then using (2.14)
we conclude that
\[ \langle U[\mathbf{e}']\rangle = \langle K[\dot{\mathbf{u}}]\rangle = \frac{1}{4}W[\bar{\mathbf{u}}] \tag{2.19} \]
and, again using (2.16), we see that
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}W[\bar{\mathbf{u}}] + \frac{1}{4}W[\bar{\mathbf{u}}] = \frac{3}{4}W[\bar{\mathbf{u}}]. \tag{2.20} \]
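To show that this result is independent of the assumption of periodicity, let us introduce the complete set of orthonormal eigenfunctions and eigenvalues, {ū_i(x), ω_i,
i = 1, 2, . . .}, which satisfy (1.7) and
\[ \operatorname{div}(\mathsf{C}[\nabla\bar{\mathbf{u}}_i]) + \rho\,\omega_i^{2}\,\bar{\mathbf{u}}_i = 0 \ \ \text{in } \Omega, \qquad (\mathsf{C}[\nabla\bar{\mathbf{u}}_i])\mathbf{n} = 0 \ \ \text{on } \partial\Omega, \]
and expand the solution u(x, t) of (2.1)–(2.3) in the form
\[ \mathbf{u}(\mathbf{x}, t) = \sum_{i=1}^{\infty} \bar{\mathbf{u}}_i(\mathbf{x})\, g_i(t). \tag{2.21} \]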
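Then, we readily find that g_i(t) = A_i(1 − cos ω_i t), i = 1, 2, . . . , and it is possible
to determine the constants A_i as Fourier coefficients so that this series represents
a weak solution of (2.1)–(2.3) in the sense that ∇u(·, t) ∈ L²(Ω) for all t > 0.
Furthermore, it is straightforward to show that the infinite time-average of the
displacement field,
\[ \langle\mathbf{u}\rangle_{\infty}(\mathbf{x}) \equiv \lim_{T\to\infty} \frac{1}{T}\int_0^{T} \mathbf{u}(\mathbf{x}, t)\; dt, \tag{2.22} \]
satisfies ⟨u⟩∞(x) = ū(x) and that conclusions similar to those highlighted in the
previous paragraph continue to hold for the relationships between the infinite time-averages of the work, strain energy and kinetic energy, i.e.,
\[ \langle W[\mathbf{u}]\rangle_{\infty} = W[\langle\mathbf{u}\rangle_{\infty}] = W[\bar{\mathbf{u}}] = \langle U[\mathbf{e}]\rangle_{\infty} + \langle K[\dot{\mathbf{u}}]\rangle_{\infty} \tag{2.23a} \]
with
\[ \langle U[\mathbf{e}]\rangle_{\infty} = \frac{3}{4}W[\bar{\mathbf{u}}], \qquad \langle K[\dot{\mathbf{u}}]\rangle_{\infty} = \frac{1}{4}W[\bar{\mathbf{u}}]. \tag{2.23b} \]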
Based upon the above analyses, we conclude that when an elastic body is set in
motion with a ‘dead’ loading system from an initially undistorted rest state, then,
with suitable interpretation, the average work that is supplied to the body by the
‘dead’ loading is equal to the equilibrium work of Clapeyron’s theorem. On
the average, three quarters of this work appears as strain energy (half due to the
equilibrium strain energy as predicted from Clapeyron’s theorem and a quarter
due to the strain energy of the deformation relative to this equilibrium), and the
remaining quarter is, on the average, transformed into kinetic energy.
To illustrate the general conclusions reached above, we consider, in Appendix A, a specific one-dimensional elastodynamic problem with ‘dead’ loading.
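The averages in (2.19) and (2.20) can also be illustrated on a single undamped oscillator, m ü + k u = f, started from rest under the dead load f. This is only a sketch under that assumption, not the example of Appendix A; the function name `dead_load_averages` and the values of `m`, `k` and `f` are hypothetical. The closed-form motion u = ū(1 − cos ωt) is averaged over one period, at the end of which the mass is again at rest, so the period plays the role of t∗.

```python
import math


def dead_load_averages(m=1.0, k=4.0, f=2.0, steps=20000):
    """Average u, U and K over one period of m*u'' + k*u = f, u(0) = u'(0) = 0.

    Returns (average displacement, equilibrium displacement,
    average strain energy / W, average kinetic energy / W),
    where W = f * u_bar is the analogue of the dead-load work W[u_bar].
    """
    omega = math.sqrt(k / m)
    u_bar = f / k                       # static equilibrium displacement
    period = 2.0 * math.pi / omega      # velocity vanishes again at t* = period
    dt = period / steps
    sum_u = sum_strain = sum_kinetic = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt              # midpoint rule in time
        u = u_bar * (1.0 - math.cos(omega * t))    # closed-form displacement
        v = u_bar * omega * math.sin(omega * t)    # closed-form velocity
        sum_u += u
        sum_strain += 0.5 * k * u * u
        sum_kinetic += 0.5 * m * v * v
    work = f * u_bar
    return (sum_u / steps, u_bar,
            sum_strain / steps / work, sum_kinetic / steps / work)


avg_u, u_bar, frac_strain, frac_kinetic = dead_load_averages()
print(avg_u, u_bar, frac_strain, frac_kinetic)
```

In this one-degree-of-freedom setting the average displacement should agree with the equilibrium displacement, and the two energy fractions should be close to three quarters and one quarter, in line with (2.13), (2.19) and (2.20).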
2.2. ‘SLOW’ LOADING
When an ideal elastic body is ‘dead’ loaded from an undistorted, rest state with an
otherwise equilibrium system of loads, the loading is impulsively applied. Consequently, from the dynamical considerations of Section 2.1, the body never reaches
equilibrium but, rather, rings by constantly redistributing kinetic and strain energy
between its elements. Indeed, the work done to the body at any time t > 0 due
to the external loading is given by (1.3), but the body is never coincidently at rest
and in a state of equilibrium. On the other hand, we expect that if an equilibrium
system of loads is achieved sufficiently slowly in time then even an ideal elastic
body should distort through a sequence of near equilibrium states and eventually
reach a nearly static equilibrium configuration. In this case, the work done to the
body at any time t due to the external loading may be calculated using (1.2), but
the calculation is no longer trivial because now t∗ and b∗ are not ‘dead’ but rather
depend on time. For dissipationless, ideal elastic bodies it is intuitively clear that
the work expended to reach equilibrium should be related to the latter rather than
the former calculation.
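To gain some general perspective, suppose that the loading system, t∗ on ∂Ω
and b∗ in Ω, is such that
\[ \mathbf{t}^{*} = \mathbf{t}^{*}(\mathbf{x}, t) = \begin{cases} \dfrac{t}{t_{\infty}}\,\bar{\mathbf{t}}(\mathbf{x}), & t \in (0, t_{\infty}), \\[1ex] \bar{\mathbf{t}}(\mathbf{x}), & t \ge t_{\infty}, \end{cases} \tag{2.24} \]
and
\[ \mathbf{b}^{*} = \mathbf{b}^{*}(\mathbf{x}, t) = \begin{cases} \dfrac{t}{t_{\infty}}\,\bar{\mathbf{b}}(\mathbf{x}), & t \in (0, t_{\infty}), \\[1ex] \bar{\mathbf{b}}(\mathbf{x}), & t \ge t_{\infty}, \end{cases} \tag{2.25} \]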
where t∞ is a sufficiently large time constant so that the loads may be considered
to be slowly applied. Then, at least for t ∈ (0, t∞), the displacement field u =
u(x, t) = t ū(x)/t∞ and the corresponding strain and stress fields, e = e(x, t) =
t ē(x)/t∞ and T = T(x, t) = t T̄(x)/t∞, from the strain-displacement and stress–
strain relations of Section 1, will satisfy the dynamical equation
\[ \operatorname{div}\mathbf{T} + \mathbf{b}^{*} = \rho\,\ddot{\mathbf{u}} \ \ \text{in } \Omega, \ t \in (0, t_{\infty}), \]
together with the boundary condition
\[ \mathbf{T}\mathbf{n} = \mathbf{t}^{*} \ \ \text{on } \partial\Omega, \ t \in (0, t_{\infty}) \]
and initial conditions
\[ \mathbf{u}(\mathbf{x}, 0) = 0, \qquad \dot{\mathbf{u}}(\mathbf{x}, 0) = \frac{1}{t_{\infty}}\,\bar{\mathbf{u}}(\mathbf{x}) \ \ \text{in } \Omega. \]
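Clearly, for sufficiently large time constant t∞ not only is the applied loading
‘slow’, but the initial state of Ω is undistorted and ‘nearly’ at rest. Further, at time
t = t∞ the body achieves the equilibrium displacement field ū(x) with, again,
‘nearly’ zero velocity. Moreover, according to (1.2), the work done to Ω up to time
t = t∞ is
\[ \begin{aligned} W[\mathbf{u}](t_{\infty}) &= \int_0^{t_{\infty}} \left( \int_{\partial\Omega} \frac{t}{t_{\infty}}\,\bar{\mathbf{t}}\cdot\frac{1}{t_{\infty}}\,\bar{\mathbf{u}}\; da + \int_{\Omega} \frac{t}{t_{\infty}}\,\bar{\mathbf{b}}\cdot\frac{1}{t_{\infty}}\,\bar{\mathbf{u}}\; dv \right) dt \\ &= \frac{1}{2}\left( \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\bar{\mathbf{u}}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\bar{\mathbf{u}}\; dv \right), \end{aligned} \]
so that with (1.8)₁ we have
\[ W[\mathbf{u}](t_{\infty}) = \frac{1}{2}W[\bar{\mathbf{u}}]. \]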
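In addition, according to (2.6), the kinetic energy of Ω at any time t ∈ [0, t∞) is
‘nearly’ zero and equal to its initial value because
\[ K[\dot{\mathbf{u}}](t) = \frac{1}{2t_{\infty}^{2}}\int_{\Omega} \rho\,|\bar{\mathbf{u}}|^{2}\; dv, \qquad \forall t \in [0, t_{\infty}). \]
Finally, according to (1.1), the strain energy of Ω at time t = t∞ is given by
\[ U[\mathbf{e}](t_{\infty}) = U[\bar{\mathbf{e}}], \]
and so with (1.8)₂ we may conclude that
\[ W[\mathbf{u}](t_{\infty}) = U[\mathbf{e}](t_{\infty}). \]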
Thus, for sufficiently large time constant t∞, the body is ‘nearly’ at rest in equilibrium at time t = t∞ and the work done to Ω to achieve this ‘near’ equilibrium state
is half that which is supplied, according to Love’s interpretation of Clapeyron’s
theorem; in fact, this reasoning shows that the work done is equal to the strain
energy at time t = t∞ and, to within a certain degree of approximation the paradox
of Clapeyron’s theorem may be considered resolved. Of course, the body is not
exactly at rest in equilibrium at time t = t∞ and while it was initially undistorted,
it was not initially at rest. For large time constant t∞, according to the given initial
conditions there is a small kinetic energy imparted to Ω at time t = 0 and this same
kinetic energy must be extracted from Ω at time t = t∞ in order for Ω to strictly
remain in equilibrium for all time t > t∞.
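The bookkeeping for the ramped load can be reproduced numerically on a one-degree-of-freedom analogue. The sketch below assumes a single spring of stiffness `k` carrying a mass `m`, a load that grows linearly to `f` over the time `t_inf`, and the linear-in-time motion u = t ū/t_inf described above; the function name and all numerical values are hypothetical. It evaluates the time integral of load times velocity, as in (1.2), by the midpoint rule.

```python
def ramp_work(k=4.0, m=1.0, f=2.0, t_inf=100.0, steps=1000):
    """Work, strain energy and kinetic energy for the ramped load f*t/t_inf.

    Assumes the motion u(t) = t*u_bar/t_inf on (0, t_inf), which has zero
    acceleration and constant velocity u_bar/t_inf.
    Returns (work done up to t_inf, strain energy at t_inf,
    kinetic energy, dead-load work f*u_bar).
    """
    u_bar = f / k                     # equilibrium displacement under the full load
    velocity = u_bar / t_inf          # constant velocity of the ramped motion
    dt = t_inf / steps
    work = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt            # midpoint rule in time
        load = f * t / t_inf          # scalar analogue of (2.24)
        work += load * velocity * dt  # integrand of (1.2): load times velocity
    strain = 0.5 * k * u_bar ** 2     # analogue of U[e_bar]
    kinetic = 0.5 * m * velocity ** 2 # small for large t_inf
    return work, strain, kinetic, f * u_bar


work, strain, kinetic, dead_work = ramp_work()
print(work, strain, kinetic, dead_work)
```

Under these assumptions the work done by the ramped load should match the strain energy and equal half of the dead-load work, while the kinetic energy scales like 1/t_inf² and becomes negligible as `t_inf` grows.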
In general, for the loading conditions of (2.24) and (2.25) it readily follows from
an application of the power theorem that for all time t > t∞ we must have
\[ K[\dot{\mathbf{u}}](t) + U[\mathbf{e} - \bar{\mathbf{e}}](t) = K[\dot{\mathbf{u}}](t_{\infty}) = \frac{1}{2t_{\infty}^{2}}\int_{\Omega} \rho\,|\bar{\mathbf{u}}|^{2}\; dv, \]
where, of course, the right-hand side also represents the kinetic energy that is added
to the system at t = 0 due to the ‘nearly’ stationary initial condition. Thus, given
an ε > 0, for sufficiently large time constant t∞, both the kinetic energy of Ω and
the strain energy of Ω for the difference strain e(x, t) − ē(x) must remain within an
ε-neighborhood of zero for all t ≥ t∞.
To illustrate these ideas, we consider, in Appendix B, a one-parameter family
of one-dimensional elastodynamic problems for a bar of finite length, wherein the
applied loading depends on a slowness parameter α. Our aim in this appendix is
to exhibit how the retarded nature of the applied loading affects the dynamical
behavior and its relationship to the notion of equilibrium.
3. Viscoelasticity
As earlier, we again suppose that the body is initially at rest and undistorted and
that it is ‘dead’ loaded as in the static situation of Section 1. Now, to introduce an
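elementary form of mechanical dissipation, we consider a viscoelastic body whose
constitutive relation is of the Kelvin–Voigt form
\[ \mathbf{T} = \rho\,\mathsf{C}[\mathbf{e}] + \mathsf{D}[\dot{\mathbf{e}}], \tag{3.1} \]
where D is a positive definite, completely symmetric (constant) viscosity tensor.
The dynamical equation and the boundary and initial conditions are the same as
those in (2.1)–(2.3), i.e.,
\[ \begin{aligned} &\operatorname{div}\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} \ \ \text{in } \Omega, \ \forall t > 0, \\ &\mathbf{T}\mathbf{n} = \bar{\mathbf{t}} \ \ \text{on } \partial\Omega, \ \forall t > 0, \\ &\mathbf{u}(\mathbf{x}, 0) = \dot{\mathbf{u}}(\mathbf{x}, 0) = 0 \ \ \text{in } \Omega. \end{aligned} \tag{3.2} \]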
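In the usual way, it follows from (3.2) and (2.6) that the classical power theorem
holds, i.e.,
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\; dv = \int_{\Omega} \mathbf{T}\cdot\dot{\mathbf{e}}\; dv + \frac{d}{dt}K[\dot{\mathbf{u}}](t), \tag{3.3} \]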
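which, with the use of (3.1), (3.2)₃, (1.1), (1.3) and integration in time, results in
\[ W[\mathbf{u}](t) = U[\mathbf{e}](t) + K[\dot{\mathbf{u}}](t) + D(t), \tag{3.4} \]
where D(t) denotes the dissipation function
\[ D(t) \equiv \int_0^t \int_{\Omega} \mathsf{D}[\dot{\mathbf{e}}]\cdot\dot{\mathbf{e}}\; dv\, dt \ge 0, \tag{3.5} \]
for all t ≥ 0.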
Because of the dissipative character of viscosity and the special nature of the
loading, in that t̄(x) and b̄(x) are balanced and correspond with the equilibrium
displacement field ū(x) of Section 1, it is natural to expect that the solution of
the problem outlined above will have the ‘asymptotic property’ u(x, t) → ū(x) as
t → ∞.⋆ Supposing this is the case, we find from (3.4), (1.1) and (2.6) that in the
limit as t → ∞
\[ W[\bar{\mathbf{u}}] = U[\bar{\mathbf{e}}] + D_{\infty}, \]
where
\[ D_{\infty} \equiv D(\infty) = \int_0^{\infty} \int_{\Omega} \mathsf{D}[\dot{\mathbf{e}}]\cdot\dot{\mathbf{e}}\; dv\, dt \ge 0. \tag{3.6} \]
⋆ Dafermos [5] and Andrews and Ball [1] have studied the questions of existence and asymptotic
stability for general one-dimensional Kelvin–Voigt viscoelasticity theory. With certain smoothness hypotheses, the conclusions in [1, 5] guarantee that the solution to the problem with ‘dead’
loading and zero initial data asymptotically and strongly approaches the equilibrium state which
corresponds to the same ‘dead’ loads.
Moreover, by using Clapeyron’s theorem (1.8) we may then conclude that half
the work done to reach equilibrium is stored as strain energy and the remaining
half is given by
\[ \frac{1}{2}W[\bar{\mathbf{u}}] = D_{\infty}, \]
which is consumed during the dynamical process through viscous dissipation. To
summarize, when viscous dissipation is present and the ‘asymptotic property’ holds
then ‘dead’ loading and equilibrium are, indeed, compatible. Moreover, in practical terms the paradox reached within elasticity theory from Love’s interpretation
of Clapeyron’s theorem may be resolved by appropriately accounting for the
dissipative action of viscoelastic behavior.
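The split of the dead-load work into stored and dissipated halves can be observed numerically on a damped oscillator, m ü + c u̇ + k u = f, which serves here as a one-degree-of-freedom analogue of the Kelvin–Voigt body. The sketch below is an illustration under that assumption only; the function name, the parameter values and the choice of a classical fourth-order Runge–Kutta scheme are not taken from the paper. The dissipated energy is accumulated as the time integral of c u̇², the scalar counterpart of (3.5).

```python
def kelvin_voigt_dissipation(m=1.0, c=0.5, k=1.0, f=1.0, t_end=80.0, dt=0.01):
    """Integrate m*u'' + c*u' + k*u = f from rest with classical RK4.

    Returns (u, v, dissipated) at t_end, where dissipated is the
    time integral of c*v**2, the scalar analogue of D(t) in (3.5).
    """
    def rhs(u, v):
        # derivatives of (u, v, dissipated)
        return v, (f - c * v - k * u) / m, c * v * v

    u = v = dissipated = 0.0          # rest and undistorted, as in (3.2)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(u, v)
        k2 = rhs(u + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1])
        k3 = rhs(u + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1])
        k4 = rhs(u + dt * k3[0], v + dt * k3[1])
        u += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        v += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        dissipated += dt * (k1[2] + 2.0 * k2[2] + 2.0 * k3[2] + k4[2]) / 6.0
    return u, v, dissipated


u_end, v_end, dissipated = kelvin_voigt_dissipation()
dead_work = 1.0 * u_end               # f * u with f = 1.0, analogue of W[u](t)
print(u_end, v_end, dissipated, dead_work)
```

With the default (hypothetical) parameters the motion has essentially decayed by `t_end`, so the displacement should be close to f/k, the velocity close to zero, and the dissipated energy close to half of the dead-load work, as the limit of (3.4) suggests.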
4. Thermoelasticity
Within the linear theory of thermoelasticity, when a body is subject to a displacement field u = u(x, t) relative to its undistorted, rest state and coincidently the
absolute temperature is changed from its constant reference (room) temperature θ0
to the field θ = θ(x, t), the Helmholtz free energy per unit mass ψ = ψ(x, t) is
determined by the constitutive equation⋆
\[ \psi = \hat{\psi}(\theta, \mathbf{e}) = \frac{1}{2}\,\mathsf{C}[\mathbf{e}]\cdot\mathbf{e} - (\theta - \theta_0)\,\mathbf{M}\cdot\mathbf{e} - c\,\theta\ln\frac{\theta}{\theta_0}, \tag{4.1} \]
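normalized so that ψ̂(θ0, 0) = 0. Here, M is the positive definite, symmetric thermal expansion tensor and c > 0 is the specific heat at constant deformation, both
representing prescribed thermomechanical material properties and herein assumed
to be constant. The symmetric stress tensor field T = T(x, t) and the entropy field
per unit mass η = η(x, t) are then determined by the Gibbs relations
\[ \mathbf{T} = \hat{\mathbf{T}}(\theta, \mathbf{e}) = \rho\,\frac{\partial\hat{\psi}(\theta, \mathbf{e})}{\partial\mathbf{e}} = \rho\left( \mathsf{C}[\mathbf{e}] - (\theta - \theta_0)\,\mathbf{M} \right) \tag{4.2} \]
and
\[ \eta = \hat{\eta}(\theta, \mathbf{e}) = -\frac{\partial\hat{\psi}(\theta, \mathbf{e})}{\partial\theta} = \mathbf{M}\cdot\mathbf{e} + c\left( \ln\frac{\theta}{\theta_0} + 1 \right), \tag{4.3} \]
respectively. The total Helmholtz free energy of the body is given by
\[ \Psi[\theta, \mathbf{e}](t) \equiv \int_{\Omega} \rho\,\hat{\psi}(\theta(\mathbf{x}, t), \mathbf{e}(\mathbf{x}, t))\; dv. \tag{4.4} \]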
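The Gibbs relations (4.2) and (4.3) can be checked against the free energy (4.1) by finite differences. The sketch below assumes a scalar, one-dimensional analogue in which the tensors C and M are replaced by the numbers `C_EL` and `M_TH`; these names, the helper `central_diff` and all numerical values are hypothetical and serve only as an illustration.

```python
import math

THETA0 = 300.0   # hypothetical reference temperature theta_0
C_EL = 2.0       # hypothetical scalar stand-in for the elasticity tensor C
M_TH = 0.01      # hypothetical scalar stand-in for the thermal expansion tensor M
C_HEAT = 1.5     # hypothetical specific heat c


def psi(theta, e):
    """Scalar analogue of the free energy per unit mass (4.1)."""
    return (0.5 * C_EL * e * e
            - (theta - THETA0) * M_TH * e
            - C_HEAT * theta * math.log(theta / THETA0))


def stress_over_rho(theta, e):
    """Scalar analogue of (4.2) divided by the density rho."""
    return C_EL * e - (theta - THETA0) * M_TH


def entropy(theta, e):
    """Scalar analogue of the entropy per unit mass (4.3)."""
    return M_TH * e + C_HEAT * (math.log(theta / THETA0) + 1.0)


def central_diff(g, x, h):
    """Second-order central difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


theta, e = 310.0, 0.02
dpsi_de = central_diff(lambda s: psi(theta, s), e, 1e-6)
dpsi_dtheta = central_diff(lambda s: psi(s, e), theta, 1e-3)
print(dpsi_de, stress_over_rho(theta, e))
print(-dpsi_dtheta, entropy(theta, e))
```

Each printed pair should agree to within the finite-difference error, and `psi(THETA0, 0.0)` vanishes, consistent with the stated normalization.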
If we now assume, analogous to Sections 1–3, that the ‘dead’ loads t̄(x) on ∂Ω and
b̄(x) in Ω are balanced in the sense of (1.4) and that u = ū(x) is a corresponding
equilibrium displacement field at the uniform temperature θ = θ0 then, as in Section 1, we readily see, from an argument totally analogous to that given in (1.8) for
Clapeyron’s theorem and (1.1), that
\[ \Psi[\theta_0, \bar{\mathbf{e}}] = \frac{1}{2}W[\bar{\mathbf{u}}]; \tag{4.5} \]
i.e., half of the work done to reach equilibrium is stored in the body as Helmholtz
free energy. Here, we have used the normalization (1.7) which guarantees uniqueness and eliminates any possible additive infinitesimal rigid field.
⋆ In [4] or [8, p. 99], for example, a non-essential quadratic approximation for θ near θ0 is used
in place of the last term in (4.1).
Before proceeding with a more detailed continuum thermodynamic analysis for
non-isothermal processes, we first give an elementary thermodynamic explanation
for the isothermal case θ(x, t) = θ0. Thus, for a finite material body the first law
and the second law, in the form of the Clausius–Planck inequality, may be
written as
\[ \dot{E}(t) + \dot{K}(t) = P(t) + Q(t), \qquad \dot{H}(t) \ge \frac{Q(t)}{\theta_0} \qquad \forall t > 0, \tag{A} \]
where E, K, P, Q and H denote the internal energy, kinetic energy, mechanical
power supply (positive for influx and negative for efflux), heat supply rate (positive
for absorption and negative for emission) and entropy for the body, respectively.
Now, introducing the Helmholtz free energy of the body, F(t) ≡ E(t) − θ0 H(t),
we may write (A)₁ in the form
\[ P(t) = \dot{F}(t) + \theta_0\,\dot{H}(t) + \dot{K}(t) - Q(t). \tag{B} \]
Then, supposing the body reaches an equilibrium state at some time t0 ∈ (0, ∞],
we see from (B) that the total work done to the body over the time interval (0, t0)
is given by
\[ W \equiv \int_0^{t_0} P(t)\; dt = \Delta F + D, \tag{C} \]
where
\[ D \equiv \theta_0\,\Delta H - \int_0^{t_0} Q(t)\; dt \ge 0 \tag{D} \]
represents the total energy dissipated by the body during the (isothermal) process
of reaching equilibrium. We are assured that this dissipated energy is non-negative
because of (A)₂ and the isothermal condition. Now, by naturally interpreting W
as the work W[ū] to reach equilibrium and ΔF as the equilibrium free energy
Ψ[θ0, ē], both noted in (4.5), we see from (4.5), (C) and (D) that
\[ \frac{1}{2}W = \Delta F \qquad \text{and} \qquad \frac{1}{2}W = D. \tag{E} \]
Clearly, half of the work that is supplied to reach equilibrium is dissipated and the
paradox of Clapeyron’s theorem is resolved.
Now, rather than assume that the temperature field of the body is spatially
uniform and constant in time, let us suppose that the body is initially at rest in
its undistorted state at the constant temperature θ0 and that for all time t > 0 it
is subject to a balanced ‘dead’ loading system, as is presumed in the equilibrium
situation which led to (4.5) above. In addition, for convenience we suppose that
the body is subject to null heat radiation to or from the external environment and
that the boundary temperature is fixed at θ0 for all time t > 0. Explicitly, the
boundary and initial conditions that we consider are
\[ \mathbf{T}\mathbf{n} = \bar{\mathbf{t}}, \qquad \theta = \theta_0 \qquad \text{on } \partial\Omega, \ \forall t > 0, \tag{4.6} \]
and
\[ \mathbf{u}(\mathbf{x}, 0) = \dot{\mathbf{u}}(\mathbf{x}, 0) = 0, \qquad \theta(\mathbf{x}, 0) = \theta_0 \qquad \text{in } \Omega, \tag{4.7} \]
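respectively. The dynamical governing equations have the form
\[ \operatorname{div}\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} \ \ \text{in } \Omega, \ \forall t > 0, \tag{4.8} \]
and
\[ -\operatorname{div}\mathbf{q} + \mathbf{T}\cdot\dot{\mathbf{e}} = \rho\,\dot{\epsilon} \ \ \text{in } \Omega, \ \forall t > 0. \tag{4.9} \]
Here, ε = ε(x, t) is the internal energy field per unit mass, which is related to
the Helmholtz free energy, temperature and entropy through ε = ψ + θη, and
q = q(x, t) is the heat flux vector field. Also, we note for later reference that with
(4.1)–(4.3), we may write (4.9) in the alternative form
\[ -\operatorname{div}\mathbf{q} = \rho\,\theta\,\dot{\eta} \ \ \text{in } \Omega, \ \forall t > 0. \tag{4.10} \]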
Now, with the aid of (4.8), (4.6) and (2.6) we again have the power theorem (3.3). Moreover, following a standard line of reasoning which uses (3.3) with
(4.9) and an application of the divergence theorem, we recover the global form of
the balance of energy:
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\; da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\; dv = \frac{d}{dt}E[\theta, \mathbf{e}](t) + \frac{d}{dt}K[\dot{\mathbf{u}}](t) - Q(t). \tag{4.11} \]
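Here, E[θ, e](t), the total internal energy of the body at time t, may be written
conveniently as
\[ E[\theta, \mathbf{e}](t) \equiv \int_{\Omega} \rho\,\hat{\epsilon}(\theta(\mathbf{x}, t), \mathbf{e}(\mathbf{x}, t))\; dv = \Psi_{\theta_0}[\theta, \mathbf{e}](t) + \theta_0 \int_{\Omega} \rho\,\hat{\eta}(\theta(\mathbf{x}, t), \mathbf{e}(\mathbf{x}, t))\; dv, \tag{4.12} \]
where
\[ \Psi_{\theta_0}[\theta, \mathbf{e}](t) \equiv \int_{\Omega} \rho\left( \hat{\epsilon}(\theta, \mathbf{e}) - \theta_0\,\hat{\eta}(\theta, \mathbf{e}) \right) dv \tag{4.13} \]
is the total Helmholtz semi-free energy⋆ of the body based upon the boundary
temperature θ0 and Q(t) is the total heat rate of the body at time t, which, here, is
determined solely by boundary conduction, i.e.,
\[ Q(t) \equiv \int_{\partial\Omega} -\mathbf{q}\cdot\mathbf{n}\; da. \tag{4.14} \]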
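Because n denotes the outer unit normal to ∂Ω, we note that Q(t) > 0 (< 0)
corresponds to a rate of heat supply to (loss from) Ω. Thus, by integration of (4.11)
in time and use of the initial conditions (4.7), and formulae (4.1), (4.4), (1.2) and
(1.3), we arrive at
\[ W[\mathbf{u}](t) = \Psi_{\theta_0}[\theta, \mathbf{e}](t) + K[\dot{\mathbf{u}}](t) + D(t), \tag{4.15} \]
where D(t) denotes the dissipation function
\[ \begin{aligned} D(t) &\equiv \theta_0 \int_{\Omega} \rho\left( \hat{\eta}(\theta, \mathbf{e}) - \hat{\eta}(\theta_0, 0) \right) dv - \int_0^t Q(\tau)\; d\tau \\ &= \int_0^t \left( \theta_0\,\frac{d}{dt}H[\theta, \mathbf{e}](t) - Q(t) \right) dt \ge 0, \end{aligned} \tag{4.16} \]
for all t ≥ 0 and where H[θ, e](t), the total entropy of the body in the state of
temperature θ(x, t) and strain e(x, t), is defined by
\[ H[\theta, \mathbf{e}](t) \equiv \int_{\Omega} \rho\,\hat{\eta}(\theta(\mathbf{x}, t), \mathbf{e}(\mathbf{x}, t))\; dv. \tag{4.17} \]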
Observe that the right-hand side of D(t) in (4.16) contains an expression as integrand which, in the absence of radiation and when the body is immersed in an
environment of constant temperature θ0, is non-negative due to the second law of
thermodynamics in the form of the Clausius–Planck inequality. Of course, in
this circumstance the Clausius–Planck inequality is implied by the Clausius–Duhem
inequality.
Because of the dissipative nature of heat conduction and the fact that the mechanical loading t̄(x) and b̄(x) and the thermal loading conditions (4.6)2 and (4.7)3 ,
⋆ See the work on the stability of material phases by Dunn and Fosdick [7, p. 41]. Duhem [6]
introduced a similar quantity denoted by him “l’énergie balistique” in his studies on the stability
of equilibrium states. Truesdell [15], in his Historical Introit on pp. 39–40, gives a brief account
of Duhem’s ballistic energy and its first appearances in the more modern researches of the 1960s.
Today, the term “ballistic free energy” often is used to denote the sum of the total kinetic energy, the
HELMHOLTZ semi-free energy and the total potential energy of the applied forces for the body, for
certain special processes as, for example, in [2, Section 3.3]. Its main feature is that it is non-negative
on these processes and this fact emphasizes its importance in stability analyses.
are associated with the equilibrium state u = ū(x) and θ = θ0 , it is natural to expect, based on physical considerations, that any possible thermodynamic process,
generated according to (4.6)–(4.9), will stabilize in the sense that u(x, t) → ū(x)
and θ(x, t) → θ0 as t → ∞.⋆ Provided this asymptotic behavior⋆⋆ is, indeed, the
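case, we may conclude, from (4.13), (4.15), (4.16) and the fact that Ψ_{θ0}[θ, e](t) →
Ψ[θ0, ē], that
$$ W[\bar{u}] = \Psi[\theta_0,\bar{e}] + D_\infty \tag{4.18} $$
in the limit t → ∞, where
$$ D_\infty \equiv D(\infty) = \theta_0\,\big(H[\theta_0,\bar{e}] - H[\theta_0,0]\big) - \int_0^\infty Q(\tau)\,d\tau = \int_0^\infty \Big(\theta_0\frac{d}{dt}H[\theta,e](t) - Q(t)\Big)\,dt \ge 0. \tag{4.19} $$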
Thus, with (4.5) and (4.18) we see that half the work done to reach equilibrium
is stored as HELMHOLTZ free energy and the remaining half is given by
⋆ Of course, from an analytical point of view this will depend upon the constitutive structure for
the law of heat conduction which, for classical linear theory, may be taken as Fourier’s law (4.22).
⋆⋆ In the present context, this problem has yet to be studied. While Dafermos [4] has provided an
analysis of the issues of existence and asymptotic stability for the completely linear theory of thermoelasticity, the initial-boundary value problem under consideration here is weakly nonlinear, due
to thermal expansion, and slightly different. In its one-dimensional form the fields u(x, t) and θ(x, t)
are sought for x ∈ (0, L) and for all t > 0 such that the dynamical and constitutive equations (4.2),
(4.3), (4.8), (4.9) and (4.22) hold subject to null body force and appropriate boundary and initial
conditions. Specifically, the governing equations are
$$ \sigma_x(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, $$
with σ(x, t) = E ux(x, t) − ρm(θ(x, t) − θ0),
and
$$ k\,\theta_{xx}(x,t) = \rho\,\big(m\,\theta(x,t)\,\dot{u}_x(x,t) + c\,\dot{\theta}(x,t)\big) \quad \forall x \in (0,L),\ \forall t > 0, $$
subject to the following boundary and initial conditions:
$$ u(0,t) = 0,\qquad \sigma(L,t) = \bar{\sigma} = \text{const},\qquad \theta(0,t) = \theta(L,t) = \theta_0 \quad \forall t > 0, $$
$$ u(x,0) = \dot{u}(x,0) = 0,\qquad \theta(x,0) = \theta_0 \quad \forall x \in (0,L). $$
The material constants ρ, k, m, c and E are positive.
In the completely linear theory, the nonlinear term θ u̇x in the third equation above is linearized and
replaced by θ0 u̇x. For the system so linearized and within the more general three-dimensional setting,
DAFERMOS has shown that the solution asymptotically and strongly approaches the equilibrium state
of uniform temperature in the sense that
$$ (u,\ e \equiv u_x,\ \sigma)(x,t) \to (\bar{u},\bar{e},\bar{\sigma})(x) = \Big(\frac{\bar{\sigma}}{E}\,x,\ \frac{\bar{\sigma}}{E},\ \bar{\sigma}\Big), \qquad \theta(x,t) \to \theta_0 $$
as t → ∞.
$$ \tfrac{1}{2}\,W[\bar{u}] = D_\infty. \tag{4.20} $$
Following classical considerations, we may interpret the first term in the definition
(4.19)1 of D∞ , i.e., the term that involves the total entropy difference, as that part
of the change of the total internal energy that is stored in the distorted equilibrium
state of the body in the ‘primitive form of heat’ and that is unavailable to do
mechanical work at the temperature θ = θ0. This is historically referred to as
the ‘bound’ part. Of course, the total HELMHOLTZ free energy Ψ[θ0, ē] represents
the remaining part of the total internal energy, and it is available. According to
the definition (4.14), the second term in D∞ , in (4.19)1 , represents the total heat
exchange for the body due to the process of conduction (i.e., ‘transfer’) through its
boundary during the thermodynamic process.
Finally, to clearly identify (4.19) as an expression for the dissipated energy due
to the internal heat transfer, we first note that with (4.14), (4.17), the divergence
theorem, (4.10) and (4.6)2 we may re-write D∞ as
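$$ D_\infty = \int_0^\infty\!\!\int_{\Omega} \big(\rho\,\theta_0\,\dot{\eta} + \operatorname{div} q\big)\,dv\,dt = \int_0^\infty\!\!\int_{\Omega} \Big(1 - \frac{\theta_0}{\theta}\Big)\operatorname{div} q\,dv\,dt = \int_0^\infty\!\!\int_{\Omega} -\theta_0\,\frac{q\cdot\nabla\theta}{\theta^2}\,dv\,dt. \tag{4.21} $$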
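The three equalities in (4.21) can be traced step by step. The sketch below is a reader's reconstruction rather than part of the original argument. It assumes that (4.6)2 prescribes θ = θ0 on the boundary (the text refers to θ0 as the boundary temperature) and writes Ω for the region occupied by the body. Both are reading assumptions, since (4.6) is stated outside this passage.

```latex
\begin{align*}
\theta_0\frac{d}{dt}H[\theta,e](t) - Q(t)
  &= \int_{\Omega} \rho\,\theta_0\,\dot{\eta}\,dv + \int_{\partial\Omega} q\cdot n\,da
   = \int_{\Omega} \big(\rho\,\theta_0\,\dot{\eta} + \operatorname{div} q\big)\,dv
  && \text{by (4.17), (4.14) and the divergence theorem,}\\
\rho\,\theta_0\,\dot{\eta}
  &= -\frac{\theta_0}{\theta}\,\operatorname{div} q
  && \text{by (4.10),}\\
\int_{\Omega} \Big(1-\frac{\theta_0}{\theta}\Big)\operatorname{div} q\,dv
  &= \int_{\partial\Omega} \Big(1-\frac{\theta_0}{\theta}\Big)\,q\cdot n\,da
     - \int_{\Omega} q\cdot\nabla\Big(1-\frac{\theta_0}{\theta}\Big)\,dv
   = -\int_{\Omega} \theta_0\,\frac{q\cdot\nabla\theta}{\theta^{2}}\,dv
  && \text{since } \theta=\theta_0 \text{ on } \partial\Omega .
\end{align*}
```

The boundary integral in the last line vanishes because its factor 1 − θ0/θ is zero wherever θ = θ0, and ∇(1 − θ0/θ) = θ0 ∇θ/θ².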
Then, as is standard within the linear theory of thermoelasticity, if we assume
FOURIER’s law of heat conduction, i.e.,
$$ q = -K\nabla\theta, \tag{4.22} $$
where K is the positive definite, symmetric heat conductivity tensor, we see that
$$ D_\infty = \theta_0 \int_0^\infty\!\!\int_{\Omega} \frac{(K\nabla\theta)\cdot\nabla\theta}{\theta^2}\,dv\,dt \ge 0. \tag{4.23} $$
Accordingly, in the case of continuum thermoelasticity the expression (4.23) gives
an explicit representation for the total dissipated energy that was identified as D
in our previous more elementary discussion (see (D)). Through (4.20), it accounts
for the remaining half of the work that is done to reach equilibrium and provides a
thermodynamics based response to the paradox posed in Section 1.
5. Discussion
In this communication we have revisited a well known classical theorem in linear
elastostatics due to Emile Clapeyron and offered several interpretations of an apparent paradox associated with the ‘mysterious’ unaccountability of part of the work
done by the loading device to reach equilibrium. Our considerations reveal that this
theorem may be viewed in a purely statical framework, as Love [11] did, as a mechanical statement
concerning work and elastic strain energy, and that is where
the paradox appears; or it can be viewed more generally as a thermodynamical
statement concerning the work and the HELMHOLTZ free energy, in which case
no paradox emerges. We consider the ‘thermodynamic’ version of CLAPEYRON’s
theorem, as noted in (4.5), to be the most reasonable one; the issue does not appear
to have been addressed previously in the literature.
Within elastostatics, the purely mechanical statement of CLAPEYRON’s theorem is ambiguous because only equilibrium ideas are used to deduce it and,
therefore, the definition of ‘work’ is somewhat subjective. In practice, an elastic
body adjusts to the application of a loading gradually and part of the associated
work is transformed during this process into an energy of ‘ringing’ relative to some
average configuration. This ‘ringing’ may be sizable or negligible depending upon
the rate at which the ultimate load is attained. Concurrently, this energy is being
removed from the system by the unavoidable action of dissipation and the body
tends to an equilibrium state. If, in a particular setting, the process of reaching
equilibrium is considered instantaneous relative to the time-scale defined by the
physical problem, then the classical theorem applies and the unaccounted work
should be considered lost through dissipation. In this case, one can suppose that
there is a fast time-scale in the problem and that the associated generation of high
frequency vibrations can be considered, from the slower time-scale point of view,
to be an effective dissipative action.
We note that circumstances in which some energy may be either ‘lost’ or ‘acquired’ are not unknown within the setting of a purely conservative elastic system. For example, when considering steady state solutions of linear elastodynamic
problems, one characteristically neglects short transient periods in determining the
corresponding steady states from prescribed initial conditions. One of the energetic
consequences of such a neglect of the transient phase of the process is the necessity
to apply so-called radiation conditions in order to determine a unique steady state
configuration. Another example originates in nonlinear elastodynamics where the
energy is not conserved due to the unavoidable generation of the ‘invisible’ high
frequency vibrations inside the transition layer of shock waves.
If the HELMHOLTZ free energy is used instead of the elastic strain energy and
the problem is viewed as thermodynamical from the very beginning, the paradox
does not surface. The reason is that in this case the system no longer is considered to be energetically closed and the ‘macro-mechanical’ degrees of freedom
are not the only ones present in the system. More specifically, in this case, the
adjustment of the body to the applied dead loads involves the activation of the
‘micro-mechanical’ degrees of freedom not accounted for by the purely mechanical
macro-description. The channeling of the macroscopic energy towards these microscopic degrees of freedom is then viewed at the macro-level as the dissipation.
The beauty of a continuum thermodynamical description is that these degrees of
freedom need not be described explicitly.
Acknowledgements
We wish to thank Chi-Sing Man, J.J. Marigo and W. Warner for helpful comments.
L.T. also acknowledges, with appreciation, discussions with I. Müller, P. Podio-Guidugli, J. Rice and K. Wilmanski. We gratefully acknowledge E. Petersen for
supplying the calculations and figures related to Appendix B.
Appendix A. 1D Example: ‘Dead’ Loading
To exemplify the general conclusions reached in Section 2.1 concerning the dynamical implications of ‘dead’ loading, consider the specific one-dimensional elastodynamic problem of determining the displacement field u(x, t) for x ∈ (0, L)
and for all time t > 0 such that
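$$ E\,u_{xx}(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, \tag{A.1} $$
subject to the following boundary and initial conditions:
$$ u(0,t) = 0,\qquad \sigma(L,t) = \bar{\sigma} = \text{const} \quad \forall t > 0, \tag{A.2} $$
$$ u(x,0) = \dot{u}(x,0) = 0 \quad \forall x \in (0,L). \tag{A.3} $$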
Here, E > 0 is the (constant) Young’s modulus and σ (x, t) ≡ Eux (x, t) denotes
the stress.
It is straightforward to show that the solution of (A.1)–(A.3) is periodic in time
with period T = 4L/c, where c ≡ √(E/ρ) is the characteristic wave speed, and that
in the (x, t)-plane the strain and velocity fields, e(x, t) ≡ ux(x, t) and v(x, t) ≡
u̇(x, t), are piecewise constant and of the form shown in Figure 1. Moreover, in
this one-dimensional setting (2.7) again holds, i.e.,
$$ W[u](t) = U[e](t) + K[v](t) \quad \forall t \ge 0, \tag{A.4} $$
where
$$ W[u](t) \equiv \bar{\sigma}\,u(L,t),\qquad U[e](t) \equiv \int_0^L \tfrac{1}{2}E e^2\,dx,\qquad K[v](t) \equiv \int_0^L \tfrac{1}{2}\rho v^2\,dx. \tag{A.5} $$
Thus, from the solution shown in Figure 1 we may readily construct the periodic
forms of W [u](t), U [e](t) and K[v](t) and they are illustrated in Figure 2.
Now, to analyze these results it is helpful to first note that the unique equilibrium displacement ū(x), strain ē(x) and stress σ̄ (x) fields which correspond to the
boundary conditions
$$ \bar{u}(0) = 0,\qquad \bar{\sigma}(L) = \bar{\sigma} $$
are given by ū(x) = (σ̄/E)x, ē(x) = σ̄/E and σ̄(x) = σ̄ for x ∈ (0, L). In this
case, CLAPEYRON’s theorem implies that
$$ \tfrac{1}{2}\,W[\bar{u}] = U[\bar{e}] \tag{A.6} $$
Figure 1. Summary of the solution of (A.1)–(A.3) in the (x, t)-plane.
Figure 2. The total work W [u](t), strain energy U [e](t) and kinetic energy K[v](t) during
one period of motion.
and we easily calculate, using (A.5), that
$$ W[\bar{u}] = \frac{\bar{\sigma}^2}{E}\,L,\qquad U[\bar{e}] = \frac{1}{2}\,\frac{\bar{\sigma}^2}{E}\,L. \tag{A.7} $$
Notice from Figure 1 that at the discrete times t = t̄ ∈ {L/c, 3L/c, . . .}
the displacement and strain fields coincide with those of the equilibrium state,
u(x, t̄) = ū(x) and e(x, t̄) = ē(x). Thus, from (A.7)1 and Figure 2 we see that
$$ W[u](\bar{t}) = W[\bar{u}],\qquad U[e](\bar{t}) = K[v](\bar{t}) = \tfrac{1}{2}\,W[\bar{u}]. \tag{A.8} $$
This verifies (2.9) and explicitly shows that at those times when the dynamical
displacement field coincides with the equilibrium displacement field, half the work
done is stored as strain energy and the remaining half appears as kinetic energy. In
passing, we note from Figure 1 that at time t = 2L/c (and periodically thereafter)
the body is at rest and it is distorted with a strain field that is double what it is in
equilibrium. Moreover, from Figure 2 we see that at this time there is a total ‘work-energy balance’ in the sense that W[u](2L/c) = U[e](2L/c). This is a reflection
of Poncelet’s observation noted earlier in the first footnote of Section 2.1.
Observe, from Figure 1, that v(x, t*) = 0 ∀x ∈ (0, L) and for every t* ∈
{2L/c, 4L/c, . . .}. Thus, by time-averaging (A.4) over any interval (0, t*) and
using a notation analogous to (2.10) it is clear that
$$ W[\langle u\rangle] = \langle W[u]\rangle = \langle U[e]\rangle + \langle K[v]\rangle, \tag{A.9} $$
where, according to Figure 2 and (A.7)1, we easily calculate
$$ \langle W[u]\rangle = W[\bar{u}],\qquad \langle U[e]\rangle = \tfrac{3}{4}\,W[\bar{u}],\qquad \langle K[v]\rangle = \tfrac{1}{4}\,W[\bar{u}], \tag{A.10} $$
in agreement with results more generally obtained in Section 2.1. In addition, from
the periodic extension of Figure 2 and the value of W [ū] in (A.7), we readily
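see that the infinite time-average, constructed analogous to (2.22) for this one-dimensional example, satisfies the general conditions recorded in (2.23), i.e.,
$$ \langle W[u]\rangle_\infty = \langle U[e]\rangle_\infty + \langle K[v]\rangle_\infty, $$
where
$$ \langle W[u]\rangle_\infty = W[\bar{u}],\qquad \langle U[e]\rangle_\infty = \tfrac{3}{4}\,W[\bar{u}],\qquad \langle K[v]\rangle_\infty = \tfrac{1}{4}\,W[\bar{u}]. $$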
Appendix B. 1D Example: Retarded Loading
In order to exhibit more precisely how the solution of an elastodynamics problem
may depend on the slowness of the applied loading, we consider another one-dimensional elastodynamic problem of determining u(x, t) for x ∈ (0, L) and for
all time t > 0 such that
$$ E\,u_{xx}(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, \tag{B.1} $$
subject to the following boundary and initial conditions:
$$ u(0,t) = 0,\qquad \sigma(L,t) = (1 - e^{-\alpha t})\,\bar{\sigma} \quad \forall t > 0, \tag{B.2} $$
$$ u(x,0) = \dot{u}(x,0) = 0 \quad \forall x \in (0,L). \tag{B.3} $$
Here, α > 0 represents a ‘slowness’ load parameter which governs the length
of time it takes the applied end load to essentially reach the constant value σ̄ .
For sufficiently large α, the loading in (B.2) is nearly impulsive and this problem
then reduces to that of Appendix A. As α is reduced the loading becomes more
retarded and the solution is expected to show less of a dynamic structure. Of course,
analogous to (2.7) the mechanical energy balance again holds, so that
$$ W[u](t) = U[e](t) + K[v](t) \quad \forall t \ge 0, \tag{B.4} $$
where the work done on the body up to time t is now determined by
$$ W[u](t) = \int_0^t \sigma(L,\tau)\,\dot{u}(L,\tau)\,d\tau, \tag{B.5} $$
and where the corresponding strain energy, U [e](t), and corresponding kinetic
energy, K[v](t), are as defined in (A.5).
One of the major questions concerning the solution of the dynamical problem
stated above is how the work, strain energy and kinetic energy vary with time relative to the strain energy that would be stored in the same elastic bar in equilibrium
under the constant end load σ̄ , i.e., U [ē] of (A.7)2 . In Figures 3–5 we show the
normalized work, W [u](t)/U [ē], normalized strain energy, U [e](t)/U [ē], and normalized kinetic energy, K[v](t)/U [ē], as functions of time computed numerically
for this problem for a range of slowness load parameters α between α = 10⁴ sec⁻¹
and α = 10⁶ sec⁻¹. These figures are based on material constants for an aluminum alloy with E = 76.1 × 10⁹ Pa and ρ = 2710 kg/m³, for a bar of length
L = 5 × 10⁻³ m, and for a load constant σ̄ = 10⁷ Pa. The time axis of these figures
is measured in ‘time steps’ with the final time step of 129760 corresponding to
1200 × 10⁻⁶ sec.
One can see that the impulsive-like nature of the loading for large α results
in wildly irregular behavior which is sustained over an infinite time. By contrast, for relatively small α equilibrium appears to be achieved quickly in time
with nearly constant limiting values W[u](t)/U[ē] ≈ 1, U[e](t)/U[ē] ≈ 1 and
K[v](t)/U[ē] ≈ 0. We conclude that the quantity W[ū] in (A.7)1, while it has
units of work and shows up in CLAPEYRON’s theorem as exhibited in (A.6), does
not represent the work done to reach equilibrium; reasoning based on the computed
limiting behavior leads to the conclusion that only half of this value is expended to
reach equilibrium and, then, it is manifested totally in the form of strain energy.
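The scheme behind Figures 3–5 is not described in the text, so the following is only a minimal sketch of one way such curves could be generated, not the method actually used. It advances the stress σ = E ux and the velocity v = u̇ along characteristics on a grid with Δx = cΔt, using the fact that σ − ρcv is carried unchanged along dx/dt = +c and σ + ρcv along dx/dt = −c. Everything is dimensionless by assumption (E = ρ = L = σ̄ = 1, hence c = 1 and U[ē] = 1/2), the parameter `a` stands for αL/c, and all function names are hypothetical.

```python
import numpy as np

def trapezoid(f, h):
    """Composite trapezoidal rule on a uniform grid of spacing h."""
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))

def simulate_bar(a, n_cells=50, t_end=200.0):
    """Return (W, U, K) at time t_end for the end load 1 - exp(-a t) at x = 1."""
    dx = 1.0 / n_cells
    dt = dx                          # Courant number 1 because c = 1
    sigma = np.zeros(n_cells + 1)    # stress at the grid nodes
    v = np.zeros(n_cells + 1)        # velocity at the grid nodes
    work, power_old = 0.0, 0.0
    for n in range(1, int(round(t_end / dt)) + 1):
        right = sigma - v            # invariant carried in the +x direction
        left = sigma + v             # invariant carried in the -x direction
        new_sigma = np.empty_like(sigma)
        new_v = np.empty_like(v)
        new_sigma[1:-1] = 0.5 * (left[2:] + right[:-2])
        new_v[1:-1] = 0.5 * (left[2:] - right[:-2])
        new_v[0] = 0.0                               # fixed end, u(0, t) = 0
        new_sigma[0] = left[1]
        new_sigma[-1] = 1.0 - np.exp(-a * n * dt)    # loaded end, cf. (B.2)
        new_v[-1] = new_sigma[-1] - right[-2]
        sigma, v = new_sigma, new_v
        power_new = sigma[-1] * v[-1]                # integrand of (B.5)
        work += 0.5 * (power_old + power_new) * dt
        power_old = power_new
    strain_energy = trapezoid(0.5 * sigma**2, dx)    # (A.5) with e = sigma / E
    kinetic_energy = trapezoid(0.5 * v**2, dx)
    return work, strain_energy, kinetic_energy

U_EQ = 0.5      # U[e_bar] = sigma_bar^2 L / (2 E) in these units
W, U, K = simulate_bar(a=0.05)
print(f"W/U_eq = {W / U_EQ:.3f}, U/U_eq = {U / U_EQ:.3f}, K/U_eq = {K / U_EQ:.4f}")
```

For a slowly applied load such as a = 0.05, the normalized work and strain energy returned by this sketch stay close to 1 and the normalized kinetic energy stays small, which is the qualitative behaviour the text reports for small α. Larger values of `a` can be tried in the same way to see the ringing grow.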
Figure 3. Normalized work W [u](t)/U [ē] as a function of time for various slowness load
values α.
Figure 4. Normalized strain energy U [e](t)/U [ē] as a function of time for various slowness
load values α.
Figure 5. Normalized kinetic energy K[v](t)/U [ē] as a function of time for various slowness
load values α.
Because there are three decades of variation of the slowness load parameter α
shown in Figures 3–5, there is much highly oscillatory, rapid time-behavior that is
not resolved in these figures. Therefore, in Figures 6–10, we take α = 10⁵ sec⁻¹
and show a more detailed solution of (B.1)–(B.3). The material constants E and ρ,
bar length L and load constant σ̄ are the same as noted above, but the time axis
is now scaled so that the final time step of 12800 corresponds to 120 × 10⁻⁶ sec.
In Figure 6, we see that the strain field e(x, t) is highly irregular in
time at the fixed end x = 0 where information from the time-dependent loading
at the end x = L is reflected back into the bar. The length-axis of this figure is
measured in ‘length steps’ with the final length step of 100 corresponding to
5 × 10⁻³ m, which is the length of the bar. In Figures 7 and 8, we show the normalized
total work done W[u](t)/U[ē] and the normalized kinetic energy K[v](t)/U[ē]
as functions of time. These correspond to the α = 10⁵ sec⁻¹ cross sections of
Figures 3 and 5, respectively, for the initial time interval (0, 12800) as noted in
these figures. The normalized strain energy U[e](t)/U[ē] is not shown, but behaves
similarly to the work shown in Figure 7. Notice the orders of magnitude reduction of the energy scale
used in exhibiting the kinetic energy in Figure 8. In Figures 9 and 10, we show
the ratios U [e](t)/W [u](t) and K[v](t)/W [u](t) as functions of time in order to
illustrate that it takes only a few ‘rings’ to almost completely eliminate the total
kinetic energy in the bar. Of course, a small motion remains in the bar for all time
no matter how small the slowness parameter α > 0.
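To put the quoted parameters in perspective, the short calculation below evaluates the wave speed c = √(E/ρ), the transit time L/c, the period 4L/c of Appendix A and the dimensionless product αL/c. Only the material and loading values stated above are used. The grouping αL/c is our own choice of a convenient measure of how fast the load is applied relative to one wave transit, not a quantity defined in the text.

```python
import math

E = 76.1e9        # Young's modulus, Pa
rho = 2710.0      # mass density, kg/m^3
L = 5e-3          # bar length, m

c = math.sqrt(E / rho)        # characteristic wave speed, m/s
transit = L / c               # time for a wave to cross the bar, s
period = 4.0 * transit        # period of the dead-load solution of Appendix A, s

print(f"c = {c:.0f} m/s, L/c = {transit:.3e} s, 4L/c = {period:.3e} s")
for alpha in (1e4, 1e5, 1e6):                  # slowness parameters, 1/s
    print(f"alpha = {alpha:.0e} 1/s -> alpha*L/c = {alpha * transit:.4f}")

# Time represented by one 'time step' in Figures 3-5, as stated in the text.
dt_step = 1200e-6 / 129760
print(f"one time step = {dt_step:.3e} s, steps per transit = {transit / dt_step:.1f}")
```

With these numbers a wave crosses the bar in roughly a microsecond, so α = 10⁴ sec⁻¹ spreads the loading over about a hundred transits, whereas for α = 10⁶ sec⁻¹ the loading time 1/α is comparable to a single transit.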
Figure 6. Strain e(x, t) as a function of axial position and time for α = 10⁵ sec⁻¹.
Figure 7. W[u](t)/U[ē] vs. t: α = 10⁵ sec⁻¹.
Figure 8. K[v](t)/U[ē] vs. t: α = 10⁵ sec⁻¹.
Figure 9. U[e](t)/W[u](t) vs. t: α = 10⁵ sec⁻¹.
Figure 10. K[v](t)/W[u](t) vs. t: α = 10⁵ sec⁻¹.
References
1. G. Andrews and J.M. Ball, Asymptotic behaviour and changes of phase in one-dimensional
nonlinear viscoelasticity. J. Differential Equations 44 (1982) 306–341.
2. J.M. Ball, Some open problems in elasticity. In: Geometry, Mechanics and Dynamics, eds.
P. Newton, P. Holmes and A. Weinstein, Springer, New York (2002) pp. 3–59.
3. E. Clapeyron, Mémoire sur le travail des forces élastiques dans un corps solide élastique
déformé par l’action de forces extérieures. Comptes Rendus Acad. Sci. Paris XLVI (1858)
208–212.
4. C.M. Dafermos, On the existence and the asymptotic stability of solutions to the equations of
linear thermoelasticity. Arch. Rational Mech. Anal. 29 (1968) 241–271.
5. C.M. Dafermos, The mixed initial-boundary value problem for the equations of nonlinear one-dimensional viscoelasticity. J. Differential Equations 6 (1969) 71–86.
6. P. Duhem, Traité d’Energétique ou de Thermodynamique Générale. Gauthier-Villars, Paris
(1911).
7. J. E. Dunn and R. Fosdick, The morphology and stability of material phases. Arch. Rational
Mech. Anal. 74 (1980) 1–99.
8. D. Iesan and A. Scalia, Thermoelastic Deformations. Kluwer Academic, Dordrecht (1996).
9. G. Lamé, Leçons sur la Théorie Mathématique de l’Élasticité des Corps Solides. Paris (1852).
10. G. Lamé and E. Clapeyron, Mémoire sur l’équilibre intérieur des corps solides homogènes.
Mém. Divers Savants IV (1833) 465–562.
11. A.E.H. Love, A Treatise on the Mathematical Theory of Elasticity, 4th edn. Cambridge (1927).
12. J.V. Poncelet, Introduction à la Mécanique Industrielle, Physique et Expérimentale. Paris
(1839).
13. I. Todhunter and K. Pearson, A History of the Theory of Elasticity and of the Strength of
Materials from Galilei to the Present Time, Vol. I, Galilei to Saint-Venant. Cambridge (1886)
pp. 1639–1850.
14. I. Todhunter and K. Pearson, A History of the Theory of Elasticity and of the Strength of Materials from Galilei to the Present Time, Vol. II, Saint-Venant to Lord Kelvin, Part I. Cambridge
(1893).
15. C. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1984).
The Lavrentiev Phenomenon in Nonlinear Elasticity
M. FOSS¹, W. HRUSA² and V.J. MIZEL²
¹ Kansas State University, Manhattan, KS, U.S.A.
² Carnegie Mellon University, Pittsburgh, PA, U.S.A.
Received 14 October 2002; in revised form 30 October 2003
Abstract. In 1985 J.M. Ball and V.J. Mizel raised the question of whether there exist nonlinearly
elastic materials possessing a physically natural stored energy density, i.e., one which is independent
of an observer’s coordinate frame (objective) and is invariant under the group of orthogonal linear
transformations of space (isotropic), as well as physically reasonable boundary value problems for
such materials such that the infimum of the total stored energy for those continuous deformations of
the material meeting the boundary condition (admissible deformations) which belong to a Sobolev
space W^{1,p2} for some p2 > 1 is strictly greater than its infimum for those admissible continuous
deformations belonging to some Sobolev space W^{1,p1}, p1 < p2, despite the density of W^{1,p2}
in W^{1,p1}. The question was motivated by M. Lavrentiev’s demonstration in 1926 of the presence
of such a gap for a 1-dimensional variational boundary value problem on a bounded interval whose
smooth integrand satisfied the conditions of Tonelli’s existence theorem (as well as the development
of improved versions in the 1980’s). The present article describes a positive response to the question
raised in 1985. Namely, we provide examples of nonlinearly elastic materials in 2-dimensions and
physically reasonable boundary value problems for these materials in which a positive gap exists
between the infimum of the total stored energy over admissible continuous deformations belonging
to a Sobolev space W^{1,p2} and its infimum over admissible continuous deformations belonging to a
Sobolev space W^{1,p1}, with p1 < p2. The physical and computational significance of such results is
also discussed.
Mathematics Subject Classifications (2000): 49J, 49K, 74B, 74G.
Key words: Lavrentiev phenomenon, nonlinear elasticity, singular minimizers.
Dedicated to the memory of Clifford Truesdell, a teacher, good friend, and
inspiration to several generations of researchers in continuum physics.
1. Introduction
During the course of an investigation in the early 1980’s studying the variational
approach to the analysis of hyperelastic materials, J.M. Ball and V.J. Mizel discovered that even unidimensional problems in the calculus of variations are not
as elementary as many classical results suggest. For example, there are two point
boundary value problems involving a positive smooth integrand which is convex
in the derivative variable for which the absolutely continuous minimizer is not
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 427–435.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
a Lipschitz function but instead possesses a derivative which is only in L^p for
some finite p values and thus is essentially unbounded [4]. Furthermore Ball and
Mizel became aware through Cesari’s book [6] of an even more surprising one-dimensional
phenomenon (to be described below) which they thereafter called the
Lavrentiev phenomenon in honor of its discoverer. These developments affected
the degree of confidence Ball and Mizel had in the invariable success of variational
methods in determining the actual energy deformation of an elastic body under
physically appropriate boundary conditions and external forces [4].
The goal of the present article is to present a successful resolution of the question raised in [4] as to whether the Lavrentiev phenomenon can arise in realistic
problems of multidimensional hyperelasticity. We summarize the results obtained
in a longer article, which appeared in the Archive for Rational Mechanics and
Analysis [7], and we include some additional discussion of the physical significance of those results. In order to clarify what is meant by the Lavrentiev phenomenon in one-dimensional variational problems let us consider a functional of the
form
$$ J[y] = \int_a^b f(x, y(x), y'(x))\,dx, \tag{1} $$
where y is an absolutely continuous function defined on the interval [a, b] which is
subject to boundary constraints y(a) = A, y(b) = B and smoothness conditions.
We adopt the following notation. y ∈ W^{1,p}(a, b) ⇔ y is absolutely continuous
with y′ ∈ L^p(a, b);
$$ A_p = \{\, y \in W^{1,p}(a,b) \mid y(a) = A,\ y(b) = B \,\}. $$
Thus, for example, with [a, b] = [0, 1] and A = 0, B = 1, the function y(x) = x^β, β ∈
(0, 1), satisfies y ∈ W^{1,p}(0, 1) if and only if p ∈ [1, 1/(1 − β)). For J as above we put
$$ i(p) = \inf\{\, J[y] \mid y \in A_p \,\},\qquad p \in [1,\infty], $$
so that p1 ≤ p2 ⇒ i(p1) ≤ i(p2). Then the Lavrentiev phenomenon is said to
occur if J is such that i(p1) < i(p2) for some p1 < p2. For convenience we
denote this phenomenon by . There is no general theory ensuring the existence
of a W^{1,∞} (i.e., Lipschitz) minimizer for such problems; however in the 1920’s
the following result (incorporating an idea due to Nagumo) was demonstrated by
Tonelli.
THEOREM. If there exists φ: [0, ∞) → R such that φ(t)/t → ∞ as t → ∞
and f ∈ C² satisfies f(x, y, z) ≥ φ(|z|), ∀x ∈ [a, b], (y, z) ∈ R² and f_zz ≥ 0
then there exists u ∈ A_1 such that J[u] ≤ J[v], ∀v ∈ A_1.
To illustrate the phenomenon we examine the following example due to Heinricher and Mizel [8]. Let f0(x, y, z) = (y² − x)² z⁶ and consider the problem of
minimizing
$$ J_0[y] = \int_0^1 f_0(x, y(x), y'(x))\,dx, $$
where y is an absolutely continuous function defined on the interval [0, 1] subject
to the constraints y(0) = 0, y(1) = 1. Note that f0 satisfies all conditions apart
from the superlinear growth condition in Tonelli’s theorem. We put
$$ i_0(p) = \inf\{\, J_0[y] \mid y \in W^{1,p}(0,1),\ y(0) = 0,\ y(1) = 1 \,\},\qquad p \in [1,\infty]. $$
Now positivity of f0 ensures that since u(x) = x^{1/2} satisfies J0[u] = 0 one has
i0(p) = 0 ∀p ∈ [1, 2). However it was shown using ideas of Noether that for all
p ∈ [2, 5/2) the function v(x) = x^{3/5} minimizes J0 whence i0(p) = (1/6)(3/5)⁶
for this range of p values. Thus we have the phenomenon for this problem:
i0(p1) = 0 < i0(p2) = (1/6)(3/5)⁶ ∀p1 < 2, p2 ∈ [2, 5/2). Furthermore if
the exponent 6 in f0 is replaced by any β > 6 then for any sequence {yn} ⊂ A_2
such that yn(x) converges pointwise to x^{1/2}, J0[yn] → +∞. Now it is possible by a
simple modification of f0 to construct an integrand fε which satisfies all conditions
of Tonelli’s theorem as well as (fε)_zz > 0 and yet exhibits Lavrentiev’s phenomenon.
Namely, put for ε > 0, fε(x, y, z) = (y² − x)² z⁶ + ε(1 + z²)^{5/6} and set
$$ J_\varepsilon[y] = \int_0^1 f_\varepsilon(x, y(x), y'(x))\,dx. \tag{2} $$
It is not difficult to verify that for sufficiently small ε one has iε(p1) < (1/6)(3/5)⁶
< iε(p2) ∀p1 < 2, p2 ∈ [2, 5/2), whereby the phenomenon holds for Jε, as
claimed. One might be tempted to say that {y ∈ W^{1,6}(0, 1): y(0) = 0, y(1) = 1}
is the “natural” domain on which to minimize J0 (or Jǫ ). However, in this case there
is no minimizer. In physical problems arising in nonlinear elasticity the situation is
similar. If one takes the domain of the energy functional based on a Sobolev space
having the property that all deformations have finite energy, then there may not be
a minimizer – unless the stored energy function is of a very special type having the
same upper and lower growth rates.
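The gap can be made concrete on the one-parameter family of trial functions y(x) = x^β, which satisfy the boundary conditions for every β > 0. The sketch below is illustrative only and does not replace the minimization argument cited above. It uses the closed form J0[x^β] = β⁶/(30(2β − 1)) for β > 1/2, which we obtained by integrating the three resulting power terms directly, together with J0[x^{1/2}] = 0. The function name `j0_power` is hypothetical.

```python
from fractions import Fraction

def j0_power(beta):
    """Exact value of J0[x**beta] for f0 = (y**2 - x)**2 * z**6, beta >= 1/2.

    For beta > 1/2 the integrand equals
    beta**6 * (x**(10b-6) - 2*x**(8b-5) + x**(6b-4)),
    whose integral over (0, 1) is beta**6 / (30 * (2*beta - 1)).
    """
    beta = Fraction(beta)
    if beta == Fraction(1, 2):
        return Fraction(0)           # y**2 - x vanishes identically
    if beta < Fraction(1, 2):
        raise ValueError("the integral diverges for beta < 1/2")
    return beta**6 / (30 * (2 * beta - 1))

target = Fraction(1, 6) * Fraction(3, 5) ** 6

# Scan beta = k/1000 over (1/2, 1]; these trial functions lie in W^{1,2}(0, 1).
grid = [Fraction(k, 1000) for k in range(501, 1001)]
best = min(grid, key=j0_power)

print("J0[x^(1/2)] =", j0_power(Fraction(1, 2)))
print("best beta on grid =", best, " J0 =", j0_power(best), " target =", target)
print("J0 just above 1/2:", float(j0_power(Fraction(501, 1000))))
```

Within this family the energy is bounded below by (1/6)(3/5)⁶ as soon as β > 1/2, with the bound attained at β = 3/5, and it grows without bound as β decreases to 1/2, although the limit function x^{1/2} itself has zero energy.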
2. 2-Dimensional Hyperelasticity Examples
As indicated earlier, we will describe certain boundary value problems involving
a 2-dimensional body consisting of an elastic material with a physically natural
stored energy function, for which the Lavrentiev phenomenon occurs. It should
be noted that there were previous results in nonlinear elasticity where such a gap
phenomenon was shown [3, 1] but all such examples involved discontinuous deformations exhibiting cavities of nonzero surface area to which no energetic cost
was assigned. In view of the uncertain physical status of those examples, we regard
it as important that our examples involve only continuous deformations so that no
cavities arise.
Now the presence of the Lavrentiev phenomenon in our boundary value problems for a physically natural elastic material implies that the infimal stored energy
for continuous deformations belonging to a Sobolev space of exponent p2 for some
p2 > 1, is strictly greater than the minimal stored energy for continuous deformations belonging to the larger Sobolev space of exponent p1 = 1. Consequently any
deformation minimizing the stored energy in the space with exponent p2 would be
a metastable equilibrium – one possessing milder gradient singularities than those
occurring in the stable equilibrium deformation belonging to the space with exponent p1 = 1. Therefore any occasion in which the material deformation transforms
from the metastable equilibrium associated with exponent p2 to the stable equilibrium associated with exponent p1 would produce material points with heightened
gradient singularities. Possibly such points could provide loci for the initiation of
(as opposed to the presence of ) fracture. The computational significance of such
results is that standard computational schemes might suggest that the more regular
metastable equilibrium state is actually the stable equilibrium state, i.e., absolute
minimizer. Hence such computations could lead to misleadingly optimistic design
specifications for elastic structures. Furthermore, in analogy to the one-dimensional
result cited in Section 1, when a sector of the unit disc is deformed into a sector
whose central angle is less than 3/4 as large, then for any sequence {u^(n)} lying in
the Sobolev space of exponent p2 (which does not contain the stable equilibrium
deformation) and converging to that equilibrium deformation pointwise one has
J[u^(n)] → +∞, where J[u] denotes the total stored energy associated with the
displacement field u.
It will be convenient to introduce the following notation for the description
of our examples. Lin R² denotes the set of linear transformations on R², while
Lin⁺ R² denotes the subset consisting of linear transformations with positive determinant. Orth R² denotes the orthogonal group of linear transformations on R²,
with Orth⁺ R² = Orth R² ∩ Lin⁺ R² denoting the proper orthogonal group of
linear transformations on R². For F ∈ Lin R², ‖F‖² = tr FᵀF. For x ∈ R²\{0},
r(x) = √(x1² + x2²), θ(x) = arctan(x2/x1). Finally, for β ∈ (0, 2π), Ω_β = {x |
r(x) < 1, θ(x) ∈ (0, β)}, with the following notation for portions of its boundary:
$$ \Gamma_{1,\beta} = \{x \mid r(x) \in [0,1],\ \theta(x) = \beta\}, $$
$$ \Gamma_{2,\beta} = \{x \mid r(x) \in [0,1],\ \theta(x) = 0\}, $$
$$ \Gamma_{3,\beta} = \{x \mid r(x) = 1,\ \theta(x) \in [0,\beta]\}. $$
We begin with the following two dimensional variational problem involving a
positive integrand W0 (∇u) where W0 has several physically natural features (to be
described below) but is not a physically appropriate stored energy integrand for a
nonlinearly elastic material since W0 (F ) does not become unbounded as det F →
0+, i.e., as the material undergoes very high compression:
$$ \text{Minimize } J_0[u] = \int_{\Omega_\beta} W_0(\nabla u)\,dx \quad \text{for } u: \Omega_\beta \to \Omega_\alpha,\ \alpha \in (0,\beta), \tag{P} $$
subject to
(i) r(u(x)) = 1, θ(u(x)) = (α/β) θ(x), ∀x ∈ Γ_{3,β};
(ii) u(Γ_{1,β}) = Γ_{1,α};
(iii) u(Γ_{2,β}) = Γ_{2,α};
(iv) u(0) = 0;
(v) ∫_{Ω_β} det ∇u(x) dx ≤ ∫_{u(Ω_β)} du.   (BC)
(The first four conditions are (generalized) Dirichlet type conditions while the
fifth condition guarantees that if ∇u ∈ Lin+ R2 almost everywhere then u is
injective almost everywhere.)
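Here $W_0(F) = (\|F\|^2 - 2\det F)^4 = ((F_{11} - F_{22})^2 + (F_{21} + F_{12})^2)^4$, and for each $p \in [1, \infty]$ we put $A_p = W^{1,p}(\Omega_\beta; \mathbb{R}^2) \cap C(\Omega_\beta; \mathbb{R}^2) = A_1 \cap W^{1,p}(\Omega_\beta; \mathbb{R}^2)$, restricting the admissible mappings $u$ for (P) to be those which are elements of $\mathcal{A}_p := \{u \in A_p \mid \nabla u \in \operatorname{Lin}^+\mathbb{R}^2 \text{ a.e. and } u \text{ satisfies (BC)}\}$. For example, for each $p \in [1, \infty]$ the mappings $u\colon \Omega_\beta \to \Omega_\alpha$ given by
\[
u(x) = r(x)^{\delta} \begin{pmatrix} \cos\bigl(\tfrac{\alpha}{\beta}\theta(x)\bigr) \\[2pt] \sin\bigl(\tfrac{\alpha}{\beta}\theta(x)\bigr) \end{pmatrix}
\]
belong to $\mathcal{A}_p$ for appropriate choices of $\delta > 0$.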
We note the following properties of $W_0$:
• $W_0 \in C^\infty(\operatorname{Lin}\mathbb{R}^2; \mathbb{R})$; $W_0$ is convex on $\operatorname{Lin}\mathbb{R}^2$; $W_0(F) \ge W_0(I) = 0$;
• $W_0$ is materially homogeneous, i.e., not $x$-dependent;
• $W_0$ is objective and isotropic, i.e., $W_0(QF) = W_0(F) = W_0(FQ)$ for each $Q \in \operatorname{Orth}^+\mathbb{R}^2$.
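The listed properties of $W_0$ can be spot-checked numerically. The following is a minimal sketch, not part of the original paper: the helper names (`w0`, `w0_squares`, `check_w0_properties`) are hypothetical, matrices are sampled at random, and a passing check is only evidence on the sampled points, not a proof. It compares the two algebraic forms of $W_0$, tests $W_0(I) = 0$ and $W_0 \ge 0$, tests invariance under left and right multiplication by rotations, and tests midpoint convexity.

```python
import math
import random


def w0(F):
    """W0(F) = (||F||^2 - 2 det F)^4 for F = ((F11, F12), (F21, F22))."""
    (f11, f12), (f21, f22) = F
    norm_sq = f11**2 + f12**2 + f21**2 + f22**2  # ||F||^2 = tr(F^T F)
    det = f11 * f22 - f12 * f21
    return (norm_sq - 2.0 * det) ** 4


def w0_squares(F):
    """Equivalent form ((F11 - F22)^2 + (F21 + F12)^2)^4."""
    (f11, f12), (f21, f22) = F
    return ((f11 - f22) ** 2 + (f21 + f12) ** 2) ** 4


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def rotation(t):
    """Proper rotation by the angle t."""
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))


def random_matrix(rng):
    return tuple(tuple(rng.uniform(-2.0, 2.0) for _ in range(2)) for _ in range(2))


def close(a, b):
    return math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)


def check_w0_properties(n_samples=1000, seed=0):
    """Return True if all sampled checks pass (illustrative only)."""
    rng = random.Random(seed)
    identity = ((1.0, 0.0), (0.0, 1.0))
    ok = w0(identity) == 0.0
    for _ in range(n_samples):
        F, G = random_matrix(rng), random_matrix(rng)
        Q = rotation(rng.uniform(0.0, 2.0 * math.pi))
        mid = tuple(
            tuple(0.5 * (F[i][j] + G[i][j]) for j in range(2)) for i in range(2)
        )
        ok = ok and w0(F) >= 0.0
        ok = ok and close(w0(F), w0_squares(F))
        ok = ok and close(w0(matmul(Q, F)), w0(F))  # objectivity
        ok = ok and close(w0(matmul(F, Q)), w0(F))  # isotropy
        # midpoint convexity, with a small floating-point allowance
        ok = ok and w0(mid) <= 0.5 * (w0(F) + w0(G)) * (1 + 1e-9) + 1e-9
    return ok
```

Calling `check_w0_properties()` runs the sampled checks and returns a single boolean.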
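Next, for each $p \in [1, \infty]$ put $i_0(p) = \inf\{J_0[u] \mid u \in \mathcal{A}_p\}$, and for each $\alpha, \beta \in (0, 2\pi)$ with $\alpha < \beta$ put $p^*_{\beta,\alpha} = 2\beta/(\beta - \alpha)$. We now give the value of $i_0(p)$ for all $p \in [1, \infty]\setminus\{p^*_{\beta,\alpha}\}$.
THEOREM 1. For $\alpha, \beta \in (0, 2\pi)$ with $\alpha < \beta$ define $u_{am}\colon \Omega_\beta \to \Omega_\alpha$ as follows:
\[
u_{am}(x) = r(x)^{\gamma} \begin{pmatrix} \cos(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) \end{pmatrix}, \quad \text{where } \gamma := \frac{\alpha}{\beta} < 1,
\]
so that
\[
\nabla u_{am}(x) = \gamma\, r(x)^{\gamma - 1} \begin{pmatrix} \cos(\gamma\theta(x)) & -\sin(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) & \cos(\gamma\theta(x)) \end{pmatrix}, \quad x \in \Omega_\beta.
\]
Using the definition of $W_0$ it is clear that
\[
W_0(\nabla u_{am}(x)) = 0 \quad \text{for all } x \in \Omega_\beta. \tag{$*$}
\]
Now it is easy to verify that $u_{am} \in \mathcal{A}_p$ if and only if $p \in [1, p^*_{\beta,\alpha}) = [1, 2/(1 - \gamma))$. It therefore follows by ($*$) that
\[
\inf_{\mathcal{A}_p} J_0 = J_0[u_{am}] = 0 \quad \text{for each } p \in [1, p^*_{\beta,\alpha}), \tag{$**$}
\]
whence $u_{am}$ is an “absolute” minimizer for $J_0$ on each $\mathcal{A}_p$, $1 \le p < 2/(1 - \gamma)$. To summarize we may write
\[
i_0(p) = 0 \quad \text{for all } p \in [1, p^*_{\beta,\alpha}). \tag{3}
\]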
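THEOREM 2 (Case 1). If $\gamma = \alpha/\beta < 3/4$ then $J_0$ possesses a “pseudo” minimizer $u_{pm} \in \mathcal{A}_p$ for each $p \in [2/(1 - \gamma), 14/(1 + \gamma))$. Namely,
\[
u_{pm}(x) = r(x)^{(6-\gamma)/7} \begin{pmatrix} \cos(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) \end{pmatrix},
\]
so that
\[
\nabla u_{pm}(x) = r(x)^{-(1+\gamma)/7} \begin{pmatrix} \dfrac{6-\gamma}{7}\cos(\gamma\theta(x)) & -\gamma\sin(\gamma\theta(x)) \\[6pt] \dfrac{6-\gamma}{7}\sin(\gamma\theta(x)) & \gamma\cos(\gamma\theta(x)) \end{pmatrix}
\]
and
\[
\det \nabla u_{pm}(x) = \frac{\gamma(6-\gamma)}{7}\, r(x)^{-2(1+\gamma)/7} > 0, \quad x \in \Omega_\beta.
\]
Now $u_{pm}$ is a solution to the Euler–Lagrange system for $J_0$ and $u_{pm} \in \mathcal{A}_p$ for all $p \in [2/(1 - \gamma), 14/(1 + \gamma))$, whereas $u_{am} \notin \mathcal{A}_p$ for any $p \ge 2/(1 - \gamma)$, so one has
\[
\inf_{\mathcal{A}_p} J_0 = J_0[u_{pm}] = \beta\left[\frac{8}{7}\left(\frac{3}{4} - \gamma\right)\right]^{7} > 0 \quad \text{for all } p \in \left[\frac{2}{1-\gamma}, \frac{14}{1+\gamma}\right), \tag{\#}
\]
which justifies the description of $u_{pm}$ as a “pseudo” minimizer for $p \in [2/(1 - \gamma), 14/(1 + \gamma))$. Moreover, the relation (#) also holds for all $p \in [14/(1 + \gamma), \infty]$, although $u_{pm} \notin \mathcal{A}_p$ for these $p$ values.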
Case 2. If $\beta > \alpha > \tfrac{3}{4}\beta$, so that $3/4 < \gamma < 1$, then for all $p \in [2/(1 - \gamma), \infty]$ one has the relation
\[
\inf_{\mathcal{A}_p} J_0 = J_0[u_{am}] = 0 \tag{\#\#}
\]
although $u_{am} \notin \mathcal{A}_p$ for these $p$ values.
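The constants in Theorems 1 and 2 can be reproduced by elementary integration in polar coordinates. The sketch below is an illustration added here, not the authors' computation: it assumes the formulas for $u_{am}$ and $u_{pm}$ stated above, uses exact rational arithmetic for rational $\gamma$, and the function names are hypothetical. It recomputes $J_0[u_{pm}]/\beta$ from $\|\nabla u_{pm}\|^2 - 2\det\nabla u_{pm} = (a - \gamma)^2 r^{2(a-1)}$ with $a = (6-\gamma)/7$, and it recovers the Sobolev thresholds $2/(1-\gamma)$ and $14/(1+\gamma)$ from the rule that a gradient of size $r^{e-1}$ lies in $L^p$ of a planar sector exactly when $p(1-e) < 2$.

```python
from fractions import Fraction


def pseudo_minimizer_energy_over_beta(gamma):
    """J0[u_pm] / beta by direct radial integration, for rational gamma < 3/4."""
    gamma = Fraction(gamma)
    a = (6 - gamma) / 7               # radial exponent of u_pm
    amplitude = (a - gamma) ** 8      # W0 = ((a - gamma)^2 r^(2(a-1)))^4
    radial_power = 8 * (a - 1)        # W0 is proportional to r^radial_power
    if radial_power + 2 <= 0:
        raise ValueError("energy integral diverges; need gamma < 3/4")
    # int_0^1 r^radial_power * r dr = 1 / (radial_power + 2)
    return amplitude / (radial_power + 2)


def closed_form_over_beta(gamma):
    """The value ((8/7)(3/4 - gamma))^7 appearing in (#)."""
    return (Fraction(8, 7) * (Fraction(3, 4) - Fraction(gamma))) ** 7


def sobolev_exponent_bound(radial_exponent):
    """Supremum of p for which a map with |grad u| ~ r^(e-1), e < 1, has
    an L^p gradient on a planar sector: p * (1 - e) < 2."""
    return 2 / (1 - Fraction(radial_exponent))
```

For instance, `pseudo_minimizer_energy_over_beta(Fraction(1, 2))` and `closed_form_over_beta(Fraction(1, 2))` return the same rational number, and `sobolev_exponent_bound((6 - g) / 7)` equals `14 / (1 + g)` for a rational `g`.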
In view of Theorems 1 and 2 we see that the Lavrentiev phenomenon does hold for $J_0$ when $\alpha/\beta = \gamma < 3/4$:
\[
i_0(p_1) = 0 < i_0(p_2) = \beta\left[\frac{8}{7}\left(\frac{3}{4} - \gamma\right)\right]^{7} \quad \text{for all } p_1 < \frac{2}{1-\gamma},\ p_2 > \frac{2}{1-\gamma},
\]
whereas the Lavrentiev phenomenon does not occur when $\gamma > 3/4$:
\[
\gamma > \frac{3}{4} \ \Rightarrow\ i_0(p) = 0 \quad \text{for all } p \in [1, \infty].
\]
We note that the validity of (#) and (##) when $\gamma > 3/4$ is demonstrated by constructing sequences $\{u^{(n)}\} \subset \mathcal{A}_\infty$ such that $u^{(n)}$ converges weakly to $u_{pm}$ (respectively, to $u_{am}$) for the given values of $p$.
Next we quote a fundamental existence theorem due to Ball [2].
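THEOREM 3. Suppose $W\colon \operatorname{Lin}^+\mathbb{R}^2 \to \mathbb{R}$ satisfies
(i) $W$ is polyconvex: i.e., there is a continuous jointly convex function $g\colon \operatorname{Lin}\mathbb{R}^2 \times (0, \infty) \to \mathbb{R}$ such that $W(F) = g(F, \det F)$, $\forall F \in \operatorname{Lin}^+\mathbb{R}^2$;
(ii) There are $p_0 \ge 2$ and $K_1, K_2 > 0$ such that $g(F, \lambda) \ge K_1\|F\|^{p_0} - K_2$, $\forall (F, \lambda) \in T = \operatorname{Lin}^+\mathbb{R}^2 \times (0, \infty)$;
(iii) $g(F, \lambda) \to +\infty$ as $(F, \lambda) \to \partial T$, in particular as $\det F \to 0^+$ and as $\det F \to +\infty$ (by convention $W(F) = +\infty$ for $F \in \operatorname{Lin}\mathbb{R}^2 \setminus \operatorname{Lin}^+\mathbb{R}^2$).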
Then for any β ∈ (0, 34 π ) and the stored energy functional J : W 1,1 (β ; R2 ) →
(−∞, ∞] defined by
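\[
J[u] = \int_{\Omega_\beta} W(\nabla u)\,dx,
\]
there exists $u_{am} \in \mathcal{A}_{p_0}$ such that
\[
J[u_{am}] = \inf\{J[u] \mid u \in \mathcal{A}_1\} = \inf\{J[u] \mid u \in \mathcal{A}_{p_0}\}
\]
provided $J[u] < \infty$ for at least one $u \in \mathcal{A}_{p_0}$.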
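We now adapt our previous result to the elasticity context. Given $K \in [2, 4)$, let $P_K(F) = K(\det F)^{-1} + 3^{(2-K)/2}(1 + \|F\|^2)^{K/2}$ and define $W_{\epsilon,K}\colon \operatorname{Lin}^+\mathbb{R}^2 \to [0, \infty)$ as follows:
\[
W_{\epsilon,K}(F) = \bigl(\|F\|^2 - 2\det F\bigr)^4 + \epsilon\bigl[K(\det F)^{-1} + 3^{(2-K)/2}(1 + \|F\|^2)^{K/2}\bigr] = W_0(F) + \epsilon P_K(F).
\]
Consider the following variational problem for given $p \in [1, \infty]$:
\[
J_{\epsilon,K}[u] = \int_{\Omega_\beta} W_{\epsilon,K}(\nabla u)\,dx \to \inf \quad \text{for } u \in \mathcal{A}_p. \tag{$P_{\epsilon,K}$}
\]
We note the following properties of $W_{\epsilon,K}$ for each $\epsilon > 0$, $K \ge 2$ (by convention $W_{\epsilon,K}(F) = +\infty$ for $F \in \operatorname{Lin}\mathbb{R}^2 \setminus \operatorname{Lin}^+\mathbb{R}^2$):
• $W_{\epsilon,K} \in C^\infty(\operatorname{Lin}^+\mathbb{R}^2; \mathbb{R})$;
• $W_{\epsilon,K}(F) \ge \epsilon\|F\|^K$;
• $W_{\epsilon,K}(F) \ge W_{\epsilon,K}(I) > 0$;
• $W_{\epsilon,K}$ is materially homogeneous, i.e., $x$-independent; $W_{\epsilon,K}$ is objective and isotropic; $W_{\epsilon,K}$ is polyconvex; $W_{\epsilon,K}(F) \to +\infty$ as $F \to \partial T = \partial\bigl(\operatorname{Lin}^+\mathbb{R}^2 \times (0, \infty)\bigr)$.
Thus $W_{\epsilon,K}$ possesses those properties which are typical for physically natural stored energy densities and in addition satisfies the conditions of Theorem 3 (Ball’s existence theorem).
We may now state our main result, writing $i_{\epsilon,K}$ for the obvious infimum as a function of $p$.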
THEOREM 4. Given β ∈ (0, 43 π ) and α ∈ (0, 34 β) there are for each K ∈
$[2, 2/(1 - \gamma))$ numbers $p^*_{\alpha,\beta} = 2/(1 - \gamma)$ and $\epsilon_{\beta,\alpha} = \epsilon_{\beta,\alpha}(K) > 0$ such that if $\epsilon < \epsilon_{\beta,\alpha}$ then the Lavrentiev phenomenon is present in the following sense:
\[
0 < i_{\epsilon,K}(p_1) = \inf\{J_{\epsilon,K}(u) \mid u \in \mathcal{A}_{p_1}\} < i_{\epsilon,K}(p_2)
\]
whenever $p_1 < 2/(1 - \gamma) < p_2$.
It can be shown that under the constraints we have given the minimizers $u^{\epsilon,K}_{am}$ for ($P_{\epsilon,K}$) above, whose existence is guaranteed by Ball’s theorem (Theorem 3), are continuous mappings, thus avoiding the cavitation issue referred to on page 3. For proofs and additional discussion see [7].
REMARK 1. Although our boundary condition (i) completely prescribes the displacement on $\Gamma_{3,\beta}$, conditions (ii) and (iii) only partially prescribe the displacement on $\Gamma_{1,\beta}$ and $\Gamma_{2,\beta}$. We do not know if the Lavrentiev phenomenon can occur for problems in which the displacement is completely prescribed on the entire boundary (i.e., for standard Dirichlet type boundary conditions).
REMARK 2. In order to give some insight into the special properties of $W_0$ it is useful to introduce complex notation. If we write $z = x_1 + ix_2$, $f = u_1 + iu_2 = re^{i\theta(u)}$, then $W_0(\nabla u) = (4|\bar\partial f|^2)^4$ and the Euler–Lagrange system becomes $\partial(|\bar\partial f|^6\,\bar\partial f) = 0$, where
\[
\partial = \frac{1}{2}\left(\frac{\partial}{\partial x_1} - i\frac{\partial}{\partial x_2}\right) \quad \text{and} \quad \bar\partial = \frac{1}{2}\left(\frac{\partial}{\partial x_1} + i\frac{\partial}{\partial x_2}\right).
\]
This allows one to explicitly construct solutions of the Euler–Lagrange equations. Convexity arguments can be used to prove that these solutions are minimizers. Similar results apply to elastic materials associated with $W_0(F) = (\|F\|^2 - 2\det F)^q$, for each exponent $q > 1$.
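As an illustration of this construction, the radial exponent of the pseudo-minimizer in Theorem 2 can be recovered by a short computation. This is a sketch added for the reader, not taken from [7]; it reads the operators of Remark 2 as the Wirtinger derivatives $\partial = \partial/\partial z$ and $\bar\partial = \partial/\partial\bar z$, and it assumes the ansatz $f = r^{a}e^{i\gamma\theta}$ with an unknown exponent $a$ (our notation) on the sector $\theta \in (0, \beta)$:

```latex
\begin{aligned}
f &= r^{a}e^{i\gamma\theta} = z^{(a+\gamma)/2}\,\bar z^{(a-\gamma)/2},
   \qquad z = re^{i\theta},\\
\bar\partial f &= \tfrac{a-\gamma}{2}\, z^{(a+\gamma)/2}\,\bar z^{(a-\gamma)/2-1},
   \qquad |\bar\partial f| = \tfrac{|a-\gamma|}{2}\, r^{a-1},\\
|\bar\partial f|^{6}\,\bar\partial f
  &= \Bigl(\tfrac{|a-\gamma|}{2}\Bigr)^{6}\,\tfrac{a-\gamma}{2}\;
     z^{3(a-1)+(a+\gamma)/2}\;\bar z^{3(a-1)+(a-\gamma)/2-1},\\
\partial\bigl(|\bar\partial f|^{6}\,\bar\partial f\bigr) = 0
  &\iff a = \gamma \quad\text{or}\quad 3(a-1)+\tfrac{a+\gamma}{2} = 0
   \iff a = \gamma \quad\text{or}\quad a = \tfrac{6-\gamma}{7}.
\end{aligned}
```

The root $a = \gamma$ corresponds to $u_{am}$, for which $\bar\partial f = 0$ and hence $W_0$ vanishes; the root $a = (6-\gamma)/7$ corresponds to $u_{pm}$, for which $4|\bar\partial f|^2 = (a-\gamma)^2 r^{2(a-1)} = \bigl[\tfrac{8}{7}(\tfrac{3}{4}-\gamma)\bigr]^2 r^{-2(1+\gamma)/7}$.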
REMARK 3. We have not succeeded in demonstrating that there are inherently
three-dimensional physically natural boundary problems in which the Lavrentiev
phenomenon occurs for some physically natural elastic material.
Acknowledgements
Mizel wishes to express his appreciation to T.J. Healey for stimulating comments
after his talk. Others contributing earlier stimulating comments to the research are
acknowledged in [7]. In addition, the authors wish to express appreciation to three
anonymous referees for their helpful comments. Finally, Mizel wishes to express
his appreciation to the U.S. National Science Foundation for its partial support of
his research under Grant 0072816 and Foss expresses his appreciation to the U.S.
National Science Foundation for partial support of his research under its VIGRE
program.
References
1. G. Alberti and P. Majer, Gap phenomenon for some autonomous functionals. J. Convex Anal. 1 (1994) 31–45.
2. J.M. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 63 (1977) 337–403.
3. J.M. Ball, Discontinuous equilibrium solutions and cavitation in nonlinear elasticity. Philos. Trans. Roy. Soc. London Ser. A 305 (1982) 557–611.
4. J.M. Ball and V.J. Mizel, One-dimensional variational problems whose minimizers do not satisfy the Euler–Lagrange equation. Arch. Rational Mech. Anal. 90 (1985) 325–388.
5. M. Belloni, Interpretation of Lavrentiev phenomenon by relaxation: The higher order case. Trans. Amer. Math. Soc. 347 (1995) 2011–2023.
6. L. Cesari, Optimization Theory and Applications. Springer, New York (1983).
7. M. Foss, W.J. Hrusa and V.J. Mizel, The Lavrentiev gap phenomenon in nonlinear elasticity. Arch. Rational Mech. Anal. 167 (2003) 337–365.
8. A.C. Heinricher and V.J. Mizel, The Lavrentiev phenomenon for invariant variational problems. Arch. Rational Mech. Anal. 102 (1988) 57–93.
Steady Flow of a Navier–Stokes Fluid
around a Rotating Obstacle ⋆
GIOVANNI P. GALDI
Department of Mechanical Engineering, University of Pittsburgh, Pittsburgh 15261 PA, U.S.A.
E-mail: galdi@engrng.pitt.edu
Received 7 October 2002; in revised form 7 January 2003
Abstract. Let B be a body immersed in a Navier–Stokes liquid L that fills the whole space. Assume
that B rotates with prescribed constant angular velocity ω. We show that if the magnitude of ω is not
“too large”, there exists one and only one corresponding steady motion of L such that the velocity
field v(x) and its gradient grad v(x) decay like |x|−1 and |x|−2 , respectively. Moreover, the pressure
field p(x) and its gradient grad p(x) decay like |x|−2 and |x|−3 , respectively. These solutions are
“physically reasonable” in the sense of Finn. In particular, they are unique and satisfy the energy
equation. This result is relevant to several applications, including sedimentation of heavy particles in
a viscous liquid.
Mathematics Subject Classifications (2000): 35Q30, 76N10, 76D07.
Key words: rotating obstacle, Navier–Stokes, steady state, asymptotic behavior.
To Clifford Truesdell, in memoriam
1. Introduction
The steady motion of a liquid past a rigid body, B, translating with a constant
velocity is among the oldest and most fundamental questions in theoretical and
applied fluid dynamics [24]. In fact, the first, significant contributions to the subject
date back to the work of Stokes [35], Kirchhoff [21], and Thomson (Lord Kelvin)
and Tait [37].
In view of its complexity, a systematic and rigorous mathematical study of the problem for a Navier–Stokes liquid, L, was initiated only much later, through the fundamental work of Oseen [31], Odqvist [30], and Leray [25, 26]. Only a few decades ago was it further deepened and, in certain respects, completed, as a result of the efforts of several mathematicians including Ladyzhenskaya [22], Fujita [10], Finn [9] and Babenko [3]; see also [12, 15].
The main achievement of these works is the proof of existence of steady solutions that exhibit all the main features expected from a physical point of view.
⋆ Work partially supported by NSF grant DMS-0103970
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 437–467.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
In particular, they are unique for small data, satisfy the global energy balance
(energy equation) and show a wake behind the body, that is, they are “physically
reasonable” in the sense of Finn [9]. Furthermore, they are stable and attainable
from rest for sufficiently small data.
It is important to emphasize that all the above properties can be secured only
through the knowledge of the asymptotic behavior of the solutions at large distances. Moreover, they hold under the crucial assumption that the motion of B is
purely translatory (no spin).
Recently, the present author has started a mathematical analysis of the sedimentation of rigid bodies in a Navier–Stokes liquid (see [14] and the references cited therein). This problem, which is at the foundation of several engineering applications such as the manufacturing of short-fiber composites [2], the separation of macromolecules by electrophoresis [36], flow-induced microstructures [20], and blood flow problems [32], consists in studying the existence, stability and attainability of terminal states that are eventually achieved (as time goes to infinity) by a rigid body of negative buoyancy that is dropped from rest in a Navier–Stokes liquid.⋆ Here,
by “terminal state” we mean a state of motion where the body moves with constant
translational and angular velocities with respect to an inertial frame, while the flow
of the liquid, as observed from a frame attached to the body, is steady [39].
A significant result of D. Serre shows that, for B of arbitrary shape and mass
and for L of arbitrary density and viscosity, the set of terminal states is not empty
[33].
However, solutions obtained by Serre are “weak”, in the sense that their corresponding velocity field v vanishes at infinity a priori only in a generalized sense⋆⋆
and, consequently, it is not known if they are “physically reasonable”. Recently, the
present author has shown that, for these solutions, v and the corresponding pressure
p tend to zero at large distances uniformly pointwise [14]. However, this result
is not enough to furnish the validity of the basic physical properties mentioned
above, that require for v and p an order of decay with the distance r from B like
r −1 and r −2 , respectively [4]; see also [5, 18, 19] and the references therein. We
wish to emphasize that even the proof of the uniform pointwise convergence, that
in absence of rotation is obtained quite straightforwardly [12, Theorem IX.6.1],
requires a substantial effort if B is allowed to spin; see Section 4.2.2 in [14].
In order to understand why the problem becomes difficult if B is rotating versus translating, we recall that the method typically employed in the study of the
asymptotic structure of a steady solution in exterior domain [12, 27, 8] relies upon
the proof of existence and of appropriate estimates of solutions to the linearized
problem, in conjunction with a suitable fixed point argument. In turn, this proof
is typically achieved by showing appropriate estimates of the fundamental solution
for the relevant linear operator. Now, if B is only translating, the linearized operator
is the well-known Oseen operator, LT , which is obtained from the (second order)
⋆ Similar problems are of great interest also in visco-elastic non-Newtonian fluid [38]; see [14].
⋆⋆ Specifically, the average of |v| over the unit sphere vanishes at large spatial distances.
Stokes operator by adding a lower (first) order term in the velocity field v, with constant coefficients. If, on the other hand, the body is rotating with angular velocity ω, the corresponding linearized operator, LTR, also includes the first order term ω × x · grad v, with x a generic point in the region occupied by L; see equation (2.1). This term has two undesired features, related to its coefficient ω × x. The first is that this coefficient depends on x, and the other, more important, is that it becomes unbounded at large distances from B. It should be added that the fundamental solution for the operator LTR is known [5], but due to its very complicated form, any reasonable attempt to furnish appropriate estimates appears to be unwieldy and extremely difficult. Moreover, other methods, like the Fourier transform in conjunction with the theory of multipliers, that have been successfully employed in the case of the operator LT [12, Chapters IX and X], seem to fail in the case of the operator LTR or, at least, do not seem to provide valuable information.
The present paper is devoted to existence, uniqueness and asymptotic behavior
of steady solutions to the Navier–Stokes equation in the exterior of a rotating body.
In particular, we show that, if the angular velocity ω of the body is not “too large”,
a unique solution exists, whose velocity field v decays to zero as |x|−1 , and grad v
decays as |x|−2 . Moreover, the corresponding pressure field p and grad p behave
as |x|−2 and |x|−3 , respectively. It is interesting to observe that these are exactly
the same asymptotics of the linear Stokes problem [11, Theorem V.3.2], obtained
by setting ω = 0 and by disregarding the nonlinear terms in the relevant Navier–
Stokes equations (see (2.1)). By a standard argument, it follows that our solutions
satisfy the energy equation (see (2.3)). From the work of Borchers [4], it also
follows that they are nonlinearly asymptotically stable in the sense of Liapunov,
in suitable norms.
The method we use to show the above results is based on obtaining the estimates for solutions to the linear problem as the limit, as time goes to infinity, of analogous estimates proved for solutions of the corresponding initial value problem. Actually, by means of a suitable transformation of coordinates, the latter goes into an initial value problem for the heat equation. So, ultimately, obtaining the estimates for solutions to the steady linearized problem reduces to finding the same estimates, uniformly in time, for solutions of an initial value problem for the heat equation. This is done in a relatively simple way, because the fundamental solution of the heat equation is much simpler to handle than the fundamental solution of the operator LTR.
Since the main mathematical difficulty comes from rotation, for simplicity of
argument, in the present paper we have assumed that the body just rotates, without
translating. However, the method we use is quite flexible and it can be extended to
cover more general cases. This will be the object of a future work.
Finally, it should be emphasized that, even though our analysis was motivated
by the problem of sedimentation, it has, of course, an independent interest and can
be applied to other significant physical problems, like evaluation of torques and
forces on B; see [16, 17] and the references cited therein (see also Section 6).
The paper is organized as follows. In Section 2 we formulate the problem and
state the main result. Section 3 is devoted to the study of a suitable linear problem
in the whole space. Using the results of Section 3, in Section 4 we show existence,
uniqueness and corresponding estimates for solutions to a linearized problem in
exterior domains. The results of Section 4 are employed in Section 5 where the
proof of the main result is presented. We end the paper with a final Section 6, that
includes, among other things, possible other applications of our result.
2. Formulation of the Problem and Main Result
We begin by introducing some notation. $\mathbb{R}^3$ is the Euclidean 3-dimensional space and $\{e_1, e_2, e_3\}$ is the associated canonical basis. For $a > 0$, $x \in \mathbb{R}^3$, we set $B_a(x) = \{y \in \mathbb{R}^3 : |y - x| < a\}$ and $B^a(x) = \{y \in \mathbb{R}^3 : |y - x| > a\}$. If $x = 0$, we shall simply write $B_a$ and $B^a$, respectively. If $A$ is a domain of $\mathbb{R}^3$, we denote by $\delta(A)$ its diameter. Moreover, we set $A_a = A \cap B_a$ and $A^a = A \cap B^a$. If $f$ is a scalar, vector or tensor function defined in $A$ and $k$ is a nonnegative integer, we set
\[
[|f|]_k = \operatorname*{ess\,sup}_{x \in A}\, (|x|^k + 1)\,|f(x)|,
\]
where $|\cdot|$ denotes absolute value or modulus, depending on whether $f$ is a scalar, vector or tensor field. If $A'$ is a subdomain of $A$ we shall write
\[
[|f|]_{k,A'} = \operatorname*{ess\,sup}_{x \in A'}\, (|x|^k + 1)\,|f(x)|.
\]
$L^q(A)$, $W^{m,q}(A)$, $W_0^{m,q}(A)$, $W_{loc}^{m,q}(\bar A)$, $m \ge 0$, $1 < q \le \infty$, denote the usual Lebesgue and Sobolev spaces [1]. Norms in $L^q(A)$⋆ and $W^{m,q}(A)$ are denoted by $\|\cdot\|_{q,A}$, $\|\cdot\|_{m,q,A}$. Unless confusion arises, in the above norms we shall drop the subscript “$A$”. The trace space on $\partial A$ for functions from $W^{m,q}(A)$ will be denoted by $W^{m-1/q,q}(\partial A)$ and its norm by $\|\cdot\|_{m-1/q,q,\partial A}$. By $D^{k,q}(A)$, $k \ge 1$, $1 < q < \infty$, we indicate the homogeneous Sobolev space of order $(k, q)$ on $A$ [11, 34], that is, the class of functions $u$ that are (Lebesgue) locally integrable in $A$ and with $D^\beta u \in L^q(A)$, $|\beta| = k$. Finally, given a Banach space $X$ and an open real interval $(a, b)$, we denote by $W^{m,q}(a, b; X)$ the linear space of (equivalence classes of) functions $f\colon (a, b) \to X$ whose $X$-norm is in $W^{m,q}(a, b)$.
Typically, we shall use the symbol $c$ to denote a constant whose numerical value or dependence on parameters is not essential to our aims. In such a case, $c$ may have several different values in a single computation. For example, we may have, in the same line, $2c \le c$.
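To make the weighted norm $[|f|]_k$ concrete, here is a small sampled illustration; it is only a sketch added for orientation, and the function names and the model field $f(x) = 1/(|x|^2 + 1)$ are hypothetical choices, not taken from the paper. A maximum over finitely many sample points gives only a lower bound for the essential supremum, but it shows the intended behavior: a field decaying like $|x|^{-2}$ has finite $[|f|]_2$, while its sampled $[|f|]_3$ grows with the radius of the sample.

```python
import math


def weighted_sup(f, k, points):
    """Sampled version of [|f|]_k = ess sup (|x|^k + 1) |f(x)|.

    Only a lower bound for the true norm, since it inspects finitely
    many points.
    """
    return max(
        (math.sqrt(sum(c * c for c in x)) ** k + 1.0) * abs(f(x)) for x in points
    )


def model_decay(x):
    """Hypothetical scalar field decaying like |x|^-2: 1 / (|x|^2 + 1)."""
    return 1.0 / (sum(c * c for c in x) + 1.0)


def ray(radius_max, n=200):
    """Sample points along the positive x1-axis out to radius_max."""
    return [(radius_max * i / n, 0.0, 0.0) for i in range(n + 1)]
```

With these definitions, `weighted_sup(model_decay, 2, ray(100.0))` stays at 1 up to rounding, whereas `weighted_sup(model_decay, 3, ray(R))` increases roughly linearly in `R`.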
In this paper we shall study the steady-state motions of a viscous fluid around a
rotating obstacle. Specifically, let B be a rigid body that uniformly rotates, with
constant angular velocity ω, in a viscous liquid L filling the entire space. We
⋆ Let X be any space of real functions. As a rule, we shall use the same symbol X to denote the
corresponding space of vector and tensor-valued functions.
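assume that L is described by the Navier–Stokes model, and that the motion of L as seen from a frame S attached to B is steady. Then, the relevant nondimensional equations, written with respect to S, are given by (see, e.g., [14])
\[
\begin{aligned}
&\left.\begin{aligned}
Re\,(v \cdot \operatorname{grad} v - \mu \times x \cdot \operatorname{grad} v + \mu \times v) &= \Delta v - \operatorname{grad} p\\
\operatorname{div} v &= 0
\end{aligned}\right\} \quad \text{in } \Omega,\\
&\lim_{|x| \to \infty} v(x) = 0,\\
&v(x) = \mu \times x, \quad x \in \partial\Omega.
\end{aligned} \tag{2.1}
\]
Here $Re = |\omega|d^2/\nu$ is the appropriate Reynolds number, $d$ is the diameter of B, $\nu$ is the kinematical viscosity of L, $\mu = \omega/|\omega|$, and $\Omega$ (the exterior of B) is the domain occupied by L.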
The main goal of this paper is to prove the following existence and uniqueness theorem for problem (2.1).
THEOREM 2.1. Let $\Omega$ be of class $C^2$, and let $R > \delta(B)$, $q > 1$. Then, there exists a constant $Re_0 > 0$ depending only on $\Omega$, $R$ and $q$, such that if $Re < Re_0$, problem (2.1) admits one and only one solution $v, p$ satisfying
\[
\begin{aligned}
&\|v\|_{2,q,\Omega_R} + \|D^2 v\|_2 + [|v|]_1 + [|\operatorname{grad} v|]_2 < \infty,\\
&[|p|]_2 + [|\operatorname{grad} p|]_{3,\Omega^R} < \infty.
\end{aligned} \tag{2.2}
\]
Moreover, $v, p \in C^\infty(\Omega)$.
REMARK 2.1. (i) For the sake of simplicity, we are assuming that the body force $b$ acting on the fluid is zero. However, the result of Theorem 2.1 can be easily extended to cover the case $b \ne 0$. For example, as is clear from the proof that we shall give, one can show that if $b = \operatorname{div} F$, with $F = \{F_{ij}\}$ a second-order tensor field satisfying the assumptions (i)–(iii) of Theorem 4.1, there exists one and only one corresponding solution $v, p$ in the class (2.2), provided $Re$ is suitably small. In addition, the solution satisfies the following estimate⋆
\[
\begin{aligned}
&\|v\|_{2,q,\Omega_R} + \|D^2 v\|_2 + [|v|]_1 + [|\operatorname{grad} v|]_2 + [|p|]_2 + [|\operatorname{grad} p|]_{3,\Omega^R}\\
&\qquad \le c\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + 1\bigr).
\end{aligned}
\]
The differentiability of this solution will depend, of course, on the degree of smoothness of $F$. If, in particular, $F \in C^\infty(\Omega)$, then $v, p \in C^\infty(\Omega)$.
(ii) Using the spatial asymptotic properties of solutions of Theorem 2.1, one can easily show that they satisfy the energy equation:⋆⋆
\[
\int_\Omega D(v) : D(v) = \mu \cdot \int_{\partial\Omega} x \times T(v, p) \cdot n, \tag{2.3}
\]
⋆ We adopt summation convention over repeated indices.
⋆⋆ Unless confusion may arise, we shall omit in the integrals the infinitesimal volume or surface
of integration.
where $D(v)$ is the stretching tensor, $T = 2D + pI$, $I$ is the identity tensor and $n$ is the unit outer normal to $\partial\Omega$. These solutions are (clearly) also unique, and their velocity field decays at large distances as $|x|^{-1}$. Therefore, they are physically reasonable in the sense of Finn [9].
(iii) From the work of Borchers [4] it follows that solutions of Theorem 2.1 are nonlinearly stable in the sense of Liapounov, for sufficiently small Reynolds number. In particular, every dynamical perturbation that is initially in $L^2(\Omega)$ decays to zero in suitable norms as $t \to \infty$.
The proof of Theorem 2.1 will be achieved through several steps, obtained in the following two sections.
3. A Linear Problem in R3
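LEMMA 3.1. Let $F = \{F_{ij}\}$ be a second-order tensor field in $\mathbb{R}^3$ such that
\[
[|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 < \infty.
\]
Moreover, let
\[
f \in W_0^{1,q}(B_\rho), \quad \text{some } \rho > 0 \text{ and } q > 3.
\]
Then, the problem
\[
\Delta u = \partial_j \partial_i F_{ij} + \operatorname{div} f \quad \text{in } \mathbb{R}^3 \tag{3.4}
\]
has one and only one solution such that
\[
\begin{aligned}
[|u|]_2 &\le c\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + \|f\|_{q,B_\rho}\bigr),\\
[|\operatorname{grad} u|]_3 &\le c\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_i \partial_j F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\bigr),\\
\|D^2 u\|_{s,\mathbb{R}^3} &\le c\,\bigl([|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho}\bigr), \quad \text{all } s \in (1, q],
\end{aligned} \tag{3.5}
\]
where $c$ is a positive constant.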
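Proof. Set $G = \partial_i F_{ij} e_j + f$. Then, by assumption, $G$ and $\operatorname{div} G$ belong to $L^s(\mathbb{R}^3)$ for all $s \in [1, q]$. Therefore, from well-known results, it follows that there exists one and only one solution $u$ to (3.4) such that
\[
\begin{aligned}
\|u\|_{s_1,\mathbb{R}^3} + \|\operatorname{grad} u\|_{s,\mathbb{R}^3} &\le c\,\bigl([|\partial_i F_{ij} e_j|]_3 + \|f\|_{q,B_\rho}\bigr),\\
\|D^2 u\|_{s,\mathbb{R}^3} &\le c\,\bigl([|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho}\bigr)
\end{aligned} \tag{3.6}
\]
for all $s \in (1, q]$, $s_1 > 3/2$. The second of these estimates is just the third inequality in (3.5). The Sobolev embedding theorem along with (3.6) implies that $u$ and $\operatorname{grad} u$ are essentially bounded on the ball $B_R$ of $\mathbb{R}^3$ of arbitrary finite radius $R > 0$, and that the following inequality holds⋆
\[
\|u\|_{\infty,B_R} + \|\operatorname{grad} u\|_{\infty,B_R} \le C_R\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho} + \|f\|_{q,B_\rho}\bigr). \tag{3.7}
\]
Furthermore, again from well-known results and from (3.6), we have that $u$ admits the following representation for all $x \in \mathbb{R}^3$⋆⋆
\[
u(x) = \int_{\mathbb{R}^3} E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy + \int_{\mathbb{R}^3} E(x - y)\operatorname{div} f(y)\,dy \equiv u_1(x) + u_2(x), \tag{3.8}
\]
where $E(\xi)$ is the fundamental solution to Laplace’s equation in dimension three. We recall that
\[
|D^\alpha E(\xi)| \le c\,|\xi|^{-1-|\alpha|}, \quad \text{for all } |\alpha| \ge 0 \text{ and for } \xi \ne 0. \tag{3.9}
\]
Integrating by parts, we obtain
\[
u_2(x) = -\int_{\mathbb{R}^3} \partial_i E(x - y)\,f_i(y)\,dy. \tag{3.10}
\]
Since
\[
x \in B^{2\rho},\ y \in B_\rho \implies |x - y| \ge \tfrac{1}{2}|x|, \tag{3.11}
\]
using (3.9) we get
\[
|x|^2\,|u_2(x)| \le c\,\|f\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.12}
\]
Differentiating (3.10) once, we obtain
\[
\partial_k u_2(x) = -\int_{\mathbb{R}^3} \partial_k \partial_i E(x - y)\,f_i(y)\,dy.
\]
Using (3.11) and (3.9) in this equation, we recover
\[
|x|^3\,|\operatorname{grad} u_2(x)| \le c\,\|f\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.13}
\]
We next estimate the first integral in (3.8). Taking into account the asymptotic properties of $E$ and of $F_{ij}$, it is easy to see that, for every fixed $x$, we can perform integration by parts to get
\[
u_1(x) = -\int_{\mathbb{R}^3} \partial_j E(x - y)\,\partial_i F_{ij}(y)\,dy, \quad \text{for all } x \in \mathbb{R}^3. \tag{3.14}
\]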
⋆ In fact, we can show that from (3.6) we have that u and grad u are essentially bounded in the
whole of R3 , but this is irrelevant for the rest of the proof.
⋆⋆ Notice that $\partial_j \partial_i F_{ij} \in L^q(\mathbb{R}^3)$ for all $q \ge 1$.

Since $F$ satisfies (i) and (ii), from Lemma 2.5 of [29] it then follows that
\[
|x|^2\,|u_1(x)| \le c\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3\bigr), \quad |x| > 1.
\]
This estimate, together with (3.12) and (3.7), in turn proves the first inequality in (3.5). In order to show the second inequality, we observe that
\[
\partial_k u_1(x) = \int_{\mathbb{R}^3} \partial_k E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy. \tag{3.15}
\]
To estimate the integral on the right-hand side of (3.15), we set $|x| = R > 2$, split $\mathbb{R}^3$ as $B_{R/2} \cup B^{R/2}$, and denote the corresponding contributions of the integral over the two regions by $I_1$ and $I_2$, respectively. We also set, for simplicity,
\[
N_0 = [|F|]_2, \quad N_1 = [|\partial_i F_{ij} e_j|]_3, \quad N_2 = [|\partial_j \partial_i F_{ij}|]_4.
\]
By a double integration by parts, we have
\[
\begin{aligned}
I_1 = {}& \int_{B_{R/2}} \partial_i \partial_j \partial_k E(x - y)\,F_{ij}(y)\,dy - \int_{\partial B_{R/2}} \partial_j \partial_k E(x - y)\,F_{ij}(y)\,n_j(y)\,d\sigma_y\\
& + \int_{\partial B_{R/2}} \partial_k E(x - y)\,\partial_i F_{ij}(y)\,n_j(y)\,d\sigma_y.
\end{aligned}
\]
Taking into account that
\[
y \in B_{R/2} \implies |x - y| \ge \tfrac{1}{2}R, \qquad y \in \partial B_{R/2} \implies |y| = \tfrac{1}{2}R, \tag{3.16}
\]
by (3.9) and the assumptions (i) and (ii), we find
\[
|I_1| \le c\left(\frac{N_0}{R^4}\int_{B_{R/2}} \frac{dy}{1 + |y|^2} + \frac{N_0}{R^3} + \frac{N_1}{R^3}\right) \le c\,\frac{N_0 + N_1}{R^3}. \tag{3.17}
\]
Next, since
\[
y \in B^{R/2} \implies |y| \ge R/2, \quad |\partial_j \partial_i F_{ij}(y)| \le c\,N_2/|y|^4, \tag{3.18}
\]
by assumption (iii) we have
\[
|I_2| = \left|\int_{B^{R/2}} \partial_k E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy\right| \le c\,\frac{N_2}{R^2}\int_{\mathbb{R}^3} \frac{dy}{|x - y|^2\,|y|^2} \le c\,\frac{N_2}{R^3}, \tag{3.19}
\]
where, in the last inequality, we have used a classical estimate on weakly singular integrals (see, e.g., [11, Lemma II.7.2]). From (3.17) and (3.19) we then conclude
\[
|x|^3\,|\operatorname{grad} u_1(x)| \le c\,(N_0 + N_1 + N_2), \quad |x| > 2,
\]
which together with (3.13) and (3.7) proves the second estimate in (3.5). The proof of the lemma is thus accomplished. ✷
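LEMMA 3.2. Let $F$ and $f$ be tensor and vector fields, respectively, satisfying the assumptions of Lemma 3.1. Then, the problem
\[
\left.\begin{aligned}
\Delta u + Re\,(\mu \times x \cdot \operatorname{grad} u - \mu \times u) &= \operatorname{grad} \phi + \partial_i F_{ij} e_j + f\\
\operatorname{div} u &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3 \tag{3.20}
\]
has one and only one solution such that
\[
\begin{aligned}
&u \in W_{loc}^{2,2}(\mathbb{R}^3) \cap D^{2,2}(\mathbb{R}^3) \cap D^{1,2}(\mathbb{R}^3) \cap L^6(\mathbb{R}^3) \cap L^\infty(\mathbb{R}^3),\\
&\phi \in W^{1,r}(\mathbb{R}^3) \cap D^{2,s}(\mathbb{R}^3), \quad \text{all } q \ge s > 1,\ r > 3/2,\\
&[|\phi|]_2 + [|\operatorname{grad} \phi|]_3 < \infty.
\end{aligned} \tag{3.21}
\]
Moreover, the following estimate holds
\[
\begin{aligned}
&\|u\|_\infty + \|u\|_6 + \|\operatorname{grad} u\|_2 + \|D^2 u\|_2 + [|\phi|]_2 + [|\operatorname{grad} \phi|]_3 + \|D^2 \phi\|_s\\
&\qquad \le c\,\bigl([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\bigr),
\end{aligned}
\]
where $c$ is a positive constant depending only on $q$ and $s$.
Proof. The existence of the solution $u$ satisfying the stated properties can be found in [18] and [14]. Moreover, again by the work of these authors, we have that, in particular, the corresponding pressure $\phi$ belongs to $L^6(\mathbb{R}^3)$. We now apply the operator “div” to both sides of (3.20)$_1$. Since
\[
\operatorname{div}(-\mu \times x \cdot \operatorname{grad} u + \mu \times u) = -\mu \times x \cdot \operatorname{grad}(\operatorname{div} u), \tag{3.22}
\]
we find
\[
\Delta \phi = \partial_j \partial_i F_{ij} + \operatorname{div} f.
\]
Thus, the properties of $\phi$ follow from Lemma 3.1 and from a classical uniqueness theorem in the Lebesgue class $L^q$ for the Poisson equation in the whole space. ✷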
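Identity (3.22) can be checked directly in index notation. The following short derivation is a sketch added for the reader: it writes $a = \mu \times x$, so that $a_j = \varepsilon_{jkl}\mu_k x_l$ with $\varepsilon$ the permutation symbol (a notation not used in the original), reads $(a \cdot \operatorname{grad} u)_i$ as $a_j \partial_j u_i$, and assumes $u$ is smooth enough to be differentiated twice.

```latex
\begin{aligned}
\operatorname{div}(a\cdot\operatorname{grad}u)
  &= \partial_i\,(a_j\,\partial_j u_i)
   = (\partial_i a_j)\,\partial_j u_i + a_j\,\partial_j(\partial_i u_i)
   = \varepsilon_{jki}\,\mu_k\,\partial_j u_i
     + a\cdot\operatorname{grad}(\operatorname{div}u),\\
\operatorname{div}(\mu\times u)
  &= \partial_i\,(\varepsilon_{ikl}\,\mu_k\,u_l)
   = \varepsilon_{ikl}\,\mu_k\,\partial_i u_l
   = \varepsilon_{jki}\,\mu_k\,\partial_j u_i,\\
\operatorname{div}\bigl(-a\cdot\operatorname{grad}u + \mu\times u\bigr)
  &= -\,a\cdot\operatorname{grad}(\operatorname{div}u).
\end{aligned}
```

The two terms containing $\varepsilon_{jki}\mu_k\partial_j u_i$ cancel, so the rotational contribution to the divergence of (3.20)$_1$ involves only $\operatorname{div} u$ and drops out for solenoidal $u$.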
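LEMMA 3.3. Let $G(x, t) = \{G_{ij}(x, t)\}$ be a second-order tensor field in $\mathbb{R}^3 \times (0, \infty)$ such that
\[
\operatorname*{ess\,sup}_{t \ge 0}\,\bigl([|G|]_2 + [|\partial_i G_{ij} e_j|]_3\bigr) < \infty.
\]
Moreover, let $g$ be a function of bounded support contained in $B_\rho$, for some $\rho > 0$, and such that
\[
g \in L^\infty(0, \infty; L^q(B_\rho)), \quad \text{for some } q > 3.
\]
Then, the Cauchy problem
\[
\left.\begin{aligned}
\frac{\partial w}{\partial t} &= \Delta w + \partial_i G_{ij} e_j + g\\
w(x, 0) &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3, \tag{3.23}
\]
has one and only one solution such that
\[
w \in W^{1,2}(0, T; L^2(\mathbb{R}^3)) \cap L^2(0, T; W^{2,2}(\mathbb{R}^3)), \quad \text{all } T > 0. \tag{3.24}
\]
Furthermore, this solution satisfies the following estimate
\[
\operatorname*{ess\,sup}_{t \ge 0}\,\bigl([|w|]_1 + [|\operatorname{grad} w|]_2\bigr) \le c\,\operatorname*{ess\,sup}_{t \ge 0}\,\bigl([|G|]_2 + [|\partial_i G_{ij} e_j|]_3 + \|g\|_{q,B_\rho}\bigr). \tag{3.25}
\]
Proof. The existence of a unique solution in the class (3.24) is well-known; see, e.g., [23]. In order to show the other properties of $w$, we shall make use of the volume heat potential representation:
\[
w_j(x, t) = \int_0^t \int_{\mathbb{R}^3} H(x - y, s)\,\bigl[\partial_i G_{ij}(y, t - s) + g_j(y, t - s)\bigr]\,dy\,ds \equiv w_j^{(1)} + w_j^{(2)}, \tag{3.26}
\]
where
\[
H(z, s) = \frac{1}{(4\pi s)^{3/2}} \exp\left(-\frac{|z|^2}{4s}\right), \quad s, |z| > 0.
\]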
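In the sequel, we shall employ many times the following elementary inequality:
\[
s^{-k} \exp\left(-\frac{|z|^2}{4s}\right) \le \frac{c}{|z|^{2k}}, \quad k \ge 0, \tag{3.27}
\]
where $c$ is a positive constant independent of $z$ and $s$. Let us first consider the function $w^{(2)}$. Using (3.27) in conjunction with the Hölder inequality, we find for any $r, p \in [1, q]$ and $t \ge 1$
\[
\begin{aligned}
|w^{(2)}| &\le c\,\Bigl\{\operatorname*{ess\,sup}_{t \ge 0} \|g\|_{r,B_\rho} \int_0^1 s^{-3/2} \Bigl(\int_{B_\rho} e^{-r'|x-y|^2/4s}\,dy\Bigr)^{1/r'} ds\\
&\qquad + \operatorname*{ess\,sup}_{t \ge 0} \|g\|_{p,B_\rho} \int_1^t s^{-3/2} \Bigl(\int_{B_\rho} e^{-p'|x-y|^2/4s}\,dy\Bigr)^{1/p'} ds\Bigr\}\\
&\equiv c\,\Bigl\{\operatorname*{ess\,sup}_{t \ge 0} \|g\|_{r,B_\rho}\,I_1 + \operatorname*{ess\,sup}_{t \ge 0} \|g\|_{p,B_\rho}\,I_2\Bigr\}\\
&\le c\,\operatorname*{ess\,sup}_{t \ge 0} \|g\|_{q,B_\rho}\,(I_1 + I_2),
\end{aligned} \tag{3.28}
\]
where $r'$ and $p'$ are conjugate exponents to $r$ and $p$, respectively. Without loss, we shall assume throughout $\rho > 1$. Noticing that
\[
x \in B^{2\rho},\ y \in B_\rho \implies |x - y| \ge \tfrac{1}{2}|x|, \tag{3.29}
\]
with the help of (3.27), for any $\beta \in (0, 1]$ we find
\[
I_1 \le c \int_0^1 s^{-1+\beta} \Bigl(\int_{B_\rho} \frac{dy}{|x - y|^{(1+2\beta)r'}}\Bigr)^{1/r'} ds \le \frac{c}{|x|^{1+2\beta}} \le \frac{c}{|x|}, \quad |x| \ge 2\rho. \tag{3.30}
\]
Furthermore, using again (3.27) and (3.29), we obtain
\[
I_2 \le c \int_1^t s^{-3/2} \Bigl(\int_{B_\rho} e^{-p'|x|^2/16s}\,dy\Bigr)^{1/p'} ds \le c \int_1^t s^{-3/2} e^{-|x|^2/16s}\,ds \le \frac{c}{|x|}, \quad |x| \ge 2\rho. \tag{3.31}
\]
From (3.28)–(3.31) we then deduce
\[
|x|\,|w^{(2)}(x, t)| \le c\,\operatorname*{ess\,sup}_{t \ge 0} \|g\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.32}
\]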
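The elementary inequality (3.27), used repeatedly in these estimates, follows by maximizing $s^{-k}e^{-a/s}$ over $s > 0$ with $a = |z|^2/4$: the maximum is attained at $s = a/k$, which gives the admissible constant $c = (4k/e)^k$ (and $c = 1$ for $k = 0$). This explicit constant is our own remark, not stated in the paper. The sketch below, with hypothetical function names, checks it numerically on a logarithmic grid of $s$ values.

```python
import math


def best_constant(k):
    """sup over s > 0 of s^-k exp(-|z|^2/(4s)) |z|^(2k), namely (4k/e)^k."""
    return 1.0 if k == 0 else (4.0 * k / math.e) ** k


def scaled_lhs(s, z_norm, k):
    """Left side of (3.27) multiplied by |z|^(2k)."""
    return s ** (-k) * math.exp(-z_norm**2 / (4.0 * s)) * z_norm ** (2 * k)


def max_over_grid(z_norm, k, n=2000):
    """Maximum of scaled_lhs over a log-spaced grid s in [1e-4, 1e4]."""
    grid = (10.0 ** (-4.0 + 8.0 * i / n) for i in range(n + 1))
    return max(scaled_lhs(s, z_norm, k) for s in grid)
```

On such a grid the sampled maximum never exceeds `best_constant(k)` and comes close to it, which is consistent with the constant being sharp.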
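We next show an estimate valid for all $|x| \ge 0$. We have
\[
\begin{aligned}
I_1 &\le c \int_0^1 s^{-3/2} \Bigl(\int_{B_{4\rho}(x)} e^{-r'|x-y|^2/4s}\,dy\Bigr)^{1/r'} ds\\
&\le c \int_0^1 s^{-3/2} \Bigl(\int_0^{4\rho} \sigma^2 e^{-r'\sigma^2/4s}\,d\sigma\Bigr)^{1/r'} ds\\
&\le c \int_0^1 s^{-3(1-1/r')/2}\,ds.
\end{aligned}
\]
Thus, choosing $r > 3/2$, we conclude
\[
I_1 \le c, \quad |x| \ge 0. \tag{3.33}
\]
In a completely analogous fashion, we find
\[
I_2 \le c \int_1^t s^{-3(1-1/p')/2}\,ds,
\]
and so, choosing $p < 3/2$, we also have
\[
I_2 \le c, \quad |x| \ge 0. \tag{3.34}
\]
If $t \le 1$, $|w^{(2)}(x, t)|$ is bounded by $I_1$, and so, collecting (3.32)–(3.34), we obtain
\[
[|w^{(2)}|]_1 \le c\,\operatorname*{ess\,sup}_{t \ge 0} \|g\|_q. \tag{3.35}
\]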
Our next step is to estimate the first spatial derivative of w(2) . Taking the partial
derivative of w(2) with respect to xk and proceeding as before, we find for any
r, p ∈ [1, q] and t 1
1/r ′
1
−5/2
(2)
r ′ −r ′ |x−y|2 /4s
|∂k w | c ess sup gr,Bρ s
|x − y| e
dy
ds
t 0
+ ess sup gp,Bρ
t 0
0
1
Bρ
t
s
−5/2
p ′ −p ′ |x−y|2 /4s
Bρ
|x − y| e
2
1
≡ c ess sup gr,Bρ I3 + ess sup gp,Bρ I4
t 0
ds
$
t 0
c ess sup gq,Bρ (I3 + I4 ).
t 0
dy
1/p′
(3.36)
448
GIOVANNI P. GALDI
With the help of (3.27) and (3.29), for any β ∈ (0, 1], we find
I3 c
1
s
−1+β
0
c
,
|x|2
dy
Bρ
(2+2β)r ′
|x − y|
dy
1/r ′
ds
c
2+2β
|x|
|x| 2ρ.
(3.37)
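Furthermore, using again (3.29), and observing that
\[
y\in B_\rho \;\Longrightarrow\; |x-y|\le\tfrac32|x|,\qquad |x|\ge 2\rho,
\]
we obtain
\[
I_4\le c\int_1^t s^{-5/2}\Big(\int_{B_\rho}|x-y|^{p'}e^{-p'|x-y|^2/4s}\,dy\Big)^{1/p'}ds
\le c\,|x|\int_1^t s^{-5/2}e^{-|x|^2/16s}\,ds\le c\,|x|^{-2},\qquad |x|\ge 2\rho. \tag{3.38}
\]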
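We next show an estimate valid for all |x| ≥ 0. Choosing r′ = q′, we have
\[
\begin{aligned}
I_3&\le c\int_0^1 s^{-5/2}\Big(\int_{B_{4\rho}(x)}|x-y|^{q'}e^{-q'|x-y|^2/4s}\,dy\Big)^{1/q'}ds\\
&\le c\int_0^1 s^{-5/2}\Big(\int_0^{4\rho}\sigma^{2+q'}e^{-q'\sigma^2/4s}\,d\sigma\Big)^{1/q'}ds\\
&\le c\int_0^1 s^{-2+3/(2q')}\,ds.
\end{aligned}
\]
Thus, since q > 3, we conclude
\[
I_3\le c,\qquad |x|\ge 0. \tag{3.39}
\]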
In a completely analogous way, we find
\[
I_4\le c\int_1^t s^{-2+3/(2p')}\,ds,
\]
and so, choosing p < 3 we also have
\[
I_4\le c,\qquad |x|\ge 0. \tag{3.40}
\]
If t ≤ 1, |∂_k w^{(2)}(x, t)| is bounded by I_3 and so, from (3.36)–(3.40) we deduce
\[
[|\operatorname{grad} w^{(2)}|]_2\le c\,\operatorname*{ess\,sup}_{t\ge 0}\|g\|_q. \tag{3.41}
\]
It remains to estimate the integral w^{(1)} in (3.26). For simplicity, we introduce the following notation:
\[
N_0=\operatorname*{ess\,sup}_{t\ge 0}\,[|G|]_2,\qquad
N_1=\operatorname*{ess\,sup}_{t\ge 0}\,[|\partial_iG_{ij}e_j|]_3 .
\]
As shown in the estimates for w^{(2)}(x, t), we may take, without loss, t ≥ 1. Integrating by parts, and using the assumption (i) for G, we find, for any α ∈ [0, 2],
\[
\begin{aligned}
|w_j^{(1)}| &=\Big|\frac{2}{(4\pi)^{3/2}}\int_0^t s^{-5/2}\int_{\mathbb{R}^3}(x_i-y_i)e^{-|x-y|^2/4s}G_{ij}(y,t-s)\,dy\,ds\Big|\\
&\le c\,N_0\Big[\int_0^1 s^{-5/2}\int_{\mathbb{R}^3}\frac{|x-y|e^{-|x-y|^2/4s}}{|y|^{\alpha}}\,dy\,ds
+\int_1^t s^{-5/2}\int_{\mathbb{R}^3}\frac{|x-y|e^{-|x-y|^2/4s}}{|y|^{2}}\,dy\,ds\Big]\\
&\equiv c\,N_0\,(I_1+I_2).
\end{aligned}
\tag{3.42}
\]
Using (3.27), for all β ∈ (0, 1] we find
\[
I_1\le c\int_0^1 s^{-1+\beta}\int_{\mathbb{R}^3}\frac{dy}{|x-y|^{2+2\beta}|y|^{\alpha}}\,ds,
\]
and so, by a classical estimate (see, e.g., [11, Lemma II.7.2]) we obtain
\[
I_1\le c\,|x|^{-2\beta-\alpha+1}.
\]
Therefore, choosing β = 1 − α/2, α < 2, we conclude
\[
I_1\le c\,|x|^{-1},\qquad |x|>0. \tag{3.43}
\]
In order to estimate I_2, we notice that
\[
I_2\le c\int_{\mathbb{R}^3}\frac{|x-y|}{|y|^2}\int_1^t\frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\,dy,
\]
and so, performing in the time integral the change of variable η = |x − y|²/4s, it follows that
\[
I_2\le c\int_{\mathbb{R}^3}\frac{dy}{|x-y|^2|y|^2}\int_0^\infty\eta^{1/2}e^{-\eta}\,d\eta
\le c\int_{\mathbb{R}^3}\frac{dy}{|x-y|^2|y|^2}.
\]
Employing again Lemma II.7.2 in [11], we obtain
\[
I_2\le c\,|x|^{-1},\qquad |x|>0. \tag{3.44}
\]
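The change of variable η = |x − y|²/4s used for I_2 can be checked numerically. The sketch below is an illustration added here, not part of the original argument: the function names and quadrature parameters are hypothetical choices. Writing a = |x − y|, it compares the full time integral of s^{-5/2} e^{-a²/4s} over (0, ∞) with the closed form 8Γ(3/2)/a³ = 4√π/a³ that the substitution produces.

```python
import math


def time_integral(a, u_min=-30.0, u_max=40.0, n=70000):
    """Approximate the integral of s**(-5/2) * exp(-a**2 / (4*s)) over s in (0, inf).

    With s = exp(u) the integrand becomes exp(-1.5*u - a**2 * exp(-u) / 4),
    which is smooth and decays quickly at both ends, so a plain trapezoidal
    rule on [u_min, u_max] is enough for moderate values of a.
    """
    h = (u_max - u_min) / n
    total = 0.0
    for k in range(n + 1):
        u = u_min + k * h
        value = math.exp(-1.5 * u - 0.25 * a * a * math.exp(-u))
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * value
    return total * h


def closed_form(a):
    """Value given by eta = a**2 / (4*s): 8 * Gamma(3/2) / a**3 = 4*sqrt(pi) / a**3."""
    return 8.0 * math.gamma(1.5) / a ** 3


if __name__ == "__main__":
    for a in (0.5, 1.0, 3.0):
        print(a, time_integral(a), closed_form(a))
```

Since the integral over [1, t] is bounded by the integral over (0, ∞), the factor |x − y| · |x − y|^{-3} = |x − y|^{-2} is what remains in the spatial integral for I_2.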
Equations (3.42), (3.43) and (3.44) imply
\[
|w^{(1)}(x,t)|\le c\,N_0\,|x|^{-1},\qquad |x|>0. \tag{3.45}
\]
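We wish now to show that w^{(1)} is uniformly bounded for all x. Applying the Hölder inequality in the first integral in (3.42), we obtain
\[
\begin{aligned}
|w^{(1)}| &\le c\Big[\operatorname*{ess\,sup}_{t\ge 0}\|G\|_r\int_0^1 s^{-5/2}\Big(\int_{\mathbb{R}^3}|x-y|^{r'}e^{-r'|x-y|^2/4s}\,dy\Big)^{1/r'}ds\\
&\qquad+\operatorname*{ess\,sup}_{t\ge 0}\|G\|_q\int_1^t s^{-5/2}\Big(\int_{\mathbb{R}^3}|x-y|^{q'}e^{-q'|x-y|^2/4s}\,dy\Big)^{1/q'}ds\Big]
\end{aligned}
\]
for any 3/2 < r, q ≤ ∞. Since
\[
\operatorname*{ess\,sup}_{t\ge 0}\|G\|_p\le c\,N_0,\qquad\text{all } p>3/2,
\]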
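the preceding inequality implies
\[
\begin{aligned}
|w^{(1)}| &\le c\,N_0\Big[\int_0^1 s^{-5/2}\Big(\int_{\mathbb{R}^3}|x-y|^{r'}e^{-r'|x-y|^2/4s}\,dy\Big)^{1/r'}ds\\
&\qquad+\int_1^t s^{-5/2}\Big(\int_{\mathbb{R}^3}|x-y|^{q'}e^{-q'|x-y|^2/4s}\,dy\Big)^{1/q'}ds\Big]\\
&\equiv c\,N_0\,(I_3+I_4).
\end{aligned}
\]
Performing the change of variable σ = |x − y|/√(4s) we find
\[
I_3\le c\int_0^1 s^{-2+3/(2r')}\,ds,
\]
and so, choosing r > 3 (so that 3/(2r′) > 1) we get
\[
I_3\le c.
\]
By the same token, we obtain
\[
I_4\le c\int_1^t s^{-2+3/(2q')}\,ds,
\]
and so, choosing this time 3/2 < q < 3 (so that 3/(2q′) < 1) we get
\[
I_4\le c.
\]
As a consequence, we deduce
\[
|w^{(1)}(x,t)|\le c\,N_0,\qquad |x|\ge 0. \tag{3.46}
\]
From (3.45) and (3.46) we conclude
\[
[|w^{(1)}|]_1\le c\,N_0. \tag{3.47}
\]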
It remains to estimate the spatial derivatives of w^{(1)}. To this end, we notice that from (3.26) we have
\[
\partial_k w_j^{(1)}=\frac{2}{(4\pi)^{3/2}}\int_0^t s^{-5/2}\int_{\mathbb{R}^3}(x_k-y_k)e^{-|x-y|^2/4s}\,\partial_iG_{ij}(y,t-s)\,dy\,ds. \tag{3.48}
\]
In order to achieve our goal, we set |x| = R > 2, and, as in the proof of Lemma 3.1, split again R³ as $B_{R/2}\cup B^{R/2}$. The contributions to the integral in (3.48) over the two subdomains will be denoted by I_1(t) and I_2(t), respectively. Moreover, for each I_i(t), we split the interval [0, t] into the two intervals [0, 1] and [1, t], and denote the corresponding integrals by I_i(0, 1) and I_i(1, t), i = 1, 2, according to whether we are integrating over [0, 1] or [1, t]. For instance, we have
\[
\begin{aligned}
I_1(t)&\equiv\frac{2}{(4\pi)^{3/2}}\int_0^t s^{-5/2}\int_{B_{R/2}}(x_k-y_k)e^{-|x-y|^2/4s}\,\partial_iG_{ij}(y,t-s)\,dy\,ds\\
&=\frac{2}{(4\pi)^{3/2}}\int_0^1 s^{-5/2}\int_{B_{R/2}}(x_k-y_k)e^{-|x-y|^2/4s}\,\partial_iG_{ij}(y,t-s)\,dy\,ds\\
&\quad+\frac{2}{(4\pi)^{3/2}}\int_1^t s^{-5/2}\int_{B_{R/2}}(x_k-y_k)e^{-|x-y|^2/4s}\,\partial_iG_{ij}(y,t-s)\,dy\,ds\\
&\equiv I_1(0,1)+I_1(1,t),
\end{aligned}
\]
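etc. Integrating by parts, we find
\[
\begin{aligned}
\int_{B_{R/2}}(x_k-y_k)e^{-|x-y|^2/4s}\,\partial_iG_{ij}\,dy
&=\int_{B_{R/2}}\Big[\delta_{ik}-\frac{(x_k-y_k)(x_i-y_i)}{2s}\Big]e^{-|x-y|^2/4s}G_{ij}\,dy\\
&\quad+\int_{\partial B_{R/2}}(x_k-y_k)e^{-|x-y|^2/4s}G_{ij}n_i\,d\sigma_y\\
&\equiv i_1(s)+i_2(s)+i_3(s).
\end{aligned}
\]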
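Using the assumption (i) on G, the first condition in (3.16) and (3.27), for any ε ∈ (0, 1] we have
\[
\begin{aligned}
\int_0^1 s^{-5/2}|i_1(s)|\,ds
&\le c\,N_0\int_0^1 s^{-1+\varepsilon}\int_{B_{R/2}}\frac{e^{-|x-y|^2/4s}}{s^{3/2+\varepsilon}}\,\frac{dy}{|y|^2+1}\,ds\\
&\le c\,N_0\int_{B_{R/2}}\frac{dy}{|x-y|^{3+2\varepsilon}(|y|^2+1)}\int_0^1\frac{ds}{s^{1-\varepsilon}}\\
&\le c\,\frac{N_0}{R^3}\int_{B_{R/2}}\frac{dy}{|y|^2+1}\le c\,\frac{N_0}{R^2}.
\end{aligned}
\tag{3.49}
\]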
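By a similar argument, we find
\[
\begin{aligned}
\int_0^1 s^{-5/2}|i_2(s)|\,ds
&\le c\,N_0\int_0^1 s^{-1+\varepsilon}\int_{B_{R/2}}\frac{|x-y|^2e^{-|x-y|^2/4s}}{s^{5/2+\varepsilon}}\,\frac{dy}{|y|^2+1}\,ds\\
&\le c\,N_0\int_{B_{R/2}}\frac{dy}{|x-y|^{3+2\varepsilon}(|y|^2+1)}\int_0^1\frac{ds}{s^{1-\varepsilon}}
\le c\,\frac{N_0}{R^2},
\end{aligned}
\tag{3.50}
\]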
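and, using this time also the second condition in (3.16),
\[
\begin{aligned}
\int_0^1 s^{-5/2}|i_3(s)|\,ds
&\le c\,\frac{N_0}{R^2}\int_0^1 s^{-1+\varepsilon}\int_{\partial B_{R/2}}\frac{|x-y|e^{-|x-y|^2/4s}}{s^{3/2+\varepsilon}}\,d\sigma_y\,ds\\
&\le c\,\frac{N_0}{R^2}\int_{\partial B_{R/2}}\frac{|x-y|}{|x-y|^{3+2\varepsilon}}\,d\sigma_y\int_0^1\frac{ds}{s^{1-\varepsilon}}
\le c\,\frac{N_0}{R^2}.
\end{aligned}
\tag{3.51}
\]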
From (3.49)–(3.51) we then conclude
\[
|I_1(0,1)|\le c\,\frac{N_0}{|x|^2},\qquad |x|>2. \tag{3.52}
\]
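We shall next estimate I_1(1, t). Using assumption (i) on G, we obtain
\[
\int_1^t s^{-5/2}|i_1(s)|\,ds\le c\,N_0\int_{B_{R/2}}\frac{1}{|y|^2+1}\int_1^t\frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\,dy,
\]
and so, setting η = |x − y|²/4s, and using the first condition in (3.16), it follows that
\[
\begin{aligned}
\int_1^t s^{-5/2}|i_1(s)|\,ds
&\le c\,N_0\int_{B_{R/2}}\frac{1}{|y|^2+1}\int_0^\infty\frac{\eta^{1/2}e^{-\eta}}{|x-y|^3}\,d\eta\,dy\\
&\le c\,\frac{N_0}{R^3}\int_{B_{R/2}}\frac{dy}{|y|^2+1}\le c\,\frac{N_0}{R^2}.
\end{aligned}
\tag{3.53}
\]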
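Likewise, we find
\[
\begin{aligned}
\int_1^t s^{-5/2}|i_2(s)|\,ds
&\le c\,N_0\int_{B_{R/2}}\frac{|x-y|^2}{|y|^2+1}\int_1^t\frac{e^{-|x-y|^2/4s}}{s^{7/2}}\,ds\,dy\\
&\le c\,N_0\int_{B_{R/2}}\frac{1}{|y|^2+1}\int_0^\infty\frac{\eta^{3/2}e^{-\eta}}{|x-y|^3}\,d\eta\,dy\\
&\le c\,\frac{N_0}{R^3}\int_{B_{R/2}}\frac{dy}{|y|^2+1}\le c\,\frac{N_0}{R^2}.
\end{aligned}
\tag{3.54}
\]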
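Moreover, using this time the second condition in (3.16), we obtain
\[
\begin{aligned}
\int_1^t s^{-5/2}|i_3(s)|\,ds
&\le c\,\frac{N_0}{R^2}\int_{\partial B_{R/2}}|x-y|\int_1^t\frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\,d\sigma_y\\
&\le c\,\frac{N_0}{R^2}\int_{\partial B_{R/2}}\frac{d\sigma_y}{|x-y|^2}\int_0^\infty\eta^{1/2}e^{-\eta}\,d\eta
\le c\,\frac{N_0}{R^2}.
\end{aligned}
\tag{3.55}
\]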
Collecting (3.52)–(3.55), we thus conclude
\[
|I_1(t)|\le c\,\frac{N_0}{|x|^2},\qquad |x|>2. \tag{3.56}
\]
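We now estimate I_2(t) = I_2(0, 1) + I_2(1, t). Using assumption (ii) on G, and recalling (3.18) and (3.27), for any ε ∈ (0, 1) we find
\[
\begin{aligned}
|I_2(0,1)|
&\le c\,\frac{N_1}{R^{1+2\varepsilon}}\int_0^1 s^{-1+\varepsilon}\int_{B^{R/2}}\frac{|x-y|\,e^{-|x-y|^2/4s}}{|y|^{2-2\varepsilon}\,s^{3/2+\varepsilon}}\,dy\,ds\\
&\le c\,\frac{N_1}{R^{1+2\varepsilon}}\int_0^1 s^{-1+\varepsilon}\,ds\int_{B^{R/2}}\frac{dy}{|x-y|^{2+2\varepsilon}|y|^{2-2\varepsilon}}\\
&\le c\,\frac{N_1}{R^{2+2\varepsilon}}\le c\,\frac{N_1}{R^2},
\end{aligned}
\tag{3.57}
\]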
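where, in the last step, we have used Lemma II.7 of [11]. Moreover, again by (3.18) and this lemma, it follows that
\[
\begin{aligned}
|I_2(1,t)|
&\le c\,\frac{N_1}{R}\int_{B^{R/2}}\frac{|x-y|}{|y|^2}\int_1^t\frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\,dy\\
&\le c\,\frac{N_1}{R}\int_{\mathbb{R}^3}\frac{dy}{|x-y|^2|y|^2}\int_0^\infty\eta^{1/2}e^{-\eta}\,d\eta
\le c\,\frac{N_1}{R^2}.
\end{aligned}
\tag{3.58}
\]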
Therefore, from (3.57), (3.58) we conclude
\[
|I_2(t)|\le c\,\frac{N_1}{|x|^2},\qquad |x|>2. \tag{3.59}
\]
Recalling that ∂_k w^{(1)}(x, t) = I_1(t) + I_2(t), from (3.56) and (3.59) we conclude
\[
|\operatorname{grad} w^{(1)}(x,t)|\le c\,\frac{N_0+N_1}{|x|^2},\qquad |x|>2. \tag{3.60}
\]
Finally, we wish to show an estimate for grad w(1) holding for all |x|. However, the
proof of this estimate is completely similar to the analogous estimate we proved for
w(1). The reason is that both w(1) and grad w(1) are expressed as the convolution
of a spatial derivative of the kernel H times a function G (say) which belongs
to Lq (R3 ), for all q > 3/2. (In fact, in view of assumption (ii) the function G
associated to grad w (1) belongs to Lq (R3 ) for all q > 1.) We may thus prove
\[
|\operatorname{grad} w^{(1)}(x,t)|\le c\,N_1,\qquad |x|\ge 0,
\]
which, in turn, along with (3.60), allows us to conclude
\[
[|\operatorname{grad} w^{(1)}|]_2\le c\,(N_0+N_1).
\]
The proof of the lemma is then completed.
✷
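LEMMA 3.4. Let F and f satisfy the assumptions of Lemma 3.1, and let φ be the pressure field in Lemma 3.2. Then the Cauchy problem
\[
\begin{aligned}
&\frac{\partial v}{\partial t}-\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)
=\Delta v-\operatorname{grad}\varphi-\partial_iF_{ij}e_j+f\quad\text{in }\mathbb{R}^3,\\
&v(x,0)=0
\end{aligned}
\tag{3.61}
\]
has one and only one solution such that
\[
\begin{aligned}
&v\in W^{1,2}(0,T;L^2(B_R))\cap L^2(0,T;W^{2,2}(\mathbb{R}^3)),\qquad\text{all } R,T>0,\\
&\operatorname*{ess\,sup}_{t\ge 0}\big([|v|]_1+[|\operatorname{grad}v|]_2\big)<\infty.
\end{aligned}
\tag{3.62}
\]
Moreover, the solution satisfies the following estimate
\[
\operatorname*{ess\,sup}_{t\ge 0}\big([|v|]_1+[|\operatorname{grad}v|]_2\big)
\le c\big([|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4+\|f\|_{q,B_\rho}+\|\operatorname{div}f\|_{q,B_\rho}\big),
\tag{3.63}
\]
where c depends only on q.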
Proof. Let Q = Q(t), t ≥ 0, be the uniquely determined family of proper orthogonal transformations parameterized with time, such that ("T" denotes transpose)
\[
\dot Q^T(t)\cdot Q(t)\cdot a=\mathrm{Re}\;a\times\mu\quad\text{for all } a\in\mathbb{R}^3,\ t\ge 0,\qquad Q(0)=I. \tag{3.64}
\]
It is well-known that the tensor field Q(t) is found by solving the following initial-value problem
\[
\dot Q=\mathrm{Re}\,Q\cdot M,\quad Q(0)=I,\qquad
M(\mu)=\begin{bmatrix}0&-\mu_3&\mu_2\\ \mu_3&0&-\mu_1\\ -\mu_2&\mu_1&0\end{bmatrix}.
\]
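We next introduce a new set of coordinates y related to x by
\[
y=Q(t)\cdot x. \tag{3.65}
\]
Also, set
\[
w(y,t)=Q(t)\cdot v(Q^T(t)\cdot y,t). \tag{3.66}
\]
Using (3.66), (3.64), and the identity $\dot Q^T(t)\cdot Q(t)=-Q^T(t)\cdot\dot Q(t)$, we find
\[
\begin{aligned}
\frac{\partial w}{\partial t}
&=Q(t)\cdot\Big[\frac{\partial v}{\partial t}\Big|_x+\big(\dot Q^T(t)\cdot Q(t)\cdot x\big)\cdot\operatorname{grad}v+Q^T(t)\cdot\dot Q(t)\cdot v\Big]\\
&=Q(t)\cdot\Big[\frac{\partial v}{\partial t}\Big|_x-\mathrm{Re}\,\big(\mu\times x\cdot\operatorname{grad}v-\mu\times v\big)\Big]
\end{aligned}
\tag{3.67}
\]
and
\[
\Delta_y w=Q(t)\cdot\Delta_x v. \tag{3.68}
\]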
Therefore, the Cauchy problem (3.61) can be equivalently rewritten as follows
\[
\begin{aligned}
&\frac{\partial w}{\partial t}=\Delta w+\partial_iG_{ij}e_j+g\quad\text{in }\mathbb{R}^3,\\
&w(y,0)=0,
\end{aligned}
\tag{3.69}
\]
where the second-order tensor field G and the vector field g are given by
\[
\begin{aligned}
G(y,t)&=Q(t)\cdot F(Q^T(t)\cdot y)\cdot Q(t)^T+\varphi(Q^T(t)\cdot y)\,I,\\
g(y,t)&=Q(t)\cdot f(Q^T(t)\cdot y).
\end{aligned}
\]
Clearly, |g(y, t)| = |f(x)|. Moreover,
\[
\begin{aligned}
|G(y,t)|&\le c\,(|F(x)|+|\varphi(x)|),\\
\sum_{j=1}^3|\partial_iG_{ij}(y,t)|&\le c\Big(\sum_{j=1}^3|\partial_i^xF_{ij}(x)|+|\operatorname{grad}_x\varphi(x)|\Big).
\end{aligned}
\]
Therefore, recalling the properties of φ shown in Lemma 3.2, we deduce that the fields G and g satisfy the assumptions of Lemma 3.3. Since we have also
\[
|w(y,t)|=|v(x,t)|,\qquad |\operatorname{grad}_yw(y,t)|=|\operatorname{grad}_xv(x,t)|
\]
and |y| = |x|, the proof of the lemma follows from Lemma 3.2 and Lemma 3.3. ✷
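The rotation Q(t) used in this proof can be written in closed form and its defining relations checked numerically. The sketch below is an illustration added here, not the author's construction: it assumes Q(t) = exp(Re t M(µ)), evaluated by Rodrigues' formula, and all helper names and sample values are hypothetical. It verifies that Q is orthogonal, that dQ/dt = Re Q·M, and that (dQ/dt)^T·Q·a = Re a × µ, as in (3.64).

```python
import math


def skew(mu):
    """Matrix M(mu) with M(mu) a = mu x a."""
    m1, m2, m3 = mu
    return [[0.0, -m3, m2],
            [m3, 0.0, -m1],
            [-m2, m1, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotation(t, mu, re):
    """Q(t) = exp(re * t * M(mu)) via Rodrigues' formula (mu must be nonzero)."""
    norm = math.sqrt(sum(c * c for c in mu))
    k = skew([c / norm for c in mu])       # skew matrix of the unit axis
    k2 = matmul(k, k)
    theta = re * t * norm                  # rotation angle at time t
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + c * k2[i][j]
             for j in range(3)] for i in range(3)]


def q_dot(t, mu, re, h=1e-6):
    """Central-difference approximation of dQ/dt."""
    qp, qm = rotation(t + h, mu, re), rotation(t - h, mu, re)
    return [[(qp[i][j] - qm[i][j]) / (2.0 * h) for j in range(3)]
            for i in range(3)]
```

Because Q(t) is a function of M alone, Q and M commute, which is why the same Q satisfies both dQ/dt = Re Q·M and dQ/dt = Re M·Q.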
LEMMA 3.5. The solution v to the Cauchy problem (3.61) given in Lemma 3.4 tends, as t → ∞, to the solution u of the steady problem (3.20) given in Lemma 3.2. Specifically,
\[
\lim_{t\to\infty}\|v(t)-u\|_{q,\mathbb{R}^3}=0\quad\text{for all } q>6,\qquad
\lim_{t\to\infty}\|\operatorname{grad}(v(t)-u)\|_{6,\mathbb{R}^3}=0.
\]
Proof. Set
U (y, t) = Q(t) · u(QT (t) · y),
where Q(t) and y are defined in (3.64) and (3.65), respectively, and
W (y, t) = w(y, t) − U (y, t).
Arguing as in the proof of Lemma 3.4 (see (3.67) and (3.68)), and taking into
account (3.20), (3.61) and (3.64)2 , we find that W (y, t) satisfies the following
Cauchy problem
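\[
\frac{\partial W}{\partial t}=\Delta W\quad\text{in }\mathbb{R}^3,\qquad W(y,0)=u(y).
\]
Since, obviously,
\[
|W|=|v-u|,\qquad |\operatorname{grad}_yW|=|\operatorname{grad}_x(v-u)|, \tag{3.70}
\]
from (3.22) and (3.62), we have
\[
W\in L^\infty(0,\infty;L^\infty(\mathbb{R}^3))\cap L^\infty(0,\infty;D^{1,2}(\mathbb{R}^3)).
\]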
Thus, using these properties along with the asymptotic properties in space of the
kernel H , in conjunction with the classical Green’s identity for the heat equation,
we obtain that W admits the following representation:
\[
W(y,t)=\int_{\mathbb{R}^3}H(y-z,t)\,u(z)\,dz. \tag{3.71}
\]
Therefore, from Young's theorem on convolutions we get
\[
\|W\|_{q,\mathbb{R}^3}\le c\,t^{-\frac32(1/6-1/q)}\|u\|_6,\qquad
\|\operatorname{grad}W\|_{6,\mathbb{R}^3}\le c\,t^{-1/2}\|u\|_6,\qquad q>6,\ t>0.
\]
The proof then follows from these latter displayed inequalities and from (3.70). ✷
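The decay exponents in the last display follow from Young's inequality and the scaling of the heat kernel. The computation below is a sketch added for the reader; it assumes that H is the standard heat kernel, H(x, t) = (4πt)^{-3/2} e^{-|x|²/4t}, a normalization not restated in this part of the text.

```latex
% Assumption: H(x,t) = (4\pi t)^{-3/2} e^{-|x|^2/4t} (standard heat kernel in R^3).
\[
\|H(t)\|_{r}=(4\pi t)^{-\frac32\left(1-\frac1r\right)}\,r^{-\frac{3}{2r}},
\qquad
\|\operatorname{grad}H(t)\|_{1}=c\,t^{-1/2}.
\]
% Young's inequality \|H*u\|_q \le \|H\|_r \|u\|_6 requires 1 + 1/q = 1/r + 1/6:
\[
1-\frac1r=\frac16-\frac1q
\quad\Longrightarrow\quad
\|W(t)\|_{q}\le\|H(t)\|_{r}\,\|u\|_{6}\le c\,t^{-\frac32\left(\frac16-\frac1q\right)}\|u\|_{6},
\qquad q>6,
\]
\[
\|\operatorname{grad}W(t)\|_{6}\le\|\operatorname{grad}H(t)\|_{1}\,\|u\|_{6}\le c\,t^{-1/2}\|u\|_{6}.
\]
```

Both right-hand sides tend to zero as t → ∞, which is the convergence stated in the lemma once (3.70) is used to pass from W back to v − u.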
We are now in a position to show the main result of this section.
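THEOREM 3.1. Let F and f satisfy the assumptions of Lemma 3.1. Then the problem (3.20) has at least one solution u, φ such that
\[
\begin{aligned}
&u\in W^{2,2}_{loc}(\mathbb{R}^3)\cap D^{2,2}(\mathbb{R}^3)\cap D^{1,2}(\mathbb{R}^3)\cap L^6(\mathbb{R}^3)\cap L^\infty(\mathbb{R}^3),\\
&[|u|]_1+[|\operatorname{grad}u|]_2<\infty,\\
&\varphi\in W^{1,r}(\mathbb{R}^3)\cap D^{2,s}(\mathbb{R}^3),\qquad\text{all } q\ge s>1,\ r>3/2,\\
&[|\varphi|]_2+[|\operatorname{grad}\varphi|]_3<\infty.
\end{aligned}
\tag{3.72}
\]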
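Moreover, the following estimate holds
\[
\begin{aligned}
&\|D^2u\|_2+[|u|]_1+[|\operatorname{grad}u|]_2+[|\varphi|]_2+[|\operatorname{grad}\varphi|]_3+\|D^2\varphi\|_s\\
&\qquad\le c\big([|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4+\|f\|_{q,B_\rho}+\|\operatorname{div}f\|_{q,B_\rho}\big).
\end{aligned}
\tag{3.73}
\]
Finally, if u_1, φ_1 is another solution to (3.20) with
\[
u_1\in W^{2,2}_{loc}(\mathbb{R}^3)\cap D^{1,2}(\mathbb{R}^3)\cap L^6(\mathbb{R}^3),\qquad
\varphi_1\in W^{1,2}_{loc}(\mathbb{R}^3),
\]
we have u ≡ u_1, φ ≡ φ_1 + const.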
Proof. In view of Lemma 3.1, for the existence proof we only have to show that
the solution u satisfies, in addition, the property
[|u|]1 + [|grad u|]2 < ∞,
together with the estimate given in (3.73). To this end, let v(x, tn ) be the solution
to the Cauchy problem (3.61) given in Lemma 3.4, evaluated along an increasing
sequence of times {tn } with tn → ∞. By Lemma 3.5, v(x, tn ) and grad v(x, tn )
converge strongly to u(x) and grad u(x), in Lq , q > 6, and L6 , respectively.
Therefore, we can select a subsequence, again denoted by {tn }, along which v(x, tn )
and grad v(x, tn ) converge pointwise to u(x) and grad u(x), for almost all x ∈ R3 .
By the triangle inequality and by (3.63) we then find
\[
\begin{aligned}
&|u(x)|(|x|+1)+|\operatorname{grad}u(x)|(|x|^2+1)\\
&\quad\le|v(x,t_n)|(|x|+1)+|\operatorname{grad}v(x,t_n)|(|x|^2+1)\\
&\qquad+|v(x,t_n)-u(x)|(|x|+1)+|\operatorname{grad}(v(x,t_n)-u(x))|(|x|^2+1)\\
&\quad\le c\big([|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4+\|f\|_{q,B_\rho}+\|\operatorname{div}f\|_{q,B_\rho}\big)\\
&\qquad+|\operatorname{grad}(v(x,t_n)-u(x))|(|x|^2+1)+|v(x,t_n)-u(x)|(|x|+1).
\end{aligned}
\]
Passing to the limit n → ∞ in this latter inequality furnishes the desired result. In order to show the uniqueness part, setting U = u − u_1, Φ = φ − φ_1, we have that U and Φ satisfy the following problem⋆
\[
\Delta U+\mu\times x\cdot\operatorname{grad}U-\mu\times U=\operatorname{grad}\Phi,\qquad
\operatorname{div}U=0\quad\text{in }\mathbb{R}^3. \tag{3.74}
\]
⋆ In the proof of uniqueness, the magnitude of Re is irrelevant. Therefore, for simplicity, we shall set Re = 1.
By classical results on elliptic regularity, we have that U and Φ are of class C^∞. Operating with "curl" on both sides of (3.74)_1, we find
\[
\Delta W+\mu\times x\cdot\operatorname{grad}W+\operatorname{grad}U\cdot\mu+\mu\cdot\operatorname{grad}U=0, \tag{3.75}
\]
where W = curl U. Let ψ_R = ψ_R(|x|), R > 0, be a real, nonnegative and nonincreasing function of |x| such that ψ_R(|x|) = 1 for |x| < R, ψ_R(|x|) = 0 for |x| > 2R and |Δψ_R(|x|)| ≤ M/R², for a constant M independent of R and x.
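Dot-multiplying (3.75) by ψ_R W, integrating by parts over R³ and observing that
\[
\operatorname{grad}\psi_R(|x|)\cdot\mu\times x=0\quad\text{for all } x\in\mathbb{R}^3, \tag{3.76}
\]
we find
\[
\int_{\mathbb{R}^3}\psi_R|\operatorname{grad}W|^2
=\frac12\int_{\mathbb{R}^3}\Delta\psi_R|W|^2
+\int_{\mathbb{R}^3}\psi_R\,(\mu\cdot\operatorname{grad}U+\operatorname{grad}U\cdot\mu)\cdot W .
\]
Recalling that grad U ∈ L²(R³), letting R → ∞ in this relation we thus deduce grad W ≡ grad curl U ∈ L²(R³). Since ΔU = − curl curl U, this implies
\[
\Delta U\in L^2(\mathbb{R}^3). \tag{3.77}
\]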
We now go back to (3.74). Since, by assumption, U ∈ D^{1,2}(R³) ∩ L^6(R³), we infer
\[
U/|x|\in L^2(B^r)\quad\text{for all } r>0; \tag{3.78}
\]
see [11, Theorem II.5.1]. Plugging this information back into (3.74) and using (3.77), we then obtain
\[
\operatorname{grad}\Phi/|x|\in L^2(B^r)\quad\text{for all } r>0. \tag{3.79}
\]
However, applying the operator "div" at both sides of (3.74)_1 and using (3.22) and (3.74)_2, we have that Φ is harmonic in the whole space. But grad Φ satisfies the asymptotic condition (3.79) and so, by well-known results, it follows that Φ = const. Equation (3.74) thus furnishes, in particular,
\[
-\Delta U-\mu\times x\cdot\operatorname{grad}U+\mu\times U=0\quad\text{in }\mathbb{R}^3. \tag{3.80}
\]
We now multiply this equation by ψ_R U and integrate again by parts on R³. Taking into account (3.76), we obtain
\[
\int_{\mathbb{R}^3}\psi_R|\operatorname{grad}U|^2=\frac12\int_{\mathbb{R}^3}\Delta\psi_R\,|U|^2 .
\]
Using the properties of ψ_R in this latter relation we obtain
\[
\int_{\mathbb{R}^3}\psi_R|\operatorname{grad}U|^2\le c\int_{B_{R,2R}}\frac{|U|^2}{|x|^2},
\]
and so, letting R → ∞, by (3.78) we conclude
\[
\int_{\mathbb{R}^3}|\operatorname{grad}U|^2=0\;\Longrightarrow\;U(x)=\text{const}.
\]
Since U ∈ L^6(R³), this gives, in turn, U ≡ 0, and the proof of the theorem is completed. ✷
4. A Linear Problem in Exterior Domains
The objective of this section is to prove existence, uniqueness and corresponding estimates of solutions to the following exterior problem
\[
\left.
\begin{aligned}
&\Delta v+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)=\operatorname{grad}p+\partial_iF_{ij}e_j\\
&\operatorname{div}v=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
v=v^*,\ x\in\partial\Omega, \tag{4.1}
\]
where Ω is an exterior domain of class C². We begin by proving the following.
LEMMA 4.1. Let Ω be a locally Lipschitzian, exterior domain. Assume that the second-order tensor field F = {F_ij} has components in L²(Ω), and that v* ∈ W^{1/2,2}(∂Ω), with
\[
\int_{\partial\Omega}v^*\cdot n=0, \tag{4.2}
\]
where n is the unit outer normal to ∂Ω. Then, problem (4.1) has at least one distributional solution v, p such that
\[
v\in D^{1,2}(\Omega),\qquad p\in L^2_{loc}(\overline\Omega).
\]
Proof. The proof of this result is quite standard and we shall sketch it here. First, we extend the boundary data to a solenoidal smooth function V in W^{1,2}(Ω) of bounded support (see [11, Chapter III]). Then we look for a solution to (4.1) of the form v = V + u, where u satisfies the following problem (in the sense of distributions)
\[
\left.
\begin{aligned}
&\Delta u+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}u-\mu\times u)=\operatorname{grad}p+\partial_iF_{ij}e_j+h\\
&\operatorname{div}u=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
u=0,\ x\in\partial\Omega, \tag{4.3}
\]
where h is a function of bounded support given by
\[
h=-\Delta V-\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}V-\mu\times V).
\]
Dot-multiplying (4.3)_1 by u, integrating by parts over Ω_R, letting R → ∞ and formally assuming that the surface integrals over ∂B_R go to zero, we obtain the following a priori estimate:
\[
\int_\Omega|\operatorname{grad}u|^2\le M,
\]
where M depends only on Ω, F, Re and v*. Using this bound and the classical Galerkin method, we can easily show the existence of a weak solution u ∈ D^{1,2}(Ω) to (4.3), with corresponding pressure field p ∈ L²_loc(Ω̄). For details, we refer the reader to [12, Chapter IX]. The proof of the lemma is completed. ✷
We also have
LEMMA 4.2. Let Ω be an exterior domain of class C², and let F and v* satisfy the assumptions of Lemma 4.1. Assume, further, that ∂_i F_ij e_j ∈ L^s(Ω), v* ∈ W^{2−1/s,s}(∂B), for all s > 1. Then, the solution v, p of Lemma 4.1 satisfies
\[
v\in W^{2,s}(\Omega_{r_1}),\qquad p\in W^{1,s}(\Omega_{r_1}),
\]
for all r_1 > δ(B). Moreover, the following estimate holds
\[
\|v\|_{2,s,\Omega_{r_1}}+\|p\|_{1,s,\Omega_{r_1}}
\le c\big(\|\partial_iF_{ij}e_j\|_{s,\Omega_r}+\|v^*\|_{2-1/s,s,\partial\Omega}+\|v\|_{s,\Omega_r}+\|p\|_{s,\Omega_r}\big)
\]
for all r > r_1, with c depending only on Ω, r_1, r, s and B, whenever Re ∈ [0, B].
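Proof. We may formally write (4.1) as a Stokes problem:
\[
\left.
\begin{aligned}
&\Delta v=\operatorname{grad}p+H\\
&\operatorname{div}v=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
v=v^*\ \text{at }\partial\Omega, \tag{4.4}
\]
where
\[
H=-\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)+\partial_iF_{ij}e_j .
\]
By assumption, we have that H ∈ L²_loc(Ω̄). From [11, Theorem IV.5.1], it then follows that v ∈ W^{2,2}_loc(Ω̄). By the Sobolev embedding theorem, we then have v ∈ W^{1,6}_loc(Ω̄), which implies that H ∈ L^6_loc(Ω̄). Again from [11, Theorem IV.5.1], we then infer v ∈ W^{2,6}_loc(Ω̄) which, in turn, by the Sobolev embedding theorem, gives v ∈ W^{1,∞}_loc(Ω̄). Thus, we conclude, in particular, that H ∈ L^s_loc(Ω̄) for all s > 1. We then use Theorem IV.5.1 in [11], to find that, on the one hand, v ∈ W^{2,s}_loc(Ω̄) and, on the other hand, that
\[
\|v\|_{2,s,\Omega_{r_1}}+\|\operatorname{grad}p\|_{s,\Omega_{r_1}}
\le c\big(\|H\|_{s,\Omega_\rho}+\|v^*\|_{2-1/s,s,\partial\Omega}+\|v\|_{1,s,\Omega_\rho}+\|p\|_{s,\Omega_\rho}\big)
\]
for arbitrary ρ > r_1. Recalling the form of H, we then obtain
\[
\|v\|_{2,s,\Omega_{r_1}}+\|\operatorname{grad}p\|_{s,\Omega_{r_1}}
\le c\big(\|\partial_iF_{ij}e_j\|_{s,\Omega_\rho}+\|v^*\|_{2-1/s,s,\partial\Omega}+(1+\mathrm{Re})\|v\|_{1,s,\Omega_\rho}+\|p\|_{s,\Omega_\rho}\big). \tag{4.5}
\]
Applying Theorem IV.5.3 of [11] to (4.4) we also find, in particular, for any r > ρ,
\[
\|v\|_{1,s,\Omega_\rho}\le c\big(\|H\|_{-1,s,\Omega_r}+\|v^*\|_{1-1/s,s,\partial\Omega}+\|v\|_{s,\Omega_r}+\|p\|_{s,\Omega_r}\big). \tag{4.6}
\]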
Since
\[
\|H\|_{-1,s,\Omega_r}\le c\,(1+\mathrm{Re})\|v\|_{s,\Omega_r}+\|\partial_iF_{ij}e_j\|_{s,\Omega_r}, \tag{4.7}
\]
the lemma follows from (4.5)–(4.7). ✷
The main result of this section is given in the following theorem.
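THEOREM 4.1. Let Ω be an exterior domain of class C² and let F = {F_ij} be a second-order tensor field on Ω such that
\[
[|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4<\infty .
\]
Assume, further, v* ∈ W^{2−1/q,q}(∂Ω) for all q > 1, satisfying (4.2). Then, problem (4.1) has one and only one solution v, p verifying
\[
\begin{aligned}
&v\in W^{2,q}_{loc}(\overline\Omega)\cap D^{2,2}(\Omega),\quad\text{all } q\ge 1,\qquad [|v|]_1+[|\operatorname{grad}v|]_2<\infty,\\
&p\in W^{1,q}_{loc}(\overline\Omega),\quad\text{all } q\ge 1,\qquad [|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^R}<\infty,\quad\text{all } R>\delta(B).
\end{aligned}
\tag{4.8}
\]
Moreover, the following estimate holds
\[
\begin{aligned}
&\|v\|_{2,q,\Omega_R}+\|D^2v\|_2+[|v|]_1+[|\operatorname{grad}v|]_2+[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^R}\\
&\qquad\le c\big([|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4+\|v^*\|_{2-1/q,q,\partial\Omega}\big),
\end{aligned}
\tag{4.9}
\]
where the constant c depends only on Ω, q, R and B, whenever Re ∈ [0, B].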
Proof. Uniqueness in the class defined by (4.8) is immediately obtained by a standard argument. Actually, consider (4.1) with F ≡ v* ≡ 0, and denote the resulting problem by (4.1)_0. Dot-multiplying the first equation in (4.1)_0 by v, integrating by parts over Ω_r, r > δ(B), and using the conditions div v = 0 and v|_{∂Ω} = 0, furnishes
\[
\int_{\Omega_r}|\operatorname{grad}v|^2=\int_{\partial B_r}(N\cdot\operatorname{grad}v\cdot v+p\,v\cdot N),\qquad N=x/|x| .
\]
Then uniqueness follows by letting r → ∞ in this relation and by using the asymptotic properties of v and p given in (4.8). We now show existence. Let v be the solution to problem (4.1) determined in Lemma 4.1. By assumption and Lemma 4.2 we have
\[
v\in D^{1,2}(\Omega)\cap W^{2,q}_{loc}(\overline\Omega),\qquad p\in W^{1,q}_{loc}(\overline\Omega),\qquad\text{all } q\ge 1. \tag{4.10}
\]
Let ϕ = 1 − ψ_R, R > δ(B), where ψ_R is the "cut-off" function introduced in Theorem 3.1, and set u = ϕv + w, φ = ϕp, where w satisfies the following problem
\[
\operatorname{div}w=-\operatorname{grad}\phi\cdot v\quad\text{in }\Omega_{2R},\qquad
w\in W_0^{3,q}(\Omega_{2R})\quad\text{for all } q>1, \tag{4.11}
\]
with $\phi$ denoting the function ϕ above. By virtue of (4.10) and well-known results on problem (4.11), e.g., [11, Section III.3], it follows that the field w does exist. By a direct calculation that uses (4.1), we find that u, φ satisfy problem (3.20) with F replaced by ϕF and
\[
f=-\Delta w-\mu\times x\cdot\operatorname{grad}w+\mu\times w
-\mu\times x\cdot\operatorname{grad}\phi\,v-\Delta\phi\,v-2\operatorname{grad}\phi\cdot\operatorname{grad}v-p\operatorname{grad}\phi-\partial_i\phi\,F_{ij}e_j .
\]
In view of (4.10) and of the assumptions on F we find that ϕF and f satisfy the hypotheses of Theorem 3.1. Therefore, according to that theorem, there exists at least one solution ū, φ̄ satisfying all conditions there stated. However, in view of (4.8), we have u ∈ D^{1,2}(R³) and so, again by Theorem 3.1, we conclude ū ≡ u and φ̄ ≡ φ.⋆ In view of (4.10) and of the Sobolev embedding theorem, we then conclude that v and p satisfy (4.8). It remains to show the validity of the estimate. To this end, using (4.10) and (4.11), and writing $\phi$ for the function ϕ, we find
\[
\begin{aligned}
&[|\phi F|]_2+[|\partial_i(\phi F_{ij})e_j|]_3+[|\partial_j\partial_i(\phi F_{ij})|]_4\le c\big([|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4\big),\\
&\|f\|_q+\|\operatorname{div}f\|_q\le c\big(\|v\|_{2,q,\Omega_{2R}}+\|\operatorname{grad}p\|_{q,\Omega_{2R}}+[|F|]_2+[|\partial_iF_{ij}e_j|]_3\big),
\end{aligned}
\tag{4.12}
\]
where c depends also on B. By Lemma 4.2, the second inequality delivers
\[
\|f\|_q+\|\operatorname{div}f\|_q\le c\big(\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}+[|F|]_2+[|\partial_iF_{ij}e_j|]_3\big), \tag{4.13}
\]
where ρ = 3R (say) and c is a constant depending only on R, q and B. Taking into account the first inequality in (4.12) and (4.13), and using (3.73) we obtain, in particular,
\[
\|D^2v\|_{2,\Omega^{2R}}+[|v|]_{1,\Omega^{2R}}+[|\operatorname{grad}v|]_{2,\Omega^{2R}}+[|p|]_{2,\Omega^{2R}}+[|\operatorname{grad}p|]_{3,\Omega^{2R}}
\le c\big(N+\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\big),
\]
where
\[
N\equiv[|F|]_2+[|\partial_iF_{ij}e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4 .
\]
Combining this latter inequality and the inequality in Lemma 4.2 with r_1 = 2R, r = ρ ≡ 3R, and using again the Sobolev embedding theorem, we conclude
\[
\begin{aligned}
&\|v\|_{2,q,\Omega_{2R}}+\|D^2v\|_2+[|v|]_1+[|\operatorname{grad}v|]_2
+\|\operatorname{grad}p\|_{q,\Omega_{2R}}+[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^{2R}}\\
&\qquad\le c\big(N+\|v^*\|_{2-1/q,q,\partial\Omega}+\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\big),
\end{aligned}
\tag{4.14}
\]
with a constant c depending only on Ω, R, q and B. To complete the proof of the theorem, it is enough to show that there is a constant c, again depending at most on Ω, R, q and B, such that
\[
\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\le c\big(N+\|v^*\|_{2-1/q,q,\partial\Omega}\big). \tag{4.15}
\]
Assume this inequality does not hold. Then, in view of the linearity of problem (4.1), we can find a sequence {F_n, v*_n, Re_n}, with Re_n ∈ [0, B] and a sequence of corresponding solutions {v_n, p_n}, such that
\[
\begin{aligned}
&[|F_n|]_2+[|\partial_iF_{nij}e_j|]_3+[|\partial_j\partial_iF_{nij}|]_4+\|v^*_n\|_{2-1/q,q,\partial\Omega}\le\frac1n,\\
&\|v_n\|_{q,\Omega_\rho}+\|p_n\|_{q,\Omega_\rho}=1.
\end{aligned}
\tag{4.16}
\]
⋆ Possibly redefining φ by the addition of a constant.
From (4.14), it follows that the sequence of solutions is bounded in the norm defined by the left-hand side of (4.14) and that, therefore, it converges, in a suitable topology, to a pair {v_0, p_0} which belongs to the class defined by (4.8). Since, in particular
\[
\|v_n\|_{1,q,\Omega_\rho}+\|p_n\|_{1,q,\Omega_\rho}\le M
\]
with M independent of n, by Rellich's theorem and by the second equation in (4.16) we infer
\[
\|v_0\|_{q,\Omega_\rho}+\|p_0\|_{q,\Omega_\rho}=1. \tag{4.17}
\]
Moreover, using (4.16), it is easy to show that v_0, p_0 is a solution of the following boundary-value problem
\[
\left.
\begin{aligned}
&\Delta v_0+\mathrm{Re}_0\,(\mu\times x\cdot\operatorname{grad}v_0-\mu\times v_0)=\operatorname{grad}p_0\\
&\operatorname{div}v_0=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
v_0=0\ \text{at }\partial\Omega, \tag{4.18}
\]
where Re_0 = lim_{n→∞} Re_n. However, v_0, p_0 satisfy (4.8) and, by the uniqueness property shown previously, we conclude v_0 = p_0 = 0,⋆ contradicting (4.17). This proves (4.15), and the proof of the theorem is completed. ✷
5. Proof of Theorem 1.1
We are now in a position to give a proof of our main result. The proof of existence
will be obtained by combining the results of Theorem 4.1 with a fixed point argument. To this end, for fixed R > δ(B) and q > 1 we introduce the following space
of functions:
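\[
X_{R,q}=\big\{\phi\in W^{1,2}_{loc}(\Omega):\ \operatorname{div}\phi=0\ \text{in }\Omega,\quad
\|\phi\|_{2,q,\Omega_R}+\|D^2\phi\|_2+[|\phi|]_1+[|\operatorname{grad}\phi|]_2<\infty\big\}.
\]
Clearly, X_{R,q} is a Banach space with the norm
\[
\|\phi\|_{X_{R,q}}\equiv\|\phi\|_{2,q,\Omega_R}+\|D^2\phi\|_2+[|\phi|]_1+[|\operatorname{grad}\phi|]_2 ,
\]
where $\phi$ stands for the generic element written ϕ in the text. Let us consider the map
\[
M:\ \phi\in X_{R,q}\to v,
\]
where v is a solution to the following problem
\[
\left.
\begin{aligned}
&\Delta v+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)=\operatorname{grad}p+\partial_iF_{ij}e_j\\
&\operatorname{div}v=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
v=\mu\times x,\ x\in\partial\Omega, \tag{5.1}
\]
⋆ Recall that p_0(x) → 0 as |x| → ∞.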
GIOVANNI P. GALDI
where
Fij = Re ϕi ϕj .
Notice that, by virtue of the condition div ϕ = 0, we have
∂i Fij ej = Re ϕ · grad ϕ,
∂i ∂j Fij = Re grad ϕ · (grad ϕ)⊤ .
Therefore, since ϕ ∈ XR,q , we deduce
[|F |]2 + [|∂i Fij ej |]3 + [|∂i ∂j Fij |]4 c Re ϕ2XR,q .
So, by Theorem 4.1, we find, on the one hand, that v is a uniquely determined
element of XR,q and, on the other hand, that
(5.2)
vXR,q + [|p|]2 + [|grad p|]3,R c Reϕ2XR,q + 1 .
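Moreover, if v_1 = M(ϕ_1) and v_2 = M(ϕ_2), ϕ_1, ϕ_2 ∈ X_{R,q}, setting v = v_1 − v_2 and ϕ = ϕ_1 − ϕ_2 we deduce that v satisfies the following problem
\[
\left.
\begin{aligned}
&\Delta v+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)
=\operatorname{grad}p+\mathrm{Re}\,\partial_i(\phi_i\phi_{1j}+\phi_{2i}\phi_j)e_j\\
&\operatorname{div}v=0
\end{aligned}
\right\}\ \text{in }\Omega,\qquad
v=0\ \text{at }\partial\Omega,
\]
for some p, where $\phi$, $\phi_1$, $\phi_2$ denote ϕ, ϕ_1, ϕ_2. Thus, again by Theorem 4.1, it follows, in particular,
\[
\|v_1-v_2\|_{X_{R,q}}\le c\,\mathrm{Re}\,\big(\|\phi_1\|_{X_{R,q}}+\|\phi_2\|_{X_{R,q}}\big)\|\phi_1-\phi_2\|_{X_{R,q}} . \tag{5.3}
\]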
Inequalities (5.2) and (5.3) ensure that the map M has a fixed point v in X_{R,q}, for sufficiently small Re. Furthermore, in view of (5.2), the corresponding pressure p satisfies the condition stated in the theorem. Finally, the pair v, p is of class C^∞(Ω), as a consequence of a bootstrap argument and of well-known regularity results for the Stokes problem (see, e.g., [11, Chapter IV]). It remains to prove uniqueness. To this end, let u = v_2 − v_1, φ = p_2 − p_1 where {v_i, p_i}, i = 1, 2 are two smooth solutions to (2.1) in the class defined by (2.2) and corresponding to the same ω. We then have
\[
\begin{aligned}
&\left.
\begin{aligned}
&\mathrm{Re}\,(u\cdot\operatorname{grad}u+u\cdot\operatorname{grad}v_1+v_1\cdot\operatorname{grad}u
-\mu\times x\cdot\operatorname{grad}u+\mu\times u)=\Delta u-\operatorname{grad}\varphi\\
&\operatorname{div}u=0
\end{aligned}
\right\}\ \text{in }\Omega,\\
&\lim_{|x|\to\infty}u(x)=0,\\
&u(x)=0\ \text{at }\partial\Omega .
\end{aligned}
\tag{5.4}
\]
Dot-multiplying the first equation in (5.4) by u, integrating by parts over Ω_R and taking into account the third and fourth equation in (5.4), we obtain (with N = x/|x|)
\[
\int_{\Omega_R}|\operatorname{grad}u|^2
=\int_{\partial B_R}\Big[\frac{\partial u}{\partial n}\cdot u-\frac12\,\mathrm{Re}\,|u|^2(u+v_1)\cdot N-\varphi\,u\cdot N\Big]
+\mathrm{Re}\int_{\Omega_R}u\cdot\operatorname{grad}v_1\cdot u . \tag{5.5}
\]
We now observe that, in view of the asymptotic properties (2.2) the surface integral vanishes in the limit R → ∞. Therefore, in this limit, from (5.5) we find
\[
\int_\Omega|\operatorname{grad}u|^2=\mathrm{Re}\int_\Omega u\cdot\operatorname{grad}v_1\cdot u .
\]
However, [|grad v_1|]_2 ≤ c, and so the preceding equation delivers
\[
\int_\Omega|\operatorname{grad}u|^2\le c\,\mathrm{Re}\int_\Omega\frac{|u|^2}{|x|^2}. \tag{5.6}
\]
Since [11, Section II.5],
\[
\int_\Omega\frac{|u|^2}{|x|^2}\le 4\int_\Omega|\operatorname{grad}u|^2 ,
\]
from (5.6) and the fourth equation in (5.4), for sufficiently small Re, we obtain u = 0, which completes the proof of the theorem. ✷
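The two smallness conditions on Re used in this proof can be made explicit. The computation below is a sketch added for the reader, under the assumption that a single constant c_0 ≥ 1 can be taken in both (5.2) and (5.3) and that c_1 denotes the constant in (5.6); the radius δ = 2c_0 is one convenient choice, not one made in the original text.

```latex
% Existence: M maps the ball {\|\phi\|_{X_{R,q}} \le \delta}, \delta = 2c_0, into itself
% and is a contraction there, as soon as Re < 1/(4c_0^2).
\[
\|M(\phi)\|_{X_{R,q}}\le c_0\big(\mathrm{Re}\,\delta^2+1\big)\le 2c_0=\delta
\quad\Longleftrightarrow\quad 4c_0^2\,\mathrm{Re}\le 1,
\]
\[
\|M(\phi_1)-M(\phi_2)\|_{X_{R,q}}
\le c_0\,\mathrm{Re}\,(2\delta)\,\|\phi_1-\phi_2\|_{X_{R,q}}
=4c_0^2\,\mathrm{Re}\,\|\phi_1-\phi_2\|_{X_{R,q}},
\qquad 4c_0^2\,\mathrm{Re}<1 .
\]
% Uniqueness: combine (5.6) with the inequality quoted from [11, Section II.5].
\[
\int_\Omega|\operatorname{grad}u|^2
\le c_1\,\mathrm{Re}\int_\Omega\frac{|u|^2}{|x|^2}
\le 4c_1\,\mathrm{Re}\int_\Omega|\operatorname{grad}u|^2
\quad\Longrightarrow\quad
(1-4c_1\,\mathrm{Re})\int_\Omega|\operatorname{grad}u|^2\le 0 .
\]
```

For Re < 1/(4c_1) the last inequality forces grad u = 0, and the boundary condition u = 0 at ∂Ω then gives u = 0.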
6. Conclusions
Consider a rigid body B steadily rotating, with constant angular velocity ω, in a Navier–Stokes liquid that fills the whole space exterior to B. The main achievement of this paper is that, if B is of class C² and if |ω| is not "too large", the spatial asymptotics of the velocity v and pressure p of the liquid are completely and uniquely determined. Specifically, v(x) and its gradient grad v(x) decay like $|x|^{-1}$ and $|x|^{-2}$, respectively, while the pressure field p(x) and its gradient grad p(x) decay like $|x|^{-2}$ and $|x|^{-3}$, respectively. This result is relevant in several respects.
From a strictly theoretical point of view, it ensures existence of solutions satisfying
basic physical requirements, at least for small data. In fact, these solutions are
unique, satisfy the global energy balance and are nonlinearly stable in the sense of
Liapunov. The result is also important in several applications, like particle sedimentation [14] and the calculation of net force and torque exerted by a viscous liquid
on a rotating body at small and nonzero Reynolds number; see, e.g., [6, 16, 17], and
references cited therein. Finally, the knowledge of the sharp asymptotic behaviour of solutions to elliptic systems in exterior domains is also fundamental in numerical computations, especially in evaluating the error made by approximating the infinite region of flow with a necessarily bounded domain. For problems of this type related to the Navier–Stokes equations see, e.g., [28, 7].
Acknowledgement
I am indebted to Professor Christian Simader for helpful conversations about the
proof of Lemma 3.1.
References
1. R. Adams, Sobolev Spaces. Academic Press, New York (1975).
2. A.S. Advani, Flow and Rheology in Polymer Composites Manufacturing. Elsevier, Amsterdam (1994).
3. K.I. Babenko, On stationary solutions of the problem of flow past a body of a viscous incompressible fluid. Mat. Sb. 91(133) (1973) 3–27; English transl.: Math. USSR-Sb. 20 (1973) 1–25.
4. W. Borchers, Zur Stabilität und Faktorisierungsmethode für die Navier–Stokes Gleichungen inkompressibler viskoser Flüssigkeiten. Habilitationsschrift, University of Paderborn (1992).
5. Z.-M. Chen and T. Miyakawa, Decay properties of weak solutions to a perturbed Navier–Stokes system in R^n. Adv. Math. Sci. Appl. 7(2) (1997) 741–770.
6. R.G. Cox, The steady motion of a particle of arbitrary shape at small Reynolds numbers. J. Fluid Mech. 23 (1965) 625–643.
7. P. Deuring, On H²-estimates of solutions to the Stokes system with an artificial boundary condition. J. Math. Fluid Mech. 4 (2002) 203–236.
8. P. Deuring and G.P. Galdi, On the asymptotic behavior of physically reasonable solutions to the stationary Navier–Stokes system in three-dimensional exterior domains with zero velocity at infinity. J. Math. Fluid Mech. 2(4) (2000) 353–364.
9. R. Finn, On the exterior stationary problem for the Navier–Stokes equations, and associated perturbation problems. Arch. Rational Mech. Anal. 19 (1965) 363–406.
10. H. Fujita, On the existence and regularity of the steady-state solutions of the Navier–Stokes equation. J. Fac. Sci. Univ. Tokyo (1A) 2 (1961) 59–102.
11. G.P. Galdi, An Introduction to the Mathematical Theory of the Navier–Stokes Equations: Linearized Steady Problems, revised edn. Springer Tracts Nat. Philos. 38. Springer-Verlag, New York (1998).
12. G.P. Galdi, An Introduction to the Mathematical Theory of the Navier–Stokes Equations: Nonlinear Steady Problems, revised edn. Springer Tracts Nat. Philos. 39. Springer-Verlag, New York (1998).
13. G.P. Galdi, Slow motion of a body in a viscous incompressible fluid with application to particle sedimentation. In: V.A. Solonnikov (ed.), Developments in Partial Differential Equations. Quaderni di Matematica della II Università di Napoli 2 (1998) 2–50.
14. G.P. Galdi, On the motion of a rigid body in a viscous liquid: A mathematical analysis with applications. In: S. Friedlander and D. Serre (eds), Handbook of Mathematical Fluid Mechanics. Elsevier Science (2002) pp. 653–791.
15. G.P. Galdi, J.G. Heywood and Y. Shibata, On the global existence and convergence to steady state of Navier–Stokes flow past an obstacle that is started from rest. Arch. Rational Mech. Anal. 138 (1997) 307–318.
16. G.P. Galdi and A. Vaidya, Translational steady fall of symmetric bodies in a Navier–Stokes liquid, with application to particle sedimentation. J. Math. Fluid Mech. 3(1) (2001) 183–211.
17. R.B. Guenther, R.T. Hudspeth and E.A. Thomann, Hydrodynamic forces on submerged rigid bodies – steady flow. J. Math. Fluid Mech. (2002), in press.
18. T. Hishida, An existence theorem for the Navier–Stokes flow in the exterior of a rotating obstacle. Arch. Rational Mech. Anal. 150 (1999) 307–348.
19. T. Hishida, The Stokes operator with rotation effect in exterior domains. Analysis (Munich) 19 (1999) 51–67.
20. D.D. Joseph, Flow induced microstructure in Newtonian and viscoelastic fluids. In: Proceedings of the Fifth World Congress of Chemical Engineering. Particle Technology Track. 6 (1996) 3–16.
21. G. Kirchhoff, Über die Bewegung eines Rotationskörpers in einer Flüssigkeit. J. Reine Ang. Math. 71 (1869) 237–281.
22. O.A. Ladyzhenskaya, Investigation of the Navier–Stokes equation for a stationary flow of an incompressible fluid. Uspekhi Mat. Nauk. 14(3) (1959) 75–97 (in Russian).
23. O.A. Ladyzhenskaya, N.N. Ural'ceva and V.A. Solonnikov, Linear and Quasilinear Equations of Parabolic Type. Transl. Math. Monographs 23. Amer. Math. Soc., Providence, RI (1968).
24. H. Lamb, Hydrodynamics. Cambridge Univ. Press (1932).
25. J. Leray, Etude de diverses équations intégrales non linéaires et de quelques problèmes que pose l'hydrodynamique. J. Math. Pures Appl. 12 (1933) 1–82.
26. J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace. Acta Math. 63 (1934) 193–248.
27. S.A. Nazarov and K. Pileckas, On steady Stokes and Navier–Stokes problems with zero velocity at infinity in a three-dimensional exterior domain. J. Math. Kyoto Univ. 40(3) (2000) 475–492.
28. S.A. Nazarov and M. Specovius-Neugebauer, Approximation of exterior boundary value problems for the Stokes system. Asymptotic Anal. 14(3) (1997) 223–255.
29. A. Novotný and M. Padula, Note on decay of solutions of steady Navier–Stokes equations in 3-D exterior domains. Differential Integral Equations 8(7) (1995) 1833–1842.
30. F.K.G. Odqvist, Über die Randwertaufgaben der Hydrodynamik Zäher Flüssigkeiten. Math. Z. 32 (1930) 329–375.
31. C.W. Oseen, Neuere Methoden und Ergebnisse in der Hydrodynamik. Akad. Verlagsgesellschaft M.B.H., Leipzig (1927).
32. H. Schmid-Schonbein and R. Wells, Fluid drop-like transition of erythrocytes under shear. Science 165(3890) (1969) 288–291.
33. D. Serre, Chute libre d'un solid dans un fluide visqueux incompressible. Existence. Japan J. Appl. Math. 40(1) (1987) 99–110.
34. C.G. Simader and H. Sohr, The Dirichlet Problem for the Laplacian in Bounded and Unbounded Domains. Pitman Res. Notes Math. Ser. 360. Longman Sc. Tech. (1997).
35. G. Stokes, On the effect of internal friction of fluids on the motion of pendulums. Trans. Cambridge Phil. Soc. 9 (1851) 8–85.
36. B. Tinland, L. Meistermann and G. Weill, Simultaneous measurements of mobility, dispersion, and orientation of DNA during steady-field gel electrophoresis coupling a fluorescence recovery after photobleaching apparatus with a fluorescence detected linear dichroism setup. Phys. Rev. E 61(6) (2000) 6993–6998.
37. W. Thomson and P.G. Tait, Natural Philosophy, Vols. 1, 2. Cambridge Univ. Press (1879).
38. C. Truesdell and W. Noll, Handbuch der Physik, Vol. VIII/3. Springer-Verlag (1965).
39. H.F. Weinberger, On the steady fall of a body in a Navier–Stokes fluid. Proc. Sympos. Pure Math. 23 (1973) 421–440.
Global Bifurcation in Nonlinear Elasticity
with an Application to Barrelling States
of Cylindrical Columns
TIMOTHY J. HEALEY¹ and ERROL L. MONTES-PIZARRO²
¹ Center for Applied Mathematics and Department of Theoretical and Applied Mechanics, Cornell University, Ithaca, NY 14853, USA. E-mail: tj10@cornell.edu
² Department of Mathematics and Physics, University of Puerto Rico, Cayey Campus, Cayey, PR 00736, Puerto Rico. E-mail: emontes@caribe.net
Received 18 October 2002; in revised form 12 May 2003
Abstract. We present rigorous local and global bifurcation results for a concrete example from
3-dimensional nonlinear elastostatics – the problem of barrelling of compressed cylindrical columns.
We use standard tools of bifurcation theory for the local analysis, already producing results that
are rare in our field. For the global part we employ the generalized degree designed by Healey and
Simpson to overcome the specific difficulties of 3-dimensional nonlinear elasticity. Ours are the first
global bifurcation results for a problem from 3-dimensional nonlinear elastostatics not governed by
ordinary differential equations. Moreover, our approach to the barrelling problem provides a paradigm for the solution of a large class of problems in nonlinear elastostatics concerning bifurcation
from a homogeneously deformed state.
Mathematics Subject Classifications (2000): 74B20, 74G25, 37G99.
Key words: local bifurcation, global bifurcation, barrelling, complementing condition, Green–
Hadamard material, Blatz–Ko material.
We dedicate this work to the memory of Clifford Ambrose Truesdell III, whose
scholarship, clarity of exposition and leadership inspired a generation.
1. Introduction
One of the main difficulties in applying degree-theoretic methods to problems
of three-dimensional nonlinear elastostatics stems from the presence of traction
boundary conditions. Indeed the latter correspond to nonlinear Neumann conditions, where the nonlinearity is in the dependence upon the deformation gradient.
Even for scalar, second-order elliptic equations, the apparent inapplicability of the
Leray–Schauder degree in this setting is well known [18] (cf. Chapter 10.1). In a
recent work [14], a generalized degree was constructed to overcome this difficulty,
and global-continuation results were obtained for a general class of boundary value
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 469–494.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
problems in nonlinear elasticity. The construction in [14] was inspired by an abstract degree proposed in [16] for a class of nonlinear Fredholm maps. However, the
treatment in [14] accounts explicitly for mappings comprising nonlinear equations
on both the domain and the boundary, which is common in problems from nonlinear continuum mechanics (cf. [7], where the degree developed in [14] was recently
applied to a problem in water waves). In [14] one starts from the unloaded, stress-free state, which is subsequently shown to be an element of a global continuum
of solutions (in “load-displacement” space). In particular, for pure displacement
problems, under physically reasonable hypotheses on the stored energy function
and on the domain, one obtains unbounded branches of classical injective solutions,
cf. [12, 13]. In the absence of a-priori bounds, this falls short of a general existence
theorem for displacement problems. Nonetheless, the existence of solutions “in the
large”, i.e., “far” from the unloaded reference configuration, is established.
For problems admitting a trivial line of solutions, e.g., a family of homogeneous solutions parametrized by the magnitude of the loading, the existence results
obtained in [12, 14, 13] are of little interest (the trivial solution branch itself is
unbounded). Rather, bifurcating solutions from the trivial line are typically sought
and characterized. The degree presented in [14] has all of the properties of the
Leray–Schauder degree, including the capability of detecting global bifurcation
in the sense of Rabinowitz [26]. The first step in a global bifurcation analysis is
to verify a change in the degree as the bifurcation parameter crosses a singular
point along the trivial line. As first observed by Krasnoselskii [17], this yields
the existence of a local continuum of bifurcating solutions; Rabinowitz [26] later
showed the global ramifications of this change in degree for operator equations in
the form of a compact perturbation of the identity. In practice, the simplest and most
common way to demonstrate a change in degree is to verify a certain transversality
or “crossing condition”, which also typically yields the existence of a local curve
of bifurcating solutions, cf. [8]. In other words, a local analysis is the first step in
performing a global one.
The literature in three-dimensional nonlinear elasticity is replete with examples
in which the necessary conditions for bifurcation are worked out by (formally)
linearizing the nonlinear problem about a trivial line of homogeneous solutions.
The critical value or “buckling load” at which the linearization fails to be injective
is then determined, e.g., cf. [5, 21, 22, 27, and references therein]. This is often
referred to in the literature as the problem of “small on large”. Given the maturity
of the field of bifurcation theory, there is surprisingly little rigorous analysis in
the literature on sufficient conditions for bifurcation in concrete examples from
three-dimensional nonlinear elasticity. In fact, we know of only two such works –
[25, 31] – where the existence of a local curve of bifurcating solutions is obtained.
In this work we choose a typical example from the literature for which only the
necessary conditions for bifurcation are well understood, viz., the problem of barrelling of compressed cylindrical columns. This problem was studied by Simpson
and Spector [28, 29], and by Davies [9, 10]. In those works, the existence of the
trivial line of homogeneous solutions is established, the critical load is obtained,
and the stability (and instability) of solutions on the trivial line is determined.
Although our ultimate goal here is to obtain global bifurcation results, quite a bit
of the paper is devoted to a rigorous local bifurcation analysis. Here we benefit
from the treatment in [31]. To the best of our knowledge, ours are the first global
bifurcation results in a problem from three-dimensional elasticity not governed by
ordinary differential equations. Moreover, our approach to the barrelling problem
serves as a paradigm for the nonlinear analysis of a broad class of “small on large”
problems in nonlinear elasticity.
The outline of the paper is as follows: In Section 2 we present the formulation
of our problem. As in [9, 10, 28, 29], we assume that the two compressed ends are
subjected to “sliding” conditions, which ensure that the trivial solution branch corresponds to a homogeneously deformed state. We impose strong ellipticity, which,
among other things, plays a key role in our ability to reformulate the problem as
that of finding certain periodic solutions for an infinite cylinder. In this way, we
eliminate the presence of corners on the boundary of the domain, thus resolving
otherwise difficult questions of regularity. In Section 3 we summarize the necessary
conditions for bifurcation from the homogeneous state, and in Section 4 we present
concrete examples verifying those conditions. In Section 5 we provide a local bifurcation analysis of our problem, ensuring the existence of a curve of nontrivial
“barrelling” solutions. Finally, in Section 6 we obtain global bifurcation results
using the degree presented in [14]. Following up on an observation made in [12],
we take the opportunity here to simplify the construction of the degree via uniform
spectral estimates, cf. Proposition 6.3. As is the case for the continuation results
in [14], the first general result here provides the existence of a global continuum
of nontrivial solutions, characterized not only by the two usual Rabinowitz alternatives, but also by the possibility that the branch “terminates” due to a loss of local
injectivity and/or a failure of the complementing condition. As in [13], we are able
to eliminate the possibility of a terminated or bounded branch of solutions due to a
loss of local injectivity for a large class of realistic materials. Interestingly, the presence of traction boundary conditions requires slightly stronger growth conditions
here than those employed in [13] for pure displacement problems.
NOTATION
Throughout this work we presume that a fixed, right-handed, rectangular Cartesian frame of reference has been chosen for $\mathbb{E}^3$, Euclidean 3-space; we employ the usual abuse of notation by associating points in $\mathbb{E}^3$ and vectors (in the space of translations of $\mathbb{E}^3$) with their coordinates and components, respectively, relative to the Cartesian frame, as elements of $\mathbb{R}^3$. Elements of $\mathbb{R}^3$, henceforth called vectors, are denoted by boldface, lowercase Latin letters such as $\mathbf{a}$, $\mathbf{x}$, etc.; $\mathbf{a}\cdot\mathbf{b}$ denotes the inner product of $\mathbf{a}$ and $\mathbf{b}$. Linear transformations of $\mathbb{R}^3$ into itself, also called (second order) tensors, are denoted by boldface, uppercase symbols like $\mathbf{A}$, $\mathbf{L}$, etc.; $\mathbf{I}$ denotes the identity. $\mathbf{A}^T$, $\mathbf{A}^{-1}$, $\operatorname{tr}\mathbf{A}$ and $\det\mathbf{A}$ denote the transpose, the inverse, the trace and the determinant, respectively, of $\mathbf{A}$.
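Given two Banach spaces $X$ and $Y$, we denote the space of all bounded linear transformations of $X$ into $Y$ by $L(X, Y)$; $L(X) \equiv L(X, X)$. Uppercase symbols like $A$, $L$, etc., denote elements of $L(X, Y)$; $I \in L(X)$ denotes the identity. We write $A[x]$ for the value of $A \in L(X, Y)$ at $x \in X$. In particular, we consistently employ the latter notation when dealing with elements of $L(L(\mathbb{R}^3))$, which are called fourth order tensors: $\mathsf{C}[\mathbf{H}]$ denotes the value of $\mathsf{C} \in L(L(\mathbb{R}^3))$ at $\mathbf{H} \in L(\mathbb{R}^3)$. We also define
\[
\begin{aligned}
GL(X) &\equiv \{A \in L(X): A \text{ is bijective}\},\\
GL^+(\mathbb{R}^3) &\equiv \{\mathbf{A} \in GL(\mathbb{R}^3): \det\mathbf{A} > 0\},\\
SO(3) &\equiv \{\mathbf{A} \in L(\mathbb{R}^3): \mathbf{A}^T = \mathbf{A}^{-1}\} \cap GL^+(\mathbb{R}^3),\\
\mathrm{Sym}(L(\mathbb{R}^3)) &\equiv \{\mathsf{C} \in L(L(\mathbb{R}^3)): \mathbf{A}\cdot\mathsf{C}[\mathbf{B}] = \mathbf{B}\cdot\mathsf{C}[\mathbf{A}],\ \forall\,\mathbf{A}, \mathbf{B} \in L(\mathbb{R}^3)\},
\end{aligned}
\]
where $\mathbf{E}\cdot\mathbf{F} \equiv \operatorname{tr}(\mathbf{E}^T\mathbf{F})$ for all $\mathbf{E}$ and $\mathbf{F}$, which is the standard inner product on $L(\mathbb{R}^3)$.
Finally, for a function $\Phi(x_1, \ldots, x_n)$ of $n$ variables we denote by $\Phi_{,i}$ the partial derivative of $\Phi$ with respect to its $i$th argument, and by $\Phi_{,ij}$ the mixed second order partial derivatives.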
2. Formulation
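Let $\Omega \subset \mathbb{R}^3$ denote the right circular (open) cylinder of height $L$ and radius $R$ given by
\[
\Omega = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1^2 + x_2^2 < R^2,\ 0 < x_3 < L\}, \tag{2.1}
\]
with boundary $\partial\Omega = \partial\Omega_B \cup \partial\Omega_L \cup \partial\Omega_T$, where
\[
\begin{aligned}
\partial\Omega_B &= \{(x_1, x_2, x_3): x_1^2 + x_2^2 \le R^2,\ x_3 = 0\},\\
\partial\Omega_T &= \{(x_1, x_2, x_3): x_1^2 + x_2^2 \le R^2,\ x_3 = L\},\\
\partial\Omega_L &= \{(x_1, x_2, x_3): x_1^2 + x_2^2 = R^2,\ x_3 \in [0, L]\}.
\end{aligned}
\tag{2.2}
\]
We consider a hyperelastic, homogeneous, and isotropic body occupying $\Omega$ in a stress-free reference configuration. Let $\mathbf{f}$ denote a deformation of $\Omega$, which is, by definition, a differentiable mapping $\mathbf{f}: \Omega \subset \mathbb{R}^3 \to \mathbb{R}^3$, i.e., $\mathbf{f}(\mathbf{x})$ denotes the position in the deformed state of the material point occupying position $\mathbf{x}$ in the reference configuration $\Omega$; we require local injectivity, viz., $\mathbf{F}(\mathbf{x}) \equiv \nabla\mathbf{f}(\mathbf{x}) \in GL^+(\mathbb{R}^3)$ for each $\mathbf{x} \in \Omega$.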
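Let $\mathbf{S}$ denote the (first) Piola–Kirchhoff stress tensor. We then study the following boundary value problem:
\begin{align}
\nabla\cdot\mathbf{S} &= \mathbf{0} && \text{in } \Omega, \tag{2.3}\\
f_3 &= 0 && \text{on } \partial\Omega_B, \tag{2.4}\\
f_3 &= \lambda L && \text{on } \partial\Omega_T, \tag{2.5}\\
S_{13} = S_{23} &= 0 && \text{on } \partial\Omega_B \cup \partial\Omega_T, \tag{2.6}\\
\mathbf{S}\mathbf{n} &= \mathbf{0} && \text{on } \partial\Omega_L. \tag{2.7}
\end{align}
Here $\mathbf{n}$ denotes the outward unit normal to $\partial\Omega_L$, and $\lambda \in (0, \infty)$ is a “loading” parameter. To avoid trivial nonuniqueness we also impose:
\begin{align}
\int_\Omega f_1 \, d\mathbf{x} = \int_\Omega f_2 \, d\mathbf{x} &= 0, \tag{2.8}\\
\int_\Omega \left[f_{1,2} - f_{2,1}\right] d\mathbf{x} &= 0. \tag{2.9}
\end{align}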
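By hyperelasticity we mean there exists a sufficiently smooth stored-energy function $W: GL^+(\mathbb{R}^3) \to \mathbb{R}$, such that
\[
\mathbf{S}(\mathbf{F}) = \frac{dW(\mathbf{F})}{d\mathbf{F}}. \tag{2.10}
\]
Since the reference configuration is assumed to be stress-free, we have
\[
\frac{dW(\mathbf{I})}{d\mathbf{F}} = \mathbf{0}. \tag{2.11}
\]
We require $W$ to satisfy the principle of material objectivity (frame-indifference)
\[
W(\mathbf{Q}\mathbf{F}) = W(\mathbf{F}), \quad \forall\,\mathbf{Q} \in SO(3). \tag{2.12}
\]
In addition, by isotropy we mean
\[
W(\mathbf{F}\mathbf{Q}) = W(\mathbf{F}), \quad \forall\,\mathbf{Q} \in SO(3). \tag{2.13}
\]
Consequently, there exists a smooth function $\Phi: \mathbb{R}^2 \times \mathbb{R}^+ \to \mathbb{R}$, such that
\[
W(\mathbf{F}) = \Phi\!\left(\tfrac{1}{2}\,\mathbf{F}\cdot\mathbf{F},\ \tfrac{1}{4}\,\mathbf{F}\mathbf{F}^T\cdot\mathbf{F}\mathbf{F}^T,\ \det\mathbf{F}\right). \tag{2.14}
\]
The fourth order tensor $\mathsf{C}$ defined by
\[
\mathsf{C}(\mathbf{F}) \equiv \frac{d^2W(\mathbf{F})}{d\mathbf{F}^2}, \tag{2.15}
\]
is called the elasticity tensor. Note that
\[
\mathsf{C} \in \mathrm{Sym}(L(\mathbb{R}^3)) \equiv \{\mathsf{A} \in L(L(\mathbb{R}^3)): \mathbf{D}\cdot\mathsf{A}[\mathbf{B}] = \mathbf{B}\cdot\mathsf{A}[\mathbf{D}],\ \forall\,\mathbf{D}, \mathbf{B} \in L(\mathbb{R}^3)\}. \tag{2.16}
\]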
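We also make the following assumptions on $W$, which are physically reasonable and mathematically convenient:
H1. Growth conditions on $W$:
\[
\lim_{\det\mathbf{F}\to 0^+} W(\mathbf{F}) = +\infty, \quad\text{and}\quad \lim_{\|\mathbf{F}\|\to+\infty} W(\mathbf{F}) = +\infty. \tag{2.17}
\]
H2. Smoothness:
\[
W \in C^5(GL^+(\mathbb{R}^3), \mathbb{R}). \tag{2.18}
\]
H3. The restriction of $\mathsf{C}(\mathbf{I})$ to $\mathrm{Sym}(\mathbb{R}^3)$ is positive-definite:
\[
\mathbf{H}\cdot\mathsf{C}(\mathbf{I})[\mathbf{H}] > 0, \quad \forall\,\mathbf{H} \in \mathrm{Sym}(\mathbb{R}^3)\setminus\{\mathbf{0}\}. \tag{2.19}
\]
H4. The elasticity tensor is strongly elliptic, i.e., for every $\mathbf{F} \in GL^+(\mathbb{R}^3)$,
\[
\mathbf{a}\otimes\mathbf{b}\cdot\mathsf{C}(\mathbf{F})[\mathbf{a}\otimes\mathbf{b}] > 0, \quad \forall\,\mathbf{a}, \mathbf{b} \in \mathbb{R}^3\setminus\{\mathbf{0}\}. \tag{2.20}
\]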
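Next we discuss the complementing condition. Let $\mathbf{x}_0 \in \partial\Omega_L$, $\mathbf{F}_0 \in GL^+(\mathbb{R}^3)$, and assume that $\mathsf{C}_0 \equiv \mathsf{C}(\mathbf{F}_0)$ is strongly elliptic. Consider the following linear problem:
\[
\begin{aligned}
\nabla\cdot\mathsf{C}_0[\nabla\mathbf{v}] &= \alpha^2\mathbf{v} && \text{in } \mathcal{H},\\
\mathsf{C}_0[\nabla\mathbf{v}]\mathbf{n}_0 &= \mathbf{0} && \text{on } \partial\mathcal{H},
\end{aligned}
\tag{2.21}
\]
where $\mathbf{n}_0 \equiv \mathbf{n}(\mathbf{x}_0)$ is the outward unit normal to $\partial\Omega_L$ at $\mathbf{x}_0$, and $\mathcal{H}$ is the halfspace $\mathcal{H} = \{\mathbf{x} \in \mathbb{R}^3 : (\mathbf{x} - \mathbf{x}_0)\cdot\mathbf{n}_0 < 0\}$. We seek solutions of (2.21) of the form
\[
\mathbf{v}(\mathbf{x}) = \mathbf{z}((\mathbf{x} - \mathbf{x}_0)\cdot\mathbf{n}_0)\exp(i(\mathbf{x} - \mathbf{x}_0)\cdot\boldsymbol{\xi}), \tag{2.22}
\]
for all unit vectors $\boldsymbol{\xi}$ such that $\boldsymbol{\xi}\cdot\mathbf{n}_0 = 0$, where $\mathbf{z} \in C^\infty([0, \infty), \mathbb{C}^3)$. The pair $(\mathsf{C}_0, \mathbf{n}_0)$ is said to satisfy the complementing condition if, for all unit vectors $\boldsymbol{\xi}$ orthogonal to $\mathbf{n}_0$, the only bounded solution of (2.21) with $\alpha = 0$, of the form (2.22), is $\mathbf{v} \equiv \mathbf{0}$ [2, 30]. If the same is true for all $\alpha \neq 0$, then $(\mathsf{C}_0, \mathbf{n}_0)$ is said to satisfy Agmon's condition. If both the complementing condition and Agmon's condition are satisfied, then $(\mathsf{C}_0, \mathbf{n}_0)$ is said to satisfy the strong complementing condition. As discussed, e.g., in [14], the verification of these conditions is equivalent to the nonvanishing of a certain $3 \times 3$ determinant, denoted $d(\mathbf{F}_0, \mathbf{x}_0, \boldsymbol{\xi}, \alpha) \neq 0$, where $d$ is continuous in its four arguments (cf. also [1, 20]).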
The following result is well known, cf. [10, 29, 32]:
PROPOSITION 2.1. Assume that $\Phi$ is twice continuously differentiable and that hypotheses (2.17) and (2.20) hold. Then, for each $\lambda \in (0, \infty)$, there exists a unique constant $\mu(\lambda) > 0$, such that
\[
\mathbf{f}(\mathbf{x}) = \mathbf{H}_\lambda\mathbf{x} \equiv \begin{pmatrix} \mu^{1/2} & 0 & 0\\ 0 & \mu^{1/2} & 0\\ 0 & 0 & \lambda \end{pmatrix}\mathbf{x} \tag{2.23}
\]
is a solution of (2.3)–(2.14). Moreover, $\mu \in C^1((0, \infty); \mathbb{R})$, and $\mu(1) = 1$.
In the jargon of bifurcation theory, the solution of (2.3)–(2.14), given by (2.23),
is called the trivial solution. We want to study the existence of branches of nontrivial solutions bifurcating from (2.23). In order to carry out a rigorous bifurcation
analysis for our problem, it is convenient to rewrite it abstractly as G(λ, u) = 0,
where u denotes the displacement from the trivial solution and G is a nonlinear operator between appropriate Banach spaces. Before doing this, we derive boundary
conditions, in terms of the displacement field u, that are equivalent to the “sliding”
boundary conditions (2.4)–(2.6).
PROPOSITION 2.2. If we write $\mathbf{f}(\mathbf{x}) = \mathbf{H}_\lambda\mathbf{x} + \mathbf{u}(\mathbf{x})$, then the boundary conditions (2.4) and (2.5) become
\[
u_3(x_1, x_2, 0) = u_3(x_1, x_2, L) = 0, \tag{2.24}
\]
while the zero-shear conditions (2.6) are equivalent to
\[
\begin{aligned}
u_{1,3}(x_1, x_2, 0) &= u_{1,3}(x_1, x_2, L) = 0, \quad\text{and}\\
u_{2,3}(x_1, x_2, 0) &= u_{2,3}(x_1, x_2, L) = 0,
\end{aligned}
\tag{2.25}
\]
for $x_1^2 + x_2^2 \le R^2$.
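Proof. The first part leading to (2.24) is trivial. For the second part, first notice that $u_3$ constant in $\partial\Omega_T \cup \partial\Omega_B$ implies $u_{3,1} = u_{3,2} = 0$ in $\partial\Omega_T \cup \partial\Omega_B$. Using this with (2.10) and (2.14), we get that
\[
\begin{aligned}
S_{13} ={}& \left[\Phi_{,1} + \Phi_{,2}\left((\mu^{1/2} + u_{1,1})^2 + u_{1,2}^2 + u_{1,3}^2 + (\lambda + u_{3,3})^2\right)\right]u_{1,3}\\
&+ \Phi_{,2}\left[(\mu^{1/2} + u_{1,1})u_{2,1} + u_{1,2}(\mu^{1/2} + u_{2,2}) + u_{1,3}u_{2,3}\right]u_{2,3}
\end{aligned}
\tag{2.26}
\]
and
\[
\begin{aligned}
S_{23} ={}& \Phi_{,2}\left[(\mu^{1/2} + u_{1,1})u_{2,1} + u_{1,2}(\mu^{1/2} + u_{2,2}) + u_{1,3}u_{2,3}\right]u_{1,3}\\
&+ \left[\Phi_{,1} + \Phi_{,2}\left(u_{2,1}^2 + (\mu^{1/2} + u_{2,2})^2 + u_{2,3}^2 + (\lambda + u_{3,3})^2\right)\right]u_{2,3}.
\end{aligned}
\tag{2.27}
\]
In $\partial\Omega_T \cup \partial\Omega_B$ we have the decomposition
\[
\mathbf{F} = \begin{pmatrix} \mu^{1/2} + u_{1,1} & u_{1,2} & u_{1,3}\\ u_{2,1} & \mu^{1/2} + u_{2,2} & u_{2,3}\\ 0 & 0 & \lambda + u_{3,3} \end{pmatrix} = \mathbf{F}_0 + \mathbf{a}\otimes\mathbf{e}_3, \tag{2.28}
\]
where
\[
\mathbf{F}_0 = \begin{pmatrix} \mu^{1/2} + u_{1,1} & u_{1,2} & 0\\ u_{2,1} & \mu^{1/2} + u_{2,2} & 0\\ 0 & 0 & \lambda \end{pmatrix} \quad\text{and}\quad \mathbf{a} = (u_{1,3}, u_{2,3}, u_{3,3}). \tag{2.29}
\]
Observe that $\det\mathbf{F} > 0 \Rightarrow \det\mathbf{F}_0 > 0$. Now strong ellipticity at $\mathbf{F}_0$ implies that the mapping $\mathbf{a} \mapsto \mathbf{s}(\mathbf{a}) \equiv \mathbf{S}(\mathbf{F}_0 + \mathbf{a}\otimes\mathbf{e}_3)\mathbf{e}_3$ is injective, cf. [4]. Of course $\mathbf{s}(\mathbf{a}) = (S_{13}, S_{23}, S_{33})$, and by (2.26) and (2.27) we see that $\mathbf{s}((0, 0, u_{3,3})) = (0, 0, S_{33})$ in $\partial\Omega_T \cup \partial\Omega_B$. On the other hand, if $S_{13} = S_{23} = 0$ in $\partial\Omega_T \cup \partial\Omega_B$, then $\mathbf{s}((u_{1,3}, u_{2,3}, u_{3,3})) = (0, 0, S_{33})$, and thus, $u_{1,3} = u_{2,3} = 0$ in $\partial\Omega_T \cup \partial\Omega_B$ by the injectivity of $\mathbf{s}$. ✷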
In view of (2.24) and (2.25), we now formulate our nonlinear problem on the infinite cylinder
\[
\Omega_\infty \equiv \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1^2 + x_2^2 < R^2,\ -\infty < x_3 < \infty\}. \tag{2.30}
\]
We then impose the following even–oddness assumptions on the components of $\mathbf{u}$:
\begin{align}
u_i(x_1, x_2, x_3) &= u_i(x_1, x_2, -x_3), \quad\text{for } i = 1, 2, \quad\text{and} \tag{2.31}\\
u_3(x_1, x_2, x_3) &= -u_3(x_1, x_2, -x_3). \tag{2.32}
\end{align}
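We restrict ourselves to axisymmetric solutions, viz., we require that $\mathbf{u}$ satisfy
\[
\mathbf{u}(\mathbf{Q}\mathbf{x}) = \mathbf{Q}\mathbf{u}(\mathbf{x}), \quad \forall\,\mathbf{x} \in \Omega_\infty, \tag{2.33}
\]
where
\[
\mathbf{Q} = \begin{pmatrix} \cos\omega & -\sin\omega & 0\\ \sin\omega & \cos\omega & 0\\ 0 & 0 & 1 \end{pmatrix}, \quad \forall\,\omega \in \mathbb{R} \bmod 2\pi, \quad\text{or}\quad \mathbf{Q} = \begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 1 \end{pmatrix}.
\]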
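The equivariance requirement (2.33) can be checked numerically on a sample field. The sketch below is illustrative only: the profile functions `phi` and `ell` are hypothetical choices made for this example (they do not come from the text), and the field is taken in the radial form $(\varphi x_1, \varphi x_2, \ell)$ with $\varphi$, $\ell$ depending on $r$ and $x_3$ alone. The profiles are also chosen even and odd in $x_3$, respectively, so that (2.31) and (2.32) hold.

```python
import math

def phi(r, z):
    # hypothetical radial profile, even in z; chosen only for illustration
    return 1.0 + r * r * math.cos(z)

def ell(r, z):
    # hypothetical axial profile, odd in z; chosen only for illustration
    return r * r * math.sin(z)

def u(x):
    """Sample field u = (phi*x1, phi*x2, ell) with r = sqrt(x1^2 + x2^2), z = x3."""
    r = math.hypot(x[0], x[1])
    return (phi(r, x[2]) * x[0], phi(r, x[2]) * x[1], ell(r, x[2]))

def rotate(x, omega):
    """Apply the rotation Q(omega) about the x3-axis."""
    c, s = math.cos(omega), math.sin(omega)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])

def reflect(x):
    """Apply the reflection Q = diag(1, -1, 1)."""
    return (x[0], -x[1], x[2])

def equivariance_defect(x, Q):
    """Largest component of u(Qx) - Q u(x); zero up to rounding if (2.33) holds."""
    lhs, rhs = u(Q(x)), Q(u(x))
    return max(abs(p - q) for p, q in zip(lhs, rhs))

x0 = (0.3, -0.7, 0.4)
defect_rotation = equivariance_defect(x0, lambda y: rotate(y, 1.234))
defect_reflection = equivariance_defect(x0, reflect)
```

Both defects vanish up to rounding because the rotation and the reflection leave $r$ and $x_3$ unchanged, so $\varphi$ and $\ell$ take the same values at $\mathbf{x}$ and $\mathbf{Q}\mathbf{x}$.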
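Let us now define the following spaces of functions
\begin{align}
X &\equiv \{\mathbf{u} \in C^{2,\alpha}(\Omega_\infty, \mathbb{R}^3): \mathbf{u} \text{ is } 2L\text{-periodic in } x_3 \text{ and satisfies (2.31)–(2.33)}\}, \tag{2.34}\\
Y_0 &\equiv \{\mathbf{v} \in C^{0,\alpha}(\Omega_\infty, \mathbb{R}^3): \mathbf{v} \text{ is } 2L\text{-periodic in } x_3 \text{ and satisfies (2.31)–(2.33)}\}, \tag{2.35}\\
Y_1 &\equiv \{\mathbf{w} \in C^{1,\alpha}(\partial\Omega_\infty, \mathbb{R}^3): \mathbf{w} \text{ is } 2L\text{-periodic in } x_3 \text{ and satisfies (2.31)–(2.33)}\}, \tag{2.36}\\
Y &\equiv Y_0 \times Y_1, \tag{2.37}
\end{align}
where
\[
\partial\Omega_\infty = \{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1^2 + x_2^2 = R^2\}, \tag{2.38}
\]
and where $C^{k,\alpha}$ denotes the usual Hölder spaces of all $k$-times continuously differentiable functions whose $k$th-order partial derivatives are (locally) Hölder continuous with exponent $\alpha \in (0, 1]$. We endow $X$ and $Y$ with the usual Hölder norms rendering them Banach spaces:
\[
\|\cdot\|_X \equiv \|\cdot\|_{2,\alpha;\Omega}, \qquad \|\cdot\|_Y \equiv \|\cdot\|_{0,\alpha;\Omega} + \|\cdot\|_{1,\alpha;\partial\Omega_L}. \tag{2.39}
\]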
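Next we define
\begin{align}
U &\equiv \{(\lambda, \mathbf{u}) \in (0, \infty) \times X: \det(\mathbf{H}_\lambda + \nabla\mathbf{u}) > 0 \text{ in } \Omega\}, \quad\text{and} \tag{2.40}\\
G(\lambda, \mathbf{u}) &\equiv \left(\mathsf{C}(\mathbf{H}_\lambda + \nabla\mathbf{u})[\nabla^2\mathbf{u}],\ \mathbf{S}(\mathbf{H}_\lambda + \nabla\mathbf{u})\mathbf{n}\right), \tag{2.41}
\end{align}
where $G: U \to Y$. From (2.15), the componential form of the first term in (2.41) is given by
\[
\left(\mathsf{C}(\mathbf{H}_\lambda + \nabla\mathbf{u})[\nabla^2\mathbf{u}]\right)_i \equiv \frac{\partial^2 W}{\partial F_{ij}\,\partial F_{kl}}(\mathbf{H}_\lambda + \nabla\mathbf{u})\,\frac{\partial^2 u_k}{\partial x_j\,\partial x_l}.
\]
Observe that all $\mathbf{u} \in X$ automatically satisfy (2.8) and (2.9).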
From Proposition 2.2, we then conclude:
PROPOSITION 2.3. Any solution of the operator equation,
\[
G(\lambda, \mathbf{u}) = 0 \tag{2.42}
\]
with $\mathbf{u}$ restricted to $\Omega$, is a solution of the BVP (2.3)–(2.20) and vice-versa. That is, any classical solution of BVP (2.3)–(2.20), for each fixed $\lambda$, can be $2L$-periodically extended on $\Omega_\infty$ to produce a solution of (2.42).
3. Linearized Problem
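The mapping $G$, as defined in the previous section, has the property
\[
G(\lambda, \mathbf{0}) = 0, \quad \forall\,\lambda \in (0, \infty). \tag{3.1}
\]
Moreover, the smoothness hypothesis H2 (cf. (2.18)) ensures that $G: U \to Y$ is of class $C^2$, cf., e.g., [35]. (For the purposes of this section, $G$ of class $C^1$ is enough.) We are interested in nontrivial solutions of (2.42) bifurcating from the trivial solution $\mathbf{u} \equiv \mathbf{0}$. A necessary condition for bifurcation is that the linearized problem,
\[
G_u(\lambda, \mathbf{0})[\mathbf{h}] = \left(\nabla\cdot\mathsf{C}(\mathbf{H}_\lambda)[\nabla\mathbf{h}],\ \mathsf{C}(\mathbf{H}_\lambda)[\nabla\mathbf{h}]\mathbf{n}\right) = (\mathbf{0}, \mathbf{0}), \tag{3.2}
\]
admit nontrivial solutions $\mathbf{h} \in X$. Here $G_u(\lambda, \mathbf{u})$ denotes the Fréchet derivative of $\mathbf{u} \mapsto G(\lambda, \mathbf{u})$ at $(\lambda, \mathbf{u}) \in U$.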
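Fortunately, problem (3.2) has been studied previously, cf. [9, 10, 29]. Any $\mathbf{h} \in X$ satisfying conditions (2.33) can be written in the form:
\[
\mathbf{h}(x_1, x_2, x_3) = \begin{pmatrix} \varphi(r, z)x_1\\ \varphi(r, z)x_2\\ \ell(r, z) \end{pmatrix}, \tag{3.3}
\]
where $r^2 = x_1^2 + x_2^2$ and $z = x_3$. If we let $\theta(r, z) = r^2\varphi(r, z)$, a long and tedious, but otherwise elementary, computation shows that our linear problem reduces to that of finding a pair $(\theta(r, z), \ell(r, z))$ of $C^{2,\alpha}((0, R) \times (0, L))$ functions satisfying the equations:
\[
\begin{aligned}
\tau_1\left(\frac{\theta_r}{r}\right)_r + \frac{\beta_1\theta_{zz}}{r} + N\ell_{rz} &= 0,\\
\beta_1(r\ell_r)_r + r\tau_3\ell_{zz} + N\theta_{rz} &= 0,
\end{aligned}
\quad\text{in } (0, R) \times (0, L), \tag{3.4}
\]
together with the boundary conditions
\[
\begin{aligned}
\ell(r, 0) &= \ell(r, L) = 0,\\
\theta_z(r, 0) &= \theta_z(r, L) = 0,
\end{aligned}
\quad\text{for } r \in (0, R], \tag{3.5}
\]
and
\[
\begin{aligned}
\beta_1 R\,\ell_r(R, z) + t^{1/2}\beta_1\theta_z(R, z) &= 0,\\
\tau_1\theta_r(R, z) + XR\,\ell_z(R, z) &= \frac{2\beta_3\theta(R, z)}{R},
\end{aligned}
\quad\text{for } z \in [0, L]. \tag{3.6}
\]
The requirement $\mathbf{u} \in C^{2,\alpha}$ yields
\[
\lim_{r\to 0^+}\frac{\theta(r, z)}{r} = 0. \tag{3.7}
\]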
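In general, $\beta_1$, $\beta_3$, $\tau_1$, and $\tau_3$ are given by the expressions
\begin{align}
\beta_i &\equiv \Phi_{,1} + \Phi_{,2}(\nu_1^2 + \nu_2^2 + \nu_3^2 - \nu_i^2) > 0, \tag{3.8}\\
\tau_i &\equiv \nu_1\nu_2\nu_3\,\nu_i^{-1}\frac{\partial t_i}{\partial\nu_i} > 0, \quad\text{with} \tag{3.9}\\
t_i(\nu_1, \nu_2, \nu_3) &\equiv \Phi_{,3} + \frac{\nu_i^2\Phi_{,1} + \nu_i^4\Phi_{,2}}{\nu_1\nu_2\nu_3}, \tag{3.10}
\end{align}
where the $\nu_i$'s are the eigenvalues of $\sqrt{\mathbf{F}\mathbf{F}^T}$ (the principal stretches) and the $t_i$'s are the eigenvalues of the Cauchy stress tensor $\mathbf{T}(\mathbf{F}) = (\det\mathbf{F})^{-1}\mathbf{S}(\mathbf{F})\mathbf{F}^T$ (the principal stresses). It can be shown [34] that inequalities (3.8) and (3.9) are a consequence of strong ellipticity H4, cf. (2.20). Here in (3.4)–(3.6), the expressions (3.8) and (3.9) are evaluated at the trivial solution (2.23), viz., $\nu_1 = \nu_2 = \mu^{1/2}$ and $\nu_3 = \lambda$, and
\begin{align}
t &\equiv \frac{\mu}{\lambda^2}, \tag{3.11}\\
X &\equiv \mu^{1/2}\lambda\,\Phi_{,11} + (\mu^{3/2}\lambda + \mu^{1/2}\lambda^3)\Phi_{,12} + (\mu^{1/2}\lambda^2 + \mu^{3/2})\Phi_{,31} \notag\\
&\quad + (\mu^{1/2}\lambda^4 + \mu^{5/2})\Phi_{,23} + \mu^{3/2}\lambda^3\Phi_{,22} + \mu^{3/2}\lambda\,\Phi_{,33} + \mu^{1/2}\Phi_{,3}, \tag{3.12}
\end{align}
and
\[
N \equiv X + t^{1/2}\beta_1. \tag{3.13}
\]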
The following is due to Simpson and Spector [29]:
LEMMA 3.1. Suppose that the quadratic polynomial
\[
p(e) \equiv \beta_1\tau_1 e^2 + (\tau_1\tau_3 + \beta_1^2 - N^2)e + \beta_1\tau_3 \tag{3.14}
\]
has distinct negative real roots $e_1$ and $e_2$. Then, any $C^2$ solution $(\theta, \ell)$ of (3.4), (3.5), and (3.7) can be written as a uniformly and absolutely convergent Fourier series of the form
\[
\theta(r, z) = \sum_{n=1}^{\infty}\theta_n(r)\cos(\rho_n z), \qquad \ell(r, z) = \sum_{n=1}^{\infty}\ell_n(r)\sin(\rho_n z), \tag{3.15}
\]
where $\rho_n = n\pi/L$. The coefficients $\theta_n(r)$ and $\ell_n(r)$ are $C^2$ on $[0, R]$ and
\[
\begin{aligned}
\theta_n(r) &= a_n\theta_{n1}(r) + b_n\theta_{n2}(r), & \ell_n(r) &= a_n\ell_{n1}(r) + b_n\ell_{n2}(r),\\
\theta_{nj}(r) &= rI_1(\rho_n r f_j), & \ell_{nj}(r) &= D_j I_0(\rho_n r f_j),
\end{aligned}
\tag{3.16}
\]
where $f_j = \sqrt{|e_j|}$, $D_j = [\beta_1 - f_j^2\tau_1]/(Nf_j)$, for $j = 1, 2$, and $I_k$, for $k = 0, 1$, are the modified Bessel functions.
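As a small numerical companion to Lemma 3.1, the sketch below computes the roots of (3.14) and the derived constants $f_j$, $D_j$ from given coefficients. The function name `lemma_3_1_quantities` and the sample coefficient values are assumptions made for illustration; the sample values are not taken from the text.

```python
import math

def lemma_3_1_quantities(beta1, tau1, tau3, N):
    """Roots e_j of p(e) in (3.14), with f_j = sqrt(|e_j|) and
    D_j = (beta1 - f_j^2 * tau1) / (N * f_j).

    Raises ValueError when the roots are not distinct, real and negative,
    i.e. when the hypothesis of the lemma fails for these coefficients.
    """
    A = beta1 * tau1
    B = tau1 * tau3 + beta1**2 - N**2
    C = beta1 * tau3
    disc = B * B - 4.0 * A * C
    if disc <= 0.0:
        raise ValueError("roots of p(e) are not real and distinct")
    s = math.sqrt(disc)
    # order the roots so that e[0] is the one closer to zero
    e = sorted([(-B + s) / (2.0 * A), (-B - s) / (2.0 * A)], reverse=True)
    if e[0] >= 0.0:
        raise ValueError("roots of p(e) are not both negative")
    f = [math.sqrt(-ej) for ej in e]
    D = [(beta1 - fj**2 * tau1) / (N * fj) for fj in f]
    return e, f, D

# Illustrative coefficients (assumed values for this example only)
e, f, D = lemma_3_1_quantities(beta1=1.0, tau1=3.0, tau3=5.0, N=2.0 * math.sqrt(2.0))
```

For these sample coefficients the polynomial is $3e^2 + 8e + 5 = (e + 1)(3e + 5)$, so the hypothesis of the lemma is met.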
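To find a nontrivial solution of (3.2), we look for those values of $\lambda \in (0, \infty)$ such that the boundary conditions (3.6) are fulfilled. This, in turn, is equivalent (for a given $\lambda \in (0, \infty)$), to checking for values of $n \in \mathbb{N}$ such that $(\theta_n(r)\cos(\rho_n z), \ell_n(r)\sin(\rho_n z))$ satisfies (3.6). Hence, we have from Lemma 3.1 that a necessary and sufficient condition for the existence of such a number $n$ is that the system
\[
A_n\left[a_n\begin{pmatrix}\theta_{n1}\\ \ell_{n1}\end{pmatrix} + b_n\begin{pmatrix}\theta_{n2}\\ \ell_{n2}\end{pmatrix}\right]_{r=R} = \begin{pmatrix}0\\ 0\end{pmatrix} \tag{3.17}
\]
has nontrivial solutions $a_n, b_n \in \mathbb{R}$. Here $A_n$ is the differential operator
\[
A_n = \begin{pmatrix} \tau_1\dfrac{d}{dr} - \dfrac{2\beta_3}{r} & \rho_n X r\\[2mm] -\rho_n t^{1/2}\beta_1 & \beta_1 r\dfrac{d}{dr} \end{pmatrix}, \tag{3.18}
\]
where $X$, $t$, $\theta_{nj}$, and $\ell_{nj}$ are as defined above, cf. (3.11), (3.12), and (3.16). Note that (3.17) is a linear system with unknowns $(a_n, b_n)$ that can be re-written in matrix form as
\[
\left[A_n\begin{pmatrix}\theta_{n1}\\ \ell_{n1}\end{pmatrix};\ A_n\begin{pmatrix}\theta_{n2}\\ \ell_{n2}\end{pmatrix}\right]_{r=R}\begin{pmatrix}a_n\\ b_n\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}. \tag{3.19}
\]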
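Finally, we conclude that $G_u(\lambda^*, \mathbf{0}) \in L(X, Y)$ has a nontrivial kernel if and only if $\lambda^*$ is a root of the equation (hereafter referred to as the characteristic equation)
\[
h(n, \lambda) \equiv \det\left[A_n\begin{pmatrix}\theta_{n1}\\ \ell_{n1}\end{pmatrix};\ A_n\begin{pmatrix}\theta_{n2}\\ \ell_{n2}\end{pmatrix}\right]_{r=R} = 0, \tag{3.20}
\]
for some $n \in \mathbb{N}$.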
Next suppose that we have a root, $\lambda^* \in (0, \infty)$, of the characteristic equation (3.20), i.e., (3.2) admits a nontrivial solution at $\lambda = \lambda^*$. In order to perform a rigorous local analysis of bifurcation, one must also demonstrate that $G_u(\lambda^*, \mathbf{0})$ is a Fredholm operator (of index zero). This, in turn, depends upon both the ellipticity of the differential operator (cf. H4, (2.20)) and the satisfaction of the complementing condition at each point $\mathbf{x} \in \partial\Omega_L$, cf. (2.20)–(2.22). By virtue of hypothesis H3, cf. [14, (2.19) and Proposition 2.1], we know that the pair $(\mathsf{C}(\mathbf{H}_\lambda), \mathbf{n}(\mathbf{x}))$ satisfies the strong complementing condition for every $\mathbf{x} \in \partial\Omega_L$ and for each $\lambda$ sufficiently close to $\lambda = 1$. To obtain explicit conditions under which our linearized problem satisfies the complementing condition we note that the principal parts of the equation (3.4) and the boundary conditions (3.6), with the coefficients evaluated at a point in $\partial\Omega_\infty$ (cf. (2.38)), are given by
\[
\left.
\begin{aligned}
\frac{\tau_1\theta_{rr}}{R} + \frac{\beta_1\theta_{zz}}{R} + N\ell_{rz} &= 0\\
\beta_1 R\,\ell_{rr} + R\tau_3\ell_{zz} + N\theta_{rz} &= 0
\end{aligned}
\right\},\quad r > 0, \quad\text{and} \tag{3.21}
\]
\[
\left.
\begin{aligned}
R\,\ell_r(0, z) + t^{1/2}\theta_z(0, z) &= 0\\
\tau_1\theta_r(0, z) + XR\,\ell_z(0, z) &= 0
\end{aligned}
\right\}, \tag{3.22}
\]
respectively.
In view of (2.22), we need to check whether the system (3.21)–(3.22) has solutions of the form
\[
\theta(r, z) = w_1(r)e^{i\alpha z}, \qquad \ell(r, z) = w_2(r)e^{i\alpha z}, \tag{3.23}
\]
with $w_1$ and $w_2$ bounded, and $\alpha, z \in \mathbb{R}$. Substitution of (3.23) into (3.21)–(3.22) yields
\[
\left.
\begin{aligned}
\tau_1 w_1''(r) - \beta_1\alpha^2 w_1(r) &= -NR\alpha i\,w_2'(r)\\
\beta_1 R\,w_2''(r) - R\tau_3\alpha^2 w_2(r) &= -N\alpha i\,w_1'(r)
\end{aligned}
\right\},\quad r > 0, \tag{3.24}
\]
together with
\[
R\,w_2'(0) + t^{1/2}\alpha i\,w_1(0) = 0 \quad\text{and}\quad \tau_1 w_1'(0) + XR\alpha i\,w_2(0) = 0. \tag{3.25}
\]
Note that the system (3.24)–(3.25) is a constant coefficients system. Accordingly, we look for solutions of the form
\[
w_1(r) = A_1 e^{\xi r}, \qquad w_2(r) = A_2 e^{\xi r}, \tag{3.26}
\]
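which upon substitution into (3.24) yields
\[
\begin{aligned}
\left(\tau_1\xi^2 - \beta_1\alpha^2\right)A_1 + NR\alpha i\xi A_2 &= 0,\\
N\alpha i\xi A_1 + \left(\beta_1 R\xi^2 - R\tau_3\alpha^2\right)A_2 &= 0.
\end{aligned}
\tag{3.27}
\]
The linear system of equations (3.27) has nontrivial solutions iff
\[
\tau_1\beta_1\left(\frac{\xi}{\alpha}\right)^4 - \left(\tau_1\tau_3 + \beta_1^2 - N^2\right)\left(\frac{\xi}{\alpha}\right)^2 + \beta_1\tau_3 = 0. \tag{3.28}
\]
Upon extracting the roots of which, we conclude that $w_1$ and $w_2$ are solutions of (3.24) iff
\[
\xi = \pm\alpha\sqrt{-e_j}, \quad j = 1, 2, \tag{3.29}
\]
where $e_1$ and $e_2$ are the roots of (3.14). Therefore, the general solution of (3.24), satisfying the boundedness condition, is given by
\[
\begin{aligned}
w_1(r) &= Ae^{\alpha\sqrt{-e_1}\,r} + Be^{\alpha\sqrt{-e_2}\,r},\\
w_2(r) &= -\left[\frac{AD_1}{R}e^{\alpha\sqrt{-e_1}\,r} + \frac{BD_2}{R}e^{\alpha\sqrt{-e_2}\,r}\right]i,
\end{aligned}
\tag{3.30}
\]
assuming, without loss of generality, that $\operatorname{Re}(\alpha\sqrt{-e_j}) \le 0$, $j = 1, 2$, cf. (3.29). Finally, from (3.30) and the boundary conditions (3.25), we arrive at
\[
\begin{pmatrix} t^{1/2} - D_1\sqrt{-e_1} & t^{1/2} - D_2\sqrt{-e_2}\\ \tau_1\sqrt{-e_1} + XD_1 & \tau_1\sqrt{-e_2} + XD_2 \end{pmatrix}\begin{pmatrix}A\\ B\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}, \tag{3.31}
\]
which has nontrivial solutions when the determinant of the coefficient matrix is equal to zero. This yields:
PROPOSITION 3.2. The pair $(\mathsf{C}(\mathbf{H}_\lambda), \mathbf{n}(\mathbf{x}))$ satisfies the complementing condition at $\lambda \in (0, \infty)$ if and only if
\[
g(\lambda) \equiv \left(t^{1/2} - D_1\sqrt{-e_1}\right)\left(\tau_1\sqrt{-e_2} + XD_2\right) - \left(t^{1/2} - D_2\sqrt{-e_2}\right)\left(\tau_1\sqrt{-e_1} + XD_1\right) \neq 0. \tag{3.32}
\]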
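The criterion of Proposition 3.2 is a plain $2 \times 2$ determinant, so it is easy to evaluate once the coefficients are known. The following is a minimal sketch; the function name `complementing_g`, the tolerance, and the sample inputs are assumptions made for illustration and are not taken from the text.

```python
import math

def complementing_g(t, tau1, X, e, D):
    """Left-hand side of (3.32): determinant of the coefficient matrix in (3.31),
    for roots e = (e1, e2) of (3.14) and constants D = (D1, D2) of Lemma 3.1."""
    rt = math.sqrt(t)
    r1, r2 = math.sqrt(-e[0]), math.sqrt(-e[1])
    return ((rt - D[0] * r1) * (tau1 * r2 + X * D[1])
            - (rt - D[1] * r2) * (tau1 * r1 + X * D[0]))

# Illustrative inputs (assumed values for this example only)
t = 2.0
tau1, X = 3.0, math.sqrt(2.0)
e = (-1.0, -5.0 / 3.0)
D = (-1.0 / math.sqrt(2.0), -math.sqrt(2.0) / math.sqrt(5.0 / 3.0))

g_value = complementing_g(t, tau1, X, e, D)
# The complementing condition holds at these inputs when g is nonzero;
# the threshold 1e-12 is an arbitrary numerical tolerance.
complementing_condition_holds = abs(g_value) > 1e-12
```

In floating-point arithmetic an exact zero test is not meaningful, which is why the sketch compares against a small tolerance rather than against zero.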
4. Examples
It is difficult to study both the characteristic equation (3.20) and the complementing condition inequality (3.32) for arbitrary hyperelastic homogeneous isotropic materials. Simpson and Spector studied the characteristic equation (3.20) for Green–Hadamard materials in [29] and for Blatz–Ko materials in [28]. In this section we introduce these materials as examples verifying the general conditions of the previous section. We will briefly review the results of Simpson and Spector for the corresponding characteristic equations and consider the complementing condition inequality (3.32). We only consider values of $\lambda$ in the interval $(0, 1]$.
The constitutive stored-energy function for Green–Hadamard materials is given by
\[
W(\mathbf{F}) = \frac{a}{2}\,\mathbf{F}\cdot\mathbf{F} + \frac{b}{4}\left[(\mathbf{F}\cdot\mathbf{F})^2 - \mathbf{F}\mathbf{F}^T\cdot\mathbf{F}\mathbf{F}^T\right] + \Gamma(\det\mathbf{F}). \tag{4.1}
\]
We assume, as in [29], that
\[
a > 0,\ b \ge 0, \qquad (s\,\Gamma'(s))' \ge 0 \ \text{ for all } s \in (0, 1], \qquad\text{and}\qquad \Gamma'(1) = -a - 2b. \tag{4.2}
\]
As remarked in [29] we have that conditions (4.2) imply strong ellipticity for the elasticity tensor at every $\mathbf{F} \in GL^+(\mathbb{R}^3)$ satisfying $\det\mathbf{F} \le 1$. Condition (4.2)$_2$ is used to prove that the energy becomes infinite as $\det\mathbf{F}$ goes to zero and (4.2)$_3$ is equivalent to the reference configuration being natural.
For Green–Hadamard materials we have:⋆
\[
\begin{aligned}
\tau_1 &= a + b(\mu + \lambda^2) + q, & \tau_3 &= a + 2b\mu + tq,\\
\beta_1 &= a + b\mu, & N &= t^{1/2}\left[b\lambda^2 + q\right],\\
\beta_3 &= a + b\lambda^2, & X &= t^{1/2}\left[-a - b\mu + b\lambda^2 + q\right],\\
q &= \mu\lambda^2\,\Gamma''(\mu\lambda), & t &= \mu\lambda^{-2},\\
\mu(1) &= 1.
\end{aligned}
\tag{4.3}
\]
For this material the roots of (3.14) are given by
\[
e_1 = -1, \qquad e_2 = -\frac{a + tq + 2b\mu}{a + q + b(\mu + \lambda^2)}. \tag{4.4}
\]
It can be shown, see [29], that $e_2(\lambda) < -1$, for every $\lambda \in (0, 1)$, and that $d\mu/d\lambda \le 0$, and $1 \le \mu(\lambda)$ for every $\lambda \in (0, 1]$. Hence, in particular, we have $e_1$ not equal to $e_2$, and we can apply Lemma 3.1. Making use of (4.3), (4.4), and (3.16)$_{3,4}$, equation (3.20) reduces to
\[
h(n, \lambda) = v(\rho f) - \frac{4t}{(1 + t)^2}\,f^2\,v(\rho) + \frac{2(t - 1)}{(1 + t)^2}\,\frac{a + b\lambda^2}{a + b\mu}\,f^2 = 0, \tag{4.5}
\]
where
\[
v(r) \equiv rI_0(r)/I_1(r), \qquad f^2 = 1 + \frac{(t - 1)(q + b\lambda^2)}{(q + b\lambda^2) + (a + b\mu)}, \tag{4.6}
\]
and $\rho = \rho_n R$ (recall that $\rho_n = n\pi/L$). In [28] Simpson and Spector showed that for each $n \in \mathbb{N}$ there exists a $\lambda_n \in (0, 1)$ such that (4.5) is satisfied. In other words, if $W$ is given by (4.1) and (4.2), then for each $n \in \mathbb{N}$ there exists a $\lambda_n \in (0, 1)$ such that (3.4)–(3.7) has a solution that is a linear combination of $(\theta_{ni}, \ell_{ni})$ as given by (3.16). Hence, for those values of $\lambda$, our linearized problem (3.2) admits nontrivial solutions.
⋆ See the previous sections for the notation.
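The Green–Hadamard formulas can be cross-checked against the general definitions by direct arithmetic. The sketch below evaluates the coefficients (4.3) for sample inputs and tests three identities: (3.13), the roots (4.4) of the polynomial (3.14), and the expression (4.6) for $f^2 = -e_2$. The function names and the sample values of $a$, $b$, $\mu$, $\lambda$, $q$ are assumptions for illustration; in particular $\mu$ and $q$ are treated here as free inputs rather than computed from a specific $\Gamma$.

```python
import math

def green_hadamard_coefficients(a, b, mu, lam, q):
    """Coefficients (4.3) as functions of a, b, mu, lam and q."""
    t = mu / lam**2
    rt = math.sqrt(t)
    return {
        "t": t,
        "tau1": a + b * (mu + lam**2) + q,
        "tau3": a + 2.0 * b * mu + t * q,
        "beta1": a + b * mu,
        "beta3": a + b * lam**2,
        "N": rt * (b * lam**2 + q),
        "X": rt * (-a - b * mu + b * lam**2 + q),
    }

def p(e, c):
    """Quadratic polynomial (3.14) evaluated at e."""
    return (c["beta1"] * c["tau1"] * e**2
            + (c["tau1"] * c["tau3"] + c["beta1"]**2 - c["N"]**2) * e
            + c["beta1"] * c["tau3"])

# Illustrative inputs (assumed values for this example only)
a, b, mu, lam, q = 1.0, 0.5, 1.3, 0.8, 0.7
c = green_hadamard_coefficients(a, b, mu, lam, q)

e1 = -1.0
e2 = -(a + c["t"] * q + 2.0 * b * mu) / (a + q + b * (mu + lam**2))   # (4.4)
f_squared = 1.0 + (c["t"] - 1.0) * (q + b * lam**2) / ((q + b * lam**2) + (a + b * mu))  # (4.6)

residual_N = c["N"] - (c["X"] + math.sqrt(c["t"]) * c["beta1"])   # (3.13)
residual_e1 = p(e1, c)
residual_e2 = p(e2, c)
residual_f = f_squared + e2
```

All four residuals vanish up to rounding for any admissible inputs, since the identities rely only on $t = \mu\lambda^{-2}$.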
For Green–Hadamard materials, to determine the values of $\lambda \in (0, 1)$ for which our linearized problem (3.2) fails to satisfy the complementing condition we substitute (4.3) and (4.4) in the left hand side of (3.32) and solve
\[
g(\lambda) \equiv \left[a + q + b(\mu + \lambda^2)\right]\left[(t + 1)^4 + 16t^2 e_2\right] = 0, \tag{4.7}
\]
for $\lambda \in (0, 1)$.
PROPOSITION 4.1. For Green–Hadamard materials satisfying conditions (4.2), the complementing condition is always violated at least once, i.e., there exists $\lambda_c \in (0, 1)$ such that the complementing condition fails at $\lambda = \lambda_c$.
Proof. It is easy to see that at $\lambda = 1$ we have $e_2 = -1$ and $t = 1$, from which it follows that $g(1) = 0$. On the other hand, $g'(1) > 0$, and $\lim_{\lambda\to 0^+} g(\lambda) = +\infty$. Hence $g$ is negative just to the left of $\lambda = 1$ and positive near $\lambda = 0$, so, using the continuity of $g$, we conclude that there exists $\lambda_c \in (0, 1)$ such that $g(\lambda_c) = 0$. ✷
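The passage from (3.32) to (4.7) is not spelled out above. The following is a reader's sketch of one way to see it, using only (4.3), (4.4) and the definitions of $f_j$ and $D_j$ in Lemma 3.1, with $f_1 = 1$ and $f \equiv f_2 = \sqrt{-e_2}$; it should be checked against [29] rather than read as the authors' own derivation. Since $f^2\tau_1 = \tau_3$ and $b\mu = tb\lambda^2$, one finds
\begin{align*}
D_1 &= \frac{\beta_1 - \tau_1}{N} = -t^{-1/2}, &
D_2 &= \frac{\beta_1 - \tau_3}{Nf} = -\frac{t^{1/2}}{f},\\
\tau_1 + XD_1 &= 2\beta_1, &
\tau_1 f + XD_2 &= \frac{\beta_1(1 + t)}{f},\\
t^{1/2} - D_1 &= \frac{1 + t}{t^{1/2}}, &
t^{1/2} - D_2 f &= 2t^{1/2},
\end{align*}
so that the left-hand side of (3.32) equals
\[
\frac{\beta_1}{t^{1/2}f}\left[(1 + t)^2 - 4tf\right].
\]
Both $(1 + t)^2$ and $4tf$ are positive, so this expression vanishes exactly when $(1 + t)^4 = 16t^2f^2 = -16t^2e_2$, that is, when $(t + 1)^4 + 16t^2e_2 = 0$.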
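The constitutive stored-energy function for Blatz–Ko materials (see [6]) is a special case of (4.1) corresponding to $a = 1$, $b = 0$, and $\Gamma(s) = \frac{1}{m}s^{-m}$, with $m > 0$, i.e.
\[
W(\mathbf{F}) = \frac{1}{2}\,\mathbf{F}\cdot\mathbf{F} + \frac{1}{m}(\det\mathbf{F})^{-m}. \tag{4.8}
\]
For this material the formulas (4.3) and (4.4) are reduced to:
\[
\begin{aligned}
\mu &= \hat\mu(\lambda) = \lambda^{-m/(m+1)}, & \beta_1 &= \beta_3 = 1,\\
\tau_1 &= m + 2, & \tau_3 &= 1 + (m + 1)\lambda^{-(3m+2)/(m+1)},\\
t &= \mu\lambda^{-2} = \lambda^{-(3m+2)/(m+1)}, & X &= m\lambda^{-(3m+2)/(2(m+1))},\\
N &= (m + 1)\lambda^{-(3m+2)/(2(m+1))}
\end{aligned}
\tag{4.9}
\]
and
\[
e_1 = -1, \qquad e_2 = -\frac{1}{m + 2}\left[1 + (m + 1)\lambda^{-(3m+2)/(m+1)}\right]. \tag{4.10}
\]
The characteristic equation (4.5) reduces to (cf. (4.6)),
\[
h(n, \lambda) = -4t^{1/2}v(\rho) + 2\left(t^{1/2} - t^{-1/2}\right) + \left(t^{1/2} + t^{-1/2}\right)\left(m + 2 + \frac{tm}{e_2}\right)v\!\left(\rho\sqrt{-e_2}\right) = 0. \tag{4.11}
\]
Equation (4.11) is much easier to analyze than (4.5), but the analysis is not trivial.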
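The Blatz–Ko formulas lend themselves to a quick numerical sanity check. The sketch below assumes the expression (3.10) for the principal Cauchy stresses with $\Phi_{,1} = 1$, $\Phi_{,2} = 0$ and $\Phi_{,3} = \Gamma'(\det\mathbf{F})$, which is how the Blatz–Ko energy fits the representation (2.14); the function name and the sample values of $\lambda$ and $m$ are illustrative assumptions. It verifies that $\hat\mu(\lambda) = \lambda^{-m/(m+1)}$ makes the lateral principal stress vanish on the trivial branch and that $q = \mu\lambda^2\Gamma''(\mu\lambda)$ equals $m + 1$.

```python
def blatz_ko_trivial_branch(lam, m):
    """Quantities on the homogeneous branch for Gamma(s) = s**(-m) / m.

    Returns (mu, t1, t, q), where t1 is the lateral principal Cauchy stress
    from (3.10) with nu1 = nu2 = sqrt(mu), nu3 = lam.
    """
    mu = lam ** (-m / (m + 1.0))
    J = mu * lam                                   # det F on the trivial branch
    gamma_prime = -J ** (-(m + 1.0))               # Gamma'(J)
    gamma_second = (m + 1.0) * J ** (-(m + 2.0))   # Gamma''(J)
    t1 = gamma_prime + mu / J                      # Phi_3 + nu1^2 * Phi_1 / (nu1 nu2 nu3)
    t = mu / lam**2
    q = mu * lam**2 * gamma_second
    return mu, t1, t, q

# Illustrative inputs (assumed values for this example only)
lam, m = 0.6, 2.0
mu, t1, t, q = blatz_ko_trivial_branch(lam, m)
```

With $q = m + 1$ in hand, the remaining entries of (4.9) follow from (4.3) with $a = 1$, $b = 0$.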
Simpson and Spector showed in [28] that for each n ∈ N there exists a unique
484
T. J. HEALEY AND E. L. MONTES-PIZARRO
λn ∈ (0, 1) such that (4.11) is satisfied. Hence, there exists an infinite sequence⋆
(λn ) ⊂ (0, 1) for which Gu (λn , 0) = 0 has nontrivial solutions.
Condition (4.7) can be written as the following polynomial equation on t,
(cf. (4.9)):
(t − 1) (m + 2)t 3 − (11m + 6)t 2 − 5(m + 2)t − (m + 2) = 0.
(4.12)
The complementing condition fails at those values of λ ∈ (0, 1) for which
t = λ^{−(3m+2)/(m+1)} is a root of (4.12). Since t > 1 for λ ∈ (0, 1), the root t = 1
is excluded. The coefficients of the cubic factor in (4.12) have signs +, −, −, −,
so by Descartes’ rule of signs it has exactly one positive root. The cubic equals
−16(m + 1) < 0 at t = 1 and tends to +∞ as t → ∞, so this root is greater than
one. Hence, we have shown:**
PROPOSITION 4.2. For Blatz–Ko materials the complementing condition is violated
at exactly one value, λc, of λ ∈ (0, 1), where t = λc^{−(3m+2)/(m+1)} is the only
positive root (it is greater than one) of (m + 2)t^3 − (11m + 6)t^2 − 5(m + 2)t − (m + 2).
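As a numerical illustration of Proposition 4.2, the following sketch locates the positive root of the cubic factor in (4.12) by bisection and converts it to λc through t = λc^{−(3m+2)/(m+1)}. The function names, the bracketing strategy and the sample values of m are illustrative assumptions and are not taken from [28] or [10].

```python
def cubic_factor(t, m):
    """Cubic factor of (4.12): (m+2)t^3 - (11m+6)t^2 - 5(m+2)t - (m+2)."""
    return (m + 2) * t**3 - (11 * m + 6) * t**2 - 5 * (m + 2) * t - (m + 2)


def critical_values(m, iterations=200):
    """Return (t_c, lambda_c) for an exponent m > 0 (hypothetical helper)."""
    if m <= 0:
        raise ValueError("m must be positive")
    lo, hi = 1.0, 2.0                 # the cubic is negative at t = 1
    while cubic_factor(hi, m) <= 0.0:
        hi *= 2.0                     # enlarge the bracket until the sign changes
    for _ in range(iterations):       # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if cubic_factor(mid, m) <= 0.0:
            lo = mid
        else:
            hi = mid
    t_c = 0.5 * (lo + hi)
    # Invert t = lambda^{-(3m+2)/(m+1)} to recover the critical compression.
    lambda_c = t_c ** (-(m + 1.0) / (3.0 * m + 2.0))
    return t_c, lambda_c


if __name__ == "__main__":
    for m in (0.5, 1.0, 2.0):
        t_c, lambda_c = critical_values(m)
        print(f"m = {m}: t_c = {t_c:.6f}, lambda_c = {lambda_c:.6f}")
```

Bisection is used here only because the sign pattern of the cubic guarantees a single sign change on (1, ∞); any bracketing root finder would serve equally well.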
The next theorem was proved in [28].
THEOREM 4.3. The infinite sequence (λn) ⊂ (0, 1) for which Gu(λn, 0) = 0 has
nontrivial solutions satisfies the following properties:‡
1. Each λn, n ∈ N, is a simple root of (4.11).
2. ∃N ∈ N such that (λn)_{n≥N} ⊂ (λc, 1).
3. lim_{n→∞} λn = λc.
4. dim ker Gu(λn, 0) = 1, for n < N.
5. 1 ≤ dim ker Gu(λn, 0) ≤ 2 for n ≥ N, i.e., at most two linear modes can occur
simultaneously.
REMARK 4.4. For Blatz–Ko materials we also remark:
1. It is clear from the results of Simpson and Spector (see Section 5 and Figure 1 in [28, p. 111]) that for Blatz–Ko materials dn ≡ dim ker Gu(λn, 0) is
generically equal to one.
2. It is interesting to note that there exist values of R and L, respectively the
radius and the height of the cylinder, for which λc = λn0 , for exactly one
λn0 ∈ (λn ). In other words, for cylinders with those dimensions the linearized
problem Gu (λc , 0)[h] = 0 has a nontrivial solution.
⋆ The index corresponds to the enumeration of the linear modes.
** This result was also obtained in [10].
‡ Although this theorem was proved in [28], the authors of that paper do not discuss the issue of
the violation of the complementing condition at λ = λc.
5. Local Bifurcation
We denote the Fréchet derivative of G with respect to its second argument by
Gu(λ, u) ≡ L(λ, u) ∈ L(X, Y), where
    L(λ, u)[h] = (A(λ, u)[h], B(λ, u)[h]) ∈ Y0 × Y1 ∀h ∈ X.    (5.1)
Assuming that λ ∈ (0, 1] is such that (3.32) holds, we get
    ‖h‖_X ≤ C (‖A(λ, 0)[h]‖_{Y0} + ‖B(λ, 0)[h]‖_{Y1} + ‖h‖_{Y0})
          ≤ C (‖L(λ, 0)[h]‖_Y + ‖h‖_{Y0}),    (5.2)
for every h ∈ X, where C > 0 is independent of h. By a Lemma of Peetre [23],
we conclude from (5.2) that L(λ, 0) is a semi-Fredholm operator. A standard homotopy argument (see the proof of Proposition 3.1 in [14]) shows that in this case
L(λ, 0) is a Fredholm operator of index zero. In fact the following theorem is true
(see [30]):
THEOREM 5.1. If L(λ∗ , 0) satisfies the strong ellipticity condition, and the complementing condition for λ∗ ∈ (0, ∞), then L(λ∗ , 0) is self-adjoint and Fredholm
of index zero.
In our problem we cannot say in general for which values of λ ∈ (0, 1] the equation
L(λ, 0)[h] = 0 has nontrivial solutions, or for which values of λ the complementing condition fails. But, as we saw in the previous section, we can verify the
hypotheses of the following theorem for particular materials.
THEOREM 5.2 (Local Bifurcation Theorem). Consider G: U → Y and suppose
that λ∗ ∈ (0, 1) is such that:
1. dim ker L(λ∗ , 0) = 1.
2. λ∗ satisfies (3.32), i.e. the linearized problem L(λ∗ , 0)[h] = 0 satisfies the
complementing condition.
3. If ker L(λ∗, 0) = span{h∗} and M ≡ G_{u,λ}(λ∗, 0), then
    M h∗ ∉ range(L(λ∗, 0)),    (5.3)
which is usually called “the strict crossing condition”.
Then, (λ∗ , 0) is a bifurcation point of a local continuous branch of nontrivial
solutions of G(λ, u) = 0.
Proof. In view of the Fredholm property (cf. Theorem 5.1), the proof of this
theorem is well known, cf., e.g., [8, 3].
✷
The condition (5.3) can be rewritten in a form which is easier to verify in the
context of our problem, cf. [31].
LEMMA 5.3. Let us assume that λ∗ satisfies the hypothesis of Theorem 5.2. Then
condition (5.3) is equivalent to:
    (d/dλ) ∫_Ω ∇h∗ · C(Hλ)[∇h∗] |_{λ=λ∗} ≠ 0,    (5.4)
which in turn is equivalent to λ = λ∗ being a simple root of the characteristic
equation, cf. (3.20):
    f(n, λ) = det [ B_n(θ_{n1}, ℓ_{n1})^T ; B_n(θ_{n2}, ℓ_{n2})^T ]_{r=R} = 0.    (5.5)
Proof. Consider the linear functional ψ: Y → R given by
    ψ(h, g) = −∫_Ω h∗ · h + ∫_{∂Ω_L} h∗ · g.
If (h, g) ∈ range(L(λ∗, 0)), then ∃v ∈ Xs such that L(λ∗, 0)[v] = (h, g), i.e., such
that ∇ · C(Hλ∗)[∇v] = h in Ω, and C(Hλ∗)[∇v]n = g on ∂Ω_L. Now,
making use of (2.16), integration by parts, and that h∗ ∈ Xs we get
    ψ(h, g) = ψ(∇ · C(Hλ∗)[∇v], C(Hλ∗)[∇v]n)
            = −∫_Ω h∗ · ∇ · C(Hλ∗)[∇v] + ∫_{∂Ω_L} h∗ · C(Hλ∗)[∇v]n
            = ∫_Ω ∇h∗ · C(Hλ∗)[∇v] = ∫_Ω ∇v · C(Hλ∗)[∇h∗]
            = −∫_Ω v · (∇ · C(Hλ∗)[∇h∗]) + ∫_{∂Ω_L} v · C(Hλ∗)[∇h∗]n
            = 0.
Using the above computation, the hypothesis that L(λ∗, 0) is Fredholm of index zero
with one-dimensional kernel, and self-adjointness, we conclude that
ker ψ = range L(λ∗, 0). Therefore, Guλ(λ∗, 0)[h∗] ∉ range L(λ∗, 0) if and only if
Guλ(λ∗, 0)[h∗] ∉ ker ψ, i.e., if and only if
    (d/dλ) [ −∫_Ω h∗ · ∇ · C(Hλ)[∇h∗] + ∫_{∂Ω_L} h∗ · C(Hλ)[∇h∗]n ]_{λ=λ∗} ≠ 0,
which is condition (5.4) after an integration by parts.
For the other equivalence, let
    u_λ^{(i)}(r, z) = (θ_{ni}(r) cos(ρ_n z), ℓ_{ni}(r) sin(ρ_n z))^T,
for i = 1, 2, be two solutions of (3.4), (3.5), and (3.7), with θ_{ni}(r) and ℓ_{ni}(r) given
by (3.16)_{3,4}, continuously differentiable in λ and of mode number n of h∗ for each
λ in a neighborhood of λ∗. Suppose h∗ = c1 u_{λ∗}^{(1)} + c2 u_{λ∗}^{(2)} with c1, c2 ∈ R (not both
zero). If we define uλ = c1 u_λ^{(1)} + c2 u_λ^{(2)}, then
    (d/dλ) ∫_Ω ∇h∗ · C(Hλ)[∇h∗] |_{λ=λ∗} = (d/dλ) ∫_Ω ∇uλ · C(Hλ)[∇uλ] |_{λ=λ∗}.
Performing an integration by parts in the last integral and using (3.5) we get
    ∫_Ω ∇uλ · C(Hλ)[∇uλ] = −∫_Ω uλ · ∇ · C(Hλ)[∇uλ] + ∫_{∂Ω_L} uλ · C(Hλ)[∇uλ]n.
Since by hypothesis ∇ · C(Hλ)[∇u_λ^{(i)}] = 0, then
    (d/dλ) ∫_Ω ∇uλ · C(Hλ)[∇uλ] |_{λ=λ∗}
        = (d/dλ) ∫_{∂Ω_L} uλ · C(Hλ)[∇uλ]n |_{λ=λ∗}
        = (d/dλ) ∫_0^L (uλ · C(Hλ)[∇uλ]n)|_{r=R} |_{λ=λ∗}
        = L (d/dλ) (uλ · C(Hλ)[∇uλ]n)|_{r=R} |_{λ=λ∗}
        = L (d/dλ) (c · Θ^T Aλ c^T) |_{λ=λ∗},
where
    Θ = [ θ_{n1}(λ)  θ_{n2}(λ) ; ℓ_{n1}(λ)  ℓ_{n2}(λ) ],    c = (c1, c2),
and
    Aλ = [ A_n(θ_{n1}(λ), ℓ_{n1}(λ))^T ; A_n(θ_{n2}(λ), ℓ_{n2}(λ))^T ]_{r=R}.
By construction we have that c^T is a null vector of Θ^T Aλ∗. But Θ^T is invertible,
which implies that c^T is also a null vector of Aλ∗. Note that
    (d/dλ) (c · Θ^T Aλ c^T) |_{λ=λ∗} = c · [ (dΘ^T/dλ) Aλ + Θ^T (dAλ/dλ) ] c^T |_{λ=λ∗}
                                     = c · Θ^T (dAλ/dλ) c^T |_{λ=λ∗},
which is nonzero if and only if (d/dλ) det Aλ |_{λ=λ∗} ≠ 0.
Recall that dim ker L(λ, 0) = dim ker Aλ. Hence, by hypothesis (1) of Theorem 5.2,
zero is a simple eigenvalue of Θ^T Aλ∗. But Θ^T is invertible, therefore zero
is a simple eigenvalue of Aλ∗, i.e., λ∗ is a root of det Aλ, and we have shown that it is
simple if and only if (5.4) holds.
✷
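The second equivalence in Lemma 5.3 has a simple finite-dimensional analogue, which the sketch below checks numerically: for a smooth family of symmetric 2 × 2 matrices A(λ) whose kernel at λ∗ is spanned by a vector c, the quantity c · A′(λ∗)c is nonzero exactly when λ∗ is a simple root of det A(λ). The two matrix families, the finite-difference step and all function names are hypothetical; they are not the matrices of the proof above.

```python
def det2(a):
    """Determinant of a 2 x 2 matrix stored as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def crossing_number(family, lam_star, c):
    """Finite-difference value of c . A'(lam_star) c for a matrix family."""
    def quadratic_form(lam):
        a = family(lam)
        return sum(c[i] * a[i][j] * c[j] for i in range(2) for j in range(2))
    return derivative(quadratic_form, lam_star)


def transversal(lam):
    """det = lam - 0.54, so 0.54 is a simple root; kernel spanned by (1, -0.2)."""
    return [[lam - 0.5, 0.2], [0.2, 1.0]]


def degenerate(lam):
    """det = (lam - 0.54)**2, so 0.54 is a double root; kernel spanned by (1, 0)."""
    return [[(lam - 0.54) ** 2, 0.0], [0.0, 1.0]]


if __name__ == "__main__":
    lam_star = 0.54
    for name, family, c in (("transversal", transversal, (1.0, -0.2)),
                            ("degenerate", degenerate, (1.0, 0.0))):
        crossing = crossing_number(family, lam_star, c)
        det_slope = derivative(lambda lam: det2(family(lam)), lam_star)
        print(f"{name}: c.A'c = {crossing:.3e}, d(det A)/d(lam) = {det_slope:.3e}")
```

In the first family both quantities are nonzero, in the second both vanish, which mirrors the statement that the strict crossing condition holds if and only if the root is simple.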
Combining Theorem 4.3 with Remark 4.4 we notice that for Blatz–Ko materials, Theorem 5.2 implies the existence of an infinite sequence of bifurcating
branches of nontrivial solutions of G(λ, u) = 0 bifurcating from (λn , 0), where
limn λn = λc , and (λn ) is enumerated by the number of the corresponding linear
mode, cf. Proposition 4.2 and Theorem 4.3. A similar phenomenon was observed by
Rabier and Oden in their study of bifurcation of steady-state motions of a spinning
hyperelastic incompressible cylinder, cf. [25], and by Simpson and Spector in their
study of buckling of a rectangular rod, cf. [31].
6. Global Bifurcation
In this section we demonstrate that the conditions of Section 5 ensuring local
bifurcation also yield global bifurcation results for our problem
    G(λ, u) = 0,    (6.1)
cf. (2.41). As a first step, we define a set of admissible solutions appropriate for
our analysis:
    A = {(λ, u) ∈ (0, ∞) × X: Hλ + ∇u(x) ∈ GL⁺(R³) ∀x ∈ Ω̄,
         and |d(Hλ + ∇u(x), x)| > 0 ∀x ∈ ∂Ω_L},    (6.2)
where “d” refers to the determinant involved in the definition of the complementing
condition, cf. (3.32). We define O to be the maximal connected set in A containing
the point (λ, u) ≡ (1, 0), i.e.,
    O = comp{(1, 0)} in A.    (6.3)
For each δ > 0, we also define
    Oδ = {(λ, u) ∈ A: det(Hλ + ∇u(x)) > δ ∀x ∈ Ω̄,
          and |d(Hλ + ∇u(x), x)| > δ ∀x ∈ ∂Ω_L}.    (6.4)
Clearly Oδ ⊂ (0, ∞) × X is open, and Ōδ ⊂ O, for each δ > 0. Moreover,
O = ⋃_{δ>0} Oδ, and thus O ⊂ (0, ∞) × X is open.
We now state our main theorem.
THEOREM 6.1. Assume the hypotheses of Theorem 5.2. Let S ⊂ A denote the
closure of the set of nontrivial solution pairs (λ, u) of (6.1). Let C ⊂ S denote the
(connected) component of S containing the bifurcation point (λ∗ , 0). Then at least
one of the following holds:
1. C is unbounded in (0, ∞) × X.
2. (λ0, 0) ∈ C, where λ0 ∈ (0, ∞) and λ0 ≠ λ∗.
3. C ⊄ Oδ, for each δ > 0.
To prove Theorem 6.1, we first fix δ > 0. Observe that the smoothness assumptions of Section 2 insure that G: Oδ → Y is of class C², cf. [35]. In particular,
we denote the Fréchet derivative of G with respect to its second argument by
Gu(λ, u) ≡ L(λ, u) ∈ L(X, Y), where (cf. (5.1))
    L(λ, u)[h] = (A(λ, u)[h], B(λ, u)[h]) ∈ Y0 × Y1 ∀h ∈ X.    (6.5)
A straightforward calculation shows that the principal parts of the linear operators
A(λ, u) and B(λ, u) (for fixed (λ, u) ∈ Oδ) are given by
    A(λ, u)[h] = C(Hλ + ∇u(x))[∇²h] + · · · in Ω,
and
    B(λ, u)[h] = C(Hλ + ∇u(x))[∇h]n(x) + · · · on ∂Ω_L.    (6.6)
We now summarize three crucial properties of the mapping G: Oδ → Y,
which enable the construction of a degree having the capability of detecting global
bifurcation. In what follows W ⊂ Oδ is open and bounded.
PROPOSITION 6.2. For each (λ, u) ∈ W , L(λ, u) ∈ L(X, Y) is a Fredholm
operator of index zero, i.e., the (finite) dimension of the null space, N(L(λ, u)), is
equal to the co-dimension of the range, R(L(λ, u)).
Proof. The Schauder estimates and a theorem of Peetre [23] imply that the
dimension of N(L(λ, u)) is finite and that R(L(λ, u)) is closed. A homotopy
argument, using the stability of the Fredholm index (on the connected set O), then
yields the result, cf. [14, Proposition 4.3].
✷
The next proposition requires some additional notation. For each (λ, u) ∈ Oδ ,
note that the linear operator A(λ, u) with domain
Zλ,u ≡ {h ∈ X: B(λ, u)[h] = 0}
(6.7)
is closed in Y0 .
PROPOSITION 6.3. For each (λ, u) ∈ W, there are positive constants ε, C1, C2,
independent of λ, u, µ, and h, such that
    ‖h‖_X ≤ C1 (|µ|^{α/2} ‖(A(λ, u) − µ)[h]‖_{Y0} + |µ|^{(1+α)/2} ‖B(λ, u)[h]‖_{Y1}),    (6.8)
for all h ∈ X, and for all µ ∈ C satisfying |arg(µ)| ≤ π/2 + ε and |µ| ≥ C2,
where α ∈ (0, 1) is the Hölder exponent inherent in X and Y.
Proof. The main observation here is that for all (λ, u) ∈ W, the pair (C(Hλ +
∇u(x)), n(x)) satisfies the strong complementing condition at each x in ∂Ω. This
follows from the path connectedness of (λ, u) and (1, 0) in O and the fact that
the pair (C(I), n(x)) satisfies Agmon’s condition, cf. the proof of [14, Proposition 4.4]. From here, the uniform estimate (6.8) follows from Agmon’s trick [1] in
the Hölder-space setting [36, 15].
✷
The next proposition is established in [14, Theorem 4.6].
PROPOSITION 6.4. The mapping G: Oδ → Y is proper, i.e., G−1 (K) ∩ D is
compact for each bounded set D ⊂ Oδ and compact set K ⊂ Y.
We can now define a degree for u → G as in [14] (cf. also [11, 16]) as follows:
Consider any subset W ⊂ O δ such that for any fixed value of λ ∈ (0, ∞), the set
Wλ = {u ∈X: (λ, u) ∈ W },
(6.9)
is open and bounded. With λ ∈ (0, ∞) fixed, we then consider equation (6.1),
with u → G(λ, u) restricted to W̄λ, assuming that 0 ∉ G(λ, ∂Wλ) is a regular
value. Proposition 6.3 insures that the linear operator A(λ, u), with domain Zλ,u,
has a finite number of positive eigenvalues, denoted ν(λ, u), counted by algebraic
multiplicity. We then define the degree of G(λ, ·) in Wλ (with respect to 0) by
    deg(G(λ, ·), Wλ, 0) = Σ_{u ∈ G_λ^{−1}(0) ∩ Wλ} (−1)^{ν(λ,u)},    (6.10)
with the understanding that deg(G(λ, ·), Wλ, 0) = 0 if G_λ^{−1}(0) ∩ Wλ = ∅. To show
the validity of (6.10) when 0 ∉ G(λ, ∂Wλ) is not a regular value, and to prove
homotopy invariance, viz.,
    deg(G(λ, ·), Wλ, 0) = const    (6.11)
for all λ ∈ [λ1, λ2] whenever 0 ∉ G(λ, ∂Wλ) for all λ ∈ [λ1, λ2], requires the
use of a generalization of Sard’s theorem [24] applicable to C², proper Fredholm
maps, the latter two properties of which are guaranteed by Propositions 6.2 and 6.4.
The uniform estimate (6.8) of Proposition 6.3 together with the continuity of the
mappings (λ, u) → A(λ, u) and (λ, u) → B(λ, u) yield eigenvalue-perturbation
results insuring the continuity of the index (λ, u) → (−1)^{ν(λ,u)}, which is employed
in the proof of homotopy invariance. We refer the reader to [14, the appendix]
for details. In addition to homotopy invariance, our degree has all of the usual
properties of the Leray–Schauder degree, e.g., existence, additivity, etc.
Proof of Theorem 6.1. For each fixed δ > 0, we now argue as in [26], viz., if
none of properties (1)–(3) hold, then by the separation theorem for compact sets,
there is a bounded open set W ⊂ Oδ such that C ⊂ W and S ∩ ∂W = ∅. Hence,
(6.11) holds. Moreover, for ε > 0 sufficiently small, S ∩ {(λ, 0): λ∗ − ε < λ <
λ∗ + ε} = {(λ∗, 0)}. Let Ba(0) denote an open ball of radius a > 0 centered at
0 ∈ X. Let λ → ρ(λ) be continuous on [λ∗ − ε, λ∗ + ε] such that ρ(λ) > 0 on
[λ∗ − ε, λ∗) ∪ (λ∗, λ∗ + ε] and ρ(λ∗) = 0. Define Mλ similarly to that in (6.9). We
claim that
    deg(G(λ∗ − ε, ·), M_{λ∗−ε}, 0) ≠ deg(G(λ∗ + ε, ·), M_{λ∗+ε}, 0).    (6.12)
To see this, consider the (parametrized) eigenvalue problem
    A(λ, 0)[h] = µh in Ω,    (6.13a)
    B(λ, 0)[h] = 0 on ∂Ω_L.    (6.13b)
Note from hypothesis (2) of Theorem 5.2, that µ = 0 and h = h∗ when λ = λ∗. If
we differentiate (6.13a) and (6.13b) with respect to λ, take the vector dot product of
each with h, integrate the first over Ω and the second over ∂Ω_L, subtract these equations,
and then evaluate the result at µ = 0, h = h∗, λ = λ∗, we obtain the well known
result that the transversality condition (5.4) is equivalent to
    dµ/dλ |_{λ=λ∗} ≠ 0.    (6.14)
This, in turn, insures the “birth” or “death” of a simple, positive eigenvalue as λ
crosses through λ∗, i.e., the integer ν(λ, 0) increases or decreases by one; (6.12)
then follows directly from (6.10). Finally, since 0 ∉ G(λ, ∂(Wλ − M̄λ)) for all
λ ∈ [λ∗ − ε, λ∗ + ε], we observe that
    deg(G(λ, ·), Wλ − M̄λ, 0) = const on [λ∗ − ε, λ∗ + ε],    (6.15)
where the constant is zero. To see this, choose W so that W ∩ (R × {0}) =
[λ∗ − ε, λ∗ + ε] × {0}. Observe that there are no solutions of (6.1) in W − M for
large enough λ. By additivity of the degree, we see that (6.12) and (6.15) contradict
(6.11), which completes the proof of Theorem 6.1.
✷
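The change of degree across a bifurcation point can be seen in a one-dimensional toy problem. The sketch below uses the hypothetical scalar map G(λ, u) = λu − u³, which is not the elasticity operator of this paper, counts ν as the number of positive eigenvalues of the derivative as in (6.10), and compares the degree on a small ball around the trivial solution with the degree on a large interval.

```python
import math


def solutions(lam):
    """All real zeros u of the toy map G(lam, u) = lam*u - u**3."""
    if lam > 0.0:
        r = math.sqrt(lam)
        return [-r, 0.0, r]
    return [0.0]


def local_index(lam, u):
    """(-1)**nu, where nu counts the positive eigenvalues of G_u = lam - 3u^2."""
    slope = lam - 3.0 * u * u
    if slope == 0.0:
        raise ValueError("0 is not a regular value at this solution")
    nu = 1 if slope > 0.0 else 0
    return (-1) ** nu


def degree(lam, radius):
    """Sum of the local indices over the zeros with |u| < radius, cf. (6.10)."""
    return sum(local_index(lam, u) for u in solutions(lam) if abs(u) < radius)


if __name__ == "__main__":
    for lam in (-0.5, 0.5):
        print(f"lam = {lam}: small ball {degree(lam, 0.1)}, "
              f"large interval {degree(lam, 2.0)}")
```

On the small ball the degree changes from +1 to −1 as λ crosses 0, because one eigenvalue of the linearization becomes positive. On the large interval the degree stays equal to +1, and the difference is carried by the two nontrivial zeros u = ±√λ.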
If property (3) of Theorem 6.1 holds, then C ∩ ∂Oδ ≠ ∅ for each δ > 0.
From (6.4) and (6.3) we then conclude that there is a sequence of solution points
{(λj, uj)} ⊂ C such that at least one of the following occurs:
    inf_{x∈Ω̄} det(H_{λj} + ∇uj(x)) ↘ 0,    j → ∞,    (6.16)
    inf_{x∈∂Ω_L} |d(H_{λj} + ∇uj(x), x)| ↘ 0,    j → ∞.    (6.17)
If we adopt more specific, physically reasonable constitutive hypotheses, we
can follow ideas of [13] to show that (6.16) is not possible on bounded solution
branches C. Specifically, we assume that W has the form
    W(F) = Φ(F) + Ŵ(det F) ∀F ∈ GL⁺(R³),    (6.18)
where Φ ∈ C⁵(GL⁺(R³)) ∩ C²(\overline{GL⁺(R³)}) and Ŵ ∈ C⁵(0, ∞) are such that the
following growth conditions hold:
    Ŵ(η) → ∞ as η ↘ 0,    η Ŵ′(η) → −∞ as η ↘ 0.    (6.19)
(The smoothness of Φ on the closure of GL⁺(R³) can be relaxed slightly, cf. [13].)
We now have the following generalization of Theorem 2.3 in [13]:
THEOREM 6.5. Let the hypotheses of Theorem 6.1 hold, and assume the constitutive hypothesis (6.18) with growth conditions (6.19). If the global solution branch C
is bounded in (0, ∞) × X, then condition (6.16) is not possible, i.e., bounded
solution branches are characterized by property (2) of Theorem 6.1 and/or property (6.17).
Proof. The proof is nearly identical to that given in [13], and we refer the reader
to that work for the details. However, there is one point in the proof where we need
a different argument in the present context. Namely, we need that J(x) = det F(x) =
det(Hλ + ∇u(x)) not vanish identically on Ω. In [13] the placement boundary conditions rule out such behavior. Here we use the traction-free boundary conditions
and (6.19)₂, the latter of which is stronger than that required in [13], as follows. If
we write the stored energy function W as a function of the principal stretches (cf.
(3.8)–(3.10) above), ν1, ν2, ν3, viz.,
    W(F) = Ψ(ν1, ν2, ν3),    (6.20)
then the traction-free boundary condition (2.7) and isotropy imply that the outward
unit normal n is a principal direction, say, i = 1, with
    s1 ≡ (∂Ψ/∂ν1)(ν1, ν2, ν3) = 0 on ∂Ω_L.    (6.21)
Suppose that {(λj, uj)} ⊂ C is bounded such that Jj(x) = det(H_{λj} + ∇uj(x)) ↘ 0
identically on Ω. Now (6.18), (6.19)₂ and (6.20) show that
    s1 → Jj Ŵ′(Jj) / ν1j → −∞ as j → ∞,    (6.22)
where we have used the fact that the sequence of principal stretches, (ν1j)_j, is either
bounded or approaches zero. Obviously, (6.22) contradicts (6.21) on ∂Ω_L.
✷
7. Concluding Remarks
Even with the additional hypotheses (6.18) and (6.19), Theorem 6.5 leaves open the
possibility that a bifurcating branch could be bounded and “terminate” due to the
breakdown of the complementing condition. That the complementing condition can
fail along a solution branch is clear. In both of the specific constitutive examples
presented in Section 4, the complementing condition is violated along the trivial
solution branch. On the other hand, the trivial solution does not “terminate” at that
location, perhaps suggesting that Theorem 6.5 is not sharp. That is, although our
existence method fails at a point where the complementing condition is violated,
we do not know if the branch actually terminates or not. A physically reasonable
way around this is to introduce a small additive second-gradient term in the model,
which can be thought of as a model for the surface behavior, cf. [33, 37]. In particular, when the model is linear in the higher-gradient term, the complementing
condition is always satisfied, cf. [19]. We plan to pursue these questions, in the
context of global bifurcation problems, in future work.
Clearly our approach to the barrelling problem serves as a paradigm for the
rigorous analysis of a large class of concrete problems concerning bifurcation from
a trivial line of homogeneously deformed states, e.g., cf. [5, 22, 27, and references therein]. The essential ingredients are: (1) the problem has enough symmetry
or “hidden” symmetry enabling a reformulation without boundary “corners”, cf.
Section 2. (2) The necessary conditions for bifurcation can be obtained for the
linearized problem. (3) A crossing condition is verified, insuring a change in degree. We believe that (1) can be carried out in most cases, although see [21] for
an example where this step is unclear. Conditions (2) and (3) are difficult to carry
out for general materials, but can be verified in the context of specific classes of
materials.
Acknowledgements
The work of T.J.H. was supported in part by the National Science Foundation
through grant DMS-0072514, and the work of E.L.M-P by the University of Puerto
Rico as matching funds to the National Institutes of Health grant GM-63039-01. The
authors thank Phoebus Rosakis and Pablo V. Negrón-Marrero for useful comments
at various stages of this work.
References
1. S. Agmon, On the eigenfunctions and on the eigenvalues of general elliptic boundary value problems. Comm. Pure Appl. Math. 15 (1962) 119–147.
2. S. Agmon, A. Douglis and L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions II. Comm. Pure Appl. Math. 17 (1964) 35–92.
3. A. Ambrosetti and G. Prodi, A Primer of Nonlinear Analysis, Cambridge Studies in Advanced Mathematics 34. Cambridge Univ. Press, Cambridge, UK (1993).
4. J.M. Ball, Strict convexity, strong ellipticity, and regularity in the calculus of variations. Math. Proc. Cambridge Philos. Soc. 87 (1980) 501–513.
5. M.A. Biot, Mechanics of Incremental Deformation. Wiley, New York (1965).
6. P.J. Blatz and W.L. Ko, Applications of finite elasticity theory to the deformation of rubber materials. Trans. Soc. Rheology 6 (1962) 223–251.
7. A. Constantin and W. Strauss, Exact steady periodic water waves with vorticity. Preprint (2003).
8. M. Crandall and P.H. Rabinowitz, Bifurcation from simple eigenvalues. J. Funct. Anal. 8 (1971) 321–340.
9. P.J. Davies, Buckling and barrelling instabilities in finite elasticity. J. Elasticity 21 (1989) 147–192.
10. P.J. Davies, Buckling and barrelling instabilities of nonlinearly elastic columns. Quart. Appl. Math. 49(3) (1991) 407–426.
11. C.C. Fenske, Extensio gradus ad quasdam applicationes Fredholmii. Mitt. Math. Seminar Giessen 121 (1976) 65–70.
12. T.J. Healey, Global continuation in displacement problems of nonlinear elastostatics via the Leray–Schauder degree. Arch. Rational Mech. Anal. 152 (2000) 273–282.
13. T.J. Healey and P. Rosakis, Unbounded branches of classical injective solutions to the forced displacement problem in nonlinear elastostatics. J. Elasticity 49 (1997) 65–78.
14. T.J. Healey and H.C. Simpson, Global continuation in nonlinear elasticity. Arch. Rational Mech. Anal. 143 (1998) 1–28.
15. H. Kielhöfer, Existenz und Regularität von Lösungen semilinearer parabolischer Anfangs-Randwertprobleme. Math. Z. 142 (1975) 131–160.
16. H. Kielhöfer, Multiple eigenvalue bifurcation for Fredholm mappings. J. Reine Angew. Math. 358 (1985) 104–124.
17. M.A. Krasnosel’skii, Topological Methods in the Theory of Nonlinear Integral Equations. Pergamon Press, New York (1964).
18. O.A. Ladyzhenskaya and N.N. Ural’tseva, Linear and Quasilinear Elliptic Equations. Academic Press, New York (1968).
19. A. Mareno, Global continuation in higher-gradient three-dimensional nonlinear elasticity. PhD Thesis, Cornell University (2002).
20. A. Mielke and P. Sprenger, Quasiconvexity at the boundary and a simple variational formulation of Agmon’s condition. J. Elasticity 51 (1998) 23–41.
21. P.V. Negrón-Marrero and E.L. Montes-Pizarro, Axisymmetric deformations of buckling and barrelling type for cylinders under lateral compression – The linear problem. J. Elasticity 65 (2001) 61–86.
22. R.W. Ogden, Non-linear Elastic Deformations. Ellis Horwood, Chichester (1984).
23. J. Peetre, Another approach to elliptic boundary problems. Comm. Pure Appl. Math. 14 (1961) 711–731.
24. F. Quinn and A. Sard, Hausdorff conullity of critical images of Fredholm maps. Amer. J. Math. 94 (1972) 1101–1110.
25. P.J. Rabier and J.T. Oden, Bifurcation in Rotating Bodies, Recherches en Mathématiques Appliquées 11. Springer, New York (1989).
26. P.H. Rabinowitz, Some global results for nonlinear eigenvalue problems. J. Funct. Anal. 7 (1971) 487–513.
27. K.N. Sawyers, Material stability and bifurcation in finite elasticity. In: R.S. Rivlin (ed.), Finite Elasticity AMD, Vol. 27. ASME, Basel (1977).
28. H.C. Simpson and S.J. Spector, On barrelling for a special material in finite elasticity. Quart. Appl. Math. 42 (1984) 99–111.
29. H.C. Simpson and S.J. Spector, On barrelling instabilities in finite elasticity. J. Elasticity 14 (1984) 103–125.
30. H.C. Simpson and S.J. Spector, On the positivity of the second variation in finite elasticity. Arch. Rational Mech. Anal. 98 (1987) 1–30.
31. H.C. Simpson and S.J. Spector, On bifurcation in finite elasticity: Buckling of a rectangular block. Unpublished manuscript.
32. S.J. Spector, On the absence of bifurcation for elastic bars in uniaxial tension. Arch. Rational Mech. Anal. 85 (1984) 171–199.
33. N. Triantafyllidis and E. Aifantis, A gradient approach to localization of deformation I: Hyperelastic materials. J. Elasticity 16 (1986) 225–237.
34. C. Truesdell and W. Noll, The Nonlinear Field Theories of Mechanics. In: S. Flügge (ed.), Handbuch der Physik III/3. Springer, Berlin (1965).
35. T. Valent, Boundary Value Problems of Finite Elasticity. Springer, New York (1988).
36. W. von Wahl, Gebrochene Potenzen eines elliptischen Operators und parabolische Differentialgleichungen in Räumen hölderstetiger Funktionen. Nachr. Akad. Wiss. Göttingen II. Math. Phys. Kl. 11 (1972) 231–258.
37. C.H. Wu, Cohesive elasticity and surface phenomena. Quart. Appl. Math. 50 (1992) 73–103.
Constitutive Relation of Elastic Polycrystal
with Quadratic Texture Dependence
MOJIA HUANG and CHI-SING MAN
Department of Mathematics, University of Kentucky, Lexington, KY 40506, USA.
E-mail: mclxyh@ms.uky.edu
Received 25 September 2002
Abstract. Herein we consider polycrystalline aggregates of cubic crystallites with arbitrary texture
symmetry. We present a theory in which we keep track of the effects of crystallographic texture
on elastic response up to terms quadratic in the texture coefficients. Under this theory, the Lamé
constants pertaining to the isotropic part of the effective elasticity tensor of the polycrystal will generally depend on the texture. We introduce also two simple models, which we call HM-V and HM-R,
by which we derive an explicit expression for the effective stiffness tensor and one for the effective
compliance tensor. Each of these expressions contains a term quadratic in the texture coefficients and,
in addition to three parameters given in terms of the single-crystal elastic constants, each carries an
undetermined material coefficient. These two remaining coefficients can be determined by imposing
the requirement that the expressions from models HM-V and HM-R be compatible to within terms
linear in the texture coefficients.
Mathematics Subject Classifications (2000): 74B99, 74E10, 74E25, 74M25, 74Q15.
Key words: polycrystal, crystallographic texture, Lamé constants, HM-V and HM-R models.
May the rational spirit prevail!
1. Introduction
A polycrystal is an aggregate of tiny crystallites separated by grain boundaries. The
chemical composition and the arrangement of the constituting crystallites, which
includes grain orientations and grain boundary structure, are the main factors that
determine the effective stiffness tensor of the polycrystal. In this paper we restrict
our discussion to polycrystals whose constituting crystallites are cubic crystals of
the same chemical composition.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 495–524.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
To give a crude but quantitative description of grain orientations or “crystallographic texture”, the orientation distribution function w (or equivalently its associated orientation measure ℘ which, for each given Borel set A of orientations,
specifies the probability that the grain located at a given point has its orientation
in A) was introduced independently by Bunge [1] and by Roe [2] in the 1960s.
Since then, efforts have been made to determine the effect of the orientation distribution function (ODF) on various material properties. In linear elasticity the ODF
was first introduced [3, 4] into the constitutive equation of orthorhombic aggregates of cubic crystallites through the Voigt model and orientational averaging.
Under the Voigt model, the anisotropic part of the effective elasticity tensor C eff
depends linearly on the anisotropic part of the ODF characterized by the texture
coefficients. A few years ago Man [5] initiated a phenomenological approach in
delineating the effects of crystallographic texture on the mechanical anisotropy of
polycrystals. In this approach the ODF is treated on a par with stress and strain and
is taken as a macroscopic variable in constitutive equations. General principles that
govern constitutive equations (e.g., the principle of material frame-indifference,
indifference to rotation of reference placement [6], etc.) and restrictions imposed
by texture and crystal symmetries are then applied to obtain representation formulae that show explicitly the effects of texture on mechanical response. As a
first example of applying this approach, Man [5] derived, for orthorhombic aggregates of cubic crystallites, a representation formula for the effective elasticity
tensor C eff , which accounts for the effects of the ODF up to terms linear in the
texture coefficients. Empirical experience has so far suggested that Man’s formula
for the elasticity tensor would work well for materials such as aluminum, whose
single crystal has weak anisotropy. On the other hand, there is also experimental
evidence [7] which indicates that this simple formula is inadequate for strongly
textured samples of copper, whose single crystal manifests much stronger elastic
anisotropy than that of aluminum. With a view to applications involving strongly
textured aggregates of crystallites which are themselves strongly anisotropic, here
we seek a formula for C eff which accounts for the effects of crystallographic texture
on elastic anisotropy up to terms quadratic in the texture coefficients. To this end,
we shall follow a phenomenological approach.
The stress and strain that enter into the constitutive equation of an elastic polycrystal are each a volume average of the corresponding field over many crystallites.
In other words, they are the mean stress T and the mean strain E over some representative volume of the polycrystal. In polycrystalline sheet metals, we often find
clustering of crystallite orientations around a relatively small number of specific
orientations Rα . This phenomenon motivates the definition of texture components
in metallurgy. A texture component in the representative volume of a sample often
includes many crystallites. If we denote the volume averages of the strain and stress
over the crystallites included in the αth texture component by E(Rα ) and T (Rα ),
respectively, it should not need a long stretch of the imagination to believe that
there could be constitutive equations governing E(Rα ) and T (Rα ) or, equivalently,
constitutive equations on the mean perturbation strain E = E(Rα ) − E and the
mean perturbation stress T = T (Rα ) − T . As the starting point of our present
theoretical investigations, we postulate the existence of appropriate forms of such
constitutive equations (see equations (59) and (148) below).
After presenting some preliminaries in Sections 2–4, we discuss the constitutive
assumption (59) in Section 5. Starting from this constitutive assumption or rather
its linearization (60), we examine the special instance of cubic aggregates of cubic
crystallites in Section 6. There we obtain the somewhat surprising result that, when
we consider the influence of crystallographic texture up to terms quadratic in the
texture coefficients, the Lamé constants that pertain to the isotropic part of the
effective elasticity tensor C eff will generally depend on the texture. From experience in working with formulae linear in the texture coefficients, one could hardly
anticipate this finding.
The formula for C eff in Section 6 contains numerous undetermined parameters. Thus, from the practical standpoint, the theory presented there is too general. In Section 7, we add simplifying ad hoc assumptions and obtain two simple
models, which we call HM-V and HM-R, respectively, for polycrystalline aggregates of cubic crystallites with arbitrary texture symmetry. These models lead to
a formula for the effective stiffness tensor C eff and one for the effective compliance tensor S eff , both of which contain terms quadratic in the texture coefficients
(see equations (141) and (155)). Besides three polycrystal coefficients related to
the single-crystal elastic constants, each of these formulae contains an undetermined material parameter, which we denote by ζ and η, respectively. It is easy
to determine the particular values of ζ and η which guarantee compatibility of
the two models up to terms linear in the texture coefficients. The formulae for
these particular values are given in equations (166) and (167). We call the special
version of HM-V and of HM-R with the particular value of ζ and of η model
HM-Vc and HM-Rc , respectively. In Section 8, after giving some remarks on using
numerical calculations and experimental corroboration to check the adequacy of
formulae (141) and (155), we present a couple of examples, where the predictions of models HM-Vc and HM-Rc are compared with results of experimental
measurements and/or computations based on the self-consistent method.
In what follows we adopt the Schönflies notation for point groups and the
Einstein summation convention for tensors. For two fourth-order tensors $\mathbb{A}$ and $\mathbb{B}$, we let $\mathbb{A} : \mathbb{B}$ denote the fourth-order tensor with components $\mathbb{A}_{ijmn}\mathbb{B}_{mnkl}$. When we regard fourth-order tensors as linear transformations on the space of second-order tensors, $\mathbb{A} : \mathbb{B}$ is simply the composition of the corresponding linear transformations. In this sense, we have $\mathbb{A} : \mathbb{A} = \mathbb{A}^2$.
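The double contraction just introduced can be checked numerically. The following is a minimal sketch, assuming NumPy and random test tensors; the helper names `ddot` and `apply` are illustrative and do not come from the paper.

```python
import numpy as np

def ddot(A, B):
    """Double contraction: (A : B)_ijkl = A_ijmn B_mnkl."""
    return np.einsum("ijmn,mnkl->ijkl", A, B)

def apply(A, E):
    """Action of a fourth-order tensor on a second-order one: (A[E])_ij = A_ijkl E_kl."""
    return np.einsum("ijkl,kl->ij", A, E)

rng = np.random.default_rng(0)           # arbitrary test data, not from the paper
A = rng.standard_normal((3, 3, 3, 3))
B = rng.standard_normal((3, 3, 3, 3))
E = rng.standard_normal((3, 3))

composed = apply(ddot(A, B), E)          # (A : B)[E]
nested = apply(A, apply(B, E))           # A[B[E]]
print(np.allclose(composed, nested))
```

Flattening each tensor to a 9 x 9 matrix turns the double contraction into an ordinary matrix product, which is another way of seeing that it is a composition of linear maps.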
2. Preliminaries
Henceforth we assume that a fixed spatial Cartesian coordinate system has been
chosen. We consider polycrystalline aggregates of cubic crystallites of the same
chemical composition. To describe the orientation of a crystallite, we pick as reference the configuration of a single crystal which has its three four-fold axes of rotational symmetry coincide with the coordinate axes. The orientation of a crystallite
in the polycrystalline aggregate is then specified by any one of the 24 rotations R
which take the reference configuration to the given configuration of the crystallite.
In much of our discussions below, we shall refrain from making any specific assumption on texture symmetry. On occasions where we refer to aggregates with
cubic or orthorhombic texture, we assume that the axes of the spatial coordinate
system have been chosen to agree with the three four-fold axes and with the three
two-fold axes of the O and D2 texture symmetry, respectively.
2.1. TENSOR BASIS FOR CUBIC ELASTICITY
For a fourth-order tensor A and a rotation Q, we let $Q^{\otimes 4}A$ denote the tensor with components
\[ (Q^{\otimes 4}A)_{ijkl} = Q_{ip}Q_{jq}Q_{kr}Q_{ls}A_{pqrs}. \tag{1} \]
If A defines a physical property of a material point X in a given configuration, then
Q⊗4 A describes the same property of X after the configuration is rotated by Q.
Let $C(R)$ be the elasticity tensor of the crystallite with orientation R. Clearly
\[ C(R) = R^{\otimes 4}C(I), \tag{2} \]
where I is the second-order identity tensor and $C(I)$ is the elasticity tensor of the reference cubic crystal. Let $c_{11}$, $c_{12}$, and $c_{44}$ be the three independent components of $C(I)$ (in the Voigt notation), and let
\[ c = c_{11} - c_{12} - 2c_{44}. \tag{3} \]
Let $B^{(\alpha)}(I)$ ($\alpha = 1, 2, 3$) be [8] the fourth-order tensors with components
\[ B^{(1)}_{ijkl}(I) = B^{(1)}_{ijkl} = \delta_{ij}\delta_{kl}, \qquad B^{(2)}_{ijkl}(I) = B^{(2)}_{ijkl} = \tfrac{1}{2}(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \qquad B^{(3)}_{ijkl}(I) = \sum_{\alpha=1}^{3}\delta_{i\alpha}\delta_{j\alpha}\delta_{k\alpha}\delta_{l\alpha}. \tag{4} \]
By direct computations, it is easy to verify that
\[ C(I) = c_{12}B^{(1)} + 2c_{44}B^{(2)} + cB^{(3)}(I). \tag{5} \]
If we let R act on both sides of equation (5), we obtain the decomposition of the elasticity tensor $C(R)$ in terms of the tensor basis $B^{(\alpha)}(R)$ as follows:
\[ C(R) = c_{12}B^{(1)} + 2c_{44}B^{(2)} + cB^{(3)}(R), \tag{6} \]
where
\[ B^{(\alpha)}(R) = B^{(\alpha)}(I) = B^{(\alpha)} \ \ \text{for } \alpha = 1, 2, \qquad B^{(3)}_{ijkl}(R) = \sum_{\alpha=1}^{3} R_{i\alpha}R_{j\alpha}R_{k\alpha}R_{l\alpha}. \tag{7} \]
Note that
\[ B^{(1)} = I \otimes I, \qquad B^{(2)} = \mathbb{I}, \tag{8} \]
where $\mathbb{I}$ is the identity operator on the space of second-order symmetric tensors, and that they constitute a basis in the space of isotropic fourth-order tensors with both the major and minor symmetries.
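The tensor basis and the decomposition (5)-(7) lend themselves to a quick numerical check. The sketch below assumes NumPy; the function names, the Rodrigues-formula helper and the copper-like elastic constants are illustrative choices, not values or code from the paper.

```python
import numpy as np

delta = np.eye(3)
B1 = np.einsum("ij,kl->ijkl", delta, delta)                  # delta_ij delta_kl
B2 = 0.5 * (np.einsum("ik,jl->ijkl", delta, delta)
            + np.einsum("il,jk->ijkl", delta, delta))        # symmetric identity

def B3(R):
    """B3_ijkl(R) = sum_a R_ia R_ja R_ka R_la, cf. equation (7)."""
    return np.einsum("ia,ja,ka,la->ijkl", R, R, R, R)

def rotate4(Q, A):
    """(Q^{x4} A)_ijkl = Q_ip Q_jq Q_kr Q_ls A_pqrs, cf. equation (1)."""
    return np.einsum("ip,jq,kr,ls,pqrs->ijkl", Q, Q, Q, Q, A)

def cubic_stiffness(c11, c12, c44, R=delta):
    """C(R) assembled from decomposition (6)."""
    c = c11 - c12 - 2.0 * c44
    return c12 * B1 + 2.0 * c44 * B2 + c * B3(R)

def rotation(axis, angle):
    """Rotation matrix from axis and angle (Rodrigues formula)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

c11, c12, c44 = 168.4, 121.4, 75.4       # illustrative copper-like values (GPa)
R = rotation([1.0, 2.0, 3.0], 0.7)       # arbitrary crystallite orientation
C_ref = cubic_stiffness(c11, c12, c44)   # C(I)
C_rot = cubic_stiffness(c11, c12, c44, R)  # C(R) from (6)
print(np.allclose(rotate4(R, C_ref), C_rot))
```

Rotating $C(I)$ with (1) and assembling $C(R)$ directly from (6) should agree to rounding error, which is the content of equation (2).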
2.2. THE ORIENTATION DISTRIBUTION FUNCTION
Let w be the orientation distribution function (ODF) [1, 2, 5] pertaining to the given
polycrystalline aggregate, and let L2 (SO(3)) be the space of square-integrable
complex-valued functions defined on the rotation group SO(3). We assume that
w is independent of the sampling location. For $w \in L^2(SO(3))$, we can expand it as an infinite series in terms of the Wigner D-functions:
\[ w(R) = w_{\mathrm{iso}} + \sum_{l=1}^{\infty}\sum_{m=-l}^{l}\sum_{n=-l}^{l} c^{l}_{mn}D^{l}_{mn}(R), \qquad c^{l}_{mn} = (-1)^{m-n}\bigl(c^{l}_{\bar m\bar n}\bigr)^{*}, \tag{9} \]
where $w_{\mathrm{iso}} = 1/(8\pi^2)$ is the ODF for an isotropic aggregate, $c^{l}_{mn}$ ($l \ge 1$) are the texture coefficients, $z^{*}$ denotes the complex conjugate of the complex number z, and $\bar n = -n$. The texture coefficients $c^{l}_{mn}$ are related to Roe's $W_{lmn}$ coefficients [2] by the formula
\[ W_{lmn} = (-1)^{n-m}\sqrt{\frac{2}{2l+1}}\; c^{l}_{mn}. \tag{10} \]
Let $g = 8\pi^2 g_H$, where $g_H$ is the Haar measure on SO(3) with $g_H(SO(3)) = 1$. The Wigner D-functions $D^{l}_{mn}$, which constitute an orthogonal basis in $L^2(SO(3))$, satisfy
\[ \bigl\langle D^{l}_{mn}, D^{l'}_{m'n'}\bigr\rangle \equiv \int_{SO(3)} D^{l}_{mn}(R)\,\bigl(D^{l'}_{m'n'}(R)\bigr)^{*}\,dg = \frac{8\pi^2}{2l+1}\,\delta_{ll'}\delta_{mm'}\delta_{nn'}. \tag{11} \]
When $R = R(\psi, \theta, \phi)$ is described by the Euler angles (here we use the convention adopted by Roe [2]), the Wigner D-functions assume the form [9]
\[ D^{l}_{mn}(R) = d^{l}_{mn}(\theta)\,e^{-i(m\psi + n\phi)}, \tag{12} \]
\[ d^{l}_{mn}(\theta) = \sqrt{\frac{(l+n)!\,(l-n)!}{(l+m)!\,(l-m)!}}\,\Bigl(\cos\frac{\theta}{2}\Bigr)^{n+m}\Bigl(\sin\frac{\theta}{2}\Bigr)^{n-m} P^{(n-m,\,n+m)}_{l-n}(\cos\theta) \tag{13} \]
with the Jacobi polynomial
\[ P^{(r,s)}_{q}(x) = (q+r)!\,(q+s)!\sum_{k}\frac{\bigl((x-1)/2\bigr)^{q-k}\bigl((x+1)/2\bigr)^{k}}{k!\,(q+r-k)!\,(q-k)!\,(s+k)!}, \tag{14} \]
where the summation is over all integral values of k for which the arguments of
the factorials in the denominator are non-negative. From (9) and (11), the texture coefficients are given by
\[ c^{l}_{mn} = \frac{2l+1}{8\pi^2}\int_{SO(3)} w(R)\,\bigl(D^{l}_{mn}(R)\bigr)^{*}\,dg. \tag{15} \]
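Formulae (13) and (14) can be evaluated directly. The sketch below is a plain-Python transcription under the assumption that the square-root prefactor in (13) is as written above; the function names are illustrative, and the evaluation is only meant for angles strictly between 0 and π, where negative powers of the half-angle functions cause no division by zero.

```python
from math import cos, sin, sqrt, factorial

def jacobi(q, r, s, x):
    """Finite sum (14); terms with a negative factorial argument are skipped."""
    total = 0.0
    for k in range(q + 1):
        if q + r - k < 0 or s + k < 0:
            continue
        total += (((x - 1.0) / 2.0) ** (q - k) * ((x + 1.0) / 2.0) ** k
                  / (factorial(k) * factorial(q + r - k)
                     * factorial(q - k) * factorial(s + k)))
    return factorial(q + r) * factorial(q + s) * total

def wigner_d(l, m, n, theta):
    """d^l_mn(theta) from (13), for 0 < theta < pi and |m|, |n| <= l."""
    pref = sqrt(factorial(l + n) * factorial(l - n)
                / (factorial(l + m) * factorial(l - m)))
    return (pref * cos(theta / 2.0) ** (n + m) * sin(theta / 2.0) ** (n - m)
            * jacobi(l - n, n - m, n + m, cos(theta)))

theta = 0.7                               # arbitrary test angle
d100 = wigner_d(1, 0, 0, theta)           # reduces to cos(theta)
d200 = wigner_d(2, 0, 0, theta)           # reduces to the Legendre polynomial P_2
```

For $m = n = 0$ the expression reduces to the Legendre polynomial $P_l(\cos\theta)$, which gives a convenient sanity check.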
The texture coefficients that we shall need in this paper can easily be measured by X-ray diffraction. For cubic crystallites, since $w(R) = w(RQ_{cr})$ for all $Q_{cr} \in O$, we have
\[ c^{l}_{mn} = \frac{2l+1}{8\pi^2}\int_{SO(3)} w(R)\,\Bigl[\frac{1}{24}\sum_{Q_{cr}\in O}\bigl(D^{l}_{mn}(RQ_{cr})\bigr)^{*}\Bigr]dg. \tag{16} \]
By direct computations using a simple Maple program, it is easy to verify that
\[ \frac{1}{24}\sum_{Q_{cr}\in O}\bigl(D^{l}_{mn}(RQ_{cr})\bigr)^{*} = 0 \quad \text{when } l = 1, 2, 3, \tag{17} \]
for all $R \in SO(3)$. Hence, from (16) and (17), we obtain
\[ c^{l}_{mn} = 0 \quad \text{when } l = 1, 2, 3. \tag{18} \]
After a polycrystal with texture characterized by the ODF w undergoes a rotation Q, its texture is described by a new ODF
\[ w_Q(R) = w(Q^{-1}R), \tag{19} \]
whose texture coefficients $\check c^{l}_{mn}$ can be obtained by the formula
\[ \check c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{sn}D^{l}_{sm}(Q^{-1}). \tag{20} \]
Similarly, when the reference orientation of the crystallites undergoes a rotation Q, the ODF of the polycrystal becomes $w(RQ)$ with texture coefficients
\[ \grave c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{ms}D^{l}_{ns}(Q). \tag{21} \]
As an example of the application of (20) and (21), the texture and the crystal symmetry of orthorhombic aggregates of cubic crystallites lead to the equations
\[ c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{sn}D^{l}_{sm}(Q_{tex}^{-1}) \ \ \forall Q_{tex} \in D_2, \qquad c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{ms}D^{l}_{ns}(Q_{cr}) \ \ \forall Q_{cr} \in O, \tag{22} \]
in which $D_2$ denotes the orthorhombic group of texture symmetry and O the octahedral group of cubic crystal symmetry. From (22), one [1, 2, 10] can easily derive the restrictions on $c^{l}_{mn}$, some of which are shown in Table I (e.g., $c^{l}_{mn} = 0$ whenever m is odd).
Table I. Some restrictions on texture coefficients $c^{l}_{mn}$ for orthorhombic aggregates of cubic crystallites (in the table k denotes an integer)

Conditions      $c^{l}_{mn}$ =
m = 2k          $(-1)^{l} c^{l}_{\bar m n}$
m ≠ 2k          0
n = 4k          $(-1)^{l} c^{l}_{m \bar n}$
n ≠ 4k          0
                $c^{4}_{m4} = \frac{\sqrt{70}}{14}\, c^{4}_{m0}$
2.3. ISOTROPIC PART OF ELASTICITY TENSOR
In this paper we shall present a theory and two models on textured aggregates of
cubic crystallites, under which the “isotropic part” of the effective elasticity tensor
will generally depend on the crystallographic texture. To make our discussions
precise, here we define what we mean by the isotropic part Ciso of an elasticity
tensor C, and we give a formula for the computation of Ciso. Proof of a general
version of this formula and detailed discussions on the isotropic and anisotropic
parts of general material tensors will be given elsewhere.
Let V be the translation space of the three-dimensional Euclidean space, and $V^{r}$ the r-fold tensor product $V \otimes V \otimes \cdots \otimes V$. Elasticity tensors belong to the subspace of $V^{4}$ whose members enjoy both the major and minor symmetries. We denote this subspace by $[[V^2]^2]$. Each rotation Q on V induces a linear transformation $Q^{\otimes 4}$ on $[[V^2]^2]$ as defined by (1).
The map Q → Q⊗4 defines [11] a linear representation of the rotation
group SO(3) on [[V 2 ]2 ]. By formally introducing the complexification Vc of V (see
[11, p. 105]), we shall henceforth regard this tensor representation as a complex
representation. For simplicity, we shall suppress the subscript “c” and continue to
write the complex representations as Q → Q⊗4 |[[V 2 ]2 ].
The rotation group has a complete set of absolutely irreducible unitary representations $D_l$ (l = 0, 1, 2, . . .) of dimension 2l + 1. The representation $Q \mapsto Q^{\otimes 4}|[[V^2]^2]$, however, is reducible; it can be decomposed as a direct sum of subrepresentations as described by the formula
\[ [[V^2]^2] = 2D_0 + 2D_2 + D_4. \tag{23} \]
This formula should be interpreted as follows: the 21-dimensional space [[V 2 ]2 ]
is a direct sum of two 1-dimensional, two 5-dimensional, and one 9-dimensional
subspaces, each of which is invariant under the action of Q⊗4 for every rotation Q. Moreover, the restrictions of Q → Q⊗4 on each of the 1-dimensional,
5-dimensional, and the 9-dimensional subspace are subrepresentations equivalent
to the irreducible representation D0 , D2 , and D4 , respectively. Decomposition formulae such as equation (23) above can be derived by computing the character of
the tensor representation in question [12–14] or by other methods [15].
Tensors C which fall in the $D_4$ subspace of $[[V^2]^2]$ are harmonic. In other words, they are totally symmetric and traceless, i.e.,
\[ C_{i_1 i_2 i_3 i_4} = C_{i_{\tau(1)} i_{\tau(2)} i_{\tau(3)} i_{\tau(4)}} \tag{24} \]
for any permutation τ of {1, 2, 3, 4} and
\[ \mathrm{tr}_{j,k}\,C = 0 \tag{25} \]
for any pair of distinct indices j and k.
A tensor $C \in [[V^2]^2]$ is isotropic if and only if it lies in the direct sum of the two 1-dimensional subspaces invariant under $Q^{\otimes 4}$. These two subspaces are spanned by the tensors $I \otimes I$ and $\mathbb{I}$, respectively. We call the 2-dimensional subspace spanned by these two tensors the isotropic subspace of $[[V^2]^2]$. When we write an isotropic tensor in $[[V^2]^2]$ as the linear combination $\lambda I \otimes I + 2\mu\mathbb{I}$, the parameters λ and µ are called the Lamé constants.
For an arbitrary C ∈ [[V 2 ]2 ], we write it as a direct sum of tensors in the rotationally invariant subspaces given in decomposition (23). We define the isotropic
part of C to be that which falls in the isotropic subspace of [[V 2 ]2 ] under this
decomposition. One recipe to compute the isotropic part of tensor C is by way of the integral formula
\[ C_{\mathrm{iso}} = \int_{SO(3)} R^{\otimes 4}C\,w_{\mathrm{iso}}\,dg. \tag{26} \]
We call C − Ciso the anisotropic part of C.
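As a numerical illustration of formula (26), the sketch below computes the isotropic part of a cubic stiffness tensor in two ways. The first projects onto $B^{(1)}$ and $B^{(2)}$ using the two linear invariants $C_{iijj}$ and $C_{ijij}$. The second evaluates the integral by a product quadrature in ZYZ Euler angles (uniform grids in the two azimuthal angles, Gauss-Legendre in $\cos\theta$). The projection shortcut, the quadrature rule, the helper names and the copper-like constants are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

delta = np.eye(3)
B1 = np.einsum("ij,kl->ijkl", delta, delta)
B2 = 0.5 * (np.einsum("ik,jl->ijkl", delta, delta)
            + np.einsum("il,jk->ijkl", delta, delta))
B3 = np.einsum("ia,ja,ka,la->ijkl", delta, delta, delta, delta)

def rotate4(Q, A):
    return np.einsum("ip,jq,kr,ls,pqrs->ijkl", Q, Q, Q, Q, A)

def iso_part_projection(C):
    """Solve C_iijj = 9*lam + 6*mu and C_ijij = 3*lam + 12*mu for lam, mu."""
    t1 = np.einsum("iijj->", C)
    t2 = np.einsum("ijij->", C)
    lam = (2.0 * t1 - t2) / 15.0
    mu = (3.0 * t2 - t1) / 30.0
    return lam * B1 + 2.0 * mu * B2, lam, mu

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def iso_part_quadrature(C, n_angle=5, n_theta=3):
    """Quadrature for (26) with dg = sin(theta) dtheta dpsi dphi, total mass 8*pi^2."""
    x, wx = np.polynomial.legendre.leggauss(n_theta)   # nodes in cos(theta)
    angles = 2.0 * np.pi * np.arange(n_angle) / n_angle
    total = np.zeros((3, 3, 3, 3))
    for psi in angles:
        for xi, wi in zip(x, wx):
            for phi in angles:
                R = rot_z(psi) @ rot_y(np.arccos(xi)) @ rot_z(phi)
                total += wi * rotate4(R, C)
    return total * (2.0 * np.pi / n_angle) ** 2 / (8.0 * np.pi ** 2)

c11, c12, c44 = 168.4, 121.4, 75.4        # illustrative copper-like values (GPa)
C = c12 * B1 + 2.0 * c44 * B2 + (c11 - c12 - 2.0 * c44) * B3
C_iso_p, lam, mu = iso_part_projection(C)
C_iso_q = iso_part_quadrature(C)
print(np.allclose(C_iso_p, C_iso_q))
```

Since the entries of $R^{\otimes 4}C$ are trigonometric polynomials of degree at most four in each Euler angle, a grid of this size should integrate them exactly up to rounding, so the two results are expected to coincide.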
When Ciso depends on texture, the corresponding Lamé constants are functions
of the ODF w. Naturally they should be isotropic functions of w. To make this
connotation precise, we introduce the following:
DEFINITION 2.1. A function f (·) of the ODF is isotropic if f (wQ ) = f (w) for
each rotation Q and each ODF w.
3. Theoretical Setting
Consider an ensemble of nominally identical polycrystals, each member of which is subjected to the same macroscopic deformation so that $\bar T$, the ensemble average of the Cauchy stress, and $\bar E$, the ensemble average of the infinitesimal strain, are independent of place x. We assume that $\bar T$ and $\bar E$ are equal to the volume average of the stress field and of the strain field in a representative polycrystal B, and we call them the mean stress and the mean strain of the polycrystal, respectively. The effective stiffness tensor $C^{\mathrm{eff}}$ of the polycrystal is defined by [16]
\[ \bar T = C^{\mathrm{eff}}[\bar E]. \tag{27} \]
Let w be the ODF that characterizes the texture of the polycrystal in question, and let ℘ be the associated orientation measure. For each Borel subset A of SO(3), we have
\[ \wp(A) = \int_{A} w(R)\,dg. \tag{28} \]
Let $T_m(A)$ and $E_m(A)$ be the volume average of the stress and strain field over the crystallites in B, the orientations of which lie in A. We assume that the set functions $T_m(\cdot)$ and $E_m(\cdot)$ are vector-valued measures. Clearly, $\bar T = T_m(SO(3))$ and $\bar E = E_m(SO(3))$. Let $T(\cdot)$ and $E(\cdot)$ be the Radon–Nikodym derivative of $T_m$ and $E_m$ with respect to ℘, respectively. It follows that
\[ \bar T = \int_{SO(3)} T(R)w(R)\,dg, \qquad \bar E = \int_{SO(3)} E(R)w(R)\,dg. \tag{29} \]
Roughly speaking, T (R) and E(R) are the (Euclidean) volume averages of the
stress and strain fields over those crystallites in B whose orientations lie in an
infinitesimal group volume around R. By abuse of language, we simply call T (R)
and E(R) the mean stress and mean strain pertaining to the crystallites with orientation R.
We define the mean stiffness tensor of the polycrystal to be
\[ \bar C = \int_{SO(3)} C(R)w(R)\,dg, \tag{30} \]
and call
\[ D(R) = C(R) - \bar C, \qquad E'(R) = E(R) - \bar E \tag{31} \]
the perturbation stiffness tensor and the mean perturbation strain of the crystallites with orientation R, respectively. It is clear from their definition that
\[ \bar D = \int_{SO(3)} D(R)w(R)\,dg = 0, \tag{32} \]
\[ \overline{E'} = \int_{SO(3)} E'(R)w(R)\,dg = 0. \tag{33} \]
From the preceding equations and the identity
\[ T(R) = C(R)[E(R)] = (\bar C + D(R))[\bar E + E'(R)] = \bar C[\bar E] + D(R)[\bar E] + \bar C[E'(R)] + D(R)[E'(R)], \tag{34} \]
we deduce that
\[ \bar T = \bar C[\bar E] + \overline{D(R)[E'(R)]}. \tag{35} \]
Combining equations (27) and (35), we have
\[ C^{\mathrm{eff}}[\bar E] = \bar C[\bar E] + \overline{D(R)[E'(R)]}, \tag{36} \]
where
\[ \overline{D(R)[E'(R)]} = \int_{SO(3)} D(R)[E'(R)]\,w(R)\,dg. \tag{37} \]
Our task at hand is to determine $C^{\mathrm{eff}}$ from equation (36).
REMARK 3.1. The bulk modulus K of the polycrystal is defined by the equation
\[ \mathrm{tr}\,\bar T = 3K\,\mathrm{tr}\,\bar E. \tag{38} \]
Under the present setting, we always have
\[ K = \tfrac{1}{3}c_{11} + \tfrac{2}{3}c_{12} \tag{39} \]
for aggregates of cubic crystallites, irrespective of the texture. Indeed, from (7), we observe that
\[ B^{(1)}_{iikl} = 3\delta_{kl}, \qquad B^{(2)}_{iikl} = \delta_{kl}, \qquad B^{(3)}_{iikl}(R) = \delta_{kl}; \]
hence, from (6), we have
\[ C_{iikl}(R) = 3c_{12}\delta_{kl} + 2c_{44}\delta_{kl} + c\,\delta_{kl} = (c_{11} + 2c_{12})\delta_{kl}, \]
which leads to
\[ D_{iikl}(R) = C_{iikl}(R) - \int_{SO(3)} C_{iikl}(R)w(R)\,dg = 0. \]
The preceding equation and equation (35) imply that
\[ \bar T_{ii} = \bar C_{iikl}\bar E_{kl} = \Bigl(\int_{SO(3)} C_{iikl}(R)w(R)\,dg\Bigr)\bar E_{kl} = (c_{11} + 2c_{12})\delta_{kl}\bar E_{kl} = (c_{11} + 2c_{12})\bar E_{ii}. \tag{40} \]
Formula (39) then follows from a comparison of (38) with (40).
REMARK 3.2. When all the crystallites in the polycrystal have the same orientation $R_0$, we have $\bar C = C(R_0)$ and, by definition (31)$_1$, $D = 0$. Equation (36) then leads to the formula $C^{\mathrm{eff}} = C(R_0)$.
4. Orientational Averaging of Tensor Basis
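Let $A(\cdot)$ be a tensor function defined on the rotation group SO(3). For simplicity, we introduce the following notation
\[ \overline{A} = \int_{SO(3)} A(R)w(R)\,dg = \widetilde{A} + \widehat{A}, \tag{41} \]
where
\[ \widetilde{A} = \int_{SO(3)} A(R)\,w_{\mathrm{iso}}\,dg, \qquad \widehat{A} = \int_{SO(3)} A(R)\,(w(R) - w_{\mathrm{iso}})\,dg. \tag{42} \]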
Since $B^{(1)}(R) = I \otimes I$ and $B^{(2)}(R) = \mathbb{I}$ are constant tensor functions on SO(3), we have
\[ \overline{B^{(\alpha)}} = \widetilde{B^{(\alpha)}} = B^{(\alpha)}, \qquad \widehat{B^{(\alpha)}} = 0, \qquad \alpha = 1, 2. \tag{43} \]
From equations (7)$_3$ and (42)$_1$, we obtain the identity
\[ \widetilde{B^{(3)}} = \tfrac{1}{5}\,I \otimes I + \tfrac{2}{5}\,\mathbb{I}. \tag{44} \]
To proceed further, let us write
\[ \widehat{B^{(3)}} \equiv \Phi \tag{45} \]
for brevity. The tensor Φ is harmonic (see Section 2.3 above); explicit formulae expressing its components in terms of the texture coefficients have been reported elsewhere [17], which we reproduce here for completeness:
\[ \begin{aligned} \Phi_{2233} &= a_1, & \Phi_{1133} &= a_2, & \Phi_{1122} &= a_3, \\ \Phi_{1123} &= a_5 - a_8, & \Phi_{1113} &= -a_7 + 3a_4, & \Phi_{1112} &= -a_6 + a_9, \\ \Phi_{3323} &= -4a_5, & \Phi_{3313} &= -4a_4, & \Phi_{3312} &= 2a_6, \end{aligned} \tag{46} \]
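where
\[ \begin{aligned} a_1 &= -\frac{32\pi^2}{105}\Bigl(c^{4}_{00} + \sqrt{\tfrac{5}{2}}\,\mathrm{Re}(c^{4}_{20})\Bigr), & a_2 &= -\frac{32\pi^2}{105}\Bigl(c^{4}_{00} - \sqrt{\tfrac{5}{2}}\,\mathrm{Re}(c^{4}_{20})\Bigr), \\ a_3 &= \frac{8\pi^2}{105}\Bigl(c^{4}_{00} - \sqrt{70}\,\mathrm{Re}(c^{4}_{40})\Bigr), & a_4 &= \frac{8\sqrt{5}\,\pi^2}{105}\,\mathrm{Re}(c^{4}_{10}), \\ a_5 &= \frac{8\sqrt{5}\,\pi^2}{105}\,\mathrm{Im}(c^{4}_{10}), & a_6 &= \frac{8\sqrt{10}\,\pi^2}{105}\,\mathrm{Im}(c^{4}_{20}), \\ a_7 &= \frac{8\sqrt{35}\,\pi^2}{105}\,\mathrm{Re}(c^{4}_{30}), & a_8 &= \frac{8\sqrt{35}\,\pi^2}{105}\,\mathrm{Im}(c^{4}_{30}), \\ a_9 &= \frac{8\sqrt{70}\,\pi^2}{105}\,\mathrm{Im}(c^{4}_{40}); \end{aligned} \tag{47} \]
here Re(z) and Im(z) denote the real and imaginary parts of the complex number z, respectively. From the total symmetry of Φ and from the traceless condition (cf. equations (24) and (25)), all the other components of the harmonic tensor Φ can be obtained from those displayed above.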
The presence of texture symmetry imposes restrictions on the texture coefficients, and the formulae for the components of Φ will simplify accordingly. For instance, for orthorhombic aggregates, the texture coefficients are real, $c^{4}_{m0} = c^{4}_{\bar m 0}$, and $c^{4}_{m0} = 0$ for odd m (see Table I above). The independent non-trivial components are then
\[ \Phi_{2233} = -\frac{32\pi^2}{105}\Bigl(c^{4}_{00} + \sqrt{\tfrac{5}{2}}\,c^{4}_{20}\Bigr), \qquad \Phi_{1133} = -\frac{32\pi^2}{105}\Bigl(c^{4}_{00} - \sqrt{\tfrac{5}{2}}\,c^{4}_{20}\Bigr), \qquad \Phi_{1122} = \frac{8\pi^2}{105}\Bigl(c^{4}_{00} - \sqrt{70}\,c^{4}_{40}\Bigr). \tag{48} \]
Under the Voigt model, all the grains in the polycrystal are assumed to have a uniform strain field equal to the mean strain $\bar E$ of the polycrystal. Thus we have the mean stress of the crystallites with orientation R given by $T(R) = C(R)[\bar E]$, and the mean stress of the polycrystal given by $\bar T = \bar C[\bar E]$. It follows from (27) that, under the Voigt model, the effective stiffness tensor of the polycrystal is given by the mean stiffness tensor, i.e., $C^{\mathrm{eff}} = \bar C$. From equations (6) and (43)–(45), we obtain the following explicit formulae for polycrystalline aggregates of cubic crystallites with arbitrary texture symmetry:
\[ \bar C = c_{12}B^{(1)} + 2c_{44}B^{(2)} + c\bigl(\widetilde{B^{(3)}} + \widehat{B^{(3)}}\bigr) = \lambda\,I \otimes I + 2\mu\,\mathbb{I} + c\,\Phi, \tag{49} \]
\[ \lambda = c_{12} + \tfrac{1}{5}c = \tfrac{1}{5}c_{11} - \tfrac{2}{5}c_{44} + \tfrac{4}{5}c_{12}, \tag{50} \]
\[ \mu = c_{44} + \tfrac{1}{5}c = \tfrac{1}{5}c_{11} + \tfrac{3}{5}c_{44} - \tfrac{1}{5}c_{12}. \tag{51} \]
For the special instance of orthorhombic aggregates, the preceding form of the
mean stiffness tensor has long been available in the literature [3, 4].
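For orthorhombic texture, the Voigt mean stiffness of (49)-(51) can be assembled from three texture coefficients via (48). The sketch below assumes NumPy, uses the symbol `Phi` for the harmonic texture tensor of (45), and takes the radical $\sqrt{5/2}$ in (48) as reconstructed above; the texture coefficients and elastic constants are hypothetical numbers chosen only to exercise the code.

```python
import itertools
import numpy as np

delta = np.eye(3)
B1 = np.einsum("ij,kl->ijkl", delta, delta)
B2 = 0.5 * (np.einsum("ik,jl->ijkl", delta, delta)
            + np.einsum("il,jk->ijkl", delta, delta))

def voigt_lame(c11, c12, c44):
    """Lame constants (50), (51) and the anisotropy parameter c of (3)."""
    c = c11 - c12 - 2.0 * c44
    return c12 + c / 5.0, c44 + c / 5.0, c

def phi_orthorhombic(c400, c420, c440):
    """Harmonic tensor built from (48); indices are 0-based here."""
    k = np.pi ** 2 / 105.0
    comp = {
        (0, 0, 1, 1): 8.0 * k * (c400 - np.sqrt(70.0) * c440),
        (0, 0, 2, 2): -32.0 * k * (c400 - np.sqrt(2.5) * c420),
        (1, 1, 2, 2): -32.0 * k * (c400 + np.sqrt(2.5) * c420),
    }
    # The traceless condition (25) fixes the remaining diagonal components.
    comp[(0, 0, 0, 0)] = -comp[(0, 0, 1, 1)] - comp[(0, 0, 2, 2)]
    comp[(1, 1, 1, 1)] = -comp[(0, 0, 1, 1)] - comp[(1, 1, 2, 2)]
    comp[(2, 2, 2, 2)] = -comp[(0, 0, 2, 2)] - comp[(1, 1, 2, 2)]
    Phi = np.zeros((3, 3, 3, 3))
    for idx, val in comp.items():
        for p in set(itertools.permutations(idx)):   # total symmetry (24)
            Phi[p] = val
    return Phi

def voigt_stiffness(c11, c12, c44, Phi):
    """Mean stiffness tensor (49)."""
    lam, mu, c = voigt_lame(c11, c12, c44)
    return lam * B1 + 2.0 * mu * B2 + c * Phi

c11, c12, c44 = 168.4, 121.4, 75.4                 # illustrative values (GPa)
Phi = phi_orthorhombic(0.010, -0.005, 0.003)       # hypothetical coefficients
C_voigt = voigt_stiffness(c11, c12, c44, Phi)
```

Because `Phi` is traceless, the contraction $\bar C_{iikl}$ should equal $(c_{11} + 2c_{12})\delta_{kl}$ whatever the texture, in line with Remark 3.1.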
Under the Reuss model, all grains in the polycrystal are assumed to have a uniform stress field equal to the mean stress $\bar T$ of the polycrystal. It follows that for the Reuss model the effective compliance tensor of the polycrystal is none other than the mean compliance tensor
\[ \bar S = \int_{SO(3)} S(R)w(R)\,dg; \tag{52} \]
here $S(R) = R^{\otimes 4}S(I)$, where
\[ S(I) = s_{12}B^{(1)} + 2s_{44}B^{(2)} + sB^{(3)}(I) \tag{53} \]
with
\[ s = s_{11} - s_{12} - 2s_{44}, \tag{54} \]
\[ s_{11} = \frac{c_{11} + c_{12}}{c_{11}^2 + c_{11}c_{12} - 2c_{12}^2}, \qquad s_{12} = \frac{-c_{12}}{c_{11}^2 + c_{11}c_{12} - 2c_{12}^2}, \qquad s_{44} = \frac{1}{4c_{44}}. \tag{55} \]
Similar to the computation of $\bar C$, we obtain the mean compliance tensor of the polycrystal
\[ \bar S = \lambda_s\,I \otimes I + 2\mu_s\,\mathbb{I} + s\,\Phi, \tag{56} \]
\[ \lambda_s = \tfrac{1}{5}s_{11} - \tfrac{2}{5}s_{44} + \tfrac{4}{5}s_{12}, \qquad \mu_s = \tfrac{1}{5}s_{11} + \tfrac{3}{5}s_{44} - \tfrac{1}{5}s_{12}. \tag{57} \]
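The compliances in (55) can be checked by verifying that $S(I)$ inverts $C(I)$ on symmetric tensors. This is a minimal sketch assuming NumPy; the helper names and the copper-like constants are illustrative, and $s_{44}$ is understood as the tensor component of (53), i.e. $1/(4c_{44})$.

```python
import numpy as np

delta = np.eye(3)
B1 = np.einsum("ij,kl->ijkl", delta, delta)
B2 = 0.5 * (np.einsum("ik,jl->ijkl", delta, delta)
            + np.einsum("il,jk->ijkl", delta, delta))
B3 = np.einsum("ia,ja,ka,la->ijkl", delta, delta, delta, delta)

def ddot(A, B):
    return np.einsum("ijmn,mnkl->ijkl", A, B)

def cubic_tensor(a12, a44, a):
    """a12*B1 + 2*a44*B2 + a*B3(I), the common form of (5) and (53)."""
    return a12 * B1 + 2.0 * a44 * B2 + a * B3

def cubic_compliances(c11, c12, c44):
    """s11, s12, s44 from (55)."""
    den = c11 ** 2 + c11 * c12 - 2.0 * c12 ** 2
    return (c11 + c12) / den, -c12 / den, 1.0 / (4.0 * c44)

c11, c12, c44 = 168.4, 121.4, 75.4        # illustrative values (GPa)
s11, s12, s44 = cubic_compliances(c11, c12, c44)
s = s11 - s12 - 2.0 * s44                 # equation (54)

C = cubic_tensor(c12, c44, c11 - c12 - 2.0 * c44)
S = cubic_tensor(s12, s44, s)

lam_s = s11 / 5.0 - 2.0 * s44 / 5.0 + 4.0 * s12 / 5.0   # (57)
mu_s = s11 / 5.0 + 3.0 * s44 / 5.0 - s12 / 5.0          # (57)
print(np.allclose(ddot(C, S), B2))
```

The product $C(I) : S(I)$ should reproduce $B^{(2)}$, the identity on symmetric second-order tensors.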
5. Constitutive Assumption on Mean Perturbation Strain
The Voigt model and the Reuss model, for which we have $C^{\mathrm{eff}} = \bar C$ and $S^{\mathrm{eff}} = \bar S$, respectively, are too simplistic and are based on dubious physical assumptions. As shown in (49) and (56), the anisotropic parts of both $\bar C$ and $\bar S$ are linear in the texture coefficients. Starting from the assumptions that $C^{\mathrm{eff}} = C^{\mathrm{eff}}(w)$ and that $C^{\mathrm{eff}}$ is indifferent to the rotation of reference placement [6], which with the principle of material frame-indifference leads to the constitutive restriction
\[ C^{\mathrm{eff}}(w_Q) = Q^{\otimes 4}C^{\mathrm{eff}}(w) \tag{58} \]
for each rotation Q, Man [5, 14] derived for orthorhombic aggregates of cubic
crystallites a formula for C eff up to terms linear in the texture coefficients. Man’s
formula is identical in form to (49) for $\bar C$, although the parameters λ, µ and c
are all undetermined material constants; thus his formula should be interpreted in
the same spirit as the classical representation formula with two Lamé constants
in isotropic elasticity. Empirical experience has so far suggested that Man’s formula would work well for materials such as aluminum, whose single crystal has
weak anisotropy. On the other hand, experimental evidence [7] also indicates that
this simple formula is inadequate for strongly textured samples of copper, whose
single crystal manifests much stronger elastic anisotropy than that of aluminum.
With a view to applications involving strongly textured aggregates of strongly
anisotropic crystallites, here we seek a formula for C eff which delineates the effects
of crystallographic texture on elastic anisotropy up to terms quadratic in the texture
coefficients.
Consider, in the given polycrystal, the collection of crystallites with orientations in an infinitesimal group volume around R. Recall that we let $E(R)$ denote the mean strain (i.e., volume average of the strain field) in these crystallites, and we call $E' = E(R) - \bar E$ the mean perturbation strain pertaining to these crystallites (cf. Section 3 for a precise definition of the function $E(\cdot)$). Our analysis below is based on the following physical assumption:
(#) The mean perturbation strain $E'$ is governed by a constitutive relation of the form
\[ E' = \mathcal{E}(R, w, \bar E) \tag{59} \]
with $\mathcal{E}(R, w, 0) = 0$.
Since we are concerned with linear elasticity in this paper, we linearize the constitutive function (59) with respect to $\bar E$ and take as our starting point the constitutive relation
\[ E' = H(R, w)[\bar E], \tag{60} \]
where H is a fourth-order tensor with minor symmetries. Similar to equation (58), we require H to satisfy, under any rotation Q of the polycrystal, the constraint that
\[ H(QR, w_Q) = Q^{\otimes 4}H(R, w) \tag{61} \]
for each $R \in SO(3)$. On the other hand, since we restrict our attention to aggregates of cubic crystallites, we have
\[ H(R, w) = H(I, w) \tag{62} \]
for each $R \in O$, the group of cubic crystal symmetry.
To proceed further, suppose the function $H(R, \cdot)$ is sufficiently smooth in a neighborhood of $w = w_{\mathrm{iso}}$ that we may use the Taylor formula
\[ H(R, w) = H^{(0)}(R, w_{\mathrm{iso}}) + H^{(1)}(R)[w - w_{\mathrm{iso}}] + H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}] + o\bigl(\|w - w_{\mathrm{iso}}\|^2\bigr), \tag{63} \]
where $H^{(\beta)}(R)$ is 1/β! times the βth derivative of $H(R, \cdot)$ at $w = w_{\mathrm{iso}}$ and $\|\cdot\|$ denotes the $L^2$-norm. Henceforth, for simplicity, we shall suppress $w_{\mathrm{iso}}$ in $H^{(0)}(R, w_{\mathrm{iso}})$ and write it as $H^{(0)}(R)$. Clearly, for β = 0, 1, 2, $H^{(\beta)}$ enjoys the minor symmetries, and they satisfy
\[ \begin{aligned} H^{(0)}(QR) &= Q^{\otimes 4}H^{(0)}(R), \\ H^{(1)}(QR)[w_Q - w_{\mathrm{iso}}] &= Q^{\otimes 4}H^{(1)}(R)[w - w_{\mathrm{iso}}], \\ H^{(2)}(QR)[w_Q - w_{\mathrm{iso}}, w_Q - w_{\mathrm{iso}}] &= Q^{\otimes 4}H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}] \end{aligned} \tag{64} \]
for all $Q, R \in SO(3)$.
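That $E'$ has to satisfy also the requirement $\overline{E'} = 0$ (cf. (33) in Section 3) imposes further restrictions on $H^{(\beta)}$, i.e.,
\[ \widetilde{H^{(0)}} = \int_{SO(3)} H^{(0)}(R)\,w_{\mathrm{iso}}\,dg = 0, \tag{65} \]
\[ \int_{SO(3)} H^{(1)}(R)[w - w_{\mathrm{iso}}]\,w_{\mathrm{iso}}\,dg + \int_{SO(3)} H^{(0)}(R)\,(w(R) - w_{\mathrm{iso}})\,dg = 0, \tag{66} \]
\[ \int_{SO(3)} H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}]\,w_{\mathrm{iso}}\,dg + \int_{SO(3)} H^{(1)}(R)[w - w_{\mathrm{iso}}]\,(w(R) - w_{\mathrm{iso}})\,dg = 0. \tag{67} \]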
From (64)$_1$, we observe that
\[ H^{(0)}(R) = R^{\otimes 4}H^{(0)}(I), \qquad \forall R \in SO(3). \tag{68} \]
On the other hand, for $R \in O$, the symmetry group of the reference cubic crystallite, we have $H^{(0)}(R) = H^{(0)}(I)$ or
\[ H^{(0)}(I) = R^{\otimes 4}H^{(0)}(I), \qquad \forall R \in O, \tag{69} \]
where we have appealed to equation (68). It follows that
\[ H^{(0)}_{ijkl}(I) = \frac{1}{24}\sum_{R \in O} R_{ip}R_{jq}R_{kr}R_{ls}H^{(0)}_{pqrs}(I), \tag{70} \]
from which we obtain the identity
\[ H^{(0)}_{ijkl}(I) = H^{(0)}_{klij}(I) \tag{71} \]
by direct computations. In other words, $H^{(0)}(I)$ has the major symmetry. Because $H^{(0)}(I)$ enjoys the minor and major symmetries, equations (68) and (69) imply that we may write
\[ H^{(0)}(R) = \sum_{\alpha=1}^{3} h_\alpha B^{(\alpha)}(R), \tag{72} \]
where $h_\alpha$ are some constants. From (43), (44) and (65), we know
\[ \sum_{\alpha=1}^{3} h_\alpha \int_{SO(3)} B^{(\alpha)}(R)\,w_{\mathrm{iso}}\,dg = \Bigl(h_1 + \frac{h_3}{5}\Bigr)B^{(1)} + \Bigl(h_2 + \frac{2h_3}{5}\Bigr)B^{(2)} = 0. \tag{73} \]
Let $h_3 = \zeta$. From (73), we observe that
\[ h_1 = -\tfrac{1}{5}\zeta, \qquad h_2 = -\tfrac{2}{5}\zeta. \tag{74} \]
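Putting (74) into (72), we have
\[ H^{(0)}(R) = \zeta\,\Psi(R), \tag{75} \]
\[ \Psi(R) = -\tfrac{1}{5}B^{(1)} - \tfrac{2}{5}B^{(2)} + B^{(3)}(R). \tag{76} \]
For later use, we record here two more equations involving Ψ, namely:
\[ \overline{\Psi} = \Phi, \tag{77} \]
\[ D(R) = c\,(\Psi(R) - \Phi). \tag{78} \]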
Equation (77) follows immediately from (43)–(45) and (76), and equation (78)
from (6), (31)1 , (49) and (76).
6. Lamé Constants with Quadratic Texture Dependence
In this section, we consider the special instance where the texture of the polycrystal
also has cubic symmetry. This particular example serves to highlight a somewhat
surprising result under our present theory: If we account for the effects of texture
up to terms quadratic in the texture coefficients, then the Lamé constants of the
isotropic part of C eff will generally depend on the texture.
For brevity, we shall sometimes write
\[ H^{(\beta)}(R)[\underbrace{w - w_{\mathrm{iso}}, \ldots, w - w_{\mathrm{iso}}}_{\beta\text{-fold}}] \equiv H^{(\beta)}(R, w). \tag{79} \]
From (64), we observe that for β = 1, 2 and for each $Q \in O$, we have
\[ H^{(\beta)}(Q, w) = Q^{\otimes 4}H^{(\beta)}(I, w), \tag{80} \]
because $w_Q = w$ when Q belongs to the group of cubic texture symmetry. On the other hand, it follows from equations (62) and (63) that
\[ H^{(\beta)}(Q, w) = H^{(\beta)}(I, w) \tag{81} \]
for each $Q \in O$. Hence we have
\[ H^{(\beta)}(I, w) = Q^{\otimes 4}H^{(\beta)}(I, w) \tag{82} \]
for each $Q \in O$. Following the same argument as in the derivation of equation (72), we see that we may express $H^{(\beta)}(I, w)$ in terms of the tensor basis $B^{(\alpha)}(I)$:
\[ H^{(\beta)}(I, w) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w)\,B^{(\alpha)}(I) \tag{83} \]
for β = 1, 2, where
\[ h^{(1)}_\alpha(w) \equiv h^{(1)}_\alpha[w - w_{\mathrm{iso}}], \qquad h^{(2)}_\alpha(w) \equiv h^{(2)}_\alpha[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}]. \tag{84} \]
Combining equations (64) and (83), we conclude that for each $R \in SO(3)$
\[ H^{(\beta)}(R, w_R) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w)\,B^{(\alpha)}(R). \tag{85} \]
Replacing w by $w_{R^T}$ in (85), we have
\[ H^{(\beta)}(R, w) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w_{R^T})\,B^{(\alpha)}(R). \tag{86} \]
By taking the ODF w as a parameter, we treat
\[ a_\alpha(R; w) \equiv h^{(1)}_\alpha[w_{R^T} - w_{\mathrm{iso}}], \qquad f_\alpha(R; w) \equiv h^{(2)}_\alpha[w_{R^T} - w_{\mathrm{iso}}, w_{R^T} - w_{\mathrm{iso}}] \]
as functions defined on the rotation group. Clearly,
\[ \bar a_\alpha(w) = \int_{SO(3)} a_\alpha(R; w)\,w(R)\,dg, \qquad \tilde f_\alpha(w) = \int_{SO(3)} f_\alpha(R; w)\,w_{\mathrm{iso}}\,dg \tag{87} \]
are functions of the ODF.
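LEMMA 6.1. The parameters $\bar a_\alpha$ and $\tilde f_\alpha$ are isotropic functions of the ODF, i.e., $\bar a_\alpha(w_Q) = \bar a_\alpha(w)$ and $\tilde f_\alpha(w_Q) = \tilde f_\alpha(w)$ for each rotation Q.
Proof. For later convenience, we introduce the notation
\[ A^{(\alpha)}_{lmn} \equiv h^{(1)}_\alpha\bigl[D^{l}_{mn}\bigr], \qquad F^{(\alpha)}_{lmnl'm'n'} \equiv h^{(2)}_\alpha\bigl[D^{l}_{mn}, D^{l'}_{m'n'}\bigr]. \tag{88} \]
By the linearity and bilinearity of $h^{(1)}_\alpha$ and $h^{(2)}_\alpha$, respectively, we have
\[ a_\alpha(R; w) = h^{(1)}_\alpha[w_{R^T} - w_{\mathrm{iso}}] = \sum_{l,m,n} A^{(\alpha)}_{lmn}\,\check c^{l}_{mn} = \sum_{l,m,n}\sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,D^{l}_{sm}(R), \tag{89} \]
\[ \begin{aligned} f_\alpha(R; w) &= h^{(2)}_\alpha[w_{R^T} - w_{\mathrm{iso}}, w_{R^T} - w_{\mathrm{iso}}] = \sum_{l,m,n}\sum_{l',m',n'} F^{(\alpha)}_{lmnl'm'n'}\,\check c^{l}_{mn}\,\check c^{l'}_{m'n'} \\ &= \sum_{l,m,n}\sum_{l',m',n'}\sum_{s=-l}^{l}\sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\,c^{l}_{sn}\,c^{l'}_{s'n'}\,D^{l'}_{s'm'}(R)\,D^{l}_{sm}(R), \end{aligned} \tag{90} \]
where $l, l' \ge 1$ and we have made use of equation (20). Since we have
\[ \tilde a_\alpha = \sum_{l,m,n}\sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\int_{SO(3)} D^{l}_{sm}(R)\,w_{\mathrm{iso}}\,dg = 0, \tag{91} \]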
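we obtain from equations (11), (42)$_1$, and (89) the formula
\[ \begin{aligned} \bar a_\alpha(w) = \hat a_\alpha &= \sum_{l,m,n}\sum_{s=-l}^{l}\sum_{l',m',n'} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,c^{l'}_{m'n'}\int_{SO(3)} D^{l}_{sm}(R)\,D^{l'}_{m'n'}(R)\,dg \\ &= \sum_{l,m,n}\sum_{s=-l}^{l} (-1)^{s+m}\,\frac{8\pi^2}{2l+1}\,A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,c^{l}_{\bar s\bar m}, \end{aligned} \tag{92} \]
where we have appealed to the property $d^{l}_{\bar m\bar n}(\theta) = (-1)^{m+n}d^{l}_{mn}(\theta)$. Similarly, from (90), we have
\[ \tilde f_\alpha(w) = \sum_{l,m,n}\sum_{s=-l}^{l}\sum_{n'=-l}^{l} \frac{(-1)^{s+m}}{2l+1}\,F^{(\alpha)}_{lmnl\bar m n'}\,c^{l}_{sn}\,c^{l}_{\bar s n'}. \tag{93} \]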
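Since $w_Q(R) = w(Q^T R)$, we deduce from (89) that
\[ a_\alpha(R; w_Q) = \sum_{l,m,n}\sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,D^{l}_{sm}(Q^T R). \tag{94} \]
From (20) and (92) we derive the identity
\[ \begin{aligned} \bar a_\alpha(w_Q) &= \int_{SO(3)} \sum_{l,m,n}\sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,D^{l}_{sm}(Q^T R)\,\bigl(w(Q^T R) - w_{\mathrm{iso}}\bigr)\,dg \\ &= \sum_{l,m,n}\sum_{s=-l}^{l}\sum_{l',m',n'} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,c^{l'}_{m'n'}\int_{SO(3)} D^{l}_{sm}(Q^T R)\,D^{l'}_{m'n'}(Q^T R)\,dg \\ &= \sum_{l,m,n}\sum_{s=-l}^{l}\sum_{l',m',n'} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,c^{l'}_{m'n'}\int_{SO(3)} D^{l}_{sm}(R)\,D^{l'}_{m'n'}(R)\,dg = \bar a_\alpha(w). \end{aligned} \tag{95} \]
Similarly, we can show that
\[ \tilde f_\alpha(w_Q) = \sum_{l,m,n}\sum_{l',m',n'}\sum_{s=-l}^{l}\sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\,c^{l}_{sn}\,c^{l'}_{s'n'}\int_{SO(3)} D^{l'}_{s'm'}(Q^T R)\,D^{l}_{sm}(Q^T R)\,w_{\mathrm{iso}}\,dg = \tilde f_\alpha(w). \tag{96} \]
✷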
Substituting (89) and (90) into equation (86) for β = 1, 2, respectively, we have
\[ H^{(1)}(R, w) = \sum_{\alpha=1}^{3}\sum_{l,m,n}\sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\,c^{l}_{sn}\,D^{l}_{sm}(R)\,B^{(\alpha)}(R), \tag{97} \]
\[ H^{(2)}(R, w) = \sum_{\alpha=1}^{3}\sum_{l,m,n}\sum_{l',m',n'}\sum_{s=-l}^{l}\sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\,c^{l}_{sn}\,c^{l'}_{s'n'}\,D^{l'}_{s'm'}(R)\,D^{l}_{sm}(R)\,B^{(\alpha)}(R). \tag{98} \]
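From (63) and (75), the perturbation strain $E'$ in (60) can be expressed as
\[ E' = \bigl(\zeta\Psi + H^{(1)} + H^{(2)} + \cdots\bigr)[\bar E]. \tag{99} \]
We substitute (78) and (99) into the term $\overline{D[E']}$ in (35) to obtain
\[ \overline{D[E']} = c\,\bigl(\bar C^{(1)} + \bar C^{(2)} + \tilde C^{(3)} + \bar C^{(4)} + \tilde C^{(5)} + o(|c^{l}_{mn}|^2)\bigr)[\bar E], \tag{100} \]
where (cf. Section 1 for notation)
\[ \bar C^{(1)} = \zeta\,\overline{\Psi : \Psi}, \qquad \bar C^{(2)} = -\zeta\,\Phi : \overline{\Psi}, \tag{101} \]
\[ \tilde C^{(3)} = -\Phi : \widetilde{H^{(1)}}, \qquad \bar C^{(4)} = \overline{\Psi : H^{(1)}}, \tag{102} \]
\[ \tilde C^{(5)} = \int_{SO(3)} \Psi : H^{(2)}\,w_{\mathrm{iso}}\,dg. \tag{103} \]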
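Table II. Results of $B^{(\alpha)} : B^{(\beta)}$

$B^{(\alpha)}_{ijmn}B^{(\beta)}_{mnkl}$     β = 1               β = 2               β = 3
α = 1                                       $3B^{(1)}_{ijkl}$   $B^{(1)}_{ijkl}$    $B^{(1)}_{ijkl}$
α = 2                                       $B^{(1)}_{ijkl}$    $B^{(2)}_{ijkl}$    $B^{(3)}_{ijkl}$
α = 3                                       $B^{(1)}_{ijkl}$    $B^{(3)}_{ijkl}$    $B^{(3)}_{ijkl}$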
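In order to compute the right-hand sides of equations (101)–(103), first we list the results of $B^{(\alpha)} : B^{(\beta)}$ in Table II. From (76) and Table II, we have
\[ \Psi(R) : \Psi(R) = -\tfrac{3}{25}B^{(1)} + \tfrac{4}{25}B^{(2)} + \tfrac{1}{5}B^{(3)}(R). \tag{104} \]
From (43), (44), (101)$_1$ and (104), we obtain
\[ \bar C^{(1)} = \zeta\Bigl(-\tfrac{2}{25}\,I \otimes I + \tfrac{6}{25}\,\mathbb{I} + \tfrac{1}{5}\,\Phi\Bigr). \tag{105} \]
Since $\overline{\Psi} = \Phi$ (see equation (77)), we know
\[ \bar C^{(2)} = -\zeta\,\Phi : \Phi. \tag{106} \]
From (65), (66), (75) and (102)$_1$, we observe that $\widetilde{H^{(1)}} = -\widehat{H^{(0)}} = -\zeta\Phi$ and
\[ \tilde C^{(3)} = \zeta\,\Phi : \Phi. \tag{107} \]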
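The entries of Table II and the product (104) can be verified numerically for an arbitrary orientation. The following is a minimal sketch assuming NumPy; `Psi` stands for the tensor defined in (76), and the helper names and the test rotation are illustrative.

```python
import numpy as np

delta = np.eye(3)
B1 = np.einsum("ij,kl->ijkl", delta, delta)
B2 = 0.5 * (np.einsum("ik,jl->ijkl", delta, delta)
            + np.einsum("il,jk->ijkl", delta, delta))

def B3(R):
    return np.einsum("ia,ja,ka,la->ijkl", R, R, R, R)

def ddot(A, B):
    return np.einsum("ijmn,mnkl->ijkl", A, B)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

R = rot_z(0.4) @ rot_x(1.1)               # arbitrary orientation
basis = {1: B1, 2: B2, 3: B3(R)}

# Expected products B(alpha) : B(beta) as listed in Table II.
table = {(1, 1): 3.0 * B1, (1, 2): B1, (1, 3): B1,
         (2, 1): B1, (2, 2): B2, (2, 3): basis[3],
         (3, 1): B1, (3, 2): basis[3], (3, 3): basis[3]}
table_ok = all(np.allclose(ddot(basis[a], basis[b]), table[a, b])
               for (a, b) in table)

Psi = -B1 / 5.0 - 2.0 * B2 / 5.0 + basis[3]                    # equation (76)
rhs_104 = -3.0 * B1 / 25.0 + 4.0 * B2 / 25.0 + basis[3] / 5.0  # equation (104)
print(table_ok, np.allclose(ddot(Psi, Psi), rhs_104))
```

The same products, averaged with the ODF, give the coefficients $-2/25$, $6/25$ and $1/5$ that appear in (105).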
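With the help of Table II, we can recast (102)$_2$ and (103) as
\[ \bar C^{(4)} = -\tfrac{1}{5}(\bar a_2 + \bar a_3)\,I \otimes I - \tfrac{2}{5}\bar a_2\,\mathbb{I} + \int_{SO(3)} \bigl(a_2 + \tfrac{3}{5}a_3\bigr)B^{(3)}(R)\,w(R)\,dg, \tag{108} \]
\[ \tilde C^{(5)} = -\tfrac{1}{5}(\tilde f_2 + \tilde f_3)\,I \otimes I - \tfrac{2}{5}\tilde f_2\,\mathbb{I} + \int_{SO(3)} \bigl(f_2 + \tfrac{3}{5}f_3\bigr)B^{(3)}(R)\,w_{\mathrm{iso}}\,dg. \tag{109} \]
To proceed further, let us write
\[ A^{0} = \int_{SO(3)} \bigl(a_2 + \tfrac{3}{5}a_3\bigr)B^{(3)}(R)\,w_{\mathrm{iso}}\,dg, \tag{110} \]
\[ A^{1} = \int_{SO(3)} \bigl(a_2 + \tfrac{3}{5}a_3\bigr)B^{(3)}(R)\,(w - w_{\mathrm{iso}})\,dg, \tag{111} \]
\[ F^{0} = \int_{SO(3)} \bigl(f_2 + \tfrac{3}{5}f_3\bigr)B^{(3)}(R)\,w_{\mathrm{iso}}\,dg. \tag{112} \]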
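Since $B^{(3)}(R)$ is a fourth-order tensor and $c^{l}_{mn} = 0$ for $1 \le l \le 3$ (see (18)), we observe from (89) and (110) that
\[ A^{0} = \int_{SO(3)} \sum_{m,n=-4}^{4}\sum_{s=-4}^{4} U_{4mn}\,c^{4}_{sn}\,D^{4}_{sm}(R)\,B^{(3)}(R)\,w_{\mathrm{iso}}\,dg, \tag{113} \]
where
\[ U_{4mn} = A^{(2)}_{4mn} + \tfrac{3}{5}A^{(3)}_{4mn}. \tag{114} \]
Let
\[ d_m = U_{4m0} + \frac{\sqrt{70}}{14}\,U_{4m4} + \frac{\sqrt{70}}{14}\,U_{4m\bar 4}. \tag{115} \]
Using Table I, we have
\[ A^{0} = \int_{SO(3)} \sum_{m=-4}^{4}\sum_{s=-4}^{4} d_m\,c^{4}_{s0}\,D^{4}_{sm}(R)\,B^{(3)}(R)\,w_{\mathrm{iso}}\,dg. \tag{116} \]
It is easy to show [17] that
\[ \int_{SO(3)} D^{4}_{sm}(R)\,B^{(3)}(R)\,w_{\mathrm{iso}}\,dg = 0 \quad \text{when } m = \pm 1, \pm 2, \pm 3, \tag{117} \]
and
\[ \int_{SO(3)} D^{4}_{s4}(R)\,B^{(3)}(R)\,dg = \int_{SO(3)} D^{4}_{s\bar 4}(R)\,B^{(3)}(R)\,dg = \frac{\sqrt{70}}{14}\int_{SO(3)} D^{4}_{s0}(R)\,B^{(3)}(R)\,dg. \tag{118} \]
Hence from (116), we have
\[ A^{0} = \xi_1 \int_{SO(3)} \sum_{s=-4}^{4} c^{4}_{s0}\,D^{4}_{s0}(R)\,B^{(3)}(R)\,w_{\mathrm{iso}}\,dg = \xi\,\Phi, \tag{119} \]
where
\[ \xi_1 = d_0 + \frac{\sqrt{70}}{14}\,d_4 + \frac{\sqrt{70}}{14}\,d_{\bar 4}, \quad \text{and} \quad \xi = \frac{7}{96\pi^2}\,\xi_1. \tag{120} \]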
Now we consider the relations (111) and (112). Using the Clebsch–Gordan expansion [18], we see that
\[ A^{1} = \sum_{l',m',n'}\sum_{l,m,n}\sum_{s=-l}^{l} U_{lmn}\,c^{l}_{sn}\,c^{l'}_{m'n'}\,\Gamma^{lsm}_{l'm'n'}, \tag{121} \]
\[ F^{0} = w_{\mathrm{iso}}\sum_{l',m',n'}\sum_{l,m,n}\sum_{s'=-l'}^{l'}\sum_{s=-l}^{l} V^{l'm'n'}_{lmn}\,c^{l}_{sn}\,c^{l'}_{s'n'}\,\Gamma^{lsm}_{l's'm'}, \tag{122} \]
where
\[ V^{l'm'n'}_{lmn} = F^{(2)}_{lmnl'm'n'} + \tfrac{3}{5}F^{(3)}_{lmnl'm'n'}, \tag{123} \]
\[ \Gamma^{lmn}_{l'm'n'} = \sum_{J=|l-l'|}^{l+l'}\sum_{M=-J}^{J}\sum_{N=-J}^{J} C^{JM}_{lml'm'}\,C^{JN}_{lnl'n'}\int_{SO(3)} D^{J}_{MN}(R)\,B^{(3)}(R)\,dg; \tag{124} \]
here $C^{JM}_{lml'm'}$ are the Clebsch–Gordan coefficients. Since $B^{(3)}(R)$ is a fourth-order tensor pertaining to a cubic crystal, we have
\[ \int_{SO(3)} D^{J}_{MN}(R)\,B^{(3)}(R)\,dg = 0 \quad \text{unless } J \in \{0, 4\} \text{ and } N \in \{-4, 0, 4\}. \tag{125} \]
We can decompose $A^{1}$ and $F^{0}$ in (121) and (122) into
\[ A^{1} = K^{(0)} + K^{(4)}, \qquad F^{0} = L^{(0)} + L^{(4)}, \tag{126} \]
where $K^{(0)}$ and $L^{(0)}$ are the partial sums in (121) and (122) with J = 0; $K^{(4)}$ and $L^{(4)}$ are the partial sums in (121) and (122) with J = 4. Note that the tensors $K^{(0)}$ and $L^{(0)}$ lie in the isotropic subspace of $[[V^2]^2]$, whereas $K^{(4)}$ and $L^{(4)}$ belong to the 9-dimensional anisotropic $D_4$ subspace and they are harmonic (i.e., totally symmetric and traceless).
When J = 0, we have $l = l'$ and $D^{0}_{00}(R) = 1$, which leads to
\[ \Gamma^{lsm}_{l'm'n'} = C^{00}_{lslm'}\,C^{00}_{lmln'}\int_{SO(3)} D^{0}_{00}(R)\,B^{(3)}(R)\,dg = \frac{\delta_{s\bar m'}\delta_{m\bar n'}}{2l+1}\,(-1)^{s+m}\,8\pi^2\Bigl(\tfrac{1}{5}B^{(1)} + \tfrac{2}{5}B^{(2)}\Bigr) \tag{127} \]
and
\[ \Gamma^{lsm}_{l's'm'} = \frac{\delta_{s\bar s'}\delta_{m\bar m'}}{2l+1}\,(-1)^{s+m}\,8\pi^2\Bigl(\tfrac{1}{5}B^{(1)} + \tfrac{2}{5}B^{(2)}\Bigr), \tag{128} \]
where we have appealed to the identity [18]
\[ C^{00}_{lsl'm'} = \frac{\delta_{ll'}\delta_{s\bar m'}}{\sqrt{2l+1}}\,(-1)^{l-s}. \tag{129} \]
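Putting (126)–(128) into (121) and (122), respectively, we obtain from (92), (93), (114) and (123) the formulae
\[ K^{(0)} = \sum_{l,m,n}\frac{8\pi^2}{2l+1}\sum_{s=-l}^{l}(-1)^{s+m}\,U_{lmn}\,c^{l}_{sn}\,c^{l}_{\bar s\bar m}\Bigl(\tfrac{1}{5}\,I \otimes I + \tfrac{2}{5}\,\mathbb{I}\Bigr) = \bigl(\bar a_2 + \tfrac{3}{5}\bar a_3\bigr)\Bigl(\tfrac{1}{5}\,I \otimes I + \tfrac{2}{5}\,\mathbb{I}\Bigr), \tag{130} \]
\[ L^{(0)} = \bigl(\tilde f_2 + \tfrac{3}{5}\tilde f_3\bigr)\Bigl(\tfrac{1}{5}\,I \otimes I + \tfrac{2}{5}\,\mathbb{I}\Bigr). \tag{131} \]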
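Substitution of (110)–(112), (119), (126), (130), and (131) into (108) and (109) leads to
\[ \bar C^{(4)} = -\tfrac{2}{25}\bar a_3\,I \otimes I + \tfrac{6}{25}\bar a_3\,\mathbb{I} + \xi\,\Phi + K^{(4)}, \tag{132} \]
\[ \tilde C^{(5)} = -\tfrac{2}{25}\tilde f_3\,I \otimes I + \tfrac{6}{25}\tilde f_3\,\mathbb{I} + L^{(4)}. \tag{133} \]
Finally, combining (100)–(107), (132) and (133) with (35), we obtain a formula for the effective elasticity tensor of cubic aggregates of cubic crystallites, which is correct up to terms quadratic in the texture coefficients:
\[ C^{\mathrm{eff}} = \lambda_{\mathrm{eff}}\,I \otimes I + 2\mu_{\mathrm{eff}}\,\mathbb{I} + c_{\mathrm{eff}}\,\Phi + c\,\bigl(K^{(4)} + L^{(4)}\bigr), \tag{134} \]
where
\[ \lambda_{\mathrm{eff}} = \lambda - \frac{2c}{25}\bigl(\zeta + \bar a_3 + \tilde f_3\bigr), \qquad 2\mu_{\mathrm{eff}} = 2\mu + \frac{6c}{25}\bigl(\zeta + \bar a_3 + \tilde f_3\bigr), \qquad c_{\mathrm{eff}} = c + \frac{c\zeta}{5} + c\xi. \tag{135} \]
In (134), $\lambda_{\mathrm{eff}}$ and $\mu_{\mathrm{eff}}$ are the effective Lamé constants of the polycrystal, which contain terms quadratic in the texture coefficients and are isotropic functions of the ODF w. The isotropic and anisotropic parts of $C^{\mathrm{eff}}$ are $\lambda_{\mathrm{eff}}\,I \otimes I + 2\mu_{\mathrm{eff}}\,\mathbb{I}$ and $c_{\mathrm{eff}}\,\Phi + c\,(K^{(4)} + L^{(4)})$, respectively. The term $c_{\mathrm{eff}}\,\Phi$ is linear in the texture coefficients, whereas $c\,(K^{(4)} + L^{(4)})$ is quadratic. Since the tensors Φ, $K^{(4)}$ and $L^{(4)}$ are harmonic (cf. Section 2.3 above), we have $3K = 3\lambda_{\mathrm{eff}} + 2\mu_{\mathrm{eff}} = 3\lambda + 2\mu = c_{11} + 2c_{12}$ in agreement with (39).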
7. HM-V Model, HM-R Model, and Compatibility
Expression (134) shows that under assumption (#) crystallographic texture will
generally affect the isotropic part of the effective elasticity tensor of a textured
polycrystal. The same expression, which pertains to the special instance of cubic
aggregates, however, also betrays the fact that the theory we presented is in a sense
too general for practical purposes. In (134), for example, the myriad undetermined coefficients $A^{(\alpha)}_{lmn}$ and $F^{(\alpha)}_{lmnl'm'n'}$ make it impossible to apply this expression
in practical computations. For this reason, in this section, we shall establish two
simple models by adding some ad hoc simplifying assumptions.
7.1. HM - V MODEL
In our first model, which we call the HM-V model, we discard all the o(w −wiso )
terms in equation (63) and simply take
(136)
E = H (0)(R) + H (1) (R, w) [E]
as the constitutive relation for the mean perturbation strain. Under assumption (136),
equation (67) becomes
\[ \int_{SO(3)} H^{(1)}(R, w)\,\bigl(w(R) - w_{\mathrm{iso}}\bigr)\, dg = 0. \tag{137} \]
In order that equation (137) be satisfied for all independent texture coefficients, the
quantity H (1) (R, w) must be independent of R. Hence, we conclude from (66),
(75) and (77) that
H (1) (R, w) = −ζ .
(138)
Substituting (75) and (138) into (136), we obtain the following formula for the
perturbation strain under the HM-V model:
E = ζ (R) − [E].
(139)
REMARK 7.1. For later use, let us examine equation (139) when the cubic crystallites are almost isotropic, i.e., c ≈ 0. Under such circumstances, we expect
that E should be almost a constant function of R and E ≈ 0 for all R. Since
the expression $\Phi(R) - \Phi$ is independent of c, we conclude that the HM-V model
would be unphysical unless ζ ≈ 0 when the crystallites are almost isotropic.
Now, from (77), (78), and (139), we have
D[E ] = cζ(
:
−
:
(140)
)[E].
Substituting (49) and (140) into (35), we derive from (43)–(45) and (104) a simple
formula for the effective stiffness tensor of polycrystalline aggregates of cubic
crystallites as follows:
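\[ C^{\mathrm{eff}} = \lambda B^{(1)} + 2\mu B^{(2)} + c\,\Phi + c\zeta\Bigl(-\frac{2}{25}B^{(1)} + \frac{6}{25}B^{(2)} + \frac{1}{5}\Phi - \Phi : \Phi\Bigr)
 = \lambda^{\circ}\, I \otimes I + 2\mu^{\circ}\, \mathbb{I} + c^{\circ}\,\Phi + d^{\circ}\,\Phi : \Phi \tag{141} \]
with
\[ \lambda^{\circ} = \lambda - \frac{2c\zeta}{25}, \qquad \mu^{\circ} = \mu + \frac{3c\zeta}{25}, \qquad c^{\circ} = c + \frac{c\zeta}{5}, \qquad d^{\circ} = -c\zeta, \tag{142} \]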
where c, λ and µ are given by (3), (50) and (51), respectively, and ζ is an undetermined material constant. Expression (141) for $C^{\mathrm{eff}}$ contains a term quadratic in the
texture coefficients. It is applicable to aggregates of cubic crystallites with arbitrary
texture symmetry, for which the tensor $\Phi$ is given by (46).
The isotropic part of the tensor C eff can be obtained from formula (26):
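\[ C^{\mathrm{eff}}_{\mathrm{iso}} = \int_{SO(3)} R^{\otimes 4}\, C^{\mathrm{eff}}\, w_{\mathrm{iso}}\, dg = \lambda^{\mathrm{eff}}\, I \otimes I + 2\mu^{\mathrm{eff}}\, \mathbb{I}, \tag{143} \]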
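where
\[ \lambda^{\mathrm{eff}} = \lambda^{\circ} - \frac{512\pi^{4} d^{\circ}}{4725}\sum_{k=-4}^{4} |c^{4}_{k0}|^{2}, \qquad
   \mu^{\mathrm{eff}} = \mu^{\circ} + \frac{256\pi^{4} d^{\circ}}{1575}\sum_{k=-4}^{4} |c^{4}_{k0}|^{2}. \tag{144} \]
The Lamé constants $\lambda^{\mathrm{eff}}$ and $\mu^{\mathrm{eff}}$ pertaining to the isotropic part of $C^{\mathrm{eff}}$ are isotropic
functions of the ODF. In the limit when all the $c^{4}_{m0}$ coefficients vanish, the polycrystal
exhibits elastic isotropy, and we have $\lambda^{\mathrm{eff}} = \lambda^{\circ}$, $\mu^{\mathrm{eff}} = \mu^{\circ}$ for the isotropic
polycrystal.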
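A short consistency check, offered here as a worked computation rather than a new result: in both (142) and (144) the corrections to the two Lamé constants cancel in the combination $3\lambda + 2\mu$, so the HM-V model leaves the bulk modulus of the isotropic part unchanged. The check uses only the coefficients $2/25$, $3/25$, $512/4725$ and $256/1575$ stated in those formulae.

```latex
\begin{align*}
3\lambda^{\circ} + 2\mu^{\circ}
  &= 3\lambda - \frac{6c\zeta}{25} + 2\mu + \frac{6c\zeta}{25}
   = 3\lambda + 2\mu, \\
3\lambda^{\mathrm{eff}} + 2\mu^{\mathrm{eff}}
  &= 3\lambda^{\circ} + 2\mu^{\circ}
     + \Bigl(-\frac{1536}{4725} + \frac{512}{1575}\Bigr)\pi^{4} d^{\circ}
       \sum_{k=-4}^{4} |c^{4}_{k0}|^{2}
   = 3\lambda + 2\mu ,
\end{align*}
% since 512/1575 = 1536/4725.
```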
When all the cubic crystallites in the polycrystal have the same orientation as
the reference single crystal, the polycrystal is said to have the ideal Cube texture
in the jargon of metallurgy. Under the theoretical setting of the present paper, a
polycrystal with the ideal Cube texture is indistinguishable from a single crystal.
By substituting into (141) the values of texture coefficients appropriate for the Cube
texture, i.e.,
\[ c^{4}_{00} = \frac{21}{32\pi^{2}}, \qquad c^{4}_{40} = c^{4}_{\bar{4}0} = \frac{\sqrt{70}}{14}\, c^{4}_{00}, \tag{145} \]
and all other $c^{4}_{m0} = 0$, it is straightforward to verify by direct computations that the
effective elasticity tensor of the polycrystal reduces to
\[ C^{\mathrm{eff}} = c_{12}\, I \otimes I + 2c_{44}\, \mathbb{I} + c\,B^{(3)}(I), \tag{146} \]
which is none other than the elasticity tensor of the reference single crystal (cf.
equation (5) above). This result, while expected (see Remark 3.2), is still comforting, for it serves as a check on the correctness of the computations that lead to
equation (141).
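Part of this check can be carried out by hand. The sketch below, which reads (145) as $c^{4}_{00} = 21/(32\pi^{2})$ and $c^{4}_{40} = c^{4}_{\bar{4}0} = (\sqrt{70}/14)\,c^{4}_{00}$, verifies only the isotropic part: inserting the Cube-texture coefficients into (144) with (142) returns the constants λ and µ, independently of ζ, as it should for a single crystal.

```latex
\begin{align*}
\sum_{k=-4}^{4} |c^{4}_{k0}|^{2}
  &= \Bigl(\frac{21}{32\pi^{2}}\Bigr)^{2}\Bigl(1 + 2\cdot\frac{70}{196}\Bigr)
   = \frac{441}{1024\,\pi^{4}}\cdot\frac{12}{7}
   = \frac{189}{256\,\pi^{4}}, \\
\lambda^{\mathrm{eff}}
  &= \lambda^{\circ} - \frac{512\pi^{4} d^{\circ}}{4725}\cdot\frac{189}{256\,\pi^{4}}
   = \lambda - \frac{2c\zeta}{25} + \frac{2c\zeta}{25} = \lambda, \\
\mu^{\mathrm{eff}}
  &= \mu^{\circ} + \frac{256\pi^{4} d^{\circ}}{1575}\cdot\frac{189}{256\,\pi^{4}}
   = \mu + \frac{3c\zeta}{25} - \frac{3c\zeta}{25} = \mu .
\end{align*}
```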
7.2. HM - R MODEL
By reversing the roles of stress and strain, we may follow the same procedure as
the derivation of (141) to obtain a parallel expression for the effective compliance
tensor S eff . For the polycrystalline aggregate, let
T = T (R) − T
(147)
be the mean perturbation stress of the crystallites with orientations in an infinitesimal group volume around R (cf. Section 3 for a precise definition of the
function T (·)). Instead of assumption (#) on the mean perturbation strain E (see
Section 5), now we postulate the constitutive relation
T = T (R, w, T )
(148)
for the mean perturbation stress, with T (R, w, 0) = 0. Next we linearize equation (148) with respect to T and write
T = G(R, w)[T ],
(149)
and we express the constitutive function G in the Taylor formula
\[ G(R, w) = G^{(0)}(R) + G^{(1)}(R)[w - w_{\mathrm{iso}}] + o\bigl(\|w - w_{\mathrm{iso}}\|\bigr), \tag{150} \]
where G(β) (R, w) enjoy the minor symmetries.
In our second model, which we call the HM-R model, we ignore all the o(w −
wiso ) terms in equation (150) and take
(151)
T = G(0)(R) + G(1) (R)[w − wiso ] [T ]
as the constitutive equation for the mean perturbation stress. Parallel to what we
obtain under the HM-V model (see equation (139)), we find that under the HM-R
model equation (151) reduces to the form
T = η( (R) −
)[T ],
(152)
where η is an undetermined material constant.
Let L = S(R) − S be the perturbation compliance tensor of crystallites with
orientation R. Parallel to equation (78), we have
\[ L(R) = s\,\bigl(\Phi(R) - \Phi\bigr). \tag{153} \]
The mean strain tensor E and the effective compliance tensor S eff of the polycrystal
are given by the equation (cf. equations (35) and (36))
E = S eff [T ] = S[T ] + L[T ].
(154)
Substituting (56), (152) and (153) into (154), we obtain the following formula for
the effective compliance tensor of the polycrystal:
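\[ S^{\mathrm{eff}} = \lambda^{\circ}_{s}\, I \otimes I + 2\mu^{\circ}_{s}\, \mathbb{I} + c^{\circ}_{s}\,\Phi + d^{\circ}_{s}\,\Phi : \Phi \tag{155} \]
with
\[ \lambda^{\circ}_{s} = \lambda_{s} - \frac{2s\eta}{25}, \qquad \mu^{\circ}_{s} = \mu_{s} + \frac{3s\eta}{25}, \qquad c^{\circ}_{s} = s + \frac{s\eta}{5}, \qquad d^{\circ}_{s} = -s\eta, \tag{156} \]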
where λs , µs , and s are given in (54) and (57). The isotropic part of the tensor S eff
is
\[ S^{\mathrm{eff}}_{\mathrm{iso}} = \lambda^{\mathrm{eff}}_{s}\, I \otimes I + 2\mu^{\mathrm{eff}}_{s}\, \mathbb{I}, \tag{157} \]
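where
\[ \lambda^{\mathrm{eff}}_{s} = \lambda^{\circ}_{s} - \frac{512\pi^{4} d^{\circ}_{s}}{4725}\sum_{k=-4}^{4} |c^{4}_{k0}|^{2}, \qquad
   \mu^{\mathrm{eff}}_{s} = \mu^{\circ}_{s} + \frac{256\pi^{4} d^{\circ}_{s}}{1575}\sum_{k=-4}^{4} |c^{4}_{k0}|^{2}. \tag{158} \]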
7.3. COMPATIBILITY. MODELS HM - Vc AND HM - Rc
The HM-V and HM-R models each carry one undetermined material constant, ζ and η,
respectively, and the predictions from these two models need not be
compatible. Nevertheless, we can use the requirement
C eff : S eff = I
(159)
to determine ζ and η so that formulae (141) and (155) agree with each other to
within terms linear in the texture coefficients. To this end, we substitute (141) and
(155) into (159) and obtain, with the help of Table II, the expansion
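\[ C^{\mathrm{eff}} : S^{\mathrm{eff}} = (3\lambda^{\circ}\lambda^{\circ}_{s} + 2\lambda^{\circ}\mu^{\circ}_{s} + 2\lambda^{\circ}_{s}\mu^{\circ})\, I \otimes I
   + 4\mu^{\circ}_{s}\mu^{\circ}\, \mathbb{I} + (2\mu^{\circ} c^{\circ}_{s} + 2\mu^{\circ}_{s} c^{\circ})\,\Phi + \cdots . \tag{160} \]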
In order that (159) be satisfied to within terms linear in the texture coefficients, we
impose in (160) the requirements that
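\[ 3\lambda^{\circ}\lambda^{\circ}_{s} + 2\lambda^{\circ}\mu^{\circ}_{s} + 2\lambda^{\circ}_{s}\mu^{\circ} = 0, \qquad
   4\mu^{\circ}_{s}\mu^{\circ} = 1, \qquad
   2\mu^{\circ} c^{\circ}_{s} + 2\mu^{\circ}_{s} c^{\circ} = 0. \tag{161} \]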
From (3), (50), (51), (54)–(57), we can get the relations
\[ \lambda_{s} = -\frac{50\lambda\mu + 5\lambda c + 2c^{2}}{2(3\lambda + 2\mu)(10\mu + 3c)(5\mu - c)}, \qquad
   \mu_{s} = \frac{5(10\mu + c)}{4(10\mu + 3c)(5\mu - c)}, \qquad
   s = \frac{-25c}{2(5\mu - c)(10\mu + 3c)}. \tag{162} \]
Substituting (162) into (156) and then substituting (142) and (156) into (160), we
arrive at the following equations:
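\[ -\frac{(-50\zeta\mu + 50\eta\mu - 25c + 6c\eta\zeta - 5c\zeta)\,c}{25(-5\mu + c)(10\mu + 3c)} = 0, \tag{163} \]
\[ \frac{3(-50\zeta\mu + 50\eta\mu - 25c + 6c\eta\zeta - 5c\zeta)\,c}{100(-5\mu + c)(10\mu + 3c)} = 0, \tag{164} \]
\[ \frac{(-50\zeta\mu - 25c + 25c\zeta + 30c\eta + 12c\eta\zeta + 50\eta\mu)\,c}{20(-5\mu + c)(10\mu + 3c)} = 0. \tag{165} \]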
Since (163) is equivalent to (164), we need to solve only (164) and (165). We
obtain two pairs of solutions (ζ, η). For one pair, we find ζ → −10 and η → −10
as c → 0. By Remark 7.1, this pair of solutions is deemed unphysical and is
discarded. The pair of physical solutions is given by the formulae
\[ \zeta = -5\,\frac{10\mu - \sqrt{2(5\mu - c)(10\mu + 3c)}}{6c - \sqrt{2(5\mu - c)(10\mu + 3c)}}, \tag{166} \]
\[ \eta = \frac{5\bigl(10\mu - \sqrt{2(5\mu - c)(10\mu + 3c)}\bigr)}{2(-5\mu + 3c)}. \tag{167} \]
Note that for this pair, ζ → 0 and η → 0 as c → 0. With this choice of ζ and η,
the constitutive equations as derived by the two models will agree with each other
to terms linear in the texture coefficients.
Henceforth we shall refer to (141) and (155) with ζ and η given by (166) and
(167) as predictions from models HM-Vc and HM-Rc , respectively. The subscript
“c” in HM-Vc and HM-Rc will remind us that these models are special versions of
HM-V and HM-R, where the material constants ζ and η are chosen to guarantee
approximate compatibility between the two. At this point, however, we would like
to add a cautionary remark: Whereas compatibility provides one easy means to put
a numerical prediction on ζ and η in (166) and (167), respectively, there is no a
priori reason why models HM-V and HM-R should be compatible with each other.
Hence, one should not attach too much significance to the formulae (166) and (167)
for ζ and η.
8. Examples and Discussion
With equation (141) for C eff and equation (155) for S eff in hand, the first question
to ask is whether these formulae from the simple models HM-V and HM-R are adequate. To shed some light on this question, one approach is to check these formulae
through numerical computations. For definiteness, let us restrict our discussion
here to formula (141) for C eff .
A polycrystal can be taken as an inhomogeneous elastic body whose elasticity
tensor is piecewise constant. For such a body, the classical boundary-value problems of elastostatics are well posed. Given a set of single-crystal elastic constants
and an arrangement of crystallites (including their orientations), a selected set of
six boundary-value problems Pi (i = 1, 2, . . . , 6) can be solved by using the finite
element method to get the corresponding mean stress T i and mean strain E i , from
which the effective elasticity tensor C eff which maps E i to T i for each i can be
calculated. By equation (141), each component $C^{\mathrm{eff}}_{ijkl}$ of the effective elasticity tensor carries only one and the same undetermined coefficient ζ. From the computed
value of $C^{\mathrm{eff}}_{ijkl}$, a value of ζ can be determined. The values of ζ thus obtained
from various components $C^{\mathrm{eff}}_{ijkl}$ can then be checked for agreement with each other.
Such checks can be repeated for different textures and for different choices of
boundary-value problems $P_i$.
Of course, another possible approach for checking the adequacy of formulae (141) and (155) is to seek their experimental corroboration. In this regard, however, one should note that crystallographic texture is only one of the microstructural factors, albeit an important factor, which influence the elastic response of an
anisotropic polycrystal. In seeking experimental corroboration of (141) or (155),
efforts should be focused on strongly textured samples for which the term quadratic
in texture coefficients in these formulae has such a sizable effect that it could not
be confused with the influence from other microstructures.
When the single-crystal elastic constants are known, formulae (141) and (155)
each carry only one undetermined material constant. In metallurgical practice,
however, because of the effects of alloying elements (even if they are in minute
quantities), it is difficult to get accurate estimates of the elastic constants of the
crystallites in a polycrystalline metal. Equations (141) and (155) are then more
properly looked upon as formulae with four undetermined coefficients.
On the other hand, sometimes we just need rough estimates of the elastic constants of a polycrystal. Then, coupling single-crystal elastic constants from handbooks with either model HM-Vc or HM-Rc may suffice.
In closing, we present two examples, in which the predictions of HM-Vc and
HM-Rc are compared with results of experimental measurement and/or calculations by the self-consistent method [19].
EXAMPLE 8.1. We consider isotropic aggregates of copper. The single-crystal
elastic constants are taken to be c11 = 169.05 GPa, c12 = 121.93 GPa, c44 =
75.50 GPa [19]. In Table III we list a comparison of the predicted values of λeff
and µeff from models HM-Vc , HM-Rc , the Voigt and the Reuss model, and Morris’ calculations [19] by a self-consistent scheme. As reference, we include also
the values of λeff and µeff for isotropic aggregates, as inferred from ultrasound
measurements [7] on a batch of C122 copper samples.
Table III. Comparison of predicted and measured values of Lamé constants
for isotropic aggregates of copper. Units are in GPa

Model     λeff      µeff
Voigt     101.16    54.72
HM-Vc     106.14    47.24
HM-Rc     106.14    47.24
Reuss     110.89    40.12
Morris    105.2     48.7
Expt.     106.5     47.35

EXAMPLE 8.2. Morris [19] reported the values of stiffness components pertaining to an orthorhombic aggregate of α-Fe crystallites, as calculated by a self-consistent scheme. In his example, the values of the relevant texture coefficients
are:
\[ c^{4}_{00} = -0.02475209611, \qquad c^{4}_{20} = -0.001375676243, \qquad c^{4}_{40} = 0.003290910316, \]
and the single-crystal elastic constants are taken as c11 = 237 GPa, c12 = 141 GPa,
c44 = 116 GPa. In Table IV, we list, in juxtaposition with the values of stiffness
components reported by Morris, the corresponding values predicted by models
HM-Vc and HM-Rc. We include also the predictions from the Voigt and the Reuss
model in the same table as reference.

Table IV. Stiffness components $C^{\mathrm{eff}}_{ijkl}$ pertaining to an orthorhombic aggregate of iron crystallites. Units are in GPa

Model     C1111   C2222   C3333   C2233   C3311   C1122   C2323   C3131   C1212
Voigt     295.3   297.1   311.7   102.8   104.6   119.1   77.8    79.6    94.1
HM-Vc     286.2   288.3   305.7   105.6   107.7   125.1   70.2    71.9    87.4
HM-Rc     286.2   288.4   305.8   105.5   107.7   125.1   70.3    71.9    87.4
Reuss     276.6   278.8   296.7   110.0   112.2   130.2   64.6    65.9    79.7
Morris    286.8   288.8   305.5   105.3   107.3   123.9   71.8    73.5    88.4
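The isotropic copper entries of Table III (Example 8.1) can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' code. It assumes that (3), (50) and (51) amount to c = c11 − c12 − 2c44, λ = c12 + c/5 and µ = c44 + c/5 (an inference, supported by the Voigt row of Table III), and it uses (142), (156), (162), (166) and (167) as stated above. The function names and the helper `invert_isotropic`, which inverts an isotropic compliance tensor by the standard relations 4µ_s µ = 1 and (3λ_s + 2µ_s)(3λ + 2µ) = 1, are illustrative additions.

```python
import math


def voigt_constants(c11, c12, c44):
    """Anisotropy constant c and Voigt Lamé constants (assumed form of (3), (50), (51))."""
    c = c11 - c12 - 2.0 * c44
    return c, c12 + c / 5.0, c44 + c / 5.0


def compliance_constants(c, lam, mu):
    """Compliance constants (lambda_s, mu_s, s) from relations (162)."""
    a = 5.0 * mu - c
    b = 10.0 * mu + 3.0 * c
    lam_s = -(50.0 * lam * mu + 5.0 * lam * c + 2.0 * c * c) / (
        2.0 * (3.0 * lam + 2.0 * mu) * b * a
    )
    mu_s = 5.0 * (10.0 * mu + c) / (4.0 * b * a)
    s = -25.0 * c / (2.0 * a * b)
    return lam_s, mu_s, s


def compatible_constants(c, mu):
    """Material constants (zeta, eta) of models HM-Vc and HM-Rc, from (166) and (167)."""
    r = math.sqrt(2.0 * (5.0 * mu - c) * (10.0 * mu + 3.0 * c))
    zeta = -5.0 * (10.0 * mu - r) / (6.0 * c - r)
    eta = 5.0 * (10.0 * mu - r) / (2.0 * (-5.0 * mu + 3.0 * c))
    return zeta, eta


def invert_isotropic(lam_s, mu_s):
    """Lamé constants of the stiffness tensor inverse to an isotropic compliance tensor."""
    mu = 1.0 / (4.0 * mu_s)
    lam = (1.0 / (3.0 * lam_s + 2.0 * mu_s) - 2.0 * mu) / 3.0
    return lam, mu


def copper_table(c11=169.05, c12=121.93, c44=75.50):
    """Isotropic-aggregate Lamé constants (GPa); all texture coefficients vanish."""
    c, lam, mu = voigt_constants(c11, c12, c44)
    lam_s, mu_s, s = compliance_constants(c, lam, mu)
    zeta, eta = compatible_constants(c, mu)
    return {
        "Voigt": (lam, mu),
        # (142): lambda° and mu° are the effective constants when the texture is isotropic
        "HM-Vc": (lam - 2.0 * c * zeta / 25.0, mu + 3.0 * c * zeta / 25.0),
        # (156), followed by inversion of the isotropic compliance tensor
        "HM-Rc": invert_isotropic(lam_s - 2.0 * s * eta / 25.0,
                                  mu_s + 3.0 * s * eta / 25.0),
        "Reuss": invert_isotropic(lam_s, mu_s),
    }


if __name__ == "__main__":
    for model, (lam_eff, mu_eff) in copper_table().items():
        print(f"{model:6s} lambda_eff = {lam_eff:7.2f}  mu_eff = {mu_eff:6.2f}")
```

With the copper constants of Example 8.1, the four computed rows agree with the Voigt, HM-Vc, HM-Rc and Reuss rows of Table III to the two decimals printed there; the Morris and experimental rows are data and are not computed.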
Acknowledgements
The findings reported here were obtained in the course of work supported in part by
a grant from the U.S. National Science Foundation (No. DMS-0103979),
a DEPSCoR grant from AFOSR (No. F49620-02-1-0243), and an R&D Excellence
grant from the Kentucky Science & Engineering Foundation (No. KSEF-148-50202-19).
References
1. H.J. Bunge, Texture Analysis in Materials Science: Mathematical Methods. Butterworths,
London (1982).
2. R.J. Roe, Description of crystallite orientation in polycrystalline materials: III, General solution
to pole figures. J. Appl. Phys. 36 (1965) 2024–2031.
3. P.R. Morris, Averaging fourth-rank tensors with weight functions. J. Appl. Phys. 40 (1969)
447–448.
4. C.M. Sayers, Ultrasonic velocities in anisotropic polycrystalline aggregates. J. Phys. D 15
(1982) 2157–2167.
5. C.-S. Man, On the constitutive equations of some weakly-textured materials. Arch. Rational
Mech. Anal. 143 (1998) 77–103.
6. R. Paroni and C.-S. Man, Constitutive equations of elastic polycrystalline materials. Arch.
Rational Mech. Anal. 150 (1999) 153–177.
7. C.-S. Man, X. Fan and K. Kawashima. In preparation.
8. R.A. Toupin and R.S. Rivlin, Dimensional changes in crystals caused by dislocations. J. Math.
Phys. 1 (1960) 8–15.
9. L.C. Biedenharn and J.D. Louck, Angular Momentum in Quantum Physics. Cambridge Univ.
Press, Cambridge (1984).
10. R.J. Roe, Inversion of pole figures for materials having cubic crystal symmetry. J. Appl. Phys.
37 (1966) 2069–2072.
11. W. Miller, Symmetry Groups and Their Applications. Academic Press, New York (1972).
12. L. Tisza, Zur Deutung der Spektren mehratomiger Moleküle. Z. Physik 82 (1933) 48–72.
13. H.A. Jahn, Note on the Bhagavantam–Suryanarayana method of enumerating the physical
constants of crystals. Acta Cryst. 2 (1949) 30–33.
14. C.-S. Man, Material tensors of weakly-textured polycrystals. In: W. Chien et al. (eds), Proc.
of the 3rd Internat. Conf. on Nonlinear Mechanics. Shanghai Univ. Press, Shanghai (1998)
pp. 87–94.
15. Yu.I. Sirotin, Decomposition of material tensors into irreducible parts. Soviet Phys. Crystallogr.
19 (1975) 565–568.
16. M.J. Beran, T.A. Mason and B.L. Adams, Bounding elastic constants of an orthotropic
polycrystal using measurements of the microstructure. J. Mech. Phys. Solids 44 (1996)
1543–1563.
17. M. Huang and C.-S. Man, Elastic stiffness and compliance of anisotropic aggregates of cubic
crystallites. In: Q.-S. Zheng, M.-F. Fu and G.-Q. Song (eds), Mechanics and Its Applications
in Civil Engineering (In Honor of Professor D.-P. Yang’s 70th Anniversary). Tsinghua Univ.
Press, Beijing (2002) pp. 107–116 (in Chinese).
18. D.A. Varshalovich, A.N. Moskalev and V.K. Khersonskii, Quantum Theory of Angular
Momentum. World Scientific, Singapore (1988).
19. P.R. Morris, Elastic constants of polycrystals. Internat. J. Engrg. Sci. 8 (1970) 49–61.
Reconstruction Formula for Identifying Cracks
MASARU IKEHATA1,⋆ and GEN NAKAMURA2,⋆⋆
1 Department of Mathematics, Faculty of Engineering, Gunma University, Kiryu, 376-8515, Japan.
E-mail: ikehata@sv1.math.sci.gunma-u.ac.jp
2 Department of Mathematics, Graduate School of Sciences, Hokkaido University, Sapporo,
060-0810, Japan. E-mail: gnaka@math.sci.hokudai.ac.jp
Received 23 July 2002; in revised form 3 July 2003
Abstract. We consider an inverse boundary value problem for identifying cracks in a conductive
medium. By combining the probe method and an analysis for the behavior of the “reflected solution”,
we derive a reconstruction formula for identifying cracks from the Neumann to Dirichlet map. We
give also some related results.
Mathematics Subject Classifications (2000): 35J05, 35J55, 35R30.
Key words: inverse boundary value problem, probe method, indicator function.
Dedicated to the memory of Clifford Truesdell, who led to a renaissance in
rational mechanics through his teachings and research
1. Introduction
In this paper we give a reconstruction formula for identifying a crack in a homogeneous isotropic conductive medium in Rn (n = 2 or 3) by boundary measurements.
More precisely, we take the Neumann to Dirichlet map as boundary measurements
and apply the probe method to reconstruct the crack. The probe method was introduced by the first author in [8]. Therein he established a reconstruction formula for
unknown inclusions in a conductive medium. It uses singular solutions and Runge’s
theorem to approximate the solutions. It should be pointed out that Isakov [10] was
the first who gave a uniqueness theorem for identifying unknown inclusions in
a conductive medium by using singular solutions and Runge’s theorem. In order
to apply the probe method, we developed an analysis characterizing the behavior
of the “reflected solution” given later in Lemma 3.1. Our method works without
change for multiple cracks. We will remark about other types of measurements at
⋆ Partially supported by Grant-in-Aid for Scientific Research (C)(2) (No.13640152) of Japan
Society for the Promotion of Science.
⋆⋆ Partially supported by Grant-in-Aid for Scientific Research (B)(2) (No.14340038) of Japan
Society for the Promotion of Science.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 525–538.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
the boundary. These are given in terms of mixed type boundary conditions. For
example, fixing Dirichlet data at a part of the boundary, we measure the corresponding Dirichlet data on the other part of the boundary for given Neumann
data on the same part of the boundary. By switching Dirichlet data to Neumann
data and vice versa in the preceding example, we obtain another type of boundary
measurements.
Our method can be generalized to treat a more general problem. For example,
we will give an analogous reconstruction formula for identifying cracks in a homogeneous anisotropic elastic medium. Although the basic idea of the proof is the
same, it requires considerably heavier analysis to characterize the behavior of the
“reflected solution”. In order to present the idea of our method most efficiently,
we give the details of the argument for conductive media but only state the result for
elastic media without proof. The details of the proof for elastic media will be given
elsewhere.
For elastic media, the mixed type boundary condition is widely used in practical
applications. Since we regard the conductivity equation as a simple analogue of the
elasticity equation, we consider the mixed type boundary condition even for the
conductivity equation.
There are many related results: Bryan and Vogelius [5], Kress [12], Ben Abda
et al. [2], Andrieux et al. [1], Ben Abda and Bui [3] and Brühl et al. [4]. Ben Abda
et al. assume the nonvanishing of the stress intensity factor of a surface breaking
crack in a two-dimensional medium and the nonvanishing of displacement jump
across a two-dimensional crack in a plane and use the reciprocity gap principle to
reconstruct the crack. Brühl et al. use the Kirsch method [11]. All other authors
transform the problem to some optimization problems and use a Newton-type
algorithm to solve the optimization problems.
We end this section by defining several notations used in this paper. Let X be
an open submanifold of a manifold Y. Following Hörmander [7], for a space F of
distributions in Y, we define
\[ \bar{F}(X) := \{ f|_{X} ;\ f \in F \}, \qquad \dot{F}(\bar{X}) := \{ f \in F ;\ \operatorname{supp} f \subset \bar{X} \}, \]
where $f|_{X}$ is the restriction of f to X. These notations will be used for some
Sobolev spaces defined on $X$ and $\bar{X}$, and we assume sufficient regularity of X, Y
and the boundary ∂X for those definitions.
Let $\Omega \subset \mathbb{R}^{n}$ (n = 2 or 3) be a bounded domain with $C^{2}$ boundary $\Gamma$. For n = 2
or n = 3, let $S \subset \Omega$ be a $C^{2}$ Jordan closed curve or closed connected surface and
$\Sigma \subset S$ be an open curve or surface, respectively. When n = 3, we assume the
boundary $\partial\Sigma$ of $\Sigma$ to be $C^{2}$. $\Sigma$ will be considered as a crack. We sometimes divide
$\Gamma$ into two parts:
\[ \Gamma = \overline{\Gamma_{D}} \cup \overline{\Gamma_{N}}, \qquad \Gamma_{D} \cap \Gamma_{N} = \emptyset, \]
where $\Gamma_{D}, \Gamma_{N} \subset \Gamma$ are open subsets. When n = 3 and $\Gamma_{D} \neq \emptyset$, $\Gamma_{N} \neq \emptyset$, we
assume that the boundaries $\partial\Gamma_{D}$, $\partial\Gamma_{N}$ of $\Gamma_{D}$, $\Gamma_{N}$ are $C^{2}$. One of $\Gamma_{D}$, $\Gamma_{N}$ will
be considered as the place where we do the measurements. Note that we do not
exclude the case $\Gamma_{D} = \emptyset$ or $\Gamma_{N} = \emptyset$.
Let $\Omega_{-}$ be the open subset of $\Omega$ with boundary S and $\Omega_{+} := \Omega \setminus \overline{\Omega_{-}}$. The trace
operator to $\Gamma$ is denoted by γ and the trace operators from $\Omega_{\pm}$ to S are denoted
by $\gamma_{\pm}$, respectively. The unit normal ν at $\Gamma$ and at S is directed into
$\mathbb{R}^{n} \setminus \bar{\Omega}$ and into $\Omega_{+}$, respectively. The normal derivative is denoted by $\partial_{\nu}$, and the
partial derivative with respect to the Cartesian $x_{j}$ variable is denoted by $\partial_{j}$ or $\partial_{x_{j}}$.
Also, we use C to denote the general positive constant in our estimates.
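For example, by taking $X = \Sigma$, $k \in \mathbb{R}$ ($|k| \le 1$), we can define the Sobolev
spaces $\bar{H}^{k}(\Sigma)$ and $\dot{H}^{k}(\bar{\Sigma})$, which are subspaces of the Sobolev space $H^{k}(S)$. These
are the subspaces $\bar{H}^{\mathrm{loc}}_{(k)}(\Sigma)$ and $\dot{H}^{\mathrm{loc}}_{(k)}(\bar{\Sigma})$ of $H^{\mathrm{loc}}_{(k)}(S)$ in [7], respectively. Also, we
denote by $\bar{H}^{k}(\Sigma)^{*}$ the dual space of $\bar{H}^{k}(\Sigma)$. Like $\bar{H}^{k}(\Sigma)^{*}$, the superscript ∗ will
be used to denote the dual spaces of function spaces. For $\tfrac{1}{2} < s \le 1$, we define
$H^{s}(\Omega \setminus \bar{\Sigma})$ by
\[ H^{s}(\Omega \setminus \bar{\Sigma}) := \{ u \in \mathcal{D}'(\Omega) ;\ u_{\pm} := u|_{\Omega_{\pm}} \in \bar{H}^{s}(\Omega_{\pm}) ;\ \gamma_{+}u_{+} - \gamma_{-}u_{-} = 0 \text{ on } S \setminus \bar{\Sigma} \} \]
with the norm $\|u\|_{H^{s}(\Omega \setminus \bar{\Sigma})} := \|u_{+}\|_{H^{s}(\Omega_{+})} + \|u_{-}\|_{H^{s}(\Omega_{-})}$. Eller [6] gives $H^{s}(\Omega \setminus \bar{\Sigma})$
in a different way, but both definitions are equivalent. The advantage of defining
the Sobolev spaces $\bar{H}^{k}(\Sigma)$, $\dot{H}^{k}(\bar{\Sigma})$ as in [7] and $H^{s}(\Omega \setminus \bar{\Sigma})$ as above is to avoid
defining another Sobolev space $H^{1/2}(\Sigma)$, which is nothing but $\dot{H}^{1/2}(\bar{\Sigma})$; we automatically have $[u] := \gamma_{+}u_{+} - \gamma_{-}u_{-} \in \dot{H}^{1/2}(\bar{\Sigma})$ if $u \in H^{1}(\Omega \setminus \bar{\Sigma})$ and we can
avoid heavy notation such as $(H^{1/2}(\Sigma))^{*}$.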
2. Crack in a Conductive Medium
It is natural to take current and voltage as the input and output data for the measurements, respectively. We consider the following two types of direct problems.
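TYPE 1. For any given $g \in H^{-1/2}_{\#}(\Gamma) := \{ g \in H^{-1/2}(\Gamma) ;\ \langle g, 1 \rangle = 0 \}$, find a
solution $u \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma}) := \{ u \in H^{1}(\Omega \setminus \bar{\Sigma}) ;\ \int_{\Gamma} u \, d\sigma = 0 \}$ to
\[ (\mathrm{DP})_{1} \qquad
\begin{cases}
\Delta u = 0 & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} u = 0 & \text{on } \Sigma \ (\text{i.e., } \gamma_{+}\partial_{\nu}u = \gamma_{-}\partial_{\nu}u = 0 \text{ on } \Sigma), \\
\partial_{\nu} u = g & \text{on } \Gamma,
\end{cases} \]
where $\langle\,,\rangle$ is the pairing between $H^{-1/2}(\Gamma)$ and $H^{1/2}(\Gamma)$, and $d\sigma$ is the line or surface
element for n = 2 or 3, respectively.
TYPE 2. Let $\Gamma_{D} \neq \emptyset$. For any given $F \in L^{2}(\Omega \setminus \bar{\Sigma})$, $f \in \bar{H}^{1/2}(\Gamma_{D})$, $g \in \bar{H}^{-1/2}(\Gamma_{N})$, find a solution $u \in H^{1}(\Omega \setminus \bar{\Sigma})$ to
\[ (\mathrm{DP})_{2} \qquad
\begin{cases}
\Delta u = F & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} u = 0 & \text{on } \Sigma, \\
u = f \ \text{on } \Gamma_{D}, & \partial_{\nu} u = g \ \text{on } \Gamma_{N}.
\end{cases} \]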
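We have well posedness for a slightly more general direct problem including
the case $\Gamma_{D} = \emptyset$.
THEOREM 2.1. For any given $F \in L^{2}(\Omega \setminus \bar{\Sigma})$, $f \in \bar{H}^{1/2}(\Gamma_{D})$, $g \in \bar{H}^{-1/2}(\Gamma_{N})$,
$p \in \bar{H}^{-1/2}(\Sigma)$, there exists a unique solution $u \in H^{1}(\Omega \setminus \bar{\Sigma})$ to
\[ (\mathrm{DP})'_{2} \qquad
\begin{cases}
\Delta u = F & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} u = p & \text{on } \Sigma, \\
u = f \ \text{on } \Gamma_{D}, & \partial_{\nu} u = g \ \text{on } \Gamma_{N},
\end{cases} \]
and it satisfies
\[ \|u\|_{H^{1}(\Omega \setminus \bar{\Sigma})} \le C \bigl( \|F\|_{L^{2}(\Omega \setminus \bar{\Sigma})} + \|f\|_{H^{1/2}(\Gamma_{D})} + \|g\|_{H^{-1/2}(\Gamma_{N})} + \|p\|_{H^{-1/2}(\Sigma)} \bigr). \tag{2.1} \]
Here, if $\Gamma_{D} = \emptyset$, we assume
\[ \int_{\Omega \setminus \bar{\Sigma}} F \, dx = \langle g, 1 \rangle \tag{2.2} \]
and $\int_{\Gamma} u \, d\sigma = 0$, and $g \in \bar{H}^{-1/2}(\Gamma_{N})$ has to be replaced by $g \in H^{-1/2}(\Gamma)$.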
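Proof. We give an outline of the proof. For the case $\Gamma_{D} = \emptyset$, we seek a solution
$u \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma}) := \{ u \in H^{1}(\Omega \setminus \bar{\Sigma}) ;\ \int_{\Gamma} u \, d\sigma = 0 \}$ to the variational equation:
\[ \int_{\Omega \setminus \bar{\Sigma}} \nabla u \cdot \nabla v \, dx = -\int_{\Omega \setminus \bar{\Sigma}} F v \, dx + \int_{\Gamma} g v \, d\sigma - \int_{\Sigma} p[v] \, d\sigma \tag{2.3} \]
for any $v \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma})$, where $\int_{\Omega \setminus \bar{\Sigma}} q \, dx := \int_{\Omega_{+}} q \, dx + \int_{\Omega_{-}} q \, dx$ and the integral
$\int_{\Gamma} g v \, d\sigma$, for example, is understood as the pairing $\langle g, v \rangle$ between $g \in H^{-1/2}(\Gamma)$
and $v|_{\Gamma} \in H^{1/2}(\Gamma)$. Just as in [6], we can prove that solving this variational
problem is equivalent to finding a solution $u \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma})$ to $(\mathrm{DP})'_{2}$, and that
this variational problem admits a unique solution $u \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma})$ with the estimate (2.1).
When $\Gamma_{D} \neq \emptyset$, we proceed as follows. By the definition of $\bar{H}^{1/2}(\Gamma_{D})$,
there is an extension $\tilde{f} \in H^{1/2}(\Gamma)$ of f such that $\|\tilde{f}\|_{H^{1/2}(\Gamma)} \le \|f\|_{H^{1/2}(\Gamma_{D})}$. Let
$u_{0} \in \bar{H}^{1}(\Omega)$ be the solution to
\[ \Delta u_{0} = 0 \ \text{in } \Omega, \qquad u_{0} = \tilde{f} \ \text{on } \Gamma \tag{2.4} \]
with the estimates:
\[ \|u_{0}\|_{H^{1}(\Omega)} \le C \|\tilde{f}\|_{H^{1/2}(\Gamma)}, \tag{2.5} \]
\[ \|\partial_{\nu} u_{0}\|_{H^{-1/2}(\Gamma_{N})} \le C \|\tilde{f}\|_{H^{1/2}(\Gamma)}. \tag{2.6} \]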
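Let $\chi \in C^{\infty}(\bar{\Omega})$, $\operatorname{supp} \chi \cap \bar{\Sigma} = \emptyset$, $\chi = 1$ near $\Gamma$ and $u_{1} := u - \chi u_{0}$. Then, $(\mathrm{DP})'_{2}$
becomes
\[ (\mathrm{DP})''_{2} \qquad
\begin{cases}
\Delta u_{1} = G & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} u_{1} = p & \text{on } \Sigma, \\
u_{1} = 0 \ \text{on } \Gamma_{D}, & \partial_{\nu} u_{1} = h \ \text{on } \Gamma_{N},
\end{cases} \]
where $G := F - (2\nabla\chi \cdot \nabla u_{0} + u_{0}\Delta\chi) \in H^{1}(\Omega \setminus \bar{\Sigma})^{*}$, $h := g - \chi\partial_{\nu}u_{0} \in \bar{H}^{-1/2}(\Gamma_{N})$.
Solving $(\mathrm{DP})''_{2}$ for $u_{1} \in H^{1}(\Omega \setminus \bar{\Sigma})$ is equivalent to finding a solution
$u_{1} \in H^{1}(\Omega \setminus \bar{\Sigma})$ to the variational equation:
\[ \int_{\Omega \setminus \bar{\Sigma}} \nabla u_{1} \cdot \nabla v \, dx = -\int_{\Omega \setminus \bar{\Sigma}} G v \, dx + \int_{\Gamma_{N}} g v \, d\sigma - \int_{\Sigma} p[v] \, d\sigma \tag{2.7} \]
for any $v \in W := \{ v \in H^{1}(\Omega \setminus \bar{\Sigma}) ;\ v = 0 \text{ on } \Gamma_{D} \}$. Here, note that $\bar{H}^{-1/2}(\Gamma_{N}) = \dot{H}^{1/2}(\overline{\Gamma_{N}})^{*}$.
To see that (2.7) has a unique solution $u_{1} \in W$ with the estimate:
\[ \|u_{1}\|_{H^{1}(\Omega \setminus \bar{\Sigma})} \le C \bigl( \|G\|_{H^{1}(\Omega \setminus \bar{\Sigma})^{*}} + \|g\|_{\dot{H}^{1/2}(\overline{\Gamma_{N}})^{*}} + \|p\|_{H^{-1/2}(\Sigma)} \bigr), \tag{2.8} \]
it is enough to prove the coercivity
\[ a(v, v) := \int_{\Omega \setminus \bar{\Sigma}} |\nabla v|^{2} \, dx \ge C \|v\|^{2}_{H^{1}(\Omega \setminus \bar{\Sigma})}, \qquad v \in W, \tag{2.9} \]
and apply the Lax–Milgram theorem. Since $v \in W$, $a(v, v) = 0$ implies $v = 0$
because $v = 0$ on $\Gamma_{D} \neq \emptyset$, and $\alpha(v) := a(v, v)$ defines a norm in W. Hence,
we have to prove that $\alpha(v)$ and $\|v\|_{H^{1}(\Omega \setminus \bar{\Sigma})}$ are equivalent norms in W. Clearly, it
suffices to prove
\[ \alpha(v) \ge C \|v\|^{2}_{L^{2}(\Omega \setminus \bar{\Sigma})}, \qquad v \in W. \tag{2.10} \]
Suppose this is not true. Then there exist $v_{n} \in W$ ($n \in \mathbb{N}$) such that
\[ |v_{n}| := \|v_{n}\|_{L^{2}(\Omega \setminus \bar{\Sigma})} = 1, \qquad \alpha(v_{n}) \to 0, \quad n \to \infty. \tag{2.11} \]
By (2.11), $\{\|v_{n}\|_{H^{1}(\Omega \setminus \bar{\Sigma})}\}_{n=1}^{\infty}$ is bounded. So there exists a subsequence $\{v_{n(k)}\}_{k=1}^{\infty}$
of $\{v_{n}\}_{n=1}^{\infty}$ and $v \in W$ such that $v_{n(k)} \to v$ ($k \to \infty$) weakly in W and
\[ \alpha(v) \le \liminf_{k \to \infty} \alpha(v_{n(k)}). \tag{2.12} \]
Hence by (2.11), $\alpha(v) = 0$, which gives $v = 0$. Since the embedding $H^{1}(\Omega \setminus \bar{\Sigma}) \hookrightarrow L^{2}(\Omega \setminus \bar{\Sigma})$
is compact, we can take the above subsequence to satisfy $v_{n(k)} \to v$ ($k \to \infty$)
strongly in $L^{2}(\Omega \setminus \bar{\Sigma})$. This contradicts $|v_{n(k)}| = 1$.
✷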
Next we define the Neumann to Dirichlet map $\Lambda_{\Sigma}$, which serves as our boundary
measurement.
DEFINITION 2.2.
(i) For the direct problem of type 1, we define $\Lambda^{(1)}_{\Sigma} : H^{-1/2}_{\#}(\Gamma) \to H^{1/2}(\Gamma)$ by
\[ \Lambda^{(1)}_{\Sigma} g = u|_{\Gamma}, \qquad g \in H^{-1/2}_{\#}(\Gamma), \tag{2.13} \]
where $u \in H^{1}_{\#}(\Omega \setminus \bar{\Sigma})$ is the solution to $(\mathrm{DP})_{1}$.
(ii) For the direct problem of type 2, we fix $f \in \bar{H}^{1/2}(\Gamma_{D})$ and define $\Lambda^{(2)}_{\Sigma} : \bar{H}^{-1/2}(\Gamma_{N}) \to \bar{H}^{1/2}(\Gamma_{N})$ by
\[ \Lambda^{(2)}_{\Sigma} g = u|_{\Gamma_{N}}, \qquad g \in \bar{H}^{-1/2}(\Gamma_{N}), \tag{2.14} \]
where $u \in H^{1}(\Omega \setminus \bar{\Sigma})$ is the solution to $(\mathrm{DP})_{2}$.
(iii) For both types, let $\Lambda^{(j)}_{\Sigma}$ (j = 1, 2) be denoted by $\Lambda^{(j)}_{\emptyset}$ (j = 1, 2) when
$\Sigma = \emptyset$. In this case, $H^{1}_{\#}(\Omega \setminus \bar{\Sigma})$ has to be replaced by $\bar{H}^{1}_{\#}(\Omega) := \{ u \in \bar{H}^{1}(\Omega) ;\ \int_{\Gamma} u \, d\sigma = 0 \}$ for type 1.
The formulation of our inverse problems is as follows.
INVERSE PROBLEMS $(\mathrm{IP})_{j}$ (j = 1, 2). For each j (j = 1, 2), reconstruct $\Sigma$
from $\Lambda^{(j)}_{\Sigma}$.
We claim that for each j (j = 1, 2) there is a reconstruction formula for identifying
$\Sigma$ from $\Lambda^{(j)}_{\Sigma}$. We have adapted the probe method [8, 9] for this purpose.
For simplicity, we consider only the case n = 3, because only obvious changes are
required for the case n = 2.
DEFINITION 2.3.
(i) (Needle γ). Let $\gamma := \{ \gamma(t) \in \bar{\Omega} ;\ 0 \le t \le 1 \}$ be a non-self-intersecting
continuous curve joining $\gamma(0), \gamma(1) \in \Gamma$ such that $\gamma(t) \in \Omega$ (0 < t < 1). We
call γ a needle.
(ii) (First hitting time $T(\gamma, \Sigma)$). We define $T(\gamma, \Sigma)$ by
\[ T(\gamma, \Sigma) := \sup\{ t ;\ 0 < t < 1,\ \gamma(s) \notin \bar{\Sigma},\ 0 \le s < t \}. \tag{2.15} \]
We call $T(\gamma, \Sigma)$ the first hitting time. If $\gamma \cap \bar{\Sigma} \neq \emptyset$ and we consider t as the
time, $T(\gamma, \Sigma)$ is the time that the needle γ first hits $\bar{\Sigma}$.
REMARK 2.4. It is obvious that if we know $T(\gamma, \Sigma)$ for all possible needles,
we can reconstruct $\Sigma$. So the inverse problems $(\mathrm{IP})_{j}$ (j = 1, 2) reduce to finding
procedures to determine $T(\gamma, \Sigma)$ for any needle γ from $\Lambda^{(j)}_{\Sigma}$ (j = 1, 2),
respectively.
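DEFINITION 2.5.
(i) (Indicator function $I_{1}(t, \gamma)$). We define the indicator function $I_{1}(t, \gamma)$ by
\[ I_{1}(t, \gamma) := \lim_{j \to \infty} \bigl\langle g_{j}, \bigl(\Lambda^{(1)}_{\Sigma} - \Lambda^{(1)}_{\emptyset}\bigr) g_{j} \bigr\rangle_{1} ; \tag{2.16} \]
here $\langle\,,\rangle_{1}$ is the pairing between $\bar{H}^{-1/2}(\Gamma)$ and $\dot{H}^{1/2}(\Gamma)$, and $g_{j} := \partial_{\nu} v_{j}|_{\Gamma}$,
where $v_{j} \in \bar{H}^{1}(\Omega)$ ($j \in \mathbb{N}$) is defined as follows. The functions $v_{j} \in \bar{H}^{1}(\Omega)$
($j \in \mathbb{N}$) satisfy
\[ \Delta v_{j} = 0 \ \text{in } \Omega, \qquad v_{j} \to G(\cdot, \gamma(t)),\ j \to \infty \ \text{in } \bar{H}^{1}_{\mathrm{loc}}(\Omega \setminus \gamma_{t}), \tag{2.17} \]
where $\gamma_{t} := \{ \gamma(s) ;\ 0 < s \le t \}$ and
\[ G(x, x_{0}) = (4\pi |x - x_{0}|)^{-1}. \tag{2.18} \]
(ii) We define the indicator function $I_{2}(t, \gamma)$ by
\[ I_{2}(t, \gamma) := \lim_{j \to \infty} \bigl\langle g_{j}, \bigl(\Lambda^{(2)}_{\Sigma} - \Lambda^{(2)}_{\emptyset}\bigr) g_{j} \bigr\rangle_{2} ; \tag{2.19} \]
here $\langle\,,\rangle_{2}$ is the pairing between $\bar{H}^{-1/2}(\Gamma_{N})$ and $\dot{H}^{1/2}(\overline{\Gamma_{N}})$, and $g_{j} := \partial_{\nu} v_{j}|_{\Gamma_{N}}$,
where $v_{j} = v' + v''_{j} \in \bar{H}^{1}(\Omega)$ ($j \in \mathbb{N}$) and $v'$, $v''_{j}$ are defined as follows. The
function $v' \in \bar{H}^{1}(\Omega)$ is the solution to
\[ \Delta v' = F \ \text{in } \Omega, \qquad v' = f \ \text{on } \Gamma_{D}, \quad \partial_{\nu} v' = 0 \ \text{on } \Gamma_{N} \tag{2.20} \]
and $v''_{j} \in \bar{H}^{1}(\Omega)$ ($j \in \mathbb{N}$) satisfy
\[ \begin{cases}
\Delta v''_{j} = 0 \quad \text{in } \Omega, \\
\operatorname{supp}(v''_{j}|_{\Gamma}) \subset \Gamma_{0}, \\
v''_{j} \to G(\cdot, \gamma(t)),\ j \to \infty \ \text{in } \bar{H}^{1}_{\mathrm{loc}}(\Omega \setminus \gamma_{t}),
\end{cases} \tag{2.21} \]
where $\Gamma_{0}$ is a fixed open subset of $\Gamma_{N}$.
REMARK 2.6. Note that the well posedness of (2.20) follows from Theorem 2.1
as its special case when $\Sigma = \emptyset$. The existence of $v_{j}$ ($j \in \mathbb{N}$) and $v''_{j}$ ($j \in \mathbb{N}$) is
due to the Runge approximation theorem given in the Appendix.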
DEFINITION 2.7. (Detecting times $t_{j}(\gamma, \Sigma)$ (j = 1, 2)). For each j (j = 1, 2),
we define the detecting time $t_{j}(\gamma, \Sigma)$ by
\[ t_{j}(\gamma, \Sigma) := \sup\Bigl\{ 0 < t < 1 ;\ \sup_{0 < s < t} |I_{j}(s, \gamma)| < \infty \Bigr\}. \tag{2.22} \]
We claim the following, which will be proven in the next section.
THEOREM 2.8. For each j (j = 1, 2),
\[ T(\gamma, \Sigma) = t_{j}(\gamma, \Sigma) \quad \text{if } \gamma \cap \bar{\Sigma} \neq \emptyset. \tag{2.23} \]
REMARK 2.9.
(i) As already remarked before, Theorem 2.8 implies our reconstruction formulae.
(ii) We will prove this theorem by using the probe method [8, 9]. We have adapted
this method to our crack identification problem.
(iii) The reconstruction formulae are summarized at the end of the next section.
(iv) It is straightforward to see that our arguments provide the same reconstruction
formulae for identifying cracks inside a homogeneous conductive medium.
This is the case when we replace the Laplacian $\Delta$ by $\sum_{i,j=1}^{n} a_{ij}\,\partial_{i}\partial_{j}$ with
n = 1 or 2 and positive symmetric constant matrix $(a_{ij})$.
(v) The numerical realization of our reconstruction will be discussed in forthcoming papers.
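To make Definition 2.7 and Theorem 2.8 concrete, the following minimal sketch shows one way a detecting time might be estimated from sampled values of an indicator function. It is an illustration only, not the numerical realization referred to in Remark 2.9(v). It assumes that values $I_{j}(t_{k}, \gamma)$ are already available on an increasing grid of times, and it replaces the condition "sup < ∞" of (2.22) by a finite, user-chosen bound. The function name `detecting_time`, the parameter `bound` and the synthetic blow-up profile are hypothetical.

```python
import math


def detecting_time(times, values, bound):
    """Estimate the detecting time from samples of an indicator function.

    `times` is an increasing sequence in (0, 1) and `values[k]` is the sampled
    indicator at `times[k]`. The first time at which the sample is non-finite
    or exceeds `bound` in absolute value is returned; if the samples stay
    bounded, 1.0 is returned (needle never detects the crack).
    """
    for t, value in zip(times, values):
        if not math.isfinite(value) or abs(value) > bound:
            return t
    return 1.0


def synthetic_indicator(t, hitting_time=0.4):
    """Hypothetical profile that blows up as t approaches the hitting time."""
    return 1.0 / (hitting_time - t) if t < hitting_time else math.inf


if __name__ == "__main__":
    grid = [k / 1000.0 for k in range(1, 1000)]
    samples = [synthetic_indicator(t) for t in grid]
    print(detecting_time(grid, samples, bound=1.0e3))
```

The choice of `bound` trades sensitivity against robustness to noise in the measured Neumann to Dirichlet map; the theory only guarantees blow-up in the limit, so any finite threshold is a modelling decision.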
3. Proof of Theorem 2.8
Since the proof for type 1 is essentially the same as that for type 2, we provide a
proof only for type 2 and comment on modifications necessary for type 1.
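The rest of the proof concerns (2.23). Hereafter in the proof, we simply denote
$\Lambda_{\Sigma} = \Lambda^{(2)}_{\Sigma}$, $I(t, \gamma) = I_{2}(t, \gamma)$, $t(\gamma, \Sigma) = t_{2}(\gamma, \Sigma)$. Let $u_{j} \in H^{1}(\Omega \setminus \bar{\Sigma})$ be the
solution to
\[ \begin{cases}
\Delta u_{j} = F & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} u_{j} = 0 & \text{on } \Sigma, \\
u_{j} = f \ \text{on } \Gamma_{D}, & \partial_{\nu} u_{j} = g_{j} \ \text{on } \Gamma_{N},
\end{cases} \tag{3.1} \]
and $w_{j} := u_{j} - v_{j} \in H^{1}(\Omega \setminus \bar{\Sigma})$.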
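LEMMA 3.1 (Reflected solution w). If $\gamma_{t} \cap \bar{\Sigma} = \emptyset$, then $w_{j} \to w$ ($j \to \infty$) in
$H^{1}(\Omega \setminus \bar{\Sigma})$ and $w \in H^{1}(\Omega \setminus \bar{\Sigma})$ is the solution to
\[ \begin{cases}
\Delta w = 0 & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} w = -\partial_{\nu}\bigl(v' + G(\cdot, \gamma(t))\bigr) & \text{on } \Sigma, \\
w = 0 \ \text{on } \Gamma_{D}, & \partial_{\nu} w = 0 \ \text{on } \Gamma_{N}.
\end{cases} \tag{3.2} \]
Proof. Let $\gamma_{t} \cap \bar{\Sigma} = \emptyset$. By (2.20), (2.21) and (3.1), $w_{j} \in H^{1}(\Omega \setminus \bar{\Sigma})$ satisfies
\[ \begin{cases}
\Delta w_{j} = 0 & \text{in } \Omega \setminus \bar{\Sigma}, \\
\partial_{\nu} w_{j} = -\partial_{\nu} v_{j} & \text{on } \Sigma, \\
w_{j} = 0 \ \text{on } \Gamma_{D}, & \partial_{\nu} w_{j} = 0 \ \text{on } \Gamma_{N}.
\end{cases} \tag{3.3} \]
By Theorem 2.1,
\[ \|w_{j} - w_{k}\|_{H^{1}(\Omega \setminus \bar{\Sigma})} \le C \|\partial_{\nu}(v_{j} - v_{k})\|_{H^{-1/2}(\Sigma)} = C \|\partial_{\nu}(v''_{j} - v''_{k})\|_{H^{-1/2}(\Sigma)}. \tag{3.4} \]
Here, if we take a bounded domain D with $C^{2}$ boundary such that $\Sigma \subset D$, then $v''_{j} - v''_{k} \in H^{1}(D)$
and $\Delta(v''_{j} - v''_{k}) = 0$ in D imply
\[ \|\partial_{\nu}(v''_{j} - v''_{k})\|_{H^{-1/2}(\Sigma)} \le C \|v''_{j} - v''_{k}\|_{H^{1}(D)} \tag{3.5} \]
by the continuity of the trace (see [6, Lemma 2.9]). Hence, by (2.21), we have
$\|w_{j} - w_{k}\|_{H^{1}(\Omega \setminus \bar{\Sigma})} \to 0$ ($j, k \to \infty$).
✷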
LEMMA 3.2. If γt ∩ ' = φ, then we have
F w dx +
|∇w|2 dx +
I (t, γ ) = −
f ∂ν w dσ.
(3.6)
ŴD
\'
\'
Proof. By (2.19) and Lemma 3.1, it is enough to prove
2
F wj dx +
|∇wj | dx +
gj , (' − ∅ )gj = −
f ∂ν wj dσ.
ŴD
\'
\'
(3.7)
By the definition of ' , ∅ and the Green formula (see [6, (2.5), (2.7)]),
F wj dx,
uj ∂ν vj dσ +
f ∂ν wj dσ +
gj , (' − ∅ )gj =
'±
ŴD
\'
(3.8)
where '± g dσ := ' γ+ g dσ − ' γ− g dσ . Here, note that '± uj ∂ν vj dσ =
' (γ+ uj [∂ν vj ] + [uj ]γ− ∂ν vj ) dσ = ' [uj ]γ− ∂ν vj dσ and
|∇wj |2 dx.
(3.9)
wj ∂ν wj dσ = −
uj ∂ν vj dσ =
'±
'±
\'
Hence, from (3.8) and (3.9), we have (3.7).
✷
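For the behavior of $w = w(\cdot, \gamma(t))$ as $t \uparrow T(\gamma, \Sigma)$ if $\gamma(T(\gamma, \Sigma)) \in \Sigma$, we have the following lemma, which will be proven in the Appendix.
LEMMA 3.3. Assume $\gamma(T(\gamma, \Sigma)) \in \Sigma$.
(i) $\int_{\Omega \setminus \Sigma} F w \, dx$ and $\int_{\Gamma_D} f \, \partial_\nu w \, d\sigma$ are bounded as $t \uparrow T(\gamma, \Sigma)$.
(ii) $\int_{\Omega \setminus \Sigma} |\nabla w|^2 \, dx \to \infty$ ($t \uparrow T(\gamma, \Sigma)$).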
Now we prove (2.23). Since (2.23) is obvious when $\gamma \cap \Sigma = \emptyset$, it suffices to prove
\[ (0, T(\gamma, \Sigma)) = (0, t(\gamma, \Sigma)) \quad \text{when } \gamma \cap \Sigma \ne \emptyset. \tag{3.10} \]
Clearly, we have
\[ (0, T(\gamma, \Sigma)) \subset (0, t(\gamma, \Sigma)). \tag{3.11} \]
Suppose there exists $t \in (0, t(\gamma, \Sigma))$ such that $t \ge T(\gamma, \Sigma)$. By the definition of $t(\gamma, \Sigma)$,
\[ |I(s, \gamma)| \le C_0, \qquad 0 < s \le T(\gamma, \Sigma) \tag{3.12} \]
with $C_0 = \sup_{0<s\le t} |I(s, \gamma)|$. But, by Lemma 3.3, this is impossible. Hence, $(0, t(\gamma, \Sigma)) \subset (0, T(\gamma, \Sigma))$ and we have (3.10).
Now we comment on the modifications necessary for problems of type 1. We have to change the definition of $w_j$ given just before Lemma 3.1 to $w_j := u_j - (v_j - \int_\Gamma v_j \, d\sigma) \in H^1_\#(\Omega \setminus \Sigma)$. Also, in (3.1) and Lemmas 3.1–3.3, we have to delete whatever we had for $\Gamma_D$ and replace $H^1(\Omega \setminus \Sigma)$ by $H^1_\#(\Omega \setminus \Sigma)$.
Next we summarize our reconstruction formulae for identifying $\Sigma$ from $\Lambda_\Sigma^{(1)}$ or $\Lambda_\Sigma^{(2)}$. The steps given below for the reconstruction pertain to the Neumann to Dirichlet map $\Lambda_\Sigma^{(2)}$. For $\Lambda_\Sigma^{(1)}$, Steps 2 and 3 have to be modified more than just changing $I_2(t, \gamma)$ and $t_2(\gamma, \Sigma)$ to $I_1(t, \gamma)$ and $t_1(\gamma, \Sigma)$, respectively. The modified steps for Steps 2 and 3 are given as Steps 2′ and 3′, respectively.
Step 1. Consider a needle $\gamma = \{\gamma(t);\ 0 \le t \le 1\}$ and the domain $\Omega \setminus \gamma_t$ with $\gamma_t := \{\gamma(s);\ 0 < s \le t\}$.
Step 2. Take harmonic functions $v_j'' \in H^1(\Omega)$ ($j \in \mathbb{N}$) which approximate $G(x, \gamma(t)) = (4\pi |x - \gamma(t)|)^{-1}$. (See (2.21) for details.)
Step 3. Solve (3.3) for $v' \in H^1(\Omega)$ and compute $g_j = \partial_\nu (v' + v_j'')|_{\Gamma_N}$.
Step 4. Compute the indicator function $I_2(t, \gamma) = \lim_{j \to \infty} \langle g_j, (\Lambda_\Sigma - \Lambda_\emptyset) g_j \rangle$ for small t.
Step 5. Increase t and search for the t where $|I(t, \gamma)|$ blows up. Denote this t by $t_2(\gamma, \Sigma)$. By Theorem 2.8, this gives the first hitting time $T(\gamma, \Sigma)$.
Step 6. Take many $\gamma$'s and repeat all the previous steps. Plot all the points $\gamma(T(\gamma, \Sigma))$ for these $\gamma$'s. Then these points generate the crack $\Sigma$.
Steps 2′ and 3′ are as follows:
Step 2′. Take harmonic functions $v_j \in H^1(\Omega)$ ($j \in \mathbb{N}$) which approximate $G(x, \gamma(t)) = (4\pi |x - \gamma(t)|)^{-1}$. (See (2.17) for details.)
Step 3′. Compute $g_j = \partial_\nu v_j|_\Gamma$.
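Step 5 can be illustrated with a small numerical sketch. It is not the authors' numerical realization, which is deferred to forthcoming papers. It assumes that the indicator function has already been sampled along one needle, and it replaces the blow-up in Definition 2.7 by a finite threshold. The names `first_blowup_time` and `synthetic_indicator`, the threshold value and the synthetic data are hypothetical choices made only for illustration.

```python
import math


def first_blowup_time(times, indicator_values, threshold):
    """Return the first sampled time at which |I(t)| exceeds `threshold`.

    Returns None if all samples stay below the threshold. A finite
    threshold only approximates the blow-up used to define the
    detecting time.
    """
    for t, value in zip(times, indicator_values):
        if not math.isfinite(value) or abs(value) > threshold:
            return t
    return None


def synthetic_indicator(t, hitting_time=0.6):
    """Stand-in for I(t, gamma): unbounded as t approaches hitting_time."""
    return 1.0 / (hitting_time - t) if t < hitting_time else math.inf


# Sample the synthetic indicator along the needle parameter t in (0, 1).
times = [k / 1000 for k in range(1, 1000)]
values = [synthetic_indicator(t) for t in times]
estimate = first_blowup_time(times, values, threshold=1.0e3)
```

Repeating such a search over many needles and plotting the points reached at the estimated times corresponds to Step 6. In practice the choice of threshold and the noise in the data would need care that this sketch does not address.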
4. Crack in an Elastic Medium
In this section, we give the corresponding result without proof for a homogeneous elastic medium with elasticity tensor $C = (C_{ijk\ell})$ satisfying the full symmetries:
\[ C_{ijk\ell} = C_{jik\ell} = C_{k\ell ij}, \qquad 1 \le i, j, k, \ell \le 3, \tag{4.1} \]
and the strong convexity condition:
\[ \sum_{i,j,k,\ell=1}^{3} C_{ijk\ell}\, \varepsilon_{ij} \varepsilon_{k\ell} \ge \delta \sum_{i,j=1}^{3} \varepsilon_{ij}^2 \tag{4.2} \]
for any symmetric matrix $(\varepsilon_{ij})$ and some $\delta > 0$. Then Theorem 2.8 also holds for this elastic medium with a crack $\Sigma$ and we have the same reconstruction formula for identifying $\Sigma$ from the Neumann to Dirichlet map. While the proofs are different from those for conductive media, the main idea of the proof remains the same as before. Hence, we refrain from giving the details of the whole argument, but for preciseness we present the definitions of the Neumann to Dirichlet maps $\Lambda_\Sigma^{(j)}$ ($j = 1, 2$) and the indicator functions $I_j(t, \gamma)$ ($j = 1, 2$). For simplicity, we continue using the same notation as before and we only consider the case $\Gamma_D \ne \emptyset$, on which we fix the displacement. We define $\Lambda_\Sigma^{(2)}$ by
\[ \Lambda_\Sigma^{(2)} g := u|_{\Gamma_N}, \tag{4.3} \]
where u is the solution to the mixed type boundary value problem
\[ \begin{cases} L u = F & \text{in } \Omega \setminus \Sigma, \\ \partial_L u = 0 & \text{on } \Sigma, \\ u = f \ \text{on } \Gamma_D, \quad \partial_L u = g & \text{on } \Gamma_N, \end{cases} \tag{4.4} \]
with 3 × 3 matrices of operators L and $\partial_L$ whose (i, k) components are
\[ \sum_{j,\ell=1}^{3} C_{ijk\ell}\, \partial_j \partial_\ell \tag{4.5} \]
and
\[ \sum_{j,\ell=1}^{3} C_{ijk\ell}\, \nu_j \partial_\ell, \tag{4.6} \]
respectively. The indicator function $I_2(t, \gamma)$ is defined in the same way as (2.19) with the modification that the pairing $\langle\ ,\ \rangle_2$ is between $(H^{-1/2}(\Gamma_N))^3$ and $(\dot H^{1/2}(\Gamma_N))^3$. The functions $g_j$ ($j \in \mathbb{N}$) are defined as before except that $\Delta$ and $\partial_\nu$ in (2.20) are replaced by L and $\partial_L$, respectively. Also, we have to change the definition of $G(x, x^0)$ in (2.21) and (2.18) by
\[ G(x, x^0) = E(x, x^0)\, b, \tag{4.7} \]
where $E(x, x^0)$ is the fundamental solution of L and b is any fixed nonzero constant vector.
Acknowledgements
The authors would like to thank Prof. Chi-Sing Man who kindly corrected our
English. Also, the second author would like to thank Prof. Kohji Ohtsuka who
taught him a lot about fracture mechanics.
Appendix
In this appendix we state Runge’s theorem, which we use in the construction of vj′′
(j ∈ N), and we prove Lemma 3.3.
THEOREM A.1 (Runge’s theorem). Let U be an open subset of such that U ⊂
and \ U is connected. Define the two spaces X, Y of functions by
1
X := {u|U ; u ∈ H (U ),
1
Y := {v|U ; v ∈ H (),
u = 0 in U },
(A.1)
v = 0 in , supp(v|Ŵ ) ⊂ Ŵ0 },
(A.2)
where U is an open subset of U depending on u such that U ⊂ U ⊂ U ⊂ and
1
Ŵ0 is a fixed open subset of ŴN . Then, Y is dense in X with respect to H (U ) norm.
Proof. The proof is given in [9].
✷
Proof of Lemma 3.3. Let x 0 = γ (t) ∈ \ ' and a = x(T (γ , ')). Suppose
x ∼ a (i.e., |x 0 − a| ≪ 1). Let y = (y1 , y2 , y3 ) = (y1 (x, x 0 ), y2 (x, x 0 ), y3 (x, x 0 ))
be the boundary normal coordinates near the point a such that
0
y(a) = 0,
∂y(x, x 0 )
|x=x 0 = I,
∂x
− = {y1 < 0} near a,
(A.3)
where I is the identity matrix. Let
A(x) := |J (x)|−1 J (x)(tJ (x)),
x(y(x, x 0 ), x 0 ) = x,
(A.4)
where J (x) = ∂y(x, x0 )/∂x.
Also, let
Ã(y) = A(x(y, x 0 )),
ũ(y) = u(x(y, x 0 )),
y 0 = y(x 0 , x 0 ).
(A.5)
Then, it is easy to see
(i) Ã(y) ∈ C 1 near y = 0,
(ii) |J |−1 = ∇ · Ã∇ near y = 0,
(iii) δ(x(y; x 0 ) − x 0 ) = δ(y − y 0 ),
(iv) ∂ν = ∂y1 .
To simplify our expression, we introduce the following definition.
DEFINITION A.2. Let X be a function space defined on an open subset of R3
and let {g(·, x 0 )}, {r(·, x 0 )} be families of distributions defined on X, which depend
on x 0 ∼ a. Then, we write g(·, x 0 ) ∼ r(·, x 0 ) in X if and only if {g(·, x 0 ) −
r(·, x 0 ); x 0 ∼ a} is bounded in X.
Let V ⊂ R3 be a small open neighborhood of y = 0 such that V± := V ∩ R3±
with R3± := {±y1 > 0} and has C 2 boundary and β± , β0 are open subsets of the
boundary ∂V± of V± such that
∂V± = β¯± ∪ β¯0 ,
β± ∩ β0 = ∅,
β± ⊂ R3± ,
β0 ⊂ {y1 = 0}.
(A.6)
in H 1 (V± ),
(A.7)
By a direct computation, we have
G(y, y 0 ) := G(x(y, y 0 ), x(y 0 , y 0 )) ∼ G(y, y 0 )
=0 (y, y 0 ) by
where y 0 ∼ 0 plays the role of x 0 in Definition A.2. Define w
±
=0 (y, y 0 ) = ±G(y, ∓y 0 , y 0 ′ )
w
±
1
in R3± with y10 > 0,
(A.8)
′
where y 0 = (y10 , y20 , y30 ), y 0 = (y20 , y30 ). Then, by a direct computation, we have
/
=0 = 0
in R3± ,
w
±
(A.9)
=0 = −∂ G(y, y 0 ) on y = 0.
∂y1 w
y1
1
±
=± ∈ H 1 (V± ) to
Consider the solution Z
/
=0 + G)
=± = −∇ · (Ã(y) − Ã(y 0 ))∇(w
∇ · Ã∇ Z
±
=± = 0 on β0 ,
=± = 0
∂y1 Z
Z
in V±
on β± .
=0 + G) ∼ 0 in L2 (V ), we have
Then, since (Ã(y) − Ã(y 0 ))∇(w
±
±
=± ∼ 0 in H 1 (V± ).
Z
Define w
=± by
=0 − (G − G).
=± + w
w
=± = Z
±
(A.10)
(A.11)
(A.12)
Then, by a direct computation, we have
∇ · Ã∇ w
=± = 0 in V± ,
∂y1 w
=± = −∂y1 G on y1 = 0.
(A.13)
Then, by (3.2) and (A.13), w1 := w − w0 satisfies
⎧
⎨ w1 = −2∇ζ · ∇w0′ − ( ζ )w0′
in \ ',
∂ν w1 = −∂ν v ′ + (ζ − 1)∂ν G − (∂ν ζ )w0′ on ',
⎩
w1 = 0 on ŴD ,
∂ ν w1 = 0
on ŴN .
(A.15)
Now, let $\zeta \in C_0^\infty(\Omega)$ satisfy $\zeta = 1$ in a small open neighborhood of a and
supp ζ ∩ S ⊂ ' and define w0 ∈ H 1 ( \ ') by
w
=+ (y(x, x 0 ), y 0 ) in + ∩ supp ζ ,
′
′
w0 = ζ w0 , w0 =
(A.14)
w
=− (y(x, x 0 ), y 0 ) in − ∩ supp ζ .
Note that (A.15) is equivalent to the variational equation:
{−2w0′ (∇ζ · ∇ϕ) + ( ζ )w0′ ϕ} dx
∇w1 · ∇ϕ dx =
\'
\'
{−∂ν v ′ + (ζ − 1)∂ν G − (∂ν ζ )w0′ }ϕ dσ
+
(A.16)
'±
for any ϕ ∈ W . Also, we have
′
(∂ν ζ )w0 ϕ dσ =
(∂ν ζ )w0′ ϕ dσ Cηw0′ L2 (S) ϕL2 (S)
±
S
Cηw0′ H 1 (\') ϕH 1 (\') ,
(A.17)
where η ∈ C0∞ (), η = 1 on supp(∇ζ ) and a ∈
/ supp η. Hence the behavior of w is
controlled by that of w0 . Therefore, (i) and (ii) of Lemma 3.3 follow immediately
from (A.7), (A.8) and (A.17). This completes the proof.
✷
References
1. S. Andrieux, A. Ben Abda and H.D. Bui, Reciprocity principle and crack identification. Inverse Problems 15 (1999) 59–65.
2. A. Ben Abda, H. Ben Ameur and M. Jaoua, Identification of 2D cracks by elastic boundary
measurements. Inverse Problems 15 (1999) 67–77.
3. A. Ben Abda and H.D. Bui, Reciprocity principle and crack identification in transient thermal
problems. J. Inverse Ill-Posed Problems 9 (2001) 1–6.
4. M. Brühl, M. Hanke and M. Pidcock, Crack detection using electrostatic measurements. Math.
Model. Numer. Anal. 35 (2001) 595–605.
5. K. Bryan and M. Vogelius, A computational algorithm to determine crack locations from electrostatic boundary measurements. The case of multiple cracks. Internat. J. Engrg. Sci. 32 (1994) 579–603.
6. M. Eller, Identification of cracks in three-dimensional bodies by many boundary measurements.
Inverse Problems 12 (1996) 395–408.
7. L. Hörmander, Linear Partial Differential Operators, 3rd revised edn. Springer, Berlin (1969).
8. M. Ikehata, Reconstruction of the shape of the inclusion by boundary measurements. Comm.
Partial Differential Equations 23 (1998) 1459–1474.
9. M. Ikehata, Reconstruction of inclusion from boundary measurements. J. Inverse Ill-Posed
Problems 10 (2002) 37–65.
10. V. Isakov, On uniqueness of recovery of a discontinuous conductivity coefficient. Comm. Pure
Appl. Math. 14 (1988) 863–877.
11. A. Kirsch and S. Ritter, The linear sampling method for inverse scattering from an open arc.
Inverse Problems 16 (2000) 89–105.
12. R. Kress, Inverse elastic scattering from a crack. Inverse Problems 12 (1996) 667–684.
An Approximate Treatment of Blunt Body Impact
R.J. KNOPS1 and PIERO VILLAGGIO2
1 Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, Scotland, UK
2 Dipartimento di Ingegneria Strutturale, Università di Pisa, 56126 Pisa, Italy
Received 29 October 2002; in revised form 21 July 2003
Abstract. This paper considers a blunt body, modelled by an elastic-perfectly plastic one-dimensional bar, impacting normally against a rigid fixed target as indicated in Figure 1. When the impact
velocity is small, the bar behaves elastically during the ensuing motion and rebounds with an equal
and opposite velocity to that on impact. But for large impact velocity, part of the bar adjacent to the
point of contact experiences permanent plastic deformation reducing the rebound velocity. The illuminating theory developed by Taylor [10] analyzed the impact of a rigid-plastic bar. We extend this
treatment by employing a semi-inverse procedure combined with energy conservation to additionally
take into account elastic deformation.
Mathematics Subject Classifications (2000): 74M20, 74C99, 74H99.
Key words: impact, elastic-perfectly plastic blunt body.
1. Introduction
A slender bar of uniform cross-section A, uniform mass density ρ, and undeformed
length ℓ, is projected normally in the horizontal direction towards a rigid smooth
vertical plane as shown in Figure 1. With respect to a Cartesian set of coordinate
axes, the x-axis is directed along the bar and the origin is located at the point
of initial impact. During the period immediately prior to striking the plane, the
motion of the bar is assumed to be rigid and to possess a longitudinal velocity
v1 . The bar behaves as a linear elastic-perfectly plastic material. Consequently, the
deformation is linear elastic with Young’s modulus E until a limiting value σp
of the stress is attained, after which the material starts to yield at constant stress
σp . Unloading, whenever it occurs, is always linear elastic according to classical
plasticity theory ([5, p. 47]; see also [3]). When the impact speed v1 is sufficiently
small or the yield stress σp is sufficiently large, the deformation after impact remains elastic throughout the bar. The longitudinal motion of the bar satisfies the
standard linear wave equation and the displacement is described by d’Alembert’s
solution (see, e.g., [4, art. 281]). A different analysis, however, is required when
either condition is not satisfied and parts of the bar become plastic. The wave
equation ceases to be valid in these parts, and, moreover, the precise extent of the
539
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 539–554.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Figure 1.
plastic region is unknown beforehand. Such complications partly explain why the
problem has failed to receive a satisfactory treatment since its inception 150 years
ago.
A notable contribution by Taylor [10] attempted to circumvent the difficulty
by adopting a rigid-plastic stress–strain law. He proposed that immediately after impact the elastic and plastic regions are separated by a “shock front” that
propagates into the bar. Behind the front the bar is at rest in plastic deformation. Taylor’s approach was probably motivated by, and certainly elucidates, the
experimentally observed characteristic shape of projectiles after they have been
fired against armour or similar rigid targets. The shape has been confirmed by
Whiffin [11] in experiments successively repeated since the original investigations
by Boltzmann [1]. Nevertheless, Taylor’s method is perhaps open to criticism for two reasons. The first is that a rigid-plastic stress–strain law excludes any elastic deformation; the second is the assumption that the bar does not rebound
but remains in contact with the rigid plane obstacle and progressively comes to rest
as the plastic shock front propagates through the bar. Expressed otherwise, these
criticisms query the assumption that the kinetic energy immediately prior to impact
is totally converted into the work of plastic deformation.
There is an obvious theoretical and technical interest in the nonelastic behavior
of a bar that rebounds after impact and for which the initial kinetic energy is only
partially recoverable from the elastic deformation. Equally important is the determination of the corresponding coefficient of restitution, which Stronge [8, art. 5.1]
defines to be the ratio of the kinetic energies at the end of the rebound period to that
on impact. He also remarks that energy can be dissipated either by elastic vibrations
generated on impact, by viscosity, or by plastic deformation. The greatest energy
is absorbed by the plastic deformation especially in ductile materials that include
most metals.
Intermediate between the extremes of a perfectly elastic body and Taylor’s
rigid-plastic treatment, is another possibility. The body, immediately after impact,
experiences everywhere an elastic deformation into which subsequently propagates
a region of plastic deformation commencing from the end in contact with the rigid
target. Provided the plastic deformation does not extend to the whole bar, the elastic
strain energy is recoverable and is fully available to contribute to the energy of
rebound.
We here begin to explore this proposed intermediate mode of deformation.
Certain simplifying assumptions are introduced, the most notable of which is the
classic equivalent reduction of a continuous system to one possessing a single
degree of freedom. The reduction, first proposed by Cox [2], was further considered by Saint-Venant and Flamant [7] and improved by Pöschl [6]. When applied to a partly plastically deformed bar, it enables energy conservation to be
applied to determine the maximum compression as well as the coefficient of restitution. Furthermore, the period of compression and recovery during which the
end of the bar remains in contact with the rigid obstacle also may be calculated.
These two phases of the deformation are of different duration in an elastic-plastic bar and neither is equal to the respective periods when the deformation is
wholly elastic. Section 2 sketches the reduction of the basic theory to one dimension and by means of a semi-inverse procedure with a general function examines the motion when the deformation is completely elastic. A condition is
obtained for plastic deformation to occur. Section 3 examines a propagating plastic region and derives expressions for the respective strain and kinetic energies
again using a semi-inverse procedure. Energy conservation leads to the determination of the maximum compression of the bar and the time taken to achieve this,
as well as the coefficient of restitution. A final section introduces a particular
set of functions for the semi-inverse procedure in order to illustrate the results
derived in Section 3, and to compare them with those obtained by Taylor and
Whiffin.
2. The Reduced System: Perfectly Elastic Behavior
Let the time be denoted by t and consider the bar of Section 1 at the instant
t = 0 when one end B is first in contact with the rigid plane obstacle. Immediately
after impact, it is assumed that there is a brief “conversion” period [0, t0 ], where
t0 does not exceed l/c0 and c0 is the speed of propagation of longitudinal elastic
waves. During this period, a disturbance penetrates into the bar causing the velocity
to change from the rigid translational velocity v1 of the bar immediately prior to
impact. Let F (x1 , x2 , t) be the reactive force at the end B; u(x, t) the longitudinal displacement; e = ∂u/∂x the corresponding strain; and σ the axial stress.
Differentiation with respect to the time variable is indicated by a superposed dot.
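In terms of these quantities and in an obvious notation, the principle of linear momentum gives
\[ \int_0^{t_0}\!\!\int_A F(x_1, x_2, \eta)\, dS\, d\eta = \left[ \int_0^{\ell} \rho A\, \dot u(x, t)\, dx \right]_0^{t_0}, \tag{2.1} \]
while the rate of work equation may be expressed as
\[ \left[ \frac{1}{2} \int_0^{\ell} \rho A\, \dot u^2(x, t)\, dx \right]_0^{t_0} + \int_0^{t_0}\!\!\int_0^{\ell} \sigma(x, \eta)\, \dot e(x, \eta)\, dx\, d\eta = 0, \tag{2.2} \]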
in which we have taken u(x, 0) = 0 and σ (ℓ, t) = 0. The pair of unknowns
F (x1 , x2 , t) and u(x, t) are to be determined from (2.1) and (2.2). The solution,
however, presents severe difficulties that are here overcome by the introduction of
the following simplifying assumptions based upon an averaging procedure adopted
by Cox [2] and subsequently developed by others, notably Pöschl [6]:
(a) The reactive force F (x1 , x2 , t) remains finite during the interval [0, t0 ].
(b) The interval [0, t0 ] is small compared to the subsequent period of deformation.
(c) The velocity at the end of the small interval [0, t0 ] is spatially affine to the
static compression of a heavy column supported at its base.
(d) The displacement of the bar is zero throughout the interval [0, t0 ].
Furthermore it is supposed that
(e) The displacement remains longitudinal for $t \ge t_0$.
Assumptions (a) and (b) imply that the left side of (2.1) can be neglected and consequently linear momentum is conserved [9, Section 24]:
\[ v_1 = \frac{1}{\ell} \int_0^{\ell} \dot u(x, t_0)\, dx. \tag{2.3} \]
In accordance with Assumption (c), we represent the speed at $t = t_0$ by (cf. Szabó [9, p. 374]):
\[ \dot u(x, t_0) = -V f(x), \tag{2.4} \]
where V is a positive constant and f(x) is given by
\[ f(x) = \frac{x(2\ell - x)}{\ell^2}. \tag{2.5} \]
Note that the quadratic function f(x) vanishes at x = 0, and becomes unity at x = ℓ.
Instead, however, of the explicit choice (2.5) for the function f(x), it is convenient to retain a function that is differentiable but otherwise arbitrary apart from the properties:
\[ f(0) = 0, \quad f(\ell) = 1, \quad f'(\ell) = 0, \quad f'(x) \ge 0, \quad f''(x) \le 0, \tag{2.6} \]
where a superposed prime indicates differentiation with respect to the variable x.
It follows from (2.3) and (2.4) that
\[ \kappa_1 V = -v_1, \tag{2.7} \]
where
\[ \kappa_1 = \frac{1}{\ell} \int_0^{\ell} f(x)\, dx \tag{2.8} \]
is a reduction factor for the total mass m = ρAℓ of the bar due to the speed V.
At the end of period [0, t0 ] the bar experiences a compressive deformation that
achieves first maximum compression at the instant t1 to be determined. The strain
in the interval [t0 , t1 ] is calculated by the semi-inverse procedure in which the
longitudinal displacement u(x, t) is assumed to be of the separable form:
\[ u(x, t) = -\omega(t) f(x), \qquad t \ge t_0, \tag{2.9} \]
where by continuity $\dot\omega(t_0) = V$, and by Assumption (d), $\omega(t_0) = 0$. Of course, insofar as the displacement (2.9) satisfies neither the equilibrium equations nor the equations of motion, it is to be regarded only as an approximation. This remark applies equally to the subsequent introduction in Section 3 of a similar expression (3.1) for the displacement. The longitudinal velocity obviously becomes
\[ \dot u(x, t) = -\dot\omega(t) f(x), \qquad t \ge t_0, \tag{2.10} \]
and the corresponding strain is
\[ e(x, t) = \frac{\partial u(x, t)}{\partial x} = -\omega(t) f'(x), \qquad t \ge t_0. \tag{2.11} \]
For the remainder of this section it is supposed that when $t \ge t_0$ the deformation is entirely elastic so that the accompanying stress is given by
\[ \sigma(x, t) = -E \omega(t) f'(x), \qquad t \ge t_0, \tag{2.12} \]
where E is Young’s modulus.
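The strain energy is expressed by
\[ U(t) = \frac{1}{2} \int_0^{\ell} E A e^2\, dx = \frac{1}{2} E A \omega^2(t) \int_0^{\ell} f'^2(x)\, dx = \frac{1}{2} \frac{mE}{\rho}\, \omega^2(t)\, \kappa_2, \qquad t \ge t_0, \tag{2.13} \]
where
\[ \kappa_2 = \frac{1}{\ell} \int_0^{\ell} f'^2(x)\, dx. \tag{2.14} \]
Moreover, the kinetic energy is
\[ K(t) = \frac{1}{2} \int_0^{\ell} \rho A\, \dot\omega^2(t) f^2(x)\, dx = \frac{1}{2} m\, \dot\omega^2(t)\, \kappa_3, \qquad t \ge t_0, \tag{2.15} \]
where
\[ \kappa_3 = \frac{1}{\ell} \int_0^{\ell} f^2(x)\, dx \tag{2.16} \]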
is another reduction factor [9, Section 24]. Conservation of energy yields
\[ \frac{E m \kappa_2}{\rho}\, \omega^2(t) + m \kappa_3\, \dot\omega^2(t) = \int_0^{\ell} \rho A\, \dot u^2(x, t_0)\, dx, \qquad t \ge t_0. \tag{2.17} \]
Assumption (e) implies that no strain energy is created during the conversion period $[0, t_0]$, and accordingly this quantity vanishes at $t = t_0$. By virtue of (2.4) and (2.7), we obtain from (2.17) the relation
\[ \frac{E \kappa_2}{\rho \kappa_3}\, \omega^2(t) + \dot\omega^2(t) = \left( \frac{v_1}{\kappa_1} \right)^2, \qquad t \ge t_0. \tag{2.18} \]
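By hypothesis, we have the initial conditions:
\[ \omega(t_0) = 0, \tag{2.19} \]
\[ \dot\omega(t_0) = V = \frac{v_1}{\kappa_1}, \tag{2.20} \]
and the solution to the simple harmonic motion (2.18) becomes
\[ \omega(t) = \left( \frac{\rho \kappa_3}{E \kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1} \sin\left[ \left( \frac{E \kappa_2}{\rho \kappa_3} \right)^{1/2} (t - t_0) \right]. \tag{2.21} \]
The maximum compression of the bar corresponds to the value of ω(t) given by
\[ \omega(t_1) = \left( \frac{\rho \kappa_3}{E \kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1}, \tag{2.22} \]
and occurs at the instant $t_1$, where $(t_0, t_1)$ is the compression period of length
\[ t_1 - t_0 = \frac{1}{2} \pi \left( \frac{\rho \kappa_3}{E \kappa_2} \right)^{1/2}. \tag{2.23} \]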
We conclude from (2.6) and (2.12) that the maximum stress $\sigma_M$ occurs at the end B at the instant $t = t_1$ and is given by
\[ \sigma_M = E \omega(t_1) f'(0) = \left( \frac{E \rho \kappa_3}{\kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1} f'(0). \tag{2.24} \]
The assumption that linear isotropic elasticity adequately describes the material behavior is justified provided that $\sigma_M$ does not exceed the plastic compressive yield stress $\sigma_P$. Consequently, the expression (2.24) implies that the deformation ceases to be elastic whenever (after suitable adjustment of signs)
\[ \sigma_P \le \left( \frac{E \rho \kappa_3}{\kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1} f'(0). \tag{2.25} \]
Condition (2.25) holds when either the plastic compressive yield stress is sufficiently small or the impact velocity is sufficiently large and ensures that the bar
becomes progressively plastic. Such regions of plastic deformation, however, can
transmit only a stress σP . The remaining parts of the bar continue in elastic motion
since the longitudinal stress nowhere exceeds the value σP . Elastic-plastic behavior
is discussed in the next section.
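The elastic estimates of this section are easy to evaluate for the explicit quadratic profile (2.5). The following is a minimal sketch under stated assumptions: the reduction factors are computed by a composite Simpson rule, the impact velocity enters as a positive speed, and the material values are steel-like numbers chosen only for illustration. The function names `simpson`, `reduction_factors` and `elastic_impact` are hypothetical and do not appear in the paper.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0


def reduction_factors(f, df, ell):
    """Return (kappa1, kappa2, kappa3) as defined in (2.8), (2.14), (2.16)."""
    k1 = simpson(f, 0.0, ell) / ell
    k2 = simpson(lambda x: df(x) ** 2, 0.0, ell) / ell
    k3 = simpson(lambda x: f(x) ** 2, 0.0, ell) / ell
    return k1, k2, k3


def elastic_impact(E, rho, ell, speed, sigma_p):
    """Evaluate (2.22)-(2.25) for the quadratic profile (2.5).

    `speed` is the impact speed, taken positive here (an assumption).
    """
    f = lambda x: x * (2.0 * ell - x) / ell ** 2
    df = lambda x: 2.0 * (ell - x) / ell ** 2
    k1, k2, k3 = reduction_factors(f, df, ell)
    tau = math.sqrt(rho * k3 / (E * k2))      # inverse angular frequency in (2.21)
    omega_max = tau * speed / k1              # (2.22)
    compression_time = 0.5 * math.pi * tau    # (2.23)
    sigma_max = E * omega_max * df(0.0)       # (2.24)
    return {
        "kappa": (k1, k2, k3),
        "omega_max": omega_max,
        "compression_time": compression_time,
        "sigma_max": sigma_max,
        "yields": sigma_p <= sigma_max,       # condition (2.25)
    }


# Illustrative, assumed values in SI units.
result = elastic_impact(E=210e9, rho=7800.0, ell=0.1, speed=5.0, sigma_p=250e6)
```

For the quadratic profile the factors can also be found by hand: $\kappa_1 = 2/3$, $\kappa_2 = 4/(3\ell^2)$ and $\kappa_3 = 8/15$, which gives a quick check on the quadrature.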
3. The Reduced System: Elastic Plastic Behavior
In this section it is assumed that the plastic compressive yield stress satisfies a
condition corresponding to (2.25) and the plastic deformation extends progressively along the bar during the compression subsequent to a period [t0 , t2 ] of elastic
motion throughout the bar, and the brief conversion period [0, t0 ] defined in Section 2. In particular, t0 is still supposed sufficiently small to justify condition (d) of
Section 2, namely that u(x, t0 ) is zero.
We seek to determine the instant t2 at which plastic deformation first occurs
and then proceed to construct individual expressions for the different contributions
to the stored and kinetic energies. We deduce the length of the plastic region, and
the instant T , when there is first instantaneous maximum compression, and calculate the coefficient of restitution. Because the maximum compressive elastic stress
always occurs at the end B in contact with the rigid target, plastic deformation
penetrates into the bar from this point. It is supposed that a narrow transitional
zone separates the regions of elastic and plastic deformation, and it is convenient
to abstract the zone into a line discontinuity between the regions. We let z(t) be the
distance of the discontinuity from the end B measured in the deformed bar at time
t ∈ [t2 , T ]. In what follows, all deformations are assumed sufficiently small to justify neglect of second and higher order terms. We frequently, for example, confuse
the distance z(t) with the corresponding distance measured in the undeformed or
reference configuration of the bar. For simplicity, it is initially supposed that the
plastic region is reduced to rest, although subsequently we indicate an approximate
form for the kinetic energy when the plastic region is in motion.
As just mentioned, during the period $[t_0, t_2]$ a compressive elastic motion occurs throughout the bar, and thereafter is confined to the length $[z(t), \ell]$. Accordingly, we represent the elastic displacement by the semi-inverse expression
\[ u(x, t) = -\Omega(t) f(x), \qquad \begin{cases} 0 \le x \le \ell, & t_0 \le t \le t_2, \\ z(t) \le x \le \ell, & t_2 \le t \le T, \end{cases} \tag{3.1} \]
where $\Omega(t)$, not necessarily identical to the function ω(t) introduced in (2.9), is to be determined; and the function f(x) satisfies conditions (2.6). The negative sign in (3.1) indicates contraction of the bar. The corresponding speed, strain and stress produced by the elastic displacement (3.1) on its region of definition are given respectively by
\[ \dot u(x, t) = -\dot\Omega(t) f(x), \tag{3.2} \]
\[ e(x, t) = -\Omega(t) f'(x), \tag{3.3} \]
\[ \sigma(x, t) = -E\, \Omega(t) f'(x), \tag{3.4} \]
where, as before, E denotes Young’s modulus.
Assumption (d) of Section 2 implies the initial values
\[ \Omega(t_0) = 0, \qquad \dot\Omega(t_0) = V = \frac{v_1}{\kappa_1}. \tag{3.5} \]
We conclude from (3.4) and (2.6) that the elastic compressive stress assumes its maximum at the end B (x = 0) and consequently plastic deformation commences from this point provided the condition corresponding to (2.25) is satisfied. We let $t_2$ be the instant at which the elastic stress at x = 0 achieves the value $-\sigma_P$ of the plastic compressive yield stress. It follows from equations corresponding to (2.21) that
\[ \sigma_P = E\, \Omega(t_2) f'(0) \tag{3.6} \]
\[ \phantom{\sigma_P} = \left( \frac{\rho E \kappa_3}{\kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1} f'(0) \sin\left[ \left( \frac{E \kappa_2}{\rho \kappa_3} \right)^{1/2} (t_2 - t_0) \right], \tag{3.7} \]
and $t_2$ may be calculated from (3.7) in terms of $\sigma_P$.
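Inverting (3.7) for the onset time is a one-line computation. The sketch below assumes that the sine argument carries the same frequency $(E\kappa_2/(\rho\kappa_3))^{1/2}$ as in (2.21), that the impact velocity enters as a positive speed, and that the quadratic profile (2.5) is used; the function name `plastic_onset_delay` and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def plastic_onset_delay(E, rho, k1, k2, k3, df0, speed, sigma_p):
    """Return t2 - t0 obtained by inverting (3.7).

    Returns None when the peak elastic stress never reaches sigma_p,
    i.e. when the bar stays elastic. `df0` stands for f'(0).
    """
    tau = math.sqrt(rho * k3 / (E * k2))
    sigma_max = math.sqrt(E * rho * k3 / k2) * (speed / k1) * df0
    if sigma_p > sigma_max:
        return None
    return tau * math.asin(sigma_p / sigma_max)


# Closed-form reduction factors of the quadratic profile (2.5).
ell = 0.1
k1, k2, k3 = 2.0 / 3.0, 4.0 / (3.0 * ell ** 2), 8.0 / 15.0
delay = plastic_onset_delay(210e9, 7800.0, k1, k2, k3, 2.0 / ell, 5.0, 250e6)
```

The delay returned in this way never exceeds the elastic compression period (2.23), since the arcsine is at most π/2.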
At a subsequent instant $t \in [t_2, T]$ the elastic plastic interface has moved to the point z(t). The stress on the side adjacent to the interface in the elastic region is given by (3.4) and must equal the plastic compressive yield stress. Consequently, we have
\[ \sigma_P = E\, \Omega(t) f'(z(t)), \qquad 0 \le z(t) \le \ell, \quad t \in [t_2, T], \tag{3.8} \]
where the reference and deformed positions are not distinguished in accordance with the linear approximation adopted here. On noting that $z(t_2) = 0$, we obtain from (3.8) the lower bound
\[ \Omega(t) \ge \frac{\sigma_P}{E f'(0)}, \qquad t \in [t_2, T]. \tag{3.9} \]
Moreover, differentiation of (3.8) leads to
\[ \dot\Omega(t) = -\Omega(t)\, \frac{f''(z(t))}{f'(z(t))}\, \dot z(t), \qquad t \in [t_2, T], \tag{3.10} \]
which by (3.8) may be written as
\[ \dot\Omega(t) = -\frac{\sigma_P}{E}\, \frac{f''(z(t))}{f'^2(z(t))}\, \dot z(t), \qquad t \in [t_2, T]. \tag{3.11} \]
Let us also remark that as the plastic deformation progresses along the bar, the (true) stress distribution at the instant $t \in [t_2, T]$ is given by
\[ \sigma = -\sigma_P, \qquad 0 \le x \le z(t), \tag{3.12} \]
\[ \phantom{\sigma} = -E\, \Omega(t) f'(x), \qquad z(t) \le x \le \ell. \tag{3.13} \]
We continue the analysis by employing, as in Section 2, conservation of energy and observe that the total strain energy U(t) of the bar at any time $t \in [t_2, T]$ consists of three components $U_1(t)$, $U_2(t)$, $U_3(t)$, due respectively to the elastic, elastic-plastic, and dissipative strain energies. We calculate separately each component. First, the elastic strain energy from (3.3), (3.4) and (3.8) becomes, to within the linear approximation,
\[ U_1(t) = \frac{1}{2} A E\, \Omega^2(t) \int_{z(t)}^{\ell} f'^2(x)\, dx = \frac{1}{2} A\, \frac{\sigma_P^2}{E} \left[ f'^2(z(t)) \right]^{-1} \int_{z(t)}^{\ell} f'^2(x)\, dx. \tag{3.14} \]
We suppose that a point x undergoes a displacement w(x) at constant volume during the plastic deformation so that the total compressive strain at a point in the plastic region is
\[ \epsilon(x, t) = -\frac{\partial w}{\partial x}, \qquad 0 \le x \le z. \tag{3.15} \]
Because no change in volume accompanies the plastic deformation the longitudinal compressive plastic strain must be balanced by a transverse dilatational plastic strain $\epsilon(x, t)$, using the sign convention of (3.15). By definition, the cross-sectional area A(x, t) at any point x in the plastic region is then related to the uniform cross-section area A of the undeformed bar by
\[ A(x, t) = A\,(1 + \epsilon(x, t)), \qquad 0 \le x \le z(t). \]
Note that, since in contraction $\epsilon(x, t)$ is positive by definition (3.15), the plastically deformed area A(x, t) is greater than A. By the standard theory of elastic-perfectly plastic materials [5], the total longitudinal compressive plastic strain $\epsilon(x, t)$ contains a plastic part $e_P = -\sigma_P/E$ whose associated strain energy is given by
\[ U_2(t) = \frac{1}{2} A \int_0^{z(t)} \frac{(-\sigma_P)\, e_P}{[1 - \epsilon(x)]}\, dx \tag{3.16} \]
\[ \phantom{U_2(t)} = \frac{1}{2} A\, \frac{\sigma_P^2}{E} \int_0^{z(t)} \frac{dx}{[1 - \epsilon(x)]}, \tag{3.17} \]
where the nominal stress $\sigma_P/[1 - \epsilon(x)]$ per unit area of A is used. Evaluation of the integrand in (3.17) to first order leads to the expression
\[ U_2(t) = \frac{1}{2} A\, \frac{\sigma_P^2}{E} \int_0^{z(t)} [1 + \epsilon(x)]\, dx, \tag{3.18} \]
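which by (3.15) and the continuity of the displacement across the interface gives
\[ U_2(t) = \frac{1}{2} A\, \frac{\sigma_P^2}{E} \left[ z(t) - \Omega(t) f(z(t)) \right] \tag{3.19} \]
\[ \phantom{U_2(t)} = \frac{1}{2} A\, \frac{\sigma_P^2}{E} \left[ z(t) - \frac{\sigma_P}{E}\, \frac{f(z(t))}{f'(z(t))} \right]. \tag{3.20} \]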
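On again using the nominal stress, the expression for the dissipative strain energy is given by
\[ U_3(t) = A \int_0^{z(t)} \frac{(-\sigma_P)[\epsilon(x) - e_P]}{[1 - \epsilon(x)]}\, dx = -A \sigma_P \int_0^{z(t)} \frac{\epsilon(x)}{[1 - \epsilon(x)]}\, dx - 2 U_2(t) = -2\, \frac{E}{\sigma_P} \left( 1 + \frac{\sigma_P}{E} \right) U_2(t) + A \sigma_P z(t), \tag{3.21} \]
where (3.16) and (3.18) are used.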
Under the assumption that no energy is destroyed across the interface, conservation of energy at the instant T of first instantaneous maximum compression yields
\[ U(T) = K, \tag{3.22} \]
where
\[ K = \frac{1}{2} \rho A \ell \kappa_3 \left( \frac{v_1}{\kappa_1} \right)^2, \tag{3.23} \]
and $\kappa_1$ and $\kappa_3$ are defined in (2.8) and (2.16). Rearrangement of (3.22) after substitution from (3.14), (3.20) and (3.21) gives
\[ P(z(T)) = \frac{n \ell \kappa_3}{\kappa_1^2}, \tag{3.24} \]
where
\[ P(z(t)) = \left[ f'^2(z(t)) \right]^{-1} \int_{z(t)}^{\ell} f'^2(x)\, dx + (2 + q)\, \frac{f(z(t))}{f'(z(t))} - z(t), \tag{3.25} \]
and the nondimensional parameters q and n are defined by
\[ q = \frac{\sigma_P}{E}, \qquad n = \frac{\rho E v_1^2}{\sigma_P^2}. \tag{3.26} \]
In the next section, we solve equation (3.24) for specific functions f(x) and present some numerical results. The right side of (3.24) is not arbitrary. Let us recall that condition (2.25) for the onset of plastic deformation requires that
\[ \sigma_P \le \left( \frac{E \rho \kappa_3}{\kappa_2} \right)^{1/2} \frac{v_1}{\kappa_1} f'(0), \tag{3.27} \]
where $\kappa_2$ is defined by (2.14). In terms of the parameter n, the bound (3.27) becomes
\[ n \ge \frac{\kappa_1^2 \kappa_2}{\kappa_3 f'^2(0)}, \tag{3.28} \]
which implies that the right side of (3.24) must satisfy the lower bound
\[ \frac{n \kappa_3 \ell}{\kappa_1^2} \ge \frac{\kappa_2 \ell}{f'^2(0)}. \tag{3.29} \]
The parameter n enters prominently into the numerical calculations of Section 4 where it is compared to the nondimensional parameter $\rho v_1^2/\sigma_P = nq$ introduced by Taylor [10].
The solution z(T) to equation (3.24) subject to (3.28) may be used to determine the coefficient of restitution $e^*$, which according to Stronge [8] is defined to be
\[ e^* = 1 - \frac{U_3(T)}{K}, \tag{3.30} \]
in our notation. It is immediate from (3.22) that $0 \le e^* \le 1$, whereas on substituting from (3.20), (3.21) and (3.24) we have the explicit representation
\[ e^* = -\frac{q}{(2 + q)} + \frac{2 \kappa_1^2}{n \kappa_3 (2 + q)} \left[ \frac{z(T)}{\ell} + (1 + q)\, \frac{1}{\ell}\, \frac{\int_{z(T)}^{\ell} f'^2(x)\, dx}{f'^2(z(T))} \right]. \tag{3.31} \]
Determination of the time T of first maximum compression requires consideration
of the energy conservation for the whole bar at some intermediate time t ∈ [t2 , T ]
and the introduction of the corresponding kinetic energy. We distinguish between
hard materials, for which both the elastically and plastically deformed parts of
the bar experience motion, and soft materials for which, as assumed so far, the
plastically deformed part is at rest.
Consequently, for soft materials, motion at time t ∈ [t2 , T ] is confined to the
elastically deformed region where the displacement is given by (3.1). The kinetic
energy is
ℓ
1
˙ 2 (t)f 2 (x) dx
K1 (t) = ρA
2
z(t )
2 ℓ
′′
1
1
2
2 f (z(t))
= ρAℓq
f (x) dx ż2 (t), t ∈ [t2 , T ], (3.32)
2
f ′ 2 (z(t))
ℓ z(t )
after appeal to (3.11). The potential energy of the whole bar continues to be given
by U (t) and, accordingly, energy conservation yields
K1 (t) + U (t) = K,
(3.33)
which after substitution from (3.14), (3.20), (3.21), (3.32) becomes
′′
f (z(t)) 2 1 ℓ 2
f (x) dx ż2 (t) = nℓκ3 κ1−2 − P (z(t)).
f ′2 (z(t))
ℓ z(t )
(3.34)
We may integrate (3.34) to obtain
ρℓ
E
1/2
(t − t2 ) =
0
z(t )
′
[f ′′ (ζ )/f 2 (ζ )][(1/ℓ)
ℓ
ζ
f 2 (x) dx]1/2 dζ
[nℓκ3 κ1−2 − P (ζ )]1/2
,
(3.35)
which enables z(t) to be determined. Insertion into (3.8) provides an expression for
(t). To calculate T − t2 , we substitute from the solution to (3.24) into (3.35). But
t2 is given by (3.7) and consequently the time T − t0 is known. We omit details.
For a hard material, all parts of the bar continue in motion after plastic deformation has commenced. We adopt the simplifying assumption that the transverse
inertial effects are absent in the region of plastic deformation, and further assume
that the kinetic energy may adequately be represented by elastic kinetic energy
derived from the displacement (3.1). The total kinetic energy K2 (t) of the whole
bar is consequently given by
ℓ
1
˙ 2 (t)f 2 (x) dx
K2 (t) = ρA
2
0
′′
f (z(t)) 2 2
1
2
= ρAκ3 q ℓ ′ 2
ż (t),
(3.36)
2
f (z(t))
where (3.11) has been used. Conservation of energy becomes
K2 (t) + U (t) = K,
which may be treated similarly to (3.33) to yield
1/2
z(t )
′
E
[f ′′ (ζ )/f 2 (ζ )] dζ
1/2
.
(t − t2 ) = κ3
ρℓ
[nℓκ3 κ1−2 − P (ζ )]1/2
0
(3.37)
(3.38)
Again, the function (t) and the total time T − t0 may be determined as before.
The evaluation of the integrals in (3.35) and (3.38) must in general be undertaken numerically. Let us observe, however, that by definition (2.16) we have
$$\kappa_3 \ge \frac{1}{\ell} \int_z^{\ell} f^2(x)\,dx,$$
and accordingly the time for maximum compression in soft materials is less than
that for hard materials.
4. The Pöschl Approximation. Numerical Evaluation
We illustrate the theory by selecting an explicit family of functions $f(x)$ and, for
one particular choice (the Pöschl approximation (2.5)), present a numerical analysis
that allows comparison with the well-known corresponding results obtained by
Taylor [10] in the rigid-plastic theory, which for certain impact speeds predict those
observed experimentally by Whiffin [11]. The functions considered are given by
$$f(x) = 1 - \left(1 - \frac{x}{\ell}\right)^{\alpha}, \qquad 0 \le x \le \ell, \tag{4.1}$$
where $\alpha > 1$ is a constant. Clearly, the functions (4.1) satisfy conditions
(2.6) for each $\alpha$. In particular, when $\alpha = 2$, the function (4.1) becomes the Pöschl
approximation (2.5) and because of its physical relevance this function is selected
for numerical investigation.
To obtain the equation satisfied by $z(T)$, we first substitute (4.1) into definitions
(2.8), (2.14), and (2.16) to obtain
$$\kappa_1 = \frac{\alpha}{\alpha + 1}, \qquad \kappa_2 = \frac{\alpha^2}{\ell^2(2\alpha - 1)}, \qquad \kappa_3 = \frac{2\alpha^2}{(\alpha + 1)(2\alpha + 1)}, \tag{4.2}$$
and consequently the plastic yield stress by (2.25) must satisfy
$$\sigma_P \le (\rho E)^{1/2} v_1 \left[\frac{2(\alpha + 1)(2\alpha - 1)}{2\alpha + 1}\right]^{1/2}. \tag{4.3}$$
As observed in Section 3, the material parameters in (3.24) appear only in the
dimensionless combinations $q$ and $n$ defined by (3.26). Consequently, $n$ by (4.3)
possesses the lower bound:
$$n \ge \frac{2\alpha + 1}{2(\alpha + 1)(2\alpha - 1)}. \tag{4.4}$$
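The closed forms (4.2) and the bound (4.4) can be cross-checked numerically. The sketch below is only an illustration: it assumes that $\kappa_1$, $\kappa_2$, $\kappa_3$ are the normalised integrals $(1/\ell)\int_0^\ell f\,dx$, $(1/\ell)\int_0^\ell f'^2\,dx$ and $(1/\ell)\int_0^\ell f^2\,dx$ (definitions (2.8), (2.14), (2.16) are not reproduced in this section), and all function names are hypothetical.

```python
# Numerical cross-check of (4.2) and (4.4) for f(x) = 1 - (1 - x/ell)**alpha.
# ASSUMPTION: kappa_1, kappa_2, kappa_3 are the normalised integrals used in
# kappas_numeric below; the paper's definitions (2.8), (2.14), (2.16) are not
# shown in this section, so this is an inferred reading, not a quotation.


def f(x, alpha, ell):
    return 1.0 - (1.0 - x / ell) ** alpha


def df(x, alpha, ell):
    return (alpha / ell) * (1.0 - x / ell) ** (alpha - 1.0)


def kappas_numeric(alpha, ell=1.0, m=20000):
    """Midpoint-rule estimates of the assumed integral definitions."""
    h = ell / m
    s1 = s2 = s3 = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        s1 += f(x, alpha, ell)
        s2 += df(x, alpha, ell) ** 2
        s3 += f(x, alpha, ell) ** 2
    return s1 * h / ell, s2 * h / ell, s3 * h / ell


def kappas_closed_form(alpha, ell=1.0):
    """Closed forms quoted in (4.2)."""
    k1 = alpha / (alpha + 1.0)
    k2 = alpha ** 2 / (ell ** 2 * (2.0 * alpha - 1.0))
    k3 = 2.0 * alpha ** 2 / ((alpha + 1.0) * (2.0 * alpha + 1.0))
    return k1, k2, k3


def n_lower_bound(alpha):
    """Lower bound (4.4) on the parameter n."""
    return (2.0 * alpha + 1.0) / (2.0 * (alpha + 1.0) * (2.0 * alpha - 1.0))


if __name__ == "__main__":
    for alpha in (1.5, 2.0, 3.0):
        print(alpha, kappas_numeric(alpha), kappas_closed_form(alpha),
              n_lower_bound(alpha))
```

For $\alpha = 2$ the bound evaluates to $5/18$, which is the value rounded to 0.278 in (4.9).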
Observe that the functions (4.1) give
$$[f'(z(T))]^{-2} \int_{z(T)}^{\ell} f'^2(x)\,dx = \frac{\ell y(T)}{2\alpha - 1}, \tag{4.5}$$
where $\ell y(T) = \ell - z(T)$ is the elastically deformed length of the bar at first
maximum compression, and therefore (3.24) leads to:
$$y^{(1-\alpha)}(T)[2 + q] + y(T)\left[\frac{2(\alpha - 1)^2}{2\alpha - 1} - q\right] - \alpha\left[1 + \frac{2n(\alpha + 1)}{2\alpha + 1}\right] = 0. \tag{4.6}$$
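For general $\alpha$, equation (4.6) has no closed-form solution, but its left side decreases in $y$ on $(0, 1]$ and is non-positive at $y = 1$ exactly when the bound (4.4) holds, so a bisection suffices. The following is a minimal sketch under those observations; the function names and tolerances are illustrative choices, not part of the paper.

```python
# Bisection solver for equation (4.6) on 0 < y <= 1.
# Function names and tolerances are hypothetical, chosen for illustration.


def residual_46(y, alpha, n, q=0.0):
    """Left side of (4.6) as a function of y = y(T)."""
    return ((2.0 + q) * y ** (1.0 - alpha)
            + y * (2.0 * (alpha - 1.0) ** 2 / (2.0 * alpha - 1.0) - q)
            - alpha * (1.0 + 2.0 * n * (alpha + 1.0) / (2.0 * alpha + 1.0)))


def solve_y(alpha, n, q=0.0, tol=1e-12):
    """Return the root of (4.6) in (0, 1]; requires n to satisfy (4.4)."""
    lo, hi = 1e-12, 1.0
    if residual_46(hi, alpha, n, q) > 1e-12:
        raise ValueError("no root in (0, 1]: n violates the lower bound (4.4)")
    # The residual is positive near y = 0 and non-positive at y = 1.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual_46(mid, alpha, n, q) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (0.5, 1.0, 5.0):
        print(n, solve_y(2.0, n))
```

With $\alpha = 2$ and $q = 0$ the root coincides with the smaller root of the quadratic obtained for the Pöschl approximation.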
The coefficient of restitution $e^*$ from (3.31) becomes
$$e^* = -q/(2 + q) + \frac{(2\alpha + 1)}{n(\alpha + 1)(2 + q)} \left[1 - y(T)\,\frac{\{2(\alpha - 1) - q\}}{2\alpha - 1}\right]. \tag{4.7}$$
General estimates for the time taken to achieve first maximum compression do not
simplify sufficiently to warrant separate display.
For the Pöschl approximation, when $\alpha = 2$, the compressive yield stress from
(4.3) must satisfy the bound
$$\sigma_P \le 3v_1 \left(\frac{2\rho E}{5}\right)^{1/2}, \tag{4.8}$$
while the parameter $n$ from (4.4) must satisfy
$$n \ge 0.278. \tag{4.9}$$
We consider the following numerical values for the density, impact speed and
Young’s modulus, which are comparable to those considered by Taylor [10] and
Whiffin [11] for mild steel:
$$\rho = 7.873 \cdot 10^{-3}\ \mathrm{Kg.cm^{-3}}, \qquad v_1 = 4.875 \cdot 10^{4}\ \mathrm{cm.sec^{-1}}, \qquad E = 2.07 \cdot 10^{9}\ \mathrm{Kg.wt.cm^{-2}}. \tag{4.10}$$
From (4.8) we find that $\sigma_P \le 3.734 \cdot 10^{8}$ Kg.wt.cm$^{-2}$. On the other hand, Taylor
stipulates that his analysis requires the compressive yield stress to be comparable to
$\rho v_1^2$, which for the numerical values (4.10) equals $1.871 \cdot 10^{7}$. Nevertheless, Taylor
expresses his results in terms of the nondimensional parameter $N = nq = \rho v_1^2/\sigma_P$
(in our notation) and considers the explicit values $N = 0.5$, $1.633$, $3.2$, and $8.1$ for
the impact speed $v_1 = 2.4689 \cdot 10^{4}$ cm.sec$^{-1}$ (= 810 ft.sec$^{-1}$). These values correspond to compressive yield stresses of $9.598 \cdot 10^{6}$, $2.939 \cdot 10^{6}$, $1.5 \cdot 10^{6}$, and
$5.925 \cdot 10^{5}$ Kg.wt.cm$^{-2}$, respectively. The corresponding values
of $n$ are $1.078 \cdot 10^{2}$, $1.150 \cdot 10^{3}$, $4.417 \cdot 10^{3}$, and $2.83 \cdot 10^{4}$.
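The arithmetic in the preceding paragraph can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, the units are those quoted in (4.10), and the relation $n = N/q = NE/\sigma_P$ follows from (3.26) together with $N = nq$.

```python
import math

# Data (4.10), in the units quoted in the text.
RHO = 7.873e-3   # density, Kg.cm^-3
E_MOD = 2.07e9   # Young's modulus, Kg.wt.cm^-2
V1 = 4.875e4     # impact speed, cm.sec^-1


def yield_stress_bound(rho, e_mod, v1):
    """Upper bound (4.8) on sigma_P for the Poschl approximation (alpha = 2)."""
    return 3.0 * v1 * math.sqrt(2.0 * rho * e_mod / 5.0)


def taylor_case(big_n, rho, e_mod, v1):
    """Return (sigma_P, n) for a given Taylor parameter N = rho*v1**2/sigma_P."""
    sigma_p = rho * v1 ** 2 / big_n
    n = big_n * e_mod / sigma_p   # n = N/q with q = sigma_P/E
    return sigma_p, n


if __name__ == "__main__":
    print("bound (4.8):", yield_stress_bound(RHO, E_MOD, V1))
    print("rho*v1^2   :", RHO * V1 ** 2)
    for big_n in (0.5, 1.633, 3.2, 8.1):
        print(big_n, taylor_case(big_n, RHO, E_MOD, 2.4689e4))
```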
The proportional length $y(T) = 1 - z(T)/\ell$ of the bar that remains elastically
deformed at first maximum compression is obtained from (4.6) and satisfies the
quadratic equation:
$$y(T)^2\left[1 - \frac{3q}{2}\right] - 3y(T)\left[1 + \frac{6n}{5}\right] + 3\left[1 + \frac{q}{2}\right] = 0. \tag{4.11}$$
Table I.
n            N             σP (Kg.wt.cm⁻²)   y(T)      z(T)/ℓ    e*
0.5          6.72·10⁻²     2.783·10⁸         0.739     0.261     0.846
0.75         8.23·10⁻²     2.272·10⁸         0.587     0.413     0.676
1            9.51·10⁻²     1.968·10⁸         0.491     0.509     0.561
5            2.126·10⁻¹    8.801·10⁷         0.144     0.856     0.151
10           3.000·10⁻¹    6.223·10⁷         0.077     0.923     0.079
2.766·10     5.000·10⁻¹    3.742·10⁷         0.029     0.971     0.030
10²          9.507·10⁻¹    1.968·10⁷         0.008     0.992     0.008
2.950·10²    1.633         1.146·10⁷         0.003     0.997     0.003
8.051·10²    2.968         6.936·10⁶         0.001     0.999     0.001
10³          3.006         6.223·10⁶         0.0008    0.9992    0.000
But the ratio $q = \sigma_P/E$ for both soft and hard materials, and also for the values
considered here, is negligibly small compared to one, and (4.11) reduces to the
approximate equation:
$$y^2(T) - 3y(T)\left[1 + \frac{6n}{5}\right] + 3 = 0. \tag{4.12}$$
The coefficient of restitution from (4.7) is expressed by:
$$e^* = -\frac{q}{2 + q} + \frac{5[3 - (2 - q)y(T)]}{9n(2 + q)}, \tag{4.13}$$
which upon neglecting $q$ assumes the simpler form:
$$e^* = \frac{5[3 - 2y(T)]}{18n}. \tag{4.14}$$
We calculate y(T ) from (4.12) for different values of n, and then determine e∗
from (4.14). The results for the data (4.10) are presented in Table I which also lists
the respective values of z(T ), the compressive yield stress σP obtained from (3.26),
and the parameter N. It is seen that the present investigation provides realistic
values of y(T ) corresponding to values of the compressive yield stresses that for
n 1 are approximately 10 times greater than those given by Taylor and Whiffin.
The difference decreases to roughly 10% for n = 5 when the value of y(T ) is that
measured by Whiffin at the same impact speed. We do not attempt to determine
the total time to first maximum compression, which as already noted, requires the
numerical evaluation of the integrals appearing in (3.35) and (3.38).
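The procedure just described (solve (4.12) for $y(T)$, then evaluate (4.14)) can be sketched as follows. The function names are hypothetical. The smaller root of (4.12) is taken because the product of the two roots equals 3, so the larger root exceeds 1 and cannot be a proportional length; $\sigma_P$ is obtained by inverting the definition of $n$ in (3.26).

```python
import math

RHO, E_MOD, V1 = 7.873e-3, 2.07e9, 4.875e4   # data (4.10)


def y_elastic(n):
    """Smaller root of (4.12): y^2 - 3 y (1 + 6 n / 5) + 3 = 0."""
    b = 3.0 * (1.0 + 6.0 * n / 5.0)
    # Product of the roots is 3, so y_small = 3 / y_large; this form avoids
    # the cancellation in (b - sqrt(b*b - 12)) / 2 for large n.
    return 6.0 / (b + math.sqrt(b * b - 12.0))


def restitution(n):
    """Coefficient of restitution from (4.14), with q neglected."""
    return 5.0 * (3.0 - 2.0 * y_elastic(n)) / (18.0 * n)


def yield_stress(n, rho=RHO, e_mod=E_MOD, v1=V1):
    """sigma_P from the definition n = rho E v1^2 / sigma_P^2 in (3.26)."""
    return v1 * math.sqrt(rho * e_mod / n)


if __name__ == "__main__":
    for n in (0.5, 0.75, 1.0, 5.0, 10.0):
        y = y_elastic(n)
        print(n, yield_stress(n), round(y, 3), round(1.0 - y, 3),
              round(restitution(n), 3))
```

Evaluated at the first few values of $n$, this sketch returns the corresponding $\sigma_P$, $y(T)$, $z(T)/\ell$ and $e^*$ entries for comparison with Table I.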
Whiffin [11] treated mild steel specimens subject to longitudinal impact velocities $v_1 = 1.219 \cdot 10^{4}$ cm.sec$^{-1}$, $4.875 \cdot 10^{4}$ cm.sec$^{-1}$, and $7.62 \cdot 10^{4}$ cm.sec$^{-1}$,
and from measurements of the plastically deformed and undeformed lengths used
Taylor’s theory to show that the values of $N$ are 0.145, 2.698, and 6.290,
respectively. He then determined the values of the compressive yield stress to
be $8.061 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $6.936 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $7.268 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, which
are of the same order of magnitude as those produced from other tests for mild
steel. The most favorable comparison with those predicted by the present approach
exhibits a difference of about 10%. The order of magnitude of the yield stress,
however, derived here is the same as that for nickel–chrome steel whose density
and Young’s modulus are comparable with the values given by (4.10). Our analysis
appears better suited to hard materials.
Both Taylor and Whiffin indicate a considerable shortening of the bar due to
the plastic deformation. The method of this paper, however, supposes that such
shortening is negligible. A possible explanation for the discrepancies between the
theories is provided by our assumption of small plastic strains and the introduction
of elastic strains into the energy balance equation.
Acknowledgements
This work was partly supported by the Italian Group for Mathematical Physics. The
authors are grateful to the referees for constructive comments. One author (R.J.K.)
wishes to thank the Leverhulme Trust for the award of an Emeritus Fellowship.
References
1. L. Boltzmann, Einige Experimente über den Stoss von Zylindern. Sitzungsberichte, Akad. Wiss.
Wien Math. Naturwiss. Kl. 84 (1881) 1225.
2. H. Cox, On impacts on elastic beams. Trans. Cambridge Phil. Soc. 9 (1849) 73.
3. W. Goldsmith, Impact. E. Arnold, London (1960).
4. A.H.E. Love, A Treatise on the Mathematical Theory of Elasticity. Cambridge Univ. Press,
Cambridge (1927).
5. J.B. Martin, Plasticity: Fundamentals and General Results. MIT Press, Cambridge, MA
(1975).
6. T. Pöschl, Der Stoss. Handbuch der Physik, Vol. 6. Springer, Berlin (1928), Chapter 7.
7. B. Saint-Venant and A. Flamant, Détermination et répresentation graphique des lois du choc
longitudinal. C. R. Acad. Sci. Paris 47 (1883) 127, 214, 281, 314.
8. W.J. Stronge, Impact Mechanics. Cambridge Univ. Press, Cambridge (2000).
9. I. Szabó, Einführung in die Technische Mechanik. Springer, Berlin (1963).
10. G.I. Taylor, The use of flat-ended projectiles for determining dynamic yield stress. I. Theoretical consideration. Proc. Roy. Soc. London A 194 (1948) 289–299.
11. A.C. Whiffin, The use of flat-ended projectiles for determining dynamic yield stress. II. Tests
of various metallic materials. Proc. Roy. Soc. London A 194 (1948) 300–322.
On the Transformation Property of the Deformation
Gradient under a Change of Frame
I-SHIH LIU
Instituto de Matemática, Universidade Federal do Rio de Janeiro, Caixa Postal 68530,
CEP 21945-970, Rio de Janeiro, Brazil. E-mail: liu@im.ufrj.br
Received 23 April 2002; in revised form 16 January 2003
Abstract. If the deformation gradients are denoted by $F$ and $F^*$ respectively before and after a
change of frame, they are related by the transformation formula $F^* = QF$, where $Q$ is the orthogonal transformation associated with the change of frame. Although it has been pointed out that this
relation is valid “provided that the reference configuration be unaffected by the change of frame” (see
p. 308 of [1]), this formula is found in most textbooks of continuum mechanics, and is used, without
further justification, in deriving the condition of material frame-indifference, $\mathcal{H}(QF) = Q\mathcal{H}(F)Q^T$,
for the constitutive function $\mathcal{H}$ of the stress tensor of an elastic body. In this note, we shall analyze
the effect of change of frame on the transformation property of the deformation gradient, and show
that the above transformation formula is not valid in general. However, we shall confirm the validity
of the above well-known condition of material frame-indifference without the assumption that the
reference configuration be unaffected by the change of frame.
Mathematics Subject Classifications (2000): 74A05, 74A20.
Key words: reference configuration, Euclidean transformation, Galilean objectivity, principle of
material frame-indifference, simple materials.
In memory of Professor Clifford A. Truesdell
1. Frame of Reference and Deformation
The event world or space-time $\mathcal{W}$ of continuum mechanics [2] can be mapped onto
the product space of a three-dimensional Euclidean space $\mathcal{E}$ and the set of real
numbers $\mathbb{R}$ through a one-to-one mapping,
$$\phi: \mathcal{W} \to \mathcal{E} \times \mathbb{R}.$$
Such a mapping is called a frame of reference. Let us denote by $\mathcal{W}_t$ the totality
of simultaneous events at the instant $t$, and $\phi_t$ the restriction of $\phi$ to $\mathcal{W}_t$, so $\phi_t: \mathcal{W}_t \to \mathcal{E}$
associates the placement of an event with a location in the Euclidean space $\mathcal{E}$.
A body $\mathcal{B}$ is a set of material points, and we shall identify it, through a one-to-one mapping, with a region in $\mathcal{E}$ relative to a frame of reference. Such an
identification is called a configuration of the body. More specifically, we consider
a placement of the body $\mathcal{B}$ in $\mathcal{W}_t$, say, $\tilde{\kappa}: \mathcal{B} \to \mathcal{W}_t$; then $\kappa = \phi_t \circ \tilde{\kappa}: \mathcal{B} \to \mathcal{E}$ is a
configuration of $\mathcal{B}$ relative to the frame of reference $\phi$ at the instant $t$.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 555–562.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
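Given a particular configuration relative to a frame of reference,
$$\kappa: \mathcal{B} \to \mathcal{E}, \qquad \kappa(X) = \mathbf{X}, \tag{1.1}$$
called a reference configuration of $\mathcal{B}$, a material point $X$ in the body $\mathcal{B}$ can be
identified with its position $\mathbf{X}$ in the region $\mathcal{B}_\kappa$ occupied by the body in the reference
configuration $\kappa$.
A motion $\chi$ of $\mathcal{B}$ can be expressed as a map,
$$\chi: \mathcal{B} \times \mathbb{R} \to \mathcal{E}, \qquad \mathbf{x} = \chi(X, t), \tag{1.2}$$
where $\chi(\cdot, t): \mathcal{B} \to \mathcal{E}$ is the configuration of the body $\mathcal{B}$ at time $t$. The region
occupied by the body at time $t$ will be denoted by $\mathcal{B}_t$.
Given a reference configuration $\kappa$, the motion $\chi$ can also be expressed as
$$\chi_\kappa: \mathcal{B}_\kappa \times \mathbb{R} \to \mathcal{E}, \qquad \mathbf{x} = \chi_\kappa(\mathbf{X}, t) = \chi(\kappa^{-1}(\mathbf{X}), t). \tag{1.3}$$
We call $\chi(X, t)$ the material description of the motion and $\chi_\kappa(\mathbf{X}, t)$ a referential
description. The map
$$\chi_\kappa(\cdot, t) = \chi(\cdot, t) \circ \kappa^{-1}: \mathcal{B}_\kappa \to \mathcal{B}_t$$
is called the deformation from $\mathcal{B}_\kappa$ to $\mathcal{B}_t$. The deformation gradient relative to $\kappa$,
denoted by $F$, is defined as the gradient of $\chi_\kappa(\mathbf{X}, t)$ relative to $\mathbf{X}$, i.e.,
$$F = \nabla_{\mathbf{X}} \chi_\kappa.$$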
For a given motion, the reference configuration κ is often chosen as the configuration at some instant t = t0 , say, the initial position of the body in the motion,
κ = χ(·, t0 ). However, the reference configuration need not be occupied by the
body in the actual motion at any instant, in principle. It can be any convenient
placement of the body at some instant of time in the frame of reference.
2. Transformation Property of Deformation Gradient
Let $\phi$ and $\phi^*$ be two frames of reference. We call
$$* = \phi^* \circ \phi^{-1}: \mathcal{E} \times \mathbb{R} \to \mathcal{E} \times \mathbb{R}$$
a change of frame from $\phi$ to $\phi^*$. In general, the change of frame $*$, which maps
$(\mathbf{x}, t)$ to $(\mathbf{x}^*, t^*)$, is a Euclidean transformation of the following form,
$$\mathbf{x}^* = Q(t)(\mathbf{x} - \mathbf{x}_0) + \mathbf{c}(t), \qquad t^* = t + a, \tag{2.1}$$
for some $a \in \mathbb{R}$, $\mathbf{x}_0 \in \mathcal{E}$, $\mathbf{c}(t) \in \mathcal{E}$, and $Q(t) \in \mathcal{O}$, where $\mathcal{O}$ is the group of orthogonal transformations on the translation space of $\mathcal{E}$. In particular, $\phi^*_t \circ \phi_t^{-1}: \mathcal{E} \to \mathcal{E}$
is given by
$$\mathbf{x}^* = \phi^*_t(\phi_t^{-1}(\mathbf{x})) = Q(t)(\mathbf{x} - \mathbf{x}_0) + \mathbf{c}(t). \tag{2.2}$$
Figure 1. Reference configurations $\kappa$ and $\kappa^*$ in the change of frame from $\phi$ to $\phi^*$.
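Let $\tilde{\kappa}: \mathcal{B} \to \mathcal{W}_{t_0}$ be a reference placement of the body at some instant $t_0$, then
(see Figure 1)
$$\kappa = \phi_{t_0} \circ \tilde{\kappa} \qquad \text{and} \qquad \kappa^* = \phi^*_{t_0} \circ \tilde{\kappa} \tag{2.3}$$
are the two corresponding reference configurations of $\mathcal{B}$ in the frames $\phi$ and $\phi^*$ at
the same instant, and
$$\mathbf{X} = \kappa(X), \qquad \mathbf{X}^* = \kappa^*(X), \qquad X \in \mathcal{B}.$$
Let us denote by $\gamma = \kappa^* \circ \kappa^{-1}$ the change of reference configuration from $\kappa$ to $\kappa^*$
in the change of frame, then it follows from (2.3) that $\gamma = \phi^*_{t_0} \circ \phi_{t_0}^{-1}$ and by (2.2),
we have
$$\mathbf{X}^* = \gamma(\mathbf{X}) = K(\mathbf{X} - \mathbf{x}_0) + \mathbf{c}(t_0), \tag{2.4}$$
where $K = Q(t_0)$ is a constant orthogonal tensor.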
On the other hand, the motion in referential description relative to the change
of frame is given by
$$\mathbf{x} = \chi_\kappa(\mathbf{X}, t), \qquad \mathbf{x}^* = \chi^*_{\kappa^*}(\mathbf{X}^*, t^*),$$
and from (2.2) we have
$$\chi^*_{\kappa^*}(\mathbf{X}^*, t^*) = Q(t)(\chi_\kappa(\mathbf{X}, t) - \mathbf{x}_0) + \mathbf{c}(t).$$
Therefore we obtain for the deformation gradient in the frame $\phi^*$, i.e., $F^* = \nabla_{\mathbf{X}^*} \chi^*_{\kappa^*}$, by the chain rule,
$$F^*(\mathbf{X}^*, t^*) = Q(t) F(\mathbf{X}, t) K^T,$$
or simply,
$$F^* = Q F K^T, \tag{2.5}$$
where $K^T$ denotes the transpose of $K = Q(t_0)$, the constant orthogonal tensor
in (2.4) that arises from the change of frame for the reference configuration.
The transformation property (2.5) stands in contrast to the well-known formula
F ∗ = QF , which is valid provided that the reference configuration is unaffected
by the change of frame, so that K reduces to the identity tensor.
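The chain-rule result (2.5) can be checked numerically. The sketch below is illustrative only: it assumes a homogeneous motion $\chi_\kappa(\mathbf{X}) = F\mathbf{X}$, a frame rotating about the third axis, and arbitrary sample values for $F$, $\mathbf{x}_0$, $\mathbf{c}(t)$, $t_0$ and $t$; none of these choices comes from the paper. It differentiates $\chi^*_{\kappa^*}$ with respect to $\mathbf{X}^*$ by central differences and compares the result with $QFK^T$ and with the classical formula $QF$.

```python
import math

# Finite-difference check of F* = Q F K^T (2.5). All sample values and
# helper names are hypothetical, chosen only for illustration.


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def add(u, v):
    return [ui + vi for ui, vi in zip(u, v)]


def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]


F = [[1.2, 0.3, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.1]]  # sample gradient
X0 = [0.5, -1.0, 2.0]                                    # x_0 in (2.1)
OMEGA, T0, T = 0.7, 0.4, 1.9                             # spin, t_0, t


def Q(t):
    return rot_z(OMEGA * t)


def c(t):
    return [0.1 * t, 0.2, -0.3 * t]


K = Q(T0)  # K = Q(t_0), as in (2.4)


def chi_kappa(X):
    """Homogeneous motion in the frame phi: x = F X."""
    return matvec(F, X)


def gamma_inverse(X_star):
    """Invert (2.4): X = K^T (X* - c(t_0)) + x_0."""
    return add(matvec(transpose(K), sub(X_star, c(T0))), X0)


def chi_star(X_star):
    """Motion seen in the frame phi*: (2.2) applied to chi_kappa."""
    return add(matvec(Q(T), sub(chi_kappa(gamma_inverse(X_star)), X0)), c(T))


def numerical_gradient(func, point, h=1e-6):
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(plus), func(minus)
        for i in range(3):
            grad[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return grad


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


F_STAR_FD = numerical_gradient(chi_star, [0.3, 0.2, -0.1])
F_STAR_FORMULA = matmul(matmul(Q(T), F), transpose(K))   # Q F K^T
F_STAR_CLASSICAL = matmul(Q(T), F)                       # Q F

if __name__ == "__main__":
    print("error against Q F K^T:", max_abs_diff(F_STAR_FD, F_STAR_FORMULA))
    print("error against Q F    :", max_abs_diff(F_STAR_FD, F_STAR_CLASSICAL))
```

In this example the first error is at rounding level while the second is of order one, because $K = Q(t_0)$ is not the identity for the chosen $t_0$.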
From (2.1), since the orthogonal transformation $Q(t)$ in a Euclidean transformation is time-dependent, it is conceivable that in a change of frame, one may
choose a reference configuration at some instant $t_0$ such that $Q(t_0) = 1$. Therefore, the assumption that “the reference configuration be unaffected by the change
of frame” (see [1, p. 308]) can be justified. However, for an arbitrary Euclidean
transformation this is not always possible; for example, when $Q$ is
time-independent, $Q(t_0) = Q$ equals the identity only if $Q$ itself does.
Transformation properties of some other kinematic quantities related to the
deformation gradient are discussed in [3].
GALILEAN OBJECTIVITY OF DEFORMATION GRADIENT
A second order tensor quantity $S$ is called objective if, in a change of frame $*$,
$$S^* = Q S Q^T.$$
From (2.5), it follows that the deformation gradient $F$ is not an objective tensor
quantity under Euclidean transformations because, in general,
$$K = Q(t_0) \ne Q(t).$$
However, (2.5) also asserts that $F$ is objective under frame transformations with
time-independent orthogonal tensor $Q$, since, in this case,
$$K = Q \qquad \text{and} \qquad F^* = Q F Q^T.$$
In particular, we can say that the deformation gradient is an objective tensor quantity with respect to Galilean transformations, which form a subclass of Euclidean transformations (2.1), with
$$Q(t) = Q, \qquad \mathbf{c}(t) = \mathbf{v}_0 t + \mathbf{c}_0. \tag{2.6}$$
This conclusion, therefore, modifies the classical result of the strict non-objectivity
of the deformation gradient, from the transformation formula F ∗ = QF , based on
the convenient, but oversimplified, assumption that the reference configuration be
unaffected by the change of frame.
3. Principle of Material Frame-Indifference
The most important aspect of changes of frame lies in the formulation of the principle of material frame-indifference for constitutive functions. In what follows, we
shall confirm the usual condition of material frame-indifference, without the usual
assumption that the reference configuration be unaffected by the change of frame.
For simplicity, we shall present it in the purely mechanical theory.
3.1. IN MATERIAL DESCRIPTION
Let $\phi$ be a frame of reference and $\chi$ be a motion. Let $T(X, t)$ be the value of the
stress tensor at the material point $X$ and time $t$ in the frame $\phi$. We can write the
constitutive relation in the following form,
$$T(X, t) = \underset{Y \in \mathcal{B},\; 0 \le s < \infty}{\mathcal{F}_\phi}\bigl(\chi(Y, t - s), X\bigr), \qquad X \in \mathcal{B}, \tag{3.1}$$
where we have indicated the domain of the argument function $\chi$ beneath the functional symbol $\mathcal{F}$. We emphasize that the constitutive function depends on the
choice of frame in general, so that we have also indicated the frame $\phi$ on $\mathcal{F}$ as
a subscript.
We remark that the stress tensor is an objective tensor quantity, i.e., relative to
a change of frame $*$ given by (2.1), it has the following transformation property:
$$T^*(X, t^*) = Q(t)\, T(X, t)\, Q(t)^T, \qquad X \in \mathcal{B}. \tag{3.2}$$
Since any intrinsic property of materials should be independent of frame of
reference, it is required that for any objective quantity, the constitutive function
must be invariant with respect to any change of frame. Mathematically, it can be
stated in the following
Principle of material frame-indifference. The constitutive function of an objective quantity must be independent of the frame, i.e.,
$$\mathcal{F}_\phi(\cdot) = \mathcal{F}_{\phi^*}(\cdot),$$
for any frames of reference $\phi$ and $\phi^*$.
More specifically, from (3.2), the principle implies the following condition of
material frame-indifference,
$$\underset{Y \in \mathcal{B},\; 0 \le s < \infty}{\mathcal{F}}\bigl(\chi^*(Y, t^* - s), X\bigr) = Q(t) \underset{Y \in \mathcal{B},\; 0 \le s < \infty}{\mathcal{F}}\bigl(\chi(Y, t - s), X\bigr)\, Q(t)^T, \tag{3.3}$$
or simply
$$\mathcal{F}(\chi^*) = Q \mathcal{F}(\chi) Q^T,$$
for any change of frame $*$ given by (2.1). In this condition, we have written $\mathcal{F}$ for
both $\mathcal{F}_\phi$ and $\mathcal{F}_{\phi^*}$, since, by the principle of material frame-indifference, they are
the same function, and therefore, (3.3) is a restriction imposed on the constitutive
function $\mathcal{F}$.
3.2. IN REFERENTIAL DESCRIPTION
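Let $\kappa$ be a reference configuration of the body $\mathcal{B}$ in the frame $\phi$,
$$\mathbf{x} = \chi(X, t) = \chi_\kappa(\mathbf{X}, t), \qquad \mathbf{X} = \kappa(X), \qquad X \in \mathcal{B}. \tag{3.4}$$
In terms of referential description, we can rewrite the constitutive relation (3.1)
relative to $\kappa$ as
$$T(\mathbf{X}, t) = \underset{\mathbf{Y} \in \mathcal{B}_\kappa,\; 0 \le s < \infty}{\mathcal{F}_\kappa}\bigl(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\bigr), \qquad \mathbf{X} \in \mathcal{B}_\kappa. \tag{3.5}$$
From (3.4), $\mathcal{F}_\kappa$ is related to $\mathcal{F}$ by
$$\mathcal{F}_\kappa\bigl(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\bigr) = \mathcal{F}\bigl(\chi_\kappa(\mathbf{Y}, t - s), \kappa^{-1}(\mathbf{X})\bigr). \tag{3.6}$$
Note that the constitutive function $\mathcal{F}_\kappa$ depends on the reference configuration $\kappa$ in
the frame $\phi$.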
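To express the condition of material frame-indifference in referential description, let $\phi^*$ be another frame, and denote the corresponding reference configuration
in this frame by $\kappa^*$ (see (2.3)). We have
$$\mathbf{x}^* = \chi^*(X, t^*) = \chi^*_{\kappa^*}(\mathbf{X}^*, t^*), \qquad \mathbf{X}^* = \kappa^*(X), \qquad X \in \mathcal{B}, \tag{3.7}$$
and, similar to (3.6),
$$\mathcal{F}_{\kappa^*}\bigl(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\bigr) = \mathcal{F}\bigl(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \kappa^{*-1}(\mathbf{X}^*)\bigr). \tag{3.8}$$
The condition (3.3) then takes the form,
$$\mathcal{F}_{\kappa^*}\bigl(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\bigr) = Q(t)\, \mathcal{F}_\kappa\bigl(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\bigr)\, Q(t)^T. \tag{3.9}$$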
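In this equation, the constitutive functions on the two sides are expressed in terms
of the reference configuration in two different frames. However, from (3.8) and
(3.6), we have
$$\begin{aligned}
\mathcal{F}_{\kappa^*}\bigl(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\bigr)
&= \mathcal{F}\bigl(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \kappa^{*-1}(\mathbf{X}^*)\bigr) \\
&= \mathcal{F}\bigl(\chi^*_{\kappa^*}(\kappa^* \circ \kappa^{-1}(\mathbf{Y}), t^* - s), \kappa^{*-1}(\kappa^* \circ \kappa^{-1}(\mathbf{X}))\bigr) \\
&= \mathcal{F}\bigl(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \kappa^{-1}(\mathbf{X})\bigr) \\
&= \mathcal{F}_\kappa\bigl(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \mathbf{X}\bigr),
\end{aligned}$$
where $\gamma = \kappa^* \circ \kappa^{-1}$ stands for the change of reference configuration due to
the change of frame given by (2.4). Therefore, the condition of material frame-indifference relative to a reference configuration $\kappa$ becomes,
$$\underset{\mathbf{Y} \in \mathcal{B}_\kappa,\; 0 \le s < \infty}{\mathcal{F}_\kappa}\bigl(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \mathbf{X}\bigr) = Q(t) \underset{\mathbf{Y} \in \mathcal{B}_\kappa,\; 0 \le s < \infty}{\mathcal{F}_\kappa}\bigl(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\bigr)\, Q(t)^T, \tag{3.10}$$
or simply as
$$\mathcal{F}_\kappa(\chi^*_{\kappa^*} \circ \gamma) = Q \mathcal{F}_\kappa(\chi_\kappa) Q^T. \tag{3.11}$$
We emphasize that, in this condition, only the constitutive function relative to
the reference configuration $\kappa$ in the frame $\phi$ is involved and therefore, (3.11) is
a restriction on the constitutive function $\mathcal{F}_\kappa$.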
Note that we have
$$\mathbf{x} = \chi_\kappa(\mathbf{X}, t), \qquad \mathbf{x}^* = \chi^*_{\kappa^*}(\gamma(\mathbf{X}), t^*), \tag{3.12}$$
for $\mathbf{X} \in \mathcal{B}_\kappa$ and by (2.2) they are related by
$$\chi^*_{\kappa^*}(\gamma(\mathbf{X}), t^*) = Q(t)(\chi_\kappa(\mathbf{X}, t) - \mathbf{x}_0) + \mathbf{c}(t). \tag{3.13}$$
From (3.10) and (3.13), we conclude that although the reference configuration is
frame-dependent, it does not affect the condition of material frame-indifference,
(3.10) together with (3.13), as long as the condition is expressed in terms of the
constitutive function relative to the reference configuration in a frame only.
4. Simple Material Bodies
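For simple material bodies (see [1, Section 28]), the constitutive dependence on
the motion is restricted to a local dependence on the deformation gradient only. The
constitutive relation (3.5) can then be written as
$$T(\mathbf{X}, t) = \underset{0 \le s < \infty}{\mathcal{H}}\bigl(F(\mathbf{X}, t - s), \mathbf{X}\bigr), \tag{4.1}$$
where $F = \nabla_{\mathbf{X}} \chi_\kappa$, and the condition (3.11) becomes
$$\mathcal{H}\bigl(\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma)\bigr) = Q \mathcal{H}(\nabla_{\mathbf{X}} \chi_\kappa) Q^T. \tag{4.2}$$
By the chain rule, we obtain the gradient,
$$\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma) = (\nabla_{\mathbf{X}^*} \chi^*_{\kappa^*})(\nabla_{\mathbf{X}} \gamma) = F^* K,$$
where $F^* = \nabla_{\mathbf{X}^*} \chi^*_{\kappa^*}$ and $K = \nabla_{\mathbf{X}} \gamma$ from (2.4). Hence, we have
$$\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma) = (Q F K^T) K = Q F, \tag{4.3}$$
by the use of the transformation formula $F^* = Q F K^T$ from (2.5). Note that the
above relation can also be obtained directly from (3.13).
Therefore, from (4.2) and (4.3), the condition of material frame-indifference
(3.10), for simple material bodies, becomes
$$\underset{0 \le s < \infty}{\mathcal{H}}\bigl(Q(t - s) F(\mathbf{X}, t - s), \mathbf{X}\bigr) = Q(t) \underset{0 \le s < \infty}{\mathcal{H}}\bigl(F(\mathbf{X}, t - s), \mathbf{X}\bigr)\, Q(t)^T,$$
or simply
$$\mathcal{H}(QF) = Q \mathcal{H}(F) Q^T \qquad \forall\, Q(t) \in \mathcal{O}. \tag{4.4}$$
In other words, this well-known condition remains valid without the assumption
that the reference configuration be unaffected by the change of frame.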
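Condition (4.4) can be tested numerically for a candidate response function. The sketch below is an illustration restricted to the elastic case, in which the stress depends only on the current $F$. The two sample functions, $\mu(FF^T - 1)$ and $\mu(F^TF - 1)$, the modulus `MU` and all helper names are assumptions introduced here and are not taken from the paper. The first function satisfies (4.4) identically; the second generally does not.

```python
import math

# Numerical test of H(QF) = Q H(F) Q^T (4.4) for two sample elastic
# response functions. All names and sample values are hypothetical.

MU = 1.0  # sample modulus


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def minus_identity_scaled(a, mu):
    return [[mu * (a[i][j] - (1.0 if i == j else 0.0)) for j in range(3)]
            for i in range(3)]


def h_left(F):
    """mu (F F^T - 1): built on the left Cauchy-Green tensor."""
    return minus_identity_scaled(matmul(F, transpose(F)), MU)


def h_right(F):
    """mu (F^T F - 1): built on the right Cauchy-Green tensor."""
    return minus_identity_scaled(matmul(transpose(F), F), MU)


def indifference_defect(H, F, Q):
    """Largest entry of H(QF) - Q H(F) Q^T; zero when (4.4) holds."""
    lhs = H(matmul(Q, F))
    rhs = matmul(matmul(Q, H(F)), transpose(Q))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))


F_SAMPLE = [[1.2, 0.3, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.1]]
Q_SAMPLE = rot_z(0.6)

if __name__ == "__main__":
    print("mu(F F^T - 1):", indifference_defect(h_left, F_SAMPLE, Q_SAMPLE))
    print("mu(F^T F - 1):", indifference_defect(h_right, F_SAMPLE, Q_SAMPLE))
```

The second function is unchanged when $F$ is replaced by $QF$, so it cannot transform as $Q\mathcal{H}(F)Q^T$ unless its value happens to commute with $Q$.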
FINAL REMARKS
In [4], the transformation property (2.5), $F^* = Q F K^T$, was derived (see equation (2.2.93)); however, it appeared as an isolated remark, and its consequences
were not considered further in the book.
The observer-dependent reference configuration was also considered in [5], in
which the property (2.5) and its restrictions on the response functions for elastic
solids were obtained in a manner different from the present paper.
Acknowledgement
The author acknowledges the partial support of CNPq-Brasil, through the Research
Fellowship, Proc. 300135/83-1.
References
1. C. Truesdell and W. Noll, The Non-Linear Field Theories of Mechanics, S. Flügge (ed.),
Handbuch der Physik, Vol. III/3. Springer, Berlin/Heidelberg (1965).
2. C. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1, 2nd edn. Academic Press,
Boston (1991).
3. I-Shih Liu, Continuum Mechanics. Springer, Berlin/Heidelberg (2002).
4. R.W. Ogden, Non-Linear Elastic Deformations. Dover, Mineola, New York (1997).
5. A.I. Murdoch, On objectivity and material symmetry for simple elastic solids. J. Elasticity 60
(2000) 233–242.
Some New Advances in the Theory
of Dynamic Materials
KONSTANTIN A. LURIE
Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road,
Worcester, MA 01609, U.S.A. E-mail: klurie@wpi.edu
Received 25 September 2002; in revised form 29 August 2003
Abstract. Some recent advances in the theory of dynamic materials are listed in the paper. We
discuss the technique used to determine the set of invariant characteristics of material mixtures in
one spatial dimension and time, in the context of electrodynamics of moving dielectrics, versus the
relevant results in traditional electrostatics. Some special features of dynamic materials demonstrated
through a material design are highlighted as well. Among them, we mention the possibility of eliminating
the cut-off frequency in the waveguides with activated dielectric filling.
Mathematics Subject Classifications (2000): 78A48, 78M40, 78M30.
Key words: dynamic materials, cut-off frequency elimination.
To the Memory of Professor Clifford Truesdell, Teacher and Friend
Introduction
This paper is focused on special material formations termed dynamic materials
(DM). DMs are defined [1–3] as composites assembled from conventional materials distributed on a microscale in space and time. When a low-frequency dynamic
disturbance propagates through such an assemblage, it perceives the assemblage as a uniform medium with some “effective” properties detected mathematically through
homogenization. A discussion of such properties, along with some special effects
they produce in material design, is the central objective of this work.
DMs are encountered far more often in real life than one may expect at first
glance. A TV screen on which a movie is projected represents a DM: a plane
with reflection properties that vary rapidly in space and time. The human visual
system performs a spatio-temporal averaging of a rapidly alternating
pattern of picture waves, i.e., modulated scanning lines, and thereby implements
homogenization to reveal “a slow motion” carrying the information stored in a movie.
A similar example of a DM is given by a transmission line with variable linear
inductance and capacitance. A discrete model represents the line as an array of
LC-cells connected in series (Figure 1). Assume that each cell offers two possibilities: $(L_1, C_1)$, “material 1”, and $(L_2, C_2)$, “material 2”, turned on/off by a
toggle switch. If the cells are densely distributed along the line, then, by a controlled switching, the linear inductance $L$ and capacitance $C$ may become almost
arbitrary functions of the spatial coordinate $z$ and time $t$. In particular, they may
produce a periodic LC-laminate in a $(z, t)$-plane assembled from materials 1 and
2 (Figure 2). To this end, we create such a pattern at time $t = 0$, and bring it,
as a whole, to a uniform motion with velocity $V$ along the $z$-axis. This velocity
should either be less than the least phase velocity of waves in both materials, or
exceed both such velocities; we take these precautions in a DM to avoid the
formation of shocks. It is essential that the motion is confined to the pattern alone:
materials 1 and 2 themselves remain immovable relative to a laboratory observer.
From this remark it becomes clear that some restrictions should be imposed on the
microgeometry of a DM to avoid strong discontinuities in dynamic disturbances.
We will term the relevant microstructures admissible; for such microstructures,
conventional compatibility conditions of kinematic and dynamic type hold across
the interfaces separating one material in the assemblage from another.
Figure 1. A discrete version of a transmission line.
Figure 2. A moving (LC)-property pattern – an activated composite.
Figure 3. A pipeline construction.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 563–573.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
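The restriction on the pattern velocity $V$ stated above can be expressed as a small check. The sketch assumes the standard lossless-line phase velocity $1/\sqrt{LC}$ for per-unit-length inductance $L$ and capacitance $C$, and it compares $|V|$ with the phase velocities so that motion in either direction is covered; that choice, the function names and the sample values are all illustrative assumptions rather than part of the paper.

```python
import math

# Admissibility check for the pattern velocity V of an LC-laminate.
# ASSUMPTIONS: phase velocity 1/sqrt(L*C) of a lossless line with
# per-unit-length L and C; |V| is used so that both directions of
# motion are treated alike. Names and sample values are hypothetical.


def phase_velocity(inductance, capacitance):
    return 1.0 / math.sqrt(inductance * capacitance)


def pattern_velocity_admissible(v, materials):
    """True if |v| is below the least or above the greatest phase velocity."""
    speeds = [phase_velocity(l, c) for l, c in materials]
    return abs(v) < min(speeds) or abs(v) > max(speeds)


if __name__ == "__main__":
    # Two sample "materials" (L_i, C_i) in H/m and F/m.
    materials = [(1.0e-6, 1.0e-10), (4.0e-6, 1.0e-10)]
    for v in (1.0e7, 7.0e7, 2.0e8):
        print(v, pattern_velocity_admissible(v, materials))
```

For the sample values the two phase velocities are $10^8$ and $5 \cdot 10^7$ m/s, so a pattern velocity lying between them is rejected.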
After we apply homogenization to an admissible laminar construction, we reveal a uniform material with the effective properties depending on all of the parameters involved, such as the volume fractions of participating constituents, and the
velocity V of a property pattern.
This example represents what we term an activated spatio-temporal composite
– one out of two major categories of DMs. Another type of DMs, called kinetic,
involves the relative motion of the original constituents in a microstructure. An
example is given by an air column in a form of a pipeline assembled from identical
sections separated by toroidal chambers [4]. By manipulating compressions in the
chambers, one may produce an individual velocity pattern within each section, and
such patterns may vary, from section to section, both in magnitude and direction
(Figure 3). The waves that are long compared to the length of a section, will propagate along the pipeline as if it were a uniform medium with some effective density
and compressibility.
The mathematical theory of DMs reveals some resemblance with a conventional
theory of composites built in space alone. This resemblance is, however, limited
because there are features unique for dynamic formations that have no analogs in
ordinary composites. One thing is universal: to maintain a spatio-temporal variability of properties in a material assemblage with a dynamic process developed
in it, one should generally arrange a flow of energy and momentum between the
material and its environment. In other words, a DM is a thermodynamically open
system.
A clear idea of both common and special features of DMs versus the ordinary
composites may be obtained from the comparison between electrodynamics of
moving dielectrics and traditional electrostatics. In both examples we start with
two base tensor entities: the electric displacement D and the electric field E in
electrostatics, and the skew-symmetric electromagnetic tensors f and F in electrodynamics. These entities are linked through the constitutive relations involving
material tensors: a tensor e of dielectric constants and a tensor s of dielectric and
Figure 4. Spatial and spatio-temporal polycrystals.
magnetic constants, respectively. The base vectors (tensors) satisfy the relevant
fundamental equations given by Maxwell’s theory. The main difference is that
electrostatics is about purely spatial phenomena associated with the Euclidean
group of rotations, while electrodynamics is about spatio-temporal phenomena
associated with the Lorentz group that contains, along with Euclidean rotations,
also pure Lorentz transforms as its elements. Particularly, any dielectric which is
isotropic in a conventional sense (i.e., with regard to Euclidean rotations) is at
the same time anisotropic in space-time (with regard to Lorentz transforms), the
only exception to this rule being vacuum. This difference is substantial: due
to it, in electrostatics we have a variational principle of minimum stored energy,
while in electrodynamics we only have a principle of stationarity of the action
density. Accordingly, Euler’s equations are elliptic in electrostatics and hyperbolic
in electrodynamics.
In spite of these differences, homogenization may detect the effective properties
of composites in both scenarios. To illustrate, consider the example of a polycrystalline formation. The notion is common in electrostatics: to produce a polycrystal,
we must have an originally anisotropic paternal material (a monocrystal), and intermix, in space, its fragments turned by different angles relative to a laboratory
frame (Figure 4). In other words, a traditional Euclidean rotation is responsible for
the difference in the material properties of individual grains. When the polycrystal
is two-dimensional (i.e., it lies in an (x, y)-plane), then with homogenization the
determinant of its material tensor e is preserved through the mixing [5]:
\[
\lambda_1 \lambda_2 = \det e_{\mathrm{eff}} = \det e = \epsilon_1 \epsilon_2 .
\]
Here, ǫ1 and ǫ2 represent eigenvalues of the paternal material, whereas λ1 and λ2
denote eigenvalues of the effective tensor eeff .
SOME NEW ADVANCES IN THE THEORY OF DYNAMIC MATERIALS
Figure 5. “Caterpillar” construction.
A similar result holds for the electrodynamics of moving dielectrics. To be
specific, consider a kinetic laminate – a periodic array which consists of copies
of one and the same isotropic dielectric, with eigenvalues ǫc and 1/µc for its
material tensor s; these copies will be distributed along the z-axis, and each copy
brought into material motion along the z-axis with individual velocity V . A discontinuous velocity pattern may be implemented through the use of the following
feasible construction. Assume that we have a linear arrangement of caterpillars
placed one after another along the z-axis (Figure 5). The tracks that are moved
by caterpillars become electrically connected when they belong to the z-axis, and
stay disconnected otherwise. The z-axis will then become occupied by material
fragments moving each at its own axial velocity, and the electric current will flow
along the z-axis through the assemblage of electrically connected tracks. With this
construction, the electromagnetic field will be controlled directly by an appropriate specification of the velocity pattern. Because every conventional dielectric is
anisotropic in space-time (ǫc ≠ 1/µc), and because a material motion represents
rotation in space-time by an imaginary angle iφ, where tanh φ = V /c, we arrive
at what may be termed a spatio-temporal polycrystal (Figure 4). This formation
represents a DM – an isotropic dielectric – with the effective properties E, M found
through homogenization. We obtain [2]
\[
\frac{E}{M} = \det s_{\mathrm{eff}} = \det s = \frac{\epsilon}{\mu},
\]
in complete analogy with a similar electrostatic situation. To translate this result into the language of transmission lines, we may say that the effective wave
impedance of the line assembled from the moving parts with the same wave impedance is preserved through a spatio-temporal mixing.
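The transmission-line reading of this result can be written out in one step. What follows is a sketch under an assumption not spelled out in the text: the wave impedance of a medium with constants ǫ, µ is taken in its standard form, and the label Z_eff for the effective impedance is introduced here only for illustration. It also assumes that E/M is positive.

\[
Z = \sqrt{\mu/\epsilon}, \qquad Z_{\mathrm{eff}} = \sqrt{M/E}, \qquad \frac{E}{M} = \frac{\epsilon}{\mu} \;\Longrightarrow\; Z_{\mathrm{eff}} = Z .
\]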
There is, however, a substantial difference between the static and dynamic scenarios. Both identify the set of eigenvalues of the effective material tensors as hyperbolas in the relevant planes (Figure 6); these hyperbolas obviously pass through
points related to the monocrystalline materials. However, in electrostatics, not all of
the points on the hyperbola are attainable; only those points can be attained through
actually assembled composites that belong to the segment of the hyperbola between
the original material and the diagonal. This is understandable because an ordinary
polycrystal cannot become more anisotropic than the original monocrystal. This
follows basically from the minimum variational principle of electrostatics.
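The attainability statement for the static case can be illustrated with a small numerical check. This is only a sketch: the function names and the tolerance are hypothetical, and the code merely encodes the two conditions stated above (the point lies on the hyperbola λ1 λ2 = ǫ1 ǫ2, and it is no more anisotropic than the monocrystal). Eigenvalues are treated as an unordered pair.

```python
import math


def is_attainable_static(lam1, lam2, eps1, eps2, tol=1e-9):
    """Check whether (lam1, lam2) can be an effective 2D spatial polycrystal.

    Conditions taken from the text: the point lies on the hyperbola
    lam1 * lam2 = eps1 * eps2, on the segment between the monocrystal
    point (eps1, eps2) and the diagonal lam1 = lam2.
    """
    if min(lam1, lam2, eps1, eps2) <= 0:
        raise ValueError("all eigenvalues are assumed positive")
    on_hyperbola = math.isclose(lam1 * lam2, eps1 * eps2, rel_tol=tol)
    lo, hi = min(eps1, eps2), max(eps1, eps2)
    # Not more anisotropic than the original monocrystal.
    in_segment = (min(lam1, lam2) >= lo - tol) and (max(lam1, lam2) <= hi + tol)
    return on_hyperbola and in_segment


def isotropic_point(eps1, eps2):
    """Point where the hyperbola meets the diagonal lam1 = lam2."""
    return math.sqrt(eps1 * eps2)


if __name__ == "__main__":
    print(is_attainable_static(2.0, 2.0, 1.0, 4.0))  # isotropic mixture
    print(is_attainable_static(0.5, 8.0, 1.0, 4.0))  # more anisotropic point
```

The second call lies on the same hyperbola but beyond the monocrystal point, which is exactly the part of the curve that the minimum principle excludes in electrostatics.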
By contrast, in electrodynamics the spatio-temporal hyperbola is attainable at
all points but one, namely, that point on the diagonal related to the vacuum [6]. This
is because, to attain this point, it takes infinite energy for particles of nonzero proper
Figure 6. Effective properties of spatial and spatio-temporal polycrystals.
mass. We conclude that in electrostatics the minimum variational principle generates a hierarchy of materials with respect to mixing: an original monocrystalline
material may create only those polycrystals that lie on the hyperbola closer to the
diagonal. In contrast to this, in electrodynamics any material on the hyperbola may
create, by forming polycrystals, any other material on it (except for the vacuum).
In other words, electrostatics displays a paternalistic performance when it comes
to mixing in space, whereas in electrodynamics, with a spatio-temporal mixing, we
have no such performance. Clearly, the reason is that the minimum principle
does not hold for the full Maxwell system.
These observations strongly affect the problem of determining the so-called
G-closures, i.e., the sets of invariants of material tensors of all mixtures that may be
produced as the assemblages of original material constituents. Again, we illustrate
this through the comparison between electrodynamics and electrostatics.
Figure 7 is related to electrostatics; it demonstrates the G-closure produced
by a spatial mixing of two anisotropic dielectrics in 2D [7]. The original materials (ǫ11 , ǫ12 ) and (ǫ21 , ǫ22 ) generate, through making spatial polycrystals, their
own hyperbolic segments. These segments represent a part of the boundary of the
G-closure. In a transverse direction, this set is bounded by a diagonal at one end,
and by a special curve at the other end; this curve passes through the points (ǫ11 , ǫ12 )
and (ǫ21 , ǫ22 ) related to the original materials and represents a rank one spatial
laminate assembled from them. Within the layers, the eigenaxes of the original
materials are oriented along and across the layers’ interface. All 2D-mixtures of
two original materials fall, regardless of microgeometry, into the shaded domain
bounded by the noted curves.
In electrodynamics, where composites are assembled in space-time, the situation is different. Any particular composite built in space-time from the original
constituents may or may not allow the long waves to travel through it without
shocks, damping, or amplification (the term “long” in this context means “long
compared with the period of a microstructure”). When such travelling waves exist,
we call a composite stable, otherwise we term it unstable.
Figure 7. G-closures of a binary set of spatial and spatio-temporal composites.
We will allow only stable composites to become elements of G-closures generated by the elements of the original set U . Stable formations should, of course,
be admissible in the sense that they do not lead to shock waves. This requirement
alone is, however, not enough for stability: a composite may not generate shock
waves but at the same time be unable to transmit travelling waves. Such composites
may be produced by a special mixing procedure consisting of two steps; below we
describe this procedure for one dimensional wave propagation.
We start with two isotropic dielectrics immovable in a laboratory frame and
having positive material constants ǫi , µi , i = 1, 2, ǫ1 /µ1 = ǫ2 /µ2 . An activated
rank one laminate assembled from them has the determinant of its effective material
tensor s0 defined by [2]
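\[
\det s_{\mathrm{eff}} = \frac{\langle y \det s \rangle}{\langle y \rangle}; \qquad (1)
\]
here
\[
\langle \cdot \rangle = m_1 (\cdot)_1 + m_2 (\cdot)_2, \quad m_1, m_2 \ge 0, \quad m_1 + m_2 = 1,
\]
\[
y = \left[ \epsilon (V^2 - a^2) \right]^{-1}, \quad a^2 = \frac{1}{\epsilon \mu}, \quad \det s = \frac{\epsilon}{\mu}, \qquad (2)
\]
and V denotes the velocity of the pattern – the slope of lines in Figure 2.
By introducing
\[
\kappa_i = \frac{m_i y_i}{\langle y \rangle}, \quad i = 1, 2, \qquad (3)
\]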
we observe that κ1 + κ2 = 1; as to the sign of κi , it is the same as that of yi /⟨y⟩.
Assume that this sign is positive for both i = 1, 2; then det seff is also positive as
a convex combination (1) of det si , and, consequently, the eigenvalues Ec, 1/Mc
of seff have the same sign. This means that travelling waves are possible through
the laminate, and this holds for all admissible values of mi and V . However, if the
signs of yi /⟨y⟩ are opposite for i = 1 and i = 2, then the same holds for κi , and the
combination
det seff = κ1 det s1 + κ2 det s2
may be made negative by a suitable choice of κ1 ; as a consequence, the values of E
and M will have opposite signs, and travelling waves will not exist. For the relevant
values of κ1 , the laminate will become unstable, but for other admissible values of
κ1 (i.e., mi and V ) it will remain stable. We will call a stable composite absolutely
stable if it remains stable for all admissible values of its structural parameters. So, if
both κ1 and κ2 are positive for all of such values, an activated laminate is absolutely
stable; otherwise it lacks absolute stability. Another example of an absolutely stable
composite is given by a kinetic polycrystal produced by mixing different fragments
of the same original dielectric in space and time.
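The sign argument above can be tried numerically. The following is a minimal sketch, assuming the relations (1)–(3) as stated in the text; the function name, the argument layout and the sample values are hypothetical, units are taken as non-dimensional, and the code does not check whether a given V is admissible in the sense used above.

```python
def activated_laminate(materials, fractions, V):
    """Evaluate kappa_i and det s_eff for a rank-one activated laminate.

    materials: list of (eps, mu) pairs, one per constituent.
    fractions: the m_i, non-negative and summing to 1.
    V: velocity of the property pattern (same units as a = 1/sqrt(eps*mu)).
    Returns (kappas, det_s_eff).
    """
    if min(fractions) < 0 or abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must be non-negative and sum to 1")
    ys, dets = [], []
    for eps, mu in materials:
        a2 = 1.0 / (eps * mu)                   # a^2 = 1/(eps*mu)
        ys.append(1.0 / (eps * (V * V - a2)))   # y = [eps (V^2 - a^2)]^(-1)
        dets.append(eps / mu)                   # det s = eps/mu
    y_avg = sum(m * y for m, y in zip(fractions, ys))         # <y>
    kappas = [m * y / y_avg for m, y in zip(fractions, ys)]   # relation (3)
    det_eff = sum(k * d for k, d in zip(kappas, dets))        # relation (1)
    return kappas, det_eff


if __name__ == "__main__":
    # Both constituents positive: kappas are positive, det s_eff stays positive.
    print(activated_laminate([(1.0, 1.0), (4.0, 1.0)], [0.5, 0.5], 2.0))
    # One "negative" constituent: kappas differ in sign, det s_eff may turn negative.
    print(activated_laminate([(1.0, 1.0), (-4.0, -1.0)], [0.3, 0.7], 2.0))
```

In the second call each det si is still positive, yet the weights κi have opposite signs, so the combination is no longer convex; this is the mechanism by which absolute stability is lost.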
As seen from the above argument, a laminate fails to be absolutely stable if the
signs of ǫ1 and ǫ2 are opposite; in other words, to violate absolute stability, we must
have original materials with parameters ǫi , µi of opposite sign for different values
of i, say, both parameters positive for i = 1, and both of them negative for i = 2.
But a material with negative ǫ and µ can be created as an activated laminate, probably of the second rank, assembled from any two original dielectrics with ǫ, µ being
all positive. This may be achieved [8] by a special choice of structural parameters
in a laminate. The creation of such a “negative” material represents the first stage of
the two-step procedure mentioned above. Having one material negative and another
material positive, we will, at the second stage of this procedure, assemble from
them a second rank laminate lacking absolute stability.
We now define a stable hyperbolic G-closure of an original set U of materials as
a set of invariants of the effective tensors seff of all absolutely stable spatio-temporal
mixtures generated by the elements of U . For one spatial variable and time, a stable
G-closure of the set U of two original dielectrics (ǫ1 , µ1 ) and (ǫ2 , µ2 ) is given by
a hyperbolic strip bounded by hyperbolas E/M = ǫ1 /µ1 , and E/M = ǫ2 /µ2 ,
this strip involving branches belonging both to the first and the third quadrants
of the coordinate plane; see Figure 7 [9]. We observe that the G-closure contains
materials with both effective parameters negative. Not every two elements of a
G-closure may serve as original constituents for building other elements; only
those elements qualify that may produce an absolutely stable composite. Materials
of opposite signs are known not to qualify, so the secondary elements (mixtures)
may only be produced as composites made from original materials of the same
sign. We also observe that there is no transverse bound for a G-closure (leave alone
the diagonal). The reason is that there is no minimum energy principle,
and the system is thermodynamically open.
In conclusion, I want to mention some of the special effects produced by DMs
as they become elements of material design. When we create DMs, we control the
geometry of the characteristics of the relevant hyperbolic equations. By a suitable
mixing, we may direct all of the waves to travel in one and the same direction
relative to a laboratory observer. We term such a phenomenon a coordinated wave
propagation. Consider two coordinated material mixtures; in one of them, both
waves travel from left to right (“a right material”), in another – from right to left (“a
left material”). Place a right material on the right of the origin z = 0 on the z-axis,
and a left material to the left of it. This material combination will demonstrate a
screening property [10]: an initial state will be split into waves travelling away
from the origin, and never entering an extended region in between. Nor will this region
ever be invaded by waves generated at the ends of the segment of the
z-axis we consider, because the characteristics will divert such waves away from the
required direction.
Another effect achieved through material activation is the elimination of
the cutoff frequency in waveguides. A waveguide filled with an appropriately activated laminate allows all waves much longer than the spatial period of lamination
to propagate through it without damping, thus eliminating the cutoff
frequency.
Some Recollections of Professor Clifford Truesdell
I am pleased to have this opportunity to share some personal reminiscences of
Clifford. I met with Clifford only twice but both meetings produced such a lasting
impression upon me that I think he was one of the most remarkable and unique
individuals I ever met.
The first meeting took place in the late spring of 1988 when Clifford and Charlotte visited Russia, and the second occurred a little over a year later when they hosted
me in their wonderful home in Baltimore. Before I met Clifford in person, I knew
much about his work and noticeably less about his personality. Of course I knew
about a strong opposition among some high ranking moguls in Soviet mechanics
to what they called “Truesdellism”.
This labelling of rational mechanics carries no negative flavor per se; I myself
perceive it in a positive sense as a tribute to the man whose seminal work has added
so much to our understanding of the roots of continuum mechanics. There are,
however, individuals who take this name negatively because they associate with
it “an unnecessary incursion of abstract mathematics into the field of mechanics”.
This is their viewpoint, and if appropriately motivated it may become a basis for
a legitimate opposition. The bad thing, however, is that time and again
disproportionately fierce efforts are undertaken to suppress the ideas of rational
mechanics at all costs, and to prevent them from reaching a broad audience. Such
efforts are beyond logic. I will refer to only two examples, to one of which I bear
personal witness.
The first occasion took place in 1975 when A First Course in Rational Continuum Mechanics was published in Moscow on the initiative of Grigorii Isaakovich
Barenblatt. This name was taboo at the Nauka Editorial Board headed by Academician L.I. Sedov; after some extensive pressure, however, the book appeared
with no mention of Barenblatt’s name in it; it was translated by R.V. Goldstein
and V.M. Entov, and edited by P.A. Zhilin and A.I. Lurie. Clifford had taken an
active part in the preparatory work through his intensive correspondence with the
editors. Interestingly, the Russian version appeared several years before the book
was published in English [11].
The second episode is related to one of the first original Russian texts on nonlinear elasticity, written by my father, A.I. Lurie. This book set out and developed
many of the ideas of rational mechanics, of which the author was not only an ardent
supporter but also an active contributor. He and Clifford knew each
other and held each other in immense respect.
The manuscript of the book was completed by the beginning of 1979 and submitted to Nauka Publishers for review.
When the referee’s report was received in late April, the author was already
gravely ill. Naturally, it was up to me to act in his name through the entire review
business. The referee’s report shocked me: its language left no doubt that the goal
was to destroy the book. From a long list of demands, I mention here only one:
it required that there should be an index rather than a direct tensor notation. To
accept this was the same as to kill the book because it required complete revision
(and retyping) of the whole text, i.e., several long months of senseless work. The
situation was both delicate and risky; after a discussion with colleagues, I declined
this demand but accepted some others, less significant, but also time consuming.
Unfortunately, the delay produced by this circumstance did not allow enough time
for the author to see his work published [12].
So it was with this prehistory that I met Clifford and Charlotte in Leningrad
in the spring of 1988. There was a meeting at the Ioffe Institute, my workplace at
that time, with a number of invited guests. At this Institute, there was a group of
people doing classical mathematical physics, particularly special functions, integral equations, and the like. And it quite unexpectedly turned out that Clifford had
contributed to this topic, too! Apparently, that occurred years before he turned all
of his attention to the foundations of continuum mechanics.
While in Leningrad, we discussed many general topics. This was a time when
perestroika was already gathering pace, so we naturally discussed politics. But I think
Clifford was more interested in personal observations: the town itself, its museums,
and its beautiful suburbs. St. Petersburg is an architectural marvel, a place world
famous for its magnificence and harmony. Passionate as he was, Clifford was eager
to examine every bit of what he saw in museums, be it a fabric, furniture, or clocks.
He literally knelt down in front of selected pieces to better feel the material and
to recognize the work. He evidently was pleased with his visit as he recalled later
when we met again.
I think Clifford was more than a prolific scholar. He really loved life, was
passionate and sometimes sharp in his judgment, but he also was a philosopher.
His interest in the history of science was not accidental: it originated from
the same source as his interest in the foundations of mechanics: both revealed his
striving to understand the very roots of things. In this sense, I think, he was
of the same ilk as the people of the Renaissance, who clearly realized their place in
history.
References
1. I.I. Blekhman and K.A. Lurie, On dynamic materials. Proc. of the Russian Academy of Sciences
(Doklady) 37 (2000) 182–185.
2. K.A. Lurie, The problem of effective parameters of a mixture of two isotropic dielectrics distributed in space-time and the conservation law for wave impedance in one-dimensional wave
propagation. Proc. Roy. Soc. London A 454 (1998) 1767–1779.
3. K.A. Lurie, Control of the coefficients of linear hyperbolic equations via spatio-temporal
composites. In: V. Berdichevsky, V. Jikov and G. Papanicolaou (eds), Homogenization. World
Scientific, Singapore (1999) pp. 285–315.
4. B.P. Lavrov, Private Communication. Mekhanobr-Tekhnika, St. Petersburg, Russia (2002).
5. A.M. Dykhne, Conductivity of a two-dimensional two-phase system. Soviet Phys. JETP 32
(1971) 63–65.
6. K.A. Lurie, Bounds for the electromagnetic material properties of a spatio-temporal dielectric
polycrystal with respect to one-dimensional wave propagation. Proc. Roy. Soc. London A 456
(2000) 1547–1557.
7. K.A. Lurie and A.V. Cherkaev, Effective characteristics of composite materials and the optimal
design of structural elements. In: A. Cherkaev and R. Kohn (eds), Topics in the Mathematical
Modelling of Composite Materials. Birkhäuser, Boston (1997) pp. 175–258.
8. K.A. Lurie and S.L. Weekes, Effective and averaged energy densities in one-dimensional wave
propagation through spatio-temporal dielectric laminates with negative effective values of ǫ
and µ. To appear in: R. Agarwal and D. O’Reagan (eds), Nonlinear Analysis and Applications.
World Scientific, Singapore (2003).
9. K.A. Lurie, A stable spatio-temporal G-closure and Gm -closure of a set of isotropic dielectrics
with respect to one-dimensional wave propagation. Submitted to Wave Motion.
10. K.A. Lurie, Effective properties of smart elastic laminates and the screening phenomenon.
Internat. J. Solids Struct. 34 (1997) 1633–1643.
11. C.A. Truesdell III, A First Course in Rational Continuum Mechanics, Part I. Academic Press,
New York, 1977.
12. A.I. Lurie, Non-linear Theory of Elasticity (in Russian). Nauka, Moscow, 1980 (English translation: Non-linear Theory of Elasticity. North Holland, Amsterdam, New York, Oxford, Tokyo,
1990).
Pseudo-plasticity and Pseudo-inhomogeneity
Effects in Materials Mechanics
GERARD A. MAUGIN
Laboratoire de Modélisation en Mécanique, UMR CNRS 7607, Université Pierre et Marie Curie
(Paris 6), Case 162, 4 place Jussieu, 75252 Paris Cedex 05, France. E-mail: gam@ccr.jussieu.fr
Received 29 July 2002; in revised form 27 August 2003
Abstract. It is shown that a large variety of physical effects such as continuously distributed defects, heat conduction, anelasticity (plasticity in finite-strains, growth), phase transitions and more
generally shock-waves, can be viewed as pseudo-material inhomogeneities when continuum thermomechanics is completely projected onto the material manifold itself. Main ingredients in this
approach are the notions of local structural rearrangements (Epstein and Maugin) and of its thermodynamical dual, the Eshelby material stress tensor. An outcome of this is the unification of the
theories of inhomogeneity of Eshelby on the one hand, and of Kroener–Noll–Wang, on the other
hand. The notion of configurational forces as understood nowadays in solid-state physics and engineering mechanics follows necessarily from these developments. They are driving forces acting
on sets of material points that correspond to strongly localized fields and, in the limit, singularities,
which are also viewed as pseudo-inhomogeneities. The second law of thermodynamics then is a
constraint imposed on the time evolution of these pseudo-inhomogeneities (e.g., plastic evolution,
volumetric growth, progress of a crack, advancement of a phase-transition front, etc.). This has
very powerful implications in numerical schemes drawn directly on the material manifold (e.g.,
thermodynamically admissible volume-element scheme for the simulation of phase-transformation
evolution).
Mathematics Subject Classifications (2000): 74A15, 74A45, 74J40, 74N20.
Key words: solids, thermodynamics, elasticity, dissipation, nonhomogeneous materials, fracture,
phase transitions, singularities.
This contribution is dedicated to the memory of the late Clifford A. Truesdell,
master and guide of our generation in the field theory of continuum mechanics.
1. Introduction
We start by giving the following two purely verbal definitions which encapsulate
most of the subject of this paper.
DEFINITION 1. We call pseudo-plastic effects in continuum mechanics those
mechanical effects – due to any physical property – which manifest themselves
just like plasticity, through the notion of internal or eigenstrains and eigenstresses
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 575–597.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
G.A. MAUGIN
(eigenspannungen) in the language of Kroener [1]; see also [2]. Thermoelasticity,
magnetoelasticity of nonuniformly magnetized ferromagnets, electroelasticity of
ferroelectrics, etc., are examples of such pseudo-plastic effects (cf. [3, 4]).
DEFINITION 2. We call pseudo-inhomogeneity effects in continuum mechanics
those mechanical effects – of any origin – which manifest themselves as so-called
material forces in the material mechanics of materials, as developed by the author
and co-workers since 1990 (see, e.g., reviews by the author [5, 6]). The reason
for these is that the force exerted on a true material inhomogeneity (a region of a
material body where material properties vary with the material point or are different
from those at other points outside the region) in a material displacement (caused by
the field solution of the problem) is – through the inherent duality of continuum mechanics – the best characterization of the material inhomogeneity of a body. Forces
acting on smooth distributions of dislocations (one kind of crystalline defect) and
forces acting on macroscopic defects viewed as field singularities of certain dimensions on the material manifold, such as the forces driving macroscopic cracks or
phase-transition fronts, are of this type.
The purpose of the present work is to present a unified view of these two classes
of effects. This should not come as a surprise since, for instance, dislocations are
one possible cause of eigenstresses. They also provide the ultimate microscopic
mechanism at the basis of macroscopic plasticity. Also, it might seem reasonable
that the best intrinsic way to describe eigenstrains is to observe what happens on
the material manifold. To reach our conclusion we shall combine two modern approaches to continuum mechanics: the thermomechanics of irreversible processes
approached by means of the concept of internal variable of state [7], and the geometrical approach that considers local material rearrangements on the material
manifold (a notion due to Epstein and Maugin [8], but leaning on Noll’s and Wang’s
works [9, 10]; see also [11]) as the basic mechanisms of all our effects of interest.
On the way, we shall uncover the unification of three of the most productive and
creative lines of thought developed in continuum mechanics in the second part of
the XXth century, namely,
(i) the finite-strain line with the concept of multiplicative decomposition of the
deformation gradient,
(ii) the geometrical line whose purpose inspired by mathematical physics was to
capture anelastic effects via necessarily involved geometrical descriptions of
the material manifold, and
(iii) the configurational-force line which gave rise to the notion of material force
(i.e., a covector on the material manifold) following the pioneering works
of Peach and Koehler and Eshelby in the 1950s.
Section 2 reviews these three lines in the form of a historical introït. The three great historical figures
who emerge thus are J. Mandel (1904–1978), E. Kroener (1919–2000) and
J.D. Eshelby (1916–1981).
PSEUDO-PLASTICITY AND PSEUDO-INHOMOGENEITY
2. Historical Preliminary
This section is not meant as an exhaustive historical account of the field,
but only as an indication of the most salient contributions and their approximate
interrelations. We like to distinguish three lines of originally independent creative
developments in continuum mechanics in the period 1950–2000 (Flow chart 1
based on [12, 13]). One purpose of this contribution is to show how these three lines
have recently been united in a grand scheme under the umbrella of thermomechanics
and how the viewpoints of the main protagonists (Mandel, Kroener, Noll, Eshelby)
find their best combined expression in this powerful unity.
A. Along the finite-deformation line (left column in Flow chart 1, Figure 1), following the natural notion of composition of maps in analysis, the main fruitful
ingredient was the multiplicative decomposition of the deformation gradient into
an elastic contribution and an anelastic one (neither of the two being separately integrable
into a displacement), originally by the UK group of Bilby et al. [14] and
Kroener and Seeger [15, 16]. This may have been anticipated by rheologists (Green
and Tobolsky, 1940’s) but for exactly integrable members of the decomposition.
The geometrical line (central column in Flow chart 1) was connected with this
initially. But the finite-strain theory of anelasticity stayed dormant until the late
1960s when this was revived by Lee [17] and co-workers. From our viewpoint,
however, definite progress was made by Mandel [18] when he showed that what
is now referred to as the “Mandel stress” [19] expressed in the so-called elastically-released or “intermediate” configuration – between the material one and the actual
one – is the driving force behind anelasticity. The introduction of the “intermediate”
configuration is intimately – we should say, in duality – related to that of multiplicative decomposition of the finite deformation; cf. [20]. Sidoroff [21] has shown how
the richness of the phenomenological description of finite-strain viscoelasticity is
enhanced by the decomposition in multiple factors (more than two), introducing
thus a series of “intermediate” configurations. So much for this line.
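For orientation, the decomposition discussed in this paragraph can be written schematically as follows. The symbols are assumptions introduced here, not the author's notation: F denotes the deformation gradient, F^e and F^a its elastic and anelastic factors, and F_1, ..., F_n the factors of a multiple decomposition of the kind attributed above to Sidoroff. The ordering of the factors follows one common convention and may differ from that of the cited works.

\[
F = F^{e} F^{a}, \qquad \det F = \det F^{e} \, \det F^{a}, \qquad F = F_{1} F_{2} \cdots F_{n} .
\]

Neither F^e nor F^a is, in general, the gradient of a displacement field; each pair of adjacent factors in the multiple decomposition defines one further "intermediate" configuration.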
B. Along the geometrical line (central column in Flow chart 1), we find works by
scientists who were greatly influenced by mathematical physics, particularly the
geometrical theory of gravitation of A. Einstein known as the general theory of
relativity. Kondo [22, 23] in Japan was the first to infuse such ideas in continuum
mechanics. But the group of Bilby et al. in the UK and E. Kroener and A. Seeger
in Germany soon took over this line. In particular, introducing the notion of incompatibility tensor [1] to describe mathematically the lack of unique determination
of the elastic displacement in continuously dislocated bodies, Kroener [15] made a
definite step as he could then relate the density of dislocations (one type of “elastic”
defect) to the geometry of the material manifold (non-vanishing curvature). At this
point inclusive ideas of T. Levi-Civita and E. Cartan on (geometrical) connections,
torsion, and distant parallelism entered the scene. This was most forcefully implemented by Noll [9] – also in [11, 10] – in landmark papers. But these authors, fruitful
and deep as their research was, did not really propose a relationship between a
driving force and the geometrical background.
Figure 1. Flow chart 1 (three parallel columns, each starting in the 1950s: the finite-deformation line, the geometrical line, and the configurational-force line).
C. The third line (right column in Flow chart 1) is that initially developed by
Peach and Koehler [24] and Eshelby [25], who established the expression of the
driving force (not a Newtonian force acting per unit of matter) on a singularity line
(dislocation line) and on a material inhomogeneity, respectively. The celebrated
J-integral of fracture (force on a crack tip [26]) is also such a force. Eshelby found
that this type of “force” is related to the divergence of a peculiar stress tensor, which
he identified as the spatial part of what was known as the energy-momentum tensor
in field theories [27]. This is now referred to as the Eshelby stress tensor in honor of this great scientist. However, late in the 1960s, Rogula [28] and the author [29], drawing on studies in general relativistic continuum mechanics, found it convenient to emphasize the duality between projections of the equations of continuum mechanics onto physical space and directly onto the material manifold. It seems that this viewpoint was exported to the USA by Golebiewska [30] in the late 1970s, who initiated a trend followed by Herrmann and his co-workers with efficient applications to the strength of materials of structural members [31]. The configurational-force line is also presented in some detail in [32].
Epstein and Maugin [8] (also many subsequent papers by these authors; in
particular, some synthesis works [5, 6]), working entirely in material space and
exploiting the ideas of Noll but pursuing them to a logical end, combined lines
C and B and obtained the final unifying result: the Eshelby material stress is indeed fed by all types of material inhomogeneities and field singularities (defects). This is shown by establishing the material balance law in which the Eshelby stress is the flux. This is the fully material balance law missed by Noll and Wang, which represents equilibrium, or dynamics, among all types of inhomogeneities. This establishes the relationship between the geometrical and configurational-force lines. Furthermore, it happens that the above-mentioned Mandel stress is none other than an easily identified part of the Eshelby stress. All this was achieved by exploiting the notion of inhomogeneity map, or material transplant (with a biophysical connotation) or, still, local structural rearrangement. This is shown in Section 4
below after an introduction to canonical balance laws in Section 3. A different line
of thought was pursued by Gurtin [33–35] and some of his co-workers, with a
special interest in interface phenomena.
3. Canonical Balance Laws
These are the fundamental balance laws of thermomechanics (momentum and energy) expressed intrinsically in terms of a good space-time parametrization. In a
relativistic background this would be the conservation – or lack of conservation
– of the canonical energy-momentum tensor [27], first spelled out in 1915 and
1918 by David Hilbert and Emmy Noether on a variational basis. Here we do not
appeal to any variational formulation as we consider the case of finitely deformable
dissipative media which may conduct heat, a case of general interest.
In modern continuum mechanics, we account for a variety of microscopic phenomena responsible for macroscopic dissipation through the notion of internal
variables of state (review of this notion in [7]). These variables have the essential property that they cannot be controlled directly by external stimuli, so that they expend power only in the bulk, in the form of dissipation. We denote collectively by α
these variables, whose choice and tensorial nature depend on the physical acumen
of the theoretician helped by the experimentalist who should uncover those most
representative variables (e.g., density of dislocation, plastic strain, work-hardening
variables, etc., as we know now). Such a thermodynamical framework is particularly powerful in the study of the anelastic behavior of solid-like materials. Remarkably, to formulate a sufficiently general theory it suffices, to start with, to cast the theory of finite-strain thermoelasticity in a form where the free energy is also taken to be a function of the internal variable of state (because these new variables are internal, we do not need to introduce new kinetic notions such as inertial forces in the bulk or applied forces at the boundary). The only initial change is the introduction of α, in addition to the deformation gradient F and the absolute temperature θ, in the list of functional arguments of the free energy, e.g.,
\[ W = \bar{W}(\mathbf{F}, \theta, \alpha; \mathbf{X}) \tag{3.1} \]
for an anisotropic, possibly anelastically inhomogeneous material in finite strains,
whose basic behavior is elastic but which may also exhibit anelasticity. Here, W is the free energy density per unit volume in the global reference configuration $K_R$ of a material body B. The function $\bar{W}$ may depend on F only through another quantity such as an element of a multiplicative decomposition, while α itself may contain another element of this decomposition (case of finite-strain elasto-plasticity and elasto-viscoplasticity). In addition, $\bar{W}$ is supposed to depend explicitly on the material point X, i.e., the material may be smoothly materially inhomogeneous
(assuming the function sufficiently smooth in all of its arguments to allow for analytic manipulations). Equation (3.1) corresponds to a so-called first-order gradient
theory with respect to the deformation (so-called simple materials in Noll’s classification) but not for internal variables. Higher-order gradients of both fields yielding
scale effects are dealt with by Maugin and Trimarco [36] (also in [5, Section 5.8])
on a variational basis for the deformation and [37] for internal variables in the
dissipative case.
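The so-called laws of (thermodynamical) state, given by the partial derivatives of the function $\bar{W}$ of (3.1) with respect to its first three arguments, are:
\[ \mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}}, \qquad S = -\frac{\partial \bar{W}}{\partial \theta}, \qquad A = -\frac{\partial \bar{W}}{\partial \alpha} \tag{3.2} \]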
These are, respectively, the first Piola–Kirchhoff stress, the entropy density (according to the axiom of local thermodynamical state [7]), and the thermodynamical
force associated to α. The quantities F, θ and α are fields, thus depending on the
material point X and the Newtonian time t.
Let ρ0 (X) be the matter density at KR . Then at any regular material point in the
body B, we have the following balance equations for mass, linear momentum, and
energy [7]:
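\[ \left.\frac{\partial \rho_0}{\partial t}\right|_{X} = 0 \tag{3.3} \]
\[ \left.\frac{\partial \mathbf{p}}{\partial t}\right|_{X} - \operatorname{div}_R \mathbf{T} = \mathbf{0} \tag{3.4} \]
\[ \left.\frac{\partial (K + E)}{\partial t}\right|_{X} - \nabla_R \cdot (\mathbf{T}\cdot\mathbf{v} - \mathbf{Q}) = 0 \tag{3.5} \]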
These equations are presented here in the so-called Piola–Kirchhoff formulation,
with an (X, t) space–time parametrization, but the components of equation (3.4)
are still in physical space, so that it is not an intrinsic formulation. We remind the
reader of the following definitions:
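\[ \mathbf{x} = \chi(\mathbf{X}, t), \qquad \mathbf{v} = \left.\frac{\partial \chi}{\partial t}\right|_{X} \tag{3.6} \]
\[ \mathbf{F} = \left.\frac{\partial \chi}{\partial \mathbf{X}}\right|_{t} = \nabla_R \chi \tag{3.7} \]
\[ \mathbf{p} = \rho_0 \mathbf{v}, \qquad K = \tfrac{1}{2}\rho_0 \mathbf{v}^2 \tag{3.8} \]
\[ E = W + S\theta \tag{3.9} \]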
This last quantity is the internal energy per unit reference volume; v and p are
called the physical velocity and linear momentum, respectively; Q is the material
heat flux, i.e., the heat influx per unit material surface. Equations (3.3)–(3.5) are
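strict conservation laws (no source terms), because we assume, for the sake of simplicity, that there is neither an external body force nor an energy input per unit volume. Under these conditions the entropy equation and the dissipation inequality read
\[ \theta \left.\frac{\partial S}{\partial t}\right|_{X} + \nabla_R \cdot \mathbf{Q} = \Phi^{\mathrm{intr}}, \qquad \Phi^{\mathrm{intr}} := A\dot{\alpha}, \qquad \dot{\alpha} \equiv \left.\frac{\partial \alpha}{\partial t}\right|_{X} \tag{3.10} \]
and
\[ \sigma_B = \theta^{-1}\bigl(\Phi^{\mathrm{intr}} - \mathbf{S}\cdot\nabla_R \theta\bigr) \ge 0, \qquad \mathbf{S} \equiv \frac{\mathbf{Q}}{\theta} \tag{3.11} \]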
with the continuity condition
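\[ \mathbf{Q}(\mathbf{F}, \theta, \alpha; \nabla_R \theta; \mathbf{X}) \to \mathbf{0} \quad \text{as} \quad \nabla_R \theta \to \mathbf{0} \tag{3.12} \]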
CANONICAL EQUATION OF LINEAR MOMENTUM
This is obtained by projecting canonically equation (3.4) onto the material manifold $M^3$ of points X constituting the body. In turn this is effected simply by applying F on the right to equation (3.4) and taking account of the functional dependence (3.1) and that of $\rho_0$. The now classical result is
\[ \left.\frac{\partial \mathbf{P}}{\partial t}\right|_{X} - \bigl(\operatorname{div}_R \mathbf{b} + \mathbf{f}^{\mathrm{inh}}\bigr) = \mathbf{f}^{\mathrm{th}} + \mathbf{f}^{\mathrm{intr}} \tag{3.13} \]
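where we have introduced the canonical momentum P (a co-vector on $M^3$), a density of "Lagrangian function" L, with a superscript th indicating that it is evaluated with the free energy, the (fully material but mixed) Eshelby stress tensor b, and three material forces due respectively to true material inhomogeneities, thermal effects, and intrinsic dissipative effects represented by α:
\[ \mathbf{P} = -\mathbf{p}\cdot\mathbf{F} = \rho_0\, \mathbf{C}\cdot\mathbf{V} \tag{3.14} \]
\[ L = L^{\mathrm{th}} = K - W \tag{3.15} \]
\[ \mathbf{b} = -\bigl(L^{\mathrm{th}}\,\mathbf{1}_R + \mathbf{T}\cdot\mathbf{F}\bigr) \tag{3.16} \]
\[ \mathbf{f}^{\mathrm{inh}} = \left.\frac{\partial L^{\mathrm{th}}}{\partial \mathbf{X}}\right|_{\mathrm{expl}}; \qquad \mathbf{f}^{\mathrm{th}} = S\,\nabla_R \theta, \qquad \mathbf{f}^{\mathrm{intr}} = A\,(\nabla_R \alpha)^{T} \tag{3.17} \]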
In equation (3.14), C is the Cauchy–Green finite strain and V is the material
velocity (a contravector on the material manifold) defined by
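\[ \mathbf{C} = \mathbf{F}^{T}\cdot\mathbf{F}, \qquad \mathbf{V} = -\mathbf{F}^{-1}\cdot\mathbf{v} = \left.\frac{\partial \chi^{-1}}{\partial t}\right|_{x} \tag{3.18} \]
In the first of equations (3.17), the explicit material gradient is computed by keeping the fields fixed, i.e.,
\[ \mathbf{f}^{\mathrm{inh}} = (\nabla_R \rho_0)\,\tfrac{1}{2}\mathbf{v}^2 - \left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{\mathbf{F},\theta,\alpha\ \mathrm{fixed}} \tag{3.19} \]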
At all regular material points X equation (3.13) is a differential identity deduced
from equation (3.4). An embryonic form of this equation for the case of statics in
a purely hyperelastic homogeneous body in the absence of applied force may be
found in [38] – we referred to this as Ericksen’s identity [5, pp. 76–77] (for other
such “Ericksen–Noether identities” in other field theories see [39]).
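As a check on this identity, its quasi-static version can be recovered in a few lines. The sketch below is not taken from the cited works: it assumes Cartesian components, smooth fields, the index convention $T^{K}{}_{i} = \partial \bar{W}/\partial F^{i}{}_{K}$ for the first Piola–Kirchhoff stress, the compatibility condition $F^{i}{}_{L,K} = F^{i}{}_{K,L}$, and an implied contraction over the tensor indices of α in the product $A\,\alpha_{,K}$.

```latex
% Quasi-statics: T^{L}{}_{i,L} = 0 and W = \bar{W}(F, \theta, \alpha; X).
\begin{align*}
W_{,K}
 &= T^{L}{}_{i}\,F^{i}{}_{L,K} - S\,\theta_{,K} - A\,\alpha_{,K}
    + \left.\frac{\partial \bar{W}}{\partial X^{K}}\right|_{\mathrm{expl}}
    && \text{(chain rule and laws of state (3.2))} \\
T^{L}{}_{i}\,F^{i}{}_{L,K}
 &= T^{L}{}_{i}\,F^{i}{}_{K,L}
  = \bigl(T^{L}{}_{i}\,F^{i}{}_{K}\bigr)_{,L}
    && \text{(compatibility, then } T^{L}{}_{i,L} = 0\text{)} \\
\bigl(W\,\delta^{L}{}_{K} - T^{L}{}_{i}\,F^{i}{}_{K}\bigr)_{,L}
 &= -S\,\theta_{,K} - A\,\alpha_{,K}
    + \left.\frac{\partial \bar{W}}{\partial X^{K}}\right|_{\mathrm{expl}} .
\end{align*}
```

With K = 0 and P = 0 the left-hand side is the material divergence of b = W 1_R − T·F, and the right-hand side is minus the sum of the thermal, intrinsic and inhomogeneity forces, the latter reducing to minus the explicit gradient of the free energy. This is the static form of equation (3.13).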
According to (3.19) the “force” f inh captures indeed the explicit X-dependency
and deserves its naming as material force of inhomogeneity, or for short inhomogeneity force. This is the first cause for the momentum equation (3.13) to be
inhomogeneous (i.e., to have a source term) while the original – in physical space
– momentum equation (3.4) is a true conservation law. What is more surprising is
that a spatially nonuniform state of temperature (∇R θ ≠ 0) causes a similar effect,
i.e., the material thermal force f th acts just like a true material inhomogeneity in so
far as the balance of canonical (material) momentum is concerned [40]. It seems
that Bui [41] was the first to uncover such a thermal term while studying fracture
although in the small-strain framework and not in the material setting. Finally, any
internal variable of state α that has not reached a spatially uniform state at point X, ∇R α ≠ 0, has a similar effect in the equation of canonical momentum through the intrinsic material force $\mathbf{f}^{\mathrm{intr}}$ [37]. We call such material forces material forces of quasi- or pseudo-inhomogeneity. Note that any additional variable put in the
functional dependency of the free energy W will cause a similar effect. It is only
in the pure materially homogeneous elastic case (W depending only on F) that the
balance of canonical momentum is also a strict conservation law. For instance, in
a spatially nonuniformly magnetized elastic material, with material magnetization
density m per unit volume of the reference configuration, we shall have a material
magnetic force with expression
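\[ \mathbf{f}^{\mathrm{magn}} = \mathbf{H}_L \cdot (\nabla_R \mathbf{m})^{T}, \qquad \mathbf{H}_L = -\frac{\partial W}{\partial \mathbf{m}} \tag{3.20} \]
where $\mathbf{H}_L$ is the so-called local magnetic field, in material form, of ferromagnetism [42]. Formulas (3.20) strictly apply to the case of soft ferromagnets only (no magnetic hysteresis). In a hard ferromagnet with magnetic ordering (micromagnetics) the contributions (3.20) will be replaced by
\[ \mathbf{f}^{\mathrm{ferro}} = \mathbf{H}^{\mathrm{eff}} \cdot (\nabla_R \mathbf{m})^{T}, \qquad \mathbf{H}^{\mathrm{eff}} = -\frac{\delta W^{\mathrm{tot}}}{\delta \mathbf{m}} \tag{3.20$'$} \]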
where δ/δm is an Euler–Lagrange functional derivative, and $W^{\mathrm{tot}}$ is the total potential energy including elastic, magnetoelastic, exchange, magnetic-anisotropy, magnetic doublet, and demagnetizing energies [43]. The very expression of the first of (3.20)′ accounts for the gyroscopic nature of the magnetic spin, so that there is no explicit contribution of magnetic spin to the volume kinetic energy K. In
the presence of spin-lattice relaxation (Gilbert’s spin “viscosity” generalized to the
deformable framework), there exists an additional material force due to this effect
(see [43]).
One could be tempted to consider equations (3.13) and (3.5) as the canonical
equations of momentum and energy. But the second of these, in its form (3.5),
does not show much in common with (3.13). The reason is that, whatever we try
(see below), the latter can never be transformed into a strict conservation law. It is
therefore equation (3.5) which must be transformed in order to exhibit a structure
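similar to that of equation (3.13). For this it is sufficient to remember that the first of equations (3.10) is but a transformed form of the energy equation. Manipulating the first term, we can write the latter equation as [44, 45]
\[ \left.\frac{\partial (S\theta)}{\partial t}\right|_{X} + \nabla_R \cdot \mathbf{Q} = \Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}}, \qquad \Phi^{\mathrm{th}} := S \left.\frac{\partial \theta}{\partial t}\right|_{X} \tag{3.21} \]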
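The manipulation of the first term is simply the product rule. As a short worked step, writing $\Phi^{\mathrm{intr}}$ for the intrinsic dissipation $A\dot{\alpha}$ and $\Phi^{\mathrm{th}}$ for $S\,\partial\theta/\partial t$ (the symbol Φ is our notation for these two source terms):

```latex
\begin{align*}
\left.\frac{\partial (S\theta)}{\partial t}\right|_{X}
 &= \theta \left.\frac{\partial S}{\partial t}\right|_{X}
  + S \left.\frac{\partial \theta}{\partial t}\right|_{X} \\
 &= \bigl(\Phi^{\mathrm{intr}} - \nabla_R \cdot \mathbf{Q}\bigr) + \Phi^{\mathrm{th}}
    && \text{(entropy equation in (3.10), definition of } \Phi^{\mathrm{th}}\text{)} .
\end{align*}
```

Moving the heat-flux divergence to the left-hand side gives the first of equations (3.21).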
The similarity between variables α and θ is thus enhanced by the closely analogous space and time structure of the right-hand sides of equation (3.13) and the first of equations (3.21); but while the second variable is governed by the heat equation, the first one has to be governed by a pure evolution equation subject to the second law of thermodynamics (non-negative dissipation). Equation (3.13) and the first of equations (3.21) now clearly appear as the spacelike and timelike components of a single four-dimensional canonical balance of momentum and energy. The remarkable fact,
however, is that the fourth (timelike) component of the four-dimensional canonical
momentum that could be introduced is neither the free nor the internal energy density but the difference between the two. This is in agreement with the relativistic
formulation of thermoelasticity (neither with true inhomogeneities nor with any
pseudo-inhomogeneities of any kind) of Kijowski and Magli [46]. This hints at a
true analytical mechanics of dissipative continua.
REMARK. (On Legendre–Fenchel transforms of the energy density). At regular
material points equation (3.13) is an identity deduced for smooth fields from equation (3.4). As such we can arrange the contributions to the left- and right-hand sides
at will. But, whatever we do, we cannot reformulate the material forces as exact
time and space derivatives on the left-hand side, even by a clever redefinition of
some quantities. There will always remain source terms. For instance, considering
the case of quasi-statics (neglect of inertial quantities) in order to simplify the
writing, equation (3.13) will read
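\[ \operatorname{div}_R \mathbf{b} + \mathbf{f}^{\mathrm{inh}} + \mathbf{f}^{\mathrm{th}} + \mathbf{f}^{\mathrm{intr}} = \mathbf{0} \tag{3.22} \]
with
\[ \mathbf{f}^{\mathrm{inh}} = -\left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{\mathrm{expl}}; \qquad \mathbf{f}^{\mathrm{th}} = S\,\nabla_R \theta, \qquad \mathbf{f}^{\mathrm{intr}} = A\,(\nabla_R \alpha)^{T} \tag{3.23} \]
The expressions (3.23) go along with the fact that it is the free energy from which b is now defined:
\[ \mathbf{b} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} \tag{3.24} \]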
We shall normally use a notational device to emphasize this fact; thus we shall write $\mathbf{b}_W$ for this b. Usually the function $\bar{W}$ is assumed to be concave in the variable θ. A typical Legendre–Fenchel transformation of this energy density which conserves the property of convexity and the degree of mathematical homogeneity is given by (no pure material inhomogeneities here)
\[ E(\mathbf{F}, S, \alpha) + \bigl(-W(\mathbf{F}, \theta, \alpha)\bigr) = S\theta, \qquad \theta = \frac{\partial E}{\partial S} > 0 \tag{3.25} \]
of which the first expresses Young's equality for conjugate functions in convex analysis at fixed F and α, and the second provides the non-negative temperature θ, indicating that the internal energy E is an ever increasing function of entropy.
For illustrative purposes, consider the simple case of quasi-statics in the absence of internal variable α (materially homogeneous, purely thermoelastic body).
Equation (3.22) takes on the form
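\[ \operatorname{div}_R \mathbf{b}_W + \mathbf{f}^{\mathrm{th}}_{\theta} = \mathbf{0}, \qquad \mathbf{b}_W := W(\mathbf{F}, \theta)\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}, \qquad \mathbf{f}^{\mathrm{th}}_{\theta} := S\,\nabla_R \theta \tag{3.26} \]
But with the Legendre–Fenchel transform (3.25) this can as well be written as
\[ \operatorname{div}_R \mathbf{b}_E + \mathbf{f}^{\mathrm{th}}_{S} = \mathbf{0}, \qquad \mathbf{b}_E := E(\mathbf{F}, S)\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}, \qquad \mathbf{f}^{\mathrm{th}}_{S} := -\theta\,\nabla_R S \tag{3.27} \]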
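The passage between the two forms can be made explicit. The short computation below is a sketch for the materially homogeneous, purely thermoelastic, quasi-static case considered here; it uses only E = W + Sθ and the free-energy form of the balance, div_R b_W = −S∇_R θ.

```latex
\begin{align*}
\mathbf{b}_E
 &= E\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}
  = \mathbf{b}_W + S\theta\,\mathbf{1}_R
    && \text{(since } E = W + S\theta\text{)} \\
\operatorname{div}_R \mathbf{b}_E
 &= \operatorname{div}_R \mathbf{b}_W + \nabla_R (S\theta)
  = -S\,\nabla_R \theta + S\,\nabla_R \theta + \theta\,\nabla_R S
  = \theta\,\nabla_R S .
\end{align*}
```

Hence the divergence of the internal-energy-based Eshelby stress is balanced by the thermal force −θ∇_R S, and the source term changes from a temperature gradient to an entropy gradient when the energy density in b is changed.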
The notation with subscripts (W, θ) and (E, S) is clear and consistent. As in the rest of thermomechanics, the choice of exploiting either the first of (3.26) or the first of (3.27)
depends on the thermodynamical situation at hand (i.e., isothermal situation or
adiabatic conditions, or isentropic conditions). This choice becomes essential in
treating problems involving singularities (crack extension) or transition layers such
as phase-transition interfaces (essentially homothermal singular surfaces) or more
classical shock waves (singular surfaces exhibiting a growth of entropy across, but
often assumed to connect two material regions in adiabatic regime). The reason for
this is that, while (3.26) and (3.27) or their more general form in dynamics and with
real inhomogeneities and dissipative processes are mathematical identities at regular material points, they do provide the expression of the driving force (originally
a material or configurational force) acting at such singular points (case of a crack)
or at such surfaces via contour integrals or the jump equation associated to the
balance law of material momentum, of which the first of (3.26) and the first of (3.27) are specialized equilibrium forms. The problem of selecting the appropriate energy density in the Eshelby stress tensor, and accordingly the expression of the material forces due to thermal and other dissipative effects, deserves special attention (this was remarked upon by Abeyaratne and Knowles [47] and the author [48]).
4. Local Structural Rearrangements and Material Transplants
4.1. TRUE MATERIAL INHOMOGENEITIES
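In order to make ideas clear let us consider the case of quasi-statics in the absence of body force with an elastic energy density given by $W = \bar{W}(\mathbf{F}; \mathbf{X})$ per unit reference volume. In this case equations (3.4) and (3.22) reduce to
\[ \operatorname{div}_R \mathbf{T} = \mathbf{0}, \qquad \mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}} \tag{4.1} \]
and
\[ \operatorname{div}_R \mathbf{b} + \mathbf{f}^{\mathrm{inh}} = \mathbf{0}, \qquad \mathbf{f}^{\mathrm{inh}} = -\left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{\mathrm{expl}}, \qquad \mathbf{b} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} \tag{4.2} \]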
Following Epstein and Maugin [8], however, we consider (as a thought experiment) the case where the material inhomogeneity can be artificially removed at each material point X by effecting a point-dependent change of reference configuration. That is, the reference change is local and generally not integrable over the whole body. Such a change is called a local structural rearrangement. This is in line with Noll's original idea of uniformity [9]. Let P(X) denote this reference change (note that P here is not to be mistaken for the canonical momentum, which does not appear in this section), which brings a neighborhood of X into the so-called crystal reference. This is performed modulo the material symmetry [8, 49] so that, when we account for the accompanying volume change $J_P = \det \mathbf{P}$, P combines multiplicatively to the right with F and, for energies, we can write
\[ W = \bar{W}(\mathbf{F}; \mathbf{X}) = J_P^{-1}\,\hat{W}(\mathbf{F}\mathbf{P}(\mathbf{X})) = \tilde{W}(\mathbf{F}, \mathbf{P}) \tag{4.3} \]
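Obviously we can compute the partial derivatives of the last-mentioned function $\tilde{W}$, obtaining thus, as easily checked,
\[ \mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}} = \frac{\partial \tilde{W}}{\partial \mathbf{F}}, \qquad \tilde{\mathbf{b}} = -\frac{\partial \tilde{W}}{\partial \mathbf{P}} = -\bigl(\mathbf{T}\cdot\mathbf{F} - W\,\mathbf{1}_R\bigr)\cdot\mathbf{P}^{-T} \tag{4.4} \]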
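The check left to the reader can be sketched in components. This is an illustrative computation under assumptions that are ours: Cartesian components, Greek indices for the crystal reference, $P^{K}{}_{\alpha}$ for the rearrangement, $(F_e)^{i}{}_{\alpha} := F^{i}{}_{K}P^{K}{}_{\alpha}$ as a shorthand, a hat and a tilde to distinguish the two energy functions of (4.3), and Jacobi's formula $\partial(\det \mathbf{P})/\partial P^{K}{}_{\alpha} = (\det \mathbf{P})\,(P^{-1})^{\alpha}{}_{K}$.

```latex
% \tilde{W}(F, P) = J_P^{-1} \hat{W}(F_e),  with  (F_e)^{i}{}_{\alpha} = F^{i}{}_{K} P^{K}{}_{\alpha}.
\begin{align*}
T^{K}{}_{i} = \frac{\partial \tilde{W}}{\partial F^{i}{}_{K}}
 &= J_P^{-1}\,\frac{\partial \hat{W}}{\partial (F_e)^{i}{}_{\alpha}}\,P^{K}{}_{\alpha}
 \quad\Longrightarrow\quad
 J_P^{-1}\,\frac{\partial \hat{W}}{\partial (F_e)^{i}{}_{\alpha}}
 = T^{L}{}_{i}\,(P^{-1})^{\alpha}{}_{L} , \\
\frac{\partial \tilde{W}}{\partial P^{K}{}_{\alpha}}
 &= -J_P^{-1}\,\hat{W}\,(P^{-1})^{\alpha}{}_{K}
    + J_P^{-1}\,\frac{\partial \hat{W}}{\partial (F_e)^{i}{}_{\alpha}}\,F^{i}{}_{K} \\
 &= -\bigl(W\,\delta^{L}{}_{K} - T^{L}{}_{i}\,F^{i}{}_{K}\bigr)\,(P^{-1})^{\alpha}{}_{L} .
\end{align*}
```

The term in parentheses is the component form of W 1_R − T·F, so that minus the derivative with respect to P is this tensor contracted with the inverse of P, as stated in (4.4).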
Accordingly,
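\[ \mathbf{b} \equiv \tilde{\mathbf{b}}\cdot\mathbf{P}^{T} = -\frac{\partial \tilde{W}}{\partial \mathbf{P}}\cdot\mathbf{P}^{T} \equiv W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} \tag{4.5} \]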
This provides an elegant geometrical definition of the quasi-static Eshelby stress b
(originally referred to as the energy-momentum or Maxwell stress by Eshelby) via
the notion of local structural rearrangement, although the final expression in (4.5)
no longer refers to this rearrangement. It is just the same as that given in the last of
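(4.2). Assuming that we just know the first of (4.1) at all regular points X, $J_F := \det \mathbf{F} > 0$, we can then compute the material divergence of b, resulting in the first two of (4.2). But we can also express the material co-vector $\mathbf{f}^{\mathrm{inh}}$ through the operation
\[ \left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{\mathrm{expl}} = (\nabla_R \mathbf{P}) : \frac{\partial \tilde{W}(\mathbf{F}, \mathbf{P})}{\partial \mathbf{P}} = -(\nabla_R \mathbf{P}) : \mathbf{b}\cdot\mathbf{P}^{-T} = \mathbf{b} : \boldsymbol{\Gamma} \tag{4.6} \]
where Γ is the (geometrical) connection based on the non-integrable mapping P; that is, in components (to avoid any misunderstanding):
\[ \Gamma^{A}{}_{B.K} := -(P^{-1})^{\alpha}{}_{.B}\,P^{A}{}_{.\alpha,K} \tag{4.7} \]
Therefore, the first of equations (4.2) also reads [8]
\[ \operatorname{div}_R \mathbf{b} = \mathbf{b} : \boldsymbol{\Gamma} \tag{4.8} \]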
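The chain of equalities leading to the connection term can be followed index by index. The sketch below assumes Cartesian components, Greek indices for the crystal reference, $\tilde{b}^{\alpha}{}_{A} := -\partial \tilde{W}/\partial P^{A}{}_{\alpha}$ (with a tilde on the energy written as a function of F and P), and the relation $\tilde{b}^{\alpha}{}_{A} = b^{B}{}_{A}(P^{-1})^{\alpha}{}_{B}$; the index placement on b is our reading of the mixed tensor, not a convention stated in the text.

```latex
\begin{align*}
\left.\frac{\partial \bar{W}}{\partial X^{K}}\right|_{\mathrm{expl}}
 &= \frac{\partial \tilde{W}}{\partial P^{A}{}_{\alpha}}\,P^{A}{}_{\alpha,K}
  = -\,\tilde{b}^{\alpha}{}_{A}\,P^{A}{}_{\alpha,K} \\
 &= -\,b^{B}{}_{A}\,(P^{-1})^{\alpha}{}_{B}\,P^{A}{}_{\alpha,K}
  = b^{B}{}_{A}\,\Gamma^{A}{}_{BK},
 \qquad
 \Gamma^{A}{}_{BK} := -(P^{-1})^{\alpha}{}_{B}\,P^{A}{}_{\alpha,K} .
\end{align*}
```

Since the inhomogeneity force is minus the explicit gradient of the energy, inserting this result in the quasi-static balance of canonical momentum gives the statement that the material divergence of b equals b contracted with the connection.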
In some geometrical theories (Bilby, Kroener, Noll, Wang) of continuous distributions of dislocations, the connection Γ is directly related to the density of dislocations: the skew part of Γ is the torsion tensor and it is set equal to the skew
tensor that represents the density of dislocations [5]. Accordingly, we can say
that in such “continuously dislocated” elastic bodies, dislocations create a material force density which is responsible for the non-divergence-free nature of the
Eshelby stress tensor. Dislocations, which originally are discrete defects, act thus
as a materially distributed inhomogeneity force in agreement with equation (4.8).
This is the equation that unifies, via (4.5), the two (geometrical and configurational)
lines in Flow chart 1. We do not pursue further here this geometrical approach to
continuously distributed defects (see more on differential geometry, the notions of
material uniformity and homogeneity, the role of material symmetry groups, crystallographic basis, transplants, G-structure, and G-covariance in [49]). For reasons
to become clear, the mapping P may also be referred to as a material transplant
[50]. We note that the unification represented by (4.8) is now followed by several
authors [51–53]. In any case the notion of dislocation density happens here to be
connected to that of local structural rearrangements. These are local and we cannot
fit the reference crystal pieces together so that this description necessarily yields
the notion of internal stresses, but the really new point is the relationship to the
Eshelby stress tensor.
4.2. THE CASE OF DISSIPATIVE MEDIA WITH INTERNAL VARIABLES
The mental operation just performed to account for true material inhomogeneities
can also be performed for thermal effects and those due to the presence of internal
variables of state (whatever their peculiar tensorial character). Because internal variables are of a more general nature than temperature, we consider by way of example the case of a materially homogeneous anelastic material treated in quasi-statics (to remain as simple as possible). Then equations (4.1) apply while equations (4.2) are replaced by
\[ \operatorname{div}_R \mathbf{b} + \mathbf{f}^{\mathrm{intr}} = \mathbf{0}; \qquad \mathbf{f}^{\mathrm{intr}} = A\,(\nabla_R \alpha)^{T}, \qquad A = -\frac{\partial \bar{W}(\mathbf{F}, \alpha)}{\partial \alpha} \tag{4.9} \]
Now let us envisage the following mental operation. Consider that it is possible, by the appropriate local change of reference P, to make the material appear as purely elastic at point X. This means that the new energy function $\hat{W}$ at X will depend only on a finite strain and no other argument, becoming indeed a function $\hat{W}(\mathbf{F}\mathbf{P}(\alpha(\mathbf{X}, t)))$. But this is now per unit volume of the new local reference configuration so that, accounting for the volume change, we should write (compare with equation (4.3))
\[ W = \bar{W}(\mathbf{F}, \alpha) = J_P^{-1}\,\hat{W}\bigl(\mathbf{F}\mathbf{P}(\alpha(\mathbf{X}, t))\bigr) = \tilde{W}(\mathbf{F}, \mathbf{P}(\alpha)) \tag{4.10} \]
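The same reasoning as in the previous paragraph yields
\[ \mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}} = \frac{\partial \tilde{W}}{\partial \mathbf{F}}; \qquad \mathbf{b} = -\frac{\partial \tilde{W}}{\partial \mathbf{P}}\,\mathbf{P}^{T} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} = \tilde{\mathbf{b}}\,\mathbf{P}^{T} \tag{4.11} \]
By introducing the local reference change or local rearrangement P(α(X)), we have in some way "subtracted" the anelastic behavior of the material at X. Now on account of the functional dependences (4.10) we can also evaluate the "force" A and obtain the following equality between the "thermodynamic" definition of A and a kind of "geometrical" definition (via P):
\[ A = \mathbf{b}\cdot\mathbf{P}^{-T}\cdot\frac{\partial \mathbf{P}^{T}}{\partial \alpha} \tag{4.12} \]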
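where the free indices are on A and α, i.e., in components, (4.12) reads
\[ A = b^{K}{}_{.L}\,(P^{-1})^{\gamma}{}_{.K}\,\frac{\partial P^{L}{}_{.\gamma}}{\partial \alpha} \tag{4.12a} \]
Obviously, this is the same as
\[ A = -\mathbf{b}\cdot\mathbf{P}\,\frac{\partial \mathbf{P}^{-1}}{\partial \alpha} \tag{4.13} \]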
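The equivalence of the two expressions for A follows from differentiating the identity P·P⁻¹ = 1. The component reading of the second expression used below is our assumption about the intended contractions (the tensor indices of α are left implicit):

```latex
\begin{align*}
P^{L}{}_{\gamma}\,(P^{-1})^{\gamma}{}_{K} = \delta^{L}{}_{K}
 \quad&\Longrightarrow\quad
 P^{L}{}_{\gamma}\,\frac{\partial (P^{-1})^{\gamma}{}_{K}}{\partial \alpha}
 = -\,\frac{\partial P^{L}{}_{\gamma}}{\partial \alpha}\,(P^{-1})^{\gamma}{}_{K} , \\
-\,b^{K}{}_{L}\,P^{L}{}_{\gamma}\,\frac{\partial (P^{-1})^{\gamma}{}_{K}}{\partial \alpha}
 &= b^{K}{}_{L}\,(P^{-1})^{\gamma}{}_{K}\,\frac{\partial P^{L}{}_{\gamma}}{\partial \alpha} .
\end{align*}
```

The right-hand side of the second line is the component expression (4.12a), so the form involving the derivative of P⁻¹ carries the opposite sign, as in (4.13).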
4.3. THE CASE WHERE α IS AN ANELASTIC STRAIN
Here we identify P with the inverse of the “anelastic” deformation gradient (in
truth, not a gradient but a Pfaffian form) in a multiplicative decomposition of F as
[17–19]
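\[ \mathbf{F} = \mathbf{F}_e \cdot \mathbf{F}_p \tag{4.14} \]
where $\mathbf{F}_e$ is the elastic component and $\mathbf{F}_p$ is the anelastic one (in fact, the subscript p stands for plasticity). Hence
\[ \mathbf{P}^{-1} \equiv \mathbf{F}_p, \qquad \mathbf{F}\mathbf{P} = \mathbf{F}\cdot\mathbf{F}_p^{-1} \equiv \mathbf{F}_e \tag{4.15} \]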
The deformation Fp defines locally an elastically released configuration at material
point X. This is also called a (local) intermediate configuration Ki [18, 20, 21]. Use
can be made of formula (4.13) to find out the thermodynamic force associated with
α through the geometrical description. With α ≡ Fp , we immediately have:
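\[ A = -\mathbf{b}\cdot\mathbf{F}_p^{-T} \tag{4.16} \]
Introducing the second Piola–Kirchhoff stress S and the "Mandel" stress M by
\[ \mathbf{S} = \mathbf{T}\cdot\mathbf{F}^{-T}, \qquad \mathbf{M} = \mathbf{T}\cdot\mathbf{F} = \mathbf{S}\cdot\mathbf{F}^{T}\cdot\mathbf{F} = \mathbf{S}\cdot\mathbf{C} \tag{4.17} \]
we obtain that
\[ \mathbf{b} = W\,\mathbf{1}_R - \mathbf{M} \qquad \text{or} \qquad \mathbf{M} = W\,\mathbf{1}_R - \mathbf{b} \tag{4.18} \]
As a consequence, equation (4.16) reads
\[ A = (\mathbf{M} - W\,\mathbf{1}_R)\cdot\mathbf{F}_p^{-T} \tag{4.19} \]
Simultaneously, the intrinsic dissipation $\Phi^{\mathrm{intr}}$ takes on the following form:
\[ \Phi^{\mathrm{intr}} = A\dot{\alpha} = \operatorname{tr}\bigl\{(\mathbf{M} - W\,\mathbf{1}_R)\cdot\mathbf{F}_p^{-T}\cdot\dot{\mathbf{F}}_p^{T}\bigr\} \tag{4.20} \]
or
\[ \Phi^{\mathrm{intr}} = \operatorname{tr}\bigl\{(\mathbf{M} - W\,\mathbf{1}_R)\cdot(\mathbf{L}_R^{p})^{T}\bigr\} \tag{4.21} \]
where
\[ \mathbf{L}_R^{p} = \dot{\mathbf{F}}_p\cdot\mathbf{F}_p^{-1} \tag{4.22} \]
is the "plastic finite-strain rate in the reference configuration $K_R$". If the plastic deformation is assumed to be incompressible, then $\operatorname{tr}\mathbf{L}_R^{p} = 0$, and (4.21) reduces formally to
\[ \Phi^{\mathrm{intr}} = \operatorname{tr}\bigl\{\mathbf{M}\cdot(\mathbf{L}_R^{p})^{T}\bigr\} \tag{4.23} \]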
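The reduction from (4.21) to (4.23) and the role of the deviatoric part of the Mandel stress can be illustrated by a simple case. The example is ours and purely illustrative: it assumes an isochoric plastic stretch along the first axis with stretch $\lambda_p(t) > 0$, expressed in Cartesian components, and writes Φ for the intrinsic dissipation.

```latex
\begin{align*}
\operatorname{tr}\bigl\{(\mathbf{M} - W\,\mathbf{1}_R)\cdot(\mathbf{L}_R^{p})^{T}\bigr\}
 &= \operatorname{tr}\bigl\{\mathbf{M}\cdot(\mathbf{L}_R^{p})^{T}\bigr\}
    - W\,\operatorname{tr}\mathbf{L}_R^{p} , \\
\mathbf{F}_p = \operatorname{diag}\bigl(\lambda_p,\ \lambda_p^{-1/2},\ \lambda_p^{-1/2}\bigr)
 \quad&\Longrightarrow\quad
 \det \mathbf{F}_p = 1, \qquad
 \mathbf{L}_R^{p} = \dot{\mathbf{F}}_p\cdot\mathbf{F}_p^{-1}
 = \frac{\dot{\lambda}_p}{\lambda_p}\,
   \operatorname{diag}\bigl(1,\ -\tfrac{1}{2},\ -\tfrac{1}{2}\bigr) , \\
\Phi^{\mathrm{intr}}
 &= \frac{\dot{\lambda}_p}{\lambda_p}
    \Bigl(M_{11} - \tfrac{1}{2}\bigl(M_{22} + M_{33}\bigr)\Bigr) .
\end{align*}
```

In this special case the plastic rate is traceless, the energy term W drops out, and only a deviatoric combination of the diagonal Mandel stress components expends power.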
In plain words this means that the Mandel stress is the driving force behind plasticity. The relation (4.23) can be as well expressed with geometrical objects pushed
forward to the intermediate configuration Ki , a more usual formulation. The present
argument clearly shows the unification of the three lines of thought of chart 1.
Obviously, the Mandel stress is only one part of the Eshelby stress. The first relationship between finite-strain plasticity and the notion of Eshelby stress in an
intermediate configuration was established by the author [54]; more on Eshelby
stress and finite-strain elastoplasticity can be found in other works [55, 56].
4.4. THE THERMOELASTIC CASE
Since the variables α and θ in preceding sections play a similar role in so far
as materials mechanics is concerned, we may consider the case of finite-strain
thermoelasticity by analogy with the anelastic case. That is, while in homogeneous thermoelastic materials the free energy is a priori a function $\bar{W}(\mathbf{F}, \theta)$, we can think of a local rearrangement of matter at point X in such a way that after this local rearrangement the free energy depends only on one finite strain (just as in pure elasticity) and no other argument, so that it depends on temperature only through this rearrangement. To comply with the notation of other papers we call $\mathbf{P}(\theta(\mathbf{X}, t)) = \mathbf{H}^{-1}(\theta)$ this local rearrangement so that, up to the notation, equation (4.10) delivers the following functional dependences:
\[ W = \bar{W}(\mathbf{F}, \theta) = J_H\,\hat{W}(\mathbf{F}\mathbf{H}^{-1}(\theta)) = \tilde{W}(\mathbf{F}, \mathbf{H}^{-1}) \tag{4.24} \]
We copy directly equation (4.12) with the appropriate change in notation, obtaining thus a relationship between the thermodynamical definition of entropy and a
geometrical-like definition via H (note that the b here is bW ) [40]:
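\[ S = \mathbf{b}_W\cdot\mathbf{H}\cdot\frac{d\mathbf{H}^{-1}}{d\theta} \tag{4.25} \]
with
\[ \mathbf{b}_W = -\frac{\partial \tilde{W}}{\partial \mathbf{H}^{-1}}\,\mathbf{H}^{-T} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} \tag{4.26} \]
Consequently, we can write the material thermal force as
\[ \mathbf{f}^{\mathrm{th}} = -\mathbf{b}\cdot\mathbf{H}^{-1}\cdot\frac{d\mathbf{H}}{d\theta}\cdot(\nabla_R \theta) \tag{4.27} \]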
4.5. MAGNETOELASTICITY OF NONUNIFORMLY MAGNETIZED BODIES
In that case we replace θ or α by the material magnetization vector m per unit
volume of the reference configuration. By analogy with the two previous cases we
have thus
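\[ \mathbf{H}_L = \mathbf{b}\cdot\mathbf{P}\cdot\frac{\partial \mathbf{P}^{-1}}{\partial \mathbf{m}}, \qquad \mathbf{f}^{\mathrm{magn}} = \mathbf{b}\cdot\mathbf{P}\cdot\frac{\partial \mathbf{P}^{-1}}{\partial \mathbf{m}}\cdot(\nabla_R \mathbf{m})^{T} \tag{4.28} \]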
5. Configurational Forces
Although it has become customary to refer to the above material forces (contributions in the balance of canonical momentum) as configurational forces, we prefer
to call configurational forces those quantities that are deduced from the balance of
canonical momentum by some operation such as integration over a singular region
(and shrinking to a singular point if this is the case) or taking the jump across a
singular manifold (this is also obtained by volume integration over a region overlapping the singular surface and then flattening this region on the surface). In both
cases the definition involves both an integration and a limiting procedure yielding
a nonzero quantity by virtue of the present singularity. Accordingly, configurational forces are here associated with field singularities. The latter thus appear as
pseudo-inhomogeneities in their own right. Such configurational forces acquire a
true physical meaning only in so far as the power they expend in an irreversible
motion of the singularity set is none other than a dissipation. They are clearly
related primarily not to the dissipative behavior of the bulk material per se but to
an irreversibility due to the time evolution of the volume of integration, e.g., during
the irreversible progress of the crack tip into the material in the case of fracture.
To arrive at a consistent “material force”–“energy change” formulation, one must
integrate both canonical momentum and energy, a point that escapes the attention
of many authors. The material may simultaneously be dissipative and smoothly
materially inhomogeneous in the bulk. Then one must account for the sources of
canonical momentum exhibited in previous sections. Without dealing with this in
detail but just to show the completeness and unification of concepts, we briefly
remind the reader of the two cases of fracture (propagation of a singularity line
viewed as a point in the plane) and propagation of a singularity surface (shock
wave and phase-transition front viewed as a line in the plane).
5.1. FRACTURE
In the case of a macroscopic sharp straight through crack seen as the uniform limit
of a family of regular rounded notches with radius going to zero, Dascalu and
Maugin [57], starting from the appropriately reduced form of equations (3.13) and
(3.5) at any regular material point X, implemented the above mentioned procedure
rigorously for a purely elastic homogeneous material. They showed that the global
material force or configurational force K (we denote this force by K, like Kraft in German, to avoid any confusion with the deformation gradient F; P here again is the material momentum and not the structural rearrangement of Section 4) acting on the crack tip and the associated energy release rate G are given by
\[ \mathbf{K} = \int_{\Gamma} \bigl(\mathbf{N}\cdot\mathbf{b} + \mathbf{P}\,(\bar{\mathbf{V}}\cdot\mathbf{N})\bigr)\, dS - \frac{\partial}{\partial t}\int_{B} \mathbf{P}\, dV \tag{5.1} \]
and
\[ G = \int_{\Gamma} \bigl(H\,(\bar{\mathbf{V}}\cdot\mathbf{N}) + \mathbf{N}\cdot(\mathbf{T}\cdot\mathbf{v})\bigr)\, dS - \frac{\partial}{\partial t}\int_{B} H\, dV \tag{5.2} \]
where H = E + K is the total energy (Hamiltonian) density based on the internal energy, $\bar{\mathbf{V}}$ is the material velocity of the crack tip, and B is the regular region bounded by the inside border Γ in the material (with unit outward normal N) and the stress-free faces of the crack. The quantities K and G are related by the dissipation relation
\[ G = K_1 \bar{V}_1 \ge 0 \tag{5.3} \]
where $\bar{V}_1$ and $K_1$ are components of $\bar{\mathbf{V}}$ and $\mathbf{K}$ in the direction of extension of
the crack. The powerful result (5.3) holds in the limit as the volume B shrinks
uniformly to the crack tip. Since there are no thermal effects in equations (5.1)–
(5.3), it does not matter whether the Hamiltonian H is based on the internal or
free energy and we do not distinguish between bW and bE . However, when true
and pseudo-inhomogeneities are present the starting point may be equation (3.13),
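which emphasizes the presence of these additional effects, and the following transformed expression of equation (3.5) obtained by accounting for the first of (3.21) and the Legendre–Fenchel transform (3.25), that is,
\[ \frac{\partial H_W}{\partial t} - \nabla_R\cdot(\mathbf{T}\cdot\mathbf{v}) + \Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}} = 0 \tag{5.4} \]
where $H_W = W + K$ is the total energy (Hamiltonian) density based on the free energy. The volume integral and shrinking limit (if necessary) are now applied to equations (3.13) and (5.4). One obtains thus in place of equation (5.1)
\[ \mathbf{K} = \int_{\Gamma} \bigl(\mathbf{N}\cdot\mathbf{b}_W + \mathbf{P}\,(\bar{\mathbf{V}}\cdot\mathbf{N})\bigr)\, dS - \frac{\partial}{\partial t}\int_{B} \mathbf{P}\, dV - \int_{B} \bigl(\mathbf{f}^{\mathrm{inh}} + \mathbf{f}^{\mathrm{th}} + \mathbf{f}^{\mathrm{intr}}\bigr)\, dV \tag{5.5} \]
where $\mathbf{b}_W$ is the dynamical Eshelby stress based on the free energy. Under the same conditions, equation (5.2) is replaced by
\[ G = \int_{\Gamma} \bigl(H_W\,(\bar{\mathbf{V}}\cdot\mathbf{N}) + \mathbf{N}\cdot(\mathbf{T}\cdot\mathbf{v})\bigr)\, dS - \frac{\partial}{\partial t}\int_{B} H_W\, dV + \int_{B} \bigl(\Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}}\bigr)\, dV \tag{5.6} \]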
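The free-energy form of the local energy balance used above can be recovered from (3.5) in two steps. The sketch below writes $\Phi^{\mathrm{th}}$ and $\Phi^{\mathrm{intr}}$ for the thermal and intrinsic source terms of (3.21) (the symbol Φ is our notation) and takes all time derivatives at fixed X.

```latex
\begin{align*}
0 &= \frac{\partial (K + E)}{\partial t}
     - \nabla_R\cdot(\mathbf{T}\cdot\mathbf{v} - \mathbf{Q})
   = \frac{\partial (K + W)}{\partial t}
     + \Bigl[\frac{\partial (S\theta)}{\partial t} + \nabla_R\cdot\mathbf{Q}\Bigr]
     - \nabla_R\cdot(\mathbf{T}\cdot\mathbf{v})
   && (E = W + S\theta) \\
  &= \frac{\partial H_W}{\partial t}
     - \nabla_R\cdot(\mathbf{T}\cdot\mathbf{v})
     + \Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}}
   && \text{(first of (3.21), } H_W = W + K\text{)} .
\end{align*}
```

The heat flux no longer appears explicitly: it has been absorbed, together with the rate of Sθ, into the two dissipation-like source terms, which is why these reappear as volume integrals in the expression of G.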
Fortunately, the material force $\mathbf{f}^{\mathrm{inh}}$, which has no counterpart in (5.6), has no dissipation content, so that (5.5) and (5.6) are again shown to be consistent. Indeed, accounting for the order of singularity of the fields α and θ at the crack tip, we have limit expressions at the crack tip such as [57]
\[ \mathbf{f}^{\mathrm{th}}\cdot\bar{\mathbf{V}} \approx -\Phi^{\mathrm{th}}, \qquad \mathbf{f}^{\mathrm{intr}}\cdot\bar{\mathbf{V}} \approx -\Phi^{\mathrm{intr}} \tag{5.7} \]
from which it follows by using the same argument as for the purely elastic case
[57] that with the dual expressions (5.5)–(5.6) the result (5.3) still holds true.
5.2. SINGULAR SURFACES
In the case of propagating singular surfaces Σ (with unit oriented normal N) entirely described in the material framework, it is clear that the presence of such
a surface breaks the translational invariance on the material manifold, since the
material will in general have acquired different material properties on both sides
of the surface. Accordingly, the central equation (the one which will deliver the driving force on the singular surface) is the jump relation associated with the regular bulk equation (3.13), because this is the equation which contains the "material force" generated by a material displacement of Σ on the material manifold. This general problem was dealt with by the author [37, 58, 59] along this line
of thought. For a general singular surface (however not equipped with its own mass
and energy) of the shock wave type (characterized by a finite discontinuity in the
physical velocity field v and possibly in the other fields θ and α), one can establish
by various means the following two equations that relate to the lack of conservation
of pseudomomentum and entropy across ' although the jump equations associated
with the physical momentum and energy (see equations (5.11)–(5.13) below) do
not reveal source terms:
\[
\mathbf{N}\cdot[\mathbf{b}^{W} + \bar{\mathbf{V}}\otimes\mathbf{P}] + \mathbf{f}_{\Sigma} = \mathbf{0}
\tag{5.8}
\]
and
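\[
\mathbf{N}\cdot\Bigl[\bar{\mathbf{V}}S - \frac{\mathbf{Q}}{\theta}\Bigr] + \sigma_{\Sigma} = 0,
\tag{5.9}
\]
with the constraint (second law of thermodynamics at $\Sigma$)
\[
\sigma_{\Sigma} \ge 0.
\tag{5.10}
\]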
Equations (5.8) and (5.9) can be viewed as uniform limits obtained at $\Sigma$ by shrinking (flattening) a volume overlapping $\Sigma$ (the so-called “pill-box” method). In that
view the source term in (5.8), $\mathbf{f}_{\Sigma}$, is the formal representation of the limit of
the volume integral of the pseudo-inhomogeneity forces; because of the singularity of the surface, this integral does not converge to zero in the limit. The same holds
true of the surface entropy source $\sigma_{\Sigma}$, which is a phenomenological representation
of the term obtained in the same limit procedure starting with a volume integral of
the sources of entropy in equation (3.11). Again, the true inhomogeneity force $\mathbf{f}^{\mathrm{inh}}$,
contrary to $\mathbf{f}^{\mathrm{th}}$ and $\mathbf{f}^{\mathrm{intr}}$, does not produce any entropy. Accordingly, the interesting
relationship is the one that relates the unknown driving force $\mathbf{f}_{\Sigma}$ and the equally
unknown (but non-negative) surface entropy source $\sigma_{\Sigma}$. If the theory is consistent,
these cannot be entirely independent. The consistency condition sought in fact
allows one to close the system of phenomenological equations at $\Sigma$ in compliance with the second law. As a matter of fact, accounting for the jump equations
associated with mass, physical motion and energy, i.e., corresponding to the bulk
equations (3.3)–(3.5), across $\Sigma$,
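\[
(\bar{\mathbf{V}}\cdot\mathbf{N})[\rho_{0}] = 0,
\tag{5.11}
\]
\[
\mathbf{N}\cdot[\mathbf{T} + \bar{\mathbf{V}}\otimes\mathbf{p}] = \mathbf{0},
\tag{5.12}
\]
\[
\mathbf{N}\cdot[\bar{\mathbf{V}}H + \mathbf{T}\cdot\mathbf{v} - \mathbf{Q}] = 0,
\tag{5.13}
\]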
one can show in all generality [12] that we have the following relationship
[SN]
−1
V + N · Q[θ −1 ] 0,
(5.14)
σ' = −θ f' + −1 · 1
θ
where the symbols $[\,\cdot\,]$ and $\langle\,\cdot\,\rangle$ denote, respectively, the jump and mean value of
the enclosed quantity at $\Sigma$ and we have
f' · 1
V = − E(1
V · N) − N · T · F · 1
V .
(5.15)
For classical shock waves (in the so-called inconsistent theory where a dissipative interface across which entropy grows is supposed to connect two regions
nonetheless in adiabatic regime!), one sets
f' = 0,
∀1
V = 0,
(5.16)
and there remains the trivial relation $\sigma_{\Sigma} = -[S](\bar{\mathbf{V}}\cdot\mathbf{N}) \ge 0$, which tells in which
direction (with respect to $\mathbf{N}$) the wave front moves to guarantee an increase in
entropy. Projected onto the unit normal $\mathbf{N}$, the first of (5.16) then is none other than
the celebrated Hugoniot equation of shock-wave theory, i.e.,
HugoSW := [E − N · T · F · N] = 0.
(5.17)
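For coherent phase-transition fronts for which
\[
[\mathbf{V}] = \mathbf{0}, \qquad [\theta] = 0
\tag{5.18}
\]
across $\Sigma$, the above-given formula (5.14) reduces to
\[
\sigma_{\Sigma} = \theta_{\Sigma}^{-1}\,\mathbf{f}_{\Sigma}\cdot\bar{\mathbf{V}} = \theta_{\Sigma}^{-1} f_{\Sigma}\bar{V}_{N} \ge 0,
\tag{5.19}
\]
where $\theta_{\Sigma}$ is the value of $\theta$ at $\Sigma$, $\bar{V}_{N} = \bar{\mathbf{V}}\cdot\mathbf{N}$ is the normal speed of $\Sigma$, and the
scalar surface driving force $f_{\Sigma}$ is given by
\[
f_{\Sigma} = -\mathrm{Hugo}_{PT}, \qquad
\mathrm{Hugo}_{PT} := [W - \mathbf{N}\cdot\mathbf{T}\cdot\mathbf{F}\cdot\mathbf{N}]
\tag{5.20}
\]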
and is generally not zero (it is zero for the nondissipative Landau theory of phase
transitions, where the vanishing of $f_{\Sigma}$ is a mathematical statement akin, in the
appropriate state space, to “Maxwell’s rule of equal areas” in the construction
of the so-called Maxwell line). Another way to derive these relations at $\Sigma$ has
been developed by the author [58, 59] by introducing a single scalar
quantity, namely a generating function or Massieu thermodynamical potential $M$,
from which both $f_{\Sigma}$ and $\sigma_{\Sigma}$ are consistently derived.
Equations (5.17) and (5.20) illustrate perfectly the need for distinguishing between internal and free energies. They emphasize the use of one or the other
depending on the thermal conditions of the considered process across the wave
front. Accordingly, we may say that the study of the thermodynamics of shock
waves is based on the vanishing of the jump of the normal component of the (quasistatic) Eshelby stress built on the internal energy, while the study of the propagation
of phase-transition fronts is based essentially on the consideration of the value of
the jump of the normal component of the (quasi-static) Eshelby stress built on the
free energy [47, 58]. As a matter of fact, equation (5.19) is in the familiar form
of the product of a thermodynamical force and a generalized velocity (here a true
velocity). This hints at the fact that although originally defined in terms of usual
fields (Piola–Kirchhoff stress, finite deformation, energy density, temperature), the
configurational forces should be involved in kinetic relations (in a general way, relationships between material velocity of the singularity set and conjugated driving
force), which should obey the second law of thermodynamics. In this view shared
by many works of Abeyaratne and Knowles [60, 61], the configurational forces
are essentially secondary quantities that are exploited in criteria of progress of
the singularity set (or “defects” or “localized inhomogeneities”) and not primary
quantities on which the solution of an original boundary-value problem can be
built. This is in contradiction with Gurtin’s point of view [35] where configurational
forces seem (a priori only, this is our own remark) to exist independently of the
physical world (e.g., the classical balance of physical momentum or its jump in the
last studied case, or the bulk field equation for a more abstract dependent field).
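As a purely illustrative sketch of what such a kinetic relation can look like, the following Python fragment closes (5.19) with a linear law for the normal speed. The linear form and the names `normal_speed`, `surface_entropy_source` and `mobility` are assumptions made here for illustration and are not taken from the text; the only point is that a non-negative mobility makes the surface entropy source non-negative for either sign of the driving force.

```python
def normal_speed(driving_force, mobility):
    """Hypothetical linear kinetic relation V_N = mobility * f_Sigma."""
    if mobility < 0.0:
        raise ValueError("a negative mobility would violate the second law")
    return mobility * driving_force


def surface_entropy_source(driving_force, speed, theta):
    """Scalar form of (5.19): sigma_Sigma = f_Sigma * V_N / theta_Sigma."""
    if theta <= 0.0:
        raise ValueError("absolute temperature must be positive")
    return driving_force * speed / theta


# Sample driving forces of both signs (arbitrary illustrative values).
SOURCES = [
    surface_entropy_source(f, normal_speed(f, mobility=0.5), theta=300.0)
    for f in (-2.0, 0.0, 3.0)
]
print(SOURCES)
```

With this closure the product of force and velocity is a non-negative quadratic form in the driving force, which is the simplest way of complying with (5.10).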
6. Conclusion
At the end of Section 3 we have already clearly unified the notions of true and
pseudo-inhomogeneities by their parallel contributions to the balance of canonical momentum or its degenerate equilibrium form. The three lines of chart 1
are now unified through the dual notions of Eshelby stress and local structural
rearrangement. The latter may be of other types than those exhibited here, e.g.,
phase transformations or material growth. They may be of a general deformation
type (up to a rotation and the local material symmetry of the material), essentially
of the shear type (plasticity), or of the isotropic dilatation type (thermoelasticity,
growth unless these two have directional properties). In each case eigenstrains are
involved, e.g., transformation strains in the case of phase transformation. In the
case of growth of materials of the physiological type (such as in bone remodelling
or the mechanics of soft tissues), the local rearrangement was called “transplant”,
emphasizing the local nature, for obvious “surgical” reasons [50]. In the case
of the theory of inhomogeneity, it was called “inhomogeneity map” [8–10]. We
note to conclude that other generalizations of the concepts presented globally in
this contribution apply to media with higher order deformation gradients than in
classical hyperelasticity (so-called “weakly” nonlocal theory [5, 36]), to additional
internal degrees of freedom [62], to dissipative internal variables exhibiting also
a weak nonlocality [37], and to electromagnetic deformable bodies [43, 63].
Acknowledgements
The author benefits from a Max Planck Award for International Co-operation (2001–2005).
He acknowledges his debt to the referees.
References
1. E. Kroener, Inneren Spannungen und der Inkompatibilitätstensor in der Elastizitätstheorie.
Z. Angew. Phys. 7 (1958) 249–257.
2. V.L. Indenbom, Internal stress in crystals. In: B. Gruber (ed.), Theory of Crystal Defects,
Proc. of Summer School, Hrazany, Czech, September 1964. Acad. Publ. House, Prague, and
Academic Press, New York (1965) pp. 257–274.
3. M. Kleman, Dislocations, disclinations and magnetism. In: F.R.N. Nabarro (ed.), Dislocations
in Solids, Vol. 5. North-Holland, Amsterdam (1980) pp. 100–215.
4. G.A. Maugin, Classical magnetoelasticity in ferromagnets with defects. In: H. Parkus (ed.),
Electromagnetic Interactions in Elastic Solids, CISM Udine Course (1977). Springer, Vienna
(1979) pp. 243–324.
5. G.A. Maugin, Material Inhomogeneities in Elasticity. Chapman and Hall, London (1993).
6. G.A. Maugin, Material forces: Concepts and applications. ASME Appl. Mech. Rev. 48 (1995)
213–245.
7. G.A. Maugin, Thermomechanics of Nonlinear Dissipative Behaviors. World Scientific, Singapore, and River Edge, NJ (1999).
8. M. Epstein and G.A. Maugin, The energy-momentum tensor and material uniformity in finite
elasticity. Acta Mech. 83 (1990) 127–133.
9. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal.
27 (1967) 1–32.
10. C.C. Wang, On the geometric structure of simple bodies, or mathematical foundations for
the theory of continuous distributions of dislocations. Arch. Rational Mech. Anal. 27 (1967)
33–94.
11. C.A. Truesdell and W. Noll, Nonlinear field theories of mechanics. In: S. Flügge (ed.),
Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
12. G.A. Maugin, Kröner–Eshelby approach to continuum mechanics with dislocations, material
inhomogeneities and pseudo-inhomogeneities. In: B. Maruzewski (ed.), Proc. of Internat. Sympos. on Structured Media in Memory of E. Kröner, Poznan, Poland, September 2001. Poznan
Univ. Press, Poland (2001) pp. 182–195.
13. G.A. Maugin, Geometry and thermomechanics of structural rearrangements: Ekkehart
Kroener’s legacy, GAMM’2002, Kroener’s Lecture, Augsburg (2002). Z. Angew. Math. Mech.
83 (2002) 75–83.
14. B.A. Bilby, L.R.T. Lardner and A.N. Stroh, Continuum theory of dislocations and the theory
of plasticity. In: Proc. of the Xth ICTAM, Brussels, 1956. Presses de l’Université de Bruxelles,
Vol. 8 (1957) pp. 35–44.
15. E. Kroener, Kontinuumstheorie der Versetzungen und Eigenspannungen. Springer, Berlin
(1958).
16. E. Kroener and A. Seeger, Nicht-lineare Elastizitätstheorie und Eigenspannungen. Arch.
Rational Mech. Anal. 3 (1959) 97–119.
17. E.H. Lee, Elastic-plastic deformation at finite strain. ASME Trans. J. Appl. Mech. 36 (1969)
1–6.
18. J. Mandel, Plasticité et Viscoplasticité Classique, CISM Udine Course. Springer, Vienna
(1971).
19. J. Lubliner, Plasticity Theory. Macmillan, New York (1990).
20. C. Teodosiu and F. Sidoroff, A theory of finite elastoplasticity in single crystals. Internat. J.
Engrg. Sci. 14 (1976) 165–176.
21. F. Sidoroff, Variables internes en viscoélasticité et viscoplasticité. State Doctoral Thesis in
Mathematics, Université Pierre et Marie Curie, Paris (1976).
22. K. Kondo, On the geometrical and physical foundations of the theory of yielding. In: Proc. of
the 2nd Japanese National Congress of Applied Mechanics, Kyoto (1952) pp. 41–47.
23. K. Kondo, Non-Riemannian geometry of imperfect crystals from a macroscopic viewpoint. In:
K. Kondo (ed.), RAAG Memoirs of the Unifying Study of Basic Problems in Engineering and
Physical Sciences by Means of Geometry, Vol. 1. Gakujutsu Bunken Fukyukai, Tokyo (1955)
pp. 459–480.
24. M.O. Peach and J.S. Koehler, The force exerted on dislocations and the stress field produced
by them. Phys. Rev. II-80 (1950) 436–439.
25. J.D. Eshelby, The force on an elastic singularity. Phil. Trans. Roy. Soc. London A 244 (1951)
87–112.
26. J.R. Rice, Path-independent integral and the approximate analysis of strain concentrations by
notches and cracks. Trans. ASME J. Appl. Mech. 33 (1968) 379–385.
27. L.D. Landau and E.M. Lifshitz, Theory of Fields. Mir, Moscow (1965).
28. D. Rogula, Forces in material space. Arch. Mech. 29 (1967) 705–715.
29. G.A. Maugin, Magnetized deformable media in general relativity. Ann. Inst. Henri Poincaré A
15 (1971) 275–302.
30. A. Golebiewska-Herrmann, On conservation laws of continuum mechanics. Internat. J. Solids
Struct. 17 (1981) 1–9.
31. R. Kienzler and G. Herrmann, Mechanics in Material Space. Springer, Berlin (2000).
32. R. Kienzler and G.A. Maugin (eds), Configurational Mechanics of Materials. Springer, Vienna
(2001).
33. M.E. Gurtin, The characterization of configurational forces. Arch. Rational Mech. Anal. 126
(1994) 387–394.
34. M.E. Gurtin, On the nature of configurational forces. Arch. Rational Mech. Anal. 131 (1995)
67–100.
35. M.E. Gurtin, Configurational Forces as Basic Concepts of Continuum Physics. Springer, Berlin
(1999).
36. G.A. Maugin and C. Trimarco, Pseudo-momentum and material forces in nonlinear elasticity:
Variational formulation and application to fracture. Acta Mech. 94 (1992) 1–28.
37. G.A. Maugin, Thermomechanics of inhomogeneous-heterogeneous systems: Application to the
irreversible progress of two- and three-dimensional defects. ARI 50 (1997) 41–56.
38. J.L. Ericksen, Special topics in elastostatics. In: C.-S. Yih (ed.), Advances in Applied Mechanics, Vol. 17. Academic Press, New York (1977) pp. 189–244.
39. G.A. Maugin, On Ericksen–Noether identity and material balance laws in thermoelasticity and
akin phenomena. In: R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics
and Mathematics of Materials (J.L. Ericksen’s 70th Anniversary Volume). C.I.M.N.E.,
Barcelona (1996) pp. 397–407.
40. M. Epstein and G.A. Maugin, Thermoelastic material forces: definition and geometric aspects.
C. R. Acad. Sci. Paris II 320 (1995) 63–68.
41. H.D. Bui, Mécanique de la Rupture Fragile. Masson, Paris (1978).
42. G.A. Maugin, Continuum Mechanics of Electromagnetic Solids. North-Holland, Amsterdam
(1988).
43. A. Fomèthe and G.A. Maugin, Material forces in thermoelastic ferromagnets. Cont. Mech.
Thermodyn. 8 (1996) 275–292.
44. G.A. Maugin, On the universality of the thermomechanics of forces driving singular sets. Arch.
Appl. Mech. 70 (2000) 31–45.
45. G.A. Maugin, Universality of the thermomechanics of forces driving singular sets in continuum
mechanics. In: 20th ICTAM, Paper QG2. Chicago (August 2000).
46. J. Kijowski and G. Magli, Unconstrained Hamiltonian formulation of general relativity with
thermo-elastic sources. Classical Quantum Grav. 15 (1998) 3891–3916.
47. R. Abeyaratne and J.K. Knowles, A note on the driving traction acting on a propagating interface: Adiabatic and non-adiabatic processes in a continuum. ASME Trans. J. Appl. Mech. 67
(2000) 829–831.
48. G.A. Maugin, Remarks on Eshelbian thermomechanics of materials. In: S. Cleja-Tigoiu and
V. Tigoiu (eds), Proc. of the 5th Internat. Seminar on Geometry, Continua and Microstructure.
Publ. House of Romanian Acad. Sciences, Bucharest (2001) pp. 159–166.
49. M. Epstein and G.A. Maugin, Notions of material uniformity and homogeneity. In: T. Tatsumi
(ed.), Theoretical and Applied Mechanics, Proc. of ICTAM’96, Kyoto. Elsevier, Amsterdam
(1997) pp. 201–215.
50. M. Epstein and G.A. Maugin, Thermomechanics of volumetric growth in uniform bodies.
Internat. J. Plasticity 16 (2000) 951–978.
51. K. Ch. Le, Thermodynamically based constitutive equations for single crystals. In: G.A. Maugin
(ed.), 1st Internat. Seminar on Geometry, Continua and Microstructure. Hermann, Paris (1999)
pp. 87–97.
52. M.E. Gurtin and P. Cermelli, The characterization of geometrically necessary dislocations in
finite plasticity. In: 20th ICTAM, Paper FG1. Chicago (August 2000).
53. P. Steinmann, Views on multiplicative elastoplasticity and the continuum theory of dislocations.
Internat. J. Engrg. Sci. 34 (1996) 1717–1735.
54. G.A. Maugin, Eshelby stress in plasticity and fracture. Internat. J. Plasticity 10 (1994) 393–408.
55. M. Epstein and G.A. Maugin, On the geometrical material structure of unelasticity. Acta Mech.
115 (1995) 19–131.
56. S. Cleja-Tigoiu and G.A. Maugin, Eshelby’s stress tensors in finite elastoplasticity. Acta Mech.
139 (2000) 19–131.
57. C. Dascalu and G.A. Maugin, Forces matérielles et taux de restitution de l’énergie dans les
corps élastiques homogènes avec défauts. C. R. Acad. Sci. Paris II 317 (1993) 1135–1140.
58. G.A. Maugin, On shock waves and phase-transition fronts in continua. ARI 50 (1998) 145–150.
59. G.A. Maugin, Thermomechanics of forces driving singular point sets. Arch. Mech. 50 (1998)
477–487.
60. R. Abeyaratne and J.K. Knowles, Driving traction acting on a surface of strain discontinuity in
a continuum. J. Mech. Phys. Solids 38 (1990) 345–360.
61. R. Abeyaratne and J.K. Knowles, Kinetic relations and the propagation of phase boundaries in
elastic solids. Arch. Rational Mech. Anal. 114 (1991) 119–154.
62. G.A. Maugin, On the structure of the theory of polar elasticity. Phil. Trans. Roy. Soc. London
A 356 (1998) 1367–1395.
63. G.A. Maugin and C. Trimarco, Driving force on phase transition fronts in thermoelectroelastic
crystals. Math. Mech. Solids 2 (1997) 199–214.
On the Microscopic Interpretation of Stress and
Couple Stress
A. IAN MURDOCH
Department of Mathematics, University of Strathclyde, Livingstone Tower, 26 Richmond Street,
Glasgow G1 1XH, U.K. E-mail: aim@maths.strath.ac.uk
Received 18 September 2002; in revised form 25 February 2003
Abstract. Exact continuum forms of balance (for mass, linear momentum, and tensor-valued moment of momentum) are established as relations between weighted spatial averages of corpuscular
quantities computed at any supra-molecular length scale. Explicit expressions for stress and generalised couple stress in terms of particle interactions are obtained using a theorem due to Noll, and
their physical interpretation is discussed for a specific choice of weighting function. Remarks are
made on other choices of weighting function, the interpretation of partial stress in mixture theory, a
link between couple stress and inhomogeneity, and other forms of moment of momentum balance.
Comparison is made with the statistical mechanical viewpoint pioneered by Irving and Kirkwood.
Mathematics Subject Classifications (2000): 70F, 74A.
Key words: stress, couple stress, microscopic interpretation, weighting function.
Dedicated to the memory of Clifford Truesdell
1. Introduction
Modelling molecules as interacting point masses, Irving and Kirkwood [1] studied
the molecular basis of the equations of hydrodynamics within the framework of
classical statistical mechanics. Explicit expressions were obtained for the separate
contributions to the stress tensor which derive from momentum transport and from
interactions. The physical and geometrical interpretations of these contributions
were clear and simple, but the formal manipulation of series expansions of Dirac
δ distributions (central to the analysis) was not justified. Noll [2] showed how
the same results could be obtained using a theorem whose proof requires only
undergraduate-level multivariable calculus. Pitteri [3] extended Noll’s analysis to
very general interactions, and later [4] discussed couple stress from the same,
statistical mechanical, viewpoint.
The field values which appear in [1–4] are strictly local, both in space and
time. However, as pointed out in [1, p. 821], in principle the values of fields in
deterministic continuum mechanics should be identified with averages of local
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 599–625.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
measurements made in oft-repeated experiments. Since any local measurement has
associated scales of length and time, it was proposed in [1] that continuum field
values be identified with those obtained after a further averaging in both space
and time. Such additional averaging was not undertaken, however. Since the field
values in [1] are strictly-local ensemble averages, the foregoing proposal implicitly
assumes that space-time averaging of an ensemble average should be equivalent to
averaging oft-repeated space-time averages.
The foregoing motivates the formulation of continuum relations in which field
values are directly identifiable with averages of molecular quantities computed at
specific scales of length and time. These field values can then be related to measurements at these scales in individual experiments. Such formulation was undertaken
by Murdoch and Bedeaux [5], who employed weighting function methodology.
The main purpose of this work is to review the nature of spatial averaging using
weighting functions, to indicate the central role of the aforementioned theorem of
Noll, and to highlight the somewhat subtle interpretation of the interaction stress
and couple-stress tensors. The existence and explicit form of the couple stress
tensor, together with the physical/geometrical interpretation of stress and couple
stress, constitute the new aspects of this contribution.
In Section 2 relations expressing mass conservation, together with balances of
linear momentum and (rank two tensor-valued) moment of momentum, are derived
in terms of corpuscular quantities and general choice of weighting function. Noll’s
theorem is used to establish the existence and explicit forms of the stress and
couple-stress tensors. Choice of a simple scale-dependent weighting function is
made in Section 3, and the physical and geometrical interpretation of all fields
is discussed. Other choices of weighting function are considered in Section 4,
together with the interpretation of partial stress in mixtures, the link between couple stress and inhomogeneity, the difference between moment of momentum here
derived and those usually postulated, and the statistical mechanical approach of [1].
2. Continuum Relations Derived on the Basis of Particle Mechanics
2.1. KINEMATICS AND MASS CONSERVATION
Consider a material system M of distinguishable molecules, modelled as a system
of interacting point masses labelled Pi (i = 1, 2, . . . , N), whose masses, locations,
and velocities at instant t are denoted by mi , xi (t), and vi (t), respectively. Local
spatial averages of additive corpuscular quantities may be computed in terms of
a weighting function. For example, the mass density appropriate to a choice w of
weighting function is
\[
\rho_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.1}
\]
To make physical sense, w should assign greater contributions to the sum from
molecules near the geometrical point x than from those far from x, and also have
physical dimension (length)$^{-3}$. Further, if $\rho_w$ is to be identified with mass density
as employed in continuum mechanics then it should be differentiable both in space
and time. However, from (2.1), any such regularity of $\rho_w$ is inherited from that
of $w$. Accordingly $w$ is required to be of class $C^{1}$ on the space $\mathcal{V}$ of displacements
in Euclidean space $\mathcal{E}$. Additionally, the integral of $\rho_w$ over $\mathcal{E}$ should yield the total
mass of the system. If there is only one particle in $\mathcal{M}$ then necessarily
\[
\int_{\mathcal{V}} w = 1.
\tag{2.2}
\]
This normalisation condition also (trivially) suffices to yield the property for any
number of particles. At this stage no further restrictions will be imposed upon w.
Although the physical interpretation of fields is crucially dependent upon the choice
of w, in what follows the forms of the relations these fields satisfy is independent
of such choice.
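The role of the normalisation condition can be checked numerically. The following Python sketch evaluates the mass density of (2.1) for three point masses and integrates it over a cube large enough to contain essentially all of the weighting. The Gaussian form of the weighting function, the particle data and the names `gaussian_w`, `mass_density` and `integrate_over_cube` are assumptions introduced here for illustration only; they are not the choice of weighting function made later in the paper.

```python
import math


def gaussian_w(u, eps):
    """Gaussian weighting function of a displacement u = (u1, u2, u3).

    Assumed form, chosen only for illustration: it is smooth, has
    dimension (length)^-3 and integrates to 1 over all displacements.
    """
    r2 = u[0] ** 2 + u[1] ** 2 + u[2] ** 2
    return math.exp(-r2 / (2.0 * eps ** 2)) / ((2.0 * math.pi) ** 1.5 * eps ** 3)


def mass_density(x, masses, positions, eps):
    """rho_w(x) = sum_i m_i w(x_i - x), as in (2.1), at one instant."""
    return sum(
        m * gaussian_w((p[0] - x[0], p[1] - x[1], p[2] - x[2]), eps)
        for m, p in zip(masses, positions)
    )


def integrate_over_cube(f, half_width, n):
    """Midpoint-rule integral of f over the cube [-half_width, half_width]^3."""
    h = 2.0 * half_width / n
    pts = [-half_width + (k + 0.5) * h for k in range(n)]
    return h ** 3 * sum(f((a, b, c)) for a in pts for b in pts for c in pts)


# Hypothetical data: three point masses near the origin.
MASSES = [1.0, 2.0, 0.5]
POSITIONS = [(0.0, 0.0, 0.0), (0.3, -0.2, 0.1), (-0.4, 0.1, 0.2)]
EPS = 0.25  # averaging length scale (illustrative value)

total_mass = integrate_over_cube(
    lambda x: mass_density(x, MASSES, POSITIONS, EPS), 2.5, 40
)
print(total_mass, sum(MASSES))
```

The two printed numbers should agree to within quadrature error, whatever the particle positions, provided the cube contains the support of the weighting to working accuracy.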
Holding x fixed in (2.1),
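\[
\frac{\partial \rho_w}{\partial t}
 = \sum_{i=1}^{N} m_i \nabla w \cdot \mathbf{v}_i
 = -\sum_{i=1}^{N} m_i \nabla_{\mathbf{x}} w \cdot \mathbf{v}_i
 = -\sum_{i=1}^{N} m_i \operatorname{div}\{\mathbf{v}_i w\}
 = -\operatorname{div} \mathbf{p}_w ,
\tag{2.3}
\]
where
\[
\mathbf{p}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x})
\tag{2.4}
\]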
denotes the momentum density appropriate to w. Here ∇w denotes the derivative
of w with respect to its argument u(:= xi (t) − x), ∇x w denotes the gradient of w
regarded as a function of location x, and in introducing the divergence it has been
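noted that $\mathbf{v}_i$ is independent of $\mathbf{x}$. Wherever $\rho_w \neq 0$, the corresponding velocity
field is
\[
\mathbf{v}_w := \frac{\mathbf{p}_w}{\rho_w}.
\tag{2.5}
\]
Thus from (2.3) and (2.5)
\[
\frac{\partial \rho_w}{\partial t} + \operatorname{div}\{\rho_w \mathbf{v}_w\} = 0.
\tag{2.6}
\]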
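Relation (2.3), and hence (2.6), is an identity for any particle motion, which can be confirmed by finite differences. The Python sketch below assumes a Gaussian weighting function and particles in uniform rectilinear motion; both are illustrative assumptions, as are the names `weights`, `rho`, `momentum` and `continuity_residual`.

```python
import math

EPS = 0.25  # averaging length scale (illustrative value)

# Hypothetical particles: (mass, position at t = 0, constant velocity).
PARTICLES = [
    (1.0, (0.0, 0.0, 0.0), (1.0, 0.5, -0.2)),
    (2.0, (0.3, -0.2, 0.1), (-0.4, 0.3, 0.6)),
    (0.5, (-0.1, 0.2, -0.3), (0.2, -0.7, 0.1)),
]


def w(u):
    """Normalised Gaussian weighting function (an assumed choice)."""
    r2 = sum(c * c for c in u)
    return math.exp(-r2 / (2.0 * EPS ** 2)) / ((2.0 * math.pi) ** 1.5 * EPS ** 3)


def weights(x, t):
    """List of (m_i, v_i, w(x_i(t) - x)) for uniform rectilinear motion."""
    out = []
    for m, x0, v in PARTICLES:
        u = tuple(x0[k] + v[k] * t - x[k] for k in range(3))
        out.append((m, v, w(u)))
    return out


def rho(x, t):
    """Mass density (2.1)."""
    return sum(m * wi for m, _, wi in weights(x, t))


def momentum(x, t):
    """Momentum density (2.4) as a 3-tuple."""
    return tuple(sum(m * v[k] * wi for m, v, wi in weights(x, t)) for k in range(3))


def continuity_residual(x, t, h=1e-4):
    """Central-difference estimate of d(rho_w)/dt + div p_w at (x, t)."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2.0 * h)
    div_p = 0.0
    for k in range(3):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        div_p += (momentum(xp, t)[k] - momentum(xm, t)[k]) / (2.0 * h)
    return drho_dt + div_p


print(continuity_residual((0.1, 0.05, -0.05), 0.2))
```

The residual is limited only by the truncation error of the difference quotients, so it should be several orders of magnitude smaller than either term taken separately.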
2.2. LINEAR MOMENTUM BALANCE
Linear momentum balance is obtained by considering the motion of Pi relative to
an inertial frame. Such motion is governed by the equation
\[
\sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij} + \mathbf{b}_i = \frac{d}{dt}\{m_i \mathbf{v}_i\}.
\tag{2.7}
\]
Here $\mathbf{f}_{ij}$ denotes the force exerted upon $P_i$ by $P_j$, $\mathbf{b}_i$ represents the resultant
force on $P_i$ due to external agencies, and the sum is over all particles $P_j$ ($j \neq i$).
Multiplication of each term by $w(\mathbf{x}_i(t) - \mathbf{x})$, followed by summation over all
particles, yields
\[
\mathbf{f}_w + \mathbf{b}_w = \sum_{i=1}^{N} \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}),
\tag{2.8}
\]
where
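\[
\mathbf{f}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}(t)\, w(\mathbf{x}_i(t) - \mathbf{x})
\tag{2.9}
\]
and
\[
\mathbf{b}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \mathbf{b}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.10}
\]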
Since
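\[
\frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x})
 = \frac{\partial}{\partial t}\{m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} - (m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w
\]
and
\[
(m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w = -(m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w = -\operatorname{div}\{(m_i \mathbf{v}_i \otimes \mathbf{v}_i)\, w\},
\]
the right-hand side of (2.8) becomes
\[
\frac{\partial}{\partial t}\Bigl\{ \sum_{i=1}^{N} m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) \Bigr\} + \operatorname{div} \mathbf{D}_w ,
\tag{2.11}
\]
where
\[
\mathbf{D}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \mathbf{v}_i(t) \otimes \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.12}
\]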
Accordingly, noting definitions (2.4) and (2.5), substitution of (2.11) in (2.8) yields
the continuum balance
\[
\mathbf{f}_w + \mathbf{b}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{v}_w\} + \operatorname{div} \mathbf{D}_w .
\tag{2.13}
\]
Writing
v̂i (t; x) := vi (t) − vw (x, t),
(2.14)
and noting that from (2.4) and (2.5)
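\[
\sum_{i=1}^{N} m_i \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}) = \mathbf{p}_w - \rho_w \mathbf{v}_w = \mathbf{0},
\tag{2.15}
\]
it follows that
\[
\mathbf{D}_w(\mathbf{x}, t) = \mathcal{D}_w(\mathbf{x}, t) + \rho_w \mathbf{v}_w \otimes \mathbf{v}_w ,
\tag{2.16}
\]
where
\[
\mathcal{D}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \hat{\mathbf{v}}_i(t; \mathbf{x}) \otimes \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.17}
\]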
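Decomposition (2.16) is purely algebraic and holds at each point for any set of weights. The following Python sketch checks it componentwise; the function names and the numerical values of masses, velocities and weights are hypothetical, the weights standing in for the values $w(\mathbf{x}_i - \mathbf{x})$ at a fixed point.

```python
def outer(a, b):
    """Tensor product a (x) b of two 3-vectors as a nested list."""
    return [[ai * bj for bj in b] for ai in a]


def add_scaled(A, B, s):
    """Return A + s * B for 3 x 3 nested lists."""
    return [[A[k][l] + s * B[k][l] for l in range(3)] for k in range(3)]


def momentum_flux_split(masses, velocities, weights):
    """Return (rho_w, v_w, D_w, thermal part of D_w) at one point x.

    weights[i] stands for w(x_i - x); any positive numbers will do here.
    """
    zero = [[0.0] * 3 for _ in range(3)]
    rho = sum(m * wi for m, wi in zip(masses, weights))
    p = [
        sum(m * v[k] * wi for m, v, wi in zip(masses, velocities, weights))
        for k in range(3)
    ]
    v_w = [pk / rho for pk in p]
    D = zero
    D_thermal = zero
    for m, v, wi in zip(masses, velocities, weights):
        v_hat = [v[k] - v_w[k] for k in range(3)]  # thermal velocity (2.14)
        D = add_scaled(D, outer(v, v), m * wi)
        D_thermal = add_scaled(D_thermal, outer(v_hat, v_hat), m * wi)
    return rho, v_w, D, D_thermal


# Hypothetical data.
MASSES = [1.0, 2.0, 0.5]
VELOCITIES = [(1.0, 0.5, -0.2), (-0.4, 0.3, 0.6), (0.2, -0.7, 0.1)]
WEIGHTS = [0.8, 0.3, 1.1]

rho_w, v_w, D_w, D_th = momentum_flux_split(MASSES, VELOCITIES, WEIGHTS)
rhs = add_scaled(D_th, outer(v_w, v_w), rho_w)
max_gap = max(abs(D_w[k][l] - rhs[k][l]) for k in range(3) for l in range(3))
print(max_gap)
```

The cross terms vanish because of (2.15), which is why the difference is at the level of rounding error.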
Using (2.16), balance (2.13) may be written as
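\[
\begin{aligned}
-\operatorname{div} \mathcal{D}_w + \mathbf{f}_w + \mathbf{b}_w
 &= \frac{\partial}{\partial t}\{\rho_w \mathbf{v}_w\} + \operatorname{div}\{\rho_w \mathbf{v}_w \otimes \mathbf{v}_w\} \\
 &= \Bigl\{ \frac{\partial \rho_w}{\partial t} + \operatorname{div}\{\rho_w \mathbf{v}_w\} \Bigr\} \mathbf{v}_w
   + \rho_w \Bigl\{ \frac{\partial \mathbf{v}_w}{\partial t} + (\nabla \mathbf{v}_w)\mathbf{v}_w \Bigr\}.
\end{aligned}
\tag{2.18}
\]
That is, invoking (2.6),
\[
-\operatorname{div} \mathcal{D}_w + \mathbf{f}_w + \mathbf{b}_w = \rho_w \mathbf{a}_w ,
\tag{2.19}
\]
where the acceleration field
\[
\mathbf{a}_w := \frac{\partial \mathbf{v}_w}{\partial t} + (\nabla \mathbf{v}_w)\mathbf{v}_w .
\tag{2.20}
\]
Relation (2.19) is to be compared with the usual local form of momentum
balance
\[
\operatorname{div} \mathbf{T} + \mathbf{b} = \rho \mathbf{a}.
\tag{2.21}
\]
Identifications
\[
\mathbf{b}_w \leftrightarrow \mathbf{b}, \qquad \rho_w \leftrightarrow \rho, \qquad \mathbf{a}_w \leftrightarrow \mathbf{a}
\tag{2.22}
\]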
are straightforward. This motivates an attempt to express fw as the divergence of a
tensor field. The existence and explicit form of such a tensor field is a consequence
of (see [2, 4, 5])
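NOLL’S THEOREM. Let $\mathbf{g}$ denote a class $C^{1}$ tensor-valued function of any rank
defined on $\mathcal{E} \times \mathcal{E}$ which satisfies, for any pair of points $\mathbf{x}$ and $\mathbf{y}$,
\[
\mathbf{g}(\mathbf{y}, \mathbf{x}) = -\mathbf{g}(\mathbf{x}, \mathbf{y}),
\tag{2.23}
\]
and such that for some positive number $\delta$ (here we identify $\mathcal{E}$ with $\mathbb{R}^{3}$ by selection
of a Cartesian reference frame)
\[
\|\mathbf{g}(\mathbf{x}, \mathbf{y})\|\,\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta}, \qquad
\|\nabla_{\mathbf{x}}\mathbf{g}(\mathbf{x}, \mathbf{y})\|\,\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta}, \qquad\text{and}\qquad
\|\nabla_{\mathbf{y}}\mathbf{g}(\mathbf{x}, \mathbf{y})\|\,\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta}
\tag{2.24}
\]
are bounded in $\mathcal{E} \times \mathcal{E}$.
Then
\[
\int_{\mathcal{E}} \mathbf{g}(\mathbf{x}, \mathbf{y})\, d\mathbf{y}
 = \operatorname{div}\Bigl\{ -\frac{1}{2} \int_{\mathcal{V}} \int_{0}^{1} \mathbf{g}(\mathbf{x} + \alpha\mathbf{u},\, \mathbf{x} - (1 - \alpha)\mathbf{u}) \otimes \mathbf{u}\; d\alpha\, d\mathbf{u} \Bigr\}.
\tag{2.25}
\]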
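To invoke this theorem in respect of $\mathbf{f}_w$ we define
\[
\mathbf{g}(\mathbf{x}, \mathbf{y}) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}),
\tag{2.26}
\]
and notice that use of normalisation (2.2) and (2.9) yields
\[
\int_{\mathcal{E}} \mathbf{g}(\mathbf{x}, \mathbf{y})\, d\mathbf{y}
 = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) \int_{\mathcal{E}} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y}
 = \mathbf{f}_w(\mathbf{x}).
\tag{2.27}
\]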
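(Of course, in the foregoing time dependence has been omitted for brevity.) Accordingly, (2.25) enables $\mathbf{f}_w(\mathbf{x})$ to be expressed as the divergence of a tensor, defined explicitly in terms of interactions and $w$, provided (2.23) and (2.24) are
satisfied. Now
\[
\begin{aligned}
\mathbf{g}(\mathbf{x}, \mathbf{y})
 &= \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}) \\
 &= \sum_{j=1}^{N} \sum_{\substack{i=1 \\ i \neq j}}^{N} \mathbf{f}_{ji}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y}) \\
 &= -\sum_{j=1}^{N} \sum_{\substack{i=1 \\ i \neq j}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y}) = -\mathbf{g}(\mathbf{y}, \mathbf{x}).
\end{aligned}
\]
Here the second equality is a consequence of re-labelling, and the third equality
holds if Newton’s third law holds for particle interactions, namely
\[
\mathbf{f}_{ji} = -\mathbf{f}_{ij}.
\tag{2.28}
\]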
In respect of the boundedness conditions, observe that interactions fij are independent of x and y. If each interaction fij is governed by a separation-dependent
potential, φij say, which is bounded below, and such that φij → +∞ as Pi and
Pj get ever closer, then provided the total (kinetic plus potential) energy of M is
bounded it follows that the values of all interactions are bounded. (In particular, the
foregoing holds for Lennard–Jones-type potentials: see, for example, [6, p. 251].)
The boundedness criteria (2.24) are accordingly satisfied provided that, for some
$\delta > 0$,
\[
|w(\mathbf{u})|\,\|\mathbf{u}\|^{3+\delta} \quad\text{and}\quad \|\nabla w(\mathbf{u})\|\,\|\mathbf{u}\|^{3+\delta} \quad\text{are bounded in } \mathcal{V}.
\tag{2.29}
\]
Condition (2.29) is thus a condition upon the choice of weighting function necessary for invocation of the theorem. Accordingly, for standard models of molecular
interactions, and modulo restriction (2.29) on $w$, from (2.27), (2.26) and (2.25),
\[
\mathbf{f}_w = \operatorname{div} \mathbf{T}^{-}_{w},
\tag{2.30}
\]
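where the interaction stress tensor
\[
\mathbf{T}^{-}_{w}(\mathbf{x}) := -\frac{1}{2} \int_{\mathcal{V}} \int_{0}^{1} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N}
 \mathbf{f}_{ij} \otimes \mathbf{u}\; w(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u})\, w(\mathbf{x}_j - \mathbf{x} + (1 - \alpha)\mathbf{u})\; d\alpha\, d\mathbf{u}.
\tag{2.31}
\]
From (2.30) the balance of linear momentum (2.19) takes its standard form (2.21),
namely
\[
\operatorname{div} \mathbf{T}_w + \mathbf{b}_w = \rho_w \mathbf{a}_w ,
\tag{2.32}
\]
where the stress tensor
\[
\mathbf{T}_w := \mathbf{T}^{-}_{w} - \mathcal{D}_w .
\tag{2.33}
\]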
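The content of (2.30) and (2.31) can be illustrated numerically in a one-dimensional setting, where the tensor product with $\mathbf{u}$ reduces to multiplication by $u$ and the divergence to an ordinary derivative. The Python sketch below is a one-dimensional analogue under stated assumptions, not the three-dimensional statement itself: two particles on a line, a Gaussian weighting function, and midpoint-rule quadrature. All names and numerical values are hypothetical.

```python
import math

EPS = 0.3          # averaging length scale (illustrative value)
X = [0.0, 0.5]     # positions of two particles on a line (hypothetical)
F = {(0, 1): 2.0, (1, 0): -2.0}  # f_ij, with f_ji = -f_ij as in (2.28)


def w(u):
    """Normalised one-dimensional Gaussian weighting function (assumed)."""
    return math.exp(-u * u / (2.0 * EPS ** 2)) / (math.sqrt(2.0 * math.pi) * EPS)


def force_density(x):
    """One-dimensional counterpart of f_w in (2.9)."""
    return sum(f * w(X[i] - x) for (i, _), f in F.items())


def interaction_stress(x, n_u=200, n_alpha=200, half_range=4.0):
    """Midpoint-rule evaluation of the 1-D counterpart of (2.31)."""
    du = 2.0 * half_range / n_u
    da = 1.0 / n_alpha
    total = 0.0
    for a in range(n_u):
        u = -half_range + (a + 0.5) * du
        for b in range(n_alpha):
            alpha = (b + 0.5) * da
            for (i, j), f in F.items():
                total += (
                    f * u
                    * w(X[i] - x - alpha * u)
                    * w(X[j] - x + (1.0 - alpha) * u)
                )
    return -0.5 * total * du * da


def stress_divergence(x, h=1e-3):
    """Central-difference derivative of the interaction stress."""
    return (interaction_stress(x + h) - interaction_stress(x - h)) / (2.0 * h)


print(stress_divergence(0.2), force_density(0.2))
```

The two printed values should agree to within the quadrature and differencing error. With the attractive pair chosen here the computed stress is positive (tensile) between the particles, as one would expect.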
REMARK 1. The existence of $\mathbf{T}^{-}_{w}$ imposes no restriction upon the range of interactions, but requires their pairwise balance (2.28). More general interactions have
been considered: see [7, p. 299] and [3, p. 294]. A corresponding stress tensor was
motivated for the interactions covered in [7] using different methodology from that
here employed. Interactions discussed in [3] were decomposed into conservative
and non-conservative contributions, and a stress tensor was associated with the
former, using Noll’s theorem: the non-conservative contribution was shown to be
decomposable into two terms, one of which has an associated stress tensor and the
other remains a spatial force density. For large bodies, gravitational considerations
make a similar decomposition of $\mathbf{f}_w$ desirable in order to make comparison with
the usual continuum approach to self-gravitation. Writing $\mathbf{f}_{ij}$ as the sum of the
gravitational attraction of $P_j$ upon $P_i$ together with the remaining non-gravitational
interaction, $\mathbf{f}_w$ can be expressed as the sum of the divergence of a non-gravitational
interaction stress tensor together with an internal gravitational interaction body
force density, $\mathbf{b}_{\mathrm{grav}}$ say. For the simplest choice of $w$, with corresponding length
scale $\varepsilon$ (see (3.2)), the value $\mathbf{b}_{\mathrm{grav}}(\mathbf{x}, t)$ is the resultant gravitational force at instant $t$
exerted by molecules of the body distant greater than $\varepsilon$ from $\mathbf{x}$ upon those distant
less than $\varepsilon$ from $\mathbf{x}$, divided by $4\pi\varepsilon^{3}/3$. The non-gravitational interaction stress at $\mathbf{x}$,
while possibly involving individual long-range molecular contributions, in general
derives almost entirely from molecules close to $\mathbf{x}$ as a consequence of co-operative
behaviour (see [7, Section 3.1, Remarks]).
2.3. GENERALISED MOMENT OF MOMENTUM BALANCE
Tensorial pre-multiplication of each term in equation (2.7) by (xi − x)w(xi − x),
followed by summation over all i = 1, . . . , N, yields
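\[
\mathbf{c}_w + \mathbf{J}_w = \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}),
\tag{2.34}
\]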
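where
\[
\mathbf{c}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes \mathbf{f}_{ij}(t)\, w(\mathbf{x}_i(t) - \mathbf{x}),
\tag{2.35}
\]
and
\[
\mathbf{J}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes \mathbf{b}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.36}
\]
Now
\[
\begin{aligned}
(\mathbf{x}_i - \mathbf{x}) \otimes \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x})
 &= \frac{\partial}{\partial t}\{(\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} - \mathbf{v}_i \otimes m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) \\
 &\quad - (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i\, (\nabla w \cdot \mathbf{v}_i).
\end{aligned}
\tag{2.37}
\]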
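Defining the action of simple tensor $\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c}$ on any vector $\mathbf{v}$ by
\[
(\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c})\mathbf{v} := (\mathbf{c} \cdot \mathbf{v})\, \mathbf{a} \otimes \mathbf{b},
\tag{2.38}
\]
the last term of (2.37) may be written as
\[
\begin{aligned}
-((\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w
 &= ((\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w \\
 &= \operatorname{div}\{(\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} + m_i \mathbf{v}_i \otimes \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}).
\end{aligned}
\tag{2.39}
\]
The second equality is a consequence of the identity
\[
\operatorname{div}(\phi\, \mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c})
 = \phi\, \{ (\nabla \mathbf{a})\mathbf{c} \otimes \mathbf{b} + \mathbf{a} \otimes (\nabla \mathbf{b})\mathbf{c} + (\operatorname{div} \mathbf{c})(\mathbf{a} \otimes \mathbf{b}) \}
 + (\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c})\nabla \phi ,
\tag{2.40}
\]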
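with $\mathbf{a} := (\mathbf{x}_i - \mathbf{x})$, $\mathbf{b} := m_i \mathbf{v}_i$, $\mathbf{c} := \mathbf{v}_i$, and $\phi := w$. Here the definition of the
divergence of a rank three tensor $\mathbf{M}$ ensures that the divergence theorem holds in
the form
\[
\int_{\partial R} \mathbf{M}\mathbf{n} = \int_{R} \operatorname{div} \mathbf{M}
\tag{2.41}
\]
for a regular region $R$ having outward unit normal $\mathbf{n}$ on its boundary $\partial R$. (In
Cartesian tensor notation, $(\operatorname{div} \mathbf{M})_{ij} = M_{ijk,k}$.)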
From (2.37) and (2.39), relation (2.34) may be written as
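\[
\mathbf{c}_w + \mathbf{J}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{B}_w\} + \operatorname{div} \mathbf{M}_w ,
\tag{2.42}
\]
where
\[
\rho_w(\mathbf{x}, t)\mathbf{B}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes m_i \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x})
\tag{2.43}
\]
and
\[
\mathbf{M}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes m_i \mathbf{v}_i(t) \otimes \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{2.44}
\]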
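Recalling that the non-interaction contribution $\mathcal{D}_w$ to stress involves thermal velocities and, with an eye on usual forms of balance whose right-hand sides are the product
of $\rho$ and the material time derivative of some tensor field, we write
\[
\mathcal{M}_w := \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \hat{\mathbf{v}}_i\, w(\mathbf{x}_i - \mathbf{x}).
\tag{2.45}
\]
Accordingly, from (2.44), (2.45), (2.14) and (2.43),
\[
\mathcal{M}_w = \mathbf{M}_w - \rho_w \mathbf{B}_w \otimes \mathbf{v}_w ,
\tag{2.46}
\]
and (2.42) becomes
\[
-\operatorname{div} \mathcal{M}_w + \mathbf{c}_w + \mathbf{J}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{B}_w\} + \operatorname{div}\{\rho_w \mathbf{B}_w \otimes \mathbf{v}_w\}.
\tag{2.47}
\]
Since (2.40) may be written as
\[
\operatorname{div}\{(\mathbf{a} \otimes \mathbf{b}) \otimes \phi\mathbf{c}\} = (\nabla(\mathbf{a} \otimes \mathbf{b}))\phi\mathbf{c} + \operatorname{div}(\phi\mathbf{c})(\mathbf{a} \otimes \mathbf{b})
\tag{2.48}
\]
and this result holds with $\mathbf{a} \otimes \mathbf{b}$ replaced by any second-rank tensor,
\[
\operatorname{div}\{\rho_w \mathbf{B}_w \otimes \mathbf{v}_w\} = \operatorname{div}\{\mathbf{B}_w \otimes \rho_w \mathbf{v}_w\}
 = (\nabla \mathbf{B}_w)\rho_w \mathbf{v}_w + (\operatorname{div}(\rho_w \mathbf{v}_w))\mathbf{B}_w .
\tag{2.49}
\]
Hence (2.47) may be written (using (2.6)) in the form
\[
-\operatorname{div} \mathcal{M}_w + \mathbf{c}_w + \mathbf{J}_w = \rho_w \dot{\mathbf{B}}_w ,
\tag{2.50}
\]
where the material time derivative
\[
\dot{\mathbf{B}}_w := \frac{\partial}{\partial t}\{\mathbf{B}_w\} + (\nabla \mathbf{B}_w)\mathbf{v}_w .
\tag{2.51}
\]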
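Relation (2.46) between the rank three fields of (2.44) and (2.45) is again a pointwise algebraic identity, and a componentwise check is straightforward. In the Python sketch below the function name `moment_flux_split` and all numerical data are hypothetical; `offsets[i]` stands for $\mathbf{x}_i - \mathbf{x}$ and `weights[i]` for $w(\mathbf{x}_i - \mathbf{x})$ at a fixed point.

```python
def moment_flux_split(masses, offsets, velocities, weights):
    """Return (rho_w, v_w, rho_w B_w, M_w, thermal M_w) at one point x.

    offsets[i] stands for x_i - x and weights[i] for w(x_i - x).
    """
    data = list(zip(masses, offsets, velocities, weights))
    idx = range(3)
    rho = sum(m * wi for m, _, _, wi in data)
    v_w = [sum(m * v[c] * wi for m, _, v, wi in data) / rho for c in idx]
    # rho_w B_w of (2.43): rank two.
    rho_B = [[sum(r[a] * m * v[b] * wi for m, r, v, wi in data) for b in idx]
             for a in idx]
    # M_w of (2.44): rank three, built with the full velocities.
    M = [[[sum(r[a] * m * v[b] * v[c] * wi for m, r, v, wi in data) for c in idx]
          for b in idx] for a in idx]
    # Field of (2.45): last factor replaced by the thermal velocity (2.14).
    M_thermal = [[[sum(r[a] * m * v[b] * (v[c] - v_w[c]) * wi
                       for m, r, v, wi in data) for c in idx]
                  for b in idx] for a in idx]
    return rho, v_w, rho_B, M, M_thermal


# Hypothetical data.
MASSES = [1.0, 2.0, 0.5]
OFFSETS = [(0.1, -0.2, 0.05), (-0.3, 0.1, 0.2), (0.25, 0.15, -0.1)]
VELOCITIES = [(1.0, 0.5, -0.2), (-0.4, 0.3, 0.6), (0.2, -0.7, 0.1)]
WEIGHTS = [0.8, 0.3, 1.1]

rho_w, v_w, rho_B, M_w, M_th = moment_flux_split(MASSES, OFFSETS, VELOCITIES, WEIGHTS)
max_gap = max(
    abs(M_th[a][b][c] - (M_w[a][b][c] - rho_B[a][b] * v_w[c]))
    for a in range(3) for b in range(3) for c in range(3)
)
print(max_gap)
```

The printed gap should be at the level of rounding error, since the identity follows simply from writing each velocity as its thermal part plus $\mathbf{v}_w$.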
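It proves possible to write $\mathbf{c}_w$ as the divergence of a rank three tensor field via
Noll’s theorem. To this end we define
$$
\mathbf{G}(\mathbf{x}, \mathbf{y}) := \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N}
\{(\mathbf{x}_i - \mathbf{x}) + (\mathbf{x}_j - \mathbf{y})\} \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}).
\tag{2.52}
$$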
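Consider
$$
\begin{aligned}
\int_{E} \mathbf{G}(\mathbf{x}, \mathbf{y})\, d\mathbf{y}
={}& \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) \int_{E} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} \\
&+ \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} \left\{\int_{E} (\mathbf{x}_j - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y}\right\} \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}).
\end{aligned}
\tag{2.53}
$$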
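Changing to variable $\mathbf{u} := \mathbf{x}_j - \mathbf{y}$ yields
$$
\int_{E} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} = \int_{V} w(\mathbf{u})\, d\mathbf{u} = 1,
\tag{2.54}
$$
using normalisation condition (2.2). Further,
$$
\int_{E} (\mathbf{x}_j - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} = \int_{V} \mathbf{u}\, w(\mathbf{u})\, d\mathbf{u}.
\tag{2.55}
$$
Thus from (2.54), (2.55), (2.53) and (2.35),
$$
\int_{E} \mathbf{G}(\mathbf{x}, \mathbf{y})\, d\mathbf{y} = \mathbf{c}_w
\tag{2.56}
$$
provided that weighting function $w$ satisfies
$$
\int_{V} \mathbf{u}\, w(\mathbf{u})\, d\mathbf{u} = \mathbf{0}.
\tag{2.57}
$$
Notice that if $w$ is ‘balanced’, in the sense that
$$
w(-\mathbf{u}) = w(\mathbf{u}),
\tag{2.58}
$$
then (2.57) is satisfied.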
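As a numerical illustration of (2.2), (2.57) and (2.58), the following minimal sketch estimates the zeroth and first moments of a balanced weighting function on a cubic grid. It assumes the top-hat form $1/V_\epsilon$ inside a ball of radius $\epsilon$; the function names `top_hat` and `weighting_moments`, the grid size and the midpoint rule are illustrative choices, not part of the paper.

```python
import math

def top_hat(u, eps):
    """Assumed top-hat weighting: 1/V_eps inside the ball of radius eps, zero outside."""
    r = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    return 3.0 / (4.0 * math.pi * eps ** 3) if r < eps else 0.0

def weighting_moments(w, eps, n=40):
    """Midpoint-rule estimates of the integrals of w(u) and u w(u) over [-eps, eps]^3."""
    h = 2.0 * eps / n
    pts = [-eps + (k + 0.5) * h for k in range(n)]
    zeroth = 0.0
    first = [0.0, 0.0, 0.0]
    for x in pts:
        for y in pts:
            for z in pts:
                val = w((x, y, z), eps) * h ** 3
                zeroth += val
                first[0] += x * val
                first[1] += y * val
                first[2] += z * val
    return zeroth, first

if __name__ == "__main__":
    zeroth, first = weighting_moments(top_hat, eps=1.0)
    print("integral of w   ~", zeroth)  # approximates 1, cf. (2.2)
    print("integral of u w ~", first)   # approximates the zero vector, cf. (2.57)
```

The grid is symmetric about the origin, so the first moment cancels up to rounding error, which is the discrete counterpart of the argument that (2.58) implies (2.57).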
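We also have
$$
\mathbf{G}(\mathbf{x}, \mathbf{y}) = \sum_{j=1}^{N}\sum_{\substack{i=1\\ i\neq j}}^{N}
\{(\mathbf{x}_j - \mathbf{x}) + (\mathbf{x}_i - \mathbf{y})\} \otimes \mathbf{f}_{ji}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y})
= -\mathbf{G}(\mathbf{y}, \mathbf{x})
\tag{2.59}
$$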
on assuming interaction balance (2.28). The remaining boundedness condition necessary for application of Noll’s theorem is equivalent to requiring that
$$
\|\mathbf{u}\|\, w(\mathbf{u})\, \|\mathbf{u}\|^{3+\delta} \ \text{be bounded for } \mathbf{u} \in V,
\tag{2.60}
$$
a further restriction upon the choice of $w$. For such weighting functions Noll’s
theorem enables us to write
$$
\mathbf{c}_w = \operatorname{div}\mathbf{C}^{-}_w
\tag{2.61}
$$
and express balance (2.50) in the form
$$
\operatorname{div}\mathbf{C}_w + \mathbf{J}_w = \rho_w \dot{\mathbf{B}}_w.
\tag{2.62}
$$
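Here the generalised couple-stress tensor
$$
\mathbf{C}_w := \mathbf{C}^{-}_w - \mathcal{M}_w
\tag{2.63}
$$
with
$$
\begin{aligned}
\mathbf{C}^{-}_w(\mathbf{x}) := -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N}\int_{V}\int_{0}^{1}
&\{(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u}) + (\mathbf{x}_j - \mathbf{x} + (1-\alpha)\mathbf{u})\} \\
&\otimes \mathbf{f}_{ij} \otimes \mathbf{u}\; w(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u})\, w(\mathbf{x}_j - \mathbf{x} + (1-\alpha)\mathbf{u})\, d\alpha\, d\mathbf{u}.
\end{aligned}
\tag{2.64}
$$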
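REMARK 2. Moment of momentum balance corresponds to the skew part of
relations (2.42) or (2.62). Denoting twice the skew part of any rank two tensor $\mathbf{A}$
by $\breve{\mathbf{A}}$, and noting
$$
\breve{(\mathbf{a} \otimes \mathbf{b})} = \mathbf{a} \wedge \mathbf{b},
\tag{2.65}
$$
where
$$
\mathbf{a} \wedge \mathbf{b} := \mathbf{a} \otimes \mathbf{b} - \mathbf{b} \otimes \mathbf{a},
\tag{2.66}
$$
these relations become
$$
\breve{\mathbf{c}}_w + \breve{\mathbf{J}}_w = \frac{\partial}{\partial t}\{\rho_w \breve{\mathbf{B}}_w\} + \operatorname{div}\breve{\mathbf{M}}_w
\tag{2.67}
$$
and
$$
\operatorname{div}\breve{\mathbf{C}}_w + \breve{\mathbf{J}}_w = \rho_w \dot{\breve{\mathbf{B}}}_w.
\tag{2.68}
$$
Here $\breve{\mathbf{c}}_w$, $\breve{\mathbf{J}}_w$ and $\breve{\mathbf{B}}_w$ are given by definitions (2.35), (2.36) and (2.43) with $\otimes$
replaced by $\wedge$, $\breve{\mathbf{M}}_w$ is given by (2.44) with the first $\otimes$ replaced by $\wedge$, and $\breve{\mathbf{C}}_w$ is
the difference of expressions (2.64) and (2.45) with the first $\otimes$ replaced by $\wedge$.
Of course, relations (2.67) and (2.68) may be written in terms of the corresponding axial vectors. In so doing it should be noted that $-\mathbf{a} \times \mathbf{b}$ is the axial vector
corresponding to $\mathbf{a} \wedge \mathbf{b}$.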
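The closing statement of Remark 2 can be checked numerically. The sketch below builds $\mathbf{a} \wedge \mathbf{b}$ from (2.66) and extracts its axial vector, assuming the common convention that the axial vector $\boldsymbol{\omega}$ of a skew tensor $\mathbf{W}$ satisfies $\mathbf{W}\mathbf{v} = \boldsymbol{\omega} \times \mathbf{v}$ for every $\mathbf{v}$; that convention and all helper names are assumptions made for illustration.

```python
def outer(a, b):
    """Tensor product a (x) b as a 3x3 nested list."""
    return [[ai * bj for bj in b] for ai in a]

def wedge(a, b):
    """a ^ b := a (x) b - b (x) a, as in (2.66)."""
    ab, ba = outer(a, b), outer(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matvec(W, v):
    """Action of a 3x3 tensor on a vector."""
    return [sum(W[r][c] * v[c] for c in range(3)) for r in range(3)]

def axial_vector(W):
    """Axial vector omega of a skew W under the assumed convention W v = omega x v."""
    return [W[2][1], W[0][2], W[1][0]]

if __name__ == "__main__":
    a, b, v = [1.0, 2.0, 3.0], [-1.0, 0.0, 2.0], [0.5, -1.0, 4.0]
    W = wedge(a, b)
    print(axial_vector(W), [-c for c in cross(a, b)])  # the two vectors coincide
    print(matvec(W, v), cross(axial_vector(W), v))     # W v equals omega x v
```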
3. A Simple Choice of Scale-Dependent Weighting Function and
Corresponding Physical Interpretation of Field Values
3.1. A SIMPLE WEIGHTING FUNCTION
In Section 2 the usual forms of mass conservation (2.6) and linear momentum
balance (2.32), together with a generalised moment of momentum balance (2.62),
were derived using any scalar-valued weighting function $w$ defined on $V$ of class $C^1$,
satisfying boundedness criteria (2.29) and (2.60), and relation (2.57). The only
physical requirements so far introduced are that $w$ should have physical dimension
(length)$^{-3}$ and be normalised in the sense (2.2). In the absence of any preferred
direction for the system, it is natural to take
$$
w(\mathbf{u}) = \hat{w}(u), \quad \text{where } u := \|\mathbf{u}\|.
\tag{3.1}
$$
The simplest way of introducing a length-scale dependence is to choose
$$
\hat{w}(u) := \frac{3}{4\pi\epsilon^{3}} \ \text{ if } u < \epsilon
\quad\text{and}\quad
\hat{w}(u) := 0 \ \text{ if } u \geq \epsilon.
\tag{3.2}
$$
Clearly $w$ is normalised, and fields $\rho_w$, $\mathbf{p}_w$, $\mathbf{f}_w$, $\mathbf{b}_w$, $\mathbf{c}_w$, $\mathbf{J}_w$ and $\mathbf{B}_w$ have simple
interpretations as local averages of molecular variables. For example (see (2.1)),
$\rho_w(\mathbf{x}, t)$ represents the mass of those particles which at time $t$ reside within that
sphere $S_\epsilon(\mathbf{x})$ of radius $\epsilon$ centred at $\mathbf{x}$ divided by the volume of this sphere. However,
$\hat{w}$ is not continuous wherever $u = \epsilon$. It is a simple matter to “mollify” $\hat{w}$ over an
interval $(\epsilon, \epsilon + \delta)$ in such a way that $\hat{w}$ is of arbitrary smoothness up to class
$C^{\infty}$ everywhere, with $\hat{w}$ constant on $0 \leq u \leq \epsilon$ and zero for $u \geq \epsilon + \delta$. Here
$\delta\,(> 0)$ is arbitrarily small (see [5, p. 160]): for example, we could choose $\delta = 10^{-6}$
Å $= 10^{-16}$ m. In such case the physical interpretations of fields $\rho_w$, etc. are essentially indistinguishable from those delivered by choice (3.2). Such mollification
involves monotone decreasing smooth functions wherever $\epsilon \leq u \leq \epsilon + \delta$. Nevertheless, such functions $\hat{w}$ have bounded (although very large) derivative values on
$(\epsilon, \epsilon + \delta)$. Of course, these derivatives vanish on $[0, \epsilon] \cup [\epsilon + \delta, \infty)$, and accordingly
boundedness criteria (2.29) and (2.60) are satisfied.
3.2. INTERPRETATION OF FIELD VALUES
3.2.1. Values of ρw , pw , fw , bw , cw , Jw and ρw Bw are immediately seen to deliver
values, at geometrical point x and time t, of sums of additive molecular quantities
taken over those molecules which lie within Sǫ (x), divided by the volume Vǫ of
this sphere.
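A minimal sketch of this interpretation, for the unmollified top-hat weighting: the mass density at $\mathbf{x}$ is the mass of the particles inside the ball of radius $\epsilon$ about $\mathbf{x}$ divided by its volume. The momentum density is assumed here to be the analogous weighted sum of $m_i\mathbf{v}_i$ (its definition lies outside this section), and the function names and sample data are hypothetical.

```python
import math

def ball_volume(eps):
    """Volume V_eps of a ball of radius eps."""
    return 4.0 * math.pi * eps ** 3 / 3.0

def local_averages(x, positions, masses, velocities, eps):
    """Return (rho_w, p_w) at x for the top-hat weighting of radius eps.

    p_w is assumed to be the weighted sum of m_i v_i (illustrative assumption).
    """
    v_eps = ball_volume(eps)
    rho = 0.0
    p = [0.0, 0.0, 0.0]
    for xi, mi, vi in zip(positions, masses, velocities):
        if math.dist(xi, x) < eps:  # particle lies inside the ball about x
            rho += mi / v_eps
            for c in range(3):
                p[c] += mi * vi[c] / v_eps
    return rho, p

if __name__ == "__main__":
    positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
    masses = [1.0, 2.0, 4.0]
    velocities = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print(local_averages((0.0, 0.0, 0.0), positions, masses, velocities, eps=1.0))
```

In the sample data only the first two particles lie within unit distance of the origin, so the third contributes nothing to either field.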
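Modulo satisfaction of Newton’s third law (2.28), the expressions for $\mathbf{f}_w$ and $\mathbf{c}_w$
can be reduced somewhat. Writing
$$
\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})
= \sum_{\substack{P_i, P_k \in S_\epsilon(\mathbf{x})\\ i\neq k}} \frac{\mathbf{f}_{ik}}{V_\epsilon}
+ \sum_{\substack{P_i \in S_\epsilon(\mathbf{x})\\ P_\ell \notin S_\epsilon(\mathbf{x})}} \frac{\mathbf{f}_{i\ell}}{V_\epsilon},
\tag{3.3}
$$
we note that the first sum on the right-hand side is the same as
$$
\frac{1}{2V_\epsilon}\sum_{\substack{P_i, P_k \in S_\epsilon(\mathbf{x})\\ i\neq k}} (\mathbf{f}_{ik} + \mathbf{f}_{ki}),
$$
which vanishes by (2.28). Thus $\mathbf{f}_w(\mathbf{x}, t)$ is the resultant force exerted by molecules outside
$S_\epsilon(\mathbf{x})$ upon those inside $S_\epsilon(\mathbf{x})$, at time $t$, divided by $V_\epsilon$.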
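Similarly, again invoking (2.28) and adopting the book-keeping on the right-hand side of (3.3),
$$
\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})
= \frac{1}{2}\sum_{i\neq k} (\mathbf{x}_i - \mathbf{x}_k) \otimes \frac{\mathbf{f}_{ik}}{V_\epsilon}
+ \sum_{i}\sum_{\ell} (\mathbf{x}_i - \mathbf{x}) \otimes \frac{\mathbf{f}_{i\ell}}{V_\epsilon}.
\tag{3.4}
$$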
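If interactions are governed by separation-dependent pair potentials, then
$$
\mathbf{f}_{ik} = \alpha_{ik}(\mathbf{x}_i - \mathbf{x}_k),
\tag{3.5}
$$
where $\alpha_{ik}$ is a scalar-valued function of $\|\mathbf{x}_i - \mathbf{x}_k\|$. Accordingly, from (3.4) and
(2.35),
$$
\mathbf{c}_w(\mathbf{x}, t) = \tilde{\mathbf{c}}_w(\mathbf{x}, t) + \hat{\mathbf{c}}_w(\mathbf{x}, t),
\tag{3.6}
$$
where $\tilde{\mathbf{c}}_w(\mathbf{x}, t)$ takes symmetric values and $\hat{\mathbf{c}}_w(\mathbf{x}, t)$ represents at instant $t$ the resultant tensor moment of forces about $\mathbf{x}$ exerted by molecules outside $S_\epsilon(\mathbf{x})$ upon
those inside $S_\epsilon(\mathbf{x})$, divided by $V_\epsilon$. Specifically,
$$
\tilde{\mathbf{c}}_w(\mathbf{x}, t) := \frac{1}{2}\sum_{\substack{P_i, P_k \in S_\epsilon(\mathbf{x})\\ i\neq k}} (\mathbf{x}_i - \mathbf{x}_k) \otimes \alpha_{ik}(\mathbf{x}_i - \mathbf{x}_k)\, w(\mathbf{x}_i - \mathbf{x})
\tag{3.7}
$$
and
$$
\hat{\mathbf{c}}_w(\mathbf{x}, t) := \sum_{\substack{P_i \in S_\epsilon(\mathbf{x})\\ P_\ell \notin S_\epsilon(\mathbf{x})}} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{i\ell}\, w(\mathbf{x}_i - \mathbf{x}).
\tag{3.8}
$$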
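3.2.2. The geometrical interpretations of $\mathbf{T}^{-}_w$ and $\mathbf{C}^{-}_w$ follow from a theorem which
relates particular molecular interactions to the integrals of $\mathbf{T}^{-}_w\mathbf{n}$ and $\mathbf{C}^{-}_w\mathbf{n}$ over any
subset $S$ of an oriented plane. Specifically, let $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ denote that oriented plane
through point $\mathbf{x}_0$ with unit normal $\mathbf{n}$. Then $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ divides $E$ into the two open
subsets
$$
E^{+}_{\mathbf{n}}(\mathbf{x}_0) := \{\mathbf{z} \in E : (\mathbf{z} - \mathbf{x}_0)\cdot\mathbf{n} > 0\}
\quad\text{and}\quad
E^{-}_{\mathbf{n}}(\mathbf{x}_0) := \{\mathbf{y} \in E : (\mathbf{y} - \mathbf{x}_0)\cdot\mathbf{n} < 0\}.
\tag{3.9}
$$
More precisely,
$$
E = E^{-}_{\mathbf{n}}(\mathbf{x}_0) \cup \Pi_{\mathbf{n}}(\mathbf{x}_0) \cup E^{+}_{\mathbf{n}}(\mathbf{x}_0).
\tag{3.10}
$$
THEOREM. If $S$ is a connected subset of $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ then, for any function $\mathbf{g}$ as in
Noll’s theorem,
$$
\iint_{D(S)} \mathbf{g}(\mathbf{y}, \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}
= \int_{S}\left\{-\frac{1}{2}\int_{V}\int_{0}^{1} \mathbf{g}(\mathbf{x} + \alpha\mathbf{u}, \mathbf{x} - (1-\alpha)\mathbf{u}) \otimes \mathbf{u}\, d\alpha\, d\mathbf{u}\right\}\mathbf{n}\, dS_{\mathbf{x}},
\tag{3.11}
$$
where domain
$$
D(S) := \{(\mathbf{y}, \mathbf{z}) \in E^{-}_{\mathbf{n}}(\mathbf{x}_0) \times E^{+}_{\mathbf{n}}(\mathbf{x}_0) \ \text{and}\ \ell(\mathbf{y}, \mathbf{z}) \ \text{intersects}\ S\},
\tag{3.12}
$$
and $\ell(\mathbf{y}, \mathbf{z})$ denotes the line through points $\mathbf{y}$ and $\mathbf{z}$.
See [8] for a proof of this theorem.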
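Choice (2.26) of $\mathbf{g}$ together with (2.31) yields from (3.11)
$$
\int_{S} \mathbf{T}^{-}_w\mathbf{n}\, dS = \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} \mathbf{f}_{ij} F_{ij}(S),
\tag{3.13}
$$
where
$$
F_{ij}(S) := \iint_{D(S)} w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}.
\tag{3.14}
$$
Similarly, choice (2.52) of $\mathbf{g}$ with (2.64) gives
$$
\int_{S} \mathbf{C}^{-}_w\mathbf{n}\, dS = \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} \mathbf{F}_{ij}(S) \otimes \mathbf{f}_{ij},
\tag{3.15}
$$
where
$$
\mathbf{F}_{ij}(S) := \iint_{D(S)} [(\mathbf{x}_i - \mathbf{y}) + (\mathbf{x}_j - \mathbf{z})]\, w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}.
\tag{3.16}
$$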
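REMARK 3. Before discussing the values of $F_{ij}(S)$ and $\mathbf{F}_{ij}(S)$ for the mollified version of $\hat{w}$ it is instructive to consider the formal limit of $\hat{w}$ as scale $\epsilon$
tends to zero. This is provided by choosing the Dirac $\delta$ distribution in place of $\hat{w}$.
Accordingly, from (3.14)
$$
F_{ij}(S) := \iint_{D(S)} \delta(\mathbf{x}_i - \mathbf{y})\, \delta(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}
\tag{3.17}
$$
takes the value 1 if $(\mathbf{x}_i, \mathbf{x}_j) \in D(S)$, that is, if $P_i$ and $P_j$ lie on opposite sides of the plane as in (3.12) and $\ell(\mathbf{x}_i, \mathbf{x}_j)$ passes through $S$, and is otherwise zero. In the same
way, from (3.16) $\mathbf{F}_{ij}(S)$ is seen to be zero for all particle pairs. Accordingly, we
have the simple result (see Figure 1)
$$
\int_{S} \mathbf{T}^{-}_w\mathbf{n}\, dS = {\sum_{i}}'{\sum_{j}}'\, \mathbf{f}_{ij},
\tag{3.18}
$$
where sums are taken only over particle pairs for which $P_i \in E^{-}_{\mathbf{n}}(\mathbf{x}_0)$, $P_j \in E^{+}_{\mathbf{n}}(\mathbf{x}_0)$, and the line through $P_i$ and $P_j$ intersects $S$.
Figure 1.
Further, from (2.31) with
$w = \delta$ we see the only contributions to $\mathbf{T}^{-}_w(\mathbf{x})$ derive from $\alpha$ and $\mathbf{u}$ values for
which
$$
\mathbf{x}_i = \mathbf{x} + \alpha\mathbf{u} \quad\text{and}\quad \mathbf{x}_j = \mathbf{x} - (1-\alpha)\mathbf{u}.
\tag{3.19}
$$
Thus
$$
\mathbf{x}_j - \mathbf{x}_i = -\mathbf{u}
\tag{3.20}
$$
and (2.31) takes the form
$$
\mathbf{T}^{-}_w(\mathbf{x}) = \frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} \mathbf{f}_{ij} \otimes (\mathbf{x}_j - \mathbf{x}_i)\, a_{ij}(\mathbf{x})
\tag{3.21}
$$
for some scalar-valued functions $a_{ij}$. Accordingly, for interactions of form (3.5),
$\mathbf{T}^{-}_w$ takes symmetric values. Irving and Kirkwood [1] first obtained the foregoing,
strikingly simple, interpretation of $\mathbf{T}^{-}_w$ given in (3.18).
Setting $w = \delta$ in (3.16) yields zero value for each $\mathbf{F}_{ij}(S)$. Thus from (2.64) the
couple-stress vanishes for such choice.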
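The pair-counting rule behind (3.18) can be sketched in a few lines. The example below assumes, purely for illustration, that $S$ is a disk of given radius centred at $\mathbf{x}_0$ in the plane with unit normal $\mathbf{n}$, and that the pair forces are supplied as a dictionary keyed by the ordered index pair $(i, j)$; the function name and data layout are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def traction_resultant(positions, forces, x0, n, radius):
    """Sum f_ij over pairs with P_i below and P_j above the plane whose joining line meets S.

    S is assumed to be the disk of the given radius centred at x0 (illustrative choice);
    n must be a unit vector; forces maps (i, j) to the vector f_ij.
    """
    total = [0.0, 0.0, 0.0]
    for (i, j), f in forces.items():
        s_i = dot([a - b for a, b in zip(positions[i], x0)], n)
        s_j = dot([a - b for a, b in zip(positions[j], x0)], n)
        if not (s_i < 0.0 < s_j):
            continue  # only P_i on the negative side and P_j on the positive side count
        t = -s_i / (s_j - s_i)  # parameter at which the joining line meets the plane
        p = [a + t * (b - a) for a, b in zip(positions[i], positions[j])]
        if math.dist(p, x0) <= radius:
            total = [tc + fc for tc, fc in zip(total, f)]
    return total

if __name__ == "__main__":
    positions = [(0.0, 0.0, -1.0), (0.0, 0.0, 1.0), (5.0, 0.0, 1.0)]
    forces = {(0, 1): (0.0, 0.0, 1.0), (0, 2): (0.3, 0.0, 0.1)}
    print(traction_resultant(positions, forces, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0))
```

In the sample data the line joining the first and third particles meets the plane at distance 2.5 from $\mathbf{x}_0$, outside the unit disk, so only the first pair contributes.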
More generally, in determining the values of $F_{ij}(S)$ and $\mathbf{F}_{ij}(S)$ for any particular pair of particles and choice of scale $\epsilon$ embodied in $\hat{w}$ we note that:
(i) the only nonzero contributions come from domains in which both $\|\mathbf{y} - \mathbf{x}_i\| < \epsilon$ and $\|\mathbf{z} - \mathbf{x}_j\| < \epsilon$, and in such case $w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})$ takes the
value $V_\epsilon^{-2}$,
(ii) the $\mathbf{y}$ domain lies in $E^{-}_{\mathbf{n}}(\mathbf{x}_0)$ and the $\mathbf{z}$ domain in $E^{+}_{\mathbf{n}}(\mathbf{x}_0)$, and
(iii) line $\ell(\mathbf{y}, \mathbf{z})$ passes through $S$.
Consequently there is no simple expression for either $F_{ij}(S)$ or $\mathbf{F}_{ij}(S)$, and a
number of different cases must be taken into account.
Case 1 (See Figure 2). If $S_\epsilon(\mathbf{x}_i) \subset E^{-}_{\mathbf{n}}(\mathbf{x}_0)$, $S_\epsilon(\mathbf{x}_j) \subset E^{+}_{\mathbf{n}}(\mathbf{x}_0)$, and any line joining
a point in sphere $S_\epsilon(\mathbf{x}_i)$ to a point in sphere $S_\epsilon(\mathbf{x}_j)$ passes through $S$, then
$$
F_{ij}(S) = 1 \quad\text{and}\quad \mathbf{F}_{ij}(S) = \mathbf{0}.
\tag{3.22}
$$
Figure 2. Case 1: $F_{ij}(S) = 1$ and $\mathbf{F}_{ij}(S) = \mathbf{0}$.
The latter result is a consequence of $\mathbf{x}_i$ and $\mathbf{x}_j$ being the centroids of, respectively,
$S_\epsilon(\mathbf{x}_i)$ and $S_\epsilon(\mathbf{x}_j)$, and the spherical symmetry of $\hat{w}$. Accordingly the contribution
to the interaction stress integral (3.13) is $\mathbf{f}_{ij}$ and to couple-stress integral (3.15) is
zero.
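Case 2 (See Figure 3(i) and (ii)). If $\mathbf{x}_i$ and $\mathbf{x}_j$ both lie within a distance $\epsilon$ of $S$, and
all lines joining points in $S_\epsilon(\mathbf{x}_i)$ to points in $S_\epsilon(\mathbf{x}_j)$ pass through $S$, then from (3.14)
$$
F_{ij}(S) = V_\epsilon^{-2}\int_{S^{-}_\epsilon(\mathbf{x}_i)} d\mathbf{y} \int_{S^{+}_\epsilon(\mathbf{x}_j)} d\mathbf{z} = \frac{V_i^{-}V_j^{+}}{V_\epsilon^{2}},
\tag{3.23}
$$
where
$$
S^{-}_\epsilon(\mathbf{x}_i) := S_\epsilon(\mathbf{x}_i) \cap E^{-}_{\mathbf{n}}(\mathbf{x}_0)
\quad\text{and}\quad
S^{+}_\epsilon(\mathbf{x}_j) := S_\epsilon(\mathbf{x}_j) \cap E^{+}_{\mathbf{n}}(\mathbf{x}_0),
\tag{3.24}
$$
with ($\ell = i$ or $j$)
$$
V_\ell^{\pm} := \operatorname{vol}(S^{\pm}_\epsilon(\mathbf{x}_\ell)).
\tag{3.25}
$$
The net contribution of particles $P_i$ and $P_j$ to the integral in (3.13) is
$$
\begin{aligned}
\mathbf{f}_{ij}F_{ij}(S) + \mathbf{f}_{ji}F_{ji}(S) &= \mathbf{f}_{ij}(F_{ij}(S) - F_{ji}(S)) \\
&= \frac{V_i^{-}V_j^{+} - V_j^{-}V_i^{+}}{V_\epsilon^{2}}\,\mathbf{f}_{ij} \\
&= \left\{1 - \frac{V_i^{+} + V_j^{-}}{V_\epsilon}\right\}\mathbf{f}_{ij} \\
&= \frac{V_i^{-} - V_j^{-}}{V_\epsilon}\,\mathbf{f}_{ij} \quad\text{or}\quad \frac{V_j^{+} - V_i^{+}}{V_\epsilon}\,\mathbf{f}_{ij},
\end{aligned}
\tag{3.26}
$$
on noting ($\ell = i$ or $j$)
$$
V_\ell^{+} + V_\ell^{-} = V_\epsilon.
\tag{3.27}
$$
Figure 3. Case 2: $F_{ij}(S) = V_i^{-}V_j^{+}/V_\epsilon^{2}$ and $\mathbf{F}_{ij}(S) = \{(\mathbf{x}_i - \bar{\mathbf{y}}_i^{-}) + (\mathbf{x}_j - \bar{\mathbf{z}}_j^{+})\}V_i^{-}V_j^{+}/V_\epsilon^{2}$.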
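Further, in this situation (3.16) yields
$$
\mathbf{F}_{ij}(S) = V_\epsilon^{-2}\left\{V_j^{+}\int_{S^{-}_\epsilon(\mathbf{x}_i)} (\mathbf{x}_i - \mathbf{y})\, d\mathbf{y} + V_i^{-}\int_{S^{+}_\epsilon(\mathbf{x}_j)} (\mathbf{x}_j - \mathbf{z})\, d\mathbf{z}\right\}.
$$
Hence
$$
\mathbf{F}_{ij}(S) = \frac{V_i^{-}V_j^{+}}{V_\epsilon^{2}}\{(\mathbf{x}_i - \bar{\mathbf{y}}_i^{-}) + (\mathbf{x}_j - \bar{\mathbf{z}}_j^{+})\},
\tag{3.28}
$$
where $\bar{\mathbf{y}}_i^{-}$ and $\bar{\mathbf{z}}_j^{+}$ denote the centroids of regions $S^{-}_\epsilon(\mathbf{x}_i)$ and $S^{+}_\epsilon(\mathbf{x}_j)$, respectively.
Since $\bar{\mathbf{y}}_i^{-}$ ($\bar{\mathbf{z}}_j^{+}$) lies on that line through $\mathbf{x}_i$ ($\mathbf{x}_j$) parallel to $\mathbf{n}$, by symmetry,
$$
\mathbf{F}_{ij}(S) \ \text{is parallel to}\ \mathbf{n}.
\tag{3.29}
$$
The analogue of (3.26) for the net contribution to the integral in (3.15) from particles $P_i$ and $P_j$ is
$$
\mathbf{F}_{ij}(S) \otimes \mathbf{f}_{ij} + \mathbf{F}_{ji}(S) \otimes \mathbf{f}_{ji}
= \{(\mathbf{x}_i - \bar{\mathbf{y}}_i^{-})V_i^{-} - (\mathbf{x}_j - \bar{\mathbf{z}}_j^{-})V_j^{-}\}V_\epsilon^{-1} \otimes \mathbf{f}_{ij}
\tag{3.30}
$$
upon using (2.28) and (3.27), where $\bar{\mathbf{z}}_j^{-}$ denotes the centroid of $S^{-}_\epsilon(\mathbf{x}_j)$.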
Notice that the foregoing results also hold when Sǫ (xi ) and Sǫ (xj ) intersect.
Also, Fij (S) never vanishes, even when xi and xj lie on the same side of S: see
(3.23) and Figure 3(ii). Further, the net contribution of Pi and Pj , given by (3.26),
is a nonzero multiple of fij provided that xi and xj are not equidistant from S.
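The volume factors in (3.23) and (3.26) are easy to evaluate, because each part of a sphere cut off by the plane is a spherical cap. The following sketch assumes the hypotheses of Case 2 (all joining lines pass through $S$) and describes each particle by its signed height $s = (\mathbf{x}_\ell - \mathbf{x}_0)\cdot\mathbf{n}$ above the plane; the function names are illustrative.

```python
import math

def ball_volume(eps):
    """Volume V_eps of a ball of radius eps."""
    return 4.0 * math.pi * eps ** 3 / 3.0

def volume_below(s, eps):
    """Volume of the part of a ball of radius eps, centred at signed height s, lying below the plane."""
    h = min(max(eps - s, 0.0), 2.0 * eps)  # cap height, clamped to [0, 2 eps]
    return math.pi * h * h * (3.0 * eps - h) / 3.0

def net_factor(s_i, s_j, eps):
    """Return F_ij - F_ji from (3.23) and the closed form (V_i^- - V_j^-)/V_eps from (3.26)."""
    v = ball_volume(eps)
    vi_minus, vj_minus = volume_below(s_i, eps), volume_below(s_j, eps)
    vi_plus, vj_plus = v - vi_minus, v - vj_minus
    f_ij = vi_minus * vj_plus / v ** 2
    f_ji = vj_minus * vi_plus / v ** 2
    return f_ij - f_ji, (vi_minus - vj_minus) / v

if __name__ == "__main__":
    print(net_factor(-0.3, 0.4, 1.0))  # the two entries agree
    print(net_factor(0.2, 0.2, 1.0))   # equal signed heights: both entries vanish
```

The second call illustrates that the net factor vanishes when the two centres have the same signed height, since then $V_i^{-} = V_j^{-}$.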
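Other cases. For particles $P_i$ and $P_j$ for which not all lines joining points in
$S_\epsilon(\mathbf{x}_i)$ to points in $S_\epsilon(\mathbf{x}_j)$ intersect $S$, results are more complex. In the
situation depicted in Figure 4(i),
$$
F_{ij}(S) = \frac{1}{V_\epsilon^{2}}\int_{S_\epsilon(\mathbf{x}_j)}\int_{R_i^{-}(\mathbf{z})} d\mathbf{y}\, d\mathbf{z},
$$
where
$$
R_i^{-}(\mathbf{z}) := \{\mathbf{y} \in E^{-}_{\mathbf{n}}(\mathbf{x}_0) \cap S_\epsilon(\mathbf{x}_i) \ \text{and}\ \ell(\mathbf{y}, \mathbf{z}) \ \text{intersects}\ S\}.
$$
Figure 4(ii) indicates a situation in which
$$
F_{ij}(S) = \frac{1}{V_\epsilon^{2}}\int_{R^{+}(\mathbf{x}_j)}\int_{R_i^{-}(\mathbf{z})} d\mathbf{y}\, d\mathbf{z},
$$
where $R^{+}(\mathbf{x}_j)$ is the set of points $\mathbf{z} \in S_\epsilon(\mathbf{x}_j)$ such that there exists a point $\mathbf{y} \in S_\epsilon(\mathbf{x}_i)$
for which $\ell(\mathbf{y}, \mathbf{z})$ intersects $S$.
Figure 4. Some other cases.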
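3.2.3. Fields $\mathcal{D}_w$ and $\mathcal{M}_w$ (see (2.17) and (2.45)) are to be identified with fluxes
of momentum and generalised moment of momentum associated with molecular
mass transport. To see this note that as a consequence of (2.15)
$$
\mathcal{D}_w(\mathbf{x}, t) = \sum_{i=1}^{N} m_i\mathbf{v}_i(t) \otimes \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}).
\tag{3.31}
$$
For any unit vector $\mathbf{n}$ it follows that, suppressing time dependence for brevity,
$$
\mathcal{D}_w(\mathbf{x})\mathbf{n} = \sum_{i=1}^{N} m_i\mathbf{v}_i\{(\mathbf{v}_i - \mathbf{v}(\mathbf{x}))\cdot\mathbf{n}\}\, w(\mathbf{x}_i - \mathbf{x})
\tag{3.32}
$$
and
$$
\mathcal{M}_w(\mathbf{x})\mathbf{n} = \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i\mathbf{v}_i\{(\mathbf{v}_i - \mathbf{v}(\mathbf{x}))\cdot\mathbf{n}\}\, w(\mathbf{x}_i - \mathbf{x}).
\tag{3.33}
$$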
Each particle $P_i$ within a distance $\epsilon$ of $\mathbf{x}$ contributes to sums (3.32) and (3.33) a
weighted multiple of $m_i\mathbf{v}_i$ and $(\mathbf{x}_i - \mathbf{x}) \otimes m_i\mathbf{v}_i$, respectively: the weighting factor,
$\alpha_i$ say, is $(\mathbf{v}_i - \mathbf{v}(\mathbf{x}))\cdot\mathbf{n}/V_\epsilon$. Such velocities $\mathbf{v}_i$, if constant, would involve $P_i$ crossing
a plane surface, moving with velocity $\mathbf{v}(\mathbf{x})$, through $\mathbf{x}$ with unit normal $\mathbf{n}$ from
$E^{-}_{\mathbf{n}}(\mathbf{x})$ into $E^{+}_{\mathbf{n}}(\mathbf{x})$ (see (3.9)) if $\alpha_i > 0$ and in the opposite sense if $\alpha_i < 0$.
Notice also that the contribution to Cauchy stress $\mathbf{T}_w$ from the momentum flux
(see (2.33)) is ‘pressure-like’ in that, for any nonzero vector $\mathbf{n}$,
$$
\mathcal{D}_w\mathbf{n}\cdot\mathbf{n} = \sum_{i=1}^{N} m_i(\hat{\mathbf{v}}_i\cdot\mathbf{n})^{2}\, w > 0.
\tag{3.34}
$$
(The only exception would be physically-unrealistic situations in which, for some
$\mathbf{n}$, all particles have the same $\mathbf{n}$ component of velocity.) Quantity $\hat{\mathbf{v}}_i(t; \mathbf{x})$ approximates, for particles near $\mathbf{x}$, the thermal velocity
$$
\tilde{\mathbf{v}}_i(t) := \mathbf{v}_i(t) - \mathbf{v}_w(\mathbf{x}_i(t), t)
\tag{3.35}
$$
of $P_i$ corresponding to $w$. It is the kinetic energy associated with such thermal
velocities that is, according to the kinetic theory of heat (see [9]), identified with
heat energy. Specifically, at the scale embodied in the choice of $w$,
$$
\rho_w(\mathbf{x}, t)h_w(\mathbf{x}, t) := \sum_{i=1}^{N} \frac{1}{2} m_i\tilde{\mathbf{v}}_i^{2}(t)\, w(\mathbf{x}_i(t) - \mathbf{x})
\tag{3.36}
$$
is the heat content density. Modulo the approximation
$$
\tilde{\mathbf{v}}_i(t) \simeq \hat{\mathbf{v}}_i(t; \mathbf{x}) \quad\text{if } \|\mathbf{x}_i(t) - \mathbf{x}\| < \epsilon,
\tag{3.37}
$$
from (2.17) we have
$$
\operatorname{tr}\mathcal{D}_w = 2\rho h.
\tag{3.38}
$$
Consider a moderately-rarefied gas macroscopically at rest in a container of
volume $V$. In such case $\tilde{\mathbf{v}}_i = \mathbf{v}_i$ and $\mathbf{T}^{-}_w$ is negligible in comparison with $\mathcal{D}_w$:
interactions occur only ‘occasionally’ via binary ‘collisions’. It is then reasonable to expect $\mathcal{D}_w$ to be isotropic and constant, except for a region of thickness
$2\epsilon$ centred on the boundary of the container. Thus for some $P > 0$ (see (3.34))
$$
\mathcal{D}_w = P\,\mathbf{1},
\tag{3.39}
$$
and integration of (3.38) over the container interior yields (modulo neglect of
boundary inhomogeneity)
$$
3PV = \sum_{i=1}^{N} m_i\mathbf{v}_i^{2},
\tag{3.40}
$$
since
$$
\int_{\text{container}} 2\rho h = \int_{E} 2\rho h = \sum_{i=1}^{N} m_i\mathbf{v}_i^{2}
\tag{3.41}
$$
as a consequence of normalisation property (2.2). If each molecule has mass $m$
then
$$
\sum_{i=1}^{N} m_i\mathbf{v}_i^{2} = Nm\bar{v}^{2},
\tag{3.42}
$$
where $\bar{v}^{2}$ is the mean square velocity of molecules. Thus, from (3.40), (3.41) and
(3.42),
$$
PV = \frac{1}{3}Nm\bar{v}^{2}.
\tag{3.43}
$$
This is the ideal gas relation: the temperature $\theta$ in such context is given (see, for
example, [10, Section 19.4]) by
$$
\theta = m\bar{v}^{2}/3k,
\tag{3.44}
$$
where $k$ denotes the Boltzmann constant.
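Relations (3.43) and (3.44) together give $PV = Nk\theta$. A minimal sketch, assuming a sample of molecular velocities for a gas of identical molecules, evaluates both quantities from the mean square velocity; the function name, the sample molecular mass and the velocity sample are illustrative only.

```python
import random

BOLTZMANN = 1.380649e-23  # Boltzmann constant k in J/K

def ideal_gas_state(mass, velocities, volume):
    """Return (P, theta) from (3.43) and (3.44) for N molecules of equal mass."""
    n = len(velocities)
    mean_sq = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities) / n
    pressure = n * mass * mean_sq / (3.0 * volume)
    theta = mass * mean_sq / (3.0 * BOLTZMANN)
    return pressure, theta

if __name__ == "__main__":
    rng = random.Random(1)
    mass = 6.6e-26  # kg; roughly an argon atom, used only as sample data
    vels = [tuple(rng.gauss(0.0, 250.0) for _ in range(3)) for _ in range(1000)]
    pressure, theta = ideal_gas_state(mass, vels, volume=1.0e-3)
    print(pressure * 1.0e-3, len(vels) * BOLTZMANN * theta)  # PV and N k theta agree
```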
4. Discussion
4.1. ALTERNATIVE CHOICES OF WEIGHTING FUNCTION
Averaging via weighting functions may be repeated, by defining the $w$-average,
$f_w$, of a spatial field $f$ via
$$
f_w(\mathbf{x}) := \int_{\text{all space}} f(\mathbf{y})\, w(\mathbf{y} - \mathbf{x})\, d\mathbf{y}.
\tag{4.1}
$$
This accords with microscopic averages computed in Section 2 upon writing discrete (that is, purely microscopic) quantities in terms of distributions. For example,
the microscopic mass density (at any given instant: time-dependence is suppressed) is
$$
\varrho_{\text{mic}}(\mathbf{x}) := \sum_{i=1}^{N} m_i\,\delta(\mathbf{x}_i - \mathbf{x}),
\tag{4.2}
$$
where $\delta$ denotes the three-dimensional Dirac distribution. Clearly, from (4.1), (4.2)
and (2.1),
$$
(\varrho_{\text{mic}})_w = \varrho_w.
\tag{4.3}
$$
Upon repeating a $w$-average it is natural to compare $(f_w)_w$ with $f_w$. If one requires
that repeated averaging yields nothing new, that is if
$$
(f_w)_w = f_w,
\tag{4.4}
$$
then the form of $w$ may be determined (see [5, p. 161]). In unbounded domains the
convolution format of (4.1) implies that the Fourier transform $w(\mathbf{k})$ of $w$ should
satisfy
$$
w(\mathbf{k})^{2} = w(\mathbf{k}).
\tag{4.5}
$$
Thus $w(\mathbf{k}) = 0$ or 1 and the simplest (and most physical) choice is for a wavevector ‘cut-off’, say at $|\mathbf{k}| = \varepsilon^{-1}$ for some choice of the length scale $\varepsilon$. That is,
$$
w(\mathbf{k}) = 1 \ \text{ if } |\mathbf{k}| < \varepsilon^{-1},
\qquad
w(\mathbf{k}) = 0 \ \text{ if } |\mathbf{k}| \geq \varepsilon^{-1}.
\tag{4.6}
$$
In such case it follows that
$$
w(\mathbf{d}) = \frac{1}{2\pi^{2}d^{3}}\left\{\sin\frac{d}{\varepsilon} - \frac{d}{\varepsilon}\cos\frac{d}{\varepsilon}\right\},
\tag{4.7}
$$
where
$$
d := |\mathbf{d}|.
\tag{4.8}
$$
The analogue for a bounded rectangular region of dimensions $2L_1 \times 2L_2 \times 2L_3$
yields truncated (at wavelength $\varepsilon$) multiple Fourier series which are delivered by
$$
w(\mathbf{d}) := \frac{1}{8L_1L_2L_3}\prod_{i=1}^{3}\frac{\sin((N_i + 1/2)d_i)}{\sin(d_i/2)},
\tag{4.9}
$$
where $N_i$ is the integral part of $2L_i/\varepsilon$ and $\mathbf{d} = (d_1, d_2, d_3)$. A consequence of
using (scale-dependent) weighting functions of form (4.7) or (4.9) is that averaging
at scale $\varepsilon_1$, followed by a further averaging at scale $\varepsilon_2$, yields the same result as
merely averaging once at the larger of the two scales.
Choice (4.7) does not satisfy the boundedness conditions (2.24) of Noll’s theorem, and accordingly $\mathbf{f}_w$ and $\mathbf{c}_w$ do not appear to be expressible in divergence form.
Thus, for this choice, the balances of linear and generalised moment of momentum
remain in the forms (2.19) and (2.50). To be able to invoke Noll’s theorem, choice
(4.9) must be modified in the same manner as $\hat{w}$ given by (3.2). However, the
interpretations analogous to those of (3.13) and (3.15) are no longer so transparent
and simple.
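Formula (4.7) and the scale-composition property can both be checked numerically. The sketch below assumes the Fourier convention $w(\mathbf{d}) = (2\pi)^{-3}\int w(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{d}}\,d\mathbf{k}$, under which integrating over directions reduces the transform of the cut-off (4.6) to the radial integral $(2\pi^{2}d)^{-1}\int_0^{1/\varepsilon} k\sin(kd)\,dk$; the convention, the midpoint rule and the function names are assumptions made for illustration.

```python
import math

def sharp_cutoff(k, eps):
    """Transform (4.6) as a function of |k|: 1 for |k| < 1/eps, 0 otherwise."""
    return 1.0 if abs(k) < 1.0 / eps else 0.0

def w_closed_form(d, eps):
    """Real-space weighting function (4.7) at separation d > 0."""
    s = d / eps
    return (math.sin(s) - s * math.cos(s)) / (2.0 * math.pi ** 2 * d ** 3)

def w_by_quadrature(d, eps, n=4000):
    """Midpoint rule for (1 / (2 pi^2 d)) * integral over 0 < k < 1/eps of k sin(k d)."""
    h = (1.0 / eps) / n
    total = sum((m + 0.5) * h * math.sin((m + 0.5) * h * d) for m in range(n))
    return total * h / (2.0 * math.pi ** 2 * d)

if __name__ == "__main__":
    print(w_closed_form(2.0, 1.0), w_by_quadrature(2.0, 1.0))  # the two values agree closely
    # Averaging at scales eps1 and eps2 multiplies the transforms,
    # which reproduces the cut-off at the larger of the two scales.
    for k in (0.1, 0.6, 1.5, 3.0):
        print(k, sharp_cutoff(k, 0.5) * sharp_cutoff(k, 2.0), sharp_cutoff(k, 2.0))
```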
4.2. PARTIAL STRESS IN MIXTURE THEORY
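Mass conservation and linear momentum balance for any single constituent in
a non-reacting mixture can be obtained using the methodology of Sections 2.1
and 2.2. For constituent $\alpha$ the analogue of (2.19) is
$$
-\operatorname{div}\mathcal{D}_\alpha + \mathbf{f}_{\alpha\alpha} + \sum_{\beta\neq\alpha} \mathbf{f}_{\alpha\beta} + \mathbf{b}_\alpha = \rho_\alpha\mathbf{a}_\alpha,
\tag{4.10}
$$
where
$$
\mathbf{a}_\alpha := \frac{\partial\mathbf{v}_\alpha}{\partial t} + (\nabla\mathbf{v}_\alpha)\mathbf{v}_\alpha
\tag{4.11}
$$
denotes the $\alpha$ intrinsic acceleration field. Here $\mathbf{f}_{\alpha\alpha}$ denotes the body force density
associated with $\alpha$–$\alpha$ interactions, and $\mathbf{f}_{\alpha\beta}$ represents the body force density which
derives from the effect on constituent $\alpha$ molecules due to those of constituent $\beta$.
The $\beta$ sum is over all constituents except $\alpha$, and the weighting function subscript
has been suppressed. Field $\mathcal{D}_\alpha$ is given by (2.17) with sum taken only over $\alpha$ molecules and velocity $\mathbf{v}_\alpha$ in place of $\mathbf{v}_w$ in definition (2.14), and $\mathbf{b}_\alpha$ denotes the force
density which derives from all influences outside the mixture. It is only possible
to invoke Noll’s theorem in respect of $\mathbf{f}_{\alpha\alpha}$, in precisely the manner of Section 2.2:
separately, and in combination, this is impossible for $\mathbf{f}_{\alpha\beta}$ ($\beta \neq \alpha$). Accordingly
there exists (with choice $\hat{w}$ as in Section 2) an $\alpha$–$\alpha$ interaction stress tensor $\mathbf{T}^{-}_\alpha$
such that
$$
\mathbf{f}_{\alpha\alpha} = \operatorname{div}\mathbf{T}^{-}_\alpha.
\tag{4.12}
$$
Thus (4.10) takes the form
$$
\operatorname{div}\mathbf{T}_\alpha + \sum_{\beta\neq\alpha} \mathbf{f}_{\alpha\beta} + \mathbf{b}_\alpha = \rho_\alpha\mathbf{a}_\alpha,
\tag{4.13}
$$
where the $\alpha$ partial stress tensor
$$
\mathbf{T}_\alpha := \mathbf{T}^{-}_\alpha - \mathcal{D}_\alpha.
\tag{4.14}
$$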
This partial stress differs in its interpretation from that of Truesdell [11] and Bowen
[12], who regarded Tα n as yielding the traction on an oriented surface S with unit
normal n due to the whole mixture on the ‘positive’ side of S upon species α on
its negative side. This latter interpretation gave rise to a paradox (see [13]). The
interpretation here given (which resolves the paradox) was also derived via the
corpuscular considerations of Murdoch and Morro [14, 15] using cellular averaging. The current use of weighting functions and Noll’s Theorem makes precise this
earlier approach.
4.3. COUPLE STRESS AND INHOMOGENEITY
Couple stress is taken into account when materials have microstructure, or are inhomogeneous: see, for example, [16, Section 98], and Lecture II of Truesdell [17].
The discussion here did not address microstructure, but there is a clear link with
inhomogeneity. A measure of this, at the scale ε associated with the weighting
function, is the displacement d from the field point x of the mass centre of those
molecules within a distance ε of x. (Here and hereafter subscripts w will be omitted, for simplicity.) Specifically,
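$$
\rho(\mathbf{x}, t)\mathbf{d}(\mathbf{x}, t) := \sum_{i=1}^{N} m_i(\mathbf{x}_i(t) - \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}),
\tag{4.15}
$$
when $w$ is chosen to be the mollified version of $\hat{w}$ given by (3.2). Differentiating
with respect to $t$ (with $\mathbf{x}$ fixed) yields
$$
\begin{aligned}
\frac{\partial}{\partial t}\{\rho\mathbf{d}\}
&= \sum_{i=1}^{N} m_i\mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) + \sum_{i=1}^{N} m_i((\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i)\nabla w \\
&= \rho\mathbf{v} - \sum_{i=1}^{N} m_i((\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w \\
&= \rho\mathbf{v} + \sum_{i=1}^{N}\operatorname{div}_{\mathbf{x}}\{m_i(\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i\}\, w
 - \operatorname{div}_{\mathbf{x}}\left\{\sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i\mathbf{v}_i\, w\right\} \\
&= \rho\mathbf{v} - \sum_{i=1}^{N} m_i\mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) - \operatorname{div}\{\rho\mathbf{B}\}.
\end{aligned}
\tag{4.16}
$$
Here use has been made of identities
$$
\operatorname{div}(\phi\mathbf{A}) = \phi\operatorname{div}\mathbf{A} + \mathbf{A}\nabla\phi,
$$
with $\phi = w$ and $\mathbf{A} = m_i(\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i$, and
$$
\operatorname{div}(\mathbf{a} \otimes \mathbf{b}) = (\nabla\mathbf{a})\mathbf{b} + (\operatorname{div}\mathbf{b})\mathbf{a}
$$
with $\mathbf{a} = \mathbf{x}_i - \mathbf{x}$, $\mathbf{b} = m_i\mathbf{v}_i$, together with results $\nabla_{\mathbf{x}}\mathbf{a} = -\mathbf{1}$ and $\operatorname{div} m_i\mathbf{v}_i = 0$.
Thus from (4.16), (2.4) and (2.5)
$$
\frac{\partial}{\partial t}\{\rho\mathbf{d}\} = -\operatorname{div}\{\rho\mathbf{B}\}.
\tag{4.17}
$$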
Relation (4.17) is a conservation law for the density of moment of mass, ρd. Balance (2.62) in which (generalised) couple stress appears, may be regarded as an
evolution equation for B, which is related to inhomogeneity via (4.17).
4.4. ON GENERALISED MOMENT OF MOMENTUM BALANCE
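Tensorial pre-multiplication of linear momentum balance (2.32) by $(\mathbf{x} - \mathbf{x}_0)$, where
$\mathbf{x}_0$ is an arbitrary, but fixed, point in the relevant inertial frame, yields
$$
(\mathbf{x} - \mathbf{x}_0) \otimes \operatorname{div}\mathbf{T} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b} = (\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{a}.
\tag{4.18}
$$
Equivalently, noting that (2.40) with $\phi = 1$ may be written as
$$
\operatorname{div}(\mathbf{a} \otimes (\mathbf{b} \otimes \mathbf{c})) = (\nabla\mathbf{a})(\mathbf{b} \otimes \mathbf{c})^{T} + \mathbf{a} \otimes \operatorname{div}(\mathbf{b} \otimes \mathbf{c}),
$$
which clearly holds when simple tensor field $\mathbf{b} \otimes \mathbf{c}$ is replaced by any rank-two
tensor field, (4.18) may be written as
$$
\operatorname{div}\{(\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\} - \mathbf{T}^{T} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b}
= \rho\,\dot{\overline{(\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{v}}} - \rho\mathbf{v} \otimes \mathbf{v}.
\tag{4.19}
$$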
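Integration over a domain $R_t$ whose boundary $\partial R_t$ has outward unit normal $\mathbf{n}$, and
which deforms with the motion prescribed by $\mathbf{v}$, leads to
$$
\int_{\partial R_t} (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\mathbf{n}
+ \int_{R_t} \{-\mathbf{T}^{T} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b} + \rho\mathbf{v} \otimes \mathbf{v}\}
= \frac{d}{dt}\left\{\int_{R_t} (\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{v}\right\}.
\tag{4.20}
$$
Here use has been made of divergence theorem (2.41) with $\mathbf{M} = (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}$ and
Reynolds’ transport theorem (see [18]). Twice the skew part of (4.20) is, noting the
symmetry of $\rho\mathbf{v} \otimes \mathbf{v}$ and (2.66),
$$
\int_{\partial R_t} (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{T}\mathbf{n}
+ \int_{R_t} \{\mathbf{T} - \mathbf{T}^{T} + (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{b}\}
= \frac{d}{dt}\left\{\int_{R_t} (\mathbf{x} - \mathbf{x}_0) \wedge \rho\mathbf{v}\right\}.
\tag{4.21}
$$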
It follows that the usual form of moment of momentum balance for non-polar
media (see, for example, [11, Section 205]) holds if, and only if, T takes symmetric
values.
REMARK 4. As discussed in Remark 3, from (2.31) T− is symmetric for interactions of form (3.5) in the limiting case of scale ǫ tending to zero, and hence
symmetry of T follows from (2.33), since D is symmetric. However, for ǫ > 0 the
complex nature of (2.31) does not lead to such a simple conclusion. Conventional,
postulated forms of moment of momentum balance yield local forms which involve
the skew part of T (and hence of T− ): see, for example, [17, p. 24]. A generalised
moment of momentum balance was also motivated in [7] which involved skT− :
see equation (3.27) therein. Here, however, this is not the case: the derived balance
(2.62) does not involve skT.
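To compare the foregoing with postulated forms of moment of momentum
balance, note the integral form of balance (2.62) is (cf. (4.21))
$$
\int_{\partial R_t} \mathbf{C}\mathbf{n} + \int_{R_t} \mathbf{J} = \frac{d}{dt}\left\{\int_{R_t} \rho\mathbf{B}\right\}.
\tag{4.22}
$$
Adding (4.20) and (4.22) gives
$$
\int_{\partial R_t} \{\mathbf{C}\mathbf{n} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\mathbf{n}\}
+ \int_{R_t} \{-\mathbf{T}^{T} + \rho\mathbf{v} \otimes \mathbf{v} + \mathbf{J} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b}\}
= \frac{d}{dt}\int_{R_t} \{(\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{v} + \rho\mathbf{B}\}.
\tag{4.23}
$$
Twice the skew part of (4.23) is (see the definitions following (2.68) and noting the
symmetry of $\mathcal{D}$: see (2.17))
$$
\int_{\partial R_t} \{\breve{\mathbf{C}}\mathbf{n} + (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{T}\mathbf{n}\}
+ \int_{R_t} \{\mathbf{T}^{-} - (\mathbf{T}^{-})^{T} + \breve{\mathbf{J}} + (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{b}\}
= \frac{d}{dt}\int_{R_t} \{(\mathbf{x} - \mathbf{x}_0) \wedge \rho\mathbf{v} + \rho\breve{\mathbf{B}}\}.
\tag{4.24}
$$
Usual postulated moment of momentum balances correspond to (4.24) with contribution $\mathbf{T}^{-} - (\mathbf{T}^{-})^{T}$ absent. This motivates the search for a couple-stress tensor $\mathbf{C}'$
such that
$$
\operatorname{div}\mathbf{C}' = \operatorname{div}\breve{\mathbf{C}}^{-} + \mathbf{T}^{-} - (\mathbf{T}^{-})^{T}.
\tag{4.25}
$$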
The author has, without success, studied the explicit expressions for the interaction
contributions to the right-hand side of (4.25), its skew part, and skT− , with the
aim of obtaining a suitable analogue of G in (2.52), and thence invoking Noll’s
theorem to establish the existence of such a tensor C′ . However, relation (3.33)
of [7] delivers this result, modulo very weak assumptions on interactions (see I.1.,
I.2. and I.3.). In this respect it is necessary to note that time averaging can be
omitted throughout [7] without affecting the veracity of (3.33), that Sǫ (x) is an
example of an ‘ε-cell’ centred at x, and that a ∧ b employed therein is here written
as (1/2) a ∧ b.
It follows that postulated balances are not incorrect, but that the interpretation
of couple stress therein is different. Further, Eringen [19] integrated local forms of
balance relations for non-polar continua, assumed to hold in ‘microvolumes’ over
so-called ‘macrovolumes’ to obtain a balance which involves both the divergence
of macroscale couple stress, and the skew part of the macroscale stress. Pitteri [4]
obtained a similar result by deriving an energy balance and then assuming this
to be invariant under change of observer. In this context the approach herein is
characterised by a definition of (generalised) couple stress for which the relations
governing evolution of momentum and (tensor-valued) moment of momentum are
uncoupled.
REMARK 5. It is of interest to note the formal similarity between mass conservation (2.6) with momentum balance (2.32), and moment of mass conservation (4.17)
with moment of momentum balance (2.62).
4.5. COMPARISON WITH STATISTICAL MECHANICS
The further averaging of ensemble averages in space and time advocated in [1]
(see the second paragraph in Section 1) is instructive. If spatial averaging of fields
is effected in the manner of (4.1), then, for interactions governed by separation-dependent pair-potentials, the spatial average of the strictly-local form of linear
momentum balance derived in [1] yields a formally-identical relation in which the
averaged stress tensor is symmetric. Such symmetry is preserved by subsequent
time averaging (see [5, Section 7]). However, even for such simple conservative
interactions, the possibility of asymmetric stress as a consequence of inhomogeneity is manifest in Section 3.2.2. Consequently the assumption in [1], that
space-time averages of ensemble averages be identified with mean values of
space-time averages in oft-repeated experiments, is drawn into question.
4.6. CONCLUDING REMARKS
4.6.1. Local measurement values are (local) averages both in space and time.
Thus if field values are to be related to local measurements then the spatial averages
here discussed should be subjected to a further temporal averaging. Such additional
averaging was implemented in [5], and has been extended to time-dependent systems in [20, 21]. For brevity details have been omitted: revised interpretations of
stress and couple stress are simple time averages of those given here.
4.6.2. Adequate continuum modelling of macromolecular systems often requires
consideration of couple stresses and body couples: for example, nematic liquid
crystalline phases. If each macromolecule is regarded as an assembly of interacting
point masses, the balances here obtained remain valid. However, more detailed
book-keeping, taking account of co-operative macromolecular behaviour (for example, local alignment of long, ‘thin’, molecules), is necessary before balances are
obtained which resemble those usually postulated. An earlier study [22] addressed
this issue, using cellular averaging. In this work it was shown that if, roughly
speaking, macromolecules deform homogeneously, and neighbouring molecules
deform in much the same way, then the full tensor-valued moment of momentum
balance serves as an evolution equation for the tensor-valued measure of such affine
deformation.
References
1. J.H. Irving and J.G. Kirkwood, The statistical mechanical theory of transport processes. IV. The equations of hydrodynamics. J. Chem. Phys. 18 (1950) 817–829.
2. W. Noll, Die Herleitung der Grundgleichungen der Thermomechanik der Kontinua aus der statistischen Mechanik. J. Rational Mech. Anal. 4 (1955) 627–646.
3. M. Pitteri, Continuum equations of balance in classical statistical mechanics. Arch. Rational Mech. Anal. 94 (1986) 291–305.
4. M. Pitteri, On a statistical-kinetic model for generalized continua. Arch. Rational Mech. Anal. 111 (1990) 99–120.
5. A.I. Murdoch and D. Bedeaux, Continuum equations of balance via weighted averages of microscopic quantities. Proc. Roy. Soc. London A 445 (1994) 157–179.
6. D.L. Goodstein, States of Matter. Dover, New York (1985).
7. A.I. Murdoch, A corpuscular approach to continuum mechanics: basic considerations. Arch. Rational Mech. Anal. 88 (1985) 291–321.
8. A.I. Murdoch, Elements of the continuum modelling of material behaviour and its relation to the fundamentally-discrete nature of matter. In: Lecture Notes, Centre of Excellence for Advanced Materials and Structures, IPPT, Polish Academy of Sciences, Warsaw (2002).
9. S.G. Brush, The Kind of Motion We Call Heat. North-Holland, Amsterdam/New York (1986).
10. H.C. Ohanian, Physics, Vol. 1. Norton, New York/London (1985).
11. C. Truesdell and R.A. Toupin, The Classical Field Theories, S. Flügge (ed.), Handbuch der Physik. Springer, Berlin (1960).
12. R.M. Bowen, Theory of Mixtures, A.C. Eringen (ed.), Continuum Physics, Vol. III. Academic Press, New York (1976).
13. M.E. Gurtin, M.L. Oliver and W.O. Williams, On the balance of forces for mixtures. Quart. Appl. Math. 30 (1973) 527–530.
14. A. Morro and A.I. Murdoch, Stress, body force, and momentum balance in mixture theory. Meccanica 21 (1986) 184–190.
15. A.I. Murdoch and A. Morro, On the continuum theory of mixtures: Motivation from discrete considerations. Internat. J. Engrg. Sci. 25 (1987) 9–25.
16. C. Truesdell and W. Noll, The Non-linear Field Theories of Mechanics, S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
17. C. Truesdell, Six Lectures on Modern Natural Philosophy. Springer, Berlin (1966).
18. C. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1. Academic Press, New York (1977).
19. A.C. Eringen, Mechanics of micromorphic continua. In: E. Kröner (ed.), Mechanics of Generalized Continua. Springer, New York (1968).
20. A.I. Murdoch, On time-dependent material systems. Internat. J. Engrg. Sci. 38 (2000) 429–452.
21. A.I. Murdoch and S.M. Hassanizadeh, Macroscale balance relations for bulk, interfacial and common line systems in multiphase flows through porous media on the basis of molecular considerations. Internat. J. Multiphase Flow 28 (2002) 1091–1123.
22. A.I. Murdoch, On the relationship between balance relations for generalised continua and molecular behaviour. Internat. J. Engrg. Sci. 25 (1987) 883–914.
The Hanging Rope of Minimum Elongation for a
Nonlinear Stress–Strain Relation
PABLO V. NEGRÓN-MARRERO
Department of Mathematics, University of Puerto Rico, Humacao, PR 00791-4300, U.S.A.
E-mail: pnm@www.uprh.edu
Received 23 September 2002
Abstract. We consider the problem of determining the shape that minimizes the elongation of a rope
that hangs vertically under its own weight and an applied force, subject to either a constraint of fixed
total mass or fixed total volume. The constitutive function for the rope is given by a nonlinear stress–
strain relation and the mass–density function of the rope can be variable. For the case of fixed total
mass we show that the problem can be explicitly solved in terms of the mass density function, applied
force, and constitutive function. In the special case where the mass–density function is constant, we
show that the optimal cross-sectional area of the rope is as that for a linear stress–strain relation
(Hooke’s Law). For the total fixed volume problem, we use the implicit function theorem to show
the existence of a branch of solutions depending on the parameter representing the acceleration of
gravity. This local branch of solutions is extended globally using degree theoretic techniques.
Mathematics Subject Classifications (2000): 34B15, 74B20, 74G25.
Key words: string, mass–density, nonlinear stress–strain relation, implicit function theorem, compact map.
The contributions of Clifford Truesdell to the development of the field of rational
mechanics were vast. I met Professor Truesdell only briefly at a meeting on Theoretical Mechanics at Rutgers University in 1990. However, I am in debt to him for
the legacy of his teachings in rational mechanics. The following paper is dedicated to his memory.
1. Introduction
The problem of the motion of a string under different types of boundary conditions
and forces dates back to Euler [7] and Lagrange [14]. The problem we consider
here is a variation of the catenary problem first proposed by Leonardo da Vinci. In
particular, we consider the problem of a rope or string attached at one end, hanging
vertically under its own weight, and subject to an applied force at the hanging
end. (See Figure 1.) Instead of specifying a shape for the rope and determining its
deformation, we consider the optimal control problem of determining the shape of
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 627–649.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Figure 1. Geometry of deformation. In the reference configuration there is no gravity and we
either have the total volume fixed to V or the total mass fixed to M.
the rope (given by its cross-sectional area function) that minimizes its elongation
under the effect of its own weight and applied force, subject to either a constraint
of fixed volume or fixed mass. This problem was treated in [20] for a linear stress–
strain relation (Hooke’s Law) and constant mass–density function. We generalize
the results in [20] to materials that satisfy a general nonlinear stress–strain relation
and with a variable mass–density function. The model for the string that we used
is based on those in [3], to which we also refer for a detailed historical account of
problems for strings.
A related problem to the one treated here is that, instead of hanging, the rope
is now upside down and is thought of as a column. The question is, subject to
the constraint of fixed volume: what shape should the column have in order to
maximize its strength? This problem is equivalent to maximizing the first buckling
mode of the system. We refer to [13, 5, 6] for further details on this problem.
In Section 2 we derive the equilibrium equations of the string and describe
the constitutive assumptions on the material behavior. In Section 3 we characterize the problem of minimum elongation as one of the calculus of variations.
For the problem of fixed total mass, we show in Section 4 that the corresponding
Euler–Lagrange equations can be solved explicitly in terms of the mass–density
function, applied force, and constitutive function (cf. (4.10)), and that this solution
corresponds to a global weak minimizer of the total elongation functional. In the
particular case where the mass–density function is constant, we find the surprising
result that the corresponding cross-sectional area function is identical to the one
found in [20] for the case of a linear stress–strain relation. The corresponding
minimum elongation of the string, however, depends on the material response
(cf. (4.15)).
In Section 5 we carry out the analysis for the constraint of fixed total volume.
Since the problem is formulated as one for a functional in terms of a pseudoantiderivative of the cross-sectional area function (cf. (3.3)), and the mass–density
function may be variable, the volume constraint becomes an integral constraint.
Thus we have an isoperimetric problem of the calculus of variations and the Euler–
Lagrange equations involve an additional term proportional to a Lagrange multiplier. The Euler–Lagrange equations for this problem are then formulated as a
smooth mapping between appropriate Banach spaces of smooth functions, using
the gravitational parameter for continuation. We first study the problem with zero
gravity and show that it has a solution which is unique and corresponds to a constant cross-sectional area function. We then show that the implicit function theorem
is applicable to the problem with nonzero gravity to get the existence of a local
curve of solutions. This result includes the case of small but negative gravitational
parameter, which can be interpreted as the case of a gravitational field vertically
upward or as if the rope would hang upside down. Finally we apply the classical
Leray–Schauder degree for compact maps [16, 4, 2] to extend globally the local
branch found via the Implicit Function Theorem. The uniqueness of solutions for
the problem with zero gravity together with some a priori estimates on the solutions
allow us to rule out two of the Rabinowitz alternatives for the global continuum.
In addition we get that the problem has a solution for each nonnegative value of
the gravitational parameter. We then show that for fixed values of the gravitational
parameter, the solution obtained is a weak local minimizer of a modified version
of the original variational formulation (cf. (5.25), (5.26)).
For any $m \ge 0$ we consider the spaces $C^m[0, L]$ consisting of functions $v$ with
$m$ continuous derivatives in $[0, L]$ and with norm given by
$$\|v\|_{C^m[0,L]} = \sum_{k=0}^{m} \max_{0 \le x \le L} |v^{(k)}(x)|.$$
We observe that $C^m[0, L]$ is a Banach space and for $m \ge 1$ it is compactly embedded into $C^{m-1}[0, L]$. That is, if $\{v_k\}$ is a bounded sequence in $C^m[0, L]$, then by the
Arzelà–Ascoli Theorem, $\{v_k\}$ has a subsequence that converges in $C^{m-1}[0, L]$.
2. The Equations of Equilibrium
2.1. GEOMETRY OF DEFORMATION
We consider a rope or string which in its reference configuration occupies the
region $\Omega$ in $\mathbb{R}^3$. We let $(x, y, z)$ represent Cartesian coordinates in $\Omega$ and assume
that $[0, L] = \{x : (x, y, z) \in \Omega\}$, where the positive $x$ axis is downward in the
vertical direction. For any $x \in [0, L]$ we define the cross-section of $\Omega$ at $x$ by
$$\Omega_x = \{(y, z) : (x, y, z) \in \Omega\}, \tag{2.1}$$
and let $A(x)$ be the area of $\Omega_x$. We assume that the cross-sectional area function $A(\cdot)$ is positive and continuous on $[0, L]$ (see Figure 1). We consider a one-dimensional deformation of $\Omega$ given by
$$p(x, y, z) = (u(x), y, z), \tag{2.2}$$
for some $C^1$ function $u(\cdot)$. (See Figure 1.) The requirement that an (infinitesimal) volume in the reference configuration cannot be reduced to zero by the
deformation $p$ implies that
$$u'(x) > 0, \quad \forall x \in [0, L]. \tag{2.3}$$
2.2. MECHANICAL RESPONSE
For any x ∈ [0, L] we denote by n(x) the force exerted by the material on [0, x] on
that on [x, L] in a deformed configuration. We assume that the material of the rope
has mass density per unit volume at x given by ρ(x), where ρ(·) is a given positive
continuously differentiable function. Hence the weight of the [x, L] section of the
rope is given by:
$$g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}, \tag{2.4}$$
where $g$ denotes the acceleration of gravity. Assuming that a force $W$ is applied at
$x = L$, the total force exerted on the section $[x, L]$ is
$$W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{2.5}$$
For equilibrium, the forces must balance at each $x \in [0, L]$, i.e.,
$$n(x) = W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{2.6}$$
We say that the material of the rope is elastic and nonhomogeneous if for some
function $N(\cdot, \cdot)$ we have that
$$n(x) = N(u'(x), x). \tag{2.7}$$
The usual way to account for the lack of homogeneity is by taking
$$N(u'(x), x) = A(x) \hat{N}(u'(x)), \tag{2.8}$$
where $\hat{N} : (0, \infty) \to \mathbb{R}$ satisfies:
A1. $\hat{N}(\cdot)$ is a strictly increasing smooth function;
A2. $\hat{N}(\nu) \to \infty$ as $\nu \to \infty$;
A3. $\hat{N}(\nu) \to -\infty$ as $\nu \to 0^+$.
From properties A1–A3 it follows that $\hat{N} : (0, \infty) \to \mathbb{R}$ has a smooth inverse
$\hat{\nu} : \mathbb{R} \to (0, \infty)$. We further assume that
A4. $N \mapsto N^2 \hat{\nu}_N(N)$ is strictly increasing on $[0, \infty)$;
A5. $N^2 \hat{\nu}_N(N) \to \infty$ as $N \to \infty$.
One can easily check that condition A4 is equivalent to the strict convexity of
the integrand in the functional giving the total elongation of the rope (cf. (3.4)).
If we combine (2.6), (2.7), and (2.8) we get that
$$\hat{N}(u'(x)) = A(x)^{-1} \left[ W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \right]. \tag{2.9}$$
(See the Appendix for a derivation of this equation from the three-dimensional
theory of elasticity.) Since the top of the rope is attached to a wall we have that
u(0) = 0.
(2.10)
We consider two types of additional constraints: we assume either that the total
mass of the rope is a given constant M:
$$\int_0^L \rho(x) A(x)\, dx = M, \tag{2.11}$$
or that the volume of the rope is a given constant $V$:
$$\int_0^L A(x)\, dx = V. \tag{2.12}$$
For a constant mass–density function ρ(·) both constraints are equivalent.
3. Rope of Minimum Elongation
Note that we can write (2.9) as:
$$u'(x) = \hat{\nu}\!\left( A(x)^{-1} \left[ W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \right] \right). \tag{3.1}$$
Integrating now over $[0, L]$ and using (2.10), we get the following expression for
the total elongation of the rope:
$$u(L) = \int_0^L \hat{\nu}\!\left( A(x)^{-1} \left[ W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \right] \right) dx. \tag{3.2}$$
The problem then is to find a function $A(\cdot)$ that minimizes the above expression
for $u(L)$ subject to either constraint (2.11) or (2.12).
Let
$$B(x) = \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{3.3}$$
Hence $B'(x) = -\rho(x) A(x)$ and we can write (3.2) as
$$u(L) = \int_0^L \hat{\nu}\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) dx. \tag{3.4}$$
Note that condition A4 can be seen now to be equivalent to the strict convexity
with respect to $B'$ of the integrand in the above functional. Note that $B(L) = 0$ and
either
$$B(0) = M, \tag{3.5}$$
if (2.11) holds, or
$$\int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V, \tag{3.6}$$
if (2.12) holds. Thus our problem now is to find a function $B(\cdot)$ that minimizes (3.4)
subject to $B(L) = 0$ and either of the two constraints (3.5) or (3.6).
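The elongation functional (3.2) is straightforward to evaluate numerically for a trial area function. The following is a minimal sketch, not part of the original analysis: the function name `elongation`, the midpoint discretization, and the sample inverse constitutive function (a cube root, the inverse of the power law ν³ used later in Section 6 with A2 = 0) are all illustrative assumptions.

```python
def elongation(area, rho, nu_hat, W, g, L, n=2000):
    """Approximate u(L) in (3.2) by the midpoint rule on n equal cells.

    area, rho and nu_hat are callables for A(x), rho(x) and the inverse
    constitutive function; all names here are illustrative.
    """
    dx = L / n
    xs = [(i + 0.5) * dx for i in range(n)]
    cell_mass = [rho(x) * area(x) * dx for x in xs]
    total = 0.0
    mass_below = 0.0  # mass of the cells strictly below the current one
    for i in reversed(range(n)):
        # B(x) of (3.3) at the cell midpoint: cells below plus half of this cell
        b_mid = mass_below + 0.5 * cell_mass[i]
        stress = (W + g * b_mid) / area(xs[i])
        total += nu_hat(stress) * dx
        mass_below += cell_mass[i]
    return total


if __name__ == "__main__":
    cube_root = lambda s: s ** (1.0 / 3.0)  # inverse of the power law nu**3
    u = elongation(lambda x: 0.05, lambda x: 0.1, cube_root, W=0.1, g=9.8, L=1.0)
    print(f"u(L) for a uniform rope: {u:.6f}")
```

With g = 0 and a constant area A0 the integrand is constant, so the routine should return L ν̂(W/A0); this gives a quick sanity check before trying non-uniform shapes.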
4. Fixed Total Mass
We now study the problem of minimizing (3.4) subject to (3.5) for an arbitrary
mass density function ρ(·). More specifically, we study the problem
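$$\min_{B \in X} J(B), \tag{4.1}$$
where
$$J(B) = \int_0^L \hat{\nu}\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) dx, \tag{4.2}$$
$$X = \{ B \in C^1[0, L] : B(0) = M,\ B(L) = 0,\ B'(x) < 0\ \forall x \}. \tag{4.3}$$
The Euler–Lagrange equations for this functional are given by:
$$\frac{d}{dx}\left[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) \right] + \frac{g\rho(x)}{B'(x)}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) = 0, \quad 0 < x < L, \tag{4.4a}$$
$$B(0) = M, \quad B(L) = 0. \tag{4.4b}$$
If we let
$$H(x) = -\frac{\rho(x)(W + gB(x))}{B'(x)}, \tag{4.5}$$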
then a simple computation shows that
$$\frac{d}{dx}\left[ H(x)^2 \hat{\nu}_N(H(x)) \right] = \rho(x)(W + gB(x))\, \frac{d}{dx}\left[ -\frac{H(x)}{B'(x)}\, \hat{\nu}_N(H(x)) \right] + \left[ \frac{\rho'(x)}{\rho(x)} H(x) - g\rho(x) \right] H(x)\, \hat{\nu}_N(H(x)).$$
If we multiply (4.4a) by $\rho(x)(W + gB(x))$, recall (4.5), and use the above identity,
then we have that (4.4a) is equivalent to
$$\frac{d}{dx}\left[ H(x)^2 \hat{\nu}_N(H(x)) \right] - \frac{\rho'(x)}{\rho(x)} H(x)^2 \hat{\nu}_N(H(x)) = 0. \tag{4.6}$$
This equation can be easily integrated now to get that
$$H(x)^2 \hat{\nu}_N(H(x)) = c\rho(x), \tag{4.7}$$
for some constant $c$. The left-hand side of this equation can be written as $h(H(x))$
where $h(N) = N^2 \hat{\nu}_N(N)$. Thus (4.7) is equivalent to
$$\frac{W + gB(x)}{B'(x)} = -\frac{1}{\rho(x)}\, h^{-1}(c\rho(x)), \tag{4.8}$$
where $h^{-1}$ is the inverse function of $h$ which exists under hypotheses A4, A5.
Equation (4.8) can be written as
$$B'(x) + \frac{g\rho(x)}{h^{-1}(c\rho(x))}\, B(x) = -\frac{W\rho(x)}{h^{-1}(c\rho(x))}.$$
Using the integrating factor
$$\mu(x) = \exp\left( g \int_0^x \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \right), \tag{4.9}$$
together with the boundary condition $B(L) = 0$, we conclude that
$$B(x) = \frac{W}{g} \left[ \exp\left( g \int_x^L \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \right) - 1 \right]. \tag{4.10}$$
It remains to determine the constant $c$. But using (4.10) we get that the boundary
condition $B(0) = M$ is equivalent to
$$G(c) \equiv \frac{W}{g} \left[ \exp\left( g \int_0^L \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \right) - 1 \right] = M. \tag{4.11}$$
It follows from hypotheses A4, A5 that $h^{-1}(0) = 0$, $h^{-1}$ is strictly increasing, and
that $h^{-1}(s) \to \infty$ as $s \to \infty$. From these properties of $h^{-1}$ it follows that $G$ is
strictly decreasing, $G(c) \to 0$ as $c \to \infty$, and $G(c) \to \infty$ as $c \to 0^+$. Thus
equation (4.11) has a solution which is unique for each $M > 0$. It follows as well
that (4.4) has a unique solution for each $M > 0$.
The above argument can be easily modified to show that the problem
$$\frac{d}{dx}\left[ \frac{\rho(x)(W + g\Phi(x))}{\Phi'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + g\Phi(x))}{\Phi'(x)} \right) \right] + \frac{g\rho(x)}{\Phi'(x)}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + g\Phi(x))}{\Phi'(x)} \right) = 0, \quad a < x < L, \tag{4.12a}$$
$$\Phi(a) = B(a), \quad \Phi(L) = 0, \tag{4.12b}$$
has a unique solution $\Phi(\,\cdot\,; a)$ for any $a \in [0, L)$, where $B$ is the unique solution
of (4.4). This fact can be used now to construct a stationary field for the functional (4.2). This, together with condition A4, which implies the strict convexity
with respect to $B'$ of the integrand in (4.2), allows us to invoke Hilbert’s Invariant
Integral Theorem [18, Theorem 9.7] to get that the solution $B$ of (4.4) is the unique
(weak) minimizer of (4.2) on (4.3).
4.1. UNIFORM MASS DENSITY
In the case where ρ is constant we can determine explicitly the constant c in (4.11).
In this case the integrand in (4.11) is a constant which we denote by $K$. Equation (4.11) now reduces to
$$\frac{W}{g} \left( e^{gKL} - 1 \right) = M,$$
which has solution $K = (1/Lg) \ln(1 + gM/W)$ that upon substitution into (4.10)
yields
$$B(x) = \frac{W}{g} \left[ \left( 1 + \frac{gM}{W} \right)^{1 - x/L} - 1 \right]. \tag{4.13}$$
Since $A(x) = -B'(x)/\rho$, we get after simplification that
$$A(x) = \frac{W}{g\rho L} \ln\left( 1 + \frac{gM}{W} \right) \cdot \left( 1 + \frac{gM}{W} \right)^{1 - (x/L)}. \tag{4.14}$$
Note that this function is decreasing, i.e., the minimum elongation is attained by
tapering down the rope from top to bottom.⋆ This is the same result obtained in [20]
for the case $\hat{\nu}(\cdot)$ linear (Hooke’s Law). If we substitute (4.13) into (3.4) we get that
the total elongation of the rope is given by
$$u(L) = L \hat{\nu}\!\left( \frac{\rho}{K} \right) = L \hat{\nu}\!\left( \frac{g\rho L}{\ln(1 + gM/W)} \right). \tag{4.15}$$
⋆ In the general case where ρ depends on x, the cross-sectional area function need not be
monotone.
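The closed-form results (4.14) and (4.15) can be checked numerically. The sketch below is illustrative only: the function names, the midpoint quadrature, the parameter values, and the cube-root choice for the inverse constitutive function are assumptions made for the example. It recovers the prescribed mass from (4.14) and compares the optimal elongation (4.15) with that of a uniform rope of the same mass computed from (3.2).

```python
import math


def optimal_area(x, M, W, g, rho, L):
    """Cross-sectional area (4.14) for a constant mass density rho."""
    q = 1.0 + g * M / W
    return W / (g * rho * L) * math.log(q) * q ** (1.0 - x / L)


def minimum_elongation(nu_hat, M, W, g, rho, L):
    """Total elongation (4.15); nu_hat is the inverse constitutive function."""
    return L * nu_hat(g * rho * L / math.log(1.0 + g * M / W))


def midpoint_integral(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


if __name__ == "__main__":
    M, W, g, rho, L = 0.03, 0.1, 9.8, 0.1, 1.0   # illustrative values
    cube_root = lambda s: s ** (1.0 / 3.0)       # inverse of the power law nu**3

    mass = midpoint_integral(
        lambda x: rho * optimal_area(x, M, W, g, rho, L), 0.0, L)
    print("mass recovered from (4.14):", mass)

    # Uniform rope of the same mass: (3.2) with A(x) = A0 constant
    A0 = M / (rho * L)
    u_uniform = midpoint_integral(
        lambda x: cube_root((W + g * rho * A0 * (L - x)) / A0), 0.0, L)
    print("uniform rope elongation:", u_uniform)
    print("optimal taper elongation:",
          minimum_elongation(cube_root, M, W, g, rho, L))
```

Note that (4.14) does not involve the constitutive function at all, while (4.15) does, which is the point made in the text: the optimal shape is material independent for constant density, but the attained elongation is not.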
5. Fixed Total Volume
We now consider the problem of minimizing (3.4) subject to (3.6) for an arbitrary
mass density function ρ(·). This problem can be formulated as
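$$\min_{X_V} J(B), \tag{5.1}$$
where $J$ is like (4.2) and
$$X_V = \left\{ B \in C^2[0, L] : \int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V,\ B(L) = 0,\ B'(x) < 0\ \forall x \right\}. \tag{5.2}$$
The first order necessary conditions for this problem are obtained by considering
the extended functional
$$\tilde{J}(\lambda, B) = \int_0^L \left[ \hat{\nu}\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) + \lambda \left( \frac{B'(x)}{\rho(x)} + \frac{V}{L} \right) \right] dx, \tag{5.3}$$
over the set
$$\tilde{X} = \{ (\lambda, B) \in \mathbb{R} \times C^2[0, L] : B(L) = 0,\ B'(x) < 0\ \forall x \}, \tag{5.4}$$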
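and where $\lambda$ is a Lagrange multiplier. By considering smooth variations $w$ with
$w(L) = 0$ one gets that the Euler–Lagrange equations for (5.3) and hence of (5.1)
are given by:
$$\frac{d}{dx}\left[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) + \frac{\lambda}{\rho(x)} \right] + \frac{g\rho(x)}{B'(x)}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) = 0, \quad 0 < x < L, \tag{5.5a}$$
$$\left[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)(W + gB(x))}{B'(x)} \right) + \frac{\lambda}{\rho(x)} \right]_{x=0} = 0, \tag{5.5b}$$
$$\int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V, \quad B(L) = 0. \tag{5.5c}$$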
Note that the multiplier λ is determined from the boundary condition (5.5b) and in
fact must be negative. Since in this case it is not possible to obtain explicit solutions,
we study the boundary value problem (5.5) using g as a continuation parameter.
Let $Y = \mathbb{R}^2 \times C^2[0, L]$, $Z = C^0[0, L] \times \mathbb{R}^2$, and
$$U = \{ (g, \lambda, B) \in Y : \lambda < 0,\ B(L) = 0,\ B'(x) < 0\ \forall x \}. \tag{5.6}$$
Equations (5.5) are now equivalent to $G(g, \lambda, B) = 0$ where $G : U \to Z$ is given
by
$$G(g, \lambda, B) = \left( G_1(g, \lambda, B),\ G_2(g, \lambda, B),\ \int_0^L \frac{B'(x)}{\rho(x)}\, dx + V \right), \tag{5.7}$$
where $G_1(g, \lambda, B)$ and $G_2(g, \lambda, B)$ are given by the left-hand sides of (5.5a)
and (5.5b), respectively. A simplified version of the results in [19], which are for
Schauder spaces, gives us that $G_1$, $G_2$ are twice continuously Fréchet differentiable, and since the other component of $G$ is a twice differentiable linear functional
of $B$, we conclude that:
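LEMMA 5.1. The function $G : U \to Z$ is twice continuously Fréchet differentiable and
$$D_{(\lambda,B)} G(g, \lambda, B) \cdot (\gamma, v) = \left( D_{(\lambda,B)} G_1(g, \lambda, B) \cdot (\gamma, v),\ D_{(\lambda,B)} G_2(g, \lambda, B) \cdot (\gamma, v),\ \int_0^L \frac{v'(x)}{\rho(x)}\, dx \right),$$
where $D_{(\lambda,B)} G_1(g, \lambda, B) \cdot (\gamma, v)$ is given by
$$\begin{aligned}
&\frac{d}{dx}\left\{ \rho(x) \left[ \frac{g}{B'(x)^2}\, v(x) - 2\, \frac{W + gB(x)}{B'(x)^3}\, v'(x) \right] \hat{\nu}_N - \rho(x)^2\, \frac{W + gB(x)}{B'(x)^2} \left[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \right] \hat{\nu}_{NN} + \frac{\gamma}{\rho(x)} \right\} \\
&\quad - \frac{g\rho(x)\hat{\nu}_N}{B'(x)^2}\, v'(x) - \frac{g\rho(x)^2}{B'(x)} \left[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \right] \hat{\nu}_{NN},
\end{aligned}$$
and $D_{(\lambda,B)} G_2(g, \lambda, B) \cdot (\gamma, v)$ is given by
$$\left[ \rho(x) \left[ \frac{g}{B'(x)^2}\, v(x) - 2\, \frac{W + gB(x)}{B'(x)^3}\, v'(x) \right] \hat{\nu}_N - \rho(x)^2\, \frac{W + gB(x)}{B'(x)^2} \left[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \right] \hat{\nu}_{NN} + \frac{\gamma}{\rho(x)} \right]_{x=0}$$
and where the argument of $\hat{\nu}_N$ and $\hat{\nu}_{NN}$ is $-\rho(x)(W + gB(x))/B'(x)$.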
5.1. LOCAL CONTINUATION
We now study the existence of solutions of (5.5) for small values of g.
LEMMA 5.2. The equation G(0, λ, B) = 0 has a solution which is unique and
for which the corresponding cross-sectional area function A is constant.
Proof. If we set $g = 0$ in (5.5a) then we get that
$$\frac{d}{dx}\left[ \frac{\rho(x)W}{B'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)W}{B'(x)} \right) + \frac{\lambda}{\rho(x)} \right] = 0,$$
i.e.,
$$\frac{\rho(x)W}{B'(x)^2}\, \hat{\nu}_N\!\left( -\frac{\rho(x)W}{B'(x)} \right) + \frac{\lambda}{\rho(x)} = \text{constant}.$$
The boundary condition (5.5b) with $g = 0$ implies that this “constant” must be
equal to zero from which we conclude that
$$\left( \frac{\rho(x)W}{B'(x)} \right)^2 \hat{\nu}_N\!\left( -\frac{\rho(x)W}{B'(x)} \right) = -\lambda W.$$
(Note that (5.5b) implies that $\lambda < 0$.) Since the right-hand side of this equation is
constant, it follows from hypothesis A4 that
$$-\frac{\rho(x)W}{B'(x)} = C,$$
for some positive constant $C$. (The volume constraint in (5.5c) implies that $C = WL/V$.) Thus $B'(x) = -\rho(x)W/C$ and since $B'(x) = -\rho(x)A(x)$ we get that
$A(x) = W/C$, i.e., $A$ is constant. ✷
Let (λ0 , B0 ) be the solution pair of G(0, λ, B) = 0 given by the above lemma.
We now have:
LEMMA 5.3. The linear map $D_{(\lambda,B)} G(0, \lambda_0, B_0)$ is a bijection from $\mathbb{R} \times C^2[0, L]$
into $Z$.
Proof. It follows from Lemma 5.1 that given any $(f, \alpha, \eta) \in Z$, the equation
$$D_{(\lambda,B)} G(0, \lambda_0, B_0) \cdot (\gamma, v) = (f, \alpha, \eta),$$
is equivalent to:
$$\frac{d}{dx}\left[ \frac{1}{B_0'(x)^2} \left. \frac{d}{dN}\left( N^2 \hat{\nu}_N(N) \right) \right|_{N=H} v'(x) + \frac{\gamma}{\rho(x)} \right] = f(x), \tag{5.8a}$$
$$\left[ \frac{1}{B_0'(x)^2} \left. \frac{d}{dN}\left( N^2 \hat{\nu}_N(N) \right) \right|_{N=H} v'(x) + \frac{\gamma}{\rho(x)} \right]_{x=0} = \alpha, \tag{5.8b}$$
$$v(L) = 0, \quad \int_0^L \frac{v'(x)}{\rho(x)}\, dx = \eta, \tag{5.8c}$$
where $H = -\rho(x)W/B_0'(x)$, etc. Since the coefficient of $v'(x)$ in (5.8a) is positive
by hypothesis A4, problem (5.8a), (5.8b), and the first equation in (5.8c) can be
uniquely solved for $v$ in terms of $f$, $\alpha$, $\gamma$, where the dependence on $\gamma$ is linear.
(See [17].) Upon substitution of this expression for $v$ into the second equation
of (5.8c), we get a linear equation for $\gamma$, which can be uniquely solved. ✷
It follows now from Lemmas 5.1–5.3, and the implicit function theorem
(see [15]) that:
THEOREM 5.4. For small values of g, the problem (5.5) has a solution that
depends continuously on g.
5.2. GLOBAL CONTINUATION
In this section we carry out a global analysis of solutions of (5.5) via Leray–Schauder
degree theory. In order to apply the global continuation results in [16, 4, 2], we
need to recast our problem in terms of a compact operator between appropriate
Banach spaces. (It turns out that assumption A4 is crucial in this respect.) The
local analysis of Section 5.1 is still valid in this setting and thus we just carry out
the additional steps for the global analysis.
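By an analysis similar to the one that leads to (4.6), we can get that (5.5a) is
equivalent to
$$\frac{d}{dx}\left[ H(x)^2 \hat{\nu}_N(H(x)) \right] - \frac{\rho'(x)}{\rho(x)} H(x)^2 \hat{\nu}_N(H(x)) = \lambda\, \frac{\rho'(x)}{\rho(x)}\, (W + gB(x)),$$
which in turn is equivalent to:
$$\frac{d}{dx}\left[ \frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} \right] = \lambda\, \frac{\rho'(x)}{\rho(x)^2}\, (W + gB(x)).$$
If we integrate this equation from $0$ to $x$ we get that
$$\frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} - \frac{H(0)^2 \hat{\nu}_N(H(0))}{\rho(0)} = \lambda \int_0^x \frac{\rho'(t)}{\rho(t)^2}\, (W + gB(t))\, dt. \tag{5.9}$$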
A simple integration by parts shows that
$$\int_0^x \frac{\rho'(t)}{\rho(t)^2}\, (W + gB(t))\, dt = \frac{W + gB(0)}{\rho(0)} - \frac{W + gB(x)}{\rho(x)} + g \int_0^x \frac{B'(t)}{\rho(t)}\, dt.$$
Also from (5.5b) we have that
$$-H(0)^2 \hat{\nu}_N(H(0)) = \lambda (W + gB(0)).$$
Using these two last identities we can conclude that (5.9) is equivalent to
$$\frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} = -\lambda \left[ \frac{W + gB(x)}{\rho(x)} - g \int_0^x \frac{B'(t)}{\rho(t)}\, dt \right]. \tag{5.10}$$
Since $\hat{\nu}_N > 0$, the boundary condition (5.5b) implies that $\lambda < 0$. Furthermore,
since $B'(x) < 0$, it follows that the right-hand side of the above equation is positive.
Let $h(N) = N^2 \hat{\nu}_N(N)$. By hypothesis A4, this function for $N \ge 0$ has an inverse
function $h^{-1}(\cdot)$. Thus after multiplying both sides by $\rho(x)$, the above equation is
equivalent to
$$\frac{W + gB(x)}{B'(x)} = F(g, \lambda, B)(x), \tag{5.11}$$
where
$$F(g, \lambda, B)(x) = -\frac{1}{\rho(x)}\, h^{-1}\!\left( -\lambda \left[ W + gB(x) - g\rho(x) \int_0^x \frac{B'(t)}{\rho(t)}\, dt \right] \right). \tag{5.12}$$
Since $F(g, \lambda, B)(x) < 0$ for all $x$, we can write (5.11) as
$$B'(x) - \frac{g}{F(g, \lambda, B)(x)}\, B(x) = \frac{W}{F(g, \lambda, B)(x)}. \tag{5.13}$$
If we treat the coefficient and right-hand side in this equation as if they were known
functions of $x$, then after using an appropriate integrating factor and the boundary
condition $B(L) = 0$ we can write that
$$B = K_2(g, \lambda, B), \tag{5.14}$$
where
$$K_2(g, \lambda, B)(x) = \frac{W}{g} \left[ \exp\left( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \right) - 1 \right]. \tag{5.15}$$
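Note that
$$\frac{d}{dx} K_2(g, \lambda, B)(x) = \frac{W}{F(g, \lambda, B)(x)} \exp\left( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \right). \tag{5.16}$$
With this expression for $B'$, the volume constraint in (5.5c) becomes
$$\lambda = K_1(g, \lambda, B), \tag{5.17}$$
where
$$K_1(g, \lambda, B) = V + \lambda + \int_0^L \frac{W}{\rho(x) F(g, \lambda, B)(x)} \exp\left( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \right) dx. \tag{5.18}$$
If we let $K = (K_1, K_2)$, then (5.5) is equivalent to
$$(\lambda, B) = K(g, \lambda, B). \tag{5.19}$$
Note that $K : E \to \mathbb{R} \times C^1[0, L]$, where
$$E = \{ (g, \lambda, B) \in [0, \infty) \times (-\infty, 0) \times C^1[0, L] : B'(x) < 0\ \forall x \}. \tag{5.20}$$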
The operator $K$ need not be compact on the whole of $E$, as the conditions $B'(x) < 0$
and $\lambda < 0$ might be violated in the limit for some converging sequence. To deal
with this possibility, we define for any $\delta > 0$ the open set
$$E_\delta = \{ (g, \lambda, B) \in E : \lambda < -\delta,\ B'(x) < -\delta\ \forall x \}. \tag{5.21}$$
Note that $E = \cup_{\delta > 0} E_\delta$. We now have:
LEMMA 5.5. For each $\delta > 0$ and $g \in [0, \infty)$, the operator $K(g, \cdot, \cdot)$ that maps
$\{(\lambda, B) : (g, \lambda, B) \in E_\delta\}$ into $\mathbb{R} \times C^1[0, L]$ is compact.
Proof. Since the mapping $f \mapsto \int_0^x f(t)\, dt$ is compact from $C[0, L]$ into itself,
it follows from (5.12) that $F(g, \cdot, \cdot)$ is compact from
$$E_{\delta, g} = \{ (\lambda, B) : (g, \lambda, B) \in E_\delta \},$$
into $C[0, L]$. Furthermore, from (5.15), (5.16), and (5.18), we get that $K(g, \cdot, \cdot)$ is
the composition of a continuous operator from $C[0, L]$ into $\mathbb{R} \times C^1[0, L]$ with the
compact operator $F(g, \cdot, \cdot)$. Thus $K(g, \cdot, \cdot)$ is compact from $E_{\delta, g}$ into $\mathbb{R} \times C^1[0, L]$. ✷
We need a few preliminary lemmas before invoking the global continuation
results in [16, 4, 2].
LEMMA 5.6. Let $\{(g_j, \lambda_j, B_j)\}$ be a sequence of solutions of (5.19) that converges to $(g, \lambda, B)$ in $\mathbb{R}^2 \times C^1[0, L]$ with $g \ge 0$. Then $B'(x) < 0$ for all $x$ and
$\lambda < 0$.
Proof. Since $B_j' < 0$ and $\lambda_j < 0$ for all $j$, it follows that $B' \le 0$ and $\lambda \le 0$.
Assume that $B'(\bar{x}) = 0$ for some $\bar{x} \in [0, L]$. Then since $B_j'(\bar{x}) < 0$ for all $j$, we
get that
$$\frac{W + g_j B_j(\bar{x})}{B_j'(\bar{x})} \to -\infty, \quad j \to \infty.$$
But from (5.11) and (5.12) we observe that
$$\frac{W + g_j B_j(\bar{x})}{B_j'(\bar{x})} \to -\frac{1}{\rho(\bar{x})}\, h^{-1}\!\left( -\lambda \left[ W + gB(\bar{x}) - g\rho(\bar{x}) \int_0^{\bar{x}} \frac{B'(t)}{\rho(t)}\, dt \right] \right),$$
which is finite and thus we get a contradiction. Hence $B'(x) < 0$ for all $x \in [0, L]$.
To argue that $\lambda < 0$, note that
$$\frac{W + g_j B_j(L)}{B_j'(L)} = \frac{W}{B_j'(L)} \to \frac{W}{B'(L)} < 0.$$
But if $\lambda = 0$, then (5.11) and (5.12) imply that
$$\frac{W + g_j B_j(L)}{B_j'(L)} \to -\frac{1}{\rho(L)}\, h^{-1}(0) = 0,$$
which again leads to a contradiction. ✷
LEMMA 5.7. For any solution $(g, \lambda, B)$ of (5.19) with $g \ge 0$, we have that
$$\|B\|_{C[0,L]} \le K, \tag{5.22a}$$
$$\|B'\|_{C[0,L]} \le \frac{W \|\rho\|_{C[0,L]}}{h^{-1}(-\lambda W)} \exp\left( \frac{g}{h^{-1}(-\lambda W)} \int_0^L \rho(t)\, dt \right), \tag{5.22b}$$
for some constant $K$ depending only on $\rho$ and $V$.
Proof. Since $B(L) = 0$, we have using an integration by parts that
$$B(x) = -\int_x^L B'(t)\, dt = -\int_x^L \frac{B'(t)}{\rho(t)}\, \rho(t)\, dt = -\rho(x) \int_x^L \frac{B'(t)}{\rho(t)}\, dt - \int_x^L \rho'(t) \int_t^L \frac{B'(\xi)}{\rho(\xi)}\, d\xi\, dt.$$
The volume constraint in (5.5c) and the fact that $B' < 0$ imply that
$$0 \le -\int_x^L \frac{B'(t)}{\rho(t)}\, dt \le V. \tag{5.23}$$
This inequality together with the above expression for $B$ gives the result (5.22a).
To get (5.22b), note that since $\lambda < 0$, $B' < 0$, and $B(L) = 0$, we get that
$$-\lambda \left[ W + gB(x) - g\rho(x) \int_0^x \frac{B'(t)}{\rho(t)}\, dt \right] \ge -\lambda W.$$
The result now follows from the representations (5.12), (5.16), and the fact that $h^{-1}$
is strictly increasing. ✷
LEMMA 5.8. Let $\{(g_j, \lambda_j, B_j)\}$ be a sequence of solutions of (5.19) with $0 \le g_j \le R$ for all $j$ for some constant $R$. Then $\{\lambda_j\}$ satisfies that
$$\liminf_j \lambda_j > -\infty, \quad \limsup_j \lambda_j < 0.$$
Proof. If the first inequality does not hold, then $\{\lambda_j\}$ would have a subsequence,
which we denote again by $\{\lambda_j\}$, such that $\lambda_j \to -\infty$. Since $h^{-1}(s) \to \infty$ as
$s \to \infty$, we get from (5.22b), Lemma 5.7, that $B_j' \to 0$ in $C[0, L]$. But this is
impossible because
$$\int_0^L \frac{B_j'(t)}{\rho(t)}\, dt = -V, \quad \forall j. \tag{5.24}$$
Thus $\{\lambda_j\}$ must be bounded from below.
For the second inequality we argue again by contradiction. If $\{\lambda_j\}$ were not
bounded away from zero, there would be a subsequence, which we denote again
by $\{\lambda_j\}$, such that $\lambda_j \to 0$. It follows now from (5.22a), (5.23), and (5.12) that
$$c_j \le F(g_j, \lambda_j, B_j)(x) < 0, \quad \lim_{j \to \infty} c_j = 0.$$
Using this in (5.16) yields that
$$B_j'(x) \le \frac{W}{c_j} < 0, \quad x \in [0, L].$$
Since $c_j \to 0$, the above inequality would contradict the volume constraint (5.24).
Thus $\{\lambda_j\}$ must be bounded away from zero. ✷
LEMMA 5.9. Let $\{(g_j, \lambda_j, B_j)\}$ be a sequence of solutions of (5.19) with $0 \le g_j \le R$ for all $j$ for some constant $R$. Then $\{(\lambda_j, B_j)\}$ is bounded in $\mathbb{R} \times C^1[0, L]$.
Proof. The result follows from Lemmas 5.7 and 5.8. ✷
We now have:
THEOREM 5.10. Let $C \subset E$ be the connected component of solutions of (5.5)
containing $(0, \lambda_0, B_0)$ where $(\lambda_0, B_0)$ is given by Lemma 5.2. Then $C$ is unbounded
in $\mathbb{R}^2 \times C^1[0, L]$ and (5.5) has a solution for each $g \ge 0$.
Proof. It follows from Lemma 5.5 and the results in [16, 4, 2], that $C$ must
satisfy at least one of the following alternatives:
(i) $C$ is unbounded in $\mathbb{R}^2 \times C^1[0, L]$;
(ii) $C$ contains a solution of the form $(0, \lambda^*, B^*)$ where $(\lambda^*, B^*) \ne (\lambda_0, B_0)$;
(iii) $C \cap \partial E \ne \emptyset$.
We can rule out alternative (ii) using Lemma 5.2, and alternative (iii) with Lemma 5.6. Thus (i) must hold and the result about the existence of solutions for each
$g \ge 0$ follows from the unboundedness of $C$ and Lemma 5.9. ✷
This result, as stated, cannot be used to construct a consistent stationary field for
the problem (5.1), basically because of the lack of uniqueness. We can however
get a partial result for any fixed value of $g$. Note that condition A4 together with
the fact that the constraint in (5.2) is linear in $B'$ imply that the integrand in (5.3)
is strictly convex in $B'$. Let $B$ be a solution of (5.5) corresponding to the length
value $L$. It follows now from the results in [18, Theorems 9.10, 9.23], that for $\ell$
sufficiently small, the solution $B$ is a unique (weak) local minimizer of
$$J_\ell(v) = \int_0^\ell \hat{\nu}\!\left( -\frac{\rho(x)(W + gv(x))}{v'(x)} \right) dx, \tag{5.25}$$
on the set
$$X_\ell = \left\{ v \in C^1[0, \ell] : \int_0^\ell \frac{v'(x)}{\rho(x)}\, dx = \int_0^\ell \frac{B'(x)}{\rho(x)}\, dx,\ v(\ell) = B(\ell),\ v'(x) < 0\ \forall x \in [0, \ell] \right\}. \tag{5.26}$$
Thus $B$ gives a local minimum among configurations of a rope of length $\ell$, with
total volume equal to that of $B$ in $[0, \ell]$, and with total mass at $x = \ell$ equal to $B(\ell)$.
6. Numerical Examples
In this section we present a typical family of constitutive functions that satisfies
hypotheses A1–A5. In particular, we consider functions $\hat{N}(\cdot)$ of the form:
$$\hat{N}(\nu) = A_1 \nu^{\alpha_1} - A_2 \nu^{-\alpha_2}, \tag{6.1}$$
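where $A_1 > 0$, $A_2 \ge 0$, $\alpha_1, \alpha_2 > 0$. This function clearly satisfies A1–A3 and thus
has an inverse function $\hat{\nu}(\cdot)$ such that
$$\hat{N}(\hat{\nu}(N)) = N, \quad N \in \mathbb{R}, \tag{6.2a}$$
$$\hat{\nu}(\hat{N}(\nu)) = \nu, \quad \nu \in (0, \infty). \tag{6.2b}$$
If we differentiate (6.2a) with respect to $N$ and solve for $\hat{\nu}_N(N)$, then we get that
$$h(N) = N^2 \hat{\nu}_N(N) = \frac{N^2}{\hat{N}_\nu(\hat{\nu}(N))}.$$
If we let $N = \hat{N}(\nu)$ in this expression, then we get from (6.2b) that
$$h(\hat{N}(\nu)) = \frac{\hat{N}^2(\nu)}{\hat{N}_\nu(\nu)}. \tag{6.3}$$
Now A5 is equivalent to $h(\hat{N}(\nu)) \to \infty$ as $\nu \to \infty$, which is satisfied by (6.1) for
any $A_1 > 0$, $A_2 \ge 0$, $\alpha_1, \alpha_2 > 0$.
If we differentiate (6.3) with respect to $\nu$, then we have that
$$h_N(\hat{N}(\nu)) = \hat{N}(\nu)\, \frac{2\hat{N}_\nu^2(\nu) - \hat{N}(\nu)\hat{N}_{\nu\nu}(\nu)}{\hat{N}_\nu^3(\nu)}. \tag{6.4}$$
Condition A4 requires that $\hat{N}(\nu) > 0$, which for (6.1) is equivalent to
$$\nu > \left( \frac{A_2}{A_1} \right)^{1/(\alpha_1 + \alpha_2)}. \tag{6.5}$$
It follows from (6.4) now that A4 is equivalent to
$$2\hat{N}_\nu^2(\nu) - \hat{N}(\nu)\hat{N}_{\nu\nu}(\nu) > 0, \tag{6.6}$$
provided $\nu$ satisfies (6.5). We now have: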
PROPOSITION 6.1. The constitutive function (6.1) satisfies condition A4 for any
$A_1 > 0$, $A_2 \ge 0$ provided that $\alpha_1 = \alpha_2 = \alpha > 0$ or for any $\alpha_1 > 0$, $\alpha_2 \ge 1$.
Proof. A direct calculation shows that for (6.1), inequality (6.6) is equivalent to
$$\alpha_1(\alpha_1 + 1) A_1^2 \nu^{2\alpha_1} + \alpha_2(\alpha_2 - 1) A_2^2 \nu^{-2\alpha_2} + \left( \alpha_1^2 + \alpha_2^2 + \alpha_2 + \alpha_1(4\alpha_2 - 1) \right) A_1 A_2 \nu^{\alpha_1 - \alpha_2} > 0,$$
provided $\nu$ satisfies (6.5). This inequality is automatically satisfied for any
$\alpha_1 > 0$, $\alpha_2 \ge 1$. If $\alpha_1 = \alpha_2 = \alpha > 0$, the inequality is satisfied provided
$$\nu > \left( \frac{1 - \alpha}{1 + \alpha} \right)^{1/4\alpha} \left( \frac{A_2}{A_1} \right)^{1/2\alpha}.$$
Since the expression $(1 - \alpha)/(1 + \alpha)$ is less than 1 for $\alpha > 0$, this last inequality
is satisfied provided $\nu$ satisfies (6.5). ✷
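Proposition 6.1 can be spot-checked numerically by sampling h along the curve N = N̂(ν) using (6.3) and testing monotonicity beyond the threshold (6.5). The sketch below is only a sanity check on a finite grid, not a proof; the function names, the grid bounds, and the sample parameter values are illustrative assumptions.

```python
def n_hat(nu, A1, A2, a1, a2):
    """Constitutive function (6.1)."""
    return A1 * nu ** a1 - A2 * nu ** (-a2)


def n_hat_prime(nu, A1, A2, a1, a2):
    """Derivative of (6.1) with respect to nu."""
    return a1 * A1 * nu ** (a1 - 1.0) + a2 * A2 * nu ** (-a2 - 1.0)


def h_along_curve(nu, A1, A2, a1, a2):
    """h evaluated at N = n_hat(nu), using the identity (6.3)."""
    return n_hat(nu, A1, A2, a1, a2) ** 2 / n_hat_prime(nu, A1, A2, a1, a2)


def a4_holds_on_grid(A1, A2, a1, a2, nu_max=10.0, n=2000):
    """Check that h is strictly increasing on a grid above the threshold (6.5)."""
    nu_min = (A2 / A1) ** (1.0 / (a1 + a2))  # where n_hat vanishes
    # start at i = 1 so that nu > nu_min (and nu > 0 when A2 = 0)
    grid = [nu_min + (nu_max - nu_min) * i / n for i in range(1, n + 1)]
    values = [h_along_curve(nu, A1, A2, a1, a2) for nu in grid]
    return all(b > a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    print(a4_holds_on_grid(1.0, 0.0, 3.0, 1.0))   # pure power law
    print(a4_holds_on_grid(1.0, 1.0, 2.0, 2.0))   # alpha1 = alpha2
    print(a4_holds_on_grid(1.0, 0.5, 0.5, 1.5))   # alpha2 >= 1
```

Because N̂ is strictly increasing, monotonicity of h(N̂(ν)) in ν on the sampled range is equivalent to monotonicity of h in N on the corresponding range of N, which is what A4 asks for.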
Figure 2. Computed mid-cross-sectional area functions for the density functions (6.7) and
total fixed mass of 0.03.
We show now some numerical computations for the constitutive function (6.1)
for the case A1 = 1, A2 = 0, and α1 = 3. We use variable density functions
which along the axis of the rope are either increasing, decreasing or with an interior
minimum. In particular, we consider:
$$\rho_1(x) = 0.1(1 + x), \tag{6.7a}$$
$$\rho_2(x) = 0.2 - 0.1x, \tag{6.7b}$$
$$\rho_3(x) = 5.0(x - 0.5)^2 + 0.1. \tag{6.7c}$$
We used the following values for the parameters $L$, $W$, $g$:
$$L = 1.0, \quad W = 0.1, \quad g = 9.8,$$
the units of which are in the metric system. We show in Figure 2 the computed
cross-sectional area functions for the densities (6.7) and total fixed mass of 0.03.
Note that for variable density functions, the area function need not be decreasing
as is the case when the density is constant (cf. (4.14)). Note that for ρ2 the rope
is thinner at the beginning and fatter at the end as compared with the one for
ρ1 compensating in this way for the decrease in density. In Figure 3 we show
the corresponding shape of the rope of minimum elongation for the case (6.7c)
assuming circular cross sections. Similar results for the case of total fixed volume
of 0.05 are shown in Figures 4 and 5.
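For the fixed-mass case, the computation behind such area functions can be sketched directly from (4.10) and (4.11). The code below is a hypothetical illustration and not necessarily the scheme used to produce the figures: it assumes the power law with A1 = 1, A2 = 0, α1 = 3, for which ν̂(N) = N^(1/3), h(N) = N^(4/3)/3 and h⁻¹(s) = (3s)^(3/4); it uses the density (6.7a); and the quadrature, bisection bracket, and all function names are choices made for this example.

```python
import math

W, G_ACC, L, M = 0.1, 9.8, 1.0, 0.03   # parameter values quoted in the text


def rho1(x):
    """Density (6.7a)."""
    return 0.1 * (1.0 + x)


def h_inv(s):
    """Inverse of h(N) = N**(4/3) / 3, valid for the power law nu**3."""
    return (3.0 * s) ** 0.75


def tail_integral(rho, c, x, n=400):
    """Midpoint rule for the integral of rho / h_inv(c * rho) over [x, L]."""
    if x >= L:
        return 0.0
    dt = (L - x) / n
    return sum(rho(x + (i + 0.5) * dt) / h_inv(c * rho(x + (i + 0.5) * dt))
               for i in range(n)) * dt


def mass_for(rho, c):
    """G(c) of (4.11); returns inf when the exponential would overflow."""
    exponent = G_ACC * tail_integral(rho, c, 0.0)
    if exponent > 700.0:
        return math.inf
    return W / G_ACC * (math.exp(exponent) - 1.0)


def solve_c(rho, lo=1e-6, hi=1e6, iters=100):
    """Bisection on a log scale, using that G is strictly decreasing in c."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if mass_for(rho, mid) > M:
            lo = mid      # too much mass: the root lies at larger c
        else:
            hi = mid
    return math.sqrt(lo * hi)


def area(rho, c, x):
    """A(x) = -B'(x) / rho(x), with B given by (4.10)."""
    return W / h_inv(c * rho(x)) * math.exp(G_ACC * tail_integral(rho, c, x))


if __name__ == "__main__":
    c = solve_c(rho1)
    print("c =", c)
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"A({x:.2f}) = {area(rho1, c, x):.5f}")
```

Bisection is used here because the text shows that G is strictly decreasing with limits ∞ and 0, so a sign change is guaranteed on any sufficiently wide bracket; swapping `rho1` for another density only requires passing a different callable.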
Figure 3. The shape of the rope of minimum elongation for the density function (6.7c) and
total fixed mass of 0.03.
Figure 4. Computed mid-cross-sectional area functions for the density functions (6.7) and
total fixed volume of 0.05.
Figure 5. The shape of the rope of minimum elongation for the density function (6.7c) and
total fixed volume of 0.05.
7. Conclusions
A problem perhaps more interesting from the practical point of view than the ones
treated in this paper is that of minimizing the volume (thus minimizing the amount
of material used) of the rope for a given length of the rope. The one-dimensional
version of this problem can be treated similarly to the ones discussed here. Its three
dimensional version has applications in the petroleum industry where long tubes
from the top to the bottom of the sea (fixed length) need to be constructed with
the least amount of material. In this case the tubes need to be hollow in order to
transport various materials and there is the additional complication of the external
water pressure.
The use of Leray–Schauder degree techniques in elasticity has a long and successful history that we will not try to review here. We refer to [3] for examples and
its extensive literature review. However most of these applications of the Leray–
Schauder degree have been limited to one dimensional problems, like the one
treated in this paper, due to the complexity in transforming the equations of elasticity into an equivalent problem in terms of a compact operator. Only recently,
in [9], was such a major enterprise carried out for the three dimensional displacement problem of nonlinear elasticity. On the other hand, the use of a degree for
proper Fredholm maps of index zero [8, 12] avoids the transformation of the original problem into one in terms of a compact operator but requires some a priori
estimates on solutions of the linear problem and its spectrum. For the three dimensional mixed problem of nonlinear elasticity such spectral estimates were obtained
in [11], and together with the estimates in [1] for elliptic systems, Healey and
Simpson were able to apply a degree for proper Fredholm maps of index zero
to get the existence of a global branch of solutions of this problem. This global
continuum, in addition to the usual alternatives for such a continuum, may also
“cease” to exist due to a failure of strong ellipticity, local injectivity, or the complementing condition. For specific materials with appropriate growth conditions, one
can rule out termination due to a failure of strong ellipticity and local injectivity.
(See, e.g., [10].)
Appendix
In this section we derive the model equations for the rope from the three-dimensional theory of elasticity. For the deformation (2.2) the deformation gradient is
given by
⎞
⎛ ′
u (x) 0 0
1 0 ⎠.
∇p = ⎝ 0
(A.1)
0
0 1
If we assume that the material of the body is isotropic and hyperelastic, then there
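exists a smooth stored energy function $\hat\sigma(F)$ of the form
$$\hat\sigma(F) = \sigma\!\left(\tfrac{1}{2}\,F\cdot F,\ \tfrac{1}{4}\,FF^t\cdot FF^t,\ \det F\right),$$
where $F\cdot H = \operatorname{trace}(FH^t)$, and such that the (first) Piola–Kirchhoff stress tensor is
given by
$$S(F) = \frac{d\hat\sigma(F)}{dF} = \sigma_{,1}\,F + \sigma_{,2}\,FF^tF + (\det F)\,\sigma_{,3}\,F^{-t}.$$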
For (A.1) we have that
$$\frac{F\cdot F}{2} = \frac{(u'(x))^2+2}{2},\qquad \frac{FF^t\cdot FF^t}{4} = \frac{(u'(x))^4+2}{4},\qquad \det F = u'(x). \tag{A.2}$$
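It follows now that
$$S(\nabla p) = \operatorname{diag}\bigl(u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3},\ \ \sigma_{,1} + \sigma_{,2} + u'(x)\sigma_{,3},\ \ \sigma_{,1} + \sigma_{,2} + u'(x)\sigma_{,3}\bigr), \tag{A.3}$$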
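The stress formula and its specialization (A.3) can be checked numerically. The following is a minimal sketch, assuming a purely illustrative stored energy $\sigma(a,b,c) = a + b/2 + c^2$ (not taken from the paper); the function names are hypothetical. It compares the closed-form stress with central finite differences of $\hat\sigma$.

```python
import numpy as np

def sigma(a, b, c):
    # Illustrative stored energy sigma(a, b, c); an assumption, not the paper's choice.
    return a + 0.5 * b + c ** 2

def sigma_hat(F):
    # sigma_hat(F) = sigma(F.F/2, FF^t.FF^t/4, det F), with A.B = trace(A B^t).
    B = F @ F.T
    return sigma(0.5 * np.trace(B), 0.25 * np.trace(B @ B.T), np.linalg.det(F))

def stress_formula(F):
    # S = sigma_1 F + sigma_2 F F^t F + det(F) sigma_3 F^{-t},
    # with the partial derivatives of the illustrative sigma above.
    det_F = np.linalg.det(F)
    s1, s2, s3 = 1.0, 0.5, 2.0 * det_F
    return s1 * F + s2 * F @ F.T @ F + det_F * s3 * np.linalg.inv(F).T

def stress_numeric(F, h=1e-6):
    # Central finite differences of sigma_hat with respect to each entry of F.
    S = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            dF = np.zeros((3, 3))
            dF[i, j] = h
            S[i, j] = (sigma_hat(F + dF) - sigma_hat(F - dF)) / (2 * h)
    return S

def rope_stress(du):
    # Stress for the deformation gradient (A.1), F = diag(u'(x), 1, 1).
    return stress_formula(np.diag([du, 1.0, 1.0]))
```

For this sample energy and $u'(x) = 1.2$, `rope_stress(1.2)` is diagonal with entries $u'\sigma_{,1} + (u')^3\sigma_{,2} + \sigma_{,3}$ and $\sigma_{,1} + \sigma_{,2} + u'\sigma_{,3}$ (twice), as in (A.3).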
P.V. NEGRÓN-MARRERO
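where the arguments of $\sigma_{,1}$, etc., are given by (A.2). If we let $\mathbf{i}$ be a unit vector
pointing in the positive $x$ direction, then we have from (2.1) that the force exerted
by the material on $[0, x]$ on that on $[x, L]$ is given by
$$-\int_{\Omega_x} S(\nabla p)\cdot\mathbf{i}\; ds_x,$$
where $ds_x$ denotes an element of area over the cross section $\Omega_x$. If we let $\hat\rho(x, y, z)$ denote the mass
density per unit volume at $(x, y, z)$ and we assume that a force per unit area $\widehat{W}\mathbf{i}$ is
applied at the bottom of the rope, then we get that the total force on the material on
$[x, L]$ is
$$g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi, y, z)\,\mathbf{i}\; ds_\xi\, d\xi + \int_{\Omega_L} \widehat{W}\mathbf{i}\; ds_L.$$
For equilibrium we must have that
$$\int_{\Omega_x} S(\nabla p)\cdot\mathbf{i}\; ds_x = g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi, y, z)\,\mathbf{i}\; ds_\xi\, d\xi + \int_{\Omega_L} \widehat{W}\mathbf{i}\; ds_L,$$
which upon recalling (A.3) reduces to:
$$A(x)\bigl[u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3}\bigr] = g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi, y, z)\; ds_\xi\, d\xi + W, \tag{A.4}$$
where
$$A(x) = \int_{\Omega_x} ds_x,\qquad W = \widehat{W}A(L).$$
If we let
$$\widehat{N}(u'(x)) = u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3},$$
and assume that $\hat\rho(x, y, z) = \rho(x)$, then (A.4) reduces to (2.9).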
References
1. S. Agmon, A. Douglis and L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions. Comm. Pure Appl. Math. II(17) (1964) 35–92.
2. J.C. Alexander and J.A. Yorke, The implicit function theorem and the global methods of cohomology. J. Funct. Anal. 21 (1976) 330–339.
3. S.S. Antman, Nonlinear Problems of Elasticity, Applied Mathematical Sciences 107. Springer, New York (1995).
4. F.E. Browder, On the continuity of fixed points under deformations of continuous mappings. Summa Brazil. Mat. 4 (1960) 183–191.
5. S.J. Cox, The shape of the ideal column. Math. Intelligencer 14(1) (1992) 16–24.
THE HANGING ROPE OF MINIMUM ELONGATION
6. S.J. Cox and M. McCarthy, The shape of the tallest column. SIAM J. Math. Anal. 29(3) (1998) 547–554.
7. L. Euler, De motu corporum flexibilum. Comm. Acad. Sci. Petrop. 14 (1751) 182–196.
8. C.C. Fenske, Extensio gradus ad quasdam applicationes Fredholmii. Mitt. Math. Seminar Giessen 121 (1976) 65–70.
9. T.J. Healey, Global continuation in displacement problems of nonlinear elastostatics via the Leray–Schauder degree. Arch. Rational Mech. Anal. 152 (2000) 273–282.
10. T.J. Healey and P. Rosakis, Unbounded branches of classical injective solutions to the forced displacement problem in nonlinear elastostatics. J. Elasticity 49 (1997) 65–78.
11. T.J. Healey and H.C. Simpson, Global continuation in nonlinear elasticity. Arch. Rational Mech. Anal. 143 (1998) 1–28.
12. H. Kielhöfer, Multiple eigenvalue bifurcation for Fredholm mappings. J. Reine Angew. Math. 358 (1985) 104–124.
13. J.B. Keller and F.I. Niordson, The tallest column. J. Math. Mech. 16 (1966) 433–446.
14. J.L. Lagrange, Application de la méthode exposée précédente à la solution de différens problèmes de dynamique. Misc. Tour. 2(2) (1762) 196–298.
15. S. Lang, Real Analysis, 2nd edn. Addison-Wesley, Reading, MA (1983).
16. J. Leray and J. Schauder, Topologie et équations fonctionelles. Ann. Sci. École Norm. Sup. 3(51) (1934) 45–78.
17. I. Stakgold, Green’s Functions and Boundary Value Problems. Wiley, New York (1979).
18. J.L. Troutman, Variational Calculus with Elementary Convexity. Springer, New York (1983).
19. T. Valent, Boundary Value Problems of Finite Elasticity, Springer Tracts in Natural Philosophy. Springer, New York (1988).
20. G. Verma and J. Keller, Hanging rope of minimum elongation. SIAM Rev. (1983) 369–399.
On Certain Weak Phase Transformations in
Multilattices
MARIO PITTERI
DMMMSA, Università di Padova, Via Belzoni 7, 35131 Padova, Italy. E-mail: pitteri@dmsa.unipd.it
Received 13 September 2002; in revised form 8 April 2003
Abstract. This paper is dedicated to the memory of Clifford Truesdell, to whom I acknowledge my
debt and gratitude. I present some recent results of mine on the thermomechanics of crystalline solids,
a line of research that I began during my stay at The Johns Hopkins University, 1977–1979.
Then, in the last section, I mention some unusual merits of Truesdell that I experienced in my
scientific career and that are not widely known.
Mathematics Subject Classifications (2000): 74A30, 74B20, 74G65, 74N05.
Key words: crystalline solids, phase transformations, physical possibility, quartz, thermoelasticity.
1. Introduction
I have many reasons to be grateful to Clifford Truesdell, both as a scientist and as
a person. During my research activity as a postdoc I soon realized that I needed to
deepen my knowledge of kinetic-statistical theories of mechanics, and The Johns
Hopkins University was my choice because of Truesdell’s work on the kinetic theory of gases – he was then carrying to completion his book [22] with Muncaster.
During my stay there I was also exposed to other subjects I was not so familiar
with, among which were elements of experimental continuum mechanics and of
the history of mechanics, classical thermodynamics along Carnot’s lines, stability
problems in continuum mechanics, and nonlinear elasticity of crystals. In fact, in
the past fifteen years the last topic has been the main focus of my research, the
most recent results of which constitute the core of this paper.
There has recently been a renewed interest in the geometry and kinematics of
multilattices, in view of constructing a nonlinear model of the thermomechanical
behavior of complex crystals. The background, some details and references are
given in [17], where some still unsolved problems are also outlined. One of these
is the formulation of a unified kinematics of multilattices of different complexity.
Indeed, any multilattice configuration admits a maximal skeletal lattice of translations mapping the multilattice to itself (the essential skeleton), which nevertheless
shares the property of being translation-invariant with any one of its infinitely many
sublattices. Once a sublattice of the essential skeleton is selected, additional vectors
(shifts) have to be introduced to describe the multilattice configuration, and these
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 651–671.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
M. PITTERI
change, possibly also in number, with the changing sublattice. The kinematics
developed so far in the literature works well for multilattices whose lattice of translations is the essential skeleton itself; these will be called essentially described,
or essential for short. In this paper we restrict attention to essential multilattices
alone. Further work is needed to cope with deformations along which a multilattice
changes its periodicity, in a suitable sense.⋆ Handling such deformations would be
of great interest from the theoretical point of view as well as for applications, for
instance to certain phase transitions of shape memory alloys.
In [15] it is shown by a simple example how the knowledge of the full geometry
and kinematics of essential multilattices can help in classifying all the possibilities
for their weak – that is, involving suitably small distortions – symmetry-breaking
thermoelastic phase transformations, and motivation for the procedure is provided.
The example is the diatomic 2-lattice used in the introductory sections of [19]
to describe what is called a structural phase transition in a crystalline solid: in a
primitive tetragonal lattice whose unit cell has a physically different atom in its
center the transition is driven by the central atom moving off center. That this can
happen in essentially two different ways is shown to be one of the possibilities; the
other, orthogonal to the first in a suitable sense, corresponds to a configurational
transition: the tetragonal skeleton drives the transformation, which is then accompanied by a suitable displacement of the central atom. For this kind of transition
the independent generic possibilities have been classified in [5].
Here, in Section 3, I provide an explicit, general framework for generic weak
bifurcations in essential multilattices, completing a scheme presented in [6]. As an
example, in Section 4 we study the structural transitions of β-quartz, described by
the 3-lattice model introduced in [12]. The analysis shows that, among others, there
are two – mutually orthogonal in a sense, and both described by a 1-dimensional
order parameter space – trigonal trapezohedral low-symmetry product phases, one
Figure 1. Equivalent bases for a given (planar) lattice.
⋆ An example is a controlled deformation of a body-centered cubic (bcc) simple lattice along
which the central atom starts moving off-center for certain (transition) values of the controls. At the
transition the nonessential 2-lattice equivalent to the essential bcc simple lattice starts deforming into
an essential 2-lattice.
PHASE CHANGES IN MULTILATTICES
of which is the α-phase as modelled in [12]. The other was obtained in [6] as an
outcome of the search for a third quartz phase, used to give an alternative to the so-called incommensurate phase introduced in the physical literature to explain certain
peculiarities of the α–β transition. Here the geometry of the 3-lattice describing
this new phase is given in detail.
Finally, since this paper is appearing in a volume dedicated to the memory of
Clifford Truesdell, in Section 5 I mention some of his unusual merits, which I
experienced in my scientific career and which are not widely known.
2. Preliminaries
I sketch here the bare essentials for the rest of the paper, referring the reader to [17]
for more information. In particular, as there, I use the summation convention and
“running indices” without specifying their range; for instance in expressions like
“the lattice basis ea ” instead of “the lattice basis {ea , a = 1, 2, 3}”, or “the function
φ̂(ea , pr , θ)” instead of “the function φ̂(e1 , e2 , e3 , p1 , . . . , pn−1 , θ)”. This should
not generate too much confusion. Also, the relations ≤ and < between groups
mean subgroup of and proper subgroup of, respectively.
Let $\mathbb{Z}$ and $\mathbb{R}$ denote the integral and real numbers, respectively. Consider first
the simplest triply-periodic structures, that is, simple lattices (or 1-lattices):
$$\mathcal{L} = \{N^a e_a,\ a = 1, 2, 3,\ N^a \in \mathbb{Z}\} = \mathcal{L}(e_a). \tag{1}$$
The lattice vectors (or lattice basis) $e_a$ are linearly independent in $\mathbb{R}^3$.
Any basis $e_a$ uniquely determines the 1-lattice $\mathcal{L}(e_a)$, but not vice versa:
$$\mathcal{L}(\bar e_a) = \mathcal{L}(e_a) \iff \bar e_a = m_a^b e_b,\quad m \in GL(3, \mathbb{Z}); \tag{2}$$
here $GL(3, \mathbb{Z})$ is the group of 3 by 3 integral matrices with determinant ±1. The
crystallographic point group (or holohedry), $P(e_a)$, of the lattice $\mathcal{L}(e_a)$ is then
defined as the group of all the orthogonal transformations mapping $\mathcal{L}(e_a)$ to itself;
equivalently:
$$P(e_a) = \{Q \in O(3):\ Qe_a = m_a^b e_b\}. \tag{3}$$
Notice that the basis $e_a$ satisfies (3) if and only if the lattice metric
$$C = (C_{ab}),\qquad C_{ab} = e_a \cdot e_b, \tag{4}$$
is a fixed point of the map
$$C \mapsto m^t C m. \tag{5}$$
The conjugacy classes of the holohedries in $O(3)$ correspond to the well known
7 crystal systems.
Figure 2. An elementary cell for the hcp lattice, and its projection on the x–y basal plane.
By looking at the right-hand side of the equation defining $P(e_a)$ in (3) we
introduce the lattice group $L(e_a)$ of a lattice $\mathcal{L}(e_a)$, associated with its lattice
basis $e_a$:
$$L(e_a) = \{m \in GL(3, \mathbb{Z}):\ m_a^b e_b = Qe_a,\ Q \in P(e_a)\} = \{m \in GL(3, \mathbb{Z}):\ m^t C m = C\}. \tag{6}$$
By the last equality the lattice group depends on the basis $e_a$ only through the
corresponding metric $C$, hence can be denoted by $L(C)$. The conjugacy classes of
lattice groups in $GL(3, \mathbb{Z})$ correspond to the well known 14 (Bravais) lattice types.
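The second characterization in (6) lends itself to a brute-force check. The sketch below, with the hypothetical helper `lattice_group`, enumerates integer matrices $m$ with $m^t C m = C$. It assumes that all entries of $m$ lie in $\{-1, 0, 1\}$, which is enough for the two sample metrics used here but not for an arbitrary basis; the metrics are scaled to integers so that the comparison is exact (the lattice group is unaffected by a rescaling of $C$).

```python
import itertools
import numpy as np

def lattice_group(C, entries=(-1, 0, 1)):
    """Integer matrices m, with entries restricted to `entries`, such that m^t C m = C."""
    C = np.asarray(C, dtype=np.int64)
    group = []
    for flat in itertools.product(entries, repeat=9):
        m = np.array(flat, dtype=np.int64).reshape(3, 3)
        # Exact integer test of the fixed-point condition in (6).
        if np.array_equal(m.T @ C @ m, C):
            group.append(m)
    return group

# Metric of a simple cubic lattice (orthonormal basis).
cubic = np.eye(3, dtype=np.int64)

# Metric of a hexagonal basis with a 120-degree basal angle, scaled by 2/a^2,
# for the sample ratio c^2/a^2 = 5/2 (an arbitrary illustrative choice).
hexagonal = np.array([[2, -1, 0], [-1, 2, 0], [0, 0, 5]], dtype=np.int64)

cubic_group = lattice_group(cubic)
hexagonal_group = lattice_group(hexagonal)
```

Under these assumptions the search returns 48 matrices for the cubic metric and 24 for the hexagonal one, the orders of the corresponding holohedries.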
Real crystals (hexagonal metals, alloys, etc.) are not in general 1-lattices. Their
geometry and kinematics can be described by means of multilattices, which are
the union of a finite number of nontrivial translates of a 1-lattice. A simple, well
known example of 2-lattice is the hexagonal close-packed (hcp) structure, which is
sketched in Figure 2.
In general, an $n$-lattice $\mathcal{M}$ in 3-dimensional affine space can be defined as follows, by using Grassmann notation, choosing the origin $O$ at one of the lattice points
and setting $p_0 = 0$ for convenience:
$$\mathcal{M} = \mathcal{M}(e_a, p_1, \dots, p_{n-1}) = \bigcup_{r=0}^{n-1} \{O + \mathcal{L}(e_a) + p_r\}; \tag{7}$$
L(ea ) is called the skeletal lattice of M. Figure 3 is a schematic picture of a
2-dimensional (planar) 2-lattice; if the atoms represented by filled circles are physically indistinguishable from the ones represented by open circles, the 2-lattice is
called monatomic, otherwise diatomic. For simplicity, in this paper all multilattices
are monatomic.
The multilattice descriptors $(e_a, p_r) =: \varepsilon_\sigma$, $\sigma = 1, \dots, n + 2$ (in terms of
which we can write $\mathcal{M} = \mathcal{M}(\varepsilon_\sigma)$) satisfy the following conditions guaranteeing
the three-dimensionality of $\mathcal{M}$ and the non-overlap of the constituent 1-lattices:
$$e_1 \cdot e_2 \times e_3 \neq 0,\qquad p_r \neq p_s + l^a_{rs} e_a,\quad r, s = 0, \dots, n - 1,\ \ l^a_{rs} \in \mathbb{Z}, \tag{8}$$
with the exclusion of the case r = 0 = s.
Figure 3. Unit cells of the component 1-lattices of a planar 2-lattice.
An n-lattice M is called essential if its skeletal lattice contains all the translations mapping M to itself. In this case the lattice cell has minimum volume.
The $O(3)$-invariant multilattice metric
$$K = (K_{\sigma\tau}),\qquad K_{\sigma\tau} = K_{\tau\sigma} = \varepsilon_\sigma \cdot \varepsilon_\tau,\quad \sigma, \tau = 1, \dots, n + 2, \tag{9}$$
is an analog of (4). The manifold $\mathcal{Q}^m_{n+2}$ of multilattice metrics is a submanifold of
the vector space $\mathcal{Q}_{n+2}$ of all symmetric matrices in $\mathbb{R}^{n+2}$; $\mathcal{Q}^m_{n+2}$ is a “state space”
for $n$-lattices analogous to the set $\mathcal{Q}^>_3$ of positive definite quadratic forms in $\mathbb{R}^3$ (the
lattice metrics) for 1-lattices.
The “global symmetry group” of essential n-lattices expresses the indeterminateness in the choice of the multilattice descriptors:
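$$\mathcal{M}(\bar\varepsilon_\sigma) = \mathcal{M}(\varepsilon_\sigma) \iff \bar\varepsilon_\sigma = \mu_\sigma^\tau \varepsilon_\tau,\quad \mu \in \Gamma_{n+2}, \tag{10}$$
where $\Gamma_{n+2} < GL(n + 2, \mathbb{Z})$ consists of the matrices of the form
$$(\mu_\sigma^\tau) = \begin{pmatrix} m_a^b & l_1^b \ \cdots\ l_{n-1}^b \\[4pt] \begin{matrix} 0 & 0 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 0 & 0 \end{matrix} & \alpha_r^s \end{pmatrix},\qquad \bar\alpha = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ -1 & -1 & \cdots & -1 & -1 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \tag{11}$$
for $a, b = 1, 2, 3$, $r, s = 1, \dots, n - 1$, $(m_a^b) \in GL(3, \mathbb{Z})$, $l_r^b \in \mathbb{Z}$, and $\alpha = (\alpha_r^s)$
belonging to the finite noncommutative group of matrices generated by the permutation matrices of the set $\{1, \dots, n - 1\}$ and by the $n - 1$ by $n - 1$ matrices $\bar\alpha$ of
the form (11)$_2$, which are obtained from the identity by replacing one of its rows
by a row of −1s. If $\alpha$ is not a permutation of $\{1, \dots, n - 1\}$, then necessarily one
of its rows consists of −1s.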
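The finite group of matrices $\alpha$ can be generated explicitly for small $n$. The sketch below is illustrative only: the helper names are hypothetical, and the group is obtained as the closure of the stated generators under matrix multiplication.

```python
import itertools
import numpy as np

def alpha_generators(k):
    """Generators for k = n - 1: permutation matrices and the matrices of form (11)_2."""
    gens = [np.eye(k, dtype=int)[list(perm)]
            for perm in itertools.permutations(range(k))]
    for row in range(k):
        a = np.eye(k, dtype=int)
        a[row, :] = -1          # identity with one row replaced by a row of -1s
        gens.append(a)
    return gens

def generate_group(gens):
    """Closure of the generators under matrix multiplication (finite group assumed)."""
    seen = {g.tobytes(): g for g in gens}
    frontier = list(gens)
    while frontier:
        new = []
        for a in frontier:
            for g in gens:
                prod = a @ g
                key = prod.tobytes()
                if key not in seen:
                    seen[key] = prod
                    new.append(prod)
        frontier = new
    return list(seen.values())

alpha_group_3_lattice = generate_group(alpha_generators(2))   # n = 3
alpha_group_4_lattice = generate_group(alpha_generators(3))   # n = 4
```

For $n = 3$ the closure has six elements, and for $n = 4$ it has twenty-four, consistent with the group acting as the permutations of the $n$ constituent 1-lattices; neither group is commutative.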
Analogous to (5), the group $\Gamma_{n+2}$ acts in a natural way on the manifold of
multilattice metrics:
$$K \mapsto \mu^t K \mu. \tag{12}$$
Among all changes of descriptors, particularly important are those which produce an affine isometry of the multilattice onto itself. If, with respect to the chosen
origin $O$ (which is one of the lattice points), we represent the isometry as a pair
$(t, Q)$, $t \in \mathbb{R}^3$, $Q \in O(3)$, it must be, for $k = 0$ if $\alpha$ is a permutation or, otherwise,
for the index $k$ of the row of $\alpha$ made of −1s (see (11)):
$$Q\varepsilon_\sigma = \mu_\sigma^\tau \varepsilon_\tau,\qquad t = p_k + n_0^a e_a,\quad n_0^a \in \mathbb{Z}. \tag{13}$$
In terms of the metric $K$ in (9), equality (13)$_1$ is equivalent to the condition that $K$
be a fixed point for the corresponding map (12):
$$\mu^t K \mu = K. \tag{14}$$
For any essentially described multilattice $\mathcal{M} = \mathcal{M}(\varepsilon_\sigma)$ the solutions $(t, Q)$ of
(13) depend on $\mathcal{M}$ itself and not on its specific descriptors $\varepsilon_\sigma$, and they constitute
the space group $S(\mathcal{M})$ of $\mathcal{M}$. The orthogonal maps appearing in (13)$_1$ constitute
the crystal class $P(\mathcal{M})$ (or $P(\varepsilon_\sigma)$) of $\mathcal{M}$, which is a subgroup of the skeletal
holohedry $P(e_a)$. The matrices $\mu \in \Gamma_{n+2}$ in (13)$_1$ form the lattice group $\Lambda(\varepsilon_\sigma) <
GL(n + 2, \mathbb{Z})$ of $\mathcal{M}(\varepsilon_\sigma)$. The finite group $\Lambda(\varepsilon_\sigma)$ depends on the choice of the $\varepsilon_\sigma$
(actually, on the corresponding metric $K$, so that it can be denoted by $\Lambda(K)$)
and, under a change of descriptors, changes to a conjugate group in $\Gamma_{n+2}$. The
conjugacy classes of lattice groups in $\Gamma_{n+2}$ formalize the notion of (arithmetic)
$n$-lattice types, and the way $\Gamma_{n+2}$ acts on the state space $\mathcal{Q}^m_{n+2}$ gives information
on the kinematics of deformable $n$-lattices. This analysis is considerably simpler if
one restricts attention to suitably small distortions:
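PROPOSITION 1. Any multilattice metric $K$ has a $\Lambda(K)$-invariant neighborhood $\mathcal{N}$ in $\mathcal{Q}^m_{n+2}$, to be called a wt-nbhd of $K$, such that, for any $\mu \in \Gamma_{n+2}$,
$$\mu^t \mathcal{N} \mu \cap \mathcal{N} \neq \emptyset \iff \mu \in \Lambda(K) \iff \mu^t K \mu = K. \tag{15}$$
Therefore, in any $\Lambda(K)$-invariant neighborhood of $K$ contained in $\mathcal{N}$ the global
$\Gamma_{n+2}$-invariance reduces to the invariance under the lattice group $\Lambda(K)$:
$$\text{for any } \bar K \in \mathcal{N},\qquad \Lambda(\bar K) \le \Lambda(K). \tag{16}$$
For any choice of descriptors $\varepsilon_\sigma$ an analogous $O(3)$-invariant neighborhood exists,
and will be also called a wt-nbhd of $\varepsilon_\sigma$.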
This result allows us to efficiently reduce the description of the invariance in any
wt-nbhd $\mathcal{N} \subset \mathcal{Q}^m_{n+2}$ of the metric $K$ of an arbitrarily chosen essential $n$-lattice, and
to greatly simplify the classification of its generic elastic bifurcations in $\mathcal{N}$.
For simplicity, here we consider a crystalline solid in equilibrium with a heat bath of which we only control the temperature. One can extend this treatment to
accommodate other controls, for instance, pressure, as in [6], or shear stresses, as
in [4, 18], or [7]. Here an appropriate thermodynamic potential is the Helmholtz free
energy of the multilattice, which is assumed to have a density per unit skeletal cell;
this is a sufficiently smooth function
$$\phi = \hat\phi(e_a, p_r, \theta) = \bar\phi(\varepsilon_\sigma, \theta), \tag{17}$$
where $\theta$ denotes the absolute environmental temperature, regarded as a control.
The free energy density at zero temperature coincides with the internal energy
density, and it can be reasonably assumed to depend on the location of the multilattice points also at any given positive temperature. Therefore the functions $\hat\phi$ or
$\bar\phi$ must have the same value on any two equivalent sets of descriptors for the same
configuration; hence, for any $\mu \in \Gamma_{n+2}$, they satisfy the invariance conditions
$$\hat\phi(m_a^b e_b,\ \alpha_r^s p_s + l_r^a e_a,\ \theta) = \hat\phi(e_a, p_r, \theta),\qquad \bar\phi(\mu_\sigma^\tau \varepsilon_\tau, \theta) = \bar\phi(\varepsilon_\sigma, \theta), \tag{18}$$
respectively. In addition, for these functions Galilean invariance reduces to invariance under orientation-preserving isometries. In particular,
$$\hat\phi(Qe_a, Qp_r, \theta) = \hat\phi(e_a, p_r, \theta)\quad \text{for any } Q \in SO(3), \tag{19}$$
hence
$$\hat\phi(e_a, p_r, \theta) = \tilde\phi(s, C_{ab}, p_{ra}, \theta), \tag{20}$$
for
$$C_{ab} = e_a \cdot e_b,\qquad p_{ra} = p_r \cdot e_a,\qquad s = \operatorname{sgn}(e_1 \cdot e_2 \times e_3). \tag{21}$$
For any $\mu \in \Gamma_{n+2}$ the function $\tilde\phi$ satisfies the equality
$$\tilde\phi(s, C_{ab}, p_{ra}, \theta) = \tilde\phi\bigl(s\,s(\mu),\ m_a^i C_{ij} m_b^j,\ m_a^b(\alpha_r^s p_{sb} + C_{bi} l_r^i),\ \theta\bigr), \tag{22}$$
$m(\mu)$ denoting the $m$-component of $\mu$, and $s(\mu)$ the sign of $\det m(\mu)$.
3. Phase Changes in a wt-nbhd
Consider a multilattice M whose admissible configurations can all be described
as (perhaps nonessential) n-lattices; a (reference) configuration in which M is
an essential n-lattice, described by vectors εσ0 = (ea0 , pr0 ), and an O(3)-invariant
wt-nbhd N of εσ0 , based on Proposition 1. Since no configurations of higher complexity are considered, all our results are local, and in the configuration space the
nonessential multilattices form smooth submanifolds of strictly lower dimension
(see [17]); and since N is the union of disjoint SO(3)-invariant neighborhoods N +
and N − of εσ0 and −εσ0 , respectively, we can assume, without loss of generality,
that in N + all descriptors (ea , pr ) are essential, and s = s0 := sgn(e10 · e20 × e30 ).
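The following analogues of (13)$_1$, (18)$_1$ hold:
$$m_a^b e_b^0 = Qe_a^0,\qquad \alpha_r^s p_s^0 + l_r^a e_a^0 = Qp_r^0\qquad \text{for } Q \in P(\varepsilon_\sigma^0), \tag{23}$$
$$\hat\phi(e_a^0, p_r^0, \theta) = \hat\phi(m_a^b e_b^0,\ \alpha_r^s p_s^0 + l_r^a e_a^0,\ \theta)\qquad \text{for } \mu \in \Lambda(\varepsilon_\sigma^0). \tag{24}$$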
Denoting by Sym the space of symmetric tensors, and by Sym> the convex cone
of the positive definite ones, we normalize the orientation of the (deformed) skeletal
lattices of the multilattices in N + by restricting attention to lattice bases of the
form ea = U ea0 , U ∈ Sym> , and define the referential shift increments πr by the
equalities
$$p_r = U(p_r^0 + \pi_r). \tag{25}$$
The following is easily proved:
$$C_{ab} = e_a^0 \cdot Ce_b^0,\qquad C = U^2,\qquad \text{and}\qquad p_{ra} = e_a^0 \cdot C(p_r^0 + \pi_r). \tag{26}$$
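The identities (26) follow from $e_a = Ue_a^0$, the symmetry of $U$, and (25). A quick numerical sanity check, written as a sketch with hypothetical helper names and randomly generated sample data, is the following.

```python
import numpy as np

def random_sym_posdef(rng):
    # A sample element of Sym>: A A^t + 3 I is symmetric and positive definite.
    A = rng.normal(size=(3, 3))
    return A @ A.T + 3.0 * np.eye(3)

def both_sides_of_26(e0, p0, pi, U):
    """Return (C_ab direct, C_ab via (26), p_ra direct, p_ra via (26)).

    Rows of e0 are the reference lattice vectors e_a^0; rows of p0 and pi are
    the reference shifts p_r^0 and the referential shift increments pi_r.
    """
    C = U @ U
    e = e0 @ U.T                 # rows: e_a = U e_a^0
    p = (p0 + pi) @ U.T          # rows: p_r = U (p_r^0 + pi_r), as in (25)
    C_direct = e @ e.T           # C_ab = e_a . e_b
    C_formula = e0 @ C @ e0.T    # e_a^0 . C e_b^0
    p_direct = p @ e.T           # p_ra = p_r . e_a
    p_formula = (p0 + pi) @ C @ e0.T   # e_a^0 . C (p_r^0 + pi_r)
    return C_direct, C_formula, p_direct, p_formula

rng = np.random.default_rng(0)
e0 = rng.normal(size=(3, 3))     # sample reference basis
p0 = rng.normal(size=(2, 3))     # sample reference shifts of a 3-lattice
pi = 0.01 * rng.normal(size=(2, 3))
U = random_sym_posdef(rng)
C_direct, C_formula, p_direct, p_formula = both_sides_of_26(e0, p0, pi, U)
```

The two pairs of arrays agree to rounding error for any symmetric positive definite `U`.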
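Therefore in $\mathcal{N}^+$, where $s$ is fixed,
$$\tilde\phi(s, C_{ab}, p_{ra}, \theta) = \Phi(C, \pi_r, \theta). \tag{27}$$
From (18), (23), and (25), we have, for $Q \in P^+(\varepsilon_\sigma^0)$, $\mu \in \Lambda^+(\varepsilon_\sigma^0)$:
$$\Phi(C, \pi_r, \theta) = \Phi(Q^t C Q,\ \alpha_r^s Q^t \pi_s,\ \theta),\qquad Q\varepsilon_\sigma^0 = \mu_\sigma^\tau \varepsilon_\tau^0; \tag{28}$$
here $P^+(\varepsilon_\sigma^0)$ is the subgroup of positive-determinant elements of $P(\varepsilon_\sigma^0)$, with an
analogue for the $m$ component of the elements of $\Lambda^+(\varepsilon_\sigma^0)$. Indeed, for any $\mu \in
\Lambda^+(\varepsilon_\sigma^0)$,
$$\Phi(C, \pi_r, \theta) = \hat\phi(e_a, p_r, \theta) = \hat\phi(m_a^b e_b,\ \alpha_r^s p_s + l_r^a e_a,\ \theta) = \hat\phi\bigl(UQe_a^0,\ UQ(p_r^0 + \alpha_r^s Q^t \pi_s),\ \theta\bigr), \tag{29}$$
from which the conclusion follows by (26) and (27).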
A minor problem is that the matrix $\alpha = (\alpha_r^s)$ is not orthogonal; but it can
always be expressed in terms of an orthogonal one, and it is convenient to do so. For instance,
denote by
$$G := \{\alpha_i:\ \mu_i \in \Lambda^+(\varepsilon_\sigma^0)\} \tag{30}$$
the group of the submatrices $\alpha$ of the elements of $\Lambda^+(\varepsilon_\sigma^0)$, and by $|G|$ the order
of $G$; then construct the “metric”
$$g = \frac{1}{|G|} \sum_{j=1}^{|G|} \alpha_j^t \alpha_j = \sum_{k=1}^{n-1} \lambda_k^2\, d_k \otimes d_k =: W^2,\qquad W = \sum_{k=1}^{n-1} \lambda_k\, d_k \otimes d_k, \tag{31}$$
the $d_k$ being an orthonormal basis of eigenvectors of $g$ and $W$. It immediately
follows that
$$\alpha_i^t g \alpha_i = g\qquad \text{for all } \alpha_i \in G. \tag{32}$$
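Introduce now the group, conjugate to $G$, consisting of the orthogonal matrices
$$\beta_i = W \alpha_i W^{-1} \in O(n - 1), \tag{33}$$
and the new reference shift increments
$$\varpi_r = (W^{-1})_r^s\, \pi_s. \tag{34}$$
Then the new function $\Psi(C, \varpi_r) = \Phi(C, W_r^s \varpi_s)$ has the following invariance,
when $Q$ and $\beta$ are related to the same matrix $\mu \in \Lambda^+(\varepsilon_\sigma^0)$:
$$\Psi(C, \varpi_r) = \Psi(Q^t C Q,\ \beta_r^s Q^t \varpi_s)\qquad \text{for any } Q \in P^+(\varepsilon_\sigma^0) \text{ and } \beta \text{ in the conjugate group}. \tag{35}$$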
At this point we apply a procedure used by Ericksen [5] (see also [17]) to
classify generic weak bifurcations in simple lattices: we introduce orthonormal
bases Vk , k = 1, . . . , 6, in Sym and cl , l = 1, 2, 3, in R3 , and the representations
$$C = 1 + \sum_{k=1}^{6} y_k V_k,\qquad \varpi_r = \sum_{l=1}^{3} y_{rl}\, c_l, \tag{36}$$
so that, in particular, C is near 1 if and only if (y1 , . . . , y6 ) is near 0 ∈ R6 . In the
treatment of the reduced problems in Section 4 we will choose the basis (c1 , c2 , c3 )
to coincide with (i, j , k) introduced there, and the basis V1 , . . . ,V6 to consist of
the tensors represented in the basis (i, j , k) by the matrices shown in (47)–(49).
By putting in a single list $(y_i)$ the $3(n + 1)$ coordinates so introduced, we can
define the corresponding free energy density $\psi(y_i, \theta) = \Psi(C, \varpi_r, \theta)$, which then
enjoys the invariance
$$\psi(y_i, \theta) = \psi(\bar y_i, \theta),\qquad \bar y_i = Q_{ij} y_j,\qquad Q \in O(3(n + 1)). \tag{37}$$
By (35)1 each matrix Q is a block matrix, with a 6 by 6 and a 3(n − 1) by 3(n − 1)
blocks, each one itself orthogonal. We denote by G the group of such orthogonal
matrices Q corresponding to elements µ of Λ+ (εσ0 ).
In terms of $\psi$ the equations of equilibrium in the absence of loads are, in a
convenient notation,
$$\psi_{y_i} := \frac{\partial \psi}{\partial y_i}(y_j, \theta) = 0,\qquad i = 1, \dots, 3(n + 1). \tag{38}$$
We assume these conditions to hold for $\theta = \bar\theta$ and $y_i = 0$, the latter giving the
(reference) multilattice $\mathcal{M}(\varepsilon_\sigma^0)$. Consider now the second-order symmetric tensor
of moduli at the transition:
$$L_{ij} = \psi_{y_i y_j}(0, \bar\theta); \tag{39}$$
if this is invertible, then, by the implicit function theorem, the equilibrium equations (38) have one solution $y_i = \bar y_i(\theta)$ for $\theta$ near $\bar\theta$, such that $\bar y_i(\bar\theta) = 0$, $i =
1, \dots, 3(n + 1)$. Also, by continuity and uniqueness, all points on this equilibrium
branch have the same symmetry as $\mathcal{M}(\varepsilon_\sigma^0)$.
Therefore symmetry breaking can only occur if the tensor $L$ of moduli has
a nontrivial kernel. The invariance (37) forces, by differentiation, the following
identity among the second derivatives of $\psi$ at $(0, \bar\theta)$:
$$L = Q^t L Q\qquad \text{for any } Q \in G, \tag{40}$$
hence the eigenspaces of $L$ are invariant under the action of $G$:
$$Q^t L Q y = Ly = \lambda y \iff LQy = \lambda Qy. \tag{41}$$
This can be interpreted by saying that invariance forces certain eigenvalues of $L$
to be equal. We restrict attention to the case, called generic by Ericksen [5]
(see also [17]), in which the only conditions to be imposed on derivatives of $\psi$ are
those guaranteeing that $(0, \bar\theta)$ is a stable equilibrium at which bifurcation occurs,
and those forced by invariance; for instance, (40). In particular, the only eigenvalues of $L$ that are equal are the ones that are forced to be so by invariance; or,
the eigenspaces of $L$ are irreducible invariant (i.i.) subspaces of $\mathbb{R}^{3(n+1)}$ under the
action $y \mapsto Qy$ of the group $G$, and exactly one of them is the kernel of $L$. Then, the
condition that a stable phase exists, say, for $\theta > \bar\theta$ forces all the other eigenvalues
to be strictly positive.
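The statement that invariance forces eigenvalues to coincide can be illustrated in two dimensions. The sketch below is only a toy example with hypothetical helper names: it builds a matrix satisfying $L = Q^t L Q$ by averaging an arbitrary symmetric matrix over a finite group of orthogonal matrices, first for the symmetry group of an equilateral triangle (which acts irreducibly on $\mathbb{R}^2$), then for the two-element group generated by a single reflection (which does not).

```python
import numpy as np

def rot(w):
    return np.array([[np.cos(w), -np.sin(w)], [np.sin(w), np.cos(w)]])

def dihedral_group(n):
    """Symmetry group of a regular n-gon: n rotations and n reflections."""
    f = np.array([[-1.0, 0.0], [0.0, 1.0]])
    rotations = [rot(2 * np.pi * k / n) for k in range(n)]
    return rotations + [r @ f for r in rotations]

def symmetrize(M, group):
    """Average Q^t M Q over the group; the result satisfies L = Q^t L Q for all Q."""
    return sum(Q.T @ M @ Q for Q in group) / len(group)

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2))
M = A + A.T                               # an arbitrary symmetric matrix

L_triangle = symmetrize(M, dihedral_group(3))
L_reflection = symmetrize(M, [np.eye(2), np.array([[-1.0, 0.0], [0.0, 1.0]])])
```

In the irreducible case the averaged matrix is a multiple of the identity, so its two eigenvalues are forced to be equal; in the reducible case it is merely diagonal, and its eigenvalues are generically distinct.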
We call reduced the action of G on each i.i. subspace, and also call reduced the
group representing such action on that subspace. If one chooses the basis above
aligned with a choice of i.i. subspaces, then each matrix Q is a block matrix, each
orthogonal block corresponding to an i.i. subspace, and representing an element of
the reduced group on that subspace.
The fact that the action of G does not mix the first 6 and the last 3(n − 1)
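coordinates (see (35)), implies that the set of i.i. subspaces of $\mathbb{R}^{3(n+1)}$ necessarily
contains those of either one of the forms
$$\mathcal{V}_1 \times \{0\}\quad \text{or}\quad \{0\} \times \mathcal{V}_2, \tag{42}$$
where $\mathcal{V}_1$ ($\mathcal{V}_2$) is an i.i. subspace of $\mathbb{R}^6$ (of $\mathbb{R}^{3(n-1)}$) and $0 \in \mathbb{R}^{3(n-1)}$ ($0 \in \mathbb{R}^6$).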
Case (42)1 corresponds to configurational transitions, in which the motif follows
the deformation of the skeleton, at least in the beginning. Case (42)2 describes
structural transitions, which are driven by the deformation of the motif, followed
by a suitable consequent deformation of the skeleton. We then follow a classical
procedure: we determine the i.i. subspaces of R6 (of R3(n−1) ) for case (42)1 ((42)2 ),
and then consider the corresponding reduced problem; a description of these can
be found in [8, 5, 19, 17]. Other possibilities for the i.i. subspaces will be analyzed
elsewhere.
4. The Case of β-quartz
At low pressures quartz exhibits two stable phases, called “low” (or trigonal, or α-)
quartz and “high” (or hexagonal, or β-) quartz; at room pressure, these phases are
observed below and above about 574 °C, respectively.
We follow [12] (and [17]) by assuming that in any configuration of the SiO2
structure the positions of the Si atoms be compatible with the definition of a
3-lattice, and neglect the oxygens; thus we describe the crystalline structure of
both quartz phases by a monatomic 3-lattice, whose points are the positions of
the Si atoms in the SiO2 lattice. In the literature the α–β transition is attributed
to a suitable deformation of the tetrahedra having the center at a Si atom, and the
four nearest O atoms as vertices. Here this deformation is only described by the
displacement of the Si atoms.
In both α- and β-quartz the skeletal lattice type is hexagonal. A common choice
of lattice vectors is the following:
$$e_1 = (a, 0, 0),\qquad e_2 = \left(-\frac{a}{2}, \frac{\sqrt{3}\,a}{2}, 0\right),\qquad e_3 = (0, 0, c), \tag{43}$$
in an orthonormal basis $(i, j, k)$. The rotational subgroup of the corresponding
hexagonal holohedry is
$$H_k = \bigl\{1,\ R_i^{\pi},\ R_j^{\pi},\ R_k^{\pi},\ R_k^{\pi/3},\ R_k^{2\pi/3},\ R_k^{4\pi/3},\ R_k^{5\pi/3},\ R^{\pi}_{i \pm \sqrt{3}j},\ R^{\pi}_{\sqrt{3}i \pm j}\bigr\}, \tag{44}$$
where $R_v^{\omega}$ denotes the rotation by the angle $\omega$ about the direction of the vector $v$.
In the crystallographic literature the plane of $e_1$ and $e_2$ is called the basal plane,
and the direction of $e_3$ (and of $k$) is called the (hexagonal) optic axis.
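As an illustration of (43) and (44), the following sketch generates the rotation group from $R_k^{\pi/3}$ and $R_i^{\pi}$ and computes, for each rotation $Q$, the integer matrix $m$ with $Qe_a = m_a^b e_b$. The helper names and the sample values of $a$ and $c$ are assumptions made only for this example.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` (Rodrigues' formula)."""
    n = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K

def hexagonal_basis(a, c):
    # Rows are the lattice vectors e_1, e_2, e_3 of (43).
    return np.array([[a, 0.0, 0.0],
                     [-a / 2, np.sqrt(3) * a / 2, 0.0],
                     [0.0, 0.0, c]])

def integer_matrix(Q, e):
    """Matrix m with Q e_a = m_a^b e_b (row index a, column index b)."""
    return (e @ Q.T) @ np.linalg.inv(e)

def generate_group(gens, decimals=8):
    """Closure of the generators under multiplication, comparing rounded entries."""
    key = lambda Q: tuple(np.round(Q, decimals).ravel())
    seen = {key(np.eye(3)): np.eye(3)}
    frontier = [np.eye(3)]
    while frontier:
        new = []
        for A in frontier:
            for G in gens:
                P = A @ G
                if key(P) not in seen:
                    seen[key(P)] = P
                    new.append(P)
        frontier = new
    return list(seen.values())

e = hexagonal_basis(1.0, 1.6)                       # sample a and c
gens = [rotation([0, 0, 1], np.pi / 3), rotation([1, 0, 0], np.pi)]
H_k = generate_group(gens)
m_k = integer_matrix(gens[0], e)                    # matrix of R_k^{pi/3}
```

With this basis the group has the twelve elements listed in (44), each represented by an integer matrix of determinant 1; for instance $R_k^{\pi/3}$ sends $e_1$ to $e_1 + e_2$ and $e_2$ to $-e_1$.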
One of the two possible (enantiomorphic) 3-lattice structures of β-quartz at the
transition temperature θ0 has descriptors εσ0 , σ = 1, . . . , 5, where the lattice vectors
$e_a^0$ are given by (43) for suitable choices $a_0$, $c_0$, of $a$ and $c$, and the shifts are
$$p_1^0 = \tfrac{1}{2}\,e_1^0 + \tfrac{2}{3}\,e_3^0,\qquad p_2^0 = \tfrac{1}{2}\,e_2^0 + \tfrac{1}{3}\,e_3^0. \tag{45}$$
Figure 4 shows the projection of the 3-lattice $\mathcal{M}(\varepsilon_\sigma^0)$ onto the basal plane of $e_1^0$ and
$e_2^0$ orthogonal to the optic axis $e_3^0$.
James [12] (see also [17]) constructs the above 3-lattice structure of β-quartz
as follows: consider a regular hexagonal planar honeycomb of edge $d_0 = (\sqrt{3}/3)a_0$,
one cell of which is drawn by a plain line in the lower right part of Figure 4. For
each one of its vertices consider a right-handed (with respect to k) circular helix
of radius d0 /2 and pitch c0 , whose axis goes through that vertex and is orthogonal
to the plane of the honeycomb. It is possible to arrange the right-handed helices
in such a way that each one meets the neighboring three in equally spaced points
along the helix itself, each point having the same projection onto the plane of the
honeycomb as the third point following it. Each intersection point of the helices is
the site of a Si atom. The circles in Figure 4 are the projections of the helices in the
plane of the honeycomb, which is the basal plane generated by e10 , e20 .
The crystal class $P(\varepsilon_\sigma^0) < SO(3)$ of this 3-lattice is the rotational subgroup $H_k$
(see (44)) of the hexagonal holohedry $P(e_a^0)$, and is generated, for instance, by $R_k^{\pi/3}$
and $R_i^{\pi}$. This class is called hexagonal trapezohedral, is denoted by 6 2 2 in [11],
Figure 4. Projection onto the basal plane of the Si atoms in right-handed β-quartz, and of the
descriptors εσ0 = (ea0 , p10 , p20 ) for the 3-lattice given by (43) and (45) with λ > 0.
and is the actual crystal class of β-quartz, so that the monatomic 3-lattice M(εσ0 )
gives a good approximation of the actual structure of this quartz phase.
Case (42)1
We follow [5], in the notation of [17] to which we refer for more details.
(1) The set $C(e_a^0)$ of symmetric tensors left fixed by the transformation $E \mapsto
Q^t E Q$ of Sym forms the 2-dimensional subspace of Sym represented in the
basis $(i, j, k)$ by matrices of the form
$$\begin{pmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \beta \end{pmatrix}; \tag{46}$$
a basis for $C(e_a^0)$ is given, for instance, by the orthonormal vectors
$$V_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\quad \text{and}\quad V_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{47}$$
The i.i. subspaces contained in $C(e_a^0)$ are those and only those that are
1-dimensional. In each one of them the reduced action is trivial, and the bifurcation point is a turning (or limit) point, with change of stability but not of
symmetry.
(2) In the orthogonal complement $C(e_a^0)^{\perp}$ of $C(e_a^0)$ there are two mutually orthogonal 2-dimensional i.i. subspaces $\mathcal{V}_1$, $\mathcal{V}_2$, generated by $V_3$, $V_4$ and $V_5$, $V_6$,
respectively, with
$$V_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\qquad V_4 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{48}$$
$$V_5 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},\qquad V_6 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}. \tag{49}$$
(3) The reduced group $P_1$ on $\mathcal{V}_1$ is the symmetry group of an equilateral triangle
in $\mathbb{R}^2$ with center at the origin and a vertex on the second coordinate axis; it
has order six and is generated by⋆
$$f := \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \approx R_i^{\pi}\quad \text{and}\quad r\!\left(\frac{2\pi}{3}\right) \approx R_k^{\pi/3},\qquad r(\omega) := \begin{pmatrix} \cos\omega & -\sin\omega \\ \sin\omega & \cos\omega \end{pmatrix}. \tag{50}$$
Thus the action of $H_k$ on the typical element of $\mathcal{V}_1$ produces six monoclinic
variants, the symmetry axis being $k$. In addition, $\mathcal{V}_1$ contains three 1-dimensional base-centered orthorhombic subspaces, whose crystal class is rhombic
disphenoidal (2 2 2 in [11]).
The reduced bifurcation diagram consists of three transcritical bifurcating
curves which belong to the aforementioned subspaces and are all unstable.
This is detailed in [17], where it is also shown how these curves can be restabilized, thus producing a subcritical bifurcation.
(4) The reduced group $P_2$ on $\mathcal{V}_2$ is the group of symmetries of a regular hexagon
in $\mathbb{R}^2$ with center at the origin and a vertex on the first axis, has order twelve
and is generated by
$$f \approx R_i^{\pi}\quad \text{and}\quad r\!\left(\frac{\pi}{3}\right) \approx R_k^{5\pi/3}. \tag{51}$$
Thus the action of the hexagonal holohedry on the typical element of $\mathcal{V}_2$
produces twelve triclinic variants.
In $\mathcal{V}_2$ there are two triples of symmetry-related 1-dimensional subspaces made
of centered monoclinics, whose crystal class is monoclinic sphenoidal (2 in
[11]). The bifurcating curves consist of two triples of pitchforks, each one in
one of these triples of monoclinic subspaces. For a triple of pitchforks to be
stable it is necessary that it be supercritical: it must exist for $\theta \le \theta_0$ under
the assumed stability of the high-symmetry phase for $\theta > \theta_0$; then necessarily
also the other triple is supercritical, and exactly one of them is stable.
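The reduced actions in items (1)–(4) can be reproduced numerically from the tensors (47)–(49). The sketch below uses hypothetical helper names; it computes the matrix of $E \mapsto Q^t E Q$ restricted to the span of a pair of basis tensors, for the generators $R_k^{\pi/3}$ and $R_i^{\pi}$, and counts the elements of the group they generate.

```python
import numpy as np

s = 1 / np.sqrt(2)
V = [s * np.diag([1.0, 1.0, 0.0]),                                  # V1
     np.diag([0.0, 0.0, 1.0]),                                      # V2
     s * np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], float),        # V3
     s * np.diag([1.0, -1.0, 0.0]),                                 # V4
     s * np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], float),        # V5
     s * np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], float)]        # V6

def inner(A, B):
    # Inner product on Sym: A . B = trace(A B^t).
    return np.trace(A @ B.T)

def reduced_action(Q, basis):
    """Matrix of E -> Q^t E Q on span(basis), in that orthonormal basis."""
    return np.array([[inner(Vk, Q.T @ Vl @ Q) for Vl in basis] for Vk in basis])

def group_order(gens, decimals=8):
    """Number of distinct products of the generators (finite group assumed)."""
    key = lambda M: tuple(np.round(M, decimals).ravel())
    seen = {key(np.eye(len(gens[0])))}
    frontier = [np.eye(len(gens[0]))]
    while frontier:
        new = []
        for A in frontier:
            for G in gens:
                P = A @ G
                if key(P) not in seen:
                    seen.add(key(P))
                    new.append(P)
        frontier = new
    return len(seen)

c, t = np.cos(np.pi / 3), np.sin(np.pi / 3)
R_k = np.array([[c, -t, 0], [t, c, 0], [0, 0, 1]])   # R_k^{pi/3}
R_i = np.diag([1.0, -1.0, -1.0])                     # R_i^{pi}

gram = np.array([[inner(A, B) for B in V] for A in V])
act1 = [reduced_action(Q, V[2:4]) for Q in (R_k, R_i)]   # on span(V3, V4)
act2 = [reduced_action(Q, V[4:6]) for Q in (R_k, R_i)]   # on span(V5, V6)
```

Here `gram` is the 6 by 6 identity, the action of $R_k^{\pi/3}$ has order three on the span of $V_3$, $V_4$ and order six on the span of $V_5$, $V_6$, and $R_i^{\pi}$ is represented by $f$ on both; the generated groups have orders six and twelve, respectively.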
⋆ Here and below ≈ means “represented by” or “representing”.
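Case (42)2
We denote by $\alpha_k^{\pi/3}$ the submatrix α of the matrix $\mu_k^{\pi/3} \in \Lambda^{+}(\varepsilon_\sigma^0)$ that corresponds
to the rotation $R_k^{\pi/3}$ according to (13), etc., $\alpha_1$ denoting the 2 by 2 identity. Using
the expressions of the generators $\mu_k^{\pi/3}$ and $\mu_i^{\pi}$ reported in [17], we have
\[
\alpha_1 = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \alpha_k^{\pi},
\qquad
\alpha_k^{\pi/3} = \begin{pmatrix} -1 & -1\\ 1 & 0 \end{pmatrix} = \alpha_k^{4\pi/3},
\tag{52}
\]
\[
\alpha_k^{2\pi/3} = \begin{pmatrix} 0 & 1\\ -1 & -1 \end{pmatrix} = \alpha_k^{5\pi/3},
\qquad
\alpha_i^{\pi} = \begin{pmatrix} -1 & -1\\ 0 & 1 \end{pmatrix} = \alpha_j^{\pi},
\tag{53}
\]
\[
\alpha^{\pi}_{\sqrt{3}i-j} = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} = \alpha^{\pi}_{i+\sqrt{3}j},
\qquad
\alpha^{\pi}_{i-\sqrt{3}j} = \begin{pmatrix} 1 & 0\\ -1 & -1 \end{pmatrix} = \alpha^{\pi}_{\sqrt{3}i+j}.
\tag{54}
\]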
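Therefore, by (31),
\[
g = \begin{pmatrix} \tfrac{4}{3} & \tfrac{2}{3}\\[2pt] \tfrac{2}{3} & \tfrac{4}{3} \end{pmatrix}
\quad\text{and}\quad
W = \frac{1}{3\sqrt{2}}\begin{pmatrix} 3+\sqrt{3} & 3-\sqrt{3}\\ 3-\sqrt{3} & 3+\sqrt{3} \end{pmatrix},
\]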
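from which, by (33) and in obvious notation,
\[
\begin{aligned}
\beta_1 &= \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \beta_k^{\pi}, &
\beta_k^{\pi/3} &= -\frac{1}{2}\begin{pmatrix} 1 & \sqrt{3}\\ -\sqrt{3} & 1 \end{pmatrix} = \beta_k^{4\pi/3},\\
\beta_k^{2\pi/3} &= -\frac{1}{2}\begin{pmatrix} 1 & -\sqrt{3}\\ \sqrt{3} & 1 \end{pmatrix} = \beta_k^{5\pi/3}, &
\beta_i^{\pi} &= -\frac{1}{2}\begin{pmatrix} \sqrt{3} & 1\\ 1 & -\sqrt{3} \end{pmatrix} = \beta_j^{\pi},\\
\beta^{\pi}_{\sqrt{3}i-j} &= \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} = \beta^{\pi}_{i+\sqrt{3}j}, &
\beta^{\pi}_{i-\sqrt{3}j} &= -\frac{1}{2}\begin{pmatrix} -\sqrt{3} & 1\\ 1 & \sqrt{3} \end{pmatrix} = \beta^{\pi}_{\sqrt{3}i+j}.
\end{aligned}
\tag{55--58}
\]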
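As a numerical cross-check of the matrices above, the following sketch recomputes the square of W and two of the β matrices from the corresponding α matrices. It assumes that relation (33), which is not reproduced in this section, amounts to the conjugation β = W α W⁻¹; the helper names (matmul, inverse, conjugate, close) are illustrative and not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

s3 = math.sqrt(3.0)
g = [[4 / 3, 2 / 3], [2 / 3, 4 / 3]]
c = 1.0 / (3.0 * math.sqrt(2.0))
W = [[c * (3 + s3), c * (3 - s3)], [c * (3 - s3), c * (3 + s3)]]

alpha_k = [[-1, -1], [1, 0]]   # alpha_k^{pi/3}, from (52)
alpha_i = [[-1, -1], [0, 1]]   # alpha_i^{pi}, from (53)

def conjugate(alpha):
    """Assumed form of (33): beta = W alpha W^{-1}."""
    return matmul(matmul(W, alpha), inverse(W))

beta_k = conjugate(alpha_k)
beta_i = conjugate(alpha_i)
```

Under this assumption W is the symmetric square root of g, and the integer matrices α are turned into orthogonal matrices β, which is what makes the reduced groups recognizable as symmetry groups of a triangle or a hexagon.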
Consider now the action induced by (37)2 on the 6-dimensional space of shift
components, with typical element (a1 , a2 , a3 , b1 , b2 , b3 ) representing the pair
(ϖ1 , ϖ2 ). One can check that this action transforms into themselves the subspaces
W1 and W of the form (0, 0, a3 , 0, 0, b3 ) and (a1 , a2 , 0, b1 , b2 , 0), respectively.
The 2-dimensional subspace W1 , described by the pair (a3 , b3 ), is irreducible
invariant, and consists of monoclinic 3-lattices, with axis k. The reduced group
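on W1 has order 6, is generated by the matrices
\[
\frac{1}{2}\begin{pmatrix} \sqrt{3} & 1\\ 1 & -\sqrt{3} \end{pmatrix} \approx R_i^{\pi} \approx R_j^{\pi}
\quad\text{and}\quad
r\!\left(\tfrac{4\pi}{3}\right) \approx R_k^{\pi/3},
\tag{59}
\]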
and is the symmetry group of an equilateral triangle in R2 centered at the origin
and having a vertex on the bisectrix of the second and fourth quadrant. Therefore,
to within a rotation by π/4 of the coordinates, this reduced problem is the same
as the one in item (3) of Case (42)1 . As there, the bifurcation diagram consists of
three unstable transcritical bifurcating curves of orthorhombic 2 2 2 symmetry. For
instance, the orthorhombic axes (besides k) are i and j for the choice of shifts
\[
\pi_1 = 2\lambda k = 2\pi_2, \qquad \lambda \in \mathbb{R}.
\tag{60}
\]
This can be obtained from the corresponding 1-dimensional subspace of W1 , which
has the form
\[
(a_3, b_3) = \gamma\,(2+\sqrt{3},\, 1), \qquad \gamma \in \mathbb{R},
\tag{61}
\]
by (34) and (55), or directly from the condition (see (28))
\[
R_i^{\pi}\,\pi_r = (\alpha_i^{\pi})^{s}_{r}\,\pi_s;
\tag{62}
\]
or, equivalently, from its analogue for $R_j^{\pi}$.
The subspace W decomposes into the orthogonal sum of three i.i. subspaces:
W2 , W3 , W4 , the first two of dimension 1, the third of dimension 2. They are
respectively generated by
\[
w_2 = (-1,\; 2+\sqrt{3},\; 0,\; 2+\sqrt{3},\; 1,\; 0),
\tag{63}
\]
\[
w_3 = (2+\sqrt{3},\; 1,\; 0,\; 1,\; -2-\sqrt{3},\; 0), \quad\text{and}
\tag{64}
\]
\[
w_4 = (1, 0, 0, 0, 1, 0), \qquad w_4' = (0, 1, 0, -1, 0, 0).
\tag{65}
\]
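The orthogonality of the decomposition W = W2 ⊕ W3 ⊕ W4 can be checked directly on the generators (63)–(65). The short sketch below does so with the standard dot product of R6; the variable names are illustrative.

```python
import math

s3 = math.sqrt(3.0)

# Generators (63)-(65) in the shift components (a1, a2, a3, b1, b2, b3).
generators = {
    "w2": (-1, 2 + s3, 0, 2 + s3, 1, 0),
    "w3": (2 + s3, 1, 0, 1, -2 - s3, 0),
    "w4": (1, 0, 0, 0, 1, 0),
    "w4'": (0, 1, 0, -1, 0, 0),
}

def dot(u, v):
    """Standard inner product of two 6-tuples."""
    return sum(a * b for a, b in zip(u, v))

names = list(generators)
# Inner products of all distinct pairs of generators.
off_diagonal = {
    (p, q): dot(generators[p], generators[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
```

All six pairwise products vanish, so the four vectors are mutually orthogonal; in particular w4 and w4' form an orthogonal basis of the 2-dimensional subspace W4.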
The reduced group on W2 [W3 ] is {1, −1}; for instance,
\[
1 \approx R_i^{\pi} \;\; [\,1 \approx R_j^{\pi}\,]
\quad\text{and}\quad
-1 \approx R_k^{\pi/3} \approx R_k^{\pi}.
\tag{66}
\]
Therefore, as is known, the bifurcation diagram is the standard pitchfork. A fourth-order polynomial energy is sufficient to capture the qualitative features of a (supercritical) second-order bifurcation, while a subcritical first-order one, as in the case
of quartz, requires a sixth-order polynomial (see, for instance, [5, 6] or [17]). In W2
[in W3 ] the crystal class of the bifurcating multilattices is trigonal trapezohedral
((32) in [11]), with k as 3-fold axis; the additional generator of the point group is
$R_i^{\pi}$ [is $R_j^{\pi}$].
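The distinction between the quartic and the sextic energy can be illustrated on a one-dimensional model. The sketch below assumes a reduced energy of the form a x²/2 + b x⁴/4 + c x⁶/6 in a single order parameter x, with a changing sign at the transition; the coefficient names a, b, c and the function name are hypothetical and are not taken from [5, 6, 17].

```python
import math

def nonzero_equilibria(a, b, c=0.0):
    """Positive equilibria of E(x) = a*x**2/2 + b*x**4/4 + c*x**6/6.

    Nonzero equilibria solve a + b*x**2 + c*x**4 = 0. Each is returned as
    (x, is_stable), where stability means E''(x) = a + 3*b*x**2 + 5*c*x**4 > 0.
    The negative equilibria are the mirror images of the positive ones.
    """
    if c == 0.0:
        if b == 0.0:
            return []
        roots_sq = [-a / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return []
        r = math.sqrt(disc)
        roots_sq = [(-b - r) / (2.0 * c), (-b + r) / (2.0 * c)]
    result = []
    for s in roots_sq:
        if s > 0.0:
            second_derivative = a + 3.0 * b * s + 5.0 * c * s * s
            result.append((math.sqrt(s), second_derivative > 0.0))
    return sorted(result)
```

With b > 0 and c = 0 a stable branch exists only for a < 0, which is the supercritical pitchfork. With b < 0 the quartic energy is unbounded below, and a term with c > 0 is needed: for small a > 0 the model then has an unstable and a stable nonzero equilibrium coexisting with the stable state x = 0, the signature of a first-order transition.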
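The reduced group on W4 has order 12, is generated by the matrices
\[
\frac{1}{2}\begin{pmatrix} \sqrt{3} & 1\\ 1 & -\sqrt{3} \end{pmatrix} \approx R_i^{\pi}
\quad\text{and}\quad
r\!\left(\tfrac{\pi}{3}\right) \approx R_k^{\pi/3},
\tag{67}
\]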
and is the symmetry group of a regular hexagon in R2 which is centered at the
origin and has a vertex on the bisectrix of the second and fourth quadrant (compare
with (59)). Therefore, to within a rotation by π/4 of the coordinates, this reduced
Figure 5. Projection as in Figure 4 for the right-handed α-quartz structure, and of the descriptors εσ+ = (ea , p1+ , p2+ ) for the 3-lattice given by (43) and (70) with U = 1 and
λ > 0.
problem is the same as the one in item (4) of Case (42)1 and has the same qualitative
bifurcation diagram. All the bifurcating branches have monoclinic 2 symmetry.
We now analyze in detail the two trigonal trapezohedral subspaces W2 and W3 .
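Using (34) and (55), we see that the reference shift increments corresponding to
W2 are, in terms of real parameters λ̄, λ′ ,
\[
\pi_1 = \bar{\lambda}\left(0,\; \frac{2(3+\sqrt{3})}{3},\; 0\right)
      = 2\left(1+\sqrt{3}\right)\lambda'\,\frac{e_1^0 + 2e_2^0}{3},
\tag{68}
\]
\[
\pi_2 = \bar{\lambda}\left(1+\sqrt{3},\; \frac{3+\sqrt{3}}{3},\; 0\right)
      = 2\left(1+\sqrt{3}\right)\lambda'\,\frac{2e_1^0 + e_2^0}{3}.
\tag{69}
\]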
Equivalently, by (25), denoting by $p_r^{+}$ the present shifts, with $p_r = U p_r^0$, $e_a = U e_a^0$,
and λ a real parameter, we have
\[
p_1^{+} = p_1 + \lambda(e_1 + 2e_2), \qquad p_2^{+} = p_2 + \lambda(2e_1 + e_2).
\tag{70}
\]
These shifts represent deformed β-quartz for λ = 0, while for λ ≠ 0 they give the
$M(\varepsilon_\sigma^{+})$ 3-lattice model for trigonal trapezohedral α-quartz proposed in [12], and
used also in [6, 17]. The projection of this 3-lattice onto the (basal) plane of e1 and
e2 is sketched in Figure 5 for λ > 0.
Based on [12], we recall how the α-quartz structure can be obtained by deforming the helices described above for β-quartz. If the reference β-quartz helices, of
radius d0 /2, maintain their axes while being radially stretched, so that their radius
becomes larger than d0 /2, then any point of initial intersection of two neighboring
helices splits into two alternative intersections. For the actual α-quartz structure,
which is uniformly stretched by some U ∈ Sym> whose representative matrix in
the basis (i, j , k) has the form (46), the projections of these intersections onto the
basal plane are shown in the lower right part of Figure 5. To maintain the threefold
rotational symmetry about the normal to the honeycomb, the Si atoms must still
be evenly spaced on the helices on such intersections, each atom being on the
same vertical (with respect to the basal plane) as the third atom following it on
the helix. There are exactly two ways in which this can happen: on an arbitrarily
selected helix Si atoms are placed on either the first or the second of its possible
intersections with one of the neighboring helices; either choice forces in a unique
way the atoms on the other neighboring helices to be all placed on the second or on
the first of their intersections, etc., up to completion of the whole structure. Figure 5
shows one of the configurations obtained in this way, namely M(εσ+ ) with shifts
given by (70) for λ > 0. For fixed λ, the other possibility is given by the Dauphiné
twin $M(\varepsilon_\sigma^{-})$, where the $\varepsilon_\sigma^{-} = (e_a, p_1^{-}, p_2^{-})$ have the same lattice vectors $e_a$ as $\varepsilon_\sigma^{+}$,
and shifts
\[
p_1^{-} = p_1 - \lambda(e_1 + 2e_2), \qquad p_2^{-} = p_2 - \lambda(2e_1 + e_2),
\tag{71}
\]
with the same λ as in (70). These two configurations correspond to symmetry-related points on the bifurcated branches of the pitchfork in the W2 subspace
mentioned above. We refer the reader to [17] for more details on Dauphiné twins,
and only recall that the twin multilattice $M(\varepsilon_\sigma^{-})$ can be obtained from $M(\varepsilon_\sigma^{+})$ by
means of the rotation $R_k^{\pi}$, of order 2 (see also (66)).
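Using (34) and (55), we observe that the reference shift increments corresponding to elements of the subspace W3 are, in terms of real parameters µ̄, µ′ ,
\[
\pi_1' = 4\bar{\mu}\left(3+\sqrt{3},\; 0,\; 0\right) = 4\mu'\left(3+\sqrt{3}\right)e_1^0,
\tag{72}
\]
\[
\pi_2' = 4\bar{\mu}\left(3+\sqrt{3}\right)\left(\frac{1}{2},\; -\frac{\sqrt{3}}{2},\; 0\right)
       = -4\mu'\left(3+\sqrt{3}\right)e_2^0.
\tag{73}
\]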
Equivalently, by (25), denoting by $p_r'$ the corresponding present shifts, we have
\[
p_1' = p_1 + \mu e_1, \qquad p_2' = p_2 - \mu e_2,
\tag{74}
\]
with µ a real parameter. We have deformed β-quartz for µ = 0, while symmetry-related points on the bifurcated branches of the pitchfork in this subspace correspond to opposite values of µ ≠ 0 in the shifts given by (74). The related
multilattices, say $M(\varepsilon_\sigma')$ and $M(\varepsilon_\sigma'')$, are another example of shuffle twins, very
similar to the Dauphiné twins described above. In particular, the twin multilattice
$M(\varepsilon_\sigma'')$ can be obtained from $M(\varepsilon_\sigma')$ by means of the same rotation $R_k^{\pi}$, of order 2,
which relates the Dauphiné twins (see (66)).
Also in this case one can describe the low-symmetry phase and the twins in
terms of deformation of the reference β-quartz helices. Now the radius of those
Figure 6. Projection as in Figure 5 for the right-handed quartz structure with shifts given
by (74) for U = 1 and µ > 0.
helices shrinks, becoming less than d0 /2, and hence neighboring helices no longer
intersect. Figure 6 shows one of the possible arrangements of the actual
helices; looking at the hexagon drawn in the lower right corner, we see that the
other possibility – which gives the twinned configuration – is obtained by exchanging the occupied and the nonoccupied helices in that hexagon, and consequently in
the whole structure.
In his paper [6], among other things, Ericksen tackles the problem of finding
an alternative to the so-called incommensurate phase, which is introduced in the
physical literature to explain certain features of the α–β transition in quartz. He
looks for configurations described by a 3-lattice with hexagonal skeleton and shifts
such that the symmetry of the structure consists in the threefold axis k alone. The
reference shift increments must have the form
π1 = (λ + µ)e10 + 2λe20 ,
π2 = 2λe10 + (λ − µ)e20 ,
(75)
for λ and µ varying arbitrarily over the reals; this corresponds in fact to the
2-dimensional orthogonal sum W2 ⊕ W3 of the above i.i. subspaces W2 and W3 .
In that 2-dimensional space Ericksen, generalizing an example in [19], introduces
a sextic polynomial (reduced) potential in the variables λ, µ, with the coefficients
of λ² and µ² depending affinely on environmental pressure p and temperature θ,
and all the other coefficients constant. The related bifurcation analysis shows the
existence of four phases, labelled I (λ = 0 = µ, β-phase), II (λ ≠ 0 = µ, α-phase,
our subspace W2 ), III (λ = 0 ≠ µ, our subspace W3 ), and IV (λ ≠ 0 ≠ µ, the
complement of the previous phases in W2 ⊕ W3 ). According to Ericksen, phases
II and III and their twins are rather similar, so that one could be mistaken for the
other. For instance, the twins can be described for both phases in terms of the same
twinning operation $R_k^{\pi}$.
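The parametrization (75) and the labelling of the four phases can be summarized in a few lines of code. This is only a bookkeeping sketch: the function names and the optional tolerance are illustrative, and the shift increments are returned as coefficient pairs with respect to the reference lattice vectors e1⁰, e2⁰.

```python
def shift_increments(lam, mu):
    """Coefficients of (pi_1, pi_2) in the basis (e_1^0, e_2^0), following (75)."""
    pi_1 = (lam + mu, 2 * lam)
    pi_2 = (2 * lam, lam - mu)
    return pi_1, pi_2

def phase(lam, mu, tol=0.0):
    """Phase label I-IV according to which of lam, mu is nonzero."""
    has_lam = abs(lam) > tol
    has_mu = abs(mu) > tol
    if not has_lam and not has_mu:
        return "I"      # beta-phase
    if has_lam and not has_mu:
        return "II"     # alpha-phase, subspace W2
    if has_mu and not has_lam:
        return "III"    # subspace W3
    return "IV"         # generic point of the sum of W2 and W3
```

Setting µ = 0 reproduces the increments underlying (70), and setting λ = 0 those underlying (74); changing the sign of the nonzero parameter gives the corresponding twin.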
Ericksen uses the sextic potential to describe the transition from β- to α-quartz,
at constant pressure when temperature is lowered, by means of a second-order
transition from phase I to phase III, followed by a first-order transition between
phases III and II.
Notice that a direct transition from I to IV corresponds to bifurcating into the
orthogonal sum of W2 and W3 in an (initial) direction contained in neither W2
nor W3 , which requires the vanishing of both the eigenvalues of L corresponding
to those eigenspaces. This is not forced by symmetry, and hence cannot generically occur with one control parameter, as the kernel of L is a reducible invariant
subspace of R6 .
We refer the reader to [6] for details, in particular for a comparison with certain formulae of [19] involving bifurcation into a reducible invariant subspace,
of interest in his bifurcation analysis, and for comments on the incommensurate
phase.
5. A Personal Tribute to Clifford Truesdell
The fact that this paper appears in a volume dedicated to the memory of Clifford
Truesdell gives me the opportunity to acknowledge, in addition to my debt of gratitude and admiration toward him, some of his merits that are both little known and
very unusual.
Truesdell was keenly interested in many research
fields different from those in which he himself or his associates were active. Relevant to
this tribute is his interest in the logic of modality, that is, of possibility and hence of necessity, especially in the version introduced by Bressan [1] under
the name of physical possibility to write rigorous foundations of classical particle
mechanics based on ideas of Mach and Painlevé [13]. This version of possibility
has some intuitive features by virtue of which it is regarded as a primitive concept;
it makes the axiomatization of physics rigorous and conformant to the views of Hamel
[9, 10]; and it endows the axiomatization with a threefold interdisciplinary character: mathematical physics, logic, and philosophy of science. Papers having this
interdisciplinary character are exceptionally rare and can be easily misunderstood.
As far as I am concerned, Truesdell showed his interest in the logic of modalities
by accepting as appendix G6 of his well known book [21] my paper [14], in which
I explicitly refer to the theory of modal logic presented in progressively refined
versions in [1–3]. It is true that I use these versions from an intuitive point of
view,⋆ but the aforementioned version of physical possibility is a key ingredient in
definitions and axioms.
⋆ And, indeed, in my university curriculum I never had a formal training in mathematical logic,
nor attended lectures on technical parts of it.
Moreover, at about the same time, Truesdell used very strong arguments in
some polemic considerations on the axiomatization of mechanics ([20], Part V,
Philosophy, essay 39: Suppesian stews (1980/1981)). These arguments show that,
in fact, he had understood the main features of [1].
Additional evidence of Truesdell’s appreciation of the logic of modalities is
contained in a 1986 letter of his to the Accademia dei Lincei, of which he gave me a
confidential copy. There, he strongly supports the research program of extending a
theory à la Mach–Painlevé from particle mechanics to the mechanics of continuous
media. Details, background and extensive references are given in [16].
Those of us who have worked or are still working to render the approach in
[13] rigorous, and to generalize it, are well aware that, even today, that approach
is widely ignored by mathematical physicists and researchers in mechanics. The
interdisciplinary character of that work sometimes strongly contributes to serious
misunderstandings of the related publications, even involving their contents. This
confirms that Truesdell’s interest and understanding illustrated above were really
very unusual.
Acknowledgements
I want to extend here to Charlotte Truesdell my gratitude to her husband Clifford
expressed earlier. This work is part of the research activities of the EU Network
“Phase Transitions in Crystalline Solids”, and has been partially supported by
the Italian M.U.R.S.T. through the project “Mathematical Models for Materials
Science”.
References
1. A. Bressan, Metodo di assiomatizzazione in senso stretto della meccanica classica. Applicazione di esso ad alcuni problemi di assiomatizzazione non ancora completamente risolti. Rend. Sem. Mat. Univ. Padova 32 (1962) 55–212.
2. A. Bressan, A General Interpreted Modal Calculus. Yale Univ. Press, New Haven/London (1972). Foreword by N.D. Belnap, Jr., 327 pages.
3. A. Bressan, (a) On physical possibility and (b) Supplement: A much used notion of physical possibility and Gödel’s undecidability theorem. In: M. Dalla Chiara Scabbia (ed.), Italian Studies in Philosophy of Science. North-Holland, Amsterdam (1981) pp. 197–210 and 211–214.
4. B. Budiansky and L. Truskinovsky, On the mechanics of stress-induced phase transformations in zirconia. J. Mech. Phys. Solids 41 (1993) 1445–1459.
5. J.L. Ericksen, Local bifurcation theory for thermoelastic Bravais lattices. In: J.L. Ericksen, R.D. James, D. Kinderlehrer and M. Luskin (eds), Microstructure and Phase Transition, IMA Volumes in Mathematics and its Applications, Vol. 54. Springer, New York (1993).
6. J.L. Ericksen, On the theory of the α–β phase transition in quartz. J. Elasticity 63 (2001) 61–86.
7. G. Fadda, L. Truskinovsky and G. Zanzotto, Unified Landau description of the tetragonal, orthorhombic, and monoclinic phases of zirconia. Phys. Rev. B 66 (2002) 174107 1–10.
8. M. Golubitsky, D. Schaeffer and I. Stewart, Singularities and Groups in Bifurcation Theory, Vol. II. Applied Mathematical Sciences, Vol. 69. Springer, New York (1988).
9. G. Hamel, Über die Grundlagen der Mechanik. Math. Annalen 66 (1908) 350–397.
10. G. Hamel, Die Axiome der Mechanik, Handbuch der Physik, Vol. 5. Springer, Berlin (1927) pp. 1–42.
11. T. Hahn (ed.), International Tables for X-ray Crystallography, Vol. A. Reidel, Dordrecht/Boston (1996).
12. R.D. James, The stability and metastability of quartz. In: S.S. Antman, J.L. Ericksen, D. Kinderlehrer and I. Müller (eds), Metastability and Incompletely Posed Problems, IMA Volumes in Mathematics and its Applications, Vol. 3. Springer, New York (1987).
13. P. Painlevé, Les Axiomes de la Mécanique. Gauthier-Villars, Paris (1922).
14. M. Pitteri, On the axiomatic foundations of temperature. In: C. Truesdell’s Rational Thermodynamics, 2nd edn. Springer, New York (1984) Appendix G6, pp. 522–544.
15. M. Pitteri, On bifurcations in multilattices. In: R. Monaco, M. Pandolfi and S. Rionero (eds), Proc. of WASCOM 2001, Porto Ercole, Italy. World Scientific, Singapore (2002).
16. M. Pitteri, On certain weak phase transformations in multilattices. TMR Network “Phase Transitions in Crystalline Solids” Preprint Series, No. 100 (2002). Also Rapporto Tecnico DMMMSA No. 88, 2/12/2002. Available at http://www.dmsa.unipd.it/tmr/public_html/PreprintDMMMSA.pdf.
17. M. Pitteri and G. Zanzotto, Continuum Models for Phase Transitions and Twinning in Crystals. CRC/Chapman & Hall, Boca Raton/London (2002).
18. N.K. Simha and L. Truskinovsky, Phase diagram of zirconia in stress space. In: R. Batra and M. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials. CIMNE, Barcelona (1996).
19. P. Tolédano and V. Dmitriev, Reconstructive Phase Transformations: In Crystals and Quasicrystals. World Scientific, Singapore (1996).
20. C.A. Truesdell, An Idiot’s Fugitive Essays on Science: Methods, Criticism, Training, Circumstances. Springer, New York (1984).
21. C.A. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1984).
22. C.A. Truesdell and R.G. Muncaster, Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas, Treated as a Branch of Rational Mechanics. Academic Press, New York (1980).
A New Quasilinear Model for Plate Buckling
PAOLO PODIO-GUIDUGLI
Dipartimento di Ingegneria Civile, Università di Roma “Tor Vergata”, Viale Politecnico, 1–I-00133
Roma, Italy. E-mail: ppg@uniroma2.it
Received 5 November 2002; in revised form 19 March 2003
Abstract. A new quasilinear model for plate buckling is presented, which reduces to von Kármán’s
semilinear model through an explicit approximation procedure.
Mathematics Subject Classifications (2000): 74G60, 74K20, 74B20.
Key words: plate buckling, von Kármán equations, nonlinear eigenvalue problems.
1. Introduction
Von Kármán’s is a system of two semilinear partial differential equations, each with
a biharmonic principal part, which is intended to describe the large deflections of
thin elastic plates and, in particular, their buckling under the action at the boundary
of either in-plane compressive loads or in-plane inward displacements. The von
Kármán buckling equations have been and are popular among nonlinear analysts
because they provide a relatively easy and significant example of application of
various abstract techniques that have been devised to study nonlinear eigenvalue
problems (cf. [1–3]).
Von Kármán equations do capture well the target phenomenology. Yet, their
standard “derivations” in the engineering literature (including von Kármán’s
own [21]) have a host of conceptual defects that have been repeatedly exposed,
but never completely cured. To quote from Truesdell’s brilliant criticism in the
Epilogue of [19], one may choose to regard that theory “. . . as handed down by
some higher power (a Hungarian wizard, say) and study it as a matter of pure
analysis” (p. 601). Indeed, such a discrepancy between final product and derivation
is anything but unusual for named equations, since the greats’ intuition often
compensates for a lack of rigor; but rarely, if ever, to such an extent. This is perhaps the
reason why, both in [5] and lately in [6, p. 367], Ciarlet states that “von Kármán
equations . . . play an almost mythical role in applied mathematics”.
Sound mathematical justifications of the von Kármán theory have been given.
Ciarlet [4] has shown that, under appropriate hypotheses on the boundary data and
the elastic response, the von Kármán equations arise as the leading terms in a formal asymptotic expansion with respect to the thickness parameter of a quasilinear
problem in three-dimensional elasticity. Refinements and complements to Ciarlet’s
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 673–698
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
approach are found in [8],⋆ [5, 9; 6, Chapter 5]. Moreover, very recently Friesecke
et al. [10, 11] have established the Föppl–von Kármán plate theory as a Γ-limit of
three-dimensional nonlinear elasticity by scaling the energy as the fifth power of
the plate’s thickness.⋆⋆
My present derivation begins with a kinematic Ansatz (relations (3.7) and (3.8))
just as von Kármán’s, and has various other similarities with it. In fact, I strive to
keep as close to von Kármán’s line of reasoning as possible, and use the toolkit of
modern continuum mechanics to rigorously justify every single step. Mine is an exact derivation from three-dimensional elasticity of a two-dimensional, quasilinear
system of equations that reduces to von Kármán’s through an explicit approximation procedure. In other words, I show that the classical buckling equations of von
Kármán can be given a rational position with respect to a two-dimensional system
that is an exact consequence of the three-dimensional system of elasticity and has
the same type of nonlinearity. At variance with methods based on asymptotic
expansions or variational convergence, which apparently cannot help extracting a
semilinear problem from a quasilinear one, my method preserves the mathematical
type of the original three-dimensional problem. Hence, the bifurcation problem I
derive is more difficult than von Kármán’s: I propose it for analysis.
In short, the contents of this paper are the following. In Section 2, the von
Kármán equations are recalled and the associated bifurcation problem is stated,
both in the standard formulation and in the formulation introduced in the cited work
by Berger. Next, in Section 3, a three-dimensional buckling problem is described,
for an internally constrained, homogeneous, three-dimensional plate-like body of
appropriate elastic response.
The internal constraint we stipulate to hold is an exact nonlinear version of
the linear constraint that allows for the classical plate theory of Kirchhoff–Love:
material fibers parallel to the plate’s axis stay straight, do not change their length,
and remain orthogonal to fibers orthogonal to the axis itself. We assume that the
material comprising the body is a St. Venant–Kirchhoff elastic material being transversely isotropic with respect to the axial direction and capable only of deformations that agree with the constraint. We let the body be weakly clamped along its
lateral boundary, in the sense that sliding parallel to the plate’s plane is allowed
(Figure 2). Moreover, we let the body be in equilibrium under loads that are everywhere null, except for a uniform in-plane pressure of magnitude λ over all of the
lateral boundary. For sufficiently large values of the parameter λ, we expect the
body to buckle. The purpose of von Kármán’s theory is to approximately determine
the critical values of the buckling parameter, as well as the accompanying buckled
⋆ An account of Davet’s refinement of Ciarlet’s work is given in Section XIV.14 of [1].
⋆⋆ Not surprisingly, this is precisely the scaling exponent one obtains by thickness integration of the
stored energy density (3.27) with the strain field evaluated as in (B.4). I am indebted to Sergio Conti
for kindly bringing this work to my attention on receiving a copy of the version of this manuscript I
had just submitted for publication.
shapes, without solving a full three-dimensional problem of nonlinear elasticity,
whose global bifurcation analysis would be very difficult.⋆
In Section 4, it is shown that von Kármán’s first equation follows from a compatibility condition, which is an exact two-dimensional consequence of the classical St. Venant–Beltrami compatibility conditions. This compatibility condition
ensures, roughly speaking, that a suitably defined plane strain field allows for the
construction of a displacement field consistent with the internal constraint in force.
In Section 5, it is shown, among other things, that von Kármán’s second equation
follows from an equilibrium condition, which is an exact two-dimensional consequence of stress balance, both at interior points of the body and at its upper
and lower faces. In both sections, the passage from three to two dimensions is
performed by one mathematical tool, thickness integration. Finally, in Section 6,
the bifurcation problem with respect to which von Kármán’s is given a position is
formulated. The Appendix, in four parts, is meant to save the reader some ink.
2. The von Kármán Equations
For o a fixed point in the three-dimensional Euclidean space E, let {o; c1 , c2 ,
c3 ≡ z} be an orthonormal Cartesian frame, and let (x1 , x2 , x3 ≡ ζ ) be the
Cartesian coordinates of a point p = x + ζ z of E, with xα the coordinates of
the point x in the plane ζ = 0; moreover, let P be a simply-connected domain in
that plane, with a smooth boundary ∂P , and let n(x) be the outer normal at a point
x ∈ ∂P (Figure 1).
Given the scalar-valued fields ϕ0 and ϕ1 over ∂P , von Kármán’s bifurcation
problem consists in finding a real number λ and a pair (ϕ, w) of fields over P ∪ ∂P
such that:
– in P ,
\[
\Delta\Delta\varphi - \tfrac{1}{2}[w, w] = 0,
\tag{2.1}
\]
\[
\kappa\,\Delta\Delta w - [\varphi, w] = 0;
\tag{2.2}
\]
– in ∂P ,
\[
\varphi = \lambda\varphi_0, \qquad \partial_n\varphi = \lambda\varphi_1,
\tag{2.3}
\]
\[
w = 0, \qquad \partial_n w = 0.
\tag{2.4}
\]
Figure 1.
⋆ A study of certain bifurcation problems in nonlinear three-dimensional elasticity is found in a
recent paper by Healey and Simpson [14].
Here,
\[
[a, b] := a_{,11}\,b_{,22} + a_{,22}\,b_{,11} - 2\,a_{,12}\,b_{,12},
\qquad
(\cdot)_{,\alpha} := \frac{\partial(\cdot)}{\partial x_\alpha},
\tag{2.5}
\]
is the Monge–Ampère differential “crochet”; the field ϕ is an Airy-type stress function; w(x)z is interpreted as the transverse displacement of a point x of
P ∪ ∂P ; κ > 0 is a stiffness constant bearing the same physical dimensions
as ϕ; and λ is a scalar multiplier specifying the magnitude of a given distribution
of in-plane, compressive loads orthogonal to ∂P .
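The bracket (2.5) is easy to evaluate numerically. The following sketch approximates the second derivatives by central finite differences, a discretization chosen here only for illustration; the function names and the step size h are assumptions, not part of the paper.

```python
def second_derivatives(f, x1, x2, h=1e-3):
    """Central-difference approximations of f,11, f,22 and f,12 at (x1, x2)."""
    f11 = (f(x1 + h, x2) - 2.0 * f(x1, x2) + f(x1 - h, x2)) / h**2
    f22 = (f(x1, x2 + h) - 2.0 * f(x1, x2) + f(x1, x2 - h)) / h**2
    f12 = (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
           - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4.0 * h**2)
    return f11, f22, f12

def bracket(a, b, x1, x2, h=1e-3):
    """Monge-Ampere bracket [a, b] of (2.5), evaluated at the point (x1, x2)."""
    a11, a22, a12 = second_derivatives(a, x1, x2, h)
    b11, b22, b12 = second_derivatives(b, x1, x2, h)
    return a11 * b22 + a22 * b11 - 2.0 * a12 * b12
```

The bracket is symmetric and bilinear, and [w, w] equals twice the determinant of the Hessian of w. For w = x1² + 3 x1 x2 + 2 x2², for instance, the Hessian has determinant 2·4 − 3² = −1, so [w, w] = −2 at every point.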
That von Kármán’s is a nonlinear eigenvalue problem is made more evident by
a change in format [2].
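Let ϑ be a biharmonic field over P that satisfies the boundary conditions
\[
\vartheta = \varphi_0, \qquad \partial_n\vartheta = \varphi_1 \quad\text{in } \partial P.
\tag{2.6}
\]
Set
\[
\psi := \varphi - \lambda\vartheta,
\]
and denote by ψ(w) the unique solution, for each given field w over P , of the
boundary-value problem
\[
\Delta\Delta\psi - \tfrac{1}{2}[w, w] = 0 \quad\text{in } P,
\tag{2.7}
\]
\[
\psi = 0, \qquad \partial_n\psi = 0 \quad\text{in } \partial P.
\tag{2.8}
\]
Set
\[
A(w) := \kappa\,\Delta\Delta w - [\psi(w), w], \qquad B(w) := [\vartheta, w],
\tag{2.9}
\]
and call (λ, w) a proper pair if it solves the problem
\[
C(\lambda, w) := A(w) - \lambda B(w) = 0,
\tag{2.10}
\]
subject to the boundary conditions (2.4). Then, the von Kármán problem (2.1)–(2.4)
may be reformulated as follows:
\[
\text{Study the mapping } \lambda \mapsto W_\lambda := \{\, w \mid C(\lambda, w) = 0 \,\}.
\tag{2.11}
\]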
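The equivalence of the two formulations can be made explicit in a few lines. The sketch below assumes that the principal part in (2.1), (2.2), (2.7) and (2.9) is the biharmonic operator ΔΔ, as the Introduction indicates, and uses only the bilinearity of the bracket (2.5).

```latex
\begin{aligned}
\Delta\Delta\psi &= \Delta\Delta\varphi - \lambda\,\Delta\Delta\vartheta = \Delta\Delta\varphi
  && \text{($\vartheta$ biharmonic), so (2.1) becomes (2.7);}\\
\psi &= \lambda\varphi_0 - \lambda\varphi_0 = 0, \quad
\partial_n\psi = \lambda\varphi_1 - \lambda\varphi_1 = 0
  && \text{in } \partial P \text{, by (2.3) and (2.6), which is (2.8);}\\
C(\lambda, w) &= \kappa\,\Delta\Delta w - [\psi(w), w] - \lambda[\vartheta, w]
  = \kappa\,\Delta\Delta w - [\psi(w) + \lambda\vartheta,\, w]\\
  &= \kappa\,\Delta\Delta w - [\varphi, w],
  && \text{so (2.10) is (2.2).}
\end{aligned}
```

In this format the stress function has been eliminated, and λ appears linearly, multiplying the linear operator B, which is what makes the eigenvalue character of the problem evident.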
This problem has been repeatedly looked at by engineers and mathematicians:
by the former, in the first formulation; by the latter, in the formulation (2.11),
because it provides a nontrivial instance where the abstract techniques developed
by Crandall and Rabinowitz [7] and Rabinowitz [18] apply.
3. Three-Dimensional Plate Buckling
In this section, with a view toward giving the two-dimensional bifurcation problem
(2.1)–(2.4) the status of a rational approximation, we formulate a buckling problem for an internally constrained, three-dimensional plate-like body of appropriate elastic response. We use standard notation in continuum mechanics, and leave
smoothness assumptions tacit.
Figure 2.
Consider a continuous body occupying a three-dimensional plate-like region
of cross section P and thickness 2ε, with 2ε ≪ diam(P ), that is to say, a right
cylinder C(ε) of axis z, that we identify pointwise with the set P × ] − ε, +ε[. Let
C(ε) be weakly clamped along its lateral boundary M(ε) ≡ ∂P × ] − ε, +ε[, in
the sense that sliding parallel to the plane ζ = 0 is allowed (Figure 2).
Moreover, let C(ε) be in equilibrium under null body loads, null tractions over
the upper and lower faces P ± = P × {±ε}, and uniform in-plane dead pressure
of magnitude λ over all of M(ε). For sufficiently large values of the real parameter
λ, we expect the plate C(ε) to buckle, that is, to admit equilibrium solutions with
the deformed shape of C(ε) different from a right cylinder of axis z with flat cross
sections.
We specify what restricted class of motions we choose for C(ε) in the upcoming
subsection; next, we assign to C(ε) a type of elastic response compatible with
such constrained kinematics (Section 3.2); finally, we express the place boundary conditions in terms of the functions that parametrize the admissible motions, as
well as the traction boundary conditions in terms of the admissible active and
reactive stresses (Section 3.3). We need not lay down here the governing field
equations; their consequences relevant to finding a two-dimensional problem which
is an exact antecedent of von Kármán’s bifurcation problem will be discussed
in Sections 4 and 5.
3.1. CONSTRAINED KINEMATICS
For p ↦ f (p) a deformation of C(ε) and for F = ∇f the deformation gradient,
we choose the strain measure
\[
\mathbf{D} = \tfrac{1}{2}\left(\mathbf{F}^T\mathbf{F} - \mathbf{1}\right).
\tag{3.1}
\]
Moreover, for
\[
\mathbf{u}(p) = f(p) - p
\]
the displacement vector field associated with the deformation f and for U = ∇u
the displacement gradient, we note that
\[
\mathbf{D} = \mathbf{E} + \tfrac{1}{2}\,\mathbf{U}^T\mathbf{U},
\tag{3.2}
\]
where
\[
\mathbf{E} = \tfrac{1}{2}\left(\mathbf{U} + \mathbf{U}^T\right) =: \operatorname{sym}\mathbf{U}
\tag{3.3}
\]
is the linearized strain measure.
As the first step of our derivation of the von Kármán equations, we stipulate that
all the deformations of C(ε) satisfy the constraint condition
\[
\mathbf{D}\mathbf{z} = \mathbf{0} \quad\text{in } C(\varepsilon).
\tag{3.4}
\]
This condition is the nonlinear counterpart of the condition that characterizes the
kinematics of Kirchhoff–Love plates, namely,
Ez = 0 in C(ε).
(3.5)
Conditions (3.4) and (3.5) impose – the former exactly, the latter in the sense of the
classic approximation that regards
\[
\sup_{C(\varepsilon)} |\nabla\mathbf{u}|
\]
as small – that in any deformation material fibers parallel to z stay straight, do not
change their length, and remain orthogonal to fibers orthogonal to z.
The system (3.5) of linear PDEs has the well-known solution
\[
\mathbf{u}(x, \zeta) = \mathbf{v}(x) + w(x)\,\mathbf{z} - \zeta\,\nabla w(x),
\qquad \mathbf{v}(x)\cdot\mathbf{z} = 0,
\tag{3.6}
\]
parametrized by the in-plane displacement field v(x) and the transverse deflection
w(x).⋆ It is not difficult to solve the nonlinear system (3.4) as well: the admissible
deformations have the form
\[
f(x, \zeta) = g(x) + \zeta\,\mathbf{m}(x),
\tag{3.7}
\]
parametrized by the mapping g ≡ f |P delivering the deformed image f (P ) of
the cross section P of cylinder C(ε). Indeed, m(x), the unit normal to f (P ) at the
point g(x), is completely determined by g:
\[
\mathbf{m}(x) := \operatorname{uni}\left(\partial_{c_1} f \times \partial_{c_2} f\right)\big|_{(x,0)}
 = \operatorname{uni}\left(g_{,1} \times g_{,2}\right)\big|_{x}
\tag{3.8}
\]
(here $\partial_{c_\alpha} f = \mathbf{F}\mathbf{c}_\alpha$ denotes the derivative of f in the direction cα , cα · z = 0, and
uni(a) := |a|⁻¹ a for each vector a ≠ 0).
⋆ We use the same symbol for fields such as u here and f below, regardless of whether we regard them
as functions of the point p ∈ C(ε), of the corresponding pair (x, ζ ) ∈ P ×] − ε, +ε[, or of the
corresponding triplet of coordinates (x1 , x2 , ζ ).
To see that (3.7) implies (3.4) is a matter of straightforward computation.
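That computation can be spelled out as follows; the sketch uses only the definition (3.1) and the relations (3.7) and (3.8), with a comma denoting partial differentiation with respect to xα.

```latex
\begin{aligned}
&\mathbf{F}\mathbf{c}_\alpha = g_{,\alpha} + \zeta\,\mathbf{m}_{,\alpha},
 \qquad \mathbf{F}\mathbf{z} = \mathbf{m}
 && \text{(differentiating (3.7));}\\
&\mathbf{m}\cdot g_{,\alpha} = 0,
 \qquad \mathbf{m}\cdot\mathbf{m} = 1 \;\Rightarrow\; \mathbf{m}\cdot\mathbf{m}_{,\alpha} = 0
 && \text{(by (3.8));}\\
&\mathbf{F}^T\mathbf{F}\mathbf{z}\cdot\mathbf{c}_\alpha
 = \mathbf{m}\cdot\left(g_{,\alpha} + \zeta\,\mathbf{m}_{,\alpha}\right) = 0,
 \qquad
 \mathbf{F}^T\mathbf{F}\mathbf{z}\cdot\mathbf{z} = \mathbf{m}\cdot\mathbf{m} = 1;\\
&\text{hence } \mathbf{F}^T\mathbf{F}\mathbf{z} = \mathbf{z}
 \text{ and } \mathbf{D}\mathbf{z} = \tfrac{1}{2}\left(\mathbf{F}^T\mathbf{F}\mathbf{z} - \mathbf{z}\right) = \mathbf{0}.
\end{aligned}
```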
We prove the converse implication in a manner different from Naghdi and Nordgren [15], the first who noted that (3.4) and (3.7) are equivalent. With the use of
definition (3.1), we write the constraint condition (3.4) as
Fz = F−T z in C(ε),
(3.9)
and we observe that
f,ζ (x, ζ ) = ∂z f (x, ζ ) = F(x, ζ )z.
(3.10)
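It follows from (3.9) that
\[
|\mathbf{F}\mathbf{z}| = |\mathbf{F}^{-T}\mathbf{z}| = 1,
\]
whence, with the use of (3.8)1 ,
\[
\mathbf{m}(x) = \mathbf{F}(x, 0)\,\mathbf{z} = f_{,\zeta}(x, 0).
\tag{3.11}
\]
It also follows from (3.9) that
\[
(\mathbf{F}\mathbf{z})_{,\zeta}\cdot\mathbf{F}\mathbf{z} = 0
\quad\text{and}\quad
(\mathbf{F}\mathbf{z})_{,\zeta}\cdot\mathbf{F}\mathbf{c}_\alpha = 0, \qquad \alpha = 1, 2,
\]
whence, the vectors Fc1 , Fc2 and Fz being linearly independent,
\[
(\mathbf{F}\mathbf{z})_{,\zeta} = \mathbf{0}.
\tag{3.12}
\]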
Combining (3.10)–(3.12) we have that
f,ζ (x, ζ ) = f,ζ (x, 0) = m(x);
relation (3.7) then follows by integration, on setting
f (x, 0) = g(x).
(3.13)
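The direct implication, from (3.7) to (3.9), can also be checked numerically at a single point. The following Python sketch is only an illustration: the helper names and the sample data are assumptions, not part of the paper. It relies on the standard identity F^{-T}z = (Fc1 × Fc2)/det F, and it takes the derivatives m,α as arbitrary combinations of g,1 and g,2, so that they are orthogonal to m, as the derivatives of a unit vector field must be.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x/n for x in a)

def constraint_residual(g1, g2, c, zeta):
    """Return |Fz - F^{-T}z| at one point for f = g + zeta*m.

    g1, g2 : tangent vectors g,1 and g,2 (hypothetical sample data)
    c      : 2x2 coefficients, m,alpha = c[alpha][0]*g1 + c[alpha][1]*g2
    """
    m = unit(cross(g1, g2))                                   # (3.8)
    m1 = tuple(c[0][0]*p + c[0][1]*q for p, q in zip(g1, g2))
    m2 = tuple(c[1][0]*p + c[1][1]*q for p, q in zip(g1, g2))
    h1 = tuple(p + zeta*q for p, q in zip(g1, m1))            # F c1
    h2 = tuple(p + zeta*q for p, q in zip(g2, m2))            # F c2
    n = cross(h1, h2)
    det_f = dot(n, m)                                         # det F
    f_inv_t_z = tuple(x/det_f for x in n)                     # F^{-T} z
    return math.sqrt(sum((p - q)**2 for p, q in zip(m, f_inv_t_z)))

residual = constraint_residual((1.1, 0.2, 0.3), (-0.1, 0.9, -0.4),
                               ((0.5, -0.2), (0.1, 0.3)), 0.05)
```

For the sample data the residual vanishes up to rounding, since Fz = m and Fc1 × Fc2 is parallel to m.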
REMARKS. 1. To compare relations (3.6) and (3.7), we write the latter as
u(x, ζ ) = v(x) + w(x)z + ζ(m − z),
(3.14)
v(x) + w(x)z = g(x) − x,
(3.15)
with
v(x) · z = 0.
In Cartesian components (3.6) reads
uα (x, ζ ) = vα (x) − ζ w,α (x),
u3 (x, ζ ) = w(x),
(3.16)
while (3.14) is
uα (x, ζ ) = vα (x) + ζ mα (x),
u3 (x, ζ ) = w(x) + ζ(m3 (x) − 1)
(3.17)
(here uα := u · cα , u3 := u · z). Thus, the Kirchhoff–Love kinematics is recovered
whenever
mα ≃ −w,α and m3 ≃ 1.
(3.18)
2. In the standard linear theory of plates, which cannot treat bifurcations and
concentrates on determining transverse deflections, the fields v(x) and w(x) are
typically determined separately, an exceptional situation in the exact, nonlinear
theory we deal with.
3. Von Kármán’s theory of plate buckling ignores in-plane displacements,
just as Euler’s theory of rod buckling does with axial displacements.
3.2. CONSTRAINED ELASTIC RESPONSE
As is standard doctrine in continuum mechanics (cf., e.g., [20, Section 30; 13; 1,
Section XII.12]), the kinematical constraint (3.7) should be maintained by powerless reactive stresses. In terms of the Cosserat stress S and the strain rate Ḋ, the
power expended per unit referential volume has the expression
π(S, Ḋ) := S · Ḋ.
(3.19)
We split S into mutually orthogonal active and reactive parts:
S = S(A) + S(R) ,
S(A) · S(R) = 0,
(3.20)
and assume that, at each point of C(ε), S(R) satisfies
π(S(R), Ḋ) = 0
(3.21)
for all admissible strain rates Ḋ, that is, those obeying
Ḋz = 0,
(3.22)
a restriction that follows directly from (3.7). Since (3.22) may be equivalently
written as
Ḋ · sym(z ⊗ a) = 0,
(3.23)
with a an arbitrary vector, the Cosserat reactive stress must then have the form
S(R) = z ⊗ d + d ⊗ z + δ z ⊗ z,
d · z = 0, δ ∈ R.
(3.24)
In other words, a reactive stress field over C(ε) takes its values in the three-dimensional subspace
R := span {sym(cα ⊗ z), z ⊗ z}
(3.25)
of the space Sym of symmetric, 3 × 3 tensors; by (3.7) and (3.20)2 , respectively,
both the admissible strain fields and the active stress fields are plane fields, in the
sense that they take values in A := R⊥ , the orthogonal complement of R in Sym.
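The orthogonality between reactive stresses of the form (3.24) and admissible strain rates obeying (3.22) can be illustrated with a few lines of Python. This is a sketch under the assumption that z is the third Cartesian base vector; the function names and the numerical values are hypothetical.

```python
def inner(a, b):
    """Tensor inner product A . B = A_ij B_ij for 3x3 nested lists."""
    return sum(a[i][j]*b[i][j] for i in range(3) for j in range(3))

def reactive_stress(d1, d2, delta):
    """S^(R) = z(x)d + d(x)z + delta z(x)z with d.z = 0, as in (3.24)."""
    return [[0.0, 0.0, d1],
            [0.0, 0.0, d2],
            [d1,  d2,  delta]]

def plane_strain_rate(a, b, c):
    """A symmetric tensor with Dz = 0, as required by (3.22)."""
    return [[a,   b,   0.0],
            [b,   c,   0.0],
            [0.0, 0.0, 0.0]]

s_r = reactive_stress(0.7, -1.3, 2.5)
d_rate = plane_strain_rate(0.2, 0.4, -0.9)
power = inner(s_r, d_rate)   # the reactive power (3.21)
```

The three free parameters (d1, d2, delta) reflect the fact that the subspace R in (3.25) is three-dimensional.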
Our specification of a stress response compatible with the internal constraint (3.7)
is completed by assuming that the active part of the stress depends linearly on the
strain measure, in the form
\[ \mathbf{S}^{(A)} = \frac{E}{1+\nu}\Big(\mathbf{D} + \frac{\nu}{1-\nu}(\operatorname{tr}\mathbf{D})\,\mathbf{1}\Big); \tag{3.26} \]
here
tr D = 1 · D = D11 + D22 ,
with 1 the identity in A. With (3.26) and (3.21), (3.19) becomes
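\[ \pi(\mathbf{S},\dot{\mathbf{D}}) = \dot\sigma(\mathbf{D}), \qquad \sigma(\mathbf{D}) := \frac{E}{2(1+\nu)}\Big(|\mathbf{D}|^2 + \frac{\nu}{1-\nu}(\operatorname{tr}\mathbf{D})^2\Big); \tag{3.27} \]
the inequalities
\[ E > 0, \qquad \nu \in\, ]-1,+1[ \tag{3.28} \]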
guarantee strict positivity of the stored energy σ (D),⋆ as well as invertibility of the
linear transformation (3.26) of A into itself:
\[ \mathbf{D} = \frac{1+\nu}{E}\Big(\mathbf{S}^{(A)} - \frac{\nu}{1+\nu}(\operatorname{tr}\mathbf{S}^{(A)})\,\mathbf{1}\Big). \tag{3.29} \]
All in all, the material comprising the plate-like region C(ε) is a homogeneous
St. Venant–Kirchhoff elastic material [20, Section 94] being transversely isotropic
with respect to the direction z and capable only of deformations that agree with the
constraint (3.4).
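As a quick check of the constitutive relations, the following Python sketch evaluates (3.26), its inverse (3.29), and the stored energy (3.27) on plane tensors represented as 2 × 2 nested lists. The function names and the sample moduli are assumptions made for illustration only.

```python
def trace(t):
    return t[0][0] + t[1][1]

def active_stress(D, E, nu):
    """(3.26): S = E/(1+nu) * (D + nu/(1-nu) * tr(D) * 1)."""
    k = nu/(1.0 - nu)*trace(D)
    return [[E/(1.0 + nu)*(D[i][j] + (k if i == j else 0.0))
             for j in range(2)] for i in range(2)]

def strain_from_stress(S, E, nu):
    """(3.29): D = (1+nu)/E * (S - nu/(1+nu) * tr(S) * 1)."""
    k = nu/(1.0 + nu)*trace(S)
    return [[(1.0 + nu)/E*(S[i][j] - (k if i == j else 0.0))
             for j in range(2)] for i in range(2)]

def stored_energy(D, E, nu):
    """(3.27): sigma(D) = E/(2(1+nu)) * (|D|^2 + nu/(1-nu) * tr(D)^2)."""
    norm2 = sum(D[i][j]**2 for i in range(2) for j in range(2))
    return E/(2.0*(1.0 + nu))*(norm2 + nu/(1.0 - nu)*trace(D)**2)

E_mod, nu = 210.0, 0.3                      # hypothetical moduli
D = [[0.01, -0.02], [-0.02, 0.03]]          # hypothetical plane strain
S = active_stress(D, E_mod, nu)
D_back = strain_from_stress(S, E_mod, nu)   # should reproduce D
sigma = stored_energy(D, E_mod, nu)
```

Since σ is quadratic in D and S(A) = ∂σ/∂D, the inner product S(A) · D equals 2σ(D), which gives a second consistency check.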
3.3. BOUNDARY CONDITIONS
We are now in a position to specify mathematically – for a continuous body having
the referential shape C(ε), the motion class (3.7), and the mechanical response
described by (3.20), (3.24) and (3.26) – the boundary conditions we described in
words in the beginning of this section.
As is customary in the nonlinear mechanics of solids, we express the traction
conditions in terms of the Piola stress P, which relates as follows to Cosserat stress
measure S:
P = FS.
(3.30)
We require that
Pz = 0
in P + ∪ P − ;
(3.31)
⋆ Here E and ν are material moduli resembling, respectively, the well-known Young and Poisson moduli that characterize a linearly elastic, unconstrained, isotropic material; their operational
definition requires some care (see [17, Section 19]). Note also that S(A) = ∂σ /∂D.
and that
Pn = −λn in M(ε).
(3.32)
Moreover, with the use of the representation (3.14) for the admissible deformations,
we express the weak-clamping condition in the form
w = 0 and m = z in ∂P .
(3.33)
We now analyze the boundary conditions (3.31)–(3.33) a bit more closely.
Firstly, we observe that – with (3.30), (3.20) and (3.24) – condition (3.31) can
be given the form
d = 0 and δ = 0 in P + ∪ P − .
(3.34)
Secondly, we observe that, on the lateral boundary, the Piola traction vector Pn
splits into two vectors, the one reactive, the other constitutively specified:
Pn = FS(R) n + FS(A) n = (d · n) z + FS(A) n;
and that, due to (3.33)2 and (3.9),
z = m = Fz = F−T z.
Consequently, in view also of (3.26),
z · FS(A) n = 0,
and condition (3.32) can be given the form
d · n = 0 and S(A) n = −λF−1 n in M(ε).
(3.35)
Note that the Cosserat traction vector S(A) n must be orthogonal to z, but not necessarily parallel to n (unless the closed curve ∂P has some special global symmetry,
say, it is a circle, or a rectangle).
Thirdly, we observe that, due to (3.8), condition (3.33)2 is equivalent to
(g,1 ×g,2 ) · cα = 0,
α = 1, 2.
(3.36)
Since
g,α = cα + v,α +w,α z,
(3.37)
an easy computation shows that (3.36) can be written as
−(1 + v2 ,2 ) w,1 + v2 ,1 w,2 = 0,
v1 ,2 w,1 − (1 + v1 ,1 ) w,2 = 0.
(3.38)
Thus, since the standard requirement that deformation preserves local orientation
(i.e., det F(x, ζ ) > 0 in C(ε)) implies that
(1 + v1 ,1 )(1 + v2 ,2 ) − v1 ,2 v2 ,1 > 0
in ∂P ,
the gradient of w must be null along the curve ∂P . This fact, together with (3.33)1 ,
is enough to conclude that (3.33) can be equivalently written in the form
w = 0, ∂n w = 0 on ∂P ,
(2.4)
that is to say, precisely the Dirichlet-type boundary conditions in von Kármán’s
problem.
4. Compatibility and von Kármán’s First Equation
4.1. A PREPARATORY RESULT
We begin by establishing an easy consequence of the St. Venant–Beltrami compatibility conditions. As is well known (see [12, Section 14]), those conditions
characterize as follows the solvability of the linear system
\[ \tfrac12(\nabla\mathbf{u} + \nabla\mathbf{u}^T) = \mathbf{E} \tag{4.1} \]
for the vector field u associated to a given symmetric-valued tensor field E: for a
simply-connected body, there is a solution to (4.1) if and only if the datum satisfies
\[ e_{ijk}\,e_{lmn}\,E_{jm,kn} = 0, \qquad i, l = 1, 2, 3, \tag{4.2} \]
(here $e_{ijk}$ is the Ricci alternator and $E_{jm} := \mathbf{c}_j\cdot\mathbf{E}\mathbf{c}_m$). If, in particular, E is plane,
that is, in the present circumstances, if
Ez = 0,
(4.3)
then conditions (4.2) are met if and only if E has the form
E(x, ζ ) = E(0) (x) + ζ E(1)(x),
(4.4)
with E(0) and E(1) (plane and) such as to satisfy, respectively,
\[ 2E^{(0)}_{12,12} - E^{(0)}_{11,22} - E^{(0)}_{22,11} = 0, \tag{4.5} \]
and
\[ E^{(1)}_{11,2} - E^{(1)}_{12,1} = 0, \qquad E^{(1)}_{22,1} - E^{(1)}_{12,2} = 0. \tag{4.6} \]
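The preparatory result we need amounts to noting that, for a plane strain field
E(x, ζ ) to be compatible, its thickness average
\[ \overline{\mathbf{E}}(x) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\mathbf{E}(x,\zeta)\,d\zeta \tag{4.7} \]
must satisfy (4.5), that is,
\[ 2\overline{E}_{12,12} - \overline{E}_{11,22} - \overline{E}_{22,11} = 0. \tag{4.8} \]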
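The differential operator on the left side of (4.8) can be probed with nested central differences. The Python sketch below is an illustration only: the step size, the sample displacement, and all function names are assumptions. It evaluates the operator on a plane strain obtained as a symmetrized gradient, and on a field that is not of that kind.

```python
import math

H = 1e-2  # finite-difference step, an arbitrary choice for this sketch

def d(f, i):
    """Central difference of f(x1, x2) along coordinate i (0 or 1)."""
    def df(x1, x2):
        if i == 0:
            return (f(x1 + H, x2) - f(x1 - H, x2))/(2*H)
        return (f(x1, x2 + H) - f(x1, x2 - H))/(2*H)
    return df

def compatibility_residual(e11, e12, e22, x1, x2):
    """Left side of (4.8): 2 E12,12 - E11,22 - E22,11 at (x1, x2)."""
    return (2*d(d(e12, 0), 1)(x1, x2)
            - d(d(e11, 1), 1)(x1, x2)
            - d(d(e22, 0), 0)(x1, x2))

# Hypothetical in-plane displacement components.
def v1(x1, x2):
    return math.sin(x1)*x2**2

def v2(x1, x2):
    return x1**3*math.cos(x2)

e11 = d(v1, 0)
e22 = d(v2, 1)

def e12(x1, x2):
    return 0.5*(d(v1, 1)(x1, x2) + d(v2, 0)(x1, x2))

compatible = compatibility_residual(e11, e12, e22, 0.3, 0.7)

# A field that is not a symmetrized gradient: E11 = x2^2, E12 = E22 = 0.
incompatible = compatibility_residual(lambda x1, x2: x2**2,
                                      lambda x1, x2: 0.0,
                                      lambda x1, x2: 0.0, 0.3, 0.7)
```

Because the difference operators commute, the first residual vanishes up to rounding for any choice of the displacement; the second one does not vanish.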
Our plan is to show that von Kármán’s first equation follows from the compatibility condition (4.8), when applied to a suitable plane strain field.
4.2. AN EXACT ANTECEDENT OF VON KÁRMÁN ’ S FIRST EQUATION
Taking into account (3.1) and (3.29), and denoting by P the orthogonal projection
of Sym onto its subspace A, we first introduce the mapping
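\[ \mathbf{E} = \widehat{\mathbf{E}}(\mathbf{S},\mathbf{H}) := \frac{1+\nu}{E}\Big(\mathbf{S} - \frac{\nu}{1+\nu}(\operatorname{tr}\mathbf{S})\,\mathbf{1}\Big) - \frac12\,P[\mathbf{H}^T\mathbf{H}], \tag{4.9} \]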
which associates a tensor E ∈ A with each pair consisting of a tensor S ∈ A and
an arbitrary tensor H. The plane strain field we look for obtains by composition of
E with fields H(x, ζ ) and S(x, ζ ) belonging, respectively, to the collections H and
S we now describe.
H is the collection of all tensor-valued fields over C(ε) that are gradients of
vector fields of the form (3.14).
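The collection S consists of all those plane tensor fields S(x, ζ ) over C(ε) whose
thickness average $\overline{\mathbf{S}}(x)$, a single-valued, plane tensor field over P ,
(i) is divergenceless:
\[ \overline{S}_{\alpha\beta,\beta} = 0, \qquad \alpha = 1, 2; \tag{4.10} \]
(ii) satisfies the thickness average of the boundary condition (3.35)2 :
\[ \overline{\mathbf{S}}\mathbf{n} = -\lambda\,\bar{\mathbf{s}}, \qquad \bar{\mathbf{s}} := \Big(\frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\mathbf{F}^{-1}(x,\zeta)\,d\zeta\Big)\mathbf{n} = \overline{\mathbf{F}^{-1}}\,\mathbf{n} \quad\text{in } \partial P. \tag{4.11} \]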
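We note that $\overline{\mathbf{S}}(x)$ admits a single-valued Airy representation in terms of a scalar
field ϕ(x) over P :
\[ \overline{\mathbf{S}} = E\,\mathbf{R}(\nabla\nabla\varphi)\mathbf{R}^T, \qquad \mathbf{R} := -\mathbf{c}_1\otimes\mathbf{c}_2 + \mathbf{c}_2\otimes\mathbf{c}_1; \tag{4.12} \]
and that, with (4.12), the boundary condition (4.11) can be written in the form
\[ \varphi = \lambda\varphi_0, \qquad \partial_n\varphi = \lambda\varphi_1 \quad\text{in } \partial P, \tag{4.13} \]
where, for σ the arc-length parametrization of the boundary curve ∂P from a point
x(0) ∈ ∂P ,
\[ E\,\varphi_0(\sigma) := -\mathbf{R}\cdot\int_0^\sigma (\mathbf{x}(\tau)-\mathbf{x}(\sigma))\otimes\bar{\mathbf{s}}(\tau)\,d\tau, \tag{4.14} \]
\[ E\,\varphi_1(\sigma) := -\mathbf{R}\cdot\Big(\int_0^\sigma \bar{\mathbf{s}}(\tau)\,d\tau\Big)\otimes\mathbf{n}(\sigma). \tag{4.15} \]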
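The action of the rotation R in the Airy representation (4.12) is easy to make concrete. In the Python sketch below, Hessians are 2 × 2 nested lists; the function names and the sample values are hypothetical, and the bracket is assumed to have the usual form [a, b] = a,11 b,22 + a,22 b,11 − 2 a,12 b,12, which is how we read definition (2.5).

```python
R = [[0.0, -1.0],
     [1.0,  0.0]]   # components of R = -c1(x)c2 + c2(x)c1

def matmul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def airy_stress(hess_phi, E):
    """Averaged stress E R (grad grad phi) R^T of (4.12)."""
    rotated = matmul(matmul(R, hess_phi), transpose(R))
    return [[E*rotated[i][j] for j in range(2)] for i in range(2)]

def bracket(ha, hb):
    """Assumed form of the bracket [a, b], written with Hessian entries."""
    return ha[0][0]*hb[1][1] + ha[1][1]*hb[0][0] - 2.0*ha[0][1]*hb[0][1]

hess_phi = [[2.0, 0.5], [0.5, -1.0]]      # hypothetical Hessian of phi
hess_w = [[1.0, 2.0], [2.0, 3.0]]         # hypothetical Hessian of w
S_bar = airy_stress(hess_phi, 1.0)
contraction = sum(hess_w[i][j]*S_bar[i][j] for i in range(2) for j in range(2))
```

The rotation swaps the diagonal entries of the Hessian and changes the sign of the off-diagonal one, so that the averaged stress has components E ϕ,22, −E ϕ,12, and E ϕ,11; its divergence vanishes identically, in agreement with (4.10).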
REMARKS. 1. The implicit restriction that
ϕ(0) = 0
and
∂n ϕ(0) = 0
is immaterial; see [12, Section 47].
2. It is important to realize that the boundary fields ϕ0 and ϕ1 depend on the
restrictions to ∂P of the parameter fields w and v: such a dependence is entrained
by the presence, in the definitions (4.14) and (4.15), of the field s̄. To make this
point precise, note that, from (A.1) and the second of (3.33) we have that
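\[ \mathbf{F}^{-1}\mathbf{n} = (\mathbf{h}^\alpha\cdot\mathbf{n})\,\mathbf{c}_\alpha \quad\text{in } \partial P. \]
In particular, then,
\[ \bar{\mathbf{s}} = (\overline{\mathbf{h}^\alpha}\cdot\mathbf{n})\,\mathbf{c}_\alpha \quad\text{in } \partial P; \]
but (see Appendix A) the contravariant base vectors $\mathbf{h}^\alpha$ depend functionally just on
the fields w and v.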
Given (S, H) ∈ S × H, we use definition (4.9) to construct a plane field
\[ \mathbf{E}(x,\zeta) = \widehat{\mathbf{E}}(\mathbf{S}(x,\zeta),\mathbf{H}(x,\zeta)) \]
(in fact, a candidate strain field over the plate-like body C(ε) we consider), whose
thickness average $\overline{\mathbf{E}}$ is parametrized by three fields over P , namely, ϕ, v, and w.
After some computations, we find that $\overline{\mathbf{E}}$ obeys (4.8) if
\[ \Delta\Delta\varphi - \frac12[w,w] + \frac12[v_\alpha,v_\alpha] + \frac13\varepsilon^2[m_i,m_i] = 0. \tag{4.16} \]
For the reader’s convenience, let us repeat here the first of the von Kármán equations:
\[ \Delta\Delta\varphi - \frac12[w,w] = 0. \tag{2.1} \]
The comparison is striking. The differential relation (4.16) is an exact compatibility
restriction involving the fields w(x), v(x), and ϕ(x) over P . As is customary with
equations containing a small parameter, one investigates the dependence of the
solutions of (4.16) on ε. The obvious scaling
\[ w^\varepsilon = \varepsilon\,w, \qquad \mathbf{v}^\varepsilon = \varepsilon^2\,\mathbf{v}, \qquad \varphi^\varepsilon = \varepsilon^2\,\varphi, \tag{4.17} \]
permits us to conclude that (4.16) is an exact two-dimensional antecedent of the
first von Kármán equation (2.1).⋆
⋆ Note that the first two of (4.17) imply that $\mathbf{m}^\varepsilon = \mathbf{z} - \varepsilon\nabla w + o(\varepsilon)$,
whence $[m_i, m_i] = O(\varepsilon^2)$.
Likewise, conditions (4.13) are exact antecedents of the Neumann-type boundary conditions (2.3) in von Kármán’s problem, with which they formally coincide,
and to which they reduce when the scaling (4.17) is completed by taking
\[ \lambda^\varepsilon = \varepsilon^2\lambda.{}^{\star} \tag{4.18} \]
REMARK. In this paper, we adhere to a common practice and loosely take 2ε,
instead of 2εh, to measure the thickness of the plate-like region C(ε). Thus, whenever we regard ε as a dimensionless smallness parameter, as is done for the first
time in (4.17), we think of it as divided by h.
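The expansion of the unit normal under the scaling (4.17), namely m = z − ε∇w + o(ε), can be observed numerically. The Python sketch below builds g,1 and g,2 from (3.37) with scaled fields and compares the exact normal (3.8) with its linearization; the function names and the sample gradients are assumptions made for illustration.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def unit_normal(grad_w, grad_v, eps):
    """m = uni(g,1 x g,2) with w -> eps*w and v -> eps^2*v.

    grad_w = (w,1, w,2); grad_v[b][a] holds the derivative v_{b+1},{a+1}.
    """
    g1 = (1.0 + eps**2*grad_v[0][0], eps**2*grad_v[1][0], eps*grad_w[0])
    g2 = (eps**2*grad_v[0][1], 1.0 + eps**2*grad_v[1][1], eps*grad_w[1])
    n = cross(g1, g2)
    length = math.sqrt(sum(x*x for x in n))
    return tuple(x/length for x in n)

def linearization_error(grad_w, grad_v, eps):
    """Norm of m - (z - eps*grad w)."""
    m = unit_normal(grad_w, grad_v, eps)
    approx = (-eps*grad_w[0], -eps*grad_w[1], 1.0)
    return math.sqrt(sum((p - q)**2 for p, q in zip(m, approx)))

grad_w = (0.4, -0.3)                      # hypothetical values of w,1 and w,2
grad_v = ((0.2, 0.1), (-0.1, 0.3))        # hypothetical in-plane gradient
errors = [linearization_error(grad_w, grad_v, e) for e in (1e-1, 1e-2, 1e-3)]
```

Dividing each error by the corresponding ε gives a sequence that decreases with ε, as an o(ε) remainder should.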
5. Equilibrium and von Kármán’s Second Equation
At equilibrium, the Piola stress must satisfy
Div P = 0
at each point of C(ε), or rather, equivalently,
(Pcα ),α +(Pz),ζ = 0.
(5.1)
Our first concern is to give this equilibrium equation a form that makes the role of
the reactive stress explicit.
5.1. ACTIVE AND REACTIVE EQUILIBRIUM STRESSES
We begin by observing that, for
hα = f,α = g,α + ζ m,α ,
h3 = f,ζ = m
(5.2)
the covariant base vectors associated with the shape of C(ε) after a deformation of
the form (3.7), the deformation gradient has the representation
F = hα ⊗ cα + m ⊗ z
(5.3)
(cf. Appendix A). Because of this representation and the fact that S(A) ∈ A, we
find that
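\[ \mathbf{P}^{(A)}\mathbf{c}_\alpha = \mathbf{F}\mathbf{S}^{(A)}\mathbf{c}_\alpha = S^{(A)}_{\beta\alpha}\,\mathbf{h}_\beta, \qquad \mathbf{P}^{(A)}\mathbf{z} = \mathbf{0}, \tag{5.4} \]
where
\[ S^{(A)}_{\beta\alpha} := \mathbf{c}_\beta\cdot\mathbf{S}^{(A)}\mathbf{c}_\alpha = S^{(A)}_{\alpha\beta}. \tag{5.5} \]
⋆ Under the scaling (4.17)1,2 , $(\mathbf{h}^\alpha)^\varepsilon = \mathbf{c}_\alpha + o(\varepsilon)$ and $\bar{\mathbf{s}}^\varepsilon = \mathbf{n} + o(\varepsilon)$.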
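Moreover, again by (5.3) and by (3.30) and (3.24), we find that
\[ \mathbf{P}^{(R)} = d_\alpha\,\mathbf{h}_\alpha\otimes\mathbf{z} + \mathbf{m}\otimes\mathbf{d} + \delta\,\mathbf{m}\otimes\mathbf{z}, \qquad d_\alpha := \mathbf{d}\cdot\mathbf{c}_\alpha, \tag{5.6} \]
whence
\[ \mathbf{P}^{(R)}\mathbf{c}_\alpha = d_\alpha\,\mathbf{m}, \qquad \mathbf{P}^{(R)}\mathbf{z} = d_\alpha\,\mathbf{h}_\alpha + \delta\,\mathbf{m}, \tag{5.7} \]
so that the nonnull components of P(R) are
\[ P^{(R)}_{\gamma 3} := \mathbf{h}_\gamma\cdot\mathbf{P}^{(R)}\mathbf{z} = (\mathbf{h}_\gamma\cdot\mathbf{h}_\beta)\,d_\beta, \qquad P^{(R)}_{3\gamma} := \mathbf{m}\cdot\mathbf{P}^{(R)}\mathbf{c}_\gamma = d_\gamma, \qquad P^{(R)}_{33} := \mathbf{m}\cdot\mathbf{P}^{(R)}\mathbf{z} = \delta. \tag{5.8} \]
Note that
\[ P^{(R)}_{\gamma 3} = (\mathbf{h}_\gamma\cdot\mathbf{h}_\beta)\,P^{(R)}_{3\beta}, \qquad P^{(R)}_{3\gamma} = (\mathbf{h}^\gamma\cdot\mathbf{h}^\beta)\,P^{(R)}_{\beta 3}, \tag{5.9} \]
where $\{\mathbf{h}^1, \mathbf{h}^2, \mathbf{m}\}$ is the contravariant base associated with the covariant base in (5.2).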
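With (5.4) and (5.7), the equilibrium equation (5.1) takes the form
\[ (S^{(A)}_{\beta\alpha}\mathbf{h}_\beta + d_\alpha\mathbf{m})_{,\alpha} + (d_\beta\mathbf{h}_\beta + \delta\,\mathbf{m})_{,\zeta} = \mathbf{0}. \tag{5.10} \]
Differentiating and taking the inner products with the base vectors (5.2), we deduce
from (5.10) the following system of three scalar equations:
\[ (S^{(A)}_{\beta\alpha}\mathbf{h}_\beta)_{,\alpha}\cdot\mathbf{h}_\gamma + ((\mathbf{h}_\beta\cdot\mathbf{h}_\gamma)\,d_\beta)_{,\zeta} = 0, \qquad (\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha} + d_{\alpha,\alpha} + \delta_{,\zeta} = 0. \tag{5.11} \]
Finally, with the use of (5.8) and (5.9), we write (5.11) in the form
\[ (S^{(A)}_{\beta\alpha}\mathbf{h}_\beta)_{,\alpha}\cdot\mathbf{h}_\gamma + P^{(R)}_{\gamma 3,\zeta} = 0, \qquad (\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha} + ((\mathbf{h}^\alpha\cdot\mathbf{h}^\beta)\,P^{(R)}_{\beta 3})_{,\alpha} + P^{(R)}_{33,\zeta} = 0. \tag{5.12} \]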
Remarkably, equations (5.12) have the same structure exploited, in the linear
case, to give an exact derivation from three-dimensional elasticity of the classical Germain–Lagrange equation for thin plates [16]; we then manipulate these
equations in the same manner.
We proceed as follows: firstly, thickness integration of (5.12) allows us to determine the reactive stress field P(R)(x, ζ ) in C(ε) in terms of the equilibrium
deformation, that is, in terms of the parameter fields ϕ(x), v(x), and w(x) over P
at equilibrium; secondly, three pure (that is, reaction-free) and exact scalar consequences of (5.12) and (3.34) are found, namely, equations (5.16) and (5.18) in
the next subsection; this last equation, a nonlinear counterpart of the Germain–
Lagrange equation, serves as a two-dimensional antecedent of (2.2), the second
of the von Kármán equations; finally, by way of the scaling (4.17), the relationship
between (2.2) and its antecedent (5.18) is made precise.
5.2. AN EXACT ANTECEDENT OF VON KÁRMÁN ’ S SECOND EQUATION
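Integrating (5.12)1,2 with the use of the boundary condition (3.34)1 restricted to
P − , we find
\[ P^{(R)}_{\gamma 3}(x,\zeta) = -\int_{-\varepsilon}^{\zeta}(S^{(A)}_{\beta\alpha}\mathbf{h}_\beta)_{,\alpha}\cdot\mathbf{h}_\gamma\,d\chi.{}^{\star} \tag{5.13} \]
Moreover, with (5.13) and (3.34)2 restricted to P − , integration of (5.12)3 yields
\[ P^{(R)}_{33}(x,\zeta) = \int_{-\varepsilon}^{\zeta}\Big((\mathbf{h}^\alpha\cdot\mathbf{h}^\beta)\int_{-\varepsilon}^{\chi}(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\,d\tau\Big)_{,\alpha}d\chi - \int_{-\varepsilon}^{\zeta}(\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha}\,d\chi. \tag{5.14} \]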
Relations (5.13) and (5.14), together with the second of (5.9), allow us to construct
the reactive field in C(ε) whenever the deformation field is known.
Of the boundary conditions (3.34), those prevailing over P + remain to be satisfied. Two of them read:
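\[ P^{(R)}_{\gamma 3}(x,\varepsilon) = 0 \quad\text{in } P, \tag{5.15} \]
and, with the use of (5.13), can be written as
\[ \int_{-\varepsilon}^{+\varepsilon}(S^{(A)}_{\beta\alpha}\mathbf{h}_\beta)_{,\alpha}\cdot\mathbf{h}_\gamma\,d\zeta = 0 \quad\text{in } P. \tag{5.16} \]
The third reads:
\[ P^{(R)}_{33}(x,\varepsilon) = 0 \quad\text{in } P, \tag{5.17} \]
or rather, with (5.14),
\[ \int_{-\varepsilon}^{+\varepsilon}\Big((\mathbf{h}^\alpha\cdot\mathbf{h}^\beta)\int_{-\varepsilon}^{\zeta}(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\,d\tau\Big)_{,\alpha}d\zeta - \int_{-\varepsilon}^{+\varepsilon}(\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha}\,d\zeta = 0 \quad\text{in } P. \tag{5.18} \]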
We postpone to the next subsection our study of the two PDEs in the unknown
fields w and v to which conditions (5.16) reduce when the constitutive equation
(3.26) for the active stress is taken into account, as well as our study of the accompanying boundary conditions. We now show that, given the material response we
have chosen, condition (5.18) yields a fourth-order quasilinear PDE, which is an
exact antecedent of the second von Kármán equation (2.2).
⋆ Here and henceforth, for short, dχ signifies (x, χ) dχ.
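To see this, a rather lengthy computation is needed, in order to make explicit
the functional dependence of $\mathbf{S}^{(A)}$, $\mathbf{h}_i$, and $\mathbf{h}^\alpha$ on the unknown fields w, v and,
possibly, ϕ. We define
\[ I_1(\varepsilon; w, \mathbf{v}) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\Big((\mathbf{h}^\alpha\cdot\mathbf{h}^\beta)\int_{-\varepsilon}^{\zeta}(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\,d\tau\Big)_{,\alpha}d\zeta, \tag{5.19} \]
\[ I_2(\varepsilon; w, \mathbf{v}, \varphi) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}(\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha}\,d\zeta, \tag{5.20} \]
and write (5.18) in the form
\[ I_1(\varepsilon; w, \mathbf{v}) - I_2(\varepsilon; w, \mathbf{v}, \varphi) = 0 \quad\text{in } P. \tag{5.21} \]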
The dependence of I1 and I2 on the unknown fields is detailed in Appendix D.
In particular, we find that the first of these integrals can be given the form of a
quasilinear differential operator with principal part
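\[ \text{p.p.}(I_1(\varepsilon; w, \mathbf{v})) = \frac{E}{1-\nu^2}(\mathbf{m}\cdot\mathbf{z})\,B^{\alpha\delta}(\varepsilon; w, \mathbf{v})\,(\Delta w)_{,\delta\alpha}.{}^{\star} \tag{D.3} \]
We also find the following explicit representation for the second integral:
\[ I_2(\varepsilon; w, \mathbf{v}, \varphi) = E(\mathbf{m}\cdot\mathbf{z})[\varphi, w] + E(\mathbf{m}\cdot\mathbf{v}_{,\alpha\beta})(\mathbf{R}(\nabla\nabla\varphi)\mathbf{R}^T\cdot\mathbf{c}_\alpha\otimes\mathbf{c}_\beta) - \frac13\varepsilon^2(\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta})\,S^{(1)}_{\alpha\beta}. \tag{D.8$_2$} \]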
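Under the scaling (4.17), we find that
\[ I_1(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon) = \varepsilon^3 E\kappa\,\Delta\Delta w + o(\varepsilon^3), \qquad \kappa := \frac{1}{3(1-\nu^2)}\,h^2;{}^{\star\star} \tag{5.22} \]
as to the second integral, that
\[ I_2(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon, \varphi^\varepsilon) = \varepsilon^3 E[\varphi, w] + o(\varepsilon^3). \tag{5.23} \]
Hence,
\[ \lim_{\varepsilon\to 0}\frac{1}{\varepsilon^3}\big(I_1(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon) - I_2(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon, \varphi^\varepsilon)\big) = E(\kappa\,\Delta\Delta w - [\varphi, w]), \tag{5.24} \]
which is enough to establish the second von Kármán equation as the limit of the
equilibrium equation (5.18) when ε → 0.
⋆ Here,
\[ B^{\alpha\delta}(\varepsilon; w, \mathbf{v}) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\zeta\,A^{\alpha\beta}\,\mathbf{h}_\beta\cdot\mathbf{h}_\delta\,d\zeta, \tag{D.2} \]
where
\[ \mathbf{h}^\alpha\cdot\mathbf{h}^\beta = A^{\alpha\beta}{}_{,\zeta}. \tag{A.7} \]
⋆⋆ Recall the remark at the end of Section 4. The last formula provides an interpretation for the
dimensional constant κ in von Kármán’s second equation (2.2).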
5.3. TWO COMPLEMENTING PDES AND THE ASSOCIATED BOUNDARY
CONDITIONS
We begin by writing conditions (5.16) in a form that reflects the functional dependence of both S(A) and hα on the unknown fields w and v:
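\[ I_{3\gamma}(\varepsilon; w, \mathbf{v}) = 0, \qquad I_{3\gamma} := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}(S^{(A)}_{\beta\alpha}\mathbf{h}_\beta)_{,\alpha}\cdot\mathbf{h}_\gamma\,d\zeta.{}^{\star} \tag{5.25} \]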
We also note that
S(A) (x, ζ ) = S(0) (x) + ζ S(1)(x) + ζ 2 S(2)(x),
(C.2)
while (5.2) can be written as
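\[ \mathbf{h}_\alpha = \mathbf{h}^{(0)}_\alpha + \zeta\,\mathbf{h}^{(1)}_\alpha, \qquad \mathbf{h}^{(0)}_\alpha := \mathbf{g}_{,\alpha}, \quad \mathbf{h}^{(1)}_\alpha := \mathbf{m}_{,\alpha}. \tag{5.26} \]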
It is then the matter of an easy calculation to give (5.16) the following form:
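\[ I_{3\gamma} = (S^{(0)}_{\beta\alpha}\mathbf{h}^{(0)}_\beta)_{,\alpha}\cdot\mathbf{h}^{(0)}_\gamma + \frac13\varepsilon^2\Big[(S^{(0)}_{\beta\alpha}\mathbf{h}^{(1)}_\beta + S^{(1)}_{\beta\alpha}\mathbf{h}^{(0)}_\beta)_{,\alpha}\cdot\mathbf{h}^{(1)}_\gamma + (S^{(1)}_{\beta\alpha}\mathbf{h}^{(1)}_\beta + S^{(2)}_{\beta\alpha}\mathbf{h}^{(0)}_\beta)_{,\alpha}\cdot\mathbf{h}^{(0)}_\gamma\Big] + \frac15\varepsilon^4(S^{(2)}_{\beta\alpha}\mathbf{h}^{(1)}_\beta)_{,\alpha}\cdot\mathbf{h}^{(1)}_\gamma = 0 \quad\text{in } P. \tag{5.27} \]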
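The coefficients 1/3 and 1/5 that weight the thickness-dependent terms come from averaging powers of ζ over ]−ε, +ε[. A minimal Python sketch, with hypothetical function names, computes these averages exactly and applies them to a polynomial in ζ of the kind produced by inserting (C.2) and (5.26) into (5.16).

```python
from fractions import Fraction

def average_of_power(k, eps):
    """(1/(2 eps)) * integral of zeta**k over [-eps, +eps], computed exactly."""
    eps = Fraction(eps)
    return (eps**(k + 1) - (-eps)**(k + 1))/((k + 1)*2*eps)

def average_of_polynomial(coeffs, eps):
    """Thickness average of sum_k coeffs[k] * zeta**k."""
    return sum(c*average_of_power(k, eps) for k, c in enumerate(coeffs))

eps = Fraction(1, 10)                                   # sample half-thickness
averages = [average_of_power(k, eps) for k in range(5)]
poly_average = average_of_polynomial([1, 2, 3, 4, 5], eps)
```

Odd powers average to zero, while the even powers 0, 2, and 4 contribute 1, ε²/3, and ε⁴/5, respectively; only the coefficients of even powers of ζ therefore survive in a thickness average.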
We denote by I3 (ε; w, v) the vector-valued, nonlinear differential operator with
components I3γ , and write (5.27) for short as
I3 (ε; w, v) = 0 in P ;
(5.28)
this equation is to complement the antecedents (4.16) and (5.21) of the von Kármán
equations (Section 6).
To record here the complicated, explicit form of the operator I3 is scarcely
relevant to our present purposes. What instead matters, as we shall demonstrate
in the next section, is that I3 does not depend on the unknown field ϕ. In addition,
it is interesting to note that, under the scaling (4.17), we have that
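\[ \lim_{\varepsilon\to 0}\frac{1}{\varepsilon^2}\,\mathbf{I}_3(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon) = \operatorname{Div}\mathbf{S}^{(0)}. \tag{5.29} \]
Thus, with the use of (C.4)1 , we see that, in the limit when ε → 0, (5.28) reduces to
\[ \frac{E}{2(1+\nu)}\Big((\Delta w)\,\mathbf{1} + \frac{1+\nu}{1-\nu}\,\nabla\nabla w\Big)\nabla w + \operatorname{Div}\mathbf{S}^{(0)}(\mathbf{v}) = \mathbf{0} \quad\text{in } P. \tag{5.30} \]
⋆ The dependence of S(A) on w and v is detailed in Appendix C; as to hα , that dependence follows
from (5.2), (3.8), and (3.15).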
We now turn to determine the boundary conditions to be associated with the
differential system (5.28).
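Combining with (5.3) the second of (3.35), we find that the latter can be given
the form
\[ (\mathbf{S}^{(A)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}_\alpha = -\lambda\,\mathbf{n} \quad\text{in } M(\varepsilon). \tag{5.31} \]
Another use of (C.2) and (5.26) yields:
\[ (\mathbf{S}^{(A)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}_\alpha = (\mathbf{S}^{(0)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha + \zeta\big[(\mathbf{S}^{(0)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(1)}_\alpha + (\mathbf{S}^{(1)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha\big] + \zeta^2\big[(\mathbf{S}^{(1)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(1)}_\alpha + (\mathbf{S}^{(2)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha\big] + \zeta^3(\mathbf{S}^{(2)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(1)}_\alpha. \tag{5.32} \]
We set
\[ \mathbf{B}_3(\varepsilon; w, \mathbf{v}) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}(\mathbf{S}^{(A)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}_\alpha\,d\zeta, \tag{5.33} \]
whence
\[ \mathbf{B}_3(\varepsilon; w, \mathbf{v}) = (\mathbf{S}^{(0)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha + \frac13\varepsilon^2\big[(\mathbf{S}^{(1)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(1)}_\alpha + (\mathbf{S}^{(2)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha\big], \tag{5.34} \]
and stipulate that
\[ \mathbf{B}_3(\varepsilon; w, \mathbf{v})\times\mathbf{n} = \mathbf{0} \quad\text{in } \partial P. \tag{5.35} \]
Note that B3 does not depend on the unknown field ϕ. In addition, note that, under
the scaling (4.17), we have that
\[ \lim_{\varepsilon\to 0}\frac{1}{\varepsilon^2}\,\mathbf{B}_3(\varepsilon; w^\varepsilon, \mathbf{v}^\varepsilon) = (\mathbf{S}^{(0)}\mathbf{c}_\alpha\cdot\mathbf{n})\,\mathbf{h}^{(0)}_\alpha. \tag{5.36} \]
Thus, due to the fact that ∇w ≡ 0 on ∂P as a consequence of the Dirichlet
boundary conditions (2.4), another use of the first of (C.4) permits us to conclude
that, when ε → 0, (5.35) reduces to
\[ (\mathbf{1} + \nabla\mathbf{v})\,\mathbf{S}^{(0)}(\mathbf{v})\,\mathbf{n}\times\mathbf{n} = \mathbf{0} \quad\text{in } \partial P. \tag{5.37} \]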
6. An Exact Two-Dimensional Antecedent of von Kármán’s Bifurcation
Problem
We collect the relevant results obtained so far in order to formulate, in a format
modeled after [2], the two-dimensional bifurcation problem we propose as an exact
antecedent of von Kármán’s bifurcation problem (2.11).
We note, firstly, that the format change introduced by Berger applies also when
equation (4.16) takes the place of equation (2.1). Indeed, just as in Section 2, if
we let ϑ be a biharmonic field over P that satisfies the boundary conditions (2.6)
and, moreover, we let ψ = ϕ − λ ϑ, then we can give the system (4.16), (2.3) the
following form:
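\[ \Delta\Delta\psi - \frac12[w,w] + \frac12[v_\alpha,v_\alpha] + \frac13\varepsilon^2[m_i,m_i] = 0 \quad\text{in } P, \tag{6.1} \]
\[ \psi = 0, \qquad \partial_n\psi = 0 \quad\text{in } \partial P. \tag{2.4} \]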
We denote by ψ(w,v) the unique solution of this problem.
Secondly, we note that
I2 (ε; w, v, ϕ) = I2 (ε; w, v, ψ) + λB(ε; w, v, ϑ),
(6.2)
with
B(ε; w, v, ϑ) := E(m · z)[ϑ, w] + E(m · v,αβ )(R(∇∇ϑ)RT · cα ⊗ cβ ).
(6.3)
Finally, we let
A(ε; w, v) := I1 (ε; w, v) − I2 (ε; w, v, ψ(w,v) ),
(6.4)
C(ε; λ, w, v) := A(ε; w, v) − λB(ε; w, v),
(6.5)
and pose the problem of studying, for each ε > 0 fixed, the mapping
λ → W(ε;λ),
where W(ε;λ) is the collection of all pairs (w, v) of smooth fields over P ∪ ∂P
satisfying
C(ε; λ, w, v) = 0
in P ,
(6.6)
together with (2.4),
I3 (ε; w, v) = 0 in P ,
(5.28)
and
B3 (ε; w, v) × n = 0 in ∂P .
(5.35)
It is clear that this problem, in the limit when ε → 0, reduces to problem (2.11).
Once the latter problem is solved and a critical pair (λ, w λ ) is selected, the limit
problem which obtains from (5.28) and (5.35), that is, the problem consisting of
equations (5.30) and (5.37), serves to determine the in-plane displacement field v
that should accompany the critical transverse deflection w λ .
Appendix A. Geometry of the Deformed Shape of C(ε)
The gradient of a deformation
f (x, ζ ) = g(x) + ζ m(x)
(3.7)
of cylinder C(ε) may be written as
F = hα ⊗ cα + m ⊗ z
(5.3)
in terms of the covariant base vectors
hα = f,α = g,α +ζ m,α ,
h3 = f,ζ = m
(5.2)
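associated with the deformed shape of C(ε). The inverse of the deformation gradient has the following expression in terms of the contravariant base vectors $\mathbf{h}^i$:
\[ \mathbf{F}^{-1} = \mathbf{c}_\alpha\otimes\mathbf{h}^\alpha + \mathbf{z}\otimes\mathbf{m} \tag{A.1} \]
(note that $\mathbf{h}^3 = \mathbf{m}$).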
Since
det F = Fc1 × Fc2 · Fz = h1 × h2 · m,
(A.2)
we find that
det F(x, ζ ) = (det F)(0) (x) + ζ(det F)(1)(x) + ζ 2 (det F)(2)(x),
(A.3)
where
(det F)(0) = | g,1 ×g,2 |,
(det F)(1) = (g,1 ×m,2 +m,1 ×g,2 ) · m,
(det F)(2) = (m,1 ×m,2 ) · m.
(A.4)
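Expressions for the contravariant base vectors $\mathbf{h}^\alpha$ in terms of the covariant
vectors $\mathbf{h}_i$ are:
\[ \mathbf{h}^1 = (\det\mathbf{F})^{-1}\,\mathbf{h}_2\times\mathbf{m}, \qquad \mathbf{h}^2 = (\det\mathbf{F})^{-1}\,\mathbf{m}\times\mathbf{h}_1, \tag{A.5} \]
whence
\[ \mathbf{h}^1\cdot\mathbf{h}^1 = (\det\mathbf{F})^{-2}\,\mathbf{h}_2\cdot\mathbf{h}_2, \qquad \mathbf{h}^2\cdot\mathbf{h}^2 = (\det\mathbf{F})^{-2}\,\mathbf{h}_1\cdot\mathbf{h}_1, \qquad -\mathbf{h}^1\cdot\mathbf{h}^2 = (\det\mathbf{F})^{-2}\,\mathbf{h}_1\cdot\mathbf{h}_2. \tag{A.6} \]
It follows, in particular, that there are fields $A^{\alpha\beta}(x,\zeta)$ such that
\[ \mathbf{h}^\alpha\cdot\mathbf{h}^\beta = A^{\alpha\beta}{}_{,\zeta}.{}^{\star} \tag{A.7} \]
⋆ These fields have the form
\[ A^{11}(x,\zeta) = A^{11}(x,\zeta_0) + \int_{\zeta_0}^{\zeta}\frac{(\mathbf{h}_2\cdot\mathbf{h}_2)^{(0)} + \chi(\mathbf{h}_2\cdot\mathbf{h}_2)^{(1)} + \chi^2(\mathbf{h}_2\cdot\mathbf{h}_2)^{(2)}}{\big((\det\mathbf{F})^{(0)} + \chi(\det\mathbf{F})^{(1)} + \chi^2(\det\mathbf{F})^{(2)}\big)^2}\,d\chi, \]
etc.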
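The formulas for the contravariant base vectors can be tried out on sample data. In the Python sketch below, the function names and the numerical vectors are hypothetical; the covariant vectors h1 and h2 are chosen arbitrarily and m is taken as the unit normal to their plane, as happens for a deformation of the form (3.7).

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def contravariant_base(h1, h2, m):
    """Return (h^1, h^2, det F) following (A.2) and (A.5)."""
    det_f = dot(cross(h1, h2), m)
    hu1 = tuple(x/det_f for x in cross(h2, m))
    hu2 = tuple(x/det_f for x in cross(m, h1))
    return hu1, hu2, det_f

h1 = (1.1, 0.2, 0.3)                      # hypothetical covariant vectors
h2 = (-0.1, 0.9, -0.4)
normal = cross(h1, h2)
length = math.sqrt(dot(normal, normal))
m = tuple(x/length for x in normal)       # unit normal, so that h3 = m
hu1, hu2, det_f = contravariant_base(h1, h2, m)
```

The duality relations between the two bases, and the first and third of (A.6), can then be verified directly on the computed vectors.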
Appendix B. The Strain Field in C(ε)
In view of definition (3.1), the strain tensor D takes the form
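\[ \mathbf{D} = \frac12\big((\mathbf{h}_\alpha\cdot\mathbf{h}_\beta)\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta - \mathbf{c}_\alpha\otimes\mathbf{c}_\alpha\big). \tag{B.1} \]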
Now, due to the first two of (5.2),
hα · hβ (x, ζ ) = (hα · hβ )(0)(x) + ζ(hα · hβ )(1) (x) + ζ 2 (hα · hβ )(2) (x), (B.2)
where
(hα · hβ )(0) = g,α · g,β ,
(hα · hβ )(1) = g,α · m,β + m,α · g,β = −2 g,αβ · m,
(hα · hβ )(2) = m,α · m,β .
(B.3)
Consequently,
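\[ \mathbf{D}(x,\zeta) = \mathbf{D}^{(0)}(x) + \zeta\,\mathbf{D}^{(1)}(x) + \zeta^2\,\mathbf{D}^{(2)}(x), \tag{B.4} \]
with
\[ \mathbf{D}^{(0)} = \frac12\big((\mathbf{h}_\alpha\cdot\mathbf{h}_\beta)^{(0)} - \delta_{\alpha\beta}\big)\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta, \qquad \mathbf{D}^{(1)} = \frac12(\mathbf{h}_\alpha\cdot\mathbf{h}_\beta)^{(1)}\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta, \qquad \mathbf{D}^{(2)} = \frac12(\mathbf{h}_\alpha\cdot\mathbf{h}_\beta)^{(2)}\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta. \tag{B.5} \]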
To make explicit the dependence of the strain field in C(ε) on the surface
gradients of the parameter fields w, v over P , we recall that
g,α = cα + v,α +w,α z,
(3.37)
whence
g,α ·g,β − δαβ = w,α w,β + vα ,β + vβ ,α + vγ ,α vγ ,β ,
(B.6)
g,αβ = w,αβ z + v,αβ ;
(B.7)
and that
m = uni(g,1 ×g,2 ),
with
g,1 ×g,2 = z − ∇w + (c1 + w,1 z) × v,2 +v,1 ×(c2 + w,2 z) + v,1 ×v,2
= −(w,1 (1 + v2 ,2 ) − w,2 v2 ,1 )c1 − (w,2 (1 + v1 ,1 )
− w,1 v1 ,2 )c2 + (1 + v1 ,1 +v2 ,2 +v1 ,1 v2 ,2 −v1 ,2 v2 ,1 )z. (B.8)
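Relations (B.6) and (B.8) lend themselves to a direct numerical check. The Python sketch below uses hypothetical function names and sample gradients; the array grad_v stores the in-plane displacement gradient with grad_v[b][a] standing for the derivative of the (b+1)-th component of v with respect to the (a+1)-th coordinate.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def tangent_vectors(grad_w, grad_v):
    """g,alpha = c_alpha + v,alpha + w,alpha z, cf. (3.37)."""
    g1 = (1.0 + grad_v[0][0], grad_v[1][0], grad_w[0])
    g2 = (grad_v[0][1], 1.0 + grad_v[1][1], grad_w[1])
    return g1, g2

def metric_excess(grad_w, grad_v, a, b):
    """Right side of (B.6) for the index pair (a, b)."""
    return (grad_w[a]*grad_w[b] + grad_v[a][b] + grad_v[b][a]
            + sum(grad_v[c][a]*grad_v[c][b] for c in range(2)))

def cross_components(grad_w, grad_v):
    """Component form of g,1 x g,2 in the last two lines of (B.8)."""
    (v11, v12), (v21, v22) = grad_v
    w1, w2 = grad_w
    return (-(w1*(1.0 + v22) - w2*v21),
            -(w2*(1.0 + v11) - w1*v12),
            1.0 + v11 + v22 + v11*v22 - v12*v21)

grad_w = (0.4, -0.3)                      # hypothetical sample values
grad_v = ((0.2, 0.1), (-0.1, 0.3))
g = tangent_vectors(grad_w, grad_v)
```

Both relations are algebraic identities in the gradients of w and v, so they hold for any sample values, not only for small ones.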
We can then write relations (B.5) in the following form:
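\[ \mathbf{D}^{(0)} = \frac12\nabla w\otimes\nabla w + \mathbf{D}(\mathbf{v}), \qquad \mathbf{D}(\mathbf{v}) := \operatorname{sym}(\nabla\mathbf{v}) + \frac12(\nabla\mathbf{v})^T\nabla\mathbf{v}, \]
\[ \mathbf{D}^{(1)} = -(\mathbf{m}\cdot\mathbf{z})\,\nabla\nabla w - (\mathbf{m}\cdot\mathbf{v}_{,\alpha\beta})\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta, \]
\[ \mathbf{D}^{(2)} = \frac12(\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta})\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta. \tag{B.9} \]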
Appendix C. The Active Stress Field in C(ε)
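In order to find relations similar to (B.4) and (B.9) for the active stress, we first
write the constitutive relation (3.26) in the form
\[ \mathbf{S}^{(A)} = \mathbb{S}[\mathbf{D}], \qquad \mathbb{S} := \frac{E}{1+\nu}\Big(\mathbb{I} + \frac{\nu}{1-\nu}\,\mathbf{1}\otimes\mathbf{1}\Big). \tag{C.1} \]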
Secondly, we insert (B.4) into (C.1) and get
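\[ \mathbf{S}^{(A)}(x,\zeta) = \mathbf{S}^{(0)}(x) + \zeta\,\mathbf{S}^{(1)}(x) + \zeta^2\,\mathbf{S}^{(2)}(x), \tag{C.2} \]
with
\[ \mathbf{S}^{(i)} = \mathbb{S}[\mathbf{D}^{(i)}], \qquad i = 0, 1, 2; \tag{C.3} \]
more explicitly,
\[ \mathbf{S}^{(0)} = \frac{E}{2(1+\nu)}\Big(\nabla w\otimes\nabla w + \frac{\nu}{1-\nu}|\nabla w|^2\,\mathbf{1}\Big) + \mathbf{S}^{(0)}(\mathbf{v}), \]
\[ \mathbf{S}^{(1)} = -\frac{E}{1+\nu}(\mathbf{m}\cdot\mathbf{z})\Big(\nabla\nabla w + \frac{\nu}{1-\nu}(\Delta w)\,\mathbf{1}\Big) + \mathbf{S}^{(1)}(w,\mathbf{v}), \]
\[ \mathbf{S}^{(2)} = \frac{E}{2(1+\nu)}\Big((\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta})\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta + \frac{\nu}{1-\nu}|\mathbf{m}_{,\alpha}|^2\,\mathbf{1}\Big), \tag{C.4} \]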
where
S(0)(v) := S[D(v)],
S(1) (w, v) := −S[(m · v,αβ )cα ⊗ cβ ].⋆
(C.5)
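It follows from (3.20) and (C.1) that the thickness average of the stress field in
C(ε) is the following field over P :
\[ \overline{\mathbf{S}} = \overline{\mathbf{S}^{(A)}} = \mathbf{S}^{(0)} + \frac13\varepsilon^2\,\mathbf{S}^{(2)}. \]
⋆ In particular,
\[ \mathbf{S}^{(1)}(w,\mathbf{v}) = -\frac{E}{1+\nu}\Big((\mathbf{m}\cdot\mathbf{v}_{,\alpha\beta})\,\mathbf{c}_\alpha\otimes\mathbf{c}_\beta + \frac{\nu}{1-\nu}(\mathbf{m}\cdot\mathbf{v}_{,\alpha\alpha})\,\mathbf{1}\Big). \tag{C.6} \]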
Appendix D. From (5.18) to (5.21)
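D.1. THE FIRST INTEGRAL IN (5.18)
In view of (A.7), an integration by parts taking (5.15) into account yields
\[ \int_{-\varepsilon}^{+\varepsilon}(\mathbf{h}^\alpha\cdot\mathbf{h}^\beta)\Big(\int_{-\varepsilon}^{\zeta}(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\,d\tau\Big)d\zeta = -\int_{-\varepsilon}^{+\varepsilon}A^{\alpha\beta}\,(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\,d\zeta. \]
Expanding the differentiations indicated, we have that
\[ \int_{-\varepsilon}^{+\varepsilon}\big(A^{\alpha\beta}(S^{(A)}_{\delta\gamma}\mathbf{h}_\delta)_{,\gamma}\cdot\mathbf{h}_\beta\big)_{,\alpha}d\zeta = \int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\delta\gamma,\gamma\alpha}(A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta)\,d\zeta + \int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\delta\gamma,\gamma}(A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta)_{,\alpha}\,d\zeta \]
\[ \qquad + \int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\delta\gamma,\alpha}(A^{\alpha\beta}\mathbf{h}_{\delta,\gamma}\cdot\mathbf{h}_\beta)\,d\zeta + \int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\delta\gamma}(A^{\alpha\beta}\mathbf{h}_{\delta,\gamma}\cdot\mathbf{h}_\beta)_{,\alpha}\,d\zeta. \]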
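The expressions (C.1)–(C.5) for $\mathbf{S}^{(A)}$ must now be inserted into each of the four
integrals in the right side of the last relation. We concentrate on the first integral
(the most important, because it gives rise to the principal part of the operator I1 ),
which takes the following form:
\[ \int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\delta\gamma,\gamma\alpha}(A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta)\,d\zeta = \Big(\int_{-\varepsilon}^{+\varepsilon}A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta\,d\zeta\Big)S^{(0)}_{\delta\gamma,\gamma\alpha} + \Big(\int_{-\varepsilon}^{+\varepsilon}\zeta\,A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta\,d\zeta\Big)S^{(1)}_{\delta\gamma,\gamma\alpha} + \Big(\int_{-\varepsilon}^{+\varepsilon}\zeta^2A^{\alpha\beta}\mathbf{h}_\beta\cdot\mathbf{h}_\delta\,d\zeta\Big)S^{(2)}_{\delta\gamma,\gamma\alpha}. \]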
What now counts is the second addendum. With (C.4)3 and (C.5)2 , we find that
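\[ S^{(1)}_{\delta\gamma,\gamma\alpha} = -\frac{E}{1-\nu^2}(\mathbf{m}\cdot\mathbf{z})\,(\Delta w)_{,\delta\alpha} + (\mathbf{S}^{(1)}(w,\mathbf{v}))_{\delta\gamma,\gamma\alpha}. \tag{D.1} \]
We set
\[ B^{\alpha\delta}(\varepsilon; w, \mathbf{v}) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\zeta\,A^{\alpha\beta}\,\mathbf{h}_\beta\cdot\mathbf{h}_\delta\,d\zeta, \tag{D.2} \]
and, finally, write the principal part of I1 as
\[ \text{p.p.}(I_1(\varepsilon; w, \mathbf{v})) = \frac{E}{1-\nu^2}(\mathbf{m}\cdot\mathbf{z})\,B^{\alpha\delta}(\varepsilon; w, \mathbf{v})\,(\Delta w)_{,\delta\alpha}. \tag{D.3} \]
D.2. THE SECOND INTEGRAL IN (5.18)
Since
\[ \mathbf{h}_{\beta,\alpha}\cdot\mathbf{m} = (\mathbf{m}\cdot\mathbf{z})\,w_{,\alpha\beta} + \mathbf{m}\cdot\mathbf{v}_{,\alpha\beta} - \zeta\,\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta}, \tag{D.4} \]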
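the second integral in (5.18) can be written as
\[ \big((\mathbf{m}\cdot\mathbf{z})\,w_{,\alpha\beta} + \mathbf{m}\cdot\mathbf{v}_{,\alpha\beta}\big)\int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\beta\alpha}\,d\zeta - (\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta})\int_{-\varepsilon}^{+\varepsilon}\zeta\,S^{(A)}_{\beta\alpha}\,d\zeta. \tag{D.5} \]
Now, by the first of (4.12),
\[ \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\mathbf{S}^{(A)}\,d\zeta = E\,\mathbf{R}(\nabla\nabla\varphi)\mathbf{R}^T, \tag{D.6} \]
so that
\[ w_{,\alpha\beta}\int_{-\varepsilon}^{+\varepsilon}S^{(A)}_{\beta\alpha}\,d\zeta = (2\varepsilon)E\,[w,\varphi] = (2\varepsilon)E\,[\varphi,w].{}^{\star} \]
Moreover, by (C.2) and (C.3),
\[ \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\zeta\,\mathbf{S}^{(A)}\,d\zeta = \frac13\varepsilon^2\,\mathbf{S}^{(1)}, \tag{D.7} \]
with S(1) depending on w and v as specified by (C.4)3 and (C.5)2 .
With this, we conclude that
\[ I_2(\varepsilon; w, \mathbf{v}, \varphi) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}(\mathbf{h}_{\beta,\alpha}\cdot\mathbf{m})\,S^{(A)}_{\beta\alpha}\,d\zeta = E(\mathbf{m}\cdot\mathbf{z})[\varphi,w] + E(\mathbf{m}\cdot\mathbf{v}_{,\alpha\beta})(\mathbf{R}(\nabla\nabla\varphi)\mathbf{R}^T\cdot\mathbf{c}_\alpha\otimes\mathbf{c}_\beta) - \frac13\varepsilon^2(\mathbf{m}_{,\alpha}\cdot\mathbf{m}_{,\beta})\,S^{(1)}_{\alpha\beta}. \tag{D.8} \]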
Acknowledgements
I have benefited from comments by two referees. This work has been supported
by the Progetti Cofinanziati 2000 and 2002 “Modelli Matematici per la Scienza
dei Materiali” and by TMR Contract FMRX-CT98-0229 “Phase Transitions in
Crystalline Solids”.
References
1. S.S. Antman, Nonlinear Problems of Elasticity. Springer, Berlin (1995).
2. M.S. Berger, Nonlinearity and Functional Analysis. Academic Press, New York (1977).
3. S. Chow and J. Hale, Methods of Bifurcation Theory. Springer, Berlin (1996).
4. P.G. Ciarlet, A justification of the von Kármán equations. Arch. Rational Mech. Anal. 73 (1980)
349–389.
⋆ We here use an identity that follows from (2.5) and the second of (4.12), namely,
[a, b] = ∇∇a · R(∇∇b)RT .
5. P.G. Ciarlet, Plates and Junctions in Elastic Multi-Structures: An Asymptotic Analysis.
Springer, Berlin (1990).
6. P.G. Ciarlet, Mathematical Elasticity, Vol. II: Theory of Plates. North-Holland, Amsterdam
(1997).
7. M.G. Crandall and P.H. Rabinowitz, Nonlinear Sturm–Liouville eigenvalue problems and
topological degree. J. Math. Mech. 19 (1970) 1083–1102.
8. J.L. Davet, Justification de modèles de plaques non linéaires pour des lois des comportements
générales. Modél. Math. Anal. Numér. 20 (1986) 225–249.
9. D.D. Fox, A. Raoult and J.C. Simo, A justification of nonlinearly properly invariant plate
theories. Arch. Rational Mech. Anal. 124 (1993) 157–199.
10. G. Friesecke, R.D. James and S. Müller, A theorem of geometric rigidity and the derivation of
nonlinear plate theory from three-dimensional elasticity. Comm. Pure Appl. Math. LV (2002)
1461–1506.
11. G. Friesecke, R.D. James and S. Müller, The Föppl–von Kármán plate theory as a low energy
Gamma limit of nonlinear elasticity. C. R. Math. Acad. Sci. Paris 335 (2002) 201–206.
12. M.E. Gurtin, The linear theory of elasticity. In: S. Flügge (ed.), Handbuch der Physik Via/2.
Springer, Berlin (1972).
13. M.E. Gurtin and P. Podio-Guidugli, The thermodynamics of constrained materials. Arch.
Rational Mech. Anal. 51 (1973) 192–208.
14. T.J. Healey and H.C. Simpson, Global continuation in nonlinear elasticity. Arch. Rational Mech.
Anal. 143 (1998) 1–28.
15. P.M. Naghdi and R.P. Nordgren, On the nonlinear theory of elastic shells under the Kirchhoff
hypothesis. Quart. Appl. Math. 21 (1963) 49–59.
16. P. Podio-Guidugli, An exact derivation of the thin plate equation. J. Elasticity 22 (1989) 121–
133.
17. P. Podio-Guidugli, A Primer in Elasticity. Kluwer Academic Publishers, Dordrecht (2000).
18. P.H. Rabinowitz, Some global results for nonlinear eigenvalue problems. J. Funct. Anal. 7
(1971) 487–513.
19. C. Truesdell, Some challenges offered to analysis by rational thermodynamics. In: G.M. de la
Penha and L.A. Medeiros (eds), Contemporary Developments in Continuum Mechanics and
Partial Differential Equations. North-Holland, Amsterdam (1978).
20. C. Truesdell and W. Noll, The non-linear field theories of mechanics. In: S. Flügge (ed.),
Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
21. T. von Kármán, Festigkeitsprobleme im Maschinenbau. In: F. Klein and C. Müller (eds), Encyklopädie der Mathematischen Wissenschaften, Vol. IV/4. Teubner, Stuttgart (1910) pp. 311–385.
Cauchy’s Flux Theorem in Light of Geometric
Integration Theory
G. RODNAY and R. SEGEV
Department of Mechanical Engineering, Ben-Gurion University, P.O. Box 653, Beer-Sheva 84105,
Israel. E-mail: {rodnay;rsegev}@bgumail.bgu.ac.il
Received 24 June 2002; in revised form 17 January 2003
Abstract. This work presents a formulation of Cauchy’s flux theory of continuum mechanics in the
framework of geometric integration theory as formulated by H. Whitney and extended recently by
J. Harrison. Starting with convex polygons, one constructs a formal vector space of polyhedral chains.
A Banach space of chains is obtained by a completion process of this vector space with respect to a
norm. Then, integration operators, cochains, are defined as elements of the dual space to the space
of chains. Thus, the approach links the analytical properties of cochains with the corresponding
properties of the domains in an optimal way. The basic representation theorem shows that cochains
may be represented by forms. The form representing a cochain is a geometric analog of a flux field
in continuum mechanics.
Mathematics Subject Classifications (2000): 73A05, 58A05.
Key words: continuum mechanics, flux, Cauchy’s theorem, geometric integration, chains, cochains,
flat, sharp, natural.
Dedicated to the memory of Clifford Truesdell who by his work and personality
inspired the research of generations of scientists.
1. Introduction
The Cauchy Theorem for the existence of stresses and fluxes is one of the fundamental results of continuum mechanics. Over the years, research work contributed
to the subject by making the proof more rigorous, by weakening the postulates
needed to prove the theorem, by extending the circumstances under which it is
valid, and by proving the existence of stresses and fluxes using alternative methods
and approaches.
In terms of scalar fluxes in space, the basic notions of flux theory may be described as follows. One considers the total flux T (∂B) of an extensive property P
through the boundary ∂B of the region B in a three dimensional Euclidean space.
The total flux is assumed to be given as an integral of the flux density tB associated
with the region B, a scalar field defined on ∂B, in the form
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 699–719.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
$$T(\partial B) = \int_{\partial B} t_B \, dA.$$
The dependence of the flux density tB on the region B is considered next and it is
assumed that at each point p, tB (p) depends on B only through the unit normal
vector n to ∂B at p so one writes t (p, n) for the corresponding value. Then, one
assumes that the total flux is balanced by the rate of decrease of the total amount
of the property P in B as given in terms of an integral of a scalar field b over B so
$$\int_{\partial B} t_B \, dA = -\int_{B} b \, dV.$$
Assuming that the dependence of t (p, n) on p is continuous one proves Cauchy’s
theorem asserting that t (p, n) depends linearly on n. Thus, there is a vector field τ
such that t = τ ·n, where the dependence on p was suppressed in the notation.
Considering smooth regions such that Gauss’ theorem may be applied, the
balance may be written in the form of a differential equation as div τ + b = 0.
The first major contribution within the continuum mechanics community was
made by Noll in 1957 [11]. Noll was able to prove the dependence of the flux on
the normal vector, using a weaker assumption of locality, namely, the flux density
tB (p) is equal for two regions if the intersection of their boundaries contains an
open neighborhood of p.
Gurtin and Williams in [5] and later works [12, 6, 13] use alternative assumptions, bi-additivity and boundedness, to obtain both locality and the representation
of total flux in terms of the integral of a flux density. Specifically, assuming that
the collection of admissible regions has the structure of a Boolean lattice, it is
required that T be given in terms of a mapping of pairs of bodies I (A, B), so
T (A) = I (A, B) if B is the complement of A. For separate (disjoint) domains
A and B, I satisfies I (A ∪ B, C) = I (A, C) + I (B, C) and I (C, A ∪ B) =
I (C, A) + I (C, B). Then, it is assumed that |I (A, B)| ≤ l area (∂A ∩ ∂B) +
k volume (A) – the boundedness assumption – to obtain locality. To prove Cauchy’s
theorem it is assumed also that I (A, B) = −I (B, A) and that the dependence of
t (p, n) on p is continuous.
In [4] Gurtin and Martins prove the linearity in n of t (p, n) almost everywhere, while using similar additivity and boundedness assumptions but relaxing
the hypothesis that t (p, n) is a continuous function of p.
In [20, 21] Šilhavý uses a weak approach to prove the existence of stress tensors or flux vectors. Admissible bodies are sets of finite perimeter in E^n, and the
assumptions and results pertain to “almost every subbody” in a way which allows
singularities. The resulting flux vector τ has an L^p weak divergence.
Degiovanni et al. [1] generalize [20, 21] by considering flux mappings T whose
corresponding flux vector fields τ are only locally integrable. The field b = −div τ
is meaningful only in the weak sense.
In order to present results that hold for domains and flux fields that are increasingly irregular, the works cited above rely on geometric measure theory of Federer
[2] and de Giorgi (see [14]). For example, tools of geometric measure theory are
used for choosing a universe of bodies, for a measure theoretic definition of the
normal vector, and for using generalizations of Gauss’ theorem to such irregular
domains.
Another approach for proving Cauchy’s theorem directly from an integral balance equation is introduced in [3]. In this paper a variational approach is taken to
prove the linear dependence on the normal starting from a weaker locality postulate.
Stress theory for manifolds that are not equipped with a metric is presented in
[15] from a weak point of view. Forces are defined as elements of the dual space
of the Banach space of C^k-sections of a vector bundle over the body. Stresses are
Borel measures valued in the dual of a jet bundle and they represent forces using
a representation theorem. Further analytical aspects of the theory are presented
in [18]. In particular, as the theory introduces continuum mechanics of order k
as corresponding to the space of C^k-sections, general consistency conditions that
are analogous to Cauchy’s postulates are formulated for arbitrary values of k and
for stresses as irregular as Borel measures. In [16], and the following [19, 17] the
analog of the classical Cauchy theorem is presented for differentiable manifolds.
In 1947 and 1948 Whitney [22] and Wolfe [24] presented a geometric theory of
r-dimensional integration in an n-dimensional Euclidean space. A comprehensive
treatment [23] of the theory was published by Whitney in 1957. While geometric
measure theory received a lot of attention because of its relevance to the Plateau
problem, the mathematical work continuing Whitney’s geometric integration theory is limited. In [7] and the following [8–10] Harrison made important extensions
to Whitney’s work. To the best of our knowledge, Whitney’s abstract geometric
integration theory was never used in the formulation of Cauchy flux theory in
continuum mechanics.
It is our objective here to present the Cauchy flux theory from the point of view
of geometric integration. In addition to offering a different approach to flux theory,
the following features make it eminently suitable. Firstly, the theory considers
various aspects namely, the collection of domains, integration, Stokes’ theorem
(the analog of Gauss’ theorem), and fluxes, from a unified point of view. The
properties and degrees of regularity of the various variables are linked. Thus, one
may consider less regular domains if one is willing to consider smoother fluxes.
In fact, the regions may be as irregular as the Dirac measure and its derivatives
if one is willing to admit differentiable flux fields. On the other hand, the flux
fields may be as irregular as essentially bounded and measurable functions when
the boundaries are as irregular as the graph of an L^1-mapping. The way the theory
is constructed, the relation between the regularity properties of domains and fluxes
is optimal in the following sense. The class of domains is the largest class for
which the evaluation of the various fluxes is continuous. Conversely, the class of
fluxes is the largest class such that the total fluxes depend continuously on the
domains.
The codimension n − r is not limited to the value of 1 as in regular Cauchy flux
theory. It follows that the theory may be used to formulate flux theory on membranes, strings, etc. Furthermore, the theory does not require that the r-dimensional
domains be smooth. In fact, it permits for example the calculation of flux through
a 1-dimensional “arc” on a 2-dimensional domain in R^3 which is itself the graph
of an L^1-mapping. In other words, not only the boundary is irregular, but so is
the domain itself. Finally, the construction of continuous chains creates a bridge
between the classical and weak formulation of the theory.
The elegance of the structure enables its description in just a few sentences.
One starts with the building blocks, r-dimensional oriented cells (convex polygons)
in an n-dimensional Euclidean space E^n. Then, the formal vector space of linear
combinations of r-cells is considered, where two linear combinations A and B are
identified if they may be further subdivided to obtain a common subdivision. The
elements of this vector space are called polyhedral r-chains. Then, the space of
polyhedral chains is completed with respect to a norm to obtain a Banach space.
The elements of the resulting complete space are called either flat, sharp, or natural r-chains depending on the norms used, and chains collectively. Integration
operators are referred to as r-cochains and they are defined as continuous linear
operators on the space of chains.
The application of geometric integration theory to Cauchy flux theory is based
on the identification of a total flux operator on regions with a cochain. In other
words, a cochain is analogous to a total flux operator acting on the various domains to produce real numbers. In the case of traditional continuum mechanics,
the total flux is regarded as a 2-cochain in E^3. The analog of Cauchy’s flux theorem is
a representation theorem stating that a cochain may be represented by an r-form,
an antisymmetric r-tensor in En , using integration. As mentioned earlier, the analytical properties of chains and forms representing cochains are determined by
the norm used. The topology on the space of chains allows one to extend various
operations, e.g., integrals and boundaries, from polyhedral chains to the chains
obtained as limits of sequences.
Federer [2, pp. 367–378] introduces flat chains as currents, roughly, continuous linear functionals on the space of smooth forms with compact supports –
the geometric analogs of Schwartz distributions, and defines the flat norm as the
norm induced on the dual space by the norm ‖φ‖ = sup_p {|φ(p)|, |dφ(p)|} on
the space of smooth forms. While Federer’s treatment of flat chains is concise and
elegant, it does not contain the analogs for sharp and natural chains. In addition, it
seems to us that Whitney’s approach is closer in spirit to the traditional approach
of continuum mechanics. Furthermore, as Federer states in [2, p. 378] his main
interest has been in chains while Whitney’s main concern has been with cochains
– the objects representing Cauchy fluxes of continuum mechanics.
It is noted that the expression for the representation of cochains in terms of
forms also applies on general manifolds rather than a Euclidean space. In addition, while the definitions of the various norms utilize the metric structure of En ,
the various topological spaces of chains remain invariant under diffeomorphisms.
This suggests an extension of the theories to general manifolds. However, a formal
presentation of such a theory is not available yet and will not be considered here.
Thus, the basic constructions, results, and applications to Cauchy flux theory are
described below. For details of the mathematical constructions and proofs see [23]
and [8, 9]. We start in Section 2 with the basic building blocks: polyhedral chains
and integration on polyhedral chains. Section 3 considers the construction of the
various Banach spaces of chains and Section 4 presents the definitions and basic
properties associated with cochains – the analogs of the Cauchy flux operators. The
Cauchy theorem of fluxes is implied by the representation theorem of cochains by
forms as presented in Section 5. Finally, Section 6 considers the extension of the
exterior derivative to non-smooth forms through the notion of a coboundary, and
the resulting local balance equation.
2. Chains and Integration
2.1. CELLS AND POLYHEDRAL CHAINS
We start with a review of the basic definitions related to integration on chains in an
n-dimensional Euclidean space En whose associated vector space is V . A cell, σ , is
a non empty bounded subset of En expressed as an intersection of a finite collection
of half spaces. The plane of σ is the smallest affine subspace containing σ , and the
dimension of σ is the dimension of its plane. We refer to r-dimensional cells as
r-cells.
An oriented r-cell is an r-cell with a choice of one of the two orientations of
the vector space associated with its plane. The cell −σ is the cell that contains
the same points as σ but has the opposite orientation. The boundary of an oriented
r-cell, ∂σ , is a collection of oriented (r −1)-cells. The boundary of a 1-cell consists
of two points, and a 0-cell has no boundary. The orientations of the cells that make
up the boundary ∂σ are determined by the orientation of σ , in the following way.
Given a cell σ ′ ⊂ ∂σ , let v2 , . . . , vr be a collection of r − 1 independent vectors
that belong to the plane of σ ′ . Then, this collection is positively oriented if given
a vector v1 at σ ′ that belongs to the plane of σ and points out of σ , the collection
(v1 , . . . , vr ) is positively oriented relative to σ . The boundary of a 1-cell oriented
by the vector pq, consists of the two 0-cells q positive and p negative.
Oriented cells are the building blocks of chains. A polyhedral r-chain in En is
an element of the vector space spanned by formal linear combinations of r-cells,
together with the following properties.
(1) The polyhedral chain 1σ is identified with the cell σ .
(2) We associate multiplication of a cell by −1 with the operation of inversion of
orientation, i.e., −1σ = −σ .
(3) If an oriented cell σ is cut into several cells σ1 , . . . , σm , then σ and σ1 +
· · · + σm are identified as polyhedral chains. Thus, we identify the union of
oriented r-cells having disjoint interiors with the polyhedral r-chain which is
the sum of the r-cells.
Polyhedral 0-chains are expressions of the form $\sum a_i p_i$, where $p_i$ are points. The
boundary of a cell is thus a chain, the sum of the various oriented cells that make
up the boundary as above.
The space of polyhedral r-chains in E^n is now an infinite-dimensional vector
space denoted by $A_r(E^n)$. The boundary of a polyhedral r-chain $A = \sum a_i \sigma_i$ is
a polyhedral (r − 1)-chain defined to be $\partial A = \sum a_i \, \partial\sigma_i$. The boundary of a polyhedral 0-chain is 0. Note that by this definition ∂ is a linear operator
$A_r(E^n) \to A_{r-1}(E^n)$.
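The linearity of ∂ can be illustrated with a small computational sketch. The following is not part of the original construction: it assumes, for simplicity, that chains are built from simplexes given by ordered tuples of vertex labels (rather than general convex cells), and the function name `boundary` and the dictionary representation are hypothetical choices made only for this illustration.

```python
from collections import defaultdict


def boundary(chain):
    """Boundary of a simplicial chain (illustrative sketch).

    A chain is a dict mapping an oriented simplex, written as a tuple of
    vertex labels, to a real coefficient. The face obtained by omitting
    vertex number i enters with the sign (-1)**i. Simplexes that differ
    only by a reordering of vertices are NOT identified here.
    """
    result = defaultdict(float)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            # The boundary of a 0-chain is 0.
            continue
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            result[face] += (-1) ** i * coeff
    # Drop faces whose coefficients cancelled.
    return {s: c for s, c in result.items() if c != 0}


triangle = {("p0", "p1", "p2"): 1.0}
edges = boundary(triangle)
print(edges)            # the three oriented edges of the triangle
print(boundary(edges))  # the end points cancel in pairs
```

With this sign rule the boundary of the 1-simplex (p, q) is q − p, in agreement with the convention stated above for a 1-cell oriented by the vector pq.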
2.2. MULTIVECTORS
A simple r-vector in V is defined in a formal way, to be an expression of the form
v1 ∧ · · · ∧ vr , where vi ∈ V , the vector space associated with En . We set r-vectors
in V to be elements of the vector space Vr of formal linear combinations of simple
r-vectors, together with the following properties:
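(1) $v_1 \wedge \cdots \wedge (v_i + v_i') \wedge \cdots \wedge v_r = v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r + v_1 \wedge \cdots \wedge v_i' \wedge \cdots \wedge v_r$;
(2) $v_1 \wedge \cdots \wedge (a v_i) \wedge \cdots \wedge v_r = a\,(v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r)$;
(3) $v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_j \wedge \cdots \wedge v_r = -\,v_1 \wedge \cdots \wedge v_j \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r$.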
1-vectors are just vectors, and 0-vectors are defined to be real numbers. It is noted
that any r-vector can be written in various equivalent ways. The various identifications above, in particular the antisymmetry, imply that the dimension of the space of
r-vectors is
$$\dim V_r = \frac{n!}{(n-r)!\, r!},$$
where n is the dimension of V. If r > n then V_r contains only the zero r-vector. Given a basis {e_i} of V,
the r-vectors $\{e_{\lambda_1 \ldots \lambda_r} = e_{\lambda_1} \wedge \cdots \wedge e_{\lambda_r}\}$, such that $1 \le \lambda_1 < \cdots < \lambda_r \le n$, form a
basis of V_r.
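As a quick check of the dimension count, the increasing index sets λ1 < · · · < λr that label the basis r-vectors can be enumerated directly. The sketch below is only an illustration; the helper name `basis_indices` is hypothetical.

```python
from itertools import combinations
from math import comb


def basis_indices(n, r):
    """Index tuples (l1 < ... < lr), 1 <= li <= n, labelling a basis of V_r."""
    return list(combinations(range(1, n + 1), r))


n = 3
for r in range(0, n + 2):
    # comb(n, r) equals n! / ((n - r)! r!) and vanishes for r > n.
    print(r, comb(n, r), basis_indices(n, r))
```

For r = 0 the single empty index tuple corresponds to the real numbers, and for r > n the list is empty, so only the zero r-vector remains.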
Given an oriented r-simplex σ in En , with vertices p0 , . . . , pr , the r-vector
of σ , denoted by {σ }, is defined to be {σ } = v1 ∧ · · · ∧ vr /r!, where the vectors vi
are defined by vi = pi − p0 and are ordered in such a way that they belong to the
orientation of σ. It is noted that in case $\{\sigma_1\} = a\{\sigma_2\}$ for two r-simplexes σ1 and σ2,
then the ratio between the r-dimensional volumes of the two simplexes relative to
any metric is |a|. The r-vector of a polyhedral r-chain $A = \sum a_i \sigma_i$, where $\sum a_i \sigma_i$
is a simplicial subdivision of A, is defined by $\{\sum a_i \sigma_i\} = \sum a_i \{\sigma_i\}$. Clearly, this
defines the r-vectors of r-cells too, as r-cells are particular polyhedral r-chains.
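The r-vector of a simplex can be made concrete in components. The sketch below is an illustration under stated assumptions: it uses the standard orthonormal basis of V, in which the component of v1 ∧ · · · ∧ vr on e_{λ1...λr} is the r × r minor of the edge vectors taken from columns λ1, . . . , λr, and it uses the Euclidean norm of the components as the r-dimensional volume. The function names are hypothetical.

```python
from itertools import combinations
from math import factorial, sqrt


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def simplex_r_vector(vertices):
    """Components of {sigma} = v1 ^ ... ^ vr / r! for vertices p0, ..., pr."""
    p0 = vertices[0]
    r = len(vertices) - 1
    n = len(p0)
    edges = [[p[k] - p0[k] for k in range(n)] for p in vertices[1:]]
    comps = {}
    for cols in combinations(range(n), r):
        minor = [[v[c] for c in cols] for v in edges]
        comps[tuple(c + 1 for c in cols)] = det(minor) / factorial(r)
    return comps


def volume(comps):
    """Euclidean norm of the components, i.e. the r-volume of the simplex."""
    return sqrt(sum(c * c for c in comps.values()))


sigma = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]   # a 2-simplex in E^3
print(simplex_r_vector(sigma), volume(simplex_r_vector(sigma)))
```

Exchanging two vertices other than p0 reverses the orientation and changes the sign of every component, while the volume is unchanged.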
2.3. MULTI-COVECTORS
The dual space of $V_r$ is denoted by $V^r$ and its elements are referred to as r-covectors.
We now show how r-covectors can be expressed using covectors. We denote by
$V^*$ the dual space of V, and by $V_r^*$ the space which is constructed exactly like the
space $V_r$, but using the (co-)vectors of $V^*$. Hence, elements of $V_r^*$ are expressions
of the form $\sum_i a_i f^{i_1} \wedge \cdots \wedge f^{i_r}$, where $f^{i_j} \in V^*$. The scalar product of elements
of $V_r^*$ and elements of $V_r$ is defined by
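$$(f^1 \wedge \cdots \wedge f^r) \cdot (v_1 \wedge \cdots \wedge v_r) = \sum_{\lambda} \epsilon^{\lambda_1 \ldots \lambda_r} f^1(v_{\lambda_1}) \cdots f^r(v_{\lambda_r}) = \det \begin{pmatrix} f^1(v_1) & \cdots & f^r(v_1) \\ \vdots & \ddots & \vdots \\ f^1(v_r) & \cdots & f^r(v_r) \end{pmatrix},$$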
for simple vectors, and extends linearly to the vector spaces. Here, $\lambda = \{\lambda_1, \ldots, \lambda_r\}$
ranges over the set of all permutations of (1, . . . , r), and $\epsilon^{\lambda_1 \ldots \lambda_r}$ is the alternating
symbol. Any element $\bar\tau$ of $V_r^*$ may be identified with an element τ of $V^r$ by
$\tau(\alpha) = \bar\tau \cdot \alpha$ for any r-multivector α. Furthermore, an element τ of $V^r$ may be
regarded as an alternating multilinear form $\tilde\tau$ by
$$\tau(v_1 \wedge \cdots \wedge v_r) = \tilde\tau(v_1, \ldots, v_r).$$
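The scalar product of a simple r-covector and a simple r-vector can be evaluated literally as the signed sum over permutations. The sketch below is an illustration only; covectors and vectors are assumed to be given by their components relative to dual bases, and the function names `perm_sign` and `pair` are hypothetical.

```python
from itertools import permutations


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def pair(covectors, vectors):
    """(f^1 ^ ... ^ f^r) . (v_1 ^ ... ^ v_r) as a signed permutation sum."""
    r = len(vectors)

    def apply(f, v):
        # Action of a covector on a vector in dual bases.
        return sum(fk * vk for fk, vk in zip(f, v))

    total = 0.0
    for lam in permutations(range(r)):
        term = perm_sign(lam)
        for i in range(r):
            term *= apply(covectors[i], vectors[lam[i]])
        total += term
    return total


dx, dy = (1, 0, 0), (0, 1, 0)
v1, v2 = (1, 2, 3), (4, 5, 6)
print(pair([dx, dy], [v1, v2]))   # dx(v1) dy(v2) - dx(v2) dy(v1)
```

Exchanging v1 and v2 changes the sign of the result, and repeating a vector gives zero, as the antisymmetry of r-vectors requires.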
2.4. INTEGRATION OF FORMS OVER POLYHEDRAL CHAINS
The natural integrands over r-chains are r-forms. An r-form in a set Q ⊂ En is an
r-covector valued mapping defined in Q. An r-form is continuous if its components
are continuous functions. The Riemann integral of a continuous r-form τ over an
r-simplex σ is defined as
$$\int_\sigma \tau = \lim_{k \to \infty} \sum_{\sigma_{ki} \in S_k \sigma} \tau(p_{ki}) \cdot \{\sigma_{ki}\},$$
where $S_k\sigma$ is a sequence of simplicial subdivisions $\sigma_{ki}$ of σ with mesh → 0, and
each $p_{ki}$ is a point in $\sigma_{ki}$. The Riemann integral of a continuous r-form over a
polyhedral r-chain $A = \sum a_i \sigma_i$ is defined by $\int_A \tau = \sum a_i \int_{\sigma_i} \tau$, where $\sum a_i \sigma_i$ is
a simplicial subdivision of the polyhedral chain A.
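For r = 1 the Riemann sums in this definition are easy to evaluate. The sketch below is an illustration under assumptions not made in the text: the 1-simplex is subdivided into k equal subsegments, the sample points are the midpoints, and the 1-form is given by its components relative to the standard basis. The function name `integrate_1form` is hypothetical.

```python
def integrate_1form(tau, p, q, k=1000):
    """Riemann sum of a 1-form over the oriented segment from p to q.

    tau(point) returns the covector components at the point. The 1-vector
    {sigma_ki} of each subsegment is (q - p) / k.
    """
    n = len(p)
    step = [(q[j] - p[j]) / k for j in range(n)]
    total = 0.0
    for i in range(k):
        # Sample point: midpoint of the i-th subsegment.
        point = [p[j] + (i + 0.5) * step[j] for j in range(n)]
        covector = tau(point)
        total += sum(covector[j] * step[j] for j in range(n))
    return total


def tau(pt):
    # The 1-form x dy in E^2.
    return (0.0, pt[0])


print(integrate_1form(tau, (0.0, 0.0), (1.0, 1.0)))
print(integrate_1form(tau, (1.0, 1.0), (0.0, 0.0)))
```

Along the diagonal from (0, 0) to (1, 1) the exact value is 1/2, and reversing the orientation of the segment reverses the sign of the integral.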
An r-form in En is bounded and measurable if all its components relative to a
basis of V are bounded and measurable. The Lebesgue integral of an r-form τ over
an r-cell σ is defined by
$$\int_\sigma \tau = \int_\sigma \tau(p) \cdot \frac{\{\sigma\}}{|\sigma|} \, dp,$$
where |σ | is the r-dimensional volume of σ and the integral on the right is a
Lebesgue integral of a real function. This is extended by linearity to domains that
are polyhedral chains by
$$\int_A \tau = \sum_i a_i \int_{\sigma_i} \tau, \qquad \text{if } A = \sum_i a_i \sigma_i.$$
2.5. STOKES ’ THEOREM FOR POLYHEDRAL CHAINS
The exterior derivative of a differentiable r-form τ is an (r + 1)-form dτ defined
by
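$$d\tau(p) \cdot (v_1 \wedge \cdots \wedge v_{r+1}) = \sum_{i=1}^{r+1} (-1)^{i-1} \nabla_{v_i} \tau(p) \cdot (v_1 \wedge \cdots \wedge \widehat{v_i} \wedge \cdots \wedge v_{r+1}),$$
where $\widehat{v_i}$ denotes a vector that has been omitted, and $\nabla_{v_i}$ is a directional derivative
operator. The last definition is represented using coordinates by
$$(d\tau)_{\lambda_1 \ldots \lambda_{r+1}}(p) = \sum_{i=1}^{r+1} (-1)^{i-1} \frac{\partial \tau_{\lambda_1 \ldots \hat\lambda_i \ldots \lambda_{r+1}}}{\partial x^{\lambda_i}}(p).$$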
Stokes’ theorem for polyhedral chains, based on the fundamental theorem of
differential calculus, states that
$$\int_A d\tau = \int_{\partial A} \tau$$
for every differentiable r-form τ and an (r + 1)-polyhedral chain A.
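Stokes' theorem can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: the form is τ = x dy in E^2, for which the coordinate formula gives the constant 2-form dτ = dx ∧ dy, and A is a single oriented triangle (a, b, c) whose boundary is taken as the oriented edges a → b, b → c, c → a. All function names are hypothetical.

```python
def line_integral(tau, p, q, k=200):
    """Midpoint Riemann sum of a 1-form over the segment from p to q in E^2."""
    step = [(q[j] - p[j]) / k for j in range(2)]
    total = 0.0
    for i in range(k):
        point = [p[j] + (i + 0.5) * step[j] for j in range(2)]
        cov = tau(point)
        total += cov[0] * step[0] + cov[1] * step[1]
    return total


def boundary_integral(tau, a, b, c):
    """Integral of tau over the boundary of the oriented triangle (a, b, c)."""
    return (line_integral(tau, a, b)
            + line_integral(tau, b, c)
            + line_integral(tau, c, a))


def oriented_area(a, b, c):
    """Component of the 2-vector {sigma} of the triangle on e_12."""
    return ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / 2.0


def tau(pt):
    # The 1-form x dy.
    return (0.0, pt[0])


d_tau_12 = 1.0   # (d tau)_12 = d(tau_2)/dx - d(tau_1)/dy = 1 for tau = x dy

a, b, c = (0.0, 0.0), (2.0, 1.0), (1.0, 3.0)
lhs = d_tau_12 * oriented_area(a, b, c)   # integral of d tau over A
rhs = boundary_integral(tau, a, b, c)     # integral of tau over the boundary
print(lhs, rhs)
```

Because dτ is constant here, its integral over the triangle is just dτ · {A}; for a non-constant dτ a two-dimensional quadrature would be needed on the left-hand side.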
3. Banach Spaces of Chains
3.1. FLAT CHAINS
The mass of a polyhedral r-chain $A = \sum a_i \sigma_i$ in E^n is defined to be $|A| = \sum |a_i||\sigma_i|$, where $|\sigma_i|$ denotes the r-dimensional volume of $\sigma_i$. Thus, in case
the interiors of the cells $\sigma_i$ of a polyhedral r-chain do not intersect, and if all $a_i = 1$,
then the mass of the polyhedral chain is exactly its r-dimensional volume.
DEFINITION 3.1. The flat norm, |A|♭ , of a polyhedral r-chain A in En is defined
by
|A|♭ = inf{|A − ∂D| + |D|},
using all polyhedral (r + 1)-chains D.
We note that it is not immediate that | · |♭ is indeed a norm. Furthermore, the
actual calculation of the flat norm may be quite complicated even for simple
r-chains. (For example, consider a 1-chain in the plane consisting of two oriented line
segments.) Taking D = 0 above, it is clear that |A|♭ ≤ |A|.
Completing the space Ar (En ) with respect to the flat norm gives a Banach space
denoted by A♭r (En ). That is, A♭r (En ) contains the formal limits of all the sequences
of polyhedral r-chains Ai , such that limi→∞ |Ai+1 − Ai |♭ = 0. Elements of A♭r (En )
which are limits of such sequences are sometimes denoted by lim♭ Ai . We refer to
elements of A♭r (En ) as flat r-chains in En . If there are no intersections between
cells and all coefficients have the value of 1, we identify the flat chain with the set
that contains its points.
EXAMPLE 3.2. Consider the sequence of 1-chains (A_i) in E^2 such that A_i =
L_{1i} + L_{2i}, where L_{1i} and L_{2i} are 1-simplexes associated with two parallel line
segments having the same length L, opposite orientation, and the line segment
corresponding to L_{2i} is obtained from the line segment corresponding to L_{1i} by a
translation of distance d_i perpendicularly to its direction. If we take the rectangle
generated by the two line segments as D_i in the definition of the flat norm, then
|D_i| = L d_i and A_i − ∂D_i consists of the two short sides of the rectangle, of total
mass 2d_i, so it follows that |A_i|♭ ≤ (L + 2)d_i. Thus, if d_i → 0, the sequence (A_i) converges to the zero
chain in the flat norm. On the other hand, in the mass norm we have |A_i − A_{i−1}| =
2L for all i so the sequence does not converge. Roughly speaking, the geometrical
significance of the flat norm is that, unlike the mass norm, it takes into account how
closely the two segments are located.
If we let the length of the line segments shrink also so that for A_i, L = d_i,
then by taking D_i as above we get |A_i|♭ ≤ d_i^2 + 2d_i, while taking D_i = 0 implies
|A_i|♭ ≤ 2d_i, so |A_i|♭ → 0 as d_i → 0.
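The two estimates used in this example can be tabulated. The sketch below is an illustration only: it evaluates the upper bounds obtained from the two choices of D discussed above (the rectangle and D = 0), not the flat norm itself, whose exact value would require an infimum over all polyhedral 2-chains. The function name is hypothetical.

```python
def flat_norm_upper_bound(length, d):
    """Smaller of the two upper bounds for the flat norm of the segment pair.

    length: common length L of the two oppositely oriented segments
    d:      distance between them
    """
    with_rectangle = length * d + 2 * d   # |D| + |A - dD| for the rectangle D
    with_zero = 2 * length                # D = 0: the mass of the pair
    return min(with_rectangle, with_zero)


# Fixed length, shrinking distance: the bound tends to zero with d.
for i in range(1, 6):
    d = 2.0 ** -i
    print(d, flat_norm_upper_bound(1.0, d))

# Length shrinking together with the distance (L = d): the bound is 2d.
for i in range(1, 6):
    d = 2.0 ** -i
    print(d, flat_norm_upper_bound(d, d))
```

For segments that are far apart the choice D = 0 gives the better bound, which reflects the remark that the flat norm never exceeds the mass.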
EXAMPLE 3.3. Consider the “staircase” sequence (B_i) shown in Figure 1. Here,
$$A_j = \sum_{l=1}^{2^{j-1}} A_{jl}$$
is the sum of $2^{j-1}$ oriented 1-squares of size $d_j = 1/2^j$, $B_i = B_0 + \sum_{j=1}^{i} A_j$, and
we take the limit as i → ∞. Set for each square $A_{jl}$ the cell $D_{jl}$ such that $A_{jl} = \partial D_{jl}$. Then, using $D_{jl}$ in the definition of the flat norm we get $|A_{jl}|^\flat \le d_j^2 = 2^{-2j}$.
Hence,
$$|B_i - B_{i-1}|^\flat = |A_i|^\flat \le 2^{i-1}\, 2^{-2i},$$
so the sequence (B_i) converges.
Figure 1. The staircase.
Flat chains may be used to represent continuous and smooth submanifolds of
E^n and even irregular surfaces as shown above. As another example, starting with
a triangle on R^2 one may construct a plane in R^3 by mapping the vertices using
the values at the vertices of a real valued function u on R^2. One may subdivide
the triangle and map the new vertices again using the mapping u to construct a
piecewise flat surface in R^3 approximating the graph of u. This procedure may
be repeated to construct a sequence of 2-chains. If for the function u one uses
a continuous function that is nowhere differentiable one obtains a flat chain that
represents a surface that is not rectifiable.
The Riemann integral of a continuous r-form τ over a flat r-chain A = lim♭ A_i
is defined to be $\int_A \tau = \lim \int_{A_i} \tau$, if the limit exists.
The boundary of a flat (r+1)-chain A = lim♭ Ai , is defined to be ∂A = lim ∂Ai .
The boundary of a flat (r + 1)-chain always exists as a flat r-chain.
3.2. SHARP CHAINS
Whitney obtained chains that are even less regular than the flat chains by introducing a possibly smaller norm. Thus, more Cauchy sequences will converge and one
ends up with a larger completed space.
DEFINITION 3.4. The sharp norm |A|♯ of a polyhedral r-chain $A = \sum a_i \sigma_i$ is
defined by
$$|A|^\sharp = \inf \left\{ \frac{\sum |a_i||\sigma_i||v_i|}{r+1} + \Bigl| \sum a_i \, \mathrm{trans}_{v_i} \sigma_i \Bigr|^\flat \right\},$$
using all vectors v_i ∈ E^n, where trans_v is a translation operator that moves each
point p of σ to p + v, giving a translated cell trans_v σ with the same orientation
as σ.
Clearly, setting all v_i = 0, we conclude that |A|♯ ≤ |A|♭ so the sharp norm
defines a coarser topology. Completing the space A_r(E^n) with respect to the sharp
norm gives a Banach space denoted by A♯r (En ) whose elements are referred to as
sharp chains. It follows that A♭r (En ) is a Banach subspace of A♯r (En ).
EXAMPLE 3.5. Consider again the sequence of pairs of 1-vectors in R^2 of length
d_i situated a distance d_i apart as above. Taking v_1 = 0, and v_2 as the vector such
that trans_{v_2} will cause the two line segments to overlap so |v_2| = d_i, we have
|A_i|♯ ≤ d_i^2/2. Hence, for d_i → 0, the sharp norm of the shrinking pairs tends to
zero faster than the flat norm.
Consider the “staircase strainer” sequence (B_i) constructed in the unit square
as shown in Figure 2. Here, A_j is the sum of $2^{j-1}$ pairs of size $d_j = 1/2^j$, $B_i = B_0 + \sum_{j=1}^{i} A_j$, and we take the limit as i → ∞. For the flat norm we have
$$|B_i - B_{i-1}|^\flat = |A_i|^\flat \approx \frac{2^{i-1} \cdot 2}{2^i} = 1,$$
so the sequence (B_i) does not converge. On the other hand, for the sharp norm
$$|B_i - B_{i-1}|^\sharp = |A_i|^\sharp \le \frac{2^{i-1} (1/2^i)^2}{2} = \frac{2^{-i}}{4},$$
and the sequence converges. Thus, we will be able to calculate the total flux through
the staircase strainer limit. (The extensive property under consideration may flow
through the strainer B_i at the horizontal segments only.)
Similarly, the “staircase mixer” sequence shown in Figure 3 converges in the
sharp norm but not in the flat norm.
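The different behavior of the two norms on the staircase strainer can be seen by tabulating the estimates for the increments. The sketch below is an illustration only: it evaluates the expressions quoted above for the i-th increment (the flat estimate 2^{i−1} · 2/2^i and the sharp bound 2^{i−1}(1/2^i)^2/2), not the norms themselves. The function names are hypothetical.

```python
def flat_increment(i):
    """Estimate of the flat norm of A_i: 2**(i-1) pairs, each of order 2 * d_i."""
    return 2 ** (i - 1) * 2 / 2 ** i


def sharp_increment_bound(i):
    """Bound on the sharp norm of A_i: 2**(i-1) pairs, each at most d_i**2 / 2."""
    return 2 ** (i - 1) * (1 / 2 ** i) ** 2 / 2


flat_total = 0.0
sharp_total = 0.0
for i in range(1, 21):
    flat_total += flat_increment(i)
    sharp_total += sharp_increment_bound(i)
    print(i, flat_increment(i), sharp_increment_bound(i))

# The flat estimates add up without bound, while the sharp bounds form a
# geometric series whose partial sums stay below 1/4.
print(flat_total, sharp_total)
```

Summability of the sharp bounds is what makes (B_i) a Cauchy sequence in the sharp norm, while the flat increments do not tend to zero.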
Figure 2. The staircase strainer.
Figure 3. The staircase mixer.
Roughly speaking, the difference in behavior between the flat norm and the
sharp norm may be described as follows. Consider a sequence (A_i) of shrinking
r-polyhedral chains of typical size s_i → 0. If A_i is the boundary ∂B_i of a shrinking
(r + 1)-chain B_i, then taking D = B_i in the definition of the flat norm, |A_i|♭ shrinks
like $s_i^{r+1}$. If A_i cannot be represented as the boundary of an (r + 1)-chain, the
r-dimensional mass of some subset of A_i will always be present in the definition
of the flat norm and hence |A_i|♭ will shrink like $s_i^r$ only. On the other hand, for the
sharp norm, if one can cancel the flat norm of a chain by translating simplexes by
vectors of the same order of magnitude as s_i, then the price to pay in the definition
of the sharp norm is bounded by $s_i^{r+1}$ whether A_i is the boundary of another chain
or not.
The Riemann integral of a continuous r-form τ over a sharp r-chain A = lim♯ A_i
is defined to be $\int_A \tau = \lim \int_{A_i} \tau$, if the limit exists.
It is noted that being less regular than a flat chain, the boundary of a sharp chain
need not exist as a sharp chain.
3.3. NATURAL CHAINS
A basic notion of Harrison’s constructions is that of a dipole. A simple r-dimensional 0-dipole is an r-simplex $\sigma^0$ whose diameter $\mathrm{diam}(\sigma^0) \le 1$. A simple
r-dimensional 1-dipole is a chain of the form $\sigma^1 = \sigma^0 - \mathrm{trans}_{v_1} \sigma^0$ for a vector $v_1$, such that $|v_1| \le 1$, and $\mathrm{trans}_{v_1} \sigma^0$ is disjoint from $\sigma^0$. Inductively, a simple
r-dimensional j-dipole is an r-chain of the form $\sigma^j = \sigma^{j-1} - \mathrm{trans}_{v_j} \sigma^{j-1}$, where
$\sigma^{j-1}$ is a simple r-dimensional (j − 1)-dipole, and $v_j$ is a vector with $|v_j| \le 1$ such
that $\mathrm{trans}_{v_j} \sigma^{j-1}$ is disjoint from $\sigma^{j-1}$. A simple j-dipole is therefore determined
by the simplex $\sigma^0$ and the vectors $v_1, \ldots, v_j$. A j-dipole is a simplicial chain
$$D^j = \sum_i a_i \sigma_i^j$$
of simple j-dipoles.
Given a simple j-dipole $\sigma^j$ constructed by the simplex $\sigma^0$ and vectors $v_1, \ldots, v_j$, its j-dipole mass is defined by
$$|\sigma^j|_j = |\sigma^0||v_1| \cdots |v_j|$$
($|\sigma^0|$ is the mass of $\sigma^0$). The j-dipole mass of the j-dipole $D^j = \sum_i a_i \sigma_i^j$ is
defined as
$$|D^j|_j = \sum_i |a_i||\sigma_i^j|_j.$$
Using the notion of a dipole and the dipole mass, the k-natural norm, k =
1, 2, . . . , on the space of polyhedral chains is defined by
$$|A|^{\natural_k} = \inf \left\{ \sum_{s=0}^{k} |D^s|_s + |C|^{\natural_{k-1}} \right\},$$
where the infimum is taken over all decompositions of A in the form $A = \sum_{s=0}^{k} D^s + \partial C$, for s-dipoles $D^s$. The Banach space one obtains by completing the space
of polyhedral chains relative to this norm is denoted by $A_r^k$ and its elements are
referred to as k-natural r-chains. Clearly, the 0-natural norm is equivalent to the
flat norm. Harrison also defines norms associated with fractional values of r that
are related to the Hölder conditions but we omit the discussion of such chains here.
As k increases, the spaces of natural chains become larger, i.e., $A_r^k$ is a Banach subspace of $A_r^l$ for k < l. For increasing values of k these spaces contain
increasingly irregular chains. For example, various fractals are natural chains, and
the kth distributional derivative of the Dirac measure on the real line belongs to
A1k+1 (see [8]).
For a k-natural r-chain A, let τ be a form on A that has k−1 bounded derivatives
and whose kth derivative is Lipschitz. The Riemann integral of τ over a natural
r-chain A = lim♮ A_i is defined to be $\int_A \tau = \lim \int_{A_i} \tau$. Indeed, Harrison shows
that the limit always exists as integrals over polyhedral chains are bounded by the
natural norms of the chains.
A clear advantage of using the natural norms in comparison with the sharp
norm is the behavior of the natural chains under the boundary operator: the boundary operator of polyhedral chains extends to a continuous linear operator
$$\partial : A_r^k \to A_{r-1}^{k-1}.$$
4. Cochains
Cochains are elements of the dual spaces to the Banach spaces of flat, sharp, and
natural chains. The basic idea of the application of Whitney’s abstract integration
theory to the analysis of Cauchy fluxes is that cochains in the various dual spaces
are abstract counterparts of total fluxes. Specifically, for classical continuum mechanics we regard the total flux TA of a certain extensive property P through a
2-dimensional domain A in E^3 as the action T · A of a 2-cochain T on the 2-chain A
associated with the domain. For the sake of simplicity of the notation we used here
the same notation for both the domain and the representing chain. It is noted that
chains contain more information than just the domain where they are supported.
For example, any continuous function defined on a submanifold of the Euclidean
space may be represented as a chain. Obviously, the coefficients for the simplexes
that make up the chain will be different than 1 and will represent the values of the
function. In such a case, if we interpret the value of the function as a component
of a velocity field, the action of a cochain on the chain may be interpreted as the
calculation of power. Thus, geometric integration theory combines the classical
approaches to flux theory and the variational weak approach. An immediate benefit
of using geometric integration theory is that the analysis holds for r-chains in E^n
for all values of r ≤ n.
The properties of cochains that make them suitable mathematical models for
Cauchy fluxes follow firstly from the linearity of their action on chains which is
common to all Banach spaces considered above. Linearity of the action of cochains
implies both the additivity and the action–interaction–antisymmetry properties assumed in various formulations of continuum mechanics. For example, given a
cochain T , we have T · (−A) = −T · A. Secondly, the properties of the various
cochains are determined by the continuity of their action on chains which is directly
linked to the norm on the respective space of chains. Basic observations regarding
the relations between the various norms and the properties we expect fluxes to have
will be described below.
4.1. FLAT COCHAINS
Flat r-cochains in En are the elements of A♭r (En )∗ , the dual space of A♭r (En ).
We will see next how the topology induced by the flat norm is related to traditional assumptions of Cauchy flux theory. We recall that in various formulations of
Cauchy’s flux theory it is assumed that the total flux is bounded by both the volume
and area of the corresponding region. That is, there are positive numbers N1 and
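N2 such that for every region A,
$$|T_{\partial A}| \le N_2 |\partial A|, \qquad |T_{\partial A}| \le N_1 |A|,$$
where we use the mass norm to denote both the area and volume of the respective
sets. In terms of a cochain T these boundedness conditions will be written as
$$|T \cdot A| \le N_2 |A|, \qquad |T \cdot \partial D| \le N_1 |D|,$$
for any r-chain A and an (r + 1)-chain D. Thus,
$$\begin{aligned} |T \cdot A| &= |T \cdot A - T \cdot \partial D + T \cdot \partial D| \\ &\le |T \cdot A - T \cdot \partial D| + |T \cdot \partial D| \\ &\le N_2 |A - \partial D| + N_1 |D| \\ &\le C_T (|A - \partial D| + |D|), \end{aligned}$$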
where CT is the least upper bound of all positive numbers satisfying this relation
for all (r + 1)-chains D. The basic idea is to look at this relation as a requirement
of continuity, |T · A| ≤ C_T ‖A‖, for the linear operator T. Since D is arbitrary it is
natural to set then
$$|A|^\flat = \inf_D \{|A - \partial D| + |D|\}.$$
It follows that the flat norm is the smallest of all norms that make the flux operators
satisfying the boundedness condition continuous. As such, upon the completion of
the space of polyhedral chains with respect to the flat norm, we obtain the largest
Banach space for which the bounded flux operators are continuous. This means
that flat chains are the most general geometrical objects for which the action of
bounded flux operators is continuous.
Conversely, if we consider norms | · |_x on the space of polyhedral chains and
wish to consider the action of a continuous flux functional T, then |T · A| ≤
C_T |A|_x. If one requires that |A|_x ≤ |A| and |∂D|_x ≤ |D|, for any r-chain A and
(r + 1)-chain D, then the boundedness conditions are implied by continuity because
$$|T \cdot A| \le C_T |A|_x \le C_T |A|,$$
and
$$|T \cdot \partial D| \le C_T |\partial D|_x \le C_T |D|.$$
In order to admit the most general flux operators that satisfy these conditions we
need the largest norm such that |A|_x ≤ |A| and |∂D|_x ≤ |D|. Indeed it can be
shown that the flat norm is the largest norm satisfying these two conditions.
4.2. SHARP COCHAINS
Sharp r-cochains in $E^n$ are elements of $A^\sharp_r(E^n)^*$, the dual space of $A^\sharp_r(E^n)$.
Since flat chains form a Banach subspace of A♯r (En ), every sharp cochain may be
restricted to flat chains. In other words, any sharp cochain is also flat.
The additional property of sharp cochains that distinguishes them from flat
cochains is the boundedness under translation. Given a sharp cochain T , consider
for an r-cell σ and a vector v, the difference in the flux due to the translation by v,
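i.e., $|T \cdot \sigma - T \cdot \mathrm{trans}_v \sigma|$. The continuity of $T$ implies that
$$|T \cdot \sigma - T \cdot \mathrm{trans}_v \sigma| \le C_T |\sigma - \mathrm{trans}_v \sigma|^\sharp \le C_T \frac{|\sigma||v|}{r + 1},$$
by choosing $v_1 = 0$ and $v_2 = -v$ in the definition of the sharp norm. Thus,
continuity implies that there is a positive $N_3$ such that
$$|T \cdot \sigma - T \cdot \mathrm{trans}_v \sigma| \le N_3 |\sigma||v|.$$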
In particular, the difference tends to zero as the magnitude of v tends to zero. Clearly, this
imposes a regularity restriction on sharp cochains. In analogy with flat chains, the
sharp norm is the smallest of all norms for which all the flux operators satisfying the
earlier boundedness conditions and boundedness under translation are continuous.
Hence, in comparison with all other norms, it allows more elements to be added to
the space of polyhedral chains in the process of completion.
Conversely, if we consider norms $|\cdot|_x$ on the space of polyhedral chains and
wish to consider the action of a continuous flux functional $T$, then $|T \cdot A| \le C_T |A|_x$. If one requires that $|A|_x \le |A|$, $|\partial D|_x \le |D|$ and $|\sigma - \mathrm{trans}_v \sigma|_x \le |\sigma||v|$,
for every r-chain $A$, (r + 1)-chain $D$, r-cell $\sigma$, and vector $v$, then the boundedness
conditions are implied by continuity. For example, boundedness under translation
is implied by
$$|T \cdot \sigma - T \cdot \mathrm{trans}_v \sigma| \le C_T |\sigma - \mathrm{trans}_v \sigma|_x \le C_T |\sigma||v|.$$
In order to admit the most general flux operators that satisfy these three conditions
we need the largest norm satisfying them. Indeed it can be shown that the sharp
norm is the largest norm satisfying the conditions.
4.3. NATURAL COCHAINS
A k-natural r-cochain is an element of $A^{k*}_r$, the dual space of $A^k_r$. Since the natural
norms $|\cdot|^\natural_k$ are smaller than the flat norm for k > 0, all natural cochains are flat
cochains. In fact, we will see later that natural cochains are very regular. It is a
basic guiding principle in geometric integration that as chains become increasingly
irregular the cochains become increasingly regular.
5. Representation of Cochains, the Isomorphism Theorem and Fluxes
5.1. THE CAUCHY MAPPING
The Cauchy mapping of r-directions induced by an r-cochain is completely analogous to the mapping that gives the dependence of the flux density on the unit
normal in classical continuum mechanics, hence the terminology we use. Let the
r-direction α of an r-cell σ be the r-vector {σ}/|σ|. The Cauchy mapping $D_T$, associated with the cochain $T$, is defined to be the function of points and r-directions
such that
$$D_T(p, \alpha) = \lim_{i \to \infty} T \cdot \frac{\sigma_i}{|\sigma_i|},$$
where $\sigma_i$ is a sequence of r-cells containing p with r-direction α such that
$$\lim_{i \to \infty} \operatorname{diam}(\sigma_i) = 0.$$
As the r-direction α is the analog of the unit normal n used in continuum
mechanics, the analog of Cauchy’s flux theorem will be the assertion that the
restriction of the Cauchy mapping to each point p may be extended to a linear
mapping of r-vectors. In other words, DT is a form in En .
5.2. THE REPRESENTATION THEOREM FOR SHARP FLUXES
The analog to Cauchy’s flux theorem in Whitney’s geometric integration theory for
sharp cochains states the following.
PROPOSITION 5.1. For each sharp r-cochain $T$, the Cauchy mapping $D_T$ may
be extended to a unique r-form that represents $T$ by
$$T \cdot A = \int_A D_T,$$
for every polyhedral chain $A$.
Clearly, the proposition defines the integral of a form over a sharp chain by
continuity.
Whitney’s theory determines exactly the forms that represent sharp cochains –
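the sharp forms. Firstly, the norm $|\cdot|_0$ is defined on $V^r$ by
$$|\tau|_0 = \sup \{\, |\tau \cdot \alpha| : \alpha \text{ simple},\ |\alpha| = 1 \,\}.$$
The sharp norm of the form τ is defined by
$$|\tau|^\sharp = \sup_{p, q \in E^n} \left\{ |\tau(p)|_0,\ (r + 1) \frac{|\tau(q) - \tau(p)|_0}{|q - p|} \right\}.$$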
Then, a sharp form is defined to be a form whose sharp norm is finite. Thus,
sharp forms are bounded Lipschitz forms. Using the norm topology on the space
of cochains where
$$|T|^\sharp = \sup_{|A|^\sharp = 1} |T \cdot A|,$$
it can be shown that the previous proposition defines an isomorphism of the Banach
space of sharp cochains and the Banach space of sharp forms.
5.3. REPRESENTATION OF FLAT COCHAINS
While sharp r-cochains are regular enough to be represented uniquely by sharp
r-forms, flat r-cochains are less regular, and each flat r-cochain is represented by
an equivalence class of r-forms which satisfy certain regularity conditions. Sharp
forms representing sharp cochains are continuous and Riemann integration may be
used. The representation of flat cochains by forms uses the analogous Lebesgue
integration.
The Lebesgue integral of an r-form over a flat r-chain $A = \lim^\flat A_i$ is defined
by
$$\int_A \tau = \lim \int_{A_i} \tau$$
if the limit exists. (The integrals on the right-hand side are Lebesgue integrals on
polyhedral chains defined earlier.)
The analysis of the representation of flat cochains by forms requires more attention than the sharp counterpart. For example, in the definition of the Cauchy
mapping
$$D_T(p, \alpha) = \lim_{i \to \infty} T \cdot \frac{\sigma_i}{|\sigma_i|},$$
it is required that in the converging sequence $(\sigma_i)$, each of the simplexes will contain p as a vertex. It turns out that for each r-direction α, $D_T(p, \alpha)$ is defined
almost everywhere. Wolfe's representation theorem for flat cochains, as formulated by Whitney [23,
p. 261], states the following.
PROPOSITION 5.2. Let T be a flat r-cochain in an open set R ⊂ En . Then, there
is a set Q ⊂ R, with |R−Q| = 0, such that for each p ∈ Q, DT (p, α) is defined for
all r-directions α, and is extendable to all r-vectors, giving an r-covector DT (p).
The r-form DT is bounded and measurable in R. For any r-simplex σ in R, DT is
a measurable r-form relative to the plane of σ and
$$T \cdot \sigma = \int_\sigma D_T.$$
In fact, one can describe exactly the flat forms – those forms that represent
flat cochains. The exact conditions such that any flat r-form is associated with a
unique flat r-cochain use the notions of Q-good simplexes and association of a
form with a flat cochain. In order to avoid the technical details and since we are
mainly interested in the existence of the representing forms, these notions will not
be presented here (see [23, pp. 263–266] for the details). As one would expect, a
flat cochain is associated with an equivalence class of forms under equality almost
everywhere. The quotient space of flat forms obtained by identifying forms that
are equal almost everywhere together with an appropriate norm, the flat norm of
forms, is isomorphic to the space of flat cochains.
5.4. REPRESENTATION OF NATURAL COCHAINS
As mentioned earlier, natural cochains are regular. Harrison's representation theorem states that for k > 0 every k-natural cochain $T$ is represented by a unique
differential form $D_T$ as
$$T \cdot A = \int_A D_T,$$
where the first k derivatives of $D_T$ are bounded and the kth derivative is Lipschitz. In fact, this relation defines an isomorphism of the space of k-natural
cochains and the space of differential forms having this degree of smoothness
(equipped with the suitable $C^{k,\mathrm{Lip}}$-norm).
6. Coboundaries and Differential Balance Equations
Coboundaries generalize exterior differentiation and their definition is purely algebraic. The coboundary dT of an r-cochain T is the (r + 1)-cochain defined
by
dT · A = T · ∂A,
i.e., it is the dual of the boundary operator for chains. As ∂(∂A) = 0, one has
d(dT ) = 0. The basic result concerning coboundaries is that the coboundary of a
flat cochain is flat and the same holds for the coboundary of a sharp cochain. This
implies a very general formulation of the balance equation. For a cochain T that is
either sharp or flat, the coboundary exists as a flat cochain and we may define an
(r +1)-cochain S, satisfying dT +S = 0, so the balance equation S ·A+T ·∂A = 0
holds. Here, S is interpreted as the cochain giving the rate of change of total amount
of the property P in the flat (r + 1)-chain A (assuming there is no source term).
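The algebraic definition dT · A = T · ∂A can be sketched on the smallest possible polyhedral example. The following Python fragment is a minimal illustration, assuming a single oriented triangle with hand-written boundary matrices; the names `BOUNDARY_1`, `BOUNDARY_2`, `apply`, `pair` and `coboundary` are hypothetical helpers and not part of the theory above.

```python
# One oriented triangle with vertices 0, 1, 2.
# Edges: e0 = (0,1), e1 = (1,2), e2 = (0,2); face f = (0,1,2).
BOUNDARY_1 = [  # rows: vertices, columns: edges; boundary of (a,b) is b - a
    [-1, 0, -1],
    [1, -1, 0],
    [0, 1, 1],
]
BOUNDARY_2 = [[1], [1], [-1]]  # rows: edges, column: face; boundary f = e0 + e1 - e2


def apply(matrix, chain):
    """Boundary of a chain given by its coefficient list."""
    return [sum(m * c for m, c in zip(row, chain)) for row in matrix]


def pair(cochain, chain):
    """The pairing T . A of a cochain with a chain."""
    return sum(t * a for t, a in zip(cochain, chain))


def coboundary(cochain, boundary):
    """dT, defined by dT . A = T . (boundary A): the transposed boundary matrix."""
    rows, cols = len(boundary), len(boundary[0])
    return [sum(cochain[i] * boundary[i][j] for i in range(rows)) for j in range(cols)]


T = [2, 5, 11]   # a 0-cochain: one value per vertex
A = [1, 0, 2]    # a 1-chain: e0 + 2*e2
dT = coboundary(T, BOUNDARY_1)

print(pair(dT, A), pair(T, apply(BOUNDARY_1, A)))  # both sides of dT . A = T . dA
print(coboundary(dT, BOUNDARY_2))                  # d(dT) on the face
```

Because the coboundary is simply the transpose of the boundary, the identity d(dT) = 0 is inherited from ∂(∂A) = 0 without any differentiability assumption.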
If the form DT representing the cochain T is differentiable, then the flat form
DdT representing dT is given as the exterior derivative of DT as one would expect,
i.e.,
DdT = dDT .
Thus, using τ for $D_T$, the abstract balance equation above assumes the form
$$d\tau + b = 0, \qquad \int_A b + \int_{\partial A} \tau = 0.$$
In the more general case where τ is an arbitrary flat form representing the flat
cochain T , dT is a flat cochain and hence it may be represented by any flat form
d0 τ in the equivalence class of DdT . Thus, one may write the “differential” balance
in the general situation of flat cochains. In fact,
$$|T|^\flat = \sup_p \{\, |D_T(p)|,\ |D_{dT}(p)| \,\}.$$
The right-hand side of this identity is the flat norm of the form DT .
In the particular case where T is a sharp cochain represented by the sharp
form τ = DT , the functions giving the components of τ are Lipschitz mappings,
hence, it has an analytic exterior derivative dτ as in Section 2.5 almost everywhere.
Furthermore, it turns out that d0 τ = dτ almost everywhere.
6.1. COBOUNDARIES FOR NATURAL COCHAINS
The fact that the boundary operator is continuous for natural chains allows the
definition of the coboundary operator as the dual of the boundary operator. For
natural cochains, one has DdT = dDT . It is noted that for natural cochains one
may use a geometric definition of the exterior derivative as follows. Let p be a
point and α an r-direction; then, taking a decreasing sequence of r-simplexes $(\sigma_i)$
containing p, all of which are in the direction of α,
$$d\tau(p, \alpha) = \lim_{|\sigma_i| \to 0} \frac{\int_{\partial \sigma_i} \tau}{|\sigma_i|}.$$
Thus, the balance equation holds pointwise.
Acknowledgements
The research leading to this paper was partially supported by a Kreitman Doctoral
Fellowship to G. Rodnay and by the Paul Ivanier Center for Robotics Research and
Production Management at Ben-Gurion University.
References
1. M. Degiovanni, A. Marzocchi and A. Musesti, Cauchy fluxes associated with tensor fields having divergence measure. Arch. Rational Mech. Anal. 147 (1999) 197–223.
2. H. Federer, Geometric Measure Theory. Springer, New York (1969).
3. R.L. Fosdick and E.G. Virga, A variational proof of the stress theorem of Cauchy. Arch. Rational Mech. Anal. 105 (1989) 95–103.
4. M.E. Gurtin and L.C. Martins, Cauchy's theorem in classical physics. Arch. Rational Mech. Anal. 60 (1975) 305–324.
5. M.E. Gurtin and W.O. Williams, An axiomatic foundation for continuum thermodynamics. Arch. Rational Mech. Anal. 26 (1967) 83–117.
6. M.E. Gurtin, W.O. Williams and W.P. Ziemer, Geometric measure theory and the axioms of continuum thermodynamics. Arch. Rational Mech. Anal. 92 (1986) 1–22.
7. J. Harrison, Stokes' theorem for nonsmooth chains. Bull. Amer. Math. Soc. 29(2) (1993) 235–242.
8. J. Harrison, Continuity of the integral as a function of the domain. J. Geometric Anal. 8(5) (1998) 769–795.
9. J. Harrison, Isomorphisms of differential forms and cochains. J. Geometric Anal. 8(5) (1998) 797–807.
10. J. Harrison, Flux across nonsmooth boundaries and fractal Gauss/Green/Stokes theorems. J. Phys. A 32(28) (1999) 5317–5327.
11. W. Noll, The foundations of classical mechanics in light of recent advances in continuum mechanics. In: The Axiomatic Method, with Special Reference to Geometry and Physics (Symposium at Berkeley, 1957). North-Holland, Amsterdam (1959) pp. 265–281.
12. W. Noll, Lectures on the foundations of continuum mechanics and thermodynamics. Arch. Rational Mech. Anal. 52 (1973) 61–92.
13. W. Noll, Continuum mechanics and geometric integration theory. In: Categories in Continuum Physics, Lecture Notes in Mathematics, Vol. 1174. Springer, New York (1986) pp. 17–29.
14. W. Noll and E.G. Virga, Fit regions and functions of bounded variation. Arch. Rational Mech. Anal. 102 (1988) 1–21.
15. R. Segev, Forces and the existence of stresses in invariant continuum mechanics. J. Math. Phys. 27(1) (1986) 163–170.
16. R. Segev, The geometry of Cauchy fluxes. Arch. Rational Mech. Anal. 154 (2000) 183–198.
17. R. Segev, A correction of an inconsistency in my paper 'Cauchy's theorem on manifolds'. J. Elasticity 63 (2002) 55–59.
18. R. Segev and G. de Botton, On the consistency conditions for force systems. Internat. J. Nonlinear Mech. 26(1) (1991) 47–59.
19. R. Segev and G. Rodnay, Cauchy's theorem on manifolds. J. Elasticity 56 (1999) 129–144.
20. M. Šilhavý, The existence of the flux vector and the divergence theorem for general Cauchy fluxes. Arch. Rational Mech. Anal. 90 (1985) 195–212.
21. M. Šilhavý, Cauchy's stress theorem and tensor fields with divergences in $L^p$. Arch. Rational Mech. Anal. 116 (1991) 223–255.
22. H. Whitney, Algebraic topology and integration theory. Proc. National Acad. Sci. 33 (1947) 1–6.
23. H. Whitney, Geometric Integration Theory. Princeton Univ. Press, Princeton, NJ (1957).
24. J.H. Wolfe, Tensor fields associated with Lipschitz cochains, PhD Thesis, Harvard (1948).
A Comparison of the Response of Isotropic
Inhomogeneous Elastic Cylindrical and Spherical
Shells and Their Homogenized Counterparts ⋆
U. SARAVANAN and K.R. RAJAGOPAL
Department of Mechanical Engineering, Texas A&M University, U.S.A.
E-mail: saran@tamu.edu, krajagopal@mengr.tamu.edu
Received 28 August 2002; in revised form 17 February 2003
Abstract. All real bodies are inhomogeneous, though in many such bodies the inhomogeneity is
“mild” in that the response of the bodies can be “approximated” well by the response of a homogeneous approximation. In this study we explore the status of such approximations when one is
concerned with bodies whose response is nonlinear. We find that significant departures in response
can occur between that of a “mildly” inhomogeneous body and its homogeneous approximation
(if the approximate model is restricted to a certain class), both quantitatively and qualitatively. We
illustrate this fact within the context of a specific boundary value problem, the inflation of an inhomogeneous spherical shell. We also discuss the inappropriateness of homogenization procedures that
lead to a homogenized stored energy for the body when in fact what is required is a homogenized
model that predicts the appropriate stresses as they invariably determine the failure or integrity of the
body.
Mathematics Subject Classifications (2000): 74B20, 74Q15, 74Q20.
Key words: homogenization, inhomogeneous body, stored energy, isotropy, inflation, spherical shell.
Dedicated to the memory of Clifford Truesdell
1. Introduction
Nonlinear elasticity owes a great debt to the writings and research of Truesdell and
those inspired by him. Much of this effort, though not all, in nonlinear elasticity
is confined to understanding and describing the response of homogeneous bodies.
Many bodies that are patently inhomogeneous are usually approximated by constitutive response functions for homogeneous bodies. In this short paper we are
concerned with the error such an approximation entails.
⋆ We thank the National Institutes of Health and the National Science Foundation for the support
of this work.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 721–749.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
All real bodies are inhomogeneous; however, in many bodies the departure from
homogeneity is ignorable while in others it is not. Biological (see [1]) and geological bodies, as well as composites (see [2]) are usually such that the inhomogeneity
cannot be neglected. In order to render the response of inhomogeneous bodies
amenable to analysis they are oftentimes approximated as homogeneous bodies,
provided that the material properties depart but slightly from a meaningful average
value. Even amongst the class of bodies whose properties vary slightly from a
mean, it is important to determine those classes of inhomogeneous bodies that
can be well approximated by homogeneous bodies and those classes that cannot.
While such an approximation seems appropriate for a reasonably wide class of
inhomogeneous bodies, it transpires that it is grossly deficient in characterizing
many classes of inhomogeneous bodies, even when their properties vary mildly
from an average value.
There has been little work with regard to finite deformations of inhomogeneous
solids and the little that there is concerns homogenization of nonlinear elastic
solids that have a stored energy that is polyconvex with the emphasis being the
determination of bounds for the stored energy. A popular model in biomechanics,
defined through (12) does not have a stored energy that is polyconvex. The model
(12) seems to fit the data well (of course, it is possible that a polyconvex model
might also fit the data reasonably well).
We also wish to caution those involved in data reduction based on a
homogeneous model for a body that is supposedly "mildly" inhomogeneous: totally
different values may be ascribed to the material moduli on the basis of different
experiments (we discuss this aspect in some detail later).
A body (B) is said to be materially uniform if for any two points $P_1, P_2 \in B$,
there exist placers $\kappa_1$ and $\kappa_2$ and neighborhoods $N_{X_1}$ of $X_1 = \kappa_1(P_1)$ and $N_{X_2}$ of
$X_2 = \kappa_2(P_2)$ such that the mechanical response of these neighborhoods is indistinguishable. If there exists a single placer κ such that the response of all X belonging
to κ(B) is indistinguishable, the body is said to be homogeneous. A body that is
not homogeneous is said to be inhomogeneous.
Let γ denote a material parameter. We can define a mean value for the parameter, in the configuration κR (B), through
$$\gamma_{\mathrm{mean}} = \frac{\int_{V(\kappa_R(B))} \gamma(X)\, dV}{V(\kappa_R(B))}, \tag{1}$$
where V (κR (B)) denotes the volume of the configuration κR (B). Now, it is reasonable to ask, when we can treat an inhomogeneous body in which a property
varies over the body as a homogeneous body, with the material parameter having
a constant value γmean in κR (B). Here we suppose the mechanical response is
determined by the single parameter, γ . Otherwise we will have to consider several parameters γi with all of them being approximated by a mean value. If the
approximation is to make sense, then the response of the inhomogeneous body has
to be close to the response of the homogeneous approximation in some sense.
A related question of importance is whether the γmean defined via equation (1)
has any relevance to an experimentally inferred constant γexp through data reduction that presumes the body as homogeneous and belonging to the same type
as the inhomogeneous body (e.g., an inhomogeneous neo-Hookean body being
approximated by a homogeneous neo-Hookean body).
Suppose we have an inhomogeneous body that comprises homogeneous subparts whose stored energy functions belong to a certain class, say neo-Hookean, with
the different subparts having different material moduli. In
general, a homogeneous approximation need not be a body of the same type, i.e.,
in the above example neo-Hookean. However, if a body consists of various pieces
of the same type with slightly differing values for the material moduli, we would
be tempted to believe that it could be modelled by a homogeneous body belonging
to the same class. Moreover, this is what is usually done (see [3]) though there are
a few studies which do obtain a homogenized body belonging to a different class.
As we observed earlier, it is possible that the homogenized model of an inhomogeneous body, each of its parts belonging to a certain class, need not be of the
same class, i.e., the homogenization of a body comprised of different homogeneous
neo-Hookean solids need not lead to a neo-Hookean body. But this depends on the
homogenization procedure. Currently, the few homogenization studies pertaining
to nonlinear elastic solids aim towards obtaining a homogenized stored energy.
Irrespective of whether the homogenized body belongs to the same class or to
another class, if the homogenization is based on energy considerations, the results
are highly unsatisfactory from the point of view of applications, for the following
reasons. Such a homogenization ill-serves a person interested in the failure of the
body as failure is invariably determined by the stress in the body and not the stored
energy at a point or for that matter the stored energy of a neighborhood of the
point. Stresses are related to the derivatives of the stored energy with respect to the
deformation gradient and thus having a homogeneous approximation of the stored
energy does not in general provide a good approximation for the stresses. Even
less useful are the studies concerning homogenization procedures for nonlinear
elastic solids that obtain bounds for the stored energy as these bounds need not be
tight and even if tight, they serve little useful purpose. The above points cannot be
overemphasized.
Another question that bears investigation is whether the data reduction that
presumes that the model is homogeneous leads to the same value $\gamma_{\exp}$ in different
experiments, or at least are the values $\gamma^1_{\exp}$ and $\gamma^2_{\exp}$ obtained from two experiments
close. In a recent study on the inflation, extension, torsion and shearing of a right
circular isotropic inhomogeneous cylinder, Saravanan and Rajagopal [4] show that
the answer to both the above questions is negative, i.e., $\gamma_{\mathrm{mean}}$ may not bear a close
correlation to $\gamma_{\exp}$ and $\gamma^1_{\exp}$ may not be close to $\gamma^2_{\exp}$.
We show that different but slight variations of the property, say a piecewise
constant variation, a linear variation, a sinusoidal variation, etc., all of which have
the same mean value lead to responses that are markedly different, even for global
measures of the response. When we focus our attention on local measures such
as stresses, even a 5% variation about a mean for the material moduli could lead
to solutions for the inhomogeneous and the approximate homogeneous body to
differ by several hundred percent! Even more disturbing is the fact that the sense
of the stress for the two cases, at a given location, could be different, i.e., while one
predicts a compressive stress, the other could predict a tensile stress or vice versa.
Of course, all this depends on the class of stored energy being considered.
It is worth noting that in the process of establishing the main thesis of the paper,
an important boundary value problem, the inflation of a sector of a spherical shell,
is solved for a variety of inhomogeneous bodies.
The arrangement of this paper is as follows. After a brief review of the relevant
kinematics in Section 2 we introduce the different types of bodies (stored energies)
that we will be considering in Section 3. We discuss the various types of inhomogeneities in Section 4 and develop the governing equation for the inflation of a
sector of spherical shell in Section 5. We follow this with a discussion of the issues
concerning parameter estimation from experiments, and conclude by presenting a
few interesting results on the stress distribution across the thickness of the shell.
2. Kinematics
Let X ∈ κR (B) denote a typical particle belonging to the reference configuration
κR (B) of the body, and let x ∈ κt (B) denote the position occupied by X at time t
in the configuration κt (B). The motion of the body is defined through the mapping
χκR that is one to one for each t ∈ R:
x = χκR (X, t).
(2)
We shall assume that the motion is sufficiently smooth to render all the derivatives
that follow meaningful. The deformation gradient, FκR is defined through
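$$F_{\kappa_R} = \frac{\partial \chi_{\kappa_R}}{\partial X}, \tag{3}$$
and the Cauchy–Green stretch tensors, $B_{\kappa_R}$ and $C_{\kappa_R}$ are defined through
$$B_{\kappa_R} = F_{\kappa_R} F^T_{\kappa_R}, \tag{4}$$
$$C_{\kappa_R} = F^T_{\kappa_R} F_{\kappa_R}. \tag{5}$$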
The principal invariants of any second order tensor A are defined through
$$I_1 = \operatorname{tr} A, \qquad I_2 = \tfrac{1}{2}\left[(\operatorname{tr} A)^2 - \operatorname{tr} A^2\right], \qquad I_3 = \det A. \tag{6}$$
These kinematical quantities are sufficient for our purpose.
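The definitions in (6) can be checked numerically. The sketch below is a minimal pure-Python illustration, assuming 3×3 tensors stored as nested lists; the helper names are hypothetical. As a test case it uses a diagonal tensor with entries $\lambda^{-4}, \lambda^2, \lambda^2$, which is the form the tensor B takes for the radial deformation considered later in the paper when $\lambda = r/R$; the value λ = 1.2 is an arbitrary illustrative choice.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def principal_invariants(a):
    """I1, I2, I3 of a second order tensor, following equation (6)."""
    i1 = trace(a)
    i2 = 0.5 * (i1 ** 2 - trace(matmul(a, a)))
    i3 = det(a)
    return i1, i2, i3


lam = 1.2  # illustrative stretch r/R (assumed value)
B = [[lam ** -4, 0.0, 0.0],
     [0.0, lam ** 2, 0.0],
     [0.0, 0.0, lam ** 2]]
I1, I2, I3 = principal_invariants(B)
print(I1, I2, I3)
```

For this diagonal tensor the results can be compared with the closed forms $I_1 = \lambda^{-4} + 2\lambda^2$, $I_2 = \lambda^4 + 2\lambda^{-2}$ and $I_3 = 1$, the last expressing incompressibility.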
3. Constitutive Relations
In this study we shall restrict ourselves to hyperelastic, isotropic, incompressible
inhomogeneous solids. The most general representation for the Cauchy stress, T,
for this class of elastic solids is
$$T = -p\mathbf{1} + 2W_1 B_{\kappa_R} - 2W_2 B^{-1}_{\kappa_R}, \tag{7}$$
where, W1 and W2 are the derivatives of the stored energy W = W (I1 , I2 , X) with
respect to the first and second principal invariants of CκR and −p1 is the indeterminate part of the stress due to the constraint of incompressibility. The specific form
of the stored energy function depends on the particular solid that is of interest.
We shall consider three types of stored energy functions associated with isotropic,
incompressible inhomogeneous bodies and restrict ourselves to special forms of inhomogeneities, but these special forms suffice to make our case that great care has
to be exercised in approximating inhomogeneous bodies as homogeneous bodies,
however mild the inhomogeneities.
We shall first consider a generalization of the classical homogeneous neo-Hookean model (see [5]). We shall suppose that the stored energy function W takes
the form
$$W = \mu(X)(I_1 - 3), \tag{8}$$
where $I_1 = \operatorname{tr} C_{\kappa_R}$ and $\mu(X) > 0$ is the shear modulus. The Cauchy stress T in the
body is given by
$$T = -p\mathbf{1} + 2\mu(X) B_{\kappa_R}. \tag{9}$$
The second model that we shall consider is the inhomogeneous version of the
Mooney model [6], for which the stored energy W has the form
$$W = \mu_1(X)(I_1 - 3) + \mu_2(X)(I_2 - 3). \tag{10}$$
It follows from (7) and (10) that the Cauchy stress is given by
$$T = -p\mathbf{1} + 2\mu_1(X) B_{\kappa_R} - 2\mu_2(X) B^{-1}_{\kappa_R}, \tag{11}$$
where µ1 (X) > 0 and µ2 (X) > 0 are material moduli that have to satisfy certain
restrictions (see [7]).
Finally, we introduce a model proposed by Fung [8] to describe biological
tissues. The stored energy function for the inhomogeneous counterpart of Fung’s
model is
$$W = a(X)\left(e^{Q} - 1\right), \tag{12}$$
where $Q = b_1(X)E_{11}^2 + b_2(X)E_{22}^2 + b_3(X)E_{33}^2 + b_4(X)E_{11}E_{22} + b_5(X)E_{22}E_{33} + b_6(X)E_{11}E_{33}$ and $E = 0.5[C - \mathbf{1}]$. This stored energy is not polyconvex. We shall
discuss the deformation of inhomogeneous anisotropic elastic solids elsewhere.
The problems identified in this article have to do with “homogenization” and not
with issues of symmetry. Here we shall consider an isotropic model wherein $Q = b(X)(I_1 - 3)$. The Cauchy stress corresponding to such a stored energy is
$$T = -p\mathbf{1} + 2a(X)b(X)e^{Q} B_{\kappa_R}, \tag{13}$$
where a(X) > 0 and b(X) > 0 are material parameters.
We shall neglect body forces and as we shall consider only static problems, the
balance of linear momentum reduces to
$$\operatorname{div}(T) = 0. \tag{14}$$
For incompressible bodies, the Cauchy stress can be expressed as
$$T = -p\mathbf{1} + T^e, \tag{15}$$
where $T^e = W_1 B_{\kappa_R} - W_2 B^{-1}_{\kappa_R}$ is the constitutively determined part of the stress.
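Let us first introduce a dimensionless prescription of the position, gradient and
divergence through
$$\tilde{x} = \frac{x}{L}, \tag{16}$$
$$\widetilde{\operatorname{grad}}(\cdot) = L \operatorname{grad}(\cdot), \tag{17}$$
$$\widetilde{\operatorname{div}}(\cdot) = L \operatorname{div}(\cdot), \tag{18}$$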
where L is a relevant length scale for the specific boundary value problem under
consideration. We note that the deformation gradient, FκR is already a dimensionless quantity. We introduce a parameter, µo with units of stress, to render
the Cauchy stress dimensionless. The choice of the parameter µo depends on the
specific form of the stored energy function that is under consideration. Here we
choose $\mu_o$ to be the mean value of the shear modulus µ in the case of a neo-Hookean
stored energy, the mean value of $\mu_1$ for a Mooney stored energy, and the mean value
of a for a Fung stored energy. Thus, equation (7) can be written in the following
dimensionless form:
$$\tilde{T} = -\tilde{p}\mathbf{1} + \tilde{W}_1 B_{\kappa_R} - \tilde{W}_2 B^{-1}_{\kappa_R}, \tag{19}$$
where $\tilde{W}_i = W_i/\mu_o$, $\tilde{T} = T/\mu_o$ and $\tilde{p} = p/\mu_o$. Consequently, equation (14)
becomes
$$\widetilde{\operatorname{div}}(\tilde{T}) = 0. \tag{20}$$
For the sake of convenience we drop the tilde with the understanding that all the
quantities considered henceforth are non-dimensional unless otherwise explicitly
stated.
4. Forms of Inhomogeneities
For the purpose of illustrating our thesis we shall confine ourselves to a body B
that is the annular region between two concentric spheres:
$$B = \{(R, \Theta, \Phi) \mid R_i \le R \le R_o,\ 0 \le \Theta \le 2\pi,\ 0 \le \Phi \le \pi\}. \tag{21}$$
We shall use Ro for the non-dimensionalization of the length. Let γ (X) denote any
of the material parameters µ(X), µ1 (X), µ2 (X), a(X) or b(X) introduced through
the models in the previous section. We shall assume that the properties vary only
along the radial direction, thus γ (X) = γ (R), and thus, µ, µ1 , µ2 , a and b are all
functions of R.
Before we discuss the manner in which the properties vary, we shall introduce
a parameter $\bar{R}$ in terms of which we find it convenient to discuss the variation as
this parameter ranges between 0 and 1. Let
$$\bar{R} = \frac{R - R_i}{R_o - R_i}. \tag{22}$$
The forms for $\gamma(\bar{R})$ are such that
$$\gamma_{\mathrm{mean}} = \frac{\int_{V(\kappa_R(B))} \gamma(X)\, dV}{V(\kappa_R(B))} = \int_0^1 \gamma(\bar{R})\, d\bar{R}. \tag{23}$$
We shall pick, for the purpose of illustration, $\gamma_{\mathrm{mean}} = 1$, i.e., our properties will
be assumed to vary about a mean value of unity. First, we shall consider cases
where the material parameter varies monotonically. Here, we investigate two types
of variations, one in which $\gamma(\bar{R})$ increases from $R_i$ to $R_o$ and the other in which
it decreases. While this can happen in a variety of ways, we choose the following
simple variations.
4.1. LINEAR VARIATION
$$\gamma(\bar{R}) = 2(1 - \delta)\bar{R} + \delta, \tag{24}$$
where 0 < δ < 2. Thus,
$$\frac{d\gamma}{d\bar{R}} \begin{cases} > 0 & \text{if } 0 < \delta < 1, \\ < 0 & \text{if } 1 < \delta < 2. \end{cases} \tag{25}$$
4.2. EXPONENTIAL VARIATION
Here we suppose that
$$\gamma(\bar{R}) = \frac{\delta}{(e^{\delta} - 1)} \cdot e^{\delta \bar{R}}, \tag{26}$$
where −∞ < δ < ∞. Thus,
$$\frac{d\gamma}{d\bar{R}} \begin{cases} > 0, & \delta > 0, \\ < 0, & \delta < 0. \end{cases} \tag{27}$$
Next we study cases where the variation of γ is non-monotonic.
4.3. PIECEWISE CONSTANT (PWC) VARIATION
We shall assume that
$$\gamma(\bar{R}) = \begin{cases} \delta + 2 \cdot (1 - \delta) \cdot \displaystyle\sum_{n=0}^{k-1} (-1)^n H\!\left(\bar{R} - \frac{n}{k}\right), & k \text{ is even}, \\[2ex] \delta + 2 \cdot (1 - \delta) \dfrac{k}{(k + 1)} \displaystyle\sum_{n=0}^{k-1} (-1)^n H\!\left(\bar{R} - \frac{n}{k}\right), & k \text{ is odd}, \end{cases} \tag{28}$$
where
$$H\!\left(\bar{R} - \frac{n}{k}\right) = \begin{cases} 0 & \text{if } \bar{R} < \dfrac{n}{k}, \\[1ex] 1 & \text{if } \bar{R} > \dfrac{n}{k}. \end{cases}$$
Here δ and k determine the amplitude and frequency of the variation.
4.4. SINUSOIDAL VARIATION
In this case, we assume that
$$\gamma(\bar{R}) = 1 + \delta \cdot \sin(2k\pi \bar{R}), \tag{29}$$
where δ determines the amplitude of the variation and k the frequency. Finally, we
shall consider the case,
$$\gamma(\bar{R}) = 1 + \delta \cdot \cos(2k\pi \bar{R}), \tag{30}$$
with the δ and k having the same meaning as before.
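The forms (24), (26), (28), (29) and (30) are all constructed to have unit mean in the sense of the one-dimensional integral in (23). The following Python sketch checks this numerically; it is only an illustration, and the function names, the midpoint rule and the chosen values of δ and k are assumptions rather than anything prescribed above. The number of midpoint panels is taken as a multiple of k so that no sample point falls on a jump of the piecewise constant form.

```python
import math


def heaviside(x):
    """H as used in (28); the value at x = 0 is not needed by the midpoint rule below."""
    return 1.0 if x > 0 else 0.0


def linear(Rb, delta):
    return 2.0 * (1.0 - delta) * Rb + delta                        # equation (24)


def exponential(Rb, delta):
    return delta / (math.exp(delta) - 1.0) * math.exp(delta * Rb)  # equation (26)


def piecewise_constant(Rb, delta, k):
    s = sum((-1) ** n * heaviside(Rb - n / k) for n in range(k))
    scale = 1.0 if k % 2 == 0 else k / (k + 1)
    return delta + 2.0 * (1.0 - delta) * scale * s                 # equation (28)


def sinusoidal(Rb, delta, k):
    return 1.0 + delta * math.sin(2 * k * math.pi * Rb)            # equation (29)


def cosinusoidal(Rb, delta, k):
    return 1.0 + delta * math.cos(2 * k * math.pi * Rb)            # equation (30)


def mean(gamma, panels=1200):
    """Midpoint-rule approximation of the integral of gamma over [0, 1]."""
    return sum(gamma((i + 0.5) / panels) for i in range(panels)) / panels


means = {
    "linear": mean(lambda Rb: linear(Rb, 0.95)),
    "exponential": mean(lambda Rb: exponential(Rb, 0.5)),
    "pwc, k even": mean(lambda Rb: piecewise_constant(Rb, 0.9, 4)),
    "pwc, k odd": mean(lambda Rb: piecewise_constant(Rb, 0.9, 3)),
    "sine": mean(lambda Rb: sinusoidal(Rb, 0.05, 3)),
    "cosine": mean(lambda Rb: cosinusoidal(Rb, 0.05, 3)),
}
for name, value in means.items():
    print(f"{name:12s} {value:.8f}")
```

Note that the factor k/(k + 1) in the odd case of (28) is exactly what compensates for the extra panel at the higher value, so that both branches average to unity.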
5. Inflation of a Sector of a Spherical Shell
We shall seek a semi-inverse solution of the following form, for the deformation,
in spherical coordinates:
$$r = r(R), \qquad \theta = \Theta, \qquad \phi = \Phi, \tag{31}$$
where $(R, \Theta, \Phi)$ and $(r, \theta, \phi)$ represent the coordinates of a typical material point,
before and after the deformation, respectively. This deformation carries the region
between the two concentric spheres into a region between two other concentric
spheres. The deformation gradient associated with this deformation, in spherical
co-ordinates, has the following matrix representation
$$F = \begin{pmatrix} \dfrac{dr}{dR} & 0 & 0 \\ 0 & \dfrac{r}{R} & 0 \\ 0 & 0 & \dfrac{r}{R} \end{pmatrix}. \tag{32}$$
Hence the left stretch tensor has the form
$$B = \begin{pmatrix} \left(\dfrac{dr}{dR}\right)^2 & 0 & 0 \\ 0 & \left(\dfrac{r}{R}\right)^2 & 0 \\ 0 & 0 & \left(\dfrac{r}{R}\right)^2 \end{pmatrix}. \tag{33}$$
The constraint of incompressibility requires that
$$\frac{dr}{dR}\left(\frac{r}{R}\right)^2 = 1. \tag{34}$$
Integrating the above results in
$$r^3 = R^3 + c_s, \tag{35}$$
where cs = ri3 − Ri3 . For the special form of the assumed deformation, the deformation gradient is only a function of R and therefore the stored energy has the
form, W = W (FκR (R), R). Hence, the equilibrium equation (14) simplifies to
1
dTrr
+ (2Trr − Tθθ − Tφφ ) = 0.
dr
r
(36)
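The Lagrange multiplier p due to the constraint of incompressibility can be determined by integrating (36):
\[
p(R) = T^{e}_{rr}(R) - \left\{ \int_{R}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{R}{r}\right)^{4} - \left(\frac{r}{R}\right)^{2}\right] W_1(I_1, I_2, R)\, dR
- \int_{R}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{r}{R}\right)^{4} - \left(\frac{R}{r}\right)^{2}\right] W_2(I_1, I_2, R)\, dR \right\} - T_{rr}(r(R_o)). \tag{37}
\]
Note that here r is a function of R as given in (35), and $I_1 = (R/r)^{4} + 2(r/R)^{2}$, $I_2 = (r/R)^{4} + 2(R/r)^{2}$. We carry out the integration numerically using the eight-panel Newton–Cotes rule.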
It immediately follows from (7) that
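\[
T_{rr}(R) = -p(R) + \left(\frac{R}{r}\right)^{4} W_1(I_1, I_2, R) - \left(\frac{r}{R}\right)^{4} W_2(I_1, I_2, R), \tag{38}
\]
\[
T_{\phi\phi}(R) = T_{\theta\theta}(R) = -p(R) + \left(\frac{r}{R}\right)^{2} W_1(I_1, I_2, R) - \left(\frac{R}{r}\right)^{2} W_2(I_1, I_2, R), \tag{39}
\]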
and hence, the normal component of the stress in the radial direction at the inner
surface is given by
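\[
P = T_{rr}(r(R_o)) + \left\{ \int_{R_i}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{R}{r}\right)^{4} - \left(\frac{r}{R}\right)^{2}\right] W_1(I_1, I_2, R)\, dR
- \int_{R_i}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{r}{R}\right)^{4} - \left(\frac{R}{r}\right)^{2}\right] W_2(I_1, I_2, R)\, dR \right\}. \tag{40}
\]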
Thus, given a value of ri we can determine the required magnitude of the normal
component of the stress in the radial direction at the inner surface to engender such
a motion.
6. Some Remarks Concerning Parameter Estimation from Experiments
Let us first consider the deformations of an inhomogeneous neo-Hookean solid. Let $\mu_{\text{exp}}$ denote the constant value of the shear modulus for the homogeneous approximation of the inhomogeneous body; that is, we assume that the body is homogeneous with a constant material modulus $\mu_{\text{exp}}$, which is then determined through a correlation with an experiment in which the body is subject to the same boundary traction that has to be applied to engender a given inflation or deflation. We now determine the relationship between this constant value of the material modulus, $\mu^{\text{Sp–Inf}}_{\text{exp}}$, and the material modulus µ for the inhomogeneous body by comparing the solutions corresponding to an identical boundary value problem.
It follows from (40) that
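\[
P - T_{rr}(r_o) = \int_{R_i}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{R}{r}\right)^{4} - \left(\frac{r}{R}\right)^{2}\right] \mu(R)\, dR
= \mu^{\text{Sp–Inf}}_{\text{exp}} \int_{R_i}^{R_o} \frac{4R^{2}}{r^{3}}\left[\left(\frac{R}{r}\right)^{4} - \left(\frac{r}{R}\right)^{2}\right] dR.
\]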
Thus,
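\[
\mu^{\text{Sp–Inf}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(R/r)^{4} - (r/R)^{2}\right] \mu(R)\, dR}{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(R/r)^{4} - (r/R)^{2}\right] dR}. \tag{41}
\]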
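Equation (41) is a weighted average of µ(R), which can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, a composite Simpson rule stands in for the eight-panel Newton–Cotes rule used in the paper, and R_o = 1 is taken as the length scale.

```python
def simpson(g, a, b, panels=2000):
    # Composite Simpson rule; `panels` must be even.
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def kernel_41(R, cs):
    # Weight in (41), with r obtained from (35): r^3 = R^3 + cs.
    r = (R ** 3 + cs) ** (1.0 / 3.0)
    return 4.0 * R ** 2 / r ** 3 * ((R / r) ** 4 - (r / R) ** 2)

def mu_exp_sp_inf(mu, Ri, cs, Ro=1.0):
    # Ratio of the two integrals in (41).
    num = simpson(lambda R: kernel_41(R, cs) * mu(R), Ri, Ro)
    den = simpson(lambda R: kernel_41(R, cs), Ri, Ro)
    return num / den

if __name__ == "__main__":
    delta = 1.5
    linear = lambda R: 2.0 * (1.0 - delta) * R + delta  # form (a) of Figure 1
    print(mu_exp_sp_inf(linear, Ri=0.5, cs=0.1))
```

For c_s > 0 the kernel keeps one sign on the whole interval, so the result always lies between the smallest and largest values of µ(R) on [R_i, R_o]; a homogeneous body returns its own modulus.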
We immediately recognize from (41) that different forms of µ(R) with the same $\mu_{\text{mean}}$ can lead to different values for $\mu^{\text{Sp–Inf}}_{\text{exp}}$. Note that in the above equation $R_o = 1$, since $R_o$ is used as the characteristic length scale. Figures 1–3 capture the variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ for the various types of variation of µ(R) presented in Section 4, for a given $c_s$ defined in (35). Figures 2 and 3 show that when the inhomogeneity is periodic, the homogeneous approximation improves as the frequency of the inhomogeneity increases, the amplitude remaining fixed. This is in keeping with the previous results of Saravanan and Rajagopal [4] in their investigations of the deformation of inhomogeneous isotropic annular elastic cylinders. From the same figures it can also be seen that the piecewise constant variation with k even is qualitatively similar to the sinusoidal variation, while with k odd it qualitatively resembles the cosine variation.

Figure 1. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and (a) $\mu(R) = 2(1 - \delta)R + \delta$, (b) $\mu(R) = (\delta/(e^{\delta} - 1)) \cdot e^{\delta R}$.
Instead of homogenizing such that the boundary traction and deformation measured in experiments are the same in both the inhomogeneous body and its homogeneous approximation, we can homogenize such that the total stored energy in the inhomogeneous body and in its homogeneous counterpart is the same. Further, if the stored energies of the homogeneous subparts of the inhomogeneous body belong to the same class, say neo-Hookean, and we seek a homogeneous counterpart whose stored energy belongs to that same class, then we can mathematically seek the constant value of the shear modulus µ, denoted by $\mu_{\text{mth}}$, such that the stored energy in the inhomogeneous body and in its homogeneous approximation are the same. Thus
\[
\mu_{\text{mth}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\,[I_1 - 3]\, dR}{\displaystyle\int_{R_i}^{R_o} [I_1^{*} - 3]\, dR}, \tag{42}
\]
where $I_1$ is the first principal invariant of C associated with the deformation field in the inhomogeneous body, while $I_1^{*}$ is the first principal invariant of $C^{*}$ corresponding to the deformation field in the homogenized approximation.
The deformation of the spherical body, here, is determined completely by the condition of isochoricity, irrespective of whether the body is homogeneous or inhomogeneous. Hence
\[
\mu^{\text{sph}}_{\text{mth}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\,[I_1 - 3]\, dR}{\displaystyle\int_{R_i}^{R_o} [I_1 - 3]\, dR}, \tag{43}
\]
where $I_1 = (R/r)^{4} + 2(r/R)^{2}$. It immediately transpires from Figures 4–7 that correlating the stored energy does not result in a good prediction of the boundary traction required to engender the given boundary deformation. This is apparent from comparing equation (43) with (41). The prediction of the normal component of the stress in the radial direction required at the inner surface to engender a given inflation of a given inhomogeneous spherical shell, based on the equivalence of the stored energy in the inhomogeneous spherical shell and its homogeneous counterpart, could at times be twice the actual value (see Figure 7(b)).
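The difference between (41) and (43) lies only in the weight applied to µ(R), which the following sketch makes explicit. It is an illustration under assumptions: the helper names are hypothetical, Simpson's rule replaces the paper's Newton–Cotes rule, R_o = 1, and the linear form of µ(R) with δ = 1.5 is borrowed from the caption of Figure 4.

```python
def simpson(g, a, b, panels=2000):
    # Composite Simpson rule; `panels` must be even.
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def stretches(R, cs):
    # Returns (R/r, r/R) with r^3 = R^3 + cs, equation (35).
    r = (R ** 3 + cs) ** (1.0 / 3.0)
    return R / r, r / R

def weight_traction(R, cs):
    # Kernel of (41): (4R^2/r^3)[(R/r)^4 - (r/R)^2], using R^2/r^3 = (R/r)^3 / R.
    a, b = stretches(R, cs)
    return 4.0 * a ** 3 / R * (a ** 4 - b ** 2)

def weight_energy(R, cs):
    # Kernel of (43): I1 - 3 with I1 = (R/r)^4 + 2(r/R)^2.
    a, b = stretches(R, cs)
    return a ** 4 + 2.0 * b ** 2 - 3.0

def weighted_modulus(mu, weight, Ri, cs, Ro=1.0):
    num = simpson(lambda R: weight(R, cs) * mu(R), Ri, Ro)
    den = simpson(lambda R: weight(R, cs), Ri, Ro)
    return num / den

if __name__ == "__main__":
    mu = lambda R: 2.0 * (1.0 - 1.5) * R + 1.5
    print("traction-based (41):", weighted_modulus(mu, weight_traction, 0.5, 0.1))
    print("energy-based   (43):", weighted_modulus(mu, weight_energy, 0.5, 0.1))
```

In this example the energy weight is concentrated much more strongly near the inner surface than the traction weight, so the two homogenized moduli differ even though both are averages of the same µ(R).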
Figure 2. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and $\mu(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$: (a) k is even, (b) k is odd.

Figure 3. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and (a) $\mu(R) = 1 + \delta \sin(2k\pi R)$, (b) $\mu(R) = 1 + \delta \cos(2k\pi R)$.

Figure 4. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.5$ and (a) $\mu(R) = 2(1 - \delta)R + \delta$, (b) $\mu(R) = (\delta/(e^{\delta} - 1)) \cdot e^{\delta R}$ for various load combinations.

Figure 5. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.9$ and $\mu(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$: (a) k = 2, (b) k = 10 for various load combinations.

Figure 6. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.9$ and $\mu(R) = \delta + 2(1 - \delta)(k/(k + 1))\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$: (a) k = 3, (b) k = 11 for various load combinations.

Figure 7. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 0.9$ and (a) $\mu(R) = 1 + \delta \sin(2k\pi R)$, (b) $\mu(R) = 1 + \delta \cos(2k\pi R)$ for various load combinations.
It would be appropriate at this juncture to discuss a very important issue concerning the experimental evaluation of the material parameters, namely the ability to infer the same values for the material moduli from different experiments. Saravanan and Rajagopal [4] have shown that different deformations of an annular cylinder (i.e., different experiments) lead to different values being deduced for the experimental modulus. Here we show that the same holds for $\mu_{\text{mth}}$.
In the case of the inflation, extension and torsion of an annular right circular inhomogeneous cylinder, the requirement that the deformation be isochoric completely determines the deformation, and the deformation field is the same for both the homogeneous and the inhomogeneous right circular cylinder; the difference, of course, lies in the stress field. This isochoric deformation is given by
\[
\lambda r^{2} = R^{2} + c_c^{2}, \qquad \theta = \Theta + \Omega Z, \qquad z = \lambda Z, \tag{44}
\]
where $c_c^{2} = \lambda r_i^{2} - R_i^{2}$, λ being the axial extension, Ω the angle of twist per unit length, $R_i$ the inner radius of the annular cylinder in the reference configuration and $r_i$ the corresponding radius after inflation, i.e., in the current configuration. Following Saravanan and Rajagopal [4], we consider three special cases. The first is uniaxial extension of the right circular cylinder, for which $c_c = 0$ and $\Omega = 0$. Then, correlating the axial load required to engender a given axial extension, they obtain
\[
\mu^{\text{Ax–Ext}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\, R\, dR}{\displaystyle\int_{R_i}^{R_o} R\, dR}. \tag{45}
\]
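If instead we correlate the total stored energy, then
\[
\mu^{cs1}_{\text{mth}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\, dR}{\displaystyle\int_{R_i}^{R_o} dR} = \mu_{\text{mean}}, \tag{46}
\]
from (42). Next, consider pure twisting of the annular cylinder, for which $c_c = 0$ and $\lambda = 1$. Correlating the torque required to engender a given twist, they obtained
\[
\mu^{\text{Tr–Twt}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\, R^{3}\, dR}{\displaystyle\int_{R_i}^{R_o} R^{3}\, dR}, \tag{47}
\]
while correlating the total stored energy we obtain
\[
\mu^{cs2}_{\text{mth}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\, R^{2}\, dR}{\displaystyle\int_{R_i}^{R_o} R^{2}\, dR}. \tag{48}
\]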
Finally, consider the case when $\lambda = 1$ and $\Omega = 0$, corresponding to pure inflation. Correlating the radial component of the stress required to engender a given inflation, they obtained
\[
\mu^{\text{Pr–Inf}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} \mu(R)\, \frac{2R^{2} + c_c^{2}}{R\,(R^{2} + c_c^{2})^{2}}\, dR}{\displaystyle\int_{R_i}^{R_o} \frac{2R^{2} + c_c^{2}}{R\,(R^{2} + c_c^{2})^{2}}\, dR}. \tag{49}
\]
Correlating the total stored energy, we obtain
\[
\mu^{cs3}_{\text{mth}} = \frac{\displaystyle\int_{R_i}^{R_o} \frac{\mu(R)}{R^{2}(R^{2} + c_c^{2})}\, dR}{\displaystyle\int_{R_i}^{R_o} \frac{1}{R^{2}(R^{2} + c_c^{2})}\, dR}. \tag{50}
\]
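Equations (45)–(50) are all weighted averages of µ(R) that differ only in the weight, so they can be compared side by side. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, Simpson quadrature is our choice, R_o = 1, and the values δ = 1.5, R_i = 0.5 and c_c = 0.1 are sample inputs rather than results from the paper.

```python
def simpson(g, a, b, panels=2000):
    # Composite Simpson rule; `panels` must be even.
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def weighted_mean(mu, w, Ri, Ro=1.0):
    return simpson(lambda R: mu(R) * w(R), Ri, Ro) / simpson(w, Ri, Ro)

def cylinder_moduli(mu, Ri, cc, Ro=1.0):
    # Weights read off from equations (45)-(50); cc enters only for inflation.
    c2 = cc ** 2
    weights = {
        "exp, axial extension (45)": lambda R: R,
        "mth, axial extension (46)": lambda R: 1.0,
        "exp, torsion (47)": lambda R: R ** 3,
        "mth, torsion (48)": lambda R: R ** 2,
        "exp, inflation (49)": lambda R: (2 * R ** 2 + c2) / (R * (R ** 2 + c2) ** 2),
        "mth, inflation (50)": lambda R: 1.0 / (R ** 2 * (R ** 2 + c2)),
    }
    return {name: weighted_mean(mu, w, Ri, Ro) for name, w in weights.items()}

if __name__ == "__main__":
    delta = 1.5
    mu = lambda R: 2.0 * (1.0 - delta) * R + delta
    for name, value in cylinder_moduli(mu, Ri=0.5, cc=0.1).items():
        print(f"{name}: {value:.4f}")
```

Running the sketch for a single µ(R) returns six different numbers, which is the point made in the text: the inferred homogeneous modulus depends on the experiment and on the homogenization criterion.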
The values of $\mu_{\text{exp}}$ obtained from equations (45), (47) and (49) for a given variation of µ(R) and given values of δ and k are plotted in Figures 4–7. The problem under consideration, in a different geometry, thus leads to yet another means of estimating the value of the material moduli for the homogeneous approximation. As observed by Saravanan and Rajagopal [4], the value of $\mu_{\text{exp}}$ depends on the thickness of the cylinder or the sphere, as the case may be, which is an unacceptable situation. Also, the values of $\mu_{\text{exp}}$ obtained from different experiments differ significantly, again an unacceptable situation.
Figures 4–7 also provide the values of $\mu_{\text{mth}}$ obtained for the various deformations outlined above. Just like $\mu_{\text{exp}}$, $\mu_{\text{mth}}$ could, for a given inhomogeneous body, vary by as much as 1800% (refer to Figure 5(a)) depending on the boundary value problem and the thickness of the cylinder or the spherical shell. This suggests that the bounds obtained on these homogenized parameters will not be tight and hence will be of little utility. Further, correlating the stored energy does not result in a good prediction of the boundary traction required to engender the given boundary deformation. This is evident from comparing equation (46) with (45), (48) with (47), or (50) with (49).
It should be recognized that the above observations are based on a few studies of specific boundary value problems and, more importantly, on a few types of inhomogeneities. Consequently, there might exist other inhomogeneities for which the variation is much more severe than that observed here. All these unsatisfactory features point to the need for recognizing the exact structure of the inhomogeneity in characterizing these bodies and solving the appropriate boundary value problem, at least until a better homogenization procedure is put in place.
Clearly, the value of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ (as well as $\mu^{\text{sph}}_{\text{mth}}$) depends on the value of $c_s$. Figure 8 shows this dependence when the sphere is made up of layers of homogeneous neo-Hookean solid.
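Next, let us consider the deformation of an inhomogeneous Mooney solid. We immediately obtain the values of $(\mu_1)^{\text{Sp–Inf}}_{\text{exp}}$ and $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ as
\[
(\mu_1)^{\text{Sp–Inf}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(R/r)^{4} - (r/R)^{2}\right] \mu_1(R)\, dR}{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(R/r)^{4} - (r/R)^{2}\right] dR}, \tag{51}
\]
\[
(\mu_2)^{\text{Sp–Inf}}_{\text{exp}} = \frac{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(r/R)^{4} - (R/r)^{2}\right] \mu_2(R)\, dR}{\displaystyle\int_{R_i}^{R_o} (4R^{2}/r^{3})\left[(r/R)^{4} - (R/r)^{2}\right] dR}. \tag{52}
\]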
It is easy to see that $(\mu_1)_{\text{exp}}$ has the same expression as the shear modulus ($\mu_{\text{exp}}$) in the neo-Hookean form. Hence, all the difficulties and undesirable characteristics of evaluating the shear modulus for an inhomogeneous neo-Hookean solid apply to $(\mu_1)_{\text{exp}}$. Figure 9 depicts the variation of $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ when the sphere is made up of layers of homogeneous Mooney solid. It can be seen from the figure that even when the inhomogeneity is periodic, an increase in the frequency of the inhomogeneity does not result in a better homogeneous approximation, especially in a thick-walled spherical annulus.

Figure 8. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $c_s$ when $R_i = 0.5$ and $\mu(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$: (a) k is even, (b) k is odd.

Figure 9. Variation of $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and $\mu(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$: (a) k is even, (b) k is odd.
In the case of an inhomogeneous body whose stored energy function is that proposed by Fung, the stored energy of its homogeneous approximation cannot even belong to the same class, i.e., it will not belong to the class introduced by Fung (see [4]). Nevertheless, we investigate the consequences of approximating an inhomogeneous body whose stored energy is given by the model proposed by Fung, with its material parameters varying mildly with location, by a homogeneous body with the same class of stored energy function and constant material parameters. We use the mean values of the material parameters in the homogeneous approximation and compare the pressure versus inner radius response for the various classes of inhomogeneous bodies that lead to the same homogeneous approximation. The results are depicted in Figure 10. The normal component of the stress in the radial direction required to engender the motion, parameterized by a given value of $r_i$, can in the case of the inhomogeneous body be 150% to even 300% more than that required for its corresponding homogeneous approximation. Of course, this depends on the specific variation of the material parameters. Here both the parameters a and b are assumed to have the same functional dependence on R.
7. Stress Distribution
We now turn our attention to the differences between the stress distribution in the inhomogeneous solid and that in its homogeneous counterpart. It is not surprising that the stress distribution corresponding to the inhomogeneous body is quantitatively different from that of the homogeneous approximation. However, one would expect the qualitative features of the stress distribution, such as the derivative of the stress (which indicates whether the stress is increasing or decreasing) or the sense of the stress (tensile or compressive), to be preserved by and large, though not everywhere, in the inhomogeneous body and its homogeneous counterpart. The sense of the stress as well as its magnitude can determine the integrity or failure of the body. For instance, certain materials can withstand significant compressive stresses yet fail under tensile stresses; a homogenized approximation that predicts tolerable compressive stresses may thus lull one into a false sense of security, while the real inhomogeneous body may in fact fail as tensile stresses develop. Unfortunately, such is indeed the case, and this was illustrated by Saravanan and Rajagopal [4] for the inflation, extension, torsion and shearing of an annular cylinder.
Figure 10. $T_{rr}(r_i)$ vs $r_i$ plot for various forms of inhomogeneities when $\delta = 0.5$, $a_m = 1$, (a) $b_m = 1$, (b) $b_m = 2$ and k = 2 except for PWC-2, for which k = 10.

Figure 11. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with R for a neo-Hookean stored energy function when $\mu(R) = 2(1 - \delta)R + \delta$ and $r_i = 0.91$.

Figure 12. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with R for a neo-Hookean stored energy function when $\mu(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$ and $r_i = 0.91$.

Figure 13. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with R for a Fung stored energy function when $a(R) = 1$, $b(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$ and $r_i = 1.1$.

Figure 14. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with R for a Fung stored energy function when $a(R) = 1$, $b(R) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^{n} H(R - n/k)$ and $r_i = 1.4$.
It can be seen from Figures 11 and 12 that for an inhomogeneous neo-Hookean body the derivative of the stress is quite different from that for the homogeneous body, and in some cases even the sense of the stress is different.
Finally, the stresses in the case of a sphere whose stored energy is that of the model proposed by Fung are plotted in Figures 13 and 14. For the cases illustrated in the figures, $b_{\text{mean}} = 1$. In this case a 5% variation in the parameter b causes an 8% variation in the stress (to be specific, $T_{\phi\phi}$) when $r_i = 1.1$ and a 17% variation when $r_i = 1.4$. The percentage variation of the stresses increases with an increase in the mean value of the parameter b, for the same variation of b about the mean. Thus, when $b_{\text{mean}} = 5$, a 5% variation in the parameter b causes a 74% variation in the stress when $r_i = 1.4$. The variation in the stress can therefore be an order of magnitude greater than the variation of the material parameters. Hence, if we idealize an aneurysm as a sphere and obtain the stress distribution using a homogeneous approximation, we will find that the stress will be both qualitatively and quantitatively different from the one that accounts for the inhomogeneity of the aneurysm.
References
1. Y.C. Fung, Biomechanics: Motion, Flow, Stress and Growth. Springer, New York (1990).
2. R.M. Christensen, Mechanics of Composite Materials. Wiley, New York (1979).
3. A. Imam, G.C. Johnson and M. Ferrari, Determination of the overall moduli in second order incompressible elasticity. J. Mech. Phys. Solids 43(7) (1995) 1087–1104.
4. U. Saravanan and K.R. Rajagopal, On the role of inhomogeneities in the deformation of elastic bodies. Math. Mech. Solids (accepted for publication).
5. L.R.G. Treloar, The elasticity of a network of long chain molecules – II. Trans. Faraday Soc. 39(9/10) (1943) 241–246.
6. M. Mooney, A theory of large elastic deformation. J. Appl. Phys. 11 (1940) 582–592.
7. C. Truesdell and W. Noll, The nonlinear field theories. In: Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
8. Y.C. Fung, Elasticity of soft tissues in elongation. Amer. J. Physiol. 213 (1967) 1532–1544.
On SO(n)-Invariant Rank 1 Convex Functions ⋆
M. ŠILHAVÝ
Mathematical Institute of the AV ČR, Žitná 25, 115 67 Prague 1, Czech Republic.
E-mail: silhavy@math.cas.cz
Received 9 October 2001; in revised form 30 December 2002
Abstract. Let f be a function defined on the set Mn×n of all real square matrices of order n. If f
is SO(n)-invariant, it has a representation f˜ on Rn through the signed singular values of the matrix
argument A ∈ Mn×n . A necessary and sufficient condition for the rank 1 convexity of f in terms of
f˜ is given.
Mathematics Subject Classifications (2000): Primary 49K20; secondary 73C50.
Key words: rank 1 convex functions, rotational invariance.
In memory of Clifford Truesdell
1. Introduction
In nonlinear elasticity and in the theory of phase transitions in solids, one minimizes the energy functional
\[
I(u) = \int_{\Omega} f(Du)\, dx,
\]
where $\Omega \subset R^{n}$ is open and bounded, $u \in W^{1,p}(\Omega, R^{n})$ is a deformation with the gradient Du, $f : M^{n\times n} \to R \cup \{\infty\}$ is the energy defined on the set $M^{n\times n}$ of all real square matrices of order n and $1 \le p \le \infty$. Consider, for definiteness, the minimum problem
\[
M = \inf\{I(u) : u \in A\}
\]
on the Dirichlet class $A = \{u : u - v \in W^{1,p}_{0}(\Omega, R^{n})\}$, where $v \in W^{1,p}(\Omega, R^{n})$ is fixed. If the problem has a solution, i.e., if there exists a $u \in A$ such that $I(u) = M$, then u is a ‘stable’ equilibrium state; on the other hand, the nonexistence of a solution indicates the possibility of phase transitions and the formation of microstructure. Since the question of Truesdell [28, Section 20] it has been clear that, apart from the invariance, the energy f must satisfy further basic requirements,
⋆ This research was supported by Grant 201/00/1516 of the Grant Agency of the Czech Republic.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 751–762.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
‘constitutive inequalities’ yet to be determined. Subsequently it was found that the existence of the solution, and its further properties, are directly related to the semiconvexity properties (i.e., quasiconvexity, rank 1 convexity, polyconvexity, and convexity) of f [14, 3, 8, 18, 19, 16]. Recall that f is said to be quasiconvex if
\[
|E|\, f(A) \le \int_{E} f(A + Dv(x))\, dx \tag{1}
\]
for each $A \in M^{n\times n}$, each bounded open $E \subset R^{n}$ with $|\partial E| = 0$, and each $v \in W^{1,\infty}_{0}(E)$ such that the right-hand side in (1) makes sense as the Lebesgue integral. A closely related notion is that of rank 1 convexity, which requires that
\[
f((1 - t)A + tB) \le (1 - t) f(A) + t f(B) \tag{2}
\]
for every $t \in [0, 1]$ and every $A, B \in M^{n\times n}$ with $\operatorname{rank}(A - B) \le 1$. For finite-valued functions, quasiconvexity implies rank 1 convexity. If f is not quasiconvex, the material exhibits microstructure and phase transformation [5–8, 13, 15]. The effective energy is given by the relaxation of I, i.e., by
\[
\bar{I}(u) = \int_{\Omega} Qf(Du)\, dx,
\]
where Qf is the quasiconvex hull of f, i.e., the largest quasiconvex function not exceeding f. One defines the rank 1 convex hull Rf similarly. For finite-valued functions, $Qf \le Rf$, and it often happens that $Qf = Rf$.
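Inequality (2) can be probed numerically. The sketch below uses the determinant of a 2 × 2 matrix, a standard textbook example that is not discussed in the paper: it is affine along rank 1 directions, hence rank 1 convex, but it is not convex. All names are hypothetical.

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def combine(A, B, t):
    # Entrywise convex combination (1 - t) A + t B.
    return [[(1 - t) * A[i][j] + t * B[i][j] for j in range(2)] for i in range(2)]

def convexity_gap(f, A, B, t):
    # (1 - t) f(A) + t f(B) - f((1 - t) A + t B); nonnegative when (2) holds.
    return (1 - t) * f(A) + t * f(B) - f(combine(A, B, t))

def rank_one_update(A, a, b):
    # A + a (x) b, so that the difference from A has rank at most 1.
    return [[A[i][j] + a[i] * b[j] for j in range(2)] for i in range(2)]

if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    B = rank_one_update(identity, (1.0, 2.0), (3.0, -1.0))
    # Along a rank 1 segment the gap vanishes: det is rank 1 affine.
    print(convexity_gap(det2, identity, B, 0.3))
    # Along a rank 2 segment inequality (2) fails, so det is not convex.
    P = [[1.0, 0.0], [0.0, -1.0]]
    Q = [[-1.0, 0.0], [0.0, 1.0]]
    print(convexity_gap(det2, P, Q, 0.5))
```

The second pair differs by a rank 2 matrix, which is why a negative gap there does not contradict rank 1 convexity.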
A function f : Mn×n → R ∪ {∞} is said to be rotationally invariant (briefly,
invariant) if f (A) = f (QAR) for all A ∈ Mn×n and all Q, R proper orthogonal.
For example, stored energies of isotropic solids have this property. If we define
f˜(τ ) = f (diag(τ )) for any τ ∈ Rn then f˜, called the representation of f , is
symmetric and even, i.e., f˜(P τ ) = f˜(τ ) = f˜(ǫτ ) for every τ ∈ Rn , every
permutation matrix P and every diagonal proper orthogonal matrix ǫ. One finds
that
f (A) = f˜(τ ),
where τ = (τ1 , . . . , τn ) are the signed singular values of A, defined as the unique
n-tuple such that τ1 , . . . , τn−1 , |τn | are the singular values of A, arranged in a
nonincreasing way, and sgn τn = sgn det A [17, 20].
This paper presents a condition equivalent to the rank 1 convexity of f in terms of $\tilde f$, Theorem 6. Like the conditions in [1, 20–22, 25], the present condition involves finite differences of arguments, resembling formally the inequality (2), as opposed to the Legendre–Hadamard condition
\[
D^{2} f(A)(a \otimes b, a \otimes b) \ge 0, \qquad A \in M^{n\times n}, \quad a, b \in R^{n},
\]
whose nature is ‘infinitesimal.’ The form of the Legendre–Hadamard condition in terms of $\tilde f$ has been given in [12] for n = 2, in [27] for n = 3 (in a different framework), and generally in [21, Proposition 6.4]. The reader is referred to [2, 10], and [9] for additional information.
In view of its global nature, Theorem 6, and the results of [20–22, 25], can be
used to define iterative procedures for evaluating the rank 1 convex hull of an invariant function [26, 23]. Theorem 6, and its proof, has two parts. One part (item (i)
of the theorem) is the monotonicity of invariant rank 1 convex functions as established in [24], which is closely related to the Baker–Ericksen inequalities in the
differentiable case. In the special case of O(n) invariant functions, a similar result
has been established in [11]; however, the result does not apply to SO(n) invariant
functions treated here. The second part (item (ii) of the theorem) is based on an
explicit construction of a rank 1 perturbation B of a given matrix A with prescribed
signed singular values β, Proposition 1. Such a perturbation exists only if β and the
signed singular values α of A satisfy the interlacing inequalities to be formulated
in Section 2. Let $S^{k}$, k = 1, . . . , n, denote the kth elementary symmetric function of n variables. If γ(t) are the signed singular values of C(t) := (1 − t)A + tB, 0 ≤ t ≤ 1, then for an appropriate diagonal orthogonal matrix ǫ, the functions $S^{k}(\epsilon\gamma(t))$ behave affinely, i.e.,
\[
S^{k}(\epsilon\gamma(t)) = (1 - t)\, S^{k}(\epsilon\alpha) + t\, S^{k}(\epsilon\beta), \qquad k = 1, \ldots, n. \tag{3}
\]
Accordingly, the function $\tilde f$ is convex on any curve satisfying (3) (provided α, β satisfy the bilateral interlacing inequalities), i.e.,
\[
\tilde f(\gamma(t)) \le (1 - t)\, \tilde f(\epsilon\alpha) + t\, \tilde f(\epsilon\beta). \tag{4}
\]
The necessity of (4) thus follows by a direct insertion into (2); the converse proof
is based on local considerations. Namely, the rank 1 perturbations described in
Proposition 1 have certain minimum properties stated in Lemma 4. Combining
these with items (i), (ii) and the use of some continuity and density arguments
(Lemmas 2 and 5) then completes the proof. Apart from the notation and the bilateral interlacing inequalities, Section 2 is not needed for the statement of Theorem 6;
it only gathers a material for the proof, and can be used as reference as needed.
It would be desirable to integrate the two conditions of Theorem 6 into a single
condition. This can be done in dimension n = 2 as Theorem 7 shows. However,
if n ≥ 3, the discrepancy of the functions occurring in conditions (i), (ii) of Theorem 6 seems to present a serious difficulty in such attempts. One connection between
(i), (ii) is established in Lemma 4, which is the main technical improvement with
respect to the previous papers of the author. The lemma shows that the rank 1
perturbations underlying condition (ii) are local minimizers of the partial products
of signed singular values occurring in condition (i). This establishes the special
positions of these particular rank 1 perturbations. However, the result is local in its
nature and the full understanding of the issues is outstanding.
2. Rank 1 Perturbations
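This section describes a class of rank 1 perturbations of a given matrix with prescribed signed singular values and collects other supplementary facts.
Let $G^{n} = \{\tau \in R^{n} : \tau_1 \ge \tau_2 \ge \cdots \ge \tau_{n-1} \ge |\tau_n|\}$ and note that a $\tau \in R^{n}$ is an n-tuple of signed singular values of some $A \in M^{n\times n}$ if and only if $\tau \in G^{n}$. For any $\alpha, \beta \in R^{n}$ let $\alpha\beta = (\alpha_1\beta_1, \ldots, \alpha_n\beta_n)$, $\alpha^{2} = (\alpha_1^{2}, \ldots, \alpha_n^{2})$, and write $\alpha \ge 0$ if $\alpha_1 \ge 0, \ldots, \alpha_n \ge 0$. For each $k \in \{1, \ldots, n\}$ denote by $S^{k} : R^{n} \to R$ the kth elementary symmetric function of n variables,
\[
S^{k}(\alpha_1, \ldots, \alpha_n) = \sum_{1 \le i_1 < \cdots < i_k \le n} \alpha_{i_1} \cdots \alpha_{i_k},
\]
and let
\[
S(\alpha) = (S^{1}(\alpha), \ldots, S^{n}(\alpha)).
\]
The pair $\alpha, \beta \in G^{n} \cap R^{n}_{+}$ is said to satisfy the bilateral interlacing inequalities (BIL) if
\[
\beta_1 \ge \alpha_2, \qquad \alpha_1 \ge \beta_2 \ge \alpha_3, \qquad \ldots, \qquad \alpha_{n-1} \ge \beta_n.
\]
The pair $\alpha, \beta \in G^{n}$ is said to satisfy the BIL if $(\alpha_1, \ldots, \alpha_{n-1}, |\alpha_n|)$ and $(\beta_1, \ldots, \beta_{n-1}, |\beta_n|)$ satisfy the BIL.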
PROPOSITION 1. Let $A = \operatorname{diag}(\alpha)$, $\alpha \in G^{n}$. Then
(i) $\beta \in G^{n}$ are the signed singular values of some rank 1 perturbation of A ⇔ α, β satisfy the BIL;
(ii) let $\beta \in G^{n}$, $\beta \ne \alpha$, satisfy the BIL and let
\[
p_j = \epsilon_j \sqrt{x_j}, \qquad q_j = \sqrt{x_j},
\]
where $\epsilon_j \in \{1, -1\}$ is such that $\epsilon(\beta - \alpha) \ge 0$,
\[
x_j := \epsilon_j(\beta_j - \alpha_j) \prod_{i} \frac{\epsilon_i\beta_i - \epsilon_j\alpha_j}{\epsilon_i\alpha_i - \epsilon_j\alpha_j} \ge 0,
\]
and the product is taken over all i for which the denominator is nonzero; then $B := A + p \otimes q$ has the signed singular values β;
(iii) in the situation of (ii), set C = (1 − t)A + tB, 0 ≤ t ≤ 1, and denote by γ the signed singular values of C; then
\[
S(\epsilon\gamma) = (1 - t)\, S(\epsilon\alpha) + t\, S(\epsilon\beta). \tag{5}
\]
The requirement $\epsilon(\beta - \alpha) \ge 0$ determines $\epsilon_j \in \{1, -1\}$ uniquely as $\epsilon_j = \operatorname{sgn}(\beta_j - \alpha_j)$ if $\beta_j \ne \alpha_j$, while if $\beta_j = \alpha_j$ then both choices $\epsilon_j = \pm 1$ are possible; however, this ambiguity has no consequences for the values of $x_j$. Item (iii) expresses the remarkable fact that the elementary symmetric functions, when composed with ǫ, are affine functions of the signed singular values along the rank 1 segments described in Proposition 1. This is not generally true for an arbitrary rank 1 line segment whose endpoints have the signed singular values α, β.
Proof. For the proof of (i), (ii), see [22, Propositions 3.1, 3.2]. To prove (iii), let m be defined by $m_i = \sqrt{x_i}$, let $E := \operatorname{diag}(\epsilon)$, $L := EA$, $M := EB \equiv L + m \otimes m$, $N := EC = (1 - t)L + tM$, and note that L, M, N are symmetric. Then ǫα, ǫβ, ǫγ are (unordered) spectra of L, M, N, respectively. Denoting the characteristic polynomials of L, M, N by p, q, r, respectively, one finds
\[
q(z) = p(z) + \operatorname{cof}(L - z1)\, m \cdot m, \qquad r(z) = p(z) + t \operatorname{cof}(L - z1)\, m \cdot m,
\]
and hence
\[
r = (1 - t)\, p + t\, q.
\]
Expanding p, q, r in terms of $S^{k}(\epsilon\alpha)$, $S^{k}(\epsilon\beta)$, $S^{k}(\epsilon\gamma)$, one obtains (5).
✷
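The construction in Proposition 1(ii) and the affine relation (5) can be checked on a small example. The sketch below is an illustration under assumptions: the function names are hypothetical, the signed singular values are computed with a closed form valid for 2 × 2 matrices only, and the sample pair α = (2, 1), β = (1.5, 1.2) was chosen by us so that the BIL hold and the two components of β − α have opposite signs.

```python
import math

def signed_singular_values_2x2(M):
    # For 2x2 matrices: s1^2 + s2^2 = ||M||_F^2 and s1 * s2 = |det M|.
    fro2 = sum(x * x for row in M for x in row)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    plus = math.sqrt(fro2 + 2.0 * abs(det))
    minus = math.sqrt(max(fro2 - 2.0 * abs(det), 0.0))
    s1, s2 = 0.5 * (plus + minus), 0.5 * (plus - minus)
    return s1, math.copysign(s2, det)  # the last value carries sgn det M

def rank_one_perturbation(alpha, beta):
    # Proposition 1(ii): eps with eps * (beta - alpha) >= 0, then x, p, q, B.
    n = len(alpha)
    eps = [1.0 if b >= a else -1.0 for a, b in zip(alpha, beta)]
    x = []
    for j in range(n):
        xj = eps[j] * (beta[j] - alpha[j])
        for i in range(n):
            den = eps[i] * alpha[i] - eps[j] * alpha[j]
            if den != 0.0:  # product only over nonzero denominators
                xj *= (eps[i] * beta[i] - eps[j] * alpha[j]) / den
        x.append(xj)
    q = [math.sqrt(v) for v in x]
    p = [e * v for e, v in zip(eps, q)]
    B = [[(alpha[i] if i == j else 0.0) + p[i] * q[j] for j in range(n)]
         for i in range(n)]
    return eps, B

def elementary_symmetric_2(v):
    return v[0] + v[1], v[0] * v[1]

if __name__ == "__main__":
    alpha, beta = (2.0, 1.0), (1.5, 1.2)
    eps, B = rank_one_perturbation(alpha, beta)
    print("signed singular values of B:", signed_singular_values_2x2(B))
    A = [[alpha[0], 0.0], [0.0, alpha[1]]]
    t = 0.4
    C = [[(1 - t) * A[i][j] + t * B[i][j] for j in range(2)] for i in range(2)]
    gamma = signed_singular_values_2x2(C)
    print("S(eps*gamma):", elementary_symmetric_2([e * g for e, g in zip(eps, gamma)]))
```

For this pair the printed values of S(ǫγ) at t = 0.4 agree with the affine interpolation between S(ǫα) = (−1, −2) and S(ǫβ) = (−0.3, −1.8), as (5) asserts.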
LEMMA 2 ([22, Lemma 5.5]). Let A be diagonal, invertible and have distinct
singular values. Then there exists a dense subset D of Rn × Sn−1 such that for
each (a, n) ∈ D and each t ∈ R the matrix A + ta ⊗ n has distinct singular values.
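LEMMA 3 ([21, equation (6.11)]). Let $f : M^{n\times n} \to R$ be invariant and of class $C^{2}$ in a neighborhood of $A = \operatorname{diag}(\alpha)$, $\alpha \in G^{n}$, where $\alpha^{2}$ has distinct components, let $B := a \otimes b$, $a, b \in R^{n}$, and $\epsilon \in \{1, -1\}^{n}$. Then
\[
Df(A)(B) = \sum_{i=1}^{n} \tilde f_i B_{ii}, \tag{6}
\]
\[
D^{2} f(A)(B, B) = \frac{1}{2} \sum_{1 \le i \ne j \le n} K_{ij}\,(B_{ij} - \epsilon_i\epsilon_j B_{ji})^{2} + \sum_{i,j=1}^{n} T^{\epsilon}_{ij}\, \epsilon_i B_{ii}\, \epsilon_j B_{jj}, \tag{7}
\]
where $T^{\epsilon}_{ii} = \tilde f_{ii}$ and for $i \ne j$,
\[
K_{ij} = \frac{\alpha_i \tilde f_i - \alpha_j \tilde f_j}{\alpha_i^{2} - \alpha_j^{2}}, \tag{8}
\]
\[
T^{\epsilon}_{ij} = \epsilon_i\epsilon_j \tilde f_{ij} + \frac{\epsilon_i \tilde f_i - \epsilon_j \tilde f_j}{\epsilon_i\alpha_i - \epsilon_j\alpha_j}, \tag{9}
\]
with the derivatives evaluated at α.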
Let $m_k : M^{n\times n} \to R$, k = 1, . . . , n, be the partial products of the signed singular values,
\[
m_k(A) = \prod_{i=1}^{k} \tau_i,
\]
$A \in M^{n\times n}$, where τ are the signed singular values of A. The rank 1 perturbations described in Proposition 1 have the property that $p = \operatorname{diag}(\epsilon) q$ and hence $\operatorname{diag}(\epsilon) B$ is symmetric. The following lemma shows that rank 1 perturbations with $p = \operatorname{diag}(\epsilon) q$ have a special position in the class of all rank 1 perturbations.
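LEMMA 4. Let $a, b \in R^{n}$, $\epsilon \in \{1, -1\}^{n}$ satisfy
\[
\epsilon_i a_i b_i \ge 0, \qquad i = 1, \ldots, n, \tag{10}
\]
and define $p, q \in R^{n}$ by
\[
q_i = \sqrt{\epsilon_i a_i b_i}, \qquad p_i = \epsilon_i q_i.
\]
Furthermore, let $A = \operatorname{diag}(\alpha)$, $\alpha \in G^{n}$, and set
\[
C(t) := A + t\, a \otimes b, \qquad R(t) := A + t\, p \otimes q,
\]
for any $t \in R$. If A is invertible and $\alpha^{2}$ has distinct components then for all t sufficiently close to 0,
\[
m_k(R(t)) \le m_k(C(t)), \quad 1 \le k < n, \qquad m_n(R(t)) = m_n(C(t)). \tag{11}
\]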
Proof. Since A is invertible and $\alpha^{2}$ has distinct components, the singular values, considered as functions of the matrix argument, are of class $C^{\infty}$ in a neighborhood of A by [4, Section 6]. Since the first n − 1 signed singular values coincide with the singular values, also $m_k$, k = 1, . . . , n − 1, are of class $C^{\infty}$ in a neighborhood of A. Thus we can apply Lemma 3. For each $i \ne j$ denote by $K^{k}_{ij}$ the K-matrix defined by (8) for $m_k$ at α and for each i, j denote by $T^{k,\epsilon}_{ij}$ the $T^{\epsilon}$-matrix defined by (9) for $m_k$ at α. Then for k < n,
\[
K^{k}_{ij} = K^{k}_{ji} =
\begin{cases}
0 & \text{if } 1 \le i < j \le k,\\[1ex]
\dfrac{m_k}{\alpha_i^{2} - \alpha_j^{2}} > 0 & \text{if } 1 \le i \le k < j.
\end{cases}
\tag{12}
\]
Note first that we have
\[
\det C(t) = \det A \left(1 + t \sum_{i=1}^{n} \alpha_i^{-1} a_i b_i\right) = \det A \left(1 + t \sum_{i=1}^{n} \alpha_i^{-1} p_i q_i\right) = \det R(t),
\]
which implies the assertion about $m_n$. Let $c_k(t) = m_k(C(t))$, $r_k(t) = m_k(R(t))$, $t \in R$. The functions $c_k$, $r_k$ are infinitely differentiable in some neighborhood of 0; let $\dot c_k$, $\dot r_k$, $\ddot c_k$, $\ddot r_k$ denote their first two derivatives at 0. Let
\[
D = a \otimes b, \qquad S = p \otimes q,
\]
and note that $D_{ii} = S_{ii}$. Since the first derivative depends only on the diagonal elements (see (6)), we have
\[
\dot c_k = \dot r_k.
\]
ON SO(n)-INVARIANT RANK 1 CONVEX FUNCTIONS
757
Furthermore, since pi qj = ǫi ǫj pj qi , equation (7) provides
n
1
k
2
c̈k =
Tijk,ǫ Dii Djj ,
Kij (ai bj − ǫi ǫj aj bi ) +
2 1i =j n
i,j =1
r̈k =
n
i,j =1
Tijk,ǫ Sii Sjj =
n
Tijk,ǫ Dii Djj ,
i,j =1
and thus
c̈k − r̈k =
1
K k (ai bj − ǫi ǫj aj bi )2 .
2 1i =j n ij
(13)
Let u be the smallest integer such that
ai = bi = 0,
u < i n,
where the case u = n is not excluded. Then from the definition, also
pi = qi = 0,
u < i n,
and both R(t), C(t) have block diagonal forms,
C(t) = diag(Ã1 + t ã ⊗ b̃, Ã2 ),
R(t) = diag(Ã1 + t p̃ ⊗ q̃, Ã2 ),
where
Ã1 = diag(α1 , . . . , αu ),
Ã2 = diag(αu+1 , . . . , αn ),
and p̃, q̃, ã, b̃ ∈ Ru are the obvious truncations of p, q, a, b, respectively. Thus the
list of singular values of R(t) is the union of the list of singular values of Ã1 + p̃⊗ q̃
with {αu+1 , . . . , αn } and the same holds for C(t). Since the components of α are
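ordered and distinct, for t sufficiently close to 0, we have, for continuity reasons,
\[ \tau_i(C(t)) = \tau_i(R(t)) = \alpha_i, \qquad u < i \le n. \tag{14} \]
Let us distinguish the following two cases:
\[
\begin{aligned}
a_i b_u - \epsilon_i \epsilon_u a_u b_i &= 0 \quad \text{for all } i < u, \\
a_v b_u - \epsilon_v \epsilon_u a_u b_v &\ne 0 \quad \text{for some } v < u.
\end{aligned}
\tag{15}
\]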
Assume first (15)_1. Then from the definition of u we have either a_u ≠ 0 or b_u ≠ 0.
Assume the latter; the treatment under the former assumption is similar. Then
from (15)_1,
\[ a_i = \epsilon_i \lambda b_i \tag{16} \]
if 1 ≤ i ≤ u, where λ = ε_u a_u / b_u. Since ε_u a_u b_u ≥ 0, we have λ ≥ 0.
Exclude the trivial case λ = 0. Note also that (16) extends trivially to all i. Then
\[ q_i = \sqrt{\epsilon_i a_i b_i} = \sqrt{\lambda}\,|b_i|. \]
Thus there exists a σ ∈ {1, −1}^n such that q_i = σ_i √λ b_i and hence from (16),
p_i = σ_i a_i / √λ. Thus if J := diag(σ), we have D = JSJ and consequently
\[ R(t) = J C(t) J, \qquad t \in \mathbb{R}. \]
The invariance of the signed singular values implies that
\[ \tau_i(R(t)) = \tau_i(C(t)), \qquad 1 \le i \le n, \]
and (11) holds with equality signs for all t ∈ R. Next assume that (15)_2 holds. Then
by (12) and (13) we have
\[ \ddot c_k - \ddot r_k \ge K^k_{vu}\,(a_v b_u - \epsilon_v \epsilon_u a_u b_v)^2 > 0. \]
Thus (11) holds for all q < u and t sufficiently close to 0 by continuity. Moreover,
we have
\[ \det(R(t)) = r_u(t)\,\alpha_{u+1} \cdots \alpha_n, \qquad \det(C(t)) = c_u(t)\,\alpha_{u+1} \cdots \alpha_n, \]
and thus since det(R(t)) = det(C(t)) we conclude that r_u(t) = c_u(t) for all t ∈ R.
Finally using this and (14) we see that (11) holds with the equality sign for all
k ≥ u and all t ∈ R.
✷
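The determinant identity used at the start of the proof is an instance of the matrix determinant lemma, det(A + t a ⊗ b) = det A (1 + t Σ α_i^{-1} a_i b_i) for A = diag(α). The following minimal sketch checks it numerically, assuming C(t) = A + t a ⊗ b and R(t) = A + t p ⊗ q with p_i q_i = a_i b_i; the function names and the sample vectors are illustrative choices, not taken from the paper.

```python
from itertools import permutations


def det(m):
    """Determinant by the Leibniz formula (adequate for small matrices)."""
    n = len(m)
    total = 0.0
    for perm in permutations(range(n)):
        # Parity of the permutation from its inversion count.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = -1.0 if inversions % 2 else 1.0
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total


def rank_one_update(alpha, a, b, t):
    """Return diag(alpha) + t * (a tensor b) as a list of rows."""
    n = len(alpha)
    return [
        [(alpha[i] if i == j else 0.0) + t * a[i] * b[j] for j in range(n)]
        for i in range(n)
    ]


def det_by_lemma(alpha, a, b, t):
    """det(diag(alpha)) * (1 + t * sum_i a_i b_i / alpha_i)."""
    prod = 1.0
    for x in alpha:
        prod *= x
    return prod * (1.0 + t * sum(ai * bi / al for ai, bi, al in zip(a, b, alpha)))
```

Only the diagonal products a_i b_i enter the formula, which is why replacing (a, b) by any (p, q) with p_i q_i = a_i b_i leaves the determinant unchanged.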
Let g: D → R be a function defined on an interval D ⊂ R. We say that t ∈ R is
a local subgradient of g at x if there exists an ε > 0 such that g(y) − g(x) ≥ t (y − x)
for all y ∈ D such that |x − y| < ε.
LEMMA 5 ([22, Proposition A.1]). Let g: D → R be a continuous function on
an interval D ⊂ R which has a local subgradient t(x) at each x ∈ D. Then g is
convex.
The conclusion does not hold without the continuity hypothesis: consider, e.g.,
g: R → R given by g(x) = 1 if x < 0 and g(x) = 0 if x ≥ 0.
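The counterexample can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names, the sampling grid and the tolerances are hypothetical choices. It verifies on a grid that t = 0 is a local subgradient of the step function at several points, and that the convexity inequality nevertheless fails.

```python
def g(x):
    """Step function from the counterexample: 1 for x < 0, 0 for x >= 0."""
    return 1.0 if x < 0 else 0.0


def is_local_subgradient(func, x, t, eps, samples=2001):
    """Check func(y) - func(x) >= t*(y - x) on a grid of y with |y - x| < eps."""
    for k in range(samples):
        y = x - eps + 2.0 * eps * (k + 0.5) / samples  # strictly inside the interval
        if func(y) - func(x) < t * (y - x) - 1e-12:
            return False
    return True


def violates_convexity(func, x, y, lam):
    """True if func((1-lam)x + lam*y) > (1-lam)func(x) + lam*func(y)."""
    z = (1.0 - lam) * x + lam * y
    return func(z) > (1.0 - lam) * func(x) + lam * func(y) + 1e-12


def check_points(points):
    """t = 0 is tested at each point, with a neighbourhood that avoids the jump
    unless the point is the jump itself."""
    results = []
    for x in points:
        eps = abs(x) / 2.0 if x != 0 else 1.0
        results.append(is_local_subgradient(g, x, 0.0, eps))
    return results
```

For instance, `check_points([-1.0, -0.1, 0.0, 0.5])` tests the subgradient condition on both sides of the jump and at the jump, while `violates_convexity(g, -1.0, 1.0, 0.25)` tests the convexity inequality across it.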
3. Invariant Rank 1 Convex Functions
The main result of the paper is
THEOREM 6. An invariant f : M^{n×n} → R is rank 1 convex if and only if it
satisfies the following two conditions:
(i) if α, β ∈ G^n satisfy
\[
\prod_{i=1}^{k} \alpha_i \le \prod_{i=1}^{k} \beta_i, \quad k = 1, \dots, n-1,
\qquad \text{and} \qquad
\prod_{i=1}^{n} \alpha_i = \prod_{i=1}^{n} \beta_i
\]
then
\[ \tilde f(\alpha) \le \tilde f(\beta); \]
(ii) if α, β ∈ G^n satisfy the BIL and γ ∈ G^n, ε ∈ {1, −1}^n, t ∈ [0, 1] satisfy
ε(β − α) ≥ 0 and
\[ S(\epsilon\gamma) = (1 - t)\,S(\epsilon\alpha) + t\,S(\epsilon\beta) \tag{17} \]
then
\[ \tilde f(\gamma) \le (1 - t)\,\tilde f(\alpha) + t\,\tilde f(\beta). \tag{18} \]
It is not clear whether the assertion holds for functions with values in R ∪ {∞}.
As explained in [24], condition (i) is equivalent to the Baker–Ericksen inequalities
if f is of class C 2 . Item (ii) shows that the rank 1 convexity in Mn×n translates
into the representation space Gn as follows. The rank 1 line segments are replaced
by curves with endpoints satisfying the BIL on which the elementary symmetric
functions (composed with the sign matrix ǫ) behave like affine functions. For f of
class C 1 Theorem 6 can be deduced from [21, Section 7].
Proof. Let f be rank 1 convex. (i) follows from [24, Theorem 5.4].
(ii) Let A = diag(α) and let B = A + p ⊗ q be the rank 1 perturbation of A with
the signed singular values β as described in Proposition 1. Consider C = (1−t)A+
tB and denote by γ the signed singular values of C. Then by Proposition 1(iii), γ
satisfies (17) and by the well known uniqueness property of elementary symmetric
functions, this realization of γ is the only way to satisfy (17). The rank 1 convexity
inequality for f and A, B, C reduces to (18).
Conversely, assume that f satisfies conditions (i) and (ii). Let us first show that
f˜ is separately convex. It suffices to show that h := f˜(·, δ) is convex on R for
each δ = (δ2 , . . . , δn ) ∈ Gn−1 . For each p ∈ R let ξ(p) ∈ Gn be the signed
singular values of diag(p, δ). Assume first that δn > 0 and prove that h is convex
on [−δn , ∞). Let a, b, c ∈ [−δn , ∞), t ∈ [0, 1] satisfy c = (1 − t)a + tb and set
α = ξ(a),
β = ξ(b),
γ = ξ(c).
Since diag(b, δ) is a rank 1 perturbation of diag(a, δ) we see that α, β satisfy the
BIL. Assume without any loss of generality that a < b, let ε = (1, . . . , 1) and
show that ε(β − α) ≥ 0. Indeed, using b > a ≥ −δ_n one finds that α, β are of the
form
\[
\alpha = (\delta_2, \dots, \delta_k, a, \delta_{k+1}, \dots, \delta_n), \qquad
\beta = (\delta_2, \dots, \delta_l, b, \delta_{l+1}, \dots, \delta_n),
\tag{19}
\]
\[
\gamma = (\delta_2, \dots, \delta_m, c, \delta_{m+1}, \dots, \delta_n),
\tag{20}
\]
where k ≥ m ≥ l and the cases when k or m or l is equal to n are not excluded.
From (19), (20),
\[ S(\alpha) = S(a, \delta), \qquad S(\beta) = S(b, \delta), \qquad S(\gamma) = S(c, \delta), \]
and since the elementary symmetric functions are separately affine (affine in each variable),
we find that (17) holds. Hence f˜(·, δ) is convex on [−δn , ∞). Similar considerations show that f˜(·, δ) is convex on (−∞, δn ]. Since the overlap of (−∞, δn ]
and [−δn , ∞) has a nonempty interior, it follows that f˜(·, δ) is convex on R. The
same considerations apply to δn < 0. Finally assume that δn = 0 and let u be the
largest integer such that δu > 0. The above considerations can be modified to show
that f˜(·, δ) is convex on (−∞, 0] and [0, ∞). The application of (i) gives that
f˜(·, δ) is nondecreasing on [0, ∞) and as δn = 0, we have f˜(−p, δ) = f˜(p, δ)
for each p ∈ R by the even nature of f˜. Thus f˜(·, δ) is symmetric about 0, and
nondecreasing and convex on [0, ∞). It follows that f˜(·, δ) is convex on R. To
summarize f˜ is separately convex on Rn and hence locally Lipschitz continuous
by [14, Theorem 4.4.1, p. 112]. Since f˜ is continuous and the signed singular
values are Lipschitz continuous (this can be deduced from the Lipschitz continuity
of the ordinary singular values, see [4, Section 6]), f is continuous.
Next let us show that conditions (i), (ii) imply the rank 1 convexity. Let Ā ∈
M^{n×n}, ā, b̄ ∈ R^n and ϕ(t) := f(Ā + t ā ⊗ b̄), t ∈ R; we have to prove that ϕ
is convex. This will be done by showing that ϕ has a local subgradient at each t.
Assume first that Ā, ā, b̄ are such that Ā + t ā ⊗ b̄ is nondegenerate for each t ∈ R.
Thus let t be fixed, denote by α the signed singular values of Ā + t ā ⊗ b̄ so that
Ā + t ā ⊗ b̄ = Q diag(α) R^T for some Q, R ∈ SO(n). Let a := Q^T ā, b := R^T b̄
so that ϕ(s) = f(diag(α) + s a ⊗ b), s ∈ R. Let p, q ∈ R^n, ε ∈ {1, −1}^n be as
in Lemma 4, let ϕ̄(s) := f(diag(α) + s p ⊗ q), and let δ(s) be the signed singular
values of diag(α) + s p ⊗ q, s ∈ R. Then, because of the special choice of p, q, the
function s ↦ S(εδ(s)) is affine, as the proof of Proposition 1 shows. An appeal to
(ii) implies that ϕ̄ is convex, and since it is continuous, it has a subgradient q at
t. Combining Lemma 4 with (i), we obtain that ϕ(s) ≥ ϕ̄(s) for all s sufficiently
close to t. As ϕ(t) = ϕ̄(t) we conclude that q is a local subgradient of ϕ at t. A
reference to Lemma 5 implies that ϕ is convex. Thus the conclusion follows under
the additional assumption of nondegeneracy. Lemma 2 then extends the conclusion
to a general situation.
✷
Note that in dimension n = 2, conditions (i), (ii) of Theorem 6 can be joined
into one and the requirement that f be finite-valued can be removed:
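THEOREM 7 ([25, Theorem 5.3]). Let f : M^{2×2} → R ∪ {∞} be invariant. Then
f is rank 1 convex if and only if
\[ \tilde f(\gamma) \le (1 - t)\,\tilde f(\alpha) + t\,\tilde f(\beta) \]
for every α, β, γ ∈ G^2 and t ∈ [0, 1] such that α, β satisfy the BIL and
\[
\begin{aligned}
\gamma_1 \gamma_2 &= (1 - t)\,\alpha_1 \alpha_2 + t\,\beta_1 \beta_2, \\
\gamma_1 + \epsilon\gamma_2 &\le (1 - t)(\alpha_1 + \epsilon\alpha_2) + t\,(\beta_1 + \epsilon\beta_2),
\end{aligned}
\]
where
\[
\epsilon =
\begin{cases}
+1 & \text{if } (\alpha_1 - \beta_1)(\alpha_2 - \beta_2) \ge 0, \\
-1 & \text{if } (\alpha_1 - \beta_1)(\alpha_2 - \beta_2) < 0.
\end{cases}
\]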
References
1. G. Aubert, Necessary and sufficient conditions for isotropic rank-one convex functions in
dimension 2. J. Elasticity 39 (1995) 31–46.
2. G. Aubert and R. Tahraoui, Sur la faible fermeture de certains ensembles de contraintes en
élasticité non linéaire plane. Arch. Rational Mech. Anal. 97 (1987) 33–58.
3. J.M. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational
Mech. Anal. 63 (1977) 337–403.
4. J.M. Ball, Differentiability properties of symmetric and isotropic functions. Duke Math. J. 51
(1984) 699–728.
5. J.M. Ball and R.D. James, Fine phase mixtures as minimizers of energy. Arch. Rational Mech.
Anal. 100 (1987) 13–52.
6. J.M. Ball and R.D. James, Proposed experimental tests of a theory of fine microstructure and
the two-well problem. Philos. Trans. Roy. Soc. London 338 (1992) 389–450.
7. M. Chipot and D. Kinderlehrer, Equilibrium configurations of crystals. Arch. Rational Mech.
Anal. 103 (1988) 237–277.
8. B. Dacorogna, Direct methods in the calculus of variations. Springer, Berlin (1989).
9. B. Dacorogna, Necessary and sufficient conditions for strong ellipticity of isotropic functions
in any dimension. Discrete Contin. Dyn. Syst. B2 (2001) 257–263.
10. B. Dacorogna and H. Koshigoe, On the different notions of convexity for rotationally invariant
functions. Ann. Fac. Sci. Toulouse II (1993) 163–184.
11. B. Dacorogna and P. Marcellini, Implicit Partial Differential Equations. Birkhäuser, Basel
(1999).
12. J.K. Knowles and E. Sternberg, On the failure of ellipticity of the equations for finite elastostatic
plane strain. Arch. Rational Mech. Anal. 63 (1977) 321–326.
13. R.V. Kohn and G. Strang, Optimal design and relaxation of variational problems, I, II, III.
Comm. Pure Appl. Math. 39 (1986) 113–137, 139–182, 353–377.
14. C.B. Morrey Jr, Multiple Integrals in the Calculus of Variations. Springer, New York (1966).
15. S. Müller, Variational models for microstructure and phase transitions. In: Calculus of Variations and Geometric Evolution Problems (Cetraro, 1996), Lecture Notes in Math. 1713.
Springer, Berlin (1999) pp. 85–210.
16. P. Pedregal, Parametrized Measures and Variational Principles. Birkhäuser, Basel (1997).
17. P. Rosakis, Characterization of convex isotropic functions. J. Elasticity 49 (1997) 257–267.
18. T. Roubíček, Relaxation in Optimization Theory and Variational Calculus. W. de Gruyter,
Berlin (1997).
19. M. Šilhavý, The Mechanics and Thermodynamics of Continuous Media. Springer, Berlin
(1997).
20. M. Šilhavý, Convexity conditions for rotationally invariant functions in two dimensions. In:
A. Sequeira et al. (eds), Applied Nonlinear Analysis. Kluwer Academic Publishers, New York
(1999) pp. 513–530.
21. M. Šilhavý, On isotropic rank 1 convex functions. Proc. Roy. Soc. Edinburgh A 129 (1999)
1081–1105.
22. M. Šilhavý, Rotationally invariant rank 1 convex functions. Appl. Math. Optim. 44 (2001) 1–15.
23. M. Šilhavý, Rank 1 convex hulls of isotropic functions in dimension 2 by 2. Math. Bohem. 126
(2001) 521–529.
24. M. Šilhavý, Monotonicity of rotationally invariant convex and rank 1 convex functions. Proc.
Roy. Soc. Edinburgh A 132 (2001) 419–435.
25. M. Šilhavý, On the semiconvexity properties of rotationally invariant functions in two
dimensions. (2001) To be published.
26. M. Šilhavý, Rank 1 convex hulls of rotationally invariant functions. In: Ch. Miehe (ed.), Proceedings of the IUTAM Symposium on Computational Mechanics of Solid Materials at Large
Strains. Kluwer Academic Publishers, New York (2003) pp. 87–98. In press.
27. H.C. Simpson and S. Spector, On copositive matrices and strong ellipticity for isotropic elastic
materials. Arch. Rational Mech. Anal. 84 (1983) 55–68.
28. C. Truesdell and W. Noll, The non-linear field theories of mechanics. In: S. Fluegge (ed.),
Handbuch der Physik III/3. Springer, Berlin (1965).
On Thermodynamics of Nonlinear Poroelastic
Materials
K. WILMAŃSKI
Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany.
E-mail:wilmansk@wias-berlin.de
Received 30 November 2002
Abstract. The paper contains a brief presentation of a macroscopical thermodynamic model of
poroelastic materials with many fluid components. A particular emphasis is placed on a Lagrangian
formulation of the model and, consequently, on a consistent formulation of field equations on the
reference configuration of the skeleton (solid phase of the mixture). It is demonstrated that the
model possesses an identical structure as that in the pioneering work of C.A. Truesdell on the continuum mixture of fluids. An issue of porosity as an additional microstructural variable is particularly
exposed.
Mathematics Subject Classifications (2000): 74A15, 74E30, 74L05.
Key words: continuum thermodynamics, porous media, mixtures.
In memoriam of Prof. Clifford Ambrose Truesdell whose work in continuum
mechanics created new standards of research in field theories
1. Introduction
The classical continuum theory of mixtures, whose development was started in 1957
by the famous papers of Truesdell [1], is primarily designed to cover systems of
many fluid components. In 1982 Bowen [2] (see also [3]) extended this
classical theory to mixtures in which one component is a solid. This put theories
of porous materials on the same footing as mixtures of fluids. During the last twenty
years this field of research has developed rapidly and has in the meantime stimulated studies
of such systems as suspensions, mixtures of granular materials saturated or not
saturated with a fluid, and many others.
In spite of this development there are still some controversies concerning the
construction of nonlinear models in which large deformations of the skeleton are
incorporated. This is related to the fact that, in contrast to mixtures of fluids, a solid
component (skeleton) lends itself naturally to a Lagrangian description of the system.
Bowen used in his papers a mixed description (Lagrangian for the solid skeleton and Eulerian for the fluid), but such an approach leads to technical difficulties
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 763–777.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
in applications of the model, in the formulation of boundary conditions, etc. For this
reason I proposed in 1995 a different description of two-component
porous materials [4]. It can be extended to many components, and first results
for multicomponent porous materials have been published in [5].
In this work we present the full structure of a Lagrangian model of a poroelastic
material in which there may be more than one fluid component and the kinematics
of the skeleton is formulated in the Lagrangian way.
In Section 2 we define the Lagrangian description of multicomponent systems
and introduce various kinematical quantities analogous to those appearing in Truesdell’s theory of fluid mixtures.
In Section 3 we present partial balance equations in the Lagrangian description
in their global and local form. It is emphasized that in contrast to such balance
equations for single continua they contain convective contributions whose form is
objective. We also present a balance equation for the microstructural field of porosity and justify its macroscopic form on phenomenological grounds. This extension
of the microstructural model has been proposed in papers [6, 7].
Section 4 contains a discussion of thermodynamic admissibility of constitutive
relations for poroelastic materials with ideal fluid components. The whole development is fully macroscopical in contrast to many other works on this subject
which are based on the notion of so-called true (real) densities. These may be
introduced in the present model if needed at any stage of development but they are
not necessary for the formulation of the consistent mathematical model. In order to
be more specific we limit the attention solely to isotropic systems.
Section 5 is devoted to the specification of some special models which have
an important practical bearing. In particular we discuss the simplest model of a
two-component poroelastic material.
In Conclusions we indicate advantages of the Lagrangian description for both
theoretical development as well as for numerical evaluations of the boundary value
problems.
2. Porous Medium as a Mixture. Reference Configurations, Lagrangian
Description
The construction of the theory of mixtures of fluids proposed by Truesdell [1] is
based on the Eulerian description of motion of components. As a continuum model
it is based on the assumption that at each point of the space of configurations ℜ3
all components are present simultaneously. Their various contributions are characterized by different concentrations (fractions of partial mass densities to the total
mass density) as well as by their own velocity fields.
A model of porous materials requires an extension of this approach. On the one
hand it must account for large deformations of a solid component of the mixture
which describes the behaviour of the skeleton of the porous medium. This indicates
the necessity of the Lagrangian description which has been in part (solely to the
solid component) employed by Bowen [2]. On the other hand a description of the
microstructure must be extended as its properties are described not only by concentrations but also by a volume fraction of voids called porosity. This is particularly
visible when the porous material consists only of the solid component, i.e., the
mass densities of fluid components are all identically zero. Then the concentrations
are also zero but the microstructure is not trivial. This additional field requires an
additional equation and in the above mentioned paper Bowen proposed an evolution equation describing its relaxation properties. An alternative approach has
been proposed earlier for granular materials by Goodman and Cowin [8]. In their
paper the authors proposed a second order equation for a microstructural behavior.
Such an approach related to the so-called principle of self-equilibrated forces has
been modified by Hutter and Svendsen [9] and is applied in the description of
avalanches with abrasion [10]. In this paper we rely on a balance equation for
porosity introduced in my own works [6, 7].
We consider a porous medium whose channels are filled with a mixture of
A fluid components. The model is constructed on a chosen reference configuration B_0 of the solid component, i.e., all fields are functions of a spatial variable
X ∈ B_0 and time t ∈ T. We consider a thermomechanical model in which the
governing fields are as follows:
1. ρ^S – mass density of the skeleton in the reference configuration,
2. ρ^α, α = 1, . . . , A, – partial mass densities of fluid components referring to the
unit volume of the reference configuration of the skeleton,
3. x́^S – velocity field of the skeleton,
4. F^S – deformation gradient of the skeleton,
5. x́^α, α = 1, . . . , A, – velocity fields of fluid components,
6. θ^S – absolute temperature of the skeleton,
7. θ^α, α = 1, . . . , A, – absolute temperatures of fluid components,
8. n – porosity (the volume fraction of voids).
Further in this work we assume that temperatures of components are the same
\[ \theta = \theta^S = \theta^1 = \cdots = \theta^A. \tag{1} \]
From the thermodynamic point of view little has been done for continuum
theories of mixtures in which this condition is not satisfied (e.g., [11]). Some
semi-kinetic models have been proposed for ionized gases (plasma; e.g., [12]).
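The above fields are related to their Eulerian counterparts in the following way
\[
\begin{aligned}
\rho^S_t(\mathbf{x}, t) &:= \rho^S\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr)\, J^{S-1}\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad J^S := \det \mathbf{F}^S, \\
\rho^\alpha_t(\mathbf{x}, t) &:= \rho^\alpha\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr)\, J^{S-1}\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad \alpha = 1, \dots, A, \\
\mathbf{v}^S(\mathbf{x}, t) &:= \acute{\mathbf{x}}^S\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \\
\mathbf{v}^\alpha(\mathbf{x}, t) &:= \acute{\mathbf{x}}^\alpha\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad \alpha = 1, \dots, A, \\
n(\mathbf{x}, t) &:= n\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr),
\end{aligned}
\tag{2}
\]
where the function of motion of the skeleton
\[ \mathbf{x} = \mathbf{f}(\mathbf{X}, t), \tag{3} \]
is assumed to be at least twice continuously differentiable almost everywhere, i.e.,
\[ \acute{\mathbf{x}}^S = \frac{\partial \mathbf{f}}{\partial t}, \qquad \mathbf{F}^S = \operatorname{Grad} \mathbf{f}. \tag{4} \]
Hence the fields x́^S, F^S must satisfy the following integrability conditions
\[ \frac{\partial \mathbf{F}^S}{\partial t} = \operatorname{Grad} \acute{\mathbf{x}}^S, \qquad \operatorname{Grad} \mathbf{F}^S = \bigl(\operatorname{Grad} \mathbf{F}^S\bigr)^{T}. \tag{5} \]
The reference configuration B_0 is chosen in such a way that it is identical with
a configuration at the instant of time t = t_0 for which
\[ \forall \mathbf{X} \in B_0: \quad \mathbf{F}^S(\mathbf{X}, t_0) = \mathbf{1}. \tag{6} \]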
This choice of reference configuration is convenient for systems in which the
solid component forms a skeleton whose topology does not change during the
motion. It is the case for modelling of rocks, it may or may not be the case for
granular materials, and it is certainly not the case for suspensions of solid particles
which appear, for instance, after liquefaction of a granular compact material.
For the above fields, field equations follow from general balance equations
which we discuss in the next section.
3. Balance Equations
We skip here axiomatic foundations for the integral representation of a general
balance law. These may be found in Truesdell’s book [13], which is after more
than 30 years still the most important reference on this subject.
The general form of this equation for a density ϕ(X, t), written for an arbitrary
domain P(t) whose motion is described by a velocity field V(X, t), is as follows:
\[
\frac{d}{dt} \int_{P(t)} \varphi(\mathbf{X}, t)\, dV
= \oint_{\partial P(t)} \boldsymbol{\Phi}(\mathbf{X}, t) \cdot \mathbf{N}\, dS
+ \int_{P(t)} \gamma(\mathbf{X}, t)\, dV,
\tag{7}
\]
where Φ is the so-called flux of ϕ, and γ is its volume supply. The first integral
on the right-hand side is evaluated over a closed surface ∂P of the domain P and
describes the transport through the surface. N is the field of unit outward normal
to the surface. If we perform the differentiation on the left-hand side and apply the
Stokes theorem, we obtain
\[
\int_{P(t)} \Bigl[ \frac{\partial \varphi}{\partial t} + \operatorname{Div}(\varphi \mathbf{V} - \boldsymbol{\Phi}) - \gamma \Bigr] dV = 0.
\tag{8}
\]
We apply this relation to partial quantities listed in the previous section. In order
to do so we have to find the kinematics of material domains for each component
related to the reference configuration B_0. Obviously for material domains with
respect to the skeleton we have V ≡ 0. For fluid components we have to use the
assumption on the simultaneous appearance of all components in each point of the
domain B_t := f(B_0, t) in the configuration space. For the α-component we have
then along the trajectory
\[
\begin{aligned}
\forall \mathbf{x} \in B_t,\ \forall \mathbf{x}' \in N(\mathbf{x}) \subset B_t: \quad
\mathbf{x}' &= \mathbf{x} + \acute{\mathbf{x}}^\alpha \Delta t + O(\Delta t^2) \\
&= \mathbf{x} + \mathbf{F}^S \bigl( \mathbf{f}^{-1}(\mathbf{x}', t) - \mathbf{f}^{-1}(\mathbf{x}, t) \bigr) + \acute{\mathbf{x}}^S \Delta t
+ O(|\mathbf{x}' - \mathbf{x}|^2),
\end{aligned}
\]
where N(x) is a neighbourhood of x. The limit Δt → 0 in this relation yields the
following velocity field for material domains of the α-component in the reference
configuration of the skeleton
\[
\forall \mathbf{X} \in B_0: \quad
\acute{\mathbf{X}}^\alpha(\mathbf{X}, t) := \lim_{\Delta t \to 0} \frac{\mathbf{f}^{-1}(\mathbf{x}', t) - \mathbf{f}^{-1}(\mathbf{x}, t)}{\Delta t}
= \mathbf{F}^{S-1}(\mathbf{X}, t) \bigl( \acute{\mathbf{x}}^\alpha(\mathbf{X}, t) - \acute{\mathbf{x}}^S(\mathbf{X}, t) \bigr).
\tag{9}
\]
We call this field the Lagrangian velocity of the α-component.
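As an illustration of relation (9), the Lagrangian velocity is the pull-back by F^S of the velocity of a fluid component relative to the skeleton. The sketch below evaluates it at a single point by solving F^S X́^α = x́^α − x́^S with Gaussian elimination; the function names and the sample data in the usage note are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs], copied so the inputs are not modified.
    m = [[float(v) for v in matrix[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("deformation gradient is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def lagrangian_velocity(F_s, v_fluid, v_skeleton):
    """Return F_s^{-1} (v_fluid - v_skeleton), the Lagrangian velocity of (9)."""
    relative = [vf - vs for vf, vs in zip(v_fluid, v_skeleton)]
    return solve(F_s, relative)
```

For a skeleton stretched by a factor 2 along the first axis, a relative velocity of 2 along that axis corresponds to a Lagrangian velocity of 1. When the fluid moves with the skeleton the Lagrangian velocity vanishes, whatever F^S is.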
Assuming that the balance equation (8) for a partial quantity ϕ^α holds true
for any material domain of the α-component, we obtain in the standard way the
following local form of this equation
\[
\frac{\partial \varphi^\alpha}{\partial t} + \operatorname{Div}\bigl( \varphi^\alpha \acute{\mathbf{X}}^\alpha - \boldsymbol{\Phi}^\alpha \bigr) = \gamma^\alpha \quad \text{a.e. in } B_0.
\tag{10}
\]
Obviously Φ^α denotes the corresponding partial flux, and γ^α is the partial volume
supply.
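In particular we have:
• partial mass balance equations
\[
\frac{\partial \rho^S}{\partial t} = \hat\rho^S, \qquad
\frac{\partial \rho^\alpha}{\partial t} + \operatorname{Div}\bigl( \rho^\alpha \acute{\mathbf{X}}^\alpha \bigr) = \hat\rho^\alpha, \qquad \alpha = 1, \dots, A,
\tag{11}
\]
• partial momentum balance equations
\[
\begin{aligned}
\frac{\partial (\rho^S \acute{\mathbf{x}}^S)}{\partial t} &= \operatorname{Div} \mathbf{P}^S + \hat{\mathbf{p}}^S + \rho^S \mathbf{b}^S, \\
\frac{\partial (\rho^\alpha \acute{\mathbf{x}}^\alpha)}{\partial t} + \operatorname{Div}\bigl( \rho^\alpha \acute{\mathbf{x}}^\alpha \otimes \acute{\mathbf{X}}^\alpha \bigr) &= \operatorname{Div} \mathbf{P}^\alpha + \hat{\mathbf{p}}^\alpha + \rho^\alpha \mathbf{b}^\alpha, \qquad \alpha = 1, \dots, A,
\end{aligned}
\tag{12}
\]
• partial energy balance equations
\[
\begin{aligned}
\frac{\partial}{\partial t} \Bigl[ \rho^S \Bigl( \varepsilon^S + \frac{1}{2} \acute{x}^{S2} \Bigr) \Bigr]
&= \operatorname{Div}\bigl( \mathbf{Q}^S - \mathbf{P}^{ST} \acute{\mathbf{x}}^S \bigr) + \rho^S \mathbf{b}^S \cdot \acute{\mathbf{x}}^S + \rho^S r^S + \hat r^S, \\
\frac{\partial}{\partial t} \Bigl[ \rho^\alpha \Bigl( \varepsilon^\alpha + \frac{1}{2} \acute{x}^{\alpha 2} \Bigr) \Bigr]
+ \operatorname{Div}\Bigl[ \rho^\alpha \Bigl( \varepsilon^\alpha + \frac{1}{2} \acute{x}^{\alpha 2} \Bigr) \acute{\mathbf{X}}^\alpha \Bigr]
&= \operatorname{Div}\bigl( \mathbf{Q}^\alpha - \mathbf{P}^{\alpha T} \acute{\mathbf{x}}^\alpha \bigr) + \rho^\alpha \mathbf{b}^\alpha \cdot \acute{\mathbf{x}}^\alpha + \rho^\alpha r^\alpha + \hat r^\alpha, \qquad \alpha = 1, \dots, A,
\end{aligned}
\tag{13}
\]
• balance equation of porosity
\[ \frac{\partial n}{\partial t} = -\operatorname{Div} \mathbf{J} + \hat n. \tag{14} \]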
In these equations, all functions are defined on the reference configuration B0
of the skeleton. In this sense we may call it the Lagrangian description even
though partial balance equations for fluid components contain convective parts with
respect to the corresponding Lagrangian velocities.
The two-point tensors P^S, P^α denote the Piola–Kirchhoff partial stress tensors,
b^S, b^α are partial body forces, ε^S, ε^α are partial densities of the internal energy,
Q^S, Q^α are partial heat fluxes, r^S, r^α are partial energy radiations, J is the flux of
porosity, and all quantities with a hat denote productions.
The balance equation of porosity requires some justification. We have argued in
previous works on this subject (e.g., [6, 7]) that the balance equation for n follows
from an averaging procedure for a representative elementary volume accounting for
geometrical properties of the microstructure. However this argument is not needed
if we make an extension of the continuous model of mixtures on the macroscopical
phenomenological level. In such a case a new scalar field satisfies in the most general case a balance equation. Second order equations for microstructural variables
appearing in some works on this subject indicate that most likely two variables
rather than one additional microstructural variable should be introduced and one of
them has to be eliminated from the model by substitution of one balance equation
in another. The most important question which must be answered in a model with
an additional balance law is if such a model can be mathematically well-posed –
in particular in relation to additional boundary conditions which may be necessary.
The most prominent example for those difficulties appears within the extended
thermodynamics (e.g., [17]) where the extension of number of fields and, consequently, an extension of the hierarchy of field equations yields unsolved problems
of boundary conditions. Fortunately the above balance equation for porosity specified for two-component poroelastic materials does not require additional boundary
conditions – it possesses all properties of an evolution equation. As we shall see,
further thermodynamic considerations indicate that the flux J results from the
diffusion (relative motion of fluid components with respect to the skeleton), and
the source n̂ describes relaxation to the thermodynamic equilibrium as well as
equilibrium changes of porosity ∂nE /∂t.
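To make the relaxation character of the porosity source concrete, the following toy sketch integrates the balance equation for a homogeneous state. It rests on assumptions that are not taken from the paper: the flux J is set to zero, the equilibrium porosity n_E is constant, and the source is given the linear form −(n − n_E)/τ with a hypothetical relaxation time τ. Under these assumptions the balance equation reduces to an ordinary differential equation with an exponential solution.

```python
import math


def relax_porosity(n0, n_eq, tau, dt, steps):
    """Explicit Euler integration of dn/dt = -(n - n_eq)/tau.

    The linear source term is an illustrative assumption, not the
    constitutive law derived in the paper.
    """
    n = n0
    history = [n]
    for _ in range(steps):
        n_source = -(n - n_eq) / tau  # assumed linear relaxation source
        n += dt * n_source
        history.append(n)
    return history


def exact_porosity(n0, n_eq, tau, t):
    """Closed-form solution of the same toy equation, for comparison."""
    return n_eq + (n0 - n_eq) * math.exp(-t / tau)
```

In this toy setting the porosity decays monotonically towards n_E, so no boundary condition is involved.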
We make an assumption similar to the one introduced by Truesdell for mixtures
of fluids [1] that the bulk productions of mass, momentum, and energy vanish, i.e.,
the corresponding balance equations reduce to conservation laws. Hence
\[
\hat\rho^S + \sum_{\alpha=1}^{A} \hat\rho^\alpha = 0, \qquad
\hat{\mathbf{p}}^S + \sum_{\alpha=1}^{A} \hat{\mathbf{p}}^\alpha = 0, \qquad
\hat r^S + \sum_{\alpha=1}^{A} \hat r^\alpha = 0.
\tag{15}
\]
Under these conditions we can introduce bulk quantities which correspond to
those introduced by Truesdell for fluid mixtures that satisfy conservation laws of
a single component continuum. Due to the fact that we have chosen one of the
components (the skeleton) as the reference, the form of these laws differs from
the classical Lagrangian form of conservation equations of a single continuum.
Namely, by addition of partial mass balance equations (11) we obtain
\[
\frac{\partial \rho}{\partial t} + \operatorname{Div}\bigl( \rho \dot{\mathbf{X}} \bigr) = 0, \qquad
\rho := \rho^S + \sum_{\alpha=1}^{A} \rho^\alpha, \qquad
\rho \dot{\mathbf{X}} := \sum_{\alpha=1}^{A} \rho^\alpha \acute{\mathbf{X}}^\alpha.
\tag{16}
\]
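The bulk density ρ := ρ^S + Σ ρ^α and the Lagrangian mean velocity defined by ρẊ := Σ ρ^α X́^α can be evaluated pointwise from the partial quantities. The sketch below does this with plain lists; the function names and sample values are illustrative assumptions. The skeleton contributes to the bulk density but not to the sum of fluxes, because its own Lagrangian velocity is zero.

```python
def bulk_density(rho_s, rho_fluids):
    """rho = rho^S + sum over fluid components of rho^alpha."""
    return rho_s + sum(rho_fluids)


def mean_lagrangian_velocity(rho_s, rho_fluids, X_fluids):
    """Mean velocity from rho * X_mean = sum_alpha rho^alpha * X^alpha.

    X_fluids holds one Lagrangian velocity vector per fluid component.
    """
    rho = bulk_density(rho_s, rho_fluids)
    dim = len(X_fluids[0])
    return [
        sum(r * X[k] for r, X in zip(rho_fluids, X_fluids)) / rho
        for k in range(dim)
    ]
```

For example, a skeleton of density 2 with two fluids of density 1 each, moving with Lagrangian velocities (4, 0, 0) and (0, 4, 0), gives a bulk density of 4 and a mean Lagrangian velocity of (1, 1, 0).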
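Hence for the single component bulk description we have to identify in relation (8): V ≡ Ẋ. This Lagrangian mean velocity takes over the role of the barycentric velocity of the classical mixture theory. However, in contrast to the Eulerian description, the Lagrangian mean velocity is relative, i.e., similarly to the Lagrangian
velocities X́^α it is objective. The above definition yields the following conservation
laws:
• momentum
\[
\frac{\partial (\rho \dot{\mathbf{x}})}{\partial t} + \operatorname{Div}\bigl( \rho \dot{\mathbf{x}} \otimes \dot{\mathbf{X}} - \mathbf{P} \bigr) = \rho \mathbf{b}, \qquad
\rho \dot{\mathbf{x}} := \rho^S \acute{\mathbf{x}}^S + \sum_{\alpha=1}^{A} \rho^\alpha \acute{\mathbf{x}}^\alpha, \qquad
\rho \mathbf{b} := \rho^S \mathbf{b}^S + \sum_{\alpha=1}^{A} \rho^\alpha \mathbf{b}^\alpha,
\tag{17}
\]
and the bulk Piola–Kirchhoff stress tensor P is defined by the relation
\[
\mathbf{P} := \mathbf{P}_I - \mathbf{F}^S \Bigl\{ \rho^S \dot{\mathbf{X}} \otimes \dot{\mathbf{X}} + \sum_{\alpha=1}^{A} \rho^\alpha (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \otimes (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \Bigr\}, \qquad
\mathbf{P}_I := \mathbf{P}^S + \sum_{\alpha=1}^{A} \mathbf{P}^\alpha,
\tag{18}
\]
• energy
\[
\frac{\partial}{\partial t} \Bigl[ \rho \Bigl( \varepsilon + \frac{1}{2} \dot{x}^2 \Bigr) \Bigr]
+ \operatorname{Div}\Bigl[ \rho \Bigl( \varepsilon + \frac{1}{2} \dot{x}^2 \Bigr) \dot{\mathbf{X}} + \mathbf{Q} - \mathbf{P}^T \dot{\mathbf{x}} \Bigr]
= \rho \mathbf{b} \cdot \dot{\mathbf{x}} + \rho r,
\tag{19}
\]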
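where the bulk internal energy density is defined as follows
\[
\begin{aligned}
\rho \varepsilon &:= \rho \varepsilon_I + \frac{1}{2} \Bigl\{ \rho^S \mathbf{C}^S \cdot (\dot{\mathbf{X}} \otimes \dot{\mathbf{X}}) + \sum_{\alpha=1}^{A} \rho^\alpha \mathbf{C}^S \cdot (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \otimes (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \Bigr\}, \\
\rho \varepsilon_I &:= \rho^S \varepsilon^S + \sum_{\alpha=1}^{A} \rho^\alpha \varepsilon^\alpha, \qquad
\mathbf{C}^S := \mathbf{F}^{ST} \mathbf{F}^S,
\end{aligned}
\tag{20}
\]
and the bulk heat flux has the form
\[
\begin{aligned}
\mathbf{Q} &= \mathbf{Q}_I + \frac{1}{2} \Bigl\{ -\rho^S \dot{\mathbf{X}} \otimes \dot{\mathbf{X}} \otimes \dot{\mathbf{X}} + \sum_{\alpha=1}^{A} \rho^\alpha (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \otimes (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \otimes (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}) \Bigr\} \mathbf{C}^S, \\
\mathbf{Q}_I &:= \mathbf{Q}^S + \sum_{\alpha=1}^{A} \mathbf{Q}^\alpha - \rho^S \varepsilon^S \dot{\mathbf{X}} + \sum_{\alpha=1}^{A} \rho^\alpha \varepsilon^\alpha (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}})
+ \mathbf{P}^{ST} \mathbf{F}^S \dot{\mathbf{X}} - \sum_{\alpha=1}^{A} \mathbf{P}^{\alpha T} \mathbf{F}^S (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}),
\end{aligned}
\tag{21}
\]
as well as the radiation
\[
\rho r := \rho^S r^S + \sum_{\alpha=1}^{A} \rho^\alpha r^\alpha - \rho^S \mathbf{b}^S \cdot \mathbf{F}^S \dot{\mathbf{X}} + \sum_{\alpha=1}^{A} \rho^\alpha \mathbf{b}^\alpha \cdot \mathbf{F}^S (\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}).
\tag{22}
\]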
The formal similarity of these relations to the corresponding relations of the
fluid mixture theory is obvious. Technical differences are related to the fact that
one of the components is solid and, secondly, as the reference we have chosen this
solid component rather than a mean barycentric motion of Eulerian description.
4. Field Equations and Thermodynamic Admissibility for Isotropic
Materials
Thermodynamics of mixtures of fluids began to develop only more than 10 years after the publication of Truesdell’s papers [1]. The pioneering work of Müller
[14] contains the most fundamental extension of the Clausius–Duhem inequality,
which had been used as a condition for thermodynamic admissibility of various
single component models. It is the assumption that the heat flux and the entropy
flux are not related to each other by the classical universal Fourier relation h = q/θ.
A review of basic results for mixtures following from this extension can be found
in the book [15].
The formal thermodynamic construction of a continuous model proceeds as
follows. We need field equations for the following fields
\[
\mathcal{F} := \bigl\{ \rho^S, \rho^\alpha, \mathbf{F}^S, \acute{\mathbf{x}}^S, \acute{\mathbf{x}}^\alpha, \theta, n \bigr\}, \qquad \alpha = 1, \dots, A.
\tag{23}
\]
They follow from the balance equations (11), (5), (12), (19), (14). However,
in order to transform these equations into field equations we have to perform the
so-called closure. Namely, the following quantities
\[
\mathcal{R} := \bigl\{ \hat\rho^\alpha, \mathbf{P}^S, \mathbf{P}^\alpha, \hat{\mathbf{p}}^\alpha, \varepsilon_I, \mathbf{Q}_I, \mathbf{J}, \hat n_{\mathrm{int}}, n_E \bigr\}, \qquad \alpha = 1, \dots, A, \qquad
\hat n_{\mathrm{int}} := \hat n - \frac{\partial n_E}{\partial t},
\tag{24}
\]
must be specified in terms of fields and their derivatives in order to close the system.
This is the constitutive problem defining materials contributing to the mixture.
The mass and momentum sources for the skeleton do not appear in the above
list because, according to (15), they are not independent. Let us remark that in
many cases of practical bearing additional constitutive relations may have the form
of evolution equations. For instance, this is the case when the skeleton has some
plastic properties, or when mass sources result from chemical reactions or adsorption/desorption processes. We do not consider such problems in this work and
limit further our attention to the so-called poroelastic materials. Then the set of
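constitutive variables is as follows
\[
\mathcal{C} := \bigl\{ \rho^S, \rho^\alpha, \mathbf{F}^S, \acute{\mathbf{X}}^\alpha, \theta, \mathbf{G}, n, \mathbf{N} \bigr\}, \qquad \alpha = 1, \dots, A, \qquad
\mathbf{G} := \operatorname{Grad} \theta, \qquad \mathbf{N} := \operatorname{Grad} n.
\tag{25}
\]
Usually this set is still much too complicated for the full thermodynamic analysis and one considers simpler models. For example, in the case of a simple two-component isotropic model of isothermal processes without mass exchange, scalar
constitutive functions depend on the following set of constitutive variables
\[
\mathcal{C}_{\mathrm{simple}} := \bigl\{ \rho^F, I, II, III, IV, V, VI, n \bigr\},
\tag{26}
\]
where the six invariants I, . . . , VI are defined as follows
\[
\begin{aligned}
I &:= \operatorname{tr} \mathbf{C}^S, &
II &:= \tfrac{1}{2} \bigl( I^2 - \operatorname{tr} \mathbf{C}^{S2} \bigr), &
III &:= \det \mathbf{C}^S, \\
IV &:= \acute{\mathbf{X}}^F \cdot \acute{\mathbf{X}}^F, &
V &:= \acute{\mathbf{X}}^F \cdot \mathbf{C}^S \acute{\mathbf{X}}^F, &
VI &:= \acute{\mathbf{X}}^F \cdot \mathbf{C}^{S2} \acute{\mathbf{X}}^F,
\end{aligned}
\tag{27}
\]
with X́^F being the Lagrangian velocity of the single fluid component: α = F. We
present some results for such a model further in this paper.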
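The six invariants of (27) are straightforward to evaluate for a given right Cauchy–Green tensor C^S and Lagrangian velocity X́^F. The following sketch does so for 3×3 matrices stored as nested lists; the helper names are illustrative and no external library is assumed.

```python
def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, v):
    """Product of a 3x3 matrix with a vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def trace(A):
    return A[0][0] + A[1][1] + A[2][2]


def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def invariants(C, X):
    """Return (I, II, III, IV, V, VI) as defined in (27)."""
    C2 = matmul(C, C)
    inv_1 = trace(C)
    inv_2 = 0.5 * (inv_1 * inv_1 - trace(C2))
    inv_3 = det3(C)
    inv_4 = dot(X, X)
    inv_5 = dot(X, matvec(C, X))
    inv_6 = dot(X, matvec(C2, X))
    return inv_1, inv_2, inv_3, inv_4, inv_5, inv_6
```

For C^S = diag(4, 1, 1), which corresponds to a stretch of 2 along the first axis, and X́^F = (1, 0, 0), the invariants are (6, 9, 4, 1, 4, 16).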
The fundamental assumption of a continuous modelling has the form of the
following constitutive relation
R = R(C),
(28)
where the mapping is assumed to be at least once continuously differentiable.
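The constitutive functions (28) are said to be thermodynamically admissible if
any solution of field equations satisfies identically the following entropy inequality
\[
\frac{\partial (\rho \eta)}{\partial t} + \operatorname{Div}\bigl( \rho \eta \dot{\mathbf{X}} + \mathbf{H} \bigr) \ge 0, \qquad
\eta = \eta(\mathcal{C}), \qquad \mathbf{H} = \mathbf{H}(\mathcal{C}).
\tag{29}
\]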
This is the Lagrangian form of the second law of thermodynamics proposed by
Müller for mixtures.
As shown in 1973 by Liu (e.g., see [16]) the limitation to solutions of field
equations can be eliminated from the above formulation by means of Lagrange
multipliers. The equivalent form of the second law is then as follows. For all fields
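the following inequality must be fulfilled identically:
\[
\begin{aligned}
& \frac{\partial (\rho \eta)}{\partial t} + \operatorname{Div}\bigl( \rho \eta \dot{\mathbf{X}} + \mathbf{H} \bigr)
- \Lambda^S \Bigl( \frac{\partial \rho^S}{\partial t} - \hat\rho^S \Bigr)
- \sum_{\alpha=1}^{A} \Lambda^\alpha \Bigl( \frac{\partial \rho^\alpha}{\partial t} + \operatorname{Div}\bigl( \rho^\alpha \acute{\mathbf{X}}^\alpha \bigr) - \hat\rho^\alpha \Bigr) \\
& \quad - \boldsymbol{\lambda}^S \cdot \Bigl( \rho^S \frac{\partial \acute{\mathbf{x}}^S}{\partial t} - \operatorname{Div} \mathbf{P}^S - \hat{\mathbf{p}}^S + \hat\rho^S \acute{\mathbf{x}}^S \Bigr) \\
& \quad - \sum_{\alpha=1}^{A} \boldsymbol{\lambda}^\alpha \cdot \Bigl( \rho^\alpha \Bigl( \frac{\partial \acute{\mathbf{x}}^\alpha}{\partial t} + \acute{\mathbf{X}}^\alpha \cdot \operatorname{Grad} \acute{\mathbf{x}}^\alpha \Bigr) - \operatorname{Div} \mathbf{P}^\alpha - \hat{\mathbf{p}}^\alpha + \hat\rho^\alpha \acute{\mathbf{x}}^\alpha \Bigr) \\
& \quad - \boldsymbol{\Lambda}^F \cdot \Bigl( \frac{\partial \mathbf{F}^S}{\partial t} - \operatorname{Grad} \acute{\mathbf{x}}^S \Bigr)
- \Lambda^n \Bigl( \frac{\partial n}{\partial t} + \operatorname{Div} \mathbf{J} - \hat n \Bigr) \\
& \quad - \Lambda^\varepsilon \Bigl( \frac{\partial (\rho \varepsilon)}{\partial t} + \operatorname{Div}\bigl( \rho \varepsilon \dot{\mathbf{X}} + \mathbf{Q} - \mathbf{P}^T \dot{\mathbf{x}} \bigr) \Bigr) \ge 0,
\end{aligned}
\tag{30}
\]
where the Lagrange multipliers Λ := {Λ^S, Λ^α, λ^S, λ^α, Λ^F, Λ^n, Λ^ε} are functions
of constitutive variables C.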
The exploitation of the inequality is now standard. Applying the chain rule
we separate a linear part which must vanish. This yields relations for multipliers
and some restrictions of constitutive relations. The remaining nonlinear part of
the inequality defines the dissipation in the system. We skip here a discussion
of fully general restrictions of constitutive relations. These can be found in the
paper [5] and in the book [16]. We present their particular cases further in this
work. However it is worthwhile to expose the structure of the dissipation for constitutive variables C in which we leave out the dependence on G and N. After some
calculations we obtain the following so-called residual inequality
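$$
\mathcal{D} := \sum_{\alpha=1}^{A}\big(\Lambda^{\alpha} - \Lambda^{S}\big)\hat\rho^{\alpha} + \Lambda^{n}\hat n_{\mathrm{int}}
+ \sum_{\alpha=1}^{A}\big(\boldsymbol{\lambda}^{\alpha} - \boldsymbol{\lambda}^{S}\big)\cdot\big(\hat{\mathbf{p}}^{\alpha} - \hat\rho^{\alpha}\acute{\mathbf{x}}^{\alpha}\big)
- \boldsymbol{\lambda}^{S}\cdot\mathbf{F}^{S}\sum_{\alpha=1}^{A}\hat\rho^{\alpha}\acute{\mathbf{X}}^{\alpha} \ge 0, \tag{31}
$$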
where the multipliers are given by the relations
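$$
\Lambda^{S} = \rho\Big(\frac{\partial\eta}{\partial\rho^{S}} - \Lambda^{\varepsilon}\frac{\partial\varepsilon}{\partial\rho^{S}}\Big), \qquad
\Lambda^{\alpha} = \rho\Big(\frac{\partial\eta}{\partial\rho^{\alpha}} - \Lambda^{\varepsilon}\frac{\partial\varepsilon}{\partial\rho^{\alpha}}\Big), \qquad
\Lambda^{\varepsilon} = \Big(\frac{\partial\varepsilon}{\partial\theta}\Big)^{-1}\frac{\partial\eta}{\partial\theta},
$$
$$
\rho^{S}\boldsymbol{\lambda}^{S} = -\rho\,\mathbf{F}^{S-T}\sum_{\alpha=1}^{A}\Big(\frac{\partial\eta}{\partial\acute{\mathbf{X}}^{\alpha}} - \Lambda^{\varepsilon}\frac{\partial\varepsilon}{\partial\acute{\mathbf{X}}^{\alpha}}\Big),
$$
$$
\rho^{\alpha}\boldsymbol{\lambda}^{\alpha} = \rho\,\mathbf{F}^{S-T}\Big(\frac{\partial\eta}{\partial\acute{\mathbf{X}}^{\alpha}} - \Lambda^{\varepsilon}\frac{\partial\varepsilon}{\partial\acute{\mathbf{X}}^{\alpha}}\Big), \qquad
\Lambda^{n} = \rho\Big(\frac{\partial\eta}{\partial n} - \Lambda^{\varepsilon}\frac{\partial\varepsilon}{\partial n}\Big). \tag{32}
$$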
The first contribution to the dissipation function D (31) describes the dissipation due to the mass exchange between components. The second contribution is the
dissipation due to the relaxation of porosity to its equilibrium value, say nE . Finally,
the last contribution is the dissipation due to the relative motion of components. It is
known from the classical theory of mixtures that momentum sources are objective
solely in the combination with mass sources. This property is also present in the
model for poroelastic materials and, consequently, the second line in the definition
of D should be considered as a whole. There is no contribution of dissipation due
to the heat conduction because we have left out the dependence on the temperature
gradient G. The lack of dependence on the gradient of porosity N does not lead to
any simplifications in the dissipation.
The thermodynamic equilibrium state is defined by the requirement that D = 0
in this state. It means that mass, momentum and porosity sources, ρ̂ α , p̂α , n̂int
vanish in this state, and simultaneously the dissipation function D reaches the
minimum.
The second law of thermodynamics does not specify constitutive relations for
sources but it limits their form by the residual inequality. This statement can be
made more specific by the assumption that deviations from the thermodynamic
equilibrium are small. Then the dissipation becomes a quadratic function of nonequilibrium variables. We present further the results of this simplification.
5. Some Special Cases
Let us begin with a rather formal simplification of the multicomponent model,
which indicates a possible structure of energy, entropy and porosity fluxes. We
assume that the intrinsic parts of the internal energy εI and the entropy η are
independent of relative velocities X́α . This assumption is motivated by the fact
that scalar functions for isotropic materials must be at least quadratic in their
dependence on vector arguments. For small deviations from the thermodynamic
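equilibrium such a dependence on X́^α can be left out. If so, then relations (32)_{4,5}
for the multipliers become quite explicit and we obtain
$$ \boldsymbol{\lambda}^{S} = \Lambda^{\varepsilon}\mathbf{F}^{S}\dot{\mathbf{X}}, \qquad \boldsymbol{\lambda}^{\alpha} = \Lambda^{\varepsilon}\mathbf{F}^{S}\big(\acute{\mathbf{X}}^{\alpha} - \dot{\mathbf{X}}\big). \tag{33} $$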
Then restrictions following from the second law which we are not quoting in this
paper (e.g., see [6]) yield the following general form of fluxes for processes in
isotropic materials with a small deviation from the thermodynamic equilibrium
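$$
\begin{aligned}
\mathbf{Q} &= \sum_{\alpha=1}^{A}\big(Q_{0}^{\alpha}\mathbf{1} + Q_{1}^{\alpha}\mathbf{C}^{S} + Q_{2}^{\alpha}\mathbf{C}^{S2}\big)\acute{\mathbf{X}}^{\alpha}, \\
\mathbf{H} &= \sum_{\alpha=1}^{A}\big(H_{0}^{\alpha}\mathbf{1} + H_{1}^{\alpha}\mathbf{C}^{S} + H_{2}^{\alpha}\mathbf{C}^{S2}\big)\acute{\mathbf{X}}^{\alpha}, \\
\mathbf{J} &= \sum_{\alpha=1}^{A}\big(J_{0}^{\alpha}\mathbf{1} + J_{1}^{\alpha}\mathbf{C}^{S} + J_{2}^{\alpha}\mathbf{C}^{S2}\big)\acute{\mathbf{X}}^{\alpha},
\end{aligned} \tag{34}
$$
where the scalar coefficients $Q_{0}^{\alpha}, \ldots, J_{2}^{\alpha}$ are solely functions of equilibrium variables
$$ \mathcal{C}_{\mathrm{equil}} = \{I, II, III, \rho^{\alpha}, \theta, n_{E}\}, \qquad n_{E} = n_{E}(III, \theta). \tag{35} $$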
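The last result is particularly important because it allows one to specify the equilibrium porosity. Namely, the balance equation of porosity (14) reduces in this case
as follows
$$ \frac{\partial n_{E}}{\partial t} = 0 \;\Rightarrow\; \rho^{S}\frac{\partial n_{E}}{\partial\rho^{S}} + \sum_{\alpha=1}^{A}\rho^{\alpha}\frac{\partial n_{E}}{\partial\rho^{\alpha}} = 0, \tag{36} $$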
which is the partial differential equation for n_E. It shows that n_E can be left out in
the list (35) because it is not independent of the other variables.
In the simple case of two components the solution of the differential equation (36) has the form
$$ n_{E} = n_{E}\Big(\frac{\rho^{F}}{\rho^{S}}\Big). \tag{37} $$
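That any smooth function of the ratio ρ^F/ρ^S satisfies the two-component form of (36) is Euler's relation for functions homogeneous of degree zero. The following sketch checks this numerically by central differences. The particular function `n_E` chosen below is a hypothetical example, not a constitutive law proposed in the paper.

```python
# Sketch: numerical check that n_E(rho_F / rho_S) satisfies
#   rho_S * dn_E/drho_S + rho_F * dn_E/drho_F = 0   (two-component form of (36)).
# The function n_E below is an arbitrary smooth choice made for illustration.

def n_E(rho_F, rho_S):
    r = rho_F / rho_S          # the only combination on which n_E depends
    return r / (1.0 + r)

def residual(rho_F, rho_S, h=1e-5):
    """Left-hand side of the two-component equation, by central differences."""
    d_S = (n_E(rho_F, rho_S + h) - n_E(rho_F, rho_S - h)) / (2.0 * h)
    d_F = (n_E(rho_F + h, rho_S) - n_E(rho_F - h, rho_S)) / (2.0 * h)
    return rho_S * d_S + rho_F * d_F

print(abs(residual(0.3, 1.8)))   # small, limited only by finite-difference error
```

A function that depends on the two densities in any other way, for instance on their sum, would leave a residual of order one.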
The above simplification of the dependence on relative velocities and the structure of the dissipation function indicate as well the following structure of momentum and porosity sources
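$$ \hat{\mathbf{p}}^{\alpha} - \hat\rho^{\alpha}\acute{\mathbf{x}}^{\alpha} = \pi^{\alpha}\mathbf{F}^{S}\acute{\mathbf{X}}^{\alpha}, \qquad \hat n = -\frac{n - n_{E}}{\tau}, \qquad \pi^{\alpha}, \tau > 0, \tag{38} $$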
where parameters π α , τ may depend on equilibrium variables.
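The porosity source in (38) describes a relaxation towards n_E with the time constant τ. As a minimal sketch, assume a spatially homogeneous state with vanishing porosity flux and constant n_E and τ, so that the porosity equation reduces to the ordinary differential equation dn/dt = −(n − n_E)/τ. These simplifications, and all names and numbers below, are assumptions of the example.

```python
import math

# Sketch: relaxation of porosity to its equilibrium value under
#   dn/dt = -(n - n_E) / tau
# (homogeneous state, no porosity flux, constant n_E and tau: all assumed).

def relax_exact(n0, n_eq, tau, t):
    """Closed-form solution of the linear relaxation equation."""
    return n_eq + (n0 - n_eq) * math.exp(-t / tau)

def relax_euler(n0, n_eq, tau, t_end, steps):
    """Explicit Euler integration of the same equation."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += dt * (-(n - n_eq) / tau)
    return n

n_num = relax_euler(0.30, 0.25, 2.0, 4.0, 4000)
n_ref = relax_exact(0.30, 0.25, 2.0, 4.0)
print(n_num, n_ref)
```

The sign condition τ > 0 in (38) is what makes the deviation n − n_E decay rather than grow.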
We proceed to present the model for an important special case of the two-component poroelastic material. This models the so-called saturated porous materials whose components on the macroscopic level are the elastic skeleton and
the ideal fluid. The thermodynamic admissibility following from the second law
of thermodynamics leads for isothermal processes without mass exchange to the
following constitutive relations (e.g., [18]):
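• partial Cauchy stresses
$$
\mathbf{T}^{S} := J^{S-1}\mathbf{P}^{S}\mathbf{F}^{ST} = \aleph_{0}\mathbf{1} + \aleph_{1}\mathbf{B}^{S} + \aleph_{2}\mathbf{B}^{S2} - \theta\Lambda^{n}(n - n_{E})\mathbf{1},
$$
$$
\mathbf{T}^{F} = -\big(p^{F} - \theta\Lambda^{n}(n - n_{E})\big)\mathbf{1}, \qquad \mathbf{B}^{S} := \mathbf{F}^{S}\mathbf{F}^{ST}, \tag{39}
$$
where
$$
\aleph_{\Gamma} = \aleph_{\Gamma}(I, II, III, \theta), \quad \Gamma = 0, 1, 2, \qquad p^{F} = p^{F}(\rho^{F}, \theta),
$$
$$
n_{E} = n_{E}(III, \theta), \qquad \Lambda^{n} = \Lambda^{n}(I, II, III, \rho^{F}, \theta). \tag{40}
$$
• porosity flux and momentum source
$$ \mathbf{J} = n_{E}\acute{\mathbf{X}}^{F}, \qquad \hat{\mathbf{p}} = \pi\mathbf{F}^{S}\acute{\mathbf{X}}^{F}, \qquad \pi = \pi(III, \rho^{F}, \theta). \tag{41} $$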
Consequently the model is analogous to the model of simple mixtures of fluids (e.g., [15]) in which interactions of components reduce to momentum sources
and, what is characteristic for poroelastic materials, to nonequilibrium changes of
porosity.
We conclude these considerations with a few remarks concerning boundary
conditions. Very little has been done for the case of models with more than two
components. Therefore we limit our attention solely to the two-component case.
The natural condition on the boundary ∂B0 is the condition for the total loading.
If we denote by t_ext the vector of force density on this surface which is controlled
from the external world, then it must be taken over by the total stress vector, i.e.
$$ \mathbf{t}_{\mathrm{ext}} = \mathbf{P}\mathbf{N}\big|_{\partial B_{0}}, \tag{42} $$
provided the interface ∂B0 does not possess any intrinsic structure of its own. This
may not be fulfilled by many porous materials which, for instance, may possess a
surface tension on contact surfaces.
In addition to this dynamical condition we have to formulate a kinematical condition depending on a relative motion of components. The tangential component
of this vectorial condition has been intensively investigated and the early results of
Beavers and Joseph [19] have been confirmed. In the case of ideal fluid components
this condition reduces to the following one
$$ \acute{\mathbf{X}}^{F} - \big(\acute{\mathbf{X}}^{F}\cdot\mathbf{N}\big)\mathbf{N}\Big|_{\partial B_{0}} = 0. \tag{43} $$
The remaining normal component must be determined from investigations of
a boundary layer which is created by fluid components flowing out of the porous
material through a permeable boundary. A phenomenological model of this flow
has been proposed by Deresiewicz and Skalak [20] and not much has been modified
in this condition even though some questions seem to be still open. For two porous
materials in contact through the permeable interface ∂B0 this condition has the
form
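$$ \rho^{F}\acute{\mathbf{X}}^{F}\cdot\mathbf{N} + \alpha_{0}\Big[\!\!\Big[\frac{p^{F}}{n}\Big]\!\!\Big]\bigg|_{\partial B_{0}} = 0, \tag{44} $$
where the double brackets denote a jump, α_0 is a phenomenological coefficient of
surface permeability, and the quantity in the brackets describes the difference of the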
pore pressure on both sides of the interface. It is a kind of a driving force for the
flow of the fluid through the surface.
There remains the problem of a boundary condition for the porosity. Note that
the equation of porosity does not contain a divergence of porosity. Consequently it
is a heterogeneous evolution equation rather than a real balance equation. For this
reason it does not require any boundary condition at all. This may not be the case
if we rely on the model proposed by Goodman and Cowin in which the equation
for the microstructural variable does contain spatial derivatives.
6. Conclusions
The general framework of a nonlinear model of poroelastic materials closely resembles
that designed by Truesdell for mixtures of fluids. The Lagrangian formulation of the present model is solely a technical issue which makes it possible to incorporate large deformations of the solid component but does not change anything in
the “philosophy” of the construction of the model.
A new element grows only from the fact that we have to incorporate an additional microstructural parameter into the model. The model presented in this
work contains only one such parameter – the porosity. However the experience
with soil and rock mechanics, mechanics of snow and glaciers indicates that the
number of those parameters must be larger in many problems of practical bearing.
For example, it may be tortuosity, double porosity, anisotropy of microstructure,
plastic deformation of the skeleton, etc. In such cases the model must be extended
even further but the fundamental elements of the theory of mixtures would remain
in such extensions.
Finally, let us remark that the linear version of the model has been extensively
investigated and it seems to work very well, particularly in applications to acoustics
of porous materials. Nonlinear problems of poroelastic materials are usually solved
by means of numerical methods, for which the Lagrangian formulation is
particularly useful. In such a description a mesh of finite elements or finite volumes
does not have to be changed in time to follow the motion of fluid components.
Analytical results are very rare (e.g., [21]) because very little is known about the
form of constitutive relations for large deformations of the skeleton.
References
1. C.A. Truesdell, Sulle basi della termomeccanica. Accad. Naz. dei Lincei, Rend. della Classe di
Scienze Fisiche, Matematiche e Naturali 22(8) (1957) 33–38, 158–166.
2. R.M. Bowen, Compressible porous media models by use of the theory of mixtures. Internat.
J. Engrg. Sci. 20(6) (1982) 697–735.
3. C.A. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1985).
4. K. Wilmanski, Lagrangian model of two-phase porous material. J. Non-Equilib. Thermodyn.
20 (1995) 50–77.
5. K. Wilmanski, Toward an extended thermodynamics of porous and granular materials. In:
G. Ioos, O. Guès and A. Nouri (eds), Trends in Applications of Mathematics to Mechanics.
Chapman&Hall/CRC (2000) pp. 147–160.
6. K. Wilmanski, Porous media at finite strains – The new model with the balance equation of
porosity. Arch. Mech. 48(4) (1996) 591–628.
7. K. Wilmanski, A thermodynamic model of compressible porous materials with the balance
equation of porosity. Transport Porous Media 32 (1998) 21–47.
8. M.A. Goodman and S.C. Cowin, A continuum theory for granular materials. Arch. Rational
Mech. Anal. 48 (1972) 249–266.
9. B. Svendsen and K. Hutter, On the thermodynamics of a mixture of isotropic materials with
constraints. Internat. J. Engrg. Sci. 33 (1995) 2021–2054.
10. N.P. Kirchner, Thermodynamically consistent modelling of abrasive granular materials I. Nonequilibrium theory. Proc. Roy. Soc. London A 458 (2002) 2153–2176.
11. N.T. Dunwoody and I. Müller, Thermodynamic theory of two chemically reacting ideal gases
with different temperatures. Arch. Rational Mech. Anal. 29 (1968).
12. N.A. Krall and A.W. Trivelpiece, Principles of Plasma Physics. McGraw-Hill, New York
(1986).
13. C. Truesdell, A First Course in Rational Continuum Mechanics, Part 1, Fundamental Concepts,
Academic Press, New York (1977); also Lecture Notes: A First Course in Rational Continuum
Mechanics. Johns Hopkins Univ. Press, Baltimore, MD (1972).
14. I. Müller, A thermodynamic theory of mixtures of fluids. Arch. Rational Mech. Anal. 28
(1968).
15. I. Müller, Thermodynamics. Pitman, New York (1985).
16. K. Wilmanski, Thermomechanics of Continua. Springer, Heidelberg (1998).
17. I. Müller and T. Ruggeri, Rational Extended Thermodynamics. Springer, New York (1998).
18. K. Wilmanski, Mass exchange, diffusion and large deformations of poroelastic materials. In:
G. Capriz, V.N. Ghionna and P. Giovine (eds), Modeling and Mechanics of Granular and
Porous Materials. Birkhäuser, Basel (2002) pp. 213–244.
19. G.S. Beavers and D.D. Joseph, Boundary conditions at a naturally permeable wall. J. Fluid
Mech. 30(19) (1967) 197–207.
20. H. Deresiewicz and R. Skalak, On uniqueness in dynamic poroelasticity. Bull. Seismol. Soc.
Amer. 53 (1963) 783–788.
21. B. Albers and K. Wilmanski, An axisymmetric steady-state flow through a poroelastic medium
under large deformations. Arch. Appl. Mech. 69 (1999) 121–132.
Anisotropic Elasticity and Multi-Material
Singularities
WAN-LEE YIN
School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta,
GA 30332-0355, USA. E-mail: wanlee.yin@ce.gatech.edu
Received 24 July 2002; in revised form 21 January 2003
Abstract. Multi-material wedges associated with convergence of geometrical and material discontinuity lines generally show singular stress fields around the vertex of the wedge. In this paper, the
eigenvalue problem for a multi-material wedge composed of several anisotropic elastic sectors is
formulated in a completely general manner, including the cases of degenerate and extra-degenerate
material sectors, and various types of edge conditions for both open and closed wedges. General
representation of the elasticity solution in a degenerate or extra-degenerate anisotropic sector requires
higher-order eigenmodes (generalized eigenfunctions) in addition to zeroth-order eigenmodes. Such
higher-order eigenmodes are obtained from appropriate analytical expressions of the zeroth-order
eigenmode by using the derivative rule. The analysis is applied to one bisector wedge and one
trisector wedge in a three-layer cracked composite model to obtain accurate elasticity solutions of the
singular stress fields. These solutions were determined using the traction data generated on a circular
collocation path by a conventional finite element analysis.
Mathematics Subject Classifications (2000): 74E10.
Key words: anisotropic elasticity, stress singularity, multi-material wedges, eigensolutions, degenerate materials, Lekhnitskii and Stroh formalisms.
To Clifford Truesdell, in Fond Memory, Admiration and Gratitude.
1. Introduction
Composite structures involving interfaces, joints, free edges and cracks generally
develop singular elastic stress fields near the intersection of lines of material and
geometrical discontinuity. Examples include interface cracks, transverse matrix
cracks impinging upon an adjacent ply, lap and beveled adhesive joints, skin-stiffener interfaces, and ply drops in laminated structures. These localized regions
of severe stress are possible sites of failure initiation and growth. In many cases, the
local geometry and state of deformation do not vary significantly in the direction
tangential to the line of singularity. A local analysis model may be used where
the parameters and variables depend only on two rectangular coordinates x and y
in the plane perpendicular to the line of singularity. This two-dimensional model,
containing two or more anisotropic elastic sectors, is called a multi-material wedge.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 779–808.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The dissimilar sectors are bonded by radial interfaces which converge at the vertex
of the wedge.
A general analysis method has been developed to obtain accurate elasticity solutions of multi-material wedges using a substructure approach [1–3]. In this analysis
scheme, conventional finite element analysis of the global structure is performed to
provide the traction boundary data on a path bordering the wedge. A 2-D elasticity
solution of the region interior to the path is obtained subsequently by constructing
an eigenseries that matches the traction data at various points along the path in
the least-squares sense.
For bimaterial wedges associated with free edges or interface cracks, this method
yields elasticity solutions of the local problem that are in close agreement with
existing numerical solutions using special singular finite elements [4, 5]. In the case
of interface cracks, the energy release rate predicted by the dominant singularity
of the eigenseries was found to be in excellent agreement with the result of the
J -integral evaluated along a remote boundary path [6, 7]. Further validation is
shown by obtaining and comparing several eigenseries of the same problem with
different numbers of terms, by changing the collocation path, and by using finite-element displacement solutions rather than the traction solutions as the collocation
data for determining the eigenseries. These different solutions were also found to be
in close agreement [8].
One feature of anisotropic elasticity that significantly complicates the theoretical analysis as well as the computational algorithms is material degeneracy.
A degenerate or extra-degenerate material has repeated material eigenvalues for
which the number of associated (zeroth-order) eigenvectors is smaller than the
multiplicity of the eigenvalue. The representation of the general solution of such
materials must include higher-order eigenvectors (often called “generalized eigenvectors”) with more complicated analytical expressions. The practical importance
of the issue is indicated by isotropic and transversely isotropic materials, which
are degenerate and have triple material eigenvalues ±i. Ting [9] and Yin [10] have
given examples of extra-degenerate materials, and the class appears to be surprisingly wide. Such materials have a triple eigenvalue with only one independent
zeroth-order eigenvector, and thus require two higher-order eigenvectors.
In previous elasticity analyses of multi-material wedges, the sectors are often
assumed to be isotropic, transversely isotropic, or anisotropic but non-degenerate
[11–15]. Analytical expressions of the wedge eigensolutions in such sectors are
relatively simple. However, general multi-material wedges may contain degenerate
sectors that are not isotropic or transversely isotropic, for which the commonly
used expressions of material eigenmodes are not valid.⋆ In any such sector, a correct
representation of the elasticity solution must include higher-order material eigenmodes, which are given by modified expressions of complicated types. Explicit
expressions of the general solutions of extra-degenerate materials were not found
in the literature until very recently. Hence current theoretical and computational
analysis of multi-material wedge singularities is deficient in completeness
and generality.
⋆ In this paper, the term “eigenmode” refers to one of six independent solutions of 2-D anisotropic
elasticity of a material sector which may be non-degenerate, degenerate or extra-degenerate, whereas
an “eigensolution” of a wedge is obtained by taking linear combinations of eigenmodes in successive
sectors, matching the coefficients of combination in such a way as to satisfy displacement and traction
continuity across the radial interfaces and homogeneous boundary conditions on the exterior edges.
It was shown recently that anisotropic materials may be classified into five distinct classes, each having different representations of the general solution in terms
of the material eigenvalues and eigenvectors. Degenerate and extra-degenerate materials have higher-order material eigenvectors and eigenmodes which may be obtained by differentiating appropriate analytical expressions of the zeroth-order
eigenvectors and eigenmodes with respect to µ, which is temporarily regarded as a
variable, followed by evaluating µ at the specific multiple eigenvalue. This derivative rule was proved analytically for all classes of degenerate and extra-degenerate
materials [10]. It provides a simple and direct way for deriving the higher-order
eigenvectors and eigenmodes. The expressions required for representing the wedge
eigensolutions in the degenerate sectors are thereby found.
In this paper, the structure of the eigensolutions of anisotropic elasticity, both
at the sector level and at the wedge level, is given in a concise form. A wedge
eigensolution satisfies the continuity of tractions and displacements across all radial interfaces, and homogeneous boundary conditions on the two exterior edges.
Within each anisotropic sector, the wedge eigensolution is a linear combination of
the six material eigenmodes. The characteristic equation for the wedge eigenvalues
is obtained explicitly regardless of the degeneracy of the sectors, and for various
edge conditions including free, fixed, sliding, floating, or the elastically supported
type. Analytical expressions of the wedge eigensolutions are also given. Thus the
eigen-problem of general multi-material wedges with unrestricted elastic material types is solved completely in a purely algebraic procedure resulting in fully
explicit analytical expressions.
In the final section of the paper, elasticity solutions are obtained for one bisector
wedge and one trisector wedge in a three-layer composite model with the middle
layer containing an inclined crack. Various solutions including different numbers
of eigensolutions are compared to show the trend of convergence. Comparison is
also made with the asymptotic solution (the dominant singular eigensolution). It is
found that the trend shown by the asymptotic solution and the associated generalized stress intensity factors may be physically irrelevant and useless because they
differ drastically from the elasticity solution over any physically meaningful range
of scale length.
2. Two-Dimensional Anisotropic Elasticity – Non-Degenerate Material
Let αij (i, j = 1, . . . , 6) denote the anisotropic elastic compliance constants relating the strain components εx , εy , εz , γyz , γxz , γxy to the stress components σx , σy ,
σz , τyz , τxz , τxy , and let
$$ \beta_{ij} = \alpha_{ij} - \frac{\alpha_{i3}\alpha_{j3}}{\alpha_{33}} \qquad \text{for } i, j \ne 3. $$
Then, for generalized plane deformations (i.e., deformations in which all strain
components depend only on x and y), one has [16]
{ε} = [β]{σ },
(2.1)
where {ε} = {εx , εy , γyz , γxz , γxy }T , {σ } = {σx , σy , τyz , τxz , τxy }T .
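The reduction from the 6 × 6 compliances α_ij to the 5 × 5 reduced compliances β_ij can be sketched as follows. The helper `isotropic_compliance` and the numerical values are assumptions introduced only to give the reduction something to act on; for an isotropic material the result should reproduce the familiar plane-strain compliances.

```python
# Sketch: reduced compliances beta_ij = alpha_ij - alpha_i3 * alpha_j3 / alpha_33,
# for i, j in {1, 2, 4, 5, 6} (0-based indices 0, 1, 3, 4, 5).

def reduced_compliance(alpha):
    """alpha: 6x6 compliance in the order (x, y, z, yz, xz, xy). Returns 5x5 beta."""
    keep = [0, 1, 3, 4, 5]            # drop the row and column of the z-component
    a33 = alpha[2][2]
    return [[alpha[i][j] - alpha[i][2] * alpha[j][2] / a33 for j in keep]
            for i in keep]

def isotropic_compliance(E, nu):
    """Hypothetical test input: isotropic compliance with engineering shear strains."""
    G = E / (2.0 * (1.0 + nu))
    alpha = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            alpha[i][j] = 1.0 / E if i == j else -nu / E
    for k in range(3, 6):
        alpha[k][k] = 1.0 / G
    return alpha

E_MOD, NU = 200.0, 0.3                 # illustrative values only
beta = reduced_compliance(isotropic_compliance(E_MOD, NU))
print(beta[0][0], beta[0][1], beta[4][4])
```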
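In the absence of body forces, an equilibrated stress field {σ} may be represented by the derivatives of a pair of stress functions F(x, y) and Ψ(x, y):
$$ \sigma_x = F_{,yy}, \quad \sigma_y = F_{,xx}, \quad \tau_{xy} = -F_{,xy}, \quad \tau_{xz} = \Psi_{,y}, \quad \tau_{yz} = -\Psi_{,x}. \tag{2.2} $$
We seek solutions for the displacement vector u ≡ {u, v, w} and the vector of
stress potentials q ≡ {F,y , −F,x , Ψ} of the following form
$$ \mathbf{u} = \mathbf{a} f(x + \mu y), \qquad \mathbf{q} = \mathbf{b} f(x + \mu y), \tag{2.3a,b} $$
or,
$$ \boldsymbol{\chi}^{[0]} \equiv \{F_{,y}, -F_{,x}, \Psi, u, v, w\}^{T} = \boldsymbol{\xi}^{[0]} f(x + \mu y), \qquad \boldsymbol{\xi}^{[0]} \equiv \{\mathbf{b}^{T}, \mathbf{a}^{T}\}^{T}, \tag{2.3c,d} $$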
where f is an arbitrary analytic function and a ≡ {a1 , a2 , a3 }T and b ≡ {b1 , b2 , b3 }T
are constant vectors. The complex parameter µ and the six-dimensional vector ξ [0]
will be identified later as material eigenvalues and (zeroth-order) material eigenvectors if they make the derivatives of u and q satisfy the anisotropic stress-strain
relations, i.e., equation (2.1). Since
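$$ \tau_{xy} = -\partial_x F_{,y} = -b_1 f'(x + \mu y) = \partial_y(-F_{,x}) = b_2\,\mu f'(x + \mu y), $$
one has b1 = −µb2 . Hence
$$ \mathbf{b} \equiv \mathbf{J}_1(\mu)\boldsymbol{\eta}, \quad \text{where } \mathbf{J}_1(\mu) \equiv \begin{bmatrix} -\mu & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \boldsymbol{\eta} \equiv \begin{Bmatrix} b_2 \\ b_3 \end{Bmatrix}. \tag{2.4} $$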
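Equations (2.1), (2.3a,b) and (2.4) yield the important relation
$$ \mathbf{E}(\mu)\mathbf{a} = [\beta]\mathbf{P}(\mu)\boldsymbol{\eta}, \tag{2.5} $$
where the matrix functions E(µ) and P(µ) are defined by
$$ \mathbf{E}(\mu) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \mu \\ 0 & 0 & 1 \\ \mu & 1 & 0 \end{bmatrix}, \qquad \mathbf{P}(\mu) = \begin{bmatrix} -\mu^{2} & 0 \\ -1 & 0 \\ 0 & -1 \\ 0 & \mu \\ \mu & 0 \end{bmatrix}. \tag{2.6a,b} $$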
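Notice that all columns of P(µ) are orthogonal to all columns of E(µ). Therefore,
pre-multiplication of the last equation by E^T(µ)[β]^{−1} and P^T(µ) yields, respectively,
$$ \mathbf{E}^{T}(\mu)[\beta]^{-1}\mathbf{E}(\mu)\mathbf{a} \equiv \boldsymbol{\Gamma}(\mu)\mathbf{a} = \mathbf{0}, \tag{2.7} $$
$$ \mathbf{P}^{T}(\mu)[\beta]\mathbf{P}(\mu)\boldsymbol{\eta} = \mathbf{M}(\mu)\boldsymbol{\eta} = \mathbf{0}, \tag{2.8} $$
where
$$ \mathbf{M}(\mu) \equiv \begin{bmatrix} l_4(\mu) & -l_3(\mu) \\ -l_3(\mu) & l_2(\mu) \end{bmatrix}, \tag{2.9a} $$
$$
\begin{aligned}
l_2(\mu) &= \beta_{44} - 2\beta_{45}\mu + \beta_{55}\mu^{2}, \\
l_3(\mu) &= -\beta_{24} + (\beta_{25} + \beta_{46})\mu - (\beta_{14} + \beta_{56})\mu^{2} + \beta_{15}\mu^{3}, \\
l_4(\mu) &= \beta_{22} - 2\beta_{26}\mu + (2\beta_{12} + \beta_{66})\mu^{2} - 2\beta_{16}\mu^{3} + \beta_{11}\mu^{4}.
\end{aligned} \tag{2.9b}
$$
The matrix Γ ≡ E^T(µ)[β]^{−1}E(µ) in equation (2.7) is well known in the Stroh
formalism [17].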
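The two algebraic facts used above, the orthogonality of the columns of P(µ) and E(µ) and the identification of P^T[β]P with the polynomials l2, l3, l4, can be checked numerically. The sketch below does so for an arbitrary symmetric sample matrix; the entries of `BETA` and the test value of µ are hypothetical numbers chosen only to exercise the identities, not material data.

```python
# Sketch: check E^T P = 0 and P^T [beta] P = [[l4, -l3], [-l3, l2]].
# Rows and columns of beta follow the order (1, 2, 4, 5, 6) of eq. (2.1).

IDX = {1: 0, 2: 1, 4: 2, 5: 3, 6: 4}

def E_mat(mu):
    return [[1, 0, 0], [0, mu, 0], [0, 0, mu], [0, 0, 1], [mu, 1, 0]]

def P_mat(mu):
    return [[-mu**2, 0], [-1, 0], [0, -1], [0, mu], [mu, 0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def M_from_P(beta, mu):
    P = P_mat(mu)
    return matmul(transpose(P), matmul(beta, P))

def M_from_l(beta, mu):
    b = lambda i, j: beta[IDX[i]][IDX[j]]
    l2 = b(4, 4) - 2 * b(4, 5) * mu + b(5, 5) * mu**2
    l3 = (-b(2, 4) + (b(2, 5) + b(4, 6)) * mu
          - (b(1, 4) + b(5, 6)) * mu**2 + b(1, 5) * mu**3)
    l4 = (b(2, 2) - 2 * b(2, 6) * mu + (2 * b(1, 2) + b(6, 6)) * mu**2
          - 2 * b(1, 6) * mu**3 + b(1, 1) * mu**4)
    return [[l4, -l3], [-l3, l2]]

# Hypothetical symmetric sample matrix (not a real material).
BETA = [[1.0, -0.3, 0.1, 0.05, 0.2],
        [-0.3, 1.2, -0.1, 0.15, 0.1],
        [0.1, -0.1, 2.5, 0.3, -0.2],
        [0.05, 0.15, 0.3, 2.8, 0.1],
        [0.2, 0.1, -0.2, 0.1, 2.6]]
MU = 0.3 + 1.2j

print(matmul(transpose(E_mat(MU)), P_mat(MU)))   # entries vanish up to rounding
print(M_from_P(BETA, MU))
print(M_from_l(BETA, MU))
```

Note that the plain transpose is used throughout, without complex conjugation, as in equations (2.7) and (2.8).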
Equations (2.7) and (2.8) have nontrivial solutions for a and η, respectively, if
and only if the following characteristic equation (2.10a) or (2.10b) is satisfied:
$$ \Delta(\mu) \equiv |\boldsymbol{\Gamma}(\mu)| = 0, \qquad \delta(\mu) \equiv |\mathbf{M}(\mu)| = 0. \tag{2.10a,b} $$
The two conditions are in fact equivalent, and they yield three identical pairs of
complex conjugate roots which are the material eigenvalues. Equivalence of (2.10a)
and (2.10b) follows clearly from the relations which express one of the two vectors
a and η in terms of the other:
$$ \mathbf{a} \equiv \mathbf{J}_2(\mu)\boldsymbol{\eta}, \quad \text{where } \mathbf{J}_2(\mu) \equiv \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -\mu & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}[\beta]\mathbf{P}(\mu), \tag{2.11} $$
$$ \boldsymbol{\eta} \equiv -\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}[\beta]^{-1}\mathbf{E}(\mu)\mathbf{a}. \tag{2.12} $$
A direct if somewhat lengthy proof of equivalence was given by Barnett and Kirchner [18]. However, the present proof also makes transparent the equivalence of the
eigenspaces of the Lekhnitskii and Stroh formalisms.
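For each eigenvalue µ, equations (2.8), (2.9) and (2.4) yield the explicit expression of the b-vector:
$$ \mathbf{b} = \begin{Bmatrix} -\mu l_2(\mu) \\ l_2(\mu) \\ l_3(\mu) \end{Bmatrix} \ \text{if } l_2(\mu) \ne 0, \qquad \text{otherwise } \mathbf{b} = \begin{Bmatrix} -\mu l_3(\mu) \\ l_3(\mu) \\ l_4(\mu) \end{Bmatrix}. \tag{2.13} $$
The corresponding material eigenvector is given by
$$ \boldsymbol{\xi}^{[0]} \equiv \begin{Bmatrix} \mathbf{b} \\ \mathbf{a} \end{Bmatrix} = \mathbf{J}\boldsymbol{\eta}, \quad \text{where } \mathbf{J} \equiv \begin{bmatrix} \mathbf{J}_1 \\ \mathbf{J}_2 \end{bmatrix}. \tag{2.14} $$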
The Stroh formalism, on the other hand, first determines the a-vector from equation (2.7).
The expression is lengthy because Γ(µ) is a 3 × 3 matrix. The Stroh formalism becomes very cumbersome in degenerate and extra-degenerate cases, where
higher-order eigenvectors (often called “generalized eigenvectors”) must be obtained through the use of the derivative rule or other equivalent steps.
It is easily seen that the complex conjugate of the eigenvalue µ is associated
with eigenvectors that are the complex conjugates of ξ . If the characteristic equation (2.10b) has three distinct pairs of complex conjugate roots, or if it has a
double root µ0 for which all elements of the matrix M(µ0 ) vanish, so that equations (2.7) and (2.8) yield two independent eigenvectors associated with the double
root (in addition to the eigenvector associated with a simple root), then the material
is called non-degenerate. Orthotropic materials with unequal elastic constants in
three material symmetry axes generally belong to the first type of non-degenerate
material.
For non-degenerate materials, we let B denote the matrix of the three b-vectors
associated with the eigenvalues µ1 , µ2 and µ3 that have positive imaginary parts,
and let A be the matrix composed of the corresponding a-vectors. We define the
6 × 6 matrix of eigenvectors
$$ \mathbf{Z} \equiv [\mathbf{Z}^{+}, \overline{\mathbf{Z}}^{+}] = \begin{bmatrix} \mathbf{B} & \overline{\mathbf{B}} \\ \mathbf{A} & \overline{\mathbf{A}} \end{bmatrix}, \tag{2.15} $$
where the overbars denote complex conjugates. Then the 2D general solution of a
homogeneous body of a non-degenerate anisotropic material is given by
$$ \boldsymbol{\chi} \equiv \{F_{,y}, -F_{,x}, \Psi, u, v, w\}^{T} = \mathbf{Z}\langle f(x + \mu y)\rangle\mathbf{c}, \tag{2.16} $$
where c is a constant vector and ⟨f(x + µy)⟩ denotes the 6 × 6 diagonal matrix with
the elements f1 (x + µ1 y), f2 (x + µ2 y), . . . , f6 (x + µ̄3 y). Real-valued solutions χ
are obtained by choosing
$$ c_{j+3} = \bar{c}_j, \qquad f_{j+3}(x + \bar{\mu}_j y) \equiv \overline{f_j(x + \mu_j y)}, \qquad j = 1, 2, 3. \tag{2.17a,b} $$
3. Two-Dimensional Anisotropic Elasticity – Degenerate and
Extra-Degenerate Materials
If the characteristic equation has a repeated root µ, and the number of associated
independent eigenvectors is smaller than the multiplicity of µ, then the material is
called degenerate or extra-degenerate, depending on whether the deficiency of independent eigenvectors relative to the multiplicity of eigenvalues is 1 or 2. In such
cases, the zeroth-order eigenvectors must be supplemented by higher-order eigenvectors to form the matrix Z. These higher-order eigenvectors yield additional independent eigenmodes, not according to the simple relations of equations (2.3a,b)
but according to the “derivative rule” described in the following. The degenerate
case is important in practice because isotropic materials and transversely isotropic
materials are degenerate, and have the triple eigenvalues µ = ±i.
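The statement about isotropic materials can be made concrete with the polynomials of (2.9b). The sketch below assumes the plane-strain reduced compliances of an isotropic material with Young's modulus E and Poisson's ratio ν (β11 = β22 = (1 − ν²)/E, β12 = −ν(1 + ν)/E, β44 = β55 = β66 = 2(1 + ν)/E, all other couplings zero) and checks that δ(µ) = l2 l4 − l3² is proportional to (1 + µ²)³, so that µ = ±i are triple roots, and that all entries of M vanish at µ = i. The numerical values are illustrative only.

```python
# Sketch: for an isotropic material, delta(mu) = beta44 * beta11 * (1 + mu^2)^3,
# hence the triple eigenvalues mu = +i and mu = -i, and M(i) is the null matrix.

def isotropic_betas(E, nu):
    b11 = (1.0 - nu**2) / E
    b12 = -nu * (1.0 + nu) / E
    b44 = 2.0 * (1.0 + nu) / E        # equals beta55 and beta66
    return b11, b12, b44

def isotropic_l(E, nu, mu):
    """l2, l3, l4 of eq. (2.9b) with the isotropic reduced compliances."""
    b11, b12, b44 = isotropic_betas(E, nu)
    l2 = b44 + b44 * mu**2
    l3 = 0.0
    l4 = b11 + (2.0 * b12 + b44) * mu**2 + b11 * mu**4
    return l2, l3, l4

def delta(E, nu, mu):
    l2, l3, l4 = isotropic_l(E, nu, mu)
    return l2 * l4 - l3**2

E_MOD, NU = 1.0, 0.3                   # illustrative, normalized modulus
b11, _, b44 = isotropic_betas(E_MOD, NU)
mu_test = 0.4 + 0.7j
print(delta(E_MOD, NU, mu_test), b44 * b11 * (1 + mu_test**2)**3)
print(isotropic_l(E_MOD, NU, 1j))      # all three vanish up to rounding
```

Because every entry of M vanishes at µ = i, the isotropic case has an abnormal triple eigenvalue in the terminology introduced in the following paragraphs.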
There are two classes of degenerate materials, each with a distinct representation of the general solution. The first class has a double eigenvalue µ0 which is
normal, that is, M(µ0 ) ≠ 0. For this class, equation (2.8) has only one independent
solution η = {l2 (µ0 ), l3 (µ0 )}^T , which yields one eigenvector ξ^[0] = J(µ0 )η with
the zeroth-order eigenmode χ^[0] = f (x + µ0 y)ξ^[0] . An independent eigenmode
sharing the same eigenvalue is given by the following expression evaluated at
µ = µ0 and involving an arbitrary analytic function g(x + µy):
$$
\begin{aligned}
\boldsymbol{\chi}^{[1]} &= \frac{d\boldsymbol{\chi}^{[0]}}{d\mu} = \frac{d}{d\mu}\big[g(x + \mu y)\boldsymbol{\xi}^{[0]}\big] \\
&= g(x + \mu y)\big(\mathbf{J}'\{l_2, l_3\}^{T} + \mathbf{J}\{l_2', l_3'\}^{T}\big) + y\,g'(x + \mu y)\,\mathbf{J}(\mu)\{l_2, l_3\}^{T},
\end{aligned} \tag{3.1}
$$
i.e., differentiation of the analytical expression of the zeroth-order eigenmode χ^[0]
with respect to µ, followed by evaluation at µ = µ0 , yields independent eigenmodes of higher orders. This derivative rule is easily implemented in the present
compliance-based formulation because the analytical expression of ξ^[0] = J(µ0 )η
is explicit and simple. In the Stroh formalism, where the eigenvectors are expressed
in terms of elastic constants instead of elastic compliances, analytical expressions
of the zeroth-order eigenvectors are lengthy, and the generation of higher-order
eigenvectors and eigenmodes via the derivative rule becomes prohibitively cumbersome.
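The product-rule structure of the derivative rule can be checked numerically on the stress-potential part of the eigenmode, b(µ) = J1(µ){l2(µ), l3(µ)}^T = {−µ l2, l2, l3}^T. The sketch below compares the analytical µ-derivative of g(x + µy) b(µ) with a central finite difference. The polynomial coefficients, the choice g(z) = z^λ with λ = 0.5, and the evaluation point are all hypothetical values picked for the illustration; the displacement part, which involves J2, is left out to keep the example short.

```python
# Sketch: derivative rule for the b-part of the eigenmode,
#   d/dmu [ g(x + mu*y) b(mu) ] = g * b'(mu) + y * g' * b(mu),
# with b(mu) = {-mu*l2, l2, l3}. All numbers below are illustrative.

L2 = [1.0, -0.4, 1.5]            # hypothetical coefficients of l2, ascending powers
L3 = [-0.1, 0.3, -0.2, 0.05]     # hypothetical coefficients of l3, ascending powers
LAM = 0.5                        # exponent in the sample function g(z) = z**LAM

def poly(c, mu):
    return sum(ck * mu**k for k, ck in enumerate(c))

def dpoly(c, mu):
    return sum(k * ck * mu**(k - 1) for k, ck in enumerate(c) if k > 0)

def g(z):
    return z**LAM

def dg(z):
    return LAM * z**(LAM - 1)

def b_vec(mu):
    l2, l3 = poly(L2, mu), poly(L3, mu)
    return [-mu * l2, l2, l3]

def chi0_b(x, y, mu):
    return [g(x + mu * y) * bi for bi in b_vec(mu)]

def chi1_b(x, y, mu):
    """Analytical mu-derivative following the structure of eq. (3.1)."""
    l2, dl2, dl3 = poly(L2, mu), dpoly(L2, mu), dpoly(L3, mu)
    db = [-l2 - mu * dl2, dl2, dl3]          # J1'{l2,l3} + J1{l2',l3'}
    z = x + mu * y
    return [g(z) * dbi + y * dg(z) * bi for dbi, bi in zip(db, b_vec(mu))]

def chi1_fd(x, y, mu, h=1e-6):
    """Central finite difference in mu of the zeroth-order expression."""
    up, dn = chi0_b(x, y, mu + h), chi0_b(x, y, mu - h)
    return [(u - d) / (2 * h) for u, d in zip(up, dn)]

X0, Y0, MU0 = 1.0, 0.5, 0.2 + 1.1j
print(chi1_b(X0, Y0, MU0))
print(chi1_fd(X0, Y0, MU0))
```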
The second class of degenerate materials has a triple eigenvalue µ0 which is
abnormal, that is, M(µ0 ) is the null matrix. Then equation (2.8) imposes no restriction on η. One may take ξ^[0] = J(µ0 ){0, 1}^T . Applying the derivative rule to
J(µ){l2 (µ), l3 (µ)}^T twice, one obtains two additional eigenvectors ξ^[1] =
J{l2′, l3′}^T (µ0 ) and ξ^[2] = 2J′{l2′, l3′}^T (µ0 ) + J{l2′′, l3′′}^T (µ0 ). All isotropic materials
belong to this class and the triple eigenvalues are µ0 = ±i.
Finally, extra-degenerate materials have a normal triple eigenvalue µ0 . Equation (2.8) has only one independent solution which is proportional to {l2 (µ0 ),
l3 (µ0 )}^T . Three independent eigenmodes are given by f (x + µy)J(µ){l2 (µ), l3 (µ)}^T
and its first and second derivatives with respect to µ, followed by evaluation at
µ = µ0 . The expression of equation (3.1) for χ^[1] remains valid. The expression
of χ^[2] involves all three eigenvectors of different orders:
$$ \boldsymbol{\chi}^{[2]} = f(x + \mu_0 y)\boldsymbol{\xi}^{[2]} + 2y f'(x + \mu_0 y)\boldsymbol{\xi}^{[1]} + y^{2} f''(x + \mu_0 y)\boldsymbol{\xi}^{[0]}. \tag{3.2} $$
Thus, for all degenerate and extra-degenerate materials, three complex conjugate
pairs of eigenvectors of the zeroth and higher orders may also be obtained which
form the 6 × 6 matrix of eigenvectors Z. However, the linear independence of
the higher-order eigenvectors needs to be proved. A proof is suggested in the next
section and the details may be found in [10].
The general two-dimensional solutions of degenerate and extra-degenerate materials are given by the following expression in place of equation (2.16):
$$ \boldsymbol{\chi} = \mathbf{Z}\mathbf{D}\langle f(x + \mu y)\rangle\mathbf{c}, \tag{3.3} $$
where D ≡ ⟨D1 , D̄1 ⟩ is a block-diagonal matrix of differential operators composed
of the 3 × 3 matrix D1 and its complex conjugate matrix D̄1 . For non-degenerate
materials, D is the identity matrix. The following expressions (3.4a) and (3.4b) give
D1 for degenerate and extra-degenerate materials, respectively,
$$ \mathbf{D}_1 \equiv \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & m\,\dfrac{\partial}{\partial\mu} \\ 0 & 0 & 1 \end{bmatrix}, \qquad \mathbf{D}_1 \equiv \begin{bmatrix} 1 & \dfrac{\partial}{\partial\mu} & \dfrac{\partial^{2}}{\partial\mu^{2}} \\ 0 & 1 & 2\,\dfrac{\partial}{\partial\mu} \\ 0 & 0 & 1 \end{bmatrix}, \tag{3.4a,b} $$
where m = 1 for a normal double eigenvalue and m = 2 for an abnormal triple
eigenvalue.
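Let [rot3] denote the rotation matrix with respect to the z-axis through an
angle θ:
$$ [\mathrm{rot3}] = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{3.5} $$
and let [rot6] ≡ ⟨[rot3], [rot3]⟩. Then, from equation (3.3),
$$ \Big\{\frac{1}{r}F_{,\theta}, -F_{,r}, \Psi, u_r, u_\theta, w\Big\}^{T} = [\mathrm{rot6}]\mathbf{Z}\mathbf{D}\langle f(x + \mu y)\rangle\mathbf{c}. \tag{3.6} $$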
Differentiating equation (3.1) with respect to the coordinates x and y, one obtains
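$$
\left\{ -\tau_{xy},\; -\sigma_y,\; -\tau_{yz},\; \varepsilon_x,\; \frac{\partial v}{\partial x},\; \gamma_{xz} \right\}^{T} = \partial_x \chi = Z D f'(x+\mu y)\,c,
\tag{3.7}
$$
$$
\left\{ \sigma_x,\; \tau_{xy},\; \tau_{xz},\; \frac{\partial u}{\partial y},\; \varepsilon_y,\; \gamma_{yz} \right\}^{T} = \partial_y \chi = Z D\,\mu f'(x+\mu y)\,c.
\tag{3.8}
$$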
Using the transformation rules of the displacements, strains and stresses from the
rectangular to the polar coordinates, one finds that
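$$
\left\{ -\tau_{r\theta},\; -\sigma_\theta,\; -\tau_{\theta z},\; \varepsilon_r,\; \partial_r u_\theta,\; \gamma_{rz} \right\}^{T}
= [\mathrm{rot6}]\,Z D \big\langle (\cos\theta + \mu\sin\theta)\, f'(x+\mu y) \big\rangle\, c,
\tag{3.9}
$$
$$
\left\{ \sigma_r,\; \tau_{r\theta},\; \tau_{rz},\; \frac{\partial_\theta u_r - u_\theta}{r},\; \varepsilon_\theta,\; \gamma_{\theta z} \right\}^{T}
= [\mathrm{rot6}]\,Z D \big\langle (\mu\cos\theta - \sin\theta)\, f'(x+\mu y) \big\rangle\, c.
\tag{3.10}
$$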
The shear strains γxy = ∂v/∂x + ∂u/∂y and γrθ = ∂r uθ + (∂θ ur − uθ )/r may be
obtained from the previous equations by taking linear combinations.
4. Multi-Material Wedges and Eigensolutions
A multi-material wedge is composed of N consecutive sectors of isotropic or
anisotropic materials that are perfectly bonded along radial interfaces which converge at the vertex of the wedge. We choose a polar coordinate system (r, θ) with
the vertex as the origin. The kth interface, θ = θk , separates the kth sector from
the (k + 1)th sector (k = 1, 2, . . . , N − 1). In the case of an open wedge, the first
and the last sectors are bounded, respectively, by exterior boundary edges θ = θ0
and θ = θN , on which boundary conditions of displacements, tractions, or of the
mixed type are imposed. An artificially defined curve Γ encircles the wedge and
demarcates the interior domain of the wedge from the surrounding structure. In
the case of a closed wedge, there are no exterior boundary edges. The radial lines
θ = θ0 and θ = θN coincide and become the interface between the first and the
Nth sector. Then Γ is a closed circuit.
We seek elasticity solutions of the wedge where the solution vector χ in each
sector is expressed by equation (3.3) with
$$
f_1 = f_2 = f_3 = (x+\mu y)^{\lambda} = r^{\lambda}(\cos\theta + \mu\sin\theta)^{\lambda}.
\tag{4.1}
$$
While the material eigenvalues µi vary from sector to sector, the parameter λ is
required to be the same for all sectors. This is required by the continuity of the
displacements and tractions across the sector interfaces. If the material of the kth
sector is non-degenerate, equation (2.15) yields
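$$
\chi^{(k)}(r,\theta) = r^{\lambda} Z^{(k)} \Lambda^{(k)}(\theta)\, c^{(k)},
\tag{4.2}
$$
where
$$
\Lambda^{(k)}(\theta) = \Big\langle (\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \mu_3^{(k)}\sin\theta)^{\lambda},\
(\cos\theta + \bar\mu_1^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \bar\mu_2^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \bar\mu_3^{(k)}\sin\theta)^{\lambda} \Big\rangle.
\tag{4.3}
$$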
Notice that equation (4.2) requires the dependence of the solution on the coordinates r and θ to be separated. In particular, the solution has the same θ-dependence
irrespective of r.
For a degenerate sector with µ2 = µ3, or an extra-degenerate sector with µ1 =
µ2 = µ3, equations (3.3), (3.4a,b) and (4.1) yield results of the form (4.2) where
$Z^{(k)}$ contains higher-order eigenvectors and where $\Lambda^{(k)}(\theta)$ is to be modified as
follows
$$
\Lambda^{(k)}(\theta) = \begin{bmatrix} \Lambda_1^{(k)} & 0 \\ 0 & \Lambda_2^{(k)} \end{bmatrix},
\tag{4.4}
$$
where for the degenerate case
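$$
\Lambda_1^{(k)} \equiv D_1 \Big\langle (\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda},\ (\cos\theta + \mu_3^{(k)}\sin\theta)^{\lambda} \Big\rangle
= \begin{bmatrix}
(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda} & 0 & 0 \\
0 & (\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda} & m\lambda\sin\theta\,(\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda-1} \\
0 & 0 & (\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda}
\end{bmatrix}
\tag{4.5}
$$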
(m = 1 for a normal double eigenvalue and m = 2 for an abnormal triple eigenvalue) whereas for the extra-degenerate case
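$$
\Lambda_1^{(k)} = \begin{bmatrix}
(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda} & \lambda\sin\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-1} & \lambda(\lambda-1)\sin^{2}\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-2} \\
0 & (\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda} & 2\lambda\sin\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-1} \\
0 & 0 & (\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda}
\end{bmatrix}
\tag{4.6}
$$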
In both cases the matrix $\Lambda_2^{(k)}$ is obtained from $\Lambda_1^{(k)}$ by merely replacing
$\cos\theta + \mu_1^{(k)}\sin\theta$ and $\cos\theta + \mu_2^{(k)}\sin\theta$ by the respective complex conjugates, while keeping
λ unchanged.
The undetermined complex coefficient vectors c(k) and c(k+1) in two consecutive
sectors are related according to the continuity of χ across the sector interface θ =
θk :
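$$
Z^{(k+1)}\Lambda^{(k+1)}(\theta_k)\,c^{(k+1)} = Z^{(k)}\Lambda^{(k)}(\theta_k)\,c^{(k)}.
$$
This implies the recurrence relation for the vectors $c^{(k)}$:
$$
c^{(k+1)} = T_k\,c^{(k)},
\tag{4.7}
$$
where
$$
T_k \equiv \big[\Lambda^{(k+1)}(\theta_k)\big]^{-1}\big[Z^{(k+1)}\big]^{-1} Z^{(k)}\Lambda^{(k)}(\theta_k)
\tag{4.8}
$$
is the transfer matrix relating the vectors $c^{(k)}$ in two consecutive sectors. Consequently,
$$
c^{(N)} = T_{N-1}T_{N-2}\cdots T_1\,c^{(1)}.
\tag{4.9}
$$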
The transfer matrices $T_k$, as defined by equation (4.8), involve the inverse matrices of $\Lambda^{(k+1)}$ and $Z^{(k+1)}$. It follows from equations (4.3)–(4.6) that, irrespective
of material degeneracy, the inverse matrix of $\Lambda^{(k)}$ may be obtained simply by
substituting −λ for λ in $\Lambda^{(k)}$, i.e.,
$$
\big[\Lambda^{(k)}(\lambda)\big]^{-1} = \Lambda^{(k)}(-\lambda).
\tag{4.10}
$$
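The inverse rule (4.10) is easy to check numerically for the upper-triangular blocks of the degenerate and extra-degenerate cases. The following Python sketch does so for arbitrarily chosen values of λ, θ and the material eigenvalues. It assumes that the (1,3) entry of the extra-degenerate block is the second µ-derivative of (cos θ + µ sin θ)^λ, as prescribed by the operator ∂²/∂µ² in (3.4b). All function names and sample values are illustrative and not taken from the paper.

```python
import math

def zeta(theta, mu):
    """The factor cos(theta) + mu*sin(theta) appearing in (4.1)."""
    return math.cos(theta) + mu * math.sin(theta)

def block_degenerate(lam, theta, mu1, mu2, m):
    """3x3 block for a degenerate sector; m = 1 or 2 as stated after (3.4a,b)."""
    a1, a2, s = zeta(theta, mu1), zeta(theta, mu2), math.sin(theta)
    return [[a1 ** lam, 0, 0],
            [0, a2 ** lam, m * lam * s * a2 ** (lam - 1)],
            [0, 0, a2 ** lam]]

def block_extra_degenerate(lam, theta, mu1):
    """3x3 block for an extra-degenerate sector (first and second mu-derivatives)."""
    a, s = zeta(theta, mu1), math.sin(theta)
    return [[a ** lam, lam * s * a ** (lam - 1), lam * (lam - 1) * s * s * a ** (lam - 2)],
            [0, a ** lam, 2 * lam * s * a ** (lam - 1)],
            [0, 0, a ** lam]]

def matmul3(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def deviation_from_identity(p):
    return max(abs(p[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))

lam, theta = 0.6 + 0.2j, 0.9          # arbitrary sample values
dev_degenerate = deviation_from_identity(matmul3(
    block_degenerate(lam, theta, 0.8j, 1.3j, 2),
    block_degenerate(-lam, theta, 0.8j, 1.3j, 2)))
dev_extra = deviation_from_identity(matmul3(
    block_extra_degenerate(lam, theta, 1j),
    block_extra_degenerate(-lam, theta, 1j)))
```

Both deviations come out at rounding-error level, so no numerical matrix inversion is needed for these blocks. Python's complex power uses the principal branch, which is adequate here because the same angle θ enters both factors.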
An explicit expression of the inverse matrix of Z (k) may also be given. It has been
shown that any two eigenvectors ξ and ξ ′ associated with different eigenvalues µ
and µ′ are orthogonal in the following sense (see [10] for a general proof for all
materials regardless of degeneracy)
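$$
[\xi, \xi'] \equiv \xi^{T}\,\mathrm{II}\,\xi' = 0,
\qquad\text{where}\qquad
\mathrm{II} \equiv \begin{bmatrix} 0_{3\times 3} & I_3 \\ I_3 & 0_{3\times 3} \end{bmatrix},
\tag{4.11}
$$
I3 and 03×3 denote the 3 × 3 identity and null matrices, respectively. In particular,
[ξ, ξ̄] = 0. Hence the six-dimensional solution space is the direct sum of an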
even number of eigenspaces, one corresponding to each distinct eigenvalue, whose
dimension equals the multiplicity of that eigenvalue. The eigenvectors belonging
to the same eigenvalue are generally not orthogonal in the sense of the bracket
product defined by equation (4.11). Clearly,
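$$
\Omega \equiv [Z_+, Z_+] \equiv Z_+^{T}\,\mathrm{II}\,Z_+ = A^{T}B + B^{T}A
\tag{4.12}
$$
is a symmetric block-diagonal matrix, and so is its complex conjugate $\bar\Omega$. On the
other hand, orthogonality of the vectors in $Z_+$ and $\bar Z_+$ implies that
$$
[Z_+, \bar Z_+] = A^{T}\bar B + B^{T}\bar A = 0.
\tag{4.13}
$$
Combining (4.12), (4.13) and their complex conjugates, one obtains
$$
[Z, Z] = Z^{T}\,\mathrm{II}\,Z =
\begin{bmatrix} A^{T} & B^{T} \\ \bar A^{T} & \bar B^{T} \end{bmatrix}
\begin{bmatrix} B & \bar B \\ A & \bar A \end{bmatrix}
= \begin{bmatrix} \Omega & 0 \\ 0 & \bar\Omega \end{bmatrix}.
\tag{4.14}
$$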
Hence the six-dimensional matrix Z is invertible if and only if the three-dimensional
symmetric matrix Ω is. The latter has been calculated for all types of anisotropic
materials and is shown to be nonsingular in every case. This provides a proof of
the linear independence of the six eigenvectors of the various orders. Hence the
eigenmatrix possesses a unique inverse given by
$$
Z^{-1} = \begin{bmatrix} \Omega^{-1} & 0 \\ 0 & \bar\Omega^{-1} \end{bmatrix}
\begin{bmatrix} A^{T} & B^{T} \\ \bar A^{T} & \bar B^{T} \end{bmatrix}.
\tag{4.15}
$$
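The structure of (4.15) can be seen in a deliberately reduced setting. The sketch below assumes anti-plane shear of an isotropic material with shear modulus G (a hypothetical value), for which the eigenvalue is µ = i and the two-component eigenvector {Ψ, w} may be taken as {−iG, 1}. The blocks B and A are then scalars. This 2 × 2 reduction is an illustration only and is not a case worked out in the paper.

```python
def matmul2(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

G = 3.0                      # hypothetical shear modulus
B = -1j * G                  # stress-potential part of the eigenvector
A = 1.0 + 0j                 # displacement part of the eigenvector

Z = [[B, B.conjugate()],
     [A, A.conjugate()]]
II = [[0, 1],
      [1, 0]]

# Scalar analogues of (4.12) and (4.13).
omega = A * B + B * A
bracket_with_conjugate = A * B.conjugate() + B * A.conjugate()

# Z^{-1} = diag(omega^{-1}, conj(omega)^{-1}) Z^T II, the analogue of (4.15).
Z_transpose = [[Z[0][0], Z[1][0]],
               [Z[0][1], Z[1][1]]]
omega_inverse = [[1 / omega, 0],
                 [0, 1 / omega.conjugate()]]
Z_inverse = matmul2(omega_inverse, matmul2(Z_transpose, II))
check = matmul2(Z_inverse, Z)
```

In this reduced case `bracket_with_conjugate` vanishes, `omega` equals −2iG, and `check` is the 2 × 2 identity matrix, so the inverse is obtained without any numerical elimination.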
The form of the matrix $\Omega^{-1}$ depends on the material type. For a material with three
distinct complex conjugate pairs of eigenvalues, one has
$$
\Omega^{-1} = \Big\langle \frac{1}{\delta'(\mu_1)},\ \frac{1}{\delta'(\mu_2)},\ \frac{1}{\delta'(\mu_3)} \Big\rangle,
\tag{4.16}
$$
where δ(µ) was defined in equation (2.10b). For other types of materials, $\Omega^{-1}$ can
be expressed in terms of the functions l2, l3, l4, δ and their derivatives of the various
orders. These expressions are less simple and may be found in [10]. Substituting
equations (4.10) and (4.15) into (4.8), one obtains a reduced explicit expression
of $T_k$.
The transfer matrix $T_k$ of equation (4.7) relates the constant vectors $c^{(k)}$ and
$c^{(k+1)}$ of two consecutive sectors. A second type of transfer matrices may be defined
to relate the values of the eigensolution on two consecutive interfaces:
$$
\chi(r, \theta_k) = \widetilde T_k\,\chi(r, \theta_{k-1}),
\tag{4.17}
$$
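where
$$
\widetilde T_k \equiv Z^{(k)}\Lambda^{(k)}(\theta_k)\big[\Lambda^{(k)}(\theta_{k-1})\big]^{-1}\big[Z^{(k)}\big]^{-1}.
\tag{4.18}
$$
Hence
$$
r^{-\lambda}\chi(\theta_N) = \Big(\prod \widetilde T_k\Big)\, r^{-\lambda}\chi(\theta_0) = \widetilde T_N \widetilde T_{N-1}\cdots \widetilde T_1\, r^{-\lambda}\chi(\theta_0).
\tag{4.19}
$$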
We now consider the boundary conditions on the exterior edges, which will provide the last elements for the determination of the eigenvalues and the associated
eigensolutions. For a closed wedge, the lines θ = θ0 and θ = θN coincide, and the
displacements and stress potentials are required to be continuous across the line.
Therefore,
$$
\Big(\prod \widetilde T_k - I_6\Big)\chi(\theta_0) = 0.
\tag{4.20}
$$
This yields the characteristic equation for the closed wedge:
$$
\mathrm{Det}\Big[\prod \widetilde T_k - I_6\Big] = 0.
\tag{4.21}
$$
For every root λ of the characteristic equation, equation (4.20) has a nontrivial
solution χ(θ0 ), uniquely determined except for an arbitrary complex factor. Then
$$
c^{(1)} = \big[\Lambda^{(1)}(\theta_0)\big]^{-1}\big[Z^{(1)}\big]^{-1} r^{-\lambda}\chi(\theta_0),
\tag{4.22}
$$
and equation (4.7) yields c(k) of the other sectors. The displacements and the stress
potentials are given by equation (4.2) in the successive sectors.
For an open wedge, there are generally three homogeneous boundary conditions
imposed on the six components of [rot6]χ = {(1/r)F,θ , −F,r , Ψ, ur , uθ , w}T at
θ = θ0 , and another set of three conditions at θ = θN . A problem involving
nonhomogeneous boundary conditions may be reduced to one with homogeneous
boundary conditions by superposing a suitable particular solution. Hence the two
sets of conditions may be written as
Q0 [rot6(θ0 )]χ(θ0 ) = 0,
QN [rot6(θN )]χ(θN ) = 0,
(4.23a,b)
where Q0 and QN are 3×6 matrices of rank 3, i.e., each having at least one nonsingular 3×3 submatrix. We may interchange certain pairs of elements of [rot6]χ(θ0 )
and the corresponding pairs of columns of Q0 so that, after the rearrangement,
[rot6]χ(θ0 ) becomes a column vector whose first and last three elements form
the vectors χ ′ and χ ′′ , respectively, whereas the matrix Q0 changes to [Q′ , Q′′ ],
in which the first 3 × 3 submatrix Q′ is nonsingular. Equation (4.23a) becomes
Q′ χ ′ + Q′′ χ ′′ = 0. Hence,
χ ′ = −(Q′ )−1 Q′′ χ ′′ .
The vector [rot6(θ0 )]χ(θ0 ) is related to the rearranged vector {χ ′T , χ ′′T }T by a 6×6
matrix K. This matrix has zero elements except the following:
(i) if the rearrangement leaves any element of the vector [rot6(θ0 )]χ(θ0 ) unchanged, then the corresponding diagonal element of K has the value 1 and
(ii) if any pair of distinct elements, the ith and the j th, are interchanged in the
rearrangement, then Kij = Kj i = 1.
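Rules (i) and (ii) describe a permutation matrix made of disjoint pairwise swaps. A minimal Python sketch of such a matrix follows. The helper name `swap_matrix` and the particular swaps are hypothetical, and the indices are zero-based, unlike the one-based indices of the text.

```python
def swap_matrix(size, swaps):
    """Build K from rules (i) and (ii): unit diagonal for untouched
    elements, K[i][j] = K[j][i] = 1 for each interchanged pair (i, j)."""
    K = [[0] * size for _ in range(size)]
    moved = set()
    for i, j in swaps:
        K[i][j] = 1
        K[j][i] = 1
        moved.update((i, j))
    for i in range(size):
        if i not in moved:
            K[i][i] = 1
    return K

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Hypothetical rearrangement: exchange elements 0 <-> 3 and 2 <-> 5.
K = swap_matrix(6, [(0, 3), (2, 5)])
K_squared = matmul(K, K)
```

Applying the same rearrangement twice restores the original order, so `K_squared` is the identity matrix, provided the interchanged pairs do not share an index.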
If the rearrangement specified by K is repeated once, then the elements of the rearranged vector resume their original positions. Hence K−1 = K. Equations (4.23)
and (4.19) yield
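$$
[\mathrm{rot6}(\theta_0)]\,\chi(\theta_0) = K \begin{Bmatrix} \chi' \\ \chi'' \end{Bmatrix}
= K \begin{Bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{Bmatrix} \chi'',
\tag{4.24}
$$
$$
Q_N\,[\mathrm{rot6}(\theta_N)]\,\Big(\prod \widetilde T_k\Big)\,[\mathrm{rot6}(-\theta_0)]\,K \begin{Bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{Bmatrix} \chi'' = 0.
\tag{4.25}
$$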
Hence the general form of the characteristic equation for the eigenvalues of an open
multi-material wedge is
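$$
\mathrm{Det}\left[ Q_N\,[\mathrm{rot6}(\theta_N)]\,\Big(\prod \widetilde T_k\Big)\,[\mathrm{rot6}(-\theta_0)]\,K \begin{Bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{Bmatrix} \right] = 0.
\tag{4.26}
$$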
The displacement and stress fields in the various sectors are obtained in a way
similar to the eigensolutions of a closed wedge.
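The chain of steps leading to (4.26) can be followed in a reduced example. The Python sketch below assumes anti-plane shear of isotropic sectors, for which only the pair {Ψ, w} is involved, the material eigenvalue is µ = i, the eigenvector may be taken as {−iG, 1}, and the rotation acts as the identity on these two components. Both exterior edges are assumed traction-free, expressed here as Ψ = 0, so that Q′ = 1, Q″ = 0 and K is the identity. The shear moduli and sector angles are hypothetical, and the reduction is an illustration that is not taken from the paper.

```python
import math

def matmul2(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def inverse2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def eigenmatrix(G):
    """Columns: assumed eigenvector {Psi, w} for mu = i and its conjugate."""
    return [[-1j * G, 1j * G],
            [1.0, 1.0]]

def lambda_matrix(lam, theta):
    """Diagonal matrix of (cos t + mu sin t)**lam for mu = i and mu = -i."""
    a = complex(math.cos(theta), math.sin(theta))
    return [[a ** lam, 0.0],
            [0.0, a.conjugate() ** lam]]

def sector_transfer(G, lam, theta_prev, theta_next):
    """Second-type transfer matrix of (4.18), using (4.10) for the inverse."""
    Z = eigenmatrix(G)
    D = matmul2(lambda_matrix(lam, theta_next), lambda_matrix(-lam, theta_prev))
    return matmul2(matmul2(Z, D), inverse2(Z))

def characteristic(lam, shear_moduli, thetas):
    """Reduced form of (4.26): Psi(theta_N) produced by chi(theta_0) = {0, 1}."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for k, G in enumerate(shear_moduli):
        # Later sectors multiply from the left, as in (4.19).
        total = matmul2(sector_transfer(G, lam, thetas[k], thetas[k + 1]), total)
    return total[0][1]

# Two sectors of 135 degrees each (a 270-degree reentrant corner).
thetas = [-0.75 * math.pi, 0.0, 0.75 * math.pi]
value_at_root = characteristic(2.0 / 3.0, [1.0, 5.0], thetas)
value_elsewhere = characteristic(0.5, [1.0, 5.0], thetas)
```

For this symmetric geometry the function vanishes at λ = 2/3, the classical anti-plane value π/(2α) for a notch of total opening 2α = 3π/2, while it is clearly nonzero at λ = 0.5. The principal branch of the complex power is continuous only for sector angles strictly inside (−π, π), so a crack face at θ = ±π would need separate care.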
If the real part of λ is smaller than one, then the stress field near r = 0 has a
singularity of the type $r^{-(1-\mathrm{Re}[\lambda])}$. Such wedge eigenvalues are called singular. The roots
of the characteristic equation with real parts smaller than or equal to zero must
be ignored because the associated eigensolutions require infinite strain energy.
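The classification just described can be written as a short helper. This is a sketch only: the function name and the labels "discarded", "singular" and "nonsingular" are illustrative and do not appear in the paper.

```python
def classify_wedge_eigenvalue(lam):
    """Return (label, order) for a wedge eigenvalue lam.

    order is the exponent p in the stress behaviour r**(-p) near r = 0,
    that is p = 1 - Re(lam); it is reported only for singular eigenvalues.
    """
    real_part = complex(lam).real
    if real_part <= 0.0:
        return ("discarded", None)      # would require infinite strain energy
    if real_part < 1.0:
        return ("singular", 1.0 - real_part)
    return ("nonsingular", None)

label, order = classify_wedge_eigenvalue(0.4891194)
```

For λ = 0.4891194 the helper reports a singular eigenvalue with stress behaving like r^(−0.5108806).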
We now show that the wedge eigenvalues λ occur in complex conjugate pairs.
According to equation (4.2), an eigensolution of the wedge has the following expression in each sector
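$$
\chi = r^{\lambda}
\begin{bmatrix} B & \bar B \\ A & \bar A \end{bmatrix}
\begin{bmatrix} D_1\big\langle (\cos\theta + \mu\sin\theta)^{\lambda} \big\rangle & 0 \\ 0 & \bar D_1\big\langle (\cos\theta + \bar\mu\sin\theta)^{\lambda} \big\rangle \end{bmatrix} c.
\tag{4.27}
$$
The complex conjugate of the preceding expression yields
$$
\bar\chi = r^{\bar\lambda}
\begin{bmatrix} \bar B & B \\ \bar A & A \end{bmatrix}
\begin{bmatrix} \bar D_1\big\langle (\cos\theta + \bar\mu\sin\theta)^{\bar\lambda} \big\rangle & 0 \\ 0 & D_1\big\langle (\cos\theta + \mu\sin\theta)^{\bar\lambda} \big\rangle \end{bmatrix} \bar c
= r^{\bar\lambda}
\begin{bmatrix} B & \bar B \\ A & \bar A \end{bmatrix}
\begin{bmatrix} D_1\big\langle (\cos\theta + \mu\sin\theta)^{\bar\lambda} \big\rangle & 0 \\ 0 & \bar D_1\big\langle (\cos\theta + \bar\mu\sin\theta)^{\bar\lambda} \big\rangle \end{bmatrix} \hat c,
\tag{4.28}
$$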
where ĉ is obtained from c̄ by interchanging the first three and the last three elements. Comparing equations (4.27) and (4.28), one finds that all interfacial continuity conditions as well as the homogeneous boundary conditions on θ = θ0 and θ =
θN of an open wedge remain satisfied when λ and c(k) are replaced by λ̄ and ĉ(k) ,
respectively (k = 1, 2, . . . , N). Therefore, if λ and c(1) , c(2) , . . . , c(N) constitute an
eigensolution of the multi-material wedge, then so do λ̄ and ĉ(1) , ĉ(2) , . . . , ĉ(N) . For
such a pair of solutions, the combined displacements and stress potentials, χ + χ̄,
are real-valued in all sectors.
In particular, if λ is a real eigenvalue with the associated vector c(k) in the kth
sector, then equation (4.28) implies that λ and ĉ(k) also constitute an eigensolution,
and so do λ and c(k) + ĉ(k) . While c(k) + ĉ(k) are generally not real, they yield real-valued displacements and stresses in all sectors. The last three elements of c(k) + ĉ(k)
are the complex conjugates of the first three.
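The operation c → ĉ and the conjugate symmetry of c + ĉ can be illustrated in a few lines of Python. The vector below is an arbitrary sample, and the function name `hat` is an illustrative choice.

```python
def hat(c):
    """Conjugate a 6-vector, then interchange its first and last three elements."""
    conjugated = [complex(x).conjugate() for x in c]
    return conjugated[3:] + conjugated[:3]

c = [1 + 2j, 0.5j, -3, 2 - 1j, 4, 1j]        # arbitrary sample coefficients
combined = [x + y for x, y in zip(c, hat(c))]
```

The last three entries of `combined` are the complex conjugates of the first three, as stated in the text, and applying `hat` twice returns the original vector.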
If λ is a repeated root of the characteristic equation, then equation (4.20) or
(4.25) (for closed and open wedges, respectively) may give more than one independent solution χ(θ0 ) or χ ′′ , and equation (4.22) gives an equal number of
independent vectors c(1) . Each vector yields an independent eigensolution. However, the number of independent solutions of equation (4.20) or (4.25) may be
smaller than the multiplicity of the eigenvalue λ. In such a case, additional eigensolutions involving the factors $\log(x + \mu_i^{(k)} y)$ and possibly their integer powers
may be obtained by differentiating $\chi^{(k)} = Z^{(k)} D (x + \mu^{(k)} y)^{\lambda} c^{(k)}$ with respect
to λ. The first derivative yields the additional solution
$$
\chi^{*(k)} = Z^{(k)} D \log(x + \mu^{(k)} y)\,(x + \mu^{(k)} y)^{\lambda}\, c^{(k)} + Z^{(k)} D (x + \mu^{(k)} y)^{\lambda}\, c^{*(k)},
\tag{4.29}
$$
where, if equations (4.22) and either (4.20) or (4.25) are recast in the form $G_0(\lambda)c^{(1)} = 0$, then $c^{*(1)}$ is obtained by solving the equation
$$
G_0\, c^{*(1)} + \frac{dG_0}{d\lambda}\, c^{(1)} = 0,
\tag{4.30}
$$
and $c^{*(k)}$ in the other sectors may be obtained recursively from
$$
c^{*(k+1)} = T_k\, c^{*(k)} + \frac{dT_k}{d\lambda}\, c^{(k)}.
\tag{4.31}
$$
All functions of λ in equations (4.29)–(4.31) should be evaluated at the repeated
root, and the evaluation should be made only after performing all required differentiations with respect to λ.
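The origin of the logarithmic factor and of the relations (4.30) and (4.31) can be sketched as follows. The sketch assumes, for illustration only, that the vectors $c^{(k)}$ may be treated as differentiable functions of λ near the repeated root, and writes $c^{*(k)}$ for their derivatives with respect to λ:

$$
\begin{aligned}
\frac{\partial}{\partial\lambda}\,(x+\mu y)^{\lambda} &= \log(x+\mu y)\,(x+\mu y)^{\lambda},\\[4pt]
\frac{d}{d\lambda}\Big[G_0(\lambda)\,c^{(1)}\Big] = 0 \quad&\Longrightarrow\quad G_0\,c^{*(1)} + \frac{dG_0}{d\lambda}\,c^{(1)} = 0,\\[4pt]
\frac{d}{d\lambda}\Big[c^{(k+1)} - T_k\,c^{(k)}\Big] = 0 \quad&\Longrightarrow\quad c^{*(k+1)} = T_k\,c^{*(k)} + \frac{dT_k}{d\lambda}\,c^{(k)}.
\end{aligned}
$$

The first line accounts for the logarithm in the additional eigensolution, and the product rule applied to $G_0 c^{(1)}$ and to the recurrence (4.7) gives the two linear relations.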
The general characteristic equations for closed and open wedges, equations (4.21) and (4.26), have an infinite number of roots. The eigenvalues must
generally be determined by numerical methods. Furthermore, multiple eigenvalues
need to be identified. For each multiple eigenvalue λ, equation (4.20) or (4.25)
yields 3 − r independent solutions χ(θ0 ) or χ ′′ , where r is the rank of the matrix in
the respective equation. If the multiplicity of λ is larger than 3 − r, then additional
eigensolutions of the form (4.29), and others obtained by further differentiation with
respect to λ, must be found.
Due to the complexity of the characteristic equations, it is not practical to ascertain the multiplicity of a root by taking and evaluating the derivatives of the
equation. However, the argument principle in the complex variable theory [19]
provides an exceedingly useful mathematical tool for exhaustive search of all roots
in any finite region of the complex plane. The principle gives the number of zeros of
a function in the region enclosed by any closed path, with a repeated root counted
as many times as its multiplicity. Hence the multiplicity can be determined except
in the case when there are two very close roots. For practical purposes, however,
two or more slightly different roots may be treated as a single multiple root, and
the elasticity solution is not appreciably affected by this alteration.
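A minimal Python sketch of root counting by the argument principle is given below. It measures the winding of the function value along a circle rather than integrating f′/f, so no derivative is needed. The function `count_zeros`, the sampling density and the polynomial used as a stand-in for a characteristic function are all illustrative assumptions, not the author's implementation.

```python
import cmath
import math

def count_zeros(f, center, radius, samples=4000):
    """Count zeros of f inside a circle from the winding of f along it.

    Assumes f is analytic on and inside the circle, has no zero on the
    circle itself, and that `samples` is large enough for the phase of f
    to change by less than pi between neighbouring sample points.
    """
    total_phase = 0.0
    previous = f(center + radius)
    for k in range(1, samples + 1):
        point = center + radius * cmath.exp(2j * math.pi * k / samples)
        current = f(point)
        total_phase += cmath.phase(current / previous)
        previous = current
    return round(total_phase / (2.0 * math.pi))

def example(lam):
    # Hypothetical stand-in: double root at 1 and simple root at 2.5.
    return (lam - 1.0) ** 2 * (lam - 2.5)

inside_small = count_zeros(example, 1.0, 1.0)   # encloses only the double root
inside_large = count_zeros(example, 2.0, 2.0)   # encloses all three roots
```

The double root is counted twice, which is how the multiplicity of a repeated wedge eigenvalue can be read off without differentiating the characteristic equation. If the function has poles inside the contour, the count is the number of zeros minus the number of poles.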
5. Examples: Elasticity Solutions of a Bisector Wedge and a Trisector Wedge
Two examples are given in this section to illustrate the entire procedure of elasticity analysis of multimaterial wedges in a composite structure subjected to arbitrary mechanical loads. For each example, several solutions based on truncated
eigenseries of different lengths are computed to ascertain accuracy and convergence, and to examine the relevance and usefulness of the asymptotic solution and
the associated “generalized stress intensity factors”. The procedure includes the
following steps:
(i) Use a conventional finite element structural analysis code to determine the
traction vector along a circular path Γ encircling the vertex of the wedge. The
path Γ should be separated from the vertex by at least a few rings of elements.
Although the finite element solution cannot closely approximate the singular
stress field in the immediate vicinity of the vertex, it yields sufficiently accurate results of stress on the path Γ. This may be validated by refining the
mesh and comparing the original and refined finite element solutions.
(ii) The traction data σr , τrθ and τrz on Γ are curve-fitted by Fourier series in θ.
The results are integrated analytically with respect to θ to obtain the data of
F,r , F,θ and Ψ.
(iii) The material eigenvalues and eigenvectors are found for each sector of the
wedge by using symbolic algebraic capabilities of Mathematica [20]. In the
cases of degenerate or extra-degenerate sectors, higher-order eigenvectors associated with multiple eigenvalues, as described in Section 3, are obtained
by implementing the derivative rule. This yields the matrices $Z^{(k)}$ of equation (4.2) for all sectors. The matrix functions $\Lambda^{(k)}(\theta)$ of equation (4.3) are
also determined except for the wedge eigenvalue λ.
(iv) The characteristic equation for a closed or open wedge, i.e., equation (4.21)
or (4.26), respectively, is derived explicitly in closed analytical form by using symbolic algebra. For multimaterial wedges with more than two sectors,
the characteristic equation contains hundreds of terms or more. All real and
complex eigenvalues with the real parts below a certain level are determined
by numerical techniques. This is achieved in Mathematica by using the “FindRoot” command in conjunction with contour-plotting. However, the argument
principle provides crucial help for ascertaining the number of roots (and the
multiplicity of repeated roots) within any closed curve. Each real or complex
root λ determines a wedge eigensolution whose analytical expression in the
kth sector is given by equations (4.2)–(4.6).
(v) The wedge eigensolutions associated with the successive wedge eigenvalues
are linearly combined to form a (truncated) eigenseries. The coefficients of
combination are determined by collocation along the path Γ. Generally there
are more data points than coefficients, and a least-square error criterion is
used to best fit the data of stress potentials on the collocation path.
Figure 1. Deformed 3-layer model under shear.
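Step (v) is an overdetermined linear fit. The Python sketch below shows the idea with two real basis functions of θ standing in for the wedge eigensolutions evaluated on the collocation path. The basis functions, the synthetic data and the use of normal equations are assumptions made for illustration. The actual eigenseries has complex coefficients and many more terms, and the paper does not state how the least-squares problem was solved.

```python
import math

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def least_squares(basis, thetas, data):
    """Fit data(theta) by sum_j coeff_j * basis[j](theta) via normal equations."""
    rows = [[phi(t) for phi in basis] for t in thetas]
    n = len(basis)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * d for r, d in zip(rows, data)) for i in range(n)]
    return solve_linear(ata, atb)

# Hypothetical angular shapes on a 270-degree collocation arc.
basis = [lambda t: math.cos(2.0 * t / 3.0), lambda t: math.sin(4.0 * t / 3.0)]
thetas = [-0.75 * math.pi + 1.5 * math.pi * k / 40 for k in range(41)]
data = [3.0 * basis[0](t) - 0.5 * basis[1](t) for t in thetas]   # synthetic data
coefficients = least_squares(basis, thetas, data)
```

With 41 data points and two unknowns the fit recovers the coefficients 3 and −0.5 used to generate the data. For long eigenseries the normal equations can become ill-conditioned, and an orthogonal-factorization solver would be a safer choice.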
The composite structural model to be studied has three layers of equal thickness
2 cm. The middle layer has the length 12 cm in the x-direction while the top and
bottom layers are 14 cm long, as shown in Figure 1 in its deformed state under a
shear loading. All three layers are made of the same unidirectional graphite-epoxy
composite whose homogenized anisotropic elastic properties are characterized by
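the following elastic constants:
$$
\begin{aligned}
&E_1 = 181\ \text{GPa}, \quad E_2 = E_3 = 10.3\ \text{GPa}, \quad \nu_{12} = \nu_{13} = \nu_{23} = 0.28,\\
&G_{12} = G_{13} = 7.17\ \text{GPa}, \quad G_{23} = 4.023\ \text{GPa}.
\end{aligned}
\tag{5.1}
$$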
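The constants in (5.1) can be checked for consistency with isotropy in the plane transverse to the fibers, for which the standard relation G23 = E2 / (2(1 + ν23)) holds. The short Python check below uses only the values listed in (5.1); the variable names are illustrative.

```python
E2 = 10.3            # GPa, from (5.1)
nu23 = 0.28          # from (5.1)
G23_listed = 4.023   # GPa, from (5.1)

# Shear modulus implied by isotropy in the 2-3 plane.
G23_from_isotropy = E2 / (2.0 * (1.0 + nu23))
relative_gap = abs(G23_from_isotropy - G23_listed) / G23_listed
```

The implied value is 4.0234 GPa, which matches the listed G23 to the quoted precision.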
The fiber axes in the bottom, middle and top layers are oriented at angles 30°, 0°
and −60°, respectively, with respect to the z-direction. The middle layer has a crack
in a 45° inclined plane that runs through the entire thickness of the layer, and also
through the entire width in the z-direction. This is a matrix crack since the crack
plane is parallel to the fibers.
Since the stress solutions near the singularities will be examined at various scale
lengths including those that are much smaller than fiber diameters, the elasticity
solutions to be obtained are not so much appropriate to the three-layer composite
model as to the layerwise homogenized model. That is, strictly speaking, our analysis and solutions concern the layerwise homogenized model, not the composite
model. But the three layers will still be designated as 30°, 0° and −60° layers.
The lower surface of the model is fixed and the upper surface is moved rigidly
in the negative x-direction through a distance 1/100 cm. Plane strain condition
εz = 0 is maintained for the model.
The material eigenvalues of the three layers are all purely imaginary:

    −60° layer       0° layer     30° layer
    ±0.871351 i      ±i           ±0.959036 i
    ±1.25958 i       ±i           ±1.093396 i
    ±4.26833 i       ±i           ±2.596064 i          (5.2)

The 0° middle layer is transversely isotropic, and therefore degenerate (but not
extra-degenerate). It has two zeroth-order material eigenvectors and one first-order
eigenvector.
There are six multimaterial singularities in this model. The singularities at the
two ends of the inclined crack are associated with trisector wedges. At points B
and C, one has the well-known free-edge singularities, which will not be studied
in this work. The bisector wedges at A and B are similar but not identical to those
at D and C, respectively, due to the different orientations of the top and bottom
layers. For the two bisector wedges at the reentrant corners A and D, and for the
two trisector wedges at the ends of the inclined crack, the wedge eigenvalues are
shown in Table I. For these four wedges, all singular eigenvalues are real. Hence the
elasticity solutions do not show oscillatory behavior in the immediate vicinities of
the singularities. In addition, as one approaches the vertex of a wedge, the elasticity
solution approaches the asymptotic limit determined by the real eigenvector associated with the dominant real eigenvalue. Both the radial and angular dependence
of this real asymptotic solution are determined by the geometry and material of the
wedge sectors and the edge boundary conditions except for a real amplitude factor
which is determined by remote loading. Thus, changes in remote loading can only
affect this amplitude factor but cannot change the ratios of stress components of the
asymptotic solution. This is in stark contrast with the case of complex conjugate
dominant singularities where the stress ratios of the asymptotic solution (for example, the stress-intensity factors of interface cracks) generally depend on remote
loading.
The distribution of eigenvalues in the complex plane has a similar pattern for the
two corner wedges, and also for the two trisector wedges, despite large difference
in the axial stiffness of the −60° and 30° layers (110.693 GPa and 24.690 GPa,
respectively). In general, the wedge eigenvalues are strongly affected by the edge
conditions and the wedge geometry (including the number of sectors and sector
angles), but are less sensitive to moderate changes in the elastic constants of the
sectors. The two bisector wedges have simple integer eigenvalues 1, 2, 4, 6, etc.
For the trisector wedges, every positive integer is a triple eigenvalue. However, one
of the three eigenvectors associated with λ = 1 is a rigid rotation mode that contributes no stress. It is interesting to notice that the dominant singular eigenvalues
of the two trisector wedges are smaller than 0.5 (λ = 0.4891194 and 0.4732559,
respectively, for the wedge at the lower and upper end of the crack). That is, the
strengths of the singularities exceed that of an interface crack.

Table I. Eigenvalues of four multimaterial wedges

[−60/0] at A               [0/30] at D                Lower end of crack         Upper end of crack
0.5924654                  0.5585964                  0.4891194                  0.4732559
0.6760317                  0.7150752                  0.5056793                  0.5271834
0.9679459                  0.9339855                  0.6130216                  0.6922293
1                          1                          1, 1, 1                    1, 1, 1
1.3290965                  1.2822588                  1.4577165                  1.3592020
1.6881490 ± 0.2455260 i    1.6422164 ± 0.2330850 i    1.5280594 ± 0.0266804 i    1.4947400 ± 0.0418597 i
2                          2                          2, 2, 2                    2, 2, 2
2.1615945 ± 0.2719786 i    2.2631785 ± 0.3099347 i    2.4480841 ± 2.4652695      2.4973147 ± 0.1793125 i
2.6696804                  2.7187519                  2.5869116                  2.5318482
2.9973732 ± 0.1681712 i    2.9863063 ± 0.2948049 i    3, 3, 3                    3, 3, 3
3.3304593                  3.2811556                  3.5396360 ± 0.0972610 i    3.5437306
3.8604587 ± 0.7342584 i    3.7095433 ± 0.4676336 i    3.7955437                  3.8255711 ± 0.9599962 i, 3.8721508 ± 0.3127019 i
4                          4                          4, 4, 4                    4, 4, 4
4.0072164 ± 0.2662313 i    4.2072597 ± 0.4663597 i    4.2567490 ± 0.1718115 i    4.4694496
4.6694182                  4.7189494                  4.4462049                  5, 5, 5
4.9992643 ± 0.1843503 i    4.9936378 ± 0.3531516 i    5, 5, 5                    5.2557896 ± 0.4775619 i
5.3306112                  5.2810307                  5.3228318 ± 0.2168114 i    5.4666851
5.8853940 ± 1.0211285 i    5.8077471 ± 0.6126472 i    5.4462668                  6, 6, 6
6                          6                          6, 6, 6                    6.5271205
6.0013338 ± 0.2422297 i    6.1164562 ± 0.5507335 i    6.5306451                  6.5893884 ± 0.5830187 i
6.6693539                  6.7189994                  6.5936590 ± 0.3364298 i    7, 7, 7
6.9997085 ± 0.1893604 i    6.9967129 ± 0.3782984 i    7, 7, 7                    7.5344069
7.3306568                  7.2809934                  7.5448581                  7.7590945 ± 1.9577296 i
7.9007901 ± 1.2123062 i    7.9133476 ± 0.8085709 i    7.8660481 ± 1.1948926 i    7.9333809 ± 0.6565767 i
                                                      7.9450938 ± 0.3830909 i    8, 8, 8
                                                      8, 8, 8

For the bisector wedge at A, an elasticity solution is obtained by the preceding
procedure, using the traction data on a circle of radius r0 = 1 cm generated by
a finite element analysis using over 2000 triangular elements. Twenty-two wedge
eigensolutions are combined, including all eigensolutions with λ ≤ 5. The resulting interfacial stresses between 0° and −60° layers are shown in Figure 2. The leading term of the eigenseries contributes the dominant singular stress field of the order $r^{\lambda-1}$, where λ = 0.5924654 is the first eigenvalue. When this asymptotic stress
field is multiplied by the factor $r^{1-\lambda}$, the result is independent of r, and is a function
of θ only. The values of its components on the upper and lower sides of the interface
θ = 0 may be taken as the generalized stress intensity factors $S_{ij}^{+}$ and $S_{ij}^{-}$ (in fact, $S_{ij}^{+}$
and $S_{ij}^{-}$ determine each other algebraically due to the three continuity conditions of
tractions and another three continuity conditions of tangential strains):
$$
\begin{aligned}
S_{yy}^{+} &= S_{yy}^{-} = 5.89400, & S_{xy}^{+} &= S_{xy}^{-} = -2.41276, & S_{yz}^{+} &= S_{yz}^{-} = -1.27436,\\
S_{xx}^{+} &= 20.8647, & S_{zz}^{+} &= 6.84576, & S_{xz}^{+} &= -8.23518,\\
S_{xx}^{-} &= 4.12336, & S_{zz}^{-} &= 2.77966, & S_{xz}^{-} &= 0.94912,
\end{aligned}
\tag{5.3}
$$
where the unit of the stress is MPa.
Figure 2. Interfacial stresses of the bisector wedge.
The results of Figure 2 can only be deciphered for the range $10^{-2} \le \bar r \le 1$,
where r̄ ≡ r/r0 . In Figures 3(a)–(c), the elasticity solutions of the interfacial
stresses are normalized through multiplication by the factor $\bar r^{\,1-\lambda}$, and plotted in
solid curves as functions of log10 (r̄). It is seen that the normalized σy and τyz
approach the asymptotic solution very slowly (Figures 3(a) and (c)) as compared to
τxy . Significant discrepancies are found even as r decreases to the subatomic scale
and beyond (notice that since r0 = 1 cm, $r = 10^{-9}$ m corresponds to log10 (r̄) =
−7). This is due to the closeness of the first two eigenvalues. A two-term approximate solution obtained by discarding all except the first two eigensolutions
in the 22-term elasticity solution yields interfacial stresses that are shown as broken curves in Figures 3(a)–(c). These results show excellent agreement with the
22-term solution except for the domain $\bar r > 10^{-3}$. The tangential stresses σx , σz
and τxz are discontinuous across the interface. Their normalized values on the upper
and lower sides of the interface are shown in Figures 4(a) and (b) with r̄ also plotted
in the logarithmic scale.
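The effect of two close eigenvalues can be made concrete with a one-line estimate. Relative to the leading term, the second term of the eigenseries scales like r̄ raised to the power λ2 − λ1. The Python sketch below evaluates this factor for the first two eigenvalues of the wedge at A from Table I. It assumes, purely for illustration, that the two terms have coefficients of equal magnitude. The true ratio depends on the loading and on the stress component.

```python
def relative_weight(lam1, lam2, rbar):
    """Size of the r**(lam2 - 1) term relative to the r**(lam1 - 1) term,
    for coefficients of equal magnitude (an illustrative assumption)."""
    return rbar ** (lam2 - lam1)

# First two eigenvalues of the bisector wedge at A (Table I), at rbar = 1e-7.
weight_bisector = relative_weight(0.5924654, 0.6760317, 1e-7)
```

Even seven decades below r0 the second term is still about a quarter of the first (the factor is roughly 0.26), which is in line with the slow approach to the asymptotic solution described above.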
The preceding solution is compared with five additional elasticity solutions
using truncated eigenseries of various lengths. Each solution includes all terms
associated with wedge eigenvalues that have real parts not greater than N, where
N is taken successively to be 1, 2, 3, 4 and 10 (N = 5 corresponds to the preceding 22-term solution). The generalized stress intensity factors Syy for all six
solutions are shown in Figure 5. For each solution, other components of $S_{ij}^{+}$ and
$S_{ij}^{-}$ are determined by the same stress ratios of the dominant singularity as given
in equation (5.3). Except for the two solutions with the smallest number of terms,
the results of the other solutions are in close agreement. The eigenseries converge
rapidly and, for this wedge, an accurate solution requires only eigensolutions with
λ < 3.
Figure 3. (a) σy of 22-, 2- and 1-term series; (b) τxy of 22-, 2- and 1-term series; (c) τyz of
22-, 2- and 1-term series.
Figure 4. Tangential stresses (a) on the upper side (b) on the lower side of the interface.
Figure 5. Syy of various solutions, bisector wedge.
For the trisector wedge at the lower end of the inclined crack, the traction data
on r = r0 = 1 cm are obtained from the same finite-element analysis. A 29-term
eigenseries including all eigensolutions with Re[λ] ≤ 5 is used in collocation. The
resulting interfacial stresses on the interface to the left of the singularity are shown
in Figure 6. The results are normalized with respect to $\bar r^{-0.510881}$, and plotted versus
log10 (r̄) in Figures 7(a)–(c), where r̄ covers a much wider range from $10^{-50}$ to 1.
The corresponding asymptotic solutions are shown as dashed horizontal lines. Any
lingering faith that the asymptotic solution generally provides useful and realistic
parameters for characterizing the criticality of stress singularity and for predicting
failure initiation must be dispelled by the behavior of the interfacial peeling stress
as shown in Figure 7(a). The peeling stress of the elasticity solution is nearly ten
times greater than the asymptotic solution at $\bar r = 10^{-5}$, and about eight times
greater at $\bar r = 10^{-10}$. It is more than three times bigger even at $\bar r = 10^{-50}$. On
the other hand, τxy approaches the asymptotic solution much faster, as shown in
Figure 7(b). Therefore, the stress ratios of the elasticity solution are very different
from the generalized stress intensity factors. The asymptotic solution has neither
a physical relation nor a mathematical semblance to the stress field in the wedge
at any physically meaningful length scale, because the region of dominance of the
asymptotic solution has an extraordinarily small size of the order much smaller
than $10^{-50}$. Outside this minute region, the next two eigensolutions contribute
significantly, as shown by the dashed curves in Figures 7(a)–(c), which combine
the results of the first three terms of the 29-term solution. The 3-term solution
agrees closely with the full 29-term solution in the region $\bar r < 10^{-5}$. Notice that
the trisector wedge has a series of clusters of three closely spaced eigenvalues on
or near the real axis, as seen in Table I.
Figure 6. Interfacial stresses on the left interface, trisector wedge.
Figure 7. (a) σy , (b) τxy , (c) τyz of 29-, 3- and 1-term series.
Figure 8. Syy of various solutions, left interface of the trisector wedge.
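A rough estimate of how small r̄ must be before the leading term dominates follows from the spacing of the first two eigenvalues. The Python sketch below solves r̄^(λ2 − λ1) = tol for log10(r̄). The tolerance of 0.1 and the assumption of equal coefficient magnitudes are illustrative choices. The actual size of the region of dominance also depends on the coefficients of the eigenseries, which vary with loading and stress component.

```python
import math

def log10_dominance_radius(lam1, lam2, tol):
    """log10 of the rbar below which the second term falls under `tol`
    times the first, for coefficients of equal magnitude (assumption)."""
    return math.log10(tol) / (lam2 - lam1)

# First two eigenvalues at the lower end of the crack and at A (Table I).
trisector = log10_dominance_radius(0.4891194, 0.5056793, 0.1)
bisector = log10_dominance_radius(0.5924654, 0.6760317, 0.1)
```

With these assumptions the threshold is about r̄ = 10^(−60) for the trisector wedge and about 10^(−12) for the bisector wedge, the same order of smallness as the region of dominance reported for the trisector wedge.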
On the interface to the right of the singularity, the interfacial stresses are found
to be significantly smaller and the plots are not shown. The stresses σy and τxy
approach the asymptotic solution more rapidly compared to the left interface, but
τyz still has a very slow rate of approach. The generalized stress intensity factors
are given by the asymptotic stresses on the left and right interfaces multiplied by
$\bar r^{\,0.510881}$. One has (all results in the unit MPa)

    On the left interface:    Syy = 0.229723,   Sxy = −0.577359,   Syz = 0.888552,
    On the right interface:   Syy = 0.307952,   Sxy = 0.510823,    Syz = −0.323603.

In the interior segment of the interface at sufficiently large distances away from
the bisector and trisector singularities, the stresses σy , τxy and τyz reach constant
values 0, −7.85475 MPa and 0, respectively. These limiting values are given by the
layerwise constant stresses in an otherwise identical model without the inclined
crack and with infinite length in the axial direction. In the opposite limit of extremely small r, the stress τxy on the left and right interface approaches negative
and positive asymptotic results, respectively, whose ratio is independent of loading
(since the dominant singular value is real).
Besides the 29-term solution, five additional solutions using all eigensolutions
with λ ≤ N are obtained, where N assumes the values 1, 2, 3, 4 and 6. For the
stress intensity factor Syy on the right interface, the results of the various solutions
are compared in Figure 8. The 11-term solution (N = 2) yields relatively poor
results, with an error of more than 20% compared to the 29-term solution. The
last four solutions, all including the eigensolutions with λ ≤ 3, are in excellent
agreement.
Figure 9. Interface stresses of the bisector wedge in a realistic range of r.
On different interfaces the elasticity solutions show different stress ratios. These
ratios also vary greatly with r. It is only meaningful to make comparison in a
physically relevant range of r̄, and comparisons of the interfacial stresses for different wedges require normalization with respect to a common power, which is
conveniently taken to be $\bar r^{-1/2}$. Figures 9 and 10 show the results of the bisector
wedge, and of the left interface of the trisector wedge, both normalized with respect to $\bar r^{-1/2}$ and plotted over the range $-7 \le \log_{10}(\bar r) \le 0$ (corresponding to
$10^{-9}\ \text{m} \le r \le 0.01\ \text{m}$). Within this range, the stresses on the left interface of the
trisector wedge (Figure 10) are much higher than those on the right interface (not
shown). But the generalized stress intensity factor Syy of the right interface exceeds
that of the left by more than one third (0.307952 versus 0.229723 MPa). The relative
intensity of the peeling stress of the two wedges as shown in Figures 9 and 10 appears
to depend on the length scale and, therefore, may require consideration of damage
mechanisms and microstructure.
Figure 10. Trisector wedge, left interface stresses over a realistic range of r.
In the literature, so much attention has been directed to the order of stress
singularities that it is almost taken for granted that a stronger order is invariably
more threatening. In the present case, the trisector wedge has a stronger order, with
the lowest eigenvalue 0.489119, versus 0.592465 for the bisector wedge. But this
difference may have very little to do with the severity of the interfacial stresses in
the two wedges over the physically relevant range of scales. The results in Figure 9 for
the bisector wedge are more severe than those on the right interface of the trisector wedge.
Yet for exceedingly small length scales, the mathematical solution of the stresses
on the latter interface will grow faster and exceed in magnitude the stresses in
the bisector wedge. Ultimately, if interface failure prediction is to be based on
the elasticity solution, one must restrict attention to a physically relevant range
of r, and ignore the mathematical solution beyond that range. That is, one must
formulate and apply failure criteria to the relevant analysis results such as shown
in Figures 9 and 10. Notice that these figures require accurate elasticity analysis
as presented in this work. In order to obtain results at a length scale $\bar r = 10^{-N}$
directly, one would need a conventional finite-element analysis with close to $10^{2N}$
elements, unless a substructuring method is used.
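The crossover described above can be illustrated with a one-term caricature in which each wedge is represented only by its dominant singular term, stress = amplitude × r̄^(λ − 1). The two eigenvalues are those quoted in the text. The amplitudes are hypothetical placeholders chosen for this sketch, not results of the present analysis, and the function names are likewise assumptions.

```python
LAMBDA_TRISECTOR = 0.489119  # lowest eigenvalue quoted in the text
LAMBDA_BISECTOR = 0.592465   # lowest eigenvalue quoted in the text


def singular_stress(amplitude, eigenvalue, r_bar):
    """One-term singular stress: amplitude * r_bar**(eigenvalue - 1)."""
    if r_bar <= 0.0:
        raise ValueError("r_bar must be positive")
    return amplitude * r_bar ** (eigenvalue - 1.0)


def crossover_radius(amp_weak, lam_weak, amp_strong, lam_strong):
    """Normalized radius at which the two one-term stresses have equal magnitude.

    'strong' denotes the stronger singularity (smaller eigenvalue). Setting
    |amp_weak| r**(lam_weak - 1) = |amp_strong| r**(lam_strong - 1) gives
    r = (|amp_strong| / |amp_weak|) ** (1 / (lam_weak - lam_strong)).
    For smaller r the stronger singularity dominates.
    """
    if not lam_strong < lam_weak:
        raise ValueError("lam_strong must be smaller than lam_weak")
    if amp_weak == 0.0 or amp_strong == 0.0:
        raise ValueError("amplitudes must be nonzero")
    return (abs(amp_strong) / abs(amp_weak)) ** (1.0 / (lam_weak - lam_strong))


if __name__ == "__main__":
    # HYPOTHETICAL amplitudes (MPa): the bisector term is taken three times
    # larger than the trisector term purely for illustration.
    amp_bisector, amp_trisector = 3.0, 1.0
    r_star = crossover_radius(amp_bisector, LAMBDA_BISECTOR,
                              amp_trisector, LAMBDA_TRISECTOR)
    print("crossover r_bar:", r_star)
    for r_bar in (1e-2, r_star, 1e-8):
        print(r_bar,
              singular_stress(amp_bisector, LAMBDA_BISECTOR, r_bar),
              singular_stress(amp_trisector, LAMBDA_TRISECTOR, r_bar))
```

Because the two eigenvalues differ by only about 0.1, a modest amplitude ratio already pushes the crossover several decades below r̄ = 1. This is the sense in which a stronger order of singularity need not imply more severe stresses over the physically relevant range.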
Solutions of the preceding two examples lead to the following observations:
(1) The two-step substructure approach, in which a conventional finite-element
analysis is used to generate the traction boundary conditions on a path encircling a singularity, and an elasticity solution of the multimaterial wedge is
obtained by combining eigensolutions, provides a reliable, highly efficient and
accurate method for the analysis of singularities in heterogeneous structures.
(2) The eigenseries converge rapidly. In the two examples, the various elasticity
solutions that include all eigensolutions with λ ≤ 3 are in excellent agreement.
However, a solution that excludes some terms with smaller λ may incur very
significant error.
(3) When the lowest singular eigenvalue is real, the elasticity solution approaches
the dominant singular solution asymptotically as r → 0 but the approach may
be very slow. As r decreases to subatomic size and even to $10^{-50} r_0$ in the
present problems, the elasticity solution for the interfacial stress may still be
significantly different from the asymptotic solution. For the intervening range
of r, the ratios of the stress components of the elasticity solution vary widely
with r, and they can be very different from the ratios of the generalized stress
intensity factors.
(4) Hence the asymptotic solution (including both the order of singularity and the
generalized stress intensity factors) cannot generally be used to characterize
the criticality of a stress singularity, nor can it serve as the main basis for the
prediction of failure initiation. Interface failure may be affected by stress
contributions from the second and third eigensolutions, and failure criteria must
be based on relevant size scales as determined by the model geometry and
microstructure (fiber diameters, etc.), since the mathematical results of the
stress level in a minute region of subatomic size have no relevance to physical
processes including failure initiation.
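The slow approach noted in observation (3) can be made concrete with a two-term model, σ = K1 r̄^(λ1 − 1) + K2 r̄^(λ2 − 1) with λ2 > λ1, for which the relative deviation from the one-term asymptote is |K2/K1| r̄^(λ2 − λ1). The sketch below evaluates that expression. The amplitude ratio and the eigenvalue gap used in the example are hypothetical values chosen for illustration; they are not taken from the present solutions, and the function names are assumptions.

```python
def relative_deviation(r_bar, amplitude_ratio, eigenvalue_gap):
    """Relative deviation of a two-term stress from its one-term asymptote.

    For sigma = K1 r**(l1 - 1) + K2 r**(l2 - 1) with l2 > l1, the deviation
    |sigma - sigma_asym| / |sigma_asym| equals |K2 / K1| * r**(l2 - l1).
    amplitude_ratio is K2 / K1 and eigenvalue_gap is l2 - l1.
    """
    if r_bar <= 0.0:
        raise ValueError("r_bar must be positive")
    if eigenvalue_gap <= 0.0:
        raise ValueError("eigenvalue_gap must be positive")
    return abs(amplitude_ratio) * r_bar ** eigenvalue_gap


def radius_for_tolerance(tolerance, amplitude_ratio, eigenvalue_gap):
    """Normalized radius at which the deviation equals the given tolerance."""
    if tolerance <= 0.0 or amplitude_ratio == 0.0 or eigenvalue_gap <= 0.0:
        raise ValueError("tolerance and gap must be positive, ratio nonzero")
    return (tolerance / abs(amplitude_ratio)) ** (1.0 / eigenvalue_gap)


if __name__ == "__main__":
    # HYPOTHETICAL inputs: equal amplitudes and two different eigenvalue gaps.
    for gap in (0.5, 0.1, 0.01):
        dev = relative_deviation(1e-50, 1.0, gap)
        r_5pct = radius_for_tolerance(0.05, 1.0, gap)
        print(f"gap={gap}: deviation at r_bar=1e-50 is {dev:.3e}, "
              f"5% agreement needs r_bar <= {r_5pct:.3e}")
```

When the gap between the first two eigenvalues is small, the deviation decays so slowly that it remains appreciable even at r̄ = 10^-50, which is consistent with the behaviour reported above.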
References
1. W.-L. Yin, A general analysis methodology for singularities in composite structures. In:
   Proc. AIAA/ASME/ASCE/AHS/ASC 38th SDM Conference, Kissimmee, FL, 7–10 April 1997,
   pp. 2238–2246.
2. W.-L. Yin, Mixed mode stress singularities in anisotropic composites. In: Y.D.S. Rajapakse and
   G.A. Kardomateas (eds), Thick Composites for Load Bearing Structures, AMD 235. ASME,
   New York (1999) pp. 33–45.
3. W.-L. Yin, K.C. Jane and C.-C. Lin, Singular solutions of multimaterial wedges under
   thermomechanical loading. In: G.J. Simitses (ed.), Analysis and Design Issues for Modern
   Aerospace Vehicles – 1997, ASME AD 55. ASME, New York (1997) pp. 159–166.
4. A.Y. Kuo, Thermal stresses at the edge of a bimetallic thermostat. J. Appl. Mech. 56 (1989)
   585–589.
5. S.S. Wang and F.G. Yuan, A hybrid finite element approach to laminate elasticity problems
   with stress singularities. J. Appl. Mech. 50 (1983) 835–844.
6. W.-L. Yin, Evaluation of the stress intensity factors in the general delamination problem. In:
   R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics
   of Materials, International Center for Numerical Methods in Engineering, Barcelona, Spain
   (1996) pp. 489–500.
7. W.-L. Yin, Delamination: Laminate analysis and fracture mechanics. Fatigue Fracture Engrg.
   Materials Struct. 21 (1998) 509–520.
8. W.-L. Yin, Singularities of multimaterial wedges in heterogeneous structures. In: Proc.
   AIAA/ASME/ASCE/AHS/ASC 42nd SDM Conference, Seattle, WA, April 2001, AIAA Paper
   No. 2001-1250, 10 pages.
9. T.C.T. Ting, Existence of an extraordinary degenerate matrix N for anisotropic elastic materials.
   Quart. J. Mech. Appl. Math. 49 (1996) 405–417.
10. W.-L. Yin, Deconstructing plane anisotropic elasticity, Part I: The latent structure of
    Lekhnitskii’s formalism; Part II: Stroh’s formalism sans frills. Internat. J. Solids Struct. 37
    (2000) 5257–5276 and 5277–5296.
11. F. Delale, Stress singularities in bonded anisotropic materials. Internat. J. Solids Struct. 20
    (1984) 31–40.
12. S.S. Pageau and S.B. Biggers, Jr., The order of stress singularities for bonded and disbonded
    three-material junctions. Internat. J. Solids Struct. 31 (1994) 2979–2997.
13. T. Inoue and H. Koguchi, Influence of the intermediate material on the order of stress singularity
    in three-phase bonded structures. Internat. J. Solids Struct. 33 (1996) 399–417.
14. T.C.T. Ting, Stress singularities at the tip of interfaces in polycrystals. In: H.-P. Rossmanith
    (ed.), Proc. of the 1st Internat. Conf. on Damage and Failure of Interfaces, Vienna, Austria,
    1997, pp. 75–82.
15. H.-P. Chen, Stress singularities in anisotropic multimaterial wedges and junctions. Internat. J.
    Solids Struct. 35 (1998) 1057–1073.
16. S.G. Lekhnitskii, Theory of Elasticity of an Anisotropic Body. Holden-Day, San Francisco, CA
    (1963).
17. T.C.T. Ting, Anisotropic Elasticity: Theory and Application. Oxford Univ. Press, New York,
    NY (1996).
18. D.M. Barnett and H.O.K. Kirchner, A proof of the equivalence of the Stroh and Lekhnitskii
    sextic equations for plane anisotropic elastostatics. Phil. Mag. 76 (1997) 231–239.
19. G.F. Carrier, M. Krook and C.E. Pearson, Functions of a Complex Variable. McGraw-Hill, New
    York (1966).
20. S. Wolfram, Mathematica: A System for Doing Mathematics by Computer, 2nd ed.
    Addison-Wesley, Redwood City, CA (1991).