Tensor Analysis with Applications in Mechanics

NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

World Scientific

Leonid P. Lebedev
National University of Colombia
Southern Federal University, Russia

Michael J. Cloud
Lawrence Technological University, USA

Victor A. Eremeyev
Martin-Luther-University Halle-Wittenberg, Germany
Southern Scientific Center of Russian Academy of Science
Southern Federal University, Russia

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

Desk Editor: Tjan Kwang Wei

ISBN-13 978-981-4313-12-4
ISBN-10 981-4313-12-2

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

Copyright © 2010 by World Scientific Publishing Co. Pte. Ltd.

Published by

World Scientific Publishing Co. Pte. Ltd.

5 Toh Tuck Link, Singapore 596224

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Printed in Singapore.

Foreword

Every science elaborates tools for the description of its objects of study. In classical mechanics we make extensive use of vectorial quantities: forces, moments, positions, velocities, momenta. Confining ourselves to a single coordinate frame, we can regard a vector as a fixed column matrix. The definitive trait of a vector quantity, however, is its objectivity; a vector does not depend on our choice of coordinate frame. This means that as soon as the components of a force are specified in one frame, the components of that force relative to any other frame can be found through the use of appropriate transformation rules.

But vector quantities alone do not suffice for the description of continuum media. The stress and strain at a point inside a body are also objective quantities; however, the specification of each of these relative to a given frame requires a square matrix of elements. Under changes of frame, these elements transform according to rules different from the transformation rules for vectors. Stress and strain tensors are examples of tensors of the second order. We could go on to cite other objective quantities that occur in the mechanics of continua. The set of elastic moduli associated with Hooke’s law comprises a tensor of the fourth order; as such, these moduli obey yet another set of transformation rules. Despite the differences that exist between the transformation laws for the various types of objective quantities, they all fit into a unified scheme: the theory of tensors.

Tensor theory not only relieves our memory from a huge burden, but enables us to carry out differential operations with ease. This is the case even in curvilinear coordinate systems. Through the unmatched simplicity and brevity it affords, tensor analysis has attained the status of a general language that can be spoken across the various areas of continuum physics. A full comprehension of this language has become necessary for those working in electromagnetism, the theory of relativity, or virtually any other field-theoretic discipline. More modern books on physical subjects invariably contain appendices in which various vector and tensor identities are listed. These may suffice when one wishes to verify the steps in a development, but can leave one in doubt as to how the facts were established or, a fortiori, how they could be adapted to other circumstances. On the other hand, a comprehensive treatment of tensors (e.g., involving a full excursion into multilinear algebra) is necessarily so large as to be flatly inappropriate for the student or practicing engineer.

Hence the need for a treatment of tensor theory that does justice to the subject and is friendly to the practitioner. The authors of the present book have met these objectives with a presentation that is simple, clear, and sufficiently detailed. The concise text explains practically all those formulas needed to describe objects in three-dimensional space. Occurrences in physics are mentioned when helpful, but the discussion is kept largely independent of application area in order to appeal to the widest possible audience. A chapter on the properties of curves and surfaces has been included; a brief introduction to the study of these properties can be considered as an informative and natural extension of tensor theory.

I.I. Vorovich
Late Professor of Mechanics and Mathematics
Rostov State University, Russia
Fellow of Russian Academy of Sciences
(1920–2001)

Preface

The first edition of this book was written for students, engineers, and physicists who must employ tensor techniques. We did not present the material in complete generality for the case of $n$-dimensional space, but rather presented a three-dimensional version (which is easy to extend to $n$ dimensions); hence we could assume a background consisting only of standard calculus and linear algebra.

We have decided to extend the book in a natural direction, adding two chapters on applications for which tensor analysis is the principal tool. One chapter is on linear elasticity and the other is on the theory of shells and plates. We present complete derivations of the equations in these theories, formulate boundary value problems, and discuss the problem of uniqueness of solutions, Lagrange’s variational principle, and some problems on vibration. Space restrictions prohibited us from presenting an entire course on mechanics; we had to select those questions in elasticity where the role of tensor analysis is most crucial.

We should mention the essential nature of tensors in elasticity and shell theory. Of course, to solve a certain engineering problem, one should write things out in component form; sometimes this takes a few pages. The corresponding formulas in tensor notation are quite simple, allowing us to grasp the underlying ideas and perform manipulations with relative ease. Because tensor representation leads quickly and painlessly to component-wise representation, this technique is ideal for presenting continuum theories to students.

The first five chapters are largely unmodified, aside from some new problem sets and material on tensorial functions needed for the chapters on elasticity. The end-of-chapter problems are supplementary, whereas the integrated exercises are required for a proper understanding of the text.

In the first edition we used the term rank instead of order. This was common in the older literature. In the newer literature, the term “rank” is often assigned a different meaning.

Because the book is largely self-contained, we make no attempt at a comprehensive reference list. We merely list certain books that cover similar material, that extend the treatment slightly, or that may be otherwise useful to the reader.

We are deeply grateful to our World Scientific editor, Mr. Tjan Kwang Wei, for his encouragement and support.

L.P. Lebedev
Department of Mathematics
National University of Colombia, Colombia

M.J. Cloud
Department of Electrical and Computer Engineering
Lawrence Technological University, USA

V.A. Eremeyev
South Scientific Center of RASci
&
Department of Mathematics, Mechanics and Computer Sciences
South Federal University, Russia

Preface to the First Edition

Originally a vector was regarded as an arrow of a certain length that could represent a force acting on a material point. Over a period of many years, this naive viewpoint evolved into the modern interpretation of the notion of vector and its extension to tensors. It was found that the use of vectors and tensors led to a proper description of certain properties and behaviors of real natural objects: those aspects that do not depend on the coordinate systems we introduce in space. This independence means that if we define such properties using one coordinate system, then in another system we can recalculate these characteristics using valid transformation rules. The ease with which a given problem can be solved often depends on the coordinate system employed. So in applications we must apply various coordinate systems, derive corresponding equations, and understand how to recalculate results in other systems. This book provides the tools necessary for such calculation.

Many physical laws are cumbersome when written in coordinate form but become compact and attractive looking when written in tensorial form. Such compact forms are easy to remember, and can be used to state complex physical boundary value problems. It is conceivable that soon an ability to merely formulate statements of boundary value problems will be regarded as a fundamental skill for the practitioner. Indeed, computer software is slowly advancing toward the point where the only necessary input data will be a coordinate-free statement of a boundary value problem; presumably the user will be able to initiate a solution process in a certain frame and by a certain method (analytical, numerical, or mixed), or simply ask the computer algorithm to choose the best frame and method. In this way, vectors and tensors will become important elements of the macro-language for the next generation of software in engineering and applied mathematics.

We would like to thank the editorial staff at World Scientific, especially Mr. Tjan Kwang Wei and Ms. Sook-Cheng Lim, for their assistance in the production of this book. Professor Byron C. Drachman of Michigan State University commented on the manuscript in its initial stages. Lastly, Natasha Lebedeva and Beth Lannon-Cloud deserve thanks for their patience and support.

L.P. Lebedev
Department of Mechanics and Mathematics
Rostov State University, Russia
&
Department of Mathematics
National University of Colombia, Colombia

M.J. Cloud
Department of Electrical and Computer Engineering
Lawrence Technological University, USA

Contents

Foreword
Preface

Tensor Analysis

1. Preliminaries
1.1 The Vector Concept Revisited
1.2 A First Look at Tensors
1.3 Assumed Background
1.4 More on the Notion of a Vector
1.5 Problems

2. Transformations and Vectors
2.1 Change of Basis
2.2 Dual Bases
2.3 Transformation to the Reciprocal Frame
2.4 Transformation Between General Frames
2.5 Covariant and Contravariant Components
2.6 The Cross Product in Index Notation
2.7 Norms on the Space of Vectors
2.8 Closing Remarks
2.9 Problems

3. Tensors
3.1 Dyadic Quantities and Tensors
3.2 Tensors From an Operator Viewpoint
3.3 Dyadic Components Under Transformation
3.4 More Dyadic Operations
3.5 Properties of Second-Order Tensors
3.6 Eigenvalues and Eigenvectors of a Second-Order Symmetric Tensor
3.7 The Cayley–Hamilton Theorem
3.8 Other Properties of Second-Order Tensors
3.9 Extending the Dyad Idea
3.10 Tensors of the Fourth and Higher Orders
3.11 Functions of Tensorial Arguments
3.12 Norms for Tensors, and Some Spaces
3.13 Differentiation of Tensorial Functions
3.14 Problems

4. Tensor Fields
4.1 Vector Fields
4.2 Differentials and the Nabla Operator
4.3 Differentiation of a Vector Function
4.4 Derivatives of the Frame Vectors
4.5 Christoffel Coefficients and their Properties
4.6 Covariant Differentiation
4.7 Covariant Derivative of a Second-Order Tensor
4.8 Differential Operations
4.9 Orthogonal Coordinate Systems
4.10 Some Formulas of Integration
4.11 Problems

5. Elements of Differential Geometry
5.1 Elementary Facts from the Theory of Curves
5.2 The Torsion of a Curve
5.3 Frenet–Serret Equations
5.4 Elements of the Theory of Surfaces
5.5 The Second Fundamental Form of a Surface
5.6 Derivation Formulas
5.7 Implicit Representation of a Curve; Contact of Curves
5.8 Osculating Paraboloid
5.9 The Principal Curvatures of a Surface
5.10 Surfaces of Revolution
5.11 Natural Equations of a Curve
5.12 A Word About Rigor
5.13 Conclusion
5.14 Problems

Applications in Mechanics

6. Linear Elasticity
6.1 Stress Tensor
6.2 Strain Tensor
6.3 Equation of Motion
6.4 Hooke’s Law
6.5 Equilibrium Equations in Displacements
6.6 Boundary Conditions and Boundary Value Problems
6.7 Equilibrium Equations in Stresses
6.8 Uniqueness of Solution for the Boundary Value Problems of Elasticity
6.9 Betti’s Reciprocity Theorem
6.10 Minimum Total Energy Principle
6.11 Ritz’s Method
6.12 Rayleigh’s Variational Principle
6.13 Plane Waves
6.14 Plane Problems of Elasticity
6.15 Problems

7. Linear Elastic Shells
7.1 Some Useful Formulas of Surface Theory
7.2 Kinematics in a Neighborhood of Σ
7.3 Shell Equilibrium Equations
7.4 Shell Deformation and Strains; Kirchhoff’s Hypotheses
7.5 Shell Energy
7.6 Boundary Conditions
7.7 A Few Remarks on the Kirchhoff–Love Theory
7.8 Plate Theory
7.9 On Non-Classical Theories of Plates and Shells

Appendix A Formulary

Appendix B Hints and Answers

Bibliography

Index

Chapter 1

Preliminaries

1.1 The Vector Concept Revisited

The concept of a vector has been one of the most fruitful ideas in all of mathematics, and it is not surprising that we receive repeated exposure to the idea throughout our education. Students in elementary mathematics deal with vectors in component form, with quantities such as

\[ \mathbf{x} = (2, 1, 3) \]

for example. But let us examine this situation more closely. Do the components 2, 1, 3 determine the vector $\mathbf{x}$? They surely do if we specify the basis vectors of the coordinate frame. In elementary mathematics these are supposed to be mutually orthogonal and of unit length; even then they are not fully characterized, however, because such a frame can be rotated. In the description of many common phenomena we deal with vectorial quantities like forces that have definite directions and magnitudes. An example is the force your body exerts on a chair as you sit in front of the television set. This force does not depend on the coordinate frame employed by someone writing a textbook on vectors somewhere in Russia or China. Because the vector $\mathbf{f}$ representing a particular force is something objective, we should be able to write it in such a form that it ceases to depend on the details of the coordinate frame. The simplest way is to incorporate the frame vectors $\mathbf{e}_i$ ($i = 1, 2, 3$) explicitly into the notation: if $\mathbf{x}$ is a vector we may write

\[ \mathbf{x} = \sum_{i=1}^{3} x_i \mathbf{e}_i. \tag{1.1} \]

Then if we wish to change the frame, we should do so in such a way that $\mathbf{x}$ remains the same. This of course means that we cannot change only the frame vectors $\mathbf{e}_i$: we must change the components $x_i$ correspondingly. So the components of a vector $\mathbf{x}$ in a new frame are not independent of those in the old frame.
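The point that components and frame vectors must change together can be illustrated with a small numerical sketch. It assumes an orthonormal frame rotated about its third axis, so that components can be read off by dot products; the function names (`rotate_frame_z`, `components`, `assemble`) are hypothetical and chosen only for this illustration.

```python
import math

def rotate_frame_z(angle):
    """Orthonormal frame obtained by rotating the standard Cartesian
    frame about its third axis by `angle` radians (illustrative choice)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def components(x, frame):
    """Components of x in an ORTHONORMAL frame: dot x with each frame vector."""
    return [dot(x, e) for e in frame]

def assemble(comps, frame):
    """Rebuild the vector sum_i x_i e_i from components and frame vectors."""
    return tuple(sum(c * e[k] for c, e in zip(comps, frame)) for k in range(3))

x = (2.0, 1.0, 3.0)                       # the vector, written in the original frame
new_frame = rotate_frame_z(math.pi / 6)   # a rotated frame
new_comps = components(x, new_frame)      # the components change ...
x_again = assemble(new_comps, new_frame)  # ... but the vector itself does not
```

Here `new_comps` differs from `(2, 1, 3)` in its first two entries, while `x_again` reproduces `x` up to rounding, which is the sense in which the vector is independent of the frame.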

1.2 A First Look at Tensors

In what follows we shall discuss how to work with vectors using different coordinate frames. Let us note that in mechanics there are objects of another nature. For example, there is a so-called tensor of inertia. This is an objective characteristic of a solid body, determining how the body rotates when torques act upon it. If the body is considered in a Cartesian frame, the tensor of inertia is described by a $3 \times 3$ matrix. If we change the frame, the matrix elements change according to certain rules. In textbooks on mechanics the reader can find lengthy discussions on how to change the matrix elements to maintain the same objective characteristic of the body when the new frame is also Cartesian. Although the tensor of inertia is objective (i.e., frame-independent), it is not a vector: it belongs to another class of mathematical objects. Many such tensors of the second order arise in continuum mechanics: tensors of stress, strain, etc. They characterize certain properties of a body at each point; again, their “components” should transform in such a way that the tensors themselves do not depend on the frame. The precise meaning of the term order will be explained later.

For both vectors and tensors we can introduce various operations. Of course, the introduction of any new operation should be done in such a way that the results agree with known special cases when such familiar cases are met. If we introduce, say, dot multiplication of a tensor by a vector, then in a Cartesian frame the operation should resemble the multiplication of a matrix by a column vector. Similarly, the multiplication of two tensors should be defined so that in a Cartesian frame the operation involves matrix multiplication. To this end we consider dyads of vectors. These are quantities of the form

\[ \mathbf{e}_i \mathbf{e}_j. \]

A tensor may then be represented as

\[ \sum_{i,j} a_{ij} \mathbf{e}_i \mathbf{e}_j \]

where the $a_{ij}$ are the components of the tensor. We compare with equation (1.1) and notice the similarity in notation.

The quantity $\mathbf{e}_i \mathbf{e}_j$ is also called the tensor product of the vectors $\mathbf{e}_i$ and $\mathbf{e}_j$, and is sometimes denoted $\mathbf{e}_i \otimes \mathbf{e}_j$. Our notation (without the symbol $\otimes$) emphasizes that, for example, $\mathbf{e}_1 \mathbf{e}_2$ is an elemental object belonging to the set of second-order tensors, in the same way that $\mathbf{e}_1$ is an elemental object belonging to the set of vectors. Note that $\mathbf{e}_2 \mathbf{e}_1$ and $\mathbf{e}_1 \mathbf{e}_2$ are different objects, however. The term “tensor product” indicates that the operation shares certain properties with the product we know from elementary algebra. We will discuss this further in Chapter 3.
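A minimal sketch may help make the dyad idea concrete in a Cartesian frame. It assumes the usual convention that a dyad acts on a vector by $(\mathbf{a}\mathbf{b}) \cdot \mathbf{c} = \mathbf{a}(\mathbf{b} \cdot \mathbf{c})$, which the text takes up properly in Chapter 3; the helper names below are hypothetical.

```python
def dyad(a, b):
    """Matrix of the dyad ab in a Cartesian frame: entry (i, j) is a_i * b_j."""
    return [[ai * bj for bj in b] for ai in a]

def dot_dyad_vector(a, b, c):
    """(ab) . c = a (b . c), computed directly from the assumed convention."""
    scale = sum(bj * cj for bj, cj in zip(b, c))
    return [ai * scale for ai in a]

def mat_vec(m, v):
    """Ordinary matrix-by-column multiplication."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

i1, i2, i3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
e1e2 = dyad(i1, i2)   # single nonzero entry in row 1, column 2
e2e1 = dyad(i2, i1)   # single nonzero entry in row 2, column 1

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 9)
via_matrix = mat_vec(dyad(a, b), c)          # [122, 244, 366]
via_definition = dot_dyad_vector(a, b, c)    # [122, 244, 366]
```

The two dyads `e1e2` and `e2e1` have different matrices, and dotting a dyad with a vector agrees with matrix-by-column multiplication, as the text requires of any sensible definition.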

Natural objects can possess characteristics described by tensors of higher order. For example, the elastic properties of a body are described by a tensor of the fourth order (i.e., a tensor whose elemental parts are of the form $\mathbf{a}\mathbf{b}\mathbf{c}\mathbf{d}$, where $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{d}$ are vectors). This means that in general the properties of a body are given by a “table” consisting of $3 \times 3 \times 3 \times 3 = 81$ elements. The elements change according to certain rules if we change the frame.

Tensors also occur in electrodynamics, the general theory of relativity, and other sciences that deal with objects situated or distributed in space.

1.3 Assumed Background

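In what follows we suppose a familiarity with the dot and cross products and their expression in Cartesian frames. Recall that if $\mathbf{a}$ and $\mathbf{b}$ are vectors, then by definition

\[ \mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| |\mathbf{b}| \cos\theta, \]

where $|\mathbf{a}|$ and $|\mathbf{b}|$ are the magnitudes of $\mathbf{a}$ and $\mathbf{b}$ and $\theta$ is the (smaller) angle between $\mathbf{a}$ and $\mathbf{b}$. From now on we reserve the symbol $\mathbf{i}$ for the basis vectors of a Cartesian system. In a Cartesian frame with basis vectors $\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3$ where $\mathbf{a}$ and $\mathbf{b}$ are expressed as

\[ \mathbf{a} = a_1 \mathbf{i}_1 + a_2 \mathbf{i}_2 + a_3 \mathbf{i}_3, \qquad \mathbf{b} = b_1 \mathbf{i}_1 + b_2 \mathbf{i}_2 + b_3 \mathbf{i}_3, \]

we have

\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3. \]

Also recall that

\[ \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i}_1 & \mathbf{i}_2 & \mathbf{i}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}. \]

The dot product will play a role in our discussion from the very beginning. The cross product will be used as needed, and a fuller discussion will appear in § 2.6.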
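The component formulas just recalled translate directly into code. The following is a minimal sketch for Cartesian components only; the sample vectors are arbitrary choices for illustration and are not taken from the text.

```python
def dot(a, b):
    """a . b = a1*b1 + a2*b2 + a3*b3 in a Cartesian frame."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """a x b: expansion of the symbolic determinant along its first row."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a = (3, -2, 5)
b = (1, 4, -1)
n = cross(a, b)   # (-18, 8, 14)
```

A quick sanity check on any cross product is that the result is orthogonal to both factors, i.e. `dot(n, a)` and `dot(n, b)` both vanish.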

Fig. 1.1 Geometrical meaning of the scalar triple product.

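Given three vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$ we can form the scalar triple product

\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}). \]

This may be interpreted as the volume of the parallelepiped having $\mathbf{a}, \mathbf{b}, \mathbf{c}$ as three of its co-terminal edges (Fig. 1.1). In rectangular components we have, according to the expressions above,

\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]

Permissible manipulations with the scalar triple product include cyclic interchange of the vectors:

\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}). \]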
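The determinant form and the cyclic property can be checked numerically. This is only a sketch with arbitrarily chosen sample vectors (not taken from the text); the function name `triple` is hypothetical.

```python
def triple(a, b, c):
    """a . (b x c), evaluated as the 3x3 determinant with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

a = (1, 2, 0)
b = (0, 1, 3)
c = (2, 0, 1)

v_abc = triple(a, b, c)   # 13
v_cab = triple(c, a, b)   # 13, cyclic interchange leaves the value unchanged
v_bca = triple(b, c, a)   # 13
v_acb = triple(a, c, b)   # -13, swapping two vectors reverses the sign
```

The sign change under a swap of two vectors follows from the corresponding row interchange in the determinant, so the "volume" interpretation is that of a signed volume.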

Exercise 1.1. What does the condition $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = 0$ say about $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$? (Hints for this and many other exercises appear in Appendix B beginning on page 315.)

So far we have made reference to vectors in a three-dimensional space, and shall continue this practice throughout most of the book. It is also possible (and sometimes useful) to introduce vectors in a more general space of $n > 3$ dimensions, e.g.,

\[ \mathbf{a} = \mathbf{i}_1 a_1 + \mathbf{i}_2 a_2 + \cdots + \mathbf{i}_n a_n. \]

It turns out that many (but not all) of the principles and techniques we shall learn have direct extensions to such higher-dimensional spaces. It is also true that many three-dimensional notions can be reconsidered in two dimensions. The reader should take the time to reduce each three-dimensional formula to its two-dimensional analogue to understand what happens with the corresponding assertion.

1.4 More on the Notion of a Vector

Before closing out this chapter we should mention the notions of a vector as a “directed line segment” or “quantity having magnitude and direction.” We find these in elementary math and physics textbooks. But it is easy to point to a situation in which a quantity of interest has magnitude and direction but is not a vector. The total electric current flowing in a thin wire is one example: to describe the current we must specify the rate of flow of electrons, the orientation of the wire, and the sense of electron movement along the wire. However, if two wires lie in a plane and carry equal currents running perpendicular to each other, then we cannot duplicate their physical effects by replacing them with a third wire directed at a 45° angle with respect to the first two. Total electric currents cannot be considered as vector quantities since they do not combine according to the rule for vector addition.

Another problem concerns the notion of an $n$-dimensional Euclidean space. We sometimes hear it defined as the set each point of which is uniquely determined by $n$ parameters. However, it is not reasonable to regard every list of $n$ things as a possible vector. A tailor may take measurements for her client and compose an ordered list of lengths, widths, etc., and by the above “definition” the set of all such lists is an $n$-dimensional space. But in $\mathbb{R}^n$ any two points are joined by a vector whose components equal the differences of the corresponding coordinates of the points. In the tailor’s case this notion is completely senseless. To make matters worse, in $\mathbb{R}^n$ one can multiply any vector by any real number and obtain another vector in the space. A tailor’s list showing that someone is six meters tall would be pretty hard to find.

Such simplistic definitions can be dangerous. The notion of a vector was eventually elaborated in physics, more precisely in mechanics where the behavior of forces was used as the model for a general vector. However, forces have some rather strange features: for example, if we shift the point of application of a force acting on a solid body, then we must introduce a moment acting on the body or the motion of the body will change. So the mechanical forces that gave birth to the notion of a vector possess features not covered by the mathematical definition of a vector. In the classical mechanics of a rigid body we are allowed to move a force along its line of action but cannot simply shift it off that line. With a deformable body we cannot move a force anywhere because in doing so we immediately change the state of the body. We can vectorially add two forces acting on a rigid body (not forgetting about the moments arising during shift of the forces). On the other hand, if two forces act on two different material points then we can add the vectors that represent the forces, but will not necessarily obtain a new vector that is relevant to the physics of the situation. So we should understand that the idea of a vector in mathematics reflects only certain features of the real objects it describes.

We would like to mention something else about vectors. In elementary mathematics students use vectors and points quite interchangeably. However, these are objects of different natures: there are vectors in space and there are the points of a space. We can, for instance, associate with a vector in an $n$-dimensional Euclidean vector space a point in an $n$-dimensional Euclidean point space. We can then consider a vector $\mathbf{x}$ as a shift of all points of the point space by an amount specified by $\mathbf{x}$. The result of this map is the same space of points; each pair of points that correspond under the mapping define a vector $\mathbf{x}$ under which a point shifts into its image. This vector is what we find when we subtract the Cartesian coordinates of the initial point from those of the final point. If we add the fact that the composition of two such maps obeys the rules of vector addition, then we get a strict definition of the space introduced intuitively in elementary mathematics. Engineers might object to the imposition of such formality on an apparently simple situation, but mathematical rigor has proved its worth from many practical points of view. For example, a computer that processes information in the complete absence of intuition can deal properly with objects for which rigorous definitions and manipulation rules have been formulated.
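The distinction between points and vectors can be encoded directly in a type system. The sketch below is one possible arrangement, not a construction from the text: the class names `Point` and `Vector` and the method `shifted` are hypothetical, and integer coordinates are used only so that the checks are exact.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    """A shift (translation) of the whole point space."""
    dx: int
    dy: int
    dz: int

    def __add__(self, other):
        # Composing two shifts gives a shift: this is vector addition.
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.dx + other.dx, self.dy + other.dy, self.dz + other.dz)

@dataclass(frozen=True)
class Point:
    """A location given by Cartesian coordinates; points are never added."""
    x: int
    y: int
    z: int

    def shifted(self, v):
        """Image of this point under the shift v."""
        return Point(self.x + v.dx, self.y + v.dy, self.z + v.dz)

    def __sub__(self, other):
        # Final point minus initial point recovers the shift vector.
        if not isinstance(other, Point):
            return NotImplemented
        return Vector(self.x - other.x, self.y - other.y, self.z - other.z)

p = Point(1, 2, 3)
u = Vector(1, 0, -1)
v = Vector(2, 5, 0)
q = p.shifted(u)
```

With these definitions `q - p` returns `u`, shifting by `u` and then by `v` lands on the same point as shifting by `u + v`, and an attempt to add two points is rejected because `Point` deliberately defines no addition.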

This brings us to our final word, regarding the expression of vectors in component-free notation. The simple and compact notation $\mathbf{x}$ for a vector leads to powerful ways of exhibiting relationships between the many vector (and tensor) quantities that occur in mathematical physics. It permits us to accomplish manipulations that would take many pages and become quite confusing if done in component form. This is typical for problems of nonlinear physics, and for those where change of coordinate frame becomes necessary. The resulting formal nature of the manipulations means that computers can be expected to take on more and more of this sort of work, even at the level of original research.

1.5 Problems

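1.1 Find $2\mathbf{a}_1 + 3\mathbf{a}_2$ for the vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ given by

(a) $\mathbf{a}_1 = (1, 2, -1)$, $\mathbf{a}_2 = (1, 1, 2)$;
(b) $\mathbf{a}_1 = (-1, 2, 0)$, $\mathbf{a}_2 = (3, 1, -2)$;
(c) $\mathbf{a}_1 = (1, 3, 4)$, $\mathbf{a}_2 = (5, 1, -4)$.

1.2 Find $\mathbf{a}_1 + 3\mathbf{a}_2 - 2\mathbf{a}_3$ for the vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ given by

(a) $\mathbf{a}_1 = (-1, 2, -2)$, $\mathbf{a}_2 = (1, -1, 2)$, $\mathbf{a}_3 = (1, -1, 2)$;
(b) $\mathbf{a}_1 = (1, 3, 2)$, $\mathbf{a}_2 = (2, -3, 2)$, $\mathbf{a}_3 = (3, 2, 3)$;
(c) $\mathbf{a}_1 = (2, 1, 2)$, $\mathbf{a}_2 = (4, 3, 0)$, $\mathbf{a}_3 = (1, 1, -2)$.

1.3 Find $\mathbf{x}$ satisfying the equation

(a) $\mathbf{a} + 2\mathbf{x} - 4\mathbf{b} = \mathbf{0}$;
(b) $2(\mathbf{a} + 2\mathbf{x}) + \mathbf{b} = 3(\mathbf{b} + \mathbf{x})$;
(c) $\mathbf{x} + 2\mathbf{a} + 16(\mathbf{x} - \mathbf{b}) + \mathbf{c} = 2(\mathbf{c} + 3\mathbf{x}) + \mathbf{a} - \mathbf{x}$.

1.4 Find the values of $\mathbf{a} \cdot \mathbf{b}$ and $\mathbf{a} \times \mathbf{b}$ if

(a) $\mathbf{a} = (0, 1, 1)$, $\mathbf{b} = (1, 1, 1)$;
(b) $\mathbf{a} = (1, 2, 3)$, $\mathbf{b} = (2, 3, 1)$;
(c) $\mathbf{a} = (1, 1, 1)$, $\mathbf{b} = (1, 1, 2)$;
(d) $\mathbf{a} = (-1, 1, -1)$, $\mathbf{b} = (2, -1, -1)$;
(e) $\mathbf{a} = (0, -1, 1)$, $\mathbf{b} = (-1, 1, 0)$;
(f) $\mathbf{a} = (-1, 1, 0)$, $\mathbf{b} = (2, 3, 0)$;
(g) $\mathbf{a} = (1, 2, -1)$, $\mathbf{b} = (1, 1, 2)$;
(h) $\mathbf{a} = (-1, 2, 0)$, $\mathbf{b} = (3, 1, -2)$;
(i) $\mathbf{a} = (1, 3, 4)$, $\mathbf{b} = (5, 1, -4)$.

1.5 Show that (a) $\mathbf{i}_1 \times \mathbf{i}_2 = \mathbf{i}_3$; (b) $\mathbf{i}_2 \times \mathbf{i}_3 = \mathbf{i}_1$; (c) $\mathbf{i}_3 \times \mathbf{i}_1 = \mathbf{i}_2$.

1.6 Show that the equation $\mathbf{a} \times \mathbf{i}_1 = -a_2 \mathbf{i}_3 + a_3 \mathbf{i}_2$ holds for an arbitrary vector $\mathbf{a} = a_1 \mathbf{i}_1 + a_2 \mathbf{i}_2 + a_3 \mathbf{i}_3$.

1.7 Let $\mathbf{a} = a_1 \mathbf{i}_1 + a_2 \mathbf{i}_2$, $\mathbf{b} = b_1 \mathbf{i}_1 + b_2 \mathbf{i}_2$, and let $\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3$ be a Cartesian basis. Show that $\mathbf{a} \times \mathbf{b} = (a_1 b_2 - a_2 b_1)\mathbf{i}_3$.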

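1.8 Suppose $\mathbf{a} \times \mathbf{x} = \mathbf{0}$, $\mathbf{a} \cdot \mathbf{x} = 0$, and $\mathbf{a} \neq \mathbf{0}$. Demonstrate that $\mathbf{x} = \mathbf{0}$.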

1.9 Find: (a) i1 · (i2 × i3); (b) i1 · (i3 × i2); (c) i1 · (i3 × i1).

1.10 Find a1 · (a2 × a3) when

(a) a1 = (−1, 2,−2), a2 = (1,−1, 2), a3 = (1,−1, 3);(b) a1 = (−1, 1, 0), a2 = (1, 1, 1), a3 = (1,−1, 2);(c) a1 = (1, 1, 1), a2 = (1, 2, 2), a3 = (1,−3, 2);(d) a1 = (−1, 2,−1), a2 = (1,−2, 2), a3 = (3,−1, 3);(e) a1 = (9, 8, 4), a2 = (7, 1, 3), a3 = (5, 3, 6);(f) a1 = (1, 2, 3), a2 = (7, 2, 3), a3 = (1, 4, 6);(g) a1 = (−1, 2,−2), a2 = (1,−1, 2), a3 = (1,−1, 2);(h) a1 = (1, 3, 2), a2 = (2,−3, 2), a3 = (3, 2, 3);(i) a1 = (2, 1, 2), a2 = (4, 3, 0), a3 = (1, 1,−2).

1.11 Find (a1 × a2) · (a3 × a4), where

(a) a1 = (−1, 2,−2), a2 = (1,−1, 2), a3 = (1,−1, 3), a4 = (1, 0, 0);(b) a1 = (−1, 1, 0), a2 = (1, 1, 1), a3 = (1,−1, 2), a4 = (0,−1, 0);(c) a1 = (1, 1, 1), a2 = (1, 2, 2), a3 = (1,−3, 2), a4 = (1, 1, 1);(d) a1 = (−1, 2,−1), a2 = (1,−2, 2), a3 = (3,−1, 3), a4 = (1, 0, 2);(e) a1 = (9, 8, 4), a2 = (7, 1, 3), a3 = (5, 3, 6), a4 = (2, 3,−1);(f) a1 = (1, 2, 3), a2 = (7, 2, 3), a3 = (1, 4, 6), a4 = (0, 0, 1);(g) a1 = (−1, 2,−2), a2 = (1,−1, 2), a3 = (1,−1, 2), a4 = (2,−2, 4);(h) a1 = (1, 3, 2), a2 = (2,−3, 2), a3 = (3, 2, 3), a4 = (1,−1, 1);(i) a1 = (2, 1, 2), a2 = (−6,−3,−6), a3 = (1, 1,−2), a4 = (1, 12, 3).

Chapter 2

Transformations and Vectors

2.1 Change of Basis

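Let us reconsider the vector

\[ \mathbf{x} = (2, 1, 3). \]

Fully written out in a given Cartesian frame $\mathbf{e}_i$ ($i = 1, 2, 3$), it is

\[ \mathbf{x} = 2\mathbf{e}_1 + \mathbf{e}_2 + 3\mathbf{e}_3. \]

(This is one of the few times we do not use $\mathbf{i}$ as the symbol for a Cartesian frame vector.) Suppose we appoint a new frame $\tilde{\mathbf{e}}_i$ ($i = 1, 2, 3$) such that

\[
\begin{aligned}
\mathbf{e}_1 &= \tilde{\mathbf{e}}_1 + 2\tilde{\mathbf{e}}_2 + 3\tilde{\mathbf{e}}_3, \\
\mathbf{e}_2 &= 4\tilde{\mathbf{e}}_1 + 5\tilde{\mathbf{e}}_2 + 6\tilde{\mathbf{e}}_3, \\
\mathbf{e}_3 &= 7\tilde{\mathbf{e}}_1 + 8\tilde{\mathbf{e}}_2 + 9\tilde{\mathbf{e}}_3.
\end{aligned}
\]

From these expansions we could calculate the $\tilde{\mathbf{e}}_i$ and verify that they are non-coplanar. As $\mathbf{x}$ is an objective, frame-independent entity, we can write

\[
\begin{aligned}
\mathbf{x} &= 2(\tilde{\mathbf{e}}_1 + 2\tilde{\mathbf{e}}_2 + 3\tilde{\mathbf{e}}_3) + (4\tilde{\mathbf{e}}_1 + 5\tilde{\mathbf{e}}_2 + 6\tilde{\mathbf{e}}_3) + 3(7\tilde{\mathbf{e}}_1 + 8\tilde{\mathbf{e}}_2 + 9\tilde{\mathbf{e}}_3) \\
&= (2 + 4 + 21)\tilde{\mathbf{e}}_1 + (4 + 5 + 24)\tilde{\mathbf{e}}_2 + (6 + 6 + 27)\tilde{\mathbf{e}}_3 \\
&= 27\tilde{\mathbf{e}}_1 + 33\tilde{\mathbf{e}}_2 + 39\tilde{\mathbf{e}}_3.
\end{aligned}
\]

In these calculations it is unimportant whether the frames are Cartesian; it is important only that we have the table of transformation

\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}. \]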
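The substitution carried out above is mechanical enough to sketch in code. The sketch assumes that row $i$ of the table holds the coefficients of the old frame vector number $i$ expanded in the new frame, as in the three expansions of this section; the names `TABLE`, `new_components`, and `det3` are hypothetical. One caveat worth checking before reusing these numbers: the determinant of this particular table evaluates to zero, so the sketch reproduces only the substitution arithmetic and makes no attempt to invert the table.

```python
# Row i: coefficients of old frame vector number i in the new frame.
TABLE = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

def new_components(old_components, table):
    """New component j is the sum over i of table[i][j] * old_components[i]."""
    n = len(old_components)
    return [sum(table[i][j] * old_components[i] for i in range(n))
            for j in range(n)]

def det3(m):
    """Determinant of a 3x3 table, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x_new = new_components([2, 1, 3], TABLE)   # [27, 33, 39]
table_det = det3(TABLE)                    # 0 for this particular table
```

In practice a table of transformation between two genuine frames must have a nonzero determinant, so a check such as `det3` is a reasonable guard before attempting the inverse transformation.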


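It is clear that we can repeat the same operation in general form. Let $\mathbf{x}$ be of the form

$$\mathbf{x} = \sum_{i=1}^{3} x^i \mathbf{e}_i \tag{2.1}$$

with the table of transformation of the frame given as

$$\mathbf{e}_i = \sum_{j=1}^{3} A_i^j \tilde{\mathbf{e}}_j.$$

Then

$$\mathbf{x} = \sum_{i=1}^{3} x^i \sum_{j=1}^{3} A_i^j \tilde{\mathbf{e}}_j = \sum_{j=1}^{3} \tilde{\mathbf{e}}_j \sum_{i=1}^{3} A_i^j x^i.$$

So in the new basis we have

$$\mathbf{x} = \sum_{j=1}^{3} \tilde{x}^j \tilde{\mathbf{e}}_j \quad\text{where}\quad \tilde{x}^j = \sum_{i=1}^{3} A_i^j x^i.$$

Here we have introduced a new notation, placing some indices as subscripts and some as superscripts. Although this practice may seem artificial, there are fairly deep reasons for following it.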
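The general rule for the new components can be checked against the numerical example at the start of this section. The following is a minimal sketch in plain Python; the function name `transform_components` and the zero-based list layout (row `i` of `A` holds the coefficients of the expansion of the i-th original basis vector) are illustrative assumptions, not notation from the text.

```python
# Table of transformation from the worked example: A[i][j] stores A_i^j,
# the coefficient of the j-th new basis vector in the expansion of e_i.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

x_old = [2, 1, 3]  # components x^i in the original frame


def transform_components(A, x):
    """Return the new components: sum over i of A_i^j x^i, for each j."""
    n = len(x)
    return [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]


x_new = transform_components(A, x_old)
print(x_new)  # [27, 33, 39]
```

The result reproduces the coefficients 27, 33, 39 obtained by direct substitution.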

2.2 Dual Bases

To perform operations with a vector $\mathbf{x}$, we must have a straightforward method of calculating its components; ultimately, no matter how advanced we are, we must be able to obtain the $x^i$ using simple arithmetic. We prefer formulas that permit us to find the components of vectors using dot multiplication only; we shall need these when doing frame transformations, etc. In a Cartesian frame the necessary operation is simple dot multiplication by the corresponding basis vector of the frame: we have

$$x_k = \mathbf{x} \cdot \mathbf{i}_k \quad (k = 1, 2, 3).$$

This procedure fails in a more general non-Cartesian frame where we do not necessarily have $\mathbf{e}_i \cdot \mathbf{e}_j = 0$ for all $j \neq i$. However, it may still be possible to find a vector $\mathbf{e}^i$ such that

$$x^i = \mathbf{x} \cdot \mathbf{e}^i \quad (i = 1, 2, 3)$$

in this more general situation. If we set

$$x^i = \mathbf{x} \cdot \mathbf{e}^i = \Bigl(\sum_{j=1}^{3} x^j \mathbf{e}_j\Bigr) \cdot \mathbf{e}^i = \sum_{j=1}^{3} x^j (\mathbf{e}_j \cdot \mathbf{e}^i)$$

and compare the left- and right-hand sides, we see that equality holds when

$$\mathbf{e}_j \cdot \mathbf{e}^i = \delta_j^i \tag{2.2}$$

where

$$
\delta_j^i =
\begin{cases}
1, & j = i, \\
0, & j \neq i,
\end{cases}
$$

is the Kronecker delta symbol. In a Cartesian frame we have

$$\mathbf{e}_k = \mathbf{e}^k = \mathbf{i}_k$$

for each $k$.

Exercise 2.1. Show that $\mathbf{e}^i$ is determined uniquely by the requirement that $x^i = \mathbf{x} \cdot \mathbf{e}^i$ for every $\mathbf{x}$.

Now let us discuss the geometrical nature of the vectors $\mathbf{e}^i$. Consider, for example, the equations for $\mathbf{e}^1$:

$$\mathbf{e}_1 \cdot \mathbf{e}^1 = 1, \quad \mathbf{e}_2 \cdot \mathbf{e}^1 = 0, \quad \mathbf{e}_3 \cdot \mathbf{e}^1 = 0.$$

We see that $\mathbf{e}^1$ is orthogonal to both $\mathbf{e}_2$ and $\mathbf{e}_3$, and its magnitude is such that $\mathbf{e}_1 \cdot \mathbf{e}^1 = 1$. Similar properties hold for $\mathbf{e}^2$ and $\mathbf{e}^3$.

Exercise 2.2. Show that the vectors $\mathbf{e}^i$ are linearly independent.

By Exercise 2.2, the $\mathbf{e}^i$ constitute a frame or basis. This basis is said to be reciprocal or dual to the basis $\mathbf{e}_i$. We can therefore expand an arbitrary vector $\mathbf{x}$ as

$$\mathbf{x} = \sum_{i=1}^{3} x_i \mathbf{e}^i. \tag{2.3}$$

Note that superscripts and subscripts continue to appear in our notation, but in a way complementary to that used in equation (2.1). If we dot-multiply the representation (2.3) of $\mathbf{x}$ by $\mathbf{e}_j$ and use (2.2) we get $x_j$. This explains why the frames $\mathbf{e}_i$ and $\mathbf{e}^i$ are dual: the formulas

$$\mathbf{x} \cdot \mathbf{e}^i = x^i, \quad \mathbf{x} \cdot \mathbf{e}_i = x_i,$$

look quite similar. So the introduction of a reciprocal basis gives many potential advantages.

Let us discuss the reciprocal basis in more detail. The first problem is to find suitable formulas to define it. We derive these formulas next, but first let us note the following. The use of reciprocal vectors may not be practical in those situations where we are working with only two or three vectors. The real advantages come when we are working intensively with many vectors. This is reminiscent of the solution of a set of linear simultaneous equations: it is inefficient to find the inverse matrix of the system if we have only one forcing vector. But when we must solve such a problem repeatedly for many forcing vectors, the calculation and use of the inverse matrix is reasonable.

Writing out $\mathbf{x}$ in the $\mathbf{e}_i$ and $\mathbf{e}^i$ bases, we used a combination of indices (i.e., subscripts and superscripts) and summation symbols. From now on we shall omit the symbol of summation when we meet matching subscripts and superscripts: we shall write, say,

$$x_i a^i \quad\text{for the sum}\quad \sum_i x_i a^i.$$

That is, whenever we see $i$ as a subscript and a superscript, we shall understand that a summation is to be carried out over $i$. This rule shall apply to situations involving vectors as well: we shall understand, for example,

$$x^i \mathbf{e}_i \quad\text{to mean the summation}\quad \sum_i x^i \mathbf{e}_i.$$

This rule is called the rule of summation over repeated indices.¹ Note that a repeated index is a dummy index in the sense that it may be replaced by any other index not already in use: we have

$$x_i a^i = x_1 a^1 + x_2 a^2 + x_3 a^3 = x_k a^k$$

for instance. An index that occurs just once in an expression, for example the index $i$ in

$$A_i^k x_k,$$

is called a free index. In tensor discussions each free index is understood to range independently over a set of values; presently this set is $\{1, 2, 3\}$.

¹ The rule of summation was first introduced not by mathematicians but by Einstein, and is sometimes referred to as the Einstein summation convention. In a paper where he introduced this rule, Einstein used Cartesian frames and therefore did not distinguish superscripts from subscripts. However, we shall continue to make the distinction so that we can deal with non-Cartesian frames.


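Let us return to the task of deriving formulas for the reciprocal basis vectors $\mathbf{e}^i$ in terms of the original basis vectors $\mathbf{e}_i$. We construct $\mathbf{e}^1$ first. Since the cross product of two vectors is perpendicular to both, we can satisfy the conditions

$$\mathbf{e}_2 \cdot \mathbf{e}^1 = 0, \quad \mathbf{e}_3 \cdot \mathbf{e}^1 = 0,$$

by setting

$$\mathbf{e}^1 = c_1 (\mathbf{e}_2 \times \mathbf{e}_3)$$

where $c_1$ is a constant. To determine $c_1$ we require

$$\mathbf{e}_1 \cdot \mathbf{e}^1 = 1.$$

We obtain

$$c_1 [\mathbf{e}_1 \cdot (\mathbf{e}_2 \times \mathbf{e}_3)] = 1.$$

The quantity $\mathbf{e}_1 \cdot (\mathbf{e}_2 \times \mathbf{e}_3)$ is a scalar whose absolute value is the volume of the parallelepiped described by the vectors $\mathbf{e}_i$. Denoting it by $V$, we have

$$\mathbf{e}^1 = \frac{1}{V} (\mathbf{e}_2 \times \mathbf{e}_3).$$

Similarly,

$$\mathbf{e}^2 = \frac{1}{V} (\mathbf{e}_3 \times \mathbf{e}_1), \quad \mathbf{e}^3 = \frac{1}{V} (\mathbf{e}_1 \times \mathbf{e}_2).$$

The reader may verify that these expressions satisfy (2.2). Let us mention that if we construct the reciprocal basis to the basis $\mathbf{e}^i$ we obtain the initial basis $\mathbf{e}_i$. Hence we immediately get the dual formulas

$$\mathbf{e}_1 = \frac{1}{V'} (\mathbf{e}^2 \times \mathbf{e}^3), \quad \mathbf{e}_2 = \frac{1}{V'} (\mathbf{e}^3 \times \mathbf{e}^1), \quad \mathbf{e}_3 = \frac{1}{V'} (\mathbf{e}^1 \times \mathbf{e}^2),$$

where

$$V' = \mathbf{e}^1 \cdot (\mathbf{e}^2 \times \mathbf{e}^3).$$

Within an algebraic sign, $V'$ is the volume of the parallelepiped described by the vectors $\mathbf{e}^i$.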

Exercise 2.3. Show that V ′ = 1/V .
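The cross-product formulas for the reciprocal basis lend themselves to a quick numerical check. The sketch below is a minimal plain-Python illustration; the helper names (`dot`, `cross`, `triple`, `reciprocal_basis`) and the sample oblique basis are assumptions chosen for the example, and exact rational arithmetic is used so that the checks hold without rounding.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def triple(e):
    """Scalar triple product e1 . (e2 x e3) of a list of three vectors."""
    return dot(e[0], cross(e[1], e[2]))


def reciprocal_basis(e):
    """Reciprocal vectors (e_j x e_k) / V for cyclic (i, j, k)."""
    V = triple(e)
    if V == 0:
        raise ValueError("basis vectors are coplanar")
    return [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
            for i in range(3)]


# Sample non-orthogonal basis in Cartesian components (illustrative only).
e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]
dual = reciprocal_basis(e)

# delta[i][j] holds the dot product of the j-th basis vector with the
# i-th reciprocal vector; it should be the Kronecker delta.
delta = [[dot(e[j], dual[i]) for j in range(3)] for i in range(3)]
V, V_dual = triple(e), triple(dual)
print(delta, V, V_dual)
```

For this basis the triple product of the reciprocal vectors comes out as the reciprocal of that of the original vectors, in line with Exercise 2.3.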

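Let us now consider the forms of the dot product between two vectors

$$\mathbf{a} = a^i \mathbf{e}_i = a_j \mathbf{e}^j, \quad \mathbf{b} = b^p \mathbf{e}_p = b_q \mathbf{e}^q.$$

We have

$$\mathbf{a} \cdot \mathbf{b} = a^i \mathbf{e}_i \cdot b^p \mathbf{e}_p = a^i b^p \, \mathbf{e}_i \cdot \mathbf{e}_p.$$

Introducing the notation

$$g_{ip} = \mathbf{e}_i \cdot \mathbf{e}_p, \tag{2.4}$$

we have

$$\mathbf{a} \cdot \mathbf{b} = a^i b^p g_{ip}.$$

(As a short exercise the reader should write out this expression in full.) Using the reciprocal component representations we get

$$\mathbf{a} \cdot \mathbf{b} = a_j \mathbf{e}^j \cdot b_q \mathbf{e}^q = a_j b_q g^{jq}$$

where

$$g^{jq} = \mathbf{e}^j \cdot \mathbf{e}^q. \tag{2.5}$$

Finally, using a mixed representation we get

$$\mathbf{a} \cdot \mathbf{b} = a^i \mathbf{e}_i \cdot b_q \mathbf{e}^q = a^i b_q \delta_i^q = a^i b_i$$

and, similarly,

$$\mathbf{a} \cdot \mathbf{b} = a_j b^j.$$

Hence

$$\mathbf{a} \cdot \mathbf{b} = a^i b^j g_{ij} = a_i b_j g^{ij} = a^i b_i = a_i b^i.$$

We see that when we use mixed bases to represent $\mathbf{a}$ and $\mathbf{b}$ we get formulas that resemble the equation

$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3$$

from § 1.3; otherwise we get more terms and additional multipliers. We will encounter $g_{ij}$ and $g^{ij}$ often. They are the components of a unique tensor known as the metric tensor. In Cartesian frames we obviously have

$$g_{ij} = \delta_i^j, \quad g^{ij} = \delta_j^i.$$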
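The four forms of the dot product can be compared numerically. The following plain-Python sketch assumes a sample oblique basis and two sample vectors given by Cartesian components; all names (`g_lo`, `g_up`, `a_up`, `a_lo`, and so on) are illustrative, with `_up` marking superscript components and `_lo` subscript components.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def reciprocal_basis(e):
    V = dot(e[0], cross(e[1], e[2]))
    return [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
            for i in range(3)]


e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]  # sample basis (assumption)
dual = reciprocal_basis(e)
R = range(3)

g_lo = [[dot(e[i], e[j]) for j in R] for i in R]        # metric with lower indices
g_up = [[dot(dual[i], dual[j]) for j in R] for i in R]  # metric with upper indices

a, b = [1, 2, 3], [4, -1, 2]            # Cartesian components
a_up = [dot(a, dual[i]) for i in R]     # superscript components of a
a_lo = [dot(a, e[i]) for i in R]        # subscript components of a
b_up = [dot(b, dual[i]) for i in R]
b_lo = [dot(b, e[i]) for i in R]

forms = [
    sum(a_up[i] * b_up[j] * g_lo[i][j] for i in R for j in R),
    sum(a_lo[i] * b_lo[j] * g_up[i][j] for i in R for j in R),
    sum(a_up[i] * b_lo[i] for i in R),
    sum(a_lo[i] * b_up[i] for i in R),
]
print(forms, dot(a, b))
```

All four entries of `forms` agree with the Cartesian value of the dot product, which is 8 for these sample vectors.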


2.3 Transformation to the Reciprocal Frame

How do the components of a vector $\mathbf{x}$ transform when we change to the reciprocal frame? We simply set

$$x^i \mathbf{e}_i = x_i \mathbf{e}^i$$

and dot both sides with $\mathbf{e}_j$ to get

$$x^i \, \mathbf{e}_i \cdot \mathbf{e}_j = x_i \, \mathbf{e}^i \cdot \mathbf{e}_j$$

or

$$x_j = x^i g_{ij}. \tag{2.6}$$

In the system of equations

$$
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix}
g_{11} & g_{21} & g_{31} \\
g_{12} & g_{22} & g_{32} \\
g_{13} & g_{23} & g_{33}
\end{pmatrix}
\begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix}
$$

the matrix of the components of the metric tensor $g_{ij}$ is also called the Gram matrix. A theorem in linear algebra states that its determinant is not zero if and only if the vectors $\mathbf{e}_i$ are linearly independent.

Exercise 2.4. (a) Show that if the Gram determinant vanishes, then the $\mathbf{e}_i$ are linearly dependent. (b) Prove that the Gram determinant equals $V^2$.

We called the basis $\mathbf{e}^i$ dual to the basis $\mathbf{e}_i$. In $\mathbf{e}^i$ the metric components are given by $g^{ij}$, so we can immediately write an expression dual to (2.6):

$$x^i = x_j g^{ij}. \tag{2.7}$$

We see from (2.6) and (2.7) that, using the components of the metric tensor, we can always change subscripts to superscripts and vice versa. These actions are known as the raising and lowering of indices. Finally, (2.6) and (2.7) together imply

$$x_i = g_{ij} g^{jk} x_k,$$

hence

$$g_{ij} g^{jk} = \delta_i^k.$$

Of course, this means that the matrices of $g_{ij}$ and $g^{ij}$ are mutually inverse.
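Lowering and raising an index, the mutual inverse property of the two metric matrices, and the value of the Gram determinant can all be checked on a small example. This is a minimal plain-Python sketch; the basis, the component values, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]  # sample basis (assumption)
V = dot(e[0], cross(e[1], e[2]))
dual = [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
        for i in range(3)]
R = range(3)

g_lo = [[dot(e[i], e[j]) for j in R] for i in R]        # Gram matrix
g_up = [[dot(dual[i], dual[j]) for j in R] for i in R]

x_up = [1, 2, 3]                                        # superscript components
x_lo = [sum(x_up[i] * g_lo[i][j] for i in R) for j in R]  # lowering, as in (2.6)
x_back = [sum(x_lo[j] * g_up[i][j] for j in R) for i in R]  # raising, as in (2.7)

product = [[sum(g_lo[i][j] * g_up[j][k] for j in R) for k in R] for i in R]
print(x_lo, x_back, det3(g_lo), V ** 2)
```

Raising the lowered components returns the starting values, `product` is the identity matrix, and the Gram determinant equals the square of the triple product for this basis.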


Quick summary

Given a basis $\mathbf{e}_i$, the vectors $\mathbf{e}^i$ given by the requirement that

$$\mathbf{e}_j \cdot \mathbf{e}^i = \delta_j^i$$

are linearly independent and form a basis called the reciprocal or dual basis. The definition of dual basis is motivated by the equation $x^i = \mathbf{x} \cdot \mathbf{e}^i$. The $\mathbf{e}^i$ can be written as

$$\mathbf{e}^i = \frac{1}{V} (\mathbf{e}_j \times \mathbf{e}_k)$$

where the ordered triple $(i, j, k)$ equals $(1, 2, 3)$ or one of the cyclic permutations $(2, 3, 1)$ or $(3, 1, 2)$, and where

$$V = \mathbf{e}_1 \cdot (\mathbf{e}_2 \times \mathbf{e}_3).$$

The dual of the basis $\mathbf{e}^k$ (i.e., the dual of the dual) is the original basis $\mathbf{e}_k$. A given vector $\mathbf{x}$ can be expressed as

$$\mathbf{x} = x^i \mathbf{e}_i = x_i \mathbf{e}^i$$

where the $x_i$ are the components of $\mathbf{x}$ with respect to the dual basis.

Exercise 2.5. (a) Let $\mathbf{x} = x^k \mathbf{e}_k = x_k \mathbf{e}^k$. Write out the modulus of $\mathbf{x}$ in all possible forms using the metric tensor. (b) Write out all forms of the dot product $\mathbf{x} \cdot \mathbf{y}$.

2.4 Transformation Between General Frames

Having transformed the components $x^i$ of a vector $\mathbf{x}$ to the corresponding components $x_i$ relative to the reciprocal basis, we are now ready to take on the more general task of transforming the $x^i$ to the corresponding components $\tilde{x}^i$ relative to any other basis $\tilde{\mathbf{e}}_i$. Let the new basis $\tilde{\mathbf{e}}_i$ be related to the original basis $\mathbf{e}_i$ by

$$\mathbf{e}_i = A_i^j \tilde{\mathbf{e}}_j. \tag{2.8}$$

This is, of course, compact notation for the system of equations

$$
\begin{pmatrix} \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{e}_3 \end{pmatrix}
=
\underbrace{
\begin{pmatrix}
A_1^1 & A_1^2 & A_1^3 \\
A_2^1 & A_2^2 & A_2^3 \\
A_3^1 & A_3^2 & A_3^3
\end{pmatrix}
}_{\equiv A,\ \text{say}}
\begin{pmatrix} \tilde{\mathbf{e}}_1 \\ \tilde{\mathbf{e}}_2 \\ \tilde{\mathbf{e}}_3 \end{pmatrix}.
$$

Before proceeding, we note that in the symbol $A_i^j$ the subscript indexes the row number in the matrix $A$, while the superscript indexes the column number. Throughout our development we shall often take the time to write various equations of interest in matrix notation. It follows from (2.8) that

$$A_i^j = \mathbf{e}_i \cdot \tilde{\mathbf{e}}^j.$$

Exercise 2.6. A Cartesian frame is rotated about its third axis to give a new Cartesian frame. Find the matrix of transformation.

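A vector $\mathbf{x}$ can be expressed in the two forms

$$\mathbf{x} = \tilde{x}^k \tilde{\mathbf{e}}_k, \quad \mathbf{x} = x^i \mathbf{e}_i.$$

Equating these two expressions for the same vector $\mathbf{x}$, we have

$$\tilde{x}^i \tilde{\mathbf{e}}_i = x^k \mathbf{e}_k,$$

hence

$$\tilde{x}^i \tilde{\mathbf{e}}_i = x^k A_k^j \tilde{\mathbf{e}}_j. \tag{2.9}$$

To find $\tilde{x}^i$ in terms of $x^i$, we may expand the notation and write (2.9) as

$$\tilde{x}^1 \tilde{\mathbf{e}}_1 + \tilde{x}^2 \tilde{\mathbf{e}}_2 + \tilde{x}^3 \tilde{\mathbf{e}}_3 = x^1 A_1^j \tilde{\mathbf{e}}_j + x^2 A_2^j \tilde{\mathbf{e}}_j + x^3 A_3^j \tilde{\mathbf{e}}_j$$

where, of course,

$$
\begin{aligned}
A_1^j \tilde{\mathbf{e}}_j &= A_1^1 \tilde{\mathbf{e}}_1 + A_1^2 \tilde{\mathbf{e}}_2 + A_1^3 \tilde{\mathbf{e}}_3, \\
A_2^j \tilde{\mathbf{e}}_j &= A_2^1 \tilde{\mathbf{e}}_1 + A_2^2 \tilde{\mathbf{e}}_2 + A_2^3 \tilde{\mathbf{e}}_3, \\
A_3^j \tilde{\mathbf{e}}_j &= A_3^1 \tilde{\mathbf{e}}_1 + A_3^2 \tilde{\mathbf{e}}_2 + A_3^3 \tilde{\mathbf{e}}_3.
\end{aligned}
$$

Matching coefficients of the $\tilde{\mathbf{e}}_i$ we find

$$
\begin{aligned}
\tilde{x}^1 &= x^1 A_1^1 + x^2 A_2^1 + x^3 A_3^1 = x^j A_j^1, \\
\tilde{x}^2 &= x^1 A_1^2 + x^2 A_2^2 + x^3 A_3^2 = x^j A_j^2, \\
\tilde{x}^3 &= x^1 A_1^3 + x^2 A_2^3 + x^3 A_3^3 = x^j A_j^3,
\end{aligned}
$$

hence

$$\tilde{x}^i = x^j A_j^i. \tag{2.10}$$

It is possible to obtain (2.10) from (2.9) in a succinct manner. On the right-hand side of (2.9) the index $j$ is a dummy index which we can replace with $i$ and thereby obtain (2.10) immediately. The matrix notation equivalent of (2.10) is

$$
\begin{pmatrix} \tilde{x}^1 \\ \tilde{x}^2 \\ \tilde{x}^3 \end{pmatrix}
=
\begin{pmatrix}
A_1^1 & A_2^1 & A_3^1 \\
A_1^2 & A_2^2 & A_3^2 \\
A_1^3 & A_2^3 & A_3^3
\end{pmatrix}
\begin{pmatrix} x^1 \\ x^2 \\ x^3 \end{pmatrix}
$$

and thus involves multiplication by $A^T$, the transpose of $A$.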

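We shall also need the equations of transformation from the frame $\tilde{\mathbf{e}}_i$ back to the frame $\mathbf{e}_i$. Since the direct transformation is linear the inverse must be linear as well, so we can write

$$\tilde{\mathbf{e}}_i = \tilde{A}_i^j \mathbf{e}_j \tag{2.11}$$

where

$$\tilde{A}_i^j = \tilde{\mathbf{e}}_i \cdot \mathbf{e}^j.$$

Let us find the relation between the matrices of transformation $A$ and $\tilde{A}$. By (2.11) and (2.8) we have

$$\tilde{\mathbf{e}}_i = \tilde{A}_i^j \mathbf{e}_j = \tilde{A}_i^j A_j^k \tilde{\mathbf{e}}_k,$$

and since the $\tilde{\mathbf{e}}_i$ form a basis we must have

$$\tilde{A}_i^j A_j^k = \delta_i^k.$$

The relationship

$$A_i^j \tilde{A}_j^k = \delta_i^k$$

follows similarly. The product of the matrices $(\tilde{A}_i^j)$ and $(A_j^k)$ is the unit matrix and thus these matrices are mutually inverse.

Exercise 2.7. Show that $x^i = \tilde{x}^k \tilde{A}_k^i$.

Formulas for the relations between reciprocal bases can be obtained as follows. We begin with the obvious identities

$$\mathbf{e}^j (\mathbf{e}_j \cdot \mathbf{x}) = \mathbf{x}, \quad \tilde{\mathbf{e}}^j (\tilde{\mathbf{e}}_j \cdot \mathbf{x}) = \mathbf{x}.$$

Putting $\mathbf{x} = \tilde{\mathbf{e}}^i$ in the first of these gives

$$\tilde{\mathbf{e}}^i = A_j^i \mathbf{e}^j,$$

while the second identity with $\mathbf{x} = \mathbf{e}^i$ yields

$$\mathbf{e}^i = \tilde{A}_j^i \tilde{\mathbf{e}}^j.$$

From these follow the transformation formulas

$$\tilde{x}_i = x_k \tilde{A}_i^k, \quad x_i = \tilde{x}_k A_i^k.$$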
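The relations of this section can be exercised on two concrete bases. The sketch below, in plain Python with exact fractions, builds the direct and inverse transformation matrices from dot products with reciprocal vectors, then checks that they are mutually inverse and that the component law (2.10) agrees with a direct computation. The two bases, the sample vector, and all names (`et` for the new basis, `A`, `At`) are assumptions made for illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def reciprocal_basis(e):
    V = dot(e[0], cross(e[1], e[2]))
    return [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
            for i in range(3)]


R = range(3)
e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]    # original basis (assumption)
et = [[1, 0, 0], [0, 2, 0], [0, 1, 1]]   # new ("tilde") basis (assumption)
dual_e, dual_et = reciprocal_basis(e), reciprocal_basis(et)

# A[i][j]: original vector i dotted with new reciprocal vector j (direct matrix).
A = [[dot(e[i], dual_et[j]) for j in R] for i in R]
# At[i][j]: new vector i dotted with original reciprocal vector j (inverse matrix).
At = [[dot(et[i], dual_e[j]) for j in R] for i in R]

# Each original basis vector rebuilt from the new basis, as in (2.8).
rebuilt = [[sum(A[i][j] * et[j][p] for j in R) for p in R] for i in R]
# Product of the inverse and direct matrices.
product = [[sum(At[i][j] * A[j][k] for j in R) for k in R] for i in R]

x = [1, 2, 3]                                  # Cartesian components
x_up = [dot(x, dual_e[i]) for i in R]          # components in the original basis
xt_direct = [dot(x, dual_et[i]) for i in R]    # components in the new basis
xt_law = [sum(x_up[j] * A[j][i] for j in R) for i in R]  # via (2.10)
print(xt_direct, xt_law)
```

Note that the sum in `xt_law` runs over the row index of `A`, which is the transpose multiplication mentioned after (2.10).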


2.5 Covariant and Contravariant Components

We have seen that if the basis vectors transform according to the relation

$$\mathbf{e}_i = A_i^j \tilde{\mathbf{e}}_j,$$

then the components $x_i$ of a vector $\mathbf{x}$ must transform according to

$$x_i = A_i^j \tilde{x}_j.$$

The similarity in form between these two relations results in the $x_i$ being termed the covariant components of the vector $\mathbf{x}$. On the other hand, the transformation law

$$\tilde{x}^i = A_j^i x^j$$

shows that the $x^i$ transform like the $\mathbf{e}^i$. For this reason the $x^i$ are termed the contravariant components of $\mathbf{x}$. We shall find a further use for this nomenclature in Chapter 3.

Quick summary

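If frame transformations

$$
\begin{aligned}
\mathbf{e}_i &= A_i^j \tilde{\mathbf{e}}_j, & \tilde{\mathbf{e}}_i &= \tilde{A}_i^j \mathbf{e}_j, \\
\tilde{\mathbf{e}}^i &= A_j^i \mathbf{e}^j, & \mathbf{e}^i &= \tilde{A}_j^i \tilde{\mathbf{e}}^j,
\end{aligned}
$$

are considered, then $\mathbf{x}$ has the various expressions

$$\mathbf{x} = x^i \mathbf{e}_i = x_i \mathbf{e}^i = \tilde{x}^i \tilde{\mathbf{e}}_i = \tilde{x}_i \tilde{\mathbf{e}}^i$$

and the transformation laws

$$
\begin{aligned}
x_i &= A_i^j \tilde{x}_j, & \tilde{x}_i &= \tilde{A}_i^j x_j, \\
\tilde{x}^i &= A_j^i x^j, & x^i &= \tilde{A}_j^i \tilde{x}^j,
\end{aligned}
$$

apply. The $x^i$ are termed contravariant components of $\mathbf{x}$, while the $x_i$ are termed covariant components. The transformation laws are particularly simple when the frame is changed to the dual frame. Then

$$x^i = g^{ji} x_j, \quad x_i = g_{ij} x^j,$$

where

$$g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j, \quad g^{ij} = \mathbf{e}^i \cdot \mathbf{e}^j,$$

are components of the metric tensor.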


2.6 The Cross Product in Index Notation

In mechanics a major role is played by the quantity called torque. This quantity is introduced in elementary physics as the product of a force magnitude and a length (“force times moment arm”), along with some rules for algebraic sign to account for the sense of rotation that the force would encourage when applied to a physical body. In more advanced discussions in which three-dimensional problems are considered, torque is regarded as a vectorial quantity. If a force $\mathbf{f}$ acts at a point which is located relative to an origin $O$ by position vector $\mathbf{r}$, then the associated torque $\mathbf{t}$ about $O$ is normal to the plane of the vectors $\mathbf{r}$ and $\mathbf{f}$. Of the two possible unit normals, $\mathbf{t}$ is conventionally (but arbitrarily) associated with the vector $\mathbf{n}$ given by the familiar right-hand rule: if the forefinger of the right hand is directed along $\mathbf{r}$ and the middle finger is directed along $\mathbf{f}$, then the thumb indicates the direction of $\mathbf{n}$ and hence the direction of $\mathbf{t}$. The magnitude of $\mathbf{t}$ equals $|\mathbf{f}||\mathbf{r}| \sin\theta$, where $\theta$ is the smaller angle between $\mathbf{f}$ and $\mathbf{r}$. These rules are all encapsulated in the brief symbolism

$$\mathbf{t} = \mathbf{r} \times \mathbf{f}.$$

The definition of torque can be taken as a model for a more general operation between vectors: the cross product. If $\mathbf{a}$ and $\mathbf{b}$ are any two vectors, we define

$$\mathbf{a} \times \mathbf{b} = \mathbf{n}\,|\mathbf{a}||\mathbf{b}| \sin\theta$$

where $\mathbf{n}$ and $\theta$ are defined as in the case of torque above. Like any other vector, $\mathbf{c} = \mathbf{a} \times \mathbf{b}$ can be expanded in terms of a basis; we choose the reciprocal basis $\mathbf{e}^i$ and write

$$\mathbf{c} = c_i \mathbf{e}^i.$$

Because the magnitudes of $\mathbf{a}$ and $\mathbf{b}$ enter into $\mathbf{a} \times \mathbf{b}$ in multiplicative fashion, we are prompted to seek $c_i$ in the form

$$c_i = \epsilon_{ijk} a^j b^k. \tag{2.12}$$

Here the $\epsilon$'s are formal coefficients. Let us find them. We write

$$\mathbf{a} = a^j \mathbf{e}_j, \quad \mathbf{b} = b^k \mathbf{e}_k,$$

and employ the well-known distributive property

$$(\mathbf{u} + \mathbf{v}) \times \mathbf{w} \equiv \mathbf{u} \times \mathbf{w} + \mathbf{v} \times \mathbf{w}$$

to obtain

$$\mathbf{c} = a^j \mathbf{e}_j \times b^k \mathbf{e}_k = a^j b^k (\mathbf{e}_j \times \mathbf{e}_k).$$

Then

$$\mathbf{c} \cdot \mathbf{e}_i = c_m \mathbf{e}^m \cdot \mathbf{e}_i = c_i = a^j b^k [(\mathbf{e}_j \times \mathbf{e}_k) \cdot \mathbf{e}_i]$$

and comparison with (2.12) shows that

$$\epsilon_{ijk} = (\mathbf{e}_j \times \mathbf{e}_k) \cdot \mathbf{e}_i.$$

Now the value of $(\mathbf{e}_j \times \mathbf{e}_k) \cdot \mathbf{e}_i$ depends on the values of the indices $i, j, k$. Here it is convenient to introduce the idea of a permutation of the ordered triple $(1, 2, 3)$. A permutation of $(1, 2, 3)$ is called even if it can be brought about by performing any even number of interchanges of pairs of these numbers; a permutation is odd if it results from performing any odd number of interchanges. We saw before that $(\mathbf{e}_j \times \mathbf{e}_k) \cdot \mathbf{e}_i$ equals the volume of the frame parallelepiped if $i, j, k$ are distinct and the ordered triple $(i, j, k)$ is an even permutation of $(1, 2, 3)$. If $i, j, k$ are distinct and the ordered triple $(i, j, k)$ is an odd permutation of $(1, 2, 3)$, we obtain minus the volume of the frame parallelepiped. If any two of the numbers $i, j, k$ are equal we obtain zero. Hence

$$
\epsilon_{ijk} =
\begin{cases}
+V, & (i, j, k) \text{ an even permutation of } (1, 2, 3), \\
-V, & (i, j, k) \text{ an odd permutation of } (1, 2, 3), \\
0, & \text{two or more indices equal.}
\end{cases}
$$

Moreover, it can be shown (Exercise 2.4) that

$$V^2 = g$$

where $g$ is the determinant of the matrix formed from the elements $g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$ of the metric tensor. Note that $|V| = 1$ for a Cartesian frame.

The permutation symbol $\epsilon_{ijk}$ is useful in writing formulas. For example, the determinant of a matrix $A = (a_{ij})$ can be expressed succinctly as

$$\det A = \epsilon_{ijk} a_{1i} a_{2j} a_{3k}.$$

Much more than a notational device however, $\epsilon_{ijk}$ represents a tensor (the so-called Levi–Civita tensor). We discuss this further in Chapter 3.
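Formula (2.12) can be tested against the ordinary Cartesian cross product. In the plain-Python sketch below the coefficients are built directly from their definition as triple products of basis vectors; the basis, the sample vectors, and the names (`eps`, `a_up`, `c_lo`) are illustrative assumptions.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


R = range(3)
e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]  # sample basis (assumption)
V = dot(e[0], cross(e[1], e[2]))
dual = [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
        for i in R]

# eps[i][j][k] = (e_j x e_k) . e_i, with zero-based indices.
eps = [[[dot(cross(e[j], e[k]), e[i]) for k in R] for j in R] for i in R]

a, b = [1, 2, 3], [4, -1, 2]           # Cartesian components
a_up = [dot(a, dual[j]) for j in R]    # superscript components of a
b_up = [dot(b, dual[k]) for k in R]

# Subscript components of c = a x b from (2.12) ...
c_lo = [sum(eps[i][j][k] * a_up[j] * b_up[k] for j in R for k in R) for i in R]
# ... and from the Cartesian cross product dotted with each basis vector.
c_lo_direct = [dot(cross(a, b), e[i]) for i in R]
print(c_lo, c_lo_direct)
```

For this basis the nonzero coefficients equal plus or minus the triple product of the basis vectors, as the case formula states.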

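Exercise 2.8. The contravariant components of a vector $\mathbf{c} = \mathbf{a} \times \mathbf{b}$ can be expressed as

$$c^i = \epsilon^{ijk} a_j b_k$$

for suitable coefficients $\epsilon^{ijk}$. Use the technique of this section to find the coefficients. Then establish the identity

$$
\epsilon_{ijk} \epsilon^{pqr} =
\begin{vmatrix}
\delta_i^p & \delta_i^q & \delta_i^r \\
\delta_j^p & \delta_j^q & \delta_j^r \\
\delta_k^p & \delta_k^q & \delta_k^r
\end{vmatrix}
$$

and use it to show that

$$\epsilon_{ijk} \epsilon^{pqk} = \delta_i^p \delta_j^q - \delta_i^q \delta_j^p.$$

Use this in turn to prove that

$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b}) \tag{2.13}$$

for any vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$.

Exercise 2.9. Establish Lagrange's identity

$$(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) = (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{a} \cdot \mathbf{d})(\mathbf{b} \cdot \mathbf{c}).$$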

2.7 Norms on the Space of Vectors

We often need to characterize the intensity of some vector field locally or globally. For this, the notion of a norm is appropriate. The well-known Euclidean norm of a vector $\mathbf{a} = a_k \mathbf{i}_k$ written in a Cartesian frame is

$$\|\mathbf{a}\| = \Bigl(\sum_{k=1}^{3} a_k^2\Bigr)^{1/2}.$$

This norm is related to the inner product of two vectors $\mathbf{a} = a_k \mathbf{i}_k$ and $\mathbf{b} = b_k \mathbf{i}_k$: we have $\mathbf{a} \cdot \mathbf{b} = a_k b_k$ so that

$$\|\mathbf{a}\| = (\mathbf{a} \cdot \mathbf{a})^{1/2}.$$

In a non-Cartesian frame, the components of a vector depend on the lengths of the frame vectors and the angles between them. Since the sum of squared components of a vector depends on the frame, we cannot use it to characterize the vector. But the formulas connected with the dot product are invariant under change of frame, so we can use them to characterize the intensity of the vector, that is, its length. Thus for two vectors $\mathbf{x} = x^i \mathbf{e}_i$ and $\mathbf{y} = y^j \mathbf{e}_j$ written in the arbitrary frame, we can introduce a scalar product (i.e., a simple dot product)

$$\mathbf{x} \cdot \mathbf{y} = x^i \mathbf{e}_i \cdot y^j \mathbf{e}_j = x^i y^j g_{ij} = x_i y_j g^{ij} = x^i y_i.$$

Note that only in mixed coordinates does this resemble the scalar product in a Cartesian frame. Similarly, the norm of a vector $\mathbf{x}$ is

$$\|\mathbf{x}\| = (\mathbf{x} \cdot \mathbf{x})^{1/2} = \bigl(x^i x^j g_{ij}\bigr)^{1/2} = \bigl(x_i x_j g^{ij}\bigr)^{1/2} = \bigl(x^i x_i\bigr)^{1/2}.$$

This dot product and associated norm have all the properties required from objects of this nature in algebra or functional analysis. Indeed, it is necessary only to check whether all the axioms of the inner product are satisfied.

(i) $\mathbf{x} \cdot \mathbf{x} \geq 0$, and $\mathbf{x} \cdot \mathbf{x} = 0$ if and only if $\mathbf{x} = \mathbf{0}$. This property holds because all the quantities involved can be written in a Cartesian frame where it holds trivially. By the same reasoning, we confirm satisfaction of the property
(ii) $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$. The reader should check that this holds for any representation of the vectors. Finally,
(iii) $(\alpha\mathbf{x} + \beta\mathbf{y}) \cdot \mathbf{z} = \alpha(\mathbf{x} \cdot \mathbf{z}) + \beta(\mathbf{y} \cdot \mathbf{z})$ where $\alpha$ and $\beta$ are arbitrary real numbers and $\mathbf{z}$ is a vector.

By the general theory then, the expression

$$\|\mathbf{x}\| = (\mathbf{x} \cdot \mathbf{x})^{1/2} \tag{2.14}$$

satisfies all the axioms of a norm:

(i) $\|\mathbf{x}\| \geq 0$, with $\|\mathbf{x}\| = 0$ if and only if $\mathbf{x} = \mathbf{0}$.
(ii) $\|\alpha\mathbf{x}\| = |\alpha| \, \|\mathbf{x}\|$ for any real $\alpha$.
(iii) $\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|$.

In addition we have the Schwarz inequality

$$|\mathbf{x} \cdot \mathbf{y}| \leq \|\mathbf{x}\| \, \|\mathbf{y}\|, \tag{2.15}$$

where in the case of nonzero vectors the equality holds if and only if $\mathbf{x} = \lambda\mathbf{y}$ for some real $\lambda$.

The set of all three-dimensional vectors constitutes a three-dimensional linear space. A linear space equipped with the norm (2.14) becomes a normed space. In this book, the principal space is $\mathbb{R}^3$. Note that we can introduce more than one norm in any normed space, and in practice a variety of norms turn out to be necessary. For example, $2\|\mathbf{x}\|$ is also a norm in $\mathbb{R}^3$. We can introduce other norms, quite different from the above. One norm can be introduced as follows. Let $\mathbf{e}_k$ be a basis of $\mathbb{R}^3$ and let $\mathbf{x} = x^k \mathbf{e}_k$. For $p \geq 1$, we introduce

$$\|\mathbf{x}\|_p = \Bigl(\sum_{k=1}^{3} |x^k|^p\Bigr)^{1/p}.$$

Norm axioms (i) and (ii) obviously hold. Axiom (iii) is a consequence of the classical Minkowski inequality for finite sums. The reader should be aware that this norm is given in a certain basis. If we use it in another basis, the value of the norm of a vector will change in general. An advantage of the norm (2.14) is that it is independent of the basis of the space.
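The basis dependence of the p-norm, and the basis independence of the norm (2.14), can be seen on a single vector. The plain-Python sketch below compares a Cartesian basis with a sample oblique basis; both bases, the vector, and the helper names (`components`, `p_norm`, `norm_sq`) are assumptions chosen for illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def components(x, basis):
    """Superscript components of x: dot products with the reciprocal basis."""
    V = dot(basis[0], cross(basis[1], basis[2]))
    dual = [[Fraction(c) / V
             for c in cross(basis[(i + 1) % 3], basis[(i + 2) % 3])]
            for i in range(3)]
    return [dot(x, d) for d in dual]


def p_norm(comps, p):
    """p-norm computed from the components in one particular basis."""
    return sum(abs(float(c)) ** p for c in comps) ** (1.0 / p)


def norm_sq(x, basis):
    """Square of the norm (2.14), written with the metric of the given basis."""
    c = components(x, basis)
    R = range(3)
    return sum(c[i] * c[j] * dot(basis[i], basis[j]) for i in R for j in R)


x = [1, 2, 3]                                    # Cartesian components
cartesian = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
oblique = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]      # sample basis (assumption)

print(p_norm(components(x, cartesian), 1))  # 6.0
print(p_norm(components(x, oblique), 1))    # 4.5
print(norm_sq(x, cartesian), norm_sq(x, oblique))
```

With p = 1 the two bases give 6.0 and 4.5 for the same vector, while the squared norm computed through the metric is 14 in both.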

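Later, when investigating the eigenvalues of a tensor, we will need a space of vectors with complex components. It can be introduced similarly to the space of complex numbers. We start with the space $\mathbb{R}^3$ having basis $\mathbf{e}_k$, and introduce multiplication of vectors in $\mathbb{R}^3$ by complex numbers. This also yields a linear space, but it is complex and denoted by $\mathbb{C}^3$. An arbitrary vector $\mathbf{x}$ in $\mathbb{C}^3$ takes the form

$$\mathbf{x} = (a^k + i b^k) \mathbf{e}_k,$$

where $i$ is the imaginary unit ($i^2 = -1$). Analogous to the conjugate number is the conjugate vector to $\mathbf{x}$, defined by

$$\bar{\mathbf{x}} = (a^k - i b^k) \mathbf{e}_k.$$

The real and imaginary parts of $\mathbf{x}$ are $a^k \mathbf{e}_k$ and $b^k \mathbf{e}_k$, respectively. Clearly, a basis in $\mathbb{C}^3$ may contain vectors that are not in $\mathbb{R}^3$. As an exercise, the reader should write out the form of the real and imaginary parts of $\mathbf{x}$ in such a basis.

In $\mathbb{C}^3$, the dot product loses the property that $\mathbf{x} \cdot \mathbf{x} \geq 0$. However, we can introduce the inner product of two vectors $\mathbf{x}$ and $\mathbf{y}$ as

$$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot \bar{\mathbf{y}}.$$

It is easy to see that this inner product has the following properties. Let $\mathbf{x}, \mathbf{y}, \mathbf{z}$ be arbitrary vectors of $\mathbb{C}^3$. Then

(i) $\langle \mathbf{x}, \mathbf{x} \rangle \geq 0$, and $\langle \mathbf{x}, \mathbf{x} \rangle = 0$ if and only if $\mathbf{x} = \mathbf{0}$.
(ii) $\langle \mathbf{x}, \mathbf{y} \rangle = \overline{\langle \mathbf{y}, \mathbf{x} \rangle}$.
(iii) $\langle \alpha\mathbf{x} + \beta\mathbf{y}, \mathbf{z} \rangle = \alpha \langle \mathbf{x}, \mathbf{z} \rangle + \beta \langle \mathbf{y}, \mathbf{z} \rangle$ where $\alpha$ and $\beta$ are arbitrary complex numbers.

The reader should verify these properties. Now we can introduce the norm related to the inner product,

$$\|\mathbf{x}\| = \langle \mathbf{x}, \mathbf{x} \rangle^{1/2},$$

and verify that it satisfies all the axioms of a norm in a complex linear space. As a consequence of the general properties of the inner product, Schwarz's inequality (2.15) also holds in $\mathbb{C}^3$.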
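A two-line experiment shows why the plain dot product is not suitable in the complex space and how conjugation repairs it. The sketch below uses Python's built-in complex numbers and assumes, for simplicity, that components are taken in a real orthonormal basis; the function names `dot` and `inner` and the sample vectors are illustrative.

```python
def dot(x, y):
    """Plain dot product of component lists (no conjugation)."""
    return sum(a * b for a, b in zip(x, y))


def inner(x, y):
    """Inner product: dot product of x with the conjugate of y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


x = [1 + 1j, 0j, 0j]
y = [2j, 1 + 0j, 0j]

print(dot(x, x))     # 2j: not a nonnegative real number
print(inner(x, x))   # (2+0j): real and nonnegative
print(inner(x, y), inner(y, x).conjugate())
```

The last line prints the same value twice, illustrating the conjugate symmetry of the inner product for this pair of vectors.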

2.8 Closing Remarks

We close by repeating something we said in Chapter 1:

A vector is an objective entity.

In elementary mathematics we learn to think of a vector as an ordered triple of components. There is, of course, no harm in this if we keep in mind a certain Cartesian frame. But if we fix those components then in any other frame the vector is determined uniquely. Absolutely uniquely! So a vector is something objective, but as soon as we specify its components in one frame we can find them in any other frame by the use of certain rules.

We emphasize this because the situation is exactly the same with tensors. A tensor is an objective entity, and fixing its components relative to one frame, we determine the tensor uniquely, even though its components relative to other frames will in general be different.

2.9 Problems

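2.1 Find the dual basis to $\mathbf{e}_i$.

(a) $\mathbf{e}_1 = 2\mathbf{i}_1 + \mathbf{i}_2 - \mathbf{i}_3$, $\mathbf{e}_2 = 2\mathbf{i}_2 + 3\mathbf{i}_3$, $\mathbf{e}_3 = \mathbf{i}_1 + \mathbf{i}_3$;
(b) $\mathbf{e}_1 = \mathbf{i}_1 + 3\mathbf{i}_2 + 2\mathbf{i}_3$, $\mathbf{e}_2 = 2\mathbf{i}_1 - 3\mathbf{i}_2 + 2\mathbf{i}_3$, $\mathbf{e}_3 = 3\mathbf{i}_1 + 2\mathbf{i}_2 + 3\mathbf{i}_3$;
(c) $\mathbf{e}_1 = \mathbf{i}_1 + \mathbf{i}_2$, $\mathbf{e}_2 = \mathbf{i}_1 - \mathbf{i}_2$, $\mathbf{e}_3 = 3\mathbf{i}_3$;
(d) $\mathbf{e}_1 = \cos\phi\,\mathbf{i}_1 + \sin\phi\,\mathbf{i}_2$, $\mathbf{e}_2 = -\sin\phi\,\mathbf{i}_1 + \cos\phi\,\mathbf{i}_2$, $\mathbf{e}_3 = \mathbf{i}_3$.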

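2.2 Let

$$
\begin{aligned}
\mathbf{e}_1 &= -2\mathbf{i}_1 + 3\mathbf{i}_2 + 2\mathbf{i}_3, & \tilde{\mathbf{e}}_1 &= 2\mathbf{i}_1 + \mathbf{i}_2 - \mathbf{i}_3, \\
\mathbf{e}_2 &= -2\mathbf{i}_1 + 2\mathbf{i}_2 + \mathbf{i}_3, & \tilde{\mathbf{e}}_2 &= 2\mathbf{i}_2 + 3\mathbf{i}_3, \\
\mathbf{e}_3 &= -\mathbf{i}_1 + \mathbf{i}_2 + \mathbf{i}_3, & \tilde{\mathbf{e}}_3 &= \mathbf{i}_1 + \mathbf{i}_3.
\end{aligned}
$$

Find the matrix $A_i^j$ of transformation from the basis $\tilde{\mathbf{e}}_i$ to the basis $\mathbf{e}_j$.

2.3 Let

$$
\begin{aligned}
\mathbf{e}_1 &= \mathbf{i}_1 + 2\mathbf{i}_2, & \tilde{\mathbf{e}}_1 &= \mathbf{i}_1 - 6\mathbf{i}_3, \\
\mathbf{e}_2 &= -\mathbf{i}_2 - \mathbf{i}_3, & \tilde{\mathbf{e}}_2 &= -3\mathbf{i}_1 - 4\mathbf{i}_2 + 4\mathbf{i}_3, \\
\mathbf{e}_3 &= -\mathbf{i}_1 + 2\mathbf{i}_2 - 2\mathbf{i}_3, & \tilde{\mathbf{e}}_3 &= \mathbf{i}_1 + \mathbf{i}_2 + \mathbf{i}_3.
\end{aligned}
$$

Find the matrix of transformation of the basis $\tilde{\mathbf{e}}_i$ to $\mathbf{e}_j$.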

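2.4 Find

(a) $a^j \delta_{jk}$,
(b) $a_i a^j \delta_j^i$,
(c) $\delta_i^i$,
(d) $\delta_{ij} \delta^{jk}$,
(e) $\delta_{ij} \delta^{ji}$,
(f) $\delta_i^j \delta_j^k \delta_k^i$.

2.5 Show that $\epsilon_{ijk} \epsilon^{ijl} = 2\delta_k^l$.

2.6 Show that $\epsilon_{ijk} \epsilon^{ijk} = 6$.

2.7 Find

(a) $\epsilon_{ijk} \delta^{jk}$,
(b) $\epsilon_{ijk} \epsilon^{mkj} \delta_m^i$,
(c) $\epsilon_{ijk} \delta_m^k \delta_n^j$,
(d) $\epsilon_{ijk} a^i a^j$,
(e) $\epsilon_{ijk} |\epsilon^{ijk}|$,
(f) $\epsilon_{ijk} \epsilon^{imn} \delta_m^j$.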

2.8 Find (a × b) × c.

2.9 Show that (a × b) · a = 0.

2.10 Show that a · (b × c)d = (a · d)b × c + (b · d)c × a + (c · d)a × b.

2.11 Show that (e× a) × e = a if |e| = 1 and e · a = 0.

2.12 Let $\mathbf{e}_k$ be a basis of $\mathbb{R}^3$, let $\mathbf{x} = x^k \mathbf{e}_k$, and suppose $h_1, h_2, h_3$ are fixed positive numbers. Show that $h_k |x^k|$ is a norm in $\mathbb{R}^3$.

Chapter 3

Tensors

3.1 Dyadic Quantities and Tensors

We have met sets of quantities like $g_{ij}$ or $g^{ij}$. Such a table of $3 \times 3 = 9$ coefficients could be considered as a vector in a nine-dimensional space, but we must reject this idea for an important reason: if we change the frame vectors and calculate the relations between the new and old components, the results differ in form from those that apply to vector components. The components of the metric tensor transform according to certain rules, however, and it is found that these transformation rules also apply to various quantities encountered in physical science. We indicated in Chapter 1 that these quantities, represented by $3 \times 3$ matrices, form a class of objects known as second-order tensors. Our plan is to present the relevant theory in a way that parallels the vector presentation of Chapter 2.

We begin to realize this program with the introduction of the dyad (or tensor product) of two vectors $\mathbf{a}$ and $\mathbf{b}$, denoted $\mathbf{a} \otimes \mathbf{b}$. We assume that the tensor product satisfies many usual properties of a product:

$$
\begin{aligned}
(\lambda\mathbf{a}) \otimes \mathbf{b} &= \mathbf{a} \otimes (\lambda\mathbf{b}) = \lambda(\mathbf{a} \otimes \mathbf{b}), \\
(\mathbf{a} + \mathbf{b}) \otimes \mathbf{c} &= \mathbf{a} \otimes \mathbf{c} + \mathbf{b} \otimes \mathbf{c}, \\
\mathbf{a} \otimes (\mathbf{b} + \mathbf{c}) &= \mathbf{a} \otimes \mathbf{b} + \mathbf{a} \otimes \mathbf{c},
\end{aligned}
\tag{3.1}
$$

where $\lambda$ is an arbitrary real number. However, the tensor product is not symmetric: if $\mathbf{a}$ is not proportional to $\mathbf{b}$ then $\mathbf{a} \otimes \mathbf{b} \neq \mathbf{b} \otimes \mathbf{a}$. From now on, we shall write out the dyad without the $\otimes$ symbol: $\mathbf{ab} = \mathbf{a} \otimes \mathbf{b}$.

Let us once again consider the space of three-dimensional vectors with the frame $\mathbf{e}_i$. Using the expansion of the vectors in the basis vectors and the properties (3.1), we represent the dyad $\mathbf{ab}$ as

$$\mathbf{ab} = a^i \mathbf{e}_i b^j \mathbf{e}_j = a^i b^j \mathbf{e}_i \mathbf{e}_j.$$


This introduces exactly nine different dyads $\mathbf{e}_i \mathbf{e}_j$. We now consider a linear space whose basis is this set of nine dyads and call it the space of second-order tensors (or tensors of order two). The numerical coefficients of the dyads are called the components of the tensor. Thus an element of this space, a tensor $\mathbf{A}$, has the representation

$$\mathbf{A} = a^{ij} \mathbf{e}_i \mathbf{e}_j.$$

To maintain the property of objectivity of the elements of this space, we require that upon transformation of the frame the components of $\mathbf{A}$ transform correspondingly. Note that we have introduced superscript indices for the components of $\mathbf{A}$. This was done in keeping with the development of Chapter 2.

In preparation for the next section let us introduce the dot product of a dyad $\mathbf{ab}$ by a vector $\mathbf{c}$:

$$\mathbf{ab} \cdot \mathbf{c} = (\mathbf{b} \cdot \mathbf{c})\mathbf{a}. \tag{3.2}$$

So the result is a vector co-oriented with $\mathbf{a}$. Analogously we can introduce the dot product from the left:

$$\mathbf{c} \cdot \mathbf{ab} = (\mathbf{c} \cdot \mathbf{a})\mathbf{b}. \tag{3.3}$$

Exercise 3.1. (a) A dyad of the form $\mathbf{ee}$, where $\mathbf{e}$ is a unit vector, is sometimes called a projection dyad. Explain. (b) Write down matrices for the dyads $\mathbf{i}_1\mathbf{i}_1$, $\mathbf{i}_2\mathbf{i}_2$, and $\mathbf{i}_3\mathbf{i}_1$.

3.2 Tensors From an Operator Viewpoint

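An alternative to viewing a second-order tensor as a weighted sum of dyads is to view the tensor as an operator. From this standpoint a tensor $\mathbf{A}$ is considered to map a vector $\mathbf{x}$ into a vector $\mathbf{y}$ according to the equation

$$\mathbf{y} = \mathbf{A} \cdot \mathbf{x}.$$

Conversely, a given linear relation between $\mathbf{x}$ and $\mathbf{y}$ will define the operator $\mathbf{A}$ uniquely. Thus if we have $\mathbf{A} \cdot \mathbf{x} = \mathbf{B} \cdot \mathbf{x}$ for all $\mathbf{x}$, then we have $\mathbf{A} = \mathbf{B}$. Let us show that the components are really uniquely defined in any basis by the equality $\mathbf{y} = \mathbf{A} \cdot \mathbf{x}$. The tensor $\mathbf{A}$ is represented by the expression $a^{ij} \mathbf{e}_i \mathbf{e}_j$ in some basis $\mathbf{e}_i$. It is clear that the operation $\mathbf{A} \cdot \mathbf{x}$ is linear in $\mathbf{x}$, so we define $\mathbf{A}$ uniquely if we specify its action on all three vectors of a basis. Taking $\mathbf{x} = \mathbf{e}^k$, the corresponding $\mathbf{y}$ is

$$\mathbf{A} \cdot \mathbf{x} = a^{ij} \mathbf{e}_i \mathbf{e}_j \cdot \mathbf{e}^k = a^{ik} \mathbf{e}_i.$$

Dot multiplying this by $\mathbf{e}^l$ we get

$$a^{lk} = \mathbf{e}^l \cdot \mathbf{A} \cdot \mathbf{e}^k.$$

In this way we can find the components of a tensor $\mathbf{A}$ in any basis:

$$
\begin{aligned}
a^{ij} &= \mathbf{e}^i \cdot \mathbf{A} \cdot \mathbf{e}^j, & a^i_{\cdot j} &= \mathbf{e}^i \cdot \mathbf{A} \cdot \mathbf{e}_j, \\
a_{ij} &= \mathbf{e}_i \cdot \mathbf{A} \cdot \mathbf{e}_j, & a_i^{\cdot j} &= \mathbf{e}_i \cdot \mathbf{A} \cdot \mathbf{e}^j.
\end{aligned}
$$

Note that in “mixed components” we position the indices in such a way that their association with the various dyads remains clear.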
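The component formulas can be tried out on a tensor given by its Cartesian matrix. The plain-Python sketch below computes two kinds of components by sandwiching the tensor between basis vectors, then rebuilds the Cartesian matrix from the corresponding dyads. The matrix `T`, the basis, and the helper names (`apply`, `a_up`, `a_lo`) are assumptions made for illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def apply(T, v):
    """Action of the tensor on a vector, using its Cartesian matrix T."""
    return [dot(row, v) for row in T]


R = range(3)
e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]  # sample basis (assumption)
V = dot(e[0], cross(e[1], e[2]))
dual = [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
        for i in R]

T = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]  # Cartesian matrix of a sample tensor

# Components with two superscripts: reciprocal vectors on both sides.
a_up = [[dot(dual[i], apply(T, dual[j])) for j in R] for i in R]
# Components with two subscripts: original basis vectors on both sides.
a_lo = [[dot(e[i], apply(T, e[j])) for j in R] for i in R]

# Rebuild the Cartesian matrix as a weighted sum of dyads in each representation.
from_up = [[sum(a_up[i][j] * e[i][p] * e[j][q] for i in R for j in R)
            for q in R] for p in R]
from_lo = [[sum(a_lo[i][j] * dual[i][p] * dual[j][q] for i in R for j in R)
            for q in R] for p in R]
print(from_up == T, from_lo == T)
```

Both reconstructions return the starting matrix, which illustrates that the different component sets describe one and the same tensor.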

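Analyzing the above reasoning, we find that we have proved the quotient law for tensors of order two. If $\mathbf{y}$ is a given vector and there is a linear transformation from $\mathbf{x}$ to $\mathbf{y}$ for an arbitrary vector $\mathbf{x}$, then the linear transformation is a tensor and we can write $\mathbf{y} = \mathbf{A} \cdot \mathbf{x}$. This statement is sometimes useful in establishing the tensorial character of a set of scalar quantities (i.e., the components of $\mathbf{A}$).

We may also define common algebraic operations from the operator viewpoint. Given tensors $\mathbf{A}$ and $\mathbf{B}$, the sum is the tensor $\mathbf{A} + \mathbf{B}$ uniquely defined by the requirement that

$$(\mathbf{A} + \mathbf{B}) \cdot \mathbf{x} = \mathbf{A} \cdot \mathbf{x} + \mathbf{B} \cdot \mathbf{x}$$

for all $\mathbf{x}$. If $c$ is a scalar, $c\mathbf{A}$ is defined by the requirement that

$$(c\mathbf{A}) \cdot \mathbf{x} = c(\mathbf{A} \cdot \mathbf{x})$$

for all $\mathbf{x}$. In particular, any product of the form $0\mathbf{A}$ gives a zero tensor denoted $\mathbf{0}$. The dot product $\mathbf{A} \cdot \mathbf{B}$ is regarded as the composition of the operators $\mathbf{B}$ and $\mathbf{A}$:

$$(\mathbf{A} \cdot \mathbf{B}) \cdot \mathbf{x} \equiv \mathbf{A} \cdot (\mathbf{B} \cdot \mathbf{x}).$$

The dot product $\mathbf{y} \cdot \mathbf{A}$, called pre-multiplication of $\mathbf{A}$ by a vector $\mathbf{y}$, is defined by the requirement that

$$(\mathbf{y} \cdot \mathbf{A}) \cdot \mathbf{x} = \mathbf{y} \cdot (\mathbf{A} \cdot \mathbf{x})$$

for all vectors $\mathbf{x}$.

A simple but important tensor is the unit tensor denoted by $\mathbf{E}$ and defined by the requirement that for any $\mathbf{x}$

$$\mathbf{E} \cdot \mathbf{x} = \mathbf{x} \cdot \mathbf{E} = \mathbf{x}. \tag{3.4}$$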

It is evident that in any Cartesian frame $\mathbf{i}_i$ we must have

$$\mathbf{E} = \sum_{i=1}^{3} \mathbf{i}_i \mathbf{i}_i. \tag{3.5}$$

In any frame we have

$$\mathbf{E} = \mathbf{e}^i \mathbf{e}_i = \mathbf{e}_j \mathbf{e}^j \tag{3.6}$$

for the mixed components. Consequently, the raising and lowering of indices gives

$$\mathbf{E} = g_{ij} \mathbf{e}^i \mathbf{e}^j = g^{ij} \mathbf{e}_i \mathbf{e}_j \tag{3.7}$$

in non-mixed components. We see that the role of the unit tensor belongs to the metric tensor! Throughout our discussion of second-order tensors we shall emphasize the close analogy between tensor theory and matrix theory. Equations (3.5) and (3.6) show that the matrix representation of $\mathbf{E}$ in either Cartesian or mixed components is the $3 \times 3$ identity matrix

$$
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
$$

This does not hold for the non-mixed components of (3.7).

Exercise 3.2. Use (3.4) along with (2.4) and (2.5) to show that the various components of $\mathbf{E}$ are given by

$$
\begin{aligned}
e_{ij} &= g_{ij}, & e^i_{\cdot j} &= \delta_j^i, \\
e^{ij} &= g^{ij}, & e_i^{\cdot j} &= \delta_i^j.
\end{aligned}
$$

Hence establish (3.6).

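Our consideration of $\mathbf{A}$ as an operator leads us to introduce the notion of an inverse tensor: if

$$\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{E},$$

then $\mathbf{A}^{-1}$ is called the inverse of $\mathbf{A}$. The inverse of a tensor is also a tensor. An important special case occurs when the matrix of the tensor is a diagonal matrix. If in a Cartesian frame $\mathbf{i}_i$ we have

$$\mathbf{A} = \sum_{i=1}^{3} \lambda_i \mathbf{i}_i \mathbf{i}_i$$

then the corresponding matrix representation is

$$
\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}.
$$

If we take

$$\mathbf{B} = \sum_{j=1}^{3} \lambda_j^{-1} \mathbf{i}_j \mathbf{i}_j$$

to which there corresponds the matrix

$$
\begin{pmatrix} \lambda_1^{-1} & 0 & 0 \\ 0 & \lambda_2^{-1} & 0 \\ 0 & 0 & \lambda_3^{-1} \end{pmatrix}
$$

and form the dot product $\mathbf{A} \cdot \mathbf{B}$, we get

$$
\begin{aligned}
\mathbf{A} \cdot \mathbf{B} &= \sum_{i=1}^{3} \lambda_i \mathbf{i}_i \mathbf{i}_i \cdot \sum_{j=1}^{3} \lambda_j^{-1} \mathbf{i}_j \mathbf{i}_j \\
&= \sum_{i=1}^{3} \lambda_i \mathbf{i}_i \sum_{j=1}^{3} \lambda_j^{-1} (\mathbf{i}_i \cdot \mathbf{i}_j) \mathbf{i}_j \\
&= \sum_{i=1}^{3} \lambda_i \mathbf{i}_i \lambda_i^{-1} \mathbf{i}_i \\
&= \sum_{i=1}^{3} \mathbf{i}_i \mathbf{i}_i \\
&= \mathbf{E}.
\end{aligned}
$$

This means that $\mathbf{B} = \mathbf{A}^{-1}$. Correspondingly,

$$
\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}
\begin{pmatrix} \lambda_1^{-1} & 0 & 0 \\ 0 & \lambda_2^{-1} & 0 \\ 0 & 0 & \lambda_3^{-1} \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
$$

Exercise 3.3. Establish the formula

$$(\mathbf{A} \cdot \mathbf{B})^{-1} = \mathbf{B}^{-1} \cdot \mathbf{A}^{-1}$$

for invertible tensors $\mathbf{A}$, $\mathbf{B}$.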


A second-order tensor $\mathbf{A}$ is singular if $\mathbf{A} \cdot \mathbf{x} = \mathbf{0}$ for some $\mathbf{x} \neq \mathbf{0}$. Hence $\mathbf{A}$ is nonsingular if $\mathbf{A} \cdot \mathbf{x} = \mathbf{0}$ only when $\mathbf{x} = \mathbf{0}$. Recall that a matrix $A$ is said to be nonsingular if and only if $\det A \neq 0$. The connection between the uses of this terminology in the two areas is as follows. If we take a mixed representation of the tensor $\mathbf{A}$, the equation $\mathbf{A} \cdot \mathbf{x} = \mathbf{0}$ yields a set of simultaneous equations in the components of $\mathbf{x}$; these equations have a nontrivial solution if and only if the determinant of the coefficient matrix (i.e., the matrix representing $\mathbf{A}$) is zero. Moreover, taking any other representation of $\mathbf{A}$ and a dual representation of the vector, we again arrive at the same conclusion regarding the determinant. This brings the use of the term “singular” to the tensor $\mathbf{A}$. By definition the determinant of a second-order tensor $\mathbf{A}$, denoted $\det \mathbf{A}$, is the determinant of the matrix of its mixed components:

$$\det \mathbf{A} = |a_i^{\cdot j}| = |a^k_{\cdot m}| = \frac{1}{g} |a_{st}| = g |a^{pq}|.$$

The first equality is the definition as stated above; the rest are left for the reader to establish. Various other formulas such as

$$\det \mathbf{A} = \frac{1}{6} \epsilon_{ijk} \epsilon^{mnp} a_m^{\cdot i} a_n^{\cdot j} a_p^{\cdot k}$$

can be established for the determinant.

We close this section with an important remark. We can derive all desired properties of a tensor, and perform actions with the tensor, in any coordinate frame. Convenience will often dictate the use of Cartesian frames. But if we obtain an equation or expression through the use of a Cartesian frame and can subsequently represent this result in non-coordinate form, then we have provided rigorous justification of the latter. As we have said before, tensors are objective entities and ultimately all results pertaining to them must be frame independent.
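The chain of equalities for the determinant of a tensor can be verified numerically on a sample tensor and a sample oblique basis. The plain-Python sketch below is only an illustration; the Cartesian matrix `T`, the basis, and the names (`mixed`, `a_lo`, `a_up`, `g`) are assumptions, and `g` denotes the determinant of the matrix of metric components with lower indices.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def apply(T, v):
    return [dot(row, v) for row in T]


R = range(3)
e = [[2, 0, 0], [1, 1, 0], [1, 1, 1]]  # sample basis (assumption)
V = dot(e[0], cross(e[1], e[2]))
dual = [[Fraction(c) / V for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
        for i in R]
T = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]  # Cartesian matrix of a sample tensor

g = det3([[dot(e[i], e[j]) for j in R] for i in R])
mixed = [[dot(e[i], apply(T, dual[j])) for j in R] for i in R]   # mixed components
a_lo = [[dot(e[i], apply(T, e[j])) for j in R] for i in R]       # lower indices
a_up = [[dot(dual[i], apply(T, dual[j])) for j in R] for i in R]  # upper indices

print(det3(T), det3(mixed), det3(a_lo) / g, g * det3(a_up))
```

All four printed values coincide (25 for this sample), so the mixed-component determinant agrees with the Cartesian one and with the two rescaled non-mixed determinants.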

3.3 Dyadic Components Under Transformation

The standpoint for deriving the transformation rules is that in any basis a tensor is the same element of some space, and only (3.1) and the rules we derived for vectors can govern the rules for transforming the components of a tensor. Let us begin with the transformation of the components when we go to the reciprocal basis. We set

\[ a^{ij}\mathbf{e}_i\mathbf{e}_j = a_{ij}\mathbf{e}^i\mathbf{e}^j \]

and take dot products as in (3.2) and (3.3):

\[ \mathbf{e}_k\cdot a^{ij}\mathbf{e}_i\mathbf{e}_j\cdot\mathbf{e}_m = \mathbf{e}_k\cdot a_{ij}\mathbf{e}^i\mathbf{e}^j\cdot\mathbf{e}_m. \]

This gives

\[ a^{ij}(\mathbf{e}_k\cdot\mathbf{e}_i)(\mathbf{e}_j\cdot\mathbf{e}_m) = a_{ij}(\mathbf{e}_k\cdot\mathbf{e}^i)(\mathbf{e}^j\cdot\mathbf{e}_m), \]

hence

\[ a_{km} = a^{ij}g_{ki}g_{jm}. \]

We see that the components of the metric tensor are encountered in this transformation.

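Now we can construct the formulas for transforming the tensor components when the change of basis takes the general form

\[ \tilde{\mathbf{e}}_i = A_i^j\,\mathbf{e}_j. \]

From

\[ \tilde{a}^{ij}\tilde{\mathbf{e}}_i\tilde{\mathbf{e}}_j = a^{km}\mathbf{e}_k\mathbf{e}_m = a^{km}\tilde{A}_k^p\tilde{\mathbf{e}}_p\,\tilde{A}_m^q\tilde{\mathbf{e}}_q \]

we obtain

\[ \tilde{a}^{ij} = a^{km}\tilde{A}_k^i\tilde{A}_m^j. \tag{3.8} \]

Similarly, the inverse transformation

\[ \mathbf{e}_i = \tilde{A}_i^j\,\tilde{\mathbf{e}}_j \]

leads to

\[ a^{ij} = \tilde{a}^{km}A_k^iA_m^j. \tag{3.9} \]

Equations (3.8) and (3.9) together imply that

\[ \tilde{A}_j^k A_k^i = \delta_j^i. \]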

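Various expressions for $\mathbf{A}$,

\[
\begin{aligned}
\mathbf{A} &= a^{ij}\mathbf{e}_i\mathbf{e}_j = a_{kl}\mathbf{e}^k\mathbf{e}^l = a_i^{\cdot j}\mathbf{e}^i\mathbf{e}_j = a^k_{\cdot l}\mathbf{e}_k\mathbf{e}^l \\
&= \tilde{a}^{ij}\tilde{\mathbf{e}}_i\tilde{\mathbf{e}}_j = \tilde{a}_{kl}\tilde{\mathbf{e}}^k\tilde{\mathbf{e}}^l = \tilde{a}_i^{\cdot j}\tilde{\mathbf{e}}^i\tilde{\mathbf{e}}_j = \tilde{a}^k_{\cdot l}\tilde{\mathbf{e}}_k\tilde{\mathbf{e}}^l,
\end{aligned}
\]

lead to other transformation formulas such as

\[ \tilde{a}_{ij} = A_i^kA_j^l\,a_{kl}, \qquad a_{ij} = \tilde{A}_i^k\tilde{A}_j^l\,\tilde{a}_{kl}, \]

and

\[ \tilde{a}_i^{\cdot j} = A_i^k\tilde{A}_l^j\,a_k^{\cdot l}, \qquad \tilde{a}^i_{\cdot j} = \tilde{A}_k^iA_j^l\,a^k_{\cdot l}. \]

Remembering the terminology of § 2.5, we see why the $a_{ij}$ are called the covariant components of $\mathbf{A}$ while the $a^{ij}$ are called the contravariant components. The components $a^i_{\cdot j}$ and $a_i^{\cdot j}$ are called mixed components.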
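The rules above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the tensor is stored through its matrix `T` in a Cartesian frame, and all names (`reciprocal`, `components`, `max_diff`, the sample values of `T`, `e`, `A`) are hypothetical. It verifies the index-lowering rule $a_{km} = a^{ij}g_{ki}g_{jm}$ and the covariant rule $\tilde{a}_{ij} = A_i^kA_j^l a_{kl}$ for one oblique basis.

```python
# Illustrative check of the component transformation rules (all names are
# hypothetical; the tensor is stored through its Cartesian matrix T).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def reciprocal(e):
    """Reciprocal basis e^i, defined by e^i . e_j = delta^i_j."""
    vol = dot(e[0], cross(e[1], e[2]))
    return [[c / vol for c in cross(e[(i + 1) % 3], e[(i + 2) % 3])]
            for i in range(3)]

def components(T, u):
    """Numbers u_i . T . u_j for a list u of three vectors."""
    Tu = [[dot(row, uj) for row in T] for uj in u]        # Tu[j] = T . u_j
    return [[dot(ui, Tu[j]) for j in range(3)] for ui in u]

R = range(3)
T = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 1.0]]   # the tensor itself
e = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 2.0]]   # oblique basis e_i
A = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]   # A[i][j] = A_i^j

g = [[dot(e[i], e[j]) for j in R] for i in R]             # metric g_ij
a_cov = components(T, e)                                  # a_ij
a_con = components(T, reciprocal(e))                      # a^ij

# Lowering both indices: a_km = a^ij g_ki g_jm.
lowered = [[sum(a_con[i][j] * g[k][i] * g[j][m] for i in R for j in R)
            for m in R] for k in R]

# New basis e~_i = A_i^j e_j and the covariant rule a~_ij = A_i^k A_j^l a_kl.
e_new = [[sum(A[i][j] * e[j][c] for j in R) for c in R] for i in R]
direct = components(T, e_new)
by_rule = [[sum(A[i][k] * A[j][l] * a_cov[k][l] for k in R for l in R)
            for j in R] for i in R]

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in R for j in R)

print(max_diff(lowered, a_cov), max_diff(direct, by_rule))
```

Both printed differences should be at the level of rounding error, which reflects the fact that the component sets describe one and the same tensor.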


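Quick summary

We have

\[
\begin{aligned}
\mathbf{A} &= \tilde{a}^{ij}\tilde{\mathbf{e}}_i\tilde{\mathbf{e}}_j = \tilde{a}_{kl}\tilde{\mathbf{e}}^k\tilde{\mathbf{e}}^l = \tilde{a}_i^{\cdot j}\tilde{\mathbf{e}}^i\tilde{\mathbf{e}}_j = \tilde{a}^k_{\cdot l}\tilde{\mathbf{e}}_k\tilde{\mathbf{e}}^l \\
&= a^{ij}\mathbf{e}_i\mathbf{e}_j = a_{kl}\mathbf{e}^k\mathbf{e}^l = a_i^{\cdot j}\mathbf{e}^i\mathbf{e}_j = a^k_{\cdot l}\mathbf{e}_k\mathbf{e}^l
\end{aligned}
\]

where

\[
\begin{aligned}
\tilde{a}^{ij} &= \tilde{A}_k^i\tilde{A}_l^j\,a^{kl}, & a^{ij} &= A_k^iA_l^j\,\tilde{a}^{kl}, \\
\tilde{a}_{ij} &= A_i^kA_j^l\,a_{kl}, & a_{ij} &= \tilde{A}_i^k\tilde{A}_j^l\,\tilde{a}_{kl}, \\
\tilde{a}^i_{\cdot j} &= \tilde{A}_k^iA_j^l\,a^k_{\cdot l}, & a^i_{\cdot j} &= A_k^i\tilde{A}_j^l\,\tilde{a}^k_{\cdot l}, \\
\tilde{a}_i^{\cdot j} &= A_i^k\tilde{A}_l^j\,a_k^{\cdot l}, & a_i^{\cdot j} &= \tilde{A}_i^kA_l^j\,\tilde{a}_k^{\cdot l}.
\end{aligned}
\]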

Exercise 3.4. (a) Express the transformation law

\[ \tilde{b}^{ij} = \tilde{A}_k^i\tilde{A}_m^j\,b^{km} \]

in matrix notation. (b) Repeat for a transformation law of the form

\[ \tilde{b}_{ij} = A_i^kA_j^m\,b_{km}. \]

Exercise 3.5. Our $A_i^j$ values give the transformation from one basis to another; they define a transformation of the space that is an operator, and hence a second-order tensor. Write out the tensor for which the $A_i^j$ are components. Repeat for the inverse transformation.

3.4 More Dyadic Operations

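The dot product of two dyads $\mathbf{ab}$ and $\mathbf{cd}$ is defined by

\[ \mathbf{ab}\cdot\mathbf{cd} = (\mathbf{b}\cdot\mathbf{c})\,\mathbf{ad}. \]

The result is again a dyad, with a coefficient $\mathbf{b}\cdot\mathbf{c}$. Extensions of this and the formulas (3.2) and (3.3) to operations with sums of dyads and vectors (using (3.1) and the vectorial rules) give us a number of rules which the dot product obeys. Let $\mathbf{A}$ and $\mathbf{B}$ be dyads, $\mathbf{a}$ and $\mathbf{b}$ be vectors, and $\lambda$ and $\mu$ be any real numbers. Then

\[
\begin{aligned}
\mathbf{A}\cdot(\lambda\mathbf{a} + \mu\mathbf{b}) &= \lambda\,\mathbf{A}\cdot\mathbf{a} + \mu\,\mathbf{A}\cdot\mathbf{b}, \\
(\lambda\mathbf{A} + \mu\mathbf{B})\cdot\mathbf{a} &= \lambda\,\mathbf{A}\cdot\mathbf{a} + \mu\,\mathbf{B}\cdot\mathbf{a}.
\end{aligned}
\]

Similar identities hold for dot products taken in the opposite orders. These results show that linearity may be assumed in working with these operations.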

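Now let us pursue the close analogy between the dot product and matrix multiplication. We begin with the simple case of a Cartesian frame. We take a dyad $\mathbf{A}$ and a vector $\mathbf{b}$ and express these relative to a basis $\mathbf{i}_k$:

\[ \mathbf{A} = a^{km}\mathbf{i}_k\mathbf{i}_m, \qquad \mathbf{b} = b^j\mathbf{i}_j. \]

Denoting

\[ \mathbf{c} = \mathbf{A}\cdot\mathbf{b} \tag{3.10} \]

we have

\[ c^k\mathbf{i}_k = a^{km}\mathbf{i}_k\mathbf{i}_m\cdot b^j\mathbf{i}_j \]

so that

\[ c^k = \sum_{j=1}^{3} a^{kj}b^j. \]

(We have inserted the summation symbol because $j$ stands in the upper position twice and the summation convention would not apply.) Written out, this is the system of three equations

\[
\begin{aligned}
c^1 &= a^{11}b^1 + a^{12}b^2 + a^{13}b^3, \\
c^2 &= a^{21}b^1 + a^{22}b^2 + a^{23}b^3, \\
c^3 &= a^{31}b^1 + a^{32}b^2 + a^{33}b^3,
\end{aligned}
\]

or

\[
\begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix}
=
\begin{pmatrix} a^{11} & a^{12} & a^{13} \\ a^{21} & a^{22} & a^{23} \\ a^{31} & a^{32} & a^{33} \end{pmatrix}
\begin{pmatrix} b^1 \\ b^2 \\ b^3 \end{pmatrix}.
\]

Here we have a matrix equation of the form

\[ c = Ab \tag{3.11} \]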

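where $c$ and $b$ are column vectors and $A$ is a 3 × 3 matrix. The analogy between the dot product and matrix multiplication is evident from (3.10) and (3.11). This analogy extends beyond the confines of Cartesian frames. Let us write, for example,

\[ \mathbf{A} = a^{km}\mathbf{e}_k\mathbf{e}_m, \qquad \mathbf{b} = b_j\mathbf{e}^j. \]

This time $\mathbf{c} = \mathbf{A}\cdot\mathbf{b}$ gives

\[ c^k\mathbf{e}_k = a^{km}\mathbf{e}_k\mathbf{e}_m\cdot b_j\mathbf{e}^j = a^{km}\mathbf{e}_k\,\delta_m^j\,b_j, \]

hence

\[ c^k = a^{kj}b_j. \]

The corresponding matrix equation is, of course,

\[
\begin{pmatrix} c^1 \\ c^2 \\ c^3 \end{pmatrix}
=
\begin{pmatrix} a^{11} & a^{12} & a^{13} \\ a^{21} & a^{22} & a^{23} \\ a^{31} & a^{32} & a^{33} \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}.
\]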

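With suitable understanding we could still write this as (3.11). Note what happens when we express both the dyad and the vector in terms of covariant components:

\[ \mathbf{A} = a_{km}\mathbf{e}^k\mathbf{e}^m, \qquad \mathbf{b} = b_j\mathbf{e}^j. \]

We obtain

\[ c_k = a_{km}g^{mj}b_j \]

and the metric tensor appears. The corresponding matrix form is

\[
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} g^{11} & g^{12} & g^{13} \\ g^{21} & g^{22} & g^{23} \\ g^{31} & g^{32} & g^{33} \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}.
\]

Because the metric tensor can raise an index on a vector component, we may also write these equations in the forms

\[ c_k = a_{km}b^m \]

and

\[
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} b^1 \\ b^2 \\ b^3 \end{pmatrix}.
\]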

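Let us examine the dot product between two dyads. There are various possibilities for the components of the dyad

\[ \mathbf{C} = \mathbf{A}\cdot\mathbf{B}, \]

depending on how we choose to express $\mathbf{A}$ and $\mathbf{B}$. If we use all contravariant components and write

\[ \mathbf{A} = a^{km}\mathbf{e}_k\mathbf{e}_m, \qquad \mathbf{B} = b^{km}\mathbf{e}_k\mathbf{e}_m, \]

then

\[ \mathbf{C} = c^{kn}\mathbf{e}_k\mathbf{e}_n \]

where

\[ c^{kn} = a^{km}g_{mj}b^{jn}. \]

Similarly, the use of all covariant components as in

\[ \mathbf{A} = a_{km}\mathbf{e}^k\mathbf{e}^m, \qquad \mathbf{B} = b_{km}\mathbf{e}^k\mathbf{e}^m, \]

leads to

\[ \mathbf{C} = c_{kn}\mathbf{e}^k\mathbf{e}^n \]

where

\[ c_{kn} = a_{km}g^{mj}b_{jn}. \]

Mixed components appear when we express

\[ \mathbf{A} = a^{km}\mathbf{e}_k\mathbf{e}_m, \qquad \mathbf{B} = b_{km}\mathbf{e}^k\mathbf{e}^m. \]

Then

\[ \mathbf{C} = \mathbf{A}\cdot\mathbf{B} = a^{km}\mathbf{e}_k\mathbf{e}_m\cdot b_{jn}\mathbf{e}^j\mathbf{e}^n = a^{km}\delta_m^j b_{jn}\,\mathbf{e}_k\mathbf{e}^n = a^{kj}b_{jn}\,\mathbf{e}_k\mathbf{e}^n. \]

Defining

\[ c^k_{\cdot n} = a^{kj}b_{jn} \]

we have

\[ \mathbf{C} = c^k_{\cdot n}\,\mathbf{e}_k\mathbf{e}^n. \]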

We leave other possibilities to the reader as

Exercise 3.6. (a) Discuss how the formulation $c_k^{\cdot n} = a_{kj}b^{jn}$ arises. (b) Show how all the forms above correspond to matrix multiplication. (c) What happens if mixed components are used on the right-hand sides to express $\mathbf{A}$ and $\mathbf{B}$?

Another useful operation that can be performed between tensors is double dot multiplication. If $\mathbf{ab}$ and $\mathbf{cd}$ are dyads, we define

\[ \mathbf{ab}\cdot\cdot\,\mathbf{cd} = (\mathbf{b}\cdot\mathbf{c})(\mathbf{a}\cdot\mathbf{d}). \]

That is, we first dot multiply the near standing vectors, then the remaining vectors, and thereby obtain a scalar as the result.

Exercise 3.7. (a) Calculate $\mathbf{A}\cdot\cdot\,\mathbf{E}$ if $\mathbf{A}$ is a tensor of order two. How does this relate to the trace of the matrix that represents $\mathbf{A}$ in mixed components? (b) Let $\mathbf{A}$ and $\mathbf{B}$ be tensors of order two. Write down several different component forms for the quantity $\mathbf{A}\cdot\cdot\,\mathbf{B}$.

Yet another operation is the scalar product of two second-order tensors $\mathbf{A}$ and $\mathbf{B}$, denoted by $\mathbf{A}\bullet\mathbf{B}$. This represents a natural extension of the operation

\[ \mathbf{ab}\bullet\mathbf{cd} = (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) \]

between two dyads $\mathbf{ab}$ and $\mathbf{cd}$.

3.5 Properties of Second-Order Tensors

Now we would like to consider in more detail those tensors that occur most frequently in applications: tensors of order two. First we recall that such a tensor is represented in dyadic form as

\[ \mathbf{A} = a^{ij}\mathbf{e}_i\mathbf{e}_j. \]

Those who work in the applied sciences are probably more accustomed to the matrix representation

\[
\begin{pmatrix} a^{11} & a^{12} & a^{13} \\ a^{21} & a^{22} & a^{23} \\ a^{31} & a^{32} & a^{33} \end{pmatrix}. \tag{3.12}
\]

When we use the matrix form (3.12) the dyadic basis of the tensor remains implicit. Of course when we use a unique, say Cartesian, frame for the space of vectors, then it does not matter whether we show the dyads. The correspondence between the dyadic and matrix representations suggests that we can introduce many familiar ideas from the theory of matrices.

The tensor transpose

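Let us begin with the notion of transposition. For a matrix $A = (a_{ij})$ the transposed matrix $A^T$ is

\[ A^T = (a_{ji}). \]

Similarly we introduce the transpose operation for the tensor $\mathbf{A}$:

\[ \mathbf{A}^T = a^{ji}\mathbf{e}_i\mathbf{e}_j. \tag{3.13} \]

This operation yields a new tensor, in each representation of which the corresponding indices appear in reverse order:

\[ \mathbf{A}^T = a^{ji}\mathbf{e}_i\mathbf{e}_j = a_{ji}\mathbf{e}^i\mathbf{e}^j = a^j_{\cdot i}\,\mathbf{e}^i\mathbf{e}_j = a_j^{\cdot i}\,\mathbf{e}_i\mathbf{e}^j. \]

A useful relation for any second-order tensor $\mathbf{A}$ and any vector $\mathbf{x}$ is

\[ \mathbf{A}\cdot\mathbf{x} = \mathbf{x}\cdot\mathbf{A}^T. \tag{3.14} \]

This follows when we write $\mathbf{x} = x^k\mathbf{e}_k$ and use (3.13) to see that

\[ \mathbf{A}^T = a^{ij}\mathbf{e}_j\mathbf{e}_i. \]

Equation (3.14) can be used to define the transpose. Also note that

\[ (\mathbf{A}^T)^T = \mathbf{A} \]

for any second-order tensor $\mathbf{A}$.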

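Exercise 3.8. Let $\mathbf{A}$ and $\mathbf{B}$ be tensors of order two. Demonstrate that

\[ \mathbf{A}\bullet\mathbf{B} = \mathbf{A}\cdot\cdot\,\mathbf{B}^T = \mathbf{A}^T\cdot\cdot\,\mathbf{B}. \]

Exercise 3.9. Let $\mathbf{A}$ be a second-order tensor. Find $\mathbf{A}\cdot\cdot\,\mathbf{A}^T$. Demonstrate that $\mathbf{A}\cdot\cdot\,\mathbf{A}^T = 0$ if and only if $\mathbf{A} = \mathbf{0}$.

Exercise 3.10. (a) Show that if $\mathbf{A}$ and $\mathbf{B}$ are tensors of order two, then

\[ (\mathbf{A}\cdot\mathbf{B})^T = \mathbf{B}^T\cdot\mathbf{A}^T. \]

(b) Let $\mathbf{a}$ and $\mathbf{b}$ be vectors and $\mathbf{C}$ be a tensor of order two. Show that

\[ \mathbf{a}\cdot\mathbf{C}^T\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{C}\cdot\mathbf{a}. \]

(c) Show that if $\mathbf{A}$ is a nonsingular tensor of order two, then the components of the tensor $\mathbf{B} = \mathbf{A}^{-1}$ are given by the formulas

\[ b_i^{\cdot j} = \frac{1}{2\det\mathbf{A}}\,\varepsilon_{ikl}\,\varepsilon^{jmn}\,a_m^{\cdot k}\,a_n^{\cdot l}. \]

(d) Verify the following relations:

\[
\begin{aligned}
\det\mathbf{A}^{-1} &= (\det\mathbf{A})^{-1}, & (\mathbf{A}\cdot\mathbf{B})^{-1} &= \mathbf{B}^{-1}\cdot\mathbf{A}^{-1}, \\
(\mathbf{A}^T)^{-1} &= (\mathbf{A}^{-1})^T, & (\mathbf{A}^{-1})^{-1} &= \mathbf{A}.
\end{aligned}
\]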


Tensors raised to powers

By analogy with matrix algebra we may raise a tensor to a positive integer power:

\[ \mathbf{A}^2 = \mathbf{A}\cdot\mathbf{A}, \qquad \mathbf{A}^3 = \mathbf{A}\cdot\mathbf{A}^2, \qquad \mathbf{A}^4 = \mathbf{A}\cdot\mathbf{A}^3, \]

and so on. Note that $\mathbf{A}^k$ still represents a linear operator. Negative integer powers are defined by raising $\mathbf{A}^{-1}$ to positive integer powers:

\[ \mathbf{A}^{-2} = \mathbf{A}^{-1}\cdot\mathbf{A}^{-1}, \qquad \mathbf{A}^{-3} = \mathbf{A}^{-2}\cdot\mathbf{A}^{-1}, \qquad \mathbf{A}^{-4} = \mathbf{A}^{-3}\cdot\mathbf{A}^{-1}, \]

and so on. These operations can be used to construct functions of tensors using Taylor expansions of elementary functions. For example,

\[ e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots. \]

By this we can introduce the exponential of the tensor $\mathbf{A}$:

\[ e^{\mathbf{A}} = \mathbf{E} + \frac{\mathbf{A}}{1!} + \frac{\mathbf{A}^2}{2!} + \frac{\mathbf{A}^3}{3!} + \cdots. \]

The issue of convergence of such series is approached in a manner similar to the absolute convergence of usual series, but with use of a norm of the tensor $\mathbf{A}$ (see § 3.12). Note that $e^{\mathbf{A}}$ represents a linear operator. We can introduce other functions similarly. This technique is used in the study of nonlinear elasticity, for example.
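As an illustration of the series definition, here is a minimal sketch that sums a fixed number of terms of the exponential series for a tensor given by its matrix in a Cartesian frame, where the dot product reduces to matrix multiplication. The function names and the default number of terms are assumptions made for this example; a truncated sum of this kind is adequate only for tensors of moderate norm.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tensor_exp(A, terms=30):
    """Partial sum E + A/1! + A^2/2! + ... with `terms` terms in total."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]   # E
    term = [row[:] for row in total]                                # A^0 / 0!
    for k in range(1, terms):
        # A^k / k! follows from A^(k-1) / (k-1)! by one product and a division.
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

D = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
N = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
exp_D = tensor_exp(D)   # diagonal entries approximate e^1, e^2, e^0
exp_N = tensor_exp(N)   # the series terminates here: E + N + N^2/2
```

For the diagonal example the series acts on each diagonal entry separately, and for the nilpotent example only the first three terms are nonzero, so both results can be compared with closed-form values.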

Symmetric and antisymmetric tensors

Among the class of all second-order tensors, an important role is played by the symmetric tensors. These include the strain and stress tensors of the theory of elasticity. The tensor of inertia is symmetric, as is the metric tensor. All these satisfy the relation

\[ \mathbf{A} = \mathbf{A}^T. \]

It follows from (3.14) that

\[ \mathbf{A}\cdot\mathbf{x} = \mathbf{x}\cdot\mathbf{A} \tag{3.15} \]

and

\[ (\mathbf{A}\cdot\mathbf{x})\cdot\mathbf{y} = \mathbf{x}\cdot(\mathbf{A}\cdot\mathbf{y}) \tag{3.16} \]

if $\mathbf{A}$ is symmetric. The reader will recall that the unit tensor $\mathbf{E}$ satisfies a relation of the form (3.15). A tensor $\mathbf{A}$ is said to be antisymmetric if

\[ \mathbf{A} = -\mathbf{A}^T. \]

Exercise 3.11. Give the matrix forms corresponding to the cases of symmetric and antisymmetric tensors. How many components can be independently specified for a symmetric tensor? For an antisymmetric tensor?

Both symmetric and antisymmetric tensors arise naturally in the physical sciences. Their significance is also shown by the following

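Theorem 3.1. Any second-order tensor can be decomposed as a sum of symmetric and antisymmetric tensors:

\[ \mathbf{A} = \mathbf{B} + \mathbf{C} \]

where $\mathbf{B} = \mathbf{B}^T$ and $\mathbf{C} = -\mathbf{C}^T$.

Proof. Take

\[ \mathbf{B} = \frac{1}{2}\left(\mathbf{A} + \mathbf{A}^T\right), \qquad \mathbf{C} = \frac{1}{2}\left(\mathbf{A} - \mathbf{A}^T\right), \]

and check all the statements.

The dyad $\mathbf{ab}$ can be decomposed into symmetric and antisymmetric parts as

\[ \mathbf{ab} = \frac{1}{2}(\mathbf{ab} + \mathbf{ba}) + \frac{1}{2}(\mathbf{ab} - \mathbf{ba}) \]

for example.

Exercise 3.12. Show that if $\mathbf{A}$ is symmetric and $\mathbf{B}$ is antisymmetric then $\mathbf{A}\cdot\cdot\,\mathbf{B} = 0$.

Exercise 3.13. Demonstrate that the quadratic form $\mathbf{x}\cdot\mathbf{A}\cdot\mathbf{x}$ does not change if the second-order tensor $\mathbf{A}$ is replaced by its symmetric part.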

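Given an antisymmetric tensor $\mathbf{C} = c_{ij}\mathbf{i}_i\mathbf{i}_j$ in a Cartesian frame, we can construct a vector

\[ \boldsymbol{\omega} = \omega_k\mathbf{i}_k \]

according to the formulas

\[ \omega_1 = c_{32}, \qquad \omega_2 = c_{13}, \qquad \omega_3 = c_{21}. \]

It is easy to verify directly that

\[ \mathbf{C}\cdot\mathbf{x} = \boldsymbol{\omega}\times\mathbf{x}, \qquad \mathbf{x}\cdot\mathbf{C} = \mathbf{x}\times\boldsymbol{\omega}, \]

where $\mathbf{x}$ is an arbitrary vector. These formulas are written in non-coordinate form so they hold in any frame. The reader can derive the formulas for $\boldsymbol{\omega}$, which is called the conjugate vector, for an arbitrary frame.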
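The decomposition of Theorem 3.1 and the conjugate vector can be tried on a concrete example. The sketch below works in a Cartesian frame with 0-based list indices, so that $\omega_1 = c_{32}$ becomes `C[2][1]`; the function names and the sample tensor are hypothetical choices made for this illustration.

```python
def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def split(A):
    """Symmetric part B = (A + A^T)/2 and antisymmetric part C = (A - A^T)/2."""
    At = transpose(A)
    B = [[0.5 * (A[i][j] + At[i][j]) for j in range(3)] for i in range(3)]
    C = [[0.5 * (A[i][j] - At[i][j]) for j in range(3)] for i in range(3)]
    return B, C

def conjugate_vector(C):
    # omega_1 = c_32, omega_2 = c_13, omega_3 = c_21, shifted to 0-based indices.
    return [C[2][1], C[0][2], C[1][0]]

def apply(A, x):
    """Components of A . x in a Cartesian frame."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
x = [1.0, 0.0, 2.0]

B, C = split(A)
omega = conjugate_vector(C)
lhs = apply(C, x)          # C . x
rhs = cross(omega, x)      # omega x x
print(omega, lhs, rhs)
```

For this sample the two vectors `lhs` and `rhs` coincide, as the relation $\mathbf{C}\cdot\mathbf{x} = \boldsymbol{\omega}\times\mathbf{x}$ requires.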

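The cross-products of a tensor $\mathbf{A} = a^{ij}\mathbf{e}_i\mathbf{e}_j$ and a vector $\mathbf{x}$ are defined by the formulas

\[ \mathbf{A}\times\mathbf{x} = a^{ij}\mathbf{e}_i(\mathbf{e}_j\times\mathbf{x}), \qquad \mathbf{x}\times\mathbf{A} = a^{ij}(\mathbf{x}\times\mathbf{e}_i)\mathbf{e}_j. \]

Exercise 3.14. Show that $\mathbf{C} = \mathbf{E}\times\boldsymbol{\omega} = \boldsymbol{\omega}\times\mathbf{E}$.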

3.6 Eigenvalues and Eigenvectors of a Second-Order Symmetric Tensor

We now consider the question of which basis yields a tensor of simplest form. As the analogous question in matrix theory relates to eigenvalues and eigenvectors, we extend these notions to tensors. The pair

\[ (\lambda, \mathbf{x}) \qquad (\mathbf{x} \neq \mathbf{0}) \]

is called an eigenpair if the equality

\[ \mathbf{A}\cdot\mathbf{x} = \lambda\mathbf{x} \tag{3.17} \]

holds. Hence $\mathbf{x}$ is an eigenvector of $\mathbf{A}$ if $\mathbf{A}$ operates on $\mathbf{x}$ to give a vector proportional to $\mathbf{x}$. Equation (3.17) may also be written in the form

\[ (\mathbf{A} - \lambda\mathbf{E})\cdot\mathbf{x} = \mathbf{0}. \]

Exercise 3.15. Find the eigenpairs of the dyad $\mathbf{ab}$. Now try to position an eigenvector on the left: $\mathbf{x}\cdot\mathbf{ab} = \lambda\mathbf{x}$. You should find that this differs from the previous eigenvector, so it makes sense to introduce left and right eigenvectors. In the case of a symmetric tensor they coincide.

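The eigenvalues of a second-order tensor $\mathbf{A}$ are found as solutions of the characteristic equation for $\mathbf{A}$, which is derived as follows. In components (3.17) becomes

\[ a^{ij}\mathbf{e}_i\mathbf{e}_j\cdot x^k\mathbf{e}_k = \lambda x^i\mathbf{e}_i \]

or

\[ a^{ij}g_{jk}x^k\mathbf{e}_i = \lambda x^k\delta_k^i\mathbf{e}_i. \]

Writing this as

\[ (a^i_{\cdot k} - \lambda\delta^i_k)\,x^k = 0, \]

we have a system of three simultaneous equations in the three variables $x^k$. A nontrivial solution exists if and only if the determinant of the coefficient matrix vanishes:

\[
\begin{vmatrix}
a^1_{\cdot 1} - \lambda & a^1_{\cdot 2} & a^1_{\cdot 3} \\
a^2_{\cdot 1} & a^2_{\cdot 2} - \lambda & a^2_{\cdot 3} \\
a^3_{\cdot 1} & a^3_{\cdot 2} & a^3_{\cdot 3} - \lambda
\end{vmatrix} = 0.
\]

This is the characteristic equation¹ for $\mathbf{A}$. Writing it in the form

\[ -\lambda^3 + I_1(\mathbf{A})\lambda^2 - I_2(\mathbf{A})\lambda + I_3(\mathbf{A}) = 0 \tag{3.18} \]

we note that it is cubic in $\lambda$, hence there are at most three distinct eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$. The coefficients $I_1(\mathbf{A})$, $I_2(\mathbf{A})$, and $I_3(\mathbf{A})$ are called the first, second, and third invariants of $\mathbf{A}$, and are expressed in terms of the eigenvalues by the Viète formulas

\[
\begin{aligned}
I_1(\mathbf{A}) &= \lambda_1 + \lambda_2 + \lambda_3, \\
I_2(\mathbf{A}) &= \lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3, \\
I_3(\mathbf{A}) &= \lambda_1\lambda_2\lambda_3.
\end{aligned}
\]

After representing the tensor in diagonal form it will be easy to see that $I_1(\mathbf{A})$ and $I_3(\mathbf{A})$ are, respectively, the trace and determinant of the tensor $\mathbf{A}$. In fact,

\[ I_1(\mathbf{A}) = \operatorname{tr}\mathbf{A}, \qquad I_2(\mathbf{A}) = \frac{1}{2}\left[\operatorname{tr}^2\mathbf{A} - \operatorname{tr}\mathbf{A}^2\right], \qquad I_3(\mathbf{A}) = \det\mathbf{A}. \]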

In nonlinear elasticity, the invariants and eigenvalues of several tensors play important roles in the formulation of various constitutive laws. See, for example, [Lurie (1990); Lurie (2005); Ogden (1997)].

Exercise 3.16. A tensor $\mathbf{A}$, when referred to a certain Cartesian basis, has matrix

\[ \begin{pmatrix} 1 & 0 & 1 \\ 2 & -1 & 0 \\ 0 & 1 & 2 \end{pmatrix}. \]

Find the first, second, and third principal invariants of $\mathbf{A}$.

¹ Note that it is expressed in terms of mixed components of the tensor. However, it characterizes the properties of the tensor and has invariant properties since the eigenvalues of a tensor do not depend on the coordinate frame in which they are obtained.


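In applications, the most important second-order tensors are the real-valued symmetric tensors. These have special properties. For a real-valued tensor that is considered as an operator in the complex space $\mathbb{C}^3$, we have a formula analogous to (3.16):

\[ (\mathbf{A}\cdot\mathbf{x})\cdot\overline{\mathbf{y}} = \mathbf{x}\cdot\overline{(\mathbf{A}\cdot\mathbf{y})}. \tag{3.19} \]

This will be used below. We recall that a bar over an expression denotes complex conjugation.

Theorem 3.2. The eigenvalues of a real symmetric tensor are real. Moreover, eigenvectors corresponding to distinct eigenvalues are orthogonal.

Proof. Let $\mathbf{A}$ be a real symmetric tensor and $\lambda$ an eigenvalue of $\mathbf{A}$ that corresponds to the eigenvector $\mathbf{x} \neq \mathbf{0}$, so

\[ \mathbf{A}\cdot\mathbf{x} = \lambda\mathbf{x}. \]

Dot-multiply both sides of this equality by $\overline{\mathbf{x}}$. It follows that

\[ \lambda = \frac{(\mathbf{A}\cdot\mathbf{x})\cdot\overline{\mathbf{x}}}{\mathbf{x}\cdot\overline{\mathbf{x}}}. \]

Now we prove that $\lambda$ is real. Indeed, $\mathbf{x}\cdot\overline{\mathbf{x}} = |\mathbf{x}|^2$ is positive. To see that $(\mathbf{A}\cdot\mathbf{x})\cdot\overline{\mathbf{x}}$ takes a real value, we write out

\[
\begin{aligned}
\overline{(\mathbf{A}\cdot\mathbf{x})\cdot\overline{\mathbf{x}}} &= \mathbf{x}\cdot\overline{(\mathbf{A}\cdot\mathbf{x})} \\
&= (\mathbf{A}\cdot\mathbf{x})\cdot\overline{\mathbf{x}}
\end{aligned}
\]

(we have used a property of the inner product in a complex linear space, and then (3.19)). Because the eigenvalues $\lambda$ are real, the components of the eigenvectors satisfy a linear system of simultaneous equations having real coefficients; hence they are real as well.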

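To prove the second part of the theorem, suppose that

\[ \mathbf{A}\cdot\mathbf{x}_1 = \lambda_1\mathbf{x}_1, \qquad \mathbf{A}\cdot\mathbf{x}_2 = \lambda_2\mathbf{x}_2, \]

where $\lambda_2 \neq \lambda_1$. From these we obtain

\[ \lambda_1\mathbf{x}_1\cdot\mathbf{x}_2 = (\mathbf{A}\cdot\mathbf{x}_1)\cdot\mathbf{x}_2, \qquad \lambda_2\mathbf{x}_2\cdot\mathbf{x}_1 = (\mathbf{A}\cdot\mathbf{x}_2)\cdot\mathbf{x}_1, \]

and subtraction gives

\[ (\lambda_1 - \lambda_2)\,\mathbf{x}_1\cdot\mathbf{x}_2 = \mathbf{x}_2\cdot\mathbf{A}\cdot\mathbf{x}_1 - \mathbf{x}_1\cdot\mathbf{A}\cdot\mathbf{x}_2 = 0. \]

It follows that

\[ \mathbf{x}_1\cdot\mathbf{x}_2 = 0. \tag{3.20} \]

From (3.20) we may obtain another property of the eigenvectors $\mathbf{x}_1$, $\mathbf{x}_2$, known as generalized orthogonality:

\[ \mathbf{x}_1\cdot\mathbf{A}\cdot\mathbf{x}_2 = 0. \]

This holds for the eigenvectors corresponding to different eigenvalues of a real symmetric tensor $\mathbf{A}$, and is useful in applications.

Exercise 3.17. Show that for any second-order tensor $\mathbf{A}$ (not necessarily symmetric), eigenvectors corresponding to distinct eigenvalues are linearly independent.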

In solving the characteristic equation for $\lambda$, we may find that there are three distinct solutions or fewer than three. Note that if $\mathbf{x}$ is an eigenvector of $\mathbf{A}$ then so is $\alpha\mathbf{x}$ for any $\alpha \neq 0$. In other words, an eigenvector is determined up to a constant multiple. So when the eigenvalues are distinct we can compose a Cartesian frame from the orthonormal eigenvectors $\mathbf{x}_k$ and then express the tensor $\mathbf{A}$ in terms of its components $a^{ij}$ as

\[ \mathbf{A} = \sum a^{ij}\mathbf{x}_i\mathbf{x}_j. \]

Since the frame $\mathbf{x}_k$ is Cartesian the reciprocal basis is the same, and we may calculate the components of $\mathbf{A}$ from

\[ a^{ij} = \mathbf{x}_i\cdot\mathbf{A}\cdot\mathbf{x}_j. \]

Since the $\mathbf{x}_i$ are also eigenvectors we can write

\[ \mathbf{A}\cdot\mathbf{x}_j = \lambda_j\mathbf{x}_j, \]

and dot multiplication by $\mathbf{x}_i$ from the left gives

\[ \mathbf{x}_i\cdot\mathbf{A}\cdot\mathbf{x}_j = \mathbf{x}_i\cdot\lambda_j\mathbf{x}_j = \lambda_j\delta_i^j. \]

Hence the coefficients of the dyads $\mathbf{x}_i\mathbf{x}_j$ in $\mathbf{A}$ are nonzero only for those coefficients that lie on the main diagonal of the matrix representation of $\mathbf{A}$; moreover, these diagonal entries are the eigenvalues of $\mathbf{A}$. We can therefore write

\[ \mathbf{A} = \sum_{i=1}^{3}\lambda_i\mathbf{x}_i\mathbf{x}_i. \tag{3.21} \]

This is called the orthogonal representation of $\mathbf{A}$. The eigenvectors composing the coordinate frame give us the principal axes of $\mathbf{A}$, and the process of referring the tensor to its principal axes is known as diagonalization.


When the characteristic equation of a tensor has fewer than three distinct solutions for $\lambda$, then the repeated eigenvalue is said to be degenerate. If $\mathbf{A}$ is symmetric, we may represent $\mathbf{A}$ in a Cartesian basis and apply facts from the theory of symmetric matrices. Corresponding to a multiple root of the characteristic equation we have a subspace of eigenvectors. If $\lambda_1 = \lambda_2$ is a double root, then the subspace is two-dimensional and we can select an orthonormal pair that is orthogonal to the third eigenvector (since $\lambda_1$ and $\lambda_3$ are distinct). Such an eigenvalue corresponding to two linearly independent eigenvectors is regarded as a multiple eigenvalue (multiplicity two). If $\lambda_1 = \lambda_2 = \lambda_3$ then any vector is an eigenvector, and choosing a Cartesian frame we would have an eigenvalue of multiplicity three. However, this case arises only when the tensor under consideration is proportional to the unit tensor $\mathbf{E}$. Such a tensor is called a ball tensor.

Exercise 3.18. Show directly that a second-order tensor (not necessarily symmetric) having three distinct eigenvalues cannot have more than one linearly independent eigenvector corresponding to each eigenvalue.

3.7 The Cayley–Hamilton Theorem

The Cayley–Hamilton theorem states that every square matrix satisfies its own characteristic equation. For example the 2 × 2 matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

has characteristic equation

\[ \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc) = 0, \]

and the Cayley–Hamilton theorem tells us that $A$ itself satisfies

\[ A^2 - (a + d)A + (ad - bc)I = 0, \]

where $I$ is the 2 × 2 identity matrix and the zero on the right side denotes the 2 × 2 zero matrix. Similarly, the Cayley–Hamilton theorem for a second-order tensor $\mathbf{A}$ whose characteristic equation is given by (3.18) states that $\mathbf{A}$ satisfies the equation

\[ -\mathbf{A}^3 + I_1(\mathbf{A})\mathbf{A}^2 - I_2(\mathbf{A})\mathbf{A} + I_3(\mathbf{A})\mathbf{E} = \mathbf{0}. \tag{3.22} \]

This permits us to represent $\mathbf{A}^3$ in terms of lower powers of $\mathbf{A}$. Furthermore, we may dot multiply (3.22) by $\mathbf{A}$ and thereby represent $\mathbf{A}^4$ in terms of lower powers of $\mathbf{A}$. It is clear that we could continue in this fashion and eventually express any desired power of $\mathbf{A}$ in terms of $\mathbf{E}$, $\mathbf{A}$, and $\mathbf{A}^2$. This is useful in certain applications (e.g., nonlinear elasticity) where functions of tensors are represented approximately by truncated Taylor series.

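It is easy to establish the Cayley–Hamilton theorem for the case of a symmetric tensor. Such a tensor $\mathbf{A}$ has the representation (3.21), where we redenote $\mathbf{i}_i = \mathbf{x}_i$ because the eigenvectors $\mathbf{x}_i$ constitute an orthonormal basis:

\[ \mathbf{A} = \sum_{i=1}^{3}\lambda_i\mathbf{i}_i\mathbf{i}_i. \]

We get

\[ \mathbf{A}^2 = \mathbf{A}\cdot\mathbf{A} = \sum_{i=1}^{3}\lambda_i\mathbf{i}_i\mathbf{i}_i\cdot\sum_{j=1}^{3}\lambda_j\mathbf{i}_j\mathbf{i}_j = \sum_{i,j=1}^{3}\lambda_i\mathbf{i}_i\,\lambda_j\mathbf{i}_j\,\delta_{ij} = \sum_{i=1}^{3}\lambda_i^2\,\mathbf{i}_i\mathbf{i}_i. \]

Similarly

\[ \mathbf{A}^3 = \mathbf{A}\cdot\mathbf{A}\cdot\mathbf{A} = \sum_{i=1}^{3}\lambda_i^3\,\mathbf{i}_i\mathbf{i}_i. \tag{3.23} \]

Let us return to equation (3.18) written for the $i$th eigenvalue, $\lambda_i^3 = I_1(\mathbf{A})\lambda_i^2 - I_2(\mathbf{A})\lambda_i + I_3(\mathbf{A})$, and put it into the expression (3.23):

\[
\begin{aligned}
\mathbf{A}^3 &= \sum_{i=1}^{3}\left[I_1(\mathbf{A})\lambda_i^2 - I_2(\mathbf{A})\lambda_i + I_3(\mathbf{A})\right]\mathbf{i}_i\mathbf{i}_i \\
&= I_1(\mathbf{A})\sum_{i=1}^{3}\lambda_i^2\,\mathbf{i}_i\mathbf{i}_i - I_2(\mathbf{A})\sum_{i=1}^{3}\lambda_i\,\mathbf{i}_i\mathbf{i}_i + I_3(\mathbf{A})\sum_{i=1}^{3}\mathbf{i}_i\mathbf{i}_i \\
&= I_1(\mathbf{A})\mathbf{A}^2 - I_2(\mathbf{A})\mathbf{A} + I_3(\mathbf{A})\mathbf{E},
\end{aligned}
\]

as desired.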
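Equation (3.22) lends itself to a direct numerical check. The following sketch, with hypothetical helper names and a sample symmetric matrix chosen only for illustration, computes the three invariants from the trace and determinant formulas of § 3.6 and then evaluates the left side of (3.22) in a Cartesian frame.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def invariants(A):
    """I1 = tr A, I2 = (tr^2 A - tr A^2)/2, I3 = det A."""
    I1 = trace(A)
    I2 = 0.5 * (I1 ** 2 - trace(matmul(A, A)))
    I3 = det3(A)
    return I1, I2, I3

def cayley_hamilton_residual(A):
    """Left side of (3.22): -A^3 + I1 A^2 - I2 A + I3 E."""
    I1, I2, I3 = invariants(A)
    A2 = matmul(A, A)
    A3 = matmul(A, A2)
    return [[-A3[i][j] + I1 * A2[i][j] - I2 * A[i][j] + I3 * (i == j)
             for j in range(3)] for i in range(3)]

A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
print(invariants(A))                 # the three invariants of the sample
print(cayley_hamilton_residual(A))   # every entry should vanish
```

The residual matrix consists of zeros for this sample, in agreement with the theorem; the same check can be run on a nonsymmetric matrix, since the theorem is not restricted to the symmetric case treated in the proof above.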

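Exercise 3.19. Use the Cayley–Hamilton theorem to express $\mathbf{A}^3$ in terms of $\mathbf{A}^2$, $\mathbf{A}$, and $\mathbf{E}$ if $\mathbf{A} = \mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_2$.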

3.8 Other Properties of Second-Order Tensors

Tensors of rotation

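A tensor $\mathbf{Q}$ of order two is said to be orthogonal if it satisfies the equality

\[ \mathbf{Q}\cdot\mathbf{Q}^T = \mathbf{Q}^T\cdot\mathbf{Q} = \mathbf{E}. \]

We see that

\[ \mathbf{Q}^T = \mathbf{Q}^{-1} \]

for an orthogonal tensor. Furthermore,

\[ \det\mathbf{Q} = \pm 1. \]

Indeed, $\det\mathbf{Q}$ is determined by the determinant of the matrix of mixed components of $\mathbf{Q}$. Because of the properties of the determinant of a matrix and the correspondence between tensors and matrices, we have for two tensors $\mathbf{A}$ and $\mathbf{B}$ of order two

\[ \det(\mathbf{A}\cdot\mathbf{B}) = \det\mathbf{A}\,\det\mathbf{B} \]

and

\[ \det\mathbf{A} = \det\mathbf{A}^T. \]

Thus

\[ 1 = \det\mathbf{E} = \det\mathbf{Q}\,\det\mathbf{Q}^T = (\det\mathbf{Q})^2 \]

as desired. We call $\mathbf{Q}$ a proper orthogonal tensor if $\det\mathbf{Q} = +1$; we call $\mathbf{Q}$ an improper orthogonal tensor if $\det\mathbf{Q} = -1$.

Exercise 3.20. (a) Show that the tensor $\mathbf{Q} = -\mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3$ is orthogonal. Is it proper or improper? (b) Show that

\[ q_{ij}q^{kj} = \delta_i^k \]

if $\mathbf{Q}$ is orthogonal. (c) Show that if $\mathbf{Q}$ is orthogonal then so is $\mathbf{Q}^n$ for every integer $n$.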

We now consider the orthogonal tensor $\mathbf{Q}$ as an operator in the space of all vectors.

Theorem 3.3. The operator defined by the orthogonal tensor $\mathbf{Q}$ preserves the magnitudes of vectors and the angles between them.

Proof. Consider the result of application of $\mathbf{Q}$ to both the multipliers of the inner product $\mathbf{x}\cdot\mathbf{y}$:

\[ (\mathbf{Q}\cdot\mathbf{x})\cdot(\mathbf{Q}\cdot\mathbf{y}) = (\mathbf{x}\cdot\mathbf{Q}^T)\cdot(\mathbf{Q}\cdot\mathbf{y}) = \mathbf{x}\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\mathbf{y} = \mathbf{x}\cdot\mathbf{E}\cdot\mathbf{y} = \mathbf{x}\cdot\mathbf{y}. \]

First we put $\mathbf{y} = \mathbf{x}$ to see that $\mathbf{Q}$ preserves vector magnitudes: $|\mathbf{Q}\cdot\mathbf{x}|^2 = |\mathbf{x}|^2$. Thus, by definition of the dot product in terms of the cosine, angles are also preserved.

The action of $\mathbf{Q}$ amounts to a rotation of all vectors of the space. (More precisely, this is the case for a proper orthogonal tensor; an improper tensor also causes an axis reflection that changes the “handedness” of the frame.) The situation is analogous to the case of a solid body where the position of a point is defined by a vector beginning at a fixed origin of some frame and ending at the point. Any motion of the solid with a fixed point is a rotation with respect to some axis by some angle. The equations of physics should often be introduced in such a way that they are invariant under rotation of the coordinate frame whose position is not determined in space. Such invariance under rotation should be verified, and in large part this can be done by showing that the application of $\mathbf{Q}$ to all the vectors of the relation does not change the form of the relation.

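Let us note that some quantities are always invariant under rotation. One of them is the first invariant $I_1(\mathbf{A})$ of a tensor. This quantity, also called the trace of the tensor and denoted $\operatorname{tr}(\mathbf{A})$, is the sum of the diagonal mixed components of $\mathbf{A}$:

\[ \operatorname{tr}(\mathbf{A}) = a_i^{\cdot i}. \]

The trace can be equivalently determined in the non-coordinate form

\[ \operatorname{tr}(\mathbf{A}) = \mathbf{E}\cdot\cdot\,\mathbf{A} = \mathbf{E}\bullet\mathbf{A} \]

(the reader should check this). To show invariance we consider the tensor

\[
\begin{aligned}
\mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T &= \mathbf{Q}\cdot(a^{ij}\mathbf{e}_i\mathbf{e}_j)\cdot\mathbf{Q}^T \\
&= a^{ij}(\mathbf{Q}\cdot\mathbf{e}_i)(\mathbf{e}_j\cdot\mathbf{Q}^T) \\
&= a^{ij}(\mathbf{Q}\cdot\mathbf{e}_i)(\mathbf{Q}\cdot\mathbf{e}_j).
\end{aligned}
\]

We see that this is a representation of the “rotated” tensor, derived as the result of applying $\mathbf{Q}$ to each vector of the dyadic components of $\mathbf{A}$. The trace of the rotated tensor is given by

\[
\begin{aligned}
\mathbf{E}\cdot\cdot\,(\mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T) &= \mathbf{E}\cdot\cdot\,(\mathbf{Q}\cdot a_i^{\cdot j}\mathbf{e}^i\mathbf{e}_j\cdot\mathbf{Q}^T) \\
&= \mathbf{E}\cdot\cdot\,a_i^{\cdot j}(\mathbf{Q}\cdot\mathbf{e}^i)(\mathbf{Q}\cdot\mathbf{e}_j).
\end{aligned}
\]

Under the action of $\mathbf{Q}$ the frame $\mathbf{e}_i$ transforms to the frame $\mathbf{Q}\cdot\mathbf{e}_i$, to which the reciprocal basis is $\mathbf{Q}\cdot\mathbf{e}^i$. This means that the unit tensor (the metric tensor!) can be represented, in particular, as

\[ \mathbf{E} = (\mathbf{Q}\cdot\mathbf{e}^i)(\mathbf{Q}\cdot\mathbf{e}_i) = (\mathbf{Q}\cdot\mathbf{e}_j)(\mathbf{Q}\cdot\mathbf{e}^j). \]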


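It follows that

\[
\begin{aligned}
\mathbf{E}\cdot\cdot\,(\mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T)
&= \left[(\mathbf{Q}\cdot\mathbf{e}^k)(\mathbf{Q}\cdot\mathbf{e}_k)\right]\cdot\cdot\,a_i^{\cdot j}(\mathbf{Q}\cdot\mathbf{e}^i)(\mathbf{Q}\cdot\mathbf{e}_j) \\
&= a_i^{\cdot j}\left[(\mathbf{Q}\cdot\mathbf{e}_k)\cdot(\mathbf{Q}\cdot\mathbf{e}^i)\right]\left[(\mathbf{Q}\cdot\mathbf{e}^k)\cdot(\mathbf{Q}\cdot\mathbf{e}_j)\right] \\
&= a_i^{\cdot j}\left[(\mathbf{e}_k\cdot\mathbf{Q}^T)\cdot(\mathbf{Q}\cdot\mathbf{e}^i)\right]\left[(\mathbf{e}^k\cdot\mathbf{Q}^T)\cdot(\mathbf{Q}\cdot\mathbf{e}_j)\right] \\
&= a_i^{\cdot j}\left[\mathbf{e}_k\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\mathbf{e}^i\right]\left[\mathbf{e}^k\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\mathbf{e}_j\right] \\
&= a_i^{\cdot j}\left[\mathbf{e}_k\cdot\mathbf{E}\cdot\mathbf{e}^i\right]\left[\mathbf{e}^k\cdot\mathbf{E}\cdot\mathbf{e}_j\right] \\
&= a_i^{\cdot j}\left[\mathbf{e}_k\cdot\mathbf{e}^i\right]\left[\mathbf{e}^k\cdot\mathbf{e}_j\right] \\
&= a_i^{\cdot j}\,\delta_k^i\,\delta_j^k \\
&= a_k^{\cdot k} \\
&= \operatorname{tr}(\mathbf{A}).
\end{aligned}
\]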

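Another example demonstrates that under the transformation $\mathbf{Q}$ the eigenvalues of a tensor $\mathbf{A}$ remain the same but the eigenvectors $\mathbf{x}_i$ rotate. Indeed, let

\[ \mathbf{A}\cdot\mathbf{x}_i = \lambda_i\mathbf{x}_i. \]

We show that $\mathbf{Q}\cdot\mathbf{x}_i$ is an eigenvector of $\mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T$:

\[
\begin{aligned}
(\mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T)\cdot(\mathbf{Q}\cdot\mathbf{x}_i) &= \mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{Q}^T\cdot\mathbf{Q}\cdot\mathbf{x}_i \\
&= \mathbf{Q}\cdot\mathbf{A}\cdot\mathbf{E}\cdot\mathbf{x}_i \\
&= \mathbf{Q}\cdot(\mathbf{A}\cdot\mathbf{x}_i) \\
&= \mathbf{Q}\cdot(\lambda_i\mathbf{x}_i) \\
&= \lambda_i(\mathbf{Q}\cdot\mathbf{x}_i).
\end{aligned}
\]

Let $\mathbf{e}$ be an axis of rotation defined by an orthogonal tensor $\mathbf{Q}$, and let $\omega$ be the angle of rotation about $\mathbf{e}$. It can be shown that the proper orthogonal tensor has the representation

\[ \mathbf{Q} = \mathbf{E}\cos\omega + (1 - \cos\omega)\,\mathbf{ee} - \mathbf{e}\times\mathbf{E}\,\sin\omega. \]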
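The representation above can be assembled and tested numerically. The sketch below assumes a Cartesian frame and a unit axis vector; in such a frame the matrix of $\mathbf{e}\times\mathbf{E}$ is the skew matrix that maps a vector $\mathbf{v}$ to $\mathbf{e}\times\mathbf{v}$, which follows from the definition of the cross-product of a vector and a tensor in § 3.5. The function names and sample values are hypothetical. The check covers only orthogonality, the determinant, and invariance of the axis; it does not decide the sense in which the angle is measured.

```python
import math

def rotation_tensor(e, omega):
    """Matrix of E cos(w) + (1 - cos(w)) ee - (e x E) sin(w) for a unit vector e."""
    c, s = math.cos(omega), math.sin(omega)
    # Cartesian matrix of e x E: it sends v to e x v.
    skew = [[0.0, -e[2], e[1]],
            [e[2], 0.0, -e[0]],
            [-e[1], e[0], 0.0]]
    return [[c * (i == j) + (1.0 - c) * e[i] * e[j] - s * skew[i][j]
             for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

e = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]      # unit axis
omega = 0.7                                # rotation angle in radians
Q = rotation_tensor(e, omega)

QQt = matmul(Q, transpose(Q))              # should reproduce E
Qe = [sum(Q[i][j] * e[j] for j in range(3)) for i in range(3)]   # should equal e
print(det3(Q), Qe)
```

The product `QQt` reproduces the unit matrix to rounding error, the determinant equals +1, and the axis is left unchanged, so the tensor is proper orthogonal with $\mathbf{e}$ as its axis.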

Polar decomposition

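A second-order tensor $\mathbf{A}$ is nonsingular if

\[ \det A \neq 0 \]

where $A$ is the matrix of mixed components of $\mathbf{A}$. It is possible to express such a tensor as a product of a symmetric tensor and another tensor. A statement of this result, known as the polar decomposition theorem, requires that we introduce some additional terminology.

A symmetric tensor is said to be positive definite if its eigenvalues are all positive. By orthogonal decomposition such a tensor $\mathbf{S}$ may be written in the form

\[ \mathbf{S} = \lambda_1\mathbf{i}_1\mathbf{i}_1 + \lambda_2\mathbf{i}_2\mathbf{i}_2 + \lambda_3\mathbf{i}_3\mathbf{i}_3, \]

where all the $\lambda_i > 0$ and the eigenvectors $\mathbf{i}_k$ constitute an orthonormal basis.

If $\mathbf{A}$ is nonsingular then the tensor $\mathbf{A}\cdot\mathbf{A}^T$ is symmetric and positive definite. Symmetry follows from the equation

\[ (\mathbf{A}\cdot\mathbf{A}^T)^T = (\mathbf{A}^T)^T\cdot\mathbf{A}^T = \mathbf{A}\cdot\mathbf{A}^T. \]

To see positive definiteness, we begin with the definition

\[ (\mathbf{A}\cdot\mathbf{A}^T)\cdot\mathbf{x} = \lambda\mathbf{x} \]

of an eigenvalue $\lambda$ and dot with $\mathbf{x}$ from the left to get

\[ \lambda = \frac{\mathbf{x}\cdot(\mathbf{A}\cdot\mathbf{A}^T)\cdot\mathbf{x}}{|\mathbf{x}|^2}. \]

The numerator is positive because

\[ \mathbf{x}\cdot(\mathbf{A}\cdot\mathbf{A}^T)\cdot\mathbf{x} = (\mathbf{x}\cdot\mathbf{A})\cdot(\mathbf{A}^T\cdot\mathbf{x}) = (\mathbf{x}\cdot\mathbf{A})\cdot(\mathbf{x}\cdot\mathbf{A}) = |\mathbf{x}\cdot\mathbf{A}|^2, \]

and $\mathbf{x}\cdot\mathbf{A} \neq \mathbf{0}$ for $\mathbf{x} \neq \mathbf{0}$ since $\mathbf{A}$ is nonsingular. Hence all the eigenvalues $\lambda$ of $\mathbf{A}\cdot\mathbf{A}^T$ are positive. The diagonalization of $\mathbf{A}\cdot\mathbf{A}^T$ now shows that it is positive definite.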

With these facts in hand we may turn to our main result.

Theorem 3.4. Any nonsingular tensor $\mathbf{A}$ of order two may be written as a product of an orthogonal tensor and a positive definite symmetric tensor. The decomposition may be done in two ways: as a left polar decomposition

\[ \mathbf{A} = \mathbf{S}\cdot\mathbf{Q} \tag{3.24} \]

or as a right polar decomposition

\[ \mathbf{A} = \mathbf{Q}\cdot\mathbf{S}'. \tag{3.25} \]

Here $\mathbf{Q}$ is an orthogonal tensor of order two, and $\mathbf{S}$ and $\mathbf{S}'$ are positive definite and symmetric.


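Proof. Because $\mathbf{A}\cdot\mathbf{A}^T$ is positive definite and symmetric, we have

\[ \mathbf{A}\cdot\mathbf{A}^T = \lambda_1\mathbf{i}_1\mathbf{i}_1 + \lambda_2\mathbf{i}_2\mathbf{i}_2 + \lambda_3\mathbf{i}_3\mathbf{i}_3, \]

where the $\mathbf{i}_k$ are orthonormal. We define

\[ \mathbf{S} \equiv (\mathbf{A}\cdot\mathbf{A}^T)^{1/2} = \sqrt{\lambda_1}\,\mathbf{i}_1\mathbf{i}_1 + \sqrt{\lambda_2}\,\mathbf{i}_2\mathbf{i}_2 + \sqrt{\lambda_3}\,\mathbf{i}_3\mathbf{i}_3 \]

since the $\lambda_i$ are positive. We see that $\mathbf{S}^{-1}$ exists and is equal to

\[ \mathbf{S}^{-1} = \frac{1}{\sqrt{\lambda_1}}\,\mathbf{i}_1\mathbf{i}_1 + \frac{1}{\sqrt{\lambda_2}}\,\mathbf{i}_2\mathbf{i}_2 + \frac{1}{\sqrt{\lambda_3}}\,\mathbf{i}_3\mathbf{i}_3. \]

Now we set

\[ \mathbf{Q} \equiv \mathbf{S}^{-1}\cdot\mathbf{A}. \]

To see that $\mathbf{Q}$ is orthogonal, we write

\[
\begin{aligned}
\mathbf{Q}\cdot\mathbf{Q}^T &= (\mathbf{S}^{-1}\cdot\mathbf{A})\cdot(\mathbf{S}^{-1}\cdot\mathbf{A})^T \\
&= (\mathbf{S}^{-1}\cdot\mathbf{A})\cdot(\mathbf{A}^T\cdot(\mathbf{S}^{-1})^T) \\
&= \mathbf{S}^{-1}\cdot(\mathbf{A}\cdot\mathbf{A}^T)\cdot(\mathbf{S}^T)^{-1} \\
&= \mathbf{S}^{-1}\cdot\mathbf{S}^2\cdot\mathbf{S}^{-1} \\
&= \mathbf{E}.
\end{aligned}
\]

Thus we have expressed $\mathbf{A} = \mathbf{S}\cdot\mathbf{Q}$ as in (3.24). The validity of (3.25) follows from defining $\mathbf{S}' \equiv \mathbf{Q}^T\cdot\mathbf{S}\cdot\mathbf{Q}$.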
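A numerical illustration of (3.24) is sketched below. It does not follow the construction used in the proof, which requires the eigenpairs of $\mathbf{A}\cdot\mathbf{A}^T$; instead it swaps in a different technique, the Newton iteration $\mathbf{Q} \leftarrow \frac{1}{2}(\mathbf{Q} + \mathbf{Q}^{-T})$ started from $\mathbf{A}$, which is a standard way to approximate the orthogonal polar factor of a nonsingular matrix. The factor $\mathbf{S}$ is then recovered as $\mathbf{A}\cdot\mathbf{Q}^T$. All function names, the fixed iteration count, and the sample matrix are assumptions made for this example, and a Cartesian frame is used throughout.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def inverse_transpose(A):
    """(A^{-1})^T of a 3 x 3 matrix, i.e. its cofactor matrix divided by det A."""
    r0, r1, r2 = A
    cof = [cross(r1, r2), cross(r2, r0), cross(r0, r1)]
    det = dot(r0, cof[0])
    return [[c / det for c in row] for row in cof]

def polar_left(A, iterations=50):
    """Approximate S and Q with A = S . Q, S symmetric, Q orthogonal."""
    Q = [row[:] for row in A]
    for _ in range(iterations):
        Qit = inverse_transpose(Q)
        # Newton step: average Q with its inverse transpose.
        Q = [[0.5 * (Q[i][j] + Qit[i][j]) for j in range(3)] for i in range(3)]
    S = matmul(A, transpose(Q))     # S = A . Q^{-1} = A . Q^T
    return S, Q

A = [[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]]
S, Q = polar_left(A)
SQ = matmul(S, Q)                   # should reproduce A
S2 = matmul(S, S)                   # should reproduce A . A^T
AAt = matmul(A, transpose(A))
```

For this sample the computed $\mathbf{S}$ is symmetric with $\mathbf{S}\cdot\mathbf{S} = \mathbf{A}\cdot\mathbf{A}^T$, and $\mathbf{Q}$ is orthogonal, up to rounding error. The right decomposition (3.25) follows by forming $\mathbf{Q}^T\cdot\mathbf{S}\cdot\mathbf{Q}$ as in the proof.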

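Exercise 3.21. We have called a tensor positive definite if its eigenvalues are all positive. An alternative definition is that $\mathbf{A}$ is positive definite if $\mathbf{x}\cdot(\mathbf{A}\cdot\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$. Explain.

As an application of polar decomposition, let us show that a nonsingular tensor $\mathbf{A}$ operates on the position vectors of the points on the unit sphere to produce position vectors defining the points of an ellipsoid. If $\mathbf{r}$ locates any point on the unit sphere then

\[ \mathbf{r}\cdot\mathbf{r} = 1. \tag{3.26} \]

Let $\mathbf{x}$ be the image of $\mathbf{r}$ under $\mathbf{A}$:

\[ \mathbf{x} = \mathbf{A}\cdot\mathbf{r}. \]

Then $\mathbf{r} = \mathbf{A}^{-1}\cdot\mathbf{x}$, and substitution into (3.26) along with (3.24) gives

\[
\begin{aligned}
1 &= [(\mathbf{S}\cdot\mathbf{Q})^{-1}\cdot\mathbf{x}]\cdot[(\mathbf{S}\cdot\mathbf{Q})^{-1}\cdot\mathbf{x}] \\
&= \mathbf{x}\cdot[(\mathbf{S}\cdot\mathbf{Q})^{-1}]^T\cdot(\mathbf{S}\cdot\mathbf{Q})^{-1}\cdot\mathbf{x} \\
&= \mathbf{x}\cdot\mathbf{S}^{-1}\cdot\mathbf{S}^{-1}\cdot\mathbf{x}
\end{aligned}
\]

(the reader can supply the missing details). Because

\[ \mathbf{S} = \sqrt{\lambda_1}\,\mathbf{i}_1\mathbf{i}_1 + \sqrt{\lambda_2}\,\mathbf{i}_2\mathbf{i}_2 + \sqrt{\lambda_3}\,\mathbf{i}_3\mathbf{i}_3, \]

expansion in the Cartesian frame $\mathbf{i}_k$ with use of

\[ \mathbf{x} = \sum_{i=1}^{3} x_i\mathbf{i}_i, \qquad \mathbf{S}^{-1} = \sum_{j=1}^{3}\frac{1}{\sqrt{\lambda_j}}\,\mathbf{i}_j\mathbf{i}_j \]

reduces the above equation $\mathbf{x}\cdot\mathbf{S}^{-1}\cdot\mathbf{S}^{-1}\cdot\mathbf{x} = 1$ to the form

\[ \sum_{i=1}^{3}\frac{1}{\lambda_i}\,x_i^2 = 1. \]

This is the equation of an ellipsoid.

Polar decomposition provides the background for introducing measures of deformation in nonlinear continuum mechanics.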

Deviator and ball tensor representation

For $\mathbf{A}$ we can introduce the representation

\[ \mathbf{A} = \frac{1}{3}I_1(\mathbf{A})\,\mathbf{E} + \operatorname{dev}\mathbf{A}. \]

Such a representation is found useful in the theory of elasticity. Moreover, it is used to formulate constitutive equations in the theories of plasticity, creep, and viscoelasticity.

The tensor $\operatorname{dev}\mathbf{A}$ is defined by the above equality. It has the same eigenvectors as $\mathbf{A}$, but eigenvalues that differ from the eigenvalues of $\mathbf{A}$ by $(1/3)I_1(\mathbf{A})$:

\[ \tilde{\lambda}_i = \lambda_i - \frac{1}{3}\operatorname{tr}\mathbf{A}. \]
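A short sketch of this split for a tensor given by its matrix in a Cartesian frame follows; the function names and the sample matrix are hypothetical. The deviator is obtained by subtracting one third of the trace from each diagonal entry, so its own trace vanishes.

```python
def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def deviator(A):
    """dev A = A - (1/3) tr(A) E in a Cartesian frame."""
    mean = trace(A) / 3.0
    return [[A[i][j] - mean * (i == j) for j in range(3)] for i in range(3)]

def ball_part(A):
    """The ball tensor (1/3) tr(A) E."""
    mean = trace(A) / 3.0
    return [[mean * (i == j) for j in range(3)] for i in range(3)]

A = [[1.0, 2.0, 0.0], [2.0, 2.0, 0.0], [0.0, 0.0, 6.0]]
dev_A = deviator(A)
ball_A = ball_part(A)
print(dev_A, trace(dev_A))
```

Adding `ball_A` and `dev_A` entry by entry returns the original matrix, and only the diagonal entries differ between `A` and `dev_A`.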


3.9 Extending the Dyad Idea

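Third-order tensors can be introduced in a way that parallels the introduction of dyads in § 3.1. Using the tensor product as before, we introduce triad quantities of the type

\[ \mathbf{R} = \mathbf{abc} \]

where $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are vectors. Expanding these vectors in terms of a basis $\mathbf{e}_i$ we obtain

\[ \mathbf{R} = a^ib^jc^k\,\mathbf{e}_i\mathbf{e}_j\mathbf{e}_k. \]

We then consider a linear space whose basis is the set of 27 quantities $\mathbf{e}_i\mathbf{e}_j\mathbf{e}_k$ and call it the space of third-order tensors. We continue to refer to the numerical values $a^ib^jc^k$ as the tensor’s components. A general element of this space, a tensor $\mathbf{R}$ of order three, has the representation

\[ \mathbf{R} = r^{ijk}\,\mathbf{e}_i\mathbf{e}_j\mathbf{e}_k. \tag{3.27} \]

The property of objectivity of $\mathbf{R}$ remains paramount, leading to the requirement that the components transform appropriately when we change the frame. The now familiar procedure of setting

\[ \tilde{r}^{ijk}\,\tilde{\mathbf{e}}_i\tilde{\mathbf{e}}_j\tilde{\mathbf{e}}_k = r^{mnp}\,\mathbf{e}_m\mathbf{e}_n\mathbf{e}_p \]

under the change of frame

\[ \tilde{\mathbf{e}}_i = A_i^j\,\mathbf{e}_j \]

gives

\[ \tilde{r}^{ijk} = r^{mnp}\tilde{A}_m^i\tilde{A}_n^j\tilde{A}_p^k, \]

a direct extension of (3.8). As an alternative to the representation (3.27) in contravariant components, we could use the covariant-type representation

\[ \mathbf{R} = r_{ijk}\,\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k \]

or either of the mixed representations

\[ \mathbf{R} = r^{ij}_{\cdot\cdot k}\,\mathbf{e}_i\mathbf{e}_j\mathbf{e}^k, \qquad \mathbf{R} = r^i_{\cdot jk}\,\mathbf{e}_i\mathbf{e}^j\mathbf{e}^k. \]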

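These necessitate the respective transformation laws

\[ \tilde{r}_{ijk} = r_{mnp}A_i^mA_j^nA_k^p \]

and

\[ \tilde{r}^{ij}_{\cdot\cdot k} = r^{mn}_{\cdot\cdot p}\,\tilde{A}_m^i\tilde{A}_n^jA_k^p, \qquad \tilde{r}^i_{\cdot jk} = r^m_{\cdot np}\,\tilde{A}_m^iA_j^nA_k^p, \]

as is easily verified.

Dot products involving triads follow familiar rules. The dot product of a triad with a vector is given by the formula

\[ \mathbf{abc}\cdot\mathbf{x} = \mathbf{ab}\,(\mathbf{c}\cdot\mathbf{x}), \]

while the double dot product of a triad with a dyad is given by

\[ \mathbf{abc}\cdot\cdot\,\mathbf{xy} = \mathbf{a}\,(\mathbf{c}\cdot\mathbf{x})(\mathbf{b}\cdot\mathbf{y}). \]

One may also define a triple dot product of a triad with another triad:

\[ \mathbf{abc}\cdot\cdot\cdot\,\mathbf{xyz} = (\mathbf{c}\cdot\mathbf{x})(\mathbf{b}\cdot\mathbf{y})(\mathbf{a}\cdot\mathbf{z}). \]

The scalar product of triads is defined by the rule

\[ \mathbf{abc}\bullet\mathbf{xyz} = (\mathbf{a}\cdot\mathbf{x})(\mathbf{b}\cdot\mathbf{y})(\mathbf{c}\cdot\mathbf{z}). \]

The reader has surmised by now that the order of a tensor is always equal to the number of free indices needed to specify its components. A vector, for instance, is a tensor of order one. In Chapter 2 we also met a quantity whose components are specified by three indices: $\varepsilon_{ijk}$. This quantity, which arose naturally in our discussion of the vector cross product, is known as the Levi–Civita tensor and is given by

\[ \boldsymbol{\mathcal{E}} = \varepsilon_{ijk}\,\mathbf{e}^i\mathbf{e}^j\mathbf{e}^k. \]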

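Exercise 3.22. Verify that $\boldsymbol{\mathcal{E}} = -\mathbf{E}\times\mathbf{E}$.

Exercise 3.23. Verify the following formulas for operations involving the Levi–Civita tensor:

(a) $\boldsymbol{\mathcal{E}}\cdot\cdot\cdot\,\mathbf{zyx} = \mathbf{x}\cdot(\mathbf{y}\times\mathbf{z})$,
(b) $\boldsymbol{\mathcal{E}}\cdot\cdot\,\mathbf{xy} = \mathbf{y}\times\mathbf{x}$,
(c) $\boldsymbol{\mathcal{E}}\cdot\mathbf{x} = -\mathbf{x}\times$.

Note: The notation of (c) may require some explanation. The result of applying the third-order tensor $\boldsymbol{\mathcal{E}}$ to a vector $\mathbf{x}$ is a second-order tensor $\boldsymbol{\mathcal{E}}\cdot\mathbf{x}$. When $\boldsymbol{\mathcal{E}}\cdot\mathbf{x}$ is applied to another vector $\mathbf{y}$, it becomes equivalent to the cross product $\mathbf{y}\times\mathbf{x}$. This is the meaning of the right side of (c). Although the notation is awkward, it is rare that the action of some tensor can be described as the action of two vectors, and the development of a special notation is unwarranted.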


3.10 Tensors of the Fourth and Higher Orders

We can obviously extend the present treatment to tensors of any desired order. Let us illustrate the essential points using tensors of order four.

A fourth-order tensor C can be represented by several types of components:

$$c^{ijkl}, \quad c_i^{\,\cdot jkl}, \quad c_{ij}^{\,\cdot\cdot kl}, \quad c_{ijk}^{\,\cdot\cdot\cdot l}, \quad c_{ijkl}.$$

The first and last are purely contravariant and purely covariant, respectively, while the other three are mixed with indices in various positions. As before these components represent C with respect to various bases; for instance,

$$C = c^{ijkl}\, e_i e_j e_k e_l.$$

Dot products with vectors can be taken as before: the rule is that we simply dot multiply the basis vectors positioned nearest to the dot. Carrying out such operations we may obtain results of various kinds. A dot product of a fourth-order tensor with a vector gives, for example,

$$R = C \cdot x = c^{ijkl}\, e_i e_j e_k e_l \cdot x_m e^m = c^{ijkl}\, e_i e_j e_k\, \delta_l^m x_m = c^{ijkl} x_l\, e_i e_j e_k,$$

which is a tensor of order three. Similarly, a dot product between a third-order tensor and a vector gives a second-order tensor. Wherever the dot product is utilized, it continues to enjoy the linearity properties stated earlier.
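The contraction rule for a fourth-order tensor and a vector can be sketched numerically. The snippet below assumes a Cartesian basis (so upper and lower indices coincide) and uses made-up components; the function name `dot_with_vector` is hypothetical and serves only this illustration.

```python
from itertools import product


def dot_with_vector(c, x):
    """Contract the last index of c[i][j][k][l] with x[l] (Cartesian components)."""
    n = len(x)
    r = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k, l in product(range(n), repeat=4):
        r[i][j][k] += c[i][j][k][l] * x[l]
    return r


n = 3
# Made-up components that encode their own indices: c_ijkl = 1000 i + 100 j + 10 k + l
c = [[[[1000.0 * i + 100.0 * j + 10.0 * k + l for l in range(n)]
       for k in range(n)] for j in range(n)] for i in range(n)]
x = [1.0, 0.0, 2.0]

r = dot_with_vector(c, x)   # three indices remain: a third-order tensor
print(r[1][2][0])
```

Only the index nearest the dot is summed, so the result keeps the first three indices of C.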

Double dot products also appear in applications. For example, in elasticity one encounters double dot products between the tensor of elastic constants and the strain tensor. In generalized form Hooke's law becomes

σ = C ·· ε

where σ is the stress tensor, C is the tensor of elastic constants, and ε is the strain tensor. The density of internal (elastic) energy in linear elasticity is

$$\tfrac{1}{2}\,\sigma \cdot\cdot\, \varepsilon = \tfrac{1}{2}\,(C \cdot\cdot\, \varepsilon) \cdot\cdot\, \varepsilon.$$

Tensors 59

Because σ and ε are symmetric, the last expression can be put in a more symmetrical form

$$\tfrac{1}{2}\,\varepsilon \cdot\cdot\, C \cdot\cdot\, \varepsilon$$

in which the result does not depend on the order of operations.
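Written out in Cartesian components, these relations take a familiar index form. The sketch below assumes the dot-product rule stated above (basis vectors nearest the dots are multiplied first), the representation $C = c_{ijmn}\, i_i i_j i_m i_n$, and a symmetric strain $\varepsilon = \varepsilon_{mn}\, i_m i_n$; the index names are chosen here for illustration.

```latex
\sigma_{ij} = c_{ijmn}\,\varepsilon_{nm} = c_{ijmn}\,\varepsilon_{mn},
\qquad
\tfrac{1}{2}\,\varepsilon \cdot\cdot\, C \cdot\cdot\, \varepsilon
  = \tfrac{1}{2}\,\varepsilon_{ji}\, c_{ijmn}\,\varepsilon_{nm}
  = \tfrac{1}{2}\, c_{ijmn}\,\varepsilon_{ij}\,\varepsilon_{mn}.
```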

Isotropic tensors

In engineering, isotropic materials play an important role. These are the materials whose properties are the same in all directions. Air, for example, is isotropic: it is equally transparent in all directions. It is impossible to tell whether a ball made of isotropic material has been rotated through some angle. In mechanics, material properties are expressed via constitutive relations. From a mathematical point of view, a material is isotropic when its constitutive equations are invariant with respect to certain transformations: the rotations and mirror reflections.

First we consider the question of when various tensorial quantities can be isotropic. We say that a tensor is isotropic if its individual components are invariant under all possible rotations and mirror reflections in $\mathbb{R}^3$. Any scalar quantity is isotropic. Clearly, the only isotropic vector is 0.

Let A be a second-order tensor so that in a basis e1, e2, e3 we have

$$A = a^{ij}\, e_i e_j.$$

Recall that any rotation or mirror reflection of $\mathbb{R}^3$ is uniquely defined by an orthogonal tensor Q. Let us apply Q to each vector of the basis:

$$\tilde{e}_k = Q \cdot e_k.$$

In the new basis we have

$$A = \tilde{a}^{ij}\, \tilde{e}_i \tilde{e}_j.$$

Let A be isotropic. By the above definition we must have $\tilde{a}^{ij} = a^{ij}$. So

$$A = a^{ij}\, \tilde{e}_i \tilde{e}_j = a^{ij}\, (Q \cdot e_i)(Q \cdot e_j) = Q \cdot (a^{ij}\, e_i e_j) \cdot Q^T.$$

This means that A is isotropic if and only if the equation

$$A = Q \cdot A \cdot Q^T$$

holds for any orthogonal tensor Q.
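The criterion A = Q · A · Q^T is easy to probe numerically. The following sketch works in Cartesian components with one arbitrarily chosen rotation (the angle and the sample tensors are assumptions made for the demonstration). It shows a ball tensor passing the test and a diagonal tensor with distinct entries failing it; a single Q can only refute isotropy, not prove it.

```python
from math import cos, sin


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rotated(a, q):
    """Return Q . A . Q^T in Cartesian components."""
    return matmul(matmul(q, a), transpose(q))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


t = 0.7  # arbitrary rotation angle about the third axis
q = [[cos(t), -sin(t), 0.0], [sin(t), cos(t), 0.0], [0.0, 0.0, 1.0]]

ball = [[2.5 if i == j else 0.0 for j in range(3)] for i in range(3)]   # 2.5 E
diag = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]              # not a ball tensor

ball_change = max_abs_diff(rotated(ball, q), ball)   # stays at rounding level
diag_change = max_abs_diff(rotated(diag, q), diag)   # clearly nonzero
print(ball_change, diag_change)
```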


Common sense tells us that the metric tensor E, which is the unit tensor as well, should be isotropic. Let us demonstrate this. For any orthogonal Q we have

$$Q \cdot Q^T = E.$$

This can be rewritten as

$$E = Q \cdot E \cdot Q^T.$$

Hence E is isotropic. If λ is a scalar, then clearly the ball tensor λE is isotropic as well.

The following is, unfortunately, not a trivial exercise. It asserts that any isotropic second-order tensor takes the form λE.

Exercise 3.24. Show that A is an isotropic tensor of order two if and only if it is a ball tensor: that is, A = λE for some scalar λ.

Under an orthogonal transformation Q of $\mathbb{R}^3$, a fourth-order tensor

$$C = c^{ijmn}\, e_i e_j e_m e_n$$

takes the form

$$C = \tilde{c}^{ijmn}\, (Q \cdot e_i)(Q \cdot e_j)(Q \cdot e_m)(Q \cdot e_n).$$

By the general definition, it is isotropic if $\tilde{c}^{ijmn} = c^{ijmn}$ for all Q.

In a Cartesian frame, the general form of the fourth-order isotropic tensor is

$$\alpha EE + \beta\, e_k E e^k + \gamma I, \qquad (3.28)$$

where α, β, γ are arbitrary scalars [Jeffreys (1931)]. The proof is cumbersome and we omit it. The properties of the tensors in the representation are exhibited in Exercises 3.27, 3.28, and 3.29. This fact can be applied to the tensor of elastic constants for an isotropic material. The quantities C, σ, and ε will be considered further in Chapter 6.
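A numerical spot check of (3.28) may help fix ideas. The sketch below assumes Cartesian components, in which the three building blocks EE, e_k E e_k, and I reduce to products of Kronecker deltas (consistent with the statements of Exercises 3.27 to 3.29). The scalars, the orthogonal tensor, and the helper names are arbitrary choices for illustration, and one Q is of course not a proof.

```python
from itertools import product
from math import cos, sin

IDX = range(3)


def delta(i, j):
    return 1.0 if i == j else 0.0


def isotropic_tensor(alpha, beta, gamma):
    """Cartesian components of alpha*EE + beta*e_k E e_k + gamma*I."""
    return [[[[alpha * delta(i, j) * delta(m, n)
               + beta * delta(i, n) * delta(j, m)
               + gamma * delta(i, m) * delta(j, n)
               for n in IDX] for m in IDX] for j in IDX] for i in IDX]


def transform(c, q):
    """Components q_ip q_jr q_ms q_nt c_prst obtained by applying Q to every basis vector."""
    out = [[[[0.0] * 3 for _ in IDX] for _ in IDX] for _ in IDX]
    for i, j, m, n in product(IDX, repeat=4):
        out[i][j][m][n] = sum(
            q[i][p] * q[j][r] * q[m][s] * q[n][t] * c[p][r][s][t]
            for p, r, s, t in product(IDX, repeat=4))
    return out


def max_diff(c1, c2):
    return max(abs(c1[i][j][m][n] - c2[i][j][m][n])
               for i, j, m, n in product(IDX, repeat=4))


def apply(c, a):
    """F_ij = c_ijmn a_mn, the Cartesian form of C .. A^T."""
    return [[sum(c[i][j][m][n] * a[m][n] for m in IDX for n in IDX)
             for j in IDX] for i in IDX]


# Orthogonal Q built from two rotations with arbitrary angles
rz = [[cos(0.7), -sin(0.7), 0.0], [sin(0.7), cos(0.7), 0.0], [0.0, 0.0, 1.0]]
rx = [[1.0, 0.0, 0.0], [0.0, cos(0.4), -sin(0.4)], [0.0, sin(0.4), cos(0.4)]]
q = [[sum(rz[i][k] * rx[k][j] for k in IDX) for j in IDX] for i in IDX]

alpha, beta, gamma = 2.0, 3.0, 5.0
c = isotropic_tensor(alpha, beta, gamma)
iso_change = max_diff(transform(c, q), c)      # rounding-level for an isotropic tensor

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
f = apply(c, a)
tr_a = a[0][0] + a[1][1] + a[2][2]
expected = [[alpha * tr_a * delta(i, j) + beta * a[j][i] + gamma * a[i][j]
             for j in IDX] for i in IDX]      # alpha E tr A + beta A^T + gamma A
print(iso_change, f[0][1])
```

The last lines compare C ·· A^T with αE tr A + βA^T + γA, the form an isotropic linear function takes.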

3.11 Functions of Tensorial Arguments

The reader is familiar with the notion of a function $f(x_1, \ldots, x_n)$ in n variables. If we regard $x_k$ as a Cartesian component of a vector

$$x = (x_1, \ldots, x_n),$$


then we can regard f as a function of the vectorial argument x:

f = f(x).

But a logically sound definition of such a function, as dictated by physics, should require that f be independent of the representation of x in a basis.

Similarly, we can consider a function of one or more tensorial arguments. In a fixed basis, such a function reduces to a function in many variables, the components of the tensorial arguments. Again, however, a true function of a tensorial argument cannot depend on the basis representations of its arguments.

To extend this notion further, we can consider functions that take values in the set of vectors, or even tensors. Such functions arise in applications. For example, a force vector f = f(t) can be given as a function of time t. Later, we will encounter other functions that take values in the set of tensors of some order. Depending on this latter set, the function may be termed scalar-valued, vector-valued, or tensor-valued.

As for any function in many variables, we can apply the tools of calculus to tensor-valued functions. These include the notion of continuity, the first differential, and derivatives. We will consider these topics later.

Linear functions

In linear elasticity and linear shell theory, linear relations and quadratic functions (such as occur in strain energy expressions) play central roles. We define a linear function of a tensorial variable as a function f which, for any tensors A, B and scalars λ, µ, satisfies the relation

f(λA + µB) = λf(A) + µf(B).

This mimics the definition of a linear matrix operator. From this point of view, the equation y = kx + b represents a linear function only if b = 0.

Theorem 3.5. Let f be a linear scalar-valued function of a vectorial argument x. There is a unique vector c such that for all x,

f(x) = c · x. (3.29)

Proof. We expand $x = x^k e_k$ with respect to the basis $e_k$. By linearity, $f(x) = x^k f(e_k)$. Equation (3.29) holds with $c = f(e_k)\, e^k$. To prove uniqueness, suppose there are two vectors $c_1$ and $c_2$ such that $c_1 \cdot x = c_2 \cdot x$ for all x. Putting $x = c_2 - c_1$, we get $\|c_2 - c_1\|^2 = 0$ and hence $c_1 = c_2$.


Now let us consider a scalar-valued function whose argument is a second-order tensor A. Clearly, for a second-order tensor B, the function

$$f(A) = \operatorname{tr}(B \cdot A^T) = B \cdot\cdot\, A^T$$

is linear.

Exercise 3.25. Show that $B \cdot\cdot\, A^T = A \cdot\cdot\, B^T$.

Theorem 3.6. Let f be a linear scalar-valued function of a second-order tensor A. There is a unique second-order tensor B such that for all A,

$$f(A) = \operatorname{tr}(B \cdot A^T). \qquad (3.30)$$

The proof is left to the reader.

The representation of any linear tensor-valued function of a tensorial argument is similar. We use it to introduce the tensor of elastic constants in linear elasticity.

Theorem 3.7. Let F = F(A) be a linear function from the set of second-order tensors A to the same set of second-order tensors. There is a unique fourth-order tensor C such that for all A,

$$F(A) = C \cdot\cdot\, A^T. \qquad (3.31)$$

Proof. Write $A = a^{mn} e_m e_n$ and introduce C by the formula

$$C = F(e_m e_n)\, e^m e^n.$$

Then

$$C \cdot\cdot\, A^T = F(e_m e_n)\, e^m e^n \cdot\cdot\, a^{ij} e_j e_i = F(e_m e_n)\, a^{mn} = F(A).$$

So the representation is valid. Proof of uniqueness is left to the reader.
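The construction used in the proof of Theorem 3.7 can be carried out numerically. The sketch below assumes a Cartesian basis, in which $c_{ijmn}$ is simply the (i, j) component of F applied to the dyad $i_m i_n$. The particular linear map and all function names are hypothetical choices made for this illustration.

```python
IDX = range(3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in IDX) for j in IDX] for i in IDX]


B = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0], [3.0, 0.0, 2.0]]  # arbitrary fixed tensor


def linear_map(a):
    """Hypothetical linear function F(A) = B . A + 2 A^T, chosen only for illustration."""
    ba = matmul(B, a)
    return [[ba[i][j] + 2.0 * a[j][i] for j in IDX] for i in IDX]


def build_c(f):
    """Cartesian components c_ijmn = [F(i_m i_n)]_ij, following the proof of Theorem 3.7."""
    c = [[[[0.0] * 3 for _ in IDX] for _ in IDX] for _ in IDX]
    for m in IDX:
        for n in IDX:
            dyad = [[1.0 if (r == m and s == n) else 0.0 for s in IDX] for r in IDX]
            image = f(dyad)
            for i in IDX:
                for j in IDX:
                    c[i][j][m][n] = image[i][j]
    return c


def double_dot_transpose(c, a):
    """Components of C .. A^T, which reduce to c_ijmn a_mn in a Cartesian basis."""
    return [[sum(c[i][j][m][n] * a[m][n] for m in IDX for n in IDX)
             for j in IDX] for i in IDX]


c = build_c(linear_map)
a = [[0.5, -1.0, 2.0], [4.0, 0.0, 1.5], [-2.0, 3.0, 1.0]]
lhs = double_dot_transpose(c, a)   # C .. A^T
rhs = linear_map(a)                # F(A)
error = max(abs(lhs[i][j] - rhs[i][j]) for i in IDX for j in IDX)
print(error)
```

The 81 numbers in `c` are fixed by the images of the nine basis dyads, which is why the representation is unique.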

Exercise 3.26. An operator on the set of vectors x is given by the formula y = B · x where B is a second-order tensor. This can be extended to the set of second-order tensors by the equation $Y = B \cdot X^T$. Show that by using the fourth-order tensor $C = B \cdot e_n E e^n$, we get $C \cdot\cdot\, X^T = B \cdot X^T$ for all X. Note that we cannot represent Hooke's law using only this operation B · X.

Looking back, we note that the general form of a fourth-order isotropic tensor contains three independent isotropic tensors. Their properties are exhibited in the following exercises.

Exercise 3.27. Show that the identity operator from the representation (3.31) is $I = e_k e_m e^k e^m$; that is, for all A we have $I \cdot\cdot\, A^T = A$.


Exercise 3.28. A linear function is defined by the equality $F(A) = A^T$. Show that the corresponding fourth-order tensor is $e_k E e^k$, i.e., show that

$$e_k E e^k \cdot\cdot\, A^T = A^T.$$

Exercise 3.29. A linear function is defined by F(A) = (tr A)E. Show that the corresponding fourth-order tensor is EE, i.e., show that

$$EE \cdot\cdot\, A^T = (\operatorname{tr} A)\, E.$$

Isotropic scalar-valued functions

A scalar function of a tensor is said to be isotropic if it retains its form under any orthogonal transformation of the space (or equivalently, of its basis).

A scalar-valued function f(A) of a second-order argument A is isotropic if and only if for any orthogonal tensor Q we have

$$f(A) = f(Q \cdot A \cdot Q^T).$$

Let us demonstrate that any eigenvalue λ of a second-order tensor A, when considered as a scalar-valued function of A (i.e., λ = f(A)), is isotropic. Indeed, an eigenvalue satisfies the characteristic equation

$$\det(A - \lambda E) = 0.$$

The eigenvalues of $Q \cdot A \cdot Q^T$ satisfy

$$\det(Q \cdot A \cdot Q^T - \lambda E) = 0.$$

The equality

$$\begin{aligned}
\det(Q \cdot A \cdot Q^T - \lambda E) &= \det(Q \cdot A \cdot Q^T - \lambda\, Q \cdot E \cdot Q^T) \\
&= \det[Q \cdot (A - \lambda E) \cdot Q^T] \\
&= (\det Q)^2 \det(A - \lambda E) \\
&= \det(A - \lambda E)
\end{aligned}$$

shows that the eigenvalues of A and $Q \cdot A \cdot Q^T$ satisfy the same equation. Hence they coincide.
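The chain of equalities above can be checked numerically by evaluating both determinants at a few values of λ. In the sketch below, the tensor A, the orthogonal tensor Q (a rotation combined with a mirror reflection, so det Q = −1), and the sample values of λ are all arbitrary choices; Cartesian components are assumed.

```python
from math import cos, sin

IDX = range(3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in IDX) for j in IDX] for i in IDX]


def transpose(a):
    return [[a[j][i] for j in IDX] for i in IDX]


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def char_det(a, lam):
    """det(A - lam E) in Cartesian components."""
    shifted = [[a[i][j] - (lam if i == j else 0.0) for j in IDX] for i in IDX]
    return det3(shifted)


t = 1.1
rotation = [[cos(t), -sin(t), 0.0], [sin(t), cos(t), 0.0], [0.0, 0.0, 1.0]]
mirror = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # reflection
q = matmul(rotation, mirror)

a = [[2.0, 1.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, 1.0]]
a_rot = matmul(matmul(q, a), transpose(q))   # Q . A . Q^T

samples = [-2.0, 0.0, 0.5, 1.0, 3.5]
max_gap = max(abs(char_det(a, lam) - char_det(a_rot, lam)) for lam in samples)
det_q = det3(q)
print(det_q, max_gap)
```

Since the two cubic polynomials in λ agree at more than three points, they agree identically, so their roots (the eigenvalues) coincide.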

Exercise 3.30. Show that the invariants $I_1(A)$, $I_2(A)$, $I_3(A)$ are isotropic functions of A.

It can be shown that any scalar-valued isotropic function of a second-order tensor is a function of its invariants [Lurie (1990); Ogden (1997); Truesdell and Noll (2004)]. This is used in nonlinear elasticity to introduce the constitutive equations for isotropic bodies.

By Theorem 3.6, a scalar-valued linear function has the representation

$$f(A) = B \cdot\cdot\, A^T$$

for some second-order tensor B.

Theorem 3.8. The function $f(A) = B \cdot\cdot\, A^T$ is isotropic if and only if B is an isotropic tensor of order two, and hence a ball tensor: B = λE.

Proof. Because f is isotropic, we have

$$B \cdot\cdot\, A^T = B \cdot\cdot\, (Q \cdot A \cdot Q^T)^T$$

for any orthogonal Q. Using

$$B \cdot\cdot\, A^T = \operatorname{tr}(B \cdot A^T)$$

we get

$$\begin{aligned}
B \cdot\cdot\, (Q \cdot A \cdot Q^T)^T &= \operatorname{tr}(B \cdot Q \cdot A^T \cdot Q^T) \\
&= \operatorname{tr}(Q^T \cdot B \cdot Q \cdot A^T) \\
&= (Q^T \cdot B \cdot Q) \cdot\cdot\, A^T.
\end{aligned}$$

Hence for any A we have

$$B \cdot\cdot\, A^T = (Q^T \cdot B \cdot Q) \cdot\cdot\, A^T.$$

But this occurs if and only if the relation

$$B = Q^T \cdot B \cdot Q$$

holds for any Q. Therefore B is isotropic.

Exercise 3.31. Show that a scalar-valued, linear, isotropic function of a second-order tensor A is a linear function of tr A: that is, f(A) = λ tr A.

If a scalar function maintains its form under some subgroup of orthogonal transformations, then one can find functions that are invariant under these. The reader will find applications of this idea in books on crystallography and elasticity.


Isotropic tensor-valued functions

Now we consider a function F whose domain and range are the set of second-order tensors. We say that F(A) is isotropic if the components of its image value do not change from one Cartesian basis to another. So F is isotropic if

$$F(Q \cdot A \cdot Q^T) = Q \cdot F(A) \cdot Q^T$$

holds for any orthogonal tensor Q. An example of an isotropic tensor-valued function is F = λA.

As with a scalar-valued function, it can be shown that a tensor-valued linear function F represented in terms of a fourth-order tensor C,

$$F = C \cdot\cdot\, A^T,$$

is isotropic if and only if C is isotropic and therefore is given by (3.28). So F takes the form

$$F(A) = \alpha E \operatorname{tr} A + \beta A^T + \gamma A.$$

Mechanicists employ functions whose domains and ranges can be sets of symmetric tensors. This imposes certain additional restrictions on the form of C. Indeed, let $A = A^T$ and $F(A) = F(A)^T$. Using the representation

$$F(A) = c_{ijmn}\, a_{mn}\, i_i i_j$$

with a Cartesian basis $i_i$, we see that

$$c_{ijmn} = c_{jimn} = c_{ijnm}.$$

In the general case, C has 81 independent components. But, in view of symmetry, C has only 36 independent components. An isotropic fourth-order tensor C satisfying the symmetry conditions takes the form

$$\alpha EE + \beta (e_k E e^k + I).$$

So the general form of an isotropic linear function satisfying the symmetry condition is

$$F(A) = \alpha E \operatorname{tr} A + 2\beta A.$$

We will introduce the elements of calculus for functions of tensorial arguments. First, we require a way to gauge the magnitude of a vector or tensor. Suitable norms exist for this purpose.


3.12 Norms for Tensors, and Some Spaces

In § 2.7 we introduced a norm in the space $\mathbb{R}^3$. It can be immediately extended to $\mathbb{R}^k$ for any k as

$$\|x\| = (x \cdot x)^{1/2},$$

where the inner product of x and y in the space is given by x · y.

Similarly, we can introduce a norm and inner product in the set of second-order tensors. We denote the inner product by (A, B) and define it using the dot product as

$$\begin{aligned}
(A, B) = A \cdot\cdot\, B^T &= a^{ij} e_i e_j \cdot\cdot\, b^{ts} e_s e_t \\
&= a^{ij} b^{ts} g_{js} g_{it} \\
&= a_{ij} b_{ts} g^{js} g^{it} \\
&= a_{ij} e^i e^j \cdot\cdot\, b^{ts} e_s e_t \\
&= a_{ij} b^{ts} \delta_s^j \delta_t^i \\
&= a_{ij} b^{ij}.
\end{aligned}$$

It is clear that in a Cartesian frame (A, A) is the sum of all the squared components of A, so this is quite similar to the scalar product of vectors. Using the same reasoning as above, we can show that the axioms of the scalar product hold here as well (note that using only a Cartesian frame, we could regard the components of a tensor as those of a nine-dimensional vector, so this is another reason why the axioms hold). In this case the inner product axioms are written for arbitrary second-order tensors A, B, C as

(i) (A, A) ≥ 0, and (A, A) = 0 if and only if A = 0;
(ii) (A, B) = (B, A);
(iii) (αA + βB, C) = α(A, C) + β(B, C) for any real α, β.

By linear algebra, the expression

$$\|A\| = (A, A)^{1/2}$$

is a norm in the set of second-order tensors. It satisfies the following axioms.

(i) ‖A‖ ≥ 0, with ‖A‖ = 0 if and only if A = 0;
(ii) ‖αA‖ = |α| ‖A‖ for any real α;
(iii) ‖A + B‖ ≤ ‖A‖ + ‖B‖.


We still have the Schwarz inequality

$$|(A, B)| \le \|A\|\, \|B\|.$$

In linear algebra, many particular implementations of the vector and tensor norms can be introduced. We can do the same in any fixed frame, but if we wish to change the frames under consideration we must remember that these norms should change in accordance with the tensor transformation rules.

Note that the representation of a linear function in the previous section used $A^T$ as an argument. This reflects the form of the inner product on the set of second-order tensors.

It is worth noting two important properties of the norms we have introduced, as these are used in analysis. Let A and B be second-order tensors and x a vector. Then the relations

$$\|A \cdot x\| \le \|A\|\, \|x\|$$

and

$$\|A \cdot B\| \le \|A\|\, \|B\|$$

hold. The latter implies that

$$\|A^k\| \le \|A\|^k. \qquad (3.32)$$

Using this property, we can justify the introduction of tensor-valued functions like $e^A$.

Exercise 3.32. Using (3.32), prove convergence of the series

$$e^A = E + \frac{1}{1!} A + \cdots + \frac{1}{k!} A^k + \cdots.$$
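A numerical experiment can accompany the estimate (3.32) and the exponential series, without replacing the convergence proof asked for in the exercise. The sketch below assumes Cartesian components, so that ‖A‖ is the square root of the sum of squared components. The antisymmetric test tensor, the truncation at 30 terms, and the helper names are choices made for this illustration; for this particular tensor the series should reproduce a rotation about the third axis.

```python
from math import cos, sin, sqrt

IDX = range(3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in IDX) for j in IDX] for i in IDX]


def norm(a):
    """(A, A)^(1/2) in a Cartesian frame: root of the sum of squared components."""
    return sqrt(sum(a[i][j] ** 2 for i in IDX for j in IDX))


def identity():
    return [[1.0 if i == j else 0.0 for j in IDX] for i in IDX]


def exp_series(a, terms=30):
    """Partial sum E + A/1! + ... + A^(terms-1)/(terms-1)! of the exponential series."""
    result = identity()
    term = identity()                       # holds A^k / k!
    for k in range(1, terms):
        term = matmul(term, a)
        term = [[term[i][j] / k for j in IDX] for i in IDX]
        result = [[result[i][j] + term[i][j] for j in IDX] for i in IDX]
    return result


t = 0.7
omega = [[0.0, -t, 0.0], [t, 0.0, 0.0], [0.0, 0.0, 0.0]]   # antisymmetric tensor

# Check ||A^k|| <= ||A||^k for the first few powers
power = identity()
bound_holds = True
for k in range(1, 7):
    power = matmul(power, omega)
    bound_holds = bound_holds and norm(power) <= norm(omega) ** k + 1e-12

q = exp_series(omega)
q_qt = matmul(q, [[q[j][i] for j in IDX] for i in IDX])
orth_error = max(abs(q_qt[i][j] - (1.0 if i == j else 0.0)) for i in IDX for j in IDX)
print(bound_holds, orth_error)
```

Each term of the series is bounded in norm by ‖A‖^k / k!, which is the comparison that (3.32) makes available.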

It is easy to extend the notion of inner product to tensors of any order; we introduce the inner product

$$(A, B) = a_{i_1 i_2 \cdots i_n}\, b^{i_1 i_2 \cdots i_n}.$$

The reader can represent this in all its particular forms and verify the inner product axioms.


Some elements of calculus

As in ordinary calculus, we can introduce the notion of a function in one or many variables that takes values in a set of vectors or tensors. Such a function will be given on a certain set. Consider, for example, a function on the segment [a, b] with values in $\mathbb{R}^3$. This mapping pairs each point of [a, b] with at most one vector from $\mathbb{R}^3$. We may similarly construct a function from [a, b] to the set of tensors of some order. Furthermore, we can introduce the notions of limit and continuity at a point $t_0$ in the same way as in calculus:

The function $f : [a, b] \to \mathbb{R}^3$ has limit a at $t = t_0 \in [a, b]$ if for any ε > 0 there is a δ > 0, dependent on ε, such that for any $t \ne t_0$ with $|t - t_0| \le \delta$ we have $\|f(t) - a\| < \varepsilon$. When $a = f(t_0)$, we say the function is continuous at $t_0$.

Various norms can be used on $\mathbb{R}^3$. However, in linear algebra it is shown that on a finite-dimensional space all norms are equivalent. Equivalence of two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ means that there exist positive constants $c_1$ and $c_2$ such that for any nonzero element x of the space we have

$$0 < c_1 \le \frac{\|x\|_1}{\|x\|_2} \le c_2 < \infty,$$

where $c_1$ and $c_2$ do not depend on x. Consequently, either norm can be used in the definition of limit: the limit will exist or not, independently of the form taken by the norm.

As in calculus, we say that a function f(t) is continuous on [a, b] if it is continuous at each point of [a, b]. When a function takes values in $\mathbb{R}^3$, the ordinary definitions of limit, derivative, integral, etc., can be modified by replacing the absolute value with a suitable norm. The derivative of f(t) at $t_0$ is given by

$$f'(t_0) = \left.\frac{df}{dt}\right|_{t=t_0} = \lim_{t \to t_0} \frac{f(t) - f(t_0)}{t - t_0}.$$

The derivative of a vector-valued function has many properties familiar from ordinary calculus. For example, if f(t) and g(t) are both differentiable at t, then the product rule holds in the form

$$(f(t) \cdot g(t))' = f'(t) \cdot g(t) + f(t) \cdot g'(t).$$

If we expand f in a basis $e_1, e_2, e_3$ so that $f(t) = f^k(t)\, e_k$, then

$$\frac{df(t)}{dt} = \frac{df^k(t)}{dt}\, e_k.$$


Here we have assumed that the $e_k$ do not depend on t; otherwise an application of the product rule would have been required (we will encounter this situation later). Definite integration can also be carried out in componentwise fashion:

$$\int_a^b f(t)\, dt = \left( \int_a^b f^k(t)\, dt \right) e_k.$$

The integral has all the properties familiar from calculus.

Clearly, all this can be extended to the case of a function in one scalar variable taking values in the set of tensors of some order. Moreover, the theory of functions in many variables is generalized in a similar way to the tensorial functions. The reader should remember that formally, in all the definitions of ordinary calculus, we should change the absolute value to the norm. We record here only the formula for the derivative of a tensorial function $F(t) = f^{mn}(t)\, e_m e_n$:

$$\frac{dF(t)}{dt} = \frac{df^{mn}(t)}{dt}\, e_m e_n.$$

Some normed spaces

In textbooks on functional analysis, the norm of a vector function is usually introduced in a Cartesian frame. For example, the norm on the space C(V) of continuous vector functions given on a compact region V is

$$\|f(x)\|_C = \|f_k(x)\, i_k\|_C = \max_k \left( \max_V |f_k(x)| \right). \qquad (3.33)$$

This formula depends on the Cartesian frame $i_k$. If we use a similar norm involving vector components in a frame having singular points in V (as may be the case with spherical coordinates), we obtain a norm that is not equivalent to (3.33). This means that (3.33) is an improper way to characterize the intensity of a vector field.

However, the proper norm for a function given in curvilinear coordinates is based on the above norm:

$$\|f(x)\| = \|f^i(x)\, r_i\| = \max_V \left[ f^i(x) f_i(x) \right]^{1/2}. \qquad (3.34)$$

If we would like to use a norm of the type (3.33), we need to remember that during transformation of the frame we must change the form of the norm accordingly.

On the set of second-order tensor functions continuous on a compact region V, we can introduce a norm similar to (3.34):

$$\|A(x)\| = \|a^{ij}(x)\, r_i r_j\| = \max_V \left[ a^{ij}(x) a_{ij}(x) \right]^{1/2}.$$

Finally, note that instead of the norms for continuous vector and tensor functions we can introduce the norms and scalar products corresponding to the space of scalar functions $L^2(V)$. The inner product is then

$$(A, B) = \int_V a^{ij} b_{ij}\, dV. \qquad (3.35)$$

The reader should verify all the axioms of the inner product for this, and introduce all forms of the inner product and norm corresponding to (3.35).

Exercise 3.33. Let A be a tensor of order two with ‖A‖ = q < 1. Demonstrate that E − A has the inverse $(E - A)^{-1}$ that is equal to

$$E + A + A^2 + A^3 + \cdots + A^n + \cdots.$$

Exercise 3.34. Let A be a tensor of order two with ‖A‖ = q < 1. What is the inverse to E + A?

3.13 Differentiation of Tensorial Functions

In the linear theory of elasticity, there are functions that relate two tensors. The stress and strain tensors, for example, are related through the generalized form of Hooke's law. More complex relations occur in nonlinear elasticity or plasticity. Here one must differentiate tensorial functions. In elasticity, for example, the stress tensor can be found as the derivative of the strain energy with respect to the strain tensor.

We recall that for an ordinary function f(x), the derivative is

$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$

The first differential is given by the formula

$$df = f'(x)\, dx. \qquad (3.36)$$

For a function in n variables we have

$$df(x_1, \ldots, x_n) = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, dx_k.$$


Using a Cartesian basis $i_1, \ldots, i_n$, we can formally represent this as

$$df(x_1, \ldots, x_n) = \left( \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, i_k \right) \cdot \sum_{m=1}^{n} dx_m\, i_m.$$

Let us regard

$$x = \sum_{m=1}^{n} x_m\, i_m$$

as a vector, write

$$f(x_1, \ldots, x_n) = f(x),$$

and consider this as a function in a vectorial variable. In the same way we introduce

$$dx = \sum_{m=1}^{n} dx_m\, i_m$$

where the $dx_m$ are some quantities that are not necessarily infinitesimal. Let ε be a real variable. For fixed x and dx, the function f(x + ε dx) is a function in one variable ε. The chain rule formally applied to this function gives us

$$\left.\frac{df(x + \varepsilon\, dx)}{d\varepsilon}\right|_{\varepsilon=0} = \left( \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, i_k \right) \cdot \sum_{m=1}^{n} dx_m\, i_m, \qquad (3.37)$$

hence

$$df(x_1, \ldots, x_n) = \left.\frac{df(x + \varepsilon\, dx)}{d\varepsilon}\right|_{\varepsilon=0}. \qquad (3.38)$$

The right-hand side of this equality is termed the Gateaux derivative of f at the point x in the direction dx. For the intermediate expression, we introduce the notation

$$f_{,x} = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}\, i_k$$

and call it the derivative of f with respect to x. Later we will refer to this vector quantity as the gradient of f.

We began with a function in n variables, but could in fact consider a function of a vectorial argument f(x) and present the same operations in non-component form. From (3.38) and (3.37) it follows that

$$df(x_1, \ldots, x_n) = \left.\frac{df(x + \varepsilon\, dx)}{d\varepsilon}\right|_{\varepsilon=0} = f_{,x} \cdot dx.$$

When we derive a relation in Cartesian coordinates but present it in non-component form, it becomes valid for any basis. The reader may wish to verify this by direct calculation. Note that in any basis, the components of f,x are partial derivatives of f with respect to the components of the expansion of x in the basis. When we wish to have this in a form that does not include the basis vectors, we must use the expansion of dx in the dual basis.

These ideas can be extended to tensorial functions of tensorial arguments in a straightforward manner.

First we consider a scalar-valued function f(X) whose argument X belongs to the second-order tensors. In a fixed basis, it can be considered as a function in 3 × 3 = 9 variables, the components of X. As above, in a Cartesian basis in $\mathbb{R}^3$ we can introduce the first differential df. Then we introduce the Gateaux derivative:

$$\left.\frac{\partial}{\partial\varepsilon} f(X + \varepsilon\, dX)\right|_{\varepsilon=0} \equiv \lim_{\varepsilon \to 0} \frac{f(X + \varepsilon\, dX) - f(X)}{\varepsilon} = f_{,X} \cdot\cdot\, dX^T \qquad (3.39)$$

for any tensor dX, not necessarily infinitesimal. Here, in the Cartesian basis we have

$$f_{,X} = \frac{\partial f}{\partial x_{mn}}\, i_m i_n,$$

where the $x_{mn}$ are the Cartesian components of X. Then

$$df = f_{,X} \cdot\cdot\, dX^T.$$

The expression f,X is called the derivative of f with respect to the tensor argument X. Although the last formula was derived in Cartesian coordinates, it holds in any basis.

For a function that maps values X in the set of second-order tensors into the same set, the derivative F,X(X) is defined as

$$F_{,X} \cdot\cdot\, dX^T = \left.\frac{\partial}{\partial\varepsilon} F(X + \varepsilon\, dX)\right|_{\varepsilon=0} \equiv \lim_{\varepsilon \to 0} \frac{F(X + \varepsilon\, dX) - F(X)}{\varepsilon}. \qquad (3.40)$$

This is a particular case of the Gateaux derivative. Again, we can repeat the method used above for the first differential written in Cartesian components. The order of F,X(X) is four. In component form it is

$$F_{,X} = \frac{\partial f_{ij}}{\partial x_{mn}}\, i_i i_j i_m i_n.$$

(We recall that in Cartesian bases $i_k = i^k$, and hence the summation convention applies in situations such as this.) For

f,X(X) and F,X(X),


the notations

$$\frac{df(X)}{dX} \quad \text{and} \quad \frac{dF(X)}{dX}$$

are also used.

In a similar way, we can define the derivative of a tensor-valued function for tensors of any order.

Clearly, when calculating dF we get a linear function in dX. This is why we studied linear functions earlier.

Now we introduce a partial derivative for a scalar-valued function $f(X_1, \ldots, X_m)$ in several tensorial arguments. Let the $X_i$ be second-order tensors. The partial derivative of f with respect to $X_i$, denoted by $\partial f / \partial X_i$, is defined by the equality

$$\frac{\partial f}{\partial X_i} \cdot\cdot\, Y^T = \left.\frac{\partial}{\partial\varepsilon} f(X_1, \ldots, X_i + \varepsilon Y, \ldots, X_m)\right|_{\varepsilon=0}$$

for any second-order tensor Y. When the function F takes values in the set of second-order tensors, the partial derivative $\partial F / \partial X_i$ is similarly defined by the equality

$$\frac{\partial F}{\partial X_i} \cdot\cdot\, Y^T = \left.\frac{\partial}{\partial\varepsilon} F(X_1, \ldots, X_i + \varepsilon Y, \ldots, X_m)\right|_{\varepsilon=0}.$$

As in the case of a tensorial function of one variable, the components of the above partial derivatives can be expressed as ordinary partial derivatives of the components. We describe this representation in Cartesian coordinates. Let

$$f(X_1, \ldots, X_m) = f(x^{(1)}_{ij}, \ldots, x^{(m)}_{ij})$$

be a function in 9m variables, the components $x^{(k)}_{ij}$ of $X_k$. Then

$$\frac{\partial f}{\partial X_i} = \frac{\partial f}{\partial x^{(i)}_{jk}}\, i_j i_k.$$

Exercise 3.35. Find the derivative of $f(X) = I_1(X) \equiv \operatorname{tr} X$ with respect to X.

Exercise 3.36. Find the derivative of $f(X) = \operatorname{tr} X^2$.

Exercise 3.37. Using the method applied to the previous exercise, show that the derivative of $f(X) = \operatorname{tr} X^3$ with respect to X is $3(X^T)^2$.
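The result stated in Exercise 3.37 can be compared with a finite-difference estimate of the Gateaux derivative (3.39). The sketch below is a numerical plausibility check under stated assumptions, not a derivation: it uses Cartesian components, a central difference with a hand-picked step, and arbitrary test tensors X and Y; the function names are hypothetical.

```python
IDX = range(3)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in IDX) for j in IDX] for i in IDX]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def f(x):
    """f(X) = tr X^3."""
    return trace(matmul(matmul(x, x), x))


def gateaux(func, x, y, eps=1e-5):
    """Central-difference estimate of d/d(eps) func(X + eps Y) at eps = 0."""
    plus = [[x[i][j] + eps * y[i][j] for j in IDX] for i in IDX]
    minus = [[x[i][j] - eps * y[i][j] for j in IDX] for i in IDX]
    return (func(plus) - func(minus)) / (2.0 * eps)


x = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]]
y = [[0.5, -1.0, 2.0], [1.0, 0.0, 1.0], [0.0, 3.0, -2.0]]   # plays the role of dX

x2 = matmul(x, x)
grad = [[3.0 * x2[j][i] for j in IDX] for i in IDX]          # 3 (X^T)^2 = 3 (X^2)^T
predicted = sum(grad[m][n] * y[m][n] for m in IDX for n in IDX)   # f,X .. Y^T
numerical = gateaux(f, x, y)
print(predicted, numerical)
```

In Cartesian components the double dot product f,X ·· Y^T is the sum of products of like-indexed components, which is what `predicted` evaluates.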


Exercise 3.38. Show that the derivative of

$$f(X) = I_2(X) \equiv \tfrac{1}{2}\left[ \operatorname{tr}^2 X - \operatorname{tr} X^2 \right]$$

with respect to X is

$$I_2(X)_{,X} = I_1(X)\, E - X^T.$$

Exercise 3.39. Show that the derivative of

$$f(X) = I_3(X) \equiv \det X$$

with respect to X is

$$\left[ X^2 - I_1(X)\, X + I_2(X)\, E \right]^T.$$

The formulas for differentiating the invariants of the strain tensor are used to write down the constitutive equation for a nonlinear elastic isotropic material under finite deformation.

Exercise 3.40. The strain energy of an isotropic elastic medium is a function of the invariants $I_k$ of the strain tensor, $f = f(I_1, I_2, I_3)$. Using the results of Exercises 3.35, 3.36, and 3.39, demonstrate that its derivative is

$$f_{,X} = \left[ \frac{\partial f}{\partial I_1} + I_1 \frac{\partial f}{\partial I_2} + I_2 \frac{\partial f}{\partial I_3} \right] E - \left( \frac{\partial f}{\partial I_2} + I_1 \frac{\partial f}{\partial I_3} \right) X^T + \frac{\partial f}{\partial I_3} (X^T)^2.$$

We have presented examples of the derivatives of scalar-valued functions. The derivatives of tensor-valued functions are more complicated. Note that the derivative of a linear function

$$F(X) = C \cdot\cdot\, X^T$$

is C:

$$F_{,X} = C.$$

Exercise 3.41. Using Exercises 3.27 and 3.28, verify that if F(X) = X then $F_{,X} = I$. If $F(X) = X^T$, then $F_{,X} = e_k E e^k$.

By the exercise above,

$$X_{,X} = I.$$

That is, the derivative of a second-order tensor X with respect to X is I, a tensor of order four.


On symmetric tensor functions

The principal tensors of linear elasticity, the stress tensor σ and strain tensor ε, are symmetric (cf. Chapter 6). So we consider the problem of differentiating a tensor-valued function of a symmetric tensorial argument, as this specific case has some peculiarities.

Let us consider the derivative of a scalar-valued function f(X) of a symmetric second-order tensor X. We modify the definition (3.39) as follows. The derivative f,X is a second-order symmetric tensor that satisfies the condition

$$\left.\frac{\partial}{\partial\varepsilon} f(X + \varepsilon\, dX)\right|_{\varepsilon=0} = f_{,X} \cdot\cdot\, dX \qquad (3.41)$$

for any symmetric tensor dX. The component representation of f,X in a Cartesian basis is

$$f_{,X} = \frac{\partial f}{\partial x_{mn}}\, i_m i_n.$$

So we have brought the symmetry of f,X into the definition:

$$(f_{,X})^T = f_{,X}.$$

Why do we require f,X to be symmetric? The sets of symmetric and antisymmetric second-order tensors are subspaces of the space of all second-order tensors. Let A be a symmetric tensor and B an antisymmetric second-order tensor. It is easy to see that A ·· B = 0. So the subspaces of symmetric and antisymmetric tensors are mutually orthogonal. In (3.41), dX is arbitrary but symmetric. If in this definition we do not require f,X to be symmetric, then (3.41), holding only for all the symmetric tensors dX, defines f,X non-uniquely up to an additive term B that can be any antisymmetric second-order tensor.

Note that the definition of derivative for a function with respect to a symmetric tensor argument is closely related to the problem of representing a linear function acting on the subspace of symmetric tensors, which differs slightly from the general case (see Problems 3.52, 3.53, and 3.54).

As an example, we find the derivative of the function

$$f(X) = C \cdot\cdot\, X,$$

where C is a second-order tensor and $X^T = X$. By definition,

$$f_{,X} \cdot\cdot\, dX = C \cdot\cdot\, dX$$

for all dX such that $dX = dX^T$. At first glance it seems that f,X = C, and this does hold for an arbitrary argument X. But for a symmetric argument X the answer changes. Indeed the derivative must be a symmetric tensor, so f,X = C if and only if $C = C^T$. Now we consider the case when C is not symmetric. We said that f,X must be symmetric. A consequence is that f,X becomes

$$f_{,X} = \tfrac{1}{2}(C + C^T).$$

This expression is uniquely defined. Indeed, we can represent C as a sum of symmetric and antisymmetric tensors:

$$C = \tfrac{1}{2}(C + C^T) + \tfrac{1}{2}(C - C^T).$$

But $C^T \cdot\cdot\, dX = C \cdot\cdot\, dX$ for symmetric dX, so

$$\tfrac{1}{2}(C - C^T) \cdot\cdot\, dX = \tfrac{1}{2}(C \cdot\cdot\, dX - C \cdot\cdot\, dX) = 0$$

and thus

$$C \cdot\cdot\, dX = \tfrac{1}{2}(C + C^T) \cdot\cdot\, dX$$

as claimed.

In a similar fashion, we should modify the definition of the derivative of a tensor-valued function of a second-order symmetric argument. If F takes values in the set of tensors of order n, the derivative with respect to X is denoted by F,X; it takes values in the subset of tensors of order n + 2 which are symmetric in the two last indices and satisfy the equality

$$F_{,X} \cdot\cdot\, dX = \left.\frac{\partial}{\partial\varepsilon} F(X + \varepsilon\, dX)\right|_{\varepsilon=0} \qquad (3.42)$$

for all symmetric second-order tensors dX. The symmetry of F,X in the last two indices means that for the components $F_{ij\ldots pt}$ of F,X we have

$$F_{ij\ldots pt} = F_{ij\ldots tp}.$$

Similar changes permit the definition of partial derivative for tensorial functions in many symmetric tensorial arguments.
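To make the scalar example f(X) = C ·· X concrete, take the non-symmetric dyad $C = i_1 i_2$ in a Cartesian basis (a choice made here purely for illustration). With the dot-product rule used throughout (the basis vectors nearest the dots are multiplied first), a sketch of the computation is:

```latex
f(X) = i_1 i_2 \cdot\cdot\, X = x_{21},
\qquad
f_{,X} = \tfrac{1}{2}\,(i_1 i_2 + i_2 i_1),
\qquad
f_{,X} \cdot\cdot\, dX = \tfrac{1}{2}\,(dx_{21} + dx_{12}) = dx_{21}
\quad \text{for } dX = dX^T .
```

The symmetric tensor ½(i_1 i_2 + i_2 i_1) reproduces the variation of f for every symmetric dX, whereas C itself is not symmetric and so cannot serve as the derivative under the modified definition.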

Exercise 3.42. Let

F(X) = C ··X,


where $C = c_{mnpt}\, i_m i_n i_p i_t$ is a fourth-order tensor and X is a second-order symmetric tensor. Demonstrate that

$$F_{,X} = \tfrac{1}{2}(C + C'),$$

where C′ is derived from C by transposing the last two indices: $C' = c_{mnpt}\, i_m i_n i_t i_p$.

Exercise 3.43. Let

$$f(X) = \tfrac{1}{2}\, X \cdot\cdot\, C \cdot\cdot\, X,$$

where $C = c_{mnpt}\, i_m i_n i_p i_t$ and X is a second-order symmetric tensor. Demonstrate that

$$f_{,X} = \tfrac{1}{4}\left( C \cdot\cdot\, X + C'' \cdot\cdot\, X + X \cdot\cdot\, C + X \cdot\cdot\, C' \right),$$

where C′ is derived from C by transposing the last two indices of the components of C, i.e., $C' = c_{mnpt}\, i_m i_n i_t i_p$, and C″ by transposing the first two indices of the components, i.e., $C'' = c_{mnpt}\, i_n i_m i_p i_t$.

Exercise 3.44. Let

$$f(X) = \tfrac{1}{2}\, X \cdot\cdot\, C \cdot\cdot\, X,$$

where C is a fourth-order tensor and X is a second-order symmetric tensor. Suppose C = C′ = C″ and C ·· X = X ·· C for any symmetric tensor X; in terms of components, this means the equalities $c_{mnpt} = c_{nmpt}$ and $c_{mnpt} = c_{ptmn}$ hold for any sets of indices. Using the solution of Exercise 3.43, demonstrate that f,X = C ·· X.

3.14 Problems

In this section, unless otherwise stated, we use A, B, C, X to denote second-order tensors, λ a scalar, Ω an antisymmetric second-order tensor, and Q an orthogonal second-order tensor.

3.1 Write out the components of the dyad $i_1 i_2$ in a Cartesian basis $(i_1, i_2, i_3)$.

3.2 Find the components of the tensor $i_1 i_2 - i_2 i_1 + 2 i_3 i_3$ in the Cartesian basis $(i_1, i_2, i_3)$.

3.3 Write out the components of the tensor that is the dyad $a_1 a_2$ in the Cartesian basis $(i_1, i_2, i_3)$, where $a_1 = (-1, 2, -2)$ and $a_2 = (1, -1, 2)$.

3.4 Show that a0 = 0b = 0.

3.5 Determine the symmetric and antisymmetric parts of the following tensors:

(a) $i_1 i_2$;
(b) $i_1 i_2 - i_2 i_1 + 2 i_3 i_3$;
(c) $i_1 i_2 - 2 i_2 i_1 + i_1 i_3$;
(d) $i_1 i_2 + i_2 i_3 + i_1 i_3$;
(e) $i_1 i_1 + 2 i_1 i_2 + 2 i_2 i_1 + i_3 i_1 + i_1 i_3$.

3.6 Determine the ball and deviator parts of the following tensors:

(a) $i_1 i_2$;
(b) $i_1 i_2 + i_2 i_1$;
(c) $i_1 i_1$;
(d) aa;
(e) $i_1 i_1 + 2 i_1 i_2 + 2 i_2 i_1 + i_3 i_1 + i_1 i_3$.

3.7 Show that

$$I_1(X) = \operatorname{tr} X, \qquad I_2(X) = \tfrac{1}{2}\left[ \operatorname{tr}^2 X - \operatorname{tr} X^2 \right], \qquad I_3(X) = \det X.$$

3.8 Find the invariants of the following tensors:

(a) aa;
(b) $i_1 i_2$;
(c) $i_1 i_1 + i_2 i_2$;
(d) λE;
(e) $2 i_1 i_1 + 3 i_2 i_2 + 4 i_3 i_3$.

3.9 Let A be a symmetric, invertible tensor. Show that

$$I_1(A^{-1}) = \frac{I_2(A)}{I_3(A)}, \qquad I_2(A^{-1}) = \frac{I_1(A)}{I_3(A)}, \qquad I_3(A^{-1}) = I_3^{-1}(A).$$

3.10 Find the left and right polar decompositions of the following tensors:

(a) λE;
(b) aa + bb + cc if a, b, c are mutually orthogonal;


(c) λE + ai1i1;(d) λE + ai1i1 + bi2i2;(e) ai1i1 + bi2i2 + ci3i3.

3.11 Suppose trA2 = 0, a is an arbitrary scalar, and Y is an arbitrarysecond-order tensor. Demonstrate that

X = aY − atr(A ·Y)

trA2A

satisfies the equation tr(A ·X) = 0.

3.12 Let a nonzero scalar a and second-order tensors A,B be given. Finda solution X of the equation

aX + tr(A · X)E = B.

3.13 For the equations

(a) X + tr(A · X)B = C,(b) XT + tr(A ·X)B = C,(c) X + a(trX)A = B,

find a solution X and establish conditions for uniqueness.

3.14 Under what conditions will the equation

aX + E trX = 0

have a nonzero solution X?

3.15 Under what conditions will the equation

aX + A trX = 0

have a nontrivial solution?

3.16 Demonstrate that in a Cartesian basis

E = i1i2i3 + i2i3i1 + i3i1i2 − i1i3i2 − i3i2i1 − i2i1i3.

3.17 Let A be a symmetric tensor. Show that I2(dev A) ≤ 0.

3.18 Let A be a nonsingular tensor and let a and b be arbitrary vectors.Prove that A+ab is a nonsingular tensor if and only if 1+b ·A−1 · a = 0,and

(A + ab)−1 = A−1 − 11 + b ·A−1 · a (A−1 · a)(b ·A−1).


3.19 Show that (E× ω)2 = ωω − Eω · ω.

3.20 Show that

(a) (E× ω)2n = (−1)n−1(ω · ω)n−1(ωω − Eω · ω),(b) (E× ω)2n+1 = (−1)n(ω · ω)nE× ω,(c) a × E× b = ba − a · bE.

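3.21 For a second-order tensor $\mathbf{A} = a^{mn}\mathbf{e}_m\mathbf{e}_n$, the vectorial invariant was introduced by J.W. Gibbs (1839–1903) as

\[
\mathbf{A}_\times = a^{mn}\,\mathbf{e}_m \times \mathbf{e}_n.
\]

For example, $(\mathbf{a}\mathbf{b})_\times = \mathbf{a}\times\mathbf{b}$. Find the vectorial invariants of the following tensors: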

(a) aa;(b) E× ω;(c) ab − ba;(d) E;(e) A if A = AT .

3.22 Show that $\mathbf{A}_\times = \tfrac{1}{2}(\mathbf{A} - \mathbf{A}^T)_\times$.

3.23 Show that (a × A)× = A · a − a trA.

3.24 Show that $(\mathbf{A}\times\mathbf{B})\cdot\cdot\,\mathbf{B}^T = 0$ holds for arbitrary second-order tensors $\mathbf{A}$ and $\mathbf{B}$.

3.25 Show that tr [b × (a × A)] = b ·A · a − a · b trA.

3.26 Show that

A × ω = −(ω × AT )T , ω × A = −(AT × ω)T .

3.27 Two second-order tensors A and B are called commutative if A · B = B · A. Show that symmetric tensors A and B are commutative if their sets of eigenvectors coincide.

3.28 Let a · b = 0. Demonstrate that the dyads aa and bb are commutative.

3.29 Let the symmetric tensors A and B be commutative. Demonstrate that $(\mathbf{A}\cdot\mathbf{B})_\times = \mathbf{0}$.


3.30 Show that the tensor Q = i1i2 − i2i1 + i3i3 is orthogonal.

3.31 Show that the principal invariants of an orthogonal tensor satisfy the relations $I_1 I_3 = I_2$ and $I_3^2 = 1$.

3.32 Let e · e = 1. Show that the tensor Q = E− 2ee is orthogonal.

3.33 Show that if Ω is an antisymmetric tensor, then the tensor

Q = (E + Ω) · (E− Ω)−1

is orthogonal.

3.34 Show that the tensor $\mathbf{Q} = e^{\boldsymbol{\Omega}}$ is orthogonal if $\boldsymbol{\Omega}$ is antisymmetric.

3.35 Demonstrate that the tensor

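\[
\mathbf{Q} = \frac{1}{4 + \theta^2}\left[(4 - \theta^2)\mathbf{E} + 2\boldsymbol{\theta}\boldsymbol{\theta} - 4\mathbf{E}\times\boldsymbol{\theta}\right],
\]

where $\theta^2 = \boldsymbol{\theta}\cdot\boldsymbol{\theta}$, is orthogonal. We call $\boldsymbol{\theta}$ a finite rotation vector.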

3.36 Let Q be an orthogonal tensor. Establish the identity

(Q− E) ·· (QT − E) = 6 − 2 trQ.

3.37 Let $\mathbf{e}_k$ ($k = 1, 2, 3$) and $\mathbf{d}_m$ ($m = 1, 2, 3$) be two orthonormal bases. Verify that $\mathbf{Q} = \mathbf{e}_k\mathbf{d}_k$ is an orthogonal tensor.

3.38 Let Q be a proper orthogonal tensor. Establish the identity

Q× = 2 sinωe,

where ω and e are the rotation angle and axis, respectively.

3.39 Find the principal invariants of the tensor αE + βee.

3.40 Find the principal invariants of the tensor E × ω.

3.41 Find the principal invariants of the tensor (E × ω)2.

3.42 Show that

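\[
I_3(\mathbf{A}) = \tfrac{1}{6}\left(\operatorname{tr}^3\mathbf{A} - 3\operatorname{tr}\mathbf{A}\,\operatorname{tr}\mathbf{A}^2 + 2\operatorname{tr}\mathbf{A}^3\right).
\]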

3.43 From the Cayley–Hamilton theorem for a non-degenerate tensor, it follows that

\[
\mathbf{A}^{-1} = \frac{1}{\det\mathbf{A}}\left(\mathbf{A}^2 - I_1(\mathbf{A})\mathbf{A} + I_2(\mathbf{A})\mathbf{E}\right).
\]

Use this to find the inverses of the following tensors:


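(a) $\mathbf{i}_1\mathbf{i}_1 + 2\mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3 + \mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1$; (b) $a\mathbf{i}_1\mathbf{i}_3 + b\mathbf{i}_2\mathbf{i}_2 + c\mathbf{i}_3\mathbf{i}_1$, where $a, b, c \neq 0$; (c) $a\mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3 + b\mathbf{i}_1\mathbf{i}_2$; (d) $a\mathbf{E} + b\mathbf{i}_1\mathbf{i}_2$.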

3.44 Find the derivative with respect to the tensor X of the following scalar-valued functions:

(a) trX4;(b) a ·X · a;(c) a ·X · b;(d) tr(B · X).

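3.45 Let $\mathbf{A}$ be a second-order tensor. A tensor $\mathbf{X}$ is called the cofactor of $\mathbf{A}$ and denoted by $\mathbf{X} = \operatorname{cof}\mathbf{A}$ if $\mathbf{X}$ satisfies the equation

\[
\mathbf{A}\cdot\mathbf{X}^T = \mathbf{X}^T\cdot\mathbf{A} = (\det\mathbf{A})\,\mathbf{E}.
\]

If $\mathbf{A}$ is nonsingular, then $\operatorname{cof}\mathbf{A} = (\det\mathbf{A})\,\mathbf{A}^{-T}$. Check the following properties of the cofactor:

(a) $\operatorname{cof}(\mathbf{A}^T) = (\operatorname{cof}\mathbf{A})^T$; (b) $\operatorname{cof}(\mathbf{A}\cdot\mathbf{B}) = \operatorname{cof}\mathbf{A}\cdot\operatorname{cof}\mathbf{B}$; (c) $\operatorname{cof}(\lambda\mathbf{A}) = \lambda^2\operatorname{cof}\mathbf{A}$; (d) $I_2(\mathbf{A}) = \operatorname{tr}\operatorname{cof}\mathbf{A}$.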

3.46 Let A = λ1i1i1 + λ2i2i2 + λ3i3i3. Demonstrate that

cof A = λ2λ3i1i1 + λ1λ3i2i2 + λ1λ2i3i3.

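3.47 Let $f(x)$ be an analytic function in an open ball $|x| < r$ centered at $x = 0$; that is, suppose the representation

\[
f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!}\,x^k
\]

holds and the series converges for $|x| < r$. In the ball, the relation

\[
f'(x) = \sum_{k=1}^{\infty} \frac{f^{(k)}(0)}{(k-1)!}\,x^{k-1}
\]

holds. Let

\[
\mathbf{F}(\mathbf{X}) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!}\,\mathbf{X}^k.
\]

Demonstrate that

\[
\left[\operatorname{tr}\mathbf{F}(\mathbf{X})\right]_{,\mathbf{X}} = \sum_{k=1}^{\infty} \frac{f^{(k)}(0)}{(k-1)!}\,\left(\mathbf{X}^T\right)^{k-1} = f'(\mathbf{X}^T)
\]

for $\|\mathbf{X}\| < r$.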

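3.48 Let $f(\mathbf{X})$ be a scalar-valued function. Show that the derivative of $f(\mathbf{X})$ with respect to $\mathbf{X}^T$ is equal to the transpose of its derivative with respect to $\mathbf{X}$, i.e.,

\[
f(\mathbf{X})_{,\mathbf{X}^T} = \left(f(\mathbf{X})_{,\mathbf{X}}\right)^T.
\]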

3.49 Find the derivative with respect to X of the following tensor-valued functions:

(a) tr(A ·XT )E;(b) tr(A ·X)E;(c) a ·X · bcd.

3.50 For a nonsingular tensor X, demonstrate that

\[
(\det\mathbf{X})_{,\mathbf{X}} = (\det\mathbf{X})\,\mathbf{X}^{-T}.
\]

3.51 Let $S$ be the set of all symmetric second-order tensors. Show that $S$ is a linear subspace of the linear space of all second-order tensors. On $S$ the inner product $\mathbf{A}\cdot\cdot\,\mathbf{B}^T$ of the whole space takes the form

\[
\mathbf{A}\cdot\cdot\,\mathbf{B}^T = \mathbf{A}\cdot\cdot\,\mathbf{B}.
\]

3.52 Let f be a scalar-valued function of a symmetric second-order tensor A. Show that there is a unique symmetric second-order tensor B such that

f(A) = tr(B · A)

for any symmetric A, that is A = AT .

3.53 Let F = F(A) be a linear function from the set of second-order symmetric tensors A to the set of second-order tensors. Show that there is a unique fourth-order tensor C such that

F(A) = C ··A, C = cmnptiminipit, cmnpt = cmntp

for all A = AT .


3.54 Let F = F(A) be a linear function from the set of second-order symmetric tensors A to the same set of symmetric second-order tensors ($\mathbf{F}(\mathbf{A}) = \mathbf{F}(\mathbf{A})^T$). Show that there is a unique fourth-order tensor C such that

F(A) = C ··A, C = cmnptiminipit, cmnpt = cmntp, cmnpt = cnmpt

whenever A = AT .

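3.55 Let $Q$ be the set of all antisymmetric second-order tensors. Show that $Q$ is a linear subspace of the linear space of all second-order tensors. On $Q$ the inner product $\mathbf{A}\cdot\cdot\,\mathbf{B}^T$ of the whole space takes the form

\[
\mathbf{A}\cdot\cdot\,\mathbf{B}^T = -\mathbf{A}\cdot\cdot\,\mathbf{B} = -\mathbf{A}^T\cdot\cdot\,\mathbf{B}^T.
\]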

3.56 Let C be a tensor of order n that is antisymmetric in the two lastcomponents: cmn...pt = −cmn...tp. Let A be an arbitrary symmetric second-order tensor. Demonstrate that C · ·A = 0, where 0 is the zero tensor oforder n− 2.

Chapter 4

Tensor Fields

4.1 Vector Fields

The state of a point (more precisely, of an infinitely small volume) within a natural object is frequently characterized by a vector or a tensor. Hence inside a spatial object there arises what we call a vector or tensor field. As a rule these fields are governed by simultaneous partial differential equations. Such equations are usually derived using a Cartesian space frame. In this frame, the operations of calculus closely parallel those of one-dimensional analysis: the differentiation of a vector function proceeds on a component-by-component basis, for example. However, it is often convenient to introduce curvilinear coordinates in the body, in terms of which the problem formulation is simpler. In this way we get a frame that changes from point to point, and component-wise differentiation is not enough to characterize the change of the vector function. Thus we need to develop the apparatus of calculus for vector and tensor functions when the frame in the object is changeable. Another reason for introducing these tools is the objectivity of the laws of nature: we must be able to formulate frame-independent statements of these laws. Finally, there is an aesthetic reason: in non-coordinate form many statements of mathematical physics look much less cumbersome than their counterparts stated in terms of coordinates. It is said that beauty governs the world; although this is not absolutely true, most students would prefer a short and beautiful statement to a nightmarish formula taking half a page.

The position of a point in space is characterized by three numbers called coordinates of the point. Coordinate systems in common use include the Cartesian, cylindrical, and spherical systems. The first of these differs from the latter two in an important respect: the frame vectors of a Cartesian system are unique, while those for the curvilinear systems change from point to point. In a general coordinate system the position of a point in space is determined uniquely by three numbers $q^1, q^2, q^3$. These are referred to as curvilinear coordinates if the frame is not Cartesian. See Fig. 4.1.

Fig. 4.1 Curvilinear coordinates.

If we fix two of the coordinates and change the third, we get a line in space called a coordinate line. The Cartesian coordinates $x_1, x_2, x_3$ of a point can be determined through the general curvilinear coordinates by relations of the form

\[
x_i = x_i(q^1, q^2, q^3) \qquad (i = 1, 2, 3). \tag{4.1}
\]

Except for some set of singular points in space, the correspondence (4.1) is one to one. We suppose the functions $x_i = x_i(q^1, q^2, q^3)$ are smooth (continuously differentiable). In this case the local one-to-one correspondence is provided by the requirement that the Jacobian

\[
\left|\frac{\partial x_i}{\partial q^j}\right| \neq 0. \tag{4.2}
\]

Fixing some origin $O$, we characterize the position of a point $P(q^1, q^2, q^3)$ by a vector $\mathbf{r}$ connecting the points $O$ and $P$:

\[
\mathbf{r} = \mathbf{r}(q^1, q^2, q^3).
\]

When a point moves along a coordinate line, along the $q^1$ line for instance, the end of its position vector moves along this line and the difference vector

\[
\Delta\mathbf{r} = \mathbf{r}(q^1 + \Delta q^1, q^2, q^3) - \mathbf{r}(q^1, q^2, q^3)
\]


Fig. 4.2 Generation of a local tangent vector to a coordinate line.

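is directed along the chord (Fig. 4.2). The smaller the value of $\Delta q^1$, the closer $\Delta\mathbf{r}$ is to being tangent to the $q^1$ line. The limit as $\Delta q^1 \to 0$ of the ratio $\Delta\mathbf{r}/\Delta q^1$ is a vector

\[
\frac{\partial\mathbf{r}(q^1, q^2, q^3)}{\partial q^1} = \mathbf{r}_1
\]

tangent to the $q^1$ coordinate line. At the same point $(q^1, q^2, q^3)$ we can introduce two other vectors

\[
\mathbf{r}_2 = \frac{\partial\mathbf{r}(q^1, q^2, q^3)}{\partial q^2}, \qquad
\mathbf{r}_3 = \frac{\partial\mathbf{r}(q^1, q^2, q^3)}{\partial q^3},
\]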

tangent to the $q^2$ and $q^3$ coordinate lines, respectively. If the coordinate point $(q^1, q^2, q^3)$ is not singular then the vectors $\mathbf{r}_i$ are non-coplanar and therefore constitute a frame triad. The mixed product $\mathbf{r}_1\cdot(\mathbf{r}_2\times\mathbf{r}_3)$ is the volume of the frame parallelepiped; renumbering the coordinates if necessary, we obtain from this the same expression for the Jacobian (4.2). Let us denote it

\[
\sqrt{g} = \mathbf{r}_1\cdot(\mathbf{r}_2\times\mathbf{r}_3) = \left|\frac{\partial x_i}{\partial q^j}\right|.
\]

Unlike the previous chapters in which all frame vectors were constant, we now deal with frame vectors that change from point to point. However, at each point we can repeat our prior reasoning. In particular, let us introduce the reciprocal basis $\mathbf{r}^i$ by the relation

\[
\mathbf{r}^i\cdot\mathbf{r}_j = \delta^i_j.
\]

As in the previous chapter, we define the metric coefficients

\[
g_{ij} = \mathbf{r}_i\cdot\mathbf{r}_j, \qquad
g^{ij} = \mathbf{r}^i\cdot\mathbf{r}^j, \qquad
g_i^j = \mathbf{r}_i\cdot\mathbf{r}^j = \delta_i^j,
\]

which are the components of the metric tensor $\mathbf{E}$ at each point in space.

Above we differentiated a vector function $\mathbf{r}(q^1, q^2, q^3)$. Note that the rules for differentiating vector functions are quite similar to those for differentiating ordinary functions. For brevity we consider the case of fields depending on one variable $t$. Let $\mathbf{e}_1(t)$ and $\mathbf{e}_2(t)$ be continuously differentiable at some finite $t$, which means that for $i = 1, 2$ there exist

\[
\frac{d\mathbf{e}_i(t)}{dt} = \lim_{\Delta t\to 0}\frac{\mathbf{e}_i(t + \Delta t) - \mathbf{e}_i(t)}{\Delta t}.
\]

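It is easily seen that

\[
\frac{d(\mathbf{e}_1(t) + \mathbf{e}_2(t))}{dt} = \frac{d\mathbf{e}_1(t)}{dt} + \frac{d\mathbf{e}_2(t)}{dt}.
\]

Indeed,

\[
\begin{aligned}
\frac{d(\mathbf{e}_1(t) + \mathbf{e}_2(t))}{dt}
&= \lim_{\Delta t\to 0}\frac{\mathbf{e}_1(t + \Delta t) - \mathbf{e}_1(t) + \mathbf{e}_2(t + \Delta t) - \mathbf{e}_2(t)}{\Delta t} \\
&= \lim_{\Delta t\to 0}\frac{\mathbf{e}_1(t + \Delta t) - \mathbf{e}_1(t)}{\Delta t} + \lim_{\Delta t\to 0}\frac{\mathbf{e}_2(t + \Delta t) - \mathbf{e}_2(t)}{\Delta t} \\
&= \frac{d\mathbf{e}_1(t)}{dt} + \frac{d\mathbf{e}_2(t)}{dt}.
\end{aligned}
\]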

Similarly, for a constant $c$ we have

\[
\frac{d(c\,\mathbf{e}_1(t))}{dt} = c\,\frac{d\mathbf{e}_1(t)}{dt}.
\]

Finally, the product of a scalar function $f(t)$ and a vector function $\mathbf{e}(t)$ is differentiated by a rule similar to the formula for differentiating a product of scalar functions:

\[
\frac{d(f(t)\mathbf{e}(t))}{dt} = \frac{df(t)}{dt}\,\mathbf{e}(t) + f(t)\,\frac{d\mathbf{e}(t)}{dt}.
\]

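The reader can adapt the proof for the scalar case almost word for word. The rules for partial differentiation of vector functions in several variables look quite similar and we leave their formulation to the reader. Representing vectors in Cartesian coordinates, it is easy to see the validity of the following formulas:

\[
\begin{aligned}
\frac{d}{dt}\left(\mathbf{e}_1(t)\cdot\mathbf{e}_2(t)\right) &= \mathbf{e}_1'(t)\cdot\mathbf{e}_2(t) + \mathbf{e}_1(t)\cdot\mathbf{e}_2'(t), \\
\frac{d}{dt}\left(\mathbf{e}_1(t)\times\mathbf{e}_2(t)\right) &= \mathbf{e}_1'(t)\times\mathbf{e}_2(t) + \mathbf{e}_1(t)\times\mathbf{e}_2'(t).
\end{aligned}
\]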
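The two product rules for the dot and cross products can be checked numerically. The following is a minimal sketch, assuming two sample vector functions chosen purely for illustration (they are not taken from the text); the helper names `central_diff`, `dot`, `cross`, and `add` are likewise hypothetical.

```python
import math

# Hypothetical sample fields (not from the text), given by Cartesian components.
def e1(t):
    return (math.cos(t), math.sin(t), t)

def e2(t):
    return (t * t, 1.0, math.exp(-t))

def de1(t):  # exact derivative of e1
    return (-math.sin(t), math.cos(t), 1.0)

def de2(t):  # exact derivative of e2
    return (2.0 * t, 0.0, -math.exp(-t))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def central_diff(func, t, h=1e-5):
    """Central-difference derivative of a tuple-valued function of t."""
    return tuple((p - m) / (2 * h) for p, m in zip(func(t + h), func(t - h)))

t0 = 0.7
# Right-hand sides of the two product rules.
dot_rule = dot(de1(t0), e2(t0)) + dot(e1(t0), de2(t0))
cross_rule = add(cross(de1(t0), e2(t0)), cross(e1(t0), de2(t0)))
# Left-hand sides, approximated by finite differences.
dot_numeric = central_diff(lambda t: (dot(e1(t), e2(t)),), t0)[0]
cross_numeric = central_diff(lambda t: cross(e1(t), e2(t)), t0)
```

Agreement of `dot_rule` with `dot_numeric`, and of `cross_rule` with `cross_numeric`, up to discretization error is a sanity check only, not a proof.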


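Exercise 4.1. (a) Differentiate $\mathbf{e}_1(t)\cdot\mathbf{e}_2(t)$ and $\mathbf{e}_1(t)\times\mathbf{e}_2(t)$ with respect to $t$ if

\[
\mathbf{e}_1(t) = \mathbf{i}_1 e^{-t} + \mathbf{i}_2, \qquad
\mathbf{e}_2(t) = -\mathbf{i}_1\sin^2 t + \mathbf{i}_2 e^{-t}.
\]

(b) Show that $[\mathbf{e}(t)\times\mathbf{e}'(t)]' = \mathbf{e}(t)\times\mathbf{e}''(t)$ for any twice-differentiable vector function $\mathbf{e}(t)$.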

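Exercise 4.2. For the mixed product

\[
[\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3(t)] = (\mathbf{e}_1(t)\times\mathbf{e}_2(t))\cdot\mathbf{e}_3(t)
\]

show that

\[
\frac{d}{dt}[\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3(t)]
= [\mathbf{e}_1'(t), \mathbf{e}_2(t), \mathbf{e}_3(t)] + [\mathbf{e}_1(t), \mathbf{e}_2'(t), \mathbf{e}_3(t)] + [\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3'(t)].
\]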

Cylindrical coordinates

In the cylindrical coordinate system we have

\[
(q^1, q^2, q^3) = (\rho, \phi, z),
\]

where $\rho$ is the radial distance from the $z$-axis and $\phi$ is the azimuthal angle (Fig. 4.3).

Fig. 4.3 Cylindrical coordinate system.

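Using the expression

\[
\mathbf{r} = \hat{\mathbf{x}}x + \hat{\mathbf{y}}y + \hat{\mathbf{z}}z
\]

for the position vector in rectangular coordinates and the coordinate transformation formulas

\[
x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z,
\]

we have

\[
\mathbf{r}(\rho, \phi, z) = \hat{\mathbf{x}}\rho\cos\phi + \hat{\mathbf{y}}\rho\sin\phi + \hat{\mathbf{z}}z.
\]

Then

\[
\mathbf{r}_1 = \frac{\partial\mathbf{r}(\rho, \phi, z)}{\partial\rho} = \hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi.
\]

Similarly we compute

\[
\mathbf{r}_2 = \frac{\partial\mathbf{r}(\rho, \phi, z)}{\partial\phi} = -\hat{\mathbf{x}}\rho\sin\phi + \hat{\mathbf{y}}\rho\cos\phi, \qquad
\mathbf{r}_3 = \frac{\partial\mathbf{r}(\rho, \phi, z)}{\partial z} = \hat{\mathbf{z}}.
\]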

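Then

\[
\begin{aligned}
\sqrt{g} &= \mathbf{r}_1\cdot(\mathbf{r}_2\times\mathbf{r}_3) \\
&= (\hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi)\cdot[(-\hat{\mathbf{x}}\rho\sin\phi + \hat{\mathbf{y}}\rho\cos\phi)\times\hat{\mathbf{z}}] \\
&= (\hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi)\cdot(\hat{\mathbf{y}}\rho\sin\phi + \hat{\mathbf{x}}\rho\cos\phi) \\
&= \rho(\cos^2\phi + \sin^2\phi) \\
&= \rho.
\end{aligned}
\]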

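It is easy to verify that the frame vectors $\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$ are mutually perpendicular. For instance,

\[
\begin{aligned}
\mathbf{r}_1\cdot\mathbf{r}_2 &= (\hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi)\cdot(-\hat{\mathbf{x}}\rho\sin\phi + \hat{\mathbf{y}}\rho\cos\phi) \\
&= -\rho\cos\phi\sin\phi + \rho\cos\phi\sin\phi = 0.
\end{aligned}
\]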

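By direct computation we also find that

\[
|\mathbf{r}_1| = 1, \qquad |\mathbf{r}_2| = \rho, \qquad |\mathbf{r}_3| = 1.
\]

These facts may be used to construct the reciprocal basis as

\[
\mathbf{r}^1 = \mathbf{r}_1, \qquad \mathbf{r}^2 = \mathbf{r}_2/\rho^2, \qquad \mathbf{r}^3 = \mathbf{r}_3.
\]

The various metric coefficients for the cylindrical frame are easily computed, and are

\[
(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
(g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/\rho^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]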
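The cylindrical results can be reproduced numerically by differentiating the position vector directly. This is a minimal sketch, assuming central differences with a hypothetical step `h` and a sample point chosen for illustration; the function names are not from the text.

```python
import math

def position(rho, phi, z):
    """Cartesian components of r(rho, phi, z)."""
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def frame_vectors(q, h=1e-6):
    """Approximate r_i = dr/dq^i by central differences."""
    frame = []
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        rp, rm = position(*qp), position(*qm)
        frame.append(tuple((a - b) / (2 * h) for a, b in zip(rp, rm)))
    return frame

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

q0 = (2.0, 0.6, -1.0)                      # sample point: rho = 2
r1, r2, r3 = frame_vectors(q0)
sqrt_g = dot(r1, cross(r2, r3))            # mixed product, expected to be close to rho
g = [[dot(a, b) for b in (r1, r2, r3)] for a in (r1, r2, r3)]   # covariant metric g_ij
```

At this sample point `sqrt_g` should be close to ρ = 2 and `g` close to the diagonal matrix with entries 1, ρ² = 4, 1.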


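Commonly the unit basis vectors $\hat{\boldsymbol{\rho}}$, $\hat{\boldsymbol{\phi}}$, and $\hat{\mathbf{z}}$ are introduced in cylindrical coordinates. These are unit vectors along the directions of $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$, respectively, at the point of interest. Given any vector $\mathbf{v}$ at a point $(\rho, \phi, z)$ in space, it is conventional in applications to write

\[
\mathbf{v} = \hat{\boldsymbol{\rho}}v_\rho + \hat{\boldsymbol{\phi}}v_\phi + \hat{\mathbf{z}}v_z.
\]

The quantities $v_\rho$, $v_\phi$, and $v_z$ are known as the physical components of $\mathbf{v}$. (For a vector, the physical components are the projections of the vector on the directions of the corresponding frame vectors; see footnote 1.) But since we can also express $\mathbf{v}$ in the forms

\[
\mathbf{v} = v^1\mathbf{r}_1 + v^2\mathbf{r}_2 + v^3\mathbf{r}_3 = v_1\mathbf{r}^1 + v_2\mathbf{r}^2 + v_3\mathbf{r}^3,
\]

we can identify the contravariant and covariant components of $\mathbf{v}$ from our previous expressions. They are

\[
(v^1, v^2, v^3) = (v_\rho, v_\phi/\rho, v_z), \qquad
(v_1, v_2, v_3) = (v_\rho, v_\phi\rho, v_z).
\]

We see that expression of $\mathbf{v}$ in terms of the basis $(\hat{\boldsymbol{\rho}}, \hat{\boldsymbol{\phi}}, \hat{\mathbf{z}})$ yields components $(v_\rho, v_\phi, v_z)$ that are neither contravariant nor covariant.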

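Exercise 4.3. Show that the acceleration of a particle in plane polar coordinates $(\rho, \phi)$ is given by

\[
\frac{d^2\mathbf{r}}{dt^2} = \hat{\boldsymbol{\rho}}\left[\frac{d^2\rho}{dt^2} - \rho\left(\frac{d\phi}{dt}\right)^2\right] + \hat{\boldsymbol{\phi}}\left(\rho\frac{d^2\phi}{dt^2} + 2\frac{d\rho}{dt}\frac{d\phi}{dt}\right).
\]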

Spherical coordinates

In the spherical coordinate system we have

\[
(q^1, q^2, q^3) = (r, \theta, \phi),
\]

Footnote 1: Mathematicians often prefer to deal with dimensionless quantities. But in applications many quantities have physical dimensions (e.g., N/m² for a stress). When we introduce coordinates in space some can also have dimensions; in spherical coordinates, for instance, $r$ has the dimension of length whereas $\theta$ and $\phi$ are dimensionless. Since $r$ has the length dimension, the frame vectors corresponding to $\theta$ and $\phi$ must have the length dimension, whereas the frame vector for $r$ is dimensionless. In this way some components of the metric tensor may have dimensions and, when involved in transformation formulas, may thus introduce not only numerical changes but also dimensional changes in the terms so that it becomes possible to add such quantities. When one uses physical components of vectors and tensors they take the dimension of the corresponding tensor in full, and the frame vectors become dimensionless.


Fig. 4.4 Spherical coordinate system.

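where $r$ is the radial distance from the origin, $\phi$ is the azimuthal angle, and $\theta$ is the polar angle (Fig. 4.4). The coordinate transformation formulas

\[
x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta
\]

give

\[
\mathbf{r}(r, \theta, \phi) = \hat{\mathbf{x}}\,r\sin\theta\cos\phi + \hat{\mathbf{y}}\,r\sin\theta\sin\phi + \hat{\mathbf{z}}\,r\cos\theta.
\]

Then

\[
\begin{aligned}
\mathbf{r}_1 &= \frac{\partial\mathbf{r}(r, \theta, \phi)}{\partial r} = \hat{\mathbf{x}}\sin\theta\cos\phi + \hat{\mathbf{y}}\sin\theta\sin\phi + \hat{\mathbf{z}}\cos\theta, \\
\mathbf{r}_2 &= \frac{\partial\mathbf{r}(r, \theta, \phi)}{\partial\theta} = \hat{\mathbf{x}}\,r\cos\theta\cos\phi + \hat{\mathbf{y}}\,r\cos\theta\sin\phi - \hat{\mathbf{z}}\,r\sin\theta, \\
\mathbf{r}_3 &= \frac{\partial\mathbf{r}(r, \theta, \phi)}{\partial\phi} = -\hat{\mathbf{x}}\,r\sin\theta\sin\phi + \hat{\mathbf{y}}\,r\sin\theta\cos\phi.
\end{aligned}
\]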

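In this case

\[
\sqrt{g} = \begin{vmatrix}
\sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\
r\cos\theta\cos\phi & r\cos\theta\sin\phi & -r\sin\theta \\
-r\sin\theta\sin\phi & r\sin\theta\cos\phi & 0
\end{vmatrix} = r^2\sin\theta.
\]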

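In this system we find that

\[
|\mathbf{r}_1| = 1, \qquad |\mathbf{r}_2| = r, \qquad |\mathbf{r}_3| = r\sin\theta,
\]

and the frame vectors are again mutually orthogonal. Hence we have

\[
\mathbf{r}^1 = \mathbf{r}_1, \qquad \mathbf{r}^2 = \mathbf{r}_2/r^2, \qquad \mathbf{r}^3 = \mathbf{r}_3/(r^2\sin^2\theta)
\]

for the vectors of the reciprocal basis. The metric coefficients are

\[
(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \qquad
(g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/r^2 & 0 \\ 0 & 0 & 1/(r^2\sin^2\theta) \end{pmatrix}.
\]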
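The spherical reciprocal basis can be cross-checked against its defining relation $\mathbf{r}^i\cdot\mathbf{r}_j = \delta^i_j$. The sketch below assumes the standard cross-product construction $\mathbf{r}^1 = (\mathbf{r}_2\times\mathbf{r}_3)/\sqrt{g}$ (and cyclic permutations) for a reciprocal triad; the function names and the sample point are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def spherical_frame(r, theta, phi):
    """Frame vectors r_1, r_2, r_3 as given in the text (Cartesian components)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return ((st * cp, st * sp, ct),
            (r * ct * cp, r * ct * sp, -r * st),
            (-r * st * sp, r * st * cp, 0.0))

def reciprocal_frame(frame):
    """Reciprocal triad built from cross products divided by sqrt(g)."""
    r1, r2, r3 = frame
    sqrt_g = dot(r1, cross(r2, r3))
    scale = lambda v: tuple(c / sqrt_g for c in v)
    return (scale(cross(r2, r3)), scale(cross(r3, r1)), scale(cross(r1, r2))), sqrt_g

r0, theta0, phi0 = 1.5, 0.8, 2.0           # sample point, away from the polar axis
frame = spherical_frame(r0, theta0, phi0)
recip, sqrt_g = reciprocal_frame(frame)
# pairing[i][j] approximates r^i . r_j and should be the identity matrix
pairing = [[dot(recip[i], frame[j]) for j in range(3)] for i in range(3)]
# for this orthogonal frame, r^3 should equal r_3 / (r^2 sin^2 theta)
r3_scaled = tuple(c / (r0 * math.sin(theta0)) ** 2 for c in frame[2])
```

The construction breaks down where $\sin\theta = 0$, which corresponds to the singular points of the spherical system.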

Exercise 4.4. The unit basis vectors of spherical coordinates are denoted $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\phi}}$. Give expressions relating the various sets of components $(v^1, v^2, v^3)$, $(v_1, v_2, v_3)$, and $(v_r, v_\theta, v_\phi)$ of a vector $\mathbf{v}$.

We see that both the spherical and cylindrical frames are orthogonal. In this case the reciprocal basis vectors have the same directions as the vectors of the main basis, but have reciprocal lengths so that $|\mathbf{r}^i||\mathbf{r}_i| = 1$ for each $i$.

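In § 2.4 we obtained the transformation laws that apply to the components of a vector under a change of frame. These laws are a consequence of the change of the old curvilinear coordinates $q^k$ to other coordinates $\tilde{q}^j$, so

\[
\tilde{q}^j = \tilde{q}^j(q^1, q^2, q^3), \qquad q^k = q^k(\tilde{q}^1, \tilde{q}^2, \tilde{q}^3) \qquad (j, k = 1, 2, 3).
\]

In the coordinates $\tilde{q}^j$, we denote all the quantities in the same manner as for the coordinates $q^k$, but with the tilde above: $\tilde{\mathbf{r}}_k$, $\tilde{\mathbf{r}}^k$, and so on. We now extend the transformation laws to the case of a vector field in general coordinates. Because

\[
\mathbf{r}_i = \frac{\partial\mathbf{r}}{\partial q^i} = \frac{\partial\mathbf{r}}{\partial\tilde{q}^j}\frac{\partial\tilde{q}^j}{\partial q^i} = \tilde{\mathbf{r}}_j\frac{\partial\tilde{q}^j}{\partial q^i}
\]

we can write

\[
\mathbf{r}_i = A_i^j\,\tilde{\mathbf{r}}_j, \qquad A_i^j = \frac{\partial\tilde{q}^j}{\partial q^i}, \tag{4.3}
\]

to describe the change of frame. For the inverse transformation

\[
\tilde{\mathbf{r}}_i = \tilde{A}_i^j\,\mathbf{r}_j, \qquad \tilde{A}_i^j = \frac{\partial q^j}{\partial\tilde{q}^i}. \tag{4.4}
\]

The results of § 2.4 now apply, and we can write immediately

\[
\tilde{f}^i = A^i_j f^j, \qquad \tilde{f}_i = \tilde{A}^j_i f_j, \qquad f^i = \tilde{A}^i_j\tilde{f}^j, \qquad f_i = A^j_i\tilde{f}_j,
\]

for the transformation laws pertaining to the components of a vector field $\mathbf{f}$. The $f^i$ are still termed contravariant components of $\mathbf{f}$, while the $f_i$ are still termed covariant components; the main difference is that now $A_i^j$ and $\tilde{A}_i^j$ can change from point to point in space. The transformation laws for the components of higher-order tensor fields are written similarly.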


Exercise 4.5. A set of oblique rectilinear coordinates $(u, v)$ in the plane is related to a set of Cartesian coordinates $(x, y)$ by the transformation equations

\[
x = u\cos\alpha + v\cos\beta, \qquad y = u\sin\alpha + v\sin\beta,
\]

where $\alpha, \beta$ are constants. (a) Sketch a few $u$ and $v$ coordinate curves superimposed on the $xy$-plane. (b) Find the basis vectors $\mathbf{r}_i$ ($i = 1, 2$) and the reciprocal basis vectors $\mathbf{r}^i$ in the $(u, v)$ system. Use these to calculate the metric coefficients. (c) Let $\mathbf{z}$ be a given vector. In the oblique system find the covariant components of $\mathbf{z}$ in terms of the contravariant components of $\mathbf{z}$.

4.2 Differentials and the Nabla Operator

Let us consider the infinitesimal vector extending from point $(q^1, q^2, q^3)$ to point $(q^1 + dq^1, q^2 + dq^2, q^3 + dq^3)$, denoted by $d\mathbf{r}$:

\[
d\mathbf{r} = \frac{\partial\mathbf{r}}{\partial q^i}\,dq^i = \mathbf{r}_i\,dq^i.
\]

Hence we can define $dq^i$ as

\[
dq^i = \mathbf{r}^i\cdot d\mathbf{r} = d\mathbf{r}\cdot\mathbf{r}^i. \tag{4.5}
\]

(Here there is no summation over $i$. Note that we only introduce spatial coordinates having superscripts. However, for Cartesian frames it is common to see subscripts used exclusively.) The length of this infinitesimal vector is defined by

\[
(ds)^2 = d\mathbf{r}\cdot d\mathbf{r} = \mathbf{r}_i\,dq^i\cdot\mathbf{r}_j\,dq^j = g_{ij}\,dq^i\,dq^j. \tag{4.6}
\]

On the right we have a quadratic form with respect to the variables $dq^i$; the coefficients of this quadratic form are the covariant components of the metric tensor. In a Cartesian frame (4.6) takes the familiar form

\[
(ds)^2 = (dx_1)^2 + (dx_2)^2 + (dx_3)^2.
\]

In cylindrical and spherical frames we have

\[
(ds)^2 = (d\rho)^2 + \rho^2(d\phi)^2 + (dz)^2,
\]

\[
(ds)^2 = (dr)^2 + r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2,
\]

respectively.
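The cylindrical line element can be used to compute the length of a curve directly from its coordinate description. The following is a minimal sketch, assuming a hypothetical conical spiral $\rho = 1 + t/2$, $\phi = t$, $z = 0.3t$ and simple midpoint quadrature; the result is compared with a polyline length computed in Cartesian coordinates.

```python
import math

def curve(t):
    """Hypothetical curve in cylindrical coordinates (rho, phi, z)."""
    return (1.0 + 0.5 * t, t, 0.3 * t)

def curve_rate(t):
    """Derivatives (d rho/dt, d phi/dt, dz/dt) of the curve above."""
    return (0.5, 1.0, 0.3)

def metric_diag(rho, phi, z):
    """Diagonal covariant metric coefficients g_11, g_22, g_33 of the cylindrical frame."""
    return (1.0, rho * rho, 1.0)

def length_from_metric(t0, t1, n=2000):
    """Integrate ds = sqrt(g_ij dq^i dq^j) by the midpoint rule."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        g = metric_diag(*curve(t))
        dq = curve_rate(t)
        total += math.sqrt(sum(gi * d * d for gi, d in zip(g, dq))) * h
    return total

def length_from_chords(t0, t1, n=2000):
    """Sum of straight chords between Cartesian images of the curve points."""
    def xyz(t):
        rho, phi, z = curve(t)
        return (rho * math.cos(phi), rho * math.sin(phi), z)
    total, prev = 0.0, xyz(t0)
    for k in range(1, n + 1):
        cur = xyz(t0 + k * (t1 - t0) / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

len_metric = length_from_metric(0.0, 2.0 * math.pi)
len_chords = length_from_chords(0.0, 2.0 * math.pi)
```

The two lengths should agree to within the discretization error, since both approximate the same integral of $ds$.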

Exercise 4.6. Find (ds)2 for the oblique system of Exercise 4.5.


Let $f(q^1, q^2, q^3)$ be a scalar differentiable function of the variables $q^i$. Its differential is

\[
df(q^1, q^2, q^3) = \frac{\partial f(q^1, q^2, q^3)}{\partial q^i}\,dq^i.
\]

Here and in similar situations the $i$ in the denominator stands in the lower position and so we must sum over $i$. Using (4.5) we can write

\[
df = \mathbf{r}^i\frac{\partial f}{\partial q^i}\cdot d\mathbf{r}.
\]

The first multiplier on the right is a vector $\mathbf{r}^i\,\partial f/\partial q^i$. Let us introduce a symbolic vector

\[
\nabla = \mathbf{r}^i\frac{\partial}{\partial q^i}
\]

called the nabla operator, whose action on a function $f$ is as given above:

\[
\nabla f = \mathbf{r}^i\frac{\partial f}{\partial q^i}
\]

(we repeat that there is summation over $i$ here). The nabla operator is often referred to as the gradient operator.
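The definition $\nabla f = \mathbf{r}^i\,\partial f/\partial q^i$ can be tested in cylindrical coordinates against the ordinary Cartesian gradient. This is a minimal sketch, assuming the hypothetical scalar field $f = x^2 y + z$ and central differences for the partial derivatives with respect to $\rho$, $\phi$, $z$.

```python
import math

def f_cyl(rho, phi, z):
    """Hypothetical scalar field f = x^2 y + z expressed through (rho, phi, z)."""
    x, y = rho * math.cos(phi), rho * math.sin(phi)
    return x * x * y + z

def reciprocal_frame(rho, phi):
    """Cylindrical reciprocal basis r^1 = r_1, r^2 = r_2 / rho^2, r^3 = r_3."""
    return ((math.cos(phi), math.sin(phi), 0.0),
            (-math.sin(phi) / rho, math.cos(phi) / rho, 0.0),
            (0.0, 0.0, 1.0))

def gradient(q, h=1e-6):
    """Cartesian components of nabla f = r^i df/dq^i (summation over i)."""
    recip = reciprocal_frame(q[0], q[1])
    grad = [0.0, 0.0, 0.0]
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        df_dqi = (f_cyl(*qp) - f_cyl(*qm)) / (2 * h)
        for c in range(3):
            grad[c] += recip[i][c] * df_dqi
    return tuple(grad)

q0 = (2.0, 0.6, -1.0)
x0, y0 = q0[0] * math.cos(q0[1]), q0[0] * math.sin(q0[1])
grad_curvilinear = gradient(q0)
grad_cartesian = (2.0 * x0 * y0, x0 * x0, 1.0)   # exact gradient of x^2 y + z
```

Both tuples hold Cartesian components of the same vector, so they should agree up to finite-difference error; this illustrates that $\nabla f$ does not depend on the coordinates used to compute it.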

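Exercise 4.7. (a) Show that the unit normal to the surface $\varphi(q^1, q^2, q^3) = c = \text{constant}$ is given by

\[
\mathbf{n} = \frac{g^{ij}\dfrac{\partial\varphi}{\partial q^i}}{\sqrt{g^{mn}\dfrac{\partial\varphi}{\partial q^m}\dfrac{\partial\varphi}{\partial q^n}}}\,\mathbf{r}_j.
\]

(b) Show that the angle at a point of intersection of the surfaces

\[
\varphi(q^1, q^2, q^3) = c_1, \qquad \psi(q^1, q^2, q^3) = c_2,
\]

is given by

\[
\cos\theta = \frac{g^{ij}\dfrac{\partial\varphi}{\partial q^i}\dfrac{\partial\psi}{\partial q^j}}{\sqrt{g^{mn}\dfrac{\partial\varphi}{\partial q^m}\dfrac{\partial\varphi}{\partial q^n}\;g^{rt}\dfrac{\partial\psi}{\partial q^r}\dfrac{\partial\psi}{\partial q^t}}}.
\]

(c) Show that the angle between the coordinate surfaces $q^1 = c_1$, $q^2 = c_2$ is given by

\[
\cos\theta_{12} = \frac{g^{12}}{\sqrt{g^{11}g^{22}}}.
\]

(d) Derive the condition for orthogonality of surfaces

\[
g^{ij}\frac{\partial\varphi}{\partial q^i}\frac{\partial\psi}{\partial q^j} = 0.
\]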

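Exercise 4.8. Show that the gradient operation in the cylindrical and spherical frames is given by the formulas

\[
\nabla f = \hat{\boldsymbol{\rho}}\frac{\partial f}{\partial\rho} + \hat{\boldsymbol{\phi}}\frac{1}{\rho}\frac{\partial f}{\partial\phi} + \hat{\mathbf{z}}\frac{\partial f}{\partial z},
\]

\[
\nabla f = \hat{\mathbf{r}}\frac{\partial f}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial f}{\partial\theta} + \hat{\boldsymbol{\phi}}\frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi},
\]

respectively. What are the expressions for these when using the components connected with the triads $\mathbf{r}_i$ and $\mathbf{r}^i$?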

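Let us find a formula for the differential of a vector function $\mathbf{f}(q^1, q^2, q^3)$:

\[
d\mathbf{f}(q^1, q^2, q^3) = \frac{\partial\mathbf{f}(q^1, q^2, q^3)}{\partial q^i}\,dq^i.
\]

We use (4.5) again. Then

\[
\begin{aligned}
d\mathbf{f}(q^1, q^2, q^3) &= \frac{\partial\mathbf{f}(q^1, q^2, q^3)}{\partial q^i}\,(\mathbf{r}^i\cdot d\mathbf{r}) \\
&= (d\mathbf{r}\cdot\mathbf{r}^i)\,\frac{\partial\mathbf{f}(q^1, q^2, q^3)}{\partial q^i} \\
&= d\mathbf{r}\cdot\left(\mathbf{r}^i\frac{\partial\mathbf{f}(q^1, q^2, q^3)}{\partial q^i}\right).
\end{aligned}
\]

Thus we can represent this as

\[
d\mathbf{f} = d\mathbf{r}\cdot\nabla\mathbf{f}.
\]

The quantity $\nabla\mathbf{f}$, known as the gradient of $\mathbf{f}$, is clearly a tensor of order two. With the aid of transposition we can present $d\mathbf{f}$ in the form

\[
d\mathbf{f} = \nabla\mathbf{f}^T\cdot d\mathbf{r}.
\]

Sometimes $\nabla\mathbf{f}^T$ is called the gradient of $\mathbf{f}$; it is also called the derivative of $\mathbf{f}$ in the direction of $\mathbf{r}$ and denoted as

\[
\frac{d\mathbf{f}}{d\mathbf{r}} = \nabla\mathbf{f}^T.
\]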

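Applying the gradient to a tensor yields a new tensor whose order is one higher than that of the original tensor.

For a pair of vectors, we introduced two types of multiplication: the dot and cross product operations. We can apply these to the pair consisting of the nabla operator (which is regarded as a formal vector) and a vector function $\mathbf{f} = \mathbf{f}(q^1, q^2, q^3)$. In this way we get two operations: the divergence

\[
\operatorname{div}\mathbf{f} = \nabla\cdot\mathbf{f} = \mathbf{r}^i\cdot\frac{\partial\mathbf{f}}{\partial q^i},
\]

and the rotation

\[
\operatorname{rot}\mathbf{f} = \nabla\times\mathbf{f} = \mathbf{r}^i\times\frac{\partial\mathbf{f}}{\partial q^i}. \tag{4.7}
\]

The vector

\[
\boldsymbol{\omega} = \tfrac{1}{2}\operatorname{rot}\mathbf{f}
\]

is called the curl of $\mathbf{f}$. In terms of the curl of $\mathbf{f}$ we can introduce a tensor $\boldsymbol{\Omega}$ as

\[
\boldsymbol{\Omega} = \mathbf{E}\times\boldsymbol{\omega}.
\]

This tensor is antisymmetric (the reader should verify this), and is called the tensor of spin. It can be shown that

\[
\boldsymbol{\Omega} = \tfrac{1}{2}\left(\nabla\mathbf{f}^T - \nabla\mathbf{f}\right).
\]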

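Similarly we can introduce the divergence and rotation operations for a tensor $\mathbf{A}$ of any order:

\[
\nabla\cdot\mathbf{A} = \mathbf{r}^i\cdot\frac{\partial}{\partial q^i}\mathbf{A}, \qquad
\nabla\times\mathbf{A} = \mathbf{r}^i\times\frac{\partial}{\partial q^i}\mathbf{A}.
\]

Let us see what happens when we apply the nabla operator to the radius vector:

\[
\nabla\mathbf{r} = \mathbf{r}^i\frac{\partial}{\partial q^i}\mathbf{r} = \mathbf{r}^i\mathbf{r}_i = \mathbf{E},
\]

\[
\nabla\cdot\mathbf{r} = \mathbf{r}^i\cdot\frac{\partial}{\partial q^i}\mathbf{r} = \mathbf{r}^i\cdot\mathbf{r}_i = 3,
\]

and

\[
\nabla\times\mathbf{r} = \mathbf{r}^i\times\frac{\partial}{\partial q^i}\mathbf{r} = \mathbf{r}^i\times\mathbf{r}_i = \mathbf{0}.
\]

Exercise 4.9. Calculate $\nabla\mathbf{E}$, $\nabla\cdot\mathbf{E}$, $\nabla\times\mathbf{E}$, $\nabla\boldsymbol{\mathcal{E}}$, $\nabla\cdot\boldsymbol{\mathcal{E}}$, and $\nabla\times\boldsymbol{\mathcal{E}}$, where $\mathbf{E}$ is the unit tensor and $\boldsymbol{\mathcal{E}}$ is the Levi-Civita tensor.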


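Exercise 4.10. Let $f$ and $g$ be functions, let $\mathbf{f}$ be a vector, and let $\mathbf{Q}$ be a tensor of order two. Verify the following identities:

\[
\begin{aligned}
\nabla(fg) &= g\nabla f + f\nabla g, \\
\nabla(f\mathbf{f}) &= (\nabla f)\mathbf{f} + f\nabla\mathbf{f}, \\
\nabla(f\mathbf{Q}) &= (\nabla f)\mathbf{Q} + f\nabla\mathbf{Q}.
\end{aligned}
\]

Exercise 4.11. Show that the following identities hold ($\mathbf{f}$ and $\mathbf{g}$ are any vectors):

\[
\begin{aligned}
\nabla(\mathbf{f}\cdot\mathbf{g}) &= (\nabla\mathbf{f})\cdot\mathbf{g} + \mathbf{f}\cdot\nabla\mathbf{g}^T = (\nabla\mathbf{f})\cdot\mathbf{g} + (\nabla\mathbf{g})\cdot\mathbf{f}, \\
\nabla(\mathbf{f}\times\mathbf{g}) &= (\nabla\mathbf{f})\times\mathbf{g} - (\nabla\mathbf{g})\times\mathbf{f}, \\
\nabla\times(\mathbf{f}\times\mathbf{g}) &= \mathbf{g}\cdot\nabla\mathbf{f} - \mathbf{g}\,\nabla\cdot\mathbf{f} - \mathbf{f}\cdot\nabla\mathbf{g} + \mathbf{f}\,\nabla\cdot\mathbf{g}, \\
\nabla\cdot(\mathbf{f}\mathbf{g}) &= (\nabla\cdot\mathbf{f})\,\mathbf{g} + \mathbf{f}\cdot\nabla\mathbf{g}, \\
\nabla\cdot(\mathbf{f}\times\mathbf{g}) &= \mathbf{g}\cdot(\nabla\times\mathbf{f}) - \mathbf{f}\cdot(\nabla\times\mathbf{g}).
\end{aligned}
\]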

4.3 Differentiation of a Vector Function

We have differentiated vector functions with respect to $q^i$, with the tacit understanding that the formulas necessary to do this resemble those from ordinary calculus. Moreover, the reader certainly knows that to differentiate a vector function

\[
\mathbf{f}(x_1, x_2, x_3) = f_k(x_1, x_2, x_3)\,\mathbf{i}_k
\]

with respect to a Cartesian variable $x_i$ it is enough to differentiate each component of $\mathbf{f}$ with respect to this variable:

\[
\frac{\partial}{\partial x_i}\mathbf{f}(x_1, x_2, x_3) = \frac{\partial f_k(x_1, x_2, x_3)}{\partial x_i}\,\mathbf{i}_k. \tag{4.8}
\]

Let us consider how to differentiate a vector function $\mathbf{f}$ that is written out in curvilinear coordinates:

\[
\mathbf{f}(q^1, q^2, q^3) = f^i(q^1, q^2, q^3)\,\mathbf{r}_i.
\]

In this case the frame vectors $\mathbf{r}_i$ depend on $q^k$ as well, which means that simple differentiation of the components of $\mathbf{f}$ does not result in the needed formula. To understand this let us consider a constant function $\mathbf{f}(q^1, q^2, q^3) = \mathbf{c}$. By the general definition of the partial derivative we have $\partial\mathbf{f}(q^1, q^2, q^3)/\partial q^i = \mathbf{0}$. But the components of $\mathbf{f}$ are not constant since the $\mathbf{r}_k$ are variable, hence the derivatives of the components of this function are not zero.
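As an illustration (a worked example added here under the cylindrical formulas of § 4.1, not taken from the text), consider the constant vector $\mathbf{f} = \hat{\mathbf{x}}$. Its contravariant components follow from dot multiplication by the reciprocal basis vectors:

```latex
\[
f^1 = \hat{\mathbf{x}}\cdot\mathbf{r}^1 = \cos\phi, \qquad
f^2 = \hat{\mathbf{x}}\cdot\mathbf{r}^2 = -\frac{\sin\phi}{\rho}, \qquad
f^3 = \hat{\mathbf{x}}\cdot\mathbf{r}^3 = 0.
\]
% Check that the expansion reproduces the constant vector:
\[
f^1\mathbf{r}_1 + f^2\mathbf{r}_2
= \cos\phi\,(\hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi)
- \frac{\sin\phi}{\rho}\,(-\hat{\mathbf{x}}\rho\sin\phi + \hat{\mathbf{y}}\rho\cos\phi)
= \hat{\mathbf{x}}.
\]
% The components nevertheless vary from point to point:
\[
\frac{\partial f^1}{\partial\phi} = -\sin\phi, \qquad
\frac{\partial f^2}{\partial\rho} = \frac{\sin\phi}{\rho^2}, \qquad
\frac{\partial f^2}{\partial\phi} = -\frac{\cos\phi}{\rho}.
\]
```

So the vector has zero derivative while its components do not, which is exactly why the derivatives of the frame vectors must enter the differentiation formula.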

The derivative of the product of a simple function and a vector function is taken by the product rule:

\[
\frac{\partial}{\partial q^k}\mathbf{f}(q^1, q^2, q^3)
= \frac{\partial}{\partial q^k}\left[f^i(q^1, q^2, q^3)\,\mathbf{r}_i\right]
= \frac{\partial f^i(q^1, q^2, q^3)}{\partial q^k}\,\mathbf{r}_i + f^i(q^1, q^2, q^3)\,\frac{\partial\mathbf{r}_i}{\partial q^k}.
\]

So the derivative of a vector function written in component form consists of two terms. The first is the same as for the derivative in the Cartesian frame (4.8), and the other contains the derivatives of the frame vectors. Thus we need to find the latter.

4.4 Derivatives of the Frame Vectors

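We would like to find the value of the derivative

\[
\frac{\partial}{\partial q^j}\mathbf{r}_i
= \frac{\partial}{\partial q^j}\left(\frac{\partial}{\partial q^i}\mathbf{r}\right)
= \frac{\partial^2\mathbf{r}}{\partial q^j\,\partial q^i}
= \frac{\partial^2\mathbf{r}}{\partial q^i\,\partial q^j}
= \frac{\partial}{\partial q^i}\mathbf{r}_j. \tag{4.9}
\]

Of course if we have the expression for the Cartesian components of the frame vector we can compute these derivatives (and will do so in the exercises); however in the general case we exploit only the fact that the partial derivative of a frame vector is a vector as well, hence it can be expanded in the same frame $\mathbf{r}_i$ ($i = 1, 2, 3$). Let us denote the coefficients of the expansion by $\Gamma^k_{ij}$:

\[
\frac{\partial}{\partial q^j}\mathbf{r}_i = \Gamma^k_{ij}\,\mathbf{r}_k. \tag{4.10}
\]

The quantities $\Gamma^k_{ij}$ are called Christoffel coefficients of the second kind and are often denoted by

\[
\Gamma^k_{ij} = \begin{Bmatrix} k \\ ij \end{Bmatrix}. \tag{4.11}
\]

By (4.9) there is symmetry in the subscripts of the Christoffel symbols:

\[
\Gamma^k_{ij} = \Gamma^k_{ji}. \tag{4.12}
\]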
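Dot-multiplying (4.10) by $\mathbf{r}^k$ gives $\Gamma^k_{ij} = \mathbf{r}^k\cdot\partial\mathbf{r}_i/\partial q^j$, which can be evaluated numerically. The sketch below does this for the cylindrical frame of § 4.1, assuming central differences with a hypothetical step `h`; indices in the code are zero-based, so `gamma[0][1][1]` stands for $\Gamma^1_{22}$.

```python
import math

def frame(q):
    """Cylindrical frame vectors r_1, r_2, r_3 (Cartesian components)."""
    rho, phi, z = q
    return ((math.cos(phi), math.sin(phi), 0.0),
            (-rho * math.sin(phi), rho * math.cos(phi), 0.0),
            (0.0, 0.0, 1.0))

def reciprocal(q):
    """Cylindrical reciprocal basis r^1 = r_1, r^2 = r_2 / rho^2, r^3 = r_3."""
    rho, phi, z = q
    return ((math.cos(phi), math.sin(phi), 0.0),
            (-math.sin(phi) / rho, math.cos(phi) / rho, 0.0),
            (0.0, 0.0, 1.0))

def christoffel_second(q, h=1e-6):
    """gamma[k][i][j] approximates Gamma^k_ij = r^k . (d r_i / d q^j)."""
    recip = reciprocal(q)
    gamma = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for j in range(3):
        qp, qm = list(q), list(q)
        qp[j] += h
        qm[j] -= h
        fp, fm = frame(qp), frame(qm)
        for i in range(3):
            dri_dqj = [(a - b) / (2 * h) for a, b in zip(fp[i], fm[i])]
            for k in range(3):
                gamma[k][i][j] = sum(x * y for x, y in zip(recip[k], dri_dqj))
    return gamma

gamma = christoffel_second((2.0, 0.6, -1.0))   # sample point with rho = 2
```

At this point the only entries that are not close to zero should be `gamma[0][1][1]` (about $-\rho = -2$) and `gamma[1][0][1]`, `gamma[1][1][0]` (about $1/\rho = 0.5$), in line with the symmetry (4.12).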


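Now we can write out the formula for the derivative of a vector function in full:

\[
\begin{aligned}
\frac{\partial}{\partial q^k}\mathbf{f}(q^1, q^2, q^3)
&= \frac{\partial f^i(q^1, q^2, q^3)}{\partial q^k}\,\mathbf{r}_i + \Gamma^j_{ki}\,f^i(q^1, q^2, q^3)\,\mathbf{r}_j \\
&= \frac{\partial f^i(q^1, q^2, q^3)}{\partial q^k}\,\mathbf{r}_i + \Gamma^i_{kt}\,f^t(q^1, q^2, q^3)\,\mathbf{r}_i \\
&= \left(\frac{\partial f^i}{\partial q^k} + \Gamma^i_{kt}f^t\right)\mathbf{r}_i.
\end{aligned} \tag{4.13}
\]

This is called covariant differentiation. The coefficients of $\mathbf{r}_i$ are called the covariant derivatives of the contravariant components of $\mathbf{f}$, and are denoted by

\[
\nabla_k f^i = \frac{\partial f^i}{\partial q^k} + \Gamma^i_{kt}f^t.
\]

Let us discuss some properties of the Christoffel coefficients.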

4.5 Christoffel Coefficients and their Properties

In a Cartesian frame the Christoffel symbols are zero, so it is impossible to obtain them for another frame by the usual transformation rules for tensor components. This is easy to understand: the Christoffel symbols depend not only on the frame vectors themselves but on their rates of change from point to point, and these rates do not appear in the transformation rules for tensors. So the Christoffel symbols are not the components of a tensor, despite their notation. This is why many authors prefer the notation shown on the right side of equation (4.11).

It is important to have formulas for computing the Christoffel symbols. Let us introduce the notation

\[
\mathbf{r}_{ij} = \frac{\partial}{\partial q^j}\mathbf{r}_i = \frac{\partial^2\mathbf{r}}{\partial q^i\,\partial q^j}.
\]

Relation (4.10) can be written as

\[
\mathbf{r}_{ij} = \Gamma^k_{ij}\,\mathbf{r}_k \tag{4.14}
\]

from which it follows that

\[
\mathbf{r}_{ij}\cdot\mathbf{r}_t = \Gamma^k_{ij}\,\mathbf{r}_k\cdot\mathbf{r}_t = \Gamma^k_{ij}\,g_{kt}.
\]


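The left-hand side of this can be expressed in terms of the components of the metric tensor. Indeed

\[
\frac{\partial}{\partial q^j}g_{it} = \frac{\partial}{\partial q^j}\,\mathbf{r}_i\cdot\mathbf{r}_t = \mathbf{r}_{ij}\cdot\mathbf{r}_t + \mathbf{r}_i\cdot\mathbf{r}_{tj}. \tag{4.15}
\]

Similarly

\[
\frac{\partial}{\partial q^t}g_{ji} = \frac{\partial}{\partial q^t}\,\mathbf{r}_j\cdot\mathbf{r}_i = \mathbf{r}_{jt}\cdot\mathbf{r}_i + \mathbf{r}_j\cdot\mathbf{r}_{it}, \tag{4.16}
\]

\[
\frac{\partial}{\partial q^i}g_{tj} = \frac{\partial}{\partial q^i}\,\mathbf{r}_t\cdot\mathbf{r}_j = \mathbf{r}_{ti}\cdot\mathbf{r}_j + \mathbf{r}_t\cdot\mathbf{r}_{ji}. \tag{4.17}
\]

Now we obtain $\mathbf{r}_{ij}\cdot\mathbf{r}_t$ by subtracting (4.16) from the sum of (4.15) and (4.17) and dividing by 2:

\[
\mathbf{r}_{ij}\cdot\mathbf{r}_t = \frac{1}{2}\left(\frac{\partial g_{it}}{\partial q^j} + \frac{\partial g_{tj}}{\partial q^i} - \frac{\partial g_{ji}}{\partial q^t}\right) = \Gamma_{ijt}. \tag{4.18}
\]

The quantities $\Gamma_{ijk}$ are called Christoffel coefficients of the first kind. They are denoted frequently as

\[
[ij, k] = \Gamma_{ijk}.
\]

It is clear that they are symmetric in the first two subscripts: $\Gamma_{ijk} = \Gamma_{jik}$. Thus we have

\[
\Gamma^k_{ij}\,g_{kt} = \Gamma_{ijt} \tag{4.19}
\]

and it follows that

\[
\Gamma^k_{ij} = g^{kt}\,\Gamma_{ijt}. \tag{4.20}
\]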
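Formulas (4.18) and (4.20) give a recipe that needs only the metric coefficients. The following is a minimal sketch for the spherical metric of § 4.1, assuming central differences for the derivatives of $g_{ij}$ and using the fact that this metric is diagonal, so that $g^{kk} = 1/g_{kk}$; all function names are hypothetical and indices are zero-based.

```python
import math

def metric(q):
    """Covariant metric g_ij of the spherical frame at q = (r, theta, phi)."""
    r, theta, phi = q
    return [[1.0, 0.0, 0.0],
            [0.0, r * r, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

def metric_derivative(q, h=1e-6):
    """dg[t][i][j] approximates d g_ij / d q^t by central differences."""
    dg = []
    for t in range(3):
        qp, qm = list(q), list(q)
        qp[t] += h
        qm[t] -= h
        gp, gm = metric(qp), metric(qm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(3)]
                   for i in range(3)])
    return dg

def christoffel_first(q):
    """first[i][j][t] approximates Gamma_ijt from (4.18)."""
    dg = metric_derivative(q)
    return [[[0.5 * (dg[j][i][t] + dg[i][t][j] - dg[t][j][i]) for t in range(3)]
             for j in range(3)] for i in range(3)]

def christoffel_second(q):
    """second[k][i][j] approximates Gamma^k_ij = g^{kt} Gamma_ijt from (4.20)."""
    g = metric(q)
    first = christoffel_first(q)
    # The spherical metric is diagonal, so only t = k contributes and g^{kk} = 1 / g_kk.
    return [[[first[i][j][k] / g[k][k] for j in range(3)] for i in range(3)]
            for k in range(3)]

q0 = (1.5, 0.8, 0.3)
first = christoffel_first(q0)
second = christoffel_second(q0)
```

For a non-diagonal metric the division by `g[k][k]` would have to be replaced by multiplication with the full inverse matrix $g^{kt}$.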

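Using these we can obtain

\[
\frac{\partial\mathbf{r}^j}{\partial q^i} = -\Gamma^j_{it}\,\mathbf{r}^t. \tag{4.21}
\]

Indeed

\[
0 = \frac{\partial}{\partial q^t}\delta^i_j
= \frac{\partial}{\partial q^t}\left(\mathbf{r}^i\cdot\mathbf{r}_j\right)
= \frac{\partial\mathbf{r}^i}{\partial q^t}\cdot\mathbf{r}_j + \mathbf{r}^i\cdot\mathbf{r}_{jt}
= \frac{\partial\mathbf{r}^i}{\partial q^t}\cdot\mathbf{r}_j + \mathbf{r}^i\cdot\Gamma^k_{jt}\,\mathbf{r}_k,
\]

so

\[
\frac{\partial\mathbf{r}^i}{\partial q^t}\cdot\mathbf{r}_j = -\mathbf{r}^i\cdot\Gamma^k_{jt}\,\mathbf{r}_k = -\Gamma^k_{jt}\,\delta^i_k = -\Gamma^i_{jt}
\]

and we have

\[
\frac{\partial\mathbf{r}^i}{\partial q^t} = -\Gamma^i_{jt}\,\mathbf{r}^j.
\]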


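It is also useful to have formulas for transformation of the Christoffel symbols under change of coordinates. Suppose the new coordinates $\tilde{q}^i$ are determined by the old coordinates $q^i$ through relations of the form $\tilde{q}^i = \tilde{q}^i(q^1, q^2, q^3)$, and refer back to equations (4.3)–(4.4). Now (4.14) applied in the new system gives

\[
\tilde{\Gamma}^k_{ij}\,\tilde{\mathbf{r}}_k = \tilde{\mathbf{r}}_{ij}. \tag{4.22}
\]

But

\[
\tilde{\mathbf{r}}_{ij} = \frac{\partial}{\partial\tilde{q}^j}\tilde{\mathbf{r}}_i
= \frac{\partial q^m}{\partial\tilde{q}^j}\frac{\partial}{\partial q^m}\tilde{\mathbf{r}}_i
= \tilde{A}^m_j\,\frac{\partial}{\partial q^m}\left(\tilde{A}^n_i\,\mathbf{r}_n\right),
\]

and expansion by the product rule gives

\[
\begin{aligned}
\tilde{\mathbf{r}}_{ij}
&= \tilde{A}^m_j\left(\frac{\partial\tilde{A}^n_i}{\partial q^m}\,\mathbf{r}_n + \tilde{A}^n_i\,\frac{\partial\mathbf{r}_n}{\partial q^m}\right) \\
&= \tilde{A}^m_j\,\frac{\partial\tilde{A}^n_i}{\partial q^m}\,\mathbf{r}_n + \tilde{A}^m_j\tilde{A}^n_i\,\Gamma^p_{nm}\,\mathbf{r}_p \\
&= \tilde{A}^m_j\,\frac{\partial\tilde{A}^n_i}{\partial q^m}\,A^k_n\,\tilde{\mathbf{r}}_k + \tilde{A}^m_j\tilde{A}^n_i\,\Gamma^p_{nm}\,A^k_p\,\tilde{\mathbf{r}}_k.
\end{aligned}
\]

Comparison with (4.22) shows that

\[
\tilde{\Gamma}^k_{ij} = \tilde{A}^m_j\,\frac{\partial\tilde{A}^n_i}{\partial q^m}\,A^k_n + \tilde{A}^m_j\tilde{A}^n_i A^k_p\,\Gamma^p_{nm}.
\]

The presence of the first term on the right means that the Christoffel coefficient is not a tensor of order three. This confirms our earlier statement of this fact, which was based on different reasoning.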

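Exercise 4.12. Show that the only nonzero Christoffel coefficients of the first kind for cylindrical coordinates are

\[
\Gamma_{221} = -\rho, \qquad \Gamma_{122} = \Gamma_{212} = \rho.
\]

Show that the nonzero Christoffel coefficients of the first kind for spherical coordinates are

\[
\begin{aligned}
&\Gamma_{221} = -r, \qquad \Gamma_{122} = \Gamma_{212} = r, \\
&\Gamma_{331} = -r\sin^2\theta, \qquad \Gamma_{332} = -r^2\sin\theta\cos\theta, \\
&\Gamma_{313} = \Gamma_{133} = r\sin^2\theta, \qquad \Gamma_{233} = \Gamma_{323} = r^2\sin\theta\cos\theta.
\end{aligned}
\]

Exercise 4.13. Show that the only nonzero Christoffel coefficients of the second kind for cylindrical coordinates are

\[
\Gamma^1_{22} = -\rho, \qquad \Gamma^2_{12} = \Gamma^2_{21} = 1/\rho.
\]

Show that the nonzero Christoffel coefficients of the second kind for spherical coordinates are

\[
\begin{aligned}
&\Gamma^1_{22} = -r, \qquad \Gamma^2_{12} = \Gamma^2_{21} = 1/r, \\
&\Gamma^1_{33} = -r\sin^2\theta, \qquad \Gamma^2_{33} = -\sin\theta\cos\theta, \\
&\Gamma^3_{31} = \Gamma^3_{13} = 1/r, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta.
\end{aligned}
\]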

Exercise 4.14. A system of plane elliptic coordinates $(u, v)$ is introduced according to the transformation formulas

\[
x = c\cosh u\cos v, \qquad y = c\sinh u\sin v,
\]

where $c$ is a constant. Find the Christoffel coefficients.

Euclidean vs. non-Euclidean spaces

When we derive the length of the elementary vector $d\mathbf{r}$ we write

\[
(ds)^2 = d\mathbf{r}\cdot d\mathbf{r} = \mathbf{r}_i\,dq^i\cdot\mathbf{r}_j\,dq^j = g_{ij}\,dq^i\,dq^j. \tag{4.23}
\]

With respect to the variables $dq^i$ and $dq^j$ we have, by construction, a positive definite quadratic form. This is one of the main properties of the metric tensor for a real coordinate space.

If a space is Euclidean, that is, if it can be described by a set of Cartesian coordinates $x_t$ ($t = 1, 2, 3$), then the line element $(ds)^2$ can be written as

\[
(ds)^2 = \sum_{t=1}^{3}(dx_t)^2. \tag{4.24}
\]

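Given a set of admissible transformation equations $x_t = x_t(q^n)$, the same line element can be expressed in terms of general coordinates $q^n$. Since

\[
dx_t = \frac{\partial x_t}{\partial q^n}\,dq^n
\]

we have

\[
(ds)^2 = \sum_{t=1}^{3}\left(\frac{\partial x_t}{\partial q^n}\,dq^n\right)^2,
\]

and comparison with (4.23) gives

\[
g_{ij} = \sum_{t=1}^{3}\frac{\partial x_t}{\partial q^i}\frac{\partial x_t}{\partial q^j} \qquad (i, j = 1, 2, 3), \tag{4.25}
\]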

for the metric coefficients in the $q^n$ system. We now pose the following question. Suppose a space is originally described in terms of a set of general coordinates $q^n$. Such a description must include a set of metric coefficients $g_{ij}$ having (4.23) positive definite at each point. Will it be possible to introduce a set of Cartesian coordinates $x_t$ that also describe this space? That is, are we guaranteed the existence of functions $x_t(q^n)$ such that in the resulting coordinates $x_t = x_t(q^n)$ the line element takes the form (4.24)? Put still another way, is the space Euclidean? Not necessarily. Given the $g_{ij}$ in the $q^n$ system, (4.25) provides us with six equations for the three unknown functions $x_t(q^n)$. This implies that some additional conditions must be fulfilled by the given metric coefficients. It is possible to formulate these restrictions neatly in terms of a certain fourth-order tensor $R^p_{\cdot ijk}$ as the equality

\[
R^p_{\cdot ijk} = 0. \tag{4.26}
\]

The tensor $R^p_{\cdot ijk}$ is known as the Riemann–Christoffel tensor, and its associated tensor

\[
R_{nijk} = g_{np}\,R^p_{\cdot ijk}
\]

is called the curvature tensor for the space. It can be shown that

\[
R^p_{\cdot ijk} = \frac{\partial\Gamma^p_{ij}}{\partial q^k} - \frac{\partial\Gamma^p_{ik}}{\partial q^j} - \left(\Gamma^p_{mj}\Gamma^m_{ik} - \Gamma^p_{mk}\Gamma^m_{ij}\right).
\]

Moreover, $R^p_{\cdot ijk}$ actually has only six independent components, so (4.26) represents six conditions on the $g_{ij}$ that must hold for the space described by the $g_{ij}$ to be Euclidean [Sokolnikoff (1994)].

We have included this information only for the reader's background, as such considerations are important in certain application areas. Further details and more rigorous formulations can be found in some of the more comprehensive references.


4.6 Covariant Differentiation

Let us return to the problem of differentiation of a vector function. We obtained the formula for the derivative (4.13)

$$\frac{\partial \mathbf{f}}{\partial q^k} = \frac{\partial}{\partial q^k}\left(f^i \mathbf{r}_i\right) = \left(\frac{\partial f^i}{\partial q^k} + \Gamma^i_{kt} f^t\right)\mathbf{r}_i$$

and observed that the coefficients of this expansion denoted by

$$\nabla_k f^i = \frac{\partial f^i}{\partial q^k} + \Gamma^i_{kt} f^t \tag{4.27}$$

are called the covariant derivatives of contravariant components of vector function $\mathbf{f}$. Let us express the same derivatives in terms of the covariant components:

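$$\frac{\partial \mathbf{f}}{\partial q^k} = \frac{\partial}{\partial q^k}\left(f_i \mathbf{r}^i\right) = \frac{\partial f_i}{\partial q^k}\mathbf{r}^i + f_i \frac{\partial \mathbf{r}^i}{\partial q^k}.$$

Using (4.21) we get

$$\frac{\partial \mathbf{f}}{\partial q^k} = \frac{\partial f_i}{\partial q^k}\mathbf{r}^i - f_i \Gamma^i_{kt}\mathbf{r}^t = \left(\frac{\partial f_i}{\partial q^k} - \Gamma^j_{ki} f_j\right)\mathbf{r}^i.$$

The coefficients of this expansion are called covariant derivatives of the covariant components and are denoted by

$$\nabla_k f_i = \frac{\partial f_i}{\partial q^k} - \Gamma^j_{ki} f_j. \tag{4.28}$$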

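In these notations we can write out the formula for differentiation in the form

$$\frac{\partial \mathbf{f}}{\partial q^i} = \mathbf{r}_k \nabla_i f^k = \mathbf{r}^j \nabla_i f_j. \tag{4.29}$$

Because

$$\nabla \mathbf{f} = \mathbf{r}^i \frac{\partial}{\partial q^i}\left(f_j \mathbf{r}^j\right) = \mathbf{r}^i (\nabla_i f_j)\mathbf{r}^j = \mathbf{r}^i \mathbf{r}^j \nabla_i f_j,$$

we see that $\nabla_i f_j$ is a covariant component of $\nabla \mathbf{f}$. Similarly,

$$\nabla \mathbf{f} = \mathbf{r}^i \frac{\partial}{\partial q^i}\left(f^j \mathbf{r}_j\right) = \mathbf{r}^i \mathbf{r}_j \nabla_i f^j$$

shows that $\nabla_i f^j$ is a mixed component of $\nabla \mathbf{f}$.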

Exercise 4.15. Write out the covariant derivatives of the contravariant components of a vector $\mathbf{f}$ in plane polar coordinates.


Quick summary

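The formulas for differentiation of a vector are

$$\frac{\partial \mathbf{f}}{\partial q^i} = \mathbf{r}_k \nabla_i f^k = \mathbf{r}^j \nabla_i f_j$$

where

$$\nabla_k f_i = \frac{\partial f_i}{\partial q^k} - \Gamma^j_{ki} f_j, \qquad \nabla_k f^i = \frac{\partial f^i}{\partial q^k} + \Gamma^i_{kt} f^t.$$

The formulas

$$\nabla \mathbf{f} = \mathbf{r}^i \mathbf{r}^j \nabla_i f_j = \mathbf{r}^i \mathbf{r}_j \nabla_i f^j$$

show that $\nabla_i f_j$ is a covariant component of $\nabla \mathbf{f}$, while $\nabla_i f^j$ is a mixed component of $\nabla \mathbf{f}$.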

4.7 Covariant Derivative of a Second-Order Tensor

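Let us find a partial derivative of a tensor $\mathbf{A}$ of order two:

$$\begin{aligned}
\frac{\partial}{\partial q^k}\mathbf{A} &= \frac{\partial}{\partial q^k}\left(a^{ij}\mathbf{r}_i \mathbf{r}_j\right) \\
&= \frac{\partial a^{ij}}{\partial q^k}\mathbf{r}_i \mathbf{r}_j + a^{ij}\left(\mathbf{r}_{ik}\mathbf{r}_j + \mathbf{r}_i \mathbf{r}_{jk}\right) \\
&= \frac{\partial a^{ij}}{\partial q^k}\mathbf{r}_i \mathbf{r}_j + a^{ij}\left(\Gamma^t_{ik}\mathbf{r}_t \mathbf{r}_j + \mathbf{r}_i \Gamma^t_{jk}\mathbf{r}_t\right).
\end{aligned}$$

Changing dummy indices we get

$$\frac{\partial}{\partial q^k}\mathbf{A} = \left(\frac{\partial a^{ij}}{\partial q^k} + \Gamma^i_{ks} a^{sj} + \Gamma^j_{ks} a^{is}\right)\mathbf{r}_i \mathbf{r}_j.$$

The parenthetical expression is designated as a covariant derivative:

$$\nabla_k a^{ij} = \frac{\partial a^{ij}}{\partial q^k} + \Gamma^i_{ks} a^{sj} + \Gamma^j_{ks} a^{is}.$$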


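Similarly

$$\begin{aligned}
\frac{\partial}{\partial q^k}\mathbf{A} &= \frac{\partial}{\partial q^k}\left(a_{ij}\mathbf{r}^i \mathbf{r}^j\right) \\
&= \frac{\partial a_{ij}}{\partial q^k}\mathbf{r}^i \mathbf{r}^j + a_{ij}\left(\frac{\partial \mathbf{r}^i}{\partial q^k}\mathbf{r}^j + \mathbf{r}^i \frac{\partial \mathbf{r}^j}{\partial q^k}\right) \\
&= \frac{\partial a_{ij}}{\partial q^k}\mathbf{r}^i \mathbf{r}^j - a_{ij}\left(\Gamma^i_{kt}\mathbf{r}^t \mathbf{r}^j + \mathbf{r}^i \Gamma^j_{kt}\mathbf{r}^t\right) \\
&= \left(\frac{\partial a_{ij}}{\partial q^k} - \Gamma^s_{ki} a_{sj} - \Gamma^s_{kj} a_{is}\right)\mathbf{r}^i \mathbf{r}^j.
\end{aligned}$$

As before we denote

$$\nabla_k a_{ij} = \frac{\partial a_{ij}}{\partial q^k} - \Gamma^s_{ki} a_{sj} - \Gamma^s_{kj} a_{is}.$$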

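Exercise 4.16. Show that

$$\frac{\partial}{\partial q^k}\mathbf{A} = \left(\frac{\partial a_i^{\cdot j}}{\partial q^k} - \Gamma^s_{ki} a_s^{\cdot j} + \Gamma^j_{ks} a_i^{\cdot s}\right)\mathbf{r}^i \mathbf{r}_j,$$

$$\frac{\partial}{\partial q^k}\mathbf{A} = \left(\frac{\partial a^i_{\cdot j}}{\partial q^k} + \Gamma^i_{ks} a^s_{\cdot j} - \Gamma^s_{kj} a^i_{\cdot s}\right)\mathbf{r}_i \mathbf{r}^j.$$

The expressions in parentheses are all denoted the same way:

$$\nabla_k a_i^{\cdot j} = \frac{\partial a_i^{\cdot j}}{\partial q^k} - \Gamma^s_{ki} a_s^{\cdot j} + \Gamma^j_{ks} a_i^{\cdot s},$$

$$\nabla_k a^i_{\cdot j} = \frac{\partial a^i_{\cdot j}}{\partial q^k} + \Gamma^i_{ks} a^s_{\cdot j} - \Gamma^s_{kj} a^i_{\cdot s}.$$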

Quick summary

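For a tensor $\mathbf{A}$ of order two, we have

$$\frac{\partial}{\partial q^k}\mathbf{A} = \nabla_k a^{ij}\,\mathbf{r}_i \mathbf{r}_j = \nabla_k a_{ij}\,\mathbf{r}^i \mathbf{r}^j = \nabla_k a_i^{\cdot j}\,\mathbf{r}^i \mathbf{r}_j = \nabla_k a^i_{\cdot j}\,\mathbf{r}_i \mathbf{r}^j$$

where

$$\nabla_k a^{ij} = \frac{\partial a^{ij}}{\partial q^k} + \Gamma^i_{ks} a^{sj} + \Gamma^j_{ks} a^{is}, \qquad \nabla_k a_{ij} = \frac{\partial a_{ij}}{\partial q^k} - \Gamma^s_{ki} a_{sj} - \Gamma^s_{kj} a_{is},$$

$$\nabla_k a_i^{\cdot j} = \frac{\partial a_i^{\cdot j}}{\partial q^k} - \Gamma^s_{ki} a_s^{\cdot j} + \Gamma^j_{ks} a_i^{\cdot s}, \qquad \nabla_k a^i_{\cdot j} = \frac{\partial a^i_{\cdot j}}{\partial q^k} + \Gamma^i_{ks} a^s_{\cdot j} - \Gamma^s_{kj} a^i_{\cdot s}.$$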

Exercise 4.17. Show that any of the covariant derivatives of any component of the metric tensor is equal to zero.

Exercise 4.18. Demonstrate that the components of the metric tensor behave as constants under covariant differentiation of components:

$$\nabla_k \left(g^{st} a_t\right) = g^{st}\nabla_k a_t, \qquad \nabla_k \left(g_{st} a^t\right) = g_{st}\nabla_k a^t.$$

4.8 Differential Operations

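Here we look further at various differential operations that may be performed on vector and tensor fields. We begin with the rotation of a vector. By (4.7) and (4.29) we have

$$\nabla \times \mathbf{f} = \mathbf{r}^i \times \frac{\partial \mathbf{f}}{\partial q^i} = \mathbf{r}^i \times \mathbf{r}^j \nabla_i f_j = \epsilon^{ijk}\mathbf{r}_k \nabla_i f_j.$$

The use of (4.28) allows us to write this as

$$\nabla \times \mathbf{f} = \epsilon^{ijk}\mathbf{r}_k \left(\frac{\partial f_j}{\partial q^i} - \Gamma^n_{ij} f_n\right). \tag{4.30}$$

Considering the second term in parentheses we note that

$$\epsilon^{ijk}\Gamma^n_{ij} = \epsilon^{ijk}\Gamma^n_{ji} = \epsilon^{jik}\Gamma^n_{ij}$$

by (4.12) and a subsequent renaming of dummy indices. But $\epsilon^{jik} = -\epsilon^{ijk}$, hence $\epsilon^{ijk}\Gamma^n_{ij} = -\epsilon^{ijk}\Gamma^n_{ij}$ so that $\epsilon^{ijk}\Gamma^n_{ij} = 0$. Equation (4.30) is thereby reduced to

$$\nabla \times \mathbf{f} = \mathbf{r}_k\, \epsilon^{ijk}\frac{\partial f_j}{\partial q^i}.$$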
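The cancellation used here relies only on the antisymmetry of $\epsilon^{ijk}$ in $i, j$ and the symmetry of $\Gamma^n_{ij}$ in the same indices. A minimal numerical sketch of that fact follows; the helper names and the random test coefficients are illustrative assumptions, and the plain permutation symbol stands in for $\epsilon^{ijk}$ (the two differ by a factor involving $\sqrt{g}$, which does not affect whether the contraction vanishes).

```python
import itertools
import random

def levi_civita(i, j, k):
    """Permutation symbol (+1, -1 or 0) for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def random_symmetric_coefficients(rng):
    """Hypothetical test data: gamma[n][i][j] symmetric in the lower pair i, j."""
    gamma = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for n in range(3):
        for i in range(3):
            for j in range(i, 3):
                value = rng.uniform(-1.0, 1.0)
                gamma[n][i][j] = value
                gamma[n][j][i] = value
    return gamma

def contraction(gamma):
    """c[k][n] = sum over i, j of e(i, j, k) * gamma[n][i][j]."""
    pairs = list(itertools.product(range(3), repeat=2))
    return [[sum(levi_civita(i, j, k) * gamma[n][i][j] for i, j in pairs)
             for n in range(3)]
            for k in range(3)]

rng = random.Random(0)
result = contraction(random_symmetric_coefficients(rng))
max_abs = max(abs(x) for row in result for x in row)
print(max_abs)
```

Every entry of the contraction pairs a coefficient with its mirror image of opposite sign, so the largest absolute value printed should be zero up to rounding.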

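As an example, we recall that the vector field $\mathbf{E}$ of electrostatics satisfies $\nabla \times \mathbf{E} = \mathbf{0}$. Such a field is said to be irrotational. So the condition for any vector field $\mathbf{f}$ to be irrotational can be written in generalized coordinates as

$$\epsilon^{ijk}\frac{\partial f_j}{\partial q^i} = 0.$$

Of course, the operation $\nabla \times \mathbf{f}$ is given in a Cartesian frame by the familiar formula

$$\nabla \times \mathbf{f} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f_x & f_y & f_z \end{vmatrix}.$$

Let us turn to the divergence of $\mathbf{f}$. We start by writing

$$\nabla \cdot \mathbf{f} = \mathbf{r}^i \cdot \frac{\partial \mathbf{f}}{\partial q^i} = \mathbf{r}^i \cdot \mathbf{r}^j \nabla_i f_j = g^{ij}\nabla_i f_j.$$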


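By the result of Exercise 4.18 and equation (4.27) this can be written as

$$\nabla \cdot \mathbf{f} = \nabla_i \left(g^{ij} f_j\right) = \nabla_i f^i = \frac{\partial f^i}{\partial q^i} + \Gamma^i_{in} f^n. \tag{4.31}$$

As was the case with (4.30) this can be simplified; we must first develop a useful identity for $\Gamma^i_{in}$. This is done as follows. We begin by writing

$$\frac{\partial \sqrt{g}}{\partial q^n} = \frac{\partial}{\partial q^n}\left[\mathbf{r}_1 \cdot (\mathbf{r}_2 \times \mathbf{r}_3)\right] = \frac{\partial \mathbf{r}_1}{\partial q^n} \cdot (\mathbf{r}_2 \times \mathbf{r}_3) + \mathbf{r}_1 \cdot \frac{\partial}{\partial q^n}(\mathbf{r}_2 \times \mathbf{r}_3)$$

where

$$\begin{aligned}
\mathbf{r}_1 \cdot \frac{\partial}{\partial q^n}(\mathbf{r}_2 \times \mathbf{r}_3) &= \mathbf{r}_1 \cdot \left(\mathbf{r}_2 \times \frac{\partial \mathbf{r}_3}{\partial q^n}\right) + \mathbf{r}_1 \cdot \left(\frac{\partial \mathbf{r}_2}{\partial q^n} \times \mathbf{r}_3\right) \\
&= \frac{\partial \mathbf{r}_3}{\partial q^n} \cdot (\mathbf{r}_1 \times \mathbf{r}_2) + \frac{\partial \mathbf{r}_2}{\partial q^n} \cdot (\mathbf{r}_3 \times \mathbf{r}_1)
\end{aligned}$$

so that

$$\frac{\partial \sqrt{g}}{\partial q^n} = \frac{\partial \mathbf{r}_1}{\partial q^n} \cdot (\mathbf{r}_2 \times \mathbf{r}_3) + \frac{\partial \mathbf{r}_2}{\partial q^n} \cdot (\mathbf{r}_3 \times \mathbf{r}_1) + \frac{\partial \mathbf{r}_3}{\partial q^n} \cdot (\mathbf{r}_1 \times \mathbf{r}_2).$$

Continuing to rewrite this we have

$$\begin{aligned}
\frac{\partial \sqrt{g}}{\partial q^n} &= \mathbf{r}_{1n} \cdot (\mathbf{r}_2 \times \mathbf{r}_3) + \mathbf{r}_{2n} \cdot (\mathbf{r}_3 \times \mathbf{r}_1) + \mathbf{r}_{3n} \cdot (\mathbf{r}_1 \times \mathbf{r}_2) \\
&= \Gamma^i_{1n}\mathbf{r}_i \cdot (\mathbf{r}_2 \times \mathbf{r}_3) + \Gamma^i_{2n}\mathbf{r}_i \cdot (\mathbf{r}_3 \times \mathbf{r}_1) + \Gamma^i_{3n}\mathbf{r}_i \cdot (\mathbf{r}_1 \times \mathbf{r}_2) \\
&= \Gamma^1_{1n}\sqrt{g} + \Gamma^2_{2n}\sqrt{g} + \Gamma^3_{3n}\sqrt{g} \\
&= \sqrt{g}\,\Gamma^i_{in}.
\end{aligned}$$

Hence

$$\Gamma^i_{in} = \frac{1}{\sqrt{g}}\frac{\partial \sqrt{g}}{\partial q^n}.$$

This is the needed identity, and with it (4.31) may be written as

$$\nabla \cdot \mathbf{f} = \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\, f^i\right). \tag{4.32}$$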
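The contraction identity behind (4.32) can be checked numerically. The sketch below assumes spherical coordinates $(r, \theta, \phi)$, builds the frame vectors $\mathbf{r}_i$ by central differences, forms $\sqrt{g} = \mathbf{r}_1 \cdot (\mathbf{r}_2 \times \mathbf{r}_3)$, and evaluates $(1/\sqrt{g})\,\partial\sqrt{g}/\partial q^n$. The function names and step sizes are illustrative choices; the comparison values $2/r$, $\cot\theta$, and $0$ are the standard contracted Christoffel coefficients for spherical coordinates.

```python
import math

def position(q):
    """Cartesian position for spherical coordinates q = (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def frame_vectors(q, h=1e-5):
    """Approximate r_i = d(position)/dq^i by central differences."""
    vectors = []
    for i in range(3):
        forward, backward = list(q), list(q)
        forward[i] += h
        backward[i] -= h
        xf, xb = position(forward), position(backward)
        vectors.append(tuple((a - b) / (2 * h) for a, b in zip(xf, xb)))
    return vectors

def sqrt_g(q):
    """Scalar triple product r_1 . (r_2 x r_3)."""
    r1, r2, r3 = frame_vectors(q)
    cross = (r2[1] * r3[2] - r2[2] * r3[1],
             r2[2] * r3[0] - r2[0] * r3[2],
             r2[0] * r3[1] - r2[1] * r3[0])
    return sum(a * b for a, b in zip(r1, cross))

def contracted_christoffel(q, n, h=1e-3):
    """Estimate Gamma^i_{in} as (1/sqrt g) d(sqrt g)/dq^n."""
    forward, backward = list(q), list(q)
    forward[n] += h
    backward[n] -= h
    return (sqrt_g(forward) - sqrt_g(backward)) / (2 * h) / sqrt_g(q)

q = (2.0, 0.7, 0.3)
values = [contracted_christoffel(q, n) for n in range(3)]
print(values)  # compare with 2/r, cot(theta), 0
```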

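We know that the condition of incompressibility of a liquid in hydromechanics is expressed as

$$\nabla \cdot \mathbf{v} = 0$$

where $\mathbf{v}$ is the velocity of a material point, so in general coordinates we can write it out as

$$\frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\, v^i\right) = 0.$$

In electromagnetic theory the magnetic source law states that $\nabla \cdot \mathbf{B} = 0$ where $\mathbf{B}$ is the vector field known as magnetic flux density. In general, a vector field $\mathbf{f}$ is called solenoidal if $\nabla \cdot \mathbf{f} = 0$. Of course, $\nabla \cdot \mathbf{f}$ is given in Cartesian frames by the familiar expression

$$\nabla \cdot \mathbf{f} = \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}.$$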

Exercise 4.19. Use (4.32) to express $\nabla \cdot \mathbf{f}$ in the cylindrical and spherical coordinate systems.

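The curl of a tensor field $\mathbf{A}$ may be computed as follows. We start with

$$\begin{aligned}
\nabla \times \mathbf{A} &= \mathbf{r}^k \times \frac{\partial}{\partial q^k}\mathbf{A} \\
&= \mathbf{r}^k \times \mathbf{r}^i \mathbf{r}^j \nabla_k a_{ij} \\
&= \epsilon^{kin}\mathbf{r}_n \mathbf{r}^j \nabla_k a_{ij} \\
&= \epsilon^{kin}\mathbf{r}_n \mathbf{r}^j \left(\frac{\partial a_{ij}}{\partial q^k} - \Gamma^s_{ki} a_{sj} - \Gamma^s_{kj} a_{is}\right).
\end{aligned}$$

We then use

$$\epsilon^{kin}\Gamma^s_{ki} = 0, \qquad -\mathbf{r}^j \Gamma^s_{kj} = \frac{\partial \mathbf{r}^s}{\partial q^k},$$

to get

$$\nabla \times \mathbf{A} = \epsilon^{kin}\mathbf{r}_n \left(\mathbf{r}^j \frac{\partial a_{ij}}{\partial q^k} + \frac{\partial \mathbf{r}^s}{\partial q^k} a_{is}\right)$$

and hence

$$\nabla \times \mathbf{A} = \epsilon^{kin}\mathbf{r}_n \frac{\partial}{\partial q^k}\left(\mathbf{r}^j a_{ij}\right).$$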

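For the divergence of a tensor field $\mathbf{A}$, we write

$$\begin{aligned}
\nabla \cdot \mathbf{A} &= \mathbf{r}^k \cdot \frac{\partial}{\partial q^k}\mathbf{A} \\
&= \mathbf{r}^k \cdot \mathbf{r}_i \mathbf{r}_j \nabla_k a^{ij} \\
&= \mathbf{r}_j \nabla_i a^{ij} \\
&= \mathbf{r}_j \left(\frac{\partial a^{ij}}{\partial q^i} + \Gamma^i_{is} a^{sj} + \Gamma^j_{is} a^{is}\right).
\end{aligned}$$

But

$$\mathbf{r}_j \Gamma^j_{is} = \frac{\partial \mathbf{r}_s}{\partial q^i}$$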


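so we have

$$\begin{aligned}
\nabla \cdot \mathbf{A} &= \mathbf{r}_j \frac{\partial a^{ij}}{\partial q^i} + \mathbf{r}_j \frac{1}{\sqrt{g}}\frac{\partial \sqrt{g}}{\partial q^s} a^{sj} + \frac{\partial \mathbf{r}_s}{\partial q^i} a^{is} \\
&= \frac{\partial}{\partial q^i}\left(\mathbf{r}_j a^{ij}\right) + \mathbf{r}_j \frac{1}{\sqrt{g}}\frac{\partial \sqrt{g}}{\partial q^i} a^{ij}.
\end{aligned}$$

Finally then,

$$\nabla \cdot \mathbf{A} = \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\, a^{ij}\mathbf{r}_j\right). \tag{4.33}$$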

Many problems of mathematical physics reduce to Poisson’s equation or Laplace’s equation. The unknown function in these equations can be a scalar function or (as in electrodynamics) a vector function. The equations of the linear theory of elasticity in displacements also contain the Laplacian operator and another type of operation involving the nabla operator. In Cartesian frames it is simple to write out corresponding expressions. Solving corresponding problems with the use of curvilinear coordinates, we need to find the representation of these formulas. We begin with the formulas that relate to the second-order tensor $\nabla\nabla f$. We have

$$\nabla\nabla f = \mathbf{r}^i \frac{\partial}{\partial q^i}\left(\mathbf{r}^j \frac{\partial f}{\partial q^j}\right) = \mathbf{r}^i \left(\frac{\partial}{\partial q^i}\frac{\partial f}{\partial q^j} - \Gamma^k_{ij}\frac{\partial f}{\partial q^k}\right)\mathbf{r}^j,$$

hence

$$\nabla\nabla f = \mathbf{r}^i \mathbf{r}^j \left(\frac{\partial^2 f}{\partial q^i \partial q^j} - \Gamma^k_{ij}\frac{\partial f}{\partial q^k}\right). \tag{4.34}$$

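From this we see that

$$(\nabla\nabla f)^T = \nabla\nabla f \tag{4.35}$$

which is obvious by symmetry (in $i$ and $j$) of the Christoffel coefficient and the rest of the expression on the right side of (4.34). A formal insertion² of the dot product operation between the vectors of (4.34) allows us to generate an expression for the Laplacian $\nabla^2 f \equiv \nabla \cdot \nabla f$:

$$\nabla^2 f = \mathbf{r}^i \cdot \mathbf{r}^j \left(\frac{\partial^2 f}{\partial q^i \partial q^j} - \Gamma^k_{ij}\frac{\partial f}{\partial q^k}\right) = g^{ij}\left(\frac{\partial^2 f}{\partial q^i \partial q^j} - \Gamma^k_{ij}\frac{\partial f}{\partial q^k}\right). \tag{4.36}$$

² This sort of operation can be done with any dyad of vectors; we can generate a scalar $\mathbf{a} \cdot \mathbf{b}$ from $\mathbf{a}\mathbf{b}$ by inserting a dot. This operation can have additional meaning when done with a tensor $\mathbf{A}$: if we write the tensor in mixed form $a_i^{\cdot j}\mathbf{r}^i \mathbf{r}_j$ we obtain $a_i^{\cdot j}\mathbf{r}^i \cdot \mathbf{r}_j = a_i^{\cdot i}$. This is the first invariant of $\mathbf{A}$, known as the trace of $\mathbf{A}$.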


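Hence Laplace’s equation $\nabla^2 f = 0$ appears in generalized coordinates as

$$g^{ij}\left(\frac{\partial^2 f}{\partial q^i \partial q^j} - \Gamma^k_{ij}\frac{\partial f}{\partial q^k}\right) = 0.$$

In a Cartesian frame (4.36) gives us

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2},$$

while in cylindrical and spherical frames

$$\nabla^2 f = \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho \frac{\partial f}{\partial \rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \phi^2} + \frac{\partial^2 f}{\partial z^2},$$

$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial f}{\partial r}\right) + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta \frac{\partial f}{\partial \theta}\right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 f}{\partial \phi^2},$$

respectively.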
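As a worked check, the cylindrical form can be recovered from (4.36) directly. The sketch assumes the standard cylindrical data with $q^1 = \rho$, $q^2 = \phi$, $q^3 = z$: $g^{11} = g^{33} = 1$, $g^{22} = 1/\rho^2$, and nonzero Christoffel coefficients $\Gamma^1_{22} = -\rho$, $\Gamma^2_{12} = \Gamma^2_{21} = 1/\rho$. Only the diagonal $g^{ij}$ contribute, so $\Gamma^2_{12}$ drops out.

```latex
\begin{aligned}
\nabla^2 f
 &= g^{11}\frac{\partial^2 f}{\partial \rho^2}
  + g^{22}\left(\frac{\partial^2 f}{\partial \phi^2} - \Gamma^1_{22}\frac{\partial f}{\partial \rho}\right)
  + g^{33}\frac{\partial^2 f}{\partial z^2} \\
 &= \frac{\partial^2 f}{\partial \rho^2}
  + \frac{1}{\rho}\frac{\partial f}{\partial \rho}
  + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \phi^2}
  + \frac{\partial^2 f}{\partial z^2} \\
 &= \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho\frac{\partial f}{\partial \rho}\right)
  + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \phi^2}
  + \frac{\partial^2 f}{\partial z^2}.
\end{aligned}
```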

Exercise 4.20. Show that a formal insertion of the cross product operation between the vectors of (4.34) leads to the useful identity

$$\nabla \times \nabla f = \mathbf{0},$$

holding for any scalar field $f$.

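Since usual and covariant differentiation amount to the same thing for a scalar $f$, we can write

$$\nabla\nabla f = \mathbf{r}^i \frac{\partial}{\partial q^i}\,\mathbf{r}^j \frac{\partial}{\partial q^j} f = \mathbf{r}^i \frac{\partial}{\partial q^i}\left(\mathbf{r}^j \nabla_j f\right) = \mathbf{r}^i \mathbf{r}^j \nabla_i \nabla_j f = \mathbf{r}^j \mathbf{r}^i \nabla_i \nabla_j f$$

where in the last step we used (4.35). The corresponding result for the Laplacian is

$$\nabla^2 f = \mathbf{r}^i \cdot \mathbf{r}^j \nabla_i \nabla_j f = g^{ij}\nabla_i \nabla_j f = \nabla^j \nabla_j f$$

where

$$\nabla^j \equiv g^{ij}\nabla_i.$$

The Laplacian of a vector arises in physical applications such as electromagnetic field theory. For this we have

$$\nabla^2 \mathbf{f} = \nabla \cdot \nabla \mathbf{f} = \mathbf{r}^k \cdot \frac{\partial}{\partial q^k}\left(\mathbf{r}^i \mathbf{r}_j \nabla_i f^j\right) = \mathbf{r}^k \cdot \mathbf{r}^i \mathbf{r}_j \nabla_k \nabla_i f^j = g^{ki}\mathbf{r}_j \nabla_k \nabla_i f^j$$

hence

$$\nabla^2 \mathbf{f} = \mathbf{r}_j \nabla^i \nabla_i f^j.$$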


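Also

$$\nabla\nabla \cdot \mathbf{f} = \mathbf{r}^i \frac{\partial}{\partial q^i}\left[\frac{1}{\sqrt{g}}\frac{\partial}{\partial q^j}\left(\sqrt{g}\, f^j\right)\right] = \mathbf{r}^i \nabla_i \nabla_j f^j.$$

It is possible to demonstrate that

$$\nabla \times \nabla \times \mathbf{f} = \nabla\nabla \cdot \mathbf{f} - \nabla^2 \mathbf{f}.$$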

4.9 Orthogonal Coordinate Systems

The most frequently used coordinate frames are Cartesian, cylindrical, and spherical. All of these are orthogonal. There are many other orthogonal coordinate frames in use as well. It is sensible to give a general treatment of these systems because mutual orthogonality of the frame vectors leads to simplification in many formulae. Additional motivation is provided by the fact that in applications it is important to know the magnitudes of field components. The general formulations given above are inconvenient for this; the general frame vectors are not of unit length, hence the magnitude of the projection of a vector onto an orthogonal frame direction is not the corresponding component of the vector. The physical components of a vector are conveniently displayed in orthogonal frames with use of the Lamé coefficients.

In this section we consider frames where the coordinate vectors are mutually orthogonal:

$$\mathbf{r}_i \cdot \mathbf{r}_j = 0, \qquad i \neq j.$$

The Lamé coefficients $H_i$ are

$$(H_i)^2 = g_{ii} = \mathbf{r}_i \cdot \mathbf{r}_i \qquad (i = 1, 2, 3).$$

Note that $H_i$ is the length of the frame vector $\mathbf{r}_i$. At each $(q^1, q^2, q^3)$ the coordinate frame is orthogonal. In the orthogonal frame, by the construction of the reciprocal basis, the vectors $\mathbf{r}^i$ are co-directed with $\mathbf{r}_i$ and the product of their lengths is 1. Hence

$$\mathbf{r}^i = \mathbf{r}_i/(H_i)^2 \qquad (i = 1, 2, 3).$$

Let us introduce the frame whose vectors are co-directed with the basis vectors but have unit length:

$$\hat{\mathbf{r}}_i = \mathbf{r}_i/H_i \qquad (i = 1, 2, 3).$$

Thus at each point the vectors $\hat{\mathbf{r}}_i$ form a Cartesian basis, a frame that rotates when the origin of the frame moves from point to point. We shall present all the main formulas of differentiation when vectors are given in this basis. They are not convenient in theory since many useful properties of symmetry are lost, but they are necessary when doing calculations in corresponding coordinates. We begin by noting that

$$\mathbf{r}_i = H_i \hat{\mathbf{r}}_i, \qquad \mathbf{r}^i = \hat{\mathbf{r}}_i/H_i \qquad (i = 1, 2, 3).$$

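Let us compute the $\hat{\mathbf{r}}_i$ in the cylindrical and spherical systems. In a cylindrical frame where

$$H_1 = 1, \qquad H_2 = \rho, \qquad H_3 = 1,$$

we have

$$\begin{aligned}
\hat{\mathbf{r}}_1 &= \hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi, \\
\hat{\mathbf{r}}_2 &= -\hat{\mathbf{x}}\sin\phi + \hat{\mathbf{y}}\cos\phi, \\
\hat{\mathbf{r}}_3 &= \hat{\mathbf{z}}.
\end{aligned}$$

In a spherical frame where

$$H_1 = 1, \qquad H_2 = r, \qquad H_3 = r\sin\theta,$$

we have

$$\begin{aligned}
\hat{\mathbf{r}}_1 &= \hat{\mathbf{x}}\sin\theta\cos\phi + \hat{\mathbf{y}}\sin\theta\sin\phi + \hat{\mathbf{z}}\cos\theta, \\
\hat{\mathbf{r}}_2 &= \hat{\mathbf{x}}\cos\theta\cos\phi + \hat{\mathbf{y}}\cos\theta\sin\phi - \hat{\mathbf{z}}\sin\theta, \\
\hat{\mathbf{r}}_3 &= -\hat{\mathbf{x}}\sin\phi + \hat{\mathbf{y}}\cos\phi.
\end{aligned}$$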
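The spherical Lamé coefficients quoted above can be reproduced numerically from the coordinate map alone. The following is a minimal sketch, assuming the usual spherical parametrization of the position vector; the function names, the sample point, and the finite-difference step are illustrative choices rather than anything prescribed by the text.

```python
import math

def spherical(q):
    """Cartesian position for spherical coordinates q = (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def frame_vectors(position, q, h=1e-6):
    """Approximate r_i = d(position)/dq^i by central differences."""
    vectors = []
    for i in range(3):
        forward, backward = list(q), list(q)
        forward[i] += h
        backward[i] -= h
        xf, xb = position(forward), position(backward)
        vectors.append(tuple((a - b) / (2 * h) for a, b in zip(xf, xb)))
    return vectors

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lame_coefficients(position, q):
    """Return (H_1, H_2, H_3), where H_i is the length of r_i."""
    return tuple(math.sqrt(dot(r, r)) for r in frame_vectors(position, q))

q = (2.0, 0.7, 0.3)
H = lame_coefficients(spherical, q)
r1, r2, r3 = frame_vectors(spherical, q)
off_diagonal = (dot(r1, r2), dot(r2, r3), dot(r3, r1))
print(H, off_diagonal)  # compare H with (1, r, r sin(theta)); products near 0
```

Passing a different coordinate map as `position` gives the same check for any other candidate orthogonal system.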

Differentiation in the orthogonal basis

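Using the definition we can represent the $\nabla$-operator in new terms as

$$\nabla = \frac{\hat{\mathbf{r}}_i}{H_i}\frac{\partial}{\partial q^i}.$$

(In this formula there is summation over $i$, and this continues to hold in all formulas of this section below. The convention on summation over sub- and super-indices is modified here since in a Cartesian system, which the frame $\hat{\mathbf{r}}_i$ locally constitutes, the reciprocal and main bases coincide.) Now let us find the formulas of differentiation of the new frame vectors. We begin with the formula (4.18):

$$\mathbf{r}_{ij} \cdot \mathbf{r}_t = \frac{1}{2}\left(\frac{\partial g_{it}}{\partial q^j} + \frac{\partial g_{tj}}{\partial q^i} - \frac{\partial g_{ji}}{\partial q^t}\right) = H_t \frac{\partial H_i}{\partial q^j}\delta_{it} + H_t \frac{\partial H_j}{\partial q^i}\delta_{jt} - H_i \frac{\partial H_j}{\partial q^t}\delta_{ij}.$$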

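Here we used the fact that $g_{ij} = 0$ if $i \neq j$. On the other hand

$$\begin{aligned}
\mathbf{r}_{ij} \cdot \mathbf{r}_t &= \frac{\partial}{\partial q^j}\left(H_i \hat{\mathbf{r}}_i\right) \cdot H_t \hat{\mathbf{r}}_t \\
&= \frac{\partial H_i}{\partial q^j}\hat{\mathbf{r}}_i \cdot H_t \hat{\mathbf{r}}_t + H_i \frac{\partial \hat{\mathbf{r}}_i}{\partial q^j} \cdot H_t \hat{\mathbf{r}}_t \\
&= \frac{\partial H_i}{\partial q^j} H_t \delta_{it} + H_i H_t \frac{\partial \hat{\mathbf{r}}_i}{\partial q^j} \cdot \hat{\mathbf{r}}_t.
\end{aligned}$$

Thus

$$\frac{\partial \hat{\mathbf{r}}_i}{\partial q^j} \cdot \hat{\mathbf{r}}_t = \frac{1}{H_i}\frac{\partial H_t}{\partial q^i}\delta_{jt} - \frac{1}{H_t}\frac{\partial H_i}{\partial q^t}\delta_{ij}.$$

Therefore

$$\frac{\partial \hat{\mathbf{r}}_i}{\partial q^j} = \sum_{t=1}^{3}\left(\frac{1}{H_i}\frac{\partial H_t}{\partial q^i}\delta_{jt} - \frac{1}{H_t}\frac{\partial H_i}{\partial q^t}\delta_{ij}\right)\hat{\mathbf{r}}_t.$$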
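As a worked illustration, take the cylindrical values $H_1 = 1$, $H_2 = \rho$, $H_3 = 1$ with $q^1 = \rho$, $q^2 = \phi$, $q^3 = z$. The formula then reproduces the familiar statement that differentiating the radial and azimuthal unit vectors with respect to $\phi$ rotates them by a right angle; all other derivatives of the unit vectors vanish in this system.

```latex
\begin{aligned}
\frac{\partial \hat{\mathbf{r}}_1}{\partial q^2}
 &= \sum_{t=1}^{3}\left(\frac{1}{H_1}\frac{\partial H_t}{\partial q^1}\delta_{2t}
   - \frac{1}{H_t}\frac{\partial H_1}{\partial q^t}\delta_{12}\right)\hat{\mathbf{r}}_t
  = \frac{1}{H_1}\frac{\partial H_2}{\partial \rho}\,\hat{\mathbf{r}}_2
  = \hat{\mathbf{r}}_2, \\
\frac{\partial \hat{\mathbf{r}}_2}{\partial q^2}
 &= \sum_{t=1}^{3}\left(\frac{1}{H_2}\frac{\partial H_t}{\partial q^2}\delta_{2t}
   - \frac{1}{H_t}\frac{\partial H_2}{\partial q^t}\delta_{22}\right)\hat{\mathbf{r}}_t
  = -\frac{1}{H_1}\frac{\partial H_2}{\partial \rho}\,\hat{\mathbf{r}}_1
  = -\hat{\mathbf{r}}_1.
\end{aligned}
```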

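(Note that the components of vectors are given in the frame of the same $\hat{\mathbf{r}}_i$.) Using this we derive

$$\nabla \mathbf{f} = \sum_{i,j=1}^{3}\frac{\hat{\mathbf{r}}_j}{H_j}\frac{\partial}{\partial q^j}\left(f_i \hat{\mathbf{r}}_i\right).$$

The gradient, divergence, and rotation of a vector field $\mathbf{f}$ are given by

$$\nabla \mathbf{f} = \hat{\mathbf{r}}_i \hat{\mathbf{r}}_j \left(\frac{1}{H_i}\frac{\partial f_j}{\partial q^i} - \frac{f_i}{H_i H_j}\frac{\partial H_i}{\partial q^j} + \delta_{ij}\frac{f_k}{H_k}\frac{1}{H_i}\frac{\partial H_i}{\partial q^k}\right),$$

$$\nabla \cdot \mathbf{f} = \frac{1}{H_1 H_2 H_3}\left(\frac{\partial}{\partial q^1}(H_2 H_3 f_1) + \frac{\partial}{\partial q^2}(H_3 H_1 f_2) + \frac{\partial}{\partial q^3}(H_1 H_2 f_3)\right) \tag{4.37}$$

and

$$\nabla \times \mathbf{f} = \frac{1}{2}\frac{\hat{\mathbf{r}}_i \times \hat{\mathbf{r}}_j}{H_i H_j}\left(\frac{\partial}{\partial q^i}(H_j f_j) - \frac{\partial}{\partial q^j}(H_i f_i)\right). \tag{4.38}$$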


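Using the gradient we can write out the tensor of small strains for a displacement vector $\mathbf{u} = u_i \hat{\mathbf{r}}_i$, which is the main object of linear elasticity:

$$\boldsymbol{\varepsilon} = \frac{1}{2}\left(\nabla \mathbf{u} + \nabla \mathbf{u}^T\right) = \frac{1}{2}\hat{\mathbf{r}}_i \hat{\mathbf{r}}_j \left(\frac{1}{H_i}\frac{\partial u_j}{\partial q^i} + \frac{1}{H_j}\frac{\partial u_i}{\partial q^j} - \frac{u_i}{H_i H_j}\frac{\partial H_i}{\partial q^j} - \frac{u_j}{H_i H_j}\frac{\partial H_j}{\partial q^i} + 2\delta_{ij}\frac{u_t}{H_i H_t}\frac{\partial H_i}{\partial q^t}\right). \tag{4.39}$$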

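The Laplacian of a scalar field $f$ is

$$\nabla^2 f = \frac{1}{H_1 H_2 H_3}\left[\frac{\partial}{\partial q^1}\left(\frac{H_2 H_3}{H_1}\frac{\partial f}{\partial q^1}\right) + \frac{\partial}{\partial q^2}\left(\frac{H_3 H_1}{H_2}\frac{\partial f}{\partial q^2}\right) + \frac{\partial}{\partial q^3}\left(\frac{H_1 H_2}{H_3}\frac{\partial f}{\partial q^3}\right)\right]. \tag{4.40}$$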
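Formula (4.40) translates directly into a finite-difference routine. The sketch below is one possible discretization (nested central differences with half steps), not a method taken from the text; `laplacian`, `shifted`, and `spherical_lame` are hypothetical helper names, and the test function $f = r^2$, whose Laplacian is 6, is chosen only for illustration.

```python
import math

def shifted(q, i, delta):
    """Copy of q with coordinate i moved by delta."""
    out = list(q)
    out[i] += delta
    return out

def laplacian(f, lame, q, h=1e-3):
    """Finite-difference estimate of (4.40) at the point q.

    f(q) is the scalar field in curvilinear coordinates and
    lame(q) returns the Lame coefficients (H_1, H_2, H_3).
    """
    def flux(i, point):
        # (H_1 H_2 H_3 / H_i^2) * df/dq^i, evaluated at `point`
        H = lame(point)
        weight = H[0] * H[1] * H[2] / H[i] ** 2
        slope = (f(shifted(point, i, h / 2)) - f(shifted(point, i, -h / 2))) / h
        return weight * slope

    total = 0.0
    for i in range(3):
        total += (flux(i, shifted(q, i, h / 2)) - flux(i, shifted(q, i, -h / 2))) / h
    H = lame(q)
    return total / (H[0] * H[1] * H[2])

def spherical_lame(q):
    r, theta, _ = q
    return (1.0, r, r * math.sin(theta))

value = laplacian(lambda q: q[0] ** 2, spherical_lame, (2.0, 0.7, 0.3))
print(value)  # close to 6 for f = r^2
```

The routine needs only the Lamé coefficients, so swapping in the cylindrical values $(1, \rho, 1)$ would exercise the other specialization given in § 4.8.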

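In the cylindrical and spherical coordinate systems, for example, (4.37) yields

$$\nabla \cdot \mathbf{f} = \frac{1}{\rho}\frac{\partial}{\partial \rho}(\rho f_\rho) + \frac{1}{\rho}\frac{\partial f_\phi}{\partial \phi} + \frac{\partial f_z}{\partial z}$$

and

$$\nabla \cdot \mathbf{f} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 f_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial \theta}(\sin\theta\, f_\theta) + \frac{1}{r\sin\theta}\frac{\partial f_\phi}{\partial \phi},$$

while (4.38) yields

$$\nabla \times \mathbf{f} = \hat{\boldsymbol{\rho}}\left(\frac{1}{\rho}\frac{\partial f_z}{\partial \phi} - \frac{\partial f_\phi}{\partial z}\right) + \hat{\boldsymbol{\phi}}\left(\frac{\partial f_\rho}{\partial z} - \frac{\partial f_z}{\partial \rho}\right) + \hat{\mathbf{z}}\,\frac{1}{\rho}\left(\frac{\partial (\rho f_\phi)}{\partial \rho} - \frac{\partial f_\rho}{\partial \phi}\right)$$

and

$$\nabla \times \mathbf{f} = \hat{\mathbf{r}}\,\frac{1}{r\sin\theta}\left(\frac{\partial}{\partial \theta}(\sin\theta\, f_\phi) - \frac{\partial f_\theta}{\partial \phi}\right) + \hat{\boldsymbol{\theta}}\,\frac{1}{r\sin\theta}\left(\frac{\partial f_r}{\partial \phi} - \sin\theta\,\frac{\partial (r f_\phi)}{\partial r}\right) + \hat{\boldsymbol{\phi}}\,\frac{1}{r}\left(\frac{\partial (r f_\theta)}{\partial r} - \frac{\partial f_r}{\partial \theta}\right).$$

The specializations of (4.40) to cylindrical and spherical coordinates were given in § 4.8.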


4.10 Some Formulas of Integration

Let $f(x_1, x_2, x_3)$ be a continuous function of the Cartesian coordinates $x_1, x_2, x_3$ in a compact volume $V$. Let $q^1, q^2, q^3$ be curvilinear coordinates in the same volume, in one-to-one continuously differentiable correspondence with the Cartesian coordinates, so after transformation of the coordinates we shall write out the same function as $f(q^1, q^2, q^3)$. The transformation is

$$\int_V f(x_1, x_2, x_3)\, dx_1\, dx_2\, dx_3 = \int_V f(q^1, q^2, q^3)\, J\, dq^1\, dq^2\, dq^3$$

where

$$J = \sqrt{g} = \left|\frac{\partial x_i}{\partial q^j}\right|$$

is the Jacobian.

Exercise 4.21. Show that the Jacobian determinants of the transformations from Cartesian coordinates to cylindrical and spherical coordinates are, respectively,

$$\left|\frac{\partial (x, y, z)}{\partial (\rho, \phi, z)}\right| = \rho, \qquad \left|\frac{\partial (x, y, z)}{\partial (r, \theta, \phi)}\right| = r^2 \sin\theta.$$
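To illustrate how the Jacobian enters a computation, the following sketch integrates $f = 1$ over a ball in spherical coordinates with the midpoint rule, using $J = r^2\sin\theta$ as stated in the exercise. The function name `ball_volume`, the grid size, and the radius are illustrative assumptions; the result is compared with the elementary value $\tfrac{4}{3}\pi R^3$.

```python
import math

def ball_volume(radius, n=40):
    """Midpoint-rule estimate of the integral of J = r^2 sin(theta) over a ball."""
    dr = radius / n
    dtheta = math.pi / n
    dphi = 2.0 * math.pi / n
    total = 0.0
    for a in range(n):
        r = (a + 0.5) * dr
        for b in range(n):
            theta = (b + 0.5) * dtheta
            jacobian = r * r * math.sin(theta)
            # The integrand f = 1 does not depend on phi, so the phi-sum
            # contributes n identical cells of width dphi.
            total += jacobian * dr * dtheta * (n * dphi)
    return total

estimate = ball_volume(1.5)
exact = 4.0 / 3.0 * math.pi * 1.5 ** 3
print(estimate, exact)
```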

Exercise 4.22. Two successive coordinate transformations are given by $x_i = x_i(q^j)$ and $q^i = q^i(\tilde{q}^j)$. Show that the Jacobian determinant of the composite transformation $x_i = x_i(\tilde{q}^j)$ is given by

$$\left|\frac{\partial (x_1, x_2, x_3)}{\partial (\tilde{q}^1, \tilde{q}^2, \tilde{q}^3)}\right| = \left|\frac{\partial (x_1, x_2, x_3)}{\partial (q^1, q^2, q^3)}\right| \left|\frac{\partial (q^1, q^2, q^3)}{\partial (\tilde{q}^1, \tilde{q}^2, \tilde{q}^3)}\right|.$$

For functions $f(x_1, x_2, x_3)$ and $g(x_1, x_2, x_3)$ that are continuously differentiable on a compact volume $V$ with piecewise smooth boundary $S$, there is the well known Gauss–Ostrogradsky formula for integration by parts:

$$\int_V \frac{\partial f}{\partial x_k}\, g\, dx_1\, dx_2\, dx_3 = -\int_V \frac{\partial g}{\partial x_k}\, f\, dx_1\, dx_2\, dx_3 + \int_S f g\, n_k\, dS,$$

where $dS$ is the differential element of area on $S$ and $n_k$ is the projection of the outward unit normal $\mathbf{n}$ from $S$ onto the axis $\mathbf{i}_k$. In the particular case $g = 1$ we have

$$\int_V \frac{\partial f}{\partial x_k}\, dx_1\, dx_2\, dx_3 = \int_S f n_k\, dS. \tag{4.41}$$


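Let us use (4.41) to derive some formulas involving the nabla operator that are frequently used in applications. These formulas will be valid in curvilinear coordinates, despite the fact that the intermediate transformations will be done in Cartesian coordinates, because the final results will be written in non-coordinate form. We begin with the integral of $\nabla f$:

$$\int_V \nabla f\, dV = \int_V \mathbf{i}_k \frac{\partial f}{\partial x_k}\, dx_1\, dx_2\, dx_3 = \mathbf{i}_k \int_S f n_k\, dS = \int_S f\,\mathbf{n}\, dS.$$

Now consider an analogous formula for a vector function $\mathbf{f}$:

$$\begin{aligned}
\int_V \nabla \mathbf{f}\, dV &= \int_V \mathbf{i}_k \frac{\partial f_t}{\partial x_k}\mathbf{i}_t\, dx_1\, dx_2\, dx_3 \\
&= \mathbf{i}_k \mathbf{i}_t \int_S n_k f_t\, dS \\
&= \int_S n_k \mathbf{i}_k\, f_t \mathbf{i}_t\, dS \\
&= \int_S \mathbf{n}\mathbf{f}\, dS.
\end{aligned}$$

Since the left- and right-hand sides are written in non-coordinate form we can use this formula with any coordinate frame with $dV = \sqrt{g}\, dq^1\, dq^2\, dq^3$. This is the formula for a tensor $\nabla \mathbf{f}$; for its trace we have

$$\begin{aligned}
\int_V \nabla \cdot \mathbf{f}\, dV &= \int_V \mathbf{i}_k \frac{\partial f_t}{\partial x_k} \cdot \mathbf{i}_t\, dx_1\, dx_2\, dx_3 \\
&= \mathbf{i}_k \cdot \mathbf{i}_t \int_S n_k f_t\, dS \\
&= \int_S n_k \mathbf{i}_k \cdot f_t \mathbf{i}_t\, dS \\
&= \int_S \mathbf{n} \cdot \mathbf{f}\, dS.
\end{aligned}$$

In a similar fashion we can derive

$$\int_V \nabla \times \mathbf{f}\, dV = \int_S \mathbf{n} \times \mathbf{f}\, dS.$$

It is easily seen that

$$\int_V \nabla \mathbf{f}^T\, dV = \int_S (\mathbf{n}\mathbf{f})^T\, dS = \int_S \mathbf{f}\mathbf{n}\, dS.$$

In a similar fashion the reader can use (4.41) to derive the formulas

$$\int_V \nabla \mathbf{A}\, dV = \int_S \mathbf{n}\mathbf{A}\, dS,$$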


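$$\int_V \nabla \cdot \mathbf{A}\, dV = \int_S \mathbf{n} \cdot \mathbf{A}\, dS,$$

and

$$\int_V \nabla \times \mathbf{A}\, dV = \int_S \mathbf{n} \times \mathbf{A}\, dS$$

for a tensor field $\mathbf{A}$. Finally let us write out Stokes’s formula in non-coordinate form. Recall that this relates a vector function $\mathbf{f}$ given on a simply-connected surface $S$ to its circulation over the piecewise smooth boundary contour $\Gamma$:

$$\oint_\Gamma \mathbf{f} \cdot d\mathbf{r} = \int_S (\mathbf{n} \times \nabla) \cdot \mathbf{f}\, dS.$$

For a tensor $\mathbf{A}$ of second-order this formula extends to the two formulas

$$\oint_\Gamma d\mathbf{r} \cdot \mathbf{A} = \int_S (\mathbf{n} \times \nabla) \cdot \mathbf{A}\, dS$$

and

$$\oint_\Gamma \mathbf{A} \cdot d\mathbf{r} = \int_S (\mathbf{n} \times \nabla) \cdot \mathbf{A}^T\, dS.$$

We leave these for the reader to prove. We should note that Stokes’s formulas hold only when $S$ is simply-connected; for a doubly- or multiply-connected surface, the formulas must be amended by some cyclic constants.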

4.11 Problems

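In this problem set, we let $\mathbf{u}$ denote a vector field; $f, g, h$ smooth functions; $a, b, c$ arbitrary constants; $\mathbf{A}$ a second-order tensor; $\mathbf{r}$ the position vector of a body point; $\mathbf{n}$ the unit external normal to the body boundary.

4.1 Let $f = f(r)$, where $r^2 = \mathbf{r} \cdot \mathbf{r}$. Find $\nabla f$ and $\nabla^2 f$.

4.2 Let $a, b, c$ be arbitrary constants, $f, g$ arbitrary functions, and $\mathbf{u}$ a given vector field. Find $\nabla \mathbf{u}$ and $\nabla \cdot \mathbf{u}$ for the following $\mathbf{u}$.

(a) $\mathbf{u} = a x_1 \mathbf{i}_1 + b x_2 \mathbf{i}_2 + c x_3 \mathbf{i}_3$;
(b) $\mathbf{u} = a x_2 \mathbf{i}_1$;
(c) $\mathbf{u} = a\mathbf{r}$;
(d) $\mathbf{u} = f(r)\mathbf{e}_r$ (assume polar coordinates);
(e) $\mathbf{u} = f(r)\mathbf{e}_\phi$ (assume polar coordinates);
(f) $\mathbf{u} = f(r)\mathbf{e}_z$ (assume cylindrical coordinates);
(g) $\mathbf{u} = f(r)\mathbf{e}_r$ (assume spherical coordinates);
(h) $\mathbf{u} = \boldsymbol{\omega} \times \mathbf{r}$, $\boldsymbol{\omega} = \text{const}$;
(i) $\mathbf{u} = f(\phi)\mathbf{e}_z + g(\phi)\mathbf{e}_\phi$ (assume cylindrical coordinates);
(j) $\mathbf{u} = f(z)\mathbf{e}_z + g(\phi)\mathbf{e}_\phi$ (assume cylindrical coordinates);
(k) $\mathbf{u} = \mathbf{A} \cdot \mathbf{r}$, $\mathbf{A} = \text{const}$.

4.3 Demonstrate:

(a) $\nabla \cdot (\mathbf{A} \cdot \mathbf{f}) = (\nabla \cdot \mathbf{A}) \cdot \mathbf{f} + \mathbf{A}^T \cdot\cdot\, \nabla \mathbf{f}$;
(b) $\nabla \cdot (\mathbf{A} \cdot \mathbf{B}) = (\nabla \cdot \mathbf{A}) \cdot \mathbf{B} + \mathbf{A}^T \cdot\cdot\, \nabla \mathbf{B}$;
(c) $\nabla \cdot (\mathbf{A} \times \mathbf{r}) = (\nabla \cdot \mathbf{A}) \times \mathbf{r}$, if $\mathbf{A}$ is a symmetric tensor: $\mathbf{A} = \mathbf{A}^T$;
(d) $\nabla \cdot [(\mathbf{E} \times \boldsymbol{\omega}) \times \mathbf{r}] = 2\boldsymbol{\omega} + (\nabla \times \boldsymbol{\omega}) \times \mathbf{r}$;
(e) $\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla \cdot (\nabla \mathbf{A})$;
(f) $\nabla \times (\mathbf{f} \times \mathbf{r}) = \mathbf{r} \cdot \nabla \mathbf{f} - \mathbf{r}(\nabla \cdot \mathbf{f}) + 2\mathbf{f}$;
(g) $\operatorname{tr}[\nabla \times (\mathbf{E} \times \boldsymbol{\omega})] = -2\nabla \cdot \boldsymbol{\omega}$;
(h) $\nabla \cdot [(\nabla \mathbf{f})^T - (\nabla \cdot \mathbf{f})\mathbf{E}] = \mathbf{0}$;
(i) $\nabla \cdot [\nabla f \times \nabla g] = 0$;
(j) $\nabla \times (\nabla \times \mathbf{A})^T$ is a symmetric tensor if $\mathbf{A}$ is symmetric;
(k) $\nabla \cdot (\mathbf{f}\mathbf{f}) = \mathbf{f}\,\nabla \cdot \mathbf{f} + (\nabla \mathbf{f}) \cdot \mathbf{f} - \mathbf{f} \times (\nabla \times \mathbf{f})$;
(l) $\operatorname{tr}[\nabla \times (\mathbf{A} \cdot \mathbf{B})] = \mathbf{B} \cdot\cdot\, (\nabla \times \mathbf{A}) - \mathbf{A}^T \cdot\cdot\, (\nabla \times \mathbf{B}^T)$;
(m) $(\nabla \mathbf{f})_\times = \nabla \times \mathbf{f}$.

4.4 Find

(a) $\nabla \cdot (\mathbf{E}\mathbf{r})$;
(b) $\nabla \times [\nabla \times (\mathbf{r} \times \mathbf{A} \times \mathbf{r})]^T$, if $\mathbf{A} = \mathbf{A}^T = \text{const}$;
(c) $\nabla \cdot (\mathbf{A} \cdot \mathbf{r})$ if $\mathbf{A} = \text{const}$;
(d) $\nabla \cdot (f\mathbf{E})$;
(e) $\nabla \cdot (\mathbf{r}\mathbf{E})$;
(f) $\nabla \cdot (\mathbf{r}\mathbf{r})$.

4.5 Let $f, g, h$ be arbitrary smooth functions and $r^2 = \mathbf{r} \cdot \mathbf{r}$. Find the divergence of the tensors $\mathbf{A}$ given by the following formulas.

(a) $\mathbf{A} = f(r)\mathbf{e}_r \mathbf{e}_r + g(r)\mathbf{e}_\phi \mathbf{e}_\phi + h(z)\mathbf{e}_z \mathbf{e}_z$;
(b) $\mathbf{A} = f(r)\mathbf{e}_r \mathbf{e}_r + g(r)\mathbf{e}_\phi \mathbf{e}_\phi + g(r)\mathbf{e}_\theta \mathbf{e}_\theta$;
(c) $\mathbf{A} = f(r)\mathbf{e}_r \mathbf{e}_r + g(r)\mathbf{e}_\phi \mathbf{e}_\phi + h(r)\mathbf{e}_z \mathbf{e}_z$;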


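(d) $\mathbf{A} = f(x_1)\mathbf{i}_1 \mathbf{i}_1 + g(x_2)\mathbf{i}_2 \mathbf{i}_2 + h(x_3)\mathbf{i}_3 \mathbf{i}_3$;
(e) $\mathbf{A} = f(x_2)\mathbf{i}_1 \mathbf{i}_1 + g(x_3)\mathbf{i}_2 \mathbf{i}_2 + h(x_1)\mathbf{i}_3 \mathbf{i}_3$;
(f) $\mathbf{A} = f(r)\mathbf{e}_r \mathbf{e}_\phi + g(r)\mathbf{e}_\phi \mathbf{e}_r + h(r)\mathbf{e}_r \mathbf{e}_z$;
(g) $\mathbf{A} = f(z)\mathbf{e}_r \mathbf{e}_z + g(z)\mathbf{e}_\phi \mathbf{e}_r + h(z)\mathbf{e}_z \mathbf{e}_r$.

4.6 Let $\mathbf{A}$ be a symmetric second-order tensor depending on the coordinates. Denote $\mathbf{f} = \nabla \cdot \mathbf{A}$. Find $\nabla \cdot (\mathbf{A} \times \mathbf{r})$ in terms of $\mathbf{f}$.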

4.7 The elliptic cylindrical coordinates are related to Cartesian coordinates by the formulas

$$x_1 = a\sigma\tau, \qquad x_2 = \pm a\sqrt{(\sigma^2 - 1)(1 - \tau^2)}, \qquad x_3 = z,$$

where $\sigma \geq 1$, $|\tau| \leq 1$, and $a$ is a positive parameter. So $\sigma, \tau, z$ are internal coordinates for the cylinder. Show that the internal coordinates are orthogonal and find their Lamé coefficients.

4.8 Parabolic coordinates $\sigma, \tau, \phi$ in space are related to the Cartesian coordinates $x_1, x_2, x_3$ by the formulas

$$x_1 = \sigma\tau\cos\phi, \qquad x_2 = \sigma\tau\sin\phi, \qquad x_3 = \frac{1}{2}(\tau^2 - \sigma^2),$$

where $a$ is a positive parameter. Show that the parabolic system of coordinates is orthogonal. Find its Lamé coefficients.

4.9 The bipolar cylindrical coordinates $\sigma, \tau, z$ are related to the Cartesian coordinates $x_1, x_2, x_3$ by the formulas

$$x_1 = \frac{a\sinh\tau}{\cosh\tau - \cos\sigma}, \qquad x_2 = \frac{a\sin\sigma}{\cosh\tau - \cos\sigma}, \qquad x_3 = z,$$

where $a$ is a positive parameter. Show that the bicylindrical coordinate system is orthogonal. Find its Lamé coefficients.

4.10 The bipolar coordinates $\sigma, \tau, \phi$ are related to the Cartesian coordinates $x_1, x_2, x_3$ by the formulas

$$x_1 = \frac{a\sin\sigma}{\cosh\tau - \cos\sigma}\cos\phi, \qquad x_2 = \frac{a\sin\sigma}{\cosh\tau - \cos\sigma}\sin\phi, \qquad x_3 = \frac{a\sinh\tau}{\cosh\tau - \cos\sigma},$$

where $0 \leq \sigma < \pi$, $0 \leq \phi < 2\pi$, and $a$ is a positive parameter. Show that the bipolar coordinate system is orthogonal. Find its Lamé coefficients.

4.11 The toroidal coordinates $\sigma, \tau, \phi$ are related to Cartesian coordinates by the formulas

$$x_1 = \frac{a\sinh\tau}{\cosh\tau - \cos\sigma}\cos\phi, \qquad x_2 = \frac{a\sinh\tau}{\cosh\tau - \cos\sigma}\sin\phi, \qquad x_3 = \frac{a\sin\sigma}{\cosh\tau - \cos\sigma},$$

where $-\pi \leq \sigma \leq \pi$, $0 \leq \tau$, $0 \leq \phi < 2\pi$, and $a$ is a positive parameter. Show that the toroidal coordinate system is orthogonal. Find its Lamé coefficients.

4.12 Use the Gauss–Ostrogradsky theorem to show that

$$\int_S \mathbf{n}\mathbf{r}\, dS = V\mathbf{E},$$

where $V$ is the volume of the domain bounded by surface $S$.

4.13 Show that the volume $V$ of the body bounded by surface $S$ is given by the following formulas.

(a) $\displaystyle V = \frac{1}{6}\int_S \mathbf{n} \cdot \nabla r^2\, dS$, $\quad r^2 = \mathbf{r} \cdot \mathbf{r}$,

(b) $\displaystyle V = \frac{1}{3}\int_S \mathbf{n} \cdot \mathbf{r}\, dS$.


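4.14 Let $\mathbf{a}$ be a vector field satisfying the condition $\nabla \cdot \mathbf{a} = 0$. Using the Gauss–Ostrogradsky formula, demonstrate that

$$\int_S f\,\mathbf{n} \cdot \mathbf{a}\, dS = \int_V (\nabla f) \cdot \mathbf{a}\, dV.$$

4.15 Let $S$ be a closed surface. Demonstrate that

$$\int_S \mathbf{n}\, dS = \mathbf{0}.$$

4.16 Let the second-order tensor $\mathbf{A}$ be symmetric. Prove the following identity.

$$\int_S \mathbf{r} \times (\mathbf{n} \cdot \mathbf{A})\, dS = \int_V \mathbf{r} \times (\nabla \cdot \mathbf{A})\, dV.$$

4.17 Prove the identity

$$\int_V \nabla^2 \mathbf{A}\, dV = \int_S \mathbf{n} \cdot \nabla \mathbf{A}\, dS.$$

4.18 Let $\mathbf{A}$ be a given second-order tensor. Denote $\mathbf{f} = \nabla \cdot \mathbf{A}$ in volume $V$ and $\mathbf{n} \cdot \mathbf{A}|_S = \mathbf{g}$ over $S$, the boundary surface of $V$. Find

$$\int_V \mathbf{A}\, dV.$$

4.19 Prove the identity $(\mathbf{n} \times \nabla) \cdot \mathbf{A} = \mathbf{n} \cdot \operatorname{rot}\mathbf{A}$.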

Chapter 5

Elements of Differential Geometry

The standard fare of high school geometry consists mostly of material collected two millennia ago when geometry stood at the center of natural philosophy. The ancient Greeks, however, did not limit their investigations to the circles and straight lines of Euclid’s Elements. Archimedes, using methods and ideas that were later to underpin the analysis of infinitesimal quantities, could calculate the length of a spiral and the areas and volumes of other complex figures. In elementary algebra we learn to graph simple quadratic functions such as the parabola and hyperbola, and then in analytic geometry we learn to handle space figures such as the ellipsoid. The methods involved are essentially due to Descartes, who connected the ideas of geometry with those of algebra, and their application is largely limited to objects whose describing equations are of the second order. Finally, in elementary calculus we study formulas that permit us to calculate the length of a curve given in Cartesian coordinates, etc. These more powerful methods are now incorporated into a branch of mathematics known as differential geometry.

Differential geometry allows us to characterize curves and figures of a very general nature. The practical importance of this is well illustrated by the problem of optimal pursuit, wherein one object tries to catch another moving object in the shortest possible time.¹ Of course, we shall often make use of standard figures such as circles, parabolas, etc., as specific examples since we are fully familiar with their properties; in this way the objects of both elementary and analytic geometry enter into the more general subject of differential geometry.

¹ A rather humorous statement of one such problem has two old ladies traveling in opposite directions around the base of a hemispherical mountain while a mathematically-minded fly travels back and forth between their noses in the least possible time.



5.1 Elementary Facts from the Theory of Curves

In Chapter 4 we introduced the idea of a coordinate curve in space, which is described by the tip of the radius vector as it moves in such a way that one of the coordinates $q^1, q^2, q^3$ changes while the other two remain fixed. Now we consider a general curve described in a similar manner by a radius vector whose initial point is the origin and whose terminal point moves through space along the curve. We can describe the position of the radius vector using some parameter $t$ (for a coordinate curve this was the coordinate value $q^i$). Thus a curve is described as

$$\mathbf{r} = \mathbf{r}(t) \tag{5.1}$$

where $t$ runs through some set along the real axis. Each value of $t$ corresponds to a point of the curve. Unless otherwise stated we shall suppose that the dependence of $\mathbf{r}$ on $t$ is smooth enough that $\mathbf{r}'(t)$ is continuous in $t$ at each point and, where necessary, that the same holds for $\mathbf{r}''(t)$. The notion of vector norm is required for this (§ 2.7). Here we denote this norm using the ordinary notation for the magnitude of a vector, e.g., $|\mathbf{r}|$. Recall that in Chapter 4 we introduced frame vectors $\mathbf{r}_i$ tangential to the coordinate lines. We now introduce the tangential vector to an arbitrary curve at a point $t$, which is $\mathbf{r}'(t)$. The differential of the radius vector corresponding to the curve (5.1) is

$$d\mathbf{r}(t) = \mathbf{r}'(t)\, dt.$$

The length of an elementary section of the curve is

$$ds = |\mathbf{r}'(t)|\, dt,$$

and the length of the portion of the curve corresponding to $t \in [a, b]$ is

$$s = \int_a^b |\mathbf{r}'(t)|\, dt.$$
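The length integral lends itself to direct numerical evaluation. The sketch below applies composite Simpson quadrature to $|\mathbf{r}'(t)|$; the routine name `arc_length` and the sample curve $\mathbf{r}(t) = (t, t^2, 0)$ on $[0, 1]$ are illustrative assumptions, chosen because the exact length $\tfrac{\sqrt{5}}{2} + \tfrac{1}{4}\sinh^{-1} 2$ is available for comparison.

```python
import math

def arc_length(velocity, a, b, n=200):
    """Composite Simpson estimate of the integral of |r'(t)| over [a, b].

    velocity(t) returns the components of r'(t); n must be even.
    """
    def speed(t):
        return math.sqrt(sum(c * c for c in velocity(t)))

    h = (b - a) / n
    total = speed(a) + speed(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * speed(a + k * h)
    return total * h / 3.0

# Illustrative curve (an assumption, not from the text): r(t) = (t, t^2, 0).
length = arc_length(lambda t: (1.0, 2.0 * t, 0.0), 0.0, 1.0)
exact = math.sqrt(5.0) / 2.0 + math.asinh(2.0) / 4.0
print(length, exact)
```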

Exercise 5.1. (a) Find the length of one turn of the helix

$$\mathbf{r}(t) = \mathbf{i}_1 \cos t + \mathbf{i}_2 \sin t + \mathbf{i}_3 t.$$

(b) Calculate the perimeter of the ellipse

$$\frac{x^2}{A^2} + \frac{y^2}{B^2} = 1$$

that lies in the $z = 0$ plane.


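Exercise 5.2. Show that the general formulas for arc length in the rectangular, cylindrical, and spherical coordinate systems are

$$s = \int_a^b \left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]^{1/2} dt,$$

$$s = \int_a^b \left[\left(\frac{d\rho}{dt}\right)^2 + \rho^2\left(\frac{d\phi}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]^{1/2} dt,$$

$$s = \int_a^b \left[\left(\frac{dr}{dt}\right)^2 + r^2\left(\frac{d\theta}{dt}\right)^2 + r^2\sin^2\theta\left(\frac{d\phi}{dt}\right)^2\right]^{1/2} dt.$$

Most convenient theoretically is the natural parametrization of a curve where the parameter represents the length of the curve calculated from the endpoint:

$$\mathbf{r} = \mathbf{r}(s).$$

With this parametrization

$$ds = |\mathbf{r}'(s)|\, ds \quad \text{so} \quad |\mathbf{r}'(s)| = 1.$$

Hence when a curve is parametrized naturally $\boldsymbol{\tau}(s) = \mathbf{r}'(s)$ is the unit tangential vector at point $s$. In this section $s$ shall denote the length parameter of a curve.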

Exercise 5.3. Re-parametrize the helix of Exercise 5.1 in terms of its natural length parameter $s$. Then calculate the unit tangent to the helix as a function of $s$.

If there is another parametrization of the curve that relates with the natural parametrization $s = s(t)$, then the vector $d\mathbf{r}(s(t))/dt$ also is tangent to the curve at the point $s = s(t)$. For the derivative of a vector function the chain rule

$$\frac{d\mathbf{r}(s(t))}{dt} = \left.\frac{d\mathbf{r}(s)}{ds}\right|_{s = s(t)} \frac{ds(t)}{dt}$$

holds.

As is known from analytic geometry, when we know the position of a point $\mathbf{a}$ of a straight line and its directional vector $\mathbf{b}$, then the line can be represented in the parametric form

$$\mathbf{r} = \mathbf{a} + \lambda\mathbf{b}.$$

This gives the vector equation of the line tangent to a curve at point $\mathbf{r}(t_0)$:

$$\mathbf{r} = \mathbf{r}(t_0) + \lambda\mathbf{r}'(t_0).$$

In Cartesian coordinates $(x, y, z)$ this equation takes the form

$$\begin{aligned}
x &= x(t_0) + \lambda x'(t_0), \\
y &= y(t_0) + \lambda y'(t_0), \\
z &= z(t_0) + \lambda z'(t_0),
\end{aligned} \tag{5.2}$$

which in nonparametric form is

$$\frac{x - x(t_0)}{x'(t_0)} = \frac{y - y(t_0)}{y'(t_0)} = \frac{z - z(t_0)}{z'(t_0)}. \tag{5.3}$$

Exercise 5.4. Describe the plane curve

$$\mathbf{r}(t) = e^t(\mathbf{i}_1 \cos t + \mathbf{i}_2 \sin t),$$

and find the line tangent to this curve at the point $t = \pi/4$.

Exercise 5.5. (a) Write out the equations for the tangent line to a plane curve corresponding to (5.2) and (5.3) in polar coordinates. (b) Find the angle between the tangent at a point of the curve and the radius vector (from the origin) at this point.

Exercise 5.6. Write out the equations for the tangent line to a curve corresponding to (5.2) and (5.3) in cylindrical and spherical coordinates.

Note that construction of a tangent vector is possible if $\mathbf{r}'(t_0) \neq \mathbf{0}$. A point $t_0$ where $\mathbf{r}'(t_0) = \mathbf{0}$ is called singular.

Exercise 5.7. For a sufficiently smooth curve $\mathbf{r} = \mathbf{r}(t)$ find a tangent at a singular point.

Exercise 5.8. Under the conditions of the previous exercise write out the representation of the type (5.2).

Curvature

For an element of circumference of a circle corresponding to the length $\Delta s$, the radius $R$ relates to the central angle $\Delta\varphi$ according to $\Delta s = R\,\Delta\varphi$. Thus the curvature $k = 1/R$ is

$$k = \frac{\Delta\varphi}{\Delta s}.$$

Fig. 5.1 Calculation of curvature.

The curvature of an arbitrary plane curve at a point $s$ is defined as the limit

$$k = \lim_{\Delta s \to 0}\frac{\Delta\varphi}{\Delta s}$$

where $\Delta\varphi$ is the change in angle of the tangent to the element of the curve. For a spatial curve the role of the tangent is played by the tangent unit vector $\boldsymbol{\tau}(s)$. When a point moves through the element of the curve corresponding to $\Delta s$, the tangent turns through an angle $\Delta\varphi$. Fig. 5.1 demonstrates that $|\boldsymbol{\tau}(s + \Delta s) - \boldsymbol{\tau}(s)|$ is equal to $2\sin(\Delta\varphi/2)$. Since

$$\lim_{\Delta\varphi \to 0}\frac{\sin(\Delta\varphi/2)}{\Delta\varphi/2} = 1,$$

the curvature of a spatial curve is given by

$$k = \lim_{\Delta s \to 0}\frac{|\Delta\varphi|}{\Delta s} = \lim_{\Delta s \to 0}\frac{|\boldsymbol{\tau}(s + \Delta s) - \boldsymbol{\tau}(s)|}{\Delta s} = \lim_{\Delta s \to 0}\frac{|\mathbf{r}'(s + \Delta s) - \mathbf{r}'(s)|}{\Delta s} = |\mathbf{r}''(s)|. \tag{5.4}$$

We intentionally introduced the definition of curvature in such a way that it remains nonnegative. For a plane curve the definition is normally given so that $k$ can be positive or negative depending on the sense of convexity (“concave up” or “concave down”).

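Exercise 5.9. A helix is described by

$$\mathbf{r}(t) = \mathbf{i}_1 \alpha\cos t + \mathbf{i}_2 \alpha\sin t + \mathbf{i}_3 \beta t.$$

Study the curvature as a function of the parameters $\alpha$ and $\beta$, showing that $k \to 0$ as $\alpha \to 0$ and $k \to 1/\alpha$ as $\beta \to 0$. Explain.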

Moving trihedron

Let us note that $\boldsymbol{\tau}^2 = \boldsymbol{\tau} \cdot \boldsymbol{\tau} = 1$ for all $s$. This means that

$$\frac{d}{ds}\boldsymbol{\tau}^2(s) = 2\boldsymbol{\tau}(s) \cdot \boldsymbol{\tau}'(s) = 0, \tag{5.5}$$

hence $\boldsymbol{\tau}'(s)$ is orthogonal to $\boldsymbol{\tau}(s)$.

Remark 5.1. It is clear that $\boldsymbol{\tau}(s)$ in (5.5) need not be a tangent vector. Thus we have a general statement: any unit vector $\mathbf{e}(s)$ is orthogonal to its derivative $\mathbf{e}'(s)$. That is, $\mathbf{e}(s) \cdot \mathbf{e}'(s) = 0$.

We define the principal normal $\boldsymbol{\nu}$ at point $s$ by the equation

$$\boldsymbol{\nu} = \frac{\mathbf{r}''(s)}{k}. \tag{5.6}$$

If the curve $\mathbf{r} = \mathbf{r}(s)$ lies in a plane, it is clear that $\boldsymbol{\nu}$ lies in the same plane. Any plane through the point $s$ that contains a tangent to the curve at this point is called a tangent plane. Among all the tangent planes at the same point of a non-planar curve, there is a unique one that plays the role of the plane containing a plane curve. It is the plane that contains $\boldsymbol{\tau}$ and $\boldsymbol{\nu}$ simultaneously. This plane is said to be osculating, and can be thought of as “the most locally tangent” plane to the curve at the point $s$. A unit normal to the osculating plane at the same point $s$ is introduced using the relation

$$\boldsymbol{\beta} = \boldsymbol{\tau} \times \boldsymbol{\nu}.$$

These vectors $\boldsymbol{\tau}$, $\boldsymbol{\nu}$, and $\boldsymbol{\beta}$ constitute what is called the moving trihedron of the curve. Associated with this frame at each point along the curve is a set of three mutually perpendicular planes. As we stated above, the plane of $\boldsymbol{\tau}$ and $\boldsymbol{\nu}$ is called the osculating plane. The plane of $\boldsymbol{\nu}$ and $\boldsymbol{\beta}$ is called the normal plane, and the plane of $\boldsymbol{\beta}$ and $\boldsymbol{\tau}$ is called the rectifying plane.

The equation of the osculating plane at point $s_0$ is

$$(\mathbf{r} - \mathbf{r}(s_0)) \cdot \boldsymbol{\beta}(s_0) = 0$$

where $\mathbf{r}$ is the radius vector of a point of the osculating plane. For general parametrization of the curve $\mathbf{r} = \mathbf{r}(t)$ the equation of the osculating plane in Cartesian coordinates is

$$\begin{vmatrix} x - x(t_0) & y - y(t_0) & z - z(t_0) \\ x'(t_0) & y'(t_0) & z'(t_0) \\ x''(t_0) & y''(t_0) & z''(t_0) \end{vmatrix} = 0.$$

Exercise 5.10. Calculate $\boldsymbol{\nu}$ and $\boldsymbol{\beta}$ for the helix of Exercise 5.9. Then find the equation of the rectifying plane at $t = t_0$.

Using the chain rule we can present the expression for $k$ for an arbitrary parametrization of a curve:

$$k^2 = \frac{(\mathbf{r}'(t) \times \mathbf{r}''(t))^2}{(\mathbf{r}'^2(t))^3}. \tag{5.7}$$

We shall now denote the curvature by $k_1$, because there is another quantity that characterizes how a curve differs from a straight line. In terms of $k_1$ we may define the radius of curvature as the number $R = 1/k_1$.
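Expression (5.7) is convenient computationally because it needs only $\mathbf{r}'$ and $\mathbf{r}''$ in whatever parameter is at hand. A minimal sketch follows; the function names and the sample curve, a circle of radius $R$ traversed as $\mathbf{r}(t) = (R\cos 3t, R\sin 3t, 0)$, are illustrative assumptions. The parameter here is not the arc length, yet the routine should still return $1/R$.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def curvature(r_prime, r_double_prime):
    """Curvature from (5.7): k^2 = (r' x r'')^2 / (r'^2)^3."""
    c = cross(r_prime, r_double_prime)
    return math.sqrt(dot(c, c) / dot(r_prime, r_prime) ** 3)

# Illustrative curve (an assumption): circle of radius R with parameter 3t.
R, t = 2.0, 0.4
r_prime = (-3 * R * math.sin(3 * t), 3 * R * math.cos(3 * t), 0.0)
r_double_prime = (-9 * R * math.cos(3 * t), -9 * R * math.sin(3 * t), 0.0)
k = curvature(r_prime, r_double_prime)
print(k, 1.0 / R)
```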

Exercise 5.11. Derive expression (5.7).

The principal normal and the binormal to a space curve are not defined uniquely when $\boldsymbol{\tau}'(s) = \mathbf{0}$. Any point at which this condition holds is called a point of inflection of the curve. We shall assume that our curves satisfy $\boldsymbol{\tau}'(s) \neq \mathbf{0}$ for all $s$.

Curves in the plane

The equations of this section take special forms when the curve under consideration lies in a plane. In such a case it is expedient to work in a concrete coordinate system such as rectangular coordinates or plane polar coordinates. In a rectangular coordinate frame $(x, y)$ where

$$\mathbf{r} = \hat{\mathbf{x}}\,x(t) + \hat{\mathbf{y}}\,y(t),$$

it is easily seen that the length of the part of the curve corresponding to the interval $[a, b]$ of the parameter $t$ is

$$s = \int_a^b \sqrt{[x'(t)]^2 + [y'(t)]^2}\, dt.$$

The curvature is given by

$$k_1 = \frac{x'(t)y''(t) - x''(t)y'(t)}{\left\{[x'(t)]^2 + [y'(t)]^2\right\}^{3/2}}.$$

These formulas correspond to the familiar formulas

$$s = \int_a^b \sqrt{1 + [f'(x)]^2}\, dx, \qquad k_1 = \frac{f''(x)}{\left\{1 + [f'(x)]^2\right\}^{3/2}}$$

from elementary calculus, which are written for a curve expressed in the non-parametric form $y = f(x)$. In polar coordinates where the curve is expressed in the form $r = r(\theta)$, we have

$$s = \int_a^b \sqrt{[r(\theta)]^2 + [r'(\theta)]^2}\, d\theta, \qquad k_1 = \frac{[r(\theta)]^2 + 2[r'(\theta)]^2 - r(\theta)r''(\theta)}{\left\{[r(\theta)]^2 + [r'(\theta)]^2\right\}^{3/2}}.$$
.

Note that for a plane curve the curvature k1 possesses an algebraic sign.

Exercise 5.12. (a) Find the radius of curvature of the curve y = x³ at the point (1, 1). Repeat for the curve y = x⁴ at the point (0, 0). (b) Locate the point of maximum curvature of the parabola y = ax² + bx + c.

Exercise 5.13. Suppose that all the tangents to a smooth curve pass through the same point. Demonstrate that the curve is a part of a straight line or the whole line.

Exercise 5.14. Suppose that all the tangents to a smooth curve are parallel to a plane. Show that the curve lies in a plane.

Exercise 5.15. Suppose that all the principal normals of a smooth curve are parallel to a plane. Does this curve lie in a plane? Repeat when all the binormals are parallel to a plane.

5.2 The Torsion of a Curve

When a curve lies in a plane, the binormal β is normal to this plane. Moreover, β is normal to the osculating plane to the curve at a point, so the rate of rotation of this plane, which is measured by the rate of turn of the binormal, characterizes how the curve is “non-planar.” By analogy to the curvature of a curve we introduce this characteristic of “non-planeness” called the torsion or second curvature, and define it as the limit of the ratio ∆ϑ/∆s where ∆ϑ is the angle of turn of the binormal β. Since β is a unit vector we can use the same reasoning as we used to derive the expression for the curvature (5.4). Let us denote the torsion by k2 and write

\[ k_2 = \lim_{\Delta s \to 0} \frac{\Delta\vartheta}{\Delta s}. \]

This quantity has an algebraic sign as we explain further below.²

Theorem 5.1. Let r = r(s) be a three times continuously differentiable vector function of s. At any point where k1 ≠ 0, the torsion of the curve is

\[ k_2 = -\frac{\left(\mathbf{r}'(s) \times \mathbf{r}''(s)\right) \cdot \mathbf{r}'''(s)}{k_1^2}. \]

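Proof. At a point s where k1 ≠ 0 the binormal is defined uniquely, as is ν. Since β is a unit vector we can define the absolute value of the turn of the binormal, when it moves along the element corresponding to ∆s, in the same manner as we used to introduce the curvature of a curve. Now |β(s + ∆s) − β(s)| = 2 |sin(∆ϑ/2)|. Thus we have

\[
|k_2| = \lim_{\Delta s \to 0} \left| \frac{\Delta\vartheta}{\Delta s} \right|
= \lim_{\Delta s \to 0} \left| \frac{\Delta\vartheta}{2\sin(\Delta\vartheta/2)} \right| \left| \frac{2\sin(\Delta\vartheta/2)}{\Delta s} \right|
= \lim_{\Delta s \to 0} \left| \frac{\boldsymbol{\beta}(s+\Delta s) - \boldsymbol{\beta}(s)}{\Delta s} \right|
= \left| \boldsymbol{\beta}'(s) \right|.
\]

Let us demonstrate that β′(s) and ν are parallel. The derivative of any unit vector x(s) is normal to the vector:

\[ 0 = \frac{d}{ds}\left(\mathbf{x}(s) \cdot \mathbf{x}(s)\right) = 2\,\mathbf{x}(s) \cdot \mathbf{x}'(s). \]

So β′(s) is orthogonal to β(s) and hence parallel to the osculating plane. Next

\[ \boldsymbol{\beta}'(s) = \left(\boldsymbol{\tau}(s) \times \boldsymbol{\nu}(s)\right)' = \boldsymbol{\tau}'(s) \times \boldsymbol{\nu}(s) + \boldsymbol{\tau}(s) \times \boldsymbol{\nu}'(s) = \boldsymbol{\tau}(s) \times \boldsymbol{\nu}'(s) \tag{5.8} \]

where we used the fact that τ′(s) is parallel to ν(s). By (5.8) it follows that β′(s) is orthogonal to τ(s), so we have established the needed property. Since |ν(s)| = 1 it follows that |β′(s) · ν(s)| = |β′(s)| and thus

\[ |k_2| = \left| \boldsymbol{\beta}'(s) \cdot \boldsymbol{\nu}(s) \right|. \]

² Basically, a curve having positive torsion will twist in the manner of a right-hand screw thread as s increases.

Let us use the fact that ν = r′′/k1, and thus

\[ \boldsymbol{\nu}' = \frac{\mathbf{r}'''}{k_1} - \frac{\mathbf{r}''\,k_1'}{k_1^2}, \qquad \boldsymbol{\beta} = \frac{\mathbf{r}' \times \mathbf{r}''}{k_1}. \]

We get

\[ |k_2| = \left| (\boldsymbol{\tau} \times \boldsymbol{\nu}') \cdot \boldsymbol{\nu} \right| = \frac{\left| (\mathbf{r}' \times \mathbf{r}'') \cdot \mathbf{r}''' \right|}{k_1^2}. \]

Here we used the properties of the mixed product. We now define

\[ k_2 = -\frac{(\mathbf{r}' \times \mathbf{r}'') \cdot \mathbf{r}'''}{k_1^2}. \]

The rule for the sign is introduced as follows: if the binormal β turns in the direction from β to ν, then the sign is positive; otherwise, it is negative. The sign of k2 is taken in such a way that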

β′ = k2ν. (5.9)

This completes the proof.

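Let us also mention that if we consider another parametrization of the curve, then a simple calculation using the chain rule yields

\[ k_2 = -\frac{\left(\mathbf{r}'(t) \times \mathbf{r}''(t)\right) \cdot \mathbf{r}'''(t)}{\left(\mathbf{r}'(t) \times \mathbf{r}''(t)\right)^2}. \tag{5.10} \]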
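Formula (5.10) can be evaluated in the same way as the curvature. The following is a minimal sketch, assuming NumPy; the function name `torsion` and the test curve r(t) = (t, t², t³) are illustrative assumptions. Note that the sign follows the convention β′ = k2ν adopted here, which is opposite to the sign used in many other texts.

```python
import numpy as np

def torsion(r1, r2, r3):
    """Torsion k2 from (5.10), with the sign convention beta' = k2 * nu."""
    cross = np.cross(r1, r2)
    return -(cross @ np.asarray(r3, dtype=float)) / (cross @ cross)

# Hypothetical test curve r(t) = (t, t^2, t^3), evaluated at t = 0.
t = 0.0
d1 = np.array([1.0, 2 * t, 3 * t**2])   # r'(t)
d2 = np.array([0.0, 2.0, 6 * t])        # r''(t)
d3 = np.array([0.0, 0.0, 6.0])          # r'''(t)
k2 = torsion(d1, d2, d3)

# A plane curve, here r(t) = (t, t^2, 0), has zero torsion.
k2_planar = torsion([1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0])
```

For the space curve, r′ × r′′ = (0, 0, 2) and (r′ × r′′) · r′′′ = 12 at t = 0, so k2 = −12/4 = −3.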

We have said that the value of the torsion indicates the rate at which the curve distinguishes itself from a plane curve. Let us demonstrate this more clearly. Let k2 = 0 for all s. We show that the curve lies in a plane. Indeed 0 = |k2| = |β′(s) · ν(s)|, thus |β′(s)| = 0 (since β′ is parallel to the unit vector ν) and so β(s) = β0 = const. The tangent vector τ is orthogonal to β0, so 0 = τ · β0 = r′(s) · β0 and thus, integrating, we get (r(s) − r(s0)) · β0 = 0. This means that the curve lies in a plane.

By formula (5.9), |k2| = |dβ/ds|. So |k2| = lim∆s→0 |∆β/∆s|. But up to small quantities of the second order of ∆s, the change |∆e| of a unit vector e is equal to the angle of rotation of e when moved through a distance ∆s. Thus |∆β| is approximately equal to the angle of rotation of the binormal during the shift of the point through ∆s, and so |k2| measures the rate of rotation of the binormal when a point moves along the curve.

Exercise 5.16. Demonstrate that a smooth curve lies in a plane only if k2 = 0.

Exercise 5.17. Demonstrate that in Cartesian coordinates

\[ k_1 = \frac{\left[(y'z'' - z'y'')^2 + (z'x'' - x'z'')^2 + (x'y'' - y'x'')^2\right]^{1/2}}{\left[(x')^2 + (y')^2 + (z')^2\right]^{3/2}} \]

and, in agreement with (5.10),

\[ k_2 = -\frac{\begin{vmatrix} x' & y' & z' \\ x'' & y'' & z'' \\ x''' & y''' & z''' \end{vmatrix}}{(y'z'' - z'y'')^2 + (z'x'' - x'z'')^2 + (x'y'' - y'x'')^2} \]

where the prime denotes d/dt.

Exercise 5.18. What happens to k1 and k2 (and the moving trihedron of a curve) if the direction of change of the parameter is reversed (t → (−t))?

Exercise 5.19. Calculate k2 for the helix of Exercise 5.9.

5.3 Frenet–Serret Equations

The natural triad τ, ν, β can serve as coordinate axes of the space; this is used when studying local properties of a curve. Frenet established a system of ordinary differential equations which governs the triad along the curve. We have already derived two of these three equations: (5.6) and (5.9). The former will be written as τ′ = k1ν. Let us derive the third formula of the Frenet–Serret system. We have ν = β × τ. By this,

ν′ = (β × τ )′

= β′ × τ + β × τ ′

= k2ν × τ + β × (k1ν)

= −k1τ − k2β.

Let us collect the Frenet–Serret equations together:

τ ′ = k1ν,

ν ′ = −k1τ − k2β,

β′ = k2ν. (5.11)

We recall that these equations are written out when the curve has the natural parametrization with the length parameter s.

Note that if the curvatures k1(s) and k2(s) are given functions of s, the system (5.11) becomes a linear system of ordinary differential equations; in component form, it becomes a system of nine equations in nine unknowns. Fixing some point of the curve in space and an orthonormal triad τ, ν, β at the point, by this system we can define τ(s), ν(s), and β(s) uniquely; then, by the equation r′(s) = τ(s), we define the curve r = r(s) uniquely as well. Thus k1(s) and k2(s) define the curve up to a motion in space. That is why the pair of equations for k1(s), k2(s) is called the set of natural equations of the curve.
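The reconstruction of a curve from its natural equations can be sketched numerically. The code below is only an illustration under stated assumptions: it uses NumPy, takes constant k1 and k2, starts at the origin with the triad aligned to the coordinate axes, and integrates r′ = τ together with (5.11) by the classical fourth-order Runge–Kutta scheme. The names `frenet_rhs` and `integrate_curve` are hypothetical.

```python
import numpy as np

def frenet_rhs(state, k1, k2):
    """Right-hand side of r' = tau together with system (5.11)."""
    r, tau, nu, beta = state.reshape(4, 3)
    return np.concatenate([
        tau,                      # r'    = tau
        k1 * nu,                  # tau'  = k1 nu
        -k1 * tau - k2 * beta,    # nu'   = -k1 tau - k2 beta
        k2 * nu,                  # beta' = k2 nu
    ])

def integrate_curve(k1, k2, length, steps=2000):
    """RK4 integration for constant curvatures (an assumed special case).

    Returns the rows r, tau, nu, beta at arc length `length`.
    """
    # Initial point at the origin; triad tau, nu, beta = coordinate axes.
    state = np.concatenate([np.zeros(3), np.eye(3).ravel()])
    h = length / steps
    for _ in range(steps):
        a = frenet_rhs(state, k1, k2)
        b = frenet_rhs(state + 0.5 * h * a, k1, k2)
        c = frenet_rhs(state + 0.5 * h * b, k1, k2)
        d = frenet_rhs(state + h * c, k1, k2)
        state = state + h * (a + 2 * b + 2 * c + d) / 6.0
    return state.reshape(4, 3)

# k1 = 1, k2 = 0 should trace a unit circle, closing after length 2*pi.
r_end, tau_end, nu_end, beta_end = integrate_curve(1.0, 0.0, 2 * np.pi)

# With nonzero torsion the triad should stay orthonormal along the curve.
triad = integrate_curve(1.0, 0.5, 3.0)[1:]
```

A different initial point or initial triad would produce the same curve moved rigidly in space, which is the sense in which k1(s) and k2(s) define the curve up to a motion.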

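Let us demonstrate how to use the Frenet–Serret equations to characterize the curve locally. We use the Taylor expansion of the radius vector of a curve at point s:

\[ \mathbf{r}(s+\Delta s) = \mathbf{r}(s) + \Delta s\,\mathbf{r}'(s) + \frac{(\Delta s)^2}{2}\,\mathbf{r}''(s) + \frac{(\Delta s)^3}{6}\,\mathbf{r}'''(s) + o(|\Delta s|^3). \]

By the definition ν = r′′/k1 and by the second of equations (5.11) we get

\[ \mathbf{r}''' = (k_1\boldsymbol{\nu})' = k_1'\boldsymbol{\nu} + k_1\boldsymbol{\nu}' = k_1'\boldsymbol{\nu} + k_1(-k_1\boldsymbol{\tau} - k_2\boldsymbol{\beta}). \]

Substituting these we get

\[ \mathbf{r}(s+\Delta s) = \mathbf{r}(s) + \Delta s\,\boldsymbol{\tau} + \frac{(\Delta s)^2}{2}\,k_1\boldsymbol{\nu} + \frac{(\Delta s)^3}{6}\left(k_1'\boldsymbol{\nu} - k_1^2\boldsymbol{\tau} - k_1 k_2\boldsymbol{\beta}\right) + o(|\Delta s|^3). \]

This shows that, to the order of (∆s)², the curve lies in the osculating plane at point s. At a point s the triad τ, ν, β is Cartesian. Let us fix this frame and place its origin at r(s) = 0. Defining x, y, z as the components of r(s + ∆s) in this frame we obtain the approximate representation for the curve

\[
\begin{aligned}
x &= \Delta s - \frac{k_1^2(\Delta s)^3}{6} + o(|\Delta s|^3), \\
y &= \frac{k_1(\Delta s)^2}{2} + \frac{k_1'(\Delta s)^3}{6} + o(|\Delta s|^3), \\
z &= -\frac{k_1 k_2(\Delta s)^3}{6} + o(|\Delta s|^3).
\end{aligned}
\]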

As another application of the Frenet–Serret equations, we find the velocity and acceleration of a point moving in space. Let s be the length parameter of the trajectory of the point, and let t denote time. We shall use the triad (τ, ν, β) of the trajectory r = r(s). The position of the point is given by the equation

\[ \mathbf{r} = \mathbf{r}(s(t)). \]

The velocity of the point is

\[ \mathbf{v} = \frac{d}{dt}\mathbf{r}(s(t)) = \frac{d\mathbf{r}}{ds}\frac{ds}{dt} = \frac{ds}{dt}\,\boldsymbol{\tau}. \]

Denoting v = ds/dt, the particle speed at each point along its path, we can write v = vτ. Now let us find the acceleration of the point in the same frame, which is

\[ \mathbf{a} = \frac{d^2}{dt^2}\mathbf{r}(s(t)) = \frac{d}{dt}\mathbf{v}(s(t)) = \frac{d}{dt}\left(\frac{ds}{dt}\,\boldsymbol{\tau}\right) = \frac{d^2 s}{dt^2}\,\boldsymbol{\tau} + \frac{ds}{dt}\frac{d\boldsymbol{\tau}}{ds}\frac{ds}{dt} = s''(t)\,\boldsymbol{\tau} + v^2\frac{d\boldsymbol{\tau}}{ds}. \]

Using the Frenet–Serret equations we get

\[ \mathbf{a} = s''(t)\,\boldsymbol{\tau} + k_1 v^2\,\boldsymbol{\nu} \]

where k1 is the principal curvature of the trajectory. In mechanics this is commonly written as

\[ \mathbf{a} = s''(t)\,\boldsymbol{\tau} + (v^2/\rho)\,\boldsymbol{\nu} \]

where ρ = 1/k1 is the radius of curvature of the curve. Thus the acceleration vector lies in the osculating plane at each point along the trajectory. Several practical methods of finding the acceleration are based on this fact.
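The decomposition a = s′′(t)τ + k1v²ν gives a simple way to split a known acceleration into tangential and normal parts: since τ = v/v, the tangential part is a · v / v, and since |v × a| = k1v³, the normal part is |v × a| / v. The following is a minimal sketch, assuming NumPy; the function name and the sample motion r(t) = (t, t², 0) are illustrative assumptions.

```python
import numpy as np

def acceleration_components(v, a):
    """Return (s''(t), k1 * v^2): tangential and normal parts of a."""
    v = np.asarray(v, dtype=float)
    a = np.asarray(a, dtype=float)
    speed = np.linalg.norm(v)
    a_tau = (a @ v) / speed                          # a . tau
    a_nu = np.linalg.norm(np.cross(v, a)) / speed    # |v x a| / v = k1 v^2
    return a_tau, a_nu

# Hypothetical motion r(t) = (t, t^2, 0) at t = 1.
v = np.array([1.0, 2.0, 0.0])   # dr/dt
a = np.array([0.0, 2.0, 0.0])   # d^2 r/dt^2
a_tau, a_nu = acceleration_components(v, a)
```

Because τ and ν are orthonormal, the two components must satisfy a_tau² + a_nu² = |a|², which is a convenient consistency check.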

Exercise 5.20. Express d3r/ds3 in terms of the moving trihedron.

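Exercise 5.21. By defining a vector δ = k1β − k2τ, show that the Frenet–Serret equations can be written in the form

\[ \boldsymbol{\tau}' = \boldsymbol{\delta} \times \boldsymbol{\tau}, \qquad \boldsymbol{\nu}' = \boldsymbol{\delta} \times \boldsymbol{\nu}, \qquad \boldsymbol{\beta}' = \boldsymbol{\delta} \times \boldsymbol{\beta}. \]

The vector δ is known as the Darboux vector.

Exercise 5.22. A particle moves through space in such a way that its position vector is given by

\[ \mathbf{r}(t) = \mathbf{i}_1(1 - \cos t) + \mathbf{i}_2\,t + \mathbf{i}_3 \sin t. \]

Find the tangential and normal components of the acceleration.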

5.4 Elements of the Theory of Surfaces

The reader is familiar with many standard surfaces: the sphere, the cone, the cylinder, etc. These are easy to visualize, and in Cartesian coordinates are described by simple equations. For example, the equation of a sphere reflects the definition of a sphere: all points (x, y, z) of a sphere have the same distance R from the center (x0, y0, z0):

\[ (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2. \tag{5.12} \]

An infinite circular cylinder with generator parallel to the z-axis is given by the equation

\[ (x - x_0)^2 + (y - y_0)^2 = R^2, \]

in which the variable z does not appear. We can obtain other types of cylindrical surfaces by considering a set of spatial points satisfying the equation

\[ f(x, y) = 0. \]

This equation does not depend on z, hence by drawing a generator parallel to the z-axis through the point (x, y, 0) we get a more general cylindrical surface. A paraboloid is an example of a more complex surface:

\[ z = ax^2 + 2bxy + cy^2 + dx + ey + f, \tag{5.13} \]

where a, b, c, d, e, f are numerical coefficients. Depending on the values of a, b, c, the paraboloid can be elliptic, hyperbolic or parabolic. These terms from analytic geometry will be used to characterize the shape of a surface at a point. The areas and volumes associated with such standard surfaces (or portions thereof) can be calculated by integration or, sometimes, by the use of elementary formulas.

From a naive viewpoint a surface is something that fully or partially bounds a spatial body. However, a precise definition of the term “surface” is not easy to give. An attempt could be based on a local description, regarding an elementary portion of a surface as a continuous image in space of a small disk in the plane. Unfortunately such a definition would place under consideration surfaces of extremely complex structure. To use the ordinary tools of calculus we must invoke the idea of smoothness of a surface, even if we do not define this term precisely (it is common in the natural sciences to use notions and study objects that are not explicitly introduced, and about which we know only some things).

As a first step we could view a surface as something in space that resembles the figures mentioned above in the sense that it has zero thickness. To use calculus we must represent the coordinates of the points of a surface using functions. Note that the paraboloid (5.13) can be described using a single function of the two variables x and y, whereas this is not possible with the sphere (5.12). But we can divide a sphere into hemispheres in such a way that each can be described by a separate function of x and y. This can be done with more general surfaces, although it may not be possible to have z as the dependent variable for all portions of a surface. In general, the position of any point of a surface is determined by the values of a pair (u, v) of independent parameters which may or may not be Cartesian coordinates. It is convenient to use the position vector

\[ \mathbf{r} = \mathbf{r}(u, v) \]

to locate this point with respect to the coordinate origin. For a Cartesian frame this vector equation is equivalent to the three scalar equations

\[ x = x(u, v), \qquad y = y(u, v), \qquad z = z(u, v), \]

but the use of other coordinate frames is common. The pair (u, v) can be regarded as the coordinates of a point in the uv-plane, and a portion of the surface may be regarded as the image of a domain in this plane; in this sense we can say that a surface is two-dimensional. As was the case in earlier sections, we shall consider only sufficiently smooth functions. We have seen that the use of tensor symbolism simplifies many formulas and renders them clear. So instead of the notation (u, v) we shall use (u¹, u²) to denote the intrinsic coordinates of a surface.³

One goal of this book is to collect major results and explain how these can be used by practitioners of many sciences. In particular, we now present formulas used in the mechanics of elastic shells, which relies heavily on the theory of surfaces. First we show how to calculate distances and angles on a general surface. This is done by introducing the “first fundamental form” of the surface. In this way we introduce a metric form of the surface analogous to the one used for three-dimensional space. Other surface properties can be described once we possess the notion of the surface normal. The structure of a surface at a point may cause us to experience differences as we depart from the point in various directions. To study this we introduce the “second fundamental form” of the surface. The two fundamental forms provide a full local description of the surface. We shall study other notions as well, concluding our treatment by adapting tensor analysis to two dimensions and applying the resulting tools to the study of surfaces.

First fundamental form

We have said that a coordinate line in space is a set of points generated by fixing two of the coordinates q¹, q², q³ and changing the remaining one. A coordinate level surface is generated by fixing one of the coordinates q¹, q², q³ and changing the remaining two. A similar idea is used to introduce an arbitrary surface in space. The position of a point on a surface can be characterized by a radius vector initiated from some origin of the space and with terminus that moves along the surface. A point of a surface can be characterized by two parameters that we shall denote by u¹, u² or sometimes, when convenient, by u, v:

\[ \mathbf{r} = \mathbf{r}(u^1, u^2). \tag{5.14} \]

³ We use this term to indicate that the coordinates refer to the surface; they are taken in the surface, not from the surrounding space somehow.

Thus we consider a surface as the spatial image of some domain in the (u¹, u²) coordinate plane. For simplicity we shall restrict ourselves to those surfaces that are smooth everywhere except possibly for some simple curves or points which lie thereon. So we shall consider the case when (5.14) is as smooth as we like (it is normally necessary for (5.14) to have all of its second (and sometimes third) partial derivatives continuous except for some lines or poles, as is the case with the vertex of a cone). Similar to the case of a three-dimensional space, on a surface there are coordinate lines that arise when in (5.14) we fix one of the coordinates. In similar fashion we can introduce the tangent vectors

\[ \mathbf{r}_i = \frac{\partial \mathbf{r}(u^1, u^2)}{\partial u^i} \qquad (i = 1, 2). \]

We suppose that, except at some singular lines or poles in the (u¹, u²) plane, these vectors are continuously differentiable in the coordinates. We also suppose that, except at the same points, these two vectors are not collinear so the vector

\[ \mathbf{N} = \mathbf{r}_1 \times \mathbf{r}_2 \neq 0. \]

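By this we can introduce the unit normal to the surface

\[ \mathbf{n} = \frac{\mathbf{r}_1 \times \mathbf{r}_2}{|\mathbf{r}_1 \times \mathbf{r}_2|}, \]

which is simultaneously normal to the plane that is osculating to the surface at point (u¹, u²). Let us demonstrate this. Let the plane through point (u¹, u²) have normal n. The osculating plane (Fig. 5.2) is the plane for which

\[ \lim_{\Delta s \to 0} \frac{h}{\Delta s} = 0 \]

where

\[ \Delta s = \left| \mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) - \mathbf{r}(u^1, u^2) \right| \]

and h is the distance from the point with coordinates (u¹ + ∆u¹, u² + ∆u²) to the plane. Note that this must hold for any curve on the surface through the point. Let the plane be orthogonal to n. Then

\[ h = \left| \left( \mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) - \mathbf{r}(u^1, u^2) \right) \cdot \mathbf{n} \right|. \]

Using the definition of the differential we have

\[
\begin{aligned}
\frac{h}{\Delta s}
&= \frac{\left| \left( \mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) - \mathbf{r}(u^1, u^2) \right) \cdot \mathbf{n} \right|}{\left| \mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) - \mathbf{r}(u^1, u^2) \right|} \\
&= \frac{\left| \left( \mathbf{r}_i \Delta u^i + o(|\Delta u^1| + |\Delta u^2|) \right) \cdot \mathbf{n} \right|}{\left| \mathbf{r}_i \Delta u^i + o(|\Delta u^1| + |\Delta u^2|) \right|} \\
&= \frac{o(|\Delta u^1| + |\Delta u^2|)}{\left| \mathbf{r}_i \Delta u^i + o(|\Delta u^1| + |\Delta u^2|) \right|}.
\end{aligned}
\]

This means that the plane we chose is osculating.

Fig. 5.2 Osculating plane to a surface.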

It is clear that much of what was said about rᵢ in Chapter 4 should be simply reformulated for the present case if we use the vectors r₁, r₂, r₃ = n as a basis in ℝ³. First, we can introduce the reciprocal basis in space whose third vector is n again and whose two others are defined by the equations

\[ \mathbf{r}^i \cdot \mathbf{r}_j = \delta^i_j \quad (i = 1, 2), \qquad \mathbf{r}^i \cdot \mathbf{n} = 0. \]

The vectors rⁱ (i = 1, 2) constitute the reciprocal frame to rᵢ (i = 1, 2) in the plane osculating to the surface at (u¹, u²).


Let us consider the differential

\[ d\mathbf{r} = \mathbf{r}_i\, du^i. \tag{5.15} \]

In this chapter, when summing, the indices take the values 1 and 2. The differential is the main linear part in duⁱ of the difference

\[ \Delta\mathbf{r} = \mathbf{r}(u^1 + du^1, u^2 + du^2) - \mathbf{r}(u^1, u^2). \]

The main part of the distance between nearby points r(u¹ + du¹, u² + du²) and r(u¹, u²) is given by

\[ (ds)^2 = d\mathbf{r} \cdot d\mathbf{r} = \mathbf{r}_i\, du^i \cdot \mathbf{r}_j\, du^j = g_{ij}\, du^i\, du^j \]

where g_{ij} = rᵢ · rⱼ is the metric tensor of the surface. The form

\[ (ds)^2 = g_{ij}\, du^i\, du^j \]

is called the first fundamental form of the surface. Beginning with Gauss the metric coefficients were denoted by

\[ E = g_{11}, \qquad F = g_{12} = g_{21}, \qquad G = g_{22}, \]

so the first fundamental form is

\[ (ds)^2 = E(du^1)^2 + 2F\, du^1\, du^2 + G(du^2)^2. \]
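The coefficients E, F, G follow mechanically from a parametrization, so they are convenient to compute symbolically. The following is a minimal sketch, assuming SymPy is available; the helicoid r(u, v) = (u cos v, u sin v, cv) is a hypothetical example surface chosen for illustration and does not come from the text.

```python
import sympy as sp

u, v = sp.symbols("u v", real=True)
c = sp.symbols("c", positive=True)

# Hypothetical example surface: helicoid r(u, v) = (u cos v, u sin v, c v).
r = sp.Matrix([u * sp.cos(v), u * sp.sin(v), c * v])

r_u = r.diff(u)   # tangent vector r_1 (with u^1 = u)
r_v = r.diff(v)   # tangent vector r_2 (with u^2 = v)

E = sp.simplify(r_u.dot(r_u))   # g_11
F = sp.simplify(r_u.dot(r_v))   # g_12 = g_21
G = sp.simplify(r_v.dot(r_v))   # g_22
```

For this surface the result is E = 1, F = 0, G = u² + c², so the first fundamental form is (ds)² = (du)² + (u² + c²)(dv)² and the coordinate lines are orthogonal.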

Exercise 5.23. Find the first fundamental form of a sphere of radius a.

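Besides the length of an elementary curve on the surface, formula (5.15) allows us to find an angle between two elementary curves dr = rᵢ duⁱ and d̃r = rᵢ d̃uⁱ, where the tilde marks the increments along the second curve. Indeed, by the definition of the dot product we have

\[
\cos\varphi = \frac{d\mathbf{r} \cdot \tilde{d}\mathbf{r}}{|d\mathbf{r}|\,|\tilde{d}\mathbf{r}|}
= \frac{E\, du^1\, \tilde{d}u^1 + F\left(du^1\, \tilde{d}u^2 + du^2\, \tilde{d}u^1\right) + G\, du^2\, \tilde{d}u^2}{\sqrt{E(du^1)^2 + 2F\, du^1\, du^2 + G(du^2)^2}\;\sqrt{E(\tilde{d}u^1)^2 + 2F\, \tilde{d}u^1\, \tilde{d}u^2 + G(\tilde{d}u^2)^2}}.
\tag{5.16}
\]

Exercise 5.24. (a) Find the expression for the angle between coordinate lines of a surface. (b) Calculate the angle at which the curve θ = φ crosses the equator of a sphere.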

The area of the parallelogram that is based on the elementary vectors r₁ du¹ and r₂ du² which are tangent to the coordinate lines is

\[ dS = \left| \mathbf{r}_1\, du^1 \times \mathbf{r}_2\, du^2 \right|. \]

This approximates up to higher order terms the area of the corresponding curvilinear figure bordered by the coordinate lines

\[ u^1 = \text{const}, \quad u^2 = \text{const}, \quad u^1 + du^1 = \text{const}, \quad u^2 + du^2 = \text{const}. \]

Let us find dS in terms of the first fundamental form. We need to calculate |r₁ × r₂|. Let us demonstrate that

\[ |\mathbf{r}_1 \times \mathbf{r}_2| = \sqrt{EG - F^2}. \tag{5.17} \]

Indeed, by definition of the dot and cross product we have

\[ |\mathbf{r}_1 \times \mathbf{r}_2|^2 + (\mathbf{r}_1 \cdot \mathbf{r}_2)^2 = |\mathbf{r}_1|^2 |\mathbf{r}_2|^2 \sin^2\varphi + |\mathbf{r}_1|^2 |\mathbf{r}_2|^2 \cos^2\varphi = |\mathbf{r}_1|^2 |\mathbf{r}_2|^2. \]

Thus

\[ |\mathbf{r}_1 \times \mathbf{r}_2|^2 = |\mathbf{r}_1|^2 |\mathbf{r}_2|^2 - (\mathbf{r}_1 \cdot \mathbf{r}_2)^2 = EG - F^2 \]

and (5.17) follows.

Summing the elementary areas corresponding to a domain A in the (u¹, u²) plane and doing the limit passage we get the value

\[ S = \int_A \sqrt{EG - F^2}\; du^1\, du^2. \]

It can be shown that for a smooth surface this limit S does not depend on the parametrization and hence is the area of the portion A of the surface.
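The area integral can be approximated directly from the tangent vectors. The sketch below is an illustration under stated assumptions: it uses NumPy and a midpoint rule on a uniform grid, and it is checked on a torus with radii R0 > r0, whose area 4π²R0r0 is a standard result. The function names and the torus are hypothetical choices, not taken from the text.

```python
import numpy as np

def surface_area(r_u, r_v, u_range, v_range, n=200):
    """Midpoint-rule approximation of S = integral of sqrt(EG - F^2)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    u = u0 + du * (np.arange(n) + 0.5)
    v = v0 + dv * (np.arange(n) + 0.5)
    U, V = np.meshgrid(u, v, indexing="ij")
    a, b = r_u(U, V), r_v(U, V)        # tangent vectors, shape (3, n, n)
    E = np.sum(a * a, axis=0)
    F = np.sum(a * b, axis=0)
    G = np.sum(b * b, axis=0)
    return np.sum(np.sqrt(E * G - F**2)) * du * dv

# Hypothetical example: torus
# r = ((R0 + r0 cos u) cos v, (R0 + r0 cos u) sin v, r0 sin u).
R0, r0 = 2.0, 0.5

def torus_u(u, v):
    """Partial derivative of the torus position vector with respect to u."""
    return np.array([-r0 * np.sin(u) * np.cos(v),
                     -r0 * np.sin(u) * np.sin(v),
                     r0 * np.cos(u)])

def torus_v(u, v):
    """Partial derivative of the torus position vector with respect to v."""
    return np.array([-(R0 + r0 * np.cos(u)) * np.sin(v),
                     (R0 + r0 * np.cos(u)) * np.cos(v),
                     np.zeros_like(u)])

area = surface_area(torus_u, torus_v, (0.0, 2 * np.pi), (0.0, 2 * np.pi))
```

The tangent vectors are supplied analytically here; for a surface known only through r(u¹, u²), they could instead be approximated by finite differences at some loss of accuracy.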

Exercise 5.25. A cone is described by the position vector

\[ \mathbf{r}(u^1, u^2) = \mathbf{i}_1 u^1 \cos u^2 + \mathbf{i}_2 u^1 \sin u^2 + \mathbf{i}_3 u^1, \]

where 0 ≤ u¹ ≤ a and 0 ≤ u² < 2π. Find the surface area of the cone.

Exercise 5.26. Find the above formulas for a figure of revolution described by the Cartesian coordinate position vector

\[ \mathbf{r} = \hat{\mathbf{x}}\,\rho\cos\varphi + \hat{\mathbf{y}}\,\rho\sin\varphi + \hat{\mathbf{z}}\,f(\rho) \]

where f(ρ) is a suitable profile function.

Exercise 5.27. Let two smooth surfaces be parametrized in such a way that the coefficients of their first fundamental forms are proportional at any point with the same coordinates (so this defines a map of one surface onto the other). Show that the map preserves the angles between corresponding directions on the surfaces.


Geodesics

A fundamental problem in surface theory is to find the curve of minimum length between two given points on a surface S. To treat such a problem it is natural to begin with the first fundamental form

\[ (ds)^2 = g_{ij}\, du^i\, du^j. \]

We know that parametric equations

\[ u^1 = u^1(t), \qquad u^2 = u^2(t), \]

can specify a curve C on S. Writing (ds)² as

\[ (ds)^2 = g_{ij}\frac{du^i}{dt}\frac{du^j}{dt}(dt)^2, \]

the expression for arc length along C from t = a to t = b becomes

\[ s = \int_a^b \left( g_{ij}\frac{du^i}{dt}\frac{du^j}{dt} \right)^{1/2} dt. \tag{5.18} \]

We seek a curve of minimum length between a given pair of endpoints: i.e., we seek functions u¹(t) and u²(t) that minimize s.

Since

\[ g_{ij} = g_{ij}(u^1(t), u^2(t)), \]

we see that (5.18) has the form

\[ s(\mathbf{u}) = \int_a^b f(t, \mathbf{u}, \dot{\mathbf{u}})\, dt \tag{5.19} \]

where u = u(t) = (u¹(t), u²(t)) and the overdot denotes differentiation with respect to t. Because s(u) is a correspondence that assigns a real number s to each function u in some class of functions, we call it a functional in u. Here the class of functions includes all admissible routes u between the given endpoints on the surface.

The minimization of functionals is treated in the calculus of variations. Fortunately the main ideas of this extensive subject lend themselves to a brief discussion aimed at the present task. (Additional coverage appears in Chapter 6, where we consider the variational principles of elasticity.)

The approach to finding a minimizer u(t) for the functional (5.19) hinges on replacing u(t) by a new function u(t) + εϕ(t), where ϕ(t) is an admissible variation of the function u(t) and ε is a small real parameter. In this discussion we shall consider a variation ϕ(t) to be admissible if ϕ(a) = ϕ(b) = 0; that is, if each curve of the form u(t) + εϕ(t) connects the same endpoints as the curve u(t). Such a replacement gives

\[ s(\mathbf{u} + \varepsilon\boldsymbol{\varphi}) = \int_a^b f(t, \mathbf{u} + \varepsilon\boldsymbol{\varphi}, \dot{\mathbf{u}} + \varepsilon\dot{\boldsymbol{\varphi}})\, dt. \tag{5.20} \]

For fixed u and ϕ this is a function of the real variable ε, and takes its minimum at ε = 0 for any ϕ. Let us vary the components of u(t) one at a time. If we take ϕ of the special form ϕ₁(t) = (ϕ(t), 0), then (5.20) becomes

\[ s(\mathbf{u} + \varepsilon\boldsymbol{\varphi}_1) = \int_a^b f\left(t, u^1(t) + \varepsilon\varphi(t), u^2(t), \dot{u}^1(t) + \varepsilon\dot{\varphi}(t), \dot{u}^2(t)\right) dt. \]

We now set

\[ \left. \frac{d s(\mathbf{u} + \varepsilon\boldsymbol{\varphi}_1)}{d\varepsilon} \right|_{\varepsilon=0} = 0 \tag{5.21} \]

and handle the left-hand side using the chain rule:

\[
\left. \frac{d}{d\varepsilon} \int_a^b f(t, u^1 + \varepsilon\varphi, u^2, \dot{u}^1 + \varepsilon\dot{\varphi}, \dot{u}^2)\, dt \right|_{\varepsilon=0}
= \int_a^b \left[ \frac{\partial}{\partial u^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2)\,\varphi + \frac{\partial}{\partial \dot{u}^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2)\,\dot{\varphi} \right] dt.
\]

This is the first variation of the functional s(u) with respect to variations of the form u + εϕ₁; we equate it to zero as in (5.21), and employ integration by parts in the second term of the integrand (the boundary terms vanish because ϕ(a) = ϕ(b) = 0) to get

\[ \int_a^b \left[ \frac{\partial}{\partial u^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) - \frac{d}{dt}\frac{\partial}{\partial \dot{u}^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) \right] \varphi\, dt = 0. \]

According to the fundamental lemma of the calculus of variations (cf., [Lebedev and Cloud (2003)]; one version of this lemma is presented in Theorem 6.8), this equation can hold for all admissible variations ϕ only if the bracketed quantity vanishes:

\[ \frac{\partial}{\partial u^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) - \frac{d}{dt}\frac{\partial}{\partial \dot{u}^1} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) = 0. \]

Repeating this process for variations of the special form ϕ₂(t) = (0, ϕ(t)) we obtain a similar result

\[ \frac{\partial}{\partial u^2} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) - \frac{d}{dt}\frac{\partial}{\partial \dot{u}^2} f(t, u^1, u^2, \dot{u}^1, \dot{u}^2) = 0. \]

This system of two equations, which we can write in more compact notation as

\[ \frac{\partial f}{\partial u^i} - \frac{d}{dt}\frac{\partial f}{\partial \dot{u}^i} = 0 \qquad (i = 1, 2), \tag{5.22} \]

is known as the system of Euler equations for the functional (5.19). Satisfaction of this system is a necessary condition for u = (u¹(t), u²(t)) to be a minimizer of s. In the calculus of variations a solution to an Euler equation (or, in this case, a system of such equations) is called an extremal. This is analogous to a stationary point of an ordinary function: further testing is required to ascertain whether the extremal actually yields a minimum of the functional (it could yield a maximum, say). We refer to solutions of the system (5.22) as geodesics on the surface S.

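We now use the fact that

\[ f^2 = g_{jk}\,\dot{u}^j \dot{u}^k \tag{5.23} \]

to compute the derivatives in (5.22). First we have

\[ 2f\frac{\partial f}{\partial u^i} = \frac{\partial g_{jk}}{\partial u^i}\,\dot{u}^j \dot{u}^k \]

so that

\[ \frac{\partial f}{\partial u^i} = \frac{1}{2f}\frac{\partial g_{jk}}{\partial u^i}\,\dot{u}^j \dot{u}^k. \tag{5.24} \]

Expanding the notation in (5.23) we see that

\[ f^2 = g_{11}\dot{u}^1\dot{u}^1 + g_{12}\dot{u}^1\dot{u}^2 + g_{21}\dot{u}^2\dot{u}^1 + g_{22}\dot{u}^2\dot{u}^2, \]

hence

\[ 2f\frac{\partial f}{\partial \dot{u}^1} = 2g_{11}\dot{u}^1 + g_{12}\dot{u}^2 + g_{21}\dot{u}^2 = 2(g_{11}\dot{u}^1 + g_{12}\dot{u}^2) = 2g_{1j}\dot{u}^j \]

and the derivative with respect to u̇² is taken similarly. So

\[ f\frac{\partial f}{\partial \dot{u}^i} = g_{ij}\,\dot{u}^j, \]

which gives

\[
\frac{d}{dt}\frac{\partial f}{\partial \dot{u}^i}
= \frac{d}{dt}\left( \frac{g_{ij}\dot{u}^j}{f} \right)
= \frac{1}{f}\frac{d}{dt}\left(g_{ij}\dot{u}^j\right) + g_{ij}\dot{u}^j \frac{d}{dt}\left(\frac{1}{f}\right)
= \frac{1}{f}\left( g_{ij}\ddot{u}^j + \dot{u}^j \frac{d}{dt} g_{ij} \right) + g_{ij}\dot{u}^j\left( -\frac{1}{f^2}\frac{df}{dt} \right).
\]

Here

\[ \frac{d}{dt} g_{ij} = \frac{\partial g_{ij}}{\partial u^k}\,\dot{u}^k \]

by the chain rule, so

\[ \frac{d}{dt}\frac{\partial f}{\partial \dot{u}^i} = \frac{1}{f}\left( g_{ij}\ddot{u}^j + \frac{\partial g_{ij}}{\partial u^k}\,\dot{u}^j \dot{u}^k \right) - \frac{g_{ij}\dot{u}^j}{f^2}\frac{df}{dt}. \]

It simplifies things if we take t to be the arc length parameter; then f ≡ 1 and df/dt ≡ 0 so that

\[ \frac{d}{dt}\frac{\partial f}{\partial \dot{u}^i} = g_{ij}\ddot{u}^j + \frac{\partial g_{ij}}{\partial u^k}\,\dot{u}^j \dot{u}^k. \tag{5.25} \]

Putting (5.24) and (5.25) into (5.22) we have

\[ g_{ij}\ddot{u}^j + \left[ \frac{\partial g_{ij}}{\partial u^k} - \frac{1}{2}\frac{\partial g_{jk}}{\partial u^i} \right] \dot{u}^j \dot{u}^k = 0. \tag{5.26} \]

The second term on the left can be rewritten, using the symmetry of u̇ʲu̇ᵏ in j and k:

\[
\begin{aligned}
\left[ \frac{\partial g_{ij}}{\partial u^k} - \frac{1}{2}\frac{\partial g_{jk}}{\partial u^i} \right] \dot{u}^j \dot{u}^k
&= \left[ \frac{1}{2}\left( \frac{\partial g_{ij}}{\partial u^k} + \frac{\partial g_{ik}}{\partial u^j} \right) - \frac{1}{2}\frac{\partial g_{jk}}{\partial u^i} \right] \dot{u}^j \dot{u}^k \\
&= \frac{1}{2}\left[ \frac{\partial g_{ij}}{\partial u^k} + \frac{\partial g_{ik}}{\partial u^j} - \frac{\partial g_{jk}}{\partial u^i} \right] \dot{u}^j \dot{u}^k \\
&= \Gamma_{jki}\,\dot{u}^j \dot{u}^k.
\end{aligned}
\]

Putting this into (5.26) and raising the index i, we have finally

\[ \ddot{u}^n + \Gamma^n_{jk}\,\dot{u}^j \dot{u}^k = 0 \qquad (n = 1, 2). \tag{5.27} \]

This is a system of two nonlinear differential equations for the unknown uⁿ(t). Again, t is assumed to be the natural length parameter; this means that the constraint g_{jk} u̇ʲ u̇ᵏ = 1 must be enforced at each value of t along the curve.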

As a simple example we may treat a cylinder ρ = a. It is clear in advance that if we cut the cylinder along a generator and “unroll” it onto a plane, then the shortest route between two points is the direct segment connecting them. Rolling this plane back into a cylinder we get a curve of the form

\[ \varphi(t) = c_1 t + c_2, \qquad z(t) = c_3 t + c_4, \tag{5.28} \]

i.e., a helix. Now let us obtain the same result using (5.27). For this particular surface (u¹, u²) = (φ, z) and

\[ \mathbf{r} = \hat{\mathbf{x}}\,a\cos\varphi + \hat{\mathbf{y}}\,a\sin\varphi + \hat{\mathbf{z}}\,z. \]

We find

\[ \frac{\partial \mathbf{r}}{\partial \varphi} = -\hat{\mathbf{x}}\,a\sin\varphi + \hat{\mathbf{y}}\,a\cos\varphi, \qquad \frac{\partial \mathbf{r}}{\partial z} = \hat{\mathbf{z}}, \]

giving g₁₁ = a², g₁₂ = g₂₁ = 0, g₂₂ = 1. Since these are all constants the Christoffel coefficients are all zero, and (5.27) reduces to the system

\[ \ddot{\varphi} = 0, \qquad \ddot{z} = 0. \]

Integration replicates the result (5.28).
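The Christoffel coefficients entering (5.27) can be generated symbolically from the metric. The following is a minimal sketch, assuming SymPy; the helper `christoffel_second_kind` is a hypothetical name, and the second check (the plane in polar coordinates, with metric diag(1, ρ²)) is an added illustration rather than an example from the text.

```python
import sympy as sp

def christoffel_second_kind(g, coords):
    """Return Gamma[n][j][k] = g^{ni} (d_k g_ij + d_j g_ik - d_i g_jk) / 2."""
    g_inv = g.inv()
    dim = len(coords)
    return [[[sp.simplify(sum(
                 g_inv[n, i] * (sp.diff(g[i, j], coords[k])
                                + sp.diff(g[i, k], coords[j])
                                - sp.diff(g[j, k], coords[i])) / 2
                 for i in range(dim)))
              for k in range(dim)]
             for j in range(dim)]
            for n in range(dim)]

# Cylinder rho = a with (u^1, u^2) = (phi, z): g11 = a^2, g12 = 0, g22 = 1.
a = sp.symbols("a", positive=True)
phi, z = sp.symbols("phi z", real=True)
gamma_cyl = christoffel_second_kind(sp.diag(a**2, 1), (phi, z))

# Assumed extra check: the plane in polar coordinates (rho, phi).
rho = sp.symbols("rho", positive=True)
gamma_polar = christoffel_second_kind(sp.diag(1, rho**2), (rho, phi))
```

All eight coefficients vanish for the cylinder, in agreement with the reduction to φ̈ = 0, z̈ = 0, while the polar-coordinate metric gives the nonzero coefficients Γ¹₂₂ = −ρ and Γ²₁₂ = Γ²₂₁ = 1/ρ.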

Exercise 5.28. Find the geodesics of a sphere. Show that they are great circles of the sphere.

This exercise shows that a geodesic is not always a shortest route, since we can get either part of a great circle on the sphere connecting two given points, and both of them are geodesics.

5.5 The Second Fundamental Form of a Surface

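Using the Taylor expansion we can find the change of r(u¹, u²) to a higher degree of approximation:

\[
\mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) = \mathbf{r}(u^1, u^2) + \frac{\partial \mathbf{r}(u^1, u^2)}{\partial u^i}\,\Delta u^i + \frac{1}{2}\frac{\partial^2 \mathbf{r}(u^1, u^2)}{\partial u^i\,\partial u^j}\,\Delta u^i \Delta u^j + o\left((\Delta u^1)^2 + (\Delta u^2)^2\right).
\]

Here we apply the o-notation to vector quantities, indicating that the norm of the remainder is of a higher order of smallness. We are interested in the deviation of the surface from the osculating plane in a neighborhood of a point (u¹, u²). This value is given through the dot product by the normal vector n:

\[
\begin{aligned}
\left( \mathbf{r}(u^1 + \Delta u^1, u^2 + \Delta u^2) - \mathbf{r}(u^1, u^2) \right) \cdot \mathbf{n}
&= \frac{\partial \mathbf{r}(u^1, u^2)}{\partial u^i} \cdot \mathbf{n}\,\Delta u^i + \frac{1}{2}\frac{\partial^2 \mathbf{r}(u^1, u^2)}{\partial u^i\,\partial u^j} \cdot \mathbf{n}\,\Delta u^i \Delta u^j + o\left((\Delta u^1)^2 + (\Delta u^2)^2\right) \\
&= \frac{1}{2}\frac{\partial^2 \mathbf{r}(u^1, u^2)}{\partial u^i\,\partial u^j} \cdot \mathbf{n}\,\Delta u^i \Delta u^j + o\left((\Delta u^1)^2 + (\Delta u^2)^2\right).
\end{aligned}
\]

We have used the fact that rᵢ · n = 0. The terms of the second order of smallness in ∆uⁱ constitute the second fundamental form. Denoting

\[
\begin{aligned}
\frac{\partial^2 \mathbf{r}(u^1, u^2)}{(\partial u^1)^2} \cdot \mathbf{n} &= L(u^1, u^2) = L, \\
\frac{\partial^2 \mathbf{r}(u^1, u^2)}{\partial u^1\,\partial u^2} \cdot \mathbf{n} &= M(u^1, u^2) = M, \\
\frac{\partial^2 \mathbf{r}(u^1, u^2)}{(\partial u^2)^2} \cdot \mathbf{n} &= N(u^1, u^2) = N,
\end{aligned}
\]

we introduce the second fundamental form by

\[ d^2\mathbf{r} \cdot \mathbf{n} = L(du^1)^2 + 2M\, du^1\, du^2 + N(du^2)^2. \tag{5.29} \]

The deviation of the surface from the osculating plane in a small neighborhood of (u¹, u²) is given by the following approximation:

\[ z = \frac{1}{2}\left( L(du^1)^2 + 2M\, du^1\, du^2 + N(du^2)^2 \right), \tag{5.30} \]

and so (5.30) defines the nature of the behavior of the surface in this neighborhood. This formula defines a paraboloid

\[ z = \frac{1}{2}\left( L(v^1)^2 + 2M v^1 v^2 + N(v^2)^2 \right) \tag{5.31} \]

written in the triad (r₁(u¹, u²), r₂(u¹, u²), n(u¹, u²)) when the origin is the point of the surface with coordinates (u¹, u²). Depending on the coefficients L, M, N, this paraboloid can be elliptic, hyperbolic or parabolic. If L = M = N = 0 then the corresponding point is called a planar point.

For the second fundamental form of the surface there is a representation different from (5.29). Differentiating the identity rᵢ · n = 0 we get

\[ \frac{\partial \mathbf{r}_i}{\partial u^j} \cdot \mathbf{n} = -\mathbf{r}_i \cdot \frac{\partial \mathbf{n}}{\partial u^j}. \tag{5.32} \]

Dot multiplying dr · dn we get

\[ \mathbf{r}_i\, du^i \cdot \frac{\partial \mathbf{n}}{\partial u^j}\, du^j = -\left( \frac{\partial \mathbf{r}_i}{\partial u^j}\, du^i\, du^j \right) \cdot \mathbf{n} = -d^2\mathbf{r} \cdot \mathbf{n}, \]

which means that

\[ -d\mathbf{r} \cdot d\mathbf{n} = L(du^1)^2 + 2M\, du^1\, du^2 + N(du^2)^2. \]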
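Both representations of the second fundamental form can be checked symbolically on a concrete surface. The following is a minimal sketch, assuming SymPy; the helicoid r(u, v) = (u cos v, u sin v, cv) is a hypothetical example chosen for illustration, with (u¹, u²) = (u, v).

```python
import sympy as sp

u, v = sp.symbols("u v", real=True)
c = sp.symbols("c", positive=True)

# Hypothetical example surface: helicoid.
r = sp.Matrix([u * sp.cos(v), u * sp.sin(v), c * v])
r_u, r_v = r.diff(u), r.diff(v)

# Unit normal n = (r_1 x r_2) / |r_1 x r_2|.
normal = r_u.cross(r_v)
n = normal / sp.sqrt(sp.simplify(normal.dot(normal)))

# Coefficients from the definition: second derivatives of r dotted with n.
L = sp.simplify(r_u.diff(u).dot(n))
M = sp.simplify(r_u.diff(v).dot(n))
N = sp.simplify(r_v.diff(v).dot(n))

# Alternative representation via (5.32) with i = 1, j = 2:
# (d r_1 / d u^2) . n = - r_1 . (d n / d u^2).
M_alt = sp.simplify(-r_u.dot(n.diff(v)))
```

For this surface L = N = 0 and M = −c/√(u² + c²), and the value obtained from (5.32) agrees with the one obtained from the definition.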

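Exercise 5.29. Let a surface be given in Cartesian components as

\[ \mathbf{r} = x(u, v)\,\hat{\mathbf{x}} + y(u, v)\,\hat{\mathbf{y}} + z(u, v)\,\hat{\mathbf{z}}. \]

Let the subscripts u and v denote corresponding partial derivatives with respect to u and v. Demonstrate that the coefficients of the first fundamental form are

\[
\begin{aligned}
E &= \mathbf{r}_u^2 = x_u^2 + y_u^2 + z_u^2, \\
F &= \mathbf{r}_u \cdot \mathbf{r}_v = x_u x_v + y_u y_v + z_u z_v, \\
G &= \mathbf{r}_v^2 = x_v^2 + y_v^2 + z_v^2.
\end{aligned}
\]

Then show that the coefficients of the second fundamental form are

\[
L = \frac{\begin{vmatrix} x_{uu} & y_{uu} & z_{uu} \\ x_u & y_u & z_u \\ x_v & y_v & z_v \end{vmatrix}}{\sqrt{EG - F^2}}, \qquad
M = \frac{\begin{vmatrix} x_{uv} & y_{uv} & z_{uv} \\ x_u & y_u & z_u \\ x_v & y_v & z_v \end{vmatrix}}{\sqrt{EG - F^2}}, \qquad
N = \frac{\begin{vmatrix} x_{vv} & y_{vv} & z_{vv} \\ x_u & y_u & z_u \\ x_v & y_v & z_v \end{vmatrix}}{\sqrt{EG - F^2}}.
\]

Exercise 5.30. Find L, M, N for a surface of revolution.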

Normal curvature of the surface

Let us consider a curve lying on the surface. The curve can be uniquely defined by a parameter s in such a way that u¹ = u¹(s), u² = u²(s); for simplicity we assume s to be the length parameter. Then the equation of the curve is

\[ \mathbf{r} = \mathbf{r}(u^1(s), u^2(s)) = \mathbf{r}(s) \]

and a tangential vector to the curve at (u¹, u²) is (rᵢ duⁱ/ds) ds. Consider the curvature of this curve, which can be found using r′′(s): the normal ν to the curve has the same direction and k1 equals |r′′(s)|. It follows that

\[ \mathbf{r}''(s) \cdot \mathbf{n} = k_1 \cos\vartheta \tag{5.33} \]

where ϑ is the angle between ν and n. Let us rewrite (5.33) in the form

\[
\begin{aligned}
k_1 \cos\vartheta
&= \frac{\dfrac{\partial^2 \mathbf{r}}{(\partial u^1)^2}\left(\dfrac{du^1}{ds}\right)^2 (ds)^2 + 2\dfrac{\partial^2 \mathbf{r}}{\partial u^1\,\partial u^2}\dfrac{du^1}{ds}\dfrac{du^2}{ds}(ds)^2 + \dfrac{\partial^2 \mathbf{r}}{(\partial u^2)^2}\left(\dfrac{du^2}{ds}\right)^2 (ds)^2}{(ds)^2} \cdot \mathbf{n} \\
&= \frac{L(du^1(s))^2 + 2M\, du^1(s)\, du^2(s) + N(du^2(s))^2}{E(du^1(s))^2 + 2F\, du^1(s)\, du^2(s) + G(du^2(s))^2}.
\end{aligned}
\tag{5.34}
\]

For all the curves with the same direction defined by the constant ratio du¹(s) : du²(s), the right-hand side of (5.34) is the same as the ratio of the second to the first fundamental forms. We define this value k0 as the normal curvature of the surface in the direction du¹(s) : du²(s). Geometrically, this is the curvature of the curve formed by intersecting the surface with the plane through (u¹(s), u²(s)) that is parallel to n and rᵢ duⁱ(s). This normal curvature satisfies the equality

\[ k_0 = k_1 \cos\vartheta \]

which is the main part of Meusnier’s theorem. From this formula for curvatures it follows that the normal curvature of the surface in the direction du¹ : du² has minimal absolute value among all the curvatures of the curves on the surface through (u¹(s), u²(s)) having the same direction du¹ : du². Let us note that the curvature of the osculating paraboloid is characterized by the same equation. The normal curvature depends on the direction du¹ : du² = x : y only. Let us rewrite its expression as

\[ k_0 = \frac{Lx^2 + 2Mxy + Ny^2}{Ex^2 + 2Fxy + Gy^2}. \tag{5.35} \]

The right-hand side of (5.35) is a homogeneous form with respect to x, y of zero order; it is easy to see that the minimum and maximum of k0 can be found as the minimum and maximum of the quadratic form

\[ Lx^2 + 2Mxy + Ny^2 \]

when

\[ Ex^2 + 2Fxy + Gy^2 = 1. \tag{5.36} \]

This problem can be solved with use of the Lagrange multiplier. On the curve (5.36) denote

\[ \min\left(Lx^2 + 2Mxy + Ny^2\right) = k_{\min}, \qquad \max\left(Lx^2 + 2Mxy + Ny^2\right) = k_{\max}. \]

Let the corresponding values for x, y be (x₁, y₁) and (x₂, y₂), respectively. If kmin ≠ kmax these values define only two directions xⱼ : yⱼ (j = 1, 2) and it can be shown that

\[ E x_1 x_2 + F(x_1 y_2 + x_2 y_1) + G y_1 y_2 = 0, \]

which, by (5.16), means that the directions corresponding to the extreme values of the normal curvature are mutually orthogonal. In the case when kmin = kmax the normal curvature is the same in all directions (the osculating paraboloid corresponds to a paraboloid of revolution). These values kmin, kmax are the extreme normal curvatures of the surface at a point; they characterize the surface at a point and hence are invariant with respect to the change of coordinates of the surface. So are two other characteristics of the surface at a point: the mean curvature

\[ H = \frac{1}{2}(k_{\min} + k_{\max}), \]

and the Gaussian or complete curvature

\[ K = k_{\min} k_{\max}. \]

We present without proof the equations in terms of the fundamental forms of the surface:

\[ H = \frac{1}{2}\,\frac{LG - 2MF + NE}{EG - F^2}, \qquad K = \frac{LN - M^2}{EG - F^2}. \]

Since $EG - F^2 > 0$, the sign of $K$ is the sign of $LN - M^2$. When $K = 0$ everywhere the surface is developable.⁴ These and many other facts on the properties of surfaces with given $H$ and $K$ can be found in any textbook on differential geometry (e.g., [Pogorelov (1957)]).
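The formulas for $H$ and $K$ in terms of the fundamental forms can be evaluated directly. The following is a minimal sketch, not part of the original text: the function name and the test surface (a circular cylinder $\mathbf{r} = (a\cos u, a\sin u, v)$ with outward normal, for which $E = a^2$, $F = 0$, $G = 1$, $L = -a$, $M = N = 0$) are assumptions chosen for illustration.

```python
def mean_and_gaussian_curvature(E, F, G, L, M, N):
    """Return (H, K) from the coefficients of the first and second fundamental forms."""
    det = E * G - F * F              # EG - F^2, positive on a regular surface
    if det <= 0:
        raise ValueError("EG - F^2 must be positive")
    H = 0.5 * (L * G - 2.0 * M * F + N * E) / det
    K = (L * N - M * M) / det
    return H, K


# Assumed example: cylinder of radius a with outward unit normal.
a = 2.0
H, K = mean_and_gaussian_curvature(a * a, 0.0, 1.0, -a, 0.0, 0.0)
print(H, K)
```

For the cylinder the Gaussian curvature vanishes, in agreement with the statement that developable surfaces have $K = 0$. The sign of $H$ depends on the chosen direction of the normal.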

Exercise 5.31. Find the mean and Gaussian curvatures for a sphere of radius $a$.

Exercise 5.32. Find the mean and Gaussian curvatures at $x = y = 0$ of the following paraboloids: (a) $z = axy$; (b) $z = a(x^2 + y^2)$; (c) $z = ax^2 + by^2$.

Exercise 5.33. Suppose a surface is given in Cartesian coordinates:

\[ \mathbf{r} = x \hat{\mathbf{x}} + y \hat{\mathbf{y}} + \hat{\mathbf{z}} f(x, y). \]

Demonstrate that (a)

\[ E = \mathbf{r}_x^2 = 1 + f_x^2, \qquad F = \mathbf{r}_x \cdot \mathbf{r}_y = f_x f_y, \qquad G = \mathbf{r}_y^2 = 1 + f_y^2; \]

(b)

\[ EG - F^2 = 1 + f_x^2 + f_y^2; \]

(c)

\[ L = \frac{f_{xx}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad M = \frac{f_{xy}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad N = \frac{f_{yy}}{\sqrt{1 + f_x^2 + f_y^2}}; \]

(d) the area of a portion of the surface is

\[ S = \int_D \sqrt{1 + f_x^2 + f_y^2} \, dx \, dy; \]

(e) the Gaussian curvature is

\[ K = \frac{LN - M^2}{EG - F^2} = \frac{f_{xx} f_{yy} - f_{xy}^2}{\left( 1 + f_x^2 + f_y^2 \right)^2}. \]

⁴ Roughly speaking, a developable surface is one that can be flattened into a portion of a plane without compressing or stretching any part of it. Examples include cones and cylinders, but not spheres. A developable surface can be generated by sweeping a straight line (generator) along a curve through space. See page 168 for a more precise definition.

5.6 Derivation Formulas

At each point of a surface there is defined the frame triad

\[ \mathbf{r}_1(u^1, u^2), \quad \mathbf{r}_2(u^1, u^2), \quad \mathbf{n}(u^1, u^2). \]

In the theory of shells one introduces curvilinear coordinates in a neighborhood of the surface in such a way that on the surface the triad $(\mathbf{r}_1, \mathbf{r}_2, \mathbf{n})$ is preserved. Since we seek different characteristics of fields given on the surface and outside of it, we must find the derivatives of the triad with respect to the coordinates. The goal of this section is to present these in terms of the characteristics of the surface we have introduced: i.e., in terms of the first and second fundamental forms of the surface.

We start with the representation for the derivatives through the Christoffel notation. This is valid because, by assumption, $(\mathbf{r}_1, \mathbf{r}_2, \mathbf{n})$ is a basis of the space of vectors. So

\[ \mathbf{r}_{ij} = \frac{\partial^2 \mathbf{r}}{\partial u^i \partial u^j} = \Gamma^t_{ij} \mathbf{r}_t + \lambda_{ij} \mathbf{n}. \tag{5.37} \]

We recall that we use indices $i, j, t$ taking values from the set $\{1, 2\}$, which is why we introduce the notation $\lambda_{ij}$ for the coefficients of $\mathbf{n}$. Similarly let us introduce the expansion for the derivatives of $\mathbf{n}$. To derive the coefficients $\lambda_{ij}$ we dot multiply (5.37) by $\mathbf{n}$. Using the expressions for the second fundamental form we get

\[ \lambda_{11} = L, \qquad \lambda_{12} = \lambda_{21} = M, \qquad \lambda_{22} = N. \]

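Next, dot multiplying (5.37) first by $\mathbf{r}_1$ and then by $\mathbf{r}_2$, we get six equations in the six unknown Christoffel symbols:

\[
\begin{aligned}
\frac{1}{2} \frac{\partial E}{\partial u^1} &= \Gamma^1_{11} E + \Gamma^2_{11} F, &
\frac{\partial F}{\partial u^1} - \frac{1}{2} \frac{\partial E}{\partial u^2} &= \Gamma^1_{11} F + \Gamma^2_{11} G, \\
\frac{1}{2} \frac{\partial E}{\partial u^2} &= \Gamma^1_{12} E + \Gamma^2_{12} F, &
\frac{1}{2} \frac{\partial G}{\partial u^1} &= \Gamma^1_{12} F + \Gamma^2_{12} G, \\
\frac{\partial F}{\partial u^2} - \frac{1}{2} \frac{\partial G}{\partial u^1} &= \Gamma^1_{22} E + \Gamma^2_{22} F, &
\frac{1}{2} \frac{\partial G}{\partial u^2} &= \Gamma^1_{22} F + \Gamma^2_{22} G.
\end{aligned}
\]

This system can be easily solved for the Christoffel coefficients since the system splits into pairs of equations with respect to the pairs of Christoffel symbols. The first pair yields, for instance,

\[ \Gamma^1_{11} = \frac{1}{EG - F^2} \left[ \frac{G}{2} \frac{\partial E}{\partial u^1} - F \frac{\partial F}{\partial u^1} + \frac{F}{2} \frac{\partial E}{\partial u^2} \right], \]

and a similar expression for $\Gamma^2_{11}$. We leave the rest of this work to the reader, mentioning that the expressions for the Christoffel symbols are composed only of the coefficients of the first fundamental form of the surface. Next, we consider the expressions for the derivatives of the normal $\mathbf{n}$: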

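\[ \mathbf{n}_i \equiv \frac{\partial \mathbf{n}}{\partial u^i} = \mu_i^{\cdot t} \mathbf{r}_t + \mu_i^{\cdot 3} \mathbf{n}. \tag{5.38} \]

Let us find the coefficients of these expansions through the coefficients of the fundamental forms of the surface. Because of the equality

\[ 0 = \frac{\partial}{\partial u^i} 1 = \frac{\partial}{\partial u^i} (\mathbf{n} \cdot \mathbf{n}) = 2 \mathbf{n}_i \cdot \mathbf{n} \]

we have

\[ \mu_i^{\cdot 3} = 0. \]

Let us dot multiply (5.38) when $i = 1$ by $\mathbf{r}_1$ and $\mathbf{r}_2$ successively. Using the definitions for the coefficients of the first and second fundamental forms we have the equations

\[ -L = \mu_1^{\cdot 1} E + \mu_1^{\cdot 2} F, \qquad -M = \mu_1^{\cdot 1} F + \mu_1^{\cdot 2} G, \]

from which we get

\[ \mu_1^{\cdot 1} = \frac{-LG + MF}{EG - F^2}, \qquad \mu_1^{\cdot 2} = \frac{LF - ME}{EG - F^2}. \]

Repeating this procedure for (5.38) when $i = 2$ we similarly obtain

\[ \mu_2^{\cdot 1} = \frac{NF - MG}{EG - F^2}, \qquad \mu_2^{\cdot 2} = \frac{-NE + MF}{EG - F^2}. \]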
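Both the Christoffel symbols and the coefficients $\mu_i^{\cdot t}$ come from pairs of linear equations with the same matrix of coefficients $E, F, G$, so one small solver serves for both. The sketch below is an illustration only: the function names, the dictionary layout keyed by $(t, i, j)$, and the two test cases (polar coordinates in the plane at radius 2, and a cylinder of radius 2 with outward normal) are assumptions, not taken from the text.

```python
def solve_pair(E, F, G, p, q):
    """Solve a*E + b*F = p, a*F + b*G = q for (a, b) by Cramer's rule."""
    det = E * G - F * F
    return (p * G - q * F) / det, (q * E - p * F) / det


def christoffel_symbols(E, F, G, dE, dF, dG):
    """dE, dF, dG are pairs (d/du1, d/du2); returns {(t, i, j): Gamma^t_ij}."""
    g = {}
    g[1, 1, 1], g[2, 1, 1] = solve_pair(E, F, G, 0.5 * dE[0], dF[0] - 0.5 * dE[1])
    g[1, 1, 2], g[2, 1, 2] = solve_pair(E, F, G, 0.5 * dE[1], 0.5 * dG[0])
    g[1, 2, 2], g[2, 2, 2] = solve_pair(E, F, G, dF[1] - 0.5 * dG[0], 0.5 * dG[1])
    g[1, 2, 1], g[2, 2, 1] = g[1, 1, 2], g[2, 1, 2]   # symmetry in the lower indices
    return g


def weingarten_coefficients(E, F, G, L, M, N):
    """Rows (mu_i^1, mu_i^2) of the expansion n_i = mu_i^1 r_1 + mu_i^2 r_2."""
    return solve_pair(E, F, G, -L, -M), solve_pair(E, F, G, -M, -N)


# Assumed check 1: plane in polar coordinates (u1 = radius, u2 = angle) at radius 2,
# where E = 1, F = 0, G = radius^2.
gamma = christoffel_symbols(1.0, 0.0, 4.0, (0.0, 0.0), (0.0, 0.0), (4.0, 0.0))

# Assumed check 2: cylinder of radius 2 with outward normal (E = 4, G = 1, L = -2).
mu = weingarten_coefficients(4.0, 0.0, 1.0, -2.0, 0.0, 0.0)
```

In the polar case the nonzero symbols are $\Gamma^1_{22}$ (minus the radius) and $\Gamma^2_{12} = \Gamma^2_{21}$ (the reciprocal of the radius), which is the familiar result for the plane.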

Exercise 5.34. Derive the Christoffel symbols for the case when the first fundamental form is $(du^1)^2 + G (du^2)^2$.

Exercise 5.35. For a surface of revolution with coordinate lines being parallels and meridians, derive all coefficients of the expansions (5.37) and (5.38).

Some useful formulas

Certain identities are frequently employed. Their derivation is lengthy so we omit it here. Let us redenote the coefficients of the second fundamental form using index notations:

\[ b_{11} = L, \qquad b_{12} = b_{21} = M, \qquad b_{22} = N. \]

These are the covariant components of a corresponding symmetric tensor of order two. The first formula is due to Gauss:

\[ b_{11} b_{22} - b_{12}^2 = \frac{\partial^2 g_{12}}{\partial u^1 \partial u^2} - \frac{1}{2} \left( \frac{\partial^2 g_{11}}{(\partial u^2)^2} + \frac{\partial^2 g_{22}}{(\partial u^1)^2} \right) + \left( \Gamma^i_{12} \Gamma^j_{12} - \Gamma^i_{11} \Gamma^j_{22} \right) g_{ij}, \]

from which (and the form of $K$, see page 152) it follows that the Gaussian curvature of the surface can be expressed purely in terms of the first fundamental form. The next formulas are due to Peterson and Codazzi:

\[ \frac{\partial b_{i1}}{\partial u^2} - \frac{\partial b_{i2}}{\partial u^1} = \Gamma^t_{i2} b_{t1} - \Gamma^t_{i1} b_{t2} \qquad (i = 1, 2). \]

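Using the formulas of this paragraph, we can derive the formulas for differentiating a spatial vector

\[ \mathbf{f} = f^i \mathbf{r}_i + f^3 \mathbf{n} = f_i \mathbf{r}^i + f_3 \mathbf{n} \]

given on the surface. Denoting $\mathbf{r}_3 = \mathbf{r}^3 = \mathbf{n}$, we can derive the following formulas that mimic the formulas of Chapter 4:

\[ \frac{\partial \mathbf{f}}{\partial u^i} = \nabla_i f^t \, \mathbf{r}_t = \nabla_i f_t \, \mathbf{r}^t. \]

In these, the covariant derivatives are

\[
\begin{aligned}
\nabla_i f^j &= \frac{\partial f^j}{\partial u^i} + \Gamma^j_{it} f^t - b^j_i f^3, \\
\nabla_i f_j &= \frac{\partial f_j}{\partial u^i} - \Gamma^t_{ij} f_t - b_{ij} f_3, \\
\nabla_i f_3 &= \frac{\partial f_3}{\partial u^i} + b_{it} f^t = \frac{\partial f_3}{\partial u^i} + b^t_i f_t.
\end{aligned}
\]

Here $i, j$ take the values 1, 2. The index $t$ takes the values 1, 2, 3 in the expansion of $\partial \mathbf{f} / \partial u^i$ and the values 1, 2 in the expressions for the covariant derivatives. Summation is carried out over repeated indices in the respective ranges. When $\mathbf{f} = \mathbf{n}$, i.e., when $f^1 = f^2 = 0$ and $f^3 = 1$, we get the following formula for the derivatives of $\mathbf{n}$:

\[ \frac{\partial \mathbf{n}}{\partial u^i} = -b^t_i \mathbf{r}_t = -b_{it} \mathbf{r}^t. \tag{5.39} \]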

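Introducing the surface gradient or surface nabla operator by the formula

\[ \nabla = \mathbf{r}^i \frac{\partial}{\partial u^i} \qquad (i = 1, 2), \]

we find that

\[ \nabla \mathbf{f} = \nabla_i f^t \, \mathbf{r}^i \mathbf{r}_t = \nabla_i f_t \, \mathbf{r}^i \mathbf{r}^t, \qquad \nabla \mathbf{n} = -\mathbf{B}, \]

where $\mathbf{B}$ is the surface curvature tensor given by

\[ \mathbf{B} = b^j_i \mathbf{r}^i \mathbf{r}_j = b_{ij} \mathbf{r}^i \mathbf{r}^j = b^{ij} \mathbf{r}_i \mathbf{r}_j. \]

One may introduce the surface divergence and rotation operations

\[ \operatorname{div} \mathbf{f} = \nabla \cdot \mathbf{f} = \mathbf{r}^i \cdot \frac{\partial \mathbf{f}}{\partial u^i}, \qquad \operatorname{rot} \mathbf{f} = \nabla \times \mathbf{f} = \mathbf{r}^i \times \frac{\partial \mathbf{f}}{\partial u^i}. \]

The applications of the differential calculus on a surface are presented in Chapter 7, where shell theories are considered.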

5.7 Implicit Representation of a Curve; Contact of Curves

Our work in Chapter 5 has stressed technical topics needed to derive equations describing certain natural objects. We now turn to some less technical but quite important questions that can further our understanding of differential geometry.

We have described a curve by the representation $\mathbf{r} = \mathbf{r}(t)$. From this point of view, a curve is the image of some one-dimensional domain of the parameter $t$. Similarly, a surface was an image of a two-dimensional domain, given by a formula

\[ \mathbf{r} = \mathbf{r}(u^1, u^2). \tag{5.40} \]

However, we know that in Cartesian coordinates a sphere is described by the equation

\[ (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2, \]

and this cannot be represented in the form (5.40) for all points simultaneously. This is an example of the implicit form of description of a surface. We can describe any small part of a sphere in an explicit form of the type (5.40). This is a point from which the theory of manifolds arose: a surface such as a sphere can be divided into small portions, each of which can be described in the needed way.

However, implicit form descriptions of surfaces and curves are common in practice. Let us suppose that the equation of a surface is

\[ F(\mathbf{r}) = 0. \]

In Cartesian coordinates this would look like

\[ F(x, y, z) = 0. \]

Earlier we described coordinate curves in space as sets given by the equation $\mathbf{r} = \mathbf{r}(q^1, q^2, q^3)$ when two of the three coordinate parameters $q^1, q^2, q^3$ are fixed. Each such curve can be described alternatively as the intersection of two coordinate surfaces. Consider for instance a $q^1$ coordinate curve: the curve corresponding to $q^2 = q^2_0$, $q^3 = q^3_0$ is the intersection of the surfaces $\mathbf{r} = \mathbf{r}(q^1, q^2, q^3_0)$ and $\mathbf{r} = \mathbf{r}(q^1, q^2_0, q^3)$. Similarly a curve in space can be represented as the set of points described by a vector $\mathbf{r}$ that satisfies two simultaneous equations

\[ F_1(\mathbf{r}) = 0, \qquad F_2(\mathbf{r}) = 0, \]

or, in coordinate form,

\[ F_1(q^1, q^2, q^3) = 0, \qquad F_2(q^1, q^2, q^3) = 0. \]

A tangent $\mathbf{t}$ to the curve at a point must be orthogonal to the two surface normals at the point; if $\mathbf{n}_1$ and $\mathbf{n}_2$ are these normals, we can write

\[ \mathbf{t} = \mathbf{n}_1 \times \mathbf{n}_2. \]
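As a small numerical illustration of the last formula, the sketch below takes the (unnormalized) normals to be the gradients of $F_1$ and $F_2$ in Cartesian coordinates, approximated by central differences. The helper names, the step size, and the test case (unit sphere cut by the plane $z = 0$) are assumptions made for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def gradient(F, p, h=1e-6):
    """Central-difference gradient of F(x, y, z) at the point p."""
    grad = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        grad.append((F(*plus) - F(*minus)) / (2.0 * h))
    return tuple(grad)


def tangent_direction(F1, F2, p):
    """Direction n1 x n2 of the curve F1 = 0, F2 = 0 at p (not normalized)."""
    return cross(gradient(F1, p), gradient(F2, p))


def sphere(x, y, z):
    return x * x + y * y + z * z - 1.0


def plane(x, y, z):
    return z


t = tangent_direction(sphere, plane, (1.0, 0.0, 0.0))
print(t)
```

At the point $(1, 0, 0)$ of the equator the result is directed along the $y$-axis, as expected for the circle $x^2 + y^2 = 1$, $z = 0$. The vector is not of unit length; only its direction matters here.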


Contact of curves

A curve in the plane can also be described implicitly, but using just one equation:

\[ F(\mathbf{r}) = 0 \quad \text{or} \quad F(q^1, q^2) = 0. \]

We would like to study the problem of approximating a given plane curve, at a given point, by another curve taken from a parametric family of curves. Let us begin with the following problem:

Given a smooth curve $A$ and point $C$ on it, select from all straight lines in the plane that which best approximates the behavior of $A$ at $C$.

Of course, one solution is the line tangent to $A$ at $C$. We see this by considering a line through $C$ that intersects $A$ at a point $D$ close to $C$, and then producing a limit passage under which this line rotates about $C$ in such a way that $D$ tends to $C$. The line approached in the limit is the tangent line.

Let us extend this idea somewhat and seek an approximation that can reflect both the direction of the curve $A$ at $C$ and its curvature at that point. Since straight lines cannot reflect curvature behavior, we shall have to employ some other family of approximating curves. We could use the family of all circles in the plane, since any three non-collinear points in the plane determine a unique circle. Taking two points $D_1$ and $D_2$ near $C$ and drawing the circle through these three points, we could attempt a limit passage under which both $D_1$ and $D_2$ tend to $C$. This should yield a circle capable of reflecting both desired properties (tangent and curvature) of $A$ at $C$. The family of all circles in the plane is described in Cartesian coordinates by the equation

\[ (x - x_0)^2 + (y - y_0)^2 = R^2, \tag{5.41} \]

where the three values $x_0$, $y_0$ and $R$ are free parameters that should be properly chosen to approximate a given curve at a point. So we see that a three-parameter family of plane curves will be needed to get the better approximation we seek. We could employ a parabola from the family $y = ax^2 + bx + c$, but this manner of approximation would be less informative.

In general we can seek to approximate a plane curve

\[ \mathbf{r} = \mathbf{r}(t) \tag{5.42} \]

at a point $t = t_0$ by introducing a parametric family $\Phi_n$ of curves

\[ F(\mathbf{r}, a_1, \ldots, a_n) = 0, \]

where $a_1, \ldots, a_n$ are free parameters.⁵ Suppose we take $n$ points

\[ \mathbf{r}(t_k) \qquad (k = 0, \ldots, n - 1) \]

of the curve (5.42), where all the $t_k$ are close to $t_0$. With $n$ free parameters at our disposal we should be able to find a curve from $\Phi_n$ that goes through these points. This means that the system

\[
\begin{aligned}
F(\mathbf{r}(t_0), a_1, \ldots, a_n) &= 0, \\
F(\mathbf{r}(t_1), a_1, \ldots, a_n) &= 0, \\
&\;\;\vdots \\
F(\mathbf{r}(t_{n-1}), a_1, \ldots, a_n) &= 0,
\end{aligned}
\tag{5.43}
\]

is satisfied by some set of parameters $a_1, \ldots, a_n$. We assume that the $a_i$ depend on the $t_k$ in such a way that when all the $t_k$ tend to $t_0$ the $a_i$ tend continuously to some respective values $b_i$. Thus we find the curve $F(\mathbf{r}, b_1, \ldots, b_n) = 0$ that best approximates the behavior of (5.42) at $t_0$, and we have

\[ F(\mathbf{r}(t_0), b_1, \ldots, b_n) = 0. \tag{5.44} \]

As a practical matter it is not convenient to use such a limit passage to obtain the needed curve. It would be better to have conditions in which only those characteristics of the curves at $t_0$ are involved. Since we suppose the above limit passage is well-defined, we can take the ordered set of points $t_k$:

\[ t_0 < t_1 < \cdots < t_{n-1}. \]

Let $a_1, \ldots, a_n$ be a solution to (5.43) in this case, and consider the function

\[ f(t) = F(\mathbf{r}(t), a_1, \ldots, a_n) \]

of the single variable $t$. This function vanishes at $t = t_k$ for $k = 0, \ldots, n - 1$. By Rolle's theorem there exists $t'_k \in [t_{k-1}, t_k]$ ($k = 1, \ldots, n - 1$) such that $f'(t'_k) = 0$. During the limit passage all the $t'_k$ also tend to $t_0$, so we have

\[ \left. \frac{d}{dt} F(\mathbf{r}(t), b_1, \ldots, b_n) \right|_{t = t_0} = 0. \]

Similarly, the function

\[ f'(t) = F_t(\mathbf{r}(t), a_1, \ldots, a_n) \]

takes $n - 1$ zeroes at the points $t'_1 < \cdots < t'_{n-1}$. So by Rolle's theorem there is a point $t''_k$ on each segment $[t'_{k-1}, t'_k]$ such that $f''(t''_k) = 0$. After the main limit passage we will have

\[ \left. \frac{d^2}{dt^2} F(\mathbf{r}(t), b_1, \ldots, b_n) \right|_{t = t_0} = 0. \]

⁵ We suppose that the curve (5.42) is sufficiently smooth at $t_0$, as is each curve taken from the family $\Phi_n$.

Repeating this procedure for each derivative up to order $n - 1$, we obtain the conditions

\[ \left. \frac{d^k}{dt^k} F(\mathbf{r}(t), b_1, \ldots, b_n) \right|_{t = t_0} = 0 \qquad (k \leq n - 1). \]

These, taken together with (5.44), constitute $n$ conditions that should be fulfilled by the needed curve from the family $\Phi_n$. We define the order of contact between this curve and (5.42) as the number $n - 1$.
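The limit passage behind these conditions can be observed numerically for the three-parameter family of circles. The sketch below is an illustration under stated assumptions: the test curve is the parabola $y = x^2/2$ (curvature 1 at the origin), the three points are taken at parameters $-h, 0, h$, and the helper names are hypothetical. The radius of the circle through the three points is computed from the side lengths and the triangle area.

```python
import math


def circumradius(p, q, r):
    """Radius of the circle through three non-collinear points of the plane."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    # Twice the triangle area, from the planar cross product of two edges.
    twice_area = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))
    return a * b * c / (2.0 * twice_area)


def curve(t):
    """Assumed test curve: the parabola y = x^2 / 2."""
    return (t, 0.5 * t * t)


# Let the two auxiliary points approach the point t = 0.
radii = [circumradius(curve(-h), curve(0.0), curve(h)) for h in (0.5, 0.1, 0.01)]
print(radii)
```

For this symmetric choice of points the radius equals $1 + h^2/4$, so the printed values decrease toward 1, the radius of the circle having second-order contact with the parabola at its vertex.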

Contact of a curve with a circle; evolutes

Let us return to our previous problem and apply these conditions to approximate a curve

\[ \mathbf{r} = \mathbf{i}_1 x(t) + \mathbf{i}_2 y(t) \]

at a point $t = t_0$ by a circle of the family (5.41). Our three free parameters $x_0, y_0, R$ should satisfy the system

\[
\begin{aligned}
(x(t_0) - x_0)^2 + (y(t_0) - y_0)^2 &= R^2, \\
2 (x(t_0) - x_0) \, x'(t_0) + 2 (y(t_0) - y_0) \, y'(t_0) &= 0, \\
2 x'(t_0)^2 + 2 (x(t_0) - x_0) \, x''(t_0) + 2 y'(t_0)^2 + 2 (y(t_0) - y_0) \, y''(t_0) &= 0.
\end{aligned}
\]

In particular, solution of these yields

\[ R^2 = \frac{\left( [x'(t_0)]^2 + [y'(t_0)]^2 \right)^3}{\left[ x'(t_0) y''(t_0) - x''(t_0) y'(t_0) \right]^2}. \]

We see that the curvature $1/R$ of the circle coincides with the curvature $k_1$ of the curve. As the formulas for $x_0$ and $y_0$ are cumbersome, it is worthwhile to recast the problem in vector notation.

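Let $\mathbf{r}_0$ locate the center of the contact circle

\[ (\mathbf{r} - \mathbf{r}_0)^2 = R^2. \]

Let the curve be given in natural parametrization as $\mathbf{r} = \mathbf{r}(s)$. To solve this problem of second-order contact, we define

\[ F(s) = (\mathbf{r}(s) - \mathbf{r}_0)^2 - R^2 \]

and have

\[ F(s_0) = 0, \qquad F'(s_0) = 0, \qquad F''(s_0) = 0, \]

or

\[
\begin{aligned}
(\mathbf{r}(s_0) - \mathbf{r}_0)^2 - R^2 &= 0, \\
2 (\mathbf{r}(s_0) - \mathbf{r}_0) \cdot \boldsymbol{\tau}(s_0) &= 0, \\
2 + 2 (\mathbf{r}(s_0) - \mathbf{r}_0) \cdot \boldsymbol{\nu}(s_0) \, k_1 &= 0.
\end{aligned}
\]

By the second equation the vector $\mathbf{r}(s_0) - \mathbf{r}_0$ is orthogonal to the tangent $\boldsymbol{\tau}(s_0)$, hence is directed along the normal $\boldsymbol{\nu}(s_0)$ and we have

\[ (\mathbf{r}(s_0) - \mathbf{r}_0) \cdot \boldsymbol{\nu}(s_0) = -|\mathbf{r}(s_0) - \mathbf{r}_0| = -R. \]

This and the third equation yield $1 - k_1 R = 0$, hence $R = 1/k_1$ as stated above.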

The locus of centers of all contact circles for a given curve is called the evolute of the curve. The equation of the evolute is

\[ \boldsymbol{\rho}(t) = \mathbf{r}(t) + R \boldsymbol{\nu}(t), \qquad R = 1/k_1 \]

or in parametric Cartesian form ($\boldsymbol{\rho} = (\xi, \eta)$)

\[ \xi = x - y' \, \frac{x'^2 + y'^2}{x' y'' - x'' y'}, \qquad \eta = y + x' \, \frac{x'^2 + y'^2}{x' y'' - x'' y'}. \]
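The Cartesian formulas for the evolute, together with the expression for $R^2$ obtained from the contact conditions, translate directly into a short routine. This is a sketch only; the function name, argument order, and the test curve (the ellipse $x = 2\cos t$, $y = \sin t$ at $t = 0$) are assumptions introduced here.

```python
import math


def osculating_circle(x, y, dx, dy, ddx, ddy):
    """Return (xi, eta, R): center and radius of the circle of second-order contact.

    The arguments are the values of x, y and their first and second
    derivatives with respect to the curve parameter at the point of contact.
    """
    speed2 = dx * dx + dy * dy
    w = dx * ddy - ddx * dy          # vanishes where the curvature is zero
    if w == 0:
        raise ValueError("curvature vanishes; no finite contact circle")
    xi = x - dy * speed2 / w
    eta = y + dx * speed2 / w
    R = speed2 ** 1.5 / abs(w)
    return xi, eta, R


# Assumed example: the ellipse x = 2 cos t, y = sin t at t = 0.
t = 0.0
xi, eta, R = osculating_circle(2 * math.cos(t), math.sin(t),
                               -2 * math.sin(t), math.cos(t),
                               -2 * math.cos(t), -math.sin(t))
print(xi, eta, R)
```

At the end of the major axis the contact circle has radius $1/2$ and center $(3/2, 0)$; evaluating the routine along the whole curve traces the evolute.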

Contact of nth order between a curve and a surface

Quite similarly we can solve a problem of $n$th-order contact between a space curve $\mathbf{r} = \mathbf{r}(t)$ at $t = t_0$ and a surface from an $(n + 1)$-parameter family of surfaces given implicitly by

\[ F(\mathbf{r}, a_1, \ldots, a_{n+1}) = 0. \]

Everything from the previous pages should be repeated word for word. First we choose a surface from the family in such a way that it coincides with the curve in $n + 1$ points close to $t_0$. This gives $n + 1$ equations:

\[
\begin{aligned}
F(\mathbf{r}(t_0), a_1, \ldots, a_{n+1}) &= 0, \\
F(\mathbf{r}(t_1), a_1, \ldots, a_{n+1}) &= 0, \\
&\;\;\vdots \\
F(\mathbf{r}(t_n), a_1, \ldots, a_{n+1}) &= 0.
\end{aligned}
\]

Our previous reasoning carries through and we obtain

\[ \left. \frac{d^k}{dt^k} F(\mathbf{r}(t), b_1, b_2, \ldots, b_{n+1}) \right|_{t = t_0} = 0 \qquad (k = 0, 1, \ldots, n), \]

as the equations that should hold at the point of $n$th-order contact.

We have introduced the osculating plane to a curve at a point $A$ as the plane through $A$ that contains $\boldsymbol{\tau}$ and $\boldsymbol{\nu}$. Note that it could also be defined as a surface of second-order contact. The result is the same.

The reader should consider how to apply these considerations to the problem of $n$th-order contact between a given surface and a surface from a many-parameter family of surfaces.

Exercise 5.36. Treat the problem of third-order contact between a space curve and a sphere. Show that denoting $R = 1/k_1$ and $\rho$ the radius of the sphere we get (in natural parametrization)

\[ \rho^2 = R^2 + R'^2 / k_2^2. \]

Also show that the center of the contact sphere lies on the straight line through the center of principal curvature that is parallel to the binormal at the point of the curve.

5.8 Osculating Paraboloid

When considering the structure of a surface at a point, it is often helpful to approximate the surface using another surface whose behavior is more easily visualized. A spherical surface would be insufficient for this purpose because it has the same normal curvature in all directions. We can, however, use the osculating paraboloid introduced in (5.31). Let us reconsider this paraboloid from the point of view of local approximation.

Let $O$ be a fixed point of a surface, and assume the surface is sufficiently smooth at $O$. At $O$ we determine the osculating plane and introduce a Cartesian frame $(\mathbf{i}_1, \mathbf{i}_2, \mathbf{n})$, where $(\mathbf{i}_1, \mathbf{i}_2)$ is a Cartesian frame with origin $O$ on the osculating plane and $\mathbf{n}$ is normal to both the surface and the osculating plane. For a smooth surface, Cartesian coordinates $(x, y)$ can play the role of surface coordinates since they uniquely define any point of the surface near $O$. The surface in the vicinity of $O$ can be described by the equation $z = z(x, y)$. We suppose that $z(x, y)$ is twice continuously differentiable near $O$. In the neighborhood of $(0, 0)$ we can use the Taylor expansion of $z(x, y)$:

\[ z(x, y) = z(0, 0) + z_x(0, 0) \, x + z_y(0, 0) \, y + \frac{1}{2} \left( z_{xx} x^2 + 2 z_{xy} xy + z_{yy} y^2 \right) + o(x^2 + y^2), \]

where the indices $x, y$ indicate that we take partial derivatives with respect to $x, y$, respectively (and in this section, evaluate at the point $(0, 0)$). Since the coordinates $(x, y)$ are in the osculating plane, the first partial derivatives vanish:

\[ z_x(0, 0) = 0, \qquad z_y(0, 0) = 0. \]

By choice of the origin, $z(0, 0) = 0$. Thus the Taylor expansion is

\[ z(x, y) = \frac{1}{2} \left( z_{xx} x^2 + 2 z_{xy} xy + z_{yy} y^2 \right) + o(x^2 + y^2). \tag{5.45} \]

Let us consider the paraboloid

\[ z(x, y) = \frac{1}{2} \left( z_{xx} x^2 + 2 z_{xy} xy + z_{yy} y^2 \right). \tag{5.46} \]

The difference between the surfaces described by (5.45) and (5.46) is small (as indicated by the $o$ term). This allows us to show that it is an osculating paraboloid that approximates the behavior of the surface at point $O$. A change of coordinate frames can show that the paraboloid (5.46) coincides with (5.31). Its coefficients are $L, M, N$ at the point $O$.

Let us note that the second fundamental form of the initial surface at a point coincides with that of the osculating paraboloid, and the situation is the same with the normal curvatures. We denote

\[ r = z_{xx}(0, 0), \qquad s = z_{xy}(0, 0), \qquad t = z_{yy}(0, 0), \]

so that the equation of the osculating paraboloid is

\[ z = \frac{1}{2} \left( r x^2 + 2 s xy + t y^2 \right). \]

Let us draw the projection onto the $xy$-coordinate plane of the cross-sections of the osculating paraboloid by the two planes $z = \pm h$, $h > 0$. This is the curve defined by

\[ \frac{1}{2} \left| r x^2 + 2 s xy + t y^2 \right| = h. \tag{5.47} \]

The curve is an ellipse when the paraboloid is elliptical ($rt - s^2 > 0$), a pair of conjugate hyperbolas when it is hyperbolic ($rt - s^2 < 0$), or a pair of parallel straight lines when it is parabolic ($rt - s^2 = 0$). It can be shown that the radius of normal curvature of the surface in the direction $x : y$ is proportional to the squared distance from the origin to the point of the curve (5.47) taken in the same direction. This curve is called the Dupin indicatrix.
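The proportionality between the radius of normal curvature and the squared distance to the Dupin indicatrix can be checked with a few lines of code. The sketch assumes the frame used above (origin at $O$, $x$ and $y$ in the osculating plane), so that the normal curvature in the direction making the angle $\theta$ with the $x$-axis is $r\cos^2\theta + 2s\sin\theta\cos\theta + t\sin^2\theta$. The function names and the sample values of $r, s, t$ are hypothetical.

```python
import math


def classify_point(r, s, t):
    """Type of the osculating paraboloid from the sign of rt - s^2."""
    d = r * t - s * s
    if d > 0:
        return "elliptic"
    if d < 0:
        return "hyperbolic"
    return "parabolic"


def normal_curvature(r, s, t, theta):
    """Normal curvature at O in the direction (cos theta, sin theta)."""
    c, sn = math.cos(theta), math.sin(theta)
    return r * c * c + 2.0 * s * sn * c + t * sn * sn


def indicatrix_distance_squared(r, s, t, theta, h):
    """Squared distance from O to the curve (5.47) along the direction theta."""
    k = normal_curvature(r, s, t, theta)
    if k == 0:
        return math.inf              # asymptotic direction: the curve is not met
    return 2.0 * h / abs(k)


# Assumed sample values: r = 2, s = 0, t = 0.5 and h = 1.
d2_x = indicatrix_distance_squared(2.0, 0.0, 0.5, 0.0, 1.0)
d2_y = indicatrix_distance_squared(2.0, 0.0, 0.5, math.pi / 2, 1.0)
print(classify_point(2.0, 0.0, 0.5), d2_x, d2_y)
```

In both directions the squared distance equals $2h$ times the radius of normal curvature, so the constant of proportionality is $2h$.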

The Dupin indicatrix is a curve of second order on the plane, and thus it has some special directions (axes) and special values characterizing the curve. The special directions are known as principal directions. In the next section we consider this from another vantage point.

5.9 The Principal Curvatures of a Surface

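Let us discuss in more detail the properties of a surface connected with its second fundamental form. We now denote the coordinates without indices, using $u = u^1$, $v = u^2$. We also use subscripts $u, v$ to indicate partial differentiation:

\[ \mathbf{r}_1 = \mathbf{r}_u = \frac{\partial \mathbf{r}}{\partial u}, \qquad \mathbf{r}_2 = \mathbf{r}_v = \frac{\partial \mathbf{r}}{\partial v}, \qquad \mathbf{n}_u = \frac{\partial \mathbf{n}}{\partial u}, \qquad \mathbf{n}_v = \frac{\partial \mathbf{n}}{\partial v}. \]

The second fundamental form is

\[ L \, du^2 + 2M \, du \, dv + N \, dv^2 = -d\mathbf{r} \cdot d\mathbf{n}, \]

where

\[ L = \mathbf{r}_{uu} \cdot \mathbf{n}, \qquad M = \mathbf{r}_{uv} \cdot \mathbf{n}, \qquad N = \mathbf{r}_{vv} \cdot \mathbf{n}. \]

Consider the first differential of $\mathbf{n}$ at a point $P$:

\[ d\mathbf{n} = \mathbf{n}_u \, du + \mathbf{n}_v \, dv. \]

Since $\mathbf{n}$ is a unit vector, its differential $d\mathbf{n}$ is orthogonal to $\mathbf{n}$ and thus lies in the osculating plane to the surface at $P$. We know that the differential $d\mathbf{r} = \mathbf{r}_u \, du + \mathbf{r}_v \, dv$ also lies in the plane osculating at $P$. Let us consider the relation between the differentials $d\mathbf{r}$ and $d\mathbf{n}$ with respect to the variables $du, dv$ now considered as independent variables. It is clearly a linear correspondence $d\mathbf{r} \to d\mathbf{n}$. Thus it defines a tensor $\mathbf{A}$ in two-dimensional space such that

\[ d\mathbf{n} = \mathbf{A} \cdot d\mathbf{r}. \]

This tensor is completely defined by its values $\mathbf{n}_u = \mathbf{A} \cdot \mathbf{r}_u$ and $\mathbf{n}_v = \mathbf{A} \cdot \mathbf{r}_v$.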

Lemma 5.1. The tensor A is symmetric.

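Proof. It is enough to establish the equality

\[ \mathbf{x}_1 \cdot (\mathbf{A} \cdot \mathbf{x}_2) = \mathbf{x}_2 \cdot (\mathbf{A} \cdot \mathbf{x}_1) \]

for a pair of linearly independent vectors $(\mathbf{x}_1, \mathbf{x}_2)$. To show symmetry of $\mathbf{A}$ consider

\[ \mathbf{r}_u \cdot (\mathbf{A} \cdot \mathbf{r}_v) = \mathbf{r}_u \cdot \mathbf{n}_v. \]

Similarly

\[ \mathbf{r}_v \cdot (\mathbf{A} \cdot \mathbf{r}_u) = \mathbf{r}_v \cdot \mathbf{n}_u. \]

The symmetry of $\mathbf{A}$ follows from the identity

\[ \mathbf{r}_u \cdot \mathbf{n}_v = \mathbf{r}_v \cdot \mathbf{n}_u; \]

this is derived by differentiating the identity $\mathbf{n} \cdot \mathbf{r}_u = 0$ with respect to $v$, then differentiating $\mathbf{n} \cdot \mathbf{r}_v = 0$ with respect to $u$ and eliminating the term containing $\mathbf{r}_{uv}$.

We know that a symmetric tensor has real eigenvalues; in this case there are no more than two eigenvalues $\lambda_1$ and $\lambda_2$ to which there correspond mutually orthogonal eigenvectors $\mathbf{x}_1, \mathbf{x}_2$, respectively (if $\lambda_1 \neq \lambda_2$).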

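Let us find the equation for the eigenvalues. A vector in the osculating plane can be represented as $x \mathbf{r}_u + y \mathbf{r}_v$ since $(\mathbf{r}_u, \mathbf{r}_v)$ is a basis in it. By definition of $\mathbf{A}$ we have

\[ \mathbf{A} \cdot (x \mathbf{r}_u + y \mathbf{r}_v) = x \mathbf{n}_u + y \mathbf{n}_v. \]

On the other hand, $x \mathbf{r}_u + y \mathbf{r}_v \neq 0$ is an eigenvector if there exists $\lambda$ such that

\[ \mathbf{A} \cdot (x \mathbf{r}_u + y \mathbf{r}_v) = \lambda (x \mathbf{r}_u + y \mathbf{r}_v). \]

Thus, for the same vector we get

\[ x \mathbf{n}_u + y \mathbf{n}_v = \lambda (x \mathbf{r}_u + y \mathbf{r}_v). \tag{5.48} \]

We see that an eigenvector is defined by the same condition as the principal directions of the previous section, since this equation means that we find the direction in which $d\mathbf{r}$ and $d\mathbf{n}$ are parallel.

From the last vector equation, let us derive scalar equations. For this, dot multiply (5.48) first by $\mathbf{r}_u$, then by $\mathbf{r}_v$. By (5.32) we have

\[ L = -\mathbf{n}_u \cdot \mathbf{r}_u, \qquad M = -\mathbf{n}_u \cdot \mathbf{r}_v, \qquad N = -\mathbf{n}_v \cdot \mathbf{r}_v, \]

hence

\[
\begin{aligned}
-Lx - My &= \lambda (Ex + Fy), \\
-Mx - Ny &= \lambda (Fx + Gy).
\end{aligned}
\]

Rewriting this as

\[
\begin{aligned}
(L + \lambda E) x + (M + \lambda F) y &= 0, \\
(M + \lambda F) x + (N + \lambda G) y &= 0,
\end{aligned}
\tag{5.49}
\]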

we have a homogeneous linear system of algebraic equations. This system has nontrivial solutions when its determinant vanishes:

\[ (EG - F^2) \lambda^2 - (2MF - EN - LG) \lambda + (LN - M^2) = 0. \tag{5.50} \]

In the general case, there are two roots $\lambda_1$ and $\lambda_2$, to which there correspond two directions previously called the principal directions. They can be found by elimination of $\lambda$ from (5.49):

\[
\begin{vmatrix}
y^2 & -xy & x^2 \\
E & F & G \\
L & M & N
\end{vmatrix}
= 0.
\]

By this we define two mutually orthogonal directions $x : y$. If $\lambda_1 = \lambda_2$, then all the directions are principal so we can choose any two mutually orthogonal directions and regard them as principal.

Now we would like to consider another question that will bring us to the same equations. It is the question of finding the extreme normal curvatures at a point of a surface. We have established that in the direction $x : y$ the normal curvature of a surface is given by the formula

\[ k = \frac{Lx^2 + 2Mxy + Ny^2}{Ex^2 + 2Fxy + Gy^2}. \]

Because of the second-order homogeneity in $x, y$ of the numerator and denominator we can reformulate the problem of finding extreme curvatures as the problem of finding extremal points of the function

\[ Lx^2 + 2Mxy + Ny^2 \]

under the restriction

\[ Ex^2 + 2Fxy + Gy^2 = 1. \]

Applying the theory of Lagrange multipliers to this, we should find the extreme points of the function

\[ Lx^2 + 2Mxy + Ny^2 - k (Ex^2 + 2Fxy + Gy^2), \]

which leads to the equations

\[
\begin{aligned}
(L - kE) x + (M - kF) y &= 0, \\
(M - kF) x + (N - kG) y &= 0.
\end{aligned}
\tag{5.51}
\]


The systems (5.49) and (5.51) coincide if we put $k = -\lambda$, which means that the principal directions and the directions found here are the same. Moreover, it is seen that $\lambda_1$ and $\lambda_2$ found as the eigenvalues of the tensor $\mathbf{A}$ give the extreme curvatures of the surface at a point, which are $k_1 = -\lambda_1$ and $k_2 = -\lambda_2$. Remembering that they are the roots of the polynomial (5.50) and using the Viète theorem, we get

\[ k_1 + k_2 = \frac{EN - 2FM + GL}{EG - F^2}, \qquad k_1 k_2 = \frac{LN - M^2}{EG - F^2}. \]

We have met these expressions in the Gaussian curvature $K = k_1 k_2$ and the mean curvature $H = (k_1 + k_2)/2$.

We see that both are expressed through the coefficients of the first and second fundamental forms of the surface. There is a famous theorem due to Gauss that $K$ can be expressed only in terms of the coefficients of the first fundamental form.
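Since $k_1$ and $k_2$ are the roots of a quadratic with sum $2H$ and product $K$, they can be computed as $H \pm \sqrt{H^2 - K}$. The sketch below does this and compares the result with the normal curvature formula. The function names and the sample coefficients ($E = 1$, $F = 1$, $G = 2$, $L = 1$, $M = N = 0$, a case with non-orthogonal coordinate lines) are assumptions for illustration.

```python
import math


def principal_curvatures(E, F, G, L, M, N):
    """Roots k1 <= k2 of (EG - F^2) k^2 - (EN - 2FM + GL) k + (LN - M^2) = 0."""
    det = E * G - F * F
    H = 0.5 * (E * N - 2.0 * F * M + G * L) / det
    K = (L * N - M * M) / det
    root = math.sqrt(max(H * H - K, 0.0))   # H^2 - K >= 0 in exact arithmetic
    return H - root, H + root


def normal_curvature(E, F, G, L, M, N, x, y):
    """Normal curvature in the direction x : y."""
    return (L * x * x + 2.0 * M * x * y + N * y * y) / (E * x * x + 2.0 * F * x * y + G * y * y)


# Assumed sample coefficients with F != 0.
coeffs = (1.0, 1.0, 2.0, 1.0, 0.0, 0.0)
k1, k2 = principal_curvatures(*coeffs)
k_a = normal_curvature(*coeffs, 0.0, 1.0)    # direction 0 : 1
k_b = normal_curvature(*coeffs, -2.0, 1.0)   # direction -2 : 1
print(k1, k2, k_a, k_b)
```

For these coefficients the directions $0 : 1$ and $-2 : 1$ satisfy the orthogonality condition $E x_1 x_2 + F (x_1 y_2 + x_2 y_1) + G y_1 y_2 = 0$, and the normal curvatures along them reproduce the two roots.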

When the principal curvatures and their directions are known, the Euler formula gives the normal curvature in any direction on the surface that makes the angle $\varphi$ with the first principal direction (the one corresponding to $k_1$) at the same point:

\[ k_\varphi = k_1 \cos^2 \varphi + k_2 \sin^2 \varphi. \]

The principal directions of a surface define the lines of curvature of the surface. A line in the surface is called a line of curvature if at each point its tangent is directed along one of the principal directions at this point of the surface.

Two lines of curvature pass through each point. Denoting their directions by $du : dv$ and $\delta u : \delta v$ we have two equations, the first of which means orthogonality of the directions and the second their conjugation:

\[
\begin{aligned}
E \, du \, \delta u + F (du \, \delta v + dv \, \delta u) + G \, dv \, \delta v &= 0, \\
L \, du \, \delta u + M (du \, \delta v + dv \, \delta u) + N \, dv \, \delta v &= 0.
\end{aligned}
\]

It is convenient to take the family of the lines of curvature of the surface as the coordinate lines of the surface. For these curvilinear coordinates $F = M = 0$.

Two surfaces are called isometric if there is a one-to-one correspondence between the points of the surfaces such that corresponding curves in the surfaces have the same length.

A planar surface and a cylindrical surface provide an example of isometric surfaces since we can develop the cylindrical surface over the plane. The correspondence between points is defined by coincidence of the points in this developing.

We can refer to local isometry of a surface at point $A_1$ to another surface at point $B_2$ if there are neighborhoods of these points on the surfaces that are isometric.

Two smooth surfaces are locally isometric if and only if there are parametrizations of the surfaces at corresponding points such that the coefficients of the first fundamental forms of the surfaces coincide:

\[ E_1 = E_2, \qquad F_1 = F_2, \qquad G_1 = G_2. \]

A surface is developable if it is locally isometric to the plane at each point. It turns out that a surface is developable if and only if its Gaussian curvature is zero at each point.

Surfaces with zero mean curvature appear in the problem of finding spatial surfaces of minimal area that have given boundaries. This popular physics problem describes the shape assumed by a soap film on a wire frame. The energy of such a film is proportional to its area, and thus the form actually taken by such a film is the surface of minimum area.

5.10 Surfaces of Revolution

Surfaces of revolution are quite frequent in practice. Suppose that the surface is formed by rotation of the profile curve

\[ x = \varphi(u), \qquad z = \psi(u), \tag{5.52} \]

in the $xz$-plane about the $z$-axis (Fig. 5.3). When we fix $u$ we get a circle having center on the $z$-axis; it is called a parallel. To define a point on the parallel we introduce the angle of rotation $v$ from the $xz$-plane. When we fix $v$ we get a meridian, a curve congruent to the initial curve (5.52).

It is easy to see that the equations of the surface of revolution corresponding to (5.52) are

\[ x = \varphi(u) \cos v, \qquad y = \varphi(u) \sin v, \qquad z = \psi(u). \]

Let us find the coefficients of the first fundamental form of the surface:

\[
\begin{aligned}
E &= x_u^2 + y_u^2 + z_u^2 = (\varphi' \cos v)^2 + (\varphi' \sin v)^2 + \psi'^2 = \varphi'^2 + \psi'^2, \\
F &= x_u x_v + y_u y_v + z_u z_v = (\varphi' \cos v)(-\varphi \sin v) + (\varphi' \sin v)(\varphi \cos v) = 0, \\
G &= x_v^2 + y_v^2 + z_v^2 = (-\varphi \sin v)^2 + (\varphi \cos v)^2 = \varphi^2.
\end{aligned}
\]


Fig. 5.3 Surface of revolution about the z-axis.

Note that $F = 0$ means the orthogonality of the parametrization net. Thus the first fundamental form is

\[ (ds)^2 = \left( \varphi'^2 + \psi'^2 \right) du^2 + \varphi^2 \, dv^2. \]

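For the components of the second fundamental form we have

\[
L = \frac{\mathbf{r}_{uu} \cdot (\mathbf{r}_u \times \mathbf{r}_v)}{|\mathbf{r}_u \times \mathbf{r}_v|}
= \frac{1}{\sqrt{EG - F^2}}
\begin{vmatrix}
x_{uu} & y_{uu} & z_{uu} \\
x_u & y_u & z_u \\
x_v & y_v & z_v
\end{vmatrix}
= \frac{\psi'' \varphi' - \varphi'' \psi'}{\sqrt{\varphi'^2 + \psi'^2}},
\]

\[
M = \frac{\mathbf{r}_{uv} \cdot (\mathbf{r}_u \times \mathbf{r}_v)}{|\mathbf{r}_u \times \mathbf{r}_v|}
= \frac{1}{\sqrt{EG - F^2}}
\begin{vmatrix}
x_{uv} & y_{uv} & z_{uv} \\
x_u & y_u & z_u \\
x_v & y_v & z_v
\end{vmatrix}
= 0,
\]

and

\[
N = \frac{\mathbf{r}_{vv} \cdot (\mathbf{r}_u \times \mathbf{r}_v)}{|\mathbf{r}_u \times \mathbf{r}_v|}
= \frac{1}{\sqrt{EG - F^2}}
\begin{vmatrix}
x_{vv} & y_{vv} & z_{vv} \\
x_u & y_u & z_u \\
x_v & y_v & z_v
\end{vmatrix}
= \frac{\psi' \varphi}{\sqrt{\varphi'^2 + \psi'^2}}.
\]

So the second fundamental form is

\[ -d\mathbf{n} \cdot d\mathbf{r} = \frac{\psi'' \varphi' - \varphi'' \psi'}{\sqrt{\varphi'^2 + \psi'^2}} \, du^2 + \frac{\psi' \varphi}{\sqrt{\varphi'^2 + \psi'^2}} \, dv^2. \]

We see that $M = 0$, which means the coordinate lines are conjugate.⁶ Thus the coordinate lines of a surface of revolution are the lines of curvature.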
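The coefficients of both fundamental forms of a surface of revolution depend only on the profile functions and their derivatives, so they are easy to tabulate. The sketch below is illustrative: the function name, its argument list, and the choice of the catenoid ($\varphi = \cosh u$, $\psi = u$) as test profile are assumptions, and the formulas are used with $\varphi > 0$ as in the derivation above.

```python
import math


def revolution_forms(phi, dphi, ddphi, dpsi, ddpsi):
    """Return (E, F, G, L, M, N) from profile values at one parameter value u.

    phi is the profile radius, dphi and ddphi its first and second derivatives,
    dpsi and ddpsi the first and second derivatives of the height psi.
    """
    w = math.sqrt(dphi * dphi + dpsi * dpsi)
    E, F, G = w * w, 0.0, phi * phi
    L = (ddpsi * dphi - ddphi * dpsi) / w
    M = 0.0
    N = dpsi * phi / w
    return E, F, G, L, M, N


# Assumed test profile: the catenoid phi = cosh u, psi = u.
u = 0.7
E, F, G, L, M, N = revolution_forms(math.cosh(u), math.sinh(u), math.cosh(u), 1.0, 0.0)
mean_curvature = 0.5 * (L * G - 2.0 * M * F + N * E) / (E * G - F * F)
print(E, G, L, N, mean_curvature)
```

For the catenoid one finds $E = G = \cosh^2 u$, $L = -1$, $N = 1$, so the mean curvature vanishes at every point. This is the surface of revolution taken by a soap film spanning two coaxial rings.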

Exercise 5.37. Find the principal curvatures of the surface of revolution.

Exercise 5.38. Find the first and second fundamental forms for (a) the plane, and (b) the sphere.

Exercise 5.39. Find the first and second fundamental forms for each of the following paraboloids: (a) $z = x^2 + y^2$, (b) $z = x^2 - y^2$, (c) $z = x^2$. Find $H$ and $K$ for each of these.

5.11 Natural Equations of a Curve

Suppose we are given functions $k_1(s)$ and $k_2(s)$ of a natural parameter $s$, continuous on a segment $[s_0, s_1]$, and that $k_1(s)$ is a positive function. We are interested in whether there exists a space curve that has principal curvature $k_1(s)$ and torsion $k_2(s)$. Geometrical considerations show that if such a curve exists then it is uniquely defined up to rigid motions. This means that if we take two such curves, place them in space, and then shift and rotate one of them so that their initial points and moving trihedra at these points coincide, then all points of the curves coincide.

Now we will show that such a curve exists. Thus the form of the curve is defined by $k_1(s)$ and $k_2(s)$ uniquely; this prompts us to call the two functions $k_1 = k_1(s)$ and $k_2 = k_2(s)$ the natural equations of the curve.

Let us demonstrate that from $k_1(s)$ and $k_2(s)$ we can find the needed curve. For this, we consider a vector system of differential equations that mimics the formula for a tangent vector and the Frenet–Serret equations:

\[
\begin{aligned}
\frac{d\mathbf{r}}{ds} &= \mathbf{x}, \\
\frac{d\mathbf{x}}{ds} &= k_1(s) \, \mathbf{y}, \\
\frac{d\mathbf{y}}{ds} &= -k_1(s) \, \mathbf{x} - k_2(s) \, \mathbf{z}, \\
\frac{d\mathbf{z}}{ds} &= k_2(s) \, \mathbf{y}.
\end{aligned}
\tag{5.53}
\]

⁶ To any symmetric quadratic form there corresponds something like an inner product of vectors, and hence something akin to orthogonality. Conjugate directions at a point are defined as the directions on the surface $du : dv$ and $\delta u : \delta v$ for which $L \, du \, \delta u + M (du \, \delta v + dv \, \delta u) + N \, dv \, \delta v = 0$. Conjugate coordinate lines are those which are conjugate at each point.

Written in Cartesian components, this is a system of 12 linear ordinary differential equations. ODE theory states that if we define the initial values for all the unknowns (the Cauchy problem), we will have a unique solution to this system.

We can choose arbitrary initial conditions for $\mathbf{r}(s_0)$. The initial values $\mathbf{x}(s_0), \mathbf{y}(s_0), \mathbf{z}(s_0)$ must constitute an arbitrary right-handed orthonormal trihedron $(\mathbf{x}_0, \mathbf{y}_0, \mathbf{z}_0)$.

Thus on $[s_0, s_1]$ there exists a unique solution $(\mathbf{r}(s), \mathbf{x}(s), \mathbf{y}(s), \mathbf{z}(s))$ of the equations (5.53), satisfying some fixed initial conditions. It can be shown that because of skew symmetry of the matrix of the three last equations of (5.53),

\[
\begin{vmatrix}
0 & k_1(s) & 0 \\
-k_1(s) & 0 & -k_2(s) \\
0 & k_2(s) & 0
\end{vmatrix}
= 0,
\]

the vectors $(\mathbf{x}(s), \mathbf{y}(s), \mathbf{z}(s))$ constitute an orthonormal frame of the same orientation as the initial one on the whole segment $[s_0, s_1]$.

We can treat $\mathbf{r} = \mathbf{r}(s)$ as the equation of the needed curve in natural parametrization. Then the first of the equations (5.53) states that $\mathbf{x}(s)$ is its unit tangent. Comparing the equation $d\mathbf{x}/ds = k_1(s) \mathbf{y}$ with the first of the Frenet–Serret equations for a curve $\mathbf{r} = \mathbf{r}(s)$, we see that $\mathbf{y}(s)$ is its principal normal and $k_1(s)$ is its principal curvature. Similarly we find that $\mathbf{z}(s)$ is the binormal of the curve and $k_2(s)$ is its torsion. This completes the necessary reasoning.
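The existence argument is constructive in the sense that (5.53) can be integrated numerically. The following sketch uses the classical fourth-order Runge–Kutta scheme, which is a choice made here and not prescribed by the text. The function names, the step count, and the test data ($k_1 = 1$, $k_2 = 0$, for which the curve should be a unit circle closing after length $2\pi$) are likewise assumptions.

```python
import math


def rhs(s, state, k1, k2):
    """Right-hand side of (5.53); state = (r, x, y, z), each a 3-tuple."""
    r, x, y, z = state
    a, b = k1(s), k2(s)
    return (x,
            tuple(a * yi for yi in y),
            tuple(-a * xi - b * zi for xi, zi in zip(x, z)),
            tuple(b * yi for yi in y))


def shifted(state, deriv, h):
    """Return state + h * deriv, componentwise."""
    return tuple(tuple(u + h * v for u, v in zip(su, dv))
                 for su, dv in zip(state, deriv))


def integrate(k1, k2, s0, s1, state, steps=2000):
    """Integrate (5.53) from s0 to s1 by the classical Runge-Kutta method."""
    h = (s1 - s0) / steps
    s = s0
    for _ in range(steps):
        d1 = rhs(s, state, k1, k2)
        d2 = rhs(s + h / 2, shifted(state, d1, h / 2), k1, k2)
        d3 = rhs(s + h / 2, shifted(state, d2, h / 2), k1, k2)
        d4 = rhs(s + h, shifted(state, d3, h), k1, k2)
        total = tuple(tuple(p + 2 * q + 2 * m + w for p, q, m, w in zip(*rows))
                      for rows in zip(d1, d2, d3, d4))
        state = shifted(state, total, h / 6)
        s += h
    return state


# Assumed test: constant curvature 1 and zero torsion give a unit circle.
start = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
r_end, x_end, y_end, z_end = integrate(lambda s: 1.0, lambda s: 0.0,
                                       0.0, 2.0 * math.pi, start)
print(r_end, x_end)
```

After one full turn the end point returns to the starting point and the tangent returns to its initial direction, up to the discretization error. Nonzero constant $k_2$ produces a circular helix.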

For a plane curve the natural equations reduce to a single equation for the curvature. In the plane we associate the curvature with an algebraic sign. The condition of positivity of $k(s)$ is not necessary.


Natural equation of a curve in the plane

A plane curve has zero torsion. Let us consider how to reconstruct a curve if its curvature $k(s)$ is a given function of the length parameter $s$. We fix the initial point of the curve, corresponding to $s = s_0$, by the equation

\[
\mathbf{r}(s_0) = \mathbf{r}_0. \tag{5.54}
\]

We must also fix the direction of the curve at this point. Here it is useful to introduce the angle $\varphi = \varphi(s)$, measured between the unit tangent to the curve and the vector $\mathbf{i}$, where $\mathbf{i}, \mathbf{j}$ is an orthonormal basis in the plane. In this way the unit tangent is given by

\[
\boldsymbol{\tau} = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi.
\]

So to define the initial direction of the curve we introduce

\[
\varphi(s_0) = \varphi_0. \tag{5.55}
\]

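The equation relating $\boldsymbol{\tau}$ and $\boldsymbol{\nu}$, which is $d\boldsymbol{\tau}/ds = k(s)\boldsymbol{\nu}$, becomes

\[
(-\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi)\,\frac{d\varphi}{ds} = k(s)\,\boldsymbol{\nu}.
\]

The vector $-\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi$ has unit magnitude and is orthogonal to $\boldsymbol{\tau}$. Hence it is parallel to $\boldsymbol{\nu}$, and we conclude that

\[
|k(s)| = \left|\frac{d\varphi}{ds}\right|.
\]

Taking

\[
k(s) = \frac{d\varphi}{ds},
\]

we define the sign of $k(s)$ in the standard manner, where $k(s)$ is positive for points at which the curve is concave upwards.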

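We can integrate the last equation and obtain

\[
\varphi(s) - \varphi_0 = \int_{s_0}^{s} k(t)\,dt.
\]

Knowing $\varphi(s)$ we can integrate the equation for the unit tangent, which is $d\mathbf{r}/ds = \boldsymbol{\tau}$, rewritten as

\[
\frac{d\mathbf{r}}{ds} = \mathbf{i}\cos\varphi(s) + \mathbf{j}\sin\varphi(s).
\]

Integration with respect to $s$ gives us

\[
\mathbf{r}(s) - \mathbf{r}_0 = \int_{s_0}^{s} \left[\mathbf{i}\cos\varphi(t) + \mathbf{j}\sin\varphi(t)\right] dt. \tag{5.56}
\]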


By (5.56), the function $k(s)$ and the initial values (5.54) and (5.55) uniquely define the needed curve $\mathbf{r} = \mathbf{r}(s)$. Changing the initial values of the curve, we get a family of curves with the same $k(s)$ such that all curves have the same shape but different positions with respect to the coordinate axes.
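The reconstruction just described can be tried numerically. The following is only a sketch under stated assumptions: the function name `curve_from_curvature`, the midpoint rule, and the step count are illustrative choices that do not come from the text. It integrates $d\varphi/ds = k(s)$ and then $d\mathbf{r}/ds = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi$ from the initial data (5.54) and (5.55).

```python
import math

def curve_from_curvature(k, s0, s1, r0=(0.0, 0.0), phi0=0.0, n=2000):
    """Approximate r(s) on [s0, s1] from the curvature k(s) by the midpoint rule.

    Returns a list of (x, y) points; r0 and phi0 play the roles of (5.54), (5.55).
    The name, the quadrature rule and n are illustrative assumptions.
    """
    h = (s1 - s0) / n
    x, y = r0
    phi = phi0
    points = [(x, y)]
    for i in range(n):
        dphi = h * k(s0 + (i + 0.5) * h)   # increment of phi over one step
        phi_mid = phi + 0.5 * dphi         # tangent angle near the step midpoint
        x += h * math.cos(phi_mid)         # dr/ds = i cos(phi) + j sin(phi)
        y += h * math.sin(phi_mid)
        phi += dphi
        points.append((x, y))
    return points

# Constant curvature 1/R should reproduce a circle of radius R.
R = 2.0
circle = curve_from_curvature(lambda s: 1.0 / R, 0.0, 2.0 * math.pi * R)
print(circle[-1])  # end point after one full circumference, close to the start
```

Changing `r0` or `phi0` only translates or rotates the resulting point list, which is the numerical counterpart of the family of congruent curves mentioned above.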

Exercise 5.40. Given the curvature $k(s) = (as)^{-1}$ of a plane curve, find the curve.

5.12 A Word About Rigor

The main goal of this book is the presentation of those tools and formulas of tensor analysis that are needed for applications. Our approach is typical of engineering books; we seldom offer clear statements of the assumptions that guarantee validity of a formula, supposing instead that in applications all the functions, curves, surfaces, etc., should be sufficiently smooth for our purposes. This approach is probably best for any practitioner who must simply get his or her hands on a formula. However, in physics we see surfaces that do not necessarily bound real-world bodies; purely mathematical surfaces can occur, such as those describing the energy of a two-parameter mechanical system. Such a surface can be quite complex, and a physicist may be largely interested in its singular points since these represent the states at which the system changes its behavior crucially. A reader interested in such applications would do well to study a more sophisticated treatment where each major result is stated as a theorem under the weakest possible hypotheses. Many such treatments employ much more advanced (e.g., topological) tools.

However, the reader should be aware that even on the elementary level of our treatment there are questions that require additional explanation. How, for example, should we define the length of a curve or the area of a surface?

Early in our education we learn how to measure the length of a segment or of a more complex set on a straight line. We also learn how to define the circumference of a circle. For this we inscribe an equilateral triangle and calculate its perimeter. Doubling the number of sides of the inscribed triangle, we get an inscribed polygon whose perimeter approximates the circumference. Repeating the doubling indefinitely and calculating the limit of the perimeters, we define this limit as the needed length. Of course, a limit passage of this type is used when we derive the formula

\[
\int_a^b |\mathbf{r}'(t)|\,dt \tag{5.57}
\]

for the length of a curve. Here we divide the segment $[a, b]$ into small pieces by the points $t_0 = a, t_1, \ldots, t_n = b$, and draw the vector

\[
\Delta\mathbf{r}(t_{i+1}) = \mathbf{r}(t_{i+1}) - \mathbf{r}(t_i).
\]

In this way we “inscribe” a polygon into the curve. For a continuously differentiable vector function $\mathbf{r}(t)$, the length of a side of a polygon can be found by the mean value theorem:

\[
|\mathbf{r}(t_{i+1}) - \mathbf{r}(t_i)| = |\mathbf{r}'(\xi_{i+1})|\,(t_{i+1} - t_i),
\]

where $\xi_{i+1}$ is a point in $[t_i, t_{i+1}]$. Thus the perimeter of the inscribed polygon that approximates the length of the curve is

\[
\sum_{i=0}^{n-1} |\mathbf{r}'(\xi_{i+1})|\,(t_{i+1} - t_i).
\]

We obtain (5.57) from a limit passage in which the number of partition points tends to infinity while the maximum length of a partition segment tends to zero.

Although this might seem fine, underlying the process is the notion of approximating a small piece of the curve by a many-sided polygon. Our “intuition” tells us that the smaller its sides, the closer is the polygon to the curve, hence its perimeter should approximate the curve length more and more closely. But in elementary geometry one may “demonstrate” that the sum of the legs of any right triangle is equal to its hypotenuse. The construction is as follows. Suppose we are given a right triangle whose legs have lengths $a$ and $b$. Let us form a curve that looks like the toothed edge of a saw, by placing along the hypotenuse of the given triangle a sequence of small triangles each similar to the given triangle (Fig. 5.4). It is clear that the sum of all the legs of the sawteeth does not depend on the number of teeth and equals $a + b$. But as the number of teeth tends to infinity, the saw edge “coincides” with the hypotenuse of the original triangle. Hence the limiting length is equal to the length of the hypotenuse $c$, and we have $c = a + b$. So we come to understand that we cannot approximate a curve by a polygon in an arbitrary fashion. Similar remarks apply to the approximation of surfaces to calculate area.


Fig. 5.4 Fallacious estimation of the length of a line.

These remarks should serve as a warning that to apply formulas correctly it is often necessary to understand the restrictions under which they were derived.
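The contrast between the two limit passages can be seen in a short computation. This is a sketch only; the helper names are hypothetical. The first function sums the sides of a polygon inscribed in a curve; the other two give the total leg length of the saw edge and the largest distance from the saw edge to the hypotenuse.

```python
import math

def inscribed_polygon_length(r, a, b, n):
    """Perimeter of the polygon with vertices r(t_0), ..., r(t_n) on [a, b]."""
    pts = [r(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def sawtooth_length(a, b, teeth):
    """Total length of the saw edge: each tooth has legs a/teeth and b/teeth."""
    return teeth * (a / teeth + b / teeth)

def sawtooth_height(a, b, teeth):
    """Largest distance from the saw edge to the hypotenuse (height of one tooth)."""
    return (a / teeth) * (b / teeth) / math.hypot(a / teeth, b / teeth)

def unit_circle(t):
    return (math.cos(t), math.sin(t))

for n in (6, 60, 600):
    print(n, inscribed_polygon_length(unit_circle, 0.0, 2.0 * math.pi, n))
for teeth in (1, 10, 1000):
    print(teeth, sawtooth_length(3.0, 4.0, teeth), sawtooth_height(3.0, 4.0, teeth))
```

For the circle the perimeters approach $2\pi$, while for the saw the length stays at $a + b$ even though the distance to the hypotenuse tends to zero: closeness of the points alone does not control the length.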

5.13 Conclusion

Differential geometry is a well developed subject with many results and formulas. We have presented the main technical formulas that are used in applications. The interested reader could go on to study volumes devoted to the theory of curves, surfaces, manifolds, etc. Such an extensive undertaking falls outside the scope of this book.

5.14 Problems

5.1 Find the parametrization and singular points of the plane curve

\[
|x|^{2/3} + |y|^{2/3} = |a|^{2/3},
\]

where $a$ is a parameter.

5.2 Find the singular points of the plane curve given by the equations

x = a(t− sin t), y = a(1 − cos t),

where a is a parameter.

5.3 Find the equation of the tangent to the curve

\[
x^2 + y^2 + z^2 = 1, \qquad x^2 + y^2 = x
\]

at the point (0, 0, 1).


5.4 Find the length of the astroid

\[
x = a\cos^3 t, \qquad y = a\sin^3 t.
\]

5.5 Find the length of the part of the cycloid

x = a(t− sin t), y = a(1 − cos t)

defined by t ∈ [0, 2π].

5.6 Find the length of the cardioid given in polar coordinates by

ρ = 2a(1 − cosφ).

See Fig. 5.5.

[Figure: cardioid in the $x$, $y$ plane with origin 0; the axis marks are $2a$ and $-4a$.]

Fig. 5.5 Cardioid.

5.7 Find the length of the portion of the curve

x = a cosh t, y = a sinh t, z = at

between the points t = 0 and t = T .

5.8 Find the curvature of the curve

\[
x = t - \sin t, \qquad y = 1 - \cos t, \qquad z = 4\sin\frac{t}{2}.
\]

5.9 Find the curvature and torsion of the curve

x = a cosh t, y = a sinh t, z = at

at an arbitrary point.


5.10 Find the torsion of the curve

x = a cosh t cos t, y = a cosh t sin t, z = at

at an arbitrary point.

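5.11 Let a triad of orthonormal vectors $\mathbf{e}_1(s)$, $\mathbf{e}_2(s)$, $\mathbf{e}_3(s)$ (i.e., vectors satisfying $\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$) be given along a curve. Show that

\[
\frac{d}{ds}\mathbf{e}_i(s) = \mathbf{d} \times \mathbf{e}_i(s),
\]

where

\[
\mathbf{d} = -\frac{1}{2}\,(\mathbf{e}_i' \times \mathbf{e}_i).
\]

Note that these formulas are analogous to the Frenet–Serret equations with the Darboux vector $\boldsymbol{\delta}$.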

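5.12 Let $\mathbf{Q}(s)$ be an orthogonal tensor given along a curve. Verify the formulas

\[
\mathbf{Q}' = \mathbf{d} \times \mathbf{Q}, \qquad \mathbf{d} = -\frac{1}{2}\,(\mathbf{Q}' \times \mathbf{Q}^T)_\times.
\]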

5.13 Find the second quadratic form of the surface given by the equations

x = u cos v, y = u sin v, z = v.

5.14 A surface is defined by the equation $z = f(x, y)$. Show that the coefficients of its second principal form are

\[
b_{11} = \frac{f_{xx}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad
b_{12} = b_{21} = \frac{f_{xy}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad
b_{22} = \frac{f_{yy}}{\sqrt{1 + f_x^2 + f_y^2}}.
\]

5.15 A surface is defined by the equation $z = f(x, y)$, where $f$ satisfies Laplace’s equation $\nabla^2 f = 0$. Demonstrate that its Gaussian curvature satisfies $K \le 0$.

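5.16 Show that the mean curvature of the surface $z = f(x, y)$ is given by

\[
H = \operatorname{div}\left(\frac{\operatorname{grad} f}{\sqrt{1 + |\operatorname{grad} f|^2}}\right),
\]

where

\[
\operatorname{grad} = \mathbf{i}_1\frac{\partial}{\partial x} + \mathbf{i}_2\frac{\partial}{\partial y}.
\]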


5.17 Show that the mean curvature of a surface is given by H = −∇·n/2.

5.18 Demonstrate that ∇ · A = 2Hn where A = E− nn.

5.19 Let X be a second-order tensor. Show that n·(∇×X) = −∇·(n×X).

5.20 Find the Gaussian curvature of a surface given by the relation

z = f(x) + g(y).

5.21 The first principal form of a surface is $A^2\,du^2 + B^2\,dv^2$. Determine its Gaussian curvature.

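5.22 Let $\mathbf{X}$ be a second-order tensor. Prove the following analog of the Gauss–Ostrogradsky theorem on a surface $S$ having boundary contour $\Gamma$:

\[
\int_S \left(\nabla \cdot \mathbf{X} + 2H\,\mathbf{n} \cdot \mathbf{X}\right) dS = \oint_\Gamma \boldsymbol{\nu} \cdot \mathbf{X}\,ds, \tag{5.58}
\]

where $\boldsymbol{\nu}$ is the outward unit normal to $\Gamma$ lying in the tangent plane, i.e., $\boldsymbol{\nu} \cdot \mathbf{n} = 0$. Note that when $S$ is a closed surface,

\[
\int_S \nabla \cdot \mathbf{X}\,dS = -\int_S 2H\,\mathbf{n} \cdot \mathbf{X}\,dS.
\]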

5.23 Prove that (5.58) holds for a tensor field X of any order.

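5.24 Use the solution of the previous problem to prove that

\[
\int_S \left(\nabla \mathbf{X} + 2H\,\mathbf{n}\mathbf{X}\right) dS = \oint_\Gamma \boldsymbol{\nu}\mathbf{X}\,ds,
\]

\[
\int_S \left(\nabla \times \mathbf{X} + 2H\,\mathbf{n} \times \mathbf{X}\right) dS = \oint_\Gamma \boldsymbol{\nu} \times \mathbf{X}\,ds,
\]

and

\[
\int_S \nabla \times (\mathbf{n}\mathbf{X})\,dS = \oint_\Gamma \boldsymbol{\tau}\mathbf{X}\,ds,
\]

where $\boldsymbol{\tau} = \boldsymbol{\nu} \times \mathbf{n}$ is the unit tangent vector to $\Gamma$.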

Chapter 6

Linear Elasticity

In this chapter we apply tensor analysis to linear elasticity. Linear elasticity is a powerful tool of engineering design; using general computer programs, engineers can calculate the strains and stresses within a complex elastic body under load. It is useful to learn the principles behind these calculations. Linear elasticity is based on the ideas of classical mechanics, but its technical tools are those of tensor analysis. It treats the small deformations of elastic bodies described by a linear constitutive equation that relates stresses and strains, extending the elementary form of Hooke’s law for a spring.

Linear elasticity is the first step (an elementary but not easy step) toward nonlinear mechanics. The latter considers other effects in solids, such as heat propagation, deformation due to piezoelectric or magnetic effects, etc. It therefore incorporates thermodynamics and other areas of physics.

The plan of this chapter is to introduce the principal tools and laws of linear elasticity, to formulate and consider some properties of the boundary value problems of elasticity, and to study some variational principles in applied elasticity.

We start with the idea of the stress tensor, which was introduced by Augustin Louis Cauchy (1789–1857).

6.1 Stress Tensor

The notion of stress is decisive in continuum mechanics. It is an extension of the notion of pressure, and is introduced as the ratio of the value of a force distributed over an elementary surface element to the element area. Consider, for example, a long thin cylindrical bar having cross-sectional area $S$ and stretched by a force $f$ which is uniformly distributed over the cylinder faces (Fig. 6.1). This force is equidistributed over any normal cross-section of the bar. The stress $\sigma$ is defined as

σ = f/S.

In elementary physics, the pressure is the only force characteristic at a given point in a liquid or gas. But pressure is not sufficient to describe the action of forces inside a three-dimensional solid body. At a particular point in such a body, we find that the direction of the contact force may not be normal to a given area element. Furthermore, as we change the orientation of the area element, the density of the force acting across the element may change in magnitude and direction.

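[Figure: a bar of cross-sectional area $S$ stretched by the forces $-f$ and $f$ applied at its ends.]

Fig. 6.1 The stress $\sigma = f/S$ in a bar having cross-sectional area $S$ and stretched by a force $f$.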

Forces

Let us discuss the conditions for equilibrium of a deformable body. From classical mechanics, we know that the equilibrium conditions for a rigid body consist of two vector equations. First, the resultant force (i.e., the sum of all forces) acting on the body must be zero. Second, the resultant moment (the sum of the moments of all the forces with respect to some point) must be zero. We write

\[
\sum_k \mathbf{f}_k = \mathbf{0}, \qquad \sum_k (\mathbf{r}_k - \mathbf{r}_0) \times \mathbf{f}_k = \mathbf{0}, \tag{6.1}
\]

where $\mathbf{f}_k$ denotes a force applied to a point located by position vector $\mathbf{r}_k$. The position vector $\mathbf{r}_0$ locates an arbitrary but fixed point with respect to which the moments are taken.
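As a small illustration of (6.1), the sketch below evaluates the resultant force and the resultant moment for a finite set of point forces. The function names and the sample data are hypothetical; the example also shows that when the resultant force vanishes, the resultant moment does not depend on the choice of $\mathbf{r}_0$.

```python
def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def resultant_force(forces):
    """Sum of all forces f_k."""
    return tuple(sum(f[i] for f in forces) for i in range(3))

def resultant_moment(points, forces, r0):
    """Sum of (r_k - r0) x f_k over all applied forces."""
    total = [0.0, 0.0, 0.0]
    for r, f in zip(points, forces):
        arm = tuple(r[i] - r0[i] for i in range(3))
        m = cross(arm, f)
        for i in range(3):
            total[i] += m[i]
    return tuple(total)

# Illustrative data: two upward forces balanced by one downward force.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
forces = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, -2.0)]

print(resultant_force(forces))
print(resultant_moment(points, forces, (0.0, 0.0, 0.0)))
print(resultant_moment(points, forces, (5.0, -3.0, 2.0)))
```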

To define the equilibrium of a deformable body, we apply the equilibrium equations for a rigid body to any portion of the deformable body. We accept this as an axiom, known as the

Solidification principle. In equilibrium, any part of a deformable body obeys the equilibrium equations as if it were a rigid body under the action of (1) all the external forces, and (2) the force reaction imposed by the remainder of the body on the part under consideration.


This principle is not a direct consequence of classical mechanics. Rather, it is a kind of axiom which allows us to apply the results of classical mechanics, obtained for non-deformable objects, to deformable objects. We will use the following terminology. A solid body occupies a certain volume in three-dimensional space. The mapping which takes its material points into the spatial points is called a configuration of the body; roughly speaking, this describes the geometry of the body in space. If the body is not under load, the configuration is termed the initial or reference configuration. For a deformed body in equilibrium under a given load, the configuration is termed the actual configuration.

Let us apply the solidification principle in the actual configuration of a deformable body in equilibrium. We take an arbitrary portion $P$ as shown in Fig. 6.2. Two types of forces act on $P$. First, there are body forces. These act on the interior of $P$ and do not depend on the conditions over the boundary surface of $P$. An example is the gravitational force. Second, there are contact forces. These act on the boundary of $P$ and represent the reaction forces imposed on $P$ by the remainder of the body. They arise as follows. Imagine that we isolate $P$ from the body and replace the effects of the rest of the body by some forces. These reactions are applied only to that part of the boundary of $P$ that comes into contact with the rest of the body (hence the term “contact forces”). In addition, surface forces may act over the external boundary of the entire body. We will refer to these, together with the body forces, as the external forces. By the solidification principle, $P$ must be in equilibrium under the action of all forces as though it were a rigid body of classical mechanics. (Continuum mechanics permits another approach, in which the reactions of the remaining part arise in the volume as well. However, this approach is not used in modern engineering practice.)

So the total force acting on P is

f(P) = fB(P) + fC(P),

where the subscripts $B$ and $C$ denote the body and contact forces, respectively.

To characterize the spatial distribution of the forces, we introduce force densities defined by the equalities

\[
\mathbf{f}_B(P) = \int_{V_P} \rho\,\mathbf{f}\,dV, \qquad \mathbf{f}_C(P) = \int_{\Sigma_P} \mathbf{t}\,d\Sigma,
\]

where $V_P$ is the space volume of $P$, $\Sigma_P = \partial V_P$ is the boundary of $P$, and $\rho$ is the (specific) density of the material composing the body. The density $\mathbf{t}$ carries dimensions of force per unit area, whereas $\mathbf{f}$ carries those of force per unit mass. We emphasize that these are defined in the actual configuration. The quantity $\mathbf{t}$ is called the stress vector.

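[Figure: a portion $V_P$ of the body with boundary $\Sigma_P$, body force $\mathbf{f}$, and stress vector $\mathbf{t}$.]

Fig. 6.2 Forces acting on a portion $V_P$ of the body.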

Equilibrium equations of a continuum medium

By the solidification principle, the equilibrium equations (6.1) for a rigid body become the following two conditions for a deformable body.

1. The resultant force acting on any portion $P$ is zero:

\[
\int_{V_P} \rho\,\mathbf{f}\,dV + \int_{\Sigma_P} \mathbf{t}\,d\Sigma = \mathbf{0}. \tag{6.2}
\]

2. The resultant moment of all forces acting on $P$ is zero:

\[
\int_{V_P} (\mathbf{r} - \mathbf{r}_0) \times \rho\,\mathbf{f}\,dV + \int_{\Sigma_P} (\mathbf{r} - \mathbf{r}_0) \times \mathbf{t}\,d\Sigma = \mathbf{0}, \tag{6.3}
\]

where $\mathbf{r}$ locates a material point and $\mathbf{r}_0$ locates a fixed reference point. The reader is encouraged to show, using (6.2), that (6.3) does not depend on the choice of $\mathbf{r}_0$.

Note that $P$ is not completely arbitrary, but is such that the integration operation over $P$ makes sense.


Stress tensor

In general, the stress vector depends on the position $\mathbf{r}$ of a particle and on the normal $\mathbf{n}$ to the area element in the body. We will always take the normal outward from the portion of the body under consideration.

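[Figure: two contacting parts of the body at the point $\mathbf{r}$, with normals $\mathbf{n}$ and $-\mathbf{n}$, the corresponding stress vectors, and the basis $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$.]

Fig. 6.3 Interaction of two parts of the body.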

In continuum mechanics, Newton’s third law on action-reaction pairs is called Cauchy’s lemma and is expressed by the following relation:

\[
\mathbf{t}(\mathbf{r}, \mathbf{n}) = -\mathbf{t}(\mathbf{r}, -\mathbf{n}). \tag{6.4}
\]

Formula (6.4) describes the interaction of the contacting parts of the body shown in Fig. 6.3. Cauchy’s lemma allows us to introduce the stress tensor, which describes the dependence of $\mathbf{t}$ on $\mathbf{n}$, the normal to the area element at a point. We will see this in Cauchy’s theorem below.

First, however, we will use (6.2) to obtain the equilibrium equation in differential form. We fix an arbitrary point $P$ in $V$, the volume of the body, and use it as a vertex of an arbitrary small parallelepiped $\Pi$ having faces parallel to the Cartesian coordinate planes as shown in Fig. 6.4. Hence the normals to the faces lie along the orthonormal basis vectors $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$.

Let us expand the stress vector $\mathbf{t}(\mathbf{i}_k)$ on the face having normal $\mathbf{i}_k$:

\[
\mathbf{t}(\mathbf{i}_k) = t_{ks}\,\mathbf{i}_s. \tag{6.5}
\]

That is, the $t_{ks}$ are the components of $\mathbf{t}(\mathbf{i}_k)$ in the basis $\mathbf{i}_s$.


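[Figure: parallelepiped $\Pi$ with vertex $P$ and faces parallel to the coordinate planes; the three axes are labeled $x$.]

Fig. 6.4 Parallelepiped $\Pi$.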

Theorem 6.1. The differential equation

\[
\rho\,\mathbf{f} + \frac{\partial t_{ks}}{\partial x_k}\,\mathbf{i}_s = \mathbf{0}, \tag{6.6}
\]

known as the equilibrium equation, holds in $V$.

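Proof. Equation (6.2) for the parallelepiped $\Pi$ is

\[
\int_{V_\Pi} \rho\,\mathbf{f}\,dV + \int_{\Sigma_\Pi} \mathbf{t}\,d\Sigma = \mathbf{0}.
\]

The surface integral is taken over the faces of $\Pi$, each of which is perpendicular to one of the $\mathbf{i}_k$. On the face whose normal is $\mathbf{n} = \mathbf{i}_k$, we have $\mathbf{t}(\mathbf{i}_k) = n_k t_{ks}\mathbf{i}_s$, where $n_k = 1$ and the other two components of $\mathbf{n}$ vanish. On the opposite face the normal is $-\mathbf{i}_k$, so $n_k = -1$ and the remaining components vanish. By Cauchy’s lemma (6.4) we have $\mathbf{t}(-\mathbf{i}_k) = -\mathbf{t}(\mathbf{i}_k)$, hence $\mathbf{t}(-\mathbf{i}_k) = n_k t_{ks}\mathbf{i}_s$. Therefore the above equation takes the form

\[
\int_{V_\Pi} \rho\,\mathbf{f}\,dV + \int_{\Sigma_\Pi} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma = \mathbf{0}.
\]

Applying (4.41) to the surface integral, we have

\[
\int_{\Sigma_\Pi} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma = \int_{V_\Pi} \frac{\partial t_{ks}}{\partial x_k}\,\mathbf{i}_s\,dV \tag{6.7}
\]

and it follows that

\[
\int_{V_\Pi} \left(\rho\,\mathbf{f} + \frac{\partial t_{ks}}{\partial x_k}\,\mathbf{i}_s\right) dV = \mathbf{0}. \tag{6.8}
\]

Suppose the integrand in (6.8) is a continuous function. As the vertex $P$ of $\Pi$ is fixed and $\Pi$ is arbitrarily small, the differential equation (6.6) must hold at $P$. Since $P$ is arbitrary, (6.6) holds in $V$.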


Exercise 6.1. Prove (6.7) by direct integration of the right-hand side.

Now we formulate

Theorem 6.2. (Cauchy). At any point of a body, the dependence of $\mathbf{t}$ on $\mathbf{n}$, the normal to an elementary area at a point, is linear:

\[
\mathbf{t} = \mathbf{n} \cdot \boldsymbol{\sigma}.
\]

Here $\boldsymbol{\sigma}$ is a second-order tensor depending on the point; it is the Cauchy stress tensor.

Proof. We construct a tetrahedron $T$ with vertex $O$ at an arbitrary but fixed point of $V$ as shown in Fig. 6.5. Applying (6.2) to $T$, we get

\[
\int_{V_T} \rho\,\mathbf{f}\,dV + \int_{\Sigma_T} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma + \int_{M_1M_2M_3} \mathbf{t}(\mathbf{n})\,d\Sigma = \mathbf{0}, \tag{6.9}
\]

where $V_T$ is the tetrahedron volume, $\Sigma_T$ is the part of the tetrahedron boundary consisting of the faces that are parallel to the coordinate planes, and $M_1M_2M_3$ is the inclined face. On the face that is orthogonal to $\mathbf{i}_k$, we will use the representation $\mathbf{t}(\mathbf{i}_k) = n_k t_{ks}\mathbf{i}_s$ from the proof of the previous theorem.

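[Figure: tetrahedron $T$ with vertex $O$, three faces in the coordinate planes, and inclined face $M_1M_2M_3$; the three axes are labeled $x$.]

Fig. 6.5 Tetrahedron $T$.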

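Using equation (6.6), we transform (6.9) to the following equality:

\[
\int_{V_T} \frac{\partial t_{ks}}{\partial x_k}\,\mathbf{i}_s\,dV = \int_{\Sigma_T} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma + \int_{M_1M_2M_3} \mathbf{t}(\mathbf{n})\,d\Sigma.
\]

A consequence of (4.41) for $V = V_T$ is

\[
\int_{V_T} \frac{\partial t_{ks}}{\partial x_k}\,\mathbf{i}_s\,dV = \int_{\Sigma_T} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma + \int_{M_1M_2M_3} n_k t_{ks}\,\mathbf{i}_s\,d\Sigma.
\]

Comparing the last two equalities, we see that

\[
\int_{M_1M_2M_3} \left(\mathbf{t}(\mathbf{n}) - n_k t_{ks}\,\mathbf{i}_s\right) d\Sigma = \mathbf{0}.
\]

As the tetrahedron is small and arbitrary at point $O$, we get the following identity:

\[
\mathbf{t}(\mathbf{n}) - n_k t_{ks}\,\mathbf{i}_s = \mathbf{0}.
\]

Because $M_1M_2M_3$ can have an arbitrary orientation $\mathbf{n}$, this equality holds for any $\mathbf{n}$ and, moreover, at any point of $V$.

We have shown that the dependence of $\mathbf{t}$ on $\mathbf{n}$ is linear. As we know (§ 3.2), a linear dependence between two vectors is given by a second-order tensor that we denote by $\boldsymbol{\sigma}$:

\[
\mathbf{t}(\mathbf{n}) = \mathbf{n} \cdot \boldsymbol{\sigma}. \tag{6.10}
\]

The proof is complete.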

By the proof of Cauchy’s theorem, we see that the components of the matrix $(t_{ks})$ are the components of the stress tensor in a Cartesian frame:

\[
\boldsymbol{\sigma} = t_{ks}\,\mathbf{i}_k\mathbf{i}_s.
\]

To maintain the correspondence between the notations for $\boldsymbol{\sigma}$ and its components, we change $t_{ks}$ to $\sigma_{ks}$ and write

\[
\boldsymbol{\sigma} = \sigma_{ks}\,\mathbf{i}_k\mathbf{i}_s.
\]

Each subscript of $\sigma_{ks}$ has a certain geometric meaning. The first subscript designates the area element with normal $\mathbf{i}_k$, while the second designates the direction of the projection of the stress vector onto $\mathbf{i}_s$. For example, $\sigma_{31}$ is the projection of $\mathbf{t}(\mathbf{i}_3)$ onto the axis $x_1$, and the stress vector $\mathbf{t}(\mathbf{i}_3)$ acts on the elementary surface element having normal parallel to $\mathbf{i}_3$.
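Relation (6.10) is easy to exercise numerically. The following sketch, with hypothetical function names and an arbitrary sample matrix, computes the stress vector $t_s = n_k\sigma_{ks}$ for a given unit normal and splits it into its normal part and the magnitude of its tangential part.

```python
import math

def traction(sigma, n):
    """Stress vector t = n . sigma, i.e. t_s = n_k sigma_ks (first index: normal)."""
    return [sum(n[k] * sigma[k][s] for k in range(3)) for s in range(3)]

def normal_and_shear(sigma, n):
    """Normal component t . n and magnitude of the tangential part of t(n)."""
    t = traction(sigma, n)
    t_n = sum(t[s] * n[s] for s in range(3))
    t_sq = sum(c * c for c in t)
    return t_n, math.sqrt(max(t_sq - t_n * t_n, 0.0))

# Sample Cartesian components sigma[k][s]; the values are arbitrary.
sigma = [[10.0, 2.0, 0.0],
         [2.0, 5.0, 0.0],
         [0.0, 0.0, 1.0]]

print(traction(sigma, [0.0, 0.0, 1.0]))          # row 3 of the matrix: t(i3)
print(normal_and_shear(sigma, [1.0, 0.0, 0.0]))  # normal and shear stress on the face x1 = const
```

With $\mathbf{n} = \mathbf{i}_3$ the function returns the third row of the matrix, in agreement with the meaning of the subscripts described above.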

Now we return to the equilibrium equation. It is easy to see that the equation

\[
\nabla \cdot \boldsymbol{\sigma} + \rho\,\mathbf{f} = \mathbf{0} \tag{6.11}
\]

written in Cartesian coordinates is (6.6). This means we have found the component-free form of the equilibrium equation for the body. Note that the equation does not depend on the properties of the material that makes up the body. We recall that (6.11) is the differential form of the condition that the resultant force applied to an arbitrary part of the body is zero.

So far we have exploited only the force equation (6.2). Now we will formulate the consequences of the moment equation (6.3).

Theorem 6.3. Let equation (6.3) hold for any part of the body. It follows that $\boldsymbol{\sigma}$ is a symmetric tensor: $\boldsymbol{\sigma} = \boldsymbol{\sigma}^T$.

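Proof. Using (4.41), we change the surface integral in (6.3) to an integral over $V_P$:

\[
\begin{aligned}
\int_{\Sigma_P} (\mathbf{r} - \mathbf{r}_0) \times \mathbf{t}\,d\Sigma
&= \int_{\Sigma_P} (\mathbf{r} - \mathbf{r}_0) \times (\mathbf{n} \cdot \boldsymbol{\sigma})\,d\Sigma \\
&= -\int_{\Sigma_P} \mathbf{n} \cdot \boldsymbol{\sigma} \times (\mathbf{r} - \mathbf{r}_0)\,d\Sigma \\
&= -\int_{V_P} \nabla \cdot [\boldsymbol{\sigma} \times (\mathbf{r} - \mathbf{r}_0)]\,dV.
\end{aligned} \tag{6.12}
\]

Let us transform the integrand of the last integral. Because $\partial\mathbf{r}/\partial x_k = \mathbf{i}_k$ and $\partial\mathbf{r}_0/\partial x_k = \mathbf{0}$, we have

\[
\begin{aligned}
\nabla \cdot [\boldsymbol{\sigma} \times (\mathbf{r} - \mathbf{r}_0)]
&= \nabla \cdot \boldsymbol{\sigma} \times (\mathbf{r} - \mathbf{r}_0) + \mathbf{i}_k \cdot \boldsymbol{\sigma} \times \frac{\partial}{\partial x_k}(\mathbf{r} - \mathbf{r}_0) \\
&= -(\mathbf{r} - \mathbf{r}_0) \times \nabla \cdot \boldsymbol{\sigma} + \mathbf{i}_k \cdot \boldsymbol{\sigma} \times \mathbf{i}_k \\
&= -(\mathbf{r} - \mathbf{r}_0) \times \nabla \cdot \boldsymbol{\sigma} - \sigma_{ks}\,\mathbf{i}_k \times \mathbf{i}_s.
\end{aligned}
\]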

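Thus the condition that the resultant moment of all forces acting on $V_P$ is zero brings us to the relation

\[
\int_{V_P} (\mathbf{r} - \mathbf{r}_0) \times (\rho\,\mathbf{f} + \nabla \cdot \boldsymbol{\sigma})\,dV + \int_{V_P} \sigma_{ks}\,\mathbf{i}_k \times \mathbf{i}_s\,dV = \mathbf{0}. \tag{6.13}
\]

The first integral in (6.13) is zero by (6.11). So the second integral in (6.13) is zero for arbitrary $V_P$, and it follows that

\[
\sigma_{ks}\,\mathbf{i}_k \times \mathbf{i}_s = \mathbf{0} \quad \text{in } V.
\]

This holds if and only if $\boldsymbol{\sigma}$ is symmetric at each point, i.e.,

\[
\boldsymbol{\sigma} = \boldsymbol{\sigma}^T.
\]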

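Indeed, let us consider the part of the sum $\sigma_{ks}\mathbf{i}_k \times \mathbf{i}_s$ when $k, s$ are 1 or 2. We have

\[
\begin{aligned}
\sigma_{ks}\,\mathbf{i}_k \times \mathbf{i}_s
&= \sigma_{11}\,\mathbf{i}_1 \times \mathbf{i}_1 + \sigma_{22}\,\mathbf{i}_2 \times \mathbf{i}_2 + \sigma_{12}\,\mathbf{i}_1 \times \mathbf{i}_2 + \sigma_{21}\,\mathbf{i}_2 \times \mathbf{i}_1 \\
&= (\sigma_{12} - \sigma_{21})\,\mathbf{i}_3 \\
&= \mathbf{0},
\end{aligned}
\]

which implies that $\sigma_{12} = \sigma_{21}$. Similarly we may demonstrate that $\sigma_{23} = \sigma_{32}$ and $\sigma_{13} = \sigma_{31}$. This completes the proof.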
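The last step of the proof can be checked by direct summation. In the sketch below (hypothetical helper names, zero-based indices) the vector $\sigma_{ks}\mathbf{i}_k \times \mathbf{i}_s$ is computed from $\mathbf{i}_k \times \mathbf{i}_s = \epsilon_{ksm}\mathbf{i}_m$, where $\epsilon_{ksm}$ is the Levi-Civita symbol; the result vanishes exactly when the matrix is symmetric.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (value +1, -1 or 0)."""
    return (i - j) * (j - k) * (k - i) // 2

def moment_density(sigma):
    """Components w_m = eps_ksm sigma_ks of the vector sigma_ks i_k x i_s."""
    return [sum(levi_civita(k, s, m) * sigma[k][s]
                for k in range(3) for s in range(3))
            for m in range(3)]

symmetric = [[1.0, 4.0, 5.0],
             [4.0, 2.0, 6.0],
             [5.0, 6.0, 3.0]]
nonsymmetric = [[1.0, 4.0, 5.0],
                [0.0, 2.0, 6.0],
                [5.0, 6.0, 3.0]]

print(moment_density(symmetric))     # all components vanish
print(moment_density(nonsymmetric))  # third component equals sigma_12 - sigma_21
```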

It is worth noting that in continuum mechanics other types of stresses, such as couple stresses, can be introduced [Cosserat and Cosserat (1909); Eringen (1999)]. For such models, the Cauchy stress tensor is not symmetric in general.

Principal stresses and principal area elements

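In a general basis $\mathbf{e}_k$ ($k = 1, 2, 3$), the Cauchy stress tensor $\boldsymbol{\sigma}$ takes the form

\[
\boldsymbol{\sigma} = \sigma^{sk}\,\mathbf{e}_s\mathbf{e}_k,
\]

where the matrix $\sigma^{sk}$ has only six independent components.

Because $\boldsymbol{\sigma}$ is symmetric, there exists the spectral expansion (3.21):

\[
\boldsymbol{\sigma} = \sigma_1\,\mathbf{i}_1\mathbf{i}_1 + \sigma_2\,\mathbf{i}_2\mathbf{i}_2 + \sigma_3\,\mathbf{i}_3\mathbf{i}_3. \tag{6.14}
\]

Here the eigenvalues $\sigma_k$ of the matrix $(\sigma_{sk})$ are the principal stresses, and the normalized eigenvectors $\mathbf{i}_k$ of $\boldsymbol{\sigma}$ are the principal axes of $\boldsymbol{\sigma}$. On the principal area element having normal $\mathbf{i}_k$, the tangential stresses are absent. When the $\sigma_k$ are distinct, the frame $\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3$ is orthonormal. For the case of repeated $\sigma_k$, the frame of $\mathbf{i}_k$ is not unique; even in this case, however, we can select an orthonormal set $\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3$.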
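For a concrete stress state the principal stresses can be computed in closed form. The sketch below uses the standard trigonometric solution of the characteristic cubic for a real symmetric $3 \times 3$ matrix; this particular method and the function name are our choices for illustration and are not taken from the text.

```python
import math

def principal_stresses(s):
    """Eigenvalues of a symmetric 3x3 matrix s, largest first (trigonometric formula)."""
    p1 = s[0][1] ** 2 + s[0][2] ** 2 + s[1][2] ** 2
    q = (s[0][0] + s[1][1] + s[2][2]) / 3.0
    p2 = sum((s[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    if p2 == 0.0:
        return (q, q, q)  # s is a multiple of the unit tensor
    p = math.sqrt(p2 / 6.0)
    # b = (s - q E) / p has eigenvalues 2 cos(phi + 2 pi m / 3), m = 0, 1, 2
    b = [[(s[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    s1 = q + 2.0 * p * math.cos(phi)
    s3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    s2 = 3.0 * q - s1 - s3  # the trace is the sum of the eigenvalues
    return (s1, s2, s3)

stress = [[2.0, 1.0, 0.0],
          [1.0, 2.0, 0.0],
          [0.0, 0.0, 5.0]]
print(principal_stresses(stress))
```

For this sample matrix the direction $(1, 1, 0)/\sqrt{2}$ is a principal axis: the stress vector on the area element with that normal is parallel to the normal, so the tangential stress is absent there.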

6.2 Strain Tensor

Under load, a body changes shape. We will consider how to describe deformation using the strain tensor. We restrict our consideration to very small deformations.

Let us illustrate the notion of strain using a stretched bar as an example. An undeformed bar has length $l_0$; under load, the length becomes $l$. The strain is

\[
\varepsilon = \frac{l - l_0}{l_0} \equiv \frac{\Delta l}{l_0}.
\]

Generalization of this to three dimensions is not straightforward: we should consider changes in shape of the body in all directions.

Let a body initially occupy a volume $V$ in space. Under some external load, it occupies the volume $v$. The position vectors of a particle in the initial and deformed states are denoted by $\mathbf{r}_0$ and $\mathbf{r}$, respectively. The displacement vector

\[
\mathbf{u} = \mathbf{r} - \mathbf{r}_0
\]

describes the displacement of a particle due to deformation (Fig. 6.6). In this book we restrict ourselves to the case in which $\|\mathbf{u}\| \ll 1$ and all the first derivatives of $\mathbf{u}$ are small in comparison with 1. So we will omit all terms of the second order of smallness in any expression containing first-order terms. Moreover, in the case of small deformations we will not distinguish the initial and actual states of the body; that is, all quantities will be considered to be given in the initial volume $V$.

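[Figure: a body before and after deformation, with position vectors $\mathbf{r}_0$ and $\mathbf{r}$, displacement $\mathbf{u}$, and basis $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$.]

Fig. 6.6 Deformation of a three-dimensional body.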

Let us consider the change of an infinitesimal vector-segment $d\mathbf{r}_0$ due to deformation. After deformation the segment is given by the vector $d\mathbf{r}$. We have

\[
d\mathbf{r} = d\mathbf{r}_0 \cdot \mathbf{F}, \tag{6.15}
\]

where $\mathbf{F} = \mathbf{E} + \nabla\mathbf{u}$ is the gradient of the deformation.¹

Exercise 6.2. Derive (6.15) from the relation $\mathbf{r} = \mathbf{r}_0 + \mathbf{u}$.

Next, we consider the change in length of the segment during deformation. Before deformation, the squared length is

\[
dS^2 = d\mathbf{r}_0 \cdot d\mathbf{r}_0.
\]

¹ Some books use $\mathbf{F} = \mathbf{E} + \nabla\mathbf{u}^T$, in which case (6.15) takes the form $d\mathbf{r} = \mathbf{F} \cdot d\mathbf{r}_0$.


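After deformation it is

\[
ds^2 = d\mathbf{r} \cdot d\mathbf{r} = d\mathbf{r}_0 \cdot \mathbf{F} \cdot \mathbf{F}^T \cdot d\mathbf{r}_0.
\]

We have

\[
\begin{aligned}
ds^2 - dS^2 &= d\mathbf{r}_0 \cdot (\mathbf{F} \cdot \mathbf{F}^T - \mathbf{E}) \cdot d\mathbf{r}_0 \\
&= d\mathbf{r}_0 \cdot \left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T + (\nabla\mathbf{u}) \cdot (\nabla\mathbf{u})^T\right] \cdot d\mathbf{r}_0.
\end{aligned}
\]

For small deformations, we may omit all the squared quantities and write

\[
ds^2 - dS^2 = d\mathbf{r}_0 \cdot \left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right] \cdot d\mathbf{r}_0 = 2\,d\mathbf{r}_0 \cdot \boldsymbol{\varepsilon} \cdot d\mathbf{r}_0,
\]

where

\[
\boldsymbol{\varepsilon} = \frac{1}{2}\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right) \tag{6.16}
\]

is the linear strain tensor. It is clear that $\boldsymbol{\varepsilon}$ is a symmetric tensor.

In a Cartesian frame, $\boldsymbol{\varepsilon}$ is given by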

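\[
\boldsymbol{\varepsilon} = \varepsilon_{mn}\,\mathbf{i}_m\mathbf{i}_n,
\]

where

\[
\begin{aligned}
\varepsilon_{11} &= \frac{\partial u_1}{\partial x_1}, &
\varepsilon_{12} &= \frac{1}{2}\left(\frac{\partial u_1}{\partial x_2} + \frac{\partial u_2}{\partial x_1}\right), \\
\varepsilon_{22} &= \frac{\partial u_2}{\partial x_2}, &
\varepsilon_{13} &= \frac{1}{2}\left(\frac{\partial u_1}{\partial x_3} + \frac{\partial u_3}{\partial x_1}\right), \\
\varepsilon_{33} &= \frac{\partial u_3}{\partial x_3}, &
\varepsilon_{23} &= \frac{1}{2}\left(\frac{\partial u_2}{\partial x_3} + \frac{\partial u_3}{\partial x_2}\right).
\end{aligned}
\]

The diagonal components $\varepsilon_{11}$, $\varepsilon_{22}$, $\varepsilon_{33}$ describe the changes in the lengths of elementary segments along the $\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$ directions, respectively. The other components $\varepsilon_{mn}$ ($m \neq n$) represent skewing of the body; they characterize the deformational changes in the angles between elementary segments lying initially along the axes.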
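To see how good the linearization behind (6.16) is, one can compare $2\,d\mathbf{r}_0 \cdot \boldsymbol{\varepsilon} \cdot d\mathbf{r}_0$ with the exact value of $ds^2 - dS^2$ for a small displacement gradient. This is a sketch with hypothetical names; the matrix `grad_u[k][s]` is assumed to hold $\partial u_s/\partial x_k$, in line with $d\mathbf{r} = d\mathbf{r}_0 \cdot \mathbf{F}$, and the sample values are arbitrary.

```python
def linear_strain(grad_u):
    """eps = (grad u + (grad u)^T) / 2, cf. (6.16); grad_u[k][s] = d u_s / d x_k."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)] for i in range(3)]

def quadratic_form(a, v):
    """Value of v . a . v for a 3x3 matrix a and a vector v."""
    return sum(v[i] * a[i][j] * v[j] for i in range(3) for j in range(3))

def exact_length_change(grad_u, dr0):
    """ds^2 - dS^2 with dr = dr0 . F and F = E + grad u, without linearization."""
    dr = [dr0[s] + sum(dr0[k] * grad_u[k][s] for k in range(3)) for s in range(3)]
    return sum(c * c for c in dr) - sum(c * c for c in dr0)

# Small displacement gradient (arbitrary sample values of order 1e-4).
grad_u = [[1.0e-4, 2.0e-4, 0.0],
          [0.0, -3.0e-5, 0.0],
          [0.0, 0.0, 0.0]]
dr0 = [1.0, 1.0, 0.0]

eps = linear_strain(grad_u)
linear = 2.0 * quadratic_form(eps, dr0)
exact = exact_length_change(grad_u, dr0)
print(linear, exact, exact - linear)  # the difference is of second order in grad u
```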

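In arbitrary curvilinear coordinates $q^1, q^2, q^3$ with basis $\mathbf{r}_k$ ($k = 1, 2, 3$) and dual basis $\mathbf{r}^k$ ($k = 1, 2, 3$), the tensor $\boldsymbol{\varepsilon}$ is given by

\[
\boldsymbol{\varepsilon} = \varepsilon_{st}\,\mathbf{r}^s\mathbf{r}^t, \qquad
\varepsilon_{st} = \frac{1}{2}\left(\frac{\partial u_s}{\partial q^t} + \frac{\partial u_t}{\partial q^s}\right) - \Gamma^r_{st}\,u_r.
\]

In arbitrary orthogonal curvilinear coordinates, $\boldsymbol{\varepsilon}$ was represented in (4.39). See Appendix A for $\boldsymbol{\varepsilon}$ in the cylindrical and spherical systems.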


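Equation (6.16) defines $\boldsymbol{\varepsilon}$ as a tensorial function of $\mathbf{u}$. The inverse problem, of finding $\mathbf{u}$ when $\boldsymbol{\varepsilon}$ is given, has a solution if and only if the compatibility condition

\[
\nabla \times (\nabla \times \boldsymbol{\varepsilon})^T = \mathbf{0} \tag{6.17}
\]

holds. In this case, $\mathbf{u}$ can be found using Cesàro’s formula

\[
\mathbf{u} = \mathbf{u}_0 + \boldsymbol{\omega}_0 \times (\mathbf{r} - \mathbf{r}_0) + \int_{M_0}^{M} \left\{\boldsymbol{\varepsilon}(s) + [\mathbf{r}(s) - \mathbf{r}] \times \nabla \times \boldsymbol{\varepsilon}(s)\right\} \cdot d\mathbf{r}(s), \tag{6.18}
\]

where $\mathbf{u}_0$ and $\boldsymbol{\omega}_0$ are arbitrary but fixed vectors, and the integration path $M_0M$ joins the points $M_0$ and $M$ whose position vectors are $\mathbf{r}_0$ and $\mathbf{r}$, respectively. Here $\mathbf{r}(s)$ locates an arbitrary point on $M_0M$.

The derivations of these formulas can be found in any book on elasticity, e.g., [Green and Zerna (1954); Lurie (2005)].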

6.3 Equation of Motion

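Using the equilibrium equations (6.11) and d’Alembert’s principle of mechanics, we can immediately obtain the equation of motion for a body. The technique is to formally add the inertia forces to the body forces:

\[
\rho\,\mathbf{f} \to \rho\,\mathbf{f} - \rho\,\frac{\partial^2\mathbf{u}}{\partial t^2},
\]

where $t$ is the time variable and $\rho$ is the material density. The equation of motion is

\[
\nabla \cdot \boldsymbol{\sigma} + \rho\,\mathbf{f} = \rho\,\frac{\partial^2\mathbf{u}}{\partial t^2}. \tag{6.19}
\]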

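The form of (6.19) is simplest in Cartesian coordinates. Putting

\[
\boldsymbol{\sigma} = \sigma_{mn}\,\mathbf{i}_m\mathbf{i}_n, \qquad \mathbf{u} = u_m\,\mathbf{i}_m,
\]

we get

\[
\frac{\partial\sigma_{ij}}{\partial x_i} + \rho f_j = \rho\,\frac{\partial^2 u_j}{\partial t^2} \qquad (j = 1, 2, 3)
\]

or explicitly

\[
\begin{aligned}
\frac{\partial\sigma_{11}}{\partial x_1} + \frac{\partial\sigma_{21}}{\partial x_2} + \frac{\partial\sigma_{31}}{\partial x_3} + \rho f_1 &= \rho\,\frac{\partial^2 u_1}{\partial t^2}, \\
\frac{\partial\sigma_{12}}{\partial x_1} + \frac{\partial\sigma_{22}}{\partial x_2} + \frac{\partial\sigma_{32}}{\partial x_3} + \rho f_2 &= \rho\,\frac{\partial^2 u_2}{\partial t^2}, \\
\frac{\partial\sigma_{13}}{\partial x_1} + \frac{\partial\sigma_{23}}{\partial x_2} + \frac{\partial\sigma_{33}}{\partial x_3} + \rho f_3 &= \rho\,\frac{\partial^2 u_3}{\partial t^2}.
\end{aligned}
\]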

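In view of (4.33), in curvilinear coordinates equation (6.19) takes the form

\[
\frac{1}{\sqrt{g}}\,\frac{\partial}{\partial q^i}\left(\sqrt{g}\,\sigma^{ij}\,\mathbf{r}_j\right) + \rho\,\mathbf{f} = \rho\,\frac{\partial^2\mathbf{u}}{\partial t^2}. \tag{6.20}
\]

In components it is

\[
\frac{\partial}{\partial q^i}\left(\sqrt{g}\,\sigma^{ij}\right) + \sqrt{g}\,\Gamma^j_{mn}\,\sigma^{mn} + \rho\sqrt{g}\,f^j = \rho\sqrt{g}\,\frac{\partial^2 u^j}{\partial t^2} \qquad (j = 1, 2, 3). \tag{6.21}
\]

Here the factor $\sqrt{g}$ multiplies the Christoffel term as well, since (6.21) is (6.20) multiplied through by $\sqrt{g}$. The equations of motion can be simplified in orthogonal coordinates. See Appendix A for expressions in cylindrical and spherical coordinates.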

6.4 Hooke’s Law

Equations (6.16) and (6.19) apply to any small deformation of a continuous medium. However, they do not uniquely define the deformations and stresses in a body. To study a body under load, we should relate the stresses to the strains using the material properties of the body. These relations are called constitutive equations. In this book we consider linearly elastic materials for which the constitutive equation represents a linear dependence between $\boldsymbol{\sigma}$ and $\boldsymbol{\varepsilon}$. The simplest version is Hooke’s law

\[
\sigma = E\varepsilon, \tag{6.22}
\]

which describes the elastic properties of a thin rod under tension or compression. Here $E$ is the elastic modulus of the material from which the rod is made; it is known as Young’s modulus.

Robert Hooke (1635–1703) was the first to establish the linear dependence $f \sim \Delta l$ between the applied force and the elongation of a bar similar to that in Fig. 6.1. Thomas Young (1773–1829) introduced the elastic modulus $E$ as a quantity that does not depend on the cross-sectional area of the bar; rather, $E$ characterizes the material itself. The linear dependence (6.22) holds only in some range $|\varepsilon| < \varepsilon_0$, where $\varepsilon_0$ depends on the material, temperature, and other factors. However, the importance of Hooke’s law in engineering cannot be overestimated.

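In the general case, a linear dependence between the second-order tensors $\boldsymbol{\sigma}$ and $\boldsymbol{\varepsilon}$ is presented in equation (3.31), known as the generalized Hooke’s law:

\[
\boldsymbol{\sigma} = \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}. \tag{6.23}
\]

In a Cartesian basis it is

\[
\sigma_{ij} = c_{ijmn}\,\varepsilon_{mn}.
\]

The fourth-order tensor of elastic moduli

\[
\mathbf{C} = c_{ijmn}\,\mathbf{i}_i\mathbf{i}_j\mathbf{i}_m\mathbf{i}_n
\]

has 81 components; only 36 of these are independent, however, as the symmetries of $\boldsymbol{\sigma}$ and $\boldsymbol{\varepsilon}$ lead to the conditions

\[
c_{ijmn} = c_{jimn} = c_{ijnm}.
\]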

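In linear elasticity it is shown [Lurie (2005)] that we can introduce the strain energy

\[
\frac{1}{2}\int_V W\,dV
\]

stored in the elastic body by virtue of its deformation. The integrand $W = W(\boldsymbol{\varepsilon})$, the strain energy function, is a quadratic form in $\boldsymbol{\varepsilon}$:

\[
W = \frac{1}{2}\,\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon} = \frac{1}{2}\,\varepsilon_{ij}\,c_{ijmn}\,\varepsilon_{mn}.
\]

The fact that $W$ is uniquely defined for any deformation requires $\mathbf{C}$ to possess an additional symmetry property which, in terms of components, is

\[
c_{ijmn} = c_{mnij}
\]

for any indices $i, j, m, n$. Indeed, we can represent $\mathbf{C}$ as the sum of two tensors

\[
\mathbf{C} = \mathbf{C}' + \mathbf{C}'',
\]

where the components of $\mathbf{C}'$ satisfy

\[
c'_{ijmn} = c'_{mnij}
\]

and those of $\mathbf{C}''$ satisfy

\[
c''_{ijmn} = -c''_{mnij}.
\]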

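(This is similar to the representation of a second-order tensor as the sum of symmetric and antisymmetric tensors.) For any symmetric tensors $\boldsymbol{\varepsilon}_k$, we have

\[
\boldsymbol{\varepsilon}_1 \cdot\cdot\, \mathbf{C}'' \cdot\cdot\, \boldsymbol{\varepsilon}_2 = -\boldsymbol{\varepsilon}_2 \cdot\cdot\, \mathbf{C}'' \cdot\cdot\, \boldsymbol{\varepsilon}_1,
\]

hence for any strain tensor $\boldsymbol{\varepsilon}$,

\[
\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C}'' \cdot\cdot\, \boldsymbol{\varepsilon} = -\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C}'' \cdot\cdot\, \boldsymbol{\varepsilon}
\]

and

\[
\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C}'' \cdot\cdot\, \boldsymbol{\varepsilon} = 0.
\]

It follows that

\[
\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon} = \boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C}' \cdot\cdot\, \boldsymbol{\varepsilon}.
\]

Thus $\mathbf{C}''$ does not affect the values of $W$, and the constitutive relations should not include it. Setting $\mathbf{C}''$ to zero, we get $\mathbf{C} = \mathbf{C}'$. Thus the components of $\mathbf{C}$ have the following symmetry properties:

\[
c_{ijmn} = c_{jimn} = c_{ijnm}, \qquad c_{ijmn} = c_{mnij},
\]

for any indices $i, j, m, n$. Elementary calculation shows that these are 60 equalities. So in Hooke’s law, of the 81 components of $\mathbf{C}$ there remain only $81 - 60 = 21$ independent elastic constants.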
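The counts 36 and 21 can be confirmed by brute force. The sketch below (the function name is hypothetical) enumerates all 81 index quadruples and groups those that the symmetry relations force to be equal; the number of groups is the number of independent components.

```python
from itertools import product

def count_independent(major_symmetry):
    """Number of independent c_ijmn under c_ijmn = c_jimn = c_ijnm,
    optionally together with c_ijmn = c_mnij."""
    classes = set()
    for i, j, m, n in product(range(3), repeat=4):
        first = tuple(sorted((i, j)))    # c_ijmn = c_jimn
        second = tuple(sorted((m, n)))   # c_ijmn = c_ijnm
        if major_symmetry:               # c_ijmn = c_mnij
            key = tuple(sorted((first, second)))
        else:
            key = (first, second)
        classes.add(key)
    return len(classes)

print(count_independent(False))       # symmetries of sigma and eps only
print(count_independent(True))        # with the strain-energy symmetry added
print(81 - count_independent(True))   # number of independent equalities
```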

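It follows from Exercise 3.44 that $W$ is a potential for $\boldsymbol{\sigma}$, i.e., that

\[
\boldsymbol{\sigma} = W_{,\boldsymbol{\varepsilon}},
\]

where $W_{,\boldsymbol{\varepsilon}}$ is the derivative of $W$ with respect to $\boldsymbol{\varepsilon}$.

A dual relation

\[
\boldsymbol{\varepsilon} = W_{,\boldsymbol{\sigma}}
\]

holds when we express $W$ in terms of $\boldsymbol{\sigma}$, i.e., $W = W(\boldsymbol{\sigma})$. We urge the reader to show this as an exercise.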

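In engineering analysis, the vector and matrix notations are typically used to describe an elastic body. Because of the symmetry properties of the tensors, all the relations are written in terms of formal six-dimensional “vectors” for the components of $\boldsymbol{\sigma}$ and $\boldsymbol{\varepsilon}$, and $6 \times 6$ matrices for $\mathbf{C}$. Voigt’s rule shows one how to transform the tensor notation to the matrix-vector notation. The rule for changing $c_{ijmn} \to C_{pq}$ is as follows. The pairs of indices 11, 22, 33 change to 1, 2, 3, the pairs 23 and 32 to 4, the pairs 13 and 31 to 5, and the pairs 12 and 21 to 6. For example, $c_{1122} \to C_{12}$ and $c_{1232} \to C_{64}$. The symmetry property $C_{pq} = C_{qp}$ holds. The components of the stress and strain tensors are transformed by the formulas

\[
\begin{bmatrix} \sigma_{11} \\ \sigma_{22} \\ \sigma_{33} \\ \sigma_{23} \\ \sigma_{31} \\ \sigma_{12} \end{bmatrix}
=
\begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \sigma_4 \\ \sigma_5 \\ \sigma_6 \end{bmatrix},
\qquad
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{22} \\ \varepsilon_{33} \\ 2\varepsilon_{23} \\ 2\varepsilon_{31} \\ 2\varepsilon_{12} \end{bmatrix}
=
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \end{bmatrix}. \tag{6.24}
\]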

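In this notation, Hooke’s law takes the form

$$\begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \sigma_4 \\ \sigma_5 \\ \sigma_6 \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16} \\ C_{12} & C_{22} & C_{23} & C_{24} & C_{25} & C_{26} \\ C_{13} & C_{23} & C_{33} & C_{34} & C_{35} & C_{36} \\ C_{14} & C_{24} & C_{34} & C_{44} & C_{45} & C_{46} \\ C_{15} & C_{25} & C_{35} & C_{45} & C_{55} & C_{56} \\ C_{16} & C_{26} & C_{36} & C_{46} & C_{56} & C_{66} \end{bmatrix} \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \end{bmatrix}. \tag{6.25}$$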

The general case in which all 21 elastic constants in (6.25) are independent is not common in applications. In engineering practice, the materials usually possess symmetries that reduce the number of independent constants. Most common are isotropic materials; such a material exhibits complete symmetry of its properties in space so that at each point the material properties do not vary with direction. Mathematically, symmetry is expressed in the language of group theory. Examples of isotropic materials are steel, aluminium, many other metals, polymeric materials, etc.

For an isotropic material, the relation between σ and ε is given by a linear isotropic function. This function, considered in Chapter 3, now takes the form

$$\boldsymbol{\sigma} = \lambda \mathbf{E} \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu \boldsymbol{\varepsilon}, \tag{6.26}$$

where λ and µ are Lamé’s moduli. We may also refer to µ as the shear modulus. For an isotropic material,

C = λEE + µ(ekEek + I).


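In matrix notation, to C there corresponds the matrix

$$\begin{bmatrix} \lambda + 2\mu & \lambda & \lambda & 0 & 0 & 0 \\ \lambda & \lambda + 2\mu & \lambda & 0 & 0 & 0 \\ \lambda & \lambda & \lambda + 2\mu & 0 & 0 & 0 \\ 0 & 0 & 0 & \mu & 0 & 0 \\ 0 & 0 & 0 & 0 & \mu & 0 \\ 0 & 0 & 0 & 0 & 0 & \mu \end{bmatrix}.$$

The off-diagonal entries λ in the upper left block come from the term λE tr ε in (6.26). The shear entries are µ rather than 2µ because ε4, ε5, ε6 in (6.24) are the doubled shear strains.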
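As a cross-check, the Voigt matrix of an isotropic material can be assembled directly from tensor components. The sketch below assumes the Cartesian component form c_ijmn = λ δ_ij δ_mn + µ(δ_im δ_jn + δ_in δ_jm), which reproduces (6.26) for symmetric ε; the names `VOIGT`, `c_isotropic`, and `voigt_matrix` are illustrative, and indices are zero-based.

```python
# Voigt's rule with zero-based indices: 11,22,33 -> 0,1,2; 23 -> 3; 13 -> 4; 12 -> 5.
VOIGT = {(0, 0): 0, (1, 1): 1, (2, 2): 2,
         (1, 2): 3, (2, 1): 3,
         (0, 2): 4, (2, 0): 4,
         (0, 1): 5, (1, 0): 5}

def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0

def c_isotropic(i, j, m, n, lam, mu):
    """Assumed Cartesian components of the isotropic elasticity tensor."""
    return (lam * delta(i, j) * delta(m, n)
            + mu * (delta(i, m) * delta(j, n) + delta(i, n) * delta(j, m)))

def voigt_matrix(lam, mu):
    """Build the 6x6 matrix C_pq from c_ijmn using Voigt's rule."""
    C = [[0.0] * 6 for _ in range(6)]
    for (i, j), p in VOIGT.items():
        for (m, n), q in VOIGT.items():
            C[p][q] = c_isotropic(i, j, m, n, lam, mu)
    return C

C = voigt_matrix(lam=1.0, mu=2.0)
for row in C:
    print(row)
```

With the strain column of (6.24), which carries doubled shear strains, the product of this matrix with the strain column returns σ23 = 2µ ε23, in agreement with (6.26).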

Other elastic constants were used historically in engineering work, and in terms of these, Hooke’s law takes other forms. Common engineering constants include Young’s modulus E, Poisson’s ratio ν, the bulk modulus k, and the shear modulus G. For isotropic materials, these are used in pairs: E and ν, and k and G. The modulus E originated in bar stretching problems; it relates the tension applied to the bar with the resulting strain. The modulus k is used to describe uniform volume deformation of a material, say of a ball under uniform pressure. G describes the shear characteristics of a material. Finally, ν describes the lateral shortening of a stretched band. We will derive some relations between the constants.

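Let us split σ into a ball tensor and deviator:

$$\boldsymbol{\sigma} = \sigma \mathbf{E} + \operatorname{dev} \boldsymbol{\sigma}, \qquad \sigma = \tfrac{1}{3} \operatorname{tr} \boldsymbol{\sigma},$$

where the scalar σ is the mean stress. Then (6.26) takes the form

$$\sigma = \tfrac{1}{3}(3\lambda + 2\mu) \operatorname{tr} \boldsymbol{\varepsilon}, \qquad \operatorname{dev} \boldsymbol{\sigma} = 2\mu \operatorname{dev} \boldsymbol{\varepsilon}.$$

The bulk modulus k relates the mean stress σ with the bulk strain tr ε through

$$\sigma = k \operatorname{tr} \boldsymbol{\varepsilon},$$

and so

$$k = \lambda + \tfrac{2}{3}\mu.$$

Shear stresses are related to shear strains by

$$\sigma_{ij} = 2G\varepsilon_{ij} \quad (i \neq j).$$

The above deviator equation in components is

$$\sigma_{ij} = 2\mu\varepsilon_{ij} \quad (i \neq j),$$

which yields G = µ.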


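Now we turn to Young’s modulus E. Let us consider a uniaxial homogeneous deformation of a bar; this occurs when the material is uniformly stretched or compressed. Take the unit vector i1 along the bar axis. Then σ = σ11 i1 i1. Relation (6.26) reduces to the three nontrivial component equations

$$\sigma_{11} = \lambda \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu\varepsilon_{11}, \qquad 0 = \lambda \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu\varepsilon_{22}, \qquad 0 = \lambda \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu\varepsilon_{33}.$$

Eliminating ε22 and ε33 from these, we get

$$\sigma_{11} = E\varepsilon_{11}, \quad \text{where} \quad E = \frac{\mu(3\lambda + 2\mu)}{\lambda + \mu}.$$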

When a bar is uniformly stretched by a force and its longitudinal strain is ε, experiment shows that its transverse dimensions decrease. The transverse strain for this deformation is proportional to the longitudinal strain and equals −νε, the coefficient of proportionality ν being Poisson’s ratio. Its relation to the other constants is defined by the following

Exercise 6.3. For the above uniaxial stress state of the bar, find ε22/ε11 = −ν. Show that ν = λ/(2λ + 2µ). Note that in the uniaxially stretched bar, ν defines the dependence of the lateral strain ε22 on the axial strain ε11.

The relations between various pairs of moduli used in the literature are summarized in Appendix A.
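The formulas obtained so far can be collected in a small conversion routine. This is a minimal sketch; the function name `engineering_constants` and the numerical values of λ and µ are illustrative assumptions, not data from the text.

```python
def engineering_constants(lam, mu):
    """Convert Lame's moduli (lam, mu) to E, nu, k, G for an isotropic material."""
    E = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)   # Young's modulus
    nu = lam / (2.0 * (lam + mu))                  # Poisson's ratio
    k = lam + 2.0 * mu / 3.0                       # bulk modulus
    G = mu                                         # shear modulus
    return {"E": E, "nu": nu, "k": k, "G": G}

# Hypothetical moduli in arbitrary units, chosen only to make the arithmetic easy.
consts = engineering_constants(lam=1.0, mu=1.0)
print(consts)

# Two standard identities that follow from the formulas above.
print(consts["E"], 2.0 * consts["G"] * (1.0 + consts["nu"]))
print(consts["k"], consts["E"] / (3.0 * (1.0 - 2.0 * consts["nu"])))
```

The two printed pairs agree, which reflects the relations E = 2G(1 + ν) and k = E / (3(1 − 2ν)).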

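Thermodynamic considerations show that the strain energy should be positive:

$$W(\boldsymbol{\varepsilon}) > 0 \quad \text{whenever} \quad \boldsymbol{\varepsilon} \neq 0. \tag{6.27}$$

This puts additional restrictions on the elastic moduli for an isotropic material. We have

$$W(\boldsymbol{\varepsilon}) = \tfrac{1}{2}\,\boldsymbol{\sigma} \cdot\cdot\, \boldsymbol{\varepsilon} = \tfrac{1}{2}\lambda(\operatorname{tr} \boldsymbol{\varepsilon})^2 + \mu\,\boldsymbol{\varepsilon} \cdot\cdot\, \boldsymbol{\varepsilon} \geq 0.$$

Let us represent ε as the sum of its ball and deviator terms:

$$\boldsymbol{\varepsilon} = \tfrac{1}{3}\mathbf{E} \operatorname{tr} \boldsymbol{\varepsilon} + \operatorname{dev} \boldsymbol{\varepsilon}.$$

Because

$$\boldsymbol{\varepsilon} \cdot\cdot\, \boldsymbol{\varepsilon} = \tfrac{1}{3}(\operatorname{tr} \boldsymbol{\varepsilon})^2 + \operatorname{dev} \boldsymbol{\varepsilon} \cdot\cdot\, \operatorname{dev} \boldsymbol{\varepsilon},$$

we obtain

$$W(\boldsymbol{\varepsilon}) = \tfrac{1}{6}(3\lambda + 2\mu)(\operatorname{tr} \boldsymbol{\varepsilon})^2 + \mu \operatorname{dev} \boldsymbol{\varepsilon} \cdot\cdot\, \operatorname{dev} \boldsymbol{\varepsilon}.$$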


But tr ε and dev ε are independent quantities, and from (6.27) it follows that

$$3\lambda + 2\mu > 0, \qquad \mu > 0. \tag{6.28}$$

Hence the bulk modulus k = λ + (2/3)µ and shear modulus G = µ are positive. A consequence of (6.28) is explored in

Exercise 6.4. Demonstrate that E > 0 and −1 < ν ≤ 1/2.

For most engineering materials, we have ν > 0. The values of ν for rubbers are close to 1/2; materials like re-entrant foams can have negative values of ν. In technical reference books, the reader can find data for various materials. For regular steels, the values of E are about 2 × 10¹¹ Pa, and ν lies in the range 0.25–0.33.
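The ball-deviator form of W can be confirmed numerically. The sketch below uses a hypothetical strain tensor and hypothetical moduli; it evaluates W both from ½λ(tr ε)² + µ ε··ε and from the split form, and also shows that W stays positive for a negative λ as long as 3λ + 2µ > 0 and µ > 0. For symmetric tensors the double-dot product does not depend on the index order convention.

```python
import math

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def ddot(a, b):
    """Double-dot product a_ij b_ji of two second-order tensors."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))

def dev(a):
    """Deviator: a minus one third of its trace times the unit tensor."""
    t = trace(a) / 3.0
    return [[a[i][j] - (t if i == j else 0.0) for j in range(3)] for i in range(3)]

def W_direct(eps, lam, mu):
    return 0.5 * lam * trace(eps) ** 2 + mu * ddot(eps, eps)

def W_split(eps, lam, mu):
    return (3.0 * lam + 2.0 * mu) / 6.0 * trace(eps) ** 2 + mu * ddot(dev(eps), dev(eps))

# Hypothetical symmetric strain tensor.
eps = [[1.0e-3, 2.0e-4, 0.0],
       [2.0e-4, -5.0e-4, 1.0e-4],
       [0.0, 1.0e-4, 3.0e-4]]

print(W_direct(eps, 2.0, 1.0), W_split(eps, 2.0, 1.0))
# lam < 0 is admissible as long as 3*lam + 2*mu > 0 and mu > 0.
print(W_direct(eps, -0.5, 1.0) > 0.0)
```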

Exercise 6.5. For an isotropic body, express ε and W in terms of σ.

6.5 Equilibrium Equations in Displacements

We derived the equation of motion (6.19) and the equilibrium equation (6.11). Equation (6.11) is written in terms of stresses. In component form, (6.11) contains three equations in six independent unknowns σij. Using the definition of the strain tensor and Hooke’s law, we can transform (6.19) and (6.11) to a system of three simultaneous equations with respect to the components of the displacement vector u. So the systems (6.19) and (6.11) reduce to systems involving three equations in the three unknown components of u. This corresponds to the common viewpoint that a well-posed problem should contain the same number of equations as unknowns.

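For simplicity, we derive the equilibrium equations for an isotropic homogeneous material defined by (6.26). Suppose λ and µ are constants. First we derive ∇ · σ in terms of u:

$$\nabla \cdot \boldsymbol{\sigma} = \nabla \cdot (\lambda \mathbf{E} \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu \boldsymbol{\varepsilon}) = \lambda \nabla \operatorname{tr} \boldsymbol{\varepsilon} + \mu \nabla \cdot \nabla \mathbf{u} + \mu \nabla \cdot (\nabla \mathbf{u})^T.$$

Because

$$\operatorname{tr} \boldsymbol{\varepsilon} = \nabla \cdot \mathbf{u}$$

and

$$\nabla \cdot (\nabla \mathbf{u})^T = \nabla \nabla \cdot \mathbf{u},$$

we get

$$\nabla \cdot \boldsymbol{\sigma} = (\lambda + \mu) \nabla \nabla \cdot \mathbf{u} + \mu \nabla \cdot \nabla \mathbf{u}.$$

So the equilibrium equation takes the form

$$(\lambda + \mu) \nabla \nabla \cdot \mathbf{u} + \mu \nabla \cdot \nabla \mathbf{u} + \rho \mathbf{f} = 0. \tag{6.29}$$

Exercise 6.6. Show that ∇ · (∇u)^T = ∇∇ · u.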

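In Cartesian coordinates, equation (6.29) is

$$(\lambda + \mu) \frac{\partial^2 u_i}{\partial x_k \partial x_i} + \mu \frac{\partial^2 u_k}{\partial x_i \partial x_i} + \rho f_k = 0 \quad (k = 1, 2, 3). \tag{6.30}$$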

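That is,

$$\begin{aligned}
(\lambda + \mu) \frac{\partial}{\partial x_1}\left( \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} \right) + \mu \Delta u_1 + \rho f_1 &= 0, \\
(\lambda + \mu) \frac{\partial}{\partial x_2}\left( \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} \right) + \mu \Delta u_2 + \rho f_2 &= 0, \\
(\lambda + \mu) \frac{\partial}{\partial x_3}\left( \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} \right) + \mu \Delta u_3 + \rho f_3 &= 0,
\end{aligned} \tag{6.31}$$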

where

$$\Delta = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}.$$
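The Cartesian form can be exercised on a simple displacement field. The following sketch, which is an illustration and not part of the original text, approximates the second derivatives in (6.30) by central differences and evaluates the displacement terms of the equation; the field `u_example`, the evaluation point, and the moduli are hypothetical. For quadratic fields the difference formulas are exact up to rounding.

```python
def second_partial(func, x, a, b, h=1e-2):
    """Central-difference approximation of d^2 func / (dx_a dx_b) at the point x."""
    def shifted(da, db):
        y = list(x)
        y[a] += da
        y[b] += db
        return func(y)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)

def navier_terms(u, x, lam, mu, h=1e-2):
    """Return (lam+mu) d2u_i/(dx_k dx_i) + mu d2u_k/(dx_i dx_i) for k = 0, 1, 2."""
    result = []
    for k in range(3):
        grad_div = sum(second_partial(lambda y, i=i: u(y)[i], x, k, i, h) for i in range(3))
        laplace = sum(second_partial(lambda y, k=k: u(y)[k], x, i, i, h) for i in range(3))
        result.append((lam + mu) * grad_div + mu * laplace)
    return result

def u_example(x):
    """Hypothetical displacement field u = (x_1^2 / 2, 0, 0)."""
    return [0.5 * x[0] ** 2, 0.0, 0.0]

terms = navier_terms(u_example, [0.3, 0.2, 0.1], lam=2.0, mu=1.0)
print(terms)  # close to [lam + 2*mu, 0, 0]
```

For this field, equilibrium therefore requires the body force ρf = −(λ + 2µ) i1.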

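In curvilinear coordinates, the component representation of (6.29) is

$$(\lambda + \mu) \frac{\partial \varepsilon}{\partial q^k} + \mu g^{mn} \nabla_m \nabla_n u_k + \rho f_k = 0 \quad (k = 1, 2, 3) \tag{6.32}$$

where

$$\varepsilon = \operatorname{tr} \boldsymbol{\varepsilon} = \frac{1}{\sqrt{g}} \frac{\partial(\sqrt{g}\, g^{mn} u_n)}{\partial q^m}.$$

Exercise 6.7. Derive (6.32).

See Appendix A for the corresponding relations in cylindrical and spherical coordinates.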

Exercise 6.8. Write down the equations of motion in displacements.


6.6 Boundary Conditions and Boundary Value Problems

We have derived the equilibrium equations. To ensure uniqueness of solution, we must supplement the equations with certain conditions on the boundary of the domain where the equations hold. We first pose the main equilibrium problems for elastic bodies. Then we touch on the existence-uniqueness question.

Physical intuition tells us that at a boundary point we should prescribe either a displacement vector or a stress vector defined by the contact load. Prescribing both at the same point is impossible. This will be confirmed later. In elasticity, there are three main boundary value problems. Two were mentioned above: one with given displacements, one with given contact forces. The third is a mixed problem with some combination of displacements and contact loads given on the boundary.

In the first boundary value problem of elasticity, we supplement the equilibrium equations in displacements (6.29), or their other forms (6.30)–(6.32), with a given displacement field u0 on the whole boundary Σ:

$$\mathbf{u}\big|_{\Sigma} = \mathbf{u}_0. \tag{6.33}$$

This is known as the kinematic boundary condition.

In the second boundary value problem, we supplement (6.29) with a prescribed contact load t0 on Σ:

$$\mathbf{n} \cdot \boldsymbol{\sigma}\big|_{\Sigma} = \mathbf{t}_0. \tag{6.34}$$

This is known as a static boundary condition. Since no point of the body is fixed, the body can move freely in space. With this condition, the static boundary value problem is well-posed only if the load acting on the body is self-balanced; this means that the resultant force and resultant moment of all the external forces acting on the body must vanish.

The reader familiar with partial differential equations may recognize that these types of boundary conditions are analogous to the Dirichlet and Neumann conditions for Poisson’s equation.

The third boundary value problem requires us to supplement (6.29) with mixed boundary conditions. That is, we must assign kinematic conditions on some portion Σ1 of the boundary, and static conditions t0 on the remainder Σ2:

$$\mathbf{u}\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{t}_0. \tag{6.35}$$

There exist other “mixed problems” of elasticity in which, at each boundary point, we assign three conditions that combine the given displacements and loads in some fashion. Moreover, on the boundary we can combine conditions involving contact with other elastic or inelastic bodies or media. As a rule, such problems require consideration of well-posedness, although engineering intuition commonly permits investigators to propose meaningful boundary conditions of that type. An example of a meaningless problem is the problem with simultaneously given normal displacements and normal stresses on the boundary. These cannot be specified independently.

Henceforth we will assume V is a bounded volume whose boundary Σ is sufficiently regular that we can apply the technique of integration by parts.

6.7 Equilibrium Equations in Stresses

A boundary value problem in displacements involves finding a displacement field satisfying the equilibrium equations (6.29) and one of the sets of boundary conditions (6.33), (6.34), or (6.35). It is assumed that the stress tensor in (6.34) or (6.35) is expressed in terms of u by (6.26) or (6.16). When the stresses are prescribed over the whole boundary, we can try finding the unknown σ via (6.11) and (6.34). However, the solution for σ is not unique. We must bring in other equations that take into account Hooke’s law. We can do this as follows. Suppose we have found a solution σ of equations (6.11) supplemented by (6.34). By σ, through Hooke’s law (6.26), we define ε. Calculating the trace of equation (6.26), we get

$$\operatorname{tr} \boldsymbol{\varepsilon} = \frac{1}{3\lambda + 2\mu} \operatorname{tr} \boldsymbol{\sigma}.$$

Substituting this into (6.26), we have

$$\boldsymbol{\varepsilon} = \frac{1}{2\mu}\left[ \boldsymbol{\sigma} - \frac{\lambda}{3\lambda + 2\mu} \mathbf{E} \operatorname{tr} \boldsymbol{\sigma} \right] = \frac{1}{2\mu}\left[ \boldsymbol{\sigma} - \frac{\nu}{1 + \nu} \mathbf{E} \operatorname{tr} \boldsymbol{\sigma} \right]. \tag{6.36}$$
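A quick numerical round trip illustrates that (6.36) inverts (6.26). The sketch below is an illustration under assumed values: the strain tensor and the moduli are hypothetical, and the function names are chosen here for readability.

```python
import math

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def stress_from_strain(eps, lam, mu):
    """Hooke's law (6.26): sigma = lam * E * tr(eps) + 2 * mu * eps."""
    t = trace(eps)
    return [[lam * t * (1.0 if i == j else 0.0) + 2.0 * mu * eps[i][j]
             for j in range(3)] for i in range(3)]

def strain_from_stress(sig, lam, mu):
    """Inverse relation (6.36) written with Lame's moduli."""
    t = trace(sig)
    c = lam / (3.0 * lam + 2.0 * mu)
    return [[(sig[i][j] - c * t * (1.0 if i == j else 0.0)) / (2.0 * mu)
             for j in range(3)] for i in range(3)]

lam, mu = 2.0, 1.0                      # hypothetical moduli
eps = [[1.0e-3, 2.0e-4, 0.0],
       [2.0e-4, -5.0e-4, 1.0e-4],
       [0.0, 1.0e-4, 3.0e-4]]           # hypothetical strain tensor

sig = stress_from_strain(eps, lam, mu)
eps_back = strain_from_stress(sig, lam, mu)
print(eps_back)

# The two coefficients appearing in (6.36) coincide.
nu = lam / (2.0 * (lam + mu))
print(nu / (1.0 + nu), lam / (3.0 * lam + 2.0 * mu))
```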

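But ε is a genuine strain tensor, i.e., a displacement field can be recovered from it, if and only if the compatibility equation (6.17) holds:

$$\nabla \times (\nabla \times \boldsymbol{\varepsilon})^T = 0.$$

Substituting (6.36) into the compatibility equation, we get the equation in terms of σ:

$$\nabla \times (\nabla \times \boldsymbol{\varepsilon})^T = \frac{1}{2\mu}\left[ \nabla \times (\nabla \times \boldsymbol{\sigma})^T - \frac{\nu}{1 + \nu} \nabla \times \big(\nabla \times (\mathbf{E} \operatorname{tr} \boldsymbol{\sigma})\big)^T \right] = 0.$$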


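Omitting cumbersome calculations, we can reduce this to the Beltrami–Michell equations

$$\nabla \cdot \nabla \boldsymbol{\sigma} + \frac{1}{1 + \nu} \nabla \nabla \operatorname{tr} \boldsymbol{\sigma} + \rho \nabla \mathbf{f} + \rho (\nabla \mathbf{f})^T + \mathbf{E} \frac{\nu}{1 - \nu} \rho \nabla \cdot \mathbf{f} = 0. \tag{6.37}$$

These equations, being supplementary to (6.11) and (6.34), present us with the complete setup of the equilibrium problem in stresses. Note that this problem has a solution only if the set of external forces, consisting of the body forces and those acting on the boundary, is self-balanced.

When body forces are absent, (6.37) reduces to

$$\nabla \cdot \nabla \boldsymbol{\sigma} + \frac{1}{1 + \nu} \nabla \nabla \operatorname{tr} \boldsymbol{\sigma} = 0. \tag{6.38}$$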

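In Cartesian coordinates, (6.38) implies

$$\begin{aligned}
\Delta \sigma_{11} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_1^2} &= 0, & \Delta \sigma_{12} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_1 \partial x_2} &= 0, \\
\Delta \sigma_{22} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_2^2} &= 0, & \Delta \sigma_{23} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_2 \partial x_3} &= 0, \\
\Delta \sigma_{33} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_3^2} &= 0, & \Delta \sigma_{13} + \frac{1}{1 + \nu} \frac{\partial^2 \sigma}{\partial x_1 \partial x_3} &= 0,
\end{aligned} \tag{6.39}$$

where

$$\sigma = \operatorname{tr} \boldsymbol{\sigma} = \sigma_{11} + \sigma_{22} + \sigma_{33}.$$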

To solve the equilibrium problem completely, we first find σ by (6.11), (6.34), and the Beltrami–Michell equations. Then, using Hooke’s law, we find ε. Finally, by Cesàro’s formula (6.18), we find the displacement field u that is uniquely defined up to a rigid motion of the body.

We should note that, from the standpoint of classical mathematical physics, the structure of the boundary value problem in stresses is a bit strange. The unknowns are the six components of σ, satisfying nine equations: three of first order and six of second order with respect to the components of σ. These are supplemented with three boundary conditions. By its derivation, this boundary value problem is equivalent to the boundary value problem in displacements. The theory of such “strange” boundary value problems is far from complete.


6.8 Uniqueness of Solution for the Boundary Value Problems of Elasticity

Uniqueness of solution to a boundary value problem is an important indication of its well-posedness. A uniqueness theorem for elasticity problems was established by Gustav R. Kirchhoff (1824–1887).

We will prove uniqueness of solution for equations (6.29) supplemented with boundary conditions (6.35), which include conditions (6.33) as a particular case.

Suppose, to the contrary, that there exist two solutions u1 and u2 that satisfy (6.29) and (6.35). We denote the corresponding stress tensors by σ1 and σ2, respectively. So the following two sets of equations hold:

$$\nabla \cdot \boldsymbol{\sigma}_1 + \rho \mathbf{f} = 0 \ \text{in } V, \qquad \mathbf{u}_1\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}_1\big|_{\Sigma_2} = \mathbf{t}_0,$$

$$\nabla \cdot \boldsymbol{\sigma}_2 + \rho \mathbf{f} = 0 \ \text{in } V, \qquad \mathbf{u}_2\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}_2\big|_{\Sigma_2} = \mathbf{t}_0.$$

Consider the difference u = u1 − u2 and its corresponding stress field σ = σ1 − σ2. Subtracting the equations for u1 and u2, we see that u is a solution of the following equilibrium problem:

$$\nabla \cdot \boldsymbol{\sigma} = 0 \ \text{in } V, \qquad \mathbf{u}\big|_{\Sigma_1} = 0, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}\big|_{\Sigma_2} = 0.$$

Dot-multiplying the equilibrium equation by u and integrating over V, we get

$$\int_V (\nabla \cdot \boldsymbol{\sigma}) \cdot \mathbf{u} \, dV = 0.$$

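Using the Gauss–Ostrogradsky theorem, we transform this to the equation

$$-\int_V \boldsymbol{\sigma} \cdot\cdot\, (\nabla \mathbf{u})^T \, dV + \int_{\Sigma} \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \mathbf{u} \, d\Sigma = 0.$$

At each point of Σ, one of the conditions u = 0 or n · σ = 0 holds. So the last surface integral is zero. The tensor σ is symmetric. So

$$\boldsymbol{\sigma} \cdot\cdot\, (\nabla \mathbf{u})^T = \boldsymbol{\sigma} \cdot\cdot\, (\nabla \mathbf{u}) = \boldsymbol{\sigma} \cdot\cdot\, \boldsymbol{\varepsilon}.$$

This allows us to represent the volume integral as follows:

$$-\int_V \boldsymbol{\sigma} \cdot\cdot\, \boldsymbol{\varepsilon} \, dV = -2 \int_V W(\boldsymbol{\varepsilon}) \, dV = 0.$$

Recall that W(ε) is positive definite. Hence the equality of the integral to zero implies that its integrand vanishes everywhere in V: W(ε) = 0. By positiveness of W, it follows that

$$\boldsymbol{\varepsilon} = 0. \tag{6.40}$$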


By Cesàro’s formula (6.18), when ε(u) = 0, the displacement vector takes the form

$$\mathbf{u} = \mathbf{u}_0 + \boldsymbol{\omega}_0 \times (\mathbf{r} - \mathbf{r}_0), \tag{6.41}$$

where u0, ω0 and r0 are constant vectors. This is a small displacement of the volume V as a rigid body. Because u|Σ1 = 0 and the surface portion Σ1 contains three points that do not lie on one line, both ω0 and u0 must vanish, so we have u = 0 in V. Thus

$$\mathbf{u}_1 = \mathbf{u}_2.$$

We have proved uniqueness of solution for the first and third boundary value problems of elasticity.

For the second problem, the initial steps of the proof remain valid. We find that ε = 0 in V, hence the two solutions to the problem differ by a rigid-body displacement:

$$\mathbf{u}_1 = \mathbf{u}_2 + \mathbf{u}_0 + \boldsymbol{\omega}_0 \times (\mathbf{r} - \mathbf{r}_0).$$

But this time we cannot conclude that u1 = u2. From physics, the situation is clear. The second problem of elasticity describes a body free of geometric restrictions, so its solution must be defined up to a rigid-body displacement. This is what the last formula states.
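That a field of the form (6.41) produces no strain is easy to confirm numerically. The sketch below is an illustration with hypothetical vectors u0, ω0, r0; it forms the displacement gradient by central differences (exact for a linear field up to rounding) and symmetrizes it.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rigid_displacement(r, u0, omega0, r0):
    """Small rigid-body displacement u = u0 + omega0 x (r - r0)."""
    rel = [r[i] - r0[i] for i in range(3)]
    rot = cross(omega0, rel)
    return [u0[i] + rot[i] for i in range(3)]

def strain(u, r, h=1e-3):
    """Linear strain tensor 0.5 * (grad u + (grad u)^T) by central differences."""
    grad = [[0.0] * 3 for _ in range(3)]   # grad[i][j] approximates d u_j / d x_i
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        up, um = u(plus), u(minus)
        for j in range(3):
            grad[i][j] = (up[j] - um[j]) / (2.0 * h)
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)] for i in range(3)]

# Hypothetical constants of the rigid motion.
u0, omega0, r0 = [0.01, -0.02, 0.03], [0.02, 0.01, -0.015], [1.0, 0.5, -0.5]
field = lambda r: rigid_displacement(r, u0, omega0, r0)

eps = strain(field, [0.3, 0.7, 1.1])
print(max(abs(eps[i][j]) for i in range(3) for j in range(3)))  # close to zero
```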

6.9 Betti’s Reciprocity Theorem

The solutions to the equilibrium problems for an elastic body under two different loads obey Betti’s reciprocal work theorem.

Let us consider a body under the action of contact forces t′ and body forces f′, and the same body under the action of another pair of external forces t′′ and f′′. The displacements and other quantities for the two corresponding problems will be denoted similarly, using primes and double primes, respectively. Now we have two solutions of two different second boundary value problems for the elastic body.

Theorem 6.4. The solutions u′ and u′′ of the second boundary value problem of elasticity that correspond to the respective loading pairs t′, f′ and t′′, f′′ satisfy the following relation:

$$\int_V \rho \mathbf{f}' \cdot \mathbf{u}'' \, dV + \int_{\Sigma} \mathbf{t}' \cdot \mathbf{u}'' \, d\Sigma = \int_V \rho \mathbf{f}'' \cdot \mathbf{u}' \, dV + \int_{\Sigma} \mathbf{t}'' \cdot \mathbf{u}' \, d\Sigma. \tag{6.42}$$

This equality is Betti’s theorem. First derived for beam theory and later extended to many branches of physics, it is used to derive the equations of the boundary finite element methods. Each of the expressions on the left and right represents the work of a load over a displacement. So we can give a mechanical formulation of Betti’s theorem.

The work of the first system of forces over the displacements of the body due to the action of the second system of forces is equal to the work of the second system of forces over the displacements due to the action of the first system of forces.

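Proof. The vectors u′ and u′′ respectively satisfy the following equilibrium equations and boundary conditions:

$$\nabla \cdot \boldsymbol{\sigma}' + \rho \mathbf{f}' = 0 \ \text{in } V, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}'\big|_{\Sigma} = \mathbf{t}';$$

$$\nabla \cdot \boldsymbol{\sigma}'' + \rho \mathbf{f}'' = 0 \ \text{in } V, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}''\big|_{\Sigma} = \mathbf{t}''.$$

Let us consider the left-hand side of (6.42). Writing the forces in terms of the stresses, we get

$$\int_V \rho \mathbf{f}' \cdot \mathbf{u}'' \, dV + \int_{\Sigma} \mathbf{t}' \cdot \mathbf{u}'' \, d\Sigma = -\int_V (\nabla \cdot \boldsymbol{\sigma}') \cdot \mathbf{u}'' \, dV + \int_{\Sigma} \mathbf{n} \cdot \boldsymbol{\sigma}' \cdot \mathbf{u}'' \, d\Sigma.$$

Applying the Gauss–Ostrogradsky theorem to the surface integral, we get

$$\begin{aligned}
-\int_V (\nabla \cdot \boldsymbol{\sigma}') \cdot \mathbf{u}'' \, dV + \int_{\Sigma} \mathbf{n} \cdot \boldsymbol{\sigma}' \cdot \mathbf{u}'' \, d\Sigma
&= \int_V \left[ -(\nabla \cdot \boldsymbol{\sigma}') \cdot \mathbf{u}'' + \nabla \cdot (\boldsymbol{\sigma}' \cdot \mathbf{u}'') \right] dV \\
&= \int_V \boldsymbol{\sigma}' \cdot\cdot\, (\nabla \mathbf{u}'')^T \, dV \\
&= \int_V \boldsymbol{\sigma}' \cdot\cdot\, \boldsymbol{\varepsilon}'' \, dV.
\end{aligned}$$

Similarly,

$$\int_V \rho \mathbf{f}'' \cdot \mathbf{u}' \, dV + \int_{\Sigma} \mathbf{t}'' \cdot \mathbf{u}' \, d\Sigma = \int_V \boldsymbol{\sigma}'' \cdot\cdot\, \boldsymbol{\varepsilon}' \, dV.$$

So the proof reduces to verification of the equality

$$\int_V \boldsymbol{\sigma}' \cdot\cdot\, \boldsymbol{\varepsilon}'' \, dV = \int_V \boldsymbol{\sigma}'' \cdot\cdot\, \boldsymbol{\varepsilon}' \, dV.$$

Using Hooke’s law and recalling the symmetry properties of C, we have

$$\boldsymbol{\sigma}' \cdot\cdot\, \boldsymbol{\varepsilon}'' = (\mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}') \cdot\cdot\, \boldsymbol{\varepsilon}'' = \boldsymbol{\varepsilon}'' \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}' = \boldsymbol{\varepsilon}' \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}'' = \boldsymbol{\sigma}'' \cdot\cdot\, \boldsymbol{\varepsilon}'.$$

This completes the proof.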
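A discrete analogue may help to see what the theorem says. The sketch below is an assumption-laden illustration, not the continuum statement itself: it replaces the elastic body by two springs in series (stiffnesses k1, k2, left end fixed), whose symmetric stiffness matrix K plays the role of C. Two hypothetical load vectors are applied, and the reciprocal works are compared exactly with rational arithmetic.

```python
from fractions import Fraction

def solve_2x2(K, f):
    """Solve K u = f for a 2x2 matrix by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(f[0] * K[1][1] - K[0][1] * f[1]) / det,
            (K[0][0] * f[1] - K[1][0] * f[0]) / det]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical spring stiffnesses; node 0 sits between the springs, node 1 at the free end.
k1, k2 = Fraction(2), Fraction(3)
K = [[k1 + k2, -k2],
     [-k2, k2]]

f_first = [Fraction(1), Fraction(2)]     # first system of forces
f_second = [Fraction(-1), Fraction(4)]   # second system of forces

u_first = solve_2x2(K, f_first)
u_second = solve_2x2(K, f_second)

# Work of the first system over the displacements caused by the second, and vice versa.
print(dot(f_first, u_second), dot(f_second, u_first))
```

The equality of the two printed numbers rests on the symmetry of K, just as the continuum proof rests on c_ijmn = c_mnij.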


Exercise 6.9. Let a portion Σ1 of the boundary be fixed: u|Σ1 = 0. Suppose two boundary value problems include this condition and, moreover, that the body is under the action of one of the two systems of forces t′, f′ or t′′, f′′, with t′, t′′ given on Σ2 = Σ \ Σ1. Prove that in this case Betti’s equality takes the form

$$\int_V \rho \mathbf{f}' \cdot \mathbf{u}'' \, dV + \int_{\Sigma_2} \mathbf{t}' \cdot \mathbf{u}'' \, d\Sigma = \int_V \rho \mathbf{f}'' \cdot \mathbf{u}' \, dV + \int_{\Sigma_2} \mathbf{t}'' \cdot \mathbf{u}' \, d\Sigma.$$

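Exercise 6.10. Consider two equilibrium problems for a body under two systems of external forces, as in the previous exercise. This time, however, two distinct displacement fields are prescribed over Σ1:

$$\mathbf{u}'\big|_{\Sigma_1} = \mathbf{a}', \qquad \mathbf{u}''\big|_{\Sigma_1} = \mathbf{a}''. \tag{6.43}$$

As above, denote the solutions of the corresponding equilibrium problems by u′ and u′′. Denote the solutions of the equilibrium problems for a body free of external forces that satisfy the corresponding conditions (6.43) by u′0 and u′′0, respectively. Finally, introduce the differences

$$\tilde{\mathbf{u}}' = \mathbf{u}' - \mathbf{u}'_0, \qquad \tilde{\mathbf{u}}'' = \mathbf{u}'' - \mathbf{u}''_0.$$

Prove that Betti’s equality takes the form

$$\int_V \rho \mathbf{f}' \cdot \tilde{\mathbf{u}}'' \, dV + \int_{\Sigma_2} \mathbf{t}' \cdot \tilde{\mathbf{u}}'' \, d\Sigma = \int_V \rho \mathbf{f}'' \cdot \tilde{\mathbf{u}}' \, dV + \int_{\Sigma_2} \mathbf{t}'' \cdot \tilde{\mathbf{u}}' \, d\Sigma.$$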

6.10 Minimum Total Energy Principle

On a curved surface, a ball in equilibrium takes the lowest position; we call the equilibrium stable. In elementary physics, it is said that at such an equilibrium point, the potential energy of the ball takes a minimum value. A stationary point of potential energy corresponds to an equilibrium position of the ball as well, but such an equilibrium may be unstable. An equilibrium that corresponds to a non-minimum point normally is unstable. The stability result for the potential energy of a particle was first extended to classical mechanics by Lagrange. The minimum potential energy principle is now one of the most important principles of physics, holding also for distributed systems like elastic bodies. However, it is not always straightforward to formulate, since we must find an expression for the potential energy and prove that it takes a minimum value in equilibrium.


As the expression for the energy of an elastic body under load, we propose the total energy

$$E = \int_V W(\boldsymbol{\varepsilon}) \, dV - \int_V \rho \mathbf{f} \cdot \mathbf{u} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \mathbf{u} \, d\Sigma.$$

The first term is the strain energy associated with deformation. If we slowly decrease the external forces acting on the body to zero, the strain energy is the maximum work that the body can produce. The other terms, with negative signs, represent the work of external forces on the displacement field of the body. These terms are analogous to the expression for the work of the gravitational force acting on a moving particle, and are so similar to the gravitational potential that they may be called potential energy terms. Again, we must prove that E really takes its minimum value when the body is in equilibrium. Clearly, we should explain what is meant by “the minimum value of E” and introduce the tools necessary to establish the minimum principle.

The reader is familiar with the definition of a local minimum point for a function in n variables: there should exist a neighborhood of the point in which all the values of the function are no less than its value at the point. For a global minimum point, this value should be no greater than the values of the function at any point. These definitions can be extended to quantities like E, as they take values in the set of real numbers. We could call E a function, but because it depends on the vector function u, it is called a functional. A functional is a correspondence that takes its argument, which can be a whole function as it is here, to at most one real number.

To extend the idea of minimum to functionals, we should treat the notions of independent variable, the domain of such a variable, and the neighborhood of a “point” (which is now a vector function u in the domain). For simplicity, we will assume the domain of E consists of vector functions possessing all continuous derivatives in V up to the second order. Moreover, we suppose that for any u, the kinematic boundary conditions hold. Such vector functions will be called admissible. To define a neighborhood, we must introduce a metric or norm on the set of admissible vector functions. The reader interested in the formalities should consult standard books on the calculus of variations. Our present approach will be essentially that of the pioneers of the subject, which was to obtain results without worrying too much about formal justification. In fact, we can completely avoid the question about neighborhoods, since for the problems under consideration, in the equilibrium state E takes its global minimum value.

Now we will describe the technical tools that permit us to seek the minimum points of a functional. We wish to derive for E some analogue of Fermat’s theorem for a differentiable function in one variable: the first derivative must vanish at a minimum point. Fermat’s theorem extends to the theory of functions in many variables: all the first partial derivatives of the function must vanish. Let us review how this extension is accomplished.

Let f be an ordinary real-valued function of a vector variable, and suppose f takes its minimum at a point x ∈ ℝⁿ. If a is any vector and τ is a real parameter, then f(x + τa) can be regarded as a function of τ that takes its minimum value at τ = 0. Therefore, at τ = 0 its derivative with respect to τ must vanish:

$$\frac{d f(\mathbf{x} + \tau \mathbf{a})}{d\tau} \bigg|_{\tau = 0} = 0. \tag{6.44}$$

At a minimum point x, this must hold for any a. As the reader is aware, the expression on the left is the directional derivative of f at x along the direction of a. Moreover, if we formally set a = dx, then the left member takes the form of the first differential df at point x:

$$df = \frac{d f(\mathbf{x} + \tau \, d\mathbf{x})}{d\tau} \bigg|_{\tau = 0} = \nabla f \big|_{\mathbf{x}} \cdot d\mathbf{x}.$$

Putting a = i_k in (6.44), we obtain Fermat’s theorem for a function in n variables: the partial derivative of f with respect to x_k at x must vanish if ∇f|x exists and x is a minimum point of f. We emphasize that this is merely a necessary condition. If at some point x equation (6.44) holds, then x may not be a minimum (or a maximum) point. Any x satisfying (6.44) is called a stationary point of f. It could be a minimum point, a maximum point, or a saddle point.
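Condition (6.44) is easy to probe numerically. The following sketch uses two hypothetical functions chosen for illustration: `f_bowl` has a minimum at (1, −3), while `f_saddle` has a stationary point at the origin that is not a minimum. The derivative with respect to τ is approximated by a central difference.

```python
def directional_derivative(func, x, a, h=1e-6):
    """Central-difference approximation of d/dtau func(x + tau * a) at tau = 0."""
    plus = [xi + h * ai for xi, ai in zip(x, a)]
    minus = [xi - h * ai for xi, ai in zip(x, a)]
    return (func(plus) - func(minus)) / (2.0 * h)

def f_bowl(x):
    """Hypothetical function with a minimum at (1, -3)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2

def f_saddle(x):
    """Hypothetical function with a saddle point at the origin."""
    return x[0] ** 2 - x[1] ** 2

directions = [[1.0, 0.0], [0.0, 1.0], [0.6, -0.8]]

# At the minimum point the derivative vanishes along every direction.
print([directional_derivative(f_bowl, [1.0, -3.0], a) for a in directions])
# Away from the minimum it does not.
print(directional_derivative(f_bowl, [0.0, 0.0], [1.0, 0.0]))
# A stationary point need not be a minimum: f_saddle decreases along the second axis.
print([directional_derivative(f_saddle, [0.0, 0.0], a) for a in directions])
print(f_saddle([0.0, 1.0]) < f_saddle([0.0, 0.0]))
```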

This procedure extends to functionals in a straightforward manner. Suppose u is a minimum point of E. Let us fix some δu such that u + τ δu is an admissible displacement for all small τ. Substitute this into E. Then E(u + τ δu) is a function of the real variable τ that takes its minimum value at τ = 0. If the derivative of this function with respect to τ exists, then it vanishes at τ = 0:

$$\delta E = \frac{d}{d\tau} E(\mathbf{u} + \tau \, \delta\mathbf{u}) \bigg|_{\tau = 0} = 0$$

for any δu such that u + τ δu is admissible for any small τ.

We call δE the first variation of the functional E. Its zeros are called stationary points of E; they are analogous to the stationary points of an ordinary function f. The equation we obtained is a necessary condition for u to be a minimum point of E.

We assumed the displacements u + τ δu were admissible for all small τ. Clearly, if all derivatives of u and δu up to order two are continuous in V, the same is true of u + τ δu. We required u to satisfy the kinematic restrictions on Σ1. Hence if δu satisfies

$$\delta\mathbf{u}\big|_{\Sigma_1} = 0,$$

then u + τ δu also satisfies the kinematic restrictions for any τ. In this case δu is called a virtual displacement.

We note that the definition of δE can be applied to any sufficiently smooth E. (We leave this on an intuitive level, as a formal definition would require a background in functional analysis.) It is called the Gâteaux derivative of the functional. In the notation δE, we regard the symbol δ as some action on E given by a formula similar to the one used to find the differential of a function. By δu we denote a virtual displacement; this could be denoted by v as well, but we would have to keep clarifying that it is a virtual displacement. That is, in this case the symbol δ is merely a notation; it is not an operation and is used for historical reasons. When we apply the operation δ to an ordinary function, it coincides with the operation of taking its first differential. The differentials of variables in this case are written with the use of the symbol δ instead of d; for example, we have δ(x²) = 2x δx.

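Using the definition, let us find δE for our problem:

$$\delta E = \int_V \delta W(\boldsymbol{\varepsilon}) \, dV - \int_V \rho \mathbf{f} \cdot \delta\mathbf{u} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma.$$

Calculating δW in the integrand and recalling that for intermediate steps we can use the formulas for the first differential, we get

$$\delta W(\boldsymbol{\varepsilon}) = \tfrac{1}{2} \delta(\boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}) = \boldsymbol{\varepsilon} \cdot\cdot\, \mathbf{C} \cdot\cdot\, \delta\boldsymbol{\varepsilon} = \boldsymbol{\sigma} \cdot\cdot\, \delta\boldsymbol{\varepsilon},$$

where

$$\delta\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(\delta\mathbf{u}) = \tfrac{1}{2}\left( \nabla \delta\mathbf{u} + (\nabla \delta\mathbf{u})^T \right).$$

It follows that

$$\delta E = \int_V \boldsymbol{\sigma} \cdot\cdot\, \delta\boldsymbol{\varepsilon} \, dV - \int_V \rho \mathbf{f} \cdot \delta\mathbf{u} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma.$$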


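Now we wish to find a relationship between a solution to the equilibrium problem for an elastic body and the minimum problem for E. We start with the following theorem.

Theorem 6.5. A stationary point u of E on the set of admissible displacements satisfies the equilibrium equations of the elastic body in the volume V and the boundary condition n · σ|Σ2 = t0, and conversely.

Proof. First we prove that u, a solution of the equilibrium problem for an elastic body under load, is a stationary point of E; that is,

$$\delta E = 0$$

when u is a solution and δu is an arbitrary virtual displacement. Assume u satisfies

$$\nabla \cdot \boldsymbol{\sigma} + \rho \mathbf{f} = 0 \ \text{in } V, \qquad \mathbf{u}\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{t}_0. \tag{6.45}$$

Dot-multiply the equilibrium equation by an admissible δu and integrate over V. Then apply the Gauss–Ostrogradsky theorem:

$$\begin{aligned}
0 &= \int_V \left[ (\nabla \cdot \boldsymbol{\sigma}) \cdot \delta\mathbf{u} + \rho \mathbf{f} \cdot \delta\mathbf{u} \right] dV \\
&= \int_V \left[ -\boldsymbol{\sigma} \cdot\cdot\, (\nabla \delta\mathbf{u})^T + \rho \mathbf{f} \cdot \delta\mathbf{u} \right] dV + \int_{\Sigma} \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \delta\mathbf{u} \, d\Sigma \\
&= \int_V \left[ -\boldsymbol{\sigma} \cdot\cdot\, \delta\boldsymbol{\varepsilon} + \rho \mathbf{f} \cdot \delta\mathbf{u} \right] dV + \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma \\
&= -\delta E.
\end{aligned}$$

We used the fact that δu = 0 on Σ1 and n · σ = t0 on Σ2. Hence u is a stationary point of E.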

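Now we prove the converse statement. Let u be a stationary point of E, so that it satisfies the equation δE = 0 for any sufficiently smooth virtual displacement δu, and let u|Σ1 = u0. We will show that u satisfies the equilibrium equations in V as well as the contact condition n · σ = t0 on Σ2. Indeed, doing the above calculations in reverse order, we get

$$\begin{aligned}
\delta E &= \int_V \boldsymbol{\sigma} \cdot\cdot\, \delta\boldsymbol{\varepsilon} \, dV - \int_V \rho \mathbf{f} \cdot \delta\mathbf{u} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma \\
&= \int_V \boldsymbol{\sigma} \cdot\cdot\, (\nabla \delta\mathbf{u})^T \, dV - \int_V \rho \mathbf{f} \cdot \delta\mathbf{u} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma \\
&= -\int_V \left[ (\nabla \cdot \boldsymbol{\sigma}) \cdot \delta\mathbf{u} + \rho \mathbf{f} \cdot \delta\mathbf{u} \right] dV + \int_{\Sigma} \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \delta\mathbf{u} \, d\Sigma - \int_{\Sigma_2} \mathbf{t}_0 \cdot \delta\mathbf{u} \, d\Sigma \\
&= -\int_V \left[ \nabla \cdot \boldsymbol{\sigma} + \rho \mathbf{f} \right] \cdot \delta\mathbf{u} \, dV + \int_{\Sigma_1} \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \delta\mathbf{u} \, d\Sigma - \int_{\Sigma_2} \left[ \mathbf{t}_0 - \mathbf{n} \cdot \boldsymbol{\sigma} \right] \cdot \delta\mathbf{u} \, d\Sigma.
\end{aligned}$$

Because δu = 0 on Σ1, we finally obtain

$$\delta E = -\int_V \left[ \nabla \cdot \boldsymbol{\sigma} + \rho \mathbf{f} \right] \cdot \delta\mathbf{u} \, dV - \int_{\Sigma_2} \left[ \mathbf{t}_0 - \mathbf{n} \cdot \boldsymbol{\sigma} \right] \cdot \delta\mathbf{u} \, d\Sigma = 0. \tag{6.46}$$

We assumed u to be smooth enough that the above integrands are continuous. As admissible δu is arbitrary, from this equality it follows that

$$\nabla \cdot \boldsymbol{\sigma} + \rho \mathbf{f} = 0 \ \text{in } V, \qquad \mathbf{n} \cdot \boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{t}_0. \tag{6.47}$$

The justification of this last step requires standard material from the calculus of variations. We postpone this discussion until the end of the present section.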

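Now we would like to prove

Theorem 6.6. Let u be a solution of the boundary problem (6.45). Then it is a point of global minimum of E.

Proof. The statement is a consequence of the fact that W is a positive quadratic form with respect to ε. Indeed, let v be an arbitrary admissible displacement so that it is smooth enough and satisfies the kinematic condition v|Σ1 = u0. Consider the difference

$$\Delta E = E(\mathbf{v}) - E(\mathbf{u}).$$

We have

$$\begin{aligned}
\Delta E &= \int_V W(\boldsymbol{\varepsilon}(\mathbf{v})) \, dV - \int_V \rho \mathbf{f} \cdot \mathbf{v} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \mathbf{v} \, d\Sigma \\
&\quad - \int_V W(\boldsymbol{\varepsilon}(\mathbf{u})) \, dV + \int_V \rho \mathbf{f} \cdot \mathbf{u} \, dV + \int_{\Sigma_2} \mathbf{t}_0 \cdot \mathbf{u} \, d\Sigma \\
&= \int_V \left[ W(\boldsymbol{\varepsilon}(\mathbf{v})) - W(\boldsymbol{\varepsilon}(\mathbf{u})) \right] dV - \int_V \rho \mathbf{f} \cdot (\mathbf{v} - \mathbf{u}) \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot (\mathbf{v} - \mathbf{u}) \, d\Sigma.
\end{aligned}$$

Let w = v − u. Because u and v coincide on Σ1, we have w|Σ1 = 0. Next,

$$\begin{aligned}
2 \left[ W(\boldsymbol{\varepsilon}(\mathbf{v})) - W(\boldsymbol{\varepsilon}(\mathbf{u})) \right] &= \boldsymbol{\varepsilon}(\mathbf{v}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{v}) - \boldsymbol{\varepsilon}(\mathbf{u}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{u}) \\
&= \boldsymbol{\varepsilon}(\mathbf{w}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{w}) + 2 \boldsymbol{\varepsilon}(\mathbf{u}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{w}).
\end{aligned}$$

Therefore

$$\begin{aligned}
\Delta E &= \frac{1}{2} \int_V \boldsymbol{\varepsilon}(\mathbf{w}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{w}) \, dV \\
&\quad + \int_V \boldsymbol{\varepsilon}(\mathbf{u}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{w}) \, dV - \int_V \rho \mathbf{f} \cdot \mathbf{w} \, dV - \int_{\Sigma_2} \mathbf{t}_0 \cdot \mathbf{w} \, d\Sigma.
\end{aligned}$$

The sum of the terms in the second line constitutes δE at u with δu = w, which is a virtual displacement, so this sum is zero. Thus we have

$$\Delta E = \frac{1}{2} \int_V \boldsymbol{\varepsilon}(\mathbf{w}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{w}) \, dV = \int_V W(\boldsymbol{\varepsilon}(\mathbf{w})) \, dV. \tag{6.48}$$

As W is a positive definite form, ∆E ≥ 0 for any admissible v. Hence u is a global minimizer of E.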

The two preceding theorems can be summarized as Lagrange’s variational principle.

Theorem 6.7. The solution of the equilibrium problem for an elastic body is equivalent to the problem of minimizing the total energy functional E over the set of all kinematically admissible smooth displacement fields.

To complete the proof of Theorem 6.5, we must show how to obtain (6.47) from (6.46). It will suffice to do this for one of the component equations. Denoting the kth components of the expressions in the brackets in (6.46) by F and f respectively, the kth component of the vector δu by the scalar δu, and setting the remaining components of the vector δu to zero, we reduce equation (6.46) to

$$\int_V F \, \delta u \, dV + \int_{\Sigma_2} f \, \delta u \, d\Sigma = 0. \tag{6.49}$$

Equation (6.47) follows from the next theorem.

Theorem 6.8. Let F be continuous in V and let f be continuous on Σ2. Suppose (6.49) holds for all sufficiently smooth functions δu that vanish on Σ1. Then

$$F = 0 \ \text{in } V, \qquad f = 0 \ \text{on } \Sigma_2.$$

Proof. The proof is done in two steps. We begin by restricting the set of all admissible δu to those that vanish on the whole boundary Σ. Then (6.49) becomes

$$\int_V F \, \delta u \, dV = 0.$$

Suppose, contrary to the theorem statement, that F(x∗) = a ≠ 0 at some point x∗. Without loss of generality we take x∗ to be an interior point of V and a > 0. By continuity of F there is an open ball B_r with center x∗ and radius r > 0 that lies inside V such that F(x) > a/2 > 0 on B_r. Now if we take a smooth function δu0 that is positive on B_r and zero outside B_r, we get

$$\int_V F \, \delta u_0 \, dV > 0,$$

which contradicts the above equality that must hold for all δu that vanish on the boundary of V. It is clear that such a function δu0 exists; the reader can find examples of “bell-shaped functions” in textbooks on the calculus of variations, cf. [Lebedev and Cloud (2003)]. So F = 0 in V.

The second step is to prove that f = 0 on Σ2. For this we return to (6.49). Because F = 0, the first integral is zero for any admissible δu. So we find that

$$\int_{\Sigma_2} f \, \delta u \, d\Sigma = 0$$

holds for all admissible δu that take arbitrary values on Σ2. But now we are in a position similar to the proof for F. We can repeat that proof, but on a two-dimensional domain Σ2, so f = 0 on Σ2.


A reader familiar with the calculus of variations may recognize the above procedure as leading to one version of the “main lemma” of that subject.

It is worth noting that in the theory of elasticity there are several variational principles. Some are of minimax type; others are of stationary type. We can mention principles that carry the names of Castigliano, Reissner, Tonti, Hamilton, etc.

6.11 Ritz’s Method

The minimum total energy principle is of great physical importance. It is also the basis for introducing the generalized setup of the equilibrium problems, which in turn defines weak solutions in elasticity. These solutions have finite strain energies. The principle is even more important from a practical standpoint. The variational equation δE = 0 is the basis for various numerical methods, including the Ritz and Galerkin methods, the variational finite difference methods, and the finite element methods. Minimality of the total energy on the solution warrants stability of the algorithms of these methods. For details, the reader should consult specialized literature. Here we present Ritz’s method, which spawned the other methods mentioned above.

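The minimum total energy principle states that a solution of the equilibrium problem

\[ \nabla\cdot\boldsymbol{\sigma} + \rho\mathbf{f} = \mathbf{0} \ \text{ in } V, \qquad \mathbf{u}\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{t}_0, \]

expressed in terms of u, is a point of global minimum of the total energy functional

\[ E = \int_V W(\boldsymbol{\varepsilon})\,dV - \int_V \rho\,\mathbf{f}\cdot\mathbf{u}\,dV - \int_{\Sigma_2} \mathbf{t}_0\cdot\mathbf{u}\,d\Sigma \]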

on the set of admissible displacements. The converse statement is also valid, hence by the uniqueness theorem this point of minimum is unique. Walter Ritz (1878–1909) thought that by finding a minimum point on some subset of admissible displacements, we can get an approximation to the solution. The decisive idea was how to select this subset in such a way that solving the approximate minimum problem would be relatively easy. Ritz proposed to minimize E on the set of displacements of the form

\[ \mathbf{u}_N = \sum_{k=1}^{N} u_k\boldsymbol{\varphi}_k + \mathbf{u}^*, \qquad \mathbf{u}^*\big|_{\Sigma_1} = \mathbf{u}_0, \qquad \boldsymbol{\varphi}_k\big|_{\Sigma_1} = \mathbf{0} \tag{6.50} \]


with some fixed ϕk and u∗, by determining the numerical coefficients uk. The first step is to find a sufficiently smooth vector function u∗ that satisfies the condition u∗|Σ1 = u0. Then we choose some set of basis elements ϕk that vanish on Σ1. Clearly, any uN satisfies the kinematic condition uN|Σ1 = u0.

In Ritz’s time, when calculations were done manually, the selection of the basis was extremely important; an engineer needed to find a few coefficients uk in a reasonable time. Now, with powerful computers, the number of basis elements can be large. One requirement for the basis elements is that they constitute a linearly independent set. Another is that using this basis, we can actually approximate a solution with satisfactory precision. We shall not pursue this issue, as it requires a serious excursion into mathematics.

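Let us formulate Ritz’s minimization problem for the Nth approximation:

Minimize E over the set of all uN satisfying (6.50).

That is, find

\[ \tilde{\mathbf{u}}_N = \sum_{k=1}^{N} \tilde{u}_k\boldsymbol{\varphi}_k + \mathbf{u}^* \]

such that

\[ E(\tilde{\mathbf{u}}_N) \le E(\mathbf{u}_N) \quad \text{for any } \mathbf{u}_N \text{ from the set (6.50)}, \]

where

\[ E(\mathbf{u}_N) = \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}_N))\,dV - \int_V \rho\,\mathbf{f}\cdot\mathbf{u}_N\,dV - \int_{\Sigma_2} \mathbf{t}_0\cdot\mathbf{u}_N\,d\Sigma. \]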

Because u∗ is fixed, we should minimize E(uN) on the N-dimensional linear space spanned by the elements ϕ1, . . . , ϕN. Hence we must find the values of the real coefficients u1, . . . , uN. From the minimum theorem on the whole set of admissible displacements, it follows that E(uN) has a minimum point in the spanned space.

Now we need a practical way of finding the coefficients of the minimizer of E. Let us write out the expression for E(uN) in detail. We introduce the following notation:

\[ \langle \mathbf{u}, \mathbf{v} \rangle = \int_V \boldsymbol{\varepsilon}(\mathbf{u}) \cdot\cdot\, \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{v})\,dV. \tag{6.51} \]


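Note that 〈u,v〉 = 〈v,u〉, and that 〈u,u〉 is twice the strain energy for the displacement field u. This form 〈u,v〉 can be considered as an inner product on any linear space of sufficiently smooth u such that u|Σ1 = 0, for which the expression 〈u,v〉 makes sense.

Next we introduce

\[ f_m = \int_V \rho\,\mathbf{f}\cdot\boldsymbol{\varphi}_m\,dV + \int_{\Sigma_2} \mathbf{t}_0\cdot\boldsymbol{\varphi}_m\,d\Sigma - \langle \mathbf{u}^*, \boldsymbol{\varphi}_m \rangle \]

and

\[ F = \int_V \rho\,\mathbf{f}\cdot\mathbf{u}^*\,dV + \int_{\Sigma_2} \mathbf{t}_0\cdot\mathbf{u}^*\,d\Sigma - \frac{1}{2}\langle \mathbf{u}^*, \mathbf{u}^* \rangle. \]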

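In this notation we have

\[ E(\mathbf{u}_N) = \frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N} u_m u_n \langle \boldsymbol{\varphi}_m, \boldsymbol{\varphi}_n \rangle - \sum_{n=1}^{N} f_n u_n - F. \]

So E(uN) is a quadratic function in the N variables un, and we can apply standard tools from calculus.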

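At the point of minimum of E(uN), the simultaneous equations

\[ \frac{\partial E(\mathbf{u}_N)}{\partial u_m} = 0 \qquad (m = 1, \ldots, N) \]

must hold. Writing these as

\[ \sum_{n=1}^{N} u_n \langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_m \rangle = f_m \qquad (m = 1, \ldots, N), \tag{6.52} \]

we have a system of linear algebraic equations. The matrix of the system is

\[ A = \begin{pmatrix}
\langle \boldsymbol{\varphi}_1, \boldsymbol{\varphi}_1 \rangle & \langle \boldsymbol{\varphi}_2, \boldsymbol{\varphi}_1 \rangle & \langle \boldsymbol{\varphi}_3, \boldsymbol{\varphi}_1 \rangle & \cdots & \langle \boldsymbol{\varphi}_N, \boldsymbol{\varphi}_1 \rangle \\
\langle \boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2 \rangle & \langle \boldsymbol{\varphi}_2, \boldsymbol{\varphi}_2 \rangle & \langle \boldsymbol{\varphi}_3, \boldsymbol{\varphi}_2 \rangle & \cdots & \langle \boldsymbol{\varphi}_N, \boldsymbol{\varphi}_2 \rangle \\
\langle \boldsymbol{\varphi}_1, \boldsymbol{\varphi}_3 \rangle & \langle \boldsymbol{\varphi}_2, \boldsymbol{\varphi}_3 \rangle & \langle \boldsymbol{\varphi}_3, \boldsymbol{\varphi}_3 \rangle & \cdots & \langle \boldsymbol{\varphi}_N, \boldsymbol{\varphi}_3 \rangle \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\langle \boldsymbol{\varphi}_1, \boldsymbol{\varphi}_N \rangle & \langle \boldsymbol{\varphi}_2, \boldsymbol{\varphi}_N \rangle & \langle \boldsymbol{\varphi}_3, \boldsymbol{\varphi}_N \rangle & \cdots & \langle \boldsymbol{\varphi}_N, \boldsymbol{\varphi}_N \rangle
\end{pmatrix}. \]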

Its determinant det(A) is Gram’s determinant. In linear algebra, it is shown that for a linearly independent system of elements the Gram determinant is nonzero and conversely. But the elements ϕ1, . . . , ϕN are linearly independent by assumption, hence (6.52) has a unique solution.
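The steps leading to (6.52) can be tried on a one-dimensional analogue. The following is only a sketch under stated assumptions, not part of the original text: the model problem −u″ = f on (0, 1) with u(0) = u(1) = 0, the energy inner product 〈u,v〉 = ∫ u′v′ dx, the polynomial basis ϕk = x^k(1 − x), and the function name `ritz_1d` are all chosen here for illustration.

```python
import numpy as np

def ritz_1d(load, n_basis, n_quad=40):
    """Ritz coefficients for -u'' = load on (0, 1) with u(0) = u(1) = 0.

    Illustrative basis (an assumption): phi_k(x) = x**k * (1 - x), k = 1..n_basis.
    """
    # Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1]
    t, w = np.polynomial.legendre.leggauss(n_quad)
    x = 0.5 * (t + 1.0)
    w = 0.5 * w

    k = np.arange(1, n_basis + 1)[:, None]            # column of exponents
    phi = x**k * (1.0 - x)                            # phi_k at the nodes
    dphi = k * x**(k - 1) * (1.0 - x) - x**k          # phi_k' at the nodes

    gram = (dphi * w) @ dphi.T                        # <phi_m, phi_n>
    rhs = (phi * w) @ load(x)                         # f_m = integral of load * phi_m
    coeffs = np.linalg.solve(gram, rhs)               # 1D analogue of system (6.52)
    return coeffs, gram

# Constant load: the exact solution x(1 - x)/2 lies in the span of phi_1
coeffs, gram = ritz_1d(lambda x: np.ones_like(x), n_basis=3)
```

Because the basis is linearly independent, the Gram matrix is symmetric with nonzero determinant, so the linear solve succeeds; for the constant load the computed coefficients reproduce the exact solution up to rounding.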

Now let us touch on the question of convergence for Ritz’s approximations. Our use of the term “approximations” does not, by itself, answer the question whether uN is really close to the solution.


We shall reformulate the problem. Let u be a solution to the equilibrium problem under consideration. Then the Nth approximation of Ritz’s method minimizes the functional E(uN) − E(u) as well. But by the formula (6.48) in the proof of Theorem 6.6, we have

\[ E(\mathbf{u}_N) - E(\mathbf{u}) = \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}_N - \mathbf{u}))\,dV. \tag{6.53} \]

Because W is a positive definite form, the minimization procedure seems to be convergent if we can approximate any admissible displacement u with uN. The sense in which we should approximate the displacements is given by the above limit E(uN) − E(u) → 0 as N → ∞. So to get an approximation, it suffices to have the set ϕm be complete in the sense that for any admissible v such that v|Σ1 = 0 and any ε > 0, we can find a un such that

\[ \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}_n - \mathbf{v}))\,dV < \varepsilon. \]

In practice we can find complete systems, but proof of completeness is not easy.

Having a complete basis set of ϕm, it seems we could say the following. Because the Nth Ritz approximation is the best approximation of the solution from all the possible approximations, by completeness of the ϕm we immediately obtain convergence of the Ritz approximations uN to u in the sense that

\[ \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}_N - \mathbf{u}))\,dV \to 0 \quad \text{as } N \to \infty. \]

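This argument is, unfortunately, merely plausible. We can show that for Ritz’s approximations the following holds:

\[ \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}_N) - \boldsymbol{\varepsilon}(\mathbf{u}_M))\,dV \to 0 \quad \text{whenever } N, M \to \infty. \tag{6.54} \]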

So uN seems to be a Cauchy sequence in the integral energy sense. Unfortunately, using only classical calculus we cannot conclude that a limit element exists and determine its properties. These issues are discussed in the theory of Sobolev spaces. Engineers who use Ritz’s method should understand that they work with smooth approximations that are ordinary functions, but that the limit functions and convergence questions are functional analytic issues; see, for example, [Lebedev and Cloud (2003)].

Often engineers attempt to compensate for a lack of theoretical justification by means of experiment. However, experiments are quite restrictive and cannot provide real justification in all practical cases. They can only support opinion regarding the applicability of a method.

Korn’s inequality

For various qualitative questions about solutions of elasticity problems, Korn’s inequality plays a significant role. It states that a displacement with finite energy belongs to the space W^{1,2}(V); that is, all of its Cartesian components have square-integrable first derivatives. It can also yield an estimate for the displacement field through the expression for the strain energy.

One form of Korn’s inequality is

\[ \int_V \left( \mathbf{u}\cdot\mathbf{u} + \nabla\mathbf{u} \cdot\cdot\, \nabla\mathbf{u}^T \right) dV \le c \int_V W(\boldsymbol{\varepsilon}(\mathbf{u}))\,dV \tag{6.55} \]

for any sufficiently smooth u with a constant c that depends only on V and Σ1, the part of the boundary on which u|Σ1 = 0. Because W(ε(u)) is a positive definite quadratic form in ε and we are not interested in the exact value of c, we can change W(ε(u)) to tr(ε · ε). A general proof for Σ1 ≠ Σ is difficult [Ciarlet (1988)], so we establish the inequality when u|Σ = 0. We will prove

\[ \int_V \operatorname{tr}(\nabla\mathbf{u}\cdot\nabla\mathbf{u}^T)\,dV \le 2\int_V \operatorname{tr}(\boldsymbol{\varepsilon}\cdot\boldsymbol{\varepsilon})\,dV, \tag{6.56} \]

from which (6.55) follows. In Cartesian coordinates this is

\[ \int_V \sum_{i,j=1}^{3} \left( \frac{\partial u_i}{\partial x_j} \right)^2 dV \le 2\int_V \left( \sum_{i,j=1}^{3} \varepsilon_{ij}^2 \right) dV. \]

Another part of the inequality for ui follows from Friedrichs’ inequality

\[ \int_V \sum_{i=1}^{3} |u_i|^2\,dV \le c_1 \int_V \sum_{i,j=1}^{3} \left( \frac{\partial u_i}{\partial x_j} \right)^2 dV, \]

which holds with a constant c1 independent of ui for any smooth function ui that vanishes on the boundary. See [Lebedev and Cloud (2003)] for a proof.


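So we prove (6.56). We have

\[ \int_V \operatorname{tr}(\boldsymbol{\varepsilon}\cdot\boldsymbol{\varepsilon})\,dV = \frac{1}{2}\int_V \left[ \operatorname{tr}(\nabla\mathbf{u}\cdot(\nabla\mathbf{u})^T) + \operatorname{tr}(\nabla\mathbf{u}\cdot\nabla\mathbf{u}) \right] dV = \frac{1}{2}\|\nabla\mathbf{u}\|^2 + \frac{1}{2}\int_V \operatorname{tr}(\nabla\mathbf{u}\cdot\nabla\mathbf{u})\,dV. \]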

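On the boundary, u = 0. Using the Gauss–Ostrogradsky theorem, let us transform the last integral:

\[ \int_V \operatorname{tr}(\nabla\mathbf{u}\cdot\nabla\mathbf{u})\,dV = \int_V \frac{\partial u_j}{\partial x_i}\frac{\partial u_i}{\partial x_j}\,dV = -\int_V u_j \frac{\partial^2 u_i}{\partial x_i\,\partial x_j}\,dV = \int_V \frac{\partial u_i}{\partial x_i}\frac{\partial u_j}{\partial x_j}\,dV = \int_V (\nabla\cdot\mathbf{u})^2\,dV. \]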

So

\[ 2\|\boldsymbol{\varepsilon}\|^2 = \|\nabla\mathbf{u}\|^2 + \int_V (\nabla\cdot\mathbf{u})^2\,dV. \]

Inequality (6.56) follows because the second term on the right is nonnegative.
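The identity just obtained can be checked numerically on a concrete field. The sketch below is an illustration only and rests on assumptions not in the text: a two-dimensional unit square instead of a three-dimensional body, a hand-picked displacement that vanishes on the whole boundary, and Gauss–Legendre quadrature; the name `korn_terms` is hypothetical.

```python
import numpy as np

def korn_terms(n_quad=30):
    """Return (||grad u||^2, ||eps||^2, integral of (div u)^2) on the unit square.

    Test field (an assumption, zero on the boundary):
      u1 = sin(pi x) sin(pi y),  u2 = x^2 (1 - x) y (1 - y)^2
    """
    t, w = np.polynomial.legendre.leggauss(n_quad)
    x1 = 0.5 * (t + 1.0)                      # nodes mapped to [0, 1]
    w1 = 0.5 * w
    X, Y = np.meshgrid(x1, x1, indexing="ij")
    W2 = np.outer(w1, w1)                     # tensor-product weights

    g = np.empty((2, 2) + X.shape)            # g[i, j] = d u_i / d x_j
    g[0, 0] = np.pi * np.cos(np.pi * X) * np.sin(np.pi * Y)
    g[0, 1] = np.pi * np.sin(np.pi * X) * np.cos(np.pi * Y)
    g[1, 0] = (2 * X - 3 * X**2) * Y * (1 - Y)**2
    g[1, 1] = X**2 * (1 - X) * (1 - Y) * (1 - 3 * Y)

    eps = 0.5 * (g + g.transpose(1, 0, 2, 3))  # symmetric part of the gradient
    grad_sq = np.sum(W2 * np.sum(g**2, axis=(0, 1)))
    eps_sq = np.sum(W2 * np.sum(eps**2, axis=(0, 1)))
    div_sq = np.sum(W2 * (g[0, 0] + g[1, 1])**2)
    return grad_sq, eps_sq, div_sq

grad_sq, eps_sq, div_sq = korn_terms()
```

For this field the two sides of 2‖ε‖² = ‖∇u‖² + ∫(∇ · u)² dV should agree to quadrature accuracy, and the estimate ‖∇u‖² ≤ 2‖ε‖² follows at once.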

6.12 Rayleigh’s Variational Principle

The minimum total energy principle, also known as Lagrange’s principle, is formulated for equilibrium problems; it cannot be applied to dynamics problems. But there is one important dynamics problem for which there exists a variational principle based on a similar minimization idea. This is the eigenoscillation problem for an elastic body. In this problem we seek solutions to a dynamic homogeneous problem in displacements in the form

\[ \mathbf{u} = \mathbf{u}(\mathbf{r}, t) = \mathbf{w}(\mathbf{r})\,e^{i\omega t}. \tag{6.57} \]

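The equations of the dynamical problem are

\[ \nabla\cdot\boldsymbol{\sigma} = \rho\,\ddot{\mathbf{u}} \ \text{ in } V, \qquad \mathbf{u}\big|_{\Sigma_1} = \mathbf{0}, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{0}. \tag{6.58} \]

Substituting (6.57) into (6.58) expressed in displacements, and canceling the factor e^{iωt}, we get

\[ \nabla\cdot\boldsymbol{\sigma} = -\rho\,\omega^2\mathbf{w} \ \text{ in } V, \qquad \mathbf{w}\big|_{\Sigma_1} = \mathbf{0}, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{0}. \tag{6.59} \]

Here σ is given by

\[ \boldsymbol{\sigma} = \mathbf{C} \cdot\cdot\, \boldsymbol{\varepsilon}, \qquad \boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(\mathbf{w}) = \frac{1}{2}\left( \nabla\mathbf{w} + (\nabla\mathbf{w})^T \right). \]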


Equations (6.59) constitute an eigenvalue problem: we must find positive values ω, known as eigenfrequencies of the elastic body, for which (6.59) has a nontrivial solution w, called an eigenoscillation.

It can be shown that the problem has only nonnegative eigenfrequencies, for which w has only real components, and that the set of eigenfrequencies is countable. To demonstrate that the set of eigenfrequencies is countable, and to show that a complete set of linearly independent eigenmodes constitutes a basis in the space of modes having finite energy, we need techniques that fall outside the scope of this book. Note that these proofs require V to be a compact volume with a sufficiently smooth boundary. The term “sufficiently smooth” covers the needs of engineering practice: the boundary cannot have cusps, but V can be a pyramid or a cone.

Now we show that any eigenfrequency ω is real and, moreover, nonnegative. Suppose to the contrary ω is a complex number so that its corresponding mode w is also complex-valued. Dot-multiplying the first equation of (6.59) by the complex conjugate of w (denoted by an overbar) and integrating over V, we get

\[ \int_V (\nabla\cdot\boldsymbol{\sigma})\cdot\overline{\mathbf{w}}\,dV = -\omega^2 \int_V \rho\,\mathbf{w}\cdot\overline{\mathbf{w}}\,dV. \]

Applying the Gauss–Ostrogradsky theorem to the left side, we get

\[ \int_V \boldsymbol{\sigma} \cdot\cdot\, \overline{\boldsymbol{\varepsilon}}\,dV = \omega^2 \int_V \rho\,\mathbf{w}\cdot\overline{\mathbf{w}}\,dV. \tag{6.60} \]

Because both integrands in (6.60) are real and nonnegative, it follows that ω² is a real nonnegative number, hence ω is real and can be taken nonnegative. As the eigenfrequency equation is linear with real-valued coefficients, the real and imaginary parts of an eigensolution are eigensolutions as well. This means that we can consider only real eigenmodes.

Engineers are mostly interested in some range of eigenfrequencies, say the lowest few, or those belonging to some finite range.

Let us note that for the first and third boundary value problems, the minimum eigenfrequency is positive. Indeed, for ω = 0 the problem (6.59) is described by the equilibrium equations, and by uniqueness we have only the zero solution w = 0. For the second boundary value problem, where Σ2 = Σ, to the value ω = 0 there corresponds a nontrivial solution that represents a displacement of the body as a rigid whole:

\[ \mathbf{w} = \mathbf{w}_0 + \boldsymbol{\omega}_0 \times (\mathbf{r} - \mathbf{r}_0) \]

with arbitrary but constant vectors w0, ω0, and r0. We restrict ourselves to the case of positive ω.


By linearity of the problem, if w is an eigensolution, then so is aw for any scalar a. We typically choose a so that aw has unit L2 norm. Such an eigensolution is called an oscillation mode.

We prove the following theorem.

Theorem 6.9. For oscillation eigenmodes w1 and w2 corresponding to distinct eigenfrequencies ω1 and ω2 respectively, the relation

\[ \int_V \rho\,\mathbf{w}_1\cdot\mathbf{w}_2\,dV = 0 \tag{6.61} \]

holds. Moreover,

\[ \langle \mathbf{w}_1, \mathbf{w}_2 \rangle = 0. \tag{6.62} \]

Equality (6.61) is the orthogonality relation, and (6.62) is the generalized orthogonality relation between w1 and w2.

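Proof. The proof mimics that of Theorem 3.2. Let w1 and w2 satisfy

\[ \nabla\cdot\boldsymbol{\sigma}_1 = -\rho\,\omega_1^2\mathbf{w}_1 \ \text{ in } V, \qquad \mathbf{w}_1\big|_{\Sigma_1} = \mathbf{0}, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}_1\big|_{\Sigma_2} = \mathbf{0}, \]

and

\[ \nabla\cdot\boldsymbol{\sigma}_2 = -\rho\,\omega_2^2\mathbf{w}_2 \ \text{ in } V, \qquad \mathbf{w}_2\big|_{\Sigma_1} = \mathbf{0}, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}_2\big|_{\Sigma_2} = \mathbf{0}, \]

where σk = C ·· εk and εk = ε(wk). Dot-multiply the first equation in V by w2 and integrate over V. Applying the Gauss–Ostrogradsky theorem, we get

\[ -\int_V \boldsymbol{\sigma}_1 \cdot\cdot\, \boldsymbol{\varepsilon}_2\,dV + \omega_1^2 \int_V \rho\,\mathbf{w}_1\cdot\mathbf{w}_2\,dV = 0. \tag{6.63} \]

Similarly,

\[ -\int_V \boldsymbol{\sigma}_2 \cdot\cdot\, \boldsymbol{\varepsilon}_1\,dV + \omega_2^2 \int_V \rho\,\mathbf{w}_2\cdot\mathbf{w}_1\,dV = 0. \]

Subtracting these two integral equalities and using σ1 ·· ε2 = σ2 ·· ε1, we get

\[ (\omega_1^2 - \omega_2^2) \int_V \rho\,\mathbf{w}_1\cdot\mathbf{w}_2\,dV = 0. \]

Relation (6.61) follows from the fact that ω1 ≠ ω2. Substituting (6.61) into (6.63), we obtain (6.62).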


Mathematically, Theorem 6.9 states that the oscillation eigenmodes corresponding to distinct eigenfrequencies are orthogonal in the space L2(V) with weight ρ. From (6.63) it follows that they are also orthogonal with respect to the energy inner product 〈·, ·〉. If to some ω there correspond a few linearly independent eigenmodes, we can always construct an orthonormal set of corresponding eigenmodes using the Gram–Schmidt procedure. See [Yosida (1980)].
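A finite-dimensional analogue makes both orthogonality relations easy to see. The sketch below is an illustration under assumptions that are not in the text: the elastic body is replaced by a chain of four point masses joined by springs and fixed at one end, so that the problem becomes K w = ω² M w with a stiffness matrix K playing the role of 〈·, ·〉 and a diagonal mass matrix M playing the role of the weight ρ. The name `modes` and all numerical values are hypothetical.

```python
import numpy as np

def modes(K, m):
    """Eigenfrequencies and mass-normalized modes of K w = omega^2 M w, M = diag(m)."""
    s = 1.0 / np.sqrt(m)
    # Symmetric standard eigenproblem for M^(-1/2) K M^(-1/2)
    lam, y = np.linalg.eigh(s[:, None] * K * s[None, :])
    omega = np.sqrt(np.clip(lam, 0.0, None))
    return omega, s[:, None] * y              # columns are the modes w_k

# Spring constants (wall-mass0, mass0-mass1, ...) and masses: illustrative values
k = np.array([3.0, 2.0, 2.0, 1.0])
m = np.array([1.0, 2.0, 1.0, 0.5])
K = (np.diag(k + np.append(k[1:], 0.0))
     - np.diag(k[1:], 1) - np.diag(k[1:], -1))

omega, Wm = modes(K, m)
gram_M = Wm.T @ (m[:, None] * Wm)             # discrete analogue of (6.61)
gram_K = Wm.T @ K @ Wm                        # discrete analogue of (6.62)
```

With this normalization `gram_M` should be the identity matrix and `gram_K` should be diagonal with entries ω², mirroring (6.61) and (6.62).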

This orthogonality is the basis for practical application of the following variational Rayleigh principle.

Theorem 6.10. Eigenmodes are stationary points of the energy functional

\[ E_0(\mathbf{w}) = \frac{1}{2}\int_V W(\boldsymbol{\varepsilon}(\mathbf{w}))\,dV \]

on the set of displacements satisfying the boundary conditions w|Σ1 = 0 and subject to the constraint

\[ \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV = 1. \tag{6.64} \]

Conversely, all the stationary points of E0(w) on the above set of displacements are eigenmodes of the body that correspond to its eigenfrequencies.

Proof. Let us write out the stationarity condition for E0(w). Using the same reasoning as in the proof of Theorem 6.3, we get

\[ \delta E_0 = -\int_V (\nabla\cdot\boldsymbol{\sigma})\cdot\delta\mathbf{w}\,dV - \int_{\Sigma_2} \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\delta\mathbf{w}\,d\Sigma = 0. \tag{6.65} \]

This should be considered on the set of displacements with the restrictions described in the theorem statement. So δw is not independent as it was in Theorem 6.3, but must satisfy the condition

\[ \delta\left( \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV - 1 \right) = \int_V \rho\,\mathbf{w}\cdot\delta\mathbf{w}\,dV = 0. \tag{6.66} \]

In calculus, the minimization of a function with constraints is solved using Lagrange multipliers. We will adapt this method to our problem for a functional. To E0(w) let us add

\[ -\lambda\left( \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV - 1 \right), \]

where λ is an indefinite multiplier (we shall call it a Lagrange multiplier).


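Let us introduce a functional Ẽ0(w, λ) depending on the variables w and λ, as

\[ \tilde{E}_0(\mathbf{w}, \lambda) = E_0(\mathbf{w}) - \lambda\left( \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV - 1 \right). \]

Now w is not subjected to the constraint. We will show that w from a stationary point (w, λ) of Ẽ0 is a stationary point of E0(w) under the constraint (6.64). Indeed, δẼ0 = 0 is

\[ \delta\tilde{E}_0 = \delta E_0 - \lambda\int_V \rho\,\mathbf{w}\cdot\delta\mathbf{w}\,dV - (\delta\lambda)\left( \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV - 1 \right) = 0, \]

where δE0 is given by (6.65). Because w and λ are independent, δẼ0 = 0 implies two simultaneous equations:

\[ \delta E_0 - \lambda\int_V \rho\,\mathbf{w}\cdot\delta\mathbf{w}\,dV = 0, \qquad \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV - 1 = 0. \tag{6.67} \]

This shows the required property possessed by stationary points of Ẽ0. So we consider a stationary point of Ẽ0(w, λ) without constraints on w.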

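From the first equation in (6.67) we get

\[ -\int_V \left[ (\nabla\cdot\boldsymbol{\sigma})\cdot\delta\mathbf{w} + \lambda\rho\,\mathbf{w}\cdot\delta\mathbf{w} \right] dV - \int_{\Sigma_2} \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\delta\mathbf{w}\,d\Sigma = 0. \tag{6.68} \]

As in the derivation of the minimum total energy principle, it follows from this integral equality for arbitrary δw that

\[ \nabla\cdot\boldsymbol{\sigma} = -\rho\lambda\mathbf{w}, \qquad \mathbf{n}\cdot\boldsymbol{\sigma}\big|_{\Sigma_2} = \mathbf{0}. \]

If we change λ to ω², this equation coincides with (6.59). This explains the meaning of the Lagrange multiplier: it is equal to the square of the eigenfrequency ω, whose nonnegativity was proved above. Thus the stationarity condition for E0(w) is valid on eigensolutions of the problem (6.59).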

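Now we prove the converse. Let w be a solution of the equation

\[ \nabla\cdot\boldsymbol{\sigma} = -\rho\,\omega^2\mathbf{w} \]

in V for some ω. We dot-multiply it by v, integrate over V, and apply the Gauss–Ostrogradsky theorem to get

\[ 0 = \int_V \left[ (\nabla\cdot\boldsymbol{\sigma})\cdot\mathbf{v} + \omega^2\rho\,\mathbf{w}\cdot\mathbf{v} \right] dV
= \int_V \left[ -\boldsymbol{\sigma} \cdot\cdot\, \boldsymbol{\varepsilon}(\mathbf{v}) + \omega^2\rho\,\mathbf{w}\cdot\mathbf{v} \right] dV + \int_{\Sigma_2} \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{v}\,d\Sigma, \tag{6.69} \]

where

\[ \boldsymbol{\varepsilon}(\mathbf{v}) = \frac{1}{2}\left( \nabla\mathbf{v} + (\nabla\mathbf{v})^T \right). \]

Let us select v such that

\[ \int_V \rho\,\mathbf{w}\cdot\mathbf{v}\,dV = 0. \]

Denoting v = δw, we transform the second line of (6.69) to −δE0 from (6.65). Thus δE0, which is calculated at the eigenmode, is equal to zero. This completes the proof.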

We will reformulate Rayleigh’s principle for ease of application. In the new formulation, one need not stipulate separate integral restrictions on the set of w.

Theorem 6.11. On the set of admissible vector-functions satisfying the condition w|Σ1 = 0, the oscillation eigenmodes are stationary points of the functional

\[ R(\mathbf{w}) = \frac{E_0}{K}, \quad \text{where} \quad K = \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV. \]

R(w) is called Rayleigh’s quotient. Conversely, a normalized stationary point of R(w) is an eigenmode that corresponds to some eigenfrequency; the value of R(w) on an eigenmode is the squared eigenfrequency.

Proof. The proof of the first part follows from the previous proof. Indeed, R does not change when we multiply w by a constant factor. Selecting such a factor that

\[ \frac{1}{2}\int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV = 1, \]

we see that R = E0.

Now we show that for the eigenmode w corresponding to the eigenfrequency ω, we have

\[ R(\mathbf{w}) = \omega^2. \tag{6.70} \]

We dot-multiply (6.59) by w, integrate over V, and apply the Gauss–Ostrogradsky theorem. We get

\[ \int_V \boldsymbol{\sigma} \cdot\cdot\, \boldsymbol{\varepsilon}\,dV = \omega^2 \int_V \rho\,\mathbf{w}\cdot\mathbf{w}\,dV, \]

from which (6.70) follows.


Equation (6.70) is widely used in mechanics for finding approximate values of eigenfrequencies. After using eigenmode orthogonality to approximate an eigenmode by w, we can approximate the eigenfrequency through the equality

\[ \omega = \sqrt{R(\mathbf{w})}. \]

A finite-dimensional analogue of Rayleigh’s quotient gave birth to a class of numerical methods for large sparse systems of linear algebraic equations. The interested reader should consult specialized literature.
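The practical value of (6.70) comes from stationarity: an error of order δ in the trial mode changes the quotient only at order δ². The following sketch illustrates this on a finite-dimensional analogue. Everything specific in it is an assumption made for illustration: the matrices K and M stand in for the energy and mass integrals (the factors 1/2 cancel in the quotient), and the name `rayleigh` is hypothetical.

```python
import numpy as np

def rayleigh(K, M, w):
    """Discrete Rayleigh quotient (w^T K w) / (w^T M w)."""
    return (w @ K @ w) / (w @ M @ w)

n = 6
# Stiffness of a fixed-fixed chain with unit springs; unit masses
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M = np.eye(n)

lam, V = np.linalg.eigh(K)           # lam[j] plays the role of omega_j^2
w_exact = V[:, 0]                    # lowest mode
delta = 1e-3
w_trial = w_exact + delta * V[:, 1]  # lowest mode polluted by the next one

err = rayleigh(K, M, w_trial) - lam[0]   # expected to be of order delta**2
omega_approx = np.sqrt(rayleigh(K, M, w_trial))
```

Here a relative error of 10⁻³ in the mode should produce an error below 10⁻⁶ in the squared frequency, and the quotient approaches the lowest eigenvalue from above.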

6.13 Plane Waves

It is harder to obtain an exact solution for a dynamics problem than for an equilibrium problem. However, there is an important dynamics problem that can be solved analytically. This is the problem of plane waves in an unbounded homogeneous anisotropic space that is free of body forces.

Such solutions describe propagation of sound far from the source, say for distant earthquakes or explosions, so they are of significant interest in applications. On the other hand, their theory is relatively simple, and the investigation reduces to an algebraic eigenvalue problem.

A solution of the form

\[ \mathbf{u} = U(\mathbf{k}\cdot\mathbf{r} - \omega t)\,\mathbf{a} \]

is called a plane wave. Here k is a constant vector called the wave vector, ω is the frequency, a is a constant vector, and U = U(x) is an unknown function in one variable x = k · r − ωt. The equation

\[ \mathbf{k}\cdot\mathbf{r} - \omega t = \text{const} \]

represents a plane propagating in the space R³. At any instant, the normal to this plane is parallel to k. The propagation velocity of the plane is c = ω/|k|; this is the wave velocity. A solution of this form is constant on each propagating plane. The vector k defines the direction of wave propagation, whereas a is the direction of the displacement. For u we have

\[ \nabla\mathbf{u} = U'(\mathbf{k}\cdot\mathbf{r} - \omega t)\,\mathbf{k}\mathbf{a}, \qquad \boldsymbol{\varepsilon} = \frac{1}{2}U'(\mathbf{k}\cdot\mathbf{r} - \omega t)(\mathbf{k}\mathbf{a} + \mathbf{a}\mathbf{k}), \qquad \ddot{\mathbf{u}} = \omega^2 U''(\mathbf{k}\cdot\mathbf{r} - \omega t)\,\mathbf{a}, \]

where the prime denotes differentiation with respect to x = k · r − ωt, and the overdot denotes differentiation with respect to t.


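Using the symmetry properties of C, we write out Hooke’s law for this displacement field:

\[ \boldsymbol{\sigma} = \mathbf{C} \cdot\cdot\, (\mathbf{k}\mathbf{a})\,U'. \]

Substituting u and σ into the equations of motion, we get

\[ \mathbf{k}\cdot\mathbf{C} \cdot\cdot\, (\mathbf{k}\mathbf{a})\,U'' = \rho\,\omega^2 U''\mathbf{a}. \]

This equation has the uninteresting trivial solution U″ = 0. The necessary nontrivial solutions are defined by the following algebraic problem:

Find a nontrivial solution of the equation

\[ \mathbf{A}(\mathbf{k})\cdot\mathbf{a} = \rho\,\omega^2\mathbf{a}, \quad \text{where} \quad \mathbf{A}(\mathbf{k}) = \mathbf{k}\cdot\mathbf{C}\cdot\mathbf{k}. \tag{6.71} \]

We call A the acoustic tensor.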

We recall that C has some symmetry properties, and that ε ·· C ·· ε is positive for any ε ≠ 0. It is easy to verify that A(k) is a positive definite symmetric tensor. Considering ρω² as an eigenvalue of A(k), we have an eigenvalue problem (6.71) that possesses three positive eigenvalues. In other words, from (6.71) it follows that for any k there exist three values of ω. A plane wave corresponds to each of these.

Let us consider the plane wave problem for an isotropic medium in more detail. Now

\[ \mathbf{C} = \lambda\mathbf{E}\mathbf{E} + \mu(\mathbf{e}_k\mathbf{E}\mathbf{e}^k + \mathbf{I}), \]

where E = e_k e^k is the unit tensor and I = e_k e_m e^k e^m. The acoustic tensor becomes

\[ \mathbf{A} = \mathbf{k}\cdot\mathbf{C}\cdot\mathbf{k} = (\lambda + \mu)\mathbf{k}\mathbf{k} + \mu(\mathbf{k}\cdot\mathbf{k})\mathbf{E}. \tag{6.72} \]

Let us take a Cartesian frame with basis vectors i1, i2, k̂ = k/|k|, so |i1| = |i2| = |k̂| = 1 and i1 · i2 = i1 · k̂ = i2 · k̂ = 0. In this basis, the matrix A(k) is diagonal:

\[ \mathbf{A}(\mathbf{k}) = \begin{pmatrix} \mu|\mathbf{k}|^2 & 0 & 0 \\ 0 & \mu|\mathbf{k}|^2 & 0 \\ 0 & 0 & (2\mu + \lambda)|\mathbf{k}|^2 \end{pmatrix}. \]

The eigenvalues and eigenvectors from (6.71) take the form

\[ \omega_1 = \sqrt{\frac{\mu}{\rho}}\,|\mathbf{k}| \ \text{ for } \mathbf{a} = \mathbf{i}_1, \qquad
\omega_2 = \sqrt{\frac{\mu}{\rho}}\,|\mathbf{k}| \ \text{ for } \mathbf{a} = \mathbf{i}_2, \qquad
\omega_3 = \sqrt{\frac{\lambda + 2\mu}{\rho}}\,|\mathbf{k}| \ \text{ for } \mathbf{a} = \hat{\mathbf{k}}. \]

The first two solutions describe transverse waves; their directions of displacement are those a that are perpendicular to k, the direction of wave propagation. This type of wave is also called a shear or S wave. The third solution describes a longitudinal wave; its displacement is along the direction of propagation (Fig. 6.7). This type of wave is also called a dilatational, pressure, or P wave. Such solutions are used in acoustics and seismology.

Fig. 6.7 Plane waves in an isotropic medium. Longitudinal wave is on the left. Transverse wave is on the right.

Let us note that for an arbitrary anisotropic medium, the spectral problem describes waves that may be neither transverse nor longitudinal.
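The isotropic case can be reproduced numerically. The sketch below is an illustration under assumptions not stated in the text: Cartesian components, the index convention A_jm = k_i C_ijmn k_n for k · C · k (for an elasticity tensor with the usual symmetries the placement of the last two indices does not matter), and arbitrary sample values of λ, µ, ρ, and k. The function names are hypothetical.

```python
import numpy as np

def isotropic_stiffness(lam, mu):
    """Cartesian components C_ijmn = lam d_ij d_mn + mu (d_im d_jn + d_in d_jm)."""
    d = np.eye(3)
    return (lam * np.einsum("ij,mn->ijmn", d, d)
            + mu * (np.einsum("im,jn->ijmn", d, d)
                    + np.einsum("in,jm->ijmn", d, d)))

def acoustic_tensor(C, k):
    """A_jm = k_i C_ijmn k_n (assumed component form of k . C . k)."""
    return np.einsum("i,ijmn,n->jm", k, C, k)

lam, mu, rho = 2.0, 1.0, 1.0          # sample moduli and density
k = np.array([1.0, 2.0, 2.0])         # wave vector with |k| = 3

A = acoustic_tensor(isotropic_stiffness(lam, mu), k)
A_closed = (lam + mu) * np.outer(k, k) + mu * (k @ k) * np.eye(3)   # formula (6.72)

eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues are rho * omega^2, ascending
omega = np.sqrt(eigvals / rho)        # two S-wave frequencies, then the P-wave one
```

For these sample values the two shear frequencies coincide at √(µ/ρ)|k| = 3 and the longitudinal frequency is √((λ + 2µ)/ρ)|k| = 6; the last eigenvector is parallel to k. Replacing `isotropic_stiffness` by an anisotropic tensor generally gives three distinct frequencies and eigenvectors that are neither parallel nor perpendicular to k.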

Exercise 6.11. Verify (6.72).

Exercise 6.12. Prove that A(k) is a symmetric tensor.

Exercise 6.13. Show that A(k) is positive definite.


6.14 Plane Problems of Elasticity

Two important classes of deformation for which elastic problems can be significantly simplified are the plane strain and plane stress deformations. The problems for these deformations are called the plane problems of elasticity.

First we consider the plane deformation problem. A deformation is called plane if all the displacements of the body points are parallel to a plane and depend only on the coordinates in this plane. With reasonable precision, this is the case for a central portion of a long cylindrical or prismatic body stretched by axial forces.

The displacement vector for plane deformation takes the form

u = u1(x1, x2)i1 + u2(x1, x2)i2.

The body and contact forces should be of similar form:

f = f1(x1, x2)i1 + f2(x1, x2)i2, t0 = t1(x1, x2)i1 + t2(x1, x2)i2.

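In a Cartesian frame, the strain tensor components with index 3, corresponding to u, are zero: ε13 = ε23 = ε33 = 0, so

\[ \boldsymbol{\varepsilon} = \varepsilon_{11}\mathbf{i}_1\mathbf{i}_1 + \varepsilon_{12}(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1) + \varepsilon_{22}\mathbf{i}_2\mathbf{i}_2. \]

Using Hooke’s law, we find the components of the stress tensor:

\[ \sigma_{11} = \lambda\operatorname{tr}\boldsymbol{\varepsilon} + 2\mu\varepsilon_{11}, \qquad \sigma_{22} = \lambda\operatorname{tr}\boldsymbol{\varepsilon} + 2\mu\varepsilon_{22}, \qquad \sigma_{33} = \lambda\operatorname{tr}\boldsymbol{\varepsilon}, \]

\[ \sigma_{12} = 2\mu\varepsilon_{12}, \qquad \sigma_{23} = 0, \qquad \sigma_{13} = 0, \]

where tr ε = ε11 + ε22. Because ε33 = 0, we can express σ33 in terms of σ11 and σ22:

\[ \sigma_{33} = \frac{\lambda}{2(\lambda + \mu)}(\sigma_{11} + \sigma_{22}) = \nu(\sigma_{11} + \sigma_{22}). \]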

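None of the components of σ depends on x3; this allows us to simplify the equilibrium equations:

\[ \frac{\partial\sigma_{11}}{\partial x_1} + \frac{\partial\sigma_{21}}{\partial x_2} + \rho f_1 = 0, \qquad \frac{\partial\sigma_{12}}{\partial x_1} + \frac{\partial\sigma_{22}}{\partial x_2} + \rho f_2 = 0. \]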

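Let us consider a particular case when body forces are absent. Introducing the Airy stress function Φ via the relations

\[ \sigma_{11} = \frac{\partial^2\Phi}{\partial x_2^2}, \qquad \sigma_{22} = \frac{\partial^2\Phi}{\partial x_1^2}, \qquad \sigma_{12} = -\frac{\partial^2\Phi}{\partial x_1\,\partial x_2}, \tag{6.73} \]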


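we identically satisfy the equilibrium equations. From the set of Beltrami–Michell equations, in the plane problem there remain only three nontrivial equations:

\[ \Delta\sigma_{11} + \frac{1}{1 + \nu}\frac{\partial^2\sigma}{\partial x_1^2} = 0, \qquad
\Delta\sigma_{22} + \frac{1}{1 + \nu}\frac{\partial^2\sigma}{\partial x_2^2} = 0, \qquad
\Delta\sigma_{12} + \frac{1}{1 + \nu}\frac{\partial^2\sigma}{\partial x_1\,\partial x_2} = 0, \tag{6.74} \]

where

\[ \sigma = \operatorname{tr}\boldsymbol{\sigma} = (1 + \nu)(\sigma_{11} + \sigma_{22}). \]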

Substitute (6.73) into (6.74). Since

\[ \sigma = (1 + \nu)\Delta\Phi, \]

we reduce the first of the equations (6.74) to the biharmonic equation for Φ:

\[ 0 = \Delta\sigma_{11} + \frac{1}{1 + \nu}\frac{\partial^2\sigma}{\partial x_1^2} = \Delta\frac{\partial^2\Phi}{\partial x_2^2} + \frac{\partial^2}{\partial x_1^2}\Delta\Phi = \Delta^2\Phi. \]

We can reduce the second equation from (6.74) to the same biharmonic equation, whereas the third equation from (6.74) holds identically. Thus Φ satisfies

\[ \Delta^2\Phi = 0. \tag{6.75} \]

In the plane problem, this plays the role of the compatibility equation; it is the equation we have to solve.
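The two claims above, that the stresses (6.73) satisfy the equilibrium equations identically and that an admissible Φ must be biharmonic, can be checked on a polynomial example. This is only a sketch: the stress function Φ = x1⁴ − x2⁴ + x1³x2 is an arbitrary choice made here, and polynomials are stored as coefficient arrays c[i, j] multiplying x1^i x2^j so that NumPy’s polynomial routines can differentiate them exactly.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def deriv(c, n1, n2):
    """Coefficients of the mixed derivative d^(n1+n2) p / dx1^n1 dx2^n2."""
    return P.polyder(P.polyder(c, n1, axis=0), n2, axis=1)

def ev(c, x1, x2):
    """Evaluate the polynomial with coefficients c[i, j] at points (x1, x2)."""
    return P.polyval2d(x1, x2, c)

# Illustrative Airy function: Phi = x1^4 - x2^4 + x1^3 x2
phi = np.zeros((5, 5))
phi[4, 0], phi[0, 4], phi[3, 1] = 1.0, -1.0, 1.0

s11 = deriv(phi, 0, 2)           # sigma_11 = d2 Phi / dx2^2
s22 = deriv(phi, 2, 0)           # sigma_22 = d2 Phi / dx1^2
s12 = -deriv(phi, 1, 1)          # sigma_12 = -d2 Phi / dx1 dx2

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1.0, 1.0, (2, 50))   # random sample points

# Residuals of the two equilibrium equations without body forces
eq1 = ev(deriv(s11, 1, 0), x1, x2) + ev(deriv(s12, 0, 1), x1, x2)
eq2 = ev(deriv(s12, 1, 0), x1, x2) + ev(deriv(s22, 0, 1), x1, x2)

# Biharmonic operator applied to Phi
biharm = (ev(deriv(phi, 4, 0), x1, x2)
          + 2.0 * ev(deriv(phi, 2, 2), x1, x2)
          + ev(deriv(phi, 0, 4), x1, x2))
```

The equilibrium residuals vanish for any Φ, whereas the biharmonic residual vanishes only for suitable choices; replacing the term −x2⁴ by +x2⁴, for instance, leaves `eq1` and `eq2` at zero but makes `biharm` equal to 48.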

Similar simplifications can be done for the plane stress problem. This type of deformation occurs in a thin plate with planar faces free of load. We direct the x3 axis along the normal to the plate faces. Now the stress components σ13, σ23, and σ33 are small, so we set them to zero: σ13 = σ23 = σ33 = 0. The equality σ33 = 0 distinguishes the plane stress deformation from the above plane deformation, where σ33 ≠ 0. For the plane stress deformation, the analysis of the equilibrium equations can be performed as above.

Exercise 6.14. Find the components of ε for the plane stress deformation.

Exercise 6.15. Find the components of the strain tensor for a plane stress state.

The results of Exercises 6.14 and 6.15 show that the equations of the plane strain and plane stress problems differ only in the values of the corresponding elastic moduli λ.

Special tools have been developed for use in plane elasticity theory. Some are based on the theory of functions of a complex variable [Muskhelishvili (1966)]. Using the complex potential method, we can find exact solutions to rather complex problems. However, computer techniques and the finite element method have significantly reduced interest in these.

6.15 Problems

6.1 Write out the boundary conditions for the problem of plane elasticity on the square ABCD shown in Fig. 6.8.

6.2 Write out the boundary conditions for the plane elasticity boundary value problem on the triangle ABC shown in Fig. 6.9.

6.3 Write out the boundary conditions for the plane elasticity problem on the triangle shown in Fig. 6.10. Assume the forces are normal to the sides and equal to p1, p2, p3, respectively.

6.4 Write out the boundary conditions for the plane elasticity problem on the portion of a ring depicted in Fig. 6.11.

6.5 Write out the boundary conditions for the elasticity problem on the visible portions of the cubes shown in Fig. 6.12.

6.6 At a point of an elastic body the principal stresses are σ1 = 50 MPa, σ2 = −50 MPa, and σ3 = 75 MPa. Referring to Fig. 6.13, find the stress vector on ABC that is equi-inclined to the principal axes.

6.7 Referring to Fig. 6.14, write out the boundary conditions for the elasticity problem on the visible portions of the cylindrical bodies.

6.8 Let γ, λ, k be given parameters. For the following displacement vectors, find the expressions for the strain tensors.

(a) u = γx2i1;
(b) u = λx1i1;
(c) u = λr;
(d) u = u(r)er + kzez (use cylindrical coordinates);
(e) u = u(r)eφ + kzez (use cylindrical coordinates);
(f) u = (u(r) + kz)ez (use cylindrical coordinates);
(g) u = u(r)er (use spherical coordinates).

Fig. 6.8 Problem 6.1.

6.9 Let ϕ be an arbitrary biharmonic vector function. Verify that the expression

\[ 2\mu\mathbf{u} = 2(1 - \nu)\nabla^2\boldsymbol{\varphi} - \nabla\nabla\cdot\boldsymbol{\varphi} \]

is a solution of Lamé’s equations (6.29) at f = 0. It is the Boussinesq–Galerkin solution of the equilibrium problem.

6.10 Let ψ be an arbitrary harmonic vector function and ψ0 an arbitrary harmonic function. Verify that the expression

\[ 2\mu\mathbf{u} = 4(1 - \nu)\boldsymbol{\psi} - \nabla(\mathbf{r}\cdot\boldsymbol{\psi} + \psi_0) \]

is a solution of the equilibrium equations (6.29) at f = 0. It is called the Papkovich–Neuber solution.

6.11 Let ψ be an arbitrary harmonic vector function, and suppose the function η satisfies the equation

\[ \nabla^2\eta = 2\nabla\cdot\boldsymbol{\psi}. \]

Verify that the expression

\[ 2\mu\mathbf{u} = 4(1 - \nu)\boldsymbol{\psi} - \nabla\eta \]

is a solution to (6.29) at f = 0.

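6.12 Let ψ be an arbitrary harmonic vector function, and suppose the function ξ satisfies the equation

\[ \frac{\partial\xi}{\partial z} = \frac{1}{4\nu - 3}\nabla\cdot\boldsymbol{\psi}. \]

Verify that the expression

\[ 2\mu\mathbf{u} = \nabla\boldsymbol{\psi} + z\nabla\xi \]

represents a solution to equations (6.29) at f = 0.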




Chapter 7

Linear Elastic Shells

Thin-walled structures are common in engineering practice. Examples include ship hulls, tank beds, rocket and aircraft airframes, and various tubular structures such as arteries, membranes, and vases. A shell resembles a surface having thickness. Mathematical models for such objects were developed along with the theory of three-dimensional elasticity. However, shell theory is still under development; it is an approximation to describe the behavior of three-dimensional objects using two-dimensional models given on a surface. Clearly, any shell theory can be accurate only in some restricted number of situations. A few popular shell models exist. We will present some linear models that are used in applications.

Historically, the first shell model was based on the Kirchhoff–Love assumptions. Models are also attributed to Timoshenko, Reissner, Cosserat, Naghdi, and others. In shell theory, we encounter the names of Euler, Lagrange, Poisson, Kelvin, etc. A particular case of shell theory is plate theory, where the object under consideration is akin to the top of a desk. In this chapter, some features of shell theory will be demonstrated only for plate models: our goal is an introduction to the theory and a demonstration of how tensor analysis is applied in its construction.

A mathematical model of a shell takes the form of a boundary value problem for simultaneous partial differential equations given on a base surface, usually the midsurface of the shell. Since the quantities describing shell behavior are functions of the surface coordinates, shell theory is closely related to the theory of surfaces from differential geometry (Chapter 5). The decisive point in any version of shell theory is the method of eliminating the third spatial coordinate, the one along the normal to the shell surface, from the three-dimensional model.

Before the invention of the computer, the numerical solution of three-dimensional elasticity problems was nearly impossible. Much interest was directed toward shell theories, where problems could be solved manually to required accuracy. Modern computer solution has not changed the situation for thin-walled problems. This is because numerical calculations of strains in shell-like bodies harbor difficulties related to huge mesh sizes in discrete models of a shell as a three-dimensional body and the ill-posedness of the corresponding three-dimensional problems. It is interesting to note that the application of powerful computers has increased interest in more accurate shell models which allow engineers to apply two-dimensional finite elements that incorporate a more realistic strain distribution along the shell thickness coordinate.

There are two approaches to the development of two-dimensional shell models. The first is based on direct approximation of a given three-dimensional boundary value problem for a shell-like body in terms of a boundary value problem given on the base surface of the body. This is done in a few ways. One is the hypothesis method, in which we assume the form of the dependence of the shell displacement and stress fields on the thickness coordinate. We will present an example of this type of model, based on the classical Kirchhoff–Love hypotheses. Another method involves asymptotic expansion of the solution of a three-dimensional problem with respect to a natural small parameter ε, the ratio of the shell thickness to a characteristic shell size such as the diameter of a circular plate. There are other principles for expanding a solution in a series with respect to ε. These models commonly use as a starting point the equilibrium equations for the shell as a three-dimensional body. Afterwards, the expansions reduce the equations to equations given on the shell surface. The models typically result in a system of equations on the base surface whose order exceeds the order of the initial system for a three-dimensional body.

Another approach to deriving shell models can be called the direct approach. In this case a shell is considered as a material surface having additional mechanical properties. So it is a surface that resists deformation, possesses strain energy, can have distributed mass, etc. The dynamical or equilibrium equations for such a surface are formulated directly through fundamental laws of mechanics such as the law of impulse conservation. In the direct approach, the equations of three-dimensional elasticity are not used; hence constitutive laws similar to Hooke’s law should be formulated independently.

In this chapter, we will demonstrate how tensor calculus can be used to derive a few linear shell and plate models for the case of small strains. We will also discuss some properties of the boundary value problems of shell theory. We start with the Kirchhoff–Love model.

7.1 Some Useful Formulas of Surface Theory

In this section we collect some tensorial formulas needed to formulate shell relations. Many are taken from Chapter 5, with certain changes in notation.

Let Σ be a sufficiently smooth surface in $\mathbb{R}^3$ (Fig. 7.1). The position vector of a point on Σ is denoted by $\boldsymbol{\rho}$. We introduce¹ coordinates $q^1$ and $q^2$ over Σ, which is then described by the equation $\boldsymbol{\rho} = \boldsymbol{\rho}(q^1, q^2)$.

Fig. 7.1 Surface Σ with curvilinear coordinate lines $(q^1, q^2)$.

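The basis vectors on Σ are tangent to the coordinate lines at a point; they are denoted by $\boldsymbol{\rho}_1$ and $\boldsymbol{\rho}_2$. The vectors $\boldsymbol{\rho}^1$ and $\boldsymbol{\rho}^2$ constitute the dual basis. We recall that

\[
\boldsymbol{\rho}_1 = \frac{\partial\boldsymbol{\rho}}{\partial q^1}, \qquad
\boldsymbol{\rho}_2 = \frac{\partial\boldsymbol{\rho}}{\partial q^2}, \qquad
\boldsymbol{\rho}_\alpha \cdot \boldsymbol{\rho}^\beta = \delta_\alpha^\beta \quad (\alpha, \beta = 1, 2).
\]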

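From now on, Greek indices take the values 1, 2, whereas Roman indices take the values 1, 2, 3. The normal $\mathbf{n}$ to Σ at a point is defined by the relation

\[
\mathbf{n} = \frac{\boldsymbol{\rho}_1 \times \boldsymbol{\rho}_2}{|\boldsymbol{\rho}_1 \times \boldsymbol{\rho}_2|}.
\]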

¹ In Chapter 5, following the tradition of differential geometry, we denoted the coordinates by $u^1$ and $u^2$. To avoid confusion with the components of the displacement vector $\mathbf{u}$ in this chapter, we rename the coordinates as indicated.


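Let us introduce the metric tensor $\mathbf{A}$ on the surface:

\[
\mathbf{A} = \boldsymbol{\rho}^\alpha \boldsymbol{\rho}_\alpha = \mathbf{E} - \mathbf{n}\mathbf{n},
\]

where $\mathbf{E}$ is the metric tensor in $\mathbb{R}^3$. The tensor $\mathbf{A}$ plays the role of the unit tensor on Σ at a point: if a vector $\mathbf{v}$ lies in the tangent plane at the point so that $\mathbf{v} \cdot \mathbf{n} = 0$, then $\mathbf{A} \cdot \mathbf{v} = \mathbf{v}$.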

Exercise 7.1. Let v · n = 0. Show that A · v = v.

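Exercise 7.2. Show that $\mathbf{A} = a^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta = a_{\alpha\beta} \boldsymbol{\rho}^\alpha \boldsymbol{\rho}^\beta$, where $a_{\alpha\beta} = \boldsymbol{\rho}_\alpha \cdot \boldsymbol{\rho}_\beta$ and $a^{\alpha\beta} = \boldsymbol{\rho}^\alpha \cdot \boldsymbol{\rho}^\beta$.

Exercise 7.2 states that the components of $\mathbf{A}$ are the coefficients of the first fundamental form of Σ. This is why $\mathbf{A}$ is also called the first fundamental tensor of Σ.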
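The projector property of A = E − nn can be checked numerically. The following is a minimal sketch, assuming Cartesian components and a point on the unit sphere chosen only for illustration; the helper names `surface_projector` and `dot_tensor_vector` are hypothetical and not part of the text.

```python
import math


def surface_projector(n):
    """Return A = E - n n as a 3x3 nested list for a unit normal n."""
    return [[(1.0 if i == j else 0.0) - n[i] * n[j] for j in range(3)]
            for i in range(3)]


def dot_tensor_vector(tensor, v):
    """Contraction tensor . v for a 3x3 nested list and a 3-vector."""
    return [sum(tensor[i][j] * v[j] for j in range(3)) for i in range(3)]


# Unit normal to the unit sphere at polar angle theta and azimuth phi
# (an illustrative point, not taken from the text).
theta, phi = 0.7, 1.9
n = [math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta)]
# A tangent vector at the same point: the derivative of n with respect to theta.
t = [math.cos(theta) * math.cos(phi),
     math.cos(theta) * math.sin(phi),
     -math.sin(theta)]

A = surface_projector(n)
print(dot_tensor_vector(A, t))  # reproduces t up to rounding
print(dot_tensor_vector(A, n))  # the zero vector up to rounding
```

The first printed vector agrees with the tangent vector itself, while the normal is annihilated, which is the sense in which A acts as the unit tensor on the tangent plane.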

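At a point of Σ, the vectors $(\boldsymbol{\rho}_1, \boldsymbol{\rho}_2, \mathbf{n})$ constitute a basis of $\mathbb{R}^3$. The dual basis is $(\boldsymbol{\rho}^1, \boldsymbol{\rho}^2, \mathbf{n})$. This means that an arbitrary vector field $\mathbf{v}$ given on Σ can be represented as

\[
\begin{aligned}
\mathbf{v} = \mathbf{v}(q^1, q^2) &= v^1(q^1, q^2)\boldsymbol{\rho}_1 + v^2(q^1, q^2)\boldsymbol{\rho}_2 + v^3(q^1, q^2)\mathbf{n} \\
&= v_1(q^1, q^2)\boldsymbol{\rho}^1 + v_2(q^1, q^2)\boldsymbol{\rho}^2 + v_3(q^1, q^2)\mathbf{n}.
\end{aligned}
\]

Note that $v_3 = v^3$.

The gradient operator on Σ is given by the formula

\[
\nabla = \boldsymbol{\rho}^\alpha \frac{\partial}{\partial q^\alpha}.
\]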

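To apply the differential operations, we need the derivatives of the basis vectors $\boldsymbol{\rho}_1$, $\boldsymbol{\rho}_2$, $\mathbf{n}$, $\boldsymbol{\rho}^1$, $\boldsymbol{\rho}^2$. We recall that the derivatives of the normal $\mathbf{n}$ are (see equation (5.39))

\[
\frac{\partial \mathbf{n}}{\partial q^\alpha} = -b_{\alpha\beta} \boldsymbol{\rho}^\beta.
\]

We use the components $b_{\alpha\beta}$ to construct the tensor

\[
\mathbf{B} = b_{\alpha\beta} \boldsymbol{\rho}^\alpha \boldsymbol{\rho}^\beta.
\]

Using the definition of $\nabla$, we see that

\[
\mathbf{B} = -\nabla \mathbf{n}.
\]

The symmetric tensor $\mathbf{B}$ is called the curvature tensor of Σ or its second fundamental tensor; the components of $\mathbf{B}$ are the coefficients of the second fundamental form of Σ.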


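Exercise 7.3. Show that $\mathbf{B} = b^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta = b_\alpha^{\,\beta} \boldsymbol{\rho}^\alpha \boldsymbol{\rho}_\beta$. Also show that the expressions for $b_{\alpha\beta}$ coincide with the coefficients of the second fundamental form (§ 5.5).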

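The derivatives of $\boldsymbol{\rho}_\alpha$ and $\boldsymbol{\rho}^\alpha$ are given by the formulas

\[
\frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\beta} = \Gamma^\gamma_{\alpha\beta} \boldsymbol{\rho}_\gamma + b_{\alpha\beta} \mathbf{n}, \qquad
\frac{\partial \boldsymbol{\rho}^\alpha}{\partial q^\beta} = -\Gamma^\alpha_{\beta\gamma} \boldsymbol{\rho}^\gamma + b^\alpha_{\,\beta} \mathbf{n},
\]

where $\Gamma^\gamma_{\alpha\beta}$ is the Christoffel symbol introduced in § 4.5.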

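Let us derive the expression for the gradient $\nabla \mathbf{v}$ of a vector field $\mathbf{v}$ given on Σ. We split $\mathbf{v}$ into two components; one, $\tilde{\mathbf{v}}$, is tangent to Σ so that $\tilde{\mathbf{v}} \cdot \mathbf{n} = 0$, and the other is normal to Σ:

\[
\mathbf{v} = \tilde{\mathbf{v}} + w\mathbf{n}, \qquad
\tilde{\mathbf{v}} = v^1(q^1, q^2)\boldsymbol{\rho}_1 + v^2(q^1, q^2)\boldsymbol{\rho}_2, \qquad
w = v_3 = v^3 = \mathbf{v} \cdot \mathbf{n}.
\]

Then

\[
\nabla \mathbf{v} = (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} - w\mathbf{B} + (\nabla w + \mathbf{B} \cdot \tilde{\mathbf{v}})\mathbf{n}. \tag{7.1}
\]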

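We will also need the expression for the tensor divergence:

\[
\begin{aligned}
\nabla \cdot \mathbf{T}
&= \boldsymbol{\rho}^\gamma \frac{\partial}{\partial q^\gamma} \cdot \left( T^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta + T^{3\beta} \mathbf{n} \boldsymbol{\rho}_\beta + T^{\alpha 3} \boldsymbol{\rho}_\alpha \mathbf{n} + T^{33} \mathbf{n}\mathbf{n} \right) \\
&= \frac{\partial T^{\alpha\beta}}{\partial q^\alpha} \boldsymbol{\rho}_\beta
 + T^{\alpha\beta} \boldsymbol{\rho}^\gamma \cdot \frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\gamma} \boldsymbol{\rho}_\beta
 + T^{\alpha\beta} \frac{\partial \boldsymbol{\rho}_\beta}{\partial q^\alpha}
 + T^{3\beta} \boldsymbol{\rho}^\gamma \cdot \frac{\partial \mathbf{n}}{\partial q^\gamma} \boldsymbol{\rho}_\beta \\
&\quad + T^{33} \boldsymbol{\rho}^\gamma \cdot \frac{\partial \mathbf{n}}{\partial q^\gamma} \mathbf{n}
 + \frac{\partial T^{\alpha 3}}{\partial q^\alpha} \mathbf{n}
 + T^{\alpha 3} \boldsymbol{\rho}^\gamma \cdot \frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\gamma} \mathbf{n}
 + T^{\alpha 3} \frac{\partial \mathbf{n}}{\partial q^\alpha} \\
&= \frac{\partial T^{\alpha\beta}}{\partial q^\alpha} \boldsymbol{\rho}_\beta
 + T^{\alpha\beta} \Gamma^\gamma_{\alpha\gamma} \boldsymbol{\rho}_\beta
 + T^{\alpha\beta} \Gamma^\gamma_{\beta\alpha} \boldsymbol{\rho}_\gamma
 + T^{\alpha\beta} b_{\alpha\beta} \mathbf{n}
 - T^{3\beta} b^\gamma_{\,\gamma} \boldsymbol{\rho}_\beta \\
&\quad - T^{33} b^\gamma_{\,\gamma} \mathbf{n}
 + \frac{\partial T^{\alpha 3}}{\partial q^\alpha} \mathbf{n}
 + T^{\alpha 3} \Gamma^\gamma_{\alpha\gamma} \mathbf{n}
 - T^{\alpha 3} b^\beta_{\,\alpha} \boldsymbol{\rho}_\beta.
\end{aligned} \tag{7.2}
\]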

Exercise 7.4. Establish (7.1).

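Exercise 7.5. Use (7.1) to show that $\nabla \cdot \mathbf{v} = \nabla \cdot \tilde{\mathbf{v}} - w \operatorname{tr} \mathbf{B}$, where $\tilde{\mathbf{v}}$ is the tangential part of $\mathbf{v}$.

Exercise 7.6. Let $\mathbf{v} = \mathbf{v}(\varphi, \theta)$ be a vector given on a sphere with spherical coordinates φ and θ. Find $\nabla \mathbf{v}$.

Exercise 7.7. Let $\mathbf{v} = \mathbf{v}(\varphi, z)$ be a vector given on a cylindrical surface with cylindrical coordinates φ and z. Find $\nabla \mathbf{v}$.

Exercise 7.8. Let $\mathbf{T}$ be a second-order tensor given on Σ, such that $\mathbf{T} \cdot \mathbf{n} = \mathbf{n} \cdot \mathbf{T} = 0$. Show that the equation $\nabla \cdot \mathbf{T} = 0$ implies the algebraic relation $T^{\alpha\beta} b_{\alpha\beta} = 0$.

Exercise 7.9. Let $\mathbf{T}$ be a second-order tensor given on a sphere with spherical coordinates φ, θ. Find $\nabla \cdot \mathbf{T}$.

Exercise 7.10. Let $\mathbf{T}$ be a second-order tensor given on a cylindrical surface with cylindrical coordinates φ, z. Find $\nabla \cdot \mathbf{T}$.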

7.2 Kinematics in a Neighborhood of Σ

To describe a spatial domain that surrounds Σ, we will use the coordinates and quantities of the previous section. A shell is a three-dimensional body occupying some neighborhood of the base surface Σ, which is used to describe shell kinematics as well as displacements and strains under load.

The coordinates of a point Q in a neighborhood of Σ are introduced as follows. Let $\mathbf{n}$ be the normal to Σ through a point Q. Let the base point of $\mathbf{n}$ be P, whose position vector is $\boldsymbol{\rho}(q^1, q^2)$ so that its coordinates on Σ are $q^1, q^2$. We appoint $q^1, q^2$ as the first two coordinates of Q, and the distance z from P to Q as the third coordinate (Fig. 7.2). Note that z is taken positive when $\overrightarrow{PQ}$ is co-directed with $\mathbf{n}$ and negative when oppositely directed.

Fig. 7.2 Kinematics in a neighborhood of Σ.

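The position vector $\mathbf{r}$ of Q is

\[
\mathbf{r} = \mathbf{r}(q^1, q^2, z) = \boldsymbol{\rho}(q^1, q^2) + z\mathbf{n}.
\]

The volume occupied by the shell is defined by the inequalities

\[
-h/2 \le z \le h/2,
\]

where h is the shell thickness. In general, $h = h(q^1, q^2)$ can vary from point to point. At a point Q near Σ, the basis vectors are

\[
\mathbf{r}_\alpha = \frac{\partial \mathbf{r}}{\partial q^\alpha} = \boldsymbol{\rho}_\alpha + z \frac{\partial \mathbf{n}}{\partial q^\alpha} = (\mathbf{A} - z\mathbf{B}) \cdot \boldsymbol{\rho}_\alpha, \qquad \mathbf{r}_3 = \mathbf{n}. \tag{7.3}
\]

The dual basis is given by

\[
\mathbf{r}^\alpha = (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\rho}^\alpha, \qquad \mathbf{r}^3 = \mathbf{n}. \tag{7.4}
\]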

Let us explain the notation $(\mathbf{A} - z\mathbf{B})^{-1}$. It is seen that $(\mathbf{A} - z\mathbf{B}) \cdot \mathbf{n} = 0$, so $\mathbf{A} - z\mathbf{B}$ degenerates as a three-dimensional tensor and its inverse does not exist. Here we restrict $\mathbf{A} - z\mathbf{B}$ to an operator acting in the two-dimensional subspace tangent to Σ. This subspace has basis $\boldsymbol{\rho}_1, \boldsymbol{\rho}_2$, and the tensor $\mathbf{A}$ plays the role of the unit operator on it. In what follows, we suppose that $h\|\mathbf{B}\|$ is very small. Now, using the Banach contraction principle [Lebedev and Cloud (2003)], we can prove the existence of a unique inverse to $\mathbf{A} - z\mathbf{B}$ on the tangent space for small $|z| \le h/2$. So we have

\[
(\mathbf{A} - z\mathbf{B})^{-1} \cdot (\mathbf{A} - z\mathbf{B}) = (\mathbf{A} - z\mathbf{B}) \cdot (\mathbf{A} - z\mathbf{B})^{-1} = \mathbf{A}. \tag{7.5}
\]

We illustrate how to derive $(\mathbf{A} - z\mathbf{B})^{-1}$ via two important examples.

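(1) Let Σ be a cylinder of radius R. In cylindrical coordinates

\[
\mathbf{n} = \mathbf{e}_r, \qquad
\mathbf{A} = \mathbf{E} - \mathbf{n}\mathbf{n} = \mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_z \mathbf{e}_z, \qquad
\mathbf{B} = -\nabla \mathbf{e}_r = -\mathbf{e}_\varphi \mathbf{e}_\varphi / R.
\]

Then

\[
\mathbf{A} - z\mathbf{B} = \left(1 + \frac{z}{R}\right) \mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_z \mathbf{e}_z.
\]

It is easy to check that

\[
(\mathbf{A} - z\mathbf{B})^{-1} = \left(1 + \frac{z}{R}\right)^{-1} \mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_z \mathbf{e}_z
\]

satisfies (7.5) and hence is the needed inverse tensor.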

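(2) Now we define $(\mathbf{A} - z\mathbf{B})^{-1}$ for a sphere of radius R. In spherical coordinates,

\[
\mathbf{n} = \mathbf{e}_r, \qquad
\mathbf{A} = \mathbf{E} - \mathbf{n}\mathbf{n} = \mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_\theta \mathbf{e}_\theta, \qquad
\mathbf{B} = -\nabla \mathbf{e}_r = -(\mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_\theta \mathbf{e}_\theta)/R.
\]

Then

\[
\mathbf{A} - z\mathbf{B} = \left(1 + \frac{z}{R}\right) (\mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_\theta \mathbf{e}_\theta).
\]

Again, a direct check of formula (7.5) for

\[
(\mathbf{A} - z\mathbf{B})^{-1} = \left(1 + \frac{z}{R}\right)^{-1} (\mathbf{e}_\varphi \mathbf{e}_\varphi + \mathbf{e}_\theta \mathbf{e}_\theta)
\]

demonstrates that it is the needed inverse tensor.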

These examples show that $(\mathbf{A} - z\mathbf{B})^{-1}$ cannot be found for all values of z. In both cases, the inverse tensor exists for all $|z| \le h/2$ if $h/(2R) < 1$. This is the domain in which each point is uniquely defined by the coordinate system introduced above. We shall use the definition of $(\mathbf{A} - z\mathbf{B})^{-1}$ in this restricted sense.
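The cylinder example can be reproduced numerically. The sketch below assumes the orthonormal tangent basis (e_φ, e_z), so that A − zB becomes a diagonal 2×2 matrix; the function names `shifter_cylinder`, `inverse_2x2`, and `matmul_2x2` are illustrative labels, not notation from the text.

```python
def shifter_cylinder(z, R):
    """Components of A - zB in the orthonormal tangent basis (e_phi, e_z)
    of a cylinder of radius R, where B = -e_phi e_phi / R."""
    return [[1.0 + z / R, 0.0],
            [0.0, 1.0]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ZeroDivisionError("A - zB is singular on the tangent plane")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul_2x2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


R, h = 1.0, 0.1  # illustrative radius and thickness
for z in (-h / 2, 0.0, h / 2):
    X = shifter_cylinder(z, R)
    # The product is the 2x2 identity, i.e. relation (7.5) on the tangent plane.
    print(z, matmul_2x2(inverse_2x2(X), X))

# At z = -R the factor 1 + z/R vanishes and no inverse exists.
try:
    inverse_2x2(shifter_cylinder(-R, R))
except ZeroDivisionError as err:
    print("no inverse:", err)
```

Inside the shell, where |z| ≤ h/2 < R, the factor 1 + z/R stays positive, which is the restricted sense in which the inverse is used.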

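Finally, using the dual basis and (7.4), we find the following representation of the spatial nabla operator, written here as $\nabla_{\mathrm{3D}}$ to distinguish it from the surface operator $\nabla = \boldsymbol{\rho}^\alpha\,\partial/\partial q^\alpha$ of Section 7.1:

\[
\begin{aligned}
\nabla_{\mathrm{3D}} &= \mathbf{r}^\alpha \frac{\partial}{\partial q^\alpha} + \mathbf{n} \frac{\partial}{\partial z} \\
&= (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\rho}^\alpha \frac{\partial}{\partial q^\alpha} + \mathbf{n} \frac{\partial}{\partial z} \\
&= (\mathbf{A} - z\mathbf{B})^{-1} \cdot \nabla + \mathbf{n} \frac{\partial}{\partial z}.
\end{aligned}
\]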

7.3 Shell Equilibrium Equations

Let us derive the two-dimensional equations for shell equilibrium. These will be a direct consequence of the three-dimensional equilibrium equations for the shell as a spatial body occupying a domain V in $\mathbb{R}^3$ (see Fig. 7.3). V is bounded by two surfaces (faces) $S_-$ and $S_+$, each at distance h/2 from the midsurface Σ, and by the lateral ruled surface $S_\nu$. $S_\nu$ is formed by the motion of the normal $\mathbf{n}$ to Σ along the boundary of Σ. In other words, V is the set of all spatial points given by the position vector

\[
\mathbf{r} = \boldsymbol{\rho}(q^1, q^2) + z\mathbf{n},
\]

where $\boldsymbol{\rho}$ is the position vector of a point on Σ and $-h/2 \le z \le h/2$. For simplicity, we suppose h is constant.

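On the faces $S_\pm$, the boundary conditions are

\[
\mathbf{n}_+ \cdot \boldsymbol{\sigma} \big|_{z=h/2} = \mathbf{t}^0_+, \qquad
\mathbf{n}_- \cdot \boldsymbol{\sigma} \big|_{z=-h/2} = \mathbf{t}^0_-, \tag{7.6}
\]

where $\mathbf{n}_\pm$ are the normals to $S_\pm$ (Fig. 7.3) and $\mathbf{t}^0_\pm$ are the surface loads on $S_\pm$, respectively. In the general case, on $S_\nu$ we will assign the mixed boundary conditions of the third problem of elasticity.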

Fig. 7.3 A shell-like body.

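Equation (6.20) with $\ddot{\mathbf{u}} = 0$ describes the equilibrium of a three-dimensional elastic body occupying volume V:

\[
\frac{1}{\sqrt{g}} \frac{\partial}{\partial q^i} \left( \sqrt{g}\, \sigma^{ij} \mathbf{r}_j \right) + \rho \mathbf{f} = 0. \tag{7.7}
\]

Using $q^1$, $q^2$, $q^3 = z$ as the coordinates, in equation (7.7) we can isolate the differentiation with respect to z. Let us denote

\[
\boldsymbol{\sigma}^\alpha = \mathbf{r}^\alpha \cdot \boldsymbol{\sigma} = \sigma^{\alpha j} \mathbf{r}_j, \qquad
\boldsymbol{\sigma}^3 = \mathbf{n} \cdot \boldsymbol{\sigma} = \sigma^{3j} \mathbf{r}_j.
\]

We get

\[
\frac{1}{\sqrt{g}} \left( \frac{\partial (\sqrt{g}\, \boldsymbol{\sigma}^\alpha)}{\partial q^\alpha} + \frac{\partial (\sqrt{g}\, \boldsymbol{\sigma}^3)}{\partial z} \right) + \rho \mathbf{f} = 0. \tag{7.8}
\]

We recall that g is the determinant of the matrix $g_{ij} = \mathbf{r}_i \cdot \mathbf{r}_j$. We can represent g in terms of the parameters of Σ. Indeed,

\[
g_{\alpha\beta} = \boldsymbol{\rho}_\alpha \cdot (\mathbf{A} - z\mathbf{B})^2 \cdot \boldsymbol{\rho}_\beta, \qquad g_{13} = 0, \qquad g_{23} = 0, \qquad g_{33} = 1.
\]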

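So g takes the form

\[
g = aG^2, \qquad a = a_{11} a_{22} - a_{12}^2, \qquad G = \det(\mathbf{A} - z\mathbf{B}), \tag{7.9}
\]

where $a_{\alpha\beta}$ are equal to the values of $g_{\alpha\beta}$ taken on Σ. Here $\det(\mathbf{A} - z\mathbf{B})$ is calculated as the determinant of a two-dimensional tensor:

\[
\det \mathbf{X} = X^1_1 X^2_2 - X^2_1 X^1_2,
\]

where $X^\beta_\alpha$ are the mixed components of $\mathbf{X} = \mathbf{A} - z\mathbf{B}$. Also observe that in the new notation (7.6) is

\[
\boldsymbol{\sigma}^3 \big|_{z=\pm h/2} = \mathbf{t}^0_\pm.
\]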

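Let us rewrite (7.8) in the form

\[
\frac{1}{\sqrt{a}} \frac{\partial (\sqrt{a}\, G \boldsymbol{\sigma}^\alpha)}{\partial q^\alpha} + \frac{\partial (G \boldsymbol{\sigma}^3)}{\partial z} + \rho G \mathbf{f} = 0. \tag{7.10}
\]

Now we can derive two-dimensional equilibrium equations for the shell. We integrate (7.10) over the thickness, i.e., along the normal coordinate z from −h/2 to h/2. Taking into account the boundary conditions (7.6), we get

\[
\frac{1}{\sqrt{a}} \frac{\partial}{\partial q^\alpha} \left( \sqrt{a}\, [\![ \boldsymbol{\sigma}^\alpha ]\!] \right) + G_+ \mathbf{t}^0_+ - G_- \mathbf{t}^0_- + [\![ \rho \mathbf{f} ]\!] = 0, \tag{7.11}
\]

where

\[
G_\pm = G \big|_{z=\pm h/2}.
\]

Here we introduced the notation $[\![ \cdots ]\!]$. For any quantity f, this denotes the definite integral of Gf over thickness:

\[
[\![ f ]\!] = \int_{-h/2}^{h/2} G f \, dz.
\]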

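Let us transform (7.11) to component-free form. The quantity

\[
\mathbf{T} = \boldsymbol{\rho}_\alpha [\![ \boldsymbol{\sigma}^\alpha ]\!]
\]

can be represented as follows:

\[
\begin{aligned}
\mathbf{T} &= \boldsymbol{\rho}_\alpha [\![ \mathbf{r}^\alpha \cdot \boldsymbol{\sigma} ]\!] \\
&= \boldsymbol{\rho}_\alpha [\![ \boldsymbol{\rho}^\alpha \cdot (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\sigma} ]\!] \\
&= \boldsymbol{\rho}_\alpha \boldsymbol{\rho}^\alpha \cdot [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\sigma} ]\!] \\
&= \mathbf{A} \cdot [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\sigma} ]\!] \\
&= [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\sigma} ]\!].
\end{aligned}
\]

We call $\mathbf{T}$ the stress resultant tensor. By definition, it is clear that $\mathbf{n} \cdot \mathbf{T} = 0$. The components of $\mathbf{T} \cdot \mathbf{A}$ are the stress resultants acting in the tangent plane of the shell. $\mathbf{T} \cdot \mathbf{n}$ is the vector of transverse shear stress resultants. So (7.11) becomes

\[
\nabla \cdot \mathbf{T} + \mathbf{q} = 0, \tag{7.12}
\]

where

\[
\mathbf{q} = G_+ \mathbf{t}^0_+ - G_- \mathbf{t}^0_- + [\![ \rho \mathbf{f} ]\!].
\]

The vector $\mathbf{q}$ is the distributed load on the shell. Its component $\mathbf{q} \cdot \mathbf{A}$ acts in the tangent plane to Σ, and $\mathbf{q} \cdot \mathbf{n}$ is the transverse load. So we have derived the first shell equilibrium equation.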

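To derive the second equilibrium equation, we cross-multiply the terms of (7.10) by $z\mathbf{n}$ from the left:

\[
\frac{1}{\sqrt{a}}\, z\mathbf{n} \times \frac{\partial (\sqrt{a}\, G \boldsymbol{\sigma}^\alpha)}{\partial q^\alpha} + z\mathbf{n} \times \frac{\partial (G \boldsymbol{\sigma}^3)}{\partial z} + \rho G z \mathbf{n} \times \mathbf{f} = 0. \tag{7.13}
\]

Moving $\mathbf{n}$ through the derivative sign, we get

\[
\frac{1}{\sqrt{a}} \frac{\partial}{\partial q^\alpha} \left( \mathbf{n} \times \sqrt{a}\, G z \boldsymbol{\sigma}^\alpha \right) - \frac{\partial \mathbf{n}}{\partial q^\alpha} \times G z \boldsymbol{\sigma}^\alpha + z \frac{\partial}{\partial z} \left( G \mathbf{n} \times \boldsymbol{\sigma}^3 \right) + \rho G z \mathbf{n} \times \mathbf{f} = 0.
\]

Because

\[
\mathbf{r}_\alpha = \boldsymbol{\rho}_\alpha + z \frac{\partial \mathbf{n}}{\partial q^\alpha}
\quad \text{so that} \quad
z \frac{\partial \mathbf{n}}{\partial q^\alpha} = \mathbf{r}_\alpha - \boldsymbol{\rho}_\alpha,
\]

we obtain

\[
\frac{\partial \mathbf{n}}{\partial q^\alpha} \times z \boldsymbol{\sigma}^\alpha = (\mathbf{r}_\alpha - \boldsymbol{\rho}_\alpha) \times \boldsymbol{\sigma}^\alpha = -\mathbf{n} \times \boldsymbol{\sigma}^3 - \boldsymbol{\rho}_\alpha \times \boldsymbol{\sigma}^\alpha.
\]

The last equality uses the relation $\mathbf{r}_\alpha \times \boldsymbol{\sigma}^\alpha = -\mathbf{n} \times \boldsymbol{\sigma}^3$, which follows from the symmetry of the stress tensor.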

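Since

\[
\begin{aligned}
\mathbf{r}_i \times \boldsymbol{\sigma}^i &= \mathbf{r}_i \times (\mathbf{r}^i \cdot \boldsymbol{\sigma}) \\
&= \mathbf{r}_i \times (\mathbf{r}^i \cdot \sigma^{kj} \mathbf{r}_k \mathbf{r}_j) \\
&= \mathbf{r}_i \times \mathbf{r}_j\, \sigma^{ij} \\
&= -\mathbf{r}_j \times \mathbf{r}_i\, \sigma^{ij} \\
&= -\mathbf{r}_j \times \mathbf{r}_i\, \sigma^{ji} \\
&= -\mathbf{r}_i \times \mathbf{r}_j\, \sigma^{ij} \\
&= -\mathbf{r}_i \times \boldsymbol{\sigma}^i,
\end{aligned}
\]

it follows that

\[
\mathbf{r}_i \times \boldsymbol{\sigma}^i = 0.
\]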
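The vanishing of this sum for a symmetric stress tensor can be illustrated in a Cartesian basis, where the basis and its dual coincide and the vector e_i · σ is simply row i of the component matrix. This is a minimal sketch under that assumption; the name `gibbsian_cross` is borrowed loosely from the terminology used later in this section, and the sample matrices are made up.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def gibbsian_cross(sigma):
    """Sum over i of e_i x (e_i . sigma) for a 3x3 component matrix
    given in a Cartesian basis e_1, e_2, e_3."""
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    total = [0.0, 0.0, 0.0]
    for i in range(3):
        row = sigma[i]  # the vector e_i . sigma
        term = cross(basis[i], row)
        total = [t + c for t, c in zip(total, term)]
    return total


symmetric = [[1.0, 2.0, 3.0],
             [2.0, 4.0, 5.0],
             [3.0, 5.0, 6.0]]
nonsymmetric = [[0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]]

print(gibbsian_cross(symmetric))     # zero vector
print(gibbsian_cross(nonsymmetric))  # nonzero: symmetry is essential
```

The second call shows that the result depends on the symmetry of the stress components, which is exactly the step where σ^{ij} is replaced by σ^{ji} in the chain of equalities.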

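From

\[
\mathbf{r}_i \times \boldsymbol{\sigma}^i = \mathbf{r}_\alpha \times \boldsymbol{\sigma}^\alpha + \mathbf{n} \times \boldsymbol{\sigma}^3 = 0
\]

it follows that

\[
\mathbf{r}_\alpha \times \boldsymbol{\sigma}^\alpha = -\mathbf{n} \times \boldsymbol{\sigma}^3.
\]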

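Equation (7.13) becomes

\[
\frac{1}{\sqrt{a}} \frac{\partial}{\partial q^\alpha} \left( \mathbf{n} \times \sqrt{a}\, G z \boldsymbol{\sigma}^\alpha \right) + \boldsymbol{\rho}_\alpha \times G \boldsymbol{\sigma}^\alpha + \frac{\partial}{\partial z} \left( G \mathbf{n} \times z \boldsymbol{\sigma}^3 \right) + \rho G z \mathbf{n} \times \mathbf{f} = 0.
\]

Integrating with respect to z from −h/2 to h/2 and using (7.6), we get the second vector equilibrium equation for the shell:

\[
\frac{1}{\sqrt{a}} \frac{\partial}{\partial q^\alpha} \left( \sqrt{a}\, [\![ \mathbf{n} \times z \boldsymbol{\sigma}^\alpha ]\!] \right) + \boldsymbol{\rho}_\alpha \times [\![ \boldsymbol{\sigma}^\alpha ]\!]
+ \frac{h G_+}{2}\, \mathbf{n} \times \mathbf{t}^0_+ + \frac{h G_-}{2}\, \mathbf{n} \times \mathbf{t}^0_- + [\![ \rho z \mathbf{n} \times \mathbf{f} ]\!] = 0. \tag{7.14}
\]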

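Now we represent it in component-free form. Let us introduce the stress couple tensor

\[
\mathbf{M} = \boldsymbol{\rho}_\alpha [\![ \mathbf{n} \times z \boldsymbol{\sigma}^\alpha ]\!],
\]

which we represent as

\[
\begin{aligned}
\mathbf{M} &= -\boldsymbol{\rho}_\alpha [\![ \mathbf{r}^\alpha \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!] \\
&= -\boldsymbol{\rho}_\alpha [\![ \boldsymbol{\rho}^\alpha \cdot (\mathbf{A} - z\mathbf{B})^{-1} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!] \\
&= -\boldsymbol{\rho}_\alpha \boldsymbol{\rho}^\alpha \cdot [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!] \\
&= -\mathbf{A} \cdot [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!] \\
&= -[\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!].
\end{aligned}
\]

By definition of $\mathbf{M}$, we see that $\mathbf{n} \cdot \mathbf{M} = \mathbf{M} \cdot \mathbf{n} = 0$. Let us denote

\[
\mathbf{m} = \tfrac{1}{2} h G_+\, \mathbf{n} \times \mathbf{t}^0_+ + \tfrac{1}{2} h G_-\, \mathbf{n} \times \mathbf{t}^0_- + [\![ \rho z \mathbf{n} \times \mathbf{f} ]\!],
\]

which represents the external bending couples applied to the shell surface. By the definition of $\mathbf{m}$, we see that $\mathbf{m}$ belongs to the tangent plane of Σ; that is, $\mathbf{m} \cdot \mathbf{n} = 0$.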

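Finally, we introduce a vectorial invariant of $\mathbf{T}$, which is

\[
\mathbf{T}_\times = \boldsymbol{\rho}_\alpha \times [\![ \boldsymbol{\sigma}^\alpha ]\!];
\]

this notation is called the Gibbsian cross. Now we can rewrite (7.14) in component-free form as follows:

\[
\nabla \cdot \mathbf{M} + \mathbf{T}_\times + \mathbf{m} = 0. \tag{7.15}
\]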

Note that (7.12) and (7.15) are exact consequences of the equilibrium equations for the three-dimensional continuum of the shell. They express the equality to zero of the sum of all forces and moments acting on a shell element.

So we have derived two two-dimensional equilibrium equations (7.12) and (7.15) for a shell. The unknowns $\mathbf{T}$ and $\mathbf{M}$ are defined on Σ. In what follows, Σ will be called the shell surface, or the midsurface or base surface of the shell.

7.4 Shell Deformation and Strains; Kirchhoff’s Hypotheses

Kirchhoff formulated his famous hypotheses on the laws of plate deformation that extend Bernoulli’s assumptions for bending of a beam. Since Kirchhoff’s hypotheses were extended by Love to shell theory, they are often called the Kirchhoff–Love hypotheses.

Kirchhoff’s kinematic hypothesis. Any shell cross section that is perpendicular to the midsurface before deformation remains perpendicular to the deformed midsurface. The transverse shear strains are negligibly small. Mathematically, this means that $\mathbf{n} \cdot \boldsymbol{\varepsilon} \cdot \boldsymbol{\tau} = 0$ for all $\boldsymbol{\tau}$ such that $\boldsymbol{\tau} \cdot \mathbf{n} = 0$.

Kirchhoff’s static hypothesis. The normal stress is negligible in comparison with the other stresses, so $\sigma_{33} = \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \mathbf{n} = 0$.

Experimentally, the behavior of the normal often reflects the real situation quite well. In this theory, we ignore shear strains and transverse compression in the shell. In the literature we can find different versions of Kirchhoff’s hypotheses. We can also find discussions regarding their interpretations, applications, etc. See [Ciarlet (1997); Ciarlet (2000); Chroscielewski, Makowski, and Pietraszkiewicz (2004); Donnell (1976); Goldenveizer (1976); Libai and Simmonds (1998); Novozhilov, Chernykh, and Mikhailovskiy (1991); Timoshenko (1985); Vorovich (1999); Zhilin (2006); Zubov (1982)]. These topics fall outside the scope of the present book.

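Kirchhoff’s static hypothesis allows us to eliminate $\varepsilon_{33} \equiv \mathbf{n} \cdot \boldsymbol{\varepsilon} \cdot \mathbf{n}$ from Hooke’s law. This is done in a manner similar to the elimination of $\varepsilon_{33}$ in plane elasticity (cf. Exercises 6.14 and 6.15). By Kirchhoff’s hypothesis,

\[
\sigma_{33} \equiv \mathbf{n} \cdot \boldsymbol{\sigma} \cdot \mathbf{n} = 0.
\]

Let the shell material be isotropic. Because

\[
\sigma_{33} = \lambda \operatorname{tr} \boldsymbol{\varepsilon} + 2\mu \varepsilon_{33} = 0
\]

it follows that

\[
\varepsilon_{33} = -\frac{\lambda}{\lambda + 2\mu} (\varepsilon^1_1 + \varepsilon^2_2).
\]

So for the stress and strain tensors of the shell we get the representations

\[
\begin{aligned}
\boldsymbol{\sigma} &= \tilde{\boldsymbol{\sigma}} + \sigma^{\alpha 3} (\mathbf{r}_\alpha \mathbf{n} + \mathbf{n} \mathbf{r}_\alpha), \\
\boldsymbol{\varepsilon} &= \tilde{\boldsymbol{\varepsilon}} + \varepsilon^{\alpha 3} (\mathbf{r}_\alpha \mathbf{n} + \mathbf{n} \mathbf{r}_\alpha) - \frac{\lambda}{\lambda + 2\mu} (\varepsilon^1_1 + \varepsilon^2_2)\, \mathbf{n}\mathbf{n},
\end{aligned}
\]

where the tangential parts $\tilde{\boldsymbol{\sigma}}$ and $\tilde{\boldsymbol{\varepsilon}}$ satisfy the relations

\[
\mathbf{n} \cdot \tilde{\boldsymbol{\sigma}} = 0, \qquad \tilde{\boldsymbol{\sigma}} \cdot \mathbf{n} = 0, \qquad \mathbf{n} \cdot \tilde{\boldsymbol{\varepsilon}} = 0, \qquad \tilde{\boldsymbol{\varepsilon}} \cdot \mathbf{n} = 0.
\]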

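Hooke’s law takes the form

\[
\boldsymbol{\sigma} = 2\mu \left[ \boldsymbol{\varepsilon} + \frac{\lambda}{\lambda + 2\mu}\, \mathbf{E} \operatorname{tr} \tilde{\boldsymbol{\varepsilon}} \right]
= \frac{E}{1 + \nu} \left[ \boldsymbol{\varepsilon} + \frac{\nu}{1 - \nu}\, \mathbf{E} \operatorname{tr} \tilde{\boldsymbol{\varepsilon}} \right]. \tag{7.16}
\]

Here the bold $\mathbf{E}$ is the unit tensor, the italic $E$ is Young’s modulus, ν is Poisson’s ratio, and $\operatorname{tr} \tilde{\boldsymbol{\varepsilon}} = \varepsilon^1_1 + \varepsilon^2_2$. In component representation, this is

\[
\begin{aligned}
\sigma^1_1 &= \frac{E}{1 - \nu^2} (\varepsilon^1_1 + \nu \varepsilon^2_2), &
\sigma^2_2 &= \frac{E}{1 - \nu^2} (\varepsilon^2_2 + \nu \varepsilon^1_1), \\
\sigma^2_1 &= \frac{E}{1 + \nu}\, \varepsilon^2_1, &
\sigma^3_1 &= \frac{E}{1 + \nu}\, \varepsilon^3_1, \\
\sigma^3_2 &= \frac{E}{1 + \nu}\, \varepsilon^3_2,
\end{aligned} \tag{7.17}
\]

where

\[
\sigma^\beta_\alpha = \mathbf{r}_\alpha \cdot \boldsymbol{\sigma} \cdot \mathbf{r}^\beta, \qquad
\sigma^3_\alpha = \mathbf{r}_\alpha \cdot \boldsymbol{\sigma} \cdot \mathbf{n}.
\]

The components of $\boldsymbol{\varepsilon}$ are denoted similarly.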

Exercise 7.11. Derive (7.16) and (7.17).
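A quick numerical cross-check of the elimination of ε33 may help before attempting the derivation. The sketch below assumes the standard relations between the Lamé constants and the engineering constants, λ = Eν/((1+ν)(1−2ν)) and μ = E/(2(1+ν)), and an orthonormal basis; the function names and the sample values are illustrative only.

```python
import math


def lame_from_engineering(E, nu):
    """Lame constants (lam, mu) from Young's modulus E and Poisson's ratio nu."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


def stresses_with_eps33_eliminated(eps11, eps22, lam, mu):
    """Three-dimensional Hooke's law with eps33 chosen so that sigma33 = 0.
    Returns (sigma11, sigma33)."""
    eps33 = -lam / (lam + 2.0 * mu) * (eps11 + eps22)
    trace = eps11 + eps22 + eps33
    sigma11 = lam * trace + 2.0 * mu * eps11
    sigma33 = lam * trace + 2.0 * mu * eps33
    return sigma11, sigma33


def sigma11_reduced(eps11, eps22, E, nu):
    """First component formula of (7.17)."""
    return E / (1.0 - nu**2) * (eps11 + nu * eps22)


E, nu = 210e9, 0.3            # illustrative steel-like values
eps11, eps22 = 1.0e-3, -4.0e-4
lam, mu = lame_from_engineering(E, nu)

s11, s33 = stresses_with_eps33_eliminated(eps11, eps22, lam, mu)
print(lam / (lam + 2.0 * mu), nu / (1.0 - nu))   # the two coefficients agree
print(s11, sigma11_reduced(eps11, eps22, E, nu))  # the two stresses agree
print(s33)                                        # zero up to rounding
```

The agreement of the two coefficients is what allows (7.16) to be written either with the Lamé constants or with E and ν.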

Now we turn to the kinematic hypothesis. By this hypothesis, any cross-section perpendicular to the midsurface before deformation remains perpendicular after deformation and, moreover, remains planar. Let us fix a point $q^1, q^2$ on Σ. Any cross-section through the point is planar after deformation if and only if the dependence of the displacement of the points on the normal at $q^1, q^2$ is linear in z. So we conclude that the general form of the displacement vector is

\[
\mathbf{u}(q^1, q^2, z) = \tilde{\mathbf{v}}(q^1, q^2) + w(q^1, q^2)\mathbf{n} - z \boldsymbol{\vartheta}(q^1, q^2), \tag{7.18}
\]

where

\[
\boldsymbol{\vartheta}(q^1, q^2) = \tilde{\boldsymbol{\vartheta}} + \vartheta_n \mathbf{n}, \qquad \mathbf{n} \cdot \tilde{\boldsymbol{\vartheta}} = 0.
\]

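Let us derive the expression for the strain tensor corresponding to (7.18). Writing $\nabla_{\mathrm{3D}}$ for the spatial nabla operator of Section 7.2 and $\nabla$ for the surface gradient, we have

\[
\begin{aligned}
\nabla_{\mathrm{3D}} \mathbf{u} &= \left( \nabla + \mathbf{n} \frac{\partial}{\partial z} \right) (\tilde{\mathbf{v}} + w\mathbf{n} - z\boldsymbol{\vartheta}) \\
&= \nabla \tilde{\mathbf{v}} + (\nabla w)\mathbf{n} - w\mathbf{B} - z \nabla \tilde{\boldsymbol{\vartheta}} + z \vartheta_n \mathbf{B} - \mathbf{n} \boldsymbol{\vartheta}
\end{aligned} \tag{7.19}
\]

and

\[
(\nabla_{\mathrm{3D}} \mathbf{u})^T = (\nabla \tilde{\mathbf{v}})^T + \mathbf{n} \nabla w - w\mathbf{B} - z (\nabla \tilde{\boldsymbol{\vartheta}})^T + z \vartheta_n \mathbf{B} - \boldsymbol{\vartheta} \mathbf{n}.
\]

So the strain tensor is

\[
\begin{aligned}
2\boldsymbol{\varepsilon} &= \nabla_{\mathrm{3D}} \mathbf{u} + (\nabla_{\mathrm{3D}} \mathbf{u})^T \\
&= \nabla \tilde{\mathbf{v}} + (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B} - z \left( \nabla \tilde{\boldsymbol{\vartheta}} + (\nabla \tilde{\boldsymbol{\vartheta}})^T \right) + 2z \vartheta_n \mathbf{B} \\
&\quad + (\nabla w)\mathbf{n} - \mathbf{n} \boldsymbol{\vartheta} + \mathbf{n} \nabla w - \boldsymbol{\vartheta} \mathbf{n}.
\end{aligned} \tag{7.20}
\]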

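We recall that by (7.1),

\[
\nabla \tilde{\mathbf{v}} = (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + (\mathbf{B} \cdot \tilde{\mathbf{v}})\mathbf{n}, \qquad
\nabla \tilde{\boldsymbol{\vartheta}} = (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + (\mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}})\mathbf{n}.
\]

Substituting these into (7.20), we get

\[
\begin{aligned}
2\boldsymbol{\varepsilon} &= (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B} - z \left( (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\boldsymbol{\vartheta}})^T \right) \\
&\quad + (\mathbf{B} \cdot \tilde{\mathbf{v}})\mathbf{n} + \mathbf{n}(\mathbf{B} \cdot \tilde{\mathbf{v}}) - z(\mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}})\mathbf{n} - z\mathbf{n}(\mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}}) + 2z \vartheta_n \mathbf{B} \\
&\quad + (\nabla w)\mathbf{n} - \mathbf{n} \boldsymbol{\vartheta} + \mathbf{n} \nabla w - \boldsymbol{\vartheta} \mathbf{n}.
\end{aligned} \tag{7.21}
\]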

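Let us derive some other consequences of Kirchhoff’s kinematic hypothesis. By the hypothesis we have

\[
\mathbf{n} \cdot \boldsymbol{\varepsilon} \cdot \boldsymbol{\tau} = 0, \tag{7.22}
\]

where $\boldsymbol{\tau}$ is an arbitrary vector orthogonal to $\mathbf{n}$; in components this is $\varepsilon_{31} = \varepsilon_{32} = 0$. Using (7.21) we find that

\[
2\mathbf{n} \cdot \boldsymbol{\varepsilon} = \mathbf{B} \cdot \tilde{\mathbf{v}} - \tilde{\boldsymbol{\vartheta}} + \nabla w - z \mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}} - 2\vartheta_n \mathbf{n}.
\]

All terms on the right-hand side except $2\vartheta_n \mathbf{n}$ are orthogonal to $\mathbf{n}$. It follows from (7.22), which holds for any $\boldsymbol{\tau}$ orthogonal to $\mathbf{n}$, that

\[
\mathbf{B} \cdot \tilde{\mathbf{v}} - \tilde{\boldsymbol{\vartheta}} + \nabla w - z \mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}} = 0.
\]

From this we find $\tilde{\boldsymbol{\vartheta}}$:

\[
\tilde{\boldsymbol{\vartheta}} = (\mathbf{A} + z\mathbf{B})^{-1} \cdot (\nabla w + \mathbf{B} \cdot \tilde{\mathbf{v}}).
\]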

In linear elasticity, the displacements and their derivatives should be very small. For a sufficiently thin and smooth shell, we can also suppose that $h\mathbf{B}$ is small. We formalize this as

Assumption S. Suppose $h \|\mathbf{B}\| \ll 1$.

So we assume that the terms $z\mathbf{B}$ are negligibly small in comparison with unity. Note that by (7.9), from Assumption S it follows that $G = \det(\mathbf{A} - z\mathbf{B})$ is very close to 1. Under Assumption S we find that

\[
\tilde{\boldsymbol{\vartheta}} = \nabla w + \mathbf{B} \cdot \tilde{\mathbf{v}}. \tag{7.23}
\]

With this, we have

\[
2\mathbf{n} \cdot \boldsymbol{\varepsilon} \cdot \boldsymbol{\tau} = -z \boldsymbol{\tau} \cdot \mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}}.
\]

On the left side of this equality we see $\mathbf{n} \cdot \boldsymbol{\varepsilon} \cdot \boldsymbol{\tau}$ which, according to Kirchhoff’s kinematic hypothesis, is zero. On the right, by Assumption S, we see terms of a higher order of smallness than $\tilde{\boldsymbol{\vartheta}}$. So the above choice of $\tilde{\boldsymbol{\vartheta}}$ makes Kirchhoff’s kinematic hypothesis sufficiently accurate. Note that the quantity $\vartheta_n$ is still undefined.
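To get a feeling for how close G is to 1 under Assumption S, one can evaluate G through the thickness for a sphere. The sketch below assumes principal axes, in which G = (1 − z k1)(1 − z k2) with k1, k2 the eigenvalues of B, and uses the sign convention of Section 7.2, where B = −A/R for a sphere; the function names and thickness ratios are illustrative.

```python
def shell_G(z, k1, k2):
    """G = det(A - zB) in principal axes; k1, k2 are the eigenvalues of B."""
    return (1.0 - z * k1) * (1.0 - z * k2)


def max_deviation_from_one(h, k1, k2, samples=101):
    """Largest |G - 1| over a uniform grid on -h/2 <= z <= h/2."""
    zs = [-h / 2 + h * i / (samples - 1) for i in range(samples)]
    return max(abs(shell_G(z, k1, k2) - 1.0) for z in zs)


R = 1.0
for h in (0.1, 0.01, 0.001):
    # Sphere of radius R: both eigenvalues of B equal -1/R.
    print(h / R, max_deviation_from_one(h, -1.0 / R, -1.0 / R))
```

For these values the deviation of G from 1 is of the same order as h/R, which is the size of the terms discarded when zB is neglected against A.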

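Substituting (7.23) into (7.21), we find that

\[
\begin{aligned}
2\boldsymbol{\varepsilon} &= (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B} - z \left( (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\boldsymbol{\vartheta}})^T \right) \\
&\quad - 2\vartheta_n \mathbf{n}\mathbf{n} + \underline{2z \vartheta_n \mathbf{B} - z(\mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}})\mathbf{n} - z\mathbf{n}(\mathbf{B} \cdot \tilde{\boldsymbol{\vartheta}})}.
\end{aligned} \tag{7.24}
\]

By Assumption S, we set the underlined terms in (7.24) to zero. Finally, we get the expression for the strain tensor in the form

\[
\begin{aligned}
2\boldsymbol{\varepsilon} &= (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B} - z \left( (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\boldsymbol{\vartheta}})^T \right) - 2\vartheta_n \mathbf{n}\mathbf{n} \\
&= 2\tilde{\boldsymbol{\varepsilon}} - 2\vartheta_n \mathbf{n}\mathbf{n},
\end{aligned} \tag{7.25}
\]

where

\[
2\tilde{\boldsymbol{\varepsilon}} = (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B} - z \left( (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\boldsymbol{\vartheta}})^T \right).
\]

We rewrite the last equation as

\[
\tilde{\boldsymbol{\varepsilon}} = \boldsymbol{\epsilon} + z\, \text{\ae}, \tag{7.26}
\]

where

\[
2\boldsymbol{\epsilon} = (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B}, \qquad
\text{\ae} = -\frac{1}{2} \left( (\nabla \tilde{\boldsymbol{\vartheta}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\boldsymbol{\vartheta}})^T \right).
\]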

The tensor $\boldsymbol{\epsilon}$ is the tangential strain measure or the midsurface strain tensor. It describes the shell deformation in the tangent plane. The tensor æ describes the shell deformation due to bending, and is called the bending strain measure or the tensor of changes of curvature. The simplest interpretation for these quantities will be formulated later for a plate, a particular case of the shell. Various simplifications of the strain measures appear in the literature. They are based on the assumptions of smallness of certain terms in $\boldsymbol{\epsilon}$ and æ. We will not consider these particular cases here. The above deformation measures are used, for example, by W.T. Koiter [Koiter (1970)].

Let us note that in the framework of Kirchhoff’s hypotheses, the quantity $\vartheta_n$ does not appear in the expressions for $\boldsymbol{\epsilon}$ and æ. It is only involved in $\varepsilon_{33}$. Thus, in the Kirchhoff–Love theory, $\vartheta_n$ does not affect the shell deformation and stress characteristics: indeed, if we set $\vartheta_n = 0$ we get the above expressions for $\boldsymbol{\epsilon}$ and æ. The quantity $\vartheta_n$ describes the rotation of the shell cross section, which is tangent to the midsurface, about the normal. From a physical standpoint, at least for a homogeneous shell, these rotations are very small in comparison with the other rotations. This is related to the fact that the shell twisting stiffness is much bigger than its bending stiffness. Thus, in the Kirchhoff–Love theory, a consideration of $\vartheta_n$ and the rotations of the cross sections about the normal related to $\vartheta_n$ is impossible. In much of the literature, it is assumed that $\vartheta_n$ is zero; alternatively some analysis of its smallness, similar to the above, is proposed. See [Novozhilov, Chernykh, and Mikhailovskiy (1991); Vorovich (1999); Zhilin (2006)]. The reader can find a rigorous asymptotic proof of the smallness of $\vartheta_n$, under certain assumptions on the loads, the type of boundary conditions, and shell homogeneity, in [Goldenveizer (1976)].

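Let us turn to Hooke’s law. Substitute the tangential strain $\tilde{\boldsymbol{\varepsilon}}$ from (7.26) into equation (7.17). The expression for the tangential stress $\tilde{\boldsymbol{\sigma}}$ becomes

\[
\tilde{\boldsymbol{\sigma}} = \frac{E}{1 + \nu} \left[ \tilde{\boldsymbol{\varepsilon}} + \frac{\nu}{1 - \nu}\, \mathbf{A} \operatorname{tr} \tilde{\boldsymbol{\varepsilon}} \right]. \tag{7.27}
\]

In component form it is

\[
\sigma^\beta_\alpha = \frac{E}{1 + \nu} \left[ \varepsilon^\beta_\alpha + \frac{\nu}{1 - \nu}\, \delta^\beta_\alpha \operatorname{tr} \tilde{\boldsymbol{\varepsilon}} \right] \tag{7.28}
\]

or, in more detail,

\[
\begin{aligned}
\sigma^1_1 &= \frac{E}{1 - \nu^2} (\epsilon^1_1 + \nu \epsilon^2_2) + \frac{Ez}{1 - \nu^2} (\text{\ae}^1_1 + \nu\, \text{\ae}^2_2), \\
\sigma^2_2 &= \frac{E}{1 - \nu^2} (\epsilon^2_2 + \nu \epsilon^1_1) + \frac{Ez}{1 - \nu^2} (\text{\ae}^2_2 + \nu\, \text{\ae}^1_1), \\
\sigma^2_1 &= \frac{E}{1 + \nu}\, \epsilon^2_1 + \frac{Ez}{1 + \nu}\, \text{\ae}^2_1.
\end{aligned}
\]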

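Using these, we can find the resultants $\mathbf{T} \cdot \mathbf{A}$ and $\mathbf{M}$ via the formulas

\[
\mathbf{T} = [\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot \boldsymbol{\sigma} ]\!], \qquad
\mathbf{M} = -[\![ (\mathbf{A} - z\mathbf{B})^{-1} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!].
\]

In the context of Kirchhoff’s theory, knowing the deformation, we cannot recover the transverse shear stress resultants $\mathbf{T} \cdot \mathbf{n}$; indeed by Kirchhoff’s hypothesis, the shear strains $\varepsilon^3_1$ and $\varepsilon^3_2$ in the shell are zero. But the vector $\mathbf{T} \cdot \mathbf{n}$, which represents the shear stress components, is not zero. Until now it has been left indefinite, but $\mathbf{T} \cdot \mathbf{n}$ can be found via (7.12) and (7.15).

Assumption S allows us to overcome the difficulties of integrating $(\mathbf{A} - z\mathbf{B})^{-1}$ over the thickness. This simplifies the formulas for the stress resultants and stress couples to

\[
\mathbf{T} \cdot \mathbf{A} = [\![ \mathbf{A} \cdot \tilde{\boldsymbol{\sigma}} ]\!], \qquad
\mathbf{M} = -[\![ \mathbf{A} \cdot z \boldsymbol{\sigma} \times \mathbf{n} ]\!]. \tag{7.29}
\]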

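The simplification (7.29) implies that the two-dimensional equilibrium equations (7.12) and (7.15) with these versions of $\mathbf{T}$ and $\mathbf{M}$ are not exact consequences of the three-dimensional equilibrium equations as earlier. Rather, they are only first approximations to the equilibrium equations that differ in terms of the second order of smallness. Integrating with respect to z over the thickness, we get

\[
\mathbf{T} \cdot \mathbf{A} = \frac{Eh}{1 + \nu} \left[ \boldsymbol{\epsilon} + \frac{\nu}{1 - \nu}\, \mathbf{A} \operatorname{tr} \boldsymbol{\epsilon} \right] \tag{7.30}
\]

and

\[
\mathbf{M} = -\frac{Eh^3}{12(1 + \nu)} \left[ \text{\ae} + \frac{\nu}{1 - \nu}\, \mathbf{A} \operatorname{tr} \text{\ae} \right] \times \mathbf{n}. \tag{7.31}
\]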

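In the basis $\boldsymbol{\rho}_\alpha$, we have

\[
\mathbf{T} \cdot \mathbf{A} = T^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta
\]

and

\[
\boldsymbol{\epsilon} = \epsilon^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta.
\]

Then the component representation of the stress resultant tensor is

\[
T^{\alpha\beta} = \frac{Eh}{1 + \nu} \left[ \epsilon^{\alpha\beta} + \frac{\nu}{1 - \nu}\, a^{\alpha\beta} \epsilon^\gamma_\gamma \right]. \tag{7.32}
\]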

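Comparing this with (7.28), we see that

\[
T^{\alpha\beta} = [\![ \sigma^{\alpha\beta} ]\!] \equiv \int_{-h/2}^{h/2} \sigma^{\alpha\beta} \, dz. \tag{7.33}
\]

Here we have used the fact that, under Assumption S, we can take G = 1; hence we redefined

\[
[\![ f ]\!] \equiv \int_{-h/2}^{h/2} f(z) \, dz.
\]

Similarly, we find the components of the stress couple tensor:

\[
M^{\alpha\beta} = [\![ z \sigma^{\alpha\beta} ]\!] \equiv \int_{-h/2}^{h/2} z \sigma^{\alpha\beta} \, dz. \tag{7.34}
\]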

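These are related to the components of æ through the formulas

\[
M^{\alpha\beta} = \frac{Eh^3}{12(1 + \nu)} \left[ \text{\ae}^{\alpha\beta} + \frac{\nu}{1 - \nu}\, a^{\alpha\beta}\, \text{\ae}^\gamma_\gamma \right], \tag{7.35}
\]

where the factor $h^3/12$ arises from integrating $z^2$ over the thickness, in agreement with (7.31). Comparing (7.33) and (7.34) with (7.29), we see that

\[
\mathbf{T} \cdot \mathbf{A} = T^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta, \qquad
\mathbf{M} = -M^{\alpha\beta} \boldsymbol{\rho}_\alpha \boldsymbol{\rho}_\beta \times \mathbf{n}.
\]

These expressions for the components of stress resultants and stress couple tensors were used in [Koiter (1970)], [Donnell (1976)], [Libai and Simmonds (1998)], and many others.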
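The thickness integrals (7.33) and (7.34) can be checked numerically for a single stress component. The sketch below is a scalar illustration under simplifying assumptions: G = 1, an orthonormal basis, and only one nonzero midsurface strain and one nonzero curvature change, so that the stress varies linearly in z. The quadrature routine and all names are illustrative.

```python
import math


def simpson(f, a, b, n=100):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def sigma11(z, E, nu, eps11, kappa11):
    """Stress through the thickness for strain eps11 + z*kappa11,
    with all other strain components set to zero."""
    return E / (1.0 - nu**2) * (eps11 + z * kappa11)


E, nu, h = 210e9, 0.3, 0.01       # illustrative material and thickness
eps11, kappa11 = 1.0e-4, 2.0e-2   # illustrative midsurface strain and curvature change

T11 = simpson(lambda z: sigma11(z, E, nu, eps11, kappa11), -h / 2, h / 2)
M11 = simpson(lambda z: z * sigma11(z, E, nu, eps11, kappa11), -h / 2, h / 2)

print(T11, E * h / (1.0 - nu**2) * eps11)               # membrane stiffness E h / (1 - nu^2)
print(M11, E * h**3 / (12.0 * (1.0 - nu**2)) * kappa11)  # bending stiffness E h^3 / (12 (1 - nu^2))
```

The bending term drops out of the first integral and the membrane term drops out of the second because the integrands are odd in z, which is why the resultants and couples decouple in (7.32) and (7.35).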

Donnell analyzed the conditions under which a particular class of shallow shells that obey Assumption S can be used. The theory of shallow shells is usable for moderate deformations where the shell shape remains close to planar. For example, this is the case when the midsurface is a spherical cap which is a relatively small part of the complete sphere. This theory is attributed to Donnell, Marguerre, and Vlasov. In writing out æ, they neglect $\mathbf{B} \cdot \tilde{\mathbf{v}}$ in the expression (7.23) as a small quantity of higher order, so that $\tilde{\boldsymbol{\vartheta}} = \nabla w$. Thus, the strains in shallow shell theory are given by

\[
2\boldsymbol{\epsilon} = (\nabla \tilde{\mathbf{v}}) \cdot \mathbf{A} + \mathbf{A} \cdot (\nabla \tilde{\mathbf{v}})^T - 2w\mathbf{B}, \qquad \text{\ae} = -\nabla \nabla w.
\]

This simplification for æ was first introduced by the Russian engineer and mechanician Vasilij Z. Vlasov [Vlasov (1949)].

In shallow shell theory, in the expressions for the stress resultants and stress couples, we use a “strange” plane geometry; we formally set the curvature components of $\mathbf{B}$ to zero in æ but retain them in $\boldsymbol{\epsilon}$. The differential operations in this theory are produced in the curvilinear coordinate system. Note that Vlasov’s initial equations were derived when the geometry of a shell was taken to be plane, but some nonzero curvatures were artificially included in $\boldsymbol{\epsilon}$. A particular case of shallow shell theory is the plate theory considered in § 7.8.

It is worth mentioning some aspects of the application of tensor analysis to shell theory. In the literature, one can find many versions of linear shell theory using various assumptions on the smallness of certain quantities. Some of these were derived in special coordinate systems like the ones constituted by the principal curvature lines. In a certain sense, for an isotropic material, such equations may lose their isotropic properties when we express them in arbitrary curvilinear coordinates. Such a change of the isotropic property was noted by [Zhilin (2006)] for an early version of the shell equations by Novozhilov. Tensor analysis allows us to avoid similar accidents.

7.5 Shell Energy

We saw the importance of strain and total energy in elasticity. Using these, we proved uniqueness of solution for boundary value problems of elasticity, and formulated variational principles of elasticity. In shell theory, the energy also plays an important role. Kirchhoff’s hypotheses affect the form of the stresses and strains in a shell as a three-dimensional body, so we should derive the form of the strain energy density for the shell.

The strain energy density W of the shell as a three-dimensional body takes the form

\[
W = \frac{1}{2} \boldsymbol{\sigma} \cdot\!\cdot\, \boldsymbol{\varepsilon} = \frac{1}{2} \sigma^i_j \varepsilon^j_i = \frac{1}{2} \sigma^\alpha_\beta \varepsilon^\beta_\alpha.
\]

This follows from Kirchhoff’s hypotheses that $\varepsilon_{13} = 0$, $\varepsilon_{23} = 0$, and $\sigma_{33} = 0$.


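Substituting (7.28) into W, we get

\[
W = \frac{E}{2(1 + \nu)} \left[ \tilde{\boldsymbol{\varepsilon}} \cdot\!\cdot\, \tilde{\boldsymbol{\varepsilon}} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \tilde{\boldsymbol{\varepsilon}} \right]
= \frac{E}{2(1 + \nu)} \left[ \varepsilon^\beta_\alpha \varepsilon^\alpha_\beta + \frac{\nu}{1 - \nu} (\varepsilon^\alpha_\alpha)^2 \right]. \tag{7.36}
\]

Using (7.26), we transform this to

\[
\begin{aligned}
W &= \frac{E}{2(1 + \nu)} \left[ \boldsymbol{\epsilon} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \boldsymbol{\epsilon} \right]
+ \frac{E z^2}{2(1 + \nu)} \left[ \text{\ae} \cdot\!\cdot\, \text{\ae} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \text{\ae} \right] \\
&\quad + \frac{E z}{1 + \nu} \left[ \boldsymbol{\epsilon} \cdot\!\cdot\, \text{\ae} + \frac{\nu}{1 - \nu} \operatorname{tr} \boldsymbol{\epsilon} \operatorname{tr} \text{\ae} \right].
\end{aligned} \tag{7.37}
\]

So, using Kirchhoff’s hypotheses, we have derived the expression for W in terms of $\boldsymbol{\epsilon}$, æ, and z.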

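Consider the strain energy functional

\[
\mathcal{E} = \int_V W \, dV.
\]

We recall that in curvilinear coordinates $q^1, q^2, z$ the volume element is

\[
dV = \sqrt{g}\, dq^1\, dq^2\, dz = \sqrt{a}\, G\, dq^1\, dq^2\, dz, \qquad G = \det(\mathbf{A} - z\mathbf{B}).
\]

Under our assumptions on smallness of the displacements, W is a quantity of the second order of smallness because it is a quadratic form in the strains. Under Assumption S, we can neglect $z\mathbf{B}$ in comparison with $\mathbf{A}$, for which $\det \mathbf{A} = 1$, and change G to 1. Integrating (7.37) over z explicitly, we transform $\mathcal{E}$ to

\[
\mathcal{E} = \int_V W \sqrt{a}\, dq^1\, dq^2\, dz = \int_\Sigma U \sqrt{a}\, dq^1\, dq^2,
\]

where $U = [\![ W ]\!]$ denotes the shell strain energy density per area:

\[
U = \frac{Eh}{2(1 + \nu)} \left[ \boldsymbol{\epsilon} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \boldsymbol{\epsilon} \right]
+ \frac{Eh^3}{24(1 + \nu)} \left[ \text{\ae} \cdot\!\cdot\, \text{\ae} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \text{\ae} \right]. \tag{7.38}
\]

U is a quadratic form in two tensors, $\boldsymbol{\epsilon}$ and æ, defined on Σ. The term of (7.37) that is linear in z integrates to zero over the symmetric interval, so U splits into two terms; the first depends only on $\boldsymbol{\epsilon}$, while the second depends only on æ. From a physical viewpoint, this decomposition means that the strain energy density splits into two parts. The first is due to the shell tangential deformation, i.e., the deformations in the tangent plane. The second is due to shell bending.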


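From (7.38) it follows that $U \ge 0$, and that $U = 0$ if and only if both $\boldsymbol{\epsilon} = 0$ and æ = 0. Indeed, we recall that $-1 < \nu \le 1/2$ for an elastic material. Then

\[
\boldsymbol{\epsilon} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \boldsymbol{\epsilon} > 0 \quad \text{when } \boldsymbol{\epsilon} \ne 0.
\]

Similarly,

\[
\text{\ae} \cdot\!\cdot\, \text{\ae} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \text{\ae} > 0 \quad \text{when } \text{\ae} \ne 0.
\]

Exercise 7.12. Show that

\[
\boldsymbol{\epsilon} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{\nu}{1 - \nu} \operatorname{tr}^2 \boldsymbol{\epsilon} > 0 \quad \text{when } \boldsymbol{\epsilon} \ne 0.
\]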

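Using the component notation for $\mathbf{T}$ and $\mathbf{M}$, we can represent U in another form:

\[
U = \frac{1}{2} T^{\alpha\beta} \epsilon_{\alpha\beta} + \frac{1}{2} M^{\alpha\beta}\, \text{\ae}_{\alpha\beta}. \tag{7.39}
\]

From (7.39), with use of (7.32) and (7.35), it follows that

\[
T^{\alpha\beta} = \frac{\partial U}{\partial \epsilon_{\alpha\beta}}, \qquad
M^{\alpha\beta} = \frac{\partial U}{\partial\, \text{\ae}_{\alpha\beta}}. \tag{7.40}
\]

In component-free form these are

\[
U = \frac{1}{2} \mathbf{T} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{1}{2} \mathbf{M} \cdot\!\cdot\, (\mathbf{n} \times \text{\ae})
= \frac{1}{2} \mathbf{T} \cdot\!\cdot\, \boldsymbol{\epsilon} + \frac{1}{2} (\mathbf{M} \times \mathbf{n}) \cdot\!\cdot\, \text{\ae} \tag{7.41}
\]

and

\[
\mathbf{T} = \frac{\partial U}{\partial \boldsymbol{\epsilon}}, \qquad
\mathbf{M} \times \mathbf{n} = \frac{\partial U}{\partial\, \text{\ae}}. \tag{7.42}
\]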

U inherits the positive definiteness of the energy density in three-dimensional linear elasticity, where the energy density W = 0 if and only if $\boldsymbol{\varepsilon} = 0$. We recall that $\boldsymbol{\varepsilon} = 0$ if and only if the displacement field is a small displacement of a rigid body, i.e., it is the sum of a parallel translation and a rotation of the body in space. We encounter a similar situation in shell theory. In Kirchhoff’s theory, the expressions for $\boldsymbol{\epsilon}$ and æ are cumbersome, so we will demonstrate that fact later only for a simpler case of the plate theory.

Exercise 7.13. Prove (7.41).
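The relation between the energy density and the stress resultants in (7.40) can be illustrated numerically for the membrane part of (7.38). The sketch below assumes an orthonormal tangent basis, so that covariant and contravariant components coincide and the metric components reduce to the Kronecker delta; the function names, material values, and the sample strain are illustrative only.

```python
import math


def membrane_energy(eps, E, nu, h):
    """First term of (7.38) for a 2x2 strain matrix in an orthonormal tangent basis."""
    double_dot = sum(eps[i][j] * eps[j][i] for i in range(2) for j in range(2))
    trace = eps[0][0] + eps[1][1]
    return E * h / (2.0 * (1.0 + nu)) * (double_dot + nu / (1.0 - nu) * trace**2)


def stress_resultants(eps, E, nu, h):
    """Closed form (7.32) in the same basis."""
    trace = eps[0][0] + eps[1][1]
    c = E * h / (1.0 + nu)
    return [[c * (eps[i][j] + (nu / (1.0 - nu) * trace if i == j else 0.0))
             for j in range(2)] for i in range(2)]


def numerical_gradient(eps, E, nu, h, step=1e-7):
    """Central differences of the energy with respect to each strain component,
    treating the four components as independent variables."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            plus = [row[:] for row in eps]
            minus = [row[:] for row in eps]
            plus[i][j] += step
            minus[i][j] -= step
            grad[i][j] = (membrane_energy(plus, E, nu, h)
                          - membrane_energy(minus, E, nu, h)) / (2.0 * step)
    return grad


E, nu, h = 70e9, 0.3, 0.002            # illustrative values
eps = [[1.0e-3, 2.0e-4],
       [2.0e-4, -5.0e-4]]              # symmetric sample midsurface strain

T_closed = stress_resultants(eps, E, nu, h)
T_numeric = numerical_gradient(eps, E, nu, h)
print(T_closed)
print(T_numeric)
print(membrane_energy(eps, E, nu, h) > 0.0)
```

The two printed matrices agree to within the accuracy of the finite differences, and the energy is positive for this nonzero strain, in line with the positivity statement above.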


7.6 Boundary Conditions

To uniquely define a solution of the equilibrium problem for an elastic shell, we should supplement the equilibrium equations with boundary conditions. This is no elementary task: in the early days of shell theory, the question of boundary conditions was debated for a long time. Even the great Poisson proposed an erroneous number of boundary conditions.

The equilibrium equations are defined on the midsurface Σ, so the boundary conditions are given on its boundary contour ω.

First we will introduce the kinematic boundary conditions: on the boundary we assign the displacement $\mathbf{v}$ that defines the position of the boundary contour after deformation. In the Kirchhoff–Love theory we should also assign the rotation of the normal $\mathbf{n}$ about the vector tangent to the shell boundary contour, that is, about the vector $\mathbf{n} \times \boldsymbol{\nu}$ (Fig. 7.4).

Fig. 7.4 The trihedron $\mathbf{n}$, $\boldsymbol{\nu}$, $\boldsymbol{\tau} = \mathbf{n} \times \boldsymbol{\nu}$ used to formulate the boundary conditions.

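The rotation is defined by the formula $\tilde{\boldsymbol{\vartheta}} = \nabla w + \mathbf{B} \cdot \tilde{\mathbf{v}}$, so the rotation about $\boldsymbol{\tau} = \mathbf{n} \times \boldsymbol{\nu}$ is

\[
\vartheta_\nu \equiv \boldsymbol{\nu} \cdot \tilde{\boldsymbol{\vartheta}} = \boldsymbol{\nu} \cdot \nabla w + \boldsymbol{\nu} \cdot \mathbf{B} \cdot \tilde{\mathbf{v}} = \frac{\partial w}{\partial \nu} + \boldsymbol{\nu} \cdot \mathbf{B} \cdot \tilde{\mathbf{v}},
\]

where $\boldsymbol{\nu} \cdot \nabla w = \partial w / \partial \nu$, the derivative with respect to the normal. Thus, in the Kirchhoff–Love theory, the kinematic boundary conditions consist of four scalar equations:

\[
\mathbf{v} \big|_\omega = \mathbf{v}^0(s), \qquad
\left( \frac{\partial w}{\partial \nu} + \boldsymbol{\nu} \cdot \mathbf{B} \cdot \tilde{\mathbf{v}} \right) \bigg|_\omega = \vartheta^0(s), \tag{7.43}
\]

where $\mathbf{v}^0(s)$ and $\vartheta^0(s)$ are given on ω.

Some difficulties are involved in formulating static boundary conditions in the Kirchhoff–Love theory. At first glance, it seems we could assign the quantities $\boldsymbol{\nu} \cdot \mathbf{T}$ and $\boldsymbol{\nu} \cdot (\mathbf{M} \times \mathbf{n})$ on the boundary. But this provides five, not four, scalar conditions. Earlier we mentioned Poisson’s error in constructing plate theory: he proposed too many static conditions on the boundary.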


The derivation of the static boundary conditions can be based on the Lagrange variational principle for the shell, which states that in equilibrium the total energy functional takes its minimum value on the set of kinematically admissible displacements. The static boundary conditions become the natural boundary conditions for the problem of minimizing the total energy; they are dual to the kinematic conditions. For the shell, the proof of the Lagrange principle and the derivation of the static boundary conditions are cumbersome, so we refer to more specialized literature (e.g., [Green and Zerna (1954); Novozhilov, Chernykh, and Mikhailovskiy (1991)]) and merely sketch the procedure for deriving the boundary conditions. Later, this will be given in more detail for the technically simpler case of a plate.

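A consequence of the Lagrange variational principle is the statement that the variation of the shell strain energy is equal to the work of external forces on the kinematically admissible displacements:

\[
\delta \mathcal{E} = \delta \mathcal{A}.
\]

It turns out that on the set of admissible displacements of the shell edge, the quantities $\mathbf{v} = \tilde{\mathbf{v}} + w\mathbf{n}$ and $\vartheta_\nu$, which constitute four scalar quantities, can be assigned independently. Hence the work of external forces on the shell edge takes the form

\[
\delta \mathcal{A} = \int_\omega \left( \boldsymbol{\varphi}^0(s) \cdot \delta \mathbf{v} - \ell^0(s)\, \delta \vartheta_\nu \right) ds,
\]

where $\boldsymbol{\varphi}^0(s)$ and $\ell^0(s)$ are given functions on ω. It is seen that $\delta \mathcal{A}$ depends on $\delta \mathbf{v}$ and $\delta \vartheta_\nu$ linearly. With regard for (7.40) we get

\[
\delta \mathcal{E} = \int_\Sigma \delta U \sqrt{a}\, dq^1\, dq^2 = \int_\Sigma \left( T^{\alpha\beta}\, \delta \epsilon_{\alpha\beta} + M^{\alpha\beta}\, \delta\, \text{\ae}_{\alpha\beta} \right) \sqrt{a}\, dq^1\, dq^2.
\]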

The further transformations of δE are done using integration by parts andthe Gauss–Ostrogradsky theorem. We should represent δεαβ and δæαβ

through δv and δϑ and eliminate their derivatives from the surface inte-grals. Unlike the case with three-dimensional elasticity, δæαβ contains thesecond derivatives of w and hence integration by parts should be done twice.For the boundary integrals we should also apply further transformations insuch a way that they contain only the factors δv and δϑν ; the techniqueswill be presented in the section for the plate. The equation δE = δA holdsfor any admissible δv and δϑ. Selecting only δv and δϑ satisfying the ho-mogeneous kinematic conditions, we find that the minimum point v and ϑsatisfies the equilibrium equations on Σ. Next, extending the set of δv and


δϑ to all the admissible δv and δϑ, we derive the following set of staticboundary conditions for v and ϑ. They are defined on ω:

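\[
\begin{aligned}
&\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_{\omega} = \boldsymbol{\varphi}^0(s)\cdot\mathbf{A},\\
&\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{n}\big|_{\omega} - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr)\Big|_{\omega} = \boldsymbol{\varphi}^0(s)\cdot\mathbf{n},\\
&\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_{\omega} = \ell^0(s).
\end{aligned}
\tag{7.44}
\]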

As with the kinematic boundary conditions, these are also four scalar equations. Here ϕ0(s) and ℓ0(s) are given functions on ω that represent the stress resultants and the bending couples on the shell edge.

The principal boundary conditions used in engineering are as follows.

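Clamped (fixed) edge:

\[ \mathbf{v}\big|_{\omega} = \mathbf{0}, \qquad \frac{\partial w}{\partial\nu}\Big|_{\omega} = 0. \]

Simple support edge:

\[ \mathbf{v}\big|_{\omega} = \mathbf{0}, \qquad \boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_{\omega} = 0. \]

Free edge:

\[
\begin{aligned}
&\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_{\omega} = \mathbf{0}, \qquad \boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{n}\big|_{\omega} - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr)\Big|_{\omega} = 0,\\
&\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_{\omega} = 0.
\end{aligned}
\]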

Mixed boundary conditions are also possible. For example, on a part of ω we might specify (7.43) and on the remainder (7.44). Other combinations of boundary conditions, four at each point of ω, define various boundary value problems. Admissible combinations of boundary conditions are defined by the variational setups of these problems.

7.7 A Few Remarks on the Kirchhoff–Love Theory

There exist various approaches in the Kirchhoff–Love theory. They use different simplifying assumptions that depend on the choice of constitutive relations for U, T, and M, as well as on the choice of deformation measures ε and æ. In a linear theory, equation (7.38) for U is a common form of dependence of U on the deformation measures (see, e.g., [Donnell (1976); Koiter (1970); Novozhilov, Chernykh, and Mikhailovskiy (1991); Timoshenko (1985)]).


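The versions may differ in their definitions of T. We introduced T as

\[ \mathbf{T} = [\![(\mathbf{A} - z\mathbf{B})^{-1}\cdot\boldsymbol{\sigma}]\!] \]

and used Assumption S to recast it in the form

\[ \mathbf{T} = [\![\mathbf{A}\cdot\boldsymbol{\sigma}]\!]. \]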

In the Kirchhoff–Love literature, one can find both the initial and simplified definitions. Let us consider this difference in more detail. We introduce a special coordinate system on the shell surface, defined by the principal curvature curves on Σ. Now B takes the diagonal form

B = e1e1/R1 + e2e2/R2,

where R1 and R2 are principal radii of curvature of Σ. Then

A = e1e1 + e2e2

and

G = (1 − z/R1)(1 − z/R2).

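The tensor G(A − zB)^{-1} takes the form

\[ G(\mathbf{A} - z\mathbf{B})^{-1} = \left(1 - \frac{z}{R_2}\right)\mathbf{e}_1\mathbf{e}_1 + \left(1 - \frac{z}{R_1}\right)\mathbf{e}_2\mathbf{e}_2. \]

Taking

\[ \mathbf{T} = [\![(\mathbf{A} - z\mathbf{B})^{-1}\cdot\boldsymbol{\sigma}]\!], \]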

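we get the following formulas for the components of T:

\[
\begin{aligned}
T_{11} &= \int_{-h/2}^{h/2}\left(1 - \frac{z}{R_2}\right)\sigma_{11}\,dz, \qquad
T_{22} = \int_{-h/2}^{h/2}\left(1 - \frac{z}{R_1}\right)\sigma_{22}\,dz,\\
T_{12} &= T_{21} = \int_{-h/2}^{h/2}\sigma_{12}\,dz.
\end{aligned}
\]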

The alternative definition T = [[A · σ]] yields the expressions

\[ T_{11} = \int_{-h/2}^{h/2}\sigma_{11}\,dz, \qquad T_{22} = \int_{-h/2}^{h/2}\sigma_{22}\,dz, \qquad T_{12} = T_{21} = \int_{-h/2}^{h/2}\sigma_{12}\,dz. \]

So the two definitions of T yield equilibrium equations that differ by quantities that are second-order small. The simplified form for T was used by a number of authors, e.g., [Goldenveizer (1976); Novozhilov, Chernykh, and Mikhailovskiy (1991); Timoshenko (1985)]. Koiter [Koiter (1970)] used (7.38) and (7.40) to relate the stress resultants and the bending couples with U, but he presented more complex expressions to calculate them through the stresses in a three-dimensional body.
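The gap between the two definitions of T11 can be illustrated numerically. The sketch below assumes a linear through-thickness stress profile σ11(z) = sigma0 + sigma1·z; this profile, the function name resultant_T11, and all numbers are illustrative assumptions and do not come from the text. For that profile the two definitions differ by −sigma1·h³/(12·R2), which the midpoint-rule integration reproduces.

```python
def resultant_T11(sigma0, sigma1, h, R2=None, n=2000):
    """Midpoint-rule value of T11 for the ASSUMED profile sigma11(z) = sigma0 + sigma1*z.

    R2=None uses the simplified weight 1 (T = [[A . sigma]]);
    otherwise the weight (1 - z/R2) of the full definition is used.
    """
    dz = h / n
    total = 0.0
    for k in range(n):
        z = -h / 2 + (k + 0.5) * dz
        weight = 1.0 if R2 is None else 1.0 - z / R2
        total += weight * (sigma0 + sigma1 * z) * dz
    return total


# Hypothetical illustrative numbers (not taken from the text).
h, R2 = 0.01, 1.0
sigma0, sigma1 = 100.0, 2.0e4

T11_full = resultant_T11(sigma0, sigma1, h, R2)
T11_simple = resultant_T11(sigma0, sigma1, h)
gap_closed_form = -sigma1 * h**3 / (12.0 * R2)

print(T11_full - T11_simple, gap_closed_form)
```

Only the bending part of the stress (sigma1) contributes to the gap, and it enters with the factor h³/R2.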

The components of Mαβ are also commonly calculated using (7.34), and are related to U through (7.40). As is the case for T, however, in the literature one can find other ways to introduce M. In [Goldenveizer (1976)], for instance, the expressions for Mαβ include terms that are quadratic in z.

Some works on shell theory feature a “mixed” formulation of the boundary value problems, introducing the stress resultants via the simplified formula T = [[A · σ]]. The constitutive equations take the form (7.38), but the equilibrium equations include additional terms arising from the complete definition T = [[(A − zB)^{-1} · σ]]. In this way, the couple stresses appear in all the equilibrium equations.

In [Koiter (1970); Novozhilov, Chernykh, and Mikhailovskiy (1991)] and certain other works, the above tensor ε is also used. The second deformation measure æ, which was derived by the simplifying replacement of ϑ by ∇w, also has a few different forms (see the discussion in [Koiter (1970)]).

In the linear Kirchhoff–Love shell theory, the reader can encounter various forms of the equilibrium equations in displacements, as well as differences in the stress resultants and bending couples. Clearly, in different versions of shell theory the forms of the static boundary conditions can differ, and this is also reflected in the literature.

It is worth noting that in different references, the same equilibrium equations in stresses can lead to different equilibrium equations in displacements. For a thin shell with a sufficiently smooth midsurface and smooth loads, this difference, as a rule, is of the second order of smallness in comparison with the main terms in the equations.

7.8 Plate Theory

Resultant forces and moments in a plate; equilibrium equations

A particular case of a shell is a thin plate. The base surface Σ of a plate is part of a plane, so B = 0. The relations for a plate follow from shell theory. The equilibrium equations are

\[ \nabla\cdot\mathbf{T} + \mathbf{q} = \mathbf{0}, \qquad \nabla\cdot\mathbf{M} + \mathbf{T}_{\times} + \mathbf{m} = \mathbf{0}, \tag{7.45} \]


where

T = [[A · σ]], M = −[[A · zσ × n]]. (7.46)

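In a Cartesian basis i1, i2, i3 = n, we have

\[ \boldsymbol{\sigma} = \sigma_{\alpha\beta}\mathbf{i}_\alpha\mathbf{i}_\beta + \sigma_{\alpha 3}(\mathbf{i}_\alpha\mathbf{n} + \mathbf{n}\mathbf{i}_\alpha) + \sigma_{33}\mathbf{n}\mathbf{n}, \qquad \mathbf{A} = \mathbf{i}_\alpha\mathbf{i}_\alpha, \qquad \nabla = \mathbf{i}_\alpha\frac{\partial}{\partial x_\alpha}, \qquad G = 1, \]

where x1, x2 are Cartesian coordinates in the plane. The stress resultants are

\[ \mathbf{T} = T_{\alpha\beta}\mathbf{i}_\alpha\mathbf{i}_\beta + T_{\alpha n}\mathbf{i}_\alpha\mathbf{n}, \tag{7.47} \]

\[ T_{\alpha\beta} = [\![\sigma_{\alpha\beta}]\!] \equiv \int_{-h/2}^{h/2}\sigma_{\alpha\beta}\,dz, \qquad T_{\alpha n} = [\![\sigma_{\alpha 3}]\!] \equiv \int_{-h/2}^{h/2}\sigma_{\alpha 3}\,dz. \]

Note that σ33 does not appear in the definition of the stress resultants or the transverse shear stress resultants. It follows that the matrix Tαβ is symmetric: Tαβ = Tβα. The quantity T× is calculated by the formula

\[ \mathbf{T}_{\times} = T_{\alpha\beta}\mathbf{i}_\alpha\times\mathbf{i}_\beta + T_{\alpha n}\mathbf{i}_\alpha\times\mathbf{n} = T_{\alpha n}\mathbf{i}_\alpha\times\mathbf{n} = T_{2n}\mathbf{i}_1 - T_{1n}\mathbf{i}_2. \]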

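Next, we introduce the moments matrix

\[ M_{\alpha\beta} = [\![z\sigma_{\alpha\beta}]\!] \equiv \int_{-h/2}^{h/2} z\sigma_{\alpha\beta}\,dz. \tag{7.48} \]

Here, Mαβ is a symmetric matrix as well. With regard for the equality

\[ \boldsymbol{\sigma}\times\mathbf{n} = \sigma_{\alpha\beta}\mathbf{i}_\alpha\mathbf{i}_\beta\times\mathbf{n} = -\sigma_{11}\mathbf{i}_1\mathbf{i}_2 + \sigma_{12}\mathbf{i}_1\mathbf{i}_1 - \sigma_{21}\mathbf{i}_2\mathbf{i}_2 + \sigma_{22}\mathbf{i}_2\mathbf{i}_1, \]

from (7.46) we obtain

\[ \mathbf{M} = [\![z\sigma_{11}]\!]\mathbf{i}_1\mathbf{i}_2 - [\![z\sigma_{12}]\!]\mathbf{i}_1\mathbf{i}_1 + [\![z\sigma_{21}]\!]\mathbf{i}_2\mathbf{i}_2 - [\![z\sigma_{22}]\!]\mathbf{i}_2\mathbf{i}_1. \]

By (7.48), M takes the form

\[ \mathbf{M} = M_{11}\mathbf{i}_1\mathbf{i}_2 - M_{12}\mathbf{i}_1\mathbf{i}_1 + M_{21}\mathbf{i}_2\mathbf{i}_2 - M_{22}\mathbf{i}_2\mathbf{i}_1 = -M_{\alpha\beta}\mathbf{i}_\alpha\mathbf{i}_\beta\times\mathbf{n}. \]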

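So the equilibrium equations in Cartesian coordinates are

\[ \frac{\partial T_{\alpha\beta}}{\partial x_\alpha}\mathbf{i}_\beta + \frac{\partial T_{\alpha n}}{\partial x_\alpha}\mathbf{n} + \mathbf{q} = \mathbf{0}, \qquad -\frac{\partial M_{\alpha\beta}}{\partial x_\alpha}\mathbf{i}_\beta\times\mathbf{n} + \mathbf{T}_{\times} + \mathbf{m} = \mathbf{0}. \tag{7.49} \]

In component form, these are

\[
\begin{aligned}
&\frac{\partial T_{11}}{\partial x_1} + \frac{\partial T_{21}}{\partial x_2} + q_1 = 0,\\
&\frac{\partial T_{12}}{\partial x_1} + \frac{\partial T_{22}}{\partial x_2} + q_2 = 0,\\
&\frac{\partial T_{1n}}{\partial x_1} + \frac{\partial T_{2n}}{\partial x_2} + q_n = 0,
\end{aligned}
\tag{7.50}
\]

and

\[
\begin{aligned}
&\frac{\partial M_{11}}{\partial x_1} + \frac{\partial M_{21}}{\partial x_2} - T_{1n} + m_1 = 0,\\
&\frac{\partial M_{12}}{\partial x_1} + \frac{\partial M_{22}}{\partial x_2} - T_{2n} + m_2 = 0,
\end{aligned}
\tag{7.51}
\]

where

\[ q_\alpha = \mathbf{i}_\alpha\cdot\mathbf{q}, \qquad q_n = \mathbf{n}\cdot\mathbf{q}, \qquad m_\alpha = \mathbf{i}_\alpha\cdot(\mathbf{n}\times\mathbf{m}). \]

Note that T12 = T21 and M12 = M21; we keep the different notation for symmetry.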

In applications of shell theory, m is often negligible in comparison with the external forces. Hence, for simplicity we put m = 0. Now we exclude T1n and T2n from equations (7.51). For this, we differentiate the second equation of (7.51) with respect to x2, the first from (7.51) with respect to x1, and add the results. Using the third equation from (7.50), we obtain the well-known equation of plate theory

\[ \frac{\partial^2 M_{11}}{\partial x_1^2} + 2\frac{\partial^2 M_{12}}{\partial x_1\partial x_2} + \frac{\partial^2 M_{22}}{\partial x_2^2} + q_n = 0. \tag{7.52} \]
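The elimination of T1n and T2n described above can be written out in full. The following is a sketch of the intermediate steps under the stated assumption m = 0, using M12 = M21:

```latex
% Differentiate the first equation of (7.51) in x_1 and the second in x_2 (with m = 0), then add:
\frac{\partial^2 M_{11}}{\partial x_1^2}
+ \frac{\partial^2 M_{21}}{\partial x_1 \partial x_2}
+ \frac{\partial^2 M_{12}}{\partial x_1 \partial x_2}
+ \frac{\partial^2 M_{22}}{\partial x_2^2}
- \left( \frac{\partial T_{1n}}{\partial x_1} + \frac{\partial T_{2n}}{\partial x_2} \right) = 0 .
% The third equation of (7.50) gives
\frac{\partial T_{1n}}{\partial x_1} + \frac{\partial T_{2n}}{\partial x_2} = -q_n ,
% so that, with M_{12} = M_{21},
\frac{\partial^2 M_{11}}{\partial x_1^2}
+ 2\,\frac{\partial^2 M_{12}}{\partial x_1 \partial x_2}
+ \frac{\partial^2 M_{22}}{\partial x_2^2} + q_n = 0 .
```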

In index notation this reads

\[ \frac{\partial^2 M_{\alpha\beta}}{\partial x_\alpha\partial x_\beta} + q_n = 0, \]

whereas in component-free notation it takes the form

∇ · (∇ · (M × n)) + q · n = 0. (7.53)

Indeed,

M = −Mαβiαiβ × n.

Using the identity

(iβ × n) × n = −iβ,


the proof of which is left to the reader, we get

M × n = −Mαβiα(iβ × n) × n = Mαβiαiβ.
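The identity (iβ × n) × n = −iβ and the resulting formula M × n = Mαβ iα iβ can be checked mechanically. The sketch below stores a dyadic as a 3×3 array of coefficients and assumes the convention (ab) × c = a(b × c), which is the one used in the formulas above; the helper names cross and cross_right and the sample components Mab are hypothetical.

```python
def cross(a, b):
    """Ordinary vector cross product a x b for 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cross_right(T, c):
    """Tensor-vector product (a b) x c = a (b x c), applied row by row."""
    return [cross(row, c) for row in T]


i1, i2, n = (1, 0, 0), (0, 1, 0), (0, 0, 1)
basis = (i1, i2)

# Identity (i_b x n) x n = -i_b for b = 1, 2.
identity_holds = all(
    cross(cross(ib, n), n) == tuple(-c for c in ib) for ib in basis
)

# Hypothetical symmetric moment components M_ab (integers, for exact comparison).
Mab = [[2, 3], [3, 5]]

# Build M = -M_ab i_a (i_b x n) as a 3x3 array of dyad coefficients.
M = [[0, 0, 0] for _ in range(3)]
for a in range(2):
    for b in range(2):
        v = cross(basis[b], n)
        for c in range(3):
            M[a][c] -= Mab[a][b] * v[c]

M_cross_n = cross_right(M, n)
print(identity_holds, M_cross_n)
```

The array M_cross_n carries Mab in its upper-left 2×2 block and zeros elsewhere, in agreement with M × n = Mαβ iα iβ.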

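Next we obtain

\[ \nabla\cdot(\mathbf{M}\times\mathbf{n}) = \frac{\partial M_{\alpha\beta}}{\partial x_\alpha}\mathbf{i}_\beta, \qquad \nabla\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n})) = \frac{\partial^2 M_{\alpha\beta}}{\partial x_\alpha\partial x_\beta}, \]

from which (7.53) follows.

In plate theory, Kirchhoff’s kinematic hypotheses reduce to an assumption on the form of the displacement field, which is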

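\[ \mathbf{u}(x_1, x_2, z) = \mathbf{v}(x_1, x_2) - z\nabla w(x_1, x_2) = \tilde{\mathbf{v}} + w\mathbf{n} - z\nabla w, \qquad \tilde{\mathbf{v}}\cdot\mathbf{n} = 0, \tag{7.54} \]

where ṽ denotes the tangential part of the midplane displacement v.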

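Let us find the expression for the strain tensor that corresponds to u. We have

\[ \nabla\mathbf{u} = \left(\nabla + \mathbf{n}\frac{\partial}{\partial z}\right)(\tilde{\mathbf{v}} + w\mathbf{n} - z\nabla w) = \nabla\tilde{\mathbf{v}} - z\nabla\nabla w + (\nabla w)\mathbf{n} - \mathbf{n}\nabla w, \]

and

\[ (\nabla\mathbf{u})^T = (\nabla\tilde{\mathbf{v}})^T - z\nabla\nabla w + \mathbf{n}\nabla w - (\nabla w)\mathbf{n}. \]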

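Thus,

\[ 2\boldsymbol{\varepsilon} = \nabla\tilde{\mathbf{v}} + (\nabla\tilde{\mathbf{v}})^T - 2z\nabla\nabla w, \]

where ṽ is the tangential displacement. It follows that ε · n = n · ε = 0. Using the notation

\[ 2\boldsymbol{\epsilon} = \nabla\tilde{\mathbf{v}} + (\nabla\tilde{\mathbf{v}})^T, \qquad \textbf{\ae} = -\nabla\nabla w, \]

in which ϵ is the midplane strain, as distinct from the three-dimensional strain ε, we find that the plate strains split into two parts:

\[ \boldsymbol{\varepsilon} = \boldsymbol{\epsilon} + z\,\textbf{\ae}. \tag{7.55} \]

Here ϵ represents the deformation in the plate plane, and æ is related to plate bending. Geometrically, æ describes an infinitesimal change of the plate curvatures due to bending.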


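Now let us use Kirchhoff’s static hypothesis σ33 = 0. From Hooke’s law it follows that

\[
\begin{aligned}
\sigma_{11} &= \frac{E}{1-\nu^2}(\varepsilon_{11} + \nu\varepsilon_{22}),\\
\sigma_{22} &= \frac{E}{1-\nu^2}(\varepsilon_{22} + \nu\varepsilon_{11}),\\
\sigma_{12} &= \frac{E}{1+\nu}\varepsilon_{12}.
\end{aligned}
\tag{7.56}
\]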
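The step from Hooke’s law to (7.56) can be sketched as follows. The derivation assumes the standard isotropic law written in compliance form, which is not restated in this section:

```latex
% Isotropic Hooke's law in compliance form with sigma_33 = 0:
E\,\varepsilon_{11} = \sigma_{11} - \nu\,\sigma_{22}, \qquad
E\,\varepsilon_{22} = \sigma_{22} - \nu\,\sigma_{11}, \qquad
E\,\varepsilon_{12} = (1+\nu)\,\sigma_{12}.
% Combine the first two relations:
E\,(\varepsilon_{11} + \nu\,\varepsilon_{22}) = (1-\nu^2)\,\sigma_{11}, \qquad
E\,(\varepsilon_{22} + \nu\,\varepsilon_{11}) = (1-\nu^2)\,\sigma_{22},
% which gives (7.56):
\sigma_{11} = \frac{E}{1-\nu^2}\,(\varepsilon_{11} + \nu\,\varepsilon_{22}), \qquad
\sigma_{22} = \frac{E}{1-\nu^2}\,(\varepsilon_{22} + \nu\,\varepsilon_{11}), \qquad
\sigma_{12} = \frac{E}{1+\nu}\,\varepsilon_{12}.
```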

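Substituting (7.55) into these, we get

\[
\begin{aligned}
\sigma_{11} &= \frac{E}{1-\nu^2}(\epsilon_{11} + \nu\epsilon_{22}) + \frac{Ez}{1-\nu^2}(\text{\ae}_{11} + \nu\text{\ae}_{22}),\\
\sigma_{22} &= \frac{E}{1-\nu^2}(\epsilon_{22} + \nu\epsilon_{11}) + \frac{Ez}{1-\nu^2}(\text{\ae}_{22} + \nu\text{\ae}_{11}),\\
\sigma_{12} &= \frac{E}{1+\nu}\epsilon_{12} + \frac{Ez}{1+\nu}\text{\ae}_{12}.
\end{aligned}
\]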

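Now by (7.47) and (7.48), integrating the expressions for σαβ with respect to z over thickness, we derive the expressions for the stress resultants and stress couples:

\[
\begin{aligned}
T_{11} &= \frac{Eh}{1-\nu^2}(\epsilon_{11} + \nu\epsilon_{22}), &
T_{22} &= \frac{Eh}{1-\nu^2}(\epsilon_{22} + \nu\epsilon_{11}), &
T_{12} &= \frac{Eh}{1+\nu}\epsilon_{12},\\
M_{11} &= \frac{Eh^3}{12(1-\nu^2)}(\text{\ae}_{11} + \nu\text{\ae}_{22}), &
M_{22} &= \frac{Eh^3}{12(1-\nu^2)}(\text{\ae}_{22} + \nu\text{\ae}_{11}), &
M_{12} &= \frac{Eh^3}{12(1+\nu)}\text{\ae}_{12}.
\end{aligned}
\tag{7.57}
\]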
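The thickness integration leading to (7.57) can be checked numerically. The sketch below integrates the plane-stress expressions for σαβ by the midpoint rule and compares with the closed forms; the function names and the material and strain values are hypothetical illustrations, with strains stored as triples (11, 22, 12).

```python
def stress(eps, kap, z, E, nu):
    """Plane-stress components (s11, s22, s12) at height z for strains eps + z*kap."""
    e11, e22, e12 = (eps[i] + z * kap[i] for i in range(3))
    c = E / (1.0 - nu**2)
    return (c * (e11 + nu * e22), c * (e22 + nu * e11), E / (1.0 + nu) * e12)


def integrate_through_thickness(eps, kap, h, E, nu, n=2000):
    """Midpoint-rule values of T_ab = [[s_ab]] and M_ab = [[z s_ab]]."""
    dz = h / n
    T, M = [0.0] * 3, [0.0] * 3
    for k in range(n):
        z = -h / 2 + (k + 0.5) * dz
        s = stress(eps, kap, z, E, nu)
        for i in range(3):
            T[i] += s[i] * dz
            M[i] += z * s[i] * dz
    return T, M


def closed_form(eps, kap, h, E, nu):
    """Right-hand sides of (7.57)."""
    ct = E * h / (1.0 - nu**2)
    cm = E * h**3 / (12.0 * (1.0 - nu**2))
    T = [ct * (eps[0] + nu * eps[1]), ct * (eps[1] + nu * eps[0]),
         E * h / (1.0 + nu) * eps[2]]
    M = [cm * (kap[0] + nu * kap[1]), cm * (kap[1] + nu * kap[0]),
         E * h**3 / (12.0 * (1.0 + nu)) * kap[2]]
    return T, M


# Hypothetical data: steel-like constants and small strains.
E, nu, h = 210e9, 0.3, 0.02
eps = (1e-4, -2e-5, 3e-5)
kap = (0.01, 0.005, -0.002)

T_num, M_num = integrate_through_thickness(eps, kap, h, E, nu)
T_ref, M_ref = closed_form(eps, kap, h, E, nu)
print(T_num, T_ref)
print(M_num, M_ref)
```

The terms linear in z drop out of T and the constant terms drop out of M because the integration interval is symmetric about the midplane.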

The constant

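\[ D = \frac{Eh^3}{12(1-\nu^2)} \]

is the bending stiffness. Substituting the expressions for Mαβ into (7.52) and taking into account that

\[ \text{\ae}_{11} = -\frac{\partial^2 w}{\partial x_1^2}, \qquad \text{\ae}_{22} = -\frac{\partial^2 w}{\partial x_2^2}, \qquad \text{\ae}_{12} = -\frac{\partial^2 w}{\partial x_1\partial x_2}, \]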


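we obtain

\[ D\left(\frac{\partial^4 w}{\partial x_1^4} + 2\frac{\partial^4 w}{\partial x_1^2\partial x_2^2} + \frac{\partial^4 w}{\partial x_2^4}\right) = q_n. \tag{7.58} \]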

In component-free form this is

D∇4w = qn. (7.59)

Because

\[ \nabla^4 = (\nabla^2)^2 = \Delta^2, \]

we call (7.58) (or (7.59)) the biharmonic equation.

This equation has a long history in plate theory. The bending equation was published by Marie-Sophie Germain, who received a prize from the Paris Academy of Sciences in 1816. Later some of Germain’s errors were corrected by Lagrange, hence the equation is now called Germain’s equation or the Germain–Lagrange equation. The plate theory we have considered is attributed to Kirchhoff, who formulated correct boundary conditions for a plate. Moreover, Kirchhoff clearly introduced physically meaningful hypotheses regarding the distribution of displacements and stresses in the plate.
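As a numerical illustration of (7.59), consider a rectangular plate under a sinusoidal load qn = q0 sin(πx1/a) sin(πx2/b). The classical Navier-type deflection for this load is w = q0/(Dλ²) sin(πx1/a) sin(πx2/b) with λ = π²(1/a² + 1/b²). The sketch below checks that this w satisfies D∇⁴w = qn by applying the standard 13-point finite-difference biharmonic stencil. The load shape, the plate data, and the function names are assumptions made for the example.

```python
import math


def navier_deflection(x, y, a, b, q0, D):
    """Deflection for the ASSUMED load q0*sin(pi x/a)*sin(pi y/b)."""
    lam = math.pi**2 * (1.0 / a**2 + 1.0 / b**2)
    return q0 / (D * lam**2) * math.sin(math.pi * x / a) * math.sin(math.pi * y / b)


def biharmonic_fd(f, x, y, d):
    """13-point central-difference approximation of the biharmonic operator."""
    near = f(x + d, y) + f(x - d, y) + f(x, y + d) + f(x, y - d)
    diag = (f(x + d, y + d) + f(x - d, y + d)
            + f(x + d, y - d) + f(x - d, y - d))
    far = f(x + 2 * d, y) + f(x - 2 * d, y) + f(x, y + 2 * d) + f(x, y - 2 * d)
    return (20.0 * f(x, y) - 8.0 * near + 2.0 * diag + far) / d**4


# Hypothetical plate data.
E, nu, h = 70e9, 0.3, 0.005
a, b, q0 = 1.0, 0.8, 1000.0
D = E * h**3 / (12.0 * (1.0 - nu**2))


def w(x, y):
    return navier_deflection(x, y, a, b, q0, D)


x0, y0 = 0.3 * a, 0.4 * b
lhs = D * biharmonic_fd(w, x0, y0, 0.01)
rhs = q0 * math.sin(math.pi * x0 / a) * math.sin(math.pi * y0 / b)
print(lhs, rhs)
```

The two printed values agree up to the truncation error of the stencil, which is of second order in the step.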

Note that the Tαβ do not depend on w. Thus, the plate equilibrium equations split into two groups. The first consists of the two first equations of (7.50) with respect to v, whereas the second consists of the equation (7.59) for variable w.

Let us write the equations for T · A in terms of the tangential displacement vector v. In tensor form, equations (7.50) are

∇ · (T · A) + q ·A = 0. (7.60)

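The constitutive equation for T · A is

\[ \mathbf{T}\cdot\mathbf{A} = \frac{Eh}{1+\nu}\left[\boldsymbol{\epsilon} + \frac{\nu}{1-\nu}\mathbf{A}\operatorname{tr}\boldsymbol{\epsilon}\right]. \]

So we obtain

\[
\begin{aligned}
\nabla\cdot(\mathbf{T}\cdot\mathbf{A}) &= \frac{Eh}{1+\nu}\left[\nabla\cdot\boldsymbol{\epsilon} + \frac{\nu}{1-\nu}\nabla\operatorname{tr}\boldsymbol{\epsilon}\right]\\
&= \frac{Eh}{2(1+\nu)}\left[\nabla\cdot\nabla\mathbf{v} + \nabla\nabla\cdot\mathbf{v} + \frac{2\nu}{1-\nu}\nabla\nabla\cdot\mathbf{v}\right]\\
&= \frac{Eh}{2(1+\nu)}\left[\nabla\cdot\nabla\mathbf{v} + \frac{1+\nu}{1-\nu}\nabla\nabla\cdot\mathbf{v}\right].
\end{aligned}
\]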


In terms of v, equation (7.60) takes the form

\[ \frac{Eh}{2(1+\nu)}\left[\nabla\cdot\nabla\mathbf{v} + \frac{1+\nu}{1-\nu}\nabla\nabla\cdot\mathbf{v}\right] + \mathbf{q}\cdot\mathbf{A} = \mathbf{0}. \tag{7.61} \]

There may be some concern over equations (7.51), which contain the transverse shear stress resultants T1n and T2n. It may seem that by the kinematic hypothesis n · ε = 0 and the last equation from (7.47), both T1n and T2n should vanish. But Kirchhoff’s kinematic hypothesis restricts only the form of deformation. So in Kirchhoff’s plate theory, T1n and T2n are not defined by the strains; they are the reactions due to the strain constraints. Note that the equations (7.51) are used to determine T1n and T2n.

Boundary conditions

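In plate theory as a particular case of shell theory, on the boundary contour ω we can assign the kinematic conditions

\[ \mathbf{v} = \mathbf{v}^0(s), \qquad w = w^0(s), \qquad \frac{\partial w}{\partial\nu} = \vartheta^0(s), \tag{7.62} \]

or the static conditions

\[
\begin{aligned}
&\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A} = \boldsymbol{\varphi}^0(s)\cdot\mathbf{A}, \qquad
\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{n} - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr) = \varphi^0_n(s),\\
&\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu} = \ell^0(s).
\end{aligned}
\tag{7.63}
\]

For a mixed boundary value problem, these sets of conditions can be given on different portions of ω. Alternatively, at a point we can include conditions from (7.62) and the dual conditions from (7.63). At each point, four scalar conditions must be prescribed.

We split the boundary load into tangential and normal parts:

\[ \boldsymbol{\varphi}^0 = \boldsymbol{\varphi}^0\cdot\mathbf{A} + \varphi^0_n\mathbf{n}, \qquad \varphi^0_n = \boldsymbol{\varphi}^0\cdot\mathbf{n}. \]

From equilibrium equations (7.51), we find the transverse shear stress resultants in terms of the stress couples. As in our earlier treatment of shell theory, we set m1 = m2 = 0. We have

\[ T_{1n} = \frac{\partial M_{11}}{\partial x_1} + \frac{\partial M_{21}}{\partial x_2}, \qquad T_{2n} = \frac{\partial M_{12}}{\partial x_1} + \frac{\partial M_{22}}{\partial x_2}. \]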

These can be written as the single vector equation

T · n = ∇ · (M × n).

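There are various types of mixed boundary conditions. We restrict ourselves to the two defined by (7.62) and (7.63). Note that T · A depends only on v, whereas M is defined through w. Thus, like the equilibrium equations, the boundary conditions in plate theory split into two sets. One is for normal displacement w:

\[
\begin{aligned}
&w\big|_{\omega_1} = w^0(s), \qquad \frac{\partial w}{\partial\nu}\Big|_{\omega_1} = \vartheta^0(s),\\
&\boldsymbol{\nu}\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n}))\big|_{\omega_2} - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr)\Big|_{\omega_2} = \varphi^0_n(s),\\
&\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_{\omega_2} = \ell^0(s).
\end{aligned}
\tag{7.64}
\]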

The other is for tangential displacements v:

\[ \mathbf{v}\big|_{\omega_3} = \mathbf{v}^0(s), \qquad \boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_{\omega_4} = \boldsymbol{\varphi}^0(s)\cdot\mathbf{A}. \tag{7.65} \]

We have supposed the plate contour to be partitioned in such a way that ω2 = ω \ ω1 and ω4 = ω \ ω3.

In Appendix A we list some common homogeneous boundary conditions in terms of w for the plate.

Strain energy for the plate

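Let us consider the strain energy density for a plate in more detail. It is

\[ U = \frac{Eh}{2(1+\nu)}\left[\boldsymbol{\epsilon}\cdot\cdot\,\boldsymbol{\epsilon} + \frac{\nu}{1-\nu}\operatorname{tr}^2\boldsymbol{\epsilon}\right] + \frac{Eh^3}{24(1+\nu)}\left[\textbf{\ae}\cdot\cdot\,\textbf{\ae} + \frac{\nu}{1-\nu}\operatorname{tr}^2\textbf{\ae}\right]. \tag{7.66} \]

A plate is a particular case of a shell considered above, but with B = 0. Hence we can use the expression for shell strain energy density

\[ U = \frac{1}{2}T^{\alpha\beta}\epsilon_{\alpha\beta} + \frac{1}{2}M^{\alpha\beta}\text{\ae}_{\alpha\beta} = \frac{1}{2}\mathbf{T}\cdot\cdot\,\boldsymbol{\epsilon} + \frac{1}{2}\mathbf{M}\cdot\cdot\,(\mathbf{n}\times\textbf{\ae}). \]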

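Let us rewrite U in terms of the displacements:

\[
\begin{aligned}
U ={}& \frac{Eh}{2(1+\nu)}\left[\frac{\partial u_\alpha}{\partial x_\beta}\frac{\partial u_\alpha}{\partial x_\beta} + \frac{\nu}{1-\nu}\left(\frac{\partial u_\alpha}{\partial x_\alpha}\right)^2\right]\\
&+ \frac{Eh^3}{24(1+\nu)}\left[\frac{\partial^2 w}{\partial x_\alpha\partial x_\beta}\frac{\partial^2 w}{\partial x_\alpha\partial x_\beta} + \frac{\nu}{1-\nu}\left(\frac{\partial^2 w}{\partial x_\alpha\partial x_\alpha}\right)^2\right].
\end{aligned}
\]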

Rigid motions

Now we consider the consequences of the equation U = 0. Because U is positive definite, it is zero if and only if both ε and æ are zero. But ε = 0 implies that the plate can move in its plane as a rigid body only. In other words, it can translate and rotate about a normal to its midplane. The corresponding displacement field takes the form

\[ \mathbf{v} = \mathbf{v}_0 + \omega_0\,\mathbf{n}\times(\boldsymbol{\rho} - \boldsymbol{\rho}_0), \]

where v0 is an arbitrary fixed vector in the midplane, i.e., v0 · n = 0, the angle ω0 is an arbitrary but fixed rotation angle about n, the vector ρ is the position vector of a point on the midplane, and ρ0 is an arbitrary but fixed vector in the same plane. This formula is a particular case of (6.41) for the plane deformation.

The equation

æ ≡ −∇∇w = 0

implies that w is a linear function in x1 and x2:

w = w0 + w1x1 + w2x2,

where w0, w1, and w2 are arbitrary fixed scalars. Physically, w corresponds to a translation of the whole plate in the normal direction and a rotation with respect to the axes i1, i2. Thus, as for three-dimensional elasticity, U = 0 only for those displacements that represent small displacements of the plate as a rigid body.
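The two rigid-motion fields above can be checked numerically: the in-plane strain of v = v0 + ω0 n × (ρ − ρ0) and the curvature change æ = −∇∇w of a linear w should both vanish. The sketch below uses central differences; the function names and the numerical values are hypothetical.

```python
def in_plane_strain_fd(v, x, y, d=1e-3):
    """Central-difference midplane strain (e11, e22, e12) of a 2D field v(x, y)."""
    dv_dx = [(p - m) / (2 * d) for p, m in zip(v(x + d, y), v(x - d, y))]
    dv_dy = [(p - m) / (2 * d) for p, m in zip(v(x, y + d), v(x, y - d))]
    return (dv_dx[0], dv_dy[1], 0.5 * (dv_dx[1] + dv_dy[0]))


def curvature_change_fd(w, x, y, d=1e-3):
    """Central-difference components of -grad grad w as (k11, k22, k12)."""
    w_xx = (w(x + d, y) - 2 * w(x, y) + w(x - d, y)) / d**2
    w_yy = (w(x, y + d) - 2 * w(x, y) + w(x, y - d)) / d**2
    w_xy = (w(x + d, y + d) - w(x + d, y - d)
            - w(x - d, y + d) + w(x - d, y - d)) / (4 * d**2)
    return (-w_xx, -w_yy, -w_xy)


# Hypothetical rigid-motion parameters.
v0, omega0, rho0 = (0.2, -0.1), 0.05, (1.0, 2.0)
w0, w1, w2 = 0.3, -0.02, 0.01


def rigid_v(x, y):
    # n x (rho - rho0) = (-(y - y0), x - x0) for n along the third axis
    return (v0[0] - omega0 * (y - rho0[1]), v0[1] + omega0 * (x - rho0[0]))


def rigid_w(x, y):
    return w0 + w1 * x + w2 * y


strain = in_plane_strain_fd(rigid_v, 0.7, 1.3)
curvature = curvature_change_fd(rigid_w, 0.7, 1.3)
print(strain, curvature)
```

Both triples vanish up to rounding error, so the strain energy density is zero for these fields.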

Lagrange variational principle in plate theory

In shell theory, the Lagrange variational principle of elasticity changes somewhat because of Kirchhoff’s hypotheses. We will formulate Lagrange’s variational principle for the plate. Recall that in three-dimensional elasticity, the total energy functional takes the form

\[ E(\mathbf{u}) = \int_V W(\boldsymbol{\varepsilon})\,dV - \int_V \rho\,\mathbf{f}\cdot\mathbf{u}\,dV - \int_S \mathbf{t}^0\cdot\mathbf{u}\,dS. \]

We found that for a thin shell, the integral over V is represented through the integral over the midplane Σ. The same is true for the plate:

\[ \int_V W(\boldsymbol{\varepsilon})\,dV = \int_\Sigma U\,d\Sigma, \]

where U is defined by (7.66). Let us consider the remaining terms in E, which represent the work of external forces. In terms of the plate theory, they take the form

\[ \int_\Sigma \mathbf{q}\cdot\mathbf{v}\,d\Sigma + \int_{\omega_4}\boldsymbol{\varphi}^0(s)\cdot\mathbf{v}\,dS + \int_{\omega_2}\left(\varphi^0_n w - \ell^0\frac{\partial w}{\partial\nu}\right)dS. \]


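So the total energy functional for the plate splits into two parts:

\[ E = E_v(\mathbf{v}) + E_w(w), \]

where

\[ E_v(\mathbf{v}) = \frac{Eh}{2(1+\nu)}\int_\Sigma\left[\boldsymbol{\epsilon}\cdot\cdot\,\boldsymbol{\epsilon} + \frac{\nu}{1-\nu}\operatorname{tr}^2\boldsymbol{\epsilon}\right]d\Sigma - \int_\Sigma\mathbf{q}\cdot\mathbf{v}\,d\Sigma - \int_{\omega_4}\boldsymbol{\varphi}^0(s)\cdot\mathbf{v}\,dS \]

is due to tangential displacements, and

\[ E_w(w) = \frac{Eh^3}{24(1+\nu)}\int_\Sigma\left[\textbf{\ae}\cdot\cdot\,\textbf{\ae} + \frac{\nu}{1-\nu}\operatorname{tr}^2\textbf{\ae}\right]d\Sigma - \int_\Sigma w\,\mathbf{q}\cdot\mathbf{n}\,d\Sigma - \int_{\omega_2}\varphi^0_n w\,dS + \int_{\omega_2}\ell^0\frac{\partial w}{\partial\nu}\,dS \]

accounts for bending.

The arguments v and w of Ev and Ew are independent, so the problem of minimizing the total energy naturally splits into two minimization problems. Moreover, we can formulate two variational principles: one for tangential deformations, and one for plate bending.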

Lagrange’s variational principle for tangential deformation

Theorem 7.1. On the set of admissible tangential displacement fields, a stationary point v of Ev satisfies the plate equilibrium equations (7.60) on Σ, or, equivalently, (7.61), and the boundary condition

\[ \boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_{\omega_4} = \boldsymbol{\varphi}^0(s)\cdot\mathbf{A}, \]

and conversely. The stationary point, if it exists, minimizes Ev.

The set of admissible tangential displacements v consists of vector functions twice differentiable in Σ and satisfying the kinematic boundary condition on ω3 from (7.65). The proof mimics that of Lagrange’s principle in three-dimensional elasticity (Exercise 7.18).

Note that we represented the work of external forces using the general definition of work and common sense. The fact that the equilibrium equations and the kinematic boundary conditions really define a stationary point of the energy functional shows that the work functional was written correctly.


Lagrange’s variational principle for plate bending

Theorem 7.2. On the set of admissible deflection fields, a stationary point w of Ew satisfies the plate equilibrium equations (7.53) or (7.59) in Σ and the static boundary conditions from (7.64) on ω2, and conversely. The stationary point, if it exists, minimizes Ew.

Note that the set of admissible deflections w consists of the functions that possess continuous derivatives in Σ up to order four and satisfy two kinematic boundary conditions on ω1 from (7.64).

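Proof. Let us define the variation of Ew. We have

\[
\begin{aligned}
\delta\left\{\frac{Eh^3}{24(1+\nu)}\left[\textbf{\ae}\cdot\cdot\,\textbf{\ae} + \frac{\nu}{1-\nu}\operatorname{tr}^2\textbf{\ae}\right]\right\}
&= \frac{Eh^3}{12(1+\nu)}\left[\textbf{\ae} + \frac{\nu}{1-\nu}\mathbf{A}\operatorname{tr}\textbf{\ae}\right]\cdot\cdot\,\delta\textbf{\ae}\\
&= M_{\alpha\beta}\,\delta\text{\ae}_{\alpha\beta}
= -M_{\alpha\beta}\frac{\partial^2\delta w}{\partial x_\alpha\partial x_\beta}.
\end{aligned}
\]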

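Using this and the Gauss–Ostrogradsky theorem, we get

\[
\begin{aligned}
-\int_\Sigma M_{\alpha\beta}\frac{\partial^2\delta w}{\partial x_\alpha\partial x_\beta}\,d\Sigma
&= \int_\Sigma\frac{\partial M_{\alpha\beta}}{\partial x_\alpha}\frac{\partial\delta w}{\partial x_\beta}\,d\Sigma - \int_\omega\nu_\alpha M_{\alpha\beta}\frac{\partial\delta w}{\partial x_\beta}\,ds\\
&= -\int_\Sigma\frac{\partial^2 M_{\alpha\beta}}{\partial x_\alpha\partial x_\beta}\,\delta w\,d\Sigma + \int_\omega\nu_\beta\frac{\partial M_{\alpha\beta}}{\partial x_\alpha}\,\delta w\,ds - \int_\omega\nu_\alpha M_{\alpha\beta}\frac{\partial\delta w}{\partial x_\beta}\,ds\\
&= -\int_\Sigma\nabla\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n}))\,\delta w\,d\Sigma + \int_\omega\boldsymbol{\nu}\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n}))\,\delta w\,ds\\
&\quad - \int_\omega\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\nabla\delta w\,ds,
\end{aligned}
\tag{7.67}
\]

where να = ν · iα and s is the length parameter over ω.

Let us consider the last integral in (7.67). On ω the expression ∇δw can be represented as

\[ \nabla\delta w = \boldsymbol{\nu}\frac{\partial\delta w}{\partial\nu} + \boldsymbol{\tau}\frac{\partial\delta w}{\partial\tau}, \]

where τ = n × ν is the vector tangent to ω. Using the contour length parameter s over ω, we get

\[ \frac{\partial\delta w}{\partial\tau} = \frac{\partial\delta w}{\partial s}. \]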


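Note that on ω the expressions ∂δw/∂ν and δw are independent. The expressions ∂δw/∂s and δw are dependent because the value ∂δw/∂s is uniquely defined when δw is known on ω, whereas ∂δw/∂ν is not defined by the values of δw on ω. To exclude the derivative ∂δw/∂s, we integrate by parts in the integral over the closed contour ω:

\[ \int_\omega f\frac{\partial g}{\partial s}\,ds = -\int_\omega g\frac{\partial f}{\partial s}\,ds, \]

where f and g are arbitrary functions of s. Applying this to the last integral in (7.67), we get

\[ -\int_\omega\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\nabla\delta w\,ds = -\int_\omega\left[\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\,\frac{\partial\delta w}{\partial\nu} - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr)\,\delta w\right]ds. \]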
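The integration-by-parts formula on a closed contour used above has no boundary term because the contour closes on itself. A minimal numerical sketch follows; the circular contour and the particular smooth periodic functions f and g are assumptions chosen for illustration, not taken from the text.

```python
import math


def contour_integral(integrand, length, n=400):
    """Rectangle rule on a closed contour; very accurate for smooth periodic integrands."""
    ds = length / n
    return sum(integrand(k * ds) for k in range(n)) * ds


R = 1.5                      # hypothetical circular contour of radius R
L = 2.0 * math.pi * R        # contour length


def f(s):
    return 2.0 + math.cos(s / R)


def df(s):
    return -math.sin(s / R) / R


def g(s):
    return math.sin(s / R) + 0.5 * math.cos(2.0 * s / R)


def dg(s):
    return (math.cos(s / R) - math.sin(2.0 * s / R)) / R


lhs = contour_integral(lambda s: f(s) * dg(s), L)    # integral of f dg/ds
rhs = -contour_integral(lambda s: g(s) * df(s), L)   # minus integral of g df/ds
print(lhs, rhs)
```

For these particular functions both sides equal π, which can also be confirmed by direct integration.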

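On ω1 we have

\[ \delta w = 0 = \frac{\partial\delta w}{\partial\nu}, \]

hence

\[
\begin{aligned}
\delta E_w ={}& -\int_\Sigma\bigl[\nabla\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n})) + \mathbf{q}\cdot\mathbf{n}\bigr]\,\delta w\,d\Sigma\\
&+ \int_{\omega_2}\left[\boldsymbol{\nu}\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n})) - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr) - \varphi^0_n\right]\delta w\,ds\\
&+ \int_{\omega_2}\bigl[\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu} - \ell^0\bigr]\frac{\partial\delta w}{\partial\nu}\,ds.
\end{aligned}
\tag{7.68}
\]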

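Using a standard procedure in the calculus of variations (see, e.g., [Lebedev and Cloud (2003)]), we can show that the equation δEw = 0, which holds for all admissible δw, implies both the equilibrium equation for the plate

\[ \nabla\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n})) + \mathbf{q}\cdot\mathbf{n} = 0 \quad\text{in } \Sigma \]

and the static boundary conditions

\[ \boldsymbol{\nu}\cdot(\nabla\cdot(\mathbf{M}\times\mathbf{n})) - \frac{\partial}{\partial s}\bigl(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\bigr) - \varphi^0_n = 0 \]

and

\[ \boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu} - \ell^0 = 0 \quad\text{on } \omega_2. \]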


This derivation also indicates that we have introduced the work for the transverse load correctly: a stationary point of Ew satisfies (7.53) on Σ and the boundary conditions (7.64) on ω2.

Now we can prove the converse statement: a solution w of the boundary value problem (7.53), (7.64) is a stationary point of Ew. Multiplying (7.53) by an admissible δw, integrating appropriately by parts and therefore doing the above transformations in the reverse order, we arrive at the equation δEw = 0. This completes the proof.

In a manner similar to the proof of minimality in the Lagrange principle in elasticity, we can show minimality of a stationary point in the bending problem as well. The reader is encouraged to produce a complete proof. Note that in the proof for three-dimensional elasticity, the only thing that mattered was the structure of the total energy functional: it is a sum of quadratic and linear functionals, and its quadratic portion is positive definite.

Note that by considering a plate as a three-dimensional body, we obtain a system of simultaneous equations in displacements of the second order, accompanied by three boundary conditions at each point of the boundary. Transformation of the problem to a two-dimensional plate problem brings us again to three differential equations in displacements, but the equation for w is of fourth order; it is supplemented with two conditions at each point of the plate edge, and the other two equations are of second order. The increase in order of the system is the cost of reducing a three-dimensional problem to a two-dimensional one.

In shell theory there are other variational principles. Their importance in this theory is even greater than in elasticity. They are the basis for numerical methods and investigations of qualitative questions of the theory. Moreover, variational principles are used to construct various versions of shell theory. For example, we can introduce the type of distribution of the stresses and displacements along the thickness, and derive the equilibrium equations, find admissible sets of boundary conditions, and justify well-posedness of the corresponding boundary value problems.

Uniqueness of solution

For an equilibrium problem in plate theory, a solution is unique in the same sense as for a problem in three-dimensional elasticity. That is, if some part of the boundary contour is clamped, then the solution is truly unique. When the plate can move as a rigid body, the solution is uniquely defined by the equations and boundary conditions up to a small rigid motion. We will briefly outline a proof.

Suppose, to the contrary, that there exist two solutions v1 and v2 to the boundary value problem (7.60), (7.65), (7.53), (7.64). The difference v = v2 − v1 is a solution of the corresponding homogeneous boundary value problem. It is easy to understand that it should be a stationary point of the total energy functional

\[ \int_\Sigma U\,d\Sigma. \]

This implies that

\[ \delta\int_\Sigma U\,d\Sigma = 0. \]

But U is a homogeneous quadratic form, so putting δv = v in δU we get δU = 2U(v) and thus

\[ 2\int_\Sigma U\,d\Sigma = 0. \]

But earlier we considered the question of when the strain energy functional takes zero value: the equation

\[ \int_\Sigma U\,d\Sigma = 0 \]

implies that v is a small rigid displacement of the plate. So the uniqueness of a solution for an equilibrium problem is demonstrated: a solution is uniquely defined up to a rigid displacement if there are no kinematic boundary constraints.

Exercise 7.14. Derive (7.50) from (7.49).

Exercise 7.15. Show that tr ε = ∇ · v − z∇ · ∇w.

Exercise 7.16. Let w and its first and second derivatives be small. Demonstrate that æ is the curvature tensor of the bent surface of the plate. Use the results of Exercise 5.33.

Exercise 7.18. Demonstrate the variational principle for tangential displacements in the theory of plates.

Exercise 7.19. Show that at a point of plate equilibrium, Ew takes its minimum value.


Remarks

Kirchhoff’s plate theory was extended to shell theory. In shell theory, Kirchhoff’s hypotheses are called the Kirchhoff–Love hypotheses. In the Kirchhoff–Love shell theory, the equations are written on the midsurface. The strain energy density for the shell also splits into two parts, for tangential and bending deformations. However, unlike the splitting of the plate energy, these terms are dependent. Moreover, all three equilibrium equations in displacements contain all components of the displacement vector. Technically, the Kirchhoff–Love shell theory is more complicated than plate theory, but qualitatively they are quite similar. In shell theory, we have Lagrange’s variational principle and uniqueness of solution to a boundary value problem. In both cases, we can also prove existence of a solution, but in shell theory this requires more advanced mathematical tools.

7.9 On Non-Classical Theories of Plates and Shells

Reissner’s approach to plate and shell theory

We recall that Kirchhoff’s hypotheses yield only an approximation to the real deformations of thin-walled bodies. In the mechanics of plates and shells, there are other approaches to the representation of three-dimensional deformation, where shear deformation, normal extension, and other factors are taken into account. In particular, the Reissner and Mindlin approaches allow us to construct more precise two-dimensional models for the solution of a three-dimensional elastic problem for a shell. The main features of the theory will be demonstrated using Reissner’s plate equations.

Unlike (7.54), in Reissner’s approach we take the displacement field to be of the form

\[ \mathbf{u}(x_1, x_2, z) = \mathbf{v}(x_1, x_2) - z\boldsymbol{\vartheta}(x_1, x_2) = \tilde{\mathbf{v}} + w\mathbf{n} - z\boldsymbol{\vartheta}, \qquad \boldsymbol{\vartheta}\cdot\mathbf{n} = 0, \tag{7.69} \]

where ṽ denotes the tangential part of v. The components of ϑ can be interpreted as the rotation angles of the shell cross section. In the general case, unlike in the Kirchhoff theory, ϑ ≠ ∇w and ϑ is considered to be independent of v. Thus, in Reissner’s theory there are five unknown variables: the components v1, v2, w of v and the components ϑ1, ϑ2 of ϑ. In Reissner’s theory, the transverse shear stress resultants are on equal footing with the stress resultants and the bending and twisting stress couples. Unlike what happens in Kirchhoff’s model, they are independent and so it is necessary to formulate additional constitutive equations along the lines of (7.47). The equilibrium equations for Reissner’s plate take the same form as in Kirchhoff’s theory, i.e., (7.45):

∇ · T + q = 0, ∇ ·M + T× + m = 0, (7.70)

where T and M are the stress resultant tensor and the stress couple tensor, respectively, and q and m represent the forces and couples distributed over Σ. Note that m · n = 0. In the component representation, they coincide with (7.50) and (7.51).

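The dynamic equations of the theory are

\[ \nabla\cdot\mathbf{T} + \mathbf{q} = \rho\ddot{\mathbf{v}} + \rho\boldsymbol{\Theta}_1\cdot\ddot{\boldsymbol{\vartheta}}, \qquad \nabla\cdot\mathbf{M} + \mathbf{T}_{\times} + \mathbf{m} = \rho\boldsymbol{\Theta}_1^T\cdot\ddot{\mathbf{v}} + \rho\boldsymbol{\Theta}_2\cdot\ddot{\boldsymbol{\vartheta}}, \tag{7.71} \]

where ρ is the surface shell density, Θ1 and Θ2 are the inertia tensors with Θ2^T = Θ2, and the overdot denotes differentiation with respect to time.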

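The boundary conditions for the plate are

\[ \mathbf{v}\big|_{\omega_1} = \mathbf{v}^0(s), \qquad \boldsymbol{\vartheta}\big|_{\omega_3} = \boldsymbol{\vartheta}^0(s), \tag{7.72} \]

\[ \boldsymbol{\nu}\cdot\mathbf{T}\big|_{\omega_2} = \boldsymbol{\varphi}(s), \qquad \boldsymbol{\nu}\cdot\mathbf{M}\big|_{\omega_4} = \boldsymbol{\ell}(s), \tag{7.73} \]

where v0(s) and ϑ0(s) are given vector functions of the length parameter s such that ϑ0 · n = 0. They define the displacements and the rotation on some part of the boundary contour, respectively. A given ϕ(s) and ℓ(s) would define the stress resultants and the stress couples acting on the rest of the plate edge, ℓ · n = 0.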

In Reissner’s plate theory, the constitutive equations are

T · A = C ··µ, T · n = Γ · γ, M = D ··κ. (7.74)

Here T · A is the in-plane stress resultant tensor, T · n represents the transverse shear stress resultants, and M is the stress couple tensor. The strain measures are denoted as follows: µ is the in-plane strain tensor, γ is the vector of transverse shear strains, and κ is the tensor of out-of-plane strains. Their definitions are

\[ \boldsymbol{\mu} = \frac{1}{2}\left(\nabla\mathbf{v} + (\nabla\mathbf{v})^T\right), \qquad \boldsymbol{\gamma} = \nabla w - \mathbf{n}\times\boldsymbol{\vartheta}, \qquad \boldsymbol{\kappa} = \nabla\boldsymbol{\vartheta}. \]

The remaining notation is for the fourth-order tensors C and D and the second-order tensor Γ that describe the effective stiffness properties of the plate; they depend on the material properties of the plate and on the cross-section geometry. In the case of an isotropic plate having properties symmetric about the midplane, the effective stiffness tensors take the form

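\[
\begin{aligned}
\mathbf{C} &= C_{11}\mathbf{A}\mathbf{A} + C_{22}(\mathbf{A}_2\mathbf{A}_2 + \mathbf{A}_4\mathbf{A}_4),\\
\mathbf{D} &= D_{22}(\mathbf{A}_2\mathbf{A}_2 + \mathbf{A}_4\mathbf{A}_4) + D_{33}\mathbf{A}_3\mathbf{A}_3,\\
\boldsymbol{\Gamma} &= \Gamma\mathbf{A},
\end{aligned}
\]

where

\[ \mathbf{A} = \mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2, \quad \mathbf{A}_2 = \mathbf{i}_1\mathbf{i}_1 - \mathbf{i}_2\mathbf{i}_2, \quad \mathbf{A}_3 = \mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1, \quad \mathbf{A}_4 = \mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1, \]

and i1 and i2 are the unit basis vectors with i1 · i2 = 0 on the midplane. Denote A1 = A. The following orthogonality relation can be obtained:

\[ \frac{1}{2}\mathbf{A}_i\cdot\cdot\,\mathbf{A}_j = \delta_{ij} \qquad (i, j = 1, 2, 3, 4). \]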

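For an isotropic homogeneous plate, the components of the stiffness tensors are defined as follows:

\[ C_{11} = \frac{Eh}{2(1-\nu)}, \qquad C_{22} = \frac{Eh}{2(1+\nu)} = \mu h, \]

\[ D_{33} = \frac{Eh^3}{24(1-\nu)}, \qquad D_{22} = \frac{Eh^3}{24(1+\nu)} = \frac{\mu h^3}{12}. \]

The classical bending stiffness for a plate is

\[ D = D_{33} + D_{22} = \frac{Eh^3}{12(1-\nu^2)}. \tag{7.75} \]

The surface density and the inertia tensors are

\[ \rho = \rho_0 h, \qquad \boldsymbol{\Theta}_1 = \mathbf{0}, \qquad \boldsymbol{\Theta}_2 = \Theta\mathbf{A}, \qquad \Theta = \frac{\rho_0 h^3}{12}, \tag{7.76} \]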

where ρ0 is the plate material density.

The transverse shear stiffness is given by

\[ \Gamma = k\mu h, \tag{7.77} \]

where k is a shear correction factor first introduced by Timoshenko in the theory of beams². For the value of k, Mindlin³ proposed k = π²/12 while Reissner⁴ proposed a similar value of k = 5/6. The literature mentions other values of k; for example, for plates that are strongly nonhomogeneous in thickness, k can significantly differ from the above values⁵.

² Timoshenko, S. P. On the correction for shear of the differential equation for transverse vibrations of prismatic bars. Phil. Mag. Ser. 6, 41, 744–746, 1921.

³ Mindlin, R.D. Influence of rotatory inertia and shear on flexural motions of isotropic, elastic plates. Trans. ASME J. Appl. Mech., 18, 31–38, 1951.
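The stiffness constants above are simple to tabulate. The sketch below evaluates them for hypothetical aluminium-like data and checks relation (7.75); the function name reissner_stiffness and the numerical values are illustrative assumptions, and the default k = 5/6 is just one of the two values quoted above.

```python
import math


def reissner_stiffness(E, nu, h, k=5.0 / 6.0):
    """Scalar stiffness constants of an isotropic homogeneous plate."""
    mu = E / (2.0 * (1.0 + nu))          # shear modulus
    return {
        "C11": E * h / (2.0 * (1.0 - nu)),
        "C22": mu * h,
        "D33": E * h**3 / (24.0 * (1.0 - nu)),
        "D22": mu * h**3 / 12.0,
        "Gamma": k * mu * h,             # transverse shear stiffness (7.77)
    }


# Hypothetical data.
E, nu, h = 70e9, 0.3, 0.01

s_reissner = reissner_stiffness(E, nu, h)                    # k = 5/6
s_mindlin = reissner_stiffness(E, nu, h, k=math.pi**2 / 12)  # k = pi^2/12

D_classical = E * h**3 / (12.0 * (1.0 - nu**2))
print(s_reissner["D33"] + s_reissner["D22"], D_classical)
print(s_mindlin["Gamma"] / s_reissner["Gamma"])
```

The two quoted values of k differ by about 1.3 percent, so the corresponding values of Γ are close.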

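The strain energy density for Reissner’s plate is given by

\[ U = \frac{1}{2}\boldsymbol{\mu}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\boldsymbol{\mu} + \frac{1}{2}\boldsymbol{\kappa}\cdot\cdot\,\mathbf{D}\cdot\cdot\,\boldsymbol{\kappa} + \frac{1}{2}\boldsymbol{\gamma}\cdot\boldsymbol{\Gamma}\cdot\boldsymbol{\gamma}. \]

It is seen that

\[ \mathbf{T}\cdot\mathbf{A} = \frac{\partial U}{\partial\boldsymbol{\mu}}, \qquad \mathbf{T}\cdot\mathbf{n} = \frac{\partial U}{\partial\boldsymbol{\gamma}}, \qquad \mathbf{M} = \frac{\partial U}{\partial\boldsymbol{\kappa}}. \]

The five scalar equations of (7.50) and (7.51) contain five scalar unknowns v1, v2, w, ϑ1, and ϑ2. In Reissner’s model, equation (7.59) for w changes to the following:

\[ D\nabla^4 w = q_n - \frac{D}{\Gamma}\Delta q_n. \tag{7.78} \]

Because of the additional term (D/Γ)∆qn, for strongly nonhomogeneous loads the results obtained from Reissner’s theory can differ significantly from those obtained from Kirchhoff’s model.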
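The size of the correction term in (7.78) can be estimated for a sinusoidal load qn = q0 sin(πx1/a) sin(πx2/b), for which Δqn = −λqn with λ = π²(1/a² + 1/b²). Seeking w of the same sinusoidal shape (an assumption that presumes simply supported edges compatible with this shape), (7.78) gives the amplitude q0(1 + Dλ/Γ)/(Dλ²), against q0/(Dλ²) from (7.59). The sketch below tabulates the ratio for several thickness-to-span values; the function name and all numbers are hypothetical.

```python
import math


def deflection_amplitudes(q0, a, b, h, E, nu, k=5.0 / 6.0):
    """Amplitudes of w for the ASSUMED load q0*sin(pi x/a)*sin(pi y/b).

    Returns (Kirchhoff amplitude from (7.59), Reissner amplitude from (7.78)).
    """
    D = E * h**3 / (12.0 * (1.0 - nu**2))
    Gamma = k * E * h / (2.0 * (1.0 + nu))      # Gamma = k*mu*h
    lam = math.pi**2 * (1.0 / a**2 + 1.0 / b**2)
    w_kirchhoff = q0 / (D * lam**2)
    w_reissner = w_kirchhoff * (1.0 + D * lam / Gamma)
    return w_kirchhoff, w_reissner


# Square plate of unit side; E and q0 are set to 1 because only the ratio matters.
ratios = {}
for h_over_a in (0.01, 0.1, 0.2):
    wk, wr = deflection_amplitudes(1.0, 1.0, 1.0, h_over_a, 1.0, 0.3)
    ratios[h_over_a] = wr / wk

for h_over_a, ratio in ratios.items():
    print(h_over_a, ratio)
```

The ratio exceeds 1 by a term proportional to (h/a)², so under these assumptions the shear correction is negligible for thin plates and grows for thicker ones.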

For advanced readers, we propose


Reissner’s theory is usually used for non-thin plates, for dynamic problems, and for cases of anisotropic materials having small shear stiffness. A more detailed presentation of the theory can be found in [Wang, Reddy, and Lee (2000); Zhilin (2006)]; see also Altenbach⁶ and Grigolyuk and Selezov⁷.

Plate and shell theories of higher order

The approximation of the displacement field (7.69) is linear in z. In the literature on shell theory, we encounter displacement approximations in z of higher order. The natural idea of expanding the displacement in a series of powers in z belongs to Cauchy and Poisson. We could say that Germain’s equation for plate deflection represents the zero-order theory, as here the deflection field does not depend on z. Equation (7.69) can be considered as a basis of the first-order theory. Some versions of higher-order theory were developed in a number of works; certain difficulties arise, however, such as how one should define the boundary conditions and physically interpret moments of higher order.

⁴ Reissner, E. On the theory of bending of elastic plates. J. Math. Physics, 23, 184–194, 1944.

⁵ Altenbach, H., and Eremeyev, V.A. Direct approach based analysis of plates composed of functionally graded materials. Arch. Appl. Mech., 78, 775–794, 2008.

⁶ Altenbach, H. An alternative determination of transverse shear stiffnesses for sandwich and laminated plates. Int. J. Solids Struct., 37, 3503–3520, 2000. Altenbach, H. On the determination of transverse shear stiffnesses of orthotropic plates. ZAMP, 51, 629–649, 2000.

⁷ Grigolyuk, E.I., and Selezov, I.T. Nonclassical theories of vibration of beams, plates and shells (in Russian). In: Itogi nauki i tekhniki, Mekhanika tverdogo deformiruemogo tela, vol 5, VINITI, Moskva, 1973.

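As an example, we will consider the third-order plate theory. In [Wang, Reddy, and Lee (2000)] there is a review of the literature on this version of the theory. The displacement field is approximated by the expressions

\[ \mathbf{u}(x_1, x_2, z) = \mathbf{v}(x_1, x_2) + z\boldsymbol{\phi}(x_1, x_2) + z^3\boldsymbol{\psi}(x_1, x_2) = \tilde{\mathbf{v}} + w\mathbf{n} + z\boldsymbol{\phi} + z^3\boldsymbol{\psi}, \tag{7.79} \]

where ṽ · n = 0, and φ and ψ are functions such that φ · n = 0 and ψ · n = 0. In particular, Reddy proposed the following expression for ψ:

\[ \boldsymbol{\psi} = -\alpha(\boldsymbol{\phi} + \nabla w), \]

where α = 4/(3h²) and h is the plate thickness. The strain tensor becomes

\[ \boldsymbol{\varepsilon} = \tilde{\boldsymbol{\varepsilon}} + \frac{1}{2}(\boldsymbol{\gamma}\mathbf{n} + \mathbf{n}\boldsymbol{\gamma}), \tag{7.80} \]

where the in-plane part, written here as ε̃, and the transverse shear vector γ are

\[
\begin{aligned}
\tilde{\boldsymbol{\varepsilon}} &= \boldsymbol{\epsilon} + \frac{z}{2}\left(\nabla\boldsymbol{\phi} + (\nabla\boldsymbol{\phi})^T\right) + \frac{z^3}{2}\left(\nabla\boldsymbol{\psi} + (\nabla\boldsymbol{\psi})^T\right)\\
&= \boldsymbol{\epsilon} + \frac{z}{2}\left(\nabla\boldsymbol{\phi} + (\nabla\boldsymbol{\phi})^T\right) - \alpha\frac{z^3}{2}\left(\nabla\boldsymbol{\phi} + (\nabla\boldsymbol{\phi})^T + 2\nabla\nabla w\right),\\
\boldsymbol{\gamma} &= \boldsymbol{\phi} + 3z^2\boldsymbol{\psi} = \boldsymbol{\phi} - 3z^2\alpha\boldsymbol{\phi} - 3z^2\alpha\nabla w,\\
2\boldsymbol{\epsilon} &= \nabla\tilde{\mathbf{v}} + (\nabla\tilde{\mathbf{v}})^T.
\end{aligned}
\]

In the third-order theory, we introduce the stress resultants Tαβ, the transverse shear stress resultants Tαn, and the stress couples Mαβ. In addition, we introduce the higher-order stress resultants Pαβ and Rα: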

Tαβ = [[σαβ ]] ≡∫ h/2

−h/2

σαβ dz, Tαn = [[σαn]] ≡∫ h/2

−h/2

σα3 dz,

Mαβ = [[zσαβ ]] ≡∫ h/2

−h/2

zσαβ dz, (7.81)

Pαβ = [[z3σαβ ]] ≡∫ h/2

−h/2

z3σαβ dz, Rα = [[z2σαn]] ≡∫ h/2

−h/2

z2σα3 dz.
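The through-thickness integrals in (7.81) are easy to evaluate numerically. The sketch below is an illustration only: it assumes a single stress component varying linearly in z, σ(z) = s0 + s1 z, with made-up values of s0, s1, and h, and uses a composite Simpson rule. For this profile the exact values are T = s0 h, M = s1 h³/12, and P = s1 h⁵/80.

```python
def thickness_integral(f, h, n=200):
    """Composite Simpson approximation of the integral of f(z) over [-h/2, h/2]."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    a = -h / 2
    step = h / n
    total = f(a) + f(h / 2)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3


# Hypothetical data: thickness and a linear stress profile sigma(z) = s0 + s1*z
h = 0.1
s0, s1 = 2.0e6, 5.0e7


def sigma(z):
    return s0 + s1 * z


T = thickness_integral(sigma, h)                           # stress resultant
M = thickness_integral(lambda z: z * sigma(z), h)          # stress couple
P = thickness_integral(lambda z: z ** 3 * sigma(z), h)     # higher-order resultant

if __name__ == "__main__":
    print(T, M, P)
```

Only the odd part of the stress profile contributes to M and P, and only the even part contributes to T, which is why the three quantities carry independent information about the stress distribution.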

Exercise 7.21. Using (7.80) and (7.56), express (7.81) in terms of the components of φ and w.

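The equilibrium equations split into two systems. The first is for the plane state; it is of the form we had in Kirchhoff’s theory:

\[
\frac{\partial T_{11}}{\partial x_1} + \frac{\partial T_{21}}{\partial x_2} + q_1 = 0, \qquad
\frac{\partial T_{12}}{\partial x_1} + \frac{\partial T_{22}}{\partial x_2} + q_2 = 0. \tag{7.82}
\]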

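Equations (7.82) are supplemented with the boundary conditions

\[
\mathbf{v}\big|_{\omega_1} = \mathbf{v}_0(s), \qquad \boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_{\omega_2} = \boldsymbol{\phi}_0(s), \tag{7.83}
\]

where

\[
\mathbf{n}\cdot\mathbf{v}_0 = 0, \quad \mathbf{n}\cdot\boldsymbol{\phi}_0 = 0, \quad \omega_1 \cup \omega_2 = \omega, \quad \omega_1 \cap \omega_2 = \emptyset.
\]

The constitutive equations are

\[
T_{11} = \frac{Eh}{1-\nu^2}(\epsilon_{11} + \nu\epsilon_{22}), \qquad
T_{22} = \frac{Eh}{1-\nu^2}(\epsilon_{22} + \nu\epsilon_{11}), \qquad
T_{12} = \frac{Eh}{1+\nu}\,\epsilon_{12}.
\]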

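The boundary value problem for plate bending is presented by the equilibrium equations

\[
\begin{aligned}
&\frac{\partial \hat{T}_{1n}}{\partial x_1} + \frac{\partial \hat{T}_{2n}}{\partial x_2}
+ \alpha\left(\frac{\partial^2 P_{11}}{\partial x_1^2} + 2\frac{\partial^2 P_{12}}{\partial x_1 \partial x_2} + \frac{\partial^2 P_{22}}{\partial x_2^2}\right) + q_n = 0, \\
&\frac{\partial \hat{M}_{11}}{\partial x_1} + \frac{\partial \hat{M}_{21}}{\partial x_2} - \hat{T}_{1n} = 0, \\
&\frac{\partial \hat{M}_{12}}{\partial x_1} + \frac{\partial \hat{M}_{22}}{\partial x_2} - \hat{T}_{2n} = 0,
\end{aligned} \tag{7.84}
\]

where

\[
\hat{M}_{\alpha\beta} = M_{\alpha\beta} - \alpha P_{\alpha\beta}, \qquad \hat{T}_{\alpha n} = T_{\alpha n} - 3\alpha R_{\alpha}.
\]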

The kinematic boundary conditions for the bending problem consist of four scalar equations:

\[
w\big|_{\omega} = w_0(s), \qquad \frac{\partial w}{\partial n}\bigg|_{\omega} = w_{0n}(s), \qquad \boldsymbol{\varphi}\big|_{\omega} = \boldsymbol{\varphi}_0(s),
\]

where $\mathbf{n}\cdot\boldsymbol{\varphi}_0 = 0$. The static boundary conditions are expressed in terms of stress resultants, stress couples, and moments of higher order; the interested reader should consult the original sources [Wang, Reddy, and Lee (2000)] and Kienzler8, Levinson9, and Reddy10.

Note that the third- and higher-order shell and plate theories show why variational formulations are needed. These are the only way to introduce correct static boundary conditions.

Micropolar shells or 6-parametric shell theory

As an example of a non-classical shell theory, we will sketch the theory of micropolar shells. Its roots originate in the work of the Cosserat brothers, Eugène and François [Cosserat and Cosserat (1909)]. The micropolar theory of shells is presented in the works of Eremeyev and Zubov [Eremeyev and Zubov (2008)], Zhilin [Zhilin (2006)], and others.

In this theory, the dynamical equations and shell kinematics coincide with those of a 6-parametric shell theory, a nonlinear version of which is presented in the books by Libai and Simmonds [Libai and Simmonds (1998)] and Chroscielewski et al. [Chroscielewski, Makowski, and Pietraszkiewicz (2004)]. There exists a micropolar plate theory by Eringen [Eringen (1999)]. It is based on integration over the thickness of the plate when its material is considered as a Cosserat continuum. This differs from the 6-parametric shell-plate theory, as it contains more than 6 unknown scalar functions.

The kinematics of the shell surface are described by six scalar quantities, three of which are the components of the displacement vector v and the rest of which are the components of the microrotation vector ϑ. So a shell particle has six degrees of freedom described by the components of v and ϑ. In general ϑ · n ≠ 0; the vectors v and ϑ are mutually independent. This is analogous to a two-dimensional version of the Cosserat medium with couple stresses and the rotation interaction of the particles.

8 Kienzler, R. On consistent plate theories. Arch. Appl. Mech., 72, 229–247, 2002.

9 Levinson, M. An accurate, simple theory of the statics and dynamics of elastic plates. Mech. Res. Comm., 7, 343–350, 1980.

10 Reddy, J.N. A simple higher-order theory for laminated composite plates. Trans. ASME J. Appl. Mech., 51, 745–752, 1984.

For a micropolar shell, we can assign the couple load acting on the shell surface. Here the order of the equilibrium equations is 12, so we should supplement them with 6 conditions on the shell edge. On the part of the edge that is free of geometrical constraints, we can assign the distribution of forces and couples. This extends the range of applicability of the theory in comparison with the theories considered above: we can, say, assign boundary conditions that describe a shell clamped to a rigid body.

The micropolar theory is used, in particular, to describe branching shells and thin-walled bodies with complex intrinsic structure that include multilayered or composite plates and shells, shells with inner partition, with stringers and those similar to honeycomb or made of highly porous materials.

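We present the micropolar shell equations for small deformation. The dynamical equations are

\[
\tilde{\nabla}\cdot\mathbf{T} + \mathbf{q} = \rho\ddot{\mathbf{v}} + \rho\boldsymbol{\Theta}_1\cdot\ddot{\boldsymbol{\vartheta}}, \qquad
\tilde{\nabla}\cdot\mathbf{M} + \mathbf{T}_{\times} + \mathbf{m} = \rho\boldsymbol{\Theta}_1^T\cdot\ddot{\mathbf{v}} + \rho\boldsymbol{\Theta}_2\cdot\ddot{\boldsymbol{\vartheta}}, \tag{7.85}
\]

where T and M are surface stress and couple stress tensors analogous to those in Kirchhoff’s theory, ρ is the shell surface density, $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ are the inertia tensors, $\boldsymbol{\Theta}_2$ is symmetric, $\boldsymbol{\Theta}_2^T = \boldsymbol{\Theta}_2$, and q and m are distributed surface forces and couples, respectively. Besides, in this theory, we can assign a twisting couple (or drilling moment) on the shell surface. The equilibrium equations take the form

\[
\tilde{\nabla}\cdot\mathbf{T} + \mathbf{q} = \mathbf{0}, \qquad \tilde{\nabla}\cdot\mathbf{M} + \mathbf{T}_{\times} + \mathbf{m} = \mathbf{0}. \tag{7.86}
\]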

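The conditions at a boundary point can be kinematic

\[
\mathbf{v}\big|_{\omega_1} = \mathbf{v}_0(s), \qquad \boldsymbol{\vartheta}\big|_{\omega_3} = \boldsymbol{\vartheta}_0(s), \tag{7.87}
\]

or static

\[
\boldsymbol{\nu}\cdot\mathbf{T}\big|_{\omega_2} = \boldsymbol{\phi}(s), \qquad \boldsymbol{\nu}\cdot\mathbf{M}\big|_{\omega_4} = \boldsymbol{\ell}(s), \tag{7.88}
\]

where $\mathbf{v}_0(s)$ and $\boldsymbol{\vartheta}_0(s)$ are given vector functions of the length parameter s; they define the displacements and microrotation of the shell edge. The functions $\boldsymbol{\phi}(s)$ and $\boldsymbol{\ell}(s)$ determine the surface stresses and stress couples on the edge. Here

\[
\omega = \omega_1 \cup \omega_2 = \omega_3 \cup \omega_4, \qquad \omega_2 = \omega \setminus \omega_1, \qquad \omega_4 = \omega \setminus \omega_3.
\]

The form of the equilibrium and dynamical equations for a micropolar shell does not differ from (7.70) and (7.71). In general, in this theory $\mathbf{m}\cdot\mathbf{n} \neq 0$ and $\boldsymbol{\ell}\cdot\mathbf{n} \neq 0$. In particular, $\mathbf{M}\cdot\mathbf{n} \neq \mathbf{0}$.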

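The strain measures take the form

\[
\boldsymbol{\varepsilon} = \tilde{\nabla}\mathbf{v} + \mathbf{A}\times\boldsymbol{\vartheta}, \qquad \boldsymbol{\kappa} = \tilde{\nabla}\boldsymbol{\vartheta}, \tag{7.89}
\]

where ε and κ are non-symmetric strain and bending strain tensors, respectively.

The constitutive equations for an elastic shell are represented through the strain energy density $U = U(\boldsymbol{\varepsilon}, \boldsymbol{\kappa})$:

\[
\mathbf{T} = \frac{\partial U}{\partial \boldsymbol{\varepsilon}}, \qquad \mathbf{M} = \frac{\partial U}{\partial \boldsymbol{\kappa}}. \tag{7.90}
\]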

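For an isotropic shell, U is a quadratic form in its variables:

\[
\begin{aligned}
2U ={}& \alpha_1 \operatorname{tr}^2 \boldsymbol{\varepsilon}_{\parallel} + \alpha_2 \operatorname{tr} \boldsymbol{\varepsilon}_{\parallel}^2 + \alpha_3 \operatorname{tr}\left(\boldsymbol{\varepsilon}_{\parallel}\cdot\boldsymbol{\varepsilon}_{\parallel}^T\right) + \alpha_4\, \mathbf{n}\cdot\boldsymbol{\varepsilon}^T\cdot\boldsymbol{\varepsilon}\cdot\mathbf{n} \\
&+ \beta_1 \operatorname{tr}^2 \boldsymbol{\kappa}_{\parallel} + \beta_2 \operatorname{tr} \boldsymbol{\kappa}_{\parallel}^2 + \beta_3 \operatorname{tr}\left(\boldsymbol{\kappa}_{\parallel}\cdot\boldsymbol{\kappa}_{\parallel}^T\right) + \beta_4\, \mathbf{n}\cdot\boldsymbol{\kappa}^T\cdot\boldsymbol{\kappa}\cdot\mathbf{n}, \\
\boldsymbol{\varepsilon}_{\parallel} ={}& \boldsymbol{\varepsilon}\cdot\mathbf{A}, \qquad \boldsymbol{\kappa}_{\parallel} = \boldsymbol{\kappa}\cdot\mathbf{A},
\end{aligned} \tag{7.91}
\]

where $\alpha_k$ and $\beta_k$ (k = 1, 2, 3, 4) are elastic moduli. Substituting (7.91) into (7.90), we get

\[
\begin{aligned}
\mathbf{T} &= \alpha_1 \mathbf{A} \operatorname{tr} \boldsymbol{\varepsilon}_{\parallel} + \alpha_2 \boldsymbol{\varepsilon}_{\parallel}^T + \alpha_3 \boldsymbol{\varepsilon}_{\parallel} + \alpha_4 (\boldsymbol{\varepsilon}\cdot\mathbf{n})\mathbf{n}, \\
\mathbf{M} &= \beta_1 \mathbf{A} \operatorname{tr} \boldsymbol{\kappa}_{\parallel} + \beta_2 \boldsymbol{\kappa}_{\parallel}^T + \beta_3 \boldsymbol{\kappa}_{\parallel} + \beta_4 (\boldsymbol{\kappa}\cdot\mathbf{n})\mathbf{n}.
\end{aligned} \tag{7.92}
\]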

Equations (7.86)–(7.88) and (7.92) constitute a linear boundary value problem with respect to the fields of displacement and microrotation; they describe the equilibrium of a micropolar shell in the case of small deformation. For dynamic problems, equations (7.86) change to (7.85). Under some additional assumptions, Reissner’s shell theory is a consequence of the micropolar theory.

As in the other shell theories, we can formulate some variational principles. Lagrange’s variational principle for an elastic micropolar shell starts with formulation of the total energy functional

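\[
\mathcal{E}(\mathbf{v}, \boldsymbol{\vartheta}) = \int_{\Sigma} U(\boldsymbol{\varepsilon}, \boldsymbol{\kappa})\,d\Sigma - \mathcal{A}(\mathbf{v}, \boldsymbol{\vartheta}), \tag{7.93}
\]

where the potential of external loads $\mathcal{A}(\mathbf{v}, \boldsymbol{\vartheta})$ is

\[
\mathcal{A}(\mathbf{v}, \boldsymbol{\vartheta}) = \int_{\Sigma} (\mathbf{q}\cdot\mathbf{v} + \mathbf{m}\cdot\boldsymbol{\vartheta})\,d\Sigma + \int_{\omega_2} \boldsymbol{\phi}\cdot\mathbf{v}\,ds + \int_{\omega_4} \boldsymbol{\ell}\cdot\boldsymbol{\vartheta}\,ds.
\]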


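The functional $\mathcal{E}(\mathbf{v}, \boldsymbol{\vartheta})$ is considered on the set of twice continuously differentiable fields of displacements and microrotations that satisfy (7.87). The pair (v, ϑ) that satisfies (7.86) and (7.88) is a stationary point of $\mathcal{E}(\mathbf{v}, \boldsymbol{\vartheta})$. Lagrange’s stationary principle is minimal: on the equilibrium solution, the functional (7.93) attains its minimum.

Rayleigh’s variational principle now takes the following form.

On the set of functions with boundary conditions $\mathbf{v}|_{\omega_1} = \mathbf{0}$, $\boldsymbol{\vartheta}|_{\omega_3} = \mathbf{0}$ that obey the constraint

\[
\mathcal{K}(\mathbf{v}, \boldsymbol{\vartheta}) \equiv \int_{\Sigma} \rho\left(\tfrac{1}{2}\mathbf{v}\cdot\mathbf{v} + \mathbf{v}\cdot\boldsymbol{\Theta}_1\cdot\boldsymbol{\vartheta} + \tfrac{1}{2}\boldsymbol{\vartheta}\cdot\boldsymbol{\Theta}_2\cdot\boldsymbol{\vartheta}\right)d\Sigma = 1,
\]

the eigenoscillation modes of the shell are stationary points of the strain energy functional

\[
\mathcal{E}(\mathbf{v}, \boldsymbol{\vartheta}) = \int_{\Sigma} U(\boldsymbol{\varepsilon}, \boldsymbol{\kappa})\,d\Sigma, \tag{7.94}
\]

where $\boldsymbol{\varepsilon} = \tilde{\nabla}\mathbf{v} + \mathbf{A}\times\boldsymbol{\vartheta}$ and $\boldsymbol{\kappa} = \tilde{\nabla}\boldsymbol{\vartheta}$.

Rayleigh’s principle also includes the reverse statement: on the set of functions satisfying the above restrictions, the stationary points of $\mathcal{E}$ are the modes of eigenoscillation. Here v and ϑ are the amplitudes of the oscillations of the displacements and microrotations, as the solutions of the eigenoscillation problem are sought in the form $\mathbf{v} = \mathbf{v}e^{i\omega t}$, $\boldsymbol{\vartheta} = \boldsymbol{\vartheta}e^{i\omega t}$. Rayleigh’s quotient takes the form

\[
\mathcal{R}(\mathbf{v}, \boldsymbol{\vartheta}) = \frac{\mathcal{E}(\mathbf{v}, \boldsymbol{\vartheta})}{\mathcal{K}(\mathbf{v}, \boldsymbol{\vartheta})}.
\]

Now the lowest eigenfrequency of the shell, which is the minimal frequency, is equal to the minimum value of the functional $\mathcal{R}$.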
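A finite-dimensional analogue may help to see how a Rayleigh quotient singles out the lowest mode. The sketch below is an assumption-laden illustration, not a shell computation: it replaces the strain energy and kinetic functionals by quadratic forms with a made-up 2×2 stiffness matrix S and mass matrix M, and compares the minimum of the quotient over all directions with the smallest root of det(S − λM) = 0 (for a discrete oscillator λ is the squared frequency).

```python
import math

# Hypothetical 2-degree-of-freedom data (not taken from the text)
STIFFNESS = [[2.0, -1.0], [-1.0, 2.0]]
MASS = [[2.0, 0.0], [0.0, 1.0]]


def quadratic_form(matrix, x):
    """Return x^T * matrix * x for a small dense matrix."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


def rayleigh_quotient(x):
    """Ratio of the 'strain energy' form to the 'kinetic' form."""
    return quadratic_form(STIFFNESS, x) / quadratic_form(MASS, x)


def minimum_over_directions(samples=20000):
    """Scan unit vectors (cos t, sin t), 0 <= t < pi, and return the least quotient."""
    best = float("inf")
    for k in range(samples):
        theta = math.pi * k / samples
        best = min(best, rayleigh_quotient((math.cos(theta), math.sin(theta))))
    return best


def smallest_generalized_eigenvalue():
    """Smallest root of det(S - lam*M) = 2*lam**2 - 6*lam + 3 = 0."""
    return (3.0 - math.sqrt(3.0)) / 2.0


if __name__ == "__main__":
    print(minimum_over_directions(), smallest_generalized_eigenvalue())
```

Because the quotient is stationary at an eigenvector, a coarse scan of directions already reproduces the smallest eigenvalue to high accuracy; the same second-order insensitivity is what makes Rayleigh-type estimates useful with approximate mode shapes.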

Finally, we would like to repeat that, contrary to the situation with linear elasticity, shell theory is still under development; investigators continue to seek better models and numerical methods for the solution of practical problems. This is a consequence of the evident fact that it is impossible to find an accurate approximation to the results of a three-dimensional elasticity problem under all the circumstances when one solves the same problem using one of the two-dimensional models of shell theory.

Appendix A

Formulary

For convenience we list the main formulas obtained in each chapter. The symbol ∀ denotes the universal quantifier (read as “for all” or “for every”).

Chapter 1

Dot product

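\[
\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\theta = a_1b_1 + a_2b_2 + a_3b_3
\]

Cross product

\[
\mathbf{a}\times\mathbf{b} =
\begin{vmatrix}
\mathbf{i}_1 & \mathbf{i}_2 & \mathbf{i}_3 \\
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3
\end{vmatrix}
\]

Scalar triple product

\[
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) =
\begin{vmatrix}
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3 \\
c_1 & c_2 & c_3
\end{vmatrix}
\]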
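The three Cartesian formulas above translate directly into code. The following Python sketch is illustrative; the helper names and the sample vectors are assumptions chosen only to exercise the formulas.

```python
def dot(a, b):
    """Dot product a1*b1 + a2*b2 + a3*b3 in a Cartesian frame."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product obtained by expanding the determinant along its first row."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def triple(a, b, c):
    """Scalar triple product a . (b x c), i.e. the 3x3 determinant with rows a, b, c."""
    return dot(a, cross(b, c))


# Sample vectors (arbitrary)
a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)

if __name__ == "__main__":
    print(dot(a, b), cross(a, b), triple(a, b, c))
```

A cyclic permutation of the arguments leaves the triple product unchanged, while swapping two of them flips its sign, mirroring the row-exchange property of determinants.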


Chapter 2

Reciprocal (dual) basis

Kronecker delta symbol

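\[
\delta_i^j =
\begin{cases}
1, & j = i \\
0, & j \neq i
\end{cases}
\]

Definition of reciprocal basis

\[
\mathbf{e}^j\cdot\mathbf{e}_i = \delta_i^j
\]

Components of a vector

\[
\mathbf{x} = x^i\mathbf{e}_i = x_i\mathbf{e}^i, \qquad x^i = \mathbf{x}\cdot\mathbf{e}^i, \qquad x_i = \mathbf{x}\cdot\mathbf{e}_i
\]

Relations between dual bases

\[
\mathbf{e}^i = \frac{1}{V}(\mathbf{e}_j\times\mathbf{e}_k), \qquad \mathbf{e}_i = \frac{1}{V'}(\mathbf{e}^j\times\mathbf{e}^k)
\]

where

\[
(i, j, k) = (1, 2, 3) \text{ or } (2, 3, 1) \text{ or } (3, 1, 2)
\]

and

\[
V = \mathbf{e}_1\cdot(\mathbf{e}_2\times\mathbf{e}_3), \qquad V' = \mathbf{e}^1\cdot(\mathbf{e}^2\times\mathbf{e}^3), \qquad V' = 1/V
\]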
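As a concrete check of the relations between dual bases, the sketch below builds the reciprocal basis from cross products and verifies the defining property and the volume relation. The basis vectors are an arbitrary assumption, given by Cartesian components; the function names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def reciprocal_basis(e1, e2, e3):
    """Return (e^1, e^2, e^3) with e^i = (e_j x e_k)/V for cyclic (i, j, k)."""
    volume = dot(e1, cross(e2, e3))
    if volume == 0:
        raise ValueError("the vectors are coplanar and do not form a basis")

    def scaled(v):
        return tuple(component / volume for component in v)

    return scaled(cross(e2, e3)), scaled(cross(e3, e1)), scaled(cross(e1, e2))


# Arbitrary non-orthogonal basis (Cartesian components)
basis = ((2, 0, 0), (1, 1, 0), (1, 1, 1))
dual = reciprocal_basis(*basis)
V = dot(basis[0], cross(basis[1], basis[2]))
V_dual = dot(dual[0], cross(dual[1], dual[2]))

if __name__ == "__main__":
    print(dual, V, V_dual)
```

The matrix of dot products between the dual and original vectors is the identity, which is exactly the Kronecker delta property used to define the reciprocal basis.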

Metric coefficients

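\[
g_{jq} = \mathbf{e}_j\cdot\mathbf{e}_q, \qquad g^{ip} = \mathbf{e}^i\cdot\mathbf{e}^p, \qquad g_{ij}\,g^{jk} = \delta_i^k
\]

In Cartesian frames,

\[
g_{ij} = \delta_i^j, \qquad g^{ij} = \delta_j^i
\]

Dot products in mixed and unmixed bases

\[
\mathbf{a}\cdot\mathbf{b} = a^i b^j g_{ij} = a_i b_j g^{ij} = a^i b_i = a_i b^i
\]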

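Raising and lowering of indices

\[
x_j = x^i g_{ij}, \qquad x^i = x_j g^{ij}
\]

Frame transformation

Equations of transformation

\[
\tilde{\mathbf{e}}_i = A_i^j\,\mathbf{e}_j, \qquad A_i^j = \tilde{\mathbf{e}}_i\cdot\mathbf{e}^j
\]

\[
\mathbf{e}_i = \tilde{A}_i^j\,\tilde{\mathbf{e}}_j, \qquad \tilde{A}_i^j = \mathbf{e}_i\cdot\tilde{\mathbf{e}}^j
\]

where

\[
A_i^j \tilde{A}_j^k = \tilde{A}_i^j A_j^k = \delta_i^k
\]

Vector components and transformation laws

\[
\mathbf{x} = x^i\mathbf{e}_i = x_i\mathbf{e}^i = \tilde{x}^i\tilde{\mathbf{e}}_i = \tilde{x}_i\tilde{\mathbf{e}}^i
\]

and

\[
\tilde{x}^i = \tilde{A}_j^i\,x^j, \qquad x^i = A_j^i\,\tilde{x}^j
\]

\[
\tilde{x}_i = A_i^j\,x_j, \qquad x_i = \tilde{A}_i^j\,\tilde{x}_j
\]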

Miscellaneous

Permutation (Levi–Civita) symbol

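\[
\epsilon_{ijk} = \mathbf{e}_i\cdot(\mathbf{e}_j\times\mathbf{e}_k) =
\begin{cases}
+V, & (i, j, k) \text{ an even permutation of } (1, 2, 3) \\
-V, & (i, j, k) \text{ an odd permutation of } (1, 2, 3) \\
0, & \text{two or more indices equal}
\end{cases}
\]

\[
\epsilon^{ijk} = \mathbf{e}^i\cdot(\mathbf{e}^j\times\mathbf{e}^k) =
\begin{cases}
+V', & (i, j, k) \text{ an even permutation of } (1, 2, 3) \\
-V', & (i, j, k) \text{ an odd permutation of } (1, 2, 3) \\
0, & \text{two or more indices equal}
\end{cases}
\]

Useful identities

\[
\epsilon_{ijk}\,\epsilon^{pqr} =
\begin{vmatrix}
\delta_i^p & \delta_i^q & \delta_i^r \\
\delta_j^p & \delta_j^q & \delta_j^r \\
\delta_k^p & \delta_k^q & \delta_k^r
\end{vmatrix},
\qquad
\epsilon_{ijk}\,\epsilon^{pqk} = \delta_i^p\delta_j^q - \delta_i^q\delta_j^p
\]

Determinant of Gram matrix

\[
V^2 = \det[g_{ij}]
\]

Cross product

\[
\mathbf{a}\times\mathbf{b} = \mathbf{e}^i\,\epsilon_{ijk}\,a^j b^k = \mathbf{e}_i\,\epsilon^{ijk}\,a_j b_k
\]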
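The contraction identity for two permutation symbols can be confirmed by brute force. The sketch below works in a Cartesian frame, where V = V′ = 1 and upper and lower indices coincide; the closed-form expression used for the permutation symbol and the function names are illustrative choices.

```python
from itertools import product


def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2} (Cartesian frame, V = 1)."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def contraction_identity_holds():
    """Check eps_ijk eps_pqk = delta_ip delta_jq - delta_iq delta_jp for all indices."""
    indices = range(3)
    for i, j, p, q in product(indices, repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(p, q, k) for k in indices)
        rhs = delta(i, p) * delta(j, q) - delta(i, q) * delta(j, p)
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print(contraction_identity_holds())
```

This identity is the one that produces the vector triple product formula when two cross products written with permutation symbols are combined.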

Chapter 3

Dyad product

Properties

(λa) ⊗ b = a ⊗ (λb) = λ(a ⊗ b)

(a + b) ⊗ c = a⊗ c + b ⊗ c

a ⊗ (b + c) = a⊗ b + a ⊗ c

Dot products of dyad with vector

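\[
\mathbf{a}\mathbf{b}\cdot\mathbf{c} = (\mathbf{b}\cdot\mathbf{c})\,\mathbf{a}, \qquad \mathbf{c}\cdot(\mathbf{a}\mathbf{b}) = (\mathbf{c}\cdot\mathbf{a})\,\mathbf{b}
\]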

Tensors from operator viewpoint

Equality of tensors

A = B ⇐⇒ ∀x, A · x = B · x

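Components

\[
a_{ij} = \mathbf{e}_i\cdot\mathbf{A}\cdot\mathbf{e}_j, \qquad
a^{ij} = \mathbf{e}^i\cdot\mathbf{A}\cdot\mathbf{e}^j, \qquad
a^{i}_{\cdot j} = \mathbf{e}^i\cdot\mathbf{A}\cdot\mathbf{e}_j, \qquad
a_{i}^{\cdot j} = \mathbf{e}_i\cdot\mathbf{A}\cdot\mathbf{e}^j
\]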

Definition of sum A + B

∀x, (A + B) · x = A · x + B · x

Definition of scalar multiple cA

∀x, (cA) · x = c(A · x)

Definition of dot product A ·B

(A ·B) · x = A · (B · x)

Definition of pre-multiplication y · A

∀x, (y ·A) · x = y · (A · x)

Definition of unit tensor E

∀x, E · x = x ·E = x

Unit tensor components

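\[
\mathbf{E} = \mathbf{e}^i\mathbf{e}_i = \mathbf{e}_j\mathbf{e}^j = g^{ij}\,\mathbf{e}_i\mathbf{e}_j = g_{ij}\,\mathbf{e}^i\mathbf{e}^j
\]

Inverse tensor

\[
\mathbf{A}\cdot\mathbf{A}^{-1} = \mathbf{E}, \qquad (\mathbf{A}\cdot\mathbf{B})^{-1} = \mathbf{B}^{-1}\cdot\mathbf{A}^{-1}
\]

Nonsingular tensor A

\[
\mathbf{A}\cdot\mathbf{x} = \mathbf{0} \implies \mathbf{x} = \mathbf{0}
\]

Determinant of a tensor

\[
\det\mathbf{A} = \left|a_{i}^{\cdot j}\right| = \left|a^{k}_{\cdot m}\right| = \frac{1}{g}\left|a_{st}\right| = g\left|a^{pq}\right|
= \frac{1}{6}\,\epsilon_{ijk}\,\epsilon^{mnp}\,a_{m}^{\cdot i}\,a_{n}^{\cdot j}\,a_{p}^{\cdot k}
\]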

Dyadic components under transformation

Transformation to reciprocal basis

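\[
a_{km} = a^{ij}\,g_{ki}\,g_{jm}
\]

More general transformation

\[
\tilde{\mathbf{e}}_i = A_i^j\,\mathbf{e}_j \implies \tilde{a}^{ij} = a^{km}\,\tilde{A}_k^i\,\tilde{A}_m^j
\]

\[
\mathbf{e}_i = \tilde{A}_i^j\,\tilde{\mathbf{e}}_j \implies a^{ij} = \tilde{a}^{km}\,A_k^i\,A_m^j
\]

\[
\tilde{A}_j^k A_k^i = \delta_j^i
\]

\[
\begin{aligned}
\mathbf{A} &= \tilde{a}^{ij}\,\tilde{\mathbf{e}}_i\tilde{\mathbf{e}}_j = \tilde{a}_{kl}\,\tilde{\mathbf{e}}^k\tilde{\mathbf{e}}^l = \tilde{a}_{i}^{\cdot j}\,\tilde{\mathbf{e}}^i\tilde{\mathbf{e}}_j = \tilde{a}^{k}_{\cdot l}\,\tilde{\mathbf{e}}_k\tilde{\mathbf{e}}^l \\
&= a^{ij}\,\mathbf{e}_i\mathbf{e}_j = a_{kl}\,\mathbf{e}^k\mathbf{e}^l = a_{i}^{\cdot j}\,\mathbf{e}^i\mathbf{e}_j = a^{k}_{\cdot l}\,\mathbf{e}_k\mathbf{e}^l
\end{aligned}
\]

where

\[
\begin{aligned}
\tilde{a}^{ij} &= \tilde{A}_k^i\,\tilde{A}_l^j\,a^{kl}, & a^{ij} &= A_k^i\,A_l^j\,\tilde{a}^{kl}, \\
\tilde{a}_{ij} &= A_i^k\,A_j^l\,a_{kl}, & a_{ij} &= \tilde{A}_i^k\,\tilde{A}_j^l\,\tilde{a}_{kl}, \\
\tilde{a}^{i}_{\cdot j} &= \tilde{A}_k^i\,A_j^l\,a^{k}_{\cdot l}, & a^{i}_{\cdot j} &= A_k^i\,\tilde{A}_j^l\,\tilde{a}^{k}_{\cdot l}, \\
\tilde{a}_{i}^{\cdot j} &= A_i^k\,\tilde{A}_l^j\,a_{k}^{\cdot l}, & a_{i}^{\cdot j} &= \tilde{A}_i^k\,A_l^j\,\tilde{a}_{k}^{\cdot l}.
\end{aligned}
\]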


More dyadic operations

Dot product

ab · cd = (b · c)ad

A · (λa + µb) = λA · a + µA · b(λA + µB) · a = λA · a + µB · a

Double dot product

ab ·· cd = (b · c)(a · d)

Scalar product of second-order tensors

ab • cd = (a · c)(b · d)

Second-order tensor topics

Transpose

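\[
\mathbf{A}^T = a^{ji}\,\mathbf{e}_i\mathbf{e}_j = a_{ji}\,\mathbf{e}^i\mathbf{e}^j = a^{j}_{\cdot i}\,\mathbf{e}^i\mathbf{e}_j = a_{j}^{\cdot i}\,\mathbf{e}_i\mathbf{e}^j
\]

\[
\mathbf{A}\cdot\mathbf{x} = \mathbf{x}\cdot\mathbf{A}^T, \qquad (\mathbf{A}^T)^T = \mathbf{A}, \qquad (\mathbf{A}\cdot\mathbf{B})^T = \mathbf{B}^T\cdot\mathbf{A}^T
\]

\[
\mathbf{a}\cdot\mathbf{C}^T\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{C}\cdot\mathbf{a}
\]

\[
\det\mathbf{A}^{-1} = (\det\mathbf{A})^{-1}, \qquad (\mathbf{A}\cdot\mathbf{B})^{-1} = \mathbf{B}^{-1}\cdot\mathbf{A}^{-1}
\]

\[
(\mathbf{A}^T)^{-1} = (\mathbf{A}^{-1})^T, \qquad (\mathbf{A}^{-1})^{-1} = \mathbf{A}
\]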


Tensors raised to powers

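\[
\mathbf{A}^0 = \mathbf{E}, \qquad \mathbf{A}^n = \mathbf{A}\cdot\mathbf{A}^{n-1} \quad \text{for } n = 1, 2, 3, \ldots
\]

\[
\mathbf{A}^{-n} = \mathbf{A}^{-n+1}\cdot\mathbf{A}^{-1} \quad \text{for } n = 2, 3, 4, \ldots
\]

\[
e^{\mathbf{A}} = \mathbf{E} + \frac{\mathbf{A}}{1!} + \frac{\mathbf{A}^2}{2!} + \frac{\mathbf{A}^3}{3!} + \cdots
\]

Symmetric and antisymmetric tensors

\[
\text{symmetric: } \mathbf{A} = \mathbf{A}^T; \quad \forall\mathbf{x},\ \mathbf{A}\cdot\mathbf{x} = \mathbf{x}\cdot\mathbf{A}
\]

\[
\text{antisymmetric: } \mathbf{A} = -\mathbf{A}^T
\]

\[
\mathbf{A} = \tfrac{1}{2}\left(\mathbf{A} + \mathbf{A}^T\right) + \tfrac{1}{2}\left(\mathbf{A} - \mathbf{A}^T\right)
\]

Eigenpair

\[
\mathbf{A}\cdot\mathbf{x} = \lambda\mathbf{x}
\]

Viete formulas for invariants

\[
\begin{aligned}
I_1(\mathbf{A}) &= \lambda_1 + \lambda_2 + \lambda_3 = \operatorname{tr}\mathbf{A} \\
I_2(\mathbf{A}) &= \lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3 = \tfrac{1}{2}\left[\operatorname{tr}^2\mathbf{A} - \operatorname{tr}\mathbf{A}^2\right] \\
I_3(\mathbf{A}) &= \lambda_1\lambda_2\lambda_3 = \det\mathbf{A}
\end{aligned}
\]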
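The invariants can be computed from the component matrix without solving for the eigenvalues. The Python sketch below is illustrative: it assumes a tensor given by its mixed-component (or Cartesian) matrix and uses a made-up triangular example whose eigenvalues 1, 2, 3 can be read off the diagonal.

```python
def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def det3(a):
    """Determinant by cofactor expansion along the first row."""
    return (
        a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
        - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
        + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0])
    )


def principal_invariants(a):
    """Return (I1, I2, I3) = (tr A, (tr^2 A - tr A^2)/2, det A)."""
    i1 = trace(a)
    i2 = (i1 ** 2 - trace(matmul(a, a))) / 2
    i3 = det3(a)
    return i1, i2, i3


# Triangular example: the eigenvalues are the diagonal entries 1, 2, 3
A = [[1, 4, 5], [0, 2, 6], [0, 0, 3]]

if __name__ == "__main__":
    print(principal_invariants(A))
```

For this example the eigenvalue expressions give 1 + 2 + 3 = 6, 1·2 + 1·3 + 2·3 = 11, and 1·2·3 = 6, in agreement with the trace and determinant formulas.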

Orthogonal tensor

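\[
\mathbf{Q}\cdot\mathbf{Q}^T = \mathbf{Q}^T\cdot\mathbf{Q} = \mathbf{E}
\]

Polar decompositions

\[
\mathbf{A} = \mathbf{S}\cdot\mathbf{Q}, \qquad \mathbf{A} = \mathbf{Q}\cdot\mathbf{S}',
\]

where Q is orthogonal and S, S′ are positive definite and symmetric.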


Chapter 4

Vector fields

Some rules for differentiating vector functions

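\[
\frac{d(\mathbf{e}_1(t) + \mathbf{e}_2(t))}{dt} = \frac{d\mathbf{e}_1(t)}{dt} + \frac{d\mathbf{e}_2(t)}{dt}
\]

\[
\frac{d(c\,\mathbf{e}(t))}{dt} = c\,\frac{d\mathbf{e}(t)}{dt}
\]

\[
\frac{d(f(t)\mathbf{e}(t))}{dt} = \frac{df(t)}{dt}\,\mathbf{e}(t) + f(t)\,\frac{d\mathbf{e}(t)}{dt}
\]

\[
\frac{d}{dt}\left(\mathbf{e}_1(t)\cdot\mathbf{e}_2(t)\right) = \mathbf{e}_1'(t)\cdot\mathbf{e}_2(t) + \mathbf{e}_1(t)\cdot\mathbf{e}_2'(t)
\]

\[
\frac{d}{dt}\left(\mathbf{e}_1(t)\times\mathbf{e}_2(t)\right) = \mathbf{e}_1'(t)\times\mathbf{e}_2(t) + \mathbf{e}_1(t)\times\mathbf{e}_2'(t)
\]

and

\[
\frac{d}{dt}[\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3(t)] = [\mathbf{e}_1'(t), \mathbf{e}_2(t), \mathbf{e}_3(t)] + [\mathbf{e}_1(t), \mathbf{e}_2'(t), \mathbf{e}_3(t)] + [\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3'(t)]
\]

where

\[
[\mathbf{e}_1(t), \mathbf{e}_2(t), \mathbf{e}_3(t)] = (\mathbf{e}_1(t)\times\mathbf{e}_2(t))\cdot\mathbf{e}_3(t)
\]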

Tangent vectors to coordinate lines

ri =∂r∂qi

(i = 1, 2, 3)

Jacobian

√g = r1 · (r2 × r3) =

∣∣∣∣∂xi

∂qj

∣∣∣∣Pointwise definition of reciprocal basis

ri · rj = δij


Definition of metric coefficients

gij = ri · rj

gij = ri · rj

gji = ri · rj = δj

i

Transformation laws

ri = Aji rj Aj

i =∂qj

∂qi

ri = Aji rj Aj

i =∂qj

∂qi

f i = Aij f

j fi = Aji fj f i = Ai

j fj fi = Aj

i fj

Differentials and the nabla operator

Metric forms

(ds)2 = dr · dr = gij dqi dqj

Nabla operator

∇ = ri ∂

∂qi

Gradient of a vector function

df = dr · ∇f = ∇fT · dr

Divergence of vector

div f = ∇ · f = ri · ∂f∂qi

Rotation and curl of vector

rot f = ∇× f = ri × ∂f∂qi

ω =12

rot f


Divergence and rotation of second-order tensor

∇ · A = ri · ∂

∂qiA ∇× A = ri × ∂

∂qiA

Differentiation of a vector function

Christoffel coefficients of the second kind

∂ri

∂qj= Γk

ij rk∂rj

∂qi= −Γj

it rt

Γkij = Γk

ji

Christoffel coefficients of the first kind

12

(∂git

∂qj+∂gtj

∂qi− ∂gji

∂qt

)= Γijt

Γijk = Γjik

Covariant differentiation

∂f∂qi

= rk ∇ifk = rj∇ifj

∇kfi =∂fi

∂qk− Γj

ki fj ∇kfi =

∂f i

∂qk+ Γi

kt ft

∇f = ri rj ∇ifj = ri rj ∇ifj

Covariant differentiation of second-order tensor

∂

∂qkA = ∇ka

ij ri rj = ∇kaij ri rj = ∇ka·ji ri rj = ∇ka

i·j ri rj


∇kaij =

∂aij

∂qk+ Γi

ks asj + Γj

ks ais ∇kaij =

∂aij

∂qk− Γs

ki asj − Γskj ais

∇ka·ji =

∂a·ji∂qk

− Γski a

·js + Γj

ks a·si ∇ka

i·j =

∂ai·j

∂qk+ Γi

ks as·j − Γs

kj ai·s

Differential operations

∇× f = rk εijk ∂fj

∂qi

Γiin =

1√g

∂√g

∂qn

∇ · f =1√g

∂

∂qi

(√gf i

)

∇× A = εkin rn∂

∂qk

(rj aij

)

∇ ·A =1√g

∂

∂qi

(√g aij rj

)

∇2f = ∇ · ∇f = gij

(∂2f

∂qi∂qj− Γk

ij

∂f

∂qk

)

∇2f = ∇j ∇j f ∇j = gij ∇i

∇2f = rj ∇i ∇i fj

∇∇ · f = ri ∇i ∇j fj

∇×∇× f = ∇∇ · f −∇2f


Orthogonal coordinate systems

Lame coefficients

(Hi)2 = gii

ri = ri/(Hi)2 (i = 1, 2, 3)

ri = ri/Hi

Differentiation in the orthogonal basis

∇ =ri

Hi

∂

∂qi

∇f = rirj

(1Hi

∂fj

∂qi− fi

HiHj

∂Hi

∂qj+ δij

fk

Hk

1Hi

∂Hi

∂qk

)

∇ · f =1

H1H2H3

(∂

∂q1(H2H3f1) +

∂

∂q2(H3H1f2) +

∂

∂q3(H1H2f3)

)

∇× f =12ri × rj

HiHj

(∂

∂qi(Hjfj) − ∂

∂qj(Hifi)

)

∇2f =1

H1H2H3

[∂

∂q1

(H2H3

H1

∂f

∂q1

)+

∂

∂q2

(H3H1

H2

∂f

∂q2

)+

∂

∂q3

(H1H2

H3

∂f

∂q3

)]

Integration formulas

Transformation of multiple integral∫V

f(x1, x2, x3) dx1 dx2 dx3 =∫

V

f(q1, q2, q3)J dq1 dq2 dq3

J =√g =

∣∣∣∣∂xi

∂qj

∣∣∣∣


Integration by parts∫V

∂f

∂xkg dx1 dx2 dx3 = −

∫V

∂g

∂xkf dx1 dx2 dx3 +

∫S

fg nk dS

Miscellaneous results∫V

∇f dV =∫

S

fn dS∫

V

∇ · f dV =∫

S

n · f dS∫V

∇f dV =∫

S

nf dS∫

V

∇× f dV =∫

S

n × f dS

∫V

∇A dV =∫

S

nA dS∫V

∇ ·A dV =∫

S

n ·A dS∫V

∇× A dV =∫

S

n× A dS

∮Γ

f · dr =∫

S

(n ×∇) · f dS

∮Γ

dr · A =∫

S

(n×∇) · A dS∮Γ

A · dr =∫

S

(n×∇) · AT dS

Chapter 5

Elementary theory of curves

Parametrization

r = r(t) or r = r(s)

Length

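\[
s = \int_a^b |\mathbf{r}'(t)|\,dt
\]

Unit tangent

\[
\boldsymbol{\tau}(s) = \mathbf{r}'(s)
\]

Equation of tangent line

\[
\mathbf{r} = \mathbf{r}(t_0) + \lambda\mathbf{r}'(t_0)
\]

\[
\frac{x - x(t_0)}{x'(t_0)} = \frac{y - y(t_0)}{y'(t_0)} = \frac{z - z(t_0)}{z'(t_0)}
\]

Curvature

\[
k_1 = |\mathbf{r}''(s)|, \qquad k_1^2 = \frac{(\mathbf{r}'(t)\times\mathbf{r}''(t))^2}{(\mathbf{r}'^2(t))^3}
\]

Radius of curvature

\[
R = 1/k_1
\]

Principal normal, binormal

\[
\boldsymbol{\nu} = \frac{\mathbf{r}''(s)}{k_1}, \qquad \boldsymbol{\beta} = \boldsymbol{\tau}\times\boldsymbol{\nu}
\]

Osculating plane

\[
[\mathbf{r} - \mathbf{r}(s_0)]\cdot\boldsymbol{\beta}(s_0) = 0
\]

\[
\begin{vmatrix}
x - x(t_0) & y - y(t_0) & z - z(t_0) \\
x'(t_0) & y'(t_0) & z'(t_0) \\
x''(t_0) & y''(t_0) & z''(t_0)
\end{vmatrix} = 0
\]

Torsion

\[
k_2 = -\frac{(\mathbf{r}'(s)\times\mathbf{r}''(s))\cdot\mathbf{r}'''(s)}{k_1^2}, \qquad
k_2 = -\frac{(\mathbf{r}'(t)\times\mathbf{r}''(t))\cdot\mathbf{r}'''(t)}{(\mathbf{r}'(t)\times\mathbf{r}''(t))^2}
\]

Serret–Frenet equations

\[
\boldsymbol{\tau}' = k_1\boldsymbol{\nu}, \qquad
\boldsymbol{\nu}' = -k_1\boldsymbol{\tau} - k_2\boldsymbol{\beta}, \qquad
\boldsymbol{\beta}' = k_2\boldsymbol{\nu}
\]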
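The curvature and torsion formulas for an arbitrary parameter t can be tried on a circular helix r(t) = (a cos t, a sin t, b t). With the sign convention of the torsion formula listed above, the expected values are k1 = a/(a² + b²) and k2 = −b/(a² + b²). The Python sketch below is illustrative; the helix and the function names are assumptions chosen for the check.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def curvature_and_torsion(d1, d2, d3):
    """Return (k1, k2) from the first three derivatives of r(t).

    k1^2 = (r' x r'')^2 / (r'^2)^3 and k2 = -(r' x r'') . r''' / (r' x r'')^2,
    following the sign convention of the formulary.
    """
    c = cross(d1, d2)
    c_squared = dot(c, c)
    k1 = math.sqrt(c_squared / dot(d1, d1) ** 3)
    k2 = -dot(c, d3) / c_squared
    return k1, k2


def helix_derivatives(a, b, t):
    """First, second, and third derivatives of r(t) = (a cos t, a sin t, b t)."""
    d1 = (-a * math.sin(t), a * math.cos(t), b)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return d1, d2, d3


if __name__ == "__main__":
    print(curvature_and_torsion(*helix_derivatives(3.0, 4.0, 0.7)))
```

Both quantities are independent of t for the helix, which is a quick sanity check that the formulas are parametrization-independent.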

Theory of surfaces

Parametrization

r = r(u1, u2)

Tangent vectors, unit normal

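\[
\mathbf{r}_i = \frac{\partial\mathbf{r}}{\partial u^i} \quad (i = 1, 2), \qquad
\mathbf{n} = \frac{\mathbf{r}_1\times\mathbf{r}_2}{|\mathbf{r}_1\times\mathbf{r}_2|}
\]

First fundamental form

\[
(ds)^2 = g_{ij}\,du^i\,du^j = E(du^1)^2 + 2F\,du^1\,du^2 + G(du^2)^2
\]

\[
E = \mathbf{r}_1\cdot\mathbf{r}_1, \qquad F = \mathbf{r}_1\cdot\mathbf{r}_2, \qquad G = \mathbf{r}_2\cdot\mathbf{r}_2
\]

Orthogonality of curves

\[
E\,du^1\,d\tilde{u}^1 + F\left(du^1\,d\tilde{u}^2 + du^2\,d\tilde{u}^1\right) + G\,du^2\,d\tilde{u}^2 = 0
\]

Area

\[
S = \int_A \sqrt{EG - F^2}\;du^1\,du^2
\]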

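Second fundamental form

\[
d^2\mathbf{r}\cdot\mathbf{n} = L(du^1)^2 + 2M\,du^1\,du^2 + N(du^2)^2 = -d\mathbf{r}\cdot d\mathbf{n}
\]

\[
L = \frac{\partial^2\mathbf{r}}{(\partial u^1)^2}\cdot\mathbf{n}, \qquad
M = \frac{\partial^2\mathbf{r}}{\partial u^1\partial u^2}\cdot\mathbf{n}, \qquad
N = \frac{\partial^2\mathbf{r}}{(\partial u^2)^2}\cdot\mathbf{n}
\]

Normal curvature, mean curvature, Gaussian curvature

\[
k_0 = k_1\cos\vartheta, \qquad \vartheta = \text{angle between } \boldsymbol{\nu} \text{ and } \mathbf{n}
\]

\[
H = \frac{1}{2}(k_{\min} + k_{\max}) = \frac{1}{2}\,\frac{LG - 2MF + NE}{EG - F^2}
\]

\[
K = k_{\min}k_{\max} = \frac{LN - M^2}{EG - F^2}
\]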

Surface given by z = f(x, y) in Cartesian coordinates

Subscripts x, y denote partial derivatives with respect to x, y respectively.

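\[
E = 1 + f_x^2, \qquad F = f_x f_y, \qquad G = 1 + f_y^2
\]

\[
EG - F^2 = 1 + f_x^2 + f_y^2, \qquad S = \int_D \sqrt{1 + f_x^2 + f_y^2}\;dx\,dy
\]

\[
\mathbf{n} = \frac{-f_x\mathbf{i}_1 - f_y\mathbf{i}_2 + \mathbf{i}_3}{\sqrt{1 + f_x^2 + f_y^2}}
\]

\[
L = \mathbf{r}_{xx}\cdot\mathbf{n} = \frac{f_{xx}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad
M = \mathbf{r}_{xy}\cdot\mathbf{n} = \frac{f_{xy}}{\sqrt{1 + f_x^2 + f_y^2}}, \qquad
N = \mathbf{r}_{yy}\cdot\mathbf{n} = \frac{f_{yy}}{\sqrt{1 + f_x^2 + f_y^2}}
\]

\[
K = \frac{f_{xx}f_{yy} - f_{xy}^2}{\left(1 + f_x^2 + f_y^2\right)^2}
\]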
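The formulas for a surface z = f(x, y) can be evaluated pointwise once the partial derivatives of f are known. The Python sketch below is illustrative: the function name is an assumption, and the sample surface is the paraboloid z = (x² + y²)/2, for which f_x = x, f_y = y, f_xx = f_yy = 1, f_xy = 0.

```python
import math


def graph_curvatures(fx, fy, fxx, fxy, fyy):
    """Return (H, K) for a surface z = f(x, y) from its partial derivatives at a point."""
    E, F, G = 1 + fx ** 2, fx * fy, 1 + fy ** 2
    root = math.sqrt(1 + fx ** 2 + fy ** 2)
    L, M, N = fxx / root, fxy / root, fyy / root
    area_factor = E * G - F ** 2
    mean = 0.5 * (L * G - 2 * M * F + N * E) / area_factor
    gauss = (L * N - M ** 2) / area_factor
    return mean, gauss


if __name__ == "__main__":
    # Paraboloid z = (x^2 + y^2)/2 at the origin and at the point (1, 0)
    print(graph_curvatures(0.0, 0.0, 1.0, 0.0, 1.0))
    print(graph_curvatures(1.0, 0.0, 1.0, 0.0, 1.0))
```

At the vertex both principal curvatures equal 1, so H = K = 1; away from the vertex the Gaussian curvature agrees with the closed-form expression (f_xx f_yy − f_xy²)/(1 + f_x² + f_y²)².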

Surface of revolution about z-axis

x = φ(u) z = ψ(u)

(ds)2 =(φ′2 + ψ′2

)du2 + φ2 dv2

−dn · dr =ψ′′φ′ − φ′′ψ′√φ′2 + ψ′2

du2 +ψ′φ√

φ′2 + ψ′2dv2

Surface gradient operator and Gauss–Ostrogradsky theorems

Surface gradient operator

∇ = ri ∂

∂ui(i = 1, 2)

Surface analogues of the Gauss–Ostrogradsky (divergence) theorem∫S

(∇ ·X + 2Hn ·X

)dS =

∮Γ

ν ·X ds

∫S

(∇X + 2HnX

)dS =

∮Γ

νX ds

∫S

(∇ × X + 2Hn× X

)dS =

∮Γ

ν × X ds

∫S

∇ × (nX) dS =∮

Γ

τX ds

Chapter 6

Stress tensor

Relation between stress tensor and stress vector

t = n · σ

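Equilibrium and motion equations

\[
\nabla\cdot\boldsymbol{\sigma} + \rho\mathbf{f} = \mathbf{0}, \qquad
\nabla\cdot\boldsymbol{\sigma} + \rho\mathbf{f} = \rho\,\frac{d^2\mathbf{u}}{dt^2}
\]

Strain tensor

\[
\boldsymbol{\varepsilon} = \frac{1}{2}\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right)
\]

Compatibility deformation conditions

\[
\nabla\times(\nabla\times\boldsymbol{\varepsilon})^T = \mathbf{0}
\]

Hooke’s law

\[
\boldsymbol{\sigma} = \mathbf{C}\cdot\!\cdot\,\boldsymbol{\varepsilon}, \qquad c_{ijmn} = c_{jimn} = c_{ijnm}, \qquad c_{ijmn} = c_{mnij}
\]

Isotropic material

\[
\boldsymbol{\sigma} = \lambda\mathbf{E}\operatorname{tr}\boldsymbol{\varepsilon} + 2\mu\boldsymbol{\varepsilon}
\]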

Recalculation of elastic moduli for isotropic body

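\[
\begin{array}{c|ccccc}
\text{Moduli} & \lambda,\ \mu & k,\ \mu & \mu,\ \nu & E,\ \nu & E,\ \mu \\ \hline
\lambda & \lambda & k - \tfrac{2}{3}\mu & \dfrac{2\mu\nu}{1-2\nu} & \dfrac{\nu E}{(1+\nu)(1-2\nu)} & \dfrac{(E-2\mu)\mu}{3\mu-E} \\[2ex]
\mu = G & \mu & \mu & \mu & \dfrac{E}{2(1+\nu)} & \mu \\[2ex]
k & \lambda + \tfrac{2}{3}\mu & k & \dfrac{2\mu(1+\nu)}{3(1-2\nu)} & \dfrac{E}{3(1-2\nu)} & \dfrac{E\mu}{3(3\mu-E)} \\[2ex]
E & \dfrac{\mu(3\lambda+2\mu)}{\lambda+\mu} & \dfrac{9k\mu}{3k+\mu} & 2\mu(1+\nu) & E & E \\[2ex]
\nu & \dfrac{\lambda}{2(\lambda+\mu)} & \dfrac{3k-2\mu}{6k+2\mu} & \nu & \nu & \dfrac{E}{2\mu} - 1
\end{array}
\]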
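The conversion table lends itself to a small round-trip check. The sketch below takes the (E, ν) column to obtain λ, µ, and k, then recovers E and ν from the (λ, µ) column; the function names and the steel-like sample values are assumptions made for illustration.

```python
def from_young_poisson(E, nu):
    """Return (lam, mu, k) computed from Young's modulus E and Poisson's ratio nu."""
    lam = nu * E / ((1 + nu) * (1 - 2 * nu))
    mu = E / (2 * (1 + nu))
    k = E / (3 * (1 - 2 * nu))
    return lam, mu, k


def to_young_poisson(lam, mu):
    """Return (E, nu) computed from the Lame moduli lam and mu."""
    E = mu * (3 * lam + 2 * mu) / (lam + mu)
    nu = lam / (2 * (lam + mu))
    return E, nu


if __name__ == "__main__":
    E, nu = 210.0e9, 0.3  # hypothetical steel-like values
    lam, mu, k = from_young_poisson(E, nu)
    print(lam, mu, k)
    print(to_young_poisson(lam, mu))
```

The bulk modulus returned by the first function also satisfies k = λ + (2/3)µ, which ties the (E, ν) column to the (λ, µ) column of the same row.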


Principal equations in Cartesian coordinates

Equations of motion

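\[
\frac{\partial\sigma_{ij}}{\partial x_i} + \rho f_j = \rho\,\frac{\partial^2 u_j}{\partial t^2}
\]

\[
\begin{aligned}
\frac{\partial\sigma_{11}}{\partial x_1} + \frac{\partial\sigma_{21}}{\partial x_2} + \frac{\partial\sigma_{31}}{\partial x_3} + \rho f_1 &= \rho\,\frac{\partial^2 u_1}{\partial t^2} \\
\frac{\partial\sigma_{12}}{\partial x_1} + \frac{\partial\sigma_{22}}{\partial x_2} + \frac{\partial\sigma_{32}}{\partial x_3} + \rho f_2 &= \rho\,\frac{\partial^2 u_2}{\partial t^2} \\
\frac{\partial\sigma_{13}}{\partial x_1} + \frac{\partial\sigma_{23}}{\partial x_2} + \frac{\partial\sigma_{33}}{\partial x_3} + \rho f_3 &= \rho\,\frac{\partial^2 u_3}{\partial t^2}
\end{aligned}
\]

Strains

\[
\begin{aligned}
\varepsilon_{11} &= \frac{\partial u_1}{\partial x_1}, & \varepsilon_{12} &= \frac{1}{2}\left(\frac{\partial u_1}{\partial x_2} + \frac{\partial u_2}{\partial x_1}\right), \\
\varepsilon_{22} &= \frac{\partial u_2}{\partial x_2}, & \varepsilon_{13} &= \frac{1}{2}\left(\frac{\partial u_1}{\partial x_3} + \frac{\partial u_3}{\partial x_1}\right), \\
\varepsilon_{33} &= \frac{\partial u_3}{\partial x_3}, & \varepsilon_{23} &= \frac{1}{2}\left(\frac{\partial u_2}{\partial x_3} + \frac{\partial u_3}{\partial x_2}\right).
\end{aligned}
\]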

Principal equations in a curvilinear coordinate system

Equations of motion

∂

∂qi

(√gσij

)+ Γj

mnσmn + ρ

√gf j = ρ

√g∂2uj

∂t2

Strains

ε = εstrsrt εst =12

(∂us

∂qt+∂ut

∂qs

)− Γr

stur


Principal equations in cylindrical coordinates

Equations of motion

∂σrr

∂r+σrr − σφφ

r+

1r

∂σrφ

∂φ+∂σzr

∂z+ ρfr = ρ

∂2ur

∂t2

∂σrφ

∂r+ 2

σrφ

r+

1r

∂σφφ

∂φ+∂σzφ

∂z+ ρfφ = ρ

∂2uφ

∂t2

∂σrz

∂r+σrz

r+

1r

∂σzφ

∂φ+∂σzz

∂z+ ρfz = ρ

∂2uz

∂t2

Strains

εrr =∂ur

∂rεrφ =

12

(∂uφ

∂r+

1r

∂ur

∂φ− uφ

r

)εφφ =

1r

∂uφ

∂φ+ur

rεφz =

12

(1r

∂uz

∂φ+∂uφ

∂z

)εzz =

∂uz

∂zεrz =

12

(∂ur

∂z+∂uz

∂r

)

Equilibrium equations in displacements

µ

(∆ur − ur

r2− 2r2∂uφ

∂φ

)+ (λ+ µ)

∂

∂r

[1r

∂

∂r(rur) +

1r

∂uφ

∂φ+∂uz

∂z

]+ ρfr = 0

µ

(∆uφ − uφ

r2+

2r2∂ur

∂φ

)+ (λ+ µ)

1r

∂

∂φ

[1r

∂

∂r(rur) +

1r

∂uφ

∂φ+∂uz

∂z

]+ ρfφ = 0

µ∆uz + (λ+ µ)∂

∂r

[1r

∂

∂r(rur) +

1r

∂uφ

∂φ+∂uz

∂z

]+ ρfz = 0

where

∆ =∂2

∂r2+

1r

∂

∂r+

1r2

∂2

∂φ2+

∂2

∂z2


Principal equations in spherical coordinates

Equations of motion

∂σrr

∂r+

1r

∂σrθ

∂θ+

1r sin θ

∂σrφ

∂φ

+1r

(2σrr − σθθ − σφφ + σrθ cot θ) + ρfr = ρ∂2ur

∂t2

∂σrφ

∂r+

1r

∂σθφ

∂θ+

1r sin θ

∂σφφ

∂φ

+1r

(3σrφ + 2σθφ cot θ) + ρfφ = ρ∂2uφ

∂t2

∂σrθ

∂r+

1r

∂σθθ

∂θ+

1r sin θ

∂σθφ

∂φ

+1r

[(σθθ − σφφ) cot θ + 3σrθ] + ρfθ = ρ∂2uθ

∂t2

Strains

εrr =∂ur

∂r

εθθ =1r

∂uθ

∂θ+ur

r

εφφ =1

r sin θ∂uφ

∂φ+uθ

rcot θ +

ur

r

εθφ =12r

(∂uφ

∂θ+

1sin θ

∂uθ

∂φ− uφ cot θ

)εrφ =

12

(1

r sin θ∂ur

∂φ+∂uφ

∂r− uφ

r

)εrθ =

12

(∂uθ

∂r+

1r

∂ur

∂θ− uθ

r

)


Equilibrium equations in displacements

µ

∆ur − 2

r2

[ur +

1sin θ

∂

∂θ(uθ sin θ) +

1sin θ

∂uφ

∂φ

]+ (λ+ µ)

∂

∂r

[1r2

∂

∂r(r2ur) +

1r sin θ

∂

∂θ(uθ sin θ) +

1r sin θ

∂uφ

∂φ

]+ ρfr = 0

µ

∆uθ − 2

r2

[∂ur

∂θ− 1

2 sin2 θuθ − cos θ

sin2 θ

∂uφ

∂φ

]+λ+ µ

r

∂

∂θ

[1r2

∂

∂r(r2ur) +

1r sin θ

∂

∂θ(uθ sin θ) +

1r sin θ

∂uφ

∂φ

]+ ρfθ = 0

µ

∆uφ +

2r2 sin θ

[∂ur

∂φ+ cot

∂uθ

∂φ− uφ

2 sin θ

]+λ+ µ

r sin θ∂

∂φ

[1r2

∂

∂r(r2ur) +

1r sin θ

∂

∂θ(uθ sin θ) +

1r sin θ

∂uφ

∂φ

]+ ρfφ = 0

where

∆ =1r2

∂

∂r

(r2∂

∂r

)+

1r2 sin θ

∂

∂θ

(sin θ

∂

∂θ

)+

1r2 sin θ

∂2

∂φ2

Chapter 7

Formulas of Surface Theory

Position vector of the middle surface and base vectors

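\[
\boldsymbol{\rho} = \boldsymbol{\rho}(q^1, q^2), \qquad
\boldsymbol{\rho}_1 = \frac{\partial\boldsymbol{\rho}}{\partial q^1}, \qquad
\boldsymbol{\rho}_2 = \frac{\partial\boldsymbol{\rho}}{\partial q^2}, \qquad
\boldsymbol{\rho}_\alpha\cdot\boldsymbol{\rho}^\beta = \delta_\alpha^\beta \quad (\alpha, \beta = 1, 2)
\]

Normal to mid-surface

\[
\mathbf{n} = \frac{\boldsymbol{\rho}_1\times\boldsymbol{\rho}_2}{|\boldsymbol{\rho}_1\times\boldsymbol{\rho}_2|}
\]

Metric tensor

\[
\mathbf{A} = \boldsymbol{\rho}^\alpha\boldsymbol{\rho}_\alpha = \mathbf{E} - \mathbf{n}\mathbf{n}
\]

Nabla operator

\[
\tilde{\nabla} = \boldsymbol{\rho}^\alpha\,\frac{\partial}{\partial q^\alpha}
\]

Curvature tensor

\[
\mathbf{B} = b_{\alpha\beta}\,\boldsymbol{\rho}^\alpha\boldsymbol{\rho}^\beta = -\tilde{\nabla}\mathbf{n}
\]

Derivatives of $\boldsymbol{\rho}_\alpha$ and $\boldsymbol{\rho}^\alpha$

\[
\frac{\partial\boldsymbol{\rho}_\alpha}{\partial q^\beta} = \Gamma^\gamma_{\alpha\beta}\,\boldsymbol{\rho}_\gamma + b_{\alpha\beta}\,\mathbf{n}, \qquad
\frac{\partial\boldsymbol{\rho}^\alpha}{\partial q^\beta} = -\Gamma^\alpha_{\beta\gamma}\,\boldsymbol{\rho}^\gamma + b^\alpha_\beta\,\mathbf{n}
\]

Gradient of vector $\tilde{\mathbf{v}}$

\[
\tilde{\nabla}\tilde{\mathbf{v}} = (\tilde{\nabla}\mathbf{v})\cdot\mathbf{A} - w\mathbf{B} + (\tilde{\nabla}w + \mathbf{B}\cdot\mathbf{v})\,\mathbf{n}
\]

\[
\tilde{\mathbf{v}} = \mathbf{v} + w\mathbf{n}, \qquad
\mathbf{v} = v^1(q^1, q^2)\,\boldsymbol{\rho}_1 + v^2(q^1, q^2)\,\boldsymbol{\rho}_2, \qquad
w = v^3 = v_3 = \tilde{\mathbf{v}}\cdot\mathbf{n}
\]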

Divergence of second-order tensor T

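\[
\begin{aligned}
\tilde{\nabla}\cdot\mathbf{T} ={}& \boldsymbol{\rho}^\gamma\,\frac{\partial}{\partial q^\gamma}\cdot\left(T^{\alpha\beta}\boldsymbol{\rho}_\alpha\boldsymbol{\rho}_\beta + T^{3\beta}\mathbf{n}\boldsymbol{\rho}_\beta + T^{\alpha 3}\boldsymbol{\rho}_\alpha\mathbf{n} + T^{33}\mathbf{n}\mathbf{n}\right) \\
={}& \frac{\partial T^{\alpha\beta}}{\partial q^\alpha}\,\boldsymbol{\rho}_\beta + T^{\alpha\beta}\,\boldsymbol{\rho}^\gamma\cdot\frac{\partial\boldsymbol{\rho}_\alpha}{\partial q^\gamma}\,\boldsymbol{\rho}_\beta + T^{\alpha\beta}\,\frac{\partial\boldsymbol{\rho}_\beta}{\partial q^\alpha} + T^{3\beta}\,\boldsymbol{\rho}^\gamma\cdot\frac{\partial\mathbf{n}}{\partial q^\gamma}\,\boldsymbol{\rho}_\beta \\
&+ T^{33}\,\boldsymbol{\rho}^\gamma\cdot\frac{\partial\mathbf{n}}{\partial q^\gamma}\,\mathbf{n} + \frac{\partial T^{\alpha 3}}{\partial q^\alpha}\,\mathbf{n} + T^{\alpha 3}\,\boldsymbol{\rho}^\gamma\cdot\frac{\partial\boldsymbol{\rho}_\alpha}{\partial q^\gamma}\,\mathbf{n} + T^{\alpha 3}\,\frac{\partial\mathbf{n}}{\partial q^\alpha} \\
={}& \frac{\partial T^{\alpha\beta}}{\partial q^\alpha}\,\boldsymbol{\rho}_\beta + T^{\alpha\beta}\Gamma^\gamma_{\alpha\gamma}\,\boldsymbol{\rho}_\beta + T^{\alpha\beta}\Gamma^\gamma_{\beta\alpha}\,\boldsymbol{\rho}_\gamma + T^{\alpha\beta}b_{\alpha\beta}\,\mathbf{n} - T^{3\beta}b^\gamma_\gamma\,\boldsymbol{\rho}_\beta \\
&- T^{33}b^\gamma_\gamma\,\mathbf{n} + \frac{\partial T^{\alpha 3}}{\partial q^\alpha}\,\mathbf{n} + T^{\alpha 3}\Gamma^\gamma_{\alpha\gamma}\,\mathbf{n} - T^{\alpha 3}b^\beta_\alpha\,\boldsymbol{\rho}_\beta
\end{aligned}
\]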

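Kinematics in a Neighborhood of a Shell Mid-Surface

Position vector

\[
\mathbf{r} = \mathbf{r}(q^1, q^2, z) = \boldsymbol{\rho}(q^1, q^2) + z\mathbf{n}
\]

Basis vectors

\[
\mathbf{r}_\alpha = \frac{\partial\mathbf{r}}{\partial q^\alpha} = \boldsymbol{\rho}_\alpha + z\,\frac{\partial\mathbf{n}}{\partial q^\alpha} = (\mathbf{A} - z\mathbf{B})\cdot\boldsymbol{\rho}_\alpha, \qquad
\mathbf{r}^\alpha = (\mathbf{A} - z\mathbf{B})^{-1}\cdot\boldsymbol{\rho}^\alpha, \qquad
\mathbf{r}^3 = \mathbf{r}_3 = \mathbf{n}
\]

Spatial nabla operator

\[
\nabla = \mathbf{r}^\alpha\,\frac{\partial}{\partial q^\alpha} + \mathbf{n}\,\frac{\partial}{\partial z} = (\mathbf{A} - z\mathbf{B})^{-1}\cdot\tilde{\nabla} + \mathbf{n}\,\frac{\partial}{\partial z}
\]

Shell Equilibrium Equations

Stress resultant tensor and couple stress tensor

\[
\mathbf{T} = [\![(\mathbf{A} - z\mathbf{B})^{-1}\cdot\boldsymbol{\sigma}]\!], \qquad
\mathbf{M} = -[\![(\mathbf{A} - z\mathbf{B})^{-1}\cdot z\boldsymbol{\sigma}\times\mathbf{n}]\!]
\]

where

\[
[\![f]\!] = \int_{-h/2}^{h/2} G f\,dz
\]

Equilibrium equations

\[
\tilde{\nabla}\cdot\mathbf{T} + \mathbf{q} = \mathbf{0}, \qquad \mathbf{T}_\times = \boldsymbol{\rho}_\alpha\times[\![\boldsymbol{\sigma}^\alpha]\!]
\]

\[
\tilde{\nabla}\cdot\mathbf{M} + \mathbf{T}_\times + \mathbf{m} = \mathbf{0}, \qquad \boldsymbol{\sigma}^\alpha = \mathbf{r}^\alpha\cdot\boldsymbol{\sigma}
\]

Strain measures

\[
\boldsymbol{\varepsilon} = \frac{1}{2}\left[(\tilde{\nabla}\mathbf{v})\cdot\mathbf{A} + \mathbf{A}\cdot(\tilde{\nabla}\mathbf{v})^T\right] - w\mathbf{B}
\]

\[
\text{æ} = -\frac{1}{2}\left((\tilde{\nabla}\boldsymbol{\vartheta})\cdot\mathbf{A} + \mathbf{A}\cdot(\tilde{\nabla}\boldsymbol{\vartheta})^T\right)
\]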


Constitutive equations for Kirchhoff–Love shells

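\[
\mathbf{T}\cdot\mathbf{A} = \frac{Eh}{1+\nu}\left[\boldsymbol{\varepsilon} + \frac{\nu}{1-\nu}\,\mathbf{A}\operatorname{tr}\boldsymbol{\varepsilon}\right], \qquad
\mathbf{M} = -\frac{Eh^3}{12(1+\nu)}\left[\text{æ} + \frac{\nu}{1-\nu}\,\mathbf{A}\operatorname{tr}\text{æ}\right]\times\mathbf{n}
\]

Component representation

\[
\mathbf{T}\cdot\mathbf{A} = T^{\alpha\beta}\boldsymbol{\rho}_\alpha\boldsymbol{\rho}_\beta, \qquad \boldsymbol{\varepsilon} = \varepsilon_{\alpha\beta}\,\boldsymbol{\rho}^\alpha\boldsymbol{\rho}^\beta
\]

\[
\mathbf{M} = -M^{\alpha\beta}\boldsymbol{\rho}_\alpha\boldsymbol{\rho}_\beta\times\mathbf{n}, \qquad \text{æ} = \text{æ}_{\alpha\beta}\,\boldsymbol{\rho}^\alpha\boldsymbol{\rho}^\beta
\]

\[
T^{\alpha\beta} = [\![\sigma^{\alpha\beta}]\!] = \int_{-h/2}^{h/2}\sigma^{\alpha\beta}\,dz, \qquad
M^{\alpha\beta} = [\![z\sigma^{\alpha\beta}]\!] = \int_{-h/2}^{h/2} z\sigma^{\alpha\beta}\,dz
\]

\[
T^{\alpha\beta} = \frac{Eh}{1+\nu}\left[\varepsilon^{\alpha\beta} + \frac{\nu}{1-\nu}\,a^{\alpha\beta}\varepsilon^\gamma_\gamma\right], \qquad
M^{\alpha\beta} = \frac{Eh^3}{12(1+\nu)}\left[\text{æ}^{\alpha\beta} + \frac{\nu}{1-\nu}\,a^{\alpha\beta}\,\text{æ}^\gamma_\gamma\right]
\]

Shell surface energy density

\[
U = \frac{Eh}{2(1+\nu)}\left[\boldsymbol{\varepsilon}\cdot\!\cdot\,\boldsymbol{\varepsilon} + \frac{\nu}{1-\nu}\operatorname{tr}^2\boldsymbol{\varepsilon}\right]
+ \frac{Eh^3}{24(1+\nu)}\left[\text{æ}\cdot\!\cdot\,\text{æ} + \frac{\nu}{1-\nu}\operatorname{tr}^2\text{æ}\right]
\]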

Common principal boundary conditions

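(1) Clamped (fixed) edge

\[
\mathbf{v}\big|_\omega = \mathbf{0}, \qquad \frac{\partial w}{\partial\nu}\bigg|_\omega = 0
\]

(2) Simple support edge

\[
\mathbf{v}\big|_\omega = \mathbf{0}, \qquad \boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_\omega = 0
\]

(3) Free edge

\[
\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{A}\big|_\omega = \mathbf{0}
\]

\[
\boldsymbol{\nu}\cdot\mathbf{T}\cdot\mathbf{n}\big|_\omega - \frac{\partial}{\partial s}\left(\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\tau}\right)\bigg|_\omega = 0
\]

\[
\boldsymbol{\nu}\cdot(\mathbf{M}\times\mathbf{n})\cdot\boldsymbol{\nu}\big|_\omega = 0
\]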

Deflection of a plate

Resultant moments

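\[
\begin{aligned}
M_{11} &= \frac{Eh^3}{12(1-\nu^2)}\left(\text{æ}_{11} + \nu\,\text{æ}_{22}\right), & \text{æ}_{11} &= -\frac{\partial^2 w}{\partial x_1^2}, \\
M_{22} &= \frac{Eh^3}{12(1-\nu^2)}\left(\text{æ}_{22} + \nu\,\text{æ}_{11}\right), & \text{æ}_{22} &= -\frac{\partial^2 w}{\partial x_2^2}, \\
M_{12} &= \frac{Eh^3}{12(1+\nu)}\,\text{æ}_{12}, & \text{æ}_{12} &= -\frac{\partial^2 w}{\partial x_1\partial x_2}.
\end{aligned}
\]

Bending stiffness

\[
D = \frac{Eh^3}{12(1-\nu^2)}
\]

Equilibrium equation

\[
D\nabla^4 w = q_n
\]

Equilibrium equation in Cartesian coordinates

\[
D\left(\frac{\partial^4 w}{\partial x_1^4} + 2\frac{\partial^4 w}{\partial x_1^2\partial x_2^2} + \frac{\partial^4 w}{\partial x_2^4}\right) = q_n
\]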
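A quick way to exercise the plate equation is a sinusoidal load on a rectangular plate. For a trial deflection w = W sin(πx₁/a) sin(πx₂/b) the biharmonic operator gives ∇⁴w = π⁴(1/a² + 1/b²)² w, so a load q_n = q₀ sin(πx₁/a) sin(πx₂/b) is balanced when W = q₀ / [Dπ⁴(1/a² + 1/b²)²]. The Python sketch below encodes only this algebra; the function names, the plate dimensions, and the material data are illustrative assumptions.

```python
import math


def bending_stiffness(E, h, nu):
    """Bending stiffness D = E h^3 / (12 (1 - nu^2))."""
    return E * h ** 3 / (12.0 * (1.0 - nu ** 2))


def sinusoidal_amplitude(q0, a, b, D):
    """Deflection amplitude W for the load q0 sin(pi x1/a) sin(pi x2/b)."""
    return q0 / (D * math.pi ** 4 * (1.0 / a ** 2 + 1.0 / b ** 2) ** 2)


if __name__ == "__main__":
    # Hypothetical data: steel-like plate, 1 m x 2 m, 10 mm thick, 1 kPa load amplitude
    D = bending_stiffness(210.0e9, 0.01, 0.3)
    W = sinusoidal_amplitude(1.0e3, 1.0, 2.0, D)
    print(D, W)
```

Because D scales with h³, halving the thickness increases the deflection amplitude eightfold for the same load.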

Appendix B

Hints and Answers

Chapter 1 Exercises

Exercise 1.1 The given equation states that a, b, c are not coplanar; because a has a nonzero component perpendicular to the plane of b and c, it cannot be in this plane.

Chapter 1 Problems

Problem 1.1 (a) (5, 7, 4); (b) (7, 7,−2); (c) (17, 11,−4).

Problem 1.2 (a) (0, 1, 0); (b) (1,−10, 2); (c) (12, 8, 6).

Problem 1.3 (a) x = − 12a+2b; (b) x = 2b− 2a; (c) x = 1

12 (c−a)+ 43b.

Problem 1.4 (a) 2, (0, 1,−1); (b) 11, (−7, 5,−1); (c) 4, (1,−1, 0); (d) −2,(−2,−3,−1); (e) −1, (−1,−1,−1); (f) 1, (0, 0,−5); (g) 1, (5,−3,−1); (h)−1, (−4,−2,−7); (i) −8, (−16, 24,−14).

Problem 1.9 (a) 1; (b) −1; (c) 0.

Problem 1.10 (a) −1; (b) −4; (c) 5; (d) 5; (e) −179; (f) 0; (g) 0; (h) 13;(i) −2.

Problem 1.11 (a) −1; (b) 4; (c) 3 ; (d) −7; (e) −826; (f) −18; (g) 0; (h)105; (i) 0.



Chapter 2 Exercises

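Exercise 2.1 Suppose that $\mathbf{e}^i$ and $\mathbf{f}^i$ are two vectors such that $x^i = \mathbf{x}\cdot\mathbf{e}^i$ and $x^i = \mathbf{x}\cdot\mathbf{f}^i$ for all x. Then $\mathbf{x}\cdot(\mathbf{e}^i - \mathbf{f}^i) = 0$ for all x, from which it follows that $\mathbf{e}^i - \mathbf{f}^i = \mathbf{0}$. So $\mathbf{f}^i = \mathbf{e}^i$.

Exercise 2.2 We must show that the equation $\alpha_1\mathbf{e}^1 + \alpha_2\mathbf{e}^2 + \alpha_3\mathbf{e}^3 = \mathbf{0}$ implies $\alpha_1 = \alpha_2 = \alpha_3 = 0$. To get $\alpha_1 = 0$, for example, we simply dot-multiply the equation by $\mathbf{e}_1$ and use $\mathbf{e}_i\cdot\mathbf{e}^j = \delta_i^j$.

Exercise 2.3 We have

\[
\begin{aligned}
\mathbf{e}^2\times\mathbf{e}^3 &= \frac{1}{V^2}(\mathbf{e}_3\times\mathbf{e}_1)\times(\mathbf{e}_1\times\mathbf{e}_2) \\
&= \frac{1}{V^2}\left\{[\mathbf{e}_2\cdot(\mathbf{e}_3\times\mathbf{e}_1)]\,\mathbf{e}_1 - [\mathbf{e}_1\cdot(\mathbf{e}_3\times\mathbf{e}_1)]\,\mathbf{e}_2\right\} \\
&= \frac{1}{V^2}\,\mathbf{e}_1[\mathbf{e}_2\cdot(\mathbf{e}_3\times\mathbf{e}_1)]
\end{aligned}
\]

by the vector triple product identity, hence

\[
V' = \frac{1}{V^2}\,\mathbf{e}^1\cdot\mathbf{e}_1[\mathbf{e}_2\cdot(\mathbf{e}_3\times\mathbf{e}_1)] = \frac{1}{V^2}[\mathbf{e}_1\cdot(\mathbf{e}_2\times\mathbf{e}_3)] = \frac{1}{V}.
\]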

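Exercise 2.4 (b) First use common properties of determinants to establish the identity

\[
[\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})][\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w})] =
\begin{vmatrix}
\mathbf{a}\cdot\mathbf{u} & \mathbf{a}\cdot\mathbf{v} & \mathbf{a}\cdot\mathbf{w} \\
\mathbf{b}\cdot\mathbf{u} & \mathbf{b}\cdot\mathbf{v} & \mathbf{b}\cdot\mathbf{w} \\
\mathbf{c}\cdot\mathbf{u} & \mathbf{c}\cdot\mathbf{v} & \mathbf{c}\cdot\mathbf{w}
\end{vmatrix}. \tag{*}
\]

Write

\[
[\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})][\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w})] =
\begin{vmatrix}
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3 \\
c_1 & c_2 & c_3
\end{vmatrix}
\begin{vmatrix}
u_1 & u_2 & u_3 \\
v_1 & v_2 & v_3 \\
w_1 & w_2 & w_3
\end{vmatrix}
\]

in a Cartesian frame. Then

\[
\begin{aligned}
[\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})][\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w})] &=
\begin{vmatrix}
a_1 & a_2 & a_3 \\
b_1 & b_2 & b_3 \\
c_1 & c_2 & c_3
\end{vmatrix}
\begin{vmatrix}
u_1 & v_1 & w_1 \\
u_2 & v_2 & w_2 \\
u_3 & v_3 & w_3
\end{vmatrix} \\
&=
\begin{vmatrix}
a_1u_1 + a_2u_2 + a_3u_3 & a_1v_1 + a_2v_2 + a_3v_3 & a_1w_1 + a_2w_2 + a_3w_3 \\
b_1u_1 + b_2u_2 + b_3u_3 & b_1v_1 + b_2v_2 + b_3v_3 & b_1w_1 + b_2w_2 + b_3w_3 \\
c_1u_1 + c_2u_2 + c_3u_3 & c_1v_1 + c_2v_2 + c_3v_3 & c_1w_1 + c_2w_2 + c_3w_3
\end{vmatrix}.
\end{aligned}
\]

(Note that our intermediate steps occurred in a Cartesian frame, but the final result is still frame independent.) Finally, write

\[
V^2 = [\mathbf{e}_1\cdot(\mathbf{e}_2\times\mathbf{e}_3)]^2 =
\begin{vmatrix}
\mathbf{e}_1\cdot\mathbf{e}_1 & \mathbf{e}_1\cdot\mathbf{e}_2 & \mathbf{e}_1\cdot\mathbf{e}_3 \\
\mathbf{e}_2\cdot\mathbf{e}_1 & \mathbf{e}_2\cdot\mathbf{e}_2 & \mathbf{e}_2\cdot\mathbf{e}_3 \\
\mathbf{e}_3\cdot\mathbf{e}_1 & \mathbf{e}_3\cdot\mathbf{e}_2 & \mathbf{e}_3\cdot\mathbf{e}_3
\end{vmatrix} = g.
\]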

Exercise 2.5

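(a) $|\mathbf{x}|^2 = x_k x^k = g_{ij}x^i x^j = g^{mn}x_m x_n$.
(b) $\mathbf{x}\cdot\mathbf{y} = x^k y_k = x_k y^k = g_{ij}x^i y^j = g^{mn}x_m y_n$.

Exercise 2.6 The transformation matrix is of the form

\[
\begin{pmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix},
\]

where θ is the angle of rotation.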

Exercise 2.7 Equate two different expressions for x as is done in the text.

Exercise 2.8

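\[
\epsilon^{ijk} =
\begin{cases}
+V', & (i, j, k) \text{ an even permutation of } (1, 2, 3), \\
-V', & (i, j, k) \text{ an odd permutation of } (1, 2, 3), \\
0, & \text{two or more indices equal.}
\end{cases}
\]

(Recall that $V' = 1/V$.) Then use the determinantal identity established in Exercise 2.4: put $\mathbf{a} = \mathbf{e}_i$, $\mathbf{b} = \mathbf{e}_j$, $\mathbf{c} = \mathbf{e}_k$, $\mathbf{u} = \mathbf{e}^p$, $\mathbf{v} = \mathbf{e}^q$, $\mathbf{w} = \mathbf{e}^r$. To prove the vector triple product identity we write

\[
\begin{aligned}
\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) &= \mathbf{e}^i\,\epsilon_{ijk}\,a^j\,\epsilon^{kpq}\,b_p c_q \\
&= \mathbf{e}^i\left(\delta_i^p\delta_j^q - \delta_i^q\delta_j^p\right)a^j b_p c_q \\
&= \mathbf{e}^i b_i\,(a^q c_q) - \mathbf{e}^i c_i\,(a^p b_p).
\end{aligned}
\]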

Exercise 2.9 Using the scalar triple product identity and (2.13) we have

(a × b) · (c × d) = a · [b× (c × d)]

= a · [(b · d)c − (b · c)d],

and the result follows.


Chapter 2 Problems

Problem 2.1 (a) e1 = (2i1 + 3i2 − 2i3)/9, e2 = (−3i1 + 3i2 + i3)/9,e3 = (5i1 − 6i2 + 4i3)/9; (b) e1 = −i1 + i3, e2 = (−5i1 − 3i2 + 7i3)/13,e3 = (12i1 +2i2−9i3)/13; (c) e1 = (i1 + i2)/2, e2 = (i1− i2)/2, e3 = 1/3i3;(d) e1 = cosφi1 + sinφi2, e2 = − sinφi1 + cosφi2, e3 = i3.

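Problem 2.2 Using the formulas $\mathbf{e}^1 = (\mathbf{e}_2\times\mathbf{e}_3)/V$, etc., first find

\[
\mathbf{e}^1 = (1, 1, 0), \qquad \mathbf{e}^2 = (-1, 0, -1), \qquad \mathbf{e}^3 = (-1, -2, 2).
\]

Then, with regard for the formula $A_i^j = \tilde{\mathbf{e}}_i\cdot\mathbf{e}^j$, arrive at

\[
\begin{pmatrix}
3 & -1 & -6 \\
2 & -3 & 2 \\
1 & -2 & 1
\end{pmatrix}.
\]

Problem 2.3

\[
\frac{1}{3}
\begin{pmatrix}
5 & 14 & 2 \\
-10 & -10 & -1 \\
2 & -1 & -1
\end{pmatrix}.
\]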

Problem 2.4 (a) ak; (b) aiai; (c) 3; (d) δk

i ; (e) 3; (f) 3.

Problem 2.7 (a) 0; (b) −6; (c) εinm; (d) 0; (e) 0; (f) 2δnk .

Problem 2.8 (a × b) × c = b(a · c) − a(b · c).

Chapter 3 Exercises

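Exercise 3.1 (b)

\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.
\]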

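Exercise 3.2 E can be written in any of the forms

\[
\mathbf{E} = e^{ij}\,\mathbf{e}_i\mathbf{e}_j = e_{ij}\,\mathbf{e}^i\mathbf{e}^j = e^{i}_{\cdot j}\,\mathbf{e}_i\mathbf{e}^j = e_{i}^{\cdot j}\,\mathbf{e}^i\mathbf{e}_j.
\]

Let us choose the first form as an example. Equation (3.4) with $\mathbf{x} = \mathbf{e}^k$ implies $e^{ij}\,\mathbf{e}_i\mathbf{e}_j\cdot\mathbf{e}^k = \mathbf{e}^k$. Pre-dotting this with $\mathbf{e}^m$ yields $e^{mk} = g^{mk}$.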

Exercise 3.3 (A ·B) · (B−1 ·A−1) = A · (B ·B−1) ·A−1 = A ·E ·A−1 =A ·A−1 = E.

Exercise 3.4 (a) B = ATBA. (b) B = ABAT .


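Exercise 3.5 We are given $\tilde{\mathbf{e}}_i = A_i^j\,\mathbf{e}_j$. Now it is necessary to understand what the needed tensor does. We denote it by A. It is the map $\mathbf{y} = \mathbf{A}\cdot\mathbf{x}$ that must take $\mathbf{x} = \mathbf{e}_i$ into $\mathbf{y} = A_i^j\,\mathbf{e}_j$ for each i = 1, 2, 3. Let us write out these equations:

\[
\mathbf{A}\cdot\mathbf{e}_i = A_i^j\,\mathbf{e}_j.
\]

The decisive step is to suppose that A should be written in a “mixed basis” as $\mathbf{A} = a^{m}_{\cdot n}\,\mathbf{e}_m\mathbf{e}^n$. We have

\[
a^{m}_{\cdot n}\,\mathbf{e}_m\mathbf{e}^n\cdot\mathbf{e}_i = A_i^j\,\mathbf{e}_j
\]

and it follows that

\[
a^{m}_{\cdot i}\,\mathbf{e}_m = A_i^j\,\mathbf{e}_j.
\]

Thus (because the indices are dummy)

\[
a^{j}_{\cdot i} = A_i^j
\]

and so the tensor is $\mathbf{A} = a^{m}_{\cdot n}\,\mathbf{e}_m\mathbf{e}^n$ where $a^{m}_{\cdot n} = A_n^m$.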

Exercise 3.7

(a) aijeiej ·· ekek = aijδkj gik = a ·k

k .

(b) aijeiej ·· bkneken = aijbknδj

kδin = aikb

ki. Having obtained this we mayuse the metric tensor to raise and lower indices and thereby obtainother forms; for example, aikb

ki = ap·kgpib

ki = ap·kb

k·p.

Exercise 3.10

(a) Write

A = aijeiej , B = bkneken,

AT = ajieiej , BT = bnkeken,

and show that both sides can be written as ajpb ·kp ekej .(b) b · C · a = b · (C · a) = b · (a · CT ) = (a · CT ) · b = a ·CT · b.(c) See [Lurie (2005)].(d) The relation (AT )−1 = (A−1)T follows from the two relations

AT · (A−1)T = (A−1 · A)T = ET = E,

(A−1)T · AT = (A · A−1)T = ET = E.


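Exercise 3.11

\[
\begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{12} & a_{22} & a_{23} \\
a_{13} & a_{23} & a_{33}
\end{pmatrix}, \qquad
\begin{pmatrix}
0 & a_{12} & a_{13} \\
-a_{12} & 0 & a_{23} \\
-a_{13} & -a_{23} & 0
\end{pmatrix}.
\]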

Exercise 3.12 With A = aijeiej and B = bkneken we have

A ··B = aijbkngjkgin. (*)

Since A is symmetric and B is antisymmetric, we may also write A ··B =−ajibnkgjkgin; let us swap i with j and k with n in this expression to get

A ··B = −aijbkngingjk. (**)

Adding (*) and (**) we obtain 2(A ··B) = 0.

Exercise 3.13

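\[
\begin{aligned}
\mathbf{x}\cdot\left[\tfrac{1}{2}(\mathbf{A} + \mathbf{A}^T)\right]\cdot\mathbf{x} &= \tfrac{1}{2}\left[\mathbf{x}\cdot\mathbf{A}\cdot\mathbf{x} + \mathbf{x}\cdot\mathbf{A}^T\cdot\mathbf{x}\right] \\
&= \tfrac{1}{2}\left[\mathbf{x}\cdot\mathbf{A}\cdot\mathbf{x} + \mathbf{x}\cdot\mathbf{A}\cdot\mathbf{x}\right] \\
&= \mathbf{x}\cdot\mathbf{A}\cdot\mathbf{x}.
\end{aligned}
\]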

Exercise 3.15 They are (1) x = a with λ = b · a, and (2) any x that isperpendicular to b, with λ = 0.

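Exercise 3.16 The characteristic equation $-\lambda^3 + 2\lambda^2 + \lambda = 0$ has solutions $\lambda_1 = 0$, $\lambda_2 = 1 + \sqrt{2}$, $\lambda_3 = 1 - \sqrt{2}$. The Viete formulas give

\[
\begin{aligned}
I_1(\mathbf{A}) &= \lambda_1 + \lambda_2 + \lambda_3 = 2, \\
I_2(\mathbf{A}) &= \lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3 = -1, \\
I_3(\mathbf{A}) &= \lambda_1\lambda_2\lambda_3 = 0.
\end{aligned}
\]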

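Exercise 3.17 We prove this for two eigenvectors; the reader may readily generalize to any number of eigenvectors. Let $\mathbf{A}\cdot\mathbf{x}_1 = \lambda_1\mathbf{x}_1$ and $\mathbf{A}\cdot\mathbf{x}_2 = \lambda_2\mathbf{x}_2$ where $\lambda_2 \neq \lambda_1$. Now suppose

\[
\alpha_1\mathbf{x}_1 + \alpha_2\mathbf{x}_2 = \mathbf{0}. \tag{*}
\]

Let us operate on both sides of (*) with the tensor $\mathbf{A} - \lambda_2\mathbf{E}$. After simplification we obtain

\[
\alpha_1(\lambda_1 - \lambda_2)\,\mathbf{x}_1 = \mathbf{0}.
\]

Because $\mathbf{x}_1 \neq \mathbf{0}$ by definition, we have $\alpha_1 = 0$. Putting this back into (*), we also have $\alpha_2 = 0$.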

Exercise 3.18 Assume x1, x2, x3 are eigenvectors of A corresponding to the distinct eigenvalues λ1, λ2, λ3, respectively. Let us show that any eigenvector corresponding to λ1 must be a scalar multiple of x1. So suppose that

A · x = λ1x. (**)

By the linear independence of x1,x2,x3, these vectors form a basis forthree-dimensional space and we can express

x = c1x1 + c2x2 + c3x3.

Applying A to both sides we have, by (**),

λ1x = λ1c1x1 + λ1c2x2 + λ1c3x3 = c1λ1x1 + c2λ2x2 + c3λ3x3.

This simplifies to

(λ1 − λ2)c2x2 + (λ1 − λ3)c3x3 = 0.

By linear independence of x2,x3, we get c2 = c3 = 0. Hence x = c1x1.

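Exercise 3.19 The characteristic equation of A is

\[
\begin{vmatrix} 1-\lambda & 0 & 0 \\ 1 & 1-\lambda & 0 \\ 0 & 1 & -\lambda \end{vmatrix} = 0,
\]

which is $\lambda^3 - 2\lambda^2 + \lambda = 0$. Thus $\mathbf{A}^3 = 2\mathbf{A}^2 - \mathbf{A}$.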
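The Cayley–Hamilton consequence stated here can be checked by direct multiplication. The sketch below assumes that the matrix of A is the one read off from the determinant above with λ set to zero; the helper `matmul` is a hypothetical name introduced only for this illustration.

```python
# Check A^3 = 2A^2 - A for the matrix appearing in the characteristic determinant.

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of A, assumed from the determinant |A - lambda E| above.
A = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 0]]

A2 = matmul(A, A)
A3 = matmul(A2, A)

# Right-hand side 2A^2 - A, formed entry by entry.
rhs = [[2 * A2[i][j] - A[i][j] for j in range(3)] for i in range(3)]

print(A3 == rhs)
```

Integer entries are used so that the comparison is exact rather than subject to rounding.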

Exercise 3.20

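(a) $\mathbf{Q}\cdot\mathbf{Q}^T = \mathbf{Q}^T\cdot\mathbf{Q} = \mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3 = \mathbf{E}$. Since we can write $\mathbf{Q} = -\mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3$, the matrix of Q in mixed components is

\[
\begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

Hence $\det\mathbf{Q} = -1$.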


(b) The defining equation Q · QT = E implies

qijeiej · qkmemek = qijqkjeiek = δk

i eiek.

(c) If n is a positive integer,

Qn · (Qn)T = Qn · (QT )n = (Q · QT )n = En = E.

Exercise 3.21 Pre-dot A · x = λx with x to get

\[ \lambda = \frac{\mathbf{x}\cdot(\mathbf{A}\cdot\mathbf{x})}{|\mathbf{x}|^2}. \]

Clearly λ > 0 under the stated condition.

Exercise 3.23

(a) E ··· zyx = εijk(ek · z)(ej · y)(ei · x) = xi(εijkyjzk).

(b) E ··xy = eiεijk(ek · x)(ej · y) = eiεijkyjxk.

(c) E · x = εijkeiejxk so that (E · x) · y = εijkeiyjxk = y × x.

Exercise 3.28 Write E = emem.

Exercise 3.30 Use the representation of the invariants through the eigen-values of A.

Exercise 3.31 Use Theorem 3.8, the fact that B in the representation isa ball tensor, and the equality E ··AT = trA.

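Exercise 3.32 Introduce the nth partial sum

\[ \mathbf{S}_n = \mathbf{E} + \frac{1}{1!}\mathbf{A} + \cdots + \frac{1}{n!}\mathbf{A}^n. \]

It follows that

\[
\|\mathbf{S}_{n+m} - \mathbf{S}_n\|
= \left\| \frac{1}{(n+1)!}\mathbf{A}^{n+1} + \cdots + \frac{1}{(n+m)!}\mathbf{A}^{n+m} \right\|
\le \frac{1}{(n+1)!}\|\mathbf{A}\|^{n+1} + \cdots + \frac{1}{(n+m)!}\|\mathbf{A}\|^{n+m}.
\]

Next, use the proof of convergence of the Taylor series for $e^x$ from any textbook with $x = \|\mathbf{A}\|$.

Exercise 3.33 Show that

(1) $\|\mathbf{A}^n\| \le q^n \to 0$ as $n \to \infty$;
(2) $(\mathbf{E} - \mathbf{A})(\mathbf{E} + \mathbf{A} + \mathbf{A}^2 + \mathbf{A}^3 + \cdots + \mathbf{A}^n) = \mathbf{E} - \mathbf{A}^{n+1}$;
(3) limit passage may be justified in the last equality.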
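As a numerical illustration of steps (1) to (3), the sketch below forms the partial sums E + A + ... + A^n for a sample matrix whose norm is below one and checks that (E − A) times the partial sum approaches E. The sample entries, the number of terms, and the helper names are assumptions made for this example only.

```python
# Partial sums of the Neumann series for a matrix with small norm.

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """n-by-n unit matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Sample matrix (assumed); every row sum is at most 0.4, so the row-sum norm is below 1.
A = [[0.2, 0.1, 0.0],
     [0.0, 0.3, 0.1],
     [0.1, 0.0, 0.2]]

E = identity(3)
S = identity(3)       # running partial sum, starts at E
power = identity(3)   # running power A^k
for _ in range(60):
    power = matmul(power, A)
    S = [[S[i][j] + power[i][j] for j in range(3)] for i in range(3)]

E_minus_A = [[E[i][j] - A[i][j] for j in range(3)] for i in range(3)]
product = matmul(E_minus_A, S)

# By step (2) the difference from E equals -A^(n+1), which should be negligible.
residual = max(abs(product[i][j] - E[i][j]) for i in range(3) for j in range(3))
print(residual)
```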

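Exercise 3.34 $(\mathbf{E} + \mathbf{A})^{-1} = \mathbf{E} - \mathbf{A} + \mathbf{A}^2 - \mathbf{A}^3 + \cdots + (-1)^n\mathbf{A}^n + \cdots$.

Exercise 3.35 Consider $f(\mathbf{X} + \varepsilon\mathbf{B}) = \operatorname{tr}\mathbf{X} + \varepsilon\operatorname{tr}\mathbf{B}$. Using the definition, we get

\[ \left.\frac{\partial}{\partial\varepsilon} f(\mathbf{X} + \varepsilon\,d\mathbf{X})\right|_{\varepsilon=0} = \operatorname{tr}(d\mathbf{X}). \]

As $\operatorname{tr}\mathbf{B} = \mathbf{E}\cdot\cdot\,\mathbf{B}^T$, we get $f_{,\mathbf{X}} = \mathbf{E}$.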

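Exercise 3.36 Consider $f(\mathbf{X} + \varepsilon\mathbf{B}) = \operatorname{tr}\mathbf{X}^2 + \varepsilon\operatorname{tr}(\mathbf{X}\cdot\mathbf{B}) + \varepsilon\operatorname{tr}(\mathbf{B}\cdot\mathbf{X}) + \varepsilon^2\operatorname{tr}\mathbf{B}^2$. By the definition, we get

\[ \left.\frac{\partial}{\partial\varepsilon} f(\mathbf{X} + \varepsilon\mathbf{B})\right|_{\varepsilon=0} = \operatorname{tr}(\mathbf{X}\cdot\mathbf{B}) + \operatorname{tr}(\mathbf{B}\cdot\mathbf{X}) = 2\operatorname{tr}(\mathbf{X}\cdot\mathbf{B}). \]

With regard for $\operatorname{tr}(\mathbf{X}\cdot\mathbf{B}) = \mathbf{X}\cdot\cdot\,\mathbf{B} = \mathbf{X}^T\cdot\cdot\,\mathbf{B}^T$, we get $f_{,\mathbf{X}} = 2\mathbf{X}^T$.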
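The directional derivative obtained here can be compared with a finite-difference quotient. The following sketch uses arbitrary sample matrices X and B (assumptions for illustration) and also evaluates the component form of the result, in which the derivative with respect to the (i, j) component of X is twice the (j, i) component.

```python
# Finite-difference check of d/d(eps) tr((X + eps B)^2) at eps = 0.

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

# Sample data (assumed, not taken from the text).
X = [[1.0, 2.0, 0.5],
     [0.0, -1.0, 3.0],
     [2.0, 1.0, 0.0]]
B = [[0.3, -1.0, 2.0],
     [1.5, 0.0, 0.7],
     [-0.4, 2.0, 1.0]]

def f(M):
    """f(M) = tr(M . M)."""
    return trace(matmul(M, M))

def shifted(eps):
    """X + eps * B."""
    return [[X[i][j] + eps * B[i][j] for j in range(3)] for i in range(3)]

h = 1e-3
numeric = (f(shifted(h)) - f(shifted(-h))) / (2 * h)   # central difference
exact = 2 * trace(matmul(X, B))                        # 2 tr(X . B)

# Component form: sum over i, j of (2 x_ji) b_ij.
componentwise = sum(2 * X[j][i] * B[i][j] for i in range(3) for j in range(3))

print(numeric, exact, componentwise)
```

Because f is quadratic in ε, the central difference agrees with the exact value up to rounding.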

Exercise 3.38 Use the results of Exercises 3.35 and 3.36.

Exercise 3.39 Use a consequence of the Cayley–Hamilton theorem, which is

\[ I_3 = \frac{1}{3}\operatorname{tr}\left(\mathbf{X}^3 - I_1\mathbf{X}^2 + I_2\mathbf{X}\right), \]

and the results of Exercises 3.35, 3.36 and 3.37.

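Exercise 3.43 Let us derive $f(\mathbf{X} + \varepsilon\,d\mathbf{X})$. We have

\[
f(\mathbf{X} + \varepsilon\,d\mathbf{X})
= \frac{1}{2}\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\mathbf{X}
+ \frac{\varepsilon}{2}\,d\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\mathbf{X}
+ \frac{\varepsilon}{2}\,\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,d\mathbf{X}
+ \frac{\varepsilon^2}{2}\,d\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,d\mathbf{X}.
\]

So

\[
\begin{aligned}
\left.\frac{\partial}{\partial\varepsilon} f(\mathbf{X} + \varepsilon\,d\mathbf{X})\right|_{\varepsilon=0}
&= \frac{1}{2}\,d\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\mathbf{X} + \frac{1}{2}\,\mathbf{X}\cdot\cdot\,\mathbf{C}\cdot\cdot\,d\mathbf{X} \\
&= \frac{1}{2}\left(dx_{mn}c_{mnpt}x_{pt} + x_{mn}c_{mnpt}\,dx_{pt}\right) \\
&= \frac{1}{2}\left(c_{mnpt}x_{pt} + x_{ij}c_{ijmn}\right)dx_{mn}.
\end{aligned}
\]

Equating the components of this to the components of $f_{,\mathbf{X}}\cdot\cdot\,d\mathbf{X}$, we get

\[
\begin{aligned}
\frac{\partial f}{\partial x_{11}} &= \frac{1}{2}\left(c_{11pt}x_{pt} + x_{mn}c_{mn11}\right), \\
\frac{\partial f}{\partial x_{12}} &= \frac{1}{4}\left(c_{12pt}x_{pt} + c_{21pt}x_{pt} + x_{mn}c_{mn12} + x_{mn}c_{mn21}\right), \\
\frac{\partial f}{\partial x_{22}} &= \frac{1}{2}\left(c_{22pt}x_{pt} + x_{mn}c_{mn22}\right), \\
\frac{\partial f}{\partial x_{13}} &= \frac{1}{4}\left(c_{13pt}x_{pt} + c_{31pt}x_{pt} + x_{mn}c_{mn13} + x_{mn}c_{mn31}\right), \\
&\;\;\vdots
\end{aligned}
\]

In tensor notation these formulas are

\[ f_{,\mathbf{X}} = \frac{1}{4}\left(\mathbf{C}\cdot\cdot\,\mathbf{X} + \mathbf{C}''\cdot\cdot\,\mathbf{X} + \mathbf{X}\cdot\cdot\,\mathbf{C} + \mathbf{X}\cdot\cdot\,\mathbf{C}'\right), \]

where $\mathbf{C}' = c_{mnpt}\,\mathbf{i}_m\mathbf{i}_n\mathbf{i}_t\mathbf{i}_p$ and $\mathbf{C}'' = c_{mnpt}\,\mathbf{i}_n\mathbf{i}_m\mathbf{i}_t\mathbf{i}_p$.

Exercise 3.44 Consider the tensor equality

\[ \mathbf{C}\cdot\cdot\,\mathbf{X} = \frac{1}{4}\left(\mathbf{C}\cdot\cdot\,\mathbf{X} + \mathbf{C}''\cdot\cdot\,\mathbf{X} + \mathbf{X}\cdot\cdot\,\mathbf{C} + \mathbf{X}\cdot\cdot\,\mathbf{C}'\right) \]

in components.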

Chapter 3 Problems

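Problem 3.1 They are the components of the matrix

\[ \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

Problem 3.2

\[ \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}. \]

Problem 3.3

\[ \begin{pmatrix} -1 & 1 & -2 \\ 2 & -2 & 4 \\ -2 & 2 & -4 \end{pmatrix}. \]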

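Problem 3.5

(a) $\tfrac{1}{2}(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1)$, $\tfrac{1}{2}(\mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1)$;
(b) $2\mathbf{i}_3\mathbf{i}_3$, $\mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1$;
(c) $\tfrac{1}{2}(-\mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1 + \mathbf{i}_1\mathbf{i}_3 + \mathbf{i}_3\mathbf{i}_1)$, $3(\mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1) + \tfrac{1}{2}(\mathbf{i}_1\mathbf{i}_3 - \mathbf{i}_3\mathbf{i}_1)$;
(d) $\tfrac{1}{2}(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_3 + \mathbf{i}_1\mathbf{i}_3 + \mathbf{i}_2\mathbf{i}_1 + \mathbf{i}_3\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_1)$, $\tfrac{1}{2}(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_3 + \mathbf{i}_1\mathbf{i}_3 - \mathbf{i}_2\mathbf{i}_1 - \mathbf{i}_3\mathbf{i}_2 - \mathbf{i}_3\mathbf{i}_1)$;
(e) $\mathbf{i}_1\mathbf{i}_1 + 2\mathbf{i}_1\mathbf{i}_2 + 2\mathbf{i}_2\mathbf{i}_1 + \mathbf{i}_3\mathbf{i}_1 + \mathbf{i}_1\mathbf{i}_3$, $\mathbf{0}$.

Problem 3.6

(a) $\mathbf{0}$, $\mathbf{i}_1\mathbf{i}_2$;
(b) $\mathbf{0}$, $\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1$;
(c) $\tfrac{1}{3}\mathbf{E}$, $\tfrac{2}{3}\mathbf{i}_1\mathbf{i}_1 - \tfrac{1}{3}\mathbf{i}_2\mathbf{i}_2 - \tfrac{1}{3}\mathbf{i}_3\mathbf{i}_3$;
(d) $\tfrac{1}{3}(\mathbf{a}\cdot\mathbf{a})\mathbf{E}$, $\mathbf{a}\mathbf{a} - \tfrac{1}{3}(\mathbf{a}\cdot\mathbf{a})\mathbf{E}$;
(e) $\tfrac{1}{3}\mathbf{E}$, $2\mathbf{i}_1\mathbf{i}_2 + 2\mathbf{i}_2\mathbf{i}_1 + \mathbf{i}_3\mathbf{i}_1 + \mathbf{i}_1\mathbf{i}_3 + \tfrac{2}{3}\mathbf{i}_1\mathbf{i}_1 - \tfrac{1}{3}\mathbf{i}_2\mathbf{i}_2 - \tfrac{1}{3}\mathbf{i}_3\mathbf{i}_3$.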

Problem 3.8

(a) $I_1 = \mathbf{a}\cdot\mathbf{a}$, $I_2 = 0$, $I_3 = 0$;
(b) $I_1 = 0$, $I_2 = -\tfrac{1}{2}$, $I_3 = 0$;
(c) $I_1 = 2$, $I_2 = 1$, $I_3 = 0$;
(d) $I_1 = 3\lambda$, $I_2 = 3\lambda^2$, $I_3 = \lambda^3$;
(e) $I_1 = 9$, $I_2 = 26$, $I_3 = 24$.

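Problem 3.10

(a) $\mathbf{S} = \mathbf{S}' = |\lambda|\mathbf{E}$, $\mathbf{Q} = \operatorname{sgn}\lambda\,\mathbf{E}$;
(b) $\mathbf{S} = \mathbf{S}' = \mathbf{a}\mathbf{a} + \mathbf{b}\mathbf{b} + \mathbf{c}\mathbf{c}$, $\mathbf{Q} = \dfrac{1}{\mathbf{a}\cdot\mathbf{a}}\mathbf{a}\mathbf{a} + \dfrac{1}{\mathbf{b}\cdot\mathbf{b}}\mathbf{b}\mathbf{b} + \dfrac{1}{\mathbf{c}\cdot\mathbf{c}}\mathbf{c}\mathbf{c} \equiv \mathbf{E}$;
(c) $\mathbf{S} = \mathbf{S}' = |\lambda + a|\,\mathbf{i}_1\mathbf{i}_1 + |\lambda|\,\mathbf{i}_2\mathbf{i}_2 + |\lambda|\,\mathbf{i}_3\mathbf{i}_3$, $\mathbf{Q} = \operatorname{sgn}(\lambda + a)\,\mathbf{i}_1\mathbf{i}_1 + \operatorname{sgn}\lambda\,\mathbf{i}_2\mathbf{i}_2 + \operatorname{sgn}\lambda\,\mathbf{i}_3\mathbf{i}_3$;
(d) $\mathbf{S} = \mathbf{S}' = |\lambda + a|\,\mathbf{i}_1\mathbf{i}_1 + |\lambda + b|\,\mathbf{i}_2\mathbf{i}_2 + |\lambda|\,\mathbf{i}_3\mathbf{i}_3$, $\mathbf{Q} = \operatorname{sgn}(\lambda + a)\,\mathbf{i}_1\mathbf{i}_1 + \operatorname{sgn}(\lambda + b)\,\mathbf{i}_2\mathbf{i}_2 + \operatorname{sgn}\lambda\,\mathbf{i}_3\mathbf{i}_3$;
(e) $\mathbf{S} = \mathbf{S}' = |a|\,\mathbf{i}_1\mathbf{i}_1 + |b|\,\mathbf{i}_2\mathbf{i}_2 + |c|\,\mathbf{i}_3\mathbf{i}_3$, $\mathbf{Q} = \operatorname{sgn}a\,\mathbf{i}_1\mathbf{i}_1 + \operatorname{sgn}b\,\mathbf{i}_2\mathbf{i}_2 + \operatorname{sgn}c\,\mathbf{i}_3\mathbf{i}_3$.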

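Problem 3.12 Let X be a needed solution. Multiply the above equality by A from the left and take the trace of both sides:

\[ a\operatorname{tr}(\mathbf{A}\cdot\mathbf{X}) + \operatorname{tr}(\mathbf{A}\cdot\mathbf{X})\operatorname{tr}\mathbf{A} = \operatorname{tr}(\mathbf{A}\cdot\mathbf{B}). \qquad (*) \]

If $a + \operatorname{tr}\mathbf{A} \ne 0$, then $\operatorname{tr}(\mathbf{A}\cdot\mathbf{X}) = \operatorname{tr}(\mathbf{A}\cdot\mathbf{B})/(a + \operatorname{tr}\mathbf{A})$. Substituting this into the above equality we get

\[ \mathbf{X} = \frac{1}{a}\left(\mathbf{B} - \frac{\operatorname{tr}(\mathbf{A}\cdot\mathbf{B})}{a + \operatorname{tr}\mathbf{A}}\,\mathbf{E}\right). \]

If $a + \operatorname{tr}\mathbf{A} = 0$ then (*) reduces to $0 = \operatorname{tr}(\mathbf{A}\cdot\mathbf{B})$. So if $\operatorname{tr}(\mathbf{A}\cdot\mathbf{B}) \ne 0$ the equation has no solution X. Let us see what happens when $a + \operatorname{tr}\mathbf{A} = 0$ and $\operatorname{tr}(\mathbf{A}\cdot\mathbf{B}) = 0$. Now $\operatorname{tr}(\mathbf{A}\cdot\mathbf{X})$ can take any value, but $\operatorname{dev}\mathbf{X}$ is uniquely defined, as follows from the initial equation: $\operatorname{dev}\mathbf{X} = \frac{1}{a}\operatorname{dev}\mathbf{B}$ as $a \ne 0$.

Answer. (1) If $a + \operatorname{tr}\mathbf{A} \ne 0$, then $\mathbf{X} = \dfrac{1}{a}\left(\mathbf{B} - \dfrac{\operatorname{tr}(\mathbf{A}\cdot\mathbf{B})}{a + \operatorname{tr}\mathbf{A}}\,\mathbf{E}\right)$; (2) if $a + \operatorname{tr}\mathbf{A} = 0$ and $\operatorname{tr}(\mathbf{A}\cdot\mathbf{B}) \ne 0$, then a solution does not exist; (3) if $a + \operatorname{tr}\mathbf{A} = 0$ and $\operatorname{tr}(\mathbf{A}\cdot\mathbf{B}) = 0$, then only $\operatorname{dev}\mathbf{X}$ is uniquely defined, $\operatorname{dev}\mathbf{X} = \frac{1}{a}\operatorname{dev}\mathbf{B}$, and so $\mathbf{X} = \lambda\mathbf{E} + \frac{1}{a}\mathbf{B}$.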
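Case (1) of the answer can be checked numerically. The problem statement is not reproduced in this appendix, so the sketch below assumes that the equation being solved is aX + tr(A · X)E = B, the form consistent with relation (*); the sample values of a, A and B are likewise assumptions.

```python
# Check X = (1/a) (B - tr(A.B)/(a + tr A) E) against a X + tr(A.X) E = B (assumed form).

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

a = 2.0
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0],
     [3.0, 0.0, 1.0]]   # tr A = 3, so a + tr A = 5 is nonzero
B = [[0.5, -1.0, 2.0],
     [1.0, 4.0, 0.0],
     [-2.0, 1.5, 1.0]]

def delta(i, j):
    """Components of the unit tensor E."""
    return 1.0 if i == j else 0.0

coeff = trace(matmul(A, B)) / (a + trace(A))
X = [[(B[i][j] - coeff * delta(i, j)) / a for j in range(3)] for i in range(3)]

# Substitute X back into the assumed equation and measure the largest mismatch.
tr_AX = trace(matmul(A, X))
residual = max(abs(a * X[i][j] + tr_AX * delta(i, j) - B[i][j])
               for i in range(3) for j in range(3))
print(residual)
```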

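Problem 3.13 (a) $\mathbf{X} = \mathbf{C} - \dfrac{\operatorname{tr}(\mathbf{A}\cdot\mathbf{C})}{1 + \operatorname{tr}(\mathbf{A}\cdot\mathbf{B})}\,\mathbf{B}$ is uniquely defined when $1 + \operatorname{tr}(\mathbf{A}\cdot\mathbf{B}) \ne 0$; (b) $\mathbf{X} = \mathbf{C}^T - \dfrac{\operatorname{tr}(\mathbf{A}\cdot\mathbf{C}^T)}{1 + \operatorname{tr}(\mathbf{A}\cdot\mathbf{B}^T)}\,\mathbf{B}^T$ is uniquely defined when $1 + \operatorname{tr}(\mathbf{A}\cdot\mathbf{B}^T) \ne 0$; (c) $\mathbf{X} = \mathbf{B} - \dfrac{a\operatorname{tr}\mathbf{B}}{1 + a\operatorname{tr}\mathbf{A}}\,\mathbf{A}$ is uniquely defined when $1 + a\operatorname{tr}\mathbf{A} \ne 0$.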

Problem 3.14 Taking the dev-operation on the above equality, we get a dev X = 0. This means that dev X = 0 when a ≠ 0. If a = 0 then dev X can take any value.

Applying the trace operation to the equality, we get (a + tr E) tr X = 0. It is valid when (1) a = −3 or (2) tr X = 0.

If a = −3, then X = λE with any scalar λ. If tr X = 0, then the equation reduces to aX = 0 and so it can have a nonzero solution X only if a = 0.

Answer. (1) Let a = −3. Then X = λE with any scalar λ; (2) Let a = 0. Then X is an arbitrary tensor such that tr X = 0.

Problem 3.15 Let X be a solution. Calculate the trace of both sides of the equation given in the problem. It follows that (a + tr A) tr X = 0. There follow two possibilities: (1) a = − tr A, or (2) tr X = 0.

When a = − tr A then tr X can take any value; now X = λA is a solution for any scalar λ.

Let tr X = 0. Then the equation reduces to aX = 0, which has a nonzero solution only if a = 0.

Answer. (1) a = − tr A, X = λA; (2) If a = 0, then X is an arbitrary tensor such that tr X = 0.


Problem 3.19

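\[
\begin{aligned}
(\mathbf{E}\times\boldsymbol{\omega})^2
&= (\mathbf{E}\times\boldsymbol{\omega})\cdot(\mathbf{E}\times\boldsymbol{\omega}) \\
&= \boldsymbol{\omega}\times\mathbf{E}\cdot\mathbf{E}\times\boldsymbol{\omega} \\
&= \boldsymbol{\omega}\times\mathbf{E}\times\boldsymbol{\omega} \\
&= \boldsymbol{\omega}\times\mathbf{i}_k\mathbf{i}_k\times\boldsymbol{\omega} \\
&= \omega_p\mathbf{i}_p\times\mathbf{i}_k\,\mathbf{i}_k\times\mathbf{i}_t\,\omega_t \\
&= \omega_p\omega_t\,\epsilon_{pkm}\epsilon_{ktn}\,\mathbf{i}_m\mathbf{i}_n \\
&= \omega_p\omega_t\,\epsilon_{mpk}\epsilon_{tnk}\,\mathbf{i}_m\mathbf{i}_n \\
&= \omega_p\omega_t(\delta_{mt}\delta_{pn} - \delta_{mn}\delta_{pt})\,\mathbf{i}_m\mathbf{i}_n \\
&= \omega_m\omega_n\,\mathbf{i}_m\mathbf{i}_n - \omega_p\omega_p\,\mathbf{i}_m\mathbf{i}_m \\
&= \boldsymbol{\omega}\boldsymbol{\omega} - (\boldsymbol{\omega}\cdot\boldsymbol{\omega})\,\mathbf{E}.
\end{aligned}
\]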
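The final identity can be confirmed in Cartesian components. In the sketch below the matrix of E × ω is built from the Levi-Civita symbol as ε_{ktn} ω_t; that component formula, the sample vector, and the helper names are assumptions for illustration.

```python
# Check (E x w)^2 = w w - (w . w) E in Cartesian components.

def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

w = [1, 2, 3]   # sample vector (assumed)

# Matrix of E x w: entry (k, n) is eps_{ktn} w_t.
W = [[sum(levi_civita(k, t, n) * w[t] for t in range(3)) for n in range(3)]
     for k in range(3)]

# Square of the tensor as a matrix product.
W2 = [[sum(W[k][m] * W[m][n] for m in range(3)) for n in range(3)]
      for k in range(3)]

# Right-hand side: dyad w w minus (w . w) times the unit tensor.
w_dot_w = sum(c * c for c in w)
rhs = [[w[k] * w[n] - (w_dot_w if k == n else 0) for n in range(3)]
       for k in range(3)]

print(W2 == rhs)
```

Integer components keep the comparison exact. The trace of `W2` equals −2 ω · ω, which agrees with the value of I1 quoted in Problem 3.41.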

Problem 3.21 (a) 0; (b) −2ω; (c) 2a × b; (d) 0; (e) 0.

Problem 3.39 I1 = 3α+ βe · e, I2 = 3α2 + 2αβe · e, I3 = α2(α+ βe · e).

Problem 3.40 I1 = 0, I2 = ω · ω, I3 = 0.

Problem 3.41 I1 = −2ω · ω, I2 = (ω · ω)2, I3 = 0.

Problem 3.43

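(a) $2\mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3 - \mathbf{i}_1\mathbf{i}_2 - \mathbf{i}_2\mathbf{i}_1$;
(b) $c^{-1}\mathbf{i}_1\mathbf{i}_3 + b^{-1}\mathbf{i}_2\mathbf{i}_2 + c^{-1}\mathbf{i}_3\mathbf{i}_1$;
(c) $a^{-1}\mathbf{i}_1\mathbf{i}_1 + \mathbf{i}_2\mathbf{i}_2 + \mathbf{i}_3\mathbf{i}_3 - a^{-1}b\,\mathbf{i}_1\mathbf{i}_2$;
(d) $a^{-1}\mathbf{E} - a^{-2}b\,\mathbf{i}_1\mathbf{i}_2$.

Problem 3.44 (a) $4(\mathbf{X}^T)^3$; (b) $\mathbf{a}\mathbf{a}$; (c) $\mathbf{a}\mathbf{b}$; (d) $\mathbf{B}^T$.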

Problem 3.49 (a) EA; (b) EAT ; (c) cdab.

Chapter 4 Exercises

Exercise 4.1

(a) Consider the derivative of the dot product. Since we are given the vectors explicitly, an easy way is to dot the vectors first:

\[ \mathbf{e}_1(t)\cdot\mathbf{e}_2(t) = e^{-t}(1 - \sin^2 t) = e^{-t}\cos^2 t, \]

hence

\[ [\mathbf{e}_1(t)\cdot\mathbf{e}_2(t)]' = -e^{-t}\cos t\,(2\sin t + \cos t). \]

However, it is easily checked that the product-rule identity given in the text yields the same result.

(b) $[\mathbf{e}(t)\times\mathbf{e}'(t)]' = \mathbf{e}'(t)\times\mathbf{e}'(t) + \mathbf{e}(t)\times\mathbf{e}''(t) = \mathbf{e}(t)\times\mathbf{e}''(t)$.

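Exercise 4.3 Use the product rule to take the indicated second derivative of $\mathbf{r} = \rho\hat{\boldsymbol{\rho}}$, noting that

\[ \frac{d\hat{\boldsymbol{\rho}}}{dt} = \frac{d}{dt}(\hat{\mathbf{x}}\cos\phi + \hat{\mathbf{y}}\sin\phi) = (-\hat{\mathbf{x}}\sin\phi + \hat{\mathbf{y}}\cos\phi)\frac{d\phi}{dt} = \hat{\boldsymbol{\phi}}\,\frac{d\phi}{dt} \]

and, similarly,

\[ \frac{d\hat{\boldsymbol{\phi}}}{dt} = -\hat{\boldsymbol{\rho}}\,\frac{d\phi}{dt}. \]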

Exercise 4.4 $v^1 = v_r$, $v^2 = v_\theta/r$, $v^3 = v_\phi/(r\sin\theta)$.

Exercise 4.5

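(a) Fix $v = v_0$ and eliminate u from the transformation equations to get

\[ y = x\tan\alpha + v_0(\sin\beta - \cos\beta\tan\alpha); \]

thus the $v = v_0$ coordinate curve is a straight line in the xy-plane making an angle α with the x-axis. Similarly, the $u = u_0$ line makes an angle β with the x-axis.

(b) $\mathbf{r} = \hat{\mathbf{x}}x + \hat{\mathbf{y}}y = \hat{\mathbf{x}}(u\cos\alpha + v\cos\beta) + \hat{\mathbf{y}}(u\sin\alpha + v\sin\beta)$ gives

\[ \mathbf{r}_1 = \frac{\partial\mathbf{r}}{\partial u} = \hat{\mathbf{x}}\cos\alpha + \hat{\mathbf{y}}\sin\alpha, \qquad \mathbf{r}_2 = \frac{\partial\mathbf{r}}{\partial v} = \hat{\mathbf{x}}\cos\beta + \hat{\mathbf{y}}\sin\beta. \]

Note that these are both unit vectors. To find $\mathbf{r}^1$ we write $\mathbf{r}^1 = \hat{\mathbf{x}}a + \hat{\mathbf{y}}b$ and solve the equations

\[ \mathbf{r}^1\cdot\mathbf{r}_1 = 1, \qquad \mathbf{r}^1\cdot\mathbf{r}_2 = 0, \]

simultaneously to get a, b. By this method we find that

\[ \mathbf{r}^1 = \hat{\mathbf{x}}\,\frac{\sin\beta}{\sin(\beta-\alpha)} - \hat{\mathbf{y}}\,\frac{\cos\beta}{\sin(\beta-\alpha)}, \]

\[ \mathbf{r}^2 = -\hat{\mathbf{x}}\,\frac{\sin\alpha}{\sin(\beta-\alpha)} + \hat{\mathbf{y}}\,\frac{\cos\alpha}{\sin(\beta-\alpha)}. \]

These are not orthonormal vectors; moreover the coordinate system is non-orthogonal except in such trivial cases as β − α = π/2. We have

\[ g_{11} = 1 = g_{22}, \qquad g_{12} = g_{21} = \cos(\beta-\alpha), \]

and

\[ g^{11} = \frac{1}{\sin^2(\beta-\alpha)} = g^{22}, \qquad g^{12} = g^{21} = -\frac{\cos(\beta-\alpha)}{\sin^2(\beta-\alpha)}. \]

(c) Write $\mathbf{z} = z^1\mathbf{r}_1 + z^2\mathbf{r}_2 = z_1\mathbf{r}^1 + z_2\mathbf{r}^2$. Substitute for the $\mathbf{r}_i$ and $\mathbf{r}^i$ in terms of $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ and then equate coefficients of $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$ to get simultaneous equations for the $z_i$ in terms of the $z^i$. The answers are

\[ z_1 = z^1 + z^2\cos(\beta-\alpha), \qquad z_2 = z^1\cos(\beta-\alpha) + z^2. \]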
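The oblique-coordinate results above lend themselves to a quick numerical check. The sketch below picks sample angles (an assumption) and verifies the reciprocity of the two bases, one off-diagonal metric component, and the relation between covariant and contravariant components; all variable names are hypothetical.

```python
import math

# Sample angles for the oblique coordinate lines (assumed values).
alpha, beta = 0.3, 1.1
s = math.sin(beta - alpha)

def dot(a, b):
    """Dot product of two plane vectors given by Cartesian components."""
    return a[0] * b[0] + a[1] * b[1]

# Coordinate basis r_1, r_2 and reciprocal basis r^1, r^2 in Cartesian components.
r_lower = [(math.cos(alpha), math.sin(alpha)), (math.cos(beta), math.sin(beta))]
r_upper = [(math.sin(beta) / s, -math.cos(beta) / s),
           (-math.sin(alpha) / s, math.cos(alpha) / s)]

# Reciprocity r^i . r_j and the contravariant metric g^{ij} = r^i . r^j.
reciprocity = [[dot(r_upper[i], r_lower[j]) for j in range(2)] for i in range(2)]
g_upper = [[dot(r_upper[i], r_upper[j]) for j in range(2)] for i in range(2)]

# A vector with contravariant components z^1, z^2 and its covariant components z_i = z . r_i.
z_contra = (2.0, -1.0)
z = tuple(z_contra[0] * r_lower[0][k] + z_contra[1] * r_lower[1][k] for k in range(2))
z_co = [dot(z, r_lower[i]) for i in range(2)]

print(reciprocity, g_upper, z_co)
```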

Exercise 4.6 (ds)2 = (du)2 + 2 du dv cos(β − α) + (dv)2.

Exercise 4.7

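(a) The total differential of ϕ is

\[ d\varphi = \frac{\partial\varphi}{\partial q^i}\,dq^i. \]

If we start at a point on the given surface and move in such a way that we stay on the surface, then dϕ = 0. In the text it is shown that $dq^i = \mathbf{r}^i\cdot d\mathbf{r}$, so if we move along $d\mathbf{r}$ satisfying

\[ \frac{\partial\varphi}{\partial q^i}\,\mathbf{r}^i\cdot d\mathbf{r} = 0 \]

then we will stay on the surface. It is clear that the vector

\[ \nabla\varphi = \frac{\partial\varphi}{\partial q^i}\,\mathbf{r}^i \]

is normal to the surface, since it is perpendicular to every tangent direction. We have

\[
|\nabla\varphi| = \sqrt{\nabla\varphi\cdot\nabla\varphi}
= \sqrt{\frac{\partial\varphi}{\partial q^m}\mathbf{r}^m\cdot\frac{\partial\varphi}{\partial q^n}\mathbf{r}^n}
= \sqrt{g^{mn}\frac{\partial\varphi}{\partial q^m}\frac{\partial\varphi}{\partial q^n}},
\]

hence

\[
\mathbf{n} = \frac{\nabla\varphi}{|\nabla\varphi|}
= \frac{\dfrac{\partial\varphi}{\partial q^i}\,\mathbf{r}^i}{\sqrt{g^{mn}\dfrac{\partial\varphi}{\partial q^m}\dfrac{\partial\varphi}{\partial q^n}}}
= \frac{g^{ij}\dfrac{\partial\varphi}{\partial q^i}}{\sqrt{g^{mn}\dfrac{\partial\varphi}{\partial q^m}\dfrac{\partial\varphi}{\partial q^n}}}\,\mathbf{r}_j.
\]

(b) Use the result of part (a).
(c) Use the result of part (b).
(d) Use the result of part (b).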

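Exercise 4.8 For cylindrical coordinates we have

\[
\begin{aligned}
\nabla f &= \mathbf{r}^i\frac{\partial f}{\partial q^i}
= \mathbf{r}^1\frac{\partial f}{\partial\rho} + \mathbf{r}^2\frac{\partial f}{\partial\phi} + \mathbf{r}^3\frac{\partial f}{\partial z} \\
&= \mathbf{r}_1\frac{\partial f}{\partial\rho} + \frac{\mathbf{r}_2}{\rho^2}\frac{\partial f}{\partial\phi} + \mathbf{r}_3\frac{\partial f}{\partial z} \\
&= \hat{\boldsymbol{\rho}}\,\frac{\partial f}{\partial\rho} + \frac{\hat{\boldsymbol{\phi}}}{\rho}\frac{\partial f}{\partial\phi} + \hat{\mathbf{z}}\,\frac{\partial f}{\partial z}.
\end{aligned}
\]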

Exercise 4.9 The results are all zero by expansion in Cartesian frame.

Exercise 4.10 Use expansion in Cartesian frame.

Exercise 4.11 Use expansion in Cartesian frame.

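Exercise 4.12 Equation (4.18) gives, for instance,

\[ \Gamma_{221} = \frac{1}{2}\left(\frac{\partial g_{12}}{\partial q^2} + \frac{\partial g_{21}}{\partial q^2} - \frac{\partial g_{22}}{\partial q^1}\right). \]

In the cylindrical system this is

\[ \Gamma_{221} = \frac{1}{2}\left(\frac{\partial 0}{\partial\phi} + \frac{\partial 0}{\partial\phi} - \frac{\partial\rho^2}{\partial\rho}\right) = -\rho. \]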

Exercise 4.13 Use the results of the previous exercise and equation (4.20).

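Exercise 4.14 From $\mathbf{r} = \hat{\mathbf{x}}\,c\cosh u\cos v + \hat{\mathbf{y}}\,c\sinh u\sin v$ we find

\[ \mathbf{r}_u = \frac{\partial\mathbf{r}}{\partial u} = \hat{\mathbf{x}}\,c\sinh u\cos v + \hat{\mathbf{y}}\,c\cosh u\sin v, \]

\[ \mathbf{r}_v = \frac{\partial\mathbf{r}}{\partial v} = -\hat{\mathbf{x}}\,c\cosh u\sin v + \hat{\mathbf{y}}\,c\sinh u\cos v, \]

and consequently

\[ g_{uu} = g_{vv} = c^2(\cosh^2 u - \cos^2 v), \qquad g_{uv} = g_{vu} = 0 \]

(the system is orthogonal). Then

\[ \mathbf{r}^u = \frac{\mathbf{r}_u}{c^2(\cosh^2 u - \cos^2 v)}, \qquad \mathbf{r}^v = \frac{\mathbf{r}_v}{c^2(\cosh^2 u - \cos^2 v)}, \]

from which we get

\[ g^{uu} = g^{vv} = \frac{1}{c^2(\cosh^2 u - \cos^2 v)}, \qquad g^{uv} = g^{vu} = 0. \]

An application of (4.18) gives

\[
\begin{aligned}
\Gamma_{uuu} &= c^2\cosh u\sinh u, \\
\Gamma_{uuv} &= -c^2\cos v\sin v, \\
\Gamma_{uvu} = \Gamma_{vuu} &= c^2\cos v\sin v, \\
\Gamma_{vvu} &= -c^2\cosh u\sinh u, \\
\Gamma_{vuv} = \Gamma_{uvv} &= c^2\cosh u\sinh u, \\
\Gamma_{vvv} &= c^2\cos v\sin v.
\end{aligned}
\]

Finally, from (4.20) we obtain

\[
\begin{aligned}
\Gamma^u_{uu} &= \frac{\cosh u\sinh u}{\cosh^2 u - \cos^2 v}, &
\Gamma^v_{uu} &= -\frac{\cos v\sin v}{\cosh^2 u - \cos^2 v}, \\
\Gamma^u_{uv} = \Gamma^u_{vu} &= \frac{\cos v\sin v}{\cosh^2 u - \cos^2 v}, &
\Gamma^u_{vv} &= -\frac{\cosh u\sinh u}{\cosh^2 u - \cos^2 v}, \\
\Gamma^v_{vu} = \Gamma^v_{uv} &= \frac{\cosh u\sinh u}{\cosh^2 u - \cos^2 v}, &
\Gamma^v_{vv} &= \frac{\cos v\sin v}{\cosh^2 u - \cos^2 v}.
\end{aligned}
\]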
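Some of the Christoffel symbols for the elliptic system can be spot-checked by differentiating the metric numerically. The sketch below assumes the index convention used in Exercise 4.12, namely Γ_{ijk} = ½(∂g_{jk}/∂q^i + ∂g_{ik}/∂q^j − ∂g_{ij}/∂q^k); the sample point, the value of c, and the step size are arbitrary choices.

```python
import math

c = 1.5  # scale parameter of the elliptic coordinates (sample value)

def metric(u, v):
    """Return (g_uu, g_uv, g_vv) computed from the coordinate basis vectors."""
    ru = (c * math.sinh(u) * math.cos(v), c * math.cosh(u) * math.sin(v))
    rv = (-c * math.cosh(u) * math.sin(v), c * math.sinh(u) * math.cos(v))
    dot = lambda a, b: a[0] * b[0] + a[1] * b[1]
    return dot(ru, ru), dot(ru, rv), dot(rv, rv)

def metric_derivatives(u, v, h=1e-5):
    """Central-difference derivatives of (g_uu, g_uv, g_vv) with respect to u and v."""
    du = [(p - m) / (2 * h) for p, m in zip(metric(u + h, v), metric(u - h, v))]
    dv = [(p - m) / (2 * h) for p, m in zip(metric(u, v + h), metric(u, v - h))]
    return du, dv

u0, v0 = 0.7, 0.4
g_uu, g_uv, g_vv = metric(u0, v0)
du, dv = metric_derivatives(u0, v0)

# First-kind symbols from the assumed convention.
gamma_uuu = 0.5 * du[0]            # (1/2) d g_uu / du
gamma_uuv = du[1] - 0.5 * dv[0]    # d g_uv / du - (1/2) d g_uu / dv
# Second-kind symbol; the metric is diagonal, so g^{uu} = 1 / g_uu.
gamma_u_uu = gamma_uuu / g_uu

print(gamma_uuu, gamma_uuv, gamma_u_uu)
```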

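Exercise 4.15 We switch to a notation in which the coordinate symbols ρ and φ are used for the indices. The only nonzero Christoffel symbols are $\Gamma^\rho_{\phi\phi} = -\rho$, $\Gamma^\phi_{\rho\phi} = 1/\rho$. Then

\[ \nabla_\rho f^\rho = \frac{\partial f^\rho}{\partial\rho}, \qquad \nabla_\phi f^\rho = \frac{\partial f^\rho}{\partial\phi} - \rho f^\phi, \]

\[ \nabla_\rho f^\phi = \frac{\partial f^\phi}{\partial\rho} + \frac{1}{\rho}f^\phi, \qquad \nabla_\phi f^\phi = \frac{\partial f^\phi}{\partial\phi} + \frac{1}{\rho}f^\rho. \]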

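Exercise 4.16 Use the method shown in the text. For example,

\[
\begin{aligned}
\frac{\partial}{\partial q^k}\left(a_i^{\cdot j}\,\mathbf{r}^i\mathbf{r}_j\right)
&= \frac{\partial a_i^{\cdot j}}{\partial q^k}\,\mathbf{r}^i\mathbf{r}_j + a_i^{\cdot j}\left(\frac{\partial\mathbf{r}^i}{\partial q^k}\,\mathbf{r}_j + \mathbf{r}^i\,\frac{\partial\mathbf{r}_j}{\partial q^k}\right) \\
&= \frac{\partial a_i^{\cdot j}}{\partial q^k}\,\mathbf{r}^i\mathbf{r}_j + a_i^{\cdot j}\left(-\Gamma^i_{kt}\,\mathbf{r}^t\mathbf{r}_j + \mathbf{r}^i\,\Gamma^t_{jk}\,\mathbf{r}_t\right) \\
&= \frac{\partial a_i^{\cdot j}}{\partial q^k}\,\mathbf{r}^i\mathbf{r}_j + \left(-a_s^{\cdot j}\,\Gamma^s_{ki}\,\mathbf{r}^i\mathbf{r}_j + \mathbf{r}^i a_i^{\cdot s}\,\Gamma^j_{sk}\,\mathbf{r}_j\right) \\
&= \left(\frac{\partial a_i^{\cdot j}}{\partial q^k} - \Gamma^s_{ki}\,a_s^{\cdot j} + \Gamma^j_{sk}\,a_i^{\cdot s}\right)\mathbf{r}^i\mathbf{r}_j.
\end{aligned}
\]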

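Exercise 4.17 Consider, for example,

\[ \nabla_k g_{ij} = \frac{\partial g_{ij}}{\partial q^k} - \Gamma^s_{ki}\,g_{sj} - \Gamma^s_{kj}\,g_{is}. \]

By (4.19), this is

\[ \nabla_k g_{ij} = \frac{\partial g_{ij}}{\partial q^k} - \Gamma_{kij} - \Gamma_{kji}. \]

Elimination of $\Gamma_{kij}$ and $\Gamma_{kji}$ via (4.18) shows that $\nabla_k g_{ij} = 0$.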

Exercise 4.18 Equate components of the first and third members of the trivial identity ∇(E · a) = ∇a = E · ∇a.

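Exercise 4.19 In cylindrical coordinates,

\[
\begin{aligned}
\nabla\cdot\mathbf{f} &= \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\,f^i\right) \\
&= \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(\rho f^1) + \frac{\partial}{\partial\phi}(\rho f^2) + \frac{\partial}{\partial z}(\rho f^3)\right] \\
&= \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho f^1) + \frac{1}{\rho}\frac{\partial}{\partial\phi}(\rho f^2) + \frac{\partial}{\partial z}(f^3) \\
&= \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho f_\rho) + \frac{1}{\rho}\frac{\partial f_\phi}{\partial\phi} + \frac{\partial f_z}{\partial z}.
\end{aligned}
\]

In spherical coordinates,

\[
\begin{aligned}
\nabla\cdot\mathbf{f} &= \frac{1}{\sqrt{g}}\frac{\partial}{\partial q^i}\left(\sqrt{g}\,f^i\right) \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,f^1) + \frac{\partial}{\partial\theta}(r^2\sin\theta\,f^2) + \frac{\partial}{\partial\phi}(r^2\sin\theta\,f^3)\right] \\
&= \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}(r^2\sin\theta\,f_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,f_\theta) + \frac{\partial}{\partial\phi}(r f_\phi)\right] \\
&= \frac{1}{r^2}\frac{\partial}{\partial r}(r^2 f_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,f_\theta) + \frac{1}{r\sin\theta}\frac{\partial f_\phi}{\partial\phi}.
\end{aligned}
\]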

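Exercise 4.20

\[ \nabla\times\nabla f = \epsilon^{ijk}\,\mathbf{r}_k\left(\frac{\partial^2 f}{\partial q^i\,\partial q^j} - \Gamma^n_{ij}\frac{\partial f}{\partial q^n}\right) = 0. \]

(Swapping two adjacent indices on $\epsilon^{ijk}$ causes a sign change; hence, whenever the symbol multiplies an expression that is symmetric in two of its indices, the result is zero.)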


\[
\left|\frac{\partial(x_1, x_2, x_3)}{\partial(q^1, q^2, q^3)}\right|
= \det\left[\frac{\partial x_i}{\partial q^j}\right]
= \det\left[\frac{\partial x_i}{\partial\tilde{q}^k}\frac{\partial\tilde{q}^k}{\partial q^j}\right]
= \det\left(\left[\frac{\partial x_i}{\partial\tilde{q}^k}\right]\left[\frac{\partial\tilde{q}^k}{\partial q^j}\right]\right),
\]

where $\tilde{q}^k$ denote the coordinates of the second system. The result follows from the fact that the determinant of a product equals the product of the determinants.

Chapter 4 Problems

Problem 4.1 $\nabla f = \dfrac{f'(r)}{r}\,\mathbf{r}$, $\nabla^2 f = f''(r) + \dfrac{2f'(r)}{r}$.


Problem 4.2

(a) ai1i1 + bi2i2 + ci3i3, a + b + c;
(b) ai2i1, 0;
(c) aE, 3a;
(d) f′(r)erer + f(r)/r eφeφ, f′(r) + f(r)/r;
(e) f′(r)ereφ − f(r)/r eφer, 0;
(f) f′(r)erez, 0;
(g) f′(r)erer + f(r)/r eφeφ + f(r)/r eθeθ, f′(r) + 2f(r)/r;
(h) −E × ω, 0;
(i) f′(φ)/r eφez + g′(φ)/r eφeφ − g(φ)/r eφer, g′(φ)/r;
(j) f′(z)ezez + g′(φ)/r eφeφ − g(φ)/r eφer, f′(z) + g′(φ)/r;
(k) AT, tr A.

Problem 4.4 (a) E; (b) −6A; (c) trA; (d) ∇f ; (e) 3E; (f) 4r.

Problem 4.5

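(a) $f'\mathbf{e}_r + \dfrac{f-g}{r}\,\mathbf{e}_r + h'\mathbf{e}_z$;
(b) $f'\mathbf{e}_r + 2\,\dfrac{f-g}{r}\,\mathbf{e}_r$;
(c) $f'\mathbf{e}_r + \dfrac{f-g}{r}\,\mathbf{e}_r$;
(d) $f'\mathbf{i}_1 + g'\mathbf{i}_2 + h'\mathbf{i}_3$;
(e) $\mathbf{0}$;
(f) $f'\mathbf{e}_\phi + \dfrac{f+g}{r}\,\mathbf{e}_\phi + h'\mathbf{e}_z$;
(g) $\dfrac{f}{r}\,\mathbf{e}_z + \dfrac{g}{r}\,\mathbf{e}_\phi + h'\mathbf{e}_r$.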

Problem 4.6 f × r.

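Problem 4.7 Find the coordinate vectors for the new coordinates and demonstrate their orthogonality. The answers are

\[ H_\sigma = a^2\,\frac{\sigma^2 - \tau^2}{\sigma^2 - 1}, \qquad H_\tau = a^2\,\frac{\sigma^2 - \tau^2}{1 - \tau^2}, \qquad H_z = 1. \]

Problem 4.8 $H_\sigma = H_\tau = \sigma^2 + \tau^2$, $H_\phi = \sigma^2\tau^2$.

Problem 4.9

\[ H_\sigma = H_\tau = \frac{a^2}{(\cosh\tau - \cos\sigma)^2}, \qquad H_z = 1. \]

Problem 4.10

\[ H_\sigma = H_\tau = \frac{a^2}{(\cosh\tau - \cos\sigma)^2}, \qquad H_\phi = \frac{a^2\sin^2\sigma}{(\cosh\tau - \cos\sigma)^2}. \]

Problem 4.11

\[ H_\sigma = H_\tau = \frac{a^2}{(\cosh\tau - \cos\sigma)^2}, \qquad H_\phi = \frac{a^2\sinh^2\tau}{(\cosh\tau - \cos\sigma)^2}. \]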

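Problem 4.18 Consider the integral

\[ \int_V \nabla\cdot(\mathbf{A}\mathbf{r})\,dV. \]

Using the Gauss–Ostrogradsky theorem we get

\[ \int_V \nabla\cdot(\mathbf{A}\mathbf{r})\,dV = \int_S \mathbf{n}\cdot\mathbf{A}\mathbf{r}\,dS = \int_S \mathbf{g}\mathbf{r}\,dS. \]

On the other hand, with regard for the identity

\[ \nabla\cdot(\mathbf{A}\mathbf{r}) = (\nabla\cdot\mathbf{A})\mathbf{r} + \mathbf{A}^T\cdot\nabla\mathbf{r} = (\nabla\cdot\mathbf{A})\mathbf{r} + \mathbf{A}^T \]

we have

\[ \int_V \nabla\cdot(\mathbf{A}\mathbf{r})\,dV = \int_V (\mathbf{f}\mathbf{r} + \mathbf{A}^T)\,dV. \]

Comparing these two expressions we find

\[ \int_V \mathbf{A}^T\,dV = \int_S \mathbf{g}\mathbf{r}\,dS - \int_V \mathbf{f}\mathbf{r}\,dV, \]

from which the answer follows. The answer is

\[ \int_S \mathbf{r}\mathbf{g}\,dS - \int_V \mathbf{r}\mathbf{f}\,dV. \]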


Chapter 5 Exercises

Exercise 5.1

(a) $\mathbf{r}'(t) = -\mathbf{i}_1\sin t + \mathbf{i}_2\cos t + \mathbf{i}_3$, $|\mathbf{r}'(t)| = \sqrt{2}$, so the required length is

\[ \int_0^{2\pi}\sqrt{2}\,dt = 2\pi\sqrt{2}. \]

(b) The ellipse can be described as the locus of the tip of the vector

\[ \mathbf{r}(t) = \mathbf{i}_1 A\cos t + \mathbf{i}_2 B\sin t \qquad (0 \le t < 2\pi). \]

Hence

\[ s = \int_0^{2\pi}\left(A^2\sin^2 t + B^2\cos^2 t\right)^{1/2}dt. \]

The integral on the right is an elliptic integral and cannot be evaluated in closed form.

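Exercise 5.3 $s(t) = \int_0^t\sqrt{2}\,dt = t\sqrt{2}$, so

\[ \mathbf{r}(s) = \mathbf{i}_1\cos(s/\sqrt{2}) + \mathbf{i}_2\sin(s/\sqrt{2}) + \mathbf{i}_3\,(s/\sqrt{2}). \]

The unit tangent is

\[ \mathbf{r}'(s) = \frac{1}{\sqrt{2}}\left[-\mathbf{i}_1\sin\left(\frac{s}{\sqrt{2}}\right) + \mathbf{i}_2\cos\left(\frac{s}{\sqrt{2}}\right) + \mathbf{i}_3\right]. \]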

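Exercise 5.4 The curve is an exponential spiral. We have

\[ \mathbf{r}'(t) = e^t\left[\mathbf{i}_1(\cos t - \sin t) + \mathbf{i}_2(\cos t + \sin t)\right], \]

hence $\mathbf{r}'(\pi/4) = \mathbf{i}_2\,e^{\pi/4}\sqrt{2}$. Also

\[ \mathbf{r}(\pi/4) = \frac{\sqrt{2}}{2}\,e^{\pi/4}(\mathbf{i}_1 + \mathbf{i}_2), \]

so the tangent line is described by

\[ x = \frac{\sqrt{2}}{2}\,e^{\pi/4}. \]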

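Exercise 5.5 Assuming the curve can be expressed in the form ρ = ρ(θ), we can write the position vector as

\[ \mathbf{r}(\theta) = \mathbf{i}_1\rho(\theta)\cos\theta + \mathbf{i}_2\rho(\theta)\sin\theta. \]

Differentiation gives

\[ \mathbf{r}'(\theta) = \mathbf{i}_1[-\rho(\theta)\sin\theta + \rho'(\theta)\cos\theta] + \mathbf{i}_2[\rho(\theta)\cos\theta + \rho'(\theta)\sin\theta]. \]

(a) The equation $\mathbf{r} = \mathbf{r}(\theta_0) + \lambda\mathbf{r}'(\theta_0)$ gives, upon substitution and matching of coefficients,

\[ \rho\cos\theta = \rho(\theta_0)\cos\theta_0 + \lambda[-\rho(\theta_0)\sin\theta_0 + \rho'(\theta_0)\cos\theta_0], \]

\[ \rho\sin\theta = \rho(\theta_0)\sin\theta_0 + \lambda[\rho(\theta_0)\cos\theta_0 + \rho'(\theta_0)\sin\theta_0], \]

where (ρ, θ) locates a point on the tangent line. The desired analogues of (5.2) can be obtained by (1) squaring and adding, whereby we eliminate θ from the left-hand side, and (2) dividing the two equations, whereby we eliminate ρ from the left-hand side. Elimination of λ instead gives

\[ \frac{\rho\cos\theta - \rho(\theta_0)\cos\theta_0}{-\rho(\theta_0)\sin\theta_0 + \rho'(\theta_0)\cos\theta_0} = \frac{\rho\sin\theta - \rho(\theta_0)\sin\theta_0}{\rho(\theta_0)\cos\theta_0 + \rho'(\theta_0)\sin\theta_0} \]

for the analogue of (5.3).

(b) We compute $\mathbf{r}(\theta)\cdot\mathbf{r}'(\theta)$ to get

\[ \mathbf{r}(\theta)\cdot\mathbf{r}'(\theta) = \rho(\theta)\rho'(\theta). \]

We also find

\[ |\mathbf{r}(\theta)| = \rho(\theta), \qquad |\mathbf{r}'(\theta)| = [\rho^2(\theta) + \rho'^2(\theta)]^{1/2}, \]

so that by definition of the dot product

\[ \cos\varphi = \frac{\mathbf{r}(\theta)\cdot\mathbf{r}'(\theta)}{|\mathbf{r}(\theta)||\mathbf{r}'(\theta)|} = \frac{\rho'(\theta)}{[\rho^2(\theta) + \rho'^2(\theta)]^{1/2}}. \]

Use of the identity

\[ \tan\varphi = \frac{[1 - \cos^2\varphi]^{1/2}}{\cos\varphi} \]

allows us to get

\[ \tan\varphi = \frac{\rho(\theta)}{\rho'(\theta)}. \]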

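Exercise 5.7 Let $\mathbf{r}'(t_0) = 0$. For determining a tangent vector at $t_0$ we can use the Taylor expansion of $\mathbf{r} = \mathbf{r}(t)$:

\[ \mathbf{r}(t_0 + \Delta t) = \mathbf{r}(t_0) + \mathbf{r}'(t_0)\Delta t + \frac{1}{2!}\mathbf{r}''(t_0)(\Delta t)^2 + \cdots + \frac{1}{n!}\mathbf{r}^{(n)}(t_0)(\Delta t)^n + o(|\Delta t|^n). \]

If the nth derivative is the first that is not zero at $t = t_0$, then

\[ \mathbf{r}(t_0 + \Delta t) = \mathbf{r}(t_0) + \frac{1}{n!}\mathbf{r}^{(n)}(t_0)(\Delta t)^n + o(|\Delta t|^n). \]

When Δt → 0, the direction of the vector $\mathbf{r}(t_0 + \Delta t) - \mathbf{r}(t_0)$ tends to the tangential direction of the curve at $t = t_0$. Thus a tangent vector t to the curve at $t = t_0$ can be found as

\[ \mathbf{t} = \lim_{\Delta t\to 0}\frac{\mathbf{r}(t_0 + \Delta t) - \mathbf{r}(t_0)}{(\Delta t)^n} = \frac{1}{n!}\mathbf{r}^{(n)}(t_0). \]

Exercise 5.8

\[ x = x(t_0) + \lambda\frac{1}{n!}x^{(n)}(t_0), \]

\[ y = y(t_0) + \lambda\frac{1}{n!}y^{(n)}(t_0), \]

\[ z = z(t_0) + \lambda\frac{1}{n!}z^{(n)}(t_0). \]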

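Exercise 5.9 This time $s(t) = t\sqrt{\alpha^2 + \beta^2}$, hence

\[ \mathbf{r}(s) = \mathbf{i}_1\alpha\cos\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) + \mathbf{i}_2\alpha\sin\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) + \mathbf{i}_3\beta\,\frac{s}{\sqrt{\alpha^2+\beta^2}}. \]

This gives

\[ \mathbf{r}'(s) = \frac{1}{\sqrt{\alpha^2+\beta^2}}\left[-\mathbf{i}_1\alpha\sin\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) + \mathbf{i}_2\alpha\cos\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) + \mathbf{i}_3\beta\right], \]

\[ \mathbf{r}''(s) = \frac{\alpha}{\alpha^2+\beta^2}\left[-\mathbf{i}_1\cos\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) - \mathbf{i}_2\sin\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right)\right], \]

and hence

\[ k = |\mathbf{r}''(s)| = \frac{\alpha}{\alpha^2+\beta^2}. \]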
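The curvature of the helix can also be obtained without passing to the natural parameter, using the relation k = |r′(t) × r″(t)| / |r′(t)|³ that is derived in Exercise 5.11. The sketch below assumes the parametrization r(t) = i1 α cos t + i2 α sin t + i3 βt; the function and variable names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in a))

def helix_curvature(alpha, beta, t):
    """Curvature of the assumed helix via |r' x r''| / |r'|^3."""
    r1 = (-alpha * math.sin(t), alpha * math.cos(t), beta)   # r'(t)
    r2 = (-alpha * math.cos(t), -alpha * math.sin(t), 0.0)   # r''(t)
    return norm(cross(r1, r2)) / norm(r1) ** 3

alpha, beta = 2.0, 0.5   # sample values (assumed)
k_numeric = helix_curvature(alpha, beta, t=1.3)
k_closed_form = alpha / (alpha**2 + beta**2)
print(k_numeric, k_closed_form)
```

The value does not depend on t, which reflects the constant curvature of the helix.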

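Exercise 5.10 The principal normal is

\[ \boldsymbol{\nu} = -\mathbf{i}_1\cos\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) - \mathbf{i}_2\sin\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right). \]

The binormal is obtained as

\[
\boldsymbol{\beta} =
\begin{vmatrix}
\mathbf{i}_1 & \mathbf{i}_2 & \mathbf{i}_3 \\
-\dfrac{\alpha}{\sqrt{\alpha^2+\beta^2}}\sin\left(\dfrac{s}{\sqrt{\alpha^2+\beta^2}}\right) & \dfrac{\alpha}{\sqrt{\alpha^2+\beta^2}}\cos\left(\dfrac{s}{\sqrt{\alpha^2+\beta^2}}\right) & \dfrac{\beta}{\sqrt{\alpha^2+\beta^2}} \\
-\cos\left(\dfrac{s}{\sqrt{\alpha^2+\beta^2}}\right) & -\sin\left(\dfrac{s}{\sqrt{\alpha^2+\beta^2}}\right) & 0
\end{vmatrix}
\]

and is

\[ \boldsymbol{\beta} = \frac{1}{\sqrt{\alpha^2+\beta^2}}\left[\mathbf{i}_1\beta\sin\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) - \mathbf{i}_2\beta\cos\left(\frac{s}{\sqrt{\alpha^2+\beta^2}}\right) + \mathbf{i}_3\alpha\right]. \]

We can determine a plane in space by specifying the normal to the plane and a single point through which the plane passes. If the plane passes through the point located by the position vector $\mathbf{r}_0$, and N is any normal vector, then the plane is described by the vector equation

\[ \mathbf{N}\cdot(\mathbf{r} - \mathbf{r}_0) = 0 \]

(that is, all points whose position vectors r satisfy the above equation will lie in the plane). Hence the equations for the three fundamental planes are

\[
\begin{aligned}
\boldsymbol{\beta}\cdot(\mathbf{r} - \mathbf{r}_0) &= 0 \quad \text{osculating plane}, \\
\boldsymbol{\tau}\cdot(\mathbf{r} - \mathbf{r}_0) &= 0 \quad \text{normal plane}, \\
\boldsymbol{\nu}\cdot(\mathbf{r} - \mathbf{r}_0) &= 0 \quad \text{rectifying plane}.
\end{aligned}
\]

The rectifying plane for the helix is found to be $x\cos t_0 + y\sin t_0 = \alpha$.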

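Exercise 5.11 Writing r = r(t), we differentiate with respect to t and use the chain rule to obtain

\[ \frac{d\mathbf{r}}{dt} = \boldsymbol{\tau}\,\frac{ds}{dt}, \qquad \frac{d^2\mathbf{r}}{dt^2} = \boldsymbol{\tau}\,\frac{d^2s}{dt^2} + \boldsymbol{\nu}\,k\left(\frac{ds}{dt}\right)^2. \]

These give

\[ \frac{d\mathbf{r}}{dt}\times\frac{d^2\mathbf{r}}{dt^2} = \boldsymbol{\beta}\,k\left(\frac{ds}{dt}\right)^3 \]

or

\[ \mathbf{r}'(t)\times\mathbf{r}''(t) = \boldsymbol{\beta}\,k\,|\mathbf{r}'(t)|^3. \]

Dotting this equation with itself we obtain

\[ k^2 = \frac{[\mathbf{r}'(t)\times\mathbf{r}''(t)]^2}{|\mathbf{r}'(t)|^6}. \]

The formula in the text was presented in such a way that the absolute value function (with its inherent discontinuity at the origin) would be avoided.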

Exercise 5.12

(a) $R = 5\sqrt{10}/3$. $R = \infty$.
(b) $x = -b/(2a)$, the vertex of the parabola.

Exercise 5.15 No; construct counterexamples.

Exercise 5.16 First show that β is a constant β0. Then use the definition of a plane as a set of points described by a radius vector r such that r · n = c, where c is a constant.

Exercise 5.17 Straightforward using equations (5.7) and (5.10).

Exercise 5.18 The vectors r′(t) and r′′′(t) reverse their directions, while r′′(t) does not. So τ reverses direction, ν maintains its direction, and therefore β reverses direction. The moving trihedron changes its handedness. The formulas for k1 and k2 show that these quantities do not change sign.


\[ k_2 = -\frac{\beta}{\alpha^2+\beta^2}. \]

Note that the ratio of the curvature to the torsion is constant for the helix.

Exercise 5.20

\[ \frac{d^3\mathbf{r}}{ds^3} = -k_1^2\,\boldsymbol{\tau} + \frac{dk_1}{ds}\,\boldsymbol{\nu} - k_1k_2\,\boldsymbol{\beta}. \]


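\[ \mathbf{r}'(t) = \mathbf{i}_1\sin t + \mathbf{i}_2 + \mathbf{i}_3\cos t, \qquad |\mathbf{r}'(t)| = \sqrt{2}, \]

and can find the length parameter for the curve by setting

\[ s(t) = \int_0^t|\mathbf{r}'(x)|\,dx = \int_0^t\sqrt{2}\,dx = \sqrt{2}\,t. \]

So $t = s/\sqrt{2}$ and we have

\[
\begin{aligned}
\mathbf{r}(s) &= \mathbf{i}_1\left[1 - \cos\left(\frac{s}{\sqrt{2}}\right)\right] + \mathbf{i}_2\left(\frac{s}{\sqrt{2}}\right) + \mathbf{i}_3\sin\left(\frac{s}{\sqrt{2}}\right), \\
\mathbf{r}'(s) &= \mathbf{i}_1\frac{1}{\sqrt{2}}\sin\left(\frac{s}{\sqrt{2}}\right) + \mathbf{i}_2\left(\frac{1}{\sqrt{2}}\right) + \mathbf{i}_3\frac{1}{\sqrt{2}}\cos\left(\frac{s}{\sqrt{2}}\right), \\
\mathbf{r}''(s) &= \mathbf{i}_1\frac{1}{2}\cos\left(\frac{s}{\sqrt{2}}\right) - \mathbf{i}_3\frac{1}{2}\sin\left(\frac{s}{\sqrt{2}}\right).
\end{aligned}
\]

The desired decomposition of the acceleration is

\[ \mathbf{a} = s''(t)\,\boldsymbol{\tau} + k_1v^2\,\boldsymbol{\nu}. \]

In the present case we have $s''(t) = 0$, while $k_1 = |\mathbf{r}''(s)| = 1/2$ and $v^2 = (ds/dt)^2 = 2$. Therefore $\mathbf{a} = \boldsymbol{\nu}$.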

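Exercise 5.23 The sphere is described by

\[ \mathbf{r} = \hat{\mathbf{x}}\,a\sin\theta\cos\phi + \hat{\mathbf{y}}\,a\sin\theta\sin\phi + \hat{\mathbf{z}}\,a\cos\theta. \]

We have

\[ \frac{\partial\mathbf{r}}{\partial\theta} = \hat{\mathbf{x}}\,a\cos\theta\cos\phi + \hat{\mathbf{y}}\,a\cos\theta\sin\phi - \hat{\mathbf{z}}\,a\sin\theta, \]

\[ \frac{\partial\mathbf{r}}{\partial\phi} = -\hat{\mathbf{x}}\,a\sin\theta\sin\phi + \hat{\mathbf{y}}\,a\sin\theta\cos\phi, \]

hence

\[ E = a^2, \qquad F = 0, \qquad G = a^2\sin^2\theta, \]

so that $(ds)^2 = a^2(d\theta)^2 + a^2\sin^2\theta\,(d\phi)^2$.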

Exercise 5.24

(a) The differentials of coordinates for elementary curves on the coordinate lines are $(du^1, du^2 = 0)$ and $(du^1 = 0, du^2)$. The answer is given by $\cos\theta = F/\sqrt{EG}$.

(b) The desired angle is ψ = π/4.

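Exercise 5.25 We find

\[ \mathbf{r}_1 = \mathbf{i}_1\cos u^2 + \mathbf{i}_2\sin u^2 + \mathbf{i}_3, \]

\[ \mathbf{r}_2 = -\mathbf{i}_1u^1\sin u^2 + \mathbf{i}_2u^1\cos u^2, \]

hence $E = 2$, $F = 0$, $G = (u^1)^2$. Then

\[ S = \int_0^{2\pi}\int_0^a\sqrt{2}\,u^1\,du^1\,du^2 = \pi\sqrt{2}\,a^2. \]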


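Exercise 5.26 Computing ∂r/∂ρ and ∂r/∂φ, we find that

\[ E = 1 + [f'(\rho)]^2, \qquad F = 0, \qquad G = \rho^2. \]

From these we get

\[ (ds)^2 = \left\{1 + [f'(\rho)]^2\right\}(d\rho)^2 + \rho^2(d\phi)^2 \]

and

\[ S = \int_A\sqrt{EG - F^2}\,d\rho\,d\phi = \iint\sqrt{1 + [f'(\rho)]^2}\,\rho\,d\rho\,d\phi. \]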

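Exercise 5.30 From

\[ \mathbf{r} = \hat{\mathbf{x}}\,\rho\cos\phi + \hat{\mathbf{y}}\,\rho\sin\phi + \hat{\mathbf{z}}\,f(\rho) \]

we find that

\[ \mathbf{n} = \frac{\dfrac{\partial\mathbf{r}}{\partial\rho}\times\dfrac{\partial\mathbf{r}}{\partial\phi}}{\left|\dfrac{\partial\mathbf{r}}{\partial\rho}\times\dfrac{\partial\mathbf{r}}{\partial\phi}\right|} = \frac{-\hat{\mathbf{x}}\,f'(\rho)\cos\phi - \hat{\mathbf{y}}\,f'(\rho)\sin\phi + \hat{\mathbf{z}}}{\sqrt{1 + [f'(\rho)]^2}} \]

and

\[ \frac{\partial^2\mathbf{r}}{\partial\rho^2} = \hat{\mathbf{z}}\,f''(\rho), \qquad \frac{\partial^2\mathbf{r}}{\partial\rho\,\partial\phi} = -\hat{\mathbf{x}}\sin\phi + \hat{\mathbf{y}}\cos\phi, \qquad \frac{\partial^2\mathbf{r}}{\partial\phi^2} = -\hat{\mathbf{x}}\,\rho\cos\phi - \hat{\mathbf{y}}\,\rho\sin\phi. \]

Hence

\[ L = \frac{f''(\rho)}{\sqrt{1 + [f'(\rho)]^2}}, \qquad M = 0, \qquad N = \frac{\rho f'(\rho)}{\sqrt{1 + [f'(\rho)]^2}}. \]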

Exercise 5.31 In Exercise 5.23 we found that $E = a^2$, $F = 0$, and $G = a^2\sin^2\theta$ for the sphere. The outward normal to the sphere is

\[ \mathbf{n} = \hat{\mathbf{x}}\sin\theta\cos\phi + \hat{\mathbf{y}}\sin\theta\sin\phi + \hat{\mathbf{z}}\cos\theta, \]

and this yields the values $L = -a$, $M = 0$, $N = -a\sin^2\theta$. The mean curvature is $H = -1/a$, and the Gaussian curvature is $K = 1/a^2$.


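Exercise 5.35 The answers include

\[ \mu_1^{\cdot 1} = \frac{-f''(\rho)}{\left\{1 + [f'(\rho)]^2\right\}^{3/2}}, \qquad \mu_1^{\cdot 2} = \mu_2^{\cdot 1} = 0, \qquad \mu_2^{\cdot 2} = \frac{-f'(\rho)}{\rho\sqrt{1 + [f'(\rho)]^2}}, \]

and

\[ \Gamma^1_{11} = \frac{f'(\rho)f''(\rho)}{1 + [f'(\rho)]^2}. \]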

Exercise 5.40 A logarithmic spiral

r = ceaφ[i cos(φ+ φ0) + j sin(φ+ φ0)],

where the constant c is defined by the initial values of the curve.

Chapter 5 Problems

Problem 5.1 The curve is called the astroid (Fig. B.1(a)). Its parametric representation is $x = a\cos^3 t$, $y = a\sin^3 t$. The singular points are (0, a), (0, −a), (a, 0), and (−a, 0).

Fig. B.1 (a) astroid curve, t ∈ [0, 2π]; (b) cycloid curve, t ∈ [−2π, 2π].

Problem 5.2 The curve is a cycloid (Fig. B.1(b)). The singular points are (0, 2πk) for k = 0, ±1, ±2, . . ..

Problem 5.3

\[ \frac{x}{0} = \frac{y}{1} = \frac{z - 1}{0}. \]


Problem 5.4 s = 6a.

Problem 5.5 s = 8a.

Problem 5.6 s = 16a.

Problem 5.7 $s = a\sqrt{2}\sinh t$.

Problem 5.8

\[ k_1 = \frac{1}{4}\sqrt{1 + \sin^2\frac{t}{2}}. \]

Problem 5.9

\[ k_1 = \frac{1}{2a\cosh^2 t}, \qquad k_2 = \frac{1}{2a\cosh^2 t}. \]

Problem 5.10 k2 = a cosh t.

Problem 5.12 Differentiating the relation Q · QT = E with respect to s, we obtain

Q′ · QT + Q · Q′T = 0.

Hence Q′ · QT is an antisymmetric tensor, so it takes the form Q′ = d × Q where d is the corresponding conjugate vector. Thus Q′ · QT = d × E. Now refer to the result of Problem 3.21 (b).

Problem 5.13

\[ \frac{-2\,du\,dv}{\sqrt{u^2 + 1}}. \]

Problem 5.17 2H = trB = − tr ∇n = −∇ · n.

Problem 5.18

0 = ∇ ·E = ∇ · (A + nn) = ∇ · A + (∇ · n)n = ∇ ·A − 2Hn.

Problem 5.20

\[ K = \frac{f_{xx}\,g_{yy}}{\left(1 + f_x^2 + g_y^2\right)^2}. \]


Problem 5.21

\[ K = -\frac{1}{AB}\left[\left(\frac{A_v}{B}\right)_v + \left(\frac{B_u}{A}\right)_u\right], \]

where indices u and v denote partial derivatives with respect to u and v respectively.

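Problem 5.22 Let us consider Stokes' formula from Chapter 4 for a tensor field X and a vector field x:

\[ \oint_\Gamma\boldsymbol{\tau}\cdot\mathbf{X}\,ds = \int_S\mathbf{n}\cdot(\nabla\times\mathbf{X})\,dS, \qquad \oint_\Gamma\boldsymbol{\tau}\cdot\mathbf{x}\,ds = \int_S\mathbf{n}\cdot(\nabla\times\mathbf{x})\,dS, \qquad \text{(B.1)} \]

where we have used the relations $d\mathbf{r} = \boldsymbol{\tau}\,ds$ and $(\mathbf{n}\times\nabla)\cdot\mathbf{X} = \mathbf{n}\cdot(\nabla\times\mathbf{X})$. Using the identities

\[ \mathbf{A} = -\mathbf{n}\times(\mathbf{n}\times\mathbf{A}) \]

and

\[ \mathbf{X} = \mathbf{E}\cdot\mathbf{X} = (\mathbf{A} + \mathbf{n}\mathbf{n})\cdot\mathbf{X} = -\mathbf{n}\times(\mathbf{n}\times\mathbf{A})\cdot\mathbf{X} + \mathbf{n}\mathbf{n}\cdot\mathbf{X}, \]

we represent the tensor X as

\[ \mathbf{X} = \mathbf{n}\times\mathbf{X}_1 + \mathbf{n}\mathbf{x}_2 \qquad \text{(B.2)} \]

where $\mathbf{X}_1 = -\mathbf{n}\times\mathbf{X}$ and $\mathbf{x}_2 = \mathbf{n}\cdot\mathbf{X}$. Note that the surface and spatial gradient operators $\tilde{\nabla}$ and $\nabla$ are related by

\[ \nabla = \tilde{\nabla} + \mathbf{n}\frac{\partial}{\partial z}, \]

where z is the distance coordinate along the normal to S. Then

\[ \mathbf{n}\cdot(\nabla\times\mathbf{X}_1) = \mathbf{n}\cdot(\tilde{\nabla}\times\mathbf{X}_1). \]

Formula (B.1) for $\mathbf{X}_1$ yields

\[ \int_S\mathbf{n}\cdot(\tilde{\nabla}\times\mathbf{X}_1)\,dS = \oint_\Gamma\boldsymbol{\tau}\cdot\mathbf{X}_1\,ds. \qquad \text{(B.3)} \]

In view of the identities $\boldsymbol{\tau} = \boldsymbol{\nu}\times\mathbf{n}$ and $(\boldsymbol{\nu}\times\mathbf{n})\cdot\mathbf{X}_1 = \boldsymbol{\nu}\cdot(\mathbf{n}\times\mathbf{X}_1)$, equation (B.3) reduces to

\[ \int_S\tilde{\nabla}\cdot(\mathbf{n}\times\mathbf{X}_1)\,dS = \oint_\Gamma\boldsymbol{\nu}\cdot(\mathbf{n}\times\mathbf{X}_1)\,ds. \qquad \text{(B.4)} \]

Using the relation

\[ \tilde{\nabla}\cdot(\mathbf{n}\mathbf{x}_2) = (\tilde{\nabla}\cdot\mathbf{n})\,\mathbf{x}_2 = -2H\mathbf{x}_2, \]

from (B.2) and (B.4) we derive the final formula

\[ \int_S\left(\tilde{\nabla}\cdot\mathbf{X} + 2H\,\mathbf{n}\cdot\mathbf{X}\right)dS = \oint_\Gamma\boldsymbol{\nu}\cdot\mathbf{X}\,ds. \]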

Problem 5.23 Similar to Problem 5.22.

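Problem 5.24 Applying (5.58) to the tensor AX, we obtain the first formula

\[ \int_S\left(\tilde{\nabla}\mathbf{X} + 2H\,\mathbf{n}\mathbf{X}\right)dS = \oint_\Gamma\boldsymbol{\nu}\mathbf{X}\,ds. \]

The second one,

\[ \int_S\left(\tilde{\nabla}\times\mathbf{X} + 2H\,\mathbf{n}\times\mathbf{X}\right)dS = \oint_\Gamma\boldsymbol{\nu}\times\mathbf{X}\,ds, \]

follows from this. Substituting the tensor nX into the last formula, we obtain the third formula

\[ \int_S\tilde{\nabla}\times(\mathbf{n}\mathbf{X})\,dS = \oint_\Gamma\boldsymbol{\tau}\mathbf{X}\,ds. \]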

Chapter 6 Exercises

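Exercise 6.5 Start with Hooke's law: $\boldsymbol{\sigma} = \lambda\mathbf{E}\operatorname{tr}\boldsymbol{\varepsilon} + 2\mu\boldsymbol{\varepsilon}$. Apply the trace operation to it and find $\operatorname{tr}\boldsymbol{\varepsilon}$:

\[ \operatorname{tr}\boldsymbol{\varepsilon} = \frac{1}{3\lambda + 2\mu}\operatorname{tr}\boldsymbol{\sigma}. \]

Substituting this into Hooke's law, obtain ε:

\[ \boldsymbol{\varepsilon} = \frac{1}{2\mu}\left[\boldsymbol{\sigma} - \frac{\lambda}{3\lambda + 2\mu}\mathbf{E}\operatorname{tr}\boldsymbol{\sigma}\right] = \frac{1}{2\mu}\left[\boldsymbol{\sigma} - \frac{\nu}{1 + \nu}\mathbf{E}\operatorname{tr}\boldsymbol{\sigma}\right]. \]

Now

\[ W = \frac{1}{2}\boldsymbol{\sigma}\cdot\cdot\,\boldsymbol{\varepsilon} = \frac{1}{4\mu}\left[\boldsymbol{\sigma}\cdot\cdot\,\boldsymbol{\sigma} - \frac{\nu}{1 + \nu}(\operatorname{tr}\boldsymbol{\sigma})^2\right]. \]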
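The inversion of Hooke's law and the resulting energy expression can be verified for a sample strain state. The sketch below assumes illustrative Lamé constants and a symmetric strain matrix, and uses the standard relation ν = λ / (2(λ + μ)); none of the numerical values come from the text.

```python
# Round-trip check of Hooke's law and its inverse, plus the energy formula.

lam, mu = 1.2, 0.8                  # Lame constants (sample values)
nu = lam / (2 * (lam + mu))         # Poisson ratio from the standard relation

# Symmetric sample strain (assumed).
eps = [[0.010, 0.002, -0.001],
       [0.002, -0.004, 0.003],
       [-0.001, 0.003, 0.006]]

def delta(i, j):
    """Components of the unit tensor E."""
    return 1.0 if i == j else 0.0

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(3))

def ddot(P, Q):
    """Double dot product; both arguments are symmetric here, so index order is immaterial."""
    return sum(P[i][j] * Q[i][j] for i in range(3) for j in range(3))

# Forward law: sigma = lam E tr(eps) + 2 mu eps.
sigma = [[lam * trace(eps) * delta(i, j) + 2 * mu * eps[i][j] for j in range(3)]
         for i in range(3)]

# Inverse law from the exercise.
eps_back = [[(sigma[i][j] - lam / (3 * lam + 2 * mu) * trace(sigma) * delta(i, j)) / (2 * mu)
             for j in range(3)] for i in range(3)]

max_error = max(abs(eps_back[i][j] - eps[i][j]) for i in range(3) for j in range(3))

# Energy computed two ways.
W_direct = 0.5 * ddot(sigma, eps)
W_formula = (ddot(sigma, sigma) - nu / (1 + nu) * trace(sigma) ** 2) / (4 * mu)
print(max_error, W_direct, W_formula)
```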

Exercise 6.8 Use d'Alembert's principle, which states that in dynamics the system of all forces, including the inertia forces, is in equilibrium. So in the equilibrium equations, change

\[ f_i \to f_i - \rho\,\frac{d^2u_i}{dt^2}. \]


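Exercise 6.12 Consider A(k) and C in a Cartesian frame. By definition,

\[
\begin{aligned}
\mathbf{A}(\mathbf{k}) &= \mathbf{k}\cdot\mathbf{C}\cdot\mathbf{k} \\
&= k_m\mathbf{i}_m\cdot c_{pqrt}\,\mathbf{i}_p\mathbf{i}_q\mathbf{i}_r\mathbf{i}_t\cdot k_s\mathbf{i}_s \\
&= k_mk_s\,c_{mqrs}\,\mathbf{i}_q\mathbf{i}_r \\
&= k_mk_s\,c_{mrqs}\,\mathbf{i}_q\mathbf{i}_r \\
&= k_mk_s\,c_{mqrs}\,\mathbf{i}_r\mathbf{i}_q \\
&= \mathbf{A}(\mathbf{k})^T.
\end{aligned}
\]

Exercise 6.13 It is sufficient to show that $\mathbf{a}\cdot\mathbf{A}(\mathbf{k})\cdot\mathbf{a} > 0$ for $\mathbf{a} \ne \mathbf{0}$. The tensor C is such that $\boldsymbol{\varepsilon}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\boldsymbol{\varepsilon} > 0$ whenever $\boldsymbol{\varepsilon} \ne \mathbf{0}$. Substitute into this $\boldsymbol{\varepsilon} = \mathbf{k}\mathbf{a}$. Because C is symmetric, the identity $(\mathbf{k}\mathbf{a})\cdot\cdot\,\mathbf{C} = (\mathbf{a}\mathbf{k})\cdot\cdot\,\mathbf{C}$ holds. Thus

\[ \boldsymbol{\varepsilon}\cdot\cdot\,\mathbf{C}\cdot\cdot\,\boldsymbol{\varepsilon} = (\mathbf{k}\mathbf{a})\cdot\cdot\,\mathbf{C}\cdot\cdot\,(\mathbf{k}\mathbf{a}) = \mathbf{a}\cdot(\mathbf{k}\cdot\mathbf{C}\cdot\mathbf{k})\cdot\mathbf{a} = \mathbf{a}\cdot\mathbf{A}(\mathbf{k})\cdot\mathbf{a} > 0. \]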

Exercise 6.14 From Hooke’s law it follows that

tr ε = ε11 + ε22 =σ11 + σ22

λ+ 2µ.

So we have

ε11 =12µσ11 − λ

2µ(λ+ 2µ)(σ11 + σ22),

ε22 =12µσ22 − λ

2µ(λ+ 2µ)(σ11 + σ22),

ε12 =σ12

µ.

Exercise 6.15 By Hooke’s law we get

σ11 = λ tr ε+ 2µε11, σ22 = λ tr ε+ 2µε22, 0 = λ tr ε+ 2µε33,

σ12 = µε12, 0 = µε13, 0 = µε23.

It follows that

ε13 = 0, ε23 = 0, ε12 =σ12

µ, ε33 = − λ

λ+ 2µ(ε11 + ε22).


Substituting ε33 into the first two equations, we obtain

σ11 =2λµλ+ 2µ

(ε11 + ε22) + 2µε11, σ22 =2λµλ+ 2µ

(ε11 + ε22) + 2µε22.

These equations coincide with the equations of the plane deformation prob-lem when we change λ to λ = 2λµ

λ+2µ . Then we have

ε11 + ε22 =σ11 + σ22

λ + 2µ=

λ+ 2µ4µ(λ+ µ)

(σ11 + σ22),

ε11 =12µσ11 − λ

2µ(λ + 2µ)(σ11 + σ22),

ε22 =12µσ22 − λ

2µ(λ + 2µ)(σ11 + σ22).

Chapter 6 Problems

Problem 6.1

(a) On sides AB and CD: σ11 = σ, σ12 = 0; on sides AD and BC: σ22 = 2σ,σ12 = 0.

(b) On sides AB and CD: σ11 = 0, σ12 = −τ ; on sides AD and BC: σ22 = 0,σ12 = −τ .

(c) On sides AB and CD: σ11 = −p, σ12 = 0; on sides AD and BC: σ22 = σ,σ12 = 0.

(d) On sides AB and CD: σ11 = 0, σ12 = τ ; on sides AD and BC: σ22 = −p,σ12 = τ .

Problem 6.2 On AB we have n = −i2. The boundary condition is n ·σ =0. Using the representation

σ = σ11i1i1 + σ12(i1i2 + i2i1) + σ22i2i2,

we get

n · σ = −σ22i2 − σ12i1 = 0.

So the conditions on side AB are: σ22 = 0, σ12 = 0.

On BC we have n = i1. The conditions on side BC are

n · σ = σ11i1 + σ12i2 = 0.


On side AC the task is a bit more difficult. On AC we have

n = − sinαi1 + cosαi2.

The boundary condition on AC is n · σ = 0. Using the representation

σ = σ11i1i1 + σ12(i1i2 + i2i1) + σ22i2i2,

we get

n · σ = (− sinαi1 + cosαi2) · (σ11i1i1 + σ12(i1i2 + i2i1) + σ22i2i2)

= i1(−σ11 sinα+ σ12 cosα) + i2(−σ12 sinα+ σ22 cosα)

= 0,

which yields the needed result.

The final answers are as follows. Side AB:

σ22 = 0, σ12 = 0.

Side BC:

σ11 = −γy, σ12 = 0.

Side AC:

−σ11 sinα+ σ12 cosα = 0, −σ12 sinα+ σ22 cosα = 0.

Problem 6.3 On side BC, the normal is

n = sinβi1 + cosβi2.

Then

n · σ = (sinβi1 + cosβi2) · (σ11i1i1 + σ12(i1i2 + i2i1) + σ22i2i2)

= i1(σ11 sinβ + σ12 cosβ) + i2(σ12 sinβ + σ22 cosβ).

The force on BC is

−p3n = −p3(sinβi1 + cosβi2).

The conditions on side BC follow from the relation n · σ = −p3n.

The normal to AC is

n = − sinαi1 + cosαi2.

The rest is similar to the above solution for BC.

The final answers are as follows. Side AB:

σ22 = −p1, σ12 = 0.


Side BC:

σ11 sinβ + σ12 cosβ = −p3 sinβ,

σ12 sinβ + σ22 cosβ = −p3 cosβ.

Side AC:

−σ11 sinα+ σ12 cosα = p2 sinα,

−σ12 sinα+ σ22 cosα = −p2 cosα.

Problem 6.4

(a) On side AB: σφφ = σ, σφr = 0.
    On a part of circle BC: σrr = −p2, σrφ = 0.
    On a part of circle AC: σrr = −p1, σrφ = 0.
    On side CD: σφφ = σ, σφr = 0.

(b) On side AB: σφφ = −p, σφr = 0.
    On a part of circle BC: σrr = 0, σrφ = −τ2.
    On a part of circle AC: σrr = 0, σrφ = −τ1.
    On side CD: σφφ = −p, σφr = 0.

Problem 6.5

(a) On side ABCD: σ11 = σ, σ12 = 0, σ13 = τ.
    On a part of circle BEFC: σ33 = 2σ, σ32 = 0, σ31 = τ.
    On a part of circle DCFG: σ22 = σ, σ21 = 0, σ23 = 0.

(b) On side ABCD: σ11 = −p1, σ12 = τ, σ13 = 0.
    On a part of circle BEFC: σ33 = −p3, σ32 = 0, σ31 = 0.
    On a part of circle DCFG: σ22 = −p2, σ21 = τ, σ23 = 0.

Problem 6.6 Denote the principal axes by i1, i2, i3. In this Cartesiansystem, the normal to triangle ABC is

n =1√3(i1 + i2 + i3).

The stress tensor is

σ = σ1i1i1 + σ2i2i2 + σ3i3i3.


The required stress vector is

t = n · σ=

1√3(σ1i1 + σ2i2 + σ3i3)

=1√3(50i1 − 50i2 + 75i3).
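The stress vector on the plane can be reproduced in a few lines. The sketch below takes the principal stresses 50, −50 and 75 from the result above and uses the fact that in principal axes the stress matrix is diagonal; the variable names are hypothetical.

```python
import math

# Principal stresses appearing in the answer above.
principal = (50.0, -50.0, 75.0)

# Unit normal to triangle ABC in the principal axes.
n = tuple(1.0 / math.sqrt(3.0) for _ in range(3))

# In principal axes sigma is diagonal, so (n . sigma)_i = n_i * sigma_i (no summation).
t = tuple(n[i] * principal[i] for i in range(3))
print(t)
```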

Problem 6.7

(a) On the upper cylinder face: σzz = σ, σzφ = τ3, σzr = 0.
    On the internal surface part: σrr = 0, σrφ = τ2, σrz = 0.
    On the external lateral surface: σrr = 0, σrφ = τ1, σrz = 0.

(b) On side ABCD: σφφ = 0, σφr = 0, σφz = τ1.
    On a part of ring BCEH: σzz = −p, σzφ = 0, σzr = 0.
    On a part of cylinder CEFD: σrr = 0, σrφ = τ3, σrz = 0.
    On side FEHG: σφφ = 0, σφr = 0, σφz = −τ2.

(c) On side ABCD: σφφ = 0, σφr = τ1, σφz = 0.
    On a part of ring BCEH: σzz = σ, σzφ = 0, σzr = 0.
    On a part of cylinder CEFD: σrr = 0, σrφ = 0, σrz = −τ3.
    On side FEHG: σφφ = 0, σφr = −τ2, σφz = 0.

Problem 6.8

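(a) $\boldsymbol{\varepsilon} = \tfrac{1}{2}\gamma(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1)$;
(b) $\boldsymbol{\varepsilon} = \lambda_1\mathbf{i}_1\mathbf{i}_1$;
(c) $\boldsymbol{\varepsilon} = \lambda\mathbf{E}$;
(d) $\boldsymbol{\varepsilon} = u'(r)\,\mathbf{e}_r\mathbf{e}_r + \dfrac{u(r)}{r}\,\mathbf{e}_\phi\mathbf{e}_\phi + k\,\mathbf{e}_z\mathbf{e}_z$;
(e) $\boldsymbol{\varepsilon} = \left(u'(r) - \dfrac{u(r)}{r}\right)(\mathbf{e}_r\mathbf{e}_\phi + \mathbf{e}_\phi\mathbf{e}_r) + k\,\mathbf{e}_z\mathbf{e}_z$;
(f) $\boldsymbol{\varepsilon} = u'(r)(\mathbf{e}_r\mathbf{e}_z + \mathbf{e}_z\mathbf{e}_r) + k\,\mathbf{e}_z\mathbf{e}_z$;
(g) $\boldsymbol{\varepsilon} = u'(r)\,\mathbf{e}_r\mathbf{e}_r + \dfrac{u(r)}{r}\,\mathbf{e}_\phi\mathbf{e}_\phi + \dfrac{u(r)}{r}\,\mathbf{e}_\theta\mathbf{e}_\theta$.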

Chapter 7 Exercises

Exercise 7.11 Hooke’s law is given by the formula

σ = λE tr ε+ 2µε.


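Dot-multiply this by n from the left and the right. From $\sigma_{33} = 0$ it follows that

\[ \lambda\operatorname{tr}\boldsymbol{\varepsilon} + 2\mu\varepsilon_{33} = 0. \]

Because $\operatorname{tr}\boldsymbol{\varepsilon} = \operatorname{tr}\tilde{\boldsymbol{\varepsilon}} + \varepsilon_{33}$, where $\operatorname{tr}\tilde{\boldsymbol{\varepsilon}} = \varepsilon_{11} + \varepsilon_{22}$, we get

\[ \varepsilon_{33} = -\frac{\lambda}{\lambda + 2\mu}\operatorname{tr}\tilde{\boldsymbol{\varepsilon}}. \]

So

\[ \operatorname{tr}\boldsymbol{\varepsilon} = \frac{2\mu}{\lambda + 2\mu}\operatorname{tr}\tilde{\boldsymbol{\varepsilon}}. \]

It follows that

\[ \boldsymbol{\sigma} = 2\mu\left[\tilde{\boldsymbol{\varepsilon}} + \frac{\lambda}{\lambda + 2\mu}\mathbf{E}\operatorname{tr}\tilde{\boldsymbol{\varepsilon}}\right] \equiv \frac{E}{1 + \nu}\left[\tilde{\boldsymbol{\varepsilon}} + \frac{\nu}{1 - \nu}\mathbf{E}\operatorname{tr}\tilde{\boldsymbol{\varepsilon}}\right]. \]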

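Exercise 7.12 In Cartesian coordinates we get

\[ \boldsymbol{\varepsilon}\cdot\cdot\,\boldsymbol{\varepsilon} = \varepsilon_{11}^2 + \varepsilon_{22}^2 + 2\varepsilon_{12}^2, \qquad \operatorname{tr}^2\boldsymbol{\varepsilon} = (\varepsilon_{11} + \varepsilon_{22})^2 = \varepsilon_{11}^2 + \varepsilon_{22}^2 + 2\varepsilon_{11}\varepsilon_{22}. \]

So,

\[
\begin{aligned}
\boldsymbol{\varepsilon}\cdot\cdot\,\boldsymbol{\varepsilon} + \frac{\nu}{1 - \nu}\operatorname{tr}^2\boldsymbol{\varepsilon}
&= \left(1 + \frac{\nu}{1 - \nu}\right)\varepsilon_{11}^2 + \left(1 + \frac{\nu}{1 - \nu}\right)\varepsilon_{22}^2 + \frac{2\nu}{1 - \nu}\,\varepsilon_{11}\varepsilon_{22} + 2\varepsilon_{12}^2 \\
&= \frac{1}{1 - \nu}\left[\varepsilon_{11}^2 + \varepsilon_{22}^2 + 2\nu\,\varepsilon_{11}\varepsilon_{22}\right] + 2\varepsilon_{12}^2.
\end{aligned}
\]

If $|\nu| < 1$ and $\varepsilon_{11} \ne 0$, $\varepsilon_{22} \ne 0$ then $\varepsilon_{11}^2 + \varepsilon_{22}^2 + 2\nu\,\varepsilon_{11}\varepsilon_{22} > 0$. This proves the required positivity.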

Exercise 7.13 Doubly dot-multiplying T by ε, we get

T ·· ε = (T · A + (T · n)n) ·· ε = (T ·A) ·· ε = Tαβεαβ .

Substituting (7.30) into this, we get

T ·· ε =Eh

1 + ν

[ε+

ν

1 − νA tr ε

]·· ε =

Eh

1 + ν

[ε ·· ε+

ν

1 − νtr2 ε

].

Using the equality

M = −Mαβραρβ × n,


we find that

Mαβ = −ρα · M · (ρβ × n).

Then

Mαβæαβ = −ρα · M · (ρβ × n)æαβ = −ρα ·M · (ρβ × n)ρα · æ · ρβ

= −M ·· [(ρβ × n)ραρα · æ · ρβ

]= M ·· [(n × ρβ)æ · ρβ

]= M ·· [(n× ρβ)ρβ · æ]

= M ·· (n× æ)

= (M × n) ··æ.In addition, we obtain the equality

M ·· (n × æ) =Eh3

12(1 + ν)

[æ ··æ +

ν

1 − νtr2 æ

].

Exercise 7.15 tr ∇v = tr(∇v)T = ∇ · v and tr ∇∇w = ∇ · ∇w. It follows that tr ε = ∇ · v − z∇ · ∇w.

Exercise 7.16 Let the equation of the bent plate surface be $z = w(x_1, x_2)$. In vector form, the surface is given by

$$\mathbf{r} = x_1\mathbf{i}_1 + x_2\mathbf{i}_2 + \mathbf{n}\,w(x_1, x_2).$$

With slightly different notation, this was analyzed in Exercise 5.33; it was shown that

$$L = \frac{w_{xx}}{\sqrt{1 + w_x^2 + w_y^2}}, \qquad M = \frac{w_{xy}}{\sqrt{1 + w_x^2 + w_y^2}}, \qquad N = \frac{w_{yy}}{\sqrt{1 + w_x^2 + w_y^2}}.$$

We recall that $L = b_{11}$, $M = b_{12}$, $N = b_{22}$ are the components of the curvature tensor B. If w and its derivatives are small, we get

$$\mathbf{B} = w_{xx}\mathbf{i}_1\mathbf{i}_1 + w_{xy}(\mathbf{i}_1\mathbf{i}_2 + \mathbf{i}_2\mathbf{i}_1) + w_{yy}\mathbf{i}_2\mathbf{i}_2 = -\text{æ}.$$

In other words, the second fundamental tensor of the bent plate surface coincides with æ up to a difference in algebraic sign.
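The small-deflection approximation used here, dropping the square root in L, M, N, can be illustrated numerically. The sketch below is an assumption-laden example, not the book's: the deflection w = a sin x cos y with small amplitude a is a hypothetical choice, and the function names are invented for illustration.

```python
import math


def curvature_coefficients(wx, wy, wxx, wxy, wyy):
    """Exact L, M, N of the surface z = w(x, y) from its partial derivatives."""
    root = math.sqrt(1.0 + wx**2 + wy**2)
    return wxx / root, wxy / root, wyy / root


def sample_deflection_derivatives(a, x, y):
    """Partial derivatives of the hypothetical deflection w = a*sin(x)*cos(y)."""
    wx = a * math.cos(x) * math.cos(y)
    wy = -a * math.sin(x) * math.sin(y)
    wxx = -a * math.sin(x) * math.cos(y)
    wxy = -a * math.cos(x) * math.sin(y)
    wyy = -a * math.sin(x) * math.cos(y)
    return wx, wy, wxx, wxy, wyy


def relative_linearization_error(a, x, y):
    """Relative difference between exact L and its linearization w_xx."""
    wx, wy, wxx, wxy, wyy = sample_deflection_derivatives(a, x, y)
    L, _, _ = curvature_coefficients(wx, wy, wxx, wxy, wyy)
    return abs(L - wxx) / abs(wxx)


if __name__ == "__main__":
    for a in (1e-1, 1e-2, 1e-3):
        print(a, relative_linearization_error(a, 0.4, 0.7))
```

The relative error behaves like (w_x² + w_y²)/2, so it shrinks quadratically with the amplitude, which is the sense in which B reduces to the matrix of second derivatives of w.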

Exercise 7.18 Modify the proof in three-dimensional elasticity, changing σ to T · A, and u to v.


