A. Elementary Tensor Analysis
This appendix is intended to provide a survey of the mathematical background needed for a modern development of continuum mechanics. The reader is expected to be familiar with some notions of vector spaces or matrix algebra. In the first part we shall review some basic notions of vector spaces and linear transformations. At the same time, elementary properties of tensors as well as tensor notations will be introduced.
A.1 Linear Algebra
We shall consider finite dimensional real vector spaces only. The field of real numbers is denoted by IR.
Definition. A vector space V is a set equipped with two operations:
1) v + u ∈ V, called the addition of v and u in V,
2) αv ∈ V, called the scalar multiplication of v ∈ V by α ∈ IR,
which satisfy the following rules: for any v, u, w ∈ V and any α, β ∈ IR,
1) v + u = u + v.
2) v + (u + w) = (v + u) + w.
3) There exists a null vector 0 ∈ V such that v + 0 = v.
4) For any v ∈ V, there exists −v ∈ V such that v + (−v) = 0.
5) α(βv) = (αβ)v.
6) (α + β)v = αv + βv.
7) α(v + u) = αv + αu.
8) 1v = v.
Definition. A set of vectors {v_1, ···, v_n} is said to be a basis of V if
1) it is a linearly independent set, i.e., for any a_1, ···, a_n ∈ IR, if a_1 v_1 + ··· + a_n v_n = 0 then a_1 = ··· = a_n = 0;
2) it spans the space V, i.e., any u ∈ V can be expressed as a linear combination of {v_1, ···, v_n}.
Let {e_1, ···, e_n} be a basis of V. Then any vector u ∈ V can be expressed as
\[ u = \sum_{i=1}^{n} u^i e_i, \]
where u^i, called the components of u, are uniquely determined relative to the basis {e_i}.
A vector space can have many different bases, but all of them have the same number of elements. The number of elements in a basis is called the dimension of the space; in this case, we have dim V = n.
A.1.1 Inner Product
We may think of a vector as a geometric object which has a length and points in a certain direction. To incorporate this notion we introduce an additional structure, the inner product, into the vector space.
Definition. An inner product is a map
g : V × V → IR
with the following properties: for any u, v, w ∈ V and α ∈ IR,
1) g(u + αv, w) = g(u, w) + αg(v, w),
2) g(u, v) = g(v, u),
3) g(u, u) > 0, if u ≠ 0.
That is, an inner product is a positive-definite symmetric bilinear function on V. We call g(u, v) the inner product of u and v. A vector space equipped with an inner product is called an inner product space. Hereafter, all vector spaces considered are always inner product spaces.
Notation. g(u,v) = u · v, if g is given and fixed.
Definition. The norm of a vector v ∈ V is defined as
\[ |v| = \sqrt{v \cdot v}. \]
A vector space equipped with such a norm is called a Euclidean vector space.
The notion of the angle between two vectors can be defined based on the following Schwarz inequality:
\[ |u \cdot v| \leq |u|\,|v|. \tag{A.1} \]
Definition. For any non-zero u, v ∈ V, the angle between u and v, θ(u, v) ∈ [0, π], is defined by
\[ \cos\theta(u,v) = \frac{u \cdot v}{|u|\,|v|}. \]
The vectors u and v are said to be orthogonal if θ(u, v) = π/2. Obviously, u and v are orthogonal if and only if u · v = 0.
A vector v is called a unit vector if |v| = 1. The projection of a vector u on the vector v can be defined as |u| cos θ(u, v), or as (u · e), where e = v/|v| is the unit vector in the direction of v. The vector (u · e)e is called the projection vector of u in the direction of v.
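For vectors in IR^n with the standard inner product, the angle and the projection vector can be computed directly from these formulas. A minimal numerical sketch (the helper names are ours, not from the text):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    # |v| = sqrt(v · v)
    return math.sqrt(dot(v, v))

def angle(u, v):
    # cos θ(u,v) = u·v / (|u||v|), with θ(u,v) ∈ [0, π]
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

def projection_vector(u, v):
    # (u·e) e, where e = v/|v| is the unit vector in the direction of v
    e = [x / norm(v) for x in v]
    c = dot(u, e)
    return [c * x for x in e]

u, v = (1.0, 1.0), (2.0, 0.0)
print(angle(u, v))              # ≈ π/4
print(projection_vector(u, v))  # [1.0, 0.0]
```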
Let {e_i, i = 1, ···, n} be a basis of V. Denote the inner product of e_i and e_j by g_{ij},
\[ g_{ij} = e_i \cdot e_j. \]
Clearly, g_{ij} is symmetric, g_{ij} = g_{ji}. Let u = u^i e_i and v = v^j e_j be arbitrary vectors in V expressed in terms of the basis {e_i}. Then
\[ u \cdot v = (u^i e_i) \cdot (v^j e_j) = u^i v^j (e_i \cdot e_j) = u^i v^j g_{ij}, \]
or
\[ u \cdot v = g_{ij}\, u^i v^j. \tag{A.2} \]
Here we have used the following summation convention.
Notation. (Summation convention) In the expression of a term, if an index is repeated once (and only once), a summation over the range of this index is assumed.
For example,
\[ u^i e_i = \sum_{i=1}^{n} u^i e_i, \qquad g_{ij}\, u^i v^j = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, u^i v^j. \]
Note that in these expressions, we purposely write the indices at two different levels, so that the repeated summation indices always consist of one superindex and one subindex. The reason for doing so will become clear in the next section.
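In a concrete basis the summation convention is just shorthand for explicit sums; the following sketch (the function name and the sample metric are ours) spells out g_{ij} u^i v^j as the double sum it abbreviates:

```python
def inner(g, u, v):
    # u·v = g_ij u^i v^j: sum over both repeated index pairs
    n = len(u)
    return sum(g[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

# metric of a (hypothetical) basis e1 = (1,0), e2 = (1,1): g_ij = e_i · e_j
g = [[1.0, 1.0],
     [1.0, 2.0]]
print(inner(g, [1.0, 2.0], [3.0, -1.0]))  # prints 4.0
```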
A.1.2 Dual Bases
Let {e_1, ···, e_n} be a basis of V. There exists a non-zero vector orthogonal to the plane spanned by the n − 1 vectors {e_2, ···, e_n}, and if in addition the projection of this vector on e_1 is prescribed, then this vector is uniquely determined. In this manner, for any given basis {e_1, ···, e_n}, we can construct a set of vectors {e^1, ···, e^n} such that
\[ e^i \cdot e_j = \delta^i_j, \]
where δ^i_j is the Kronecker delta defined by
\[ \delta^i_j = \begin{cases} 0, & \text{if } i \neq j, \\ 1, & \text{if } i = j. \end{cases} \]
From this construction, if v = v^i e_i is a vector in V, then by taking the inner product with e^i we have
\[ e^i \cdot v = e^i \cdot (v^j e_j) = v^j \delta^i_j = v^i. \]
Hence the ith component of v relative to the basis {e_1, ···, e_n} is its inner product with the vector e^i. Therefore this set of vectors {e^i} associated with the basis {e_i} can be regarded as a set of linear functions which map a vector to its components.¹
We can easily show that this new set of vectors is a linearly independent set. Indeed, if a_j e^j = 0 for some linear combination, then (a_j e^j) · e_i = a_j δ^j_i = a_i = 0 for all i. Furthermore, it also spans the space V, for if u = u^i e_i is a vector in V, then for any vector v = v^i e_i, from (A.2) and v^j = e^j · v,
\[ u \cdot v = g_{ij}\, u^i v^j = (g_{ij}\, u^i e^j) \cdot v, \]
which implies that u can be expressed as u = u_i e^i with
\[ u_i = g_{ij}\, u^j. \]
Therefore we have proved that this new set of vectors {e^i} is also a basis of V.
Definition. Let β = {e_i} and β* = {e^i} be two bases of V related by the property
\[ e^i \cdot e_j = \delta^i_j. \]
They are said to be a pair of dual bases for V, or β* is the dual basis of β.
The dual bases are uniquely determined from each other. For this reason, we have used the same notation for their elements, except for the different level of indices to distinguish them. Clearly, if u is a vector in V, then we can express u in terms of components in two different ways relative to the dual bases,
\[ u = u^i e_i = u_j e^j, \]
where we have also employed different levels of component indices in order to be consistent with our summation convention, which sums over repeated indices at different levels. We call
u^i the ith contravariant component of u,
u_j the jth covariant component of u.
From the definition, it follows that
\[ u^i = e^i \cdot u, \qquad u_j = e_j \cdot u, \tag{A.3} \]
and they are related by
\[ u_i = g_{ij}\, u^j, \qquad u^i = g^{ij}\, u_j, \]
¹ In general, the space of all linear functions on V is called the dual space of V and denoted by V*. In this note, for simplicity, we shall not distinguish vectors in V* and V through the inner product.
where we have denoted
\[ g^{ij} = e^i \cdot e^j. \]
The two operations
\[ g_{ij} : u^j \mapsto u_i, \qquad g^{ij} : u_j \mapsto u^i, \]
enable us to lower and raise the component index. One can also show that
\[ e_j = g_{ij}\, e^i, \qquad e^i = g^{ij}\, e_j. \]
Therefore, lowering or raising the index for dual bases can be done in the same manner. It is easy to verify that [g^{ij}] is the inverse of the matrix [g_{ij}], or
\[ g^{ij} g_{jk} = \delta^i_k. \]
A basis {e_i} is called an orthogonal basis if all the elements of the basis are mutually orthogonal, i.e.,
\[ e_i \cdot e_j = 0 \quad \text{if } i \neq j. \]
If in addition |e_i| = 1 for all i, it is called an orthonormal basis. Although in general we carefully do our bookkeeping of super- and sub-indices, this becomes unnecessary if β = {e_i} is an orthonormal basis, since then g_{ij} = δ_{ij} and
\[ e^i = g^{ij}\, e_j = \delta_{ij}\, e_j = e_i. \]
That is, the basis β is identical to its dual basis β*. Hence we do not have to distinguish contravariant and covariant components. In this case, we can write all the indices at the same level, for example,
\[ v = v_i e_i. \]
Of course, according to our summation convention, we still sum over the repeated indices (now at the same level) in this situation.
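In IR^n with the standard inner product, the dual basis can be computed by matrix inversion: if the rows of a matrix E are the basis vectors e_i, then the rows of (E^{-1})^T are the dual vectors e^i, since this makes e^i · e_j = δ^i_j. A small 2×2 sketch (helper name and the sample basis are ours):

```python
def dual_basis_2d(e1, e2):
    # rows of (E^{-1})^T, where E has rows e1, e2
    a, b = e1
    c, d = e2
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]       # E^{-1}
    # columns of E^{-1} are the rows of (E^{-1})^T
    return [inv[0][0], inv[1][0]], [inv[0][1], inv[1][1]]

e1, e2 = (1.0, 1.0), (0.0, 2.0)
d1, d2 = dual_basis_2d(e1, e2)
print(d1, d2)   # the dual vectors e^1, e^2; check e^i · e_j = δ^i_j
```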
Exercise A.1.1 Let β′ = {e_1 = (1, 0), e_2 = (2, 1)} be a basis of IR², and v = (1, −1) be a vector in IR².
1) Find the dual basis {e^1, e^2} of β′.
2) Determine the matrix representations [g_{ij}] and [g^{ij}] relative to β′.
3) Determine the contravariant and covariant components of v relative to the bases and make a graphic representation of the results.
A.1.3 Tensor Product
The notion of a matrix is related to linear functions on vector spaces. Let U and V be two vector spaces with inner product. A function T : U → V is called a linear transformation from U to V if, for any u, v ∈ U and α ∈ IR,
\[ T(u + \alpha v) = T(u) + \alpha T(v). \]
Notation. L(U, V ) = {T : U → V | T is linear}.
If T and S are two linear transformations in L(U, V), we can define the addition T + S and the scalar multiplication αT as transformations in L(U, V) in the following manner: for all v ∈ U,
(T + S)(v) = T (v) + S(v),
(αT )(v) = αT (v).
With these operations the set L(U, V ) becomes a vector space.
Definition. For any vectors v ∈ V and u ∈ U, the tensor product of v and u, denoted by v ⊗ u, is defined as the linear transformation from U to V such that
\[ (v \otimes u)(w) = (u \cdot w)\, v, \tag{A.4} \]
for any w ∈ U.
The tensor product of two vectors is a linear transformation. We call such a linear transformation a simple tensor. Of course, not every linear transformation can be obtained as a tensor product of two vectors. However, we can show that every linear transformation can indeed be expressed as a linear combination of simple tensors.
Proposition. Let {e_i}, i = 1, ···, n, and {d_α}, α = 1, ···, m, be bases of V and U respectively. Then the set {e_i ⊗ d^α}, i = 1, ···, n, α = 1, ···, m, forms a basis of L(U, V).
Proof: Let {e^i} be the dual basis of {e_i} and {d^α} the dual of {d_α}. If a^i{}_α e_i ⊗ d^α = 0, then
\[ a^i{}_\alpha (e_i \otimes d^\alpha)(d_\beta) = a^i{}_\alpha (d^\alpha \cdot d_\beta)\, e_i = a^i{}_\alpha \delta^\alpha_\beta\, e_i = a^i{}_\beta\, e_i = 0, \]
which implies that a^i{}_β = 0 since {e_i} is a basis. Therefore, {e_i ⊗ d^α} is a linearly independent set. Moreover, for any T ∈ L(U, V), let
\[ e^i \cdot T(d_\alpha) = T^i{}_\alpha. \]
Then for any v ∈ V and any u ∈ U,
\[ v \cdot T(u) = v_i\, e^i \cdot T(u^\alpha d_\alpha) = v_i u^\alpha\, e^i \cdot T(d_\alpha) = T^i{}_\alpha\, v_i u^\alpha. \]
On the other hand,
\[ v \cdot (e_i \otimes d^\alpha)(u) = v_j\, e^j \cdot (e_i \otimes d^\alpha)(u^\beta d_\beta) = v_j u^\beta (e^j \cdot e_i)(d^\alpha \cdot d_\beta) = v_i u^\alpha. \]
Therefore, we have
\[ v \cdot T(u) = T^i{}_\alpha\, v \cdot (e_i \otimes d^\alpha)(u), \]
for any v and any u, which leads to
\[ T = T^i{}_\alpha\, e_i \otimes d^\alpha. \]
That is, {e_i ⊗ d^α} spans the space L(U, V). □
We may call L(U, V) the tensor product space of V and U and denote it by V ⊗ U. Obviously, from this result, we have
\[ \dim(V \otimes U) = (\dim V)(\dim U). \]
The basis {e_i ⊗ d^α} is called a product basis of V ⊗ U. Similarly, the sets {e_i ⊗ d_α}, {e^i ⊗ d^α}, and {e^i ⊗ d_α} are also product bases of V ⊗ U.
Notation. V ⊗ V = L(V ) = L(V, V ).
We shall call the linear transformations in L(V) second order tensors. Let {e_i} and {e^j} be dual bases of V. A second order tensor T then has different component forms relative to the different product bases,
\[ T = T^{ij}\, e_i \otimes e_j = T^i{}_j\, e_i \otimes e^j = T_i{}^j\, e^i \otimes e_j = T_{ij}\, e^i \otimes e^j, \]
where the various components are given by
\[ T^{ij} = e^i \cdot T e^j, \quad T_{ij} = e_i \cdot T e_j, \quad T^i{}_j = e^i \cdot T e_j, \quad T_i{}^j = e_i \cdot T e^j. \tag{A.5} \]
They are related by
\[ T^i{}_j = T^{ik} g_{kj} = g^{ik} T_{kj}, \quad \text{etc.}, \tag{A.6} \]
with the operations of raising or lowering the indices discussed in the previous section.
The matrices [T^{ij}], [T_{ij}], [T_i{}^j], [T^i{}_j] are called the matrix representations of T relative to the corresponding product bases. Note that the first index refers to the row and the second index refers to the column of the matrix. It is important to distinguish the level as well as the position order of the component indices. In general T^i{}_j ≠ T_j{}^i, therefore it may cause some confusion
8 A. Elementary Tensor Analysis
to write the components with i and j at the same position, one on top of the other. The relation (A.6) can be written in terms of matrix multiplication, in which the columns of the first matrix are summed against the rows of the second matrix,
\[ [T^i{}_j] = [T^{ik}][g_{kj}] = [g^{ik}][T_{kj}]. \]
These components are called the associated components of the second order tensor T. In classical tensor analysis, they are also called
T^{ij} the contravariant tensor of order 2,
T_{ij} the covariant tensor of order 2,
T^i{}_j, T_i{}^j the mixed tensors of order 2.
Note that if S, T ∈ L(V), then the composition S ∘ T, defined as S ∘ T(v) = S(T(v)) for all v ∈ V, is also in L(V). The composition S ∘ T will be more conveniently denoted by ST. In terms of components and matrix operations, we have
\[ [(ST)^i{}_j] = [S^i{}_k T^k{}_j] = [S^i{}_k][T^k{}_j]. \]
Example A.1.1 The identity transformation, 1v = v for any v in V, has the components
\[ 1 = \delta^i{}_j\, e_i \otimes e^j = \delta_i{}^j\, e^i \otimes e_j = g_{ij}\, e^i \otimes e^j = g^{ij}\, e_i \otimes e_j, \tag{A.7} \]
since by (A.5), we have
\[ 1^i{}_j = e^i \cdot 1 e_j = e^i \cdot e_j = \delta^i_j, \qquad 1_{ij} = e_i \cdot 1 e_j = e_i \cdot e_j = g_{ij}. \]
Therefore, the Kronecker deltas are the mixed components of the identity tensor, while g_{ij} and g^{ij} are just its covariant and contravariant components. □
Example A.1.2 For v = v^i e_i and u = u^i e_i in V, their tensor product has the component form
\[ v \otimes u = v^i u^j\, e_i \otimes e_j. \]
Let v = (v_1, v_2) and u = (u_1, u_2) be two vectors in IR². Then relative to the standard basis of IR², the matrix of v ⊗ u is given by
\[ [v \otimes u] = \begin{bmatrix} v_1 u_1 & v_1 u_2 \\ v_2 u_1 & v_2 u_2 \end{bmatrix}. \]
This product is sometimes referred to as the dyadic product of the vectors v and u. □
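A quick numerical illustration of (A.4) in IR²: the matrix of v ⊗ u is the outer product [v_i u_j], and applying it to a vector w gives (u · w)v. A sketch with made-up numbers (helper names ours):

```python
def outer(v, u):
    # matrix of the simple tensor v ⊗ u: entries v_i u_j
    return [[vi * uj for uj in u] for vi in v]

def apply(A, w):
    return [sum(row[j] * w[j] for j in range(len(w))) for row in A]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

v, u, w = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
A = outer(v, u)
print(apply(A, w))                    # prints [39.0, 78.0]
print([dot(u, w) * vi for vi in v])   # (u·w)v, also [39.0, 78.0]
```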
In general, the tensor products v ⊗ u and u ⊗ v belong to two different spaces, namely V ⊗ U and U ⊗ V respectively. Even in the case V = U, by definition, v ⊗ u and u ⊗ v are different, i.e., the tensor product is not symmetric.
Definition. For A ∈ V ⊗ U, the transpose of A, denoted by A^T, is defined as the tensor in U ⊗ V such that
\[ v \cdot A u = u \cdot A^T v, \tag{A.8} \]
for any v ∈ V and any u ∈ U.
Example A.1.3 For simple tensors, it follows that
\[ (v \otimes u)^T = u \otimes v, \]
because for any w_1, w_2 ∈ V, we have
\[ w_1 \cdot (v \otimes u)^T w_2 = w_2 \cdot (v \otimes u) w_1 = (w_2 \cdot v)(u \cdot w_1) = w_1 \cdot (u \otimes v) w_2. \]
□
Example A.1.4 We have
\[ A(u \otimes v) = (Au) \otimes v, \qquad (u \otimes v)A = u \otimes (A^T v). \]
Indeed, for any vector w ∈ V, we obtain
\[ A(u \otimes v)w = (v \cdot w)\, Au = ((Au) \otimes v)\, w, \]
and
\[ (u \otimes v)Aw = (v \cdot Aw)\, u = (A^T v \cdot w)\, u = (u \otimes (A^T v))\, w. \]
□
If A is a second order tensor in L(V), then the components of the transpose A^T satisfy the following relations:
\[ (A^T)^{ij} = A^{ji}, \qquad (A^T)_{ij} = A_{ji}, \qquad (A^T)^i{}_j = A_j{}^i, \qquad (A^T)_i{}^j = A^j{}_i. \tag{A.9} \]
We see from these relations that for contravariant or covariant components the matrix of A^T is simply the transpose of the matrix of A. However, by the second group of relations in (A.9) for mixed components, this is not valid in general, since the matrix transpose of [A^i{}_j], obtained by exchanging rows and columns, is [A^j{}_i] instead of [A_j{}^i].
A tensor S ∈ L(V) is called symmetric if S^T = S, and skew-symmetric if S^T = −S. In other words, S is symmetric if v · Su = u · Sv, and S is skew-symmetric if v · Su = −u · Sv, for all u, v ∈ V.
Notation. Sym(V) = {S ∈ L(V) | S^T = S} and Skw(V) = {S ∈ L(V) | S^T = −S}.
Note that both Sym(V) and Skw(V) are subspaces of L(V). If S ∈ Sym(V), then its components satisfy
\[ S^{ij} = S^{ji}, \qquad S_{ij} = S_{ji}, \qquad S^i{}_j = S_j{}^i = g_{jk}\, g^{im} S^k{}_m. \]
In terms of matrix representations we have
\[ [S^{ij}] = [S^{ij}]^T, \qquad [S_{ij}] = [S_{ij}]^T. \]
Note that although S is symmetric, the matrix [S^i{}_j] is not symmetric in general,
\[ [S^i{}_j] \neq [S^i{}_j]^T. \]
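This last point is easy to check numerically: take a symmetric covariant matrix [S_{ij}] and a non-orthonormal metric; the mixed matrix [S^i{}_j] = [g^{ik}][S_{kj}] generally fails to be symmetric. A sketch in IR² (all numbers are made-up assumptions):

```python
def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

# symmetric covariant components: [S_ij] = [S_ij]^T
S_cov = [[1.0, 2.0],
         [2.0, 5.0]]
# inverse metric g^{ij} of some non-orthonormal basis (symmetric, positive-definite)
g_inv = [[2.0, -1.0],
         [-1.0, 1.0]]
S_mixed = matmul(g_inv, S_cov)   # [S^i_j] = [g^{ik}][S_{kj}]
print(S_mixed)                   # [[0.0, -1.0], [1.0, 3.0]] — not symmetric
```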
A second order tensor can also be regarded as a bilinear function in the following manner: for any A ∈ L(V), define the function on V × V, also denoted by A,
\[ A(u, v) = u \cdot A v, \]
for any vectors u and v in V. In particular, for simple tensors, we have
\[ (u \otimes v)(u', v') = u' \cdot (u \otimes v)\, v' = (u \cdot u')(v \cdot v'). \]
By employing the notion of multilinear functions, we can generalize tensor products to higher orders. For example, we can define the tensor product of three vectors u, v, and w as a trilinear function on V by
\[ (u \otimes v \otimes w)(u', v', w') = (u \cdot u')(v \cdot v')(w \cdot w') \]
for any vectors u', v', and w' in V. One can show, as before, that if {e_i} is a basis of V, then {e_i ⊗ e_j ⊗ e_k} is a product basis for the space of all trilinear functions on V. We shall denote this space by V ⊗ V ⊗ V and call it the space of third order tensors. If S is a third order tensor, then
\[ S = S^{ijk}\, e_i \otimes e_j \otimes e_k = S_{ijk}\, e^i \otimes e^j \otimes e^k = \text{etc.} \]
There are several different component forms relative to the different product bases. In a similar manner, tensor products of higher orders can be defined. We write
\[ \bigotimes^{k} V = \underbrace{V \otimes \cdots \otimes V}_{k \text{ times}} \]
for tensors of order k. Clearly, \dim \bigotimes^k V = (\dim V)^k.
Exercise A.1.2 Let β′ = {e_1 = (1, 0), e_2 = (2, 1)} be a basis of IR², and let T ∈ L(IR²) be defined by
\[ T(x_1, x_2) = (3x_1 + x_2,\; x_1 + 2x_2), \tag{A.10} \]
for all (x_1, x_2) ∈ IR².
1) Show that T is a symmetric transformation.
2) Determine the matrices of the associated components of T relative to β′: [T_{ij}], [T^{ij}], [T_i{}^j], [T^i{}_j].
Note that the last two matrices are not symmetric.
A.1.4 Transformation Rules for Components
The components of a tensor relative to a basis are uniquely determined, and their values depend on the basis. Therefore, if we make a change of basis, they must change accordingly. In this section, we shall establish the transformation rules for components of tensors under a change of basis.
Consider a change of basis from β = {e_i} to another basis β̄ = {ē_i} given by
\[ \bar e_k = M^j{}_k\, e_j. \tag{A.11} \]
We call M^j{}_k the transformation matrix for the change of basis from β to β̄.
By the use of (A.3), we have
\[ M^j{}_k = \bar e_k \cdot e^j, \]
from which we can also obtain the relation between the dual bases β* = {e^i} and β̄* = {ē^i},
\[ e^j = M^j{}_k\, \bar e^k. \tag{A.12} \]
The above two transformation relations (A.11) and (A.12) can be schematically represented by
\[ \beta \;\xrightarrow{\;M\;}\; \bar\beta, \qquad \beta^* \;\xleftarrow{\;M^T\;}\; \bar\beta^*. \]
In other words, if M changes a basis β to another basis β̄, the corresponding dual bases β* and β̄* are changed in the opposite direction through M^T.
The components of a vector transform in a similar manner. Indeed, let v be a vector in V, and
\[ v = v^i e_i = \bar v^i \bar e_i = v_j e^j = \bar v_j \bar e^j. \]
One can easily verify that the transformation rules for the components are
\[ \bar v_k = M^j{}_k\, v_j, \qquad v^j = M^j{}_k\, \bar v^k, \tag{A.13} \]
which look exactly like the ones for the change of basis (A.11) and (A.12). In matrix notation, we have
\[ [\bar v_k] = [M^j{}_k]\,[v_j], \qquad [v^j] = [M^j{}_k]^T\,[\bar v^k], \]
or schematically
\[ [v_i] \;\xrightarrow{\;M\;}\; [\bar v_i], \qquad [v^i] \;\xleftarrow{\;M^T\;}\; [\bar v^i]. \]
That is, the covariant components transform in the same direction as the change of basis by M, while the contravariant components transform in the opposite direction by M^T. This is the reason why such components are called co- and contravariant in classical tensor analysis, in which tensors are defined through their transformation properties.
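The contravariant rule v^j = M^j{}_k v̄^k can be checked numerically: inverting it gives the new components from the old, and applying M recovers the originals. A sketch in IR² with a made-up M (helper names ours), writing M^j{}_k with j as the row index:

```python
def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# change of basis: e_bar_k = M^j_k e_j
M = [[1.0, 2.0],
     [0.0, 1.0]]
v = [3.0, 4.0]                # contravariant components v^j in the old basis
v_bar = matvec(inv2(M), v)    # v^j = M^j_k v̄^k  =>  [v̄] = [M]^{-1}[v]
print(v_bar)                  # prints [-5.0, 4.0]
print(matvec(M, v_bar))       # recovers [3.0, 4.0]
```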
For a second order tensor A in L(V),
\[ A = A_{ij}\, e^i \otimes e^j = \bar A_{ij}\, \bar e^i \otimes \bar e^j = A_i{}^j\, e^i \otimes e_j = \bar A_i{}^j\, \bar e^i \otimes \bar e_j. \]
We have the following transformation rules,
\[ \bar A_{ij} = A_{mn}\, M^m{}_i\, M^n{}_j, \qquad \bar A_i{}^j = A_m{}^n\, M^m{}_i\, (M^{-1})^j{}_n, \tag{A.14} \]
where the matrix [(M^{-1})^i{}_j] is the inverse of the matrix [M^i{}_j]. In matrix notation, the transformation rules can be written as
\[ [\bar A_{ij}] = [M^m{}_i]\,[A_{mn}]\,[M^n{}_j]^T, \qquad [\bar A_i{}^j] = [M^m{}_i]\,[A_m{}^n]\,[M^j{}_n]^{-1}. \]
Transformation rules for other components and for tensors of higher orders are similar. The general rule can easily be obtained by composing the transformation rules for covariant and contravariant components as shown in (A.13) or (A.14).
Exercise A.1.3 Let β = {e_1 = (1, 0), e_2 = (0, 1)} and β̄ = {ē_1 = (1, 0), ē_2 = (2, 1)} be two bases of IR². Determine the transformation matrix of the change of basis from β to β̄ and also the transformation matrix from β̄* to β*. Let T ∈ L(IR²) be defined by (A.10). Determine the various components of T relative to the two different bases and verify the transformation rules.
Exercise A.1.4 For any two bases β = {e_i} and β̄ = {ē_i} of V, there exists a linear transformation A ∈ L(V) such that ē_k = A e_k. Show that the transformation matrix M for the change of basis from β to β̄ is given by M^j{}_k = e^j · A e_k, that is, [M^j{}_k] = [A^j{}_k].
A.1.5 Determinant and Trace
In matrix algebra, the definition of the determinant of a square matrix is based on the notion of permutation. Let (1, ···, n) be an ordered set of natural numbers. A reordering of the elements of (1, ···, n) is called a permutation. More precisely, a permutation is a one-to-one mapping σ : {1, ···, n} → {1, ···, n}, resulting in the ordered set (σ(1), ···, σ(n)). There are exactly n! permutations of (1, ···, n). A permutation exchanging two adjacent elements is called a transposition. It is known that any permutation can be obtained by a succession of transpositions, and although the number of such transpositions is not unique for a given permutation, its parity is. Hence a permutation is called even or odd according to the parity of the number of transpositions needed to restore the permutation to the natural order, and one can define the sign of a permutation, denoted sign σ, as +1 if σ is even and −1 if σ is odd.
Let [M_{ij}] be a square matrix, where the first index denotes the row and the second the column (it does not matter whether they are superindices or subindices). The determinant of the matrix can be calculated by
\[ \det[M_{ij}] = \sum_{\sigma} (\operatorname{sign}\sigma)\, M_{\sigma(1)1} \cdots M_{\sigma(n)n}, \tag{A.15} \]
where the summation is taken over all permutations of (1, ···, n).
On the other hand, since the matrix representation of a linear transformation depends on the choice of basis, the question arises whether it is meaningful to define the determinant of a linear transformation as the determinant of its matrix representation. In the following, we shall see that the notion of the determinant of a linear transformation can be defined in a natural way, independent of the choice of basis, and we shall see how it is related to its matrix representations.
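Formula (A.15) can be implemented verbatim with itertools.permutations; at O(n!) cost it is only practical for small n, but it agrees with any standard determinant routine. A sketch (names ours):

```python
from itertools import permutations

def sign(perm):
    # sign of a permutation: parity of the number of inversions
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def prod(xs):
    r = 1
    for x in xs:
        r *= x
    return r

def det(M):
    # det[M_ij] = Σ_σ (sign σ) M_{σ(1)1} ··· M_{σ(n)n}   -- formula (A.15)
    n = len(M)
    return sum(sign(p) * prod(M[p[j]][j] for j in range(n))
               for p in permutations(range(n)))

print(det([[3.0, 1.0], [1.0, 2.0]]))   # 3*2 - 1*1 = 5.0
```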
Definition. Let V be a vector space of dimension n. A function ω : V × ··· × V → IR (with n copies of V) is said to be an alternating n-linear form if it is n-linear and, for all v_1, ···, v_n ∈ V,
\[ \omega(v_{\sigma(1)}, \cdots, v_{\sigma(n)}) = (\operatorname{sign}\sigma)\, \omega(v_1, \cdots, v_n). \tag{A.16} \]
ω is called non-trivial if there exist u_1, ···, u_n ∈ V such that ω(u_1, ···, u_n) ≠ 0.
It is obvious that if ω is alternating then
\[ \omega(\cdots, u, \cdots, v, \cdots) = 0, \quad \text{if } u = v. \tag{A.17} \]
More generally, if {v_1, ···, v_n} is linearly dependent, then ω(v_1, ···, v_n) = 0. In other words, if ω(v_1, ···, v_n) ≠ 0 then {v_1, ···, v_n} is a linearly independent set, and since the number of vectors in this set equals dim V, {v_1, ···, v_n} is also a basis of V.
Theorem. (uniqueness) Let ω and ω′ be two alternating n-linear forms, with ω non-trivial. Then there exists a unique λ ∈ IR such that ω′ = λω, i.e., for all v_1, ···, v_n ∈ V,
\[ \omega'(v_1, \cdots, v_n) = \lambda\, \omega(v_1, \cdots, v_n). \]
Proof: Since ω is non-trivial, there exists a set of vectors, say {e_1, ···, e_n}, such that ω(e_1, ···, e_n) ≠ 0, and hence it is a basis of V. Let λ be the number defined by
\[ \lambda = \frac{\omega'(e_1, \cdots, e_n)}{\omega(e_1, \cdots, e_n)}. \]
Suppose that v_1, ···, v_n ∈ V, and
\[ v_a = v^i{}_a\, e_i, \qquad a = 1, \cdots, n. \]
Then using (A.16) and (A.17) one can easily obtain
\[ \omega(v_1, \cdots, v_n) = \alpha\, \omega(e_1, \cdots, e_n), \qquad \omega'(v_1, \cdots, v_n) = \alpha\, \omega'(e_1, \cdots, e_n), \]
where
\[ \alpha = \sum_{\sigma} (\operatorname{sign}\sigma)\, v^{\sigma(1)}{}_1 \cdots v^{\sigma(n)}{}_n. \]
Therefore we have
\[ \omega'(v_1, \cdots, v_n) = \lambda\, \omega(v_1, \cdots, v_n). \]
Moreover, this relation also shows that λ does not depend on the choice of basis. □
Let T ∈ L(V) be a linear transformation on V, and let ω be a non-trivial alternating n-linear form on V. Define a map Tω : V × ··· × V → IR by
\[ T\omega(v_1, \cdots, v_n) = \omega(Tv_1, \cdots, Tv_n). \tag{A.18} \]
Clearly it is alternating and n-linear; hence by the uniqueness theorem, there exists a unique λ ∈ IR such that
\[ T\omega = \lambda\, \omega. \]
We can easily see that the scalar λ so defined does not depend on the choice of ω. For if ω′ is another non-trivial alternating n-linear form, then by the uniqueness theorem,
\[ \omega' = \mu\, \omega, \qquad \mu \neq 0. \]
Therefore, we have
\[ T\omega' = \lambda'\omega' = \lambda'\mu\, \omega. \]
On the other hand, we have
\[ T\omega'(v_1, \cdots, v_n) = \omega'(Tv_1, \cdots, Tv_n) = \mu\, \omega(Tv_1, \cdots, Tv_n) = \mu\, T\omega(v_1, \cdots, v_n) = \mu\lambda\, \omega(v_1, \cdots, v_n), \]
which implies that
\[ T\omega' = \mu\lambda\, \omega. \]
Consequently, λ = λ′. Therefore, λ is uniquely determined by T alone, and we can lay down the following definition.
Definition. For T ∈ L(V), the determinant of T, det T ∈ IR, is defined by the following relation,
\[ (\det T)\, \omega(v_1, \cdots, v_n) = \omega(Tv_1, \cdots, Tv_n), \tag{A.19} \]
for any non-trivial alternating n-linear form ω and for any v_1, ···, v_n ∈ V.
The function det : L(V) → IR has the following properties:
1) det(u ⊗ v) = 0.
2) det(α1) = α^n. (A.20)
3) det(ST) = (det S)(det T).
4) det S^T = det S.
The first two properties are almost trivial. Here let us verify property (3). By definition,
\[ \det(ST)\, \omega(v_1, \cdots, v_n) = \omega(STv_1, \cdots, STv_n) = \omega(S(Tv_1), \cdots, S(Tv_n)) = (\det S)\, \omega(Tv_1, \cdots, Tv_n) = (\det S)(\det T)\, \omega(v_1, \cdots, v_n). \]
Since this holds for any ω(v_1, ···, v_n), the relation (3) follows.
We can calculate the determinant of a linear transformation in terms of its component matrix. Let {e_i} be a basis of V, and T = T^i{}_j e_i ⊗ e^j. Then by definition,
\[ (\det T)\, \omega(e_1, \cdots, e_n) = \omega(Te_1, \cdots, Te_n) = \omega(T^{i_1}{}_1 e_{i_1}, \cdots, T^{i_n}{}_n e_{i_n}) = \sum_{\sigma} (\operatorname{sign}\sigma)\, T^{\sigma(1)}{}_1 \cdots T^{\sigma(n)}{}_n\, \omega(e_1, \cdots, e_n). \]
Hence we obtain
\[ \det T = \sum_{\sigma} (\operatorname{sign}\sigma)\, T^{\sigma(1)}{}_1 \cdots T^{\sigma(n)}{}_n, \]
which assures that
\[ \det T = \det[T^i{}_j], \]
i.e., det T is equal to the determinant of the component matrix [T^i{}_j] according to the definition (A.15). Similarly, one can show that it is also equal to the determinant of [T_i{}^j]. Therefore we have
\[ \det T = \det[T^i{}_j] = \det[T_i{}^j] = \det[g^{ik} T_{kj}] = \det[g_{ik} T^{kj}]. \]
Note that det T is equal neither to det[T_{ij}] nor to det[T^{ij}] unless det[g_{ij}] = 1.
Similar to the determinant, another scalar can be associated with a linear transformation. Let T ∈ L(V), and let ω be a non-trivial alternating n-linear form. Define a map Tω : V × ··· × V → IR by
\[ T\omega(v_1, \cdots, v_n) = \sum_{i=1}^{n} \omega(v_1, \cdots, Tv_i, \cdots, v_n). \]
One can easily check that Tω is alternating and n-linear; hence by the uniqueness theorem, there exists a μ ∈ IR such that
\[ T\omega = \mu\, \omega. \]
Moreover, μ does not depend on the choice of ω. Therefore, we can make the following definition.
Definition. For T ∈ L(V), the trace of T, tr T ∈ IR, is defined by the following relation,
\[ (\operatorname{tr} T)\, \omega(v_1, \cdots, v_n) = \sum_{i=1}^{n} \omega(v_1, \cdots, Tv_i, \cdots, v_n), \tag{A.21} \]
for any non-trivial alternating n-linear form ω and for any v_1, ···, v_n ∈ V.
The function tr : L(V) → IR has the following properties:
1) tr(αS + T) = α tr S + tr T.
2) tr 1 = n.
3) tr(v ⊗ u) = v · u. (A.22)
4) tr S^T = tr S.
5) tr(ST) = tr(TS).
Property (1) states that the trace is a linear function on L(V). Here let us prove property (3). Suppose that v = v^i e_i. Then
\[ \operatorname{tr}(v \otimes u)\, \omega(e_1, \cdots, e_n) = \sum_{i=1}^{n} \omega(e_1, \cdots, (v \otimes u)e_i, \cdots, e_n) = \sum_{i=1}^{n} (u \cdot e_i)\, \omega(e_1, \cdots, v, \cdots, e_n) = \sum_{i=1}^{n} (u \cdot e_i)\, v^i\, \omega(e_1, \cdots, e_n), \]
which implies that
\[ \operatorname{tr}(v \otimes u) = \sum_{i=1}^{n} u \cdot (v^i e_i) = u \cdot v. \]
Hence (3) is proved.
In terms of components, let T = T^i{}_j e_i ⊗ e^j = T_i{}^j e^i ⊗ e_j. Then
\[ \operatorname{tr} T = T^i{}_j \operatorname{tr}(e_i \otimes e^j) = T^j{}_j = T_j{}^j = g_{ij} T^{ij} = g^{ij} T_{ij}. \]
That is, tr T is equal to the sum of the diagonal elements of the matrix [T^i{}_j] or [T_i{}^j], but in general it is not equal to that of the matrix [T_{ij}] or [T^{ij}].
Example A.1.5 Show that det(1 + u ⊗ v) = 1 + u · v.
By definition, we have
\[ \det(1 + u \otimes v)\, \omega(e_1, \cdots, e_n) = \omega((1 + u \otimes v)e_1, \cdots, (1 + u \otimes v)e_n) = \omega(e_1, \cdots, e_n) + \sum_{i=1}^{n} \omega(e_1, \cdots, (u \otimes v)e_i, \cdots, e_n) + \cdots = \omega(e_1, \cdots, e_n) + \operatorname{tr}(u \otimes v)\, \omega(e_1, \cdots, e_n), \]
where the dots represent terms involving more than one factor of (u ⊗ v)e_i in ω. Since (u ⊗ v)e_i = (v · e_i)u, which is a vector in the direction of u for any index i, those terms must all vanish because ω is an alternating form. □
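The identity det(1 + u ⊗ v) = 1 + u · v is easy to spot-check numerically in IR³; a sketch with made-up vectors (names ours), using cofactor expansion for the 3×3 determinant:

```python
def det3(A):
    # cofactor expansion along the first row
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
          - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
          + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
# matrix of 1 + u ⊗ v in the standard basis: δ_ij + u_i v_j
A = [[(1.0 if i == j else 0.0) + u[i] * v[j] for j in range(3)]
     for i in range(3)]
print(det3(A))                                   # 33.0
print(1.0 + sum(a * b for a, b in zip(u, v)))    # 1 + u·v = 33.0
```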
The set of all non-trivial alternating n-linear forms on V consists of two disjoint classes. Two non-trivial alternating n-linear forms ω_1 and ω_2 are said to be equivalent if ω_1 = λω_2 for some λ > 0. Clearly, this is an equivalence relation which decomposes the set of non-trivial alternating n-linear forms into two equivalence classes. Each of these classes is called an orientation of
V. We call one of them, say Δ, the positive orientation. A basis {e_i} of V is called positively oriented if for any ω ∈ Δ,
\[ \omega(e_1, \cdots, e_n) > 0, \]
and A ∈ L(V) is said to be orientation-preserving if Aω ∈ Δ for any ω ∈ Δ. Here Aω is defined by (A.18). Since Aω = (det A)ω, A preserves the orientation if and only if det A > 0.
Let {e_i} and {ē_i} be two bases such that A(e_i) = ē_i. If det A > 0 (or < 0), then {e_i} and {ē_i} are said to have the same (or the opposite) orientation.
Suppose that V is a three-dimensional vector space and let {i_1, i_2, i_3} be a positively oriented orthonormal basis of V. Then there exists a unique e ∈ Δ, called the volume element, such that
\[ e(i_1, i_2, i_3) = 1. \]
Since e ∈ L(V × V × V, IR), it is a third order tensor and can be represented as
\[ e = \varepsilon_{ijk}\, i_i \otimes i_j \otimes i_k, \]
where ε_{ijk} = e(i_i, i_j, i_k) are the components of e relative to the basis {i_k}. Obviously we have
\[ \varepsilon_{ijk} = \begin{cases} 1 & \text{if } (i, j, k) \text{ is an even permutation of } (1, 2, 3), \\ -1 & \text{if } (i, j, k) \text{ is an odd permutation of } (1, 2, 3), \\ 0 & \text{otherwise}. \end{cases} \]
One can easily check the following identities:
\[ \varepsilon_{ijk}\varepsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}, \qquad \varepsilon_{ijk}\varepsilon_{ijn} = 2\,\delta_{kn}, \qquad \varepsilon_{ijk}\varepsilon_{ijk} = 6, \tag{A.23} \]
where δ_{mn} is the Kronecker delta.
Let {e_k} be a basis and let A ∈ L(V) be the change of basis from {i_k} to {e_k}, i.e., A i_k = e_k. Then the covariant components of the volume element relative to {e_k} are
\[ e_{ijk} = e(e_i, e_j, e_k), \qquad e = e_{ijk}\, e^i \otimes e^j \otimes e^k. \]
By (A.19), it follows that
\[ e_{ijk} = (\det A)\, \varepsilon_{ijk}, \]
and also,
\[ g_{ij} = e_i \cdot e_j = A i_i \cdot A i_j = (A^T A)_{ij}, \]
which yields g = (det A)², where g = det[g_{ij}]. Therefore we have
\[ e_{ijk} = \sqrt{g}\, \varepsilon_{ijk}, \tag{A.24} \]
if A preserves the orientation. Similarly, the contravariant components of the volume element are
\[ e^{ijk} = e(e^i, e^j, e^k), \qquad e = e^{ijk}\, e_i \otimes e_j \otimes e_k, \]
and
\[ e^{ijk} = (\sqrt{g})^{-1}\, \varepsilon^{ijk}, \]
where ε^{ijk} = ε_{ijk}. Moreover, the identities (A.23) can be written as
\[ e^{ijk} e_{imn} = \delta^j_m \delta^k_n - \delta^j_n \delta^k_m, \qquad e^{ijk} e_{ijn} = 2\,\delta^k_n, \qquad e^{ijk} e_{ijk} = 6. \tag{A.25} \]
If T ∈ L(V) and T = T^i{}_j e_i ⊗ e^j, then (A.19) leads to the following formula for the determinant of T,
\[ e_{lmn}\, (\det T) = e_{ijk}\, T^i{}_l\, T^j{}_m\, T^k{}_n. \tag{A.26} \]
Multiplying by e^{lmn} and using the last identity of (A.25), we obtain another formula for the determinant,
\[ \det T = \frac{1}{6}\, e^{lmn} e_{ijk}\, T^i{}_l\, T^j{}_m\, T^k{}_n. \]
Exercise A.1.5 Consider the tensor T defined by (A.10) in the previous exercise. Calculate det T and tr T by means of the definition and also by the use of the component matrices relative to β′.
A.1.6 Exterior Product and Vector Product
The usual vector product on a three-dimensional vector space cannot be generalized directly to vector spaces in general. However, it can be associated with the skew-symmetric tensor product in a trivial manner.
Definition. For any v, u ∈ V, the exterior product of v and u, denoted v ∧ u, is defined by
\[ v \wedge u = v \otimes u - u \otimes v. \]
It is obvious that the operation ∧ : V × V → V ⊗ V is bilinear and skew-symmetric, i.e.,
\[ v \wedge u = -\,u \wedge v. \]
The exterior product v ∧ u of two vectors is a skew-symmetric tensor. Suppose that {e_i ⊗ e_j}, i, j = 1, ···, n, is a product basis of V ⊗ V. Then it is easy to verify that {e_i ∧ e_j}, 1 ≤ i < j ≤ n, is a basis for Skw(V). Therefore we have the following proposition.
Proposition. If dim V = n, then dim Skw(V) = n(n − 1)/2. In particular, if n = 3, then dim Skw(V) = 3.
Now suppose that V is an oriented Euclidean three-dimensional vector space. Since the space of skew-symmetric tensors is also three-dimensional, we can define a map
\[ \tau : \operatorname{Skw}(V) \longrightarrow V \]
by the condition: for all u, v, w ∈ V,
\[ \tau(u \wedge v) \cdot w = e(u, v, w). \tag{A.27} \]
Here e is the volume element of V. This linear map, called the duality map, is one-to-one and onto, and hence establishes a one-to-one correspondence between skew-symmetric tensors and vectors. It is easy to verify that
\[ \tau(e_i \wedge e_j) = e_{ijk}\, e^k. \tag{A.28} \]
For an orthonormal basis {i_k} the duality map τ establishes the following correspondence:
\[ i_1 \wedge i_2 \longmapsto i_3, \qquad i_2 \wedge i_3 \longmapsto i_1, \qquad i_3 \wedge i_1 \longmapsto i_2. \]
For a skew-symmetric tensor W, let w = τ(W) be the associated vector, which shall be denoted more conveniently by
\[ w = \langle W \rangle. \tag{A.29} \]
In component form, if
\[ W = W^{ij}\, e_i \otimes e_j, \qquad W^{ij} = -W^{ji}, \]
or
\[ W = \tfrac{1}{2}\, W^{ij}\, e_i \wedge e_j, \]
then it follows from (A.28) that
\[ w = \tfrac{1}{2}\, e_{ijk}\, W^{ij}\, e^k. \]
If the basis is orthonormal, this becomes
\[ w_i = \tfrac{1}{2}\, \varepsilon_{ijk}\, W_{jk}, \qquad W_{ij} = \varepsilon_{ijk}\, w_k, \tag{A.30} \]
or in matrix form,
\[ [W_{ij}] = \begin{bmatrix} 0 & w_3 & -w_2 \\ -w_3 & 0 & w_1 \\ w_2 & -w_1 & 0 \end{bmatrix}. \]
Remark. It is worthwhile to point out that the vector associated with a skew-symmetric tensor behaves differently from usual vectors under linear transformations. To see this, let u,v ∈ V ; then for any w ∈ V and Q ∈ L(V ), it follows from the definition that

〈Qu ∧Qv〉 ·Qw = e(Qu, Qv, Qw) = (detQ) e(u,v,w) = (detQ) 〈u ∧ v〉 ·w,
which implies that
〈Qu ∧Qv〉 = (detQ)Q〈u ∧ v〉.
In other words, as the vectors u, v are transformed into Qu, Qv respectively, the vector 〈u ∧ v〉 is transformed into Q〈u ∧ v〉 only to within a scalar factor, i.e., into a vector which may point in either the same or the opposite sense along the same axial direction. For this reason, a vector associated with a skew-symmetric tensor is usually called an axial vector. □
The usual vector product in the three-dimensional vector space can now be defined from the exterior product in a similar manner.
Definition. For any u,v ∈ V , the vector product of u and v, denoted u×v,is defined by
u× v = 〈u ∧ v〉. (A.31)
Clearly the operation × : V × V −→ V is bilinear and skew-symmetric. In components, (A.31) gives

u× v = eijk uj vk ei.

If the basis is orthonormal, say {ik}, then it becomes

u× v = εijk uj vk ii,
which is the usual definition of the vector product. The relations (A.27) and (A.31) imply that
e(u,v,w) = (u× v) ·w.
This is usually called the triple product of u, v, and w. For convenience, weshall also use the notation,
[u,v,w] = (u× v) ·w.
With this notation, we can rewrite the definitions (A.19) and (A.21) of the determinant and the trace in the following form:

detA = [Ae1, Ae2, Ae3] / [e1, e2, e3],

trA = ([Ae1, e2, e3] + [e1, Ae2, e3] + [e1, e2, Ae3]) / [e1, e2, e3].    (A.32)
One may use the duality map to identify a skew-symmetric tensor withan axial vector, as well as the exterior product with the vector product. Inother words, one may interpret the duality in either way in case no ambiguitywould arise.
Exercise A.1.6 Verify the following relations, using index notations:
1) Wv = −w × v.
2) (u× v)×w = (u ·w)v − (v ·w)u.
3) |u× v|2 = |u|2|v|2 − |u · v|2.
4) |u× v| = |u| |v| sin θ(u,v).
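The identities of Exercise A.1.6 can be spot-checked numerically; the sketch below verifies 2) and 3), together with the triple product formula, for randomly chosen vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 3))   # three random vectors in IR^3

# 2) double vector product: (u x v) x w = (u.w) v - (v.w) u
lhs = np.cross(np.cross(u, v), w)
rhs = u.dot(w) * v - v.dot(w) * u
assert np.allclose(lhs, rhs)

# 3) Lagrange identity: |u x v|^2 = |u|^2 |v|^2 - (u.v)^2
assert np.isclose(np.cross(u, v).dot(np.cross(u, v)),
                  u.dot(u) * v.dot(v) - u.dot(v) ** 2)

# triple product: [u, v, w] = (u x v).w = det of the matrix with columns u, v, w
assert np.isclose(np.cross(u, v).dot(w),
                  np.linalg.det(np.column_stack([u, v, w])))
```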
A.1.7 Second Order Tensors
We shall review some of the important properties of linear transformations, i.e., second order tensors, in this section, mostly without proofs. The proofs can be found in most standard books on linear algebra.
First, let us introduce an inner product of two second order tensors. For A,B ∈ L(V ), we define the inner product of A and B by
A ·B = tr(ABT ),
which is obviously a bilinear, symmetric and positive-definite operation. Wehave
1 ·A = trA,
where 1 is the identity tensor, and for any A,B,C ∈ L(V ),
AB · C = B ·ATC.
The norm of a tensor A ∈ L(V ) can then be defined as

|A| = √(A ·A) = √(trAAT ).

Note that if Aij are the components of A relative to an orthonormal basis, then the norm of A is simply

|A| = (A11² + A12² + · · ·+ Ann²)^(1/2).
Now, suppose that A ∈ L(V ) is one-to-one (and therefore onto); then there is a unique A−1 ∈ L(V ), called the inverse of A, such that
AA−1 = A−1A = 1 .
If A−1 exists, A is said to be invertible or nonsingular; otherwise, it is said to be singular. It can be proved that A is invertible if and only if detA ≠ 0, and for any nonsingular A and B,
(AB)−1 = B−1A−1,
(A−1)T = (AT )−1 = A−T .
Notation. Inv(V ) = {F ∈ L(V ) | F is invertible}.
Recall that a set G is called a group if it has the following properties:
1) If A,B ∈ G then AB ∈ G.
2) If A,B,C ∈ G then A(BC) = (AB)C.
3) There exists an identity element 1 ∈ G such that 1A = A1 = A, for any A ∈ G.
4) For any A ∈ G, there exists A−1 ∈ G, such that AA−1 = A−1A = 1 .
It is easy to verify that Inv(V ) forms a group under the operation of com-position. It is usually known as the general linear group of V , denoted byGL(V ).
Definition. Q ∈ L(V ) is called an orthogonal transformation if it preserves the inner product of V , i.e., for all u,v ∈ V ,
Qu ·Qv = u · v.
Notation. O(V ) = {Q ∈ L(V ) | Q is orthogonal}.
The set O(V ) forms a group and is called the orthogonal group of V .Orthogonal transformations have the following properties:
1) QT = Q−1.
2) |detQ| = 1.
3) |Qv| = |v|.
4) θ(Qv, Qu) = θ(v,u).
The last two relations assert that orthogonal transformations also preservenorms and angles. An orthogonal transformation Q is said to be proper ifdetQ = 1, and improper if detQ = −1.
Notation. O+(V ) = {Q ∈ O(V ) | detQ = 1}.
The set O+(V ) also forms a group, called the proper orthogonal group ofV . It is also called the rotation group since its elements are rotations. Notethat the subset of O(V ) with determinant equal to −1 does not form a groupsince it does not have an identity element.
Notation. U(V ) = {T ∈ L(V ) | |detT | = 1}, and SL(V ) = {T ∈ L(V ) | detT = 1}.
Elements of U(V ) are called unimodular transformations, and U(V ) forms a group, called the unimodular group of V . SL(V ) also forms a group, called the special linear group of V . Clearly, we have the following relations:
O+(V ) ⊂ SL(V ) ⊂ U(V ) ⊂ GL(V ),
O+(V ) ⊂ O(V ) ⊂ U(V ) ⊂ GL(V ).
A.1.8 Some Theorems of Linear Algebra
We shall mention some important theorems of linear algebra relevant to thestudy of mechanics. They are all related to the concept of eigenvalues andeigenvectors.
Definition. Let A ∈ L(V ). A scalar λ ∈ IR is called an eigenvalue of A, ifthere exists a non-zero vector v ∈ V , such that
Av = λv. (A.33)
Such a v is called an eigenvector of A associated with the eigenvalue λ.
It follows from the definition that λ is an eigenvalue if and only if
det(A− λ1 ) = 0. (A.34)
The left-hand side of (A.34) is a polynomial of degree n in λ, where n is thedimension of V . We may write it in the form
(−λ)n + I1(−λ)n−1 + · · ·+ In−1(−λ) + In = 0.
It is called the characteristic equation of A. Its real roots are the eigenvaluesof A. The coefficients I1, · · · , In are scalar functions of A and are called theprincipal invariants of A.
It can be shown that the characteristic equation is also satisfied by the tensor A itself. We have the following
Cayley–Hamilton Theorem. A second order tensor A ∈ L(V ) satisfies itsown characteristic equation,
(−A)n + I1(−A)n−1 + · · ·+ In−1(−A) + In 1 = 0.
Example A.1.6 For dimV = 3 and A ∈ L(V ), we have
det(A− λ1 ) = −λ3 + IAλ2 − IIAλ+ IIIA. (A.35)
The three principal invariants of A, more specifically denoted by IA, IIA, and IIIA, can be obtained from the following relations:

IA = trA, IIA = (detA) tr(A−1), IIIA = detA. (A.36)
Of course, the second relation is valid only when A is nonsingular.
Proof : From (A.32) we can write
det(A− λ1 )[e1, e2, e3] = [(A− λ1 )e1, (A− λ1 )e2, (A− λ1 )e3]
= − λ3 [e1, e2, e3]
+ λ2([Ae1, e2, e3] + [e1, Ae2, e3] + [e1, e2, Ae3])
− λ ([e1, Ae2, Ae3] + [Ae1, e2, Ae3] + [Ae1, Ae2, e3])
+ [Ae1, Ae2, Ae3].
Comparing this with the right-hand side of (A.35), we obtain (A.36)1,3 by the use of (A.32), as well as the following relation for the second invariant IIA:

IIA = ([e1, Ae2, Ae3] + [Ae1, e2, Ae3] + [Ae1, Ae2, e3]) / [e1, e2, e3].

If A ∈ Inv(V ), this implies the second relation of (A.36). In particular, if detA = 1, we have IIA = IA−1 . □
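A short numerical check of (A.35), (A.36), and the Cayley–Hamilton theorem for a randomly chosen (hence, almost surely nonsingular) 3×3 tensor:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))   # random A, nonsingular almost surely
I = np.eye(3)

# principal invariants via (A.36)
IA = np.trace(A)
IIIA = np.linalg.det(A)
IIA = IIIA * np.trace(np.linalg.inv(A))   # valid since A is nonsingular

# characteristic polynomial (A.35): det(A - lam*1) = -lam^3 + IA lam^2 - IIA lam + IIIA
for lam in (0.3, -1.7, 2.0):
    assert np.isclose(np.linalg.det(A - lam * I),
                      -lam**3 + IA * lam**2 - IIA * lam + IIIA)

# Cayley-Hamilton: -A^3 + IA A^2 - IIA A + IIIA 1 = 0
CH = -A @ A @ A + IA * (A @ A) - IIA * A + IIIA * I
assert np.allclose(CH, np.zeros((3, 3)))
```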
In general, the characteristic equation may not have real roots. However, it is known that if A is symmetric, all the roots are real and there exists a basis of V consisting entirely of eigenvectors.
Spectral Theorem. Let S ∈ Sym(V ); then there exists an orthonormal basis {ei} of V such that S can be written in the form

S = ∑_{i=1}^{n} si ei ⊗ ei. (A.37)
Such a basis is called a principal basis for S. Relative to this basis, the component matrix of S is diagonal, and the diagonal elements si are the eigenvalues of S associated with the eigenvectors ei respectively. The eigenvalues si, i = 1, · · · , n, may or may not be distinct.
Definition. Let λ be an eigenvalue of S ∈ L(V ). We call Vλ = {v ∈ V |Sv =λv} the characteristic space of S associated with λ.
Let S be a symmetric tensor and suppose that v ∈ Vλ, u ∈ Vµ, where λ and µ are two distinct eigenvalues of S; then one can easily show that v · u = 0, i.e., the two characteristic spaces are mutually orthogonal. Moreover, by the spectral theorem any vector v can be written in the form

v = ∑_λ vλ, vλ ∈ Vλ, (A.38)

where the summation is extended over all characteristic spaces of S.
Commutation Theorem. Let T ∈ L(V ) and S ∈ Sym(V ). Then
ST = TS
if and only if T preserves all characteristic spaces of S, i.e., T maps each characteristic space of S into itself.
Proof : Suppose that S and T commute, and Sv = λv. Then
S(Tv) = T (Sv) = λ(Tv),
so that both v and Tv belong to the characteristic space Vλ.
To prove the converse, since S is symmetric, for any v ∈ V let v = ∑_λ vλ be the decomposition relative to the characteristic spaces of S as given in (A.38). If T leaves each characteristic space Vλ invariant, then Tvλ ∈ Vλ and
S(Tvλ) = λ(Tvλ) = T (λvλ) = T (Svλ).
Therefore, by (A.38), we have

STv = ∑_λ STvλ = ∑_λ TSvλ = TSv,

which shows that ST = TS. □
The only subspaces of V that are preserved by every rotation are {0} and V itself. Therefore, we have the following
Corollary. A symmetric S ∈ L(V ) commutes with every orthogonal trans-formation if and only if S = λ1 , for some λ ∈ IR.
Definition. S ∈ L(V ) is said to be positive definite (positive semi-definite) if for any v ∈ V , v ≠ 0,
v · Sv > 0 (≥ 0).
Similarly, S is said to be negative definite (negative semi-definite) if
v · Sv < 0 (≤ 0).
One can easily see that if S is symmetric, then it is positive definite if and only if all of its eigenvalues are positive. Consequently, for any symmetric positive definite transformation S, there is a unique symmetric positive definite transformation T such that T 2 = S, and the eigenvalues of T are the positive square roots of those of S associated with the same eigenvectors. We denote T = √S and call T the square root of S. In other words, if S is expressed by (A.37) in terms of the principal basis, then

T = √S = ∑_{i=1}^{n} √si ei ⊗ ei.
Example A.1.7 Let S ∈ L(IR2) be given by S(x, y) = (3x+√2 y, √2 x+ 2y). Relative to the standard basis of IR2, the matrix of S is

[Sij ] = [  3  √2 ]
         [ √2   2 ],

which has the eigenvalues s1 = 4 and s2 = 1 and the corresponding principal basis e1 = (√(2/3), √(1/3)) and e2 = (−√(1/3), √(2/3)). Therefore, we have

T = √S = 2 e1 ⊗ e1 + e2 ⊗ e2,

whose matrix relative to the standard basis becomes

[Tij ] = (2/3) [  2  √2 ]  +  (1/3) [   1  −√2 ]  =  (1/3) [  5  √2 ]
               [ √2   1 ]          [ −√2    2 ]           [ √2   4 ].

One can easily verify that [Tij ]2 = [Sij ]. □
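The computation of Example A.1.7 can be reproduced with the spectral theorem (A.37): diagonalize S, take the square roots of the eigenvalues, and reassemble. A NumPy sketch:

```python
import numpy as np

S = np.array([[3.0,            np.sqrt(2.0)],
              [np.sqrt(2.0),   2.0]])

# spectral decomposition of the symmetric S (eigenvalues in ascending order)
s, E = np.linalg.eigh(S)
assert np.allclose(s, [1.0, 4.0])

# T = sum_i sqrt(s_i) e_i (x) e_i, the unique s.p.d. square root of S
T = sum(np.sqrt(si) * np.outer(E[:, i], E[:, i]) for i, si in enumerate(s))

# matches the matrix (1/3)[[5, sqrt(2)], [sqrt(2), 4]] found in the example
assert np.allclose(T, np.array([[5.0, np.sqrt(2.0)],
                                [np.sqrt(2.0), 4.0]]) / 3.0)
assert np.allclose(T @ T, S)
```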
Example A.1.8 Let S be a positive definite symmetric tensor in a two-dimensional space; then

√S = (1/b)(S + a1 ),

where a = √(detS) and b = √(2a+ trS).
Proof : Let A = √S. By the Cayley–Hamilton theorem in the two-dimensional space, we have the identity

A2 − (trA)A+ (detA)1 = 0.

Since A2 = S, if we let the eigenvalues of A be a1 and a2, then detS = a1²a2² and trS = a1² + a2². Therefore

a = √(a1²a2²) = a1a2 = detA,
b = √(2a1a2 + a1² + a2²) = a1 + a2 = trA,

which together with the above identity prove the result. □
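A quick check of the two-dimensional formula of Example A.1.8 on a randomly generated symmetric positive definite S:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((2, 2))
S = B @ B.T + np.eye(2)          # a symmetric positive definite 2x2 tensor

a = np.sqrt(np.linalg.det(S))    # a = sqrt(det S)
b = np.sqrt(2.0 * a + np.trace(S))   # b = sqrt(2a + tr S)
T = (S + a * np.eye(2)) / b      # sqrt(S) = (S + a*1)/b

assert np.allclose(T @ T, S)
```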
Polar Decomposition Theorem. For any F ∈ Inv(V ), there exist symmetric positive definite transformations V and U and an orthogonal transformation R such that

F = RU = VR.

Moreover, the transformations U , V , and R are uniquely determined in the above decompositions.
Proof : We can easily verify that FFT and FTF are symmetric positive definite. Indeed, for any v ≠ 0, we have

(v · FTFv) = (Fv · Fv) > 0,

since F is nonsingular.
To prove the theorem, let us define
U =√FTF , R = FU−1, V = RURT . (A.39)
By definition, U is symmetric positive definite and R is orthogonal since
RRT = FU−1(FU−1)T = FU−1U−TFT
= FU−2FT = F (FTF )−1FT = 1 .
Moreover, from the definition (A.39) we also have
V 2 = RURT (RURT ) = (RU)(RU)T = FFT .
Therefore, V is the square root of FFT and hence is itself a symmetric positive definite transformation. Furthermore, the uniqueness follows from the uniqueness of the square root. □
The polar decomposition theorem, which decomposes a nonsingulartransformation into a rotation and a positive definite tensor, is crucial inthe development of continuum mechanics. The following decomposition ofa tensor into its symmetric and skew-symmetric parts is also important inmechanics.
For any T ∈ L(V ), let

A = (1/2)(T + TT ), B = (1/2)(T − TT );

then

T = A+B, A ∈ Sym(V ), B ∈ Skw(V ).

This is sometimes called the Cartesian decomposition of a tensor. Such a decomposition is also unique.
Exercise A.1.7 Let A ∈ L(V ) be such that (1 +A) is nonsingular. Verify that
1) (1 +A)−1 = 1 −A(1 +A)−1.
2) (1 +A)−1 = 1 −A+A2 − · · ·+ (−1)nAn + o(An), where lim_{|A|→0} o(An)/|A|n = 0.
Exercise A.1.8 Let u,v ∈ V . Show that if 1 + u · v ≠ 0, then

(1 + u⊗ v)−1 = 1 − (u⊗ v)/(1 + u · v).
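The formula of Exercise A.1.8 (a special case of what is elsewhere known as the Sherman–Morrison formula) is easily verified numerically; the dimension 4 below is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(3)
u, v = rng.standard_normal((2, 4))
assert abs(1.0 + u.dot(v)) > 1e-12     # the formula requires 1 + u.v != 0

A = np.eye(4) + np.outer(u, v)         # 1 + u (x) v
A_inv = np.eye(4) - np.outer(u, v) / (1.0 + u.dot(v))

assert np.allclose(A @ A_inv, np.eye(4))
assert np.allclose(A_inv, np.linalg.inv(A))
```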
Exercise A.1.9 For dimV = 3, let A ∈ L(V ) and B = 1 + A. Showthat
IB = 3 + IA,
IIB = 3 + 2IA + IIA,
IIIB = 1 + IA + IIA + IIIA,
and if a = detB ≠ 0, verify that

(1 +A)−1 = (1/a)((1 + IA + IIA)1 − (1 + IA)A+A2).
Exercise A.1.10 Prove the Cayley–Hamilton theorem for the specialcase that A ∈ L(V ) is symmetric, by employing the spectral theorem.
Exercise A.1.11 Let β = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the standard basis of IR3 and let the matrix representation of F ∈ L(IR3) relative to β be given by

F = [ √3  1  0 ]
    [  0  2  0 ]
    [  0  0  1 ].

Let F = RU = VR be the polar decomposition of F . Find the matrix representations of U , V , and R relative to the standard basis β.
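The construction (A.39) in the proof of the polar decomposition theorem translates directly into a numerical recipe; applied to the F of Exercise A.1.11 it yields U , R, and V (a sketch of the construction, not the exercise's closed-form answer):

```python
import numpy as np

F = np.array([[np.sqrt(3.0), 1.0, 0.0],
              [0.0,          2.0, 0.0],
              [0.0,          0.0, 1.0]])

# U = sqrt(F^T F), computed via the spectral theorem
C = F.T @ F
s, E = np.linalg.eigh(C)
U = (E * np.sqrt(s)) @ E.T            # sum_i sqrt(s_i) e_i (x) e_i

R = F @ np.linalg.inv(U)              # R = F U^{-1} is orthogonal
V = R @ U @ R.T                       # V = R U R^T, so F = R U = V R

assert np.allclose(R @ R.T, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)   # a rotation, since det F > 0
assert np.allclose(R @ U, F)
assert np.allclose(V @ R, F)
```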
A.2 Tensor Calculus
In the second part of this appendix, we shall discuss some basic notions of calculus on Euclidean spaces: gradients and other differential operators of tensor functions.
A.2.1 Euclidean Point Space
Let E be a set of points and V be a Euclidean vector space of dimension n.
Definition. E is called a Euclidean point space of dimension n, and V iscalled the translation space of E , if for any pair of points x, y ∈ E , there is avector v ∈ V , called the difference vector of x and y, written as
v = y − x, (A.40)
with the following properties:
1) ∀ x ∈ E , x− x = 0 ∈ V .
2) ∀ x ∈ E , ∀ v ∈ V , there exists a unique point y ∈ E such that (A.40) is satisfied. We write y = x+ v.
3) ∀ x, y, z ∈ E , (x− y) + (y − z) = (x− z).
Obviously, with (A.40) we can define the distance between x and y in E , denoted d(x, y), by

d(x, y) = |v| ,

or equivalently

d(x, y) = √((x− y) · (x− y)),
where the dot denotes the inner product on V .
Notation. Ex = {vx = (x,v) | v = y − x, ∀ y ∈ E}.
Ex denotes the set of all difference vectors at x. It can be made intoa Euclidean vector space in an obvious way, with the addition and scalarmultiplication defined as
vx + ux = (v + u)x,
α vx = (αv)x.
We call Ex the tangent space of E at x.
Fig. A.1. Parallel translation
Clearly Ex is a copy of V , i.e., it is isomorphic to V . In other words, for any x ∈ E , the map ix : V → Ex, called the Euclidean parallelism, taking v to vx, trivially establishes a one-to-one correspondence between Ex and V . The composite map

τxy = iy ◦ ix−1 : Ex −→ Ey,

taking

vx = (x,v) 7−→ vy = (y,v),

defines the parallel translation of vectors at x to vectors at y (Fig. A.1). Therefore, although Ex and Ey, for x ≠ y, are two different tangent spaces, they can be identified through V in an obvious manner,

Ex ∼= Ey ∼= V, ∀ x, y ∈ E .
In other words, vx = (x,v) ∈ Ex and uy = (y,u) ∈ Ey are regarded as thesame vector if and only if v = u. In this manner, vectors at different tangentspaces can be added or subtracted as if they were in the same vector space.
A.2.2 Differentiation
Before we define the derivative of tensor functions on Euclidean space in general, let us recall the definition of the derivative of a real-valued function of a real variable. Let f : (a, b)→ IR be a function on the interval (a, b) ⊂ IR. The derivative of f at t ∈ (a, b) is defined as

df(t)/dt = lim_{h→0} (1/h)( f(t+ h)− f(t) ),

if the limit exists.
This definition can easily be extended to tensor-valued functions of a real variable. Let W be a space equipped with a norm (or a distance function). As examples, we have

IR : d(x, y) = |x− y| ,
E : d(x, y) = √((x− y) · (x− y)),
V : |u| = √(u · u),
L(V ), Sym(V ), Skw(V ) : |A| = √(trAAT ).    (A.41)
With a norm it makes sense to talk about limit and convergence in the spaceW .
Let f : (a, b)→W be a function defined on an interval (a, b) ⊂ IR. The derivative of f at t ∈ (a, b) is defined as

df(t)/dt = lim_{h→0} (1/h)( f(t+ h)− f(t) ). (A.42)

The derivative of f at t will also be denoted by ḟ(t). Obviously for any t ∈ (a, b) we have ḟ(t) ∈W .
Note that if f is defined on a more general space, the expression on theright-hand side of the definition (A.42) may not make sense at all. However,we can rewrite the relation (A.42) in a different form.
For fixed t, let Df(t) : IR→W be the linear transformation defined by

Df(t)[h] = ḟ(t)h.

Then (A.42) is equivalent to

lim_{h→0} (1/|h|) |f(t+ h)− f(t)−Df(t)[h]| = 0.
In this form the definition of derivative can easily be generalized to otherfunctions.
Tensor fields
Now we shall consider functions on a Euclidean point space E . Let D be an open set in E , and let f be a tensor-valued function, f : D →W . Such functions are usually called tensor fields; more specifically,
1) W = IR : f is called a scalar field on D,
f : x ∈ D 7−→ f(x) ∈ IR.
2) W = V, f is called a vector field on D,
f : x ∈ D 7−→ f(x) ∈ Ex ∼= V.
3) W = L(V ), f is called a second order tensor field on D,
f : x ∈ D 7−→ f(x) ∈ Ex⊗ Ex ∼= L(V ).
4) W = E , f is called a point field on D or a deformation of D,
f : x ∈ D 7−→ f(x) ∈ E .
Definition. A function f : D →W is said to be differentiable at x ∈ D ⊂ E if there exists a linear transformation Df(x) ∈ L(V,W ) at x, such that for any v ∈ V ,

lim_{|v|→0} (1/|v|) |f(x+ v)− f(x)−Df(x)[v]| = 0. (A.43)
The linear transformation Df(x) is uniquely determined by the aboverelation, and it is called the gradient (or derivative) of f at x, denoted bygrad f , or ∇xf , or simply ∇f . By definition, ∇f(x) is a tensor in W ⊗ V ,or is a vector in V if W = IR.
The condition (A.43) is equivalent to
f(x+ v)− f(x) = ∇f(x)[v] + o(v),

where o(v) is a quantity containing terms such that

lim_{|v|→0} o(v)/|v| = 0.
Moreover, if we substitute tv for v for some fixed v in V , (A.43) is also equivalent to

∇f(x)[v] = lim_{t→0} (1/t)( f(x+ tv)− f(x) ) = d/dt f(x+ tv) |_{t=0}. (A.44)

The right-hand side of the above relation is usually known as the directional derivative of f relative to the vector v. Note that for fixed x and v, f(x+ tv) is a tensor-valued function of a real variable and its derivative can easily be determined from (A.42).
Functions on tensor spaces
Let W1 and W2 be two spaces on which a norm or a distance function isdefined, such as the spaces mentioned in (A.41) and let D ⊂W1 be an opensubset. The gradient of tensor functions on D can be defined in a similarmanner.
Definition. A function F : D →W2 is said to be differentiable at X ∈ D ⊂ W1 if there exists a linear transformation DF (X) ∈ L(W1,W2) at X, such that for all Y ∈ W1,

lim_{|Y |→0} (1/|Y |) |F (X + Y )− F (X)−DF (X)[Y ]| = 0.
The linear transformation DF (X) is uniquely determined by the above relation, and it is called the gradient of F with respect to X, denoted by ∂XF . We have ∂XF ∈ W2 ⊗W1. The definition is equivalent to the condition: for any Y , we have

F (X + Y )− F (X) = ∂XF (X)[Y ] + o(Y ), (A.45)

or

∂XF (X)[Y ] = d/dt F (X + tY ) |_{t=0}. (A.46)

For φ ∈ W2 ⊗W1 and Y ∈ W1, the notation φ[Y ] used in the above relations is self-evident: for φ = K ⊗X,

(K ⊗X)[Y ] = (X · Y )K, ∀ K ∈W2, X, Y ∈W1.
Moreover, for all v,u ∈ V and A,S ∈ L(V ), we have

v[u] = v · u,  A[u] = Au,
A[S] = A · S = trAST ,  (v ⊗ u)[S] = v · Su.
Gradients can easily be computed directly from the definition (A.45) or(A.46). We demonstrate this procedure with some examples.
Example A.2.1 Let φ : L(V )× V → IR be defined by
φ(A,v) = v ·Av.
Then
φ(A,v + u) = (v + u) ·A(v + u)
= v ·Av + v ·Au + u ·Av + u ·Au
= φ(A,v) + ∂vφ[u] + o(u),

so that

∂vφ[u] = v ·Au + u ·Av = ATv · u +Av · u = ((AT +A)v)[u].
Therefore, we obtain
∂vφ = (A+AT )v.
Moreover, we have
φ(A+ S,v) = v · (A+ S)v = v ·Av + v · Sv,

which implies

∂Aφ[S] = v · Sv = (v ⊗ v)[S],

so that

∂Aφ = v ⊗ v. □
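The gradients computed in Example A.2.1 can be checked against the directional derivative (A.44), approximated here by central differences; the step size t = 10⁻⁶ is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
v = rng.standard_normal(3)
u = rng.standard_normal(3)
S = rng.standard_normal((3, 3))

phi = lambda A, v: v.dot(A @ v)            # phi(A, v) = v . Av

def ddir(f, x, y, t=1e-6):
    """Directional derivative d/dt f(x + t y)|_0, by central difference."""
    return (f(x + t * y) - f(x - t * y)) / (2.0 * t)

# d/dt phi(A, v + t u)|_0 should equal d_v(phi)[u] = ((A + A^T) v) . u
assert np.isclose(ddir(lambda w: phi(A, w), v, u),
                  (A + A.T) @ v @ u, atol=1e-5)

# d/dt phi(A + t S, v)|_0 should equal d_A(phi)[S] = (v (x) v)[S] = v . Sv
assert np.isclose(ddir(lambda B: phi(B, v), A, S),
                  v.dot(S @ v), atol=1e-5)
```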
Example A.2.2 Let φ : L(V ) → IR be defined for any fixed u,v ∈ Vby
φ(A) = u ·Av.
From (A.46) we have

∂Aφ[S] = d/dt ( u · (A+ tS)v ) |_{t=0} = u · Sv = (u⊗ v)[S],
for all S ∈ L(V ), and we obtain
∂Aφ = u⊗ v.
Now suppose that A is a symmetric tensor, hence the function φ isdefined on the subspace Sym(V ) only,
φ : Sym(V )→ IR,
and by definition ∂Aφ ∈ Sym(V ) also. In this case, we have the samerelation,
∂Aφ[S] = (u⊗ v)[S],
but it holds only for all S ∈ Sym(V ). Therefore we conclude that

∂Aφ = (1/2)(u⊗ v + v ⊗ u),

after symmetrization.
Similarly, if A is a skew-symmetric tensor, then ∂Aφ ∈ Skw(V ) and the result must be skew-symmetrized,

∂Aφ = (1/2)(u⊗ v − v ⊗ u). □
Example A.2.3 We consider the trace and determinant functions. Since

tr(A+ S) = trA+ trS = trA+ 1 · S,

the gradient of the trace is trivially the identity transformation,

∂A(trA) = 1 . (A.47)
For the gradient of the determinant, we have

det(A+ S)− detA = (∂AdetA)[S] + o(S).

Let ω be a non-trivial alternating n-linear form; then

ω(v1, · · · ,vn) (∂AdetA)[S] = ω((A+ S)v1, · · · , (A+ S)vn)− ω(Av1, · · · , Avn) + o(S).

By the linearity of ω, after throwing all the higher order terms into o(S), the right-hand side becomes

∑_{i=1}^{n} ω(Av1, · · · , Svi, · · · , Avn) + o(S)
= ∑_{i=1}^{n} ω(Av1, · · · , AA−1Svi, · · · , Avn) + o(S)
= (detA) ∑_{i=1}^{n} ω(v1, · · · , A−1Svi, · · · ,vn) + o(S)
= (detA)(trA−1S)ω(v1, · · · ,vn) + o(S),
by (A.21). Therefore, we have
(∂AdetA)[S] = (detA)(trSA−1) = (detA)A−T [S],
which implies the following formula,
∂AdetA = (detA)A−T . (A.48)
□
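Formula (A.48) can likewise be checked against a finite-difference directional derivative of the determinant:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))    # random A, nonsingular almost surely
S = rng.standard_normal((4, 4))

t = 1e-6
# directional derivative of det at A in direction S (central difference)
num = (np.linalg.det(A + t * S) - np.linalg.det(A - t * S)) / (2.0 * t)

# (A.48): (d_A detA)[S] = (detA) tr(A^{-1} S)
exact = np.linalg.det(A) * np.trace(np.linalg.inv(A) @ S)
assert np.isclose(num, exact, atol=1e-4)
```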
In differential calculus, we frequently differentiate a composite function by the chain rule. This rule can be stated for composite tensor functions in general. Let W1, W2, W3 be normed spaces of the type (A.41), let D1 ⊂ W1, D2 ⊂ W2 be open subsets, and let

φ : D1 →W2, ψ : D2 →W3,

with φ(D1) ⊂ D2. Then we have the following
Chain Rule. Let φ be differentiable at X ∈ D1, and let ψ be differentiable at Y = φ(X) ∈ D2. Then the composition f = ψ ◦ φ is differentiable at X and

Df(X)[Z] = Dψ(φ(X))[Dφ(X)[Z]], (A.49)

for any Z ∈ W1, or simply
Df(X) = Dψ(Y ) ◦Dφ(X).
Example A.2.4 If φ is a scalar-valued function of a vector variable, g(x) is a vector field on E , and h(v) is a vector-valued function of a vector variable, then

∇h(g(x)) = ∂vh|_{v=g(x)} (∇g(x)),

∇φ(g(x)) = (∇g(x))T ∂vφ|_{v=g(x)}.
Let us verify the last of the above formulas. For any u ∈ V , from (A.49),

∇φ(g(x))[u] = ∂vφ|_{v=g(x)}[∇g(x)[u]] = ∂vφ|_{v=g(x)} · (∇g(x))u
= (∇g(x))T ∂vφ|_{v=g(x)} · u = ( (∇g(x))T ∂vφ|_{v=g(x)} )[u],

where in the third step we have used the definition of the transpose (A.8). Note that ∇h, ∇g, and ∂vh are all second order tensors, while ∂vφ is a vector quantity. □
Another important result in differentiation is the product rule. For tensor functions in general, there are many different products available: for example, the product of a scalar and a vector, the inner product, the tensor product, the action of a tensor on a vector, etc. These products have one property in common, namely bilinearity. Therefore, in order to establish a product rule valid for all cases of interest, we consider a bilinear operation
π : W1 ×W2 −→W3
which assigns to each φ ∈ W1, ψ ∈ W2 the product π(φ, ψ) ∈ W3. If φ, ψ are two functions,

φ : D →W1, ψ : D →W2,

where D is an open subset of some normed space W , then the product f = π(φ, ψ) is the function defined by

f : D −→W3,
f(X) = π(φ(X), ψ(X)), ∀ X ∈ D.
We then have the following
Product Rule. Suppose that φ and ψ are differentiable at X ∈ D ⊂ W ,then their product f = π(φ, ψ) is differentiable at X and
Df(X)[V ] = π(Dφ(X)[V ], ψ(X)) + π(φ(X), Dψ(X)[V ]), (A.50)
for all V ∈W .
In other words, the derivative of the product π(φ, ψ) is the derivative ofπ holding ψ fixed plus the derivative of π holding φ fixed.
Example A.2.5 Let f be a scalar-valued, and h, q vector-valued, functions on D ⊂ W . For W = IR, we have

(fh)˙ = ḟh + f ḣ,
(q · h)˙ = q̇ · h + q · ḣ.    (A.51)

For W = E , we have

∇(fh) = h⊗∇f + f∇h,
∇(q · h) = (∇q)Th + (∇h)Tq.    (A.52)

For W = V , we have

∂v(fh) = h⊗ ∂vf + f ∂vh,
∂v(q · h) = (∂vq)Th + (∂vh)Tq.    (A.53)

Unlike the simple formulae in (A.51), the relations in (A.52) and (A.53) do not look like the familiar product rules, because they have to be consistent with our notation conventions.
Let us demonstrate the first relation of (A.52). By the product rule (A.50), for any w ∈ V , we have

∇(fh)[w] = (∇f [w])h + f(∇h[w]) = (∇f ·w)h + f(∇h)w
= (h⊗∇f)w + f(∇h)w = ( h⊗∇f + f(∇h) )[w],

where in the third step we have used the definition (A.4). □
If f : D ⊂ U → W is differentiable and its derivative Df is continuous in D, we say that f is of class C1. The derivative is again a function, Df : D → W ⊗ U , for which we can talk about differentiability and continuity. We say that f is of class C2 if Df is of class C1, and so forth. Frequently, we say a function is smooth to mean that it is of class Ck for some k ≥ 1. We mention the following
Inverse Function Theorem. Let D ⊂ W be an open subset and f : D →W be a one-to-one function of class Ck (k ≥ 1). Assume that the linear transformation Df(X) : W →W is invertible at each X ∈ D; then f−1 exists and is of class Ck.
Example A.2.6 Let D ⊂ E and φ : D → IR be of class C2. Then thesecond gradient of φ is a symmetric tensor, that is, ∇(∇φ) ∈ Sym(V ).
Indeed, from the definition, we have
∇φ(x+ u)−∇φ(x) = ∇(∇φ)[u] + o(u).
Taking the inner product with v, we obtain

∇φ(x+ u)[v]−∇φ(x)[v] = v · ∇(∇φ)u + o(u),

which implies that

v · ∇(∇φ)u = ( φ(x+ u + v)− φ(x+ u) ) − ( φ(x+ v)− φ(x) ) + o(u) + o(v).
Since the right-hand side of the last relation is symmetric in u and v, itfollows that
v · ∇(∇φ)u = u · ∇(∇φ)v,
which proves that the second gradient of φ is symmetric. □
Exercise A.2.1 Show that if Q : IR → O(V ) is differentiable, then Q̇QT is skew-symmetric.
Exercise A.2.2 Let h(v, A) = (v · Av)A2v be a vector function of avector v and a second order tensor A. Compute ∂vh and (∂Ah)[S] forany S ∈ L(V ).
Exercise A.2.3 If A ∈ L(V ) is invertible, show that
1) (∂AA−1)[S] = −A−1SA−1, for any S ∈ L(V ).
2) ∂A tr(A−1) = −(A−2)T .
Exercise A.2.4 Let A be a second order tensor. Show that
1) For any positive integer k,

∂A trAk = k(Ak−1)T .

2) For the principal invariants IA, IIA, IIIA,

∂AIA = 1 ,
∂AIIA = (IA1 −A)T ,
∂AIIIA = (IIA1 − IAA+A2)T .    (A.54)

Hint: Calculate ∂A det(A+ λ1 ) = ∂A(λ3 + IAλ2 + IIAλ+ IIIA).
A.2.3 Coordinate System
Tensor functions can be expressed in terms of components relative to smoothfields of bases in the Euclidean point space E associated with a coordinatesystem.
Definition. Let D ⊂ E be an open set. A coordinate system on D is a smoothone-to-one mapping
ψ : D −→ U,
where U is an open set in IRn, such that ψ−1 is also smooth.
Let x ∈ D and

ψ : x 7−→ (x1, · · · , xn) = ψ(x).

(x1, · · · , xn) is called the (curvilinear) coordinate of x, and the functions

χi : D −→ IR,
χi(x) = xi, i = 1, · · · , n,    (A.55)

are called the ith coordinate functions of ψ. For convenience, we call (xi) a coordinate system on D.
Let χ = ψ−1; then

x = χ(x1, · · · , xn). (A.56)

For x1, · · · , xn fixed, the mapping

λi : IR −→ D,
λi(t) = χ(x1, · · · , xi + t, · · · , xn),    (A.57)
is a curve in D passing through x at t = 0, called the ith coordinate curve at x (Fig. A.2). We denote the tangent of this curve at x by ei(x),

ei(x) = λ̇i(t)|_{t=0} = ∂χ/∂xi |_{(x1,···,xn)}. (A.58)
Proposition. The set {ei(x), i = 1, · · · , n} forms a basis for the tangentspace Ex.
Proof : For any vector v ∈ Ex, we can define a curve through x by

λ(t) = x+ tv.

Let

λ(t) = χ(λ1(t), · · · , λn(t)),

where λi(t) are the coordinates of λ(t), given by

λi(t) = χi(x+ tv). (A.59)
Fig. A.2. Coordinate curve
Then the tangent vector

v = λ̇(t)|_{t=0} = ∂χ/∂xi |_x dλi/dt |_{t=0} = dλi/dt |_{t=0} ei(x),
by (A.58). In other words, {ei(x)} spans the space Ex. □
The set {e_i(x)} is a basis of Ex for each x. This field of bases is called the natural basis of the coordinate system (x^i) for V , the translation space of E . The corresponding dual basis of this natural basis is denoted by {e^i(x)}.
Combining (A.55) and (A.56), we have

x^i = χ^i(χ(x^1, · · · , x^n)),

which implies

∂x^i/∂x^j = δ^i_j = (∇χ^i) · ∂χ/∂x^j = (∇χ^i) · e_j(x),
by (A.58). Therefore, the two natural bases of the coordinate system (xi) aregiven by the following relations:
e_i(x) = ∂χ/∂x^i |_x, e^i(x) = ∇χ^i(x). (A.60)
The inner products,
g_ij(x) = e_i(x) · e_j(x), g^ij(x) = e^i(x) · e^j(x),
are called the metric tensors of the coordinate system.
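As a concrete illustration (plane polar coordinates, a system chosen here for definiteness rather than taken from the text), the natural basis (A.58) and the metric tensor can be computed by finite differences:

```python
import numpy as np

# plane polar coordinates: x = chi(r, theta) = (r cos(theta), r sin(theta))
def chi(r, th):
    return np.array([r * np.cos(th), r * np.sin(th)])

def natural_basis(r, th, h=1e-6):
    # e_i = d(chi)/dx^i, approximated by central differences (A.58)
    e_r  = (chi(r + h, th) - chi(r - h, th)) / (2.0 * h)
    e_th = (chi(r, th + h) - chi(r, th - h)) / (2.0 * h)
    return e_r, e_th

r, th = 2.0, 0.7
e_r, e_th = natural_basis(r, th)

# metric g_ij = e_i . e_j: for polar coordinates, diag(1, r^2)
g = np.array([[e_r.dot(e_r),  e_r.dot(e_th)],
              [e_th.dot(e_r), e_th.dot(e_th)]])
assert np.allclose(g, np.diag([1.0, r**2]), atol=1e-5)
```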
Now let us consider changes of coordinate systems. Let (x^i) and (x̄^i) be two coordinate systems on D, and {e_i(x)}, {ē_i(x)} be the corresponding natural bases. Suppose that the coordinate transformations are given by

x̄^i = x̄^i(x^1, · · · , x^n),
x^k = x^k(x̄^1, · · · , x̄^n).

Then by taking gradients, one immediately obtains the change of the corresponding natural bases,

ē^i(x) = (∂x̄^i/∂x^k) e^k(x),  ē_i(x) = (∂x^k/∂x̄^i) e_k(x). (A.61)

Comparing with the change of bases considered in Sect. A.1.4, [∂x̄^i/∂x^k] plays the role of the transformation matrix [M^i_k] in (A.12), and hence the transformation rule (A.14) for the components of an arbitrary tensor under the change of coordinate system becomes

Ā^i_j = A^k_l (∂x̄^i/∂x^k)(∂x^l/∂x̄^j). (A.62)

For other components of tensors in general, the transformation rules are similar.
Example A.2.7 Let us consider a deformation κ : D → E ,

κ(x) = x̄.

Let (x^i) be a coordinate system on D, and (x̄^α) a coordinate system on κ(D),

x = χ(x^1, · · · , x^n), x̄ = χ̄(x̄^1, · · · , x̄^n).

The deformation κ is usually expressed explicitly in the form

x̄^α = κ^α(x^1, · · · , x^n), α = 1, · · · , n. (A.63)

Using the chain rule, we obtain, with x^i = χ^i(x),

∇κ(x) = ∂χ̄/∂x̄^α |_{x̄} (∂κ^α/∂x^i)|_x ∇χ^i(x),

which by (A.60) becomes

∇κ(x) = (∂κ^α/∂x^i)|_x ē_α(κ(x)) ⊗ e^i(x).

This is the component form of the deformation gradient ∇κ(x) in terms of the two coordinate systems (x^i) and (x̄^α). With respect to these two natural bases at two different points, namely x and κ(x), the components of the deformation gradient are just the partial derivatives of the deformation function (A.63), which can most easily be calculated. Other component forms of ∇κ can be obtained through the metric tensors and the changes of bases relative to the coordinate systems. □
A.2.4 Covariant Derivatives
We shall now consider the component form of the gradient of a tensor field in general relative to the natural basis of a coordinate system. Let (x^i) be a coordinate system on D ⊂ E , and let {e_i(x)}, {e^i(x)} be its natural bases.
To begin with, let us consider a scalar field f : D → IR; the gradient of f is then a vector field. By (A.44), (A.57), and (A.58) we have

(∇f(x)) · e_i(x) = lim_{t→0} (1/t)( f(x+ t e_i)− f(x) )
= lim_{t→0} (1/t)( f(χ(x^1, · · · , x^i + t, · · · , x^n))− f(χ(x^1, · · · , x^n)) )
= ∂(f ◦ χ)/∂x^i |_{(x^1,···,x^n)},

which are the covariant components of ∇f .
Usually, we shall write f(χ(x^1, · · · , x^n)) as f(x^1, · · · , x^n) for simplicity.
Therefore, the component form of the gradient of f(x) becomes

∇f(x) = (∂f/∂x^i)|_x e^i(x). (A.64)
In other words, for the gradient of a scalar field f , its covariant component relative to the natural basis, (∇f)_i, is just the partial derivative with respect to the coordinate x^i.
Now let us consider the gradients of the natural bases themselves. For each fixed i, {e_i} and {e^i} can be regarded as vector fields on D,

e_i : x ∈ D 7−→ e_i(x) ∈ Ex.
Let us denote the gradients of the natural bases by

Γ_i(x) = ∇e_i(x) ∈ Ex ⊗ Ex,
Γ^i(x) = ∇e^i(x) ∈ Ex ⊗ Ex.    (A.65)

We write

Γ_i = Γ^j_{i k} e_j ⊗ e^k,  Γ^i = Γ^i_{jk} e^j ⊗ e^k. (A.66)

The components Γ^j_{i k} and Γ^i_{jk} are called the Christoffel symbols. Note that Γ^j_{i k} and Γ^i_{jk} are not the associated components of a third order tensor.
By taking the gradient of (e^i(x) · e_j(x)), one can obtain the relation

Γ^i_{jk} = −Γ^i_{j k}. (A.67)
Moreover, since Γ^i = ∇(∇χ^i(x)) by (A.60) and the second gradient is a symmetric tensor, we have the following symmetry conditions:

Γ^i_{jk} = Γ^i_{kj}, Γ^i_{j k} = Γ^i_{k j}. (A.68)

Since the two Christoffel symbols are related in such a simple manner, usually only one is used, namely Γ^i_{j k}; it is called the Christoffel symbol of the second kind in classical tensor analysis.
Now let us calculate the gradient of a vector field in terms of the coordinate system. Suppose that $v(x)$ is a vector field and
\[
v(x) = v^i(x)\,e_i(x) = v_i(x)\,e^i(x).
\]
Then by (A.52)$_1$, (A.64), (A.65), and (A.66), we have
\[
\nabla v = \nabla(v^i e_i)
= e_i \otimes \nabla v^i + v^i\,\nabla e_i
= e_i \otimes \frac{\partial v^i}{\partial x^k}\,e^k + v^i\,\Gamma^j{}_{i\,k}\; e_j \otimes e^k
= \Bigl(\frac{\partial v^j}{\partial x^k} + v^i\,\Gamma^j{}_{i\,k}\Bigr)\, e_j \otimes e^k.
\]
Hence, the gradient of $v(x)$ has the component form
\[
\nabla v = v^j{}_{,k}\; e_j \otimes e^k,
\]
where
\[
v^j{}_{,k} = \frac{\partial v^j}{\partial x^k} + v^i\,\Gamma^j{}_{i\,k}. \tag{A.69}
\]
Similarly, we also have
\[
\nabla v = v_{j,k}\; e^j \otimes e^k,
\]
where
\[
v_{j,k} = \frac{\partial v_j}{\partial x^k} - v_i\,\Gamma^i{}_{j\,k}. \tag{A.70}
\]
Here the relation (A.67) has been used. $v^j{}_{,k}$ and $v_{j,k}$ are the mixed and the covariant components of $\nabla v$. The comma stands for the operation called the covariant derivative, since it increases the covariant order of the components by one.
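The formula (A.69) can be checked numerically. In polar coordinates $(r, \theta)$ the constant Cartesian field $i_1$ has contravariant components $v^r = \cos\theta$, $v^\theta = -\sin\theta/r$, so its covariant derivative must vanish identically. A minimal Python sketch of this check (the partial derivatives are taken by central differences; the evaluation point is an arbitrary choice):

```python
import math

# Polar coordinates (r, theta).  The constant Cartesian field i1 has
# contravariant polar components v^r = cos(theta), v^theta = -sin(theta)/r;
# by (A.69) its covariant derivative v^j_,k must vanish.  The Christoffel
# symbols of polar coordinates are Gamma^theta_{r theta} =
# Gamma^theta_{theta r} = 1/r and Gamma^r_{theta theta} = -r.

def v(r, th):
    return (math.cos(th), -math.sin(th) / r)   # (v^r, v^theta)

def gamma(r, th):
    # G[j][i][k] = Gamma^j_{i k}
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r            # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r       # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r       # Gamma^theta_{theta r}
    return G

def covariant_derivative(r, th, h=1e-6):
    """v^j_,k = dv^j/dx^k + v^i Gamma^j_{i k}, partials by central differences."""
    G = gamma(r, th)
    dv = [[(v(r + h, th)[j] - v(r - h, th)[j]) / (2 * h),
           (v(r, th + h)[j] - v(r, th - h)[j]) / (2 * h)] for j in range(2)]
    vv = v(r, th)
    return [[dv[j][k] + sum(vv[i] * G[j][i][k] for i in range(2))
             for k in range(2)] for j in range(2)]

D = covariant_derivative(2.0, 0.7)
print(D)   # all four entries are ~0
```

The plain partial derivatives $\partial v^j/\partial x^k$ do not vanish here; only after the Christoffel correction of (A.69) does the derivative of a constant field come out zero.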
More generally, suppose that $A$ is a second order tensor field; then $\nabla A$ is a third order tensor field which has the following component form,
\[
\nabla A = A^i{}_{j,k}\; e_i \otimes e^j \otimes e^k,
\]
where
\[
A^i{}_{j,k} = \frac{\partial A^i{}_j}{\partial x^k} + A^l{}_j\,\Gamma^i{}_{l\,k} - A^i{}_l\,\Gamma^l{}_{j\,k}. \tag{A.71}
\]
Covariant derivatives of other components can easily be written down using the same recipes for covariant and contravariant components respectively.
We have seen in (A.7) that the components of the metric tensor, $g_{ij}(x)$ and $g^{ij}(x)$, are also the components of the identity tensor; therefore their covariant derivatives must vanish,
\[
g_{ij,k} = 0, \qquad g^{ij}{}_{,k} = 0. \tag{A.72}
\]
Consequently, by (A.24), the covariant derivatives of the volume tensor also vanish,
\[
e_{ijk,l} = 0, \qquad e^{ijk}{}_{,l} = 0.
\]
In other words, the components of the metric tensor and the volume tensor behave like constant tensors in covariant differentiation, although they are in general functions of $x$.
From (A.72)$_1$, we can derive a formula for the determination of the Christoffel symbols in terms of the metric tensor. By (A.71) we have
\[
\frac{\partial g_{ij}}{\partial x^k} = g_{lj}\,\Gamma^l{}_{i\,k} + g_{il}\,\Gamma^l{}_{j\,k}.
\]
Rotating the indices $(i, j, k)$ of this relation, then adding two of the three resulting equations and subtracting the remaining one, we get
\[
2\, g_{lj}\,\Gamma^l{}_{i\,k} = \frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ij}}{\partial x^k} - \frac{\partial g_{ik}}{\partial x^j}.
\]
Hence, we have the following formula:
\[
\Gamma^j{}_{i\,k} = \frac{1}{2}\, g^{jl}\Bigl(\frac{\partial g_{li}}{\partial x^k} + \frac{\partial g_{lk}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^l}\Bigr). \tag{A.73}
\]
The Christoffel symbols are not components of a third order tensor. For two coordinate systems $(x^i)$ and $(\bar{x}^i)$, they have the following transformation rule:
\[
\bar{\Gamma}^j{}_{i\,k} = \Gamma^s{}_{r\,t}\,\frac{\partial x^r}{\partial \bar{x}^i}\,\frac{\partial \bar{x}^j}{\partial x^s}\,\frac{\partial x^t}{\partial \bar{x}^k} + \frac{\partial^2 x^r}{\partial \bar{x}^i\,\partial \bar{x}^k}\,\frac{\partial \bar{x}^j}{\partial x^r}.
\]
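Formula (A.73) lends itself to a direct numerical check: one can differentiate a given metric by finite differences and compare the resulting symbols with their closed forms. A minimal Python sketch for the cylindrical metric $g = \mathrm{diag}(1, r^2, 1)$ (the evaluation point is an arbitrary choice; the expected values $\Gamma^\theta{}_{r\,\theta} = 1/r$ and $\Gamma^r{}_{\theta\,\theta} = -r$ are derived later in Sect. A.2.7):

```python
import math

# Numerical sketch of (A.73): compute Gamma^j_{i k} from finite-difference
# derivatives of the cylindrical metric g = diag(1, r^2, 1) and compare
# with the closed-form values Gamma^theta_{r theta} = 1/r,
# Gamma^r_{theta theta} = -r.

def metric(x):
    r = x[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]

def christoffel(x, h=1e-6):
    n = 3
    g = metric(x)
    # the metric is diagonal, so its inverse is entrywise reciprocal
    ginv = [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    def dg(l, m, k):  # d g_{lm} / d x^k by central differences
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (metric(xp)[l][m] - metric(xm)[l][m]) / (2 * h)
    # (A.73): Gamma^j_{i k} = (1/2) g^{jl} (dg_{li}/dx^k + dg_{lk}/dx^i - dg_{ik}/dx^l)
    return [[[0.5 * sum(ginv[j][l] * (dg(l, i, k) + dg(l, k, i) - dg(i, k, l))
                        for l in range(n))
              for k in range(n)] for i in range(n)] for j in range(n)]

x = (2.0, 0.5, 1.0)             # (r, theta, z); coordinate order 0=r, 1=theta, 2=z
G = christoffel(x)
print(G[1][0][1], G[0][1][1])   # ~ 1/r = 0.5 and ~ -r = -2.0
```

All other symbols come out numerically zero, in agreement with the table of Sect. A.2.7.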
A.2.5 Other Differential Operators
Divergence and curl of a vector field can be defined in the usual way, and their definitions can be adapted to tensor fields as well.
Definition. The divergence of a vector field $u$ is a scalar field defined by
\[
\operatorname{div} u = \operatorname{tr}(\nabla u). \tag{A.74}
\]
In component form,
\[
\operatorname{div} u = u^i{}_{,i}.
\]
Definition. The curl (or rotation) of $u$ is a vector field defined by
\[
\operatorname{curl} u = \langle \nabla u^T - \nabla u \rangle.
\]
In component form,
\[
\operatorname{curl} u = e^{ijk}\, u_{k,j}\, e_i.
\]
Here the duality map defined in (A.29) is employed, and according to (A.30), $\operatorname{curl} u$ is the axial vector of the skew-symmetric part of the gradient of $(-2u)$. One can easily verify the following condition:
\[
v \cdot \operatorname{curl} u = \operatorname{div}(u \times v),
\]
for any constant vector field $v$. This condition can be used as the definition of the curl operator. In a similar manner, we can define the divergence of a second order tensor in terms of the divergence of a vector.
Definition. The divergence of a second order tensor field $S$ is a vector field defined by the condition: for any constant vector field $v$,
\[
v \cdot \operatorname{div} S = \operatorname{div}(S^T v). \tag{A.75}
\]
In component form, we have
\[
\operatorname{div} S = S^{ij}{}_{,j}\, e_i.
\]
Definition. The Laplacian of a scalar (or vector) field $\phi$, denoted by $\nabla^2\phi$, is a scalar (or vector) field defined by
\[
\nabla^2\phi = \operatorname{div}(\nabla\phi).
\]
In component form, if $\phi$ is a scalar field,
\[
\nabla^2\phi = g^{jk}(\phi_{,j})_{,k} = g^{jk}\,\phi_{,jk}.
\]
If $\phi = h$ is a vector field,
\[
\nabla^2 h = g^{jk}\, h^i{}_{,jk}\, e_i.
\]
In the above expressions, the comma denotes the covariant derivative.
Example A.2.8 Let $f$ and $u$, $v$ be scalar and vector fields respectively. Then we can show the following relations:
\[
\begin{aligned}
\operatorname{div}(f u) &= u \cdot \nabla f + f \operatorname{div} u,\\
\operatorname{div}(u \times v) &= v \cdot \operatorname{curl} u - u \cdot \operatorname{curl} v,\\
\nabla^2(u \cdot v) &= \nabla^2 u \cdot v + 2\,\nabla u \cdot \nabla v + u \cdot \nabla^2 v.
\end{aligned} \tag{A.76}
\]
Let us verify the first relation.
\[
\operatorname{div}(f u) = \operatorname{tr}(\nabla(f u))
= \operatorname{tr}\bigl(u \otimes \nabla f + f\,\nabla u\bigr)
= \operatorname{tr}(u \otimes \nabla f) + f \operatorname{tr}(\nabla u),
\]
which gives (A.76)$_1$. In this calculation, we have used the definition (A.74), the relation (A.52)$_1$, and the linearity of the trace operator.

Verification of the other relations in (A.76) may not be so straightforward in direct notation. And more annoyingly, these relations, as well as the relations (A.52) and (A.53), are not easy to memorize. Nevertheless, if we express all of these relations in index notation, they become trivially simple. Indeed, (A.76) may be written out directly as:
\[
\begin{aligned}
(f u^i)_{,i} &= f_{,i}\,u^i + f\,u^i{}_{,i},\\
(g^{il} e_{ljk} u^j v^k)_{,i} &= g^{il} e_{ljk}\, u^j{}_{,i}\, v^k + g^{il} e_{ljk}\, u^j\, v^k{}_{,i},\\
g^{jk}(u^i v_i)_{,jk} &= g^{jk}(u^i{}_{,j} v_i + u^i v_{i,j})_{,k}
= g^{jk} u^i{}_{,jk}\, v_i + 2\, g^{jk} u^i{}_{,k}\, v_{i,j} + g^{jk} u^i\, v_{i,jk},
\end{aligned}
\]
which are merely the usual product rules of differentiating scalar functions and the symmetry of the second gradient. The only difference here is that the comma denotes the covariant derivative instead of the usual partial derivative. □
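The identity (A.76)$_1$ can also be checked numerically. In Cartesian coordinates the covariant derivatives reduce to partial derivatives, so a finite-difference evaluation of both sides suffices. A minimal Python sketch (the fields $f$ and $u$ below are arbitrary choices):

```python
import math

# Finite-difference check of (A.76)_1, div(f u) = u . grad f + f div u,
# in Cartesian coordinates, where covariant derivatives reduce to
# partial derivatives.  The fields f and u are arbitrary smooth choices.

def f(x, y, z):
    return x * y + math.sin(z)

def u(x, y, z):
    return (y * z, x * x, math.cos(x * y))

def div(field, p, h=1e-5):
    # sum of central-difference partials d(field^k)/dx^k
    return sum((field(*[p[j] + h * (j == k) for j in range(3)])[k]
                - field(*[p[j] - h * (j == k) for j in range(3)])[k]) / (2 * h)
               for k in range(3))

def grad(scalar, p, h=1e-5):
    return [(scalar(*[p[j] + h * (j == k) for j in range(3)])
             - scalar(*[p[j] - h * (j == k) for j in range(3)])) / (2 * h)
            for k in range(3)]

p = (0.3, 0.7, 1.1)
lhs = div(lambda x, y, z: tuple(f(x, y, z) * c for c in u(x, y, z)), p)
rhs = sum(a * b for a, b in zip(u(*p), grad(f, p))) + f(*p) * div(u, p)
print(lhs, rhs)   # the two values agree to finite-difference accuracy
```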
Remark. From the observation made in the above example, the use of index notation is often encouraged, especially when complicated calculations are involved. In arbitrary curvilinear coordinate systems, contravariant and covariant indices must be carefully distinguished, and the pair of repeated indices, for which the summation convention is applied, must always appear at different levels. An index can be raised or lowered to its proper level with the metric tensor $g_{ij}$ or $g^{ij}$. Moreover, since the gradients of the metric tensor and the volume tensor vanish, in covariant differentiation the metric tensor $g_{ij}$ as well as the components of the volume element $e_{ijk}$ can be treated as constants. Furthermore, if a Cartesian coordinate system is used, there is no difference between contravariant and covariant components, hence all the indices can be written at the same level; more conveniently, the covariant derivative becomes the partial derivative, and $g_{ij} = \delta_{ij}$, $e_{ijk} = \varepsilon_{ijk}$ are constants.

It is important to note that given an expression in index notation, one can always turn it into an expression in direct notation, or vice versa. Therefore, in handling calculations, the choice of using direct notation or index notation, or even Cartesian index notation, is totally up to one's taste and convenience.
We shall also mention some important theorems of integral calculus often used in mechanics.

Divergence Theorem. Let $\mathcal{R}$ be a bounded regular region² in $\mathcal{E}$, and let $\phi : \mathcal{R} \to$ IR, $v : \mathcal{R} \to V$, $S : \mathcal{R} \to L(V)$ be smooth fields. Then
\[
\int_{\partial\mathcal{R}} \phi\, n\, da = \int_{\mathcal{R}} \nabla\phi\, dv, \qquad
\int_{\partial\mathcal{R}} v \cdot n\, da = \int_{\mathcal{R}} \operatorname{div} v\, dv, \qquad
\int_{\partial\mathcal{R}} S n\, da = \int_{\mathcal{R}} \operatorname{div} S\, dv, \tag{A.77}
\]
where $n$ is the outward unit normal field on $\partial\mathcal{R}$.
Proof: The relations (A.77)$_{1,2}$ are well-known classical results. To show (A.77)$_3$, let $v$ be an arbitrary constant vector. Then
\[
v \cdot \int_{\partial\mathcal{R}} S n\, da
= \int_{\partial\mathcal{R}} v \cdot S n\, da
= \int_{\partial\mathcal{R}} S^T v \cdot n\, da
= \int_{\mathcal{R}} \operatorname{div}(S^T v)\, dv
= \int_{\mathcal{R}} v \cdot \operatorname{div} S\, dv
= v \cdot \int_{\mathcal{R}} \operatorname{div} S\, dv,
\]
where we have used (A.77)$_2$ and the definition (A.75). □
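Relation (A.77)$_2$ itself is easy to illustrate numerically: for a field with a polynomial divergence, midpoint quadrature gives the surface flux and the volume integral to high accuracy. A minimal Python sketch on the unit cube with the arbitrarily chosen field $v = (x^2, y^2, z^2)$, for which $\operatorname{div} v = 2(x+y+z)$ and both integrals equal $3$:

```python
# Numerical sketch of (A.77)_2 for v = (x^2, y^2, z^2) on the unit cube:
# the surface flux and the volume integral of div v = 2(x+y+z) should
# both come out equal to 3.

N = 40
h = 1.0 / N
pts = [(i + 0.5) * h for i in range(N)]   # midpoint quadrature nodes

def v(x, y, z):
    return (x * x, y * y, z * z)

# volume integral of div v (midpoint rule, exact for a linear integrand)
vol = sum(2 * (x + y + z) * h**3 for x in pts for y in pts for z in pts)

# surface flux: on each face only one component of v . n survives
flux = 0.0
for a in pts:
    for b in pts:
        flux += (v(1.0, a, b)[0] - v(0.0, a, b)[0]) * h**2   # faces x = 1, x = 0
        flux += (v(a, 1.0, b)[1] - v(a, 0.0, b)[1]) * h**2   # faces y = 1, y = 0
        flux += (v(a, b, 1.0)[2] - v(a, b, 0.0)[2]) * h**2   # faces z = 1, z = 0
print(vol, flux)   # both ~3
```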
Proposition. Let $\phi : D \to W$ be a continuous function on an open set $D$ in $\mathcal{E}$. If
\[
\int_{N} \phi\, dv = 0,
\]
for any $N \subset D$, then $\phi$ is identically zero in $D$, i.e.,
\[
\phi(x) = 0, \qquad \forall\, x \in D.
\]
Proof: Suppose that $\phi(x_0) \neq 0$ for some $x_0 \in D$. Then, since $\phi$ is continuous, there exists a small neighborhood $N \subset D$ containing $x_0$ such that $\phi(x) \neq 0$, $\forall\, x \in N$. Therefore, by the mean value theorem of integral calculus,
\[
\int_{N} \phi\, dv = K\,\phi(\bar{x}) \neq 0,
\]
for some $\bar{x} \in N$, where $K$ denotes the volume of $N$. This contradicts the hypothesis. □

² A regular region, roughly speaking, is a closed region with piecewise smooth boundary.
This proposition and the divergence theorem enable us to deduce localfield equations from the integral balance laws.
Exercise A.2.5 Let $f$, $u$, $v$, and $S$ be smooth scalar, vector, and second order tensor fields. Verify the following identities:
1) $\operatorname{div}(S u) = u \cdot \operatorname{div} S^T + \operatorname{tr}(S\,\nabla u)$,
2) $\operatorname{div}(f S) = S\,\nabla f + f \operatorname{div} S$,
3) $\operatorname{div}(u \otimes v) = (\nabla u)v + u \operatorname{div} v$,
4) $\operatorname{div}(\nabla u)^T = \nabla(\operatorname{div} u)$.
Exercise A.2.6 Let $f$ and $v$ be smooth scalar and vector fields respectively. Show that
1) $\operatorname{curl} \nabla f = 0$,
2) $\operatorname{div} \operatorname{curl} v = 0$,
3) if $\operatorname{div} v = 0$ and $\operatorname{curl} v = 0$, then $\nabla^2 v = 0$.
Exercise A.2.7 Let $v$ and $S$ be smooth vector and tensor fields on a bounded regular region $\mathcal{R}$. Show that
1) $\int_{\partial\mathcal{R}} v \otimes n\, da = \int_{\mathcal{R}} \nabla v\, dv$,
2) $\int_{\partial\mathcal{R}} v \otimes S n\, da = \int_{\mathcal{R}} \bigl(v \otimes \operatorname{div} S + (\nabla v) S^T\bigr)\, dv$.
A.2.6 Physical Components
Let $(x^i)$ be a coordinate system on $\mathcal{E}$, and $\{e_i(x)\}$ and $\{e^i(x)\}$ be its natural bases. The system $(x^i)$ is called an orthogonal coordinate system if the metric tensor satisfies
\[
g_{ij}(x) = 0, \quad \text{for } i \neq j, \qquad \forall\, x \in \mathcal{E}.
\]
For an orthogonal coordinate system, we can define a field of orthonormal bases, denoted by $\{e_{\langle i\rangle}(x)\}$, by normalizing the natural basis,
\[
e_{\langle i\rangle} = \frac{e_i}{|e_i|}. \quad \text{(no sum)}
\]
In this expression the summation convention is not invoked, as indicated explicitly. Since
\[
|e_i| = \sqrt{e_i \cdot e_i} = \sqrt{g_{ii}}, \quad \text{(no sum)}
\]
therefore,
\[
e_{\langle i\rangle} = \frac{e_i}{\sqrt{g_{ii}}} = \frac{e^i}{\sqrt{g^{ii}}} = \sqrt{g_{ii}}\; e^i = \sqrt{g^{ii}}\; e_i. \quad \text{(no sum)}
\]
Here we have noted that normalization of the two dual natural bases of an orthogonal coordinate system gives rise to the same orthonormal basis.

The components of a tensor field relative to the orthonormal basis $\{e_{\langle i\rangle}(x)\}$ are called the physical components in the coordinate system $(x^i)$. For a vector field $v$,
\[
v = v^i e_i = v_i e^i = v_{\langle i\rangle}\, e_{\langle i\rangle}.
\]
The physical components $v_{\langle i\rangle}$ are given by
\[
v_{\langle i\rangle} = \sqrt{g_{ii}}\; v^i = \frac{v_i}{\sqrt{g_{ii}}}. \quad \text{(no sum)} \tag{A.78}
\]
For a second order tensor field $T$,
\[
T = T^{ij}\, e_i \otimes e_j = T_{ij}\, e^i \otimes e^j = T^i{}_j\, e_i \otimes e^j
= T_{\langle ij\rangle}\, e_{\langle i\rangle} \otimes e_{\langle j\rangle}.
\]
The physical components $T_{\langle ij\rangle}$ are given by
\[
T_{\langle ij\rangle} = \sqrt{g_{ii}}\,\sqrt{g_{jj}}\; T^{ij} = \frac{T_{ij}}{\sqrt{g_{ii}}\,\sqrt{g_{jj}}} = \frac{\sqrt{g_{ii}}}{\sqrt{g_{jj}}}\; T^i{}_j. \quad \text{(no sum)} \tag{A.79}
\]
In particular, we have $g_{\langle ij\rangle} = \delta_{ij}$.

The advantage of using physical components is obvious in practical applications. Since the norms of the basis vectors of the natural basis in general vary from point to point in $\mathcal{E}$, it is inconvenient to measure physical quantities relative to such a basis.
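The conversion (A.78) is easy to exercise concretely. In polar coordinates, with metric $g = \mathrm{diag}(1, r^2)$, the physical components are $v_{\langle r\rangle} = v^r$ and $v_{\langle\theta\rangle} = r\, v^\theta$, and the Euclidean length of $v$ is recovered as the square root of the sum of squares of the physical components. A minimal Python sketch (the point and the contravariant components are arbitrary choices):

```python
import math

# Physical components in polar coordinates, as in (A.78): with the
# orthogonal metric g = diag(1, r^2), v<r> = v^r and v<theta> = r v^theta.
# The length |v|^2 = g_ij v^i v^j must equal the sum of squares of the
# physical components.

r, th = 2.0, 0.9
vr, vth = 1.3, -0.4          # contravariant components (v^r, v^theta)
g = [[1.0, 0.0], [0.0, r * r]]

# (A.78), no sum: v<i> = sqrt(g_ii) v^i
phys = [math.sqrt(g[i][i]) * c for i, c in enumerate((vr, vth))]

len_metric = math.sqrt(sum(g[i][j] * (vr, vth)[i] * (vr, vth)[j]
                           for i in range(2) for j in range(2)))
len_phys = math.sqrt(sum(c * c for c in phys))
print(phys, len_metric, len_phys)   # the two lengths agree
```

This is precisely why physical components are convenient: lengths and angles are read off from them as if the coordinates were Cartesian.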
A.2.7 Orthogonal Coordinate Systems
We now consider the three most commonly used orthogonal coordinate systems: the Cartesian, the cylindrical, and the spherical coordinate systems, and derive their basic characteristics.
a) Cartesian Coordinate System
Fix a point $o$ in $\mathcal{E}$. Let $\{i_1, i_2, i_3\}$ be an orthonormal basis of $V$. For any $x \in \mathcal{E}$, we have $x - o \in V$, and we write
\[
x - o = x_i\, i_i.
\]
Clearly, this defines a coordinate system
\[
x \longmapsto (x_1, x_2, x_3)
\]
with $\{i_1, i_2, i_3\}$ as its natural basis, which is of course independent of $x \in \mathcal{E}$. We call such a system a Cartesian coordinate system.
For a Cartesian coordinate system, we have
\[
g_{ij}(x) = \delta_{ij}, \qquad \forall\, x \in \mathcal{E},
\]
and hence by (A.73),
\[
\Gamma^i{}_{j\,k}(x) = 0.
\]
It is also customary to write the basis $\{i_1, i_2, i_3\}$ as $\{e_x, e_y, e_z\}$ and the coordinates $(x_1, x_2, x_3)$ as $(x, y, z)$ for a Cartesian coordinate system.
b) Cylindrical Coordinate System
The cylindrical coordinate system $(r, \theta, z)$ is defined as
\[
x = \chi(r, \theta, z),
\]
by the following coordinate transformation (see Fig. A.3 (a)),
\[
x_1 = r\cos\theta, \qquad x_2 = r\sin\theta, \qquad x_3 = z, \qquad r > 0, \quad 0 < \theta < 2\pi, \tag{A.80}
\]
where $(x_1, x_2, x_3)$ is the Cartesian coordinate system.

The natural bases are denoted by $\{e_r, e_\theta, e_z\}$ and $\{e^r, e^\theta, e^z\}$. From (A.80) and (A.60)$_2$, we can determine the basis in terms of the Cartesian components:
\[
\begin{aligned}
e_r &= \frac{\partial \chi}{\partial r} = \cos\theta\; i_1 + \sin\theta\; i_2,\\
e_\theta &= \frac{\partial \chi}{\partial \theta} = -r\sin\theta\; i_1 + r\cos\theta\; i_2,\\
e_z &= \frac{\partial \chi}{\partial z} = i_3.
\end{aligned}
\]
Therefore, we obtain the matrix representations of the metric tensor in the cylindrical coordinate system,
\[
[g_{ij}] = \begin{bmatrix} 1 & & \\ & r^2 & \\ & & 1 \end{bmatrix}, \qquad
[g^{ij}] = \begin{bmatrix} 1 & & \\ & r^{-2} & \\ & & 1 \end{bmatrix},
\]
and the Christoffel symbols by (A.73),
\[
\Gamma^\theta{}_{r\,\theta} = \Gamma^\theta{}_{\theta\,r} = \frac{1}{r}, \qquad
\Gamma^r{}_{\theta\,\theta} = -r, \qquad \text{others} = 0.
\]
Moreover, we have
\[
e^r = e_r, \qquad e_\theta = r^2\, e^\theta, \qquad e^z = e_z,
\]
and
\[
\begin{aligned}
e_{\langle r\rangle} &= \cos\theta\; i_1 + \sin\theta\; i_2,\\
e_{\langle \theta\rangle} &= -\sin\theta\; i_1 + \cos\theta\; i_2,\\
e_{\langle z\rangle} &= i_3.
\end{aligned}
\]
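The natural basis $e_\theta = \partial\chi/\partial\theta$ and the metric component $g_{\theta\theta} = e_\theta \cdot e_\theta = r^2$ can be recovered numerically from the coordinate transformation (A.80) alone. A minimal Python sketch (the derivative is taken by central differences at an arbitrarily chosen point):

```python
import math

# Finite-difference check of the cylindrical natural basis: e_theta =
# d(chi)/d(theta) should equal (-r sin t, r cos t, 0), and the metric
# component g_{theta theta} = e_theta . e_theta should equal r^2.

def chi(r, t, z):
    # coordinate transformation (A.80)
    return (r * math.cos(t), r * math.sin(t), z)

def e_theta(r, t, z, h=1e-6):
    a, b = chi(r, t + h, z), chi(r, t - h, z)
    return [(p - q) / (2 * h) for p, q in zip(a, b)]

r, t, z = 1.7, 0.6, 0.2
et = e_theta(r, t, z)
g_tt = sum(c * c for c in et)
print(et, g_tt)   # et ~ (-r sin t, r cos t, 0), g_tt ~ r^2
```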
[Fig. A.3. Coordinate systems: (a) cylindrical $(r, \theta, z)$; (b) spherical $(r, \theta, \phi)$.]
c) Spherical Coordinate System
The spherical coordinate system $(r, \theta, \phi)$ is defined as
\[
x = \chi(r, \theta, \phi),
\]
by the following coordinate transformation (see Fig. A.3 (b)),
\[
x_1 = r\sin\theta\cos\phi, \qquad x_2 = r\sin\theta\sin\phi, \qquad x_3 = r\cos\theta, \qquad r > 0, \quad 0 < \theta < \pi, \quad 0 < \phi < 2\pi,
\]
where $(x_1, x_2, x_3)$ is the Cartesian coordinate system.

The natural bases are denoted by $\{e_r, e_\theta, e_\phi\}$ and $\{e^r, e^\theta, e^\phi\}$. We have
\[
\begin{aligned}
e_r &= \sin\theta\cos\phi\; i_1 + \sin\theta\sin\phi\; i_2 + \cos\theta\; i_3,\\
e_\theta &= r\cos\theta\cos\phi\; i_1 + r\cos\theta\sin\phi\; i_2 - r\sin\theta\; i_3,\\
e_\phi &= -r\sin\theta\sin\phi\; i_1 + r\sin\theta\cos\phi\; i_2,
\end{aligned}
\]
and
\[
e^r = e_r, \qquad e_\theta = r^2\, e^\theta, \qquad e_\phi = r^2\sin^2\theta\; e^\phi.
\]
The matrix representations of the metric tensor have the forms
\[
[g_{ij}] = \begin{bmatrix} 1 & & \\ & r^2 & \\ & & r^2\sin^2\theta \end{bmatrix}, \qquad
[g^{ij}] = \begin{bmatrix} 1 & & \\ & r^{-2} & \\ & & (r\sin\theta)^{-2} \end{bmatrix},
\]
and the Christoffel symbols are
\[
\begin{aligned}
&\Gamma^\theta{}_{r\,\theta} = \Gamma^\theta{}_{\theta\,r} = \Gamma^\phi{}_{r\,\phi} = \Gamma^\phi{}_{\phi\,r} = \frac{1}{r},\\
&\Gamma^r{}_{\theta\,\theta} = -r, \qquad
\Gamma^r{}_{\phi\,\phi} = -r\sin^2\theta,\\
&\Gamma^\phi{}_{\theta\,\phi} = \Gamma^\phi{}_{\phi\,\theta} = \cot\theta, \qquad
\Gamma^\theta{}_{\phi\,\phi} = -\sin\theta\cos\theta, \qquad \text{others} = 0.
\end{aligned}
\]
Moreover, the orthonormal basis for the physical components is
\[
\begin{aligned}
e_{\langle r\rangle} &= \sin\theta\cos\phi\; i_1 + \sin\theta\sin\phi\; i_2 + \cos\theta\; i_3,\\
e_{\langle \theta\rangle} &= \cos\theta\cos\phi\; i_1 + \cos\theta\sin\phi\; i_2 - \sin\theta\; i_3,\\
e_{\langle \phi\rangle} &= -\sin\phi\; i_1 + \cos\phi\; i_2.
\end{aligned}
\]
Remark. More frequently, we would like to express quantities in these coordinate systems in terms of their physical components. A simple way to do this is to derive the expressions first in terms of contravariant or covariant components, and then convert them into physical components using relations like (A.78) and (A.79).
Example A.2.9 Let us calculate the Laplacian of a scalar field $\Phi$ in the spherical coordinate system. We have
\[
\Phi_{,j} = \frac{\partial \Phi}{\partial x^j}, \qquad
\Phi_{,jk} = \frac{\partial^2 \Phi}{\partial x^j\,\partial x^k} - \frac{\partial \Phi}{\partial x^i}\,\Gamma^i{}_{j\,k},
\]
from which we obtain the following covariant components:
\[
\begin{aligned}
\Phi_{,rr} &= \frac{\partial^2 \Phi}{\partial r^2},\\
\Phi_{,\theta\theta} &= \frac{\partial^2 \Phi}{\partial \theta^2} - \frac{\partial \Phi}{\partial r}\,\Gamma^r{}_{\theta\,\theta}
= \frac{\partial^2 \Phi}{\partial \theta^2} + r\,\frac{\partial \Phi}{\partial r},\\
\Phi_{,\phi\phi} &= \frac{\partial^2 \Phi}{\partial \phi^2} - \frac{\partial \Phi}{\partial r}\,\Gamma^r{}_{\phi\,\phi} - \frac{\partial \Phi}{\partial \theta}\,\Gamma^\theta{}_{\phi\,\phi}
= \frac{\partial^2 \Phi}{\partial \phi^2} + r\sin^2\theta\,\frac{\partial \Phi}{\partial r} + \sin\theta\cos\theta\,\frac{\partial \Phi}{\partial \theta}.
\end{aligned}
\]
We have $\Phi_{,rr} = \Phi_{,\langle rr\rangle}$, $\Phi_{,\theta\theta} = r^2\,\Phi_{,\langle\theta\theta\rangle}$, and $\Phi_{,\phi\phi} = r^2\sin^2\theta\;\Phi_{,\langle\phi\phi\rangle}$ in terms of physical components. That is,
\[
\begin{aligned}
\Phi_{,\langle rr\rangle} &= \frac{\partial^2 \Phi}{\partial r^2},\\
\Phi_{,\langle\theta\theta\rangle} &= \frac{1}{r^2}\,\frac{\partial^2 \Phi}{\partial \theta^2} + \frac{1}{r}\,\frac{\partial \Phi}{\partial r},\\
\Phi_{,\langle\phi\phi\rangle} &= \frac{1}{r^2\sin^2\theta}\,\frac{\partial^2 \Phi}{\partial \phi^2} + \frac{1}{r}\,\frac{\partial \Phi}{\partial r} + \frac{\cot\theta}{r^2}\,\frac{\partial \Phi}{\partial \theta}.
\end{aligned}
\]
Therefore, the Laplacian $\nabla^2\Phi$, which is the sum $\Phi_{,\langle rr\rangle} + \Phi_{,\langle\theta\theta\rangle} + \Phi_{,\langle\phi\phi\rangle}$ in physical components, becomes
\[
\nabla^2\Phi = \frac{\partial^2 \Phi}{\partial r^2} + \frac{2}{r}\,\frac{\partial \Phi}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 \Phi}{\partial \theta^2} + \frac{1}{r^2\sin^2\theta}\,\frac{\partial^2 \Phi}{\partial \phi^2} + \frac{\cot\theta}{r^2}\,\frac{\partial \Phi}{\partial \theta}. \qquad \Box
\]
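The spherical Laplacian just derived can be tested numerically on functions whose Laplacian is known in closed form, e.g. $\Phi = r^2$ (for which $\nabla^2\Phi = 6$) and the harmonic function $\Phi = x_1^2 - x_2^2 = r^2\sin^2\theta\cos 2\phi$ (for which $\nabla^2\Phi = 0$). A minimal Python sketch with finite differences (the evaluation point is an arbitrary choice):

```python
import math

# Numerical sketch of the spherical Laplacian formula:
#   lap = P_rr + (2/r) P_r + (1/r^2) P_tt
#         + (1/(r^2 sin^2 t)) P_pp + (cot t / r^2) P_t,
# checked on Phi = r^2 (Laplacian 6) and on the harmonic function
# Phi = x1^2 - x2^2 = r^2 sin^2(t) cos(2p) (Laplacian 0).

def laplacian_spherical(Phi, r, t, p, h=1e-4):
    d2r = (Phi(r + h, t, p) - 2 * Phi(r, t, p) + Phi(r - h, t, p)) / h**2
    dr  = (Phi(r + h, t, p) - Phi(r - h, t, p)) / (2 * h)
    d2t = (Phi(r, t + h, p) - 2 * Phi(r, t, p) + Phi(r, t - h, p)) / h**2
    dt  = (Phi(r, t + h, p) - Phi(r, t - h, p)) / (2 * h)
    d2p = (Phi(r, t, p + h) - 2 * Phi(r, t, p) + Phi(r, t, p - h)) / h**2
    return (d2r + 2 / r * dr + d2t / r**2
            + d2p / (r * math.sin(t))**2 + dt / math.tan(t) / r**2)

r, t, p = 1.3, 0.8, 0.5
lap1 = laplacian_spherical(lambda r, t, p: r * r, r, t, p)
lap2 = laplacian_spherical(
    lambda r, t, p: r * r * math.sin(t)**2 * math.cos(2 * p), r, t, p)
print(lap1, lap2)   # ~6 and ~0
```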
Example A.2.10 We give the physical components of the divergence of a symmetric tensor field $T$ in the following coordinate systems:

a) Cartesian coordinate system $(x, y, z)$:
\[
\begin{aligned}
(\operatorname{div} T)_{\langle x\rangle} &= \frac{\partial T_{\langle xx\rangle}}{\partial x} + \frac{\partial T_{\langle xy\rangle}}{\partial y} + \frac{\partial T_{\langle xz\rangle}}{\partial z},\\
(\operatorname{div} T)_{\langle y\rangle} &= \frac{\partial T_{\langle xy\rangle}}{\partial x} + \frac{\partial T_{\langle yy\rangle}}{\partial y} + \frac{\partial T_{\langle yz\rangle}}{\partial z},\\
(\operatorname{div} T)_{\langle z\rangle} &= \frac{\partial T_{\langle xz\rangle}}{\partial x} + \frac{\partial T_{\langle yz\rangle}}{\partial y} + \frac{\partial T_{\langle zz\rangle}}{\partial z}.
\end{aligned} \tag{A.81}
\]
b) Cylindrical coordinate system $(r, \theta, z)$:
\[
\begin{aligned}
(\operatorname{div} T)_{\langle r\rangle} &= \frac{\partial T_{\langle rr\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle r\theta\rangle}}{\partial \theta} + \frac{\partial T_{\langle rz\rangle}}{\partial z} + \frac{T_{\langle rr\rangle} - T_{\langle\theta\theta\rangle}}{r},\\
(\operatorname{div} T)_{\langle \theta\rangle} &= \frac{\partial T_{\langle r\theta\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle\theta\theta\rangle}}{\partial \theta} + \frac{\partial T_{\langle\theta z\rangle}}{\partial z} + \frac{2}{r}\, T_{\langle r\theta\rangle},\\
(\operatorname{div} T)_{\langle z\rangle} &= \frac{\partial T_{\langle rz\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle\theta z\rangle}}{\partial \theta} + \frac{\partial T_{\langle zz\rangle}}{\partial z} + \frac{1}{r}\, T_{\langle rz\rangle}.
\end{aligned} \tag{A.82}
\]
c) Spherical coordinate system $(r, \theta, \phi)$:
\[
\begin{aligned}
(\operatorname{div} T)_{\langle r\rangle} &= \frac{\partial T_{\langle rr\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle r\theta\rangle}}{\partial \theta} + \frac{1}{r\sin\theta}\,\frac{\partial T_{\langle r\phi\rangle}}{\partial \phi}
+ \frac{1}{r}\bigl(2T_{\langle rr\rangle} - T_{\langle\theta\theta\rangle} - T_{\langle\phi\phi\rangle} + \cot\theta\; T_{\langle r\theta\rangle}\bigr),\\
(\operatorname{div} T)_{\langle \theta\rangle} &= \frac{\partial T_{\langle r\theta\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle\theta\theta\rangle}}{\partial \theta} + \frac{1}{r\sin\theta}\,\frac{\partial T_{\langle\theta\phi\rangle}}{\partial \phi}
+ \frac{1}{r}\bigl(3T_{\langle r\theta\rangle} + \cot\theta\,(T_{\langle\theta\theta\rangle} - T_{\langle\phi\phi\rangle})\bigr),\\
(\operatorname{div} T)_{\langle \phi\rangle} &= \frac{\partial T_{\langle r\phi\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial T_{\langle\theta\phi\rangle}}{\partial \theta} + \frac{1}{r\sin\theta}\,\frac{\partial T_{\langle\phi\phi\rangle}}{\partial \phi}
+ \frac{1}{r}\bigl(3T_{\langle r\phi\rangle} + 2\cot\theta\; T_{\langle\theta\phi\rangle}\bigr).
\end{aligned} \tag{A.83}
\]
□
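One special case of (A.82) can be verified against a direct Cartesian computation. For the plane tensor field $T = r^2\, e_{\langle r\rangle} \otimes e_{\langle r\rangle}$, the only nonzero cylindrical physical component is $T_{\langle rr\rangle} = r^2$, and (A.82) gives $(\operatorname{div} T)_{\langle r\rangle} = \partial(r^2)/\partial r + r^2/r = 3r$; in Cartesian coordinates the same field has components $T_{ij} = x_i x_j$ ($i, j = 1, 2$), whose divergence is $3x_i$, i.e. $3r\, e_{\langle r\rangle}$. A minimal Python sketch of this comparison (the evaluation point is an arbitrary choice):

```python
import math

# Check of (A.82) on T = r^2 e<r> (x) e<r>: the only nonzero cylindrical
# physical component is T<rr> = r^2.  In Cartesian coordinates the same
# field is T_ij = x_i x_j (i, j = 1, 2), so (div T)_i = d(x_i x_j)/dx_j
# = 3 x_i, i.e. div T = 3r e<r>.

def divT_cartesian(x, y, h=1e-6):
    # (div T)_i = dT_i1/dx + dT_i2/dy, with T_ij = x_i x_j
    T = lambda x, y: [[x * x, x * y], [x * y, y * y]]
    return [(T(x + h, y)[i][0] - T(x - h, y)[i][0]) / (2 * h)
            + (T(x, y + h)[i][1] - T(x, y - h)[i][1]) / (2 * h)
            for i in range(2)]

x, y = 1.2, 0.9
r = math.hypot(x, y)
dT = divT_cartesian(x, y)
radial = (dT[0] * x + dT[1] * y) / r     # component along e<r>

# formula (A.82) with T<rr> = r^2, all other components zero:
# (div T)<r> = d(r^2)/dr + (T<rr> - T<tt>)/r = 2r + r = 3r
formula = 2 * r + r
print(radial, formula)   # both ~3r
```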
Exercise A.2.8 Let $u$ be a vector field. Show that
1) in the cylindrical coordinate system,
\[
\operatorname{div} u = \frac{\partial u_{\langle r\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial u_{\langle\theta\rangle}}{\partial \theta} + \frac{\partial u_{\langle z\rangle}}{\partial z} + \frac{1}{r}\, u_{\langle r\rangle};
\]
2) in the spherical coordinate system,
\[
\operatorname{div} u = \frac{\partial u_{\langle r\rangle}}{\partial r} + \frac{1}{r}\,\frac{\partial u_{\langle\theta\rangle}}{\partial \theta} + \frac{1}{r\sin\theta}\,\frac{\partial u_{\langle\phi\rangle}}{\partial \phi} + \frac{2}{r}\, u_{\langle r\rangle} + \frac{\cot\theta}{r}\, u_{\langle\theta\rangle}.
\]
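The spherical formula of part 2) can be spot-checked numerically. For the purely radial field $u = r^2\, e_{\langle r\rangle}$, i.e. $u_{\langle r\rangle} = r^2$, $u_{\langle\theta\rangle} = u_{\langle\phi\rangle} = 0$, it gives $\operatorname{div} u = \partial(r^2)/\partial r + (2/r)\,r^2 = 4r$; in Cartesian coordinates the same field is $u_i = r\,x_i$, whose divergence is also $4r$. A minimal Python sketch (the evaluation point is an arbitrary choice):

```python
import math

# Check of the spherical divergence formula on u = r^2 e<r>
# (u<r> = r^2, u<theta> = u<phi> = 0): the formula gives
# div u = d(r^2)/dr + (2/r) r^2 = 4r.  In Cartesian coordinates the
# same field is u_i = r x_i, whose divergence (by finite differences
# below) is also 4r.

def div_cartesian(x, y, z, h=1e-6):
    u = lambda x, y, z: [math.sqrt(x * x + y * y + z * z) * c for c in (x, y, z)]
    return sum((u(*[c + h * (j == k) for j, c in enumerate((x, y, z))])[k]
                - u(*[c - h * (j == k) for j, c in enumerate((x, y, z))])[k])
               / (2 * h) for k in range(3))

x, y, z = 0.6, -0.8, 1.1
r = math.sqrt(x * x + y * y + z * z)
print(div_cartesian(x, y, z), 4 * r)   # the two values agree
```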
Exercise A.2.9 Let $u$ be a vector field and $E = \frac{1}{2}(\nabla u + \nabla u^T)$. Express $E$ in cylindrical and spherical coordinate systems,
1) relative to the natural basis,
2) in terms of physical components.
Exercise A.2.10 Let $T$ be a symmetric tensor field. Compute $\operatorname{div} T$ in cylindrical and spherical coordinate systems,
1) relative to the natural basis,
2) in terms of physical components (verify (A.82) and (A.83)).
Exercise A.2.11 Let $\Phi :$ IR $\to \mathcal{E}$ be a curve. Suppose that $\{e_i(x)\}$ is the natural basis and $\phi^i(t)$ are the coordinates of $\Phi(t)$ in the coordinate system $(x^i)$. Show that
1) $\dot{\Phi}(t) = \dot{\phi}^i(t)\; e_i(\phi(t))$,
2) $\ddot{\Phi}(t) = \bigl(\ddot{\phi}^i(t) + \dot{\phi}^j(t)\,\dot{\phi}^k(t)\,\Gamma^i{}_{j\,k}(\phi(t))\bigr)\, e_i(\phi(t))$.