Chapter 1

Electromagnetic theory and optics

1.1 Introduction

Optical phenomena are of an immense diversity. Yet, amazingly, the ex-

planation of all these can be traced back to a very few basic principles.

This is not to say that, once these basic principles are known, one can ar-

rive at a precise explanation of each and every optical phenomenon or at

a precise solution for each and every problem in optics. In reality, optical

phenomena can be grouped into classes, where each class has certain characteristic features in common, and an adequate explanation of each class of phenomena turns out to be a challenge in itself,

requiring appropriate approximation schemes. But whatever approxima-

tions one has to make, these will be found to involve no principles more

fundamental than, or independent of, the basic ones.

What, then, are these basic principles of optics? As far as present day

knowledge goes, the most basic principle underlying the explanation of

optical phenomena, as indeed of all physical phenomena, is to be found

in quantum theory. However, a more useful and concrete way of putting

things would be to say that the theoretical basis of optics is provided

by electromagnetic theory which, in turn, is based entirely on Maxwell’s

equations.

The question then arises as to whether Maxwell’s equations and electro-

magnetic theory are to be looked at from the point of view of classical

physics or of quantum theory.

Of course, one knows that these two points of view are not independent of

each other. In a sense, classical explanations are approximations to the

more complete, quantum theoretic descriptions. But once again, these

approximations are, in a sense, necessary ingredients in the explanation

of a large body of observed phenomena. In other words, while a great

deal is known about the way classical physics is related to quantum the-

ory and while it can be stated that the latter is a more fundamental theory

of nature, it still makes sense to say that the classical and the quantum

theories are two modes of describing and explaining observed phenom-

ena, valid in their own respective realms, where the former relates to the

latter in a certain limiting sense.

This has bearing on the question I have posed above, the answer to which

one may state as follows: while the quantum theory of the electromagnetic

field provides the ultimate basis of optics, an adequate explanation of a

large body of optical phenomena can be arrived at from the classical elec-

tromagnetic theory without overt reference to the quantum theory. There do

remain, however, optical phenomena that cannot be adequately explained

without invoking quantum principles.

Optical phenomena are related to the behaviour of electromagnetic fields

where the typical frequencies of variation of the field components lie

within a certain range constituting the spectrum of visible light, though

the theoretical methods and principles of optics are relevant even beyond

this range.

With this in mind, I propose in this book to have a look at optics with

the classical electromagnetic theory as its theoretical basis. At the same

time, I propose to have a brief look at quantum optics as well, where

optical phenomena are linked to quantum theory of the electromagnetic

field, but this will be a more sketchy affair in this book, meant only to indicate how quantum principles can be relevant in optics.

The approach of explaining optical phenomena on the basis of classical

electromagnetic theory is sometimes referred to as ‘classical optics’ so

as to distinguish it from quantum optics. But the term classical optics

is more commonly employed now to refer to a certain traditional way of

looking at optics and to distinguish this approach from what is known

as ‘modern optics’. The latter includes areas such as Fourier optics, sta-

tistical optics, nonlinear optics, and, above all, quantum optics. Not all

of these involve the quantum theory, some being mostly based on clas-

sical electromagnetic theory alone. Thus, the term classical optics has

two meanings attached to it - one in the sense of a certain traditional

approach in optics, and the other in the sense of an approach based on

the classical electromagnetic theory.

Classical electromagnetic theory is a subject of vast dimensions. There

is no way I can even sketchily summarise here the principal results of

this theory. Instead, I will simply start from Maxwell’s equations that

constitute the foundations of the theory, and then state a number of

basic results of relevance in optics. Fortunately, for most of classical

optics one need not delve deeper into electromagnetic theory. I will not

present derivations of the results of electromagnetic theory we will be

needing in this book, for which you will have to look up standard texts in

the subject.

1.2 Maxwell’s equations in material media and in free space

1.2.1 Electromagnetic field variables

The basic idea underlying electromagnetic theory is that space is permeated with electric and magnetic fields whose spatial and temporal variations are coupled to one another and are related to source densities, i.e., distributions of charges and currents.

The electromagnetic field, moreover, is a dynamical system in itself, en-

dowed with energy, momentum, and angular momentum, and capable

of exchanging these with bodies carrying charge and current. The varia-

tions of the electric and magnetic field intensities are described by a set of

partial differential equations - the Maxwell equations (commonly referred

to as the field equations in the context of electromagnetic theory). As I

have already mentioned, the behaviour of the electromagnetic field as a

dynamical system can be described from either the classical or the quan-

tum theoretic point of view. The quantum point of view is more subtle

compared to the classical one, and we will have a taste of it when I talk

of quantum optics later in this book.

Maxwell’s equations for a material medium involve four electromagnetic

field variables, namely the electric intensity (E), electric displacement (D),

magnetic intensity or flux density (B), and magnetic field strength (H), each of these being a function of the space and time variables r and t. Not

all of these field variables are independent since the electric vectors D

and E are related to each other through a set of constitutive equations

relating to the material properties of the medium. Similarly, the mag-

netic variables H and B are related through another set of constitutive

equations.

The naming of the field variables.

The field vectors do not have universally accepted names attached to them. Thus, E is

referred to variously as the electric field strength, electric field intensity (or electric in-

tensity, in brief) or, simply, the electric vector. A greater degree of non-uniformity affects

the naming of B and H. The former is often referred to as the magnetic flux density or

the magnetic induction, while the latter is commonly described as the magnetic field

strength. In this book, I will mostly refer to E and B as the electric intensity and the

magnetic intensity respectively. The term ‘intensity’ has another use in electromagnetic

theory, namely, in describing the rate of flow of electromagnetic field energy per unit

area oriented perpendicularly to the direction of energy flow. However, it will always

be possible to distinguish our use of the term ‘intensity’ in connection with the field

variables E and B from this other usage of the term by referring to the context. The vec-

tors D and H will be named the electric displacement and the magnetic field strength

respectively. These are, to a greater degree, commonly accepted names in the literature.

At times we will use more non-specific terms like ‘field vectors’ or ‘field variables’ to

describe one or more of these vectors or of their components, especially when some

common features of these vectors are being referred to. Once again, the meaning will

have to be read from the context.

The naming of the field variables and their space-time variations in optics.

Finally, in optics, certain characteristic features of the space-time variation of the field

vectors or of their components are often referred to by terms like the ‘optical field’,

‘optical disturbance’ or ‘optical signal’. Thus, the time variation of any of the field com-

ponents at a point or at various points in a given region of space is said to constitute an

optical disturbance in that region. The time variation of the field variables at any given

point in space is at times referred to as the optical signal at that point, and one can then

talk of the propagation of the optical signal from point to point, especially in the context

of information being carried by the time variation of the field variables.

In optics, it often suffices to consider the variations of a scalar variable rather than those

of the field vectors, where the scalar variable may stand for any of the components of a

field vector, or even for a fictitious variable simulating the variations of the field vectors.

For instance, such a scalar variable may be invoked to explain the variation of intensity

at various points in some given region of space, where a more detailed description in

terms of the field vectors themselves may involve unnecessary complexities without any

added benefits in terms of conceptual clarity.

Such scalar fields will prove to be useful in explaining interference and diffraction phe-

nomena, in Fourier optics, and in describing a number of coherence characteristics of

optical disturbances. The space-time variations of such a scalar variable are also re-

ferred to as an optical disturbance and the scalar variable itself is commonly termed a

field variable. A vector or scalar field variable (identified from the context) will also be

termed a wave function since such a variable commonly satisfies a wave equation as in

acoustics.

Incidentally, the temporal variation of a wave function at any given point in space is

referred to as its waveform at that point. It is often useful to think of a waveform as a

graph of the wave function plotted against time.

1.2.2 Maxwell’s equations

Maxwell’s equations - four in number - relate the space-time dependence

of the field variables to the source distributions, namely the charge den-

sity function ρ(r, t) and the current density function j(r, t):

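$$\mathrm{div}\,\mathbf{D} = \rho, \tag{1.1a}$$

$$\mathrm{curl}\,\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \tag{1.1b}$$

$$\mathrm{div}\,\mathbf{B} = 0, \tag{1.1c}$$

$$\mathrm{curl}\,\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}. \tag{1.1d}$$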

Equations (1.1a) and (1.1d) imply the equation of continuity,

$$\mathrm{div}\,\mathbf{j} + \frac{\partial\rho}{\partial t} = 0. \tag{1.1e}$$

This equation constitutes the mathematical statement of the principle of

conservation of charge.
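The step from the first and fourth Maxwell equations to the equation of continuity can be made explicit. The following is a short sketch that uses only the vector identity div curl = 0, a standard result that is assumed here rather than taken from the text:

```latex
\begin{align*}
\mathrm{div}\,(\mathrm{curl}\,\mathbf{H})
  &= \mathrm{div}\,\mathbf{j} + \frac{\partial}{\partial t}\,\mathrm{div}\,\mathbf{D}
  && \text{(divergence of the equation for } \mathrm{curl}\,\mathbf{H}\text{)} \\
0 &= \mathrm{div}\,\mathbf{j} + \frac{\partial\rho}{\partial t}
  && \text{(using } \mathrm{div}\,\mathrm{curl} \equiv 0 \text{ and } \mathrm{div}\,\mathbf{D} = \rho\text{)}
\end{align*}
```

The space and time derivatives have been interchanged in the first line, which presupposes fields smooth enough for this to be allowed.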

In the above equations, ρ and j are to be interpreted as the free charge

and current densities setting up the electromagnetic field under consider-

ation, where the bound charges and currents, associated with the dielec-

tric polarization and magnetization of the medium under consideration,

are excluded.

1.2.3 Material media and the constitutive relations

1.2.3.1 Linear media

The constitutive equations are phenomenological relations depending on

the type of the medium under consideration. There exist approximate

microscopic theories of these relations for some types of media. The fol-

lowing relations hold for what are known as linear media:

$$\mathbf{D} = [\varepsilon]\,\mathbf{E}, \tag{1.1f}$$

$$\mathbf{B} = [\mu]\,\mathbf{H}. \tag{1.1g}$$

In this context, one has to distinguish between isotropic and anisotropic

media. For an isotropic medium, the symbols [ε] and [µ] in the above constitutive equations stand for scalar constants (to be denoted by ε and µ respectively) that may, in general, be frequency dependent (see below). For an anisotropic medium, on the other hand, the symbols [ε] and [µ] in the constitutive relations stand for second rank symmetric tensors represented, in any given Cartesian co-ordinate system, by symmetric matrices with elements, say, εij and µij respectively (i, j = 1, 2, 3).

Tensors and tensor fields.

For a given r and given t, a vector like E(r, t) is an element of a real three-dimensional linear vector space which we denote as, say, R(3). A tensor of rank two is then an element of a nine-dimensional vector space, namely, the direct product R(3) × R(3). If n_1, n_2, n_3 constitute an orthonormal basis in R(3), then an orthonormal basis in R(3) × R(3) will be made up of the objects n_i n_j (i, j = 1, 2, 3), and a tensor of rank two can be expressed as a linear combination of the form Σ_{i,j} C_ij n_i n_j. Thus, with reference to this basis, the tensor under consideration is completely described by the 3 × 3 matrix with elements C_ij. The matrix (and also the tensor) is termed symmetric if C_ij = C_ji (i, j = 1, 2, 3). The matrix is said to be positive definite if all its eigenvalues are positive.

Now consider any of the above field vectors (say, E(r, t)) at a given time instant, but at

all possible points r. This means a vector associated with every point in some specified

region in space. The set of all these vectors is termed a vector field in the region un-

der consideration. The vector field is, moreover, time dependent since the field vector

depends, in general, on t. Similarly, one can have a tensor field like, for instance, the

permittivity tensor [ε] or the permeability tensor [µ] in an inhomogeneous anisotropic

medium in which the electric and magnetic material properties vary from point to point

in addition to being direction dependent. While these can, in general, even be time

dependent tensor fields, we will, in this book, consider media with time independent

properties alone.

Thus, in terms of the Cartesian components, the relations (1.1f) and

(1.1g) can be written as

$$D_i = \sum_j \varepsilon_{ij} E_j, \tag{1.2a}$$

$$B_i = \sum_j \mu_{ij} H_j. \tag{1.2b}$$
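A small numerical illustration may help to show what the component form of the electric constitutive relation implies. The sketch below applies it to a hypothetical anisotropic medium; the tensor entries, the field values, and the helper names (`is_symmetric`, `displacement`, `cross`) are illustrative assumptions and are not taken from the text.

```python
# Relative permittivity tensor of a hypothetical anisotropic medium
# (illustrative numbers only); it is symmetric, as the text requires.
EPS0 = 8.85e-12  # permittivity of free space in SI units

eps_r = [
    [2.0, 0.5, 0.0],
    [0.5, 3.0, 0.0],
    [0.0, 0.0, 4.0],
]


def is_symmetric(m):
    """Return True if the 3 x 3 matrix m satisfies m[i][j] == m[j][i]."""
    return all(m[i][j] == m[j][i] for i in range(3) for j in range(3))


def displacement(eps_rel, e_field):
    """Apply D_i = sum_j eps_ij E_j, with eps_ij = EPS0 * eps_rel[i][j]."""
    return [EPS0 * sum(eps_rel[i][j] * e_field[j] for j in range(3))
            for i in range(3)]


def cross(a, b):
    """Vector product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


E = [1.0, 0.0, 0.0]          # electric intensity along the x-axis
D = displacement(eps_r, E)   # acquires a y-component through eps_r[1][0]

# A non-zero vector product means D is not parallel to E.
not_parallel = any(abs(c) > 0.0 for c in cross(E, D))
```

With these numbers D has components along both x and y although E points along x alone, which is the characteristic feature of an anisotropic medium; for a scalar permittivity the vector product would vanish.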

As mentioned above, the electric permittivity and magnetic permeability

tensors ([ε], [µ]) reduce, in the case of an isotropic medium, to scalars

(corresponding to constant multiples of the identity matrix) and the above

relations simplify to

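$$\mathbf{D} = \varepsilon\mathbf{E}, \qquad D_i = \varepsilon E_i \quad (i = 1, 2, 3), \tag{1.3a}$$

$$\mathbf{B} = \mu\mathbf{H}, \qquad B_i = \mu H_i \quad (i = 1, 2, 3). \tag{1.3b}$$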

It is not unusual for an optically anisotropic medium, with a permittivity tensor [ε], to be characterized by a scalar permeability µ (≈ µ0, the permeability of free space). In this book I use the SI system of units, in which the permittivity and permeability of free space are, respectively, ε0 = 8.85 × 10^−12 C^2·N^−1·m^−2 and µ0 = 4π × 10^−7 N·A^−2.

In general, for linear media with time independent properties, the following situations may be encountered: (a) isotropic homogeneous media, for which ε and µ are scalar constants independent of r, (b) isotropic inhomogeneous media, for which ε and µ are scalars but vary from point to point, (c) anisotropic homogeneous media, where [ε] and [µ] are tensors independent of the position vector r, and (d) anisotropic inhomogeneous media, in which [ε] and [µ] are tensor fields. As mentioned above, in most situations relating to optics one can, for the sake of simplicity, take [µ] to be a scalar constant, µ ≈ µ0.

However, in reality, the relation between E and D is of a more complex

nature (that between B and H may, in principle, be similarly complex),

even for a linear, homogeneous, isotropic medium with time independent

properties, than is apparent from equation (1.3a) since ε is, in general, a frequency-dependent quantity. A time-dependent field vector can be analyzed into its Fourier components, each component corresponding to some

specific angular frequency ω. A relation like (1.3a) can be used only in

situations where this frequency dependence of the electric (as also mag-

netic) properties of the medium under consideration can be ignored, i.e.,

when dispersion effects are not important. In this book, we will generally

assume the media to be non-dispersive, taking into account dispersion

effects only in certain specific contexts (see sec. 1.15).

One more constitutive equation holds for a conducting medium, which

reads

j = [σ]E, (1.4)

where, in general, the conductivity [σ] is once again a second rank sym-

metric tensor which, for numerous situations of practical relevance, re-

duces to a scalar. The conductivity may also be frequency dependent, as

will be discussed in brief in sec. 1.15.2.7.

1.2.3.2 Nonlinear media

Finally, a great variety of optical phenomena arise in nonlinear media,

where the components of D depend non-linearly on those of E. Such

nonlinear phenomena will be considered in chapter 9.

In general, the definition of D involves, in addition to E, a second vector

P, the polarization in the medium under consideration. The setting up of

an electric field induces a dipole moment in every small volume element of

the medium, the dipole moment per unit volume around any given point

being the polarization at that point. The electric displacement vector is

then defined as

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}. \tag{1.5a}$$

In the case of a linear isotropic medium, the polarization occurs in proportion to the electric intensity:

$$\mathbf{P} = \varepsilon_0\chi_E\mathbf{E}, \tag{1.5b}$$

where the constant of proportionality χE is referred to as the dielectric

susceptibility of the medium. The relation (1.3a) then follows with the

permittivity expressed in terms of the susceptibility as

$$\varepsilon = \varepsilon_0(1 + \chi_E) = \varepsilon_0\varepsilon_r, \tag{1.5c}$$

where the constant εr (= 1 + χE) is referred to as the relative permittivity of

the medium. In the case of a linear anisotropic medium, the susceptibility

is in the nature of a tensor, in terms of which the permittivity tensor is

defined in an analogous manner.

For a nonlinear medium, on the other hand, the polarization P depends

on the electric intensity E in a nonlinear manner (refer to sections 9.2.3, 9.2.4),

giving rise to novel effects in optics.

The general definition of the magnetic vector H in terms of B likewise

involves a third vector M, the magnetization, which is the magnetic dipole

moment per unit volume induced in the medium under consideration

because of the magnetic field set up in it,

$$\mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M}. \tag{1.6a}$$

For a linear isotropic medium, the magnetization develops in proportion

to H (or, equivalently, to B) as

$$\mathbf{M} = \chi_M\mathbf{H}, \tag{1.6b}$$

where χM is the magnetic susceptibility of the medium. The relation (1.3b)

then follows with the permeability defined in terms of the magnetic sus-

ceptibility as

$$\mu = \mu_0(1 + \chi_M) = \mu_0\mu_r, \tag{1.6c}$$

where µr(= 1 + χM) is the relative permeability.

In this book we will not have occasion to refer to magnetic anisotropy or

magnetic nonlinearity. We will, moreover, assume µr ≈ 1, i.e., µ ≈ µ0,

which happens to be true for most optical media of interest. The relation

between B and H then reduces to

$$\mathbf{B} = \mu_0\mathbf{H}, \tag{1.6d}$$

which is the same as that for free space (the second relation in (1.10)).

1.2.4 Integral form of Maxwell’s equations

In electromagnetic theory and optics, one often encounters situations in-

volving interfaces between different media such that there occurs a sharp

change in the field vectors across these surfaces. A simple and conve-

nient description of such situations can then be given in terms of field

vectors changing discontinuously across such a surface. Discontinuous

changes of field vectors in time and space may have to be considered in

other situations as well such as, for instance, in describing the space-

time behaviour of the fields produced by sources that may be imagined

to have been switched on all of a sudden at a given instant of time within

a finite region of space, possibly having sharply defined boundaries.

A discontinuity in the field variables implies indeterminate values for

their derivatives which means that, strictly speaking, the Maxwell equa-

tions in the form of differential equations as written above, do not apply

to these points of discontinuity. One can then employ another version of

these equations, namely the ones in the integral form. The integral form

of Maxwell’s equations admits of idealized distributions of charges and

currents, namely, surface charges and currents, to which one can relate

the discontinuities in the field variables.

Surface charges and currents can be formally included in the differential version of

Maxwell’s equations by representing them in terms of singular delta functions. However,

strictly speaking, the delta functions are meaningful only within integrals.

We discount, for the time being, the possibility of the field variables being

discontinuous as a function of time, and consider only their spatial dis-

continuities. Let V denote any given region of space bounded by a closed

surface S, and Σ be a surface bounded by a closed path Γ. Then the

equations (1.1a) - (1.1d) can be expressed in the integral form

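$$\int_S \mathbf{D}\cdot\mathbf{n}\,ds = Q, \tag{1.7a}$$

$$\int_\Gamma \mathbf{E}\cdot\mathbf{t}\,dl = -\frac{\partial\Phi}{\partial t}, \tag{1.7b}$$

$$\int_S \mathbf{B}\cdot\mathbf{n}\,ds = 0, \tag{1.7c}$$

$$\int_\Gamma \mathbf{H}\cdot\mathbf{t}\,dl = I + \frac{\partial}{\partial t}\int_\Sigma \mathbf{D}\cdot\mathbf{m}\,ds. \tag{1.7d}$$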

In these equations, Q stands for the free charge within the volume V, I

for the free current through the surface Σ, and Φ for the magnetic flux

through Σ, while n, m, and t denote, respectively, the unit outward drawn

normal at any given point of S, the unit normal at any given point of Σ re-

lated to the sense of traversal of the path Γ (in defining the integrals along

Γ) by the right handed rule, and the unit tangent vector at any given point

of Γ oriented along a chosen sense of traversal of the path. Expressed in

the above form, Q and I include surface charges and currents, if any,

acting as sources for the fields.
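The first of the integral relations can be checked numerically in a simple case. The sketch below assumes a point charge in free space at the centre of a cube, with D given by the Coulomb form q r/(4π|r|³); this field, the cube geometry, the midpoint-rule integration, and the function name `flux_through_cube` are all assumptions introduced for illustration.

```python
import math


def flux_through_cube(q, half_side=1.0, n=200):
    """Midpoint-rule estimate of the flux of D through a cube centred on q.

    D = q r / (4 pi |r|^3) is assumed (point charge at the origin).
    By symmetry all six faces carry the same flux, so one face is
    integrated and the result is multiplied by six.
    """
    a = half_side
    h = 2.0 * a / n                      # width of one integration cell
    face = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        for j in range(n):
            y = -a + (j + 0.5) * h
            r3 = (x * x + y * y + a * a) ** 1.5
            # D . n on the face z = a, with outward normal n = (0, 0, 1)
            face += q * a / (4.0 * math.pi * r3) * h * h
    return 6.0 * face


Q = 1.0e-9                               # enclosed free charge in coulombs
flux = flux_through_cube(Q)              # should be close to Q
```

The estimate agrees with the enclosed charge to within the discretization error of the midpoint rule, and the agreement does not depend on the size of the cube, as the integral relation requires.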

More generally, one can express Maxwell’s equations in the integral form

while taking into account the possibility of discontinuities of the field

variables as functions of time as well. The integrals are then taken over

four dimensional regions of space-time and related to three dimensional

‘surface’ integrals over the boundaries of these four dimensional regions.

1.2.5 Boundary conditions across a surface

The integral formulation of the Maxwell equations as stated above leads to

a set of boundary conditions for the field variables across given surfaces

in space. In the presence of surface charges and currents, the boundary

conditions involve the discontinuities of the field components across the

relevant surfaces.

Referring to a surface Σ, and using the suffixes ‘1’ and ‘2’ to refer to the

regions on the two sides of the surface, the boundary conditions can be

expressed in the form

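$$(\mathbf{D}_2 - \mathbf{D}_1)\cdot\mathbf{n} = \sigma, \qquad \mathbf{E}_{2t} - \mathbf{E}_{1t} = 0, \tag{1.8a}$$

$$(\mathbf{B}_2 - \mathbf{B}_1)\cdot\mathbf{n} = 0, \qquad \mathbf{H}_{2t} - \mathbf{H}_{1t} = \mathbf{K}. \tag{1.8b}$$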

In these equations, σ stands for the free surface charge density at any

given point on Σ, and K for the free surface current density, n stands

for the unit normal on Σ at the point under consideration, directed from

the region ‘1’ into region ‘2’, while the suffix ‘t’ is used to indicate the

tangential component (along the surface Σ) of the respective vectors. Ex-

pressed in words, the above equations tell us that the normal component

of the magnetic intensity and the tangential component of the electric in-

tensity are continuous across the surface, while the normal component

of the electric displacement vector and the tangential component of the

magnetic field strength may possess discontinuities, the change in these

quantities across the surface being related to the free surface charge den-

sity and the free surface current density respectively.
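As a concrete use of the two electric boundary conditions, the sketch below computes the electric intensity just across a plane interface between two isotropic dielectrics. The function name `field_across_interface`, the choice of media, and the field values are hypothetical; only the continuity of the tangential part of E and the jump in the normal part of D are taken from the text.

```python
EPS0 = 8.85e-12  # permittivity of free space in SI units


def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def field_across_interface(e1, n, eps1, eps2, sigma=0.0):
    """Return E on side 2 of an interface, given E on side 1.

    n      : unit normal pointing from region 1 into region 2
    eps1/2 : scalar permittivities of the two isotropic media
    sigma  : free surface charge density on the interface
    """
    e1_n = dot(e1, n)                                # normal component of E1
    e1_t = [x - e1_n * nx for x, nx in zip(e1, n)]   # tangential part of E1
    d2_n = eps1 * e1_n + sigma                       # (D2 - D1) . n = sigma
    e2_n = d2_n / eps2                               # D2 = eps2 E2 on side 2
    # The tangential part of E is continuous, so E2 = E1t + E2n n.
    return [t + e2_n * nx for t, nx in zip(e1_t, n)]


n = [0.0, 0.0, 1.0]                  # interface is the plane z = 0
E1 = [1.0, 0.0, 1.0]                 # field on side 1, in V/m
E2 = field_across_interface(E1, n, eps1=EPS0, eps2=2.0 * EPS0)
```

With no surface charge and a permittivity twice as large on side 2, the normal component of E is halved while the tangential component is unchanged, so the field direction bends away from the normal on crossing the interface.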

1.2.6 The electromagnetic field in free space

Maxwell’s equations in free space describe the space and time variations

of the field variables in a region where there is no material medium nor

any source charges or currents:

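$$\mathrm{div}\,\mathbf{E} = 0, \tag{1.9a}$$

$$\mathrm{curl}\,\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \tag{1.9b}$$

$$\mathrm{div}\,\mathbf{B} = 0, \tag{1.9c}$$

$$\mathrm{curl}\,\mathbf{B} = \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t}. \tag{1.9d}$$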

An electromagnetic field set up in air is described, to a good degree of

approximation, by these free space equations since the relative permittivity (εr ≡ ε/ε0) and relative permeability (µr ≡ µ/µ0) of air are both nearly unity. At times one uses the free space equations with source terms introduced so as to describe the effect of charges and currents set up in

vacuum or in air. These will then look like the equations (1.1a)- (1.1d)

with equations (1.1f) and (1.1g) replaced with

$$\mathbf{D} = \varepsilon_0\mathbf{E}, \qquad \mathbf{H} = \frac{1}{\mu_0}\mathbf{B}. \tag{1.10}$$

1.2.7 Microscopic and macroscopic variables for a material medium

A material medium can be looked upon as microscopic charges and cur-

rents, of atomic origin, distributed in free space. Apart from these atomic

charges and currents one can have charge and current sources of ‘ex-

ternal’ origin in the medium - external in the sense of not being tied up

inside the atomic constituents.

Viewed this way, one can think of the fields produced in vacuum by

the bound (atomic) and free (external) microscopic charges and currents,

where the charge and current densities vary sharply over atomic dimen-

sions in space and over extremely small time intervals, causing the re-

sulting fields to be characterized by similar sharp variations in space and

time. Such variations, however, are not recorded by the measuring instruments used in macroscopic measurements, which measure only fields

averaged over length and time intervals large compared to the typical

microscopic scales. When the microscopic charge and current densities

are also similarly averaged, the microscopic Maxwell’s equations, i.e., the

ones written in terms of the fluctuating vacuum fields produced by the

microscopic charges and currents, lead to the Maxwell equations for the

material medium (i.e., equations (1.1a)- (1.1d)) under consideration, fea-

turing only the averaged field variables and the averaged source densities.

On averaging the microscopic charge densities around any given point of

the medium, one obtains an expression of the form

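$$\rho_{\mathrm{av}} = (\rho_{\mathrm{free}})_{\mathrm{av}} - \mathrm{div}\,\mathbf{P}, \tag{1.11a}$$

while a similar averaging of the microscopic current densities gives

$$\mathbf{j}_{\mathrm{av}} = (\mathbf{j}_{\mathrm{free}})_{\mathrm{av}} + \frac{\partial\mathbf{P}}{\partial t} + \mathrm{curl}\,\mathbf{M}. \tag{1.11b}$$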

In these equations, P and M stand for the electric polarization and the

magnetization vectors at the point under consideration defined, respec-

tively, as the macroscopic electric and magnetic dipole moments per unit

volume. On rearranging terms in the averaged vacuum equations, writ-

ing (ρfree)av and (jfree)av as ρ and j, and defining the field variables D and H

as

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}, \qquad \mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M}, \tag{1.12}$$

there results the set of equations (1.1a)- (1.1d). The constitutive rela-

tions (1.3a), (1.3b) (or, more generally, (1.2a), (1.2b)) then express a set

of phenomenological linear relations between P and E, on the one hand,

and M and H on the other:

$$\mathbf{P} = \varepsilon_0\chi_E\mathbf{E}, \qquad \mathbf{M} = \chi_M\mathbf{H} \quad \text{(isotropic medium)}. \tag{1.13}$$

In these relations, χE and χM stand for the electric and magnetic sus-

ceptibilities of the medium, related to the permittivity and permeability

as

$$\varepsilon = \varepsilon_0(1 + \chi_E), \qquad \mu = \mu_0(1 + \chi_M). \tag{1.14}$$

Finally, the phenomenological constants

$$\varepsilon_r = 1 + \chi_E, \qquad \mu_r = 1 + \chi_M, \tag{1.15a}$$

the relative permittivity and the relative permeability of the medium, are often used instead of χE and χM, being related to ε and µ as

$$\varepsilon = \varepsilon_r\varepsilon_0, \qquad \mu = \mu_r\mu_0. \tag{1.15b}$$
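The rearrangement of terms that leads from the averaged vacuum equations to the equations for the material medium can be sketched as follows. It is assumed here that the averaged vacuum equations are the free space equations with the averaged densities as source terms, that E and B denote the averaged fields, and that the averaged free densities are written as ρ and j:

```latex
\begin{align*}
\varepsilon_0\,\mathrm{div}\,\mathbf{E} &= \rho_{\mathrm{av}} = \rho - \mathrm{div}\,\mathbf{P}
  &&\Rightarrow\quad \mathrm{div}\,(\varepsilon_0\mathbf{E} + \mathbf{P}) = \rho, \\
\frac{1}{\mu_0}\,\mathrm{curl}\,\mathbf{B}
  &= \mathbf{j}_{\mathrm{av}} + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}
   = \mathbf{j} + \frac{\partial\mathbf{P}}{\partial t} + \mathrm{curl}\,\mathbf{M}
     + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}
  &&\Rightarrow\quad \mathrm{curl}\left(\frac{1}{\mu_0}\mathbf{B} - \mathbf{M}\right)
     = \mathbf{j} + \frac{\partial}{\partial t}\,(\varepsilon_0\mathbf{E} + \mathbf{P}).
\end{align*}
```

Identifying the bracketed combinations with D and H reproduces the two Maxwell equations that contain sources; the two source-free equations are unaffected by the averaging.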

1.3 Electromagnetic potentials

An alternative, and often more convenient, way of writing Maxwell’s equa-

tions is the one making use of electromagnetic potentials instead of the

field vectors. To see how this is done, let us consider a linear homogeneous isotropic dielectric with material constants ε and µ.

The equation (1.1c) is identically satisfied by introducing a vector potential

A, in terms of which the magnetic intensity B is given by

B = curl A. (1.16a)

Moreover, the equation (1.1b) is also identically satisfied by introducing a

scalar potential φ, and writing the electric intensity E as

$$\mathbf{E} = -\mathrm{grad}\,\phi - \frac{\partial\mathbf{A}}{\partial t}. \tag{1.16b}$$

The remaining two of the Maxwell equations, eq. (1.1a) and eq. (1.1d) can

then be expressed in terms of these two potentials which involve four

scalar variables, in the place of the six scalar components of E and B, in

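addition to the material constants:

$$\nabla^2\phi + \frac{\partial}{\partial t}(\mathrm{div}\,\mathbf{A}) = -\frac{\rho}{\varepsilon}, \tag{1.17a}$$

$$\nabla^2\mathbf{A} - \varepsilon\mu\frac{\partial^2\mathbf{A}}{\partial t^2} - \mathrm{grad}\left(\mathrm{div}\,\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t}\right) = -\mu\mathbf{j}. \tag{1.17b}$$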
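The two equations for the potentials follow on substituting the expressions for B and E in terms of A and φ into the two Maxwell equations that contain sources. A brief sketch of the substitution, assuming D = εE and B = µH with constant ε and µ, and using the identity curl curl A = grad div A − ∇²A:

```latex
\begin{align*}
\mathrm{div}\,\mathbf{D} = \rho
  \;&\Rightarrow\; \varepsilon\,\mathrm{div}\left(-\mathrm{grad}\,\phi
     - \frac{\partial\mathbf{A}}{\partial t}\right) = \rho
  \;\Rightarrow\; \nabla^2\phi + \frac{\partial}{\partial t}(\mathrm{div}\,\mathbf{A})
     = -\frac{\rho}{\varepsilon}, \\
\mathrm{curl}\,\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}
  \;&\Rightarrow\; \mathrm{curl}\,\mathrm{curl}\,\mathbf{A}
     = \mu\mathbf{j} - \varepsilon\mu\,\mathrm{grad}\,\frac{\partial\phi}{\partial t}
       - \varepsilon\mu\frac{\partial^2\mathbf{A}}{\partial t^2} \\
  \;&\Rightarrow\; \nabla^2\mathbf{A} - \varepsilon\mu\frac{\partial^2\mathbf{A}}{\partial t^2}
     - \mathrm{grad}\left(\mathrm{div}\,\mathbf{A}
       + \varepsilon\mu\frac{\partial\phi}{\partial t}\right) = -\mu\mathbf{j}.
\end{align*}
```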

1.3.1 Gauge transformations

One can now make use of the fact that the physically relevant quantities

are the field vectors, and that various alternative sets of potentials may

be defined, corresponding to the same field vectors. Thus, the transfor-

mations from A, φ to A′, φ′ defined as

$$\mathbf{A}' = \mathbf{A} + \mathrm{grad}\,\Lambda, \qquad \phi' = \phi - \frac{\partial\Lambda}{\partial t}, \tag{1.18}$$

with an arbitrary scalar function Λ lead to an alternative choice, A′, φ′, of

the potentials. Equations (1.18) define what is referred to as the gauge

transformation of the electromagnetic potentials.
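That the transformed potentials describe the same field vectors can be verified directly. The check below is a short sketch that relies on the identity curl grad = 0 and on the interchange of space and time derivatives, both assumed here for sufficiently smooth Λ:

```latex
\begin{align*}
\mathbf{B}' &= \mathrm{curl}\,\mathbf{A}'
   = \mathrm{curl}\,\mathbf{A} + \mathrm{curl}\,\mathrm{grad}\,\Lambda
   = \mathrm{curl}\,\mathbf{A} = \mathbf{B}, \\
\mathbf{E}' &= -\mathrm{grad}\,\phi' - \frac{\partial\mathbf{A}'}{\partial t}
   = -\mathrm{grad}\,\phi + \mathrm{grad}\,\frac{\partial\Lambda}{\partial t}
     - \frac{\partial\mathbf{A}}{\partial t} - \frac{\partial}{\partial t}\,\mathrm{grad}\,\Lambda
   = \mathbf{E}.
\end{align*}
```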

1.3.2 The Lorentz gauge and the inhomogeneous wave equation

By an appropriate choice of the gauge function Λ, one can ensure that the

new potentials satisfy

$$\mathrm{div}\,\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t} = 0, \tag{1.19}$$

where the primes on the transformed potentials have been dropped for

the sake of brevity. With the potentials satisfying the Lorentz condition,

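eq. (1.19), the field equations (1.17a) and (1.17b) for the scalar and vector potentials assume the form of inhomogeneous wave equations with source terms −ρ/ε and −µj respectively:

$$\nabla^2\phi - \mu\varepsilon\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\varepsilon}, \tag{1.20a}$$

$$\nabla^2\mathbf{A} - \mu\varepsilon\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu\mathbf{j}. \tag{1.20b}$$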

A pair of potentials A, φ, satisfying the Lorentz condition (1.19) by virtue

of an appropriate choice of the gauge function Λ, is said to belong to

the Lorentz gauge. One may also consider a gauge transformation by

means of a gauge function Λ such that the Lorentz condition (1.19) is

not satisfied. One such choice of the gauge function, referred to as the

Coulomb gauge, requires that the vector potential satisfy

divA = 0. (1.21)

The special advantage of the Lorentz gauge compared to other choices of

gauge is that the field equations for A and φ are decoupled from each

other, and each of the two potentials satisfies the inhomogeneous wave

equation.
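What an 'appropriate choice' of the gauge function amounts to can be spelt out. In the sketch below, A and φ denote potentials that do not satisfy the Lorentz condition, and the primed potentials are those obtained from them by a gauge transformation with gauge function Λ:

```latex
\begin{align*}
\mathrm{div}\,\mathbf{A}' + \varepsilon\mu\frac{\partial\phi'}{\partial t}
  &= \mathrm{div}\,\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t}
     + \nabla^2\Lambda - \varepsilon\mu\frac{\partial^2\Lambda}{\partial t^2}, \\
\text{so the primed potentials satisfy the Lorentz condition if}\quad
\nabla^2\Lambda - \varepsilon\mu\frac{\partial^2\Lambda}{\partial t^2}
  &= -\left(\mathrm{div}\,\mathbf{A} + \varepsilon\mu\frac{\partial\phi}{\partial t}\right).
\end{align*}
```

The gauge function is thus itself a solution of an inhomogeneous wave equation. Once the Lorentz condition holds, the term in div A in the equation for φ can be replaced by −εµ ∂²φ/∂t², and the gradient term in the equation for A drops out, which is how the two equations decouple.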

1.3.3 The homogeneous wave equation in a source-free region

In a source-free region of space, the right hand sides of equations (1.20a)

and (1.20b) become zero and the potentials are then found to satisfy

the homogeneous wave equation. Since the field vectors E and B are

linearly related to the potentials, they also satisfy the homogeneous wave

equation in a source-free region:

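$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \tag{1.22a}$$

$$\nabla^2\mathbf{B} - \varepsilon\mu\frac{\partial^2\mathbf{B}}{\partial t^2} = 0. \tag{1.22b}$$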
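The coefficient εµ in the homogeneous wave equation fixes the speed of the waves it describes. The sketch below assumes a plane wave of the form cos(kz − ωt), a standard trial solution that is not derived in this section, and checks that it satisfies the wave equation when ω = vk with v = 1/√(εµ). The free space constants are the rounded values quoted earlier, and the function names and the 500 nm wavelength are illustrative.

```python
import math

EPS0 = 8.85e-12            # permittivity of free space, C^2 N^-1 m^-2
MU0 = 4.0e-7 * math.pi     # permeability of free space, N A^-2


def phase_speed(eps, mu):
    """Speed v of a plane wave solving the wave equation with coefficient eps*mu."""
    return 1.0 / math.sqrt(eps * mu)


def wave_equation_factor(k, omega, eps, mu):
    """Factor multiplying cos(kz - omega t) after substitution in the wave equation.

    The second z-derivative contributes -k^2 and the term
    -eps*mu * (second t-derivative) contributes +eps*mu*omega^2.
    """
    return -k * k + eps * mu * omega * omega


v = phase_speed(EPS0, MU0)       # roughly 3.0e8 m/s in free space
k = 2.0 * math.pi / 500e-9       # wavenumber for a 500 nm wavelength
omega = v * k                    # angular frequency fixed by omega = v k
relative_residual = wave_equation_factor(k, omega, EPS0, MU0) / (k * k)
```

The residual vanishes to rounding accuracy, and the speed obtained from the free space constants is close to the accepted speed of light in vacuum; the small difference comes from the rounded value used for the permittivity.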

1.4 Digression: vector differential operators

1.4.1 Curvilinear co-ordinates

A Cartesian co-ordinate system with co-ordinates, say, x1, x2, x3, is termed

an orthogonal rectilinear one since the co-ordinate lines xi = constant (i =

1, 2, 3), are all straight lines where any two intersecting lines are perpen-

dicular to one another. Considering an infinitesimal line element with

end points (x1, x2, x3) and (x1 + dx1, x2 + dx2, x3 + dx3), the squared length of

the line element is given by an expression of the form

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2. \tag{1.23}$$

More generally, one may consider an orthogonal curvilinear co-ordinate

system (examples: the spherical polar and cylindrical co-ordinate sys-

tems), with co-ordinates, say, u1, u2, u3, where the co-ordinate lines

ui = constant (i = 1, 2, 3) are orthogonal but curved. The squared length

of a line element with end points (u1, u2, u3), (u1 + du1, u2 + du2, u3 + du3) for

such a system is of the general form

$$ds^2 = h_1^2\,du_1^2 + h_2^2\,du_2^2 + h_3^2\,du_3^2, \tag{1.24}$$

where the scale factors hi (i = 1, 2, 3) are, in general, functions of the

co-ordinates u1, u2, u3. For the spherical polar co-ordinate system with

co-ordinates r, θ, φ, for instance, one has h1 = 1, h2 = r, h3 = r sin θ, while

for the cylindrical co-ordinate system made up of co-ordinates ρ, φ, z, the

scale factors are h1 = 1, h2 = ρ, h3 = 1.
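The scale factors quoted above can be recovered numerically from the mapping between curvilinear and Cartesian co-ordinates. The sketch below uses the relation h_i = |∂r/∂u_i|, which follows from the expression for the squared line element when only one co-ordinate is varied; the function names, the sample point, and the finite-difference step are assumptions made for illustration.

```python
import math


def spherical_position(r, theta, phi):
    """Cartesian position vector for spherical polar co-ordinates (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(position, u, step=1e-6):
    """Estimate h_i = |d(position)/du_i| by central differences at the point u."""
    factors = []
    for i in range(3):
        plus = list(u)
        minus = list(u)
        plus[i] += step
        minus[i] -= step
        p = position(*plus)
        m = position(*minus)
        length = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, m)))
        factors.append(length / (2.0 * step))
    return factors


r, theta, phi = 2.0, 0.7, 1.1
h1, h2, h3 = scale_factors(spherical_position, (r, theta, phi))
# To be compared with h1 = 1, h2 = r, h3 = r sin(theta) from the text.
```

The same `scale_factors` helper can be applied to a function returning (ρ cos φ, ρ sin φ, z) to check the cylindrical values.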

In this book, a differential expression such as, say, dx will often be used loosely to

express a small increment that may alternatively be expressed as δx. Strictly speaking, expressions like dx are meaningful only under integral signs. When used in an expression

in the sense of a small increment, it will be implied that terms of higher degree in the

small increment are not relevant in the context under consideration.

1.4.2 The differential operators

The differential operator ‘grad’ operates on a scalar field to produce a vec-

tor field, while the operators ‘div’ and ‘curl’ operate on a vector field, pro-

ducing a scalar field and a vector field respectively. These are commonly

expressed in terms of the symbol ∇ where, in the Cartesian system, one

has

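\[
\nabla = \mathbf{e}_1 \frac{\partial}{\partial x_1} + \mathbf{e}_2 \frac{\partial}{\partial x_2} + \mathbf{e}_3 \frac{\partial}{\partial x_3}, \tag{1.25a}
\]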

ei (i = 1, 2, 3) being the unit vectors along the three co-ordinate axes. For

an orthogonal curvilinear co-ordinate system, this generalizes to

\[
\nabla = \sum_i \mathbf{e}_i \frac{1}{h_i} \frac{\partial}{\partial u_i}, \tag{1.25b}
\]

where the unit co-ordinate vectors ei are, in general, functions of the co-

ordinates u1, u2, u3. Thus, for instance, for a vector field

\[
\mathbf{A}(\mathbf{r}) = \sum_i \mathbf{e}_i(u_1, u_2, u_3)\, A_i(u_1, u_2, u_3), \tag{1.26a}
\]

one will have

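\[
\mathrm{curl}\, \mathbf{A} = \sum_{i,j} \left( \mathbf{e}_i \frac{1}{h_i} \frac{\partial}{\partial u_i} \right) \times \Big( \mathbf{e}_j(u_1, u_2, u_3)\, A_j(u_1, u_2, u_3) \Big), \tag{1.26b}
\]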

where one has to note that the derivatives $\partial/\partial u_i$ operate on the components $A_j$ and also on the unit vectors $\mathbf{e}_j$ (i, j = 1, 2, 3).

In this sense, one can write div A and curl A as ∇ · A and ∇ × A respectively, while grad φ can be expressed as ∇φ, where φ stands for a scalar field.

The second order differential operators like curl curl and grad div can be

defined along similar lines, in terms of two successive applications of ∇.

A convenient definition of ∇2A is given by

\[
\nabla^2 \mathbf{A} = \mathrm{grad}\, \mathrm{div}\, \mathbf{A} - \mathrm{curl}\, \mathrm{curl}\, \mathbf{A}. \tag{1.27}
\]
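To see what eq. (1.27) amounts to in the simplest case (a standard identity, noted here only as a consistency check), consider Cartesian co-ordinates, where the unit vectors are constant. The right hand side then reduces to the ordinary Laplacian acting on each component:

```latex
\[
(\nabla^2 \mathbf{A})_i = \nabla^2 A_i
  = \frac{\partial^2 A_i}{\partial x_1^2}
  + \frac{\partial^2 A_i}{\partial x_2^2}
  + \frac{\partial^2 A_i}{\partial x_3^2},
  \qquad i = 1, 2, 3.
\]
```

In a curvilinear system this component-wise form does not hold in general, since the unit vectors themselves depend on the co-ordinates, which is why a definition of the form (1.27) is the convenient one.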

1.5 The principle of superposition

The principle of superposition is applicable to solutions of Maxwell’s equations in a linear medium (eqs. (1.1a)-(1.1d), along with eqs. (1.3a)-(1.3b), with ε and µ independent of the field strengths) since these constitute a

set of linear partial differential equations. If, for a given set of boundary

conditions, E1(r, t), H1(r, t) and E2(r, t), H2(r, t) be two solutions to these

equations in some region of space free of source charges and currents,

then a1E1(r, t) + a2E2(r, t), a1H1(r, t) + a2H2(r, t) also represents a solution

satisfying the same boundary conditions, where a1 and a2 are scalar con-

stants and where we assume that the boundary conditions involve the

field variables linearly. More generally, the superposition of two or more

solutions results in a new solution satisfying a different set of boundary

conditions compared to the ones satisfied by the ones one started with.

Of the four field variables E, D, B, and H, only two (made up of one electric and one

magnetic variable) are independent, the remaining two being determined by the consti-

tutive relations. A common choice for these two independent variables consists of the

vectors E and H since the Maxwell equations possess a symmetrical structure in terms

of these variables. From a fundamental point of view, however, B and H are the magnetic

analogs of E and D respectively, according to which the independent pair may be chosen

as E and B or, alternatively, D and H.

Starting from simple or known solutions of Maxwell’s equations, the prin-

ciple of superposition can be made use of to construct more complex so-

lutions that may represent the electromagnetic field in a given real life

situation to a good degree of approximation. Thus, starting from a pair

of plane monochromatic wave solutions (see sec. 1.10) one can obtain

the field produced by a pair of narrow slits illuminated by a plane wave,

where this superposed field is seen to account for the formation of inter-

ference fringes by the slits. Indeed, the principle of superposition has an

all-pervading presence in electromagnetic theory and optics.
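The way superposition leads to interference can be illustrated with a small Python sketch. It assumes, purely for illustration, two harmonic fields of unit amplitude and equal frequency arriving at a point with a phase difference Δ, and computes the time average of the square of their sum over one period. The function name and the sampling scheme are assumptions of this sketch, not a model of an actual two-slit set-up.

```python
import math

def mean_square_of_sum(phase_difference, samples=1000):
    """Average over one period of (cos(wt) + cos(wt + phase_difference))**2,
    with wt sampled uniformly over a full cycle."""
    total = 0.0
    for n in range(samples):
        wt = 2 * math.pi * n / samples
        total += (math.cos(wt) + math.cos(wt + phase_difference)) ** 2
    return total / samples

# The average follows 1 + cos(phase_difference): largest when the two
# fields are in phase, and vanishing when they are in opposite phase.
for delta in (0.0, math.pi / 2, math.pi):
    print(delta, mean_square_of_sum(delta))
```

As the phase difference varies from point to point on an observation screen, this averaged quantity alternates between maxima and minima, which is the origin of the fringes referred to above.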

1.6 The complex representation

In electromagnetic theory in general, and optics in particular, one often

encounters fields that vary harmonically with time, or ones closely resem-

bling such harmonically varying fields. Such a harmonically varying field

has a temporal variation characterized by a single angular frequency,

say, ω, and is of the form (we refer to the electric intensity for the sake of

concreteness)

\[
\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0(\mathbf{r}) \cos(\omega t + \delta(\mathbf{r})), \tag{1.28}
\]

where E0(r) stands for the space dependent real amplitude of the field and

δ(r) for a time-independent phase that may be space dependent. Similar

expressions hold for the other field vectors of the harmonically varying

field where the space dependent amplitudes and the phases (analogous

to E0(r) and δ(r) characterizing the electric intensity vector) bear definite

relations with one another since all the field vectors taken together have

to satisfy the Maxwell equations.

A convenient way of working with harmonically varying fields, and with the field vectors in general, is to make use of the complex representation. Corresponding to a real time dependent (as also possibly space dependent) vector A, we consider the complex vector $\tilde{\mathbf{A}}$, such that

\[
\mathbf{A} = \mathrm{Re}\, \tilde{\mathbf{A}}. \tag{1.29}
\]

For a given vector A, eq. (1.29) does not define $\tilde{\mathbf{A}}$ uniquely, since the imaginary part of $\tilde{\mathbf{A}}$ can be chosen arbitrarily. However, for a vector with harmonic time-dependence of the form, say,

\[
\mathbf{A} = \mathbf{A}_0 \cos(\omega t + \delta), \tag{1.30}
\]

with amplitude $\mathbf{A}_0$ (a real vector, possibly space dependent), the prescription for the corresponding complex vector $\tilde{\mathbf{A}}$ can be made unique by making the choice

\[
\tilde{\mathbf{A}} = \tilde{\mathbf{A}}_0 e^{-i\omega t}, \tag{1.31}
\]

where $\tilde{\mathbf{A}}_0 = \mathbf{A}_0 e^{-i\delta}$ is the complex amplitude with a phase factor $e^{-i\delta}$.
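A minimal Python sketch of the prescription in eqs. (1.30) and (1.31), written for a single component of the vector, is given below. The function names and sample values are assumptions made for illustration. The sketch only checks that the real part of the complex quantity reproduces the real harmonic field.

```python
import cmath
import math

def real_field(amplitude, omega, delta, t):
    """Real harmonic component A0*cos(omega*t + delta), as in eq. (1.30)."""
    return amplitude * math.cos(omega * t + delta)

def complex_field(amplitude, omega, delta, t):
    """Complex representation: complex amplitude A0*exp(-i*delta)
    multiplied by exp(-i*omega*t), as in eq. (1.31)."""
    complex_amplitude = amplitude * cmath.exp(-1j * delta)
    return complex_amplitude * cmath.exp(-1j * omega * t)

# Taking the real part recovers the real field at any instant.
for t in (0.0, 0.3, 1.7):
    difference = complex_field(2.0, 5.0, 0.4, t).real - real_field(2.0, 5.0, 0.4, t)
    print(abs(difference) < 1e-12)
```

Note that the sign convention matters: with the time factor chosen as exp(−iωt), a phase δ in the cosine corresponds to a factor exp(−iδ) in the complex amplitude.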

A unique complex representation having a number of desirable features

can be introduced for a more general time dependence as well, as will be

explained in chapter 7.

The complex representation has been introduced here for a real time

dependent (and possibly space dependent) vector A since the electro-

magnetic field variables are vectorial quantities. Evidently, an analogous

complex representation can be introduced for space- and time dependent

scalar functions as well.

The complex representation for the harmonically varying electric field de-

scribed by eq. (1.28) is of the form

\[
\tilde{\mathbf{E}}(\mathbf{r}, t) = \tilde{\mathbf{E}}(\mathbf{r}) e^{-i\omega t}, \tag{1.32a}
\]

where $\tilde{\mathbf{E}}(\mathbf{r})$ is the space dependent complex amplitude of $\tilde{\mathbf{E}}(\mathbf{r}, t)$, being related to the real amplitude $\mathbf{E}_0(\mathbf{r})$ and the phase δ(r) as

\[
\tilde{\mathbf{E}}(\mathbf{r}) = \mathbf{E}_0(\mathbf{r}) e^{-i\delta(\mathbf{r})}. \tag{1.32b}
\]

The complex amplitude is often expressed in brief as $\tilde{\mathbf{E}}$ (or even simply as E, by dropping the tilde), keeping its space dependence implied. The time dependence of $\tilde{\mathbf{E}}(\mathbf{r}, t)$ is obtained by simply multiplying with $e^{-i\omega t}$, while the actual field E(r, t) is obtained by taking the real part of $\tilde{\mathbf{E}}(\mathbf{r}, t)$.

The abbreviated symbol E is variously used to denote the complex amplitude ($\tilde{\mathbf{E}}(\mathbf{r})$), the space- and time dependent complex field vector $\tilde{\mathbf{E}}(\mathbf{r}, t)$, or the real field vector E(r, t) (similar notations being used for the other field vectors as well). The sense in which the symbol is used is, in general, clear from the context.

It is often convenient to employ the complex representation in expres-

sions and calculations involving products of electric and magnetic field

components, and their time averages.

In making use of the complex representation, it is a common practice

to drop the tilde over the symbol of the scalar or the vector under con-

sideration for the sake of brevity, it being usually clear from the context

whether the real or the corresponding complex quantity is being referred

to. I will display the tilde whenever there is any scope for confusion.

1.7 Energy density and energy flux

1.7.1 Energy density

It requires energy to set up an electromagnetic field in any given region

of space. This energy may be described as being stored in the field itself,

and is referred to as the electromagnetic field energy, since the field can

impart either a part or the whole of this energy to other systems with

which it can interact.

This is one reason why the electromagnetic field can be described as a dynamical system.

It possesses energy, momentum, and angular momentum, which it can exchange with

other dynamical systems like, say, a set of charged bodies in motion.

The field energy can be expressed in the form

\[
W = \int \left( \frac{1}{2} \mathbf{E} \cdot \mathbf{D} + \frac{1}{2} \mathbf{B} \cdot \mathbf{H} \right) dv, \tag{1.33}
\]

where the integration is performed over the region in which the field is

set up (or, more generally, over the entire space since the field extends, in principle, up to infinite distances).

One can work out, for instance, the energy required to set up an electric field between

the plates of a parallel plate capacitor and check that it is given by the first term on

the right hand side of eq. (1.33). Similarly, on evaluating the energy required to set up

the magnetic field within a long solenoid, one finds it to be given by the second term.

The assumption that the sum of the two terms represents the energy associated with

a time-varying electromagnetic field, is seen to lead to a consistent interpretation, com-

patible with the principle of conservation of energy, of results involving energy exchange

between the electromagnetic field and material bodies with which the field may interact.

One can say that some amount of energy is contained within any and ev-

ery finite volume within the region occupied by the field, and arrive at the

concept of the electromagnetic energy density, the latter being the field

energy per unit volume around any given point in space. Evidently, the

concept of energy in any finite volume within the field is not as uniquely

defined as that for the entire field, but the integrand on the right hand

side of eq. (1.33) can be interpreted to be a consistent expression for the

energy density w. This energy density, moreover, can be thought of as be-

ing made up of two parts, an electric and a magnetic one. The expressions

for the electric, magnetic, and total energy densities are thus

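\[
w_e = \frac{1}{2} \mathbf{E} \cdot \mathbf{D}, \qquad w_m = \frac{1}{2} \mathbf{B} \cdot \mathbf{H}, \tag{1.34a}
\]

and

\[
w = \frac{1}{2} \mathbf{E} \cdot \mathbf{D} + \frac{1}{2} \mathbf{B} \cdot \mathbf{H}. \tag{1.34b}
\]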

For a field set up in empty space, the energy density is given by the

expression

\[
w = \frac{1}{2} \epsilon_0 E^2 + \frac{1}{2} \mu_0 H^2. \tag{1.34c}
\]

In general, the energy density w (and its electric and magnetic components $w_e$, $w_m$) varies with time extremely rapidly and hence does not have direct physical relevance since no recording instrument can measure such

rapidly varying fields. What is of greater relevance is the time averaged

energy density, where the averaging is done over a time large compared

to the typical time interval over which the fields fluctuate. Indeed, com-

pared to the latter, the averaging time may be assumed to be infinitely

large without causing any appreciable modification in the interpretation

of the averaged energy density.

Thus, the time averaged energy density (which is often referred to as

simply the energy density) at any given point of the electromagnetic field

is given by

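\[
\langle w \rangle = \left\langle \frac{1}{2} \mathbf{E} \cdot \mathbf{D} + \frac{1}{2} \mathbf{B} \cdot \mathbf{H} \right\rangle, \tag{1.35a}
\]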

where the symbols E, D, etc. stand for the time dependent real field vec-

tors at the point under consideration, and the angular brackets indicate

time averaging, the latter being defined, for a time dependent function

f(t) as

\[
\langle f \rangle = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, dt. \tag{1.35b}
\]

For a field set up in vacuum, the time averaged energy density is given by

the expression

\[
\langle w \rangle = \left\langle \frac{1}{2} \epsilon_0 E^2 + \frac{1}{2} \mu_0 H^2 \right\rangle. \tag{1.35c}
\]

At times, the angular brackets are omitted in expressions representing

the energy density for the sake of brevity, it being usually clear from the

context that an appropriate time averaging is implied.

Note that the energy densities involve the time averages of the products

of field variables. A convenient way to work out these time averages is to

make use of the complex representations of the field vectors. We consider

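here the special case of a harmonic time-dependence of the field variables, discussed in sections 1.9.2 and 1.6. Making use of the notation of equations (1.31), (1.32b), one arrives at the following result for the energy density at any given point r:

\[
\langle w \rangle = \frac{1}{8} \left\langle \tilde{\mathbf{E}}(\mathbf{r}) \cdot \tilde{\mathbf{D}}(\mathbf{r})^* + \tilde{\mathbf{E}}(\mathbf{r})^* \cdot \tilde{\mathbf{D}}(\mathbf{r}) + \tilde{\mathbf{H}}(\mathbf{r}) \cdot \tilde{\mathbf{B}}(\mathbf{r})^* + \tilde{\mathbf{H}}(\mathbf{r})^* \cdot \tilde{\mathbf{B}}(\mathbf{r}) \right\rangle, \tag{1.36a}
\]

which can be written as

\[
\langle w \rangle = \frac{1}{4} \left\langle \epsilon_0 \tilde{\mathbf{E}}^* \cdot \tilde{\mathbf{E}} + \mu_0 \tilde{\mathbf{H}}^* \cdot \tilde{\mathbf{H}} \right\rangle \tag{1.36b}
\]

for a field in empty space. In eq. (1.36b) the reference to the point r is omitted for the sake of brevity.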
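The rule underlying eq. (1.36a) is that, for two real harmonic quantities a(t) and b(t) of the same frequency, the time average of the product equals one fourth of the sum of the product of one complex amplitude with the conjugate of the other, and its complex conjugate. The Python sketch below compares this rule with a direct average over one period, for single field components. The function names and sample values are assumptions made for illustration.

```python
import cmath
import math

def time_average_product(a0, delta_a, b0, delta_b, samples=1000):
    """Direct average over one period of
    a0*cos(wt + delta_a) * b0*cos(wt + delta_b)."""
    total = 0.0
    for n in range(samples):
        wt = 2 * math.pi * n / samples
        total += a0 * math.cos(wt + delta_a) * b0 * math.cos(wt + delta_b)
    return total / samples

def complex_amplitude_average(a0, delta_a, b0, delta_b):
    """The same average from complex amplitudes: (1/4)(a b* + a* b)."""
    a = a0 * cmath.exp(-1j * delta_a)
    b = b0 * cmath.exp(-1j * delta_b)
    return (0.25 * (a * b.conjugate() + a.conjugate() * b)).real

direct = time_average_product(3.0, 0.2, 1.5, 1.1)
from_amplitudes = complex_amplitude_average(3.0, 0.2, 1.5, 1.1)
print(abs(direct - from_amplitudes) < 1e-9)
```

Applying the rule to each of the products E · D and B · H, with the factor 1/2 of eq. (1.35a), gives the factor 1/8 appearing in eq. (1.36a).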

1.7.2 Poynting’s theorem: the Poynting vector

Considering any region V in an electromagnetic field bounded by a closed

surface S, one can express in mathematical form the principle of con-

servation of energy as applied to the field and the system of particles

constituting the charges and currents within this volume. The rate of

change of the field energy within this region is obtained by taking the

time derivative of the integral of the energy density over the region V,

while the rate of change of the energy of the system of particles consti-

tuting the charges and currents in this region is the same as the rate at

which the field transfers energy to these charges and currents. The latter

is given by the expression E · j per unit volume.

The rate at which the field transfers energy to the system of particles constituting the

source charges and currents includes the rate at which mechanical work is done on

these, as also the rate at which energy is dissipated as heat into this system of particles. We assume here that the energy dissipation occurs only in the form of production

of Joule heat, and ignore for the sake of simplicity the energy dissipation due to the

magnetic hysteresis, if any, occurring within the region under consideration.

Summing up the two expressions referred to above (the rate of increase of

the field energy and that of the energy of the charges and currents), one

obtains the rate at which the total energy of the systems inside the region

V under consideration changes with time. The principle of conservation

of energy then implies that this must be the rate at which the field energy

flows into the region through its boundary surface S.

Making use of the above observations, and going through a few steps of

mathematical derivation by starting from Maxwell’s equations, one ar-

rives at the following important result (Poynting’s theorem),

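\[
\frac{\partial}{\partial t} \int_V \frac{1}{2} \left( \mathbf{E} \cdot \mathbf{D} + \mathbf{H} \cdot \mathbf{B} \right) dv + \int_V \mathbf{E} \cdot \mathbf{j}\, dv = - \int_S \mathbf{E} \times \mathbf{H} \cdot \mathbf{n}\, ds, \tag{1.37}
\]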

where the right hand side involves the surface integral, taken over the

boundary surface S, of the outward normal component (along the unit

normal n at any given point on the surface) of the vector

\[
\mathbf{S} = \mathbf{E} \times \mathbf{H}. \tag{1.38}
\]

This vector, at any given point in the field, is referred to as the Poynt-

ing vector at that point and, according to the principle of conservation of

energy as formulated above, can be interpreted as the flux of electromag-

netic energy at that point, i.e., as the rate of flow of energy per unit area

of an imagined surface perpendicular to the vector. Once again, there

remains an arbitrariness in the definition of the energy flux, though the

above expression is acceptable on the ground that it is a consistent one.
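The few steps leading to eq. (1.37) can be outlined as follows. This is only a sketch, and it assumes that the curl equations among (1.1a)-(1.1d) have their usual form, curl E = −∂B/∂t and curl H = j + ∂D/∂t, and that the medium is linear with ε and µ independent of time, so that E · ∂D/∂t and H · ∂B/∂t are the time derivatives of ½ E · D and ½ H · B respectively.

```latex
\begin{align*}
\nabla \cdot (\mathbf{E} \times \mathbf{H})
  &= \mathbf{H} \cdot (\nabla \times \mathbf{E}) - \mathbf{E} \cdot (\nabla \times \mathbf{H}) \\
  &= -\mathbf{H} \cdot \frac{\partial \mathbf{B}}{\partial t}
     - \mathbf{E} \cdot \frac{\partial \mathbf{D}}{\partial t}
     - \mathbf{E} \cdot \mathbf{j} \\
  &= -\frac{\partial}{\partial t}\, \frac{1}{2}
     \left( \mathbf{E} \cdot \mathbf{D} + \mathbf{H} \cdot \mathbf{B} \right)
     - \mathbf{E} \cdot \mathbf{j}.
\end{align*}
```

Integrating both sides over the region V and converting the volume integral of the divergence into a surface integral over S by means of Gauss's theorem, one recovers eq. (1.37).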

1.7.3 Intensity at a point

Recalling that the field vectors at any given point are rapidly varying func-

tions of time, one can state that only the time average of the Poynting vec-

tor, rather than the rapidly varying vector itself, is of physical relevance,

being given by the expression

\[
\langle \mathbf{S} \rangle = \langle \mathbf{E} \times \mathbf{H} \rangle. \tag{1.39}
\]

Assuming that the temporal variation of the field vectors is a harmonic

one, and making use of the complex representation of vectors as ex-

plained in sec. 1.6, one obtains

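\[
\langle \mathbf{S} \rangle = \frac{1}{4} \left( \tilde{\mathbf{E}} \times \tilde{\mathbf{H}}^* + \tilde{\mathbf{E}}^* \times \tilde{\mathbf{H}} \right), \tag{1.40}
\]

where $\tilde{\mathbf{E}}$ and $\tilde{\mathbf{H}}$ stand for the complex amplitudes corresponding to the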

respective real time dependent vectors (appearing in eq. (1.39)) at the

point under consideration. The magnitude of this time averaged energy

flux at any given point in an electromagnetic field then gives the intensity

(I) at that point:

\[
\mathbf{S} = I\, \mathbf{s}, \tag{1.41}
\]

where the angular brackets indicating the time average have been omitted for the sake of brevity and s denotes the unit vector along ⟨S⟩.
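As an illustration of eq. (1.40), the Python sketch below evaluates the time averaged Poynting vector and the intensity for an assumed example: a plane wave in vacuum travelling along the z-axis, with the electric vector along x, the magnetic vector along y, and the amplitudes related through the vacuum impedance √(µ0/ε0) (plane waves are taken up in sec. 1.10). The amplitude of 100 V/m, the common phase, and all function names are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
MU0 = 1.25663706212e-6    # permeability of free space, H/m

def cross(a, b):
    """Cross product of two 3-component (possibly complex) vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def conjugate(v):
    """Component-wise complex conjugate."""
    return tuple(complex(c).conjugate() for c in v)

def mean_poynting(E, H):
    """Time averaged Poynting vector (1/4)(E x H* + E* x H), eq. (1.40),
    for complex amplitudes E and H."""
    first = cross(E, conjugate(H))
    second = cross(conjugate(E), H)
    return tuple((0.25 * (p + q)).real for p, q in zip(first, second))

# Assumed plane wave in vacuum: E along x, H along y, propagation along z.
E0 = 100.0                                      # V/m, illustrative value
Z0 = math.sqrt(MU0 / EPS0)                      # vacuum impedance, ohms
phase = complex(math.cos(0.4), -math.sin(0.4))  # common factor exp(-i*0.4)
E = (E0 * phase, 0j, 0j)
H = (0j, (E0 / Z0) * phase, 0j)

S = mean_poynting(E, H)
intensity = math.sqrt(sum(c * c for c in S))
print(S, intensity)
```

For this example the averaged vector points along z and the intensity equals E0²/(2 Z0). The common phase factor drops out, as it must, since each term in eq. (1.40) pairs an amplitude with a complex conjugate.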

One way of looking at Maxwell’s equations is to say that these equations describe how

the temporal variations of the field vectors in one region of space get transmitted to

adjacent regions. In the process, there occurs the flow of field energy referred to above.

In addition, there occurs a flow of momentum and angular momentum associated with

the field. Analogous to the energy flux vector, one can set up expressions for the flux of

field momentum and angular momentum as well, where these appear as components of

a tensor quantity.

1.8 Optical fields: an overview

A typical optical set-up involves a light source emitting optical radiation

(also termed an optical field here and in the following) which is a space-

and time dependent electromagnetic field, one or more optical devices,

like beam-splitters, lenses, screens with apertures, and stops or obstacles

and, finally, one or more recording devices like photographic plates and

photocounters. The optical devices serve to change or modify the optical

field produced by the source depending on the purpose at hand, and this

modified optical field is detected and recorded to generate quantitative

data relating to the optical field.

If the electromagnetic field produced by the source or recorded by a de-

tecting device is analyzed at any given point in space over an interval

of time, it will be found to correspond to a time dependent electric and

magnetic field intensity, constituting an optical signal at that point. This

time dependence is commonly determined principally by the nature of

the source rather than by the optical devices like lenses and apertures.

On analyzing the optical signal, it is found to be made up of a number

of components, each component corresponding to a particular frequency.

For some sources, the frequencies of the components are distributed over

a narrow range (which, ideally, may even be so narrow as to admit of only

a single frequency), or these may be spread out over a comparatively

wider range.

On close scrutiny, the time variation of an optical signal is often found

to be of a random or statistical nature rather than a smooth and regular

one. This relates to the very manner in which a source emits optical ra-

diation. While the source is commonly a macroscopic body, the radiation

results from a large number of microscopic events within it, where a mi-

croscopic event may be a sudden deceleration of an electron in a material

or an atomic transition from one quantum mechanical stationary state

to another. Tiny differences between such individual microscopic events

lead to statistical fluctuations in the radiation emitted by the source, the

latter being a macroscopic system made up of innumerable microscopic

constituents.

The emission processes from the microscopic constituents of the source

are quantum mechanical events and, in addition, the electromagnetic

field is made up of photons resulting from these emission processes.

These photons themselves are quantum mechanical objects. It is this

essential quantum mechanical nature of the microscopic events associ-

ated with the electromagnetic field that lends a distinctive character to

the fluctuations of the field variables.

In summary, optical signals may be of diverse kinds, ranging from al-

most ideally monochromatic and coherent ones characterized by a single

frequency (or a close approximation to it), to incoherent signals showing

fluctuations and an irregular variation in time.

The other, complementary, aspect of the optical field is its spatial de-

pendence at any particular point of time or, more commonly, the spatial

dependence of the field obtained on averaging over a sufficiently long in-

terval of time. It is this spatial dependence of the field that is markedly

changed by the optical devices like lenses, apertures and stops.

Whatever the temporal and spatial variation of the optical field under con-

sideration, it must ultimately relate to the Maxwell equations for the given

optical set-up. Strictly speaking, an optical field is to be determined, in

the ultimate analysis, by solving the Maxwell equations in a given region

of space subject to appropriate boundary conditions on the closed bound-

ary surface of that region. However, this ideal procedure can seldom be

followed faithfully and completely because of difficulties associated with

the choice of an appropriate boundary surface, those relating to the spec-

ification of the appropriate set of boundary conditions, and finally, those

relating to getting the Maxwell equations solved with these conditions.

What is more, the statistical fluctuations of the field variables make it

meaningless to try to obtain solutions to the Maxwell equations as well-defined functions of time (expressed in terms of deterministic variables) since only certain appropriately defined statistical averages can be described as meaningful physical quantities which one can relate to solutions of the Maxwell equations. We shall, however, not be concerned with this statistical aspect of the field variables in the present context, considering it in greater detail in chapter 7 (see also sec. 1.21 for a brief introduction).

All the difficulties mentioned above add up to what often constitutes a

formidable challenge, and the only way to deduce meaningful informa-

tion about the optical field in a given optical set-up that then remains

is to employ suitable approximations. Ray optics (or geometrical optics)

and diffraction theory constitute two such approximation schemes of wide

usefulness in optics. However, as I have already mentioned, these ap-

proximation schemes retain their usefulness even outside the domain of

optics, i.e., their range of applicability extends to frequencies beyond the

range one associates with visible light.

This is not to convey the impression that one cannot acquire working knowledge in ray

optics or diffraction theory without a thorough grounding in electromagnetic theory. In

this book, however, my approach will be to trace the origins of the working rules of these

approximation schemes to the principles of electromagnetic theory.

In working out solutions to the Maxwell equations, it is often found conve-

nient to look at regions of space where there are no free charge or current

sources as distinct from those containing the sources. These sources are

commonly situated in some finite region of space and the field they create

satisfies the inhomogeneous wave equation in these regions. The tempo-

ral variation of the field can be analyzed into monochromatic components

and each monochromatic component is then found to satisfy the inhomogeneous Helmholtz equation (see sec. 1.9.2.2). Away from the region containing the sources, the field variables can be represented in terms of a

series expansion referred to as the multipole expansion whose coefficients

are determined by the boundary conditions of the set-up. Equivalently,

the multipole series results from the homogeneous Helmholtz equation

with, once again, an appropriate set of boundary conditions where now

the boundary is to be chosen in such a way as to exclude the region

containing the sources.

Often, a convenient approach consists of making appropriate clever guesses

at the solution that one seeks for a given optical set-up, depending on

a number of requirements (relating to the appropriate boundary condi-

tions) that the solution has to satisfy. However, one has to be sure that

the guesswork does indeed give the right solution. This relates to the

uniqueness theorem that tells one, in effect, that no other solution to the

field equations is possible.

After stating the uniqueness theorem in electromagnetic theory in the

next section, I will introduce a number of simple solutions to the field

equations which turn out to be useful in optics, and in electromagnetic

theory in general.

1.8.1 The uniqueness theorem

Let us consider a region V in space bounded by a closed surface S, within

which the Maxwell equations are satisfied. Let the field vectors be given

at time t = 0. Further, let the field vectors satisfy the boundary condition

that the tangential component of the electric intensity (Et) equals a given

vector function (possibly time dependent) on the boundary surface S for

all t ≥ 0 (recall that the tangential component is given by n × E at points

on S, where n stands for the unit normal, which is commonly chosen to

be the outward drawn one with respect to the interior of V, at any given

point of S). One can then say that the field vectors are thereby uniquely

specified within V for all t ≥ 0. The uniqueness theorem can also be

formulated in terms of the tangential component of the magnetic vector

H over the boundary surface.

In other words, if E1,H1, E2,H2 be two sets of field vectors satisfying

Maxwell’s equations everywhere within V, and satisfy E1t = E2t on S

for all t ≥ 0, and if E1 = E2, H1 = H2 at t = 0 then one must have

E1 = E2, H1 = H2 everywhere within V for all t > 0.

In the case of a harmonically varying field, Maxwell’s equations lead to

the homogeneous Helmholtz equations for the field vectors in a region

free of sources (see sec. 1.9.2). The uniqueness theorem then states that

the field is uniquely determined within any given volume in this region if

the tangential component of the electric (or the magnetic) vector is spec-

ified on the boundary surface enclosing that volume. This form of the

uniqueness theorem can be established by making use of Green’s functions appropriate for the boundary surface (see section 5.6).

This form of the uniqueness theorem is made use of in diffraction theory

where one derives the field vectors in a region of space from a number

of boundary data. In the typical diffraction problem the region within

V contains no sources (i.e., charge and current distributions). Once the

uniqueness of the field is established in the absence of sources, it follows

with sources included within V since the contribution of the latter to

the field, subject to the boundary condition, is separately and uniquely

determined, again with the help of the appropriate Green’s function.

1.9 Simple solutions to Maxwell’s equations

1.9.1 Overview

Much of electromagnetic theory and optics is concerned with obtaining

solutions of Maxwell’s equations in situations involving given boundary

and initial conditions while, in numerous situations of interest, the ini-

tial condition is replaced with one of a harmonic time dependence. Even

when the time dependence is harmonic, the required solution may have

a more or less complex spatial dependence. Starting from harmonic

solutions of a given frequency and with a relatively simple spatial de-

pendence, one can build up ones with a more complex spatial variation

by superposition, where the superposed solution is characterized by the

same frequency. On the other hand, a superposition of solutions with

different frequencies leads to solutions with a more complex time depen-

dence. In this book we will be mostly concerned with monochromatic

fields, i.e., ones with a harmonic time dependence of a given frequency.

In reality, the field variations are more appropriately described as quasi-

monochromatic, involving harmonic components with frequencies spread

over a small interval.

Monochromatic solutions to the Maxwell equations with the simplest spa-

tial dependence, namely a harmonic one, are the plane waves. These

will be considered in various aspects in sec. 1.10 since plane waves, in

spite of their simplicity, are of great relevance in optics. Two other har-

monic solutions with a simple spatial dependence are the spherical and

the cylindrical waves, briefly discussed in sections 1.17 and 1.18.

More generally, monochromatic solutions to Maxwell’s equations are obtained by solving the Helmholtz equations with appropriate boundary conditions (see sec. 1.9.2.2). In particular, solutions to diffraction problems

in optics are fundamentally based on finding solutions to the Helmholtz

equations.

While building up of solutions to the Maxwell equations by the superposi-

tion of simpler solutions constitutes a basic approach in electromagnetic

theory and optics, such superpositions are often not adequate in reproducing optical fields in real life situations. A superposition of the form $\sum_i c_i \psi_i$, obtained from known wave functions $\psi_i$ (i = 1, 2, . . .), with given

complex coefficients ci produces a wave function of a deterministic nature

while optical fields are often described more appropriately with functions

having random features, i.e., ones that require a statistical description.

Put differently, while a simple superposition produces a coherent field

variation, real life fields are more commonly incoherent or partially coher-

ent.

Any given set of known wave functions ψi (i = 1, 2, . . .), can be superposed

with coefficients ci so as to produce a coherent field of a more complex

nature. On the other hand, an incoherent field variation can be produced

by a mixture of these fields, where a mixture differs from a superposi-

tion by way of involving statistical features in it. You will find a brief

introduction to coherent and incoherent fields in sec. 1.21, more detailed

considerations of which will be taken up in chapter 7.

The distinction between superposed and mixed configurations of an elec-

tromagnetic field is analogous to that between superposed and mixed

states of a quantum mechanical system.

1.9.2 Harmonic time dependence

Let us assume that the source functions ρ(r, t), j(r, t) and the field vectors (as also the potentials) all have a harmonic time dependence with an angular frequency ω. We can write, for instance, $\rho(\mathbf{r}, t) = \rho(\mathbf{r})e^{-i\omega t}$, $\mathbf{j}(\mathbf{r}, t) = \mathbf{j}(\mathbf{r})e^{-i\omega t}$, with similar expressions for the field vectors and potentials, where we use the complex representation for these quantities, omitting the tilde in the complex expressions for the sake of brevity. In an expression of the form $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}(\mathbf{r})e^{-i\omega t}$, for instance, E(r) denotes the space dependent complex amplitude of the electric intensity. At times, the space dependence is left implied, and thus E(r) is written simply as E. The meanings of the symbols used will, in general, be clear from the context.

Among the four field vectors E, D, B, H, one commonly uses the first

and the last ones as the independent vectors, expressing the remaining

two in terms of these through the constitutive equations. This makes

the relevant field equations look symmetric in the electric and magnetic

quantities. Thus, we have, for a time-harmonic field with angular fre-

quency ω,

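\[
\mathbf{E}(\mathbf{r}, t) = \mathbf{E}(\mathbf{r}) e^{-i\omega t}, \qquad \mathbf{H}(\mathbf{r}, t) = \mathbf{H}(\mathbf{r}) e^{-i\omega t}. \tag{1.42}
\]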

1.9.2.1 Fictitious magnetic charges and currents

For the harmonic time dependence under consideration, one can express

Maxwell’s equations for free space in terms of the relevant complex am-

plitudes. In writing out these equations, I introduce for the sake of later

use, fictitious magnetic charge- and current densities. Thus, we include the magnetic current density ($\mathbf{j}^{(m)} = \mathbf{j}^{(m)}(\mathbf{r})$, the space dependent complex amplitude of $\mathbf{j}^{(m)}(\mathbf{r}, t) = \mathbf{j}^{(m)}(\mathbf{r})e^{-i\omega t}$), and the corresponding magnetic charge density ($\rho^{(m)}$). Evidently, such magnetic charges and currents

do not correspond to real sources since observed fields are all produced

by electric charge- and current distributions. However, if one consid-

ers the field within a region free of sources (i.e., the sources producing

the field are all located outside this region) then the field vectors can be

equivalently expressed in terms of a set of fictitious charges and currents

distributed over the boundary surface of the region, where these ficti-

tious sources include magnetic charges and currents. In this equivalent

representation, the actual sources are not explicitly referred to.

On introducing the magnetic charge- and current densities, the Maxwell

equations for an isotropic medium (equations (1.1a) - (1.1d)), expressed

in terms of the space dependent complex amplitudes of all the relevant

quantities assume the form

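$$\begin{aligned}
\operatorname{div}\mathbf{E} &= \frac{\rho}{\epsilon}, &\qquad \operatorname{curl}\mathbf{E} &= -\mathbf{j}^{(m)} + i\omega\mu\mathbf{H}, \\
\operatorname{div}\mathbf{H} &= \frac{\rho^{(m)}}{\mu}, &\qquad \operatorname{curl}\mathbf{H} &= \mathbf{j} - i\omega\epsilon\mathbf{E}.
\end{aligned} \tag{1.43}$$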

In these equations, ρ and j stand for complex amplitudes of harmonically

varying electric charge- and current densities that may include fictitious

surface charges and currents required to represent field vectors within

any given region without referring to the actual sources producing the

fields, assuming that the sources are external to the region. The charge-

and current densities satisfy the equations of continuity which, when

expressed in terms of the complex amplitudes, assume the form

$$-i\omega\rho^{(m)} + \operatorname{div}\mathbf{j}^{(m)} = 0, \qquad -i\omega\rho + \operatorname{div}\mathbf{j} = 0. \tag{1.44}$$

One observes that, with the magnetic charge- and current densities in-

cluded, the field equations assume a symmetrical form in the electric and

magnetic variables.

The field equations for free space are obtained from equations (1.43) on replacing ε and

µ with ε0 and µ0 respectively.

1.9.2.2 The Helmholtz equations

The field equations (1.43) involve the field vectors E and H coupled with

one another. One can, however, arrive at a pair of uncoupled second

order equations from the second and fourth equations by taking the curl

of both sides in each case, so as to arrive at

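$$\begin{aligned}
\operatorname{curl}\operatorname{curl}\mathbf{E} - k^2\mathbf{E} &= i\omega\mu\,\mathbf{j} - \operatorname{curl}\mathbf{j}^{(m)}, \\
\operatorname{curl}\operatorname{curl}\mathbf{H} - k^2\mathbf{H} &= i\omega\epsilon\,\mathbf{j}^{(m)} + \operatorname{curl}\mathbf{j}.
\end{aligned} \tag{1.45}$$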

In these equations, the parameter k is related to the angular frequency ω

as

$$k = \omega\sqrt{\epsilon\mu} = \frac{\omega}{v}, \tag{1.46}$$

with $v = \frac{1}{\sqrt{\epsilon\mu}}$, the phase velocity of a plane wave (see sec. 1.10) of angular

frequency ω in the medium under consideration.

Referring to plane waves (see sec. 1.10) of angular frequency ω, the ratio $k = \frac{\omega}{v}$ is

termed the propagation constant. It may be noted, however, that we are considering

here harmonic solutions of Maxwell’s equations that may be more general than plane

waves. Still, we will refer to k as the propagation constant corresponding to the angular

frequency ω.
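The step from (1.43) to (1.45) can be written out explicitly. The following is a sketch for the electric field only, assuming that ε and µ are uniform over the region considered, so that they can be taken outside the curl:

```latex
\begin{align*}
\operatorname{curl}\operatorname{curl}\mathbf{E}
  &= -\operatorname{curl}\mathbf{j}^{(m)} + i\omega\mu\,\operatorname{curl}\mathbf{H} \\
  &= -\operatorname{curl}\mathbf{j}^{(m)} + i\omega\mu\left(\mathbf{j} - i\omega\epsilon\,\mathbf{E}\right) \\
  &= -\operatorname{curl}\mathbf{j}^{(m)} + i\omega\mu\,\mathbf{j} + \omega^{2}\epsilon\mu\,\mathbf{E}.
\end{align*}
```

With $k^2 = \omega^2\epsilon\mu$ this is the first equation in (1.45). The second one follows in the same manner on taking the curl of the equation for curl H and substituting for curl E.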

The equations (1.45), now decoupled in E and H, are referred to as the

inhomogeneous Helmholtz equations for the field variables. In a region

free of the real or fictitious charges and currents, these reduce to the

homogeneous Helmholtz equations

$$(\nabla^2 + k^2)\mathbf{E} = 0, \qquad (\nabla^2 + k^2)\mathbf{H} = 0. \tag{1.47}$$

As we will see in chapter 5, the inhomogeneous Helmholtz equations are

of use in setting up a general formulation for solving diffraction problems.

An alternative approach for describing the harmonically varying fields

would be to make use of the electromagnetic potentials φ and A. In the

Lorentz gauge, the potentials for a harmonically varying electromagnetic

field satisfy the inhomogeneous Helmholtz equations

$$(\nabla^2 + k^2)\phi = -\frac{\rho}{\epsilon}, \qquad (\nabla^2 + k^2)\mathbf{A} = -\mu\,\mathbf{j}, \tag{1.48}$$

for real sources, i.e., in the absence of the fictitious magnetic charges

and currents. The potentials φ and A, as defined in sec. 1.3 are, however,

not symmetric with respect to the electric and magnetic field vectors,

and their definition is, moreover, not consistent with two of the Maxwell

equations (the equations for curl E and div B) in the presence of magnetic

charge- and current densities.

One can, however, adopt a broader approach and introduce an additional

vector potential C so that the vector potentials A and C taken together

(recall that the scalar potential φ associated with A can be eliminated

in favour of A by means of an appropriate gauge condition such as the

one corresponding to the Lorentz gauge) give a convenient representation

of the electric and magnetic fields in the presence of real and fictitious

charge- and current distributions. Such an approach gives a neat formu-

lation for solving a class of diffraction problems. The vector potentials A

and C are closely related to the Hertz potentials that are widely used for

a convenient description of electromagnetic fields in various contexts.

1. Equations (1.45), (1.47) hold for the space-time dependent real fields and poten-

tials E(r, t), H(r, t), φ(r, t), A(r, t), and the corresponding space-time dependent

complex quantities as well. We are, for the time being, considering only the space

dependent parts of the complex fields and potentials.

2. By analogy with equations (1.45), equations (1.48) are also referred to as the

inhomogeneous Helmholtz equations. Note the sign reversal in the two sets of

equations, which arises due to the definitions of the differential operators

$\nabla\times\nabla\times$ and $\nabla^2$.
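The sign reversal mentioned in the second remark, and the passage from (1.45) to (1.47), both rest on a standard vector identity. A short sketch, assuming a region free of real and fictitious sources so that div E = 0:

```latex
\begin{align*}
\operatorname{curl}\operatorname{curl}\mathbf{E}
  &= \operatorname{grad}\operatorname{div}\mathbf{E} - \nabla^{2}\mathbf{E}
   = -\nabla^{2}\mathbf{E}
  \qquad (\operatorname{div}\mathbf{E} = 0), \\
\operatorname{curl}\operatorname{curl}\mathbf{E} - k^{2}\mathbf{E}
  &= -\left(\nabla^{2} + k^{2}\right)\mathbf{E}.
\end{align*}
```

The operator on the left of (1.45) is thus the negative of the operator $\nabla^2 + k^2$ appearing in (1.47) and (1.48), which accounts for the opposite signs of the source terms in the two sets of equations.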

Solutions to the inhomogeneous Helmholtz equations under given bound-

ary conditions can be obtained by making use of the appropriate Green’s

functions. This will be explained more fully in chapter 5 in connection

with the formulation of a general approach for solving diffraction prob-

lems.

1.10 The plane monochromatic wave

A plane monochromatic wave constitutes, in a sense, the simplest solu-

tion to the Maxwell equations.

1.10.1 Plane monochromatic waves in vacuum

Let us imagine infinitely extended free space devoid of source charges

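in each and every finite volume in it, in which case Maxwell’s equations

(1.9a) - (1.9d) imply the homogeneous wave equations for the electromagnetic

field vectors E and H:

$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \qquad \nabla^2\mathbf{H} - \frac{1}{c^2}\frac{\partial^2\mathbf{H}}{\partial t^2} = 0 \qquad \left(c = \frac{1}{\sqrt{\epsilon_0\mu_0}}\right), \tag{1.49}$$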

while the potentials φ,A in the Lorentz gauge also satisfy the same wave

equation (see equations (1.20a), (1.20b), in which one has to assume ρ =

0, j = 0, and ε = ε0, µ = µ0).

It is to be noted that the wave equations (1.49) follow from the Maxwell

equations in free space but are not equivalent to these since they do not

imply the four equations (1.9a) - (1.9d).

A particular solution to eq. (1.49) as also of the Maxwell equations in free

space can be expressed in the complex representation as

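$$\mathbf{E} = \mathbf{E}_0\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)], \qquad \mathbf{H} = \mathbf{H}_0\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]. \tag{1.50a}$$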

The complex representation of a quantity is commonly expressed by putting a tilde over

the symbol for that quantity when expressed in the real form. Thus, for instance, the

complex representation for the electric intensity vector E is written as $\tilde{\mathbf{E}}$. In (1.50a), however,

we have omitted the tilde over the symbols expressing complex field intensities for the

sake of brevity. The tilde will be put in if the context so requires. Mostly, symbols

without the tilde can stand for either real quantities or their complex counterparts, and

the intended meaning in an expression is to be read from the context.

Here ω is any real number which we will assume to be positive without

loss of generality, and k, E0, H0 are constant vectors satisfying

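$$k^2 = \frac{\omega^2}{c^2}, \tag{1.50b}$$

$$\mathbf{E}_0\cdot\mathbf{k} = 0, \qquad \mathbf{H}_0 = \frac{1}{\mu_0\omega}\,\mathbf{k}\times\mathbf{E}_0 = \frac{1}{\mu_0 c}\,\mathbf{n}\times\mathbf{E}_0, \tag{1.50c}$$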

where n stands for the unit vector along k. The relations (1.50c) are seen

to be necessary if one demands that the field vectors given by (1.50a)

have to satisfy not only the wave equations (1.49) but all the four Maxwell

equations simultaneously.
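The relations (1.50b), (1.50c) can be illustrated numerically. The following is a minimal sketch, assuming real amplitude vectors and illustrative numbers (a 500 THz wave travelling along the z-axis) that are not taken from the text; the helper names `dot`, `cross` and `magnetic_amplitude` are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity (F/m)
MU0 = 4e-7 * math.pi             # vacuum permeability (H/m), classical defined value
C = 1.0 / math.sqrt(EPS0 * MU0)  # phase velocity in vacuum, c = 1/sqrt(eps0*mu0)


def dot(a, b):
    """Scalar product of two real 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two real 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def magnetic_amplitude(k_vec, e0, omega):
    """Return H0 = (k x E0) / (mu0 * omega), the second relation in (1.50c).

    Raises ValueError if the transversality condition E0 . k = 0 fails.
    """
    scale = math.sqrt(dot(k_vec, k_vec) * dot(e0, e0))
    if abs(dot(k_vec, e0)) > 1e-12 * scale:
        raise ValueError("E0 must be perpendicular to k")
    return tuple(comp / (MU0 * omega) for comp in cross(k_vec, e0))


# Illustrative values (assumed): wave along z, electric amplitude along x.
omega = 2 * math.pi * 5e14
k_vec = (0.0, 0.0, omega / C)    # |k| = omega / c, eq. (1.50b)
e0 = (1.0, 0.0, 0.0)
h0 = magnetic_amplitude(k_vec, e0, omega)
```

In this example H0 comes out along the y-axis with magnitude |E0|/(µ0c), so that E0, H0 and the unit vector along k form a right handed orthogonal triad. An amplitude E0 with a component along k is rejected, since it would violate div E = 0.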

The above solution (equations (1.50a) - (1.50c)) is said to represent a

monochromatic plane wave characterized by the angular frequency ω and

wave vector (or propagation vector) k. At any given point in space, the electric

and magnetic intensities oscillate sinusoidally in directions parallel

to E0 and H0 respectively with a time period $T = \frac{2\pi}{\omega}$, and with amplitudes

|E0|, |H0|.

Considering points on any straight line parallel to the propagation vector

k, the field vectors E and H are seen, from equations (1.50a), to vary sinusoidally

with the distance along the line, being repeated periodically at

intervals of length $\lambda = \frac{2\pi}{k}$, which implies that λ represents the wavelength

of the wave.

The expression Φ = k · r − ωt is referred to as the phase of the wave at

the point r and at time t, where the phase indicates the instantaneous

state of oscillation of the electric and magnetic field vectors at that point.

Since the phase occurs through the expression $e^{i\Phi}$, values of the phase

differing from one another by integral multiples of 2π are equivalent in the

sense that they correspond to the same state of oscillation of the electric

and magnetic vectors. Hence, what is of actual relevance is the reduced

phase φ ≡ Φ modulo 2π (thus, for instance, the phases $\Phi_1 = \frac{5\pi}{2}$ and $\Phi_2 = \frac{9\pi}{2}$

correspond to the same value of the reduced phase, $\phi = \frac{\pi}{2}$). At times the

reduced phase is referred to, simply, as the phase.
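The phase and the reduced phase can be computed directly from their definitions. The sketch below is only an illustration; the function names `phase` and `reduced_phase` are hypothetical, and the reduced phase is taken to lie in the interval [0, 2π).

```python
import math

TWO_PI = 2.0 * math.pi


def phase(k_vec, r, omega, t):
    """Phase Phi = k . r - omega * t at position r and time t."""
    return sum(k * x for k, x in zip(k_vec, r)) - omega * t


def reduced_phase(big_phi):
    """Reduced phase: Phi modulo 2*pi, returned in the interval [0, 2*pi).

    For floats, Python's % operator gives a result with the sign of the
    divisor, so negative phases are mapped into the same interval.
    """
    return big_phi % TWO_PI


phi_1 = reduced_phase(5 * math.pi / 2)
phi_2 = reduced_phase(9 * math.pi / 2)
```

Both `phi_1` and `phi_2` equal π/2 up to rounding, in agreement with the example quoted above.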

The relation (1.50c) tells you that the amplitude vectors E0 and H0, along

with the unit vector n along k form a right handed triad of orthogonal

vectors, where the direction of n is related to the directions of E0 and H0

in a right handed sense. Similar statements apply to the instantaneous

field vectors E(r, t), H(r, t) at any given point, and the unit vector n. In

this context, note that the oscillations of E and H at any given point in

space occur with the same phase.

Considering any given instant of time (t), points in space for which the

phase Φ is of any specified value (say, Φ = Φ0), lie on a plane perpendic-

ular to n, termed a wave front. Any other specified value (say, Φ = Φ1)

corresponds to another wave front parallel to this, and thus, one has a

family of wave fronts corresponding to various different values of Φ at any

given instant of time (see fig. 1.1). Since any straight line parallel to the

unit vector $\mathbf{n} = \frac{\mathbf{k}}{|\mathbf{k}|}$ is perpendicular to all these wave fronts, it is termed

the wave normal.

Imagining a succession of values of time (say, t = t1, t2, . . .), any of these

wave fronts (say, the one corresponding to Φ = Φ0) gets shifted along n to

successive parallel positions, and the distance through which the wave

front moves in any given time (say, τ ) can be seen to be cτ (check this out).

In other words, $c = \frac{1}{\sqrt{\epsilon_0\mu_0}}$ gives the velocity of any of the wave fronts along

the wave vector k (fig. 1.1). This is termed the phase velocity, and c is

thus seen to represent the phase velocity of plane electromagnetic waves

in vacuum. It is a universal constant and is also commonly referred to as

the velocity of light.
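The distance cτ quoted above can be checked in two lines. The sketch below uses only the definition Φ = k · r − ωt and the relation k = ω/c of eq. (1.50b); the symbol s, standing for the distance n · r of the wave front from the origin measured along n, is introduced here for illustration.

```latex
\begin{align*}
\Phi_0 &= \mathbf{k}\cdot\mathbf{r} - \omega t = k\,s(t) - \omega t
  \quad\Longrightarrow\quad s(t) = \frac{\Phi_0 + \omega t}{k}, \\
s(t+\tau) - s(t) &= \frac{\omega\tau}{k} = c\,\tau .
\end{align*}
```

The wave front with phase Φ0 thus advances along n through a distance cτ in time τ, whatever the value of Φ0.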

Figure 1.1: Illustrating the idea of propagating wave fronts for a plane wave; A, B denote wave fronts for two different values of the phase Φ at any given instant of time, which we take to be t = 0; the straight lines P1Q1 and P2Q2 are perpendicular to the wave fronts, and represent wave normals; considering any other time instant t = τ, the wave fronts are seen to have been shifted to new positions A′, B′ respectively, each by a distance vτ, where v stands for the phase velocity; in the case of plane waves in vacuum, v = c, a universal constant; for a dielectric medium, v depends on the frequency ω; n denotes the unit vector in the direction of the wave normals.

The above statements, all of which you will do well to check out by

yourself, describe the features of a plane monochromatic electromagnetic

wave, where the term ‘plane’ refers to the fact that the wave fronts at any given

instant are planes (parallel to one another) and the term ‘monochromatic’

to the fact that the electric and magnetic intensities at any given point

in space oscillate sinusoidally with a single frequency ω. A different set

of values of ω, k, and E0 (and correspondingly, of H0 given by the relation

(1.50c)) corresponds to a plane monochromatic wave of a different

description characterized, however, by the same phase velocity c (though

along a different direction). Such a plane wave is, moreover, referred to

as a progressive (or a propagating) one since, with the passage of time,

the wave fronts propagate along the wave normal. Moreover, as we will

see below, there occurs a propagation of energy as well by means of the

wave along the direction of the wave normal.

These features of propagation of wave fronts and of energy distinguish a

propagating wave from a stationary one (see sec. 1.16) where there does

not occur energy transport by means of the wave.

1.10.2 Plane waves in an isotropic dielectric

Plane wave solutions similar to those described in sec. 1.10.1 hold in the

case of an isotropic dielectric free of sources since, for such a medium,

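the Maxwell equations (1.1a) - (1.1d), along with the constitutive relations

(1.3a), (1.3b), reduce to a set of relations analogous to (1.9a) - (1.9d),

with $\epsilon = \epsilon_r\epsilon_0$, $\mu = \mu_r\mu_0$ replacing $\epsilon_0$, $\mu_0$ respectively (check this out). The

corresponding wave equations, analogous to (1.49), are

$$\nabla^2\mathbf{E} - \frac{\epsilon_r\mu_r}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \qquad \nabla^2\mathbf{H} - \frac{\epsilon_r\mu_r}{c^2}\frac{\partial^2\mathbf{H}}{\partial t^2} = 0. \tag{1.51}$$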

We assume for now that $\epsilon_r$, $\mu_r$ are real quantities for the medium under

consideration. In reality, while $\mu_r$ is real and ≈ 1 for most media of interest

in optics, $\epsilon_r$ turns out to be complex, having a real and an imaginary

part, where the latter accounts for the absorption of energy during the

passage of the wave through the medium.

The statement that the relative permittivity is a complex quantity has the following

significance: as a wave propagates through the dielectric medium under consideration,

it polarizes the medium, where the polarization vector P oscillates sinusoidally similarly

to the electric intensity E, but with a different phase. This aspect of wave propagation

in an isotropic dielectric will be discussed in greater detail in sec. 1.15.

For most dielectrics, however, the imaginary part of the relative permittivity

is small for frequencies belonging to ranges of considerable extent, and

is seen to assume significant values over small frequency ranges where

there occurs a relatively large absorption of energy in the medium under

consideration. In this section we consider a wave for which the absorption

can be taken to be zero in an approximate sense, and thus $\epsilon_r$ can be taken

to be a real quantity. Moreover, as mentioned above, we assume that $\mu_r$

is real and close to unity.

With these assumptions, the Maxwell equations in an isotropic dielectric

admit of the following monochromatic plane wave solution

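$$\mathbf{E} = \mathbf{E}_0\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)], \qquad \mathbf{H} = \mathbf{H}_0\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)], \tag{1.52a}$$

where the magnitude of the wave vector is given by

$$k \equiv |\mathbf{k}| = \frac{\omega}{c}\sqrt{\epsilon_r\mu_r} = \frac{\omega}{v} \ \text{(say)}, \tag{1.52b}$$

and where the vector amplitudes E0 and H0 satisfy

$$\mathbf{E}_0\cdot\mathbf{k} = 0, \qquad \mathbf{H}_0 = \frac{1}{\mu\omega}\,\mathbf{k}\times\mathbf{E}_0 = \frac{1}{\mu v}\,\mathbf{n}\times\mathbf{E}_0. \tag{1.52c}$$

In these formulae there occurs the expression

$$v = \frac{\omega}{k} = \frac{1}{\sqrt{\epsilon\mu}} = \frac{c}{\sqrt{\epsilon_r\mu_r}} = \frac{c}{n}, \tag{1.52d}$$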

where

$$n = \sqrt{\epsilon_r\mu_r}. \tag{1.52e}$$

Finally, in the formula (1.52a) the unit vector n giving the direction of the

propagation vector k can be chosen arbitrarily, implying that the plane

wave can propagate in any chosen direction.

The interpretation of the various quantities occurring in the above formu-

lae is entirely analogous to that of corresponding quantities for a plane

wave in free space. Thus, ω represents the (angular) frequency of oscillation

of the electric and magnetic intensities at any given point, $\lambda \equiv \frac{2\pi}{k}$

the wavelength, and v the phase velocity, where the phase velocity is defined

with reference to the rate of translation of the surfaces of constant

phase along the direction of the propagation vector k. The only new quan-

tity is the refractive index n that will be seen in sec. 1.12.1 to determine

the bending of the wave normal as the plane wave suffers refraction at a

plane interface into another medium. Finally, E0, H0, and k (or, equiva-

lently, the electric and magnetic vectors at any given point at any instant

of time, together with the propagation vector k) once again form a right

handed triad of orthogonal vectors.
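The quantities in (1.52b), (1.52d), (1.52e) are easily evaluated for a given medium. The following sketch assumes a lossless medium with real, positive ε_r and µ_r; the function name `plane_wave_parameters` and the numerical values (ε_r = 2.25, a frequency of 500 THz) are illustrative choices, not taken from the text.

```python
import math

C = 299_792_458.0  # phase velocity in vacuum (m/s)


def plane_wave_parameters(omega, eps_r, mu_r=1.0):
    """Refractive index, phase velocity, propagation constant and wavelength.

    Assumes real, positive eps_r and mu_r (absorption neglected).
    """
    if eps_r <= 0 or mu_r <= 0:
        raise ValueError("eps_r and mu_r must be real and positive here")
    n = math.sqrt(eps_r * mu_r)      # refractive index, eq. (1.52e)
    v = C / n                        # phase velocity, eq. (1.52d)
    k = omega / v                    # propagation constant, eq. (1.52b)
    wavelength = 2.0 * math.pi / k   # wavelength in the medium
    return {"n": n, "v": v, "k": k, "wavelength": wavelength}


# Illustrative medium: eps_r = 2.25, mu_r = 1, so that n = 1.5.
params = plane_wave_parameters(omega=2.0 * math.pi * 5e14, eps_r=2.25)
```

For the same angular frequency, both the phase velocity and the wavelength in the medium are smaller than their vacuum values by the factor n.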

As I have mentioned above, the interpretation of these quantities gets

modified when one takes into account the fact that the relative permittivity

$\epsilon_r$ is, in general, a complex quantity. This we will consider in sec. 1.15.

1.10.3 Energy density and intensity for a plane monochromatic wave

For a plane wave in an isotropic dielectric, the electric and magnetic field

vectors in complex form are given by expressions (1.52a), where the vec-

tors E0, H0 are related as in eq. (1.52c) (which reduces to (1.50c) in the

case of a plane wave in free space), and where the tildes over the com-

plex quantities have been omitted for the sake of brevity. However, the

relations (1.52c) remain valid even when the vectors are taken to be real.

The time averaged energy density and the Poynting vector in the field

of a monochromatic plane wave are obtained from expressions (1.36a)

and (1.40) respectively as

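$$\langle w\rangle = \frac{1}{4}\left(\epsilon E_0^2 + \mu H_0^2\right) = \frac{1}{2}\epsilon E_0^2, \tag{1.53a}$$

$$\langle\mathbf{S}\rangle = \frac{1}{2}E_0H_0\,\mathbf{n} = \frac{1}{2}\sqrt{\frac{\epsilon}{\mu}}\,E_0^2\,\mathbf{n}. \tag{1.53b}$$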

In these expressions, E0 and H0 stand for the amplitudes of the electric

intensity and the magnetic field strength, where both can be taken to be

real simultaneously (refer to the second relation in (1.52c); recall that we

are assuming absorption to be negligibly small).

Note that the time averaged energy density is a sum of two terms of equal magnitudes

relating to the electric and magnetic fields of the plane wave.

The two relations (1.53a), (1.53b) taken together imply that, for a plane

wave in an isotropic dielectric

$$\langle\mathbf{S}\rangle = \langle w\rangle v\,\mathbf{n}. \tag{1.54}$$

This can be interpreted as stating that the flow of energy carried by the

plane wave occurs, at any given point in the field, along k, the wave

vector, and the energy flux (rate of flow of energy per unit area through

an imagined surface perpendicular to the direction of flow at any given

point) equals the energy density times the phase velocity. As a corollary,

the velocity of energy propagation is seen to be v, the phase velocity in

the medium under consideration.

1. Here we have considered just a single monochromatic wave propagating through

the medium under consideration, for which the definition of energy flux is a no-

tional rather than an operational one. In practice, the definition of energy flux

carried by means of an electromagnetic field requires that a wave packet, consti-

tuting a signal be considered, in which case the phenomenon of dispersion is also

to be taken into account. All this requires more careful consideration before one

arrives at the concept of velocity of energy flow, for which see sec. 1.15.

2. In order to see why one can interpret the phase velocity v in (1.54) as the velocity

of energy flow, let us assume, for the moment, that the energy flow velocity is

u. Considering a point P and a small area δs around it perpendicular to the

direction of energy flow, imagine a right cylinder of length u erected on the base

δs. Evidently, then, the energy contained within this cylinder will flow out through

δs in unit time. In other words, the energy flux will be 〈w〉u. Comparing with

eq. (1.54), one gets u = v.
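The relation (1.54) between the energy flux and the energy density can be checked numerically from (1.53a), (1.53b). The sketch below assumes real ε and µ and a real amplitude E0; the function names and the numerical values are hypothetical choices made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
MU0 = 4e-7 * math.pi     # vacuum permeability (H/m), classical defined value


def mean_energy_density(e0, eps, mu):
    """Time averaged energy density, eq. (1.53a), with H0 = E0 * sqrt(eps/mu)."""
    h0 = e0 * math.sqrt(eps / mu)
    return 0.25 * (eps * e0 ** 2 + mu * h0 ** 2)


def mean_poynting_magnitude(e0, eps, mu):
    """Magnitude of the time averaged Poynting vector, eq. (1.53b)."""
    return 0.5 * math.sqrt(eps / mu) * e0 ** 2


def phase_velocity(eps, mu):
    """Phase velocity v = 1 / sqrt(eps * mu), eq. (1.52d)."""
    return 1.0 / math.sqrt(eps * mu)


# Illustrative medium and amplitude (assumed): eps_r = 2.25, mu_r = 1, E0 = 100 V/m.
eps, mu, e0 = 2.25 * EPS0, MU0, 100.0
w = mean_energy_density(e0, eps, mu)
s = mean_poynting_magnitude(e0, eps, mu)
v = phase_velocity(eps, mu)
```

Here `s` agrees with `w * v` up to rounding, and the electric and magnetic contributions to `w` are equal, each being εE0²/4.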

Formulae (1.52a) - (1.52c), with any specified vector E0, define a linearly

polarized plane wave of frequency ω and wave vector k, where one has to

have E0 · k = 0. Plane wave solutions with the same ω and k but other

states of polarization will be introduced in sec. 1.11.

From the relation (1.54), one obtains the intensity due to a linearly po-

larized plane monochromatic wave (refer to formula (1.41) where the unit

vector s is to be taken as n in the present context):

$$I = \frac{1}{2}\sqrt{\frac{\epsilon}{\mu}}\,E_0^2. \tag{1.55}$$

The plane monochromatic wave is, in a sense, the simplest solution to

Maxwell’s equations. Two other types of relatively simple solutions to

Maxwell’s equations, obeying a certain type of boundary conditions, are

the vector spherical and cylindrical waves (see sections 1.17.2, 1.18.2). In

general, exact solutions for Maxwell’s equations satisfying given bound-

ary conditions are rare. There exists an approximation scheme, com-

monly known as the geometrical optics approximation, to be discussed in

chapter 2, where the energy carried by the electromagnetic field is seen to

propagate along ray paths, the latter being orthogonal to a set of surfaces

termed the eikonal surfaces. For the plane wave solutions the eikonal

surfaces reduce to the wave fronts and the ray paths reduce to the wave

normals. In this sense, we will at times refer to ray paths while talking of

plane progressive waves.

1.11 States of polarization of a plane wave

1.11.1 Linear, circular, and elliptic polarization

As mentioned at the end of sec. 1.10.3, the linearly polarized plane wave

solution described in sec. 1.10.2 corresponds to only one among several

possible states of polarization of a monochromatic plane wave, where the

term ‘state of polarization’ refers to the way the instantaneous electric

and magnetic intensity vectors are related to the wave vector k.

Considering, for the sake of concreteness, a plane wave propagating along

the z-axis of a right handed Cartesian co-ordinate system (for which n, the

unit vector along the direction of propagation is e3, the unit vector along

the z-axis; we denote the unit vectors along the x- and y-axes as e1 and

e2), the relations (1.52c) imply that the amplitude vectors E0, H0 can point

along any two mutually perpendicular directions in the x-y plane. One

can assume, for instance, that these two point along e1, e2 respectively.

This will then mean that the electric and magnetic intensity vectors at

any point in space oscillate in phase with each other along the x- and

y-axes.

More generally, a linearly polarized monochromatic plane wave propagat-

ing along the z-axis can have its electric vector oscillating along any other

fixed direction in the x-y plane, in which case its magnetic vector will os-

cillate along a perpendicular direction in the same plane, where one has

to keep in mind that for a plane progressive wave the electric vector, the

magnetic vector, and the direction of propagation have to form a right

handed orthogonal triad - a requirement imposed by Maxwell’s equations.

Thus, one can think of a linearly polarized plane monochromatic wave

propagating in the z-direction, where the directions of oscillation of the

electric and magnetic intensities in the x-y plane are as shown in fig. 1.2.

Figure 1.2: Depicting the directions of oscillation (dotted lines inclined to the x- and y-axes) of the electric and magnetic field vectors of a linearly polarized plane progressive wave propagating along the z-axis (perpendicular to the plane of the figure, coming out of the plane; the plane of the figure is taken to be z = 0), where the direction of the electric intensity is inclined at an angle θ with the x-axis; correspondingly, the direction of the magnetic vector is inclined at the same angle with the y-axis, the two vectors being shown at an arbitrarily chosen instant of time; the wave is obtained by a superposition of two linearly polarized waves, one with the electric vector oscillating along the x-axis and the other with the electric vector oscillating along the y-axis, the phases of the two waves being the same.

Such a linearly polarized wave can be looked upon as a superposition

of two constituent waves, each linearly polarized, the phase difference

between the two waves being zero. More precisely, consider the following

two plane waves, both with a frequency ω and both propagating along

the z-axis, and call these the x-polarized wave and the y-polarized wave

respectively:

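$$\text{($x$-polarized wave)} \quad \mathbf{E}_1 = \mathbf{e}_1A_1\exp[i(kz - \omega t)], \qquad \mathbf{H}_1 = \mathbf{e}_2\frac{A_1}{\mu v}\exp[i(kz - \omega t)], \tag{1.56a}$$

$$\text{($y$-polarized wave)} \quad \mathbf{E}_2 = \mathbf{e}_2A_2\exp[i(kz - \omega t)], \qquad \mathbf{H}_2 = -\mathbf{e}_1\frac{A_2}{\mu v}\exp[i(kz - \omega t)]. \tag{1.56b}$$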

Here A1 and A2 are positive constants representing the amplitudes of

oscillation of the electric intensities for the x- and the y-waves. Evidently,

these formulae represent linearly polarized waves, the first one with the

vectors E, H oscillating along the x- and y-axes respectively, and the

second one with these vectors oscillating along the y- and x-axes, where

in each case, the instantaneous electric and magnetic intensities and the

unit vector e3 form a right handed orthogonal triad.

The superposition of these two waves with the same phase,

$$\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2, \qquad \mathbf{H} = \mathbf{H}_1 + \mathbf{H}_2, \tag{1.57a}$$

then gives rise to the linearly polarised plane wave described by equa-

tions (1.52a) - (1.52c) where, now

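$$\mathbf{n} = \mathbf{e}_3, \qquad \mathbf{E}_0 = \mathbf{e}_1A_1 + \mathbf{e}_2A_2, \qquad \mathbf{H}_0 = \frac{1}{\mu v}\,\mathbf{e}_3\times\mathbf{E}_0, \tag{1.57b}$$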

the directions of E0 and H0 being as depicted in fig. 1.2, with θ given by

$$\tan\theta = \frac{A_2}{A_1}. \tag{1.57c}$$

More generally, one can consider a superposition of the two linearly polar-

ized waves (1.56a), (1.56b) (which we have referred to as the x-polarized

wave and the y-polarized wave respectively), but now with a phase differ-

ence, say δ:

$$\mathbf{E} = \mathbf{E}_1 + e^{i\delta}\mathbf{E}_2, \qquad \mathbf{H} = \mathbf{H}_1 + e^{i\delta}\mathbf{H}_2. \tag{1.58}$$

Considering the y-polarized wave in isolation, the multiplication of E2,

H2 with the phase factor $e^{i\delta}$ does not change the nature of the wave,

since only the common phase of oscillations of the electric and magnetic

intensities is changed. But the above superposition (eq. (1.58)) with an

arbitrarily chosen value of the phase angle δ (which we assume to be

different from 0 or π, see below) does imply a change in the nature of

the resulting wave in that, while the instantaneous electric and magnetic

intensities and the propagation vector still form a right handed triad,

the electric and the magnetic intensities now no longer point along fixed

directions as in the case of a linearly polarized wave.

Thus, for instance, if one chooses A1 = A2 (= A), say, and $\delta = \frac{\pi}{2}$ or $-\frac{\pi}{2}$,

then it is found that the tip of the directed line segment representing the

instantaneous electric intensity E (which here denotes the real electric

intensity vector rather than its complex representation) describes a circle

in the x-y plane of radius A, while a similar statement applies to H as

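well. With the time dependence $e^{-i\omega t}$ of (1.56a), (1.56b), the real components

at z = 0 are $E_x = A\cos\omega t$, $E_y = A\cos(\omega t - \delta)$. For $\delta = \frac{\pi}{2}$, the direction

of rotation of the vector is therefore anticlockwise, i.e., from the x-axis

towards the y-axis, while the rotation is clockwise for $\delta = -\frac{\pi}{2}$

(check this out; see fig. 1.3(A), (B)). These are said to correspond to

left handed and right handed circularly polarized waves respectively.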
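One way to check the shape of the curve and the sense of rotation for a given δ is to follow the real electric vector at z = 0 over one period. The sketch below assumes the complex fields of (1.56a), (1.56b), (1.58) with real A1, A2; the function names are hypothetical. The sense of rotation is read off from the sign of the z-component of E(t) × E(t + Δt), summed over a period.

```python
import cmath
import math


def real_field(a1, a2, delta, omega_t):
    """Real (Ex, Ey) at z = 0 for E = E1 + exp(i*delta) * E2, eq. (1.58).

    The time dependence exp(-i*omega*t) of eqs. (1.56a), (1.56b) is used.
    """
    time_factor = cmath.exp(-1j * omega_t)
    ex = (a1 * time_factor).real
    ey = (a2 * cmath.exp(1j * delta) * time_factor).real
    return ex, ey


def rotation_sense(a1, a2, delta, steps=360):
    """Sum over one period of the z-component of E(t) x E(t + dt).

    Positive: anticlockwise (x-axis towards y-axis), seen with the wave
    coming out of the plane; negative: clockwise; zero: linear polarization.
    """
    step = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x0, y0 = real_field(a1, a2, delta, i * step)
        x1, y1 = real_field(a1, a2, delta, (i + 1) * step)
        total += x0 * y1 - y0 * x1
    return total


sense_plus = rotation_sense(1.0, 1.0, math.pi / 2)
sense_minus = rotation_sense(1.0, 1.0, -math.pi / 2)
```

With this time dependence each term of the sum is proportional to sin δ, so `sense_plus` is positive (anticlockwise rotation) and `sense_minus` is negative (clockwise rotation), while |E| stays equal to A throughout the period in both cases.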

Figure 1.3: (A) Left-handed and (B) right-handed circular polarization; considering the variation of the electric intensity at the origin of a chosen co-ordinate system, the tip of the electric vector describes a circle in the x-y plane, where the wave propagates along the z-direction, coming out of the plane of the paper; the direction of rotation of the electric intensity vector is anticlockwise in (A) and clockwise in (B).

As seen above, a superposition of the x-polarized wave and the y-polarized

wave with the phase difference δ = 0 results in a linearly polarized wave

with the direction of polarization (i.e., the line of oscillation of the electric

intensity at any given point in space; in fig. 1.2 we take this point to

be at the origin of a chosen right handed co-ordinate system) inclined at

an angle θ given by (1.57c). The value δ = π, on the other hand, again

gives a linearly polarized wave with θ now given by $\tan\theta = -\frac{A_2}{A_1}$ (check this

statement out).
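The case δ = π can be verified in one line from eq. (1.58), using only $e^{i\pi} = -1$ and the expressions (1.56a), (1.56b) for the two components:

```latex
\begin{align*}
\mathbf{E} = \mathbf{E}_1 + e^{i\pi}\mathbf{E}_2
  = \left(\mathbf{e}_1 A_1 - \mathbf{e}_2 A_2\right)\exp[i(kz - \omega t)]
  \quad\Longrightarrow\quad
  \mathbf{E}_0 = \mathbf{e}_1 A_1 - \mathbf{e}_2 A_2, \qquad
  \tan\theta = -\frac{A_2}{A_1}.
\end{align*}
```

The amplitude vector is again real and fixed in direction, so the wave is linearly polarized, with the line of oscillation now lying in the second and fourth quadrants of the x-y plane.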

Considering now the general case in which δ is different from the special

values 0, π (and, for A1 = A2, the values $\delta = \pm\frac{\pi}{2}$), one finds that the tip of

the electric intensity vector describes an ellipse in the x-y plane (where,

for the sake of concreteness, we consider the variation of the electric in-

tensity at the origin of a chosen right handed co-ordinate system). Once

again, the direction of rotation of the electric intensity vector can be anti-

clockwise or clockwise, depending on the value of δ, corresponding to left

handed and right handed elliptic polarization respectively (fig. 1.4).

Figure 1.4: (A) Left-handed and (B) right-handed elliptic polarization; the tip of the electric vector describes an ellipse in the x-y plane, with the direction of describing the ellipse being different in (A) as compared to (B); the direction of propagation in either case is perpendicular to the plane of the figure, coming out of it; the principal axes of the ellipse are, in general, inclined to the x- and y-axes chosen.

1.11.2 States of polarization: summary

Choosing a co-ordinate system with its z-axis along the direction of propa-

gation (with the x- and y-axes chosen arbitrarily in a perpendicular plane,

so that the three axes form a right handed Cartesian system), the vari-

ous possible states of polarization of a monochromatic plane wave can be

described in terms of superpositions of two basic linearly polarized com-

ponents, referred to above as the x-polarized wave (eq. (1.56a)) and the

y-polarized wave (eq. (1.56b)). The amplitudes of oscillation of the electric

intensities of these two basic components, say, A1, A2, constitute two of

the three independent parameters in terms of which a state of polariza-

tion is determined completely.

The third parameter is the phase difference δ with which the two basic

components are superposed (eq. (1.58)).

In these equations describing the basic components and their superposition, the resul-

tant electric and magnetic vectors (E, H) are expressed in the complex form, with the

tildes over the relevant symbols omitted for the sake of convenience. The vectors mak-

ing up the component waves are real ones or, equivalently, complex vectors with phases

chosen to be zero.

Depending on the values of these parameters one can have a linearly

polarized wave (δ = 0, π), circularly polarized wave (A1 = A2, $\delta = \pm\frac{\pi}{2}$), or

an elliptically polarized wave propagating along the z-axis. In the general

case, the lengths of the principal axes of the ellipse, their orientation with

respect to the x- and the y-axes, and the sense of rotation in which the

ellipse is described, are all determined by the three parameters A1, A2, δ.
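How the three parameters fix the ellipse can be made explicit. The sketch below uses the real components Ex = A1 cos ωt, Ey = A2 cos(ωt − δ) at z = 0 that follow from (1.56a), (1.56b), (1.58), together with the standard result tan 2ψ = 2A1A2 cos δ / (A1² − A2²) for the inclination ψ of the major axis to the x-axis; the function name `polarization_ellipse` and the symbol ψ are introduced here for illustration.

```python
import math


def polarization_ellipse(a1, a2, delta):
    """Semi-axes, orientation and sense of the polarization ellipse.

    Returns (semi_major, semi_minor, psi, sense), where psi is the angle of
    the major axis with the x-axis (not meaningful for a circle) and sense
    is positive for anticlockwise rotation, negative for clockwise rotation,
    and zero for linear polarization.
    """
    total = a1 ** 2 + a2 ** 2
    cross_term = 2.0 * a1 * a2 * math.cos(delta)
    root = math.sqrt((a1 ** 2 - a2 ** 2) ** 2 + cross_term ** 2)
    semi_major = math.sqrt(0.5 * (total + root))
    semi_minor = math.sqrt(max(0.0, 0.5 * (total - root)))  # guard rounding
    psi = 0.5 * math.atan2(cross_term, a1 ** 2 - a2 ** 2)
    sense = a1 * a2 * math.sin(delta)
    return semi_major, semi_minor, psi, sense


# Illustrative case (assumed values): A1 = 2, A2 = 1, delta = pi/2.
major, minor, psi, sense = polarization_ellipse(2.0, 1.0, math.pi / 2)
```

In this example the principal axes lie along the co-ordinate axes, with semi-axes 2 and 1. For δ = 0 the minor semi-axis vanishes and ψ reduces to the angle θ of eq. (1.57c).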

1.11.3 Intensity of a polarized plane wave

Consider a monochromatic plane wave in any one of linear, circular and

elliptic states of polarization, obtained by the superposition (eq. (1.58)) of

the two basic components described by formulae (1.56a), (1.56b), where

the fields are all expressed in the complex form, to be distinguished here

from the real field vectors by tildes attached over their respective symbols.

In this more precise notation, then, the time averaged Poynting vector

assumes the form

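$$\langle\mathbf{S}\rangle = \langle\mathbf{E}\times\mathbf{H}\rangle = \frac{1}{4}\langle\tilde{\mathbf{E}}\times\tilde{\mathbf{H}}^{*} + \tilde{\mathbf{E}}^{*}\times\tilde{\mathbf{H}}\rangle. \tag{1.59}$$

Making use of eq. (1.58) in this expression, one finds

$$\langle\tilde{\mathbf{E}}\times\tilde{\mathbf{H}}^{*}\rangle = \langle\tilde{\mathbf{E}}_1\times\tilde{\mathbf{H}}_1^{*} + \tilde{\mathbf{E}}_2\times\tilde{\mathbf{H}}_2^{*}\rangle = \frac{1}{\mu v}\left(A_1^2 + A_2^2\right)\mathbf{e}_3, \tag{1.60}$$

while $\langle\tilde{\mathbf{E}}^{*}\times\tilde{\mathbf{H}}\rangle$ may be seen to have the same value as well.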

In other words, one has

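$$\langle\mathbf{S}\rangle = \frac{1}{2}\sqrt{\frac{\epsilon}{\mu}}\left(A_1^2 + A_2^2\right)\mathbf{e}_3 = \langle\mathbf{S}_1\rangle + \langle\mathbf{S}_2\rangle, \tag{1.61}$$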

where S1, S2 stand for the Poynting vectors for the two basic compo-

nents, the x-polarized and the y-polarized waves, considered separately.

Correspondingly, the intensity of the superposed wave is the sum of the

intensities due to the two component waves considered one in absence of

the other:

$$I = \frac{1}{2}\sqrt{\frac{\epsilon}{\mu}}\left(A_1^2 + A_2^2\right) = I_1 + I_2. \tag{1.62}$$

This is an interesting and important result: because of the orthogonality

of the x-polarized and the y-polarized waves, the intensity of the polar-

ized plane wave obtained by their superposition is simply the sum of the

intensities due to the two waves considered one in absence of the other,

regardless of the phase difference δ between the two.

This implies, in particular, the following relation between I1, I2, and I

in the case of a linearly polarized wave for which the electric intensity

oscillates along a line inclined at an angle θ to the x-axis,

$$I_1 = I\cos^2\theta, \qquad I_2 = I\sin^2\theta, \tag{1.63}$$

and, in the case of a circularly polarized wave,

$$I_1 = I_2 = \frac{I}{2}. \tag{1.64}$$

(Check these statements out).
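The independence of the intensity from δ can also be checked by averaging the real Poynting vector numerically over one period. The sketch below assumes the real fields at z = 0 that follow from (1.56a), (1.56b), (1.58), with 1/(µv) written as √(ε/µ); the function name `mean_poynting_z` and the numerical values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
MU0 = 4e-7 * math.pi     # vacuum permeability (H/m), classical defined value


def mean_poynting_z(a1, a2, delta, eps, mu, steps=1000):
    """Average over one period of the z-component of the real E x H at z = 0."""
    admittance = math.sqrt(eps / mu)          # equals 1 / (mu * v)
    total = 0.0
    for i in range(steps):
        tau = 2.0 * math.pi * i / steps       # tau stands for omega * t
        ex = a1 * math.cos(tau)               # x-polarized component
        ey = a2 * math.cos(tau - delta)       # y-polarized component, shifted by delta
        hx = -admittance * ey                 # H2 is along -e1, eq. (1.56b)
        hy = admittance * ex                  # H1 is along +e2, eq. (1.56a)
        total += ex * hy - ey * hx
    return total / steps


# Illustrative amplitudes (assumed): A1 = 3, A2 = 4, in vacuum.
expected = 0.5 * math.sqrt(EPS0 / MU0) * (3.0 ** 2 + 4.0 ** 2)
values = [mean_poynting_z(3.0, 4.0, d, EPS0, MU0)
          for d in (0.0, 0.3, math.pi / 2, 2.0, math.pi)]
```

All entries of `values` agree with `expected`, the right hand side of eq. (1.62), up to rounding, whatever the phase difference chosen.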

1.11.4 Polarized and unpolarized waves

It is the vectorial nature of an electromagnetic wave, where the field vari-

ables are vectors, that implies that a complete description of a monochro-

matic plane wave has to include the specification of its state of polariza-

tion. This is in contrast to a scalar wave where a plane wave is specified

completely in terms of its angular frequency, wave vector, and amplitude.

The angular frequency ω and the wave vector k are related to each other as ω2 = v2k2,

where v stands for the phase velocity in the medium under consideration.

A plane wave in any of the states of polarization mentioned above is

termed a polarized wave. By contrast, one can have an unpolarized plane

wave as well. However, the description of an unpolarized plane wave in-

volves a new concept that we have not met with till now, namely, that of

an electromagnetic field being an incoherent one. The concept of coher-

ence of an electromagnetic wave will be introduced in sec. 1.21, and will

be discussed in greater detail in chapter 7. Here I include a brief outline

of the concepts of coherence and incoherence in the context of the states

of polarization of a plane wave.

If we consider any of the field vectors, say E at any point (say, r) at succes-

sive instants of time, say, t1, t2, t3, . . ., and compare the resulting sequence

of values of the field vector with the sequence of values at instants, say

t1 + τ, t2 + τ, . . ., we will find that the degree of resemblance between the

two sequences depends, in general, on the time interval τ. In some situations,

the resemblance persists even for large values of τ, which turns out

to be the case for a polarized plane wave. One expresses this by saying

that the polarized plane wave represents a coherent time dependent field

at the point under consideration. If, on the other hand, the resemblance

is lost even for small values of τ, one has an incoherent wave.

In practice, one can characterize a wave by its degree of coherence, where

complete coherence and complete incoherence correspond to two extreme

types while electromagnetic or optical fields in commonly encountered

set-ups correspond to an intermediate degree of, or partial, coherence.

Imagine now a superposition of the x-polarized and y-polarized waves in-

troduced above, where the amplitudes A1, A2, and the phase difference

δ are random variables. Such a wave may result, for instance, from the

emission of radiation from a large number of identical but uncorrelated

atoms, that may effectively be described in terms of a superposition of the

form (1.58) where the parameters A1, A2, δ are random variables with cer-

tain probability distributions over ranges of possible values. This, then,

constitutes an unpolarized plane wave with angular frequency ω and di-

rection of propagation e3, where the parameters A1, A2, δ cannot be as-

signed determinate values.

By contrast, a polarized wave results when a large number of atoms emit x-polarized and

y-polarized radiation in a correlated manner. A laser followed by a polaroid constitutes

a practical example of a coherent source of polarized light, while the radiation from a

flame is unpolarized and incoherent.

For a completely unpolarized wave, A1 and A2 are characterized by iden-

tical probability distributions and the electric intensity vector in the x-y

plane fluctuates randomly, the fluctuations of the x- and y-components

being identical in the long run. For such a wave the intensities I1, I2 of

the x- and y-polarized components (recall that the definition of intensity

involves an averaging in time) are related to the intensity of the resultant

wave as

$$ I_1 = I_2 = \frac{I}{2}. \tag{1.65} $$
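A rough Monte Carlo sketch of this statement is given below. It assumes, purely for illustration, that A1 and A2 are independent and Rayleigh distributed (any common distribution would serve equally well), that intensities are measured in units of √(ǫ/µ), and that an average over many realizations stands in for the long-time average; the function name `simulate_unpolarized` is hypothetical. Since, by eq. (1.62), the phase difference δ does not enter the intensity, it is not sampled here.

```python
import numpy as np


def simulate_unpolarized(n_realizations=200_000, seed=0):
    """Return (I1, I2, I) for random amplitudes with identical distributions.

    Intensities are in units of sqrt(eps/mu), so that I_k = <A_k^2> / 2.
    The Rayleigh distribution is an illustrative assumption only.
    """
    rng = np.random.default_rng(seed)
    a1 = rng.rayleigh(scale=1.0, size=n_realizations)
    a2 = rng.rayleigh(scale=1.0, size=n_realizations)
    i1 = 0.5 * float(np.mean(a1**2))
    i2 = 0.5 * float(np.mean(a2**2))
    return i1, i2, i1 + i2


if __name__ == "__main__":
    i1, i2, i = simulate_unpolarized()
    print(i1 / i, i2 / i)  # both ratios should be close to one half
```

The two ratios approach 1/2 as the number of realizations grows, whatever common distribution is chosen for the two amplitudes.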

Finally, I should mention that the concept of the state of polarization

of a wave is not specific to plane waves alone. I have talked of po-

larization in the context of plane progressive electromagnetic waves in

this section. However, the concept of polarization extends to electromag-

netic waves of certain other descriptions as well where the directions of

oscillations of the electric and magnetic field vectors bear a definite

and characteristic relationship with the direction of propagation of the

wave. Instances of waves that can be characterized in such a manner are

the transverse magnetic (TM) and transverse electric

(TE) spherical waves in regions of space far from their sources. Similar

characterizations are possible for a class of cylindrical waves as well

(see sections 1.17, 1.18 for an introduction to spherical and cylindrical

waves). However, I will not enter here into a detailed description and

analysis of these waves.

1.12 Reflection and refraction at a planar interface

1.12.1 The fields and the boundary conditions

Fig. 1.5 depicts schematically a plane wave incident on the plane interface

separating two homogeneous media (say, A and B) with refractive indices

n1, n2, where a co-ordinate system is chosen with the plane interface

lying in its x-y plane, so that the normal to the interface at any point on

it points along the z-axis. The figure shows a wave normal intersecting

the interface at O, where the wave normal can be described, for the plane

wave under consideration, as a ray incident at O (see sec. 1.10.3). The

wave front is then perpendicular to the ray, with the electric and magnetic

field vectors oscillating in the plane of the wave front. The plane of the

figure, containing the incident ray and the normal to the surface at O

(referred to as the ‘plane of incidence’), is the x-z plane of the co-ordinate

system chosen and the unit vector along the direction of the ray is, say,

$$ \mathbf{n} = \mathbf{e}_1 \cos\theta + \mathbf{e}_3 \sin\theta, \tag{1.66} $$

where e1 and e3 denote unit vectors along the x- and z-axes, and θ is the

angle made by the ray with the interface, i.e., in the present case, with

the x-axis.

Figure 1.5: Plane wave incident on a plane interface separating two media: illustrating the laws of reflection and refraction; a wave incident on the interface with its wave normal along n gives rise to a reflected wave and a refracted one, with wave normals along m1 and m2 respectively; the three wave normals (which we refer to as the incident, reflected, and refracted rays, see sec. 1.10.3) have to be geometrically related in a certain manner (laws of reflection and refraction) so that a certain set of boundary conditions can be satisfied on the interface; the angles of incidence, reflection, and refraction (φ, φ′, ψ) are shown (refer, in this context, to the sign convention for angles briefly outlined in the paragraph following eq. (1.70)).

Because of the presence of the interface between the two media, the in-

cident plane wave all by itself cannot satisfy Maxwell’s equations every-

where in the regions occupied by both these two (reason out why). In-

stead, we seek a solution which consists of a superposition of two plane

waves in the region of medium A, and one plane wave in the region of

medium B as in fig. 1.5, where we call these the incident wave (along n),

the reflected wave (along m1), and the refracted wave (along m2). The

instantaneous electric and magnetic field intensities in the regions of

medium A and medium B can then be represented as follows, where we

assume the complex form for the vectors (without, however, using tildes

over the relevant symbols):

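$$
\begin{aligned}
\text{(medium A)} \quad & \mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2, \quad \mathbf{H} = \mathbf{H}_1 + \mathbf{H}_2 = \frac{1}{\mu_1 v_1} (\mathbf{n} \times \mathbf{E}_1 + \mathbf{m}_1 \times \mathbf{E}_2), \\
\text{(medium B)} \quad & \mathbf{E} = \mathbf{E}_3, \quad \mathbf{H} = \mathbf{H}_3 = \frac{1}{\mu_2 v_2} (\mathbf{m}_2 \times \mathbf{E}_3),
\end{aligned} \tag{1.67a}
$$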

where the fields E1, E2, E3 are of the form

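$$ \mathbf{E}_1 = \mathbf{A}_1 \exp\left[ i\omega \left( \frac{\mathbf{n} \cdot \mathbf{r}}{v_1} - t \right) \right], \quad \mathbf{E}_2 = \mathbf{A}_2 \exp\left[ i\omega \left( \frac{\mathbf{m}_1 \cdot \mathbf{r}}{v_1} - t \right) \right], \quad \mathbf{E}_3 = \mathbf{A}_3 \exp\left[ i\omega \left( \frac{\mathbf{m}_2 \cdot \mathbf{r}}{v_2} - t \right) \right], \tag{1.67b} $$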

with the amplitudes A1,A2,A3 satisfying

$$ \mathbf{A}_1 \cdot \mathbf{n} = 0, \quad \mathbf{A}_2 \cdot \mathbf{m}_1 = 0, \quad \mathbf{A}_3 \cdot \mathbf{m}_2 = 0. \tag{1.67c} $$

I will first explain what the symbols and the equations stand for, and then

I want you to take your time having a good look at these so that you can

go on to the subsequent derivations (some parts of which I will ask you

to work out yourself).

First of all, I must tell you that these equations are in the nature of an in-

formed guess about what we expect in the context of the given situation,

where we assume that there is a monochromatic source and a collimat-

ing system located at an infinitely large distance from the interface (there

being no other source in either of the two media), sending out a parallel

beam of rays of infinite width (the incident plane wave) in the direction of

the unit vector n, and that the source has been switched on in the infinite

past so that everything is in a steady state, and the fields vary harmon-

ically with angular frequency ω. Observations tell us that there occur a

reflected and a refracted beam for which we assume plane wave expres-

sions. But mind you, these are not plane waves in the strict sense since

each is localised in a half space, namely the regions occupied by either of

the two media as the case may be. You don’t have three separate plane

waves here. Instead, the expressions (1.67a) - (1.67c) are assumed to

constitute one single solution. As yet, these expressions involve a number

of undetermined constants that will be fixed by the use of a number

of appropriate boundary conditions.

In these expressions, E1, E2, E3 describe the electric intensity vectors

corresponding to the incident wave, the reflected wave, and the refracted

wave respectively, while H1, H2, H3 describe the corresponding magnetic

vectors. Each of these expressions formally resembles the field due to a

plane wave though, as explained above, it is confined only to a half space.

However, because of this formal identity, the guess solution I have written

down above satisfies Maxwell’s equations in each of the two media con-

sidered in isolation (check this out). What remains, though, is the matter

of the boundary conditions the field vectors must satisfy at the interface.

These boundary conditions are to be made use of in determining the unit

wave normals m1, m2, i.e., the directions of the reflected and refracted

waves for any given direction of the incident wave (n), and the amplitudes

(in general complex) A2, A3 of these waves for a given incident amplitude

A1 (which can be assumed to be real), where these are to satisfy the rela-

tions (1.67c). Incidentally, in the above expressions, v1, v2 stand for the

phase velocities of monochromatic plane waves of frequency ω in the two

media, so that

$$ n_1 = \frac{c}{v_1}, \qquad n_2 = \frac{c}{v_2}, \tag{1.67d} $$

and µ1, µ2 are the respective permeabilities.

The relevant boundary conditions are given, first, by the second relation

in eq. (1.8a) and then, by the second relation in (1.8b), where Σ is taken

to be the interface separating the two media under consideration. The

former states that the tangential component of the electric intensity E is

to be continuous across the interface, while the latter relates to the con-

tinuity of the tangential component of the magnetic field vector H, which

holds because of the fact that there is no free surface current on the in-

terface (K = 0). The other two boundary conditions in (1.8a), (1.8b) are

found not to give rise to any new relations between the field components.

1.12.2 The laws of reflection and refraction

A necessary condition for the above continuity conditions to hold is that

the phases of the incident, reflected, and refracted wave forms must be

continuous across the interface, which we have assumed to be the plane

z = 0 of the chosen co-ordinate system. This implies that, first of all, the

vectors m1 and m2 have to lie in the x-z plane (check this out) - the law of

co-planarity for reflection and refraction - and, moreover,

$$ \frac{1}{v_1} (x \sin\phi + z \cos\phi) = \frac{1}{v_1} (x\, m_{1x} + z\, m_{1z}) = \frac{1}{v_2} (x\, m_{2x} + z\, m_{2z}), \qquad (z = 0) \tag{1.68} $$

(check this out), where the suffixes x, z refer to the x- and z-components

of the unit vectors indicated. In writing these relations, I have made use

of the formula

$$ \mathbf{n} = \mathbf{e}_1 \sin\phi + \mathbf{e}_3 \cos\phi, \tag{1.69a} $$

where φ is the angle of incidence shown in fig. 1.5 (φ = π/2 − θ, see eq. (1.66)).

The unit vectors m1, m2 along the directions of propagation of the reflected

and refracted waves can similarly be expressed in terms of the angles of

reflection and refraction φ′ and ψ:

$$ \mathbf{m}_1 = -\mathbf{e}_1 \sin\phi' - \mathbf{e}_3 \cos\phi', \tag{1.69b} $$

$$ \mathbf{m}_2 = \mathbf{e}_1 \sin\psi + \mathbf{e}_3 \cos\psi, \tag{1.69c} $$

where the negative sign in the first term on the right hand side of eq. (1.69b)

is explained below.

In other words, one has the law of angles for reflection and refraction

(commonly referred to, in the latter case, as Snell’s law)

$$ \phi' = -\phi, \qquad n_1 \sin\phi = n_2 \sin\psi. \tag{1.70} $$
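The phase-matching argument leading to these laws can be turned into a small numerical sketch. The code below is only an illustration under stated assumptions: the function name `wave_normals` is hypothetical, angles are in radians and signed as in the text, and the factor 1/v in (1.68) is replaced by the refractive index since, by (1.67d), the common factor c drops out.

```python
import numpy as np


def wave_normals(phi, n1, n2):
    """Unit wave normals (n, m1, m2) in the x-z plane for incidence angle phi.

    Uses the laws (1.70): phi' = -phi and n1 sin(phi) = n2 sin(psi).
    """
    sin_psi = n1 * np.sin(phi) / n2
    if abs(sin_psi) > 1.0:
        raise ValueError("no real angle of refraction for this angle of incidence")
    psi = np.arcsin(sin_psi)
    phi_r = -phi  # signed angle of reflection
    n = np.array([np.sin(phi), 0.0, np.cos(phi)])          # eq. (1.69a)
    m1 = np.array([-np.sin(phi_r), 0.0, -np.cos(phi_r)])   # eq. (1.69b)
    m2 = np.array([np.sin(psi), 0.0, np.cos(psi)])         # eq. (1.69c)
    return n, m1, m2


if __name__ == "__main__":
    n1, n2 = 1.0, 1.5
    n, m1, m2 = wave_normals(np.radians(35.0), n1, n2)
    # Phase matching on z = 0: the x-components, weighted by 1/v, must agree.
    print(n1 * n[0], n1 * m1[0], n2 * m2[0])
```

The three printed numbers coincide, which is just the statement (1.68) restricted to the plane z = 0.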

I owe you an explanation for the way I have written down the first of

these relations, which relates to the first term in (1.69b). What I have

in mind here is the sign convention in geometrical optics, which I will state

in detail in section 3.2.2. This is nothing but the convention for angles

and distances that one adopts in co-ordinate geometry. In the case of

angles, for instance, a certain straight line is taken as the reference line

and the angle made by any other line with this reference line is taken to

have positive or negative sign if one needs to impart a counterclockwise

or a clockwise rotation respectively to the reference line so as to make it

coincide with the line in question. In the present instance, we take the

normal to the interface at the point O as the reference line, in which case

φ and φ′ are seen to have opposite signs, explaining the negative signs in

the first term in (1.69b) and in the first relation in (1.70). At the same

time, φ and ψ have the same sign, which explains the positive sign in

the second relation, since n1, n2 are both positive quantities.

However, there arises in geometrical optics the necessity of adopting a

sign convention for refractive indices as well, in order that all the math-

ematical relations there can be made consistent with one another (see

sec. 3.2.2). For this, the directions of all the rays are compared with that

of a reference ray, which one usually chooses as the initial incident ray

for any given optical system. If the direction of any given ray happens to

be opposite to that of the reference ray because of reflection, then the re-

fractive index of the medium with reference to that particular ray is taken

with a negative sign. In the present instance, then, taking the incident

ray path as the reference ray direction, the signed refractive indices in

respect of the incident and reflected rays will have to be taken as n1 and

−n1 respectively.

Adopting this convention, the law of angles for reflection and refraction

can be expressed as a single formula, commonly referred to as Snell’s

law:

$$ n_1 \sin\phi_1 = n_2 \sin\phi_2. \tag{1.71} $$

In this formula, φ1 is the angle of incidence and n1 is the refractive index

(considered as a positive quantity) of the medium A, while φ2 denotes the

angle (expressed in accordance with the above sign convention) made by

either the reflected or the refracted ray with the normal (the reference

line for angles) and, finally, n2 stands for the signed refractive index as-

sociated with that ray. Alternatively, and more generally, the equation

may be interpreted as applying to any two of the three rays involved (the

incident, reflected, and refracted rays) with their respective signed angles

relative to the reference line (the normal to the interface in this instance)

and their respective signed refractive indices. As we will see in chap-

ters 2 and 3, Snell’s law expressed in the above form, with the above sign

convention implied, is the basic formula for ray tracing through optical

systems.

In a relation like (1.67d), however, the refractive indices n1, n2 will have to be taken as

positive quantities since these express the phase velocities v1, v2 in terms of c. In the

present context, we will have no occasion to use signed refractive indices since these

are necessary only to express the rules of geometrical optics in a consistent manner. On

the other hand, signed angles will be used here so as to keep uniformity with later use.
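To see how the single formula (1.71) covers both reflection and refraction, here is a minimal sketch. The function name `signed_snell` is hypothetical, angles are signed and in radians, and the refractive index of the reflected ray is entered with a negative sign as described above.

```python
import math


def signed_snell(phi1, n1_signed, n2_signed):
    """Solve n1 sin(phi1) = n2 sin(phi2), eq. (1.71), for the signed angle phi2."""
    s = n1_signed * math.sin(phi1) / n2_signed
    if abs(s) > 1.0:
        raise ValueError("no real solution for phi2")
    return math.asin(s)


if __name__ == "__main__":
    phi = math.radians(30.0)
    # Refraction from a medium of index 1.0 into one of index 1.5.
    print(math.degrees(signed_snell(phi, 1.0, 1.5)))
    # Reflection: the signed index of the reflected ray is -n1, giving phi2 = -phi.
    print(math.degrees(signed_snell(phi, 1.0, -1.0)))
```

With the signed index −n1 the formula returns φ2 = −φ1, which is the law of reflection in (1.70).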

1.12.3 The Fresnel formulae

1.12.3.1 Setting up the problem

Let us now get on with the other consequences of the boundary conditions

mentioned above. Making use of the boundary conditions, one obtains

from (1.67a), (1.67b),

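$$ \mathbf{e}_3 \times (\mathbf{A}_1 + \mathbf{A}_2) = \mathbf{e}_3 \times \mathbf{A}_3, \qquad \frac{1}{\mu_1 v_1}\, \mathbf{e}_3 \times (\mathbf{n} \times \mathbf{A}_1 + \mathbf{m}_1 \times \mathbf{A}_2) = \frac{1}{\mu_2 v_2}\, \mathbf{e}_3 \times (\mathbf{m}_2 \times \mathbf{A}_3). \tag{1.72} $$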

Since the vectors m1, m2 are now known from Snell’s law, these relations

can be made use of in obtaining the amplitudes A2, A3 of the electric in-

tensities for reflected and refracted waves in terms of the amplitude A1 for

the incident wave (the amplitudes for the magnetic vectors are obtained

from (1.67a)). In order to express the results in a convenient form, note

that, in accordance with (1.67c), Ai (i = 1, 2, 3) can be expressed in the

form

$$ \mathbf{A}_i = \mathbf{u}_i A_i \quad (i = 1, 2, 3), \tag{1.73} $$

where u1 is a linear combination of e2, n × e2, u2 is a linear combination

of e2, m1 × e2, and u3 is a linear combination of e2, m2 × e2, and where the

scalar amplitudes Ai (i = 1, 2, 3) are, in general, complex (A1 can, however,

be taken to be real without loss of generality). It is convenient to work

out the consequences of the relations (1.72) in two installments - first by

taking ui = e2 (i = 1, 2, 3), which means that all the three waves are polar-

ized with their electric vectors oscillating along the y-axis of the chosen

co-ordinate system (this is commonly referred to as the case of perpendic-

ular polarization, since the electric intensity vectors are all perpendicular

to the plane of incidence), and then by taking u1 = n× e2, u2 = m1× e2, and

u3 = m2 × e2 (parallel polarization; let us denote these three unit vectors

as t1, t2, t3 respectively). The case of any other state of polarization of the

three waves can then be worked out by taking appropriate linear combi-

nations. Fig. 1.6 gives you an idea of all the unit vectors relevant in the

present context.

Figure 1.6: The unit vectors relevant in the reflection-refraction problem; the unit vector e2 along the positive direction of the y-axis of the right handed co-ordinate system chosen points upward, while e3 is normal to the interface, as shown; the unit vectors n, m1, m2 along the incident ray, reflected ray, and the refracted ray are as in fig. 1.5; the vectors t1 ≡ n × e2, t2 ≡ m1 × e2, t3 ≡ m2 × e2 provide the reference directions for the electric intensities for the case of parallel polarization.

Incidentally, referring to the unit vectors defined in the caption of fig. 1.6,

you can take it as an exercise to show that

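$$ \mathbf{t}_1 = -\cos\phi\, \mathbf{e}_1 + \sin\phi\, \mathbf{e}_3, \quad \mathbf{t}_2 = \cos\phi\, \mathbf{e}_1 + \sin\phi\, \mathbf{e}_3, \quad \mathbf{t}_3 = -\cos\psi\, \mathbf{e}_1 + \sin\psi\, \mathbf{e}_3. \tag{1.74} $$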
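If you want to confirm the outcome of this exercise numerically, the cross products can be evaluated directly. This is only a sketch: the function name `reference_vectors` is hypothetical, and the reflected wave normal is written with φ′ = −φ already substituted into (1.69b).

```python
import numpy as np


def reference_vectors(phi, psi):
    """Return t1 = n x e2, t2 = m1 x e2, t3 = m2 x e2 for given angles (radians)."""
    e2 = np.array([0.0, 1.0, 0.0])
    n = np.array([np.sin(phi), 0.0, np.cos(phi)])     # eq. (1.69a)
    m1 = np.array([np.sin(phi), 0.0, -np.cos(phi)])   # eq. (1.69b) with phi' = -phi
    m2 = np.array([np.sin(psi), 0.0, np.cos(psi)])    # eq. (1.69c)
    return np.cross(n, e2), np.cross(m1, e2), np.cross(m2, e2)


if __name__ == "__main__":
    phi, psi = 0.6, 0.4
    t1, t2, t3 = reference_vectors(phi, psi)
    print(t1, [-np.cos(phi), 0.0, np.sin(phi)])
    print(t2, [np.cos(phi), 0.0, np.sin(phi)])
    print(t3, [-np.cos(psi), 0.0, np.sin(psi)])
```

Each printed pair agrees component by component with the corresponding expression in (1.74).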

1.12.3.2 Perpendicular polarization

Considering the case of perpendicular polarization first (ui = e2 (i =

1, 2, 3)), one obtains, from relations (1.73), (1.72), and (1.74)

$$ A_1 + A_2 = A_3, \qquad \frac{n_1 \mu_2}{n_2 \mu_1} (A_1 - A_2) \cos\phi = A_3 \cos\psi. \tag{1.75a} $$

These two relations give us the reflected and refracted amplitudes (A2, A3)

of oscillation of the electric intensity in terms of the incident amplitude

(A1) in the case of perpendicular polarization as

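$$ A_{2\perp} = \frac{\mu_2 \cos\phi \sin\psi - \mu_1 \sin\phi \cos\psi}{\mu_2 \cos\phi \sin\psi + \mu_1 \sin\phi \cos\psi}\, A_{1\perp}, \qquad A_{3\perp} = \frac{2 \mu_2 \cos\phi \sin\psi}{\mu_2 \cos\phi \sin\psi + \mu_1 \sin\phi \cos\psi}\, A_{1\perp}. \tag{1.75b} $$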

Here the suffix ’⊥’ is attached for the sake of clarity to indicate that the

incident wave has its electric intensity oscillating in a direction perpen-

dicular to the plane of incidence.

In most optical situations involving reflection and refraction, one can take

$$ \mu_1 \approx \mu_2 \approx \mu_0, \tag{1.75c} $$

in which case the above formula simplifies to

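$$ A_{2\perp} = -\frac{\sin(\phi - \psi)}{\sin(\phi + \psi)}\, A_{1\perp}, \qquad A_{3\perp} = \frac{2 \cos\phi \sin\psi}{\sin(\phi + \psi)}\, A_{1\perp}. \tag{1.75d} $$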
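The amplitude ratios for perpendicular polarization are easy to evaluate numerically. The following is a minimal sketch rather than a definitive implementation: the function name `fresnel_perp_amplitudes` is hypothetical, the angle of incidence is assumed to lie strictly between 0 and π/2 with a real angle of refraction, and the permeabilities default to equal values as in (1.75c).

```python
import numpy as np


def fresnel_perp_amplitudes(phi, n1, n2, mu1=1.0, mu2=1.0):
    """Return (A2/A1, A3/A1) for perpendicular polarization, eq. (1.75b).

    phi is the angle of incidence in radians, 0 < phi < pi/2; the angle of
    refraction psi follows from Snell's law n1 sin(phi) = n2 sin(psi).
    """
    psi = np.arcsin(n1 * np.sin(phi) / n2)
    a = mu2 * np.cos(phi) * np.sin(psi)
    b = mu1 * np.sin(phi) * np.cos(psi)
    return (a - b) / (a + b), 2.0 * a / (a + b)


if __name__ == "__main__":
    phi = np.radians(40.0)
    r, t = fresnel_perp_amplitudes(phi, 1.0, 1.5)
    psi = np.arcsin(np.sin(phi) / 1.5)
    # With mu1 = mu2 the general ratios should reduce to the form (1.75d).
    print(r, -np.sin(phi - psi) / np.sin(phi + psi))
    print(t, 2.0 * np.cos(phi) * np.sin(psi) / np.sin(phi + psi))
```

For n2 > n1 the reflected ratio comes out negative, and in every case the two ratios satisfy the first relation in (1.75a), 1 + A2/A1 = A3/A1.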

Let us now calculate the time averaged Poynting vector in the regions

occupied by the two media for this particular case of the incident wave,

the reflected wave, and the refracted wave, all in a state of perpendicular

polarization. Recalling formulae (1.40), and (1.67a), one obtains

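$$ \langle \mathbf{S}^{(A)} \rangle = \frac{1}{4 \mu_1 v_1} \left\langle \left[ (\mathbf{E}_1 + \mathbf{E}_2) \times (\mathbf{n} \times \mathbf{E}_1^{*} + \mathbf{m}_1 \times \mathbf{E}_2^{*}) + \text{c.c} \right] \right\rangle, \tag{1.76a} $$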

where ‘c.c’ stands for terms complex conjugate to preceding ones within

the brackets. When the time average is worked out, one finds that 〈S(A)〉

is made up of two components, one corresponding to the average rate

of energy flow in a direction normal to the interface (i.e., along e3 in the

present instance), and the other to the energy flow parallel to the interface

(along e1). Making the assumption (1.75c) for the sake of simplicity, the

expressions for these two components are seen to be

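$$ \langle (\mathbf{S}^{(A)})_{\perp} \rangle = \frac{1}{2 v_1}\, \mathbf{e}_3 \cdot (\mathbf{n} A_1^2 + \mathbf{m}_1 A_2^2)\, \mathbf{e}_3, \tag{1.76b} $$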

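$$ \langle (\mathbf{S}^{(A)})_{\parallel} \rangle = \frac{1}{2 v_1} \left[ \mathbf{e}_1 \cdot (\mathbf{n} |A_1|^2 + \mathbf{m}_1 |A_2|^2) + \frac{1}{2}\, \mathbf{e}_1 \cdot (\mathbf{n} + \mathbf{m}_1)(A_1 A_2^{*} + A_1^{*} A_2) \right] \mathbf{e}_1. \tag{1.76c} $$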

In writing these expressions I have not attached the suffix ‘⊥’ to A1, A2

since, in the case under consideration the electric intensity vectors are

all perpendicular to the plane of incidence, and do not possess compo-

nents parallel to the plane. Moreover, the suffixes ‘⊥’ and ‘‖’, when used

in the context of the time averaged Poynting vectors, as in the above ex-

pressions, carry a different connotation - respectively perpendicular and

parallel to the interface rather than to the plane of incidence, and hence

the use of these suffixes for the amplitudes Ai (i = 1, 2, 3) would be mis-

leading.

In a manner similar to above, the normal and parallel components of the

time averaged Poynting vector in the region of the medium B are seen to

be

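$$ \langle (\mathbf{S}^{(B)})_{\perp} \rangle = \frac{1}{2 v_2}\, \mathbf{e}_3 \cdot (\mathbf{m}_2 |A_3|^2)\, \mathbf{e}_3, \tag{1.77a} $$

$$ \langle (\mathbf{S}^{(B)})_{\parallel} \rangle = \frac{1}{2 v_2}\, \mathbf{e}_1 \cdot (\mathbf{m}_2 |A_3|^2)\, \mathbf{e}_1. \tag{1.77b} $$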

The parallel components (S(A))‖, (S(B))‖ are of no direct relevance in the

energy accounting in reflection and refraction, since these denote energy

flow parallel to the interface, where an interpretation in terms of energy

transfer from one medium to another does not hold. While noting the

existence of this component of the Poynting vector, let us concentrate

for now on the normal components whose expressions in terms of the

incident amplitude (A1) of the electric intensity are

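$$ \langle (\mathbf{S}^{(A)})_{\perp} \rangle = \frac{1}{2 v_1} A_1^2 \cos\phi \left( 1 - \frac{\sin^2(\phi - \psi)}{\sin^2(\phi + \psi)} \right) \mathbf{e}_3 = \langle (\mathbf{S}^{(A)}_{\text{inc}})_{\perp} \rangle + \langle (\mathbf{S}^{(A)}_{\text{refl}})_{\perp} \rangle \ \text{(say)}, \tag{1.78a} $$

$$ \langle (\mathbf{S}^{(B)})_{\perp} \rangle = \frac{1}{2 v_2} A_1^2 \cos\psi\, \frac{4 \cos^2\phi \sin^2\psi}{\sin^2(\phi + \psi)}\, \mathbf{e}_3, \tag{1.78b} $$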

where we have assumed A1 to be real for the sake of simplicity.

Note that the normal component of the averaged Poynting vector (i.e.,

the component normal to the interface between the two media) in the

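medium A decomposes into two parts, one due to the incident wave

($\langle (\mathbf{S}^{(A)}_{\text{inc}})_{\perp} \rangle = \frac{1}{2 v_1} A_1^2 \cos\phi\, \mathbf{e}_3$) and the other due to the reflected wave

($\langle (\mathbf{S}^{(A)}_{\text{refl}})_{\perp} \rangle = -\frac{1}{2 v_1} A_1^2 \cos\phi\, \frac{\sin^2(\phi - \psi)}{\sin^2(\phi + \psi)}\, \mathbf{e}_3$), where the latter is oppositely directed compared to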

the former. In other words, part of the normal component of energy flow

due to the incident wave is sent back into the medium A, consistent with

the interpretation that this corresponds to the reflected wave. The ratio

of the magnitudes of the two is the reflectivity,

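$$ R_{\perp} = \frac{\left| \langle (\mathbf{S}^{(A)}_{\text{refl}})_{\perp} \rangle \right|}{\left| \langle (\mathbf{S}^{(A)}_{\text{inc}})_{\perp} \rangle \right|} = \frac{\sin^2(\phi - \psi)}{\sin^2(\phi + \psi)}. \tag{1.79a} $$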

Analogously, 〈(S(B))⊥〉 represents the normal component of the energy flux

in medium B, i.e., the refracted part of the normal component of the incident

energy flux. The ratio of its magnitude to that of the incident normal flux is the transmissivity,

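$$ T_{\perp} = \frac{\left| \langle (\mathbf{S}^{(B)})_{\perp} \rangle \right|}{\left| \langle (\mathbf{S}^{(A)}_{\text{inc}})_{\perp} \rangle \right|} = \frac{\sin 2\phi\, \sin 2\psi}{\sin^2(\phi + \psi)}. \tag{1.79b} $$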

Here the suffix ’⊥’ is attached to R and T to indicate that these expres-

sions hold for an incident wave polarized perpendicularly to the plane of

incidence, i.e., it bears a different connotation as compared to the same

symbol used as a suffix for the normal component of the Poynting vec-

tor in either medium (see the right hand sides of the above expressions),

where it indicates that the component perpendicular to the interface be-

tween the media is being referred to.

As expected, one finds

$$ R_{\perp} + T_{\perp} = 1, \tag{1.79c} $$

which tells one that the normal components of the flow of energy for the

incident, reflected, and refracted waves satisfy the principle of energy

conservation independently of the parallel components.
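In case you want to see why the sum of the reflectivity and the transmissivity is exactly unity, the short worked step below may be of help. It uses nothing beyond the expressions (1.79a), (1.79b) and the elementary sum-to-product identities for the sine function.

```latex
\begin{align*}
\sin^2(\phi+\psi) - \sin^2(\phi-\psi)
  &= \bigl[\sin(\phi+\psi) - \sin(\phi-\psi)\bigr]\,
     \bigl[\sin(\phi+\psi) + \sin(\phi-\psi)\bigr] \\
  &= (2\cos\phi\,\sin\psi)\,(2\sin\phi\,\cos\psi)
   = \sin 2\phi\,\sin 2\psi , \\[4pt]
R_{\perp} + T_{\perp}
  &= \frac{\sin^2(\phi-\psi) + \sin 2\phi\,\sin 2\psi}{\sin^2(\phi+\psi)}
   = \frac{\sin^2(\phi+\psi)}{\sin^2(\phi+\psi)} = 1 .
\end{align*}
```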

The relations (1.79a), (1.79b) are referred to as Fresnel formulae. In the

present section these have been obtained for incident light in the state

of perpendicular polarization. Analogous Fresnel formulae in the case of

parallel polarization will be written down in sec. 1.12.3.3.

Phase change in reflection.

Note from the first relation in (1.75d) that there occurs a phase difference

of π between the incident field in the perpendicularly polarized state and the

corresponding reflected field if |ψ| < |φ|, i.e., the medium B is optically

denser than the medium A (n2 > n1). If, on the other hand, B is optically

rarer, there does not occur any such phase change.

By definition, the angles φ and ψ are either both positive or both negative (refer to the

sign convention briefly outlined in the paragraph following eq. (1.70)). The two angles,

moreover, satisfy |φ| < π/2, |ψ| < π/2. In the case of the medium B being denser than

medium A, one additionally has |ψ| < |φ|. In the above paragraph we have considered

the case where both the angles are positive. The same conclusion holds if both are

negative.

1.12.3.3 Parallel polarization. Brewster’s angle

The case of parallel polarization, where the incident, reflected, and re-

fracted waves are linearly polarized with their electric intensity vectors

oscillating in the plane of incidence, can be worked out in an analogous

manner. However, I am not going to outline the derivation here since

it involves no new principles. Referring to eq. (1.73), one has to take

ui = ti (i = 1, 2, 3) here, where the unit vectors ti are defined as in (1.74).

Using notations analogous to those in sec. 1.12.3.2, one obtains the fol-

lowing results

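$$ A_{2\parallel} = \frac{\tan(\phi - \psi)}{\tan(\phi + \psi)}\, A_{1\parallel}, \qquad A_{3\parallel} = \frac{2 \cos\phi \sin\psi}{\sin(\phi + \psi) \cos(\phi - \psi)}\, A_{1\parallel}, \tag{1.80a} $$

$$ R_{\parallel} = \frac{\tan^2(\phi - \psi)}{\tan^2(\phi + \psi)}, \qquad T_{\parallel} = \frac{\sin 2\phi\, \sin 2\psi}{\sin^2(\phi + \psi) \cos^2(\phi - \psi)}. \tag{1.80b} $$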

The relations (1.80b) are the Fresnel formulae for parallel polarization,

obtained by calculating the component of the time averaged Poynting

vector normal to the interface for the incident, reflected, and refracted

waves. Once again, one observes that the principle of energy conserva-

tion holds for this component of the flow independently of the parallel

component (parallel, that is, to the interface):

$$ R_{\parallel} + T_{\parallel} = 1. \tag{1.80c} $$
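Since I have not written out the derivation for parallel polarization, a numerical check may be reassuring. The sketch below assumes µ1 = µ2 as in (1.75c), an angle of incidence strictly between 0 and π/2 away from the special value φ + ψ = π/2, and a real angle of refraction; the function name `fresnel_parallel` is hypothetical.

```python
import numpy as np


def fresnel_parallel(phi, n1, n2):
    """Return (A2/A1, A3/A1, R, T) for parallel polarization, eqs. (1.80a), (1.80b)."""
    psi = np.arcsin(n1 * np.sin(phi) / n2)
    r = np.tan(phi - psi) / np.tan(phi + psi)
    t = 2.0 * np.cos(phi) * np.sin(psi) / (np.sin(phi + psi) * np.cos(phi - psi))
    refl = r**2
    trans = (np.sin(2.0 * phi) * np.sin(2.0 * psi)
             / (np.sin(phi + psi) ** 2 * np.cos(phi - psi) ** 2))
    return r, t, refl, trans


if __name__ == "__main__":
    phi = np.radians(40.0)
    r, t, refl, trans = fresnel_parallel(phi, 1.0, 1.5)
    print(refl + trans)  # energy conservation, eq. (1.80c)
```

The amplitude ratios returned here can also be checked against the boundary conditions (1.72) written out for ui = ti, namely cos φ (A1 − A2) = cos ψ A3 for the tangential electric field and n1 (A1 + A2) = n2 A3 for the tangential magnetic field.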

Brewster’s angle

Note from the first relation in (1.80a) that, for

$$ \phi + \psi = \frac{\pi}{2}, \tag{1.81a} $$

one has R‖ = 0, i.e., the reflected component vanishes, and the whole

of the incident wave is refracted. The angle of incidence for which this

happens is given by

$$ \tan\phi = \frac{n_2}{n_1}, \tag{1.81b} $$

and is known as the Brewster angle. Evidently, if the incident wave is

in any state of polarization other than the one of linear polarization in

the plane of incidence (which we have referred to here as ‘parallel polarization’),

then the reflected light will be linearly polarized, involving only

the perpendicular component.
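As an illustration, the sketch below evaluates the Brewster angle for an interface between media of refractive indices 1.0 and 1.5 (values chosen only as an example) and confirms that the reflectivity for parallel polarization vanishes there while that for perpendicular polarization does not. The function names are hypothetical, and µ1 = µ2 is assumed as in (1.75c).

```python
import math


def brewster_angle(n1, n2):
    """Angle of incidence (radians) satisfying tan(phi) = n2 / n1, eq. (1.81b)."""
    return math.atan2(n2, n1)


def reflectivity_parallel(phi, n1, n2):
    """R for parallel polarization, first relation in eq. (1.80b)."""
    psi = math.asin(n1 * math.sin(phi) / n2)
    return (math.tan(phi - psi) / math.tan(phi + psi)) ** 2


def reflectivity_perp(phi, n1, n2):
    """R for perpendicular polarization, eq. (1.79a)."""
    psi = math.asin(n1 * math.sin(phi) / n2)
    return (math.sin(phi - psi) / math.sin(phi + psi)) ** 2


if __name__ == "__main__":
    phi_b = brewster_angle(1.0, 1.5)
    print(math.degrees(phi_b))
    print(reflectivity_parallel(phi_b, 1.0, 1.5))  # vanishes up to rounding
    print(reflectivity_perp(phi_b, 1.0, 1.5))      # remains finite
```

At the Brewster angle the perpendicular reflectivity reduces to ((n2² − n1²)/(n2² + n1²))², since sin(φ + ψ) = 1 and sin(φ − ψ) = −cos 2φ there.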

In general, for any arbitrarily chosen angle of incidence, the relative

strengths of the parallel and perpendicular components in the reflected

wave (as also in the refracted wave) get altered compared to those in the

incident wave. Thus, for a linearly polarized incident wave containing

both parallel and perpendicular components, the reflected wave will be

polarized in a different direction, with a different mix of the two compo-

nents. Similarly, circularly polarized incident light will be converted to

elliptically polarized light, and elliptically polarized light will give ellipti-

cally polarized light, with a different set of parameters characterizing the

ellipse (in special circumstances, elliptically polarised light may give rise

to circularly polarized reflected light).

Parallel polarization: phase change on reflection.

The question of phase change in reflection for the parallel component is

not as unambiguous as for the perpendicular component where, in the

latter case, the electric vectors of the incident, reflected, and refracted

waves, all oscillate along lines parallel to the y-axis (refer to our choice

of the Cartesian axes). In the former case, on the other hand, there is

no way to directly compare the phases of oscillation of these three, and

the relative phases depend on the definition of the unit vectors ti (i =

1, 2, 3) (for instance, one may, for any one or more of these three, choose

ti to be in a direction opposite to that of our choice above). The relative

phases, moreover, depend on whether φ + ψ is an acute or an obtuse

angle. Thus, for our choice of the unit vectors ti, and for φ + ψ > π/2,

there is a phase change of π in the reflected wave relative to the incident

wave when the second medium is denser than the first one. The relative

phases acquire an operational significance if, for instance, the waves are

made to interfere with one another. The interference will then be found

to be constructive (no phase reversal) or destructive (reversal of phase)

depending only on the value of φ + ψ (relative to π/2) regardless of the way

the ti’s are defined.

The case of normal incidence.

In the case of normal incidence (φ = 0), the plane of incidence is not de-

fined, and the term ‘parallel polarization’ is devoid of meaning. A linearly

polarized incident wave is then, by default, a perpendicularly polarized

one. Indeed, the results (1.80a) go over to (??) in the limit φ → 0 despite

the apparent difference in sign in the first members belonging to the two

pairs of relations (check this out), which is accounted for by the fact that

t2 → −t1 in this limit. Thus, the phase reversal (for n2 > n1) for a linearly

polarized incident wave does not have any ambiguity associated with it in

this case. Likewise, a normally incident left handed circularly polarized

wave is converted to a state of right handed polarization on reflection, if

n2 > n1.
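The limit of normal incidence can be worked out explicitly. The lines below are a sketch under the assumptions µ1 = µ2 (eq. (1.75c)) and small angles, for which Snell's law reads ψ ≈ (n1/n2)φ; they start from the first relations in (1.75d) and (1.80a) and from the expressions (1.79a), (1.79b).

```latex
\begin{align*}
\frac{A_{2\perp}}{A_{1\perp}}
  &= -\frac{\sin(\phi-\psi)}{\sin(\phi+\psi)}
   \;\longrightarrow\; -\frac{\phi-\psi}{\phi+\psi}
   = -\frac{n_2-n_1}{n_2+n_1}, \\[4pt]
\frac{A_{2\parallel}}{A_{1\parallel}}
  &= \frac{\tan(\phi-\psi)}{\tan(\phi+\psi)}
   \;\longrightarrow\; \frac{\phi-\psi}{\phi+\psi}
   = \frac{n_2-n_1}{n_2+n_1}, \\[4pt]
R &\;\longrightarrow\; \left(\frac{n_2-n_1}{n_2+n_1}\right)^{2}, \qquad
T \;\longrightarrow\; \frac{4\phi\psi}{(\phi+\psi)^2}
   = \frac{4 n_1 n_2}{(n_1+n_2)^2}, \qquad R + T = 1 .
\end{align*}
```

The two amplitude ratios differ only in sign, which is just what the relation t2 → −t1 accounts for, and the reflected field is reversed in phase when n2 > n1.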

1.13 Total internal reflection

Let us now take a close look at what happens when a plane wave is

incident at an interface separating an optically rarer medium B from a

denser medium A (i.e., the refractive indices n1 (for A), and n2 (for B) sat-

isfy n1 > n2), propagating from A towards B , where the angle of incidence

φ exceeds the critical angle (φc), i.e.,

$$ \phi \geq \phi_c = \sin^{-1} n, \qquad n \equiv \frac{n_2}{n_1}. \tag{1.82} $$

Looking at Snell’s law (eq. (1.71)), it is evident that this situation needs

special consideration since (1.82) implies that sinψ is to have a value

larger than unity, which is contrary to the bound −1 ≤ sin θ ≤ 1 for any

real angle θ. One commonly expresses this by saying that the wave is

‘totally internally reflected’ to the medium of incidence A, without being

refracted into B. We are now going to see what this statement actually

means. In this, let us consider for the sake of concreteness the case of

an incident wave with perpendicular polarization (i.e., with its electric in-

tensity oscillating in a direction perpendicular to the plane of incidence).

All the features of total internal reflection we arrive at below turn out to

have analogous counterparts in the case of parallel polarization as well,

the derivation of which, however, I will not go into. The case of an inci-

dent wave in an arbitrary state of polarization where, once again, similar

features are seen to characterize the fields in the two media, will also not

be considered separately.

In order to obtain expressions for the field vectors at all points in the two

media such that the Maxwell equations be satisfied everywhere, along

with the boundary conditions at the interface, let us refer to (1.67b), in

which the expression for E3 needs to be put in a new form since, for the

situation under consideration, the angle ψ in (1.69c) is not well defined.

Since, by contrast, φ is well defined here, one can make the following re-

placements, making use of Snell’s law as expressed by the second relation

in (1.70), which we assume to be a formally valid one (the consistency of

this assumption is seen from the final expression for the fields),

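$$\sin\psi \to \frac{\sin\phi}{n}, \qquad \cos\psi \to i\sqrt{\frac{\sin^2\phi}{n^2} - 1} = i\beta \ \text{(say)}, \qquad \text{where } \beta \equiv \sqrt{\frac{\sin^2\phi}{n^2} - 1}. \tag{1.83}$$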

We make these replacements in (1.69c) to evaluate the assumed solution

of the form (1.67a)-(1.67b), making use of the boundary conditions (1.72)

and considering the particular case where E1 (and hence also E2, E3 each)

oscillates in a direction perpendicular to the plane of incidence. The re-

sult works out to

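$$
\begin{aligned}
\mathbf{E}_1 &= \hat{\mathbf{e}}_2 A_1 \exp[ik(x\sin\phi + z\cos\phi)]\,e^{-i\omega t},\\
\mathbf{E}_2 &= \hat{\mathbf{e}}_2 A_2 \exp[ik(x\sin\phi - z\cos\phi)]\,e^{-i\omega t},\\
\mathbf{E}_3 &= \hat{\mathbf{e}}_2 A_3 \exp[ik(x\sin\phi + i n\beta z)]\,e^{-i\omega t} = \hat{\mathbf{e}}_2 A_3 \exp[ikx\sin\phi - kn\beta z]\,e^{-i\omega t},\\
\mathbf{H}_1 &= \frac{1}{\mu_1 v_1}\,\hat{\mathbf{n}}\times\mathbf{E}_1, \qquad \mathbf{H}_2 = \frac{1}{\mu_1 v_1}\,\hat{\mathbf{m}}_1\times\mathbf{E}_2,\\
\mathbf{H}_3 &= \frac{1}{\mu_2 v_2}\left(\hat{\mathbf{e}}_1\frac{\sin\phi}{n} + i\,\hat{\mathbf{e}}_3\beta\right)\times\mathbf{E}_3,
\end{aligned}
\tag{1.84a}
$$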

where Ei,Bi, (i = 1, 2, 3) are defined as in sec. 1.12.1, and where the

constants Ai (i = 1, 2, 3) are related to one another by the boundary con-

ditions (continuity of the tangential components of the electric intensity

E and the magnetic field strength H), as

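$$A_2 = A_1 e^{-2i\delta}, \qquad A_3 = A_1\left(1 + e^{-2i\delta}\right), \qquad \text{with } \delta \equiv \tan^{-1}\frac{n\beta}{\cos\phi} = \tan^{-1}\frac{\sqrt{\sin^2\phi - n^2}}{\cos\phi}. \tag{1.84b}$$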
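The step from the boundary conditions to the amplitude relations can be sketched as follows. This is only an outline, written under the additional assumption µ1 = µ2 (so that 1/(µ2 v2) = n/(µ1 v1)); it uses the field expressions for E1, E2, E3 and H1, H2, H3 given above, evaluated at the interface z = 0.

```latex
\begin{aligned}
A_1 + A_2 &= A_3
  && \text{(continuity of tangential } \mathbf{E}\text{)},\\
\cos\phi\,(A_1 - A_2) &= i\,n\beta\,A_3
  && \text{(continuity of tangential } \mathbf{H},\ \mu_1 = \mu_2\text{)},\\
\Rightarrow\quad \frac{A_2}{A_1} &= \frac{\cos\phi - i\,n\beta}{\cos\phi + i\,n\beta}
  = e^{-2i\delta},
  && \tan\delta = \frac{n\beta}{\cos\phi}.
\end{aligned}
```

Since the numerator and the denominator of A2/A1 are complex conjugates of each other, the ratio has unit modulus, and the first relation then gives A3 = A1 + A2.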

Several features of the fields in the media A and B can now be stated:

1. Even though there is no refracted ‘ray’ in medium B, oscillating elec-

tric and magnetic fields are nevertheless set up in this medium, in

order that the boundary conditions may be satisfied.

2. The phase of oscillations at any given point due to the reflected wave

(E2,B2) differs from that associated with the incident wave (E1,B1), as

seen from the first relation in (1.84b), which shows that the reflected

amplitude A2 has a phase lag compared to the incident amplitude A1.

The amount of phase lag (2δ) increases with the angle of incidence φ

from zero at φ = φc = sin⁻¹ n (the critical angle) to π at φ = π/2 (where

δ = π/2).

On considering the total internal reflection of an incident wave po-

larized parallel to the plane of incidence, a different expression is

obtained for the phase lag between the incident wave and the re-

flected wave. As a result, the state of polarization of an incident

wave possessing both a perpendicular and a parallel component,

gets altered. A linearly polarized wave with its direction of oscilla-

tion of the electric intensity inclined at some angle to the plane of

incidence is, in general, transformed to an elliptically polarized wave

on suffering total internal reflection.

3. The field in medium B is in the nature of a propagating wave along e1,

parallel to the interface in the plane of incidence, and is not associ-

ated with a refracted ‘ray’. A ‘ray’ in geometrical optics corresponds

to the path along which energy is carried by the electromagnetic

field. In the present instance, the component of the time averaged

Poynting vector in medium B along a direction normal to the inter-

face works out to zero (check this out). It is this fact that one refers

to when one speaks of the absence of a refracted ‘ray’.

4. The electric and magnetic intensities in medium B decrease expo-

nentially in a direction normal to the interface. In other words, the

wave fronts (surfaces of constant phase, parallel to the y-z plane in

the present context) are not surfaces of constant amplitude (parallel

to the x-y plane). This is an instance of an inhomogeneous wave,

and is also termed an evanescent wave because of the exponential

decrease of the amplitude.

5. The wave set up in medium B is, strictly speaking, not a transverse

wave either, since the magnetic intensity possesses a component

along the direction of propagation (e1 in the present instance).

6. Since A1 and A2 are identical in magnitude, the energy flux carried

by the incident wave in medium A in a direction normal to the inter-

face is identical to that carried by the reflected wave, which means

that the reflectivity R is unity in the case of total internal reflection

(and thus, the transmittance T is zero). On the other hand, there is

a component of the time averaged Poynting vector in medium A in a

direction parallel to the interface (along e1), given by

$$\left\langle S^{(A)}_{\parallel} \right\rangle = \frac{2}{\mu_0 v_1}\,\sin\phi\,\cos^2\delta\,A_1^2, \tag{1.85}$$

where we assume µ1 = µ2 = µ0 for the sake of simplicity, and take A1 to be real without loss of generality. Thus, the average energy flux parallel to the interface has the value (2/(µ0 v1)) sin φc A1² for φ = φc, when the contributions due to the incident and reflected waves add up because of the two being in phase, while on the other hand, it has the value zero at φ = π/2 since the incident and reflected waves have a phase difference of π.

7. The component of the time averaged Poynting vector in medium B

along e1 can be seen to work out to a value identical to the right hand

side of (1.85). In other words, the parallel component of the energy

flux is continuous across the interface.

8. The exponential decrease of the amplitude of the electromagnetic

field set up in the medium B (the rarer medium, towards which the

incident wave propagates while being reflected from the interface)

in a direction normal to the interface, does not signify a process of

dissipation in it, since no energy enters into this medium to start

with. The absence of dissipation is also seen from the fact that there

is no decrease in amplitude in a direction parallel to the interface.

Of course, in the present discussion, we have assumed for the sake

of simplicity that the dielectric media under consideration are free

of dissipation, corresponding to which the refractive indices n1, n2

are taken to be real quantities. In reality, however, there occurs an

absorption of energy in the process of propagation of an electromag-

netic wave through a dielectric, which we will consider in sec. 1.15.

In general, the dissipation happens to be small for most values of

the frequency ω, which is why we have ignored it in the present

discussion. What is important to note here is that the exponential

decrease of amplitude in a direction normal to the interface in total

internal reflection occurs regardless of dissipation.

You will do well to check all the above statements out.
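Several of the statements above can be checked numerically. The following is a minimal Python sketch, assuming µ1 = µ2 and an illustrative relative index n = 1/1.5. The function names `tir_perpendicular` and `normal_flux_in_b`, the value of n, and the chosen angles are hypothetical, introduced here only for illustration. It evaluates β and δ from (1.83) and (1.84b), forms the ratios A2/A1 and A3/A1, and computes the normal component of the time averaged Poynting vector in medium B from the expression for H3 in (1.84a).

```python
import cmath
import math


def tir_perpendicular(phi, n):
    """Amplitude ratios for perpendicular polarization beyond the critical angle.

    phi : angle of incidence in radians (at or above the critical angle)
    n   : relative refractive index n2/n1 (< 1); mu1 = mu2 is assumed.
    Returns (beta, delta, r, t) with r = A2/A1 and t = A3/A1.
    """
    s = math.sin(phi)
    if s < n - 1e-12:
        raise ValueError("angle of incidence is below the critical angle")
    beta = math.sqrt(max(s * s / (n * n) - 1.0, 0.0))   # eq. (1.83)
    delta = math.atan2(n * beta, math.cos(phi))         # eq. (1.84b)
    r = cmath.exp(-2j * delta)                          # A2 / A1
    t = 1.0 + r                                         # A3 / A1
    return beta, delta, r, t


def normal_flux_in_b(beta, t):
    """Time averaged S_z in medium B at z = 0, in units of A1^2 / (mu2 v2).

    From (1.84a), E3 is along e2 and the e1-component of H3 is -i*beta*E3,
    so <S_z> = -(1/2) Re(E_y * conj(H_x)).
    """
    e_y = t
    h_x = -1j * beta * t
    return -0.5 * (e_y * h_x.conjugate()).real


n = 1.0 / 1.5                  # illustrative value only
phi_c = math.asin(n)           # critical angle
for phi in (phi_c, math.radians(45.0), math.radians(60.0), math.pi / 2):
    beta, delta, r, t = tir_perpendicular(phi, n)
    print(f"phi = {math.degrees(phi):6.2f} deg  |A2/A1| = {abs(r):.6f}  "
          f"phase lag 2*delta = {2 * delta:.4f} rad  "
          f"<S_z> in B = {normal_flux_in_b(beta, t):.1e}")
```

The printed rows let one see the modulus of A2/A1 staying at unity, the phase lag 2δ growing from zero at the critical angle towards π at grazing incidence, and the normal energy flux in B vanishing to within rounding error.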

A phenomenon of considerable interest in the context of total internal

reflection is what is referred to as frustrated total internal reflection. This

will be briefly outlined in sec. 1.15.7.4.

Analogous to total internal reflection from an interface separating two

isotropic dielectrics, where the incident wave propagates from the medium

of higher refractive index to the one of lower refractive index, one finds

interesting features associated with the reflection of a wave incident from

a dielectric medium on an interface separating it from a conductor. In or-

der to describe the characteristics of such a reflection, one has to look at

a number of features of electromagnetic wave propagation in a conductor.

I will briefly outline this in sec. 1.15.3.

1.14 Plane waves: significance in electromagnetic theory and optics

In the above paragraphs, we have come across a number of features of

plane waves propagating through isotropic dielectric media where, in par-

ticular, the phenomena of reflection and refraction from planar interfaces

between such media have been addressed. It is worthwhile to pause here

and to try to form an idea as to the significance of plane waves and their

reflection and refraction in electromagnetic theory and optics.

While the plane wave is, in a sense, the simplest of solutions of Maxwell’s

equations, it is of little direct relevance in electromagnetic theory since it

represents an electromagnetic field only under idealized conditions. The

latter correspond to an electromagnetic field set up in an infinitely ex-

tended homogeneous dielectric medium, with a source emitting coherent

monochromatic radiation placed at an infinitely remote point. In practice,

on the other hand, fields are set up in the presence of bodies and devices

placed within finite regions of space, where one has to take into account

appropriate boundary conditions corresponding to the presence of these

bodies, whereby the space time dependence of the field possesses not a

great deal of resemblance with that of a plane wave.

In reality, however, the plane wave is of exceptional significance. In the

first place, it constitutes a basic solution of Maxwell’s equations in nu-

merous situations of interest since more complex solutions can be built

up by a linear superposition of plane wave solutions where the superpo-

sition may involve a number (often infinite) of components of different

frequencies as also of different wave vectors.

Spherical and cylindrical wave solutions introduced in sections 1.17 and 1.18 also con-

stitute such basic sets of solutions of Maxwell’s equations, where more complex solu-

tions can be built up as a superposition of particular solutions of either type.

What is more, solutions of Maxwell’s equations of a relatively complex

nature can, under certain circumstances, be described locally in terms of

plane waves. This is the situation, for instance, in regions far from the

source(s) of an electromagnetic field where the degree of inhomogeneity is

relatively small and where, moreover, the field is nearly harmonic in time.

Such a field looks like a plane wave whose amplitude is slowly modulated

in space and time. Ignoring the variation of the amplitude over relatively

large distances and large intervals of time, then, the field can be inter-

preted as a plane wave, and results relating to a plane wave can be seen

to have a validity in such more general situations. For instance, one can

interpret the modification of the field due to the presence of interfaces, in-

cluding curved ones, between different media, as reflection and refraction

of such locally plane waves. This is precisely the approach of geometrical

optics where a ray plays a role analogous to the wave normal of a plane

wave and an eikonal surface is analogous to the wave front.

As we will see in chapters 2 and 3, this approach is useful in the analysis

of ray paths and in the theory and practice of imaging in optics.

1.15 Electromagnetic waves in dispersive media

1.15.1 Susceptibility and refractive index in an isotropic dielectric

1.15.1.1 Introduction: the context

Imagine a plane monochromatic wave propagating along the z-axis of a

Cartesian co-ordinate system in a dispersive medium, where the term

‘dispersion’ will be explained below. Assume that the wave is linearly

polarized with the electric intensity oscillating along the x-axis, and is

represented by

$$\mathbf{E}(z,t) = \hat{\mathbf{e}}_1 E_0 \exp\big(i(kz - \omega t)\big). \tag{1.86}$$

Here E0 (which one can assume to be a real quantity) represents the

amplitude of the wave, ω its angular frequency, and k its propagation

constant, being related to the angular frequency as in (1.52b), where v

stands for the phase velocity of the wave in the medium. The latter is

related to the relative permittivity (εr) and relative permeability (µr) of the

medium and, alternatively, to its refractive index, as in eq. (1.52d). In

other words, the refractive index is given by the formula (1.52e).

The medium under consideration here is assumed to be an isotropic dielectric

(with conductivity σ = 0) for which εr, µr are scalar quantities

depending on its physical characteristics.

What is of central interest in the present context is the fact that, in general,

εr and µr are functions of the angular frequency ω, implying that the

refractive index is also frequency dependent. This dependence of the re-

fractive index on the frequency is termed dispersion, and we will now

have a look at the nature of this dependence. Fig. 1.7 shows the gen-

eral nature of the dependence of the refractive index on the frequency

for a typical dielectric. As you can see, there are frequency ranges in

which the refractive index does not change much with frequency, and

the medium behaves as only a weakly dispersive one, while, in some

other frequency ranges the medium is comparatively strongly dispersive.

Moreover, while the refractive index generally increases with an increase

in frequency (normal dispersion), there exist narrow frequency ranges in

which this trend is reversed. Such a sharp decrease in the refractive in-

dex is referred to as anomalous dispersion. In this section we will see why

the curve depicting the trend of normal dispersion is punctuated with

narrow frequency ranges involving anomalous dispersion.

[Plot: refractive index n (vertical axis) against frequency ω (horizontal axis).]

Figure 1.7: Depicting the general nature of the dispersion curve; the refractive index is plotted against the frequency for plane waves propagating in an isotropic dielectric; in general, the refractive index increases with frequency; however, in certain narrow frequency ranges, the refractive index changes anomalously, registering sharp drops (‘anomalous dispersion’); these correspond to significant absorption in the medium; the term refractive index actually means the real part of a certain complex function of the frequency ω while the imaginary part accounts for the attenuation of the wave; the figure shows three ranges of anomalous dispersion, corresponding to three different resonant frequencies (see sec. 1.15.2).

To begin with, I want you to take note of the basic fact that dispersion

is caused principally by the response of electrical charges in the medium

under consideration to the oscillating electric intensity field of the wave

(eq. (1.86)) propagating in it. For the sake of simplicity we will assume

here that µr is frequency-independent and set µr = 1, which happens

to be close to actual values for most dielectrics (and even for numerous

conducting media). With this simplification, dispersion will be explained

in terms of the frequency dependence of the relative permittivity εr.

There remains one more essential feature of dispersion that I have to

briefly mention here before outlining for you the derivation of how the

relative permittivity comes to depend on the frequency. As we will see be-

low, dispersion goes hand in hand with dissipation. This is because of the

basic fact that the number per unit volume of the charges in the medium

that respond to the electric intensity field of the propagating wave is

commonly an enormously large one, and that these charges interact with

one another, causing an irreversible energy sharing between these. What

is more, the charges set into oscillations by the propagating wave radiate

energy over a range of wavelengths, causing energy dissipation, and at-

tenuation of the wave. From the point of view of mathematical analysis,

what all this implies is that quantities like εr, k and n are, in general, all

complex ones. This, in turn, needs a careful interpretation of the rela-

tions featuring these quantities, wherein the real and imaginary parts of

each of these can be seen to possess distinct meanings.

1. I will not consider in this book the phenomenon of spatial dispersion wherein the

permittivity in respect of a plane wave field depends, not only on the frequency ω

(‘time domain dispersion’), but on the wave vector k as well. Spatial dispersion is

of especial importance for conductors and plasmas where it results in a number

of novel effects.

2. Strictly speaking, the linear relationship between the electric field and the po-

larization, which we assume throughout the present section, does not hold in the

frequency ranges characterized by anomalous dispersion and pronounced absorp-

tion. We will consider nonlinear effects in optics in chapter 9, though in a different

context. Nonlinear effects can arise in a medium not only by virtue of enhanced

(‘resonant’) absorption, but by virtue of electric fields of large magnitude as well,

i.e., by waves of large intensity set up in the medium.

1.15.1.2 Dispersion: the basic equations

As a plane wave of the form, say, (1.86) proceeds through the dielec-

tric under consideration, which we assume to be an isotropic and ho-

mogeneous one, it causes a forced oscillation of the charges distributed

through the medium. While Maxwell’s equations are written on the as-

sumption that the medium is a continuous one, the wave actually in-

teracts with and sets in motion the microscopic charged constituents as

individual particles. We make the assumption that the response of any

single microscopic constituent is independent of that of the others, which

holds for linear dielectrics. Moreover, we analyze the interaction between

the charges and the field in classical terms, since such an analysis ex-

plains correctly the general nature of the dispersion curve as shown in

fig. 1.7.

In the case of a dielectric, the microscopic constituents of relevance, for

frequency ranges of considerable extent, are the electrons bound in the

molecules of the medium. For our purpose, we consider a molecule to be

made up of one or more bound electrons and a positively charged ionic

core where, in the absence of an electromagnetic field, the charge centres

of the core and of the electrons coincide (i.e., we assume

the molecules to be non-polar; the general nature of the dispersion curve

remains the same in the case of polar molecules as well).

One more assumption that we make in the classical theory is that the

electrons are harmonically bound with the ionic cores. In other words,

each electron, when not under the influence of the external electromag-

netic field, oscillates about its mean position with some characteristic

frequency, say, ω0, where the frequency is independent of the direction of

oscillation (i.e., the electron can be looked upon as an isotropic harmonic

oscillator). Assuming, then, that the electric intensity at the location of

the electron is given by (1.86), the equation of forced oscillation of the

electron is seen to be of the form

$$m\frac{d^2x}{dt^2} + \eta\frac{dx}{dt} + m\omega_0^2 x = -eE_0 e^{-i\omega t}, \tag{1.87}$$

where, for the sake of simplicity (but without loss of generality), we assume

the electron to be located at z = 0. Here m and −e stand for the

mass and charge of the electron respectively, and η stands for a damping

constant, assumed in order to account for the energy dissipation associ-

ated with the passage of the wave through the dielectric. Note that, in the

above equation, the displacement x of the electron from its mean position

appears in the complex form, where the actual displacement corresponds

to its real part.

We do not enter here into the microscopic theory for the damping constant η. Strictly

speaking, the theory describing the response of the bound electrons to the electromag-

netic field is to be built up on the basis of quantum theory. Within the framework of

this theory, one of the factors playing an important role in the determination of η is the

lifetime of the excited states of the electron bound to its ionic core.

The steady state solution of (1.87), i.e., the one corresponding to a har-

monic oscillation with frequency ω, works out to

$$x = \frac{eE_0}{m(\omega^2 - \omega_0^2) + i\omega\eta}\,e^{-i\omega t}. \tag{1.88}$$

This corresponds to an oscillating dipole moment produced by the field,

given by

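$$
\begin{aligned}
\mathbf{p} &= -e x\,\hat{\mathbf{e}}_1 = \epsilon_0 \alpha \mathbf{E},\\
\text{where}\quad \alpha &= \frac{1}{\epsilon_0}\,\frac{e^2}{\left[m^2(\omega_0^2 - \omega^2)^2 + \omega^2\eta^2\right]^{1/2}}\,e^{i\phi},\\
\text{and}\quad \phi &= \tan^{-1}\frac{\omega\eta}{m(\omega_0^2 - \omega^2)}.
\end{aligned}
\tag{1.89}
$$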

The constant α is termed the electronic polarizability of the atom or

molecule concerned. It constitutes the link between the macroscopic

property of the dielectric relating to its response to the electromagnetic

field and the microscopic constituents making up the medium. If there

be N number of bound electrons per unit volume with frequency ω0, then

the dipole moment per unit volume, i.e., the polarization vector resulting

from the propagating plane wave is given by

P = Nǫ0αE, (1.90)

and hence, the dielectric susceptibility of the medium at frequency ω is

seen to be

$$\chi_E(\omega) = N\alpha = \frac{N}{\epsilon_0}\,\frac{e^2}{\left[m^2(\omega^2 - \omega_0^2)^2 + \omega^2\eta^2\right]^{1/2}}\,e^{i\phi}. \tag{1.91}$$

Finally, the relative permittivity εr(ω) (see eq. (1.15a)) is obtained as

$$\epsilon_r = 1 + \chi_E = 1 + \frac{N}{\epsilon_0}\,\frac{e^2}{\left[m^2(\omega^2 - \omega_0^2)^2 + \omega^2\eta^2\right]^{1/2}}\,e^{i\phi}. \tag{1.92}$$

This formula captures the essential feature of dispersion in a dielectric,

namely, the dependence of the relative permittivity and hence, of the re-

fractive index (refer to eq. (1.52e)) on the frequency ω. One has to keep

in mind, though, that it needs a number of improvements and interpre-

tations before it can be related to quantities of actual physical interest

because it is just a first estimate and holds only for a dilute gas. For

instance, it has been derived on the assumption that the field produc-

ing the polarization is the same as the macroscopically defined field ob-

tained by averaging over microscopic fluctuations. This brings in the

question of what is referred to as the ‘local field’, to be briefly introduced

in sec. 1.15.2.1, where a more general formula is set up. However, before

outlining these considerations, it will be useful to look at a few important

conclusions of a general nature that can be drawn from the above

formula.

Note, first of all, that the relative permittivity is a complex quantity hav-

ing a real and an imaginary part. Looking closely at the formula, the

imaginary part is seen to be of appreciable magnitude only over a range

of frequencies around ω0 where the response of the electron to the elec-

tromagnetic field is the strongest, being in the nature of a resonant one,

and involves a relatively large rate of energy transfer from the electromag-

netic field to the medium, causing an appreciable damping of the wave,

characterized by the damping constant η. For frequencies away from ω0

(referred to as the ‘resonant frequency’), the relative permittivity is dom-

inated by its real part, where the variation of the latter is, once again,

appreciable only for frequencies close to ω0.
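The behaviour of the real and imaginary parts described above can be seen in a short numerical sketch. The code below is an illustration only: the resonant frequency, damping constant and number density are assumed values chosen to mimic a dilute gas, and the function names are hypothetical. It evaluates the polarizability of (1.89) in its undivided complex form, e²/(ε0[m(ω0² − ω²) − iωη]), and then the relative permittivity of (1.92).

```python
import numpy as np

E_CHARGE = 1.602176634e-19    # elementary charge e, in C
E_MASS = 9.1093837015e-31     # electron mass m, in kg
EPS0 = 8.8541878128e-12       # vacuum permittivity, in F/m


def polarizability(omega, omega0, eta):
    """Complex alpha: e^2 / (eps0 * [m (omega0^2 - omega^2) - i omega eta]).

    This is eq. (1.89) before it is split into modulus and phase.
    """
    return E_CHARGE**2 / (
        EPS0 * (E_MASS * (omega0**2 - omega**2) - 1j * omega * eta)
    )


def relative_permittivity(omega, omega0, eta, number_density):
    """eps_r = 1 + N * alpha, eq. (1.92); a first estimate for a dilute medium."""
    return 1.0 + number_density * polarizability(omega, omega0, eta)


# Illustrative (assumed) parameters, not taken from the text.
omega0 = 1.0e16                 # resonant angular frequency, rad/s
eta = E_MASS * 1.0e14           # damping constant, kg/s
number_density = 1.0e25         # bound electrons per m^3

omega = np.linspace(0.5e16, 1.5e16, 2001)
eps_r = relative_permittivity(omega, omega0, eta, number_density)

peak = omega[np.argmax(eps_r.imag)]
print(f"Im(eps_r) peaks at omega = {peak:.3e} rad/s (omega0 = {omega0:.3e})")
print(f"Re(eps_r) - 1 well below resonance: {eps_r.real[0] - 1:.3e}")
print(f"Re(eps_r) - 1 well above resonance: {eps_r.real[-1] - 1:.3e}")
```

Working with the complex quotient directly avoids having to choose the correct branch of the inverse tangent for the phase φ when ω exceeds ω0.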

Even as the relative permittivity works out to be of a complex value (recall

that the relative permeability µr has been assumed to be ≈ 1 for the sake

of simplicity), the formula (1.52a) continues to represent a plane wave

solution, in the complex form, to Maxwell’s equations in the dielectric

under consideration where, now, the wave vector

$$\mathbf{k} = k\,\hat{\mathbf{n}}, \tag{1.93a}$$

is a complex one, with k, v, n acquiring complex values by virtue of εr being complex:

$$k = \frac{\omega}{v}, \qquad v = \frac{c}{n}, \qquad n = \sqrt{\epsilon_r \mu_r}. \tag{1.93b}$$

Expressing εr, n, k in terms of real and imaginary parts (and continuing to assume that µr ≈ 1), we write

$$n = n_R + i n_I = \sqrt{\epsilon_{rR} + i\epsilon_{rI}}, \qquad k = k_R + i k_I = \frac{\omega}{c}\,(n_R + i n_I). \tag{1.93c}$$

The plane wave solution (1.86) then becomes

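$$\mathbf{E} = \hat{\mathbf{e}}_1 E_0 \exp\left[i\left(\frac{\omega}{c}(n_R + i n_I)z - \omega t\right)\right] = \hat{\mathbf{e}}_1 E_0\, e^{-k_I z}\exp[i(k_R z - \omega t)]. \tag{1.94a}$$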

Note from (1.94a) that the amplitude of the electric intensity decreases

exponentially with the distance of propagation z, as a result of which

the intensity of the wave also decreases exponentially. In order to work

out the expression for intensity, one observes that the magnetic vector H

corresponding to (1.94a) is given by

$$\mathbf{H} = \frac{E_0}{\mu_0\omega}\,\hat{\mathbf{e}}_2\,(k_R + i k_I)\,e^{-k_I z}\exp[i(k_R z - \omega t)], \tag{1.94b}$$

telling us, among other things, that there is a phase difference between

E and H (because of the presence of the complex factor k = kR + ikI on

the right hand side), in contrast to the case where the wave propagates

without dispersion or absorption.

One can now calculate the time averaged Poynting vector ⟨S⟩ = (1/4)⟨(E × H* + E* × H)⟩, from which the intensity due to the wave works out to

$$I = \frac{1}{2}\sqrt{\frac{\epsilon_0}{\mu_0}}\; n_R E_0^2\, e^{-2k_I z}. \tag{1.95}$$

This can be compared with (1.55), the expression for intensity in the absence of dispersion and absorption, which can be written as I = (1/2)√(ε0/µ0) n E0². One observes that n gets replaced with nR, the real part of the complex refractive index and, in addition, the intensity decreases exponentially with the distance of propagation z, getting attenuated by a factor of 1/e at a distance d = 1/(2kI). In other words, while the imaginary part of k (or, equivalently, of n) determines the attenuation of the wave, its real part determines the phase Φ (= kR z − ωt = (ω/c) nR z − ωt).
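As a small numerical illustration of these relations, the sketch below takes an assumed complex relative permittivity and an assumed angular frequency (both hypothetical values, as are the function names), obtains nR and nI from the first relation in (1.93c) with µr = 1, and evaluates the distance d = 1/(2kI) over which the intensity in (1.95) falls by a factor of 1/e.

```python
import cmath
import math

C_LIGHT = 299_792_458.0   # speed of light in vacuum, m/s


def complex_refractive_index(eps_r):
    """n = sqrt(eps_r) with mu_r = 1, eq. (1.93c); principal root, so n_R >= 0."""
    return cmath.sqrt(eps_r)


def attenuation_length(omega, n):
    """Distance d = 1 / (2 k_I) over which the intensity falls by 1/e."""
    k_imag = omega * n.imag / C_LIGHT
    return 1.0 / (2.0 * k_imag)


def intensity_ratio(omega, n, z):
    """I(z) / I(0) = exp(-2 k_I z), from eq. (1.95)."""
    k_imag = omega * n.imag / C_LIGHT
    return math.exp(-2.0 * k_imag * z)


eps_r = 2.25 + 0.01j          # assumed illustrative value
omega = 3.0e15                # assumed angular frequency, rad/s
n = complex_refractive_index(eps_r)
d = attenuation_length(omega, n)
print(f"n_R = {n.real:.6f}, n_I = {n.imag:.6f}")
print(f"1/e attenuation length d = {d:.3e} m")
print(f"I(d)/I(0) = {intensity_ratio(omega, n, d):.6f}")
```

The principal square root is the appropriate branch here because, for a passive medium with a non-negative imaginary part of εr, it yields nR > 0 and nI ≥ 0, i.e., a wave that is attenuated in its direction of propagation.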

Looking back at sec. 1.12.2, one observes that it is nR that is to be used

in Snell’s law relating the angles of incidence and refraction when light

is refracted from vacuum into the dielectric under consideration, since

Snell’s law is arrived at from the continuity of the phases of the incident

and refracted waves. Similarly, in the case of refraction from one dielec-

tric medium to another, the relative refractive index actually stands for

the ratio of the real parts of the complex refractive indices.

Fig. 1.8 depicts schematically the variation of nR and nI with ω, as ob-

tained from (1.92) and the first relation in (1.93c). One observes that the

trend of increase of nR with ω for frequencies away from ω0 is reversed

near ω0 where, moreover, nI acquires an appreciable value.

[Plot: nR and nI against frequency ω, with the resonant frequency ω0 marked on the horizontal axis.]

Figure 1.8: Depicting schematically the variation of nR and nI with ω, as obtained from (1.92) and the first relation in (1.93c); one observes that, for frequencies away from ω0, nR increases slowly with ω and nI has a small value; close to ω0, on the other hand, nR shows a sharp decrease while nI acquires an appreciable value, corresponding to pronounced absorption owing to the occurrence of a resonance in the forced oscillations of the electrons in the dielectric.

1.15.2 Dispersion: further considerations

1.15.2.1 The local field: Clausius-Mossotti relation

In writing the equation of motion (1.87) of a bound electron, the field

causing its forced oscillations has been assumed to be the field E of the

plane wave described by the Maxwell equations for the medium. The lat-

ter, however, is a macroscopic quantity that is obtained by an appropriate

space time averaging over the microscopically varying field intensities as-

sociated with microscopic charges and currents in the medium. Assum-

ing that an averaging over short times (corresponding to rapid variations

of microscopic origin) has been performed, there remains the small scale

spatial variations of the microscopic field. The local field that causes the

polarization of an atom by inducing forced oscillations in its charge dis-

tribution differs from the field obtained by averaging over all the atoms of

the dielectric. The relation between the two can be worked out under the

assumption of a symmetric distribution of the atoms in the neighbour-

hood of the atom under consideration or else, under the assumption of a

random distribution.

In either of the above two types of local arrangement of the atoms one

obtains, instead of (1.91), the following formula relating the macroscop-

ically and microscopically defined quantities, respectively χE and α, the

former characterising the medium in the continuum approximation and

the latter the atom considered as an individual entity,

$$\chi_E = \frac{N\alpha}{1 - \frac{1}{3}N\alpha}. \tag{1.96a}$$

Correspondingly, the expression for the relative permittivity in terms of

the atomic polarizability is seen to be

$$\epsilon_r = \frac{1 + \frac{2}{3}N\alpha}{1 - \frac{1}{3}N\alpha}. \tag{1.96b}$$

Though derived under relatively restrictive assumptions, this formula,

referred to as the Clausius-Mossotti relation, is found to hold quite well

for a large number of dielectric materials, including those in solid or liquid

forms. This leads to a modification of (1.92), though the general nature of

the dispersion curve (fig. 1.8) remains the same. In the case of a gaseous

medium, on the other hand, one has Nα << 1, and thus χE ≈ Nα, as a

result of which (1.92) holds.

A variant of the Clausius-Mossotti relation, written with n² replacing εr, is referred to as

the Lorentz-Lorenz relation.
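The size of the local field correction can be gauged with a few lines of Python. In the sketch below the function names and the sample values of Nα are assumptions made for illustration. It compares χE = Nα of (1.91) with the corrected form (1.96a), evaluates (1.96b), and inverts the latter to recover Nα from εr.

```python
def chi_dilute(n_alpha):
    """Susceptibility without the local field correction: chi = N*alpha, eq. (1.91)."""
    return n_alpha


def chi_local_field(n_alpha):
    """Susceptibility with the local field correction, eq. (1.96a)."""
    return n_alpha / (1.0 - n_alpha / 3.0)


def eps_r_local_field(n_alpha):
    """Relative permittivity from eq. (1.96b)."""
    return (1.0 + 2.0 * n_alpha / 3.0) / (1.0 - n_alpha / 3.0)


def n_alpha_from_eps_r(eps_r):
    """Invert (1.96b): N*alpha = 3 (eps_r - 1) / (eps_r + 2)."""
    return 3.0 * (eps_r - 1.0) / (eps_r + 2.0)


for n_alpha in (1.0e-3, 0.1, 0.9):      # assumed sample values of N*alpha
    print(f"N*alpha = {n_alpha:g}: chi (dilute) = {chi_dilute(n_alpha):.6g}, "
          f"chi (local field) = {chi_local_field(n_alpha):.6g}, "
          f"eps_r = {eps_r_local_field(n_alpha):.6g}")
```

For Nα much smaller than unity the two susceptibilities nearly coincide, which is the dilute gas limit mentioned above; the functions accept complex arguments as well, so a complex polarizability can be passed in unchanged.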

1.15.2.2 Dispersion: the general formula

In inducing an oscillating dipole moment in an atom, a propagating wave

sets up forced oscillations in all the electrons bound in it, not all of which

are characterized by a single natural frequency ω0. One can, however,

assume to a good degree of approximation that the electrons respond to

the electromagnetic field independently of one another, in which case the

dipole moment per unit volume is obtained simply as the sum of dipole

moments due to electrons with various different frequencies ωj. Assum-

ing that there are, on the average, a fraction fj of the bound electrons in

the medium with frequency ωj and with damping constant ηj (j = 1, 2, . . .),

and that there is a total of N bound electrons per unit volume, one ob-

tains the following relation for the frequency dependence of the complex

relative permittivity,

$$\epsilon_r = 1 + \frac{Ne^2}{\epsilon_0}\sum_j \frac{f_j}{m(\omega_j^2 - \omega^2) - i\omega\eta_j} \qquad \Big(\sum_j f_j = 1\Big). \tag{1.97}$$

One can now make use of (1.93b), (1.93c) to evaluate kI (and hence the

attenuation coefficient 2kI, refer to eq. (1.95)) and nR, the refractive index

that relates the angles of incidence and refraction when the plane wave

is refracted from free space into the dielectric.

The general nature of the graph depicting the variation of nR with ω re-

mains the same as in fig. 1.7, where now the narrow frequency ranges

involving a rapid decrease of nR with ω (anomalous dispersion) can be

identified as those around the resonant frequencies ωj (j = 1, 2, . . .), the

typical width of the range of anomalous dispersion around the frequency

ωj being ∼ ηj/m.

Within each range of anomalous dispersion, nI (recall the relation kI = (ω/c) nI)

varies as in fig. 1.8 implying enhanced attenuation of the wave, while

away from the resonant frequencies, the attenuation is, for most pur-

poses, negligibly small. For such frequencies away from the resonances,

the dispersion is seen to be normal, i.e., characterized by a slow increase

of nR with frequency.

Evidently, the role of damping, characterized by the damping constants

ηj (j = 1, 2, . . .) becomes important near the resonant frequencies where

there occurs an irreversible transfer of energy from the wave to the di-

electric medium through the forced oscillations of the electrons. Away

from the resonances, on the other hand, the reversible energy transfer

between the wave and the oscillating electrons dominates over the irre-

versible process of energy dissipation.
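The alternation of normal and anomalous dispersion can be reproduced numerically from (1.97). The following sketch is illustrative only: the three resonant frequencies, the damping constants, the fractions fj and the number density are all assumed values, and the function name `eps_r_multi` is hypothetical. It evaluates εr on a frequency grid, takes nR as the real part of the principal square root (with µr = 1), and inspects the sign of the slope of nR at and between the resonances.

```python
import numpy as np

E_CHARGE = 1.602176634e-19    # elementary charge e, in C
E_MASS = 9.1093837015e-31     # electron mass m, in kg
EPS0 = 8.8541878128e-12       # vacuum permittivity, in F/m


def eps_r_multi(omega, number_density, fractions, omegas_j, etas_j):
    """Complex relative permittivity from eq. (1.97).

    fractions, omegas_j, etas_j are sequences of equal length holding
    f_j, omega_j and eta_j; the fractions must add up to one.
    """
    f = np.asarray(fractions, dtype=float)
    w_j = np.asarray(omegas_j, dtype=float)
    eta_j = np.asarray(etas_j, dtype=float)
    if not np.isclose(f.sum(), 1.0):
        raise ValueError("fractions f_j must sum to 1")
    w = np.asarray(omega, dtype=float)[..., np.newaxis]   # broadcast over j
    terms = f / (E_MASS * (w_j**2 - w**2) - 1j * w * eta_j)
    return 1.0 + number_density * E_CHARGE**2 / EPS0 * terms.sum(axis=-1)


# Assumed illustrative parameters (three resonances, as in fig. 1.7).
omegas_j = [4.0e15, 8.0e15, 1.2e16]          # rad/s
etas_j = [E_MASS * 1.0e14] * 3               # kg/s
fractions = [0.3, 0.3, 0.4]
number_density = 1.0e25                      # per m^3

omega = np.linspace(2.0e15, 1.4e16, 12001)
n = np.sqrt(eps_r_multi(omega, number_density, fractions, omegas_j, etas_j))
slope = np.gradient(n.real, omega)           # numerical d n_R / d omega

for w_probe in omegas_j + [6.0e15, 1.0e16]:
    i = np.argmin(np.abs(omega - w_probe))
    kind = "anomalous" if slope[i] < 0 else "normal"
    print(f"omega = {w_probe:.1e}: dn_R/domega = {slope[i]: .2e} ({kind}), "
          f"n_I = {n.imag[i]:.2e}")
```

With these assumed numbers the width ηj/m of each anomalous range is 10¹⁴ rad/s, small compared with the spacing of the resonances, so the ranges of anomalous dispersion remain well separated.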

1.15.2.3 The distribution of resonant frequencies

Referring to the electromagnetic spectrum from very low to very high fre-

quencies, the resonant frequencies ωj (j = 1, 2, . . .) are found to be dis-

tributed over the spectrum in a manner characteristic of the dielectric

under consideration. For a colourless transparent medium, none of the

resonant frequencies reside in the visible part of the spectrum, while for

a coloured substance one or more of these fall within the visible region

(recall that frequencies close to resonant ones correspond to pronounced

absorption).

1.15.2.4 Types of microscopic response

The theory of dispersion is intimately tied up with that of atomic and

molecular scattering of electromagnetic waves, and related processes of

atomic absorption and radiation. An electromagnetic wave propagating

through a medium interacts with individual atoms and molecules as also

with atomic aggregates, such as the collective vibrational modes of a crys-

talline material. Even within a single atom or molecule, there arises the

response of the ionic core, which executes a forced oscillation analogous

to the electrons. Since the ionic core is much more massive than the

electrons, the characteristic frequency of the ionic vibrations is comparatively

much smaller, commonly falling within the infrared part of the

spectrum. The interaction of the electromagnetic field with the rotational

and vibrational modes of the molecules may also play important roles in

determining dispersion and absorption in certain frequency ranges, es-

pecially in the infrared and microwave parts of the spectrum. Finally,

for a conducting medium, the electromagnetic wave may induce forced

oscillations of the pool of free electrons, which contributes significantly to

dispersion and absorption.

1.15.2.5 The quantum theory of dispersion

The expression for the complex relative permittivity, from which one can

deduce the real and imaginary parts of the complex propagation constant

k and of the refractive index, involves, for a given dielectric, a number

of characteristic constants (see formula (1.97)), namely the resonant fre-

quencies ωj, the damping constants ηj, and the fractions fj. A complete

theory of dispersion requires that all of these constants characterizing a

medium be determined in a consistent theoretical scheme. As mentioned

above, this requires, in turn, detailed considerations relating to the inter-

action of an electromagnetic field with the atoms, molecules, and atomic

aggregates of the medium, and hence must make use of quantum princi-

ples.

The quantum theoretic approach differs from the classical theory both in

its fundamental premises and in detailed considerations. For instance,

it takes into account the stationary states of the electrons in the inverse

square Coulomb field in an atom, with no reference to their harmonic

oscillations, the latter being an ad hoc assumption in the classical the-

ory. The ‘natural frequencies’ ωj of the classical theory are then replaced

with the frequencies of transition between these stationary states. The

fractions fj are related in the theory to the probabilities of these

transitions, where the fundamental quantum constant h (the Planck constant)

makes its appearance. What is more, the theory allows for the fact that,

in the presence of the electromagnetic field, the stationary excited states

of the electrons, obtained for an isolated atom, are no longer truly station-

ary, and each such state actually has a certain lifetime associated with it.

As I have mentioned above, these lifetimes of the excited states are made

use of in accounting for the damping constants ηj of the classical theory.

With all this, however, the final quantum theoretic results do not contra-

dict but provide support for the general form of the frequency dependence

of the complex relative permittivity (eq. (1.97)). In other words, the quan-

tum considerations supply a rigorous theoretical basis for the constants

ωj, fj, ηj (j = 1, 2, . . .) of the classical theory.

1.15.2.6 Low frequency and high frequency limits in dispersion

It is of interest to look at the low frequency and high frequency limits of

the dispersion formula (1.97), though these limits are not of direct rele-

vance in optics. As can be seen from this formula, the relative permittivity

approaches a constant real value in the limit ω → 0,

$$\epsilon_r^{\mathrm{stat}} = 1 + \frac{Ne^2}{\epsilon_0}\sum_j \frac{f_j}{m\omega_j^2}, \tag{1.98}$$

which is therefore the static dielectric constant of the medium under con-

sideration.

In the high frequency limit, on the other hand, the amplitude of forced os-

cillations of the electrons becomes negligibly small regardless of whether

these are bound or free, and their response to the electromagnetic wave

is dominated by inertia. This results in the value ǫr → 1 from the lower

side, where the limiting form of ǫr(ω) is

$$\epsilon_r \approx 1 - \frac{\omega_p^2}{\omega^2}, \tag{1.99a}$$

and where the plasma frequency ωp of the dielectric is given by

$$\omega_p \equiv \sqrt{\frac{Ne^2}{\epsilon_0 m}}. \tag{1.99b}$$

This is an important and interesting result: electromagnetic waves of

very high frequency propagate through a dielectric with a phase velocity

slightly larger than c, which approaches the value c for ω →∞. Thus, the

refractive index of a dielectric for X-rays is usually less than unity, as a

result of which the X-rays can suffer total external reflection when made

to pass from vacuum into the dielectric.
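The two limits can be checked numerically. What follows is only a minimal sketch: it assumes that the dispersion formula (1.97) has the Lorentz form εr(ω) = 1 + ωp² Σj fj/(ωj² − ω² − iγjω), with γj = ηj/m and ωp² = Ne²/(ε0m). This form is inferred from eqs. (1.98) to (1.100) rather than quoted, and the oscillator parameters are hypothetical numbers in units where ωp = 1.

```python
def eps_r(omega, omega_p, oscillators):
    """Relative permittivity for an ASSUMED Lorentz form of formula (1.97).

    oscillators: list of (f_j, omega_j, gamma_j), with gamma_j = eta_j / m.
    A time dependence exp(-i omega t) is assumed, as in eq. (1.111).
    """
    total = 0j
    for f_j, omega_j, gamma_j in oscillators:
        total += f_j / (omega_j**2 - omega**2 - 1j * gamma_j * omega)
    return 1.0 + omega_p**2 * total

# Hypothetical two-resonance dielectric; the fractions f_j sum to unity.
OMEGA_P = 1.0
OSC = [(0.6, 2.0, 0.05), (0.4, 5.0, 0.10)]

# Static limit, eq. (1.98): 1 + omega_p^2 * sum_j f_j / omega_j^2
eps_static = 1.0 + OMEGA_P**2 * sum(f / w**2 for f, w, _ in OSC)
eps_low = eps_r(1e-6, OMEGA_P, OSC)

# High frequency limit, eq. (1.99a): 1 - omega_p^2 / omega^2
omega_high = 1e3
eps_high = eps_r(omega_high, OMEGA_P, OSC)
eps_high_limit = 1.0 - OMEGA_P**2 / omega_high**2

print(eps_low, eps_static)
print(eps_high, eps_high_limit)
```

In this sketch the high frequency value lies just below unity, which corresponds to a phase velocity c/nR slightly larger than c.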

1.15.2.7 Wave propagation in conducting media

One can, in the context of dispersion, consider the passage of electro-

magnetic waves through a conducting medium as well. As mentioned in

sec. 1.2.3, a conductor is characterized by a conductivity σ (which we

assume to be a scalar, corresponding to an isotropic medium). From a

microscopic point of view, the conductivity arises by virtue of the pool

of free electrons in the material, which distinguishes a conductor from

a dielectric. However, the distinction is significant only under station-

ary conditions (i.e., stationary electric and magnetic fields and stationary

currents) while, under time dependent conditions (as in the case of har-

monic time dependence due to a propagating electromagnetic wave) the

behaviour of a conductor becomes, in principle, analogous to that of a di-

electric, the similarity between the two being especially apparent at high

frequencies.

In particular, an electromagnetic wave sets up forced oscillations in the

pool of free electrons, thereby causing the polarisation vector to oscil-

late harmonically. This corresponds to a dispersion formula analogous

to (1.97) with, however, a resonant frequency ω0 = 0, corresponding to the

fact that the electrons are not bound to individual atoms. Correspond-

ingly, the propagation of an electromagnetic wave through the conductor

can be described in terms of a permittivity with a frequency dependence

of the form

$$\epsilon_r(\omega) = \epsilon_{r0}(\omega) - \frac{N}{\epsilon_0 m^{*}\omega}\,\frac{e^2 f_0}{\omega + i\gamma}, \tag{1.100}$$

where ǫr0 represents the response due to factors other than the free elec-

trons, m∗ stands for the effective mass of the conduction electrons, m∗γ(=

η) denotes an effective damping factor, and f0 stands for the number of

free electrons as a fraction of the total number of electrons.

1. The electrons in a conductor, commonly a crystalline solid, are distributed in

energy bands, where the ones belonging to the band highest up in the energy scale

(the conduction band) act as carriers of current in the presence of an externally

imposed weak electric field. While this band is a partially filled one, the other

bands, lower down in the energy scale, are all fully filled (with only few vacancies

generated by the thermal motion of the electrons). The wave functions of these

electrons are spread throughout the crystalline lattice, but nevertheless, these

behave in a manner analogous to the bound electrons in a dielectric in that they

cannot act as carriers contributing to the electric current. The contribution of

these electrons to the relative permittivity is denoted above by ǫr0(ω), which tends

to unity at high frequencies and to a real constant ǫr0(0) at low ones. The latter,

however, is not of much significance since, for ω → 0, the contribution of the free

electrons (those in the conduction band) diverges to an infinitely large value (while

being imaginary, see below) and dominates over that of the bound ones.

2. The effective mass m∗ in the above formula appears because of the fact that the

conduction electrons are not truly free ones, but move around in a spatially peri-

odic field produced by the ions making up the crystalline lattice.

Indeed, the second term on the right hand side of (1.100) is only an approximate

expression for the response of the free electrons in a conductor. A more accurate

theory takes into consideration the quantum features of the response, including

the ones resulting from the distribution of these electrons in the energy levels

making up the conduction band. Replacing the electron mass m with the effective

mass m∗ is a simple but fruitful way of taking into account the quantum features,

while still falling short of being a complete theory.

Looking at this basic dispersion formula for a conductor, one distin-

guishes between two regimes. In the low frequency or ‘static’ regime

(ω << γ), one has

$$\epsilon_r \approx \epsilon_{r0}(0) + i\,\frac{\omega_p^2}{\gamma\omega}, \tag{1.101}$$

where

$$\omega_p \equiv \sqrt{\frac{Ne^2 f_0}{\epsilon_0 m^{*}}} \tag{1.102}$$

is the plasma frequency of the conductor.

In the dynamic regime, for a harmonic time variation with frequency ω,

the relative permittivity, given by the formula (1.100), leads to a num-

ber of characteristic features in the propagation of electromagnetic waves

through the conductor and in reflection from conducting surfaces (see

sec. 1.15.3, where a few of these features will be indicated for the case of

a plane monochromatic wave).

The commonly adopted way of characterizing a conductor is in terms of its

conductivity. In reality, the conductivity σ is complex and depends on the

frequency ω, where the low frequency behaviour of the conductor depends

on the static conductivity σ0. While σ(ω) is determined by the response

of the free electrons to an impressed electromagnetic field, the response

of the remaining electrons, lower down in the energy scale, determines

ǫr0(ω) appearing on the right hand side of (1.100). As indicated above, an

equivalent way of characterizing the response of a conductor is in terms

of ǫr(ω) appearing on the left hand side of the same equation, along with

the static conductivity.

Referring to the Maxwell equation (1.1d) and to a harmonic wave, these

two ways of describing the behaviour of a conductor correspond to the

two sides of the following formula

$$-i\omega\epsilon_0\epsilon_r = \sigma - i\omega\epsilon_0\epsilon_{r0}, \tag{1.103a}$$

(check this out), which simplifies to

$$\epsilon_r = \epsilon_{r0} - \frac{\sigma}{i\epsilon_0\omega}. \tag{1.103b}$$
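The check suggested here takes only a few lines. The sketch below assumes that (1.1d) is the Ampère-Maxwell equation in the form ∇ × H = J + ∂D/∂t, that all fields vary as e^{−iωt}, and that the medium is isotropic; these are assumptions made for illustration, since the equation is not reproduced in this section.

```latex
\begin{align*}
% Free electrons treated as carriers of a conduction current J = \sigma E,
% the remaining electrons being described by \epsilon_{r0}:
\nabla\times\mathbf{H} &= \sigma\mathbf{E} - i\omega\epsilon_0\epsilon_{r0}\mathbf{E}, \\
% All electrons lumped into a single complex permittivity \epsilon_r:
\nabla\times\mathbf{H} &= -i\omega\epsilon_0\epsilon_r\mathbf{E}, \\
% Equating the two right hand sides gives (1.103a), and dividing by
% -i\omega\epsilon_0 gives (1.103b):
\epsilon_r &= \epsilon_{r0} - \frac{\sigma}{i\epsilon_0\omega}
            = \epsilon_{r0} + \frac{i\sigma}{\epsilon_0\omega}.
\end{align*}
```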

Comparing with eq. (1.100), one obtains the frequency dependence of the

complex conductivity σ,

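$$\sigma(\omega) = \frac{Ne^2 f_0}{m^{*}\gamma}\,\frac{1}{1 - \frac{i\omega}{\gamma}} = \frac{\sigma_0}{1 - \frac{i\omega}{\gamma}}, \tag{1.104a}$$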

where

$$\sigma_0 = \frac{Ne^2 f_0}{m^{*}\gamma} \tag{1.104b}$$

stands for the static conductivity of the conductor.
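The consistency of eqs. (1.100), (1.103b), and (1.104a) can be verified numerically. The following is a minimal sketch, not a model of any real metal: the parameter values are hypothetical, frequencies are measured in units of γ, and the conductivity is handled through the ratio σ/ε0, for which (1.102) and (1.104b) give σ0/ε0 = ωp²/γ.

```python
def drude_eps(omega, eps_r0, omega_p, gamma):
    """Eq. (1.100), written with omega_p^2 = N e^2 f0 / (eps0 m*), eq. (1.102)."""
    return eps_r0 - omega_p**2 / (omega * (omega + 1j * gamma))

def drude_sigma_over_eps0(omega, omega_p, gamma):
    """sigma(omega)/eps0 from eq. (1.104a), using sigma0/eps0 = omega_p^2/gamma."""
    return (omega_p**2 / gamma) / (1.0 - 1j * omega / gamma)

# Illustrative (hypothetical) parameters in units where gamma = 1.
EPS_R0, OMEGA_P, GAMMA = 1.0, 50.0, 1.0

for omega in (1e-3, 1.0, 1e3):
    lhs = drude_eps(omega, EPS_R0, OMEGA_P, GAMMA)
    # Eq. (1.103b): eps_r = eps_r0 - sigma / (i eps0 omega)
    rhs = EPS_R0 - drude_sigma_over_eps0(omega, OMEGA_P, GAMMA) / (1j * omega)
    print(omega, lhs, rhs)

# Low frequency regime (omega << gamma), eq. (1.101)
omega_low = 1e-3
eps_low = drude_eps(omega_low, EPS_R0, OMEGA_P, GAMMA)
eps_low_approx = EPS_R0 + 1j * OMEGA_P**2 / (GAMMA * omega_low)
print(eps_low, eps_low_approx)
```

The low frequency comparison shows the large imaginary part of (1.101) dominating; the approximation drops a frequency independent real term of order ωp²/γ², which is small by comparison for ω << γ.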

Section 1.15.3 carries a brief outline of absorption in a conducting medium

and of reflection from the surface of a conductor, these being character-

istic features of the response of a conductor to electromagnetic waves.

1.15.2.8 Dispersion as coherent scattering

From a microscopic point of view, dispersion is related to scattering of

electromagnetic waves by atoms and molecules. Imagine the dielectric

medium as so many atoms arranged in free space. A wave that would

propagate in free space with P = 0 would correspond to ǫr = 1. The atoms

and molecules of the dielectric, however, modify this primary wave by

adding to it the waves resulting from the scattering of the primary wave

by these. For a set of scattering centres distributed with large spacings

between one another, the scattered waves add up incoherently. If, on the

other hand, the spacings are small compared to the wavelength of the

wave, then these may be considered as forming a continuous medium,

and the waves scattered from contiguous volume elements of the medium

add up coherently (see sec 1.21 and chapter 7 for ideas relating to coher-

ent and incoherent wave fields).

The scattered waves, added up to the primary vacuum wave, produce a

resultant wave, and it is this resultant wave that is related to the po-

larization in the medium through the complex permittivity and that we

started with in eq. (1.86). While the primary wave propagates through

vacuum with a phase velocity c, the modified wave propagates with a

different phase velocity because of the phase difference between the scat-

tered wave produced by a scattering centre and the primary wave, where

the phase difference relates to the complex polarizability of the atom.

Looked at this way, one may interpret refraction as coherent scattering.

Imagine a monochromatic plane wave to be incident from vacuum on the

interface separating a dielectric medium. As the wave enters into the

dielectric, the vacuum wave is modified by the addition of the coherent

scattered waves from tiny volume elements distributed throughout the

dielectric. The superposition of all these waves gives rise to the refracted

wave moving into the dielectric along a given direction, as dictated by

Snell’s law, and with a phase velocity v = c/nR. In all other directions, the

superposition of the scattered waves with the vacuum wave results in

zero amplitude of the field vectors and hence zero intensity.

Incidentally, the frequencies of scattered waves considered above are the

same as the frequency of the primary wave, regardless of whether the

scattering is coherent or incoherent, where the coherence characteristics

determine the phase relations among the waves scattered from the

individual scatterers. In other words, each individual scatterer scatters

coherently with reference to the primary field. However, there also occurs

a radiation from the individual scatterers, with its frequency spread over

a certain range, depending on their lifetime, this radiation being incoher-

ent with reference to the incident wave. It accounts for the irreversible

energy loss from the primary wave, and its attenuation in the medium

under consideration.

Thus, there are two distinct types of incoherence involved: one relating to the wave

scattered by an individual scatterer with reference to the primary field, and the other to

the phases of the waves scattered from all the scatterers distributed in space.

1.15.2.9 Dispersion and absorption: a consequence of causality

The complex susceptibility χE(ω) can be interpreted as a ‘response func-

tion’ characterizing the dielectric, in the sense that the electric field E(r, t),

acting as the ‘cause’, results in the polarization P(r, t) as the ‘effect’. The

principle of causality applies to this cause-effect relation in that the ef-

fect at any given time t can depend only on the cause operating at times

earlier than t. One can then define a response function R(t) relating the

‘effect’ to the ‘cause’ in accordance with this principle of causality. The

Fourier transform of this function then appears as χE(ω). As a logical

consequence of the principle of causality, one finds that the imaginary part

of χE(ω) cannot be arbitrarily assumed to be zero, since it is found to be

related to the real part in a certain definite manner. In other words, ab-

sorption and dispersion are related to each other as a consequence of the

general principle of causality.

For the sake of completeness I quote here the formulae expressing the

relation between the real and the imaginary parts of the susceptibility

referred to above:

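$$\mathrm{Re}\,\chi_E(\omega) = \frac{1}{\pi}\,P\!\int_{-\infty}^{\infty} \frac{\mathrm{Im}\,\chi_E(\omega')}{\omega' - \omega}\, d\omega', \tag{1.105a}$$

$$\mathrm{Im}\,\chi_E(\omega) = -\frac{1}{\pi}\,P\!\int_{-\infty}^{\infty} \frac{\mathrm{Re}\,\chi_E(\omega')}{\omega' - \omega}\, d\omega'. \tag{1.105b}$$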

In these formulae, referred to as the Kramers-Kronig relations, the symbol

P is used to denote the principal value of an integral. These constitute the

most general requirement on the complex susceptibility that one can in-

fer on physical grounds. From the practical point of view, these are a pair

of formulae of great usefulness in optics. For instance, one can experi-

mentally determine the frequency dependence of Im(χE) for a medium by

measuring the absorption coefficient at various frequencies, from which

one can construct Re(χE(ω)), and then the refractive index as a function

of frequency, by making use of (1.105a).
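The reconstruction of Re(χE) from Im(χE) can be tried out on a model susceptibility whose real part is known in closed form. The sketch below is illustrative only: the single-resonance Lorentz form of `chi`, its parameter values, and the simple midpoint quadrature are assumptions made here, not taken from the text. The principal value is handled by sampling ω′ symmetrically about ω.

```python
import math

def chi(omega, omega_0=1.0, gamma=0.2, omega_p=1.0):
    """Single-resonance Lorentz susceptibility (illustrative model only).

    The exp(-i omega t) convention is assumed, so Im(chi) > 0 for omega > 0.
    """
    return omega_p**2 / (omega_0**2 - omega**2 - 1j * gamma * omega)

def kk_real_part(omega, half_width=50.0, step=1e-3):
    """Approximate Re chi(omega) from Im chi by means of eq. (1.105a).

    With omega' = omega +/- u sampled at midpoints u = (k + 1/2) * step,
    the singular contributions on either side of omega cancel in pairs,
    which is what the principal value prescribes.
    """
    n = int(half_width / step)
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * step
        total += (chi(omega + u).imag - chi(omega - u).imag) / u
    return total * step / math.pi

for omega in (0.5, 1.0, 1.5):
    print(omega, kk_real_part(omega), chi(omega).real)
```

The two printed columns should agree closely, with the residual difference set by the finite integration range and step size chosen in this sketch.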

1.15.2.10 Magnetic permeability: absence of dispersion

While seeking to explain the phenomenon of dispersion, we have all along

ignored the possible frequency dependence of the magnetic permeability,

and assumed that µr is close to unity. Considered from a general point

of view, the magnetic susceptibility χM (and hence the permeability) can

have frequency dependent real and imaginary parts, where the two are

to be related in accordance with the principle of causality. However, the

fact that the typical velocities of electrons in atoms are small compared to

c, may be seen to imply that the response time of the magnetization in a

medium is, in general, large compared to the time periods of all

electromagnetic waves except those of considerably low frequencies. Thus,

for frequencies even much lower than the optical ones, it is meaningless

to look for the dispersion of the magnetic susceptibility because such fre-

quencies are actually sufficiently high for µr to be close to unity (recall

that the high frequency limit for ǫr is unity, though this limiting value is

reached at much higher frequencies compared to the magnetic case).

An important exception, however, relates to artificially prepared meta-

materials that contain arrays of metallic units, where each unit is of

subwavelength dimensions (compared to the waves of frequency ranges

of relevance) and is given an appropriate shape so as to have a pro-

nounced response to the magnetic components of the waves (see sec-

tions 1.15.2.12, 1.20).

1.15.2.11 Dispersion and absorption in water

The propagation of electromagnetic waves in water constitutes a special

and interesting instance of dispersion. Water molecules have resonant

frequencies in the infrared and the microwave regions associated with

molecular rotations and vibrations, and again in the ultraviolet region

associated with electronic modes. Away from these two frequency ranges,

the refractive index varies more or less smoothly, tending to the low fre-

quency limit nR ≈ 9, attaining the value nR ≈ 1.34 in the visible part of

the spectrum, and finally tending to nR = 1 in the high frequency limit.

Within the resonant bands, the attenuation coefficient (2kI) is large by

several orders of magnitude compared to its value in the visible region.

In other words, water has a narrow transparency window precisely in the

visible part of the spectrum - a fact of immense biological significance.

At low frequencies, the attenuation is, as expected, very small for pure

water while being relatively large for sea water which behaves like a con-

ductor because of its salinity, where the conductivity is ionic rather than

electronic in origin. One finds that at all but extremely low frequencies,

sea water is characterized by a relatively large attenuation coefficient

(α ≡ 2kI) as compared to pure water. Making use of the static

conductivity (σ0) in eq. (1.103b) one finds that, at low frequencies, α goes to zero

like α ∼ √(2σ0/(ε0c²)) √ω. This remains above the value for pure water down to

the lowest frequencies attainable.
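The √ω behaviour can be compared with the attenuation obtained directly from the complex permittivity. The following is a rough sketch under stated assumptions: it uses the plane wave relation k² = (ω²/c²)εr (which appears later as eq. (1.107)), takes σ and εr0 at their static values, and adopts order-of-magnitude figures for sea water (σ0 of about 4 S/m, εr0 of about 80) that are not given in the text.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299792458.0           # speed of light in vacuum, m/s

def attenuation_coefficient(omega, sigma0, eps_r0):
    """alpha = 2 k_I from k^2 = (omega/c)^2 (eps_r0 + i sigma0 / (eps0 omega))."""
    eps_r = eps_r0 + 1j * sigma0 / (EPS0 * omega)   # eq. (1.103b), static sigma
    k = (omega / C) * cmath.sqrt(eps_r)             # principal root has k_I >= 0 here
    return 2.0 * k.imag

def attenuation_low_frequency(omega, sigma0):
    """Asymptotic form alpha ~ sqrt(2 sigma0 / (eps0 c^2)) * sqrt(omega)."""
    return math.sqrt(2.0 * sigma0 / (EPS0 * C**2)) * math.sqrt(omega)

# Assumed, order-of-magnitude values for sea water (not taken from the text).
SIGMA_SEA = 4.0     # S/m
EPS_R0_SEA = 80.0   # low frequency relative permittivity of water

for freq_hz in (1e1, 1e3, 1e5):
    omega = 2.0 * math.pi * freq_hz
    exact = attenuation_coefficient(omega, SIGMA_SEA, EPS_R0_SEA)
    approx = attenuation_low_frequency(omega, SIGMA_SEA)
    print(freq_hz, exact, approx)
```

For these assumed values the two columns agree closely across the listed frequencies, since σ0/(ε0ω) greatly exceeds εr0 throughout that range.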

The symbol α, which has been used here for the attenuation coefficient, is not to be

confused with the same symbol having been used for the polarizability (sec. 1.15.1.2).

1.15.2.12 Negative refractive index

Every material has its own characteristic response to electromagnetic

waves propagating through it, as revealed by specific features of disper-

sion in it, relating to the detailed frequency dependence of the real and

imaginary parts of the parameters ǫr and µr. However, in numerous situ-

ations of interest in optics, the magnetic parameter µr is found to be close

to unity (refer to sec. 1.15.2.10), implying that the magnetic response of

the medium is negligible to waves in the optical range of frequencies.

In other words, the magnetic field of an electromagnetic wave belonging to

the optical part of the spectrum does not interact appreciably with the mi-

croscopic constituents of the medium, and magnetic dipole moments are

not excited in a manner analogous to the excitation of electric dipole mo-

ments. As regards the latter, recall from sections 1.15.1.2 and 1.15.2.2

that oscillating electric dipole moments are produced throughout the vol-

ume of a medium by way of response to an electromagnetic wave propa-

gating in it, and it is predominantly this phenomenon that explains the

frequency dependence of the refractive index of the medium under con-

sideration.

However, the story does not end here. Up to this point, we have assumed

that the basic units in a medium responding to an electromagnetic wave

are its atoms and molecules. The typical wavelength of light (or of all but

the shortest of electromagnetic radiations) being much larger than the

atomic and molecular dimensions and their average separation, one can

assume that the atomic units are continuously distributed throughout the

medium and can express the response in terms of the two parameters

ǫr, µr, which represent averaged macroscopic features of the response

(in contrast, a precise description of the scattering from an individual

atom or molecule depends on a large number of parameters and involves

complex considerations).

Imagine, now, an array of small, sub-wavelength units arranged within a

material in such a way that it effectively acts as a continuous distribution

of matter in respect of its response to an electromagnetic wave, which will

now be determined by the response of the individual units considered

as a whole, in addition to the response due to the scattering by the

atoms and molecules making up the system. One can still describe the

response of the system by a pair of effective averaged parameters ǫr and µr

where, depending on the structure of the units, the frequency dependence

of these parameters can be quite distinct compared to the commonly en-

countered situation where the two are determined predominantly by the

response of the atomic constituents.

Such arrays of subwavelength units, mounted on appropriate substrates,

may thus constitute artificially constructed materials with novel response

to electromagnetic waves. For instance, by appropriately choosing the

material and the structure of the individual units, one can generate a

pronounced response to both the electric and magnetic components of an

electromagnetic wave over certain chosen ranges of wavelength. In par-

ticular, it is possible to produce materials with negative refractive indices

for waves in the optical part of the spectrum.

The possibility of a negative refractive index was considered by Victor

Veselago in a paper written in 1968, where he pointed out that such neg-

ative values are not incompatible with Maxwell’s equations. For instance,

if ǫr and µr for a medium are both negative (assuming that their imag-

inary parts are sufficiently small) then Maxwell’s equations require that

the negative sign of the square root in the relation n = √(εrµr) be taken in

evaluating its refractive index. The question then arises as to whether it

is possible to have a material where ǫr and µr are simultaneously negative

for the range of frequencies of interest. It is here that artificially engi-

neered materials with novel dispersion features assume relevance. These

are referred to as metamaterials.
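The choice of sign of the square root can be illustrated with a few lines of code. This is a hedged sketch rather than a general prescription: it assumes a passive medium and an e^{−iωt} time dependence, so that the physically acceptable root of n² = εrµr is the one with a non-negative imaginary part (a wave that does not grow as it propagates). That selection rule and the numerical values are assumptions introduced here for illustration.

```python
import cmath

def refractive_index(eps_r, mu_r):
    """Pick the root of n^2 = eps_r * mu_r with Im(n) >= 0.

    Assumption (not stated in the text): passive medium, exp(-i omega t)
    time dependence. For strictly real, lossless parameters the rule is
    ambiguous and a small imaginary part should be supplied.
    """
    n = cmath.sqrt(eps_r * mu_r)
    return -n if n.imag < 0 else n

# Ordinary, slightly lossy dielectric: the usual positive root survives.
n_ordinary = refractive_index(2.25 + 0.01j, 1.0 + 0.0j)

# Both parameters negative, with small imaginary parts: the negative root survives.
n_negative = refractive_index(-1.0 + 0.01j, -1.0 + 0.01j)

print(n_ordinary, n_negative)
```

With both parameters negative, the principal square root of the product has a negative imaginary part and is rejected, leaving a refractive index with a negative real part.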

Figure 1.9: Depicting schematically a planar array of nanoscale metallic units; units of the type shown are termed split ring resonators (SRRs), while those of other types are also possible; each SRR can produce a pronounced magnetic response to an electromagnetic wave in a frequency range that can be made to depend on its size and composition; a metamaterial made of such arrays can act as a medium of negative refractive index, engendering novel possibilities.

Fig. 1.9 depicts schematically an array of subwavelength metallic units,

where these units are specially designed so as to elicit a pronounced

response to the time varying magnetic field of an electromagnetic wave.

Metamaterials are commonly fabricated, making use of modern day state-

of-the-art technology, with such units of various shapes and sizes de-

pending on the type of response these are required to produce.

In sec. 1.20 I briefly outline the basic principles underlying the electro-

magnetic response of metamaterials, mentioning a few of the distinctive

features of wave propagation in a negative refractive index material. I will

also introduce the basic idea underlying transformation optics, a tech-

nique that makes possible a remarkable control over ray paths in a meta-

material.

1.15.3 Conducting media: absorption and reflection

1.15.3.1 Absorption in a conducting medium

Referring to the fundamentals of wave propagation in a conducting medium

as briefly outlined in sec. 1.15.2.7, recall that a conductor may be

characterized by a dielectric constant εr given by the expression (1.100), or by

a conductivity σ together with a dielectric constant εr0 (which relates to

the electrons that cannot act as carriers of electric current in the

conductor). In this latter description, both σ and εr0 are, in general, complex,

though at sufficiently low frequencies both become real, with σ reduc-

ing to the static conductivity σ0. Typically, the low frequency regime ex-

tends up to the microwave or the infrared part of the spectrum while, at

higher frequencies, the conductivity exhibits a frequency dependence of

the form (1.104a).

The wave equation in an isotropic conducting medium, derived from (1.1b), (1.1d),

and (1.4) reads, for a harmonic time dependence with frequency ω,

$$\nabla^2\mathbf{E} = -\left(i\mu_0\omega\sigma + \omega^2\mu_0\epsilon_{r0}\epsilon_0\right)\mathbf{E}, \tag{1.106}$$

(check this out) where we have assumed that the medium is a non-

magnetic one so that µ ≈ µ0. Making use of eq. (1.103b) and considering,

in particular, the propagation of a plane wave with wave vector k = kn in

the conductor, one obtains

$$k^2 = \frac{\omega^2}{c^2}\left(\epsilon_{r0} + \frac{i\sigma}{\epsilon_0\omega}\right) = \frac{\omega^2}{c^2}\,\epsilon_r, \tag{1.107}$$

which, according to (1.100), tells us that k is a complex quantity.

The fact that ǫr and k are complex, is a consequence of dissipation of

energy in the conductor. Correspondingly, the conductor is characterized

by a complex refractive index (n) as well. Writing the real and imaginary

parts of k and n as

$$k = k_R + i k_I, \quad n = n_R + i n_I, \quad \left(k_{R,I} = \frac{\omega}{c}\, n_{R,I}\right), \tag{1.108}$$

one can work out from (1.107) the real part of the refractive index (nR),

as also the imaginary part (nI) where the latter relates to the absorption

coefficient (α) as

$$\alpha = 2k_I = 2\,\frac{\omega}{c}\, n_I. \tag{1.109}$$

A plane wave travelling in the conducting medium gets appreciably atten-

uated as it propagates through a distance

$$d = \frac{1}{\alpha} = \frac{c}{2\omega n_I}. \tag{1.110}$$

Thus, at high frequencies, a plane wave can penetrate into the interior of

the conductor only up to a very small distance. This is referred to as the

skin effect, and d is termed the skin depth for the conductor.

The electric intensity vector for the plane wave under consideration, assuming that the

latter is a linearly polarized one, is of the form

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{e}\,A\,e^{-k_I z}\,e^{i(k_R z - \omega t)}, \tag{1.111}$$

where the unit vector n can be assumed to be along the z-axis of an appropriately chosen

Cartesian co-ordinate system, A stands for the scalar amplitude, and e is a unit vector

in the x-y plane. The corresponding magnetic intensity vector is obtained by making use

of (1.1b). While the wave is attenuated as it propagates along the z-direction, it is in the

nature of a homogeneous wave in that the surfaces of constant amplitude coincide with

those of constant real phase, both sets of surfaces being perpendicular to the z-axis.

Assuming, for the sake of simplicity, that

$$\sigma_0 \gg \epsilon_0\,\epsilon_{r0}(0)\,\omega, \tag{1.112a}$$

and that, at the same time, ω is small enough so as to cause σ, ǫr0 to

reduce to their static values (resp., σ0, ǫr0(0)), the expression for the skin

depth reduces to

$$d \approx \frac{c}{\sqrt{2}}\sqrt{\frac{\epsilon_0}{\omega\sigma_0}}. \tag{1.112b}$$
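The intermediate steps leading to this expression are short enough to spell out. The following worked sketch starts from eq. (1.107), drops εr0 in comparison with σ0/(ε0ω) as permitted by (1.112a), and uses √i = (1 + i)/√2; the choice of the root with kI > 0 corresponds to a wave decaying in the direction of propagation.

```latex
\begin{align*}
k^2 &\approx \frac{\omega^2}{c^2}\,\frac{i\sigma_0}{\epsilon_0\omega}
      = \frac{i\,\omega\sigma_0}{\epsilon_0 c^2}, \\
k &\approx (1 + i)\sqrt{\frac{\omega\sigma_0}{2\epsilon_0 c^2}}
   \quad\Longrightarrow\quad
   k_R \approx k_I \approx \sqrt{\frac{\omega\sigma_0}{2\epsilon_0 c^2}}, \\
d &= \frac{1}{\alpha} = \frac{1}{2k_I}
   \approx \frac{c}{\sqrt{2}}\sqrt{\frac{\epsilon_0}{\omega\sigma_0}}.
\end{align*}
```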

The vanishing of the field in the interior of a conductor as a consequence of the skin

effect relates to the fact that charges and currents set up within the conductor quickly

decay to vanishingly small values. For instance, a charge density set up in the conductor

decays in a characteristic time τ ∼ εr0ε0/σ.

As mentioned above, these results are valid only in the low frequency

regime where σ and ǫr0 are real, being approximated by their static values.

The high frequency regime corresponds to ω >> γ where ǫr0 ≈ 1, and

$$\epsilon_r \approx 1 - \frac{\omega_p^2}{\omega^2}, \tag{1.113}$$

as in the case of a dielectric. However, this approximation holds for a con-

ductor over a frequency range covering both ω < ωp and ω > ωp, in contrast

to a dielectric where it typically applies only for frequencies much larger

than ωp. In this regime, then, formula (1.113) implies that, for ω < ωp,

ǫr is negative, as a result of which nR = 0. This means that a wave inci-

dent on the surface of the conductor, say, from free space, is completely

reflected back with no part of the wave propagating into it, i.e., the con-

ductor is totally opaque to the wave. For ω > ωp, on the other hand, nI = 0

(and nR < 1), and the conductor becomes transparent to the radiation of

frequency ω. This transition from opacity to transparency is a notable

characteristic of conductors and is observed, for instance, in the alkali

metals across frequencies typically in the ultraviolet range.
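The transition at ω = ωp can be seen by evaluating the complex refractive index on either side of the plasma frequency. This is a minimal sketch that assumes the idealized, damping-free form (1.113) and µr = 1, with frequencies measured in units of ωp; the sample frequencies are arbitrary.

```python
import cmath

def index_from_plasma(omega, omega_p):
    """Complex refractive index n = n_R + i n_I from eq. (1.113), taking mu_r = 1."""
    eps_r = 1.0 - (omega_p / omega) ** 2
    return cmath.sqrt(complex(eps_r, 0.0))

OMEGA_P = 1.0  # frequencies measured in units of the plasma frequency
for omega in (0.5, 0.9, 1.1, 2.0):
    n = index_from_plasma(omega, OMEGA_P)
    print(omega, n.real, n.imag)

n_below = index_from_plasma(0.5, OMEGA_P)   # omega < omega_p: purely imaginary index
n_above = index_from_plasma(2.0, OMEGA_P)   # omega > omega_p: real index below unity
```

Below ωp the index is purely imaginary (nR = 0), so the field decays without propagating; above ωp it is real and less than unity, as stated in the text.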

While the description of wave propagation in a conductor looks formally

analogous to that in a dielectric, especially at high frequencies, the physics

of the process of attenuation differs in the two cases. In a dielectric, the

attenuation is principally due to the radiation from the bound electrons

caused by the propagating wave or, more precisely, by the finite lifetime

of the electronic states due to the excitation and de-excitation of the elec-

trons under the influence of the wave. In the conductor, on the other

hand, a major contribution to dissipation arises from the free electrons

drawing energy from the wave and transferring this to the crystalline

lattice by means of collisions with the vibrational modes of the latter.

1.15.3.2 Reflection from the surface of a conductor

The fact that the wave vector of a plane monochromatic wave propagating

in a conductor is necessarily complex, and that this is associated with a

complex refractive index, implies characteristic phase changes for a plane

electromagnetic wave reflected from the surface of the conductor where,

for the sake of simplicity, we assume that the wave is incident from a

dielectric with negligible absorption. In this case, the wave refracted into

the conductor is of a different nature as compared to the plane wave of

the form (1.111) in that the former is an inhomogeneous wave where the

surfaces of constant amplitude differ from those of constant phase. The

wave is attenuated in a direction perpendicular to the reflecting surface,

i.e., the surfaces of constant amplitude are parallel to this surface. The

surfaces of constant real phase, on the other hand, are determined by

an effective refractive index that depends on the parameters nR, nI, and

additionally, on the angle of incidence in the dielectric.

The phase changes involved in the reflection result in a change of the

state of polarisation of the incident wave. In general, a linearly polarized

incident wave gives rise to an elliptically polarized reflected wave. The

characteristics of such an elliptically polarized wave can be expressed in

terms of the lengths of the principal axes of an ellipse (refer to fig. 1.4) and

the orientation of these axes. These can be determined experimentally by

analysing the reflected light. Such a determination yields the values of

the parameters nR, nI characterizing the conductor. I do not enter here

into the derivation of the relevant relations since it requires one to go

through a long series of intermediate steps, and does not involve new

principles, the derivation being fundamentally along the same line as

that followed in arriving at the Fresnel formulae in sec. 1.12.3.

While the reflected and refracted waves for a plane monochromatic wave incident on the

surface of a conductor from a dielectric conform to the boundary conditions (1.8a), (1.8b),

the boundary conditions at the surface of a good conductor can be stated in relatively

simple terms. In particular, the boundary conditions take especially simple forms for

a perfect conductor, for which the tangential component of the electric intensity E and

the normal component of the magnetic field vector H are zero just outside the conduc-

tor. In the interior of the conductor all the field components are zero. The normal E and

tangential H just outside the surface account for induced surface charges and currents

that ensure the vanishing of the field components in the interior.

1.15.4 Group velocity

Consider a superposition of two plane monochromatic waves with fre-

quencies ω1 = ω0 + δω and ω2 = ω0 − δω, and with wave vectors k1 = k0 + δk

and k2 = k0− δk, where the electric intensity vector expressed in the com-

plex form can be written as:

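$$\mathbf{E}(\mathbf{r}, t) = \mathbf{A}_1 e^{i(\mathbf{k}_1\cdot\mathbf{r} - \omega_1 t)} + \mathbf{A}_2 e^{i(\mathbf{k}_2\cdot\mathbf{r} - \omega_2 t)}. \tag{1.114}$$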

Here we assume δω to be small (which implies that the components of

δk are also small, assuming that the directions of propagation are close

to each other) and the amplitude vectors A1 and A2 to be nearly equal

(A1,2 = A0 ± δA/2), being orthogonal to the respective wave vectors. Let us

write the above expression in the form

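$$\mathbf{E}(\mathbf{r}, t) = e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)}\left[\mathbf{A}_1 e^{i(\delta\mathbf{k}\cdot\mathbf{r} - \delta\omega t)} + \mathbf{A}_2 e^{i(-\delta\mathbf{k}\cdot\mathbf{r} + \delta\omega t)}\right]. \tag{1.115}$$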

In optics, as in numerous other situations of interest, the space time

variation of the term within the brackets in the above expression is dominated

by the terms e^{±i(δk·r−δωt)} since, even with |δk| ≪ |k0|, δω ≪ ω0, the

phases vary over large ranges for sufficiently small variations of r and t.

In other words, the small difference in the amplitudes A1 and A2 can be

ignored in accounting for the space time variations of E(r, t), and one can

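write (with A0 = ½(A1 + A2))

$$\mathbf{E}(\mathbf{r}, t) \approx 2\mathbf{A}_0 \cos(\delta\mathbf{k}\cdot\mathbf{r} - \delta\omega t)\, e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)}. \tag{1.116}$$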

This expression shows that the resultant field can be interpreted as a

modulated plane wave with frequency ω0 and wave vector k0, with a slowly

varying amplitude

A(r, t) = 2A0 cos (δk · r− δωt), (1.117)

where A(r, t) varies appreciably only over distances ∼ 1/|δk| and time

intervals ∼ 1/|δω|.
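The passage from (1.114) to (1.116) can be checked numerically. The following is a minimal sketch, assuming a one-dimensional geometry, equal scalar amplitudes (so that A1 = A2 = A0 and the relation is exact), and arbitrary illustrative values for k0, ω0, δk, δω; the function names are hypothetical.

```python
import cmath
import math

def two_wave_field(x, t, k0, w0, dk, dw, a0=1.0):
    """Sum of two equal-amplitude plane waves, as in (1.114), in one dimension."""
    wave1 = a0 * cmath.exp(1j * ((k0 + dk) * x - (w0 + dw) * t))
    wave2 = a0 * cmath.exp(1j * ((k0 - dk) * x - (w0 - dw) * t))
    return wave1 + wave2

def modulated_field(x, t, k0, w0, dk, dw, a0=1.0):
    """Carrier multiplied by the slowly varying envelope, as in (1.116)."""
    envelope = 2.0 * a0 * math.cos(dk * x - dw * t)
    return envelope * cmath.exp(1j * (k0 * x - w0 * t))

# Illustrative values in arbitrary units (assumptions, not taken from the text).
params = dict(k0=50.0, w0=100.0, dk=1.0, dw=1.5)
max_diff = max(
    abs(two_wave_field(x, t, **params) - modulated_field(x, t, **params))
    for x in (0.0, 0.3, 1.7, 4.0)
    for t in (0.0, 0.2, 1.1)
)
print(max_diff)  # the two forms differ only by floating-point rounding
```

With unequal amplitudes the two expressions would differ by a term proportional to δA, which is the term neglected in arriving at (1.116).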

Fig. 1.10 depicts schematically the variation with distance along k0 of the

real part of any one component of the expression (1.117) at any given time

t, where the dotted curve represents the variation of the amplitude, given

by the cosine function, i.e., the envelope of the solid curve. The electric

intensity at the point r oscillates with a wavelength 2π/|k0|, while the amplitude

varies much more slowly with a wavelength 2π/|δk|. In representing the

variation with distance, δk has been assumed to be along k0 for the sake

of simplicity and, moreover, the medium is assumed to be an isotropic

one. Under this assumption, the envelope of the wave profile shown in

the figure is seen to get displaced by a distance (∂ω/∂k)t in time t. The latter

may be seen from (1.117) to be the velocity of the envelope even for a

non-isotropic medium, the Cartesian components of this velocity being

∂ω/∂ki (i = 1, 2, 3) (check this out; the partial derivatives are to be evaluated

at k = k0). These are referred to as the components of the group velocity.

Figure 1.10: Depicting schematically the variation of the real part of any one of the three components of the expression (1.117) with distance along k0 for a given time t; the waveform consists of a modulated carrier wave of wavelength 2π/|k0|, where the modulation corresponds to a sinusoidal envelope of wavelength 2π/|δk|; with the passage of time, the envelope gets translated with a velocity ∂ω/∂k, which has been assumed to be along k0 for the sake of simplicity. (Plot axes: wave function against distance, with origin O; labels in the plot: envelope, modulated carrier, vg.)

If, instead of the variation with the distance, one plots the variation with

time t at any given point r, one once again gets a curve of a similar form,

with the envelope function varying periodically with a time period 2π/δω, while

the electric intensity at the point r varies much more rapidly with a time

period 2π/ω0. One says that the field (1.114) represents a carrier wave of

frequency ω0 and wave vector k0, modulated by an envelope of frequency

δω and wave vector δk.

The above considerations can be generalized to the case of a wave packet,

i.e., a superposition of a group of waves with frequencies distributed over

a small range δω and wave vectors similarly distributed over the small

range δk. Let the central frequency in the above range be ω0 and the

central wave vector be k0, the choice of these two being, to some extent,

arbitrary. The frequency and wave vector of a typical member of this

group may be expressed as

ω = ω0 + Ω, k = k0 +K, (say), (1.118)

where the deviations Ω, K from the central frequency and wave vector

vary over narrow ranges around Ω = 0, K = 0. Let the amplitude vector

for the typical member under consideration be denoted by A(k), which we

rewrite in terms of K as a(K). We assume that the components of a have

appreciable values only for sufficiently small values of the components

of K. For instance, a(K) can be assumed to be of the Gaussian form

$$\mathbf{a}(\mathbf{K}) = \mathbf{a}\, e^{-\frac{K^2}{2\pi b^2}}, \tag{1.119}$$

where b gives a measure of the range of |K| over which a(K) possesses

appreciable values. Then, making use of arguments analogous to the

ones given above, one can express the electric intensity field as

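$$\begin{aligned}
\mathbf{E}(\mathbf{r}, t) &= \int \mathbf{A}(\mathbf{k})\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\, d^{(3)}k \\
&= e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)} \int \mathbf{a}(\mathbf{K})\, e^{i(\mathbf{K}\cdot\mathbf{r} - \Omega t)}\, d^{(3)}K \\
&\approx e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)} \int \mathbf{a}(\mathbf{K})\, e^{i\mathbf{K}\cdot(\mathbf{r} - \nabla_{\mathbf{K}}\Omega\, t)}\, d^{(3)}K,
\end{aligned} \tag{1.120}$$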

where ∇KΩ denotes the vectorial derivative of Ω with respect to K at K = 0,

i.e., ∇kω evaluated at k = k0 (and, correspondingly, at ω = ω0). In writing

the last expression for E above, we have made use of the fact a(K) has

appreciable magnitude only for small values of the components of K, and

have retained only the first term in the Taylor expansion of Ω(K).

Digression: frequency as a function of the wave vector for isotropic and anisotropic

media.

Recall that ω and k are related to each other as

$$\frac{\omega}{|\mathbf{k}|} = v = \frac{c}{n} = \frac{c}{\sqrt{\epsilon_r(\omega)}}, \tag{1.121}$$

where v is the phase velocity and n the refractive index at frequency ω. Here we continue

to assume that µr ≈ 1. Further, εr can be taken to be a real function of ω for the sake of

simplicity, i.e., absorption can be assumed to be negligibly small.

The above formula (eq. (1.121)) holds for an isotropic medium, where ω and v depend

on the components of k through |k| alone, which means that ∇kω is directed along k.

For a non-isotropic medium, on the other hand, ω(k) is not a function of |k| alone, and

∇kω is not, in general, directed along k. This implies a distinction between the ray

vector and the wave vector for an anisotropic medium (see sec. 1.19) and consequently

a distinction between the ray direction and the direction of the normal to the eikonal

surface in the geometrical optics description (refer to chapter 2 for an introduction to

the eikonal approximation in optics). In order to see why this should be so, one has

to refer to the fact that the energy transport velocity is given by the expression ∂ω/∂k

under commonly encountered conditions for both isotropic and anisotropic media (see

sections 1.15.6 and 1.15.7.2).
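The statement about the direction of ∇kω can be illustrated with a small numerical sketch. The isotropic relation below is ω = c|k|/n with a constant n; the anisotropic relation is a toy model chosen only so that ω is not a function of |k| alone (it is an assumption for illustration, not the dispersion relation of any actual crystal). The helper names are hypothetical.

```python
import math

C = 1.0  # speed of light in arbitrary units (assumption for illustration)

def omega_isotropic(k, n=1.5):
    """omega = c|k|/n: depends on k only through |k|."""
    return C * math.sqrt(sum(ki * ki for ki in k)) / n

def omega_anisotropic(k, n_axes=(1.5, 1.6, 1.7)):
    """Toy anisotropic relation (hypothetical, not a crystal-optics formula)."""
    return C * math.sqrt(sum((ki / ni) ** 2 for ki, ni in zip(k, n_axes)))

def grad_k(omega, k, h=1e-6):
    """Central-difference estimate of the gradient of omega with respect to k."""
    grad = []
    for i in range(3):
        kp = list(k)
        km = list(k)
        kp[i] += h
        km[i] -= h
        grad.append((omega(kp) - omega(km)) / (2 * h))
    return grad

def sin_angle(u, v):
    """Sine of the angle between two 3-vectors, |u x v| / (|u| |v|)."""
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    norm = lambda w: math.sqrt(sum(c * c for c in w))
    return norm(cross) / (norm(u) * norm(v))

k0 = (1.0, 2.0, 2.0)
iso = sin_angle(grad_k(omega_isotropic, k0), k0)
aniso = sin_angle(grad_k(omega_anisotropic, k0), k0)
print(iso, aniso)  # iso is zero up to rounding; aniso is clearly non-zero
```

In the isotropic case the gradient is parallel to k, while in the toy anisotropic case it makes a finite angle with k, which is the distinction between the ray direction and the wave normal referred to above.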

Let the Fourier transform of a(K) be defined (under a conveniently chosen

normalization) as

$$\mathbf{a}(\boldsymbol{\rho}) = \int \mathbf{a}(\mathbf{K})\, e^{i\mathbf{K}\cdot\boldsymbol{\rho}}\, d^{(3)}K. \tag{1.122}$$

Then, (1.120) gives

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{a}(\mathbf{r} - \mathbf{v}_g t)\, e^{i(\mathbf{k}_0\cdot\mathbf{r} - \omega_0 t)}, \tag{1.123a}$$

where

$$\mathbf{v}_g \equiv \nabla_{\mathbf{k}}\omega, \tag{1.123b}$$

with the vectorial derivative evaluated at ω = ω0, k = k0, is termed the

group velocity of the wave packet under consideration. In order to see the

significance of vg, note that (1.123a) can be interpreted as a modulated

plane wave with frequency ω0 and wave vector k0, with its amplitude vary-

ing slowly with position r and time t, being given by the Fourier transform

a(ρ) with ρ = r−vgt. Fig. 1.11 depicts schematically the wave packet where

the real part of any one component of E is plotted against distance along

k0 for any given value of t, with the envelope function (determined by a(ρ))

shown with a dotted line. It is the envelope function that modulates the

carrier wave of frequency ω0 and wave vector k0. If a similar plot of the

wave profile is made after an interval of time, say τ , then the envelope

is seen to get shifted by a distance vgτ (check this out; in the figure, vg

is assumed to be along k0 for the sake of simplicity of representation).

In other words, vg represents the velocity of the envelope of the group of

waves making up the wave profile.

In the particular case of the amplitude function a(K) being of the Gaus-

sian form (1.119), the Fourier transform a(ρ) is also a Gaussian function

$$\mathbf{a}(\boldsymbol{\rho}) = 2\sqrt{2}\,\pi^3 b^3\, \mathbf{a}\, e^{-\frac{\pi b^2}{2}\rho^2}, \tag{1.124}$$

whose width is proportional to b−1. In other words, if the wave packet

is made up of monochromatic plane waves covering a narrow range of ω

(and k), then the envelope of the wave packet is a broad one, having a

correspondingly large spread in space for any given value of t.
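Since K² = Kx² + Ky² + Kz², the three-dimensional integral (1.122) for the Gaussian (1.119) factorizes into three identical one-dimensional integrals, each of which should equal √2 π b exp(−πb²ρ²/2) (the cube of the prefactor gives the 2√2 π³b³ of (1.124)). The sketch below checks one such factor by a simple Riemann sum; the grid parameters `n` and `span` and the function names are assumptions made for illustration.

```python
import cmath
import math

def gaussian_amplitude(K, b, a=1.0):
    """One Cartesian factor of (1.119): a * exp(-K^2 / (2 pi b^2))."""
    return a * math.exp(-K * K / (2.0 * math.pi * b * b))

def fourier_transform_1d(rho, b, n=4001, span=12.0):
    """Riemann-sum estimate of the integral of a(K) exp(i K rho) over K."""
    k_max = span * b
    dK = 2.0 * k_max / (n - 1)
    total = 0j
    for j in range(n):
        K = -k_max + j * dK
        total += gaussian_amplitude(K, b) * cmath.exp(1j * K * rho)
    return total * dK

def closed_form_1d(rho, b, a=1.0):
    """One Cartesian factor of (1.124): sqrt(2) pi b a exp(-pi b^2 rho^2 / 2)."""
    return math.sqrt(2.0) * math.pi * b * a * math.exp(-math.pi * b * b * rho * rho / 2.0)

for b in (0.5, 2.0):
    for rho in (0.0, 0.3, 1.0):
        print(b, rho, abs(fourier_transform_1d(rho, b) - closed_form_1d(rho, b)))
```

Comparing the two values of b shows the reciprocal behaviour stated in the text: a smaller b (narrower range of K) gives a factor exp(−πb²ρ²/2) that falls off more slowly with ρ.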

The envelope marks an identifiable structure in the wave profile at any

given instant of time, whereas a single monochromatic plane wave has no

such identifiable structure. The group velocity indicates the speed with

which this structure moves in space.

Figure 1.11: Depicting schematically the variation of the real part of any one of the three components of the expression (1.123a) with distance along k0 for a given time t; the wave packet consists of a modulated carrier wave of wavelength 2π/|k0|, where the modulation is assumed to correspond to a Gaussian envelope for the sake of concreteness; the width of the envelope is inversely proportional to the effective range of variation of K (see equations (1.119), (1.124)), the deviation from the mean wave vector k0; with the passage of time, the envelope gets translated with a velocity vg = ∇kω, the group velocity of the wave packet; for the sake of convenience of representation, this has been assumed to be along k0. (Plot axes: wave function against distance; labels in the plot: envelope, modulated carrier, vg.)

The result (1.123a) looks neat, but it is an approximate result nonethe-

less, since it was arrived at on expanding ω as a function of k in a Taylor’s

series (refer to the third relation in (1.120)) around k = k0, ω = ω0, and

retaining only the term linear in K. Evidently, the condition for the validity

of this approximation is that the variation of k around k0 should be

restricted to a small range (i.e., for the particular case of the amplitude

function a(K) being of the Gaussian form (1.119), the width b should be

sufficiently small) and that the functional dependence of ω on k for the

medium under consideration should not involve singularities or sharp

variations near k0.

Figure 1.12: Depicting schematically the motion of a wave packet over a relatively large interval of time; the wave packet is shown at two time instants t1 and t2; the wave packet has a translational motion, and at the same time it spreads out and develops new structures; the concept of group velocity begins to lose its meaning; a pronounced change in the wave form also occurs in the case of anomalous dispersion over even short distances of propagation. (Plot axes: wave function against distance; labels in the plot: envelope at t1, envelope at t2, new structure.)

The expression (1.123a) is exact for t = 0 (check this out) while, for small

non-zero values of t, the approximation of retaining only the linear term

in K in the Taylor expansion of ω works well. For larger values of t,

however, the higher order terms have an important role to play, and the

propagation of the wave packet can no longer be described just in terms of

the translational motion of the envelope with velocity vg. In other words,

the long term evolution of the wave packet involves processes of a more

complex nature. Fig. 1.12 depicts schematically the propagation of a

wave packet over a time interval during which the envelope spreads out

and, at the same time, develops new structures. For sufficiently large

time intervals, the approximation of retaining only the linear terms in K

breaks down, and the concept of group velocity loses its significance.

For a given time interval, the formula (1.123a) gives a reasonably good

description of the evolution of the wave packet only if the width of the

latter is less than a certain maximum value. As the interval is made to

increase, this permissible width decreases. Conversely, for a wave packet

of a given width, there exists a certain maximum time interval up to which

its evolution can be described as a simple translation, with its shape and

width remaining unaltered.
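The breakdown of the simple translation picture can be seen in a one-dimensional sketch in which the Taylor expansion of Ω(K) is carried one step further, Ω(K) ≈ vg K + ½βK², with β standing for d²ω/dk² at k0. The quadratic coefficient β, the Gaussian width σ of a(K) (written here as exp(−K²/(2σ²)) rather than in the normalization of (1.119)), and all numerical values are assumptions made for illustration. For this model the rms width of |envelope|² is expected to grow as (1/(√2σ))√(1 + (βσ²t)²), so the linear approximation is adequate only while βσ²t ≪ 1.

```python
import cmath
import math

def envelope(x, t, sigma=1.0, vg=1.0, beta=0.5, n=201, span=6.0):
    """Sum over K of a(K) exp(i(K x - Omega(K) t)), with the assumed expansion
    Omega(K) = vg*K + beta*K**2/2 and a Gaussian a(K) of width sigma."""
    dK = 2.0 * span * sigma / (n - 1)
    total = 0j
    for j in range(n):
        K = -span * sigma + j * dK
        a = math.exp(-K * K / (2.0 * sigma * sigma))
        total += a * cmath.exp(1j * (K * x - (vg * K + 0.5 * beta * K * K) * t))
    return total * dK

def centre_and_width(t, x_min=-20.0, x_max=30.0, dx=0.1):
    """Centroid and rms width of |envelope|^2 on a uniform grid in x."""
    steps = int(round((x_max - x_min) / dx)) + 1
    xs = [x_min + i * dx for i in range(steps)]
    weights = [abs(envelope(x, t)) ** 2 for x in xs]
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm
    return mean, math.sqrt(var)

c0, w0 = centre_and_width(0.0)
c4, w4 = centre_and_width(4.0)
print(c0, w0)  # centred at the origin, initial width
print(c4, w4)  # centre displaced by vg*t, width increased by the quadratic term
```

The centroid moves with velocity vg, as in (1.123a), while the growth of the width is entirely due to the term dropped in the linear approximation.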

1.15.5 Energy density in a dispersive medium

In deriving the time averaged Poynting vector and energy density for a

plane monochromatic wave in a dielectric in sec. 1.10.3, I considered

an ideal plane wave with a sharply defined frequency and wave vector.

In reality, the closest thing to such an ideal plane wave that one can

have is a superposition of plane waves with frequencies and wave vectors

distributed over narrow ranges - as narrow as one can realize in practice.

Such a superposition is referred to as a wave packet that can be described

as a plane wave (the carrier) with a certain central frequency and wave

vector, with its amplitude modulated by a slowly varying envelope, as

indicated in sec. 1.15.4.

In the case of a dispersive medium, the characteristics of such a wave

packet differ from those in a non-dispersive one, one instance of which

is the distinction between its phase velocity and group velocity. Strictly

speaking, one has to give an operational definition of the term ‘velocity’

in the context of a wave packet in a dispersive medium (see sec. 1.15.7).

Similarly, there has to be an operational definition of the Poynting vector

and energy density, because both these quantities are time dependent,

and one needs an averaging for an operational definition. In the case of a

wave packet, either of these quantities has a fast as well as a slow time

variation, the former corresponding to the carrier and the latter to the en-

velope. We will consider an averaging over a time large compared with the

time period (2π/ω0) of the fast variation, which will result in a slowly varying

Poynting vector and energy density characterizing the wave packet.

In the following, I will consider, for the sake of simplicity, a ‘wave packet’

made up of just two plane monochromatic waves as in (1.114), where δω

and |δk| are assumed to be sufficiently small. For this superposition, the

magnetic field vector may be seen to be given by

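$$\mathbf{H}(\mathbf{r}, t) = \frac{\sqrt{\epsilon_{1r}}}{\mu_0 c}\, \mathbf{n}_1 \times \mathbf{A}_1 e^{i(\mathbf{k}_1\cdot\mathbf{r} - \omega_1 t)} + \frac{\sqrt{\epsilon_{2r}}}{\mu_0 c}\, \mathbf{n}_2 \times \mathbf{A}_2 e^{i(\mathbf{k}_2\cdot\mathbf{r} - \omega_2 t)}, \tag{1.125}$$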

where n1, n2 are unit vectors along k1, k2, and where we continue to

assume that there is no dispersion in the permeability (µ1r = µ2r = 1).

We will assume, moreover, that the medium is only weakly dispersive, in

which case ε1r and ε2r can be assumed to be real, and absorption in the

medium can be ignored.

In writing the formula (1.125) we use the approximation µ1r = µ2r = 1 which, however,

is not essential in the present context. For instance, the formula (1.131) given below

assumes a weak dispersion in the magnetic permeability.

Assuming that there are no free charges and currents in the medium

under consideration, one can write

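$$\operatorname{div}(\mathbf{E}\times\mathbf{H}) = -\left(\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t} + \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}\right), \tag{1.126}$$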

as can be seen by making use of the Maxwell equations (1.1b), (1.1d)

(check this out). Since E × H represents the energy flow rate per unit

area in a direction normal to the flow, the right hand side of eq. (1.126)

(considered without the negative sign) must represent the rate of change

of energy density associated with the field per unit volume.

The energy density introduced this way includes the energy of the bound charges caus-

ing the polarization of the medium.

An important thing to note in the relation (1.126) is that the field vectors

appearing on either side of it are all real quantities (one cannot replace

these with the corresponding complex vectors since the two expressions

involve products of field vectors). Hence, one can either make the replace-

ments

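$$\mathbf{E} \rightarrow \frac{1}{2}(\mathbf{E} + \mathbf{E}^*), \qquad \mathbf{H} \rightarrow \frac{1}{2}(\mathbf{H} + \mathbf{H}^*),$$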

where now the field vectors are all complex quantities, or else make use of

the real field vectors, taking the real parts of the expressions (1.114), (1.125).

Let us adopt the second approach here and, for the sake of concreteness

and simplicity, evaluate all the field quantities and their time derivatives

at the point r = 0, since any other choice for r may be seen to lead to the

same final result. Thus, we write,

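$$\begin{aligned}
\mathbf{E} &= \mathbf{E}_1 + \mathbf{E}_2 \\
&= \mathbf{A}_1 \cos(\omega + \nu)t + \mathbf{A}_2 \cos(\omega - \nu)t \\
&= (\mathbf{A}_1 + \mathbf{A}_2)\cos\omega t\,\cos\nu t + (\mathbf{A}_2 - \mathbf{A}_1)\sin\omega t\,\sin\nu t,
\end{aligned} \tag{1.127a}$$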

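$$\begin{aligned}
\mathbf{H} &= \mathbf{H}_1 + \mathbf{H}_2 \\
&= \sqrt{\frac{\epsilon}{\mu}}\left[\left(1 + \frac{\nu\eta}{2\epsilon_r}\right)\mathbf{n}_1\times\mathbf{A}_1\cos(\omega + \nu)t + \left(1 - \frac{\nu\eta}{2\epsilon_r}\right)\mathbf{n}_2\times\mathbf{A}_2\cos(\omega - \nu)t\right],
\end{aligned} \tag{1.127b}$$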

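where we have used a slightly altered notation, with ω1,2 = ω ± ν, dεr/dω = η,

so that we can write (assuming ν to be small)

$$\epsilon_{1r} \approx \epsilon_r + \nu\eta, \quad \epsilon_{2r} \approx \epsilon_r - \nu\eta, \quad \sqrt{\epsilon_{1r}} \approx \sqrt{\epsilon_r}\left(1 + \frac{\nu\eta}{2\epsilon_r}\right), \quad \sqrt{\epsilon_{2r}} \approx \sqrt{\epsilon_r}\left(1 - \frac{\nu\eta}{2\epsilon_r}\right). \tag{1.128}$$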

With E given by (1.127a), D is given by

$$\mathbf{D} = \epsilon_0\left[(\epsilon_r + \nu\eta)\mathbf{A}_1\cos(\omega + \nu)t + (\epsilon_r - \nu\eta)\mathbf{A}_2\cos(\omega - \nu)t\right], \tag{1.129}$$

where the dielectric has been assumed to be an isotropic one. One can

now work out the time average of (E · ∂D/∂t), evaluated over a time large

compared to 2π/ω but small compared to 2π/ν, which averages away the fast

variation of the expression under consideration. Denoting this time aver-

age by the symbol 〈··〉, one arrives at the following result

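$$\langle\mathbf{E}\cdot\dot{\mathbf{D}}\rangle = -\epsilon_0\,\nu\sin(2\nu t)\,(\epsilon_r + \omega\eta)\,\mathbf{A}_1\cdot\mathbf{A}_2, \tag{1.130a}$$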

where a dot over the symbol of a time dependent quantity denotes a time

differentiation. In a similar manner, one finds

$$\langle\mathbf{E}\cdot\dot{\mathbf{E}}\rangle = -\nu\sin(2\nu t)\,\mathbf{A}_1\cdot\mathbf{A}_2. \tag{1.130b}$$

In other words, one obtains, for a weakly dispersive medium, the result

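$$\langle\mathbf{E}\cdot\dot{\mathbf{D}}\rangle = \frac{\partial}{\partial t}\left(\frac{1}{2}\epsilon_0\left(\epsilon_r + \omega\frac{d\epsilon_r}{d\omega}\right)\langle\mathbf{E}^2\rangle\right). \tag{1.130c}$$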

One can similarly evaluate 〈H · Ḃ〉 under the assumption of a weak dispersion

in µr (thus temporarily suspending our earlier assumption that

µr ≈ 1 and taking into account the dependence of the relevant quantities

on µr), and obtain

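$$\langle\mathbf{H}\cdot\dot{\mathbf{B}}\rangle = \frac{\partial}{\partial t}\left(\frac{1}{2}\mu_0\left(\mu_r + \omega\frac{d\mu_r}{d\omega}\right)\langle\mathbf{H}^2\rangle\right). \tag{1.130d}$$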

Under the assumption of negligible dispersion in the magnetic permeability

(with µr ≈ 1), the right hand side of (1.130d) simplifies to ∂/∂t(½µ0〈H²〉). I

will, however, make use of the expression (1.130d) below so as to indicate

the formal symmetry between the electrical and magnetic quantities.

Since the right hand side of eq. (1.126) (taken without the negative sign)

gives the time derivative of the energy density at any chosen point (recall

that we have chosen the point r = 0 without any loss in generality), the

energy density, averaged over the fast time variation, is now seen to be

given by the expression

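$$\langle w\rangle = \frac{1}{2}\left[\frac{d}{d\omega}(\omega\epsilon)\langle\mathbf{E}^2\rangle + \frac{d}{d\omega}(\omega\mu)\langle\mathbf{H}^2\rangle\right]. \tag{1.131}$$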

This is our final result for the energy density of a wave packet in a weakly

dispersive medium, and is to be compared with the result (1.53a) which

was written for the ideal case of a plane wave with a sharply defined

frequency and wave vector, in which case one has 〈E²〉 = ½E0², 〈H²〉 = ½H0²,

E0, H0 being the amplitudes of the electric and magnetic field vectors.

More generally, the above result can be arrived at by considering a narrow wave packet

made up of plane monochromatic waves with wave vectors distributed over a small range

and showing that the expression (E · Ḋ + H · Ḃ), averaged over a time large compared

to T0 = 2π/ω0, gives the time derivative of the expression on the right hand side of

eq. (1.131), where ω0 stands for the central frequency of the wave packet. On performing

the time average mentioned here, one is left with a slow time variation that can be

written as d〈w〉/dt.

On making the simplifying assumption that the dispersion in the mag-

netic permeability is negligible, one obtains the result

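$$\frac{1}{2}\left\langle\frac{d}{d\omega}(\omega\mu)\,\mathbf{H}^2\right\rangle \approx \frac{1}{4}\epsilon\left(A_1^2 + A_2^2 + 2\mathbf{A}_1\cdot\mathbf{A}_2\cos(2\nu t)\right), \tag{1.132a}$$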

and, from this,

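$$\langle w\rangle \approx \frac{1}{2}\left(\epsilon + \frac{1}{2}\omega\frac{d\epsilon}{d\omega}\right)\left(A_1^2 + A_2^2 + 2\mathbf{A}_1\cdot\mathbf{A}_2\cos(2\nu t)\right), \tag{1.132b}$$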

(check this out).

1.15.6 Group velocity and velocity of energy propaga-

tion

Proceeding along similar lines, we can also evaluate 〈E×H〉, the Poynting

vector averaged over the fast time variation for an isotropic dielectric, and

obtain

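$$\langle\mathbf{E}\times\mathbf{H}\rangle \approx \frac{1}{2}\sqrt{\frac{\epsilon}{\mu}}\,\frac{\mathbf{k}}{|\mathbf{k}|}\left(A_1^2 + A_2^2 + 2\mathbf{A}_1\cdot\mathbf{A}_2\cos(2\nu t)\right), \tag{1.133}$$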

(check this out). Here k stands for the mean wave vector (k1 + k2)/2, and the

square and higher powers in ν, k1 − k2, A1 −A2 have been ignored.

One can, moreover, put µ = µ0 in the above formula without loss of consistency.

In other words, the relation between the time averaged Poynting vector

and the time averaged energy density in a weakly dispersive medium is

seen to be

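$$\langle\mathbf{S}\rangle \approx \frac{1}{\sqrt{\epsilon\mu}}\,\frac{\mathbf{k}}{|\mathbf{k}|}\left(1 - \frac{1}{2}\frac{\omega}{\epsilon}\frac{d\epsilon}{d\omega}\right)\langle w\rangle, \tag{1.134a}$$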

(check this out). This shows that the velocity of energy propagation in a

weakly dispersive dielectric is

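$$\mathbf{v}_{\mathrm{en}} = \frac{1}{\sqrt{\epsilon\mu}}\,\frac{\mathbf{k}}{|\mathbf{k}|}\left(1 - \frac{1}{2}\frac{\omega}{\epsilon}\frac{d\epsilon}{d\omega}\right). \tag{1.134b}$$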

One can now compare this with the group velocity (eq. (1.123b)) vg, where

the latter can be written for a weakly dispersive isotropic dielectric in the

form

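$$\mathbf{v}_g = \nabla_{\mathbf{k}}\omega \approx v\left(1 + \frac{\omega}{v}\frac{dv}{d\omega}\right)\mathbf{n} \approx v\left(1 - \frac{1}{2}\frac{\omega}{\epsilon}\frac{d\epsilon}{d\omega}\right)\mathbf{n}. \tag{1.135}$$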

In this expression, n stands for the unit vector along k, and v for the

phase velocity 1/√(εµ). The required relation then comes out as

ven = vg, (1.136)

(check this out).
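The agreement between (1.134b) and (1.135) to first order in the dispersion can be checked with a short numerical sketch. The permittivity function `eps_r` below is a hypothetical weakly dispersive model with arbitrary coefficients, µ is set equal to µ0, and c is taken as 1; none of these choices comes from the text. The group velocity is obtained independently as 1/(dk/dω) with k = ωn(ω)/c.

```python
import math

C = 1.0  # vacuum speed of light in arbitrary units

def eps_r(w):
    """Hypothetical weakly dispersive relative permittivity (illustration only)."""
    return 2.25 + 1.0e-3 * w * w

def d_eps_r(w, h=1e-5):
    """Central-difference derivative of eps_r with respect to frequency."""
    return (eps_r(w + h) - eps_r(w - h)) / (2 * h)

def energy_velocity(w):
    """Magnitude of (1.134b) with mu = mu0: (c/n) (1 - (w / (2 eps_r)) d eps_r/dw)."""
    e = eps_r(w)
    return C / math.sqrt(e) * (1.0 - 0.5 * w * d_eps_r(w) / e)

def group_velocity(w, h=1e-5):
    """d omega / dk computed as 1 / (dk / d omega), with k = omega n(omega) / c."""
    k = lambda s: s * math.sqrt(eps_r(s)) / C
    return 2 * h / (k(w + h) - k(w - h))

w = 1.0
v_en = energy_velocity(w)
v_g = group_velocity(w)
print(v_en, v_g, abs(v_en - v_g) / v_g)
```

The residual relative difference is of second order in (ω/ε)dε/dω, which is the order of the terms neglected in the derivation for a weakly dispersive medium.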

This relation is of more general validity than the derivation suggests. For

instance, it holds for an anisotropic as also for an isotropic medium, pro-

vided that the wave packet under consideration is a sufficiently narrow

one and that the medium is only weakly dispersive, with negligible ab-

sorption. Indeed, under these conditions, the energy density, averaged

over a time large compared to the time period of the central component

of the wave packet under consideration, can be expressed in the form

$$\langle w(\mathbf{r}, t)\rangle = f\left(\mathbf{r} - \frac{\partial\omega}{\partial\mathbf{k}}\,t\right), \tag{1.137}$$

regardless of whether the medium is isotropic or anisotropic, which im-

mediately leads to the relation (1.136) (check this out) and, at the same

time implies that the time averaged Poynting vector has to be of the form

〈S〉 = vg〈w〉. (1.138)

1.15.7 Group velocity, signal velocity, and causality

1.15.7.1 Introduction

The question of propagation of an electromagnetic wave through a dis-

persive medium is a deep and complex one. A wave form at any given

time t is completely determined by E(r, t), H(r, t) as functions of position

in space. The propagation of the waveform then consists of changes in

its shape as a function of time, consequent to the propagation of its vari-

ous Fourier components with their respective phase velocities. Since the

phase velocities depend on the frequencies, the wave form does not prop-

agate in a simple manner keeping its shape intact, and gets deformed.

The wave form is, in a sense, an object with an infinite number of ‘de-

grees of freedom’ (which one can identify with its Fourier components),

which makes its propagation a complex process, requiring a large num-

ber of parameters for an adequate description.

The case of a wave form in a non-dispersive medium (the only truly non-

dispersive medium, however, is free space) is the simplest: the wave form

propagates with the common phase velocity of its Fourier components

maintaining its shape. Propagation in a weakly dispersive medium is also

relatively simple to describe, as we have seen above: a wave packet with

a narrow envelope (where the frequencies and wave vectors of the Fourier

components are distributed over small ranges) moves with the envelope

remaining almost unaltered in shape, at least for relatively short times of

propagation, its velocity being vg, the group velocity of the wave packet.

As the wave form propagates, electromagnetic energy is carried by it with

the same velocity vg.

In this case of propagation through a weakly dispersive medium, the en-

velope marks an identifiable structure in the wave form (a purely sinu-

soidal wave with a sharply defined frequency and wave vector does not

have any such identifiable structure) that can be made use of as a carrier

of information, as in the case of an amplitude modulated carrier wave in

radio communications. In most circumstances involving weakly disper-

sive media, the magnitude of the group velocity is seen to be less than c,

the velocity of light in vacuum, which means that information is trans-

ferred through the medium at a speed less than c. This is then seen to be

consistent with the principle of relativity which states that no signal can

be transmitted with a velocity greater than c.

A signal, incidentally, is an entity (such as a particle or a wave form)

that is generated by some specific event and, on propagating through a

distance, can be made use of in producing a second event, so that the first

event can be described as the cause of the second one, the latter being the

effect produced by the cause. The statement that no signal can propagate

at a speed faster than c is equivalent to the principle of causality, which

states that the cause-effect relation must be independent of the frame of

reference.

If a wave packet propagating through a medium suffers strong or anoma-

lous dispersion, then its motion can no longer be described in simple

terms. In particular, the wave form gets strongly distorted - it gets spread

out and develops new structures as in fig. 1.12 - and the group velocity

defined as vg = ∇kω loses its significance, and may even become larger

than c in magnitude. The question of defining the optical or electromag-

netic signal that can be looked upon as the carrier of information then

becomes a more complex one.

A more fundamental set of questions then presents itself. Even when the

distortion of the wave form is relatively small, and the envelope is char-

acterized by a single identifiable structure during the time of its propa-

gation, does the group velocity really represent the velocity of a signal,

the carrier of information? There exist important and interesting cases of

wave propagation where the envelope does not suffer much distortion and

yet its velocity - the group velocity of the wave - is larger than c. What this

means is that, if the envelope is identified as the signal, i.e., the carrier of

information, then superluminal propagation of information is possible, in

violation of the principle of causality. If, on the other hand, the envelope

is not the carrier of information in the strict sense, then what constitutes

the ‘signal’? And finally, can the signal propagate superluminally?

In briefly addressing these questions, I will refer to a scalar wave func-

tion for the sake of simplicity, which may be taken as any one of the

Cartesian components of the electric (or magnetic) field vector, and will

consider an isotropic dielectric, where the group velocity, pointing along

the mean wave vector, can be represented by a scalar (vg = dω/dk) like the

phase velocity (v = ω/k). However, before proceeding with the above queries,

I will first touch upon the question of the ray velocity in the geometrical

optics description.

1.15.7.2 Velocity of energy propagation and ray velocity

In section 1.15.6 we saw that the average of the Poynting vector E×H over

the fast temporal variation (for a narrow wave packet), which gives the

rate of propagation of energy by means of the wave packet, relates to the

energy density as in (1.134a), thereby implying that the velocity of energy

propagation is the same as the group velocity (eq. (1.136)), where the

medium under consideration is assumed to be a weakly dispersive one.

An equivalent way of reasoning is that the velocity of energy transport

equals the group velocity by virtue of the fact that the energy density is,

in general, of the form (1.137), in which case the relation (1.136) follows

as a consequence of (1.138).

In chapter 2, I will briefly review the basics of geometrical optics where

it will be seen that the latter is founded upon the eikonal approximation

to Maxwell’s equations according to which, the electromagnetic field can,

under certain circumstances, be approximated locally by a plane wave.

The plane wave is local in the sense that the changes in the magnitude

and direction of the wave vector occur slowly from point to point in space.

At any given point in space, the time averaged Poynting vector defines the

ray direction in the geometrical optics description.

The geometrical optics description remains valid for a wave packet char-

acterized by a slow spatial and temporal variation of the amplitude (which

is described by the envelope of the packet) where, once again, the Poynt-

ing vector averaged over the fast time variation gives the direction of the

energy flow, i.e., the ray direction at the point under consideration. The

rate of energy flow may exhibit a slow time variation, but the energy flow

velocity remains constant, and is given by the group velocity at the said

point. Hence it is also referred to as the ray velocity in the context of the

geometrical optics description.

In summary, the group velocity (which is the same as the energy flow

velocity in a weakly dispersive dielectric for a narrow wave packet) can

be identified with the ray velocity in the geometrical optics description,

which is valid for a weakly inhomogeneous medium. What is more, this

identification of the ray velocity (i.e., the velocity of energy transport) with

the group velocity vg holds for an isotropic as also for an anisotropic

dielectric.

Wave propagation in an anisotropic dielectric will be considered in sec. 1.19.

As we will see, such a medium shows a number of novel features relating

to wave propagation.

1.15.7.3 Wave propagation: the work of Sommerfeld and Brillouin

Imagine for the sake of simplicity a medium characterized by just one

single resonant frequency (ω0) (the so called Lorentz model), for which the

dispersion formula is of the form (for notation, see sections 1.15, 1.15.2)

$$n^2(\omega) = \epsilon_r = 1 + \frac{Ne^2}{\epsilon_0 m}\,\frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}. \tag{1.139}$$

This model of dispersion is evidently an idealized one, but still, several

features of the dispersion curve are qualitatively similar to those found

for realistic dielectric media. We will, moreover, assume the damping

constant γ (= η/m) to be small, in which case the refractive index n can be

taken to be real (with only a small imaginary part that can be ignored in

the first approximation).

This simplified model can be used to analyze and describe several fea-

tures of the propagation of electromagnetic wave forms in a dispersive

medium, following the approach of Sommerfeld and Brillouin, who made

pioneering contributions in this field. While elucidating several important

features of signal propagation and thereby opening up a vast and impor-

tant area of theoretical and experimental investigations, each of them

addressed the question of the possibility of superluminal group velocities

(refer to sec. 1.15.7.1). Noting that the group velocity at frequency ω (the

mean frequency of a wave packet) in an isotropic dielectric is given by

$$v_g = \frac{c}{n + \omega\frac{dn}{d\omega}}, \tag{1.140}$$

(check this out; refer to formula (1.123b)), one observes that vg can be

larger than c if dn/dω is negative and of a sufficiently large magnitude. This

is precisely what happens in the region of anomalous dispersion, i.e., for

ω ≈ ω0 in the present context. However, as I have mentioned above, this is

also the region where strong distortion of the propagating wave form takes

place and the significance of group velocity itself becomes questionable.

This was partly the reason why Sommerfeld and Brillouin took up their

investigations on signal propagation, where they addressed the problem

of propagation in general mathematical terms, not necessarily confined

to the case of normal dispersion or to short time intervals.
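The behaviour of the formula (1.140) near the resonance can be explored with a small numerical sketch of the single-resonance model (1.139). Frequencies are measured in units of ω0 and c is set to 1; the values chosen for the strength Ne²/(ε0m) and for γ are arbitrary assumptions, and using the real part of the complex index in (1.140) is likewise an assumption made for illustration. The result is only the formal value of vg, whose physical significance in this region is questionable, as noted above.

```python
import cmath

C = 1.0  # speed of light; frequencies are in units of the resonance frequency

def refractive_index(w, w0=1.0, strength=0.1, gamma=0.05):
    """Complex n from (1.139); strength stands for N e^2 / (eps0 m) (assumed value)."""
    eps = 1.0 + strength / (w0 * w0 - w * w - 1j * gamma * w)
    return cmath.sqrt(eps)

def dn_dw(w, h=1e-6):
    """Central-difference derivative of the real part of n."""
    return (refractive_index(w + h).real - refractive_index(w - h).real) / (2 * h)

def formal_group_velocity(w):
    """Formula (1.140) evaluated with the real part of n."""
    return C / (refractive_index(w).real + w * dn_dw(w))

# Scan a band around the resonance and collect frequencies where the formal
# group velocity exceeds c.
scan = [0.8 + 1e-4 * i for i in range(4001)]
superluminal = [w for w in scan if formal_group_velocity(w) > C]

print(dn_dw(0.5), formal_group_velocity(0.5))  # normal dispersion, vg below c
print(dn_dw(1.0), formal_group_velocity(1.0))  # anomalous dispersion at resonance
print(len(superluminal))                        # number of scanned frequencies with vg > c
```

For these parameter values the denominator of (1.140) even becomes negative at the resonance itself, so that the formal group velocity changes sign; the superluminal values occur in narrow bands on the way there.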

Following the approach outlined in sec. 1.15.4, let us represent an initial

wave form E(x, t = 0) in terms of its Fourier transform e(ω) (say) as

$$E(x, t = 0) = \int e(\omega)\exp\left(i\frac{\omega}{c}\,n(\omega)\,x\right)d\omega, \tag{1.141a}$$

where the wave is assumed to propagate along the x-direction and E(x, t)

is a scalar wave function corresponding to, say, the y-component of the

electric intensity vector. The integration over ω may be assumed to extend

from −∞ to +∞ by defining e(ω) appropriately.

If E(x, t = 0) is to be real, e(ω) has to satisfy e(−ω) = e(ω)∗. This ensures that E(x, t) will

be real for all t.

Then, at time t, the wave form is given by

$$E(x, t) = \int e(\omega)\exp\left(i\frac{\omega}{c}\left(n(\omega)\,x - ct\right)\right)d\omega, \tag{1.141b}$$

(check this out).
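A discrete version of (1.141b) makes the structure of the formula concrete. The sketch below replaces the integral by a sum over sampled frequencies, uses a hypothetical Gaussian spectrum placed symmetrically at ±ω so that e(−ω) = e(ω)*, and checks the non-dispersive case n = 1, for which the waveform should translate rigidly with speed c. The spectrum, the sample count, and the function names are assumptions made for illustration.

```python
import cmath
import math

C = 1.0  # speed of light in arbitrary units

def waveform(x, t, spectrum, n_of_w):
    """Discrete version of (1.141b): sum of e(w) exp(i w (n(w) x - c t) / c)."""
    return sum(
        e * cmath.exp(1j * w * (n_of_w(w) * x - C * t) / C)
        for w, e in spectrum
    )

def symmetric_gaussian_spectrum(w_c=10.0, width=1.0, n=81, span=4.0):
    """Samples of e(w) around +w_c and -w_c with e(-w) = e(w)*, so that E is real."""
    dw = 2.0 * span * width / (n - 1)
    spectrum = []
    for j in range(n):
        w = w_c - span * width + j * dw
        e = math.exp(-((w - w_c) ** 2) / (2.0 * width ** 2)) * dw
        spectrum.append((w, e))
        spectrum.append((-w, e))  # e is real here, so its conjugate equals e
    return spectrum

vacuum = lambda w: 1.0  # non-dispersive case, n = 1 at all frequencies
spec = symmetric_gaussian_spectrum()
x, t = 0.7, 2.0
lhs = waveform(x, t, spec, vacuum)           # field at (x, t)
rhs = waveform(x - C * t, 0.0, spec, vacuum)  # initial field at the retarded position
print(lhs, rhs)
```

Passing a frequency-dependent `n_of_w` (with n(−ω) = n(ω) for a real index) in place of `vacuum` gives the dispersive case, in which the waveform no longer satisfies this rigid-translation property.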

For a given initial wave form (which corresponds to a given function

e(ω)), one can obtain E(x, t) at any later time by evaluating the integral

in (1.141b) which, in principle, gives the wave form for any specified value

of t as a function of x. In practice, however, the evaluation of the integral

is not a trivial matter, which is why both Sommerfeld and Brillouin made

use of the technique of complex integration. Even so, the evaluation of the

integral for given values of x and t depends on the location of the poles of

the integrand and requires approximations where, in general, the nature

of the approximations varies for various different regimes of x and t.

The results obtained from such an analysis can be illustrated for an initial

wave function (see fig. 1.13) of the form

$$E(x, 0) = \begin{cases} e_0 \sin\left(\frac{\Omega}{c}x\right) & (x < 0), \\ 0 & (x > 0), \end{cases} \tag{1.142}$$

where Ω is a frequency chosen away from ω0 for the sake of simplicity

(i.e., the dispersion is assumed to be normal; the case of anomalous dis-

persion can also be analyzed by similar means). This corresponds to an

uninterrupted sinusoidal waveform in a half space (left of the origin, to-

wards the negative direction of the x-axis), with zero field in the remaining

half space, and can be described as a sinusoidal waveform modulated by

a step function, where the envelope corresponding to the step function is

shown on the left of fig. 1.13.

Observed after a time τ (say), the wave is seen to have moved towards

the right while undergoing a change of form which consists principally

of a ‘forerunner’ or ‘precursor’ in this case, moving ahead of the steady

wavetrain. The precursor is a wavetrain of extremely small amplitude,

and two such precursors can be identified in the figure. One of these, the

Sommerfeld precursor, is made up of components belonging to the high

frequency end of the electromagnetic spectrum while the other, referred to

as the Brillouin precursor, is made up of much lower frequency (and larger

wavelength) ones. The tip of the wavetrain consisting of these precursors

is located at a distance cτ from x = 0, the tip of the initial step-modulated

sinusoidal wavetrain we started with.

Figure 1.13: Depicting schematically the results of Sommerfeld and Brillouin’s analysis of wave form propagation in a dispersive medium; the initial wave form (t = 0) is a step-modulated sinusoidal one, with a uniform wave train to the left of x = 0; the wave form after a time τ consists of a Sommerfeld precursor of extremely small amplitude and wavelength (corresponding to high frequency components) running to the left from x = cτ, followed by a Brillouin precursor of much longer wavelength and, finally, a steady oscillatory wave form corresponding to frequency Ω as in the initial wave; the steady wave form runs to the left from x = vgτ; in other words, the Sommerfeld precursor travels with the speed of light in vacuum, while the steady wave form moves as a single structure with the group velocity vg; the onset of the steady wave form may be identified with the ‘signal’; thus the signal moves from x = 0 to x = vgτ in time τ, i.e., the signal velocity vs is the same in this case as the group velocity vg; this, however, is not true in general, as in the case of anomalous dispersion; while vg may be greater than c in some situations, vs can never exceed c; however, this result depends on an appropriate definition of vs.

The precursors are followed by the steady state sinusoidal wavetrain of

frequency Ω, but the front of the sinusoidal wavetrain moves through a

distance vgτ where, in the situation depicted in the figure, vg < c. The

front, i.e., the point of onset of the steady state wavetrain, was identified

by Brillouin as the ‘signal’. There occurs a transient phase of non-steady

oscillations (not shown in the figure) by which the precursor connects with

the steady wavetrain. The signal velocity is thus the same here as the

group velocity, where the latter is the velocity of the steady state

wavetrain itself. Brillouin was led to the result that

the signal velocity is close to the group velocity for frequencies (Ω) away

from the regions of anomalous dispersion, both being less than c. In

the case of anomalous dispersion, however, the two differ conspicuously.

The group velocity vg = dω/dk may exceed the speed of light, but the signal

velocity, i.e., the velocity of the front continues to remain less than c.

Thus, he demonstrated that the relativistic principle of causality is always

satisfied, and the group velocity does not always have the interpretation

of the velocity of information carried by a wavetrain.

The fact that the tip of the precursor moves with the speed of light in vac-

uum can be explained from the observation that the highest frequency

Fourier components of the waveform correspond to εr ≈ 1 (refer to figures

1.8, 1.7), i.e., these high frequency components move with velocity

approaching c. From the physical point of view, a wave with a very high

frequency exerts only a negligible effect on the electrons in the dielectric

under consideration whose natural frequencies (the transition frequen-

cies in the quantum theoretic description) are much less by comparison,

and hence the ‘response’ of the medium to the wave is effectively a null

one, like that of vacuum. These Fourier components of the propagating

wave make up the Sommerfeld precursor. In a similar manner, the com-

ponents at the low frequency end of the spectrum are characterized by

a relatively large phase velocity (for instance, the phase velocity goes to

c in the Lorentz model) and give rise to the Brillouin precursor (the high

frequency components continue to remain mixed in this phase).
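The statement that the highest frequency components see a medium resembling vacuum can be checked numerically. The sketch below assumes a single-resonance Lorentz model with illustrative parameters (ω0 = 1); the function name is hypothetical.

```python
import cmath


def lorentz_index(omega, omega_0=1.0, omega_p2=0.5, gamma=0.1):
    """Complex refractive index n = sqrt(eps_r) for a single-resonance Lorentz model."""
    eps_r = 1.0 + omega_p2 / (omega_0 ** 2 - omega ** 2 - 1j * gamma * omega)
    return cmath.sqrt(eps_r)


if __name__ == "__main__":
    for omega in (10.0, 100.0, 1000.0):
        n = lorentz_index(omega)
        print(omega, n.real, n.imag)
```

As ω grows far beyond ω0 the index approaches unity, so that both the phase velocity and the group velocity of these components approach c.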

While the Sommerfeld-Brillouin analysis was a path-breaking one, the

question of the signal velocity was not clearly settled. Brillouin defined

the signal velocity for a propagating wave form from a mathematical point

of view but left open its physical interpretation, and the question of iden-

tifying the signal has later been reopened. Experimental investigations

have shown that there occur interesting instances of wave propagation

where the envelope does not get flattened or broken up, and still it moves

with a speed greater than c. Identifying the signal with the envelope in

such situations would then imply a superluminal signal propagation, in

violation of the relativistic principle of causality.

1.15.7.4 Superluminal group velocity: defining the signal velocity

A situation apparently involving superluminal signal propagation is one

where a wave packet undergoes ‘tunnelling’ or ‘barrier penetration’. As

an example of barrier penetration by a wave packet, one can refer to

what is known as ‘frustrated total internal reflection’ (FTIR). Recall that,

in total internal reflection, a wave is totally reflected from an interface

between two media, being sent back to the medium (refractive index, say,

n1) where it came from, with only an exponentially decaying field being

set up in the second medium (refractive index n2(< n1); refer to sec. 1.13).

This second medium, however, is now in the form of a thin layer, beyond

which there is a third, denser, medium (which may again be the dielectric

with refractive index n1), in which case a small part of the incident wave

gets transmitted into this third medium.

In the geometrical optics description, a ray cannot penetrate into the

second medium, nor into the third. However, in the wave description, an

incident wave packet gets split into two, of which the one (having a small

amplitude) ‘tunnels’ through the layer of the second medium (the ‘barrier’)

into the third one. In the quantum description of the electromagnetic

field (see chapter 8 for an introduction), a photon undergoes quantum

mechanical tunnelling into the third medium. Photonic tunnelling has

been observed in other set-ups as well, like in wave guides and in layered

dielectrics involving ‘photonic band gaps’.

In the case of quantum mechanical tunnelling of a particle through a

barrier, theoretical and experimental investigations have shown that a

‘tunnelling time’ can be associated, in a certain sense, with the process,

which implies the crossing of the barrier at superluminal speeds. As the

wave packet representing the particle emerges into the third medium, its

shape remains almost similar to the incident one, but its peak appears

to have crossed at superluminal speeds. This is illustrated in fig. 1.14

where the positions of the peak (P, P′) and the tip (T, T′) of the incident

and the emerging packets are indicated.

Figure 1.14: Illustrating the superluminal tunnelling of a barrier by a wave packet; the positions of the peak (P, P′) and the tip (T, T′) of the incident and emerging wave packets are indicated; the distance PP′ is greater than cτ, where τ is an experimentally measured transit time; this, however, does not imply a breakdown of causality since a small portion of the incident wave packet near T completely determines the structure near P′, where the distance from T to P′ is cτ; the barrier is not shown; the portions of the initial and final wave packets (the one near T and the other from P′ to T′) related causally to each other are shown shaded.

In terms of the experimentally measured and defined ‘transit time’ τ

through the barrier, the peak-to-peak distance is seen to be larger than

cτ , implying a superluminal group velocity. However, the peak P of the in-

cident wave packet does not causally determine the peak P′ of the emerg-

ing wave, since the latter is determined completely by a small portion of

the incident wave packet near T, the distance TP′ being exactly cτ .

Superluminal group velocity is also observed in an amplifying medium,

in which a population inversion has been made to take place. In such

a medium (commonly used in the production of lasers) the distribution

of the atoms among their various energy states is inverted as compared

to the normal, Boltzmann distribution. The dispersion characteristics

of such a medium are also found to be inverted compared to a normal

dielectric, as illustrated in fig. 1.15, where there is an anomalous dispersion

(dn/dω < 0) at frequencies away from the resonance and a normal dispersion

(dn/dω > 0) near resonance. Consequently, there results a superluminal

group velocity at large and small frequencies with only a small distor-

tion in the shape of the wave packet. The velocity of energy propagation,

defined as the ratio of the time averaged Poynting vector and the time

averaged energy density, is also seen to be larger than c in magnitude.

In the Sommerfeld-Brillouin approach, the signal velocity in such a situ-

ation would be identical to the group velocity, implying superluminal sig-

nal propagation, and a breakdown of causality. However, once again, the

peak or the front of the wave packet (the rising portion of the envelope,

this was identified by Brillouin as the signal associated with the wave

packet) after propagation through a time τ is not causally determined by

the corresponding portion of the initial wave packet.

Figure 1.15: Depicting schematically the dispersion relation for an amplifying medium, with the real part of the refractive index (nR) plotted against the frequency ω; only a single resonant frequency (ω0) is assumed; the dispersion curve for a medium with an uninverted population of atoms is shown (dotted) for the sake of comparison; the degree of population inversion in the amplifying medium may vary, and a maximal inversion is assumed for the sake of illustration; the dispersion is seen to be anomalous for frequencies away from the resonance, and normal near the latter, which contrasts with the dotted curve.

It is thus important to address the question as to what constitutes the

signal associated with a wave packet, where the signal is understood to

be the carrier of causal information. In the case of an analytic signal,

the mathematical definition of analyticity implies that only a tiny portion

of the wave packet near its tip is sufficient to determine the entire wave

packet by means of a Taylor expansion. Consistent with the principle of

causality, the tip propagates at a speed at most the speed of light in vac-

uum. In the case of a non-analytic signal, on the other hand, where the

wave function or any of its derivatives has a discontinuity at some point

on the wave packet, it is the point of non-analyticity that can be identified

as the signal, where this point admits of a binary (‘yes-no’ type) descrip-

tion. The non-analyticity is associated with high frequency Fourier com-

ponents of the signal that propagate with a speed c, which then can be

identified as the signal velocity. One instance of such signal propagation

with speed c is the Sommerfeld precursor mentioned in sec. 1.15.7.3.

The question of electromagnetic signal propagation is a complex one, cov-

ering a vast area of investigations, and is still under active research.

Many questions remain to be answered, including the one of a universally

accepted and physically relevant definition of the terms ‘signal’ and ‘sig-

nal velocity’. To date, all investigations and interpretations firmly support

the concept of relativistic causality. The question has recently acquired

a new significance in the light of high speed digital communications by

means of optical information transfer where information is carried by

short optical pulses.

1.16 Stationary waves

An important class of relatively simple solutions of Maxwell’s equations

includes stationary waves (or, standing waves) in bounded regions en-

closed within boundaries of certain simple geometrical shapes.

As an example, consider the region of free space bounded by two surfaces

parallel to the x-y plane of a Cartesian co-ordinate system, the two sur-

faces being located at, say, z = 0, z = L (L > 0), where each of the surfaces

is assumed to be made up of an infinitely extended thin sheet of a per-

fectly conducting material. The boundary conditions at the two surfaces

(vanishing of the tangential component of the electric intensity) are satis-

fied by the field variables described below which constitute one particular

solution to the Maxwell equations for the region under consideration.

$$\mathbf{E}(\mathbf{r}, t) = \hat{e}_x E_0 \sin(kz)\cos(\omega t), \qquad \mathbf{H}(\mathbf{r}, t) = -\hat{e}_y \frac{E_0}{\mu_0 c}\cos(kz)\sin(\omega t), \tag{1.143a}$$

where ω/k = c, and k can have any value in the set k = nπ/L (n = 1, 2, 3, . . .)

(check this statement out).
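The check suggested above can also be carried out numerically. The following sketch evaluates the two curl equations of Maxwell for the field (1.143a) by central differences; the values L = 1 m, n = 3 and E0 = 1 V/m are illustrative assumptions, as are the function names.

```python
import math

MU_0 = 4e-7 * math.pi           # permeability of free space (SI)
C = 299792458.0                 # speed of light in vacuum (SI)
EPS_0 = 1.0 / (MU_0 * C ** 2)   # permittivity of free space (SI)

L = 1.0        # separation of the conducting planes (illustrative)
N_MODE = 3     # mode number n (illustrative)
E0 = 1.0       # amplitude (illustrative)
K = N_MODE * math.pi / L
OMEGA = C * K


def e_x(z, t):
    return E0 * math.sin(K * z) * math.cos(OMEGA * t)


def h_y(z, t):
    return -(E0 / (MU_0 * C)) * math.cos(K * z) * math.sin(OMEGA * t)


def ddz(f, z, t, h):
    return (f(z + h, t) - f(z - h, t)) / (2.0 * h)


def ddt(f, z, t, h):
    return (f(z, t + h) - f(z, t - h)) / (2.0 * h)


def residuals(z, t):
    """Dimensionless residuals of curl E = -mu0 dH/dt and curl H = eps0 dE/dt."""
    dz = 1e-5 / K
    dt = 1e-5 / OMEGA
    faraday = ddz(e_x, z, t, dz) + MU_0 * ddt(h_y, z, t, dt)
    ampere = -ddz(h_y, z, t, dz) - EPS_0 * ddt(e_x, z, t, dt)
    return faraday / (E0 * K), ampere / (E0 * K / (MU_0 * C))
```

Both residuals vanish to within the accuracy of the finite differences, and `e_x` vanishes at z = 0 and z = L, which is the boundary condition on the tangential electric field.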

While the general practice I follow in this book is to represent the field vectors in their

complex forms, the above expressions for E and H are real ones (assuming that the

amplitude E0 is real). The corresponding complex expressions would be

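$$\mathbf{E}(\mathbf{r}, t) = \hat{e}_x E_0 \sin(kz)e^{-i\omega t}, \qquad \mathbf{H}(\mathbf{r}, t) = -i\hat{e}_y \frac{E_0}{\mu_0 c}\cos(kz)e^{-i\omega t}, \tag{1.143b}$$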

(check this out).

On calculating the time average of the Poynting vector S, one obtains

〈S〉 = 0, (1.144)

which is why the field described by (1.143a), (1.143b) is termed a station-

ary wave. Any particular value of the integer n is said to correspond to

a normal mode (or, simply, a mode) of the field in the region under con-

sideration. A more general class of solutions of Maxwell’s equations in

the region under consideration can be represented as superpositions of

all the possible normal modes, where such a solution again corresponds

to zero value of the time averaged Poynting vector.
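The result (1.144) can be confirmed by averaging the instantaneous Poynting vector of the field (1.143a) over one period. This is a minimal numerical sketch; the function name and the default parameter values are assumptions made for illustration.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space (SI)
C = 299792458.0         # speed of light in vacuum (SI)


def mean_poynting_z(z, n_mode=1, length=1.0, e0=1.0, samples=1000):
    """Time average over one period of S_z = E_x * H_y for the standing wave (1.143a)."""
    k = n_mode * math.pi / length
    omega = C * k
    period = 2.0 * math.pi / omega
    total = 0.0
    for j in range(samples):
        t = (j + 0.5) * period / samples
        ex = e0 * math.sin(k * z) * math.cos(omega * t)
        hy = -(e0 / (MU_0 * C)) * math.cos(k * z) * math.sin(omega * t)
        total += ex * hy
    return total / samples
```

The instantaneous energy flux oscillates back and forth between neighbouring nodes of E and H, but its average over a period is zero at every z.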

The amplitude of oscillation of the electric vector at any given point depends

on its location and is maximum (|E0|) at points with z = (L/n)(m + 1/2)

(m = 0, 1, . . . , n − 1) for a mode characterized by the integer n. A plane

defined by any given value of m for such a mode is referred to as an antinode

for the electric intensity, while nodes, which correspond to zero amplitude,

are given by z = (L/n)m (m = 0, 1, 2, . . . , n). Similar statements apply for

the magnetic field vector H, where the nodes are seen to coincide with the

antinodes of the electric field, and vice versa.
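The positions of the nodes and antinodes of the electric field follow directly from the formulae above. A short sketch, with hypothetical helper names:

```python
import math


def electric_nodes(n, length):
    """Planes z = (L/n) m, m = 0..n, where sin(kz) = 0 for k = n*pi/L."""
    return [length * m / n for m in range(n + 1)]


def electric_antinodes(n, length):
    """Planes z = (L/n)(m + 1/2), m = 0..n-1, where |sin(kz)| = 1."""
    return [length * (m + 0.5) / n for m in range(n)]


if __name__ == "__main__":
    print(electric_nodes(3, 1.0))
    print(electric_antinodes(3, 1.0))
```

For the magnetic field, which varies as cos(kz), the two lists simply exchange their roles.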

While the spatial dependence of the electric and magnetic field vectors

is of a simple nature because of the simple geometry of the boundary

surface of the region considered above, boundary surfaces of less simple

geometries may lead to enormous complexity in the spatial dependence

of the field vectors, corresponding to which the nodal and antinodal sur-

faces may be of complex structures. However, the time averaged Poynting

vector remains zero for any such solution.

In the case of the region bounded by the surfaces z = 0 and z = L considered above,

there exist more general solutions that can be described as standing waves in the z-

direction and propagating waves in the x-y plane, since the region is unbounded along

the x- and y-axes. For instance, a field with the field vectors given, in their real forms,

by

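$$\mathbf{E}(\mathbf{r}, t) = \hat{e}_x E_0 \sin(kz)\cos(qy - \omega t),$$

$$\mathbf{H}(\mathbf{r}, t) = \frac{E_0}{\mu_0 c\sqrt{k^2 + q^2}}\Bigl(\hat{e}_y k\cos(kz)\sin(qy - \omega t) - \hat{e}_z q\sin(kz)\cos(qy - \omega t)\Bigr), \tag{1.145}$$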

represents a solution to Maxwell’s equations subject to the boundary conditions mentioned

above where, as before, k = nπ/L (n = 1, 2, . . .) corresponding to the various standing

wave modes, but where q can be any real number, subject to the condition ω² = c²(k² + q²).

The time averaged Poynting vector for this solution is directed along the

y-axis (check the above statements out).

The above solution represents a standing wave in the z-direction and a propagating wave

in the y-direction. Such waves are set up in waveguides.
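The direction of the energy flow for the field (1.145) can be verified by a numerical time average. The sketch below assumes illustrative values L = 1 m, n = 1, q = 2 m⁻¹ and E0 = 1 V/m; the function name is hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space (SI)
C = 299792458.0         # speed of light in vacuum (SI)


def mean_poynting(z, n_mode=1, length=1.0, q=2.0, e0=1.0, y=0.0, samples=1000):
    """Time-averaged (S_y, S_z) for the field (1.145)."""
    k = n_mode * math.pi / length
    kappa = math.sqrt(k * k + q * q)
    omega = C * kappa                  # from omega^2 = c^2 (k^2 + q^2)
    amp_h = e0 / (MU_0 * C * kappa)
    period = 2.0 * math.pi / omega
    sy = sz = 0.0
    for j in range(samples):
        t = (j + 0.5) * period / samples
        phase = q * y - omega * t
        ex = e0 * math.sin(k * z) * math.cos(phase)
        hy = amp_h * k * math.cos(k * z) * math.sin(phase)
        hz = -amp_h * q * math.sin(k * z) * math.cos(phase)
        sy += -ex * hz   # (E x H)_y = -E_x H_z since E has only an x-component
        sz += ex * hy    # (E x H)_z = E_x H_y
    return sy / samples, sz / samples
```

The z-component averages to zero, while the y-component comes out as E0² q sin²(kz) / (2μ0 c √(k² + q²)), which is positive for q > 0 and vanishes on the conducting planes.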

Black body radiation at any given temperature constitutes the most com-

monly encountered example of standing waves where there exist an in-

finitely large number of modes within an enclosure, all in thermal equi-

librium with one another.

Standing waves have acquired great relevance in optics in recent decades

where stationary waves of frequencies within the visible range of the spec-

trum are set up within optical resonators of various specific geometries.

Such optical resonators are made use of, for instance, in lasers.

1.17 Spherical waves

1.17.1 The scalar wave equation and its spherical wave

solutions

The scalar wave equation

$$\nabla^2\psi - \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2} = 0, \tag{1.146}$$

possesses, for any given angular frequency ω, the simple spherical wave

solution

$$\psi(\mathbf{r}, t) = A\frac{e^{i(kr - \omega t)}}{r} \qquad \left(k = \frac{\omega}{v}\right), \tag{1.147}$$

which corresponds to an expanding wave front of spherical shape, of amplitude

A/r at a distance r from the origin. Note that the expression (1.147)

satisfies the wave equation everywhere excepting the origin and, from the

physical point of view, represents the solution to the wave equation with

a monopole source located at the origin. In other words, it is actually the

solution to the inhomogeneous wave equation

$$\nabla^2\psi - \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2} = -4\pi A\,\delta^{(3)}(\mathbf{r}), \tag{1.148}$$

which reduces to (1.146) for r ≠ 0, with the expression on the right hand

side representing a source term at the origin.
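That (1.147) satisfies the homogeneous equation away from the origin can be checked numerically. For a spherically symmetric ψ with time dependence e^{−iωt}, equation (1.146) reduces to (1/r) d²(rψ)/dr² + k²ψ = 0. The sketch below evaluates the left hand side by a central second difference; the values of k and r are arbitrary illustrative choices and the names are hypothetical.

```python
import cmath


def psi(r, k=2.0, a=1.0):
    """Spatial part of the spherical wave (1.147)."""
    return a * cmath.exp(1j * k * r) / r


def helmholtz_residual(r, k=2.0, h=1e-4):
    """(1/r) d^2(r psi)/dr^2 + k^2 psi, with the derivative from a central difference."""
    def u(s):
        return s * psi(s, k)

    laplacian = (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)
    return laplacian + k * k * psi(r, k)
```

The residual is zero to within the discretization error for any r > 0. The singular behaviour at r = 0, which is what the source term in (1.148) accounts for, lies outside the reach of this check.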

The solution (1.147) is the first term of a series expression for the general

solution of (1.146) where the succeeding terms of the series may be in-

terpreted as waves resulting from sources of higher multipolarity located

at the origin, and where these terms involve an angular dependence of ψ

(i.e., dependence on the angles θ, φ in the spherical polar co-ordinates),

in contrast to the spherically symmetric monopole solution (1.147). At a

large distance from the origin, each term becomes small compared to the

preceding term in the series. In other words, the spherical wave (1.147)

dominates the solution of (1.146) at large distances from the origin.

1.17.2 Vector spherical waves

Analogous expressions for the electromagnetic field vectors in a source-

free region of space can be constructed in terms of spherical polar co-

ordinates (r, θ, φ), but the vectorial nature of the equations leads to expressions

of a more complex nature for these.

In a source-free region of space, each component of the field vectors E,H

satisfies a scalar wave equation of the form (1.146), and a series solution

of the form mentioned in sec 1.17.1 can be constructed formally for each

such component. However, such a solution is not of much practical use

since the components are to be combined into vectors that have to satisfy

Maxwell’s equations (it may be remarked that Maxwell’s equations imply

the wave equations in a source-free region, but the converse is not true).

One way to arrive at acceptable solutions for the field vectors is to work

out the vector and scalar potentials first, as outlined in sec. 1.17.3 below.

Assuming a harmonic time dependence of the form e^{−iωt} for all the field

components, the solutions for the field vectors in a source free region,

expressed in terms of the spherical polar co-ordinates, can be classified

into two types, namely, the transverse magnetic (TM) and the transverse

electric (TE) fields. Analogous to the scalar case, the general solution

(where only the space dependent parts of the fields need be considered)

for either type can be expressed in the form of a series where now each

term in either series possesses an angular dependence. The first terms of

the two series constitute what are referred to as the electric and magnetic

dipole fields.

While magnetic monopoles are not known, harmonically oscillating electric monopole

sources are also not possible because of the principle of charge conservation.

These dipole fields are encountered in diffraction and scattering theory,

while fields of higher multipolarity are also of relevance, being represented

by succeeding terms in the two series. As in the scalar case, these terms

get progressively smaller at large distances from the origin (which, in the

present context, is assumed to be the point where the multipole sources

are located; this means that the solutions under consideration are valid

in regions of space away from the origin, where the field vectors satisfy

the homogeneous Helmholtz equations).

Strictly speaking, the solutions for the field vectors that satisfy the condition of regularity

at large distances cannot, at the same time, be regular at the origin as well. A separate

series can be constructed for each of the two types (TM and TE) representing the general

solution of the homogeneous Helmholtz equations that is regular at the origin. However,

such a series fails to be regular at large distances.

Thus, unless the dipole terms vanish (which requires the sources to be

of special nature), the TM and TE dipole fields dominate the respective

series expressions for the solutions at large distances, where the term

‘large’ describes the condition kr >> 1 (k = ω/c, assuming the field to be set

up in vacuum).

1.17.3 Electric and magnetic dipole fields

Consider a charge-current distribution acting as the source of an elec-

tromagnetic field in an unbounded homogeneous medium and assume

that the time dependence of the sources is harmonic in nature, with an

angular frequency ω. Assume, moreover, that the source distribution is

localized in space.

The solution to eq. (1.20b) for the vector potential in the Lorentz gauge

then looks like

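$$\mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}e^{-i\omega t}\int d^{(3)}r'\,\mathbf{j}(\mathbf{r}')\frac{e^{ik|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}. \tag{1.149}$$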

Here d(3)r′ stands for a volume element in space around the source-point

r′ and the integration is over entire space, while the constant k is defined

as k = √(ε0µ0) ω, assuming the field point (r) to be located in free space. In

writing this solution for the vector potential we have assumed that, for

field points r at infinitely large distances from the sources, the potentials

(as also the fields) behave like outgoing spherical waves with a space-

time dependence of the form e^{i(kr−ωt)}/r. Moreover, j(r′) in the above equation

stands for the space dependent part of the current density, where the

time dependence enters through the factor e^{−iωt}.

With a harmonic time dependence (∼ e^{−iωt}), potentials satisfy an inhomogeneous Helmholtz

equation of the form

$$\nabla^2\psi + k^2\psi = f(\mathbf{r}, \omega) \tag{1.150}$$

where ψ stands for the scalar potential or any component of the vector potential, and

f(r, ω) represents the Fourier transform of the relevant source term. The solution to this

equation subject to the boundary condition mentioned above is obtained with the help

of the outgoing wave Green’s function

$$G_k(\mathbf{r}, \mathbf{r}') = -\frac{1}{4\pi}\frac{e^{ik|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|}, \tag{1.151}$$

where the harmonic time-dependence is implied. This is how the solution (1.149) is

arrived at.

1.17.3.1 The field of an oscillating electric dipole

For a field point r located outside the (finite) region containing the sources,

the right hand side of eq (1.149) can be expanded in a multipole series, of

which the first term is

$$\mathbf{A}(\mathbf{r}, t) = \frac{\mu_0}{4\pi}\frac{e^{i(kr - \omega t)}}{r}\int d^{(3)}r'\,\mathbf{j}(\mathbf{r}'). \tag{1.152}$$

Making use, now, of the equation of continuity (eq. (1.1e)), this can be

transformed to

$$\mathbf{A}(\mathbf{r}, t) = -\frac{i\omega\mu_0}{4\pi}\,\mathbf{p}\,\frac{e^{i(kr - \omega t)}}{r}, \tag{1.153a}$$

where

$$\mathbf{p} = \int d^{(3)}r'\,\mathbf{r}'\rho(\mathbf{r}'), \tag{1.153b}$$

is the electric dipole moment of the source distribution, ρ(r′) being the

space dependent part of the charge density. In general, p can be a com-

plex vector, with its components characterized by different phases.
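How well the leading term (1.152) represents the full integral (1.149) depends on the size of the source relative to the wavelength. The sketch below compares the two for a hypothetical thin, uniform line current of length d centred at the origin along the z-axis. Only the scalar kernel is evaluated, since the current density is a common factor; all names and numerical values are illustrative.

```python
import cmath
import math


def exact_kernel_average(r_vec, k, d, samples=200):
    """Average of exp(ik|r - r'|)/|r - r'| over a line source of length d on the z-axis."""
    x, y, z = r_vec
    total = 0j
    for j in range(samples):
        z_src = -d / 2.0 + (j + 0.5) * d / samples
        dist = math.sqrt(x * x + y * y + (z - z_src) ** 2)
        total += cmath.exp(1j * k * dist) / dist
    return total / samples


def leading_term(r_vec, k):
    """Kernel of the first multipole term, exp(ikr)/r."""
    r = math.sqrt(sum(c * c for c in r_vec))
    return cmath.exp(1j * k * r) / r


def relative_difference(r_vec, k, d):
    lead = leading_term(r_vec, k)
    return abs(exact_kernel_average(r_vec, k, d) - lead) / abs(lead)
```

With k = 1 (wavelength 2π) and a field point at distance 50, a source of length d = 0.01 is represented by the leading term to a few parts in a million, whereas for d = 3 the discrepancy exceeds twenty per cent and higher multipoles can no longer be ignored.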

For an ideal oscillating electric dipole, which corresponds to zero charge

and current densities everywhere excepting at the origin which is a singu-

larity, (1.153a) is the only term in the multipole expansion of the vector

potential, and constitutes a simple spherical wave solution of the Maxwell

equations.

The principle of charge conservation, expressed by eq. (1.1e), implies that there can

be no harmonically varying electric monopole term in the solution for the potentials or

the field vectors, the monopole component of the potentials or the field vectors being

necessarily static.

Making use of the harmonic time-dependence and the Lorentz condi-

tion (1.19), one can work out the scalar potential φ for the oscillating

electric dipole placed in vacuum at the origin, which reads

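$$\phi(\mathbf{r}, t) = -\frac{ik}{4\pi\epsilon_0}\left(1 - \frac{1}{ikr}\right)\mathbf{p}\cdot\hat{e}_r\,\frac{e^{i(kr - \omega t)}}{r}, \qquad (k = \sqrt{\mu_0\epsilon_0}\,\omega). \tag{1.154}$$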

One can now make use of equations (1.16a), (1.16b), to work out the

electric and magnetic intensities of the oscillating electric dipole which

we assume to be placed at the origin in free space:

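$$\mathbf{H}(\mathbf{r}, t) = \frac{ck^2}{4\pi}(\hat{e}_r\times\mathbf{p})\left(1 - \frac{1}{ikr}\right)\frac{e^{i(kr - \omega t)}}{r}, \tag{1.155a}$$

$$\mathbf{E}(\mathbf{r}, t) = \frac{1}{4\pi\epsilon_0}\left(k^2(\hat{e}_r\times\mathbf{p})\times\hat{e}_r + \left[\frac{3\hat{e}_r(\hat{e}_r\cdot\mathbf{p}) - \mathbf{p}}{r^2}\right](1 - ikr)\right)\frac{e^{i(kr - \omega t)}}{r}. \tag{1.155b}$$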

The above formulae are obtained on making use of equations (1.16a) and (1.16b), along

with (1.153a) and (1.154). Eq. (1.155b) may also be deduced from (1.155a), along with

eq. (1.1d) which, in the present context, reads

$$-i\omega\epsilon_0\mathbf{E} = \mathrm{curl}\,\mathbf{H}. \tag{1.156}$$

Noting that the magnetic vector H at any given point is orthogonal to the

unit radial vector er, the field described by the above expressions is said to

belong to the TM type. A number of other features of the electromagnetic

field of the oscillating electric dipole may be noted from equations (1.155a)

and (1.155b) by looking at the far and near zones, corresponding, respec-

tively, to kr >> 1 and kr << 1.

In the far, or radiation, zone (kr >> 1), the fields look like

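$$\mathbf{H} \approx \frac{ck^2}{4\pi}(\hat{e}_r\times\mathbf{p})\frac{e^{i(kr - \omega t)}}{r}, \tag{1.157a}$$

$$\mathbf{E} \approx c\mu_0\,\mathbf{H}\times\hat{e}_r. \tag{1.157b}$$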

This represents a spherical wave, where the spherical wave front moves

radially outward with a uniform speed c = 1/√(ε0µ0), and H is transverse to the

direction of propagation (i.e., er = r/r) as also to the dipole vector p (recall

that the oscillating dipole moment is given by p e^{−iωt}). The electric intensity

E, the magnetic intensity H and the unit propagation vector er make

up a right-handed orthogonal triad, as in the case of a monochromatic

plane wave (recall, in the context of the latter, the relation E = µ0cH × n,

where n stands for the unit wave normal). Thus, in the far zone, the elec-

tromagnetic field can be described as a transverse spherical wave. The

direction of the time-averaged Poynting vector at any given point r points

along er. By integrating over all possible directions of power radiation,

the total power radiated can be worked out, which reads

$$P = \frac{c^3k^4}{12\pi}\mu_0|\mathbf{p}|^2. \tag{1.158}$$
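The integration over directions leading to (1.158) can be reproduced numerically. The sketch below assumes a real dipole vector p along the z-axis, so that |er × p| = |p| sin θ, and uses the far-zone result ⟨S⟩ = (1/2) cµ0 |H|² er that follows from (1.157a), (1.157b); the function names and sample values are illustrative.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space (SI)
C = 299792458.0         # speed of light in vacuum (SI)


def radiated_power_numeric(p_abs, k, steps=2000):
    """Integrate <S> . e_r r^2 over the sphere for a real dipole along z."""
    h_amp = C * k * k * p_abs / (4.0 * math.pi)   # |H| r / sin(theta), from (1.157a)
    d_theta = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * d_theta
        flux_r2 = 0.5 * C * MU_0 * (h_amp * math.sin(theta)) ** 2
        total += flux_r2 * 2.0 * math.pi * math.sin(theta) * d_theta
    return total


def radiated_power_formula(p_abs, k):
    """Formula (1.158)."""
    return MU_0 * C ** 3 * k ** 4 * p_abs ** 2 / (12.0 * math.pi)
```

The two functions agree to within the accuracy of the quadrature. The sin²θ factor in the integrand is the familiar dipole radiation pattern, with no power radiated along the dipole axis.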

While transversality of H to the unit radius vector er is maintained at all

distances, E is no longer transverse in the near and intermediate zones.

The solution for the electromagnetic field produced by the oscillating elec-

tric dipole and represented by equations (1.155a), (1.155b) thus belongs

to the class of transverse magnetic (TM) solutions of Maxwell’s equations.

As mentioned above, the field of the oscillating electric dipole in the near

zone (kr << 1) is not transverse in the sense that E, in general, possesses

a component along er. The magnetic and electric vectors in the near zone

are given by

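$$\mathbf{H} \approx \frac{i\omega}{4\pi}(\hat{e}_r\times\mathbf{p})\frac{e^{i(kr - \omega t)}}{r^2}, \tag{1.159a}$$

$$\mathbf{E} \approx \frac{1}{4\pi\epsilon_0}\frac{3\hat{e}_r(\hat{e}_r\cdot\mathbf{p}) - \mathbf{p}}{r^3}e^{i(kr - \omega t)}. \tag{1.159b}$$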

Thus, the electric field in the near zone completely resembles the field of

a static dipole of dipole moment p, the only difference being the phase

factor e^{i(kr−ωt)}.
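The far-zone and near-zone limits can be compared against the full expressions (1.155a), (1.155b) numerically. The sketch below evaluates the spatial parts of the fields (at t = 0) for a dipole vector p with arbitrary illustrative components; the helper names are hypothetical.

```python
import cmath
import math

MU_0 = 4e-7 * math.pi           # permeability of free space (SI)
C = 299792458.0                 # speed of light in vacuum (SI)
EPS_0 = 1.0 / (MU_0 * C ** 2)   # permittivity of free space (SI)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]   # no complex conjugation


def norm(a):
    return math.sqrt(sum(abs(c) ** 2 for c in a))


def dipole_fields(r_vec, p, k):
    """E and H of (1.155b), (1.155a) at t = 0, together with the unit vector e_r."""
    r = norm(r_vec)
    er = tuple(c / r for c in r_vec)
    wave = cmath.exp(1j * k * r) / r
    er_x_p = cross(er, p)
    h = tuple(C * k * k / (4.0 * math.pi) * c * (1.0 - 1.0 / (1j * k * r)) * wave
              for c in er_x_p)
    radiative = cross(er_x_p, er)
    static = tuple(3.0 * er[i] * dot(er, p) - p[i] for i in range(3))
    e = tuple((k * k * radiative[i] + static[i] * (1.0 - 1j * k * r) / r ** 2)
              * wave / (4.0 * math.pi * EPS_0) for i in range(3))
    return e, h, er


def static_dipole_field(r_vec, p):
    """Field of a static dipole of moment p."""
    r = norm(r_vec)
    er = tuple(c / r for c in r_vec)
    return tuple((3.0 * er[i] * dot(er, p) - p[i]) / (4.0 * math.pi * EPS_0 * r ** 3)
                 for i in range(3))
```

At kr = 1000 the electric vector agrees with cµ0 H × er to a fraction of a per cent, while at kr = 0.001 it agrees with the static dipole field multiplied by the phase factor and has a substantial radial component. In both cases H remains orthogonal to er.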

1.17.3.2 The oscillating magnetic dipole

The field of a harmonically oscillating magnetic dipole of dipole moment,

say, m e^{−iωt} can be similarly worked out, and reads

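$$\mathbf{E} = -\frac{ck^2\mu_0}{4\pi}(\hat{e}_r\times\mathbf{m})\left(1 - \frac{1}{ikr}\right)\frac{e^{i(kr - \omega t)}}{r}, \tag{1.160a}$$

$$\mathbf{H} = \frac{1}{4\pi}\left(k^2(\hat{e}_r\times\mathbf{m})\times\hat{e}_r + \left[\frac{3\hat{e}_r(\hat{e}_r\cdot\mathbf{m}) - \mathbf{m}}{r^2}\right](1 - ikr)\right)\frac{e^{i(kr - \omega t)}}{r}. \tag{1.160b}$$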

Here the electric intensity at any point is orthogonal to the unit radius

vector er, which is why the field is of the TE type. Once again, the field

looks quite different in the far zone (kr >> 1) as compared to that in

the near zone (kr << 1). In the far zone the field can be described as a

transverse spherical wave where the electric intensity, magnetic intensity,

and the unit radial vector er form a right handed orthogonal triad, and

the energy flux at any given point is directed along er. In contrast to the

electric field, the magnetic field possesses a longitudinal component in

the near zone. The near zone magnetic field looks the same as that of a

static magnetic dipole, differing only in the phase factor e^{i(kr−ωt)}. The time

averaged rate of energy radiation from the magnetic dipole works out to

$$P = \frac{k^4}{12\pi}\sqrt{\frac{\mu_0}{\epsilon_0}}\,|\mathbf{m}|^2. \tag{1.161}$$

1.17.3.3 The dipole field produced by a pin-hole

Imagine a plane monochromatic electromagnetic wave incident on an in-

finitely thin perfectly conducting planar screen with a circular hole in it,

where the radius (a) of the hole is small compared to the wavelength (λ)

of the plane wave (a/λ → 0). In this case the field on the other side of the

hole (referred to as the shadow side) closely approximates a superposi-

tion of a TE and a TM dipole field. The solution for the field diffracted (or

scattered) by the pin hole is, in the above limit, one of the few available

exact solutions in electromagnetic boundary value problems, and will be

briefly outlined in chapter 5. The pin hole, in other words, is one of the

means by which spherical electromagnetic waves can be produced.

In the special case of a plane wave incident normally on the screen, or

more generally, for a plane wave with the direction of oscillation of the

electric vector parallel to the plane of the screen, the TE dipole field trans-

mitted by the pin hole dominates over the TM field, i.e., the pin hole acts

as an oscillating magnetic dipole with the dipole axis parallel to the plane

of the screen.

Analogous results hold for a small hole of arbitrary shape, provided the linear dimen-

sions of the hole are small compared to the wavelength λ, though it is the circular hole

that admits of exact results.

1.18 Cylindrical waves

1.18.1 Cylindrical wave solutions of the scalar wave equa-

tion

The scalar wave equation (1.146) can also be solved in the cylindrical co-

ordinate system involving the co-ordinates ρ, φ, z, and the general solution

with a harmonic time dependence of angular frequency ω can, once again,

be expressed in the form of a series where, at large distances ($kr \gg 1$, $k = \omega/v$),
the first term of the series dominates over the succeeding terms and

each succeeding term becomes small compared to the preceding one.

As in the case of the spherical waves, we consider here only that part of the solution

which is regular at infinitely large distances.

Each term of the series by itself constitutes a particular solution of the

scalar wave equation, and the first term describes the cylindrical wave

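$$\psi(\mathbf{r}, t) = A\,H^{(1)}_0(k\rho)\,e^{-i\omega t}, \tag{1.162}$$

where A is a constant and $H^{(1)}_0$ stands for the Hankel function of order
zero of the first kind with the following asymptotic form at large distances

$$H^{(1)}_0(k\rho) \approx \left(\frac{2}{\pi k\rho}\right)^{1/2} e^{i\left(k\rho-\frac{\pi}{4}\right)}. \tag{1.163}$$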

The amplitude of this wave at a distance ρ from the z-axis (which in this

case is a line of singularity representing the source producing the wave,

and on which the homogeneous wave equation no longer holds) varies as

$\rho^{-1/2}$ at such large distances.
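The quality of the asymptotic form (1.163) can be checked numerically. The following is a minimal sketch that uses only the Python standard library: it sums the standard ascending power series for the Bessel functions J_0 and Y_0 (with H_0^(1) = J_0 + iY_0) and compares the result with (1.163). The function names are illustrative, and the series summation is adequate only for moderate arguments (roughly kρ up to 20), since it suffers from cancellation for larger ones.

```python
import cmath
import math

EULER_GAMMA = 0.5772156649015329


def hankel1_order0(x, terms=60):
    """H_0^(1)(x) = J_0(x) + i*Y_0(x), summed from the ascending power series."""
    q = 0.25 * x * x
    term = 1.0        # (-1)^m q^m / (m!)^2, starting from m = 0
    harmonic = 0.0    # H_m = 1 + 1/2 + ... + 1/m
    j0 = 1.0
    y0_sum = 0.0
    for m in range(1, terms):
        term *= -q / (m * m)
        harmonic += 1.0 / m
        j0 += term
        y0_sum -= harmonic * term     # adds (-1)^(m+1) H_m q^m / (m!)^2
    y0 = (2.0 / math.pi) * ((math.log(0.5 * x) + EULER_GAMMA) * j0 + y0_sum)
    return complex(j0, y0)


def hankel1_order0_asymptotic(x):
    """Large-argument form of eq. (1.163)."""
    return math.sqrt(2.0 / (math.pi * x)) * cmath.exp(1j * (x - 0.25 * math.pi))


for x in (2.0, 5.0, 10.0, 20.0):
    exact = hankel1_order0(x)
    approx = hankel1_order0_asymptotic(x)
    rel_err = abs(exact - approx) / abs(exact)
    print(f"k*rho = {x:5.1f}   |H| = {abs(exact):.5f}   relative error = {rel_err:.4f}")
```

The relative error falls off as the argument grows, in line with the statement that the leading term dominates at large distances.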

Interestingly, if we consider a uniform linear distribution of monopole

sources along the z-axis, where each element of the distribution produces

a scalar spherical wave of the form (1.147), then the superposition of all

these spherical waves gives rise to the cylindrical wave solution (1.163).

1.18.2 Vector cylindrical waves

In contrast to a scalar field, the electromagnetic field involves the vectorial

field variables E and H. Solutions to these can be worked out in cylindri-

cal co-ordinates, analogous to those in spherical co-ordinates introduced

above. In particular, assuming that the field is set up in infinitely ex-

tended free space, with a line of singularity along the z-axis representing

the sources and assuming, moreover, that the field vectors are regular at

infinitely large distances, one can again represent the general solution for

the field variables in a series form where, analogous to the vector spher-

ical waves, there occur, once again two types of solutions, namely the

TM and the TE ones. The series expression for either of these two types

involves terms that get progressively small at large distances, where the

first term of the series represents the dominant contribution. If, in any

particular case, the coefficient of the first term turns out to be zero, then

it is the second term that becomes dominant.

In any of the series solutions mentioned in the above paragraphs, there occur undeter-

mined constants, related to the boundary conditions satisfied by the field variables in

any given situation as one approaches the origin or the z-axis (as the case may be), these



being, in turn, related to the sources producing the fields. More precisely, the manner

in which the field variables diverge as the point or the line of singularity is approached,

is related to the nature of the sources located at the point or the line, and the constants

occurring in the series solution are determined by the strengths of the sources of the

various orders of multipolarity.

As a specific example, the following expressions give the magnetic and

electric intensity vectors resulting from the first two terms of the TE se-

ries where we assume for the sake of simplicity that the solution under

consideration is independent of the co-ordinate z. Both these field vectors

can be expressed in terms of a single scalar potential ψ defined below, in

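which two undetermined constants (A, B) appear. The expression for ψ
involves the Hankel functions of the first kind, $H^{(1)}_0$ and $H^{(1)}_1$, of order zero and
one respectively.

$$\begin{aligned}
\psi &= A H^{(1)}_0(k\rho) + B H^{(1)}_1(k\rho)\,e^{i\phi} = \psi_1 + \psi_2 \ (\text{say}),\\
\mathbf{E} &= k^2(\psi_1 + \psi_2)\,\mathbf{e}_z,\\
\mathbf{H} &= \omega\epsilon_0\left(\frac{i}{\rho}\,\psi_2\,\mathbf{e}_\rho + i\,\frac{\partial(\psi_1 + \psi_2)}{\partial\rho}\,\mathbf{e}_\phi\right).
\end{aligned} \tag{1.164}$$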

In these expressions, eρ, eφ, ez stand for the three unit co-ordinate vectors

at any given point. Making use of the properties of the Hankel functions,

one can check that, at large distances, the above solution corresponds to

a cylindrical wave front expanding along eρ with velocity $c = (1/\epsilon_0\mu_0)^{1/2}$ and
that, at such large distances, E, H, and eρ form an orthogonal triad of
vectors, with $H = E/(c\mu_0)$, as in a plane wave.



I close this section by quoting below the expressions for the first (i.e., the

leading) term of the TM series for the field vectors, where these vectors

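are expressed in terms of the scalar field $\psi_1 = A H^{(1)}_0(k\rho)$ occurring in the first
expression in (1.164), A being, once again, an undetermined constant.

$$\begin{aligned}
\mathbf{H} &= k^2\psi_1\,\mathbf{e}_z,\\
\mathbf{E} &= -i\omega\mu_0\,\frac{\partial\psi_1}{\partial\rho}\,\mathbf{e}_\phi.
\end{aligned} \tag{1.165}$$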

Here again, the field vectors at any point at a large distance behave lo-

cally in a manner analogous to those in a plane wave, with the magnetic

intensity polarized along the z-axis and with the wave propagating along

eρ.

Analogous to the scalar case, the vector cylindrical waves correspond to the fields pro-

duced by line distributions (with appropriate densities) of sources, of various orders of

multipolarity, with each element of the distribution sending out spherical waves intro-

duced in sec. 1.17.2. For instance, considering a uniform line distribution of electric

dipoles, the axially symmetric TM cylindrical wave described by eq. (1.165) can be seen

to result from the superposition of the TM spherical waves (equations (1.155a), (1.155b),

with an appropriate choice of the dipole moment vector p) sent out from the various el-

ements of a uniform line distribution of oscillating electric dipoles.

1.18.2.1 Cylindrical waves produced by narrow slits

Imagine a plane monochromatic wave incident normally on a long narrow

slit in an infinitely extended planar sheet made of perfectly conducting

material, where the width (a) of the slit is small compared to the wavelength
(λ) of the plane wave. In this case the field on the other side of the
slit (i.e., the shadow side) closely approximates a superposition of a TE
and a TM cylindrical wave field, and can be expressed in the form of a
series in $a/\lambda$, as will be briefly outlined in chapter 5. The long narrow slit,

in other words, is one of the means by which cylindrical electromagnetic

waves can be produced.

As it turns out from the exact solution, the axially symmetric TM field
of the form (1.165) transmitted by the slit dominates over the TE
field (for $a/\lambda \to 0$). The latter is seen to be of the form (1.164) with ψ = ψ2,
i.e., with A = 0, which is why I quoted the first two terms in the TE case
in contrast to only the first term in the TM case. Note that the field

corresponding to ψ = ψ2 is not axially symmetric while that for ψ = ψ1

possesses axial symmetry (i.e., is independent of the azimuthal angle φ).

1.19 Wave propagation in an anisotropic medium

In this section I will include a number of basic results relating to elec-

tromagnetic wave propagation in linear anisotropic dielectrics. Nonlinear

phenomena in dielectrics will be taken up in chapter 9.

1.19.1 Introduction

The constitutive equations relating the components of E to those of D in

a linear anisotropic dielectric are of the general form (1.2a). In principle,

similar relations (see equation (1.2b)) should hold between the components
of B and H as well, but for most dielectrics of interest, the permeability
can be taken to be a scalar and, moreover, one can take µ = µ0, an

approximation I will adopt in the following.

In addition we will, for the sake of simplicity, assume that the dielectric

is a non-dispersive one, though many of the results stated below remain

valid for a weakly dispersive dielectric with negligible absorption. In what

follows, I will point this out from time to time.

The time averaged energy density for an electromagnetic field set up in a

weakly dispersive anisotropic dielectric is given by the formula

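$$\langle w\rangle = \frac{1}{2}\sum_{ij}\left[\left\langle E_i\,\frac{d}{d\omega}(\omega\epsilon_{ij})\,E_j\right\rangle + \left\langle H_i\,\frac{d}{d\omega}(\omega\mu_{ij})\,H_j\right\rangle\right], \tag{1.166a}$$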

where, for the sake of generality, I have introduced a magnetic perme-

ability tensor µij, and have assumed that there is negligible absorption in

the medium. This formula can be derived by considering a narrow wave

packet, analogous to the way one arrives at eq. (1.131). In the case of a

non-dispersive anisotropic dielectric with a scalar magnetic permeability

µ = µ0, this simplifies to

$$\langle w\rangle = \frac{1}{2}\left[\sum_{ij}\langle E_i\,\epsilon_{ij}\,E_j\rangle + \mu_0\langle H^2\rangle\right]. \tag{1.166b}$$

This is actually the density of a thermodynamic state function for the

dielectric under consideration, a fact that corresponds to the condition

that the dielectric tensor be symmetric. Thus, for any given choice of a

Cartesian co-ordinate system, the components $\epsilon_{ij}$ (i, j = 1, 2, 3) are real
and satisfy

$$\epsilon_{ij} = \epsilon_{ji}. \tag{1.167}$$

1. Strictly speaking, the volume elements of the dielectric cannot be in thermody-

namic equilibrium in the presence of a time-varying field. However, we assume

that the behaviour of the system is in accordance with the principle of linear

response, which holds for a system close to equilibrium and which implies the

symmetry of the dielectric tensor.

2. In the presence of a stationary magnetic field H, the components obey the relation

$\epsilon_{ij}(\mathbf{H}) = \epsilon_{ji}(-\mathbf{H})$ (i, j = 1, 2, 3).

In the following, however, we assume that stationary magnetic fields are absent.

One can then choose a special Cartesian co-ordinate system with reference
to which the matrix of the coefficients $\epsilon_{ij}$ is diagonal. The co-ordinate
axes are then referred to as the principal axes, and the diagonal elements
$\epsilon_1, \epsilon_2, \epsilon_3$, all of which are real, are termed the principal components of
the dielectric (or permittivity) tensor, each of which is $\epsilon_0$ times the
corresponding principal component of the relative permittivity (or dielectric
constant) $\epsilon_{ri}$ (i = 1, 2, 3). Moreover, the positive definiteness of the energy
density implies that the principal dielectric constants are all positive.

Thus, referred to the principal axes, the components of the dielectric tensor
are of the form

$$\epsilon_{ij} = \epsilon_i\delta_{ij} \quad (i, j = 1, 2, 3), \tag{1.168}$$

where $\delta_{ij}$ stands for the Kronecker symbol with value 1 (resp., 0) if the
indices i, j are equal (resp., unequal).

1. For the sake of simplicity, we will assume the dielectric to be a homogeneous one.

Most of the results derived below hold locally (i.e., for a small neighbourhood of

any given point) for a weakly inhomogeneous medium when interpreted in terms of

the eikonal approximation. I will introduce the eikonal approximation in chapter 2

where, however, I will mostly confine myself to considerations relating to isotropic

media.

2. For a dispersive anisotropic medium, the components $\epsilon_{ij}$ of the dielectric tensor
are functions of the frequency ω of the field set up in the medium (and are,
moreover, complex if there is appreciable absorption). This means, in general,
that the principal components $\epsilon_i$ are frequency dependent and, in addition, the

directions of the principal axes are also frequency dependent. However, as I have

already mentioned, I will ignore dispersion (and absorption) effects in most of the

present section.

1.19.2 Propagation of a plane wave: the basics

Let us consider a plane monochromatic wave propagating in the medium,

with frequency ω and propagation vector k = km. Here we use the symbol

m for the unit vector along k, while the symbol n is commonly used to

denote the ‘refractive index vector’

$$\mathbf{n} = \frac{c}{\omega}\,\mathbf{k} = \frac{c}{v_p}\,\mathbf{m}. \tag{1.169}$$

For such a wave, each of the field vectors has a space-time dependence
of the form $\exp\big(i(\mathbf{k}\cdot\mathbf{r} - \omega t)\big)$ in the complex representation. The central
result relating to such a wave is then obtained from Maxwell’s equations
(1.1b), (1.1d) (with ρ = 0, j = 0) along with the relations (1.2a), as

$$\sum_j\left(k_ik_j - k^2\delta_{ij} + \omega^2\mu_0\epsilon_{ij}\right)E_j = 0 \quad (i = 1, 2, 3). \tag{1.170}$$

For a non-trivial solution for the components $E_i$ to exist, one has to have

$$\det A = 0, \tag{1.171a}$$

where the elements of the matrix A are

$$A_{ij} \equiv k_ik_j - k^2\delta_{ij} + \omega^2\mu_0\epsilon_{ij} \quad (i, j = 1, 2, 3), \tag{1.171b}$$

(check this result out). One can, in principle, obtain from this the dis-

persion relation expressing ω in terms of the components of k (where the

components of the dielectric tensor appear as parameters) and then the

ray velocity $\mathbf{v}_r = \mathbf{v}_g = \partial\omega/\partial\mathbf{k}$. This is not an easy job in practice, especially

when the medium is dispersive, though one can have an idea of the type

of results it implies by considering a number of simple cases.

For instance, assuming that the principal axes are fixed directions in

space, independent of the frequency, let us take these as the co-ordinate

axes, and consider the special case of a plane wave with the propagation
vector along the x-axis. Thus $k_1 = k$, $k_2 = k_3 = 0$, from which, using
(1.171a), (1.171b), one obtains the three equations

$$E_1 = 0, \quad (-k^2 + \omega^2\mu_0\epsilon_2)E_2 = 0, \quad (-k^2 + \omega^2\mu_0\epsilon_3)E_3 = 0. \tag{1.172a}$$

This tells us that a wave with its propagation vector directed along the
first principal axis has to be polarized with its electric vector (and
displacement) either along the second principal axis or along the third
principal axis (see fig. 1.16(A), (B)), its phase velocity $v_p = \omega/k$ being different in
the two cases. More precisely, one can have

$$\text{either (a) } E_3 = 0,\ \frac{\omega}{k} = \frac{1}{\sqrt{\epsilon_2\mu_0}}, \quad \text{or (b) } E_2 = 0,\ \frac{\omega}{k} = \frac{1}{\sqrt{\epsilon_3\mu_0}}. \tag{1.172b}$$
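The special case just worked out can be checked numerically. The sketch below builds the matrix A of (1.171b) in the principal-axis frame and evaluates its determinant for a wave vector along the first principal axis. The permittivity values are hypothetical, the units are chosen so that µ0 = 1, and the helper names `wave_matrix` and `det3` are introduced here only for illustration.

```python
import math


def wave_matrix(k_vec, omega, eps_principal, mu0=1.0):
    """A_ij = k_i k_j - k^2 delta_ij + omega^2 mu0 eps_ij, in the principal-axis frame."""
    k_sq = sum(c * c for c in k_vec)
    return [[k_vec[i] * k_vec[j]
             + ((omega**2 * mu0 * eps_principal[i] - k_sq) if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


eps = (2.0, 2.5, 3.0)      # hypothetical principal permittivities (units with mu0 = 1)
k = 1.0
k_vec = (k, 0.0, 0.0)      # propagation vector along the first principal axis

omega_a = k / math.sqrt(eps[1])    # case (a) of (1.172b): E along the second axis
omega_b = k / math.sqrt(eps[2])    # case (b) of (1.172b): E along the third axis

det_a = det3(wave_matrix(k_vec, omega_a, eps))
det_b = det3(wave_matrix(k_vec, omega_b, eps))
det_other = det3(wave_matrix(k_vec, 0.5 * (omega_a + omega_b), eps))
print(det_a, det_b, det_other)
```

The determinant vanishes (to rounding accuracy) only at the two frequencies of (1.172b); at an intermediate frequency it is non-zero, so no plane wave with that frequency and this wave vector can propagate.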

This is a basic and important result. While we have arrived at it by refer-

ring to a special case, it admits of a generalization which states that, for

any given direction of the propagation vector (defined by m), there exist,

in general two possible values of ω, i.e., two values of the phase velocity

vp, the electric displacement vectors for these two being perpendicular

to each other (the electric intensity vectors are mutually perpendicular

only in the special situation being considered here). In other words, two

different plane waves, both linearly polarized, can propagate with the prop-

agation vector pointing in any given direction (as seen in the special case

considered above, the phase velocity does not depend on the magnitude

of the wave vector). The electric intensity vectors of these two need not,

however, be perpendicular to k. As seen from the Maxwell equation
(1.1a) (with ρ = 0), the electric displacement vector D is perpendicular

to k for each of these two waves.

The other basic result in the optics of anisotropic media (recall that our

concern with electromagnetic theory is principally in the context of optics)

relates to ray directions: for any given direction of the wave vector, the

direction of energy propagation, i.e., the ray direction, differs from that of

the wave normal. This I will come back to in sec. 1.19.4.

[Figure 1.16, panels (A) and (B): each panel shows the X, Y, Z axes with origin O and the vectors E, H, and k; panel (A) is labelled $v_p = v_2$ and panel (B) $v_p = v_3$.]

Figure 1.16: Illustrating the propagation of a plane wave through an anisotropic dielectric; the special case of the propagation vector k pointing along the first principal axis of the dielectric tensor is considered for the sake of simplicity; two possible solutions, with distinct phase velocities, are depicted (see (1.172b)); (A) electric intensity and displacement along the second principal axis, $v_p = v_2$; (B) electric intensity and displacement along the third principal axis, $v_p = v_3$; the principal phase velocities are defined as in (1.173b).

1.19.3 The phase velocity surface

Since, for any given direction $\mathbf{m}\,(= \mathbf{k}/k)$ of the wave vector, there are, in
general, two values of $v_p = \omega/k$, a polar plot of $v_p$ as a function of the
direction cosines $(m_x, m_y, m_z)$ of the wave vector is a two-sheeted surface. This

is variously referred to as the phase velocity surface, the wave normal

surface, or, in brief, the normal surface.

1. A typical point on the polar plot is obtained by drawing a line from the origin of the

co-ordinate axes along any direction specified by mx,my,mz and locating a point

on it at a distance vp on this line. For a linear anisotropic medium, two such

points are, in general, obtained for any given direction.

2. Recall that, by contrast, the phase velocity is independent of the direction cosines

in the case of an isotropic medium, and the polar plot of vp is a one-sheeted

surface, namely, a sphere of radius c/n, n being the refractive index of the medium.

3. Considering any point on the normal surface, the wave normal m along the radius
vector to that point from the origin does not, in general, represent the normal to
the phase velocity surface.


The equation describing this two-sheeted phase velocity surface can be

deduced from (1.171a), (1.171b), and is referred to as Fresnel’s equation

of wave normals (also referred to as Fresnel’s equation for the phase
velocity), which reads

$$\frac{m_x^2}{v_p^2 - v_1^2} + \frac{m_y^2}{v_p^2 - v_2^2} + \frac{m_z^2}{v_p^2 - v_3^2} = 0, \tag{1.173a}$$

where $v_1, v_2, v_3$ are the principal phase velocities (but not the components
of the phase velocity vector $\mathbf{v}_p = \frac{\omega}{k}\mathbf{m}$ along the principal axes) defined in
terms of the principal components of the dielectric tensor as

$$v_i = \frac{1}{\sqrt{\epsilon_i\mu_0}} = \frac{c}{\sqrt{\epsilon_{ri}}} \quad (i = 1, 2, 3). \tag{1.173b}$$

Eq. (1.173a) is a quadratic equation in $v_p^2$, giving two solutions for any
given m, thus explaining the two-sheeted structure of the phase velocity
surface.

1. For each of the two possible solutions for $v_p^2$ for a given m, there correspond two
values of the phase velocity of the form $\pm v_p$. These we do not count as distinct
solutions since they correspond to waves traveling in opposite directions, with the
same magnitude of the phase velocity.

2. The phase velocity surface effectively describes the dispersion relation in
graphical form, relating the frequency ω to the components $k_x, k_y, k_z$ of the wave
vector, since it gives $\omega/k$ in terms of $m_x, m_y, m_z$. For any given k one obtains, in
general, two different values of ω.
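The quadratic structure of Fresnel's equation can be made explicit in a few lines of code. The sketch below clears the denominators in (1.173a) to obtain a quadratic in u = v_p^2 and returns its two roots for a given wave normal. The principal velocities used in the example are hypothetical, dimensionless values, and the function name `phase_velocities` is an illustrative choice.

```python
import math


def phase_velocities(m, v_principal):
    """Return the two roots v_p (larger first) of Fresnel's equation (1.173a).

    m           -- direction of the wave normal (need not be normalized)
    v_principal -- the principal phase velocities (v1, v2, v3)
    """
    norm = math.sqrt(sum(c * c for c in m))
    mx2, my2, mz2 = ((c / norm) ** 2 for c in m)
    a, b, c = (v * v for v in v_principal)       # squared principal velocities
    # Clearing denominators in (1.173a) gives u^2 - p*u + q = 0 for u = v_p^2.
    p = mx2 * (b + c) + my2 * (c + a) + mz2 * (a + b)
    q = mx2 * b * c + my2 * c * a + mz2 * a * b
    disc = max(p * p - 4.0 * q, 0.0)             # guard against rounding below zero
    root = math.sqrt(disc)
    return math.sqrt(0.5 * (p + root)), math.sqrt(0.5 * (p - root))


v = (1.0, 0.8, 0.6)                              # hypothetical principal velocities
print(phase_velocities((1.0, 0.0, 0.0), v))      # wave normal along the first principal axis
print(phase_velocities((1.0, 2.0, 3.0), v))      # a generic direction
```

For a wave normal along the first principal axis the two roots reduce to v2 and v3, in agreement with (1.172b); for a generic direction the two roots lie between the principal velocities.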

Fig. 1.17 depicts schematically the two-sheeted nature of the phase ve-

locity surface, where the surface is shown only in the positive octant,



with the co-ordinate axes along the principal axes of the dielectric tensor.

Considering a typical point on the phase velocity surface, its co-ordinates
are of the form $(\xi = v_pm_x,\ \eta = v_pm_y,\ \zeta = v_pm_z)$, where $v_p$ is the phase
velocity in the direction $(m_x, m_y, m_z)$. The equation of the surface is one of
sixth degree in the co-ordinates ξ, η, ζ, and the section of the two sheets of the
surface by any of the three principal planes consists, in general, of a circle
and an oval, the latter being a closed curve of the fourth degree.

The two sheets of the phase velocity surface intersect each other at four

points located at the ends of two line segments, one of which is the point

N shown in fig. 1.17. The directions along the two line segments define

the optic axes (more precisely, the wave optic axes since, as we will see

below, there exist a pair of ray optic axes as well) of the medium.
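As a worked illustration, the directions of the wave optic axes can be read off from Fresnel's equation (1.173a). The sketch below assumes the ordering v1 > v2 > v3 adopted in fig. 1.17 and a wave normal confined to the x-z plane (my = 0); these are assumptions made for concreteness, not additional results.

```latex
% Wave normal in the x-z plane: m_y = 0, m_x^2 + m_z^2 = 1.
% Clearing denominators in (1.173a) then gives a factorized equation:
\left(v_p^2 - v_2^2\right)
\left[\, m_x^2\left(v_p^2 - v_3^2\right) + m_z^2\left(v_p^2 - v_1^2\right) \right] = 0 .

% The two factors give the two sections of the surface by the x-z plane:
v_p^2 = v_2^2 \quad \text{(circle)}, \qquad
v_p^2 = m_x^2 v_3^2 + m_z^2 v_1^2 \quad \text{(oval)} .

% The two sheets meet where both values coincide, which fixes the optic axes:
m_x^2 = \frac{v_1^2 - v_2^2}{v_1^2 - v_3^2}, \qquad
m_z^2 = \frac{v_2^2 - v_3^2}{v_1^2 - v_3^2} .
```

The four sign combinations of (mx, mz) give the four points of intersection, lying at the ends of two line segments through the origin in the x-z plane.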

As mentioned above, another representation of identical mathematical

content as the phase velocity surface is in terms of the ω-k surface,

which depicts graphically the relation (1.171a), (1.171b) where a typical
point has co-ordinates $(\omega(\mathbf{k})m_x,\ \omega(\mathbf{k})m_y,\ \omega(\mathbf{k})m_z)$. Since $\omega(\mathbf{k}) = kv_p$, the
ω-k surface is nothing but a scaled version of the phase velocity surface.
Expressing the left hand side of (1.171a) as $F(\omega, k_x, k_y, k_z)$, the phase
velocity surface is seen to be a surface geometrically similar to the one
represented by the equation

$$F(\omega, k_x, k_y, k_z) = 0. \tag{1.174}$$

Incidentally, the formula (1.173a) can be expressed in an alternative form

in terms of the components $(n_x, n_y, n_z)$ of the refractive index vector n
introduced in sec. 1.19.2 (eq. (1.169)), which reads

$$n^2\left(\epsilon_1n_x^2 + \epsilon_2n_y^2 + \epsilon_3n_z^2\right) - \left(\epsilon_1(\epsilon_2 + \epsilon_3)n_x^2 + \epsilon_2(\epsilon_3 + \epsilon_1)n_y^2 + \epsilon_3(\epsilon_1 + \epsilon_2)n_z^2\right) + \epsilon_1\epsilon_2\epsilon_3 = 0. \tag{1.175}$$

[Figure 1.17: the two sheets of the surface drawn in the first octant, with axes X, Y, Z and origin O; the axis intercepts are labelled v1, v2, v3; a point P on the surface, the unit vector m along OP, and the intersection point N are marked.]

Figure 1.17: Illustrating the two-sheeted phase velocity surface determined by formula (1.173a); the part of the surface in the first octant is shown; $v_1, v_2, v_3$ are the three principal phase velocities defined as in (1.173b); these are assumed to be ordered as $v_1 > v_2 > v_3$ for the sake of concreteness; the intercepts on the x-axis (the first principal axis) are $v_2, v_3$ (see (1.172b)), while the other intercepts are also shown; if P be any point lying on the surface and the unit vector along OP be m, then the phase velocity $v_p$ for a plane wave with wave vector along m is given by the length OP; the two sheets of the phase velocity surface (also termed the normal surface) intersect, in general, at four points (end points of two line segments lying in the x-z plane), of which one is at N; the ω-k surface is geometrically similar to this phase velocity surface, scaled by the propagation constant k.

In summary, two distinct plane waves can propagate for any given di-

rection, specified by the unit vector m, of the wave vector k, the electric

displacement vectors of the two being perpendicular to each other. The

phase velocities of the two waves are obtained from the phase velocity

surface, which is geometrically similar to the ω-k surface. There exist, in



general, two directions, along the optic axes, for which there is only one

possible phase velocity, which means that a plane wave of arbitrary state

of polarization can propagate with a single (i.e., unique) phase velocity

along either of the optic axes.

As we will see (refer to sec. 1.19.8), there may exist media for which the

anisotropy is of a relatively simple kind, wherein the two optic axes de-

generate to a single direction in space. These are termed uniaxial media,

in contrast to the more general biaxial ones.

1.19.4 The ray velocity surface

As I have mentioned above, one can in principle work out the ray velocity

($\mathbf{v}_g = \partial\omega/\partial\mathbf{k}$) by differentiation from (1.171a), (1.171b). However, the ray
velocity vector $\mathbf{v}_r\,(= \mathbf{v}_g)$ can be characterized in alternative ways.

The direction of the phase velocity being along that of k, the phase velocity vector is
given by $\mathbf{v}_p = \frac{\omega}{k}\mathbf{m}$.

Referring to the function F = detA introduced above (see sections 1.19.2

and 1.19.3), and making use of the principles of partial differentiation,

one obtains

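$$\mathbf{v}_r = \left[\frac{\partial\omega}{\partial\mathbf{k}}\right]_{F=0} = -\frac{\partial F/\partial\mathbf{k}}{\partial F/\partial\omega}. \tag{1.176}$$

The expression $\partial F/\partial\mathbf{k}$ on the right hand side of this formula is a vector along
the normal to the ω-k surface at the point corresponding to the wave vector
k, which thus tells us that the ray velocity vector for given $(m_x, m_y, m_z)$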

is along the normal to the phase velocity surface at the corresponding

point on it. In other words, while the phase velocity is given by the vec-

torial distance of a specified point on the phase velocity surface from the

origin, the ray velocity is directed along the normal to that point. This

relation between the phase- and the ray velocity is depicted graphically

in fig. 1.18.

Consider now a vector s along the direction of the ray velocity for a given

unit wave normal m (along the direction of the phase velocity correspond-

ing to which the refractive index vector is n), the magnitude of s being

determined in accordance with the formula

n · s = 1. (1.177a)

Analogous to the relation (1.169), the vector s is related to the ray velocity

vector vr as

$$\mathbf{s} = \frac{1}{c}\,\mathbf{v}_r. \tag{1.177b}$$

Making use of the definition (1.177a), this is seen to be equivalent to the

relation

vp = vr cosα, (1.177c)

where α is the angle between the directions of the phase velocity and ray

velocity vectors, as shown in fig. 1.18.
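The step from (1.177a) and (1.177b) to (1.177c) is short enough to write out in full. The lines below use only the definition (1.169) of n and the unit vector t along the ray direction; the symbol t for this unit vector is the one introduced later in the text.

```latex
% n = (c / v_p) m   from (1.169),   s = v_r / c = (v_r / c) t   from (1.177b)
\mathbf{n}\cdot\mathbf{s}
  = \frac{c}{v_p}\,\mathbf{m}\cdot\frac{v_r}{c}\,\mathbf{t}
  = \frac{v_r}{v_p}\,\cos\alpha ,
\qquad \cos\alpha \equiv \mathbf{m}\cdot\mathbf{t} .

% Imposing the normalization (1.177a), n . s = 1, gives (1.177c):
\frac{v_r}{v_p}\,\cos\alpha = 1
\quad\Longrightarrow\quad
v_p = v_r\cos\alpha .
```

In particular the ray velocity is never smaller than the phase velocity, the two being equal only when the ray is along the wave normal.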



Here is yet another instance of use of the symbol α, which is not to be confused with

the same symbol having been used earlier in two senses (polarizability, attenuation

coefficient; refer to sections 1.15.1.2, 1.15.2), both different from the present one. No

matter.

[Figure 1.18: axes X, Y, Z with origin O; part of one sheet of the phase velocity surface with a point P on it; the vectors $\mathbf{v}_p$ (along OP) and $\mathbf{v}_r$ (along the normal PQ), and the angle α between them, are marked.]

Figure 1.18: Depicting the relation between the phase velocity surface, the direction of the wave vector, and the ray direction (schematic); O is an origin chosen in the anisotropic medium, while P is a point on the phase velocity surface, where part of only one sheet making up the surface is shown for the sake of illustration; corresponding to the chosen point P on the surface, the wave vector k is directed along OP, while the length of the segment OP gives the phase velocity $v_p$; PQ is along the normal to the surface at P, giving the direction of the ray velocity $\mathbf{v}_r$ (and of the corresponding vector s, see (1.177b)); the angle α between the directions OP and PQ relates the phase- and ray velocities as in (1.177c).

Assuming the medium under consideration to be non-dispersive, the energy density is

given by

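$$w = w_e + w_m = \frac{1}{2}\left(\mathbf{E}\cdot\mathbf{D} + \mathbf{H}\cdot\mathbf{B}\right) = \frac{1}{2}\left[-\frac{k}{\omega}\,\mathbf{E}\cdot(\mathbf{m}\times\mathbf{H}) + \frac{k}{\omega}\,\mathbf{H}\cdot(\mathbf{m}\times\mathbf{E})\right],$$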

i.e., vpw = m · S, where an appropriate time averaging is implied. Again, the ray ve-

locity vr = vg is related to S and w as S = vrw. These two relations taken together

imply (1.177c) (check this out), and hence (1.177b).

The vector s being parallel to S, is perpendicular to both E and H. This,



along with the Maxwell equations (1.1b), (1.1d), in the absence of source

terms, leads to the following results

H = cs×D, E = −cµ0s×H. (1.178)

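Making use of (1.1d), one gets, for the plane wave under consideration,
$\mathbf{s}\times\mathbf{D} = -\frac{k}{\omega}\,\mathbf{s}\times(\mathbf{m}\times\mathbf{H}) = \frac{1}{c}(\mathbf{n}\cdot\mathbf{s})\,\mathbf{H} = \frac{1}{c}\mathbf{H}$. The second relation in (1.178) is similarly obtained.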

In turn, the two relations (1.178) imply

$$\det B = 0, \tag{1.179a}$$

where the elements of the matrix B are

$$B_{ij} = s_is_j - s^2\delta_{ij} + \epsilon_0\,(\epsilon^{-1})_{ij}, \tag{1.179b}$$

the coefficients $(\epsilon^{-1})_{ij}$ being the elements of the inverse matrix of ε (i.e., of
the matrix made up of the elements $\epsilon_{ij}$).

These relations are analogous (and, in a sense, dual) to formulae (1.171a), (1.171b),
and define a two-sheeted ray velocity surface relating the ray velocity $v_r$ to
the unit vector $\mathbf{t} \equiv \mathbf{s}/|\mathbf{s}|$ specifying the ray direction. The equation relating
$v_r$ to the components of t (referred to as Fresnel’s equation for the ray
velocity) reads

$$\frac{t_x^2}{v_r^{-2} - v_1^{-2}} + \frac{t_y^2}{v_r^{-2} - v_2^{-2}} + \frac{t_z^2}{v_r^{-2} - v_3^{-2}} = 0, \tag{1.180}$$



where v1, v2, v3 stand for the principal ray velocities, these being the same

as the corresponding principal phase velocities.

This equation describes a surface of degree four in the co-ordinates $\xi = v_rt_x$,
$\eta = v_rt_y$, $\zeta = v_rt_z$, a section of which by any of the three co-ordinate
planes is, in general, a circle and an ellipse. The two sheets of the ray

velocity surface again intersect in four points located at the ends of two

line segments, and the directions along these line segments define the

ray optic axes of the medium. Considering any point P on the ray velocity

surface, the segment OP extending from the origin up to that point gives

the value of vr for the ray direction along OP. What is more, the wave vec-

tor k corresponding to the ray along OP is directed along the normal to

the ray velocity surface drawn at P. All this indicates that there is a cer-

tain correspondence, or duality, as one may call it, between statements

pertaining to wave vectors and those pertaining to rays.

The ray velocity surface tells us that, for any given ray direction specified

by the unit vector t, there can be two plane waves with different ray

velocities, the electric intensity vectors for the two being perpendicular to

each other. The two ray optic axes are special directions for each of which

there corresponds only one single ray velocity, while the electric intensity

vector can correspond to any arbitrary state of polarization.

1.19.5 The wave vector and the ray vector

One basic distinctive feature of plane wave propagation in an anisotropic

medium, as compared with an isotropic one, relates to the fact that the

direction of the ray, i.e., of energy propagation, differs from that of the

wave vector (or propagation vector). While the latter is given by $\mathbf{k} = \frac{\omega}{v_p}\mathbf{m}$,
the corresponding ray vector is $\mathbf{s} = \frac{v_r}{c}\mathbf{t}$. We have seen how the two
directions m and t are related to each other in terms of the geometries of

the wave velocity surface and the ray velocity surface. Here is another

set of formulas that allows one to obtain the ray direction t directly from

the wave vector direction m, where I skip the series of intermediate steps

necessary to arrive at the final formulas.

As we see below, there corresponds, in general, not one but two ray directions for any

direction of the wave normal. This is so because, for any given m, there are, in general,

two points of intersection of the line of propagation with the phase velocity surface, and

two normals at the points of intersection.

First, one needs a formula relating the ray velocity directly with the phase

velocity for any given unit vector m along the wave vector, which reads

$$v_r^2 = v_p^2 + \frac{1}{v_p^2\left[\left(\dfrac{m_x}{v_p^2 - v_1^2}\right)^2 + \left(\dfrac{m_y}{v_p^2 - v_2^2}\right)^2 + \left(\dfrac{m_z}{v_p^2 - v_3^2}\right)^2\right]}. \tag{1.181}$$

Recall that, for any given m, the phase velocity $v_p$ is known from Fresnel’s
equation (formula (1.173a)), which then gives $v_r$ from (1.181). Using this
value of $v_r$, the components of t are obtained from the relations

$$t_i = \frac{v_p}{v_r}\,\frac{v_i^2 - v_r^2}{v_i^2 - v_p^2}\,m_i \quad (i = 1, 2, 3). \tag{1.182}$$
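The chain of steps just described (Fresnel's equation for v_p, then (1.181) for v_r, then the components of t) can be sketched as follows. The principal velocities and the wave normal are hypothetical illustrative values, and the function name is my own. The ray direction is computed in the form t_i = (v_p/v_r) (v_i^2 - v_r^2)/(v_i^2 - v_p^2) m_i, which is the form for which t is a unit vector satisfying m · t = v_p/v_r, as required by (1.177c). The sketch assumes a generic wave normal with no vanishing component, so that none of the denominators is zero.

```python
import math


def rays_for_wave_normal(m, v_principal):
    """For a unit wave normal m, return [(v_p, v_r, t), ...] for the two waves."""
    a = [v * v for v in v_principal]              # squared principal velocities
    m2 = [c * c for c in m]
    # Fresnel's equation (1.173a) as a quadratic u^2 - p*u + q = 0 in u = v_p^2.
    p = m2[0] * (a[1] + a[2]) + m2[1] * (a[2] + a[0]) + m2[2] * (a[0] + a[1])
    q = m2[0] * a[1] * a[2] + m2[1] * a[2] * a[0] + m2[2] * a[0] * a[1]
    root = math.sqrt(max(p * p - 4.0 * q, 0.0))
    results = []
    for u in (0.5 * (p + root), 0.5 * (p - root)):
        s = sum(m2[i] / (u - a[i]) ** 2 for i in range(3))
        vr2 = u + 1.0 / (u * s)                   # eq. (1.181)
        vp, vr = math.sqrt(u), math.sqrt(vr2)
        # Components of the unit ray vector t for this wave.
        t = tuple((vp / vr) * (a[i] - vr2) / (a[i] - u) * m[i] for i in range(3))
        results.append((vp, vr, t))
    return results


v = (1.0, 0.8, 0.6)                               # hypothetical principal velocities
norm = math.sqrt(14.0)
m = (1.0 / norm, 2.0 / norm, 3.0 / norm)          # hypothetical unit wave normal

for vp, vr, t in rays_for_wave_normal(m, v):
    cos_alpha = sum(ti * mi for ti, mi in zip(t, m))
    print(f"v_p = {vp:.4f}  v_r = {vr:.4f}  cos(alpha) = {cos_alpha:.4f}  t = {t}")
```

For each of the two waves the printed cos(alpha) equals v_p/v_r, and the two ray directions differ both from each other and from the wave normal.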

Since there are, in general, two values of vp for any given wave vector

direction m, it follows that there are, in general, two ray directions t as

well, with four distinct ray velocities (recall that, for each ray direction



there are, in general, two ray velocities where ray velocities differing only

in sign are not counted as being distinct) and, correspondingly, four dis-

tinct sets of directions of the pair of vectors D, E. For the special case of a

wave normal along either of the two optic axes (the wave optic axes, that

is), there correspond not just two but an infinite number ray directions,

all lying on the surface of a cone. Analogously, for any given ray direction

t, there exist, in general, two wave vector directions m while, in the spe-

cial case of a ray along either of the two ray optic axes, there correspond

an infinite number of wave vector directions, all lying on the surface of a

cone.

1.19.6 Polarization of the field vectors

Continuing to refer to a plane monochromatic wave propagating through

an anisotropic medium, with the wave vector k along the unit wave nor-

mal m, and any one of the two corresponding unit ray vectors, t, the

directions of the field vectors E, D, and H can be seen to be related to m

and t in a certain definite manner.

Assuming that there are no free charges and currents, Maxwell’s equa-

tions (1.1a), (1.1c) imply that m is perpendicular to D and H (recall that B

and H are parallel to each other under the assumption that the magnetic

permeability is a scalar; we assume, moreover, that µ ≈ µ0). On the other

hand, equations (1.1b) and (1.1d) imply that E and D are perpendicular

to H.

It follows that D, H, and m form a right handed orthogonal triad of
vectors. Again, t being directed along the Poynting vector E × H, the three
vectors E, H, and t form a right handed orthogonal triad. The vectors
t, m, E, and D being all perpendicular to H, are co-planar. Hence, the
angle α between the unit vectors m and t (see fig. 1.18) is also the angle

between E and D. All this is depicted schematically in fig. 1.19.

The validity of these statements is based on the condition that the dielectric tensor

be real, which in turn requires that absorption in the medium under consideration be

negligible.

For a given direction of the unit wave normal m, the two possible ray

directions define two corresponding planes containing m and t. Once this

plane is fixed, the directions of D and E are determined as in fig. 1.19.

These directions of E and D give the state of polarization of the plane wave

under consideration. In other words, each of the two possible plane waves

for any given direction of m is in a definite state of linear polarization. This

state of polarization can be determined by a geometrical construction

involving what is referred to as the ellipsoid of wave normals or the index

ellipsoid. An alternative approach is to describe the state of polarization

in terms of the ray ellipsoid.

1.19.7 The two ellipsoids

The index ellipsoid.

Considering a plane wave with a given unit wave normal m and referring

to the expression for the energy density for the wave, one arrives at the

conclusion that the components of D are proportional to the components

Figure 1.19: Depicting the orientation of the field vectors E, D, and H with reference to the unit wave normal m and the unit ray vector t; the vectors E and D are co-planar with m and t while H is perpendicular to their common plane; the angle α between m and t (refer to fig. 1.18) is shown.

(x, y, z) of a certain vector r that satisfy the relation

\[ \frac{x^2}{\epsilon_1} + \frac{y^2}{\epsilon_2} + \frac{z^2}{\epsilon_3} = 1. \tag{1.183} \]

Here D stands for either one of the two vectors D1, D2 corresponding to

the given unit normal m and any given value of the energy density. For

any other value of the energy density, there again correspond two possible

electric displacement vectors which are parallel to D1 and D2 respectively.

1. Recall that we have chosen a set of Cartesian axes along the three principal axes

of the dielectric tensor, and that ǫi (i = 1, 2, 3) are the principal components of

the dielectric tensor. In other words, referred to the principal axes, the dielectric

tensor is given by ǫij = ǫiδij (i, j = 1, 2, 3).

2. In referring to the phase velocity surface, ray velocity surface, index ellipsoid, or

the ray ellipsoid (see below), one chooses the origin at any point in the medium

under consideration, assuming the latter to be a homogeneous one, in which case

the principal axes and the principal velocities do not depend on the choice of the

origin. For an inhomogeneous medium, one can invoke the methods relating to

the eikonal approximation (outlined in chapter 2 in the context of isotropic media),

provided the inhomogeneity is in a certain sense, a weak one.

3. In the following, we consider a given value of the energy density without loss of

generality, since a different value would correspond to different magnitudes of the

electric displacement vectors with their directions, however, remaining unaltered.

The two corresponding phase velocities are also independent of the value of the

energy density.

4. I do not enter into proofs and derivations relating to the statements made in this

section.

The vector D is thus parallel to r, which extends from the origin (located

at any chosen point in the dielectric, assumed to be a homogeneous one)

up to the surface of the ellipsoid represented by the above equation. More

precisely, D lies in the principal section of the ellipsoid (i.e., the section

by a plane passing through the centre) perpendicular to m where this

section, in general, is an ellipse. Fig. 1.20 depicts the principal axes

P1P′1 and P2P′2 of the ellipse. The rule determining the directions of the

vectors D1 and D2 is simple to state: these are parallel to P1P′1 and P2P′2

respectively.

For each of these two, the direction of the displacement vector can point in either of

two opposite directions. However, these will not be counted as distinct, since they

simply correspond to two opposite directions of propagation, with the same propagation

constant k.

The ellipsoid (1.183), termed the index ellipsoid, or the ellipsoid of wave

normals, also permits a geometrical evaluation of the phase velocities of

the two waves with the given unit wave normal m. Thus, in fig. 1.20,

consider the lengths of the segments OP1 and OP2, i.e., the magnitudes of

the radius vectors r1, r2 along the two principal axes of the elliptic section

of the index ellipsoid by a plane perpendicular to m. These are inversely

proportional to the two phase velocities in question, corresponding to the

plane waves with electric displacement vectors D1 and D2 respectively.

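More precisely, denoting by $\sqrt{\epsilon^{(1)}}$ and $\sqrt{\epsilon^{(2)}}$ the lengths of the two segments

mentioned above, the two phase velocities are given by

\[ v_{p1} = \frac{1}{\sqrt{\mu_0 \epsilon^{(1)}}}, \qquad v_{p2} = \frac{1}{\sqrt{\mu_0 \epsilon^{(2)}}}. \tag{1.184} \]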
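
The geometrical construction described above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes relative principal permittivities (so that the semi-axes of the section are the square roots of the relative values and the phase velocities are c divided by those lengths), and the function name `index_ellipsoid_section` is purely illustrative.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def index_ellipsoid_section(eps_r, m):
    """Central section of the index ellipsoid perpendicular to the wave normal m.

    eps_r : the three principal relative permittivities (principal-axis frame)
    m     : wave normal (normalized internally)
    Returns (semi_axes, d_directions, phase_velocities).
    """
    eps_r = np.asarray(eps_r, dtype=float)
    m = np.asarray(m, dtype=float)
    m = m / np.linalg.norm(m)

    # Orthonormal basis (u, v) of the plane perpendicular to m.
    helper = np.eye(3)[np.argmin(np.abs(m))]
    u = np.cross(m, helper)
    u /= np.linalg.norm(u)
    v = np.cross(m, u)

    # Restrict the quadratic form x^2/eps1 + y^2/eps2 + z^2/eps3 to that plane.
    eta = np.diag(1.0 / eps_r)
    basis = np.column_stack([u, v])   # 3 x 2
    q = basis.T @ eta @ basis         # 2 x 2 symmetric matrix

    lam, vecs = np.linalg.eigh(q)     # principal axes of the elliptic section
    semi_axes = 1.0 / np.sqrt(lam)    # lengths OP1, OP2
    d_dirs = (basis @ vecs).T         # rows: unit vectors along D1 and D2
    v_phase = C_LIGHT / semi_axes     # eq. (1.184), with mu = mu0
    return semi_axes, d_dirs, v_phase


if __name__ == "__main__":
    axes, d_dirs, v = index_ellipsoid_section([2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
    print("semi-axes:", axes)
    print("D directions:", d_dirs)
    print("phase velocities (m/s):", v)
```

The two rows of `d_dirs` are perpendicular to each other and to m, in accordance with the statement that D1 and D2 lie along the principal axes of the section.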

The special case of the wave vector pointing along either of the two optic

axes deserves attention.

As mentioned in sec. 1.19.8, the number of optic axes is generally two for an anisotropic

medium. In the special case of a uniaxial medium, however, there is only one optic axis.

For an ellipsoid there exist, in general, two planar sections each of which

is circular instead of elliptic. Considering the directions perpendicular to

these special sections, one obtains the directions of the optic axes. Hence,

for a wave with the wave vector along either of the two optic axes, any two

mutually perpendicular axes in the circular section may be chosen as the

principal axes and thus, the directions of D1, D2 are arbitrary. Moreover,

instead of two distinct values of the phase velocity, there corresponds

only one single value vp. This means that a plane wave of an arbitrarily

chosen state of polarization can propagate with its wave vector directed

along either of the two optic axes.
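
As an illustration, the directions of the optic axes can be located by demanding that the central section be circular. The sketch below assumes the ordering ǫ1 < ǫ2 < ǫ3, in which case the circular sections contain the y-axis and the optic axes lie in the x-z plane, symmetrically about the z-axis; the closed-form angle used here follows from that condition and is not quoted from the text, and the function names are hypothetical.

```python
import numpy as np


def optic_axis_angle(eps1, eps2, eps3):
    """Angle (radians) between each optic axis and the z-axis, for eps1 < eps2 < eps3.

    Obtained by requiring the radius of the central section in the x-z plane
    to equal sqrt(eps2), the semi-axis along y.
    """
    if not (eps1 < eps2 < eps3):
        raise ValueError("this sketch assumes eps1 < eps2 < eps3")
    eta1, eta2, eta3 = 1.0 / eps1, 1.0 / eps2, 1.0 / eps3
    return np.arctan(np.sqrt((eta1 - eta2) / (eta2 - eta3)))


def section_semi_axes(eps, m):
    """Semi-axes of the central section of the index ellipsoid perpendicular to m."""
    m = np.asarray(m, dtype=float)
    m = m / np.linalg.norm(m)
    p = np.eye(3) - np.outer(m, m)                       # projector onto the plane
    q = p @ np.diag(1.0 / np.asarray(eps, dtype=float)) @ p
    lam = np.sort(np.linalg.eigvalsh(q))[1:]             # drop the zero eigenvalue along m
    return 1.0 / np.sqrt(lam)


if __name__ == "__main__":
    eps = (2.0, 3.0, 4.0)
    beta = optic_axis_angle(*eps)
    for sign in (+1.0, -1.0):
        m = np.array([sign * np.sin(beta), 0.0, np.cos(beta)])
        print(np.degrees(beta), section_semi_axes(eps, m))  # two equal semi-axes
```

For a wave normal along either of the two directions printed above the two semi-axes coincide, so that there is a single phase velocity and the directions of D1, D2 are arbitrary within the section.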

Figure 1.20: Illustrating the idea of the index ellipsoid; the x-, y-, and z-axes are the principal axes of the index ellipsoid defined by eq. (1.183); a section of the ellipsoid by a plane perpendicular to the wave vector k (i.e., to the unit wave normal m) is shown; this section is in general an ellipse, and its principal axes are along P′1OP1 and P′2OP2; the two possible electric displacement vectors D1 and D2 are polarized along these two; the phase velocities corresponding to these are inversely related to the lengths of the segments OP1 and OP2; the two optic axes are also shown schematically (dotted lines along OO1, OO2), along with the sections of the ellipsoid perpendicular to these two, these being circular in shape; for a wave with its wave vector along either of the two optic axes, D1 and D2 can be along any two mutually perpendicular directions in the plane of the circle.

The ray ellipsoid.

Like the index ellipsoid, the ray ellipsoid is another useful geometrical

construct. Analogous to the correspondence (in a sense, a duality) be-

tween the phase velocity surface and the ray velocity surface, the index

ellipsoid and the ray ellipsoid are also related by a duality. The ray ellip-

soid is given by the equation

\[ \epsilon_1 x^2 + \epsilon_2 y^2 + \epsilon_3 z^2 = 1, \tag{1.185} \]

and is obtained from the expression of the energy density of a plane

monochromatic wave in terms of the electric intensity E (by contrast,

the equation of the index ellipsoid is obtained from the expression for the

energy density in terms of the electric displacement vector). The centre of

the ellipsoid can be chosen anywhere in the medium under consideration

(recall that the latter has been assumed to be homogeneous for the sake

of simplicity), and the radius vector r from the centre, chosen as the ori-

gin, to any point P on the ellipsoid then represents the electric intensity,

up to a constant of proportionality, for a wave of some specified energy

density where the ray direction for the wave is perpendicular to r. More

specifically, regardless of the value of the energy density, the electric in-

tensity for any given unit ray vector t lies in the principal section (i.e., a

section by a plane passing through the centre which is, in general, an

ellipse) of the ray ellipsoid by a plane perpendicular to t.

Moreover, the two possible directions of E for the given t point along the

principal axes of the ellipse. Finally, the corresponding ray velocities are

proportional to the principal semi-axes of the ellipse. All this, actually, is

an expression of the relation of duality I have mentioned above.
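
The dual construction can be sketched in the same manner. The snippet below is only an illustration under stated assumptions: relative permittivities are used (so that the semi-axes of the ray ellipsoid are the principal velocities divided by c), and the name `ray_velocities` is hypothetical.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def ray_velocities(eps_r, t):
    """Two ray velocities for the unit ray vector t, from the ray ellipsoid.

    The central section of eps1*x^2 + eps2*y^2 + eps3*z^2 = 1 perpendicular to t
    is an ellipse; the ray velocities are c times its principal semi-axes.
    """
    t = np.asarray(t, dtype=float)
    t = t / np.linalg.norm(t)
    p = np.eye(3) - np.outer(t, t)                  # projector onto the plane perpendicular to t
    q = p @ np.diag(np.asarray(eps_r, dtype=float)) @ p
    lam = np.sort(np.linalg.eigvalsh(q))[1:]        # drop the zero eigenvalue along t
    semi_axes = 1.0 / np.sqrt(lam)
    return C_LIGHT * semi_axes


if __name__ == "__main__":
    print(ray_velocities([2.0, 3.0, 4.0], [0.0, 0.0, 1.0]))
```

For a ray along the z-axis the two values reduce to the principal velocities associated with ǫ1 and ǫ2, as one expects.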

1.19.8 Uniaxial and biaxial media

Crystalline dielectrics constitute examples of anisotropic media, many of

which are optically transparent. The microscopic constituents in a crystal

are arranged in a symmetric manner, where there can be various different

types of symmetric arrangements. In a crystal of cubic symmetry, all

the three axes in a Cartesian co-ordinate system are equivalent, and the

dielectric tensor then reduces effectively to a scalar (ǫ1 = ǫ2 = ǫ3). In

a number of other crystals, one can choose two equivalent rectangular

axes in a certain plane while the third axis, perpendicular to the plane,

is non-equivalent. Such a crystal is of an intermediate symmetry, while

the least symmetric are those where there exist no two Cartesian axes

equivalent to each other.

For the crystals of the third type, the three principal components of the

dielectric tensor (ǫ1, ǫ2, ǫ3) are all different. For a crystal of intermediate

symmetry, on the other hand, two of the principal components are equal,

the third being unequal. One can choose axes such that referred to these

axes, the matrix representing the dielectric tensor is diagonal, with two

of the principal components satisfying ǫ1 = ǫ2, while the third, ǫ3, has a

different value. In this case, any two mutually perpendicular axes in the

x-y plane can be chosen to constitute one pair of principal axes but the

third principal axis is a fixed direction perpendicular to this plane.

For a crystal of such intermediate symmetry, the index ellipsoid and the

ray ellipsoid both reduce to spheroids. A spheroid is a degenerate ellipsoid

possessing an axis of revolution where the principal section perpendicu-

lar to this axis (the z-axis with our choice of axes indicated above) is a

circle. This axis of revolution then constitutes the optic axis where the

wave optic axis (i.e., the direction of wave vector for which there is only

one phase velocity) and the ray optic axis (direction of ray vector corre-

sponding to which there is only one ray velocity) coincide with each other.

Such a crystal constitutes a uniaxial anisotropic medium.

For a crystal of the least symmetric type, on the other hand, the index

ellipsoid or the ray ellipsoid does not possess any axis of revolution, and

there exist two principal sections of a circular shape. The directions per-

pendicular to these sections then define the optic axes where, in general,

the wave optic axes and the ray optic axes do not coincide. Such a crystal

constitutes an instance of a biaxial medium.

In the case of an isotropic medium the index ellipsoid and the ray ellipsoid

both degenerate to a sphere while the phase velocity surface and the ray

velocity surface are also spherical, the ray velocity and the phase velocity

being along the same direction.

Referring to a uniaxial medium, the two optic axes degenerate into a

single one along the axis of revolution of the index- or the ray ellipsoid.

One of the two sheets of the phase velocity surface is spherical, while

the other is a surface of the fourth degree (an ovaloid). The ray velocity

surface similarly reduces to a sphere and a spheroid. In the case of a

biaxial medium, the equations representing the phase velocity surface

and the ray velocity surface do not admit of a factorization as they do for

a uniaxial one (see sec. 1.19.9).

1.19.9 Propagation in a uniaxial medium

With this background, we can now have a look at a number of features

of wave propagation in an anisotropic medium where, for the sake of

simplicity, we will consider a uniaxial medium with v1 = v2 which we

denote as v′. Let the remaining principal phase velocity v3 be denoted as

v′′ (refer to eq. (1.173b) for the definition of the principal phase velocities).

In this case the index ellipsoid is a spheroid with the z-axis as the axis of

revolution, which is then the direction of the optic axis of the medium.

The equation for the phase velocity surface (eq. (1.173a)) factorizes as

\[ (v_p^2 - v'^2)\,(v_p^2 - v'^2 \cos^2\theta - v''^2 \sin^2\theta) = 0, \tag{1.186} \]

where θ stands for the angle between the direction of the wave vector k

and the z-axis, i.e., the optic axis. Thus, for any given direction of the

wave vector, one of the two possible phase velocities is

\[ v_p = v', \tag{1.187a} \]

independent of the direction of k, while the other is given by

\[ v_p^2 = v'^2 \cos^2\theta + v''^2 \sin^2\theta, \tag{1.187b} \]

which depends on the angle θ characterizing the direction of the wave

vector. The plane waves with these two values of the phase velocity for

any given direction of k are termed respectively the ordinary and the ex-

traordinary waves, where the former corresponds to the spherical sheet

of the phase velocity surface and the latter to the ovaloid. The two values

of the phase velocity are then denoted as vo and ve, the ordinary and the

extraordinary phase velocities respectively.

A uniaxial medium is termed a positive or a negative one depending on

whether v′ is larger or smaller than v′′, corresponding to which one has

vo > ve or vo < ve respectively. Fig. 1.21 depicts schematically the phase

velocity surface for a uniaxial anisotropic medium. One observes that, for

a positive medium the spherical sheet lies outside the ovaloid while the

reverse is the case for a negative medium. The two sheets touch at two

diametrically opposite end points of a line segment parallel to the optic

axis.
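
A small numerical sketch of eqs. (1.187a), (1.187b) and of the classification into positive and negative media may be helpful here. The function names and the sample velocities are illustrative assumptions only.

```python
import math


def uniaxial_phase_velocities(v_prime, v_dprime, theta):
    """Ordinary and extraordinary phase velocities, eqs. (1.187a) and (1.187b).

    v_prime  : principal velocity v' (= v1 = v2)
    v_dprime : principal velocity v'' (= v3)
    theta    : angle between the wave vector and the optic axis, in radians
    """
    vo = v_prime
    ve = math.sqrt(v_prime ** 2 * math.cos(theta) ** 2
                   + v_dprime ** 2 * math.sin(theta) ** 2)
    return vo, ve


def medium_type(v_prime, v_dprime):
    """'positive' if v' > v'' (so that vo >= ve for every direction), else 'negative'."""
    if v_prime > v_dprime:
        return "positive"
    if v_prime < v_dprime:
        return "negative"
    return "isotropic"


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        vo, ve = uniaxial_phase_velocities(2.0e8, 1.8e8, math.radians(deg))
        print(deg, vo, ve)
    print(medium_type(2.0e8, 1.8e8))
```

At θ = 0 the two velocities coincide, which corresponds to the two sheets touching on the optic axis, while the difference between them is largest at θ = 90°.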

Similar statements apply to the ray velocity surface as well, with the

difference that, instead of the ovaloid, the sheet corresponding to the

extraordinary ray is a spheroid. The ordinary and extraordinary ray ve-

locities are given by

\[ v_{ro} = v_o, \qquad v_{re}^{-2} = v'^{-2} \cos^2\theta + v''^{-2} \sin^2\theta. \tag{1.188} \]

Fig. 1.22 depicts the index ellipsoid for the uniaxial medium under con-

sideration, along with the wave vector k, where the latter makes an angle

θ with the optic axis. The plane containing the wave vector and the optic

axis (the plane of the figure in the present instance) is referred to as the

principal plane for the plane wave.

Figure 1.21: The phase velocity surface for (A) a positive uniaxial medium, and (B) a negative uniaxial medium; in either case, the surface is made up of two sheets, of which one is a sphere and the other is an ovaloid, with the optic axis (the z-axis in the figure) as the axis of revolution for the latter; the two sheets of the wave velocity surface touch at the end points of a segment parallel to the optic axis; the ordinary and the extraordinary phase velocities (vo, ve) for an arbitrarily chosen direction of the wave vector k are indicated; the ordinary velocity is independent of the direction of k.

The principal section of the ellipsoid by a plane perpendicular to the wave

vector, which is, in general, an ellipse, is shown. The principal axes

of the ellipse are along OP1 and OP2, where OP1 lies in the x-y plane,

perpendicular to the optic axis. These two then give the directions of

the electric displacement vectors for the ordinary and the extraordinary

waves respectively, propagating with the wave vector k.

The phase velocities (vo, ve) of the two waves are inversely proportional to

the lengths of the line segments OP1 and OP2 respectively, where the for-

mer is, evidently, independent of the direction of k. The figure shows the

index ellipsoid of a positive uniaxial medium, which is a prolate spheroid,

in contrast to an oblate spheroid corresponding to a negative uniaxial

medium.

Analogous statements apply to the ray ellipsoid of a uniaxial anisotropic

medium.

Figure 1.22: The index ellipsoid for a positive uniaxial medium, where the ellipsoid is a prolate spheroid; the optic axis (the z-axis in the figure) is the axis of revolution of the ellipsoid; the plane of the figure depicts the principal plane for a wave with wave vector k; the section of the ellipsoid by a plane perpendicular to k is shown, which is an ellipse with principal axes along OP1 and OP2 respectively; of the two, OP1 lies in the circular section of the spheroid perpendicular to the optic axis; the electric displacement vectors for the ordinary and the extraordinary waves are along these two directions, and are perpendicular to each other; the phase velocities are inversely proportional to the lengths of the segments OP1, OP2.

1.19.10 Double refraction

Fig. 1.23 depicts schematically the refraction of a plane wave from an

isotropic dielectric into an anisotropic one, where we assume the latter

to be a uniaxial medium for the sake of simplicity. Let the frequency of

the incident wave be ω and its phase velocity in the medium of incidence

be vp, the ray velocity vr being in the same direction, i.e., along the wave

vector k.

The wave vector k′ of the refracted wave lies in the plane of incidence

(i.e., the plane containing the normal to the interface and the incident

wave vector k), and the angle φ′ between the refracted wave vector and

the normal is related to the angle of incidence φ as

\[ n \sin\phi = n' \sin\phi', \tag{1.189} \]

where $n = c/v_p$ and $n' = c/v'_p$, v′p being the phase velocity in the anisotropic

medium. This relation is just Snell’s law in the present context, which can

be arrived at by making use of the boundary conditions satisfied by the field

vectors at the interface, as in sec. 1.12.2. However, now the phase velocity

v′p has two possible values for any direction of the wave vector k′. Of these,

one is vo and is independent of the direction of k′. This gives rise to the

ordinary wave in the second medium, for which φ′ is obtained directly

from (1.189).

For the extraordinary wave, on the other hand, v′p depends on the di-

rection, i.e., on φ′. This means that (1.189) is now an implicit equation

in φ′, which is to be solved by taking into account the dependence of v′p

on the angle θ between the refracted wave vector and the optic axis (see

eq. (1.187b), with notation explained in sec. 1.19.9). One thereby obtains

the direction of the wave vector for a second refracted wave, the extraordi-

nary wave in the anisotropic medium. The phenomenon where there are,

in general, two refracted waves for an incident wave, goes by the name of

double refraction.
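
The implicit equation for the extraordinary wave is easily solved numerically. The following sketch makes a number of assumptions that are not part of the text: the interface is the plane z = 0, the plane of incidence is the x-z plane, the optic axis is specified by a unit vector `axis`, and the indices used in the example are approximate values for calcite, quoted only for illustration. A plain bisection is used; it returns one root in [0, π/2] whenever a sign change exists there.

```python
import math


def extraordinary_index(n_o, n_e, cos_theta):
    """n' = c / v_p for the extraordinary wave, from eq. (1.187b) with v' = c/n_o, v'' = c/n_e."""
    sin2 = max(0.0, 1.0 - cos_theta ** 2)
    return 1.0 / math.sqrt(cos_theta ** 2 / n_o ** 2 + sin2 / n_e ** 2)


def refraction_angles(n, phi, n_o, n_e, axis):
    """Angles made with the normal by the ordinary and extraordinary wave vectors.

    n, phi : refractive index of the isotropic medium and angle of incidence (radians)
    axis   : unit vector along the optic axis (x, y, z), with z the interface normal
    """
    s = n * math.sin(phi)            # conserved quantity in eq. (1.189)
    if s > n_o:
        raise ValueError("no refracted ordinary wave for this angle of incidence")
    phi_o = math.asin(s / n_o)       # ordinary wave: explicit solution

    def f(phi_p):
        # cosine of the angle between the refracted wave vector and the optic axis
        cos_theta = math.sin(phi_p) * axis[0] + math.cos(phi_p) * axis[2]
        return extraordinary_index(n_o, n_e, cos_theta) * math.sin(phi_p) - s

    lo, hi = 0.0, math.pi / 2        # f(lo) <= 0 by construction
    if f(hi) <= 0.0:
        raise ValueError("no sign change in [0, pi/2]")
    for _ in range(60):              # bisection down to machine precision
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return phi_o, 0.5 * (lo + hi)


if __name__ == "__main__":
    tilted = (math.sin(math.pi / 4), 0.0, math.cos(math.pi / 4))
    phi_o, phi_e = refraction_angles(1.0, math.radians(30), 1.658, 1.486, tilted)
    print(math.degrees(phi_o), math.degrees(phi_e))
```

Only the wave vectors are obtained in this manner; the ray directions have to be worked out separately from the wave normals.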

Note that the wave vectors for both the refracted waves lie in the plane of

incidence. This cannot, however, be said of the ray vectors, where only

one of the two possible rays, namely the ordinary ray, lies in the plane

of incidence. Recall how the ray vector $s = \frac{v_r}{c}\,t$ can be obtained from the

wave normal m by formulas summarized in sec. 1.19.5. Adopting this

approach, one can determine the ray vectors arising in double refraction

from the wave vectors obtained from (1.189). While the ordinary ray lies

in the plane of incidence, the extraordinary ray does not, in general, lie in

this plane since it has to lie in the plane containing the wave vector and

the corresponding electric displacement vector, i.e., in the plane contain-

ing the wave vector and the optic axis, where the latter may point in a

direction off the plane of incidence.

Figure 1.23: Depicting schematically the phenomenon of double refraction at the interface separating an isotropic medium from a uniaxial anisotropic one; AO is an incident ray corresponding to the wave vector k in the isotropic medium; N′ON is the normal to the interface at the point O; the wave vectors corresponding to the two refracted waves are along OB1 (ordinary wave) and OB2 (extraordinary wave), both lying in the plane of incidence; the ray corresponding to the ordinary wave is directed along OB1; however, the ray corresponding to the extraordinary wave lies in the plane containing OB2 and the optic axis OO′; considering the general situation in which OO′ is off the plane of incidence, the extraordinary ray is also along a direction OC off the plane of incidence.

I do not enter here into a discussion of the distinctive features of refraction from an

isotropic medium into a biaxial anisotropic medium. One such distinctive feature relates

to conical refraction where one of the refracted wave vectors points along an optic axis of

the medium, in which case there arises a bunch of refracted rays lying on the surface

of a cone.

1.20 Wave propagation in metamaterials

1.20.1 Electric and magnetic response in dielectrics and

conductors

We have had a brief introduction to dispersion of electromagnetic waves

in dielectrics and in conducting media in sections 1.15.1, 1.15.2.7. Both

these types of media exhibit response of a considerable magnitude to the

electrical components of electromagnetic waves, where the response is

predominantly determined by resonances in the case of dielectrics and

by plasma oscillations of free electrons in the case of a conductor. The

resonances in a dielectric material are due to transitions between dis-

crete atomic or molecular energy levels, while the energy levels of the free

electrons in a conductor are continuously distributed in energy bands.

Still, there may occur interband transitions in a conductor resulting in resonance-like

features in its dispersion (which is, once again, predominantly an electrical response).

These transitions contribute to ǫr0(ω) occurring in (1.100) and, in the optical range of

the spectrum, are responsible for the colour of metals like gold and copper.

Both in dielectrics and conductors, the electrical response results in a

lowering of the relative permittivity in certain frequency ranges as seen

from the dip in the curve (fig. 1.8) depicting the variation of the refrac-

tive index in a frequency range around a resonance. There may even

be frequency ranges in which there results a negative value for ǫr for a

dielectric. Similarly, in a conducting medium, one can have a negative

value of ǫr at frequencies below the plasma frequency ωp, as seen from

formula (1.99a).

However, in spite of the possibility of such negative values of ǫr occurring

in certain frequency intervals for dielectrics and conductors, the possi-

bility of a negative value of the refractive index does not arise because

of the lack of magnetic response in these materials in all but the low-

est frequency ranges (recall from sec. 1.15.2.12 the result pointed out by

Veselago that the conditions ǫr < 0, µr < 0 imply n < 0; this requires a

pronounced magnetic response, in the absence of which one has µr ≈ 1;

however, the condition for a negative refractive index can be stated in

more general terms, as we will see below).

1.20.2 Response in metamaterials

Indeed, few, if any, of the naturally occurring substances are character-

ized by a negative refractive index, which is why Veselago’s paper had

to remain dormant for more than three decades. Around the beginning

of the present century, however, technological advances relating to the

fabrication and use of nanomaterials opened the door to a veritable rev-

olution where artificially engineered materials with negative refractive in-

dices in various frequency ranges, including optical frequencies, became

a distinct possibility.

The basic approach was to make use of miniature metallic units of ap-

propriate shapes, with dimensions small compared to the wavelengths

of interest, that could show a pronounced diamagnetic response to the

waves, resulting in negative values of µr for a medium made up of one

or more arrays of such units. For instance, a split ring resonator (SRR;

refer to fig. 1.9) can act as an L-C circuit, where the metallic ring-like

structures form the inductive element while the gap between the rings

(as also the gap in each ring) acts as a capacitive element.

Such an L-C circuit is characterized by a certain resonant frequency ($\omega_0 = 1/\sqrt{LC}$)

depending on the size and shape of the rings and of the gaps, and

possesses a pronounced response to an electromagnetic field of frequency

ω close to ω0. The response is paramagnetic for ω < ω0 and diamagnetic

for ω > ω0 where, in the latter case, the magnetic moment developed in

the ring is in opposite phase to the magnetic field of the wave.

Thus, it is possible to have negative values of ǫr and µr, the latter in the

case of artificially engineered materials, and the problem that now comes

up is to ensure that the two parameters are both negative at the same

frequencies belonging to some desired range.

The magnetic resonance frequency can be altered by choosing the metal-

lic units of appropriate shape and size. In particular, scaling down the

size results in an increase of the resonant frequency, and recent years

have witnessed the emergence of technologies where the frequency can

be scaled up to the optical part of the spectrum.
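
The dependence of the resonant frequency on the size of the units can be illustrated with a very simple estimate. The sketch below assumes that both the inductance and the capacitance scale linearly with the linear dimensions of the unit (a purely geometrical scaling that ignores other effects relevant at very small sizes); the numerical values of L and C are hypothetical.

```python
import math


def lc_resonance(inductance, capacitance):
    """Angular resonant frequency omega_0 = 1 / sqrt(L C), in rad/s."""
    return 1.0 / math.sqrt(inductance * capacitance)


def scaled_resonance(inductance, capacitance, scale):
    """omega_0 after multiplying all linear dimensions by `scale`.

    Assumption of this sketch: L and C are both proportional to the linear size.
    """
    return lc_resonance(inductance * scale, capacitance * scale)


if __name__ == "__main__":
    L, C = 1e-9, 1e-12                            # hypothetical values: 1 nH, 1 pF
    print(lc_resonance(L, C))                     # resonance of the original unit
    print(scaled_resonance(L, C, 0.1))            # unit shrunk tenfold
```

Under this assumption a tenfold reduction in size raises ω0 by a factor of ten.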

A great flexibility in the electrical response can be achieved by making use

of what are known as surface plasmon polariton modes. These are modes

of propagation of electromagnetic waves, analogous to those in waveg-

uides, along the interface of a metal and a dielectric, where the electro-

magnetic field is coupled to plasma oscillations (the plasmons) of the free

electrons in the metal localized near the interface. The plasmon oscil-

lations are characterized by a great many resonances distributed over

relatively wide ranges of frequencies. The enhanced electrical response

at or near these frequencies causes a lowering of the effective permittiv-

ity, analogous to what happens near a resonance resulting from atomic

transitions in the bulk dielectric.

This makes possible the fabrication of metamaterials in which the mag-

netic and electric responses are made to occur simultaneously, in desired

frequency ranges. Such a material responds to electromagnetic waves ef-

fectively as a continuous medium with negative values of ǫr and µr, and

thus, with a negative refractive index (see sec. 1.20.3).

1.20.3 ‘Left handed’ metamaterials and negative refrac-

tive index

In accordance with Maxwell’s equations, a monochromatic plane wave

propagating in a material with negative values of ǫr and µr is characterized

by a number of special features.

To start with, consider a plane wave with a propagation vector k and an

angular frequency ω(> 0) for which the field vectors are of the form (1.50a),

where the wave is set up in a medium for which each of the parameters

ǫr, µr can be either positive or negative. In the absence of free charges

and currents, the Maxwell equations (1.1b), (1.1d) imply

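\[ \mathbf{k} \times \mathbf{E}_0 = \omega \mu_0 \mu_r \mathbf{H}_0, \qquad \mathbf{k} \times \mathbf{H}_0 = -\omega \epsilon_0 \epsilon_r \mathbf{E}_0. \tag{1.190} \]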

One can have any one of four possible situations here. Specifically, the

two relations above are consistent for either (i) ǫr > 0, µr > 0, or (ii) ǫr <

0, µr < 0, corresponding to which the medium under consideration is

termed a positive or a negative one. On the other hand, the two relations

are mutually inconsistent for (iii) ǫr > 0, µr < 0 or (iv) ǫr < 0, µr > 0, in

which case the medium can support an inhomogeneous plane wave but

not a homogeneous one.
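
The four cases can be summarized in a few lines of code. Eliminating H0 between the two relations in (1.190) for a transverse wave gives k² = (ω/c)² ǫr µr, so that a real propagation vector requires the product ǫr µr to be positive; the sketch below, with hypothetical function names and real ǫr, µr, simply encodes this.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def wave_number_squared(omega, eps_r, mu_r):
    """k^2 = (omega / c)^2 * eps_r * mu_r, obtained by combining the two relations in (1.190)."""
    return (omega / C_LIGHT) ** 2 * eps_r * mu_r


def classify_medium(eps_r, mu_r):
    """Classify a medium with real, non-zero eps_r and mu_r into the cases (i) to (iv)."""
    if eps_r > 0 and mu_r > 0:
        return "positive"      # case (i): homogeneous plane waves possible
    if eps_r < 0 and mu_r < 0:
        return "negative"      # case (ii): homogeneous plane waves possible
    return "evanescent"        # cases (iii), (iv): k^2 < 0, inhomogeneous waves only


if __name__ == "__main__":
    for eps_r, mu_r in [(2.0, 1.0), (-2.0, -1.0), (2.0, -1.0), (-2.0, 1.0)]:
        print(eps_r, mu_r, classify_medium(eps_r, mu_r),
              wave_number_squared(1.0e15, eps_r, mu_r))
```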

Inhomogeneous waves were encountered in sec. 1.13. These are characterized by distinct

sets of surfaces of constant amplitude and surfaces of constant phase. An inhomogeneous

wave arising in the case of total internal reflection, as also one in a medium

of type (iii) or (iv) above, is, moreover, an evanescent one since it is characterized by an

exponentially decreasing amplitude.

Moreover, one notes that for a positive medium (case (i) above) the vec-

tors E0, H0, and k form a right handed triad, which is what we found in

sec. 1.10.1. On the other hand, for a negative medium (case (ii)) the three

vectors form a left handed triad. Such a medium is therefore termed

at times a ‘left handed’ one, though this term does not imply any chiral

property (i.e., one involving a rotation of the plane of polarization in the

medium), and the term ‘negative medium’ appears to be more appropri-

ate.

In contrast to the propagation vector k, the Poynting vector S = E × H

is, by definition, always related to E0 and H0 in a right handed sense.

Hence, for a plane wave in a negative medium, the Poynting vector is

oppositely directed to the propagation vector. As we will see in chapter 2,

the ray direction (or the direction of the ray velocity) in a medium, in

the ray optics description, is along the direction of energy propagation

which, under commonly occurring circumstances, is also the direction of

the group velocity. On the other hand, the propagation vector gives the

direction of the phase velocity. Thus, in a negative medium, the group

velocity and the phase velocity point in opposite directions.
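
The relation between the handedness of the triad and the direction of energy flow can be verified directly from the first relation in (1.190). The following is a minimal sketch with real amplitudes, hypothetical function names, and arbitrarily chosen values of k and ω.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability in H/m (approximate value)


def fields_and_poynting(k, e0, omega, mu_r):
    """Return H0 from k x E0 = omega mu0 mu_r H0, and the direction E0 x H0 of energy flow."""
    k = np.asarray(k, dtype=float)
    e0 = np.asarray(e0, dtype=float)
    h0 = np.cross(k, e0) / (omega * MU0 * mu_r)
    s = np.cross(e0, h0)
    return h0, s


def handedness(e0, h0, k):
    """'right' if (E0, H0, k) form a right handed triad, 'left' otherwise."""
    return "right" if np.dot(np.cross(e0, h0), k) > 0 else "left"


if __name__ == "__main__":
    k = np.array([0.0, 0.0, 1.0e7])      # propagation vector along z (illustrative)
    e0 = np.array([1.0, 0.0, 0.0])       # electric amplitude along x
    for mu_r in (1.0, -1.0):
        h0, s = fields_and_poynting(k, e0, 3.0e15, mu_r)
        print(mu_r, handedness(e0, h0, k), np.sign(np.dot(s, k)))
```

With µr > 0 the energy flows along k, while with µr < 0 the sign of H0 is reversed and the energy flows opposite to k.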

What is more, a negative medium is characterized by a negative refractive

index. To see this, consider once again a plane wave incident on an

interface separating two media as in fig. 1.5 (see sec. 1.12.1), where now

medium A is assumed to be free space (n1 = 1) and medium B is a negative

one (n2 = n, say). Assume, for the sake of simplicity, that the incident

wave along n is polarized with its electric vector perpendicular to the

plane of incidence. In this case, the boundary condition involving E

implies that the amplitude $\mathbf{E}_0 = \hat{e}_2 E_0$ (say) is the same on both sides of

the interface, while that involving D is identically satisfied.

The boundary condition involving the continuity of the tangential component

of H may be seen to imply that the cosines of the angles made by n

and m2 with the normal to the interface (e3), i.e., n · e3 and m2 · e3, are of

opposite signs. Finally, the boundary condition involving the continuity

of the normal component of B may be seen to imply

\[ \sqrt{\epsilon_r \mu_r}\; \hat{m}_2 \cdot \hat{e}_1 = \hat{n} \cdot \hat{e}_1, \tag{1.191} \]

which, in this instance, coincides with the condition of continuity of the

phase across the interface (check all these statements out).

Taken together, the above results imply that m2, the unit wave normal

of the refracted wave is directed toward the interface (the x-y plane in

fig. 1.5) and lies on the same side of the normal to the latter (the z-axis) as

the incident wave normal. The ray direction of the refracted wave, on the

other hand, is directed away from the interface while lying on the same

side of the normal as that of the incident wave, as shown in fig. 1.24.

Moreover, the angle of incidence (i.e., angle made by the incident ray

with the normal, defined with the appropriate sign) φ and the angle of

refraction (the angle made by the refracted ray with the normal, once

again carrying its own sign) ψ are related to each other (compare with the

second relation in (1.70)) as

\[ \sin\phi = -\sqrt{\epsilon_r \mu_r}\, \sin\psi. \tag{1.192a} \]

In other words, a material with negative values of ǫr and µr is charac-

terised by a negative refractive index

\[ n = -\sqrt{\epsilon_r \mu_r}. \tag{1.192b} \]
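
A short numerical illustration of (1.192a), (1.192b) follows. It is only a sketch for real ǫr, µr of the same sign and for incidence from free space; the function names are hypothetical.

```python
import math


def refractive_index(eps_r, mu_r):
    """Refractive index for real eps_r, mu_r of the same sign; negative when both are negative."""
    if eps_r * mu_r <= 0:
        raise ValueError("eps_r and mu_r must be non-zero and of the same sign")
    root = math.sqrt(eps_r * mu_r)
    return -root if eps_r < 0 else root


def refraction_angle(phi, eps_r, mu_r):
    """Signed angle of refraction psi from sin(phi) = n sin(psi), for incidence from free space."""
    n = refractive_index(eps_r, mu_r)
    ratio = math.sin(phi) / n
    if abs(ratio) > 1.0:
        raise ValueError("no refracted wave for this angle of incidence")
    return math.asin(ratio)


if __name__ == "__main__":
    phi = math.radians(30)
    print(math.degrees(refraction_angle(phi, 4.0, 1.0)))    # positive medium
    print(math.degrees(refraction_angle(phi, -4.0, -1.0)))  # negative medium
```

The two printed angles are equal in magnitude and opposite in sign, the negative sign indicating that the refracted ray lies on the same side of the normal as the incident ray.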

Incidentally, the parameters ǫr, µr can be negative only in a dispersive

Figure 1.24: Depicting the refraction of a plane wave from free space into a negative metamaterial, i.e., one where both ǫr, µr (assumed real for the sake of simplicity; in reality, both can be complex) are negative; n, m2 are the unit normals along the propagation vectors of the incident and refracted waves (m1 is the reflected wave normal; see fig. 1.5 for comparison), both of which lie on the same side of the normal (NN′) to the interface (AB); the refracted ray points in the opposite direction to m2, and the angles of incidence and refraction (φ, ψ) are related as in (1.192a); the refractive index is negative.

medium, i.e., dispersion is a necessary condition for a negative value of

the refractive index. Thus, continuing to consider, for the sake of sim-

plicity, an isotropic medium with negligible energy dissipation, negative

values of ǫr, µr imply a negative value of the time averaged energy den-

sity for a non-dispersive medium (refer to eq. (1.35a) and the constitutive

relations), which is a contradiction. For a dispersive medium, on the

other hand, the time averaged energy density is given by formula (1.131),

which can be positive even with negative values of ǫr, µr, provided that the

dispersion is sufficiently strong.

Recall, in this context, that dispersion is a necessary consequence of causality, i.e.,

every medium other than free space has to be, in principle, a dispersive one. Further,

dispersion is necessarily associated with dissipation, which means that the imaginary

parts of ǫr, µr have to be non-zero (though these can be small in magnitude) where

these, moreover, have to be positive so as to imply a positive value of the rate of energy

dissipation.

1.20.4 Negative refractive index: general criteria

Up to this point we have considered isotropic media with negligible absorption, where ǫr and µr are real scalars. In reality,

the dielectrics and conductors used in the fabrication of metamaterials

may be characterized by a considerable degree of absorption, especially

in frequency ranges where their electrical and magnetic responses are

strong. Continuing to consider an isotropic medium, but now with com-

plex values of the effective parameters ǫr, µr, one arrives at the following,

more general, condition implying a negative real part of the refractive

index:

Re(ǫr) |µr|+Re(µr) |ǫr| < 0. (1.193)

Evidently, this represents a more general condition, since it is satisfied if

both ǫr and µr are real and negative.
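
The condition (1.193) can be compared numerically with the sign of Re(n). The sketch below assumes a passive medium (non-negative imaginary parts of ǫr and µr) and takes n as the product of the principal square roots of ǫr and µr, which then has a non-negative imaginary part. The function names and sample values are illustrative assumptions.

```python
import cmath

def passive_refractive_index(eps_r, mu_r):
    """n = sqrt(eps_r) * sqrt(mu_r), using principal square roots.

    For Im(eps_r) >= 0 and Im(mu_r) >= 0 each principal root has a phase in
    [0, pi/2], so the product has Im(n) >= 0 (attenuation, not gain).
    """
    return cmath.sqrt(eps_r) * cmath.sqrt(mu_r)

def criterion_1_193(eps_r, mu_r):
    """Left-hand side of (1.193) tested against zero."""
    return eps_r.real * abs(mu_r) + mu_r.real * abs(eps_r) < 0

samples = [
    (-1 + 0.1j, -1 + 0.1j),   # both real parts negative
    (2 + 0.1j, 1 + 0.05j),    # ordinary positive medium
    (-2 + 0.1j, 1 + 0.01j),   # only eps_r negative, weak loss
    (-1 + 0.1j, 0.1 + 1j),    # only eps_r negative, strong magnetic loss
]
for eps_r, mu_r in samples:
    n = passive_refractive_index(eps_r, mu_r)
    print(eps_r, mu_r, "Re(n) < 0:", n.real < 0, "| (1.193):", criterion_1_193(eps_r, mu_r))
```

The last sample suggests why (1.193) is the more general statement: it can hold even when only one of the two real parts is negative, provided the losses are large enough.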

Two other factors responsible for producing negative refractive index in

a metamaterial are anisotropy and spatial dispersion. Anisotropy in the

electrical response is a common feature of crystalline dielectrics. Mag-

netic anisotropy is also common in artificially fabricated materials where

the shape and disposition of the metallic units (e.g., split ring resonators)

can be made use of in producing the anisotropy. The term ‘spatial dis-

persion’ is employed to denote a dependence of the permittivity or the

permeability on the propagation vector k in addition to that on ω, and

arises due to non-local effects being relevant in the determination of the

effective ǫr, µr at any given point. Once again, spatial dispersion is a com-

mon feature of metamaterials because of the finite size of the metallic

units which, though small compared to the relevant wavelength, is quite

large compared to atomic dimensions.

While a negative value of the real part of ǫr or µr of a medium is not ruled out on general grounds, thermodynamic considerations relating to energy dissipation in the medium imply that the imaginary part has to be positive. If, then, one assumes that, in addition

to the real parts of ǫr, µr being negative, the medium under consideration is a passive

one, i.e., causes an attenuation, rather than amplification, of a wave passing through it

(which is another way of saying that the imaginary part of the refractive index is posi-

tive), then it follows that the real part of the refractive index has to be negative (check

this out). This condition is more general than the one considered in sec. 1.20.3 though,

at the same time, less general than (1.193).
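
The check suggested above can be sketched in a couple of lines. The subscripts R and I for real and imaginary parts are notation introduced here for convenience, and the starting relation n² = ǫrµr is assumed.

```latex
% Write \epsilon_r = \epsilon_R + i\epsilon_I, \mu_r = \mu_R + i\mu_I, n = n_R + i n_I.
% Taking the imaginary part of n^2 = \epsilon_r \mu_r gives
\[
  2\, n_R\, n_I \;=\; \epsilon_R\, \mu_I + \epsilon_I\, \mu_R .
\]
% With \epsilon_R < 0, \mu_R < 0 (negative real parts) and \epsilon_I > 0, \mu_I > 0
% (dissipation), the right-hand side is negative. For a passive medium n_I > 0, hence
\[
  n_R \;=\; \frac{\epsilon_R\, \mu_I + \epsilon_I\, \mu_R}{2\, n_I} \;<\; 0 .
\]
```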

The fact that a metamaterial is, in general, required to have a strong electrical and magnetic response in the wavelength ranges of interest implies that there should occur pronounced energy loss as a wave propagates through it. Great demands are therefore placed on the design and fabrication technologies of metamaterial devices so as to make them function in desired ways.

1.20.5 Metamaterials in optics and in electromagnetic

phenomena

Veselago, in his 1968 paper, predicted a number of novel consequences

of a negative refractive index. Thus, in addition to the direction of energy

propagation and that of the phase velocity being opposite, there arise new features in phenomena like the Doppler effect and Cerenkov radiation.

In the Doppler effect in a positive medium, the frequency recorded by an observer increases as the observer approaches the source, while in a negative medium the frequency decreases for an approaching observer. Similarly, in a positive medium, for a source moving with a speed larger than

the phase velocity of electromagnetic waves in the medium, the direction

of propagation of the Cerenkov radiation emitted by the source makes an

acute angle with its direction of motion (the envelope of the wave fronts

emitted by the source at various instants of time is a cone lying behind

the moving source), while in the case of a negative medium, the direction of propagation of the Cerenkov radiation makes an obtuse angle with the direction of motion of the source (the envelope lies in front of the source).
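
Both sign reversals can be illustrated with textbook relations that are not derived in this section and are assumed here: the first-order Doppler factor 1 + nv/c for an observer approaching a source at rest in the medium with speed v, and the Cerenkov relation cos θ = c/(nv), with θ taken as the angle between the direction of energy propagation and the velocity of the source. Function names and sample values are hypothetical.

```python
import math

def doppler_factor(n, beta):
    """First-order ratio (observed frequency) / (source frequency) for an
    observer approaching a source at rest in the medium, beta = v / c."""
    return 1.0 + n * beta

def cherenkov_angle(n, beta):
    """Angle (radians) between the energy flow of the radiation and the source
    velocity, from cos(theta) = 1 / (n * beta); requires |n| * beta > 1."""
    if abs(n) * beta <= 1.0:
        raise ValueError("source is not faster than the phase velocity in the medium")
    return math.acos(1.0 / (n * beta))

for n in (1.5, -1.5):   # a positive and a negative medium, illustrative values
    print("n =", n,
          "| Doppler factor:", doppler_factor(n, 0.01),
          "| Cerenkov angle (deg):", math.degrees(cherenkov_angle(n, 0.9)))
```

With a negative n the Doppler factor drops below one for an approaching observer, and the Cerenkov angle moves from the acute to the obtuse range.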

Several other novel effects have been predicted for negative refractive in-

dex metamaterials, and many of these have been verified for metamate-

rials fabricated with present day technology. While most of these relate

to electromagnetic waves belonging to frequency ranges lower than opti-

cal frequencies, a number of out of the ordinary optical effects have been

foreseen and are likely to be verified in the near future. Looking ahead, novel devices of great practical use are anticipated, and a veritable revolution in optics and electromagnetism seems to be in the offing.

Before I close this section I will briefly tell you how a negative refractive

index material can be made use of in image formation by a super lens,

i.e., a ‘lens’ having ideal focusing properties, in complete disregard of the

so-called ‘diffraction limit’, where the latter is the limit to the focusing

or imaging property of a lens set by diffraction at the edges of the lens

or (more commonly) at the edges of the stop used to minimize various

aberrations (see sec. 3.7). Confining ourselves to bare principles, the

super lens is just a flat slab of negative refractive index material assumed,

for the sake of simplicity, to be placed in vacuum, and characterized by

parameters ǫr = −1, µr = −1, and n = −1.

Fig. 1.25 shows a point object O placed at a distance l from the lens,

where l is less than d, the lens thickness. A ray from O, on being refracted

at the lens interface, gets bent to the same side of the normal (two such

rays are shown), the incident and refracted rays making the same angle

(ignoring their signs) with the latter. Since this happens for all the rays

incident on the lens, a perfect image is formed at I′, from which the rays

diverge so as to be refracted once again from the second surface of the

lens, this time forming a perfect image at I, at a distance d− l from it.
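
The ray construction just described can be traced numerically. The sketch below follows a ray from O through a slab of index n relative to the surroundings, applying the signed form of Snell's law at both faces, and returns where the ray crosses the axis inside and behind the slab. The geometry (axis through O normal to the slab) and all names are assumptions made for illustration.

```python
import math

def trace_through_slab(theta, l, d, n=-1.0):
    """Trace a ray leaving an axial point object at angle theta to the axis.

    The object is a distance l in front of a slab of thickness d.
    Returns (depth of the axis crossing inside the slab,
             distance of the axis crossing behind the slab).
    """
    y1 = l * math.tan(theta)                       # ray height at the front face
    theta_in = math.asin(math.sin(theta) / n)      # signed angle inside the slab
    depth_inner = -y1 / math.tan(theta_in)         # where the ray meets the axis inside
    y2 = y1 + d * math.tan(theta_in)               # ray height at the back face
    theta_out = math.asin(n * math.sin(theta_in))  # back in the original medium
    dist_outer = -y2 / math.tan(theta_out)         # where the ray meets the axis outside
    return depth_inner, dist_outer

l, d = 1.0, 3.0                                    # illustrative values with l < d
for theta in (0.1, 0.5, 1.0):
    print(theta, trace_through_slab(theta, l, d))
```

For n = −1 every ray, whatever its inclination, crosses the axis at depth l inside the slab and at distance d − l behind it, which is the stigmatic imaging claimed above.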

Figure 1.25: Explaining the basic principle underlying the action of a super lens, which is essentially a uniform slab of metamaterial, of refractive index n = −1 relative to the surrounding medium; a ray from a point object O, on being refracted at the front face of the super lens, gets bent on the same side of the normal, and passes through the intermediate image I′, two such rays being shown; on diverging from I′, the rays are refracted at the second surface, forming the final perfect image at I; all details of an extended object are reconstructed at the final image since the evanescent waves from the object grow in the interior of the metamaterial, which compensates their decay outside it.

Such a super lens is capable of reconstructing every detail of an extended object, down to sub-wavelength length scales. Assuming that the object is illuminated with monochromatic coherent light (basic ideas relating to coherence are presented in sec. 1.21 and, at greater length, in chapter 7), the radiation from the object can be represented in the form of

an angular spectrum (refer to sec. 5.4) that consists of two major com-

ponents - a set of propagating plane waves traveling at various different

angles, and a set of inhomogeneous evanescent waves with exponentially

diminishing amplitudes. The evanescent waves do not carry energy, but

relate to details of the object at length scales smaller than a cut-off value

determined by the frequency of the radiation.

In conventional imaging systems the evanescent wave component of the

angular spectrum gets lost, because the amplitudes of the evanescent

waves become exponentially small at distances of the order of several

wavelengths from the object. However, a super lens builds up the evanes-

cent component because of its negative refractive index. For n = −1, there

occurs perfect reconstruction of the evanescent waves in the image, and

all the details of the object, down to the finest length scales, are captured.
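
The bookkeeping behind this statement can be sketched as follows. An evanescent component with transverse wavenumber kx > k0 decays as exp(−κz) in vacuum, with κ = √(kx² − k0²). The sketch assumes the idealized result that the same component grows as exp(+κz) across the n = −1 slab (the growth referred to in the caption of fig. 1.25), and uses the distances l, d and d − l of that figure. Names and numerical values are illustrative.

```python
import math

def decay_constant(kx, k0):
    """kappa = sqrt(kx^2 - k0^2) for an evanescent component (kx > k0)."""
    if kx <= k0:
        raise ValueError("component is propagating, not evanescent")
    return math.sqrt(kx**2 - k0**2)

def relative_amplitude_at_image(kx, k0, l, d, superlens=True):
    """Amplitude at the image plane relative to the object plane.

    Path: l in vacuum, d across the slab, d - l in vacuum (total 2d).
    With superlens=False the slab is replaced by vacuum.
    """
    kappa = decay_constant(kx, k0)
    before = math.exp(-kappa * l)
    inside = math.exp(kappa * d) if superlens else math.exp(-kappa * d)
    after = math.exp(-kappa * (d - l))
    return before * inside * after

wavelength = 1.0
k0 = 2.0 * math.pi / wavelength
kx = 3.0 * k0                       # detail at one third of the wavelength
print("with super lens:", relative_amplitude_at_image(kx, k0, l=0.4, d=1.0))
print("vacuum only:    ", relative_amplitude_at_image(kx, k0, l=0.4, d=1.0, superlens=False))
```

The decay over the total vacuum path of length d is exactly offset by the growth over the slab thickness d, whereas over the same distance in vacuum the component is suppressed by exp(−2κd).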

Finally, while we have mostly confined our attention to negative refrac-

tive index materials, metamaterials of more general types have been fab-

ricated, having distinctive types of response to electromagnetic waves in

various frequency ranges. As for the science of optics, all these extraordi-

nary developments are sure to change the face of the subject as hitherto

studied and taught. It is perhaps fitting to call the emerging new science of optics by the name meta-optics: optics beyond what we know of it, and optics based on metamaterials.

One area with immense potential that has already emerged is transformation optics, on which I include a few words of introduction in sec. 1.20.6.

1.20.6 Transformation optics: the basic idea

Fig. 1.26(A) depicts a grid made up of a set of identical squares form-

ing the background in a region of space filled up with a homogeneous

medium with positive values of ǫr, µr, with a ray path shown against the

grid. We assume the medium to be free space for the sake of simplicity

(ǫr = 1, µr = 1). The ray path corresponds to field vectors that satisfy the

Maxwell equations which, for a harmonic field of angular frequency ω,

and in the absence of free charges and currents, can be written as

div (ǫr · E) = 0, div (µr ·H) = 0

curl E = iωµ0µr ·H, curl H = −iωǫ0ǫr · E, (1.194)

where ǫr, µr are tensors of rank two and ‘·’ denotes the inner product of a tensor and a vector (thus, $(a \cdot G)_i = \sum_{j=1}^{3} a_{ij} G_j$ $(i = 1, 2, 3)$, where a is a tensor, G is a vector, and i, j label Cartesian components).

A result of central importance is that, under a spatial transformation of

the form

$$x_1, x_2, x_3 \to x'_1, x'_2, x'_3, \tag{1.195a}$$

along with appropriate corresponding transformations of the field vari-

ables and of the parameters ǫr, µr,

E→ E′, H→ H′, ǫr → ǫ′r, µr → µ′r, (1.195b)

the Maxwell equations (1.194) remain invariant. In other words, if the transformations (1.195b) are chosen appropriately, for a given transformation (1.195a) of the Cartesian co-ordinates (where (x1, x2, x3) are the co-ordinates of any chosen point in space and (x′1, x′2, x′3) are the co-ordinates of the transformed point), then equations of the form (1.194) hold for the transformed quantities, i.e.,

div′ (ǫ′r · E′) = 0, div′ (µ′r · H′) = 0

curl′ E′ = iωµ0µ′r ·H′, curl′ H′ = −iωǫ0ǫ′r · E′, (1.196)

where div′ and curl′ denote divergence and curl with respect to the trans-

formed co-ordinates.

Figure 1.26: Explaining the basic idea underlying transformation optics; (A) a ray path in a homogeneous medium with positive values (assumed real for the sake of simplicity) of the parameters ǫr, µr; a grid is shown in the background, made up of identical squares; (B) a transformation wherein the squares making up the grid are deformed and, at the same time, the ray path is deformed away from its rectilinear shape; the transformation involves spatial co-ordinates, the field variables, and the parameters ǫr, µr in such a way that Maxwell’s equations are still satisfied, but now for a medium that has to be an artificially produced one; (C) ray paths in a metamaterial with an appropriate spatial variation of ǫr, µr, where these paths avoid a spherical region, passing instead through a region shaped like a hollow spherical shell; the inner spherical region thereby becomes ‘invisible’ to the incoming rays.

Making use of this result, one can choose the transformation in such a way that the ray path of fig. 1.26(A) now gets transformed to a path of any chosen shape, like the one shown in fig. 1.26(B), where now the field variables (the primed ones) refer to a harmonically varying field of frequency ω in some medium other than the one of fig. 1.26(A) (free space

in the present instance) because of the transformation of the permittivity

and permeability tensors (as we will see in chapter 2, a ray path points

in the direction of the time averaged Poynting vector E×H). In this man-

ner, ray paths can be deformed so as to meet any chosen purpose by

an appropriate choice of ǫ′r, µ′r. In general, the transformed parameters

will correspond to an anisotropic and inhomogeneous medium which can

only be realized in the form of a metamaterial with an artificially engi-

neered structure. The figure shows how the transformation of the spatial

co-ordinates deforms the squares making up the grid in the background

of the ray path.

Fig. 1.26(C) depicts a situation where the choice of the transformed permittivity and permeability tensors results in deformed ray paths that

avoid a spherical region of radius a, passing instead through a region

of the form of a hollow spherical shell of inner and outer radii a, b. The

transformation is so chosen as to convert rectilinear ray paths in free

space to the curved paths shown in the figure in a medium with the ap-

propriate spatial variations of the permittivity and permeability tensors.

As seen in the figure, the spherical region A is effectively ‘invisible’ to

the incoming rays. This is the basic principle of the technique of optical

cloaking, an emerging one of immense possibilities in the area of trans-

formation optics.

One apprehends, however, that the technique of optical cloaking, as also other possible

areas of application of transformation optics, may find uses in surveillance and intelli-

gence activities associated with non-peaceful and non-humanitarian projects. This, in

a sense, is the great tragedy of physics.

It now remains to state the transformation rule for the field variables and

the permittivity and permeability tensors for any chosen transformation

(eq. (1.195a)) of the co-ordinates under which the Maxwell equations are

to remain invariant. For this we define the Jacobian matrix (g) of the

transformation as

$$g_{ij}(x) = \frac{\partial x'_i(x)}{\partial x_j}, \qquad (i, j = 1, 2, 3), \tag{1.197}$$

where x stands for the triplet of spatial co-ordinates (x1, x2, x3) (x′ will have

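a similar meaning). The required transformation rules can then be stated as

$$
E'_i(x') = \sum_j \left((g^T)^{-1}\right)_{ij}(x)\, E_j(x), \qquad
H'_i(x') = \sum_j \left((g^T)^{-1}\right)_{ij}(x)\, H_j(x) \qquad (i = 1, 2, 3), \tag{1.198a}
$$

$$
(\epsilon'_r)_{ij}(x') = \frac{1}{(\det g)(x)} \sum_{l,m} g_{il}(x)\, (\epsilon_r)_{lm}(x)\, (g^T)_{mj}(x), \qquad
(\mu'_r)_{ij}(x') = \frac{1}{(\det g)(x)} \sum_{l,m} g_{il}(x)\, (\mu_r)_{lm}(x)\, (g^T)_{mj}(x) \qquad (i, j = 1, 2, 3), \tag{1.198b}
$$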

where gT stands for the transpose of the Jacobian matrix g, with elements

(gT)ij(x) = gji(x) (i, j = 1, 2, 3). (1.198c)

I skip the proof of the above statement which involves a bit of algebra, but is straight-

forward (try it out).
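
As a concrete instance of the rule (1.198b), the sketch below applies it to a uniform stretch x′1 = 2x1, x′2 = x2, x′3 = x3 of free space, for which the Jacobian matrix is constant. Plain nested lists are used for the 3 × 3 matrices; the helper names and the choice of transformation are illustrative assumptions.

```python
def transpose(a):
    """Transpose of a 3x3 matrix given as nested lists."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def transform_tensor(g, t):
    """Rule (1.198b): t' = g t g^T / det(g), for permittivity or permeability."""
    d = det3(g)
    m = matmul(matmul(g, t), transpose(g))
    return [[m[i][j] / d for j in range(3)] for i in range(3)]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # free space
stretch = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # Jacobian of x1' = 2 x1
for row in transform_tensor(stretch, identity):
    print(row)
```

Even this simplest transformation turns isotropic free space into an anisotropic medium, with different tensor components along and across the stretch direction.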

In the example of fig. 1.26(C), the region r < b inside a sphere of radius b (say) is transformed into a spherical shell a < r′ < b (b > a) which acts as the cloak around the inner spherical region, it being convenient in this instance to use spherical polar co-ordinates r′, θ′, φ′ in place of the Cartesian ones (x′, y′, z′) in the transformed space. Note that the deformed ray paths, described in terms of the co-ordinates x′, y′, z′, pertain

to the medium characterized by the primed quantities while the unprimed

quantities pertain to the medium we started with (which we have chosen

to be free space for the sake of simplicity) where the ray paths are straight

lines. The two situations are to be made to correspond to each other

in terms of appropriate boundary conditions, or initial ray directions as

these approach the cloaked region and the cloak.

On working out the required transformation in this instance (there can be more than one possible transformation, among which a linear one relating r′ to r is commonly chosen for the sake of simplicity) one finds

that the medium in which the cloaking takes place is to be a strongly in-

homogeneous and anisotropic one, and requires an artificially engineered

material (a metamaterial) for its realization.
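
To give a feel for the inhomogeneity and anisotropy involved, the sketch below evaluates the cloak parameters for the linear map r′ = a + r(b − a)/b applied to free space. The expressions used, a radial component (b/(b − a))((r′ − a)/r′)² and a tangential component b/(b − a), identical for ǫ′r and µ′r, are the commonly quoted result of applying (1.198b) to this map; they are not derived in the text and are taken here as an assumption, as are the function name and sample radii.

```python
def cloak_parameters(r_prime, a, b):
    """Radial and tangential components of the relative permittivity (equal to
    those of the permeability) inside the cloaking shell a <= r' <= b, for the
    linear radial map r' = a + r * (b - a) / b applied to free space."""
    if not (0.0 < a < b):
        raise ValueError("need 0 < a < b")
    if not (a <= r_prime <= b):
        raise ValueError("r' must lie inside the shell a <= r' <= b")
    radial = b / (b - a) * ((r_prime - a) / r_prime) ** 2
    tangential = b / (b - a)
    return radial, tangential

a, b = 1.0, 2.0                                   # illustrative inner and outer radii
for r_prime in (1.0, 1.25, 1.5, 1.75, 2.0):
    print(r_prime, cloak_parameters(r_prime, a, b))
```

The radial component varies with r′ and falls to zero at the inner surface of the shell while the tangential component stays constant, so the medium is both inhomogeneous and anisotropic.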

Transformation optics is relevant in other applications as well, and is

currently an area of enormous activity (with, unfortunately, a component

likely to have a strategic orientation).

1.21 Coherent and incoherent waves

The idea of coherence is of great relevance in optics and in electromag-

netic theory, as also in other areas of physics. For instance, interference

patterns (refer to chapter 4) are generated with the help of coherent waves

while a lack of coherence between the waves results in the patterns being

destroyed.

The basic idea can be explained by referring to a space-time dependent

real-valued scalar field ψ(r, t) where ψ may, for instance, stand for any of

the Cartesian components of the field vectors constituting an electromag-

netic field.

Terms like ‘wave’, ‘field’, ‘disturbance’, and ‘signal’ are commonly used with more or less

identical meanings, with perhaps only slightly different connotations depending on the

context.

Consider the variations of ψ(t) and ψ(t + τ) as functions of time t, where

τ is any fixed time interval (commonly referred to as the delay between

the two functions), and where the reference to the position vector r is sup-

pressed by way of choosing some particular field point in space. Fig. 1.27(A)

depicts an instance of the two functions where the variations in time are

seen to resemble each other to a great extent, while the degree of re-

semblance appears to be much less in fig. 1.27(B). Assuming that the

situation depicted in the two figures remains substantially the same for arbitrary values of the delay τ, one says that the wave described by ψ(r, t)

is a temporally coherent one at the chosen point r for the case (A), while

it is said to be temporally incoherent for the case (B).

Figure 1.27: Illustrating the concept of coherence; the wave form of a real scalar field ψ(r, t) is shown for any chosen point r; (A) the wave forms of ψ(t) and ψ(t + τ) are shown for comparison; the resemblance or degree of correlation between the two is high; (B) the degree of correlation is low, as the two waveforms are seen to have little resemblance to each other; the time delay τ chosen in either case is large compared to the range of t shown in the figure; (A) corresponds to a coherent wave at r, while (B) represents an incoherent wave.

More generally, though, one speaks of partial coherence where the degree

of resemblance referred to above may be quantified by a value that may

vary over a range, and where it may depend on the delay τ . For instance,

there may exist a certain value, say, τ0 of the delay (often not defined

very sharply) such that coherence may exist for τ < τ0 and may be de-

stroyed for τ > τ0. The delay τ0 is then referred to as the coherence time

characterizing the field at r.
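
The idea of a coherence time can be illustrated with a toy model. The sketch below builds a sampled wave whose phase jumps to a fresh random value at regular intervals, and measures the resemblance between the wave and its delayed copy by a normalized time-averaged correlation. Several assumptions are made purely for illustration: the field is represented by a complex signal of unit magnitude rather than a real one, a time average stands in for the ensemble average to be introduced in chapter 7, and the interval between phase jumps plays the role of τ0.

```python
import cmath
import math
import random

def phase_jump_signal(n_samples, block, omega=0.3, seed=0):
    """Unit-magnitude complex wave exp(-i(omega*k + phase)) sampled at k = 0, 1, ...

    If block > 0 the phase is redrawn uniformly from [0, 2*pi) every `block`
    samples; if block == 0 the phase stays fixed (ideally coherent wave).
    """
    rng = random.Random(seed)
    phase = 0.0
    signal = []
    for k in range(n_samples):
        if block and k % block == 0:
            phase = rng.uniform(0.0, 2.0 * math.pi)
        signal.append(cmath.exp(-1j * (omega * k + phase)))
    return signal

def degree_of_correlation(signal, delay):
    """Modulus of the normalized time-averaged correlation between the signal
    and its copy delayed by `delay` samples (a value between 0 and 1)."""
    n = len(signal) - delay
    num = sum(signal[k].conjugate() * signal[k + delay] for k in range(n))
    norm_a = sum(abs(signal[k]) ** 2 for k in range(n))
    norm_b = sum(abs(signal[k + delay]) ** 2 for k in range(n))
    return abs(num) / math.sqrt(norm_a * norm_b)

coherent = phase_jump_signal(20000, block=0)
partially_coherent = phase_jump_signal(20000, block=100)   # tau_0 of about 100 samples
for delay in (10, 50, 90, 200, 500):
    print(delay,
          degree_of_correlation(coherent, delay),
          degree_of_correlation(partially_coherent, delay))
```

For the fixed-phase wave the correlation stays at unity for every delay, while for the wave with phase jumps it falls off as the delay approaches the interval between jumps and remains small beyond it.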

One may also consider the spatial coherence characteristics of the field

by referring to any two chosen points r1 and r2 and looking at the degree

of resemblance (or of correlation) between ψ(r1, t) and ψ(r2, t) for various

different values of the separation between the two points. As is seen

in numerous situations of interest, the degree of resemblance is high

when the separation d is less than a certain transition value d0 (which,

once again, may not be sharply defined), while being almost zero for d >

d0. It is d0, then, that describes the spatial coherence of the field under

consideration.

Instead of considering one single space-time dependent field ψ, one may

even consider two field functions ψ1 and ψ2, and look at their mutual

coherence characteristics. For instance, the degree of correlation between

ψ1(r, t) and ψ2(r, t + τ) as functions of t for any chosen point r and for

various values of the delay parameter τ describes the temporal coherence

of the two fields at the chosen point. The mutual coherence between the

two fields ψ1 and ψ2 is reflected in the degree of self coherence of the

superposed field ψ1 + ψ2.

Coherence is of relevance in optics because optical field variables are

quite often in the nature of random ones and their time variations resemble

random processes. This element of randomness finds its expression in the

lack of correlation between the field components, the degree of which may

depend on the set-up producing the field.

In chapter 7, I will take up the issue of coherence in greater detail, where the notion of random variables and random processes will be explained, and that of ‘degree of resemblance’ (or the degree of correlation) will be

quantified in terms of the ensemble average of the product of two sam-

ple functions. The fact that the electromagnetic field involves vector wave

functions rather than scalar ones adds a new facet to the issue of coher-

ence, namely, the one relating to the degree of polarization of the wave

under consideration.
