Quantum Electronics FS2014 Summary of lecture notes, v 1.2 Prof. Steven Johnson March 4, 2014



Contents

1 Introduction and basics of light propagation
1.1 What is Quantum Electronics?
1.2 Outline of course
1.3 Maxwell’s equations in a medium
1.4 P and M
1.5 Wave equation
1.6 Plane wave solutions
1.7 Polarization
1.8 Energy and the Poynting vector

2 Propagation in dispersive media
2.1 Mathematical aspects of dispersion
2.2 Physical origins of dispersion: Lorentz model
2.3 Kramers-Kronig relations
2.4 Index of refraction in dispersive media
2.5 Phenomenology of n(ω)
2.6 Pulses: non-monochromatic waves
2.7 Group, phase and front velocity

3 Interfaces, interference and coherence
3.1 Boundary conditions for isotropic media
3.2 Brewster’s angle
3.3 Total reflection
3.4 Longitudinal coherence

4 Interferometry
4.1 More about coherence
4.2 Wiener-Khinchin Theorem
4.3 Interferometry
4.4 Matrix methods for multiple surfaces
4.5 Application to Fabry-Perot

5 Fourier Optics
5.1 Helmholtz Equation
5.2 Spatial Fourier transform and transfer function
5.3 Fresnel Approximation
5.4 Gaussian Beams
5.5 Fraunhofer limit: the far field
5.6 Examples of amplitude modulation
5.6.1 Rectangular aperture
5.6.2 Round aperture

6 Fourier Optics 2
6.1 Phase masks and Lenses
6.2 Imaging
6.3 Gratings
6.3.1 Example: harmonic amplitude transmission grating
6.4 Fresnel zone-plate
6.5 Holography
6.6 Paraxial ray optics
6.7 Gaussian beam optics
6.7.1 ABCD law

7 Gaussian and other beams
7.1 Paraxial Helmholtz equation revisited
7.2 Hermite-Gaussian beams
7.3 Laguerre-Gaussian beams
7.4 Bessel beams
7.5 Appendix: Proof of eq. 7.10

8 Optical resonators
8.1 What are optical resonators?
8.2 Spherical mirror resonators
8.3 Symmetric spherical mirror resonators
8.4 Resonance frequencies
8.5 Resonator loss

9 Laser fundamentals 1
9.1 Photon-atom interactions in two-level systems
9.2 Einstein coefficients: relationship between spontaneous and stimulated emission
9.3 Line broadening mechanisms: homogeneous versus inhomogeneous
9.3.1 Lifetime broadening
9.3.2 Collision broadening
9.3.3 Inhomogeneous broadening
9.4 Atom-level based lasers: general considerations
9.5 Three-level lasers
9.6 Four-level lasers

10 Laser fundamentals 2
10.1 Laser oscillation
10.2 Laser characteristics
10.3 Spectrum and longitudinal modes
10.3.1 Homogeneous broadening
10.3.2 Inhomogeneous broadening
10.4 Transverse modes
10.5 Mode selection
10.5.1 Selecting a laser line
10.5.2 Transverse mode selection
10.5.3 Polarization selection
10.5.4 Longitudinal mode selection
10.6 Pulsed lasers: overview
10.7 Examples of lasers
10.7.1 Ti3+:Al2O3
10.7.2 CO2 lasers
10.7.3 Free-electron laser

11 Polarization optics 1
11.1 Describing polarization: Poincare sphere and the Stokes vector
11.2 Jones vector formalism
11.3 Anisotropic materials: overview
11.4 Propagation in anisotropic materials
11.4.1 Propagation along a principal axis
11.4.2 Propagation in an arbitrary direction
11.5 Dispersion relation
11.6 Optical activity
11.7 Magneto-optics
11.8 Electro-optics
11.9 Polarization devices
11.9.1 Polarizers
11.9.2 Wave retarders
11.9.3 Intensity control, switching
11.9.4 Optical isolators

12 Waveguides
12.1 Waveguides: an introduction
12.2 Modes of the planar-mirror waveguide
12.2.1 Electric field profiles
12.2.2 Number of modes
12.2.3 TM modes
12.2.4 Dispersion relation
12.2.5 Group velocity
12.3 Planar dielectric-waveguide
12.3.1 Numerical aperture
12.3.2 Field distributions
12.4 Optical fibers

13 Nonlinear optics 1
13.1 Nonlinear optics overview
13.2 Symmetry and nonlinearity
13.3 Nonlinear wave equation and the first Born approximation
13.4 Second order processes
13.4.1 Second harmonic generation
13.4.2 The electro-optic effect
13.4.3 Three-wave mixing
13.4.4 Phase matching

14 Nonlinear optics 2
14.1 Third order processes
14.1.1 Third harmonic generation
14.1.2 Optical Kerr effect
14.2 Coupled-wave theory
14.2.1 Photon number conservation: Manley-Rowe Relation
14.2.2 Example: second harmonic generation
14.3 Anisotropy
14.4 Dispersion


Chapter 1

Introduction and basics of light propagation

Learning Objectives

After this chapter, you should be able to

• Identify the subjects encompassed by Quantum Electronics

• Explain the physical differences between the microscopic Maxwell’s equations and the macroscopic Maxwell’s equations. (Should know both sets of Maxwell’s equations: eq. 1.1-4; 1.13-16.)

• Explain the physical meaning of the polarization density P and the magnetization density M. (Should know eq. 1.31-32)

• For plane wave solutions, give the constraints on the relative orientation of the EM fields. (Should know the wave equation and the generic relations for plane waves: eq. 1.36-37; 1.43; 1.49)

• Define the Poynting vector and its physical meaning in terms of energy flux and momentum. (Should know/derive eqs. 1.59, 1.60, 1.61, 1.64, 1.65)

• Calculate the intensity of a linearly polarized plane wave, given the amplitude of the electric field

1.1 What is Quantum Electronics?

Quantum electronics is essentially a branch of physics that derives from the invention of lasers in 1960. Broadly speaking, it covers the subject of how lasers work and how the unique properties of the output of lasers can be used. The name “quantum electronics” is somewhat old-fashioned. The term refers to the historical development of the first lasers, in which the interaction between light and electronic energy levels of atoms played a key role. Since this time, a wider variety of types of lasers have been developed, and some of these do not quite fit the literal meaning of “quantum electronics.” In reality much of what falls into quantum electronics is really well-described by classical physics. More modern designations would be “laser physics” or “photonics.”

The word “laser” is an acronym, meaning “Light amplification by stimulated emission of radiation.” The phrase “stimulated emission” is a key point, introduced in 1917 by Albert Einstein. We’ll discuss this process later in the course. For now, we’ll just recognize that lasers are a way of efficiently amplifying light in a particular way. Historically, the word “light” refers specifically to the visible range of the electromagnetic spectrum, although we now have lasers that function all the way to the x-ray range of the spectrum. The laser was preceded by the “maser,” which is in principle the same as a laser but operates in the microwave range of the EM spectrum. The first maser was built in 1954 by Charles Townes together with James Gordon and Herbert Zeiger. Following some public discussion about making a version of the maser for optical wavelengths (notably a 1958 proposal by Townes and Arthur Schawlow), there was a race to build the first “true” laser that operated in the visible wavelength range. This was not an easy task, since the wavelengths of microwaves and visible light differ by a factor of order ten thousand (about 1 cm for the ammonia maser versus about 0.7 µm for ruby), and so many of the technical solutions that worked for microwaves simply did not work for optical light. It is generally accepted that Theodore Maiman made the first working laser, based on a rod of ruby, in 1960. Interestingly, the initial paper reporting the invention of the laser was rejected by Physical Review Letters as being “just another maser paper” (the editor was a theorist). Maiman ended up publishing it in the journal “Nature” as a short 300-word report. The 1964 Nobel prize was awarded to Charles Townes, Nikolai Basov and Alexander Prokhorov for their work on the theory of the laser. This no doubt further irritated Maiman, but do not feel too sorry for him: he received numerous other prizes and honorary degrees for his work.

When the laser was first invented there was quite a bit of excitement about the unique characteristics of the light emitted, but no one had a clear idea of how it was really useful. Today the situation is quite different. Lasers have invaded everyday life: we have barcode scanners, laser printers and laser pointers. Lasers are also used for machining and marking applications in industry. The coherence of lasers allows for radically new imaging methods, for example holography. In the natural sciences lasers have revolutionized optical spectroscopy, and have enabled so-called “nonlinear” spectroscopy that was not even really considered before the invention of the laser. These methods allow for precision measurements of atomic, molecular and solid-state properties. Short pulse lasers (so-called “ultrafast”) have allowed investigations of dynamics on time scales of hundreds of attoseconds, roughly the time scale for electronic wavepacket motion in an atom.

1.2 Outline of course

The main objective of this course is to enable you to explain to someone with similar educational background (1) how lasers work, (2) how light from lasers is different from “normal” light, and (3) how this light can be manipulated. This course also provides the tools needed to take more advanced courses in this area. There are many such courses offered; please see a list with descriptions posted online via moodle.

Below is an outline of the course:

• Classical wave propagation in isotropic media

• Interference, coherence and diffraction

• Principles of lasers

• Light propagation in anisotropic media

• Waveguides

• Nonlinear and ultrafast optics

As you can see, a large fraction of the course discusses classical EM wave propagation. This is no accident: in many ways, the radiation from lasers is in a strongly classical limit. We frequently will treat the EM field classically but the material properties quantum mechanically (called a semi-classical treatment). For some cases we would need a fully quantum treatment using quantum electrodynamics, but for the topics discussed in this particular course it is not necessary (for that you can refer to the follow-up course on quantum optics!).

1.3 Maxwell’s equations in a medium

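The classical description of the electromagnetic field is in principle given by Maxwell’s microscopic equations (here in SI units):

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0} \tag{1.1}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{1.2}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.3}$$

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{j} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} \tag{1.4}$$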

Here $\varepsilon_0 = 8.854 \times 10^{-12}$ As/Vm is called the electric constant, and $\mu_0 = 1.257 \times 10^{-6}$ Vs/Am is the magnetic constant.

This is great, but what should we do when we want to consider the propagation of an electromagnetic wave in something that is not vacuum? A precise description of the relevant charge densities and currents on an atomic scale is pretty challenging for a classical theory, and probably impossible to do since we generally lack knowledge of exactly where every nucleus and electron is in, say, a piece of glass. We need some way to simplify this problem so that it becomes tractable.

We start by playing some mathematical games in an effort to “hide” the explicit dependence on every microscopic charge and current. We are not (yet) going to be making any approximations, just playing some mathematical tricks.

We start by breaking up the charge and current densities into two parts:

$$\rho = \rho_f + \rho_b \tag{1.5}$$

$$\mathbf{j} = \mathbf{j}_f + \mathbf{j}_b \tag{1.6}$$

For now the division is completely arbitrary. Our aim will be to find a way to sweep the densities with the subscript “b” under the rug. We’ll start with equation 1.1, the electric field divergence equation. If we introduce a new field variable

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P} \tag{1.7}$$

and define $\mathbf{P}$ implicitly via

$$\rho_b = -\nabla \cdot \mathbf{P} \tag{1.8}$$

we can re-write the electric field divergence equation in the form

$$\nabla \cdot \mathbf{D} = \rho_f \tag{1.9}$$

which looks pretty similar, but now we can incorporate into our new field variable $\mathbf{D}$ an arbitrary amount of the charge density. We could even choose to make $\rho_f = 0$, which would make this equation very symmetric with equation 1.2.
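The step from equation 1.1 to equation 1.9 can be written out line by line. The following is only a restatement of the definitions 1.5, 1.7 and 1.8; no additional assumption enters:

```latex
\begin{align*}
\nabla\cdot\mathbf{D}
  &= \varepsilon_0\,\nabla\cdot\mathbf{E} + \nabla\cdot\mathbf{P}
     && \text{(definition 1.7)} \\
  &= \rho - \rho_b
     && \text{(equations 1.1 and 1.8)} \\
  &= \rho_f
     && \text{(equation 1.5)}
\end{align*}
```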

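We can attack equation 1.4 in a similar fashion by defining a new magnetic vector field

$$\mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M} \tag{1.10}$$

where

$$\nabla \times \mathbf{M} = \mathbf{j}_b - \frac{\partial \mathbf{P}}{\partial t} \tag{1.11}$$

We then get a new curl equation in $\mathbf{H}$

$$\nabla \times \mathbf{H} = \mathbf{j}_f + \frac{\partial \mathbf{D}}{\partial t} \tag{1.12}$$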

which again looks suspiciously similar to the other curl equation if we were to set $\mathbf{j}_f = 0$.

All together, our new equations have the form of the so-called macroscopic Maxwell’s equations:

$$\nabla \cdot \mathbf{D} = \rho_f \tag{1.13}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{1.14}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.15}$$

$$\nabla \times \mathbf{H} = \mathbf{j}_f + \frac{\partial \mathbf{D}}{\partial t} \tag{1.16}$$
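For completeness, the curl equation 1.16 follows from the microscopic equation 1.4 in the same mechanical way as the divergence equation. The sketch below uses only the definitions 1.6, 1.7, 1.10 and 1.11:

```latex
\begin{align*}
\nabla\times\mathbf{H}
  &= \frac{1}{\mu_0}\nabla\times\mathbf{B} - \nabla\times\mathbf{M}
     && \text{(definition 1.10)} \\
  &= \mathbf{j} + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}
     - \mathbf{j}_b + \frac{\partial\mathbf{P}}{\partial t}
     && \text{(equations 1.4 and 1.11)} \\
  &= \mathbf{j}_f + \frac{\partial}{\partial t}
     \left(\varepsilon_0\mathbf{E} + \mathbf{P}\right)
   = \mathbf{j}_f + \frac{\partial\mathbf{D}}{\partial t}
     && \text{(equations 1.6 and 1.7)}
\end{align*}
```

As a consistency check, taking the divergence of equation 1.11 and using equation 1.8 gives $\nabla\cdot\mathbf{j}_b + \partial\rho_b/\partial t = 0$, so the "b" charges and currents obey their own continuity equation.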

As stated above, so far this is just a mathematical trick, since we have not given any physical meaning to any of the new variables we have introduced. All we have shown is that it is possible to hide a portion of the explicit dependence on charge and current densities within two new field variables $\mathbf{P}$ and $\mathbf{M}$. Next we go through one method of performing this assignment.

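Let’s imagine we cut up our medium (conceptually, not physically!) into very small cubes of equal size, each with volume $V$, centered at regularly spaced locations $\mathbf{R}$. For each cube, we measure the total charge $q_\mathbf{R} = \int \rho(\mathbf{r}')\,d^3r'$, the electric dipole moment $\mathbf{p}_\mathbf{R} = \int (\mathbf{r}' - \mathbf{R})\,\rho(\mathbf{r}')\,d^3r'$, a current $\mathbf{i}_\mathbf{R} = \int \mathbf{j}(\mathbf{r}')\,d^3r'$, and a magnetic dipole moment $\mathbf{m}_\mathbf{R} = \frac{1}{2}\int (\mathbf{r}' - \mathbf{R}) \times \mathbf{j}(\mathbf{r}')\,d^3r'$, where each integral runs over the volume of the cube. Our new variables can then be defined as follows

$$\rho_f(\mathbf{r}) = q_\mathbf{R}/V \tag{1.17}$$

$$\mathbf{P}(\mathbf{r}) = \mathbf{p}_\mathbf{R}/V \tag{1.18}$$

$$\mathbf{j}_f(\mathbf{r}) = \mathbf{i}_\mathbf{R}/V \tag{1.19}$$

$$\mathbf{M}(\mathbf{r}) = \mathbf{m}_\mathbf{R}/V \tag{1.20}$$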

if the vector $\mathbf{r}$ falls within the volume $V$ of the cube centered at $\mathbf{R}$. With these definitions $\mathbf{P}$ is fairly obviously the polarization density and $\mathbf{M}$ is the magnetization of the medium. We can show fairly straightforwardly that these definitions plugged into the macroscopic Maxwell’s equations will reduce to the microscopic equations if (1) the “free” densities $\rho_f$ and $\mathbf{j}_f$ are uniformly distributed within each cube, and (2) the “bound” densities $\rho_b$ and $\mathbf{j}_b$ are both due to charges and currents on the boundaries between cubes. This is obviously an approximation: this scheme will fail to model accurately the fields within each cube. It works, though, for cases where the fields are not expected to vary significantly within individual cubes.
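The cube-averaging recipe of equations 1.17 and 1.18 can be illustrated with point charges, for which the integrals reduce to sums. This is a minimal sketch, not part of the original notes: the function name `cell_averages` and the toy "atom" (two opposite charges 0.1 nm apart inside a 1 nm cube) are assumptions chosen only for illustration.

```python
import numpy as np

def cell_averages(charges, positions, center, side):
    """Return (rho_f, P) for one cubic cell, following eqs. 1.17 and 1.18.

    charges   : point charges in coulombs
    positions : their positions in metres, shape (N, 3)
    center    : cube center R, shape (3,)
    side      : cube edge length in metres
    """
    charges = np.asarray(charges, dtype=float)
    positions = np.asarray(positions, dtype=float)
    center = np.asarray(center, dtype=float)
    volume = side**3

    # Keep only the charges that lie inside the cube centered at R.
    inside = np.all(np.abs(positions - center) <= side / 2, axis=1)
    q = charges[inside]
    r = positions[inside] - center

    q_R = q.sum()                        # total charge of the cell
    p_R = (q[:, None] * r).sum(axis=0)   # dipole moment about the cell center
    return q_R / volume, p_R / volume

# Hypothetical example: a neutral pair +e / -e separated by 0.1 nm along x.
e = 1.602e-19
rho_f, P = cell_averages(
    charges=[+e, -e],
    positions=[[0.05e-9, 0.0, 0.0], [-0.05e-9, 0.0, 0.0]],
    center=[0.0, 0.0, 0.0],
    side=1e-9,
)
```

In this toy cell the net charge density vanishes while the polarization density points along x, which is the typical situation for a neutral dielectric.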

1.4 P and M

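To make any sort of further progress, we need to know more about $\mathbf{P}$ and $\mathbf{M}$. Specifically, we might expect the polarization $\mathbf{P}$ to depend on the electric field $\mathbf{E}$, and the magnetization density $\mathbf{M}$ to depend on the magnetic field (written below in terms of $\mathbf{H}$). Fairly general expressions for this dependence can be written as follows

$$\mathbf{P}(\mathbf{r}, t) = \int_{-\infty}^{t} \mathbf{f}_P(\mathbf{r}, \mathbf{E}(\mathbf{r}, t'), t')\,dt' \tag{1.21}$$

$$\mathbf{M}(\mathbf{r}, t) = \int_{-\infty}^{t} \mathbf{f}_M(\mathbf{r}, \mathbf{H}(\mathbf{r}, t'), t')\,dt' \tag{1.22}$$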

We’ve implicitly assumed locality; that is, we assume that $\mathbf{P}$ and $\mathbf{M}$ depend only on the electric and magnetic fields that have been at that particular place in the medium. This is usually pretty safe, as long as our “bound” charges and currents do not move around in the medium very much (this is why they are called “bound”).

These expressions are still pretty unwieldy, since we a priori know nothing about these vector functions. First, let’s assume that the explicit $t$ dependence of these functions is not important. This is essentially an assumption that over the time scale of our calculation, the optical properties of the medium do not change much. We’ll also (for now) assume that the explicit $\mathbf{r}$ dependence is also not important; that is, the medium is optically homogeneous.

We still have this annoying time integral. For our immediate purposes we will assume that $\mathbf{P}$ and $\mathbf{M}$ depend only on the immediate value of $\mathbf{E}$ and $\mathbf{H}$. This assumption allows us to get rid of the time integral. It is actually a pretty bad assumption for most materials, as we will see in the next lectures when we relax it. We then get

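$$\mathbf{P}(\mathbf{r}, t) = \mathbf{f}_P(\mathbf{E}(\mathbf{r}, t)) \tag{1.23}$$

$$\mathbf{M}(\mathbf{r}, t) = \mathbf{f}_M(\mathbf{H}(\mathbf{r}, t)) \tag{1.24}$$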

We can expand each of these expressions in a Maclaurin series:

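$$P_i(\mathbf{r}, t) = P^{(0)}_i + \varepsilon_0 \chi^{(1)}_{ij} E_j(\mathbf{r}, t) + \varepsilon_0 \chi^{(2)}_{ijk} E_j(\mathbf{r}, t) E_k(\mathbf{r}, t) + \ldots \tag{1.25}$$

$$M_i(\mathbf{r}, t) = M^{(0)}_i + \chi^{(m1)}_{ij} H_j(\mathbf{r}, t) + \chi^{(m2)}_{ijk} H_j(\mathbf{r}, t) H_k(\mathbf{r}, t) + \ldots \tag{1.26}$$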

Here $\chi^{(1)}_{ij}$ and $\chi^{(m1)}_{ij}$ are constant second-rank tensors, $\chi^{(2)}_{ijk}$ and $\chi^{(m2)}_{ijk}$ are constant third-rank tensors, and summation over repeated indices is implied. In general, $\mathbf{P}$ and $\mathbf{M}$ depend nonlinearly on the fields and may also point in directions different from those of the fields. Both effects are extremely important in many quantum electronics applications, but for many situations it is a reasonable approximation to simplify further by assuming that only the second (linear) terms of the above relations are important, and that the medium is isotropic so that $\chi^{(1)}$ and $\chi^{(m1)}$ are just identity matrices multiplied by a scalar. We then get the equations for an isotropic, homogeneous, dispersionless, linear medium:

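$$\mathbf{P}(\mathbf{r}, t) \approx \varepsilon_0 \chi \mathbf{E}(\mathbf{r}, t) \tag{1.27}$$

$$\mathbf{M}(\mathbf{r}, t) \approx \chi_m \mathbf{H}(\mathbf{r}, t) \tag{1.28}$$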

Here $\chi$ is called the dielectric susceptibility, and $\chi_m$ is the magnetic susceptibility. Quite frequently we use the definitions for the dielectric constant

$$\varepsilon = 1 + \chi \tag{1.29}$$

and the magnetic permeability

$$\mu = 1 + \chi_m \tag{1.30}$$

so that we may write the other fields as

$$\mathbf{D} = \varepsilon_0 \varepsilon \mathbf{E} \tag{1.31}$$

$$\mathbf{B} = \mu_0 \mu \mathbf{H} \tag{1.32}$$

Note that in a more general treatment for anisotropic materials, these relations will need modification. Also, a small caveat: many people will include the factors of $\varepsilon_0$ and $\mu_0$ into the definition of $\varepsilon$ and $\mu$, so that they do not appear in the above equations. You can usually tell whether this is the case by dimensional analysis: our definitions of $\varepsilon$ and $\mu$ have no units, whereas if we were to fold in $\varepsilon_0$ and $\mu_0$ this would no longer be the case. Of course, not everyone uses SI units either!

1.5 Wave equation

To describe light, we need an equation that relates the time-derivative of each field to its spatial derivative. Under our assumptions of a homogeneous, isotropic, dispersionless medium, we can derive this from the macroscopic Maxwell’s equations. We’ll also assume no free currents or charges to make things really easy. First, we start with one of the curl equations, say this one:

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.33}$$

and then take the curl of both sides

$$\nabla \times (\nabla \times \mathbf{E}) = -\frac{\partial (\nabla \times \mathbf{B})}{\partial t} \tag{1.34}$$

Using equations 1.31, 1.32, 1.16, 1.13, and the vector calculus identity

$$\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A} \tag{1.35}$$

we obtain after some sweating the wave equation in the electric field

$$\nabla^2 \mathbf{E} = \varepsilon_0 \mu_0 \varepsilon \mu \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{1.36}$$
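The intermediate steps between equations 1.34 and 1.36 can be sketched as follows, assuming (as stated in the text) no free charges or currents, $\rho_f = 0$ and $\mathbf{j}_f = 0$, and constant scalar $\varepsilon$ and $\mu$:

```latex
\begin{align*}
\nabla\times\mathbf{B}
  &= \mu_0\mu\,\nabla\times\mathbf{H}
   = \mu_0\mu\,\frac{\partial\mathbf{D}}{\partial t}
   = \varepsilon_0\mu_0\varepsilon\mu\,\frac{\partial\mathbf{E}}{\partial t}
     && \text{(1.32, 1.16 with } \mathbf{j}_f = 0\text{, 1.31)} \\
\nabla\cdot\mathbf{E}
  &= \frac{1}{\varepsilon_0\varepsilon}\,\nabla\cdot\mathbf{D} = 0
     && \text{(1.31, 1.13 with } \rho_f = 0\text{)} \\
\nabla\times(\nabla\times\mathbf{E})
  &= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
   = -\nabla^2\mathbf{E}
     && \text{(identity 1.35)} \\
-\nabla^2\mathbf{E}
  &= -\varepsilon_0\mu_0\varepsilon\mu\,
     \frac{\partial^2\mathbf{E}}{\partial t^2}
     && \text{(inserting into 1.34)}
\end{align*}
```

Multiplying the last line by $-1$ gives equation 1.36.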

We can do the same thing for the H field to get

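$$\nabla^2 \mathbf{H} = \varepsilon_0 \mu_0 \varepsilon \mu \frac{\partial^2 \mathbf{H}}{\partial t^2} \tag{1.37}$$

These both have the form of generic linear wave equations, with wave velocity $v = 1/\sqrt{\varepsilon_0 \mu_0 \varepsilon \mu}$. The speed of light $c$ in vacuum is just

$$c = 1/\sqrt{\varepsilon_0 \mu_0} \tag{1.38}$$

We define the index of refraction

$$n = c/v = \sqrt{\mu \varepsilon} \tag{1.39}$$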

The index of refraction is essentially the factor by which the phase velocity of light is slowed in a medium, relative to propagation in vacuum. Very often at optical frequencies we make the approximation that the material is non-magnetic, $\mu \approx 1$. This makes the index of refraction simply the square root of the dielectric constant.
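Equations 1.38 and 1.39 are easy to check numerically with the rounded constants quoted in section 1.3. The snippet below is a small sketch; the function names and the example value $\varepsilon = 2.25$ (roughly typical of glass at optical frequencies) are illustrative assumptions, not values taken from the notes.

```python
import math

EPS0 = 8.854e-12  # electric constant, As/Vm (rounded value from section 1.3)
MU0 = 1.257e-6    # magnetic constant, Vs/Am (rounded value from section 1.3)

# Speed of light in vacuum, eq. 1.38
C = 1.0 / math.sqrt(EPS0 * MU0)

def refractive_index(eps, mu=1.0):
    """Index of refraction n = sqrt(mu * eps), eq. 1.39 (eps, mu dimensionless)."""
    return math.sqrt(eps * mu)

def phase_velocity(eps, mu=1.0):
    """Wave velocity v = c / n in the medium."""
    return C / refractive_index(eps, mu)

# Hypothetical non-magnetic medium with eps = 2.25
n_glass = refractive_index(2.25)
v_glass = phase_velocity(2.25)
```

With these rounded constants `C` comes out within a fraction of a percent of the familiar 2.998 × 10^8 m/s, and the example medium has n = 1.5.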


1.6 Plane wave solutions

There are many solutions to the wave equation, but one of the most useful is that of the monochromatic plane wave. The electric field solution is

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 \cos(\omega t - \mathbf{k} \cdot \mathbf{r} + \phi) \tag{1.40}$$

where the frequency $\omega$ and the wavevector $\mathbf{k}$ are related by

$$\omega = c|\mathbf{k}|/n \tag{1.41}$$

The wavelength $\lambda$ is related to $k = |\mathbf{k}|$ by

$$k = 2\pi/\lambda \tag{1.42}$$

Note that since the wave equation (for our assumptions) is linear, any linear combination of plane wave solutions is also a solution. In fact, it is possible to show that the plane waves form a complete basis for all possible solutions.
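The statements above can be checked numerically: a plane wave obeying the relation 1.41, and any superposition of such waves, should satisfy the one-dimensional form of the wave equation 1.36. The sketch below uses central finite differences; the wavelengths, amplitudes and index are arbitrary illustrative choices, and the function names are hypothetical.

```python
import numpy as np

C = 2.998e8  # speed of light in vacuum, m/s (rounded)

def plane_wave(z, t, wavelength, n, phase=0.0, amplitude=1.0):
    """Scalar plane wave of eq. 1.40 propagating along +z.

    `wavelength` is the wavelength inside the medium, so k = 2*pi/wavelength
    (eq. 1.42) and omega = C*k/n (eq. 1.41).
    """
    k = 2 * np.pi / wavelength
    omega = C * k / n
    return amplitude * np.cos(omega * t - k * z + phase)

def wave_equation_residual(field, z, t, n, dz, dt):
    """Central-difference estimate of d2E/dz2 - (n/C)^2 d2E/dt2 at (z, t)."""
    d2_dz2 = (field(z + dz, t) - 2 * field(z, t) + field(z - dz, t)) / dz**2
    d2_dt2 = (field(z, t + dt) - 2 * field(z, t) + field(z, t - dt)) / dt**2
    return d2_dz2 - (n / C) ** 2 * d2_dt2

# Hypothetical numbers: two waves of different wavelength in a medium with n = 1.5
n = 1.5
lam1, lam2 = 500e-9, 650e-9

def superposition(z, t):
    """Linear combination of two plane-wave solutions."""
    return (plane_wave(z, t, lam1, n)
            + 0.3 * plane_wave(z, t, lam2, n, phase=0.7))

k1 = 2 * np.pi / lam1
dz = lam1 / 1000
dt = dz * n / C  # time step matched to dz through the phase velocity
residual = wave_equation_residual(superposition, 1.0e-7, 2.0e-16, n, dz, dt)
relative_residual = abs(residual) / k1**2  # compare with the size of d2E/dz2
```

Each second derivative is of order `k1**2` on its own, so a `relative_residual` many orders of magnitude below 1 indicates that the two terms of the wave equation cancel, as expected for a superposition of solutions.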

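There are, however, some restrictions on the choices for the constant $\mathbf{E}_0$ imposed by Maxwell’s equations. From the divergence equation (for no free charges),

$$\nabla \cdot \mathbf{D} = \varepsilon_0 \varepsilon \nabla \cdot \mathbf{E} = \varepsilon_0 \varepsilon\, \mathbf{E}_0 \cdot \mathbf{k}\, \sin(\omega t - \mathbf{k} \cdot \mathbf{r} + \phi) = 0 \tag{1.43}$$

which implies that $\mathbf{E}_0$ is perpendicular to $\mathbf{k}$. Note that this is true only because we assumed an isotropic medium!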

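What about the magnetic fields? We could solve the wave equation in the same way, and get a general solution

$$\mathbf{B}(\mathbf{r}, t) = \mathbf{B}_0 \cos(\omega_m t - \mathbf{k}_m \cdot \mathbf{r} + \phi_m) \tag{1.44}$$

To get a relationship to the electric field, we plug these forms into one of Maxwell’s curl equations:

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.45}$$

$$(-\mathbf{k} \times \mathbf{E}_0)(-\sin(\omega t - \mathbf{k} \cdot \mathbf{r} + \phi)) = \omega_m \mathbf{B}_0 \sin(\omega_m t - \mathbf{k}_m \cdot \mathbf{r} + \phi_m) \tag{1.46}$$

In order for this to be true for all space and time, we require $\omega_m = \omega$, $\phi_m = \phi$, $\mathbf{k}_m = \mathbf{k}$, and

$$\mathbf{k} \times \mathbf{E}_0 = \omega \mathbf{B}_0 \tag{1.47}$$

or (reshuffling a bit)

$$\mathbf{B}_0 = \frac{n}{c}\, \hat{\mathbf{k}} \times \mathbf{E}_0 \tag{1.48}$$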

Here the hat over the vector means that it is normalized to unity. Thus, we see that the magnetic field is perpendicular to both $\mathbf{E}$ and the direction of propagation $\hat{\mathbf{k}}$. Waves satisfying these conditions are generally called TEM (Transverse Electro-Magnetic) waves.
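As a quick illustration of equations 1.43 and 1.48, the following sketch computes the magnetic amplitude from a given electric amplitude and propagation direction, and refuses inputs that violate transversality. The function name and the example vectors are assumptions made for this example.

```python
import numpy as np

C = 2.998e8  # speed of light in vacuum, m/s (rounded)

def magnetic_amplitude(E0, k_vec, n=1.0):
    """Return B0 = (n/c) k_hat x E0 (eq. 1.48) for a plane wave.

    E0    : electric field amplitude vector in V/m
    k_vec : any vector along the propagation direction (only its direction is used)
    n     : index of refraction of the isotropic medium
    """
    E0 = np.asarray(E0, dtype=float)
    k_hat = np.asarray(k_vec, dtype=float)
    k_hat = k_hat / np.linalg.norm(k_hat)
    # Eq. 1.43: in an isotropic medium E0 must be perpendicular to k.
    if abs(np.dot(E0, k_hat)) > 1e-12 * np.linalg.norm(E0):
        raise ValueError("E0 must be perpendicular to k")
    return (n / C) * np.cross(k_hat, E0)

# Hypothetical example: wave travelling along +z, polarized along x, 1 V/m
E0 = np.array([1.0, 0.0, 0.0])
k_vec = np.array([0.0, 0.0, 1.0])
B0 = magnetic_amplitude(E0, k_vec)
```

For this example `B0` points along +y, so that E, B and k form a right-handed set, and its magnitude is 1/c in SI units (a few nanotesla per V/m).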


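If we write the amplitude of $\mathbf{H}$ in terms of the electric field we get

$$\mathbf{H}_0 = \sqrt{\frac{\varepsilon_0 \varepsilon}{\mu_0 \mu}}\; \hat{\mathbf{k}} \times \mathbf{E}_0 = \frac{1}{Z}\, \hat{\mathbf{k}} \times \mathbf{E}_0 \tag{1.49}$$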

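where we introduce the impedance $Z = \sqrt{\mu_0 \mu / (\varepsilon_0 \varepsilon)}$. The impedance of vacuum is $Z_0 = \sqrt{\mu_0/\varepsilon_0} \approx 377\ \Omega$. For a non-magnetic material ($\mu = 1$), the impedance of an isotropic linear medium is $Z_0/n$.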
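The quoted value of the vacuum impedance follows directly from the constants of section 1.3. A minimal numerical check, with a hypothetical non-magnetic medium of $\varepsilon = 2.25$ as the second example:

```python
import math

EPS0 = 8.854e-12  # electric constant, As/Vm (rounded)
MU0 = 1.257e-6    # magnetic constant, Vs/Am (rounded)

def impedance(eps=1.0, mu=1.0):
    """Wave impedance Z = sqrt(mu0*mu / (eps0*eps)) in ohms, from eq. 1.49."""
    return math.sqrt(MU0 * mu / (EPS0 * eps))

Z0 = impedance()               # vacuum
Z_glass = impedance(eps=2.25)  # hypothetical medium with n = 1.5, mu = 1
```

Here `Z0` lands within an ohm of 377 Ω, and `Z_glass` equals `Z0 / 1.5`, in line with the $Z_0/n$ rule for non-magnetic media.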

Quite often it is convenient to use complex numbers in calculations of fields. This often allows for more compact formulations. For example, we can represent the plane wave discussed above as

$$\mathbf{E}(\mathbf{r}, t) = \frac{1}{2}\left[\mathbf{E}_0 e^{i(\omega t - \mathbf{k} \cdot \mathbf{r} + \phi)} + \text{c.c.}\right] \tag{1.50}$$

This can be succinctly written

$$\mathbf{E}(\mathbf{r}, t) = \tilde{\mathbf{E}}_0 e^{i(\omega t - \mathbf{k} \cdot \mathbf{r})} + \text{c.c.} \tag{1.51}$$

if we make $\tilde{\mathbf{E}}_0 = \frac{1}{2}\mathbf{E}_0 e^{i\phi}$.

People will often drop the “+ c.c.” as implicitly understood. Another (equivalent) approach is to just take the real part of the field when you want to talk about something measurable; this is the same as above, up to a factor of 2 that can be normalized out by redefining $\tilde{\mathbf{E}}_0$.

We need to be somewhat careful, though, when calculating quantities that depend nonlinearly on the fields. In these cases it is best to explicitly take the real part or add the complex conjugate. We’ll see an example of this when we discuss EM energy and the Poynting vector.
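The pitfall can be made concrete with two scalar signals $a(t) = \mathrm{Re}[A e^{i\omega t}]$ and $b(t) = \mathrm{Re}[B e^{i\omega t}]$. The sketch below (with arbitrary illustrative amplitudes) compares the time average of the real product with the standard result $\frac{1}{2}\mathrm{Re}(A B^*)$ and with the naive recipe of multiplying the complex expressions first and taking the real part afterwards.

```python
import numpy as np

# Arbitrary complex amplitudes, chosen only for illustration
A = 2.0 * np.exp(1j * 0.3)
B = 0.5 * np.exp(1j * 1.1)

omega = 2 * np.pi                                # one period equals 1 time unit
t = np.linspace(0.0, 1.0, 1000, endpoint=False)  # uniform samples over one period

a = np.real(A * np.exp(1j * omega * t))          # measurable (real) signals
b = np.real(B * np.exp(1j * omega * t))

avg_real_product = np.mean(a * b)                # correct time average of a(t) b(t)
avg_formula = 0.5 * np.real(A * np.conj(B))      # (1/2) Re(A B*)

# Naive approach: multiply the complex fields, then take the real part.
avg_naive = np.mean(np.real(A * np.exp(1j * omega * t) * B * np.exp(1j * omega * t)))
```

The first two numbers agree, while the naive product only contains a term oscillating at $2\omega$ and averages to zero. This is the reason for the conjugate in time-averaged expressions such as the Poynting vector.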

1.7 Polarization

Polarization refers to the direction of the electric and magnetic fields in the wave. It is usual to talk only about the electric field $\mathbf{E}$, with the understanding that the directions of the other fields are tied to this and $\mathbf{k}$.

Linear polarization is when the electric field always points along a single axis (in either the + or − direction). For example, if the wavevector $\mathbf{k}$ points along the Cartesian $+z$ direction, we could have a plane wave with electric field polarization along the $x$ direction:

$$\mathbf{E}(\mathbf{r}, t) = E_0 \hat{\mathbf{x}} \cos(\omega t - kz) \tag{1.52}$$

or we could have it point along the $y$ direction:

$$\mathbf{E}(\mathbf{r}, t) = E_0 \hat{\mathbf{y}} \cos(\omega t - kz) \tag{1.53}$$

We could even point it along an arbitrary direction in the $(x, y)$ plane:

$$\mathbf{E}(\mathbf{r}, t) = E_0 (\hat{\mathbf{x}} \cos\theta + \hat{\mathbf{y}} \sin\theta) \cos(\omega t - kz) \tag{1.54}$$


Figure 1.1: Sketch of a linearly polarized TEM light wave.

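Linearly polarized light is not the only kind of polarization possible. Since any linear combination of plane waves is also a solution to the wave equation, we can add up a wave linearly polarized along $x$

$$\mathbf{E}_1(\mathbf{r}, t) = E_0 \hat{\mathbf{x}} \cos(\omega t - kz) = \frac{1}{2} E_0 \hat{\mathbf{x}}\, e^{i(\omega t - kz)} + \text{c.c.} \tag{1.55}$$

with another wave propagating in the same direction and linearly polarized along $y$, but with a phase difference $\phi \neq 0$:

$$\mathbf{E}_2(\mathbf{r}, t) = E_0 \hat{\mathbf{y}} \cos(\omega t - kz + \phi) = \frac{1}{2} E_0 \hat{\mathbf{y}}\, e^{i\phi} e^{i(\omega t - kz)} + \text{c.c.} \tag{1.56}$$

The result is

$$\mathbf{E}_{tot}(\mathbf{r}, t) = \frac{1}{2} E_0 (\hat{\mathbf{x}} + \hat{\mathbf{y}} e^{i\phi})\, e^{i(\omega t - kz)} + \text{c.c.} \tag{1.57}$$

which, if we go through the addition of the complex conjugate, becomes

$$\mathbf{E}_{tot}(\mathbf{r}, t) = E_0 \left[(\hat{\mathbf{x}} + \hat{\mathbf{y}} \cos\phi) \cos(\omega t - kz) - \hat{\mathbf{y}} \sin\phi \sin(\omega t - kz)\right] \tag{1.58}$$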

In this solution the electric field changes direction depending on the value of $\omega t - kz$. This is in general called elliptical polarization, since the tip of the electric field vector traces out an ellipse as we vary $\omega t - kz$. A special case of this, for $\phi = \pi/2$, is called circular polarization, which is the subject of one of the problems in this week’s exercise set. Circular polarization has great importance in studies of magnetism with light.

Note that in complex notation these funny polarizations can be compactly represented by a value of $\tilde{\mathbf{E}}_0$ where each element of the vector has a different complex phase. This is one reason why this kind of notation is so popular.
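To see the ellipse and the circular special case explicitly, one can evaluate the real field of equation 1.57 for a sweep of the phase $\omega t - kz$. This is an illustrative sketch: the helper `real_field` and the choice $\phi = \pi/4$ for the elliptical case are assumptions, and the complex amplitude is written without the factor 1/2 because the real part is taken directly.

```python
import numpy as np

def real_field(amplitude, phase_arg):
    """Real (Ex, Ey) components for a complex amplitude vector.

    Implements E = Re[amplitude * exp(i * phase_arg)], where phase_arg stands
    for (omega*t - k*z). Returns an array of shape (len(phase_arg), 2).
    """
    amplitude = np.asarray(amplitude, dtype=complex)
    phase_arg = np.asarray(phase_arg, dtype=float)
    return np.real(np.outer(np.exp(1j * phase_arg), amplitude))

theta = np.linspace(0.0, 2 * np.pi, 361)  # one full cycle of omega*t - k*z
E0 = 1.0

# Complex amplitude E0 * (x_hat + y_hat * exp(i*phi)), as in eq. 1.57
circular = real_field(E0 * np.array([1.0, np.exp(1j * np.pi / 2)]), theta)
elliptical = real_field(E0 * np.array([1.0, np.exp(1j * np.pi / 4)]), theta)

# Length of the field vector over one cycle
r_circular = np.hypot(circular[:, 0], circular[:, 1])
r_elliptical = np.hypot(elliptical[:, 0], elliptical[:, 1])
```

For $\phi = \pi/2$ the field vector keeps a constant length and simply rotates, whereas for $\phi = \pi/4$ its length varies between a minimum and a maximum over the cycle, which is the signature of an ellipse.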


1.8 Energy and the Poynting vector

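It seems apparent that light, and waves in general, carry energy. The key relation here is the Poynting theorem, provable directly from Maxwell’s equations:

$$\nabla \cdot \mathbf{S} + \frac{\partial u}{\partial t} = -\mathbf{j}_f \cdot \mathbf{E} \tag{1.59}$$

where we have introduced the Poynting vector

$$\mathbf{S} = \mathbf{E} \times \mathbf{H} \tag{1.60}$$

and the electromagnetic energy density

$$u = \frac{1}{2}(\mathbf{E} \cdot \mathbf{D} + \mathbf{B} \cdot \mathbf{H}) \tag{1.61}$$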

Note that these equations use the real fields; using the complex versions without care will give you nonsense. The Poynting theorem is an energy continuity equation, where the right hand side is the source term (the mechanical work done by the field on free charges).

The direction of the Poynting vector gives the direction of energy transport. In our assumptions so far, this is equivalent to the direction of the wave propagation $\hat{\mathbf{k}}$. The magnitude of the Poynting vector gives us the energy transported in that direction per unit area per unit time. The linear momentum per unit area and unit time transferred by light is given by $\mathbf{S}/c$.

Note that in general the value of the Poynting vector oscillates with the light frequencyω. Often we don’t care so much about these very fast changes, and we would rather knowthe time-averaged energy flux. In general for a wave

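$$\mathbf{E}(\mathbf{r},t) = \mathrm{Re}[\mathbf{E}(\mathbf{r})e^{i\omega t}] \tag{1.62}$$

$$\mathbf{H}(\mathbf{r},t) = \mathrm{Re}[\mathbf{H}(\mathbf{r})e^{i\omega t}] \tag{1.63}$$

with complex amplitudes E and H,

$$\langle\mathbf{S}\rangle = \frac{1}{2}\mathrm{Re}(\mathbf{E}\times\mathbf{H}^*) \tag{1.64}$$

The time-averaged magnitude of S is often called the intensity I. For the specific case of plane waves in a linear, homogeneous medium

$$\langle\mathbf{S}\rangle = \frac{1}{2Z}|\mathbf{E}_0|^2\,\hat{\mathbf{k}} \tag{1.65}$$

In this case the time-averaged electromagnetic energy density is simply

$$\langle u\rangle = \frac{1}{2}\varepsilon_0\varepsilon|\mathbf{E}_0|^2 \tag{1.66}$$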
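The warning about complex fields and the time average in eq. 1.64 can be checked numerically. The sketch below is a scalar version under stated assumptions: E is taken along x and H along y, so that S_z = E_x H_y, and the two complex amplitudes are arbitrary illustrative values. It compares the direct time average of the product of the real fields with ½ Re(E H*), and also shows that naively multiplying the complex fields averages to zero.

```python
import numpy as np

omega = 2.0 * np.pi                   # angular frequency, chosen so the period is 1
e_amp = 1.5 * np.exp(1j * 0.3)        # hypothetical complex amplitude of E_x
h_amp = 0.8 * np.exp(-1j * 0.9)       # hypothetical complex amplitude of H_y

# Sample exactly one period on a uniform grid (endpoint excluded)
t = np.linspace(0.0, 1.0, 2000, endpoint=False)
e_complex = e_amp * np.exp(1j * omega * t)
h_complex = h_amp * np.exp(1j * omega * t)

# Correct: multiply the real fields, then average over the period
s_avg_direct = np.mean(e_complex.real * h_complex.real)

# Shortcut of eq. 1.64 (scalar version)
s_avg_formula = 0.5 * np.real(e_amp * np.conj(h_amp))

# "Nonsense": multiply the complex fields first and take the real part afterwards
s_avg_wrong = np.mean(np.real(e_complex * h_complex))

print(s_avg_direct, s_avg_formula, s_avg_wrong)
```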


Chapter 2

Propagation in dispersive media

Learning objectives

After completing this section of the course, you should be able to

• Explain the physical meaning of the frequency-dependent susceptibility (eqs. 2.3-4)

• Recall that there is a relationship that connects the real and imaginary parts of the susceptibility (the Kramers-Kronig relations), and explain the physical reasons behind it

• Identify the kinds of dispersion in materials based on data

• Explain the physical consequences of a complex-valued index of refraction in terms of wave propagation (eqs. 2.46, 2.48)

• Explain the physical meaning of phase and group velocity in an electromagnetic pulse

• Calculate the group and phase velocity given appropriate data (eqs. 2.5,2.57)

• Estimate the minimum duration of a pulse given the spectrum

• Calculate the broadening of a pulse due to propagation in a dispersive medium (eq. 2.64)

2.1 Mathematical aspects of dispersion

Back in section 5 of the notes from last week, we made the approximation that the polarization density and magnetization density depend linearly on the instantaneous values of the E and H fields, respectively. While this may be true for slow changes in the fields with time, it seems likely that as we increase the speed of changes to the fields, at some point this cannot be true. In the next section we will explore the physical origin of the breakdown of this assumption; for now, we will treat it only mathematically.


Assuming that P and M at a particular point in space depend linearly on the “history” of the electric and magnetic fields, respectively, we can write

$$\mathbf{P}(t) = \int_{-\infty}^{\infty}\varepsilon_0\chi(t-t')\mathbf{E}(t')\,dt' \tag{2.1}$$

$$\mathbf{M}(t) = \int_{-\infty}^{\infty}\chi_m(t-t')\mathbf{H}(t')\,dt' \tag{2.2}$$

(Here we will be talking only about a particular point in space, so we suppress the dependence on r.) These relations may be familiar from linear response theory. In these cases, the susceptibilities are impulse response functions for a time-invariant system; that is, they give the time behavior of P or M for a delta-function like E(t) or H(t).

If we apply a Fourier transform to the above relations, we get

P(ω) = ε0χ(ω)E(ω) (2.3)

M(ω) = χm(ω)H(ω) (2.4)

where the field variables are all now given in the frequency domain rather than in the time domain. Here we are using the following frequency domain quantities

$$\mathbf{P}(\omega) = \int_{-\infty}^{\infty}\mathbf{P}(t)e^{-i\omega t}\,dt \tag{2.5}$$

$$\chi(\omega) = \int_{-\infty}^{\infty}\chi(t)e^{-i\omega t}\,dt \tag{2.6}$$

$$\mathbf{E}(\omega) = \int_{-\infty}^{\infty}\mathbf{E}(t)e^{-i\omega t}\,dt \tag{2.7}$$

$$\mathbf{M}(\omega) = \int_{-\infty}^{\infty}\mathbf{M}(t)e^{-i\omega t}\,dt \tag{2.8}$$

$$\chi_m(\omega) = \int_{-\infty}^{\infty}\chi_m(t)e^{-i\omega t}\,dt \tag{2.9}$$

$$\mathbf{H}(\omega) = \int_{-\infty}^{\infty}\mathbf{H}(t)e^{-i\omega t}\,dt \tag{2.10}$$

This notation is a bit improper; we should really define new function names for all the transformed fields. This is, however, pretty common to see in the literature. Normally you can distinguish the frequency domain functions from the time domain functions by context (the t or ω are pretty good giveaways). The general idea that the response function depends on frequency is called “dispersion.”

Note that here all the transformed fields can be (and often are) complex. In particular, let’s think about the susceptibilities. The time-domain susceptibilities must all be zero-valued for t < 0, since otherwise the system would be able to predict the future. This implies that for a non-trivial time-domain susceptibility the frequency-domain susceptibility must be complex-valued. What does it mean to have a complex susceptibility?


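Let’s suppose we apply an electric field E(t) = E0 cos(ω0t) to our medium. The Fourier transform is

$$\mathbf{E}(\omega) = \frac{1}{2}\int_{-\infty}^{\infty}\mathbf{E}_0\left(e^{i\omega_0 t} + e^{-i\omega_0 t}\right)e^{-i\omega t}\,dt \tag{2.11}$$

$$= \frac{1}{2}\int_{-\infty}^{\infty}\mathbf{E}_0\left(e^{i(\omega_0-\omega)t} + e^{-i(\omega_0+\omega)t}\right)dt \tag{2.12}$$

$$= \pi\mathbf{E}_0\left(\delta(\omega-\omega_0) + \delta(\omega+\omega_0)\right) \tag{2.13}$$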

Thus the polarization is simply

P(ω) = πε0χ(ω)E0(δ(ω − ω0) + δ(ω + ω0)) (2.15)

Back-transforming to the time domain gives

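$$\mathbf{P}(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathbf{P}(\omega)e^{i\omega t}\,d\omega \tag{2.16}$$

$$= \frac{\varepsilon_0}{2}\mathbf{E}_0\left(\chi(\omega_0)e^{i\omega_0 t} + \chi(-\omega_0)e^{-i\omega_0 t}\right) \tag{2.17}$$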

Note that since the time-domain susceptibility is real, χ(−ω0) = χ(ω0)∗. This allows us to write

$$\mathbf{P}(t) = \varepsilon_0\mathbf{E}_0\,\mathrm{Re}[\chi(\omega_0)e^{i\omega_0 t}] \tag{2.19}$$

From this it is clear that |χ(ω0)| gives the amplitude of the polarization density response, whereas the complex phase of χ(ω0) gives a phase shift in time. We will explore what this means in terms of wave propagation a bit later.
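To make the link between the convolution (eq. 2.1) and the frequency-domain result (eq. 2.19) concrete, here is a minimal numerical sketch. The exponentially decaying response function, the relaxation time `TAU` and the drive frequency `OMEGA0` are assumptions chosen only for illustration; they do not come from the notes. Units are chosen so that ε0 = E0 = 1.

```python
import numpy as np

TAU = 1.0      # assumed relaxation time of the model response function
OMEGA0 = 2.0   # assumed drive frequency

def chi_time(s):
    """Assumed causal impulse response: (1/τ) exp(-s/τ) for s >= 0."""
    return np.exp(-s / TAU) / TAU

def chi_freq(omega):
    """Fourier transform of chi_time using the e^{-iωt} convention of eq. 2.6."""
    return 1.0 / (1.0 + 1j * omega * TAU)

def polarization_by_convolution(t, s_max=40.0, n=400_001):
    """P(t) = ∫ χ(s) cos(ω0 (t - s)) ds over s >= 0 (eq. 2.1 with s = t - t')."""
    s = np.linspace(0.0, s_max, n)
    integrand = chi_time(s) * np.cos(OMEGA0 * (t - s))
    # trapezoidal rule written out explicitly
    return np.sum((integrand[1:] + integrand[:-1]) * np.diff(s)) / 2.0

times = np.linspace(0.0, 5.0, 11)
p_direct = np.array([polarization_by_convolution(t) for t in times])
p_formula = np.real(chi_freq(OMEGA0) * np.exp(1j * OMEGA0 * times))  # eq. 2.19

amplitude = abs(chi_freq(OMEGA0))        # amplitude of the response
phase_shift = np.angle(chi_freq(OMEGA0)) # phase lag of P relative to E

print(np.max(np.abs(p_direct - p_formula)), amplitude, phase_shift)
```

The negative phase of χ(ω0) in this convention corresponds to the polarization lagging behind the driving field.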

2.2 Physical origins of dispersion: Lorentz model

(Saleh, chapter 5, section C)

To get some physical intuition for what dispersion really is, it is helpful to think in terms of simple models. In this section we are going to work through a simple model for the polarization density for an atom. We treat the electronic charge on the atom as a mass with charge −q and mass m on a spring, fixed on the other end to an ion with charge +q and mass M ≫ m.

A simple equation of motion for such a system is a damped simple harmonic oscillator

$$\frac{d^2x}{dt^2} + \gamma\frac{dx}{dt} + \omega_0^2x = F/m \tag{2.20}$$

The force term is given by

$$F(t) = -qE(t) \tag{2.21}$$


where E(t) is the magnitude of the (real) electric field (we are dropping vector notation for now). Using the relation p = −qx for the dipole moment, we then get an equation for the time dependence of the dipole moment

$$\frac{d^2p}{dt^2} + \gamma\frac{dp}{dt} + \omega_0^2p = q^2E(t)/m \tag{2.22}$$

If we multiply this by the number of such dipoles per unit volume N we obtain

$$\frac{d^2P(t)}{dt^2} + \gamma\frac{dP(t)}{dt} + \omega_0^2P(t) = Nq^2E(t)/m \tag{2.23}$$

If we Fourier transform this equation we get

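$$-\omega^2P(\omega) + i\gamma\omega P(\omega) + \omega_0^2P(\omega) = Nq^2E(\omega)/m \tag{2.24}$$

Solving for P(ω) reveals

$$P(\omega) = \frac{Nq^2/m}{\omega_0^2 - \omega^2 + i\gamma\omega}E(\omega) \tag{2.25}$$

And now we have a value for the frequency-domain susceptibility

$$\chi(\omega) = \frac{Nq^2/\varepsilon_0m}{\omega_0^2 - \omega^2 + i\gamma\omega} \tag{2.26}$$

Defining

$$\chi_0 = \frac{Nq^2}{\varepsilon_0m\omega_0^2}, \tag{2.27}$$

we then get

$$\chi(\omega) = \chi_0\frac{\omega_0^2}{\omega_0^2 - \omega^2 + i\gamma\omega} \tag{2.28}$$

The real and imaginary parts are

$$\mathrm{Re}[\chi(\omega)] \equiv \chi'(\omega) = \chi_0\frac{\omega_0^2(\omega_0^2 - \omega^2)}{(\omega_0^2 - \omega^2)^2 + (\gamma\omega)^2} \tag{2.29}$$

$$\mathrm{Im}[\chi(\omega)] \equiv \chi''(\omega) = -\chi_0\frac{\omega_0^2\omega\gamma}{(\omega_0^2 - \omega^2)^2 + (\gamma\omega)^2} \tag{2.30}$$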

Under the assumption γ ≪ ω0 (low damping), these functions are plotted in figure 2.1. Since resonators are ubiquitous in physics, it is very much worth it to examine this in detail to really understand it.

First, let’s consider the case where ω ≪ ω0. In this limit the susceptibility is nearly real, and it approaches the value χ0 > 0. This means that the polarization is in sync with the electric field for low frequencies.

Next, consider ω ≫ ω0. Here the susceptibility is also nearly real, and approaches a value of zero from the negative side. This means that when we “overdrive” the resonator, the polarization is 180 degrees out of sync with the electric field.


Figure 2.1: Dependence of the real and imaginary parts of the susceptibility (in units of χ0) as a function of frequency ω. Here the quality factor Q = ω0/γ = 10.

Finally, consider when ω ∼ ω0. For exactly ω = ω0 the susceptibility is completely imaginary, meaning the polarization oscillation is 90 degrees shifted from the electric field. This also corresponds to a sharp peak in the imaginary part, with a full-width-at-half-maximum γ. The height of the maximum is given by Qχ0, where Q = ω0/γ is called the quality factor of the resonator. Note that as the damping γ decreases, the spike in the imaginary part becomes narrower and taller. The real part flips sign at ω0, going from positive to negative.

In many cases we can represent real systems as collections of resonators with a distribution of frequencies. The resonators do not need to be electrons bound to ions; they could be for example two differently charged ions bound to each other (phonons). This even works to some extent for some apparently pathological cases, for example when ω0 = 0. This is the subject of one of the exercises for this week.

Also, although we are here only talking about polarization density, we could make an analogous treatment for magnetization density. The treatment is pretty much the same, although at optical frequencies magnetic resonances are not often important.
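The three frequency regimes discussed above can be checked with a few lines of code. This is a minimal sketch of eq. 2.28 in dimensionless units; the function name `lorentz_chi` and the parameter values (ω0 = 1, γ = 0.1, χ0 = 1, so Q = 10 as in figure 2.1) are choices made for this example.

```python
import numpy as np

def lorentz_chi(omega, omega0=1.0, gamma=0.1, chi0=1.0):
    """Lorentz susceptibility of eq. 2.28 (this sign convention gives Im χ <= 0)."""
    return chi0 * omega0**2 / (omega0**2 - omega**2 + 1j * gamma * omega)

omega0, gamma, chi0 = 1.0, 0.1, 1.0
q_factor = omega0 / gamma

chi_low = lorentz_chi(0.01)    # ω << ω0: nearly real, close to χ0
chi_high = lorentz_chi(100.0)  # ω >> ω0: nearly real, small and negative
chi_res = lorentz_chi(1.0)     # ω = ω0: purely imaginary, magnitude Q χ0

# Estimate the FWHM of the absorption peak -Im χ on a fine frequency grid
omega = np.linspace(0.5, 1.5, 100_001)
absorption = -lorentz_chi(omega).imag
above_half = omega[absorption >= absorption.max() / 2.0]
fwhm = above_half[-1] - above_half[0]

print(chi_low, chi_high, chi_res, fwhm)
```

For this moderate Q the numerically estimated width is close to, but not exactly, γ; the agreement improves as γ/ω0 decreases.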

2.3 Kramers-Kronig relations

Jumping for a moment back to the math of section 1, we defined the frequency dependent susceptibilities as Fourier transforms of what we called time-dependent susceptibilities, which are linear response functions defined by

$$\mathbf{P}(t) = \int_{-\infty}^{\infty}\varepsilon_0\chi(t-t')\mathbf{E}(t')\,dt' \tag{2.31}$$

$$\mathbf{M}(t) = \int_{-\infty}^{\infty}\chi_m(t-t')\mathbf{H}(t')\,dt' \tag{2.32}$$


These response functions are real, and we also noted that causality implies that the response functions must be zero for t < 0. From linear response theory (see for example L. E. Franks, Signal Theory, Prentice Hall, 1981) it can be shown that the real and imaginary parts of χ(ω) (and also χm(ω)) are related by the following integral relations

$$\mathrm{Re}[\chi(\omega)] \equiv \chi'(\omega) = \frac{2}{\pi}\int_0^{\infty}\frac{\omega_1\chi''(\omega_1)}{\omega_1^2 - \omega^2}\,d\omega_1 \tag{2.33}$$

$$\mathrm{Im}[\chi(\omega)] \equiv \chi''(\omega) = \frac{2}{\pi}\int_0^{\infty}\frac{\omega\chi'(\omega_1)}{\omega^2 - \omega_1^2}\,d\omega_1 \tag{2.34}$$

These relations constrain the possible forms for the susceptibility. It is also possible to use these relations to measure the full complex susceptibility if you can measure one component (or some combination) at all frequencies. “All frequencies” is of course not possible practically; however, if the system is composed of a collection of Lorentz resonators (or can be modeled as such) it usually suffices to make measurements for some factor of 10 or so below and above the resonance frequencies. Extrapolations are then used to fill in the unmeasured portions for the K-K transforms.

2.4 Index of refraction in dispersive media

By Fourier transforming Maxwell’s equations (with respect to time) and applying similar arguments as before, we can obtain a frequency domain wave equation

$$\nabla^2\mathbf{E}(\mathbf{r},\omega) = -\frac{\omega^2n(\omega)^2}{c^2}\mathbf{E}(\mathbf{r},\omega) \tag{2.35}$$

where we have introduced a complex valued, frequency-dependent index of refraction

$$n(\omega) = \sqrt{\mu(\omega)\varepsilon(\omega)} \tag{2.36}$$

and we now have

$$\varepsilon(\omega) = 1 + \chi(\omega) \tag{2.37}$$

$$\mu(\omega) = 1 + \chi_m(\omega) \tag{2.38}$$

which are also complex. Very often the index of refraction is split into real and imaginary parts¹

$$n(\omega) = n'(\omega) + in''(\omega) \tag{2.39}$$

Monochromatic plane wave solutions with frequency ω1 look like

$$\mathbf{E}(\mathbf{r},\omega) = \mathbf{C}\,\delta(\omega-\omega_1)e^{-i\mathbf{k}(\omega_1)\cdot\mathbf{r}} \tag{2.40}$$

¹ Note that for the form of Fourier transforms we have chosen, the imaginary part of both the susceptibility and the index of refraction is negative. It is also possible to use a FT convention (where iωt → −iωt) that makes these imaginary parts positive. It makes little difference when calculating physical observables unless you accidentally mix these conventions.


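with

$$k(\omega) = \frac{n(\omega)\omega}{c} \tag{2.41}$$

Going back to the time domain we recover the more familiar

$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0e^{i(\omega_1t - \mathbf{k}(\omega_1)\cdot\mathbf{r})} \tag{2.42}$$

provided we set E0 = C/2π. We can split the wavevector into real and imaginary parts

$$\mathbf{k}(\omega) = \mathbf{k}'(\omega) + i\mathbf{k}''(\omega) \tag{2.43}$$

Substituting this into the time-domain plane wave solution gives

$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0e^{i(\omega_1t - \mathbf{k}'(\omega_1)\cdot\mathbf{r})}e^{\mathbf{k}''(\omega_1)\cdot\mathbf{r}} \tag{2.44}$$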

We can see that the real part of the wavevector corresponds to an oscillating phase factor, whereas the imaginary part leads to a decay of the wave amplitude (since n′′ < 0, which means k′′ is antiparallel to k′).

Let us first examine the oscillatory part, which looks a lot like our equation for a plane wave in a nondispersive medium. The wavelength of the oscillatory part is

$$\lambda(\omega) = 2\pi/|\mathbf{k}'(\omega)|. \tag{2.45}$$

The phase velocity

$$v_p(\omega) = c/n'(\omega) \tag{2.46}$$

is the apparent speed that the phase of the oscillating part moves through the medium.

The imaginary part causes the wave to decay. This causes a loss in the energy carried by the wave as it moves through the medium. The intensity of the wave (averaged over the fast oscillations) is

$$I = \frac{1}{2Z(\omega_1)}|\mathbf{E}_0|^2e^{2\mathbf{k}''(\omega_1)\cdot\mathbf{r}} \tag{2.47}$$

where Z(ω) is the (also complex) impedance. The intensity attenuation factor

$$\alpha(\omega) = -2n''(\omega)\omega/c \tag{2.48}$$

is the inverse of the distance over which the intensity drops by a factor of 1/e. Note that this is for the intensity; the electric field amplitude decays more gradually, with a 1/e distance that is twice as long.
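Equations 2.46 and 2.48 are easy to evaluate for a given complex index. The sketch below is only an illustration: the helper name `propagation_parameters`, the index value n = 1.5 − 10⁻⁶ i and the 800 nm vacuum wavelength are assumed values, not data from the notes. The sign of the imaginary part follows the convention of these notes (n′′ < 0 for an absorbing medium).

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def propagation_parameters(n_complex, wavelength_vac):
    """Return (phase velocity, intensity attenuation factor) from eqs. 2.46 and 2.48."""
    omega = 2.0 * math.pi * C / wavelength_vac
    v_phase = C / n_complex.real                # eq. 2.46
    alpha = -2.0 * n_complex.imag * omega / C   # eq. 2.48, positive when n'' < 0
    return v_phase, alpha

n = 1.5 - 1.0e-6j        # hypothetical weakly absorbing medium
wavelength = 800e-9      # assumed vacuum wavelength in metres

v_p, alpha = propagation_parameters(n, wavelength)
intensity_decay_length = 1.0 / alpha   # distance for the intensity to fall by 1/e
field_decay_length = 2.0 / alpha       # distance for the field amplitude to fall by 1/e

print(v_p, alpha, intensity_decay_length, field_decay_length)
```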

2.5 Phenomenology of n(ω)

For a non-magnetic (µ = 1) medium described well as a collection of identical Lorentz resonators,

$$n(\omega) = \sqrt{1 + \chi(\omega)} = \sqrt{1 + \chi_0\frac{\omega_0^2}{\omega_0^2 - \omega^2 + i\gamma\omega}}. \tag{2.49}$$


Figure 2.2: Dependence of the real and imaginary parts of the index of refraction as a function of frequency ω. Here the quality factor Q = ω0/γ = 10, and χ0 = 1.

See figure 2.2 for a plot, using parameters similar to what we used for the susceptibility.

For most optically transparent media (where the attenuation factor is small), the strongest electronic resonances lie at higher energies. This corresponds to ω ≪ ω0. We also assume γ ≪ ω0. Inspection of figure 2.1 in this limit shows that here the imaginary part of χ is very small, and the real part approaches χ0. Also, dχ′/dω > 0. This in turn means that the index of refraction n has a very small imaginary part, and the slope dn′/dω > 0. The fact that the real part of the index has a positive slope with respect to frequency is pretty common in optical materials, and this condition is called normal dispersion.

Notice that on the resonance the slope of the real part of the susceptibility is different and the imaginary part is large. In this region the derivative of the real part dn′(ω)/dω < 0, and so we have what is called anomalous dispersion. The dispersion returns to “normal” at higher frequencies.

In practice, to find the values for the index of refraction for transparent optical materials we use interpolation formulas with tabulated parameters. Since useful optical materials have very small n′′ in their range of transparency, usually only the real part of the index is treated and n is approximated as purely real. One of the more well-known formulas is the Sellmeier equation, which has the form

$$n^2(\lambda_v) = 1 + \sum_j\frac{B_j\lambda_v^2}{\lambda_v^2 - C_j} \tag{2.50}$$

where the sum is truncated at some point, usually at two or three terms. The Sellmeier equations are given in terms of the wavelength λv in vacuum, not in the medium! In other words, λv = c/ν = 2πc/ω. Note that the Sellmeier equations look a lot like the index of refraction for a collection of Lorentz oscillators at different frequencies (if we re-write in terms of λv). This is not an accident, although in practice it is problematic to assign precise meanings to all the parameters in terms of actual resonators. Parameters for the


equations are tabulated in various places; for example, Saleh gives some parameters for common optical materials on page 180 of the text. A particular set of parameters is valid only over a specified wavelength range, so make sure when doing such a calculation that you satisfy these criteria.
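As a sketch of how eq. 2.50 is used in practice, the code below evaluates a three-term Sellmeier formula. The coefficients are the widely quoted fused silica values attributed to Malitson (1965); they are not taken from these notes, and the function name `sellmeier_index` is a choice for this example. With these coefficients the wavelength must be given in micrometres, and the fit should only be trusted inside the range stated in the original tabulation.

```python
import numpy as np

# Three-term Sellmeier coefficients for fused silica (Malitson, 1965).
# The C_j values are in µm², so wavelengths must be in µm.
B = np.array([0.6961663, 0.4079426, 0.8974794])
C = np.array([0.0684043, 0.1162414, 9.896161]) ** 2

def sellmeier_index(wavelength_um):
    """Real refractive index from eq. 2.50 for vacuum wavelength(s) in µm."""
    lam2 = np.asarray(wavelength_um, dtype=float)[..., None] ** 2
    return np.sqrt(1.0 + np.sum(B * lam2 / (lam2 - C), axis=-1))

n_visible = sellmeier_index(0.5876)          # near the yellow helium d line
n_curve = sellmeier_index([0.4, 0.8, 1.55])  # a few wavelengths at once

print(n_visible, n_curve)
```

The index decreases with increasing wavelength over this range, which is the normal dispersion described above.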

For x-rays, the Sellmeier equation is usually not adequate. Here we mostly rely on tables of measured values (a handy resource is given in the exercises for this week). Usually the very small deviations from n = 1 in the x-ray range are denoted as follows

$$n = 1 - \delta - i\beta \tag{2.51}$$

where δ and β are usually very small for λ ∼ 1 Å, unless the frequency of the EM field happens to be close to an ionization threshold. Very often δ > 0, which means that the phase velocity is greater than c. Is this a problem?

2.6 Pulses: non-monochromatic waves

Monochromatic plane waves by definition are infinite in both space and time. Real electromagnetic waves are always finite in both space and time. In the present section we will discuss plane wave pulses, which are EM waves with a finite time duration and a finite spatial extent in one direction (the propagation direction).

What happens when we superpose two identically polarized, equal amplitude monochromatic plane waves with different frequencies ω1 and ω2? If we look at the time-dependent E field at r = 0 we get something like

$$\mathbf{E}(t) = \mathbf{E}_0[\cos(\omega_1t) + \cos(\omega_2t)] \tag{2.52}$$

If we define ωa = (ω1 + ω2)/2, ∆ω = ω2 − ω1 and E′0 = 2E0, we can re-write this to get

$$\mathbf{E}(t) = \mathbf{E}'_0\cos(\omega_at)\cos(\Delta\omega t/2) \tag{2.53}$$

For ∆ω ≪ ωa this looks like figure 2.3. We have oscillations at the average frequency, but the amplitude of the oscillations is modulated slowly in time by the difference (beat) frequency.

We can continue by adding in more and more waves; see figures 2.4 and 2.5. The general effect of adding more frequencies is to suppress oscillations away from t = 0, or in the case of a constant frequency difference ∆ω, away from times t = 0, ±2π/∆ω, ±4π/∆ω, . . .. If we were to keep adding frequencies like this we would eventually end up with a train of equally spaced single-cycle pulses. This is actually very similar to what happens in a certain type of laser designed to make very short pulses, used quite a bit for time-resolved spectroscopy.
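The superpositions shown in figures 2.3 to 2.5 can be reproduced with a short script. This is a sketch under assumed dimensionless values (ωa = 50, ∆ω = 1, unit amplitudes); the helper `superpose` is a name chosen for this example. It checks the beat formula of eq. 2.53 for two waves and shows that with eight waves the field away from t = 0 is suppressed relative to the peak.

```python
import numpy as np

def superpose(t, omega_a, d_omega, n_waves):
    """Sum of n_waves unit-amplitude cosines, spaced by d_omega and centred on omega_a."""
    offsets = (np.arange(n_waves) - (n_waves - 1) / 2.0) * d_omega
    return np.sum(np.cos(np.outer(omega_a + offsets, t)), axis=0)

omega_a, d_omega = 50.0, 1.0              # assumed values with Δω << ωa
t = np.linspace(-10.0, 10.0, 20_001)

# Two waves: compare the direct sum with the product form of eq. 2.53 (E0 = 1)
two_waves = superpose(t, omega_a, d_omega, 2)
beat_form = 2.0 * np.cos(omega_a * t) * np.cos(d_omega * t / 2.0)

# Eight waves: a sharper burst around t = 0, repeating every 2π/Δω
eight_waves = superpose(t, omega_a, d_omega, 8)
between_bursts = np.abs(eight_waves[(t > 1.0) & (t < 5.0)]).max()

print(np.max(np.abs(two_waves - beat_form)), eight_waves.max(), between_bursts)
```

Increasing `n_waves` further makes each burst shorter while the spacing 2π/∆ω between bursts stays the same.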

So far we have considered only the electric field at r = 0. What about the propagation of such pulses (or pulse “trains”) in space and time? To do this we need to include the added phase and attenuation from the propagation by multiplying the complex field by e^{−ik·r}. For convenience, let the direction of k′ be along the +z axis. Then the added phase term is just e^{−ik′z}. Considering our original example of two monochromatic waves that are


Figure 2.3: Two superposed monochromatic waves with a small difference frequency.

Figure 2.4: Four superposed monochromatic waves with small difference frequencies.


Figure 2.5: Eight superposed monochromatic waves with small difference frequencies.

in phase at t = 0 and z = 0, the position zmax of the maximum of the envelope at an arbitrary time t is just the condition that makes sure the phases are still equal

ω1t− k′(ω1)zmax = ω2t− k′(ω2)zmax (2.54)

Solving for zmax yields

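$$z_{max} = \frac{\omega_2 - \omega_1}{k'(\omega_2) - k'(\omega_1)}\,t \tag{2.55}$$

The peak of the envelope moves with an apparent speed. In the limit where the difference frequency approaches zero, this becomes

$$v_g = \left.\frac{d\omega}{dk'}\right|_{\omega=\omega_a} = \frac{c}{n'(\omega_a) + \omega_a\left.\frac{dn'}{d\omega}\right|_{\omega=\omega_a}} \tag{2.56}$$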

and we refer to vg as the group velocity. In terms of the vacuum wavelength λv

$$v_g(\lambda_v) = \frac{c}{n' - \lambda_v\frac{dn'}{d\lambda_v}} \tag{2.57}$$

We also often talk about a “group index”

$$n_g \equiv c/v_g = c\frac{dk'}{d\omega} = n' + \omega\frac{dn'}{d\omega} = n' - \lambda_v\frac{dn'}{d\lambda_v} \tag{2.58}$$

These expressions for group velocity also give accurate values for the speed of the envelope for pulses with more than two frequency components, as long as the group velocity does not change much over the spectral bandwidth of the pulse. Small deviations from this ideal condition lead to broadening of the pulse as it moves through the material, since different pairings of frequencies have envelopes moving at slightly different speeds. Changes in


the group velocity vg or (equivalently) the group index ng with frequency are called “group velocity dispersion”, often abbreviated as GVD. The case where

$$\frac{dn_g}{d\omega} > 0 \tag{2.59}$$

or equivalently

$$\frac{d^2n'}{d\lambda_v^2} > 0 \tag{2.60}$$

is called “positive” group dispersion, whereas

$$\frac{dn_g}{d\omega} < 0 \tag{2.61}$$

or

$$\frac{d^2n'}{d\lambda_v^2} < 0 \tag{2.62}$$

is “negative.” Note that for a Lorentz oscillator (see figure 2.2), for frequencies below the resonance there is positive group dispersion, whereas well above the resonance the group dispersion is negative. Since it leads to changes in the shapes of pulses, group dispersion is a critical concept for ultrafast laser physics, and also for optical communication.
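The two forms of the group index in eq. 2.58 can be compared numerically. The sketch below uses the single-resonance Lorentz index of eq. 2.49 with assumed dimensionless parameters (ω0 = 10, γ = 0.1, χ0 = 1, c = 1) and evaluates the derivatives by central finite differences; all function names are choices made for this example.

```python
import numpy as np

C = 1.0  # work in units where c = 1

def n_real(omega, omega0=10.0, gamma=0.1, chi0=1.0):
    """Real part n' of the Lorentz-model index, eq. 2.49 (assumed parameters)."""
    chi = chi0 * omega0**2 / (omega0**2 - omega**2 + 1j * gamma * omega)
    return np.sqrt(1.0 + chi).real

def group_index_from_omega(omega, h=1e-5):
    """n_g = n' + ω dn'/dω, derivative by central difference."""
    dn_domega = (n_real(omega + h) - n_real(omega - h)) / (2.0 * h)
    return n_real(omega) + omega * dn_domega

def group_index_from_wavelength(omega, h=1e-5):
    """n_g = n' - λ_v dn'/dλ_v, with λ_v = 2πc/ω."""
    lam = 2.0 * np.pi * C / omega
    n_of_lam = lambda l: n_real(2.0 * np.pi * C / l)
    dn_dlam = (n_of_lam(lam + h) - n_of_lam(lam - h)) / (2.0 * h)
    return n_of_lam(lam) - lam * dn_dlam

omega = 2.0                       # well below the resonance (normal dispersion)
n_phase = n_real(omega)
ng_omega = group_index_from_omega(omega)
ng_lambda = group_index_from_wavelength(omega)

print(n_phase, ng_omega, ng_lambda)
```

Below the resonance the group index exceeds n′, so the envelope travels more slowly than the phase fronts.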

Wave packets and time-bandwidth product

We already have a recipe for making a pulse: just add up a bunch of plane waves with slightly different frequencies. In general, E(ω) can take many forms, and is called the spectrum of the light. Usually the spread of frequencies in the spectrum is much smaller than the average frequency ω0. The time-dependent E field is just an inverse Fourier transform away:

$$\mathbf{E}(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathbf{E}(\omega)e^{i\omega t}\,d\omega \tag{2.63}$$

As we might suspect from the previous section, the complex phase of E(ω) as a function of frequency is very important in determining the ultimate time duration of a pulse with a particular spectral power distribution. A constant phase leads to the shortest pulses (called transform limited pulses).

When discussing pulse durations, we typically use the intensity envelope function I(t), where we average the energy/area/time over the fast oscillations of the electric field but not over the slow envelope that defines the pulse. The intensity envelope is proportional to the magnitude of the electric field envelope squared. In the frequency domain Iω(ω) ∝ |E(ω)|².²

² Note that the time-domain intensity envelope function is not the inverse Fourier transform of Iω(ω)!

For pulses like this the width of the intensity time envelope is inversely proportional to the spectral bandwidth. The time duration is often reported as the “full width at half maximum” (FWHM): if the intensity at the maximum of the pulse is I0, the FWHM is the difference between the times on either side of the maximum where the intensity is half as strong. A similar definition holds for the FWHM of the power spectrum. The product of the two FWHM’s is called the time bandwidth product, and it is often used as a measure of how optimally shaped a short laser pulse is. The time bandwidth product for a number of different pulse shapes is given in table 2.1.

Name                        Intensity envelope function      ∆ν∆t
Gaussian                    e^{−at²}                         2 ln 2/π = 0.4413
Hyperbolic secant squared   sech²(at)                        0.3148
Flat top                    1 if |t| < a/2, otherwise 0      0.8859
Lorentz                     1/(1 + at²)                      0.2206

Table 2.1: Time-bandwidth product for a few common pulse shapes. We use ∆ν rather than ∆ω here since the former is more common.
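The Gaussian entry of table 2.1 can be reproduced numerically. The following is a sketch, not a general-purpose tool: it assumes a transform-limited pulse (constant spectral phase), an intensity envelope e^{−at²} with a = 1, and a helper `fwhm` (a name chosen here) that finds the half-maximum crossings of a single-peaked curve by linear interpolation.

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a single-peaked sampled curve y(x)."""
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    i0, i1 = above[0], above[-1]
    # linear interpolation of the two half-maximum crossings
    left = np.interp(half, [y[i0 - 1], y[i0]], [x[i0 - 1], x[i0]])
    right = np.interp(half, [y[i1 + 1], y[i1]], [x[i1 + 1], x[i1]])
    return right - left

a = 1.0
n = 2**16
t = np.linspace(-200.0, 200.0, n, endpoint=False)
dt = t[1] - t[0]

field_envelope = np.exp(-a * t**2 / 2.0)   # so that the intensity is exp(-a t²)
intensity_t = field_envelope**2

# Power spectrum of the field envelope on a frequency axis ν (not ω)
spectrum = np.fft.fftshift(np.fft.fft(field_envelope))
nu = np.fft.fftshift(np.fft.fftfreq(n, d=dt))
intensity_nu = np.abs(spectrum)**2

time_bandwidth_product = fwhm(t, intensity_t) * fwhm(nu, intensity_nu)
print(time_bandwidth_product, 2.0 * np.log(2.0) / np.pi)
```

Adding a frequency-dependent phase to `spectrum` before transforming back would lengthen the pulse in time while leaving the spectral width unchanged, which increases the product.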

As these constant phase pulses propagate in a medium with group velocity dispersion, a phase difference gradually appears over the spectrum. In the time-domain, this results in a broadening of the pulse and a change in the apparent average frequency over the duration of the pulse. This is called “chirp” since it is reminiscent of a bird song. Chirp can be positive or negative, depending on the sign of the GVD. Manipulating the chirp of a laser pulse is a key technology in modern ultrafast laser systems, allowing for huge peak intensities to be created [see e.g. Strickland, Bado, Pessot and Mourou, J. Quant. Elec. 24, p. 398 (1988)].

In communication contexts, the group velocity dispersion is often reported as a dispersion parameter

$$D = -\frac{1}{v_g^2}\frac{dv_g}{d\lambda_v} = -\frac{\lambda_v}{c}\frac{d^2n'}{d\lambda_v^2} \tag{2.64}$$

which has units of time/distance². The dispersion parameter can be used to estimate the chirp-induced broadening of an initially transform-limited pulse as τ ≈ √(∆t² + (D∆λv∆z)²), where ∆t is the initial duration, ∆λv is the bandwidth in wavelength and ∆z is the propagation distance.

2.7 Group, phase and front velocity

We’ve already seen that for x-rays, the phase velocity exceeds c. What about group velocity? It turns out that there are some circumstances where it, too, can exceed c. If you turn again to the behavior of n for a Lorentz oscillator (figure 2.2), you will notice that near the resonance there is a very sharp slope, where dn′/dω can be very negative, depending on the quality factor. According to equation 2.58, for a large enough negative value of this derivative, we can force ng < 1, which means that vg > c. This has in fact been demonstrated experimentally (e.g. L. J. Wang, A. Kuzmich, and A. Dogariu, “Gain-assisted superluminal light propagation,” Nature, vol. 406, p. 277, 2000).


Figure 2.6: (a) A transform-limited pulse (constant phase spectrum); (b) a pulse with positive chirp.

Is this all compatible with relativity? Does this mean we can build time machines from Lorentz oscillators? Well, no. It turns out that neither the phase nor the group velocity are necessarily good measures for how fast information can be transmitted optically. The phase velocity we can just think of as the speed of a zero-crossing of a wave, and this does not really contain any new information. The group velocity is the velocity of the envelope maximum, but in principle this is also not really information, since for a smooth, analytic pulse shape (like a Gaussian) if we are careful about measuring the intensity at early times, we can do a Taylor-series expansion and extrapolate from this early data where the pulse maximum will be.

So, how do we then measure the maximum speed of information flow in an optical system? The answer is the front velocity, which is the speed of how a non-analytic feature in the electromagnetic field propagates. This is always equal to c, regardless of medium [see e.g. Stratton, Electromagnetic Theory, Wiley (2007), section 5.18].


Chapter 3

Interfaces, interference and coherence

Learning objectives

Much of this week will be review, and we will not be covering everything discussed here in the actual lectures. In class we will be emphasizing the practical implications a bit more.

After completing this section of the course, you should be able to

• Calculate the reflection and transmission of EM waves across interfaces between isotropic, linear media (eqs. 3.18-3.21)

• Given the electric field reflectivity and transmission coefficients, calculate the corresponding intensity and power coefficients (eqs. 3.26-29)

• Explain the mathematical and physical meaning of Brewster’s angle (eq. 3.36).

• Identify the conditions for total reflection

• Describe some measures of temporal coherence in the context of light (eqs. 3.44,3.47)

3.1 Boundary conditions for isotropic media

Reference: Saleh and Teich, pp. 209-215

Earlier we have considered the case where light propagates in a uniform medium without boundaries. This week we will consider the case of plane boundaries between different media.

To figure out how this works, we need to again look at Maxwell’s equations to find relationships between the fields at such boundaries. In integral form, the curl equations become

$$\int\mathbf{E}\cdot d\mathbf{l} = -\iint\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{A} \tag{3.1}$$


Figure 3.1: Curve used to integrate Maxwell’s equations.

$$\int\mathbf{H}\cdot d\mathbf{l} = \iint\left(\mathbf{j}_f + \frac{\partial\mathbf{D}}{\partial t}\right)\cdot d\mathbf{A} \tag{3.2}$$

To get boundary conditions for the interface, we make the integral on the left hand side over a curve as depicted in figure 3.1, where the distance d extending across the interface is considered as infinitesimally small: d → 0. The area spanning the curve is then also infinitesimally small, and so we obtain boundary conditions for the components of E and H parallel to the interface (assuming no surface current):

(E1)‖ − (E2)‖ = 0 (3.3)

(H1)‖ − (H2)‖ = 0 (3.4)

Which just says that E‖ and H‖ are continuous across the interface.

Now consider the case where a monochromatic plane wave in a linear medium

$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_1e^{i(\omega t - \mathbf{k}_1\cdot\mathbf{r})} \tag{3.5}$$

encounters such an interface. It is clear that just this wave is not adequate to satisfy the boundary conditions for E and H as long as ε or µ is different in the two media, since in medium 2 (the medium after the interface) this is not a valid solution to the wave equation. Consequently, we need to introduce additional waves. Since the boundary conditions obviously require that there be electric and magnetic fields in medium 2, we would expect there to be at least one wave in medium 2 propagating away from the interface. These are called transmitted or refracted waves. There may also be waves propagating away from the interface on the side of medium 1. These are called reflected waves.

In general, then, the boundary conditions are (for a boundary containing the origin in a global reference frame)

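$$\sum_j\mathbf{E}_\parallel^{(1,j)}e^{i(\omega^{(1,j)}t - \mathbf{k}_\parallel^{(1,j)}\cdot\mathbf{r}_\parallel)} = \sum_j\mathbf{E}_\parallel^{(2,j)}e^{i(\omega^{(2,j)}t - \mathbf{k}_\parallel^{(2,j)}\cdot\mathbf{r}_\parallel)} \tag{3.6}$$

$$\sum_j\mathbf{H}_\parallel^{(1,j)}e^{i(\omega^{(1,j)}t - \mathbf{k}_\parallel^{(1,j)}\cdot\mathbf{r}_\parallel)} = \sum_j\mathbf{H}_\parallel^{(2,j)}e^{i(\omega^{(2,j)}t - \mathbf{k}_\parallel^{(2,j)}\cdot\mathbf{r}_\parallel)} \tag{3.7}$$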


for r on the boundary. Here we have identified the incident wave with E(1,1) = E1 and ω(1,1) = ω and so on.

How many waves do we really need? Here we make another assumption, which is actually more of a clarification: we assume that the energy flux (Poynting vector) of all waves except the incident wave is directed away from the interface. This is really just a way of stating that there is only one incident wave. Mathematically, this (usually) means that the real part of the component of the wavevector normal to the interface must be pointing away from the interface plane.¹

Next, we note that in order to satisfy equations 3.6 and 3.7 for all time, we need in general two additional waves at the same frequency ω as the incident wave, at least one of which must be in medium 2. Additionally, to be able to satisfy the boundary conditions for all points on the interface, the k‖ components must also be the same for all these waves. Since the magnitude k of the wavevector is determined by the frequency ω and the index of refraction of each medium, this completely defines the wavevector of each wave up to the sign of the component perpendicular to the interface. From our above assumption/clarification this is also known: for all waves except the incident wave the real part of the perpendicular component must point away from the interface.² This means that there can be only one such wave in medium 2, and one reflected wave in medium 1.

What about other frequencies? For any given frequency not equal to that of the incident wave, we will have a problem. Following similar arguments, we can only have one wave in each medium for a particular k‖ that each propagate away from the interface. This number of waves gives only one solution to the boundary conditions: namely, all fields are zero! This is a consequence of the linearity assumption. When considering nonlinear contributions, interface effects can often lead to additional frequencies (second harmonic, etc.).

Returning to the problem at hand, we will simplify life a bit by renaming our waves and considering only the three useful ones, as depicted in figure 3.2. From our above arguments, we have essentially determined the direction and magnitude of all the wavevectors. This results in the familiar conditions

$$\theta_3 = \theta_1 \tag{3.8}$$

and

$$n_2\sin\theta_2 = n_1\sin\theta_1 \tag{3.9}$$

where θj = cos⁻¹(|kj · n|) and n is the interface normal. This last condition is the famous Snell’s law, which you will note works also for complex angles and indices.

For convenience we will now adopt a convention where the electric field polarization for each wave is defined in reference to an orthogonal coordinate system for each wave as shown in figure 3.2, where the ±z direction is the direction of propagation. Note that for the reflected beam the y-axis direction is chosen so that its projection into the interface points along the same direction as the other +y axes.

¹ There are strange exceptions for “negative index” materials near resonances, but these are difficult to achieve in practice and usually only for long wavelengths. For an example, see Shelby, Smith and Schultz, Science 292, pp. 77-79 (2001)

² If the real part of the wavevector perpendicular to the interface is zero, then we pick the solution that leads to decay of the field as we move away from the interface.

Figure 3.2: Sketch of a monochromatic wave reflected from and transmitted through an interface, showing the coordinate system used for each wave.

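Our boundary conditions then become

$$E_{1x} + E_{3x} = E_{2x} \tag{3.10}$$

$$E_{1y}\cos\theta_1 + E_{3y}\cos\theta_1 = E_{2y}\cos\theta_2 \tag{3.11}$$

$$H_{1x} + H_{3x} = H_{2x} \tag{3.12}$$

$$H_{1y}\cos\theta_1 + H_{3y}\cos\theta_1 = H_{2y}\cos\theta_2 \tag{3.13}$$

Recall from the week 1 lectures that $\mathbf{H} = \frac{1}{Z}\hat{\mathbf{k}} \times \mathbf{E}$, which means that we can re-write the last two conditions to be

$$\frac{1}{Z_1}E_{1y} - \frac{1}{Z_1}E_{3y} = \frac{1}{Z_2}E_{2y} \tag{3.14}$$

$$\frac{1}{Z_1}E_{1x}\cos\theta_1 - \frac{1}{Z_1}E_{3x}\cos\theta_1 = \frac{1}{Z_2}E_{2x}\cos\theta_2 \tag{3.15}$$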

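Solving we get

$$\begin{pmatrix} E_{2x} \\ E_{2y} \end{pmatrix} = \begin{pmatrix} t_s & 0 \\ 0 & t_p \end{pmatrix} \begin{pmatrix} E_{1x} \\ E_{1y} \end{pmatrix} \tag{3.16}$$

$$\begin{pmatrix} E_{3x} \\ E_{3y} \end{pmatrix} = \begin{pmatrix} r_s & 0 \\ 0 & r_p \end{pmatrix} \begin{pmatrix} E_{1x} \\ E_{1y} \end{pmatrix} \tag{3.17}$$

where

$$r_s = \frac{Z_2\cos\theta_1 - Z_1\cos\theta_2}{Z_2\cos\theta_1 + Z_1\cos\theta_2} \tag{3.18}$$

$$t_s = 1 + r_s \tag{3.19}$$

$$r_p = \frac{Z_2\cos\theta_2 - Z_1\cos\theta_1}{Z_2\cos\theta_2 + Z_1\cos\theta_1} \tag{3.20}$$

$$t_p = (1 + r_p)\frac{\cos\theta_1}{\cos\theta_2} \tag{3.21}$$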

Light with an electric field polarized only along the x-direction is called s-polarized, since this is perpendicular (senkrecht) to the plane of incidence (the plane spanned by the interface normal and the wavevectors). Conversely, light polarized along y is called p-polarized.³

People sometimes also call s-polarized light transverse electric (TE) and the p-polarized case transverse magnetic (TM), although this is less common (and sometimes the assignments are reversed!).

Staring at these equations for a while gives you some idea of why Z is called the impedance. If $Z_2/Z_1 \to \infty$ or $Z_2/Z_1 \to 0$ we have complete reflection (with different reflection phases). For $Z_1 \approx Z_2$ there is efficient transmission. This is analogous to general transmission lines in electronics and acoustics.

In the pretty common case where both media are non-magnetic such that µ = 1, $Z = Z_0/n$ and we get the Fresnel equations

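$$r_s = \frac{n_1\cos\theta_1 - n_2\cos\theta_2}{n_1\cos\theta_1 + n_2\cos\theta_2} \tag{3.22}$$

$$t_s = 1 + r_s \tag{3.23}$$

$$r_p = \frac{n_1\cos\theta_2 - n_2\cos\theta_1}{n_1\cos\theta_2 + n_2\cos\theta_1} \tag{3.24}$$

$$t_p = (1 + r_p)\frac{\cos\theta_1}{\cos\theta_2} \tag{3.25}$$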

Usually, the most useful of these are the reflection Fresnel coefficients ($r_s$ and $r_p$).

The above give boundary conditions for the electric field components of the light. What about the power (energy/time)? In the incident medium, the ratio of reflected power to incident power is

$$R_s = |r_s|^2 \tag{3.26}$$

for s-polarized light, and

$$R_p = |r_p|^2 \tag{3.27}$$

for p-polarized light. Energy conservation then implies the transmission power ratios are

$$T_s = 1 - R_s \tag{3.28}$$

$$T_p = 1 - R_p. \tag{3.29}$$

³ Curiously, for x-rays you quite often see σ- and π- instead of s- and p- to describe polarization.

Note that in general $T_s \neq |t_s|^2$ and $T_p \neq |t_p|^2$ since the wave in the second medium travels with a different impedance and angle (the beam is either stretched or compressed by the refraction, causing the cross-sectional area of the beam to change). Taking these into account, the correct formulas are (for µ = 1 and real $\cos\theta_2$)

$$T_{s,p} = \frac{n_2\cos\theta_2}{n_1\cos\theta_1}|t_{s,p}|^2 \tag{3.30}$$
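As a quick numerical check of equations 3.22 to 3.30, the following sketch evaluates the Fresnel coefficients and confirms that R + T = 1 for both polarizations. The function names and the chosen indices (n1 = 1, n2 = 1.55, 40° incidence) are illustrative assumptions.

```python
import cmath
import math

def fresnel(n1, n2, theta1):
    """Return (rs, ts, rp, tp) from eqs. 3.22-3.25 (non-magnetic media)."""
    cos1 = cmath.cos(theta1)
    sin2 = n1 * cmath.sin(theta1) / n2        # Snell's law, eq. 3.9
    cos2 = cmath.sqrt(1 - sin2 * sin2)
    rs = (n1 * cos1 - n2 * cos2) / (n1 * cos1 + n2 * cos2)
    rp = (n1 * cos2 - n2 * cos1) / (n1 * cos2 + n2 * cos1)
    ts = 1 + rs
    tp = (1 + rp) * cos1 / cos2
    return rs, ts, rp, tp

def power_ratios(n1, n2, theta1):
    """Return (Rs, Ts, Rp, Tp) using eqs. 3.26, 3.27 and 3.30.

    Assumes real indices and a real refraction angle (no total reflection).
    """
    rs, ts, rp, tp = fresnel(n1, n2, theta1)
    cos1 = math.cos(theta1)
    cos2 = math.sqrt(1 - (n1 * math.sin(theta1) / n2) ** 2)
    geometry = n2 * cos2 / (n1 * cos1)        # impedance and beam-area factor
    return (abs(rs) ** 2, geometry * abs(ts) ** 2,
            abs(rp) ** 2, geometry * abs(tp) ** 2)

Rs, Ts, Rp, Tp = power_ratios(1.0, 1.55, math.radians(40.0))
print(Rs + Ts, Rp + Tp)   # both sums should be 1 up to rounding
```

Dropping the factor `geometry` (that is, using |t|² alone) breaks the energy balance, which is the point of equation 3.30.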

3.2 Brewster’s angle

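From equation 3.24 for the p-polarized reflectivity, the reflectivity is zero for

$$n_1\cos\theta_2 = n_2\cos\theta_1 \tag{3.31}$$

We will suppose that $n_1$ and $n_2$ are both real. We then can obtain

$$\cos^2\theta_2 = \left(\frac{n_2\cos\theta_1}{n_1}\right)^2 \tag{3.32}$$

From Snell’s law

$$\sin^2\theta_2 = \left(\frac{n_1\sin\theta_1}{n_2}\right)^2 \tag{3.33}$$

Adding them gives

$$\sin^2\theta_2 + \cos^2\theta_2 = 1 = \sin^2\theta_1 + \cos^2\theta_1 = \left(\frac{n_1\sin\theta_1}{n_2}\right)^2 + \left(\frac{n_2\cos\theta_1}{n_1}\right)^2 \tag{3.34}$$

which then can be written

$$\left(\frac{n_1^2 - n_2^2}{n_2^2}\right)\sin^2\theta_1 = \left(\frac{n_1^2 - n_2^2}{n_1^2}\right)\cos^2\theta_1 \tag{3.35}$$

so that $\tan^2\theta_1 = n_2^2/n_1^2$ and

$$\theta_1 = \theta_B = \tan^{-1}(n_2/n_1) \tag{3.36}$$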

where $\theta_B$ is called Brewster’s angle.

Figure 3.3 shows $r_p$ as a function of $\theta_1$ for $n_1 = 1$ (for vacuum or approximately for air) and $n_2 = 1.55$ (typical for glass). For $\theta_1 < \theta_B \approx 57^\circ$ we have $r_p < 0$, and for $\theta_1 > \theta_B$ we have $r_p > 0$. This implies there is a phase shift by π when increasing the incidence angle above Brewster’s angle.
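The numbers quoted for figure 3.3 can be reproduced with a short sketch. The helper names are assumptions made for illustration; the formulas are equations 3.24 and 3.36 for real indices.

```python
import math

def brewster_angle(n1, n2):
    """Brewster's angle from eq. 3.36, in radians."""
    return math.atan2(n2, n1)

def r_p(n1, n2, theta1):
    """p-polarized reflection coefficient, eq. 3.24 (real indices, no total reflection)."""
    cos1 = math.cos(theta1)
    cos2 = math.sqrt(1 - (n1 * math.sin(theta1) / n2) ** 2)
    return (n1 * cos2 - n2 * cos1) / (n1 * cos2 + n2 * cos1)

theta_b = brewster_angle(1.0, 1.55)
print(math.degrees(theta_b))            # close to 57 degrees
print(r_p(1.0, 1.55, theta_b))          # vanishes up to rounding
print(r_p(1.0, 1.55, theta_b - 0.1))    # negative below Brewster's angle
print(r_p(1.0, 1.55, theta_b + 0.1))    # positive above Brewster's angle
```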


Figure 3.3: Reflection coefficient rp vs. incidence angle for n1 = 1 and n2 = 1.55.

The refracted ray for light incident at Brewster’s angle makes an angle of 90° with respect to the wavevector of the (suppressed) reflected wave, as shown in figure 3.4. This is no coincidence. In general, we can consider the reflection from an interface to be the result of re-emission of radiation from the near-surface region of the refracted ray. At Brewster’s angle for p-polarization the electric field of the refracted ray at the surface points exactly in the direction of the reflection. The polarization induced by the field is also pointing along this direction. The radiation of an oscillating dipole is zero along the axis of the dipole (see figure 3.5). Consequently, the reflected wave is completely suppressed.

Brewster’s angle is frequently used in the design of laser cavities, since it is an easy way to suppress optical losses due to reflectivity for polarized light. For example, solid state lasers that use crystals for optical gain often have the crystals cut so that the light in the laser hits the surfaces at Brewster’s angle. Some types of polarizers consist of many plates of glass or a similar transparent material at Brewster’s angle. The effect of propagating light through such a stack of plates is to reflect s-polarized light and to transmit p-polarized light.

For the above we assumed that the indices of refraction are real. What happens if one or both are complex? In this case, $\theta_B$ is complex, and no incidence angle will completely suppress reflection. There is, nevertheless, a minimum in reflectivity for $\tan\theta_1 = \mathrm{Re}[n_2/n_1]$.

3.3 Total reflection

If (now again assuming real indices of refraction) $n_1 > n_2$,

$$\cos\theta_2 = \sqrt{1 - \sin^2\theta_2} = \sqrt{1 - \frac{n_1^2}{n_2^2}\sin^2\theta_1} \tag{3.37}$$

is purely imaginary for $\theta_1 > \sin^{-1}(n_2/n_1)$. This makes the reflection coefficients

$$r_s = \frac{n_1\cos\theta_1 - n_2\cos\theta_2}{n_1\cos\theta_1 + n_2\cos\theta_2} \tag{3.38}$$

$$r_p = \frac{n_1\cos\theta_2 - n_2\cos\theta_1}{n_1\cos\theta_2 + n_2\cos\theta_1} \tag{3.39}$$

complex, but with a magnitude equal to unity. This leads to the phenomenon of total reflection.

Figure 3.4: Sketch of refraction at Brewster’s angle.

Figure 3.5: Radiation pattern of a dipole.

Figure 3.6: Sketch of Brewster’s angle optics in a laser cavity.

Figure 3.7: Sketch of total internal reflection in various situations (from page 10 of Saleh and Teich). (a) From a planar boundary. (b) From the interior of a prism. This is sometimes used as a broadband mirror. (c) In an optical fiber, total internal reflection is used to guide light along the interior.

This is practically quite an important phenomenon. Some examples are shown in figures 3.7, 3.8 and 3.9.

Since the power reflectivity for these conditions is unity, we expect no propagating transmitted wave across the interface. The transmission coefficients

$$t_s = 1 + r_s \tag{3.40}$$

and

$$t_p = (1 + r_p)\frac{\cos\theta_1}{\cos\theta_2} \tag{3.41}$$

are, however, not zero. What does this mean?

From the definition of $\theta_2$, the wavevector $\mathbf{k}_2$ for the refracted ray in this case must satisfy

$$\hat{\mathbf{k}}_2 \cdot \hat{\mathbf{n}} = \cos\theta_2. \tag{3.42}$$

Since $\cos\theta_2$ is purely imaginary, this means that the real part of $\mathbf{k}_2$ points along the interface. This in turn means that the oscillatory, propagating part of the wave has no component traveling into medium 2. The electric field as a function of distance from the interface just decays exponentially, even though we have purely real indices of refraction. We call this an evanescent wave with an attenuation (in field)

$$\gamma = -i\,2\pi n_2\cos\theta_2/\lambda_v = \frac{2\pi}{\lambda_v}\sqrt{n_1^2\sin^2\theta_1 - n_2^2} \tag{3.43}$$
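To get a feeling for the size of γ, here is a minimal sketch that evaluates equations 3.38, 3.39 and 3.43 for an assumed glass-air interface (n1 = 1.5, n2 = 1) at 45° incidence and an assumed vacuum wavelength of 633 nm. The function name and all numerical values are illustrative choices.

```python
import cmath
import math

def total_reflection(n1, n2, theta1, wavelength_vac):
    """Return (rs, rp, gamma) for incidence beyond the critical angle.

    gamma is the field attenuation constant of eq. 3.43; the math.sqrt
    call below requires n1*sin(theta1) > n2.
    """
    sin2 = n1 * math.sin(theta1) / n2
    cos2 = cmath.sqrt(1 - sin2 ** 2)      # purely imaginary in this regime
    cos1 = math.cos(theta1)
    rs = (n1 * cos1 - n2 * cos2) / (n1 * cos1 + n2 * cos2)
    rp = (n1 * cos2 - n2 * cos1) / (n1 * cos2 + n2 * cos1)
    gamma = (2 * math.pi / wavelength_vac
             * math.sqrt(n1 ** 2 * math.sin(theta1) ** 2 - n2 ** 2))
    return rs, rp, gamma

rs, rp, gamma = total_reflection(1.5, 1.0, math.radians(45.0), 633e-9)
print(abs(rs), abs(rp))     # both magnitudes equal 1
print(1 / gamma)            # 1/e decay length of the field in metres
```

For these assumed values the field decays over a distance of roughly 0.45 vacuum wavelengths, which sets the scale for how close a second surface must be to interact with the evanescent wave.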

Figure 3.8: Sketch of total internal reflection used as a “retroreflector”. Total internal reflection from the inside of a cube corner prism causes incoming light rays to be reflected back along an anti-parallel path. These and related devices are often used to delay light by specific amounts.

Figure 3.9: Total internal reflection is used to make the compact optics in binoculars.

Figure 3.10: Frustrated total internal reflection for two prisms at a sufficiently close distance. (Panels: total internal reflection; frustrated total internal reflection. The prisms must be very close, but need not touch: the extent of the evanescent wave beyond the surface of the first prism determines how close the second prism must be.)

Figure 3.11: Frustrated total internal reflection used to make images of fingerprints. (The ridges of a finger are locations where total internal reflection is frustrated, so no light is reflected from there; between the ridges total internal reflection persists and light is reflected.)

The fact that the evanescent field extends beyond the interface can be exploited if we introduce another interface within this short distance. This can result in “frustrated” total internal reflection, provided the new interface allows for propagating solutions. In this situation light can “tunnel” through the gap where propagating solutions to the wave equation are not allowed (see figures 3.10 and 3.11).

The concept of frustration and extinction is of importance when designing waveguides that use total internal reflection (e.g. in optical fibers).

3.4 Longitudinal coherence

Reference: Saleh and Teich, pp. 405-413

The coherence of light is, qualitatively, the ability of light to interfere. Brightness combined with coherence is a primary distinguishing characteristic of light from a laser versus that from a thermal source (like a lightbulb). Coherence is naturally divided into two types, called longitudinal and transverse. These terms reference directions parallel or perpendicular to the wave propagation direction. This section is a brief introduction to longitudinal (or temporal) coherence, which we will follow up with next week when we talk about interferometers. We will talk about transverse coherence a bit later.

Mathematically, (first order) coherence in general is defined in terms of the mutual coherence function

$$\Gamma(\mathbf{r}_1, t_1; \mathbf{r}_2, t_2) = \langle \mathbf{E}^*(\mathbf{r}_1, t_1) \cdot \mathbf{E}(\mathbf{r}_2, t_2) \rangle \tag{3.44}$$

where $\mathbf{E}$ is the complex electric field, and here the brackets indicate an ensemble average (similar to that in quantum mechanics). Transverse coherence refers to the properties of this function for $t_1 = t_2$; longitudinal or temporal coherence refers to the properties at $\mathbf{r}_1 = \mathbf{r}_2$.

If we assume that the coherence function does not depend on the absolute times $t_1$ and $t_2$ but only on the difference $\tau = t_2 - t_1$ (the field is “stationary”) we can write the coherence function as

$$\Gamma(\mathbf{r}_1, \mathbf{r}_2, \tau) = \Gamma(\mathbf{r}_1, t_1, \mathbf{r}_2, t_1 + \tau) \tag{3.45}$$

For $\mathbf{r} = \mathbf{r}_1 = \mathbf{r}_2$ we recover the temporal/longitudinal coherence function

$$G(\mathbf{r}, \tau) = \Gamma(\mathbf{r}, \mathbf{r}, \tau) \tag{3.46}$$

The complex degree of temporal coherence is defined as

$$g(\tau) = \frac{G(\tau)}{G(0)} \tag{3.47}$$

For non-stationary fields this depends on the absolute time $t_1$.

For nearly monochromatic, stationary light with frequency ω, $g(\tau) = h(\tau)e^{i\omega\tau}$ where $h(\tau)$ is a slowly varying function of τ. In this case $|h(\tau)|$ will generally decay from 1 to zero with some characteristic time scale, often modeled as an exponential with time constant $\tau_{coh}$ called the coherence time of the light. This is related to the spectral width $\Delta\nu \sim 1/\tau_{coh}$. Note that this is for stationary, not pulsed light. Pulsed light also has a limited coherence time, upward-bounded by the pulse duration. We’ll explore these issues more next week when discussing how we measure temporal coherence experimentally.


Chapter 4

Interferometry

Learning objectives

After completing this section of the course, you should be able to

• Identify and recall features of basic types of interferometers

• Explain how interferometers measure coherence properties of light

• Analyze the data from an interferometer to extract coherence time

• Identify the basic features of a Fabry-Perot etalon

• Use the scattering and wave propagation matrix formalism to describe scattering from multiple parallel interfaces (eqs. 4.12, 4.13)

4.1 More about coherence

See also: Saleh and Teich, chapter 11

At the very end of last week we started to discuss aspects of temporal coherence theory. Although there is quite a bit more to this theory (discussed in more detail in quantum optics), for now we would like to tie these bare bones of the theory to experimental measurements.

As you may recall, the hand-waving definition of coherence is the “ability to interfere.” Last week we discussed some mathematical properties of ensembles of fields. Since there was some confusion about what an ensemble means in this context, we’ll start here by clarifying it.

First, we will consider again our favorite kind of light, the monochromatic linearly polarized plane wave. Figure 4.1 shows the time dependence of an ensemble of such waves, each with the same frequency but with random phases. If we arbitrarily make $t_1 = 0$, $\langle E^*(t_1) \cdot E(t_1 + \tau) \rangle = E_0^2\cos(\omega\tau)/2$. The complex degree of temporal coherence is then a quickly oscillating function of τ that never decays, regardless of τ. Normally we factor out the fast oscillations (which we would automatically do if we had used complex amplitudes) and we would say that the coherence time is infinite since there is no decay.

Figure 4.1: An ensemble of monochromatic waves. Here this just means that we average over a bunch of waves with a random relative phase.
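The ensemble average for the monochromatic case can be checked with a small sketch. As a deterministic stand-in for random phases it averages over evenly spaced phases, which is an assumption made here so that the result is reproducible; the function name and parameter values are likewise illustrative.

```python
import math

def ensemble_correlation(omega, tau, n_phases=360, e0=1.0):
    """Average E(0)*E(tau) over real waves E(t) = e0*cos(omega*t + phi).

    The phases phi are spread evenly over [0, 2*pi) in place of random draws.
    """
    total = 0.0
    for j in range(n_phases):
        phi = 2 * math.pi * j / n_phases
        total += e0 * math.cos(phi) * e0 * math.cos(omega * tau + phi)
    return total / n_phases

omega = 2 * math.pi          # one oscillation per unit time (assumed)
for tau in (0.0, 0.25, 0.5, 10.25):
    print(tau, ensemble_correlation(omega, tau), math.cos(omega * tau) / 2)
```

The two printed columns agree for every delay, including large ones: the correlation oscillates but never decays, which is the sense in which the coherence time is infinite.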

Figure 4.2 shows an ensemble of fields from a random source. The electric field values are truly random, but high frequencies are filtered out by the physical process of how the light was generated (e.g. blackbody radiation), and/or by the absorption of high frequency components (e.g. passing through glass). Figure 4.3 shows the magnitude of the complex degree of temporal coherence, showing a very fast decay. The coherence time $\tau_{coh}$ is essentially the width of the first maximum. Using the power equivalent width definition,

$$\tau_{coh} = \int_{-\infty}^{\infty} |g(\tau)|^2 \, d\tau \tag{4.1}$$

we get $\tau_{coh} \approx 0.39$ for this particular ensemble. A truly random electric field without the low-pass filtering would have a coherence time of zero (i.e. $|g(\tau)|$ would be a delta function).

In general we say that for times less than the coherence time, the field fluctuations are “strongly” correlated, and for times larger than the coherence time they are “weakly” correlated. Another way to look at this is that if we are given the amplitude and phase of the field at a particular time, we can fairly accurately predict the amplitude and phase at times within the coherence time, but not for times much later.
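As a consistency check on the power-equivalent-width definition, the sketch below integrates equation 4.1 numerically for an assumed exponential model |g(τ)| = exp(−|τ|/τc). This model is not the ensemble of figure 4.2; it is only chosen because the integral then returns τc itself. The function name and step count are illustrative.

```python
import math

def coherence_time(g_abs, tau_max, steps=20000):
    """Trapezoidal estimate of eq. 4.1 over the window [-tau_max, tau_max]."""
    h = 2 * tau_max / steps
    total = 0.0
    for j in range(steps + 1):
        tau = -tau_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * g_abs(tau) ** 2
    return total * h

tau_c = 0.39   # assumed time constant of the exponential model
estimate = coherence_time(lambda tau: math.exp(-abs(tau) / tau_c), 20 * tau_c)
print(estimate)   # close to tau_c
```

For a measured |g(τ)| one would replace the lambda by an interpolation of the data; the integration window must then be wide enough that |g| has decayed at its edges.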

4.2 Wiener-Khinchin Theorem

The “power spectral density” is defined as

$$S(\omega) = \lim_{T\to\infty} \frac{1}{2\pi T} \left\langle |V_T(\omega)|^2 \right\rangle \tag{4.2}$$

where

$$V_T(\omega) = \int_{-T/2}^{T/2} E(t)e^{-i\omega t} \, dt \tag{4.3}$$

is a truncated Fourier transform of the (scalar) electric field. It essentially gives us the energy/frequency/area/time for the light field. Sometimes we just call this the “spectral density” or even just the “spectrum.” Note that this is just the same as what we called I(ω) for pulses.

Figure 4.2: An ensemble of thermal fields, subjected to a low-frequency filter (for example, by passing through a material that absorbs high frequencies).

Figure 4.3: Plot of |g(τ)| vs. τ for an ensemble of 10000 waves that was sampled in figure 4.2.

The Wiener-Khinchin theorem states that

$$S(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\tau)e^{-i\omega\tau} \, d\tau \tag{4.4}$$

which implies that measurement of the coherence properties of a stationary wave can be transformed to a measurement of the power spectral density.

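Proof:

$$S(\omega) = \lim_{T\to\infty} \frac{1}{2\pi T} \left\langle \left| \int_{-T/2}^{T/2} E(t)e^{-i\omega t} \, dt \right|^2 \right\rangle$$

$$= \lim_{T\to\infty} \frac{1}{2\pi T} \left\langle \int_{-T/2}^{T/2} \int_{-T/2}^{T/2} E^*(t)E(t')e^{-i\omega(t'-t)} \, dt \, dt' \right\rangle$$

$$= \lim_{T\to\infty} \frac{1}{2\pi T} \int_{-T/2}^{T/2} \int_{-T/2}^{T/2} \langle E^*(t)E(t') \rangle e^{-i\omega(t'-t)} \, dt \, dt'$$

Here we make a change of variables $\tau = t' - t$, use $\langle E^*(t)E(t') \rangle = G(\tau)$ for a stationary field, and re-write the limits of integration as shown in figure 4.4. Now

$$S(\omega) = \lim_{T\to\infty} \frac{1}{2\pi T} \left( \int_0^T d\tau \, e^{-i\omega\tau} \int_{\tau - T/2}^{T/2} dt' \, G(\tau) + \int_{-T}^0 d\tau \, e^{-i\omega\tau} \int_{-T/2}^{T/2 + \tau} dt' \, G(\tau) \right)$$

$$= \lim_{T\to\infty} \frac{1}{2\pi T} \left( \int_0^T d\tau \, G(\tau)e^{-i\omega\tau}(T - \tau) + \int_{-T}^0 d\tau \, G(\tau)e^{-i\omega\tau}(T + \tau) \right)$$

$$= \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^{T} d\tau \, G(\tau)e^{-i\omega\tau}$$

$$= \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\tau)e^{-i\omega\tau} \, d\tau$$

In the step that drops the factors $\mp\tau/T$ we assume that $G(\tau)$ decays quickly enough that these terms vanish in the limit $T \to \infty$.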
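The theorem can be tried out numerically on a case where both sides are known in closed form. The sketch below assumes the model G(τ) = exp(−|τ|/τc) exp(iω0τ), for which equation 4.4 gives a Lorentzian, S(ω) = (τc/π)/(1 + (ω − ω0)²τc²). The model, the function name and the parameter values are assumptions chosen for illustration.

```python
import cmath
import math

def spectral_density(G, omega, tau_max, steps=40000):
    """Trapezoidal estimate of (1/2pi) * integral of G(tau)*exp(-i*omega*tau)."""
    h = 2 * tau_max / steps
    total = 0j
    for j in range(steps + 1):
        tau = -tau_max + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * G(tau) * cmath.exp(-1j * omega * tau)
    return (total * h / (2 * math.pi)).real

tau_c, omega0 = 1.0, 5.0     # assumed coherence time and centre frequency

def G(tau):
    """Assumed coherence function: exponential envelope times a carrier."""
    return math.exp(-abs(tau) / tau_c) * cmath.exp(1j * omega0 * tau)

def lorentzian(omega):
    """Closed-form spectrum that eq. 4.4 predicts for the model above."""
    return (tau_c / math.pi) / (1 + (omega - omega0) ** 2 * tau_c ** 2)

for omega in (omega0, omega0 + 1 / tau_c, omega0 - 3 / tau_c):
    print(omega, spectral_density(G, omega, 40 * tau_c), lorentzian(omega))
```

The full width at half maximum of this spectrum is 2/τc in angular frequency, in line with the rule of thumb that the spectral width is of order 1/τcoh.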

4.3 Interferometry

To actually measure temporal coherence properties we use interferometers. Some common interferometer designs are shown in figures 4.6, 4.11, 4.12 and 4.13.

Figure 4.4: Sketch of the integration area change caused by the change of integration variables.

The basic idea behind all interferometers is to take light and split it into at least two branches that are guided along different optical paths. They are then recombined and the intensity of the resulting wave is then measured using a photodiode or bolometer.

The output beam of an interferometer for the interference of two plane waves can be generally (assuming no change in polarization) written

$$E_{out}(t) = C_1E_{in}(t - \tau_1) + C_2E_{in}(t - \tau_2) \tag{4.5}$$

where $E_{in}(t)$ is the input amplitude, $C_1$ and $C_2$ are constants, and $\tau_1$ and $\tau_2$ are the effective time delays imposed by propagation along each arm. The (ensemble average) intensity is

$$I_{out} = \frac{1}{2Z}\langle |E_{out}|^2 \rangle = I_{in}\left\{ |C_1|^2 + |C_2|^2 + 2\,\mathrm{Re}\left[ C_1^*C_2\,g(\tau_1 - \tau_2) \right] \right\} \tag{4.6}$$

For the simple “symmetric” case where $C_1 = C_2$,

$$I_{out} = 2|C_1|^2I_{in}\left\{ 1 + \mathrm{Re}\left[ g(\tau_1 - \tau_2) \right] \right\} \tag{4.7}$$

which indicates that it is possible to measure the real part of the autocorrelation function g(τ) in a simple way from the output of an interferometer where it is possible to vary the relative path length.
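A minimal sketch of equation 4.7 follows, again under the assumed model g(τ) = exp(−|τ|/τc) exp(iω0τ) and with an assumed arm amplitude C1 = C2 = 1/2. The function names and numbers are illustrative only.

```python
import math

def interferogram(delay, tau_c, omega0, c_arm=0.5, i_in=1.0):
    """Output intensity from eq. 4.7 for the assumed exponential model of g."""
    re_g = math.exp(-abs(delay) / tau_c) * math.cos(omega0 * delay)
    return 2 * c_arm ** 2 * i_in * (1 + re_g)

def envelope(delay, tau_c):
    """|g(delay)| in the assumed model: fringe amplitude over average intensity."""
    return math.exp(-abs(delay) / tau_c)

tau_c, omega0 = 1.0, 10.0
for delay in (0.0, math.pi / omega0, 5.0, 50.0):
    print(delay, interferogram(delay, tau_c, omega0), envelope(delay, tau_c))
```

At zero delay the output is at its maximum; the first dark fringe at delay π/ω0 is not fully dark because |g| has already dropped below one, and for delays much longer than τc the output settles at the average value with no fringes.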

Figure 4.5 shows a typical interferogram for light with a limited coherence time, a case intermediate between a completely random field and a monochromatic wave. The visibility of the interferogram is the ratio of the amplitude of the oscillations to the average intensity. The visibility generally peaks at a path length difference of zero and decays as we move away.

Different interferometers are useful for different tasks. The Michelson interferometer shown in figure 4.6 is historically the most famous, due to its role in the Michelson-Morley experiment that found no experimental evidence for the hypothetical medium in which light propagates. Michelson interferometers use relatively few optics, although the basic design is difficult to keep aligned over large path lengths due to beam spreading effects. Modern implementations usually modify the design a bit by adding focusing optics to alleviate this problem.

Figure 4.5: A sample interferogram.

Figure 4.6: Michelson interferometer

Figure 4.7: Fourier transform spectrum of absorption from a bacterium, used as a medical diagnostic (Davis and Mauer, Current Research, Technology and Education Topics in Applied Microbiology and Microbial Biotechnology, A. Mendez-Vilas (Ed.), 2010).

Michelson interferometers are nowadays used in several areas of application. One important application is Fourier transform spectroscopy. This makes use of the Wiener-Khinchin theorem to transform an interferogram into a power spectrum. In the infrared region of the spectrum, this kind of spectroscopy is generally far superior to other methods. Spectroscopy in the infrared is important in chemistry since many vibrational and rotational energy transitions lie in this range. Figure 4.7 shows an example of using the characteristic infrared absorption features of a bacterium as a means of identification.

Another application of the Michelson interferometer is optical coherence tomography (see figures 4.8, 4.9 and 4.10). The goal of this technique is to make 3D images of highly scattering materials, such as milky biological tissue. The sample of interest is placed in one arm of a Michelson interferometer, typically using a lens to focus. The light source of the interferometer is chosen to be extremely broadband, so that it has a very limited coherence time. The limited coherence time makes it so that interference fringes can be observed only for light from the sample that is backscattered with a total path length equal to that of the other interferometer arm. This allows for rejection of multiple scattering paths (which act to obscure the image) as well as precise depth resolution. Figure 4.10 shows an image of a living mouse embryo taken using optical coherence tomography.

The Mach-Zehnder interferometer (figure 4.11) is a modification of the Michelson where different optics are used to separate and recombine the beams. The Mach-Zehnder configuration is often used with lasers, since the Michelson has the disadvantage that one of the output beams is directed back toward the source. For intense sources this can be problematic and a possible source of damage. Mach-Zehnder interferometers are often used for precision length measurements and for characterizing short laser pulses.

The Sagnac interferometer is shown in figure 4.12. In this configuration the two interfering beams follow the same optical path, but travel in opposite directions. Sagnac interferometers are used for measuring angular velocity. In an inertial reference frame, the two beams would always constructively interfere. In a rotating frame (like on earth), fringe shifts can be used to measure the absolute angular velocity. The first successful use of such an interferometer to measure the earth’s rotation was reported in 1913 as a “proof” of the existence of aether. This was incorrect, since this effect is actually quite consistent with special relativity as was discussed two years earlier. Sagnac interferometers are today an important component of GPS and inertial guidance systems, since precision clock synchronization over long distances requires a precise accounting for the earth’s rotation.

Figure 4.8: Sketch of the apparatus of an optical coherence tomography measurement (Schmitt, IEEE J. Sel. Top. in Quant. Elec. 5, p. 1205 (1999)).

Figure 4.9: A closer view of the actual OCT measurement. The short coherence length of the broadband source limits the depth of field, allowing for 3D resolution (also from Schmitt).

Figure 4.10: Live image of a mouse embryo taken using OCT (Larin, Larina and Dickinson, SPIE Newsroom (2011), DOI:10.1117/2.1201103.003581).

Figure 4.11: Mach-Zehnder interferometer

Lastly, figure 4.13 shows a different kind of interferometer called a Fabry-Perot interferometer, sometimes called an etalon. In general, a Fabry-Perot interferometer consists of two partially reflective parallel planes spaced some distance apart. This can be two partially reflective mirrors with an air gap, or just a slab of a dielectric material. Light entering the interferometer through one side bounces back and forth within the interferometer, coupling out on both sides. The interference of the multiple paths influences the intensity of the outputs.

The Fabry-Perot interferometer turns out to be the basis for the resonators in lasers, so we will be looking at it in some more detail. First, though, we will talk about some new calculational tools that we can use to make calculations of interference from multiple parallel surfaces a bit easier.

Figure 4.12: Sagnac interferometer

Figure 4.13: Fabry-Perot interferometer

Figure 4.14: Conceptual basis for matrix methods: virtual planes are placed between each feature of the optical system that modifies the phase or amplitude of propagating waves.

Figure 4.15: Conceptual basis for matrix methods: the elements between each plane are abstracted into “boxes” that “do something” to the electric field amplitude.

4.4 Matrix methods for multiple surfaces

(reference: Saleh and Teich, chapter 7.1)

Let’s consider an optical system composed of multiple conceptual elements as shown in figures 4.14 and 4.15. Here each box is some optical element that somehow modifies the electric field amplitude: e.g. a surface where light is reflected and transmitted, or a distance within a medium where the light must propagate. The field values $E_1^{(\pm)}$, $E_2^{(\pm)}$, $E_3^{(\pm)}$, $E_4^{(\pm)}$ are either the s-polarized or p-polarized components of the electric field at different virtual “planes” within this multi-element system. The + superscript indicates that the field is associated with a rightward-propagating wave, whereas the − sign indicates a leftward-propagating wave.

If we represent the fields at each plane as a vector, e.g.

$$\begin{pmatrix} E_1^{(+)} \\ E_1^{(-)} \end{pmatrix} \tag{4.8}$$

we can relate the fields at different planes using a matrix

$$\begin{pmatrix} E_2^{(+)} \\ E_2^{(-)} \end{pmatrix} = \begin{pmatrix} A_{12} & B_{12} \\ C_{12} & D_{12} \end{pmatrix} \begin{pmatrix} E_1^{(+)} \\ E_1^{(-)} \end{pmatrix} \tag{4.9}$$

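called the wave-propagation matrix. The nice thing about this matrix is that we can relate the fields at the two extreme ends of the device just by multiplying the individual wave-propagation matrices. For example,

$$\begin{pmatrix} E_4^{(+)} \\ E_4^{(-)} \end{pmatrix} = \begin{pmatrix} A_{34} & B_{34} \\ C_{34} & D_{34} \end{pmatrix} \begin{pmatrix} A_{23} & B_{23} \\ C_{23} & D_{23} \end{pmatrix} \begin{pmatrix} A_{12} & B_{12} \\ C_{12} & D_{12} \end{pmatrix} \begin{pmatrix} E_1^{(+)} \\ E_1^{(-)} \end{pmatrix} = \begin{pmatrix} A_{14} & B_{14} \\ C_{14} & D_{14} \end{pmatrix} \begin{pmatrix} E_1^{(+)} \\ E_1^{(-)} \end{pmatrix}$$

How do we find the elements of the wave propagation matrix? For the case of simple propagation in a uniform medium, this is relatively straightforward. For normal incidence through a slab of thickness d,

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} e^{-ikd} & 0 \\ 0 & e^{ikd} \end{pmatrix} \tag{4.10}$$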

where k is the wavevector magnitude. One of this week’s exercises is to generalize this to oblique incidence.

Actually determining the elements of the wave propagation matrix for an interface is not so straightforward. Fortunately, there is a useful trick. The scattering matrix is defined as

$$\begin{pmatrix} E_2^{(+)} \\ E_1^{(-)} \end{pmatrix} = \begin{pmatrix} a_{12} & b_{12} \\ c_{12} & d_{12} \end{pmatrix} \begin{pmatrix} E_1^{(+)} \\ E_2^{(-)} \end{pmatrix} \tag{4.11}$$

For an interface, it is apparent that $a_{12} = t_{12}$, $b_{12} = r_{21}$, $c_{12} = r_{12}$, and $d_{12} = t_{21}$. Here it is understood that we use either s- or p-polarization, as the situation merits. You will show in this week’s exercises the general relations between the wave propagation matrix and the scattering matrix:

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \frac{1}{d}\begin{pmatrix} ad - bc & b \\ -c & 1 \end{pmatrix} \tag{4.12}$$

and (quite symmetrically)

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \frac{1}{D}\begin{pmatrix} AD - BC & B \\ -C & 1 \end{pmatrix} \tag{4.13}$$
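The two conversions are easy to get wrong by a sign, so a small sketch may help. It implements equations 4.12 and 4.13 on 2×2 matrices stored as nested tuples and checks that they are inverses of each other. The function names, the tuple representation and the normal-incidence interface with n1 = 1, n2 = 1.5 are assumptions for illustration.

```python
def scattering_to_wave(s):
    """Eq. 4.12: scattering matrix ((a, b), (c, d)) -> ((A, B), (C, D))."""
    (a, b), (c, d) = s
    return ((a * d - b * c) / d, b / d), (-c / d, 1 / d)

def wave_to_scattering(m):
    """Eq. 4.13: wave-propagation matrix ((A, B), (C, D)) -> ((a, b), (c, d))."""
    (A, B), (C, D) = m
    return ((A * D - B * C) / D, B / D), (-C / D, 1 / D)

def interface_scattering(n1, n2):
    """Normal-incidence interface: a = t12, b = r21, c = r12, d = t21."""
    r12 = (n1 - n2) / (n1 + n2)       # eq. 3.22 at normal incidence
    return (1 + r12, -r12), (r12, 1 - r12)

s = interface_scattering(1.0, 1.5)
m = scattering_to_wave(s)
print(m)                       # wave-propagation matrix of the interface
print(wave_to_scattering(m))   # recovers s up to rounding
```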

4.5 Application to Fabry-Perot

Let’s use this formalism to calculate the transmission of a Fabry-Perot interferometer as depicted in figure 4.13. Here we assume the outside medium has index n = 1, and that the slab has real index n > 1. We will cover only the case of normal incidence (the general case is a problem in the exercises).

The scattering matrix for the leftmost interface is

$$\begin{pmatrix} t_{12} & r_{21} \\ r_{12} & t_{21} \end{pmatrix}. \tag{4.14}$$

The corresponding wave propagation matrix is

$$\frac{1}{t_{21}}\begin{pmatrix} t_{12}t_{21} - r_{21}r_{12} & r_{21} \\ -r_{12} & 1 \end{pmatrix}. \tag{4.15}$$

Since in this case $r_{12} = -r_{21}$, $t_{12} = 1 + r_{12}$ and $t_{21} = 1 - r_{12}$, this simplifies to

$$\frac{1}{(1 - r_{12})}\begin{pmatrix} 1 & -r_{12} \\ -r_{12} & 1 \end{pmatrix}. \tag{4.16}$$

For the rightmost interface we have the scattering matrix

$$\begin{pmatrix} t_{21} & r_{12} \\ r_{21} & t_{12} \end{pmatrix} \tag{4.17}$$

and the corresponding wave propagation matrix is

$$\frac{1}{t_{12}}\begin{pmatrix} t_{21}t_{12} - r_{12}r_{21} & r_{12} \\ -r_{21} & 1 \end{pmatrix} \tag{4.18}$$

which again simplifies to

$$\frac{1}{(1 + r_{12})}\begin{pmatrix} 1 & r_{12} \\ r_{12} & 1 \end{pmatrix}. \tag{4.19}$$

The wave propagation matrix for the propagation through the dielectric is

$$\begin{pmatrix} e^{-ikl} & 0 \\ 0 & e^{ikl} \end{pmatrix}. \tag{4.20}$$

The wave propagation matrix for the whole system is then

$$\frac{1}{1 - r_{12}^2}\begin{pmatrix} 1 & r_{12} \\ r_{12} & 1 \end{pmatrix}\begin{pmatrix} e^{-ikl} & 0 \\ 0 & e^{ikl} \end{pmatrix}\begin{pmatrix} 1 & -r_{12} \\ -r_{12} & 1 \end{pmatrix}$$

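which, with $r_{12} = (1 - n)/(1 + n)$ from equation 3.22 at normal incidence, works out to

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \frac{1}{4n}\begin{pmatrix} (1 + n)^2e^{-ikl} - (1 - n)^2e^{ikl} & -2i(n^2 - 1)\sin kl \\ 2i(n^2 - 1)\sin kl & (1 + n)^2e^{ikl} - (1 - n)^2e^{-ikl} \end{pmatrix} \tag{4.21}$$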

Using equation 4.13 we can convert this back to a total scattering matrix

$$S = \begin{pmatrix} t_{14} & r_{41} \\ r_{14} & t_{41} \end{pmatrix} \tag{4.22}$$

which gives us effective reflection and transmission coefficients for the entire system. Since the system as we have presented it is lossless (n is real) and the waves begin and end in the same medium,

$$\begin{pmatrix} t_{14} & r_{41} \\ r_{14} & t_{41} \end{pmatrix} = \begin{pmatrix} t_{41} & r_{14} \\ r_{41} & t_{14} \end{pmatrix} \tag{4.23}$$

From the definition of the scattering matrix

$$\begin{pmatrix} E_4^{(+)} \\ E_1^{(-)} \end{pmatrix} = \begin{pmatrix} t_{14} & r_{41} \\ r_{14} & t_{41} \end{pmatrix}\begin{pmatrix} E_1^{(+)} \\ E_4^{(-)} \end{pmatrix}. \tag{4.24}$$

If we impose the condition $E_4^{(-)} = 0$, we get

$$E_4^{(+)} = t_{14}E_1^{(+)} = t_{41}E_1^{(+)} = \frac{4ne^{-ikl}}{(1 + n)^2 - (1 - n)^2e^{-2ikl}}E_1^{(+)} \tag{4.25}$$

The transmitted intensity is then

$$I_4^{(+)} = \frac{16n^2}{(1 + n)^4 + (1 - n)^4 - 2(1 - n^2)^2\cos 2kl}I_1^{(+)} = \frac{1}{1 + (2F/\pi)^2\sin^2(kl)}I_1^{(+)} \tag{4.26}$$

where

$$F = \frac{\pi\sqrt{R}}{1 - R} \tag{4.27}$$

is called the finesse (here $R = |r_{12}|^2$).

We see from this that the transmission of the Fabry-Perot interferometer is a periodic function of the product kl. The transmission as a function of this phase is shown in figure 4.16 for different values of R (and thus different values of the finesse). As R and the finesse increase, the maxima become sharper and sharper. Since kl is wavelength-dependent, this makes Fabry-Perot etalons with high finesse excellent wavelength filters.

The finesse F is the effective number of internal waves that interfere to create the output wave. It is equal to the ratio of the distance between transmission “spikes” to the width of an individual peak.
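The chain of matrices can also be multiplied out numerically and compared with the closed form of equation 4.26. The sketch below does this for a lossless slab in vacuum at normal incidence; the helper names, the nested-tuple matrices and the sample index n = 3 are assumptions for illustration.

```python
import cmath
import math

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def slab_transmission(n, kl):
    """|t14|^2 obtained by multiplying the matrices of eqs. 4.16, 4.19, 4.20."""
    r = (1 - n) / (1 + n)                      # r12 at normal incidence
    left = ((1 / (1 - r), -r / (1 - r)), (-r / (1 - r), 1 / (1 - r)))
    right = ((1 / (1 + r), r / (1 + r)), (r / (1 + r), 1 / (1 + r)))
    prop = ((cmath.exp(-1j * kl), 0), (0, cmath.exp(1j * kl)))
    (A, B), (C, D) = matmul(right, matmul(prop, left))
    t14 = (A * D - B * C) / D                  # element a of eq. 4.13
    return abs(t14) ** 2

def airy_transmission(n, kl):
    """Closed form of eq. 4.26 written with the finesse of eq. 4.27."""
    big_r = ((1 - n) / (1 + n)) ** 2
    finesse = math.pi * math.sqrt(big_r) / (1 - big_r)
    return 1 / (1 + (2 * finesse / math.pi) ** 2 * math.sin(kl) ** 2)

for kl in (0.0, 0.3, math.pi / 2, 2.0, math.pi):
    print(kl, slab_transmission(3.0, kl), airy_transmission(3.0, kl))
```

The two columns agree, and the transmission returns to unity whenever kl is a multiple of π, which corresponds to the maxima in figure 4.16.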

Figure 4.16: Transmission of a Fabry-Perot etalon with different reflectance values. The x-axis here is $kl/\pi = 2nl/\lambda_v$.

Chapter 5

Fourier Optics

Learning objectives

After completing this section of the course, you should be able to

• Identify the Helmholtz equation and what it describes (eq. 5.4)

• Apply the paraxial and Fresnel approximation when appropriate (eq. 5.17)

• Identify some terminology used to describe Gaussian beams

• Use the transfer function formalism to describe spatial beam propagation (eq. 5.10)

• Estimate the Fresnel number for a beam (eq. 5.18)

• Apply the Fraunhofer approximation when appropriate (eq. 5.33)

• Compute the intensity of a wave at arbitrary points, given the intensity in one plane using the above approximations.

5.1 Helmholtz Equation

Recall the wave equation for the electric field in a sourceless medium

$$\nabla^2\mathbf{E} - \frac{n^2}{c^2}\frac{\partial^2}{\partial t^2}\mathbf{E} = 0 \tag{5.1}$$

For a monochromatic wave we may write the complex-valued field

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}(\mathbf{r})e^{i\omega t} \tag{5.2}$$

If we now assume that the polarization is approximately independent of space,

$$\mathbf{E}(\mathbf{r}) = \mathbf{E}_0U(\mathbf{r}) \tag{5.3}$$

we can then extract the Helmholtz equation:

$$\nabla^2U + k^2U = 0 \tag{5.4}$$

where $k = n\omega/c$ is our familiar wave number.

Under the above conditions, the complex $U(\mathbf{r})$ is related to the intensity by

$$I(\mathbf{r}) = \frac{1}{2Z}|U(\mathbf{r})|^2 \tag{5.5}$$

For the monochromatic plane waves we discussed earlier, $|U(\mathbf{r})|$ is independent of $\mathbf{r}$. With the help of the Helmholtz equation we can extend this to cases where the intensity is not uniform in space.

The wavefront of a wave that solves the Helmholtz equation is a surface of constant complex phase. In other words, if $\phi(\mathbf{r}) = \arg[U(\mathbf{r})]$, a wavefront is a 2-D surface where $\phi(\mathbf{r})$ is constant.

5.2 Spatial Fourier transform and transfer function

For what follows we will consider the case where the wavefront normals are all very close to one particular direction, which we will designate as +z. This is also the condition where the polarization could possibly be approximately space-independent.

For a plane defined by a fixed value of z, we can define a function

$$V(k_x, k_y, z) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(x, y, z)e^{-i(k_xx + k_yy)} \, dx \, dy \tag{5.6}$$

as a 2-D Fourier transform in space.

Applying the transform to the Helmholtz equation we get

$$-(k_x^2 + k_y^2)V + \frac{\partial^2V}{\partial z^2} + k^2V = 0 \tag{5.7}$$

or

$$\frac{\partial^2V}{\partial z^2} + (k^2 - k_x^2 - k_y^2)V = 0 \tag{5.8}$$

This can be solved as

$$V(k_x, k_y, z) = V(k_x, k_y, 0)\,e^{-iz\sqrt{k^2 - k_x^2 - k_y^2}} \tag{5.9}$$

which gives a way to express the 2D transform in any plane perpendicular to z in terms of an already known 2D transform in any other plane.

A convenient way to think of this is to refer to the transfer function

$$H(k_x, k_y) = e^{-iz\sqrt{k^2 - k_x^2 - k_y^2}} \tag{5.10}$$

valid for propagation in any homogeneous isotropic medium.

Given the spatial profile of U in any plane (say z = 0), we can deduce the profile at any other plane defined by a fixed z with the following procedure:


• Calculate V (kx, ky, 0)

• Multiply V (kx, ky, 0) by H(kx, ky)

• Inverse transform this product to obtain U(x, y, z)
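The three steps above can be sketched numerically. The following is a minimal one-dimensional analogue (a field U(x, z) with no y dependence), written with direct O(n²) Fourier sums from the standard library so that it stays self-contained; the function names `propagate_1d` and `mean_square_radius` and the sampling scheme are illustrative assumptions, not part of the lecture notes. Components with kx² > k² are treated as evanescent (decaying in z).

```python
import cmath
import math


def propagate_1d(u0, dx, wavelength, z):
    """Propagate samples u0 of U(x, 0) to the plane z with H = exp(-i z sqrt(k^2 - kx^2)).

    u0 is sampled at x_j = (j - n/2) * dx, with n even.
    """
    n = len(u0)
    k = 2 * math.pi / wavelength
    xs = [(j - n // 2) * dx for j in range(n)]
    kxs = [2 * math.pi * (m - n // 2) / (n * dx) for m in range(n)]

    # Step 1: forward transform V(kx, 0), same sign convention as eq. 5.6
    v0 = [sum(u * cmath.exp(-1j * kx * x) for u, x in zip(u0, xs)) * dx
          for kx in kxs]

    # Step 2: multiply by the transfer function of eq. 5.10
    vz = []
    for v, kx in zip(v0, kxs):
        kz_sq = k * k - kx * kx
        if kz_sq >= 0:
            h = cmath.exp(-1j * z * math.sqrt(kz_sq))
        else:
            h = math.exp(-z * math.sqrt(-kz_sq))  # evanescent component
        vz.append(v * h)

    # Step 3: inverse transform back to U(x, z)
    uz = [sum(v * cmath.exp(1j * kx * x) for v, kx in zip(vz, kxs)) / (n * dx)
          for x in xs]
    return xs, uz


def mean_square_radius(xs, u):
    """Intensity-weighted mean of x^2, a simple measure of the beam width."""
    weights = [abs(a) ** 2 for a in u]
    return sum(x * x * w for x, w in zip(xs, weights)) / sum(weights)
```

For a well-sampled Gaussian input profile, comparing `mean_square_radius` before and after propagation gives a quick check that the beam spreads as expected.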

The inverse transform of H (in the Fresnel approximation discussed in the next section) is

$$h(x, y) = h_0\,e^{-ik\frac{x^2+y^2}{2z}} \tag{5.11}$$

with $h_0 = \frac{ik}{2\pi z}e^{-ikz}$. The function $h$ is known as the impulse response function. It is the field generated at a plane z from a delta-function-like field at the origin of the coordinate system. It is also possible to use the impulse response function to calculate fields for arbitrary values on the z = 0 plane by spatial integration:

$$U(x, y, z) = \frac{ik}{2\pi z}e^{-ikz}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} U(x', y', 0)\,e^{-\frac{ik}{2z}\left[(x-x')^2+(y-y')^2\right]}\,dx'\,dy' \tag{5.12}$$

5.3 Fresnel Approximation

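If we assume $k_x^2 + k_y^2 \ll k^2$, the angles $\theta_x \approx k_x/k$ and $\theta_y \approx k_y/k$ are both very small. We can then re-write the phase of the transfer function

$$z\sqrt{k^2 - k_x^2 - k_y^2} = kz\sqrt{1 - \frac{k_x^2 + k_y^2}{k^2}} = kz\left[1 - \frac{k_x^2 + k_y^2}{2k^2} - \frac{1}{8}\left(\frac{k_x^2 + k_y^2}{k^2}\right)^2 - \dots\right] \tag{5.13}$$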

Keeping only the first two terms of this expansion is called the Fresnel approximation. This yields the simplified transfer function

$$H(k_x, k_y) = H_0\,e^{iz(k_x^2 + k_y^2)/2k} \tag{5.14}$$

with

$$H_0 = e^{-ikz} \tag{5.15}$$

The Fresnel approximation essentially just approximates the phase contributions from $k_x$ and $k_y$ as a quadratic function (a paraboloid) rather than the more precise spherical function. The condition for validity is that the magnitude of the third term in the expansion above is small compared with π for all $\theta_x$ and $\theta_y$:

$$\frac{kz}{8}\left(\theta_x^2 + \theta_y^2\right)^2 \ll \pi \tag{5.16}$$
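As a rough numerical illustration of this validity condition, the sketch below compares the exact phase of the transfer function with its Fresnel form for a single propagation angle and sets the difference against the leading neglected term. The function names are hypothetical, and `theta` stands for the total angle, so that `theta**2` plays the role of θx² + θy².

```python
import math


def fresnel_phase_error(wavelength, z, theta):
    """Phase error (rad) made by replacing k z sqrt(1 - theta^2) with k z (1 - theta^2 / 2)."""
    k = 2 * math.pi / wavelength
    exact = k * z * math.sqrt(1 - theta ** 2)
    fresnel = k * z * (1 - theta ** 2 / 2)
    return fresnel - exact


def leading_neglected_term(wavelength, z, theta):
    """Magnitude of the first term dropped in the Fresnel approximation: k z theta^4 / 8."""
    k = 2 * math.pi / wavelength
    return k * z * theta ** 4 / 8


def fresnel_is_valid(wavelength, z, theta, margin=0.1):
    """Heuristic check of eq. 5.16; the margin defining 'much less than pi' is an assumption."""
    return leading_neglected_term(wavelength, z, theta) < margin * math.pi
```

For example, with a wavelength of 500 nm, z = 10 cm and theta = 0.01 rad the neglected phase is of order 10⁻³ rad, comfortably inside the Fresnel regime.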

If $\theta_m$ is the maximum of both $|\theta_x|$ and $|\theta_y|$ and a is the largest radial distance in the output plane, this can be written

$$\frac{N_F\,\theta_m^2}{4} \ll 1 \tag{5.17}$$

where

$$N_F = \frac{a^2}{\lambda z} \tag{5.18}$$

is the Fresnel number.

The limit we have just described (where the components of the wave all travel in nearly the z direction) is also known as the paraxial approximation. In this limit we can describe

$$U(\mathbf{r}) = A(\mathbf{r})\,e^{-ikz} \tag{5.19}$$

where $A(\mathbf{r})$ is a slowly varying function of $\mathbf{r}$. In this approximation we can also make a new version of the Helmholtz equation

$$\nabla_T^2 A - i2k\frac{\partial A}{\partial z} = 0 \tag{5.20}$$

called the paraxial Helmholtz equation.

5.4 Gaussian Beams

One solution to the paraxial Helmholtz equation that is of great importance is the Gaussian beam

$$A(\mathbf{r}) = \frac{A_1}{q(z)}\,e^{-ik(x^2+y^2)/2q(z)} \tag{5.21}$$

with $q(z) = z + iz_0$ (you will show this in the exercises for this week). Since the Gaussian beam is so important in laser physics we will now take some time to define some salient terminology.

Usually $1/q(z)$ is broken into real and imaginary parts

$$\frac{1}{q(z)} = \frac{1}{R(z)} - i\frac{\lambda}{\pi W^2(z)} \tag{5.22}$$

The full complex amplitude is

$$U(\mathbf{r}) = A_0\frac{W_0}{W(z)}\,e^{-\frac{\rho^2}{W^2(z)}}\,e^{-ikz - ik\frac{\rho^2}{2R(z)} + i\zeta(z)} \tag{5.23}$$

where $\rho^2 = x^2 + y^2$, with beam parameters

$$W(z) = W_0\sqrt{1 + \left(\frac{z}{z_0}\right)^2} \tag{5.24}$$

$$R(z) = z\left[1 + \left(\frac{z_0}{z}\right)^2\right] \tag{5.25}$$

$$\zeta(z) = \tan^{-1}\frac{z}{z_0} \tag{5.26}$$

$$W_0 = \sqrt{\frac{\lambda z_0}{\pi}} \tag{5.27}$$

We will discuss this in greater detail a bit later. For now, note that $W(z)$ sets the width of the intensity profile at a given value of z: it is the distance from the z axis at which the intensity falls to $1/e^2$ of its on-axis value. The spot size of the beam is defined as $2W(z)$. The minimum of $W(z)$, at z = 0, is called the waist, and $W_0$ is the waist radius.

The parameter $z_0$ is known as the Rayleigh range, and sets the length scale along z over which the beam expands. Note that, for a given wavelength, only $A_0$ and $z_0$ are needed to define the beam.

At values of $z \gg z_0$, $W(z) \approx (W_0/z_0)\,z = \theta_0 z$ where

$$\theta_0 = \frac{\lambda}{\pi W_0} \tag{5.28}$$

is the half-angle of the divergence of the beam. The full divergence angle is twice this:

$$2\theta_0 = \frac{2\lambda}{\pi W_0} \tag{5.29}$$

This means that the tighter the waist of the beam, the more it diverges.

The “depth of focus” refers to the z-range over which the beam width is within a factor of $\sqrt{2}$ of its minimum value. This is simply

$$2z_0 = \frac{2\pi W_0^2}{\lambda} \tag{5.30}$$

which means that the depth of focus decreases quadratically with the waist radius; this is why it is harder to focus high-resolution optics.

The parameter $R(z)$ gives the radius of curvature for the wavefronts. The singularity at z = 0 indicates that this radius is infinite at the waist (i.e. the wavefronts there are planar and parallel to the z = 0 plane). The phase parameter $\zeta(z)$ is known as the Gouy phase shift. It is responsible for a phase shift of π/2 across the depth of focus (from $-z_0$ to $+z_0$), and of π in total between $z \to -\infty$ and $z \to +\infty$.
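The beam parameters defined in eqs. 5.24 to 5.30 are easy to tabulate. The following is a small sketch under the assumption that the beam is specified by its wavelength and waist radius; the function name `gaussian_beam` and the dictionary keys are illustrative choices, not notation from the notes.

```python
import math


def gaussian_beam(wavelength, w0, z):
    """Return the Gaussian-beam parameters at position z for a waist w0 located at z = 0."""
    z0 = math.pi * w0 ** 2 / wavelength                   # Rayleigh range, from eq. 5.27
    width = w0 * math.sqrt(1 + (z / z0) ** 2)             # eq. 5.24
    # eq. 5.25; the wavefront is planar (infinite radius) at the waist
    radius = math.inf if z == 0 else z * (1 + (z0 / z) ** 2)
    zeta = math.atan(z / z0)                              # eq. 5.26
    theta0 = wavelength / (math.pi * w0)                  # eq. 5.28
    return {
        "z0": z0,
        "W": width,
        "R": radius,
        "zeta": zeta,
        "theta0": theta0,
        "depth_of_focus": 2 * z0,                         # eq. 5.30
    }
```

For instance, `gaussian_beam(633e-9, 1e-3, 0.0)` describes a 1 mm waist at a wavelength of 633 nm, for which the Rayleigh range comes out at roughly 5 m.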

5.5 Fraunhofer limit: the far field

Let a be the largest distance measured away from the z axis in the “output” plane, and let b be the maximum distance in the “input” plane. We call the conditions

$$N_F \equiv \frac{a^2}{\lambda z} \ll 1 \tag{5.31}$$

and

$$N_F' \equiv \frac{b^2}{\lambda z} \ll 1 \tag{5.32}$$

the Fraunhofer approximation. Note that these conditions are much more stringent than the requirements for the Fresnel approximation. In the Fraunhofer limit, we can identify the complex field at a point (x, y) in the output plane with the plane-wave component with angles $\theta_x \approx x/z$ and $\theta_y \approx y/z$. Thus $k_x \approx k\,x/z$ and $k_y \approx k\,y/z$. The complex U is then

$$U(x, y, z) \approx h_0\,V\!\left(\frac{k\,x}{z}, \frac{k\,y}{z}\right) \tag{5.33}$$

where $h_0 = \frac{ik}{2\pi z}e^{-ikz}$ and V is the transform of the field in the input plane (for a proof starting from equation 5.12 see Saleh and Teich, page 117). This is a pretty nice result: it says that if we can apply the Fraunhofer approximation, the field output is just a Fourier transform of the input field.

5.6 Examples of amplitude modulation

5.6.1 Rectangular aperture

Suppose we have a rectangular aperture of width $D_x$ and height $D_y$. We illuminate this aperture with a +z plane wave from below. We also assume the observation screen is placed at z = d, and that $D_x^2 \ll \lambda d$ and $D_y^2 \ll \lambda d$. For the intensity on the screen at distances $\ll \sqrt{d\lambda}$ from the axis, we can apply the Fraunhofer approximation to obtain

$$U(x, y, d) \approx h_0\,V\!\left(\frac{k\,x}{d}, \frac{k\,y}{d}\right) \tag{5.34}$$

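$$V(k_x, k_y) = \int_{-D_x/2}^{D_x/2}dx\int_{-D_y/2}^{D_y/2}dy\,\sqrt{I_0}\,e^{ik_x x}e^{ik_y y}$$

$$= \sqrt{I_0}\left[\frac{1}{ik_x}\left(e^{ik_x D_x/2} - e^{-ik_x D_x/2}\right)\right]\left[\frac{1}{ik_y}\left(e^{ik_y D_y/2} - e^{-ik_y D_y/2}\right)\right]$$

$$= \sqrt{I_0}\cdot\frac{4}{k_x k_y}\sin(k_x D_x/2)\sin(k_y D_y/2)$$

$$V\!\left(\frac{k\,x}{d}, \frac{k\,y}{d}\right) = \sqrt{I_0}\,D_x D_y\,\frac{\sin\left(\frac{\pi D_x x}{\lambda d}\right)}{\left(\frac{\pi D_x x}{\lambda d}\right)}\,\frac{\sin\left(\frac{\pi D_y y}{\lambda d}\right)}{\left(\frac{\pi D_y y}{\lambda d}\right)}$$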

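Then the intensity at the screen is

$$I_{\text{screen}} = \frac{1}{2Z}|U|^2 = I_0\left(\frac{D_x D_y}{\lambda d}\right)^2\frac{\sin^2\left(\frac{\pi D_x x}{\lambda d}\right)}{\left(\frac{\pi D_x x}{\lambda d}\right)^2}\,\frac{\sin^2\left(\frac{\pi D_y y}{\lambda d}\right)}{\left(\frac{\pi D_y y}{\lambda d}\right)^2} \tag{5.35}$$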
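Equation 5.35 can be evaluated directly. The sketch below is a plain transcription of that formula; the function names and argument order are assumptions made for illustration.

```python
import math


def sinc(a):
    """sin(a)/a with the limiting value 1 at a = 0."""
    return 1.0 if a == 0 else math.sin(a) / a


def rect_fraunhofer_intensity(x, y, i0, width_x, width_y, wavelength, d):
    """Intensity of eq. 5.35 at the screen point (x, y) for a width_x by width_y aperture."""
    arg_x = math.pi * width_x * x / (wavelength * d)
    arg_y = math.pi * width_y * y / (wavelength * d)
    peak = i0 * (width_x * width_y / (wavelength * d)) ** 2
    return peak * sinc(arg_x) ** 2 * sinc(arg_y) ** 2
```

The first dark fringe along x lies where the sinc argument equals π, that is at x = λd/Dx, so a narrower aperture produces a wider central lobe.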

5.6.2 Round aperture

Consider now a round aperture with diameter D. Again assuming the Fraunhofer limit,

$$V(k_x, k_y) = \sqrt{I_0}\,\frac{\pi D^2}{4}\,\frac{2J_1(\kappa D/2)}{\kappa D/2}, \qquad \kappa = \sqrt{k_x^2 + k_y^2} \tag{5.36}$$

where $J_1$ is a Bessel function of the first kind. This gives an intensity

$$I_{\text{screen}} = I_0\left(\frac{\pi D^2}{4\lambda d}\right)^2\left[\frac{2J_1(\pi D\rho/\lambda d)}{\pi D\rho/\lambda d}\right]^2 \tag{5.37}$$

with $\rho = \sqrt{x^2 + y^2}$. The resulting pattern is plotted in figure 5.2. This comes up quite a bit, and has a special name: the Airy pattern. The first minimum occurs at $\rho = 1.22\lambda d/D$ and defines the radius of the Airy disk. The far-field half-angle divergence is

$$\theta = 1.22\frac{\lambda}{D} \tag{5.38}$$

Figure 5.1: Fraunhofer diffraction from a rectangular aperture.

Figure 5.2: Fraunhofer diffraction from a round aperture.
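To see where the factor 1.22 comes from, one can evaluate eq. 5.37 numerically. The sketch below avoids external libraries by computing J1 from its standard integral representation, J1(x) = (1/π)∫₀^π cos(τ − x sin τ) dτ, with the trapezoid rule; the function names and the default number of integration steps are illustrative assumptions.

```python
import math


def bessel_j1(x, steps=400):
    """J1(x) from (1/pi) * integral over [0, pi] of cos(tau - x sin(tau)), trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for i in range(1, steps):
        tau = i * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def airy_intensity(rho, i0, diameter, wavelength, d):
    """Intensity of eq. 5.37 at radius rho on a screen a distance d from the aperture."""
    a = math.pi * diameter * rho / (wavelength * d)
    jinc = 1.0 if a == 0 else 2 * bessel_j1(a) / a   # limit 2 J1(a)/a -> 1 as a -> 0
    return i0 * (math.pi * diameter ** 2 / (4 * wavelength * d)) ** 2 * jinc ** 2
```

The first zero of J1 is at an argument of about 3.8317, and 3.8317/π ≈ 1.22, which is the coefficient appearing in the Airy disk radius.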


Chapter 6

Fourier Optics 2

Learning objectives

After completing this section of the course, you should be able to

• Understand the use of phase masks as diffraction sources

• Use lenses to perform Fourier Transforms

• Apply the concept of impulse response functions to characterizing imaging systems

• Identify basic characteristics of gratings and zone plates (eq. 6.17)

• Explain the fundamental elements of holography

• Use the ray approximation to model paraxial and Gaussian beams with optical elements (eqs. 6.33-37, 6.44, 6.47)

6.1 Phase masks and Lenses

Last week we considered several examples of “amplitude masks”, i.e. apertures that block (or partially block) the amplitude of light. It is also possible to consider diffraction from elements that do not affect the amplitude at all, but instead the phase. We will consider one such example here.

Let us consider a thin transparent material that introduces a pure phase shift given by the complex transparency

$$t(x, y) = \exp\left(ik(x^2 + y^2)/2f\right) \tag{6.1}$$

We consider a paraxial plane wave with fixed $k_x$ and $k_y$ components incident on this material. Just after interaction with the material

$$U(x, y, 0) = F(k_x, k_y)\,e^{-i(k_x x + k_y y)}\,e^{ik(x^2+y^2)/2f} \tag{6.2}$$


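which can be re-written

$$U(x, y, 0) = F(k_x, k_y)\,e^{i\frac{k}{2f}\left[x^2 + y^2 - \frac{2fk_x}{k}x - \frac{2fk_y}{k}y\right]}$$

$$U(x, y, 0) = F(k_x, k_y)\,e^{-i\frac{k}{2f}(x_0^2 + y_0^2)}\,e^{i\frac{k}{2f}\left[(x-x_0)^2 + (y-y_0)^2\right]}$$

where $x_0 = fk_x/k$ and $y_0 = fk_y/k$.

Applying eq. 5.12 (the impulse response formalism) from last week,

$$U(x, y, f) = h_0\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\,F(k_x, k_y)\,e^{-i\frac{f}{2k}(k_x^2+k_y^2)}\,e^{i\frac{k}{2f}\left[(x'-x_0)^2+(y'-y_0)^2\right]}\,e^{-i\frac{k}{2f}\left[(x-x')^2+(y-y')^2\right]}$$

$$U(x, y, f) = h_0\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\,F(k_x, k_y)\,e^{-i\frac{f}{2k}(k_x^2+k_y^2)}\,e^{i\frac{k}{f}\left(-x'x_0 - y'y_0 + xx' + yy'\right)}\,e^{i\frac{k}{2f}\left(x_0^2+y_0^2-x^2-y^2\right)}$$

$$U(x, y, f) = h_0\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\,F(k_x, k_y)\,e^{-i\frac{f}{2k}(k_x^2+k_y^2)}\,e^{i\frac{k}{f}\left[x'(x-x_0) + y'(y-y_0)\right]}\,e^{i\frac{k}{2f}\left(x_0^2+y_0^2-x^2-y^2\right)}$$

$$U(x, y, f) = \left(\frac{2\pi f}{k}\right)^2 h_0\,F(k_x, k_y)\,e^{-i\frac{f}{2k}(k_x^2+k_y^2)}\,\delta(x-x_0)\,\delta(y-y_0)\,e^{i\frac{k}{2f}\left(x_0^2+y_0^2-x^2-y^2\right)}$$

$$U(x, y, f) = \left(\frac{2\pi f}{k}\right)^2 h_0\,F(k_x, k_y)\,e^{-i\frac{f}{2k}(k_x^2+k_y^2)}\,\delta(x-x_0)\,\delta(y-y_0) \tag{6.3}$$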

and we see that this type of mask focuses all of the intensity of the plane wave to a point $(x_0, y_0) = (fk_x/k, fk_y/k)$ at a distance f from the mask. This is exactly the defining property of a lens.

If we were to allow the wave to propagate in free space a distance d before the lens, we would get the modified relation

$$U(x, y, d+f) = i\lambda f\,e^{-ik(d+f)}\,e^{i(k_x^2+k_y^2)(d-f)/2k}\,F(k_x, k_y)\,\delta(x-x_0)\,\delta(y-y_0) \tag{6.4}$$

For d = f and now considering an arbitrary input wave composed of plane waves,

$$U(x, y, 2f) = \frac{i}{\lambda f}\,e^{-2ikf}\,V\!\left(\frac{k\,x}{f}, \frac{k\,y}{f}, 0\right) \tag{6.5}$$

we see that the lens arrangement simply takes the Fourier transform of the input. This is similar to what happens in the Fraunhofer limit, but note that with a lens we only need to satisfy the conditions of the Fresnel approximation.

The Fourier transform property of the lens can be quite useful. For example, a sequence of two lenses can be used to perform a Fourier transform and an inverse Fourier transform. In the Fourier plane, one can place an amplitude mask to filter out particular spatial frequencies to modify the image.


Figure 6.1: Spatial filtering with lenses.

6.2 Imaging

In the last section we indicated a way to make images with lenses. A more formal approach requires us to generate impulse response functions and/or transfer functions for such systems, consisting usually of regions of free propagation and one or more lenses. Here we will consider one example, that of a single lens imaging system. We’ll derive the impulse response function for such a system.

Consider a lens with focal length f placed a distance $d_1$ from an object plane and a distance $d_2$ from an image plane. An impulse in the object plane produces, in the Fresnel approximation, a field just before the lens

$$U(x, y, 0) = \frac{i}{\lambda d_1}\,e^{-ikd_1}\,e^{-ik\frac{x^2+y^2}{2d_1}} \tag{6.6}$$

The lens multiplies this field by its pupil function $p(x, y)$ (the effective aperture formed by the lens) and the lens phase factor $e^{ik(x^2+y^2)/2f}$ and gives the field “post-lens” as

$$U'(x, y, 0) = U(x, y, 0)\,p(x, y)\,e^{ik(x^2+y^2)/2f} \tag{6.7}$$

Propagating this a distance $d_2$ via eq. 5.12 from last week gives an expression for the impulse response function for the whole system

$$U'(x, y, d_2) = \frac{i}{\lambda d_2}\,e^{-ikd_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}U'(x', y', 0)\,e^{-ik\frac{(x-x')^2+(y-y')^2}{2d_2}}\,dx'\,dy' \tag{6.8}$$

By recasting

$$e^{-i\frac{k}{2d_2}\left[(x-x')^2+(y-y')^2\right]} = e^{-i\frac{k}{2d_2}(x^2+y^2)}\,e^{-i\frac{k}{2d_2}(x'^2+y'^2)}\,e^{i\frac{k}{d_2}(xx'+yy')}$$


we can turn the integrals into Fourier transforms and obtain

$$h(x, y) = \frac{-1}{\lambda^2 d_1 d_2}\,e^{-ik(d_1+d_2)}\,e^{-i\frac{k}{2d_2}(x^2+y^2)}\,P_1\!\left(\frac{k\,x}{d_2}, \frac{k\,y}{d_2}\right) \tag{6.9}$$

where $P_1(k_x, k_y)$ is the Fourier transform of the generalized pupil function

$$p_1(x, y) = p(x, y)\,e^{-i\frac{k}{2}\left(\frac{1}{d_1}+\frac{1}{d_2}-\frac{1}{f}\right)(x^2+y^2)} \tag{6.10}$$

If the system is focused, $\frac{1}{d_1}+\frac{1}{d_2}-\frac{1}{f} = 0$ and the generalized pupil function is simply the pupil function. If we also keep relatively on-axis such that $x^2 + y^2 \ll 2d_2/k$, we can write the really simple impulse response function for a single lens imaging system

$$h(x, y) \approx \frac{-1}{\lambda^2 d_1 d_2}\,e^{-ik(d_1+d_2)}\,P\!\left(\frac{k\,x}{d_2}, \frac{k\,y}{d_2}\right) \tag{6.11}$$

where $P(k_x, k_y)$ is the Fourier transform of $p(x, y)$. For a simple round aperture as a pupil, this is just the Airy pattern we discussed last week.

6.3 Gratings

A grating is a mask (amplitude or phase) periodic in one direction. For definiteness, let us suppose that the periodicity is in the x-direction. The transmission then obeys

$$t(x + \Lambda) = t(x) \tag{6.12}$$

where Λ is the periodicity of the grating. We can write such a periodic function as a discrete Fourier series

$$t(x) = \sum_{m=-\infty}^{+\infty}T_m\,e^{-im\frac{2\pi}{\Lambda}x} \tag{6.13}$$

$$T_m = \frac{1}{\Lambda}\int_0^{\Lambda}t(x)\,e^{im\frac{2\pi}{\Lambda}x}\,dx \tag{6.14}$$

where $m = 0, \pm 1, \pm 2, \dots$

Suppose that we have a monochromatic plane wave

$$U(x, y, 0) = U_0\,e^{-i(k_x x + k_y y)} \tag{6.15}$$

incident on the z = 0 plane of the grating. After the grating,

$$U'(x, y, 0) = U(x, y, 0)\,t(x) = U_0\sum_{m=-\infty}^{+\infty}T_m\,e^{-i\left[\left(k_x + m\frac{2\pi}{\Lambda}\right)x + k_y y\right]} \tag{6.16}$$

and we see that the overall effect of the grating is to create in principle an infinite number of beams with different x-components of the wavevector, corresponding to different integer values of m. The integer m is called the order of the diffraction. The intensity of a particular order is proportional to $|T_m|^2$. Figures 6.2 and 6.3 show examples of gratings in both transmission and reflection.

Figure 6.2: Diffraction from a grating in transmission.

Figure 6.3: Diffraction from a grating in reflection.

Assuming incidence in the xz plane (i.e. no y component), the angle of the mth order from the grating is given by

$$k\sin\theta_2 = k\sin\theta_1 + m\frac{2\pi}{\Lambda}$$

where $\theta_1$ is the incidence angle and $\theta_2$ is the exit angle (measured from the normal to the grating). This is normally written in the form

$$\sin\theta_f + \sin\theta_i = m\frac{\lambda}{\Lambda} \tag{6.17}$$

where $\theta_f = \theta_2$ and $\theta_i = -\theta_1$ (i.e. we adopt a sign convention such that the incident and zero-order diffracted beam have opposite signs). Equation 6.17 is called the grating equation.
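As a quick way to explore the grating equation, the following sketch lists the exit angles of all propagating orders for a given wavelength, period and incidence angle. The function name and the `max_order` cutoff are illustrative assumptions; orders with |sin θf| > 1 are treated as non-propagating and skipped.

```python
import math


def diffraction_angles(wavelength, period, theta_i=0.0, max_order=5):
    """Return {m: theta_f} (radians) solving sin(theta_f) + sin(theta_i) = m * wavelength / period."""
    angles = {}
    for m in range(-max_order, max_order + 1):
        sin_f = m * wavelength / period - math.sin(theta_i)
        if abs(sin_f) <= 1.0:
            angles[m] = math.asin(sin_f)
    return angles
```

For normal incidence with a period equal to twice the wavelength, the first orders leave at 30 degrees from the normal, and only orders up to |m| = 2 can propagate.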

Gratings are typically used to measure wavelength by the amount of beam deflection at a given order (see for example problem 2 of last week’s exercises). They are also commonly used to manipulate polychromatic light sources, for example functioning as a filter to reduce bandwidth.

6.3.1 Example: harmonic amplitude transmission grating

The relative strengths of the orders of a grating depend on its form via equation 6.14. One example treated last week in the exercises was that of the “rectangular” slit profile. Here we’ll briefly go over another example, where

$$t(x) = A_0\sin(2\pi x/\Lambda)$$

For this form we can evaluate

$$T_m = \frac{1}{\Lambda}\int_0^{\Lambda}t(x)\,e^{im\frac{2\pi}{\Lambda}x}\,dx = \frac{iA_0}{2}\left(\delta_{m,1} - \delta_{m,-1}\right)$$

which means that for such a grating we have only the m = ±1 orders.

We could also work through a hypothetical harmonic phase grating with

$$t(x) = A_0\,e^{2\pi ix/\Lambda}$$

which would give

$$T_m = \frac{1}{\Lambda}\int_0^{\Lambda}t(x)\,e^{im\frac{2\pi}{\Lambda}x}\,dx = A_0\,\delta_{m,-1}$$

where only the m = −1 order is allowed.
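Order strengths for other grating profiles can be estimated numerically from eq. 6.14. The sketch below uses a midpoint-rule sum over one period; the function name, the sample count and the callable-based interface are assumptions made for illustration.

```python
import cmath
import math


def grating_coefficient(t, period, m, samples=2000):
    """Midpoint-rule estimate of T_m = (1/period) * integral_0^period t(x) exp(i m 2 pi x / period) dx."""
    dx = period / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * dx
        total += t(x) * cmath.exp(1j * m * 2 * math.pi * x / period)
    return total * dx / period


def order_intensities(t, period, max_order=3):
    """Relative intensities |T_m|^2 for m = -max_order .. max_order."""
    return {m: abs(grating_coefficient(t, period, m)) ** 2
            for m in range(-max_order, max_order + 1)}
```

Passing a sinusoidal profile such as `lambda x: math.sin(2 * math.pi * x / period)` shows that only the m = ±1 entries are appreciably different from zero, in line with the harmonic amplitude grating discussed above.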


6.4 Fresnel zone-plate

A circularly symmetric aperture with a transmission function

$$t(x, y) = \begin{cases} 1, & \text{for } \cos\left(\pi\dfrac{x^2+y^2}{\lambda f}\right) > 0 \\ 0, & \text{otherwise} \end{cases} \tag{6.18}$$

can be shown to function in some ways as a lens with focal length f. To rigorously show this is a little tiresome, so instead we just give a rough argument. The transmission function above gives a pattern of open rings. The center of the mth ring has a radius $\rho_m$ that satisfies

$$\rho_m^2/\lambda f = 2m \tag{6.19}$$

The path length of light from the center of one of these rings to a distance f along the optical axis is

$$\sqrt{\rho_m^2 + f^2} = \sqrt{2m\lambda f + f^2} \approx f(1 + m\lambda/f) = f + m\lambda \tag{6.20}$$

assuming $m\lambda \ll f$. We thus see that the contributions from all the rings will add constructively at the center of the plane z = f. Actually, this argument also applies for f/2, f/3 and so on. This leads to an important property of the Fresnel zone plate: it acts as a kind of lens that focuses at a series of multiple focal lengths f/n for integer n ≥ 1.
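The rough path-length argument can be checked with a few lines of arithmetic. This sketch computes the ring radii from eq. 6.19 and the extra path, in units of the wavelength, from each ring center to the on-axis point at distance f; the function names are hypothetical.

```python
import math


def zone_radius(m, wavelength, f):
    """Radius of the center of the m-th open ring, from eq. 6.19."""
    return math.sqrt(2 * m * wavelength * f)


def extra_path_in_wavelengths(m, wavelength, f):
    """(sqrt(rho_m^2 + f^2) - f) / wavelength; close to m when m * wavelength << f."""
    rho = zone_radius(m, wavelength, f)
    return (math.sqrt(rho ** 2 + f ** 2) - f) / wavelength


def transmission(x, y, wavelength, f):
    """Binary transmission function of eq. 6.18."""
    return 1 if math.cos(math.pi * (x * x + y * y) / (wavelength * f)) > 0 else 0
```

Because the extra path is very nearly an integer number of wavelengths for every ring, the contributions arrive in phase on axis, which is the content of eq. 6.20.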

Fresnel zone plates are actually quite important optics for x-rays, where it is difficult to make conventional lenses due to the small dispersion in most materials. Good quality zone plates with small focus sizes require very careful fabrication methods to make accurate ring patterns.

6.5 Holography

Holograms are transparencies that encode information required to reconstruct the optical wave from an object, including its amplitude and phase. If we could somehow produce thin materials with arbitrary complex transmittance, this would be easy: we just make t(x,y) for the transparency equal to U(x,y,0) for the object, and then later (to reconstruct the light from the object) we illuminate the back of the transparency with a plane wave. This is, however, not so easy since pretty much all optical detectors (including film) are sensitive only to optical intensity, and throw away phase. The phase is, however, an essential component of the wave. The trick to holography is to find ways to encode the phase in the intensity.

The way this is done is by interfering a reference wave with the object wave (the wave that you want to encode). See figure 6.4. We’ll call the reference wave $U_r$ and the object wave $U_0$. If these beams overlap on the z = 0 plane, we can record the intensity pattern on a transparency

$$t \propto |U_0 + U_r|^2 = |U_r|^2 + |U_0|^2 + U_r^* U_0 + U_r U_0^*$$

$$= I_r + I_0 + U_r^* U_0 + U_r U_0^*$$

$$= I_r + I_0 + 2\sqrt{I_r I_0}\cos[\arg(U_r) - \arg(U_0)]$$

Through the interference of the reference with the object wave, the transparency now has information on the phase.

Decoding the information on the transparency requires illuminating the transparency again with the reference wave $U_r$. The resulting wave (just after the transparency) is

$$U = tU_r \propto U_r I_r + U_r I_0 + I_r U_0 + U_r^2 U_0^* \tag{6.21}$$

If the reference wave is selected to be a uniform plane wave moving along the z axis, $\sqrt{I_r}\,e^{-ikz}$, we can divide eq. 6.21 by $\sqrt{I_r}$ and write, at z = 0,

$$U(x, y) \propto I_r + I_0(x, y) + \sqrt{I_r}\,U_0(x, y) + \sqrt{I_r}\,U_0^*(x, y) \tag{6.22}$$

The third term is essentially our goal (the reconstructed complex wave from the object). The first term is just the intensity from the reference wave. The second term is related to the intensity from the object wave, often called the ambiguity wave. The fourth term is proportional to the complex conjugate of the object wave, appropriately enough often called the conjugate wave. The remaining trick to making a useful hologram is to figure out how to separate these waves from each other so that we see only the desired wave.

Before talking about how to do that, we’ll consider some really simple examples. Let’s suppose the object wave is a plane wave that makes an angle θ with the z axis, $U_0(x, y) = \sqrt{I_0}\,e^{-ikx\sin\theta}$. In this case

$$U(x, y) \propto I_r + I_0 + \sqrt{I_r I_0}\,e^{-ikx\sin\theta} + \sqrt{I_r I_0}\,e^{ikx\sin\theta} \tag{6.23}$$

In this simple case the transparency is just a sinusoidal grating (with an x-independent offset) that produces m = 0, ±1 beams.
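The plane-wave example can be made concrete by computing the recorded pattern directly from the two interfering fields. The sketch below assumes a reference wave of intensity `i_r` travelling along z and an object wave of intensity `i_0` tilted by `theta`; the function names and default intensities are illustrative only.

```python
import cmath
import math


def hologram_transmittance(x, wavelength, theta, i_r=1.0, i_0=0.25):
    """Recorded pattern |U_r + U_0|^2 at position x in the z = 0 plane."""
    k = 2 * math.pi / wavelength
    u_r = math.sqrt(i_r)                                              # reference wave at z = 0
    u_0 = math.sqrt(i_0) * cmath.exp(-1j * k * x * math.sin(theta))   # tilted object wave
    return abs(u_r + u_0) ** 2


def fringe_period(wavelength, theta):
    """Spatial period of the recorded sinusoidal grating, wavelength / sin(theta)."""
    return wavelength / math.sin(theta)
```

Larger angles between the object and reference waves give finer fringes, which is why recording media for holography need high spatial resolution.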

Another example is that of a point source object located at $\mathbf{r}_0 = (0, 0, -d)$. In this case the object wave is a spherical wave with

$$U_0(x, y) \propto \frac{e^{-ik|\mathbf{r} - \mathbf{r}_0|}}{|\mathbf{r} - \mathbf{r}_0|}$$

where $\mathbf{r} = (x, y, 0)$. In this case the first term of eq. 6.22 is a plane wave moving in the +z direction; the second term (the ambiguity wave) is proportional to $1/|\mathbf{r} - \mathbf{r}_0|^2$ and moves also in the +z direction but with a small angular spread. The third term is proportional to the amplitude of the original wave and continues to spread out; the fourth term (the conjugate wave) is a converging spherical wave that comes to a focus at (0, 0, d).

Normally the separation of the reconstructed wave from the other three waves is done by ensuring a large angular separation between the reference wave and the object wave, as shown in figure 6.4. Since the reference and ambiguity waves both move in the +z direction, they are automatically discarded. The conjugate wave moves as the “mirror image” of the object wave, so it is also discarded by looking only at the range of angles where the object wave appears.


Figure 6.4: Transmission hologram. (a) generation, (b) reconstruction.

Figure 6.5: Physical methods for constructing and displaying a transmission hologram.


This type of hologram requires the use of high-coherence, monochromatic light sources for both creation and reconstruction. Since these are not always available, there are some variants that can make do with a lower coherence source for reconstruction.

One option in this regard is called volume holography, which involves using a transparency with a significant depth to record the hologram. The volume diffraction pattern then acts as a wavelength filter and so white light can be used to reconstruct the hologram. This is sometimes used to make reflection holograms.

Another option that you are probably familiar with is the rainbow hologram. This is the kind of hologram that often appears on credit cards for anti-counterfeit measures. The rainbow hologram is constructed as a transmission hologram but with a narrow horizontal slit placed between the object and the transparency plane. If reconstructed with monochromatic light of the same frequency as the original reference, the reconstructed wave looks like we are seeing the object through a slit. Changing the wavelength of the reference has the interesting effect that the slit appears vertically displaced. If white light is used, the reconstructed wave looks like the object seen through many displaced slits, all of a different color. The reconstructed wave has parallax in the horizontal direction but not in the vertical.

6.6 Paraxial ray optics

Quite often a full Fourier treatment of simple optical elements is unnecessary, and it suffices to use the ray approximation. This is especially true for situations where the effects of diffraction can be neglected: for example, when all apertures or beam sizes are much larger than the wavelength of the light. Some of the tools of ray optics can also be carried over to a wave optics description in some cases, as we will see with Gaussian beams.

First, some definitions: a ray is simply a normal to a wavefront, an “arrow” that points in the direction that part of the wave propagates. Here we’ll be making as usual the paraxial approximation, which in this case says that all the rays make small angles with respect to a predefined z-axis.

We describe an individual ray as a 2D vector (r, θ) where r is the distance of the ray from the z-axis and θ is the angle the ray deviates from the direction of the z-axis. For a converging lens with focal length f, we can relate the vector elements before and after the lens as follows (assuming sin θ ≈ θ):

$$r_2 = r_1 \tag{6.24}$$

$$\theta_2 = \theta_1 - r_2/f \tag{6.25}$$

We can then write a matrix to relate the vectors

$$\begin{pmatrix} r_2 \\ \theta_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}\begin{pmatrix} r_1 \\ \theta_1 \end{pmatrix} \tag{6.26}$$

Figure 6.6: Sketch of a ray moving through a lens

For propagation through a homogeneous medium of length d,

$$r_2 = r_1 + d\,\theta_1 \tag{6.27}$$

$$\theta_2 = \theta_1 \tag{6.28}$$

yielding the matrix

$$\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \tag{6.29}$$

For a planar interface between two homogeneous media with indices of refraction $n_1$ and $n_2$, the matrix is

$$\begin{pmatrix} 1 & 0 \\ 0 & n_1/n_2 \end{pmatrix} \tag{6.30}$$

And for a spherical mirror with concave radius of curvature R we get

$$\begin{pmatrix} 1 & 0 \\ -2/R & 1 \end{pmatrix} \tag{6.31}$$

To model the effect of a series of such components on the ray, all we have to do is multiply the matrices together, with the rightmost matrix representing the first optical element. For example, a lens followed by a distance of propagation would be

$$\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} = \begin{pmatrix} 1 - d/f & d \\ -1/f & 1 \end{pmatrix} \tag{6.32}$$

Note that this treatment does nothing to calculate actual reflectivities or transmissions; this is just a way to keep track of the positions of particular rays.
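The bookkeeping with ray matrices is straightforward to automate. The sketch below represents each element as a nested list and composes a system from elements given in the order the ray meets them, so that later elements multiply from the left; all function names are illustrative assumptions.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]      # eq. 6.26


def free_space(d):
    return [[1.0, d], [0.0, 1.0]]             # eq. 6.29


def system_matrix(elements):
    """Compose elements listed in the order the ray encounters them."""
    m = [[1.0, 0.0], [0.0, 1.0]]
    for element in elements:
        m = matmul2(element, m)               # later elements act from the left
    return m


def trace(m, r, theta):
    """Apply a ray matrix to the ray (r, theta)."""
    return m[0][0] * r + m[0][1] * theta, m[1][0] * r + m[1][1] * theta
```

For example, `system_matrix([thin_lens(f), free_space(f)])` sends any ray that enters parallel to the axis to r = 0, which is just the statement that a lens focuses a collimated beam at its focal plane.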

6.7 Gaussian beam optics

We now return to the behavior of Gaussian beams that we briefly discussed last week. We’ll now consider how these waves behave in response to common optical elements like lenses and curved mirrors. We will use this later to discuss resonant cavities.

Recall from last week the form of the Gaussian beam with waist at $z = z_w$

$$U(\mathbf{r}) = A_0\frac{W_0}{W(z)}\,e^{-\frac{\rho^2}{W^2(z)}}\,e^{-ikz - ik\frac{\rho^2}{2R(z)} + i\zeta(z)} \tag{6.33}$$

with beam parameters

$$W(z) = W_0\sqrt{1 + \left(\frac{z - z_w}{z_0}\right)^2} \tag{6.34}$$

$$R(z) = (z - z_w)\left[1 + \left(\frac{z_0}{z - z_w}\right)^2\right] \tag{6.35}$$

$$\zeta(z) = \tan^{-1}\frac{z - z_w}{z_0} \tag{6.36}$$

$$W_0 = \sqrt{\frac{\lambda z_0}{\pi}} \tag{6.37}$$

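Consider a lens at z = 0 aligned normal to the z axis. The complex transmittance of the lens is proportional to $e^{ik\rho^2/2f}$ (compare eq. 6.1). Writing the phase factor of the beam in eq. 6.33 as $e^{-i(kz + k\rho^2/2R - \zeta)}$, the phase in the parentheses is transformed as follows:

$$kz + k\frac{\rho^2}{2R} - \zeta \;\rightarrow\; kz + k\frac{\rho^2}{2R} - \zeta - k\frac{\rho^2}{2f} = kz + k\frac{\rho^2}{2R'} - \zeta$$

where

$$\frac{1}{R'} = \frac{1}{R} - \frac{1}{f} \tag{6.38}$$

is the new wavefront curvature. The size of the intensity profile is unchanged. We can then use equations 6.34-6.37 to determine $z_0'$ and consequently all the other beam parameters. Here we show the results without the somewhat tedious algebra:

$$z_0' = M^2 z_0 \tag{6.39}$$

$$M = \left|\frac{f}{\sqrt{(z_w + f)^2 + z_0^2}}\right| \tag{6.40}$$

$$W_0' = MW_0 \tag{6.41}$$

$$(z_w' - f) = -M^2(z_w + f) \tag{6.42}$$

$$\theta_0' = \frac{\theta_0}{M} \tag{6.43}$$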

6.7.1 ABCD law

There is an extremely convenient methodology for calculating the evolution of a Gaussian beam through a series of paraxial optical components. It makes use of the complex beam parameter

$$\frac{1}{q(z)} = \frac{1}{R(z)} - i\frac{\lambda}{\pi W^2(z)} \tag{6.44}$$

which implies

$$q(z) = (z - z_w) + iz_0 \tag{6.45}$$

and allows us to write the complex wave as

$$A(\mathbf{r}) = \frac{A_1}{q(z)}\,e^{-ik\frac{\rho^2}{2q(z)}} \tag{6.46}$$

Note that the complex value of q implicitly specifies all properties of the beam except for its overall amplitude/phase and the direction of propagation (which we here always assume is along z).

The ABCD law states that for any paraxial optical element there exists a matrix defined by elements A, B, C and D such that the q parameters before and after are related by

$$q_2 = \frac{Aq_1 + B}{Cq_1 + D} \tag{6.47}$$

It turns out that these matrices are identical to the matrices derived for paraxial ray optics. Rather than show this in general, we will cover some important examples that pretty much prove this for most practical uses.

For free space of length d, the appropriate ABCD matrix is

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \tag{6.48}$$

which makes

$$q_2 = q_1 + d$$

which of course makes a lot of sense when looking at equation 6.45: we are just advancing a bit in the propagation direction, which is just adding a real constant to q.

For a “thin” optical component (like the lens) with the same medium on either side, we assume that the beam size is the same both before and after, but the wavefront curvature will change. For the lens with focal length f,

$$\frac{1}{R_2} = -\frac{1}{f} + \frac{1}{R_1}$$

and

$$\frac{1}{q_2} = -\frac{1}{f} + \frac{1}{q_1}$$

The ABCD matrix for the lens is then

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} \tag{6.49}$$

As with the ray optics matrices, the nicest thing about the (identical) ABCD matrices is that for a series of optical elements we can just multiply the matrices, such that the rightmost matrix is the element that the beam sees first (you will show this in a future exercise). For example, the matrix for a lens followed by a distance of free space is

$$\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix} = \begin{pmatrix} 1 - d/f & d \\ -1/f & 1 \end{pmatrix}$$

and so

$$q_2 = \frac{(1 - d/f)\,q_1 + d}{-q_1/f + 1}$$

which gives a new beam after these optical elements. This is a handy way to shape Gaussian beams to have parameters well-suited for particular applications. One example is laser cutting. Given a Gaussian-like beam (say from a high-power laser), suppose we would like to use this beam to cut a small hole in a material. For this we need to focus the energy of the beam into as small an area as possible, both to make the hole small and to have enough power to drill the material. We can use a series of lenses with controlled separations to control both the size and position of the waist of the beam.
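A short sketch of the ABCD law in code, using Python's built-in complex numbers for q. The helper names are hypothetical, and `lens_then_space` simply hard-codes the matrix product for a lens followed by free space given above.

```python
import math


def apply_abcd(q, matrix):
    """Transform the complex beam parameter q with eq. 6.47."""
    (a, b), (c, d) = matrix
    return (a * q + b) / (c * q + d)


def lens_then_space(f, d):
    """ABCD matrix for a thin lens of focal length f followed by free space of length d."""
    return ((1 - d / f, d), (-1 / f, 1.0))


def beam_from_q(q, wavelength):
    """Read the beam properties encoded in q = (z - zw) + i z0 (eq. 6.45)."""
    z0 = q.imag
    return {
        "distance_past_waist": q.real,
        "z0": z0,
        "W0": math.sqrt(wavelength * z0 / math.pi),     # eq. 6.37
    }
```

Starting from q = i·z0 at the lens (input waist on the lens) and scanning d until `distance_past_waist` crosses zero locates the new waist; the waist radius then follows from the imaginary part of q.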


Chapter 7

Gaussian and other beams

Learning objectives

Note that most of this week’s lectures will be finishing the material discussed in “week 6” that we did not quite get to. This document is a supplement to this. Learning goals for this supplement are:

• Identify basic characteristics of Hermite-Gaussian beams

• Identify basic characteristics of Laguerre-Gaussian beams

• Identify basic characteristics of Bessel beams

• Qualitatively explain some practical applications for Bessel beams

7.1 Paraxial Helmholtz equation revisited

Reference: Saleh & Teich, section 2.1C.

Although we briefly talked about the paraxial Helmholtz equation in week 5, here we come back to it with a more complete derivation.

The Helmholtz equation is

$$\nabla^2 U + k^2 U = 0 \tag{7.1}$$

If we now write

$$U = A(x, y, z)\,e^{-ikz} \tag{7.2}$$

and substitute this into the Helmholtz equation we obtain

$$\left(\nabla^2 A\right)e^{-ikz} - k^2 A\,e^{-ikz} - 2ik\frac{\partial A}{\partial z}e^{-ikz} + k^2 A\,e^{-ikz} = 0$$

$$\nabla^2 A - 2ik\frac{\partial A}{\partial z} = 0$$


Under the paraxial approximation, we assume that A does not change significantly as z varies over a scale of one wavelength. This implies that

$$\frac{\partial A}{\partial z}\lambda \ll A$$

or

$$\frac{\partial A}{\partial z} \ll \frac{kA}{2\pi} < kA$$

Similarly, the derivative ∂A/∂z also varies slowly over Δz = λ, and so

$$\frac{\partial^2 A}{\partial z^2} \ll k\frac{\partial A}{\partial z}$$

This allows us to discard the second derivative of A with respect to z and write

$$\nabla_T^2 A - 2ik\frac{\partial A}{\partial z} = 0 \tag{7.3}$$

which is known as the paraxial Helmholtz equation (here $\nabla_T^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2$). This is also called the slowly varying envelope approximation (where $|A|^2$ is proportional to the intensity envelope).

To see the relationship between the paraxial Helmholtz equation and the Fourier optics implementation of the paraxial approximation, let’s take the Fourier Transform with respect to x and y:

$$-(k_x^2 + k_y^2)B - 2ik\frac{\partial B}{\partial z} = 0$$

where $B(k_x, k_y, z)$ is the FT of A. Solving this differential equation for B gives

$$B(k_x, k_y, z) = B(k_x, k_y, 0)\,e^{iz(k_x^2 + k_y^2)/2k} \tag{7.4}$$

which is exactly our expectation from the Fresnel approximation in Fourier optics.

7.2 Hermite-Gaussian beams

Reference: See Saleh & Teich, section 3.3

The Gaussian beam is only one special case of a non-plane wave solution to the Helmholtz equation in the paraxial approximation. For the purposes of describing the EM radiation in the resonant cavity of a laser, it is of particular importance to consider solutions to the paraxial Helmholtz equation with approximately spherical wavefront curvatures, since these can match the shape of the spherical mirrors that are typically used in laser cavities. We’ll see a bit later that this makes these kinds of waves self-reproducing inside the cavity, and thus form a natural way to describe the output of such resonators.


The complex envelope of a Gaussian beam is

AG(x, y, z) =A1

q(z)e−ik(x2+y2)/2q(z) (7.5)

with q(z) = z + iz0. The beam width W (z) is

W (z) = W0

√1 +

(z

z0

)2

(7.6)

and the radius of wavefront curvature is

R(z) = z

[1 +

(z0

z

)2]

(7.7)

Let’s consider a modified version of such a beam

A(x, y, z) = X [√

2x/W (z)]Y [√

2y/W (z)]eiZ(z)AG(x, y, z) (7.8)

where X , Y and Z are all real functions. Further, we will assume that Z(z) is a slowlyvarying function of z (more precisely, dZ/dz kZ).

Because X and Y are real, the phase of this beam is the same as for the correspondingGaussian beam except for an extra phase given by Z(z) that depends only slowly on z. Thisimplies that both the modified and Gaussian beams have wavefronts with approximatelythe same radius of curvature. This further implies that they are affected in the same wayby thin lenses and curved mirrors.

The intensity envelope of the beam is

|A|2 =|A1|2z2

0

X 2[√

2x/W (z)]Y2[√

2y/W (z)]

[W0

W (z)

]2

e−2(x2+y2)/W 2(z) (7.9)

You’ll notice that this can be considered as a function of x/W (z), y/W (z). The value ofW (z) just rescales the size of the beam in the transverse directions. In the end this is justa Gaussian beam modulated by X 2 and Y2.

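By plugging eq. 7.8 into the paraxial Helmholtz equation, we can (after considerable calculus and algebra) obtain

$$\frac{1}{\mathcal{X}}\left(\frac{\partial^2\mathcal{X}}{\partial u^2} - 2u\frac{\partial\mathcal{X}}{\partial u}\right) + \frac{1}{\mathcal{Y}}\left(\frac{\partial^2\mathcal{Y}}{\partial v^2} - 2v\frac{\partial\mathcal{Y}}{\partial v}\right) + kW^2(z)\frac{\partial\mathcal{Z}}{\partial z} = 0 \tag{7.10}$$

where $u = \sqrt{2}\,x/W(z)$, $v = \sqrt{2}\,y/W(z)$, and we have used the fact that $A_G$ by itself satisfies the paraxial Helmholtz equation (see section 7.5 for a proof).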

Since each term of equation 7.10 is a function of a different independent variable, each of these terms must be constant, and these constants must add to zero. We may then separate this into three ordinary differential equations:

$$-\frac{1}{2}\frac{d^2\mathcal{X}}{du^2} + u\frac{d\mathcal{X}}{du} = \mu_1\mathcal{X} \tag{7.11}$$

$$-\frac{1}{2}\frac{d^2\mathcal{Y}}{dv^2} + v\frac{d\mathcal{Y}}{dv} = \mu_2\mathcal{Y} \tag{7.12}$$

$$z_0\left[1 + \left(\frac{z}{z_0}\right)^2\right]\frac{d\mathcal{Z}}{dz} = \mu_1 + \mu_2 \tag{7.13}$$

Equations 7.11 and 7.12 are identical eigenvalue problems. The eigenvalues are $\mu_1 = l$ and $\mu_2 = m$, where l and m are any non-negative integers. The eigenfunctions are Hermite polynomials $\mathcal{X}(u) = H_l(u)$ and $\mathcal{Y}(v) = H_m(v)$. Hermite polynomials are defined recursively by

$$H_{l+1}(u) = 2uH_l(u) - 2lH_{l-1}(u) \tag{7.14}$$

$$H_0(u) = 1, \quad H_1(u) = 2u \tag{7.15}$$

The first few Hermite polynomials beyond the starting conditions are

$$H_2(u) = 4u^2 - 2 \tag{7.16}$$

$$H_3(u) = 8u^3 - 12u \tag{7.17}$$
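The recursion 7.14 with the starting values 7.15 translates directly into a short loop. The following is a minimal sketch, not part of the original notes; the function names `hermite` and `eigen_residual` are illustrative choices, and the finite-difference step `h` is an assumed value. The second function checks numerically that $H_l$ satisfies the eigenvalue problem 7.11 with $\mu_1 = l$.

```python
def hermite(l, u):
    """Evaluate H_l(u) with the recursion H_{l+1} = 2u H_l - 2l H_{l-1},
    starting from H_0 = 1 and H_1 = 2u (eqs. 7.14 and 7.15)."""
    if l < 0:
        raise ValueError("order must be a non-negative integer")
    h_prev, h_curr = 1.0, 2.0 * u  # H_0 and H_1
    if l == 0:
        return h_prev
    for n in range(1, l):
        # step from (H_{n-1}, H_n) to (H_n, H_{n+1})
        h_prev, h_curr = h_curr, 2.0 * u * h_curr - 2.0 * n * h_prev
    return h_curr


def eigen_residual(l, u, h=1e-3):
    """Residual of -X''/2 + u X' - l X for X = H_l, using central
    differences with an assumed step h; it should be close to zero."""
    d1 = (hermite(l, u + h) - hermite(l, u - h)) / (2 * h)
    d2 = (hermite(l, u + h) - 2 * hermite(l, u) + hermite(l, u - h)) / h**2
    return -0.5 * d2 + u * d1 - l * hermite(l, u)
```

Evaluating `hermite(2, u)` and `hermite(3, u)` at any test point reproduces eqs. 7.16 and 7.17, which is a quick way to catch sign slips in the recursion.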

Using $\mu_1 = l$ and $\mu_2 = m$, equation 7.13 can be written

$$\frac{d\mathcal{Z}}{dz} = \frac{l + m}{z_0\left[1 + \left(z/z_0\right)^2\right]}$$

and integrating

$$\mathcal{Z}(z) = \int \frac{l + m}{z_0\left[1 + \left(z/z_0\right)^2\right]}\,dz = (l + m)\tan^{-1}(z/z_0) + C$$

Since the integration constant C just sets an overall phase for the entire wave, we can set C = 0, leaving us with

$$\mathcal{Z}(z) = (l + m)\tan^{-1}(z/z_0) \tag{7.18}$$

With these relations we can now write an expression for the full complex electric field amplitude $A(x, y, z)e^{-ikz}$ associated with particular eigenvalues (l, m)

$$U_{l,m}(x, y, z) = A_{l,m}\frac{W_0}{W(z)}\, G_l\!\left[\frac{\sqrt{2}\,x}{W(z)}\right] G_m\!\left[\frac{\sqrt{2}\,y}{W(z)}\right] \tag{7.19}$$

$$\qquad\times \exp\!\left[-ikz - ik\frac{x^2 + y^2}{2R(z)} + i(l + m + 1)\tan^{-1}(z/z_0)\right] \tag{7.20}$$

where $A_{l,m}$ is a constant and we have introduced the Hermite-Gaussian function

$$G_l(u) = H_l(u)\,e^{-u^2/2} \tag{7.21}$$

which, as its name suggests, is just a product of a Hermite polynomial and a Gaussian lineshape. Figure 7.1 shows some of the lower-order Hermite-Gaussian functions.


Figure 7.1: Low-order Hermite-Gaussian functions, from Saleh and Teich, p. 96

Figure 7.2: Intensity profiles for Hermite-Gaussian beams, from Wikipedia


Figure 7.2 shows plots of the intensity $|U_{l,m}|^2$ of these solutions, called Hermite-Gaussian beams. You’ll notice that higher order solutions all have nodes in either the horizontal or vertical direction. Hermite-Gaussian beams are often used as a basis to describe laser beams in resonators since these devices are often built with components that to some extent define two orthogonal transverse directions, making these modes an appropriate basis. In practice we can often spend a lot of effort trying to suppress contributions from higher order Hermite-Gaussian modes, since these are usually not so desirable in a working laser.

7.3 Laguerre-Gaussian beams

Although the Hermite-Gaussian series is a complete basis of solutions for the paraxial Helmholtz equation, we can also choose other bases that might be more appropriate depending on, for example, the symmetry of our resonator.

For situations with cylindrical symmetry, a more appropriate basis might be the Laguerre-Gaussian beams. Written in cylindrical coordinates $(\rho, \phi, z)$ the complex amplitude for these beams is

$$U_{l,m}(\rho, \phi, z) = A_{l,m}\frac{W_0}{W(z)}\left(\frac{\rho}{W(z)}\right)^{l} L_m^l\!\left(\frac{2\rho^2}{W^2(z)}\right) e^{-\rho^2/W^2(z)} \tag{7.22}$$

$$\qquad\times \exp\!\left[-ikz - ik\frac{\rho^2}{2R(z)} - il\phi + i(l + 2m + 1)\tan^{-1}(z/z_0)\right] \tag{7.23}$$

where $L_m^l(x)$ is the generalized Laguerre polynomial function

$$L_m^l(x) = \frac{x^{-l}e^{x}}{m!}\frac{d^m}{dx^m}\left(x^{l+m}e^{-x}\right) \tag{7.24}$$

For l = m = 0 we have the Gaussian beam. The intensities of all the “pure” Laguerre-Gaussian beams have cylindrical symmetry; the phases of the beams, however, do depend on φ, which allows linear combinations of Laguerre-Gaussian beams to not be cylindrically symmetric. For l ≠ 0 these beams all have zero intensity at the center. These l ≠ 0 beams have a helical wavefront, which allows them to carry orbital angular momentum and thus to impart a torque.
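As a rough numerical illustration of eqs. 7.22 and 7.24, the sketch below evaluates the generalized Laguerre polynomial and the relative radial intensity of a Laguerre-Gaussian mode. It is not from the original notes: the function names are hypothetical, and instead of differentiating eq. 7.24 directly it uses the standard closed-form sum that the Rodrigues formula expands to for non-negative integer l.

```python
from math import comb, exp, factorial


def gen_laguerre(l, m, x):
    """L_m^l(x) via the closed-form sum
    sum_{i=0}^{m} (-1)^i C(m+l, m-i) x^i / i!
    (the expansion of the Rodrigues formula 7.24 for integer l >= 0)."""
    return sum((-1) ** i * comb(m + l, m - i) * x**i / factorial(i)
               for i in range(m + 1))


def lg_radial_intensity(l, m, rho, w):
    """Relative intensity of the (l, m) Laguerre-Gaussian mode at radius rho
    in a plane where the beam width is w; overall constants are dropped."""
    amp = (rho / w) ** l * gen_laguerre(l, m, 2 * rho**2 / w**2) * exp(-rho**2 / w**2)
    return amp**2
```

For example, `lg_radial_intensity(1, 0, 0.0, 1.0)` vanishes, in line with the statement that all l ≠ 0 beams are dark on axis, while the l = m = 0 case reduces to the plain Gaussian profile.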

7.4 Bessel beams

Another possible approach to finding beam-solutions is to try to find beams that are like plane waves in terms of flat wavefronts, but have non-uniform intensity profiles in the transverse direction. The form for such a beam would be

$$U(x, y, z) = A(x, y)\,e^{-i\beta z} \tag{7.25}$$


Figure 7.3: Intensity profiles for Laguerre-Gaussian beams, from Wikipedia.

For this to satisfy the Helmholtz equation $\nabla^2 U + k^2 U = 0$, we need

$$\nabla_T^2 A + k_T^2 A = 0 \tag{7.26}$$

where $k_T^2 + \beta^2 = k^2$. In polar coordinates the solution is

$$A(x, y) = A_m J_m(k_T\rho)\,e^{im\phi} \tag{7.27}$$

where m is any integer, $J_m(x)$ is the Bessel function of the first kind and mth order, and $A_m$ is a constant.

For m = 0, A(x, y) is purely real and so the wavefronts are planar. The intensity is circularly symmetric and does not depend at all on z (see figure 7.4). This is a bit surprising: such a beam does not diffract at all! This kind of beam is called a Bessel beam. Unlike Gaussian beams, Bessel beams are exact solutions to the Helmholtz equation and are less constrained by the requirements of the paraxial approximation.

For large values of ρ, $J_0^2(k_T\rho) \approx (2/\pi k_T\rho)\cos^2(k_T\rho - \pi/4)$. This is an oscillating function whose envelope decays inversely with ρ. The slowness of the decay implies that the cross-sectional RMS width

$$\sigma = \sqrt{\frac{\int_0^\infty \rho^2 J_0^2(k_T\rho)\,d\rho}{\int_0^\infty J_0^2(k_T\rho)\,d\rho}} \tag{7.28}$$

is infinite since the integral in the numerator diverges. The Bessel beam carries infinite power, which is a big contrast to the other beams that we have discussed. Consequently, it is not really possible to physically realize an exact Bessel beam.
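The slow 1/ρ decay of the envelope can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything from the notes: it evaluates $J_0$ from the standard integral representation $J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin t)\,dt$ with a midpoint rule (the step count is an arbitrary choice) and compares $J_0^2$ with the large-argument form quoted above.

```python
from math import cos, pi, sin


def bessel_j0(x, steps=2000):
    """J_0(x) from (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule.
    The number of steps is an assumed value, ample for x up to ~100."""
    h = pi / steps
    return sum(cos(x * sin((i + 0.5) * h)) for i in range(steps)) * h / pi


def j0_squared_asymptotic(x):
    """Large-argument form (2 / (pi x)) cos^2(x - pi/4) for J_0(x)^2."""
    return (2.0 / (pi * x)) * cos(x - pi / 4) ** 2
```

Comparing `bessel_j0(x) ** 2` with `j0_squared_asymptotic(x)` for increasing x shows the two converging, with peak heights falling only as 1/x; multiplying by the extra factor of ρ² in the numerator of eq. 7.28 therefore gives an integrand that grows without bound.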

Despite this, approximations to Bessel beams are important for a growing number of applications. In these cases the central part of the beam follows the Bessel profile, and the edges at some point get clipped off. These “clipped” Bessel beams still retain a resistance to diffraction. These beams require special optics to produce (see for example Garces-Chavez et al., Nature 419, 145-147 (2002)). One application for these beams is to use them to punch small holes in cell walls for biology applications. Here Bessel beams are useful since the energy/area of the central intensity peak is constant, regardless of z. This means that the biological tissue sees a very well controlled optical intensity that is just barely sufficient to bore a several micrometer-sized hole in a cell wall positioned over a wide possible range of z positions but not enough to do more widespread damage.

Figure 7.4: Intensity distribution of a Bessel beam (from M. Hegner, Nature 419, pp. 125-127 (2002)).

Another interesting application for Bessel beams is in “optical tweezers.” Optical tweezers refer generally to schemes that use focused laser light to control the motion of small particles. This is quite an involved subject to treat very generally, but to get the idea we’ll think about one particular example.

Figure 7.5 shows a simple case of a dielectric sphere with a diameter significantly larger than the wavelength of light. Since the refraction of light through the sphere causes a change in the direction of individual light rays, there is a change in momentum of the light which is taken up by the sphere. This has the tendency to push the sphere to the position of maximum intensity.

Although most tweezer applications have so far been done with Gaussian beams, Bessel beams are attractive since the gradient is constant as a function of z. This can be used to “guide” particles in a line over distances that would be impractical with other kinds of beams (Arlt et al., Opt. Comm. 197, 239-245 (2001)). You can also make schemes with overlapping Bessel beams to get full 3D control. There are even some (involved) schemes to use Bessel beams to drag particles toward the source of the beam, so-called “tractor beams” (Brzobohaty et al. Nature Photonics 7, 123-127 (2013)).


Figure 7.5: Simple ray optics description of optical tweezers in dielectric spheres with a size much larger than the wavelength of the light. When there is a spatial gradient in the transverse intensity profile, the refraction of light rays at different transverse locations requires that the sphere take on a net momentum from the light that drives it toward the higher intensity region. (Figure from Neuman and Block, Rev. Sci. Instrum. 75, 2787 (2004).)

7.5 Appendix: Proof of eq. 7.10

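Let us define

$$F(x, y, z) = \mathcal{X}(u)\,\mathcal{Y}(v)\,e^{i\mathcal{Z}(z)} \tag{7.29}$$

so that $A = FA_G$. The paraxial Helmholtz equation then becomes

$$(\nabla_T^2 F)A_G + 2(\nabla_T F)\cdot(\nabla_T A_G) + F\nabla_T^2 A_G - 2ik\frac{\partial F}{\partial z}A_G - 2ikF\frac{\partial A_G}{\partial z} = 0$$

The third and fifth terms together vanish, since $A_G$ is itself a solution of the paraxial Helmholtz equation, and so we are left with

$$(\nabla_T^2 F)A_G + 2(\nabla_T F)\cdot(\nabla_T A_G) - 2ik\frac{\partial F}{\partial z}A_G = 0$$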

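Working on each term:

$$(\nabla_T^2 F)A_G = \left[\frac{d^2\mathcal{X}}{du^2}\mathcal{Y} + \mathcal{X}\frac{d^2\mathcal{Y}}{dv^2}\right]\left(\frac{2e^{i\mathcal{Z}}}{W^2}\right)A_G$$

$$2(\nabla_T F)\cdot(\nabla_T A_G) = \left[\frac{d\mathcal{X}}{du}\mathcal{Y}\frac{\partial A_G}{\partial x} + \mathcal{X}\frac{d\mathcal{Y}}{dv}\frac{\partial A_G}{\partial y}\right]\left(\frac{2\sqrt{2}\,e^{i\mathcal{Z}}}{W}\right)$$

$$= -2ik\left[\frac{d\mathcal{X}}{du}\mathcal{Y}u + \mathcal{X}\frac{d\mathcal{Y}}{dv}v\right]\left(\frac{e^{i\mathcal{Z}}}{q}\right)A_G$$

$$= -2ik\left[\frac{1}{\mathcal{X}}\frac{d\mathcal{X}}{du}u + \frac{1}{\mathcal{Y}}\frac{d\mathcal{Y}}{dv}v\right]\left(\frac{1}{q}\right)FA_G$$

$$2ik\frac{\partial F}{\partial z}A_G = 2ik\left[\frac{d\mathcal{X}}{du}\frac{\partial u}{\partial z}\mathcal{Y}e^{i\mathcal{Z}} + \mathcal{X}\frac{d\mathcal{Y}}{dv}\frac{\partial v}{\partial z}e^{i\mathcal{Z}} + i\mathcal{X}\mathcal{Y}\frac{d\mathcal{Z}}{dz}e^{i\mathcal{Z}}\right]A_G$$

$$= 2ik\left[-\left(\frac{1}{\mathcal{X}}\frac{d\mathcal{X}}{du}u + \frac{1}{\mathcal{Y}}\frac{d\mathcal{Y}}{dv}v\right)\frac{1}{W}\frac{dW}{dz} + i\frac{d\mathcal{Z}}{dz}\right]FA_G$$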

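Plugging these all in and multiplying by $W^2/2FA_G$ yields

$$\left[\frac{1}{\mathcal{X}}\frac{d^2\mathcal{X}}{du^2} + \frac{1}{\mathcal{Y}}\frac{d^2\mathcal{Y}}{dv^2}\right] - ik\left[\frac{1}{\mathcal{X}}\frac{d\mathcal{X}}{du}u + \frac{1}{\mathcal{Y}}\frac{d\mathcal{Y}}{dv}v\right]\left(\frac{W^2}{q} - W\frac{dW}{dz}\right) + kW^2\frac{d\mathcal{Z}}{dz} = 0$$

From the definition of q and eq. 7.6,

$$\frac{W^2}{q} - W\frac{dW}{dz} = \frac{W_0^2(z_0^2 + z^2)}{z_0^2(z + iz_0)} - W_0\sqrt{1 + (z/z_0)^2}\cdot\frac{W_0 z}{z_0^2\sqrt{1 + (z/z_0)^2}}$$

$$= \frac{W_0^2(z_0^2 + z^2)z}{z_0^2(z^2 + z_0^2)} - i\frac{W_0^2(z_0^2 + z^2)z_0}{z_0^2(z^2 + z_0^2)} - \frac{W_0^2 z}{z_0^2}$$

$$= \frac{W_0^2 z}{z_0^2} - i\frac{W_0^2}{z_0} - \frac{W_0^2 z}{z_0^2}$$

$$= -i\frac{\lambda z_0}{\pi z_0} = -\frac{2i}{k}$$

where for the penultimate step we used the definition $W_0 = \sqrt{\lambda z_0/\pi}$. Using this we can now recover eq. 7.10.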
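The key identity of this appendix, $W^2/q - W\,dW/dz = -2i/k$, is easy to spot-check numerically. The following sketch is only a sanity check under assumed values (the function name, the finite-difference step and the sample numbers are all illustrative); it uses $q = z + iz_0$, eq. 7.6 for $W(z)$ and $W_0 = \sqrt{\lambda z_0/\pi}$.

```python
from math import pi, sqrt


def separation_factor(z, z0, wavelength, h=1e-6):
    """Return (W^2/q - W dW/dz, -2i/k) for a Gaussian beam with q = z + i z0.
    dW/dz is taken by central difference with an assumed step h."""
    k = 2 * pi / wavelength
    w0 = sqrt(wavelength * z0 / pi)

    def width(zz):
        # beam width W(z) from eq. 7.6
        return w0 * sqrt(1 + (zz / z0) ** 2)

    dw_dz = (width(z + h) - width(z - h)) / (2 * h)
    q = complex(z, z0)
    return width(z) ** 2 / q - width(z) * dw_dz, -2j / k
```

Calling `separation_factor(0.05, 0.1, 1e-6)` returns two complex numbers that agree to within the finite-difference error, independent of the chosen z.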


Chapter 8

Optical resonators

Learning objectives

This week is devoted to the physics of optical resonators. For reference: Saleh and Teich, chapter 10. Goals include:

• Describe the key characteristics of an optical resonator

• Apply criteria for identifying stable resonator configurations (eq. 8.7)

• Describe the eigenmodes of spherical mirror resonators

• Calculate the frequencies allowed in such resonators (eq. 8.28)

8.1 What are optical resonators?

Optical resonators are devices that confine and store light at particular frequencies. They do this by guiding the propagation of light along a path that causes it to come back along itself. There are many different types of optical resonators. In fact, we have discussed one kind already: the Fabry-Perot resonator, constructed by two parallel interfaces that reflect light back on itself.

In general, (ideal) optical resonators share the property of supporting only specific, discrete frequencies of light. These frequencies are called the longitudinal modes of the resonator, since they are related to the propagation of the light. This we saw back in week 4 when discussing the output of a Fabry-Perot etalon, one type of optical resonator. We’ll see another example of this in the following discussion of spherical mirror resonators. Most resonators also support only specific spatial beam shapes, called transverse resonator modes.

For the lectures this week we will be talking about the most important type of resonator for laser applications, the spherical mirror resonator. We’ll first discuss the conditions required to make a stable resonator, and then talk about the properties of light confined within such a device. A conventional optical laser is based around such resonators.


Figure 8.1: A spherical mirror resonator.

8.2 Spherical mirror resonators

An example of a spherical mirror resonator is given in figure 8.1. In this context spherical means that the reflective surface of the mirror has the shape of a sphere with radius R. Here we will use a convention where R is positive if the mirror is concave (that is, if the reflective surface is directed toward the center of curvature). We can also have R < 0 mirrors which are called convex. For those following along in the textbook, please note that Saleh and Teich use the opposite sign convention. This is unfortunate, but both conventions are used and for our purposes it is more convenient to adopt a convention where R is usually a positive number.

The first question we must ask about these kinds of resonators is whether or not they are stable. In this context, stability means that the resonator can support a real beam of light that is confined to the resonator. This is an important criterion for designing this kind of device.

The paraxial ray matrix for one round-trip through a spherical resonator is given by

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\frac{2}{R_1} & 1 \end{pmatrix}\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -\frac{2}{R_2} & 1 \end{pmatrix}\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix} \tag{8.1}$$

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 - \frac{2d}{R_2} & 2d\left(1 - \frac{d}{R_2}\right) \\ 2\left(\frac{2d}{R_1R_2} - \frac{1}{R_1} - \frac{1}{R_2}\right) & \frac{4d^2}{R_1R_2} - \frac{4d}{R_1} - \frac{2d}{R_2} + 1 \end{pmatrix} \tag{8.2}$$

where d is the separation between the mirrors, $R_1$ is the radius of curvature of one mirror and $R_2$ is the radius of curvature of the other. We will now try to find whether there exists a Gaussian beam that can be confined in this resonator. The requirement for confinement is that the beam parameters (waist position and spot size) be exactly the same after one round-trip through the cavity. This can be mathematically expressed using the ABCD law:

$$q = \frac{Aq + B}{Cq + D} \tag{8.3}$$

which is a quadratic equation for the complex beam parameter q. Solving for q gives

$$q = \frac{(A - D) \pm \sqrt{D^2 + A^2 - 2AD + 4BC}}{2C} \tag{8.4}$$

Since A, B, C and D are all real numbers, q is complex if and only if

$$D^2 + A^2 - 2AD + 4BC < 0 \tag{8.5}$$

What would it mean physically if q were a real number? Recall that $q = z - z_w + iz_0$. The real part of q is the position of the waist, but the imaginary part is the Rayleigh range, which gives the depth of focus of the beam. The smaller the value of $z_0$, the larger the divergence of the beam. If $z_0 \to 0$, this would mean that the beam diverges infinitely, which is a bit of a problem if we want to use spherical mirrors in a paraxial approximation. In this circumstance there is literally no mirror big enough to contain the beam! Consequently, we must look for solutions where 8.5 is satisfied. Well outside this condition the resonator is unstable.

To simplify this relation, we will use a bit of a trick. Note first that for each of the matrices we used to construct the ABCD matrix, the determinant is equal to 1. Since the determinant of a product of matrices is equal to the product of the determinants, this implies that $AD - BC = 1$. We can use this to simplify the stability criterion to

$$\left|\frac{A + D}{2}\right| < 1 \tag{8.6}$$

or

$$0 < \left(1 - \frac{d}{R_1}\right)\left(1 - \frac{d}{R_2}\right) < 1 \tag{8.7}$$

which gives an explicit requirement for resonator stability. We can define so-called g-parameters $g_1 = 1 - d/R_1$ and $g_2 = 1 - d/R_2$ and write this as

$$0 < g_1g_2 < 1 \tag{8.8}$$
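The chain from eq. 8.1 to eq. 8.8 can be checked with a few lines of code. This is a minimal sketch with hypothetical helper names, not an implementation from the notes: it multiplies the four ray matrices of eq. 8.1 in order, and exposes both forms of the stability test so they can be compared for any choice of $R_1$, $R_2$ and d (all lengths in the same units).

```python
def matmul2(m, n):
    """Product of two 2x2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def round_trip_matrix(r1, r2, d):
    """ABCD matrix of eq. 8.1: mirror 1, gap, mirror 2, gap (left to right)."""
    def mirror(r):
        return ((1.0, 0.0), (-2.0 / r, 1.0))

    gap = ((1.0, d), (0.0, 1.0))
    m = mirror(r1)
    for factor in (gap, mirror(r2), gap):
        m = matmul2(m, factor)
    return m


def is_stable(r1, r2, d):
    """Stability test in the form of eq. 8.6, |(A + D)/2| < 1."""
    (a, _), (_, dd) = round_trip_matrix(r1, r2, d)
    return abs((a + dd) / 2.0) < 1.0


def g_product(r1, r2, d):
    """g1 * g2 with g_i = 1 - d/R_i, the quantity appearing in eq. 8.8."""
    return (1 - d / r1) * (1 - d / r2)
```

For $R_1 = 1$, $R_2 = 2$, $d = 0.5$ the product matrix has unit determinant and $(A + D)/2 = 2g_1g_2 - 1$, which is why the two forms of the criterion agree.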

Quite often (and in our text) the resonator stability condition is expressed in a slightly different way:

$$0 \le g_1g_2 \le 1 \tag{8.9}$$

where equality at the end points is permitted. This is, however, not really practically meaningful since the g parameters are physical quantities based on distances and are therefore never known with exact precision. Also, as we discussed above, the equality condition results in a resonator with infinite beam sizes at the mirrors. In any case, the accepted terminology is to say that the resonator is stable when the strict inequalities are well-satisfied, and the resonator is conditionally stable if they are near the end-ranges.

An example of a conditionally stable resonator is the Fabry-Perot etalon. If constructed out of parallel plane mirrors, the etalon corresponds to $R_1 \to \infty$ and $R_2 \to \infty$, which makes $g_1 = g_2 = 1$ and so $g_1g_2 = 1$. Thus the Fabry-Perot etalon of a finite transverse size cannot store a Gaussian beam indefinitely, even if the mirrors have perfect reflectivities. This is also easy to see using a ray-tracing method: even if the plates are exactly parallel, only one ray direction (that of a ray exactly perpendicular to the plates) will retrace its own path. Since real beams of finite size always diffract to some extent, real beams will always involve rays that deviate from this one particular direction and are unconfined. This does not mean that Fabry-Perot etalons are useless, since this process of losing the beam may take many, many round-trips. It does mean, however, that they are less effective in storing light for a long time than a truly stable type of resonator. Even unstable resonators can be used to make lasers, but with the understanding that losses are very high since there is no true long-term confinement. This is in fact sometimes quite useful, particularly in high-power applications where very strong out-coupling of laser energy is desirable.

Figure 8.2 shows some typical configurations for spherical resonators. Symmetric resonators are ones in which $R_1 = R_2$. A symmetric confocal resonator has $d = R_1 = R_2$, so that the center of the resonator is the focal point of both mirrors. A symmetric concentric resonator has $R_1 = R_2 = d/2$ (and is only conditionally stable).

Let’s now try to find out the characteristics of the trapped Gaussian beam in a stable resonator. Re-writing equation 8.3 in terms of 1/q gives

$$B\left(\frac{1}{q}\right)^2 + (A - D)\frac{1}{q} - C = 0 \tag{8.10}$$

which solves to

$$\frac{1}{q} = \frac{D - A \pm i\sqrt{4 - (A + D)^2}}{2B} \tag{8.11}$$

The real part of this gives us the value of 1/R for the beam at the first mirror:

$$\frac{1}{R} = \frac{D - A}{2B} = -\frac{1}{R_1} \tag{8.12}$$

which just means that the curvature of the beam wavefront matches the curvature of the mirror. We can do a similar procedure to find that the wavefront at the other mirror also matches its curvature. This makes a lot of sense if we consider the requirement that the beam be self-replicating after each roundtrip.

We can find the position of the waist of the beam by taking the real part of equation 8.4:

$$\mathrm{Re}[q] = z - z_w = \frac{-d(d - R_2)}{2d - R_1 - R_2} \tag{8.13}$$

Here, negative numbers mean that the waist is located to the right of mirror 1; positive numbers mean that it is to the left. The imaginary part gives the Rayleigh range

$$z_0 = \sqrt{\frac{-d(d - R_1)(d - R_2)(d - R_1 - R_2)}{(2d - R_1 - R_2)^2}} \tag{8.14}$$

and the beam waist radius is $W_0 = \sqrt{\lambda z_0/\pi}$.

Figure 8.2: Some types of spherical mirror resonators (from Wikipedia).
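The closed forms 8.13 and 8.14 can be cross-checked against a direct solution of the quadratic 8.4. The sketch below does this under the assumption that the physical root is the one with positive imaginary part (so that $z_0 > 0$); the function names are illustrative, and the matrix elements are copied from eq. 8.2.

```python
from cmath import sqrt as csqrt


def self_consistent_q(r1, r2, d):
    """Solve q = (Aq + B)/(Cq + D) at mirror 1 via eq. 8.4 and return the
    root with Im(q) > 0 (assumed to be the physical one)."""
    a = 1 - 2 * d / r2
    b = 2 * d * (1 - d / r2)
    c = 2 * (2 * d / (r1 * r2) - 1 / r1 - 1 / r2)
    dd = 4 * d**2 / (r1 * r2) - 4 * d / r1 - 2 * d / r2 + 1
    disc = csqrt((a - dd) ** 2 + 4 * b * c)
    roots = [((a - dd) + sign * disc) / (2 * c) for sign in (1, -1)]
    return max(roots, key=lambda q: q.imag)


def closed_form(r1, r2, d):
    """(Re[q], z0) from eqs. 8.13 and 8.14, valid for a stable resonator."""
    re_q = -d * (d - r2) / (2 * d - r1 - r2)
    z0 = (-d * (d - r1) * (d - r2) * (d - r1 - r2)) ** 0.5 / abs(2 * d - r1 - r2)
    return re_q, z0
```

With $R_1 = 1$, $R_2 = 2$, $d = 0.5$ both routes give $\mathrm{Re}[q] = -0.375$, placing the waist 0.375 to the right of mirror 1 and therefore closer to the flatter mirror, as one would expect.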

8.3 Symmetric spherical mirror resonators

An important special case is the symmetric resonator where $R_1 = R_2$. In this section we’ll call the radius of both mirrors R. In this case the beam waist position is centered in the resonator, and

$$z_0 = \frac{d}{2}\sqrt{\frac{2R}{d} - 1} \tag{8.15}$$

$$W_0^2 = \frac{\lambda d}{2\pi}\sqrt{\frac{2R}{d} - 1} \tag{8.16}$$

$$W_1^2 = W_2^2 = \frac{\lambda d/\pi}{\sqrt{\frac{d}{R}\left(2 - \frac{d}{R}\right)}} \tag{8.17}$$

where $W_1$ and $W_2$ are the radii at mirrors 1 and 2. The stability criterion becomes

$$0 < \frac{d}{R} < 2 \tag{8.18}$$

Figure 8.3 shows the beam widths as a function of d/R across the stability range. The minimum widths at the ends of the resonator are achieved for the confocal case d = R. Here

$$z_0 = d/2 \tag{8.19}$$

$$W_0 = \sqrt{\lambda d/2\pi} \tag{8.20}$$

$$W_1 = W_2 = \sqrt{2}\,W_0. \tag{8.21}$$
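To get a feel for the numbers, the symmetric-resonator formulas can be wrapped in a small helper. This is an illustrative sketch only: the function name is made up, and the example values (a 633 nm wavelength and a 0.5 m confocal cavity) are assumptions chosen for convenience, not taken from the notes.

```python
from math import pi, sqrt


def symmetric_resonator(wavelength, d, r):
    """Return (z0, W0, W_mirror) for a symmetric resonator with mirror
    radius r and length d, using eqs. 8.15 and 8.17 and W0 = sqrt(lambda z0 / pi)."""
    if not 0 < d / r < 2:
        raise ValueError("outside the stability range 0 < d/R < 2 (eq. 8.18)")
    z0 = 0.5 * d * sqrt(2 * r / d - 1)
    w0 = sqrt(wavelength * z0 / pi)
    w_mirror = sqrt((wavelength * d / pi) / sqrt((d / r) * (2 - d / r)))
    return z0, w0, w_mirror


# Hypothetical confocal example: 633 nm light, d = R = 0.5 m
z0, w0, w_mirror = symmetric_resonator(633e-9, 0.5, 0.5)
```

For these assumed values the waist radius comes out at a few tenths of a millimetre, and the spot size on the mirrors is larger by exactly $\sqrt{2}$, as in eq. 8.21.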

8.4 Resonance frequencies

So far we have talked only about the spatial envelope A(r) of the resonator modes. The self-replicating requirement, however, extends to the full electric field, including the rapidly oscillating phase. Recall that the phase of a Gaussian beam is

$$\phi(x, y, z) = kz - \tan^{-1}(z/z_0) + \frac{k(x^2 + y^2)}{2R(z)} \tag{8.22}$$

Figure 8.3: Beam widths as a function of d/R for a symmetric spherical mirror resonator. The y-axis here is in units of $\sqrt{\lambda d/\pi}$.

At the center of mirror 1 of the resonator, the phase is

$$\phi(0, 0, z_1) = kz_1 - \tan^{-1}(z_1/z_0) \tag{8.23}$$

and at mirror 2 we have

$$\phi(0, 0, z_2) = kz_2 - \tan^{-1}(z_2/z_0) \tag{8.24}$$

The phase change of the beam as it propagates from mirror 1 to mirror 2 is

$$kd - \Delta\zeta \tag{8.25}$$

where

$$\Delta\zeta = \tan^{-1}(z_2/z_0) - \tan^{-1}(z_1/z_0) \tag{8.26}$$

For one complete roundtrip, the on-axis phase changes by $2kd - 2\Delta\zeta$. For the wave to be self-replicating, we require

$$2kd - 2\Delta\zeta = 2\pi n \tag{8.27}$$

where n is any integer. Using $k = 2\pi\nu/c$ and defining $\nu_F = c/2d$, the resonance frequencies $\nu_n$ that satisfy this condition are

$$\nu_n = n\nu_F + \frac{\Delta\zeta}{\pi}\nu_F \tag{8.28}$$

Since the wavefronts match the curvature of the mirrors, this also ensures that the off-axis phase of the beam is self-replicating.


Equation 8.28 implies that only particular frequencies are allowed within the resonator. Also, these frequencies are at equally spaced intervals, with a spacing that depends only on the length of the resonator. The curvature of the mirrors has no effect other than to displace the allowed frequencies by a constant (via the second term in equation 8.28).
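A short calculation makes the constant offset concrete. The sketch below evaluates eq. 8.28 for a symmetric resonator, assuming the waist sits at z = 0 so that the mirrors are at $z_1 = -d/2$ and $z_2 = +d/2$, and taking $z_0$ from eq. 8.15; the function name and this coordinate choice are assumptions made for the example.

```python
from math import atan, pi, sqrt

C_LIGHT = 299_792_458.0  # speed of light in m/s


def resonance_frequency(n, d, r):
    """nu_n = (n + delta_zeta / pi) * nu_F for a symmetric resonator of
    length d and mirror radius r (lengths in metres)."""
    nu_f = C_LIGHT / (2 * d)
    z0 = 0.5 * d * sqrt(2 * r / d - 1)            # eq. 8.15
    # eq. 8.26 with mirrors at z1 = -d/2 and z2 = +d/2 (waist at z = 0)
    delta_zeta = atan((d / 2) / z0) - atan((-d / 2) / z0)
    return (n + delta_zeta / pi) * nu_f
```

In the confocal case $d = R$ one finds $\Delta\zeta = \pi/2$, so every Gaussian-mode resonance is shifted by half a free spectral range relative to $n\nu_F$, while the spacing between adjacent modes remains $\nu_F$.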

As noted earlier, Gaussian beams are not the only modes of spherical mirror resonators. In general, any beam with a spherical wavefront is a candidate. Since Hermite-Gaussian beams all have spherical wavefronts, they also work just as well (in fact the ABCD law also works for them).

The beam properties of higher order Hermite-Gaussian beam solutions are pretty much the same as for Gaussian beams. What does change is the resonant frequencies of higher order modes. Problem 1 of this week’s exercises asks you to derive the expression for Hermite-Gaussian beams analogous to equation 8.28.

The density of modes is usually defined as the number of modes per unit frequency per unit length of the resonator. For a 1-D resonator (which is all we are treating here) this is

$$M(\nu) = \frac{4}{c}. \tag{8.29}$$

8.5 Resonator loss

So far we have been talking about perfect resonators, where the mirrors are perfect reflectors and there are also no diffraction losses from the edges of the mirrors. In a real resonator, both imperfect reflectivity and diffraction cause the beams trapped in the resonator to eventually leave it. This also relaxes some of the conditions for allowed modes. We’ll concentrate on the effects on the allowed frequencies from imperfect mirror reflectivities since this usually is the most important.

To understand this it helps to think about a “phasor” representation of how waves from subsequent roundtrips add together. Let’s suppose that an initial on-axis electric field for a Gaussian mode just after mirror 1 is given by $U_0$. After one round trip, we have

$$U_1 = rU_0e^{i\Delta\phi} \tag{8.30}$$

where $0 < r < 1$ is the loss factor from one round-trip and $\Delta\phi = 2kd - 2\Delta\zeta$ is the roundtrip phase change. The total electric field is therefore

$$U = \sum_i U_i = \frac{U_0}{1 - re^{i\Delta\phi}} \tag{8.31}$$

The intensity is

$$I = |U|^2 = \frac{I_0}{|1 - re^{i\Delta\phi}|^2} = \frac{I_0}{1 + r^2 - 2r\cos\Delta\phi} \tag{8.32}$$

This can be written as

$$I = \frac{I_{max}}{1 + (2F/\pi)^2\sin^2(\Delta\phi/2)} \tag{8.33}$$

where

$$I_{max} = \frac{I_0}{(1 - r)^2} \tag{8.34}$$

and we re-introduce the finesse (see also week 4, page 14)

$$F = \frac{\pi\sqrt{r}}{1 - r} \tag{8.35}$$

In terms of the frequency of the light we can also write the intensity as

$$I = \frac{I_{max}}{1 + (2F/\pi)^2\sin^2(\pi\nu/\nu_F)} \tag{8.36}$$

Figure 8.4: Frequency dependence of the intensity of stored light in a resonator given a broadband frequency injection, in frequency units of $\nu_F$.

As figure 8.4 shows, just like in the case of the Fabry-Perot etalon the finesse controls the sharpness of the lines. At infinite finesse we recover the result that only discrete frequencies will be supported inside the resonator. The width of a line is

$$\delta\nu \approx \frac{\nu_F}{F} \tag{8.37}$$
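The phasor picture can be reproduced numerically by adding up the round-trip contributions term by term and comparing with the closed forms 8.32 and 8.33. The following sketch takes $I_0 = 1$; the function names and the truncation of the sum at a fixed number of round trips are assumptions for illustration.

```python
from cmath import exp as cexp
from math import cos, pi, sin, sqrt


def intensity_by_summation(r, dphi, terms=5000):
    """|sum_j (r e^{i dphi})^j|^2 with U_0 = 1, truncated after `terms` round trips."""
    step = r * cexp(1j * dphi)   # amplitude and phase change per round trip
    term, total = 1.0 + 0j, 0j
    for _ in range(terms):
        total += term
        term *= step
    return abs(total) ** 2


def intensity_closed_form(r, dphi):
    """Eq. 8.32 with I_0 = 1."""
    return 1.0 / (1 + r**2 - 2 * r * cos(dphi))


def intensity_finesse_form(r, dphi):
    """Eq. 8.33 with I_max from eq. 8.34 and the finesse from eq. 8.35."""
    finesse = pi * sqrt(r) / (1 - r)
    i_max = 1.0 / (1 - r) ** 2
    return i_max / (1 + (2 * finesse / pi) ** 2 * sin(dphi / 2) ** 2)
```

For r close to 1 the summed intensity is large only when the phasors line up, that is when Δφ is near a multiple of 2π; away from resonance they curl around and largely cancel.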

One way that losses in a resonator (from either the mirrors or something inside the resonator) are often characterized is by an effective distributed loss coefficient $\alpha_r$ defined by

$$r = e^{-2\alpha_r d} \tag{8.38}$$

which would be equal to the attenuation coefficient of the light in the case where all losses were distributed equally over the beam path. We can express the finesse as

$$F = \frac{\pi e^{-\alpha_r d}}{1 - e^{-2\alpha_r d}} \tag{8.39}$$

For the usual case $\alpha_r d \ll 1$,

$$F \approx \frac{\pi}{2\alpha_r d} \tag{8.40}$$

We see from this that the finesse is roughly equal to (same order of magnitude as) the number of round-trips a beam can undergo before significant attenuation.

The quality factor of a resonator is defined as

$$Q = 2\pi\,\frac{\text{stored energy}}{\text{energy loss per cycle}} \tag{8.41}$$

where “cycle” refers to the period of the optical field $1/\nu_0$. For our resonators, the stored energy is proportional to the max intensity, and the loss is $I\alpha_r c/\nu_0$. This makes

$$Q = \frac{2\pi\nu_0}{\alpha_r c} \approx \frac{\nu_0}{\nu_F}F \tag{8.42}$$

Since usually $\nu_0/\nu_F$ is extremely high for optical resonators, the Q factor is usually much larger than the finesse.
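A quick numerical comparison shows how good the small-loss approximation 8.40 is and how large Q becomes at optical frequencies. This is a sketch with made-up example values (a loss coefficient of 0.01 per metre, a 0.5 m resonator and a 633 nm line near 4.74e14 Hz are all assumptions); the Q estimate uses only the right-hand form $(\nu_0/\nu_F)F$ of eq. 8.42.

```python
from math import exp, pi


def finesse_exact(alpha_r, d):
    """Eq. 8.39."""
    return pi * exp(-alpha_r * d) / (1 - exp(-2 * alpha_r * d))


def finesse_approx(alpha_r, d):
    """Eq. 8.40, valid for alpha_r * d << 1."""
    return pi / (2 * alpha_r * d)


def quality_factor(nu0, d, finesse, c=299_792_458.0):
    """Q estimated as (nu0 / nu_F) * F with nu_F = c / (2 d)."""
    nu_f = c / (2 * d)
    return nu0 / nu_f * finesse


# Hypothetical example values, not from the notes
f_exact = finesse_exact(0.01, 0.5)
f_approx = finesse_approx(0.01, 0.5)
q = quality_factor(4.74e14, 0.5, f_exact)
```

With these assumed numbers the two finesse expressions agree to a few parts per million, and Q exceeds the finesse by the factor $\nu_0/\nu_F$, which is of order $10^6$ for a half-metre cavity.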


Chapter 9

Laser fundamentals 1

Learning objectives

This week we will start to talk about how conventional lasers are made, starting with a discussion on the basic mechanisms of light amplification. Similar material is covered in Saleh and Teich, chapters 12 and 13. After this week you should be able to

• Explain what stimulated emission is phenomenologically

• Relate quantitatively coefficients for spontaneous emission, stimulated emission and absorption in a two-level system embedded in a cavity (eq. 9.18)

• Identify different mechanisms for line broadening and how to model this effect for atom-photon interactions (eq. 9.19, 9.23, 9.25)

• Construct rate equations for laser schemes

• Determine conditions for population inversion and gain in common laser schemes

9.1 Photon-atom interactions in two-level systems

In this section we will give a brief review and overview of atom-photon interactions, with an emphasis on aspects pertinent for the description of lasers based on atomic transitions. A full theory of atom-photon interactions is part of quantum electrodynamics (QED). This theory describes both the atom and radiation field as fully quantum mechanical systems. The results we discuss in this section can be derived from a QED treatment where the atom-photon interaction is calculated as a first order perturbation.

A basic result of quantum mechanics is that the allowed energies of an atom are packaged into discrete levels. For example, the allowed energy levels of a hydrogen atom are described by the Rydberg series

$$E_n = -\frac{R_y}{n^2} \tag{9.1}$$


Figure 9.1: Energy levels for eigenstates of a hydrogen atom. Source: Wikimedia(http://commons.wikimedia.org/wiki/File:Hydrogen energy levels.png)

where $R_y = 13.6$ eV is the Rydberg constant (see figure 9.1). The separation of allowed energies into discrete levels also holds for more complex systems with multiple electrons and various interactions among the electrons and the nucleus, although in these cases the spacing of the levels can become quite complex. Each level in such a system can be identified by a series of integers (quantum numbers) that indicate the allowed values of certain properties of the state associated with the energy level. The values of the quantum numbers often give important information about the symmetry of the quantum state of the system at that energy. In the lecture we’ll not be too concerned with the systematic notation used to indicate the values of the quantum numbers of particular states, but we will be using them when we talk about particular laser systems. The spectrum of a given atom can be used to identify relative concentrations in a material. One example where this is done is aboard NASA’s Curiosity rover on Mars, which uses an intense laser to convert Martian samples into a hot plasma. The emission lines of the atoms in the plasma give the relative concentrations of various elements (e.g. R. Wiens et al. Space Sci. Rev. 170:167-227 (2012)).
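As a small worked example of eq. 9.1, the sketch below converts level differences of the Rydberg series into emission wavelengths. The function names are illustrative, and the conversion constant hc ≈ 1239.84 eV nm is a rounded value introduced here, not a number from the notes.

```python
RYDBERG_EV = 13.6    # value of R_y quoted in the text, in eV
HC_EV_NM = 1239.84   # h*c in eV nm (rounded; an assumption of this sketch)


def level_energy(n):
    """E_n = -R_y / n^2 in eV (eq. 9.1)."""
    return -RYDBERG_EV / n**2


def transition_wavelength_nm(n_upper, n_lower):
    """Wavelength of the photon emitted in the transition n_upper -> n_lower."""
    return HC_EV_NM / (level_energy(n_upper) - level_energy(n_lower))
```

The n = 3 to n = 2 transition comes out near 656 nm, the red Balmer line of hydrogen, while n = 2 to n = 1 lands in the ultraviolet near 122 nm.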

For the purposes of the discussion that follows, we will be approximating an atom as a two-level system, as shown in figure 9.2. Here level 1 is the lower-energy “ground state,” and level 2 is the higher energy “excited state.” We will specifically concentrate on the interaction of this 2-level system with photons in one particular electromagnetic mode with energy $h\nu = E_2 - E_1$, where $E_2$ is the upper level energy and $E_1$ is the lower level energy. We’ll also be assuming that the symmetry of this 2-level system allows for direct interaction between the photons and the atom (i.e. we assume that the transition 1 → 2 is “dipole allowed”: $\langle E_1|d|E_2\rangle \neq 0$).

Figure 9.2: A generic 2-level system.

Figure 9.3 shows pictorially the three different ways in which this two-level system can interact with these photons. The first is absorption: if the atom is in level 1, a photon tuned to the transition energy can promote this atom to level 2, and the photon then disappears (is absorbed). We’ll call $P^{abs}_{12}$ the probability that an atom in level 1 is promoted to level 2 via absorption of a photon (given that the atom is indeed originally in level 1). Since we would expect that absorption events are probabilistically independent (i.e. the non-absorption of one photon does not affect the chances of absorption of another photon),

$$P^{abs}_{12} \propto n \tag{9.2}$$

where n is the number of photons in the mode. This just says that the probability of absorption is proportional to the number of photons the atom sees.

The other processes to consider are those leading to the emission of a photon from an atom in level 2. This is split into two distinct parts. One part is called spontaneous emission: the emission of a photon from the atom independent of any photons already inside the cavity. We denote this as $P^{spon}_{21}$, and this is a constant independent of the number of photons in the cavity. A second emission process is called stimulated emission. This is a process where preexisting photons in the mode “stimulate” the transition from level 2 to level 1, resulting in another photon being added to the mode with exactly the same properties as the preexisting photons. We’ll call the probability of this process (given the atom is initially in level 2) $P^{stim}_{21}$. This probability is proportional to the number of photons in the mode, so

$$P^{stim}_{21} \propto n \tag{9.3}$$

as is the case with absorption.

Figure 9.3: Photon interactions with a 2-level system: (a) absorption, (b) spontaneous emission, and (c) stimulated emission.

9.2 Einstein coefficients: relationship between spontaneous and stimulated emission

If we consider now the interaction of the atom with multiple electromagnetic modes (with different polarizations and propagation directions), each of the above relations must hold for each mode. The values of $P^{spon}_{21}$ and the proportionality constants will depend on the mode frequency and polarization. For spontaneous emission the total probability per unit time that the decay from level 2 to level 1 happens is denoted as

$$A_{21} = \frac{1}{t_{sp}} \tag{9.4}$$

where $t_{sp}$ is the lifetime of the emission process (which can be measured experimentally). Assuming that the electromagnetic modes within the spectral bandwidth of the atomic transition are equally populated (i.e. the atomic transition line is narrow), we can also write the transition probability for an atom in state 1 to absorb a photon as

$$W_{12} = B_{12}\,\rho(\nu_0) \tag{9.5}$$

where $\rho(\nu_0)$ is the energy density of the electromagnetic field (energy/bandwidth/volume) and $B_{12}$ is a constant. Similarly for stimulated emission

$$W_{21} = B_{21}\,\rho(\nu_0) \tag{9.6}$$

As formulated above, the coefficients $A_{21}$, $B_{12}$ and $B_{21}$ do not depend at all on the electromagnetic field population, but instead are properties of the atomic transition itself.


To find a relationship between these coefficients we’ll go through an analysis originallygiven by Albert Einstein in 1917 that was one of the early precursors to laser technology.The trick here is that we will consider a situation in which a collection of 2-level atoms areinside an optical cavity, and both the atoms and the electromagnetic field of the cavity arein mutual thermal equilibrium.

In thermal equilibrium, the relative probability of having atoms in state 1 or state 2 is determined by the Boltzmann distribution. Simply stated, this just means that the probability of being in a state with energy $\varepsilon$ is

$$P \propto e^{-\varepsilon/kT} \tag{9.7}$$

In thermal equilibrium, this implies that for our 2-level atomic system

$$\frac{N_2}{N_1} = e^{-(E_2-E_1)/kT} = e^{-h\nu_0/kT} \tag{9.8}$$

where $N_2$ is the number of atoms in level 2, and $N_1$ is the number of atoms in level 1.

How do these atoms interact with the photons in the cavity? To find the rate of change of $N_2$, all we have to do is sum up the transition rates for each process that results in a change in the state of the atom:

$$\frac{dN_2}{dt} = \left(\frac{dN_2}{dt}\right)_{abs} + \left(\frac{dN_2}{dt}\right)_{spon} + \left(\frac{dN_2}{dt}\right)_{stim} \tag{9.9}$$

where

$$\left(\frac{dN_2}{dt}\right)_{abs} = W_{12}N_1 = B_{12}\,\rho(\nu_0)N_1 \tag{9.10}$$

$$\left(\frac{dN_2}{dt}\right)_{spon} = -A_{21}N_2 \tag{9.11}$$

$$\left(\frac{dN_2}{dt}\right)_{stim} = -W_{21}N_2 = -B_{21}\,\rho(\nu_0)N_2. \tag{9.12}$$

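Adding these up, we have

$$\frac{dN_2}{dt} = -A_{21}N_2 + (B_{12}N_1 - B_{21}N_2)\,\rho(\nu_0) \tag{9.13}$$

In thermal equilibrium $dN_2/dt = 0$, and using equation 9.8 for the ratio $N_2/N_1$ we then get the relation

$$\rho(\nu_0) = \frac{A_{21}N_2}{B_{12}N_1 - B_{21}N_2} = \frac{A_{21}}{B_{21}} \cdot \frac{1}{(B_{12}/B_{21})\,e^{h\nu_0/kT} - 1} \tag{9.14}$$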

for the energy density of the photon bath in the cavity.

An alternate way to calculate $\rho(\nu)$ is based on treating the photons as bosons in thermal equilibrium (see for example Kittel and Kroemer, Thermal Physics, chapter 4). This gives rise to the Planck distribution function

$$\rho(\nu) = \frac{8\pi h\nu^3}{c^3} \cdot \frac{1}{e^{h\nu/kT} - 1} \tag{9.15}$$


The only way for equations 9.14 and 9.15 to agree is for

$$\frac{A_{21}}{B_{21}} = \frac{8\pi h\nu^3}{c^3} = \frac{8\pi h}{\lambda^3} \tag{9.16}$$

and

$$B_{21} = B_{12}. \tag{9.17}$$

If we define $A = A_{21}$ and $B = B_{21} = B_{12}$ we then have the Einstein relation

$$B = \frac{\lambda^3}{8\pi h}\,A = \frac{\lambda^3}{8\pi h\,t_{sp}} \tag{9.18}$$

that relates both the probabilities for absorption and stimulated emission to the probability for spontaneous emission. It is also apparent from this analysis that absorption and stimulated emission are very symmetric processes, due to the equality of $B_{21}$ and $B_{12}$. One important result from Einstein’s original work (which predated QED) was that stimulated emission is crucial to make the thermal equilibrium condition agree with Planck’s distribution function at all. Although stimulated emission can also be a feature of nonlinear classical systems (as seen in the physics of free-electron lasers), this was an interesting and unexpected consequence of quantization in atomic systems.
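The argument above can be checked numerically. The following is a minimal sketch, not part of the original notes: it sets $B_{12} = B_{21} = B$ with $B$ taken from equation 9.18, balances the rates of equation 9.13 for Boltzmann-distributed populations, and compares the resulting energy density with the Planck distribution of equation 9.15. The wavelength and lifetime are hypothetical illustration values, and all function names are made up for this example.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
C = 2.99792458e8      # speed of light in vacuum, m/s


def planck_energy_density(nu, temperature):
    """Planck spectral energy density of eq. 9.15."""
    return 8 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K_B * temperature))


def rate_balance_energy_density(nu, temperature, a21, b12, b21):
    """Energy density for which dN2/dt = 0 in eq. 9.13, with N1/N2 from eq. 9.8."""
    n1_over_n2 = math.exp(H * nu / (K_B * temperature))
    return a21 / (b12 * n1_over_n2 - b21)


def einstein_b(wavelength, t_sp):
    """B coefficient from the spontaneous lifetime, eq. 9.18."""
    return wavelength**3 / (8 * math.pi * H * t_sp)


# Hypothetical transition, chosen only for illustration.
wavelength = 694e-9   # m
t_sp = 3e-3           # s
nu0 = C / wavelength
a = 1 / t_sp          # eq. 9.4
b = einstein_b(wavelength, t_sp)

for temperature in (300.0, 3000.0, 30000.0):
    rho_rate = rate_balance_energy_density(nu0, temperature, a, b, b)
    rho_planck = planck_energy_density(nu0, temperature)
    print(f"T = {temperature:7.0f} K  rate balance: {rho_rate:.6e}  Planck: {rho_planck:.6e}")
```

The two columns should agree at every temperature, which is the content of equations 9.16 and 9.17: no other choice of $A/B$ and $B_{12}/B_{21}$ reproduces the Planck form for all $T$.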

9.3 Line broadening mechanisms: homogeneous versus inhomogeneous

Although we have above discussed the transition in an atomic-like 2-level system as occurring at a definite energy $h\nu_0$, in practice if we measure the frequency of photons spontaneously emitted from an excited atom or collection of atoms it always has a non-zero energy spread. This spread can have several different physical origins, depending on the transition and the environment of the atom. The spread of energies sets the number of cavity modes that participate in the absorption or emission process.

In general, the different processes that lead to broadening can be divided into two categories: homogeneous and inhomogeneous broadening. Homogeneous broadening is broadening from the emission of individual atoms. Inhomogeneous broadening is broadening due to differences in the emission from different atoms. We’ll now discuss a few examples of each.

9.3.1 Lifetime broadening

One important factor that broadens atomic emission and absorption lines is lifetime broadening, the consequence of the finite lifetime of the states. In general, we can represent the lifetime of level 2 as $\tau_2$, a number which includes both the spontaneous radiative decay to level 1 as well as other decay mechanisms to all lower energy levels. Similarly, if level 1 is not the true ground state of the system it might also have a lifetime $\tau_1$ that represents all decays to lower levels. Usually we can model these decay processes as a single exponential.

The main effect of these lifetimes is to broaden the range of allowed energies for the states from truly discrete levels into a narrow continuum of energies. This is really just a Fourier transform effect. Consider what would be the consequences of a truly discrete, delta-function-like energy level distribution. A delta function in frequency space implies in the time domain a sinusoidal function that stretches from $-\infty$ to $+\infty$. This would mean that emission from the spontaneous process would literally take forever. This is clearly not the case, since we have a measured finite lifetime for at least the upper level state. So, there must be some kind of frequency broadening from the finite lifetime of the states.

Quantitatively, we can get the energy broadening of a particular transition by taking the single-sided Fourier transform of $e^{-t/2\tau}e^{i2\pi\nu_0 t}$, which represents the electric field shape of a transition between levels with nominal energy difference $h\nu_0$ and effective lifetime $\tau = 1/(\tau_1^{-1} + \tau_2^{-1})$, starting at a time $t = 0$. This gives us a shape in frequency space proportional to $1/[1 + 2i(\omega - 2\pi\nu_0)\tau]$. The intensity spectrum shape is then a Lorentzian function

$$I(\omega) \propto \frac{1}{(\omega-\omega_0)^2 + (\Delta\omega/2)^2} \tag{9.19}$$

where $\omega_0 = 2\pi\nu_0$ and $\Delta\omega = 1/\tau = 1/\tau_1 + 1/\tau_2$. It is often convenient to convert all the $\omega$ variables to $\nu$ (so that $\Delta\nu = \Delta\omega/2\pi$) and define the lineshape function

$$g(\nu) = \frac{\Delta\nu/2\pi}{(\nu-\nu_0)^2 + (\Delta\nu/2)^2} \tag{9.20}$$

so that $\int_{-\infty}^{\infty} g(\nu)\,d\nu = 1$. This function gives the relative probabilities for an emission or absorption process where lifetime broadening is the dominant broadening mechanism.

When discussing the frequency dependence of absorption or stimulated emission, it is helpful to write the probability in a form slightly different from the one we previously used. If we again go back to considering only the interaction of an atom with a particular mode in a cavity of volume $V$,

$$P^{abs}_{12} = P^{stim}_{21} = n\,\frac{c}{V}\,\sigma(\nu) = \phi\,\sigma(\nu) \tag{9.21}$$

Here $\phi = nc/V$ can be considered as the photon flux, i.e. the number of photons per unit area per unit time passing “through” the atom. The new function $\sigma(\nu)$ is called the transition cross section. It is typically reported in units of cm². In a loose sense it gives you the effective interaction area of an individual atom. For a given lineshape function $g(\nu)$, the transition cross section is

$$\sigma(\nu) = \left(\int \sigma(\nu)\,d\nu\right) g(\nu) = \frac{V}{cn}\,B\rho(\nu_0)\,\Delta\nu\,g(\nu) = \frac{\lambda_0^2}{8\pi t_{sp}}\,g(\nu) \tag{9.22}$$

where we have applied the Einstein relation to express the total integrated cross section in terms of the spontaneous lifetime. (Note that $n/V = \rho(\nu_0)\Delta\nu/h\nu_0$.)
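As a quick numerical illustration of equations 9.20 and 9.22, the sketch below checks the normalization of the Lorentzian lineshape and evaluates the peak cross section. It assumes the simplest hypothetical case, in which the lower level is stable and the upper level decays only radiatively, so that $\tau = t_{sp}$ and $\Delta\nu = 1/(2\pi t_{sp})$. The numbers and function names are illustrative assumptions, not values from the notes.

```python
import math


def lorentzian(offset, fwhm):
    """Normalized Lorentzian lineshape g(nu) of eq. 9.20; offset = nu - nu0 in Hz."""
    return (fwhm / (2 * math.pi)) / (offset**2 + (fwhm / 2) ** 2)


def cross_section(offset, fwhm, wavelength, t_sp):
    """Transition cross section sigma(nu) of eq. 9.22, in m^2."""
    return wavelength**2 / (8 * math.pi * t_sp) * lorentzian(offset, fwhm)


# Hypothetical transition: purely radiative decay into a stable lower level.
wavelength = 600e-9               # m
t_sp = 10e-9                      # s
fwhm = 1 / (2 * math.pi * t_sp)   # delta_nu = delta_omega / (2 pi) = 1 / (2 pi tau)

# Midpoint-rule integral of g(nu) over +/- 1000 linewidths around line center.
steps = 200_000
span = 1000 * fwhm
d_nu = 2 * span / steps
area = sum(lorentzian(-span + (k + 0.5) * d_nu, fwhm) for k in range(steps)) * d_nu

peak_sigma = cross_section(0.0, fwhm, wavelength, t_sp)
print(f"linewidth       = {fwhm:.3e} Hz")
print(f"integral of g   = {area:.5f}")
print(f"peak sigma      = {peak_sigma:.3e} m^2")
print(f"lambda^2/(2 pi) = {wavelength**2 / (2 * math.pi):.3e} m^2")
```

The integral falls slightly short of 1 because the slowly decaying Lorentzian tails are cut off at a finite range. In this idealized limit the peak cross section reduces to $\lambda_0^2/2\pi$, independent of the lifetime; any additional broadening lowers it.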


Figure 9.4: Lifetime broadening is a consequence of the limited lifetime of emission from individual atoms.

Figure 9.5: Comparison of Lorentzian and Gaussian lineshapes. The Lorentzian has much longer “tails.”


Figure 9.6: Collision broadening is the result of phase randomization when atoms elastically collide. In the sketch the vertical lines indicate collision times.

9.3.2 Collision broadening

Another type of homogeneous broadening is collision broadening. This is usually important in systems like gases or liquids where frequent collisions between atoms randomize the phase of the atomic wavefunction. This ends up having an effect on the linewidth quite similar to that of lifetime broadening, resulting in a Lorentzian line shape. The width is $\Delta\nu = f_{col}/\pi$, where $f_{col}$ is the number of collisions per second (for a proof see Siegman’s Lasers, section 3.2). When lifetime broadening is also taken into account, the total linewidth is just

$$\Delta\nu = \frac{1}{2\pi}\left(\frac{1}{\tau_1} + \frac{1}{\tau_2} + 2f_{col}\right) \tag{9.23}$$

9.3.3 Inhomogeneous broadening

Inhomogeneous broadening refers to broadening of the measured transition lines from a population of emitters (atoms) that have slightly different transition energies. Thus the cross section for the collective is broader than the transitions from individual atoms. This can have several different causes. One cause (often seen in solid-state systems) is that the atoms are embedded in different environments that cause shifts in the transition energies (crystal field splitting). In gas lasers, an important source of inhomogeneous broadening is the Doppler effect. Inside a gas, the atoms are all moving with different velocities and different directions. In the presence of a radiation field, the atoms all then see slightly different frequencies depending on their velocity relative to the direction of the light. The frequency is slightly higher if the atom moves toward the source of the light, and conversely it is slightly lower if it moves with the light. A quantitative account of the classical Doppler effect in a gas results in the lineshape function

$$\bar{g}(\nu) = \int_{-\infty}^{\infty} g\!\left(\nu - \nu_0\frac{v}{c}\right) p(v)\,dv \tag{9.24}$$

where $v$ is the component of the velocity along the wavevector of the light, and $p(v)\,dv$ is the probability that there is an atom with this velocity component between $v$ and $v + dv$. Here $g(\nu)$ is the homogeneous lineshape and $\bar{g}(\nu)$ is the resulting inhomogeneously broadened lineshape.

Figure 9.7: Inhomogeneous broadening comes from differences in the center frequency for different atoms.

Usually when inhomogeneous effects dominate, the lineshape can be considered as a Gaussian

$$g(\nu) \approx \frac{2}{\Delta\nu}\sqrt{\frac{\ln 2}{\pi}}\; e^{-4\ln 2\,[(\nu-\nu_0)/\Delta\nu]^2} \tag{9.25}$$

where $\Delta\nu$ is the full width at half maximum as empirically measured and $\nu_0$ is the center of the line. In fact, the shape of the lineshape function is often used to ascertain whether homogeneous or inhomogeneous effects are most important.
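To make the difference between the two lineshapes concrete (compare figure 9.5), the following sketch evaluates a Lorentzian (eq. 9.20) and a Gaussian (eq. 9.25) of the same full width at half maximum, with the frequency offset measured from line center in units of the linewidth. It is only an illustration; the function names are assumptions of this example.

```python
import math


def lorentzian(offset, fwhm):
    """Normalized Lorentzian lineshape (eq. 9.20); offset measured from line center."""
    return (fwhm / (2 * math.pi)) / (offset**2 + (fwhm / 2) ** 2)


def gaussian(offset, fwhm):
    """Normalized Gaussian lineshape (eq. 9.25); offset measured from line center."""
    prefactor = (2 / fwhm) * math.sqrt(math.log(2) / math.pi)
    return prefactor * math.exp(-4 * math.log(2) * (offset / fwhm) ** 2)


fwhm = 1.0  # work in units of the linewidth

# Midpoint-rule check that the Gaussian is normalized to unit area.
steps = 20_000
span = 10 * fwhm
dx = 2 * span / steps
gaussian_area = sum(gaussian(-span + (k + 0.5) * dx, fwhm) for k in range(steps)) * dx
print(f"integral of Gaussian = {gaussian_area:.6f}")

print("offset/fwhm  Lorentzian    Gaussian")
for offset in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"{offset:10.1f}  {lorentzian(offset, fwhm):.4e}  {gaussian(offset, fwhm):.4e}")
```

Both functions drop to half their peak value at an offset of half the linewidth, but a few linewidths away the Lorentzian is larger by many orders of magnitude. This difference in the wings is what allows a measured lineshape to indicate which broadening mechanism dominates.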

9.4 Atom-level based lasers: general considerations

A laser is a device that uses stimulated emission to amplify light. Since on a per-atom basis absorption has the same cross section as stimulated emission, in order to achieve gain the number of atoms in the upper level state must be higher than the number of atoms in the lower level state. This condition of $N_2 > N_1$ is called population inversion. If instead $N_1 > N_2$, there would be more absorption than emission and the light would experience overall attenuation.


Suppose we define $N = N_2 - N_1$. For $N > 0$ we have population inversion. The gain coefficient $\gamma(\nu)$ is defined by

$$\frac{d\phi}{dz} = \gamma(\nu)\,\phi \tag{9.26}$$

where $\phi$ is the photon flux, the number of photons in a given mode per unit area per unit time (earlier denoted as $nc/V$), and $z$ is a distance coordinate along which the photons propagate. The gain coefficient is given by

$$\gamma(\nu) = N\sigma(\nu) = N\,\frac{\lambda_0^2}{8\pi t_{sp}}\,g(\nu). \tag{9.27}$$

For propagation over a length $d$ of the gain medium, the gain $G$ is

$$G(\nu) = e^{\gamma(\nu)d} \tag{9.28}$$
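A short sketch of how equations 9.27 and 9.28 are used in practice is given below. All numerical values (inversion density, wavelength, lifetime, linewidth, rod length) are hypothetical and only chosen to give a plausible order of magnitude; the lineshape is evaluated at the center of a Lorentzian line, where $g(\nu_0) = 2/(\pi\Delta\nu)$.

```python
import math


def gain_coefficient(inversion, wavelength, t_sp, g_nu):
    """Gain coefficient gamma(nu) = N sigma(nu) of eq. 9.27, in 1/m."""
    return inversion * wavelength**2 / (8 * math.pi * t_sp) * g_nu


def single_pass_gain(gamma, length):
    """Gain G = exp(gamma d) of eq. 9.28 for one pass through a length d."""
    return math.exp(gamma * length)


# Hypothetical gain medium (illustration only).
inversion = 1e23              # N = N2 - N1, atoms per m^3
wavelength = 1.053e-6         # m
t_sp = 3e-4                   # s
fwhm = 7e12                   # Hz
g_peak = 2 / (math.pi * fwhm) # Lorentzian lineshape at line center
length = 0.1                  # m

gamma = gain_coefficient(inversion, wavelength, t_sp, g_peak)
print(f"gamma = {gamma:.3f} 1/m")
print(f"G     = {single_pass_gain(gamma, length):.3f}")

# With N < 0 (no inversion) the same expressions describe attenuation.
gamma_absorbing = gain_coefficient(-inversion, wavelength, t_sp, g_peak)
print(f"G (N < 0) = {single_pass_gain(gamma_absorbing, length):.3f}")
```

Because the gain is exponential in $\gamma d$, the gain of two consecutive sections is the product of the individual gains.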

Achieving population inversion to get $G > 1$ is one of the main challenges to making a laser. Since thermal excitation (as represented by the Boltzmann factor $e^{-E/kT}$) always gives lower probabilities for higher energy levels, we need some non-thermal mechanism for achieving population inversion. This can be done in many different ways: bombardment with particles (e.g. electrons), optical pumping (e.g. with a laser we already have, or with more conventional light sources), or even sound waves. For our discussion here we will be mostly focused on optical pumping: using light to stimulate population inversion.

For now let’s not worry about exactly how we pump the system and consider this abstractly. The rate of change for the upper level of a 2-level system is

$$\frac{dN_2}{dt} = R_2 - \frac{N_2}{\tau_2} - N_2W_i + N_1W_i \tag{9.29}$$

where $R_2$ is the rate of change caused by the pump, $1/\tau_2 = 1/\tau_{20} + 1/\tau_{21}$, $\tau_{20}$ is the decay time for relaxation from the upper state to all states other than the lower-level state we explicitly consider, $\tau_{21}$ is the total decay time for relaxation to state 1 from spontaneous decay and non-radiative channels, and $W_i = \phi\sigma(\nu)$. The corresponding equation for level 1 is

$$\frac{dN_1}{dt} = -R_1 - \frac{N_1}{\tau_1} + \frac{N_2}{\tau_{21}} + N_2W_i - N_1W_i \tag{9.30}$$

where $\tau_1$ is the decay time for relaxation from level 1 to lower-lying levels.

Figure 9.8: A generic 3-level lasing scheme. (Diagram: levels 1, 2 and 3, with the pump from level 1 to level 3, a rapid decay $\tau_{32}$ from level 3 to level 2, and the transition between levels 2 and 1 labeled with $W_i$ and $\tau_{21}$.)

The steady state solutions for the above are obtained by setting both time derivatives to zero. We then obtain a value for the population inversion

$$N = \frac{N_0}{1 + \tau_sW_i} \tag{9.31}$$

where

$$N_0 = R_2\tau_2\left(1 - \frac{\tau_1}{\tau_{21}}\right) + R_1\tau_1 \tag{9.32}$$

and

$$\tau_s = \tau_2 + \tau_1\left(1 - \frac{\tau_2}{\tau_{21}}\right) \tag{9.33}$$

is the saturation time constant. We see from this that important contributors to achieving population inversion include

• Large R1 and R2

• Large τ2

• Small τ1 if R1 < (τ2/τ21)R2

• Small τs

Note also that a small $W_i$ contributes to a high population inversion, although this also implies that the photon flux is small, not usually a desirable property of a laser!
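The closed-form result of equations 9.31 to 9.33 can be cross-checked by solving the two steady-state rate equations (9.29 and 9.30 with the time derivatives set to zero) directly as a 2×2 linear system. The sketch below does this with Cramer's rule; the pump rates and decay times are hypothetical values chosen only for illustration, and the function names are not from the notes.

```python
import math


def steady_state_populations(r1, r2, tau1, tau2, tau21, w_i):
    """Solve eqs. 9.29 and 9.30 with dN/dt = 0 for (N1, N2) using Cramer's rule."""
    # (1/tau2 + Wi) N2 - Wi N1 = R2
    a, b = 1 / tau2 + w_i, -w_i
    # -(1/tau21 + Wi) N2 + (1/tau1 + Wi) N1 = -R1
    c, d = -(1 / tau21 + w_i), 1 / tau1 + w_i
    det = a * d - b * c
    n2 = (r2 * d - b * (-r1)) / det
    n1 = (a * (-r1) - c * r2) / det
    return n1, n2


def inversion_formula(r1, r2, tau1, tau2, tau21, w_i):
    """Population inversion N = N0 / (1 + tau_s Wi) from eqs. 9.31 to 9.33."""
    n0 = r2 * tau2 * (1 - tau1 / tau21) + r1 * tau1
    tau_s = tau2 + tau1 * (1 - tau2 / tau21)
    return n0 / (1 + tau_s * w_i)


# Hypothetical parameters (SI units): tau20 = tau21 = 2 ms gives tau2 = 1 ms.
params = dict(r1=1e23, r2=1e24, tau1=1e-6, tau2=1e-3, tau21=2e-3)

for w_i in (0.0, 1e2, 1e3, 1e4):
    n1, n2 = steady_state_populations(w_i=w_i, **params)
    print(f"Wi = {w_i:8.1f} 1/s  N2 - N1 = {n2 - n1:.6e}  formula = {inversion_formula(w_i=w_i, **params):.6e}")
```

The two columns should coincide for every $W_i$, and the inversion should fall to half of $N_0$ when $W_i = 1/\tau_s$.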

We now consider more concrete examples using optical pumping to achieve inversion. In the exercises you will show that direct optical pumping of a 2-level system cannot achieve population inversion. Consequently, optically pumped lasers require consideration of additional energy levels. We will now consider two of the most common schemes, that of the 3-level and 4-level lasers.

9.5 Three-level lasers

Maybe the simplest way to achieve population inversion is with the 3-level system, sketched in figure 9.8. Here the lasing transition is between levels 1 and 2. Level 1 is the true ground state of the system. The pump excites the transition from level 1 to level 3. Level 3 quickly decays into level 2 with a time constant $\tau_{32}$. Provided this decay time is short enough, the population in level 1 will always far exceed the population in level 3. We can in this case ignore emission from level 3 down to level 1. The full rate equations can be expressed (assuming $\tau_2 = \tau_{21}$) as

$$\frac{dN_3}{dt} = WN_1 - WN_3 - \frac{N_3}{\tau_{32}} \tag{9.34}$$

$$\frac{dN_2}{dt} = \frac{N_3}{\tau_{32}} - \frac{N_2}{\tau_{21}} - N_2W_i + N_1W_i \tag{9.35}$$

$$\frac{dN_1}{dt} = -WN_1 + WN_3 + \frac{N_2}{\tau_{21}} + N_2W_i - N_1W_i \tag{9.36}$$

If we take the steady state solution (all derivatives are zero) and solve for $N = N_2 - N_1$ under the constraint $N_{tot} = N_1 + N_2 + N_3$ we could get a general expression for the population inversion. Rather than go through this, we will instead find an approximate expression that works as long as $1/\tau_{32} \gg W$. In this case $N_3 \approx \tau_{32}N_1W \ll N_1$ and we get

$$N = \frac{N_0}{1 + \tau_sW_i} \tag{9.37}$$

but with

$$N_0 = \frac{N_{tot}\,(\tau_{21}W - 1)}{1 + \tau_{21}W} \tag{9.38}$$

and

$$\tau_s = \frac{2\tau_{21}}{1 + \tau_{21}W} \tag{9.39}$$

The first laser based on ruby was a 3-level system. Figure 9.9 shows the level diagram. Three-level lasers are generally more difficult to work with than 4-level systems, since the population in the ground state tends to be quite high and therefore requires very high pumping levels.

9.6 Four-level lasers

A 4-level laser scheme is shown in figure 9.10. It is a bit like the 3-level scheme, only now there is an additional level below that we will call level 0. The decay time from level 1 to level 0 is assumed to be very short, making $N_1$ small. Going through a similar analysis the inversion is given by

$$N = \frac{N_0}{1 + \tau_sW_i} \tag{9.40}$$

where

$$N_0 \approx \frac{\tau_{21}N_{tot}W}{1 + \tau_{21}W} \tag{9.41}$$

and

$$\tau_s \approx \frac{\tau_{21}}{1 + \tau_{21}W} \tag{9.42}$$

By comparing these to the equations for the 3-level system it is apparent that $N > 0$ is much easier to achieve for a given pumping rate $W$.

Figure 9.11 shows the 4-level scheme for the common neodymium-doped phosphate glass laser. As with ruby, several different pumping options exist.

Figure 9.9: Energy levels in a ruby laser. (Diagram labels: Cr³⁺:Al₂O₃; levels $^4A_2$ (level 1), $^2E$ (level 2), $^4F_1$ and $^4F_2$ (level 3); decay $\tau_{32}$; lasing transition at 694 nm.)

Figure 9.10: A generic 4-level lasing scheme. (Diagram labels: levels 0, 1, 2 and 3; pump; rapid decay $\tau_{32}$; rapid decay $\tau_{10}$; $\tau_{20}$; $\tau_{21}$; $W_i$.)

Figure 9.11: Energy levels in a four-level Nd:glass laser. (Diagram labels: Nd³⁺:glass; levels $^4I_{9/2}$ (level 0), $^4I_{11/2}$ (level 1), $^4F_{3/2}$ (level 2) and level 3; decay $\tau_{32}$; lasing transition at 1053 nm.)
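The comparison between the two schemes can be made explicit by tabulating the small-signal inversion $N_0$ of equations 9.38 and 9.41 as a function of the dimensionless pump strength $\tau_{21}W$. The sketch below normalizes to $N_{tot} = 1$; the value of $\tau_{21}$ and the function names are assumptions of this example.

```python
def n0_three_level(n_tot, pump_rate, tau21):
    """Small-signal inversion of the 3-level scheme, eq. 9.38."""
    return n_tot * (tau21 * pump_rate - 1) / (1 + tau21 * pump_rate)


def n0_four_level(n_tot, pump_rate, tau21):
    """Small-signal inversion of the 4-level scheme, eq. 9.41."""
    return n_tot * tau21 * pump_rate / (1 + tau21 * pump_rate)


tau21 = 3e-3  # hypothetical upper-level lifetime, s
print("tau21*W   3-level N0/Ntot   4-level N0/Ntot")
for strength in (0.01, 0.1, 0.5, 1.0, 2.0, 10.0):
    pump_rate = strength / tau21
    three = n0_three_level(1.0, pump_rate, tau21)
    four = n0_four_level(1.0, pump_rate, tau21)
    print(f"{strength:7.2f}   {three:15.4f}   {four:15.4f}")
```

The 3-level inversion is negative until the pump rate exceeds $1/\tau_{21}$, because at least half of the atoms have to be lifted out of the ground state, whereas the 4-level inversion is positive for any non-zero pump rate.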


Chapter 10

Laser fundamentals 2

Learning objectives

This week we continue to discuss lasers, with more emphasis on how real laser systems are constructed and operate. Similar material is covered in Saleh and Teich, chapter 15, which contains more details and is highly recommended as a reference/supplement. After this week you should be able to

• Explain the conditions required for laser oscillation (eq. 10.2, 10.4)

• Determine the steady state power output of a continuous-wave (CW) laser oscillator, and identify factors that influence this (eq. 10.9, 10.10, 10.14)

• Estimate the number of longitudinal modes that will lase in a given oscillator

• Explain why different transverse modes may or may not lase

• Explain conceptually some common techniques for forcing lasers into pulsed operation

• Explain conceptually how free-electron lasers work

10.1 Laser oscillation

A laser oscillator is a positive feedback system where light is amplified again and again. A conventional laser uses a stable optical resonator with a gain medium placed inside the resonator. The job of the resonator is to repeatedly send a high intensity of photons through the gain medium. With each pass, the intensity of the photons increases. Balancing this gain is loss due to the resonator. Achieving oscillation requires that the gain exceed the losses for at least some level of photon flux. If this is the case, even a small amount of input light (e.g. from spontaneous emission that happens to match the resonator mode) will grow and grow into a coherent beam that is trapped inside the cavity. This beam can then be coupled out of the cavity, typically by making one of the mirrors slightly transparent. So, how do we identify the conditions for laser oscillation?

As was briefly discussed last week, the gain coefficient

$$\gamma(\nu) = N\sigma(\nu) = N\,\frac{\lambda_0^2}{8\pi t_{sp}}\,g(\nu) \tag{10.1}$$

gives the gain-per-unit-length of photons inside a medium that are resonant with a particular transition with an inversion $N$. In general $N$ depends on the photon flux in the cavity. We also last week introduced a parameter $N_0$ that is the population inversion with $W_i = 0$ (no photon flux density). We take $N = N_0$ as an initial condition: this assumes that the pumping mechanism has driven inversion, but so far there is only a negligible level of photons trapped in the cavity. Under these circumstances $\gamma(\nu) = \gamma_0(\nu)$ where

$$\gamma_0(\nu) = N_0\sigma(\nu) \tag{10.2}$$

is the small-signal gain coefficient.

Fighting against the gain is the loss of the resonator. Here we recall from week 8 (equation 38) the distributed loss coefficient $\alpha_r$ which gives the fractional loss in intensity of light per unit length, averaged over one round-trip through the resonator. This includes the reflectivity of the mirrors, diffraction over the edges of the mirrors, and unaccounted-for absorption from things inside the cavity (but excluding the absorption from the lasing transition of the gain medium). Sometimes rather than talking about $\alpha_r$ we refer to the photon lifetime

$$\tau_p = \frac{n}{\alpha_r c} \tag{10.3}$$

which is a measure of the average time a photon spends inside the cavity before being lost. Here $n$ is the “average” index of refraction.¹

In order for the laser to start working at all, it must satisfy the threshold gain condition

$$\gamma_0(\nu) > \alpha_r \tag{10.4}$$

In terms of the photon lifetime and the spontaneous emission lifetime, we can write this condition as

$$N_0 > N_t \equiv \frac{8\pi n}{\lambda_0^2 c}\,\frac{t_{sp}}{\tau_p}\,\frac{1}{g(\nu)} \tag{10.5}$$

where $N_t$ is called the threshold population difference.

We’ve so far implicitly assumed that the frequency $\nu$ corresponds exactly to an allowed mode of the resonator. As we move away from allowed modes, the effective $\alpha_r$ increases due to phase mismatch. As a result, for a high finesse resonator only frequencies very close to allowed modes will result in oscillation.²

¹ The “average” index $n$ for heterogeneous intra-cavity media to use in this equation is given by $n^{-1} = \frac{1}{d}\sum_i \frac{d_i}{n_i}$, where $d$ is the length of the cavity, $d_i$ is the length of each constituent element of the cavity, and $n_i$ is the index for that element.

² More precisely, there is interaction between the gain and the resonator modes that effectively causes resonator modes to be slightly shifted toward the center frequency of the gain line (see section 15.1 of Saleh and Teich).
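The threshold condition can be turned into numbers with a few lines of code. The sketch below evaluates the photon lifetime (eq. 10.3) and the threshold population difference (eq. 10.5) and confirms that the latter is the same as $\alpha_r/\sigma(\nu)$, i.e. the inversion at which equation 10.4 becomes an equality. The loss coefficient, wavelength, lifetime and linewidth are hypothetical values, a Lorentzian line evaluated at its center is assumed, and the function names are made up for this example.

```python
import math

C = 2.99792458e8  # speed of light in vacuum, m/s


def photon_lifetime(alpha_r, n=1.0):
    """Photon lifetime tau_p = n / (alpha_r c) of eq. 10.3."""
    return n / (alpha_r * C)


def threshold_inversion(wavelength, t_sp, tau_p, g_nu, n=1.0):
    """Threshold population difference N_t of eq. 10.5."""
    return 8 * math.pi * n / (wavelength**2 * C) * (t_sp / tau_p) / g_nu


def cross_section(wavelength, t_sp, g_nu):
    """Transition cross section sigma(nu) as used in eq. 10.1."""
    return wavelength**2 / (8 * math.pi * t_sp) * g_nu


# Hypothetical laser parameters (illustration only).
alpha_r = 0.1                  # distributed loss, 1/m
wavelength = 1.053e-6          # m
t_sp = 3e-4                    # s
fwhm = 7e12                    # Hz
g_peak = 2 / (math.pi * fwhm)  # Lorentzian lineshape at line center

tau_p = photon_lifetime(alpha_r)
n_t = threshold_inversion(wavelength, t_sp, tau_p, g_peak)
print(f"photon lifetime     = {tau_p:.3e} s")
print(f"threshold inversion = {n_t:.3e} 1/m^3")
print(f"alpha_r / sigma     = {alpha_r / cross_section(wavelength, t_sp, g_peak):.3e} 1/m^3")
```

Lower cavity loss (longer $\tau_p$), a shorter spontaneous lifetime, or a narrower line (larger $g(\nu_0)$) all reduce the inversion needed to reach threshold.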


10.2 Laser characteristics

Supposing we satisfy the conditions for oscillation, we can characterize various properties of the light. For simplicity, we will first consider continuous-wave (CW) lasers, which are lasers where the photon flux during operation reaches a time-independent steady state. Most commonly seen inexpensive commercial lasers (laser pointers, barcode scanners, etc.) are CW lasers.

In the steady state, the loss must balance the gain exactly:

$$\gamma(\nu) = \alpha_r(\nu) \tag{10.6}$$

Since

$$\gamma(\nu) = N\sigma(\nu) \tag{10.7}$$

and

$$N = \frac{N_0}{1 + \tau_sW_i} = \frac{N_0}{1 + \tau_s\phi\,\sigma(\nu)} \tag{10.8}$$

it is possible to re-write this in terms of the steady-state photon flux $\phi$. We get

$$\phi = \begin{cases} \phi_s(\nu)\left(\dfrac{N_0}{N_t} - 1\right), & N_0 > N_t \\ 0, & N_0 \le N_t \end{cases} \tag{10.9}$$

where

$$\phi_s(\nu) = \frac{1}{\tau_s\sigma(\nu)} \tag{10.10}$$

is called the saturation photon-flux density. Note that this is not the maximum photon flux possible. Rather, it is the photon flux at which the population inversion drops by a factor of 2 from its value for zero photon flux. Above threshold, the actual value of $N$ is clamped at a value of $N_t$ due to the photon flux generated by the laser (see this by substituting equations 10.9 and 10.10 into 10.8). The photon number density $\mathcal{N}$ (not to be confused with the inversion $N$) is

$$\mathcal{N} = \frac{\phi}{c} \tag{10.11}$$

which can be written in the intuitive form

$$\mathcal{N} = (N_0 - N_t)\,\frac{\tau_p}{\tau_s}. \tag{10.12}$$

Usually when we make a laser we are interested in getting the light outside of the resonator so we can use it. Usually one of the mirrors is only partially reflective, allowing us to bring some fraction of the beam stored inside the resonator to the outside world. This is called the output coupler. If we assume the output coupler has a transmittance of $T$, the output flux density is

$$\phi_o = \frac{T\phi}{2} \tag{10.13}$$

where the factor of 2 takes into account that only half of the photons are traveling in the “outward” direction at any given time. The intensity is simply

$$I_o = h\nu\,\phi_o = \frac{h\nu\,T\phi}{2} \tag{10.14}$$

and the power is $P_o = I_oA$, where $A$ is the cross-sectional area of the laser beam.

Often we wish to maximize the output laser intensity. Since the transmittance of the output coupler is also a loss mechanism that contributes to $\alpha_r$, we must balance this additional loss against the increased transmission. We break up the loss coefficient into two parts

$$\alpha_r = \alpha_{r2} - \frac{1}{2d}\ln(1 - T) \tag{10.15}$$

where $\alpha_{r2}$ is the loss excluding the output coupler. The output photon flux is then

$$\phi_o = \frac{1}{2}\,\phi_s\,T\left[\frac{g_0}{L - \ln(1-T)} - 1\right] \tag{10.16}$$

with

$$g_0 = 2\gamma_0(\nu)\,d \tag{10.17}$$

and

$$L = 2\alpha_{r2}\,d. \tag{10.18}$$

This is plotted in figure 10.1. For $T \ll 1$ we have

$$T_{optimal} \approx \sqrt{g_0L} - L. \tag{10.19}$$
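The trade-off described by equation 10.16 is easy to explore numerically. The sketch below scans the output coupler transmittance on a grid for the parameters quoted in the caption of figure 10.1 ($g_0 = 0.5$, $L = 0.02$), finds the transmittance that maximizes the output flux, and compares it with the small-$T$ approximation of equation 10.19. The grid search and the function name are choices made for this example only.

```python
import math


def output_flux(transmittance, g0, loss):
    """Output photon flux of eq. 10.16 in units of phi_s (zero below threshold)."""
    value = 0.5 * transmittance * (g0 / (loss - math.log(1 - transmittance)) - 1)
    return max(value, 0.0)


g0, loss = 0.5, 0.02  # values from the caption of figure 10.1

# Brute-force scan of T from 0.0001 to 0.4999 in steps of 0.0001.
grid = [k / 10000 for k in range(1, 5000)]
t_best = max(grid, key=lambda t: output_flux(t, g0, loss))
t_approx = math.sqrt(g0 * loss) - loss

print(f"numerical optimum    T = {t_best:.4f}")
print(f"approximation 10.19  T = {t_approx:.4f}")
print(f"flux at optimum        = {output_flux(t_best, g0, loss):.4f} phi_s")
```

The numerical optimum lands close to the value of about 0.07 quoted for figure 10.1, slightly below the approximate result, since replacing $-\ln(1-T)$ by $T$ underestimates the loss added by the output coupler.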

10.3 Spectrum and longitudinal modes

The requirement that the gain exceed the loss and that the photon frequency satisfy the resonator conditions place restrictions on the allowed frequencies of a laser. A common situation is depicted in figure 10.2 that shows resonator modes and the gain profile of an amplifier medium. The value of $\alpha_r$ sets a lower bound on the gain, giving then a range of allowed frequencies. Only for resonator modes within this range can laser oscillation occur. Whether they actually do lase depends on the mechanism leading to the lineshape of the gain curve.
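A rough estimate of how many longitudinal modes fit inside the allowed frequency range can be sketched as follows. Two assumptions are made that are not spelled out in this section: the small-signal gain is taken to be Lorentzian, so that $\gamma_0(\nu) > \alpha_r$ holds over a band of width $\Delta\nu\sqrt{\gamma_0(\nu_0)/\alpha_r - 1}$, and the modes are taken to be spaced by $c/2d$ (the inverse of the round-trip time $2d/c$ of an empty resonator of length $d$). The numbers and function names are hypothetical.

```python
import math

C = 2.99792458e8  # speed of light in vacuum, m/s


def mode_spacing(length, n=1.0):
    """Longitudinal mode spacing c / (2 n d), assumed for a simple two-mirror resonator."""
    return C / (2 * n * length)


def gain_bandwidth(fwhm, gain_ratio):
    """Width of the band where a Lorentzian gamma_0(nu) exceeds alpha_r.

    gain_ratio is gamma_0(nu_0) / alpha_r; the band vanishes at or below threshold.
    """
    if gain_ratio <= 1:
        return 0.0
    return fwhm * math.sqrt(gain_ratio - 1)


def max_lasing_modes(length, fwhm, gain_ratio):
    """Approximate number of longitudinal modes inside the allowed band."""
    return int(gain_bandwidth(fwhm, gain_ratio) / mode_spacing(length))


# Hypothetical gas-laser-like numbers (illustration only).
length = 0.5      # resonator length, m
fwhm = 1.5e9      # gain linewidth, Hz
gain_ratio = 2.0  # peak small-signal gain is twice the loss

print(f"mode spacing   = {mode_spacing(length):.3e} Hz")
print(f"allowed band   = {gain_bandwidth(fwhm, gain_ratio):.3e} Hz")
print(f"possible modes = {max_lasing_modes(length, fwhm, gain_ratio)}")
```

This count is only an upper estimate of the modes that can lase; as the text notes, how many actually oscillate depends on whether the gain line is homogeneously or inhomogeneously broadened.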

10.3.1 Homogeneous broadening

For homogeneous line broadening, there is a sequence of events as shown in figure 10.3 that occur just after the laser is “turned on” (imagine that the medium is pumped and in a steady state, but at “activation time” the resonator was placed around the gain medium allowing for photons to become trapped). Initially, the population inversion $N \approx N_0$ is at its maximum value and so the small signal gain $\gamma_0$ is high. If it is sufficiently high, several different longitudinal modes will experience gain above the loss. So, several modes start to lase. As time goes on, however, the rapidly increasing photon flux in the cavity will cause the population inversion $N$ to decrease, which will lower the gain of the medium and suppress some modes. The steady state of the laser has $N = N_t$ for only the one longitudinal mode that is nearest the peak of the gain profile.

Figure 10.1: Plot of the output flux versus output coupler transmission for $g_0 = 0.5$ and loss factor $L = 0.02$. The optimal transmission is about 0.07.

Figure 10.2: Spectra of allowed resonator modes and the gain profile of the laser. Only for modes where the gain exceeds the resonator loss is lasing possible.

Figure 10.3: Sequence of events in a laser using a homogeneously broadened gain medium, starting just after the resonator is active (see text). (Diagram labels: gain, loss, intensity.)

This is, however, not the whole story. In real homogeneously broadened lasers, there is a phenomenon called spatial hole burning. The wave associated with a particular mode of the cavity sets up a standing wave inside the resonator, and so different points inside the gain medium see different average intensities. The saturation effect depletes the population of these parts of the gain medium. If, however, one of the other resonator modes has a different spatial intensity profile it may see enough gain from the undepleted part of the gain medium to lase.

10.3.2 Inhomogeneous broadening

For inhomogeneously broadened gain media the situation is a little different. Initially (just after laser turn-on) the situation is similar: the gain for a range of frequencies encompassing several modes is above the lasing threshold. After this, however, there is a big difference. As photon flux builds up, the inversion of atoms that coincide with gain of allowed resonator modes is depleted down to $N_t$. Since the different modes of the resonator achieve gain using different atoms, this works independently for each mode. As a result we get a gain profile as shown in figure 10.4, where the gain at frequencies corresponding to allowed modes is “burned out.” This is called “spectral hole burning” and leads to strongly multi-mode lasing in inhomogeneously broadened lasers.

10.4 Transverse modes

As discussed in week 8, there are several different transverse modes of a given resonator. For spherical mirror resonators, these can be described as Hermite-Gaussian or Laguerre-Gaussian modes. In general, transverse modes experience differences in the gain and loss due to their different spatial distributions. They also have different longitudinal modes with the same spacing but with some relative offset. Losses from finite mirror sizes are usually smallest for low-order beams, which has a tendency to make Gaussian beams easier to lase. If, however, there is an obstruction in the cavity this will tend to favor a mode with an intensity node at that location.

Figure 10.4: Spectral hole burning in a laser with an inhomogeneously broadened gain medium. The gain is depleted only at frequencies corresponding to resonator modes. (Figure from Saleh & Teich, fig 15.2-7.)

Another factor is the gain depletion. For a homogeneously broadened medium, the strongest mode tends to suppress the others below threshold ... but spatial hole burning can allow some other modes to oscillate as well. Usually lasers are designed to lase only in one transverse mode, but for some special applications (e.g. very high power) it is sometimes useful to make multiple modes.

10.5 Mode selection

Usually a laser is made to run in only one mode by inserting something inside the resonator to somehow cause undesired modes not to lase. We’ll run just briefly through some different ways this is done.

10.5.1 Selecting a laser line

Sometimes the gain medium and pumping allow in principle lasing between several different atomic transitions. Argon-ion lasers, for example, have six different lines from 488 nm to 514 nm. Usually we want to pick only one of these. A line is typically selected by introducing some kind of intracavity element that rejects the “wrong” lines. A common scheme is to place a prism inside the resonator, arranged so that only the correct line is deflected in such a way as to have a stable resonator (see figure 10.5).

Figure 10.5: Examples of how to select for laser line, transverse mode and polarization in a laser cavity. The cavity is stable only for a small range of wavelengths due to the refraction of the prism. An aperture inside the resonator also selects for the Gaussian transverse mode. Finally, polarization losses for light polarized out of plane cause the lasing of only one polarization. (Diagram labels: gain medium, output coupler, “wrong” line, prism, high reflector, “wrong” polarization, aperture.)

10.5.2 Transverse mode selection

As hinted at in section 10.4, a particular transverse mode can be selected for by introducing a mask or aperture inside the cavity to increase the loss for all modes except the one we are interested in. This is also shown in figure 10.5. Usually we are interested in the low-order modes, so often there are apertures in the beam that try to limit the size of the transverse modes. Sometimes an “accidental” intensity mask is formed by optically induced damage to an intracavity optic, resulting in a mode with a node in the center of the beam.

10.5.3 Polarization selection

So far we did not talk much about polarization. In general, a resonator made of spherical mirrors can support a two-dimensional space of possible polarizations. If the gain medium also has cylindrical symmetry about the optic axis, the laser output will have no preferred polarization and generally be unpolarized. If we want polarized laser output, we have to introduce some polarization-dependent gain or loss. Usually this is done by placing a polarizer inside the cavity. The polarizer does not need to be particularly good: we just need sufficient contrast to push the loss of the undesired polarization direction above the gain level. Often this is done as shown in figure 10.5, where faces of the gain medium cut at Brewster’s angle act as polarizers to reject light polarized out of the plane of the drawing.

10.5.4 Longitudinal mode selection

Although the spacing between modes is usually too fine to be dealt with effectively by an intra-cavity prism, similar methods can be used to select only one desired mode. In the exercises you will investigate using a Fabry-Perot etalon to filter out only one particular line. For this the finesse of the etalon must be high enough to isolate a single longitudinal mode, and the etalon thickness must be small enough so that the spacing between adjacent etalon modes is enough so that only one falls within the gain bandwidth of the gain medium.

10.6 Pulsed lasers: overview

Here we very briefly talk about how pulsed lasers operate. Pulsed laser operation is important for several reasons. First, very high laser powers are generally not sustainable over a long period of time due to gain depletion. This means that high peak powers are pretty much the exclusive domain of pulsed lasers. This is extremely important for nonlinear optics applications, since these rely on processes that are measurable only at very high peak intensities (i.e. the response does not scale linearly with intensity). Pulsed lasers are also critical for the field of ultrafast laser science, where short pulses allow for resolving the time scales of atomic vibrations or even electronic wavepackets.

There are several different ways in which one can force a laser into pulsed operation, which have increasing levels of complexity:

• Gain switching refers simply to repeatedly turning on and off the pumping of the levels that gives population inversion. The laser will start to lase when $N_0 > N_t$ and will stop once the gain is turned off and $N$ dips below $N_t$. The duration of the pulses achievable with this method is limited by several time scales. The initial build-up of photon flux happens with a time scale given by $t_{sp}$. The decay time after the gain is switched off is given by $\tau_p$. These depend on the medium and the cavity, but the method is generally suitable for long pulse durations of several microseconds or longer.

• Q-switching refers to pulsing by modulating the losses in the cavity. The idea is that most of the time the losses are high and so the Q-factor of the resonator is so bad that lasing cannot happen. When we want the lasing to happen, we change something in the cavity to make the Q-factor much better, leading to lasing. Physically this is usually done using a birefringent material with optical properties that can be modified by applying an electric field (more on this in week 11). Relative to gain switching this has the advantage that the population inversion is nearly always there, so we do not have to wait for it to build up. Q-switching is in fact a pretty efficient way to squeeze maximum peak power out of a laser system. Usually Q-switched lasers are limited to several nanoseconds pulse duration (roughly equivalent to $\tau_p$ in the high-Q state).

• Mode locking leads to some of the shortest pulses, and can be used to obtain pulse durations on the order of 10 fs for the visible range. Although a thorough description is a bit beyond what we can cover here, we will discuss the general concept (see Siegman’s Lasers, chapters 27 and 28, for more details).

The basic goal of mode locking is to take a laser with many adjacent longitudinal modes and make them all lase with a fixed phase relationship, such that the phases all add up constructively at one spatial location at some particular time. The “locked modes” will then add up to produce a train of pulses, the duration of which depends on the total range of frequencies used (see week 2, section 6). The time spacing between pulses in the train is $2d/c$, which just indicates that there is only one pulse bouncing back and forth through the resonator.
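The way locked modes add up to a pulse train can be checked numerically. The following is a minimal sketch, not part of the original notes: it sums N equally spaced longitudinal modes of equal amplitude and evaluates the intensity at a few times. The resonator length `d = 1.5` m, the mode count `N = 20` and the function name `mode_locked_intensity` are illustrative assumptions, and the common optical carrier frequency is dropped since it only contributes an overall phase.

```python
import cmath
import math

def mode_locked_intensity(t, n_modes, round_trip_time, phases=None):
    """Intensity |sum of modes|^2 at time t for equally spaced modes of unit amplitude."""
    if phases is None:
        phases = [0.0] * n_modes  # locked case: all modes share the same phase
    spacing = 2 * math.pi / round_trip_time  # angular mode spacing, 2*pi*c/(2d)
    field = sum(
        cmath.exp(1j * (m * spacing * t + phases[m])) for m in range(n_modes)
    )
    return abs(field) ** 2

d = 1.5             # assumed resonator length in metres
c = 299_792_458.0   # speed of light in m/s
T = 2 * d / c       # round-trip time 2d/c, about 10 ns here
N = 20              # assumed number of locked modes

peak = mode_locked_intensity(0.0, N, T)      # all modes in phase
next_peak = mode_locked_intensity(T, N, T)   # one round trip later
between = mode_locked_intensity(T / 2, N, T) # halfway between two pulses
print(peak, next_peak, between)
```

With locked phases the peak intensity is N² times the single-mode intensity and it recurs after one round-trip time, while halfway between pulses the modes cancel. Passing a list of random values as `phases` removes the clean pulse train, which illustrates why the fixed phase relationship matters.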

There are several methods used to achieve mode locking. Active mode locking employs an intracavity device that modulates the resonator loss in a manner similar to the Q-switch. In this case, however, the extra loss is switched off only for very short times with a periodicity equal to the round-trip time of photons inside the resonator. This suppresses lasing for all but a particular phase relationship among the different longitudinal modes. At some point the “right” phase combination will be tried out by the laser, leading to amplification of a mode-locked pulse.

Active mode locking is actually not used so much nowadays. Most modern mode locking is done using passive nonlinear optical elements that sit inside the laser cavity. The elements and cavity are arranged so that the cavity loss decreases strongly with the intensity of light in the cavity. One particular set of phase-related modes will start to gain in intensity, which then suppresses all other possible combinations of mode phases below the lasing threshold due to depletion effects.

10.7 Examples of lasers

Here we will give a brief overview of some types of lasers and how they fit in with the theory discussed so far. Last week we already talked a little about ruby and Nd3+ glass lasers. Many more examples can be found in the textbook.

10.7.1 Ti3+:Al2O3

Titanium-doped sapphire based lasers are a mainstay in ultrafast laser labs due to the excellent thermal properties of sapphire and the very broad bandwidth of the lasing transition that allows lasing from 700-1050 nm. A sketch of the energy levels for the 4-level lasing scheme is shown in figure 10.6. The pump is usually driven by another laser operating in the green range of the spectrum (e.g. frequency-doubled Nd:YAG, Nd:YVO4 or Yb:YAG). The broad range of energies for level 1 is due to the interaction of the d-orbital of the Ti ion with local vibrations in the sapphire. This wide range allows for mode-locked lasing, routinely giving 10 fs pulses at repetition rates of about 100 MHz with pulse energies of several tens of nJ.

10.7.2 CO2 lasers

The energy levels involved in lasers do not need to rely on electronic transitions. Gases of polar molecules have vibrational, rotational and librational energy transitions that can interact with light in a fashion analogous to electronic levels. Figure 10.7 shows the lasing scheme for a CO2 mid-infrared laser that uses vibrational energy level transitions.

Figure 10.6: Energy levels involved in lasing in Ti:Al2O3 (from Saleh & Teich, figure 15.3-4).

Figure 10.7: Levels in a CO2 laser. The gain medium is actually a mixture of gases including CO2 and N2. Pumping is accomplished by applying a current through the gas (glow discharge). The N2 helps make the pumping more efficient by transferring energy to excited states of CO2 vibrations. Figure from K. Uno, Laser Pulses - Theory, Technology and Applications.

Figure 10.8: Sketch of a free electron laser (see text). This figure is from Saleh & Teich, figure 15.3-8.

10.7.3 Free-electron laser

A bit of a strange example is the free-electron laser, which is a laser that can be described without using any quantum mechanics at all (but it does need relativity!). As shown in figure 10.8 the gain medium is a beam of electrons that have been accelerated to nearly the speed of light. A periodic magnetic structure forces the beam of electrons to wiggle back and forth along their trajectory. Since accelerating charged particles radiate light, this wiggling process causes the emission of radiation that is mostly directed forward of the electron beam. This “spontaneous emission” of the electrons can then be caught inside a resonator and made to repeatedly interact with the electron beam again. The alternating electric field of the light causes some of the electrons to speed up a little, and others to slow down. If this goes on long enough the electrons in the beam start to bunch together and radiate in phase with each other. This “stimulated emission” results in lasing provided the gain is sufficient.

Originally FELs were used to generate highly tunable radiation in the far-infrared at frequencies where conventional lasers could not operate. More recent generations of FELs have been used to push laser technology into higher frequencies. Prominent examples are the hard x-ray lasers LCLS (Stanford, USA) and SACLA (Japan). Another such device is under construction here in Switzerland: the “SwissFEL.” These devices are run in a pulsed mode and offer < 100 fs duration pulses at wavelengths small enough to perform x-ray diffraction. My research group uses these FELs to study the dynamics of structural changes in solid-state materials.

A good review of free electron laser theory with regard to x-rays can be found in Huang and Kim, Phys. Rev. Spec. Top. 10, 034801 (2007). A copy of this article is available on the moodle page for this course.

Chapter 11

Polarization optics 1

Learning objectives

This week we will be discussing the use of optical elements with properties that depend on the polarization of light. After this week you should be able to

• Use the Poincaré sphere construction to describe polarization of light

• Apply the Jones formalism to describe the function of common polarization optics

• Describe what isotropic, uniaxial and biaxial materials are

• Explain the concept of the refractive index ellipsoid (eq. 11.37, 11.38)

• Determine the normal polarization modes for light propagating in a uniaxial or biaxial crystal (eq. 11.43, 11.44)

• Explain the concept of the dispersion surface

• Determine the direction of energy flow in a uniaxial crystal given the direction of the wavevector

• Describe how magnetic and electric fields can be used to construct polarization devices

• Describe how to make a polarizer, wave retarder, or an optical isolator

11.1 Describing polarization: Poincaré sphere and the Stokes vector

As we briefly discussed in week 1, one basic property of light is polarization, which here we will refer to as the direction of the oscillating electric field in a propagating electromagnetic wave. There are several different ways in which the polarization of light can be described. Here we will discuss three of the more common ways: the Poincaré sphere, the Stokes parameters and the Jones vector formalism.

Let’s start off by assuming that we have a plane wave propagating in the +z direction in free space. If we then look at the electric field as a function of time at one particular point, we can separately write the x and y components (using complex notation) as

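$$E_x = a_x e^{i(\omega t + \phi_x)} \tag{11.1}$$

$$E_y = a_y e^{i(\omega t + \phi_y)} \tag{11.2}$$

where $a_x$, $a_y$, $\phi_x$ and $\phi_y$ are all real. We can then write the real part of the electric field vector as

$$E_x = a_x \cos(\omega t + \phi_x) \tag{11.3}$$

$$E_y = a_y \cos(\omega t + \phi_y). \tag{11.4}$$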

These are the parametric equations for an ellipse

$$\frac{E_x^2}{a_x^2} + \frac{E_y^2}{a_y^2} - 2\cos(\phi_y - \phi_x)\,\frac{E_x E_y}{a_x a_y} = \sin^2(\phi_y - \phi_x) \tag{11.5}$$

as depicted in figure 11.1.

The shape of this ellipse can be characterized in several different ways. One way which we will find convenient is to define an angle $\psi$ as the angle of the major axis (the long axis) with respect to the x-axis. To define the width of the ellipse, we introduce another angle $\chi$ to be the angle between the major axis and the corner of the smallest-area rectangle that includes the entire ellipse (see figure 11.1). Mathematically,

$$\tan 2\psi = \frac{2R}{1 - R^2}\cos\phi \tag{11.6}$$

$$\sin 2\chi = \frac{2R}{1 + R^2}\sin\phi \tag{11.7}$$

$$R = \frac{a_y}{a_x} \tag{11.8}$$

$$\phi = \phi_y - \phi_x \tag{11.9}$$

Note that the sign of $\chi$ indicates whether the electric field rotates clockwise ($\chi > 0$) or counter-clockwise ($\chi < 0$) when viewed by an observer looking at the beam in the −z direction (against the beam propagation direction).

All that we really need to specify the polarization state of light are the values of $\psi$ and $\chi$. A common visualization method for describing these angles maps them onto points on a unit-radius sphere, called the Poincaré sphere (figure 11.2). In spherical coordinates, we select a point $(r, \theta, \varphi) = (1, 90^\circ - 2\chi, 2\psi)$, where $\theta$ is the polar angle and $\varphi$ is the azimuthal angle (not to be confused with the phase difference $\phi$). The points on the sphere which represent linearly polarized states ($\chi = 0$) are along the equator of the sphere. On one side of the equator ($\varphi = 0$) we have x-polarized light, whereas on the other side at $\varphi = 180^\circ$ we have the orthogonal y-polarized light. As we move on the sphere from the equator toward either of the two poles, the polarization becomes more elliptical. At the north pole ($2\chi = 90^\circ$) we have right circularly polarized light, where the E-field tip traces out a clockwise circle in a given (x,y) plane when viewed in the −z direction. At the south pole ($2\chi = -90^\circ$) the light is left circularly polarized, where the E-field vector moves in the opposite direction.

Figure 11.1: The ellipse traced out by the electric field of an optical pulse with a definite polarization state. At any given time, the tip of the electric field vector lies on the ellipse. As time progresses, the tip moves around the ellipse. (Axes: $E_x$, $E_y$; labelled quantities: $\psi$, $\chi$, $a_x$, $a_y$, $a$, $b$.)

Figure 11.2: The Poincaré sphere. Each point on the sphere indicates a polarization state, as discussed in the text. This figure is from Saleh & Teich, figure 6.1-5.

The Poincaré sphere gives only the polarization state, but says nothing about the amplitude of the electric field. Another way to describe polarization is by using the Stokes parameters, which also specify the magnitude of the E-field. In terms of our above variables, these four parameters are

$$S_0 = a_x^2 + a_y^2 \tag{11.10}$$

$$S_1 = a_x^2 - a_y^2 \tag{11.11}$$

$$S_2 = 2 a_x a_y \cos\phi \tag{11.12}$$

$$S_3 = 2 a_x a_y \sin\phi \tag{11.13}$$

Note that these parameters are not completely independent, since $S_0^2 = S_1^2 + S_2^2 + S_3^2$.
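As a quick numerical illustration (a sketch added here, not taken from the lecture), the Stokes parameters of eqs. 11.10 to 11.13 can be computed directly and used to recover the ellipse angles. Dividing eq. 11.12 by eq. 11.11 gives $\tan 2\psi = S_2/S_1$, and dividing eq. 11.13 by eq. 11.10 gives $\sin 2\chi = S_3/S_0$, which reproduce eqs. 11.6 and 11.7. The function names are hypothetical, and choosing the branch of the arctangent with `atan2` is an assumption made so that $\psi$ refers to the major axis.

```python
import math

def stokes_parameters(a_x, a_y, phi):
    """Stokes parameters (S0, S1, S2, S3) from amplitudes and phase difference phi."""
    s0 = a_x**2 + a_y**2
    s1 = a_x**2 - a_y**2
    s2 = 2 * a_x * a_y * math.cos(phi)
    s3 = 2 * a_x * a_y * math.sin(phi)
    return s0, s1, s2, s3

def ellipse_angles(a_x, a_y, phi):
    """Orientation psi and ellipticity angle chi (radians) of the polarization ellipse."""
    s0, s1, s2, s3 = stokes_parameters(a_x, a_y, phi)
    psi = 0.5 * math.atan2(s2, s1)                      # tan(2 psi) = S2 / S1
    chi = 0.5 * math.asin(max(-1.0, min(1.0, s3 / s0))) # sin(2 chi) = S3 / S0
    return psi, chi

# Linear polarization at 45 degrees: equal amplitudes, no phase difference.
psi_lin, chi_lin = ellipse_angles(1.0, 1.0, 0.0)

# Right circular polarization: equal amplitudes, phase difference of pi/2.
_, chi_circ = ellipse_angles(1.0, 1.0, math.pi / 2)

print(math.degrees(psi_lin), math.degrees(chi_lin), math.degrees(chi_circ))
```

For circular polarization the orientation $\psi$ is not meaningful, so only $\chi$ is used in that case.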

11.2 Jones vector formalism

The Jones vector formalism is yet another way to describe polarization. The Jones vector is simply

$$\mathbf{J} = \begin{pmatrix} a_x e^{i\phi_x} \\ a_y e^{i\phi_y} \end{pmatrix} \tag{11.14}$$

which also can completely specify the state of the light field at a particular point in space. Examples of Jones vectors are:

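• Linearly polarized light along x:

$$\mathbf{J} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{11.15}$$

• Linearly polarized light along y:

$$\mathbf{J} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{11.16}$$

• Right circular polarization:

$$\mathbf{J} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} \tag{11.17}$$

• Left circular polarization:

$$\mathbf{J} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} \tag{11.18}$$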

We say that two states of polarization are orthogonal if and only if

$$(\mathbf{J}_1, \mathbf{J}_2) = (a_{1x}e^{i\phi_{1x}})(a_{2x}e^{-i\phi_{2x}}) + (a_{1y}e^{i\phi_{1y}})(a_{2y}e^{-i\phi_{2y}}) = 0 \tag{11.19}$$

Examples of orthogonal polarization states are x- and y- linearly polarized light, or right- and left-circularly polarized light. Any Jones vector can be represented as a weighted sum of two orthogonal Jones vectors.

The Jones vector representation is useful because it allows us to use 2 × 2 matrices to describe the action of devices or materials that alter the intensity or polarization of light. If $\mathbf{J}_1$ is the input state, the output state can be written as

J2 = TJ1 (11.20)

where T is our matrix. Examples of matrices include:

• Linear polarizers. An optical element that only passes x-polarized light is described by

$$T = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{11.21}$$

• Wave retarders. In some materials (as we will soon see), light of one linear polarization can be delayed relative to the orthogonal polarization. If the x-axis is the faster direction, the matrix representing this is

$$T = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\Gamma} \end{pmatrix} \tag{11.22}$$

Certain values of $\Gamma$ are more common and useful, and so get special names. For $\Gamma = \pi/2$, the slow-axis light is delayed by one quarter cycle relative to the fast axis, and so this is called a quarter-wave retarder. For $\Gamma = \pi$ it is a half-wave retarder. Quarter-wave retarders can transform linearly polarized light into circularly polarized light and vice versa. Half-wave retarders can rotate the polarization of linearly polarized light, or can change right circularly polarized light into left circularly polarized light.

• Polarization rotators rotate the polarization of light by an angle $\theta$:

$$T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{11.23}$$

In the case of the wave retarders above we gave matrices assuming that the x-axis is the “fast” axis. We can also make an arbitrary axis the “fast” axis simply by using a coordinate transform

$$T' = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\Gamma} \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{11.24}$$

where the rightmost matrix rotates the Jones vector by an angle $-\theta$ (expressing it along the axes of the retarder), the middle matrix applies the retardation, and the first matrix restores the original coordinate system by rotating by $\theta$. Hence the total effect is that $T'$ is a retardation with a fast axis at an angle of $\theta$ with respect to the x-axis.
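The matrix product in eq. 11.24 is easy to try out numerically. The sketch below is an illustration added to these notes rather than part of the lecture: it builds the rotated retarder from plain 2 × 2 tuples and applies it to x-polarized light. The helper names `rotation`, `matmul`, `retarder` and `apply` are hypothetical.

```python
import cmath
import math

def rotation(theta):
    """Rotation matrix of eq. 11.23 as nested tuples."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def retarder(gamma, theta=0.0):
    """Wave retarder with retardation gamma and fast axis at angle theta (eq. 11.24)."""
    core = ((1, 0), (0, cmath.exp(-1j * gamma)))
    return matmul(rotation(theta), matmul(core, rotation(-theta)))

def apply(t, j):
    """Apply a Jones matrix t to a Jones vector j."""
    return tuple(t[i][0] * j[0] + t[i][1] * j[1] for i in range(2))

x_polarized = (1, 0)

# Quarter-wave retarder with its fast axis at 45 degrees.
circular = apply(retarder(math.pi / 2, math.pi / 4), x_polarized)

# Half-wave retarder with its fast axis at 30 degrees.
rotated = apply(retarder(math.pi, math.radians(30)), x_polarized)

print(circular[1] / circular[0])  # ratio of y to x component
print(rotated)
```

In this convention the quarter-wave retarder at 45° turns x-polarized light into a state proportional to the right circular Jones vector of eq. 11.17, and the half-wave retarder at an angle $\theta$ rotates the linear polarization by $2\theta$.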

The normal modes of a polarization system given by a matrix $T$ are the Jones vectors that are eigenvectors of the matrix, i.e. a normal mode $\mathbf{J}$ satisfies

TJ = µJ (11.25)

for some scalar µ.
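As a short worked example (added here for illustration, using the rotator of eq. 11.23 and the circular Jones vectors of eqs. 11.17 and 11.18), direct multiplication gives

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm i \end{pmatrix} = e^{\mp i\theta}\, \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm i \end{pmatrix}$$

so the normal modes of a polarization rotator are right and left circularly polarized light, with $\mu = e^{\mp i\theta}$. In the same way, the normal modes of the wave retarder of eq. 11.22 are x- and y-polarized light, with $\mu = 1$ and $\mu = e^{-i\Gamma}$ respectively.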

11.3 Anisotropic materials: overview

Since most materials have lower symmetry than vacuum, quite often the properties of light propagation through such materials depend on both the direction of propagation and on the polarization of the light. In some cases this can be ignored. For example in many polycrystalline, amorphous, gaseous or liquid materials there is on a macroscopic scale no directional dependence of the optical properties. In cubic crystals we also have isotropic optical behavior. The rest of this week will be discussing the exceptions to this.

Returning to our discussion of Maxwell’s equations from week 1, remember that in section 4 we assumed that the medium was isotropic, and so the $\mathbf{D}$ vector was always collinear with $\mathbf{E}$, and $\mathbf{H} \parallel \mathbf{B}$. Now we revoke this assumption for the case of $\mathbf{D}$ and $\mathbf{E}$, while still assuming the medium is magnetically isotropic. In fact, we’ll be making our usual assumption that $\mu = 1$, which is valid for the vast majority of dielectric materials at optical frequencies. We then have

$$D_j = \varepsilon_0 \sum_k \varepsilon_{jk} E_k \tag{11.26}$$

where here $j, k$ are indices that run from 1 to 3. Note that now the dielectric constant $\varepsilon$ has gone from being a scalar (single number) to a collection of 9 numbers: a tensor. This is most often represented as a matrix

$$\varepsilon = \begin{pmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} \end{pmatrix} \tag{11.27}$$

Figure 11.3: An ellipsoid, which is a graphical representation of a real, symmetric second rank tensor in three dimensions.

Usually symmetries allow for relationships among these 9 numbers that reduce the number of free parameters. For our present purposes, we will assume that $\varepsilon_{jk}$ is real and symmetric, i.e. $\varepsilon_{jk} = \varepsilon_{kj}$. This is true in most non-absorbing, non-magnetic materials. An important exception is the class of so-called optically active materials, which we discuss in section 11.6.

A way to geometrically visualize $\varepsilon_{jk}$ in the real, symmetric case is to make a 3-D plot of all points $x_j$ which satisfy

$$\sum_{jk} \varepsilon_{jk} x_j x_k = 1 \tag{11.28}$$

This describes the surface of an ellipsoid, as shown in figure 11.3. If we rotate the coordinate system so that the axes of the ellipsoid coincide with the x, y, z axes, the dielectric tensor is diagonal and the ellipsoid is

$$\varepsilon_1 x_1^2 + \varepsilon_2 x_2^2 + \varepsilon_3 x_3^2 = 1 \tag{11.29}$$

In this principal coordinate system, we also have (keeping the factor $\varepsilon_0$ from eq. 11.26)

$$D_1 = \varepsilon_0 \varepsilon_1 E_1 \tag{11.30}$$

$$D_2 = \varepsilon_0 \varepsilon_2 E_2 \tag{11.31}$$

$$D_3 = \varepsilon_0 \varepsilon_3 E_3 \tag{11.32}$$

The permittivities $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ correspond to refractive indices

$$n_1 = \sqrt{\varepsilon_1} \tag{11.33}$$

$$n_2 = \sqrt{\varepsilon_2} \tag{11.34}$$

$$n_3 = \sqrt{\varepsilon_3} \tag{11.35}$$

which are called the principal refractive indices of the material.

Materials where $n_1 = n_2 = n_3$ are isotropic. If two of the refractive indices are equal ($n_1 = n_2$) the material is said to be uniaxial. In this case we call $n_1 = n_2 = n_o$ the ordinary index, and $n_3 = n_e$ the extraordinary index. If $n_e > n_o$ the material is positive uniaxial, and if $n_e < n_o$ it is negative uniaxial. The general case where all indices are different is called biaxial.

For some applications it is convenient to represent the dielectric tensor in a slightly different way. The impermeability tensor is defined as the inverse of the dielectric tensor

$$\eta = \varepsilon^{-1} \tag{11.36}$$

It can be shown that the impermeability tensor and the dielectric tensor share the same principal coordinate system. The ellipsoid

$$\sum_{jk} \eta_{jk} x_j x_k = 1 \tag{11.37}$$

is called the index ellipsoid. In the principal coordinate system this is simply

$$\frac{x_1^2}{n_1^2} + \frac{x_2^2}{n_2^2} + \frac{x_3^2}{n_3^2} = 1 \tag{11.38}$$

and so we see that the half-lengths of the axes of the ellipsoid are the principal refractive indices.

11.4 Propagation in anisotropic materials

Now that we have introduced several different ways to describe the optical properties of anisotropic materials, we are left with the problem of how light propagates through them. We’ll consider this in steps of increasing generality.

11.4.1 Propagation along a principal axis

The first set of cases we will consider is when the direction of propagation (i.e. the wavevector $\mathbf{k}$) is along a principal axis. This simplifies matters considerably, and indeed in many applications of anisotropic materials this is forced to be the case.

Polarization along a principal axis

If the light is also linearly polarized along a principal axis $j$, the wave behaves exactly as it would in an isotropic material with an index of refraction $n_j$ since $\mathbf{D}$ is parallel to $\mathbf{E}$. The wave has a phase velocity of $c/n_j$. Similarly, the orthogonal polarization travels with the phase velocity appropriate to that direction.

For definiteness, let us suppose that the wave propagates in the +z direction, and that the x, y, z axes are the principal axes of the material. An x-polarized wave then propagates with phase velocity $c/n_x$, and a y-polarized wave propagates with phase velocity $c/n_y$.

Polarization along an arbitrary direction

Suppose now that the polarization of the wave is not directed along a principal axis. Let the Jones vector at some point be

$$\mathbf{J}_1 = \begin{pmatrix} a_x e^{i\phi_x} \\ a_y e^{i\phi_y} \end{pmatrix} \tag{11.39}$$

As discussed earlier, this is just a weighted sum of a linearly x-polarized wave and a linearly y-polarized wave. Since we are still operating in the linear approximation of the wave equation, we can treat these two polarizations as independent waves and simply sum their contributions. After a propagation distance $d$ the Jones vector is

$$\mathbf{J}_2 = \begin{pmatrix} a_x e^{i(\phi_x - k_1 d)} \\ a_y e^{i(\phi_y - k_2 d)} \end{pmatrix} \tag{11.40}$$

where $k_1 = 2\pi/\lambda_1 = 2\pi n_1/\lambda_0$ is the wavevector for x-polarized light and $k_2 = 2\pi/\lambda_2 = 2\pi n_2/\lambda_0$ is the wavevector for y-polarized light. The phase retardation (phase difference between x and y polarizations) induced by the propagation is then

$$\Gamma = -2\pi (n_1 - n_2)\, d/\lambda_0 \tag{11.41}$$

We see from this that propagation in an anisotropic medium is a way to make a wave retarder as discussed in section 11.2. In fact the vast majority of actual half- and quarter-wave retarders are made using propagation along a principal axis in a uniaxial crystal.
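To get a feeling for the numbers, eq. 11.41 can be inverted for the crystal thickness that gives a desired retardation. The sketch below is an added illustration: the function names are hypothetical, and the indices `n_o = 1.5443` and `n_e = 1.5534` are assumed, approximate values for quartz near 589 nm that should be replaced by data for the actual material and wavelength.

```python
import math

def retardation(n1, n2, d, wavelength):
    """Phase retardation of eq. 11.41 for indices n1 (x) and n2 (y), thickness d."""
    return -2 * math.pi * (n1 - n2) * d / wavelength

def thickness_for_retardation(n_fast, n_slow, wavelength, gamma):
    """Thickness that gives retardation gamma when x is the fast axis."""
    return gamma * wavelength / (2 * math.pi * (n_slow - n_fast))

n_o, n_e = 1.5443, 1.5534   # assumed illustrative values for quartz
wavelength = 589e-9         # vacuum wavelength in metres (assumed)

d_quarter = thickness_for_retardation(n_o, n_e, wavelength, math.pi / 2)
d_half = thickness_for_retardation(n_o, n_e, wavelength, math.pi)

print(d_quarter * 1e6, "micrometres for a quarter-wave retarder")
print(d_half * 1e6, "micrometres for a half-wave retarder")
```

With these assumed indices the thickness comes out in the range of tens of micrometres, which gives a sense of how thin a retarder with less than one full wave of retardation has to be.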

11.4.2 Propagation in an arbitrary direction

We now consider the fully general case where the direction of propagation is not necessarily along one of the principal axes of the medium. Here we will describe how to calculate the polarization of the normal modes for such a wave using a geometrical construction involving the index ellipsoid. The proof of this we will not give here, but those interested can refer to Saleh & Teich, pages 220-221.

Figure 11.4 shows the index ellipsoid in the principal axis coordinate system. Let us suppose that the direction of the wavevector $\mathbf{k}$ is given in terms of this coordinate system, and we also show this as a vector starting at the center of the ellipsoid (at the origin of the coordinate system). Let’s now draw a plane passing through the origin perpendicular to $\mathbf{k}$. The intersection of this plane with the ellipsoid is an ellipse, which we will call the index ellipse. The half-lengths of the major and minor axes of this ellipse are equivalent to the effective indices of refraction $n_a$ and $n_b$ of the two normal modes. The directions of the major and minor axes give the directions of $\mathbf{D}$ for the two normal modes. The directions of the $\mathbf{E}$ fields for the normal modes may be found by

Figure 11.4: The index ellipsoid, also showing the geometric constructions used to determine the normal polarization modes. This figure is from Saleh & Teich, figure 6.1-5.

$$E_j = \frac{1}{\varepsilon_0} \sum_k \eta_{jk} D_k \tag{11.42}$$

which in the principal coordinate system is

$$E_j = \frac{1}{\varepsilon_0} \frac{D_j}{n_j^2} \tag{11.43}$$

For uniaxial materials, this construction can be simplified a bit. If we take $\theta$ as the angle between $\mathbf{k}$ and the optic axis, one of the axes of the index ellipse will have half-length $n_o$ and be along the direction mutually perpendicular to $\mathbf{k}$ and the optic axis. The other axis will be perpendicular to this, at an angle of $90^\circ - \theta$ from the optic axis. The corresponding index $n(\theta)$ is given by

$$\frac{1}{n(\theta)^2} = \frac{\cos^2\theta}{n_o^2} + \frac{\sin^2\theta}{n_e^2} \tag{11.44}$$

The normal mode with index $n_o$ is called the ordinary wave, and for this wave the directions of $\mathbf{D}$ and $\mathbf{E}$ are the same. The other normal mode is called the extraordinary wave, and for this wave the directions of $\mathbf{D}$ and $\mathbf{E}$ are not parallel.
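Equation 11.44 is simple to evaluate numerically. The following sketch is an added illustration with a hypothetical function name; the indices used for the example (`n_o = 1.658`, `n_e = 1.486`) are assumed, approximate values for calcite in the visible range.

```python
import math

def extraordinary_index(theta, n_o, n_e):
    """Effective index n(theta) of the extraordinary wave, eq. 11.44.

    theta is the angle (radians) between the wavevector and the optic axis.
    """
    inverse_square = math.cos(theta) ** 2 / n_o**2 + math.sin(theta) ** 2 / n_e**2
    return 1.0 / math.sqrt(inverse_square)

n_o, n_e = 1.658, 1.486   # assumed illustrative values (negative uniaxial)

for degrees in (0, 30, 60, 90):
    n_theta = extraordinary_index(math.radians(degrees), n_o, n_e)
    print(degrees, round(n_theta, 4))
```

The index varies smoothly from $n_o$ for propagation along the optic axis ($\theta = 0$) to $n_e$ for propagation perpendicular to it ($\theta = 90^\circ$).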

Figure 11.5: The first octant of the dispersion surface for various optical symmetries. This figure is from Saleh & Teich, figure 6.3-9.

11.5 Dispersion relation

The dispersion relation of a wave refers to the relationship between the frequency $\omega$ and the wavevector $\mathbf{k}$. In isotropic materials $\omega$ is independent of the direction of $\mathbf{k}$. This is no longer the case for anisotropic media. As we will see, the dispersion relation gives a way to see how energy flows in a wave in an anisotropic medium.

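Assuming plane waves with wavevector $\mathbf{k}$ and frequency $\omega$, Maxwell’s equations from week 1 give two important relations:

$$\mathbf{k} \times \mathbf{H} = -\omega \mathbf{D} \tag{11.45}$$

$$\mathbf{k} \times \mathbf{E} = \omega \mu_0 \mathbf{H} \tag{11.46}$$

These imply

$$\mathbf{k} \times (\mathbf{k} \times \mathbf{E}) + \omega^2 \mu_0 \mathbf{D} = 0 \tag{11.47}$$

which, if expressed in component form using the principal coordinate system, is

$$\begin{pmatrix} n_1^2 k_0^2 - k_2^2 - k_3^2 & k_1 k_2 & k_1 k_3 \\ k_2 k_1 & n_2^2 k_0^2 - k_1^2 - k_3^2 & k_2 k_3 \\ k_3 k_1 & k_3 k_2 & n_3^2 k_0^2 - k_1^2 - k_2^2 \end{pmatrix} \begin{pmatrix} E_1 \\ E_2 \\ E_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{11.48}$$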

where $k_0 = \omega/c$ is the magnitude of the wavevector in vacuum. For there to be non-trivial solutions, the determinant of the matrix must be zero. This condition for a fixed value of $\omega$ defines a 3-D surface in k-space called the dispersion surface or k surface. Typical dispersion surfaces for biaxial, uniaxial and isotropic media are shown in figure 11.5. For the isotropic case it is the shell of a sphere. For the uniaxial case, it is two nested surfaces that meet at two points along the optic axis. One of these surfaces is a spherical shell with radius $n_o k_0$. For the biaxial case we have two nested non-spherical surfaces.

We can use the dispersion surface to find the wavevectors of normal modes for a particular propagation direction. A wavevector $\mathbf{k} = k(u_1, u_2, u_3)$ (with $u_1^2 + u_2^2 + u_3^2 = 1$) must satisfy

$$\sum_j \frac{u_j^2 k^2}{k^2 - n_j^2 k_0^2} = 1 \tag{11.49}$$

So for a given direction, there are four possible values of $k$. Two of these are negative, which we discard. We are left with the two solutions that correspond to the normal modes. We can use these to solve eq. 11.48 to get the electric field polarization for the normal modes. This makes a nice alternative to the geometric construction for finding the normal modes of propagation.
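The two positive solutions can be found in closed form. Writing $x = k^2/k_0^2$ and clearing the denominators in eq. 11.49, the cubic terms cancel and one is left with a quadratic $A x^2 - B x + C = 0$ with $A = \sum_j u_j^2 n_j^2$, $B = u_1^2 n_1^2 (n_2^2 + n_3^2) + u_2^2 n_2^2 (n_1^2 + n_3^2) + u_3^2 n_3^2 (n_1^2 + n_2^2)$ and $C = n_1^2 n_2^2 n_3^2$. This rearrangement and the sketch below are additions to the notes; the function name and the example indices are assumptions chosen for illustration.

```python
import math

def normal_mode_indices(u, n):
    """Effective indices k/k0 of the two normal modes.

    u: unit vector (u1, u2, u3) along k in the principal coordinate system.
    n: principal refractive indices (n1, n2, n3).
    Returns the smaller and the larger index.
    """
    a, b, c = (value**2 for value in n)
    u1, u2, u3 = (value**2 for value in u)
    A = u1 * a + u2 * b + u3 * c
    B = u1 * a * (b + c) + u2 * b * (a + c) + u3 * c * (a + b)
    C = a * b * c
    root = math.sqrt(max(B * B - 4 * A * C, 0.0))  # guard against rounding
    return math.sqrt((B - root) / (2 * A)), math.sqrt((B + root) / (2 * A))

# Uniaxial example with assumed indices, optic axis along axis 3,
# wavevector at 30 degrees from the optic axis in the 1-3 plane.
n_o, n_e = 1.658, 1.486
theta = math.radians(30)
low, high = normal_mode_indices(
    (math.sin(theta), 0.0, math.cos(theta)), (n_o, n_o, n_e)
)
print(low, high)
```

For this uniaxial example one solution is the ordinary index $n_o$ and the other agrees with $n(\theta)$ from eq. 11.44, which is a useful consistency check between the algebraic and the geometric route.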

The direction of the average Poynting vector $\langle \mathbf{S} \rangle = \frac{1}{2}\mathrm{Re}\,[\mathbf{E} \times \mathbf{H}^*]$ can also be determined from the dispersion surface. The Poynting vector gives the direction of energy flow in the wave, and it also gives the direction of rays: spatially localized solutions to the wave equation. In an isotropic medium the direction of the Poynting vector is the same as that of the wavevector $\mathbf{k}$. In anisotropic materials this is not necessarily the case. For example, the extraordinary waves in uniaxial crystals have a noncollinear $\mathbf{E}$ and $\mathbf{D}$ that are both perpendicular to $\mathbf{H}$. The wavevector $\mathbf{k}$ is, however, always perpendicular to both $\mathbf{D}$ (see eq. 11.45) and $\mathbf{H}$. This means that $\mathbf{S}$, which is perpendicular to $\mathbf{E}$ and $\mathbf{H}$, cannot be collinear with $\mathbf{k}$.

The direction of the Poynting vector for a given $\mathbf{k}$ on the dispersion surface is always normal to the dispersion surface. Put another way, the direction of $\mathbf{S}$ is parallel to $\nabla_k \omega(\mathbf{k})$. For those interested a proof will be posted to the moodle page later this week.

For uniaxial crystals, the dispersion surface has the relatively simple form

$$(k^2 - n_o^2 k_0^2) \left( \frac{k_1^2 + k_2^2}{n_e^2} + \frac{k_3^2}{n_o^2} - k_0^2 \right) = 0 \tag{11.50}$$

which has two solutions: a spherical shell where $k = n_o k_0$, and an ellipsoid of revolution

$$\frac{k_1^2 + k_2^2}{n_e^2} + \frac{k_3^2}{n_o^2} - k_0^2 = 0 \tag{11.51}$$

Since the optic axis is an axis of rotational symmetry, we can represent this two dimensionally without loss of generality by assuming that $k_2 = 0$ (see figure 11.6).
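Since the Poynting vector is normal to the dispersion surface, the angle between $\mathbf{S}$ and $\mathbf{k}$ for the extraordinary wave follows from the gradient of the ellipse in eq. 11.51 with $k_2 = 0$. For $\mathbf{k}$ at an angle $\theta$ from the optic axis the normal is proportional to $(\sin\theta/n_e^2,\ \cos\theta/n_o^2)$, so the ray direction $\theta_s$ satisfies $\tan\theta_s = (n_o^2/n_e^2)\tan\theta$. This short derivation and the sketch below are additions to the notes; the function name is hypothetical and the indices are assumed, approximate values for calcite.

```python
import math

def walk_off_angle(theta, n_o, n_e):
    """Angle (radians) between the Poynting vector and k for the extraordinary wave.

    theta is the angle between k and the optic axis (the k3 axis).
    """
    normal_1 = math.sin(theta) / n_e**2   # gradient component along k1
    normal_3 = math.cos(theta) / n_o**2   # gradient component along k3
    theta_s = math.atan2(normal_1, normal_3)
    return theta_s - theta

n_o, n_e = 1.658, 1.486   # assumed illustrative values
for degrees in (0, 45, 90):
    rho = walk_off_angle(math.radians(degrees), n_o, n_e)
    print(degrees, round(math.degrees(rho), 3))
```

The walk-off vanishes for propagation along or perpendicular to the optic axis and is a few degrees in between for these assumed indices, which is why the extraordinary ray separates from the ordinary ray inside the crystal.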

11.6 Optical activity

Optical activity refers to the property of some materials to function as polarization rotators. In these materials $\varepsilon_{jk} \neq \varepsilon_{kj}$, and so the treatment of section 11.3 is not valid. In these materials the normal modes are circularly polarized, rather than linearly polarized. Put another way, the phase velocity of right circularly polarized light is different from that of left circularly polarized light. Examples of such materials are tellurium and quartz, as well as solutions of sugar (dextrose, levulose/fructose) and amino acids. These materials all have spiral-like atomic structures that lead to the optical activity. Measurements of optical activity are often used as a way to determine the ratio of “left handed” and “right handed” chemical substances in a solution.

Figure 11.6: Two dimensional representation of the dispersion surface for a uniaxial crystal. The actual dispersion surface is an ellipsoid of revolution about the $k_3$ axis. (Axes: $k_1/k_0$ and $k_3/k_0$, with intercepts marked at $n_o$ and $n_e$.)

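Suppose we start with a linearly polarized state with Jones vector

$$\mathbf{J}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{11.52}$$

We can write this as the sum of a right-circular and left-circular polarized state:

$$\mathbf{J}_1 = \frac{1}{2} \left[ \begin{pmatrix} 1 \\ i \end{pmatrix} + \begin{pmatrix} 1 \\ -i \end{pmatrix} \right] \tag{11.53}$$

Now suppose we propagate this wave through a thickness $d$ of an optically active material with index of refraction $n_+$ for right-circular polarization and index $n_-$ for left-circular polarization. The new polarization state is then

$$\mathbf{J}_2 = \frac{1}{2} \left[ e^{-i 2\pi d n_+/\lambda_0} \begin{pmatrix} 1 \\ i \end{pmatrix} + e^{-i 2\pi d n_-/\lambda_0} \begin{pmatrix} 1 \\ -i \end{pmatrix} \right] \tag{11.54}$$

or, with $\phi = \pi (n_- - n_+) d/\lambda_0$,

$$\mathbf{J}_2 = e^{-i\pi d (n_+ + n_-)/\lambda_0} \begin{pmatrix} \cos\phi \\ -\sin\phi \end{pmatrix} \tag{11.55}$$

which is, up to an overall phase, just a rotation of the polarization direction by an angle $\phi = \pi (n_- - n_+) d/\lambda_0$. With the conventions used here, a positive $\phi$ corresponds to a clockwise rotation when viewed against the propagation direction.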
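The step from eq. 11.54 to a rotated linear polarization can be checked numerically. The sketch below is an added illustration: it evaluates eq. 11.54 directly and then extracts the polarization angle of the result. The function names are hypothetical, and the indices `n_plus` and `n_minus` are assumed values chosen only so that the rotation is of order 20° per millimetre. Only the magnitude of the rotation is compared with $\pi |n_- - n_+| d/\lambda_0$, since the sign depends on the handedness convention.

```python
import cmath
import math

def propagate_optically_active(d, n_plus, n_minus, wavelength):
    """Evaluate eq. 11.54 for x-polarized input light."""
    phase_plus = cmath.exp(-2j * math.pi * d * n_plus / wavelength)
    phase_minus = cmath.exp(-2j * math.pi * d * n_minus / wavelength)
    right, left = (1, 1j), (1, -1j)
    return tuple(
        0.5 * (phase_plus * r + phase_minus * l) for r, l in zip(right, left)
    )

def polarization_angle(j):
    """Angle of a linearly polarized Jones vector and the leftover ellipticity.

    For linear polarization the ratio of the components is real, so its
    imaginary part should vanish.
    """
    ratio = j[1] / j[0]
    return math.atan(ratio.real), abs(ratio.imag)

wavelength = 589e-9                   # assumed vacuum wavelength in metres
d = 1e-3                              # assumed thickness of 1 mm
n_plus, n_minus = 1.54420, 1.54427    # assumed illustrative indices

j2 = propagate_optically_active(d, n_plus, n_minus, wavelength)
angle, ellipticity = polarization_angle(j2)
expected = math.pi * abs(n_minus - n_plus) * d / wavelength
print(math.degrees(abs(angle)), math.degrees(expected), ellipticity)
```

The output polarization is linear (the ratio of the components is real to within rounding) and the magnitude of the rotation matches the formula in the text.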

Figure 11.7: Influence of an optically active material on linearly polarized light. Regardless of the initial polarization direction, the material will cause a rotation of the polarization by an amount determined by the activity and the thickness. The direction of rotation depends on the direction of the wave. (Panels: forward direction and backward direction, each showing the polarization before and after the material.)

The rotation direction depends on the direction of the wave, since the definition of right and left circular polarization also depends on the direction of $\mathbf{k}$. As shown in figure 11.7, this means that a linearly polarized wave that passes once through an optically active material, is reflected, and is then made to pass again through the material will have the same polarization direction as when it started. In other words, the rotation induced by optical activity is reversible with time.

11.7 Magneto-optics

The optical properties of materials can also be controlled by the application of external strain, electric or magnetic fields. When we are talking about oscillating magnetic or electric fields we enter into the domain of nonlinear optics, which we will discuss later.

A static magnetic field causes many materials that are not optically active to function as polarization rotators. This is called the Faraday effect. Phenomenologically, the rotary power $\rho$ (the angle of rotation per unit length) is

$$\rho = V B \tag{11.56}$$

where $V$ is known as the Verdet constant and $B$ is the projection of the magnetic B field along the direction of propagation. Here $B$ can be positive or negative, leading to rotations in either direction. Materials showing the Faraday effect include various glasses and transparent garnet crystals. Typical values of the Verdet constant are around $3 \times 10^{-4}$ degrees/(Oersted cm) at optical wavelengths.

An interesting contrast between the Faraday effect and optical activity is illustrated in figure 11.8. If a plane polarized wave propagates through a material with a non-zero Verdet constant with a B field applied in one direction, the light is rotated by a particular angle when it emerges. If now the light is reflected back through the material in the opposite direction, the polarization continues to rotate in the same direction. As a result, the polarization angle changes by twice the angle from a single pass. This perhaps surprising result is a consequence of the symmetry properties of the magnetic field on time-reversal. This particular property is used in practical devices, as discussed in section 11.9.4.
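The contrast between the two effects on a double pass can be illustrated with a few lines of code. This is a simplified sketch added to the notes, under stated assumptions: angles are measured in a fixed laboratory frame, the mirror is ideal and leaves the linear polarization direction unchanged, optical activity rotates by $+\phi$ on the forward pass and $-\phi$ on the return pass, and the Faraday effect rotates by $+\phi$ on both passes. The Verdet constant is the typical value quoted above, while the field of 1000 Oersted and the length of 2 cm are assumed example numbers.

```python
import math

def rotate(angle, vector):
    """Rotate a real 2-D polarization vector by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vector[0] - s * vector[1], s * vector[0] + c * vector[1])

def double_pass_angle(phi, reciprocal):
    """Net polarization angle after a forward and a return pass.

    reciprocal=True models optical activity, reciprocal=False the Faraday effect.
    """
    state = (1.0, 0.0)                       # start x-polarized
    state = rotate(phi, state)               # forward pass
    state = rotate(-phi if reciprocal else phi, state)  # return pass
    return math.atan2(state[1], state[0])

verdet = 3e-4        # degrees per Oersted per cm (typical value from the text)
field = 1000.0       # assumed field in Oersted
length = 2.0         # assumed material length in cm
phi = math.radians(verdet * field * length)   # single-pass rotation, rho * length

optical_activity = double_pass_angle(phi, reciprocal=True)
faraday = double_pass_angle(phi, reciprocal=False)
print(math.degrees(optical_activity), math.degrees(faraday))
```

In this model the optically active material returns the light to its original polarization, while the Faraday rotator leaves it rotated by twice the single-pass angle.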

11.8 Electro-optics

There are also many materials with optical properties that change upon the application of a static electric field. For materials with no inversion symmetry, the Pockels effect describes a linear dependence of the parameters of the index ellipsoid on an applied electric field vector. The more general Kerr effect is actually a nonlinear effect where the index ellipsoid changes quadratically with an applied field. Mathematically, the impermeability tensor is

$$\eta_{jk}(\mathbf{E}^{(0)}) = \eta_{jk} + \sum_l r_{jkl} E_l^{(0)} + \sum_{lm} s_{jklm} E_l^{(0)} E_m^{(0)} \tag{11.57}$$

Figure 11.8: Influence of the Faraday effect on linearly polarized light. Regardless of the initial polarization direction, the material will cause a rotation of the polarization by an amount determined by the Verdet constant, the magnetic field strength, and the thickness. The direction of rotation does not depend on the direction of the wave. (Panels: forward direction and backward direction, each showing the polarization before and after the material.)

The second term on the right-hand side is the Pockels effect, while the last term is the Kerr effect. The tensors $r_{jkl}$ and $s_{jklm}$ are called electro-optic coefficients, and individual elements can often be related to each other via symmetries. We will not get into much more detail here, but it is important to note that these effects can be applied to give rapid control over polarization devices that are very commonly used in optical systems today.

11.9 Polarization devices

With what remains we will give an overview of how the polarization-dependent optical properties of materials can be used to make devices that switch or modulate light.

11.9.1 Polarizers

Polarizers are devices that pass only one particular linear polarization. There are many different types of polarizers, which rely on different physics.

Some polarizers rely on a difference in absorption between two orthogonal linear po-larizations. Such materials are called dichroic, which is unfortunate since this has little todo with color (the term is historical). This is the operating principle of the inexpensivepolarizer sheets you may have seen used. These are sheets of iodine-impregnated polyvinylalcohol that is heated and stretched in one particular direction. Microscopically, the mate-rial looks like a bunch of parallel conductors that preferentially absorb radiation polarizedin the same direction as the stretched molecules.

Polarizers can also use the differences in reflectivity for different orthogonal polariza-tions, as we have already discussed in week 3. Similarly, in an anisotropic crystal polar-izations can be separated by the differences in the refraction angle for different normalmodes. These polarizing beamsplitters have the further advantage that usually bothpolarizations can be used. These devices typically offer the best contrast (ratio of desiredpolarization to undesired polarization).

11.9.2 Wave retarders

As we already mentioned, wave retarders are usually constructed from uniaxial materials (like quartz) and cut so that the light enters at a surface normal to one of the ordinary principal axes. The total retardation is proportional to the thickness of the crystal. Wave retarders that make a phase difference between the two normal directions of less than 2π are called “zero order” wave retarders. They are usually desirable in situations where the wavelength dependence of the retardation should be minimized (e.g. for ultrafast pulses).
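The link between thickness, retardation and wavelength sensitivity can be made concrete with a short sketch. It assumes the standard relation Γ = 2π Δn L / λ0 for a plate of thickness L and birefringence Δn at vacuum wavelength λ0, which is not written out in these notes. The function names and the numbers (Δn = 0.009, roughly the value for quartz, and λ0 = 800 nm) are illustrative assumptions, and the dispersion of Δn itself is ignored.

```python
import math

def retardation(thickness, delta_n, wavelength_vacuum):
    """Phase difference (rad) between the two normal modes after the plate."""
    return 2 * math.pi * delta_n * thickness / wavelength_vacuum

def thickness_for(target_retardation, delta_n, wavelength_vacuum):
    """Plate thickness that produces the requested retardation."""
    return target_retardation * wavelength_vacuum / (2 * math.pi * delta_n)

# Illustrative (assumed) values, not taken from the notes
DELTA_N = 0.009
LAMBDA_0 = 800e-9

# Quarter-wave plates: zero order, and one with ten extra full waves
zero_order = thickness_for(math.pi / 2, DELTA_N, LAMBDA_0)
multi_order = thickness_for(math.pi / 2 + 10 * 2 * math.pi, DELTA_N, LAMBDA_0)

def retardation_error(thickness, shifted_wavelength):
    """Change in retardation when the wavelength moves away from LAMBDA_0."""
    return (retardation(thickness, DELTA_N, shifted_wavelength)
            - retardation(thickness, DELTA_N, LAMBDA_0))

print("zero-order thickness (m):", zero_order)
print("multi-order thickness (m):", multi_order)
print("error at 810 nm, zero order (rad):", retardation_error(zero_order, 810e-9))
print("error at 810 nm, multi order (rad):", retardation_error(multi_order, 810e-9))
```

Because the retardation scales with thickness, the error of the thicker plate is larger by the ratio of the two thicknesses, which is why zero-order plates are preferred for broadband light.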

11.9.3 Intensity control, switching

As shown in figures 11.9 and 11.10, polarizers and wave retarders can be used in combination to control the intensity of light. The first polarizer defines the initial polarization of the light. The wave retarder can then induce an ellipticity into the beam. The last polarizer selects one projection of the ellipse. Varying the total retardation (for example by applying an electric field) allows us to modulate the output intensity. Variants on this can use the electro-optic effect to quickly switch the intensity from one optical path to another. This is the basis of the “Pockels cell” used for a long time in many pulsed laser systems.
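The polarizer, retarder, polarizer sequence can be checked with a few lines of Jones calculus. This is a minimal sketch under stated assumptions: both polarizers pass the x polarization, the fast axis of the retarder sits at 45° to them, and all elements are lossless. The function names are made up for this example. Under these assumptions the transmitted fraction is cos²(Γ/2), which vanishes at a retardation of π.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(angle):
    """Rotation matrix for an angle given in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def retarder(retardation, axis_angle):
    """Jones matrix of a retarder whose fast axis makes axis_angle with x."""
    phases = [[cmath.exp(-0.5j * retardation), 0.0],
              [0.0, cmath.exp(0.5j * retardation)]]
    return matmul(rotation(axis_angle), matmul(phases, rotation(-axis_angle)))

def transmitted_fraction(retardation, axis_angle=math.pi / 4):
    """Intensity fraction passed by polarizer A (x), retarder, polarizer B (x)."""
    m = retarder(retardation, axis_angle)
    # After polarizer A the field is (1, 0); polarizer B keeps its x component.
    return abs(m[0][0]) ** 2

for gamma in (0.0, math.pi / 2, math.pi):
    print(f"retardation = {gamma:.3f} rad -> T = {transmitted_fraction(gamma):.3f}")
```

Sweeping the retardation from 0 to 2π with `transmitted_fraction` traces out a curve of transmitted intensity against retardation.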

11.9.4 Optical isolators

One important application based on the Faraday effect is the optical isolator, sketched in figure 11.11. This is a device that transmits light in only one direction. This is important in many high-power laser applications, where unwanted reflections of even a few percent can potentially damage sensitive optics. In this example light enters through polarizer A and is rotated by 45° clockwise by a Faraday rotator. If polarizer B is also set to 45° clockwise, the light passes through without loss.

Now consider what happens to light following the reverse path. The light from polarizer B is rotated in the same direction by its path through the Faraday rotator, making it cross-polarized with polarizer A. As a result, no light makes it through.
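The argument can be followed step by step by tracking the polarization angle in the lab frame and applying Malus's law at each polarizer. The sketch below assumes ideal polarizers and a lossless rotator; the function names and the comparison with a reciprocal rotator (one whose rotation is undone on the return pass) are additions for illustration.

```python
import math

def malus(polarization_deg, polarizer_deg):
    """Fraction of intensity passed by an ideal polarizer (Malus's law)."""
    return math.cos(math.radians(polarization_deg - polarizer_deg)) ** 2

def isolator_transmission(direction, rotation_deg=45.0, a_deg=0.0, b_deg=45.0):
    """Transmission of polarizer A, Faraday rotator, polarizer B (lab frame)."""
    if direction == "forward":
        polarization = a_deg + rotation_deg   # set by A, then rotated
        return malus(polarization, b_deg)
    if direction == "backward":
        polarization = b_deg + rotation_deg   # set by B, rotated in the SAME sense
        return malus(polarization, a_deg)
    raise ValueError("direction must be 'forward' or 'backward'")

def reciprocal_rotator_backward(rotation_deg=45.0, a_deg=0.0, b_deg=45.0):
    """Backward pass through a reciprocal rotator, for comparison only."""
    polarization = b_deg - rotation_deg       # the rotation is undone on the way back
    return malus(polarization, a_deg)

print("Faraday, forward: ", isolator_transmission("forward"))
print("Faraday, backward:", isolator_transmission("backward"))
print("reciprocal rotator, backward:", reciprocal_rotator_backward())
```

Only the Faraday rotator blocks the return path: a reciprocal rotator would hand the light back to polarizer A in its original polarization.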


[Figure: polarizer A, a wave retarder with its fast axis marked, and polarizer B in sequence.]

Figure 11.9: Device for modulating the intensity of light. The polarized light is made elliptical by the wave retarder, acquiring an ellipticity that depends on the retardation. For a retardation of π it is possible to cross polarize the light with respect to the exit polarizer, switching off the intensity (see figure 11.10).


Figure 11.10: Intensity of the device depicted in figure 11.9 as a function of retardation.


[Figure: two panels, "Left-to-right" and "Right-to-left", each showing polarizer A, a Faraday rotator in a magnetic field B, and polarizer B.]

Figure 11.11: One realization of an optical isolator based on the Faraday effect. This is not to scale: usually Faraday rotators need to be fairly thick due to the inability to make really large B fields.


Chapter 12

Waveguides

Learning objectives

This week we will start a discussion of waveguides, which are important elements of miniature-scale “integrated optics” devices. This follows parts of chapters 8 and 9 in the textbook. After this you should be able to

• Explain how boundary conditions in a plane mirror waveguide lead to waveguide modes (eq. 12.2)

• Calculate the number of modes in a plane mirror waveguide (eq. 12.16)

• Explain the meaning of the dispersion relation for a waveguide (eq. 12.21)

• Calculate the group velocity of a plane mirror waveguide (eq. 12.22)

• Draw the transverse intensity profile of selected waveguide modes for both the plane mirror waveguide and the dielectric waveguide

• Explain the differences in function between the planar dielectric waveguide and the plane mirror waveguide

• Explain conceptually how stepped-index optical fibers work

• Calculate the numerical aperture of optical fibers and planar dielectric waveguides (eq. 12.29)

12.1 Waveguides: an introduction

Waveguides are devices that transmit light (or other electromagnetic radiation) along prescribed paths inside a material. In comparison to more traditional optical schemes that simply use beams in free space, waveguides have the advantage of alignment robustness and the ability to easily circumvent obstructions (especially in the case of optical fibers).


[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 "Wellenleiter und Integrierte Optik" (ETH Zürich, in German), section 8.1 and Fig. 1. The excerpt explains that a dielectric surrounded by a medium of lower refractive index traps light by total internal reflection, and that a waveguide keeps the beam cross-section small over long distances, unlike a beam in free space.]

Figure 12.1: Examples of waveguides: (a) a slab geometry dielectric waveguide; (b) a strip waveguide; (c) an optical fiber. This figure is from Saleh & Teich, figure 8.0-1.

Waveguides are often used in telecommunications to transmit light over long distances. They are also often used in biomedical applications where light is to be brought into or taken out from difficult-to-access places. Waveguides are also a vital component of “integrated optics” devices, where miniaturized optical components are used to manipulate the state of light.

Waveguides come in many different forms. Figure 12.1 shows some examples of typical waveguide geometries, all based on using dielectric materials with different indices of refraction. In this part of the course we will give a fairly detailed treatment of planar-mirror waveguides and a brief overview of salient aspects of the other types.

12.2 Modes of the planar-mirror waveguide

A planar-mirror waveguide is shown in figure 12.2. Physically, it is just two highly reflective mirrors placed parallel to each other a fixed distance d apart, so that light can bounce back and forth in the small space between them. This is not really a practical device to actually construct, since the costs of making highly reflective mirrors like this are a bit too high. Instead dielectric waveguides are used, which we discuss later in section 12.3. The planar mirror waveguide is, however, of pedagogical importance since it is essentially a simplified case of the dielectric planar waveguide.

There are a couple of different ways to think about electromagnetic radiation inside a planar-mirror waveguide. One approach is to use ray optics, where we imagine rays of light bouncing back and forth between the mirrors, as is shown in figure 12.2. In this picture, we can identify modes of the waveguide as waves that self-consistently interfere after a single reflection from each mirror, as shown in figure 12.3. This places conditions on the wavelength λ and the angle θ of the wavevector with respect to the mirror planes. Geometric arguments give

$$\frac{2\pi}{\lambda}\, 2d \sin\theta = 2\pi m \qquad (12.1)$$

where m is any positive integer. Usually we index the allowed θ values with the integer m and get the condition

$$\sin\theta_m = \frac{m\lambda}{2d} \qquad (12.2)$$

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), section 8.2.2 and Fig. 6: a planar mirror waveguide made of two ideal plane-parallel mirrors a distance d apart. The excerpt also contains Keller's derivation of the numerical aperture of a dielectric waveguide, NA = n0 sin θa = n1 cos θT = √(n1² − n2²) (Keller's eqs. 2 to 5).]

Figure 12.2: A planar-mirror waveguide.

An alternate way to view the same problem is to just write down the electric field for two plane waves and find the requirements for these waves to satisfy the boundary conditions for the plane mirror surfaces. Assuming the mirrors are perfect, this requires that the electric field be zero at the mirror surfaces. For s-polarized electric fields (electric field vector pointing always perpendicular to the plane of incidence, in the x-direction) we then have

$$E_{tot} = \mathrm{Re}\left[E_1 e^{i(\omega t - \mathbf{k}_1\cdot\mathbf{r})} + E_2 e^{i(\omega t - \mathbf{k}_2\cdot\mathbf{r})}\right] \qquad (12.3)$$

At y = d/2,

$$E_{tot} = \mathrm{Re}\left[e^{i\omega t}\left(E_1 e^{-ik_{1y}d/2 - ik_{1z}z} + E_2 e^{-ik_{2y}d/2 - ik_{2z}z}\right)\right] \qquad (12.4)$$

Forcing this to zero for all times and values of z requires that k1z = k2z and

$$E_1 e^{-ik_{1y}d/2} = -E_2 e^{-ik_{2y}d/2} \qquad (12.5)$$

Similarly, for the other mirror boundary condition at y = −d/2 we require

$$E_1 e^{ik_{1y}d/2} = -E_2 e^{ik_{2y}d/2} \qquad (12.6)$$

Dividing these we get

$$e^{ik_{1y}d} = e^{ik_{2y}d} \qquad (12.7)$$

Since the medium and the frequency are the same for both waves, we also know that the magnitude of the wavevector k is the same for both. Together with k1z = k2z, this implies that k1y = ±k2y. Since we would like two different waves, we consider only the solution where k1y = −k2y. Equation 12.7 then implies

$$2k_{1y}d = 2\pi m \qquad (12.8)$$

or

$$k_{1y} = \frac{\pi m}{d} \qquad (12.9)$$

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), Fig. 7: (a) after two reflections the wave is again in phase with the original wave; (b) the intensity pattern formed by the interference of two plane waves, which does not change along the waveguide. The excerpt also states the boundary condition at the ideal mirrors (Ex = 0, amplitude reflection coefficient r = −1) and the resulting self-consistency condition (Keller's eqs. 6 and 7).]

Figure 12.3: Condition of self-consistency for the waves in a waveguide. Two bounces, one from each mirror, should lead to a phase of the wave that matches that of the original wave. Figure from Saleh & Teich, 8.1-2.

Since sin θ = k1y/k we recover

$$\sin\theta_m = \frac{m\lambda}{2d} \qquad (12.10)$$

To find the relationship between E1 and E2, we return to equations 12.5 and 12.6. Multiplying these equations gives

$$E_1^2 = E_2^2 \qquad (12.11)$$

which implies that E1 = ±E2. Which of these applies depends on the value of m. For odd m, $e^{-ik_{1y}d/2} = \mp i$ and $e^{-ik_{2y}d/2} = \pm i$, so the two exponentials in equation 12.5 have opposite signs and E1 = E2. For even m, the two exponentials are equal (both +1 or both −1) and we get E1 = −E2.

The sum of these two plane waves then makes a self-replicating solution for the EM field inside the waveguide, otherwise known as a mode of the waveguide. The z-component of the wavevector is the same for both plane waves, and is called the propagation constant βm = k1z = k2z. It is readily derived from the above as

$$\beta_m^2 = k^2 - k_{1y}^2 = k^2 - \frac{m^2\pi^2}{d^2} \qquad (12.12)$$

Higher-order modes (modes with high values of m) make a more oblique angle with respect to the mirrors and thus propagate more slowly through the waveguide. This relationship is shown graphically in figure 12.4.
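The relations for the bounce angle, the transverse wavevector and the propagation constant can be tabulated directly. The following is a minimal sketch; the function name and the example values (a wavelength of 1 µm inside the guide and a mirror spacing of 2.3 µm) are assumptions chosen only for illustration.

```python
import math

def mirror_waveguide_modes(wavelength, d):
    """List the modes of a planar-mirror waveguide.

    wavelength: wavelength inside the medium between the mirrors
    d: mirror separation (same length unit as wavelength)
    Returns a list of (m, theta_m, k_y, beta_m) tuples, theta_m in radians.
    """
    k = 2 * math.pi / wavelength
    modes = []
    m = 1
    while m * wavelength / (2 * d) <= 1.0:          # sin(theta_m) cannot exceed 1
        sin_theta = m * wavelength / (2 * d)         # eq. 12.10
        k_y = m * math.pi / d                        # eq. 12.9
        beta = math.sqrt(max(k**2 - k_y**2, 0.0))    # eq. 12.12
        modes.append((m, math.asin(sin_theta), k_y, beta))
        m += 1
    return modes

for m, theta, k_y, beta in mirror_waveguide_modes(wavelength=1.0e-6, d=2.3e-6):
    print(f"m={m}  theta={math.degrees(theta):5.1f} deg  beta={beta:.3e} 1/m")
```

The propagation constant shrinks as m grows, which is the statement that higher-order modes advance more slowly along z.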


Figure 12.4: The relationships between the mode index, propagation constant, and the magnitude of the y-component of the wavevector for modes in a planar mirror waveguide. Figure from Saleh & Teich, 8.1-3.

12.2.1 Electric field profiles

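From our treatment of the waveguide mode as two superimposed plane waves, we can also readily investigate the electric field profile of the light for a given mode inside the waveguide. From equations 12.4 and 12.9,

$$E_{tot} = \mathrm{Re}\left[e^{i(\omega t - \beta z)} E_1 \left(e^{-im\pi y/d} \pm e^{im\pi y/d}\right)\right] \qquad (12.13)$$

For odd modes (m = 1, 3, 5, ...),

$$E^{(odd)}_{tot} = \mathrm{Re}\left[2E_1 e^{i(\omega t - \beta z)} \cos\left(\frac{m\pi y}{d}\right)\right] \qquad (12.14)$$

and for even modes (m = 2, 4, 6, ...) we have

$$E^{(even)}_{tot} = \mathrm{Re}\left[-2iE_1 e^{i(\omega t - \beta z)} \sin\left(\frac{m\pi y}{d}\right)\right] \qquad (12.15)$$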

We see from these that the electric field amplitude as a function of y differs with each mode, while the amplitude is completely independent of z. The electric fields for different modes are sketched in figure 12.5.

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), Fig. 9: transverse field distribution of a planar mirror waveguide. The excerpt also gives the propagation constant and the cosine (odd m) and sine (even m) mode profiles (Keller's eqs. 13 to 15).]

Figure 12.5: Field distributions for TE modes in a planar mirror waveguide. Odd modes have intensity in the center, but even modes have a node there. The amplitude is independent of z for all modes. Figure from Saleh & Teich, 8.1-4.

12.2.2 Number of modes

Planar mirror waveguides are only able to support a finite number of different modes. Essentially what the waveguide does is force ky = mπ/d to take on equally spaced values in intervals of π/d. Since ky must also be smaller than the total wavevector magnitude k, there must exist a maximum value of m. This maximum value gives us the number of modes M supported by the waveguide

$$M = \mathrm{floor}\left(\frac{2d}{\lambda}\right) \qquad (12.16)$$

where the “floor” function indicates that we should round down the value obtained in its argument to the nearest integer. What if M is zero? This can happen if λ > 2d. In this case the waveguide cannot support any modes at all. Typically we say that the waveguide has a cutoff frequency given by

$$\nu_c = \frac{c}{2nd} \qquad (12.17)$$

below which the waveguide will not work.
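Equations 12.16 and 12.17 are simple enough to evaluate directly. The sketch below does so for an assumed mirror spacing and a few wavelengths; the function names and numbers are illustrative, the medium between the mirrors is taken to have n = 1, and λ is the wavelength inside that medium.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def number_of_modes(d, wavelength):
    """Eq. 12.16: number of modes for mirror spacing d and in-medium wavelength."""
    return math.floor(2 * d / wavelength)

def cutoff_frequency(d, n=1.0):
    """Eq. 12.17: frequency below which no mode propagates."""
    return C0 / (2 * n * d)

d = 2.3e-6                       # assumed mirror spacing, m
nu_c = cutoff_frequency(d)
for wavelength in (5.0e-6, 2.0e-6, 1.0e-6):
    nu = C0 / wavelength         # n = 1 between the mirrors
    print(f"lambda = {wavelength:.1e} m: M = {number_of_modes(d, wavelength)}, "
          f"nu/nu_c = {nu / nu_c:.2f}")
```

The mode count is just ν/νc rounded down, so it increases by one each time the frequency passes a multiple of the cutoff frequency.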

12.2.3 TM modes

So far we have just considered waves with the electric field polarized along the x direction, perpendicular to both the mirror separation direction and the direction of wave propagation. We can also treat the case of TM modes: that is, modes where the H field is in the x-direction. We can treat this case in a completely analogous manner, arriving at the same relations for the mode spacing and number of modes for TM waves. The only real difference is in the electric field profile, where now since the E-field is directed along y it has a different form:

$$E^{(odd,TM)}_{tot} = \mathrm{Re}\left[2E_1 e^{i(\omega t - \beta z)} \cos\left(\frac{m\pi y}{d}\right) \cot\theta_m\right] \qquad (12.18)$$

$$E^{(even,TM)}_{tot} = \mathrm{Re}\left[-2iE_1 e^{i(\omega t - \beta z)} \sin\left(\frac{m\pi y}{d}\right) \cot\theta_m\right] \qquad (12.19)$$

12.2.4 Dispersion relation

If we write equation 12.12 in terms of the angular frequency ω we obtain

$$\beta_m^2 = (n\omega/c)^2 - m^2\pi^2/d^2 \qquad (12.20)$$

which can be written

$$\beta_m = k\cos\theta_m = \frac{n\omega}{c}\sqrt{1 - m^2\frac{\omega_c^2}{\omega^2}} \qquad (12.21)$$

in terms of the cutoff angular frequency ωc = 2πνc. This is known as the dispersion relation. Figure 12.6 shows graphically the behavior of the dispersion relation. For large values of ω or β the dispersion approaches the linear relationship characteristic of free space.

Figure 12.6: Graphical view of the dispersion relation. (a) The number of modes M increases with frequency. Below the cutoff frequency no modes exist. (b) The dispersion relation, showing the relationship between ω and the propagation constant β. (c) The relationship of ω to the group velocity. Figure from Saleh & Teich, 8.1-5.

12.2.5 Group velocity

The group velocity in a waveguide is the speed of a pulse of light through the waveguide. Analogous to the group velocity in a uniform medium, the speed of the pulse in the waveguide is v = dω/dβ. Under the assumption that the dispersion in the waveguide material is negligible, this evaluates to

$$v_m = \frac{c\cos\theta_m}{n} = \frac{c}{n}\sqrt{1 - m^2\frac{\omega_c^2}{\omega^2}} \qquad (12.22)$$

This equation is easy to understand geometrically, if you imagine a pulse bouncing along the waveguide. The z component of the pulse velocity is just the group velocity of the medium projected along the z axis by the factor cos θm. From this picture it makes sense that higher order modes travel slower than lower order modes, as shown graphically in figure 12.6.
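The step from the dispersion relation to equation 12.22 can be checked numerically. Differentiating equation 12.20 gives 2β dβ = 2(n/c)² ω dω, so dω/dβ = c²β/(n²ω) = (c/n) cos θm. The sketch below compares this closed form with a finite-difference estimate of dω/dβ. The function names and example parameters are assumptions made for illustration, and material dispersion is neglected as in the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def beta(omega, m, d, n=1.0):
    """Propagation constant from eq. 12.20 (omega must exceed the mode cutoff)."""
    return math.sqrt((n * omega / C0) ** 2 - (m * math.pi / d) ** 2)

def group_velocity(omega, m, d, n=1.0):
    """Closed form of eq. 12.22."""
    omega_c = math.pi * C0 / (n * d)          # cutoff angular frequency, 2*pi*nu_c
    return (C0 / n) * math.sqrt(1.0 - (m * omega_c / omega) ** 2)

def group_velocity_numeric(omega, m, d, n=1.0, rel_step=1e-6):
    """Central-difference estimate of d(omega)/d(beta)."""
    h = omega * rel_step
    return 2 * h / (beta(omega + h, m, d, n) - beta(omega - h, m, d, n))

omega = 2 * math.pi * C0 / 1.0e-6             # assumed: 1 um vacuum wavelength
d = 2.3e-6                                    # assumed mirror spacing, m
for m in range(1, 5):
    print(m, group_velocity(omega, m, d), group_velocity_numeric(omega, m, d))
```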

12.3 Planar dielectric-waveguide

As we hinted at earlier, in real life planar waveguides are much more commonly implemented in transparent dielectric materials, as shown in figure 12.7. This is a sandwich of materials with different indices of refraction. The inner slab of dielectric has index of refraction n1. The outer layers have a different index n2 < n1. For optical rays with sufficiently small angles with respect to the planar interface, total internal reflection will confine those rays to the inner slab. Since total internal reflection is highly efficient, this is an inexpensive way to make a planar waveguide.

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), Fig. 3: a planar dielectric waveguide with n1 > n2, in which rays whose angle of incidence exceeds the critical angle θT = arcsin(n2/n1) are guided by total internal reflection. The excerpt also shows an asymmetric planar waveguide (Fig. 4) and defines the numerical aperture through the maximum external acceptance angle θa (Fig. 5).]

Figure 12.7: A planar dielectric-waveguide (see text). Figure from Saleh & Teich, 8.2-1.

The critical angle for total internal reflection relative to the surface normal is sin⁻¹(n2/n1), which when converted to the angle with respect to the surface itself gives

$$\theta_c = \cos^{-1}(n_2/n_1) \qquad (12.23)$$

Rays at angles less than this can potentially be trapped in the waveguide. To find the modes and spatial profiles of such a waveguide, we can proceed in a manner analogous to that of the mirror waveguide, with one important difference. For total internal reflection, the phase shift of the reflected beam is not π, but some value that depends on the wave angle θ due to the presence of an evanescent wave inside the outer dielectric (see week 3 notes). For TE modes with the electric field along x, the phase shift φr is determined by

$$\tan\frac{\varphi_r}{2} = \sqrt{\frac{\sin^2\theta_c}{\sin^2\theta} - 1} \qquad (12.24)$$

A wave trapped in the waveguide experiences this phase shift twice before retracing its path. Applying this extra shift gives the relation

$$\tan\left(\frac{\pi d}{\lambda}\sin\theta - \frac{m\pi}{2}\right) = \sqrt{\frac{\sin^2\theta_c}{\sin^2\theta} - 1} \qquad (12.25)$$

This is difficult to evaluate analytically, so numerical or graphical solutions are usually applied instead. As with the mirror waveguide we get a discrete set of modes, but in this case the spacing of the modes is only approximately equal.
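One way to carry out the numerical solution is bisection on s = sin θ. The sketch below is an illustration under stated assumptions, not the method used in the course: the mode index is counted from m = 0 (the convention of Saleh & Teich), λ in equation 12.25 is taken as the wavelength inside the core (λ0/n1), and the function name and example indices are made up. For each m the left-hand side rises from 0 on the interval mλ/2d < s < (m+1)λ/2d while the right-hand side falls to 0 at s = sin θc, so there is at most one crossing per interval.

```python
import math

def te_mode_angles(n1, n2, d, wavelength_vacuum):
    """Solve eq. 12.25 for the bounce angles (radians) of the guided TE modes."""
    lam = wavelength_vacuum / n1                 # wavelength inside the core
    sin_c = math.sqrt(1.0 - (n2 / n1) ** 2)      # sin(theta_c), from eq. 12.23

    def mismatch(s, m):
        """Left-hand side minus right-hand side of eq. 12.25 at sin(theta) = s."""
        lhs = math.tan(math.pi * d * s / lam - m * math.pi / 2)
        rhs = math.sqrt(max(sin_c ** 2 / s ** 2 - 1.0, 0.0))
        return lhs - rhs

    angles = []
    m = 0
    while m * lam / (2 * d) < sin_c:             # a bracket exists for this m
        lo = m * lam / (2 * d)                   # mismatch is negative here
        hi = min((m + 1) * lam / (2 * d), sin_c) # mismatch is positive just below
        for _ in range(100):                     # bisection on sin(theta)
            mid = 0.5 * (lo + hi)
            if mismatch(mid, m) < 0:
                lo = mid
            else:
                hi = mid
        angles.append(math.asin(0.5 * (lo + hi)))
        m += 1
    return angles

# Assumed example: n1 = 1.5, n2 = 1.45, d = 5 um, vacuum wavelength 1 um
for m, theta in enumerate(te_mode_angles(1.5, 1.45, 5.0e-6, 1.0e-6)):
    print(f"m={m}  theta={math.degrees(theta):.3f} deg")
```

The signs at the two ends of each bracket are known analytically, so the loop only ever evaluates the tangent strictly inside the interval, away from its pole.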


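The number of TE modes in this kind of waveguide is limited by the constraint that θm < θc. Mathematically,

$$M = \mathrm{ceil}\left(\frac{\sin\theta_c}{\lambda/2d}\right) = \mathrm{ceil}\left(\frac{2d}{\lambda_0}\sqrt{n_1^2 - n_2^2}\right) \qquad (12.26)$$

where “ceil” rounds its argument up to the nearest integer. The count is rounded up, rather than down as for the mirror waveguide, because the mode index in equation 12.25 starts at m = 0, so at least one mode is always guided. This leads to the cutoff frequency

$$\nu_c = \frac{1}{\sqrt{n_1^2 - n_2^2}}\,\frac{c}{2d} \qquad (12.27)$$

below which only this lowest mode propagates. For TM modes we have identical relations.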

12.3.1 Numerical aperture

A concept that is useful with waveguides and other optics is the numerical aperture, which is defined as

$$NA = n_{out}\sin\theta_{out} \qquad (12.28)$$

where nout is the index of refraction of the medium outside the waveguide (usually air) and θout is the largest angle with respect to the waveguide planes, measured in that medium, at which incoming light is still guided. For the planar dielectric waveguide,

$$NA = \sqrt{n_1^2 - n_2^2} \qquad (12.29)$$

The numerical aperture is important since it determines the input coupling efficiency of the waveguide.
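Equation 12.29 follows in two steps from Snell's law at the entrance face and the guiding condition θ ≤ θc. The short derivation below is a sketch that assumes the entrance face is perpendicular to the waveguide planes, so that the internal ray angle θ and the external angle θout are both measured from the waveguide axis.

```latex
\begin{align}
n_{out}\sin\theta_{out} &= n_1 \sin\theta
  && \text{(Snell's law at the entrance face)} \\
\sin\theta \le \sin\theta_c &= \sqrt{1 - \cos^2\theta_c} = \sqrt{1 - \frac{n_2^2}{n_1^2}}
  && \text{(guiding condition, eq. 12.23)} \\
NA = n_{out}\sin\theta_{out}^{max} &= n_1\sqrt{1 - \frac{n_2^2}{n_1^2}} = \sqrt{n_1^2 - n_2^2}
  && \text{(eq. 12.29)}
\end{align}
```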

12.3.2 Field distributions

The presence of an evanescent wave in the outer dielectric makes the field distribution extend a bit beyond the inner dielectric material, as shown in figure 12.8. The fact that the fields “spill out” from their nominally confined locations allows for interesting effects where it is possible to couple the fields from two waveguides together by bringing them close together, as shown in figure 12.9.

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), Figs. 13 to 15: field distributions of the TE modes of a dielectric waveguide, TE and TM modes in a planar dielectric waveguide, and a comparison of a Gaussian beam with the fundamental waveguide mode. The excerpt notes that higher-order modes penetrate more deeply into the surrounding layers.]

Figure 12.8: Field distributions in a dielectric waveguide for various modes.

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), Fig. 26: optical coupling between two planar waveguides. At z = z1 most of the light is in waveguide 1, at z = z2 it is split about equally between the two, and at z = z3 it is mostly in waveguide 2. The excerpt also contains the coupled-mode equations and their solution (Keller's eqs. 38 to 42).]

Figure 12.9: Coupling modes in different waveguides by overlapping evanescent waves.

12.4 Optical fibers

Optical fibers are an important class of waveguides since they are almost ubiquitous in telecommunications applications. They allow light to travel along nearly arbitrary paths along a flexible cable-like structure, in a manner similar to electrical cables. Here we will briefly give an overview of some aspects of optical fibers. There is more information in the text (chapter 9) and references given therein.

Most optical fibers are step-index fibers, as shown in figure 12.10. A step-index fiber is essentially a cylindrical version of the dielectric planar waveguide. The inner dielectric (called the core) is of a higher index of refraction than the surrounding dielectric material (called the cladding). The principle of operation is very similar to that of the planar waveguide: at low glancing angles light rays are totally internally reflected along the core-cladding interface, trapping the wave inside the core. The number of modes varies with the diameter of the core, so smaller diameter cores mean fewer modes supported by the fiber. Some fibers are made to support only one mode; these are called single mode fibers. Fibers with multiple modes are called multimode fibers.

[Figure image: excerpt from U. Keller, Quantenelektronik, Kap. 8 (in German), section 8.4.4 and Fig. 21: fiber types distinguished by their refractive index profile. The excerpt also sets up the fiber modes in cylindrical coordinates with the ansatz Elm = Flm(r) cos(lφ) exp(iβlm z) (Keller's eqs. 35 to 37).]

Figure 12.10: Types of optical fibers, from Saleh & Teich, 9.0-2. (a) A multimode step-index fiber. (b) A single-mode fiber. (c) A graded index (GRIN) fiber.

Analogous to the planar dielectric waveguide, light incident from an outside material to the fiber is only accepted into the fiber if the angle with respect to the fiber axis θout satisfies

$$n_{out}\sin\theta_{out} < NA = \sqrt{n_1^2 - n_2^2} \qquad (12.30)$$

where nout is the index of the outside material. The numerical aperture NA can be used to estimate the input coupling efficiency.
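The acceptance condition is easy to apply numerically. The sketch below evaluates the numerical aperture, the largest accepted angle, and the inequality 12.30 itself. The function names are made up, and the core and cladding indices (1.46 and 1.45) are assumed values chosen only to give a realistic order of magnitude.

```python
import math

def numerical_aperture(n_core, n_clad):
    """NA of a step-index guide, sqrt(n1^2 - n2^2)."""
    return math.sqrt(n_core ** 2 - n_clad ** 2)

def acceptance_half_angle(n_core, n_clad, n_out=1.0):
    """Largest external angle (radians, from the fiber axis) that is still guided."""
    return math.asin(min(numerical_aperture(n_core, n_clad) / n_out, 1.0))

def is_accepted(theta_out, n_core, n_clad, n_out=1.0):
    """Condition 12.30 for a ray arriving at angle theta_out (radians)."""
    return n_out * math.sin(theta_out) < numerical_aperture(n_core, n_clad)

N_CORE, N_CLAD = 1.46, 1.45      # assumed example indices
print("NA =", numerical_aperture(N_CORE, N_CLAD))
print("acceptance half-angle (deg) =", math.degrees(acceptance_half_angle(N_CORE, N_CLAD)))
print("5 deg accepted: ", is_accepted(math.radians(5.0), N_CORE, N_CLAD))
print("15 deg accepted:", is_accepted(math.radians(15.0), N_CORE, N_CLAD))
```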

Another type of fiber is the graded-index fiber, also called a GRIN fiber. This is made of a material where the index of refraction is varied continuously from a larger value in the center along a parabolic path. GRIN fibers are more difficult to make, but have advantages since the mode dispersion is much smaller than in a conventional fiber.


Chapter 13

Nonlinear optics 1

Learning objectives

For this and next week we will be discussing an important aspect of photonics, that of nonlinear optics. After this week you should be able to

• Identify different types of nonlinear optical phenomena

• Categorize different nonlinear optical phenomena by order (eq. 13.2)

• Explain the symmetry conditions for second order nonlinear phenomena

• Calculate conditions for phase matching for second harmonic generation using collinear type I and type II schemes (eq. 13.14, 13.15, 13.18, 13.19)

• Estimate the coherence length and bandwidth of phase matching for second harmonic generation (eq. 13.28, 13.31)

For a reference, please see Saleh & Teich, chapter 21.

13.1 Nonlinear optics overview

Nonlinear optics refers to all optical phenomena that stem from a breakdown of the assumption that the D, E, H and B fields are all linearly related to each other. At near-visible wavelengths nonlinear optics typically comes from a nonlinear relationship between the polarization density P and the electric field E.

These nonlinearities allow for many interesting applications, including

• Frequency conversion. In a linear medium, if we start with a frequency ω of light it is not possible to change this frequency. Nonlinear optics, however, allows us to change the frequency of light. In suitable nonlinear media it is possible to start with one color of light and generate nearly arbitrary frequencies. This is a very common practice in laser physics, where often the frequencies that are easiest to produce with a laser do not necessarily match the frequencies we want to have (see figure 13.1).

• Pulse compression. In ultrafast optics, short pulses require a broad frequency spectrum. Nonlinear effects (self-phase modulation) can be used to increase the width of the frequency spectrum of a pulse of light, allowing for significantly shorter pulses.

• Optical solitons are pulses of light that can travel through a nonlinear medium over very long distances without changing shape. This is the result of a balance between dispersion and nonlinear self-phase modulation.

• Optical switching. In many cases nonlinear optical effects can be used to control the path of light. One example that we have already hinted at is electro-optics. Another is optical bistability, where nonlinear effects are used in combination with some type of feedback.

• Symmetry probes. Certain nonlinear effects are strongly connected with particular microscopic structural symmetries. By measuring the strength of a nonlinear process it is often possible to learn about this symmetry.

• Scattering probes. Nonlinear effects often allow light to interact with various excitations of a medium, such as phonons. Raman and Brillouin scattering are now established tools for characterizing these kinds of excitations using light.

• Advanced spectroscopies. Nonlinear effects are the basis for many different spectroscopic methods that investigate correlations in chemical or solid-state systems. These methods can address questions like how different excitations in materials are connected to each other (coupling). We will not have time to discuss these methods comprehensively, but we will give some examples after we have covered the underlying principles.

These different applications share certain commonalities in their underlying methods, which we will try to present over the next days. A more complete treatment of these subjects has to be left to further courses.

Nonlinearities can have a variety of physical origins. We have already discussed one kind of nonlinear interaction when talking about saturation in laser gain media: there the absorption of light can, and often does, depend in a nonlinear way on the intensity of the light. Another mechanism for nonlinearity is a non-harmonic contribution from a dipole-active microscopic oscillator. As we saw in week 2, if we model a medium as a collection of microscopic harmonic oscillators we get a nice, linear relationship between $E(\omega)$ and $P(\omega)$. This changes if we assume that Hooke's law is no longer a good approximation for the restoring force of an induced dipole moment. For our present purposes we will treat the nonlinearities as an abstraction, and concentrate solely on the consequences of the nonlinear relationship between P and E without considering their physical origins.

Figure 13.1: Internal diagram of a typical green laser pointer. The original laser output is in the near-infrared at 808 nm, nearly invisible to the human eye. This is used to pump a small laser based on a Nd:YVO4 crystal and generate light at 1064 nm. Nonlinear effects are then used to convert the frequency to 532 nm, which appears green to us. This figure is from Chris Chen, available at http://en.wikipedia.org/wiki/File:Green-laser-pointer-dpss-diagrams.jpg

We will start off by suppressing the vector nature of P and E, which essentially assumes that the light is polarized along a normal mode. We will also assume the medium is non-dispersive. These assumptions are not necessary, but for now they simplify the discussion and make some of the general ideas more accessible. A nonlinear relationship between E and P can be expressed as a Maclaurin series

$$P = a_1 E + \frac{1}{2} a_2 E^2 + \frac{1}{6} a_3 E^3 + \ldots \tag{13.1}$$

The first term represents linear optical effects, whereas the following terms denote various types of nonlinearity. At small values of $E$ the linear term dominates, and so we can conclude that $a_1 = \varepsilon_0\chi$. The second term is called the second-order nonlinearity, and the third term is the third-order nonlinearity. This can be generalized to arbitrarily high orders.

Usually equation 13.1 is expressed in one of two different forms. One form is

$$P = \varepsilon_0\chi E + 2dE^2 + 4\chi^{(3)}E^3 + \ldots \tag{13.2}$$

Another common representation is

$$P = \varepsilon_0\left(\chi E + \chi^{(2)}E^2 + \chi^{(3)}E^3 + \ldots\right) \tag{13.3}$$

Note that the third-order susceptibility $\chi^{(3)}$ is defined differently in the two forms (the two definitions differ by a factor of $4/\varepsilon_0$), so it is important to identify which convention is active. For our purposes we will be using the first convention unless otherwise indicated.

13.2 Symmetry and nonlinearity

Although we are for now not concentrating on the mechanisms of nonlinearity, it is still possible to draw some conclusions about what kinds of nonlinearities can be present in certain materials. The second-order nonlinearity is particularly interesting in this context. Let's suppose we have a material with a non-zero value of $d$ and we apply to it an electric field in a particular direction, as shown in figure 13.2. We write the polarization density $P$ as the sum of two terms: the linear term $P_{lin} = \varepsilon_0\chi E$, and a nonlinear term $P_{NL} = 2dE^2$. Suppose now that we spatially invert both the medium and the electric field. This reverses the sign of both $E$ and $P$. Since $E^2$ is unchanged, for $P_{NL}$ to change sign the value of $d$ must also change sign with spatial inversion. This means that the medium cannot possess spatial inversion as a symmetry operation: if it did, $d$ would have to equal $-d$, forcing $d = 0$.

This is an important result since it limits the variety of materials that possess second-order nonlinearity. Most materials have inversion symmetry. Exceptions include optically active materials and ferroelectric crystals.

No such general restriction exists for the third-order nonlinearity. Consequently, nearly every material has some non-zero third-order susceptibility.

Figure 13.2: The effect of spatial inversion on a second-order nonlinear medium. For there to be non-zero second-order contributions to $P$ the medium must not be symmetric with respect to spatial inversion.

13.3 Nonlinear wave equation and the first Born approximation

The wave equation for a homogeneous, nonlinear, nonmagnetic medium can be written

$$\nabla^2 E - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = \mu_0\frac{\partial^2 P}{\partial t^2} \tag{13.4}$$

where the polarization P can be written

$$P = \varepsilon_0\chi E + P_{NL} \tag{13.5}$$

Using $n^2 = 1 + \chi$, we can then re-cast the wave equation in the following form

$$\nabla^2 E - \frac{n^2}{c^2}\frac{\partial^2 E}{\partial t^2} = -S \tag{13.6}$$

where

$$S = -\mu_0\frac{\partial^2 P_{NL}}{\partial t^2} \tag{13.7}$$

can effectively be considered a source term for the wave. Since, however, $S$ depends on $E$, the full equation is nonlinear and difficult to solve exactly.

Approximate solutions can be easily derived using a number of methods. The first method (and the easiest) is an iterative method. The basic idea comes from the assumption that the nonlinear effects are very small compared to linear optical effects.

In the Born approximation we start with the incident light field given by (for example)

$$E_0(\mathbf{r}, t) = E_0\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) \tag{13.8}$$

We then use this field to make a first estimate for $P_{NL}$ as

$$P_{NL} = 2d\,E_0(\mathbf{r}, t)^2 + 4\chi^{(3)}E_0(\mathbf{r}, t)^3 + \ldots \tag{13.9}$$

We can then use equations 13.6 and 13.7 to solve for the radiated field $E_1(\mathbf{r}, t)$. This is called the first Born approximation. We can then use the $E_1$ field to again calculate a correction for $P_{NL}$, and so on until reaching convergence. For cases where the scattered field is much smaller than the incident field, usually only one iteration of this procedure is necessary.

An alternative method called coupled-wave theory is a better option for cases where the nonlinear effects are large; we will discuss this next week.

13.4 Second order processes

As an example of a nonlinear process, consider the case where the incident light consists of two frequencies $\omega_1$ and $\omega_2$. When these propagate in a material with a non-zero value of the second-order nonlinear coefficient $d$, in the first Born approximation we get (assuming only second-order nonlinearity)

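$$\begin{aligned}
P_{NL} &= 2d\,[E_1\cos(\omega_1 t) + E_2\cos(\omega_2 t)]^2 \\
&= \frac{d}{2}\,[E_1(e^{i\omega_1 t} + e^{-i\omega_1 t}) + E_2(e^{i\omega_2 t} + e^{-i\omega_2 t})]^2 \\
&= \frac{d}{2}\,[E_1^2(e^{i2\omega_1 t} + e^{-i2\omega_1 t}) + E_2^2(e^{i2\omega_2 t} + e^{-i2\omega_2 t}) \\
&\qquad + 2E_1E_2(e^{i(\omega_1+\omega_2)t} + e^{-i(\omega_1+\omega_2)t}) \\
&\qquad + 2E_1E_2(e^{i(\omega_1-\omega_2)t} + e^{-i(\omega_1-\omega_2)t}) + 2E_1^2 + 2E_2^2] \\
&= d\,[E_1^2\cos(2\omega_1 t) + E_2^2\cos(2\omega_2 t) + 2E_1E_2[\cos((\omega_1+\omega_2)t) + \cos((\omega_1-\omega_2)t)] \\
&\qquad + E_1^2 + E_2^2]
\end{aligned} \tag{13.10}$$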

We see from this that the nonlinear component of the polarization density contains several different frequencies. There are terms that oscillate at $2\omega_1$ and $2\omega_2$. These are called the second harmonic contributions. There is also a component that oscillates at $\omega_1 + \omega_2$ called the "sum frequency," and a component at $\omega_1 - \omega_2$ called the "difference frequency." Finally, we also have a non-oscillating component equal to $d(E_1^2 + E_2^2)$. This is called optical rectification: the transformation of an oscillating E-field into a stationary, non-oscillating field.
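The frequency content of equation 13.10 can be checked numerically. The following is a minimal sketch, not part of the lecture notes: it assumes integer frequencies in arbitrary units (so that one period is $2\pi$) and illustrative values for $d$, $E_1$ and $E_2$, and projects $P_{NL}(t)$ onto cosines at the expected frequencies. The function name `harmonic_amplitudes` is made up for this example.

```python
import math

def harmonic_amplitudes(d, E1, E2, w1, w2, n_samples=256):
    """Project P_NL(t) = 2 d [E1 cos(w1 t) + E2 cos(w2 t)]^2 onto cos(m t).

    w1 and w2 are integers (arbitrary units), so P_NL has period 2*pi and a
    uniform sum over one period gives the Fourier coefficients.
    """
    ts = [2 * math.pi * k / n_samples for k in range(n_samples)]
    p = [2 * d * (E1 * math.cos(w1 * t) + E2 * math.cos(w2 * t)) ** 2 for t in ts]
    amps = {0: sum(p) / n_samples}  # non-oscillating term (optical rectification)
    for m in (2 * w1, 2 * w2, w1 + w2, abs(w1 - w2)):
        amps[m] = 2 * sum(pk * math.cos(m * t) for pk, t in zip(p, ts)) / n_samples
    return amps

# Illustrative values (assumptions): d = 1, E1 = 2, E2 = 3, w1 = 3, w2 = 5
amps = harmonic_amplitudes(d=1.0, E1=2.0, E2=3.0, w1=3, w2=5)
for freq, amp in sorted(amps.items()):
    print(f"frequency {freq:2d}: amplitude {amp:.6f}")
```

With these values the last line of equation 13.10 predicts amplitudes $dE_1^2 = 4$ and $dE_2^2 = 9$ at the two second harmonics, $2dE_1E_2 = 12$ at both the sum and difference frequencies, and a constant term $d(E_1^2 + E_2^2) = 13$.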

13.4.1 Second harmonic generation

The terms where the frequency is doubled are very often used to generate visible light from infrared lasers. For example, most common green laser pointers are actually lasers that operate in the near-infrared (e.g. 1.06 microns) and are frequency doubled to get emission in the green. If we consider the case above where $E_1 \neq 0$ and $E_2 = 0$, we see that $P_{NL}(2\omega_1) \propto E_1^2$, and so the scattered field is also proportional to $E_1^2$. This means that the intensity of the scattered wave is proportional to $E_1^4$ and to $d^2$. If we also assume that the length $L$ of the medium is relatively small (more on this later), the scattered intensity is also proportional to $L^2$. Efficient second harmonic generation would therefore ideally maximize the incident intensity, the length of the medium, and the size of the second-order nonlinear coefficient.

13.4.2 The electro-optic effect

As we have already pointed out, the last two terms in equation 13.10 correspond to optical rectification. Physically, this means that intense light in a second-order nonlinear medium can create a DC voltage inside the medium. This is closely related to another nonlinear optical phenomenon, the electro-optic effect.

To better understand this, let's set $\omega_1 = \omega$ and $\omega_2 = 0$. This corresponds to the second wave just being an applied static field. Equation 13.10 then becomes

$$P_{NL} = d\,[E_1^2\cos 2\omega t + E_1^2 + 2E_2^2 + 4E_1E_2\cos\omega t] \tag{13.11}$$

The last term of $P_{NL}$ oscillates at the frequency of the first wave with amplitude $4dE_2E_1$. This is equivalent to saying that the linear susceptibility $\chi$ is modified by

$$\Delta\chi = 4dE_2/\varepsilon_0 \tag{13.12}$$

or that the index of refraction $n = \sqrt{1 + \chi}$ is modified by an amount $\Delta n \approx \Delta\chi/2n$, that is

$$\Delta n = \frac{2d}{\varepsilon_0 n}E_2 \tag{13.13}$$

Thus, an applied DC field in a second-order nonlinear medium can alter the refractive index of the material. This is sometimes used to make voltage-tuned polarization devices, as discussed in week 11.
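To get a feeling for the size of this effect, the sketch below evaluates equations 13.12 and 13.13 and compares the linearized index change with the value obtained directly from $n = \sqrt{1 + \chi}$. The numbers for $d$, the static field and $n$ are hypothetical illustration values, not data from the notes.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def delta_chi(d, E_dc):
    """Change in linear susceptibility, eq. 13.12: 4 d E2 / eps0."""
    return 4.0 * d * E_dc / EPS0

def delta_n(d, E_dc, n):
    """Linearized index change, eq. 13.13: 2 d E2 / (eps0 n)."""
    return 2.0 * d * E_dc / (EPS0 * n)

# Hypothetical illustration values (assumptions, not from the notes)
d = 1e-22     # second-order coefficient in C/V^2
E_dc = 1e6    # applied static field in V/m
n = 2.0       # unperturbed refractive index

# "Exact" change from n = sqrt(1 + chi), i.e. n_new^2 = n^2 + delta_chi
exact = math.sqrt(n**2 + delta_chi(d, E_dc)) - n
print(f"linearized dn = {delta_n(d, E_dc, n):.3e}, exact dn = {exact:.3e}")
```

For index changes this small the linearization is an excellent approximation, which is why $\Delta n$ can be treated as proportional to the applied field.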

13.4.3 Three-wave mixing

The fully general case of equation 13.10 is called three-wave mixing. This is very often used to create nearly arbitrary frequencies from a given laser. Some different types of three-wave mixing are shown in figure 13.3. In this context wave 1 is often called the signal and wave 2 the idler.

13.4.4 Phase matching

Figure 13.3: Optical parametric (i.e. three-wave mixing) interactions: optical frequency converter (OFC), optical parametric amplifier (OPA), optical parametric oscillator (OPO), and spontaneous parametric down-conversion (SPDC). Figure from Saleh & Teich, 21.2-8.

A given three-wave mixing process is not necessarily efficient, due to the requirement that the waves emitted by $P_{NL}$ constructively interfere over a large space in the nonlinear medium. Specifically, we require that for three plane waves

$$\omega_1 + \omega_2 = \omega_3 \tag{13.14}$$

and

$$\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{k}_3 \tag{13.15}$$

You will show this relation from the Born approximation in the third problem of this week's exercises. A useful way to think of this is in terms of a photon interaction process. Conservation of energy gives

$$\hbar\omega_1 + \hbar\omega_2 = \hbar\omega_3 \tag{13.16}$$

and the conservation of momentum gives

$$\hbar\mathbf{k}_1 + \hbar\mathbf{k}_2 = \hbar\mathbf{k}_3 \tag{13.17}$$

which lead directly to the frequency and phase matching equations.

Collinear phase matching

If all waves travel in the same direction and are polarized along normal mode directions, we can simplify the phase matching condition to

$$n_1\omega_1 + n_2\omega_2 = n_3\omega_3 \tag{13.18}$$

where $n_1$ is the effective index of refraction for wave 1, $n_2$ is the effective index for wave 2, etc. Usually these indices are not the same due to dispersion, so the phase-matching condition can be hard to achieve without some effort. One common method used to compensate for the frequency dispersion is to use an anisotropic medium and use the dependence of the index on polarization and direction of propagation to meet the phase matching condition.

Usually a uniaxial crystal is used for this purpose. Recall that a uniaxial medium is characterized by two refractive indices $n_o$ and $n_e$. The ordinary index $n_o$ is the effective index for waves polarized perpendicular to the optic axis of the crystal. The extraordinary index $n_e$ is the index for waves with polarization along the optic axis. For waves traveling at an angle $\theta$ with respect to the optic axis there exist both an ordinary wave with index $n_o$ and an extraordinary wave with index $n(\theta, \omega)$ given by

$$\frac{1}{n^2(\theta, \omega)} = \frac{\cos^2\theta}{n_o^2(\omega)} + \frac{\sin^2\theta}{n_e^2(\omega)} \tag{13.19}$$

and mixed polarization.

Type I phase matching is when both waves 1 and 2 have the same polarization. In principle, for a uniaxial crystal there are four possibilities for type I phase matching: e-e-o, o-o-e, e-e-e and o-o-o (here o means ordinary polarization, e means extraordinary, and the ordering corresponds to the wave index). In practice, it is almost never possible to properly compensate for dispersion using e-e-e or o-o-o configurations. We are then left with e-e-o or o-o-e. For e-e-o the phase matching condition is

$$\omega_1 n(\theta, \omega_1) + \omega_2 n(\theta, \omega_2) = (\omega_1 + \omega_2)\,n_o(\omega_1 + \omega_2). \tag{13.20}$$

For the o-o-e configuration we have instead

$$\omega_1 n_o(\omega_1) + \omega_2 n_o(\omega_2) = (\omega_1 + \omega_2)\,n(\theta, \omega_1 + \omega_2) \tag{13.21}$$

For the special case of second harmonic generation, we can set $\omega_1 = \omega_2$ and these become

$$n(\theta, \omega_1) = n_o(2\omega_1) \tag{13.22}$$

for the e-e-o configuration, and

$$n_o(\omega_1) = n(\theta, 2\omega_1) \tag{13.23}$$

for the o-o-e configuration. Usually only one of these configurations is possible in a given crystal.

Type II phase matching is when waves 1 and 2 have orthogonal polarizations. Here we have e-o-o, o-e-o, e-o-e and o-e-e configurations. As an example, the phase matching condition for second-harmonic generation in an e-o-o configuration is

$$n(\theta, \omega_1) + n_o(\omega_1) = 2n_o(2\omega_1) \tag{13.24}$$
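As a sketch of how the type I o-o-e condition for second harmonic generation (eq. 13.23) combines with the angle-dependent index of eq. 13.19, the code below solves for the phase-matching angle $\theta$ in closed form. The refractive indices are assumed, illustrative values for a negative uniaxial crystal ($n_e < n_o$), and the function names are made up for this example.

```python
import math

def n_extraordinary(theta, n_o, n_e):
    """Extraordinary-wave index at angle theta to the optic axis (eq. 13.19)."""
    inv_sq = math.cos(theta) ** 2 / n_o**2 + math.sin(theta) ** 2 / n_e**2
    return 1.0 / math.sqrt(inv_sq)

def shg_angle_ooe(n_o_fund, n_o_sh, n_e_sh):
    """Angle solving n_o(w) = n(theta, 2w) (eq. 13.23); None if no angle exists."""
    num = 1.0 / n_o_fund**2 - 1.0 / n_o_sh**2
    den = 1.0 / n_e_sh**2 - 1.0 / n_o_sh**2
    if den == 0:
        return None
    s2 = num / den  # sin^2(theta) obtained by inserting eq. 13.19 into eq. 13.23
    if not 0.0 <= s2 <= 1.0:
        return None
    return math.asin(math.sqrt(s2))

# Assumed indices: ordinary index at w, ordinary and extraordinary index at 2w
n_o_fund, n_o_sh, n_e_sh = 1.655, 1.675, 1.555
theta = shg_angle_ooe(n_o_fund, n_o_sh, n_e_sh)
print(f"phase-matching angle: {math.degrees(theta):.2f} degrees")
```

The function returns `None` when no angle satisfies the condition, for example for a positive uniaxial crystal with normal dispersion, which is one way to see why usually only one of the two type I configurations works in a given crystal.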

Non-collinear phase matching

If waves 1 and 2 are not collinear, we have somewhat more flexibility in meeting the conditions for phase matching. Equation 13.15 can then be written

$$\omega_1 n_1\sin\theta_1 = \omega_2 n_2\sin\theta_2 \tag{13.25}$$

$$\omega_1 n_1\cos\theta_1 + \omega_2 n_2\cos\theta_2 = \omega_3 n_3 \tag{13.26}$$

where now $\theta_1$ is the angle of wave 1 with wave 3 and $\theta_2$ is the angle of wave 2 with wave 3. Non-collinear phase matching is often used if it is not possible to achieve collinear phase matching. It works well for cases where the transverse spatial extent of the beams is larger than or at least comparable to the propagation length in the medium; otherwise the beams separate spatially, limiting the effective length of the nonlinear medium.
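Equations 13.25 and 13.26 state that the three wavevectors form a closed triangle, so the two angles follow from the law of cosines. The sketch below assumes the wavevector magnitudes $k_q = \omega_q n_q / c$ are already known; the numbers are arbitrary illustration values and the function name is hypothetical.

```python
import math

def noncollinear_angles(k1, k2, k3):
    """Angles (theta1, theta2) of waves 1 and 2 relative to wave 3, or None.

    The triangle k1 + k2 = k3 can only close if |k1 - k2| <= k3 <= k1 + k2.
    """
    if not abs(k1 - k2) <= k3 <= k1 + k2:
        return None
    c1 = (k3**2 + k1**2 - k2**2) / (2.0 * k1 * k3)
    c2 = (k3**2 + k2**2 - k1**2) / (2.0 * k2 * k3)
    # Clamp to [-1, 1] to guard against rounding at the collinear limit
    theta1 = math.acos(max(-1.0, min(1.0, c1)))
    theta2 = math.acos(max(-1.0, min(1.0, c2)))
    return theta1, theta2

# Arbitrary illustration values in units where k1 = 1
angles = noncollinear_angles(1.0, 1.2, 2.1)
print([round(math.degrees(a), 2) for a in angles])
```

Note that a solution exists only when $k_3 \le k_1 + k_2$: tilting the beams can shorten the sum of $\mathbf{k}_1$ and $\mathbf{k}_2$ along the direction of wave 3, but it cannot lengthen it.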

Phase mismatch and coherence length

What happens if we do not meet the phase matching condition? If we assume a mismatch $\Delta\mathbf{k} = \mathbf{k}_3 - \mathbf{k}_2 - \mathbf{k}_1$ over a second-order nonlinear medium of volume $V$, it can be shown (see the third problem in this week's exercises) that the intensity of wave 3 is

$$I_3 \propto \left|\int_V e^{i\Delta\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{r}\right|^2 \tag{13.27}$$

Scattered light from different parts of the medium adds together as phasors with phase $\Delta\mathbf{k}\cdot\mathbf{r}$ (see figure 13.4). It is clear that the integral is maximal when $\Delta\mathbf{k} = 0$. If we assume as a simple case a rectangular medium with a variable length $L$ along the direction of $\Delta\mathbf{k}$, the first zero of the intensity occurs when the sum of the phasors makes a complete circle, or when $L$ is equal to

$$L_c = 2\pi/|\Delta\mathbf{k}| \tag{13.28}$$

This is called the coherence length. The full dependence of the intensity on $L$ from solving the integral is shown in figure 13.5.

Figure 13.4: Phasor picture of how phase mismatch results in lower intensities. (a) Perfect phase matching; (b) Bad phase matching; (c) Quasi-phase matching using an alternating direction of $d$. Figure from Saleh & Teich, 21.2-16.

Figure 13.5: Intensity of the second-order scattered wave for various values of the phase mismatch. Figure from Saleh & Teich, 21.2-14.
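The coherence length can be illustrated with a one-dimensional version of equation 13.27. The sketch below assumes collinear SHG, for which $\Delta k = k_3 - 2k_1 = 4\pi[n(2\omega) - n(\omega)]/\lambda_0$ with $\lambda_0$ the vacuum wavelength of the fundamental, and uses assumed index values. It evaluates the phasor integral with a simple midpoint rule.

```python
import cmath
import math

def shg_mismatch(n_fund, n_sh, wavelength):
    """Collinear SHG mismatch dk = k3 - 2 k1 = 4 pi (n(2w) - n(w)) / lambda0."""
    return 4.0 * math.pi * (n_sh - n_fund) / wavelength

def relative_intensity(dk, L, steps=2000):
    """|integral_0^L exp(i dk z) dz|^2 by the midpoint rule (1D form of eq. 13.27)."""
    h = L / steps
    total = sum(cmath.exp(1j * dk * (m + 0.5) * h) for m in range(steps)) * h
    return abs(total) ** 2

# Assumed values: indices differing by 0.02 at a 1.064 um fundamental
dk = shg_mismatch(n_fund=1.655, n_sh=1.675, wavelength=1.064e-6)
L_c = 2.0 * math.pi / abs(dk)  # coherence length, eq. 13.28
print(f"coherence length: {L_c * 1e6:.1f} um")
print(f"intensity at L_c/2: {relative_intensity(dk, L_c / 2):.3e}")
print(f"intensity at L_c:   {relative_intensity(dk, L_c):.3e}")
```

The generated intensity peaks at $L = L_c/2$ and returns to zero at $L = L_c$, so without phase matching a longer crystal does not help beyond a few tens of microns for index differences of this size.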

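Even if we have perfect phase matching for one set of waves, we might also be interested in the phase matching for sets of waves that have slightly different frequencies that still satisfy $\omega_1 + \omega_2 = \omega_3$. The range of frequencies over which conversion stays efficient is called the phase matching bandwidth, and can be very important for pulsed applications. For collinear SHG where $\omega_1 = \omega_2 = \omega_3/2$ and $\Delta k(\omega_1) = 0$, to first order

$$\Delta k(\omega_1 + \Delta\omega) = \Delta k'\,\Delta\omega \tag{13.29}$$

where $\Delta k' = d\Delta k/d\omega$. If we define $\Delta\omega$ as the deviation from $\omega_1$ that leads to the first zero of the nonlinear scattered intensity,

$$\Delta\omega = \frac{2\pi}{|\Delta k'|L}. \tag{13.30}$$

Since for this case $\Delta k(\omega) = k_3(2\omega) - 2k_1(\omega)$, the derivative is $\Delta k' = dk_3(2\omega)/d\omega - 2\,dk_1(\omega)/d\omega = 2[n_{g3} - n_{g1}]/c$ where $n_{g3}$ and $n_{g1}$ are the group indices of refraction for each wave. Inserting this into equation 13.30 gives $\Delta\omega = \pi c/(L|n_{g3} - n_{g1}|)$, or in terms of the ordinary frequency $\Delta\nu = \Delta\omega/2\pi$ the final bandwidth relation

$$\Delta\nu = \frac{c}{2L\,|n_{g3} - n_{g1}|} \tag{13.31}$$

We learn from this that for SHG of short pulses with broad bandwidth, we need to use short crystals or otherwise try to arrange for a low group velocity difference between the two frequencies.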
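The trade-off between crystal length and bandwidth can be made concrete with a short calculation. This sketch follows equation 13.30 with $\Delta k' = 2(n_{g3} - n_{g1})/c$; the group indices and crystal lengths are hypothetical values chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def shg_bandwidth(n_g_fund, n_g_sh, L):
    """Angular-frequency offset to the first zero of the SHG intensity.

    Uses dk' = 2 (n_g3 - n_g1) / c and d_omega = 2 pi / (|dk'| L), eq. 13.30.
    """
    dk_prime = 2.0 * (n_g_sh - n_g_fund) / C
    return 2.0 * math.pi / (abs(dk_prime) * L)

# Hypothetical group indices and crystal lengths (assumptions)
for L in (10e-3, 1e-3, 0.1e-3):
    d_omega = shg_bandwidth(n_g_fund=1.67, n_g_sh=1.70, L=L)
    print(f"L = {L * 1e3:5.1f} mm -> d_nu = {d_omega / (2 * math.pi) / 1e12:.2f} THz")
```

Each factor of ten reduction in crystal length buys a factor of ten in bandwidth, at the cost of conversion efficiency, which scales as $L^2$ for short crystals.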

Quasi-phase matching

Another trick for achieving phase matching is to use an inhomogeneous nonlinear medium. This is called quasi-phase matching. Mathematically, we have

$$I_3 \propto \left|\int_V d(\mathbf{r})\,e^{i\Delta\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{r}\right|^2 \tag{13.32}$$

where now $d$ is a function of the spatial coordinate $\mathbf{r}$. If $d(\mathbf{r})$ has a component $\exp(-i\mathbf{G}\cdot\mathbf{r})$ such that $\mathbf{G} = \Delta\mathbf{k}$ we can eliminate the consequences of the phase mismatch.

The simplest version of this is illustrated in figure 13.6. Here the nonlinear material is cut up into short slabs, each with a thickness of half the coherence length, $L_c/2$. Every other slab is rotated by 180°, so that the nonlinear coefficient $d$ has the opposite sign for these slabs. Part (c) of figure 13.4 shows the phasor representation for the electric field generated in such a structure where the incident wave moves along the z axis. Just at the points where the intensity of the scattered wave would start to decline in a homogeneous material, the sign of $d$ is reversed and so the new contributions to the scattered wave turn in the opposite direction, adding to the intensity.
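The slab picture can be checked with a one-dimensional phasor sum based on equation 13.32. In this sketch (dimensionless units, with $|d| = 1$ and $\Delta k = 1$ as assumptions) the sign of $d$ is flipped every $L_c/2$ and the resulting field is compared with a homogeneous mismatched medium and with an ideal, perfectly phase-matched medium of the same length.

```python
import cmath
import math

def generated_field(dk, n_slabs, flip_sign, steps_per_slab=200):
    """|sum of d(z) exp(i dk z) dz| over slabs of thickness L_c / 2 (eq. 13.32 in 1D).

    With flip_sign=True, d alternates between +1 and -1 from slab to slab.
    """
    slab = math.pi / dk  # L_c / 2 with L_c = 2 pi / dk
    h = slab / steps_per_slab
    total = 0j
    for s in range(n_slabs):
        d = -1.0 if (flip_sign and s % 2) else 1.0
        for m in range(steps_per_slab):
            z = s * slab + (m + 0.5) * h
            total += d * cmath.exp(1j * dk * z) * h
    return abs(total)

dk = 1.0       # arbitrary units
n_slabs = 10
qpm = generated_field(dk, n_slabs, flip_sign=True)
plain = generated_field(dk, n_slabs, flip_sign=False)
ideal = n_slabs * math.pi / dk  # field for perfect phase matching, same length
print(f"homogeneous: {plain:.4f}, quasi-phase-matched: {qpm:.4f}, ideal: {ideal:.4f}")
```

In this simple model the quasi-phase-matched field grows linearly with the number of slabs but reaches only a fraction $2/\pi$ of the ideal phase-matched field, while the homogeneous mismatched medium gives essentially nothing after an even number of slabs.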

Figure 13.6: Quasi-phase matching scheme. Figure from Saleh & Teich, 21.2-15.

Chapter 14

Nonlinear optics 2

Learning objectives

For this week and the next we will be concluding our discussion of nonlinear optics. After this week you should be able to

• Identify typical third-order nonlinear processes

• Explain how the optical Kerr effect can influence pulse propagation (eq. 14.6, 14.7)

• Explain what coupled-wave theory is and how it can be used to model nonlinear processes (eq. 14.16, 14.17, 14.22-24, along with an explanation of how to proceed from there)

• Identify how a complete treatment of anisotropy alters the relations for estimating the efficiency of nonlinear processes (eq. 14.56)

• Identify how material dispersion can influence nonlinear optical properties

For a reference, please see Saleh & Teich, chapter 21.

14.1 Third order processes

Whereas last week we concentrated mostly on second-order processes, there are many materials and devices where third-order nonlinear processes are important. Here we consider materials where the leading order nonlinear interaction is third-order. We therefore approximate the nonlinear component of the polarization density as

$$P_{NL} = 4\chi^{(3)}E^3. \tag{14.1}$$

14.1.1 Third harmonic generation

If we insert for $E$ a wave at a frequency $\omega_1$ we obtain for $P_{NL}$

$$P_{NL} = 4\chi^{(3)}(E_1\cos\omega_1 t)^3 = E_1^3\chi^{(3)}(\cos 3\omega_1 t + 3\cos\omega_1 t) \tag{14.2}$$

which gives us a wave at the original frequency and at $3\omega_1$. Thus, third-order nonlinear media can be used to triple the frequency of light. In practice this is not often done, since the efficiencies are low. Instead, it is much more common to double the frequency in a second-order medium, and then to use sum-frequency generation between the fundamental and the second harmonic to get the tripled frequency.

14.1.2 Optical Kerr effect

The $\omega_1$ term in equation 14.2 can be viewed as a modification of the linear susceptibility $\chi$ by an amount

$$\Delta\chi = \frac{3\chi^{(3)}}{\varepsilon_0}E_1^2 = \frac{6\chi^{(3)}}{\varepsilon_0}ZI_1 \tag{14.3}$$

where $Z$ is the impedance of the medium and $I_1$ is the intensity of the initial wave. This can be re-written as a change in the index of refraction

$$\Delta n = n_2 I_1 \tag{14.4}$$

where

$$n_2 = \frac{3Z_0}{n^2\varepsilon_0}\chi^{(3)} \tag{14.5}$$

and $Z_0$ is the impedance of free space. Here $n_2$ is called the optical Kerr coefficient. The index of refraction can then be written as

$$n(I_1) = n + n_2 I_1 \tag{14.6}$$

which indicates that the index depends linearly on the intensity. This is called the optical Kerr effect and is one of the main nonlinear effects on light moving through a third-order nonlinear medium. It has several different applications, which we will now briefly summarize.

Self-phase modulation

If we imagine a beam traveling through a third-order nonlinear medium over a distance $L$, the optical Kerr effect introduces a phase shift

$$\Delta\phi = -2\pi n_2 I_1 L/\lambda_0 = -2\pi n_2\frac{L}{\lambda_0 A}P \tag{14.7}$$

where $P$ is the power of the beam (energy/time) and $A$ is the cross-sectional area of the beam. The effect is maximized for large $L$ and small $A$, which is usually relatively easy to achieve in waveguides.
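A quick numerical sketch of equation 14.7 shows why waveguides are attractive here. All parameter values below are assumed, order-of-magnitude illustration numbers for a glass waveguide and are not taken from the notes; the helper `power_for_pi_shift` simply inverts equation 14.7 for $|\Delta\phi| = \pi$.

```python
import math

def spm_phase(n2, power, area, length, wavelength):
    """Kerr phase shift of eq. 14.7: -2 pi n2 L P / (lambda0 A)."""
    return -2.0 * math.pi * n2 * length * power / (wavelength * area)

def power_for_pi_shift(n2, area, length, wavelength):
    """Power at which the magnitude of the phase shift reaches pi."""
    return wavelength * area / (2.0 * n2 * length)

# Assumed illustration values (not from the notes)
n2 = 3e-20            # optical Kerr coefficient in m^2/W
area = 50e-12         # beam cross-section in m^2 (50 square microns)
length = 1000.0       # propagation length in m
wavelength = 1.55e-6  # vacuum wavelength in m

print(f"phase shift at 1 W: {spm_phase(n2, 1.0, area, length, wavelength):.3f} rad")
print(f"power for a pi shift: {power_for_pi_shift(n2, area, length, wavelength):.3f} W")
```

Because only the ratio $L/A$ enters, a long waveguide with a small mode area can reach phase shifts of order $\pi$ at modest power even though $n_2$ itself is tiny.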

One example of how this can be used is shown in figure 14.1. Here an intense "control" light beam is used to modify the refractive index of one arm of a Mach-Zehnder interferometer. A weaker switched beam will then experience a phase shift in this arm that is proportional to the control beam intensity. A phase shift of $\pi$ causes the switched beam to be guided to the other output arm of the interferometer.

Figure 14.1: Photonic switching in a Mach-Zehnder interferometer, both (a) in a conventional optical configuration, and (b) in an integrated optic configuration. This figure is from Saleh and Teich, 23.3-17.

Self-focusing

Another possible consequence of the optical Kerr effect is self-focusing, shown in figure 14.2. If we take a Gaussian beam and propagate it through a third-order nonlinear medium, the refractive index of the medium will introduce a phase shift that is largest in the center of the beam and gradually returns to the zero-intensity value as we move toward the wings of the beam. This can have an effect similar to a lens, leading to a phenomenon where the beam focuses itself. This can be a very important phenomenon when trying to accurately model short pulse propagation, since even the small third-order nonlinearity of air can lead to significant self-focusing effects. One application of this is in atmospheric spectroscopy, where self-focusing is used to make air into essentially a GRIN fiber that allows for long-distance propagation of light to the upper atmosphere (see figure 14.3). Another important application is in mode-locking of lasers, where self-focusing is used to make the alignment of an oscillator cavity change with intensity.

Figure 14.2: Self-focusing in a medium with a large optical Kerr effect. This figure is from Saleh and Teich, 21.3-2.

Mathematically, self-focusing is governed by the Helmholtz equation

$$\nabla^2 U + n^2(I)k_0^2 U = 0 \tag{14.8}$$

where $n(I) = n + n_2 I$ and $I = |U|^2/2Z$. Since this is nonlinear it is difficult to solve without making some approximations. Writing $U = A\,e^{-ikz}$ with $k = nk_0$, for the case where the envelope $A$ varies slowly with $z$ and the field is independent of the transverse direction $y$,

$$\frac{\partial^2 A}{\partial x^2} - 2ik\frac{\partial A}{\partial z} + k_0^2[n^2(I) - n^2]A = 0 \tag{14.9}$$

If we also assume $n_2 I \ll n$, this becomes

$$\frac{\partial^2 A}{\partial x^2} + \frac{n_2}{Z_0}k^2|A|^2A = 2ik\frac{\partial A}{\partial z} \tag{14.10}$$

which is known as the nonlinear Schroedinger equation. A solution is

$$A(x, z) = A_0\,\mathrm{sech}\!\left(\frac{x}{W_0}\right)\exp\!\left(-i\frac{z}{4z_0}\right) \tag{14.11}$$

where $W_0$ and $A_0$ are constants that satisfy the relationship $n_2(A_0^2/2Z_0) = 1/k^2W_0^2$, and $z_0 = \pi W_0^2/\lambda$. Note that the intensity of this solution is independent of $z$, meaning that it is an example of a beam that does not increase in size as it propagates.
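One way to convince yourself that equation 14.11 solves equation 14.10 is to insert it numerically. The sketch below does this with central finite differences in dimensionless units; the values of $W_0$, $A_0$ and the wavelength are arbitrary assumptions, and the coefficient $n_2k^2/Z_0$ (called `KAPPA` here) is chosen so that the stated constraint between $A_0$ and $W_0$ holds.

```python
import cmath
import math

W0 = 1.0                 # beam width (arbitrary units)
WAVELENGTH = 0.5         # wavelength in the medium, same units (assumed)
K = 2.0 * math.pi / WAVELENGTH
Z_DIST = math.pi * W0**2 / WAVELENGTH   # the distance z0 of eq. 14.11
A0 = 1.0
# kappa = n2 k^2 / Z0, fixed by n2 A0^2 / (2 Z0) = 1 / (k^2 W0^2)
KAPPA = 2.0 / (W0**2 * A0**2)

def envelope(x, z):
    """Trial solution of eq. 14.11."""
    return A0 / math.cosh(x / W0) * cmath.exp(-1j * z / (4.0 * Z_DIST))

def residual(x, z, h=1e-4):
    """Left side minus right side of eq. 14.10, using central differences."""
    a = envelope(x, z)
    a_xx = (envelope(x + h, z) - 2.0 * a + envelope(x - h, z)) / h**2
    a_z = (envelope(x, z + h) - envelope(x, z - h)) / (2.0 * h)
    return a_xx + KAPPA * abs(a) ** 2 * a - 2j * K * a_z

worst = max(abs(residual(x, z))
            for x in (-2.0, -0.5, 0.0, 0.7, 1.5)
            for z in (0.0, 1.0, 5.0))
print(f"largest residual: {worst:.2e}")
```

The residual is limited only by the finite-difference error, which supports the picture of diffraction being exactly balanced by the Kerr lens for this particular beam profile.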

Raman scattering

In general, the value of $\chi^{(3)}$ can be complex. If we break the nonlinear susceptibility explicitly into real and imaginary parts

$$\chi^{(3)} = \chi_R^{(3)} + i\chi_I^{(3)} \tag{14.12}$$

the self-phase modulation becomes complex, and so we effectively have both a phase shift

$$\Delta\phi_R = -\frac{6\pi Z_0\chi_R^{(3)}}{\varepsilon_0 n^2}\frac{L}{\lambda_0 A}P \tag{14.13}$$

and an amplitude gain $e^{\gamma_R L/2}$ with

$$\gamma_R = \frac{12\pi Z_0\chi_I^{(3)}}{\varepsilon_0 n^2}\frac{P}{\lambda_0 A} \tag{14.14}$$

proportional to the optical power $P$. This is called the Raman gain (or loss, if negative). For most materials, the physical origin of this gain at optical frequencies is coupling of the electronic states to vibrational modes in the medium. The Raman gain can be used as the gain mechanism of a laser oscillator (this is then called a Raman laser).

Figure 14.3: Self-focusing in air is used to propagate ultrashort pulses over very long distances without expansion. This can be used to perform spectroscopy of the atmosphere. (Figure from Woste, Frey and Wolf, Adv. Atom. Mol. Opt. Phys. 53, 413-441 (2006)).

Cross-phase modulation

Another consequence of the third-order nonlinearity comes up when considering the propagation of two waves with different frequencies $\omega_1$ and $\omega_2$ through the same medium. The third-order nonlinearity leads to a modulation of the nonlinear polarization at $\omega_1$ with the form

$$P_{NL}(\omega_1) = \chi^{(3)}\left[3|E(\omega_1)|^2 + 6|E(\omega_2)|^2\right]E(\omega_1) \tag{14.15}$$

which leads to a coupling between the two waves. This is called cross-phase modulation and can be a problem when using intense pulses in a waveguide for optical communication.

14.2 Coupled-wave theory

We return to the wave equation in a nonlinear medium

$$\nabla^2 E - \frac{n^2}{c^2}\frac{\partial^2 E}{\partial t^2} = -S \tag{14.16}$$

$$S = -\mu_0\frac{\partial^2 P_{NL}}{\partial t^2} \tag{14.17}$$

For demonstration purposes, we'll assume only second-order nonlinearity

$$P_{NL} = 2dE^2 \tag{14.18}$$

Let's suppose that we want to model some kind of three-wave mixing process. We then model the electric field as the sum of three waves, represented as

$$E(t) = \sum_{q=1,2,3}\mathrm{Re}\left[E_q e^{i\omega_q t}\right] = \frac{1}{2}\sum_{q=\pm1,\pm2,\pm3}E_q e^{i\omega_q t} \tag{14.19}$$

where $\omega_q = -\omega_{-q}$ is the frequency of each wave, and $E_q = E_{-q}^*$. The nonlinear polarization density is then

$$P_{NL}(t) = \frac{d}{2}\sum_{q,r=\pm1,\pm2,\pm3}E_qE_r e^{i(\omega_q+\omega_r)t} \tag{14.20}$$

and then

$$S = \frac{d\mu_0}{2}\sum_{q,r=\pm1,\pm2,\pm3}(\omega_q+\omega_r)^2E_qE_r e^{i(\omega_q+\omega_r)t} \tag{14.21}$$

which contains components that oscillate at the sums and differences of the component frequencies.

Let's suppose we break up $S$ into terms with different frequency components, such that $S_1$ is a component oscillating at $\omega_1$, $S_2$ a component oscillating at $\omega_2$, and $S_3$ a component oscillating at $\omega_3$. These components are non-zero only if the frequencies are commensurate, i.e. one frequency is the sum of the other two. If we also assume that all the frequencies are different, we can write a set of coupled wave equations for each wave

$$(\nabla^2 + k_1^2)E_1 = -S_1 \tag{14.22}$$

$$(\nabla^2 + k_2^2)E_2 = -S_2 \tag{14.23}$$

$$(\nabla^2 + k_3^2)E_3 = -S_3 \tag{14.24}$$

where $k_q = n\omega_q/c$. Let's assume that $\omega_1 + \omega_2 = \omega_3$ and $\omega_1 \neq \omega_2$. In this case

$$S_1 = 2\mu_0\omega_1^2\,d\,E_3E_2^* \tag{14.25}$$

$$S_2 = 2\mu_0\omega_2^2\,d\,E_3E_1^* \tag{14.26}$$

$$S_3 = 2\mu_0\omega_3^2\,d\,E_1E_2 \tag{14.27}$$

The three partial differential equations (the coupled-wave equations) are then

$$(\nabla^2 + k_1^2)E_1 = -2\mu_0\omega_1^2\,d\,E_3E_2^* \tag{14.28}$$

$$(\nabla^2 + k_2^2)E_2 = -2\mu_0\omega_2^2\,d\,E_3E_1^* \tag{14.29}$$

$$(\nabla^2 + k_3^2)E_3 = -2\mu_0\omega_3^2\,d\,E_1E_2 \tag{14.30}$$

For the special case where $\omega_1 = \omega_2$ we have only two such equations

$$(\nabla^2 + k_1^2)E_1 = -2\mu_0\omega_1^2\,d\,E_3E_1^* \tag{14.31}$$

$$(\nabla^2 + k_3^2)E_3 = -\mu_0\omega_3^2\,d\,E_1E_1 \tag{14.32}$$

The coupled wave equations can be simplified a bit if we assume all the waves are plane waves traveling in the z direction. If

$$E_q = \sqrt{2Z\hbar\omega_q}\,a_q e^{-ik_qz} \tag{14.33}$$

the variable $a_q$ represents the amplitude of the wave $q$, scaled such that $|a_q|^2$ is the photon flux density. Assuming $a_q(z)$ varies slowly with $z$, we can then convert the coupled wave equations (in the general, non-degenerate case) to the following ordinary differential equations

$$\frac{da_1}{dz} = -ig\,a_3a_2^*e^{-i\Delta kz} \tag{14.34}$$

$$\frac{da_2}{dz} = -ig\,a_3a_1^*e^{-i\Delta kz} \tag{14.35}$$

$$\frac{da_3}{dz} = -ig\,a_1a_2e^{i\Delta kz} \tag{14.36}$$

where

$$g = \sqrt{2\hbar\omega_1\omega_2\omega_3Z^3d^2} \tag{14.37}$$

and

$$\Delta k = k_3 - k_2 - k_1 \tag{14.38}$$

is the phase mismatch. These equations along with appropriate boundary conditions allow us to solve for the number of photons in each wave as a function of $z$.

For degenerate three-wave mixing (SHG) the equations are instead

$$\frac{da_1}{dz} = -ig\,a_3a_1^*e^{-i\Delta kz} \tag{14.39}$$

$$\frac{da_3}{dz} = -i\frac{g}{2}\,a_1a_1e^{i\Delta kz} \tag{14.40}$$

14.2.1 Photon number conservation: Manley-Rowe Relation

Before we get into some more specific examples, let's consider the z-derivative of the photon fluxes (for the non-degenerate case)

$$\frac{d|a_q|^2}{dz} = a_q^*\frac{da_q}{dz} + a_q\frac{da_q^*}{dz} \tag{14.41}$$

Plugging in equations 14.34 to 14.36 for each derivative gives

$$\frac{d|a_1|^2}{dz} = -ig\,a_3a_2^*a_1^*e^{-i\Delta kz} + \mathrm{c.c.} \tag{14.42}$$

$$\frac{d|a_2|^2}{dz} = -ig\,a_3a_1^*a_2^*e^{-i\Delta kz} + \mathrm{c.c.} \tag{14.43}$$

$$\frac{d|a_3|^2}{dz} = -ig\,a_1a_2a_3^*e^{i\Delta kz} + \mathrm{c.c.} \tag{14.44}$$

From this we can immediately conclude that

$$\frac{d|a_1|^2}{dz} = \frac{d|a_2|^2}{dz} = -\frac{d|a_3|^2}{dz} \tag{14.45}$$

which is known as the Manley-Rowe relation. Physically, it can be viewed in terms of photon scattering: to convert photons from waves 1 and 2 into wave 3, we need to subtract an equal number of photons from waves 1 and 2 and add that same number to wave 3.

Related is the energy conservation relation for the non-degenerate case

$$\frac{d}{dz}(I_1 + I_2 + I_3) = 0 \tag{14.46}$$

where $I_q = \hbar\omega_q|a_q|^2$ is the intensity of each wave.

14.2.2 Example: second harmonic generation

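For degenerate three-wave mixing ($\omega_1 = \omega_2$) and perfect phase matching ($\Delta k = 0$),

$$\frac{da_1}{dz} = -ig\,a_3a_1^* \tag{14.47}$$

$$\frac{da_3}{dz} = -i\frac{g}{2}\,a_1a_1 \tag{14.48}$$

Let's suppose that we start at $z = 0$ with $a_1|_{z=0} = A$ and $a_3|_{z=0} = 0$. Energy conservation implies

$$\frac{d}{dz}\left(|a_1|^2 + 2|a_3|^2\right) = 0 \tag{14.49}$$

and so

$$|a_1|^2 + 2|a_3|^2 = |A|^2 \tag{14.50}$$

for all values of $z$. We can then differentiate eq. 14.48 and write it as

$$\frac{d^2a_3}{dz^2} = -g^2(|A|^2 - 2|a_3|^2)\,a_3 \tag{14.51}$$

which is a second-order ODE. The solution for our boundary conditions is

$$a_3(z) = -\frac{i}{\sqrt{2}}A\tanh\!\left(\frac{g|A|z}{\sqrt{2}}\right) \tag{14.52}$$

$$a_1(z) = A\,\mathrm{sech}\!\left(\frac{g|A|z}{\sqrt{2}}\right) \tag{14.53}$$

The photon flux densities $\phi_1 = |a_1|^2$ and $\phi_3 = |a_3|^2$ are

$$\phi_1(z) = |A|^2\,\mathrm{sech}^2\!\left(\frac{g|A|z}{\sqrt{2}}\right) \tag{14.54}$$

$$\phi_3(z) = \frac{|A|^2}{2}\tanh^2\!\left(\frac{g|A|z}{\sqrt{2}}\right) \tag{14.55}$$

and are plotted in figure 14.4.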
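The closed-form solution can be cross-checked by integrating equations 14.47 and 14.48 directly. The sketch below uses a hand-written fourth-order Runge-Kutta step and dimensionless, illustrative values for $A$, $g$ and the propagation distance; it is meant as a minimal check under those assumptions, not as a general-purpose solver.

```python
import math

def rhs(a1, a3, g):
    """Right-hand sides of eqs. 14.47 and 14.48 (perfect phase matching)."""
    return -1j * g * a3 * a1.conjugate(), -0.5j * g * a1 * a1

def integrate_shg(A, g, z_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration from z = 0 to z_end."""
    a1, a3 = complex(A), 0j
    h = z_end / steps
    for _ in range(steps):
        k1 = rhs(a1, a3, g)
        k2 = rhs(a1 + 0.5 * h * k1[0], a3 + 0.5 * h * k1[1], g)
        k3 = rhs(a1 + 0.5 * h * k2[0], a3 + 0.5 * h * k2[1], g)
        k4 = rhs(a1 + h * k3[0], a3 + h * k3[1], g)
        a1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        a3 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return a1, a3

A, g, z = 1.0, 1.0, 3.0  # dimensionless illustration values (assumptions)
a1, a3 = integrate_shg(A, g, z)
u = g * abs(A) * z / math.sqrt(2.0)
print("numerical :", abs(a1) ** 2, abs(a3) ** 2)
print("analytical:", A**2 / math.cosh(u) ** 2, 0.5 * A**2 * math.tanh(u) ** 2)
```

The numerical photon fluxes follow the sech² and tanh² curves of equations 14.54 and 14.55, and the combination $|a_1|^2 + 2|a_3|^2$ stays equal to $|A|^2$ along the way, as equation 14.50 requires.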

14.3 Anisotropy

So far for simplicity we have avoided a complete treatment of anisotropy in nonlinear crystals. We can add this in a manner similar to how we did for the linear optical properties, by writing the polarization density as a vector

$$P_i = \varepsilon_0\sum_j\chi_{ij}E_j + 2\sum_{jk}d_{ijk}E_jE_k + 4\sum_{jkl}\chi^{(3)}_{ijkl}E_jE_kE_l \tag{14.56}$$

Page 180: Quantum Electronics / Photonics

Johnson, QE FS2014 179

Figure 14.4: Photon flux densities in second harmonic generation with perfect phase match-ing. Figure from Saleh and Teich, 21.4-1.

J (j,k)1 (1,1)2 (2,2)3 (3,3)4 (2,3)5 (3,1)6 (1,2)

Table 14.1: Correspondence of the contracted index J with actual tensor indices j, k. Thiskind of contraction is very common when the tensor is symmetric with respect to exchangeof two indices.

for i, j, k, l running from 1 to 3. Now the second-order nonlinear coefficient is a third-rank tensor, and the third-order coefficient is a fourth-rank tensor. In principle $d_{ijk}$ has 3 × 3 × 3 = 27 components, and $\chi^{(3)}_{ijkl}$ has $3^4 = 81$ components. In practice $d_{ijk}$ has only 18 independent components, since $d_{ijk} = d_{ikj}$. It is therefore usually represented as a 3 × 6 array $d_{iJ}$, where J is a “contracted index” that represents certain combinations of j and k. The correspondence is given in table 14.1. Similarly, $\chi^{(3)}_{ijkl}$ can be represented as a 6 × 6 matrix $\chi^{(3)}_{IK}$, where I is a contracted index for (i, j) and K is a contracted index for (k, l) that follows the same rule as J.

The symmetry of a crystal usually places restrictions on which elements of these matrices are non-zero or related. Some typical crystal symmetries and their impact on $d_{ijk}$ are shown in table 14.2.
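The index contraction of table 14.1 can be sketched in a few lines of Python. The helper names (`contracted_index`, `contract`, `expand`) and the nested-list storage are assumptions made for illustration only; the mapping itself is the one in table 14.1, applied to a tensor that satisfies $d_{ijk} = d_{ikj}$.

```python
# Table 14.1: contracted index J (1..6) for each pair (j, k), 1-based as in the text.
PAIR_TO_J = {(1, 1): 1, (2, 2): 2, (3, 3): 3, (2, 3): 4, (3, 1): 5, (1, 2): 6}


def contracted_index(j, k):
    """Return J for tensor indices (j, k); the order of j and k does not matter."""
    if (j, k) in PAIR_TO_J:
        return PAIR_TO_J[(j, k)]
    return PAIR_TO_J[(k, j)]


def contract(d_full):
    """Map a 3x3x3 nested list d_full[i][j][k] (0-based) to a 3x6 array d_iJ.

    Raises ValueError if d_full is not symmetric in its last two indices.
    """
    d_iJ = [[0.0] * 6 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                if d_full[i][j][k] != d_full[i][k][j]:
                    raise ValueError("d_ijk must equal d_ikj")
                d_iJ[i][contracted_index(j + 1, k + 1) - 1] = d_full[i][j][k]
    return d_iJ


def expand(d_iJ):
    """Inverse of contract: rebuild all 27 components from the 3x6 array."""
    return [[[d_iJ[i][contracted_index(j + 1, k + 1) - 1] for k in range(3)]
             for j in range(3)] for i in range(3)]
```

For example, `contracted_index(3, 2)` and `contracted_index(2, 3)` both give J = 4, so the 27 components of the full tensor collapse onto 18 entries of $d_{iJ}$ without loss of information.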

The tensor elements are always given in the principal coordinate system of the crystal. For three-wave mixing, the introduction of anisotropy effectively replaces the value of d with an effective value that depends on the orientation of each wave’s polarization with respect to the principal axes. A full treatment is beyond the scope of this course, but it is relatively straightforward (see e.g. R. W. Boyd, Nonlinear Optics, 2nd Ed., Acad. Press, 2003).


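Cubic $\bar{4}3m$:

\[ d_{iJ} = \begin{pmatrix}
0 & 0 & 0 & d_{14} & 0 & 0 \\
0 & 0 & 0 & 0 & d_{14} & 0 \\
0 & 0 & 0 & 0 & 0 & d_{14}
\end{pmatrix} \]

Tetragonal $\bar{4}2m$:

\[ d_{iJ} = \begin{pmatrix}
0 & 0 & 0 & d_{14} & 0 & 0 \\
0 & 0 & 0 & 0 & d_{14} & 0 \\
0 & 0 & 0 & 0 & 0 & d_{36}
\end{pmatrix} \]

Trigonal $3m$:

\[ d_{iJ} = \begin{pmatrix}
0 & 0 & 0 & 0 & d_{15} & -d_{22} \\
-d_{22} & d_{22} & 0 & d_{15} & 0 & 0 \\
d_{31} & d_{31} & d_{33} & 0 & 0 & 0
\end{pmatrix} \]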

Table 14.2: Some examples of $d_{iJ}$ for common crystal symmetries (point groups).
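To illustrate how the matrices of table 14.2 are used, the following Python sketch evaluates the second-order term of eq. 14.56, $2\sum_{jk} d_{ijk} E_j E_k$, in contracted form. The function names and the numerical values of the coefficients and the field are hypothetical and chosen only for illustration. The mixed field products carry a factor of 2 because the pairs (j, k) and (k, j) both appear in the sum and share one contracted index.

```python
def d_matrix_42m(d14, d36):
    """d_iJ for the tetragonal class with the layout of table 14.2."""
    return [[0, 0, 0, d14, 0, 0],
            [0, 0, 0, 0, d14, 0],
            [0, 0, 0, 0, 0, d36]]


def d_matrix_43m(d14):
    """d_iJ for the cubic class: the tetragonal form with d36 = d14."""
    return d_matrix_42m(d14, d14)


def d_matrix_3m(d15, d22, d31, d33):
    """d_iJ for the trigonal class 3m with the layout of table 14.2."""
    return [[0, 0, 0, 0, d15, -d22],
            [-d22, d22, 0, d15, 0, 0],
            [d31, d31, d33, 0, 0, 0]]


def field_products(E):
    """Six contracted field products, ordered by J as in table 14.1."""
    E1, E2, E3 = E
    return [E1 * E1, E2 * E2, E3 * E3, 2 * E2 * E3, 2 * E3 * E1, 2 * E1 * E2]


def second_order_polarization(d_iJ, E):
    """Second-order term of eq. 14.56: P_i = 2 * sum_J d_iJ * F_J."""
    F = field_products(E)
    return [2 * sum(d_iJ[i][J] * F[J] for J in range(6)) for i in range(3)]


if __name__ == "__main__":
    # Arbitrary units; field in the x-y plane of a tetragonal crystal.
    print(second_order_polarization(d_matrix_42m(0.5, 0.5), (1.0, 1.0, 0.0)))
```

With the field in the x-y plane, only the $d_{36}$ element contributes for the tetragonal class, so the second-order polarization points along the third principal axis. This is a simple instance of the orientation dependence that an effective d summarizes.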

14.4 Dispersion

Finally, we have so far not treated dispersive effects properly for nonlinear materials. As with linear optics, the non-instantaneous response of the polarization density to the electric field leads to a temporal lag in the polarization response, which manifests itself as a complex-valued, frequency dependent nonlinear susceptibility. A more elaborate treatment of this is given in chapter 21.7 of Saleh and Teich, but the main result is that the nonlinear coefficients $d_{ijk}$ and $\chi^{(3)}_{ijkl}$ become complex and dependent on the frequencies of the waves involved in the nonlinear process. When we want to make this clear we often write them as explicit functions of frequency. Thus,

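\[ d_{ijk} \rightarrow d_{ijk}(\omega_3; \omega_1, \omega_2) \tag{14.57} \]

for three wave mixing where $\omega_3 = \omega_1 + \omega_2$; similarly

\[ \chi^{(3)}_{ijkl} \rightarrow \chi^{(3)}_{ijkl}(\omega_4; \omega_1, \omega_2, \omega_3) \tag{14.58} \]

for $\omega_4 = \omega_1 + \omega_2 + \omega_3$. Like the linear susceptibility, in the vicinity of resonances the changes to these nonlinear coefficients can be very large.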