ira m. gessel - brandeis universitypeople.brandeis.edu/~gessel/homepage/slides/wilf80-slides.pdf ·...

On the WZ Method

Ira M. Gessel

Department of MathematicsBrandeis University

Waterloo Workshop in Computer AlgebraWilfrid Laurier University

May 28, 2011

In Honor of Herbert Wilf’s 80th Birthday

The WZ (Wilf-Zeilberger) method

The way the WZ method is usually described:

We have two functions f and g satisfying the WZ equation

f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j) (WZ)

If we sum on i the right side telescopes, so

f (i , j + 1)−N∑

f (i , j) = g(N + 1, j)− g(0, j).

If g(N + 1, j) = g(0, j) = 0 then∑N

i=0 f (i , j + 1) is independentof j so we can evaluate

∑i f (i , j).

Note: we might want to take N =∞.

f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j) (WZ)

f (i , j + 1)−N∑

f (i , j) = g(N + 1, j)− g(0, j).

If g(N + 1, j) = g(0, j) = 0 then∑N

∑i f (i , j).

f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j) (WZ)

f (i , j + 1)−N∑

f (i , j) = g(N + 1, j)− g(0, j).

If g(N + 1, j) = g(0, j) = 0 then∑N

∑i f (i , j).

f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j) (WZ)

f (i , j + 1)−N∑

f (i , j) = g(N + 1, j)− g(0, j).

If g(N + 1, j) = g(0, j) = 0 then∑N

∑i f (i , j).

An example—Vandermonde’s theorem

Let’s use the WZ method to prove the Chu-Vandermondetheorem:

r − i

(p + q

It’s convenient to replace the upper limit of summation with∞.We divide the left side by the right side to get a sum which isindependent of the parameters:

∞∑

ti = 1, where ti =

r − i

) / (p + q

To apply the WZ method, we need a second parameter j . Butwe have three parameters, p, q, and r . Which one should wetake?

An example—Vandermonde’s theorem

Let’s use the WZ method to prove the Chu-Vandermondetheorem:

r − i

(p + q

It’s convenient to replace the upper limit of summation with∞.We divide the left side by the right side to get a sum which isindependent of the parameters:

∞∑

ti = 1, where ti =

r − i

) / (p + q

To apply the WZ method, we need a second parameter j . Butwe have three parameters, p, q, and r . Which one should wetake?

Let’s try taking p to be j . So our identity becomes∑∞i=0 f (i , j) = 1, where

f (i , j) =

r − i

) / (j + q

We would like to find a WZ-mate for f : a solution of the WZequation f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j). We can do thisusing Gosper’s algorithm, which gives

g(i , j) =i (q − r + i)

(i − 1− j)(j + 1 + q)f (i , j)

= − i + q − rj + q + 1

i − 1

r − i

) / (j + q

Since g(0, j) = 0 and g(i , j) = 0 for i > j + 1, we have

∞∑

(f (i , j + 1)− f (i , j)

f (i , j) =

r − i

) / (j + q

We would like to find a WZ-mate for f : a solution of the WZequation f (i , j + 1)− f (i , j) = g(i + 1, j)− g(i , j).

We can do thisusing Gosper’s algorithm, which gives

g(i , j) =i (q − r + i)

(i − 1− j)(j + 1 + q)f (i , j)

= − i + q − rj + q + 1

i − 1

r − i

) / (j + q

∞∑

(f (i , j + 1)− f (i , j)

f (i , j) =

r − i

) / (j + q

g(i , j) =i (q − r + i)

(i − 1− j)(j + 1 + q)f (i , j)

= − i + q − rj + q + 1

i − 1

r − i

) / (j + q

∞∑

(f (i , j + 1)− f (i , j)

f (i , j) =

r − i

) / (j + q

g(i , j) =i (q − r + i)

(i − 1− j)(j + 1 + q)f (i , j)

= − i + q − rj + q + 1

i − 1

r − i

) / (j + q

∞∑

(f (i , j + 1)− f (i , j)

So ∑

f (i , j)

is independent of j and is therefore equal to∑

f (i ,0) = 1

(at least as long as j is a nonnegative integer).

What if we take r to be j? So

f (i , j) =

j − i

) / (p + q

We apply Gosper’s algorithm and it is again successful: we findthe WZ-mate

g(i , j) =i (q − j + i)

(i − 1− j)(p + q − j)f (i , j)

= −(

ji − 1

)(p + q − j − 1

p − i

) / (p + q

We find that any way of making the parameters linear functionsof j works.

For example, if we replace i → i + j , p → p + j , q → q − j , sothat

f (i , j) =

(p + ji + j

)(q − j

r − i − j

) / (p + q + j

then Gosper’s algorithm finds a WZ-mate for f ,

g(i , j) =q − r + i

q − jf (i , j).

We find that any way of making the parameters linear functionsof j works.

For example, if we replace i → i + j , p → p + j , q → q − j , sothat

f (i , j) =

(p + ji + j

)(q − j

r − i − j

) / (p + q + j

then Gosper’s algorithm finds a WZ-mate for f ,

g(i , j) =q − r + i

q − jf (i , j).

So we have many (infinitely many, in fact) slightly different WZproofs of Vandermonde’s theorem.

I Why does Gosper’s algorithm always work?I Why should we care?

I Why does Gosper’s algorithm always work?

I Why should we care?

I Why does Gosper’s algorithm always work?I Why should we care?

Path Invariance

(Gosper, Kadell, Zeilberger)

Another way to look at WZ pairs: We consider a grid graph witha weight function defined on every directed edge with thefollowing property:

I For all vertices A and B, all paths from A to B have the sameweight. (path invariance)

This property is equivalent to the existence of a potentialfunction defined on the vertices: the weight of an edge is thedifference in the potential of its endpoints:

Some observations

I To check that a weighted grid graph has the path invarianceproperty, it is sufficient to check it on each rectangle:

d a + b = c + d

Some observations

I We can add arbitrary other edges to the graph, and they willhave uniquely determined weights that satisfy the path invari-ance property.

Some observations

I The weights of a directed edge and its reversal are negativesof each other.

In particular, from one path-invariant weighted grid graph, wecan get another by taking a different grid on the same set ofpoints:

Suppose that we have a a path-invariant weighted grid graph,where the vertices are in Z× Z. Let f (i , j) be the weight on theedge from (i , j) to (i + 1, j) and let g(i , j) be the weight on theedge from (i , j) to (i , j + 1):

f(i, j + 1)

f(i, j)

g(i, j) g(i + 1, j)

(i, j)

The path invariance property is

f (i , j) + g(i + 1, j) = g(i , j) + f (i , j + 1)

This is the same as the WZ identity.

Suppose that we have a a path-invariant weighted grid graph,where the vertices are in Z× Z. Let f (i , j) be the weight on theedge from (i , j) to (i + 1, j) and let g(i , j) be the weight on theedge from (i , j) to (i , j + 1):

f(i, j + 1)

f(i, j)

g(i, j) g(i + 1, j)

(i, j)

The path invariance property is

f (i , j) + g(i + 1, j) = g(i , j) + f (i , j + 1)

This is the same as the WZ identity.

We can think of our previous application of the WZ identity interms of path invariance: We rewrite the identity

f (i , j + 1)−N∑

f (i , j) = g(N + 1, j)− g(0, j).

g(0, j) +N∑

f (i , j + 1) =N∑

f (i , j) + g(N + 1, j).

This comes from path invariance along these paths:

(0, j)

(0, j + 1) (N + 1, j + 1)

(N + 1, j)

path invariance vs. telescoping series viewpoint

I Using path invariance, can find other kinds of identities bysumming along different paths.

I Rather than proving the telescoping formula

f (i , j + 1)−N∑

f (i , j) = 0,

we can prove directly that

f (i , j) = 1.

using the following two paths:

path invariance vs. telescoping series viewpoint

I Using path invariance, can find other kinds of identities bysumming along different paths.

I Rather than proving the telescoping formula

f (i , j + 1)−N∑

f (i , j) = 0,

we can prove directly that

f (i , j) = 1.

using the following two paths:

(N + 1, j)

(N + 1, 0)(0, 0)

(0, j)

(1, 0)1 0

f(i, j)

Another important example of path invariance:

(0, 0) (M, 0)

(0, M)

M−1X

g(0, j)

M−1X

f(i, 0)

If the weight of the green connecting path goes to 0 as M →∞then

∞∑

f (i ,0) =∞∑

g(0, j).

Change of Variables

Path invariance allows us to assign a weight to any step,thereby giving us change of variables formulas for WZ pairs.

Suppose we have a WZ pair (f (i , j),g(i , j)). We take the steps(1,0) and (1,1).

The weight of an edge from (i , j) to (i + 1, j) is f (i , j).

Let h(i , j) be the weight of a step from (i , j) to (i + 1, j + 1), so

h(i , j) = f (i , j) + g(i + 1, j) = g(i , j) + f (i , j + 1).

Change of Variables

Then we have the path invariance formula

h(i , j) + f (i + 1, j + 1) = f (i , j) + h(i + 1, j) (∗)

f(i, j)(i, j)

f(i + 1, j + 1)

h(i, j) h(i + 1, j)

(i + 2, j + 1)

To get a WZ pair we must represent the values of f and h in thecoordinates (1,0) and (1,1).

Since i(1,0) + j(1,1) = (i + j , j), we set F (i , j) = f (i + j , j) andH(i , j) = h(i + j , j). Then replacing i with i + j in (∗) gives

H(i , j) + F (i , j + 1) = F (i , j) + H(i + 1, j)

so(F (i , j),H(i , j)

)is a WZ pair.

Then we have the path invariance formula

h(i , j) + f (i + 1, j + 1) = f (i , j) + h(i + 1, j) (∗)

f(i, j)(i, j)

f(i + 1, j + 1)

h(i, j) h(i + 1, j)

(i + 2, j + 1)

To get a WZ pair we must represent the values of f and h in thecoordinates (1,0) and (1,1).

Since i(1,0) + j(1,1) = (i + j , j), we set F (i , j) = f (i + j , j) andH(i , j) = h(i + j , j). Then replacing i with i + j in (∗) gives

H(i , j) + F (i , j + 1) = F (i , j) + H(i + 1, j)

so(F (i , j),H(i , j)

)is a WZ pair.

Path invariance in several variables

In n variables we may define a WZ n-tuple to be an n-tuple(f1(i1, . . . , in), . . . , fn(i1, . . . , in)

such that for j 6= k , the pair (fj , fk ) is a WZ-pair with respect tothe variables ij and ik .

This implies that if we put a weight of fj(i1, . . . , in) on the edgefrom (i1, . . . , in) to the adjacent point (i1, . . . , ij + 1, . . . , in) thenwe get a path-invariant grid graph in Zn (or Cn or Rn).

Since there are(n

)conditions, it would seem to be difficult to

satisfy them all. But it’s really not so hard:

Lemma. Suppose that for j = 2, . . . ,n, (f1, fj) is a WZ pair withrespect to the variables i1 and ij . Then (f1, . . . , fn) is a WZn-tuple in the variables i1, . . . , in.

Since there are(n

satisfy them all.

But it’s really not so hard:

Since there are(n

WZ forms

Zeilberger introduced WZ forms (in Closed Form (PunIntended!) ) as a way to associate the variables to the functionsin a WZ n-tuple. To the WZ n-tuple (f1, . . . , fn) we associate theWZ form

fj(i1, . . . , in) δij ,

by analogy with differential forms, where δij is a “formalsymbol". So summing a WZ form along a path is analogous tointegrating a differential form along a path.

Going back to the Chu-Vandermonde theorem,

r − i

(p + q

by applying Gosper’s algorithm we can find a correspondingWZ form in four variables, i ,p,q, and r :

r − i

(p + q

)(δi − i (q − r + i)

(p + 1− i)(p + 1 + q)δp

p + q + 1δq − i (q − r + i)

(r − i + 1)(p + q − r)δr)

As in our change of variables formula, we can get all thedifferent WZ pairs (or more generally, WZ forms) correspondingto parametrizations of Vandermonde’s theorem by takingdifferent basis steps and using ωV to evaluate the weights ofthese steps.

But we don’t need to find these WZ pairs; all the results that wemight have found from them we can get directly from pathinvariance.

As in our change of variables formula, we can get all thedifferent WZ pairs (or more generally, WZ forms) correspondingto parametrizations of Vandermonde’s theorem by takingdifferent basis steps and using ωV to evaluate the weights ofthese steps.

But we don’t need to find these WZ pairs; all the results that wemight have found from them we can get directly from pathinvariance.

The central fact developed is that identities are bothinexhaustible and unpredictable; the age-old dream of puttingorder in this chaos is doomed to failure.

John Riordan, Combinatorial Identities

WZ forms bring order to this chaos.

The central fact developed is that identities are bothinexhaustible and unpredictable; the age-old dream of puttingorder in this chaos is doomed to failure.

John Riordan, Combinatorial Identities

WZ forms bring order to this chaos.

What WZ forms are there?

All WZ forms are associated with one of four standardhypergeometric series summation formulas:

I The binomial theorem

I the Chu-Vandermonde theorem (Gauss’s theorem)

I the Pfaff-Saalschütz theorem

I Dougall’s very-well-poised 7F6 summation theorem

All1 WZ forms are associated with one of four standardhypergeometric series summation formulas:

1almost

Shifting and shadowing

In addition to change of variables, which comes from pathindependence, there are two other simple operations on WZforms that give forms we will consider to be equivalent.

It is convenient to assume that all of our functions are “closedform" or “proper hypergeometric terms" or gamma quotients: intwo variables (i and j) for simplicity, we want each to be of theform

f (i , j) =Γ(a1i + b1j + u1) · · · Γ(ak i + bk j + uk )

Γ(c1i + d1j + v1) · · · Γ(cl i + dl j + vl)z iw j

where the an, bn, cn, and dn are integers, and the un, vn, z, andw are complex numbers. It is also convenient to allow i and j tobe complex numbers, so that f is a meromorphic function.(Similarly for more than two variables.)

I ShiftingWe replace a variable i in a WZ form with i + c for some c ∈ C.

I ShadowingWe can multiply a WZ form by any periodic function with period1 in each variable. In particular, we can take as our multiplier

(−1)iΓ(i)Γ(1− i) = (−1)i π

sinπi.

So we can replace 1/Γ(i) with (−1)iΓ(1− i) or vice versa.

This allows us to get rid of inconvenient zeros or poles.

(−1)iΓ(i)Γ(1− i) = (−1)i π

sinπi.

(−1)iΓ(i)Γ(1− i) = (−1)i π

sinπi.

Rising factorials

It is convenient to use the notation for rising factorials,

(a)k = a (a + 1) · · · (a + k − 1)

=Γ(a + k)

Γ(a).

So, for example,

n! = (1)n

(2n)! = 22n(12)n(1)n

n + 1 = (2)n/(1)n

A zeta function example

A well-known formula for ζ(2) = π2/6 is

ζ(2) =∞∑

= 3∞∑

1i2(2i

∞∑

(32)i(2)i

A generalization was proved by Herb Wilf using (a variation of)the WZ method.

This identity is a special case (x = 0) of theBailey-Borwein-Bradley formula

∞∑

ζ(2n + 2)x2n =∞∑

1i2 − x2

= 3∞∑

)(i2 − x2)

i−1∏

4x2 −m2

x2 −m2

1− x2

∞∑

(2)i(1− 2x)i(1 + 2x)i

(32)i(2− x)i(2 + x)i

Proved using a WZ pair by Khodabakhsh and Tatiana HessamiPilehrood.

∞∑

ζ(2n + 2)x2n =∞∑

1i2 − x2

= 3∞∑

)(i2 − x2)

i−1∏

4x2 −m2

x2 −m2

1− x2

∞∑

(2)i(1− 2x)i(1 + 2x)i

(32)i(2− x)i(2 + x)i

∞∑

ζ(2n + 2)x2n =∞∑

1i2 − x2

= 3∞∑

)(i2 − x2)

i−1∏

4x2 −m2

x2 −m2

1− x2

∞∑

(2)i(1− 2x)i(1 + 2x)i

(32)i(2− x)i(2 + x)i

∞∑

ζ(2n + 2)x2n =∞∑

1i2 − x2

= 3∞∑

)(i2 − x2)

i−1∏

4x2 −m2

x2 −m2

1− x2

∞∑

(2)i(1− 2x)i(1 + 2x)i

(32)i(2− x)i(2 + x)i

We will see that their WZ pair comes from the WZ form for theChu-Vandermonde theorem, or equivalently Gauss’s theorem,which is

∞∑

(a)i(b)i

(1)i(c)i=

Γ(c)Γ(c − a− b)

Γ(c − a)Γ(c − b).

It is convenient to write this in the form∞∑

(k)i(l)i

(1)i(j + k + l)i=

Γ(j)Γ(j + k + l)Γ(j + k)Γ(j + l)

which corresponds to the nicely symmetric WZ form

Γ(i + k)Γ(i + l)Γ(j + k)Γ(j + l)Γ(i)Γ(j)Γ(k)Γ(l)Γ(i + j + k + l)

We have∞∑

ζ(2n + 2)x2n =∞∑

1i2 − x2

=∞∑

1(i + 1)2 − x2

=∞∑

1(1 + x + i)(1− x + i)

1− x2

∞∑

(1 + x)i(1− x)i

(2 + x)i(2− x)i.

We want to transform the summand of Gauss’s theorem,

(k)i(l)i

(1)i(j + k + l)i

into a constant times

(1 + x)i(1− x)i

(2 + x)i(2− x)i.

If we replace i by 1 + x + i in (k)i(l)i/(1)i(j + k + l))i (shifting)and then set j = 1, k = 0, l = −2 we get

− 2xΓ(0)(1− x2)

· (1 + x)i(1− x)i

(2 + x)i(2− x)i

We can get rid of the Γ(0) in the denominator by shadowing.

(k)i(l)i

(1)i(j + k + l)i

(1 + x)i(1− x)i

(2 + x)i(2− x)i.

− 2xΓ(0)(1− x2)

· (1 + x)i(1− x)i

(2 + x)i(2− x)i

(k)i(l)i

(1)i(j + k + l)i

(1 + x)i(1− x)i

(2 + x)i(2− x)i.

− 2xΓ(0)(1− x2)

· (1 + x)i(1− x)i

(2 + x)i(2− x)i

The shadowed WZ form is

ω = (−1)k Γ(i + k) Γ(i + l) Γ(j + k) Γ(j + l) Γ(1− k)

Γ(i) Γ(j) Γ(l) Γ(i + j + k + l)

×(δii

If we sum ω in the direction (1,0,0,0) starting at the point(1 + x ,1,0,−2x) we get

−2x∞∑

1(1 + i)2 − x2 = −2x

∞∑

1i2 − x2 .

To transform the sum, we compute the value of the i th stepstarting at the same point and going in another direction. Welook for something interesting and we have to check that thesum along the “connecting path" goes to 0.

The shadowed WZ form is

ω = (−1)k Γ(i + k) Γ(i + l) Γ(j + k) Γ(j + l) Γ(1− k)

Γ(i) Γ(j) Γ(l) Γ(i + j + k + l)

×(δii

If we sum ω in the direction (1,0,0,0) starting at the point(1 + x ,1,0,−2x) we get

−2x∞∑

1(1 + i)2 − x2 = −2x

∞∑

1i2 − x2 .

To transform the sum, we compute the value of the i th stepstarting at the same point and going in another direction. Welook for something interesting and we have to check that thesum along the “connecting path" goes to 0.

We find that the direction (1,2,−1,−1) gives us what we want:the sum starting at (1 + x ,1,0,−2x) gives

− 3x1− x2

∞∑

(2)n(1− 2x)n(1 + 2x)n

(32)n(2− x)n(2 + x)n

An advantage of this approach is that it gives us somethingmore; if we start at the point (x ,1,0, y − x) we get the moregeneral formula

∞∑

1(i + x)(i + y)

∞∑

(1 + x + y + 3i)

× i! (1 + x − y)i(1 + y − x)i

(32)i(1 + x)i(1 + y)i

We find that the direction (1,2,−1,−1) gives us what we want:the sum starting at (1 + x ,1,0,−2x) gives

− 3x1− x2

∞∑

(2)n(1− 2x)n(1 + 2x)n

(32)n(2− x)n(2 + x)n

An advantage of this approach is that it gives us somethingmore; if we start at the point (x ,1,0, y − x) we get the moregeneral formula

∞∑

1(i + x)(i + y)

∞∑

(1 + x + y + 3i)

× i! (1 + x − y)i(1 + y − x)i

(32)i(1 + x)i(1 + y)i

Amalgamated forms of identities

Recall that the coefficients of the variables in a WZ form mustbe integers.

We have the identity

(d)2i = 22i(12d)i(

12d + 1

So in Gauss’s theorem∞∑

(a)i(b)i

(1)i(c)i=

Γ(c − a)Γ(c − b).

we can set a = d/2, b = d/2 + 1/2 and simplify to get

∞∑

(1)i(c)i2−2i =

22c−d−2Γ(c)Γ(c − d − 12)

√πΓ(2c − d − 1)

Amalgamated forms of identities

Recall that the coefficients of the variables in a WZ form mustbe integers.

We have the identity

(d)2i = 22i(12d)i(

12d + 1

So in Gauss’s theorem∞∑

(a)i(b)i

(1)i(c)i=

Γ(c − a)Γ(c − b).

we can set a = d/2, b = d/2 + 1/2 and simplify to get

∞∑

(1)i(c)i2−2i =

22c−d−2Γ(c)Γ(c − d − 12)

√πΓ(2c − d − 1)

We can get a corresponding WZ form

21−2i−2j−k√π Γ(2i + k)Γ(2j + k)

Γ(i)Γ(j)Γ(k)Γ(i + j + k + 12)

+δjj− 2

Is it possible to derive this directly from the WZ form for thegeneral Gauss’s theorem without a separate application ofGosper’s algorithm?

We can get a corresponding WZ form

21−2i−2j−k√π Γ(2i + k)Γ(2j + k)

Γ(i)Γ(j)Γ(k)Γ(i + j + k + 12)

+δjj− 2

Is it possible to derive this directly from the WZ form for thegeneral Gauss’s theorem without a separate application ofGosper’s algorithm?

Sporadic WZ forms

There are many evaluations of 2F1’s with just one freeparameter. They all have WZ pairs, but these seem to besporadic.

ira m. gessel - brandeis universitypeople.brandeis.edu/~gessel/homepage/slides/wilf80-slides.pdf ·...

Documents