
Scheduling and Congestion Control for Wireless and Processing Networks


Synthesis Lectures on Communication Networks

Editor
Jean Walrand, University of California, Berkeley

Synthesis Lectures on Communication Networks is an ongoing series of 50- to 100-page publications on topics on the design, implementation, and management of communication networks. Each lecture is a self-contained presentation of one topic by a leading expert. The topics range from algorithms to hardware implementations and cover a broad spectrum of issues from security to multiple-access protocols. The series addresses technologies from sensor networks to reconfigurable optical networks. The series is designed to:

Provide the best available presentations of important aspects of communication networks.

Help engineers and advanced students keep up with recent developments in a rapidly evolving technology.

Facilitate the development of courses in this field.

Scheduling and Congestion Control for Wireless and Processing Networks
Libin Jiang and Jean Walrand
2010

Performance Modeling of Communication Networks with Markov Chains
Jeonghoon Mo
2010

Communication Networks: A Concise Introduction
Jean Walrand and Shyam Parekh
2010

Path Problems in Networks
John S. Baras and George Theodorakopoulos
2010

Performance Modeling, Loss Networks, and Statistical Multiplexing
Ravi R. Mazumdar
2009

Network Simulation
Richard M. Fujimoto, Kalyan S. Perumalla, and George F. Riley
2006


Copyright © 2010 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations in printed reviews, without the prior permission of the publisher.

Scheduling and Congestion Control for Wireless and Processing Networks
Libin Jiang and Jean Walrand

    www.morganclaypool.com

    ISBN: 9781608454617 paperback

    ISBN: 9781608454624 ebook

    DOI 10.2200/S00270ED1V01Y201008CNT006

    A Publication in the Morgan & Claypool Publishers series

    SYNTHESIS LECTURES ON COMMUNICATION NETWORKS

    Lecture #6

    Series Editor: Jean Walrand, University of California, Berkeley

    Series ISSN

    Synthesis Lectures on Communication Networks

Print 1935-4185, Electronic 1935-4193

http://www.morganclaypool.com/

Scheduling and Congestion Control for Wireless and Processing Networks

Libin Jiang and Jean Walrand
University of California, Berkeley

SYNTHESIS LECTURES ON COMMUNICATION NETWORKS #6

Morgan & Claypool Publishers


Contents

Preface

1 Introduction

2 Overview
2.1 A Small Wireless Network
2.1.1 Feasible Rates
2.1.2 Maximum Weighted Matching
2.1.3 CSMA
2.1.4 Entropy Maximization
2.1.5 Discussion
2.2 Admission Control
2.3 Randomized Backpressure
2.4 Appendix
2.5 Summary

3 Scheduling in Wireless Networks
3.1 Model and Scheduling Problem
3.2 CSMA Algorithm
3.3 Idealized Algorithm
3.3.1 CSMA Can Achieve Maximal Throughput
3.3.2 An idealized distributed algorithm
3.4 Distributed Algorithms
3.4.1 Throughput-Optimal Algorithm 1
3.4.2 Variation: Constant Update Intervals
3.4.3 Time-invariant A-CSMA
3.5 Maximal-Entropy Interpretation
3.6 Reducing Delays: Algorithm 1(b)
3.7 Simulations
3.7.1 Time-invariant A-CSMA
3.7.2 Time-varying A-CSMA
3.8 Proof Sketch of Theorem 3.10-(i)
3.9 Further Proof Details of Theorem 3.10-(i)
3.9.1 Property 3.21
3.9.2 Property 3.22: Noise
3.9.3 Property 3.22: Bias
3.10 Proof of Theorem 3.10-(ii)
3.11 Proof of Theorem 3.13
3.12 General Transmission Times
3.13 Appendices
3.13.1 Proof of the fact that C is the interior of C̄
3.13.2 Proof of Proposition 3.7
3.14 Summary
3.15 Related Works
3.15.1 Maximal-weight scheduling
3.15.2 Low-complexity but sub-optimal algorithms
3.15.3 Throughput-optimum algorithms for restrictive interference models
3.15.4 Random Access algorithms

4 Utility Maximization in Wireless Networks
4.1 Joint Scheduling and Congestion Control
4.1.1 Formulation of Optimization Problem
4.1.2 Derivation of Algorithm
4.1.3 Approaching the Maximal Utility
4.2 Extensions
4.2.1 Anycast
4.2.2 Multicast with Network Coding
4.3 Simulations
4.4 Properties of Algorithm 3
4.4.1 Bound on Backpressure
4.4.2 Total Utility
4.4.3 Queue Lengths
4.5 Summary
4.6 Related Works

5 Distributed CSMA Scheduling with Collisions
5.1 Introduction
5.2 CSMA/CA-Based Scheduling with Collisions
5.2.1 Model
5.2.2 Notation
5.2.3 Computation of the Service Rates
5.3 A Distributed Algorithm to Approach Throughput-Optimality
5.3.1 CSMA Scheduling with Collisions
5.4 Reducing Delays
5.5 Numerical Examples
5.6 Proofs of Theorems
5.6.1 Proof of Theorem 5.1
5.6.2 Proof of Theorem 5.2
5.6.3 Proof of Theorem 5.4
5.7 Summary
5.8 Related Works

6 Stochastic Processing Networks
6.1 Introduction
6.2 Examples
6.3 Basic model
6.4 DMW scheduling
6.4.1 Arrivals that are smooth enough
6.4.2 More random arrivals
6.5 Utility maximization
6.6 Extensions
6.7 Simulations
6.7.1 DMW scheduling
6.7.2 Utility maximization
6.8 Summary
6.9 Skipped proofs
6.9.1 Proof of Theorem 6.5
6.9.2 Proof of the rate-stability theorem
6.9.3 Proof of Theorem 6.8
6.9.4 Proof of Theorem 6.9

A Stochastic Approximation
A.1 Gradient algorithm
A.2 Stochastic approximation
A.3 Summary
A.4 References

Bibliography

Authors' Biographies

Index


    Preface

This book explains recent results on distributed algorithms for networks. The book is based on Libin's Ph.D. thesis, where he introduced the design of a CSMA algorithm based on a primal-dual optimization problem, extended the work to networks with collisions, and developed the scheduling of processing networks based on virtual queues.

To make the book self-contained, we added the necessary background on stochastic approximations and on optimization. We also added an overview chapter and comments to make the arguments easier to follow. The material should be suitable for graduate students in electrical engineering, computer science, or operations research.

The main theme of this book is the allocation of resources among competing tasks. Such problems are typically hard because of the large number of possible allocations. Instead of searching for the optimal allocation at each instant, the approach is to design a randomized allocation whose distribution converges to one with desirable properties. The randomized allocation is implemented by a scheme where tasks request the resources after a random delay. Each task adjusts the mean value of its delay based on local information.

One application is wireless ad hoc networks where links share radio channels. Another application is processing networks where tasks share resources such as tools or workers. These problems have received a lot of attention in the last few years. The book explains the main ideas on simple examples, then studies the general formulation and recent developments.

We are thankful to Prof. Devavrat Shah for suggesting adjusting the update intervals in one of the gradient algorithms, to Profs. Venkat Anantharam and Pravin Varaiya for their constructive comments on the thesis, to Profs. Michael Neely and R. Srikant for detailed constructive reviews of the book, and to Profs. Vivek Borkar, P.R. Kumar, Bruce Hajek, Eytan Modiano, and Dr. Alexandre Proutiere for their encouragement and useful feedback. We are grateful to NSF and ARO for their support of our research during the writing of this book.

Libin Jiang and Jean Walrand
August 2010



    C H A P T E R 1

    Introduction

In a wireless network, nodes share one or more radio channels. The nodes get packets to transmit from the application and transmit them hop by hop to their destination. For instance, one user may be downloading a file from another node; two other users might be engaged in a Skype call.

The nodes cannot all transmit together, for their transmissions would then interfere with one another. Consequently, at any given time, only a subset of nodes should transmit. The scheduling problem is to design an algorithm for selecting the set of nodes that transmit and a protocol for implementing the algorithm. Moreover, the nodes should decide which packet to send and to what neighboring node.

This problem admits a number of formulations. In this book, we adopt a simple model of interference: two links either conflict or they do not. Thus, conflicts are represented by a conflict graph whose vertices are all the links and whose edges are between pairs of links that conflict and should not transmit together. Equivalently, there are subsets of links that can transmit together because they do not share an edge. Such sets are called independent sets.

Intuitively, the set of links that should transmit depends on the backlog of the nodes. For instance, we explain that choosing the independent set with the maximum sum of backlogs is a good policy when the nodes need to transmit each packet only once. This policy is called Maximum Weighted Matching (MWM). Another good policy is to first select the link with the largest backlog, then the link with the largest backlog among those that do not conflict with the first one, and so on. This policy is called Longest Queue First (LQF). These two policies are not easy to implement because the information about the backlog of the nodes is not available to all the nodes. Moreover, even if all nodes knew all the backlogs, implementing MWM would still be computationally hard because of the huge number of independent sets even in a small graph.

One key idea in this book is that, instead of looking for the independent set with the maximum sum of backlogs, one designs a randomized scheduling algorithm. To implement this algorithm, the nodes choose random waiting times. The mean of the waiting time of each node decreases with the backlog of that node. After that waiting time, the node listens to the channel. If it does not hear any transmission, it starts transmitting a packet. Otherwise, it chooses a new waiting time and repeats the procedure. Note that this algorithm is distributed since each node needs only know its own backlog and whether any conflicting node is transmitting. Moreover, the algorithm does not require any complex calculation. One can show that this algorithm, called A-CSMA, for adaptive carrier sense multiple access, selects an independent set with a probability that increases with the sum of the backlogs in that set. Thus, this randomized algorithm automatically approximates the NP-hard selection that MWM requires. As you might suspect, the probability distribution of the active independent sets may take a long time to converge. However, in practice, this convergence appears to be fast enough for the mechanism to have good properties.

When the nodes must relay the packets across multiple hops, a good algorithm is to choose

an independent set such that the sum of the differences of backlogs between the transmitters and the receivers is maximized. Again, this problem is NP-hard, and a randomized algorithm is an approximation with good properties. In this algorithm, the nodes pick a random waiting time whose mean decreases with the back-pressure of the packet being transmitted. Here, the back-pressure of a packet is the difference in queue lengths between the transmitter and the receiver, multiplied by the link rate. We should call this protocol B-CSMA, but we still call it A-CSMA to avoid multiplying terminology.

When we say that the randomized algorithms have good properties, we mean more than that they are good heuristics that work well in simulations. We mean that they are in fact throughput-optimal or utility-maximizing. That is, these algorithms maximize the rates of flows through the network, in a sense that we make precise later. One may wonder how simple distributed randomized algorithms can have the same throughput optimality as an NP-hard algorithm such as MWM. The reason is that achieving long-term properties of throughput does not require making the best decision at each instant. It only requires making good decisions on average. Accordingly, an algorithm that continuously improves the random selection of the independent set can take a long time to converge without affecting the long-term throughput. The important practical questions concern the ability of these algorithms to adapt to changing conditions and also the delays that packets incur with an algorithm that is only good on average. As we explain, the theory provides some answers to these questions in the form of upper bounds on the average delays.

Processing networks are models of communication, manufacturing, or service networks. For instance, a processing network can model a multicast network, a car assembly plant, or a hospital. In a processing network, tasks use parts and resources to produce new parts that may be used by other tasks. In a car assembly plant, a rim and a tire are assembled into a wheel; four wheels and a chassis are put together, and so on. The tasks may share workers and machine tools or robots. In a hospital, a doctor and nurses examine a patient that may then be dispatched to a surgical theater where other nurses and doctors are engaged in the surgery, and so on.

The scheduling problem in a processing network is to decide which tasks should be performed at any one time. The goal may be to maximize the rate of production of some parts, such as completed cars, minus the cost of producing these parts. Such a problem is again typically NP-hard since it is more general than the allocation of radio channels in a wireless network. We explain scheduling algorithms with provable optimality properties.

The book is organized as follows. Chapter 2 provides an illustration of the main results on simple examples. Chapter 3 explains the scheduling in wireless networks. Chapter 4 studies the combined admission control, routing, and scheduling problem for network utility maximization. Chapter 5 studies collisions in wireless networks. Chapter 6 is devoted to processing networks. Appendix A explains the main ideas of the stochastic approximations that we use.


    C H A P T E R 2

    Overview

This chapter explains the main ideas of this book on a few simple examples. In Section 2.1, we consider the scheduling of three wireless nodes and review the maximum weighted matching (MWM) and the A-CSMA scheduling. Section 2.2 explains how to combine admission control with scheduling. Section 2.3 discusses the randomized backpressure algorithms. Section 2.4 reviews the Lagrangian method to solve convex optimization problems. We conclude the chapter with a summary of the main observations.

    2.1 A SMALL WIRELESS NETWORK

Consider the network shown on the left side of Figure 2.1. There are three wireless links numbered 1, 2, and 3, where each link is a pair of radio transmitter and receiver.

Figure 2.1: A network with three links.

Packets arrive at the links (or more specifically, the transmitters of the links) with the rates indicated in the right figure. A simple situation is one where at each time t = 0, 1, 2, . . . a random number of packets with mean λ_i and a finite variance arrive at link i, independently of the other times and of the arrivals at other links and with the same distribution at each time. Thus, the arrivals are i.i.d. (independent and identically distributed) at each link, and they are independent across links. Say that the packet transmissions take exactly one time unit.

The links 1 and 2 conflict: if their transmitters transmit together, the signals interfere and the receivers cannot recover the packets. The situation is the same for links 2 and 3. Links 1 and 3, however, are far enough apart not to interfere with one another. If they both transmit at the same time, their receivers can get the packets correctly. The figure on the right side has omitted the receivers (such that the three circles there correspond to the three links in the left figure), and it represents the above conflict relationships by a solid line between links 1 and 2 and another between


links 2 and 3. (In Sections 2.1 and 2.2, since only one-hop flows are considered, we omit the receivers and use the terms node and link interchangeably.) Thus, at any given time, if all the nodes have packets to transmit, the sets of nodes that can transmit together without conflicting are ∅, {1}, {2}, {3}, and {1, 3}, where ∅ designates the empty set. These sets are called the independent sets of the network. An independent set is said to be maximal if one cannot add another node to it and get another independent set. Thus, {2} and {1, 3} are the maximal independent sets.

    2.1.1 FEASIBLE RATES

The scheduling problem is to find which independent set should transmit at any given time to keep up with the arriving packets. The first question is whether this is feasible at all. The answer depends on how large the arrival rates are and is given in Theorem 2.3 below. However, before stating the theorem, we should review the following notions about Markov chains.

Definition 2.1 Irreducibility and Positive Recurrence.
Consider a discrete time Markov chain {X(n), n = 0, 1, . . .} with a countable state space (i.e., a finite or countably infinite number of states). The Markov chain is irreducible if it can go from every state to any other state (not necessarily in one step). An irreducible Markov chain is positive recurrent if it spends a positive fraction of time in every state.

    The following result is well known (see, for example, (2)).

Theorem 2.2 [Lyapunov Function and Positive Recurrence] Consider a discrete time Markov chain {X(n), n = 0, 1, . . .} with a countable state space.

(a) If the Markov chain is irreducible, then either it is positive recurrent or it spends a zero fraction of time in every state.

(b) If the Markov chain is irreducible and such that there is a nonnegative function V(X(n)) such that

E[V(X(n + 1)) − V(X(n)) | X(n)] ≤ −ε + β·1{X(n) ∈ A}   (2.1)

for some ε > 0, β < ∞, and some finite set A, then the Markov chain is positive recurrent. In that case, we say that V is a Lyapunov function for the Markov chain.

Condition (2.1) means that, outside of a finite set A of states, the function V tends to decrease. Since the function is nonnegative, it cannot decrease all the time. Consequently, X(n) must spend a positive fraction of time inside A. By (a), this implies that the Markov chain is positive recurrent since A is finite.

    Theorem 2.3 Feasibility and Positive Recurrence.

For simplicity, consider a time-slotted system. In a time slot, a node can either serve one packet or not serve any packet. (Also, recall that it cannot serve a packet if any of its conflicting nodes serves a packet in the slot.) In time slot n = 0, 1, 2, . . ., A_i(n) packets arrive at queue i (where A_i(n) is an integer). Assume that the A_i(n), ∀i, n, are independent of each other. Also, assume that E(A_i(n)) = λ_i, ∀n (where λ_i is the arrival rate to queue i), and E(A_i(n)²) ≤ C < ∞, ∀i, n.

(a) There is a schedule such that the queue lengths do not grow to infinity only if

λ_1 + λ_2 ≤ 1 and λ_2 + λ_3 ≤ 1.   (2.2)

(b) Moreover, if

λ_1 + λ_2 < 1 and λ_2 + λ_3 < 1,   (2.3)

then there is a schedule such that X(n) is positive recurrent, where X(n) = (X_1(n), X_2(n), X_3(n)) denotes the vector of queue lengths at time n.

We say that the arrival rates are feasible if they satisfy (2.2) and that they are strictly feasible if they satisfy (2.3).

Proof: (a) Assume that λ_1 + λ_2 > 1. At any given time, at most one of the two nodes 1 and 2 can transmit. Consequently, the rate at which transmissions remove packets from the two nodes {1, 2} is at most 1. Thus, packets arrive faster at the nodes {1, 2} than they leave. Consequently, the number of packets in these nodes must grow without bound.

To be a bit more precise, let Q_n be the total number of packets in the nodes {1, 2} at time n ∈ {0, 1, 2, . . .}. Note that

Q_n ≥ A_n − n

where A_n is the number of arrivals in the nodes {1, 2} up to time n. Indeed, at most n packets have left between time 0 and time n − 1. Also, by the strong law of large numbers, A_n/n → λ_1 + λ_2 almost surely as n → ∞. Thus, dividing the above inequality by n, we find that

lim inf (1/n) Q_n ≥ λ_1 + λ_2 − 1 > 0.

This implies that Q_n → ∞ almost surely as n → ∞. Thus, no schedule can prevent the backlog in the network from growing without bound if λ_1 + λ_2 > 1, and similarly if λ_2 + λ_3 > 1.

(b) Assume that (2.3) holds. Then there is some p ∈ [0, 1] such that

λ_2 < 1 − p; λ_1 < p; λ_3 < p.   (2.4)

We claim that a schedule that at each step chooses to serve nodes 1 and 3 with probability p and node 2 with probability 1 − p, independently from step to step, makes the queue lengths positive recurrent. To see this, we show that when this schedule is used the function

V(X(n)) = (1/2)[X_1(n)² + X_2(n)² + X_3(n)²]


is a Lyapunov function. To check the property (2.1), recall that A_i(n) is the number of arrivals at queue i at time n. Let also S_i(n) take the value 1 if queue i is served at time n, and the value 0 otherwise. Then X_i(n + 1) = X_i(n) − Z_i(n) + A_i(n) where Z_i(n) := S_i(n)1{X_i(n) > 0}. Note that X_i(n) is a non-negative integer (since both A_i(n) and S_i(n) are integers). Therefore,

X_i(n + 1)² − X_i(n)²
= A_i(n)² + Z_i(n)² + 2X_i(n)A_i(n) − 2X_i(n)Z_i(n) − 2A_i(n)Z_i(n)
= A_i(n)² + Z_i(n)² + 2X_i(n)A_i(n) − 2X_i(n)S_i(n) − 2A_i(n)Z_i(n).   (2.5)

Hence,

(1/2)E[X_i(n + 1)² − X_i(n)² | X(n)] ≤ γ_i + (λ_i − p_i)X_i(n),

where γ_i = E(A_i(n)² + S_i(n)²)/2 and p_i = E[S_i(n) | X(n)], so that p_1 = p_3 = p and p_2 = 1 − p. Consequently, summing these inequalities for i = 1, 2, 3, one finds that

E[V(X(n + 1)) − V(X(n)) | X(n)] ≤ γ + ∑_{i=1}^{3} (λ_i − p_i)X_i(n)

with γ = γ_1 + γ_2 + γ_3. Now, λ_i − p_i < −δ < 0 for i = 1, 2, 3, for some δ > 0, because the arrival rates are strictly feasible. This expression is less than −ε if γ − δ(X_1(n) + X_2(n) + X_3(n)) < −ε, which occurs if

X_1(n) + X_2(n) + X_3(n) > (γ + ε)/δ,

and this is the case when X(n) is outside the finite set defined by the opposite inequality. Therefore, X(n) is positive recurrent by Theorem 2.2 (b). □

The theorem does not clarify what happens when λ_1 + λ_2 = 1 or λ_2 + λ_3 = 1. The answer is a bit tricky. To understand the situation, assume λ_1 = 1, λ_2 = λ_3 = 0. In this case, one may serve node 1 all the time. Does queue 1 grow without bound? Not if the arrivals are deterministic: if exactly one packet arrives at each time at node 1, then the queue does not grow. However, if the arrivals are random with mean 1, then the queue is not bounded. For instance, if two packets arrive with probability 0.5 and no packet arrives with probability 0.5, then the queue length is not positive recurrent. This means that the queue spends a zero fraction of time below any fixed level and its mean value goes to infinity.
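The randomized schedule from the proof of part (b) is easy to simulate. In the sketch below (our own illustration; the arrival distribution, two packets with probability λ_i/2, is chosen to match the boundary example just discussed), strictly feasible rates keep the queues small, while the boundary case λ_1 = 1 makes queue 1 drift upward:

```python
import random

def simulate(lams, p, steps=200_000, seed=1):
    # Serve {1, 3} w.p. p and {2} w.p. 1 - p; A_i(n) = 2 w.p. lams[i]/2.
    random.seed(seed)
    x = [0, 0, 0]
    for _ in range(steps):
        served = [1, 0, 1] if random.random() < p else [0, 1, 0]
        for i in range(3):
            a = 2 if random.random() < lams[i] / 2 else 0
            x[i] = max(x[i] - served[i], 0) + a
    return x

# Strictly feasible rates (0.4 + 0.4 < 1) with p = 0.5: queues stay small.
print(simulate([0.4, 0.4, 0.4], p=0.5))
# Boundary case lambda_1 = 1: queue 1 grows, on the order of sqrt(steps).
print(simulate([1.0, 0.0, 0.0], p=1.0))
```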

2.1.2 MAXIMUM WEIGHTED MATCHING

We explained that when the arrival rates are strictly feasible, there is a schedule that makes the queues positive recurrent. However, the randomized schedule we proposed required knowing the arrival rates. We describe an algorithm that does not require that information. This algorithm is called Maximum Weighted Matching (MWM).


    Definition 2.4 Maximum Weighted Matching.

The MWM algorithm serves queue 2 if the backlog of that queue is larger than the sum of the backlogs of queues 1 and 3; otherwise, it serves queues 1 and 3. That is, the algorithm serves the independent set with the largest sum of backlogs.
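On the three-link network, the MWM rule reduces to a single comparison, as in this sketch:

```python
def mwm_schedule(x):
    # x = [X1, X2, X3]; serve {2} iff X2 exceeds X1 + X3.
    return {2} if x[1] > x[0] + x[2] else {1, 3}

print(mwm_schedule([4, 9, 3]))  # {2},    since 9 > 4 + 3
print(mwm_schedule([5, 7, 3]))  # {1, 3}, since 5 + 3 >= 7
```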

    The following result gives the property of MWM.

Theorem 2.5 MWM and Positive Recurrence.
Assume that the arrival rates are strictly feasible and that the arrivals have a finite variance. Then MWM makes the queues positive recurrent.

Proof: This result is due to Tassiulas and Ephremides (66). Let X_i(n) be the queue length in node i at time n (i = 1, 2, 3; n = 0, 1, 2, . . .). Let also X(n) = (X_1(n), X_2(n), X_3(n)) be the vector of queue lengths. Define

V(X(n)) = (1/2)[X_1(n)² + X_2(n)² + X_3(n)²]

as half the sum of the squares of the queue lengths.

The claim is that, under MWM, V(X(n)) is a Lyapunov function for the Markov chain X(n). Proceeding as in the proof of the previous theorem and with the same notation, one finds (2.5). Taking the expectation given X(n) and noting that S_i(n) is now a function of X(n) determined by the MWM algorithm, one finds

E[V(X(n + 1)) − V(X(n)) | X(n)] ≤ γ + ∑_{i=1}^{3} (λ_i − S_i(n))X_i(n).

To prove (2.1), it now suffices to show that this expression is less than −ε for X(n) outside a finite set.

To do this, note that MWM chooses the value of {S_i(n), i = 1, 2, 3} that maximizes ∑_i S_i(n)X_i(n). The maximum value must then be larger than pX_1(n) + (1 − p)X_2(n) + pX_3(n), where p is the probability defined before such that (2.4) holds. Indeed, the maximum is either X_1(n) + X_3(n) or X_2(n), and this maximum is larger than any convex combination of these two values. Hence,

E[V(X(n + 1)) − V(X(n)) | X(n)] ≤ γ + (λ_1 − p)X_1(n) + (λ_2 − (1 − p))X_2(n) + (λ_3 − p)X_3(n).

In the proof of the previous theorem, we showed that the right-hand side is less than −ε when X(n) is outside of a finite set. □

You will note that the crux of the argument is that MWM makes the sum of the squares of the queue lengths decrease faster than the randomized schedule does, and that such a randomized schedule exists that makes that sum decrease fast enough when the arrival rates are strictly feasible.


    2.1.3 CSMA

Although the MWM algorithm makes the queues positive recurrent when the arrival rates are strictly feasible, this algorithm is not implementable in a large network for two reasons. First, to decide whether it can transmit, a node must know if it belongs to the independent set with the maximum weight. To determine that independent set, some node must know the queue lengths. Getting that information requires a substantial amount of control messages. Second, identifying the maximum weight independent set, even when knowing all the queue lengths, is a computationally hard problem. Indeed, the number of independent sets in a large network is enormous, and comparing their weights requires an excessive number of computations.

In this section, we describe a different approach based on a Carrier Sense Multiple Access (CSMA) protocol. When using this protocol, node i waits a random amount of time that is exponentially distributed with rate R_i, i.e., with mean 1/R_i. The waiting times of the different nodes are independent. At the end of its waiting time, a node listens to the radio channel. If it hears some transmission, then it calculates a new waiting time and repeats the procedure. Otherwise, it transmits a packet. For simplicity, we assume for now that the carrier sensing is perfect. That is, if one node i starts transmitting at time t and a conflicting node j listens to the channel at time t + ε, then we assume that node j hears the transmission of node i, for any arbitrarily small ε > 0. Therefore, there is no collision because, under the above assumption, a collision can only occur when two conflicting nodes start transmitting at exactly the same time, which has probability 0 with the exponentially distributed backoff times. In practice, this assumption is not valid: it takes some time for a node to sense the transmission of another node. Moreover, we assume that there is no hidden node. This means that if node i does not hear any conflicting transmission and starts sending a packet to its intended receiver k, then there is no other node j that is transmitting and can be heard by k and not by i. This is another approximation. In Chapter 5, we explain how to analyze the network with collisions.

It turns out that this protocol is easier to analyze in continuous time than in discrete time. For ease of analysis, we also assume that the packet transmission times are all independent and exponentially distributed with mean 1. The arrival processes at the three nodes are independent with rates λ_i.

Let us pretend that the nodes always have packets to transmit and that, when they run out, they construct a dummy packet whose transmission time is distributed as that of a real packet. With these assumptions, the set S_t of nodes that transmit at time t is modeled by a continuous-time Markov chain that has the state transition diagram shown in Figure 2.2.

For instance, a transition from ∅ to {1} occurs when node 1 starts to transmit, which happens with rate R_1 when the waiting time of that node expires. Similarly, a transition from {1} to ∅ occurs when the transmission of node 1 terminates, which happens with rate 1. Note that a transition from {2} to {1, 2} cannot happen because node 1 senses that node 2 is already transmitting when its waiting time expires. The other transitions can be explained in a similar way. We call this Markov chain the CSMA Markov chain because it models the behavior of the CSMA protocol.

Figure 2.2: The CSMA Markov chain.

One has the following theorem.

Theorem 2.6 Invariant Distribution of CSMA Markov Chain.
The CSMA Markov chain is time-reversible, and its invariant distribution is given by

π(∅) = K and π(S) = K ∏_{i∈S} R_i for S ∈ {{1}, {2}, {3}, {1, 3}}   (2.6)

where K is such that the probabilities of the independent sets add up to one.

Proof: Recall that a continuous-time Markov chain with rate matrix Q has invariant distribution π and is time-reversible if and only if

π(i)q(i, j) = π(j)q(j, i), ∀i, j.

A stochastic process is time-reversible if it has the same statistical properties when reversed in time. The conditions above, called detailed balance equations, mean that when the Markov chain is stationary, the rate of transitions from i to j is the same as the rate of transitions from j to i. If that were not the case, one could distinguish between forward time and reverse time, and the Markov chain would not be time-reversible. Note also that by summing these identities over i, one finds that

∑_i π(i)q(i, j) = π(j) ∑_i q(j, i) = 0

where the last equality follows from the fact that the rows of a rate matrix sum to zero. Thus, πQ = 0, and π is therefore the stationary distribution of the Markov chain. See (38) for a discussion of this method and its applications.

For the CSMA Markov chain, it is immediate to verify the detailed balance equations. For instance, let i = {1} and j = {1, 3}. Then q(i, j) = R_3 and q(j, i) = 1, so that one has

π(i)q(i, j) = (KR_1) · R_3 and π(j)q(j, i) = (KR_1R_3) · 1,

which are indeed equal. The other detailed balance equations are verified in the same way. □
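The product form (2.6) is also easy to verify numerically. The sketch below (with arbitrary illustrative backoff rates R_i) builds π from (2.6) and checks the balance equations πQ = 0 over the five states of Figure 2.2:

```python
# Check of (2.6): build pi from the product form and verify the balance
# equations pi Q = 0 for the CSMA Markov chain of Figure 2.2.
R = {1: 2.0, 2: 5.0, 3: 1.5}          # illustrative backoff rates
states = [frozenset(), frozenset({1}), frozenset({2}),
          frozenset({3}), frozenset({1, 3})]

def weight(s):
    w = 1.0
    for i in s:
        w *= R[i]
    return w

K = 1.0 / sum(weight(s) for s in states)
pi = {s: K * weight(s) for s in states}

def q(s, t):
    # A node i joins the transmitting set at rate R_i and leaves at rate 1.
    if len(t) == len(s) + 1 and s < t:
        (i,) = t - s
        return R[i]
    if len(t) == len(s) - 1 and t < s:
        return 1.0
    return 0.0

for t in states:  # inflow minus outflow should vanish in every state
    net = sum(pi[s] * q(s, t) for s in states if s != t) \
          - pi[t] * sum(q(t, s) for s in states if s != t)
    print(sorted(t) or '{}', round(net, 12))   # ~0 everywhere
```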


2.1.4 ENTROPY MAXIMIZATION

The idea is to look for CSMA parameters whose invariant distribution serves every link fast enough. Writing s_j(π) := ∑_{S: j∈S} π(S) for the service rate of link j under a distribution π over the independent sets, problem (2.7) asks for the distribution with maximal entropy among those that keep up with the arrivals:

Maximize H(π) := −∑_S π(S) log π(S)
s.t. λ_j ≤ s_j(π), ∀j; ∑_S π(S) = 1.   (2.7)

To solve the problem (2.7), we associate a Lagrange multiplier with each inequality constraint and with the equality constraint. (See Section 2.4 for a review of that method.) That is, we form the Lagrangian

L(π, r) = −∑_S π(S) log π(S) − ∑_j r_j (λ_j − ∑_{S: j∈S} π(S)) − r_0 (1 − ∑_S π(S)).   (2.8)

We know that if the rates are strictly feasible, then there is a distribution π that satisfies the constraints of the problem (2.7). Consequently, to solve (2.7), we can proceed as follows. We find π* that maximizes L(π, r) while r ≥ 0 minimizes that function. More precisely, we look for a saddle point (π*, r*) of L(π, r), such that π* maximizes L(π, r*) over π, and r* minimizes L(π*, r) over r ≥ 0. Then π* is an optimal solution of (2.7). (In Section 2.4, we give an example to illustrate this generic method for solving constrained convex optimization problems.)

To maximize L(π, r) over π, we express that the partial derivative of L with respect to π(S_0) is equal to zero, for every independent set S_0. From (2.8), we find

∂L(π, r)/∂π(S_0) = −1 − log(π(S_0)) + ∑_{j∈S_0} r_j + r_0 = 0.

This implies that

π(S) = C exp{∑_{j∈S} r_j},   (2.9)

where C is a constant such that ∑_S π(S) = 1. This distribution corresponds to a CSMA algorithm with parameters R_j = exp{r_j}. We conclude that there must exist some parameters R_j such that the CSMA protocol serves the links fast enough.

Next, we look for the parameters r. To do this, we use a gradient algorithm to minimize L(π, r). Note that the derivative of L(π, r) with respect to r_j is given as follows:

∂L(π, r)/∂r_j = −(λ_j − ∑_{S: j∈S} π(S)) = −(λ_j − s_j(π)).

Consequently, to minimize L(π, r), the gradient algorithm should update r_j in the direction opposite to this derivative, according to some small step size. Also, we know that r_j ≥ 0, so the gradient algorithm should project the update onto [0, ∞). That is, the gradient algorithm updates r_j as follows:

r_j(n + 1) = {r_j(n) + α(n)[λ_j − s_j(π(n))]}⁺.   (2.10)

Here, for any real number x, one defines {x}⁺ := max{x, 0}, which is the value in [0, ∞) that is the closest to x. Also, n corresponds to the n-th step of the algorithm. At that step, the parameters r(n) are used, and they correspond to the invariant distribution π(n) given by (2.9) with those parameters. In this expression, α(n) is the step size of the algorithm.


This update rule has the remarkable property that link j should update its parameter r_j (which corresponds to R_j = exp{r_j} in the CSMA protocol) based only on the difference between the average arrival and service rates at that link. Thus, the update does not depend explicitly on what the other links are doing. The service rate at link j certainly depends in a complicated way on the parameters of the other links. However, the average arrival and service rates at link j are the only information that link j requires to update its parameter.

Note also that if the average arrival rate λ_j at link j is larger than the average service rate s_j of that link, then that link should increase its parameter r_j, thus becoming more aggressive in attempting to transmit, and conversely.

Unfortunately, the link observes actual arrivals and transmissions, not their average rates. In other words, link j observes a noisy version of the gradient λ_j − s_j(π(n)) that it needs to adjust its parameter r_j. This noise in the estimate of the gradient is handled by choosing the step sizes α(n) and also the update intervals carefully.

Let us ignore this difficulty for now to have a rough sense of how the link should update its parameter. That is, let us pretend that link j updates its parameter r_j every second, that the total number of arrivals A_j(n) at the link during that second is exactly λ_j, and that the total number of transmissions D_j(n) by the link is exactly equal to the average value s_j. To simplify the situation even further, let us choose a fixed step size in the algorithm, so that α(n) = ε. With these assumptions, the gradient algorithm (2.10) is

r_j(n + 1) = {r_j(n) + ε[A_j(n) − D_j(n)]}⁺.

Now observe that the queue length X_j(n) at time n satisfies a very similar relation. Indeed, one has

X_j(n + 1) ≈ {X_j(n) + A_j(n) − D_j(n)}⁺.

Comparing these update equations, we find that

r_j(n) ≈ εX_j(n), ∀n ≥ 1,

if we choose r_j(0) = εX_j(0). Thus, with this algorithm, we find that the parameters of the CSMA protocol should be

R_j = exp{εX_j}, j = 1, 2, 3.   (2.11)

In other words, node j should select a waiting time that is exponentially distributed with rate exp{εX_j}. This algorithm is fully distributed and is very easy to implement.

However, the actual algorithms we will propose, although still simple and distributed, are a little different from (2.11). This is because we derived the algorithm (2.11) by making a key approximation, namely that the arrivals and transmissions follow exactly their average rates. To be correct, we have to adjust r slowly enough so that the CSMA Markov chain approaches its invariant distribution before r changes significantly. There are at least two ways to achieve this.


One way is to modify algorithm (2.11) by using a small enough constant step size α, a large enough constant update interval T, and imposing an upper bound on r so that the mixing time of the CSMA Markov chain is bounded. Specifically, node i uses a waiting time that is exponentially distributed with rate R_i. Every T seconds, all the R_i's are updated as follows:

R_i(n) = exp{min{(α/T)X_i(n), r_max + δ}}   (2.12)

where r_max, δ > 0 are constants, and the X_i(n)'s are the queue lengths at the time of the n-th update. We explain in Section 3.4.3 that this algorithm is almost throughput-optimal. (That is, it can stabilize the queues if the vector of arrival rates is in a region parametrized by r_max. The region is slightly smaller than the maximal region.)

Another way is to use diminishing step sizes and increasing update intervals so that eventually the arrival rates and service rates get close to their average values between two updates. This is a time-varying algorithm since the step sizes and update intervals change with time. Detailed discussions are provided in Sections 3.4.1 and 3.4.2.

Recapping, the main point of this discussion is that solving the problem (2.7) shows that there are parameters R_j of the CSMA protocol that serve the links fast enough. Moreover, these parameters are roughly exponential in the queue lengths. Finally, with a suitable choice of the step sizes and of the update intervals, one can make the algorithm support the arrival rates.
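The first option, the time-invariant update (2.12), amounts to a clipped exponential of the queue length. A minimal sketch follows; all the constants are placeholders rather than values from the text:

```python
import math

alpha, T, r_max, delta = 0.1, 10.0, 4.0, 0.5   # placeholder constants

def csma_rate(x_i):
    # R_i(n) = exp(min((alpha / T) X_i(n), r_max + delta)): the node's
    # aggressiveness grows with its queue, but the exponent is capped.
    return math.exp(min((alpha / T) * x_i, r_max + delta))

for x in [0, 50, 200, 1000]:
    print(x, round(csma_rate(x), 2))   # the rate saturates for large queues
```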

    2.1.5 DISCUSSION

Before moving on to the next topic, it may be useful to comment on the key ideas of the current section.

The first point that we want to address is the two different justifications we gave for why R_j = exp{εX_j} are suitable parameters. Recall that the first justification was that, if the queue lengths do not change much while the Markov chain approaches its stationary distribution, then choosing these values leads to a product form π(S) = C exp{ε ∑_{j∈S} X_j} that favors independent sets with a large weight. Thus, in some sense, this choice is an approximation of MWM, which we know is stable. One flaw in this argument is that the approximation of MWM is better if ε is large. However, in that case, the parameters R_j change very fast as the queue lengths change. This is not consistent with the assumption that the queue lengths do not change much while the Markov chain approaches its stationary distribution. The second justification is that it corresponds to a gradient algorithm with a fixed step size. For this algorithm to be good, the step size has to be fairly small. However, in that case, we know that the algorithm takes a long time to converge. Thus, we find the usual tradeoff between speed of convergence and accuracy of the limit.

The second point is related to the first and concerns the convergence time of the algorithm. The number of states in the Markov chain is the number of independent sets. This number grows exponentially with the number of links. Thus, one should expect the algorithm to converge slowly and to result in very poor performance in any practical system. In practice, this is not the case. In fact, the algorithm appears to perform well. The reason may have to do with the locality of the conflicts,

so that good decisions may depend mostly on a local neighborhood and not on a very large number of links.

    2.2 ADMISSION CONTROL

In the previous sections, we assumed that the arrival rates λ_i are given and that the scheduling algorithm tries to keep up with these arrivals. In this section, we consider the case where the network can control the arrivals by exercising some admission control. The goal is to admit packets and schedule the transmissions to maximize the sum of the utilities of the flows of packets. That is, we assume that the packets that arrive at rate λ_i are for some user whose utility is u_i(λ_i), where u_i(·) is a positive, increasing, and concave function. This function expresses the diminishing return for a higher rate. Thus, if λ_i increases by δ, the user perceives a smaller improvement when λ_i is large than when it is small.

The problem is then

Maximize ∑_i u_i(λ_i)
s.t. λ is feasible.   (2.13)

The approach is to combine the A-CSMA protocol as before with admission control. As we explain below, the network controls the arrivals as follows. When the backlog of node i is X_i, the arrival rate λ_i is chosen to maximize u_i(λ_i) − β'X_iλ_i, where β' is some positive constant. Note that the choice of λ_i depends only on the backlog in link i, so that the algorithm is local.

Thus, the arrival rates decrease when the backlogs in the nodes increase. This is a form of congestion control. Since the mechanism maximizes the sum of the user utilities, it implements a fair congestion control combined with the appropriate scheduling. In the networking terminology, one would say that this mechanism combines the transport and the MAC layers. This mechanism is illustrated in Figure 2.3.

Figure 2.3: Combined admission control and scheduling: each λ_i maximizes u_i(λ_i) − β'X_iλ_i, and node i uses a CSMA delay with mean exp{−εX_i}. Note that the node decisions are based on local information.


The main idea used to derive this combined admission control and scheduling algorithm is to replace the problem (2.13) by the following one:

Maximize H(π) + β ∑_i u_i(λ_i)
s.t. s_j(π) ≥ λ_j, ∀j.   (2.14)

In this problem, β is some positive constant. If this constant is large, then a solution of (2.14) approximates the solution of (2.13). Indeed, H(π) is bounded and has a negligible effect on the objective function of (2.14) when β is large.

The Lagrangian for problem (2.14) (see Section 2.4) is

L(π, λ, r) = H(π) + β ∑_i u_i(λ_i) + ∑_j r_j [s_j(π) − λ_j] − r_0 [1 − ∑_S π(S)].

As before, the maximization over π results in the CSMA protocol with rates R_j = exp{r_j}. Also, the minimization over r using a gradient algorithm is as before, which yields r_j ≈ εX_j. The maximization over λ amounts to choosing each λ_j to solve the following problem:

Maximize βu_j(λ_j) − λ_j r_j.

Since r_j ≈ εX_j, this problem is essentially the following:

Maximize u_j(λ_j) − λ_j X_j εβ⁻¹ = u_j(λ_j) − β'λ_j X_j

with β' = εβ⁻¹. This analysis justifies the admission control algorithm we described earlier.
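For a concrete utility function, the admission control step has a closed form. With u_j(λ) = log(1 + λ) (our illustrative choice; the text only requires u_j to be positive, increasing, and concave), maximizing u_j(λ) − β'X_jλ over an interval of allowed rates amounts to clamping the stationary point:

```python
def admitted_rate(backlog, beta=0.002, lam_max=1.0):
    # argmax over [0, lam_max] of log(1 + lam) - beta * backlog * lam;
    # beta plays the role of beta' and lam_max caps the admitted rate.
    if backlog == 0:
        return lam_max                    # no congestion: admit at full rate
    lam = 1.0 / (beta * backlog) - 1.0    # stationary point of the objective
    return min(max(lam, 0.0), lam_max)    # clamp (the objective is concave)

for x in [0, 100, 400, 1000]:
    print(x, round(admitted_rate(x), 3))  # the admitted rate drops with backlog
```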

    2.3 RANDOMIZED BACKPRESSURE

So far, the nodes had to transmit each packet once. We now consider the case of multi-hop networks.

In the network shown in Figure 2.4, the circles represent nodes. There are two flows of packets (flow 1, indicated by solid lines, and flow 2, indicated by dashed lines). Packets of flow 1 arrive at the top node and must be transmitted down and eventually to the bottom right node. Packets of flow 2 arrive at the left node and must make their way to the right middle node. The possible paths of the flows are shown in the figure. Note that the packets of flow 2 can follow two possible paths. That is, some nodes make a routing decision. Each node maintains one queue per flow that goes through it. For instance, node i has one queue for flow 1 and one for flow 2. In the figure, the backlog of packets of flow 1 is indicated by an integer that is underlined whereas that of flow 2 is not underlined. Thus, the backlog of packets of flow 1 in node i is 8 and that of flow 2 is 3.

Define a link as an ordered pair of two nodes (the transmitter and the receiver). In Fig. 2.4, a, b, . . . , h are all links. Denote by t(·) and w(·) the transmitter and receiver of a link, respectively. The links are subject to conflicts that are not indicated explicitly in the figure. In particular, any two links may or may not transmit at the same time. (Note that the conflicts here are among


    Figure 2.4: A multi-hop network.

links instead of nodes.) One obvious conflict is that a link can send only one packet at a time, so that it must choose whether to send a packet of flow 1 or one of flow 2 when it has the choice. We assume also that the transmissions have different success probabilities and possibly different physical-layer transmission rates. For instance, when link e transmits packets, these packets reach the next node with average rate r(e).

The goal is to design the admission control, the routing, and the scheduling to maximize the utility

u_1(λ_1) + u_2(λ_2)

of the two flows of packets.

We explain in Chapter 4 that the following algorithm, again called A-CSMA combined with admission control and routing, essentially solves the problem:

1) Queuing: Each node maintains one queue per flow of packets that goes through it.

2) Admission Control: λ_1 is selected to maximize u_1(λ_1) − β'X_1λ_1, where β' is some constant and X_1 is the backlog of packets of flow 1 in the ingress node for these packets. Similarly, λ_2 maximizes u_2(λ_2) − β'X_2λ_2, where X_2 is the backlog of packets of flow 2 in the ingress node for these packets.

3) Priority: Each link selects which packet to send as follows. Link d chooses to serve flow 1 since (8 − 4)r(d) > (3 − 5)r(d) and (8 − 4)r(d) > 0. Here, (8 − 4) is the difference between the backlogs


Observe that the term $s_{j,f}$ appears in $L$ at most twice: once in the total departure rate of flow $f$ from node $t(j)$, and once in the total arrival rate of flow $f$ to node $w(j)$, if $w(j) \neq f$ (we identify each flow $f$ with its destination node). Accordingly, $s_{j,f}$ appears in $L$ with the factor $b(j,f) := r_{t(j),f} - r_{w(j),f} \approx X_{t(j),f} - X_{w(j),f}$, with the convention that $r_{f,f} = 0$. Denote the maximal backpressure on link $j$ by $B(j) := \max_f b(j,f)$. Then, subject to the constraint $\sum_f s_{j,f} \le s_j(\pi)$ (where $s_j(\pi)$ is fixed for the moment), the Lagrangian is maximized by choosing $s_{j,f} = s_j(\pi)$ for an $f$ satisfying $b(j,f) = B(j)$ (i.e., choosing a flow with the maximal backpressure) if $B(j) > 0$, and choosing $s_{j,f} = 0, \forall f$, if $B(j) \le 0$. Plugging the solution $\{s_{j,f}\}$ back into $L$, we get

$$L = H(\pi) + \sum_f \big[\beta u_f(\lambda_f) - r_{o_f,f}\lambda_f\big] + \sum_j [B(j)]^+ s_j(\pi) - r_0\Big[1 - \sum_S \pi(S)\Big],$$

where $o_f$ denotes the ingress node of flow $f$. Then, we maximize $L$ over $\pi$. Similar to the last section, this gives the CSMA algorithm with $R_j = \exp\{[B(j)]^+\}$. Finally, the maximization of $L$ over $\lambda_f$ yields the same admission control algorithm as before. By now, we have derived all components of the algorithm described earlier in this section.
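The per-link decision just derived is easy to state in code. Below is a minimal Python sketch (ours; the data structures are illustrative, and for simplicity it folds the physical-layer rate factor $r(j)$ of the earlier example into the backlogs): the link computes the backpressure of every flow it carries, serves a flow with maximal positive backpressure, and sets its CSMA rate to $\exp\{[B(j)]^+\}$.

    import math

    def backpressure_choice(per_flow_backlogs):
        # per_flow_backlogs: dict flow -> (backlog at transmitter t(j),
        #                                  backlog at receiver w(j)).
        # Returns (flow to serve or None, B(j), CSMA rate exp([B(j)]^+)).
        best_flow, B = None, 0.0
        for flow, (x_t, x_w) in per_flow_backlogs.items():
            b = x_t - x_w              # backpressure b(j, f)
            if b > B:
                best_flow, B = flow, b
        return best_flow, B, math.exp(max(B, 0.0))

    # Link d of Figure 2.4: flow 1 backlogs (8, 4), flow 2 backlogs (3, 5).
    print(backpressure_choice({1: (8, 4), 2: (3, 5)}))   # serves flow 1, B = 4

When no flow has positive backpressure, the link returns None and transmits nothing, which matches the choice $s_{j,f} = 0$ when $B(j) \le 0$.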

    2.4 APPENDIX

In this section, we illustrate an important method to solve a constrained convex optimization problem by finding the saddle point of the Lagrangian. Consider the following problem:

$$\max_x\; -x_1^2 - x_2^2 \quad \text{s.t. } x_1 + x_2 \ge 4, \; x_1 \le 6, \; x_2 \le 5. \qquad (2.15)$$

With dual variables $\mu \ge 0$, form the Lagrangian

$$L(x; \mu) = -x_1^2 - x_2^2 + \mu_1(x_1 + x_2 - 4) + \mu_2(6 - x_1) + \mu_3(5 - x_2).$$

We aim to find the saddle point $(x^*, \mu^*)$ such that $x^*$ maximizes $L(x; \mu^*)$ over $x$, and $\mu^*$ minimizes $L(x^*; \mu)$ over $\mu \ge 0$. One can verify that $x^* = (2, 2)^T$ and $\mu^* = (4, 0, 0)^T$ satisfy the requirement. Indeed, we have

$$\partial L(x; \mu)/\partial x_1 = -2x_1 + \mu_1 - \mu_2$$
$$\partial L(x; \mu)/\partial x_2 = -2x_2 + \mu_1 - \mu_3$$
$$\partial L(x; \mu)/\partial \mu_1 = x_1 + x_2 - 4$$
$$\partial L(x; \mu)/\partial \mu_2 = 6 - x_1$$
$$\partial L(x; \mu)/\partial \mu_3 = 5 - x_2.$$

So, given $\mu^*$, $\partial L(x^*; \mu^*)/\partial x_1 = 0$ and $\partial L(x^*; \mu^*)/\partial x_2 = 0$. Given $x^*$, $\partial L(x^*; \mu^*)/\partial \mu_1 = 0$ with $\mu_1^* > 0$; $\partial L(x^*; \mu^*)/\partial \mu_2 > 0$ with $\mu_2^* = 0$; and $\partial L(x^*; \mu^*)/\partial \mu_3 > 0$ with $\mu_3^* = 0$. It is also easy to verify that $x^* = (2, 2)^T$ is indeed the optimal solution of (2.15). For an in-depth explanation of this Lagrangian method, please refer to (8).
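The saddle point can also be located numerically. The Python sketch below (ours; the step size and iteration count are arbitrary choices) runs a primal-dual gradient iteration, gradient ascent in $x$ and projected gradient descent in $\mu$, using the partial derivatives listed above; for this problem it settles near $x^* = (2, 2)$ and $\mu^* = (4, 0, 0)$.

    import numpy as np

    def grads(x, mu):
        # Partial derivatives of L(x; mu) computed above.
        dx = np.array([-2 * x[0] + mu[0] - mu[1],
                       -2 * x[1] + mu[0] - mu[2]])
        dmu = np.array([x[0] + x[1] - 4, 6 - x[0], 5 - x[1]])
        return dx, dmu

    x, mu = np.zeros(2), np.ones(3)
    for _ in range(20000):
        dx, dmu = grads(x, mu)
        x = x + 0.01 * dx                       # ascend in x
        mu = np.maximum(0.0, mu - 0.01 * dmu)   # descend in mu, keep mu >= 0

    print(np.round(x, 2), np.round(mu, 2))      # ~[2. 2.] and ~[4. 0. 0.]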


    2.5 SUMMARY

This chapter introduced the problem of scheduling links that interfere. We use a simplified model of interference captured by a conflict graph: either two links conflict or they do not. Accordingly, at any given time, only links in an independent set can transmit. The first problem is to decide which independent set should transmit to keep up with arrivals.

We explained that the problem has a solution if the arrival rates are small enough (strictly feasible). In that case, a simple randomized schedule makes the queue lengths positive recurrent. The technique of proof was based on a Lyapunov function. However, this schedule requires knowing the arrival rates.

MWM selects the independent set with the maximum sum of backlogs. We proved that it makes the queues positive recurrent, again by using a Lyapunov function. Unfortunately, this algorithm is not implementable in a large network.

We then described the A-CSMA protocol, where the exponentially distributed waiting time of a node has a rate exponential in its backlog. By exploring the CSMA Markov chain, we showed that this protocol tends to select an independent set with a large sum of backlogs. We stated a theorem that claims that this protocol makes the queues positive recurrent.

We then showed how to combine this protocol with admission control to maximize the sum of the utilities of the flows of packets through the network. The network accepts packets at a rate that decreases with the backlog in their ingress node.

Finally, we described a multi-hop network where nodes can decide which packet to send and to which neighbor. We explained that each link selects the flow with the largest backpressure. Moreover, the links use a CSMA protocol where the mean waiting times are exponentially decreasing in that backpressure.


CHAPTER 3

Scheduling in Wireless Networks

In this chapter, we consider the scheduling of wireless nodes, assuming perfect CSMA and no hidden nodes, as we did in Chapter 2. The arrival rates are fixed, and each packet reaches its intended receiver in one hop. We model the interference between links by a conflict graph. The objective is to design a distributed scheduling protocol to keep up with the arrivals.

In Section 3.1, we formulate the scheduling problem. Section 3.2 defines the CSMA algorithm and studies the CSMA Markov chain with fixed parameters. In Section 3.3, we show that there exist suitable parameters in the CSMA algorithm to support any vector of strictly feasible arrival rates, and that these parameters can be obtained by maximizing a concave function whose gradient is the difference between the average arrival rates and the average service rates at the nodes. This observation suggests an idealized algorithm to adjust the CSMA parameters. However, the nodes observe the actual service rates and arrival rates, not their average values. Consequently, the proposed algorithm, described in Section 3.4.1, is a stochastic approximation algorithm called Algorithm 1. Different from Algorithm 1, Section 3.4.3 proposes another algorithm where the CSMA parameters are directly related to the queue lengths. Section 3.5 provides an alternative interpretation of the algorithms. It shows that the suitable invariant distribution of the independent sets has the maximal entropy consistent with the average service rates being at least equal to the arrival rates. This maximum entropy distribution is precisely that of a CSMA Markov chain with the appropriate parameters. This interpretation is important because it enables us to generalize the algorithms to solve utility maximization problems with admission control and routing, as we do in Chapter 4. Section 3.6 explains a variation of Algorithm 1, called Algorithm 1(b), to reduce delays in the network. Section 3.7 provides simulation results that confirm the properties of Algorithms 1 and 1(b). Sections 3.8, 3.9, and 3.10 are devoted to the proof of the optimality of the proposed algorithm. In Section 3.12, we explain how the result extends to the case when the packet transmission times have general distributions that may depend on the link. Finally, Section 3.13 collects a few technical proofs.

    3.1 MODEL AND SCHEDULING PROBLEM

As in Chapter 2, we assume a simple model of interference captured by a conflict graph, or equivalently by independent sets. Assume there are $K$ links in the network, where each link is an (ordered) transmitter-receiver pair. The network is associated with a conflict graph (or CG) $G = \{V, E\}$, where $V$ is the set of vertices (each of them represents a link) and $E$ is the set of edges. Two links cannot transmit at the same time (i.e., conflict) if and only if (iff) there is an edge between them.

An independent set (IS) in $G$ is a set of links that can transmit at the same time without any interference. For example, in the network of Figure 2.1, the ISs are $\emptyset, \{1\}, \{2\}, \{3\}, \{1, 3\}$.

Let $\mathcal{X}$ be the set of all ISs of $G$ (not confined to maximal independent sets), and let $N = |\mathcal{X}|$ be the number of ISs. Denote the $i$th IS as $x^i \in \{0, 1\}^K$, a 0-1 vector that indicates which links are transmitting in this IS. The $k$th element of $x^i$ satisfies $x^i_k = 1$ if link $k$ is transmitting, and $x^i_k = 0$ otherwise. We also refer to $x^i$ as a transmission state, and to $x^i_k$ as the transmission state of link $k$.

Packets arrive at the links as processes with rate $\lambda_k$ for link $k$. These arrival processes can be fairly general, as long as their long-term rates are well-defined. For instance, the arrival processes can be stationary and ergodic. They do not have to be independent. For simplicity, assume that each arrived packet has a unit size of 1.

    We define the feasibility and strict feasibility of arrivals.

Definition 3.1 Feasibility and Strict Feasibility of Arrivals.
(i) $\lambda$ is said to be feasible if and only if $\lambda = \sum_{i=1}^N \bar{p}_i x^i$ for some probability distribution $\bar{p} \in \mathbb{R}^N_+$ satisfying $\bar{p}_i \ge 0$ and $\sum_{i=1}^N \bar{p}_i = 1$. That is, $\lambda$ is a convex combination of the ISs, such that it is possible to serve the arriving traffic with some transmission schedule. Denote the set of feasible $\lambda$ by $\bar{\mathcal{C}}$.
(ii) $\lambda$ is said to be strictly feasible iff it is in the set $\mathcal{C}$, which denotes the interior of $\bar{\mathcal{C}}$.

Recall that the interior of a set is the collection of points surrounded by a ball of points in that set. That is, the interior of $\bar{\mathcal{C}}$ is defined as $\mathrm{int}\,\bar{\mathcal{C}} := \{\lambda \in \bar{\mathcal{C}} \mid B(\lambda, d) \subseteq \bar{\mathcal{C}} \text{ for some } d > 0\}$, where $B(\lambda, d) = \{\lambda' \mid \|\lambda' - \lambda\|_2 \le d\}$.

We show the following relationship in Section 3.13.1.

Theorem 3.2 Characterization of Strictly Feasible Rates.
$\lambda$ is strictly feasible if and only if it can be written as $\lambda = \sum_{i=1}^N \bar{p}_i x^i$ where $\bar{p}_i > 0$ and $\sum_{i=1}^N \bar{p}_i = 1$.

For example, the vector $\lambda = (0.4, 0.6, 0.4)$ of arrival rates is feasible since $\lambda = 0.4 \cdot (1, 0, 1) + 0.6 \cdot (0, 1, 0)$. However, it is not strictly feasible because the IS $(0, 0, 0)$ has zero probability. On the other hand, $\lambda = (0.4, 0.5, 0.4)$ is strictly feasible.
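These definitions are easy to check numerically on small graphs. The Python sketch below (ours; it assumes the three-link conflict graph of Figure 2.1, where links 1-2 and 2-3 conflict, and it uses scipy's LP solver) enumerates the ISs and computes $\max_{\bar p} \min_i \bar{p}_i$ over all representations $\lambda = \sum_i \bar{p}_i x^i$; by Theorem 3.2, the optimum is positive exactly when $\lambda$ is strictly feasible.

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    K = 3
    conflicts = {(0, 1), (1, 2)}              # links 1-2 and 2-3 (0-indexed)
    X = np.array([x for x in itertools.product([0, 1], repeat=K)
                  if all(not (x[a] and x[b]) for a, b in conflicts)])
    N = len(X)                                # ISs: 000, 001, 010, 100, 101

    def feasibility_margin(lam):
        # LP: maximize t s.t. p_i >= t, sum_i p_i = 1, sum_i p_i x^i = lam.
        c = np.zeros(N + 1); c[-1] = -1.0     # maximize t
        A_eq = np.zeros((K + 1, N + 1))
        A_eq[:K, :N] = X.T
        A_eq[K, :N] = 1.0
        b_eq = np.append(np.asarray(lam, float), 1.0)
        A_ub = np.hstack([-np.eye(N), np.ones((N, 1))])    # t - p_i <= 0
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(N), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, 1)] * N + [(None, None)])
        return res.x[-1] if res.success else None          # None: not feasible

    print(feasibility_margin([0.4, 0.6, 0.4]))   # ~0: feasible, not strictly
    print(feasibility_margin([0.4, 0.5, 0.4]))   # > 0: strictly feasible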

Now we define what a scheduling algorithm is and when it is called throughput-optimal.

Definition 3.3 Scheduling Algorithm, Throughput-Optimal, Distributed.
A scheduling algorithm decides which links should transmit at any time instance $t$, given the history of the system (possibly including the history of queue lengths, arrival processes, etc.) up to time $t$.


A scheduling algorithm is throughput-optimal if it can support any strictly feasible arrival rates $\lambda \in \mathcal{C}$ (in other words, it can stabilize the queues whenever possible). Equivalently, we also say that such an algorithm achieves the maximal throughput.

We say that a scheduling algorithm is distributed if each link only uses information within its one-hop neighborhood. We are primarily interested in designing a distributed scheduling algorithm that is throughput-optimal.

In the definition above, stabilizing the queues admits two definitions. When the network is modeled by a time-homogeneous Markov process (e.g., if the algorithm uses a constant step size), we define stability by the positive (Harris) recurrence^1 of the Markov process. When the network Markov process is not time-homogeneous (e.g., if the algorithm uses a decreasing step size), we say that the queues are stable if their long-term departure rate is equal to their average arrival rate (which is also called rate-stability).

    3.2 CSMA ALGORITHM

    The idealized CSMA Algorithm works as follows.

Definition 3.4 CSMA Algorithm.
If the transmitter of link $k$ senses the transmission of any conflicting link (i.e., any link $m$ such that $(k, m) \in E$), then it keeps silent. If none of its conflicting links are transmitting, then the transmitter of link $k$ waits (or backs off) for a random period of time that is exponentially distributed with mean $1/R_k$ and then starts its transmission.^2 If some conflicting link starts transmitting during the backoff, then link $k$ suspends its backoff and resumes it after the conflicting transmission is over. The transmission time of link $k$ is exponentially distributed with mean 1.

For simplicity, assume that the packet sizes upon transmission can be different from the sizes of the arrived packets (by re-packetizing the bits in the queue), in order to give the exponentially distributed transmission times. We discuss how to relax the assumption on the transmission times in Section 3.12 (which not only provides a more general result but can also make re-packetization unnecessary).

Assuming that the sensing time is negligible, given the continuous distribution of the backoff times, the probability for two conflicting links to start transmission at the same time is zero. So collisions do not occur in idealized CSMA.

^1 Positive recurrence is defined for a Markov process with countable state space. The concept of positive Harris recurrence is for a Markov process with uncountable state space, and it can be viewed as a natural extension of positive recurrence. However, the precise definition of positive Harris recurrence is not given here since the concept is not used in this book. Interested readers can refer to (29) for an exact definition and a proof that our CSMA algorithm with a constant step size ensures positive Harris recurrence.
^2 If more than one backlogged link shares the same transmitter, the transmitter maintains independent backoff timers for these links.
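The idealized dynamics just defined are easy to simulate. Below is a minimal Python sketch (ours; the three-link conflict graph, the rates, and the horizon are illustrative assumptions) that samples, in each state, the next exponential event among the allowed transitions: an idle link with a silent neighborhood turns on at rate $R_k$, and an active link turns off at rate 1.

    import math
    import random

    def simulate_csma(R, conflicts, horizon=200000.0, seed=1):
        # Fraction of time each link transmits under idealized CSMA with
        # backoff rates R_k (mean backoff 1/R_k) and Exp(1) transmission times.
        rng = random.Random(seed)
        K = len(R)
        nbrs = [set() for _ in range(K)]
        for a, b in conflicts:
            nbrs[a].add(b)
            nbrs[b].add(a)
        x = [0] * K                      # current transmission state
        busy = [0.0] * K
        t = 0.0
        while t < horizon:
            # allowed transitions from state x and their rates
            events = [(k, R[k] if x[k] == 0 else 1.0) for k in range(K)
                      if x[k] == 1 or not any(x[m] for m in nbrs[k])]
            total = sum(rate for _, rate in events)
            dt = min(rng.expovariate(total), horizon - t)
            for k in range(K):
                busy[k] += x[k] * dt
            t += dt
            if t >= horizon:
                break
            u = rng.random() * total
            for k, rate in events:
                u -= rate
                if u <= 0:
                    x[k] ^= 1            # toggle link k on or off
                    break
        return [b / horizon for b in busy]

    # Three links in a line, r_k = 1 for all k (so R_k = e):
    print(simulate_csma([math.e] * 3, {(0, 1), (1, 2)}))   # ~[0.61, 0.16, 0.61]

The long-run activity fractions printed at the end match the product-form stationary distribution derived next.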


It is not difficult to see that the transmission states form a continuous-time Markov chain, which is called the CSMA Markov chain. The state space of the Markov chain is $\mathcal{X}$. Denote link $k$'s neighboring set by $N(k) := \{m : (k, m) \in E\}$. If in state $x^i \in \mathcal{X}$ link $k$ is not active ($x^i_k = 0$) and all of its conflicting links are not active (i.e., $x^i_m = 0, \forall m \in N(k)$), then state $x^i$ transits to state $x^i + e_k$ with rate $R_k$, where $e_k$ is the $K$-dimensional vector whose $k$th element is 1 and all other elements are 0's. Similarly, state $x^i + e_k$ transits to state $x^i$ with rate 1. However, if in state $x^i$ any link in the neighboring set $N(k)$ is active, then state $x^i + e_k$ is infeasible (i.e., $x^i + e_k \notin \mathcal{X}$).

Let $r_k = \log(R_k)$. We call $r_k$ the transmission aggressiveness (TA) of link $k$. For a given positive vector $r = \{r_k, k = 1, \ldots, K\}$, the CSMA Markov chain is irreducible. Designate the stationary distribution of its feasible states $x^i$ by $p(x^i; r)$. We have the following result (see (5; 71; 45)):

Lemma 3.5 Invariant Distribution of the CSMA Markov Chain.
The stationary distribution of the CSMA Markov chain has the following product form:

$$p(x^i; r) = \frac{\exp\big(\sum_{k=1}^K x^i_k r_k\big)}{C(r)} \qquad (3.1)$$

where

$$C(r) = \sum_j \exp\Big(\sum_{k=1}^K x^j_k r_k\Big), \qquad (3.2)$$

where the summation $\sum_j$ is over all feasible states $x^j$.

Proof: As in the proof of Theorem 2.6, we verify that the distribution (3.1)-(3.2) satisfies the detailed balance equations. Consider states $x^i$ and $x^i + e_k$, where $x^i_k = 0$ and $x^i_m = 0, \forall m \in N(k)$. From (3.1), we have

$$\frac{p(x^i + e_k; r)}{p(x^i; r)} = \exp(r_k) = R_k,$$

which is exactly the detailed balance equation between states $x^i$ and $x^i + e_k$. Such relations hold for any two states that differ in only one element, which are the only pairs that correspond to nonzero transition rates. It follows that the distribution is invariant. □

Note that the CSMA Markov chain is time-reversible since the detailed balance equations hold. In fact, the Markov chain is a reversible spatial process, and its stationary distribution (3.1) is a Markov Random Field ((38), page 189; (17)). (This means that the state of every link $k$ is conditionally independent of all other links, given the transmission states of its conflicting links.)

Later, we also write $p(x^i; r)$ as $p_i(r)$ for simplicity. These notations are interchangeable throughout the chapter. And let $p(r) \in \mathbb{R}^N_+$ be the vector of all $p_i(r)$'s. It follows from Lemma 3.5 that $s_k(r)$, the probability that link $k$ transmits, is given by

$$s_k(r) = \sum_i \big[x^i_k \, p(x^i; r)\big]. \qquad (3.3)$$

Without loss of generality, assume that each link $k$ has a capacity of 1. That is, if link $k$ transmits data all the time (without contention from other links), then its service rate is 1 (unit of


data per unit time). Then, $s_k(r)$ is also the normalized throughput (or service rate) with respect to the link capacity.

Even if the transmission time is not exponentially distributed but has a mean of 1, references (5; 45) show that the stationary distribution (3.1) still holds. That is, the stationary distribution is insensitive to the distributions of the transmission time. For completeness, we present a simple proof of that insensitivity as Theorem 3.22 in Section 3.12.
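The formulas (3.1)-(3.3) are easy to evaluate on the small example used earlier. A minimal Python sketch (ours; it assumes the three-link conflict graph of Figure 2.1) computes the stationary distribution and the service rates exactly; for $r = (1, 1, 1)$ it reproduces the activity fractions observed in the simulation sketch above.

    import itertools
    import math

    K = 3
    conflicts = {(0, 1), (1, 2)}
    ISs = [x for x in itertools.product([0, 1], repeat=K)
           if all(not (x[a] and x[b]) for a, b in conflicts)]

    def stationary(r):
        # p(x; r) proportional to exp(sum_k x_k r_k), equations (3.1)-(3.2)
        w = [math.exp(sum(x[k] * r[k] for k in range(K))) for x in ISs]
        C = sum(w)
        return [wi / C for wi in w]

    def service_rates(r):
        # s_k(r) = sum_i x^i_k p(x^i; r), equation (3.3)
        p = stationary(r)
        return [sum(x[k] * pi for x, pi in zip(ISs, p)) for k in range(K)]

    print(service_rates([1.0, 1.0, 1.0]))   # ~[0.611, 0.164, 0.611]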

    3.3 IDEALIZED ALGORITHM

In this section, we show that there is a choice $r^*$ of the parameters $r$ for which the CSMA protocol achieves the maximal throughput. We show in Theorem 3.8 of Section 3.3.1 that one can choose the parameters $r^*$ that maximize some concave function $F(r; \lambda)$. Moreover, we show that the gradient of $F(r; \lambda)$ with respect to $r$ is the difference between the arrival rates and the average service rates when the parameters $r$ are used. This observation leads to an idealized algorithm to adjust the parameters $r$, described in Section 3.3.2. The proposed algorithm based on this idealized version is discussed in Section 3.4.

    3.3.1 CSMA CAN ACHIEVE MAXIMAL THROUGHPUT

The goal of this section is to prove Theorem 3.8. This is done in two steps. First, we show in Lemma 3.6 that suitable rates exist for the CSMA algorithm, provided that a specific function $F(r; \lambda)$ attains its maximum over $r \ge 0$. Second, one shows that this maximum is attained if $\lambda$ is strictly feasible.

We show that maximizing $F(r; \lambda)$ is equivalent to minimizing the Kullback-Leibler divergence between $\bar{p}$ and $p(r)$, where $\bar{p}$ characterizes $\lambda$. The interpretation of the theorem is then that the parameters of the CSMA algorithm should be chosen so that the invariant distribution of the CSMA Markov chain is as close as possible (in the Kullback-Leibler divergence) to the distribution of the independent sets that corresponds to the arrival rates.

For a $\lambda \in \mathcal{C}$, let $\bar{p}$ be a probability distribution such that $\lambda = \sum_{i=1}^N \bar{p}_i x^i$. (Note that $\bar{p}$ may not be unique, in which case we arbitrarily choose one such distribution.) Define the following function (the log-likelihood function (68) if we estimate the parameter $r$ assuming that we observe $\bar{p}_i$). Note that $\bar{p}$ only shows up in the derivation of our algorithm; the information of $\bar{p}$ is not needed in the algorithm itself.

$$F(r; \lambda) := \sum_i \bar{p}_i \log(p_i(r)) = \sum_i \bar{p}_i \Big[\sum_{k=1}^K x^i_k r_k - \log(C(r))\Big] = \sum_k \lambda_k r_k - \log\Big(\sum_j \exp\Big(\sum_{k=1}^K x^j_k r_k\Big)\Big) \qquad (3.4)$$

where $\lambda_k = \sum_i \bar{p}_i x^i_k$ is the arrival rate at link $k$. (Note that the function $F(r; \lambda)$ depends on $\lambda$, but it does not involve $\bar{p}$ anymore.)

Consider the following optimization problem:

$$\sup_{r \ge 0} F(r; \lambda). \qquad (3.5)$$


Since $\log(p_i(r)) \le 0$, we have $F(r; \lambda) \le 0$. Therefore, $\sup_{r \ge 0} F(r; \lambda)$ exists. Also, $F(r; \lambda)$ is concave in $r$ (8). We show that the following lemma holds.

Lemma 3.6 CSMA Can Serve $\lambda$ if $\max_{r \ge 0} F(r; \lambda)$ Exists.
If $\sup_{r \ge 0} F(r; \lambda)$ is attainable (i.e., there exists finite $r^* \ge 0$ such that $F(r^*; \lambda) = \sup_{r \ge 0} F(r; \lambda)$), then $s_k(r^*) \ge \lambda_k, \forall k$. That is, the service rate is not less than the arrival rate when $r = r^*$.

Proof. Let $d \ge 0$ be a vector of dual variables associated with the constraints $r \ge 0$ in problem (3.5); then the Lagrangian is $L(r; d) = F(r; \lambda) + d^T r$. At the optimal solution $r^*$, we have

$$\frac{\partial L(r^*; d^*)}{\partial r_k} = \lambda_k - \frac{\sum_j x^j_k \exp\big(\sum_{k'=1}^K x^j_{k'} r^*_{k'}\big)}{C(r^*)} + d^*_k = \lambda_k - s_k(r^*) + d^*_k = 0 \qquad (3.6)$$

where $s_k(r^*)$, according to (3.3), is the service rate (at the stationary distribution) given $r^*$. Since $d^*_k \ge 0$, $\lambda_k \le s_k(r^*)$. □

Equivalently, problem (3.5) is the same as minimizing the Kullback-Leibler divergence (KL divergence) between the two distributions $\bar{p}$ and $p(r)$:

$$\inf_{r \ge 0} D_{KL}(\bar{p} \,\|\, p(r)) \qquad (3.7)$$

where the KL divergence is

$$D_{KL}(\bar{p} \,\|\, p(r)) := \sum_i \big[\bar{p}_i \log(\bar{p}_i / p_i(r))\big] = \sum_i \big[\bar{p}_i \log(\bar{p}_i)\big] - F(r; \lambda).$$

That is, we choose $r \ge 0$ such that $p(r)$ is the closest to $\bar{p}$ in terms of the KL divergence.

The above result is related to the theory of Markov Random Fields (68): when we minimize the KL divergence between a given joint distribution $p^I$ and a product-form joint distribution $p^{II}$, then depending on the structure of $p^{II}$, certain marginal distributions induced by the two joint distributions are equal (i.e., a moment-matching condition). In our case, the time-reversible CSMA Markov chain gives the product-form distribution. Also, the arrival rate and service rate on link $k$ are viewed as two marginal probabilities. They are not always equal, but they satisfy the desired inequality in Lemma 3.6, due to the constraint $r \ge 0$, which is important in our design.

The following condition, proved in Section 3.13.2, ensures that $\sup_{r \ge 0} F(r; \lambda)$ is attainable.

Lemma 3.7 If $\lambda$ is Strictly Feasible, then $\max_{r \ge 0} F(r; \lambda)$ Exists.
If the arrival rate $\lambda$ is strictly feasible, then $\sup_{r \ge 0} F(r; \lambda)$ is attainable.


Combining Lemmas 3.6 and 3.7, we have the following desirable result.

Theorem 3.8 Throughput-Optimality of CSMA.
For any strictly feasible $\lambda$, there exists a finite $r^*$ such that $s_k(r^*) \ge \lambda_k, \forall k$.

To see why strict feasibility is necessary, note that the links are all idle some positive fraction of time under any parameters of the CSMA algorithm.

    3.3.2 AN IDEALIZED DISTRIBUTED ALGORITHM

Since $\partial F(r; \lambda)/\partial r_k = \lambda_k - s_k(r)$, a simple gradient algorithm to solve (3.5) is

$$r_k(j+1) = \big[r_k(j) + \alpha(j)\big(\lambda_k - s_k(r(j))\big)\big]^+, \quad \forall k \qquad (3.8)$$

where $j = 0, 1, 2, \ldots$, and the $\alpha(j)$ are some (small) step sizes.

Since this is an algorithm to maximize a concave function, we know from Theorem A.1 how to choose step sizes to either converge to the solution or to approach it.

The most important property of algorithm (3.8) is that it admits an easy distributed implementation in wireless networks because link $k$ can adjust $r_k$ based on its local information: the arrival rate $\lambda_k$ and the service rate $s_k(r(j))$. (If the arrival rate is larger than the service rate, then $r_k$ should be increased, and vice versa.) No information about the arrival rates and service rates of other links is needed.

One important observation is that the nodes observe actual arrival and service rates that are random and are not equal to their mean values, unlike in (3.8). Therefore, (3.8) is only an idealized algorithm which cannot be used directly.
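Although idealized, (3.8) is instructive to run with the exact rates $s_k(r)$ computed from the product form. A minimal Python sketch (ours; the arrival rates, step size, and iteration count are illustrative assumptions, and the helper repeats the product-form computation from the sketch in Section 3.2):

    import itertools
    import math

    K = 3
    conflicts = {(0, 1), (1, 2)}
    ISs = [x for x in itertools.product([0, 1], repeat=K)
           if all(not (x[a] and x[b]) for a, b in conflicts)]

    def service_rates(r):
        # exact s_k(r) from the product form (3.1)-(3.3)
        w = [math.exp(sum(x[k] * r[k] for k in range(K))) for x in ISs]
        C = sum(w)
        return [sum(x[k] * wi for x, wi in zip(ISs, w)) / C for k in range(K)]

    lam = [0.4, 0.5, 0.4]                        # strictly feasible arrival rates
    r = [0.0] * K
    for j in range(5000):
        s = service_rates(r)
        r = [max(0.0, rk + 0.5 * (lk - sk))      # gradient step (3.8)
             for rk, lk, sk in zip(r, lam, s)]

    print([round(v, 2) for v in r])                  # ~[1.39, 3.22, 1.39]
    print([round(v, 3) for v in service_rates(r)])   # ~[0.4, 0.5, 0.4]

For this example the fixed point can be computed by hand from the product form, and it satisfies $s_k(r^*) = \lambda_k$ for all $k$, as Lemma 3.6 predicts when $r^*$ is interior.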

    3.4 DISTRIBUTED ALGORITHMS

In this section, we construct three algorithms (Algorithm 1, a variation, and Algorithm 2) based on the above results, and we establish their throughput-optimality (or near-throughput-optimality) properties.

The main idea of Algorithm 1 and its variation is that the nodes observe a noisy version of the gradient instead of the actual gradient. Accordingly, we use stochastic approximation algorithms that adjust the parameters slowly enough so that the observed empirical arrival and service rates approach their mean values. The two algorithms differ in how they adjust the parameters.

There are two sources of error between the observed service rates under CSMA with parameters $r$ and their mean value $s(r)$ under the stationary distribution of the CSMA Markov chain with these parameters. The first one is that the services are random. The second is that the Markov chain takes time to converge to its stationary distribution after one changes the parameters. This second effect results in a bias: a difference between the mean value of the observations and their mean value under the stationary distribution. Thus, the error has two components: a bias and a zero-mean random error. To make the effect of the bias more and more negligible, we use the same values of


    the parameters over intervals that increase over time. The complication is that the Markov chain

might take longer and longer to converge as we change the parameters. The precise proof requires good estimates of the convergence time of the Markov chain (i.e., of its mixing time).

Section 3.4.1 explains Algorithm 1 and proves that it is throughput-optimal. This algorithm uses decreasing step sizes and increasing update intervals.^3 Section 3.4.2 shows that a variation of Algorithm 1 with decreasing step sizes and constant update intervals stabilizes the queues when the arrival rates are in a smaller set (although the set can be made arbitrarily close to $\mathcal{C}$). Both of these algorithms are time-varying since the step sizes change with time. A time-invariant algorithm, called Algorithm 2, is given in Section 3.4.3 with an arbitrarily small loss of throughput. In Algorithm 2, the CSMA parameters are direct functions of the queue lengths.

    3.4.1 THROUGHPUT-OPTIMAL ALGORITHM 1

In Algorithm 1, defined below, the links modify their aggressiveness parameters at times $0, T(1), T(1) + T(2), T(1) + T(2) + T(3)$, and so on. Here, the durations $T(n)$ increase with $n$ to give more and more time for the CSMA Markov chain to approach its invariant distribution under the updated parameters. The rate of convergence of the Markov chain to its stationary distribution is bounded by the mixing time of the chain, which depends on its parameters. Moreover, the adjustments are in the direction of the noisy estimate of the gradient, with diminishing step sizes.

The tricky aspect of the algorithm is that $T(n)$ depends on the parameters $r(n)$ of the Markov chain. These parameters depend on the step sizes up to step $n$. We want to choose the step sizes and the $T(n)$ so that $T(n)$ gets large compared to the mixing time, and yet the step sizes sum to infinity. This balancing act of finding step sizes that decrease just slowly enough is the technical core of the proof.

Let link $k$ adjust $r_k$ at times $t_j, j = 1, 2, \ldots$, with $t_0 = 0$. Define the update interval $T(j) := t_j - t_{j-1}, j = 1, 2, \ldots$. Define period $j$ as the time between $t_{j-1}$ and $t_j$, and $r(j)$ as the value of $r$ set at time $t_j$. Let $\lambda'_k(j)$ and $s'_k(j)$ be, respectively, the empirical average arrival rate and service rate at link $k$ between times $t_j$ and $t_{j+1}$. That is, $s'_k(j) := \int_{t_j}^{t_{j+1}} x_k(\tau)\, d\tau / T(j+1)$, where $x_k(\tau) \in \{0, 1\}$ is the state of link $k$ at time instance $\tau$. Note that $\lambda'_k(j)$ and $s'_k(j)$ are generally random variables.

    We design the following distributed algorithm.

Definition 3.9 Algorithm 1.
At time $t_{j+1}$, where $j = 0, 1, 2, \ldots$, let

$$r_k(j+1) = \big[r_k(j) + \alpha(j)\big(\lambda'_k(j) - s'_k(j)\big)\big]_{\mathcal{D}}, \quad \forall k \qquad (3.9)$$

where $\alpha(j) > 0$ is the step size, and $[\cdot]_{\mathcal{D}}$ means the projection onto the set $\mathcal{D} := [0, r_{max}]$, where $r_{max} > 0$. Thus, $[r]_{\mathcal{D}} = \max\{0, \min\{r, r_{max}\}\}$. We allow $r_{max} = +\infty$, in which case the projection is the same as $[\cdot]^+$.

^3 We would like to thank D. Shah for suggesting the use of increasing update intervals.


Observe that each link $k$ only uses its local information in the algorithm.

Remark: If in period $j+1$ (for any $j$) the queue of link $k$ becomes empty, then link $k$ still transmits dummy packets with TA $r_k(j)$ until $t_{j+1}$. This ensures that the (ideal) average service rate is still $s_k(r(j))$ for all $k$. (The transmitted dummy packets are counted in the computation of $s'_k(j)$.)
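To illustrate the stochastic-approximation mechanics of Algorithm 1, the Python sketch below (ours) replaces the per-period CSMA run by its exact stationary rate plus noise whose standard deviation shrinks like $1/\sqrt{T(j)}$, mimicking empirical averages over growing intervals. The step sizes $\alpha(j) = 1/(j+2)$ and the interval growth are illustrative choices, not the precise conditions required by Theorem 3.10 below.

    import itertools
    import math
    import random

    K = 3
    conflicts = {(0, 1), (1, 2)}
    ISs = [x for x in itertools.product([0, 1], repeat=K)
           if all(not (x[a] and x[b]) for a, b in conflicts)]

    def service_rates(r):
        # exact s_k(r) from the product form, standing in for a CSMA run
        w = [math.exp(sum(x[k] * r[k] for k in range(K))) for x in ISs]
        C = sum(w)
        return [sum(x[k] * wi for x, wi in zip(ISs, w)) / C for k in range(K)]

    rng = random.Random(0)
    lam, r, r_max = [0.4, 0.5, 0.4], [0.0] * K, 10.0
    for j in range(20000):
        step = 1.0 / (j + 2)                 # decreasing step sizes
        T_j = 10.0 * (j + 2) ** 0.1          # slowly increasing update intervals
        sigma = 1.0 / math.sqrt(T_j)         # empirical rates fluctuate ~1/sqrt(T(j))
        s_emp = [min(1.0, max(0.0, s + rng.gauss(0.0, sigma)))
                 for s in service_rates(r)]
        lam_emp = [max(0.0, l + rng.gauss(0.0, sigma)) for l in lam]
        r = [min(r_max, max(0.0, rk + step * (le - se)))   # update (3.9)
             for rk, le, se in zip(r, lam_emp, s_emp)]

    print([round(v, 2) for v in r])   # drifts toward r* ~ (1.39, 3.22, 1.39)

Despite the noise, the decreasing step sizes average it out and the TAs approach the fixed point found by the idealized gradient sketch of Section 3.3.2.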

    The following result establishes the optimality property of Algorithm 1.

Theorem 3.10 Algorithm 1 is Throughput-Optimal.
For simplicity, assume that at time instances $t \in \{0, 1, 2, \ldots\}$, $A_k(t)$ units of data arrive at link $k$. Assume that the $A_k(t)$, $\forall k, t \in \{0, 1, \ldots\}$, are independent of each other. Also, assume that $E(A_k(t)) = \lambda_k$, $\forall t \in \{0, 1, \ldots\}$, and $A_k(t) \le \bar{C}$. Therefore, the empirical arrival rates are bounded, i.e., $\lambda'_k(j) \le \bar{\lambda}$, $\forall k, j$, for some $\bar{\lambda} < \infty$. Then, if the step sizes $\alpha(j)$ decrease to 0 slowly enough and the update intervals $T(j)$ grow accordingly, all queues are rate-stable.

3.4.2 A VARIATION WITH CONSTANT UPDATE INTERVALS
In this variation, the update intervals are constant ($T(j) = T$, $\forall j$) and the update (3.9) is projected onto a bounded interval instead of $\mathcal{D}$:

$$r_k(j+1) = \big[r_k(j) + \alpha(j)\big(\lambda'_k(j) - s'_k(j)\big)\big]_{[r_{min}, r_{max}]}, \quad \forall k. \qquad (3.11)$$

Then, the following can be shown.

Theorem 3.11 Feasible Rates with Constant Update Intervals.
Assume that

$$\lambda \in \mathcal{C}(r_{min}, r_{max}, \delta) := \{\lambda \mid \arg\max_r F(r; \lambda + \delta \mathbf{1}) \in (r_{min}, r_{max})^K\}.$$

Also assume the same arrival process as in Theorem 3.10, such that the empirical arrival rates are bounded, i.e., $\lambda'_k(j) \le \bar{\lambda}$, $\forall k, j$, for some $\bar{\lambda} < \infty$.

Then, if $\alpha(j) > 0$ is non-increasing and satisfies

$$\sum_j \alpha(j) = \infty, \quad \sum_j \alpha(j)^2 < \infty, \quad \text{and } \alpha(0) \le 1,$$

then $r(j)$ converges to $r^*$ as $j \to \infty$ with probability 1, where $r^*$ satisfies $s_k(r^*) = \lambda_k + \delta > \lambda_k$, $\forall k$. Also, the queues are rate-stable and return to 0 infinitely often.

Remark: Clearly, as $r_{min} \to -\infty$, $r_{max} \to \infty$, and $\delta \to 0$, $\mathcal{C}(r_{min}, r_{max}, \delta) \to \mathcal{C}$. So the maximal throughput can be approximated arbitrarily closely by setting $r_{max}$ large, $r_{min}$ small, and $\delta$ small.

The proof of the theorem is similar to that of Theorem 5.4, to be presented later, and it is therefore omitted here.


    3.4.3 TIME-INVARIANT A-CSMA

Although Algorithm 1 is throughput-optimal, $r$ is not a direct function of the queue lengths. In this section, we consider algorithm (3.12), where $r$ is a function of the queue lengths. It can achieve a capacity region arbitrarily close to $\mathcal{C}$.

Definition 3.12 Algorithm 2.
Let $Q_k(j)$ be the queue length of node $k$ at time $jT$. For simplicity, assume that the dynamics of $Q_k(j)$ are

$$Q_k(j+1) = \big[Q_k(j) + T\big(\lambda'_k(j) - s'_k(j)\big)\big]^+.$$

The nodes update $r$ at times $jT$, $j = 1, 2, \ldots$. (That is, $T(j) = T$, $\forall j$.) Specifically, at time $jT$, node $k$ sets

$$r_k(j) = \min\{(\epsilon/T) Q_k(j),\; r_{max} + \epsilon\}, \quad \forall k, \qquad (3.12)$$

where $r_{max}, \epsilon > 0$ are two constants.
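A minimal Python sketch of Algorithm 2's closed loop (ours; it again uses the exact stationary rates in place of an actual CSMA run, and the values of $\epsilon$, $T$, and $r_{max}$ are illustrative): the queues evolve as above, and the TAs are read off the queues via (3.12).

    import itertools
    import math

    K = 3
    conflicts = {(0, 1), (1, 2)}
    ISs = [x for x in itertools.product([0, 1], repeat=K)
           if all(not (x[a] and x[b]) for a, b in conflicts)]

    def service_rates(r):
        # exact s_k(r) from the product form, standing in for a CSMA run
        w = [math.exp(sum(x[k] * r[k] for k in range(K))) for x in ISs]
        C = sum(w)
        return [sum(x[k] * wi for x, wi in zip(ISs, w)) / C for k in range(K)]

    lam = [0.4, 0.5, 0.4]
    eps, T, r_max = 0.1, 50.0, 8.0
    Q = [0.0] * K
    for j in range(4000):
        r = [min((eps / T) * q, r_max + eps) for q in Q]   # TA from queues, (3.12)
        s = service_rates(r)                               # service in one interval
        Q = [max(0.0, q + T * (l - sk)) for q, l, sk in zip(Q, lam, s)]

    print([round(q, 0) for q in Q])               # queues settle near (T/eps)*r*
    print([round((eps / T) * q, 2) for q in Q])   # ~[1.39, 3.22, 1.39]

In this idealized loop the queues stabilize at finite values whose scaled versions are exactly the TAs $r^*$ that serve the arrivals, illustrating why the queue lengths themselves can play the role of the optimization variables.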

We have the following result about Algorithm 2.

Theorem 3.13 Time-invariant A-CSMA and Queueing Stability.
Assume that the vector of arrival rates satisfies

$$\lambda \in \mathcal{C}_2(r_{max}) := \{\lambda \mid \arg\max_{r \ge 0} F(r; \lambda) \in [0, r_{max}]^K\}.$$

Clearly, $\mathcal{C}_2(r_{max}) \subseteq \mathcal{C}$. Then, with a small enough $\epsilon$ and a large enough $T$, Algorithm 2 makes the queues stable.

Remark: Note that $\mathcal{C}_2(r_{max}) \to \mathcal{C}$ as $r_{max} \to +\infty$. Therefore, the algorithm can approach throughput-optimality arbitrarily closely by properly choosing $r_{max}$, $\epsilon$, and $T$.

The proof is given in Section 3.11.

    3.5 MAXIMAL-ENTROPY INTERPRETATION

This section provides a different view of the above scheduling problem, which will help us later develop a number of other algorithms. It shows that the desired distribution of the independent sets has maximum entropy subject to serving the links fast enough. Accordingly, to generalize the algorithm of this chapter, one will add an entropy term to the objective function of more general problems. Rewrite (3.5) as

$$\max_{r,h}\; \Big\{\sum_k \lambda_k r_k - \log\Big(\sum_j \exp(h_j)\Big)\Big\}$$
$$\text{s.t. } h_j = \sum_{k=1}^K x^j_k r_k, \ \forall j; \qquad r_k \ge 0, \ \forall k. \qquad (3.13)$$


For each $j = 1, 2, \ldots, N$, associate a dual variable $u_j$ to the constraint $h_j = \sum_{k=1}^K x^j_k r_k$. Write the vector of dual variables as $u \in \mathbb{R}^N_+$. Then it is not difficult to find that the dual problem of (3.13) is the following maximal-entropy problem. (The computation is given in (31).)

$$\max_u\; -\sum_i u_i \log(u_i)$$
$$\text{s.t. } \sum_i (u_i x^i_k) \ge \lambda_k, \ \forall k; \qquad u_i \ge 0, \ \sum_i u_i = 1, \qquad (3.14)$$

where the objective function is the entropy of the distribution $u$, $H(u) := -\sum_i u_i \log(u_i)$.^4 Let us define the domain of the objective function $H(u)$ as $\mathcal{D}_0 = \{u \mid u_i \ge 0, \forall i; \ \sum_i u_i = 1\}$.
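Problem (3.14) can be checked numerically on the running example. The Python sketch below (ours; it uses scipy's SLSQP solver, the three-link graph, and $\lambda = (0.4, 0.5, 0.4)$) maximizes the entropy subject to the service constraints; the optimizer matches the product-form distribution $p(\cdot\,; r^*)$ with the $r^*$ found by the gradient sketch of Section 3.3.2, as the duality just described predicts.

    import itertools
    import numpy as np
    from scipy.optimize import minimize

    K = 3
    conflicts = {(0, 1), (1, 2)}
    X = np.array([x for x in itertools.product([0, 1], repeat=K)
                  if all(not (x[a] and x[b]) for a, b in conflicts)])
    N = len(X)
    lam = np.array([0.4, 0.5, 0.4])

    def neg_entropy(u):
        return float(np.sum(u * np.log(np.maximum(u, 1e-12))))

    cons = [{"type": "eq", "fun": lambda u: np.sum(u) - 1.0},
            {"type": "ineq", "fun": lambda u: X.T @ u - lam}]  # service >= lam
    res = minimize(neg_entropy, np.full(N, 1.0 / N),
                   bounds=[(0.0, 1.0)] * N, constraints=cons, method="SLSQP")
    print(np.round(res.x, 3))                   # ~[0.02, 0.08, 0.5, 0.08, 0.32]

    r_star = np.array([1.386, 3.219, 1.386])    # from the sketch in Section 3.3.2
    w = np.exp(X @ r_star)
    print(np.round(w / w.sum(), 3))             # the same distribution: p(x; r*)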

With this domain, problem (3.14) is the same as

$$\max_{u \in \mathcal{D}_0}\; -\sum_i u_i \log(u_i) \quad \text{s.t. } \sum_i (u_i x^i_k) \ge \lambda_k, \ \forall k. \qquad (3.15)$$

Also, if for each $k$, we associate a dual variable $r_k$