A synchronization algorithm for distributed multimedia environments

Multimedia Systems (1996) 4:1-11. © Springer-Verlag 1996

Panagiotis N. Zarros (1), Myung J. Lee (2), Tarek N. Saadawi (2)

(1) CS First Boston Corporation, 5 World Trade Center, New York, NY 10048, USA
(2) Department of Electrical Engineering, The City University of New York, City College, 40th St. & Convent Ave, New York, NY 10031, USA

Abstract. Network synchronization plays a significant role in transmitting multimedia objects over computer networks. Even packets from a single channel must be synchronized due to the problems in a packet switching environment, such as network jitter, frequency, and time offsets. We present an algorithm that determines the set of packets generated periodically by various participants arriving at a node. The basic advantage of the proposed algorithm is that the receiver estimates the reference times (expected arrival times of the packets) and achieves synchronization, without knowledge of the packet delays. The accuracy is improved and the complexity is reduced by predicting the time/frequency offsets between the clocks at the source and the mixer. The error is calculated by the Chernoff bound, demonstrated by simulation, and shown to be acceptable in practical applications.

Key Words: Interparticipant synchronization - Multimedia communications - Network jitter - Reference times - Frequency offset - Time offset

1 Introduction

Since the early 1980s multimedia communications has caught the attention of many researchers (Bertsekas and Gallagher 1992; Saadawi et al. 1994). Multimedia communications are concerned with the transportation of multimedia information over computer networks. Multimedia information can be images, real-time voice and video data, graphics, and regular text data. This trend has its roots in the rapid technological advances made and foreseen for the near future in many areas related to the field of computer communications. Notably, some of these areas are (1) the geometric increase of the processing capabilities of the computers; (2) the development of fast algorithms for data compression, especially image compression; and (3) the exponential increase of the available bandwidth made possible by fiber-wire technology.

One class of multimedia applications, multimedia teleconferencing, permits real-time interaction among participants of conferences by exchanging audio, video, and text information. One inherent problem is intermedia synchronization, which deals with synchronizing the temporal relations of various media; lip synchronization is an example. There is another type of synchronization to be considered. If a conference is not in its simplest form with only two participants, but in a more general form with three or more participants, management issues and synchronization problems arise for the packets (messages) arriving from the various participants. We refer to this synchronization problem as interparticipant synchronization.

Correspondence to: P.N. Zarros. e-mail: pzarros@opsny.fbc.com, {eetns,mjlee}@ee-mail.engr.ccny.cuny.edu

Interparticipant synchronization is related to synchronization of packets arriving from various sources and is the subject of this paper. As shown in Fig. 1, the receiver $R$ can be any of the participants or a separate mixer and, as noted, both the average delay $D^i$ and the network jitter differ from participant to participant. In this paper, the existence of different bounds in the arrival time of a packet from various sources is recognized to improve the synchronization algorithm. When we refer to packets, we mean the messages from the application level. Rangan et al. (1993) follow the deterministic approach, whereas we take the statistical approach in our paper. Rangan and colleagues' synchronization algorithm has been proved to be optimal when no control messages are exchanged. Rangan et al. (1992) and Ramanathan and Rangan (1993) deal with techniques for synchronized multimedia retrieval over integrated networks, for both residential video stores and multimedia conferences. Steinmetz (1990) and Little and Ghafoor (1990) present models for formally describing synchronization requirements among media streams. Woo and Ghafoor (1994) and Woo et al. (1994) present an algorithm to determine the optimal number of channels needed to transmit multimedia information efficiently. A similar problem, that of voice synchronization between sender and receiver, has been studied by many researchers (Alvarez-Cuevas et al. 1993; Barberis and Pazzaglia 1980; Montgomery 1983). Ramjee et al. (1994) present adaptive playout mechanisms for packetized audio applications in wide-area networks. Escobar et al. (1994), Ravindran and Bansal (1993), and Znati and Field (1993) present protocols and theoretical tools for multimedia communications.

The packets from a multimedia participant are assumed to be generated periodically. Along with each packet its generation time at the source is included. However, if the period of the sources is known to the receiver, it suffices to send the sequence number of the packet.

In Sect. 2, we present a new interparticipant synchronization algorithm. We identify the problems in relation to multimedia conferences and present the part of our algorithm that estimates the average arrival time of a packet for each source. Then, the interval $[\Delta^i_{\min}, \Delta^i_{\max}]$ for each source $i$ and the minimum waiting time for the reception of one packet from each of the sources are determined. In Sect. 3, we present a methodology to predict the shifts of the expected arrival time of the packets for each source relative to the clock at the receiver, and find bounds of the probability of error in those predictions. Next, a more elaborate performance analysis, with the mean square error as a criterion, is presented and shows the optimality of our algorithm. In Sect. 4, we present performance bounds of the estimation error in calculating the average arrival time and numerical examples of various parts of the interparticipant synchronization algorithm. Finally, in Sect. 5 we present our concluding remarks.

2 Interparticipant synchronization

The variability of packet arrival time is also known as the network jitter, and we define it as the variation between the actual arrival time and the expected arrival time. As shown in Fig. 2, $D^i$ is the average delay of packets from a source $i$ as they arrive at the destination and $T$ is the generation period of the packets at the sender. The actual arrival time of the $n$th packet from source $i$ is denoted as $t^i_n$, where $n$ is an integer. If there is no variation in the delay, the packets arrive at the points $t^i_{\mathrm{ref},n}$ indicated in Fig. 2. Observe that

$$ t^i_{\mathrm{ref},n} = t^i_{\mathrm{ref},1} + (n-1)\,T^i, \quad n = 1, 2, \ldots \tag{1} $$

or

$$ t^i_{\mathrm{ref},n_1} = t^i_{\mathrm{ref},n_2} + (n_1 - n_2)\,T^i, \quad n_1, n_2 = 1, 2, \ldots $$

where $T^i$ is the transmission period $T$ of source $i$ measured according to the clock at the receiver.

The nominal value of the transmission period of the packets measured at the clock at the source $i$ is $T$. Good references for internet time synchronization are the papers by Mills (1981) and Mitra (1980). Unless explicitly mentioned, when we compute the set of points defined by Eq. 1 we simply use $T = T^R = T^i$ because the observer at the receiver considers only the time given by his own clock. The distinction we make in Eq. 1 by using $T^i$ instead of $T$ proves versatile when time/frequency offsets are considered later in this paper. Since packets experience a variation in the delay, an alternate definition for the reference points $t^i_{\mathrm{ref},n}$ is as follows: the reference times for source $i$ are defined to be the set of points generated by Eq. 1 with the property

$$ \lim_{N \to \infty} \sum_{n=1}^{N} \left( t^i_n - t^i_{\mathrm{ref},n} \right) = 0. \tag{2} $$

In other words, the sum of the differences between the actual packet arrival times and the corresponding reference points approaches zero as $N$ approaches infinity. Consequently, the network jitter $\Delta^i_n$ for the $n$th packet received from source $i$ is defined as the difference between these two points, i.e.,

$$ \Delta^i_n = t^i_n - t^i_{\mathrm{ref},n} = t^i_n - \left( t^i_{\mathrm{ref},1} + (n-1)\,T^i \right). \tag{3} $$

Different definitions of the network jitter that are useful for theoretical analysis of burstiness of the networks can be found in the papers by Cruz (1991) and Matragi et al. (1994).
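As a minimal sketch of the definitions above, the reference points of Eq. 1 and the per-packet jitter of Eq. 3 can be computed as follows. The function names and the arrival times are hypothetical and chosen only for illustration; the first reference point is assumed to be known here, whereas the paper estimates it.

```python
def reference_times(t_ref_1, period, count):
    """Eq. 1: t_ref,n = t_ref,1 + (n - 1) * T^i for n = 1..count."""
    return [t_ref_1 + (n - 1) * period for n in range(1, count + 1)]


def jitter(arrivals, t_ref_1, period):
    """Eq. 3: per-packet jitter = actual arrival time - reference time."""
    refs = reference_times(t_ref_1, period, len(arrivals))
    return [t - r for t, r in zip(arrivals, refs)]


# Hypothetical arrival times (ms) for a source with a 33 ms period.
arrivals_ms = [102.0, 131.0, 168.0, 199.0]
print(jitter(arrivals_ms, t_ref_1=100.0, period=33.0))
```

A positive value means the packet arrived later than its reference point, a negative value earlier.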

A participant of a multimedia conference may receive packets from two or more participants. Each packet contains time-sensitive data, such as voice packets, and an algorithm is needed to counteract any time asynchrony. For example, voice can be distorted because, to listen to all of the participants at the same time, the voice packets must be mixed and then played back. If the playback is not sufficiently smooth because of network jitter, a degradation of the voice quality is observed. Degradation of voice quality occurs when either one packet from a specific source arrives late and is not played along with the rest, or a packet arrives out of sequence (if an unreliable transport protocol, for example, the User Datagram Protocol (UDP), is used).

We now identify most of the issues related to interparticipant synchronization, which will help us put the problem into the right perspective:

1. The delay for each packet may differ from one source to another, but as long as the difference is preserved, the quality of service will not be affected.

2. Even though a number of participants begin to send their packets in the same period T, phase offsets of their starting times are inevitable due to the lack of perfect synchronization among the participants.

3. Packets belonging to the same set will not be received at once, but most likely in a time window with a length influenced by the phase offsets of the sources, the frequency of the clocks, and the network jitter.

4. If interparticipant synchronization at a time instance is achieved, and the exact frequency of the clocks of all the sources is known to the receiver, synchronization can be maintained for the rest of the time.

2.1 Summary of the synchronization algorithm

In this paper, the maximum error allowable in the synchronization of packets from the various sources is designated $\epsilon_T$. As shown later, the error $\epsilon_T$ is composed of two components: the error $\epsilon_R$ due to the finite number of packets used to estimate the reference times, and the error $\epsilon_\tau$ due to the time offset acquired in the interval $NT$. We now give a summary of the synchronization algorithm.

The main algorithm. The following steps are executed for each new set of $N$ packets received from each source $i$. We show that synchronization is achieved for the next $N$ packets received.

1. Determine the reference time and the maximum and minimum values of the network jitter ($\Delta^i_{\max}$, $\Delta^i_{\min}$) for each source $i$.

2. Divide the time at the receiver into equally spaced intervals of duration T. We refer to these intervals as the reference intervals, and packets with reference times that fall into the same reference interval belong to the same set.

3. Determine the additional time (minimum waiting time) after the end of the current reference interval that the receiver has to wait to receive one packet from each source, i.e., to receive all packets belonging to the current set. The waiting time for each set is the time measured from the end of its reference interval to its playback time.

4. Group the packets according to their reference times. Wait for the time interval found in step 3, and then play them back together.
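Steps 2 and 4 above can be sketched as a simple bucketing of reference times into intervals of duration T. This is only an illustration under stated assumptions: the function name, the `origin` parameter (the start of the first reference interval on the receiver clock), and the sample reference times are hypothetical, and the waiting time of step 3 is not modelled.

```python
import math
from collections import defaultdict


def group_by_reference_interval(ref_times, period, origin=0.0):
    """Assign each source to the reference interval containing its reference time.

    ref_times: dict source_id -> estimated reference time of its current packet.
    Returns dict interval_index -> sorted list of source ids (one playback set each).
    """
    sets = defaultdict(list)
    for source, t_ref in ref_times.items():
        index = math.floor((t_ref - origin) / period)
        sets[index].append(source)
    return {index: sorted(members) for index, members in sets.items()}


# Hypothetical reference times (ms) with T = 33 ms.
print(group_by_reference_interval({"A": 40.0, "B": 65.9, "C": 70.0}, period=33.0))
```

Sources A and B fall into the same reference interval and would be mixed together; source C belongs to the next set.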

Enhancement of the algorithm. If steps 1-4 are repeated $2k$ times, $2kN$ packets are received from each source. This additional information can be used to improve the synchronization algorithm. The steps are:

1. Estimate the frequency offsets between the clock at each source $i$ and the clock at the receiver.

2. Determine the prediction interval, which is defined as the interval after the time $2kNT$ during which the error in the estimation of the frequency offsets will be less than the error $\epsilon_R$.

3. Estimate the time it takes for the reference time of each source $i$ to move out of its reference interval (time $\eta^i T$). This time must be shorter than the prediction interval for the error of the frequency offset estimation to be less than $\epsilon_R$. This is the time instance at which packets from source $i$, upon arrival at the receiver, are played back with a different set of packets, either one ahead or one behind.

During the conference set-up time, we can allow the error to be larger than $\epsilon_R$. As will be shown later in Sect. 4, by allowing the larger error, the number of packets $N$ needed for a certain level of accuracy is decreased geometrically. An alternative scheme is to transmit with a period shorter than $T$ to achieve better accuracy.

2.2 Reference-time estimation

Since the reference time plays a vital role in our synchronization algorithm, we first focus on finding the best estimate of the reference time. The reference time is defined as the average arrival time of the packets at the receiver from a multimedia source. Reference times are important because they are used in determining whether packets from various sources belong to the same set for mixing. They have the characteristic of being relatively immune to network jitter, provided the averaging includes a sufficient number of packets. To estimate the reference time $\hat{t}^i_{\mathrm{ref},n}$ for a particular source $i$ (hereafter, estimated values will appear with a ^), let the arrival time $t^i_{\lceil N/2 \rceil}$ of the $\lceil N/2 \rceil$th packet be the initial guess of the reference time for this source. Because $N$ can be either even or odd, we use the ceiling notation, i.e., the smallest integer greater than or equal to the argument. We show, later in this section, that the choice of the $\lceil N/2 \rceil$th packet arrival time $t^i_{\lceil N/2 \rceil}$ in Eq. 5 results in cancellation of the time offsets, and therefore, the proof is valid even for this case.

Until the $\lceil N/2 \rceil$th packet arrives, we simply register the arrival times $t^i_n$, $1 \le n \le \lceil N/2 \rceil - 1$, of the previously received packets. When all $N$ packets have been received, we measure the time differences between the arrival times of the rest of the packets from this source and the set of points defined by

$$ t^i_{\lceil N/2 \rceil} + \left( n - \lceil N/2 \rceil \right) T^R, \quad 1 \le n \le N. $$

After all the differences are added up, and the entire sum is divided by the total number of the received packets $N$, we estimate the time $\hat{\Delta}^i_{\lceil N/2 \rceil}$ by which our initial guess $t^i_{\lceil N/2 \rceil}$ differs from the actual reference time, which is given by

$$ \hat{\Delta}^i_{\lceil N/2 \rceil} = \frac{1}{N} \sum_{n=1}^{N} \left[ t^i_n - \left( t^i_{\lceil N/2 \rceil} + \left( n - \lceil N/2 \rceil \right) T^R \right) \right]. \tag{5} $$

A detailed proof of Eq. 5 and the reason why the packet arrival time $t^i_{\lceil N/2 \rceil}$ is used and not any other packet arrival time is given in Appendix A.

With $\hat{\Delta}^i_{\lceil N/2 \rceil}$ available, the estimated reference time for source $i$ is obtained as

$$ \hat{t}^i_{\mathrm{ref},n} = t^i_{\lceil N/2 \rceil} + \hat{\Delta}^i_{\lceil N/2 \rceil} + \left( n - \lceil N/2 \rceil \right) T^R. \tag{6} $$

The maximum and minimum bounds of the network jitter can also be obtained as

$$ \hat{\Delta}^i_{\max} = \max_n \left\{ t^i_n - \hat{t}^i_{\mathrm{ref},n} \right\}, \quad 1 \le n \le N $$

and

$$ \hat{\Delta}^i_{\min} = \min_n \left\{ t^i_n - \hat{t}^i_{\mathrm{ref},n} \right\}, \quad 1 \le n \le N. $$

We cannot assume that the maximum jitter is known a priori for a given protocol because network jitter is affected by random network traffic and the routes taken. Obviously a trade-off exists between the accuracy of the algorithm and the number of the packets taken into consideration. We will elaborate on this trade-off later in Sect. 4.
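The estimation procedure of this subsection can be sketched as follows, under explicit assumptions: the middle packet's arrival time is taken as the initial guess, the correction is the mean difference between the arrivals and the grid anchored at that guess, and the sign convention (correction added to the guess) is one reading of the text, not necessarily the authors' exact formulation. The function name and sample data are hypothetical.

```python
import math


def estimate_reference_times(arrivals, period_r):
    """Estimate reference times from N arrival times of one source.

    arrivals: arrival times t_1..t_N on the receiver clock.
    period_r: nominal period T^R on the receiver clock.
    Returns (ref_times, jitter_min, jitter_max).
    """
    n_packets = len(arrivals)
    mid = math.ceil(n_packets / 2)      # 1-based index of the ceil(N/2)-th packet
    guess_mid = arrivals[mid - 1]       # initial guess for the reference time
    # Mean difference between the arrivals and the grid anchored at the guess.
    correction = sum(
        t - (guess_mid + (n - mid) * period_r)
        for n, t in enumerate(arrivals, start=1)
    ) / n_packets
    refs = [guess_mid + correction + (n - mid) * period_r
            for n in range(1, n_packets + 1)]
    residuals = [t - r for t, r in zip(arrivals, refs)]
    return refs, min(residuals), max(residuals)


# Hypothetical arrivals (ms): ideal grid 0, 33, ..., 132 plus jitter 2, -2, 2, 0, 2.
refs, j_min, j_max = estimate_reference_times([2.0, 31.0, 68.0, 99.0, 134.0], 33.0)
print(refs, j_min, j_max)
```

By construction the residuals average to zero over the N packets, which is the finite-sample counterpart of the property used to define the reference times.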

2.3 Minimum waiting time for playback

The minimum time window required for the reception of one packet from each source (i.e., 1 packet from source 1, 1 packet from source 2 , . . . , and 1 packet from source M) generating packets with period T in the absence of network jitter is T. This is due to the phase differences among the sources. Rangan et al. (1993) state that the smallest time window during which all M sources are guaranteed to generate packets is T T 2 M - 1 •

However, in practice, due to the frequency drifts of the clocks, we will see that this minimum time window requires constant movement at the receiver, which is undesirable because of the complexity in implementing the algorithm. This is where the reference time is useful: it is relatively immune to network jitter and therefore indicates with high probability whether a received packet belongs to the set with a reference time within a time window. Packets with reference times belonging to the same time window $T$ are assigned to the same group (which Rangan et al. call the fusion set) when played back at the receiver. In contrast to the reference time, the actual packet arrival time varies due to network jitter. A packet from source $i$ with its reference time at the end of the time window $T$ and having a maximum jitter $\Delta^i_{\max}$ might arrive as late as $T + \Delta^i_{\max}$. The minimum waiting time to receive one packet from each source can be shown to be

$$ W_{\min} = T + \max_i \left\{ \hat{\Delta}^i_{\max} \right\} + \epsilon_R + N \max_s \left\{ |T^s - T^R| \right\}, \quad 1 \le i \le M \tag{9} $$

where $s$ ranges over all possible sources in a given network. The fourth term, $N \max_s \{ |T^s - T^R| \}$, is due to the maximum time offset we expect from the clocks in a particular situation. In any real network there are mechanisms that readjust the time offsets of each clock to the time-host of the network, e.g., the UNIX rdate command. This possibility should be considered for any actual implementation of our algorithm, but for the sake of clarity we do not consider the implications of this problem in this paper. The total error, given as a design specification as noted before, is distributed in two components: the time-offset error $\epsilon_\tau = N \max_s \{ |T^s - T^R| \}$ and the estimation error $\epsilon_R$. The proportion between these two components is determined by the designer of the algorithm to suit the target environment. For convenience, we define the rate of the time offset as

$$ \tau^i_{\mathrm{off}} = \frac{T^i - T^R}{T^R}. \tag{10} $$

The error due to the (as yet) unknown time offset will be removed from Eq. 9 when the prediction algorithm described in Sect. 3 is applied.
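A small sketch of the waiting-time expression of Eq. 9, read as the sum of the period, the largest estimated maximum jitter, the estimation error, and the worst-case accumulated time offset. The function name and the input values are illustrative assumptions; the numbers loosely follow the workstation example given later in the paper (T = 33 ms, 10 ms jitter, 2 ms estimation error, N = 54).

```python
def min_waiting_time(period, jitter_max_by_source, eps_r, n_packets, period_offsets):
    """Eq. 9: W_min = T + max_i(jitter_max^i) + eps_R + N * max_s |T^s - T^R|.

    period_offsets: the differences T^s - T^R for the sources considered.
    All times must use the same unit.
    """
    return (period
            + max(jitter_max_by_source)
            + eps_r
            + n_packets * max(abs(d) for d in period_offsets))


# Values in ms; an offset rate of 1/1800 corresponds to |T^s - T^R| = 33/1800 ms.
w_min = min_waiting_time(33.0, [10.0, 7.5], 2.0, 54, [33.0 / 1800, -33.0 / 3600])
print(w_min)
```

The last term contributes roughly 1 ms here, which matches the share of the error budget assigned to the time offset in that example.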

As shown in Fig. 3, the minimum waiting time consists of two intervals: interval AB (= T) and interval BC. Interval AB is due to the phase offset and is the interval that is often referred to as the reference interval, and interval BC is a function of both the network variation in the delay and the frequency differences of the clocks between sender and receiver. Note that interval BC overlaps with the next interval BD. A packet with a reference time in interval AB that arrives in interval BC will be played back at time C. However, a packet with a reference time in the interval BD that arrives in interval BC will be played back at time E.

3 Prediction of resynchronization interval

This section constitutes an enhancement to the interparticipant synchronization algorithm described in Sect. 2. The algorithm described so far can work by itself, synchronizing packets from various sources every $NT$ seconds. The only drawback of the algorithm described in Sect. 2 is that we do not have exact knowledge of the frequency drifts of the clocks yet, and therefore the worst-case scenario must be assumed in the determination of the time offsets in Eq. 9.

As time progresses, more and more packets arrive, improving the accuracy of the algorithm and making it possible to use the methodology described in this section for estimating the frequency differences between the clocks at the sources and the clock at the receiver. Knowing the frequencies, how the reference times move as a function of time can be determined. This, in effect, means knowledge of the time instance at which a particular reference time will move out of its reference interval and therefore fall into another group or fusion set. Since this is the function of the interparticipant synchronization algorithm, the algorithm itself becomes redundant once the frequencies of the clocks are known. As noted in Sect. 2, statement 4, once synchronization is obtained at some time instance and accurate knowledge of the frequencies of every clock is available to the receiver, synchronization can be maintained for the rest of the time.

Fig. 1. Multiparty conference with M + 1 participants

Fig. 2. Determination of network jitter $\Delta^i_n$ for each packet $n$ sent from source $i$

Fig. 3. Minimum waiting time

After $N$ packets from a source have been received, the reference time for this source is estimated. If the process of estimating the reference time is repeated $2k$ times, the average value of $\hat{t}^i_{\mathrm{ref}}$ is evaluated for two intervals, one from 0 to the $k$th run and another from the $(k+1)$th to the $2k$th run. As a result, the error of the estimated reference time relative to the true reference time for these two intervals will be reduced by a factor of $\sqrt{k}$. A justification for this is given in Appendix B.

Since what we do is an averaging over $kN$ packets, the obtained value of the reference time is true for the two time instances $kNT/2$ and $3kNT/2$. These are the middle points of the two intervals $[0, kNT]$ and $[kNT + T, 2kNT]$, respectively. The reason is that the accumulated contribution of the time offsets in estimating the reference time is zero only in these two time instances.

Hence, the actual reference time equals the measured reference time $\hat{t}^i_{\mathrm{ref},n}$ plus or minus an error less than $\epsilon_R/\sqrt{k}$, i.e., $|t^i_{\mathrm{ref},n} - \hat{t}^i_{\mathrm{ref},n}| \le \epsilon_R/\sqrt{k}$, where $n$ is either $kN/2$ or $3kN/2$. Therefore, the maximum error we make in estimating the accumulative time offset $kN(T^i - T^R)$ with $kN(\hat{T}^i - T^R)$ during the interval $[kNT/2, 3kNT/2]$ is

$$ \left| kN(T^i - T^R) - \Delta\hat{T}^i \right| = kN \left| T^i - \hat{T}^i \right| \le \frac{2\epsilon_R}{\sqrt{k}} \tag{11} $$

where $\Delta\hat{T}^i = \hat{t}^i_{\mathrm{ref},3kN/2} - \hat{t}^i_{\mathrm{ref},kN/2}$ is the estimated accumulative time offset and $kN(T^i - T^R) = t^i_{\mathrm{ref},3kN/2} - t^i_{\mathrm{ref},kN/2}$ is the actual time offset, i.e., the amount of time the reference time has moved during the interval. The estimated rate of time offset $\hat{\tau}^i_{\mathrm{off}} = (\hat{T}^i - T^R)/T^R$ of each source $i$ can now be determined as

$$ \hat{\tau}^i_{\mathrm{off}} = \frac{kN(\hat{T}^i - T^R)}{kNT^R} = \frac{\Delta\hat{T}^i}{kNT^R}. \tag{12} $$

Our objective is to find the future time instance $\lambda NT$ at which the error will reach the design specification error $\epsilon_R$. Imagine a straight line drawn from the time instance $kNT/2$ to the time instance $3kNT/2$. At these two instances, the maximum error we make in estimating the reference time $\hat{t}^i_{\mathrm{ref},n}$ is less than $\pm\epsilon_R/\sqrt{k}$. Therefore, the maximum tangent this line can have compared to the line that introduces no error is $(2\epsilon_R/\sqrt{k})/kNT$, as illustrated in Fig. 4. Observe that the real meaning of the slope is the maximum error we make in estimating the rate of the time offset, i.e.,

$$ \left| \hat{\tau}^i_{\mathrm{off}} - \tau^i_{\mathrm{off}} \right| \le \frac{2\epsilon_R}{\sqrt{k}\,kNT^R}. \tag{13} $$

It is important to note that the slope derived from Fig. 4 is independent of the individual rates of the time offsets of the clocks, $\tau^i_{\mathrm{off}} = (T^i - T^R)/T^R$. The rate of the time offset for a specific source $i$ might be ten times larger than the rate of the time offset for source $j$, but the absolute error in determining the rates is the same for both sources.

To compute $\lambda$ we first note that, at time $3kNT/2$, we might start with an error in our estimation as large as $\epsilon_R/\sqrt{k}$ and second, that we start predicting only after $kN/2$ packets have been received. Therefore, the following equation holds:

$$ \frac{\epsilon_R}{\sqrt{k}} + \frac{2\epsilon_R}{\sqrt{k}\,kNT} \left( \lambda NT - \frac{3kNT}{2} \right) \le \epsilon_R. \tag{14} $$

For a meaningful value of $k$, such that $\lambda > 2k$, solving Eq. 14, we get $\lambda = \frac{\sqrt{k} + 2}{2}\,k$.
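The two quantities derived in this passage can be sketched as below. Both functions encode one reading of the garbled source equations and should be treated as assumptions: the rate of the time offset is taken as the shift of the reference time between the two interval midpoints divided by $kNT^R$, and the prediction horizon as $\lambda = (\sqrt{k} + 2)k/2$. The function names and sample values are hypothetical.

```python
import math


def estimated_offset_rate(ref_first_half, ref_second_half, k, n_packets, period_r):
    """Eq. 12 as read here: movement of the reference time between the two
    interval midpoints, divided by the k*N*T^R separating them.
    The two reference times are positions within the reference interval."""
    return (ref_second_half - ref_first_half) / (k * n_packets * period_r)


def prediction_horizon(k):
    """lambda from Eq. 14 as read here: lambda = (sqrt(k) + 2) / 2 * k."""
    return (math.sqrt(k) + 2.0) / 2.0 * k


# Hypothetical case: the reference time moved 0.5 ms, with k = 9, N = 54, T^R = 33 ms.
print(estimated_offset_rate(12.0, 12.5, k=9, n_packets=54, period_r=33.0))
print(prediction_horizon(9), prediction_horizon(4))
```

With this reading, $\lambda > 2k$ requires $k > 4$: for $k = 4$ the horizon equals $2k$ and leaves no prediction interval, while $k = 9$ gives $\lambda = 22.5$.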

Figure 5 presents a simple example to clarify the implementation of the prediction algorithm presented in this section. We assume that the receiver accepts packets from three sources. Suppose that at times $(1/2)kNT$ and $(3/2)kNT$ the reference time of each source is as shown in Fig. 5. Specifically, at time $(3/2)kNT$ let the distances of each reference time to the boundary between reference interval $n$ and reference interval $n + 1$ be $\delta^1 T$ for source 1, $(1 - \delta^2)T$ for source 2, and $\delta^3 T$ for source 3. The time offsets $\Delta\hat{T}^i$, which represent the amount of time the reference time of each source $i$ has moved during the interval $[(1/2)kNT, (3/2)kNT]$, are indicated as $\Delta T^i$ in the figure. Using Eq. 12, the estimated rates of the time offsets $\hat{\tau}^i_{\mathrm{off}}$ are determined. Some of the clocks at the sources (sources 1 and 3 in the example) run faster than the receiver clock, whereas the rest of them (source 2) run slower. The interval $\eta^i T$ taken from time $2kNT$ as shown in Fig. 4, where at its end the reference time from source $i$ moves out of the current reference interval, can now be determined as the solution of the following equation:

$$ \eta^i = \frac{\delta^i}{|\hat{\tau}^i_{\mathrm{off}}|} - \frac{kN}{2}, \quad 0 \le \eta^i \le (\lambda - 2k)N. \tag{15} $$

Finally, we see that the minimum waiting time for playback after the execution of the prediction algorithm does not have the term $N(T^i - T^R)$, as already noted in Sect. 2.3, and Eq. 9 becomes

$$ W_{\min} = T + \max_i \left\{ \hat{\Delta}^i_{\max} \right\} + \epsilon_R, \quad 1 \le i \le M. \tag{16} $$
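The regrouping time of the example can be sketched as follows. The sketch assumes that the distance to the boundary is expressed as a fraction of T measured toward the boundary the reference time is drifting to, that the drift covers that distance at the magnitude of the estimated rate per period, and that prediction starts kN/2 periods after the measurement instant. The function name, its parameters, and the sample values are hypothetical.

```python
def intervals_until_regroup(delta, offset_rate, k, n_packets, lam):
    """Eq. 15 as read here: eta = delta / |rate| - k*N/2 (in periods),
    accepted only if 0 <= eta <= (lam - 2k) * N.

    delta: distance to the interval boundary as a fraction of T.
    Returns None when the crossing falls outside the prediction interval.
    """
    if offset_rate == 0:
        return None  # no drift: the reference time never leaves its interval
    eta = delta / abs(offset_rate) - k * n_packets / 2.0
    upper = (lam - 2 * k) * n_packets
    return eta if 0.0 <= eta <= upper else None


# Hypothetical values: rate 1/1800, k = 9, N = 54, lambda = 22.5.
print(intervals_until_regroup(0.25, 1.0 / 1800, 9, 54, 22.5))
print(intervals_until_regroup(0.50, 1.0 / 1800, 9, 54, 22.5))
```

In the first call the boundary is crossed about 207 periods after time 2kNT, inside the prediction interval of 243 periods; in the second the crossing lies beyond the prediction interval, so no regrouping is scheduled.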

3.1 Performance analysis of the frequency offset estimation method

In this subsection, a performance analysis of our approach in estimating the rate of the time offset as described in Fig. 4 and Eqs. 11-13 is provided. Specifically, after the mean square error is determined (Appendix C), we show that our method produces nearly optimal results. The algorithm is nearly optimal in the sense that its mean square error is only about 25% greater than the minimum mean square error determined by the Cramer-Rao bound.

The expression for the mean square error when only $N$ packets are received is shown in Appendix C to be

$$ E(\epsilon^2) = \frac{16\,\sigma_\Delta^2}{N^3}. \tag{17} $$

Wolf and Schwartz (1990) use the Cramer-Rao bound (see also Sorenson 1980) to show that the best estimation of the frequency offset (or rate of the time offset) one can have yields a minimum mean square error of

$$ E(\epsilon^2)_{\mathrm{best}} = \frac{12\,\sigma^2}{N(N+1)(N+2)}. \tag{18} $$

The validity of Eq. 17 is also demonstrated with simulations as shown in Fig. 6. In Fig. 6 we have plotted both the theoretical results from Eq. 17 and simulations for two cases: (a) the probability density function (pdf) of the $\Delta^i_n$ is normal with a mean of zero and a variance of one, and (b) the pdf of the $\Delta^i_n$ is uniform with a mean of zero and a variance of one. In addition, a plot of the optimum mean square error is included for comparison purposes. Our method provides a computationally cost-effective (actually, 2k + 2 additions and 3 divisions) means to approach optimum results very closely. To attempt to get any closer to the optimum solution would require much additional calculation (related to the evaluation of the pseudo-inverse matrix) and would be impractical.
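The kind of simulation described above can be sketched as a small Monte Carlo experiment. This is not the authors' simulation: it assumes that the estimator is the difference between the mean jitter of the second and first halves of N packets divided by the N/2 periods separating the two midpoints, with T normalized to 1, zero true offset, and normal jitter. Under these assumptions the variance of the estimator is $16\sigma^2/N^3$. All names and the trial count are illustrative.

```python
import random


def two_half_rate_error(n_packets, sigma, rng):
    """One trial: rate error of the two-half-averages estimator with zero true
    offset, T = 1, and i.i.d. normal jitter of standard deviation sigma."""
    half = n_packets // 2
    jitter = [rng.gauss(0.0, sigma) for _ in range(2 * half)]
    first = sum(jitter[:half]) / half
    second = sum(jitter[half:]) / half
    # The midpoints of the two halves are `half` periods apart.
    return (second - first) / half


def simulated_mse(n_packets, sigma=1.0, trials=20000, seed=1):
    """Monte Carlo estimate of the mean square rate error."""
    rng = random.Random(seed)
    return sum(two_half_rate_error(n_packets, sigma, rng) ** 2
               for _ in range(trials)) / trials


def two_half_mse(n_packets, sigma=1.0):
    """Variance of the two-half estimator under the assumptions above."""
    return 16.0 * sigma ** 2 / n_packets ** 3


def cramer_rao_mse(n_packets, sigma=1.0):
    """Lower bound in the form 12*sigma^2 / (N (N+1) (N+2))."""
    return 12.0 * sigma ** 2 / (n_packets * (n_packets + 1) * (n_packets + 2))


for n in (20, 60, 100):
    print(n, simulated_mse(n), two_half_mse(n), cramer_rao_mse(n))
```

The simulated values should track the closed-form variance closely, and both stay above the Cramer-Rao expression for every N.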

4 Performance bounds in reference time estimation

4.1 Estimation error in the reference time

The error in estimating the reference time is due to the finite number of packets $N$ received at the receiver. Bai et al. (1994), Field and Znati (1991), and Papadopoulos and Parulkar (1993) show that the pdf of packet arrivals is approximately normal or Rayleigh and that $\Delta^i_{\min} \le \Delta^i_{\max}$. The pdf of packet arrivals is shown to be approximately normal when the message size is relatively small, while it tends to be Rayleigh when the message size is large (approximately 24 KB) (Bai et al. 1994).

We will derive an upper bound on the probability of error of our estimation. In Fig. 7a, a bell-shaped pdf is depicted to model the distribution of real packet arrival times. In this model it is evident that a pdf of realistic packet arrival times has a smaller variance than a uniform density function in the interval $[-\Delta^i_{\max}, \Delta^i_{\max}]$. The average value of the uniform density is the same average value estimated for the actual density, and its variance is $\sigma_\Delta^2 = (\Delta^i_{\max})^2/3$. A loose upper bound for the probability of error is obtained by Chebyshev's inequality, which is based solely on the variance of the pdf. Cover and Thomas (1991) show that the uniform density function exhibits a larger entropy than any other pdf. If we assume that the packet arrivals are independent, then the entropies are added, which means that a random variable constructed as a sum of independent identically distributed (i.i.d.) and uniform random variables has the largest entropy. However, a random variable with the largest entropy implies the largest uncertainty. Therefore, the assumption of a uniform density function of the packet arrivals as shown in Fig. 7b leads to an upper bound in the estimation error of the reference time that is greater than in the case of a realistic pdf of packet arrivals.

Fig. 4. Schematic diagram of the prediction algorithm

Fig. 5. Example for calculating the future times when reference times move out of their reference interval T

Fig. 6. The mean square error of the rate of the time offset estimation method ($\tau^i_{\mathrm{off}} = (T^i - T^R)/T^R$) versus the number of packets N; simulations are shown for uniformly and normally distributed $\Delta_n$

As an example, let us assume that real packet arrivals follow a normal density function. The same procedure can be followed for the case in which the packet arrivals follow the Rayleigh pdf and for any other bell-shaped density function. We use the Chernoff bound, assume statistical independence in the arrival time of the packets, and see that if $N$ packets are received, the probability that the average value lies outside a confidence interval $\epsilon_R$ will be (Wozencraft and Jacobs 1990):

$$ P(\epsilon_R) = P\left( \left| \hat{t}^i_{\mathrm{ref},n} - t^i_{\mathrm{ref},n} \right| \ge \epsilon_R \right) = 2P\left( \frac{1}{N} \sum_{n=1}^{N} \Delta^i_n \ge \epsilon_R \right) \le \begin{cases} \left[ e^{-\lambda_0 \epsilon_R}\, E_N\!\left( e^{\lambda_0 \Delta^i_n} \right) \right]^N = e^{-\lambda_0 \epsilon_R N/2}, & \text{normal} \\ \left[ e^{-\lambda_0 \epsilon_R}\, E_U\!\left( e^{\lambda_0 \Delta^i_n} \right) \right]^N = \left[ e^{-\lambda_0 \epsilon_R}\, \dfrac{\sinh(\lambda_0 \Delta^i_{\max})}{\lambda_0 \Delta^i_{\max}} \right]^N, & \text{uniform} \end{cases} \tag{19} $$

where $\lambda_0$ is given as the solution [with the help of tables of integrals given in Gradsteyn and Ryzhik (1980)] of the following equation:

$$ \frac{E\left[ \Delta^i_n e^{\lambda_0 \Delta^i_n} \right]}{E\left[ e^{\lambda_0 \Delta^i_n} \right]} = \epsilon_R \;\Longrightarrow\; \lambda_0 = \begin{cases} \dfrac{\epsilon_R}{\sigma_\Delta^2}, & \text{normal} \\ \dfrac{\tanh(\lambda_0 \Delta^i_{\max})}{\Delta^i_{\max} - \epsilon_R \tanh(\lambda_0 \Delta^i_{\max})}, & \text{uniform} \end{cases} \tag{20} $$

and $E_N$, $E_U$ are the expectation operators under the normal and uniform density functions, respectively. For the uniform pdf, $\sigma_\Delta^2 = \frac{1}{3}(\Delta^i_{\max})^2$. $P(\epsilon_R)$ signifies the probability that the estimated reference time differs from the actual reference time by more than $\epsilon_R$. Equation 20 for the uniform pdf is to be solved numerically, and for the values of interest 0.05, 0.1, and 0.2 of the error $\epsilon_R/\Delta^i_{\max}$, the solutions $\lambda_0$ are 0.15, 0.3, and 0.6, respectively. The value of $\Delta^i_{\max}$ is taken as 1 in all cases.

For comparison purposes between the normal, the uniform, and the Rayleigh density functions, we chose the same $\Delta^i_{\max} = 1$. The variance is chosen as $\sigma_\Delta^2 = 0.2$, such that both the sum of the probabilities outside $\Delta^i_{\max} = 1$ in the normal function is negligible, and the variance has the same value as that for the Rayleigh density function with its probability of error plotted in Fig. 8b. Then, $\epsilon_R$ is a fraction of $\Delta^i_{\max} = 1$. Therefore, from Eq. 20 for the normal pdf and for the values of interest 0.05, 0.1, and 0.15 of the error $\epsilon_R/\Delta^i_{\max}$, the solutions $\lambda_0$ are 0.25, 0.5, and 0.75, respectively.
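The numerical values quoted above can be reproduced with a short sketch. It rests on assumptions that should be checked against the original equations: for normal jitter the optimizing parameter is taken as $\lambda_0 = \epsilon_R/\sigma_\Delta^2$ and the bound as $e^{-\lambda_0 \epsilon_R N/2}$; for uniform jitter on $[-\Delta_{\max}, \Delta_{\max}]$ the parameter solves $\Delta_{\max}\coth(\lambda_0 \Delta_{\max}) - 1/\lambda_0 = \epsilon_R$ and the per-packet factor is $e^{-\lambda_0 \epsilon_R}\sinh(\lambda_0 \Delta_{\max})/(\lambda_0 \Delta_{\max})$. Function names are hypothetical, and bisection is used here only as one convenient root finder.

```python
import math


def lambda0_normal(eps, variance):
    """Normal jitter: lambda_0 = eps / variance."""
    return eps / variance


def lambda0_uniform(eps, delta_max=1.0):
    """Uniform jitter on [-delta_max, delta_max]: solve
    delta_max / tanh(l * delta_max) - 1 / l = eps by bisection (eps < delta_max)."""
    def g(l):
        return delta_max / math.tanh(l * delta_max) - 1.0 / l - eps

    lo, hi = 1e-9, 50.0 / delta_max
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chernoff_normal(eps, variance, n_packets):
    """Bound for normal jitter in the form exp(-lambda_0 * eps * N / 2)."""
    l0 = lambda0_normal(eps, variance)
    return math.exp(-l0 * eps * n_packets / 2.0)


def chernoff_uniform(eps, n_packets, delta_max=1.0):
    """Bound for uniform jitter: [exp(-l0*eps) * sinh(l0*d) / (l0*d)]^N."""
    l0 = lambda0_uniform(eps, delta_max)
    mgf = math.sinh(l0 * delta_max) / (l0 * delta_max)
    return (math.exp(-l0 * eps) * mgf) ** n_packets


for eps in (0.05, 0.1, 0.2):
    print(eps, round(lambda0_uniform(eps), 3))
print(chernoff_normal(0.1, 0.2, 120), chernoff_uniform(0.1, 200))
```

The uniform-case solutions come out close to the rounded values 0.15, 0.3, and 0.6 given in the text, and with an error of 0.1 both bound expressions fall to about 0.05 near N = 120 (normal) and N = 200 (uniform).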

4.2 Numerical examples

The interparticipant synchronization algorithm introduced in this paper must be executed every $NT$ seconds. We have not yet provided the reader with a specific value of $N$ because $N$ changes according to the application and the network environment. However, we use the theory developed in the previous section to provide explicitly the trade-offs among the three parameters: probability of error, confidence interval $\epsilon_R$, and the number of packets $N$. Depending on the application (how accurate we really need to be), and on the network environment (how large the frequency drifts of the clocks and the network jitter are), the designer can choose the value of $N$ suited for the application.

We give a realistic example. Assume workstations with voice and video capabilities. Since the speed of the video boards is 30 frames/s, the transmission rate of the voice packets is also assumed to be 30 packets/s, or 1 packet per 33 ms (= T). In our lab, the pair of workstations with video boards attached have a difference in clock rates that causes a time shift of about 1 s per 30 min, or $(T^i - T^R)/T^i = (30 \times 60)^{-1}$. The maximum value of jitter that we observed for the case of UDP is about 10 ms (= $\Delta_{\max}$). If the tolerance desired is less than 3 ms and the error is distributed as 1 ms for the time offset and 2 ms for the estimation error, then by allowing 1 ms of time shift in the interval $NT$, $NT$ must equal

$$NT = \frac{\Delta T_{\mathrm{off}}}{(30 \times 60)^{-1}} = 1\ \text{ms} \times (30 \times 60) = 1.8\ \text{s}.$$

This means that N is 54 (1.8 × 30). With N = 54, as shown later in this section, the estimation error is much less than 2 ms (or $0.2\,\Delta_{\max}$ = 0.2 × 10 ms) for a probability of error $P(e) = 0.05$.

The values of $\lambda_0$ are also derived numerically for the Rayleigh density with the parameter b = 0.1 and for the uniform density. These are plotted versus the error $\varepsilon_R$ (= $\varepsilon/\Delta_{\max}$) in Fig. 8a. In Fig. 8b we have plotted the theoretical upper bounds of the probability of error P(e) derived from the Chernoff bound for the uniform, normal, and Rayleigh density functions. As shown in Fig. 8b, with $\varepsilon_R$ = 0.1, P(e) = 0.05 can be achieved with N = 120 when the Rayleigh or the normal density function is used, and with N = 200 when the uniform density function is used. However, we notice a dramatic decrease in the probability of error when the confidence interval is allowed to be $\varepsilon_R$ = 0.15. P(e) = 0.05 can now be achieved with only 25 packets when the Rayleigh density function is used. This is a very important result, especially for the connection or set-up time for a conference.
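The packet counts quoted for the normal and uniform densities can be checked numerically. The sketch below assumes the standard one-sided Chernoff bound for the sample mean of N i.i.d. zero-mean jitter values, $P(e) \le \exp\{N \min_\lambda[-\lambda\varepsilon + \ln E(e^{\lambda\Delta})]\}$; Eqs. 19 and 20 are not reproduced in this section, so this form, the function names, and the ternary search are assumptions rather than the authors' procedure. The Rayleigh case is omitted because its simulation model is not specified here.

```python
import math


def log_mgf_normal(lam, variance):
    """ln E[exp(lam * X)] for a zero-mean normal X."""
    return 0.5 * variance * lam * lam


def log_mgf_uniform(lam, delta_max):
    """ln E[exp(lam * X)] for X uniform on [-delta_max, delta_max]."""
    x = lam * delta_max
    if abs(x) < 1e-8:
        return 0.0  # limit of ln(sinh(x)/x) as x -> 0
    return math.log(math.sinh(x) / x)


def chernoff_exponent(eps, log_mgf, lam_hi=50.0, iterations=200):
    """Minimise g(lam) = -lam*eps + log_mgf(lam) on [0, lam_hi].

    g is convex, so a ternary search suffices. Returns (lam0, g(lam0)).
    """
    def g(lam):
        return -lam * eps + log_mgf(lam)

    lo, hi = 0.0, lam_hi
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    lam0 = 0.5 * (lo + hi)
    return lam0, g(lam0)


def min_packets(eps, log_mgf, p_error):
    """Smallest N with exp(N * g(lam0)) <= p_error (one-sided bound)."""
    _, exponent = chernoff_exponent(eps, log_mgf)
    return math.ceil(math.log(p_error) / exponent)


# eps_R = 0.1 with delta_max = 1, variance 0.2 for the normal case.
lam0_normal, _ = chernoff_exponent(0.1, lambda lam: log_mgf_normal(lam, 0.2))
n_normal = min_packets(0.1, lambda lam: log_mgf_normal(lam, 0.2), 0.05)
n_uniform = min_packets(0.1, lambda lam: log_mgf_uniform(lam, 1.0), 0.05)
print(lam0_normal, n_normal, n_uniform)
```

Under these assumptions the sketch gives N = 120 for the normal density and N = 200 for the uniform density at an error of 0.1, in line with the values quoted from Fig. 8b.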

Notice here that Eq. 19 only shows the upper bounds. To get a better estimate of the relation between the error $\varepsilon_R$ and N for a constant probability of error, we resort to simulations.

Fig. 7. a Example of a real probability density function, with the average value of the true probability density function marked. b The worst-case scenario: the uniform density in the interval $[-\Delta_{\max}, \Delta_{\max}]$, with height $1/(2\Delta_{\max})$

Fig. 8. a Numerical evaluation of $\lambda_0$ for different ranges of $\varepsilon_R$ (= $\varepsilon/\Delta_{\max}$). b Numerical evaluation of the probability of error P(e) derived from the Chernoff bound, using the $\lambda_0$ from a, for the uniform and the Rayleigh density, plotted versus the number of packets N. The $\lambda_0$'s for the normal density are 0.5 and 0.75 for $\varepsilon_R$ = 0.1 and 0.15, respectively

Fig. 9. Simulation results for three density functions: uniform, exponential, and Rayleigh with parameter b = 0.1. The confidence interval $\varepsilon_R$ is plotted versus the number of packets N

Fig. 10. Simulation models for exponential and Rayleigh density functions used in Fig. 9. Panel annotations: for the exponential density $e^{-x}$, $\Delta_{\max}$ is 4 times the average and the energy outside 5 is exp(-5); for the Rayleigh density $(x/b)\exp(-x^2/2b)$, the parameter chosen is b = 0.1

In Fig. 9 we present simulations for three density functions: uniform, exponential, and Rayleigh. The simulation models for the pdfs of the exponential and Rayleigh functions are shown in Fig. 10. For a constant probability of error P(e) = 0.05, the confidence interval $\varepsilon_R$ (presented here as a fraction of $\Delta_{\max}$) is plotted versus the number of packets N received. It is clearly seen from the figure that the uniform case is worse than the other two cases. Nevertheless, the results we get from simulations are much better than the ones for which the Chernoff bound is used. Specifically, for the Rayleigh case, a confidence interval of $0.1\,\Delta_{\max}$ can now be achieved with only 15 packets instead of the 120 required when the Chernoff bound is used for the same probability of error 0.05. If the transmission period of voice packets is 33 ms (or 30 packets/s), then a workable synchronization can be achieved within 0.5 s.
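A simulation of this kind can be sketched in a few lines. The version below is only illustrative: it assumes uniform jitter on $[-\Delta_{\max}, \Delta_{\max}]$, takes the error event to be that the magnitude of the sample mean exceeds the confidence interval, and reads the interval off the empirical quantile. The function name, trial count, and seed are hypothetical, and the sketch does not attempt to reproduce the curves of Fig. 9, whose models are those of Fig. 10.

```python
import random


def empirical_confidence_interval(n_packets, p_error=0.05, delta_max=1.0,
                                  trials=4000, seed=1):
    """Estimate eps_R such that P(|mean jitter| > eps_R * delta_max) ~ p_error.

    Jitter is drawn uniformly on [-delta_max, delta_max] (an assumption).
    """
    rng = random.Random(seed)
    abs_means = []
    for _ in range(trials):
        total = sum(rng.uniform(-delta_max, delta_max) for _ in range(n_packets))
        abs_means.append(abs(total / n_packets))
    abs_means.sort()
    # Empirical (1 - p_error) quantile of the absolute estimation error.
    index = min(trials - 1, int((1.0 - p_error) * trials))
    return abs_means[index] / delta_max


eps_15 = empirical_confidence_interval(15)
eps_60 = empirical_confidence_interval(60)
# Set-up time for N packets at one packet every 33 ms.
sync_time_s = 15 * 0.033
print(eps_15, eps_60, sync_time_s)
```

Quadrupling the number of packets roughly halves the interval, and 15 packets at one packet per 33 ms correspond to about 0.5 s of set-up time.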

5 Concluding remarks

In this paper we attacked the problem of synchronization among computers participating in a multimedia conference. More specifically, we were concerned with the synchronization of the packets/messages at the application level.

The core of the proposed Inter-participant Synchronization Algorithm is the concept and the estimation of the reference times. Once the reference times are estimated, the maximum jitter and the minimum waiting time can be determined, and consequently, synchronization is achieved for the next N packets. We then extended the synchronization algorithm by presenting a methodology for predicting the time/frequency offsets. The prediction algorithm improves the accuracy and further reduces the computational complexity of the algorithm.

Time stamping or sequence numbering associated with packets at the application level allows our algorithm to work even with unreliable transport mechanisms such as UDP. The synchronization algorithm has also been shown to encompass cases in which traffic sources transmit packets with different periods. The degree of accuracy of synchronization depends on the accuracy of the estimation of the reference times. With the pdf of packet arrivals modeled by Rayleigh, normal, and uniform density functions, an upper bound for the probability of error derived from the Chernoff bound is plotted with respect to the number of received packets N. For instance, by allowing an error of 0.15 of $\Delta_{\max}$, only 25 packets are required for P(e) = 0.05 with the Rayleigh model. With simulations, the same performance can be achieved with only 15 packets. This means that, at connection or set-up time, workable synchronization can be achieved in a very short time, approximately 1 s. This demonstrates the feasibility of the proposed algorithm.

Appendix A - Reference time

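In this appendix we show the validity of Eq. 5. Then we show why we have used the packet arrival time of the $\lceil N/2 \rceil$th packet, $t^i_{\lceil N/2 \rceil}$, rather than any other packet arrival time. For ease of reference, Eq. 5 is replicated here:

$$\Delta^i_{\lceil N/2 \rceil} = \frac{1}{N}\sum_{n=1}^{N}\left(t^i_{\lceil N/2 \rceil} + \left(n - \left\lceil \tfrac{N}{2} \right\rceil\right) T^R - t^i_n\right) \qquad \text{(A.1)}$$

We assume for a moment that there are no frequency offsets between the clocks at the sources and the clock at the receiver. We prove that in the limit the rhs of Eq. A.1 approaches $\Delta^i_{\lceil N/2 \rceil}$, which exactly represents the amount of time by which the $\lceil N/2 \rceil$th packet arrival time $t^i_{\lceil N/2 \rceil}$ is off from its reference time $t^i_{\mathrm{ref},\lceil N/2 \rceil}$. From Eqs. 1 and 3, the packet arrival times $t^i_n$ can be represented as a function of $t^i_{\mathrm{ref},n}$ and $\Delta^i_n$ as follows:

$$t^i_n = t^i_{\mathrm{ref},n} + \Delta^i_n = t^i_{\mathrm{ref},\lceil N/2 \rceil} + \left(n - \left\lceil \tfrac{N}{2} \right\rceil\right) T^i + \Delta^i_n, \quad 1 \le n \le N. \qquad \text{(A.2)}$$

As was also noted in Eq. 1, even though source i sends packets with a nominal period T, for the observer at the receiver it appears that the sender transmits packets with period $T^i$. By substituting Eq. A.2 into Eq. A.1, the rhs of Eq. A.1 becomes

$$\frac{1}{N}\sum_{n=1}^{N}\left(t^i_{\mathrm{ref},\lceil N/2 \rceil} + \Delta^i_{\lceil N/2 \rceil} + \left(n - \left\lceil \tfrac{N}{2} \right\rceil\right) T^R - t^i_n\right) = \frac{1}{N}\sum_{n=1}^{N}\left(\Delta^i_{\lceil N/2 \rceil} - \Delta^i_n + \left(n - \left\lceil \tfrac{N}{2} \right\rceil\right)\left(T^R - T^i\right)\right). \qquad \text{(A.3)}$$

Since frequency offsets are not considered yet, $T^R = T^i$ and the third term in Eq. A.3 is zero, leading to

$$\frac{1}{N}\sum_{n=1}^{N}\left(\Delta^i_{\lceil N/2 \rceil} - \Delta^i_n\right) = \Delta^i_{\lceil N/2 \rceil} - \frac{1}{N}\sum_{n=1}^{N}\Delta^i_n \;\longrightarrow\; \Delta^i_{\lceil N/2 \rceil} \quad (N \to \infty). \qquad \text{(A.4)}$$

This is due to the fact that the average value of $\Delta^i_n$ approaches zero by definition of the network jitter in Eq. 2. Thus Eq. A.1 has been proved. Notice that the error of our estimation is the term $\varepsilon_R = \frac{1}{N}\sum_{n=1}^{N}\Delta^i_n$ due to the finite N considered.

The careful reader may already wonder why the packet arrival time $t^i_{\lceil N/2 \rceil}$ was chosen rather than another packet arrival time, for instance, the first, $t^i_1$, or the last, $t^i_N$. We chose the middle point of the time interval [0, NT], or the middle packet $\lceil N/2 \rceil$, because only at this point do the contributions of the time offsets (caused by the frequency offset between the corresponding clock at the source and the clock at the receiver) in the estimation of $\Delta^i_{\lceil N/2 \rceil}$ cancel themselves out. The proof is as follows: if $T^R$ is not $T^i$, Eq. A.3 will result in one more term, the following:

$$\frac{1}{N}\sum_{n=1}^{N}\left(n - \left\lceil \tfrac{N}{2} \right\rceil\right)\left(T^R - T^i\right) = \begin{cases} \dfrac{T^R - T^i}{2}, & N \text{ even} \\ 0, & N \text{ odd.} \end{cases} \qquad \text{(A.5)}$$

The effect of the third term is approximately zero, as just shown. For comparison purposes, if we use the first packet arrival time $t^i_1$ instead of $t^i_{\lceil N/2 \rceil}$ in Eq. A.1, the third term can be shown to be

$$\frac{1}{N}\sum_{n=1}^{N}(n - 1)\left(T^R - T^i\right) = \frac{N - 1}{2}\left(T^R - T^i\right) \qquad \text{(A.6)}$$

and Eq. A.1 now becomes

$$\Delta^i_1 = \frac{1}{N}\sum_{n=1}^{N}\left(t^i_1 + (n - 1)\,T^R - t^i_n\right). \qquad \text{(A.7)}$$

For example, if N = 101 and $(T^R - T^i) = 10^{-3} \times T^R$, the time offset is $0.05 \times T^R$. As exemplified with Eq. A.5, it is better to take N as an odd integer to avoid any frequency offset contribution.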
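The effect of the anchor packet can be checked with a small numerical sketch. It evaluates the right-hand side of Eq. A.1 for jitter-free arrivals, so that whatever remains is the frequency-offset contribution alone. The helper names and the normalization $T^R = 1$ are assumptions made for illustration.

```python
import math


def estimate_offset(arrivals, anchor, t_r):
    """Right-hand side of Eq. A.1 with a chosen anchor packet (1-based index)."""
    n_packets = len(arrivals)
    t_anchor = arrivals[anchor - 1]
    total = 0.0
    for n in range(1, n_packets + 1):
        total += t_anchor + (n - anchor) * t_r - arrivals[n - 1]
    return total / n_packets


def jitter_free_arrivals(n_packets, t_i):
    """Arrival times with apparent period t_i and no network jitter."""
    return [n * t_i for n in range(1, n_packets + 1)]


T_R = 1.0                  # receiver period, normalized (assumed)
T_I = T_R * (1.0 - 1e-3)   # apparent source period, so T_R - T_I = 1e-3 * T_R

odd = jitter_free_arrivals(101, T_I)
even = jitter_free_arrivals(100, T_I)

bias_middle_odd = estimate_offset(odd, math.ceil(101 / 2), T_R)
bias_middle_even = estimate_offset(even, math.ceil(100 / 2), T_R)
bias_first = estimate_offset(odd, 1, T_R)
print(bias_middle_odd, bias_middle_even, bias_first)
```

With the middle packet as anchor the offset contribution vanishes for odd N and equals $(T^R - T^i)/2$ for even N, whereas anchoring on the first packet leaves $0.05 \times T^R$ for N = 101.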

Appendix B - Confidence interval

In this appendix we justify the argument that when the number of packets used in the estimation of the average is increased by a factor k, the confidence interval is reduced by a factor of $\sqrt{k}$. We can show that this is true when the bound used is the one derived from Chebyshev's inequality:

$$\operatorname{Var}\left[\frac{1}{N}\sum_{n=1}^{N}\Delta^i_n\right] = \frac{\sigma^2}{N}$$

and

$$P\left\{\left|\frac{1}{N}\sum_{n=1}^{N}\Delta^i_n\right| > \varepsilon\right\} \le \frac{\sigma^2}{N\varepsilon^2} \qquad \text{(B.1)}$$

where the $\Delta^i_n$ are zero mean and assumed to be i.i.d. random variables with variance $\sigma^2$. The $\Delta^i_n$ are defined in Eq. 3. Chebyshev's inequality is a loose upper bound, and the reader may question the validity of the argument when the Chernoff bound is used (in Sect. 5, the Chernoff bound is used exclusively for determining an upper bound for the probability of error). For this reason, some remarks are in order. Firstly, the validity of the argument can be checked readily, although in an ad hoc manner, for the Chernoff bound in the case of the uniform density function. This can be done by trying a few examples with Eqs. 19 and 20 derived for the uniform case. Secondly, the validity of the argument, namely that not only is the confidence interval reduced by $\sqrt{k}$ but also the probabilities of error for the two cases ($P(\varepsilon/\sqrt{k})$ and $P(\varepsilon)$) are exactly the same, is proved for the normal pdf by the Central Limit Theorem (Wozencraft and Jacobs 1965). In addition, the Central Limit Theorem gives an upper bound for the probability of error that is less than the bound derived from Chebyshev's inequality but slightly greater than the Chernoff bound; our argument is true for any pdf when the bound considered is the one derived from the Central Limit Theorem.
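For the Chebyshev case, the scaling step can be written out explicitly. The following short worked step uses only the bound of Eq. B.1, with kN packets and a confidence interval narrowed to $\varepsilon/\sqrt{k}$:

```latex
\begin{align*}
P\left\{\left|\frac{1}{kN}\sum_{n=1}^{kN}\Delta^i_n\right| > \frac{\varepsilon}{\sqrt{k}}\right\}
  \;\le\; \frac{\sigma^{2}}{kN\left(\varepsilon/\sqrt{k}\right)^{2}}
  \;=\; \frac{\sigma^{2}}{N\varepsilon^{2}} .
\end{align*}
```

The right-hand side is the same bound that N packets give for the interval $\varepsilon$, so increasing the number of packets by k allows the interval to shrink by $\sqrt{k}$ at an unchanged bound on the probability of error.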

Appendix C - Mean square error of the rate of the time-offset estimation

The rates of the time offsets are estimated every 2kN packets received. For simplicity of notation, we assume that k = 1. Therefore, the 2N packets are divided into two groups, one with the first N packets and the other with the last N packets. With Eq. 2 and the identity $T^i = T^R - (T^R - T^i)$, we can define the normalized packet arrival time $\tau^i_n$ as follows:

(C.1)

The pdf of the random variables x and y, where $x = \frac{1}{N}\sum_{n=1}^{N}\tau^i_n$ and $y = \frac{1}{N}\sum_{n=N+1}^{2N}\tau^i_n$, will be very close to the normal according to the Central Limit Theorem, assuming that the $\tau^i_n$ are independent and N is sufficiently large. The variance of the new random variables x and y will be $\sigma_x^2 = \sigma_y^2 = \sigma_\Delta^2/N$, where $\sigma_\Delta^2$ is the variance of the $\Delta^i_n$. Given the uncertainty caused by the $\Delta^i_n$, we find the mean square error in estimating the frequency offset $(T^R - T^i)$.

The error in estimating the rate of the time offset arises from the fact that although the two groups of points have the same pdf, their exact distance from their true average differs. We emphasize here that while both groups give the same distance from their mean, i.e., $x - \bar{x} = y - \bar{y}$, there is no error in the rate of the time-offset estimation. Since the error in the estimation does not depend on the individual distances of the random variables x and y from their true mean, but rather on the relative distance between them, the error is:

$$e = \frac{|x - y|}{N} \qquad \text{(C.2)}$$

It can easily be shown that, for any two independent normal random variables x and y with variance $\sigma_x^2 = \sigma_y^2 = \sigma_\Delta^2/N$ and a mean of zero, the new random variable x - y is a zero-mean, normal random variable with variance:

$$\sigma_{x-y}^2 = 2\sigma_x^2 = \frac{2\sigma_\Delta^2}{N} \qquad \text{(C.3)}$$

Therefore, the mean square error in determining the rate of the time offset using our methodology is

$$E(e^2) = \frac{\sigma_{x-y}^2}{N^2} = \frac{2\sigma_\Delta^2}{N^3}, \qquad \text{(C.4)}$$

and making the transformation $N \to N/2$ for the case of only N packets received, we derive the final expression for the mean square error as:

$$E(e^2) = \frac{16\,\sigma_\Delta^2}{N^3} \qquad \text{(C.5)}$$
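The $2\sigma_\Delta^2/N^3$ scaling of Eq. C.4 can be checked with a small Monte Carlo sketch. It assumes normally distributed jitter and models x and y directly as the averaged jitter of the two groups of N packets, that is, as their deviations from the true group means; the function name, trial count, and seed are hypothetical.

```python
import random


def rate_error_mse(n_packets, sigma, trials=20000, seed=7):
    """Empirical E(e^2) for e = (x - y) / N, as in Eq. C.2.

    x and y are the averaged jitter of the first and last N packets,
    drawn here as i.i.d. zero-mean normal samples (an assumption).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = sum(rng.gauss(0.0, sigma) for _ in range(n_packets)) / n_packets
        y = sum(rng.gauss(0.0, sigma) for _ in range(n_packets)) / n_packets
        e = (x - y) / n_packets
        total += e * e
    return total / trials


N, SIGMA = 20, 1.0
empirical = rate_error_mse(N, SIGMA)
predicted = 2.0 * SIGMA ** 2 / N ** 3   # Eq. C.4, with 2N packets in total
print(empirical, predicted)
```

Rewriting the prediction in terms of the total number of packets received, with N replaced by N/2, gives the $16\sigma_\Delta^2/N^3$ form of Eq. C.5.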

References

Alvarez-Cuevas F, Bertran M, Oller F, Selga JM (1993) Voice synchronization in packet switching networks. IEEE Network Magazine 7:20-25

Bai W, Zarros PN, Lee MJ, Saadawi T (1994) Design and analysis of a multimedia conference system using TCP/UDP. Proceedings of IEEE ICC '94, New Orleans, IEEE Computer Society Press, Los Alamitos, CA, pp 432-438

Barberis G, Pazzaglia D (1980) Analysis and optimal design of a packet voice receiver. IEEE Trans Commun 6:1022-1027

Bertsekas D, Gallagher R (1992) Data networks (2nd edn). Prentice-Hall, Englewood Cliffs, NJ

Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York, p 20

Cruz RL (1991) A calculus for network delay, Part I: Network elements in isolation. IEEE Trans Information Theory 37:114-131

Escobar J, Partridge C, Deutsch D (1994) Flow synchronization protocol. IEEE/ACM Trans Networking 2:111-121

Field B, Znati T (1991) Experimental evaluation of transport layer protocols for real-time applications. Proceedings of the 16th Local Computer Networks Conference, Minneapolis, Minnesota, pp 521-534

Gradshteyn IS, Ryzhik IM (1980) Table of integrals, series, and products. Academic Press, London, pp 307, 338

Little T, Ghafoor A (1990) Synchronization and storage models for multimedia objects. IEEE J Selected Areas Commun 8:413-427

Matragi W, Bisdikian C, Sohraby K (1994) Jitter calculus in ATM networks: single node case. Proceedings of IEEE Infocom '94, Toronto, IEEE Computer Society Press, Los Alamitos, CA, pp 232-241

Mills D (1991) Internet time synchronization: the network time protocol. IEEE Trans Commun 39:1482-1493

Mitra D (1980) Network synchronization: analysis of a hybrid master, slave and mutual synchronization. IEEE Trans Commun 28:1245-1259

Montgomery WA (1983) Techniques for packet voice synchronization. IEEE J Selected Areas Commun 6:1022-1027

Papadopoulos C, Parulkar GM (1993) Experimental evaluation of SUNOS IPC and TCP/IP protocol implementation. IEEE/ACM Trans Networking 1:199-216

Ramanathan S, Rangan PV (1993) Adaptive feedback techniques for synchronized multimedia retrieval over integrated networks. IEEE/ACM Trans Networking 1:246-260

Ramjee R, Kurose J, Towsley D, Schulzrinne H (1994) Adaptive playout mechanisms for packetized audio applications in wide-area networks. Proceedings of IEEE Infocom '94, Toronto, IEEE Computer Society Press, Los Alamitos, CA, pp 680-688

Rangan PV, Vin HM, Ramanathan S (1992) Designing an on-demand multimedia service. IEEE Commun Magazine 30:56-65

Rangan PV, Vin HM, Ramanathan S (1993) Communication architectures and algorithms for media mixing in multimedia conferences. IEEE/ACM Trans Networking 1:20-47

Ravindran K, Bansal V (1993) Delay compensation protocols for synchronization of multimedia data streams. IEEE Trans Knowledge Data Eng 5:574-589

Saadawi TN, Ammar M, Elhakeem A (1994) Fundamentals of telecommunications networks. Wiley, New York

Sorenson HW (1980) Parameter estimation. Dekker, New York

Steinmetz R (1990) Synchronization properties in multimedia systems. IEEE J Selected Areas Commun 8:401-412

Wolf JK, Schwartz JW (1990) Comparisons of estimators for frequency offset. IEEE Trans Commun 38:124-127

Woo M, Ghafoor A (1994) Multichannel scheduling for pre-orchestrated multimedia information. Proceedings of IEEE Infocom '94, Toronto, IEEE Computer Society Press, Los Alamitos, CA, pp 920-928

Woo M, Qazi NU, Ghafoor A (1994) A synchronization framework for communication of pre-orchestrated multimedia information. IEEE Network 8:52-61

Wozencraft J, Jacobs IM (1965) Principles of communications engineering. Wiley, New York (reissued in 1990, Waveland Press, Illinois), p 108

Zarros PN (1994) Multimedia network synchronization in real-time applications. PhD Thesis, Graduate Center, The City University of New York, New York

Znati T, Field B (1993) A network level channel abstraction for multimedia communication in real-time networks. IEEE Trans Knowledge Data Eng 5:590-599


PANAGIOTIS N. ZARROS was born in New York on 17 February 1964. He received his BE, ME, MPh, and PhD degrees in Electrical Engineering from the City College of New York in 1988, 1993, and 1994, respectively. He is currently with the CS First Boston Corporation, New York City. His research interests include multimedia communications, ATM wireless LANs, and information theoretic analysis of computer networks.

MYUNG J. LEE received his BS (1976) and MS (1978) from Seoul National University in Korea, and his PhD (1990) from Columbia University, all in Electrical Engineering. He is currently an Assistant Professor in the Department of Electrical Engineering at the City College, City University of New York. His current research interests are in multimedia communications systems, ATM switch design and analysis, and neural and fuzzy applications.

TAREK N. SAADAWI received his BSc and MSc from Cairo University, Egypt, and his PhD from the University of Maryland, College Park (all in Electrical Engineering). Since 1980 he has been with the City University of New York, City College, where he is currently a Professor at the Department of Electrical Engineering. His current interests are high-speed networks and multimedia networks. He is a co-author of the book Fundamentals of Telecommunication Networks, Wiley, 1994. Dr. Saadawi is a Senior Member of the IEEE, Technical Editor of IEEE Communications Magazine, and former Chairman of the IEEE Computer Society of New York City (1986-87).
