modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have...

16
Modelling user behaviour in networked games Tristan Henderson and Saleem Bhatti Department of Computer Science University College London Gower Street London WC1E 6BT UK T.Henderson,S.Bhatti @cs.ucl.ac.uk June 25, 2001 Abstract In this paper we attempt to gain an understanding of the behaviour of users in a multipoint, interactive communication scenario. In particular, we wish to understand the dynamics of user participation at a session level. We present wide-area session-level traces of the popular multiplayer networked games Quake and Half-Life. These traces were gathered by regularly polling 2256 game servers located all over the Internet, and querying the number of players present at each server and how long they had been playing. We analyse three specific features of the data: the number of players in a game, the interarrival times between players and the length of a player’s session. We find significant time-of- day and network externality effects in the number of players. Player duration times fit an exponential distribution, while interarrival times fit a heavy-tailed distribution. The implications of our findings are discussed in the context of provisioning and charging for network quality of service for multipoint and multicast transmission. This work is ongoing. 1 Introduction In spite of being an active research area for over a decade, multicast has yet to see large-scale deployment. This may be due to a number of factors, such as the lack of compelling multicast applications, or the lack of a method for multicast service providers to charge for a multicast service [10]. We are in the process of developing a pricing scheme which allows efficient and predictable charging for multicast (initial details of this scheme are described in [15]). One problem with attempting to charge for multiple-source applications, however, is the need for predictable prices. If users share the cost of a multiuser transmission, and the number of users changes as users join and leave the session, the price paid per user will also change. We therefore need to understand how users behave in multiuser scenarios if we are to engineer pricing schemes that will provide stable and predictable prices. Models for user behaviour are also useful in designing pricing schemes for maximising objectives such as aggregate utility or network utilisation, and for understanding and providing for network Quality of Service (QoS). In a somewhat “chicken and egg” situation, the limited deployment of multicast also means that there are few sources of data from which a model for user behaviour can be determined. A feature of our intended pricing scheme is that it should be independent of the underlying network protocols. Thus, there is no reason why a model for multipoint user behaviour need be determined from IP multicast sessions. From an end-user viewpoint, there is no functional difference between an IP multicast session and several unicast streams, and so user behaviour should be similar in both of these multipoint situations. There may be a difference in cost and this may be used as an incentive for the use of multicast. With the removal of the absolute requirement for native IP multicast sessions in order to provide multi- point communications, the problem of creating a model for multipoint behaviour becomes more tractable. There are many existing, and popular, multipoint applications that use unicast routing, for example, online 1

Upload: others

Post on 06-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

Modelling userbehaviour in networkedgames

TristanHendersonandSaleemBhattiDepartmentof ComputerScience

UniversityCollegeLondonGowerStreet

LondonWC1E6BTUK�

T.Henderson,S.Bhatti� @cs.ucl.ac.uk

June25,2001

Abstract

In thispaperweattemptto gainanunderstandingof thebehaviour of usersin amultipoint,interactivecommunicationscenario.In particular, we wish to understandthe dynamicsof userparticipationat asessionlevel. We presentwide-areasession-level tracesof the popularmultiplayernetworked gamesQuake andHalf-Life. Thesetracesweregatheredby regularly polling 2256gameservers locatedallover the Internet,and queryingthe numberof playerspresentat eachserver and how long they hadbeenplaying. We analysethreespecificfeaturesof the data: the numberof playersin a game,theinterarrival times betweenplayersand the length of a player’s session. We find significant time-of-day andnetwork externality effects in the numberof players. Playerdurationtimesfit an exponentialdistribution,while interarrival timesfit a heavy-taileddistribution. The implicationsof our findingsarediscussedin thecontext of provisioningandcharging for network quality of servicefor multipoint andmulticasttransmission.Thiswork is ongoing.

1 Introduction

In spiteof beinganactiveresearchareafor overa decade,multicasthasyet to seelarge-scaledeployment.This maybedueto a numberof factors,suchasthelack of compellingmulticastapplications,or thelackof a methodfor multicastserviceprovidersto chargefor a multicastservice[10]. We arein theprocessofdevelopingapricingschemewhichallowsefficientandpredictablechargingfor multicast(initial detailsofthisschemearedescribedin [15]). Oneproblemwith attemptingto chargefor multiple-sourceapplications,however, is the needfor predictableprices. If userssharethe costof a multiusertransmission,and thenumberof userschangesas usersjoin and leave the session,the price paid per userwill also change.We thereforeneedto understandhow usersbehave in multiuserscenariosif we are to engineerpricingschemesthat will provide stableand predictableprices. Models for userbehaviour are also useful indesigningpricing schemesfor maximisingobjectivessuchasaggregateutility or network utilisation,andfor understandingandproviding for network Quality of Service(QoS).

In a somewhat“chickenandegg” situation,the limited deploymentof multicastalsomeansthat thereare few sourcesof datafrom which a model for userbehaviour can be determined. A featureof ourintendedpricing schemeis thatit shouldbeindependentof theunderlyingnetwork protocols.Thus,thereis no reasonwhy a modelfor multipoint userbehaviour needbe determinedfrom IP multicastsessions.Fromanend-userviewpoint, thereis no functionaldifferencebetweenanIP multicastsessionandseveralunicaststreams,andsouserbehaviour shouldbesimilar in bothof thesemultipoint situations.Theremaybea differencein costandthis maybeusedasanincentive for theuseof multicast.

With theremoval of theabsoluterequirementfor nativeIP multicastsessionsin orderto providemulti-point communications,theproblemof creatinga modelfor multipoint behaviour becomesmoretractable.Therearemany existing,andpopular, multipoint applicationsthatuseunicastrouting,for example,online

1

Page 2: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

chatapplicationssuchasInternetRelayChat(IRC), or multiusernetworked gamessuchasQuake. Wehave chosento examinethelatter to determineuserbehaviour, andthusto determinetherequirementsforpricing this behaviour andprovisioningnetwork resources.

This paperis structuredasfollows. We discussour motivation for choosinggamesasan applicationandnoteprevious work in Section2. Section3 describesthe methodologyusedfor gatheringdataandsummarisesour results. Sections4, 5 and6 analysethreeparticularaspectsof the data,namelysessionmembership,sessiondurationanduserinterarrival times respectively. Finally, Section7 concludesthepaperanddiscussespossibilitiesfor futureresearch.

2 Motivation

Multiplayer networkedgamesarecontributing to an increasinglylargeproportionof network traffic [22].Suchnetwork usageis likely to increasefurthernow thatconsolessuchastheSegaDreamcastandSonyPlaystation2 featureEthernetandmodemconnections.Gamesplayersarealreadywilling to pay extrato get an improvementin their playing experience,asevidencedby specialistgaminghardwaresuchasjoysticks,mice,mousepadsandevenfurniture.Gamepublishershaveproposedchargingplayersper-gamevia network delivery, ratherthanthecurrentpracticeof charging a one-off feefor thesoftware[25], or bycharging a fee pergamewith the opportunityfor playersto win money or prizes[29]. More interesting,from a networking point of view, is the existenceof modemsmarketedasbeingspeciallyoptimisedforgames[1], andsoftwaredesignedto determinenetwork characteristicsof potentialgamesserverssuchasdelay[13]. Thesedevelopmentsindicatethatgamesplayersareinterestedin network QoS,andwould bewilling to payfor theability to improveit.

The gamesthat we study hereareof the type commonlyreferredto asFPS(First PersonShooter)games.Playersconnectto a centralserverusingunicastUDP (or occasionallyTCP)flows. Themaximumnumberof playersthat canconnectto a server is setarbitrarily by the server administrator, accordingtotheamountof network traffic andCPUtime they wish thegameserver to consume(for thegamesstudiedhere,this figure is typically set to 16 or 32 players). Players’actionsaretransmittedfirst to the centralserver, which calculatesandmaintainsthe overall stateof the gameandthentransmitsthis statebacktotheplayers.Thegeneralobjectiveof mostof thesegamesis to explorea commonvirtual world andkill asmany of theotherplayersaspossible.

2.1 Previous work

Multicast sessionson the MBone arestudiedby Almeroth andAmmar [2]; thesesessionsareall single-sourceandperhapsdo not reflectthedifferentdynamicsof multiple-sourceapplications.Thereis a longhistoryof network andInternettraffic analysis(see[24] for asurvey). Themajorityof this,however, looksat packet-level andnetwork-level traces. In particular, Bangunet al. [3] andBorella [5] both study thetraffic patternsof multiplayergames,but donotexaminesession-level userdynamics,andlimit thestudiesto local areanetwork traffic only. Althoughtheremaybeinterestingrelationshipsbetweenthedataat thepacketandsessionlevels,for instancein termsof self-similarity, wedonotconsiderthesein thisstudy, butleave themfor possiblefuturework.

3 Methodology

AlmerothandAmmar [2] show that themonitoringof IP multicastsessionsis possiblethroughjoining asessionandthenwatchingothersessionmembersjoin andleave. This is impracticalfor networkedgames,however, sinceto join a gameimpliesparticipation.As mostpeopleareonly capableof playingonegameatatime,andonly for acertainnumberof hoursaday, thislimits thescopeof any datacollection.Althoughit is possibleto simulatea userthrougha scriptor program,such“bots” arefrowneduponby many gameserveroperatorsandgenerallyleadto theuserin questionbeingbarredfrom thatserver. Furthermore,thereis adataintegrity problemin thatuserbehaviour mightdependon thenumberof playersin agame,andsoby joining agameto monitorit, we mightaffect theresults.

2

Page 3: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

Somegameserversoffer a queryingmechanism,wherebyspecificvariablesaboutgamestatuscanberetrieved. Sincejoining and continuouslymonitoring gamesseemedimpractical,polling and queryinggamesservers at regular intervals was determinedto be the next bestoption. By polling servers anddeterminingthenumberof playersateachpoll, anapproximationof userbehaviour canbeobtained.Manynetworkedgamesalsoallow thequeryingof suchvariablesasplayers’nicknamesandtheamountof timethatthey havebeenplaying,andsothedurationof eachusers’sessioncanalsobeestimated.Theaccuracyof thismethoddependson thefrequency of polls. If thepollsaretoofarapart,thenany userswhojoin andleave betweenpolls will bemissed.If thepolls aretoo frequent,theamountof network traffic might haveaneffecton theserversandperhapsaffectuserbehaviour.

Datawerecollectedusing the QStattool [27], which is a programdesignedto display the statusofgamesservers,andwhich supportsa largenumberof onlinemultiplayergames.Of thesegames,thegameHalf-Life [14] wasdeterminedto bethemostpopulargame,andwasalsooneof thegameswhichsupportsthereportingof a player’sconnectiontime,andsoit waschosento concentrateon playersof this game.

A list of 2193 IP address/portpairs1 of machinesrunning the Half-Life daemonwasobtainedfroma “masterserver” at half-life.west.won.net. This list is composedfrom submissionsby serveradministratorsand/orautomaticregistrationby servers(dependingon the game). This list may alsobequeriedby usersthroughtheapplicationitself, or throughtheuseof someof theaforementionedprogramsfor determiningtheclosestor quickest-respondinggameserver.

ServerswerepolledusingQStatat regular intervals (Figure1). At eachpoll, the numberof players,their chosennicknamesandthe numberof secondsthat eachplayerhadbeenconnectedwereretrieved(exampleoutputfrom QStatis shown in Figure2). If a responsewasnot received from a server, groupmembershipwasassumedto bethesameasat theprevioussuccessfulpoll. Sincepolling tookplaceat theapplicationlevel, we could not detectsucheventsasunsuccessfuljoin attempts,asthesedo not registerin thegame.We werealsolimited in thatpolling takesplacefrom a centralmachineat UCL, andsoanynetwork failuresthatexistedsolelybetweenUCL andthegameservers(but not betweenthegameserverandtheplayers)wouldaffectour results.

SERVERGAMEPOLLING

USER USER

USERUSER

REPLY (2 UDP pkts) <−

QUERY (2 UDP pkts) −>

(PlayerName, ConnectTime)

MACHINE

Figure1: Datagatheringsetup

Servers Game Frequency DurationO-I 2193 Half-Life 30min 1 weekO-II 35 Half-Life 5min 3 daysO-III 22 Quake 5min 1 weekO-IV 3 Half-Life 5min 2 monthsO-V 3 QuakeIII Arena 5min 2 months

Table1: Observations

1It is not uncommonfor a singlemachineto run several serverson differentports;of our list of 2193servers,therewere1725uniqueIP addresses.

3

Page 4: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

NAME: Merlin TIME: 5710NAME: [F.u.T]The_LAW TIME: 5728NAME: MagNETo [FH] TIME: 2176NAME: TomiN TIME: 2409NAME: [DEM] Guybrush T. TIME: 8575NAME: [.HoF.]Ben Kenobi TIME: 1177NAME: [Thug]Tosh TIME: 142NAME: TDMT_Silvan TIME: 1540NAME: Gulzak TIME: 874NAME: [DBK]HannibalTC TIME: 954NAME: [STANDARD] Kill Demon TIME: 1085NAME: -=Phoenix=- TIME: 5593

Figure2: ExampleQStatoutput

Severalsetsof observationsweretaken; thedifferencesbetweenthese,andthe labelsthatareusedinthis paperto referto them,areshown in Table1. Thefirst setO-I usedthemasterlist of Half-Life servers.From this, the 35 most popularservers were selectedfor more detailedobservation over one weekendin O-II.

SetO-III used22Quakeservers,theaddressesof whichwerealsoobtainedby queryingamasterserver.Quake is an older game,introducedin 1996,which is why the numberof serversis so muchlower thanHalf-Life, whichfirst wentonsalein 1998.Quake, however, is oneof thefew gamesto allow thequeryingof players’IP addresses,which maybeusefulfor determiningthenetwork topologiesandspatialanalysisof games.Thesourcecodefor thegameis freelyavailable,sothissetof observationsmayproveusefulforfuturework.

Thelastpair of observations,O-IV andO-V, comefrom two setsof serverswhich MicrosoftResearchhavebeenrunningat their sitein Cambridge,UK. Usingapublic list of serversprovedto havedifficulties,sincesomeof the IP addresseson the list weredynamicallyallocated(for instance,usersrunninggameserverson dial-up machines).It would appearthat the masterlist doesnot updatefrequentlyenoughtoeliminatethese,andso many polls would endup targetingmachineswhich wereno longerrunning thegameserver. Of the original list of 2193addresses,we found that 265 of thesewerenever running theserverduringthecourseof ourpolls. Thegameusedin O-V, QuakeIII Arena, doesnotallow thequeryingof playerduration.For this setof datawe assumethateachplayerjoinedat thetime of thepoll at whichthey arefirst noticed;this figurethushasaninaccuracy of up to two poll periods.

3.1 Summary of observations

We observed a total of 1,757,539individual sessions(i.e., individual usersjoining and then leaving agame). Table2 shows someof the overall aspectsof the data. We were interestedin examining threespecificfeatures:thenumberof participantsin a game,theinterarrival timebetweenparticipants,andhowlonga playerremainedin a game.

4 Session membership

Figure3(a)shows thetotal numberof playersfor all theserversin O-I andO-III to O-V, aggregatedto aone-weekperiod.Figure3(b) shows thenumberof playersfor oneserver from O-IV, againscaledto oneweek.It canbeseenthatthenumberof participantsin agameexhibits strongtime-of-dayeffects,peakingin the middleof the day. Thestrongsinusoidalpatternin the correlogramsin Figures3(c) and3(d) alsoindicatesseasonalvariation.

It canbeseenfrom Figure3(a)thatmid-Tuesdayis anoutlier, with anunusuallylow numberof players.This might have beendue,for instance,to a breakin network connectivity. This outlier wasremovedbyreplacingthedatawith theaverageof thesamples24 hoursbeforeand24 hoursafter.

4

Page 5: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

Total joins Averagejoins/server/hr

Medianinterarrival(sec)

Max interar-rival (sec)

Median du-ration (sec)

Max dura-tion (sec)

O-I 1510445 4.65 225 246171 1576 3165999O-II 69961 27.76 70 17309 1098 66738O-III 37037 10.01 118 51706 618 410699O-IV 23559 4.19 115 77843 612 2614737O-V 5872 1.04 300 76800 901 403201

Table2: Summaryof results

0

2000

4000

6000

8000

10000

12000

14000

Sun Mon Tue Wed Thu Fri Sat

Mem

bers�

Total membership

(a)O-I, O-III, O-IV, O-V aggregated

0

10

20

30

40

50

60

70

80

Sun Mon Tue Wed Thu Fri Sat

Mem

bers�

Total membership

(b) singleserver from O-IV

0 1 2 3 4 5 6 7

−0.

50.

00.

51.

0

Lag

AC

F

(c) ACF of Figure3(a)

0 1 2 3 4 5 6 7

−0.

50.

00.

51.

0

Lag

AC

F

(d) ACF of Figure3(b)

Figure3: Numberof users

5

Page 6: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

Sincethetime-of-dayeffect is soclearlyevident, it is possibleto do a simpleseasonaldecompositionby subtractingeachobservationfrom themeanvaluefor all theobservationstakenat thattime of day[7].Theresultsof this areshown in Figure4, wherethehighersolid line representsthetime-of-dayeffect, thelowersolid line theremainder, andthedashedline theobserveddata.Threedaysarehigherthantheothers;these,asonemightexpect,areFridayto Sunday.

Seasonal decomposition of members

Time (days)

Mem

bers

1 2 3 4 5 6 7 8

−20

000

2000

4000

6000

8000

1000

012

000

Figure4: Seasonaldecompositionof smoothedmembershipdata

4.1 Network externalities

It is acceptedthat thevalueof a groupactivity to an individual participantmaybe relatedto thenumberof participantsin thatgroup. This hasbeenquantifiedconjecturallyby engineersasMetcalfe’s Law (thevalueof a network is proportionalto n2, wheren is the numberof users[23]), or more recentlyas theGroup-FormingLaw (thevalueof theInternetis proportionalto 2n [28]). Economists,however, generallyreferto theseeffectsas“positiveconsumption”or “network externalities”: for example,Katz andShapirodefinenetwork externalitiesas“productsfor which theutility thata userderivesfrom consumptionof thegoodincreaseswith thenumberof otheragentsconsumingthegood.” [18] Network externalitieshavemostcommonlybeenstudiedin termsof standardisationandcompatibility (e.g.,thetake-upandacceptanceoffaxmachines[11]), althoughHenrietandMoulin [16] presentacostallocationschemefor networkswhereuserssharecostsaccordingto thenetwork externalitiesthatareaccrued.

Onewould expectthatmultiplayergameswould alsoexhibit network externalities.Thepurposeof anetworkedmultiplayergameis to participatewith otherpeople;if a userwishesto play againstelectronicopponentstherewould be lessneedfor the networked aspectof the game(unless,for example,a userwishedto playagainsta farmorepowerful computersuchasthefamouschessmatchesbetweenKasparovandIBM machines).In general,however, it is reasonableto assumethatagivenparticipantin anetworkedgameis takingpartbecausethey wish to interactwith otherremote,humanusers,and,therefore,thattheirutility is derived,to someextent,from theexistenceandnumberof theseotherusers.

Figure5 shows thetemporalACF (autocorrelationfunction)of thecorrecteddatafrom Figure4, afterremoving the time-of-dayeffect; this shows the degreeto which the numberof playersin a subsequenttime perioddependson the sessionmembershipin the previous period. It canbe seenthat the level ofautocorrelationis high, even for a large numberof time periods. Thus,asexpected,thereappearto besomenetwork externalityeffects.

Having observedthetime-of-dayandnetwork externalityeffects,weanalysedthesessionmembershipdatausingtime-seriesanalysis.ARIMA (Autoregressive IntegratedMoving Average)models,introduced

6

Page 7: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Figure5: Temporalautocorrelationin numberof players

by Box andJenkins[6] area popularmeansof modellingtime seriesdata. An autoregressive processisdefinedasa serially dependentprocesswherebyelementsin a time seriescanbe describedin termsofpreviouselements:

Xt � φ1Xt � 1 � φ2Xt � 2 � φ3Xt � 3 ������� ε (1)

A moving averageprocessis whereeachelementin a timeseriesis affectedby pasterrors,independentof theautoregressiveprocess:

Xt � µ � εt � θ1εt � 1 � θ2εt � 2 � θ3εt � 3 � ���� (2)

An ARIMA modelincorporatesboth theautoregressive andmoving averageprocesses.Suchmodelsarereferredto asARIMA � p d q� , wherep is theautoregressive parameter, d thenumberof differencingpassesrequiredto make theinput seriesstationary, andq themoving averageparameter. If thetime serieshasa seasonalcomponent,additionalseasonalparametersarerequired,andthemodelis referredto asanARIMA � p d q����� P D Q� s model,whereP, D andQ representthe ARIMA parametersof the seasonalcomponent,ands is theperiodof theseasonality.

For theaggregatesessionmembershipdatafrom Figure4, thereis little to choosebetweena � 1 1 1���� 0 1 1� 48 anda � 2 1 1����� 0 1 1� 48 model(Figure6 shows thediagnosticoutputfor thelatter).Applyingthesetwo modelsto individual servers’ data,however, showed that a � 2 1 1����� 0 1 1� 48 model is themostappropriate.Figure7 shows thediagnosticsfor oneserver; theBox-Piercestatisticindicatesa highgoodnessof fit.

A � 2 1 1����� 0 1 1� 48 modelincorporatesbothnetwork externalities,sincetheautoregressivecompo-nentmeansthat the numberof playersup to an hour prior to a playerjoining hasan effect on a player’sdecisionto join, andalsoincludesthetime-of-dayeffect throughtheseasonal� 0 1 1� 48 component.

Proportionalfairnesshasbecomea popularmetric for allocatingbandwidthbetweenflows on a con-gestedlink [19]. This relieson theassumptionof usershaving logarithmicutility functions.It is unclear,however, whetherthis sameunicastlogarithmicutility functionshouldbeassumedfor a multicastor mul-tipoint transmission.Chiu [8] shows that proportionalfairnessmay producean “unfair” outcomein themulticastcase,andproposesaweightedproportionallyfair solution,wheremulticastflowsreceiveaband-width shareweightedaccordingto the aggregateutility of the downstreamreceivers. Legout et al. [20]suggestthat this bandwidthsharemight relateeither linearly or logarithmicallyto the numberof down-streamreceivers.Thedatapresentedheresupportsthelattersuggestion,sincethey imply that it might bemoreappropriateto assumeindividualutility functionswhich incorporatenetwork externalities.

7

Page 8: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

Standardized Residuals

Time2 3 4 5 6 7 8

−2

02

46

0.0 0.1 0.2 0.3 0.4 0.5

0.0

0.4

0.8

Lag

AC

F

ACF of Residuals

2 4 6 8 10

0.0

0.4

0.8

p values for Box−Pierce statistic

lag

p va

lue

0 5 10 15 20

0.0

0.2

0.4

0.6

0.8

frequency

residuals from (2,1,1)x(0,1,1) process

Figure6: ARIMA diagnosticsandcumulative periodogramfor � 2 1 1����� 0 1 1� 48 modelon datafromFigure4

Standardized Residuals

Time2.0 2.5 3.0 3.5 4.0

−4

−2

02

0.0 0.1 0.2 0.3 0.4

−0.

20.

20.

61.

0

Lag

AC

F

ACF of Residuals

2 4 6 8 10

0.0

0.4

0.8

p values for Box−Pierce statistic

lag

p va

lue

0 5 10 15 20

0.0

0.2

0.4

0.6

0.8

frequency

residuals from (2,1,1)x(0,1,1) process

Figure7: ARIMA diagnosticsandcumulativeperiodogramfor � 2 1 1����� 0 1 1� 48 modelfor singleserver

Time-of-daypricing is commonlyusedfor pricing utilities suchaselectricity andtelephoneservice,andhasbeenproposedasasimple,if suboptimal,methodfor pricing Internettraffic [21]. Thetime-of-dayeffectsobserved heremeanthat this might be appropriateon a per-applicationbasis,at leastfor games.Thismightalsohaveimplicationsfor network provisioning,wherebyanetwork designedfor gameswouldwantto beableto dealwith thepeaks.

5 User duration

Gameservers tend to run continuously, with usersjoining and leaving as they wish. As suchit is notmeaningfulto discussthe overall sessionduration,i.e., a whole game. Insteadwe examinethe duration

8

Page 9: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

0

5000

10000

15000

20000

25000

30000

35000

40000

Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00S

econ

ds�

Duration

Figure8: Durationof user’sgame

of eachindividual session,i.e., a user’s game. Figure8 shows the durationof users’individual sessionsfrom O-II: it canbeseenthat thesedurationsvary quitewidely, andthatmany gamedurationsarelowerthanour polling periodof five minutes. This might be due to droppedconnections,or usersbrowsinggamesby startingasessionto seewhatis goingonanddecidingthataparticulargameis not to their liking.At theotherendof thespectrum,thereareseveral long gamedurationsof over 24 hours.Thesemight be“hardcore”gamers,automatedplayers/botsor userswhohavemistakenly left their connectionsactive.

In Figure 9 we fit the userdurationdatafor two individual servers to a set of randomlygeneratedexponentially-distributeddata. The Quantile-Quantileplots show that this is an appropriatemodel. Thisagreeswith thefindingsfor multicastsessionsin [2]. It is alsoknown thatsomesingle-userapplications,suchasvoicetelephonecalls[4] fit anexponentialdistribution.

Sincewehadalreadyobservednetwork externalityeffectsin thenumberof players,weexpectedto findacorrelationbetweenthedurationof aplayer’ssessionandthenumberof playersin thatgame;agamewithmoreplayersmightbelikely to leadto playersenjoying thegamemore,whichshouldleadto themstayinglonger. Surprisingly, thereappearsto beno evidencefor this. Figure10(a)showsa boxplotof thenumberof playersat thestartof aplayer’ssessionagainstthedurationof theirsession.Theredoesnotseemto beacorrelation,andthemediandurationis relativelyconstantirrespectiveof thenumberof players.Comparingthedurationto theaveragenumberof playersover thefirst hourof asession(Figure10(b))showedaslightcorrelation,but this wasinsignificant.This might indicatethattheabsolutenumberof playersin a sessionis notnecessarilyadeterminantof whenaplayerdecidesto leaveasession;it maybethebehaviour or skillof thespecificplayersthatis moreimportant,or a completelyunrelatedfactor.

6 Interarrival times

Figure 11 shows the interarrival times betweenplayersfor one server. As for duration, there is largevariation. Unlike the durationdata,interarrival timesdo not appearto fit an exponentialdistribution, asshown in Figure12.

Interarrival timesbetweenusersfor single-userapplicationshave beenfoundto fit a Poissondistribu-tion [12, 26]. This is unlikely to be the casefor multiuserapplications,however, wherethe presenceofotherusersmayalteruserbehaviour. Borella[5] findsthat for games,packet interarrival timesarehighlycorrelated.Figure13 showsthatthis is alsotruefor playerinterarrivals;thereis significantautocorrelationatshortlags,whichimpliesthatthearrival of someuserswill leadto othersarriving. Thus,theinterarrivalsdo not fit theindependentarrivalsof thePoissondistribution.

Heavy-taileddistributionshavebeenobservedfor Internetusagebehaviour, for examplein World WideWebusage[9] andaggregateEthernettraffic [30]. Onemethodfor visualisinga heavy-taileddistributionis a log-log complementarydistribution (LLCD) plot, wherethe complementarycumulative distributionis plottedon logarithmicaxes. Linear behaviour in an LLCD plot indicatesa heavy-tailed distribution.Figure 14(a)shows sucha plot for the interarrival times, and linear behaviour canbe observed for the

9

Page 10: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

050

0010

000

1500

020

000

time

dura

tion

(sec

)0

5000

1000

015

000

sample exponential data

dura

tion

(sec

)

0 5000 10000 15000

050

0010

000

1500

020

000

sample exponential data

obse

rved

dur

atio

ns (

sec)

020

0040

0060

0080

0010

000

time

dura

tion

(sec

)0

2000

4000

6000

8000

sample exponential data

dura

tion

(sec

)

0 2000 4000 6000 8000

020

0040

0060

0080

0010

000

sample exponential data

obse

rved

dur

atio

ns (

sec)

Figure9: Fitting anexponentialdistribution to userdurationdata

largerobservations(Figure14(b)).A morerigoroustestfor heavy-taileddistributionsis theHill estimator[17]. A distribution of variable

X is heavy-tailedifP �X � x��� x� α asx ∞ 0 ! α ! 2 � (3)

TheHill estimatorcanbeusedto calculateα

αn � "1 # k i $ k � 1

∑i $ 0

� logX% n � i & � logX% n � k& �(' � 1

(4)

wheren is thenumberof theobservations,andk indicateshow many of thelargestobservationshavebeenusedto calculateαn. Figure15 shows thatα is approximately1.15.

Comparingthe interarrival times to the numberof playersin a sessionshows someevidenceof aninverselyproportionalrelationship(Figure16); asthenumberof playersin a sessionincreases,the inter-arrival timesdecrease.This supportsthe hypothesisthat the numberof playersis a determinantin otherplayers’decisionsto join a session.

Thehigh variationin userdurationandinterarrival timeshave several implicationsfor pricestabilityandprovisioning if the membersof a multicastgroupareto sharethe overall costof a sessionamongstthemselves.Theautocorrelationin thenumberof playersandinterarrival timesmeansthatif theusersaresharingthe costsof a session,this costwill snowball; new usersjoining will be followedby otherusersjoining (anduserswill join fasterasthenumberof usersincreases),leadingto rapiddecreasesin thecostperuser, andvice versafor whenusersleave. This couldberectified,for example,by only changingtheprice for eachuseron a periodicbasisratherthanwith eachjoin or leave. The autocorrelationseemstoexist for large lags,however, which meansthat the periodsof price reevaluationwould alsohave to belarge,andthiscouldimpedetheefficiency of any pricingscheme.

10

Page 11: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

0 2 4 6 8 11 14 17 20 23 26

1e+

001e

+02

1e+

041e

+06

Members

Dur

atio

n (s

ec)

(a)Numberof playersat startof session

0 2 4 6 8 11 14 17 20 23 26

1e+

001e

+02

1e+

041e

+06

Members

Dur

atio

n (s

ec)

(b) Averagenumberof playersover first hourof session

Figure10: Numberof playersversusdurationof session

0

10000

20000

30000

40000

50000

60000

70000

80000

Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00 Sat 00:00

Sec

onds�

Interarrival time

Figure11: Interarrival times

Costsharingmay alsochangethe behaviour of players,sincewe have observed that the numberofplayersin a sessionis not a largefactorin theutility receivedfrom a game.Oncenetwork pricing or QoSarea featureof networks, thenuserswill needto choosebetweennetwork flows dependingon the valuethat they receive from eachapplication;in otherwords,they will attemptto maximisetheir utility giventheir individualbudgets.Thepriceof asessionthusbecomesa factorin userbehaviour, andif thecostof asessionis sharedamongstsessionmembers,thenthenumberof playersin a sessionwill becomea factor,sinceit will be a determinantin the price of the session.Additionally, the numberof usersin a sessionmight affect theQoS,which would becomea furtherfactorfor users’behaviour.

11

Page 12: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

050

010

0015

0020

00

time

inte

rarr

ival

tim

e (s

ec)

050

010

0015

0020

00

sample exponential data

inte

rarr

ival

tim

e (s

ec)

0 500 1000 1500 2000

050

010

0015

0020

00

sample exponential data

obse

rved

inte

rarr

ival

s (s

ec)

050

0015

000

2500

0

time

inte

rarr

ival

tim

e (s

ec)

050

0015

000

2500

0

sample exponential data

inte

rarr

ival

tim

e (s

ec)

0 5000 10000 15000 20000 25000

050

0015

000

2500

0

sample exponential data

obse

rved

inte

rarr

ival

s (s

ec)

Figure12: Fitting anexponentialdistribution to interarrival times

0 20 40 60 80

0.0

0.2

0.4

0.6

0.8

1.0

Lag

AC

F

Figure13: ACFof interarrival times

7 Conclusions and future work

Therehasbeenlittle study of session-level userbehaviour in large-scalemultiple-sourcescenarios.Inthis paperwe have presentedstatisticalanalysisof severalsession-level tracesof popularmultiplayernet-

12

Page 13: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

1 2 5 10 20 50 100 500

0.1

0.2

0.5

1.0

x = interarrival time

P[X

>x]

(a) full dataset

500 1000 1500 2500 3500

0.1

0.2

0.5

1.0

x = interarrival time

P[X

>x]

(b) uppertail

Figure14: Log-logcomplementaryplotsof interarrival times

0 10000 20000 30000 40000

0.0

0.5

1.0

1.5

2.0

2.5

3.0

k

Hill

Est

imat

or

Figure15: Hill estimatorfor interarrival times

workedgames.We have foundthat thenumberof playersexhibits strongtime-of-dayandnetwork exter-nality effects,andwe have fitted anappropriateARIMA model.Players’durationtimesfit anexponentialdistribution, while interarrival times fit a heavy-tailed distribution. The numberof playersin a sessionappearsto have a greatereffect on players’decisionsto join a sessionratherthanleave. In many respectswe have observedsimilar behaviour to that seenfor multicastapplications,despitethe unicastnatureofthesegames.This impliesthatin theabsenceof appropriatemulticastdata,unicastmultipointapplicationsareanappropriatesubstitute.Wehavediscussedhow theseresultscouldimpactpotentialmulticastpricing

13

Page 14: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

0 2 4 6 8 11 14 17 20 23 26

110

010

000

Members

Inte

rarr

ival

(se

c)

Figure16: Numberof playersversusinterarrival time

policiesandnetwork provisioning.Networking researchinto multiplayergamesis still at an early stage,andthereis muchfuture work

thatneedsto becarriedout. Understandinguserbehaviour is but onestagein creatingappropriatepricingpolicies.It doesnot,for example,helpusexplainwhatusersdesireor require.Futurework will investigatetheQoSrequirementsfor networkedgames,in particular, concentratingon how therequirementschangedependingon thecompositionof asessionandgroup.

We alsointendto look at packet-level traffic statistics.Someof the resultspresentedherehave beensimilar to thoseof previous packet-level gamesstudies,and it will be interestingto conductsimultane-ousanalysisat the packet andsessionlevels, to determinewhetherthereis any relationshipbetweenthebehaviour at bothlevels. This is important,for example,if QoSprovisioningis to take placethroughcon-gestionpricing, i.e., chargingusersfor thenetwork congestionthatthey cause.Unfortunately, mostof thegameserveroperatorsandISPsthatwehavespokento donot log many of theappropriatestatistics.Thus,we arenow runningour own gamesserversin orderto collectpacket-level data.Runningour own serverspresentsmany opportunitiesfor furtherwork. Sincewe canlog theexacttimeswhenplayersconnectandleaveto theserver, wecanbetterestimatetheinaccuracy of thepolling methodthatwehaveusedhere.Wearealsousingthesegamesserversfor experimentalratherthancorrelationalstudy;for examplechangingsuchvariablesasnetwork delayin orderto determinetheeffectson userbehaviour.

Thestudypresentedherehasonly lookedatonetypeof game,theFPS.Althoughweexaminetwo of themostpopulargamesof this genre,it is notnecessarilytruethattheseresultswill hold for otherFPSgamesandthis shouldbefurtherexamined.Moreover, othertypesof games,for instanceMMORPGs(MassivelyMultiplayer OnlineRole-PlayingGames)suchasEverquest,arelikely to exhibit differentuserbehaviour,sincetheMMORPGcanbeslightly slowerpaced,andcaninvolvethousandsof usersconnectingto asingleserver, ratherthanthelargenumberof smallgroupsin FPSgames.

8 Acknowledgements

Thanksto BrendanMurphy at Microsoft ResearchCambridgeUK for allowing us to poll their gamesservers.

14

Page 15: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

References

[1] 3Com. 3Comscoreswith modemgearedtowardsseriousgamers,Oct. 25, 1999. http://www.3com.com/corpinfo/en_US/pressbox/press_release.jsp?INFO_ID=2001253.

[2] K. C. Almeroth andM. H. Ammar. Collectingandmodelingthe join/leave behavior of multicastgroupmembersin the MBone. In Proceedingsof the 5th IEEE InternationalSymposiumon HighPerformanceDistributedComputing(HPDC-5), pages209–216,Syracuse,NY, Aug. 1996.

[3] R. A. Bangun,E. Dutkiewicz, andG. Anido. An analysisof multi-playernetwork gamestraffic. InProceedingsof 1999InternationalWorkshopon MultimediaSignalProcessing, pages3–8, Copen-hagen,Denmark,Sept.1999.

[4] V. A. Bolotin. Modeling call holding time distributionsfor CCSnetwork designandperformanceanalysis.IEEE Journalof SelectedAreasIn Communications, 12(3):433–438,Apr. 1994.

[5] M. S. Borella. Sourcemodelsof network gametraffic. ComputerCommunications, 23(4):403–410,Feb. 15,2000.

[6] G. E. Box andG. M. Jenkins.Timeseriesanalysis:forecastingandcontrol. McGraw-Hill, London,UK, 1970.

[7] C. Chatfield.TheAnalysisof TimeSeries. Chapman& Hall, London,UK, fifth edition,1996.

[8] D. M. Chiu. Someobservationson fairnessof bandwidthsharing.TechnicalReportTR-99-78,SunMicrosystemsResearch,PaloAlto, CA, Sept.1999.

[9] M. E. Crovella andA. Bestavros. Self-similarity in world wide web traffic: Evidenceandpossiblecauses.IEEE/ACM Transactionson Networking, 5(6):835–846,Dec.1997.

[10] C. Diot, B. N. Levine, B. Lyles, H. Kassem,and D. Balensiefen. Deployment issuesfor the IPmulticastserviceandarchitecture.IEEE Network, 14(1):78–88,Jan.2000.

[11] N. EconomidesandC. Himmelberg. Critical massandnetwork evolution in telecommunications.InProceedingsof the 22ndTelecommunicationsPolicy Research Conference, Solomon’s Island,MD,Oct.1994.

[12] A. Feldmann,A. C. Gilbert, W. Willinger, andT. G. Kurtz. Thechangingnatureof network traffic:Scalingphenomena.ComputerCommunicationReview, 28(2):5–29,Apr. 1998.

[13] GameSpy. http://www.gamespy3d.com.

[14] Half-Life. http://www.sierrastudios.com/games/half-life/.

[15] T. N. H. HendersonandS. N. Bhatti. Protocol-independentmulticastpricing. In Proceedingsof the10thInternationalWorkshopon NetworkandOperatingSystemSupportfor Digital AudioandVideo(NOSSDAV), pages11–17,ChapelHill, NC, June2000.

[16] D. HenrietandH. Moulin. Traffic-basedcostallocationin a network. RANDJournal of Economics,27(2):332–345,Summr1996.

[17] B. M. Hill. A simplegeneralapproachto inferenceaboutthe tail of a distribution. TheAnnalsofStatistics, 3(5):1163–1174,Sept.1975.

[18] M. L. Katz andC. Shapiro. Network externalities,competition,andcompatibility. AmericanEco-nomicReview, 75(3):424–440,June1985.

[19] F. P. Kelly. Charging andratecontrol for elastictraffic. EuropeanTransactionson Telecommunica-tions, 8(1):33–37,Jan./Feb., 1997.

15

Page 16: Modelling user behaviour in networked gameskrasic/cpsc538a/papers/p212... · 2003-12-22 · have chosen to examine the latter to determine user behaviour, and thus to determine the

[20] A. Legout,J.Nonnenmacher,andE.Biersack.Bandwidthallocationpoliciesfor unicastandmulticastflows. In Proceedingsof the 18th IEEE Conferenceon ComputerCommunications(INFOCOM),pages254–261,New York, NY, Mar. 1999.

[21] J. K. MacKie-MasonandH. R. Varian. Someeconomicsof the Internet. In W. SichelandD. L.Alexander, editors,Networks,InfrastructureandtheNew Taskfor Regulation. Universityof MichiganPress,1996.

[22] S. McCrearyandK. Claffy. Trendsin wide areaIP traffic patterns:A view from AmesInternetEx-change.In ITC SpecialistSeminaronIP Traffic Modeling, MeasurementandManagement, Monterey,CA, Sept.2000.

[23] R. M. Metcalfe. TheInternetafter the fad. In The1996MonticelloLectures, Monticello, VA, May1996.

[24] T. Monk andK. Claffy. Internetdataacquisitionandanalysis:Statusandnext steps.In Proceedingsof the7th InternetSocietyConference(INET), KualaLumpur, Malaysia,June1997.

[25] A. Patrizio. Comingsoon: Pay-per-game. Wired News, Oct. 20, 2000. http://www.wired.com/news/culture/0,1284,39505,00.html.

[26] V. PaxsonandS.Floyd. Wide-areatraffic: Thefailureof Poissonmodeling.IEEE/ACM Transactionson Networking, 3(3):226–244,June1995.

[27] QStat.http://www.qstat.org.

[28] D. P. Reed.Weaponof mathdestruction.Context Magazine, Spring1999.

[29] UrbanMercenary.http://www.urbanmercenary.com.

[30] W. Willinger, V. Paxson,andM. S. Taqqu. Self-similarity andheavy tails: Structuralmodelingofnetwork traffic. In R. J.Adler, R. E. Feldman,andM. S.Taqqu,editors,A Practical Guideto HeavyTails — StatisticalTechniquesandApplications, pages27–53.Birkhauser, Boston,MA, 1998.

16