visualization of mobile network simulationsseptember-october 2001 simulation 103 technical article...

SEPTEMBER-OCTOBER 2001 SIMULATION 103

TECHNICAL ARTICLESIMULATION 77:3-4,©2001, Simulation Councils Inc.ISSN 0037-5497/01Printed in the United States of America

Visualization of Mobile Network Simulations

T.A. Dahlberg and K.R. SubramanianComputer Science Department

University of North Carolina at Charlotte9201 University City Boulevard

Charlotte, NC 28223, USA{tdahlber, krs}@uncc.edu

The use of adaptive techniques in mobile networkspermits scalable resource allocation policies to meetvarying demand as well as Quality of Service (QoS)performance objectives. As these algorithms operate atmultiple layers of the communications architecture,evaluation of such techniques must take into accounta variety of scenarios, which are in turn parameterizedby a large number of variables. The need to monitoralgorithm behavior in real time results in a dataexplosion. In this work, we propose new real-timemetrics to characterize and identify the critical statesof a mobile network in the wake of channel failures,congestion, signal degradation, etc. We use thesemetrics to define a survivability index, a measure ofmobile network performance in the wake of failures.We demonstrate the effectiveness of information visu-alization techniques in understanding the complexspatial and temporal relationships between perfor-mance and cost metrics that influence adaptive algo-rithms. Our visualization system is highly scalableand interactive, permitting multiple algorithms to besimultaneously evaluated. We demonstrate applica-tions to network monitoring and to the design andevaluation of adaptive admission control algorithms.

Keywords: Mobile networks, visualization

1. IntroductionA wireless access network is characterized by burstsof demand from mobile users with diverse service re-quirements, attempting to gain access to wirelesslinks with varying signal quality. Allocation of re-sources to meet varying demand is constrained by thelimited frequency spectrum. Hence, current researchon mobile networks is focusing on the development ofadaptive techniques to scale resource allocation poli-cies to meet Quality of Service (QoS) performance ob-jectives [1, 2, 3, 4, 5, 6]. Furthermore, protocols operat-ing at higher layers (in the communicationsarchitecture) cannot be completely shielded from theuncertainties of the unreliable, mobile physical envi-ronment. Therefore, multi-layer adaptive techniquesare being pursued to coordinate the resource manage-ment task among protocols at multiple layers withinthe communications architecture [7, 8]. Finally, due tothe complexities inherent in modeling a heteroge-neous mobile network, simulation has become a pri-mary method for performance analysis of mobile net-work protocols [4, 5, 8, 9, 10, 11, 12].

The basic operation of an adaptive algorithm istwofold: (1) to monitor the system to determine thesystem state (e.g., failure, congestion, burst activity),and (2) to invoke a particular adaptation policy that is

104 SIMULATION SEPTEMBER-OCTOBER 2001

best suited for that state. In general, greater emphasishas been placed on adaptation policies. However net-work monitoring for system state identification is cru-cial for distinguishing “real problems which must beaddressed” and “burst activity that should be ig-nored.” Current approaches to evaluating adaptivealgorithms have given insufficient attention to under-standing the sensitivity of algorithms to make thisdistinction over a wide variety of simulation sce-narios. As will be detailed in Section 6, a considerablenumber of variables and parameters influence theadaptive resource management policies/algorithms(e.g., offered load, threshold values). The need to si-multaneously comprehend this large problem spaceof variables both spatially and temporally for design-ing adaptive algorithms results in a data explosion,and we turn to modern information visualizationtechniques to abstract the essence of their complexrelationships. We describe such a system and its ap-plication to both network monitoring as well as adap-tive admission control. Finally, our own earlier workon survivability analysis has shown the need to con-sider performance during the transient period imme-diately following a failure, during the steady statefailure period, and following recovery [13]. This fur-ther justifies the need for data visualization to be anintegral part of the analysis.

An overall goal of our work is development ofadaptive resource management policies for radio-level survivability of cellular mobile networks. To-wards this goal, the primary objective of this study isto use visualization techniques to characterize systemstates of mobile network simulations in terms of real-time metrics. We introduce new real-time metrics thathave been found to be the most useful for state char-acterization and identification. We demonstrate theessential insight provided by the visual animations inevaluating these metrics. We define a survivabilityindex (SI) for comparative survivability analysis ofresource management policies. We describe how statecharacterization is used to enable these policies toadapt to current network conditions. Our examplesinclude description of our adaptive admission controlalgorithms and channel access algorithms. Our resultsdemonstrate that dynamic visualizations of systemvariables and interactions among them are fundamen-tal to leveraging the user’s intuition and domain ex-pertise to help explore (and reduce) a large searchspace of these variables.

2. BackgroundAn ideal model of the radio-level layer of a wirelessaccess network is shown in Figure 1. A cell refers to ahexagonal area surrounding a basestation tower. Eachbasestation has a group of wireless channels for use incommunicating with a mobile user in its cell. A mo-bile user receives a wireless signal from the closestbasestation. The user makes a new-connection request

to the closest basestation upon placing (or receiving) anew call. As the user moves to a new cell, the usermust relinquish its old channel and obtain a new onefrom the target basestation via a handover request.

Radio-level resource management policies includeadmission control to limit the number of new calls/connections allowed into the system, and channel ac-cess control to schedule use of wireless channels byongoing calls/connections [14]. Resource manage-ment policies are often implemented as distributedprotocols executed at each basestation to manage localresources (channels). However, due to user mobility,spatial relationships among neighboring cells signifi-cantly impact protocol performance.

Survivable Resource Management refers to the degreeto which such protocols can maintain performancestandards in the wake of failure and overload condi-tions. Due to the novelty of wireless devices, verylittle work has yet been done on mobile network sur-vivability. However, as mass use of mobile networksincreases, users will demand the same service guaran-tees delivered by current wired networks. Compara-tive analysis of competing survivability strategies re-quires a definition of a Survivability Index (SI) as ameasurable metric. The review of survivabilitymetrics, as described in [29], reveals the need for ad-ditional cost and performance metrics for survivabil-ity analysis. Additional challenges are the develop-ment of techniques to measure these metrics and thedefinition of how metrics will be used to adapt net-work policies in real time.

For instance, the simulation analysis in [8], whichstudies admission control and channel access controlfor mobile networks, uses a quality metric that is afunction of 14 variables. Results are presented as 2Dgraphs, each graph point representing the time-wise(over 1 simulated hour) and system-wide (over allcells) average of the quality metric for one value ofoffered load. The spatial (between cells) and temporal(critical times) relationships of the proposed policies/algorithms are not considered at all. Also not consid-ered is an understanding of how the 14 variables con-tribute to the quality index.

In our own preliminary studies [15, 16], we experi-mented with two Adaptive Admission Control

Figure 1. Ideal cellsite architecture


(AAC) protocols, called AAC 1 and AAC 2. We de-fined the “percentage of handover requests that aredenied,” called handover blocking rate (hbr), as a per-formance metric. 2D plots of simulation results, asshown in Figure 2, do not clearly explain the perfor-mance differences in the two approaches. To do this,it was necessary to look at these metrics in real-timefor each cell of the network. As this immediately re-sults in the generation of a large amount of data (de-termined by the number of cells, metrics, the simula-tion period and the sampling rate), we turned tovisualization techniques, and in particular, multi-vari-ate visualization schemes. Multi-variate visualizationtechniques are relevant in this domain, since we havea large number of variables and the challenge is to un-derstand their relationships and interactions spatiallyand temporally.

The earliest visualization schemes for multivariatedata were projection-based, such as scatterplot matri-ces [17], dimension stacking [18], Worlds withinWorlds [19] and HyperSlice [20]. These techniqueslook at a subset of the space (usually 2 or 3 dimen-sions) at a time, some support linking variables acrossthese projected views [21] and some require an order-ing of the dimensions. The major disadvantage of pro-jection schemes are their inability to scale with thenumber of dimensions and the needed ordering ofvariables in some schemes, which prevents all dimen-sions from being treated uniformly. A second class oftechniques based on displaying multi-variate pointdata through simultaneous multiple views, andlinked together by line segments connecting theviews, include the window wiring diagrams of rooms[22] and parallel coordinates [23, 24]. Parallel coordi-nates, in particular have received considerable atten-tion. It maps multidimensional data into 2D plots,treating all dimensions uniformly, by plotting n-di-mensional points as polyline segments through the Naxes, all of which are parallel to each other. Finally,[25] describes XmdvTool, a system that integratessome of the most important multivariate visualizationtechniques.

Fundamental to any of these techniques is brushing,an interactive capability to look at subsets of the data.This allows users to highlight, analyze or focus onsmall or manageable portions of their data. Brushingmay be screen based, using input devices to selectdata elements, or indirect (data driven), such as ma-nipulation of multiple sliders to specify containmentcriteria. A third form of brushing is structure-based[26]; this could involve specification of level of detail,data cluster size, etc., and is appropriate for very largedatasets. Brushing is a key capability that scales withdata size and makes drill down/roll up operations veryefficient.

3. Modeling and Simulation FrameworkOur system model is illustrated in Figure 3 (details in[14]). Rectangles represent static models that are fixedthroughout a simulation run, while ovals representdynamic models that define time-varying propertiesof a simulation.

The geographical data set is a two-dimensional arrayof grid points, as shown in Figure 4, representing themobile user’s physical environment (e.g., terrainproperties, transportation system, radio signal pathloss model). The data set used in this study representsa region of North Carolina.

The cellsite architecture describes the placement ofbasestation towers, basestation coverage area, and thewireless capacity of a basestation, in the context of thephysical environment. As shown in Figure 4, ourcellsite architecture represents the use of two reusepartitions to provide two groups of channels, withlonger and shorter coverage ranges at eachbasestation. It is assumed that a user at a grid pointexperiences the physical environment defined by thegeographical data set at that point and receives a us-able wireless signal from the basestation tower in thecenter of any cell that covers this grid point.

We have developed an object-oriented discrete-event simulator in C++ to implement the systemmodel. During simulation, calls arrive into the systemaccording to a teletraffic model, and are initially locatedat a grid point according to location distributionswithin the geographical data set. Calls move amonggrid points according to a mobility model demandingresources (channels) from overlapping cells in thecellsite architecture. During simulation, fault modelsinject failures into the cellsite or degraded conditionsinto the geographical data.

We have implemented various teletraffic models, Fig-ure 3, to define calling patterns with respect to callarrival and holding times. One can choose amongvarious loads (in the same or different simulationruns). A nominal load represents users’ demand forchannels equivalent to that for which the system wasoptimally designed.

Figure 2. Averaged Blocking Rates versus Offered Load


Our mobility model defines characteristics of how auser moves among the geographical grid points, Fig-ure 3.

Our results include use of two fault models, Figure3. The channel failure model fails 75% of the channelsin specified cells for any time-period (models partialhardware/software failure). The hot-spot model in-creases offered load significantly beyond the nominalvalue in specified cells for a specified time period(models user congestion in the physical environment,such as that caused by traffic surrounding a highwayaccident, or people leaving a stadium event).

Example multilayer resource management policiesare also shown in Figure 3. Thus far, we have imple-mented Adaptive Admission Control and adaptive Chan-nel Access (CA) control algorithms [14, 15].

Ideally, a simulation analysis of these algorithmswould explore all simulation scenarios in terms of the

various models described. However, each model con-tains multiple factors (variables that affect perfor-mance results) that can take on multiple values. A fullfactorial analysis [27] is not possible, and an approachto design a fractional factorial set of experiments isnot clear. Thus, our approach in this work is to useapplication domain expertise to design real-timemetrics for better understanding of adaptive resourceallocation policies. We turn to information visualiza-tion techniques to assist us in viewing multiple real-time metrics simultaneously within a highly interac-tive environment, in order to facilitate the analysis ofsuch policies, as well as the design of new algorithms.

4. Visualization FrameworkAll visualizations have been constructed using the Vi-sualization Toolkit (VTK) [28]. We are currently usingthree types of visualizations,

1. color mapped planes, where the metric values aremapped into a set of colors (we use a rainbow colormap, from blue to red),2. height fields, where the height corresponds to a met-ric value at a specific cell, and3. parallel coordinates, where each metric is representedas a separate vertical axis.

The cell site network used in our experiments con-sists of 105 cells arranged in a square grid. A struc-tured grid (a grid that is topologically a multidimen-sional array, but its lattice points are not uniformlyspaced) is used to represent each metric. The geom-etry of the grid can be displayed either as solid colors

Figure 3. Modeling framework

Figure 4. Cellsite Architecture


or wireframe (which is useful in focusing on specificcells) format; also, the hexagonal lattice structure canbe overlaid for better perspective of the cell site archi-tecture.

The visualization system can look at all 16 metricswithin a single run (Figure 5-7), or be configured tolook at multiple runs with specified metrics (Figure 9shows 4 runs with 4 metrics, with each run in a sepa-rate row). The ability to look at multiple runs is espe-cially useful in designing adaptation policies/algo-rithms in terms of the metrics/parameters. Once thesimulation data is input to the visualization system,the entire run can be animated and controlled usingVCR style controls (play, stop, forward step, back-ward step, etc.). In the multi-run mode, the VCR con-trols will affect all simulation runs, again permittingdirect comparison between simulation runs, param-eterized by adaptation policies, algorithms or simula-tion parameters.

A number of analysis tools are in use and in devel-opment. For instance, it is possible to compute statis-tics of a particular run (peak and average values ofthe metrics over the entire run, variances, etc.). Anymetric value for a particular cell can be queried whenit is necessary to focus on a particular cell. The visual-ization also allows tracking a small group of cells, forinstance, cells in the vicinity of a failure to focus onspatial relationships.

5. Real-time MetricsThe objective of network monitoring is to identify thecurrent state of the system so that a resource adapta-tion policy can be invoked that best maintains surviv-ability for that state. For this study, five system states,also called system operating modes, are character-ized—normal, degraded, transient failure, steady-statefailure, and recovery. Table 1 summarizes how metrics

were used to characterize and identify the systemsstates as the offered load to the system varied amonglight, nominal, heavy, and very heavy offered load val-ues.

A normal mode implies that no explicit failure hasoccurred. A degraded mode refers to a time duringwhich an explicit system failure may not be present,but system performance is diminished due to envi-ronmental reasons (e.g., increased demand in one celldue to highway congestion resulting from a traffic ac-cident or due to an event, such as people leaving afootball stadium). We generate a degraded mode us-ing our hot-spot fault model, described in Section 3. Atransient failure mode refers to the time-period immedi-ately following a failure when a system may be some-what unstable while trying to maintain ongoing con-nections affected by the failure. Steady-state failuremode refers to the time period in which a fault stillpersists, but the system has reached a somewhatstable, though degraded mode of operation. A recov-ery mode is the time period following failure, duringwhich a system attempts to reinstate normal usagepatterns. These latter three modes are generated usingour channel failure fault model. In addition, we willdefine our survivability index in terms of the surviv-ability objective in Table 2, in a later section.

The real-time metrics found to be most useful forstate characterization are defined next and explainedmore fully in Section 6.

Metric DefinitionsEach metric is measured as a sliding window variable.The metric is evaluated at a rate determined by asampling_rate parameter, by calculating raw data overa time period specified by a time_window_size param-eter. The summation symbols in the equation that fol-

Table 1. Characterization of system states

System states Characterization Normal Little activity, arbitrary, sporadic increases in all metrics. Degraded Persistent spatial & temporal increase in oc, cu and in ct, and

Recurring spatial & temporal spikes in sum, hbr, and nbr Transient failure Spatial & temporal increase in dcr. Steady-state failure Persistent spatial & temporal decrease in cu and in ct, and

Recurring spatial & temporal spikes in sum, hbr, and nbr Recovery Recurring spatial & temporal dips in oc, cu, and ct.

Table 2. Survivability objectives

Survivability Objectiveso Handover blocking rate ≤ 1%o New connection blocking rate ≤ 2%o Minimize overall blocking rateo Handover blocking rate equal to half of new

connection blocking rateo Maximize carried traffic


lows imply that data is summed over a period equalto the time_window_size.

Handover blocking rate, hbr, is the percentage offorced handover requests that are denied. A handoverrequest is considered to be forced, if denial of a chan-nel will result in termination of the call.

hbr

forced handovers deniedforced handover requests

=∑∑

_ __ _ , 0 ≤ hbr ≤ 1

The derivative of hbr represents the rate of changein hbr between successive samples of the metric.

hbr

hbr hbr

window sizetime t time t′ = _ _ ±±

_1

, hbr′ ∈ ℜ

Similarly, new connection blocking rate, nbr, and itsderivative represent the percentage of new connectionrequests that are denied, and the rate of change of thisvalue.

nbr

new connections deniednew connection requests

=∑∑

_ __ _ , 0 ≤ nbr ≤ 1

nbr

nbr nbr

window sizetime t time t′ = _ _ ±±

_1

, nbr′ ∈ ℜ

Because of the nature of voice traffic, which cur-rently dominates our teletraffic model, the hbr and nbrvalues are bursty in nature. Therefore, we are also in-terested in looking at the real-time sum of these val-ues, as measured with the sum metric. The assump-tion is that a significant increase in both metrics at thesame temporal and spatial location more likely con-veys a problem rather than a random burst.The blocking ratio metric, br, is an indicator of howwell the system is managing to meet the survivabilityobjective of keeping a 0.5 ratio of handover blockingrate to new connection blocking rate. A br = 0 is mostdesirable as it indicates that either the hbr/nbr ratio is0.5, or that hbr and nbr are within the desired range,less than 1% and less than 2%, respectively.

brhbrnbr

if hbr AND nbr

ifhbrnbr

otherwise

=

≤ ≤

≥

00 5

0 5

0 01 0 02

1 0,

. ,

± . ,

. .

.,

–0.5 ≤ br ≤ +0.5

The dropped connections rate metric, dcr, measuresthe number of connections that are prematurely ter-minated due to something other than a failedhandover attempt.

dcr

dropped connectionsoutput size channels used

=∑

∗ +_

_ ( _ )1 Σ , dcr ≥ 0

For the failure models that we have used, a connec-tion is terminated if the channel being used fails. Weexperimented with various strategies to determine

how to view the number of dropped connections, e.g.,as an absolute value, or with respect to time or offeredload. We determined that a ratio ofdropped_connections to offered load provided themost meaningful values. As such, the denominator ofdcr represents a scaled estimate of offered load. Thatis, the denominator is an estimate of the offered loadin Erlangs divided by 3600. (1 Erlang is equal to usageof one channel for 1 hour, which is 3600 seconds.) Theoutput_size parameter is the rate at which raw data issampled. We estimate channel usage by assumingthat if a channel is in use at the time of sampling, thenthe channel has been in use since the prior sample.

The offered connections metric, oc, measures therate of new connection arrivals with respect to time.Later we will demonstrate how oc is very useful inillustrating the behavior of our distributed adaptiveCA algorithm. We experimented with measuring of-fered load in units of Erlangs (as we do for carriedtraffic). Offered connections measures arrival of newconnections without regard to connection bandwidthor length, while offered load considers bandwidth andlength. However, only oc proved useful for this cur-rent study.

oc

new connection requestswindow size

=∑ _ _

_ , oc ≥ 0

The carried traffic metric, ct, is an estimate of thecarried load in units of Erlangs. Note that offered loadis capacity demanded by users, while carried load ortraffic is capacity actually delivered to users by a sys-tem.

ct

channels used output size

seconds hour=

∑( ) ∗_ _

/3600 , ct ≥ 0

The channel utilization metric, cu, is an estimate ofthe percentage of channels used with respect to time.

cu

channels used output size

window size=

∑( ) ∗_ _

_,

0 ≤ ct ≤ maximum channelsThe handover activity rate metric, har, represents

the number of successful forced and voluntary (de-scribed later) handovers with respect to time.

har

number of handoverswindow size

=∑ _ _

_ , har ≥ 0

We define a survivability index, SI, as a measure ofhow well the system meets the survivability objec-tives specified in Table 1. Most current wireless net-work providers aim to maintain handover and block-ing rates below 1 and 2 percent. From a user’sperspective, terminating an ongoing connection isperceived to be a worse problem than blocking of anew connection attempt. However, keeping an ex-tremely low hbr is only achievable by allowing an ex-tremely high nbr. As such, since hbr and nbr blockingrates will exceed desired values during failure or con-


gestion conditions, the objective is to maintain a 0.5hbr/nbr ratio. Lastly, since carried traffic representsnetwork provider revenue, an objective is to maxi-mize this value.

Based on these objectives, we define the followingsurvivability index, where optimum survivability isachieved when SI = 0.

SI h

k A k B k Ckii

i= + +∑

==1

1 2 3

13

( ), 0 ≤ SI ≤ 1,

A br= ∗2 , 0 ≤ A ≤ 1,

B

hbr nbr= +2

, 0 ≤ B ≤ 1,

C

cu maximum channelsmaximum channels

=± _

_ , 0 ≤ C ≤ 1.

6. ExperimentsIn Section 6.1, we demonstrate how the visualizationsproved essential for state characterization of mobilenetwork simulations that incorporate all of the mod-els described in Section 3. (Initially, we use a resourcemanagement model that includes our adaptive chan-nel access scheme and no admission control.) The vi-sual appearance of each metric during normal modeis described. An explanation of the metrics that char-acterize abnormal modes is then given. The effects ofteletraffic loading conditions are discussed through-out, and summarized in a separate section, followedby a section discussing the effects of varying samplingrate and time window size. In Section 6.2, we discusspotential survivability strategies that could be in-voked during each state and give specific examplesfor adaptive admission control and channel accesscontrol.

6.1 State Characterization

6.1.1 Normal ModeSnapshots of normal mode are shown in Figure 5. (Re-fer to Figure 10 for the color map.) The optimal valuefor blocking rate metrics hbr, nbr, and sum is zero—visualized by a flat dark blue color. During simulationof a lightly loaded system, hbr, nbr, and sum will pri-marily remain flat and dark blue, with occasionallighter blue bumps indicating normal bursty traffic.This agrees with our intuition since even in a lightlyloaded system some, blocking will occur as randomcall arrivals may occasionally cluster in the same cell,causing resources to be depleted. The sum metrichelps to filter out some randomness, since in the ab-sence of faults, it is less likely that hbr and nbr willburst at the same spatial and temporal instant. Duringsimulation of a heavily loaded system, all threemetrics become increasingly bursty. The animationdisplays lighter blue bumps moving among all cells,

with an occasional larger, more orange peak in a par-ticular cell.

Under light loading, the blocking rate derivativemetrics, hbr′ and nbr′ , are mostly flat green with anoccasional color spot. These metrics can take on posi-tive or negative values. A zero value indicates nochange (which corresponds to green on a blue to redrainbow color scale). As loading increases and hbr andnbr become more variable, hbr′ and nbr′ becomeslightly more colorful, indicating sharp rises and dipsin hbr and nbr values.

The aim of br is to illustrate the degree to which thenetwork meets the survivability objective of maintain-ing a 0.5 ratio for hbr/nbr. This ratio is managed by ad-mission control. br varies between [-0.5,+0.5]. There-fore, a br that is flat green indicates that the blockingratio objective is being met in real-time. A spike uptowards red indicates that hbr is too high relative tonbr (i.e., our performance has degraded). Conversely,a downward spike towards blue indicates that nbr isrelatively too high (i.e., our cost has degraded). Later,we describe how our adaptive admission control algo-rithms managed this cost/performance tradeoff.

The dcr metric measures dropped calls and variesfrom zero only during transient failure mode. The ocmetric displays new connections arriving to cells andthe har metric displays handovers occurring withineach cell. The cu metric displays channel utilization ateach cell. oc, har, cu, and ct are useful for illustratingthe behavior of our load-based adaptive channel ac-cess (CA) protocol, as described in Section 6.2. ct issimilar to oc, which indicates that for normal opera-tion, carried traffic does not vary greatly from offeredload. The values of oc, har, cu, and ct increase with of-fered load. The visual appearance of the metrics main-tains a spatial uniform look as the color of each metricmoves towards red.

6.1.2 Degraded ModeSnapshots of degraded mode caused by a 20 minutehot-spot in a cell in the very center of the cellsite gridare given in Figure 6. The presence of the hot-spot ismost clearly characterized by spatial and temporalchanges in metrics oc, cu, ct, sum, hbr, and nbr (Table1). A spatial change implies that the value of the met-ric at one cell varies significantly from the value of thesame metric at neighboring cells. A temporal changeimplies that the value of the metric at one cell variessignificantly as compared to a long-term average ofthe value of the metric at the same cell over a recent-past time interval. The visualization animations be-tween the times of each snapshot in Figure 6 showthat the spatial and temporal increase in oc, cu, and ctare persistent, meaning their values remain at an in-creased level for a relatively long time period. Theanimations between the two snapshots also show thatincreases in sum, hbr, and nbr, are recurring, meaning


that their values oscillate from a noticeable high to alower value within a relatively small timeframe.(Note that this is not the same as an arbitrary randomspike.)

6.1.3 Transient Failure ModeSnapshots of transient failure mode caused by a 20-minute failure of channels in a cell in the very centerof the cellsite grid are given in Figure 7. The onset ofthe failure is identified by a sharp temporal increasein the dcr metric. We could also consider the spatialvalue of dcr to help identify the location of the failurewithin the mobile network, e.g., an increase in dcr in asingle cell could indicate a problem at the cellbasestation, while an increase within multiple adja-cent cells could indicate a problem at the basestationcontroller or mobile switching center (which connectmultiple basestation to the fixed network).

6.1.4 Steady-state Failure ModeAs noted in Table 1, this mode is characterized by per-sistent spatial and temporal decreases in cu and ct,and recurring spatial and temporal spikes in sum, hbr,and nbr. (Snapshot not shown.) The “look” of sum,hbr, and nbr for steady-state failure is similar to thatfor degraded mode. However, these two modes aredistinguished by whether ct and cu increase or de-crease. The former implies that congestion resultsfrom an unexpected increase in user demand, whilethe latter implies that congestion results from an un-expected reduction in cell resources. It is important tomake this distinction since one might invoke differentsurvivability strategies for each mode.

6.1.5 Recovery ModeAs listed in Table 1, this mode is characterized by re-curring spatial and temporal rises and dips in oc, cu,and ct metrics. The sum, hbr, and nbr metrics gradu-ally return to their normal look. Actually, we ex-pected that all of the unusual looking metrics wouldgradually return to their normal look during recov-ery. However, cu and ct exhibited a series of high tolow oscillations that seemed to be exacerbated underheavier loading conditions. This very surprising dis-covery required further investigation, which weelaborate upon in Section 6.2.

6.1.6 Effects of Offered LoadThe visual appearance and the measured values ofsome of the real-time metrics vary significantly withoffered load. This affects state characterization acrossvarying loads. For example, bursty metrics such as hbrand nbr become more bursty under increased loads.During very heavy loading, peak values of thesemetrics increase in normal cells, and decrease in failedcells as compared to peak values during light loading.The reason for the decrease is that with greater of-

fered load, the impact of each blocked call is dimin-ished (i.e., the denominator of hbr and nbr equationsbecomes larger, but the numerator remains the same).For our experiments, oc, cu, and ct remain basicallyunchanged over variations in offered load. As such,we conclude that the characterization of each of thefive modes in Table 1 holds across all four offeredloads considered. However, our continuing efforts areto measure each metric as a function of offered load toprovide more robust metrics.

6.1.7 Effects of Metric MeasurementWe experimented with various values of samplingrate and time window size to determine the best fre-quency and time period over which to measure real-time metrics. A larger time window size causes thevisual appearance of bursty metrics to becomesmoother and diminishes the impact of peaks, espe-cially those that indicate abnormal conditions (e.g.,dcr). A smaller time window size has the opposite ef-fect. If the time window size is too small, it is impos-sible to distinguish between bursty activity and fail-ure. If it is too large, failure is masked. For a giventime window size, variations in sampling rate onlyaffect how quickly a failure is detected (i.e., resolutionof failure time is proportional to sample rate). In gen-eral, a large range of values is acceptable and does notdramatically alter the look and usefulness of eachmetric.

6.2 Survivability StrategiesWe next consider the use of state characterization forsurvivable adaptive resource management in twoways. One, our adaptive algorithms use state charac-terization for network monitoring. Two, the visualiza-tions proved to be an essential aid for design/debugof our adaptive algorithms. The visual animations en-abled us to quickly understand algorithm behaviorand dramatically reduce the time taken to evaluatevarious algorithm design and parameter choices. Wediscuss examples of our adaptive channel access andadaptive admission control protocols. As previouslyexplained, our survivability objective is given in Table2.

6.2.1 Adaptive Channel Access ExampleOur system model (Section 3) is such that a mobileuser moves among grid points in the geographicaldata set. A mobile user in need of a channel will con-sider the channel utilization, cu, at the basestation ofeach cell that overlaps the user’s grid point. The mo-bile will request a channel from the basestation withthe lowest cu (assumed to be most lightly loaded). Inaddition, mobile users periodically compare cu valuesat all accessible cells, and may request a voluntaryhandover if one of these cells has significantly lowercu. (Details are in [14, 15].) Therefore, the normal


mode appearance of oc, har, cu and ct, as smallishbumps, with a uniform spatial distribution, confirmsthat our load-based CA algorithm is indeed evenlydistributing load throughout the network.

In degraded mode, we detect an expected increasein oc, cu, and ct in the faulty cell, and an increase inthese metrics in cells surrounding the faulty cell. Thisfurther confirms operation of our CA algorithm. As acell becomes overloaded, it will attempt to offloadcalls in overlap regions to its neighboring cells. Dur-ing failure mode, cu and ct decrease, since less trafficis being carried by the cell that has lost most of itschannels (a failed channel cannot be considered to be“in use”).

The most surprising discovery as to the behavior ofour CA algorithm came from further investigation ofrecovery mode behavior of oc, cu, and ct. The oscilla-tion of these metrics during recovery did not matchour intuition. As soon as all failed channels are re-stored in a cell, the cell immediately has lower cu val-ues as compared to its neighbors. Therefore, we ex-pected oc to continue increasing in the restored celluntil cu and ct in the restored cell matched those val-ues in its neighboring cells. Then we expected oc, cu,and ct to return to their normal appearances. Instead,we found that each of these metrics would oscillate(to unexpectedly low values) for up to 30 minutes be-yond the point of restoration.

A detailed analysis uncovered that a sudden fluc-tuation in cu causes our CA algorithm to generate un-necessary voluntary handover activity. Immediatelyfollowing recovery (Figure 8, far left), the failed cell’scu returns to green (lower loading) as it picks up loadwith newly restored channels. However, soon after(Figure 8, center left), the cell drops to deep blue,while its neighbors turn red. This indicates that therestored cell became overloaded, as compared to itsneighbors, and off-loaded calls using voluntaryhandover. However, too many calls were off-loaded,leaving the cell under-utilized. Thus, the cell returnsto green again (Figure 8c, center right) and repeats thecycle (Figure 8, far right). We attempted to fix this os-cillation through our restoration model. Rather thanrestore all failed cells simultaneously, we experi-mented with restoring channels over a period of time(up to 30 minutes). A longer restoration periodslowed the oscillation cycle, but did not completelysolve the problem. As such, we are looking at ways tofurther modify our CA algorithm to better adapt torecovery mode.

6.2.2 Adaptive Admission Control ExampleOur basic approach to adaptive admission control(AAC) is to vary the size of a guard band maintainedat each cell. The guard band specifies a percentage ofchannels, called δ, that should be set-aside forhandover requests and not allowed for use by new-connection requests. This enables one to manage a

tradeoff between hbr and nbr (details are in [15, 16]).There are a great many ways to implement AAC.Each approach consists of monitoring a real-time met-ric, comparing its value to upper/lower threshold val-ues, and then increasing or decreasing δ based on thiscomparison. We experimented with 4 AAC algo-rithms, called AAC1, AAC2, AAC4, and AAC5.These monitor hbr, hbr′ , br, and br′ , respectively. Fig-ure 9 shows metrics hbr, hbr′ , and br, along with thereal-time value of δ. (AAC 1, 2, 4, 5 occupy rows 1, 2,3, 4, respectively of the snapshot.) The performanceof each algorithm varies significantly depending onthe choice of upper and lower threshold values. Theresponsiveness of each algorithm to abnormal condi-tions also varies greatly. Looking at averaged simula-tion results revealed differences in performance, butgave little insight into the cause of such differences.

The visual animations provided quick, powerfulinsight. For example, the AAC2 and AAC5 that usederivative values are extremely sensitive to minorchanges in threshold values. In many instances thevalue of d would increase in response to burst activityin normal mode. The value of δ would then remain“stuck” at an unnecessarily high value for a long time.Also, these algorithms are less sensitive to degradedmode, which is characterized by a more gradual in-crease in metric values (rather than quick rate ofchange). Thus, in general, using derivatives causedtoo many false reactions in normal mode, and non-responsiveness in fault modes.

The algorithms that used actual values, AAC1 andAAC4, were more responsive, meaning that the valueof δ was increased when needed and decreased whennot needed. However, the optimum setting of thresh-old values is different for different values of offeredload. Also, at higher loads, metric spatial relation-ships are more important to help distinguish faultystates. With the help of the visualization, our study onAAC algorithms continues.

6.2.3 Survivability IndexWe are also studying the development of a survivabil-ity index (SI). The SI function used is described in Sec-tion 5. As shown in Figure 10, the visualization en-ables us to evaluate the real-time contribution ofindividual components (e.g., A, B and C) on the SIfunction.

7. ConclusionsIn this work, we have taken the first step in perform-ing a systematic evaluation of adaptive techniques forcomprehending the complex behavior of mobile net-works under a variety of network conditions, and inparticular, during failures. In contrast to most currentwork, we look at real-time metrics with the aid ofmulti-variate visualization schemes to not only designand evaluate adaptive algorithms, but also to under-


Figure 5. Normal Mode – Left: Lightly Loaded; Right: Heavily Loaded

Figure 6. Degraded Mode Snapshots – Left: Onset of Hotspot; Right: 600 Seconds Later

Figure 7. Transient Failure Mode Snapshots – Left: Onset of Failure; Right: 300 Seconds Later


Figure 9. Comparing AAC 1, 2, 4, and 5

Figure 10. Survivability Index Components (si_A, Si_B, and si_C) and Function (si)

Figure 8. CU Oscillations During Recovery Mode


stand algorithm behavior across a large number ofstates. Visualization permits us to abstract the essenceof large amounts of data, which is an inevitable out-come of using real-time metrics over long simulationruns.

We have introduced a number of real-time metricsthat have been found be useful in influencing adap-tive algorithms and for network monitoring. In par-ticular, we have made progress in identifying andcharacterizing critical states of the network in terms ofthese metrics, and these will form the basis to a soundunderstanding of survivability of mobile networks.

Our experiments already revealed a problem withour channel access algorithm, which simply wouldnot have been possible without the use of real-timemetrics and interactive visualization. Such insights,coupled with domain expertise and the assistance ofvisualization, can make the analysis process very effi-cient. We also demonstrated how various adaptiveadmission control strategies can be quickly comparedand contrasted with the help of the visualization sys-tem and better understanding of the influence andsensitivity of each metric on performance and cost.

In addition to continuing our investigation intotuning the metrics proposed in this work and design-ing better adaptive algorithms, the work on statecharacterization will be followed by specific algo-rithms to permit automatic feature detection and in-vocation of adaptation policies. Work on better under-standing of the survivability index requires furtherinvestigation, especially for comparative analysisacross multiple layers.

While the use of visualization tools in and of itselfhas proved to be of immeasurable value, work onbuilding visualizations that are custom designed tothe challenges facing mobile networks will be pur-sued in the future. The current visualizations displaymetric values in isolation. It is also important to ex-plore spatial and temporal relationships in a more ex-plicit fashion, for instance, between cells within a cer-tain neighborhood or look at multiple metrics incombination.

8. References

[1] Boumerdassi, S. and Beylot, A.-L. “Adaptive channel allocationfor wireless PCN,” MONET, Vol. 4, 1999, pp 111-116.

[2] Lee, K. “Adaptive network support for mobile multimedia.”Proc. Of Intn’l Conf. on Mobile Computing and Networking,Nov. 1996, pp 62-74.

[3] Lu, S. and Bharghavan, V. “Adaptive resource managementalgorithms for indoor mobile computing environments,”Proc. SIGCOMM, Aug. 1996.

[4] Modiano, E. “An adaptive algorithm for optimizing the packetsize used in wireless ARQ protocols,” WINET, Vol. 5, 1999,pp 279-286..

[5] Oliveira, C., Kim, J.B. and Suda, T. “An adaptive bandwidthreservation scheme for high-speed multimedia wireless net-works,” IEEE JSAC, Vol. 16, No. 6, Aug. 1998, pp 858-874.

[6] Priscoli, F.D. and Sestini, F. “Fixed and adaptive blockingthresholds in CDMA cellular networks,” IEEE Personal Com-munications, April 1998, pp 56-63.

[7] Eckhardt, D.A. and Sttenkiste, P. “A trace-based evaluation ofadaptive error correction for a wireless local area network,”MONET, Vol. 4 (1999), pp 273-287.

[8] Iera, A., Marano, S. and Molinaro, A. “Transport and controlissues in multimedia wireless networks,” WINET, Vol. 2,1996, pp 249-261.

[9] Anastasi, G., Lenzini, L., Mingozzi, E., Hettich, A. andKramling, A. “MAC protocols for wideband wireless localaccess: Evolution toward wireless ATM,” IEEE Personal Com-munications, Oct. 1998, pp 53-64.

[10] Harpantidou, Z. and Paterakis, M. “Random multiple access ofbroadcast channels with pareto distributed packetinterarrival times,” IEEE Personal Communications, April1998, pp 48-55.

[11] Ryu, B. “Modeling and simulation of broadband satellite net-works—Part II: Traffic modeling,” IEEE CommunicationsMag., July 1999, pp 48-56.

[12] Talukdar, A.K., Badrinath, B.R. and Acharya, A. “Integratedservices packet networks with mobile hosts: Architectureand performance,” WINET, Vol. 5, 1999, pp 111-124.

[13] Tipper,D., Ramaswamy, S. and Dahlberg, T.A. “PCS NetworkSurvivability,” Proc. of the Mobile and Wireless CommunicationNetworks Conference (WCNC), Sept. 1999.

[14] Dahlberg, T. and Jung, J. “Survivable Load Sharing Protocols:A Simulation Study,” ACM/Baltzer Wireless Networks Journal(WINET), to appear. Retrieve from: http://www.cs.uncc.edu/~tdahlber/tdahlberbio.html#pub

[15] Dahlberg, T.A., Subramanian, K.R. “Visualization of Real-timeSurvivability Metrics for Mobile Networks,” Proc. of the Mod-eling and Simulation of Wireless and Mobile Systems (MSWiM),Aug. 2000.

[16] Subramanian, K.R., Dahlberg, T. “Congestion Control in Mo-bile Networks,” IEEE Symposium on Information Visualization2000 (InfoVis 2000), Oct. 2000, IEEE Computer Society.

[17] Cleveland, W.S. The Elements of Graphing Data, WadsworthInc., 1985

[18] LeBlanc, J., Ward, M.O., and Wittels, N. “Exploring N-dimen-sional databases,” Proceedings of Visualization 1990, San Fran-cisco, CA., Oct., 1990, pp 230-237.

[19] Feiner, S. and Beshers, C. “Automated Design of VirtualWorlds for Visualizing Multivariate Relations,” Proceedingsof Visualization 1992, Boston, MA, Oct. 1992, pp 283-290.

[20] van Wijk, J.J., van Liere, R. “HyperSlice: Visualization of Sca-lar functions of many variables,” Proceedings of Visualization1993, San Jose, CA, Oct. 1993, pp 119-125.

[21] Buja, A., McDonald, J.A., Michalak, J. and Stuetzle, W. “Inter-active Data Visualization Using Focusing and Linking,” Pro-ceedings of Visualization '91, San Diego, CA, Oct. 1991, pp 22-25.

[22] Henderson, D.A. and Card, S.K. “Rooms: The use of multiplevirtual workspaces to reduce space contention in a windowbased graphical user interface,” ACM Transactions on Graph-ics, Vol. 5, No. 3, pp 211-243.

[23] Inselberg, A., and Dimsdale, B. “Parallel Coordinates: A Toolfor visualizing Multi-dimensional Geometry,” Proceedings ofVisualization '90, San Francisco, CA., Oct. 1990, pp 361-375.

[24] Inselberg, A. “Multidimensional Detective,”Proceedings IEEEInformation Visualization, Phoenix, AZ., Oct. 1997, pp 100-107.

[25] Ward, M.O. “XmdvTool: Integrating Multiple Methods forVisualizing Multivariate Data,” Proceedings of Visualization1994, Washington, DC., Oct. 17-21, 1994, pp 326-334.

[26] Fua, Y.H., Ward, M.O. and Rundensteiner, E.A. “Structure-Based Brushes: A Mechanism for Navigating HierarchicallyOrganized Data and Information Spaces,” IEEE Transactionson Visualization and Computer Graphics, Vol. 6, No. 2, 2000, pp150-159.

[27] Jain, R. The Art of Computer Systems Performance Analysis,John Wiley & Sons, 1991.


[28] Schroeder, W., Martin, K. and Lorensen, B. The VisualizationToolkit: An Object-Oriented Approach to 3D Graphics, PrenticeHall, Inc., Second Edition, 1998.

[29] Moitra, S.D., Oki, E. and Yamanaka, N. “Some New Surviv-ability Measures for Network Analysis and Design,” IEICETrans. Communications, Vol. E80-B, No. 4, April 1997.

Teresa A. Dahlberg([email protected]) is an assistantprofessor of Computer Science inthe College of Information Technol-ogy at the University of North Caro-lina at Charlotte. She holds aB.S.E.E. from the University of Pitts-burgh and M.S. and Ph.D. degreesin Computer Engineering fromNorth Carolina State University. Shepreviously worked in the ECE de-partment and was with IBM Corpo-

ration for several years. Her research, within the Multime-dia Computing and Networking laboratory, is supportedby the NSF. Her current funded projects include wirelessaccess network survivability, interactive visualization ofmobile network simulations, development of a secure, mul-timedia, wireless network testbed, and integrated solutionsfor fault tolerant middleware and networking.

Kalpathi Subramanian is an associate professor in the de-partment of computer Science at the University of NorthCarolina at Charlotte. He obtained a B.E(Honors) in Elec-tronics and Communication Engineering from the Univer-sity of Madras in 1983, followed by MS (1987) and PhD(1990) in Computer Science at the University of Texas atAustin. Between 1991 and 1993 he was a post-doctoralmember of technical staff at AT&T Bell Laboratories inMurray Hill, New Jersey. He was a visiting NIH faculty fel-low during the summers of 2000 and 2001. His research in-terests include large scale scientific and information visual-ization, data representations and models, and medical dataimaging/visualization.

visualization of mobile network simulationsseptember-october 2001 simulation 103 technical article...

Documents