
694 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART C: APPLICATIONS AND REVIEWS, VOL. 37, NO. 4, JULY 2007

Mobile Network Analysis Using Probabilistic Connectivity Matrices

Richard R. Brooks, Senior Member, IEEE, Brijesh Pillai, Stephen Racunas, and Suresh Rai, Senior Member, IEEE

Abstract—Researchers use random graph models to analyze complex networks that have no centralized control such as the Internet, peer-to-peer systems, and mobile ad hoc networks. These models explain phenomena like phase changes, clustering, and scaling. It is necessary to understand these phenomena when designing systems where exact node configurations cannot be known in advance. This paper presents a method for analyzing random graph models that combines discrete mathematics and probability theory. A graph connectivity matrix is constructed where each matrix element is the Bernoulli probability that an edge exists between two given nodes. We show how to construct these matrices for many graph classes, and use linear algebra to analyze the connectivity matrix. We present an application that uses this approach to analyze network cluster self-organization for sensor network security. We conclude by discussing the use of these concepts in mobile systems design.

Index Terms—Ad hoc networks, peer-to-peer (P2P) networks, random graphs, scale-free graphs.

I. INTRODUCTION

An increasing number of networks are constructed without central planning or organization. Examples include the Internet, ad hoc wireless networks, and peer-to-peer (P2P) systems like Napster and Gnutella. Mobile computing implementations often fit this category, since user positions vary unpredictably. On the other hand, it is often quite easy to determine the aggregate statistics for the user classes.

Traditional methods of analysis are often inappropriate for these systems, since the exact topology of the system at any point in time cannot be known. For these reasons, researchers turn to statistical or probabilistic models to describe and analyze these network classes [1]–[3]. Random graph and percolation theories allow us to use statistical descriptions of component behaviors to determine many useful characteristics of the global system. This paper presents a network analysis technique that combines random graph theory, percolation theory, and linear algebra for analyzing statistically defined networks.

Random graph theory originated with the seminal works of Erdős and Rényi in the 1950s. Until then, graph theory considered either specific graph instances or deterministically defined graph classes. Erdős and Rényi considered graph classes with a uniform probability for edges existing between any two nodes. Their results were mathematically interesting and found applications in a number of practical domains [1].

Another random network model, given in [2], is used to study ad hoc wireless networks like those used in many mobile networks. A set of nodes is randomly distributed in a 2-D region. Each node has a radio with a given range r. A uniform probability exists (in [2], the probability is 1) for edges being formed between nodes as long as they are within range of each other. This network model has obvious practical applications. Many of its properties resemble those of Erdős–Rényi graphs, yet it also has significant clustering like the small-world model [3].

This paper introduces a technique for analyzing random and pseudorandom graph models. We construct connectivity matrices for random graph classes, where every matrix element is the probability that an edge exists between two given nodes. This technique contains elements of discrete mathematics, linear algebra, and percolation theory. It is useful for a number of applications. Applications already documented include system dependability [7] and quality of service (QoS) [8] estimation.

In this paper, we show how to use our approach to design and analyze a self-organizing ad hoc sensor network. These results are derived from our network security [9] and distributed sensing [10] work. The application picks k nodes at random to serve as coordinators for all nodes within h hops of the coordinator. The union of these coordination regions needs to form a viable network infrastructure. We show how to analytically determine appropriate values for h and k for a system with a random substrate. This approach has many advantages over deterministic network deployments. Most notably, it is tolerant of component failures and requires minimal advance planning. If the system statistics are adequately described and analyzed, a viable structure will almost certainly be constructed by the network without outside intervention.

The layout of the paper is as follows. Section II reviews graph theory concepts used in this paper. Section III describes the random graph classes considered and the construction of connectivity matrices for random graph classes. This paper documents matrix construction for Erdős–Rényi and range-limited graphs. Matrix construction for small-world graphs, scale-free networks, and graphs using regular tessellations (as considered in percolation theory) can be found in [7] and [11]. Section IV shows some uses of connectivity matrices for analyzing the network classes that they represent. Section V applies these results to the design of an embedded network key-management system. Conclusions and ideas for future research are in Section VI.

Manuscript received May 17, 2005; revised November 21, 2005. This work was supported in part by the NSF Grant CCR 0310916 and in part by the ARO Grant W911-NF-05-10226. This paper was recommended by Associate Editor J. Wang.

R. R. Brooks and B. Pillai are with the Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634 USA (e-mail: [email protected]; [email protected]).

S. Racunas is with the Computational Learning Laboratory, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]).

S. Rai is with the Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803 USA (e-mail: suresh@ece.lsu.edu).

Digital Object Identifier 10.1109/TSMCC.2007.897484

1094-6977/$25.00 © 2007 IEEE


Fig. 1. On the left is a graph of six nodes. On the right is its associated connectivity matrix.

II. PRELIMINARIES

A graph is traditionally defined as the tuple [V, E]. V is a set of vertices, and E is a set of edges. Each edge e is defined as (i, j) where i and j designate the two vertices connected by e. In this paper, we consider only undirected graphs where (i, j) = (j, i). (Many systems are modeled using directed graphs (di-graphs) where (i, j) ≠ (j, i).) An edge (i, j) is incident on vertices i and j. We do not consider multigraphs where multiple edges can connect the same end points. We use the terms vertex and node interchangeably. Edge and link are also used synonymously.

Many data structures have been used as practical representations of graphs. Common representations and their uses can be found in [12]. For example, a graph where each node has at least one incident edge can be fully represented by the list of edges. Another common representation of a graph, which we explore in more depth, is the connectivity matrix. The connectivity matrix M is a square matrix where each element m(i, j) is 1 (0) if there is (not) an edge connecting the vertices i and j. For undirected graphs, this matrix is symmetric. Fig. 1 shows a simple graph and its associated connectivity matrix.

As a matter of convention, the diagonal of the matrix can consist of either 0s or 1s. 1s are frequently used, based on the simple assertion that each vertex is connected to itself. We use the convention where the diagonal is filled with 0s. Justification for it is given in Section IV-A.

A walk of length z is an ordered list of z edges $[(i_0, j_0), (i_1, j_1), \ldots, (i_{z-1}, j_{z-1})]$, where each vertex $j_a$ is the same as vertex $i_{a+1}$. A path of length z is a walk where all $i_a$ are unique. If $j_{z-1}$ is the same as $i_0$, the path is a cycle.

A connected component is a set of vertices where there is a path between any two vertices in the component. (In the case of di-graphs, this is a fully connected component.) A complete graph has an edge directly connecting any two vertices in the graph. A complete subgraph is a subset of vertices in the graph with edges directly connecting any two members of the set.

A useful property of connectivity matrices is the fact that element $m^z(i, j)$ of the power z of graph G's connectivity matrix M (i.e., $M^z$) is the number of walks of length z from vertex i to vertex j that exist on G [13]. This can be verified using the definition of matrix multiplication and the definition of the connectivity matrix. It is possible to find the connected components in a graph using iterative computation of $M^z$, resetting nonzero elements to 1 after each exponentiation:

• set all nonzero elements of M to 1, giving $C_1$;
• set $C_{i+1}$ to $C_i C_1$;
• set all nonzero elements of $C_{i+1}$ to 1;
• set $C_{i+1}$ to the inclusive or of $C_{i+1}$ and $C_i$;
• stop when $C_{i+1}$ is equal to $C_i$.

Each row of $C_i$ has a 1 in the element corresponding to each node in the same connected component. The number of distinct rows is the number of connected components.
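The iteration above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `connected_components` is hypothetical, and the sketch assumes the diagonal of $C_1$ is set to 1 so that an isolated node still yields its own distinct row (with an all-zero diagonal, two isolated nodes would share the same all-zero row).

```python
def connected_components(adj):
    """Count connected components by repeated Boolean matrix products.

    adj is a square 0/1 connectivity matrix given as a list of lists.
    Assumption (not from the text): each node is marked as reaching
    itself, so isolated nodes produce distinct rows.
    """
    n = len(adj)
    # C_1: threshold M to 0/1 and put 1s on the diagonal.
    c1 = [[1 if (adj[i][j] != 0 or i == j) else 0 for j in range(n)]
          for i in range(n)]
    ci = c1
    while True:
        # C_{i+1} = C_i C_1, thresholded to 0/1, then OR-ed with C_i.
        nxt = [[1 if (ci[i][j] or
                      any(ci[i][l] and c1[l][j] for l in range(n))) else 0
                for j in range(n)]
               for i in range(n)]
        if nxt == ci:  # stop when C_{i+1} equals C_i
            break
        ci = nxt
    # Each distinct row is the membership vector of one component.
    return len({tuple(row) for row in ci}), ci


# Edges 0-1, 1-2 and 3-4; node 5 is isolated.
example = [
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
count, closure = connected_components(example)
print(count, closure[0])
```

The Boolean product costs $O(n^3)$ per iteration, so this form is mainly useful for illustrating the matrix view; a breadth-first search finds the same components faster on large graphs.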

III. MATRIX CONSTRUCTION

We now show how to construct connectivity matrices for analyzing classes of random and pseudorandom graphs. The first model that we discuss is the Erdős–Rényi random graph [14]; we then consider a graph model of an ad hoc wireless network.

A. Erdős–Rényi Random Graphs

Erdős–Rényi graphs are defined by the number of nodes n and a uniform probability p of an edge existing between any two nodes. We use E for |E| (i.e., the number of edges in the graph). Since the degree of a node is essentially the result of multiple Bernoulli trials, the node degree of an Erdős–Rényi random graph follows a binomial distribution. Therefore, as n approaches infinity, the degree distribution approaches a Poisson distribution.

It has been shown that the expected number of hops between nodes in these graphs grows proportionally to the log of the number of nodes [15]. It is to be noted that Erdős–Rényi graphs do not necessarily form a single connected component. When $E - n/2 < -n^{2/3}$, the graph is in a subcritical phase and almost certainly not connected. A phase change occurs in the critical phase where $E = n/2 + O(n^{2/3})$, and in the supercritical phase where $E - n/2 > n^{2/3}$, a single giant component becomes almost certain. When $E = (n \log n)/2 + O(n)$, the graph is fully connected [16]. (Note that the expected number of edges for an Erdős–Rényi graph is $n(n - 1)p/2$.)

Definition: The probabilistic connectivity matrix of an n node random graph is an n-by-n matrix where each element (j, k) is the Bernoulli probability that an edge exists between nodes j and k. By convention, we set elements where j = k to 0. The probabilistic connectivity matrix construct translates random graph classes into an equivalent set of probabilities for the existence of edges between two given nodes. It is to be noted that, in contrast to many matrix representations of stochastic systems, the rows and columns of M do not necessarily sum to 1.

As an example, for an Erdős–Rényi graph with n set to 3 and p set to 0.25, the probabilistic connectivity matrix is

$$\begin{bmatrix} 0 & 0.25 & 0.25 \\ 0.25 & 0 & 0.25 \\ 0.25 & 0.25 & 0 \end{bmatrix}. \tag{1}$$
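A minimal sketch of this construction follows. The helper names `erdos_renyi_matrix` and `expected_edges` are illustrative only; the sketch builds matrix (1) and checks the expected edge count $n(n - 1)p/2$ noted above.

```python
def erdos_renyi_matrix(n, p):
    """Probabilistic connectivity matrix: p off the diagonal, 0 on it."""
    return [[0.0 if j == k else p for k in range(n)] for j in range(n)]


def expected_edges(matrix):
    """Expected edge count: sum of the entries above the diagonal."""
    n = len(matrix)
    return sum(matrix[j][k] for j in range(n) for k in range(j + 1, n))


M = erdos_renyi_matrix(3, 0.25)
for row in M:
    print(row)
print(expected_edges(M), 3 * (3 - 1) * 0.25 / 2)
```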

B. Ad Hoc Wireless Networks

Mobile wireless networks, in particular ad hoc wireless networks, with no fixed infrastructure, are suited to analysis using random graphs. A fixed radius model for random graphs is used in [2] to analyze phase-change problems in ad hoc network design. In Section V, we study phase changes in an ad hoc sensor network to determine whether or not a given system will produce a viable sensor network. After the phase change, a system can almost certainly self-organize into a viable network. Before the phase change, self-organization is virtually impossible. The approach presented in this paper is used to predict where the phase change occurs.

The model in [2] places nodes at random in a limited 2-D region. Two uniform random variables provide a node's x and y coordinates. Since nodes in proximity with each other have a high probability of being able to communicate, the distance r between pairs of nodes is used as a threshold. If r is less than a given value, then an edge exists between the pair of nodes. Otherwise, no edge exists. Many similarities exist between this graph class and the graphs studied by Erdős and Rényi.

The analysis in [2] looks at finding phase transitions for constraint satisfaction problems. Range-limited graphs differ from Erdős–Rényi graphs in that they have significant clustering. We use the model from [2] with one change: where they create an edge with probability 1 when the distance between two nodes is less than the threshold, we allow the probability to be any value in the range [0, 1].

The range-limited graph class differs from Erdős–Rényi and other random graph classes in that while it is defined by a random process, the random process determines edge creation only indirectly. This makes it difficult, if not impossible, to undertake formal analysis. Instead of formally decomposing the graph definition into a set of Bernoulli probabilities, we are forced to derive a model that approximates system behavior. We provide results in Sections IV and V showing that this model, while not perfect, is a useful tool for predicting system behavior.

We construct range-limited graphs using the following parameters:

• n is the number of nodes;
• max x (max y) is the size of the region in the x (y) direction;
• r is the maximum distance between nodes where connections are possible;
• p is the probability that an edge exists connecting two nodes within range of each other.

Range-Limited Graph Model Definition: For range-limited graphs, element (j, k) of the probabilistic connectivity matrix has value

$$p\,(2c - c^2) \tag{2}$$

where c is a constant defined by

$$c = r^2 - \left(\frac{j}{n + 1} - \frac{k}{n + 1}\right)^2 \tag{3}$$

when $r^2 \ge \left(\frac{j}{n + 1} - \frac{k}{n + 1}\right)^2$, and 0 otherwise.

Range-Limited Graph Model Derivation: Each element (j, k) of the probabilistic connectivity matrix is defined by the probability that an edge exists between the pair of nodes j and k that is given by (2) and (3). Derivation of (2) and (3) proceeds in two steps:

Step 1: sort nodes by the x coordinate value (y could be used as well; the choice is arbitrary) and use order statistics to find the expected value of the x coordinate for each node; and

Step 2: determine the probabilities that an edge exists between two nodes using the expected values from Step 1.

By definition, each node is located at a point defined by two random variables: the x and y coordinates. Without loss of generality, max x and max y are used to normalize the values of x, y, and r to the range [0, 1]. Constant scaling factors are needed to compensate for lack of symmetry when max x ≠ max y.

Rank statistics are used to estimate the probability that two given nodes k and j are within communications range. To do this, sort each point by its x (or y) coordinate. For n samples from a uniform distribution of range [0, 1], the rank statistics give the expected value of the jth smallest as j/(n + 1) with variance

$$\frac{1}{n + 2}\left(\frac{j}{n + 1}\right)\left(1 - \frac{j}{n + 1}\right).$$

Node position j in the sorted list, therefore, has expected value j/(n + 1). Since our ad hoc network model uses the Euclidean distance metric, an edge exists between two nodes j and k with probability p when

$$(x_j - x_k)^2 + (y_j - y_k)^2 \le r^2. \tag{4}$$

Entering the expected values for the x coordinate of the nodes of rank j and k and reordering terms, this becomes

$$(y_j - y_k)^2 \le r^2 - \left(\frac{j}{n + 1} - \frac{k}{n + 1}\right)^2. \tag{5}$$

By definition, the random variables giving the x and y positions of the nodes are uniformly distributed and uncorrelated.

Relation (5) asks for the probability that the square of the difference of two normalized uniform random variables is less than the constant value c, which we define as the right-hand side of (5). Fig. 2 presents this as a geometry problem. The values of the two uniform random variables with range [0, 1] describe a square region where every point is equally likely. The white region in the lower right-hand corner of Fig. 2 is the area that does not satisfy (5) because $y_j - y_k$ is greater than c. It is a right triangle, whose hypotenuse has these end points:

• When $y_k$ is 0, $y_j$ cannot be greater than c. The triangle base has length 1 − c.
• When $y_j$ is 1, $y_k$ cannot be less than 1 − c. The triangle height is 1 − c.

The area of this triangle is, therefore, $(1 - c)^2/2$.

The region that does not satisfy (5) because $y_j - y_k$ is less than $-c$ is contained in the triangle in the upper left-hand corner of Fig. 2. The area of that region is also $(1 - c)^2/2$, which can be demonstrated either by using symmetry or by repeating the logic in the previous paragraph and switching the variable names.

Fig. 2. Geometric representation of (5). The regions that do not satisfy the inequality are white.

Summing the areas of the two white triangles in Fig. 2 gives

$$(1 - c)^2. \tag{6}$$

Since the area satisfying (5) is the area not contained in the two white triangles, we obtain the likelihood that nodes j and k are within communications range as

$$1 - (1 - c)^2 = 1 - (1 - 2c + c^2) = 2c - c^2. \tag{7}$$

Multiplying (7) by the probability p that two nodes within range can communicate ends the derivation of (2) and (3).
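The area argument behind (6) and (7) can be checked numerically. The sketch below is an illustration added here, not part of the original derivation: it follows the geometric argument in treating c as the bound on $|y_j - y_k|$, lays a midpoint grid over the unit square, and compares the fraction of grid points inside the band with $2c - c^2$. The function name `within_band_fraction` and the grid size are assumptions.

```python
def within_band_fraction(c, grid=200):
    """Fraction of the unit square where |u - v| <= c (midpoint grid)."""
    hits = 0
    for a in range(grid):
        u = (a + 0.5) / grid
        for b in range(grid):
            v = (b + 0.5) / grid
            if abs(u - v) <= c:
                hits += 1
    return hits / (grid * grid)


for c in (0.1, 0.3, 0.5):
    estimate = within_band_fraction(c)
    closed_form = 2 * c - c * c  # equation (7)
    print(f"c={c}: grid estimate {estimate:.4f}, closed form {closed_form:.4f}")
```

The grid estimate differs from the closed form by a discretization error on the order of 1/grid, so the two values agree only approximately.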

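An example matrix for six nodes in a unit square with r = 0.3 and p = 1.0 is

$$\begin{bmatrix}
0 & 0.134 & 0.0167 & 0 & 0 & 0 \\
0.134 & 0 & 0.134 & 0.0167 & 0 & 0 \\
0.0167 & 0.134 & 0 & 0.134 & 0.0167 & 0 \\
0 & 0.0167 & 0.134 & 0 & 0.134 & 0.0167 \\
0 & 0 & 0.0167 & 0.134 & 0 & 0.134 \\
0 & 0 & 0 & 0.0167 & 0.134 & 0
\end{bmatrix}. \tag{8}$$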
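As a sketch of how (2) and (3) translate into code, the following hypothetical helper `range_limited_matrix` fills the matrix for nodes ranked 1 to n, assuming coordinates already normalized to the unit square. With n = 6, r = 0.3, and p = 1.0 it yields the entries of (8) after rounding.

```python
def range_limited_matrix(n, r, p):
    """Approximate probabilistic connectivity matrix from (2) and (3).

    Nodes are indexed 1..n by the rank of their normalized x coordinate.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):
        for k in range(1, n + 1):
            if j == k:
                continue  # diagonal stays 0 by convention
            c = r * r - ((j - k) / (n + 1)) ** 2  # equation (3)
            if c >= 0:
                matrix[j - 1][k - 1] = p * (2 * c - c * c)  # equation (2)
    return matrix


M = range_limited_matrix(6, 0.3, 1.0)
for row in M:
    print(" ".join(f"{value:.4f}" for value in row))
```

Because each entry depends only on $|j - k|$, the matrix is symmetric and banded, which matches the structure visible in (8).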

Fig. 3 shows a 3-D plot of an example matrix. When we compare the number of edges for range-limited graphs constructed directly with the number for graphs constructed using the probabilistic connectivity matrices as a function of n and r, the approximation achieved by this model is good, but not perfect. One reason for the deviation is the use of expected values in the derivation. For graph instances with a small number of nodes, the variance of the node positions is greater. Second-order effects are possible. Using expected values also assumes independence between random variables. Independence may not strictly hold throughout the range-limited graph-construction process. As we discuss in Sections IV and V, in spite of undercounting the number of edges, this model is very useful for predicting many aspects of network behavior. In particular, we have found it very useful for predicting where phase changes occur in the system.

Fig. 3. Three-dimensional plot of the connectivity matrix for a range-limited graph of 35 nodes with a range of 0.3.

IV. MATRIX CHARACTERISTICS

We have illustrated how to construct probabilistic connectivity matrices for two graph classes. We now discuss the structure and meaning of the matrices. By definition, connectivity matrices are square with the numbers of both rows and columns equal to the number of vertices in the graph (n). Each element (j, k) is the probability that an edge exists between the nodes j and k. Since we consider only nondirected graphs, (j, k) must equal (k, j), and therefore, care needs to be taken to guarantee that algorithms for constructing matrices provide symmetrical results.

An instance of a graph class can be produced by using the probabilistic connectivity matrix and performing n(n − 1)/2 Bernoulli trials. One trial is made for each element (j, k) where k > j. If it is successful, edge (j, k) exists. This produces an instance of the graph class desired for all the classes discussed, with the caveat mentioned previously that the range-limited connectivity matrix is based on a model that approximates the statistics of the actual process. The graph constructed has slightly different statistics than actual range-limited graphs.
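The sampling procedure can be sketched as follows. The function name `sample_graph` and the use of Python's `random.Random` as the source of Bernoulli trials are illustrative assumptions; any uniform generator would serve.

```python
import random


def sample_graph(matrix, rng):
    """Draw one graph instance: one Bernoulli trial per pair j < k."""
    n = len(matrix)
    adj = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            # rng.random() is uniform on [0, 1), so the trial succeeds
            # with probability matrix[j][k].
            if rng.random() < matrix[j][k]:
                adj[j][k] = adj[k][j] = 1  # undirected edge
    return adj


# Erdős–Rényi class with n = 5 and p = 0.4, seeded for repeatability.
prob = [[0.0 if j == k else 0.4 for k in range(5)] for j in range(5)]
instance = sample_graph(prob, random.Random(0))
for row in instance:
    print(row)
```

Running the trials only for k > j and mirroring the result keeps the sampled connectivity matrix symmetric, as required for undirected graphs.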

Theorem 1: The sum of each row (column) of the probabilistic connectivity matrix M provides the expected degree of the corresponding node in G.

Proof: The expected value of a single trial of a Bernoulli distribution is the probability of success. The expected value of a sum of random variables is the sum of the expected values. Therefore, the expected number of edges incident on node j is $\sum_{k=0}^{n-1} (j, k)$. QED.
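As a quick worked check of Theorem 1 on matrix (1) (n = 3, p = 0.25), every row sum, and hence every expected node degree, is

$$\sum_{k} (j, k) = 0.25 + 0.25 = 0.5.$$

Since each edge contributes to the degree of two nodes, halving the sum of the expected degrees recovers the expected number of edges given in Section III-A:

$$\frac{3 \times 0.5}{2} = 0.75 = \frac{n(n - 1)p}{2}.$$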

A. Probability of Walks of z Hops Between Nodes

We now consider uses of probabilistic connectivity matrices. The first application calculates the likelihood of connections of multiple hops between nodes. To do so, we define an analog to matrix multiplication.

Theorem 2: The probability that a path of two hops exists between nodes j and k (j ≠ k) in a random graph is

$$1 - \prod_{l \ne j,k} \bigl(1 - (j, l)(l, k)\bigr) \tag{9}$$

where (j, l) and (l, k) are elements of the probabilistic connectivity matrix.

Proof: Each element (j, k) of the probabilistic connectivity matrix is the probability that an edge exists between the nodes j and k. Since self-loops are not considered, a path of length 2 between nodes j and k must pass through an intermediate node l that is neither j nor k. This value is the probability of the union of a set of events defined by the likelihoods of paths through all intermediate nodes.

The product of two probabilities is the likelihood of the intersection of their two events when they are independent. Since the existence of each edge in the graph is determined by an independent Bernoulli trial, the likelihood of edges existing simultaneously from node j to node l and node l to node k is the product of elements (j, l) and (l, k).

The probability that either of two independent events j and k occurs is $p_j + p_k - p_j p_k$. The probability that at least one of three events j, k, and l occurs can be computed recursively as $p_l + (p_j + p_k - p_j p_k) - p_l (p_j + p_k - p_j p_k)$. This is commonly referred to as inclusion-exclusion. As the number of events increases, the number of terms involved also increases, making this computation awkward for a large number of events. An equivalent computation is

$$1 - \prod_{l \ne j,k} (1 - p_{jl}\, p_{lk}). \tag{10}$$

Equation (10) is more efficient to compute. It computes the complement of the intersection of all the complements of the atomic events, which is logically equivalent to the union of the set of events. Since the matrix elements represent probabilities, (9) is the probability that a path of two hops exists between the nodes j and k. QED.
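The equivalence used in the proof can be illustrated with a short sketch that evaluates the union probability of independent events both ways. The function names are hypothetical, and the event probabilities in the usage lines are arbitrary values chosen for illustration.

```python
from itertools import combinations
from math import prod


def union_inclusion_exclusion(probs):
    """P(at least one event) by inclusion-exclusion over all subsets."""
    total = 0.0
    for size in range(1, len(probs) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(probs, size):
            total += sign * prod(subset)
    return total


def union_complement_product(probs):
    """Same probability in the form of (10): 1 minus the product of complements."""
    return 1.0 - prod(1.0 - q for q in probs)


# Arbitrary probabilities for three independent events.
events = [0.2, 0.5, 0.7]
print(union_inclusion_exclusion(events), union_complement_product(events))
```

Inclusion-exclusion visits every nonempty subset, so its cost grows as $2^m$ for m events, while the product form needs only m multiplications.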

Definition: Probabilistic-matrix multiplication is defined for probabilistic connectivity matrices using (9). Since all connectivity matrices are square, probabilistic-matrix multiplication is defined only for matrices of the same dimension (n × n). The product of matrix A and matrix B is a new matrix AB where each element ab(j, k) (j ≠ k) of matrix AB is

$$ab(j, k) = 1 - \prod_{l \ne j,k} \bigl(1 - a(j, l)\, b(l, k)\bigr) \tag{11}$$

where a(j, l) and b(l, k) are elements of matrices A and B, respectively. Element ab(j, j) is, by convention, always 0.
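A direct transcription of (11) might look like the sketch below. The name `prob_matmul` is illustrative, and matrices are assumed to be square lists of lists of the same size.

```python
def prob_matmul(a, b):
    """Probabilistic-matrix product of two n-by-n matrices, following (11)."""
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                continue  # ab(j, j) is 0 by convention
            miss = 1.0  # probability that no intermediate node l links j to k
            for l in range(n):
                if l != j and l != k:
                    miss *= 1.0 - a[j][l] * b[l][k]
            out[j][k] = 1.0 - miss
    return out


# Three-node chain 0 - 1 - 2 with edge probabilities 0.5 and 0.4.
A = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.4],
    [0.0, 0.4, 0.0],
]
two_hop = prob_matmul(A, A)
print(two_hop[0][2])  # the only two-hop route from 0 to 2 passes through node 1
```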

The similarity between this definition and standard-matrix multiplication should be obvious. Equation (11) is needed to maintain independence when summing probabilities. As a matter of convention, we set the diagonal elements (j, j) of probabilistic connectivity matrices to 0. Our applications typically concern the likelihood that paths exist between nodes, computed from the likelihoods of paths passing through any intermediate node. The value (j, j) is the probability that a path connects node j with itself. The existence of loops in the graph does not increase the likelihood that other nodes are connected. Constraining diagonal values to 0 automatically removes loops from our calculations.

Theorem 3: For a graph class represented by a probabilistic connectivity matrix M, element (j, k) of $M^z$ is the probability that a walk of length z exists between the nodes j and k. Here, $M^z$ is the product of M with itself z times using our conventions.

Proof: The proof is by induction. By definition, each element (j, k) is the probability that an edge exists between the nodes j and k. $M^2$ is the result of multiplying matrix M with itself. Using Theorem 2, each element (j, k) of $M^2$, except the diagonals, is the probability that a path of length 2 exists between nodes j and k. Using the same logic, $M^z$ is calculated from $M^{z-1}$ using matrix multiplication to consider all possible intermediate nodes l between the nodes j and k, where $M^{z-1}$ has the probabilities of a walk of length z − 1 between j and l, and M has the values defined previously. QED.

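Example 1: Probabilities of walks of length 3 in an Erdos–Renyi graph of four nodes for p = 0.65 are

$$
M = \begin{bmatrix} 0 & 0.65 & 0.65 & 0.65 \\ 0.65 & 0 & 0.65 & 0.65 \\ 0.65 & 0.65 & 0 & 0.65 \\ 0.65 & 0.65 & 0.65 & 0 \end{bmatrix}, \quad
M^2 = \begin{bmatrix} 0 & 0.666 & 0.666 & 0.666 \\ 0.666 & 0 & 0.666 & 0.666 \\ 0.666 & 0.666 & 0 & 0.666 \\ 0.666 & 0.666 & 0.666 & 0 \end{bmatrix}, \quad
M^3 = \begin{bmatrix} 0 & 0.679 & 0.679 & 0.679 \\ 0.679 & 0 & 0.679 & 0.679 \\ 0.679 & 0.679 & 0 & 0.679 \\ 0.679 & 0.679 & 0.679 & 0 \end{bmatrix} \tag{12}
$$

and for p = 0.6 are

$$
M = \begin{bmatrix} 0 & 0.6 & 0.6 & 0.6 \\ 0.6 & 0 & 0.6 & 0.6 \\ 0.6 & 0.6 & 0 & 0.6 \\ 0.6 & 0.6 & 0.6 & 0 \end{bmatrix}, \quad
M^2 = \begin{bmatrix} 0 & 0.59 & 0.59 & 0.59 \\ 0.59 & 0 & 0.59 & 0.59 \\ 0.59 & 0.59 & 0 & 0.59 \\ 0.59 & 0.59 & 0.59 & 0 \end{bmatrix}, \quad
M^3 = \begin{bmatrix} 0 & 0.583 & 0.583 & 0.583 \\ 0.583 & 0 & 0.583 & 0.583 \\ 0.583 & 0.583 & 0 & 0.583 \\ 0.583 & 0.583 & 0.583 & 0 \end{bmatrix}. \tag{13}
$$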
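The values in Example 1 can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are our own, and the element rule c_jk = 1 − ∏_l (1 − a_jl b_lk) with a zeroed diagonal is our reading of the product convention, chosen because it matches the closed form 1 − (1 − p²)^(n−2) quoted for the nondiagonal elements of M².

```python
from math import prod


def er_matrix(n, p):
    """Probabilistic connectivity matrix of an Erdos-Renyi class: p off-diagonal, 0 on it."""
    return [[0.0 if j == k else p for k in range(n)] for j in range(n)]


def prob_matmul(a, b):
    """Probabilistic product (assumed rule): c[j][k] = 1 - prod_l (1 - a[j][l] * b[l][k]).

    Diagonal entries are forced to 0, following the convention in the text.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j != k:
                c[j][k] = 1.0 - prod(1.0 - a[j][l] * b[l][k] for l in range(n))
    return c


def prob_matpow(m, z):
    """Multiply m with itself z times (z >= 1) using prob_matmul."""
    result = m
    for _ in range(z - 1):
        result = prob_matmul(result, m)
    return result


if __name__ == "__main__":
    for p in (0.65, 0.6):
        m = er_matrix(4, p)
        # Print one nondiagonal element of M, M^2, and M^3.
        print(p, [round(prob_matpow(m, z)[0][1], 3) for z in (1, 2, 3)])
```

Because every nondiagonal element of an Erdos–Renyi matrix is identical, inspecting a single entry such as (0, 1) is enough to compare against (12) and (13).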

B. Critical Values and Phase Changes in Ad Hoc Networks

For Erdos–Renyi [14] and range-limited [20] graphs, first-order monotone increasing graph properties follow 0–1 laws. These properties appear with probability asymptotically approaching either 0 or 1, as the parameters defining the random graph class decrease or increase. A plot of property probability versus parameter value forms an S-shaped curve with an abrupt phase transition between the 0 and 1 phases [2], [14]. The parameter value where the phase transition occurs is referred to as the critical point. The connectivity matrices defined in this paper can identify critical points and phase transitions in graph classes.


As an example, consider graph connectivity in Erdos–Renyi graphs. As mentioned in Section III, this property has three phases determined by the number of edges in the graph: subcritical (graph almost certainly not connected), critical, and supercritical (graph almost certainly connected). The number of edges E in an Erdos–Renyi graph follows a binomial distribution with n(n − 1)/2 trials, each succeeding with probability p. In the subcritical phase, the size of the largest graph component is O(log n), making the graph almost certainly disjoint. In the supercritical phase, the largest graph component size is O(n); a single giant component dominates the graph. In the supercritical phase, the probability that the graph is fully connected converges to $e^{-e^{-c}}$, where $p = \{\log n + c + o(1)\}/n$ [14].
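As a rough numerical illustration of this limit, the following Python sketch solves p = (log n + c)/n for c and evaluates the double exponential. Dropping the o(1) term is an assumption made here for simplicity, so the result is only an asymptotic estimate for large n; the function name is hypothetical.

```python
import math


def asymptotic_connectivity(n, p):
    """Estimate P(graph is connected) as exp(-exp(-c)) with c = n*p - log(n).

    Assumption: the o(1) term in p = (log n + c + o(1)) / n is ignored,
    so this is meaningful only as a large-n approximation.
    """
    c = n * p - math.log(n)
    return math.exp(-math.exp(-c))


if __name__ == "__main__":
    n = 1000
    # Sweep the edge probability around log(n)/n, where c changes sign.
    for scale in (0.5, 1.0, 1.5, 2.0):
        p = scale * math.log(n) / n
        print(f"p = {p:.5f}  estimate = {asymptotic_connectivity(n, p):.4f}")
```

At p = log(n)/n the constant c is 0 and the estimate is 1/e, which marks the middle of the S-shaped transition.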

Theorem 4: For Erdos–Renyi graphs of n nodes, where 0 < p < 1 is the probability of an edge existing between any two nodes and M is the probabilistic connectivity matrix, the critical point for the property of graph connectivity occurs when $M = M^2$ [i.e., $p = 1 - (1 - p^2)^{n-2}$]. When $M \gg M^2$ [i.e., $p \gg 1 - (1 - p^2)^{n-2}$], the graph is in its subcritical phase. When $M \ll M^2$ [i.e., $p \ll 1 - (1 - p^2)^{n-2}$], the graph is in its supercritical phase. (Note that probability p is real valued.)

Proof: By definition, all nondiagonal elements of the Erdos–Renyi graph matrix have the same value p and diagonal elements have value 0. By symmetry, all nondiagonal elements of $M^z$ for any power z will have the same value (diagonal elements are constrained to remain 0). We use the symbols $<$, $>$, $\ll$, and $\gg$ to compare these matrices by referring to the value of the nondiagonal elements.

Nondiagonal elements of $M^2$ have value $1 - (1 - p^2)^{n-2}$ from (8). From Theorem 3, this is the likelihood that a path of two hops exists between any two nodes in the graph. Let $p^{(2)}$ represent the nondiagonal elements of $M^2$ [i.e., $1 - (1 - p^2)^{n-2}$]. $p^{(2)}$ is monotone increasing with respect to p. Both p and $p^{(2)}$ are constrained to the range [0, 1]. The minimum (maximum) value 0 (1) of both occurs when p is 0 (1), at which point any value of n will satisfy the equation and no phase change occurs.

If $p^{(2)} > p$, the probability $p^{(3)}$ that a path of three hops exists between any two nodes is greater than the probability $p^{(2)}$ that a path of two hops exists. $p^{(3)}$ has value $1 - (1 - p^{(2)} p)^{n-2}$, which is monotone increasing with respect to both p and $p^{(2)}$. By recursion, the probability that any two nodes are connected increases with the path length as long as $p^{(2)} > p$. As $p^{(n-1)}$ approaches 1, a single giant component exists almost certainly, although there remains a finite, shrinking probability that isolated nodes exist. By definition, the graph is in its supercritical phase.

By symmetry, when $p^{(2)} < p$, the likelihood of a path of j hops connecting any two nodes decreases monotonically with j. As $p^{(n-1)}$ approaches 0, the giant component almost certainly does not exist, and the graph becomes increasingly disjoint. By definition [14], the graph is in its subcritical phase.
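The recursion used in this argument is easy to trace numerically. The sketch below iterates p^(j+1) = 1 − (1 − p^(j) p)^(n−2) starting from p^(1) = p and labels the phase by comparing p with p^(2). The helper names are hypothetical, and the exact-equality branch for the critical case is a simplification, since in floating point the critical phase is better read as a neighborhood.

```python
def hop_probabilities(n, p, max_hops):
    """Return [p^(1), ..., p^(max_hops)] for an Erdos-Renyi class of n nodes.

    Uses p^(j+1) = 1 - (1 - p^(j) * p)^(n - 2), with p^(1) = p.
    """
    probs = [p]
    for _ in range(max_hops - 1):
        probs.append(1.0 - (1.0 - probs[-1] * p) ** (n - 2))
    return probs


def phase(n, p):
    """Label the phase by comparing p with p^(2) (illustrative helper)."""
    p2 = 1.0 - (1.0 - p * p) ** (n - 2)
    if p2 > p:
        return "supercritical"
    if p2 < p:
        return "subcritical"
    return "critical"


if __name__ == "__main__":
    for p in (0.65, 0.6):
        seq = hop_probabilities(4, p, 3)
        print(p, phase(4, p), [round(x, 3) for x in seq])
```

For four nodes, the sequence grows with hop count at p = 0.65 and shrinks at p = 0.6, in line with the two cases of the proof.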

When $M \ll M^2$, the graph has been shown to be in its supercritical phase, and when $M \gg M^2$, the graph has been shown to be in its subcritical phase. This leaves $M \approx M^2$ as the critical phase. By definition, the critical phase is the neighborhood of the critical point; so the critical point can be calculated as the point where $M = M^2$. QED.

Fig. 4. Empirical verification of Theorem 4. Two thousand instances of Erdos–Renyi graphs of five nodes were generated using Mathematica, as the probability of an edge varies from 0.01 to 1.00 (x-axis times 0.01 is the edge probability). The y-axis is the percent of graphs where all nodes were in a single connected component. Theorem 4 predicts the critical value when p ≈ 0.4. When p = 0.35 (0.40), $1 - (1 - p^2)^{n-2}$ is 0.357 (0.407). The subcritical (p < 0.2) and supercritical (p > 0.6) phases of the system are clearly visible.
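An experiment of the kind summarized in Fig. 4 can be approximated with a short simulation. The sketch below uses Python rather than Mathematica, and its function names, random seed, and choice of sample probabilities are illustrative assumptions, so it will not reproduce the published curve point for point.

```python
import random


def is_connected(n, edges):
    """Depth-first search from node 0 over an undirected edge list."""
    adjacency = {v: [] for v in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == n


def fraction_connected(n, p, trials, rng):
    """Fraction of sampled Erdos-Renyi graphs forming a single component."""
    hits = 0
    for _ in range(trials):
        # Each of the n(n-1)/2 possible edges is an independent Bernoulli trial.
        edges = [(u, v) for u in range(n) for v in range(u + 1, n)
                 if rng.random() < p]
        if is_connected(n, edges):
            hits += 1
    return hits / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
curve = {p: fraction_connected(5, p, 2000, rng) for p in (0.1, 0.4, 0.9)}

if __name__ == "__main__":
    for p, frac in curve.items():
        print(f"p = {p:.1f}  connected fraction = {frac:.3f}")
```

Sweeping p over a finer grid and plotting the fractions gives an S-shaped curve of the type described in Section IV-B.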

Fig. 4 empirically verifies Theorem 4. This result differs from results in the mathematics literature, primarily due to our interest in studying finite systems. The authoritative sources [16] predict the emergence of the giant component in the neighborhood of p = 1/n, with the caveat that sharp thresholds like this are not uniquely determined. They are asymptotically equivalent within a constant factor. Although there is not an algebraic equivalence between the criterion in Theorem 4 and the value 1/n, numerical solution of the criterion $p = 1 - (1 - p^2)^{n-2}$ for p using Mathematica shows the single real-valued solution between 0 and 1 to be within 0.006 for values of n greater than 17. As n grows the solutions converge, making our criterion and the one from [16] numerically equivalent and definitely within a constant factor.
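The criterion of Theorem 4 can also be solved numerically without Mathematica. The following sketch uses plain bisection in Python; the function name, tolerance, and bracketing endpoints are our own choices, and it assumes n ≥ 4, since for smaller n the equation has no root strictly between 0 and 1.

```python
def critical_p(n, tol=1e-12):
    """Interior root of p = 1 - (1 - p^2)^(n - 2) on (0, 1), found by bisection.

    The trivial roots p = 0 and p = 1 are excluded by the bracket.
    Assumes n >= 4 and relies on the single interior root reported in the text.
    """
    if n < 4:
        raise ValueError("no root strictly between 0 and 1 for n < 4")

    def g(p):
        return 1.0 - (1.0 - p * p) ** (n - 2) - p

    lo, hi = 1e-9, 1.0 - 1e-9  # g(lo) < 0 and g(hi) > 0 when n >= 4
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (4, 5, 20, 50, 100):
        print(f"n = {n:3d}  critical p = {critical_p(n):.4f}  1/n = {1.0 / n:.4f}")
```

For n = 4 the equation factors and the interior root is (√5 − 1)/2 ≈ 0.618, which lies between the two edge probabilities used in Example 1. Printing the root next to 1/n for increasing n gives a direct view of how the two criteria compare.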

Without rigor, we apply Theorem 4 to non-Erdos–Renyi graph classes to find estimates where phase changes may occur. The application presented in Section V shows this to be a pragmatic solution.

V. APPLICATION

We now present an example application. Consider a surveillance network charged with reporting when a member of a class of objects (targets) traverses a given surveillance domain (terrain). Reports are sent to a user community that we assume, for the sake of discussion, is external to the terrain. The network will be viable as long as it assures that: 1) an object traversing the terrain is detected (with acceptable error rates) and 2) the user community is alerted. This criterion is a tautology: the network is viable as long as it performs its mission. One of these large surveillance sensor networks of n nodes will be deployed and some percentage (k/n) of nodes needs to be designated as cluster heads.

The cluster head could coordinate tracking activities [10] or could maintain system security by periodically refreshing cryptographic keys [11]. For robustness and ease of deployment, k nodes will be chosen at random as cluster heads to coordinate the activities of all nodes within h hops. Nodes that do not belong to a cluster will not participate in the network, so cluster heads will only be able to communicate directly if they are within 2h hops of each other. This section shows how our results can be applied to find values of k and h where the final system is a viable network that satisfies our two criteria.

Fig. 5. Application of Theorem 4 to ad hoc graphs. Results include 95% confidence interval from 1000 trials using Matlab for a 100-m square region. Curves show the percentage of the 1000 trials that produced networks with a single connected component. Communications range (number of nodes) is varied in the graph on the left (right). The large square points on the graphs indicate the predicted critical point.

The physical network is deployed as an ad hoc network where n nodes are deployed in a terrain. In this example, the sensor node’s wireless communications range is normalized so that the terrain size is a unit square. Sensor nodes are vertices in a random graph structure. Edges between vertices represent either an active communications link, or detection of a target passing between nodes. In practice, the edge probability distribution is the minimum of the two likelihoods.

Section IV-B explains the three phases of a random graph class: subcritical, critical, and supercritical. For random graphs, the curve of the maximum component size versus edge probability takes the form $e^{-e^{-c}}$ [14], [16]. In percolation theory, the inflection point of this curve is referred to as the percolation threshold. Above the phase change (percolation threshold), a single giant component connects a plurality (possibly majority) of the sensor nodes. The giant component is the only component of the network with O(n) members [14]. Percolation theory has established these properties for systems with a giant component [23].

1) For systems above the percolation threshold, a path exists that connects the terrain’s external boundaries.

2) At the percolation threshold, property 1 is self-similar over scales.

The giant component has at least one path connecting all the external boundaries of the terrain (property 1). This property is true for subsets of the system across scales (property 2). Thus, for a sensor network with a giant component, targets traversing the network will, with a high probability, be detected by at least one node that can report the detection to the user community. Therefore, the network fulfils our viability criterion. A fuller treatment of these issues is in [6].

As with any ad hoc network, isolated small holes can exist in a network with a giant component, but the giant component forms a unique quorum for the sensor network. Fig. 5 (right) shows the predicted critical point in communications range for network phase changes as the number of nodes n varies, together with the simulation results. Note that networks with r to the right of the critical value are virtually assured to contain a viable giant component. Our predicted inflection point in Fig. 5 is computed using Theorem 4 for nodes $\lfloor n/2 \rfloor$ and $\lfloor n/2 \rfloor + 1$.
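The kind of simulation behind Fig. 5 can be sketched in a few lines. The version below is in Python rather than Matlab and makes simplifying assumptions that are ours, not the authors': nodes are placed uniformly at random in a unit square, and two nodes are linked exactly when their Euclidean distance is at most r. The function names and parameter values are illustrative.

```python
import math
import random


def geometric_graph_connected(n, r, rng):
    """Place n nodes uniformly in the unit square and link pairs within range r.

    Returns True when every node is reachable from node 0.
    """
    points = [(rng.random(), rng.random()) for _ in range(n)]
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j not in seen and math.dist(points[i], points[j]) <= r:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


def fraction_single_component(n, r, trials, rng):
    """Fraction of random deployments that form one connected component."""
    connected = sum(geometric_graph_connected(n, r, rng) for _ in range(trials))
    return connected / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for r in (0.1, 0.2, 0.3, 0.4, 0.6):
        print(f"r = {r:.1f}  fraction = {fraction_single_component(30, r, 200, rng):.3f}")
```

Sweeping r at fixed n, or n at fixed r, traces the two kinds of curves that the figure reports.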

For an ad hoc network with an adequate number of nodes n and range r, we determine suitable and consistent values for h and k by mapping an Erdos–Renyi graph onto the ad hoc substrate. Each cluster head is chosen at random by using a Bernoulli trial with p = k/n. The likelihoods

$$p^{(2)}_{im} = 1 - \prod_{j \neq i,m} \left(1 - p_{ij}\, p_{jm}\right) \tag{14}$$

$$p^{(2h-1)}_{im} = 1 - \prod_{j \neq i,m} \left(1 - p^{(2h-2)}_{ij}\, p_{jm}\right) \tag{15}$$

for nodes $i = \lfloor n/2 \rfloor$ and $m = \lfloor n/2 \rfloor + 1$ are computed from the ad hoc probabilistic connectivity matrix to determine the likelihood that a path of 2h − 1 hops or less exists between nodes on the network. We use neighboring nodes in the calculation, since we have found the ad hoc network results to be conservative.

Since nodes not connected to a cluster head do not participate in the network, direct network connections only form between cluster heads whose regions overlap. Each cluster head now has the same probability (15) of connecting to every other cluster head, which means that from Theorem 4, the inflection point for the giant component will occur at

$$k = 2 + \frac{\log\left(1 - p^{(2h-1)}_{im}\right)}{\log\left(1 - \left(p^{(2h-1)}_{im}\right)^{2}\right)}. \tag{16}$$

Fig. 6 shows the results from a sample problem.
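Equation (16) follows from the Theorem 4 condition with k cluster heads in place of n nodes: writing q for the connection probability from (15), q = 1 − (1 − q²)^(k−2) gives (k − 2) log(1 − q²) = log(1 − q). A minimal Python sketch of this calculation is shown below; the function name and the sample values of q are hypothetical and are not taken from the Fig. 6 experiment.

```python
import math


def critical_cluster_heads(q):
    """Evaluate (16): k = 2 + log(1 - q) / log(1 - q^2), for 0 < q < 1.

    q stands for the cluster-head connection probability from (15).
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    return 2.0 + math.log(1.0 - q) / math.log(1.0 - q * q)


if __name__ == "__main__":
    for q in (0.05, 0.1, 0.3, 0.6):
        k = critical_cluster_heads(q)
        # Substituting k back into the Theorem 4 condition should recover q.
        check = 1.0 - (1.0 - q * q) ** (k - 2.0)
        print(f"q = {q:.2f}  k = {k:.2f}  round trip = {check:.4f}")
```

The result is generally not an integer; rounding it up to obtain a cluster-head count is one reasonable reading, though that step is our assumption.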


Fig. 6. Matlab simulations showing 95% confidence interval after 35 repetitions for sample networks of 1000 nodes. For an ad hoc network of 1000 nodes, we use Theorem 4 to predict the number of cluster heads needed to coordinate the activities of nodes within two hops. Cluster heads can communicate with each other when they are within 2h hops of each other. This is an Erdos–Renyi graph being embedded on an ad hoc substrate. On the left, we predict that at least 42 cluster heads are needed to coordinate activities among nodes when communications range is 0.06. At the predicted inflection point, 80% of cluster heads are in the same component. On the right, we predict that 151 cluster heads are needed when communications range is 0.05.

VI. CONCLUSION

This paper has presented the concept of using probabilistic connectivity matrices for analyzing classes of random and pseudorandom graphs. Algorithms were presented for creating these graphs for Erdos–Renyi and ad hoc networks. We discussed the insights gained by using these matrices. This includes calculation of the expected degree of each node. They can be used to determine critical points for random graphs. We have shown how we use these insights to determine appropriate parameters for designing self-organizing sensor networks. We apply this clustering approach to both tracking [10] in the sensor network and maintaining sensor network security [11]. The list of applications given is by no means exhaustive, and we feel that many other uses exist for this approach.

Other researchers have used similar concepts. Our ad hoc model is built on the phase change analysis in [2]. They used simulations to illustrate the ubiquity and predictability of the phase-change phenomenon for ad hoc wireless networks. We extended that work by providing an analytical mechanism for predicting network performance. The random key predistribution security mechanisms [22], [23] use random sampling to distribute cryptographic keys in a network, and then apply results from Erdos and Renyi to predict the ability of nodes to connect with their neighbors. These mechanisms, however, ignore the spatial distribution of nodes. Integration of this analysis technique into their methods could be used to predict the emergence of giant components to support specific applications [21].

Our interest in this topic has come from our research in distributed systems. Most specifically, we are working in the areas of network survivability and surveillance systems. The idea of distributed P2P control and coordination is very important for making these systems robust, and we have found the insights provided by modeling random graphs invaluable. In our opinion, the techniques given here can be readily adapted to modeling gossip protocols, percolation systems, and P2P networks.

ACKNOWLEDGMENT

The authors thank the anonymous reviewers and editors for their suggestions, which have greatly improved the readability of the paper.

REFERENCES

[1] A.-L. Barabasi, Linked. Cambridge, MA: Perseus, 2002.

[2] B. Krishnamachari, S. B. Wicker, and R. Bejar, “Phase transition phenomena in wireless ad-hoc networks,” presented at the Symp. Ad-Hoc Wireless Netw., GlobeCom 2001. San Antonio, TX, Nov. [Online]. Available: http://www.krishnamachari.net/papers/phaseTransitionWirelessNetworks.pdf

[3] D. J. Watts, Small Worlds. Princeton, NJ: Princeton Univ. Press, 1999.

[4] R. Albert, H. Jeong, and A.-L. Barabasi, “Error and attack tolerance of complex networks,” Nature, vol. 406, pp. 378–382, Jul. 27, 2000.

[5] D. Stauffer and A. Aharony, Introduction to Percolation Theory. London, U.K.: Taylor & Francis, 1992.

[6] R. Pastor-Satorras and A. Vespignani, “Epidemic spreading in scale-free networks,” Phys. Rev. Lett., vol. 86, no. 14, pp. 3200–3203, Apr. 2, 2001.

[7] S. S. Iyengar and R. R. Brooks, Eds., Distributed Sensor Networks. Boca Raton, FL: Chapman & Hall, 2005.

[8] A. Kapur, N. Gautam, R. R. Brooks, and S. Rai, “Design, performance and dependability of a peer-to-peer network supporting QoS for mobile code applications,” in Proc. 10th Int. Conf. Telecommun. Syst., Sep. 2002, pp. 395–419.

[9] R. R. Brooks and N. Orr, “A model for mobile code using interacting automata,” IEEE Trans. Mobile Comput., vol. 1, no. 4, pp. 313–326, Oct./Dec. 2002.

[10] R. R. Brooks, P. Ramanathan, and A. Sayeed, “Distributed target tracking and classification in sensor networks,” Proc. IEEE, to be published.

[11] R. R. Brooks, Disruptive Security Technologies with Mobile Code and Peer-to-Peer Networks. Boca Raton, FL: CRC Press, 2005.

[12] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The Design and Analysis of Computer Algorithms. Reading, MA: Addison-Wesley, 1974.

[13] D. M. Cvetkovic, M. Doob, and H. Sachs, Spectra of Graphs. New York: Academic, 1979.

[14] B. Bollobas, Random Graphs. Cambridge, U.K.: Cambridge Univ. Press, 2001.

[15] R. Albert and A.-L. Barabasi, “Statistical mechanics of complex networks,” Rev. Modern Phys., vol. 74, pp. 47–97, 2002. [Online]. Available: http://arxiv.org/PS_cache/cond-mat/pdf/0106/0106096v1.pdf

[16] S. Janson, T. Luczak, and A. Rucinski, Random Graphs. New York: Wiley, 2000.

[17] G. Caldarelli, P. De Los Rios, and L. Pietronero, “Generalized network growth: From microscopic strategies to the real internet properties,” Jul. 2003. [Online]. Available: http//arxiv.org/PS_cache/cond-mat/pdf/0307610v1.pdf

[18] A.-L. Barabasi and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, pp. 509–512, Oct. 15, 1999.

[19] D. Volchenkov and P. Blanchard, “An algorithm generating scale free graphs,” Apr. 2002. [Online]. Available: http://arxiv.org/PS_cache/cond-mat/pdf/0204/0204126v1.pdf. Updated May 30, 2006.

[20] A. Goel, S. Rai, and B. Krishnamachari, “Sharp thresholds for monotone properties in random geometric graphs,” in Proc. ACM Symp. Theory Comput., Jun. 2004, pp. 580–586.

[21] R. R. Brooks, S. Armanath, and H. Siddul, “On adaptation to extend the lifetime of surveillance sensor networks,” presented at the Innovations Commercial Appl. Distrib. Sensor Netw. Symp. Bethesda, MD, Oct. 2005.

[22] L. Eschenauer and V. D. Gligor, “A key-management scheme for distributed sensor networks,” in Proc. 9th ACM Conf. Comput. Commun. Security, Nov. 2002, pp. 41–47.

[23] A. Perrig, J. Stankovic, and D. Wagner, “Security in wireless sensor networks,” Commun. ACM, vol. 47, no. 6, pp. 53–57, Jun. 2004.

Richard R. Brooks (M’97–SM’04) pursued graduate study in computer science and operations research at the Conservatoire National des Arts et Metiers, Paris, France, and received the B.A. degree in mathematical sciences from The Johns Hopkins University, Baltimore, MD, in 1979, and the Ph.D. degree in computer science from Louisiana State University, Baton Rouge, in 1996.

He is currently an Associate Professor in the Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, SC. He was the Head of the Distributed Systems Department, Applied Research Laboratory, Pennsylvania State University, State College, for over eight years. His current research interests include scaling problems in distributed systems and adversarial situations, including applications in network security, command and control, and sensor networks.

Brijesh Pillai received the B.S. degree in computer engineering from the University of Mumbai, Mumbai, India, in 2004, and the M.S. degree in computer engineering from Clemson University, Clemson, SC, in 2006.

He is currently a Software Developer with Professional Engineering Corporation. His research interests include sensor networks, computer vision, and security in grid networks.

Stephen Racunas received the B.S. degrees in mathematics, physics, and electrical engineering from Carnegie Mellon University, Pittsburgh, PA, in 1992, the M.S. degree in electrical engineering from Princeton University, Princeton, NJ, in 1994, and the Ph.D. degree from Pennsylvania State University, State College, in 2004.

He is currently a Research Scientist with the Computational Learning Laboratory, Stanford University, Stanford, CA. His current research interests include biomedical signal processing, artificial intelligence, and developing computer-aided design tools for biological hypothesis design and evaluation.

Suresh Rai (SM’86) received the Ph.D. degree in electronics and communication engineering from Kurukshetra University, Kurukshetra, India, in 1980.

He is currently a Professor in the Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge. His research interests include network traffic, wavelet-based compression, and security.