User Behavior Analysis in Wi-Fi Network
Anna Rosenberg. Supervisor: Orly Avner.
![Page 1: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/1.jpg)
USER BEHAVIOR ANALYSIS IN WI-FI NETWORK
Anna Rosenberg
Supervisor: Orly Avner
![Page 2: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/2.jpg)
Overview
The goal of this project: to analyze a Wi-Fi network’s APs in order to model the wireless clients using the network.
The contributions of this project:
- analysis of Access Points
- the use of k-means and g-means algorithms for clustering the network’s users
![Page 3: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/3.jpg)
Previous Work
"Modeling client arrivals at access points in wireless campus-wide networks" (Maria Papadopouli, Haipeng Shen, Manolis Spanakis):
- models the arrival processes of clients at APs as time-varying Poisson processes with different arrival-rate functions
- analyzes the traffic load characteristics (e.g., bytes, number of packets, associations, distinct clients, type of clients)
- clusters the APs based on their visit arrivals and on the building type
![Page 4: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/4.jpg)
Previous Work
"Characterizing user behavior and network performance in a public wireless LAN" (Anand Balachandran, Geoffrey Voelker, Paramvir Bahl, and Venkat Rangan). In Proceedings of the ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, 2002.
Their overall analysis of user behavior shows that:
- Users are evenly distributed across all APs, and user arrivals are correlated in time and space.
- User arrivals into the network can be modeled by a two-state Markov-Modulated Poisson Process (MMPP).
- There is an implicit correlation between session duration and average data rate: longer sessions typically have very low data requirements, and most of the sessions with a high average data rate are very short.
![Page 5: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/5.jpg)
Previous Work
"Modeling users’ mobility among Wi-Fi access points" (Minkyong Kim, David Kotz):
- Network messages were collected on the Dartmouth campus
- Modeling user movements between APs
- Clustering the APs based on their peak hour
![Page 6: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/6.jpg)
Data
Router (Sniffer) Packets:
- MAC address of the access point
- MAC address of the user
- Source/Destination IP addresses
- Size of the packet
- The time it was received
![Page 7: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/7.jpg)
IEEE 802.11 Architecture
- Cells (called Basic Service Set, or BSS)
- Base Station (called Access Point, or AP for short)
- Access Points are connected through a backbone (called Distribution System, or DS)
- The examined network: 16 APs
![Page 8: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/8.jpg)
Arrival Rate at APs
AP1, AP8:
[Figure: rate [B/min] vs. arrival time (0:00 to 24:00) for AP 1, averaging window = 0.5 hour]
[Figure: rate [B/min] vs. arrival time (0:00 to 24:00) for AP 8, averaging window = 0.5 hour]
Observations (in slide order): active from midday till the evening; active only in the evening.
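The rate curves above can be approximated by summing packet sizes inside fixed time windows and dividing by the window length in minutes. The sketch below is a minimal illustration under assumptions: the `(hour, size)` packet layout and the function name `arrival_rate` are hypothetical, and the slides do not state exactly how the averaging was done.

```python
from collections import defaultdict

def arrival_rate(packets, window_hours=0.5):
    """Return {window_start_hour: rate in bytes/minute} for one AP.

    packets: iterable of (time_of_day_in_hours, size_in_bytes) pairs
    (a hypothetical layout, not the format used in the project).
    """
    bytes_per_window = defaultdict(int)
    for hour, size in packets:
        index = int(hour // window_hours)   # which averaging window the packet falls in
        bytes_per_window[index] += size
    window_minutes = window_hours * 60
    return {index * window_hours: total / window_minutes
            for index, total in sorted(bytes_per_window.items())}

# Toy packets: (arrival time in hours, size in bytes)
packets = [(12.1, 1500), (12.4, 1500), (12.6, 600), (18.2, 900)]
rates = arrival_rate(packets, window_hours=0.5)
print(rates)
```

Changing `window_hours` to another value gives a coarser or finer curve for the same packets.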
![Page 9: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/9.jpg)
Analyzing the arrival rate with different averaging windows
![Page 10: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/10.jpg)
Analyzing the arrival rate with different averaging windows
[Figure: rate [B/min] vs. arrival time [hour] for AP 1, with averaging windows w = 0.1, 0.2, 0.25, 0.3, 0.35 and 0.4]
![Page 11: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/11.jpg)
Users
3273 users
The transmission rate:
![Page 12: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/12.jpg)
Coherence with the time of lectures and breaks
Users are active during the breaks and not active during the lectures that last 50-55 minutes.
![Page 13: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/13.jpg)
Visit duration
How to define a visit?
We chose 30 minutes as the maximal inter-arrival time between two packets that can still be considered packets of one visit.
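The 30-minute rule can be sketched as a single pass over one user's sorted packet timestamps. This is an illustration only: timestamps are assumed to be in minutes, and the function name `split_into_visits` is hypothetical.

```python
MAX_GAP_MINUTES = 30  # threshold chosen in the slides

def split_into_visits(timestamps, max_gap=MAX_GAP_MINUTES):
    """Group one user's packet timestamps (in minutes) into visits."""
    visits = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] <= max_gap:
            visits[-1].append(t)      # gap within the threshold: same visit
        else:
            visits.append([t])        # first packet, or gap too long: new visit
    return visits

times = [0, 5, 20, 70, 75, 200]
visits = split_into_visits(times)
durations = [v[-1] - v[0] for v in visits]
print(visits, durations)
```

A visit that consists of a single packet gets a duration of 0 under this definition.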
![Page 14: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/14.jpg)
Features
The average characteristics:
- Average visit duration
- Average inter-arrival time between visits
- Average traffic
- Number of visits
- Total number of days in the system
- The std of inter-arrival times
- The std of traffic
- The std of visit duration
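A per-user feature vector of this kind could be computed as follows. The sketch makes several assumptions that the slides do not confirm: a visit is a `(start, end, bytes)` tuple with times in minutes, the inter-arrival time is measured from the end of one visit to the start of the next, and "days in the system" counts distinct days with at least one visit.

```python
from statistics import mean, pstdev

MINUTES_PER_DAY = 1440

def user_features(visits):
    """visits: list of (start_min, end_min, bytes) tuples for one user, sorted by start."""
    durations = [end - start for start, end, _ in visits]
    traffic = [size for _, _, size in visits]
    # Assumed definition: gap between the end of a visit and the start of the next one.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(visits, visits[1:])]
    # Assumed definition: number of distinct days on which a visit started.
    days = {int(start // MINUTES_PER_DAY) for start, _, _ in visits}
    return {
        "avg_visit_duration": mean(durations),
        "std_visit_duration": pstdev(durations),
        "avg_inter_arrival": mean(gaps) if gaps else 0.0,
        "std_inter_arrival": pstdev(gaps) if gaps else 0.0,
        "avg_traffic": mean(traffic),
        "std_traffic": pstdev(traffic),
        "num_visits": len(visits),
        "days_in_system": len(days),
    }

visits = [(0, 20, 4000), (100, 110, 1000), (1500, 1530, 7000)]
features = user_features(visits)
print(features)
```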
![Page 15: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/15.jpg)
Features
No typical clusters can be found among the network’s users:
Av. inter-visit times vs. av. visit duration
Av. inter-visit times vs. number of visits
![Page 16: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/16.jpg)
Features
Av. traffic vs. av. visit duration
Av. traffic vs. number of visits
![Page 17: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/17.jpg)
Clustering
- Unsupervised learning problem
- Finding a structure in a collection of unlabeled data
- Cluster: a collection of objects which are “similar”
- Distance measure
![Page 18: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/18.jpg)
K-Means Clustering
Features:
- Average visit duration
- Average inter-visit times
- Average traffic per packet
- Maximal distance between visits
- Minimal distance between visits
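For reference, a plain version of k-means (Lloyd's algorithm with Euclidean distance) on feature tuples like these might look as follows. It is a minimal sketch, not the implementation used in the project; the two-dimensional toy points and the random initialization from data points are assumptions.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means with Euclidean distance; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize from k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:             # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:                      # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels

# Toy users: (average visit duration, average inter-visit time), arbitrary units
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 9.0), (8.2, 9.1), (7.9, 8.8)]
centers, labels = kmeans(points, k=2)
print(labels)
```

On well-separated toy data the two groups are recovered. On the real user features the slides report no such separation.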
![Page 19: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/19.jpg)
Results of K-Means Clustering K=2
Av. visit duration vs. Av. inter visit times
Av. inter-visit times vs. av. traffic per packet
Max. distance between visits vs. min. distance between visits
![Page 20: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/20.jpg)
Results of K-Means Clustering K=3
Av. visit duration vs. Av. inter visit times
Av. inter-visit times vs. av. traffic per packet
Max. distance between visits vs. min. distance between visits
![Page 21: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/21.jpg)
Results of K-Means Clustering K=4
Max. distance between visits vs. min. distance between visits
Av. visit duration vs. av. inter-visit times
Av. inter-visit times vs. av. traffic per packet
![Page 22: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/22.jpg)
K-Means Clustering: conclusion
The k-means clustering algorithm based on average characteristics of the network’s users can’t produce any isolated clusters. We therefore conclude that an algorithm based on average characteristics can’t cluster the network’s users well.
Possible reasons for unsuccessful clustering:
- Using a feature set that doesn’t provide enough information about the system
- Not enough samples
- Using Euclidean distance
![Page 23: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/23.jpg)
G-Means Clustering Algorithm
- The right number k of clusters to use is often not obvious
- Based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution
- The standard statistical significance level α: the desired probability of incorrectly splitting a cluster
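The slides do not say which normality test was used. G-means as published by Hamerly and Elkan applies the Anderson-Darling test to a cluster's points projected onto the line joining its two candidate child centers, and splits the cluster when the Gaussian hypothesis is rejected. The sketch below shows only that test step under those assumptions; the critical value 1.8692 for α = 0.0001 is quoted from memory of that paper and should be treated as an assumption.

```python
import math
from statistics import NormalDist, mean, pstdev

def anderson_darling(values):
    """Corrected Anderson-Darling statistic for the hypothesis 'values are Gaussian'."""
    n = len(values)
    mu, sigma = mean(values), pstdev(values)
    z = sorted((v - mu) / sigma for v in values)     # standardize, then sort
    cdf = [NormalDist().cdf(x) for x in z]
    s = sum((2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
            for i in range(1, n + 1))
    a2 = -n - s / n
    return a2 * (1 + 4 / n - 25 / n ** 2)            # correction for estimated mean/variance

def should_split(projected, critical_value):
    """Split the cluster when the Gaussian hypothesis is rejected."""
    return anderson_darling(projected) > critical_value

CRITICAL_VALUE = 1.8692  # assumed critical value for alpha = 0.0001

# Deterministic toy data: exact normal quantiles vs. two well-separated groups.
gaussian_like = [NormalDist().inv_cdf((i - 0.5) / 200) for i in range(1, 201)]
bimodal = [-5 + 0.01 * i for i in range(100)] + [5 + 0.01 * i for i in range(100)]
print(should_split(gaussian_like, CRITICAL_VALUE), should_split(bimodal, CRITICAL_VALUE))
```

A larger α corresponds to a lower critical value, so more clusters are split.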
![Page 24: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/24.jpg)
G-means
- A different feature set provides more data points
- Each point consists of the following components:
  - The visit duration
  - The time between this visit and the previous visit
  - The number of packets that were sent during the visit
  - The average amount of data that was accessed during the visit
- Normalize the data components to get proper results even with a simple Euclidean distance metric
- 50 users with the maximal number of visits: 3457 points
- Users with more than 10 visits: 572 users, 15105 points
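The slides do not specify the normalization. One common choice, assumed here, is to scale each component to zero mean and unit variance, so that no single component (for example the packet count) dominates the Euclidean distance. The point layout and values below are made up for illustration.

```python
from statistics import mean, pstdev

def zscore_columns(points):
    """Scale every component of the points to zero mean and unit variance."""
    columns = list(zip(*points))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]   # constant column: avoid division by zero
    return [tuple((x - m) / s for x, m, s in zip(p, mus, sigmas)) for p in points]

# One point per visit (hypothetical values):
# (visit duration [min], time since previous visit [min], packets, average bytes)
points = [(10, 60, 100, 500.0), (30, 120, 300, 1500.0), (20, 90, 200, 1000.0)]
normalized = zscore_columns(points)
print(normalized)
```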
![Page 25: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/25.jpg)
G-means
The dependence of the number of clusters on α:
[Figure: average number of clusters vs. alpha, for alpha from 0 to 0.1]
![Page 26: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/26.jpg)
G-means results
70 clusters, α = 0.0001
[Figure: two panels, "user 136 labels" and "user 202 labels", over cluster indices 0 to 70]
58 visits, 8788 packets, 30 clusters; the most common clusters: 11, 20, 29 and 35.
59 visits, 28777 packets, 31 clusters; the most common cluster: 30.
![Page 27: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/27.jpg)
Evaluation
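Purity:
$$\mathrm{purity}(\Omega, C) = \frac{1}{N} \sum_{k} \max_{j} \lvert \omega_k \cap c_j \rvert$$
$\Omega = \{\omega_1, \omega_2, \ldots, \omega_K\}$ - the set of clusters
$C = \{c_1, c_2, \ldots, c_J\}$ - the set of classes
$N$ - the total number of samples
Example: the majority class and the number of members of the majority class for the three clusters are: x, 5 (cluster 1); o, 4 (cluster 2); and ◊, 3 (cluster 3). Purity is (1/17)×(5+4+3) ≈ 0.71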
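The purity example can be reproduced in a few lines. The slide only gives the majority counts and the total of 17 samples, so the minority members of each cluster below are assumed values chosen to add up to 17, and "d" stands in for the ◊ class.

```python
from collections import Counter

def purity(clusters):
    """clusters: list of lists of true class labels, one inner list per cluster."""
    total = sum(len(members) for members in clusters)
    # For each cluster, count the members of its majority class.
    majority = sum(Counter(members).most_common(1)[0][1] for members in clusters)
    return majority / total

clusters = [
    ["x"] * 5 + ["o"],         # cluster 1: majority x, 5 members (minority assumed)
    ["o"] * 4 + ["x", "d"],    # cluster 2: majority o, 4 members (minority assumed)
    ["d"] * 3 + ["x"] * 2,     # cluster 3: majority d, 3 members (minority assumed)
]
print(round(purity(clusters), 2))
```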
![Page 28: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/28.jpg)
Evaluation
The dependence of the purity on α
[Figure: purity vs. alpha, for alpha from 0 to 0.1]
![Page 29: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/29.jpg)
Evaluation
New Evaluation Measure: the level of possibility of representing each user by one typical cluster
$$E = \frac{1}{N} \sum_{i=1}^{N} \frac{x_i}{M_i}$$
$N$ - total number of users
$x_i$ - number of samples contained in the most common cluster of user $i$
$M_i$ - total number of samples of user $i$
Example: There are 3 users: x, o and ◊. The number of samples contained in the most common cluster and the total number of samples for the three users are: 5, 8 (user x); 4, 5 (user o); and 3, 4 (user ◊). E = (1/3)×(5/8 + 4/5 + 3/4) ≈ 0.725
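The measure E can be computed directly from the cluster label assigned to each sample of each user. In this sketch the dictionary layout and the function name `single_cluster_score` are hypothetical, and the concrete cluster labels are made up so that the counts match the example (5 of 8, 4 of 5, 3 of 4).

```python
from collections import Counter

def single_cluster_score(labels_by_user):
    """E = (1/N) * sum over users i of x_i / M_i."""
    ratios = []
    for labels in labels_by_user.values():
        most_common_count = Counter(labels).most_common(1)[0][1]   # x_i
        ratios.append(most_common_count / len(labels))             # x_i / M_i
    return sum(ratios) / len(ratios)

labels_by_user = {
    "x": [1] * 5 + [2] * 3,   # 5 of 8 samples in the most common cluster
    "o": [2] * 4 + [3],       # 4 of 5
    "d": [3] * 3 + [1],       # 3 of 4 ("d" stands in for the diamond user)
}
E = single_cluster_score(labels_by_user)
print(round(E, 3))
```

E equals 1 only when every user's samples all fall into a single cluster.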
![Page 30: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/30.jpg)
Evaluation
The dependence of the evaluation measure E on α
[Figure: evaluation measure E vs. alpha, for alpha from 0 to 0.1]
![Page 31: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/31.jpg)
G-Means Clustering: conclusion
The g-means clustering algorithm based on points that consist of the 4 characteristics described earlier can’t represent each user by one typical cluster. We therefore conclude that this algorithm can’t cluster the network’s users well.
Possible reasons for unsuccessful clustering:
- Using a feature set that doesn’t provide enough information about the system
- Not enough samples
- Using Euclidean distance
![Page 32: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/32.jpg)
Conclusions
- The Access Points’ arrival rate is coherent with the time of lectures and breaks. The APs show low activity during the lectures and high activity during the breaks.
- The k-means clustering algorithm based on average characteristics of the network’s users can’t produce any isolated clusters. We therefore conclude that an algorithm based on average characteristics can’t cluster the network’s users well.
- The g-means clustering algorithm based on points that consist of the 4 characteristics described earlier can’t represent each user by one typical cluster. We therefore conclude that this algorithm can’t cluster the network’s users well.
![Page 33: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/33.jpg)
Future work
- Select another subset of features
- Use another clustering algorithm
- Try to collect more data samples
![Page 34: User Behavior Analysis in Wi-Fi network](https://reader036.vdocuments.net/reader036/viewer/2022081517/56815a52550346895dc787d8/html5/thumbnails/34.jpg)
Questions