
On Access Network Identification

ADSL & Cable

Bachelor Assignment

Martijn Bolhuis

Supervisors:

Rafael Barbosa (UT/DACS)

Dr. ir. Pieter-Tjerk de Boer (UT/DACS)

University of Twente

Enschede, The Netherlands

February 2012

Table of contents

1 Introduction
  1.1 Context
  1.2 Problem statement
  1.3 Research Question
  1.4 Motivation
  1.5 Approach
2 Background information
  2.1 TCP
  2.2 DOCSIS: Data Over Cable Service Interface Specification
  2.3 ADSL
3 Method
  3.1 Inter-ACK Time Distribution
    3.1.1 Link Transmission Capabilities
    3.1.2 Duplex Capabilities
  3.2 Access Network Technology
    3.2.1 Cable
    3.2.2 ADSL
  3.3 Practical Issues
    3.3.1 ACK in different Operating Systems
    3.3.2 ACK-pair detection
4 Validation
  4.1 Cable
    4.1.1 Inter-ACK time
    4.1.2 Upstream-only
  4.2 ADSL
    4.2.1 ADSL tests
    4.2.2 Effect of DSLAM settings
    4.2.3 Summary ADSL
5 Conclusions
6 Future work
7 References


1 Introduction

1.1 Context

In a previous master thesis [1] an approach was proposed to identify access network type by analysing the timing of TCP flows. This approach uses the difference in duplex capabilities to determine whether a connection runs over a wireless link (802.11x) or an Ethernet link. The results suggested that the method might also be applicable to distinguish other types of connection. Early tests over ADSL connections show distinctive patterns in timing of TCP flows which suggests that it is possible to identify such a connection type.

1.2 Problem statement

Figure 1.2.1 illustrates the problem scenario. There are two clients, one using ADSL and the other using cable, that establish a TCP connection to the server. To determine which connection type a client is using, we perform the following test. The server sends a constant flow of back-to-back data packets to a connected client. Upon reception, the client sends ACK packets back according to the TCP protocol. This process is recorded by the server and subsequently analysed by extracting a property called the inter-acknowledgement time: the time between two consecutive ACK packets. This is done for multiple ACK pairs. The access network, i.e. the ADSL, cable, Ethernet or WLAN link, will affect the timing of the data and ACK packets. We present the result of this test as a graph showing the distribution of the inter-ACK time. Ultimately, the inter-acknowledgement times should allow us to determine whether the client was using a cable, ADSL, Ethernet or WLAN connection.

Figure 1.2.1: Problem statement

Two important assumptions are made in [1] with regard to the network setup. The first is that the access network is the link with the smallest data rate. A link in the path between the client and the server with a smaller capacity might conceal the timing signature caused by the access link. This assumption is reasonable since links in the “core network” of the Internet tend to have higher bandwidth than access networks.

Secondly, it is assumed that the monitoring point is close to the clients. This is to minimize the effect of cross traffic, i.e. traffic generated by other users. If the monitoring point were far from the clients, cross traffic in the network might blur or hide the timing characteristics generated by the access network before they reach the monitoring point.

It was possible to use the ADSL network on the university campus for testing, where the monitoring point is also located, meaning that in those tests the monitoring point is close to the client. It was not possible to set up a laboratory-like experiment for the cable tests. Instead, networks from various internet service providers were used. This means that the monitoring point is not as close to the clients, so the second assumption may not hold and the timing characteristics generated by the cable access network are potentially weakened.

1.3 Research Question

The main research question we address in this work:

Can the approach described in [1] also be applied in the identification of cable and ADSL access networks?

Furthermore, this research also focuses on finding cable and ADSL characteristics that enable identification using the proposed method. In [1] it was concluded that Ethernet and WLAN have different inter-ACK time characteristics due to duplex capabilities. Are there characteristics of cable and ADSL that leave inter-ACK fingerprints? If it can be shown how the technology causes certain patterns in the identification process, this will confirm that this method is suitable for the identification of that technology.

1.4 Motivation

The main motivation for this research is scientific curiosity. We wanted to know whether it is possible to identify an ADSL or cable connection by looking solely at the TCP flow information.

ADSL was chosen because early test results looked promising. In addition, ADSL is very commonly used. It is suggested by [2] that the identification of cable is also possible. Since cable is also commonly used, it is interesting to verify whether the approach also works on cable.

A possible application of this research might be a scenario where one wants to detect a host that uses an ADSL or cable connection. For example, in a P2P network it makes sense to prioritize cable over ADSL connections when distributing data, because cable connections tend to be significantly faster than ADSL connections, especially in uplink data rates.


1.5 Approach

We start our research with experiments. Most of the test setup and programs from the previous experiments done on Ethernet and WLAN by the author of [1] are still available to us.

The experiment consists of establishing a TCP session from a client to a specific server on the campus. Since a TCP session can be started with a program like telnet, which is easy to use and generally available on every modern computer, we ask volunteers with a cable or ADSL modem to establish a TCP session to our server. This enables us to do a large number of tests on various provider networks.

In addition, it is possible to use the ADSL infrastructure at the university and experiment with the ADSL settings. This is useful because it gives insight into how the different settings affect the results. The knowledge of the exact data rate allows us to calculate the expected inter-ACK time and verify that the measured inter-ACK time matches. To explain the results, a literature study is done.


2 Background information

This chapter contains background information on several technologies which is needed to understand the analysis and results of our research. In section 2.1, we discuss TCP. In sections 2.2 and 2.3 we discuss the access network technologies, DOCSIS and ADSL respectively.

2.1 TCP

TCP stands for Transmission Control Protocol and together with IP (Internet Protocol) it is one of the most important protocols in the internet. Many internet applications like the world wide web and e-mail use TCP for reliable data transfer [3].

IP networks are packet-switched networks, meaning that the data that needs to be sent is divided into one or more packets. Each packet consists of a header field and a data field. The header field contains the information needed for routing, like source and destination IP address. The data field contains the payload data. IP packets are routed individually through the network between intermediate nodes called routers. These devices use routing algorithms to forward the packet to its destination.

IP packets can be lost due to various reasons. For example, an overloaded network: when queue buffers of routers get saturated because of high amounts of traffic, packets will be dropped.

Another reason for packet loss is transmission errors. For example, electromagnetic interference might cause a bit-flip. Every IP packet contains a checksum field in the header. Using this checksum, routers or the destination node can check if a packet is faulty. If this is the case, the packet is dropped [4].

TCP was designed to overcome these issues. TCP provides the following features:

• Reliable data transfer: TCP guarantees that data will arrive at the destination with no errors even though IP packets can be lost in the network.

• In-order transport: IP packets can arrive in an order different from the one in which they were sent. TCP makes sure that the data is in order after transmission.

• Full-duplex communication: TCP allows communication in both directions simultaneously.

• Connection-oriented service: Before data can be sent, a logical connection is established between two hosts. Every connection has a port number which is an integer number from 0 to 65535. Port numbers allow multiple connections between hosts. The server is a host waiting for incoming connection requests on a certain port number. The host that initiates the connection is called a client.

TCP uses acknowledgement packets, referred to as ACKs, to guarantee reliable and in-order data transfer. The basic operation of the TCP acknowledgement mechanism is as follows. A host that has data to send first divides the data into packets. In each packet a sequence number is included. These packets are then sent over the network to the destination. When the receiver receives a packet it will send an ACK packet including the sequence number of the received packet back to the sender. In TCP the acknowledgement sequence numbers are cumulative. An acknowledgement for number x indicates that all data up until x is received correctly.


To ensure proper transmission in the case an ACK or data packet is lost, the sender starts a timer when it sends a data packet. The value of this timer is the estimated Round Trip Time. When this timer expires and no acknowledgement has been received, a retransmission is triggered.

To the best of our knowledge, all modern implementations of TCP use an improvement called “Delayed Acknowledgements”. These implementations postpone the sending of an acknowledgement because the receiver might have data to send as a response to the received data. The acknowledgement can then be piggybacked in the data packet, meaning that the acknowledgement and the (response) data are sent together as one packet, which is more efficient than sending separate packets. RFC 1122 [5], where this mechanism is defined, requires that an ACK packet must not be delayed for more than 0.5 seconds. Furthermore, it is recommended by [5] that an ACK is sent for at least every second packet.

2.2 DOCSIS: Data Over Cable Service Interface Specification

DOCSIS is a technology to provide internet access to users over existing cable TV networks. It was developed by CableLabs. The first version was released in 1997 and supported a maximum transfer rate of 42.88 Mbit/s downlink and 10.24 Mbit/s uplink. In version 2 the uplink rate was increased to 30.72 Mbit/s because of the growing demand for symmetric services like VoIP. The latest version 3 allows the provider to use multiple channel bands in both up- and downlink. This means that up- and downlink can be multiples of 30.72 Mbit/s and 42.88 Mbit/s respectively. However, channels that are assigned to DOCSIS can no longer be used for analogue TV channels.

In Europe a slightly different version of DOCSIS is used which is called EuroDOCSIS. The only difference is that in Europe the PAL rather than the NTSC standard is used for the TV transmission, which specifies 8 MHz channels as opposed to 6 MHz in NTSC. This allows slightly higher transfer rates in EuroDOCSIS.

Figure 2.2.1 shows a typical cable TV network. The cable head end is the point where the TV cable terminates in the provider's premises. In order to provide internet access to customers, special equipment called CMTS (Cable Modem Termination System) is installed at the cable head end. Furthermore, an interface to the internet is needed at this location. At the customers' homes a cable modem (CM) is installed. DOCSIS defines the communication between the CM and the CMTS.

In figure 2.2.1, the direction from the homes to the cable head end is the uplink and the downlink is the opposite. Separate frequency channels are used for the up- and downlink. From the illustrated cable network architecture one can infer that on the downlink there is only one sender: the CMTS. The CMTS schedules all downlink traffic. On the uplink, however, there are multiple senders, namely all cable modems. In order to avoid collisions on the uplink (two or more active senders at the same time), a scheduling mechanism is used.

The time on the uplink is divided into minislots. Both the duration and the payload of a minislot can be set by the provider. A typical payload setting is 8 bytes [6]. A frame consists of multiple minislots: some minislots are available for CMs to make contention-based reservations, others are reserved to a specific CM for data transfers. The CMTS sends a special Map PDU message on the downlink periodically. This Map PDU contains the allocations of minislots for one frame in the future.


Figure 2.2.1: A typical cable network

Figure 2.2.2: DOCSIS Uplink example

Figure 2.2.2 shows how this works. When a modem has data to send it first waits for a Map PDU on the downlink. Map PDU M1 has allocated the time slots t_a until t_b for reservation. Modems can have only one outstanding request at any given time, but it is possible to reserve multiple data slots in one request. The reservation slots are contention based, meaning that multiple modems can send a request, which can result in a collision. In the example, the modem makes a request for 3 minislots. The modem waits for the next Map PDU M2, which confirms that minislots 3, 4 and 5 are now allocated to that specific modem. The modem can then send the data in the reserved data minislots.

If a collision occurs during the reservation time, the next Map PDU will not contain an allocation to the modem for the requested data slots. In this case the modem will enter exponential back-off. This means that it will wait for a random amount of time before trying to reserve minislots again. This random time is uniformly selected from the range $[0, 2^c - 1]$, where $c$ is the number of collisions. This roughly doubles the expected back-off time after each consecutive failure, decreasing the probability of a collision with each attempt.
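The back-off rule above can be illustrated with a minimal Python sketch. The function names are hypothetical and the back-off is counted in abstract units; the sketch only reproduces the range $[0, 2^c - 1]$ from the text and is not taken from the DOCSIS specification.

```python
import random


def backoff_upper_bound(collisions: int) -> int:
    """Largest back-off value after `collisions` failed reservation attempts."""
    return 2 ** collisions - 1


def draw_backoff(collisions: int, rng: random.Random) -> int:
    """Pick a back-off value uniformly from [0, 2**collisions - 1]."""
    return rng.randrange(2 ** collisions)


def expected_backoff(collisions: int) -> float:
    """Mean of a uniform draw from [0, 2**collisions - 1]."""
    return backoff_upper_bound(collisions) / 2


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make runs repeatable
    for c in range(1, 6):
        print(c, backoff_upper_bound(c), expected_backoff(c), draw_backoff(c, rng))
```

The expected value grows as 0.5, 1.5, 3.5, 7.5, ..., i.e. it approximately doubles with every additional collision.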


If a CM is sending data in its allocated data slots and requires more data slots, it can 'piggyback' a request for new data slots into an already allocated data slot rather than using the contention-based reservation slots. This is more efficient if the medium is busy and the probability of contention in the reservation slots is high [7].

Timing is very important for the proper functioning of this reservation mechanism. The data sent by the CM has to arrive at the CMTS within the interval of a minislot. However, the time necessary for the data signal to propagate through the cable depends on the distance between the CM and the CMTS. This is why the CM estimates the distance to the CMTS in a process called ranging [6]. Depending on the estimated distance, the modem will send the data before the beginning of the time slot to compensate for the propagation delay and to make sure that a transmission in the uplink arrives at the CMTS within the intended time slot.

2.3 ADSL

The first way to provide internet access to users over the existing infrastructure of copper telephone wires was the dial-up modem. These modems operated on the frequency band also used for regular telephone services (often referred to as POTS: Plain Old Telephone Services).

ADSL (Asymmetric Digital Subscriber Line) is a technology to provide internet access over copper telephone wires which uses a different and broader frequency band than POTS, enabling higher data rates than dial-up modems and coexistence with POTS.

The main difference between TV cable and telephone networks is that a telephone network has a dedicated wire to every home. This is not the case in cable TV networks, as seen in section 2.2.


Figure 2.3.1: A typical ADSL network

Figure 2.3.1 shows a typical ADSL network. ADSL is the technology used between the homes, where an ADSL modem is installed, and the DSLAM (Digital Subscriber Line Access Multiplexer). The DSLAM is the equipment on the provider's premises where the telephone wires terminate. At the DSLAM the traffic from all the homes is aggregated onto the provider's fast transport network, which can be either ATM or Ethernet. This connection terminates at the B-RAS (Broadband Remote Access Server), which routes the traffic onto the internet. The B-RAS also performs authentication with the ADSL modems installed at the users' homes to verify that the user has a subscription.

The ADSL technology used on the telephone wires uses a multi-carrier modulation technique called DMT. The available frequency spectrum in the wire is divided into 256 frequency sub-carriers (also called bins) [8]. These bins are 4.3125 kHz wide. The modem constantly measures the signal-to-noise ratio and the attenuation in each of the bins. The data rate in each of the bins is adjusted accordingly: if the signal-to-noise ratio of a bin is high, i.e. the signal quality is good, a maximum modulation of 15 bits can be selected; if a bin is 'noisy', the number of bits will be lower. Finally, if the condition of a bin is very poor, the bin is not used at all.

Figure 2.3.2 shows the frequency plan for ADSL. The band 30 Hz – 4 kHz is used for telephone service. The band 4 – 25.875 kHz is an unused guard band. Bins 7 – 31 are used for upstream traffic and bins 32 – 255 are used for downstream traffic [9].

Figure 2.3.2: ADSL Frequency Plan

The total data rate is the sum of the data rates in each bin, which can be up to 8 Mbit/s downstream and 1 Mbit/s upstream in ADSL 1. However, these rates can be lower in practice and vary between users. This depends on the quality of the cable and on the distance between the modem and the DSLAM. The higher frequency bins suffer from more attenuation. Therefore, if the distance between the ADSL modem and the DSLAM is large, these bins are not used, resulting in lower data rates. The maximum allowable distance is about 5 km.

ADSL offers two different ways of transferring data: via the fast path or via the interleaved path. When the fast path is used, data is sent in the order in which it arrived from the higher layers. When interleaving is used, data is reordered as illustrated in figure 2.3.3 and forward error correction data is added. The reason for this is that electromagnetic interference typically arrives in bursts. If, for example, an interference burst takes place during the indicated time, the complete second frame will not be received when the fast path is used. When using interleaving, parts of all three frames are lost, but since extra error correction data is added, the complete frames can be restored. This makes the interleaved path more robust against interference and thus more reliable. A disadvantage of interleaving is that it causes a higher latency, making it less suitable for traffic with real-time requirements.
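The effect of interleaving on a burst error can be illustrated with a minimal Python sketch. It uses the three frames with parts a, b and c from figure 2.3.3; the burst position and the helper names are assumptions for illustration, and the forward error correction itself is not modelled.

```python
def interleave(frames):
    """Send the first part of every frame, then all second parts, and so on."""
    depth = len(frames[0])
    return [frame[i] for i in range(depth) for frame in frames]


def lost_parts_per_frame(stream, burst_start, burst_length):
    """Count how many parts of each frame fall inside the interference burst.

    The first character of a part name (e.g. '2' in '2b') identifies its frame.
    """
    lost = {}
    for part in stream[burst_start:burst_start + burst_length]:
        lost[part[0]] = lost.get(part[0], 0) + 1
    return lost


if __name__ == "__main__":
    frames = [["1a", "1b", "1c"], ["2a", "2b", "2c"], ["3a", "3b", "3c"]]
    fast_path = [part for frame in frames for part in frame]
    interleaved_path = interleave(frames)

    # Hypothetical burst that destroys the 4th, 5th and 6th transmitted part.
    print("fast path:  ", lost_parts_per_frame(fast_path, 3, 3))
    print("interleaved:", lost_parts_per_frame(interleaved_path, 3, 3))
```

On the fast path the burst removes all three parts of frame 2, whereas on the interleaved path each frame loses only one part, which is the situation in which added error correction data can restore the frames.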

Labels in figure 2.3.2: bin numbers 0, 7, 31, 32 and 255; frequencies 4 kHz, 25.875 kHz, 138 kHz and 1104 kHz; bands POTS, Upstream and Downstream.

Two improved versions of ADSL are available. ADSL2 supports data rates up to 12 Mbit/s downstream and 3.5 Mbit/s upstream by using improved modulation techniques. ADSL2+ uses the same improved modulation technique but doubles the frequency band to 2.2 MHz. As a consequence, the downstream rate is doubled up to 24 Mbit/s. However, both versions still suffer from attenuation. In figure 2.3.4, the downstream data rates versus the attenuation are plotted for ADSL, ADSL2 and ADSL2+. From this figure it can be concluded that if the attenuation on the line is more than 50 dB, the downstream rates for ADSL, ADSL2 and ADSL2+ are nearly the same.


Figure 2.3.4: ADSL Attenuation. [10]

Figure 2.3.3: Explaining the principle of interleaving

Labels in figure 2.3.3: “normal” order (fast path): Data 1, Data 2, Data 3; interleaved path: 1a 2a 3a 1b 2b 3b 1c 2c 3c; an interference burst is marked on the time axis.

3 Method

This chapter describes the method proposed and used by [1] for the identification of a wireless and Ethernet connection and discusses why this method works.

The method involves collecting a large number of ACK-pairs from one TCP session and one access network type. An ACK-pair is a set of two ACK segments that were received back-to-back. For every ACK-pair, the time between the arrival of the two ACKs is calculated. This time is defined as the inter-ACK time. Together, the set of all collected inter-ACK times form the inter-ACK time distribution. This distribution can be presented in a graph with the inter-ACK time on the horizontal axis and the frequency, i.e. the number of times the corresponding inter-ACK time occurred, on the vertical axis. The access network will affect the timing of the TCP data and ACK segments and thus the inter-ACK time which causes a distinctive pattern in the graphs for a given access network.
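As a minimal sketch of this procedure, the following Python fragment computes inter-ACK times from a list of ACK arrival timestamps and groups them into a histogram. The function names, the use of integer microseconds and the bin width are assumptions for illustration; the sketch does not filter out invalid ACK-pairs, which the actual method requires (see section 3.3.2).

```python
from collections import Counter


def inter_ack_times(ack_timestamps_us):
    """Time differences (in microseconds) between consecutive ACK arrivals."""
    return [later - earlier
            for earlier, later in zip(ack_timestamps_us, ack_timestamps_us[1:])]


def inter_ack_distribution(times_us, bin_width_us):
    """Map each histogram bin (lower edge in microseconds) to its frequency."""
    return Counter((t // bin_width_us) * bin_width_us for t in times_us)


if __name__ == "__main__":
    # Hypothetical ACK arrival times at the server, in microseconds.
    arrivals = [0, 246, 492, 615, 861]
    times = inter_ack_times(arrivals)
    print(times)
    for lower_edge, count in sorted(inter_ack_distribution(times, 10).items()):
        print(f"{lower_edge:>5} us: {count}")
```

Plotting the bin edges on the horizontal axis and the counts on the vertical axis gives the kind of graph described above.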

The remainder of this chapter is structured as follows. First, two important properties of an access network technology are discussed in section 3.1 that, according to [1], influence the inter-ACK time in the case of Ethernet and WLAN. In 3.2 these properties are revisited in order to analyse the inter-ACK time in cable and ADSL. Section 3.3 discusses practical issues with regards to the method. This includes a discussion on how ACKs are generated by various TCP implementations and an overview of requirements for 'valid' ACK-pairs.

In the examples throughout this chapter, we assume the following:

• Data segments are sent back-to-back.

• Every 2 data segments are acknowledged by one ACK.

In reality, these assumptions do not always hold. As a solution, we dismiss ACK-pairs from our tests for which this is the case. In section 3.3.2, more precise requirements for 'valid' ACK-pairs are given.

3.1 Inter-ACK Time Distribution

In [1] two important properties of the access link are identified that influence the inter-ACK time distribution: the link transmission capabilities and the duplex capabilities.

3.1.1 Link Transmission Capabilities

The link transmission capabilities consist of two components: the maximum up- and downstream data rates and the overhead.

Every access link technology has a maximum upstream and downstream data rate. For example, Ethernet 100BASE-TX has a maximum data rate of 100 Mbit/s for both the upstream and the downstream.

Additionally, every link technology introduces overhead, i.e. additional data, such as a header, that needs to be sent over the access link only in order for the link to function properly. In Ethernet, for example, this overhead is 38 bytes per IP packet (a 12 byte inter-frame gap, an 8 byte preamble, the MAC source and destination addresses (6 bytes each), a 2 byte type/length field and a 4 byte CRC field).

When a 1500 byte TCP data packet is sent over an Ethernet link, the total amount of data that needs to be transferred over the Ethernet link will be 1538 bytes. A TCP ACK packet is typically 40 bytes. However, Ethernet requires that the payload of a frame is at least 46 bytes, which results in the addition of 6 “padding” bytes. On top of that comes the additional overhead, which results in 84 bytes for one TCP ACK over Ethernet. When 100 Mbit/s Ethernet is used, it will take 123.04 μs for a data packet and 6.72 μs for an ACK to be sent over an Ethernet link.
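The transmission times follow directly from the byte counts and the link rate. The Python sketch below reproduces the calculation; the constant and function names are chosen for illustration, and the 38 byte overhead and 46 byte minimum payload are the values quoted in the text.

```python
ETHERNET_OVERHEAD_BYTES = 38      # per-packet overhead quoted in the text
ETHERNET_MIN_PAYLOAD_BYTES = 46   # shorter IP packets are padded to this size
LINK_RATE_BPS = 100_000_000       # 100 Mbit/s Ethernet


def wire_bytes(ip_packet_bytes: int) -> int:
    """Bytes occupied on the Ethernet link by one IP packet, including padding."""
    return max(ip_packet_bytes, ETHERNET_MIN_PAYLOAD_BYTES) + ETHERNET_OVERHEAD_BYTES


def transmission_time_us(ip_packet_bytes: int, rate_bps: int = LINK_RATE_BPS) -> float:
    """Time in microseconds needed to put one IP packet on the link."""
    return wire_bytes(ip_packet_bytes) * 8 * 1_000_000 / rate_bps


if __name__ == "__main__":
    for name, size in (("data packet", 1500), ("ACK", 40)):
        print(f"{name}: {wire_bytes(size)} bytes, {transmission_time_us(size):.2f} us")
```

With these values a 1500 byte data packet occupies 1538 bytes (123.04 μs) and a 40 byte ACK occupies 84 bytes (6.72 μs) on the link.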

Figure 3.1.1 illustrates how the inter-ACK time is affected. For the sake of readability, sequence numbers of TCP packets are represented as integer numbers above the arrows. Clearly, the speed of the access link and the overhead have an impact on the inter-ACK time.

Figure 3.1.1: Time-sequence diagram Ethernet

Labels in figure 3.1.1: the server sends Data 1 to Data 4 and the client returns Ack 2 and Ack 4; the inter-ACK time is marked as 2 × 123.04 μs.

3.1.2 Duplex Capabilities

The second factor that influences the inter-ACK time is the duplex capability. Links can be either full-duplex or half-duplex: full-duplex means that it is possible to send and receive simultaneously whereas half-duplex means that only one of the two is possible at the same time.

Figure 3.1.2 illustrates a scenario where the access link is full-duplex. After two data segments are received at the client, ACK 2 can immediately be sent since simultaneous sending and receiving (of data segment 3) is possible.

Figure 3.1.3 illustrates three possible scenarios in the half-duplex case. Contention will determine the order in which the packets will be sent. In scenario A, after receiving two data segments, the client obtains access to the medium for sending ACK 2. In scenario B, the server obtains access to the medium first for sending Data 3, and the client has to postpone sending ACK 2. After ACK 2 is sent, the server can send Data 4. This scenario results in a lower inter-ACK time, as can be seen from the shaded area. In scenario C, the server first sends all four data packets before the client obtains access to the medium for sending the two ACKs. This results in an even lower inter-ACK time.

Overall, the conclusion is that the duplex capabilities of the access link influence the distribution of inter-ACK times. When the link is full-duplex, an ACK can be sent immediately after receiving the corresponding data packet, meaning that the time between ACKs is mostly about the same: the spread of the inter-ACK times will be small. This is shown on the left side of figure 3.1.4, where the distribution of the inter-ACK time is sketched. The inter-ACK time is on the horizontal axis and the vertical axis shows the frequency, i.e. the number of times the corresponding inter-ACK time occurs. On the right side, the half-duplex case is sketched. Each peak corresponds to a scenario in figure 3.1.3. Since Ethernet is full-duplex and WLAN is half-duplex, the inter-ACK time can be used for the identification of these technologies.
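The three half-duplex scenarios can be reproduced with a small Python sketch. It assumes an idealised shared medium on which transmissions follow each other without gaps, so contention and back-off delays are ignored; the transmission times used are hypothetical units, not measured values.

```python
def inter_ack_time(order, t_data, t_ack):
    """Time between the starts of ACK 2 and ACK 4 on a shared half-duplex medium.

    `order` lists the transmissions in the order in which they obtain the
    medium, e.g. ["D1", "D2", "A2", "D3", "D4", "A4"].
    """
    start = {}
    clock = 0.0
    for name in order:
        start[name] = clock
        clock += t_ack if name.startswith("A") else t_data
    return start["A4"] - start["A2"]


SCENARIOS = {
    "A": ["D1", "D2", "A2", "D3", "D4", "A4"],
    "B": ["D1", "D2", "D3", "A2", "D4", "A4"],
    "C": ["D1", "D2", "D3", "D4", "A2", "A4"],
}

if __name__ == "__main__":
    t_data, t_ack = 10.0, 1.0  # hypothetical transmission times
    for label, order in SCENARIOS.items():
        print(label, inter_ack_time(order, t_data, t_ack))
```

Under these assumptions the inter-ACK time is t_ack + 2·t_data in scenario A, t_ack + t_data in scenario B and t_ack in scenario C, so the peaks in the half-duplex sketch are separated by the transmission time of one data packet.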

Figure 3.1.2: Inter-ACK time, full-duplex link


Figure 3.1.3: Inter-ACK time, half-duplex link

Figure 3.1.4: Inter-ACK distribution sketched for full- and half-duplex

Panels in figure 3.1.3: Scenario A, Scenario B and Scenario C, each a time-sequence diagram between server and client with Data 1 to Data 4, Ack 2 and Ack 4 and the inter-ACK time marked.

Panels in figure 3.1.4: Full-duplex and Half-duplex; horizontal axis: inter-ACK time; vertical axis: frequency; marked intervals: transmission of 1 ACK and transmission of 1 data packet.

3.2 Access Network Technology

3.2.1 Cable

Given the fact that DOCSIS is full-duplex, one could presume that the distribution of the inter-ACK time is similar to that of Ethernet. However, this is not the case because the upstream channel is shared between multiple users. In section 2.2, it was shown that a schedule mechanism with contention is used. [2] states that “The contention occurring on the upstream channel of Cable modem provides us the opportunity to distinguish cable from ADSL (which provides a dedicated collision free connection) ”. But how does the upstream channel in DOCSIS affect the inter-ACK time? This question can be answered by pointing out a similarity between a half-duplex access network, as discussed in section 3.1.2, and the upstream mechanism of DOCSIS. It was shown that, when a half-duplex network is used, the transmission of an ACK segment has to be postponed in some cases. More specifically, this delay depends on the number of data segments sent on the downstream, before the client obtains access to the medium which results in a peak-shaped inter-ACK time distribution. Due to the contention on the upstream channel of DOCSIS, ACK segments are also delayed. In the case of DOCSIS, this delay depends on the number of reservation attempts the CM has to make before a reservation is successful. A reservation can be attempted after every Map PDU. Since Map PDUs are sent periodically on the downstream, it can be expected that the inter-ACK time distribution of DOCSIS is similar, i.e. peak shaped, to that of a half-duplex technology, even though DOCSIS is full-duplex.

The time that an ACK is delayed by the upstream mechanism in DOCSIS can be studied in more detail by discussing various scenario's with different contention and analysing the effect on the inter-ACK time.

In figure 3.2.1, the inter-ACK time of ACK2 and ACK4 is evaluated in the case of no contention. It is assumed that an ACK segment fits in two minislots and that a frame consists of 14 minislots. In reality, both numbers are typically higher but we chose this to keep the figure clear. A Map PDU is sent every ΔTMAP seconds. Furthermore, the TCP data segments, sent on the downstream, and the minislot reservation for ACK2 are not shown in this illustration for the sake of readability.

After ACK2 is sent, the CM will make a reservation for sending ACK4. In order to do this, it first has to wait for Map PDU2. The CM can then attempt a reservation for two data slots in the minislots allocated for reservation. In the case of no contention, the reservation will be successful, meaning that Map PDU3 confirms that the minislots are now allocated to the CM. The CM can then send ACK4 in the requested slots. The inter-ACK time is indicated in the figure.

This time can vary depending on where within frames 1 and 3 the ACKs are sent. If, for example, ACK4 was sent in minislots 1 and 2 instead of 7 and 8, the inter-ACK time would have been lower. The inter-ACK time as a function of the selected minislots is shown in table 3.2.1. If the minislot selection in frame1 and frame3 is identical, the inter-ACK time will be exactly 2·ΔTMAP. These are the entries on the diagonal of the table. If the minislots selected in frame3 are later than those in frame1, the inter-ACK time will be one or multiple minislot times (Tminislot) higher than 2·ΔTMAP.


Figure 3.2.1: DOCSIS Uplink example, no contention

                       Minislots in frame3
Minislots in frame1    1 & 2                    2 & 3                   3 & 4                    ...
1 & 2                  2·ΔTMAP                  2·ΔTMAP + Tminislot     2·ΔTMAP + 2·Tminislot    ...
2 & 3                  2·ΔTMAP - Tminislot      2·ΔTMAP                 2·ΔTMAP + Tminislot      ...
3 & 4                  2·ΔTMAP - 2·Tminislot    2·ΔTMAP - Tminislot     2·ΔTMAP                  ...
...                    ...                      ...                     ...                      2·ΔTMAP

Table 3.2.1: Inter-ACK time as a function of the selected minislots

These are all the entries to the upper right of the diagonal. In the opposite case, the inter-ACK time will be one or multiple minislot times lower. In reality, a frame contains more minislots. But even though this table shows only a limited number of possibilities, it can be inferred that 2·ΔTMAP will always have the highest number of entries in the table, and thus this time has the highest probability. Therefore, it can be expected that the scenario with no contention will cause a peak around 2·ΔTMAP.
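The counting argument behind Table 3.2.1 can be checked with a short sketch. It uses the simplified numbers of Figure 3.2.1 (14 minislots per frame, an ACK of 2 minislots) and assumes, as the table implicitly does, that every start position within a frame is equally likely. The function name and the idea of counting offsets in whole minislots are illustrative choices, not part of the original analysis.

```python
from collections import Counter

def inter_ack_offsets(slots_per_frame=14, ack_slots=2):
    """Count table entries per offset (in minislots) relative to 2*dT_MAP."""
    # First minislot an ACK of `ack_slots` minislots can occupy: 1 .. 13.
    starts = range(1, slots_per_frame - ack_slots + 2)
    # Row = start position in frame1, column = start position in frame3.
    return Counter(s3 - s1 for s1 in starts for s3 in starts)

counts = inter_ack_offsets()
total = sum(counts.values())
for offset in (-2, -1, 0, 1, 2):
    print(f"2*dT_MAP {offset:+d}*T_minislot: {counts[offset]} of {total} entries")
```

The offset 0 (the diagonal) has the most entries, and the count drops by one for every minislot of distance from the diagonal, which is why the distribution is expected to peak at 2·ΔTMAP.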

The reservation of minislots as shown in Figure 3.2.1 can fail, for example when other CMs try to reserve minislots or when a collision occurs in the reservation slots. In this case, the CM cannot send ACK4 in frame3. Instead, the modem will have to attempt a new reservation in frame3. If this is successful, the ACK can be sent in frame4.

As a result, the inter-ACK time is now 3·ΔTMAP. Of course, the second reservation attempt can also fail, which means that the ACK is delayed for another ΔTMAP and thus the inter-ACK time is 4·ΔTMAP. All in all, it can be expected that the upstream behaviour of DOCSIS will cause a peak every ΔTMAP in inter-ACK time graphs, starting at 2·ΔTMAP. This is sketched in figure 3.2.2.


[Figure 3.2.1 diagram: timeline between CMTS and CM covering frame1, frame2 and frame3, each lasting ΔTMAP and divided into reservation and data minislots. Labels: Map PDU M2, M3 and M4; allocations in M1, M2 and M3; Send ACK2; Reserve 2 minislots; Send ACK4; inter-ACK time.]

Figure 3.2.2: Sketch of Inter-ACK time distribution of DOCSIS

The height of the peaks, i.e. the frequency, will decrease as the inter-ACK time increases since DOCSIS uses binary exponential back-off when a collision occurs in the reservation slots. This effectively means that the probability of a collision decreases as the number of attempts increases, and thus the probability of a higher inter-ACK time is lower.
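The expected shape of Figure 3.2.2 can be illustrated with a deliberately simplified model. The sketch below assumes that every reservation attempt fails independently with the same probability `p_fail`; this is a simplification, since binary exponential back-off changes the collision probability per attempt. The values `p_fail = 0.3` and a MAP time of 2 ms are illustrative assumptions, not measurements.

```python
def docsis_peak_model(map_time_ms=2.0, p_fail=0.3, max_failures=5):
    """Return (inter-ACK time in ms, probability) for 0..max_failures failed reservations.

    Assumption: each attempt fails independently with probability p_fail.
    """
    peaks = []
    for k in range(max_failures + 1):
        inter_ack_ms = (2 + k) * map_time_ms   # 2*MAP without contention, +1 MAP per failure
        probability = (1 - p_fail) * p_fail ** k
        peaks.append((inter_ack_ms, probability))
    return peaks

for t_ms, prob in docsis_peak_model():
    print(f"{t_ms:5.1f} ms  {prob:.3f}")
```

Even under this crude assumption the peaks sit at 2·ΔTMAP, 3·ΔTMAP, and so on, with decreasing height; a back-off scheme that lowers the collision probability on later attempts would make the higher peaks smaller still.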

DOCSIS can optionally use an alternative mechanism, called piggybacking, for making upstream bandwidth reservations. A reservation request for data minislots can be embedded in an already reserved data slot, instead of using the contention-based reservation slots. According to [11], sending upstream data using piggybacking takes 2·ΔTMAP, and thus the inter-ACK time will also be 2·ΔTMAP in this case. Since an inter-ACK time of 2·ΔTMAP also occurs when piggybacking is not used, as was shown previously, it can be concluded that this mechanism will not affect the test results.

[11] also suggests the possibility of multiple packets being sent in one upstream frame. The downstream channel in DOCSIS is typically faster than the upstream channel, and the upstream channel can introduce substantial delay. While data segments are being received on the downstream without delay, the corresponding ACK segments can be queued in the buffer of the CM due to contention. The CM will reserve minislots for as much queued data as possible, potentially creating a situation where multiple ACK segments are sent within one frame. In this situation, the inter-ACK time depends on the actual transmission time on the upstream, i.e. on the transmission capabilities as explained in section 3.1.1. The transmission time of a 40-byte ACK segment is 101.7µs, given that the data rate on the uplink is 3 Mbit/s. This results in one more peak at this time.
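The transmission time quoted above can be reproduced with a small calculation. The figure of 101.7µs is obtained when 1 Mbit is taken as 2^20 bits; that this is the convention behind the quoted number is an inference from the arithmetic, not something the text states. With 1 Mbit = 10^6 bits the result is slightly higher.

```python
def transmission_time_us(size_bytes, rate_bit_per_s):
    """Time in microseconds needed to put size_bytes on a link of the given rate."""
    return size_bytes * 8 / rate_bit_per_s * 1e6

ACK_BYTES = 40
DECIMAL_RATE = 3 * 10**6   # 3 Mbit/s with 1 Mbit = 10^6 bit
BINARY_RATE = 3 * 2**20    # 3 Mbit/s with 1 Mbit = 2^20 bit (assumed convention)

print(round(transmission_time_us(ACK_BYTES, DECIMAL_RATE), 1))  # 106.7
print(round(transmission_time_us(ACK_BYTES, BINARY_RATE), 1))   # 101.7
```

Either way, the value falls in the lowest histogram bins, far below 2·ΔTMAP, so the extra peak is clearly separated from the contention peaks.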

The previous discussion assumed that the interval between MAP messages (ΔTMAP), also called the MAP time in some literature, is fixed. Sources [12] and [13] suggest that a possible value for the MAP time is 2 ms and that it is indeed fixed. However, [14] states: “In most implementations the MAP time is constant, but dynamically varying MAP times are permitted in the specification”. Hence, dynamic MAP times have to be considered. If dynamic MAP times are used, the peak pattern will not be observed, since the expected inter-ACK time will no longer simply be a multiple of a fixed value ΔTMAP.

According to a Cisco tutorial [15], the MAP time will be based on the distance between the CMTS and the CM farthest away, when dynamic MAP time is used:

“If you are using Static Map Advance, all of the modem timing offsets are always derived from a max-delay based on 100 miles. Dynamic Map Advance, on the other hand, can learn which cable modem in a segment is truly the farthest away from the CMTS. It more precisely derives the timing offset, to tune the look-ahead time in the MAP accordingly.”

This suggests that the CMTS only needs to change the MAP time when this distance changes, which might happen when a CM is switched on or off, or when a new subscriber connects its modem. Cable modems are typically 'always on', which implies that altering the MAP time is a rare event, even when dynamic MAP times are used. If this is the case, it might still be possible to see a similar peak-shaped pattern if the identification process is performed in a time frame where the MAP time happens to be static.

3.2.2 ADSL

We expected that the inter-ACK time distribution of ADSL would show a single peak. The first reason for this is that ADSL offers full-duplex data transmission. In section 3.1.2, it was shown that Ethernet, which is also a full-duplex technology, results in a single-peak pattern. In a half-duplex technology, segments are sometimes delayed because simultaneous sending and receiving is not possible. In the case of full-duplex, there is no such delay, meaning that the inter-ACK time is nearly constant.

Secondly, telephone networks, and thus ADSL, provide a dedicated wire to every user. No scheduling mechanism is required to avoid collisions with other users. In section 3.2.1, it was argued that the scheduling used in DOCSIS causes a peak-shaped pattern.

However, early ADSL tests done in [1] show a distinctive “three peak” pattern, as shown in figure 3.2.3. The three peaks are 800µs apart in both tests. Clearly, this does not match our previously mentioned expectation of one peak. We were unable to find a theoretical explanation for this behaviour. However, if this pattern is consistent, it will be useful for identification since it is easily distinguishable from the pattern in cable networks.


Figure 3.2.3: ADSL inter-ACK time test

When examining the transmission capabilities, there are two factors that need to be considered: overhead and the data rate. Table 3.2.2 lists twelve ADSL configurations with different overhead. The first three columns show the configuration details. The AAL5 tail column shows the overhead imposed on one IP packet. The TCP/IP column shows the total amount of data that needs to be sent for a 40-byte ACK.

In ADSL, ATM (Asynchronous Transfer Mode) is used, which means that the data must always be fitted into an integer number of fixed-size 53-byte ATM cells (5 bytes header, 48 bytes payload). For example, when a 40-byte ACK is sent over a link with a configuration as described by the first row of the table, the overhead is 10 bytes, which adds up to 50 bytes that need to be sent. Since the payload of an ATM cell is 48 bytes, two cells are required for sending the data, so that two times the size of an ATM cell, 106 bytes, needs to be transferred. Using this table, the total number of bytes that need to be transferred for a 40-byte ACK and a 1500-byte data segment can be calculated, which is done in table 3.2.3.
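The cell arithmetic above can be written down as a short sketch. The 10-byte overhead is the value the text gives for the first configuration; the overheads of the other configurations come from Table 3.2.2 and are not repeated here. The function name is an illustrative choice.

```python
import math

ATM_CELL_BYTES = 53      # 5 bytes header + 48 bytes payload
ATM_PAYLOAD_BYTES = 48

def atm_bytes_on_wire(ip_packet_bytes, overhead_bytes):
    """Return (number of ATM cells, total bytes transferred) for one IP packet."""
    cells = math.ceil((ip_packet_bytes + overhead_bytes) / ATM_PAYLOAD_BYTES)
    return cells, cells * ATM_CELL_BYTES

print(atm_bytes_on_wire(40, 10))    # (2, 106)   40-byte ACK
print(atm_bytes_on_wire(1500, 10))  # (32, 1696) 1500-byte data segment
```

Because the payload is rounded up to whole cells, a difference of a few bytes of overhead can add a full 53-byte cell, which is why some configurations in table 3.2.3 need 1 or 33 cells instead of 2 or 32.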

Previously, we made the assumption that an ACK is sent for every two data segments. This means that the inter-ACK time can be calculated as follows. Tdata and TACK are the transmission times of a data and ACK segment respectively.

ΔACK = 2·Tdata + TACK

When the data rates are known, the inter-ACK time can be calculated for all the ADSL configurations. In table 3.2.3, this is done for a data rate of 7616 kbps downstream, and 832 kbps upstream. Based on this, we should observe a peak in the inter-ACK time distribution at one of the calculated inter-ACK times, depending on the configuration of the network.
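As a worked example, the sketch below evaluates the formula for the byte totals listed in table 3.2.3. The tabulated values are reproduced when 1 kbps is taken as 1024 bit/s; this convention is inferred from the numbers and is an assumption, as is the function name.

```python
def inter_ack_time_ms(data_bytes, ack_bytes, down_kbps, up_kbps, bits_per_kbit=1024):
    """Inter-ACK time 2*T_data + T_ACK in milliseconds.

    bits_per_kbit=1024 is assumed because it reproduces the tabulated values.
    """
    t_data = data_bytes * 8 / (down_kbps * bits_per_kbit)  # seconds, downstream
    t_ack = ack_bytes * 8 / (up_kbps * bits_per_kbit)      # seconds, upstream
    return (2 * t_data + t_ack) * 1000

# 32 cells of data (1696 bytes) and 2 cells of ACK (106 bytes)
print(inter_ack_time_ms(1696, 106, down_kbps=7616, up_kbps=832))  # about 4.4749 ms
# ACK fits in a single cell (53 bytes)
print(inter_ack_time_ms(1696, 53, down_kbps=7616, up_kbps=832))   # about 3.9772 ms
```

With these rates the two data segments account for roughly 3.5 ms and the ACK for 0.5 to 1 ms, so the downstream term dominates the estimate.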

Note that other factors, for example the multi-carrier technique used in ADSL, can impose additional delay on the data. As these factors are not taken into account in our analysis, the transmission times presented in table 3.2.3 should be seen as an estimate; the peaks in the graphs might therefore not match these values exactly.

Table 3.2.2: ADSL Overhead, source [16]


Table 3.2.3: Expected inter-ACK time for various ADSL configurations

Configuration                              ACK segment               Data segment              Inter-ACK
Name      Type             VC or LLC       ATM cells  Total (bytes)  ATM cells  Total (bytes)  time (ms)
PPPoA     Routed           VC              2          106            32         1696           4.4748593548
PPPoA     Routed           LLC/NLPID       2          106            32         1696           4.4748593548
RFC2684R  Routed           VC              1          53             32         1696           3.9771880808
RFC2684R  Routed           LLC/SNAP        2          106            32         1696           4.4748593548
RFC2684B  Bridged w/o FCS  VC              2          106            32         1696           4.4748593548
RFC2684B  Bridged w/o FCS  LLC/SNAP        2          106            32         1696           4.4748593548
RFC2684B  Bridged w FCS    VC              2          106            32         1696           4.4748593548
RFC2684B  Bridged w FCS    LLC/SNAP        2          106            32         1696           4.4748593548
PPPoE     Bridged w/o FCS  VC              2          106            32         1696           4.4748593548
PPPoE     Bridged w/o FCS  LLC/SNAP        2          106            33         1749           4.583594255
PPPoE     Bridged w FCS    VC              2          106            32         1696           4.4748593548
PPPoE     Bridged w FCS    LLC/SNAP        2          106            33         1749           4.583594255

Data rate: upstream 832 kbps, downstream 7616 kbps

3.3 Practical Issues

In this section important issues are discussed regarding the generation and the detection of ACKs.

3.3.1 ACK in different Operating Systems

The delayed ACK mechanism, used in the majority of TCP implementations, allows delaying ACK packets as explained in section 2.1. The examples in the previous section that were used to analyse the inter-ACK time assume that an ACK is always sent for two data segments. Indeed, this is recommended by RFC1122 [5]: “in a stream of full-sized segments there SHOULD be an ACK for at least every second segment”. In RFC terminology, SHOULD means that the behaviour is recommended but not mandatory. RFC1122 also states that an ACK MUST NOT be delayed by more than 0.5 seconds. The specification therefore allows the number of data segments acknowledged by one ACK to vary, and the previously made assumption that every two data segments are acknowledged does not always hold. It is crucial to know when ACKs are generated by the TCP implementations because this will affect the inter-ACK time. In this section, the implementations of the “delayed ACK” mechanism will be discussed for the two operating systems used in this research.

Linux is an open source operating system, which allows for a detailed study of the TCP implementation, as the source code is publicly available. In Linux, the maximum delay of an ACK is determined by the value of the delayed ACK timer, which is constantly updated to the minimum of the smoothed round trip time (srtt) and the interval between data segments. The value has a lower and an upper bound of 40ms and 200ms respectively.

An ACK is immediately sent in the following situations:

• More than one MSS (maximum segment size, i.e. a full-sized segment) worth of data has been received and is unacknowledged, and the receiver buffer has space to accept an advertised window worth of data. This should fulfil the requirement in RFC1122 that an ACK is sent for at least every second full-sized data segment: once two full-sized segments have been received, the ACK is no longer delayed. However, because of the buffer-space condition, this is not guaranteed. In practice every second segment is ACKed most of the time, but it is possible that an ACK is only sent after more than two segments.

• TCP is in quick mode. The receiver will go into this mode when it deduces that the sender is in slow start. This happens at the beginning of the TCP session or when the loss of a segment is detected. In quick mode, an ACK will be sent for every data segment with the aim of quickly increasing the congestion window size of the sender.

• Data is received out of order. It is recommended by RFC 1122 that an ACK is sent immediately when data is received out of order.
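The behaviour described above can be summarised in a minimal sketch. It only restates the prose rules; it is not the kernel's code, and all names (`delayed_ack_timeout_ms`, `ReceiverState`, `ack_immediately`) are hypothetical.

```python
from dataclasses import dataclass

ATO_MIN_MS = 40    # lower bound of the delayed ACK timer
ATO_MAX_MS = 200   # upper bound of the delayed ACK timer

def delayed_ack_timeout_ms(srtt_ms, inter_segment_ms):
    """min(srtt, interval between data segments), clamped to [40 ms, 200 ms]."""
    return max(ATO_MIN_MS, min(ATO_MAX_MS, min(srtt_ms, inter_segment_ms)))

@dataclass
class ReceiverState:
    unacked_bytes: int      # received but not yet acknowledged
    mss: int                # maximum segment size
    buffer_has_room: bool   # room for an advertised window worth of data
    quick_mode: bool        # receiver assumes the sender is in slow start
    out_of_order: bool      # last segment arrived out of order

def ack_immediately(state):
    """True if, per the three rules above, the ACK is not delayed."""
    if state.out_of_order or state.quick_mode:
        return True
    return state.unacked_bytes > state.mss and state.buffer_has_room

print(delayed_ack_timeout_ms(srtt_ms=100, inter_segment_ms=60))   # 60
print(ack_immediately(ReceiverState(2920, 1460, True, False, False)))  # True
```

The second call models two full-sized segments of 1460 bytes waiting for acknowledgement, which is the case the inter-ACK analysis relies on.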

Windows XP is closed source, meaning that the actual implementation cannot be studied. However, the following is known: the registry entry TcpDelAckTicks controls the maximum time that an ACK can be delayed. The default value is 2, meaning that an ACK can be delayed for at most 200ms (2 x 100ms). The TcpAckFrequency registry entry controls the maximum number of data segments that are acknowledged by one ACK and has a default value of 2. This means that every second data segment is acknowledged, but if, for example, a data segment is lost, an ACK is not delayed for more than 200ms.

Based on this information, it can be inferred that an ACK is sent for either one or two data packets. This behaviour is confirmed in laboratory tests done in [1]. When an ACK is sent for one data segment the inter-ACK time tends to be in the order of hundreds of milliseconds, meaning that these ACKs are likely sent because the delayed ACK timer expired.

3.3.2 ACK-pair detection

In the previous sections it was explained why the inter-ACK time can be used for identification. However, not all ACK-pairs are useful for the purpose of identification. The timing signature created by the access network is sometimes concealed due to a number of reasons. This requires adequate selection of ACK-pairs:

• ACK segments corresponding to a data segment that was resent or received out of order, are dismissed from the analysis.

• ACK segments whose corresponding data segments are more than 500μs apart are disregarded.

• ACK-pairs with an inter-ACK time higher than 200ms can be disregarded.

ACK segments are originally meant to make sure that data segments arrive at the receiver. IP networks are unreliable: packets can be dropped, and can arrive out of order. In an undisturbed TCP flow, i.e. no data segments are lost or received out of order, the inter-ACK time can give useful information about the access network. But when this is not the case, this information is lost. This has to be taken into account in the ACK-pair detection process. For this reason, ACK segments corresponding to a data segment that was resent or received out of order, are dismissed from the analysis.

Another requirement for valid ACK-pairs is that the corresponding data segments are sent back-to-back. ACK segments whose corresponding data segments are more than 500μs apart are disregarded.

Furthermore, based on the analysis done in section 3.3.1, ACK-pairs with an inter-ACK time higher than 200ms can also be disregarded since this indicates an expired delayed ACK timer. In this case, the inter-ACK time is affected by the TCP implementation and thus the timing signature caused by the access link is blurred.
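The three selection rules can be expressed as a simple filter. This is a sketch under the assumption that each candidate pair has already been annotated with the relevant properties; the `AckPair` record and its field names are hypothetical and do not describe the tooling actually used.

```python
from dataclasses import dataclass

MAX_DATA_GAP_S = 500e-6    # data segments must be sent back-to-back
MAX_INTER_ACK_S = 200e-3   # above this the delayed ACK timer has likely expired

@dataclass
class AckPair:
    inter_ack_s: float     # time between the two ACKs
    data_gap_s: float      # time between the corresponding data segments
    retransmitted: bool    # a corresponding data segment was resent
    out_of_order: bool     # a corresponding data segment arrived out of order

def is_valid_pair(pair):
    """Apply the three ACK-pair selection rules listed above."""
    if pair.retransmitted or pair.out_of_order:
        return False
    if pair.data_gap_s > MAX_DATA_GAP_S:
        return False
    return pair.inter_ack_s <= MAX_INTER_ACK_S

candidates = [
    AckPair(0.004, 120e-6, False, False),   # kept
    AckPair(0.004, 900e-6, False, False),   # data segments not back-to-back
    AckPair(0.250, 120e-6, False, False),   # delayed ACK timer expired
]
print([is_valid_pair(p) for p in candidates])  # [True, False, False]
```

The numbers in `candidates` are made-up inputs that exercise each rule once.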


4 Validation

In this chapter the validation for the method is given by means of real-world tests performed on various cable and ADSL networks. The results are presented through graphs of the inter-ACK time distribution. The horizontal axis shows the inter-ACK time, and the vertical axis shows the frequency, i.e. the number of ACK-pairs as a fraction of the total number of ACK-pairs.

The inter-ACK time distribution is calculated as follows. We have defined a set of fixed sized bins, each with its own consecutive time range. For every ACK-pair, the corresponding bin is determined. The number of ACK-pairs in each bin is counted, and subsequently calculated as the fraction of the total number of ACK-pairs.

We used a bin size of 125µs with the following bin ranges:

• Bin 0µs: [-25µs, 99µs]

• Bin 125µs: [100µs, 224µs]

• etc.

For example, when there is a peak in the graph at an inter-ACK time of 0 and a 'fraction of pairs' of 0.4, this means that 40% of all ACK-pairs were within the range of [-25µs, 99µs].
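The binning described above can be sketched as follows. The bin size and the 25µs offset follow the ranges listed in the text; the function names and the sample values are illustrative assumptions.

```python
from collections import Counter

BIN_SIZE_US = 125
BIN_OFFSET_US = 25   # the first bin covers [-25 µs, 99 µs]

def bin_label_us(inter_ack_us):
    """Label (0, 125, 250, ...) of the bin an inter-ACK time in µs falls into."""
    return ((inter_ack_us + BIN_OFFSET_US) // BIN_SIZE_US) * BIN_SIZE_US

def distribution(inter_ack_times_us):
    """Fraction of ACK-pairs per bin, keyed by bin label."""
    total = len(inter_ack_times_us)
    if total == 0:
        return {}
    counts = Counter(bin_label_us(t) for t in inter_ack_times_us)
    return {label: counts[label] / total for label in sorted(counts)}

# Made-up sample of five inter-ACK times in microseconds
print(distribution([0, 50, 110, 4000, 4010]))  # {0: 0.4, 125: 0.2, 4000: 0.4}
```

In this sample, 40% of the pairs fall into the bin labelled 0, which corresponds to the reading of a peak at 0 with a fraction of 0.4 given above.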

4.1 Cable

The validation for the cable hypothesis is presented in two parts. In section 4.1.1 the inter-ACK time graphs of various experiments are presented, which confirm the patterns established previously. Section 4.1.2 presents the results of an extra experiment that confirms our results.

4.1.1 Inter-ACK time

We conducted tests on two provider networks. In each test, a computer was directly connected to the cable modem by Ethernet and a TCP session to the server was started using telnet. The server sent data to the client and recorded all TCP traffic. Each experiment lasted for about one minute, to make sure that enough ACK-pairs were collected. All tests have at least a thousand ACK-pairs. The results are discussed in this section.

Figure 4.1.1 shows the results from tests on a DOCSIS 2 and a DOCSIS 3 network. Clearly, this network is using a MAP time of 2 milliseconds. Over 50% of all ACK-pairs have an inter-ACK time of 4 milliseconds, or 2·ΔTMAP, indicating a situation without contention. After this peak, there are peaks every 2 milliseconds up to 14 milliseconds, indicating contention on the upstream as explained in section 3.2.1. The peak at zero is likely the result of multiple ACKs being sent in one frame on the upstream.
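The reading of Figure 4.1.1 given above (first peak at 2·ΔTMAP, followed by one peak per ΔTMAP) can be turned into a simple consistency check. This is a sketch, not the identification procedure used in this work: the function name, the tolerance of one bin width (0.125 ms) and the decision to ignore the peak near zero are all assumptions.

```python
def looks_like_docsis(peaks_ms, tolerance_ms=0.125):
    """Return (matches, estimated MAP time in ms) for a list of peak positions.

    The pattern tested is: evenly spaced peaks, with the first one at twice
    the spacing. Peaks within the tolerance of zero are ignored.
    """
    peaks = sorted(p for p in peaks_ms if p > tolerance_ms)
    if len(peaks) < 2:
        return False, None
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    map_time = sum(gaps) / len(gaps)
    evenly_spaced = all(abs(g - map_time) <= tolerance_ms for g in gaps)
    starts_at_two_maps = abs(peaks[0] - 2 * map_time) <= tolerance_ms
    return evenly_spaced and starts_at_two_maps, map_time

# Peak positions named in the text: zero, then 4 ms to 14 ms in steps of 2 ms
print(looks_like_docsis([0, 4, 6, 8, 10, 12, 14]))  # (True, 2.0)
```

A distribution whose first peak does not lie at twice the peak spacing, for example peaks at 1.75 ms and 3.5 ms, is rejected by the second condition.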

We do not have an explanation for the peaks at about 3ms and 9ms. These peaks were not consistent over multiple tests: they do not always appear in the various tests that were done on the same modem and cable network. On some occasions, however, these peaks would appear at exactly the same inter-ACK time.


In the DOCSIS 3 test, the fraction of ACK-pairs with an inter-ACK time in the range [-25µs, 99µs] is considerably larger than in DOCSIS 2. Although this difference is consistent in multiple tests done on DOCSIS 2 and 3 networks, we do not have an explanation for this.

Figure 4.1.2 shows that there are no significant differences when using a different operating system. These tests were performed on the same equipment and provider network. The differences in the number of pairs per bin are not related to the operating system; they are likely caused by DOCSIS. For example, when there is a lot of traffic on the DOCSIS network, one can expect that contention will occur, which results in higher peaks at higher inter-ACK times.

Figure 4.1.2: Comparing various operating systems


Figure 4.1.1: Comparing DOCSIS 2 and 3

4.1.2 Upstream-only

Another test was conducted in order to verify that the observed behaviour is in fact caused by the upstream channel of DOCSIS. The previously used inter-ACK time test measures the combined effect of the up- and downstream. We designed a test which only uses the upstream channel of DOCSIS. We expect the results of this upstream-only test to be similar to those of the previous test.

The upstream-only test works as follows. A client, using a cable connection, starts a TCP connection to a server. Upon connection the client starts sending data segments, as opposed to the previous tests where the server was sending the data segments. The server logs all traffic, which is analysed later on. The inter-data time, i.e. the time between the arrival of two back-to-back data segments, is of interest since these segments are now sent on the upstream channel. If the upstream causes the previously explained behaviour, it can be expected that the inter-data time distribution will be similar to the inter-ACK time distribution in the previous tests.

Figure 4.1.3 shows the result of an upstream-only test conducted on a DOCSIS version 3 network. The result confirms that the peak-shaped distribution of the inter-ACK time is caused by the upstream channel of DOCSIS. Clearly, the inter-data time distribution is similar to the inter-ACK time distribution from the previous tests.

Figure 4.1.3: DOCSIS Upstream-only test


4.2 ADSL

The results of the ADSL tests did not match our expectations. We conducted tests on various public ADSL networks. These results are discussed in section 4.2.1. We also conducted tests in a controlled environment, which enabled us to experiment with the DSLAM settings and investigate the effects on our test results. The results of these tests are discussed in section 4.2.2.

4.2.1 ADSL tests

Figure 4.2.1 shows the inter-ACK time distribution for two tests on the same modem, measured on an ADSL2+ network. Both look similar to the ADSL tests done in [1] (see figure 3.2.3). The distribution contains three peaks which are approximately equal distances apart (1 to 1.125 ms). However, the location of the peaks and the distance between them differ per test.

Test B was conducted on the same equipment and provider as test A, but at a different time. The results are significantly different since the peaks have different locations. This indicates that differences in test results can occur over time, even though the same equipment is used.

Figure 4.2.2 shows a test performed on another ADSL network and contains one main peak located at 4.25ms. From the analysis presented in section 3.2.2, we expected a single peak distribution because ADSL is full-duplex and offers collision free service. This was the only test where we observed this pattern. We were unable to confirm the result because we could not perform a second test on the same equipment.

Figure 4.2.3 shows two more tests which are different from the previous one. Test 4 has five main peaks which are 250µs apart. In test 5 the peaks are, again, roughly equal distances apart (1.875ms), except for the peaks at 1.75ms and 3.5ms where the distance is 1.75ms. This pattern looks similar to the cable pattern; however, it cannot be mistaken for a cable connection. If this were a cable connection, the MAP time would be 1.875ms and the first peak should be at 2 times the MAP time, which is not the case.

As the tests are not consistent, we cannot develop a method to characterize ADSL. The three-peak pattern does not always occur. However, most tests seem to have a number of peaks which are roughly equal distance apart. Furthermore, it was shown that even on the same equipment the inter-ACK time distribution might vary significantly from test to test.


Figure 4.2.1: Two ADSL tests on the same equipment

Figure 4.2.2

Figure 4.2.3

4.2.2 Effect of DSLAM settings

In an attempt to find consistent results we performed tests in a controlled environment. However, we were unable to reproduce the pattern in figure 4.2.1. We experimented with the DSLAM settings. Figure 4.2.4 shows a test with the standard DSLAM settings, i.e. the settings normally used by the administrator. Again, as in the previous tests, there is a peak-shaped pattern.


Figure 4.2.4: Default DSLAM settings

Table 4.2.1 shows the standard DSLAM settings. The settings can be explained as follows:

• Interleaved path bit rate – The interleaved path uses interleaving as explained in section 2.3. This setting specifies the data rate on the down- and upstream.

• Fast path bit rate – The fast path does not use interleaving. This setting specifies the data rate on the down- and upstream.

• Interleaving delay – Specifies the maximum delay on the interleaved path. According to the DSLAM manual “It defines the mapping (relative spacing) between subsequent input bytes at the interleave input and their placement in the bit stream at the interleave output”

• FEC check bytes – The number of Forward Error Correction bytes. This is redundant information added to the transferred data by the sender which helps the receiver to detect and correct a limited amount of errors. Essentially, a higher number of check bytes means more overhead and more latency but better resistance against errors.

• Margin – Specifies the margins for the SNR (Signal-to-noise ratio)

• Reed-Solomon codeword size – Another method for error detection and correction similar to FEC.

• Overhead framing – Specifies the framing overhead mode which can be set from mode 0 to 3. Mode 3 means that the synchronisation bytes are merged for multiple bearer channels. According to the manual, mode 3 is default and optimizes performances which suggests that this setting is commonly used. We were unable to change this setting.

• Trellis coding – Another method for forward error correction.


Table 4.2.1: Standard DSLAM settings

Setting                       Down         Up
Interleaved path bitrate      7616 Kbps    832 Kbps
Fast path bitrate             0 Kbps       0 Kbps
Interleaving delay            4000µs       4000µs
FEC check bytes               16 bytes     16 bytes
Margin                        10 dB        2 dB
Reed Solomon codeword size    1 bytes      8 bytes

Setting                       Value
Overhead framing              Mode 3
Trellis coding                Disabled

In section 3.2.2, we calculated the expected inter-ACK time for all ADSL configurations, given the data rate from the standard test as shown in table 4.2.1. We expected that the calculated inter-ACK time of one of the configurations would match one of the peaks in figure 4.2.4, but this is not the case.

Figure 4.2.5 shows that roughly halving the bit rate on the downstream (3936 kbps), on the upstream (384 kbps), or on both at the same time has a significant effect. When only the downstream rate is lowered, the peaks after 7 ms disappear. When the upstream rate is lowered, the largest fraction of ACK-pairs has an inter-ACK time of approximately 11 ms, as opposed to 5ms in the standard test.

Changing the interleaving delay shifts the distribution as shown in figure 4.2.6. The amount of shift is between 250µs and 500µs depending on the peak. A lower interleaving delay causes a shift between 125µs and 250µs to the right.

There is no significant difference between the interleaved and fast path, as shown in figure 4.2.7. The difference is similar to the effect of lowering the interleaving delay. We expected that using the interleaved path with an interleaving delay of 0 µs would give a result similar to using the fast path, since a delay of 0 µs effectively means no interleaving of data. However, it turns out that this is not the case, as can be seen in figure 4.2.7.


Figure 4.2.5: Various data rates settings in the DSLAM

Figure 4.2.6: Various interleaving settings in the DSLAM on up- and downstream

Figure 4.2.7: Comparing the fast and interleaved path


4.2.3 Summary ADSL

The tests performed on various public provider networks are not consistent. Based on our knowledge of ADSL, we expected a single-peak pattern. Early tests showed a three-peak pattern. We have observed both of these patterns in some of our test results. However, we have observed other patterns as well.

The provider tests could not be reproduced in a controlled environment. Although the controlled environment tests were consistent, they showed no similarity with the tests on public networks.

The controlled environment results do not match our inter-ACK time calculations. In the controlled environment tests, we have knowledge of the exact data rates. Combining this information with the overhead per data and ACK segment introduced by ADSL, we calculated the expected inter-ACK time for several ADSL configurations. However, the calculated times do not match the results.

Changing the data rates in the DSLAM has a significant effect on the test results. This setting changes the number of peaks and their location in the inter-ACK time distribution. Changing other settings (e.g. interleaving delay, using the fast path instead of the interleaved path) has little effect on the results: these settings only shift the distribution to the left or to the right.

External factors can affect the results, for example the distance between the modem and the DSLAM, the quality of the cable and the weather. These factors can affect the data rate. Since our tests showed that changing the data rates has a significant effect on the result, we expect that the influence of external factors is similar.


5 Conclusions

The research question of this report was whether the approach used and described by [1] for identification can be applied to cable and ADSL. Furthermore, this research focussed on finding the technical characteristics that cause certain patterns in the inter-ACK time distribution to occur, which would prove that it is in fact a suitable and reliable method.

In our cable experiments we were able to establish a consistent pattern in the inter-ACK time distribution. This pattern is caused by the upstream scheduling mechanism of DOCSIS. This contention-based mechanism requires the modem to reserve timeslots before it can send upstream ACKs. A MAP message, sent periodically (the period is denoted as the MAP time), specifies the allocation of the timeslots. A reservation takes at least two MAP times: one for making the reservation and one for confirming that the reservation was successful. If the reservation fails, the transfer is delayed for another MAP time. This causes a peak-shaped distribution of the inter-ACK time, starting at 2 times the MAP time and with a peak at every further multiple of the MAP time.

Most DOCSIS implementations use a static MAP time value [14]. However, [14] also suggests the possibility of dynamic MAP times, which can potentially alter the inter-ACK time distribution. [15] suggests that the frequency at which the MAP time is changed when dynamic MAP times are used is low. Therefore, we think that even if dynamic MAP times are used, identification is still possible. We were unable to confirm this with experiments, as no DOCSIS network with dynamic MAP times was available to us for testing. Overall, it can be concluded that the inter-ACK time distribution is a suitable method for identification of a DOCSIS connection.

Based on the fact that ADSL is full-duplex and offers a dedicated service which requires no scheduling, we expected that the inter-ACK time distribution would be a single peak, similar to Ethernet. However, our test results show that this is not the case. In fact, we encountered several inconsistent peak-shaped patterns. In most cases, these peaks are approximately equal distances apart. However, we cannot propose a method of identification due to the lack of a theoretical explanation.

Altering the data rates in the DSLAM seems to have a significant effect on the inter-ACK time distribution. Since the data rates can also be altered by external factors, it is not unlikely that this causes the inconsistencies in our test results. Changing other settings in the DSLAM had little effect on the results. Considering all the above, we conclude that we cannot use the inter-ACK time for the identification of ADSL.

6 Future work

In this research the effect of cross traffic, i.e. traffic on the access network generated by other users, was not investigated. Most experiments were conducted on public ISP networks, meaning that these tests were likely to be subject to some amount of cross traffic. It would be useful to know how the inter-ACK time distribution is affected by various levels of cross traffic, in order to obtain more insight into the reliability of the identification method. For the same reason, it might also be interesting to analyse the effect on the test results when the traffic from/to the analysed host is high. Is it still possible to identify a cable connection if the user is uploading with BitTorrent? This could be investigated using simulation.

We were unable to find a theoretical explanation for the inter-ACK time distribution observed on ADSL. Moreover, the test results on ADSL networks were inconsistent. Further research is necessary in order to obtain more insight into this matter.

7 References

1: R. Barbosa, On Access Network identification and characterization, Design And Analysis Of Communication Systems, University of Twente, 2009
2: Wei Wei, Bing Wang, Chun Zhang, Don Towsley, Jim Kurose, Classification of Access Network Types, 2008
3: James F. Kurose, Keith W. Ross, Computer Networking: A Top-Down Approach Featuring the Internet, Pearson, 2004
4: RFC793: TCP Protocol Specification, 1981, Information Sciences Institute, University of Southern California, http://tools.ietf.org/html/rfc793
5: R. Braden, RFC1122: Requirements for Internet Hosts, Internet Engineering Task Force, 1989, http://tools.ietf.org/html/rfc1122
6: Andrew S. Tanenbaum, Computer Networks, Pearson Education, 2003
7: Richard Murphy, A Simulation Study of DOCSIS Upstream Channel Bandwidth, Department of Computer Science, University of Texas at San Antonio, 2004
8: Charles K. Summers, ADSL Standards, Implementation, and Architecture, CRC Press, 1999
9: John A.C. Bingham, ADSL, VDSL, and Multicarrier Modulation, Wiley-Interscience, 2000
10: Unknown, ADSL Attenuation, Wikipedia, 2011, http://en.wikipedia.org/wiki/File:ADSL_Line_Rate_Attenuation.gif
11: Jim Martin, Mike Westall, Validating an ‘ns’ Simulation Model of the DOCSIS Protocol, Department of Computer Science, Clemson University
12: Unknown, Understanding Data Throughput in a DOCSIS World, Cisco, 2008, http://www.ciscosystems.com/en/US/tech/tk86/tk168/technologies_tech_note09186a0080094545.shtml
13: John J. Downey, Understanding DOCSIS Data Throughput and How to Increase it, 2008
14: Jim Martin and James Westall, A Simulation Model of the DOCSIS Protocol, School of Computing, Clemson University
15: Unknown, Cable Map Advance (Dynamic or Static?), Cisco, 2008, http://www.cisco.com/en/US/tech/tk86/tk89/technologies_tech_note09186a00800b48ba.shtml
16: Jesper Dangaard Brouer, Optimization of TCP/IP Traffic Across Shared ADSL, Department of Computer Science, University of Copenhagen
