www.caida.org
bdrmap: Inference of Borders Between IP Networks
Matthew Luckie, Amogh Dhamdhere, Bradley Huffaker, David Clark, kc claffy
IMC 2016, November 15th 2016
Who operates a router we observe in a traceroute path?
The Problem
1. The Internet architecture has no notion of interdomain boundaries at the network layer
2. Traceroute is a 30-year old hack with limitations
3. Using longest-matching prefix to infer ownership of routers is known to be error prone
4. Traceroute samples topology close to Vantage Point (VP), reducing topological constraints for inferring ownership for distant routers
5. Concerns about revealing topology information can align operator incentives away from transparency
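To make point 3 concrete, here is a minimal sketch of a longest-matching-prefix lookup from IP address to origin AS. The prefix table, the AS numbers, and the `origin_as` helper are illustrative assumptions, not data or code from the talk.

```python
import ipaddress

# Hypothetical BGP table (documentation prefixes, private-use AS numbers).
BGP_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,       # stands in for AS A
    ipaddress.ip_network("198.51.100.0/24"): 64501,    # stands in for AS B
    ipaddress.ip_network("198.51.100.128/25"): 64502,  # more specific, AS C
}

def origin_as(addr):
    """Return the origin AS of the longest prefix covering addr, or None if unrouted."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in BGP_TABLE if ip in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return BGP_TABLE[best]

# An interface on AS B's border router that is numbered from AS A's
# interconnect subnet still maps to AS A here.
print(origin_as("192.0.2.2"))
```

The lookup names the network that announces the address, which is not necessarily the network that operates the router, and that gap is why this approach is error prone at borders.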
Assumption
[Figure: example topology of routers R1 to R6 across AS A, AS B, AS C, AS D, and AS E, with interface addresses a1 to a3, b1 to b6, and c1 to c3.]
We assume that routers generally respond with an on-path interface facing the VP, and links between routers are point-to-point (IPv4 /30 or /31)
IP path: a1 a3 b2
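The point-to-point assumption means the far end of a link can be guessed from the near end. The sketch below lists the candidate mate addresses under /31 and /30 numbering; the `other_end` helper is a hypothetical name used only for illustration.

```python
import ipaddress

def other_end(addr):
    """Candidate addresses for the far side of a point-to-point link containing addr."""
    ip = ipaddress.ip_address(addr)
    # /31: the two addresses on the link differ only in the last bit.
    candidates = [ipaddress.ip_address(int(ip) ^ 1)]
    # /30: only the two middle addresses of the 4-address block are usable hosts.
    net30 = ipaddress.ip_network(f"{ip}/30", strict=False)
    hosts = list(net30.hosts())
    if ip in hosts:
        mate = hosts[0] if ip == hosts[1] else hosts[1]
        if mate not in candidates:
            candidates.append(mate)
    return [str(c) for c in candidates]

print(other_end("192.0.2.5"))
```

An address that is the network or broadcast address of its /30 can only be a /31 endpoint, so the function returns a single candidate in that case.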
Challenges
[Figure: the example topology from the Assumption slide.]
Neighbor router owned by AS B may respond with an IP address from AS A which the router uses to form the point-to-point link.
Industry convention is for the provider to assign the interconnect IP address, but there is no convention for peering
IP path: a1 a3 b2
Router path: R1 R2 R3
Challenges
Neighbor router owned by AS B may respond with a third party IP address from AS C
IP path: a1 c1 b2
Router path: R1 R2 R3
Challenges
Border router operated by AS D may respond with an address from AS B used to form point-to-point link, but block probes from entering AS D
IP path: a1 a3 b2 b5 ??
Router path: R1 R2 R3 R5
Challenges
Border router owned by AS B may use virtual routing features; the router will respond with different IP addresses that form the point-to-point link with AS D and AS E
If b4 and b6 are not resolved for aliases, and E’s router is silent, b6 might be incorrectly inferred to be neighbor E’s router
IP paths: a1 a3 b4 b5; a1 a3 b6 ??
Router paths: R1 R2 R3 R5; R1 R2 R3
Challenges
R2 may load balance traceroute probes across topologically diverse paths, resulting in false links.
[Figure: topology of routers R2 to R9 in AS B and AS C with interfaces a3, b1 to b6, c1, and c2, annotated with two IP paths that start at a3 and the router paths they traverse.]
Challenges
R2 may choose a different next hop as a traceroute measurement proceeds
[Figure: the same topology of routers R2 to R9 in AS B and AS C, annotated with an IP path that starts at a3 and the router path it traverses.]
Additional Challenges
1. Sibling ASes may confuse attempts to infer connectivity between organizations
- sibling information has known false and missing inferences
2. IXP addresses may appear inconsistently in paths
- an IXP and/or member(s) may originate prefix into BGP, or it might not be originated at all
3. Multiple ASes may originate a prefix into BGP
- The more ASes, the more challenging to infer ownership
Motivation for Border Router Ownership Inference
• Network Modeling and Resilience
- Early Internet models considered topology at AS-level, with a single link between pairs of ASes
- Our work enables the construction of a router-level interdomain connectivity map
• Interdomain Congestion
- Public policy community has growing interest in identifying persistent congestion on interdomain links
- Greatest measurement challenge is identifying interdomain links to probe, and associating observed evidence of congestion to specific interdomain links
Related Work
• Significant work on inferring router aliases: e.g., Mercator, Ally, Pre-specified timestamps, mrinfo, Discarte, Radargun, MIDAR, APAR + kapar
• Significant work inferring AS-level connectivity
- AS traceroute (SIGCOMM 2002 and SIGMETRICS 2003): adjust IP-AS mappings with colocated traceroute and BGP
- Where the Sidewalk ends (CoNEXT 2009): goal of accurately inferring AS-level connectivity
- Topology dualism (PAM 2010): evaluation of heuristics
Our Contributions
1. Scalable, accurate method for inferring interdomain boundaries for the network hosting the VP
2. Efficient system to allow deployment on resource-limited devices (e.g. SamKnows)
3. Validation using ground truth from four network operators and IXP databases
4. Analysis of interdomain connectivity of a large access ISP (Comcast): 45 links w/ Level3, Jan 2016
5. We release our data collection and analysis system as part of scamper, with man pages
https://www.caida.org/tools/measurement/scamper/
Select Interconnections from Top 3 Content to Top 6 Access
Roadmap
• bdrmap
- Input data
- Data collection
- Analysis: overview of heuristics
• Validation, coverage of BGP-observed links
• Systems challenges and solutions
• Interconnection Insights
Approach to Border Mapping (1)
[Figure: bdrmap pipeline. The inputs (AS relationships, RIR delegations, AS prefixes, VP ASes, IXP prefixes) feed bdrmap and scamper in §5.3 Data Collection, which yields a router-level topology; in §5.4 bdrmap infers interdomain links and border routers.]
Assemble Input Data
- RIR delegations: RIR statistics files
- AS relationships and prefixes: BGP data and as-rank algorithm
- IXP prefixes: PeeringDB and PCH
- VP ASes: manual oversight
Approach to Border Mapping (2)
Collect Raw Data to build router-level map
bdrmap is a driver, included as part of scamper
[Figure: the bdrmap pipeline, as on the Approach to Border Mapping (1) slide.]
Data Collection
• Parts of our data collection process are similar to Rocketfuel
- targeted traceroutes, informed by public BGP data
- alias resolution to infer router-level topology
• Rocketfuel maps topologies of networks from the outside
• bdrmap maps interdomain topologies from the inside
• bdrmap data collection time depends on diameter and complexity of hosting network
- typically 12-48 hours at 100pps
Data Collection
• Generate list of address blocks from BGP data
• Gather traceroutes
- we focus on first-hop interdomain links, so use a stop-set (DoubleTree) to halt traceroutes from probing beyond hops in a neighbor network we have seen before
- bdrmap tries up to five different addresses per block, to avoid interpreting third-party addresses as neighbors
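The stop-set idea can be sketched with a simulated path in place of real probing. This is a minimal illustration under stated assumptions: `probe_with_stop_set`, its arguments, and the address labels are hypothetical, and the sketch does not model the retry across up to five addresses per block.

```python
def probe_with_stop_set(path, vp_addrs, stop_set):
    """Walk a simulated path hop by hop, halting at a previously seen neighbor hop.

    path: ordered hop addresses a traceroute would reveal (stand-in for probing).
    vp_addrs: addresses known to be inside the network hosting the VP.
    stop_set: neighbor-network addresses seen by earlier traceroutes; updated in place.
    """
    observed = []
    for hop in path:
        observed.append(hop)
        if hop in stop_set:
            break              # this neighbor hop was explored before; stop probing
        if hop not in vp_addrs:
            stop_set.add(hop)  # first sighting of a hop outside the VP network
    return observed

stop_set = set()
vp_addrs = {"x1", "x2"}
first = probe_with_stop_set(["x1", "x2", "a1", "a2"], vp_addrs, stop_set)
second = probe_with_stop_set(["x1", "x2", "a1", "a9"], vp_addrs, stop_set)
print(first, second)
```

The second trace ends at a1 because that hop is already in the stop-set, which saves the probes that would otherwise be spent deeper inside the neighbor network.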
Data Collection
• Perform Alias Resolution
- We use the Ally, Mercator, and Prefixscan alias resolution techniques as we collect traceroutes, to gather the raw data needed to build a router-level graph
- Where we use IP-ID based techniques, we apply MIDAR’s Monotonic Bounds Test, as well as repeated tests, to reduce the chance we infer false aliases
• Build Router Level Graph
- Focus on interfaces observed in ICMP TTL-expired messages; the source address on an ICMP echo response could be on any of the interfaces on the router
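The intuition behind IP-ID based alias tests can be shown with a heavily simplified sketch: if two addresses draw from one shared 16-bit counter, their interleaved samples should form a single sequence that keeps increasing (modulo wraparound). The `shares_counter` helper and its `max_step` threshold are assumptions for illustration; the actual Monotonic Bounds Test in MIDAR is more elaborate.

```python
def shares_counter(samples, max_step=1000):
    """Check whether interleaved IP-ID samples are consistent with one shared counter.

    samples: (time, ipid) tuples collected from both candidate addresses.
    max_step: assumed upper bound on counter growth between consecutive samples.
    """
    ordered = sorted(samples)
    for (_, prev), (_, cur) in zip(ordered, ordered[1:]):
        step = (cur - prev) % 65536  # 16-bit IP-ID field wraps around
        if step == 0 or step > max_step:
            return False
    return True

addr_a = [(0, 65530), (2, 65534), (4, 3)]
addr_b = [(1, 65532), (3, 1), (5, 7)]
addr_c = [(1, 20000), (3, 20005), (5, 20011)]
print(shares_counter(addr_a + addr_b), shares_counter(addr_a + addr_c))
```

A single passing run is weak evidence, which is one reason to repeat tests before accepting an alias.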
Approach to Border Mapping (3)
Infer Border Routers
bdrmap analyses raw topology data to infer border routers
[Figure: the bdrmap pipeline, as on the Approach to Border Mapping (1) slide.]
Heuristic Overview
[Figure: example router-level topology as seen from the VP, with interfaces labeled x, a, b, c, or ? and regions numbered 1 to 8.]
Set of heuristics that reverse-engineer operator and router behaviors
Applied in the order presented, based on constraints available
Heuristic Overview
Infer if the router is operated by the network hosting the VP.
Other routers must be operated by a neighbor AS
Heuristic Overview
Sometimes we do not observe other topology after a neighbor router.
We can only reason about ownership using the destination ASes probed where we observed the router
50-60% of the inferred interdomain links of the networks we measured, mostly customers
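When nothing is visible beyond a neighbor router, the only evidence is which destinations were being probed when the router appeared. The sketch below assumes the simplest possible rule, namely that the router is attributed to an AS only when every such traceroute targeted that one AS; `owner_from_destinations` is a hypothetical helper and the real heuristic may weigh the evidence differently.

```python
def owner_from_destinations(dest_ases):
    """Attribute a last-hop router using only the destination ASes probed through it.

    dest_ases: destination AS of each traceroute in which the router was the
    last hop observed. Returns the AS if all traceroutes agree, else None.
    """
    ases = set(dest_ases)
    if len(ases) == 1:
        return next(iter(ases))
    return None  # no evidence, or conflicting evidence: leave undecided

print(owner_from_destinations([64501, 64501, 64501]))
```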
Heuristic Overview
Some routers only respond with an IP address not routed in BGP.
We can only reason about ownership using subsequent IPs in the path that are routed, or the destination ASes probed for paths where we observed the router
1-5% of the interdomain links of the networks we measured
Heuristic Overview
Reason about ownership using IP-AS mappings where the same AS appears on two consecutive routers
We are unlikely to observe two consecutive third-party IP addresses in a path
30-40% of peer-to-peer interdomain links of the networks we measured
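The two-consecutive-routers reasoning can be sketched as a scan along one path. The function name, the per-hop AS list, and the AS numbers are assumptions for illustration; the sketch ignores alias resolution and works on a single path.

```python
def owner_by_consecutive_as(hop_ases, vp_as):
    """Find the first hop outside the VP network whose AS repeats on the next hop.

    hop_ases: origin AS for each hop in probing order (None if unrouted).
    Returns (hop index, AS), or None when no two consecutive hops agree.
    """
    for i in range(len(hop_ases) - 1):
        asn = hop_ases[i]
        if asn is not None and asn != vp_as and asn == hop_ases[i + 1]:
            return i, asn
    return None

X, A, C = 64500, 64501, 64502
# The lone hop in C is skipped: a single address may be a third-party address.
print(owner_by_consecutive_as([X, C, A, A], X))
```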
Heuristic Overview
Reason about ownership using AS relationships and IP-AS mappings.
We infer owners of routers that responded using a third-party address, as well as known peers and customers
For inferred interdomain links: 2-3% third-party, 20-30% known relationship
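A minimal sketch of the third-party case, assuming a hypothetical `providers` mapping built from inferred AS relationships: when a router on the path toward AS B answers with an address that maps to a provider of B, the sketch attributes the router to B rather than to the provider. This is a simplification for illustration, not bdrmap's implementation.

```python
def infer_third_party(addr_as, dest_as, vp_as, providers):
    """Attribute a router that may have answered with a third-party address.

    addr_as: AS that the responding address maps to.
    dest_as: AS of the destination being probed.
    providers: dict mapping an AS to the set of its provider ASes (assumed input).
    """
    is_provider_of_dest = addr_as in providers.get(dest_as, set())
    if addr_as != vp_as and addr_as != dest_as and is_provider_of_dest:
        return dest_as  # address likely supplied by the destination's provider
    return addr_as      # otherwise fall back to the plain IP-AS mapping

providers = {64502: {64501}}  # AS 64501 is a provider of AS 64502
print(infer_third_party(64501, 64502, 64500, providers))
```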
Heuristic Overview
Reason about ownership using IP-AS mappings.
We have exhausted better constraints
2-3% of inferred interdomain links
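Once better constraints are exhausted, a fallback is to count which AS most of the relevant interfaces map to. The sketch below assumes a strict majority rule and a hypothetical `majority_owner` helper; which interfaces are counted is left to the caller.

```python
from collections import Counter

def majority_owner(interface_ases):
    """Return the AS that a strict majority of routed interfaces map to, else None."""
    counts = Counter(a for a in interface_ases if a is not None)
    if not counts:
        return None
    asn, n = counts.most_common(1)[0]
    # Require more than half of the routed interfaces to agree.
    return asn if n * 2 > sum(counts.values()) else None

A, B = 64501, 64502
print(majority_owner([A, B, A]))
```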
Heuristic Overview
Infer additional aliases for routers operated by the network hosting the VP
A single neighbor router is likely connected to a single VP router with a point-to-point link
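The alias heuristic (a single neighbor router is likely connected to a single VP router) can be sketched as a grouping step: VP-side addresses that lead to the same neighbor router become alias candidates. The `candidate_aliases` helper and the link representation are assumptions for illustration.

```python
from collections import defaultdict

def candidate_aliases(links):
    """Group VP-side addresses that are adjacent to the same neighbor router.

    links: iterable of (vp_side_addr, neighbor_router_id) pairs.
    Returns the address sets with more than one member.
    """
    by_neighbor = defaultdict(set)
    for vp_addr, neighbor in links:
        by_neighbor[neighbor].add(vp_addr)
    return [addrs for addrs in by_neighbor.values() if len(addrs) > 1]

links = [("x1", "R3"), ("x2", "R3"), ("x3", "R4")]
print(candidate_aliases(links))
```

Here x1 and x2 both precede neighbor router R3, so they are proposed as interfaces of one VP-network router.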
Heuristic Overview
Infer presence of silent neighbor routers, and owners of routers that responded without ICMP time exceeded messages.
To avoid false link inferences, we only infer presence of a neighbor when we know link exists through BGP
4-5% of inferred interdomain links
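The BGP guard on silent neighbors can be sketched as a single check. The function and argument names are hypothetical, and the sketch assumes the set of BGP-observed neighbors of the VP network is available as input.

```python
def infer_silent_neighbor(last_hop_as, dest_as, vp_as, bgp_neighbors):
    """Infer a silent neighbor router only when BGP confirms the adjacency.

    last_hop_as: AS of the last responding hop on the path toward dest_as.
    bgp_neighbors: ASes seen adjacent to the VP network in BGP (assumed input).
    Returns dest_as when a silent router in that AS is inferred, else None.
    """
    if last_hop_as == vp_as and dest_as in bgp_neighbors:
        return dest_as
    return None

print(infer_silent_neighbor(64500, 64501, 64500, {64501}))
```

Without the BGP check, every path that simply stops responding inside the VP network would produce a link inference.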
Coverage + Validation
• Overall, 92-97% coverage of VP-network links in BGP
• We accurately find additional VP-network links not in BGP
• Validation: contacted 10 networks, received validation for 4
- R&E network: 131 of 136 links correct (96.3%)
- Large access network: 97.0% - 98.9% correct, depending on VP
- Tier-1 network: 2584 of 2650 links correct (97.5%)
- Small access network: 283 of 293 links correct (96.6%)
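The quoted percentages follow directly from the link counts on this slide. The short check below recomputes them; the dictionary layout and the `pct` helper are only for illustration.

```python
# (correct links, total links) as reported on the slide.
RESULTS = {
    "R&E network": (131, 136),
    "Tier-1 network": (2584, 2650),
    "Small access network": (283, 293),
}

def pct(correct, total):
    """Percentage of correct links, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)

for name, (correct, total) in RESULTS.items():
    print(f"{name}: {pct(correct, total)}%")
```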
Limitations
• bdrmap depends on observing topology at interconnection points to assemble useful constraints
- not always possible as traceroute may observe other paths
• Still restricted by limitations of what is possible with traceroute
• Alias resolution techniques are not always able to map IP addresses to the same underlying router
Using low-resource VPs
[Figure: a central controller running three bdrmap instances and remoted, connected to scamper on three VPs.]
We extended scamper to be remote controlled. Algorithm state and collected data can be kept off the device.
SamKnows / BISmark: 450MHz MIPS CPU, 64-128MB RAM, 16MB flash storage
Interconnection Insights
• We used 19 geographically distributed VPs inside Comcast to map router-level interdomain connectivity of Comcast in January 2016
[Figure: cumulative number of links (y axis, 5 to 45) against VP (x axis, 0 to 20), with curves for Google, Netflix, Akamai, Apple, Level3, Cogent, and Edgecast.]
Max interconnection density: 45 router links with Level3. Required 17 VPs to observe them all
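A cumulative curve of this kind can be computed by taking the running union of the links seen from each VP. The sketch below uses made-up link identifiers and a hypothetical `cumulative_links` helper; the order in which VPs are added is an input choice that the sketch does not optimize.

```python
def cumulative_links(links_by_vp):
    """Running count of distinct links as each VP's observations are added.

    links_by_vp: list of sets, one per VP, of interdomain links seen from that VP.
    """
    seen, totals = set(), []
    for links in links_by_vp:
        seen |= links            # union in this VP's links
        totals.append(len(seen))
    return totals

print(cumulative_links([{"l1", "l2"}, {"l2", "l3"}, {"l3"}]))
```

A curve that keeps rising as VPs are added indicates that no single VP sees all links to that neighbor.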
[Figure: map of VP Location against Interconnection Location, spanning US West Coast to US East Coast. City labels: Beaverton, OR; Seattle, WA; Monterey, CA; San Jose, CA; Salt Lake City, UT; Albuquerque, NM; Boulder, CO; Houston, TX; Minneapolis, MN; Chicago, IL; Nashville, TN; Detroit, MI; Atlanta, GA; Pompano, FL; Pittsburgh, PA; Washington, DC; Rockville, MD; Hillsborough, NJ; Concord, MA. State labels: UT, CO, TX, MA, IL, TN, MN, NY, NJ, VA, MD, MI, GA, PA, FL, WA, OR, CA.]
Position of VPs
[Figure: the VP Location against Interconnection Location map, showing Akamai.]
Akamai exits visible everywhere
[Figure: the VP Location against Interconnection Location map, with regions marked "West coast visible" and "East coast visible".]
[Figure: the VP Location against Interconnection Location map, showing Level3.]
Visible exits geographically scoped
Summary
• We used active measurement techniques to build a router-level map focused on router ownership inference for interdomain links for a network hosting a VP
• We developed and validated heuristics to distinguish VP-routers from neighbor routers, and to infer the operator of neighbor routers
• We used our system to investigate modern interconnection arrangements
• We publicly release our source code implementation
https://www.caida.org/tools/measurement/scamper/
BACKUP SLIDES
Interconnection Insights
• We used 19 geographically distributed VPs inside Comcast to map router-level interdomain connectivity of Comcast in January 2016
Fewer than 2% of prefixes left the network via the same border for each VP.
For 73% of prefixes, we observed 5-15 distinct border routers, and 13% of prefixes had more than 15 exits.
[Figure: CDF (0 to 1) of the number of exits (border routers or next hop ASNs), x axis 0 to 20, with one curve for border routers and one for next hop ASNs.]
Infer routers operated by the network hosting the VP
Step 1
1.1 R1 has interface in x, subsequent interface in x, but majority in A and nextas A? If yes, assign A (multihomed); if no, go to 1.2
1.2 subsequent interface in X? If yes, assign X (first)
[Figure: flowchart of step 1 with example router paths from the VP.]
Inferring owner of neighbor routers with firewalls
Step 2
2.1 no subsequent routers observed? If yes, assign A (firewall)
[Figure: flowchart of step 2 with an example path from the VP to R1 (x1) and nextas A.]
Infer operator of neighbor routers that use unrouted IP
Step 3
3.1 unannounced address space followed by addresses only in A? If yes, assign A (unrouted); if no, go to 3.2
3.2 subsequent majority common provider C? If yes, assign C (unrouted); if no, go to 3.3
3.3 no subsequent topology observed? If yes, assign D (unrouted)
[Figure: flowchart of step 3 with example router paths for each rule.]
Use IP-AS mappings to infer operator of neighbor routers
Step 4
4.1 All interfaces in A and at least one subsequent interface in A? If yes, assign A (onenet); if no, go to 4.2
4.2 two subsequent interfaces in A? If yes, assign A (onenet)
[Figure: flowchart of step 4 with example router paths for each rule.]
Use AS relationship inferences to infer operator of neighbor routers
Step 5
5.1 path to B and A is provider of B? If yes, assign B (thirdparty); if no, go to 5.2
5.2 if interface in A on path to B, and A is provider of B? If yes, assign B (thirdparty); if no, go to 5.3
5.3 A is peer or customer of X? If yes, assign A (peer/customer); if no, go to 5.4
5.4 no X-A relationship, B provider of A, and X provider of B? If yes, assign B (missing); if no, go to 5.5
5.5 all interfaces adjacent to R1 in A? If yes, assign A (hiddenpeer)
[Figure: flowchart of step 5 with example router paths and AS relationship diagrams for each rule.]
Use IP-AS mappings to infer operator of neighbor routers in ambiguous scenarios
Step 6
6.1 majority of interfaces in A? If yes, assign A (count)
[Figure: flowchart of step 6 with an example of R1 (x1) adjacent to routers with interfaces a1, b1, and a2.]
Infer additional aliases for border routers
Step 7
7.1 R1 and R2 (owned by X) connect to R3 (owned by A)? If yes, x1 and x2 are aliases (alias)
[Figure: flowchart of step 7 showing R1 (x1) and R2 (x2) merged into one router connected to R3.]
Infer operator of neighbor routers without TTL expired messages
Step 8
8.1 R0 inferred, no response towards A? If yes, assign A (silent); if no, go to 8.2
8.2 R1 does not respond with ICMP TTL expired messages? If yes, assign A (ICMP)
[Figure: flowchart of step 8 with example paths from the VP through R0 (x1) toward A.]
Limitations
[Figure: (a) actual router ownership and (b) inferred router ownership for the path VP, R0, R1, R2 with interfaces x1 to x5 and nextas A; the boundary between AS X and AS A is drawn in each panel.]
If an AS uses provider-aggregatable address space from their provider on interfaces on their internal routers, bdrmap may incorrectly infer the position of the interdomain link.
Limitations
[Figure: (a) actual router ownership and (b) inferred router ownership without alias resolution, for routers R0 to R3 with interfaces x1 to x4 across AS X, AS A, and AS B.]
If router R1 responds with different IP addresses depending on the destination probed, and those addresses are not inferred to be aliases, bdrmap may incorrectly infer the position of an interdomain link.
[Figure: the VP Location against Interconnection Location map, showing Level3, Akamai, and Google.]
Development Approach
• We designed and implemented our algorithm over the course of a year, without validation data.
• We used DNS-naming, where available, to infer if our methods appeared to yield correct inferences
• Border routers with high out-degree usually implied an incorrect inference
• We did not use DNS-naming for validation as we found mislabeled interfaces, as well as names containing organization names, rather than AS numbers