Inter-X: Resilience of the Internet Interconnection Ecosystem
Summary Report
April 2011
About ENISA
The European Network and Information Security Agency (ENISA) is an EU agency created to advance the functioning of the internal market. ENISA is a centre of expertise for the European Member States and European institutions in network and information security, giving advice and recommendations and acting as a switchboard for information on good practices. Moreover, the agency facilitates contacts between European institutions, the Member States, and private business and industry actors.
Internet: http://www.enisa.europa.eu/
Acknowledgments:
While compiling this report, we talked extensively over a period of many months to a large number of technical and managerial staff at communications service providers, vendors, and service users. Many of our sources requested that we not acknowledge their contribution. Nonetheless we thank them all here. ENISA would like to express its gratitude to the stakeholders that provided input to the survey.
Editor: Panagiotis Trimintzios, ENISA
Authors:
Chris Hall, Highwayman Associates
Richard Clayton, Cambridge University
Ross Anderson, Cambridge University
Evangelos Ouzounis, ENISA
Contact
For more information about this study, please contact:
Internet: http://www.enisa.europa.eu/act/res
18 Apr 2011 (b)
Legal notice
Notice must be taken that this publication represents the views and interpretations of the editors and authors, unless stated otherwise. This publication should not be construed to be an action of ENISA or the ENISA bodies unless adopted pursuant to ENISA Regulation (EC) No 460/2004. This publication does not necessarily represent the state of the art in Internet interconnection and it may be updated from time to time.
Third party sources are quoted as appropriate. ENISA is not responsible for the content of the external sources including external websites referenced in this publication.
This publication is intended for educational and information purposes only. Neither ENISA nor any person acting on its behalf is responsible for the use that might be made of the information contained in this publication.
Reproduction is authorised provided the source is acknowledged.
© 2010 European Network and Information Security Agency (ENISA), all rights reserved.
Table of Contents
Executive Summary
Introduction to the Summary Report
1 Summary
1.1 Scale and Complexity
1.2 The Nature of Resilience
1.3 The Lack of Information
1.4 Resilience and Efficiency
1.5 Resilience and Equipment
1.6 Service Level Agreements (SLAs) and Best Efforts
1.7 Reachability, Traffic and Performance
1.8 Is Transit a Viable Business?
1.9 The Rise of the Content Delivery Networks
1.10 The Insecurity of BGP
1.11 Cyber Exercises on Interconnection Resilience
1.12 The Tragedy of the Commons
1.13 Regulation
2 Recommendations
Recommendation 1 Incident Investigation
Recommendation 2 Data Collection of Network Performance Measurements
Recommendation 3 Research into Resilience Metrics and Measurement Frameworks
Recommendation 4 Development and Deployment of Secure Inter-domain Routing
Recommendation 5 Research into AS Incentives that Improve Resilience
Recommendation 6 Promotion and Sharing of Good Practice on Internet Interconnections
Recommendation 7 Independent Testing of Equipment and Protocols
Recommendation 8 Conduct Regular Cyber Exercises on the Interconnection Infrastructure
Recommendation 9 Transit Market Failure
Recommendation 10 Traffic Prioritisation
Recommendation 11 Greater Transparency: Towards a Resilience Certification Scheme
Respondents to the Consultation
Executive Summary
The Internet has so far been extremely resilient. Even major disasters, such as 9/11 and Hurricane Katrina, have had only a local impact. Technical failures have lasted only a few hours, and congestion has had a sustained effect only where the infrastructure is inadequate. The low cost and general reliability of communications over the Internet have led more and more systems to depend on it; we are now at the point where a systemic failure would not just disrupt email and the web, but cause significant problems for other utilities, transport, finance, healthcare and the economy generally. So the continued resilience of the Internet is critical to the functioning of modern societies, and hence it is right and proper to examine whether the mechanisms that have such an excellent track record in providing a resilient Internet are likely to continue to be as effective in the future.
The focus of this report is the Internet interconnection ecosystem. This holds together all the networks that make up the Internet. The ecosystem is complex and has many interdependent layers. This system of connections between networks occupies a space between and beyond those networks and its operation is governed by their collective self-interest: the Internet has no central Network Operation Centre, staffed with technicians who can leap into action when trouble occurs. The open and decentralised organisation that is the very essence of the ecosystem is essential to the success and resilience of the Internet. Yet there are a number of concerns.
First, the Internet is vulnerable to various kinds of common-mode technical failures where systems are disrupted in many places simultaneously; service could be substantially disrupted by failures of other utilities, particularly the electricity supply; a flu pandemic could cause the people on whose work it depends to stay at home, just as demand for home working by others was peaking; and finally, because of its open nature, the Internet is at risk of intentionally disruptive attacks.
Second, there are concerns about sustainability of the current business models. Internet service is cheap, and becoming rapidly cheaper, because the costs of service provision are mostly fixed costs; the marginal costs are low, so competition forces prices ever downwards. Some of the largest operators, the Tier 1 transit providers, are losing substantial amounts of money, and it is not clear how future capital investment will be financed. There is a risk that consolidation might reduce the current twenty-odd providers to a handful, at which point they would start to acquire pricing power and the regulation of transit service provision might become necessary as in other concentrated industries.
Third, dependability and economics interact in potentially pernicious ways. Most of the things that service providers can do to make the Internet more resilient, from having excess capacity to route filtering, benefit other providers much more than the firm that pays for them, leading to a potential tragedy of the commons. Similarly, security mechanisms that would help reduce the likelihood and the impact of malice, error and mischance are not implemented because no-one has found a way to roll them out that gives sufficiently incremental and sufficiently local benefit.
Fourth, there is remarkably little reliable information about the size and shape of the Internet infrastructure or its daily operation. This hinders any attempt to assess its resilience in general and the analysis of the true impact of incidents in particular. The opacity also hinders research and development of improved protocols, systems and practices by making it hard to know what the issues really are and harder yet to test proposed solutions.
So there may be significant troubles ahead which could present a real threat to economic and social welfare and lead to pressure for regulators to act. Yet despite the origin of the Internet in DARPA-funded research, the more recent history of government interaction with the Internet has been unhappy. Various governments have made ham-fisted attempts to impose censorship or surveillance, while others have defended local telecommunications monopolies or have propped up other industries that were disrupted by the Internet. As a result, Internet service providers, whose good will is essential for effective regulation, have little confidence in the likely effectiveness of state action, and many would expect it to make things worse.
Any policy should therefore proceed with caution. At this stage, there are four types of activity that can be useful at the European (and indeed the global) level.
The first is to understand failures better, so that all may learn the lessons. This means consistent, thorough investigation of major outages and the publication of the findings. It also means understanding the nature of success better, by supporting long-term measurement of network performance, and by sustaining research in network performance.
The second is to fund key research in topics such as inter-domain routing, with an emphasis not just on the design of security mechanisms, but also on traffic engineering, traffic redirection and prioritisation, especially during a crisis, and developing an understanding of how solutions are to be deployed in the real world.
The third is to promote good practices. Diverse service provision can be encouraged by explicit terms in public sector contracts, and by auditing practices that draw attention to reliance on systems that lack diversity. There is also a useful role in promoting the independent testing of equipment and protocols.
The fourth is public engagement. Greater transparency may help Internet users to be more discerning customers, creating incentives for improvement, and the public should be engaged in discussions on potentially controversial issues such as traffic prioritisation in an emergency. And finally, Private Public Partnerships (PPPs) of relevant stakeholders, operators, vendors, public actors etc. are important for self-regulation. In this way, even if regulation of the Internet interconnection system is ever needed after many years, policy makers will be able to make informed decisions leading to effective policies.
The objective of these activities should be to ensure that when global problems do arise, the decision and policy makers have a clear understanding of the problems and of the options for action.
There are local regulatory actions that Europe can encourage where needed. Poor telecommunications regulation can lead to the consolidation of local service provision so that cities have fewer independent infrastructures; and in countries that are recipients of EU aid, telecommunications monopolies often deepen the digital divide.
The aim of all these activities should be to ensure that the Internet is ubiquitous and resilient, with service provided by multiple independent competing firms who have the incentives to provide a prudent level of capacity not just for fair weather, but for when the storms arrive.
Introduction to the Summary Report
This study looks at the resilience of the Internet interconnection ecosystem. The Internet is a network of networks, and the interconnection ecosystem is the collection of layered systems that holds it together. The interconnection ecosystem is the core of the Internet, providing the basic function of reaching anywhere from everywhere.
The Executive Summary above provides an abstract of the report's subject and broad recommendations.
The Full Report has four parts:
Part I Summary and Recommendations
This contains a more extended examination of the subject and a discussion of our recommendations in detail, followed by the recommendations themselves. This part of the report is based on the parts which follow.
Part II State of the Art Review
This includes a detailed description of the Internet's routing mechanisms and analysis of their robustness at the technical, economic and policy levels. The material in this part supports the analysis presented in Part I, and sets out to explain how and why the issues and challenges the report identifies come about.
Part III Report on the Consultation
As part of the study a broad range of stakeholders were consulted. This part reports on the consultation and summarises the results.
Part IV Bibliography and Appendices
There is an extensive bibliography and summaries of the financial statements of some of the major transit providers.
This Summary Report is Part I of the Full Report.
Two sections follow:
• Section 1 is a summary of the issues and challenges. It is intended to be read as an introduction to the recommendations, giving the background and the rationale for them. It serves also as an introduction to the rest of the Full Report.
• Section 2 contains our recommendations.
In the following, section number references to Sections 3 onwards refer to Part II of the Full Report. References of the form [C:xx] refer to general points made in the consultation, while those of the form [Q:xx] refer to quotations from the consultation which made a particular, or a particularly apposite, point; those references point to Part III of the Full Report. References of the form [1] refer to the Bibliography, which is in Part IV of the Full Report.
This revised version of the report replaces the version published in December 2010.
1 Summary
The Internet has been pretty reliable so far, having recovered rapidly from most known incidents. The effects of natural disasters such as Hurricane Katrina, terrorist attacks such as 9/11 and assorted technical failures have all been limited in time and space. However, it does appear likely that the Internet could suffer systemic failure, leading perhaps to local failures and system-wide congestion, in some circumstances including:
• A regional failure of the physical infrastructure on which it depends (such as the bulk power transmission system) or the human infrastructure needed to maintain it (for example if pandemic flu causes millions of people to stay at home out of fear of infection).
• Cascading technical failures, of which some of the more likely near-term scenarios relate to the imminent changeover from IPv4 to IPv6; common-mode failures involving updates to popular makes of router (or PC) may also fall under this heading.
• A coordinated attack in which a capable opponent disrupts the BGP fabric by broadcasting thousands of bogus routes, either via a large AS or from a large number of compromised routers.
There is evidence that implementations of the Border Gateway Protocol (BGP) are surprisingly fragile. There is evidence that some concentrations of infrastructure are vulnerable and significant disruption can be caused by localised failure. There is evidence that the health of the interconnection system as a whole is not high among the concerns of the networks that make up that system; by and large each network strives to provide a service which is reliable, most of the time, at minimum achievable cost. The economics do not favour high dependability as there is no incentive for anyone to provide the extra capacity that would be needed to deal with large-scale failures.
To date, we have been far from an equilibrium: the rapid growth in capacity has masked a multitude of sins and errors. However, as the Internet matures, as more and more of the world's optical fibre is lit, and as companies jostle for advantage, the dynamics may change.
There may well not be any immediate cause for concern about the resilience of the Internet interconnection ecosystem, but there is cause for concern about the lack of good information about how it works and how well it might work if something went very badly wrong.
This section proceeds as follows:
• In Section 1.1 the challenges posed by the sheer scale and complexity of the Internet interconnection system are discussed.
• The nature of resilience and the difficulty of assessing it are discussed in Section 1.2.
• Section 1.3 discusses the information that we do not have, and how that limits our ability to address the issue of resilience, among other things.
• Resilience and efficiency are antipathetic, which raises the challenges given in Section 1.4.
• The problems posed by the reliability of equipment, and the possibility for systemic failure, are covered in Section 1.5.
• Section 1.6 examines the value of Service Level Agreements in the context of the interconnection system.
• All parts of the Internet must be able to reach all other parts, so reachability is a key objective. However, being able to reach a destination does not guarantee that traffic will flow to and from there effectively and that expected levels of performance will be met. Section 1.7 discusses the challenges, with particular reference to the behaviour of the system if some event has disabled parts of it.
• Every year the price of transit goes down, and every year people feel it must level off. The reasons to believe that the price will tend to zero, and the challenges that poses, are discussed in Section 1.8.
• The rise of the Content Delivery Networks (CDNs) and the effect on the interconnection system is discussed in Section 1.9.
• Section 1.10 tackles the insecurity of BGP.
• In Section 1.11 the value of disaster recovery exercises (war games) is examined.
• A number of issues are related; tackling them would benefit everybody, but addressing them also costs each network more than they gain individually. This is discussed in Section 1.12.
• The contentious subject of regulation is raised in Section 1.13.
1.1 Scale and Complexity
The Internet is very big and very complicated [C:1].
The interconnection system we call the Internet comprises some 37,000 Autonomous Systems or ASes (ISPs or similar entities) and 355,000 blocks of addresses (addressable groups of machines), spread around the world as of March 2011 (see Section 3 of the Full Report).
This enormous scale means that it is hard to conceive of an external event which would affect more than a relatively small fraction of the system: as far as the Internet is concerned, a large earthquake or major hurricane is, essentially, a little local difficulty. However, the failure of a small fraction of the system may still have a significant impact on a great many people. When considering the resilience of this system it is necessary to consider not only the global issues, but a large number of separate, but interconnected, local issues.
The complexity of the system is partly related to its sheer scale, and the number of interconnections between ASes. This is compounded by a number of factors.
• Modelling the interconnection system is hard because we only have partial views of it and because it has a number of layers, each with its own properties and interacting with other layers. For example, the connections between ASes use many different physical networks, often provided by third parties, which are themselves large and complicated. Resilience depends on the diversity of interconnections, which in turn depends on physical diversity, which can be an illusion and is often unknown [C:7].
  While it is possible to discover part of the AS-level topology of the Internet (which ASes are interconnected), from a resilience perspective it would be more valuable to know the router-level topology (the number, location, capacity, traffic levels etc. of the actual connections between ASes) [C:2]. If we want to estimate how traffic might move around when connections fail, we also need to know about the routing layer (what routes the routers have learned from each other) so we can estimate what routes would be lost when given connections failed, and what routes would be used instead [C:3]. That also touches on routing policy (the way each AS decides which routes it will prefer) and the traffic layer (where end-user traffic is going to and from). This is perhaps the most important layer, but very little is known about it on a global scale.
• The interconnection system depends on other complex and interdependent systems. The routers, the links between them, the sites they are housed in, and all the other infrastructure that the interconnection system depends on, themselves depend on other systems, notably electricity supply, and those systems depend in turn on the Internet. [C:8], [Q:3] and [Q:17].
• The interconnection ecosystem is self-organising and highly decentralised. The decision whether to interconnect is made independently by the ASes, driven by their need to be able to reach, and be reachable from, the entire Internet. The same holds at lower levels: the administrators of an AS configure their routers to implement their routing policy, then the routers select and use routes. But different routers in the same AS may select different routes for a given destination, so even the administrators may not know, a priori, what path traffic will take.
• The interconnection ecosystem is dynamic and constantly changing. Its shape changes all the time, as new connections are made, or existing connections fail or are removed. At the corporate level, transit providers come and go, organisations merge, and so on. At the industry level, the recent rise of the content delivery networks (CDNs) changed the pattern of interconnections.
• The patterns of use are also constantly evolving. The rise of the CDNs also changed the distribution of traffic; and while peer-to-peer (P2P) traffic became a large proportion of total traffic in the early to mid 2000s, now video traffic of various kinds is coming to dominate both in terms of volume and in terms of growth.
• The Internet is continuing to grow. In fact, just about everything about it continues to grow: the number of ASes, the number of routes, the number of interconnections, the volume of traffic, etc.
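The modelling point in the list above asks what routes would be lost when given connections fail. As a rough illustration only, the sketch below treats a handful of hypothetical ASes (AS1 to AS4 are made-up names, not data from this study) as an undirected graph and reports which destinations become unreachable from one AS when a single link is removed. It assumes plain connectivity; real route selection also depends on each AS's routing policy, which this toy ignores.

```python
from collections import deque


def reachable_from(links, start):
    """Return the set of ASes reachable from `start` over undirected links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the AS-level graph
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lost_destinations(links, failed_link, start):
    """Return the ASes that `start` can no longer reach once `failed_link` is gone."""
    before = reachable_from(links, start)
    remaining = [link for link in links if set(link) != set(failed_link)]
    after = reachable_from(remaining, start)
    return before - after


# Hypothetical topology: AS1, AS2 and AS3 form a triangle; AS4 hangs off AS3 alone.
links = [("AS1", "AS2"), ("AS2", "AS3"), ("AS1", "AS3"), ("AS3", "AS4")]

redundant_failure = lost_destinations(links, ("AS1", "AS2"), "AS1")
single_homed_failure = lost_destinations(links, ("AS3", "AS4"), "AS1")
print(redundant_failure)
print(single_homed_failure)
```

In this made-up topology the triangle absorbs the loss of one of its links, while the loss of the only link to AS4 cuts it off. Even this simple calculation needs the full list of links, which is exactly the information the text says is only partially available.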
The scale and complexity of the system make it hard to grasp. Resilience is itself a slippery concept, so the resilience of the interconnection system is non-trivial to define, let alone measure!
This study attempts to provide some insight by describing the workings of the system and what we know about its resilience.
1.2 The Nature of Resilience
There is a vast literature on reliability where engineers study the failure rates of components, the prevalence of bugs in software, and the effects of wear, maintenance etc.; the aim being to design machines or systems with a known rate of failure in predictable operating conditions [1]. Robustness relates to designing systems to withstand overloads, environmental stresses and other insults, for example by specifying equipment to be significantly stronger than is needed for normal operation. In traditional engineering, resilience was the ability of a material to absorb energy under stress and release it later. In modern systems thinking, it means the opposite of brittleness and refers to the ability of a system or organisation to adapt and recover from a serious failure, or more generally to its ability to survive in the face of threats, including the prevention or mitigation of unsafe, hazardous or detrimental conditions that threaten its existence [2]. In the longer term, it can also mean evolvability: the ability of a system to adapt gradually as its environment changes, an idea borrowed from systems biology [3][4].
Resilience of a system is defined as the ability to provide and maintain an acceptable level of service in the face of various faults and challenges to normal operation¹. That is, the ability to adapt and recover from a serious failure or, more generally, to survive in the face of threats. A given event may have some impact on a system and hence some immediate impact on the service it offers. The system will then recover, service levels will improve and at some time full service and the system will be restored.
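The impact-and-recovery sequence described above can be made concrete with a small sketch. It assumes service is sampled as (time, level) pairs with 1.0 meaning full service; the function name, the sample values and the 0.8 "acceptable" threshold are all hypothetical, chosen for illustration rather than taken from this study. The output describes one event and is not offered as a resilience metric.

```python
def recovery_profile(samples, acceptable):
    """Summarise one incident from time-ordered (time, service_level) samples."""
    worst_level = min(level for _, level in samples)
    degraded_times = [t for t, level in samples if level < acceptable]
    if not degraded_times:
        # Service never dropped below the acceptable level.
        return {"worst_level": worst_level,
                "degraded_from": None,
                "acceptable_again_at": None}
    last_degraded = degraded_times[-1]
    # First sample after the last degraded one, if the data extends that far.
    later = [t for t, _ in samples if t > last_degraded]
    return {
        "worst_level": worst_level,
        "degraded_from": degraded_times[0],
        "acceptable_again_at": later[0] if later else None,
    }


# Hypothetical hourly samples: an event at hour 1, gradual recovery afterwards.
samples = [(0, 1.0), (1, 0.4), (2, 0.6), (3, 0.9), (4, 1.0)]
profile = recovery_profile(samples, acceptable=0.8)
print(profile)
```

What counts as acceptable is an input here, which mirrors the difficulty raised in the text: the threshold may vary with the nature and scale of the event.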
Resilience therefore ranges from failure recovery at the micro level, as when the Internet recovers from the failure of a router so quickly that users perceive a connection failure of perhaps a few seconds (if they notice anything at all); through coping with a mid-size incident, as when ISPs provided extra routes in the hours immediately after the 9/11 terrorist attacks by running fibres across collocation centres; to disaster recovery at the strategic level, where we might plan for the next San Francisco earthquake or for a malware compromise of thousands of routers. In each case the desired outcome is that the system should continue to provide service in the event of some part of it failing, with service degrading gracefully if the failure is large.
There are thus two edge cases of resilience:
1. the ability of the system to cope with small local events such as equipment failures and reconfigure itself essentially automatically and over a time scale of seconds to minutes. This enables the Internet to cope with day-to-day events with little or no effect on service: it is reliable. This is what most network engineers think of as "resilience".
2. the ability of a system to cope with and recover from a major event, such as a large natural disaster or a capable attack, on a time scale of hours to days or even longer. This type of resilience includes, first, the ability of the system to continue to offer some service in the immediate aftermath, and second, the ability to repair and rebuild thereafter. The key words here are "adapt" and "recover". This disaster recovery is what civil authorities tend to think of as "resilience".
This study is interested in the resilience of the ecosystem in the face of events which have medium to high impact and which have a correspondingly medium to low probability. It is thus biased toward the second of these cases.
Robustness is an important aspect of resilience. A robust system will have the ability to resist assaults and insults, so that whatever some event is throwing at it, it will be unaffected, and no resilient response is required. While resilience is to do with coping with the impact of events, robustness is to do with reducing the impact in the first place. The two overlap, and from the user's perspective these are fine distinctions; what the user wants is for the system to be predictably dependable.
¹ Following: James P.G. Sterbenz, David Hutchison, Egemen K. Çetinkaya, Abdul Jabbar, Justin P. Rohrer, Marcus Schöller and Paul Smith: "Resilience and survivability in communication networks: Strategies, principles, and survey of disciplines", Computer Networks, Volume 54, Issue 8, 1 June 2010, Pages 1245-1265, Resilient and Survivable networks.
Resilience is context-specific. Robustness can be sensibly defined only in respect of specified attacks or failures, and in the same way resilience also makes sense only in the context of recovery from specified events, or in the face of a set of possible challenges of known probability. We call bad events of known probability "risk", but there is a separate problem of "uncertainty" where we do not know enough about possible future bad events to assign them a probability at all. In the face of uncertainty, it is difficult to assess a combination of intermediate levels of service and recovery/restoration times, especially when what is acceptable may vary depending on the nature and scale of the event. [C:5]
Moreover, no good metrics are available to actually assess the performance of the Internet or its interconnection system. This makes it harder still to specify acceptable levels of service. For the Internet the problem is compounded by its scale and complexity (see above) and by lack of information (see below), which make it hard to construct a model which might be used to attach numbers to resilience. It is even hard to assess what impact a given single event might have: an earthquake in San Francisco of a given severity may have a predictable impact on the physical infrastructure, but that needs to be translated into its effect on each network, and hence the effect on the interconnection system.
Given these difficulties (and there are many more), service providers commonly fall back on measures that improve resilience in general terms, in the hope that this will improve their response to future challenges. This qualitative approach runs into difficulty when the cost of an improvement must be justified on much more restricted criteria. For the Internet as a whole, the cost justification of investment in resilience is an even harder case to make.
1.3 The Lack of Information
Each of the ASes that make up the Internet has a Network Operation Centre (NOC), charged with monitoring the health of the AS's network and instigating action when problems occur. There is no NOC for the Internet.
In fact it is worse than that. ASes understand their own networks but know little about anyone else's. At every level of the interconnection system, there is little global information available, and what is available is incomplete and of unknown accuracy. In particular:
• there is no map of physical connections (their location, capacity, etc.);
• there is no map of traffic and traffic volume;
• there is no map of the interconnections between ASes (what routes they offer each other).
The Internet interconnection system is, essentially, opaque. This opacity hampers the research and development communities in their attempts to understand the workings of the Internet, and to develop and test improvements; it makes the study and modelling of complex emergent properties such as resilience harder still. [C:2], [Q:1] and [Q:2].
The lack of information has a number of causes:
• Complexity and scale. To map the networks of fibre around the world might be a tractable problem. Over those physical fibres run many different logical connections, each of which will carry network traffic for numerous providers, which in turn support yet more providers' networks and circuits, rapidly multiplying up the combinations and permutations of overlapping use of the underlying fibre. Furthermore, not all those things are fixed: providers reroute existing networks and circuits as they extend or adapt their networks. To keep track, meticulous record keeping is required, but even within a single AS it is not always achieved. At a global level, measuring traffic volumes would be an immense undertaking, given the sheer number of connections between networks.
• The information hiding properties of the routing system. When trying to map connections by probing the system from the outside, each probe will reveal something about the path between two points in the Internet at the time of the probe. But the probe reveals little about what other paths may exist at other times, or what path might be taken if any part of the usual path is not working, or what the performance of those other paths might be.
• Security concerns. Mapping the physical layer is thought to be an invitation to people with bad intentions to improve their target selection, so those maps that do exist are seldom shared.
• The cost of storing and processing the data. If there were complete information, there would be a very great deal of it, and more would be generated every minute. Storing it and processing it into a usable form would be a major engineering task.
• Commercial sensitivity. Information about whether, how and where networks connect to each other is deemed commercially sensitive by some. Information about traffic volumes is quite generally seen as commercially sensitive. Because of this, some advocate powerful incentives to disclose information, possibly in anonymised and aggregated form. [C:23]
• Critical information is not collected in the first place, or not kept up to date. Information gathering and maintenance costs money, so there must be some real use for it before a network will bother to gather it or strive to keep it up to date. The Internet Routing Registries (IRRs) are potentially excellent resources, but are not necessarily up to date, complete or accurate, because the information seldom has operational significance (and may in any case be deemed commercially sensitive).
• Lack of good metrics. While there are some well-known metrics for the performance of connections between two points in a network, there are none for a network as a whole or, indeed, a network of networks. ENISA has already started working in this direction, looking at resilience metrics from a holistic point of view (see http://www.enisa.europa.eu/act/res/other-areas/metrics).
The poor state of information reflects not only the difficulty of finding or collecting data, but also the lack of good ways to process and use it even if one had it.
1.3.1 Incidents as a Source of Information
Small incidents occur every day, and larger ones every now and then. Given the lack of information about the interconnection system, the results of these natural experiments tell us much of what we know about its resilience [C:4]. For example, we know the following.
• It is straightforward to divert traffic away from its proper destination by announcing invalid routes. The well-known incident in February 2008 in which YouTube stopped working for a few hours is one example; see Section 5.6.4. More publicity, and political concern, was raised by a 2010 incident in which China Telecom advertised a number of invalid routes, effectively hijacking 15% of Internet addresses for 18 minutes; see Section 5.6.9.
• Latent bugs in BGP implementations can disrupt the system. Most recently, in August 2010, an experiment that sent an unusual (but entirely legal) form of route announcement triggered a bug in some routers, causing their neighbours to terminate BGP sessions and many routes to be lost. The effects of this incident lasted less than two hours; see Section 5.6.5.
• In some parts of the world a small number of cable systems are critical. Undersea cables near Alexandria in Egypt were cut in December 2008. Interestingly, three cable systems were affected at the same time, and two of those systems had been affected similarly in January/February of that year. This seriously affected traffic for perhaps two weeks. See Section 5.6.6.
• The system is critically dependent on electrical power. A large power outage in Brazil in November 2009 caused significant disruption, though it lasted only four and a half hours; see Section 5.6.6. Interestingly, previous blackouts in Brazil had been attributed to hackers, suggesting that these incidents are examples of the risk of interdependent networks. This particular conspiracy theory has been refuted.
• The ecosystem can work well in a crisis. The analysis of the effect of the destruction at the World Trade Centre in New York on 11th September 2001 shows that the system worked well at the time, and in the days thereafter, even though large cables under the buildings were cut and other facilities were destroyed or damaged. Generally, Internet services performed better than the telephone system (fixed and mobile). See Section 5.6.10.
These sorts of incident are well known. However, hard information about the exact causes and effects is hard to come by; much is anecdotal and incomplete, while some is speculative or simply apocryphal. Valuable information is being lost. The report The Internet under Crisis Conditions: Learning from September 11 [5] is a model of clarity; but even there the authors warn:
... While the committee is confident in its assessment that the events of September 11 had little effect
on the Internet as a whole ..., the precision with which analysts can measure the impact of such events
is limited by a lack of relevant data.
1.4 Resilience and Efficiency
There are fundamental tensions between resilience and efficiency. [Q:5] Resilience requires spare capacity and duplication of resources, and systems which are loosely coupled (made up of largely independent subsystems) are more resilient than tightly coupled systems whose components depend more on each other. But improving the efficiency of a system generally means eliminating excess capacity and redundant resources.
A more diverse system is generally a more resilient one, but diversity adds to cost and complexity. Diversity of connections is most efficiently achieved using infrastructure whose cost is shared by many operators, but collective-action problems can undermine the resilience gain [C:7][Q:9]. It is efficient to avoid duplication of effort in the development of software and equipment, and efficient to exploit economies of scale in its manufacture, but this reduces the diversity of equipment used [C:9]. It is efficient for the entire Internet to depend on one protocol for its routing, but this creates a single point of failure. Setting up and maintaining multiple, diverse, separate connections to other networks costs time and effort and creates extra complexity to be managed [C:6].
The Internet is a loosely coupled collection of independently managed networks. However, at its core there are a few very large networks, each of which strives to be as efficient as possible both internally and in its connections to other networks. So it is an open question whether the actual structure of the Internet is as resilient as its architecture would suggest. In the past it has been remarkably resilient, and it has continued to perform as it has evolved from a tiny network connecting a handful of research facilities into the global infrastructure that connects billions today. However, as in other areas, past performance is no guarantee of future results.
1.5 Resilience and Equipment
A particular concern for the interconnection system is the possibility of an internal technical problem that could have a systemic effect. The imminent changeover to IPv6 will provide a high-stress environment in which such a problem could be more likely to manifest itself, and the most likely proximate cause of such a problem is bugs in BGP implementations, which could be serious given the small number of vendors of this kind of equipment. [C:9] There have been a number of incidents in which large numbers of routers across the entire Internet have been affected by the same problem: something unprecedented and unexpected which exposes a bug in the software, and occasionally in the specification of BGP.
No software is free from bugs, but the universal dependence on BGP makes bugs there more serious. ISPs may test equipment before buying and deploying it, but those tests concentrate on issues directly affecting the ISP, such as the performance of the equipment and its ability to support the required services. Manufacturers test their equipment as part of their development process. But the interests of both ISPs and manufacturers are for the equipment to work well under normal circumstances. Individual ISPs cannot afford to do exhaustive testing of low-probability scenarios for the benefit of the Internet at large. The manufacturers for their part balance the effort and time spent testing against their customers' demands for new and useful features, new and faster routers and less expensive software. Also of concern is how secure routers and routing protocols are against deliberate attempts to disrupt or suborn them.
A number of respondents to the consultation felt that money spent on testing equipment and protocols would be money well spent. [C:10]
1.6 Service Level Agreements (SLAs) and Best Efforts
In any market in which the buyer has difficulty in establishing the relative value of different sellers' offerings, it is common for sellers to offer guarantees to support their claims to quality. Service Level Agreements (SLAs) perform that function in the interconnection ecosystem. From a resilience perspective, it would be nice to see ISPs offering SLAs that covered not just their own networks but the interconnection system too, and customers preferring to buy service with such SLAs.
Unfortunately, SLAs for Internet access in general are hard, and for transit service are of doubtful value [C:20]. In particular, where an operator offers an SLA, it does not extend beyond the borders of their network [C:19]; so whatever their guarantees are, they do not cover the interconnection system, that is, the part between the borders of all networks.
The SLAs offered to end-customers by their ISPs reflect the SLAs that ISPs obtain from their transit providers and peers. The standard SLAs offered to end-customers may be published, but the SLAs offered between networks may be part of contracts that are kept confidential. Given how little such SLAs are generally thought to cover, it is an open question how much information is being hidden
here, but it is another aspect of the general lack of information about the ecosystem at all levels. (The consultation asked specifically about inter-provider agreements; see Section 9, Question 8.)
Providers do not attempt to guarantee anything beyond their borders because they cannot. Any such guarantee would require a back-to-back system of contracts between networks so that liability for a failure to perform would be borne by the failing network. That system of contracts does not exist, not least because the Internet is not designed to guarantee performance. It is fundamental to the current Internet architecture that packets are delivered on a best-efforts basis, that is, the network will do its best but it does not guarantee anything. The Internet leaves the hard work of maintaining a connection to the end-points of the connection: the end-to-end principle. The Transmission Control Protocol (TCP), which carries most Internet traffic apart from delay-sensitive traffic, will reduce demand if it detects congestion; it is designed to adapt to the available capacity, not to guarantee some level of performance.
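The adaptive behaviour described above can be illustrated with a toy additive-increase/multiplicative-decrease (AIMD) loop, which is the general idea behind TCP congestion control. This is a minimal sketch and not TCP itself: the function name, the capacity figure and the step sizes are illustrative assumptions, and exceeding the capacity stands in for the detection of congestion.

```python
def aimd_rates(capacity, rounds, increase=1.0, decrease=0.5, start=1.0):
    """Return the sending rate per round for a single toy AIMD sender.

    The sender adds `increase` each round while its rate fits within
    `capacity`, and multiplies its rate by `decrease` when it does not
    (standing in for detected congestion). All values are illustrative.
    """
    rate = start
    history = []
    for _ in range(rounds):
        history.append(rate)
        if rate > capacity:
            rate = rate * decrease   # congestion detected: back off sharply
        else:
            rate = rate + increase   # no congestion: probe for more capacity
    return history


rates = aimd_rates(capacity=10.0, rounds=40)
print(rates[:14])
```

The rate climbs until it overshoots the available capacity, halves, and climbs again. The sender tracks whatever capacity exists; nothing in the loop promises a minimum rate.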
The other difficulty with SLAs is what can and what should be measured. For a single connection between a and b it is clear what can be measured, but it is not clear what level of performance could be guaranteed, or by whom. Consider a connection from a in one network to b in another network, which traverses four other networks and the connections between them:
Figure 1: Connection between a and b
All these networks are independent, and have their own SLAs, each extending only as far as their borders. If we follow the money, a is paying directly and indirectly for packets to and from the connection between networks Y and Z. Similarly, b is paying for packets to and from the mid-point on the other side. If network Q has low standards, or is having a bad day, to whom does a complain? Network X has a contract with a's network, and offers an SLA, but that does not extend beyond X. Network Y has a contract with X, with a different SLA, but even if X complained to Y about its customer's problem we have come to the end of the money trail: Y cannot hold Z to account for the performance of Q. Suppose a were to demand a strong SLA from their provider: X certainly has no way of imposing some standard of service on Q, and simply cannot offer to make any guarantee.
Even if it were possible to establish an end-to-end SLA for this connection, and pin liability on the failing network, there are hundreds of thousands of paths between a's network and the rest of the Internet. The problem is intractable. So whatever value SLAs have, they do not offer a contractual framework through which customers can influence the resilience of the interconnection system, even if they wanted to. In addition, few customers understand the issue, or care to do anything about it. Generally the Internet is remarkably reliable, so customers' principal interest in choosing a supplier is price, possibly moderated by the supplier's reputation. [C:18]
1.7 Reachability, Traffic and Performance
While end-users care about traffic and performance, the basic mechanism of the interconnection system, BGP, only understands reachability [Q:11]. Its function is to provide a way for every network to reach every other network, and for traffic to flow across the Internet from one network to another. All ASes (the ISPs and other networks that make up the Internet) speak BGP to each other,
and reachability information spreads across the BGP mesh of connections between them. BGP is the heart of the interconnection system, so its many deficiencies are a problem. [Q:16]
The problems with the protocol itself include:
• there is no mechanism to verify that the routing information distributed by BGP is valid. In principle traffic to any destination can be diverted, so traffic can be disrupted, modified, examined or all three. These security issues are discussed separately in Section 1.10.
• there is no mechanism in BGP to convey capacity information, so BGP cannot help reconfigure the interconnection system to avoid congestion. [Q:12] When a route fails, BGP will find another route to maintain reachability, but that route may not have sufficient capacity for the traffic it now receives.
• the mechanisms in BGP which may be used to direct traffic away from congestion in other networks (inter-domain traffic engineering) are strictly limited.
• when things change BGP can be slow to settle down (converge) to a new, stable state. [C:12]
• the ability of BGP to cope, or cope well, under extreme conditions is not assured.
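The second point in the list, that reachability can survive a failure while capacity does not, can be sketched with a toy three-network topology. This is an illustration under stated assumptions, not a model of BGP: the network names, link capacities and demand figure are invented, and a breadth-first search stands in for route selection, which in real BGP is driven by policy and path attributes.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    neighbours = {}
    for u, v in links:
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


def bottleneck(links, path):
    """Smallest link capacity along a path found in `links`."""
    return min(links.get((u, v), links.get((v, u)))
               for u, v in zip(path, path[1:]))


# Hypothetical capacities (arbitrary units) between networks A, B and C.
links = {("A", "B"): 100, ("A", "C"): 10, ("C", "B"): 10}
demand = 40

before = shortest_path(links, "A", "B")
links_after = {k: v for k, v in links.items() if k != ("A", "B")}  # direct link fails
after = shortest_path(links_after, "A", "B")

still_reachable = after is not None
congested = bottleneck(links_after, after) < demand
print(before, after, still_reachable, congested)
```

After the failure B is still reachable from A, which is all the route selection step checks, but the surviving path can carry only a quarter of the demand.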
End-users expect to be able to reach every part of the Internet, so reachability is essential. But they also expect to be able to move data to and from whatever destination they choose, so they expect their connection with that destination to perform well. As BGP knows nothing about traffic, capacity or performance, network operators must use other means to meet end-users' expectations. When something in the Internet changes, BGP will change the routes used to ensure continuing reachability, but it is up to the network operators to ensure that the result will perform adequately, and take other steps if it does not.
Service quality in a best-efforts network is all to do with avoiding congestion, for which it is necessary to ensure that there is always sufficient capacity. The most effective way to do that is to maintain enough spare capacity to absorb the usual short-term variations in traffic and provide some safety margin. Additional spare capacity may be maintained to allow time (weeks or months, perhaps) for new capacity to be installed to cater for long-term growth of traffic. Maintaining spare capacity in this way is known as over-provisioning; it is key to day-to-day service quality and to the resilience of the interconnection system.
Each operator constantly monitors its network for signs of congestion and will make adjustments to relieve any short-term issues. In general the pattern of traffic in a network of any size is stable from day to day and month to month. An operator will also monitor their network for long-term trends in traffic. The management of capacity is generally done on the basis of history, experience and rules of thumb, supported by systems for gathering and processing the available data. The levels of spare capacity in any network will depend on many things, including how the operator chooses to balance the cost of spare capacity against the risk of congestion.
A key point here is that capacity is managed on the basis of actual traffic and the usual day-to-day events, with some margin for contingencies and growth. Capacity is not managed on the basis of what might happen if some unusual event causes a lot of traffic to shift from one network to another. If an event has a major impact on the interconnection system, then the amount of spare capacity within and between networks will determine the likelihood of systemic congestion. So each individual network's degree of over-provisioning makes some contribution to the resilience of the whole, though it is hard to say to what extent.
If an event disables some part of the Internet, BGP will work to ensure that reachability is maintained, but the new paths may have less capacity than the usual ones, which may result in congestion. For many applications, notably web-browsing, the effect is to slow things down, but not stop them working. More difficulties arise with any sort of data that is affected by reduced throughput or increased delay, such as VoIP and streaming video. Congestion may stop these applications working satisfactorily, or at all.
The important distinction between reachability and traffic is illustrated by considering what appears to be a simple metric for the state of the Internet: the percentage of known destinations that are reachable from most of the Internet at any given moment. This metric may be used to gauge the impact of a BGP failure, or of the failure of some critical fibre, or any other widely felt event. But while the significance of, say, 10% of known destinations becoming unreachable is obviously extremely high for the 10% cut off, it may not be terribly significant for the rest of the Internet. We would prefer to know the amount, and possibly the value, of traffic that is affected. If the 10% cut off accounts for a large proportion of the remaining 90%'s traffic, the impact could be significant. So when talking about the resilience of the system, what is an acceptable level of the best-efforts service? Are we aiming at having email work 95% of the time to 95% of destinations, or streaming video work 99.99% of the time to 99.99% of destinations? The answer will have an enormous effect on the spare capacity needed! Each extra order of magnitude improvement (say from 99% to 99.9%) could cost an order of magnitude more money; yet the benefits of service quality are unevenly distributed. For example, a pensioner who uses the Internet to chat to grandchildren once a week may be happy with 99% or even 90%, while a company providing a cloud-based business service may need 99.99% or more.
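The gap between counting destinations and counting traffic can be made concrete with a small calculation. The sketch below is illustrative only: the destination names and traffic volumes are invented, and `impact` is a hypothetical helper, not an established metric.

```python
def impact(traffic_by_destination, unreachable):
    """Return (share of destinations lost, share of traffic lost)."""
    total_traffic = sum(traffic_by_destination.values())
    lost_traffic = sum(traffic_by_destination[d] for d in unreachable)
    share_of_destinations = len(unreachable) / len(traffic_by_destination)
    return share_of_destinations, lost_traffic / total_traffic


# Ten hypothetical destinations: nine small ones and one that attracts
# more than half of all traffic (volumes in arbitrary units).
traffic = {"d%d" % i: 5.0 for i in range(9)}
traffic["hub"] = 55.0

small_loss = impact(traffic, {"d0"})   # one small destination cut off
hub_loss = impact(traffic, {"hub"})    # the popular destination cut off
print(small_loss, hub_loss)
```

Both failures score the same on the reachability metric (10% of destinations), yet one removes about 5% of traffic and the other about 55%.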
1.7.1 Traffic Prioritisation
In a crisis it is common for access to some resources to be restricted, to shed demand and free up capacity. For telephony a traditional approach is to give emergency services priority. But restricting phone service to obvious emergency workers such as doctors is unsatisfactory. Modern medical practice depends on team working and can be crippled if nurses are cut off; and many patients who depend on home monitoring may have to be hospitalised if communications fail.
If capacity is lost in a disaster and parts of the system are congested, then all users of the congested parts will suffer a reduction in service, and some types of traffic (notably VoIP) may stop working effectively. If some types, sources or destinations of traffic are deemed to be important, and so should be given priority in a crisis, then serious thought needs to be given to how to identify priority traffic, how the prioritisation is to be implemented and how turning that prioritisation on and off fits into other disaster planning. [Q:19]
It is not entirely straightforward to identify different types of traffic. So an alternative approach may be to prioritise by source or destination. It may be tempting to consider services such as Facebook or YouTube as essentially trivial, and YouTube uses a lot of bandwidth. However, in a crisis keeping in contact using Facebook may be a priority for many. Moreover, shutting down YouTube in a crisis, thereby preventing the free reporting of events, would require solid justification. On the other hand, rate-limiting ordinary users, irrespective of traffic type, may appear fair, but could affect essential VoIP use, and cutting off peer-to-peer traffic could be seen as censorship.
So it is inappropriate for ISPs to decide to discriminate between different sorts of traffic, or between customers of the same type (although premium customers at premium rates might expect to get better performance in a crisis). [Q:21] It is not even clear that ISPs are, in general, capable of
prioritising some traffic on any given basis. So, if some traffic should be prioritised in a crisis, who will make the call, and will anyone be ready to act when they do?
It is clear that this challenge entails both technical and policy aspects. The former are related mainly to the mechanisms that should exist in network equipment to support traffic prioritisation. The latter refer mainly to the policies that specify what traffic should be given priority. It is very important to tackle both aspects of the problem.
1.7.2 Traffic Engineering
Traffic Engineering is the jargon term for adjusting a network so that traffic flows are improved. In a crisis that would mean shifting traffic away from congested paths. This is less controversial than traffic prioritisation, but no less difficult.
When some event creates congestion in some part(s) of the interconnection system it would be convenient if networks could redirect some traffic away from the congested parts. When a network is damaged its operators will work to relieve congestion within their network by doing internal traffic engineering, adding temporary capacity, repairing things, and so on. One of the strengths of the Internet is that each operator will be working independently to recover its own network as quickly and efficiently as possible.
Where a network's users are affected by congestion in other networks, the simplest strategy is to wait until those networks recover. This may leave spare capacity in other networks unused, so is not the optimum strategy for the system as a whole. However, there are two problems with trying to coordinate action:
1. there is no way of telling where the spare capacity in the system is;
2. BGP provides very limited means to influence traffic in other operators' networks.
In effect, if networks attempt to redirect traffic they are blundering around in the dark, attempting to make adjustments to a delicate instrument with a hammer. Their attempts to redirect traffic may create congestion elsewhere, which may cause more networks to try to move traffic around. It is possible to imagine a situation in which many networks are chasing each other, creating waves of congestion and routing changes as they do, like the waves of congestion that pass along roads which are near their carrying capacity.
With luck, if a network cannot handle the traffic it is sent and pushes it away to other networks, it will be diverted towards spare capacity elsewhere. Given enough time the system would adapt to a new distribution of capacity, and a new distribution of traffic. It is impossible to say how much time would be required; it would depend on the severity of the capacity loss, but it could be days or even weeks.
Strategic local action will not necessarily lead to a socially optimal equilibrium, though, as the incentives may be perverse. Since any SLA will stop at the edge of its network, a transit provider may wish to engineer traffic away from its network in order to meet its SLAs for traffic within its network. The result may still be congestion, somewhere, but the SLA is still met.
1.7.3 Routing in a Crisis
Experience shows that in a crisis the interconnection system can quite quickly create new paths between networks to provide interim connections and extra capacity, for example in the aftermath of the 9/11 attack, as discussed above.
The interconnection ecosystem has often responded in this way, with many people improvising, and working with the people they know personally. [C:13] This is related to traffic engineering, to the extent that it addresses the problem by adding extra connections to which traffic can be moved. The response of the system might be improved and speeded up if there were more preparation for this form, and perhaps other forms, of cooperation in a crisis. [C:14]
In the end, if there is insufficient capacity in a crisis, then no amount of traffic engineering or manual reconfiguration will fit a quart of traffic into a pint of capacity. In extreme cases some form of prioritisation would be needed.
1.8 Is Transit a Viable Business?
The provision of transit (the service of carrying traffic to every possible destination) is a key part of the interconnection system, but it may not be a sustainable business in the near future.
Nobody doubts that the cost of transit has fallen fast, or that it is a commodity business, except where there is little or no competition. In the US, over the last ten to fifteen years transit prices have fallen at a rate of around 40% per annum, which results in a 99% drop over a ten-year period. In other parts of the world prices started higher, but as infrastructure has developed, and transit networks have extended into new markets, those prices have fallen; for example, prices in London are now scarcely distinguishable from those in New York.
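The step from 40% per annum to a 99% drop over ten years is simple compounding, and can be checked in a few lines. The variable names are illustrative; the only inputs are the two figures quoted above.

```python
annual_decline = 0.40   # prices fall by about 40% each year
years = 10

# After each year 60% of the previous year's price remains.
remaining = (1 - annual_decline) ** years
total_drop = 1 - remaining

print("remaining fraction of original price: %.4f" % remaining)
print("total drop over %d years: %.1f%%" % (years, 100 * total_drop))
```

A price that keeps 60% of its value each year keeps about 0.6% of it after ten years, which is a fall of roughly 99%.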
Where there is effective competition, the price of transit falls, and consumers benefit. In a competitive market, price tends towards the marginal cost of production. The total cost of production has fallen sharply, as innovation reduces the cost of the underlying technologies and with increasing economies of scale. Yet every year industry insiders feel that surely nobody can make money at today's prices, and that there must soon be a levelling off. So far there has been no levelling off, though the rate at which prices fall may be diminishing.
The reason is simple: the marginal cost of production for transit service is generally zero. At any given moment there will be a number of transit providers with spare capacity: first, network capacity comes in lumps, so each time capacity is added the increment will generally exceed the immediate need; second, networks are generally over-provisioned, so there is always some spare capacity, though eating into that may increase the risk of congestion, perhaps reducing service quality at busy times or when things go wrong.
The logic of this market is that the price for transit will tend towards zero. So it is unclear how pure transit providers could recoup their capital investment. The logic of the market would appear to favour consolidation until the handful of firms left standing acquire market power.
At a practical level, the provision of transit may be undertaken not to make profits, but to offset some of the cost of being an Internet network. For some networks the decision to offer transit at the market price may be increasingly a strategic rather than a commercial decision. Another significant factor is the recent and continuing increase in video traffic and the related rise in the amount of traffic delivered by the Content Delivery Networks (CDNs, see below). This means that the continued
reduction in the unit price for transit is not being matched by an increase in transit traffic, so transit providers' revenues are decreasing.
The acknowledged market leader, Level 3, lost $2.9 billion in 2005-2008, a further $0.6 billion in 2009, and another $0.6 billion in 2010. It is not possible to say what contribution their transit business made to this; industry insiders note that Level 3 did not go through bankruptcy as many others did, and would make a small profit if it were not for the cost of servicing its debt. However, the industry as a whole is losing large amounts of money (we summarise some of the major providers' financial statements in Appendix II).
1.9 The Rise of the Content Delivery Networks
Over the past four years or so, more and more traffic has been delivered by Content Delivery Networks (CDNs). Their rise has been rapid and has changed the interconnection landscape, concentrating a large proportion of Internet traffic into a small number of networks. This shift has been driven by both cost and quality considerations. With the growth of video content, of ever richer web-sites, and of cloud applications, it makes sense to place copies of popular data closer to the end users who fetch it. This has a number of benefits:
• local connections perform better than remote ones, giving quicker response and faster transfers;
• costs are reduced because the data is not being repeatedly transported over large distances, saving on transit costs. However, the key motivation for the customers of CDNs is not to reduce the cost of delivery, but to ensure quality and consistency of delivery, which is particularly important for the delivery of video streams;
• the data are replicated, stored in and delivered from a number of locations, improving resilience.
This has moved traffic away from transit providers to peering connections between the CDNs and the end-user's ISP. In some cases content is distributed to servers within the ISP's own network, by-passing the interconnection system altogether.
One CDN claims to deliver some 20% of all Internet traffic. Since the traffic being delivered is the sort which is expected to grow most quickly in the coming years, this implies that an increasing proportion of traffic is being delivered locally, and a reducing proportion of traffic is being carried (over long distances) by the transit providers.
Another effect of this is to add traffic at the Internet Exchange Points (IXPs), which are the obvious way for the CDNs to connect to local ISPs. This adds value to the IXP, which is particularly welcome for the smaller IXPs, which have been threatened by the ever falling cost of transit (eating into the cost advantage of connecting to the IXP) and the falling cost of connecting to remote (larger) IXPs (where there is more opportunity to pick up traffic).
There is a positive effect on resilience, and a negative one. The positive side is that systems serving users in one region are independent of those serving users in other regions, so a lot of traffic becomes less dependent on long-distance transit services. On the negative side, CDNs are now carrying so much traffic that if a large one were to fail, transit providers could not meet the added demand, and some services would be degraded. CDNs also concentrate ever more infrastructure in
places where there is already a lot of it. If parts of some local infrastructure fail for any reason, will there be sufficient other capacity to fall back on?
Finally, it is possible to count a couple of dozen CDNs quite quickly, but it appears that perhaps two or three are dominant. Some of the large transit providers have entered the business, either with their own infrastructure or in partnership with an existing CDN. There are obvious economies of scale in the CDN business, and there is now a significant investment barrier to entry. The state of this market in a few years' time is impossible to predict, but network effects tend to favour a few, very large, players. These players are very likely to end up handling over half the Internet's traffic by volume.
1.10 The Insecurity of BGP
A fundamental problem with BGP is that there is no mechanism to verify that the routing information it distributes is valid. In principle traffic to any destination can be diverted, so traffic can be disrupted, modified, examined or all three. [C:11] The effect of this is felt on a regular basis when some network manages to announce large numbers of routes for addresses that belong to other networks; this can divert traffic into what is effectively a black hole. Such incidents are quite quickly dealt with by network operators, and disruption can be limited to a few hours, at most. It is worth remembering that the operational layer is part of the ecosystem, and not all problems require technical solutions.
The great fear is that this insecurity might be exploited as a means to deliberately disrupt the Internet, or parts of it. There is also a frequently expressed concern that route hijacking might be used to listen in on traffic, though this can be hard to do in practice.
Configuring BGP routers to filter out invalid routes, or only accept valid ones, is encouraged as best practice. However, as discussed in Section 3.1.11, where it is practical (at the edges of the Internet) it does not make much difference, until most networks do it. Where it would make most difference (in the larger transit providers) it is not really practical because the information on which to base route filters is incomplete and the tools available to manage and implement filters at that scale are inadequate. [Q:13]
More secure forms of BGP, in which routing information can be cryptographically verified, depend on there being a mechanism to verify the ownership of blocks of IP addresses, or to verify that the AS which claims to be the origin of a block of IP addresses is entitled to make that claim. The notion of title to blocks of IP addresses turns out not to be as straightforward as might be expected. However, some progress is now being made, under the name RPKI (Resource Public Key Infrastructure). The RPKI initiative should allow ASes to ignore announcements where the origin is invalid, that is, where some AS is attempting to use IP addresses it is not entitled to use. This is an important step forward, and might tackle over 90% of fat-finger problems (outages caused by mistakes rather than deliberate attempts to disrupt). [Q:14]
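The origin check that RPKI is meant to enable can be sketched as a lookup against a register of authorised origins. The sketch below is an assumption-laden illustration, not the RPKI machinery itself: it omits all cryptographic verification, and it assumes the register can be reduced to tuples of (address block, maximum prefix length, authorised origin AS). The function name, tuple layout and result labels are hypothetical, and the prefixes and AS numbers are taken from the ranges reserved for documentation.

```python
import ipaddress


def validate_origin(register, prefix, origin_asn):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'.

    `register` is a list of (block, max_length, asn) tuples standing in for
    attested title to address blocks. An announcement is 'valid' if some
    entry covers it, permits its prefix length and names its origin AS;
    'invalid' if entries cover it but none matches; 'not-found' otherwise.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for block, max_length, asn in register:
        net = ipaddress.ip_network(block)
        if route.version == net.version and route.subnet_of(net):
            covered = True
            if route.prefixlen <= max_length and origin_asn == asn:
                return "valid"
    return "invalid" if covered else "not-found"


# Hypothetical register: AS 64496 holds title to 192.0.2.0/24.
register = [("192.0.2.0/24", 24, 64496)]

results = [
    validate_origin(register, "192.0.2.0/24", 64496),     # rightful origin
    validate_origin(register, "192.0.2.0/24", 64511),     # wrong origin AS
    validate_origin(register, "192.0.2.128/25", 64496),   # more specific than allowed
    validate_origin(register, "198.51.100.0/24", 64496),  # block not in register
]
print(results)
```

An AS applying such a check could ignore the announcements classed as invalid. Announcements for blocks with no entry in the register cannot be judged either way, which is one reason the benefit depends on how widely title has been registered.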
But the cost of RPKI is significant. Every AS must take steps to document their title to their IP addresses, and that title must be registered and attested to by the Internet Registries. Then, every AS must extend their infrastructure to check the route announcements they receive against the register. What is more, the problem that RPKI tackles is, so far, largely a nuisance, not a disaster. When some network manages to announce some routes it should not, this is noticed and fixed quite quickly, if it matters. Sometimes a network announces IP addresses nobody else is using; generally they are up to no good, but this does not actually disrupt the interconnection system. So the incentive to do
something about the problem is weak, although the number of such incidents is expected to rise when IPv4 addresses are exhausted in late 2011.
Further, a route may pass the checks supported by RPKI, and still be invalid. A network can announce routes for a block of IP addresses, complete with a valid origin, but do so only to disrupt or interfere with the traffic (apparently) on its way to its destination. The S-BGP extensions to BGP (first published in 1997) address the issue more completely, and there have been other proposals since; however, they make technical assumptions about routing (traffic greed and valley-free customer preferences) that don't hold in today's Internet. Details of a new initiative, BGPSEC, were announced in March 2011. The aim is that this should lead to IETF standards by 2013 and deployed code in routers thereafter.
During the standardisation process in 2011-2013 a key issue will be security economics. ASes see the cost of BGP security as high, and the benefit essentially zero until it is very widely deployed. Ideally, implementation and deployment strategies will give local, incremental benefit, coupled with incentives for early adopters. One possible mechanism is for governments to use their purchasing power to bootstrap early adoption; another is for routers to prefer signed routes. Technical issues that must be studied during the standardisation phase include whether more secure BGP might, in fact, be bad for resilience (as was pointed out in the consultation, [Q:15]). Adding cryptography to a system can make it brittle. The reason is that when recovering from an event, new and possibly temporary routes may be distributed in order to replace lost routes, and if the unusual routes are rejected because they do not have the necessary credentials, then recovery will be harder. Finally, BGPSEC will not be a silver bullet, as there are many threats, but it should tackle about half the things that can go wrong after RPKI has dealt with origin validation.
To sum up, most of the time BGP works wonderfully well, but there is plenty of scope to make it more secure and more robust. However, individual networks will get little direct benefit from an improved BGP, despite the significant cost. We will probably need some new incentive to persuade networks to invest in more secure BGP, or a proposal for securing BGP that gives local benefits from incremental deployment. [Q:20]
1.11 Cyber Exercises on Interconnection Resilience
The practical approach to assessing the resilience of the interconnection system is to run large-scale exercises in which plausible scenarios are tested. [C:16] Exercises can test both operational and technical aspects as well as procedural, policy, structural and communication aspects.
Such exercises have a number of advantages and benefits.
• They start with real-world issues. These exercises are not cheap, so there is an incentive to be realistic: planners consider what really are the sorts of event that the system is expected to face.
• They can identify some dependencies on physical infrastructure. By requiring the participants to consider the effects of some infrastructure failure, an exercise may reveal previously unknown dependencies.
• They can identify cross-system dependencies. For example, how well can network operations centres communicate if the phone network fails, or how well can field repairs proceed if the mobile phone network is unavailable? [Q:17]
• They exercise disaster recovery systems and procedures. This is generally a good learning experience for everybody involved, particularly as otherwise crisis management is generally ad hoc. [C:15]
Such scenario testing has been done at a national level and found to be valuable (see footnote 3). Something at a larger scale has also been proved to be valuable.
On 4th November 2010 the European Member States organised the first pan-European cyber exercise, called CYBER EUROPE 2010, which was facilitated by ENISA. The final evaluation report published by ENISA (see footnote 4) proves the importance of such exercises and calls for future actions based on the lessons learned.

1.12 The Tragedy of the Commons
The resilience of the Internet interconnection system benefits everyone, but an individual network will not in general gain a net benefit if it increases its costs in order to contribute to the resilience of the whole. [C:21]
This manifests itself in a number of ways.
• In Section 1.10 above, we discussed the various proposals for more secure forms of BGP, from S-BGP in 1997 to BGPSEC in 2011, none of which has so far been deployed (see Section 3.1.12). There is little demand for something which is going to be difficult to implement and whose direct benefit is limited.
• There exists best practice for filtering BGP route announcements, which, if universally applied, would reduce instances of invalid routes being propagated by BGP and disrupting the system (see Section 3.1.11). But these recommendations are difficult to implement and mostly benefit other networks, so are not often implemented.
• There is an IETF BCP⁵ [6] for filtering packets, to reduce address spoofing, which would mitigate denial of service attacks (see Section 5.8.3). These recommendations also mostly benefit others, so are not often implemented.
• A smaller global routing table would reduce the load on all BGP routers in the Internet, and leave more capacity to deal with unusual events. Nevertheless, the routing table is about 75% bigger than it needs to be, because some networks announce extra routes to reduce their own costs (see Section 3.1.9). Other networks could resist this by ignoring the extra routes, but that would cost time and effort to configure their routers, and would most likely be seen by their customers as a service failure (not as a noble act of public service).
• The system is still ill-prepared for IPv6, despite the now imminent (circa Q3 2011) exhaustion of IPv4 address space. [Q:10]
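The routing-table point in the list above can be illustrated with a minimal Python sketch that compares a set of announced prefixes with the smallest set covering the same address space. Everything here is an assumption made for illustration: the prefixes are hypothetical (documentation and private ranges), the function names are ours, and the sketch ignores origin AS and routing policy, which real aggregation would have to respect. The ratio it yields is constructed to mirror the estimate quoted above; it is not a measurement.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse prefixes into the smallest set covering the same addresses."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))


def excess_ratio(prefixes):
    """How much bigger the announced table is than its aggregated form."""
    announced = len(set(prefixes))
    minimal = len(aggregate(prefixes))
    return (announced - minimal) / minimal


# Hypothetical announcements (illustrative only, not real routing data).
announced = [
    "10.0.0.0/8",          # a single aggregate, announced once
    "192.0.2.0/24",        # a single aggregate, announced once
    "198.51.100.0/24",     # an aggregate ...
    "198.51.100.0/25",     # ... plus two more-specific routes inside it
    "198.51.100.128/25",
    "203.0.113.0/25",      # two halves announced separately; together
    "203.0.113.128/25",    # they are equivalent to one /24
]

if __name__ == "__main__":
    minimal = aggregate(announced)
    print(len(announced), "announced,", len(minimal), "after aggregation")
    print("excess: {:.0%}".format(excess_ratio(announced)))
```

The sketch shows only the arithmetic of deaggregation. In practice a network may announce more-specific routes deliberately, for example for traffic engineering, which is exactly the private benefit at public cost that this section describes.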
³ Good Practice Guide on National Cyber Exercises, ENISA Technical Report, 2009. Available at: http://www.enisa.europa.eu/act/res/policies/goodpractices1/exercises
⁴ CYBER EUROPE 2010 Evaluation Report, ENISA Report 2011. Available (after 15/04/2011) at: http://www.enisa.europa.eu/act/res/
⁵ An Internet Engineering Task Force (IETF) Best Current Practice (BCP) is as official as it gets in the Internet.
It is in the clear interest of each network to ensure that in normal circumstances "best efforts" means a high level of service, by adjusting interconnections and routing policy: each network has customers to serve and a reputation to maintain [C:17]. Normal circumstances include the usual day-to-day failures and small incidents [Q:7].
The central issue is that the security and resilience of the interconnection system is an externality as far as the networks that comprise it are concerned. What is not clear is that there is any incentive for network operators to put significant effort into considering the resilience of the interconnection system under extraordinary circumstances. [Q:18]

1.13 Regulation
Regulation is viewed with apprehension by the Internet community. Studies such as this are seen as stalking horses for regulatory interference, which is generally thought likely to be harmful. [C:22]
Despite having its origins in a project funded by DARPA, a US government agency, the Internet has developed since then in an environment that is largely free from regulation. There have been many local attempts at regulatory intervention, most of which are seen as harmful.
• The governments of many less developed countries attempt to censor the Internet, with varying degrees of success. The "Great Firewall of China" is much discussed, but many other states practise online censorship to a greater or lesser extent. It is not just that censorship itself is contrary to the mores of the Internet community, whose culture is greatly influenced by California, the home of many developers, vendors and service companies. Attempts at censorship can cause collateral damage, as when Pakistan advertised routes for YouTube in an attempt to censor it within its borders, and instead made it unavailable on much of the Internet for several hours.
• Where poor regulation leads to a lack of competition, access to the Internet is limited and relatively expensive. In many less developed countries, a local telecommunications monopoly restricts wireline broadband access to urban elites, forcing the majority to rely on mobile access. However the problem is more subtle than "regulation bad, no regulation good". In a number of US cities, the diversity of broadband access is falling; cities that used to have three independent infrastructures (say from a phone company, a cable company and an electricity company) may find themselves over time with two, or even just one. In better regulated developed countries (such as much of Europe) local loop unbundling yields price competition at least, thus mitigating access costs, even if physical diversity is harder to achieve. Finally, few countries impose a universal service provision on service providers; its lack can lead to a digital divide between populated areas with broadband provision, and rural areas without.
• There has been continued controversy over surveillance for law enforcement and intelligence purposes. In the "Crypto Wars" of the 1990s, the Clinton administration tried to control cryptography, which the industry saw as threatening not just privacy but the growth of e-commerce and other online services. The Clinton administration passed the Communications Assistance for Law Enforcement Act (CALEA) in 1994, mandating the cooperation of telecommunications carriers in wiretapping phone calls. The EU has a Data Retention Directive that is up for revision in 2011, and there is interest both in the UK and the USA in how wiretapping should be updated for an age not only of VoIP but also of diverse messaging systems. This creates conflicts of interest with customers, raises issues of human rights, and leads to arguments about payment and subsidy.
• Governments which worry about Critical National Infrastructure may treat Internet regulation as a matter of National Security, introducing degrees of secrecy and shadowy organisations, which does nothing to dispel concerns about motivation (not helped by a tendency to talk about the problem in apocalyptic terms⁶).
Whatever the motivation, government policies are often formulated with insufficient scientific and technical input. They often manage to appear clueless, and in some cases make things worse. This study is an attempt to help alleviate this problem.
This study has identified a number of areas where the market does not appear to provide incentives to maintain the resilience of the interconnection system at a socially optimal level. However, any attempt to tackle any of the issues by regulation is hampered by a number of factors:
• the lack of good information about the state and behaviour of the system. It is hard to determine how material a given issue may be. It is hard to determine what effect a given initiative is likely to have, good or bad.
• the scale and complexity of the system. Scale may make local initiatives ineffective, while complexity means that it is hard to predict how the system will respond or adapt to a given initiative.
• the dynamic nature of the system. CDNs have been around for many years, but their emergence as a major component of the Internet is relatively recent; it is testament to the system's ability to adapt quickly (in this case, to the popularity of streamed video).
Up until now, the lack of incentives to provide resilience (and in particular to provide excess capacity) has been relatively unimportant: the Internet has been growing so rapidly that it has been very far from equilibrium, with a huge endowment of surplus capacity during the dot-com boom and significant capacity enhancements since then. This cannot go on forever.
One caveat: we must point out that the privatisation, liberalisation and restructuring of utilities worldwide has led to institutional fragmentation in a number of critical infrastructure industries that could in theory suffer degradation of reliability and resilience for the same general microeconomic reasons we discuss in the context of the Internet. Yet studies of the electricity, water and telecoms industries in a number of countries have failed to find a reliability deficit thus far [7]. In practice, utilities have managed to cope by a combination of anticipatory risk management and Public Private Partnerships (PPPs). However it is sometimes necessary for government to act as a "lender of last resort". If a router fails, we can fall back on another router, but if a market fails, as with the California electricity market, there is no fallback other than the state.
In conclusion, it may be some time before regulatory action is called for to protect the resilience of the Internet, but it may well be time to start thinking about what might be involved. Regulating a new technology is hard; an initiative designed to improve today's system may be irrelevant to tomorrow's, or, worse, stifle competition and innovation. For example, the railways steadily improved their efficiency from their inception in the 1840s until regulation started in the late nineteenth century, after which their efficiency declined steadily until competition from road freight arrived in the 1940s [8].
⁶ See [236] UK Government, Cabinet Office Factsheet 18: Cyber Security. And for the popular perception of what government thinks, see [237] "Fight Cyber War Before Planes Fall Out of Sky".
The prudent course of action for policy makers today is to start working to understand the Internet interconnection ecosystem. The most important package of work is to increase transparency, by supporting consistent, thorough investigation of major outages and the publication of the findings, and by supporting long-term measurement of network performance. The second package we recommend is to fund key research in topics such as distributed intrusion detection and the design of security mechanisms with practical paths to deployment, and the third is to promote good practice, to encourage diverse service provision and to promote the testing of equipment. The fourth package includes the preparation and relationship building through a series of PPPs for resilience. Modest and constructive engagement of this kind will enable regulators to build relationships with industry stakeholders and leave everyone in a much better position to avoid, or delay, difficult and uninformed regulation. Regulatory intervention must after all be evidence-based; and while there is evidence of a number of issues, the workings of this huge, complex and dynamic system are so poorly understood that there is not yet enough evidence on which to base major regulatory intervention with sufficient confidence.

2 Recommendations
Our recommendations come in four groups. The first group is aimed at understanding failures better, so that all may learn the lessons.

Recommendation 1 Incident Investigation
An independent body should thoroughly investigate all major incidents and report publicly on the causes, effects and lessons to be learned. Incident correlation and analysis may lead to assessment and forecast models. The appropriate framework should be the result of a consultation with the industry and the appropriate regulatory authorities. Incident investigation might be undertaken by an industry association, by a national regulator or by a body at the European level, such as ENISA. The last option would require funding to support the work, and, perhaps, powers to obtain information from operators under suitable safeguards to protect commercially sensitive information. The implementation of Article 13a of the recent EU Telecom Package⁷ may provide a model for this.

Recommendation 2 Data Collection of Network Performance Measurements
Europe should promote and support consistent, long-term and comprehensive data collection of network performance measurements. At present some real-time monitoring is done by companies such as Arbor Networks and Renesys, and some more is done by academic projects, which tend to languish once their funding runs out. This patchwork is insufficient. There should be sustainable funding to support the long-term collection, processing, storage and publication of performance data. This also has a network management and law enforcement angle, in that real-time monitoring of the system could help detect unusual route announcements and other undesirable activity.
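As an illustration of what detecting unusual route announcements might involve, the following minimal Python sketch flags announcements whose origin AS conflicts with a previously recorded baseline, including more-specific prefixes carved out of a known prefix. It is a sketch under stated assumptions, not a description of how any existing monitoring service works: the baseline table, the function name `find_suspicious`, the prefixes (documentation ranges) and the AS numbers (reserved for documentation) are all hypothetical.

```python
import ipaddress


def find_suspicious(baseline, announcements):
    """Return announcements whose origin AS conflicts with the baseline.

    baseline: dict mapping a prefix string to its expected origin AS number.
    announcements: iterable of (prefix string, origin AS number) pairs.
    Each alert is (announced prefix, announced origin,
                   covering baseline prefix, expected origin).
    """
    known = [(ipaddress.ip_network(p), asn) for p, asn in baseline.items()]
    alerts = []
    for prefix, origin in announcements:
        net = ipaddress.ip_network(prefix)
        for known_net, expected in known:
            if net.version != known_net.version:
                continue  # never compare IPv4 with IPv6
            # subnet_of() is true for the same prefix or a more-specific one.
            if net.subnet_of(known_net) and origin != expected:
                alerts.append((prefix, origin, str(known_net), expected))
    return alerts


# Hypothetical baseline and observations (illustrative only).
baseline = {
    "203.0.113.0/24": 64500,
    "192.0.2.0/24": 64501,
}
observed = [
    ("192.0.2.0/24", 64501),     # matches the baseline
    ("203.0.113.0/25", 64511),   # more-specific route, unexpected origin
    ("198.51.100.0/24", 64502),  # no baseline entry, so not judged here
]

if __name__ == "__main__":
    for alert in find_suspicious(baseline, observed):
        print("unexpected origin:", alert)
```

A check of this kind depends on long-term data: without a maintained record of which network normally originates which prefix, there is nothing to compare a new announcement against, which is part of the case for sustained collection made above.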
The second group of recommendations aims at securing funding for research in topics related to resilience, with an emphasis not just on the design of security mechanisms, but on developing an understanding of how solutions can be deployed in the real world.

Recommendation 3 Research into Resilience Metrics and Measurement Frameworks
Europe should sponsor research into better ways to measure and understand the performance and resilience of huge, multi-layered networks. This is the research aspect of the second recommendation; once that provides access to good data, the data should help clever people to come up with better metrics.
⁷ Directive 2002/21/EC of the European Parliament and of the Council, of 7 March 2002, on a common regulatory framework for electronic communications networks and services (Framework Directive), as amended by Directive 2009/140/EC and Regulation 544/2009.

Recommendation 4 Development and Deployment of Secure Inter-domain Routing
Europe should support the development of effective, practical mechanisms which have enough incentives for deployment. This may mean mechanisms that give local benefit to the firms that deploy them, even where deployment is incremental; it may require technical mechanisms to be supplemented by policy tools such as the use of public sector purchasing power, subsidies, liability shifts, or other kinds of regulation.

Recommendation 5 Research into AS Incentives that Improve Resilience
Europe should support research into economic and legal mechanisms to increase the resilience of the Internet. Perhaps a system of contracts can be constructed to secure the interconnection system, starting with the connections between the major transit providers and spreading from the core to the edges. Alternatively, researchers might consider whether liability rules might have a similar effect. If the failure of a specific type of router caused loss of Internet service leading to damage and loss of life, the Product Liability Directive 85/374/EEC would already let victims sue the vendor; but there is no such provision relating to the failure of a transit provider.
The third group of recommendations aims at promoting good practice.

Recommendation 6 Promotion and Sharing of Good Practice on Internet Interconnections
Europe should sponsor and promote good practice in network management. Where good practice exists, its adoption may be hampered by practical and economic issues. The public sector may be able to help, but it is not enough to declare for motherhood and apple pie! It can contribute various incentives, such as through its considerable purchasing power. For that to be effective, purchasers need a way to recognise good service. The first three of our recommendations can help, but there are some direct measures of quality too. Such information sharing should include modest and constructive engagement of industry stakeholders with the public sector in relationship building, strategic dialogue and decisions, through a series of PPPs for resilience.

Recommendation 7 Independent Testing of Equipment and Protocols
Public bodies at national or European level should sponsor the independent testing of routing equipment and protocols. The risk of systemic failure would be reduced by independent testing of equipment and protocols, looking particularly for how well these perform in unusual circumstances, and whether they can be disrupted, suborned, overloaded or corrupted.

Recommendation 8 Conduct Regular Cyber Exercises on the Interconnection Infrastructure
The consultation noted that such exercises are effective in improving resilience at local and national levels. The efforts at this level should continue in all countries in Europe, as we are as weak as the weakest link. ENISA will support the national efforts. In addition, regular pan-European exercises should be organised by European Member States in order to test and improve Europe-wide contingency plans (measures, procedures and structures). These large-scale exercises will provide an umbrella for a number of useful activities, such as investigating what extra preparation might be required to provide more routes in a crisis, thus effectively becoming part of improving the pan-European cyber preparedness and contingency plans.
The final group of recommendations aims at engaging policy makers, customers and the public.

Recommendation 9 Transit Market Failure
It is possible that the current twenty-odd largest transit providers might consolidate down to a handful, in which case they might start to exercise market power and need to be regulated like any other concentrated industry. If this were to happen just as the industry uses up the last of its endowment of dark fibre from the dot-com boom, then prices might rise sharply. European policy makers should start the conversation about what to do then. Action might involve not just a number of European agencies but also national regulatory authorities. Recommendations 1, 2, 3 and 5 will prepare the ground technically so that policy makers will not be working entirely in the dark, but we also need political preparation.

Recommendation 10 Traffic Prioritisation
If, in a crisis, some traffic is to be given priority, and other traffic is to suffer discrimination, then the basis for this choice requires public debate, and mechanisms to achieve it need to be developed. Given the number of interests seeking to censor the Internet for various reasons, any decisions on prioritisation will have to be taken openly and transparently, or public confidence will be lost.

Recommendation 11 Greater Transparency: Towards a Resilience Certification Scheme
Finally, transparency is not just about openness in taking decisions on regulation or on emergency procedures. It would greatly help resilience if end users and corporate customers could be educated to understand the issues and send the right market signals. In the long term, efforts, including ENISA's, should focus on what mechanisms can be developed to give them the means to make more informed choices. This might involve combining the outputs from recommendations 2, 3, 5, 6 and 7 into a "quality certification mark" scheme. Such a scheme may prove an important tool to drive the market incentives towards enhancing the resilience of the networks and, more generally, of the interconnection ecosystem.

Respondents to the Consultation
We thank all those who gave their time to respond to the consultation and help us with this study. Some chose to contribute anonymously, but we can thank by name the following:
Olivier Bonaventure; Professor; UCLouvain, Belgium
Scott Bradner; University Technology Security Officer, Office of the CIO; Harvard University
Bob Briscoe; Chief Researcher; Networks Research Centre, BT Group plc
kc claffy; Principal Investigator; CAIDA
Andrew Cormack; Chief Regulatory Adviser; JANET(UK)
Jon Crowcroft; Marconi Professor of Communications Systems; Computer Lab, Cambridge University
John Curran; CEO; ARIN
Dai Davies; General Manager; Dante
Nicolas Desmons; Chargé de Mission; ARCEP, France
Amogh Dhamdhere; Post-Doctoral Researcher; CAIDA
Giuseppe Di Battista; Professor of Computer Science; Roma Tre University
Nico Fischbach; Director, Network Architecture; Colt
Mark Fitzpatrick; Engineer; Federal Office of Communications, OFCOM, Switzerland
David Hutchison; Professor of Computing; Lancaster University
Malcolm Hutty; Head of Public Affairs; LINX
Christian Jacquenet; Director of the Strategic Program Office; France Telecom Group
Balachander Krishnamurthy; Researcher; AT&T Labs Research
Craig Labovitz; Chief Scientist; Arbor Networks
Ulrich Latzenhofer; Rundfunk und Telekom Regulierungs, Austria
Simon Leinen; Network Engineer; SWITCH
Tony Leung; Global Internet and Network Convergence Manager; REACH
Kurt Erik Lindqvist; CEO; Netnod
Neil Long; Researcher and Founder; Team Cymru Research NFP
Patricia Longstaff; David Levidow Professor of Communication Law and Policy / James Martin Senior Visiting Fellow, Oxford Martin School / Visiting Scholar; Syracuse University / Trinity College, Oxford
Paolo Lucente; Architect/Designer; KPN International
Bill Manning; USC/ISI
Maurizio Pizzonia; Assistant Professor, Computer Science; Roma Tre University
AndrewPowell ManagerofAdviceDe