inter-x summary report april-11b

Upload: mark-leiser

Post on 05-Apr-2018


  • 8/2/2019 Inter-X Summary Report April-11b

    Inter-X: Resilience of the Internet Interconnection Ecosystem

    Summary Report

    April 2011


    About ENISA

    The European Network and Information Security Agency (ENISA) is an EU agency created to advance the functioning of the internal market. ENISA is a centre of expertise for the European Member States and European institutions in network and information security, giving advice and recommendations and acting as a switchboard for information on good practices. Moreover, the agency facilitates contacts between European institutions, the Member States, and private business and industry actors.

    Internet: http://www.enisa.europa.eu/

    Acknowledgments

    While compiling this report, we talked extensively over a period of many months to a large number of technical and managerial staff at communications service providers, vendors, and service users. Many of our sources requested that we not acknowledge their contribution. Nonetheless we thank them all here. ENISA would like to express its gratitude to the stakeholders that provided input to the survey.

    Editor: Panagiotis Trimintzios, ENISA

    Authors:
    Chris Hall, Highwayman Associates
    Richard Clayton, Cambridge University
    Ross Anderson, Cambridge University
    Evangelos Ouzounis, ENISA

    Contact

    For more information about this study, please contact:
    [email protected]
    Internet: http://www.enisa.europa.eu/act/res

    18 Apr 2011 (b)

    Legal notice

    Notice must be taken that this publication represents the views and interpretations of the editors and authors, unless stated otherwise. This publication should not be construed to be an action of ENISA or the ENISA bodies unless adopted pursuant to ENISA Regulation (EC) No 460/2004. This publication does not necessarily represent the state of the art in Internet interconnection and it may be updated from time to time.

    Third-party sources are quoted as appropriate. ENISA is not responsible for the content of the external sources including external websites referenced in this publication.

    This publication is intended for educational and information purposes only. Neither ENISA nor any person acting on its behalf is responsible for the use that might be made of the information contained in this publication.

    Reproduction is authorised provided the source is acknowledged.

    © 2010 European Network and Information Security Agency (ENISA), all rights reserved.


    Table of Contents

    Executive Summary
    Introduction to the Summary Report
    1 Summary
    1.1 Scale and Complexity
    1.2 The Nature of Resilience
    1.3 The Lack of Information
    1.4 Resilience and Efficiency
    1.5 Resilience and Equipment
    1.6 Service Level Agreements (SLAs) and Best Efforts
    1.7 Reachability, Traffic and Performance
    1.8 Is Transit a Viable Business?
    1.9 The Rise of the Content Delivery Networks
    1.10 The Insecurity of BGP
    1.11 Cyber Exercises on Interconnection Resilience
    1.12 The Tragedy of the Commons
    1.13 Regulation
    2 Recommendations
    Recommendation 1 Incident Investigation
    Recommendation 2 Data Collection of Network Performance Measurements
    Recommendation 3 Research into Resilience Metrics and Measurement Frameworks
    Recommendation 4 Development and Deployment of Secure Inter-domain Routing
    Recommendation 5 Research into AS Incentives that Improve Resilience
    Recommendation 6 Promotion and Sharing of Good Practice on Internet Interconnections
    Recommendation 7 Independent Testing of Equipment and Protocols
    Recommendation 8 Conduct Regular Cyber Exercises on the Interconnection Infrastructure
    Recommendation 9 Transit Market Failure
    Recommendation 10 Traffic Prioritisation
    Recommendation 11 Greater Transparency Towards a Resilience Certification Scheme
    Respondents to the Consultation


    Executive Summary

    The Internet has so far been extremely resilient. Even major disasters, such as 9/11 and Hurricane Katrina, have had only a local impact. Technical failures have lasted only a few hours, and congestion has had a sustained effect only where the infrastructure is inadequate. The low cost and general reliability of communications over the Internet have led more and more systems to depend on it; we are now at the point where a systemic failure would not just disrupt email and the web, but cause significant problems for other utilities, transport, finance, healthcare and the economy generally. So the continued resilience of the Internet is critical to the functioning of modern societies, and hence it is right and proper to examine whether the mechanisms that have such an excellent track record in providing a resilient Internet are likely to continue to be as effective in the future.

    The focus of this report is the Internet interconnection ecosystem. This holds together all the networks that make up the Internet. The ecosystem is complex and has many interdependent layers. This system of connections between networks occupies a space between and beyond those networks and its operation is governed by their collective self-interest: the Internet has no central Network Operation Centre, staffed with technicians who can leap into action when trouble occurs. The open and decentralised organisation that is the very essence of the ecosystem is essential to the success and resilience of the Internet. Yet there are a number of concerns.

    First, the Internet is vulnerable to various kinds of common-mode technical failures where systems are disrupted in many places simultaneously; service could be substantially disrupted by failures of other utilities, particularly the electricity supply; a flu pandemic could cause the people on whose work it depends to stay at home, just as demand for home working by others was peaking; and finally, because of its open nature, the Internet is at risk of intentionally disruptive attacks.

    Second, there are concerns about sustainability of the current business models. Internet service is cheap, and becoming rapidly cheaper, because the costs of service provision are mostly fixed costs; the marginal costs are low, so competition forces prices ever downwards. Some of the largest operators (the Tier 1 transit providers) are losing substantial amounts of money, and it is not clear how future capital investment will be financed. There is a risk that consolidation might reduce the current twenty-odd providers to a handful, at which point they would start to acquire pricing power and the regulation of transit service provision might become necessary as in other concentrated industries.

    Third, dependability and economics interact in potentially pernicious ways. Most of the things that service providers can do to make the Internet more resilient, from having excess capacity to route filtering, benefit other providers much more than the firm that pays for them, leading to a potential "tragedy of the commons". Similarly, security mechanisms that would help reduce the likelihood and the impact of malice, error and mischance are not implemented because no-one has found a way to roll them out that gives sufficiently incremental and sufficiently local benefit.

    Fourth, there is remarkably little reliable information about the size and shape of the Internet infrastructure or its daily operation. This hinders any attempt to assess its resilience in general and the analysis of the true impact of incidents in particular. The opacity also hinders research and development of improved protocols, systems and practices by making it hard to know what the issues really are and harder yet to test proposed solutions.


    So there may be significant troubles ahead which could present a real threat to economic and social welfare and lead to pressure for regulators to act. Yet despite the origin of the Internet in DARPA-funded research, the more recent history of government interaction with the Internet has been unhappy. Various governments have made ham-fisted attempts to impose censorship or surveillance, while others have defended local telecommunications monopolies or have propped up other industries that were disrupted by the Internet. As a result, Internet service providers, whose good will is essential for effective regulation, have little confidence in the likely effectiveness of state action, and many would expect it to make things worse.

    Any policy should therefore proceed with caution. At this stage, there are four types of activity that can be useful at the European (and indeed the global) level.

    The first is to understand failures better, so that all may learn the lessons. This means consistent, thorough investigation of major outages and the publication of the findings. It also means understanding the nature of success better, by supporting long-term measurement of network performance, and by sustaining research in network performance.

    The second is to fund key research in topics such as inter-domain routing, with an emphasis not just on the design of security mechanisms, but also on traffic engineering, traffic redirection and prioritisation, especially during a crisis, and developing an understanding of how solutions are to be deployed in the real world.

    The third is to promote good practices. Diverse service provision can be encouraged by explicit terms in public sector contracts, and by auditing practices that draw attention to reliance on systems that lack diversity. There is also a useful role in promoting the independent testing of equipment and protocols.

    The fourth is public engagement. Greater transparency may help Internet users to be more discerning customers, creating incentives for improvement, and the public should be engaged in discussions on potentially controversial issues such as traffic prioritisation in an emergency. Finally, Private Public Partnerships (PPPs) of relevant stakeholders (operators, vendors, public actors etc.) are important for self-regulation. In this way, even if regulation of the Internet interconnection system is ever needed after many years, policy makers will be able to make informed decisions leading to effective policies.

    The objective of these activities should be to ensure that when global problems do arise, the decision and policy makers have a clear understanding of the problems and of the options for action.

    There are local regulatory actions that Europe can encourage where needed. Poor telecommunications regulation can lead to the consolidation of local service provision so that cities have fewer independent infrastructures; and in countries that are recipients of EU aid, telecommunications monopolies often deepen the digital divide.

    The aim of all these activities should be to ensure that the Internet is ubiquitous and resilient, with service provided by multiple independent competing firms who have the incentives to provide a prudent level of capacity not just for fair weather, but for when the storms arrive.


    Introduction to the Summary Report

    This study looks at the resilience of the Internet interconnection ecosystem. The Internet is a network of networks, and the interconnection ecosystem is the collection of layered systems that holds it together. The interconnection ecosystem is the core of the Internet, providing the basic function of reaching anywhere from everywhere.

    The Executive Summary above provides an abstract of the report's subject and broad recommendations.

    The Full Report has four parts:

    Part I: Summary and Recommendations
    This contains a more extended examination of the subject and a discussion of our recommendations in detail, followed by the recommendations themselves. This part of the report is based on the parts which follow.

    Part II: State of the Art Review
    This includes a detailed description of the Internet's routing mechanisms and analysis of their robustness at the technical, economic and policy levels. The material in this part supports the analysis presented in Part I, and sets out to explain how and why the issues and challenges the report identifies come about.

    Part III: Report on the Consultation
    As part of the study a broad range of stakeholders were consulted. This part reports on the consultation and summarises the results.

    Part IV: Bibliography and Appendices
    There is an extensive bibliography and summaries of the financial statements of some of the major transit providers.

    This Summary Report is Part I of the Full Report.

    Two sections follow:

    • Section 1 is a summary of the issues and challenges. It is intended to be read as an introduction to the recommendations, giving the background and the rationale for them. It serves also as an introduction to the rest of the Full Report.
    • Section 2 contains our recommendations.

    In the following, section number references to Sections 3 onwards refer to Part II of the Full Report.

    References of the form [C:xx] refer to general points made in the consultation, while those of the form [Q:xx] refer to quotations from the consultation which made a particular, or a particularly apposite, point; those references point to Part III of the Full Report. References of the form [1] refer to the Bibliography, which is in Part IV of the Full Report.

    This revised version of the report replaces the version published in December 2010.


    1 Summary

    The Internet has been pretty reliable so far, having recovered rapidly from most known incidents. The effects of natural disasters such as Hurricane Katrina, terrorist attacks such as 9/11 and assorted technical failures have all been limited in time and space. However, it does appear likely that the Internet could suffer systemic failure, leading perhaps to local failures and system-wide congestion, in some circumstances including:

    • A regional failure of the physical infrastructure on which it depends (such as the bulk power transmission system) or the human infrastructure needed to maintain it (for example if pandemic flu causes millions of people to stay at home out of fear of infection).

    • Cascading technical failures, of which some of the more likely near-term scenarios relate to the imminent changeover from IPv4 to IPv6; common-mode failures involving updates to popular makes of router (or PC) may also fall under this heading.

    • A coordinated attack in which a capable opponent disrupts the BGP fabric by broadcasting thousands of bogus routes, either via a large AS or from a large number of compromised routers.

    There is evidence that implementations of the Border Gateway Protocol (BGP) are surprisingly fragile. There is evidence that some concentrations of infrastructure are vulnerable and significant disruption can be caused by localised failure. There is evidence that the health of the interconnection system as a whole is not high among the concerns of the networks that make up that system; by and large each network strives to provide a service which is reliable, most of the time, at minimum achievable cost. The economics do not favour high dependability as there is no incentive for anyone to provide the extra capacity that would be needed to deal with large-scale failures.

    To date, we have been far from an equilibrium: the rapid growth in capacity has masked a multitude of sins and errors. However, as the Internet matures, as more and more of the world's optical fibre is lit, and as companies jostle for advantage, the dynamics may change.

    There may well not be any immediate cause for concern about the resilience of the Internet interconnection ecosystem, but there is cause for concern about the lack of good information about how it works and how well it might work if something went very badly wrong.

    This section proceeds as follows:

    • in Section 1.1 the challenges posed by the sheer scale and complexity of the Internet interconnection system are discussed.
    • the nature of resilience and the difficulty of assessing it are discussed in Section 1.2.
    • Section 1.3 discusses the information that we do not have, and how that limits our ability to address the issue of resilience, among other things.
    • resilience and efficiency are antipathetic, which raises the challenges given in Section 1.4.
    • the problems posed by the reliability of equipment, and the possibility for systemic failure are covered in Section 1.5.
    • Section 1.6 examines the value of Service Level Agreements in the context of the interconnection system.
    • all parts of the Internet must be able to reach all other parts, so reachability is a key objective. However, being able to reach a destination does not guarantee that traffic will flow to and from there effectively and that expected levels of performance will be met. Section 1.7 discusses the challenges, with particular reference to the behaviour of the system if some event has disabled parts of it.
    • every year the price of transit goes down, and every year people feel it must level off. The reason to believe that the price will tend to zero, and the challenges that poses are discussed in Section 1.8.
    • the rise of the Content Delivery Networks (CDNs) and the effect on the interconnection system is discussed in Section 1.9.
    • Section 1.10 tackles the insecurity of BGP.
    • in Section 1.11 the value of disaster recovery exercises ("war games") is examined.
    • a number of issues are related; tackling them would benefit everybody, but addressing them also costs each network more than they gain individually. This is discussed in Section 1.12.
    • the contentious subject of regulation is raised in Section 1.13.

    1.1 Scale and Complexity

    The Internet is very big and very complicated [C:1].

    The interconnection system we call the Internet comprises some 37,000 Autonomous Systems or ASes (ISPs or similar entities) and 355,000 blocks of addresses (addressable groups of machines), spread around the world as of March 2011 (see Section 3 of the Full Report).

    This enormous scale means that it is hard to conceive of an external event which would affect more than a relatively small fraction of the system; as far as the Internet is concerned, a large earthquake or major hurricane is, essentially, a little local difficulty. However, the failure of a small fraction of the system may still have a significant impact on a great many people. When considering the resilience of this system it is necessary to consider not only the global issues, but a large number of separate, but interconnected, local issues.

    The complexity of the system is partly related to its sheer scale, and the number of interconnections between ASes. This is compounded by a number of factors.

    • Modelling the interconnection system is hard because we only have partial views of it and because it has a number of layers, each with its own properties and interacting with other layers. For example, the connections between ASes use many different physical networks, often provided by third parties, which are themselves large and complicated. Resilience depends on the diversity of interconnections, which in turn depends on physical diversity, which can be an illusion and is often unknown [C:7].

      While it is possible to discover part of the AS-level topology of the Internet (which ASes are interconnected), from a resilience perspective it would be more valuable to know the router-level topology (the number, location, capacity, traffic levels etc. of the actual connections between ASes) [C:2]. If we want to estimate how traffic might move around when connections fail, we also need to know about the routing layer (what routes the routers have learned from each other) so we can estimate what routes would be lost when given connections failed, and what routes would be used instead [C:3]. That also touches on routing policy (the way each AS decides which routes it will prefer) and the traffic layer (where end-user traffic is going to and from). This is perhaps the most important layer, but very little is known about it on a global scale.

    • The interconnection system depends on other complex and interdependent systems. The routers, the links between them, the sites they are housed in, and all the other infrastructure that the interconnection system depends on, themselves depend on other systems (notably electricity supply), and those systems depend in turn on the Internet. [C:8], [Q:3] and [Q:17].

    • The interconnection ecosystem is self-organising and highly decentralised. The decision whether to interconnect is made independently by the ASes, driven by their need to be able to reach, and be reachable from, the entire Internet. The same holds at lower levels: the administrators of an AS configure their routers to implement their routing policy, then the routers select and use routes. But different routers in the same AS may select different routes for a given destination, so even the administrators may not know, a priori, what path traffic will take.

    • The interconnection ecosystem is dynamic and constantly changing. Its shape changes all the time, as new connections are made, or existing connections fail or are removed. At the corporate level, transit providers come and go, organisations merge, and so on. At the industry level, the recent rise of the content delivery networks (CDNs) changed the pattern of interconnections.

    • The patterns of use are also constantly evolving. The rise of the CDNs also changed the distribution of traffic; and while peer-to-peer (P2P) traffic became a large proportion of total traffic in the early-to-mid 2000s, now video traffic of various kinds is coming to dominate both in terms of volume and in terms of growth.

    • The Internet is continuing to grow. In fact, just about everything about it continues to grow: the number of ASes, the number of routes, the number of interconnections, the volume of traffic, etc.
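The point made in the list above about estimating what would be lost when connections fail can be illustrated with a minimal sketch. It assumes that a complete AS-level map is available, which this report stresses is not the case in practice, and it ignores routing policy, capacity and the router-level topology. The toy topology, the AS names and the function names are hypothetical and chosen only for illustration.

```python
from collections import deque


def reachable(adj, src):
    """Return the set of ASes reachable from src by breadth-first search."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen


def without_link(adj, a, b):
    """Return a copy of the topology with the undirected link a-b removed."""
    return {n: {m for m in nbrs if {n, m} != {a, b}} for n, nbrs in adj.items()}


def lost_pairs(adj, a, b):
    """Unordered AS pairs that are connected before, but not after, link a-b fails."""
    after = without_link(adj, a, b)
    lost = set()
    for src in adj:
        gone = reachable(adj, src) - reachable(after, src)
        lost.update(frozenset((src, dst)) for dst in gone)
    return lost


# Hypothetical toy topology: AS1..AS4 form a ring, AS5 hangs off AS4 alone.
TOPOLOGY = {
    "AS1": {"AS2", "AS4"},
    "AS2": {"AS1", "AS3"},
    "AS3": {"AS2", "AS4"},
    "AS4": {"AS1", "AS3", "AS5"},
    "AS5": {"AS4"},
}

print(lost_pairs(TOPOLOGY, "AS1", "AS2"))       # ring link: an alternative path exists
print(len(lost_pairs(TOPOLOGY, "AS4", "AS5")))  # single-homed AS5 is cut off
```

In this toy case the failure of a ring link loses no pairs, while the failure of the only link to AS5 disconnects it from every other AS. Physical connectivity is only an upper bound: under real routing policy, a surviving link will not necessarily carry the displaced traffic.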

    The scale and complexity of the system make it hard to grasp. Resilience is itself a slippery concept, so the resilience of the interconnection system is non-trivial to define, let alone measure!

    This study attempts to provide some insight by describing the workings of the system and what we know about its resilience.

    1.2 The Nature of Resilience

    There is a vast literature on reliability where engineers study the failure rates of components, the prevalence of bugs in software, and the effects of wear, maintenance etc.; the aim being to design machines or systems with a known rate of failure in predictable operating conditions [1]. Robustness relates to designing systems to withstand overloads, environmental stresses and other insults, for example by specifying equipment to be significantly stronger than is needed for normal operation. In traditional engineering, resilience was the ability of a material to absorb energy under stress and release it later. In modern systems thinking, it means the opposite of brittleness and refers to the ability of a system or organisation to adapt and recover from a serious failure, or more generally to its ability to survive in the face of threats, including the prevention or mitigation of unsafe, hazardous or detrimental conditions that threaten its existence [2]. In the longer term, it can also mean evolvability: the ability of a system to adapt gradually as its environment changes, an idea borrowed from systems biology [3][4].

    Resilience of a system is defined as the ability to provide and maintain an acceptable level of service in the face of various faults and challenges to normal operation (footnote 1). That is, the ability to adapt and recover from a serious failure or, more generally, to survive in the face of threats. A given event may have some impact on a system and hence some immediate impact on the service it offers. The system will then recover, service levels will improve, and at some time full service will be restored.
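The sequence just described (an event, an immediate impact on service, then recovery to full service) can be summarised from a record of service levels. The following is a minimal sketch, assuming service is sampled as a fraction between 0 and 1 at known times. The function name, the sample data and the two summary figures are illustrative assumptions, not metrics proposed by this report.

```python
def summarise_event(samples, full_service=1.0):
    """Summarise one incident from (time, service_level) samples in time order.

    Returns the deepest drop below full service and the time from the first
    degraded sample to the first sample after which service stays at full.
    """
    degraded = [i for i, (_, level) in enumerate(samples) if level < full_service]
    if not degraded:
        return {"impact": 0.0, "time_to_restore": 0.0}
    first, last = degraded[0], degraded[-1]
    impact = full_service - min(level for _, level in samples)
    if last + 1 < len(samples):
        time_to_restore = samples[last + 1][0] - samples[first][0]
    else:
        time_to_restore = None  # still degraded when the record ends
    return {"impact": impact, "time_to_restore": time_to_restore}


# Hypothetical record: time in hours, service as a fraction of normal.
SAMPLES = [(0, 1.0), (1, 0.4), (2, 0.7), (3, 0.9), (4, 1.0), (5, 1.0)]
print(summarise_event(SAMPLES))
```

Two numbers cannot capture whether the intermediate levels of service were acceptable, which is part of the difficulty of measuring resilience.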

    Resilience therefore refers both to failure recovery at the micro level, as when the Internet recovers from the failure of a router so quickly that users perceive a connection failure of perhaps a few seconds (if they notice anything at all); through coping with a mid-size incident, as when ISPs provided extra routes in the hours immediately after the 9/11 terrorist attacks by running fibres across collocation centres; to disaster recovery at the strategic level, where we might plan for the next San Francisco earthquake or for a malware compromise of thousands of routers. In each case the desired outcome is that the system should continue to provide service in the event of some part of it failing, with service degrading gracefully if the failure is large.

    There are thus two edge cases of resilience:

    1. the ability of the system to cope with small local events such as equipment failures and reconfigure itself essentially automatically and over a time scale of seconds to minutes. This enables the Internet to cope with day-to-day events with little or no effect on service: it is reliable. This is what most network engineers think of as resilience.

    2. the ability of a system to cope with and recover from a major event, such as a large natural disaster or a capable attack, on a time scale of hours to days or even longer. This type of resilience includes, first, the ability of the system to continue to offer some service in the immediate aftermath, and second, the ability to repair and rebuild thereafter. The key words here are "adapt" and "recover". This disaster recovery is what civil authorities tend to think of as resilience.

    This study is interested in the resilience of the ecosystem in the face of events which have medium to high impact and which have a correspondingly medium to low probability. It is thus biased toward the second of these cases.

    Robustness is an important aspect of resilience. A robust system will have the ability to resist assaults and insults, so that whatever some event is throwing at it, it will be unaffected, and no resilient response is required. While resilience is to do with coping with the impact of events, robustness is to do with reducing the impact in the first place. The two overlap, and from the user's perspective these are fine distinctions; what the user wants is for the system to be predictably dependable.

    Footnote 1: Following James P. G. Sterbenz, David Hutchison, Egemen K. Çetinkaya, Abdul Jabbar, Justin P. Rohrer, Marcus Schöller and Paul Smith: "Resilience and survivability in communication networks: Strategies, principles, and survey of disciplines", Computer Networks, Volume 54, Issue 8, 1 June 2010, Pages 1245-1265, Resilient and Survivable networks.


    Resilience is context-specific. Robustness can be sensibly defined only in respect of specified attacks or failures, and in the same way resilience also makes sense only in the context of recovery from specified events, or in the face of a set of possible challenges of known probability. We call bad events of known probability "risk", but there is a separate problem of "uncertainty" where we do not know enough about possible future bad events to assign them a probability at all. In the face of uncertainty, it is difficult to assess a combination of intermediate levels of service and recovery/restoration times, especially when what is acceptable may vary depending on the nature and scale of the event. [C:5]

    Moreover, no good metrics are available to actually assess the performance of the Internet or its interconnection system. This makes it harder still to specify acceptable levels of service. For the Internet the problem is compounded by its scale and complexity (see above) and by lack of information (see below), which make it hard to construct a model which might be used to attach numbers to resilience. It is even hard to assess what impact a given single event might have: an earthquake in San Francisco of a given severity may have a predictable impact on the physical infrastructure, but that needs to be translated into its effect on each network, and hence the effect on the interconnection system.

    Given these difficulties (and there are many more), service providers commonly fall back on measures that improve resilience in general terms, in the hope that this will improve their response to future challenges. This qualitative approach runs into difficulty when the cost of an improvement must be justified on much more restricted criteria. For the Internet as a whole, the cost justification of investment in resilience is an even harder case to make.

    1.3 The Lack of Information

    Each of the ASes that make up the Internet has a Network Operation Centre (NOC), charged with monitoring the health of the AS's network and instigating action when problems occur. There is no NOC for the Internet.

    In fact it is worse than that. ASes understand their own networks but know little about anyone else's. At every level of the interconnection system, there is little global information available, and what is available is incomplete and of unknown accuracy. In particular:

    • there is no map of physical connections (their location, capacity, etc.);
    • there is no map of traffic and traffic volume;
    • there is no map of the interconnections between ASes (what routes they offer each other).

    The Internet interconnection system is, essentially, opaque. This opacity hampers the research and development communities in their attempts to understand the workings of the Internet, and to develop and test improvements; it makes the study and modelling of complex emergent properties such as resilience harder still. [C:2], [Q:1] and [Q:2].

    Thelackofinformationhasanumberofcauses:

    Complexityandscale.Tomapthenetworksoffibrearoundtheworldmightbeatractableproblem.Overthosephysicalfibresrunmanydifferentlogicalconnections,eachofwhichwill

    carrynetworktrafficfornumerousproviders,whichinturnsupportyetmoreprovidersnetworksandcircuitsrapidlymultiplyingupthecombinationsandpermutationsof

    overlappinguseoftheunderlyingfibre.Furthermore,notallthosethingsarefixedproviders

  • 8/2/2019 Inter-X Summary Report April-11b

    12/31

    InterX:ResilienceoftheInternetInterconnectionEcosystem

    SummaryReportApril201112

    rerouteexistingnetworksandcircuitsastheyextendoradapttheirnetworks.Tokeeptrack,

    meticulousrecordkeepingisrequired,butevenwithinasingleASitisnotalwaysachieved.Atagloballevel,measuringtrafficvolumeswouldbeanimmenseundertaking,giventhesheer

    numberofconnectionsbetweennetworks.

    Theinformationhidingpropertiesoftheroutingsystem.Whentryingtomapconnectionsbyprobingthesystemfromtheoutside,eachprobewillrevealsomethingaboutthepath

    betweentwopointsintheInternetatthetimeoftheprobe.Buttheproberevealslittleabout

    whatotherpathsmayexistatothertimes,orwhatpathmightbetakenifanypartoftheusual

    pathisnotworking,orwhattheperformanceofthoseotherpathsmightbe.

    Securityconcerns.Mappingthephysicallayeristhoughttobeaninvitationtopeoplewithbadintentionstoimprovetheirtargetselectionsothosemapsthatdoexistareseldomshared.

    The cost of storing and processing the data. If there were complete information, there would be a very great deal of it, and more would be generated every minute. Storing it and processing it into a usable form would be a major engineering task.

    Commercial sensitivity. Information about whether, how and where networks connect to each other is deemed commercially sensitive by some. Information about traffic volumes is quite generally seen as commercially sensitive. Because of this, some advocate powerful incentives to disclose information, and possibly in an anonymised and aggregated form. [C:23]

    Critical information is not collected in the first place, or not kept up to date. Information gathering and maintenance costs money, so there must be some real use for it before a network will bother to gather it or strive to keep it up to date. The Internet Routing Registries (IRRs) are potentially excellent resources, but are not necessarily up to date, complete or accurate, because the information seldom has operational significance (and may in any case be deemed commercially sensitive).

    Lack of good metrics. While there are some well-known metrics for the performance of connections between two points in a network, there are none for a network as a whole or, indeed, a network of networks. ENISA has already started working in this direction, looking at resilience metrics from a holistic point of view (footnote 2).

    The poor state of information reflects not only the difficulty of finding or collecting data, but also the lack of good ways to process and use it even if one had it.

    1.3.1 Incidents as a Source of Information

    Small incidents occur every day, and larger ones every now and then. Given the lack of information about the interconnection system, the results of these natural experiments tell us much of what we know about its resilience. [C:4]. For example, we know the following.

    It is straightforward to divert traffic away from its proper destination by announcing invalid routes. The well-known incident in February 2008 in which YouTube stopped working for a few hours is one example; see Section 5.6.4. More publicity, and political concern, was raised

    Footnote 2: http://www.enisa.europa.eu/act/res/other-areas/metrics


    by a 2010 incident in which China Telecom advertised a number of invalid routes, effectively hijacking 15% of Internet addresses for 18 minutes; see Section 5.6.9.

    Latent bugs in BGP implementations can disrupt the system. Most recently, in August 2010, an experiment that sent an unusual (but entirely legal) form of route announcement triggered a bug in some routers, causing their neighbours to terminate BGP sessions, and for many routes to be lost. The effects of this incident lasted less than two hours; see Section 5.6.5.

    In some parts of the world a small number of cable systems are critical. Undersea cables near Alexandria in Egypt were cut in December 2008. Interestingly, three cable systems were affected at the same time, and two of those systems had been affected similarly in January/February of that year. This seriously affected traffic for perhaps two weeks. See Section 5.6.6.

    The system is critically dependent on electrical power. A large power outage in Brazil in November 2009 caused significant disruption, though it lasted only four and a half hours; see Section 5.6.6. Interestingly, previous blackouts in Brazil had been attributed to hackers, suggesting that these incidents are examples of the risk of interdependent networks. This particular conspiracy theory has been refuted.

    The ecosystem can work well in a crisis. The analysis of the effect of the destruction at the World Trade Centre in New York on 11th September 2001 shows that the system worked well at the time, and in the days thereafter, even though large cables under the buildings were cut and other facilities were destroyed or damaged. Generally, Internet services performed better than the telephone system (fixed and mobile). See Section 5.6.10.

    These sorts of incident are well known. However, hard information about the exact causes and effects is hard to come by: much is anecdotal and incomplete, while some is speculative or simply apocryphal. Valuable information is being lost. The report "The Internet under Crisis Conditions: Learning from September 11" [5] is a model of clarity; but even there the authors warn:

    ... While the committee is confident in its assessment that the events of September 11 had little effect on the Internet as a whole ..., the precision with which analysts can measure the impact of such events is limited by a lack of relevant data.

    1.4 Resilience and Efficiency

    There are fundamental tensions between resilience and efficiency. [Q:5] Resilience requires spare capacity and duplication of resources, and systems which are loosely coupled (made up of largely independent subsystems) are more resilient than tightly coupled systems whose components depend more on each other. But improving the efficiency of a system generally means eliminating excess capacity and redundant resources.

    A more diverse system is generally a more resilient one, but diversity adds to cost and complexity. Diversity of connections is most efficiently achieved using infrastructure whose cost is shared by many operators, but collective action problems can undermine the resilience gain [C:7][Q:9]. It is efficient to avoid duplication of effort in the development of software and equipment, and efficient to exploit economies of scale in its manufacture, but this reduces the diversity of equipment used [C:9]. It is efficient for the entire Internet to depend on one protocol for its routing, but this creates a single point of failure. Setting up and maintaining multiple, diverse, separate connections to other networks costs time and effort and creates extra complexity to be managed [C:6].


    The Internet is a loosely coupled collection of independently managed networks. However, at its core there are a few very large networks, each of which strives to be as efficient as possible both internally and in its connections to other networks. So it is an open question whether the actual structure of the Internet is as resilient as its architecture would suggest. In the past it has been remarkably resilient, and it has continued to perform as it has evolved from a tiny network connecting a handful of research facilities into the global infrastructure that connects billions today. However, as in other areas, past performance is no guarantee of future results.

    1.5 Resilience and Equipment

    A particular concern for the interconnection system is the possibility of an internal technical problem that could have a systemic effect. The imminent changeover to IPv6 will provide a high-stress environment in which such a problem could be more likely to manifest itself, and the most likely proximate cause of such a problem is bugs in BGP implementations, which could be serious given the small number of vendors of this kind of equipment. [C:9] There have been a number of incidents in which large numbers of routers across the entire Internet have been affected by the same problem: something unprecedented and unexpected which exposes a bug in the software, and occasionally in the specification of BGP.

    No software is free from bugs, but the universal dependence on BGP makes bugs there more serious. ISPs may test equipment before buying and deploying it, but those tests concentrate on issues directly affecting the ISP, such as the performance of the equipment and its ability to support the required services. Manufacturers test their equipment as part of their development process. But the interests of both ISPs and manufacturers are for the equipment to work well under normal circumstances. Individual ISPs cannot afford to do exhaustive testing of low-probability scenarios for the benefit of the Internet at large. The manufacturers for their part balance the effort and time spent testing against their customers' demands for new and useful features, new and faster routers and less expensive software. Also of concern is how secure routers and routing protocols are against deliberate attempts to disrupt or suborn them.

    A number of respondents to the consultation felt that money spent on testing equipment and protocols would be money well spent. [C:10]

    1.6 Service Level Agreements (SLAs) and 'Best Efforts'

    In any market in which the buyer has difficulty in establishing the relative value of different sellers' offerings, it is common for sellers to offer guarantees to support their claims to quality. Service Level Agreements (SLAs) perform that function in the interconnection ecosystem. From a resilience perspective, it would be nice to see ISPs offering SLAs that covered not just their own networks but the interconnection system too, and customers preferring to buy service with such SLAs.

    Unfortunately, SLAs for Internet access in general are hard, and for transit service are of doubtful value [C:20]. In particular, where an operator offers an SLA, it does not extend beyond the borders of their network [C:19]; so whatever their guarantees are, they do not cover the interconnection system, that is, the part between the borders of all networks.

    The SLAs offered to end-customers by their ISPs reflect the SLAs that ISPs obtain from their transit providers and peers. The standard SLAs offered to end-customers may be published, but the SLAs offered between networks may be part of contracts that are kept confidential. Given how little such SLAs are generally thought to cover, it is an open question how much information is being hidden


    here, but it is another aspect of the general lack of information about the ecosystem at all levels. (The consultation asked specifically about inter-provider agreements; see Section 9, Question 8.)

    Providers do not attempt to guarantee anything beyond their borders because they cannot. Any such guarantee would require a back-to-back system of contracts between networks so that liability for a failure to perform would be borne by the failing network. That system of contracts does not exist, not least because the Internet is not designed to guarantee performance. It is fundamental to the current Internet architecture that packets are delivered on a 'best efforts' basis, that is, the network will do its best but it does not guarantee anything. The Internet leaves the hard work of maintaining a connection to the end-points of the connection: the 'end-to-end' principle. The Transmission Control Protocol (TCP), which carries most Internet traffic apart from delay-sensitive traffic, will reduce demand if it detects congestion: it is designed to adapt to the available capacity, not to guarantee some level of performance.
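
    The way TCP adapts to the available capacity, rather than guaranteeing a level of performance, can be illustrated with a toy additive-increase/multiplicative-decrease loop. This is a minimal sketch and not real TCP: the function name aimd_rates, the capacity figure, the one-unit increase and the halving factor are all illustrative assumptions.

```python
def aimd_rates(capacity, rounds, start=1.0, increase=1.0, decrease=0.5):
    """Return the sending rate in each round of a toy AIMD loop.

    The sender raises its rate by `increase` each round. When the rate
    exceeds `capacity` (taken here as the congestion signal) it
    multiplies the rate by `decrease` instead.
    """
    rate = start
    history = []
    for _ in range(rounds):
        history.append(rate)
        if rate > capacity:
            rate *= decrease   # congestion detected: back off
        else:
            rate += increase   # no congestion: probe for more capacity
    return history


# Hypothetical link capacity of 10 units, observed for 30 rounds.
rates = aimd_rates(capacity=10.0, rounds=30)
print(rates)
```

    The rate oscillates around the capacity and never settles on a promised value, which is the sense in which the sender adapts to the network rather than the network guaranteeing anything to the sender.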

    The other difficulty with SLAs is what can and what should be measured. For a single connection between a and b it is clear what can be measured, but it is not clear what level of performance could be guaranteed, or by whom. Consider a connection from a in one network to b in another network, which traverses four other networks and the connections between them:

    Figure 1: Connection between a and b

    All these networks are independent, and have their own SLAs, each extending only as far as their borders. If we follow the money, a is paying directly and indirectly for packets to and from the connection between networks Y and Z. Similarly, b is paying for packets to and from the mid-point on the other side. If network Q has low standards, or is having a bad day, to whom does a complain? Network X has a contract with a's network, and offers an SLA, but that does not extend beyond X. Network Y has a contract with X, with a different SLA, but even if X complained to Y about its customer's problem we have come to the end of the money trail: Y cannot hold Z to account for the performance of Q. Suppose a were to demand a strong SLA from their provider: X certainly has no way of imposing some standard of service on Q, and simply cannot offer to make any guarantee.

    Even if it were possible to establish an end-to-end SLA for this connection, and pin liability on the failing network, there are hundreds of thousands of paths between a's network and the rest of the Internet. The problem is intractable. So whatever value SLAs have, they do not offer a contractual framework through which customers can influence the resilience of the interconnection system, even if they wanted to. In addition, few customers understand the issue, or care to do anything about it. Generally the Internet is remarkably reliable, so customers' principal interest in choosing a supplier is price, possibly moderated by the supplier's reputation. [C:18]
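
    The reason per-network SLAs say little about the end-to-end connection can be shown with some simple arithmetic. The sketch below assumes that failures in each network are independent, that the path runs through the networks in the order listed, and that each network has the availability shown; all of the figures are hypothetical and none comes from the study.

```python
from math import prod


def end_to_end_availability(per_network):
    """Availability of a path, assuming independent failures per network."""
    return prod(per_network.values())


# Hypothetical availabilities for the networks on the path from a to b.
path = {
    "a's network": 0.999,
    "X": 0.999,
    "Y": 0.999,
    "Z": 0.999,
    "Q": 0.99,  # the network with "low standards"
    "b's network": 0.999,
}

total = end_to_end_availability(path)
weakest = min(path, key=path.get)
print(f"end-to-end availability: {total:.4f}")
print(f"weakest network: {weakest}")
```

    Under these assumptions the end-to-end figure is lower than that of any single network on the path, and it is dominated by Q, with which neither a nor X has any contract.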

    1.7 Reachability, Traffic and Performance

    While end-users care about traffic and performance, the basic mechanism of the interconnection system (BGP) only understands reachability [Q:11]. Its function is to provide a way for every network to reach every other network, and for traffic to flow across the Internet from one network to another. All ASes (the ISPs and other networks that make up the Internet) speak BGP to each other,


    and reachability information spreads across the BGP 'mesh' of connections between them. BGP is the heart of the interconnection system, so its many deficiencies are a problem. [Q:16]

    The problems with the protocol itself include:

    there is no mechanism to verify that the routing information distributed by BGP is valid. In principle traffic to any destination can be diverted, so traffic can be disrupted, modified, examined or all three. These security issues are discussed separately in Section 1.10.

    there is no mechanism in BGP to convey capacity information, so BGP cannot help reconfigure the interconnection system to avoid congestion. [Q:12] When a route fails, BGP will find another route to maintain reachability, but that route may not have sufficient capacity for the traffic it now receives.

    the mechanisms in BGP which may be used to direct traffic away from congestion in other networks ('inter-domain traffic engineering') are strictly limited.

    when things change BGP can be slow to settle down ('converge') to a new, stable state. [C:12]

    the ability of BGP to cope, or cope well, under extreme conditions is not assured.
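
    The point about capacity can be made concrete with a toy topology. The sketch below is not BGP: it assumes four hypothetical ASes, picks the path with the fewest hops as a crude stand-in for route selection, and uses made-up link capacities and demand. It only shows that a replacement path can preserve reachability while lacking the capacity for the traffic it inherits.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search for a path with the fewest hops, or None."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def bottleneck(links, path):
    """Smallest link capacity along the path."""
    return min(links.get((a, b), links.get((b, a)))
               for a, b in zip(path, path[1:]))


# Hypothetical topology: link -> capacity (arbitrary units).
links = {
    ("AS1", "AS2"): 100,
    ("AS2", "AS4"): 100,
    ("AS1", "AS3"): 10,
    ("AS3", "AS4"): 10,
}
demand = 40  # hypothetical traffic from AS1 to AS4

before = shortest_path(links, "AS1", "AS4")

# The AS2-AS4 link fails; route selection only looks for reachability.
links_after = {k: v for k, v in links.items() if k != ("AS2", "AS4")}
after = shortest_path(links_after, "AS1", "AS4")

congested = bottleneck(links_after, after) < demand
print(before, bottleneck(links, before))
print(after, bottleneck(links_after, after), "congested:", congested)
```

    Nothing in the path search consults the capacity figures, which mirrors the observation that BGP carries no capacity information; detecting and relieving the resulting congestion is left to the operators.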

    End-users expect to be able to reach every part of the Internet, so reachability is essential. But they also expect to be able to move data to and from whatever destination they choose, so they expect their connection with that destination to perform well. As BGP knows nothing about traffic, capacity or performance, network operators must use other means to meet end-users' expectations. When something in the Internet changes, BGP will change the routes used to ensure continuing reachability, but it is up to the network operators to ensure that the result will perform adequately, and take other steps if it does not.

    Service quality in a 'best efforts' network is all to do with avoiding congestion, for which it is necessary to ensure that there is always sufficient capacity. The most effective way to do that is to maintain enough spare capacity to absorb the usual short-term variations in traffic and provide some safety margin. Additional spare capacity may be maintained to allow time (weeks or months, perhaps) for new capacity to be installed to cater for long-term growth of traffic. Maintaining spare capacity in this way is known as 'over-provisioning'; it is key to day-to-day service quality and to the resilience of the interconnection system.

    Each operator constantly monitors its network for signs of congestion and will make adjustments to relieve any short-term issues. In general the pattern of traffic in a network of any size is stable from day to day and month to month. An operator will also monitor their network for long-term trends in traffic. The management of capacity is generally done on the basis of history, experience and rules of thumb, supported by systems for gathering and processing the available data. The levels of spare capacity in any network will depend on many things, including how the operator chooses to balance the cost of spare capacity against the risk of congestion.

    A key point here is that capacity is managed on the basis of actual traffic and the usual day-to-day events, with some margin for contingencies and growth. Capacity is not managed on the basis of what might happen if some unusual event causes a lot of traffic to shift from one network to another. If an event has a major impact on the interconnection system, then the amount of spare capacity within and between networks will determine the likelihood of systemic congestion. So each individual network's degree of over-provisioning makes some contribution to the resilience of the whole, though it is hard to say to what extent.


    If an event disables some part of the Internet, BGP will work to ensure that reachability is maintained, but the new paths may have less capacity than the usual ones, which may result in congestion. For many applications, notably web-browsing, the effect is to slow things down, but not stop them working. More difficulties arise with any sort of data that is affected by reduced throughput or increased delay, such as VoIP and streaming video. Congestion may stop these applications working satisfactorily, or at all.

    The important distinction between reachability and traffic is illustrated by considering what appears to be a simple metric for the state of the Internet: the percentage of known destinations that are reachable from most of the Internet at any given moment. This metric may be used to gauge the impact of a BGP failure, or of the failure of some critical fibre, or any other widely felt event. But while the significance of, say, 10% of known destinations becoming unreachable is obviously extremely high for the 10% cut off, it may not be terribly significant for the rest of the Internet. We would prefer to know the amount, and possibly the value, of traffic that is affected. If the 10% cut off accounts for a large proportion of the remaining 90%'s traffic, the impact could be significant. So when talking about the resilience of the system, what is an acceptable level of the 'best efforts' service? Are we aiming at having email work 95% of the time to 95% of destinations, or streaming video work 99.99% of the time to 99.99% of destinations? The answer will have an enormous effect on the spare capacity needed! Each extra order of magnitude improvement (say from 99% to 99.9%) could cost an order of magnitude more money; yet the benefits of service quality are unevenly distributed. For example, a pensioner who uses the Internet to chat to grandchildren once a week may be happy with 99% or even 90%, while a company providing a cloud-based business service may need 99.99% or more.
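
    Two pieces of arithmetic sit behind this paragraph: the gap between a count of reachable destinations and a traffic-weighted measure, and what each extra 'nine' of availability means in hours. The sketch below works both through on invented data; the destination list, the traffic figures and the function names are assumptions made purely for illustration.

```python
def reachable_share(destinations):
    """Fraction of destinations that are reachable (unweighted)."""
    reachable = sum(1 for d in destinations if d["reachable"])
    return reachable / len(destinations)


def traffic_share_affected(destinations):
    """Fraction of total traffic that involved unreachable destinations."""
    total = sum(d["traffic"] for d in destinations)
    lost = sum(d["traffic"] for d in destinations if not d["reachable"])
    return lost / total


def downtime_hours_per_year(availability, hours_per_year=365 * 24):
    """Hours per year for which a service at this availability is down."""
    return (1 - availability) * hours_per_year


# Hypothetical: ten destinations, one cut off, but it carried most traffic.
destinations = [{"reachable": True, "traffic": 5} for _ in range(9)]
destinations.append({"reachable": False, "traffic": 55})

print(f"reachable destinations: {reachable_share(destinations):.0%}")
print(f"traffic affected: {traffic_share_affected(destinations):.0%}")
for target in (0.99, 0.999, 0.9999):
    hours = downtime_hours_per_year(target)
    print(f"{target:.2%} availability allows {hours:.2f} hours down per year")
```

    In this made-up case the reachability metric still reads 90% while more than half of the traffic is affected, and each additional nine cuts the permitted downtime by a factor of ten.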

    1.7.1 Traffic Prioritisation

    In a crisis it is common for access to some resources to be restricted, to shed demand and free up capacity. For telephony a traditional approach is to give emergency services priority. But restricting phone service to 'obvious' emergency workers such as doctors is unsatisfactory. Modern medical practice depends on team working and can be crippled if nurses are cut off; and many patients who depend on home monitoring may have to be hospitalised if communications fail.

    If capacity is lost in a disaster and parts of the system are congested, then all users of the congested parts will suffer a reduction in service, and some types of traffic (notably VoIP) may stop working effectively. If some types, sources or destinations of traffic are deemed to be important, and so should be given priority in a crisis, then serious thought needs to be given to how to identify priority traffic, how the prioritisation is to be implemented and how turning that prioritisation on and off fits into other disaster planning. [Q:19]

    It is not entirely straightforward to identify different types of traffic. So an alternative approach may be to prioritise by source or destination. It may be tempting to consider services such as Facebook or YouTube as essentially trivial, and YouTube uses a lot of bandwidth. However, in a crisis keeping in contact using Facebook may be a priority for many. Moreover, shutting down YouTube in a crisis (thereby preventing the free reporting of events) would require solid justification. On the other hand, rate limiting ordinary users, irrespective of traffic type, may appear fair, but could affect essential VoIP use, and cutting off peer-to-peer traffic could be seen as censorship.

    So it is inappropriate for ISPs to decide to discriminate between different sorts of traffic, or between customers of the same type (although premium customers at premium rates might expect to get better performance in a crisis). [Q:21] It is not even clear that ISPs are, in general, capable of


    prioritising some traffic on any given basis. So, if some traffic should be prioritised in a crisis, who will make the call, and will anyone be ready to act when they do?

    It is clear that this challenge entails both technical and policy aspects. The former are related mainly to the mechanisms that should exist in network equipment to support traffic prioritisation. The latter refer mainly to the policies that specify what traffic should be given priority. It is very important to tackle both aspects of the problem.

    1.7.2 Traffic Engineering

    Traffic Engineering is the jargon term for adjusting a network so that traffic flows are improved. In a crisis that would mean shifting traffic away from congested paths. This is less controversial than traffic prioritisation, but no less difficult.

    When some event creates congestion in some part(s) of the interconnection system it would be convenient if networks could redirect some traffic away from the congested parts. When a network is damaged its operators will work to relieve congestion within their network by doing internal traffic engineering, adding temporary capacity, repairing things, and so on. One of the strengths of the Internet is that each operator will be working independently to recover its own network as quickly and efficiently as possible.

    Where a network's users are affected by congestion in other networks, the simplest strategy is to wait until those networks recover. This may leave spare capacity in other networks unused, so is not the optimum strategy for the system as a whole. However, there are two problems with trying to coordinate action:

    1. there is no way of telling where the spare capacity in the system is;

    2. BGP provides very limited means to influence traffic in other operators' networks.

    In effect, if networks attempt to redirect traffic they are blundering around in the dark, attempting to make adjustments to a delicate instrument with a hammer. Their attempts to redirect traffic may create congestion elsewhere, which may cause more networks to try to move traffic around. It is possible to imagine a situation in which many networks are chasing each other, creating waves of congestion and routing changes as they do, like the waves of congestion that pass along roads which are near their carrying capacity.

    With luck, if a network cannot handle the traffic it is sent and pushes it away to other networks, it will be diverted towards spare capacity elsewhere. Given enough time the system would adapt to a new distribution of capacity, and a new distribution of traffic. It is impossible to say how much time would be required; it would depend on the severity of the capacity loss, but it could be days or even weeks.

    Strategic local action will not necessarily lead to a socially optimal equilibrium, though, as the incentives may be perverse. Since any SLA will stop at the edge of its network, a transit provider may wish to engineer traffic away from its network in order to meet its SLAs for traffic within its network. The result may still be congestion, somewhere, but the SLA is still met.


    1.7.3 Routing in a Crisis

    Experience shows that in a crisis the interconnection system can quite quickly create new paths between networks to provide interim connections and extra capacity, for example in the aftermath of the 9/11 attack, as discussed above.

    The interconnection ecosystem has often responded in this way, with many people improvising, and working with the people they know personally. [C:13] This is related to traffic engineering, to the extent that it addresses the problem by adding extra connections to which traffic can be moved. The response of the system might be improved and speeded up if there were more preparation for this form, and perhaps other forms, of cooperation in a crisis. [C:14]

    In the end, if there is insufficient capacity in a crisis, then no amount of traffic engineering or manual reconfiguration will fit a quart of traffic into a pint of capacity. In extreme cases some form of prioritisation would be needed.

    1.8 Is Transit a Viable Business?

    The provision of transit (the service of carrying traffic to every possible destination) is a key part of the interconnection system, but it may not be a sustainable business in the near future.

    Nobody doubts that the cost of transit has fallen fast, or that it is a commodity business, except where there is little or no competition. In the US, over the last ten to fifteen years transit prices have fallen at a rate of around 40% per annum, which results in a 99% drop over a ten-year period. In other parts of the world prices started higher, but as infrastructure has developed, and transit networks have extended into new markets, those prices have fallen; for example, prices in London are now scarcely distinguishable from those in New York.
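
    The step from 40% per annum to a 99% drop over ten years is a matter of compounding, and can be checked in a few lines. The function name is illustrative, and the calculation assumes a constant annual rate of decline, which the text gives only as an approximate figure.

```python
def remaining_fraction(annual_decline, years):
    """Fraction of the starting price left after compounding the decline."""
    return (1 - annual_decline) ** years


remaining = remaining_fraction(0.40, 10)
print(f"price after 10 years: {remaining:.2%} of the starting price")
print(f"cumulative drop: {1 - remaining:.1%}")
```

    Ten successive falls of 40% leave well under 1% of the starting price, which is consistent with the 99% figure quoted above.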

    Where there is effective competition, the price of transit falls, and consumers benefit. In a competitive market, price tends towards the marginal cost of production. The total cost of production has fallen sharply, as innovation reduces the cost of the underlying technologies and with increasing economies of scale. Yet every year industry insiders feel that surely nobody can make money at today's prices, and that there must soon be a levelling off. So far there has been no levelling off, though the rate at which prices fall may be diminishing.

    The reason is simple: the marginal cost of production for transit service is generally zero. At any given moment there will be a number of transit providers with spare capacity: first, network capacity comes in lumps, so each time capacity is added the increment will generally exceed the immediate need; second, networks are generally over-provisioned, so there is always some spare capacity, though eating into that may increase the risk of congestion, perhaps reducing service quality at busy times or when things go wrong.

    The logic of this market is that the price for transit will tend towards zero. So it is unclear how pure transit providers could recoup their capital investment. The logic of the market would appear to favour consolidation until the handful of firms left standing acquire 'market power'.

    At a practical level, the provision of transit may be undertaken not to make profits, but to offset some of the cost of being an Internet network. For some networks the decision to offer transit at the market price may be increasingly a strategic rather than a commercial decision. Another significant factor is the recent and continuing increase in video traffic and the related rise in the amount of traffic delivered by the Content Delivery Networks (CDNs, see below). This means that the continued


    reduction in the unit price for transit is not being matched by an increase in transit traffic, so transit providers' revenues are decreasing.

    The acknowledged market leader, Level 3, lost $2.9 billion in 2005-2008, a further $0.6 billion in 2009, and another $0.6 billion in 2010. It is not possible to say what contribution their transit business made to this; industry insiders note that Level 3 did not go through bankruptcy as many others did, and would make a small profit if it were not for the cost of servicing its debt. However, the industry as a whole is losing large amounts of money (we summarise some of the major providers' financial statements in Appendix II).

    1.9 The Rise of the Content Delivery Networks

    Over the past four years or so, more and more traffic has been delivered by Content Delivery Networks (CDNs). Their rise has been rapid and has changed the interconnection landscape, concentrating a large proportion of Internet traffic into a small number of networks. This shift has been driven by both cost and quality considerations. With the growth of video content, of ever richer web-sites, and of cloud applications, it makes sense to place copies of popular data closer to the end users who fetch it. This has a number of benefits:

    local connections perform better than remote ones, giving quicker response and faster transfers.

    costs are reduced because the data is not being repeatedly transported over large distances, saving on transit costs. However, the key motivation for the customers of CDNs is not to reduce the cost of delivery, but to ensure quality and consistency of delivery, which is particularly important for the delivery of video streams;

    the data are replicated, stored in and delivered from a number of locations, improving resilience.

    This has moved traffic away from transit providers to peering connections between the CDNs and the end-users' ISP. In some cases content is distributed to servers within the ISP's own network, bypassing the interconnection system altogether.

    One CDN claims to deliver some 20% of all Internet traffic. Since the traffic being delivered is the sort which is expected to grow most quickly in the coming years, this implies that an increasing proportion of traffic is being delivered locally, and a reducing proportion of traffic is being carried (over long distances) by the transit providers.

    Another effect of this is to add traffic at the Internet Exchange Points (IXPs), which are the obvious way for the CDNs to connect to local ISPs. This adds value to the IXP, which is particularly welcome for the smaller IXPs, which have been threatened by the ever-falling cost of transit (eating into the cost advantage of connecting to the IXP) and the falling cost of connecting to remote (larger) IXPs (where there is more opportunity to pick up traffic).

    There is a positive effect on resilience, and a negative one. The positive side is that systems serving users in one region are independent of those serving users in other regions, so a lot of traffic becomes less dependent on long-distance transit services. On the negative side, CDNs are now carrying so much traffic that if a large one were to fail, transit providers could not meet the added demand, and some services would be degraded. CDNs also concentrate ever more infrastructure in


    places where there is already a lot of it. If parts of some local infrastructure fail for any reason, will there be sufficient other capacity to fall back on?

    Finally, it is possible to count a couple of dozen CDNs quite quickly, but it appears that perhaps two or three are dominant. Some of the large transit providers have entered the business, either with their own infrastructure or in partnership with an existing CDN. There are obvious economies of scale in the CDN business, and there is now a significant investment barrier to entry. The state of this market in a few years' time is impossible to predict, but network effects tend to favour a few, very large, players. These players are very likely to end up handling over half the Internet's traffic by volume.

    1.10 The Insecurity of BGP

    A fundamental problem with BGP is that there is no mechanism to verify that the routing information it distributes is valid. In principle traffic to any destination can be diverted, so traffic can be disrupted, modified, examined or all three. [C:11] The effect of this is felt on a regular basis when some network manages to announce large numbers of routes for addresses that belong to other networks; this can divert traffic into what is effectively a black hole. Such incidents are quite quickly dealt with by network operators, and disruption can be limited to a few hours, at most. It is worth remembering that the operational layer is part of the ecosystem, and not all problems require technical solutions.
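
    One common way in which an unverified announcement captures traffic is by being more specific than the legitimate route, because routers forward on the longest matching prefix. The sketch below illustrates only that mechanism; the prefixes come from the ranges reserved for documentation, the AS numbers from the range reserved for examples, and the route table layout and function name are assumptions.

```python
import ipaddress


def best_route(routes, address):
    """Return the most specific route covering the address, or None."""
    addr = ipaddress.ip_address(address)
    covering = [r for r in routes
                if addr in ipaddress.ip_network(r["prefix"])]
    if not covering:
        return None
    return max(covering,
               key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


# Hypothetical routing table after both announcements have propagated.
routes = [
    {"prefix": "203.0.113.0/24", "origin": 64500},  # legitimate holder
    {"prefix": "203.0.113.0/25", "origin": 64511},  # unauthorised, more specific
]

for address in ("203.0.113.10", "203.0.113.200"):
    route = best_route(routes, address)
    print(address, "-> AS", route["origin"], "via", route["prefix"])
```

    Addresses inside the more specific block follow the unauthorised route even though the legitimate route is still present, while the rest of the original block is unaffected. Nothing in the selection step asks whether the announcing AS is entitled to the addresses.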

    The great fear is that this insecurity might be exploited as a means to deliberately disrupt the Internet, or parts of it. There is also a frequently expressed concern that route hijacking might be used to listen in on traffic, though this can be hard to do in practice.

    Configuring BGP routers to filter out invalid routes, or only accept valid ones, is encouraged as best practice. However, as discussed in Section 3.1.11, where it is practical (at the edges of the Internet) it does not make much difference, until most networks do it. Where it would make most difference (in the larger transit providers) it is not really practical because the information on which to base route filters is incomplete and the tools available to manage and implement filters at that scale are inadequate. [Q:13]

    More secure forms of BGP, in which routing information can be cryptographically verified, depend on there being a mechanism to verify the ownership of blocks of IP addresses, or to verify that the AS which claims to be the origin of a block of IP addresses is entitled to make that claim. The notion of title to blocks of IP addresses turns out not to be as straightforward as might be expected. However, some progress is now being made, under the name RPKI (Resource Public Key Infrastructure). The RPKI initiative should allow ASes to ignore announcements where the origin is invalid, that is, where some AS is attempting to use IP addresses it is not entitled to use. This is an important step forward, and might tackle over 90% of 'fat finger' problems (outages caused by mistakes rather than deliberate attempts to disrupt). [Q:14]
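
    The origin check that RPKI makes possible can be sketched as a lookup against a register of attested entries. This is a simplified illustration and not an RPKI implementation: it leaves out all of the cryptography, and the record layout (prefix, maximum length, origin AS), the function name and the sample data are assumptions. The three outcomes follow the usual description of origin validation.

```python
import ipaddress


def validate_origin(register, prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for entry in register:
        attested = ipaddress.ip_network(entry["prefix"])
        if announced.version != attested.version:
            continue
        if not announced.subnet_of(attested):
            continue
        covered = True  # some attestation covers the announced addresses
        if (entry["asn"] == origin_as
                and announced.prefixlen <= entry["max_length"]):
            return "valid"
    return "invalid" if covered else "not-found"


# Hypothetical register with one attested entry.
register = [
    {"prefix": "203.0.113.0/24", "max_length": 24, "asn": 64500},
]

print(validate_origin(register, "203.0.113.0/24", 64500))   # entitled origin
print(validate_origin(register, "203.0.113.0/24", 64511))   # wrong origin
print(validate_origin(register, "203.0.113.0/25", 64500))   # too specific
print(validate_origin(register, "198.51.100.0/24", 64511))  # nothing attested
```

    An AS that chose to ignore announcements classified as invalid would drop the second and third cases. The fourth case shows why the benefit depends on how much of the address space has been registered: an announcement with no covering entry cannot be judged either way.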

    But the cost of RPKI is significant. Every AS must take steps to document their title to their IP addresses, and that title must be registered and attested to by the Internet Registries. Then, every AS must extend their infrastructure to check the route announcements they receive against the register. What is more, the problem that RPKI tackles is, so far, largely a nuisance not a disaster. When some network manages to announce some routes it should not, this is noticed and fixed quite quickly, if it matters. Sometimes a network announces IP addresses nobody else is using; generally they are up to no good, but this does not actually disrupt the interconnection system. So the incentive to do

  • 8/2/2019 Inter-X Summary Report April-11b

    22/31

    InterX:ResilienceoftheInternetInterconnectionEcosystem

    SummaryReportApril201122

    somethingabouttheproblemisweak,althoughthenumberofsuchincidentsisexpectedtorise

    whenIPv4addressesareexhaustedinlate2011.

    Further, a route may pass the checks supported by RPKI, and still be invalid. A network can announce routes for a block of IP addresses, complete with a valid origin, but do so only to disrupt or interfere with the traffic (apparently) on its way to its destination. The S-BGP extensions to BGP (first published in 1997) address the issue more completely, and there have been other proposals since; however, they make technical assumptions about routing (traffic greed and valley-free customer preferences) that don't hold in today's Internet. Details of a new initiative, BGPSEC, were announced in March 2011. The aim is that this should lead to IETF standards by 2013 and deployed code in routers thereafter.

    During the standardisation process in 2011-2013 a key issue will be security economics. ASes see the cost of BGP security as high, and the benefit essentially zero until it is very widely deployed. Ideally, implementation and deployment strategies will give local, incremental benefit, coupled with incentives for early adopters. One possible mechanism is for governments to use their purchasing power to bootstrap early adoption; another is for routers to prefer signed routes. Technical issues that must be studied during the standardisation phase include whether more secure BGP might, in fact, be bad for resilience (as was pointed out in the consultation, [Q:15]). Adding cryptography to a system can make it brittle. The reason is that when recovering from an event, new and possibly temporary routes may be distributed in order to replace lost routes, and if the unusual routes are rejected because they do not have the necessary credentials, then recovery will be harder. Finally, BGPSEC will not be a silver bullet (there are many threats), but it should tackle about half the things that can go wrong after RPKI has dealt with origin validation.
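
    The brittleness argument can be illustrated with a toy route-selection sketch. It assumes a drastically simplified decision process (prefer signed routes, then the shortest AS path); the `Route` record, the `strict` switch and the next-hop names are hypothetical, and real BGP route selection involves many more steps.

```python
from typing import NamedTuple


class Route(NamedTuple):
    next_hop: str     # neighbour offering the route (illustrative label)
    as_path_len: int  # number of ASes on the path
    signed: bool      # whether the route carries valid credentials


def select(routes, strict):
    """Pick a route; in strict mode, unsigned routes are rejected outright."""
    candidates = [r for r in routes if r.signed] if strict else list(routes)
    if not candidates:
        return None  # no usable route: the destination is unreachable
    # Prefer signed routes, then the shortest AS path.
    return min(candidates, key=lambda r: (not r.signed, r.as_path_len))


normal = [Route("transit-a", 3, True), Route("backup-b", 2, False)]
after_failure = [Route("backup-b", 2, False)]  # only a temporary, unsigned route is left

print(select(normal, strict=False))         # the signed route wins despite the longer path
print(select(after_failure, strict=True))   # strict rejection leaves no route at all
print(select(after_failure, strict=False))  # preference falls back to the unsigned route
```

    Under these assumptions, preferring signed routes rewards early adopters without removing the fallback, whereas outright rejection turns a recoverable failure into an outage.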

    To sum up, most of the time BGP works wonderfully well, but there is plenty of scope to make it more secure and more robust. However, individual networks will get little direct benefit from an improved BGP, despite the significant cost. We will probably need some new incentive to persuade networks to invest in more secure BGP, or a proposal for securing BGP that gives local benefits from incremental deployment. [Q:20]

    1.11 Cyber Exercises on Interconnection Resilience

    The practical approach to assessing the resilience of the interconnection system is to run large-scale exercises in which plausible scenarios are tested. [C:16] Exercises can test both operational and technical aspects as well as procedural, policy, structural and communication aspects.

    Such exercises have a number of advantages and benefits.

    • They start with real-world issues. These exercises are not cheap, so there is an incentive to be realistic: planners consider what really are the sorts of event that the system is expected to face.

    • They can identify some dependencies on physical infrastructure. By requiring the participants to consider the effects of some infrastructure failure, an exercise may reveal previously unknown dependencies.

    • They can identify cross-system dependencies. For example, how well can network operations centres communicate if the phone network fails, or how well can field repairs proceed if the mobile phone network is unavailable? [Q:17]

    • They exercise disaster recovery systems and procedures. This is generally a good learning experience for everybody involved, particularly as otherwise crisis management is generally ad hoc. [C:15]

    Such scenario testing has been done at a national level and found to be valuable³. Something at a larger scale has also been proved to be valuable.

    On 4th November 2010 the European Member States organised the first pan-European cyber exercise, called CYBER EUROPE 2010, which was facilitated by ENISA. The final evaluation report published by ENISA⁴ proves the importance of such exercises and calls for future actions based on the lessons learned.

    1.12 The Tragedy of the Commons

    The resilience of the Internet interconnection system benefits everyone, but an individual network will not in general gain a net benefit if it increases its costs in order to contribute to the resilience of the whole. [C:21]

    This manifests itself in a number of ways.

    • In Section 1.10 above, we discussed the various proposals for more secure forms of BGP, from S-BGP in 1997 to BGPSEC in 2011, none of which have so far been deployed (see Section 3.1.12). There is little demand for something which is going to be difficult to implement and whose direct benefit is limited.

    • There exists best practice for filtering BGP route announcements, which, if universally applied, would reduce instances of invalid routes being propagated by BGP and disrupting the system (see Section 3.1.11). But these recommendations are difficult to implement and mostly benefit other networks, so are not often implemented.

    • There is an IETF BCP⁵ [6] for filtering packets, to reduce address spoofing, which would mitigate denial of service attacks (see Section 5.8.3). These recommendations also mostly benefit others, so are not often implemented.

    • A smaller global routing table would reduce the load on all BGP routers in the Internet, and leave more capacity to deal with unusual events. Nevertheless, the routing table is about 75% bigger than it needs to be, because some networks announce extra routes to reduce their own costs (see Section 3.1.9). Other networks could resist this by ignoring the extra routes, but that would cost time and effort to configure their routers, and would most likely be seen by their customers as a service failure (not as a noble act of public service).

    • The system is still ill-prepared for IPv6, despite the now imminent (circa Q3 2011) exhaustion of IPv4 address space. [Q:10]
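
    The routing-table point in the list above can be illustrated with a small sketch that collapses a set of announced prefixes into the smallest equivalent set and reports how much larger the announced set is. The function names and the example prefixes are our own assumptions. The sketch handles IPv4 only and deliberately ignores origin AS and path, so it overstates what could safely be aggregated in practice: many more-specific routes are announced on purpose, for traffic engineering.

```python
import ipaddress


def aggregate(prefixes):
    """Collapse adjacent and overlapping IPv4 prefixes into a minimal covering set."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))


def excess_ratio(prefixes):
    """Fraction by which the announced table exceeds the aggregated one."""
    return len(prefixes) / len(aggregate(prefixes)) - 1.0


# Hypothetical announcements: a covering block plus a more-specific route,
# and two halves of a block announced separately.
announced = [
    "10.0.0.0/16",
    "10.0.1.0/24",
    "198.51.100.0/25",
    "198.51.100.128/25",
]
print(aggregate(announced))     # two aggregated routes cover the same addresses
print(excess_ratio(announced))  # 1.0, i.e. this toy table is 100% bigger than needed
```

    On this measure the study's figure of about 75% would correspond to an excess ratio of roughly 0.75 for the global table.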

    ³ Good Practice Guide on National Cyber Exercises, ENISA Technical Report, 2009. Available at: http://www.enisa.europa.eu/act/res/policies/goodpractices1/exercises

    ⁴ CYBER EUROPE 2010 Evaluation Report, ENISA Report 2011. Available (after 15/04/2011) at: http://www.enisa.europa.eu/act/res/

    ⁵ An Internet Engineering Task Force (IETF) Best Current Practice (BCP) is as official as it gets in the Internet.

    It is in the clear interest of each network to ensure that in normal circumstances "best efforts" means a high level of service, by adjusting interconnections and routing policy: each network has customers to serve and a reputation to maintain [C:17]. Normal circumstances include the usual day-to-day failures and small incidents [Q:7].

    The central issue is that the security and resilience of the interconnection system is an externality as far as the networks that comprise it are concerned. It is not clear that there is any incentive for network operators to put significant effort into considering the resilience of the interconnection system under extraordinary circumstances. [Q:18]

    1.13 Regulation

    Regulation is viewed with apprehension by the Internet community. Studies such as this are seen as stalking horses for regulatory interference, which is generally thought likely to be harmful. [C:22]

    Despite having its origins in a project funded by DARPA, a US government agency, the Internet has developed since then in an environment that is largely free from regulation. There have been many local attempts at regulatory intervention, most of which are seen as harmful.

    • The governments of many less developed countries attempt to censor the Internet, with varying degrees of success. The "Great Firewall of China" is much discussed, but many other states practice online censorship to a greater or lesser extent. It is not just that censorship itself is contrary to the mores of the Internet community, whose culture is greatly influenced by California, the home of many developers, vendors and service companies. Attempts at censorship can cause collateral damage, as when Pakistan advertised routes for YouTube in an attempt to censor it within their borders, and instead made it unavailable on much of the Internet for several hours.

    • Where poor regulation leads to a lack of competition, access to the Internet is limited and relatively expensive. In many less developed countries, a local telecommunications monopoly restricts wireline broadband access to urban elites, forcing the majority to rely on mobile access. However the problem is more subtle than "regulation bad, no regulation good". In a number of US cities, the diversity of broadband access is falling; cities that used to have three independent infrastructures (say from a phone company, a cable company and an electricity company) may find themselves over time with two, or even just one. In better regulated developed countries (such as much of Europe) local loop unbundling yields price competition at least, thus mitigating access costs, even if physical diversity is harder. Finally, few countries impose a universal service provision on service providers; its lack can lead to a "digital divide" between populated areas with broadband provision, and rural areas without.

    • There has been continued controversy over surveillance for law enforcement and intelligence purposes. In the "Crypto Wars" of the 1990s, the Clinton administration tried to control cryptography, which the industry saw as threatening not just privacy but the growth of e-commerce and other online services. The Clinton administration passed the Communications Assistance for Law Enforcement Act (CALEA) in 1994 mandating the cooperation of telecommunications carriers in wiretapping phone calls. The EU has a Data Retention Directive that is up for revision in 2011 and there is interest both in the UK and the USA in how wiretapping should be updated for an age not only of VoIP but also of diverse messaging systems. This creates conflicts of interest with customers, raises issues of human rights, and leads to arguments about payment and subsidy.

    • Governments which worry about Critical National Infrastructure may treat Internet regulation as a matter of National Security, introducing degrees of secrecy and shadowy organisations, which does nothing to dispel concerns about motivation, not helped by a tendency to talk about the problem in apocalyptic terms⁶.

    Whatever the motivation, government policies are often formulated with insufficient scientific and technical input. They often manage to appear clueless, and in some cases make things worse. This study is an attempt to help alleviate this problem.

    This study has identified a number of areas where the market does not appear to provide incentives to maintain the resilience of the interconnection system at a socially optimal level. However, any attempt to tackle any of the issues by regulation is hampered by a number of factors:

    • the lack of good information about the state and behaviour of the system. It is hard to determine how material a given issue may be. It is hard to determine what effect a given initiative is likely to have, good or bad.

    • the scale and complexity of the system. Scale may make local initiatives ineffective, while complexity means that it is hard to predict how the system will respond or adapt to a given initiative.

    • the dynamic nature of the system. CDNs have been around for many years, but their emergence as a major component of the Internet is relatively recent; it is testament to the system's ability to adapt quickly (in this case, to the popularity of streamed video).

    Up until now, the lack of incentives to provide resilience (and in particular to provide excess capacity) has been relatively unimportant: the Internet has been growing so rapidly that it has been very far from equilibrium, with a huge endowment of surplus capacity during the dotcom boom and significant capacity enhancements since then. This cannot go on forever.

    One caveat: we must point out that the privatisation, liberalisation and restructuring of utilities worldwide has led to institutional fragmentation in a number of critical infrastructure industries that could in theory suffer degradation of reliability and resilience for the same general microeconomic reasons we discuss in the context of the Internet. Yet studies of the electricity, water and telecoms industries in a number of countries have failed to find a reliability deficit thus far [7]. In practice, utilities have managed to cope by a combination of anticipatory risk management and Public Private Partnerships (PPPs). However it is sometimes necessary for government to act as a lender of last resort. If a router fails, we can fall back on another router, but if a market fails (as with the California electricity market) there is no fallback other than the state.

    In conclusion, it may be some time before regulatory action is called for to protect the resilience of the Internet, but it may well be time to start thinking about what might be involved. Regulating a new technology is hard; an initiative designed to improve today's system may be irrelevant to tomorrow's, or, worse, stifle competition and innovation. For example, the railways steadily improved their efficiency from their inception in the 1840s until regulation started in the late nineteenth century, after which their efficiency declined steadily until competition from road freight arrived in the 1940s [8].

    ⁶ See [236] UK Government, Cabinet Office Factsheet 18: Cyber Security. And for the popular perception of what government thinks see [237] "Fight Cyber War Before Planes Fall Out of Sky".

    The prudent course of action for policy makers today is to start working to understand the Internet interconnection ecosystem. The most important package of work is to increase transparency, by supporting consistent, thorough investigation of major outages and the publication of the findings, and by supporting long-term measurement of network performance. The second package we recommend is to fund key research in topics such as distributed intrusion detection and the design of security mechanisms with practical paths to deployment, and the third is to promote good practice, to encourage diverse service provision and to promote the testing of equipment. The fourth package includes the preparation and relationship building through a series of PPPs for resilience. Modest and constructive engagement of this kind will enable regulators to build relationships with industry stakeholders and leave everyone in a much better position to avoid, or delay, difficult and uninformed regulation. Regulatory intervention must after all be evidence-based; and while there is evidence of a number of issues, the workings of this huge, complex and dynamic system are so poorly understood that there is not yet enough evidence on which to base major regulatory intervention with sufficient confidence.

    2 Recommendations

    Our recommendations come in four groups. The first group is aimed at understanding failures better, so that all may learn the lessons.

    Recommendation 1: Incident Investigation

    An independent body should thoroughly investigate all major incidents and report publicly on the causes, effects and lessons to be learned. Incident correlation and analysis may lead to assessment and forecast models. The appropriate framework should be the result of a consultation with the industry and the appropriate regulatory authorities. Incident investigation might be undertaken by an industry association, by a national regulator or by a body at the European level, such as ENISA. The last option would require funding to support the work, and, perhaps, powers to obtain information from operators under suitable safeguards to protect commercially sensitive information. The implementation of Article 13a of the recent EU Telecom Package⁷ may provide a model for this.

    Recommendation 2: Data Collection of Network Performance Measurements

    Europe should promote and support consistent, long-term and comprehensive data collection of network performance measurements. At present some real-time monitoring is done by companies such as Arbor Networks and Renesys, and some more is done by academic projects which tend to languish once their funding runs out. This patchwork is insufficient. There should be sustainable funding to support the long-term collection, processing, storage and publication of performance data. This also has a network management / law enforcement angle in that real-time monitoring of the system could help detect unusual route announcements and other undesirable activity.
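
    As a rough illustration of how long-term route data could help detect unusual announcements, the following sketch keeps a history of which AS has originated each prefix and flags announcements that depart from it. The class, its method names and the alert labels are hypothetical, as are the prefixes and AS numbers; an operational monitor would work from live BGP feeds and would need to cope with legitimate changes of origin.

```python
import ipaddress
from collections import defaultdict


class OriginMonitor:
    """Flag announcements whose origin differs from what has been seen before."""

    def __init__(self):
        self.seen = defaultdict(set)  # network -> set of origin ASNs observed

    def learn(self, prefix, origin_asn):
        """Record an announcement as part of the historical baseline."""
        self.seen[ipaddress.ip_network(prefix)].add(origin_asn)

    def check(self, prefix, origin_asn):
        """Return an alert label, or None if nothing looks unusual."""
        net = ipaddress.ip_network(prefix)
        if net in self.seen:
            return None if origin_asn in self.seen[net] else "new-origin"
        # A more-specific route inside a known block, from an unfamiliar AS.
        for known, origins in self.seen.items():
            if (known.version == net.version and net.subnet_of(known)
                    and origin_asn not in origins):
                return "more-specific-from-new-origin"
        return None


monitor = OriginMonitor()
monitor.learn("10.20.0.0/22", 64496)         # baseline from historical data
print(monitor.check("10.20.0.0/22", 64496))  # matches the baseline
print(monitor.check("10.20.0.0/22", 64511))  # same prefix, unfamiliar origin
print(monitor.check("10.20.1.0/24", 64511))  # more-specific route, unfamiliar origin
```

    The last case has the same shape as the YouTube incident mentioned in Section 1.13, in which a more-specific route announced by another network drew traffic away from its intended destination.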

    The second group of recommendations aims at securing funding for research in topics related to resilience, with an emphasis not just on the design of security mechanisms, but on developing an understanding of how solutions can be deployed in the real world.

    Recommendation 3: Research into Resilience Metrics and Measurement Frameworks

    Europe should sponsor research into better ways to measure and understand the performance and resilience of huge, multi-layered networks. This is the research aspect of the second recommendation; once that provides access to good data, the data should help clever people to come up with better metrics.

    ⁷ Directive 2002/21/EC of the European Parliament and of the Council, of 7 March 2002, on a common regulatory framework for electronic communications networks and services (Framework Directive), as amended by Directive 2009/140/EC and Regulation 544/2009.

    Recommendation 4: Development and Deployment of Secure Inter-domain Routing

    Europe should support the development of effective, practical mechanisms which have enough incentives for deployment. This may mean mechanisms that give local benefit to the firms that deploy them, even where deployment is incremental; it may require technical mechanisms to be supplemented by policy tools such as the use of public-sector purchasing power, subsidies, liability shifts, or other kinds of regulation.

    Recommendation 5: Research into AS Incentives that Improve Resilience

    Europe should support research into economic and legal mechanisms to increase the resilience of the Internet. Perhaps a system of contracts can be constructed to secure the interconnection system, starting with the connections between the major transit providers and spreading from the core to the edges. Alternatively, researchers might consider whether liability rules might have a similar effect. If the failure of a specific type of router caused loss of Internet service leading to damage and loss of life, the Product Liability Directive 85/374/EEC would already let victims sue the vendor; but there is no such provision relating to the failure of a transit provider.

    The third group of recommendations aims at promoting good practice.

    Recommendation 6: Promotion and Sharing of Good Practice on Internet Interconnections

    Europe should sponsor and promote good practice in network management. Where good practice exists its adoption may be hampered by practical and economic issues. The public sector may be able to help, but it is not enough to declare for "motherhood and apple pie"! It can contribute various incentives, such as through its considerable purchasing power. For that to be effective, purchasers need a way to tell good service. The first three of our recommendations can help, but there are some direct measures of quality too. Such information sharing should include modest and constructive engagement of industry stakeholders with the public sector in relationship building, strategic dialogue and decisions through a series of PPPs for resilience.

    Recommendation 7: Independent Testing of Equipment and Protocols

    Public bodies at national or European level should sponsor the independent testing of routing equipment and protocols. The risk of systemic failure would be reduced by independent testing of equipment and protocols, looking particularly for how well these perform in unusual circumstances, and whether they can be disrupted, suborned, overloaded or corrupted.

    Recommendation 8: Conduct Regular Cyber Exercises on the Interconnection Infrastructure

    The consultation noted that such exercises are effective in improving resilience at local and national levels. The efforts at this level should continue in all countries in Europe, as we are as weak as the weakest link. ENISA will support the national efforts. In addition, regular pan-European exercises should be organised by European Member States in order to test and improve European-wide contingency plans (measures, procedures and structures). These large-scale exercises will provide an umbrella for a number of useful activities, such as investigating what extra preparation might be required to provide more routes in a crisis; thus effectively becoming part of improving the pan-European cyber preparedness and contingency plans.

    The final group of recommendations aims at engaging policy makers, customers and the public.

    Recommendation 9: Transit Market Failure

    It is possible that the current twenty-odd largest transit providers might consolidate down to a handful, in which case they might start to exercise market power and need to be regulated like any other concentrated industry. If this were to happen just as the industry uses up the last of its endowment of dark fibre from the dotcom boom, then prices might rise sharply. European policy makers should start the conversation about what to do then. Action might involve not just a number of European agencies but also national regulatory authorities. Recommendations 1, 2, 3, and 5 will prepare the ground technically so that policy makers will not be working entirely in the dark, but we also need political preparation.

    Recommendation 10: Traffic Prioritisation

    If, in a crisis, some traffic is to be given priority, and other traffic is to suffer discrimination, then the basis for this choice requires public debate, and mechanisms to achieve it need to be developed. Given the number of interests seeking to censor the Internet for various reasons, any decisions on prioritisation will have to be taken openly and transparently, or public confidence will be lost.

    Recommendation 11: Greater Transparency - Towards a Resilience Certification Scheme

    Finally, transparency is not just about openness in taking decisions on regulation or on emergency procedures. It would greatly help resilience if end users and corporate customers could be educated to understand the issues and send the right market signals. In the long term, efforts, including ENISA's, should focus on what mechanisms can be developed to give them the means to make more informed choices. This might involve combining the outputs from recommendations 2, 3, 5, 6 and 7 into a "quality certification mark" scheme. Such a scheme may prove an important tool to drive the market incentives towards enhancing the resilience of the networks and more generally of the interconnection ecosystem.

    Respondents to the Consultation

    We thank all those who gave their time to respond to the consultation and help us with this study. Some chose to contribute anonymously, but we can thank by name the following:

    Olivier Bonaventure; Professor; UCLouvain, Belgium
    Scott Bradner; University Technology Security Officer, Office of the CIO; Harvard University
    Bob Briscoe; Chief Researcher; Networks Research Centre, BT Group plc
    kc claffy; Principal Investigator; CAIDA
    Andrew Cormack; Chief Regulatory Adviser; JANET(UK)
    Jon Crowcroft; Marconi Professor of Communications Systems; Computer Lab, Cambridge University
    John Curran; CEO; ARIN
    Dai Davies; General Manager; Dante
    Nicolas Desmons; Chargé de Mission; ARCEP, France
    Amogh Dhamdhere; Post-Doctoral Researcher; CAIDA
    Giuseppe Di Battista; Professor of Computer Science; Roma Tre University
    Nico Fischbach; Director, Network Architecture; Colt
    Mark Fitzpatrick; Engineer; Federal Office of Communications, OFCOM, Switzerland
    David Hutchison; Professor of Computing; Lancaster University
    Malcolm Hutty; Head of Public Affairs; LINX
    Christian Jacquenet; Director of the Strategic Program Office; France Telecom Group
    Balachander Krishnamurthy; Researcher; AT&T Labs Research
    Craig Labovitz; Chief Scientist; Arbor Networks
    Ulrich Latzenhofer; Rundfunk und Telekom Regulierungs, Austria
    Simon Leinen; Network Engineer; SWITCH
    Tony Leung; Global Internet and Network Convergence Manager; REACH
    Kurt Erik Lindqvist; CEO; Netnod
    Neil Long; Researcher and Founder; Team Cymru Research NFP
    Patricia Longstaff; David Levidow Professor of Communication Law and Policy, Syracuse University; James Martin Senior Visiting Fellow, Oxford Martin School; Visiting Scholar, Trinity College, Oxford
    Paolo Lucente; Architect/Designer; KPN International
    Bill Manning; USC/ISI

    Maurizio Pizzonia; Assistant Professor, Computer Science; Roma Tre University

    AndrewPowell ManagerofAdviceDe