distributed systems communication · approach of building distributed systems to address these...

1

DISTRIBUTEDSYSTEMSCOMMUNICATIONLastclasswediscussedaboutthecorechallengesofbuildingdistributedsystems(incrementalscalabilityishard,atscalefailuresareinevitable,constantattacks,etc.).We'vesaidthatthecoreapproachofbuildingdistributedsystemstoaddressthesechallengesistoconstructlayersuponlayersofsystemsandservicesthatraisethelevelofabstractionincreasinglythroughwell-definedAPIs.Google'sstoragestack,aswellasSparkandothersystems,areperfectexamplesofthatlayereddesignfordistributedsystems.Inthislecture,welookatoneoftheseabstractions:RPC,whichenablescommunication.Tocooperate,machinesneedtofirstcommunicate.Howshouldtheydothat?RPCisthepredominantwayofcommunicatinginaDS.Interestinglyenough,RPC–thesimplestpossibleDSabstraction–reflectsmostofthechallengesindistributedsystems,showingyouhowfundamentalthosethingsare.

I.RemoteProcedureCalls(RPCs)Overallgoal:Tosimplifytheprogrammer'sjobwhenbuildingdistributedsystems.

- manycomplexitieswhenbuildingadistributedprogram- somearefundamentalandmustbedealtwithbyprogrammer,whileothersaremechanical

andcanbesolvedbygoodlanguage,runtimesupport- pre-RPC,allofthegorydetailsofsocket-levelcommunicationwereexposedtothe

programmer- post-RPC,aremoteinteractionlooks(nearly)identicaltoalocalprocedurecall,withthe

systemhiding(nearly)allofthedifferencesExampleof(badlywritten)socketcommunicationcode:

Anyissueswithit?Many…-lotsofuglyboilerplate-bugprone-portabilitydetailsareeasytomiss-hardtounderstandandhardtomaintain-hardtoevolveprotocol-vulnerabilityprone…

RPC:RPCtriestomakenetcommunicationlookjustlikealocalprocedurecall(LPC):

structfoomsg{u_int32_tlen;}send_foo(intoutsock,char*contents){intmsglen=sizeof(structfoomsg)+strlen(contents);charbuf=malloc(msglen);structfoomsg*fm=(structfoomsg*)buf;fm->len=htonl(strlen(contents));memcpy(buf+sizeof(structfoomsg),contents,strlen(contents));write(outsock,buf,msglen);}

Client-sidecode:z=fn(x,y)

Server-sidecode:fn(x,y){

//computeresultzreturnz;

}

2

WhytheLPCabstraction?Because,it’ssimple,elegant,andevennoviceprogrammersknowhowtousefunctioncalls!Othermodelsexist,buttheyare(arguably)hardertothinkabout.RPCmessagediagram:ClientServerrequest--->process<---responseRPCsoftwarestructure:

(figuretakenfromRPCpaperbyBirrell,et.al.)

Inimplementations,stubsaregeneratedautomaticallybyRPCframeworks(libraries),whichalsoprovidetheRPCRuntime.Theprogrammeronlywritesdefinitionsfortheirdatastructuresandprotocolsinan“interfacedefinitionlanguage”(IDL).We’llseetwoexamplesofRPClibrarieslater.GoodthingsaboutRPCabstraction:

- Hidesgorynetwork/marshalingdetailsthatonewouldhavetoimplementifdoing,e.g.,network-levelcommunication,byteorders(bigendianvs.littleendian)

- Supportsevolutionofthecommunicatingcomponentsindependently.(ManyRPCframeworksprovideexplicitsupportforevolution,e.g.,protocolversionnumbers,rulesofwhentoremove/renameargumentstoprocedurecalls,etc.)

- Allowsforefficientpackaging- Authenticationsupport- Locationindependence(particularlyifcombinedwithadirectoryservice)

ProblemstheRPCabstraction(i.e.,wheredistributionofRPCpeeksthroughtheLPCillusion):

- Latency:LPCisfast,RPCcanbereallyexpensive(esp.ifitgoesoverWAN,butnotonly)

3

- Pointertransfers:localaddressspaceisn’tsharedwithremoteprocess.Theprogrammer/RPClibraryhastomakeadecisionofwhethertofollowpointersandserializeindepth,ortoexcludethatinformation.Typically,IDLsavoidthisproblembyrequiringonetospecifyone’smessages/datastructurestoavoidpointerconfusion.

- Failures:theyaremorefundamentalthaninthesingle-process/LPCcase.o IfIissueanLPC,thenIcanbeinoneofthreesituations:(1)theLPCreturns,

thereforeIknowtheprocedurehasbeenexecuted;or(2)theLPCdoesn’treturn,inwhichcaseIknowtheLPChasn’tfinishedexecuting(itmaystillbeexecuting,ortheprocessthathostsboththecallerandthecalleemighthavedied).So,withLPC,thecallerandcalleecan’thavea“splitmind.”WithRPC,theycan.

o IfIissueanRPCandgetareturn,Iknowtheprocedurehasbeenexecutedremotely.ButifIdon’tgetareturn,whatdoIknow?HasthefunctioncompletedontheremotesideandIjustdon’tknowaboutitbecause,e.g.,theresponsehasbeenlostoverthenetwork?Hasitnothappened,because,e.g.,myrequesthasbeenlostoverthenetworkortheremotemachineisdead?Ormaybethemachine/networkisslow–shouldIwaitlongerfortheresponse?HowlongshouldIwaitforaresponse?Ican’ttellthedifferencebetweenthesesituations,andIdon’tknowhowtoact.Incertainscenarios,that’snotabigproblem,butinothersitis.

Forexample,imaginethedistributedsystemiscomposedofanATMmachineanditsbankback-endsystem.ApersonrequestsawithdrawalofcashfromtheATMmachine.TheATMmachineneedstocheckthatthepersonhassufficientfundsin,andwithdrawthesumfrom,hisbankaccountbeforegivingoutthemoney.TheATMrequeststhemoneytransferandthenwaits,andwaits,andwaits.Noresponse.Shoulditgiveoutthemoneyorshoulditnot,ifitdoesn’tgetaresponsefromthebank?Nosimpleanswer.Needsometimeout(personcan’twaitforever),butwhatshouldthatbe?SayATMrefusestogiveoutthemoneyuponatimeout.Butwhatifthebankalreadydeductedthemoneyfromtheaccount,howshouldwedealwiththat?TheATMneedstokeeptryingtotalkwiththebanktotellitthatithasn’tactuallyfinishedthetransaction,sothebankcanputbackthemoneyintheperson’saccount.ButwhatiftheATMitselffailsatsomepoint?Istheperson’smoneylostforever?IwouldhateinteractingwithsuchanATM.Alternatively,sayATMgivesoutthemoneyuponatimeout.Thisofcoursecanopenthebanktofraud:apersonmightquicklygotoanotherATMandextractthesamemoneyagain.

Hereinliesthebiggestdifficultyindistributedsystemscomparedtosingle-processsystems:withsinglemachines,youlargelyhavesharedfate(i.e.,eitherallfailsornothingdoes);withdistributedsystems,machines/processescanfailindependently,soyoucanhavesomemachinesthatareupwhileothersaredown,andyoucanalsohaveallmachinesupbutunabletocommunicateduetonetworkfailuresandpartitions.Thisleadstocoordinationchallenges,whichinturnleadtoinconsistenciesbetweenthedecisionsthesemachinesmake.SothatgoalofDSofprovidinga“coherentservice”isdifficulttoachieve.

4

II.Semantics

TheprecedingATMexampleleadsdirectlytoaquestionofsemantics.Whatshouldbethemeaning(semantic)ofanRPCinvocation?Networkfailuresimplytimeoutsandretransmissions.Theyimplythatclientswillendupretransmittingrequests,andtheserverwillseeduplicates.Howshouldtheserverbehavewhenitseesduplicates?Belowaresomepotentialanswers,anddifferentRPCframeworkswillimplementdifferentsemantics.At-least-once:theRPCcall,onceissuedbytheclient,isexecutedeventuallyatleastonce,butpossiblymultipletimes.Thissemanticistheeasiesttoimplementbutraisesissues.Toimplement,theRPClibraryontheclientkeepsissuingtheRPCmultipletimesuntilitgetsaresponsefromtheserver.Assumingthatfailures(ofnetwork,server)aretemporary,theRPCwillbeexecutedatleastonce.Thisisfineforidempotentrequests(e.g.,thisemailinauser’sinboxhasbeenread),butmightbeproblematicforothertypesofrequests(e.g.,ATMexample,purchasinganitem).

At-most-once:theRPCcall,onceissuedbyaclient,getsexecutedzerooronetime.Thissemanticisnexteasiesttoimplement,buthascomplicationsifyoutrytodoitsensibly.Toimplement,theserverhastorememberrequeststhatithasalreadyseenandprocessed,todetectduplicatestosquelch.Client/serversneedtoembeduniqueIDs(XIDs)inmessages.Somethinglikefollows:

Servermaintainss[xid](state),r[xid](returnvalue)ifs[xid]==DONE:returnr[xid]x=handler(args)s[xid]=DONEr[xid]=xreturnx

Onecomplication:whathappensifservercrashes/restarts?Serverwilllosethes,rtables.Solution:storeondisk[slow,expensive].Anothercomplication:evenifstoreondisk,whathappensifservercrashesjustbeforeoraftercalltohandler()?-afterrestart,servercan'ttellifhandler()wascalled-servermustreplywith"CRASH"whenclientresendsrequestSolution:serverhas"servernonce"thatidentifiesrestartgeneration-clientobtainsnonceduringbindwithserver-clienttransmitsnonceineveryRPCrequest-ifnoncedoesn'tmatchserver'sstate,returnCRASHExactlyonce:thisistheideal(itresemblestheLPCmodelmostcloselyandit’seasiesttounderstand),butit’ssurprisinglyhardtoimplement.ThisiswhentheRPC,onceissuedbytheclient,isinvokedexactlyoncebytheserver.Whyhardtobuild?-serverfailureataninopportunetimecanstymie--clientblocksforever-needtoimplementhandler()asatransactionthatincludess[xid],r[xid]commitonserver

5

III.ExampleLibrariesSomesamplecodeusingvariousRPClibrariescanbefoundathttps://columbia.github.io/ds1-class/lectures/02-rpc-handout.pdf.

IV.OtherCommunicationModelsRPCisjustonecommunicationmodelindistributedsystems.Itworkswellwhenthecommunicationpatternlookslikethis:

-point-to-point,request/process/responsetoaserverorservice.-lotsofcommunicationindistributedsystemslooklikethat,butnotall.

Herearesomeotherpopularcommunicationpatternsandthecorrespondingcommunicationabstractionsthatsupportthem.Wewon’tstudythemindepthinthiscourse.

Acknowledgements

TheprecedingnoteswereadaptedfromtheUniversityofWashingtondistributedsystemscourse,SteveGribble’sedition:http://courses.cs.washington.edu/courses/csep552/13sp/lectures/1/rpc.txt

https://columbia.github.io/ds1-class/lectures/02-rpc-handout.pdf

https://columbia.github.io/ds1-class/lectures/02-rpc-handout.pdf

distributed systems communication · approach of building distributed systems to address these...

Documents