p4 deep dive - netlab€¦ · p4 three goals 1. protocol independence • no native support for any...
TRANSCRIPT
7th September2017
Agenda
• 9:30– P4motivation,languageoverview• OpenFlow vsP4• SDNtoday
• ~10:30– P4demos,exampleapplications• Ipv4forwardingswitch• Calculator(run1+1onyourrouter!)• Multi-hoprouteinspection(in-networktelemetry)• Trackingper-flowTCPretransmissions
• Discussions:dataplaneprogrammability@ESnet
04/10/2017 P4deepdive@ESnet- RichardCziva 2
TheroadtoP4P4deepdive
04/10/2017 P4deepdive@ESnet- RichardCziva 3
Software-DefinedNetworking
• Programmablecontrolovernetworks• Physicallyseparatedcontrolplanetheforwardingplane• MostpopularrealizationofSDN’sdata-planeabstractionisOpenFlow• OpenFlow wasborn10yearsago
• Andalothaschanged
04/10/2017 P4deepdive@ESnet- RichardCziva 4
Software-DefinedNetworking
• OpenFlow becamethedefactostandardfornetworkprogramming• OriginallyitwastargetingLANandDCnetworks• LatelytheykeptextendingOpenFlow tosupportnewprotocols(mostlyencapsulation,networkvirtualization)• VxLAN (2010)• STT(StatelessTransportTunnel)(2012)• NSH(2014)• …
04/10/2017 P4deepdive@ESnet- RichardCziva 5
OpenFlow matchfields
04/10/2017 P4deepdive@ESnet- RichardCziva 6
• Matchfieldskeepgrowingandgrowing:hardforvendorstokeepup
Networkoperatorscanhavespecificrequirements!Canweletthemimplementit?
Apartialsolution:BPFmatching
• InsteadoffixedOpenFlow matchfields,whydon’tweuseaBerkeleyPacketFilters (BPF)?• (TheoriginalBPFpaperwaswrittenby StevenMcCanne and VanJacobson in1992whileat LawrenceBerkeleyLaboratory[9][10])
• BPFcanbeusedtodescribeapacketmatchingasandirectedacyclicgraph
04/10/2017 P4deepdive@ESnet- RichardCziva 7
Apartialsolution:BPFmatching
• ABPFprogramwassmallenoughtobeplacestoanOpenFlow packet(OXMfield)• BPFcanbeexecutedonanyplatform(platform-independence)• BPFreliesonsimpleinstructions(load,compare)->easytoimplementonhardware
04/10/2017 P4deepdive@ESnet- RichardCziva 8
Networkdata-planetoday
• TheASICdefineshowcanwehandlepackets• ASICsareallclosed,wecan’textendthemifrequired• Networksaredesignedusingabottom-updesign
04/10/2017 P4deepdive@ESnet- RichardCziva 9
FixedfunctionASIC
SwitchOS
API
driver
Networkingtomorrow
• Wewanttotellthedevicehowtobehave(toptobottomapproach)• Programmablenetworkdevices:• PISA(FlexibleMatch+Action ASICs):IntelFlexpipe,BarefootTofino• NPU:Netronome• CPU:DPDK,eBPF,OpenvSwitch• FPGA:XilinxNetFPGA
• P4definesalanguagetoprogramsomeofthesedevices
04/10/2017 P4deepdive@ESnet- RichardCziva 10
Programmableswitch
SwitchOS
API
driver
Benefitsoftopàbottom approach
• Tonetworkproviders• Eliminating”blackboxes”and“magicbehavior”• Developerscanprogram,testanddebugnetworkdevicesallthewaydown• Customrouting/switchingprotocolscanbekept“in-house”
• Tovendors• Extremelyfastiterationandfeaturerelease• Newbusinessmodelsforsellingsoftwareandhardwareseparate• Vendorscanfixdata-planebugsafterdeviceshavebeendeployed• Softwaredevelopmentpracticesusedineveryphase
04/10/2017 P4deepdive@ESnet- RichardCziva 11
Agenerallanguage:P4
• P4:alanguagedesignedtoallowtoptobottomprogrammability• P4wasproposedatSIGCOMM2013:
Bosshart,Pat,DanDaly,GlenGibb,MartinIzzard,NickMcKeown,JenniferRexford,ColeSchlesingeretal."P4:Programmingprotocol-independentpacketprocessors." ACMSIGCOMMComputerCommunicationReview 44,no.3(2014):87-95.
• P4consortium:over80vendors,manyUniversities,individuals…
04/10/2017 P4deepdive@ESnet- RichardCziva 12
P4vsOpenFlow
04/10/2017 P4deepdive@ESnet- RichardCziva 13
P4inveryhigh-levelP4deepdive
04/10/2017 P4deepdive@ESnet- RichardCziva 14
P4threegoals
1. Protocolindependence• Nonativesupportforanyprotocol• Theprogrammerdescribestheprotocols
2. Targetindependence• P4programsaredesignedtobe implementation-independent• CanbecompiledtoCPUs,NPUs,FPGAs,ASICs,networkprocessors,etc
3. In-fieldre-configurability• Duetotargetandprotocolindependenceandtheabstractlanguage,P4targets(e.g.,ASICs)canchangetheirbehaviorovertheirlifetime
04/10/2017 P4deepdive@ESnet- RichardCziva 15
P4vstraditionalswitches
• InaP4definedswitchthedataplaneisnotfixed,butdefinedbyaP4program• InaP4definedswitchtheAPIbetweenthedataplaneandthecontrolplaneisdefinedbyaP4program
04/10/2017 P4deepdive@ESnet- RichardCziva 16
From:https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.html
P4pipeline
• Parser:Parsesrawpacketsfromwireintometadata• Match+Actiontables:operateonmetadata• Deparser:Serializesmetadataintoapacket• Metadatabus:availableintheentirepipeline
04/10/2017 P4deepdive@ESnet- RichardCziva 17
Table1 Table2 Tablen
Deparser
Parser
Metadatabus
packet packet
AnexamplepipelinefromCorsa
04/10/2017 P4deepdive@ESnet- RichardCziva 18
CouldbeexpressedwithP4!
Languageoverview(P4_16)P4deepdive
04/10/2017 P4deepdive@ESnet- RichardCziva 19
Languageversions
• Therearemultipleversionsofthelanguage• Unfortunatelytheversionsarebackwards-incompatible• Andsomethingsonlyworkinformerversionsjustnow
• Iamintroducingthelatest:P4_16• Smallercorelanguagethanpreviousversions(only40keywords)• Thisissupposedtobeastablelanguagedefinition
04/10/2017 P4deepdive@ESnet- RichardCziva 20
From:https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.html
Coreabstractions
• Headertypes:definehowyourprotocolheaderslooklike• Parsers:tellthemhowtoparseheaders• Tables:containstateassociatinguser-definedkeyswithactions• Actions:describinghowapacketshouldbemanipulated• Match-Actionunits:createlookupkey,performtablelookup,executeaction• Controlflow:specifiestheorderofapplyingmatch-actionunits• Externobjects:registers,counters,meters,etc• Metadata:datastructuresassociatedwitheachpacket
04/10/2017 P4deepdive@ESnet- RichardCziva 21
Headerdefinitions
• Allowsyou,thedevelopertodescribepacketformatsandnameheaderfields• Fixedvsvariablelengthfieldsarebothpermitted
04/10/2017 P4deepdive@ESnet- RichardCziva 22
header Ethernet_h {bit<48>dstAddr;bit<48>srcAddr;bit<16>etherType;
}
header IPv4_h{bit<4>version;bit<4>ihl;bit<8>diffserv;bit<16>totalLen;bit<16>identification;bit<3>flags;bit<13>fragOffset;bit<8>ttl;bit<8>protocol;bit<16>hdrChecksum;bit<32>srcAddr;bit<32>dstAddr;varbit<320>options;
}
Parser
• Afinitestatemachinewithonestartstateandtwofinalstates
04/10/2017 P4deepdive@ESnet- RichardCziva 23
Start
Ethernet?
TCP UDP
Accept
Reject
IfEthernet
IfTCP
else
else
IfUDP
parserMyParser(){statestart{transitionparse_ethernet;}stateparse_ethernet {
packet.extract(hdr.ethernet);transitionselect(hdr.ethernet.etherType){
TYPE_IPV4:parse_ipv4;default:accept;
}}…}
Tables
• Tablescontainstate(pushedbythecontrolplane)toforwardpackets• Tablesdescribeamatch-actionunit• Packetmatchingcanbedoneeither:• Exactmatching• LongestPrefixMatch(LPM)• Ternarymatching(masking)
• Allpossibleactionshavetobedefinedinadvance
04/10/2017 P4deepdive@ESnet- RichardCziva 24
tableipv4_lpm{reads {
ipv4.dstAddr:lpm;}actions {
forward()}
}
Match-Actionunits
04/10/2017 P4deepdive@ESnet- RichardCziva 25
From:https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.html
Actions
• Actionsconsistofcodeandtakeactiondata• Actiondatacomesfromthecontrolplane(e.g.,IPaddresses/portnumbers)
• Specific,loop-freeprimitivescanbeexecutedinanaction• Numberofinstructionsmustbepredictable,sonoloopsorconditionalstatementsallowedhere
04/10/2017 P4deepdive@ESnet- RichardCziva 26
action ipv4_forward(macAddr_t dstAddr,egressSpec_t port){standard_metadata.egress_spec =port;hdr.ethernet.srcAddr =hdr.ethernet.dstAddr;hdr.ethernet.dstAddr =dstAddr;hdr.ipv4.ttl=hdr.ipv4.ttl- 1;
}
Externs
• ExternsarearchitecturespecificconstructswithwelldefinedAPIs• Examples:• Checksumcalculation• Registers• Counters• Meters
04/10/2017 P4deepdive@ESnet- RichardCziva 27
extern Checksum16{Checksum16();//constructorvoid clear();//prepareunitforcomputationvoid update<T>(in Tdata);//adddatatochecksumvoid remove<T>(in Tdata);//removedatafromexistingchecksumbit<16>get();//getthechecksumforthedataaddedsincelastclear
}
extern register<T>{register(bit<32>size);void read(outTresult,inbit<32>index);void write(inbit<32>index,inTvalue);
}
Metadata
1. User-definedmetadata(emptystruct atfirstforeachpacket)• Youcanputhereanythingyouwant• Accessibleeverywhereinthepipeline• Usefule.g.,forstoringthepackethash
2. Intrinsicmetadata- thisisprovidedbythearchitecture• Ingressport,egressportisdefinedhere• Timestampwhenpacketwasqueued,queuedepth• Multicasthash/multicastqueue• Packetpriority,packetcolor• Egressportspecification(e.g.,outputqueue)
04/10/2017 P4deepdive@ESnet- RichardCziva 28
Controlflow
• Stitchingeverythingtogether• Imperativeprogramexpressingthehigh-levellogicandthesequenceformatch-actionunitinvocations
04/10/2017 P4deepdive@ESnet- RichardCziva 29
control ingress(){[tabledefinitionshere][actiondefinitionshere]apply { if (hdr.ipv4.isValid()){ipv4_lpm.apply();
} }
}
V1Switch(ParserImpl(),verifyChecksum(),ingress(),egress(),computeChecksum(),DeparserImpl())main;
P4compiler,toolsP4deepdive
04/10/2017 P4deepdive@ESnet- RichardCziva 30
P4compiler
• TheP4compiler(p4c)produces:1. Dataplaneruntime2. APIformanagingthestateinthedataplane
04/10/2017 P4deepdive@ESnet- RichardCziva 31
P4softwareswitch
• p4lang/p4c-bm:generatesJSONconfigurationforbmv2• p4lang/bmv2:softwareswitchthatunderstandstheJSONconfig• bmv2:behaviormodelversion2
04/10/2017 P4deepdive@ESnet- RichardCziva 32
myprogram.p4 myprogram.json
bmv2-ssload
Compilewithp4c-bm
Dataplane
API
Runtimecommands
RuntimeAPIforsoftwareswitch
• Allowsyoutomanipulatetables,readregisters,counters…
• APIiseasilyusablewiththeprovidedsimple_switch_CLI program
04/10/2017 P4deepdive@ESnet- RichardCziva 33
- table_set_default <tablename><actionname><actionparameters>- table_add <tablename><actionname><matchfields>=><actionparameters>[priority]- table_delete <tablename><entryhandle>
p4@p4d2:~$simple_switch_CLI --thrift-port9090ObtainingJSONfromswitch...DoneControlutilityforruntimeP4tablemanipulationRuntimeCmd:counter_read counter0thisisthedirectcounterfortableipv4_lpmcounter[0]=BmCounterValue(packets=4,bytes=88)
Switchlogs+debugger
• Simpleswitchlogsareusefulenough:
• Thereisalsoagdb likedebugger:• Breakpoint,watchingaheaderfield,printsthevalueofaheaderfield,etc
04/10/2017 P4deepdive@ESnet- RichardCziva 34
p4@p4d2:~$tail-f/home/p4/p4_tutorial/P4D2_2017/exercises/mri/build/logs/s1.log[20:58:45.689][bmv2][D][thread1572][94.0][cxt 0]Pipeline'ingress':start[20:58:45.689][bmv2][D][thread1572][94.0][cxt 0]Pipeline'ingress':end[20:58:45.689][bmv2][D][thread1572][94.0][cxt 0]Egressportis0….[22:28:07.263][bmv2][T][thread1572][166.0][cxt 0]Applyingtable'ipv4_lpm’[22:28:07.263][bmv2][D][thread1572][166.0][cxt 0]Lookingupkey:
*ipv4.dstAddr:0a00020a[22:28:07.263][bmv2][D][thread1572][166.0][cxt 0]Table'ipv4_lpm':hitwithhandle1
P4examplesP4deepdive
04/10/2017 P4deepdive@ESnet- RichardCziva 35
Calculator
• Whosaidanetworkdevicecannotbeusedasacalculator?• Idea:• wesendpacketsdescribinganoperation(e.g.,1+1)• thedeviceperformstheoperationandwritestheresulttothepacket• thedevicesendsthepacketbacktothesenderthatparsestheresult(2)
04/10/2017 P4deepdive@ESnet- RichardCziva 36
H1 P4switch1+1?
2
Asimpleswitch
• BasicIPv4forwardingswitch• Stepbystep:
1. Packetisparsed(Ethernet,IPv4only)2. PacketismatchedusingLPMonIPv4
destinationaddresses3. Egressportisset4. Packet’sEthernetsourceanddestination
isset5. TTL=TTL-16. Checksumrecalculation
• Additional:counters
04/10/2017 P4deepdive@ESnet- RichardCziva 37
h1
h2
h3
s1
s2
s3
Multi-HopRouteInspection(MRI)
• Let’sfindoutwherehasourbeeninthenetwork• WewilluseTCP’soptionsfieldforthedata(parsingastackofheaders)• Everyswitchwillwriteit’sownID(swid)tothepacket• PacketfromH1-->H2willcollectswids[1,2]
04/10/2017 P4deepdive@ESnet- RichardCziva 38
h1
h2
h3
s1
s2
s3
TCPretransmissiontracker
• Goal:IdentifyallflowsthatareexperiencingTCPretransmission1. TrackifwehaveseenthesameTCP
packetbefore(SYNnumber)1. BasedonRTO(min1s),exponentialbackoff
2. TrackSACKflags• BetterwithburstsofpacketsinoneRTO
• ”Controller”:readsthecountersfromthedata-plane
04/10/2017 P4deepdive@ESnet- RichardCziva 39
Client Server
Seq =1
Seq =1Ack =1
RTO
Client Server
Seq =2
Seq =1Ack =1
RTOSeq =3
Ack =1,Sack=3Seq =2
PISA/PSA
• PSA=PortableSwitchArchitecture– WGinP4consortium• PISA =ProtocolIndependentSwitchArchitecture– referencechipdesign• ReferencearchitectureforaP4programmableswitch• Configurableparser,tables,etc.• BarefootTofinoisafullyprogrammablePISA• “World’sfastestandmostprogrammableswitchseriesupto6.5Tbps”
• Actualdevice:Edgecore Wedge100BF-65X:2RU65x100GE,6.5Tbit/s
04/10/2017 P4deepdive@ESnet- RichardCziva 40
Takeaway
• P4advocatesatop-downapproachfornetworkprogramming• Insteadofleavingthedevicestellushowtheyprocesspackets=>wewouldliketotellthemhowtodoit
• P4isjustonestandard,manymoretocome,however:
04/10/2017 P4deepdive@ESnet- RichardCziva 41
Inthenextfewyearsdata-planeprogrammabilitywillbecomecommonplaceandP4playsanimportantroleinit
References
• Figuresandsomedemoshavebeenselected:• P4tutorialslidesatSIGCOMM:https://github.com/p4lang/tutorials/blob/master/SIGCOMM_2016/p4-tutorial-slides.pdf• P4languagedocs:https://p4lang.github.io/p4-spec/docs/P4-16-v1.0.0-spec.html• ExamplesfromGithub:https://github.com/p4lang/tutorials
• P4paper
04/10/2017 P4deepdive@ESnet- RichardCziva 42
Thankyou
Thankyouforyourattention!
04/10/2017 P4deepdive@ESnet- RichardCziva 43