Linux Networking Internals


Post on 12-Nov-2014


DESCRIPTION

Slides for a course about the Linux kernel network stack.

TRANSCRIPT

The Linux Network Subsystem
Covers Linux version 2.6.25
Version 1.1

Copyright 2004-2006, Michael Opdenacker. Copyright 2003-2006, Oron Peled. Copyright 2004-2006 Codefidence Ltd.

For full copyright information see last page. Creative Commons Attribution-ShareAlike 2.0 license.

Rights to copy

This kit contains work by the following authors:

Copyright 2004-2006 Michael Opdenacker, michael@free-electrons.com, http://www.free-electrons.com
Copyright 2003-2006 Oron Peled, oron@actcom.co.il, http://www.actcom.co.il/~oron
Copyright 2004-2008 Codefidence Ltd., info@codefidence.com, http://www.codefidence.com

Attribution-ShareAlike 2.0

You are free:
to copy, distribute, display, and perform the work
to make derivative works
to make commercial use of the work

Under the following conditions:
Attribution. You must give the original author credit.
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.

For any reuse or distribution, you must make clear to others the license terms of this work.
Any of these conditions can be waived if you get permission from the copyright holder.
Your fair use and other rights are in no way affected by the above.

License text: http://creativecommons.org/licenses/by-sa/2.0/legalcode

What is Linux?

Linux is a kernel that implements the POSIX and Single Unix Specification standards and is developed as an Open Source project.

When one talks of installing Linux, one is referring to a Linux distribution: a combination of Linux and other programs and libraries that together form an operating system.

Linux runs on 24 main platforms and supports applications ranging from ccNUMA superclusters to cellular phones and microcontrollers.

Linux is 15 years old, but is based on the 40-year-old Unix design philosophy.

Layers in a Linux system

Kernel
Kernel modules
C library
System libraries
Application libraries
User programs

Kernel architecture

[Figure: kernel architecture block diagram, top to bottom]
User space: App1, App2, ..., C library
Kernel space: System call interface
Kernel space: Process management, Memory management, Filesystem support, Device control, Networking
Kernel space: CPU support code, CPU/MMU support code, Filesystem types, Storage drivers, Character device drivers, Network device drivers
Hardware: CPU, RAM, Storage

Kernel Mode vs. User Mode

All modern CPUs support a dual mode of operation:
User mode, for regular tasks.
Supervisor (or privileged) mode, for the kernel.

The mode the CPU is in determines which instructions the CPU is willing to execute:
Sensitive instructions will not be executed when the CPU is in user mode.

The CPU mode is determined by one of the CPU registers, which stores the current ring level:
0 for supervisor mode, 3 for user mode; levels 1 and 2 are unused by Linux.

The System Call Interface

When a user space task needs to use a kernel service, it makes a system call.

The C library places the system call number and its parameters in registers and then issues a special trap instruction.

The trap atomically changes the ring level to supervisor mode and sets the instruction pointer to a kernel entry point.

The kernel finds the required system call via the system call table and executes it.

Returning from the system call does not require a special instruction, since in supervisor mode the ring level can be changed directly.
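The path described on this slide can be exercised from user space. The sketch below is a minimal illustration, assuming a Linux system with glibc: it uses the C library's generic syscall() wrapper, which loads the system call number (here SYS_getpid) into the appropriate register and issues the trap. The function name raw_getpid is made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* asks glibc to declare syscall() */
#endif
#include <sys/syscall.h>        /* SYS_getpid: the system call number */
#include <sys/types.h>
#include <unistd.h>             /* syscall(), getpid() */

/*
 * Hypothetical helper: ask the kernel for our PID without going
 * through the dedicated getpid() wrapper. syscall() places the
 * number and arguments in registers and issues the trap for us.
 */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Calling raw_getpid() should return the same value as getpid(), since both end up in the same entry of the kernel's system call table.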

Linux System Call Path

[Figure: a task makes an ordinary function call into glibc; glibc traps into the kernel, where entry.S dispatches to sys_name(), which in turn calls do_name().]

Linux Networking Subsystem Overview

[Figure: App 1, App 2 and App 3 sit on top of the socket layer. Below it is the networking stack (TCP, UDP, ICMP, IP, bridge), then the driver interface, the drivers, and the hardware.]

Network Device Driver Hardware Interface

[Figure: Tx and Rx descriptor rings shared between the driver and the device. Tx descriptors are marked Free, Send, Sent OK or Send Err; Rx descriptors are marked Free, Rcv OK, Rcv Err or Recv CRC. The driver reaches the device through memory-mapped register access, the device reaches packet memory through DMA, and the device signals the driver with interrupts.]

Driver allocates ring buffers.
Driver resets descriptors to initial state.
Driver puts packets to be sent in Tx buffers.
Device puts received packets in Rx buffers.
Driver and device update descriptors to indicate state.
Device indicates Rx and end of Tx with an interrupt, unless interrupt mitigation techniques are applied.
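The descriptor handshake on this slide can be modeled in plain user-space C. This is only a sketch under loud assumptions: every name (tx_ring, ring_queue_tx, device_transmit_one, ring_reclaim) is hypothetical, the "device" is an ordinary function rather than hardware doing DMA, and only the Tx direction is shown.

```c
#include <stddef.h>
#include <string.h>

#define RING_SIZE 4
#define BUF_SIZE  64

/* Hypothetical descriptor states, loosely following the slide's labels. */
enum desc_state { DESC_FREE, DESC_SEND, DESC_SENT_OK };

struct tx_desc {
    enum desc_state state;
    size_t len;
    unsigned char buf[BUF_SIZE];
};

struct tx_ring {
    struct tx_desc desc[RING_SIZE];
    unsigned int head;   /* next descriptor the driver fills    */
    unsigned int dev;    /* next descriptor the device sends    */
    unsigned int clean;  /* next descriptor the driver reclaims */
};

/* Driver: reset every descriptor to its initial (free) state. */
void ring_reset(struct tx_ring *r)
{
    memset(r, 0, sizeof(*r));
    for (unsigned int i = 0; i < RING_SIZE; i++)
        r->desc[i].state = DESC_FREE;
}

/* Driver: copy a packet into the next Tx buffer and hand it to the device.
 * Returns 0 on success, -1 if the ring is full or the packet is too big. */
int ring_queue_tx(struct tx_ring *r, const void *pkt, size_t len)
{
    struct tx_desc *d = &r->desc[r->head];

    if (len > BUF_SIZE || d->state != DESC_FREE)
        return -1;
    memcpy(d->buf, pkt, len);
    d->len = len;
    d->state = DESC_SEND;
    r->head = (r->head + 1) % RING_SIZE;
    return 0;
}

/* Simulated device: "transmit" one pending descriptor and mark it done.
 * Returns 1 if a packet was sent, 0 if nothing was pending. */
int device_transmit_one(struct tx_ring *r)
{
    struct tx_desc *d = &r->desc[r->dev];

    if (d->state != DESC_SEND)
        return 0;
    d->state = DESC_SENT_OK;
    r->dev = (r->dev + 1) % RING_SIZE;
    return 1;
}

/* Driver, typically from its interrupt handler: free completed descriptors.
 * Returns the number of descriptors reclaimed. */
int ring_reclaim(struct tx_ring *r)
{
    int freed = 0;

    while (r->desc[r->clean].state == DESC_SENT_OK) {
        r->desc[r->clean].state = DESC_FREE;
        r->clean = (r->clean + 1) % RING_SIZE;
        freed++;
    }
    return freed;
}
```

Once all RING_SIZE descriptors are in the Send state, ring_queue_tx() fails until the device has completed at least one packet and the driver has reclaimed it. A real driver faces the same back-pressure and typically responds by stopping its transmit queue.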

Network Device Registration

Each network device is represented by a struct net_device.

These are allocated using:
    struct net_device *alloc_netdev(size, mask, setup_func);

    size: size of our private data part
    mask: a naming pattern (e.g. eth%d)
    setup_func: a function that sets up the rest of net_device

And registered via a call to:
    int register_netdev(struct net_device *dev);
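To make the roles of the three alloc_netdev() arguments concrete, here is a small user-space model. It is not kernel code and does not use the kernel API: the model_ names, the fixed-size registry and the name-resolution rule (first free slot number replaces %d at registration time) are all simplifying assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_IFNAMSIZ 16
#define MODEL_MAX_DEVS 8

/* Hypothetical stand-in for struct net_device. */
struct model_netdev {
    char name[MODEL_IFNAMSIZ];
    unsigned int mtu;
    void *priv;              /* driver-private area, allocated with the device */
};

static struct model_netdev *model_registered[MODEL_MAX_DEVS];

/* Example setup function: fills in the rest of the device. */
void model_setup(struct model_netdev *dev)
{
    dev->mtu = 1500;
}

/* Models alloc_netdev(size, mask, setup_func): one allocation holds the
 * device plus priv_size bytes of private data; the name still contains
 * the pattern; setup() initializes the remaining fields. */
struct model_netdev *model_alloc_netdev(size_t priv_size, const char *mask,
                                        void (*setup)(struct model_netdev *))
{
    struct model_netdev *dev = calloc(1, sizeof(*dev) + priv_size);

    if (!dev)
        return NULL;
    dev->priv = priv_size ? (void *)(dev + 1) : NULL;
    snprintf(dev->name, sizeof(dev->name), "%s", mask);
    setup(dev);
    return dev;
}

/* Models register_netdev(): picks the first free slot and, if the name
 * contains "%d", replaces it with the slot number. Returns 0 or -1. */
int model_register_netdev(struct model_netdev *dev)
{
    for (int unit = 0; unit < MODEL_MAX_DEVS; unit++) {
        char *pct;

        if (model_registered[unit])
            continue;
        pct = strstr(dev->name, "%d");
        if (pct) {
            char resolved[MODEL_IFNAMSIZ];

            snprintf(resolved, sizeof(resolved), "%.*s%d%s",
                     (int)(pct - dev->name), dev->name, unit, pct + 2);
            strcpy(dev->name, resolved);
        }
        model_registered[unit] = dev;
        return 0;
    }
    return -1;               /* registry full */
}
```

In this model, two devices allocated with the pattern eth%d and registered in turn end up named eth0 and eth1.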

Network Device Initialization

The net_device structure is initialized with numerous methods and flags by the setup function:

open: request resources, register interrupts, start queues.
stop: deallocate resources, unregister the IRQ, stop the queue.
get_stats: report statistics.
set_multicast_list: configure the device for multicast.
hard_start_xmit: called by the stack to initiate Tx.
IFF_MULTICAST: device supports multicast.
IFF_NOARP: device does not support the ARP protocol.
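The methods listed on this slide are function pointers that the stack calls through. The following user-space sketch shows that dispatch pattern only; struct model_dev, the demo_ functions, model_stack_send and the MODEL_IFF_ flag values are hypothetical and merely mirror the slide's names, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical flag bits, named after the slide's IFF_ flags. */
#define MODEL_IFF_MULTICAST 0x1u
#define MODEL_IFF_NOARP     0x2u

/* Hypothetical stand-in for the method and flag part of net_device. */
struct model_dev {
    unsigned int flags;
    int running;
    unsigned long tx_packets;
    int (*open)(struct model_dev *dev);
    int (*stop)(struct model_dev *dev);
    int (*hard_start_xmit)(const void *pkt, size_t len, struct model_dev *dev);
};

/* A real open would request resources and interrupts, then start queues. */
static int demo_open(struct model_dev *dev)
{
    dev->running = 1;
    return 0;
}

/* A real stop would release what open acquired and stop the queue. */
static int demo_stop(struct model_dev *dev)
{
    dev->running = 0;
    return 0;
}

/* Pretend to transmit: count the packet if the device is up. */
static int demo_xmit(const void *pkt, size_t len, struct model_dev *dev)
{
    (void)pkt;
    (void)len;
    if (!dev->running)
        return -1;
    dev->tx_packets++;
    return 0;
}

/* Setup function: installs the methods and flags for this device. */
void demo_setup(struct model_dev *dev)
{
    dev->open = demo_open;
    dev->stop = demo_stop;
    dev->hard_start_xmit = demo_xmit;
    dev->flags = MODEL_IFF_MULTICAST;   /* multicast capable, ARP in use */
}

/* Stack side: initiating Tx is just a call through the pointer. */
int model_stack_send(struct model_dev *dev, const void *pkt, size_t len)
{
    return dev->hard_start_xmit(pkt, len, dev);
}
```

The stack never needs to know which driver it is talking to; it only sees the function pointers and flags the setup function installed.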

Packet Representation

We need to manipulate packets as they travel through the stack. This manipulation must efficiently support:
Adding protocol headers/trailers on the way down the stack.
Removing protocol headers/trailers on the way up the stack.

Packets can be chained together.
Each protocol should have convenient access to header fields.
To do all this, the kernel uses the sk_buff structure.

Socket Buffers

The sk_buff structure represents a single packet.
This structure is passed through the protocol stack.
It holds pointers to a buffer with the packet data.
It holds many other types of information:
Data size.
Incoming device.
Priority.
Security...

struct sk_buff

next: Next buffer in list
prev: Previous buffer in list
sk: Socket we are owned by
tstamp: Time we arrived
dev: Device we arrived on/are leaving by
input_dev: Device we arrived on
h: Transport layer header
nh: Network layer header
mac: Link layer header
dst: Destination route cache entry
sp: Security path, used for xfrm
cb: Control buffer. Private data.
len: Length of actual data
data_len: Data length
mac_len: Length of link layer header
csum: Checksum
local_df: Allow local fragmentation flag
cloned: Head may be cloned (see refcnt)
nohdr: Payload reference only flag
pkt_type: Packet class
fclone: Clone status
ip_summed: Driver fed us an IP checksum
priority: Packet queuing priority
users: User count; see {datagram,tcp}.c
protocol: Packet protocol from driver
truesize: Buffer size
head: Head of buffer
data: Data head pointer
tail: Tail pointer
end: End pointer
destructor: Destruct function
nfmark: Netfilter hooks private data
nfct: Associated connection, if any
ipvs_property: skbuff is owned by ipvs
nfctinfo: Connection tracking info
nfct_reasm: Netfilter conntrack reassembly pointer
nf_bridge: Saved data about a bridged frame
tc_index: Traffic control index
tc_verd: Traffic control verdict
dma_cookie: DMA operation cookie
secmark: Security marking for LSM

Socket Buffer Diagram

[Figure: buffer layout from head to end: headroom, Ethernet header, IP header, TCP header, payload, pad.]
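The layout in this diagram is what makes adding and removing headers cheap: headers are prepended by moving the data pointer back into the headroom, not by copying the payload. Below is a user-space model of that pointer arithmetic. The model_ names are hypothetical and merely echo the kernel helpers skb_reserve(), skb_put(), skb_push() and skb_pull(). As a simplification, the put and push models return NULL when they run out of room, whereas the kernel treats that as a fatal bug.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the four buffer pointers of struct sk_buff. */
struct model_skb {
    unsigned char *head;   /* start of the allocated buffer   */
    unsigned char *data;   /* start of the valid packet data  */
    unsigned char *tail;   /* end of the valid packet data    */
    unsigned char *end;    /* end of the allocated buffer     */
    unsigned int len;      /* tail - data                     */
};

struct model_skb *model_alloc_skb(size_t size)
{
    struct model_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->head = malloc(size);
    if (!skb->head) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return skb;
}

void model_free_skb(struct model_skb *skb)
{
    free(skb->head);
    free(skb);
}

/* Leave headroom in an empty buffer for headers to be added later. */
void model_skb_reserve(struct model_skb *skb, size_t len)
{
    skb->data += len;
    skb->tail += len;
}

/* Append len bytes at the tail; returns where the new bytes start. */
unsigned char *model_skb_put(struct model_skb *skb, size_t len)
{
    unsigned char *old_tail = skb->tail;

    if ((size_t)(skb->end - skb->tail) < len)
        return NULL;
    skb->tail += len;
    skb->len += (unsigned int)len;
    return old_tail;
}

/* Prepend len bytes (a header) by growing into the headroom. */
unsigned char *model_skb_push(struct model_skb *skb, size_t len)
{
    if ((size_t)(skb->data - skb->head) < len)
        return NULL;
    skb->data -= len;
    skb->len += (unsigned int)len;
    return skb->data;
}

/* Strip len bytes (a header) from the front on the way up the stack. */
unsigned char *model_skb_pull(struct model_skb *skb, size_t len)
{
    if (len > skb->len)
        return NULL;
    skb->data += len;
    skb->len -= (unsigned int)len;
    return skb->data;
}
```

A sender would typically reserve room for all headers (for example 14 bytes of Ethernet plus 20 of IP plus 20 of TCP), put the payload, and let each layer push its own header on the way down. A receiver pulls the headers off one by one on the way up.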
