
    Addressing the Scalability of Ethernet with MOOSE

    Malcolm Scott, Daniel Wagner-Hall, Andrew Moore and Jon Crowcroft

    University of Cambridge Computer Laboratory
    {Malcolm.Scott, daw63, Andrew.Moore, Jon.Crowcroft}@cl.cam.ac.uk

    Abstract

    Ethernet does not scale well to large networks. The flat MAC address space, whilst having obvious benefits for the user and administrator, is the primary cause of this poor scalability; other recent efforts to improve upon Ethernet’s scalability have addressed symptoms, rather than this underlying cause. In this paper we present MOOSE, Multi-level Origin-Organised Scalable Ethernet, an Ethernet switch architecture that performs in-place rewriting of MAC addresses in order to impose a hierarchy upon the address space without reconfiguration or modification of connected devices. This removes the need for switches to maintain large forwarding databases, is of direct use in implementing improved routing, and allows for a variety of other scalability and security innovations. We also present a globally-scalable, distributed and resilient protocol for the automatic assignment of addresses to switches, and for detecting and cheaply resolving addressing conflicts.

    1 Introduction

    Ethernet has lasted well since its inception in the ’70s [1], with Ethernet frame structure and addressing remaining ubiquitous in the data centre environment as in many others. Alongside IP and IP-transported services such as iSCSI, it is now commonplace to see converged network services such as physical disk interfaces and cluster interconnects layered directly over Ethernet (e.g. ATA-over-Ethernet and variants of Infiniband). However, Ethernet exhibits scalability issues on networks of more than a few thousand devices, such as costly and energy-dense address table logic and storms of broadcast traffic.

    Aside from more physical devices, virtualised infrastructure further increases the density of Ethernet addresses in data centres. Widely-used layer-2 virtualisation mandates a unique Ethernet address per virtual machine [2]. This means that each physical machine in a data centre may represent many tens of Ethernet devices.

    The traditional method of avoiding such problems is the artificial subdivision of a network, but this introduces an administrative burden, requires significant routing equipment and also precludes seamless migration, a necessity for virtualised infrastructure [3]. While IP Mobility [4] addresses the problem of maintaining higher-layer connections when roaming between subnets, it requires client support that is neither ubiquitous nor reliable. Common practice sees the provision of one physical Ethernet network covering an entire data centre, or even an entire WAN of data centres.

    Our approach, Multi-level Origin-Organised Scalable Ethernet (MOOSE), provides all the advantages of an Ethernet network without the capital and running costs and administrative overhead of an IP router-based approach. MOOSE does this by providing a hierarchical addressing scheme without requiring host reconfiguration or modification.

    Ethernet’s scalability is limited firstly by the forwarding database that every switch in an Ethernet network must maintain [5, §7.8–7.9]. A switch’s forwarding database contains one entry per source address seen in any frame passing through that switch, and stores that MAC address together with the learnt location of that address: the port on which packets from that address were last seen. This is later used to determine on which port to transmit frames destined for that address. Devices frequently broadcast frames throughout the network (e.g. ARP queries), so active devices on the network are listed in most switches’ forwarding databases most of the time.
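    The learning behaviour described above can be sketched in a few lines of Python. This is a toy model rather than the behaviour mandated by the standard: the class name `LearningSwitch`, the fixed `capacity` and the policy of simply not learning new addresses once the table is full are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


class LearningSwitch:
    """Toy model of a conventional Ethernet switch's forwarding database."""

    def __init__(self, num_ports: int, capacity: int) -> None:
        self.num_ports = num_ports
        self.capacity = capacity        # assumed fixed table size
        self.fdb: Dict[str, int] = {}   # MAC address -> port it was last seen on

    def receive(self, src: str, dst: str, in_port: int) -> List[int]:
        """Learn the source address, then return the ports to transmit on."""
        # Learn (or refresh) the sender's location if there is room.
        # Simplification: a full table just stops learning new addresses.
        if src in self.fdb or len(self.fdb) < self.capacity:
            self.fdb[src] = in_port
        out: Optional[int] = self.fdb.get(dst)
        if out is None:
            # Unknown or broadcast destination: flood to every other port.
            return [p for p in range(self.num_ports) if p != in_port]
        # Known destination: forward, unless it lives on the ingress port.
        return [] if out == in_port else [out]
```

    Note that the table holds one entry per host seen anywhere on the network, which is the property that limits scalability.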

    In modern switches the capacity of this database is generally of the order of 16,000 entries [6]. (Higher-capacity forwarding databases exist but are currently constrained to very high-end switches.) On a moderately large network, full databases are a serious risk. If the database becomes full, entries will be discarded; frames for unknown addresses are flooded to all ports and the resulting traffic storm could cause major problems, especially in the presence of low-capacity edge links.

    Traditionally the forwarding database has been stored in a content-addressable memory (CAM) as lookups must be very fast, particularly as 10 Gbit/s Ethernet becomes ubiquitous. As networks grow, the number of entries in a switch’s forwarding database must naturally increase; however, increasing the capacity of CAMs without sacrificing speed whilst constraining energy consumption is proving to be challenging [7, 8]. Cheaper switches use DRAM in place of a CAM, but this is likely to remain slower, especially for large tables.

    Secondly, Ethernet’s inability to handle networks containing loops also presents a scalability problem. The Rapid Spanning Tree Protocol, RSTP [5, §17], must remove loops by disabling any redundant links. On a dense mesh network, RSTP will disable a large proportion of links; this constrains frames to suboptimal routes and may introduce bottlenecks in the network, particularly around the root of the spanning tree. In a data centre environment, this potentially amounts to a very large proportion of capacity being wasted wherever redundant fibres are installed, e.g. between cabinet switches and between data centres.

    Thirdly, not only does Ethernet flood frames destined for unknown hosts, but it also uses (and encourages higher-layer protocols to use) broadcast for control messages. For example, ARP [9] performs address resolution via broadcast queries, and DHCP [10] uses broadcast messages for automatic configuration. It is impractical to replace these protocols entirely as this would require software upgrades to every device, but it would be desirable for the network to minimise the amount of broadcast traffic required to be forwarded.

    In this paper we identify the relevant underlying problems in the design of Ethernet (section 2), review previous work (section 3) and present the MOOSE switch architecture, which addresses inadequacies in the fundamental operation of Ethernet in a novel yet backwards-compatible way (section 4). By revisiting the addressing scheme itself, rather than simply addressing symptoms of the problem as many previously proposed solutions have done, we can go about solving all of the above scalability problems and more.

    A working high-level software implementation of MOOSE is described and evaluated in section 6.

    This work expands on our previous paper presented at the First Workshop on Data Center – Converged and Virtual Ethernet Switching (DC CAVES) [11]. We have added a crucial piece of the architecture (an automatic addressing scheme with cheap conflict resolution) and have better addressed the key issue of compatibility with existing protocols.

    2 Ethernet’s Underlying Problem

    The original Ethernet was a shared-medium network, where every frame was broadcast and no switching took place. Modern-day wired Ethernet-based networks instead consist almost entirely of point-to-point links; as a result of this, the distinction between unicast, broadcast and multicast has become more important. 802.11 wireless LANs are the one remaining vestige of Ethernet operating over shared media, where one switch (access point) serves many hosts on the same radio channel.

    Ethernet’s poor scalability arises in various guises, as outlined in section 1. It would seem at first glance that these are entirely distinct and unrelated. However, there is a common underlying cause: that MAC addresses provide no location information.

    Globally-unique MAC addresses are structured such that the first three bytes of a device’s address contain an organisationally unique identifier (OUI) allocated to the device’s manufacturer by the IEEE, with the remaining three bytes allocated by the manufacturer. This hierarchy exists solely for the purpose of allocating unique addresses in a decentralised fashion, and is of no use to Ethernet switches, which must treat the unicast address space as flat.

    A flat address space has the advantage that no configuration of devices is required; a device can use its unique, manufacturer-assigned MAC address anywhere on any network. However, this leaves each switch with the task of discovering and storing the location of every addressable device.

    If the MAC address space were not flat, but instead contained enough information to locate the device possessing the address, several advantages would be gained. Firstly, large forwarding databases would no longer have to be maintained on every switch. This location information could instead be distributed across the network so that frames are directed towards their destinations according to successive stages of a hierarchy.

    Secondly, a hierarchical MAC address space would also make the addition of shortest-path routing considerably easier. Shortest-path routing is clearly a desirable property for a network, yet it is one that Ethernet does not provide. Flat addressing does not lend itself to easy routing: any address can be located anywhere on the network, which means either advertising every host’s MAC address via the routing protocol (which scales very poorly) or providing some other location lookup service. The use of hierarchical addresses, with each switch handling a block of sequential addresses akin to an IP subnet, would reduce the routing problem to the one that routing protocols were designed to solve.

    Thirdly, this would allow for reduction of broadcast traffic in a variety of different ways. Hierarchical MAC addresses could, for example, be mapped directly and deterministically onto the IP address space, if appropriate for the specific deployment. This would allow switches to respond directly and simply to DHCP and ARP queries, avoiding the need to forward the most common sources of broadcast frames. Alternatively, a distributed directory service can be used, which is less limiting and is thus our preferred approach as detailed in section 4.5.

    The facility for network administrators to assign locally administered addresses (LAAs) to devices has existed for as long as Ethernet. However, configuring and maintaining the LAA on every device based upon where they are connected would be a considerable and unwelcome administrative overhead. In this paper we therefore present MOOSE, a system for applying hierarchical addressing to an Ethernet transparently and without any configuration to edge devices.

    3 Related Work

    It is well-known that traditional Ethernet scales poorly, and there have been various attempts in recent years to rectify this. The most widely-used of these in real-world networks is MPLS-VPLS (Multiprotocol Label Switching / Virtual Private LAN Service) [12]. This connects Ethernet islands together through tunnels across an MPLS cloud. MPLS works by adding one or more labels to the start of every frame, i.e. encapsulating the frame inside its own protocol.

    In MPLS-VPLS, the label edge routers (LERs) must determine the frame’s initial label(s) based upon the destination address via a lookup table. Frames follow prenegotiated label-switched paths (LSPs) that, unlike Ethernet, are not constrained to follow a spanning tree; LSPs are precomputed at connection setup time and the relevant next hop is stored in a lookup table on each intermediate switch. Each switch must hence use each frame’s label to index into this lookup table to determine how to switch the frame.

    The effect, once the connection has been negotiated, is to provide what appears to be one or more large Ethernet networks, transparently overlaid on the MPLS cloud. Whilst this effectively solves the problem of shortest-path routing across the MPLS cloud, the overlay Ethernets are still susceptible to the usual scalability problems; in fact VPLS adds further large lookup tables on every switch that can in some configurations scale even worse than Ethernet’s forwarding databases. LERs must map every MAC address to an LSP; label switch routers (LSRs) must store the next hop for every LSP in which they participate, which in the core of the network could scale as $O(\mathrm{hosts}^2)$.

    A similar scheme is proposed by Hadžić [13], with the difference that Ethernet-inside-Ethernet encapsulation is used rather than a new protocol. This has the advantage that less processing is required on intermediate switches in the backbone network. However, routes across the backbone are constrained to a spanning tree, and encapsulating switches must obtain a new destination address for every frame using a lookup table that, like Ethernet’s forwarding database, must contain every transmitting MAC address. Due to its heavy basis on Ethernet, this shares many of Ethernet’s scalability problems.

    SmartBridge [14] and Rbridges [15] both encapsulate Ethernet frames in a new inter-switch protocol, and run a link-state routing protocol between switches. The link state graph includes the location of every MAC address (necessary because the address space remains flat and any address could appear anywhere), i.e. it again contains every host. Furthermore, switches must perform expensive computation to update routing tables whenever a MAC address joins or leaves the network.

    Myers et al. [16] suggest that Ethernet’s main failing is its broadcast service, and propose a new architecture in which hosts make explicit use of directory services operated by switches rather than broadcasting queries. It is clear that switches’ participation is necessary in order to deal with the broadcast problem; however, the modifications to Ethernet suggested are not backwards-compatible and would require at least software modifications to all connected devices. Ethernet is, perhaps unfortunately, too widespread for this to be practical; transparent interception of broadcast frames and subsequent local handling or redirection via multicast or unicast remains the only practical solution. The use of hierarchical addressing is a useful stepping-stone to such a system, and our architecture includes a transparent directory service (ELK, section 4.5) for this purpose.

    SEATTLE [17] takes a more scalable approach. A routing protocol is operated between switches, but in contrast to the approaches described above and in common with MOOSE, the routing protocol only propagates switch location information, rather than every MAC address on the network. Flat MAC addresses are still used, and hence a mechanism is required to look up the switch to which a given address is connected. This is achieved by using a distributed hash table (DHT) operating on participating switches with local caching to alleviate load. This is certainly a step in the right direction but introduces considerable complexity to switches, since they now must maintain and update the DHT continually, and it is clear that a SEATTLE switch would have a significant software component in the data path. MOOSE alleviates some of the complexity of SEATTLE by a combination of hierarchical addresses and delegation to a separate directory service.

    4 MOOSE Architecture

    The basic operation of MOOSE is to assign a new hierarchical MAC address to each host on the network, assigned dynamically and automatically from the unicast LAA space. This dynamically-assigned address is referred to as a MOOSE address to avoid confusion with hosts’ static, manufacturer-assigned MAC addresses.

    Every frame entering the network has its source address rewritten in-place to the sending host’s MOOSE address by the first MOOSE-aware switch it traverses. The switch that performs address rewriting for a host (i.e. the closest MOOSE switch to that host) is the host’s home switch and is responsible for assigning a MOOSE address to that host. (If non-MOOSE switches or hubs are in use, a host may have more than one “closest” MOOSE switch, in which case an RSTP-like protocol must be used to elect a switch to handle each edge segment.)

    The destination address is left intact in the expectation that it already is a MOOSE address. Hosts’ ARP caches will already contain the MOOSE addresses of any hosts being communicated with, as any packet received will already have had its source address rewritten; a host’s manufacturer-assigned MAC address is never seen outside of the segment containing that host. This is a crucial point, since encapsulation-based technologies such as MPLS do not reveal to the destination host the address used for routing; as a result, switches in such schemes must convert destination as well as source addresses of frames entering the network. In other words, those switches must once again maintain large tables of remote hosts on the network. The only destination rewriting that MOOSE switches perform, however, is of the destination addresses of frames destined for local hosts back to their manufacturer-assigned MAC addresses; this is simple as the required information is already known, and necessary because otherwise that host’s network interface card would discard the frame as misaddressed.

    A MOOSE address consists of a switch identifier followed by a host identifier. For our examples, we simply use a fixed three-byte switch identifier followed by a fixed three-byte host identifier, as illustrated in figure 1. Since these two identifiers when concatenated must form a unicast LAA, the settings of two bits in the first byte of the switch identifier are fixed: the least significant bit must be 0 to indicate a unicast address, and the second-least significant bit must be 1 to indicate an LAA. To cater for variable-length switch identifiers, some means of introducing separation between the switch and host identifiers is required. Two possible implementations would be for:

    • the first three bits of the address to indicate how many of the following 5-bit blocks make up the switch prefix;

    • some constant delimiter to appear between the switch identifier and host identifier, with switch identifiers not allowed to contain the delimiter.

    [Figure 1: Assignment of MOOSE addresses by switches. The diagram shows switches 02:11:11, 02:22:22 and 02:33:33; each attached host receives an address prefixed by its home switch’s identifier, e.g. 02:22:22:00:00:01 to 02:22:22:00:00:03 and 02:33:33:00:00:01 to 02:33:33:00:00:04.]

    The former is simple and gives 8 classes of switch identifier. Because the size of a MOOSE network is limited by the placement of IP routers, these classes should be sufficient. Additionally, because switches are free to change their identifier, they may trivially switch to a larger class if they have too many attached hosts, or if a smaller class becomes full.

    The latter removes the fixed classes, allowing for more flexibility with the sizes of switch identifiers, at the cost of complexity and a reduction in the available address space.
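    As an illustration of the fixed-length format used in our examples (three-byte switch identifier, three-byte host identifier), the following Python sketch composes and splits MOOSE addresses and checks the two fixed bits of a unicast LAA. The function names are hypothetical, and the variable-length encodings discussed above are not modelled.

```python
from typing import Tuple


def make_moose_address(switch_id: bytes, host_id: bytes) -> str:
    """Concatenate a 3-byte switch identifier and a 3-byte host identifier."""
    if len(switch_id) != 3 or len(host_id) != 3:
        raise ValueError("this sketch assumes 3-byte switch and host identifiers")
    first = switch_id[0]
    if first & 0x01:
        # Least significant bit of the first byte must be 0 (unicast).
        raise ValueError("switch identifier has the multicast bit set")
    if not first & 0x02:
        # Second-least significant bit must be 1 (locally administered).
        raise ValueError("switch identifier is not a locally administered address")
    return ":".join(f"{b:02x}" for b in switch_id + host_id)


def split_moose_address(addr: str) -> Tuple[bytes, bytes]:
    """Split a colon-separated MOOSE address into (switch_id, host_id)."""
    raw = bytes.fromhex(addr.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("expected a 6-byte address")
    return raw[:3], raw[3:]
```

    For example, `make_moose_address(bytes.fromhex("021111"), bytes.fromhex("000001"))` yields the address used for Host A in figure 2.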

    Each switch can select for itself a unique switch identifier, as identifier conflict resolution is cheap (section 4.2). When first joining the routing protocol (section 4.1), conflict should be very unlikely, as the switch will in the process gain an up-to-date list of in-use identifiers. Depending on requirements, the switch identifier may itself be a hierarchical address (e.g. six bits to identify a network area followed by two bytes to identify a switch within that area), which could then be used to aid routing decisions.

    Each host is assigned a host identifier by its home switch from the pool of identifiers available to that switch. Only a host’s home switch ever bases a switching decision on the host identifier, so the detail of how these are allocated can vary from switch to switch. Suitable schemes include:

    • sequential assignment;

    • the port number followed by a sequential portion (to allow for multiple hosts connected to one port);

    • a hash of the host’s real MAC address.

    The latter two approaches are preferable to a simple sequential assignment, as they better isolate certain kinds of denial-of-service attack in which a malicious host attempts to use up all available host identifiers on the switch. They also require less state to be shared between ports. The third option has the further advantage that it is deterministic and hence can be recovered easily in the event of a crash.
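    The second and third schemes might look as follows in Python. This is only a sketch: the one-byte port number plus two-byte sequence layout, the use of SHA-256 and the names `PortSequentialAllocator` and `host_id_from_mac` are assumptions, and hash collisions between hosts are not handled.

```python
import hashlib
from typing import Dict


def host_id_from_mac(mac: str) -> bytes:
    """Deterministic 3-byte host identifier: a truncated hash of the real MAC."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return hashlib.sha256(raw).digest()[:3]


class PortSequentialAllocator:
    """Host identifier = 1-byte port number followed by a 2-byte sequence."""

    def __init__(self) -> None:
        self.next_seq: Dict[int, int] = {}    # per-port counter
        self.assigned: Dict[str, bytes] = {}  # real MAC -> host identifier

    def allocate(self, port: int, mac: str) -> bytes:
        key = mac.lower()
        if key in self.assigned:
            return self.assigned[key]
        seq = self.next_seq.get(port, 1)
        if not 0 <= port <= 0xFF or seq > 0xFFFF:
            # A host flooding one port can exhaust only that port's pool.
            raise RuntimeError("no host identifiers left for this port")
        self.next_seq[port] = seq + 1
        host_id = bytes([port]) + seq.to_bytes(2, "big")
        self.assigned[key] = host_id
        return host_id
```

    Because each port draws from its own pool, the only state shared between ports is the MAC-to-identifier map, and the hash-based variant needs no allocation state at all.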

    It is hence possible to route frames through the network to remote hosts by simply inspecting the switch identifier in the destination address, and ignoring the host identifier until the frame reaches the destination host’s home switch. Switches no longer need to keep a table of all MAC addresses seen recently; they only need store the locations of other switches and of any directly-connected hosts.
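    A minimal Python sketch of this forwarding decision is given below, assuming the fixed three-byte/three-byte address format and colon-separated lower-case strings. The function name and the two tables (`switch_routes` for other switches, `local_hosts` for directly-connected hosts) are illustrative names, not part of the architecture.

```python
from typing import Dict, Optional


def output_port(dst: str, my_switch_id: str,
                switch_routes: Dict[str, int],
                local_hosts: Dict[str, int]) -> Optional[int]:
    """Choose an output port for a unicast frame addressed to `dst`.

    Returns None if the destination is unknown.
    """
    dst_switch, dst_host = dst[:8], dst[9:]   # "02:33:33", "00:00:01"
    if dst_switch == my_switch_id:
        # We are the home switch: only now is the host identifier consulted.
        return local_hosts.get(dst_host)
    # Remote host: forward on the switch identifier alone.
    return switch_routes.get(dst_switch)
```

    The size of `switch_routes` grows with the number of switches, and `local_hosts` with the number of directly-attached hosts; neither grows with the total number of hosts on the network.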

    As well as reducing the amount of data that must be consulted in order to make switching decisions, this provides extra resilience by making this data much more predictable. The number of MAC addresses in a network can increase unexpectedly in the event of an address flooding attack [18] or even under normal operation if the network contains open wireless access points; relying on the MAC address list for forwarding leads to some of the vulnerabilities of Ethernet. The set of switch identifiers participating in MOOSE switching, on the other hand, is kept predictable and manageable by ensuring that neighbouring switches (discovered using LLDP [19]) are authenticated before they can participate in the routing protocol. This authentication can be achieved at layer 3 using the security features found in most popular routing protocols and/or at layer 2 using 802.1X [20]. As the switch identifier is the only address consulted for forwarding decisions, a MOOSE switch is likely to remain reliable in the face of attacks that could have brought down a traditional Ethernet. Furthermore, any attacks based upon MAC address spoofing cannot function on a MOOSE network as the user-provided MAC address is translated immediately.

    4.1 Shortest Path Routing

    As described so far, MOOSE switches must still forward frames along a spanning tree. As discussed in section 1, this is an undesirable property of Ethernet as it can cause frames to take a highly suboptimal path through the network. The foundations are in place to do much better than this using shortest-path routing.

    For the purpose of frame forwarding, a MOOSE switch can be considered akin to a layer 3 router; it has one locally-connected subnet (containing all addresses starting with its switch identifier) and delivers frames to other subnets by passing them to an appropriate neighbouring switch. Bearing this in mind, the switch can run a routing protocol of the kind normally used for IP, such as a variant of OSPF [21]. This allows frames to be routed along the shortest available path, rather than being constrained to a spanning tree. OSPF-OMP [22] may be particularly desirable due to its ability to make use of multiple equal-cost routing paths in order to improve performance [23].
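    To make the analogy concrete, the sketch below runs Dijkstra’s algorithm over a graph whose nodes are switch identifiers, producing the next-hop neighbour for each remote switch. It illustrates only the shortest-path computation that a link-state protocol performs; it is not OSPF, it ignores equal-cost multipath, and the function name and graph representation are assumptions.

```python
import heapq
from typing import Dict


def next_hops(graph: Dict[str, Dict[str, int]], source: str) -> Dict[str, str]:
    """Map each reachable switch to the neighbour of `source` on a shortest path.

    `graph[a][b]` is the cost of the link from switch a to switch b.
    """
    dist = {source: 0}
    first_hop: Dict[str, str] = {}
    heap = [(0, source, "")]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Direct neighbours are their own first hop; others inherit it.
                first_hop[nbr] = nbr if node == source else hop
                heapq.heappush(heap, (nd, nbr, first_hop[nbr]))
    return first_hop
```

    The resulting table has one entry per switch, independent of how many hosts each switch serves.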

    4.2 Address Selection and Conflict Resolution

    For reasons akin to those behind the flaws of Ethernet, it is undesirable to require universally unique, pre-determined MOOSE switch identifiers. Due to the reduced size of the switch ID space compared to the MAC address space, this would also be infeasible. We therefore propose that each switch selects an initial address for itself during startup. This could result in more than one switch claiming an address, which would be undesirable, so to handle MOOSE addresses that find themselves in conflict we additionally propose a simple and inexpensive conflict resolution protocol.

    Suppose two switches each have the same identifier. We note that if these switches are on separate MOOSE networks (on disconnected networks, or separated by an IP router), this situation brings no issue. Should they be on the same MOOSE network, however, a conflict exists and must be resolved. Any routing protocol would require a switch to know which port other switches are connected to, for instance by OSPF neighbour lists, or simply by receiving frames and noting the switch port and source MOOSE address. When a switch receives a MOOSE frame, it looks up the source switch in its forwarding database, which is likely to be held in fast content-addressable memory. If it finds that source switch to be on a port other than the one recorded in its table, one of three situations may be possible:

    • the source switch may be the same as the known switch, and have physically moved, or a topology change has occurred;

    • the source switch may be a different one to the known switch, and they are in conflict;

    • the source switch may be the same as the known switch, but is sending frames down a different route to the last used route.

    To avoid disruption to the network in the first case, and to give scope for switches to migrate within the network, the switch which detected the possible conflict should ascertain whether the known switch is still alive and present. The conflict-resolving switch thus sends a unicast frame to the known switch, via the port stored in the forwarding database, asking whether it is there; it repeats this at a regular interval until a timeout. If the known switch is still present, this frame will reach it rather than the new switch, as other switches beyond that port cannot yet have detected the conflict. The nature of the timeout we leave unspecified, and it can be implementation specific. It may, for instance, be a pre-defined constant, or it may vary based on QoS information gathered if such capabilities are supported. When a MOOSE switch receives such a frame, it should promptly respond with an acknowledgement frame, showing that it is alive.

    If, within the timeout period, the conflict resolver finds the known switch not to be alive, no conflict exists, so the resolver updates its view of the network by removing the old entry from its forwarding database and triggering a routing protocol refresh.

    If, on the other hand, the known switch is found to be alive, a conflict exists. The conflict resolver then sends a frame to the more recently found switch indicating that it is in conflict and should change its address. That switch, upon receiving this frame, changes its address and sends a gratuitous ARP for each of its connected hosts, so that the rest of the network is aware of the change. To mitigate the risks of a denial of service attack, or of faulty equipment sending out conflict frames, an exponential backoff algorithm should be used when receiving conflict notification frames.

    A switch should have a timer, and a counter influencing the maximum value of the timer, both initialised to 0. When a conflict notification frame is received, the counter is incremented (subject to a saturation value to avoid excessive timeouts). After a conflict has been resolved (i.e. the switch has changed its address) a timer starts counting down from some time exponential in that counter; subsequently the switch will only change its address if the timer has returned to 0 by the time the conflict frame is received. The counter should be reset to 0 when the timer reaches 0. Using this scheme the event of true conflict is handled quickly, even in the unlikely case that the newly acquired address is also in conflict. Any node emitting malicious or erroneous conflict notifications, however, is rate-limited enough that its damage potential is much restricted, subject to a sufficient timer being chosen.

    Algorithm 1 Conflict resolution backoff

        if timer > 0 then
            if counter < counter_max then
                counter = counter + 1;
            end
            // Discard conflict notification frame
        else
            timer = k^counter;
            change_address();
        end

    Algorithm 2 Conflict resolution timer

        foreach clock tick do
            if timer > 0 then
                timer = timer − 1;
            else
                counter = 0;
            end
        end

    This could be further enhanced by detecting repeated conflicts involving the same switch or switches, in a manner similar to BGP Route Flap Damping [24], and performing more aggressive steps to avoid further conflicts, for example using a significantly increased timeout, and/or having both switches in conflict select new addresses.
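    Algorithms 1 and 2 translate directly into Python. The sketch below mirrors the two algorithms as stated; the class name, the default values of `k` and `counter_max`, and the `change_address` callback are placeholders chosen for illustration.

```python
from typing import Callable, Optional


class ConflictBackoff:
    """Backoff state kept by a switch that may be told to change its address."""

    def __init__(self, k: int = 2, counter_max: int = 8,
                 change_address: Optional[Callable[[], None]] = None) -> None:
        self.k = k                      # base of the exponential (assumed value)
        self.counter_max = counter_max  # saturation value (assumed value)
        self.timer = 0
        self.counter = 0
        self.change_address = change_address or (lambda: None)

    def on_conflict_notification(self) -> bool:
        """Algorithm 1. Returns True if the switch changed its address."""
        if self.timer > 0:
            if self.counter < self.counter_max:
                self.counter += 1
            return False                # discard the notification frame
        self.timer = self.k ** self.counter
        self.change_address()
        return True

    def on_clock_tick(self) -> None:
        """Algorithm 2, run once per clock tick."""
        if self.timer > 0:
            self.timer -= 1
        else:
            self.counter = 0
```

    A first notification is acted upon immediately; notifications arriving while the timer is running are discarded and only lengthen the next backoff period.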

    The conflict resolution algorithm brings a marked improvement on the equivalent vulnerability of Ethernet, that MAC addresses can be spoofed. We build in a flexible, well-defined system of recovery. The decentralised nature of the system makes it much less open to denial of service attack than any centralised directory may be. Having every MOOSE switch acting as a barrier to the propagation of packets from addresses in conflict provides a strong separation between recently bridged networks with conflicting addresses, so that communication within the individual networks may continue without modification, until bridge-crossing traffic appears, at which point resolution quickly happens. We also remove the possibility for forwarding databases to frequently have to switch their entry for a conflicted address, which can happen with MAC conflicts in traditional Ethernet. Additionally, in the case of a switch identifier spoofing attack, the conflict resolver acts as a hard boundary for the effects of such an attack.

    It is possible that the switch performing conflict resolution could send a suggested replacement switch address to the switch in conflict, known by the conflict resolver to have a low probability of being present on the network (because it is not present in its forwarding database). This would reduce the chance of repeated collisions, and potentially allow for longer backoff periods, but may be premature optimisation.

    Because multi-path routing is often desirable, we could introduce an extra datum during the source address rewriting performed by MOOSE switches. When an ingress MOOSE switch rewrites the source address of an Ethernet frame to a MOOSE address, it could also prepend some hash of its manufacturer-assigned MAC address to the data field, and increment the length field as necessary. The egress switch, when rewriting the MOOSE destination address to a host’s MAC address, then strips out this added datum. This allows the conflict resolver to check whether conflicts actually exist by local lookup, rather than probing other switches, at the cost of added memory requirements in every switch. This may push the frame to be larger than Ethernet’s maximum, so may require fragmenting the packet into two, at small added cost. Alternatively, assuming jumbo frames are permitted by the hardware, the maximum frame size could be marginally reduced to allow for this in the same manner as for 802.1Q VLAN tags.

    [Figure 2: Sequence diagram of a broadcast query and subsequent unicast response. Host A (MAC address 00:16:17:6D:B7:CF) and Host B (MAC address 00:0C:F1:DF:6A:84) communicate via switches 02:11:11, 02:22:22 and 02:33:33. Query: sent from 00:16:17:6D:B7:CF to broadcast; source rewritten to 02:11:11:00:00:01; frame broadcast using reverse path forwarding. Response: sent from 00:0C:F1:DF:6A:84 to 02:11:11:00:00:01; source rewritten to 02:33:33:00:00:01; frame routed to 02:11:11; destination rewritten to 00:16:17:6D:B7:CF.]

    Because conflict resolution is cheap, certain other address management tasks become simple. A switch is free to choose its address when it joins the network however it wishes: by attempting to re-use its last-used address, by choosing from a list of preferred addresses, or by generating an address entirely at random. More intricate addressing schemes may be used on managed networks if desired, perhaps encapsulating deeper layers of hierarchy.

    4.3 Broadcast and Multicast

    Since Ethernet does still need to support arbitrary broadcast frames, these must still be forwarded along a spanning tree in order that they reach each host exactly once. An explicit spanning tree protocol is not required, however, as the tree can be deduced from the routing table via reverse path forwarding in a similar manner to Protocol-Independent Multicast (PIM) [25]. In other words, broadcast packets are routed as if they had been sent to the all-hosts multicast group.
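A minimal sketch of the reverse path forwarding check is given below, under simplifying assumptions: the routing table is a plain dictionary from switch identifier to outgoing port, and the pruning of redundant links that PIM dense mode performs is omitted. The function name and argument layout are hypothetical and are not taken from the prototype.

```python
def rpf_forward(local_switch, routing_table, ports, source_switch, in_port):
    """Return the ports on which a broadcast frame should be re-sent.

    routing_table maps a switch identifier to the local port that lies on
    the shortest path towards that switch.
    """
    # Frames from locally-connected hosts (source switch is this switch)
    # are always accepted. Otherwise apply the reverse path check: accept
    # the frame only if it arrived on the port we would ourselves use to
    # reach the switch it originated from.
    if source_switch != local_switch and routing_table.get(source_switch) != in_port:
        return []  # copy arrived off the shortest-path tree: drop it
    # Flood to every port except the one the frame arrived on.
    return [p for p in ports if p != in_port]
```

For a switch B with port 1 towards switch A, port 2 towards switch C and a host on port 3, a broadcast originating behind A is forwarded only when it arrives on port 1; a second copy arriving via C on port 2 is discarded.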

    More general multicast groups can be implemented using a combination of IGMP snooping [26], as used by modern Ethernet switches, and participation of the MOOSE switches in PIM routing.

    4.4 Example

    To illustrate the basic behaviour of MOOSE switches, before we go on to describe further features, we will offer a simple example. We will describe the steps involved in forwarding a broadcast frame containing a query in some higher-layer IPv4-based protocol, and the subsequent unicast frame containing the response, between two hosts A and B via three MOOSE switches 02:11:11, 02:22:22 and 02:33:33; see figure 2.

    4.4.1 Query

    i) Host A transmits the broadcast query frame as it would on any Ethernet network, with its own manufacturer-assigned MAC address in the Ethernet header’s source field and the broadcast address (FF:FF:FF:FF:FF:FF) in the destination field.

    ii) The frame is received by switch 02:11:11, which observes the non-MOOSE address in the frame’s source field, and rewrites the source field into a MOOSE address containing the switch identifier and the appropriate host identifier. As this is Host A’s first frame, the switch must allocate a host identifier (in this case 00:00:01, making Host A’s complete MOOSE address 02:11:11:00:00:01).

    iii) The three switches broadcast the frame using reverse path forwarding away from Host A.

    iv) The frame is received by Host B (and any other hosts on the network) in its current form; no further rewriting is performed.

    4.4.2 Response

    i) Host B looks up Host A’s IP address in its ARP cache to determine a suitable destination address for the response frame. Since the rewritten query frame arrived at Host B with the source field containing the MOOSE address 02:11:11:00:00:01, this is the address returned by the cache lookup.

    ii) As above, switch 02:33:33 assigns a MOOSE address to Host B (02:33:33:00:00:01) and rewrites the source address of the frame.

    iii) The frame is now routed through the network based solely on the destination switch identifier; the host identifier is ignored for now. The routing table is consulted for the location of switch 02:11:11 and the frame is forwarded accordingly.

    iv) On receiving the frame, switch 02:11:11 observes that it is destined for a host directly connected to itself (02:11:11:00:00:01). It prepares the frame for transmission along its final hop by rewriting the destination address to Host A’s manufacturer-assigned MAC address. The source field of the frame is again left as the MOOSE address of Host B in order that this address is used for any further communication with Host B.
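The address rewriting in the query and response steps above can be traced with a short Python sketch. It uses the host MAC addresses and switch identifiers from the example. The `EdgeSwitch` class and its methods are hypothetical names, and allocating host identifiers from a simple counter is an assumption made for brevity.

```python
def mac(text):
    """Parse a colon-separated hex address into raw bytes."""
    return bytes.fromhex(text.replace(":", ""))

class EdgeSwitch:
    def __init__(self, switch_id):
        self.switch_id = mac(switch_id)   # 3-byte switch identifier
        self.host_ids = {}                # host MAC -> 3-byte host identifier
        self.host_macs = {}               # host identifier -> host MAC

    def rewrite_source(self, src):
        """Ingress: map a manufacturer-assigned MAC to a MOOSE address."""
        if src not in self.host_ids:
            host_id = (len(self.host_ids) + 1).to_bytes(3, "big")
            self.host_ids[src] = host_id
            self.host_macs[host_id] = src
        return self.switch_id + self.host_ids[src]

    def rewrite_destination(self, dst):
        """Egress: map a MOOSE address on this switch back to the host MAC."""
        assert dst[:3] == self.switch_id, "frame was routed to the wrong switch"
        return self.host_macs[dst[3:]]

host_a, host_b = mac("00:16:17:6D:B7:CF"), mac("00:0C:F1:DF:6A:84")
sw1, sw3 = EdgeSwitch("02:11:11"), EdgeSwitch("02:33:33")

# Query: Host A -> broadcast; switch 02:11:11 rewrites the source field.
query_src = sw1.rewrite_source(host_a)
# Response: Host B -> Host A's MOOSE address. Switch 02:33:33 rewrites the
# source, the network routes on the first three destination bytes, and
# switch 02:11:11 rewrites the destination for the final hop.
response_src = sw3.rewrite_source(host_b)
response_dst = sw1.rewrite_destination(query_src)
```

Only the two edge switches hold per-host state in this sketch; the intermediate switch 02:22:22 never needs to appear, since it forwards on the switch identifier alone.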

    4.5 Directory Service

    A directory service, Enhanced Lookup (ELK), runs in conjunction with the basic MOOSE switch described so far. ELK exists to handle ARP and DHCP queries in a broadcast-free manner by learning mappings from IP addresses to MOOSE addresses. The master ELK directory is served by one or multiple systems for resilience and is reached using an anycast MOOSE address; the layer-2 anycast feature is a convenient side-effect of running a routing protocol. Slave copies of the directory can be held nearer the edge of the network in order to take load away from the masters; slaves can be reached for lookups via a separate anycast address, and the entire herd of ELK can be kept synchronised via the masters using a combination of multicast and unicast.

    MOOSE switches intercept ARP and DHCP packets broadcast by hosts and convert them into anycast ELK queries to the nearest slave (for ARP) or master (for DHCP). (DHCP handling could make use of the protocol’s existing DHCP relay mechanism.) The ELK slave answers ARP queries directly using information in the directory; as it does so, if the query is from a host not in the directory, it learns the sender’s IP address to MOOSE address mapping. The ELK master can also act as a DHCP server, populating the ELK directory as it grants IP address leases to clients.

    The one case in which the ELK directory will not contain the answer to a query is when answering an ARP request for a host that is not configured to use DHCP and that has not yet itself sent an ARP packet (i.e. has not yet communicated via IP). This must be dealt with by flooding the query to every active switch port, in a manner akin to current Ethernet switches, and caching the result in the ELK directory. Although this is not ideal, it is necessary in order to deal with this scenario in a compatible manner, and is unlikely to happen frequently.

    [Figure 3: Two ways to handle a host A roaming onto another switch whilst maintaining communication with another host B. After the host is relocated to a new switch, either data is forwarded by the care-of switch, or a gratuitous ARP is sent by the new home switch.]
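The lookup behaviour of an ELK slave, including the flooding fallback, might be sketched as follows. This is an illustration under stated assumptions rather than the authors’ implementation: the class and method names are hypothetical, addresses are plain strings, and `flood` is a caller-supplied function standing in for flooding the query to every active switch port.

```python
class ElkDirectory:
    """Toy IP address -> MOOSE address directory (names are illustrative)."""

    def __init__(self, flood):
        self.entries = {}    # IP address -> MOOSE address
        self.flood = flood   # fallback: callable(ip) -> MOOSE address or None

    def learn(self, ip, moose_addr):
        """Record a mapping, e.g. when a DHCP lease is granted."""
        self.entries[ip] = moose_addr

    def handle_arp(self, sender_ip, sender_moose, target_ip):
        # Learn the sender's mapping if it is not already in the directory.
        self.entries.setdefault(sender_ip, sender_moose)
        if target_ip not in self.entries:
            # Rare case: the target has neither used DHCP nor sent an ARP
            # packet yet, so fall back to flooding and cache the answer.
            answer = self.flood(target_ip)
            if answer is None:
                return None
            self.entries[target_ip] = answer
        return self.entries[target_ip]
```

Once a flooded query has been answered, later requests for the same target are served from the directory, so the flood is paid for at most once per previously silent host.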

    4.6 Mobility

    A consequence of introducing location-based hierarchy into MAC addresses is the need to explicitly handle host mobility. In a traditional Ethernet, hosts can migrate between switches as the switches will learn the host’s new location as soon as it sends a frame. With MOOSE, if a host relocates to a new switch its address changes and any ARP cache entries on other hosts pertaining to the migrated host become incorrect; frames will continue to be sent to the host’s old location for a while. There are two strategies for dealing with this, as illustrated in figure 3, which can be used separately or in conjunction:

    i) The previous home switch of the migrated host can forward frames sent to the host’s old address until outdated ARP cache entries expire. This is similar to IP Mobility [4]: the previous home switch essentially becomes a care-of agent for the host. However, unlike IP Mobility, it requires no host support. A handover protocol is necessary for the old and new home switches to set up such forwarding: on the arrival of a new host at a switch, that switch would ask all other switches (via multicast) whether any had seen this host before, identifying it using its manufacturer-assigned MAC address, and would instruct such switches to redirect frames.

    ii) A broadcast ARP announcement (or “gratuitous ARP”) can be sent by the new home switch to immediately update remote ARP caches and the ELK directory with the new MOOSE address. This is the technique used by Xen when migrating live virtual machines [3]. Unlike the previous approach, this works even if the previous switch is no longer reachable, for example if this host migration was the result of a switch failure. This is a simpler approach as a handover protocol is not required, but results in additional broadcast traffic.

    Unless the frequency of host migrations is very high, the additional load introduced by either mobility approach is expected to be negligible.
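The first strategy, forwarding by the previous home switch, could be modelled with a small redirect table such as the sketch below. The handover protocol itself is not shown. The class name, the string addresses and the 60-second default lifetime are all assumptions made for illustration; the lifetime would in practice be chosen to outlast typical ARP cache entries.

```python
import time

class CareOfTable:
    """Redirects held by a migrated host's previous home switch."""

    def __init__(self, lifetime=60.0, clock=time.monotonic):
        self.lifetime = lifetime   # assumed value, see the text above
        self.clock = clock
        self.redirects = {}        # old MOOSE address -> (new address, expiry)

    def host_moved(self, old_addr, new_addr):
        """Called when the new home switch reports where the host now lives."""
        self.redirects[old_addr] = (new_addr, self.clock() + self.lifetime)

    def resolve(self, dst):
        """Return the address to which a frame for `dst` should be sent."""
        entry = self.redirects.get(dst)
        if entry is None:
            return dst
        new_addr, expiry = entry
        if self.clock() >= expiry:
            # Outdated ARP cache entries are assumed to have expired by now.
            del self.redirects[dst]
            return dst
        return new_addr
```

Because entries expire, the old switch holds redirect state only for a bounded time after each migration, which fits the expectation that the added load is negligible unless migrations are very frequent.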

    5 Interoperability Considerations

    5.1 Layer-violating Protocols

    In an ideal world, free from layering violations, all layer 3 protocols would operate correctly on top of MOOSE in exactly the same way that they currently operate on top of Ethernet, with no protocol-specific handling necessary in the switch. In reality, however, protocols abound which use hosts’ MAC addresses for purposes other than layer 2 addressing or which place MAC addresses in the frame payload. DHCP and ARP have already been mentioned as such protocols which must be specifically handled by edge switches in order to operate; luckily, the rewriting required for these important protocols is simple.

    Of particular concern are recent standards for layering on top of Ethernet protocols which were previously used solely on dedicated hardware interconnects, such as Fibre Channel over Ethernet (FCoE) [27]. In order to support FCoE and similar protocols on a MOOSE network, each edge switch will need to be able to interpret and rewrite individual protocols that are in use. A production MOOSE switch would, therefore, need to be implemented such that it is possible to add rewriting support for additional protocols after manufacture, for example by loading an additional software or FPGA configuration module.

    Ultimately, in the general case, this problem could be addressed more satisfactorily by extending the Ethernet standard to provide a protocol-agnostic method for a layer 2 network to inform hosts of their own addresses; LLDP [19] would make a good basis for this extension. This would allow the use of network-assigned MAC addresses for any protocol, with some rewriting performed either partially (within the frame payload) or fully by the host itself, and furthermore would allow higher-layer protocols to respond to changes of the host’s network-assigned address (e.g. due to mobility).

    Such a mechanism could be deployed incrementally as needed, with switches able to perform address rewriting for hosts which are not able to do this themselves. This is, however, a very long-term solution, and protocol-specific rewriting on the switch is likely to be required for the foreseeable future.

    FCoE in particular is unusual, however, as it already does its own dynamic allocation of MAC addresses to devices. It is conceivable that an extension to FCoE could be developed which allows a network-wide dynamic address assignment scheme such as MOOSE to be exploited to provide addresses directly to Fibre Channel devices.

    5.2 Edge Virtual Bridging

    The rise of virtualisation has caused an unanticipated proliferation of software switches, usually in the host operating system or hypervisor which provides network connectivity to multiple virtual machines. Since software switches are almost never as fast or as centrally manageable as hardware switches, there is ongoing work to standardise (by Cisco as Port Extension [28] and by the IEEE as P802.1Qbg [29]) a means of making these software switches act merely as additional ports which are logically part of a more central hardware switch. This reduces the work required by a virtual edge switch: frames from local virtual edge ports can be forwarded straight out via the uplink to a physical switch without consideration, and frames from the uplink will arrive simply tagged with a virtual edge port identifier.

    (The scope of Port Extension in particular is greater than this, and allows for physical port extenders to exist in place of switches where a large number of ports but a small amount of processing is required, but virtualisation is likely to be the most significant use case.)

    Edge Virtual Bridging and Port Extension require very little adaptation to be implemented on a MOOSE switch. It is unlikely, although too early in the standardisation process to say for certain, that the virtual bridge will need to be MOOSE-aware. A virtual-bridging-aware physical MOOSE switch will thus simply need to take into account the possibility that one physical port may hide a large number of virtual ports when allocating host identifiers, as it would if it had an Ethernet switch connected on that port. If, however, the virtual bridge is made MOOSE-aware, the hierarchical addressing of MOOSE could be exploited to allow the virtual bridge to allocate host identifiers itself, given that it is likely to be aware of the exact number and nature of virtual edge ports. The parent MOOSE switch would accordingly allocate an address prefix to each child virtual bridge, and hosts’ full MOOSE addresses could be formed as:

    switch ID (parent) : child ID (allocated by parent) : host ID (allocated by child)
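As a purely illustrative sketch, such a three-level address could be composed as shown below. The paper does not fix the field widths for this scheme, so the split used here (3 bytes of switch ID, 1 byte of child ID, 2 bytes of host ID) is an assumption, as is the function name.

```python
def compose_address(switch_id: bytes, child_id: int, host_id: int) -> str:
    """Build a 6-byte address laid out as switch ID : child ID : host ID.

    The field widths (3 + 1 + 2 bytes) are an assumption for this sketch.
    """
    if len(switch_id) != 3:
        raise ValueError("switch ID must be 3 bytes")
    # to_bytes raises OverflowError if a value does not fit its field.
    raw = switch_id + child_id.to_bytes(1, "big") + host_id.to_bytes(2, "big")
    return ":".join(f"{b:02x}" for b in raw)
```

With this layout, the first host behind child bridge 4 of switch 02:11:11 would be addressed as 02:11:11:04:00:01, and the parent needs only one forwarding entry per child bridge rather than one per virtual machine.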

    6 Implementation

    We have implemented a MOOSE switch in threaded, object-oriented Python as a proof of concept. The architecture is intended for clarity and modularity rather than raw performance, but this implementation is still capable of switching data at 100 Mbps on a modern desktop PC.

    Data forwarding and control functions are kept separate for clarity and to mimic a hardware implementation.

    6.1 Data plane

    The software MOOSE switch is intended to be run on a Linux PC with several network interface cards, and uses raw sockets to send and receive frames directly in promiscuous mode, so that all frames are received whether or not they are addressed to that PC.
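For readers unfamiliar with raw sockets, the following sketch shows one way to open a layer-2 socket on Linux from Python and to split a received frame’s Ethernet header into its fields. It is not the prototype’s code: the function names are hypothetical, opening the socket requires root privileges, and promiscuous mode is assumed to have been enabled separately (for example with `ip link set dev eth1 promisc on`).

```python
import socket
import struct

ETH_P_ALL = 0x0003  # Linux constant: receive frames of every protocol

def open_raw_socket(ifname):
    """Open a raw layer-2 socket bound to one interface (Linux, needs root)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.bind((ifname, 0))
    return sock

def parse_header(frame):
    """Split the 14-byte Ethernet header into (destination, source, ethertype)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst, src, ethertype
```

A receive loop would then call `sock.recv(65535)` repeatedly and pass each frame to `parse_header` before any rewriting takes place.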

    Each network interface is managed by two independent threads: a FrameReceiver and a FrameTransmitter. A Port object is maintained in shared memory, containing shared data structures such as a per-port forwarding database (implemented as two Python dictionaries: one for locally-connected hosts and one for remote switches). The relation between modules is outlined in figure 4.

    The FrameReceiver thread does most of the work; the main steps performed are:

    i) A received frame is immediately packaged in a Frame object, which provides methods for accessing individual fields within the frame’s headers. Some checks are run so that unwanted frames are dropped: for example, frames whose source addresses indicate that they have already passed through this switch, which could be a sign of a routing protocol malfunction or a misconfigured switch elsewhere in the network with the same switch identifier.

    ii) If the frame is a DHCP or ARP query, it is transferred to the control plane for conversion into an ELK query.

    iii) Once the frame has been received and checked to be valid, the source address is rewritten if it is not already a MOOSE address. This process requires a host identifier, which is reused or allocated as appropriate; allocated identifiers are stored in a Python dictionary (hash table). In this implementation, in order to allow each port to issue host identifiers independently, each identifier starts with a byte identifying to which port this host is connected; for example, the host identifier of the first host seen on port 3 will be 03:00:01.

    iv) The locally-connected-host forwarding database is updated where necessary based upon the frame’s source address.

    v) The frame is passed to the relevant forwarding database, which obtains the correct destination Port object from its internal dictionary. The frame is enqueued with the FrameTransmitter of this port.

    [Figure 4: Prototype data plane architecture. Components shown: network interface card, raw sockets, frame receiver, source rewriting, frame transmitter, Port, forwarding database.]
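The per-port host identifier allocation of step iii can be sketched in a few lines. The class below is a hypothetical reconstruction, not the prototype’s code, and the `is_moose_address` test rests on an assumption: that MOOSE addresses are recognised by the locally-administered bit of the first byte, which is consistent with the 02:xx:xx switch identifiers used throughout this paper.

```python
class PortHostAllocator:
    """Issue 3-byte host identifiers whose first byte is the port number."""

    def __init__(self, port_number):
        self.port_number = port_number
        self.by_mac = {}   # host MAC -> host identifier (dict = hash table)

    def host_id(self, mac):
        # Reuse the identifier if this host has been seen before;
        # otherwise allocate the next serial number on this port.
        if mac not in self.by_mac:
            serial = len(self.by_mac) + 1
            self.by_mac[mac] = bytes([self.port_number]) + serial.to_bytes(2, "big")
        return self.by_mac[mac]

def is_moose_address(addr):
    """Assumption: the locally-administered bit marks a MOOSE address."""
    return bool(addr[0] & 0x02)
```

Because each port owns a disjoint slice of the identifier space, the receiver threads of different ports can allocate identifiers without sharing a lock, at the cost of limiting each port to 65,535 hosts in this layout.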

    The only processing the FrameTransmitter performs before sending the frames over the relevant raw socket onto the network is to rewrite the destination address to be the target host’s manufacturer-assigned MAC address if the destination switch ID is this switch’s, i.e. if that host is directly connected to this switch.

    The use of threads for parallel transmission to and reception from each network interface makes this software design analogous to a basic hardware design, with several independently-operating ports interconnected by a switch fabric. It is our eventual goal to produce a hardware prototype as described in section 7.

    6.2 Control plane

    The control plane operates largely independently of the data plane, in a separate thread, as it is much less timing-sensitive than the data plane. In a production implementation, the control plane would likely run in software on a microprocessor.

    The main function of the control plane is to run the routing protocol in order to determine the location of and best route to every other switch. This implementation uses PWOSPF [30], a simple link state routing protocol based on OSPF version 2 [21], as a proof of concept; the authentication features of OSPF are not required for a prototype implementation. A production MOOSE switch would likely require a full-featured OSPF implementation (or another routing protocol) to ensure security and resilience, and in particular to prevent the unauthorised spoofing of switches by end users.

    Address       HWtype  HWaddress           Iface
    10.100.11.1   ether   02:00:0c:01:00:01   eth1
    10.100.11.3   ether   02:00:0a:01:00:01   eth1
    10.100.11.4   ether   02:00:0a:03:00:01   eth1
    10.100.11.8   ether   02:00:0b:02:00:01   eth1

    Figure 5: ARP cache of an unmodified Linux PC connected to a network of MOOSE switches (02:00:0a, 02:00:0b, 02:00:0c), showing the addresses automatically assigned to other hosts

    As PWOSPF is designed to operate on 4-byte IP addresses rather than the 3- or 6-byte identifiers used in MOOSE, the fields intended for IP addresses are retrofitted to contain a switch identifier followed by one null byte of padding. Each switch handles all addresses starting with its switch identifier, which is equivalent to each switch routing a subnet of length 24 bits.
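The padding scheme can be made concrete with Python’s standard `ipaddress` module. The helper names below are hypothetical; the sketch only shows how a 3-byte switch identifier fits a 4-byte field and why the result resembles a 24-bit subnet.

```python
import ipaddress

def switch_id_to_field(switch_id: bytes) -> ipaddress.IPv4Address:
    """Pad a 3-byte switch identifier with one null byte to fill a 4-byte field."""
    if len(switch_id) != 3:
        raise ValueError("switch ID must be 3 bytes")
    return ipaddress.IPv4Address(switch_id + b"\x00")

def switch_subnet(switch_id: bytes) -> ipaddress.IPv4Network:
    """The equivalent 24-bit subnet covering everything behind the switch."""
    return ipaddress.IPv4Network((switch_id + b"\x00", 24))
```

For instance, switch 02:00:0a from figure 5 is carried in PWOSPF fields as if it were the address 2.0.10.0, and behaves like a router for the subnet 2.0.10.0/24.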

    The routing protocol calculates the shortest path to every other switch using Dijkstra’s algorithm [31]. The results are used to update the remote-switch forwarding database maintained in each Port object of the local data plane with the best Port on which to output frames destined for each switch.
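The following sketch shows how Dijkstra’s algorithm can yield, for each remote switch, the local port on its shortest path. The link-state representation (a dictionary of neighbour, cost and outgoing port tuples) and the function name are assumptions for illustration, not the prototype’s data structures.

```python
import heapq

def next_hop_ports(links, source):
    """Map every reachable switch to the local port on its shortest path.

    links: {switch: [(neighbour, cost, out_port), ...]}, where out_port is
    the port on `switch` that leads to `neighbour`.
    """
    best = {source: 0}
    port_for = {}
    done = set()
    queue = [(0, source, None)]  # (distance, switch, first-hop port)
    while queue:
        dist, node, port = heapq.heappop(queue)
        if node in done:
            continue  # stale entry: a shorter path was already settled
        done.add(node)
        if port is not None:
            port_for[node] = port
        for neighbour, cost, out_port in links.get(node, []):
            if neighbour in done:
                continue
            new_dist = dist + cost
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                # The first hop is fixed when leaving the source switch
                # and inherited unchanged along the rest of the path.
                first = out_port if node == source else port
                heapq.heappush(queue, (new_dist, neighbour, first))
    return port_for
```

The returned dictionary has one entry per remote switch, which is the content of the remote-switch forwarding database described above.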

    The control plane would also be responsible for operating the mobility handover protocol; however, this protocol was not implemented in the prototype.

    6.3 Evaluation

    The prototype MOOSE switch was found to behave transparently to a variety of unmodified devices communicating via IP with each other and with other hosts on the Internet, with both wired and wireless connections to a network of three MOOSE switches. The only visible effect was, as intended, that hosts’ individual ARP caches show MOOSE addresses, as shown in figure 5.

    A second test was run on a virtual network comprising six virtual switches, each connected to ten Xen virtual machines acting as client hosts. This allows for comparison with a traditional Ethernet switch (as implemented by the Linux bridging driver). The forwarding database of each switch was inspected during a period when all clients were actively transmitting broadcast packets. In the case of Ethernet, the switches’ forwarding databases each contained sixty entries. In the case of MOOSE, each switch’s two forwarding database dictionaries contained five entries for the other switches and ten entries for the locally-connected hosts respectively. The storage requirement of the forwarding database has been reduced from O(hosts) to O(switches), assuming that the number of locally-connected hosts is a small constant; this is a significant improvement.
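The entry counts observed in this test follow directly from the topology, as the short calculation below shows. The two function names are hypothetical, and the model assumes every host is active and hosts are spread evenly across switches.

```python
def ethernet_entries(switches, hosts_per_switch):
    # A learning switch that sees traffic from every host holds
    # one forwarding entry per host on the network.
    return switches * hosts_per_switch

def moose_entries(switches, hosts_per_switch):
    # One entry per *other* switch, plus one per locally-connected host.
    return (switches - 1) + hosts_per_switch
```

For the tested network of six switches with ten hosts each, this gives 60 entries for Ethernet against 5 + 10 = 15 for MOOSE. Extrapolating the same model to a hypothetical network of 1000 switches with 50 hosts each gives 50,000 entries against 1,049.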

    7 Conclusions and Future Work

    Ethernet remains popular due to its simplicity and ubiquity, but is showing its age and exhibits serious scalability issues in large deployments. Previously-proposed improvements address either a few of the problems in a simple way, or most of the problems in a highly complex or backwards-incompatible way. We have demonstrated a simple, novel and easily-implementable approach for significantly boosting the scalability of Ethernet, along with a working software implementation.

    Our next step will be to produce a true hardware prototype. This will be built using the NetFPGA platform [32]. The NetFPGA card comprises four gigabit Ethernet interfaces connected directly to an FPGA with full control over everything from the Ethernet PHY and MAC inwards, plus a PCI interface to allow frames to be passed to a PC for processing in software (e.g. for running the control plane).

    We also intend to implement a more extensive set of additional Ethernet features, including in particular 802.1Q VLANs and Quality-of-Service provision.

    8 Acknowledgements

    We acknowledge the support of the UK EPSRC which funded this project through grant EP/D076803/1. We are also grateful for David Simner’s invaluable security insight, and for the countless comments and suggestions made by him, Ian Abel, Dave Eyers, Malte Schwarzkopf, Laurence Aitchison and Derek Murray.

    Useful feedback and suggestions were provided by several attendees of the ITC 21 First Workshop on Data Center – Converged and Virtual Ethernet Switching.

    References

    [1] R. M. Metcalfe and D. R. Boggs, “Ethernet: distributed packet switching for local computer networks,” Commun. ACM, vol. 19, no. 7, pp. 395–404, 1976.

    [2] P. Barham, et al., “Xen and the art of virtualization,” in SOSP, 2003, pp. 164–177.

    [3] C. Clark, et al., “Live migration of virtual machines,” in USENIX NSDI, 2005.

    [4] C. Perkins, “IP Mobility Support for IPv4,” RFC 3344 (Proposed Standard), Aug. 2002, updated by RFC 4721. [Online]. Available: http://www.ietf.org/rfc/rfc3344.txt

    [5] IEEE, “802.1D: Standard for local and metropolitan area networks: Media access control (MAC) bridges,” 2004.


    [6] 3Com Corporation, “Switch 5500G 10/100/1000 family data sheet.” [Online]. Available: http://www.3com.com/other/pdfs/products/en_US/400908.pdf

    [7] F. Yu, et al., “Efficient multimatch packet classification and lookup with TCAM,” IEEE Micro, vol. 25, no. 1, Jan. 2005.

    [8] K. Pagiamtzis and A. Sheikholeslami, “Content-Addressable Memory (CAM) circuits and architectures: a tutorial and survey,” IEEE Journal of Solid-State Circuits, vol. 41, pp. 712–727, 2006.

    [9] D. C. Plummer, “Ethernet Address Resolution Protocol,” RFC 826 (Standard), Nov. 1982. [Online]. Available: http://www.ietf.org/rfc/rfc826.txt

    [10] R. Droms, “Dynamic Host Configuration Protocol,” RFC 2131 (Draft Standard), Mar. 1997, updated by RFCs 3396, 4361. [Online]. Available: http://www.ietf.org/rfc/rfc2131.txt

    [11] M. Scott, A. Moore, and J. Crowcroft, “Addressing the scalability of Ethernet with MOOSE,” in ITC 21 First Workshop on Data Center – Converged and Virtual Ethernet Switching (DC CAVES), Sep. 2009.

    [12] E. Rosen, A. Viswanathan, and R. Callon, “Multiprotocol Label Switching Architecture,” RFC 3031 (Proposed Standard), Jan. 2001. [Online]. Available: http://www.ietf.org/rfc/rfc3031.txt

    [13] I. Hadžić, “Hierarchical MAC address space in public Ethernet networks,” in IEEE GLOBECOM, vol. 3, 2001.

    [14] T. Rodeheffer, C. Thekkath, and D. Anderson, “SmartBridge: a scalable bridge architecture,” in SIGCOMM, 2000.

    [15] R. Perlman, “Rbridges: transparent routing,” in INFOCOM, vol. 2, 2004.

    [16] A. Myers, E. Ng, and H. Zhang, “Rethinking the service model: Scaling Ethernet to a million nodes,” in ACM SIGCOMM Workshop on Hot Topics in Networking, Nov. 2004.

    [17] C. Kim, M. Caesar, and J. Rexford, “Floodless in SEATTLE: a scalable Ethernet architecture for large enterprises,” in SIGCOMM, 2008, pp. 3–14.

    [18] S. Sipes, “Why your switched network isn’t secure,” in Intrusion Detection FAQ. The SANS Institute, Sep. 2000. [Online]. Available: http://www.sans.org/resources/idfaq/switched_network.php

    [19] IEEE, “802.1AB: Station and media access control connectivity discovery,” 2009.

    [20] IEEE, “802.1X: Port based network access control,” 2004.

    [21] J. Moy, “OSPF Version 2,” RFC 2328 (Standard), Apr. 1998. [Online]. Available: http://www.ietf.org/rfc/rfc2328.txt

    [22] C. Villamizar, “OSPF optimized multipath (OSPF-OMP),” IETF Internet Draft, Feb. 1999. [Online]. Available: http://tools.ietf.org/html/draft-ietf-ospf-omp-02

    [23] G. M. Schneider and T. Nemeth, “A simulation study of the OSPF-OMP routing algorithm,” Computer Networks, vol. 39, no. 4, pp. 457–468, 2002.

    [24] C. Villamizar, R. Chandra, and R. Govindan, “BGP Route Flap Damping,” RFC 2439 (Proposed Standard), Nov. 1998. [Online]. Available: http://www.ietf.org/rfc/rfc2439.txt

    [25] A. Adams, J. Nicholas, and W. Siadak, “Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol Specification (Revised),” RFC 3973 (Experimental), Jan. 2005. [Online]. Available: http://www.ietf.org/rfc/rfc3973.txt

    [26] M. Christensen, et al., “Considerations for Internet Group Management Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping Switches,” RFC 4541 (Informational), May 2006. [Online]. Available: http://www.ietf.org/rfc/rfc4541.txt

    [27] T11 FC-BB-5 working group, “Fibre channel backbone – 5,” Jun. 2009.

    [28] J. Pelissier, “Introduction to port extension,” in ITC 21 First Workshop on Data Center – Converged and Virtual Ethernet Switching (DC CAVES), Sep. 2009.

    [29] A. Jeffree, P. Congdon, and J. Pelissier, “P802.1Qbg: Edge virtual bridging,” Unapproved PAR, Sep. 2009. [Online]. Available: http://ieee802.org/1/files/public/docs2009/new-qbg-draft-par-0909-V2.pdf

    [30] Stanford University High-Performance Networking Group, “Pee-Wee OSPF protocol details.” [Online]. Available: http://web.archive.org/web/20070708180017/http://yuba.stanford.edu/cs344_public/docs/pwospf_ref.txt

    [31] E. Dijkstra, “A note on two problems in connexion with graphs,” Numerische Mathematik, vol. 1, pp. 269–271, 1959.

    [32] J. W. Lockwood, et al., “NetFPGA--an open platform for gigabit-rate network switching and routing,” in IEEE MSE, 2007, pp. 160–161.
